Commun. Math. Phys. 201, 1 – 34 (1999)
Communications in
Mathematical Physics © Springer-Verlag 1999
Loop Groups, Anyons and the Calogero–Sutherland Model
Alan L. Carey (1), Edwin Langmann (2)
(1) Department of Pure Mathematics, University of Adelaide, Adelaide, Australia
(2) Theoretical Physics, Royal Institute of Technology, S-10044 Stockholm, Sweden
Received: 12 May 1998 / Accepted: 4 August 1998
Abstract: The positive energy representations of the loop group of U(1) are used to construct a boson-anyon correspondence. We compute all the correlation functions of our anyon fields and study an anyonic W-algebra of unbounded operators with a common dense domain. This algebra contains an operator with peculiar exchange relations with the anyon fields. This operator can be interpreted as a second quantized Calogero–Sutherland (CS) Hamiltonian and may be used to solve the CS model. In particular, we inductively construct all eigenfunctions of the CS model from anyon correlation functions, for all particle numbers and positive couplings.
1. Introduction
The viewpoint of Graeme Segal [PS, SeW] on integrable systems links the infinite dimensional Grassmannian approach of Sato [S] with the representation theory of loop groups. These two points of view overlap in the study of two-dimensional quantum field theories. In the Sato approach, as in much of the physics literature, quantum field theory is regarded as an algebraic theory in which the usual Hilbert space formalism is absent. The Segal approach, on the other hand, deals with positive energy representations of loop groups in Hilbert spaces. Reconciling these points of view can be quite difficult, although this has been done for many cases (see for example [CR, CHMS, BMT]). One way of thinking about the Segal approach is that it revolves around a Hilbert space definition of vertex operators. The algebraic approach to vertex operators is much studied in connection with Kac-Moody algebras [K, F] and may be regarded as the Lie algebraic version of the loop group projective representation theory. These Segal vertex operators arise from a boson field theory and were previously studied in a formal way in [Sk, C, M] and made more precise in [StW, DFZ]. In this approach one regularizes the vertex operators so that they are proportional to operators representing loop group elements and then, after taking an appropriate limit, one finds that they generate fermions in some
cases (the boson-fermion correspondence) and operators forming a Kac-Moody algebra in others [PS, Se, CR, CHu], depending on the precise form of the cocycle in the loop group projective representation. We may summarize the present paper as enlarging the loop group representation theory to encompass a boson-anyon correspondence. Our results extend those of the previous paragraph in that we construct, from a certain positive energy loop group representation, Segal-type vertex operators on a Hilbert space which have, as their limits, anyon field operators. These anyon field operators applied to the vacuum, or cyclic vector, give new vectors in the Hilbert space which can be interpreted as anyon states. Each N particle anyon sector carries a representation of the braid group. The construction builds in fractional statistics from the outset, the precise statistics depending on the choice of anyon vertex operator. The idea of using a vertex operator construction to obtain particles with anyon type statistics is not new, see for example [Kl] and more recently [AMOS1, AMOS2, I, H, MS] and references therein. However the vertex operators described in these more recent references are not defined on the Fermion Fock space as limits of implementors of fermion gauge transformations. In other words they do not come from loop group elements. Indeed it is difficult to give a precise meaning to them at all and we do not attempt to do so here. Our vertex operators can be seen to have similar formal properties to those appearing in the papers mentioned, but are well defined in terms of positive energy representations of loop groups in the sense of [PS]. The benefits of our approach are the following. First there is a quantum Hamiltonian acting on the anyon states. This we believe resolves a long standing difficulty in the study of anyons in that it provides a basis for models incorporating interactions. 
Second, we obtain a unifying view of a number of interesting ideas that have emerged in recent times in the physics literature. The most important of these is the connection with the Calogero–Sutherland (CS) model [AMOS1, AMOS2, I, MS] (see also [H, HLV, BHKV, P]). Specifically, we find that n-point anyon correlation functions provide useful building blocks for solutions to the CS system. Comparing with the known solutions of the CS system [Fo2] we find that Jack polynomials [St] may be expressed in terms of anyon correlation functions. (Similar relations were previously obtained by different methods in [Fo1].) From this point of view the anyon Hamiltonian is a second quantized CS Hamiltonian. The final connection we make is with W-algebras, again a connection which has been known from other approaches for some time [AMOS1, AMOS2, I, MS]. In this paper we do not recover the full import of the W-algebra connection in the anyon case. This is a matter we intend to develop more fully elsewhere. However, we do construct that part of the W-algebra that we need as an algebra of unbounded operators with a common dense domain. This suffices for our purposes, namely the construction of an anyon Hamiltonian, constructing the CS model solutions as anyon correlation functions, obtaining the link with Jack polynomials, and finding the algebraic relations of the Hamiltonian with the anyon fields.
2. Summary
This paper contains a number of technical sections. In order to make the results accessible we present a summary here. At the same time we take the opportunity to introduce some of our notation. However, the reader will need to take some notation on trust and refer to later sections for the details.
We work on an interval $S_L = [-L/2, L/2]$ which we will think of as a circle of circumference $L$. We let $P_\pm$ be the spectral projections of $-i\,\partial/\partial x$ regarded as a self-adjoint operator on a dense domain in $L^2(S_L)$. We let $\mathcal{F}$ denote the free fermion Fock space over $L^2(S_L)$. We choose the usual positive energy condition that the fermion fields are in a Fock representation of the algebra of the canonical anticommutation relations defined by $P_-$. This means the fermion fields $\{\psi(f), \psi(g)^* \mid f, g \in L^2(S_L)\}$ satisfy
\[ \langle \Omega, \psi(f)^*\psi(g)\Omega\rangle_{\mathcal{F}} = \langle g, P_- f\rangle_{L^2(S_L)}, \tag{1} \]
where $\Omega$ is the vacuum or cyclic vector in $\mathcal{F}$. We let $Q$ denote the fermion charge operator on $\mathcal{F}$, and $R$ a unitary charge shift operator on $\mathcal{F}$ satisfying $R^{-1}QR = Q + I$ (the precise choice for $R$ will be explained later). We will construct regularized anyon field operators $\phi^\nu_\varepsilon(x)$, where $\nu \in \mathbb{R}$ is a parameter determining the statistics, $x \in S_L$, and $\varepsilon > 0$ is a regularization parameter. For positive $\varepsilon$ the operator $\phi^\nu_\varepsilon(x)$ is proportional to a unitary operator on $\mathcal{F}$ which represents a certain U(1) valued loop on $S_L$. These operators are not periodic but obey (the parameter $\nu_0$ will be explained below)
\[ \phi^\nu_\varepsilon(x + L) = e^{-i\pi\nu\nu_0 Q}\,\phi^\nu_\varepsilon(x)\,e^{-i\pi\nu\nu_0 Q}, \]
and in the limit as $\varepsilon \downarrow 0$ they converge to operator valued distributions $\phi^\nu(x)$ satisfying
\[ \phi^\nu(x)\phi^{\nu'}(y) = e^{-i\pi\nu\nu'\,\mathrm{sgn}(x-y)}\,\phi^{\nu'}(y)\phi^\nu(x) \tag{2} \]
for $x \neq y$. In particular, for $p \in \Lambda^* = \{\frac{2\pi}{L}n \mid n \in \mathbb{Z}\}$, the formula
\[ \hat\phi^\nu(p) = \lim_{\varepsilon\downarrow 0}\int_{-L/2}^{L/2} dx\, e^{ipx}\, e^{i\pi\nu\nu_0 Qx/L}\,\phi^\nu_\varepsilon(x)\,e^{i\pi\nu\nu_0 Qx/L} \tag{3} \]
is a well-defined operator on $\mathcal{F}$ (Proposition 1). Note that we have to insert factors to compensate for the non-periodicity of $\phi^\nu_\varepsilon(x)$ before Fourier transformation. We also find that the statistics parameters $\nu, \nu'$ for which Eq. (2) holds cannot be arbitrary but have to be integer multiples of some fixed (arbitrary) number $\nu_0 > 0$. (If one is only interested in a single species of anyons one can choose $\nu_0 = |\nu|$.) A main focus is on the correlation functions of the anyon fields. These are distributions defined by taking the limit as $\varepsilon_j \downarrow 0$ of
\[ C^{\nu_1,\dots,\nu_N}_{\varepsilon_1,\dots,\varepsilon_N}(\nu_0, w_1, w_2 \mid x_1,\dots,x_N) := \langle \Omega, R^{w_1}\phi^{\nu_1}_{\varepsilon_1}(x_1)\cdots\phi^{\nu_N}_{\varepsilon_N}(x_N)R^{w_2}\Omega\rangle \tag{4} \]
for $x_j \in S_L$ and $\nu_j/\nu_0, w_j \in \mathbb{Z}$ (for fixed $\nu_0$). Using general results for implementors of U(1) loops we obtain
\[ C^{\nu_1,\dots,\nu_N}_{\varepsilon_1,\dots,\varepsilon_N}(\nu_0, w_1, w_2 \mid x_1,\dots,x_N) = \delta_{w_1+w_2+(\nu_1+\dots+\nu_N)/\nu_0,\,0}\; e^{i\pi(w_1-w_2)\nu_0(\nu_1x_1+\cdots+\nu_Nx_N)/L} \prod_{j=1}^{N}\prod_{k=j+1}^{N} b(x_j - x_k;\varepsilon_j+\varepsilon_k)^{\nu_j\nu_k} \tag{5} \]
with
\[ b(x,\varepsilon) := e^{-i\frac{\pi}{L}x} - e^{-\frac{2\pi}{L}\varepsilon}\,e^{i\frac{\pi}{L}x} = -2i\,e^{-\pi\varepsilon/L}\sin\frac{\pi}{L}(x + i\varepsilon). \tag{6} \]
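The two expressions for $b(x,\varepsilon)$ in Eq. (6) can be compared numerically. The following is a minimal sketch, not part of the original argument; the function names and the sample values of $L$ and $\varepsilon$ are illustrative assumptions.

```python
import cmath
import math


def b_exponential(x, eps, L):
    """First form in Eq. (6): exp(-i pi x/L) - exp(-2 pi eps/L) exp(i pi x/L)."""
    return (cmath.exp(-1j * math.pi * x / L)
            - math.exp(-2 * math.pi * eps / L) * cmath.exp(1j * math.pi * x / L))


def b_sine(x, eps, L):
    """Second form in Eq. (6): -2i exp(-pi eps/L) sin(pi (x + i eps)/L)."""
    return -2j * math.exp(-math.pi * eps / L) * cmath.sin(math.pi * (x + 1j * eps) / L)


def max_mismatch(L=2.0, eps=0.05, samples=50):
    """Largest difference between the two forms on a grid covering S_L."""
    xs = [-L / 2 + L * k / samples for k in range(samples + 1)]
    return max(abs(b_exponential(x, eps, L) - b_sine(x, eps, L)) for x in xs)
```

At $\varepsilon = 0$ the modulus reduces to $2|\sin(\pi x/L)|$, which is the factor entering the ground state wave function discussed later.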
The reason for studying these correlation functions is the connection with the Calogero–Sutherland (CS) Hamiltonian [Su],
\[ H_{N,\beta} = -\sum_{j=1}^{N}\frac{\partial^2}{\partial x_j^2} + \sum_{\substack{j,k=1\\ j\neq k}}^{N}\frac{(\frac{\pi}{L})^2\,\beta(\beta-1)}{\sin^2\frac{\pi}{L}(x_j - x_k)}, \tag{7} \]
which is defined on the set of functions $f \in C^2(S_L^N;\mathbb{C})$ which are zero on $\{(x_1,\dots,x_N) \in S_L^N \mid x_j = x_k \text{ for some } k \neq j \text{ and/or } x_j = \pm L/2\}$, and which extends to a self-adjoint operator on $L^2(S_L^N)$.¹ We will prove that the eigenfunctions and spectrum of this Hamiltonian can be obtained from anyon correlation functions, namely as finite linear combinations of functions
\[ f_{\nu,N}(n|x) := \lim_{\varepsilon\downarrow 0}\Big\langle \Omega, \hat\phi^\nu(\tfrac{2\pi}{L}n_N)^*\cdots\hat\phi^\nu(\tfrac{2\pi}{L}n_1)^*\,\phi^\nu_\varepsilon(x_1)\cdots\phi^\nu_\varepsilon(x_N)\Omega\Big\rangle, \tag{8} \]
where $n_j \in \mathbb{N}_0$ (Theorem 3). We will obtain these results by constructing a self-adjoint operator $\mathcal{H}^{\nu,3}$ which can be regarded as a “second quantization” of the CS Hamiltonian Eq. (7) for $\beta = \nu^2$: it obeys the relations
\[ \mathcal{H}^{\nu,3}\phi^\nu_\varepsilon(x_1)\cdots\phi^\nu_\varepsilon(x_N)\Omega \simeq H_{N,\nu^2}\,\phi^\nu_\varepsilon(x_1)\cdots\phi^\nu_\varepsilon(x_N)\Omega, \]
where “$\simeq$” means “equal in the limit $\varepsilon \downarrow 0$” (see Theorem 2 for details). We obtain $\mathcal{H}^{\nu,3}$ by arguing by analogy with the well known W-algebra associated with fermions. Using analogous formulae we construct the first few generators $\mathcal{H}^{\nu,s}$, $s = 1, 2, 3$, of an anyon W-algebra. Understanding the complete anyon W-algebra is a problem we leave for further investigation. This main result implies explicit formulas for the eigenvalues and a simple algorithm to construct eigenvectors $\Psi_{\nu,N}(n)$ of $\mathcal{H}^{\nu,2}$ as finite linear combinations of vectors
\[ \eta_{\nu,N}(n) = \hat\phi^\nu(\tfrac{2\pi}{L}n_1)\cdots\hat\phi^\nu(\tfrac{2\pi}{L}n_N)\Omega, \qquad n = (n_1,\dots,n_N) \in \mathbb{N}_0^N. \tag{9} \]
These vectors Eq. (9) can be naturally interpreted as N-anyon states with anyon momenta $p_j = \frac{2\pi}{L}n_j$. Using these relations we can compute
\[ \langle \Psi_{\nu,N}(n), \mathcal{H}^{\nu,3}\phi^\nu_\varepsilon(x_1)\cdots\phi^\nu_\varepsilon(x_N)\Omega\rangle \]
in two different ways, and in the limit $\varepsilon \downarrow 0$ we obtain functions (of the variables $(x_1,\dots,x_N)$) in $L^2(S_L^N)$ which are the promised eigenfunctions of $H_{N,\nu^2}$ (Theorem 3). In the last subsection we observe that we recover the known spectrum of the CS Hamiltonian. Comparing with the known solutions of the CS model [Fo2], we can establish the relationship between the eigenfunctions of Theorem 3 and the Jack polynomials.
¹ Since Eq. (7) obviously is a positive symmetric operator, this follows e.g. from Theorem X.23 in Ref. [RS2] (the Friedrichs extension). Our approach will lead to a particular self-adjoint extension which is related to the standard one [Su] in a simple manner.
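As a concrete illustration of the Hamiltonian in Eq. (7), the sketch below applies it by finite differences to the product $\prod_{j<k}\sin^\beta\frac{\pi}{L}(x_j-x_k)$, the standard Sutherland ground state, and compares the result with the standard ground state energy $(\pi/L)^2\beta^2 N(N^2-1)/3$. That energy formula is a known fact about the Sutherland model, not a statement from this section. The function names, the step size and the sample point are assumptions made for the example, and the coordinates are taken in decreasing order so that every sine is positive.

```python
import math


def sutherland_ground_state(x, beta, L):
    """prod_{j<k} sin(pi (x_j - x_k)/L)**beta for x_1 > x_2 > ... > x_N."""
    psi = 1.0
    for j in range(len(x)):
        for k in range(j + 1, len(x)):
            psi *= math.sin(math.pi * (x[j] - x[k]) / L) ** beta
    return psi


def apply_cs_hamiltonian(f, x, beta, L, h=1e-4):
    """Evaluate (H_{N,beta} f)(x) from Eq. (7) with central second differences."""
    fx = f(x)
    kinetic = 0.0
    for j in range(len(x)):
        up, down = list(x), list(x)
        up[j] += h
        down[j] -= h
        kinetic -= (f(up) - 2.0 * fx + f(down)) / h ** 2
    potential = 0.0
    for j in range(len(x)):
        for k in range(len(x)):
            if j != k:  # ordered pairs, exactly as in the double sum of Eq. (7)
                s = math.sin(math.pi * (x[j] - x[k]) / L)
                potential += (math.pi / L) ** 2 * beta * (beta - 1.0) / s ** 2
    return kinetic + potential * fx


def ground_state_energy(n_particles, beta, L):
    """Standard Sutherland ground state energy in the normalization of Eq. (7)."""
    return (math.pi / L) ** 2 * beta ** 2 * n_particles * (n_particles ** 2 - 1) / 3.0


def local_energy(x, beta, L):
    """Ratio (H psi)(x) / psi(x); constant in x if psi is an eigenfunction."""
    psi = lambda y: sutherland_ground_state(y, beta, L)
    return apply_cs_hamiltonian(psi, x, beta, L) / psi(x)
```

For example, `local_energy([0.7, 0.1, -0.6], 1.7, 2.0)` should agree with `ground_state_energy(3, 1.7, 2.0)` up to discretization error.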
3. Preliminaries
The subsequent discussion relies on some standard material which is summarized in this section. We will follow essentially the treatment in [CHu, PS, CR].
3.1. Notation. We denote by $\mathbb{N}$ and $\mathbb{N}_0$ the positive and non-negative integers, respectively. Let
\[ \Lambda^* = \{p = \tfrac{2\pi}{L}n \mid n \in \mathbb{Z}\} \tag{10} \]
and
\[ \Lambda^*_0 = \{k = \tfrac{2\pi}{L}(n + \tfrac{1}{2}) \mid n \in \mathbb{Z}\}. \tag{11} \]
Our underlying Hilbert space for the fermions we take to be $L^2(S_L) \cong \ell^2(\Lambda^*_0)$. These are identified via the Fourier transform defined by
\[ \hat f(k) = \frac{1}{\sqrt{2\pi}}\int_{-L/2}^{L/2} dx\, f(x)\,e^{-ikx} \tag{12} \]
for $k \in \Lambda^*_0$. An orthogonal basis of $L^2(S_L)$ is provided by the functions
\[ e_k(x) = \frac{1}{\sqrt{2\pi}}\,e^{ikx}, \qquad k \in \Lambda^*_0, \tag{13} \]
and then we have
\[ f = \frac{2\pi}{L}\sum_{k}\hat f(k)\,e_k. \]
The spectral projection $P_-$ corresponding to the negative eigenvalues of $\frac{1}{i}\frac{\partial}{\partial x}$ is defined as $\widehat{(P_- f)}(k) = \hat f(k)$ for $k < 0$ and $= 0$ otherwise. We also use $P_+ = I - P_-$.
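The normalization conventions of Eqs. (12) and (13) can be checked on a simple example. The sketch below is illustrative only: the helper names, the midpoint quadrature and the test function are assumptions. It verifies that the $e_k$ are orthogonal with squared norm $L/2\pi$ and that the inversion formula with prefactor $2\pi/L$ reproduces a function built from a few modes in $\Lambda^*_0$.

```python
import cmath
import math


def momentum(n, L):
    """Element k = (2 pi / L)(n + 1/2) of Lambda*_0, Eq. (11)."""
    return 2 * math.pi * (n + 0.5) / L


def e_k(k, x):
    """Basis function of Eq. (13)."""
    return cmath.exp(1j * k * x) / math.sqrt(2 * math.pi)


def integrate(g, L, points=2000):
    """Midpoint rule on [-L/2, L/2]; exact for low-order trigonometric polynomials."""
    h = L / points
    return h * sum(g(-L / 2 + (i + 0.5) * h) for i in range(points))


def inner_product(k1, k2, L):
    """<e_k1, e_k2> in L^2(S_L)."""
    return integrate(lambda x: e_k(k1, x).conjugate() * e_k(k2, x), L)


def fourier_coefficient(f, k, L):
    """Fourier transform of Eq. (12)."""
    return integrate(lambda x: f(x) * cmath.exp(-1j * k * x), L) / math.sqrt(2 * math.pi)


def reconstruct(f, x, L, n_max=3):
    """Partial sum of f = (2 pi / L) sum_k f^(k) e_k over n = -n_max-1, ..., n_max."""
    total = 0j
    for n in range(-n_max - 1, n_max + 1):
        k = momentum(n, L)
        total += fourier_coefficient(f, k, L) * e_k(k, x)
    return 2 * math.pi / L * total
```

The test function used below, $\cos(\pi x/L) + \tfrac12\sin(3\pi x/L)$, contains only the modes $k = \pm\pi/L, \pm 3\pi/L$, so the truncated sum is exact up to rounding.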
3.2. Quasi-free representations of the CAR algebra. Let $\{a(f), a(g)^* \mid f, g \in L^2(S_L)\}$ be the usual generators of the fermion field algebra over $L^2(S_L)$, satisfying the canonical anticommutation relations (CAR)
\[ a(f)a(g) + a(g)a(f) = 0, \qquad a(f)a(g)^* + a(g)^*a(f) = \langle f, g\rangle_{L^2(S_L)}\,I. \tag{14} \]
In the representation $\pi_{P_-}$ of this algebra determined by the projection $P_-$ we write $\psi(f) = \pi_{P_-}(a(f))$. If $\Omega$ denotes the cyclic (or vacuum) vector in the Fock space $\mathcal{F}$ on which $\pi_{P_-}$ acts then this representation is specified by the following conditions,
\[ \psi(P_+ f)\Omega = 0 = \psi^*(P_- f)\Omega. \tag{15} \]
We also use the notation
\[ \hat\psi^{(*)}(k) = \psi^{(*)}(e_k), \qquad k \in \Lambda^*_0. \tag{16} \]
3.3. Wedge representation of the loop group. Each unitary operator $U$ on $L^2(S_L)$, with $P_\pm U P_\mp$ Hilbert–Schmidt, defines an “implementer” $\Gamma(U)$ on the Fock space $\mathcal{F}$ satisfying
\[ \Gamma(U)\psi(f)\Gamma(U)^{-1} = \psi(Uf). \tag{17} \]
Of particular interest is the representation of the smooth loop group $G = C^\infty(S_L; U(1))$ of $U(1)$ by implementors of the unitaries $U(\varphi)$ acting on $L^2(S_L)$. These are defined for $\varphi \in G$ by
\[ U(\varphi)f = \varphi f, \qquad f \in L^2(S_L). \tag{18} \]
Then $\Gamma$ gives a projective representation of $G$ on $\mathcal{F}$. Writing $\Gamma(\varphi)$ for $\Gamma(U(\varphi))$ we may choose
\[ \Gamma(\varphi)^* = \Gamma(\varphi^*) \tag{19} \]
and we have
\[ \Gamma(\varphi)\Gamma(\varphi') = \sigma(\varphi,\varphi')\,\Gamma(\varphi\varphi'), \tag{20} \]
where $\sigma(\varphi,\varphi')$ is some $U(1)$ valued group two-cocycle on $G$. We will determine this cocycle next. The choice of phase of $\Gamma(\varphi)$ is important for giving an exact formula for $\sigma$. For those $\varphi = e^{i\alpha}$, with $\alpha \in \mathrm{Lie}\,G := C^\infty(S_L;\mathbb{R})$, the map $r \to \Gamma(e^{ir\alpha})$ is required to be a one parameter group such that the generator $d\Gamma(\alpha)$ of this group satisfies $\langle \Omega, d\Gamma(\alpha)\Omega\rangle = 0$. Then we have
\[ [d\Gamma(\alpha), \psi(g)^*] = \psi(\alpha g)^* \tag{21} \]
and a standard calculation [CHu, PS, CR] gives
\[ [d\Gamma(\alpha_1), d\Gamma(\alpha_2)] = i\,s(\alpha_1,\alpha_2)\,I \tag{22} \]
with the Lie algebra two-cocycle
\[ s(\alpha_1,\alpha_2) = \frac{1}{4\pi}\int_{-L/2}^{L/2} dx\,\Big(\frac{d\alpha_1(x)}{dx}\,\alpha_2(x) - \alpha_1(x)\,\frac{d\alpha_2(x)}{dx}\Big). \tag{23} \]
Hence the
\[ \Gamma(e^{i\alpha}) = e^{i\,d\Gamma(\alpha)} \tag{24} \]
are Weyl operators satisfying Eq. (20) with $\sigma(e^{i\alpha_1}, e^{i\alpha_2}) = e^{-is(\alpha_1,\alpha_2)/2}$. We will also use $d\Gamma(\alpha)$ for complex valued $\alpha$. These are naturally defined by linearity,
\[ d\Gamma(\alpha_1 + i\alpha_2) = d\Gamma(\alpha_1) + i\,d\Gamma(\alpha_2), \qquad \alpha_{1,2} \in C^\infty(S_L;\mathbb{R}). \tag{25} \]
Then
\[ d\Gamma(\alpha)^* = d\Gamma(\alpha^*) \tag{26} \]
(we use the same symbol $*$ for Hilbert space adjoints and complex conjugation), and Eqs. (22) and (23) extend to $C^\infty(S_L;\mathbb{C})$ so that $s$ defines a complex bilinear form in an obvious way. Here a technical remark is in order. The operators $d\Gamma(\alpha)$, $\alpha \in C^\infty(S_L;\mathbb{C})$, are all unbounded. However, there is a common, dense domain $\mathcal{D}$ which is left invariant by all operators $\Gamma(\varphi)$, $\varphi \in G$ (this is discussed in more detail in Appendix B). Thus Eqs. (21), (22), (25) and similar equations below are all well-defined on $\mathcal{D}$. We also note that all vectors in $\mathcal{D}$ are analytic for all the operators $d\Gamma(\alpha)$, $\alpha \in C^\infty(S_L;\mathbb{C})$ (see e.g. [CR]). It is convenient to decompose loops into their positive, negative and zero Fourier components,
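\[ \alpha(x) = \alpha^+(x) + \alpha^-(x) + \bar\alpha, \qquad \alpha^\pm(x) = \frac{1}{L}\sum_{\pm p>0}\hat\alpha(p)\,e^{ipx}, \qquad \bar\alpha = \frac{1}{L}\hat\alpha(0), \tag{27} \]
where
\[ \hat\alpha(p) = \int_{-L/2}^{L/2} dx\,\alpha(x)\,e^{-ipx}, \qquad p \in \Lambda^*. \tag{28} \]
Then
\[ d\Gamma(\alpha) = d\Gamma(\alpha^+) + d\Gamma(\alpha^-) + \bar\alpha\,Q \tag{29} \]
with $Q = d\Gamma(1)$. Note that
\[ d\Gamma(\alpha^-)\Omega = d\Gamma(\alpha^+)^*\Omega = Q\Omega = 0 \tag{30} \]
(highest weight condition), implying
\[ \langle \Omega, d\Gamma(\alpha_1)\,d\Gamma(\alpha_2)\Omega\rangle = \langle \Omega, d\Gamma(\alpha_1^-)\,d\Gamma(\alpha_2^+)\Omega\rangle = i\,s(\alpha_1^-,\alpha_2^+). \tag{31} \]
We also have $i\,s(\alpha^-,\alpha^+) \geq 0$, which can also easily be seen from the explicit formula for $s$,
\[ i\,s(\alpha_1,\alpha_2) = \sum_{p\in\Lambda^*}\frac{p}{2\pi L}\,\hat\alpha_1(-p)\,\hat\alpha_2(p). \tag{32} \]
Standard arguments now give us (for $\alpha$ real valued)
\[ \langle \Omega, \Gamma(e^{i\alpha})\Omega\rangle = e^{-is(\alpha^-,\alpha^+)}. \tag{33} \]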
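The agreement between the integral form of the cocycle, Eq. (23), and its mode sum, Eq. (32), can be checked numerically for trigonometric polynomials. The sketch below is an illustration under stated assumptions: loops are represented by hypothetical dictionaries mapping the integer $n$ (with $p = 2\pi n/L$) to $\hat\alpha(p)$, following the convention $\alpha(x) = \frac{1}{L}\sum_p\hat\alpha(p)e^{ipx}$ of Eqs. (27) and (28).

```python
import cmath
import math


def loop_value(modes, x, L, derivative=False):
    """alpha(x) or alpha'(x) from Fourier modes {n: alpha_hat(2 pi n / L)}."""
    total = 0j
    for n, coeff in modes.items():
        p = 2 * math.pi * n / L
        factor = 1j * p if derivative else 1.0
        total += factor * coeff * cmath.exp(1j * p * x)
    return total / L


def cocycle_integral(modes1, modes2, L, points=4000):
    """s(alpha_1, alpha_2) from Eq. (23), using the midpoint rule."""
    h = L / points
    acc = 0j
    for i in range(points):
        x = -L / 2 + (i + 0.5) * h
        acc += (loop_value(modes1, x, L, True) * loop_value(modes2, x, L)
                - loop_value(modes1, x, L) * loop_value(modes2, x, L, True))
    return acc * h / (4 * math.pi)


def cocycle_modes(modes1, modes2, L):
    """s(alpha_1, alpha_2) from Eq. (32): i s = sum_p p/(2 pi L) a1(-p) a2(p)."""
    i_times_s = 0j
    for n, coeff2 in modes2.items():
        p = 2 * math.pi * n / L
        i_times_s += p / (2 * math.pi * L) * modes1.get(-n, 0j) * coeff2
    return i_times_s / 1j


def positive_part_pairing(modes, L):
    """i s(alpha^-, alpha^+) for a real loop: sum over p > 0 of p |alpha_hat(p)|^2 / (2 pi L)."""
    return sum(2 * math.pi * n / L * abs(c) ** 2 / (2 * math.pi * L)
               for n, c in modes.items() if n > 0)
```

Real-valued loops correspond to modes with $\hat\alpha(-p) = \hat\alpha(p)^*$; for such input both functions return the same real number up to rounding, and `positive_part_pairing` is manifestly non-negative.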
We also need $R = \Gamma(\phi_1)$ which implements the operator $U(\phi_1)$, where $\phi_1(x) = e^{2\pi ix/L}$ (for an explicit construction of $\Gamma(\phi_1)$ see e.g. [R]). The phase of this unitary operator will be fixed later. Notice that
\[ R^{-1}\,d\Gamma(\alpha)\,R = d\Gamma(\alpha) + \bar\alpha\,I. \tag{34} \]
(This will be explained in more detail in Appendix A.) General loops in $G$ are of the form $\varphi = e^{if}$ with
\[ f(x) = w\,\frac{2\pi}{L}\,x + \alpha(x) \tag{35} \]
with periodic $\alpha$ and integer $w = [f(L/2) - f(-L/2)]/2\pi$ ($w$ is the winding number of $\varphi$). We then define
\[ \Gamma(e^{if}) := e^{i\bar\alpha Q/2}\,R^w\,e^{i\bar\alpha Q/2}\,\Gamma(e^{i(\alpha^+ + \alpha^-)}). \tag{36} \]
This fixes the phase for all implementors. With that we get
\[ \sigma(e^{if_1}, e^{if_2}) = e^{-iS(f_1,f_2)/2}, \tag{37} \]
where we introduced
\[ S(f_1,f_2) = s(\alpha_1,\alpha_2) + (w_{f_1}\bar\alpha_2 - \bar\alpha_1 w_{f_2}). \tag{38} \]
It is worth noting that one can write
\[ S(f_1,f_2) = f_1(\tfrac{L}{2})\,f_2(-\tfrac{L}{2}) - f_1(-\tfrac{L}{2})\,f_2(\tfrac{L}{2}) + \frac{1}{4\pi}\int_{-L/2}^{L/2} dx\,\Big(\frac{df_1(x)}{dx}\,f_2(x) - f_1(x)\,\frac{df_2(x)}{dx}\Big) \tag{39} \]
which (up to trivial, but nevertheless important, rescaling of variables) is identical to the antisymmetric two-cocycle introduced by Segal [Se]. Notice that our choice of phase for the implementors implies that
\[ \langle \Omega, \Gamma(e^{if})\Omega\rangle = 0 \quad\text{if } w \neq 0. \tag{40} \]
We will need the following relation:
\[ \Gamma(e^{if_1})\,\Gamma(e^{if_2})\cdots\Gamma(e^{if_N}) = \Big(\prod_{j<k} e^{-iS(f_j,f_k)/2}\Big)\,\Gamma(e^{if_1}e^{if_2}\cdots e^{if_N}). \tag{41} \]
(56)
with
These functions have the following important properties which we summarize as
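Lemma 1.
\[ \begin{aligned} S(\alpha^-_{y,\varepsilon}, \alpha^+_{y',\varepsilon'}) &= \alpha^+_{y',\varepsilon+\varepsilon'}(y),\\ S(f_{y,\varepsilon}, f_{y',\varepsilon'}) &= \pi\,\mathrm{sgn}(y - y';\varepsilon+\varepsilon'),\\ S(\delta^\mp_{y,\varepsilon}, \alpha^\pm_{y',\varepsilon'}) &= -\delta^\pm_{y',\varepsilon+\varepsilon'}(y). \end{aligned} \tag{57} \]
(The proof of these relations is a straightforward calculation which we skip.)
Then for $\varepsilon > 0$ and integer $\nu$ the operators $\phi^\nu_\varepsilon(y) := {}^\times_\times\,\Gamma(e^{i\nu f_{y,\varepsilon}})\,{}^\times_\times = \phi^{-\nu}_\varepsilon(y)^*$ are well-defined, and from Lemma 1 and Eqs. (20) and (37) we conclude
\[ \phi^\nu_\varepsilon(y)\,\phi^{\nu'}_{\varepsilon'}(y') = e^{-i\pi\nu\nu'\,\mathrm{sgn}(y-y';\varepsilon+\varepsilon')}\,\phi^{\nu'}_{\varepsilon'}(y')\,\phi^\nu_\varepsilon(y). \tag{58} \]
For odd integers $\nu, \nu'$ and in the limit $\varepsilon, \varepsilon' \downarrow 0$ these formally become anticommutator relations. This suggests that the $\phi^{\pm1}_\varepsilon(y)$ in this limit are fermion operators. Indeed one can prove
\[ \hat\psi^*(k) = \lim_{\varepsilon\downarrow 0}\frac{1}{\sqrt{2\pi L}}\int_{-L/2}^{L/2} dy\,\phi^1_\varepsilon(y)\,e^{iky}, \qquad k \in \Lambda^*_0, \tag{59} \]
in the sense of strong convergence on a dense domain (see e.g. [CHu, PS]). This is the central result of the boson-fermion correspondence. We note that this relation also fixes the phase of the unitary operator $R$.
4.2. Construction of anyons. To construct anyons we have to extend the relations Eq. (58) to non-integer $\nu, \nu'$. However, the functions $e^{i\nu f_{y,\varepsilon}(x)}$ are not periodic and thus $\Gamma(e^{i\nu f_{y,\varepsilon}})$ does not exist. To circumvent this problem, we note that $S(f_1,f_2)$ Eq. (38) is invariant under changes $\bar\alpha_i \to \bar\alpha_i\lambda$ and $w_i \to w_i/\lambda$ with an arbitrary scaling parameter $\lambda$. We use this to construct a function $\tilde f_{y,\varepsilon}(x)$ which has the following properties,
(i) $e^{i\nu\tilde f_{y,\varepsilon}(x)}$ is periodic for all $\nu$,
(ii) $S(\tilde f_{y,\varepsilon}, \tilde f_{y,\varepsilon}) = S(f_{y,\varepsilon}, f_{y,\varepsilon})$.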
Since the functions $\nu\tilde f_{y,\varepsilon}(x)$ have winding numbers different from zero, the first requirement can only be fulfilled for $\nu$ values which are an integer multiple of some fixed number $\nu_0 > 0$. Then
\[ \tilde f_{y,\varepsilon}(x) = \frac{2\pi}{L\nu_0}\,x - \frac{2\pi\nu_0}{L}\,y + \alpha^+_{y,\varepsilon}(x) + \alpha^-_{y,\varepsilon}(x) \tag{60} \]
has the desired properties. Thus the operators
\[ \phi^\nu_\varepsilon(y) := {}^\times_\times\,\Gamma(e^{i\mu\nu_0\tilde f_{y,\varepsilon}})\,{}^\times_\times = \phi^{-\nu}_\varepsilon(y)^*, \qquad \nu := \nu_0\mu,\ \mu \in \mathbb{Z}, \tag{61} \]
are well-defined for $\varepsilon > 0$, and they obey the exchange relations Eq. (58) but now for all $\nu, \nu'$ which are integer multiples of $\nu_0$. Thus the theory of loop groups provides a simple and rigorous construction of regularized anyon field operators $\phi^\nu_\varepsilon(x)$.
Remark. To be precise, one should denote the anyon operators defined in Eq. (61) as $\phi^{\nu_0,\mu}_\varepsilon(y)$. Then Eq. (58) would read
\[ \phi^{\nu_0,\mu}_\varepsilon(y)\,\phi^{\nu_0,\mu'}_{\varepsilon'}(y') = e^{-i\pi\nu_0^2\mu\mu'\,\mathrm{sgn}(y-y';\varepsilon+\varepsilon')}\,\phi^{\nu_0,\mu'}_{\varepsilon'}(y')\,\phi^{\nu_0,\mu}_\varepsilon(y), \qquad \mu, \mu' \in \mathbb{Z}. \]
Making the $\nu_0$-dependence manifest would allow us to obtain slightly more general results. However, it would also lead to a proliferation of indices, which is a price we are not willing to pay.
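We note that this definition and Eq. (43) imply that the anyon fields are not periodic but the operators
\[ \check\phi^\nu_\varepsilon(y) := e^{i\pi\nu\nu_0Qy/L}\,\phi^\nu_\varepsilon(y)\,e^{i\pi\nu\nu_0Qy/L} = R^{\nu/\nu_0}\;{}^\times_\times\,e^{i\nu\,d\Gamma(\alpha^+_{y,\varepsilon} + \alpha^-_{y,\varepsilon})}\,{}^\times_\times \tag{62} \]
are. This suggests that the Fourier modes $\hat\phi^\nu(p)$ of the anyon fields as defined in Eq. (3) are well-defined operators. In fact:
Proposition 1. The $\hat\phi^\nu(p)$ defined in Eq. (3) are operators with $\mathcal{D}_b$ as common, dense, invariant domain. Especially,
\[ \hat\phi^\nu(0)\,R^\ell\,\Omega = R^{\ell+\nu/\nu_0}\,\Omega \qquad \forall \ell \in \mathbb{Z}. \tag{63} \]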
The proof of this is given in Appendix C. It implies that all vectors $\eta_{\nu,N}(n)$ Eq. (9) are in $\mathcal{D}_b$. This is important due to the following result, also proven in Appendix C:
Proposition 2. For $\eta \in \mathcal{D}_b$,
\[ F^\nu_\eta(x_1,\dots,x_N) := \lim_{\varepsilon\downarrow 0}\langle \eta, \phi^\nu_\varepsilon(x_1)\cdots\phi^\nu_\varepsilon(x_N)\Omega\rangle \tag{64} \]
exists and has the form
\[ F^\nu_\eta(x_1,\dots,x_N) = e^{-i\pi\nu^2(x_1+\dots+x_N)N/L}\,\Delta^{\nu^2}_N(x_1,\dots,x_N)\,P_\eta(\nu \mid e^{-2\pi ix_1/L},\dots,e^{-2\pi ix_N/L}), \tag{65} \]
where
\[ \Delta^{\nu^2}_N(x) := \lim_{\varepsilon\downarrow 0}\prod_{j=1}^{N}\prod_{k=j+1}^{N} b(x_j - x_k;\varepsilon)^{\nu^2} \tag{66} \]
with $b$ given in Eq. (6) and $P_\eta(\nu|z)$ a symmetric polynomial.² Especially, $F^\nu_\eta(x) \in L^2(S_L^N)$.
Proposition 2 follows from the following explicit formula derived in Appendix C: for $\eta_b$ Eq. (51),
\[ F^\nu_{\eta_b}(x_1,\dots,x_N) = \delta_{\ell,N\nu/\nu_0}\,e^{-i\pi\nu^2(x_1+\dots+x_N)N/L}\,\prod_{j=1}^{n}\Big(\sum_{k=1}^{N}\nu\,e^{-iq_jx_k}\Big)\,\Delta^{\nu^2}_N(x_1,\dots,x_N). \tag{67} \]
We note that $\Delta^{\nu^2}_N(x_1,\dots,x_N)$ equals, up to a constant, the well-known ground state wave function of the Sutherland model (see e.g. [Su]). This will be further explored in Sect. 6.
² I.e. a polynomial which is invariant under permutations of the arguments, see [McD].
Using Eqs. (9) and (3) we now obtain
\[ \begin{aligned} \eta_{\nu,N}(n) &= \lim_{\varepsilon_1,\dots,\varepsilon_N\downarrow 0}\int_{-L/2}^{L/2} dx_1\,e^{ip_1x_1}\cdots\int_{-L/2}^{L/2} dx_N\,e^{ip_Nx_N}\,\check\phi^\nu_{\varepsilon_1}(x_1)\cdots\check\phi^\nu_{\varepsilon_N}(x_N)\,\Omega\\ &= \lim_{\varepsilon_1,\dots,\varepsilon_N\downarrow 0}\int_{-L/2}^{L/2} dx_1\,e^{iP_1x_1}\cdots\int_{-L/2}^{L/2} dx_N\,e^{iP_Nx_N}\,\phi^\nu_{\varepsilon_1}(x_1)\cdots\phi^\nu_{\varepsilon_N}(x_N)\,\Omega \end{aligned} \tag{68} \]
with $p_j = \frac{2\pi}{L}n_j$ and
\[ P_j = P_{j,\nu,N}(n) = \frac{2\pi}{L}\Big(n_j + \nu^2\big(N - j + \tfrac{1}{2}\big)\Big), \qquad j = 1, 2, \dots, N. \tag{69} \]
(To derive this formula we used repeatedly $e^{icQ}R^w\Omega = e^{icw}R^w\Omega$ for $c \in \mathbb{R}$ and $w \in \mathbb{Z}$.) These $P_j$ can be interpreted as anyon momenta, and they will play an important role in Sect. 6. It is interesting to note how the momentum shifts $\propto \nu^2$ appear in our formalism: they are due to the factors $e^{i\pi\nu\nu_0Qx/L}$ in Eq. (3) which are necessary to make the anyon operators periodic. We finally formulate a highest weight condition for the Fourier modes of the anyon field operators which is analogous to Eq. (136) in Appendix A and will also play an important role in Sect. 6.
Proposition 3. The vector $\eta_{\nu,N}(n)$ Eq. (9) is non-zero only if the following conditions are fulfilled,
\[ n_1 + n_2 + \dots + n_N \geq 0, \tag{70} \]
\[ n_\ell + \sum_{j=\ell+1}^{N} 2^{\,j-1-\ell}\,n_j \geq 0 \quad\text{for } \ell = 1, 2, \dots, N. \tag{71} \]
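Again we defer the proof to Appendix C.
4.3. Anyon correlation functions. The results of the last two subsections enable us to complete one of our main aims, namely to compute all anyon correlation functions. First, Eqs. (42), (44), and (57) imply
\[ \phi^{\nu_1}_{\varepsilon_1}(x_1)\cdots\phi^{\nu_N}_{\varepsilon_N}(x_N) = J^{\nu_1,\dots,\nu_N}_{\varepsilon_1,\dots,\varepsilon_N}(x_1,\dots,x_N)\;{}^\times_\times\,\phi^{\nu_1}_{\varepsilon_1}(x_1)\cdots\phi^{\nu_N}_{\varepsilon_N}(x_N)\,{}^\times_\times \tag{72} \]
where
\[ J^{\nu_1,\dots,\nu_N}_{\varepsilon_1,\dots,\varepsilon_N}(x_1,\dots,x_N) = \prod_{j<k} b(x_j - x_k;\varepsilon_j+\varepsilon_k)^{\nu_j\nu_k}. \tag{73} \]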
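\[ \cdots\; e^{-ip_1y_1}\cdots\int_{-L/2}^{L/2} dy_N\,e^{-ip_Ny_N}\,\prod_{j,\ell}\check b(y_j - x_\ell; 2\varepsilon)^{-\nu^2}, \]
with $p_j = \frac{2\pi}{L}n_j$ and $\check b(x,\varepsilon) := 1 - e^{-\frac{2\pi}{L}\varepsilon}\,e^{i\frac{2\pi}{L}x}$. Comparing with Eq. (65) we see
\[ P_{\eta_{\nu,N}(n)}(z_1,\dots,z_N) = \lim_{\varepsilon\downarrow 0}\int_{-L/2}^{L/2} dy_1\,e^{-i\frac{2\pi}{L}n_1y_1}\cdots\int_{-L/2}^{L/2} dy_N\,e^{-i\frac{2\pi}{L}n_Ny_N}\;\prod_{j>j'}\Big(1 - e^{-\frac{2\pi}{L}\varepsilon}\,e^{i\frac{2\pi}{L}(y_j - y_{j'})}\Big)^{\nu^2}\;\prod_{j,\ell}\Big(1 - e^{-\frac{2\pi}{L}\varepsilon}\,e^{i\frac{2\pi}{L}y_j}\,z_\ell\Big)^{-\nu^2}. \]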
We can now expand the integrand in a Taylor series in the exponentials and then perform the $y_j$-integrations. The final result is
\[ P_{\eta_{\nu,N}(n)}(z_1,\dots,z_N) = L^N\,{\sum}'\;\prod_{j=1}^{N}\Big[\prod_{j'=1}^{j-1}\binom{\nu^2}{\mu_{jj'}}(-1)^{\mu_{jj'}}\Big]\Big[\prod_{\ell=1}^{N}\binom{-\nu^2}{m_{j\ell}}(-z_\ell)^{m_{j\ell}}\Big], \tag{76} \]
where $\binom{\pm\nu^2}{n}$ are the binomial coefficients as usual and ${\sum}'$ here means summation over all $\mu_{jj'}, m_{j\ell} \in \mathbb{N}_0$ such that
\[ \sum_{j'=1}^{j-1}\mu_{jj'} - \sum_{j'=j+1}^{N}\mu_{j'j} + \sum_{\ell=1}^{N} m_{j\ell} = n_j \quad\text{for } j = 1, 2, \dots, N. \tag{77} \]
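For $N = 1$ the constraint Eq. (77) leaves a single term, $P(z) = L\binom{-\nu^2}{n}(-z)^n$, and the Taylor expansion step can be tested directly at finite regularization. The sketch below is an illustration only: the function names and sample parameters are assumptions. It compares a midpoint-rule evaluation of $\int dy\, e^{-2\pi i n y/L}\,(1 - r\,e^{2\pi i y/L} z)^{-\nu^2}$, where $r = e^{-2\pi\varepsilon/L}$, with the single series term $L\binom{-\nu^2}{n}(-rz)^n$.

```python
import cmath
import math


def generalized_binomial(a, m):
    """Binomial coefficient (a choose m) for real a and integer m >= 0."""
    result = 1.0
    for i in range(m):
        result *= (a - i) / (i + 1)
    return result


def mode_integral(n, nu, z, eps, L, points=1024):
    """Midpoint-rule value of the regularized N = 1 integral for mode n >= 0."""
    r = math.exp(-2 * math.pi * eps / L)
    h = L / points
    total = 0j
    for i in range(points):
        y = -L / 2 + (i + 0.5) * h
        w = r * cmath.exp(2j * math.pi * y / L) * z
        total += cmath.exp(-2j * math.pi * n * y / L) * (1 - w) ** (-nu ** 2)
    return total * h


def series_term(n, nu, z, eps, L):
    """Single surviving term of the binomial series after the y-integration."""
    r = math.exp(-2 * math.pi * eps / L)
    return L * generalized_binomial(-nu ** 2, n) * (-r * z) ** n
```

As $\varepsilon \downarrow 0$ the factor $r$ tends to 1 and `series_term` reduces to the $N = 1$ case of Eq. (76).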
4.4. The braid group. The braid group will not play a role in our deliberations; however, we mention one observation for completeness. We define operators on the N-anyon subspace as follows. On a vector $\phi^\nu_\varepsilon(x_1)\cdots\phi^\nu_\varepsilon(x_N)\Omega$ define, for $i \in \{1, 2, \dots, N-1\}$, $\sigma_i$ to be the operator which interchanges the $i$th and $(i+1)$th arguments and multiplies by the phase
\[ e^{-i\pi\nu^2\,\mathrm{sgn}(x_i - x_{i+1};\varepsilon)/2}. \]
An easy calculation reveals that the braid relations hold: $\sigma_i\sigma_j = \sigma_j\sigma_i$ for $|i-j| > 1$, $\sigma_i^2 = 1$, and $\sigma_j\sigma_i\sigma_j = \sigma_i\sigma_j\sigma_i$ for $|i-j| = 1$. So we have a braid group action on each N-anyon subspace.
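The "easy calculation" can be mimicked on a toy model. In the sketch below a vector is represented by a pair (complex coefficient, tuple of positions), and the regularized sign function is replaced by the sharp sign, which is an assumption made for the illustration (the regularized version is defined elsewhere in the paper). Indices are 0-based and all names are hypothetical.

```python
import cmath
import math


def sharp_sign(t):
    """Sharp sign function, standing in for the regularized sgn(.; eps)."""
    return (t > 0) - (t < 0)


def sigma(i, state, nu):
    """Swap arguments i and i+1 and multiply by exp(-i pi nu^2 sgn(x_i - x_{i+1}) / 2)."""
    coeff, positions = state
    xs = list(positions)
    phase = cmath.exp(-1j * math.pi * nu ** 2 * sharp_sign(xs[i] - xs[i + 1]) / 2)
    xs[i], xs[i + 1] = xs[i + 1], xs[i]
    return (coeff * phase, tuple(xs))


def same_state(a, b, tol=1e-12):
    """Compare two (coefficient, positions) pairs."""
    return a[1] == b[1] and abs(a[0] - b[0]) < tol
```

Applying `sigma` twice at the same index returns the original pair, since the two phases are inverse to each other, and the two orderings of three neighbouring swaps collect the same three pair phases.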
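5. W-Charges
5.1. Motivation. There are self-adjoint operators $W^s$ on $\mathcal{F}$ obeying
\[ [W^s, \hat\psi^*(k)] = k^{s-1}\,\hat\psi^*(k) \quad \forall k \in \Lambda^*_0, \qquad W^s\Omega = 0 \qquad (s \in \mathbb{N}). \tag{78} \]
If we introduce an operator valued distribution $\psi^*(x)$ such that
\[ \hat\psi^*(k) = \psi^*(e_k) = \int_{S_L} dx\,e_k(x)\,\psi^*(x), \]
the commutator relations in Eq. (78) are (formally³) equivalent to
\[ [W^s, \psi^*(x)] = i^{s-1}\,\frac{\partial^{s-1}}{\partial x^{s-1}}\,\psi^*(x). \tag{79} \]
These operators $W^s$ can be represented in terms of the operators $\hat\rho(p)$ Eq. (49),
\[ \begin{aligned} W^1 &= \hat\rho(0),\\ W^2 &= \frac{\pi}{L}\sum_{p\in\Lambda^*}{}^\times_\times\,\hat\rho(p)\hat\rho(-p)\,{}^\times_\times,\\ W^3 &= \frac{4\pi^2}{3L^2}\sum_{p_1,p_2\in\Lambda^*}{}^\times_\times\,\hat\rho(p_1)\hat\rho(p_2)\hat\rho(-p_1-p_2)\,{}^\times_\times - \frac{\pi^2}{3L^2}\,\hat\rho(0), \end{aligned} \tag{80} \]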
etc. These formulas are known in the physics literature (see e.g. [B]). We shall construct operators which obey similar relations with the anyon field operators $\phi^\nu_\varepsilon(x)$. To explain our method, we will first present a construction of operators $W^s$ obeying Eq. (78) for all $s \in \mathbb{N}$. We then show how to partly extend this to anyons. The extension is essentially trivial for $s = 1, 2$. The first non-trivial case is $s = 3$. We propose a natural generalization of $W^3$ and show that it corresponds to a “second quantization” of the CS Hamiltonian Eq. (7), as described in the Introduction. To simplify our notation, we set $\nu_0 = \nu > 0$ in the rest of the paper.
³ Our results below will actually give a precise mathematical meaning to this.
5.2. W-charges for fermions. We define
\[ W^\nu_\varepsilon(y;a) := N^\nu(a)\Big({}^\times_\times\,e^{i\nu\,d\Gamma(\tilde f_{y+a,\varepsilon} - \tilde f_{y,\varepsilon})}\,{}^\times_\times - I\Big), \tag{81} \]
with functions $\tilde f_{y,\varepsilon}$ given by Eqs. (60), (53) and the normalization constant
\[ N^\nu(a) = \frac{i}{2L\nu^2\cos^{\nu^2}(\frac{\pi}{L}a)\,\tan(\frac{\pi}{L}a)}. \tag{82} \]
In this section we are mainly interested in the fermion case where $\nu = 1$, but in our discussion on anyons later we will need these formulas for general non-zero $\nu \in \mathbb{R}$. We claim that
\[ W^\nu(a) := \lim_{\varepsilon\downarrow 0}\int_{-L/2}^{L/2} dy\,W^\nu_\varepsilon(y;a) = \sum_{s=1}^{\infty}\frac{(-ia)^{s-1}}{(s-1)!}\,\nu^{s-2}\,W^{\nu,s} \tag{83} \]
defines an operator valued generating function for operators $W^{\nu,s}$, $s \in \mathbb{N}$. To be more precise:
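Lemma 2. For all $a \in \mathbb{R}$ and non-zero $\nu \in \mathbb{R}$, the operators $W^\nu(a)$ Eqs. (81)–(83) are well-defined on $\mathcal{D}_b$ and leave $\mathcal{D}_b$ invariant. Especially,
\[ W^\nu(a)\Omega = 0. \tag{84} \]
Moreover, Eq. (83) defines a family of operators $W^{\nu,s}$, $s \in \mathbb{N}$, which have $\mathcal{D}_b$ as a common, dense, invariant domain of definition.
The proof of this result is in Appendix D. We now show how to compute these operators $W^{\nu,s}$ explicitly. We define
\[ \tilde\delta_{y,\varepsilon}(x) := -\frac{1}{2\pi}\,\partial_y\tilde f_{y,\varepsilon}(x) = \delta_{y,\varepsilon}(x) + \frac{(1-\nu)}{L}, \tag{85} \]
where $\partial_y = \frac{\partial}{\partial y}$, and⁴
\[ \tilde\rho_\varepsilon(y) := d\Gamma(\tilde\delta_{y,\varepsilon}) = \rho_\varepsilon(y) + \frac{(1-\nu)}{L}\,Q. \tag{86} \]
With that we obtain
\[ d\Gamma(\tilde f_{y+a,\varepsilon} - \tilde f_{y,\varepsilon}) = -2\pi\sum_{s=1}^{\infty}\frac{a^s}{s!}\,\partial_y^{s-1}\tilde\rho_\varepsilon(y), \]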
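and one can expand $W^\nu_\varepsilon(y;a)$ Eq. (83) in a formal power series in $a$. A straightforward computation then gives
\[ \begin{aligned} W^{\nu,1} &= \int_{-L/2}^{L/2} dy\;{}^\times_\times\,\tilde\rho_\varepsilon(y)\,{}^\times_\times\Big|_{\varepsilon\downarrow 0},\\ W^{\nu,2} &= \pi\int_{-L/2}^{L/2} dy\;{}^\times_\times\,\tilde\rho_\varepsilon(y)^2\,{}^\times_\times\Big|_{\varepsilon\downarrow 0},\\ W^{\nu,3} &= \frac{4\pi^2}{3}\int_{-L/2}^{L/2} dy\;{}^\times_\times\,\tilde\rho_\varepsilon(y)^3\,{}^\times_\times\Big|_{\varepsilon\downarrow 0} + \frac{\pi^2}{3L^2\nu^2}\,(2 - 3\nu^2)\,W^{\nu,1}, \end{aligned} \tag{87} \]
etc. (this list can be easily extended with the help of a symbolic programming language like MAPLE). Note that for $\nu = 1$ these are identical to the operators in Eq. (80), $W^{1,s} = W^s$ for $s = 1, 2, 3$. Later we will also need the following formulas which are obtained by simple computations from the definitions above,
\[ \begin{aligned} W^{\nu,1} &= (2-\nu)\,Q,\\ W^{\nu,2} &= W^2 + \frac{\pi}{L}\,(1-\nu)(1-3\nu)\,Q^2,\\ W^{\nu,3} &= W^3 + 4\,\frac{\pi}{L}\,(1-\nu)\,QW^2 + \frac{4}{3}\Big(\frac{\pi}{L}\Big)^2(1-\nu)^2(4-\nu)\,Q^3 - \frac{2}{3}\Big(\frac{\pi}{L}\Big)^2(1-\nu)(1-\nu-3\nu^2)\,Q, \end{aligned} \tag{88} \]
etc. The result described in the last subsection can now be stated as follows.
⁴ Note that $\hat\rho(p) = \lim_{\varepsilon\downarrow 0}\int_{-L/2}^{L/2} dx\,\rho_\varepsilon(x)\,e^{-ipx}$, which motivates our notation.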
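Theorem 1. The operators $W^{1,s}$ obey the relations Eq. (78), i.e. $W^{1,s} = W^s$ for all $s \in \mathbb{N}$.
Proof. We recall Eq. (84) for $\nu = 1$. Here we will show that
\[ [W^1(a), \hat\psi^*(k)] = e^{-ika}\,\hat\psi^*(k). \tag{89} \]
These two relations prove the result, as can be seen by an expansion in a formal power series in $a$ and using Eq. (83). To prove Eq. (89) we use the boson-fermion correspondence Eq. (59). We thus compute the commutator of $W^1_{\varepsilon'}(y;a)$ with $\phi^1_\varepsilon(x) = \Gamma(e^{if_{x,\varepsilon}})$. With Eqs. (45), (46) and (57) we obtain
\[ [W^1_{\varepsilon'}(y;a), \phi^1_\varepsilon(x)] = (\cdots)\;{}^\times_\times\,\Gamma(e^{i[f_{x,\varepsilon} + f_{y+a,\varepsilon'} - f_{y,\varepsilon'}]})\,{}^\times_\times \]
with
\[ (\cdots) := N^1(a)\left[\frac{\sin\frac{\pi}{L}(y + a - x + i\tilde\varepsilon)}{\sin\frac{\pi}{L}(y - x + i\tilde\varepsilon)} - \text{c.c.}\right] = \frac{i}{2L}\left[\cot\frac{\pi}{L}(y - x + i\tilde\varepsilon) - \text{c.c.}\right], \]
where $\tilde\varepsilon = \varepsilon + \varepsilon'$ and c.c. means the same term complex conjugated. We now use that
\[ \pm\frac{i}{2L}\cot\frac{\pi}{L}(y - x \pm i\tilde\varepsilon) = \frac{1}{2L} + \delta^\pm_{x,\tilde\varepsilon}(y), \tag{90} \]
which is easily seen by expanding the l.h.s. as a Taylor series in $e^{\pm i(y-x)2\pi/L}e^{-\varepsilon 2\pi/L}$. Thus $(\cdots) = \delta_{x,\tilde\varepsilon}(y)$ independent of $a$ (!), and we obtain
\[ [W^1_{\varepsilon'}(y;a), \phi^1_\varepsilon(x)] = \delta_{x,\varepsilon+\varepsilon'}(y)\;{}^\times_\times\,\Gamma(e^{i[f_{x,\varepsilon} + f_{y+a,\varepsilon'} - f_{y,\varepsilon'}]})\,{}^\times_\times. \]
Using Eqs. (83) and (59) we thus obtain for the l.h.s. of Eq. (89),
\[ \lim_{\varepsilon\downarrow 0}\frac{1}{\sqrt{2\pi L}}\int_{-L/2}^{L/2} dx\,e^{ikx}\,\lim_{\varepsilon'\downarrow 0}\int_{-L/2}^{L/2} dy\,\delta_{x,\varepsilon+\varepsilon'}(y)\;{}^\times_\times\,\Gamma(e^{i[f_{x,\varepsilon} + f_{y+a,\varepsilon'} - f_{y,\varepsilon'}]})\,{}^\times_\times = \lim_{\varepsilon\downarrow 0}\frac{1}{\sqrt{2\pi L}}\int_{-L/2}^{L/2} dx\,e^{ikx}\;{}^\times_\times\,\Gamma(e^{if_{x+a,\varepsilon}})\,{}^\times_\times \]
in the sense of strong convergence on a dense domain. Recalling $\Gamma(e^{if_{x,\varepsilon}}) = \phi^1_\varepsilon(x)$ and using Eq. (59) again we obtain the r.h.s. of Eq. (89).
We finally discuss a technical point which will be important in the next section: our proof above shows that $[W^1(a), \phi^1_\varepsilon(x)] \simeq \phi^1_\varepsilon(x + a)$, where “$\simeq$” means equality after smearing with appropriate test functions and taking the strong limit $\varepsilon \downarrow 0$ on an appropriate dense domain. It will be useful to characterize “$\simeq$” more explicitly as follows. Using Eq. (61) we define
\[ \tilde\phi^\nu_\varepsilon(x;a) := \lim_{\varepsilon'\downarrow 0}\int_{-L/2}^{L/2} dy\,\delta_{x,\varepsilon+\varepsilon'}(y)\;{}^\times_\times\,\Gamma(e^{i\nu[\tilde f_{x,\varepsilon} + \tilde f_{y+a,\varepsilon'} - \tilde f_{y,\varepsilon'}]})\,{}^\times_\times. \tag{91} \]
Then $\tilde\phi^\nu_\varepsilon(x;a) \simeq \phi^\nu_\varepsilon(x + a)$. We now define
\[ \frac{\partial_\varepsilon^{s-1}}{\partial_\varepsilon x^{s-1}}\,\phi^\nu_\varepsilon(x) := \frac{\partial^{s-1}}{\partial a^{s-1}}\,\tilde\phi^\nu_\varepsilon(x;a)\Big|_{a=0} \tag{92} \]
for $s = 1, 2, \dots$, which we regard as $\varepsilon$-deformed differentiations. We specify the relation between these and the ordinary differentiations in the following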
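Lemma 3.
\[ \frac{\partial_\varepsilon^{s-1}}{\partial_\varepsilon x^{s-1}}\,\phi^\nu_\varepsilon(x) = \frac{\partial^{s-1}}{\partial x^{s-1}}\,\phi^\nu_\varepsilon(x) + \varepsilon\;{}^\times_\times\,c^{s,\nu}_\varepsilon(x)\,\phi^\nu_\varepsilon(x)\,{}^\times_\times, \tag{93} \]
where $c^{s,\nu}_\varepsilon(x)$ is a well-defined operator-valued distribution for $\varepsilon \downarrow 0$.⁵ Especially, $c^{1,\nu}_\varepsilon(x) = c^{2,\nu}_\varepsilon(x) = 0$ and
\[ c^{3,\nu}_\varepsilon(x) = \frac{(i\nu)^2}{L^2}\sum_{p_1,p_2\in\Lambda^*}\hat\rho(p_1)\hat\rho(p_2)\,e^{i(p_1+p_2)x}\,\frac{1}{\varepsilon}\Big(e^{-\varepsilon|p_1+p_2|} - e^{-\varepsilon|p_1|-\varepsilon|p_2|}\Big). \tag{94} \]
(The proof is a straightforward computation which we skip.)
5.3. W-charges for anyons. The considerations of the preceding section may be extended to cover the case of anyons, i.e. $\nu$ an arbitrary non-zero real number. Using an argument similar to that in the proof of Theorem 1, we compute
\[ [W^\nu_{\varepsilon'}(y;a), \phi^\nu_\varepsilon(x)] = (\cdots)\;{}^\times_\times\,\Gamma(e^{i\nu[\tilde f_{x,\varepsilon} + \tilde f_{y+a,\varepsilon'} - \tilde f_{y,\varepsilon'}]})\,{}^\times_\times \]
with $(\cdots)$ equal to
\[ \begin{aligned} N^\nu(a)\left[\left(\frac{\sin\frac{\pi}{L}(y + a - x + i\tilde\varepsilon)}{\sin\frac{\pi}{L}(y - x + i\tilde\varepsilon)}\right)^{\nu^2} - \text{c.c.}\right] &= N^\nu(a)\cos^{\nu^2}(\tfrac{\pi}{L}a)\left[\Big(1 + \tan(\tfrac{\pi}{L}a)\cot\tfrac{\pi}{L}(y - x + i\tilde\varepsilon)\Big)^{\nu^2} - \text{c.c.}\right]\\ &= \delta_{x,\tilde\varepsilon}(y) - \tfrac{1}{2}(\nu^2 - 1)\,a\,\partial_y\delta_{x,\tilde\varepsilon}(y) + O(a^2), \end{aligned} \tag{95} \]
where $\tilde\varepsilon = \varepsilon + \varepsilon'$ (in the last line we Taylor expanded in $a$ and used $\cot^2(z) = -1 - d\cot(z)/dz$ and Eq. (90)). Integrating this in $y$, performing a partial integration, and using Eq. (91) we thus obtain
\[ [W^\nu(a), \phi^\nu_\varepsilon(x)] = \tilde\phi^\nu_\varepsilon(x;a) + i\pi\nu(\nu^2 - 1)\,a\;{}^\times_\times\,[\tilde\rho_\varepsilon(x + a) - \tilde\rho_\varepsilon(x)]\,\tilde\phi^\nu_\varepsilon(x;a)\,{}^\times_\times + O(a^3). \]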
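The Taylor expansion in Eq. (95) can be tested numerically without knowing the mode expansion of $\delta_{x,\tilde\varepsilon}(y)$, by using the representation $\delta_{x,\tilde\varepsilon}(y) = \frac{i}{2L}[\cot\frac{\pi}{L}(y - x + i\tilde\varepsilon) - \text{c.c.}]$ that appears in the proof of Theorem 1. The sketch below is illustrative only: the function names, the sample point and the finite-difference step are assumptions, and complex powers are taken on the principal branch, which is harmless for small $a$.

```python
import cmath
import math


def normalization(nu, a, L):
    """N^nu(a) from Eq. (82)."""
    b = math.pi * a / L
    return 1j / (2 * L * nu ** 2 * math.cos(b) ** (nu ** 2) * math.tan(b))


def delta_reg(y, x, eps, L):
    """(i/2L)[cot(pi (y - x + i eps)/L) - c.c.], a real-valued regularized delta."""
    arg = math.pi * (y - x + 1j * eps) / L
    cot = cmath.cos(arg) / cmath.sin(arg)
    return (1j / (2 * L) * (cot - cot.conjugate())).real


def commutator_prefactor(nu, a, y, x, eps, L):
    """First line of Eq. (95): N^nu(a) [ (sin(..+a)/sin(..))^(nu^2) - c.c. ]."""
    arg = math.pi * (y - x + 1j * eps) / L
    ratio = cmath.sin(arg + math.pi * a / L) / cmath.sin(arg)
    term = ratio ** (nu ** 2)
    return (normalization(nu, a, L) * (term - term.conjugate())).real


def first_order_expansion(nu, a, y, x, eps, L, h=1e-5):
    """Last line of Eq. (95) without the O(a^2) remainder."""
    d_delta = (delta_reg(y + h, x, eps, L) - delta_reg(y - h, x, eps, L)) / (2 * h)
    return delta_reg(y, x, eps, L) - 0.5 * (nu ** 2 - 1) * a * d_delta
```

For $\nu = 1$ the prefactor coincides with `delta_reg` for every $a$, which is the $a$-independence noted in the proof of Theorem 1; for other $\nu$ the difference from `first_order_expansion` shrinks quadratically in $a$.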
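Comparing now equal powers of $a$ on both sides of the last equation using Eqs. (91)–(94), we see that the generalization of Theorem 1 to anyons holds true only for $s = 1, 2$,
\[ [W^{\nu,s}, \phi^\nu_\varepsilon(x)] = \nu^{2-s}\,i^{s-1}\,\frac{\partial^{s-1}}{\partial x^{s-1}}\,\phi^\nu_\varepsilon(x), \qquad s = 1, 2, \tag{96} \]
but for $s > 2$ we get correction terms, e.g.
\[ [W^{\nu,3}, \phi^\nu_\varepsilon(x)] = \frac{i^2}{\nu}\,\frac{\partial_\varepsilon^2}{\partial_\varepsilon x^2}\,\phi^\nu_\varepsilon(x) + 2\pi i(\nu^2 - 1)\;{}^\times_\times\,\tilde\rho_\varepsilon(x)'\,\phi^\nu_\varepsilon(x)\,{}^\times_\times, \tag{97} \]
where $\tilde\rho_\varepsilon(x)' := \partial_x\tilde\rho_\varepsilon(x)$. We define
\[ \mathcal{H}^{\nu,1} := \frac{1}{\nu}\,W^{\nu,1}, \qquad \mathcal{H}^{\nu,2} := W^{\nu,2} \tag{98} \]
⁵ We will only need this for $s = 1, 2, 3$ and thus do not specify the $c^{s,\nu}_\varepsilon(x)$ for $s > 3$.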
which according to Eq. (96) are the anyon W -charges for s = 1, 2. In the following we only consider the first non-trivial case s = 3. To proceed, it is crucial to observe that the correction term in Eq. (97) can be partly canceled using the following operator, Z C = −πi
L/2
−L/2
dy
× ×
× + − [ρ+ε (y) − ρ− ε (y)]∂y [ρε (y) + ρε (y)] ×
2π X × × =− ˆ ρ(−p) ˆ × pρ(p) ×, L p>0
ε↓0
(99)
where ± ρ± y,ε := d0(δy,ε ).
(100)
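This operator obeys the remarkable relations,
\[
C\phi^{\nu}_{\varepsilon}(x) + \phi^{\nu}_{\varepsilon}(x)C = 2\pi i\nu\; {}^{\times}_{\times}\, \tilde\rho_{\varepsilon}(x)'\,\phi^{\nu}_{\varepsilon}(x)\, {}^{\times}_{\times} + 2\; {}^{\times}_{\times}\, C\phi^{\nu}_{\varepsilon}(x)\, {}^{\times}_{\times}. \tag{101}
\]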
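The proof of this, which we now outline, is by a computation similar to the one leading to Eq. (97). We consider the operator
\[
V_{\varepsilon}(y;a,b) := {}^{\times}_{\times}\, e^{-ia\,d\Gamma(i\delta^{+}_{y,\varepsilon} - i\delta^{-}_{y,\varepsilon}) + ib\,d\Gamma(\partial_y\delta^{+}_{y,\varepsilon} + \partial_y\delta^{-}_{y,\varepsilon})}\, {}^{\times}_{\times} \tag{102}
\]
and observe that
\[
C = -\pi \lim_{\varepsilon\downarrow 0}\int_{-L/2}^{L/2} dy\, \left.\frac{\partial}{\partial a}\frac{\partial}{\partial b} V_{\varepsilon}(y;a,b)\right|_{a=b=0}. \tag{103}
\]
Using Eqs. (45), (46) and (57) one then computes $V_{\varepsilon'}(y;a,b)\phi^{\nu}_{\varepsilon}(x) + \phi^{\nu}_{\varepsilon}(x)V_{\varepsilon'}(y;a,b)$, which by a Taylor expansion in a and b and integrating in y gives Eq. (101). (For details see Appendix D, Proof of Lemma 4.)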
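We also note that Eq. (138) implies
\[
C R^{\ell} = 0 \quad \forall \ell \in \mathbb{Z}. \tag{104}
\]
Thus the operator
\[
H^{\nu,3} = \nu W^{\nu,3} + (1-\nu^2)\,C \tag{105}
\]
obeys the relation
\[
[H^{\nu,3}, \phi^{\nu}_{\varepsilon}(x)] = i^2\frac{\partial^2}{\partial x^2}\phi^{\nu}_{\varepsilon}(x) + 2(1-\nu^2)\big({}^{\times}_{\times}\, C\phi^{\nu}_{\varepsilon}(x)\, {}^{\times}_{\times} - \phi^{\nu}_{\varepsilon}(x)C\big),
\]
which follows by combining ν times Eq. (97) with $(1-\nu^2)$ times Eq. (101): the terms containing $\tilde\rho_{\varepsilon}(x)'$ cancel. Again there is a correction term; however, in contrast to the one in Eq. (97), it vanishes when applied to the vectors $R^w$! We obtain
\[
[H^{\nu,3}, \phi^{\nu}_{\varepsilon}(x)]R^w = i^2\frac{\partial^2}{\partial x^2}\phi^{\nu}_{\varepsilon}(x)R^w \tag{106}
\]
(we used Lemma 3 and ${}^{\times}_{\times}\, c^{s,\nu}_{\varepsilon}(x)\phi^{\nu}_{\varepsilon}(x)\, {}^{\times}_{\times} R^w = 0$). This seems to be the best we can do to generalize the relation Eq. (79) for s = 3 to the anyon case.

To fully appreciate this operator $H^{\nu,3}$ one has to extend the computation above to a product of multiple anyon operators. We thus obtain our main result: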
Theorem 2. The operator $H^{\nu,3}$ obeys the following relations,
\[
[H^{\nu,3}, \phi^{\nu}_{\varepsilon_1}(x_1)\cdots\phi^{\nu}_{\varepsilon_N}(x_N)]R^w = H_{N,\nu^2,\varepsilon}\,\phi^{\nu}_{\varepsilon_1}(x_1)\cdots\phi^{\nu}_{\varepsilon_N}(x_N)R^w \tag{107}
\]
for all integers w, where
\[
H_{N,\nu^2,\varepsilon} = -\sum_{k=1}^{N}\frac{\partial^2}{\partial x_k^2} + \sum_{\substack{k,\ell=1\\ k\neq\ell}}^{N}\frac{(\frac{\pi}{L})^2\,\nu^2(\nu^2-1)}{\sin^2\frac{\pi}{L}\big(x_k - x_\ell - i\,\mathrm{sgn}(k-\ell)(\varepsilon_k+\varepsilon_\ell)\big)} + C_{N,\nu^2,\varepsilon}(x) \tag{108}
\]
is a regularized version of the CS Hamiltonian Eq. (7), i.e. the function $C_{N,\nu^2,\varepsilon}(x)$$^6$ is non-singular and vanishes uniformly as $\varepsilon_j \downarrow 0$ for all j = 1, 2, ..., N.

The proof of this theorem is a straightforward but tedious extension of the computation leading to Eq. (106) (which is the special case N = 1), and the interested reader can find it in Appendix D.

6. The Calogero–Sutherland Hamiltonian and its Eigenfunctions

We are now ready to show how the results of the last section provide the means to construct eigenfunctions and the corresponding eigenvalues of the CS Hamiltonian Eq. (7).

6.1. Eigenfunctions from anyon correlation functions. We claim that Theorem 2 essentially relates the eigenfunctions of the Sutherland Hamiltonian $H_{N,\nu^2}$, Eq. (7), to the eigenvectors of the operator $H^{\nu,3}$. In fact the key step is just to observe the following elementary corollary of Theorem 2.

$^6$ The interested reader can find the definition of this function in Eq. (156) below.
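As a numerical aside on Eq. (108), the following Python sketch drops the regularization (it sets all ε_j = 0 and C_{N,ν²,ε} = 0, which is an assumption made here for illustration) and applies the resulting differential operator to the standard Sutherland ground state ∏_{k<ℓ} |sin(π(x_k − x_ℓ)/L)|^{ν²}. That ground state and its energy (π/L)² ν⁴ N(N² − 1)/3 are textbook facts about the model, not results derived in this section; the function names and the finite-difference step h are hypothetical choices.

```python
import math

def ground_state(x, nu, L):
    """Standard Sutherland ground state: prod_{k<l} |sin(pi (x_k - x_l)/L)|^(nu^2)."""
    lam = nu * nu
    psi = 1.0
    for k in range(len(x)):
        for l in range(k + 1, len(x)):
            psi *= abs(math.sin(math.pi * (x[k] - x[l]) / L)) ** lam
    return psi

def local_energy(x, nu, L, h=1e-4):
    """(H psi)/psi for Eq. (108) with eps_j = 0 and C = 0 (assumption of this sketch).

    The second derivatives are approximated by central finite differences.
    """
    lam = nu * nu
    psi0 = ground_state(x, nu, L)
    kinetic = 0.0
    for k in range(len(x)):
        x_plus = list(x)
        x_plus[k] += h
        x_minus = list(x)
        x_minus[k] -= h
        second = (ground_state(x_plus, nu, L) - 2.0 * psi0
                  + ground_state(x_minus, nu, L)) / (h * h)
        kinetic -= second
    potential = 0.0
    for k in range(len(x)):
        for l in range(len(x)):
            if k != l:  # the double sum in Eq. (108) counts every pair twice
                s = math.sin(math.pi * (x[k] - x[l]) / L)
                potential += (math.pi / L) ** 2 * lam * (lam - 1.0) / (s * s)
    return kinetic / psi0 + potential

def expected_energy(n_particles, nu, L):
    """Textbook ground-state energy (pi/L)^2 nu^4 N (N^2 - 1) / 3."""
    return (math.pi / L) ** 2 * nu ** 4 * n_particles * (n_particles ** 2 - 1) / 3.0
```

Evaluating `local_energy` at a few generic points (away from coinciding particles) should give the same number as `expected_energy`, up to the finite-difference error, independently of the point chosen.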
Proposition 4. Let $\eta \in D_b$. Then
\[
\lim_{\varepsilon\downarrow 0}\,\langle \eta, H^{\nu,3}\phi^{\nu}_{\varepsilon}(x_1)\cdots\phi^{\nu}_{\varepsilon}(x_N)\rangle = H_{N,\nu^2}F^{\nu}_{\eta}(x_1,\ldots,x_N), \tag{109}
\]
where $F^{\nu}_{\eta}$ is defined in Eq. (64). In particular, if η is an eigenvector of $H^{\nu,3}$ with the eigenvalue E, then $F^{\nu}_{\eta}$ is an eigenfunction of $H_{N,\nu^2}$ with the same eigenvalue E.

The immediate next question is whether our method constructs all eigenvectors of (7). We answer this in two steps. We first state and prove another consequence of Theorem 2.

Proposition 5. The vectors $\eta^{\nu,N}(\mathbf{n})$ defined in Eq. (9) are in $D_b$, and they obey
\[
H^{\nu,3}\eta^{\nu,N}(\mathbf{n}) = E^{\nu,N}(\mathbf{n})\,\eta^{\nu,N}(\mathbf{n}) + \gamma\sum\sum^{\infty} n\,\eta^{\nu,N}(\mathbf{n} + n[\mathbf{e}_j - \mathbf{e}_\ell]) \tag{110}
\]
j K
(2.5)
was used. Note that the number of terms on the rhs of (2.4) is
\[
m_{ij} := \min(i+1, j+1). \tag{2.6}
\]
The fusion rules (2.4) can be regarded as a kind of “orbifold” of the multiplication table of SU(2) representations, with the sector $\phi_1$ playing the role of the fundamental representation: It obeys
\[
\phi_1\times\phi_0 = \phi_1, \qquad \phi_1\times\phi_i = \phi_{i-1} + \phi_{i+1} \ \text{ for } 0 < i < K, \qquad \phi_1\times\phi_K = \phi_{K-1} + \phi_K.
\]
Thus, its fusion graph – the graph with one node i per sector i and $N_{1i}^{j}$ lines from node i to node j – has a tadpole form, with the only loop attached to node K. This graph arises from the Dynkin diagram $A_{2K}$ (also familiar as the fusion graph of the fundamental representation sector in SU(2) WZW models) by dividing out the $Z_2$-symmetry. The abelian ring generated by the representations $\phi_i$, or equivalently by the fusion matrices $N_i$ with
\[
(N_i)_{jk} := N_{ij}^{k},
\]
is a polynomial fusion ring in the sense of [17]: It is spanned by $\phi_1$ and $\check p_i(\phi_1)$, where $\check p_i(x)$ is the ith Čebyshev polynomial, see also [45].

Other distinguished sectors of the c(2, q) models are the vacuum sector $\phi_0$ with fusion rules $\phi_0\times\phi_i = \phi_i$, and the sector $\phi_K$ corresponding to the representation with the minimal conformal dimension of all the values listed in (2.3), namely $h_K = -\frac{K(K+1)}{2(2K+3)}$. The fusion graph of $\phi_K$, denoted $G_q$, is described by the connectivity matrix
\[
C_q = \begin{pmatrix}
0 & 0 & \cdots & 0 & 1 \\
0 & 0 & \cdots & 1 & 1 \\
\vdots & & ⋰ & & \vdots \\
0 & 1 & \cdots & 1 & 1 \\
1 & 1 & \cdots & 1 & 1
\end{pmatrix} \tag{2.7}
\]
which is simply the symmetric (K + 1) × (K + 1) fusion matrix of $\phi_K$:
372
A. Recknagel
\[
(C_q)_{ij} = (N_K)_{ij}.
\]

Fig. 1. Fusion graphs $G_q$ of the minimal dimension field for q = 5, 7, 9; the labels i at the nodes refer to the sectors $\phi_i$
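The statement that $C_q$ is the fusion matrix of $\phi_K$ can be checked mechanically for small K. The Python sketch below builds $N_1$ from the tadpole rules quoted above, generates $N_2, \ldots, N_K$ from the recursion $N_1 N_i = N_{i-1} + N_{i+1}$ (which is just the rule $\phi_1\times\phi_i = \phi_{i-1}+\phi_{i+1}$ written in matrix form), and compares $N_K$ with the anti-triangular matrix (2.7). The function names are hypothetical, and reading (2.7) as "entry (i, j) equals 1 exactly when i + j ≥ K" is our interpretation of the displayed pattern.

```python
def tadpole_matrix(K):
    """Fusion matrix N_1 of phi_1: the chain 0-1-...-K with a loop at node K."""
    n1 = [[0] * (K + 1) for _ in range(K + 1)]
    for i in range(K):
        n1[i][i + 1] = n1[i + 1][i] = 1
    n1[K][K] = 1
    return n1

def matmul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def fusion_matrices(K):
    """Return [N_0, ..., N_K], using N_{i+1} = N_1 N_i - N_{i-1} for 0 < i < K."""
    size = K + 1
    identity = [[int(i == j) for j in range(size)] for i in range(size)]
    mats = [identity, tadpole_matrix(K)]
    for i in range(1, K):
        prod = matmul(mats[1], mats[i])
        mats.append([[prod[r][c] - mats[i - 1][r][c] for c in range(size)]
                     for r in range(size)])
    return mats

def connectivity_matrix(K):
    """C_q as read off from Eq. (2.7): entry (i, j) is 1 exactly when i + j >= K."""
    return [[int(i + j >= K) for j in range(K + 1)] for i in range(K + 1)]
```

For example, `fusion_matrices(2)[2]` and `connectivity_matrix(2)` both give the 3 × 3 matrix with rows (0, 0, 1), (0, 1, 1), (1, 1, 1). The row sums of $N_i$ also reproduce the term count $m_{ij} = \min(i+1, j+1)$ of (2.6).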
The graphs $G_q$, see Fig. 1 for examples, will play a surprising role in the following.

2.2. Path representations and global observable algebra. Consider some highest weight representation $H_h$ of the Virasoro algebra at central charge c with highest weight vector $|h\rangle$. It is well known that there exists a surjective homomorphism of Virasoro representations from the Verma module $V_h$ (with the same highest weight h and central charge c) onto $H_h$. Therefore, the images of the Poincaré-Birkhoff-Witt (PBW) vectors (which are a natural basis of $V_h$)
\[
L_{m_1}\cdots L_{m_n}|h\rangle, \tag{2.8}
\]
where $m_1 \le \ldots \le m_n < 0$ and $n \ge 0$, form a spanning system of $H_h$. If $H_h$ is an irreducible representation occurring in the minimal models, it contains null-vectors and the set (2.8) is linearly dependent. From the Kac determinant [21], the energy levels of the null-states are known, and there are also recursive formulas [9, 7] which in principle allow one to compute them explicitly.

Following a different route, Feigin, Nakanishi and Ooguri succeeded in singling out a linearly independent subset of (2.8) for c and h as in (2.2,3). A closer look at the construction given in [22] shows that the main fact to be exploited is simply that the vacuum vector is cyclic and separating for the local observable algebras. This implies that the non-trivial null-vectors of the vacuum representation h = 0 at c = c(2, q) re-appear as local (point-like) null-fields which generate the “annihilating ideal” [22] of the energy-momentum W-algebra. The annihilating ideal is represented (by zero) in all the other sectors as well, which produces null-vectors there.

In the vacuum Verma module $V_0$ with central charge c(2, q), q = 2K + 3, there is the trivial singular vector $L_{-1}|0\rangle$ and a second one $|s^{(q)}\rangle$ at energy q − 1, see [21], which can be written as
\[
|s^{(q)}\rangle = \lim_{z\to 0} N^{(q)}(z)|0\rangle = N^{(q)}_{1-q}|0\rangle
\]
for a certain primary null-field $N^{(q)}(z)$ of conformal dimension q − 1. Using the notations of [52, 12], we may give a recursive formula for the normal ordered products $N^{(q)}(z)$,
\[
N^{(5)} = N(T, T), \qquad N^{(q)} = N(T, N^{(q-2)}), \tag{2.9}
\]
see also [22]; the normal ordering prescription denoted by N(·, ·) involves usual normal ordering and additional corrections which render the resulting field quasi-primary. For
From Path Representations to Global Morphisms for a Class of Minimal Models
373
our purposes, the precise expression for the Laurent modes of $N^{(q)}(z)$ will be inessential; we only need the familiar part,
\[
N^{(q)}_n = \sum_{\substack{m_1,\ldots,m_{K+1}\\ m_1+\ldots+m_{K+1}=n}} {:}L_{m_1}\cdots L_{m_{K+1}}{:} + \ldots, \tag{2.10}
\]
where the dots indicate the correction terms – which are polynomials in the $L_m$ of degree at most K.

From the vacuum Verma module, we can pass to the irreducible vacuum representation by quotienting out the (intersecting) sub-Verma modules built over the singular vectors $|s^{(q)}\rangle$ and $L_{-1}|0\rangle$. Translating into the space of fields, this means that we have to set equal to zero all descendants of the null-field $N^{(q)}(z)$, i.e. the whole annihilating ideal. (The equation $L_{-1}|0\rangle = 0$ translates into $\partial_z 1 = 0$ and gives no information.) This means, in particular, that
\[
N^{(q)}_n|v\rangle = 0 \tag{2.11}
\]
for any mode of the null-field acting on any vector $|v\rangle$ in any of the representations $H_i$ with highest weight $h_i$ (2.3). Taking $|v\rangle$ to be one of the highest weight vectors $|h_i\rangle$, each of Eqs. (2.11) leads to linear relations among the PBW spanning system (2.8). Moreover, since $N^{(q)}(z)$ is primary and generates the full annihilating ideal, it is also clear that all linear dependencies in $H_i$ arise from (2.11), or from equations $\phi_m N^{(q)}_n|v\rangle = 0$ with some other field φ(z) in the energy-momentum W-algebra.

We can, therefore, determine a linearly independent subset of (2.8) from the null-field modes – provided we do the book-keeping correctly. This combinatorial problem can be solved by introducing a suitable lexicographic ordering of the PBW vectors: We refer to [22] for the details and mention only that one of the ordering criteria is the length k of the monomial $L_{m_1}\cdots L_{m_k}$; this should make it plausible that formula (2.10) for $N^{(q)}_n$ is accurate enough for our purposes. Taking everything together, Feigin, Nakanishi and Ooguri prove the following statement:

Proposition 2.1 ([22]). Let $H_i$ be an irreducible highest weight representation of the c(2, q) minimal model, q = 2K + 3, with highest weight $h_i$ as in (2.3), and let $|h_i\rangle$ denote the highest weight vector. Then $H_i$ has a basis consisting of those of the vectors $L_{m_1}\cdots L_{m_n}|h_i\rangle$ with $m_1 \le \ldots \le m_n < 0$ and $n \ge 0$ that satisfy the “difference two condition”
\[
m_{l+K} - m_l \ge 2 \tag{2.12}
\]
for $1 \le l \le n - K$ as well as the “initial condition”
\[
\#\{m_l = -1\} \le i. \tag{2.13}
\]
The difference two condition (2.12) controls whether monomials in the Virasoro generators are “too densely packed”: All PBW vectors that contain (sub-)arrays also appearing in the expansion (2.10) of null-field modes are removed from the basis. Whereas (2.12) characterizes the c(2, q) model as a whole, the initial condition (2.13), stating that the mode $L_{-1}$ appears at most i times in the independent PBW monomials belonging to $H_i$, allows one to distinguish the irreducible representations from each other.

Let us now derive path representations from the FNO bases. We first notice that the PBW vectors (2.8), with h being one of the $h_i$ in the list (2.3), can be uniquely expressed as sequences of non-negative integers via the identification
\[
L_{-M}^{a_M}\cdots L_{-2}^{a_2}L_{-1}^{a_1}|h_i\rangle \longmapsto (a_0(i); a_1, a_2, \ldots, a_M, 0, \ldots); \tag{2.14}
\]
compared to (2.8), we have rewritten multiple equal modes as powers $L_{-m}^{a_m}$ and have also filled in the gaps (i.e. $a_m = 0$ is possible). The “dummy variable” $a_0(i)$ is used to number the different sectors $h_i$. After this reformulation, the difference two condition (2.12) takes the simple form
\[
a_m + a_{m+1} \le K \quad \text{for all } m > 0, \tag{2.15}
\]
which implies that
\[
0 \le a_m \le K; \tag{2.16}
\]
the initial condition (2.13) of course means that
\[
a_1 \le i. \tag{2.17}
\]
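The equivalence between the difference two condition (2.12) and the neighbour bound (2.15) can be confirmed by brute force on small examples. The following Python sketch converts a list of modes $m_1 \le \ldots \le m_n < 0$ into the occupation numbers $(a_1, a_2, \ldots)$ of (2.14) and evaluates both conditions; the function names are hypothetical, and the sector label $a_0(i)$ is left out because it plays no role in this comparison.

```python
from itertools import combinations_with_replacement

def occupation_sequence(modes):
    """Map sorted modes m_1 <= ... <= m_n < 0 to [a_1, ..., a_M], a_m = #{l : m_l = -m}."""
    highest = max((-m for m in modes), default=0)
    a = [0] * highest
    for m in modes:
        a[-m - 1] += 1
    return a

def difference_two(modes, K):
    """Condition (2.12): m_{l+K} - m_l >= 2 for 1 <= l <= n - K."""
    return all(modes[l + K] - modes[l] >= 2 for l in range(len(modes) - K))

def neighbour_bound(a, K):
    """Condition (2.15): a_m + a_{m+1} <= K for all m > 0 (entries beyond the list are 0)."""
    padded = list(a) + [0]
    return all(padded[m] + padded[m + 1] <= K for m in range(len(a)))
```

For instance, the monomial $L_{-3}L_{-1}^2$ corresponds to `occupation_sequence([-3, -1, -1])`, i.e. `[2, 0, 1]`; it passes both tests for K = 2 and fails both for K = 1.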
We can get rid of this extra inequality by choosing the numbering $a_0$ of sectors appropriately: Set $a_0(i) = K - i$ in Eq. (2.14), then the initial condition is nothing but the difference two condition (2.15) extended to m = 0. We have accomplished a one-to-one map from the FNO basis vectors, Proposition 2.1, to sequences $(a_m)_{m\ge 0}$ of integers $a_m \in \{0, \ldots, K\}$ which are subject to condition (2.16) for all $m \ge 0$.

In the next step, we encode these restrictions into the following labeled graph: It consists of K + 1 nodes which we denote by i running from 0 to K. The node i carries the label
\[
l(i) := K - i. \tag{2.18}
\]
The graph contains (precisely) one edge from node i to node j if the associated labels satisfy
\[
l(i) + l(j) \le K, \tag{2.19}
\]
and none otherwise. This means that there are i + 1 unoriented edges emanating from the node i. As it happens, the graph we have described is just the fusion graph $G_q$ of the minimal dimension field $\phi_K$, with its connectivity matrix given in Eq. (2.7).

Now consider a path on $G_q$, i.e. a sequence $(i_m)_{m\ge 0}$, where each $i_m$ is a node of $G_q$ such that there is an edge in $G_q$ from the node $i_m$ to the node $i_{m+1}$ for all $m \ge 0$. Since the labeling (2.18) is unambiguous, each path of $G_q$ is in one-to-one correspondence to a sequence of labels $(l(i_m))_{m\ge 0}$ which satisfy $0 \le l(i_m) \le K$ and $l(i_m) + l(i_{m+1}) \le K$ – and these are precisely the requirements (2.15,16) on the sequences of $a_m$ discussed above. Therefore, we have established a connection from the FNO bases of the irreducible modules of the c(2, q) model to paths over the fusion graph $G_q$. Since the monomials occurring in the PBW system (2.8) are finite, we also have to restrict to finite paths (to finite label sequences) – or to paths stabilizing at the node K, i.e. paths such that there exists an $M \ge 0$ with
\[
i_m = K \quad \text{for all } m > M \tag{2.20}
\]
($l(i_m) = 0$ for all m > M). In the following we will often speak of finite paths (of arbitrary length) when we mean infinite paths subject to the “tail condition” (2.20).

The initial condition (2.17), on the other hand, really becomes a simple initial condition for paths: The FNO basis vectors of the irreducible representation $H_j$ correspond
to the (finite) paths starting from node j (i.e. $i_0 = j$). We denote the complex vector space generated by those paths by $P_j$.

Finally, our construction allows us to introduce an action of the energy operator $L_0$ on the path spaces $P_i$. For each path $|p\rangle = (i_m)_{m\ge 0} \in P_i$, let $(l_m)_{m\ge 0}$ be the associated sequence of labels. (Note that the associated sequences exist only for the paths themselves, not for arbitrary linear combinations.) We define an “energy operator” $L_0^G$ on $P_i$ by declaring each path to be an eigenvector of $L_0^G$:
\[
L_0^G|p\rangle = \Big(h_i + \sum_{m\ge 0} m\,l_m\Big)\cdot|p\rangle; \tag{2.21}
\]
the constant $h_i$ is purely conventional, but with this choice we recover precisely the energy values in the c(2, q) representations. We summarize our results in the following statement:

Proposition 2.2. Let $G_q$ be the (labeled) fusion graph of the minimal dimension field $\phi_K$ in the c(2, q) model, q = 2K + 3, defined by the connectivity matrix (2.7) and with labels as in Eq. (2.18). For i = 0, ..., K, denote by $P_i$ the vector space generated by all finite paths on $G_q$ which start at node i. Then $P_i$ and the irreducible highest weight representation $H_i$ of the c(2, q) model are isomorphic as Z-graded spaces,
\[
P_i \cong_{L_0} H_i,
\]
where the Z-gradings are given by the usual energy operator $L_0$ on $H_i$ and by the $L_0^G$ action (2.21) on $P_i$.

Let us emphasize that the path representations discussed here are not to be confused with the path spaces underlying certain RSOS models, see e.g. [6, 25, 55]. In particular, those paths are defined on $A_n$ Dynkin diagrams (with n different from the number of CFT sectors), and they have infinite tails that contribute non-trivially to the energy and even to the sector identification. They also do not admit the physical interpretation in terms of quasi-particles which will be discussed below. On the other hand, quasi-particle representations of the type considered here played a certain role in the work [23] on integrable representations of affine Lie algebras.

Up to now, we have worked with unbounded Laurent modes of point-like fields. If we wish, we can view the previous constructions as a discussion of Virasoro algebra representation theory and now forget all the details. Then, we define the path spaces $P_i$ over the graphs $G_q$ as the state spaces of the c(2, q) model. Together with the energy operator $L_0^G$, which governs the “time evolution” of the models, they contain all the invariant information that can be extracted from the FNO basis. In particular, we no longer need to identify “elementary” paths over $G_q$ (i.e. true paths as opposed to linear combinations thereof) with states in the PBW system. The apparent advantage of this identification is that it would provide a concrete action of the Virasoro generators $L_n$ on the path spaces. This “PBW type action”, however, can hardly be cast into a closed form and bears no relation to the path structure. In order to obtain useful formulas for the Virasoro action, one has to apply more refined methods [62] based on a novel interpretation of paths, which will be sketched in Sect. 2.3 below.

From the point of view of algebraic QFT, the unbounded Virasoro generators are not of prime importance, anyway. Instead, one would like to construct a covariant net of local observables A(I), $I \subset S^1$. As was mentioned in the introduction, we are not yet able to give a local description of the c(2, q) models, but the path picture for
the irreducible modules at least provides a natural algebra of bounded operators which we can view as the global observable algebra, more precisely the “universal” [26] observable algebra $A \equiv A_{\mathrm{univ}}$ generated by the A(I): Let $A_i = \pi_i(A)$ denote the global (universal) observable algebra acting in the representation space $H_i$ (or $P_i$). With the elements of $A_i$, we can map any state in $H_i$ (or path in $P_i$) into any other, therefore we take $A_i$ to be the path algebra associated with $P_i$.

To give a precise definition, we introduce some more notation. Let $P^{(n)}_{i,j}$ be the space of paths of length n over $G_q$ which run from node i to node j, i.e.
\[
P^{(n)}_{i,j} = \{\, |p\rangle = (i_0, \ldots, i_n) \mid i_m \text{ joined with } i_{m+1} \text{ on } G_q,\ i_0 = i,\ i_n = j \,\}_{\mathbb{C}}. \tag{2.22}
\]
The finite-dimensional algebra $A^{(n)}_{i,j}$ is linearly generated by so-called strings $|p\rangle\langle q|$, pairs of elementary paths $|p\rangle, |q\rangle \in P^{(n)}_{i,j}$, which are multiplied like matrix units,
\[
|p\rangle\langle q| \cdot |\tilde p\rangle\langle\tilde q| = \delta_{q,\tilde p}\,|p\rangle\langle\tilde q|,
\]
where $\delta_{q,\tilde p}$ is the Kronecker symbol. Strings act on paths in an analogous way, therefore $A^{(n)}_{i,j} = \mathrm{End}(P^{(n)}_{i,j})$. We can endow $P^{(n)}_{i,j}$ with a scalar product making (elementary) paths pairwise orthonormal; then $A^{(n)}_{i,j}$ becomes a ∗-algebra such that $(|p\rangle\langle q|)^* = |q\rangle\langle p|$. Denote by $A^{(n)}_i$ the multi-matrix algebra $A^{(n)}_i := \bigoplus_j A^{(n)}_{i,j}$.

Note that this standard scalar product on the path spaces does not directly reproduce the Shapovalov form on the irreducible modules via the original identification of PBW vectors with paths. But, as explained, this identification involved certain choices without any invariant meaning. Since the (invariant) $L_0$ eigenspaces are finite-dimensional, it is in principle possible to define a new (pseudo) scalar product on the spaces of fixed energy so as to reproduce the Shapovalov form, but a closed formula for all energy levels is not known. These questions have been addressed in [42], and also in [62]; again, the quasi-particle interpretation of paths seems to be useful in this context.

If the graph $G_q$ contains an edge from the node j to the node k, we can embed the path space $P^{(n)}_{i,j}$ into $P^{(n+1)}_{i,k}$ by concatenation of this edge to elementary paths; we denote the corresponding linear map by
\[
c_{jk} : P^{(n)}_{i,j} \longrightarrow P^{(n+1)}_{i,k}; \tag{2.23}
\]
note that $c_{jk}$ is independent of the starting node i. Concatenation induces an embedding of the simple factor $A^{(n)}_{i,j}$ of $A^{(n)}_i$ into $A^{(n+1)}_{i,k}$ whenever j and k are joined by an edge of $G_q$, i.e. whenever $(C_q)_{jk} = 1$, see (2.7). In other words, $C_q$ is the embedding matrix of the Bratteli diagram associated to the injection
\[
\Phi^{(n)} : A^{(n)}_i \longrightarrow A^{(n+1)}_i \tag{2.24}
\]
induced by (2.23). In turn, this diagram determines $\Phi^{(n)}$ up to inner isomorphism in $A^{(n+1)}_i$, see e.g. [11, 39].

Definition 2.3. The global path observable algebra of the c(2, q) minimal model is $A = A_0 \oplus \ldots \oplus A_K$, where $A_i$ is the C∗-closure of the inductive limit of the system $(A^{(n)}_i, \Phi^{(n)})$ with embeddings (2.24) induced by concatenation of paths.
By construction, each generator $L_n$ of the Virasoro algebra, restricted to finite energy subspaces of the representation $H_i$, can be expressed by elements of $A_i$; in particular, the “conformal Hamiltonian” $L_0^G$ of (2.21) is “affiliated” to A in the sense that all its spectral projections are contained in A. Instead of $A_i$, we will sometimes write $A_i = \pi_i(A) \equiv \mathrm{pr}_i(A)$.

Each $A_i$ is an AF-algebra, i.e. an “approximately finite-dimensional” algebra, see e.g. [11, 39]. The most convenient way to picture AF-algebras (up to isomorphism) is by the (infinite) Bratteli diagram associated to the “tower” $(A^{(n)}_i, \Phi^{(n)})$. This is an infinite graph, subdivided into floors which correspond to the finite-dimensional sub-algebras $A^{(n)}_i$, n = 0, 1, .... The nth floor consists of as many nodes as $A^{(n)}_i$ has simple factors (in our case: one node on the zeroth, K + 1 nodes on higher floors), and the nodes are labeled by the sizes m of the factors $M_m(\mathbb{C})$. In addition, the Bratteli diagram contains lines from each floor to the consecutive one, which fix the isomorphism class of the embedding $\Phi^{(n)}$ of $A^{(n)}_i$ into $A^{(n+1)}_i$; in our case, the lines are just the edges between nodes of $G_q$, only now “source” and “range” of an edge are viewed as belonging to different floors. An example is shown in Fig. 2.

Fig. 2. Bratteli diagrams $B_{5,0}$, $B_{5,1}$ and $\hat B_5$, associated to the observable algebras in the vacuum sector, in the $h = -\frac{1}{5}$ representation, and to the path field algebra of the c(2, 5) model. Labels on the nth floor give the dimensions of the spaces of paths running in n steps from the top node (on the zeroth or −2nd floor of the diagram) to the node 0 (on the left) or 1 (on the right) of $G_5$
Since the maps (2.24) are unital, the labeling of floors n with n ≥ 1 follows from the dimensions of the factors of $A^{(0)}_i$ and the structure of the (unlabeled) Bratteli diagram. The latter is the same for all $A_i$, i = 0, ..., K, for q = 2K + 3 fixed, except for the floors zero and one: It depends on the starting node i which other nodes can be reached on the first floor; among those, however, one always finds the node K, which in turn is connected to any other node of $G_q$, thus the second floor already contains K + 1 nodes.
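The floor labels of these Bratteli diagrams are just path counts on $G_q$, so they can be generated by repeated multiplication with the connectivity matrix. The Python sketch below does this; the function names are hypothetical, the entry rule "1 exactly when i + j ≥ K" is our reading of (2.7), and a label 0 simply means that the node is absent from that floor.

```python
def connectivity_matrix(K):
    """C_q as read off from Eq. (2.7): entry (i, j) is 1 exactly when i + j >= K."""
    return [[int(i + j >= K) for j in range(K + 1)] for i in range(K + 1)]

def bratteli_floors(K, start, n_floors):
    """Floor n lists, for each node j, the number of n-step paths on G_q from `start` to j."""
    c = connectivity_matrix(K)
    row = [int(j == start) for j in range(K + 1)]   # floor zero: only the start node
    floors = [row]
    for _ in range(n_floors - 1):
        row = [sum(row[k] * c[k][j] for k in range(K + 1)) for j in range(K + 1)]
        floors.append(row)
    return floors
```

For the c(2, 5) model (K = 1) and the vacuum sector (start node 0) the floors are (1, 0), (0, 1), (1, 1), (1, 2), (2, 3), (3, 5), i.e. consecutive Fibonacci numbers, in line with the labels shown in Fig. 2. For every K and every start node the third entry of the list (the second floor) has no zero entries, matching the remark that the second floor already contains K + 1 nodes.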
It is clear from the description that the infinite Bratteli diagram $B_{q,i}$ associated to the algebra $A_i$ simply displays all the infinite paths on $G_q$ with starting node i – which is why AF-algebras constructed in this particular way are also called path algebras [54].

In later sections, Bratteli diagrams (finite or infinite ones) will provide effective tools when we are interested in certain statements on algebras or homomorphisms thereof only up to isomorphism. One such statement follows immediately from the observation that, for fixed q, all the $B_{q,i}$ look alike except for the first few floors: This implies that the algebras $A_i$ are mutually stably isomorphic, i.e. their infinite matrix amplifications $M_\infty(A_i)$ are isomorphic – see e.g. [11]. Another fact which is obvious from the shape of the Bratteli diagrams is that all the $A_i$ have trivial center. Therefore, the center of the global path observable algebra A is $\mathbb{C}^{K+1}$ – in accordance with the general theorems on the universal observable algebra $A_{\mathrm{univ}}$ proven in [26].

2.3. Quasi-particle interpretation of paths. We have seen that our reformulation of the c(2, q) highest weight modules as path spaces gives a neat picture of these CFTs, with a great deal of information encoded in one labeled graph, which moreover is the fusion graph of a distinguished sector of the theory. While in the remainder of this paper, we are mainly interested in the specific consequence that the path representations provide us with an operator algebraic description of global features of the c(2, q) models, we here want to comment on other aspects of the path structure. We will give a natural interpretation of paths in terms of quasi-particles and use the decomposition of the modules $H_i$ into sub-sectors of fixed quasi-particle numbers to compute the characters
\[
\chi^K_i(q) = \mathrm{tr}_{H_i}\, q^{L_0 - \frac{c}{24}} \tag{2.25}
\]
of the c(2, q) model with q = 2K + 3. (We hope there will be no confusion between the formal variable q in (2.25) and the q that labels our conformal models.) It turns out that the quasi-particle picture of path spaces, details of which were first discussed in [62], leads to particular expressions for $\chi^K_i(q)$ that reveal interesting facts about the physics of the c(2, q) models.

For simplicity, let us first concentrate on the example of the c(2, 5) model, also known as the Lee–Yang edge singularity; this CFT describes a special critical point in the phase diagram of the Ising model with (complex) magnetic field [16]. The theory contains two irreducible modules with highest weights $h_0 = 0$ and $h_1 = -\frac{1}{5}$, which can be realized on the two-node tadpole graph $G_5$, see Fig. 1, with connectivity matrix
\[
C_5 = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}.
\]
The label sequences $(l_m)_{m\ge 0}$ associated to the paths over $G_5$ are sequences of 0's and 1's satisfying
\[
l_m = 1 \ \Rightarrow\ l_{m+1} = 0, \tag{2.26}
\]
the “tail condition” $l_m = 0$ for “large” m, and the initial conditions $l_0 = 1$ or $l_0 = 0$ for the vacuum or for the $h = -\frac{1}{5}$ representation, resp.

Here, we could directly interpret each sequence as a list of possible states of a quasi-particle, with a “1” in the mth position indicating that the mth state is occupied; then (2.26) would be a generalized Pauli principle (“more exclusive” than the usual one) imposed on the quasi-particles. However, in the higher c(2, q) models, it turns out that a slight change of perspective is to be preferred. Namely, we identify a quasi-particle of the
c(2, 5) model with the basic length two segment (1 0) of a sequence. Then (2.26) becomes the ordinary Pauli principle, simply prohibiting that two quasi-particles “overlap”. Except for that restriction (and the initial conditions on the sequences) the quasi-particles behave like free particles with a “dispersion relation” given by the energy operator $L_0^G$, Eq. (2.21).

Let us use this to compute the energies of all n-quasi-particle states, in the $h = -\frac{1}{5}$ representation, say. The state with the minimal energy corresponds to the sequence $s^{(n)}_0 = (0; 1, 0, 1, 0, \ldots, 1, 0, 0, \ldots)$, where the last “1” is at the (2n − 1)st position; because of (2.26), the quasi-particles cannot be packed more densely. $s^{(n)}_0$ has energy
\[
E_0(n) = h_1 + 1 + 3 + \ldots + (2n-1) = h_1 + n^2.
\]
The simplest excitations of this n-quasi-particle ground state are obtained by shifting the last segment “10” to the right, step by step, and filling in “0”s. The resulting states have energies $E_0(n) + 1$, $E_0(n) + 2$, etc. If we excite the last two quasi-particles simultaneously (by shifting the last segment “1010” to the right), we obtain states with energies $E_0(n) + 2$, $E_0(n) + 4$, etc. The same can be done with 3, 4, ..., n quasi-particles. If we combine several of such shifts, we can generate all n-quasi-particle states from the ground state $s^{(n)}_0$, and thus, the total contribution of the n-quasi-particle sector $H^{(n)}_1$ to the character is
\[
\mathrm{tr}_{H^{(n)}_1}\, q^{L_0 - \frac{c}{24}} = q^{h_1 - \frac{c}{24}} \cdot q^{n^2} \cdot \big(1 + q + q^2 + \ldots\big)\cdot\big(1 + q^2 + q^4 + \ldots\big)\cdots\big(1 + q^n + q^{2n} + \ldots\big);
\]
the product occurs since the different excitations are independent of each other. The total state space $H_1$ can be decomposed with respect to the quasi-particle number,
\[
H_1 = \bigoplus_{n\ge 0} H^{(n)}_1,
\]
which immediately leads to the following sum form for the character of the $h_1 = -\frac{1}{5}$ representation in the c(2, 5) minimal model:
\[
\chi^1_1(q) = q^{h_1 - \frac{c}{24}}\sum_{n\ge 0}\frac{q^{n^2}}{(q)_n},
\]
where we have used the abbreviation $(q)_n := (1-q)(1-q^2)\cdots(1-q^n)$. For the vacuum module, corresponding to label sequences starting with $l_0 = 1$, the same procedure leads to the formula
\[
\chi^1_0(q) = q^{h_0 - \frac{c}{24}}\sum_{n\ge 0}\frac{q^{n^2+n}}{(q)_n}.
\]
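The two c(2, 5) sum formulas can be compared directly with a count of label sequences. The Python sketch below works with truncated power series (lists of integer coefficients, with the prefactor $q^{h - c/24}$ stripped off): `path_series` counts the 0/1 sequences obeying (2.26) graded by the energy $\sum_m m\,l_m$, and `sum_series` expands $\sum_n q^{n^2 + \mathrm{shift}\cdot n}/(q)_n$. Both function names and the truncation parameter `order` are hypothetical.

```python
def path_series(l0, order):
    """Count label sequences (l_0; l_1, l_2, ...) of 0s and 1s with l_m = 1 => l_{m+1} = 0.

    The coefficient of q^e is the number of sequences with sum_m m*l_m = e,
    for a fixed initial label l0 (1 for the vacuum, 0 for h = -1/5).
    """
    # counts[last][e]: sequences whose most recent label is `last` and whose energy is e
    counts = {0: [0] * (order + 1), 1: [0] * (order + 1)}
    counts[l0][0] = 1
    for m in range(1, order + 1):
        new = {0: [0] * (order + 1), 1: [0] * (order + 1)}
        for e in range(order + 1):
            new[0][e] = counts[0][e] + counts[1][e]   # l_m = 0 is always allowed
            if e >= m:
                new[1][e] = counts[0][e - m]          # l_m = 1 requires l_{m-1} = 0
        counts = new
    return [counts[0][e] + counts[1][e] for e in range(order + 1)]

def sum_series(shift, order):
    """Coefficients of sum_{n>=0} q^(n^2 + shift*n) / (q)_n up to q^order."""
    total = [0] * (order + 1)
    n = 0
    while n * n + shift * n <= order:
        term = [0] * (order + 1)
        term[n * n + shift * n] = 1
        for k in range(1, n + 1):                     # multiply by 1/(1 - q^k)
            for e in range(k, order + 1):
                term[e] += term[e - k]
        total = [t + s for t, s in zip(total, term)]
        n += 1
    return total
```

With `order = 10`, the $h_1 = -\frac{1}{5}$ sector (`l0 = 0`, `shift = 0`) gives the coefficients 1, 1, 1, 1, 2, 2, 3, 3, 4, 5, 6 and the vacuum sector (`l0 = 1`, `shift = 1`) gives 1, 0, 1, 1, 1, 1, 2, 2, 3, 3, 4, on both sides of the comparison.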
For the other c(2, q) models, one may proceed in a similar fashion. Again, the Hilbert spaces (or the path spaces) can be decomposed with respect to the quasi-particle content of the states, but now K different (and independent) species of quasi-particles are present; thus,
\[
H_i = \bigoplus_{n_1,\ldots,n_K} H_i^{(n_1,\ldots,n_K)} \tag{2.27}
\]
in the c(2, q) model with q = 2K + 3. The correspondence between quasi-particles and basic segments of paths (i.e. label sequences) is as follows: Since as in the c(2, 5) case,
we want to interpret the difference two condition as an (ordinary) exclusion principle for quasi-particles, the latter must correspond to patterns of length two. Only now, each quasi-particle can occur in different “shapes”. The “lightest” particle is given by the segment (1 0) as before, the second lightest occurs in the two forms (2 0) or (1 1), and so forth up to the “heaviest” quasi-particle which is the class of segments (K 0), (K − 1 1), ..., (1 K − 1). Expressions like “lightest” for the moment just refer to the dispersion law (2.21). Besides respecting the exclusion principle dictated by the difference two condition, this identification of quasi-particles with classes of patterns of length two ensures that each quasi-particle can be excited in energy steps of one.

It is amusing to look at an example (with K ≥ 2) where a quasi-particle of the second lightest species upon excitation “moves through” one of the lightest type:
\[
(0; 2, 0, 1, 0, \ldots) \hookrightarrow (0; 1, 1, 1, 0, \ldots) \hookrightarrow (0; 1, 0, 2, 0, \ldots) \hookrightarrow (0; 1, 0, 1, 1, \ldots) \hookrightarrow \ldots .
\]
Each arrow indicates that the total energy of the configuration increases by one unit, without changing the quasi-particle content; in the first step, the heavier quasi-particle changes its appearance from (2 0) to (1 1); after the second step, it has passed the lighter quasi-particle, which then has “jumped” to the left – a phenomenon reminiscent of the “time shift” in soliton scattering.

This sketch already suggests how to implement the Virasoro modes $L_n$ on the path spaces in a way which takes the path structure into account. General considerations of Virasoro actions on path representations were presented in [42], and explicit formulas for the su(1,1) generators $L_{\pm 1}$ acting in the vacuum module were constructed in [62], following the guiding principle that these operators should leave the decomposition (2.27) of $H_i$ invariant. (In contrast, the $L_n$ one would obtain directly from the PBW vectors do not respect the quasi-particle numbers.) To state the precise formulas for $L_{\pm 1}$, and also to rigorously prove that the quasi-particles of different species can be excited independently of each other, is slightly technical, and we refer to [62] for the complete analysis.

Once these results are established, it is again straightforward to derive sum forms for the characters of the c(2, q) model, q = 2K + 3:
\[
\chi^K_i(q) = q^{h_i - \frac{c}{24}}\sum_{n_1,\ldots,n_K\ge 0}\frac{q^{N_1^2 + \cdots + N_K^2 + N_{i+1} + \cdots + N_K}}{(q)_{n_1}\cdots(q)_{n_K}}, \tag{2.28}
\]
where $N_i := n_i + \ldots + n_K$. The denominators show up because the different quasi-particles can be excited independently; the q-exponent in the numerator is the minimal energy of a configuration with $n_l$ quasi-particles of species l, l = 1, ..., K.

One can compare these formulas to the well-known Rocha-Caridi expressions for the characters of Virasoro minimal models, which follow directly from the Feigin-Fuchs results on the chain of singular vectors. If one applies the Jacobi triple product identity to the Rocha-Caridi characters, see [46], one obtains the product form
\[
\chi^K_i(q) = q^{h_i - \frac{c}{24}}\prod_{\substack{l>0\\ l\not\equiv 0,\pm(i+1)\ (\mathrm{mod}\ 2K+3)}}(1 - q^l)^{-1}. \tag{2.29}
\]
Equating (2.28) and (2.29) yields combinatorial identities known as Andrews-Gordon identities, see e.g. [5]. In the special case of K = 1, they reduce to the famous Rogers-Ramanujan identities.
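The equality of (2.28) and (2.29), and its agreement with the path counting of Proposition 2.2, can be tested order by order in q. The following Python sketch expands all three objects as truncated integer series, again without the common prefactor $q^{h_i - c/24}$. The function names and the truncation parameter `order` are hypothetical, and `path_side` assumes the sector label $a_0(i) = K - i$ together with the neighbour bound (2.15) extended to m = 0.

```python
from itertools import product
from math import isqrt

def times_geometric(series, k):
    """Multiply a truncated q-series (list of coefficients) by 1/(1 - q^k), in place."""
    for e in range(k, len(series)):
        series[e] += series[e - k]

def sum_side(K, i, order):
    """Coefficients of the quasi-particle sum (2.28) up to q^order."""
    total = [0] * (order + 1)
    # the exponent is at least N_1^2 >= n_l^2, so each n_l is at most isqrt(order)
    for n in product(range(isqrt(order) + 1), repeat=K):
        big_n = [sum(n[j:]) for j in range(K)]         # big_n[j] = N_{j+1}
        exponent = sum(x * x for x in big_n) + sum(big_n[i:])
        if exponent > order:
            continue
        term = [0] * (order + 1)
        term[exponent] = 1
        for n_l in n:                                  # divide by (q)_{n_l}
            for k in range(1, n_l + 1):
                times_geometric(term, k)
        total = [t + s for t, s in zip(total, term)]
    return total

def product_side(K, i, order):
    """Coefficients of the product form (2.29) up to q^order."""
    series = [1] + [0] * order
    modulus = 2 * K + 3
    excluded = {0, i + 1, modulus - (i + 1)}
    for l in range(1, order + 1):
        if l % modulus not in excluded:
            times_geometric(series, l)
    return series

def path_side(K, i, order):
    """Count label sequences with a_0 = K - i, a_m + a_{m+1} <= K, graded by sum_m m*a_m."""
    counts = [[0] * (order + 1) for _ in range(K + 1)]  # counts[last label][energy]
    counts[K - i][0] = 1
    for m in range(1, order + 1):
        new = [[0] * (order + 1) for _ in range(K + 1)]
        for prev in range(K + 1):
            for a in range(K - prev + 1):
                for e in range(order + 1 - m * a):
                    new[a][e + m * a] += counts[prev][e]
        counts = new
    return [sum(counts[a][e] for a in range(K + 1)) for e in range(order + 1)]
```

For K = 1 the three functions reproduce the two Rogers-Ramanujan series; for K = 2 and K = 3 they can be compared sector by sector for i = 0, ..., K.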
The surprising feature of the character sum formulas (2.28) is their relation to non-conformal models. In the first place, expressions of the same type naturally appear in the theory of 1-dimensional quantum spin chains. This was first shown in [44], where partition sums of such chains were calculated from the Bethe Ansatz. Kedem and McCoy also were the first to realize that the excitation spectrum of the chains can be interpreted in terms of quasi-particles, having specific dispersion relations and obeying generalized Pauli principles. Since 1-dimensional quantum spin chains are essentially equivalent to 2-dimensional statistical models on the lattice, the results of [62] on path space representations of $L_{\pm 1}$ are also relevant as an approach towards a Virasoro action on the lattice.

Conformal field theories arise in the continuum limit of 2-dimensional lattice models (or 1-dimensional quantum spin chains) at the critical point. Alternatively, they can appear as scaling limits of 2-dimensional massive QFTs – in particular of integrable field theories which have been obtained as perturbations of CFTs [72]. It turns out that the conformal characters (2.28) carry information on such perturbations as well: The q-exponent, giving the ground state energy of a quasi-particle configuration with prescribed $\mathbf{n} := (n_1, \ldots, n_K)$, can be written as a quadratic form $\mathbf{n}^t M_q \mathbf{n} + \mathbf{m}^t_{q,i}\mathbf{n}$ with an integer K × K matrix $M_q$ and integer K-vectors $\mathbf{m}_{q,i}$. Whereas $M_q$ is “universal” within the c(2, q) model, $\mathbf{m}_{q,i}$ also depends on the sector. In our cases, $M_q$ is the inverse of the Cartan matrix of the K-node tadpole graph. The associated connectivity matrix has a Perron-Frobenius eigenvector $v_q$, and the astonishing fact is that the ratios of the entries of $v_q$ coincide with the mass ratios of the particles present in the so-called $\phi_{1,3}$-perturbation of the c(2, q) minimal model, see [27].
In this sense, the expressions “lightest” and “heaviest” quasi-particle used above receive a literal meaning. Note that such “coincidences” occur for other sum expressions of conformal characters, too: By now, sum formulas for the characters of many conformal coset models have been found (without reference to path representations), see e.g. [67, 43, 10], and also [46] for a discussion of W-algebra extensions of certain minimal models. Most remarkably, there exist two different sum forms for the characters of the Ising models, involving the inverse of the Cartan matrix of $A_1$ or of $E_8$, respectively. On the other hand, the two possible massive perturbations of the Ising CFT have either one or eight massive particles, in the latter case with mass ratios given by the Perron-Frobenius vector of the incidence matrix of the $E_8$ graph [72].
In summary, we have observed that the quasi-particle structure of the CFT highest weight representations and the associated sum forms of the conformal characters reveal certain aspects of non-conformal deformations (lattice models or massive QFTs) of the CFT. Although the precise relationship remains to be worked out, we may at least conclude that the path representations of the c(2, q) models are not just an artifact of our constructions, but do indeed have deeper physical meaning.
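The statement that $M_q$ is an integer matrix obtained by inverting the Cartan matrix of the K-node tadpole graph can be checked with a few lines of Python. The sketch below is only an illustration under stated assumptions: it takes the Cartan matrix to be 2·1 minus the adjacency matrix, places the loop of the tadpole on the last node of a chain, and uses the closed form min(a, b) for the inverse, which is our own observation verified by the product test rather than a formula quoted from the text. All function names are hypothetical.

```python
def tadpole_cartan(K):
    """Cartan matrix 2*I - A of a chain of K >= 1 nodes with a loop on the last node.

    Assumption: the loop contributes 1 to the diagonal of the adjacency matrix A.
    """
    C = [[0] * K for _ in range(K)]
    for a in range(K):
        C[a][a] = 2
        if a + 1 < K:
            C[a][a + 1] = C[a + 1][a] = -1
    C[K - 1][K - 1] = 1  # 2 minus the loop on the last node
    return C


def candidate_inverse(K):
    """Integer matrix with entries min(a, b) for a, b = 1..K (candidate for M_q)."""
    return [[min(a, b) for b in range(1, K + 1)] for a in range(1, K + 1)]


def matmul(A, B):
    """Product of two square integer matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_identity(M):
    """True if M is the identity matrix."""
    n = len(M)
    return all(M[i][j] == (1 if i == j else 0) for i in range(n) for j in range(n))


def quadratic_form(M, n):
    """Evaluate n^t M n for an integer occupation vector n = (n_1, ..., n_K)."""
    K = len(n)
    return sum(n[a] * M[a][b] * n[b] for a in range(K) for b in range(K))
```

For instance, `is_identity(matmul(tadpole_cartan(3), candidate_inverse(3)))` confirms the inverse for K = 3, and `quadratic_form(candidate_inverse(2), [1, 1])` evaluates the quadratic part of the q-exponent for one quasi-particle of each of two species.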
3. The Quantum Symmetry Algebra
For our construction of global amplimorphisms for the c(2, q) models, we will make use of an action of a quantum symmetry algebra (QSA) on the path representations, which allows us to determine covariant field multiplets. In Subsect. 3.1, we collect some general statements on “quantum symmetries” of low-dimensional QFTs. We do, however, by no means attempt to give a complete account of this subject, which has been an area of intense research during the last decade. We recommend e.g. [31, 64] as sources where
A. Recknagel
to find details and references to the historical development. Likewise, nothing new will be added to the general theory of quantum symmetries – except for the remark that, for a certain class of CFTs, there exists a canonical semi-simple QSA, a fact that was observed already before [57, 36] but is somewhat contrary to the prevailing opinion. Subsect. 3.2 is devoted to these matters, but it can be skipped if one is merely interested in the QSA action on the path spaces, which is set up in Subsect. 3.3.
3.1. General remarks on quantum symmetries. Local quantum field theories on a low-dimensional space-time are interesting especially because of their superselection structure. Their statistics is governed by the braid group in contrast to the permutation group symmetry of QFTs in higher dimensions. The statistical data of a QFT define a representation category, which is a “rigid braided (or symmetric) monoidal category with unit”, see e.g. [51, 31]. A natural question to ask is whether this category is equivalent to the representation category of some group or algebra, which then could be regarded as the internal symmetry group (or algebra), the “global gauge group”, of the model. For local QFTs in space-time dimension ≥ 3, this problem has been settled by the famous duality theorem of Doplicher and Roberts [19] with the result that in this situation the statistical properties always “come from” a compact Lie group. In two dimensions, and for charges localized in space-like cones in 2 + 1 dimensions, the situation is not quite as clear yet.
Research essentially focuses on three different variants of quantum symmetry algebras: One is given by quantum groups in the sense of deformations of the universal enveloping algebras of ordinary Lie algebras. In certain theories, e.g. the WZW models, they arise rather naturally [4, 1], and they have the further virtue that they are closely related to lattice discretizations of CFTs, see e.g. [56].
It is, however, rather unlikely that quantum groups cover all kinds of statistics that can arise in conformal models. In addition, there is the unpleasant feature that in the cases most relevant for rational CFTs, namely when the deformation parameter is a root of unity, indecomposable representations show up, the significance of which remains to be uncovered.
A more speculative, and more spectacular, concept of quantum symmetry uses Ocneanu’s string algebras or “paragroups” [54]. They occur naturally in the general DHR framework as intertwiner algebras associated to the endomorphisms of a local QFT [26, 30, 60]. By construction, these string algebras contain all information on the statistical properties of a theory, but a “naive” implementation of string algebras as “global gauge symmetries” of a QFT leads to huge total Hilbert spaces and enormous field algebras, as was shown in [59]. The ideas in [26, 65] suggest that an appropriate use of string algebras as symmetry algebras requires giving up the clear-cut division between space-time and internal symmetries. Such a “mixing” of symmetries seems to be realized in our c(2, q) models since, as C*-algebras, the intertwiner algebras (internal symmetries) are simply path algebras associated to the fusion graphs of the QFT, and thus they are of the same kind as our path observable algebra (space-time symmetries). It should be interesting to pursue this relation further but, at present, we are interested in quantum symmetries as a mere practical tool and will therefore not resort to string symmetry algebras either.
Finally, a comparatively modest approach to quantum symmetry is to rely on semi-simple *-algebras with additional structures making them into weak quasi-triangular quasi-Hopf algebras [50]. It has been shown in [64] that under certain standard assumptions such a semi-simple QSA always exists, in particular for rational CFTs.
One could argue that this type of QSA does not always arise in a natural way, and that it may be difficult to compute the extra structure they are supplied with. Even more problematical,
the constructions known so far did not single out one specific semi-simple QSA for a given model, not even up to isomorphism. Instead, if i ∈ I labels the sectors of a rational CFT, say, with fusion rules $N_{ij}^k$, then each algebra
\[
  g \cong \bigoplus_{i \in I} M_{n_i}(\mathbb{C}) \tag{3.1}
\]
can be viewed as a quantum symmetry of the CFT as soon as the integers $n_i$ satisfy the inequalities [64]
\[
  n_i n_j \ge \sum_{k \in I} N_{ij}^k \, n_k . \tag{3.2}
\]
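To make the role of the inequalities (3.2) concrete, the following sketch tests candidate multiplicities against a given table of fusion coefficients. It is a minimal illustration, not part of the construction in [64]: the helper name `satisfies_32` and the nested-list encoding `N[i][j][k]` are our own choices, and the example table is the well-known fusion ring of the Lee-Yang model c(2, 5), with sectors 0 (vacuum) and 1 obeying 1 × 1 = 0 + 1.

```python
def satisfies_32(N, n):
    """Check n_i * n_j >= sum_k N[i][j][k] * n_k for all pairs of sectors (i, j).

    N is a nested list of fusion coefficients N_ij^k, n the list of multiplicities.
    """
    sectors = range(len(n))
    return all(
        n[i] * n[j] >= sum(N[i][j][k] * n[k] for k in sectors)
        for i in sectors
        for j in sectors
    )


# Fusion coefficients of the Lee-Yang model c(2, 5): N[i][j][k] = N_ij^k.
LEE_YANG = [
    [[1, 0], [0, 1]],  # 0 x 0 = 0,  0 x 1 = 1
    [[0, 1], [1, 1]],  # 1 x 0 = 1,  1 x 1 = 0 + 1
]

# With n_0 = 1 fixed, n_1 = 1 violates (3.2), while every n_1 >= 2 satisfies it.
solutions = [n1 for n1 in range(1, 8) if satisfies_32(LEE_YANG, [1, n1])]
```

In this example the smallest admissible choice is (n_0, n_1) = (1, 2), and every larger n_1 is admissible as well, which illustrates why (3.2) alone does not single out one algebra.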
We will, however, show in Subsect. 3.2 that the results of [53] lead to a canonical choice of $n_i$ among the infinitely many solutions of (3.2) – at least for the large class of so-called quasi-rational CFTs, see [53] and Definition 3.2 below. It turns out that, in our path setting, this canonical semi-simple QSA leads to a very natural (and “slim”) field algebra. Since, for the purpose of constructing amplimorphisms, we will not need to endow g with more complicated quantum symmetry structures beyond a co-product, it is this third concept of quantum symmetry that will be used in the following.
Let us just sketch the main data of semi-simple QSAs g and how they are implemented into the space of states of a CFT. Complete definitions and proofs can be found in [64]. g is a matrix algebra as in (3.1) with $n_0 = 1$ for the vacuum sector. Consequently, the projection onto the factor $M_{n_0}(\mathbb{C})$, $\varepsilon \equiv \mathrm{pr}_0 : g \to \mathbb{C}$, is the natural candidate for the co-unit of g; it has to be checked whether this ε satisfies the correct relations. Since the representation category of g must “mimic” the braided tensor category associated to the CFT, the QSA g has to be endowed with a co-product $\Delta : g \to g \otimes g$ which “reproduces” the fusion rules of the CFT. The simplest way to make this requirement precise is to look at the (two-floor) Bratteli diagram of Δ viewed as an algebra homomorphism: The diagram must contain $N_{ij}^k$ lines from the factor $g_k \equiv e_k \cdot g$ of g to the factor $g_i \otimes g_j$ of g ⊗ g. Here, $e_i$ are the minimal central projections of g, i.e. $e_i \cdot g \cong M_{n_i}(\mathbb{C})$. The formulation with the help of Bratteli diagrams has the additional advantage of making the freedom of so-called “twists” [20] explicit: Δ is fixed only up to conjugation with a unitary in g ⊗ g. Of course, the other data of g have to be changed accordingly when Δ is twisted, since there are compatibility conditions; e.g. co-unit and co-product must obey $(\varepsilon \otimes \mathrm{id}) \circ \Delta = \mathrm{id} = (\mathrm{id} \otimes \varepsilon) \circ \Delta$.
Since $N_{0j}^k = \delta_{kj}$ for all k, the co-product is always injective, but it need not be a unital embedding. Obviously, Δ is unital iff (3.2) holds as an equality for all i, j ∈ I. Similarly, the co-product in general is not co-associative (it cannot be if it is non-unital) but only quasi-co-associative, i.e. there exists a (quasi-invertible) re-associator $\varphi \in g \otimes g \otimes g$ intertwining $(\Delta \otimes \mathrm{id}) \circ \Delta$ and $(\mathrm{id} \otimes \Delta) \circ \Delta$. The braid group statistics of a low-dimensional QFT is reflected by the existence of an R-matrix R ∈ g ⊗ g which is an intertwiner between Δ and Δ', the co-product with tensor factors interchanged. We have ignored the antipode S : g → g, a C-linear anti-automorphism of g which translates charge conjugation of sectors into the QSA, and have also not discussed important compatibility conditions like the so-called pentagon and hexagon identities, involving Δ, φ and R. Sometimes, existence of a representation of the modular group on g is also assumed, see e.g. [34, 68].
Since we will not use those structures, we again refer to [64] for further details and merely recall that, given a rational CFT with fusion rules $N_{ij}^k$ and a set of integers $n_i$, i ∈ I, $n_0 = 1$, satisfying (3.2), one can solve all the constraints and obtain a weak quasi-triangular quasi-Hopf algebra as a QSA which reproduces the representation category of the CFT. The QSA is implemented into the quantum field theory as follows: One forms an enlarged Hilbert space $H^{\rm tot}$ in which each irreducible representation space $H_i$, i ∈ I, of the observable algebra occurs with multiplicity $n_i$, i.e.
\[
  H^{\rm tot} = \bigoplus_{i \in I} H_i \otimes V_i , \qquad \dim V_i = n_i . \tag{3.3}
\]
Each multiplicity space $V_i$ carries an irreducible representation of g which is equivalent to the defining representation $\tau_i \equiv \mathrm{pr}_i$ on $\mathbb{C}^{n_i}$ – assuming that g is directly given as matrix algebra. Thus, $H^{\rm tot}$ carries a representation of g, which will be denoted by U. What we have sketched is already a special realization of the QSA within a QFT, of the form constructed in [64]. There it was shown that $H^{\rm tot}$ furthermore carries an action of a field algebra $\mathcal{F}$ (with a net structure inherited from A), which can be decomposed into field multiplets transforming covariantly under the g-action. The observable algebra A is recovered as the fixed point sub-algebra of $\mathcal{F}$. The Hamiltonian of the QFT also commutes with the g-action. Moreover, the local braid relations of the field operators can be written in terms of the R-matrix of g, and in this way one gains complete control of the braid group representations associated with the sectors of a low-dimensional QFT.
The covariance properties of field multiplets will be an important part of our construction of global amplimorphisms of the observable algebra. They are formulated in Eq. (4.10) below, in a slightly different form than in [64]. Beyond that, we will only use information encoded in the co-product of the QSA, and of course a representation U of g on $H^{\rm tot}$.
First of all, we have to choose an appropriate matrix algebra g(q) as a QSA for each of our c(2, q) minimal models, i.e. we have to fix the sizes $n_i$ in Eq. (3.1). We claim that the choice
\[
  n_i = i + 1 \tag{3.4}
\]
is one possibility; in order to show this, we prove the following:
Lemma 3.1. The integers $n_i = i + 1$, i = 0, . . . , K, satisfy the inequalities
\[
  n_i n_j \ge \sum_k N_{ij}^k \, n_k ,
\]
where $N_{ij}^k$ are the fusion rules of the c(2, q) model, q = 2K + 3, as listed in Eqs. (2.4,5).
Proof. Suppose that i ≥ j. Then Eqs. (2.4,5) give
\begin{align*}
  \sum_k N_{ij}^k \, n_k &= n_{i-j} + n_{\{i-j+2\}} + \dots + n_{\{i+j\}} \\
  &= (i - j + 1) + (\{i-j+2\} + 1) + \dots + (\{i+j\} + 1) \\
  &\le (i+1)(j+1) = n_i n_j ,
\end{align*}
where in the third line we have used the estimate {m} ≤ m before summing up. Indeed, without the braces the sum is $\sum_{l=0}^{j} (i - j + 2l + 1) = (i+1)(j+1)$.
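The estimate in the proof of Lemma 3.1 is easy to test numerically. The sketch below follows the structure of the proof, summing $n_{\{m\}}$ over m = i − j, i − j + 2, . . . , i + j. Since Eqs. (2.4,5) are not reproduced in this section, the folding map {m} is an assumption on our part: we take {m} = m for m ≤ K and {m} = 2K + 1 − m otherwise, which reproduces the Lee-Yang rule 1 × 1 = 0 + 1 for K = 1 and respects {m} ≤ m. The function names are illustrative only.

```python
def fold(m, K):
    """Assumed folding {m}: identity for m <= K, reflection 2K + 1 - m above K."""
    return m if m <= K else 2 * K + 1 - m


def fusion_rhs(i, j, K):
    """Right-hand side sum_k N_ij^k n_k with n_k = k + 1, following the proof."""
    i, j = max(i, j), min(i, j)  # the proof assumes i >= j
    return sum(fold(m, K) + 1 for m in range(i - j, i + j + 1, 2))


def lemma_31_holds(K):
    """True if n_i n_j >= sum_k N_ij^k n_k for all 0 <= i, j <= K."""
    return all(
        fusion_rhs(i, j, K) <= (i + 1) * (j + 1)
        for i in range(K + 1)
        for j in range(K + 1)
    )


def strict_pairs(K):
    """Pairs (i, j) with j <= i for which the inequality is strict."""
    return [
        (i, j)
        for i in range(K + 1)
        for j in range(i + 1)
        if fusion_rhs(i, j, K) < (i + 1) * (j + 1)
    ]
```

Under the stated assumption, `lemma_31_holds(K)` is true for every K we tried, and `strict_pairs(K)` lists the pairs of sectors for which the inequality is strict, i.e. those with i + j > K.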
Therefore, the results collected above imply that
\[
  g(q) := \mathbb{C} \oplus M_2(\mathbb{C}) \oplus \dots \oplus M_{K+1}(\mathbb{C}) \tag{3.5}
\]
can be endowed with all the structures making it into some QSA of the c(2, q) model, q = 2K + 3, and we could immediately proceed with implementing g(q) on the total Hilbert space
\[
  H^{\rm tot}_{(q)} := H_0 \oplus H_1 \otimes \mathbb{C}^2 \oplus \dots \oplus H_K \otimes \mathbb{C}^{K+1} . \tag{3.6}
\]
But before that, we would like to take the opportunity and argue that the matrix algebra (3.5) may indeed be called the semi-simple QSA of the c(2, q) model.
3.2. The canonical choice of sector multiplicities. In order to explain that the choice of dimensions $n_i = i + 1$ in Eq. (3.4) is even a canonical one, we will slightly digress and give an account of some results of the work [53]. There, the notion of quasi-rational CFTs was introduced: These are models of CFT such that a certain factor space of each irreducible highest weight module is finite-dimensional. The dimensions $n_i$ of those factor spaces are invariants of the theory like the central charge and the conformal weights. They satisfy the sub-multiplicativity relation (3.2) and can, therefore, be used as dimensions of the defining representation of a semi-simple quantum symmetry algebra of the model. We will show that the minimal models of our interest are quasi-rational in Nahm’s sense, and that the dimensions of the relevant factor spaces are given by formula (3.4).
Up to the present, the only treatment of quasi-rationality available is in terms of modes of unbounded quantum fields, and using the fusion product as a computational tool. It would be very interesting to try and fit the notions developed in [53] into the algebraic DHR framework, and to find out whether there is a more conceptual interpretation of the dimensions $n_i$, perhaps by some operator algebraic constructions. In view of the inequalities (3.2), it seems likely that the $n_i$ are related to the Jones indices $d_i^2$ associated to the sectors, since the statistical dimensions $d_i$ satisfy (3.2) as equalities, see e.g. [26, 48].
On the other hand, it is quite clear already from [53] that quasi-rationality is a useful property for practical problems: It allows, in particular, for an algorithmic definition of fusion in a large class of CFTs. Below, we will add further remarks on this aspect.
The notion of a quasi-rational representation can be introduced for arbitrary bosonic W-algebras with a finite number of generating fields. A rather detailed definition of W-algebras can be found, e.g. in [52, 12], but for our purposes, the following sketch is sufficient: W contains a finite set of generating fields $W^s(z) = \sum_{n \in \mathbb{Z}} W^s_n z^{-n-s}$ with conformal dimensions $s \in \mathbb{Z}_+$, where z ∈ C is the coordinate of left-moving fields, and the rhs gives the expansion of $W^s(z)$ in terms of Laurent modes of $W^s(z)$. Among them, there is a field $W^2(z)$ of dimension two, which we identify with the energy momentum tensor T(z); all the other generators $W^s(z)$ are Virasoro primary. W is then linearly generated by these fields, their derivatives with respect to z, and (derivatives of) normal-ordered products. Note that one can choose a linear basis of W consisting only of primary generators $W^s(z)$ and quasi-primary normal ordered products. With respect to this basis, W is an infinite-dimensional Lie algebra, and for most of what follows, we may simply regard a W-algebra as the universal enveloping algebra generated by the Laurent modes of the fields $W^s(z)$ – in this way avoiding a general discussion of normal ordered products. We denote by $W_{--}$ the linear span of modes
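\[
  W_{--} := \{\, \phi_n \mid \phi(z) \in W,\ \phi \neq 1,\ n \le -\dim \phi \,\}
  = \Big\{ \oint \omega(z)\phi(z) \;\Big|\; \phi(z) \in W,\ \phi \neq 1,\ \omega \text{ a 1-form vanishing at } \infty \Big\}, \tag{3.7}
\]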
where in the second description we have used contour integration around zero to project onto Laurent modes.
Definition 3.2. Let V be an irreducible highest weight representation of a finitely generated bosonic W-algebra W, and put $V^a := W_{--} V$. The representation V is called quasi-rational if the quotient space $V / V^a$ is finite-dimensional. A CFT with finitely generated bosonic W-algebra is called quasi-rational if all the irreducible highest weight representations making up the chiral space of states are quasi-rational.
Note that in the last part of the definition we do not require that the CFT involves only a finite number of irreducible representations, i.e. that it is rational. On the one hand, there is the plausible conjecture that any rational CFT with appropriate W-algebra is quasi-rational. On the other hand, there are definitely examples of non-rational quasi-rational theories – which is one of the advantages of working with Definition 3.2. In particular, many of the N = 2 superconformal QFT associated to Calabi-Yau manifolds probably are non-rational, but since they are relevant for string theory, it is important to have tools which work in a larger class of CFT than just the rational ones. The restriction to bosonic W-algebras in Definition 3.2 is not essential, see [37] for an extension to the fermionic case.
Given a representation $V_i$ of a quasi-rational CFT, we use the notation
\[
  n_i = \dim V_i / V_i^a \tag{3.8}
\]
for the dimension of the factor space. Note that $n_i \ge 1$ for any representation. We will refer to a subspace $V_i^s \subset V_i$ such that $\dim V_i^s = n_i$ and $V_i^s + V_i^a$ is dense in $V_i$ as a small space of the CFT. While the integers $n_i$ are invariantly defined, the small spaces are not. Often, however, there are natural choices in explicit computations. The most interesting property of the dimensions $n_i$ was also proven in [53]:
Proposition 3.3 ([53]).
Consider a CFT with W-algebra W and let $V_m$, m ∈ I, be the collection of irreducible W-representations occurring in the CFT. Assume that among these there are two quasi-rational highest weight representations $V_i$ and $V_j$ with small space dimensions $n_i$ and $n_j$. Denoting the fusion rules of the theory by $N_{ij}^k$ for i, j, k ∈ I, we have the following inequality:
\[
  n_i n_j \ge \sum_k N_{ij}^k \, n_k .
\]
Corollary 3.4. The set of quasi-rational highest weight representations of a W-algebra forms a sub-category (of the category of all highest weight representations) which is closed under fusion. Quasi-rational representations are semi-rational, i.e. the fusion decomposition of two quasi-rational representations contains only a finite number of irreducibles. The $n_i$ can be used as dimensions of the defining representation of a semi-simple quantum symmetry algebra of a quasi-rational CFT [57, 36].
To sketch the proof of Proposition 3.3, we have to use the fusion product of two W-algebra representations. For α = 1, 2, choose two distinct “punctures” $z_\alpha \in \mathbb{C}$
and “insert” a representation $V_{i_\alpha}$ at $z_\alpha$. The action of a field φ(z) ∈ W on a vector $v_\alpha \in V_{i_\alpha}(z_\alpha)$ can be written as a contour integral
\[
  \oint_{C_\alpha} \omega(z)\phi(z)\, v_\alpha ,
\]
where ω(z) is a meromorphic 1-form and $C_\alpha$ is a “small” contour encircling $z_\alpha$. The well-known fusion product representation of W on the tensor product $V_{i_1} \otimes V_{i_2}$ is defined via Cauchy’s formula
\[
  \oint_{C_{12}} \omega(z)\phi(z)\, (v_1 \otimes v_2) := \oint_{C_1} \omega(z)\phi(z)\, v_1 \otimes v_2 + v_1 \otimes \oint_{C_2} \omega(z)\phi(z)\, v_2 , \tag{3.9}
\]
where $C_{12}$ encircles both $z_1$ and $z_2$. Introducing a mode expansion on the lhs – here, a choice is involved – leads to a $z_\alpha$-dependent, co-product-like expression of the fusion product action of $\phi_n$ on $V_{i_1} \otimes V_{i_2}$ in terms of modes $\phi_m$ that act in each $V_{i_\alpha}$ separately, see e.g. [24, 51, 53]. If the action (3.9) can be diagonalized we obtain a decomposition into irreducibles – hence the name “fusion product”.
Contour integrals may be used to prove Proposition 3.3 in the following way. Consider any vector $v_1 \otimes v_2 \in V_{i_1} \otimes V_{i_2}$ such that $v_\alpha \in V_{i_\alpha}$ are $L_0$-eigenvectors. Having chosen small spaces $V^s_{i_\alpha}$, we can decompose $v_2$ into a sum $v_2 = v_2^s + v_2^a$ with $v_2^s \in V^s_{i_2}$, $v_2^a \in V^a_{i_2}$. By definition, $v_2^a$ is of the form $v_2^a = \phi_{-h-n} v_2'$ for some $v_2' \in V_{i_2}$ and some field mode $\phi_{-h-n} \in W_{--}$, where h is the dimension of φ and n ≥ 0. Using Eq. (3.9), this can be rewritten as
\[
  v_1 \otimes v_2^a = \phi_{-h-n} (v_1 \otimes v_2') - \sum_{l \ge 0} f_l(z_1 - z_2)\, \phi_{l-h+1} v_1 \otimes v_2' , \tag{3.10}
\]
where $f_l(z_1 - z_2)$ are certain meromorphic functions arising from the mode expansion. The first term on the rhs of (3.10) is in $(V_{i_1} \otimes V_{i_2})^a$, and the remaining terms have a lower $L_0$-degree than $v_1 \otimes v_2$. Proceeding inductively, and treating $v_1$ in the same way, one finally reaches vectors in the tensor product of the small spaces, which proves that $V_{i_1} \otimes V_{i_2}$ is generated by vectors in $(V_{i_1} \otimes V_{i_2})^a$ and $V^s_{i_1} \otimes V^s_{i_2}$. In other words, the space $V^s_{i_1} \otimes V^s_{i_2}$ contains a small space $(V_{i_1} \otimes V_{i_2})^s$ of $V_{i_1} \otimes V_{i_2}$ endowed with the W-action from the fusion product. By definition, this representation contains all other irreducible highest weight representations $V_k$ with multiplicity $N_{i_1 i_2}^k$, which implies the desired inequality. For more details of the proof, we refer to [53]. We would, however, like to emphasize that the statement in Proposition 3.3 is independent of any choice to be made when giving explicit formulas for the fusion product action of field modes.
The practical use of the concept of quasi-rationality combined with the fusion product now becomes apparent: It allows one to determine the fusion rules of quasi-rational representations by a finite algorithm. One simply has to diagonalize the zero-mode sub-algebra $W_0$ of W on the finite-dimensional space $V^s_{i_1} \otimes V^s_{i_2}$ in order to obtain a decomposition into irreducibles. In this procedure, however, two subtleties are hidden. First of all, diagonalizability may fail to hold in general, although the known examples for such a situation are typically plagued with certain pathological features. The other problem involves the so-called spurious states, i.e. vectors in the space $V^\sigma_{i_1 i_2} := (V^s_{i_1} \otimes V^s_{i_2}) \cap (V_{i_1} \otimes V_{i_2})^a$. This intersection is non-zero if the inequality in Proposition 3.3 becomes strict, e.g. in most minimal models of the Virasoro algebra. Fusion rules are then obtained from diagonalizing the zero mode action in the space $V^s_{i_1} \otimes V^s_{i_2} / V^\sigma_{i_1 i_2}$. The construction of spurious
states is not yet well understood at an abstract level, although in concrete examples it is usually possible to determine $V^\sigma_{i_1 i_2}$ from the null-fields of the theory. As was shown in [53], the fusion rules can be calculated – up to the subtleties mentioned above – on even smaller spaces: Denote by $V_i^h$ the highest weight subspace (wrt. $L_0$) of the module $V_i$; then one can show in a similar fashion as before that both $V^h_{i_1} \otimes V^s_{i_2}$ and $V^s_{i_1} \otimes V^h_{i_2}$ contain the space $(V_{i_1} \otimes V_{i_2})^h$. This further reduction is very useful in concrete applications, and often allows one to avoid cumbersome calculations with spurious states.
We will now show that the minimal models with central charge c(2, q) and highest weights $h_i$ as in Eqs. (2.2,3) are quasi-rational CFTs and that the dimensions of their small spaces are given by the simple formula (3.4). Consider a highest weight representation with highest weight
\[
  h_{r,s} \equiv h_{r,s}(p, q) = \frac{(pr - qs)^2 - (p - q)^2}{4pq} \tag{3.11}
\]
of the Virasoro algebra at central charge c(p, q), where r, s ≥ 1 are integers, but we admit arbitrary $p, q \in \mathbb{R}_{>0}$. Feigin and Fuchs have shown that the Verma module $V_h$ over $|h_{r,s}\rangle$ contains a singular vector $|v\rangle$ at level r · s (and maybe others, all at higher energy), which is of the form
\[
  |v\rangle = L_{-1}^{rs} |h_{r,s}\rangle + \alpha |v'\rangle , \tag{3.12}
\]
where the complex number α depends on r, s and p/q, and $|v'\rangle \in V_h$ involves the modes $L_{-2}, L_{-3}, \dots$ – see [21] for more explicit statements. After passing to the irreducible module $H_{r,s}$, Eq. (3.12) implies that $L_{-1}^{rs} |h_{r,s}\rangle \in H^a_{r,s}$, since $L_{-2}, L_{-3}, \dots \in \mathrm{Vir}_{--}$, cf. Eq. (3.7). On the other hand, one can easily convince oneself – using, e.g., the explicit formula for normal ordered products in [52, 12] – that none of the vectors $|h_{r,s}\rangle, L_{-1}|h_{r,s}\rangle, \dots, L_{-1}^{rs-1}|h_{r,s}\rangle$ can be written as $\phi_m |v''\rangle$ for some $|v''\rangle \in H_{r,s}$, some normal ordered product φ(z) of the energy momentum tensor, and m ≤ −dim φ. Thus, for the degenerate models of the Virasoro algebra with highest weights $h_{r,s}(p, q)$, the vectors listed above can be taken as a basis of the small spaces; the equations
\[
  n_{r,s} \equiv \dim H^s_{r,s} =
  \begin{cases}
    rs & \text{if } p/q \notin \mathbb{Q} \\
    \min\{\, rs,\ (q - r)(p - s) \,\} & \text{if } p/q \in \mathbb{Q}
  \end{cases}
\]
for arbitrary degenerate representations and in particular formula (3.4) for our models follow. As a further consequence, we indeed obtain the sub-multiplicativity relation in Lemma 3.1 as a special case of Proposition 3.3, without resorting to any combinatorics.
The models with $h_{r,s}(p, q)$ as above and $r, s \in \mathbb{Z}_+$, $p/q \notin \mathbb{Q}$, are simple examples of non-rational quasi-rational CFTs, and of theories where the dimensions $n_{r,s}$ satisfy the relation of Proposition 3.3 as an equality. In contrast, for the minimal models c(2, q), q ≥ 5 an odd integer, we have $n_i n_j > \sum_k N_{ij}^k n_k$ in general, which also means that there exist spurious states. As mentioned above, this fact complicates the computation of fusion rules by diagonalizing the $L_0$-action on $H_i^s \otimes H_j^s$ or $H_i^s \otimes H_j^h$. Nevertheless, the special cases we are
interested in are simple enough to let us circumvent these problems, e.g. in the following three ways: First of all, we may “perturb” the models slightly by moving the central charge away from the minimal values q ∈ Z while keeping the relations (3.11) for the highest weights. Then, the model becomes merely degenerate, the spurious states disappear, and we may read off the $N_{ij}^k$ from the $L_0$-action in the tensor product of small spaces. Afterwards, we can move c back to the minimal value, take into account the conformal grid symmetry $h_{r,s} = h_{q-r,p-s}$ for minimal models if necessary, and we will recover the fusion rules of the minimal model. In the limit $c \to c_{\rm min.mod.}$ some higher level vectors in the irreducible components of the fusion product move towards singular vectors of the minimal representations, thus producing the spurious states in a controlled way.
The second possibility is to calculate the fusion rules directly within the minimal model c(2, q), but using diagonalization of $L_0$ in the smaller spaces $H_i^s \otimes H_j^h$. All the highest weight spaces are, of course, one-dimensional in pure Virasoro models. For each q, the small space $H_1^s$ of the highest weight representation with $h_1 = -\frac{q-3}{2q}$ is two-dimensional, so we can obtain the fusion rules of $H_1$ with $H_k$, k ≠ 0, from diagonalizing 2 × 2 matrices. But in each c(2, q) model, the conformal family corresponding to $H_1$ generates the whole fusion ring.
The most direct method, however, is to calculate the spurious states explicitly. In the c(2, q) models, this can be done since the null-fields are known, see Eq. (2.9). Their Laurent modes applied to highest weight states give null-vectors which can be used to compute the spurious states via contour integration and Cauchy’s theorem. We refrain from giving further details here and rather refer to [53, 3].
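The formulas for the highest weights (3.11) and for the small space dimensions in the rational case lend themselves to a quick consistency check with exact rational arithmetic. The sketch below is illustrative only: the function names are ours, and the identification of the sector label i = 0, . . . , K of the c(2, q) model with the Kac label (r, s) = (i + 1, 1) is an assumption, chosen because it reproduces $h_1 = -\frac{q-3}{2q}$ and $n_i = i + 1$ as stated in the text.

```python
from fractions import Fraction


def h(r, s, p, q):
    """Highest weight h_{r,s}(p, q) of Eq. (3.11), as an exact fraction."""
    return Fraction((p * r - q * s) ** 2 - (p - q) ** 2, 4 * p * q)


def small_space_dim(r, s, p, q):
    """Small space dimension n_{r,s} in the rational case (p, q coprime integers)."""
    return min(r * s, (q - r) * (p - s))


def c2q_data(K):
    """Weights and small space dimensions of the c(2, q) model, q = 2K + 3.

    Assumption: sector i corresponds to the Kac label (r, s) = (i + 1, 1).
    """
    p, q = 2, 2 * K + 3
    weights = [h(i + 1, 1, p, q) for i in range(K + 1)]
    dims = [small_space_dim(i + 1, 1, p, q) for i in range(K + 1)]
    return weights, dims
```

For the Lee-Yang model, `c2q_data(1)` returns the weights 0 and −1/5 together with the dimensions 1 and 2. The conformal grid symmetry can be checked in the same way by comparing `h(r, 1, 2, q)` with `h(q - r, 1, 2, q)`.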
In summary, we have seen that Nahm’s results [53] indeed provide a canonical choice for the isomorphism type of a semi-simple quantum symmetry algebra for a quasi-rational CFT, namely
\[
  g = \bigoplus_{i \in I} M_{n_i}(\mathbb{C}) ,
\]
where $n_i = \dim H_i^s$ are the small space dimensions associated to the irreducible representations. The $n_i$ are invariants of the quasi-rational CFT.
One can show that for quasi-rational models even the other basic data of g, namely the co-product, the R-matrix and the re-associator, can be reconstructed explicitly: The small spaces can be used to define a finite-dimensional vector bundle equipped with a flat connection, which leads to a generalization of the Knizhnik-Zamolodchikov equation from WZW models to arbitrary quasi-rational CFTs. Then, Drinfeld’s construction [20] can be applied to recover all data of a weak quasi-triangular quasi-Hopf algebra from the generalized Knizhnik-Zamolodchikov equation [3]. The fusion rules are part of this structure, to be determined by the diagonalization procedure sketched above. Therefore, we can in principle delete them from the input data listed in Subsect. 2.1 and instead derive them from information referring only to individual highest weight representations.
3.3. Action of the quantum symmetry algebra on the path spaces. Irrespective of the previous discussion, which showed that the numbers $n_i = i + 1$ chosen in Eq. (3.4) are canonical data of the c(2, q) model, we will see in the following that this choice of sector multiplicities leads to a particularly natural action of the associated quantum symmetry algebra g(q) on the “amplified” Hilbert space $H^{\rm tot}_{(q)}$, see Eqs. (3.5,6). The point is that for this special choice of $n_i$, the space $H^{\rm tot}_{(q)}$ can again be represented as a path space
over a Bratteli diagram $\hat{B}_q$ that has essentially the same form as the $B_{q,i}$ underlying the individual spaces $H_i$. The extended Bratteli diagram $\hat{B}_q$ looks as follows: Floors are numbered −2, −1, 0, 1, . . . ; the −2nd floor consists of one node, labeled ∗, all other floors of K + 1 nodes labeled 0, . . . , K as usual; the embedding matrices between floors l and l + 1 for l ≥ −1 are the (K + 1) × (K + 1) connectivity matrices $C_q$ of the c(2, q)-graph $G_q$, see Eq. (2.7), whereas the matrix describing the embedding of floor −2 into floor −1 is simply $C_{\rm in} = (1, 1, \dots, 1)^t \in M_{(K+1) \times 1}(\mathbb{Z})$ – i.e. the node ∗ is joined to every node on the −1st floor by a single edge. Compared to the Bratteli diagrams $B_{q,i}$ of Subsect. 2.2, the only new building block of $\hat{B}_q$ is an extremely simple, “canonical” one. In Fig. 2, an example is shown. Let $\hat{P}^{(n)}_{*,i}$ denote the space of all paths on $\hat{B}_q$ from the node ∗ on the −2nd floor to the node i on the nth floor, for n ≥ −1 and i = 0, . . . , K.
Lemma 3.5. $\dim \hat{P}^{(0)}_{*,i} = n_i$.
Proof. We compute the dimension of this path space by applying the first two embedding matrices to the dimension 1 of the space of length zero paths:
\[
  \dim \hat{P}^{(0)}_{*,i} = \mathbf{e}_i^t \, C_q C_{\rm in} \, 1 = \mathbf{e}_i^t \, (1, 2, \dots, K+1)^t = i + 1 = n_i ,
\]
where $\mathbf{e}_i$ is the ith standard unit vector, with rows labeled from 0 to K.
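The path counting in the proof of Lemma 3.5 amounts to repeated matrix-vector multiplication, which the following sketch spells out. The connectivity matrix $C_q$ of Eq. (2.7) is not reproduced in this section, so the matrix used below is only a stand-in: we assume entries equal to 1 when i + j ≥ K and 0 otherwise, chosen because its row sums are 1, 2, . . . , K + 1 as the lemma requires. Both this matrix and the function names are hypothetical and should be replaced by the actual data of Eq. (2.7) where available.

```python
def assumed_connectivity(K):
    """Stand-in for C_q: entry (i, j) is 1 if i + j >= K, else 0 (an assumption)."""
    return [[1 if i + j >= K else 0 for j in range(K + 1)] for i in range(K + 1)]


def path_counts(C, floors):
    """Number of paths from the node * to each node, `floors` floors below floor -1.

    The starting vector (1, ..., 1) is C_in applied to the single node *.
    """
    v = [1] * len(C)
    for _ in range(floors):
        v = [sum(C[i][j] * v[j] for j in range(len(C))) for i in range(len(C))]
    return v


def total_multiplicity(K):
    """Total number N of paths from * to the zeroth floor."""
    return sum(path_counts(assumed_connectivity(K), 1))
```

With this stand-in, `path_counts(assumed_connectivity(K), 1)` gives the dimensions on floor 0, and `total_multiplicity(K)` gives their sum N, the number of paths from ∗ to the zeroth floor.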
As a consequence, we obtain an ($L_0$-graded) isomorphism between the space $\hat{P}_*$ of all (finite) paths over the extended Bratteli diagram $\hat{B}_q$ and the total Hilbert space $H^{\rm tot}_{(q)}$:
\[
  \hat{P}_* \cong H^{\rm tot}_{(q)} . \tag{3.13}
\]
In order to show this, we first label the paths in $\hat{P}^{(0)}_{*,i}$ by ν running from 1 to $n_i$; then we identify a state in the νth copy of $H_i$ with a path on $\hat{B}_q$ which reaches node i on the 0th floor along the path $|\nu\rangle$, and continues towards infinity according to the path space representations of each $H_i$ introduced in Subsect. 2.2. We note that, with a different choice of multiplicities $\tilde{n}_i$, it would still be possible to find an extended Bratteli diagram whose associated path space is isomorphic to $\tilde{H}^{\rm tot}$, but this diagram could be very different in shape from the one determined by the graph $G_q$. In other words, the path field algebra associated to the extended Bratteli diagram, see below, would no longer be of the same stable isomorphism type as the $A_i$.
Let us, for one more time, refer to the ideas of Subsect. 3.2 where the multiplicities $n_i$ were interpreted as the dimensions of special subspaces $H_i^s \subset H_i$. For degenerate Virasoro models, we could simply choose
\[
  H_i^s = \{\, L_{-1}^m |h_i\rangle \mid m \ge 0,\ L_{-1}^m |h_i\rangle \text{ linearly independent of } L_n H_i \text{ for all } n \le -2 \,\}_{\mathbb{C}} .
\]
With this information, we may derive that $\hat{P}_* \cong H^{\rm tot}_{(q)}$ even without knowledge of the values of $n_i$: Recall that by Proposition 2.2 and with the notations of (2.22) we have
\[
  H_i^s \cong \bigoplus_{l=0}^{K} P^{(1)}_{i,l} \cong \bigoplus_{l=0}^{K} P^{(1)}_{l,i} ;
\]
the second relation holds since the graphs $G_q$ are unoriented. But the last decomposition just describes the space of paths on the two extra floors of $\hat{B}_q$, ending at node i on floor 0. Since from floor zero on, the extended diagram is as the diagrams $B_{q,i}$, the extended path space $\hat{P}_*$ is simply $\bigoplus_i H_i \otimes H_i^s$, no matter what the dimensions of the small spaces are. By construction, this space is our total space of states.
Given the path representation of the total Hilbert space, the action of the quantum symmetry algebra is implemented in a straightforward way. For each fixed q, we denote by $\mathcal{F}^{(n)}_{*,i}$ the string algebra over $\hat{B}_q$ generated by pairs of paths joining node ∗ on floor −2 to node i on floor n. We set $\mathcal{F}^{(n)}_* := \bigoplus_i \mathcal{F}^{(n)}_{*,i}$, and we enumerate the paths $|\nu\rangle \in \hat{P}^{(0)}_{*,i}$ by ν = 1, . . . , $n_i$ as before.
Definition 3.6. As an associative algebra, the quantum symmetry algebra $g = \bigoplus_i M_{n_i}(\mathbb{C})$ acts on the total Hilbert space $H^{\rm tot}$ by the representation U which is defined on matrix units $E^i_{\mu\nu} \in M_{n_i}(\mathbb{C})$ as
\[
  U(E^i_{\mu\nu}) = |\mu\rangle\langle\nu| \in \mathcal{F}^{(0)}_{*,i} ;
\]
in other words, $U(g) = \mathcal{F}^{(0)}_*$.
This natural action of g on $H^{\rm tot}$ is unital and faithful. The operators U(a), a ∈ g, flip only the first two edges of a path in $\hat{P}_*$, leaving already the node on the 0th floor fixed. On the highest weight state $|h_i^\mu\rangle$ of the μth copy of $H_i$, μ = 1, . . . , $n_i$, they act as
\[
  U(a)\, |h_i^\mu\rangle = \sum_{\nu=1}^{n_i} (a_i)_{\mu\nu} \, |h_i^\nu\rangle ;
\]
the complex numbers $(a_i)_{\mu\nu}$ are the matrix entries of the $i$th factor of $a \in g$. As a consequence, the vacuum is “invariant” under $g$ in the sense that it transforms in the trivial representation given by the co-unit. (This fact reminds us that $g$ generalizes the group algebra rather than the Lie algebra of a global gauge group.) In particular, we conclude that the $g$-action has no “observable effect”:

Proposition 3.7. Let $\mathcal{F}$ denote the path algebra generated by all finite strings over $\hat{B}_q$. The commutant of $U(g)$ in $\mathcal{F}$ is canonically isomorphic to the observable algebra,
$$ U(g)' \cap \mathcal{F} \cong \mathcal{A} . $$

Proof. The claim is more or less obvious, and the proof is given mainly in order to set up a matrix notation for the elements of $\mathcal{F}$ which will be useful in later computations. The identification of elementary strings with matrix units has already been used for finite floor sub-algebras; now, it will be applied to make the form of field algebra elements explicit up to a certain floor: For fixed $q = 2K+3$, let $N = n_0 + n_1 + \ldots + n_K = \frac{(K+1)(K+2)}{2}$ be the total number of paths on $\hat{B}_q$ from $*$ to all the nodes on the zeroth floor. Then we can write any element $F \in \mathcal{F}$ as a matrix
$$ F = \big( F_{rs} \big)_{r,s=1}^{N} , \tag{3.14} $$
where $F_{rs}$ is a linear combination of strings that take the route $|r\rangle\langle s|$ from $*$ down to the 0th floor and then continue arbitrarily (such that ends meet eventually). In this notation, the operators in $U(g)$ have the block-diagonal form
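$$ U(a) \in \mathrm{bl\,diag}\big( \mathbb{C}\cdot 1_{\mathcal{A}_0},\ M_2(\mathbb{C}\cdot 1_{\mathcal{A}_1}),\ \ldots,\ M_{K+1}(\mathbb{C}\cdot 1_{\mathcal{A}_K}) \big) $$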
for all $a \in g$, where $1_{\mathcal{A}_i}$ is the unit of the path algebra $\mathcal{A}_i$. The commutant of $U(g)$ in $\mathcal{F}$ is then simply given by
$$ U(g)' = \mathrm{bl\,diag}\big( \mathcal{A}_0,\ D_2(\mathcal{A}_1),\ \ldots,\ D_{K+1}(\mathcal{A}_K) \big) , \tag{3.15} $$
where $D_i : \mathcal{A} \to \mathcal{A}^{\oplus i}$, $a \mapsto a \oplus \cdots \oplus a$ are the diagonal embeddings; obviously, $U(g)'$ is canonically isomorphic to $\mathcal{A}$.

In view of this proposition, we are entitled to call $\mathcal{F}$ the (path representation of the) field algebra of the c(2, q)-model. Of course, like our path observable algebra, $\mathcal{F}$ is a global object. As an AF-algebra, $\mathcal{F}$ once more belongs to the same stable isomorphism class as all the path algebras associated to the path representations $\mathcal{H}_i$, which is clear from the form of the Bratteli diagram $\hat{B}_q$. Unlike the global observable algebra $\mathcal{A}$, however, $\mathcal{F}$ is a simple algebra. Note also that although $\mathcal{F} \neq U(g) \vee U(g)'$, the centers of the QSA and the observable algebra $\mathcal{A}$ coincide on the total Hilbert space, $Z(\mathcal{A}) = Z(U(g))$, in agreement with the general theory. Compared to the huge field algebras constructed elsewhere, our path field algebra $\mathcal{F}$ envelops the path observable algebra $\mathcal{A}$ rather tightly.

4. Covariant Field Multiplets

This section contains the main step towards global amplimorphisms of the c(2, q) minimal models, namely the construction of covariant field multiplets inside the field algebra $\mathcal{F}$. Once these are known, the amplimorphisms follow immediately. In order to arrive at the multiplets, the QSA is first endowed with a co-product $\Delta$ reproducing the fusion rules of the c(2, q) model, or rather with an equivalent collection of amplimorphisms $\nu_i$, $i = 0, \ldots, K$, of $g$. With the help of these amplimorphisms, one can formulate equations on elements of $\mathcal{F}$ which are to form covariant multiplets. In our cases, the special properties of the underlying path spaces allow for a natural solution of the covariance conditions. We emphasize once more that no use is made of further quantum symmetry structures (R-matrices, re-associators, etc.) of $g$ other than the co-product.

4.1. Amplimorphisms of the QSA. For the moment, we treat $g$ as an abstract matrix algebra $g = \bigoplus_{i=0}^{K} M_{n_i}(\mathbb{C})$. Since $g$ satisfies the inequality (3.2) on the dimensions $n_i$, existence of a co-product $\Delta : g \to g \otimes g$ reproducing the c(2, q) fusion rules $N_{ij}^k$ is guaranteed. The fusion rules fix the (two-floor) Bratteli diagram of the algebra homomorphism $\Delta : g \to g \otimes g$, but there is the freedom of “twisting” $\Delta$ by inner automorphisms of $g \otimes g$. Given $\Delta : g \to g \otimes g$, we can use the minimal central projections $e_i$, $i = 0, \ldots, K$, to introduce amplimorphisms [68]
$$ \nu_i : g \to M_{n_i}(g), \qquad a \mapsto \nu_i(a) := (1 \otimes e_i)\,\Delta(a) $$
of $g$; here, the isomorphism $M_{n_i}(g) \cong g \otimes M_{n_i}(\mathbb{C})$ has tacitly been applied.
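The inequality (3.2) on the dimensions $n_i$ can be checked numerically. The following is a minimal sketch, not part of the original construction. It assumes the closed form of the c(2, q) fusion rules used in this subsection (fusion results $|i-j|, |i-j|+2, \ldots, i+j$, where labels above $K$ are folded back via $k \mapsto 2K+1-k$, which is our reading of the braces in the enumeration of fusion results) and the multiplicities $n_i = i+1$. All function names are illustrative.

```python
def fusion(i, j, K):
    """Sectors k with N_ij^k = 1 in the c(2, 2K+3) model (assumed closed form)."""
    result = []
    for l in range(min(i, j) + 1):
        k = abs(i - j) + 2 * l
        # Labels above K are folded back with the identification k ~ 2K+1-k.
        result.append(k if k <= K else 2 * K + 1 - k)
    return result


def multiplicity(i):
    """Sector multiplicity n_i = i + 1."""
    return i + 1


def slack(i, j, K):
    """n_i n_j minus the sum of n_k over the fusion results of phi_i x phi_j."""
    return multiplicity(i) * multiplicity(j) - sum(
        multiplicity(k) for k in fusion(i, j, K)
    )


def inequality_holds(K):
    """Check n_i n_j >= sum_k N_ij^k n_k for all pairs of sectors."""
    sectors = range(K + 1)
    return all(slack(i, j, K) >= 0 for i in sectors for j in sectors)


if __name__ == "__main__":
    for K in range(1, 5):
        print(K, inequality_holds(K), slack(K, K, K))
```

In the Lee–Yang case $K = 1$ the slack for $i = j = 1$ equals 1, which is the defect $d_{11} = 1$ that appears in the example at the end of this subsection.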
Vice versa, a collection of amplimorphisms $\nu_i : g \to M_{n_i}(g)$ for $i = 0, \ldots, K$, of the above type defines a co-product
$$ \Delta_{\{\nu\}} : g \to g \otimes g, \qquad a \mapsto \Delta(a) := \sum_{i=0}^{K} \nu_i(a) $$
up to inner automorphism, since again an explicit isomorphism from $M_{n_i}(g)$ to $g \otimes g_i$, $g_i = e_i \cdot g$, has to be chosen. The invariant information contained in $\Delta$ and in the collection of $g$-amplimorphisms is, however, the same, and we will see that the objects we aim at, namely the amplimorphisms of the observable algebra, are independent of twists of $\Delta$.

To work with amplimorphisms of the semi-simple QSA instead of the co-product was proposed by Szlachanyi and Vecsernyes, and the idea has been applied to G-spin chains and to the Ising model [66, 68]. Amplimorphisms seem to be better adapted to the DHR framework, and, for some purposes, are easier to handle in practice.

Our task is, given the c(2, q)-fusion rules $N_{ij}^k$, to find amplimorphisms $\nu_i$ of $g$ whose Bratteli diagrams contain $N_{ij}^k$ lines from $g_k$ to $M_{n_i}(g_j)$. This is fairly easy to do; in fact, the main difficulties in writing down the $\nu_i$ are notational. Fix a superselection sector $i \in I_q$, and choose, for each $j \in I_q$, an enumeration
$$ (ij|1),\ (ij|2),\ \ldots,\ (ij|m_{ij}) \tag{4.1} $$
of the fusion results $|i-j|, \{|i-j|+2\}, \ldots, i+j$ in the decomposition of $\phi_i \times \phi_j$, see Eqs. (2.4–6); recall that $m_{ij} = \min(i, j) + 1$. Using this notation, and the decomposition of $g$ and its elements $a \in g$ with the help of the minimal central projections, $a_i := a \cdot e_i$, we define the auxiliary map
$$ \tilde{\nu}_i : g \to M_{n_i}(g_0) \oplus M_{n_i}(g_1) \oplus \cdots \oplus M_{n_i}(g_K), \qquad a \mapsto \tilde{\nu}_i(a) \tag{4.2} $$
by
$$ \tilde{\nu}_i(a) := a_i \oplus \mathrm{bl\,diag}\big( a_{(i1|1)}, \ldots, a_{(i1|m_{i1})}, 0_{d_{i1}} \big) \oplus \cdots \oplus \mathrm{bl\,diag}\big( a_{(iK|1)}, \ldots, a_{(iK|m_{iK})}, 0_{d_{iK}} \big) ; $$
$0_{d_{ij}}$ denotes a square matrix of size
$$ d_{ij} := n_i n_j - \sum_{l=1}^{m_{ij}} n_{(ij|l)} \tag{4.3} $$
with all entries equal to zero. The $d_{ij}$ are just the defects in the inequalities (3.2). (In the terminology of Subsect. 3.2, $d_{ij}$ is the dimension of the “spurious space” in the fusion of the representations $i$ and $j$.) The Bratteli diagram of $\tilde{\nu}_i$ reproduces the fusion rules $N_{ij}^k$.

The target space of $\tilde{\nu}_i$ is isomorphic to $M_{n_i}(g)$, and we arrive at a true amplimorphism of the QSA if we choose a specific isomorphism. Since both $\bigoplus_l M_{n_i}(g_l)$ and $M_{n_i}(g)$ embed canonically into $M_{n_i N}(\mathbb{C})$, $N = \sum_l n_l$, we can simply use a permutation matrix $P_\pi \in M_{n_i N}(\mathbb{C})$ in order to “rearrange” the matrix $\tilde{\nu}_i(a) \in M_{n_i N}(\mathbb{C})$. The permutation $\pi \in S_{n_i N}$ is obtained from comparing the following two labelings of the standard basis $\varepsilon_s$, $s = 1, \ldots, n_i N$, of $\mathbb{C}^{n_i N}$:
Labeling I is in terms of triples $(n_j, \sigma; \alpha)$ with $j = 0, \ldots, K$, $\sigma = 1, \ldots, n_j$, $\alpha = 1, \ldots, n_i$. Here, $\varepsilon_{(n_j,\sigma;\alpha)} = \varepsilon_s$ with $s = n_i (n_0 + \ldots + n_{j-1}) + (\alpha - 1) n_j + \sigma$.

Labeling II is in terms of triples $(\alpha; n_j, \sigma)$ with the same range of indices $j, \sigma, \alpha$ as above. Here, $\varepsilon_{(\alpha;n_j,\sigma)} = \varepsilon_s$ with $s = (\alpha - 1) N + (n_0 + \ldots + n_{j-1}) + \sigma$.

The triples $(n_j, \sigma; \alpha)$ of labeling I provide a natural row-column ordering for the sub-algebra $\bigoplus_l M_{n_i}(g_l)$ of $M_{n_i N}(\mathbb{C})$, whereas the triples $(\alpha; n_j, \sigma)$ of labeling II are appropriate when we consider the sub-algebra $M_{n_i}(g)$. Accordingly, let $\pi \in S_{n_i N}$ be the permutation $s \mapsto \pi(s)$ with
$$ \pi(s) = (\alpha; n_j, \sigma) \quad \text{for } s = (n_j, \sigma; \alpha) \tag{4.4} $$
in an obvious notation, and let $P_\pi$ be the $n_i N \times n_i N$ matrix whose $s$th column is the standard unit vector $\varepsilon_{\pi(s)}$.

Proposition 4.1. For all $i \in I_q$, the map
$$ \nu_i : g \to M_{n_i}(g), \qquad a \mapsto P_\pi^t\, \tilde{\nu}_i(a)\, P_\pi , $$
with $\tilde{\nu}_i$ as in (4.2) and $P_\pi$ determined by (4.4), is an amplimorphism of the QSA $g$ such that any associated co-product $\Delta_{\{\nu\}}$ reproduces the c(2, q)-fusion rules.

Proof. First, we have to show that $P_\pi^t\, \tilde{\nu}_i(a)\, P_\pi \in M_{n_i N}(\mathbb{C})$ is contained in the sub-algebra $M_{n_i}(g)$ for all $a \in g$. This is true since $\tilde{\nu}_i(a)$ is an element of the block-diagonal sub-algebra $M_{n_i}(g_0) \oplus \cdots \oplus M_{n_i}(g_K)$ of $M_{n_i N}(\mathbb{C})$, which means, in terms of labeling I, that $\big(\tilde{\nu}_i(a)\big)_{(n_j,\sigma;\alpha)(n_k,\tau;\beta)} \neq 0$ only if $j = k$; the same applies for $\big(\nu_i(a)\big)_{(\alpha;n_j,\sigma)(\beta;n_k,\tau)}$, only now the counting is with respect to labeling II, and $\nu_i(a) \in M_{n_i}(g)$ follows. An explicit formula for the matrix elements of the amplimorphism $\nu_i$ is
$$ \big(\nu_i(a)\big)_{(\alpha;n_j,\sigma)(\beta;n_k,\tau)} = \delta_{jk} \sum_{l=1}^{m_{ij}} \big(a_{(ij|l)}\big)_{f_l(\alpha,\sigma),\, f_l(\beta,\tau)} \tag{4.5a} $$
with
$$ f_l(\alpha, \sigma) = \sigma + (\alpha - 1) n_j - \big( n_{(ij|1)} + \ldots + n_{(ij|l-1)} \big), \tag{4.5b} $$
and we use the convention that the matrix element $(a_l)_{\rho\sigma}$ vanishes unless $1 \le \rho, \sigma \le n_l$. Thus, at most one of the terms in the sum of Eq. (4.5a) gives a non-zero contribution.

The assertion on the Bratteli diagram associated to the amplimorphism $\nu_i$ (and to $\Delta_{\{\nu\}}$) is enforced by the very construction of the auxiliary maps $\tilde{\nu}_i$, which have been chosen as the simplest possible realization of the Bratteli diagram dictated by the fusion rules. Both in the definition of $\tilde{\nu}_i$ and of $\nu_i$ one could introduce additional twists by unitaries.

Note that the amplimorphisms $\nu_i$ are $*$-homomorphisms with respect to the canonical $*$-operation on complex matrix algebras and, more importantly, that they are all non-unital, except for $\nu_0 = \mathrm{id}$. This non-unitality may be traced back to the non-zero defects $d_{ij}$ of Eq. (4.3). The rank of $\nu_i(1)$ does not depend on the specifically simple choice of amplimorphisms $\nu_i$. Likewise, the following important statement is one on the inner isomorphism class of the $g$-amplimorphism:
Proposition 4.2. Upon composition, the amplimorphisms $\nu_i$ of $g$ realize the fusion rules, i.e. for each pair $i, j \in I_q$ there exists a unitary $U_{ij} \in M_{n_i n_j}(g)$ such that
$$ (\nu_i \circ \nu_j)(a) = U_{ij} \Big( \bigoplus_{k \in I_q} \nu_k(a)^{\oplus N_{ij}^k} \Big) U_{ij}^* $$
for all $a \in g$ and the fusion rules given in Subsect. 2.1. In particular, $\nu_i \circ \nu_j$ and $\nu_j \circ \nu_i$ are unitarily equivalent.

Proof. Recall that an algebra homomorphism $\psi : A \to B$ is extended to an algebra homomorphism $\psi : M_n(A) \to M_n(B)$ of the amplifications by setting $\psi\big( (a_{ij})_{i,j=1}^n \big) = \big( \psi(a_{ij}) \big)_{i,j=1}^n$. Thus, the lhs is a map from $g$ to $M_{n_i}(M_{n_j}(g))$, and the rhs can be regarded as an element of $M_{n_i n_j}(g)$ because of the basic inequality $n_i n_j \ge \sum_k N_{ij}^k n_k$. The existence of the unitaries $U_{ij}$ is clear since in the Bratteli diagrams associated to $\nu_i \circ \nu_j$ and $\bigoplus_k \nu_k^{\oplus N_{ij}^k}$ there are $\sum_k N_{im}^k N_{jk}^l$ resp. $\sum_k N_{ij}^k N_{km}^l$ lines from the factor $g_l$ to the factor $M_{n_i n_j}(g_m)$: The diagrams coincide.

Note that with our simple realization of the amplimorphisms, the unitaries $U_{ij}$ are in fact permutation matrices.

Let us illustrate the notions of this section in the simplest model of our series, namely the CFT describing the Lee–Yang edge singularity of the Ising model. To the c(2, 5) minimal model, we associate the QSA $g_{(5)} = \mathbb{C} \oplus M_2(\mathbb{C})$. (In the language of small spaces from Subsect. 3.2, $\dim \mathcal{H}_0^s = 1$ for the vacuum representation is a general fact, and $\dim \mathcal{H}_1^s = 2$ follows from the presence of the null-vector $\big( L_{-1}^2 - \tfrac{2}{5} L_{-2} \big) |h_1\rangle = 0$ in the irreducible module $\mathcal{H}_1$ of the c(2, 5) theory.)

The total Hilbert space $\mathcal{H}^{tot}_{(5)} = \mathcal{H}_0 \oplus (\mathcal{H}_1 \otimes \mathbb{C}^2)$ is identified with the path space over the Bratteli diagram $\hat{B}_5$, see Fig. 2, and the global (path) field algebra is the string algebra over $\hat{B}_5$. The representation of $g_{(5)}$ on $\mathcal{H}^{tot}_{(5)}$ is implemented in a straightforward way following Definition 3.6. The amplimorphisms of $g_{(5)}$ are also obtained easily: $\nu_0 = \mathrm{id}$ is trivial, and the fusion rule $\phi_1 \times \phi_1 = \phi_0 + \phi_1$ with this enumeration of fusion results, i.e. $(11|1) = 0$, $(11|2) = 1$, yields
$$ \tilde{\nu}_1(a) = \begin{pmatrix} a_1^{11} & a_1^{12} \\ a_1^{21} & a_1^{22} \end{pmatrix} \oplus \begin{pmatrix} a_0 & 0 & 0 & 0 \\ 0 & a_1^{11} & a_1^{12} & 0 \\ 0 & a_1^{21} & a_1^{22} & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} $$
for the auxiliary morphism $\tilde{\nu}_1 : g \to M_2(g_0) \oplus M_2(g_1)$, where $a = a_0 \oplus \begin{pmatrix} a_1^{11} & a_1^{12} \\ a_1^{21} & a_1^{22} \end{pmatrix} \in g$. In the c(2, 5)-model, the defect is $d_{11} = 1$, explaining the zero in the lower right corner of the matrix $\tilde{\nu}_1(a)$. The permutation $P_\pi$ which has to be applied in passing to an amplimorphism $\nu_1 : g \to M_2(g)$ amounts to slicing the matrices in $\tilde{\nu}_1(a)$ into quarters and rearranging them as
$$ \nu_1(a) = \begin{pmatrix} a_1^{11} \oplus \begin{pmatrix} a_0 & 0 \\ 0 & a_1^{11} \end{pmatrix} & a_1^{12} \oplus \begin{pmatrix} 0 & 0 \\ a_1^{12} & 0 \end{pmatrix} \\[2mm] a_1^{21} \oplus \begin{pmatrix} 0 & a_1^{21} \\ 0 & 0 \end{pmatrix} & a_1^{22} \oplus \begin{pmatrix} a_1^{22} & 0 \\ 0 & 0 \end{pmatrix} \end{pmatrix} . \tag{4.6} $$
Finally, it is straightforward to verify that $(\nu_1 \circ \nu_1)(a)$ and $\nu_0(a) \oplus \nu_1(a)$ are equal up to simultaneous permutation of rows and columns, so that the $\nu_i$ implement the Lee–Yang fusion rules.

4.2. Construction of covariant field multiplets. In this subsection, we will construct multiplets of “charged fields” which transform covariantly under the QSA action, and which can be used to define (global) amplimorphisms of the path observable algebra $\mathcal{A}$. The charged fields associated with a sector $i \in I_q$ are operators $F_i = \big( (F_i)_{\alpha\beta} \big)_{\alpha,\beta=1}^{n_i} \in M_{n_i}(\mathcal{F})$ in the $n_i$-th amplification of the path field algebra $\mathcal{F}$, and they are subject to the conditions
$$ a \cdot (F_i)_{\alpha\beta} = \big( F_i \cdot \nu_i(a) \big)_{\alpha\beta} \quad \text{for all } a \in g, \tag{4.7} $$
$$ F_i^* F_i = \nu_i(1). \tag{4.8} $$
In both equations, we have identified $a \in g$ with its image $U(a)$ on $\mathcal{H}^{tot}$. Equation (4.7) simply expresses the $g$-covariance of the charged multiplets, written in terms of $g$-amplimorphisms, cf. [68], rather than the co-product as in [64]. The second relation, where “1” of course is the unit of the QSA $g$, can be viewed as a completeness condition; however, since the $g$-amplimorphisms $\nu_i$ are in general non-unital, the field multiplets $F_i$ are merely partial isometries in $M_{n_i}(\mathcal{F})$.

The task of finding explicit solutions of (4.7) and (4.8) is the only slightly technical part of the constructions presented in this paper. The matrix notation of Eq. (3.14) for the field operators $(F_i)_{\alpha\beta}$ will prove useful in this process, only now we will group the indices according to “labeling II” of the last subsection. Thus, we write
$$ (F_i)_{\alpha\beta} = \Big( (F_i)_{(\alpha;n_k,\sigma)(\beta;n_j,\rho)} \Big)_{k,\sigma;\,j,\rho} \tag{4.9} $$
with $j, k \in I_q$, $\rho = 1, \ldots, n_j$, $\sigma = 1, \ldots, n_k$. Again, this notation serves to make the first two floors of a string in $\mathcal{F}$ “visible”: The indices $(n_j, \rho)(n_k, \sigma)$ of the $(F_i)_{\alpha\beta}$-entry indicate that it is a (linear combination of) string(s) of the form $|p_k^\infty \circ \sigma_k^*\rangle\langle p_j^\infty \circ \rho_j^*|$, where $|\rho_j^*\rangle$ is the $\rho$th path from $*$ to the node $j$ on the 0th floor, and $|p_j^\infty\rangle \in \mathcal{P}_j$ runs from $j$ on to infinity (with the usual tail condition); the symbol $\circ$ denotes concatenation of the two pieces. In (4.9), the $|\sigma_k^*\rangle\langle\rho_j^*|$ part of $(F_i)_{\alpha\beta}$ is “resolved” into a matrix; we will refer to the entry with indices $(n_k, \sigma)(n_j, \rho)$ of this matrix as a k-j-string for short (although it is actually only a “half-open” string running to infinity from the nodes $k$ and $j$ on the 0th floor). Since we can identify states in the irreducible representations $\mathcal{P}_j$ with j-paths on the extended Bratteli diagram (in obvious terminology), we find that the field operators carry one representation into another. More precisely, the matrix element (4.9) maps the ($\rho$th copy of the) space $\mathcal{H}_j$ into the ($\sigma$th copy of the) space $\mathcal{H}_k$.
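Since the covariance law (4.7) is phrased through the $g$-amplimorphisms, it may help to have the Lee–Yang amplimorphism $\nu_1$ of Eq. (4.6) in concrete form. The sketch below builds $\tilde{\nu}_1(a)$ for $g_{(5)} = \mathbb{C} \oplus M_2(\mathbb{C})$ and passes from labeling I to labeling II with index arrays instead of an explicit permutation matrix, which sidesteps any transposition convention. It assumes NumPy; the function names and the representation of $a \in g$ as a $3 \times 3$ block-diagonal matrix are choices made here for illustration.

```python
import numpy as np

N_SECT = [1, 2]        # multiplicities n_0, n_1 of the c(2,5) model
N_TOT = sum(N_SECT)    # N = 3


def element(a0, a1):
    """a = a0 (+) a1 in g = C (+) M_2(C), written as a 3x3 block-diagonal matrix."""
    a = np.zeros((N_TOT, N_TOT), dtype=complex)
    a[0, 0] = a0
    a[1:, 1:] = a1
    return a


def nu1_tilde(a):
    """Auxiliary map: a_1 (+) bl diag(a_0, a_1, 0), indexed by labeling I."""
    out = np.zeros((6, 6), dtype=complex)
    out[0:2, 0:2] = a[1:, 1:]   # j = 0 block, phi_1 x phi_0 = phi_1
    out[2, 2] = a[0, 0]         # j = 1 block, first fusion result (11|1) = 0
    out[3:5, 3:5] = a[1:, 1:]   # j = 1 block, second fusion result (11|2) = 1
    return out                  # last row and column stay zero (defect d_11 = 1)


def labelings(i=1):
    """0-based positions of each triple (j, sigma, alpha) in labelings I and II."""
    n_i = N_SECT[i]
    lab_one, lab_two = [], []
    for j, n_j in enumerate(N_SECT):
        offset = sum(N_SECT[:j])          # n_0 + ... + n_{j-1}
        for alpha in range(n_i):
            for sigma in range(n_j):
                lab_one.append(n_i * offset + alpha * n_j + sigma)
                lab_two.append(alpha * N_TOT + offset + sigma)
    return lab_one, lab_two


def nu1(a):
    """Amplimorphism nu_1(a) in M_2(g), obtained by relabeling nu1_tilde(a)."""
    lab_one, lab_two = labelings()
    out = np.zeros((6, 6), dtype=complex)
    out[np.ix_(lab_two, lab_two)] = nu1_tilde(a)[np.ix_(lab_one, lab_one)]
    return out
```

In this sketch `nu1` is multiplicative, and `nu1(np.eye(3))` has rank 5 rather than 6, which is the non-unitality caused by the defect $d_{11} = 1$.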
In the following, we will first deal with the covariance condition (4.7), which turns out to determine only the “coarse structure” of the operators $(F_i)_{\alpha\beta}$, i.e. to determine that some of the matrix elements (4.9) have to vanish, and some have to coincide. Afterwards, the completeness relation (4.8) will place non-trivial constraints on the non-vanishing entries in $(F_i)_{\alpha\beta}$, and it is this step where the combinatorial structure of the path spaces becomes essential in solving the constraints.

Proposition 4.3. The field multiplet $(F_i)_{\alpha\beta}$, $\alpha, \beta = 1, \ldots, n_i$, transforms covariantly under the $g$-action if and only if the matrix elements (4.9) have the following form:
$$ (F_i)_{(\alpha;n_k,\sigma)(\beta;n_j,\rho)} = C^k_{ij;\alpha} \sum_{l=1}^{m_{ij}} \delta_{(ij|l),k}\ \delta_{\sigma + n_{(ij|1)} + \cdots + n_{(ij|l-1)},\ \rho + (\beta-1) n_j} . $$
Here, $C^k_{ij;\alpha}$ denotes some k-j-string on the Bratteli diagram $\hat{B}_q$. In particular, the matrix elements vanish unless $\phi_k$ occurs in the fusion decomposition of $\phi_i \times \phi_j$.
Proof. One advantage of the explicit matrix notation for field algebra elements is that the QSA-operators $a \in g$ commute with the entries $C^k_{ij;\alpha}$ of $(F_i)_{\alpha\beta}$, since $a \equiv U(a)$ acts non-trivially only between $*$ and the 0th floor of $\hat{B}_q$, whereas the $C^k_{ij;\alpha}$ are made up from half-open strings starting on the 0th floor. Therefore, we can simply treat the $C^k_{ij;\alpha}$ as if they were numerical coefficients, for the time being. With $F_i$ as above, let us calculate the lhs of Eq. (4.7):
$$ \begin{aligned} \big( a \cdot (F_i)_{\alpha\beta} \big)_{(n_m,\tau)(n_j,\rho)} &= \sum_{(n_k,\sigma)} a_{(n_m,\tau)(n_k,\sigma)}\, (F_i)_{(\alpha;n_k,\sigma)(\beta;n_j,\rho)} \\ &= \sum_{(n_k,\sigma)} \delta_{m,k}\, (a_m)_{\tau,\sigma}\, C^k_{ij;\alpha} \sum_{l=1}^{m_{ij}} \delta_{(ij|l),k}\ \delta_{\sigma + n_{(ij|1)} + \cdots + n_{(ij|l-1)},\ \rho + (\beta-1) n_j} \\ &= C^m_{ij;\alpha} \sum_{l=1}^{m_{ij}} \delta_{(ij|l),m}\, (a_m)_{\tau,\, f_l(\beta,\rho)} \end{aligned} $$
with $f_l(\beta, \rho) = \rho + (\beta - 1) n_j - n_{(ij|1)} - \ldots - n_{(ij|l-1)}$ from (4.5b). For the rhs of (4.7), we obtain
$$ \begin{aligned} \big( F_i\, \nu_i(a) \big)_{(\alpha;n_m,\tau)(\beta;n_j,\rho)} &= \sum_{(\gamma;n_k,\sigma)} C^m_{ik;\alpha} \sum_{l=1}^{m_{ik}} \delta_{(ik|l),m}\ \delta_{\tau + n_{(ik|1)} + \cdots + n_{(ik|l-1)},\ \sigma + (\gamma-1) n_k} \\ &\qquad \times\ \delta_{k,j} \sum_{l'=1}^{m_{ik}} \big( a_{(ik|l')} \big)_{f_{l'}(\gamma,\sigma),\, f_{l'}(\beta,\rho)} \\ &= C^m_{ij;\alpha} \sum_{l,l'=1}^{m_{ij}} \delta_{(ij|l),m}\, \big( a_{(ij|l')} \big)_{g_{l,l'}(\tau),\, f_{l'}(\beta,\rho)} ; \end{aligned} $$
here, we have introduced the shorthand $g_{l,l'}(\tau) = \tau + n_{(ij|1)} + \ldots + n_{(ij|l-1)} - n_{(ij|1)} - \ldots - n_{(ij|l'-1)}$ for the first index (the Kronecker symbol $\delta_{k,j}$ has been used to set $k = j$). Now recall our convention that $(a_{l'})_{\rho,\sigma} = 0$ unless $1 \le \rho, \sigma \le n_{l'}$, and also that we have once and for all fixed an enumeration of the fusion results $(ij|1), \ldots, (ij|m_{ij})$ in Eq. (4.1). Put together, this means that the last expression contains an implicit Kronecker symbol enforcing $l = l'$ and, therefore, it agrees with the lhs of (4.7) calculated before.

Given the formula (4.5) for the $g$-amplimorphisms, one can also show that the multiplets $F_i$ must have the form given in the proposition in order to solve (4.7). One may e.g. insert matrix units for $a \in g$ to achieve complete decoupling of all equations, and the only difficulty is to keep track of the indices. Since this is rather tedious, and since later we will not aim at the most general solution for the coefficients $C^k_{ij;\alpha}$, we omit details of the “only if” part of the proof.

Let us now turn to the completeness relation. We first prove the following intermediate result, which does not yet involve the special structure of our path spaces.
Proposition 4.4. The field multiplets $F_i$ solve the completeness relation (4.8) if and only if the coefficients $C^k_{ij;\alpha}$ of Proposition 4.3 satisfy
$$ \sum_{\alpha=1}^{n_i} \big( C^k_{im;\alpha} \big)^* C^k_{ij;\alpha} = \delta_{N_{ij}^k,1}\ \delta_{m,j}\ 1_{\mathcal{P}_j} , \tag{4.10} $$
where $1_{\mathcal{P}_j}$ is the identity operator on the path space $\mathcal{P}_j$ of Sect. 2, viewed as a subspace of $\hat{\mathcal{P}}_{(q)}$.

Proof. Assume that (4.10) holds, and insert the formula for $F_i$ from Proposition 4.3 into the lhs of (4.8); this yields
$$ \begin{aligned} \big( F_i^* F_i \big)_{(\alpha;n_m,\tau)(\beta;n_j,\rho)} &= \sum_{(\gamma;n_k,\sigma)} \big( (F_i)_{(\gamma;n_k,\sigma)(\alpha;n_m,\tau)} \big)^* (F_i)_{(\gamma;n_k,\sigma)(\beta;n_j,\rho)} \\ &= \sum_{(\gamma;n_k,\sigma)} \big( C^k_{im;\gamma} \big)^* C^k_{ij;\gamma} \sum_{l'=1}^{m_{im}} \sum_{l=1}^{m_{ij}} \delta_{(im|l'),k}\ \delta_{(ij|l),k} \\ &\qquad \times\ \delta_{\sigma + n_{(im|1)} + \cdots + n_{(im|l'-1)},\ \tau + (\alpha-1) n_m}\ \delta_{\sigma + n_{(ij|1)} + \cdots + n_{(ij|l-1)},\ \rho + (\beta-1) n_j} \\ &= 1_{\mathcal{P}_j}\, \delta_{m,j} \sum_{(n_k,\sigma)} \sum_{l=1}^{m_{ij}} \delta_{(ij|l),k}\ \delta_{\tau + (\alpha-1) n_j,\ \rho + (\beta-1) n_j}\ \delta_{\sigma + n_{(ij|1)} + \cdots + n_{(ij|l-1)},\ \tau + (\alpha-1) n_j} . \end{aligned} $$
We have again used the fact that the enumeration of the fusion results was fixed, which enforces $l = l'$ above. Since the range of both $\tau$ and $\rho$ is $1, \ldots, n_j$, the last but one Kronecker symbol implies $\tau = \rho$ and $\alpha = \beta$. Thus, $F_i^* F_i$ is diagonal,
$$ \big( F_i^* F_i \big)_{(\alpha;n_m,\tau)(\beta;n_j,\rho)} = \delta_{\alpha,\beta}\ \delta_{m,j}\ \delta_{\tau,\rho}\ \Theta_{ij}(\alpha, \tau)\ 1_{\mathcal{P}_j} $$
with a “cutoff factor”
$$ \Theta_{ij}(\alpha, \tau) := \begin{cases} 1 & \text{if } \sum_k N_{ij}^k n_k \ge \tau + (\alpha - 1) n_j , \\ 0 & \text{otherwise.} \end{cases} $$
We compare this to the rhs of (4.8), which is only a special case of (4.5):
$$ \big( \nu_i(1) \big)_{(\alpha;n_m,\tau)(\beta;n_j,\rho)} = 1_{\mathcal{P}_j}\, \delta_{m,j} \sum_{l=1}^{m_{ij}} \big( 1_{(ij|l)} \big)_{f_l(\alpha,\tau),\, f_l(\beta,\rho)} . $$
(We have multiplied by the unit operator on $\mathcal{P}_j$ as we are actually working with $U(\nu_i(1))$ acting on $\hat{\mathcal{P}}_{(q)}$.) Since $1 \in g$ is a diagonal matrix, the elements above vanish unless $f_l(\alpha, \tau) = f_l(\beta, \rho)$, i.e. unless $\tau = \rho$ and $\alpha = \beta$; finally, a closer look at the “defect” of the non-unital amplimorphism $\nu_i$ shows that the cutoff factor $\Theta_{ij}(\alpha, \tau)$ appears in $\nu_i(1)$, too. All in all, given (4.10), the completeness relation (4.8) is satisfied.

Proving the reverse direction is easy recalling the structure of the matrix elements of $F_i$ in Proposition 4.3: Note that the Kronecker symbols are independent of $\alpha$ and that, for fixed $\alpha$, there is at most one non-zero entry in each column $(\beta; n_j, \rho)$. Thus, the entries in $F_i^* F_i$ are precisely of the form $\sum_\alpha \big( C^k_{im;\alpha} \big)^* C^k_{ij;\alpha}$, and condition (4.10) simply follows from the matrix structure of $\nu_i(1)$.
The actual task is to construct (half-open) strings $C^k_{ij;\alpha}$ which satisfy the relations (4.10). This will be done with the help of embeddings of path spaces, mapping elementary paths to elementary paths. First, we need a combinatorial lemma comparing the sizes of certain path spaces.

Lemma 4.5. As in Eq. (2.22), let $\mathcal{P}^{(2)}_{k,m}$ be the space of paths of length 2 on $G_q$ which start from node $k$ on the 0th floor and end at node $m$ on the 2nd floor, $k, m \in I_q$. With the c(2, q) fusion rules $N_{ij}^k$ and the sector multiplicities $n_i = i + 1$, the following estimate holds for all $i, k, m \in I_q$:
$$ n_i \dim \mathcal{P}^{(2)}_{k,m} \ \ge\ \sum_{j \in I_q} N_{ij}^k \dim \mathcal{P}^{(2)}_{j,m} . $$

Proof. As in the proof of Lemma 3.5, we express $\dim \mathcal{P}^{(2)}_{j,m}$ through the embedding matrix $C_q$ and standard unit vectors $\varepsilon_i$; thus, the lhs is
$$ n_i \dim \mathcal{P}^{(2)}_{k,m} = n_i\, \varepsilon_m^t\, C_q^2\, \varepsilon_k = n_i\, (N_K^2)_{mk} = n_i\, (N_0 + N_1 + \ldots + N_K)_{mk} . $$
Here, the fact that $G_q$ is just the fusion graph of the minimal dimension field $\phi_K$ of the c(2, q)-model is very convenient: The last equality is the fusion rule $\phi_K \times \phi_K$. For the rhs, we compute in the same fashion that
$$ \sum_{j} N_{ij}^k \dim \mathcal{P}^{(2)}_{j,m} = \sum_{j} N_{ij}^k\, (N_K^2)_{mj} = (N_K^2 N_i)_{mk} = \big( (i+1)(N_K + \ldots + N_i) + i N_{i-1} + \ldots + 2 N_1 + N_0 \big)_{mk} . $$
The last step follows from applying the fusion rule $\phi_K \times \phi_i = \phi_K + \phi_{K-1} + \ldots + \phi_{K-i}$ twice, see Subsect. 2.1. Subtracting the rhs from the lhs, we obtain a matrix whose elements are all non-negative if and only if $n_i \ge i + 1$.

From the proof of this lemma, we learn as a by-product that our choice of sector multiplicities is indeed the minimal one such that the above dimension estimate holds true and, as a consequence, such that the construction of the $C^k_{ij;\alpha}$ to be given below is possible. This seems remarkable since up to now the special values $n_i = i + 1$ were distinguished only on the general grounds of Subsect. 3.2, whereas the basic inequalities $n_i n_j \ge \sum_k N_{ij}^k n_k$ can in general be fulfilled with some of the multiplicities taken smaller than $i + 1$.

Below, the following generalizations of Lemma 4.5 will be useful:

Corollary 4.6. For all $n \ge 2$, and for all $i, k, m \in I_q$, we have
$$ n_i \dim \mathcal{P}^{(n)}_{k,m} \ \ge\ \sum_{j \in I_q} N_{ij}^k \dim \mathcal{P}^{(n)}_{j,m} . $$
Furthermore, if the sector indices are such that $k + m \ge K$, the dimension inequality also holds for $n = 1$.

Proof. The first claim is true because all path spaces are based on the same fusion graph $G_q$, so for $n \ge 2$ we obtain the dimensions of the path spaces $\mathcal{P}^{(n)}$ upon application of $C_q^{n-2}$ to those of $\mathcal{P}^{(2)}$; this does not spoil the estimate in the previous lemma. The case $n = 1$ is special only as far as some of the spaces $\mathcal{P}^{(1)}_{k,m}$ are empty: These are precisely those with $k + m < K$, as follows from the form of $G_q$.
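The dimension estimates of Lemma 4.5 and Corollary 4.6 lend themselves to a brute-force check for small $K$. The sketch below is only illustrative: it assumes the closed form of the c(2, q) fusion rules (fusion results $|i-j|, |i-j|+2, \ldots, i+j$ with labels above $K$ folded back via $k \mapsto 2K+1-k$), takes the adjacency matrix of $G_q$ to be the fusion matrix of $\phi_K$ as stated in the proof, and uses NumPy. The function names are hypothetical.

```python
import numpy as np


def fusion(i, j, K):
    """Sectors k with N_ij^k = 1 in the c(2, 2K+3) model (assumed closed form)."""
    result = []
    for l in range(min(i, j) + 1):
        k = abs(i - j) + 2 * l
        result.append(k if k <= K else 2 * K + 1 - k)
    return result


def fusion_matrices(K):
    """Integer array N with N[i, k, j] = N_ij^k."""
    N = np.zeros((K + 1, K + 1, K + 1), dtype=int)
    for i in range(K + 1):
        for j in range(K + 1):
            for k in fusion(i, j, K):
                N[i, k, j] = 1
    return N


def estimate_holds(K, n, i, k, m):
    """Test n_i dim P^(n)_{k,m} >= sum_j N_ij^k dim P^(n)_{j,m} with n_i = i + 1."""
    N = fusion_matrices(K)
    # G_q is the fusion graph of phi_K, so N[K] is its adjacency matrix and
    # entry (m, j) of its n-th power counts the paths of length n from j to m.
    paths = np.linalg.matrix_power(N[K], n)
    lhs = (i + 1) * paths[m, k]
    rhs = sum(N[i, k, j] * paths[m, j] for j in range(K + 1))
    return bool(lhs >= rhs)


if __name__ == "__main__":
    K = 2
    sectors = range(K + 1)
    print(all(estimate_holds(K, 2, i, k, m)
              for i in sectors for k in sectors for m in sectors))
```

For $n = 1$ the same function exhibits the exception mentioned in the corollary: in the Lee–Yang case, `estimate_holds(1, 1, 1, 0, 0)` is false because $k + m < K$.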
After these preparations, we are ready to construct strings $C^k_{ij;\alpha}$ with the property (4.10). Lemma 4.5 guarantees that for all $i, j, k, m \in I_q$ there exist injective homomorphisms
$$ \iota^{k\,(2)}_{i,m} : \bigoplus_j \big( \mathcal{P}^{(2)}_{j,m} \big)^{\oplus N_{ij}^k} \to \big( \mathcal{P}^{(2)}_{k,m} \big)^{\oplus n_i} \tag{4.11} $$
of path spaces of length 2, which leave the endpoints (here the node $m$) fixed. We arrange the injections in such a way that elementary paths are mapped to elementary paths. This allows us to extend $\iota^{(2)}$ to longer paths simply by requiring compatibility with concatenations $c^l_m$, see Eq. (2.23): Given a collection of maps $\iota^{k\,(n)}_{i,l}$ for some $n \ge 2$, we define $\iota^{k\,(n+1)}_{i,m}$ by
$$ \iota^{k\,(n+1)}_{i,m} \big( c^l_m(p) \big) = c^l_m \big( \iota^{k\,(n)}_{i,l}(p) \big) \tag{4.12} $$
for all elementary paths $|p\rangle \in \bigoplus_j \big( \mathcal{P}^{(n)}_{j,l} \big)^{\oplus N_{ij}^k}$. Moreover, the second part of Corollary 4.6 states that injections $\iota^{(1)}$ can already be defined for paths of length 1 at least in some cases. Among those are the first edge of the highest weight path (with respect to the $L_0^G$ action (2.21)) in each $\mathcal{P}_i$, i.e. the edge $[i \to K]$; we choose $\iota^{k\,(1)}_i$ such as to map this edge to $[k \to K]$, and whenever possible, we require already $\iota^{(2)}$ to be induced by $\iota^{(1)}$ according to (4.12).

Having chosen such a collection of injections $\iota^{k\,(2)}_{i,m}$ for all $i, k, m \in I_q$ (the choice involved is a finite one, to be discussed later), the compatibility with concatenation also ensures that we can take the inductive limit of the system $\big( \iota^{k\,(n)}_{i,m} \big)_n$, and we arrive at well-defined embeddings of infinite path spaces
$$ \iota^k_i : \bigoplus_j \mathcal{P}_j^{\oplus N_{ij}^k} \to \mathcal{P}_k^{\oplus n_i} , \tag{4.13} $$
which by definition map elementary paths to elementary paths, preserve length and endpoint of every finite path and, in particular, map ground states to ground states.

We need two more (canonical) maps to be able to write down a formula for $C^k_{ij;\alpha}$. One is the projection from the $n_i$-fold direct sum of $\mathcal{P}_k$ onto the $\alpha$th factor, $\alpha = 1, \ldots, n_i$,
$$ \mathrm{pr}_\alpha : \mathcal{P}_k^{\oplus n_i} \to \mathcal{P}_k . \tag{4.14} $$
The other is the inclusion of $\mathcal{P}_j$ into the direct sum of path spaces occurring in the fusion of $i$ and $k$, which, however, vanishes if $N_{ij}^k = 0$. We write
$$ \varepsilon^k_{ij} := \delta_{N_{ij}^k,1} \cdot \mathrm{incl}_j : \mathcal{P}_j \to \bigoplus_{j'} \mathcal{P}_{j'}^{\oplus N_{ij'}^k} \tag{4.15} $$
for this “weighted” inclusion.

Proposition 4.7. Denote by
$$ \Gamma^k_{ij;\alpha} := \mathrm{pr}_\alpha \circ \iota^k_i \circ \varepsilon^k_{ij} : \mathcal{P}_j \to \mathcal{P}_k $$
the composition of the maps (4.13–15), and define the k-j-strings $C^k_{ij;\alpha}$ by
$$ C^k_{ij;\alpha} := \sum_{|p\rangle \in \mathcal{P}_j} |\, \Gamma^k_{ij;\alpha}(p) \rangle \langle p \,| . $$
Then the $C^k_{ij;\alpha}$ satisfy the assumption of Proposition 4.4.
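Proof. The proof is straightforward, using that $\Gamma^k_{ij;\alpha}$ maps elementary paths to elementary paths injectively, as well as the string multiplication rule:
$$ \begin{aligned} \sum_{\alpha=1}^{n_i} \big( C^k_{im;\alpha} \big)^* C^k_{ij;\alpha} &= \sum_{\alpha=1}^{n_i} \sum_{|q\rangle \in \mathcal{P}_m} \sum_{|p\rangle \in \mathcal{P}_j} |q\rangle \langle \Gamma^k_{im;\alpha}(q) \,|\, \Gamma^k_{ij;\alpha}(p) \rangle \langle p| \\ &= \delta_{N_{ij}^k,1} \sum_{|q\rangle \in \mathcal{P}_m} \sum_{|p\rangle \in \mathcal{P}_j} |q\rangle\, \delta_{p,q}\, \langle p| = \delta_{N_{ij}^k,1}\ \delta_{m,j}\ 1_{\mathcal{P}_j} . \end{aligned} $$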
As an aside, let us mention the following applications of this proposition, or of Eq. (4.10). The operator $C^k_{ij} : \mathcal{P}_j \to \mathcal{P}_k^{\oplus n_i}$ given by
$$ C^k_{ij} = \sum_{|p\rangle \in \mathcal{P}_j} |\, \iota^k_i(\varepsilon^k_{ij}(p)) \rangle \langle p \,| , \tag{4.16} $$
which we can also write as a column vector $C^k_{ij} = \big( C^k_{ij;\alpha} \big)_{\alpha=1}^{n_i}$, is an isometry. Furthermore, the operators $\Pi^k_{ij}$ and $\Pi^k_i$ in $M_{n_i}(\mathcal{A}_k)$, defined as
$$ \Pi^k_{ij} = C^k_{ij} \big( C^k_{ij} \big)^* , \qquad \Pi^k_i = \sum_{j \in I_q} \Pi^k_{ij} , \tag{4.17} $$
are both projections; in the first expression, we regard $\big( C^k_{ij} \big)^*$ as a row vector of j-k-strings. When restricted to finite paths in $\mathcal{P}_k^{(n)} = \bigoplus_l \mathcal{P}^{(n)}_{k,l}$, cf. Eq. (2.23), the rank of $\Pi^k_i$ is $\sum_j N_{ij}^k \dim \mathcal{P}_j^{(n)}$. These operators will become important later when we will discuss the amplimorphisms of the observable algebra.

Clearly, the construction of the strings $C^k_{ij;\alpha}$ we have given is not the only way to solve the completeness conditions (4.10). Nevertheless, we think that from the point of view of path spaces, our procedure is the most (and maybe even the only) natural one. Besides that, requiring that the embeddings $\iota^k_i$ are compatible with path prolongation reduces the amount of choices to be made quite drastically: $\iota^k_i$ is determined up to a unitary transformation in the finite sub-algebra $M_{n_i}(\mathcal{A}_k^{(2)})$, with the further constraint that elementary paths should be mapped to elementary paths. In the example below, it turns out that this essentially leaves only twists by certain permutation matrices in $M_{n_i}(\mathbb{C} \cdot 1_{\mathcal{A}_k})$.

All in all, our prescription how to construct charged field multiplets in the amplified path field algebra leads us to an almost unique and above all natural solution, which was possible by exploiting the “fine structure” of the underlying path spaces.

At the end of this section, let us again take a look at the case of the c(2, 5) minimal model. There, the non-trivial field multiplet $F_1$ is an element of $M_2(\mathcal{F})$, and according to Proposition 4.3, it has the following matrix structure:
$$ \big( (F_1)_{\alpha\beta} \big)_{\beta=1}^{2} = \left( \begin{array}{ccc|ccc} 0 & C^0_{11;\alpha} & 0 & 0 & 0 & 0 \\ C^1_{10;\alpha} & 0 & C^1_{11;\alpha} & 0 & 0 & 0 \\ 0 & 0 & 0 & C^1_{10;\alpha} & C^1_{11;\alpha} & 0 \end{array} \right) \tag{4.18} $$
for $\alpha = 1, 2$. With the help of the explicit formula (4.6) for the amplimorphism $\nu_1$ of $g_{(5)}$, it is straightforward to check covariance (4.7) of $F_1$ explicitly, and in this example, also uniqueness of the solution (4.18) is not too difficult to show.
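The covariance check for the Lee–Yang multiplet can be mimicked numerically if, as in the proof of Proposition 4.3, the strings are treated as numerical coefficients. The following sketch is a toy model under that assumption: each string $C^k_{1j;\alpha}$ is replaced by a complex number `c[alpha]`, $\nu_1(a)$ is entered by hand from Eq. (4.6), and $F_1$ is filled in from Proposition 4.3 for $i = 1$. With scalars, the completeness relation (4.8) reduces to orthonormality of the coefficient vectors over $\alpha$, which imitates condition (4.10). It uses NumPy, and all names are illustrative.

```python
import numpy as np


def nu1(a0, a1):
    """nu_1(a) of Eq. (4.6) as a 6x6 matrix; index = 3*alpha + position in g."""
    nu = np.zeros((6, 6), dtype=complex)
    for alpha in range(2):
        for beta in range(2):
            nu[3 * alpha, 3 * beta] = a1[alpha, beta]   # g_0 components
    nu[1, 1] = a0                                       # a_0 in the (1,1) entry
    # a_1 distributed over the g_1 components of the four entries
    nu[2, 2], nu[2, 4] = a1[0, 0], a1[0, 1]
    nu[4, 2], nu[4, 4] = a1[1, 0], a1[1, 1]
    return nu


def field_multiplet(c0, c10, c11):
    """Toy F_1: c0, c10, c11 stand in for C^0_11, C^1_10, C^1_11 (indexed by alpha)."""
    F = np.zeros((6, 6), dtype=complex)
    for alpha in range(2):
        row = 3 * alpha
        F[row + 0, 1] = c0[alpha]    # beta = 1 block
        F[row + 1, 0] = c10[alpha]
        F[row + 1, 2] = c11[alpha]
        F[row + 2, 3] = c10[alpha]   # beta = 2 block
        F[row + 2, 4] = c11[alpha]
    return F


def is_covariant(F, a0, a1):
    """Check a . F = F . nu_1(a), with a acting on every entry of the multiplet."""
    a = np.zeros((3, 3), dtype=complex)
    a[0, 0] = a0
    a[1:, 1:] = a1
    return np.allclose(np.kron(np.eye(2), a) @ F, F @ nu1(a0, a1))


def is_complete(F):
    """Check F^* F = nu_1(1)."""
    return np.allclose(F.conj().T @ F, nu1(1.0, np.eye(2)))
```

In this toy setting covariance holds for any choice of coefficients, whereas completeness requires, for instance, `c10` and `c11` to be orthonormal vectors in $\mathbb{C}^2$ and `c0` to have unit norm. This mirrors the division of labour between Propositions 4.3 and 4.4.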
The strings $C^k_{1j;\alpha}$ are constructed as above: For $k = 0$, we have to choose embeddings $\iota^{0\,(n)}_{1,m} : \mathcal{P}^{(n)}_{1,m} \to \mathcal{P}^{(n)}_{0,m} \oplus \mathcal{P}^{(n)}_{0,m}$ for both path ends $m = 0, 1$ and for $n \ge 2$, or already for $n = 1$ if possible: $\mathcal{P}^{(1)}_{0,0} = \emptyset$, but $\iota^{0\,(1)}_{1,1}$ can be defined: It maps the edge $[1 \to 1] \in \mathcal{P}^{(1)}_{1,1}$ to one of the two copies of $[0 \to 1]$ present in $\mathcal{P}^{(1)}_{0,1} \oplus \mathcal{P}^{(1)}_{0,1}$. This determines $\iota^{0\,(n)}_1$ on all paths in $\mathcal{P}_1^{(n)}$ that go through the node 1 on the first floor. In particular, one of the two copies of the path $(0; 1, 1)$ in $\mathcal{P}^{(2)}_{0,1} \oplus \mathcal{P}^{(2)}_{0,1}$ is in the image of those paths, so we have to map the remaining path $(1; 0, 1) \in \mathcal{P}^{(2)}_{1,1}$ to the other copy of $(0; 1, 1)$ in $\mathcal{P}^{(2)}_{0,1} \oplus \mathcal{P}^{(2)}_{0,1}$. This defines $\iota^{0\,(2)}_{1,m}$ and therefore $\iota^0_1$ completely.

For $k = 1$, we need to define $\iota^1_1 : \mathcal{P}_0 \oplus \mathcal{P}_1 \to \mathcal{P}_1 \oplus \mathcal{P}_1$. The only natural possibility is to identify the space $\mathcal{P}_1$ on the left with one of the $\mathcal{P}_1$'s on the right, and then map $\mathcal{P}_0$ into the second copy of $\mathcal{P}_1$ with the “initial condition” $\iota^{1\,(1)}_{1,1}\big( [0 \to 1] \big) = [1 \to 1] \in \mathcal{P}^{(1)}_{1,1}$. Then, the only choice in the construction is which copy of $\mathcal{P}_1$ on the right to identify with the $\mathcal{P}_1$ on the left. This means that both embeddings $\iota^k_1$ are determined up to a permutation matrix in $M_2(\mathbb{C})$, acting trivially within the path representation spaces $\mathcal{P}_k$.

Strictly speaking, however, we are not forced to map the space $\mathcal{P}_1$ in $\mathcal{P}_0 \oplus \mathcal{P}_1$ “as a whole” into one of the $\mathcal{P}_1$ in the target space, but we could also “distribute” it over both copies. This relatively unnatural choice would introduce a higher degree of indeterminacy into our construction. Note that in any case we can arrange $\iota^k_1$ so as to map lowest energy states (the sequences of nodes $(0; 1, 1, 1 \ldots)$ and $(1; 1, 1, 1 \ldots)$) to lowest energy states again.
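5. Global Amplimorphisms

It is now very easy to write down global amplimorphisms of the path observable algebra which implement the charged sectors of our models. Copying the procedure of [68], we associate to each field multiplet $F_i \in M_{n_i}(\mathcal{F})$ the linear map
$$ \rho_i : \mathcal{A} \to M_{n_i}(\mathcal{F}), \qquad A \mapsto \rho_i(A) = \big( \rho_i(A)_{\alpha\beta} \big)_{\alpha,\beta=1}^{n_i} \tag{5.1a} $$
with
$$ \rho_i(A)_{\alpha\beta} := \sum_{\gamma=1}^{n_i} (F_i)_{\alpha\gamma}\, A\, (F_i^*)_{\gamma\beta} . \tag{5.1b} $$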
Here and in the following, $\mathcal{A}$ is viewed as a sub-algebra of $\mathcal{F}$ by the diagonal embedding as in the proof of Proposition 3.7. We list the relevant properties of these maps in a series of propositions. First we have to show that the maps $\rho_i$ deserve the name amplimorphisms of the global observable algebra:

Proposition 5.1. The map $\rho_i$ takes values in the $n_i$-th amplification of the global path observable algebra $\mathcal{A}$. It is an injective $*$-homomorphism of AF-algebras.

Proof. By definition, $\rho_i(A)_{\alpha\beta}$ is an element of $\mathcal{F}$ for all $A \in \mathcal{A}$. In order to show that the $\rho_i(A)_{\alpha\beta}$ are observables, it is sufficient to check that they commute with $U(a)$ for all $a \in g$, see Proposition 3.7. Identifying $a$ with $U(a)$ for simplicity, and using the summation convention, we compute for arbitrary $A \in \mathcal{A}$,
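$$ \begin{aligned} a \cdot \rho_i(A)_{\alpha\beta} &= a \cdot (F_i)_{\alpha\gamma}\, A\, (F_i^*)_{\gamma\beta} = (F_i)_{\alpha\delta}\, (\nu_i(a))_{\delta\gamma}\, A\, (F_i^*)_{\gamma\beta} = (F_i)_{\alpha\delta}\, A\, (\nu_i(a))_{\delta\gamma}\, (F_i^*)_{\gamma\beta} \\ &= (F_i)_{\alpha\delta}\, A\, \big( (F_i\, \nu_i(a)^*)^* \big)_{\delta\beta} = (F_i)_{\alpha\delta}\, A\, \big( (a^* F_i)^* \big)_{\delta\beta} = (F_i)_{\alpha\delta}\, A\, (F_i^*)_{\delta\beta}\, a = \rho_i(A)_{\alpha\beta} \cdot a ; \end{aligned} $$
we have applied Proposition 3.7 and the properties (4.7) and (4.8) of the covariant field multiplets repeatedly. By construction, it is clear that $\rho_i$ respects the $*$-operations of the path algebras, and multiplicativity is again established with the help of the covariance and completeness relations of the field multiplets:
$$ \begin{aligned} \rho_i(A)_{\alpha\gamma}\, \rho_i(B)_{\gamma\beta} &= (F_i)_{\alpha\delta}\, A\, (F_i^*)_{\delta\gamma}\, (F_i)_{\gamma\varepsilon}\, B\, (F_i^*)_{\varepsilon\beta} = (F_i)_{\alpha\delta}\, A\, \big( \nu_i(1) \big)_{\delta\varepsilon}\, B\, (F_i^*)_{\varepsilon\beta} \\ &= \big( F_i \cdot \nu_i(1) \big)_{\alpha\varepsilon}\, AB\, (F_i^*)_{\varepsilon\beta} = \rho_i(AB)_{\alpha\beta} \end{aligned} $$
for all elements $A, B$ of the path observable algebra $\mathcal{A} \subset \mathcal{F}$. That $\rho_i$ is an injective map from $\mathcal{A}$ to $M_{n_i}(\mathcal{A})$ also follows directly from (4.8): “Sandwiching” $\rho_i(A)$ by $F_i^*$ and $F_i$, and using multiplicativity of the $g$-amplimorphism $\nu_i$, we obtain
$$ \big( F_i^*\, \rho_i(A)\, F_i \big)_{\alpha\beta} = \nu_i(1)_{\alpha\gamma}\, A\, \nu_i(1)_{\gamma\beta} = A\, \nu_i(1)_{\alpha\beta} $$
for all $\alpha, \beta = 1, \ldots, n_i$. Thus, $\rho_i(A) = 0$ implies $A = 0$.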
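The three steps of this proof (values in the commutant, multiplicativity, injectivity via sandwiching) can be traced in a finite toy model of the Lee–Yang case. The sketch below is an illustration under strong simplifying assumptions and not the path-algebra construction itself: the strings in $F_1$ are replaced by scalars that are orthonormal over $\alpha$, and an observable $A = \mathrm{bl\,diag}(A_0, D_2(A_1))$ is replaced by the diagonal matrix $\mathrm{diag}(x_0, x_1, x_1)$. It uses NumPy, and the names are hypothetical.

```python
import numpy as np


def nu1(a0, a1):
    """nu_1(a) of Eq. (4.6) as a 6x6 matrix; index = 3*alpha + position in g."""
    nu = np.zeros((6, 6), dtype=complex)
    for alpha in range(2):
        for beta in range(2):
            nu[3 * alpha, 3 * beta] = a1[alpha, beta]
    nu[1, 1] = a0
    nu[2, 2], nu[2, 4] = a1[0, 0], a1[0, 1]
    nu[4, 2], nu[4, 4] = a1[1, 0], a1[1, 1]
    return nu


def field_multiplet(c0, c10, c11):
    """Toy F_1 with scalar stand-ins for the strings, cf. (4.18)."""
    F = np.zeros((6, 6), dtype=complex)
    for alpha in range(2):
        row = 3 * alpha
        F[row + 0, 1] = c0[alpha]
        F[row + 1, 0] = c10[alpha]
        F[row + 1, 2] = c11[alpha]
        F[row + 2, 3] = c10[alpha]
        F[row + 2, 4] = c11[alpha]
    return F


def observable(x0, x1):
    """Toy observable diag(x0, x1, x1), amplified to act on both copies."""
    return np.kron(np.eye(2), np.diag([x0, x1, x1]).astype(complex))


def rho1(F, x0, x1):
    """rho_1(A) = F A F^*, the matrix form of Eq. (5.1b)."""
    return F @ observable(x0, x1) @ F.conj().T


# Scalar stand-ins chosen orthonormal over alpha, so that F^* F = nu_1(1).
S = 2 ** -0.5
F1 = field_multiplet([0.6, 0.8], [S, S], [S, -S])
```

With these choices, `rho1(F1, 1, 1)` is a projection of rank 5 different from the identity, in line with the non-unitality of the amplimorphisms for $i \neq 0$.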
Note that these properties do not hold if we extend $\rho_i$ to all of $\mathcal{F}$, and also that the proof did not depend on any specific properties tied to the path representations. Those came into play when we had to solve the conditions (4.7) and (4.8) for the field multiplets explicitly by constructing the strings $C^k_{ij;\alpha}$ which appear in the concrete expressions for the amplimorphisms. In order to obtain such formulas, we use the decomposition $A = A_0 + \ldots + A_K$ of elements of $\mathcal{A} = \mathcal{A}_0 \oplus \ldots \oplus \mathcal{A}_K$, and the diagonal embedding (3.15) of $\mathcal{A}$ into the field algebra. Inserting the covariant field multiplets from Proposition 4.3 into (5.1), and keeping track of the Kronecker symbols, yields the following expressions for the elements of the amplimorphisms:
$$\rho_i(A)_{(\alpha;\, n_k, \sigma)(\beta;\, n_j, \tau)} = \delta_{k,j}\, \delta_{\sigma,\tau} \sum_{l \in I_q} \delta_{N_{il}^k, 1}\; C^k_{il;\alpha}\, A_l\, \bigl(C^k_{il;\beta}\bigr)^* \tag{5.2}$$
with $C^k_{il;\alpha}$ as in Proposition 4.7; the matrix notation is as in Subsect. 4.1. By construction of the string coefficients, $\rho_i$ maps the $n$-floor finite-dimensional sub-algebra $\mathcal{A}^{(n)}$ of $\mathcal{A}$ into $M_{n_i}(\mathcal{A}^{(n)})$ and is compatible with the embeddings $\Phi^{(n)}$ induced by path concatenation. We say that the amplimorphisms preserve the filtration of the systems $(\mathcal{A}_i^{(n)}, \Phi^{(n)})$. In addition, formula (5.2) makes it clear that the amplimorphisms $\rho_i$ are non-unital in general. We have
$$\rho_i(1)_{(\alpha;\, n_k, \sigma)(\beta;\, n_j, \tau)} = \delta_{k,j}\, \delta_{\sigma,\tau}\, \bigl(\Pi_i^k\bigr)_{\alpha\beta}$$
with the projection $\Pi_i^k \in M_{n_i}(\mathcal{A}_k)$ from Eq. (4.17), which is $\neq 1$ for $i \neq 0$. During the construction of the field multiplets, we have made several choices, and we must determine to which extent the amplimorphisms $\rho_i$ depend on them:
A. Recknagel
Proposition 5.2. The $\mathcal{A}$-amplimorphisms $\rho_i$ depend only on the isomorphism class of the $g$-amplimorphisms $\nu_i$ which enter the covariance law (4.7) of the field multiplets. For a given class of $\nu_i$, the $\rho_i$ are unique up to conjugation by unitaries in $M_{n_i}(\mathcal{A})$.
Proof. The second statement follows immediately from the $C^k_{ij;\alpha}$ dependence of the $\mathcal{A}$-amplimorphisms. To see the first claim, assume that we “twist” the $g$-amplimorphism $\nu_i$ by a unitary $u \in M_{n_i}(g)$ and use $\nu_i' := \mathrm{Ad}_u \circ \nu_i$ instead. Then the covariant field multiplets change as $F_i' := F_i \cdot u^{-1}$, where $u^{-1}$ acts through the representation $U$ of $g$ on $H_{\mathrm{tot}}$, see Definition 3.6. Since $U(u)$ commutes with all elements of $\mathcal{A}$, formula (5.1) for $\rho_i$ stays invariant under a twist.
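The invariance under a twist can be written out in one line. The following is a schematic sketch only: matrix indices and the representation $U$ are suppressed, $A$ stands for the diagonally embedded observable, and the sole inputs are the unitarity of $u$ (so that $(u^{-1})^* = u$) and the fact, stated in the proof, that $U(u)$ commutes with all elements of $\mathcal{A}$.

```latex
\begin{align*}
\rho_i'(A) &= F_i'\, A\, (F_i')^*
            = F_i\, u^{-1}\, A\, (F_i\, u^{-1})^* \\
           &= F_i\, u^{-1}\, A\, u\, F_i^*
            && \text{(unitarity of } u\text{)} \\
           &= F_i\, A\, u^{-1} u\, F_i^*
            && \text{(} u^{-1} \text{ commutes with } A\text{)} \\
           &= F_i\, A\, F_i^* = \rho_i(A).
\end{align*}
```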
The isomorphism classes of the $g$-amplimorphisms were of course fixed by the c(2, q) fusion rules from the start, therefore we conclude that our construction yields filtration-preserving amplimorphisms of the global path observable algebra of unique isomorphism type. But moreover, in view of the remarks on the choice of strings $C^k_{ij;\alpha}$ made after Proposition 4.7, the $\rho_i$ which are “natural” from the path perspective are fixed up to conjugation by a permutation matrix in $M_{n_i}(\mathcal{A}^{(2)})$. We can now ask whether the $\rho_i$ implement the representations of the global observable algebra on the vacuum sector:
Proposition 5.3. Let $\pi_i : \mathcal{A} \longrightarrow \mathcal{A}_i$ be the projection of $\mathcal{A}$ onto the representation in the sector $H_i$, and denote the extensions to amplifications $M_n(\mathcal{A})$ by the same symbol. Then the amplimorphisms $\rho_i$ of eq. (5.1) satisfy $\pi_0 \circ \rho_i \simeq \pi_i$, where $\simeq$ denotes equivalence by isometries.
Proof. The statement follows immediately from formula (5.2). Setting $k = 0$, the condition $N_{il}^0 = 1$ enforces $l = i$ (uniqueness of the conjugate sector), and we remain with
$$\pi_0\bigl(\rho_i(A)\bigr)_{\alpha\beta} = C^0_{ii;\alpha}\, A_i\, \bigl(C^0_{ii;\beta}\bigr)^*.$$
Thus, the string isometries $C^0_{ii}$ from Eq. (4.16) “transport” the representation $\mathcal{A}_i$ into $M_{n_i}(\mathcal{A}_0)$ and implement the equivalence. Note that by our construction of the $C^k_{ij;\alpha}$ the equivalence also holds when $\mathcal{A}$ and $\mathcal{A}_i$ are replaced by finite-dimensional sub-algebras $\mathcal{A}^{(n)}$ and $\mathcal{A}_i^{(n)}$.
Proposition 5.4. Upon composition, the amplimorphisms realize the fusion rules of the c(2, q) minimal models, i.e. the relations
$$\pi_0 \circ (\rho_i \circ \rho_j) \simeq \bigoplus_{k \in I_q} \bigoplus_{r=1}^{N_{ij}^k} \pi_0 \circ \rho_k$$
hold in $M_{n_i n_j}(\mathcal{A}_0)$.
Proof. We apply the lhs to $A \in \mathcal{A}$ and find after a short calculation
$$\pi_0\bigl(\rho_i(\rho_j(A))\bigr)_{\alpha\alpha',\beta\beta'} = \sum_{k \in I_q} \delta_{N_{ij}^k, 1}\; C^0_{ii;\alpha}\, C^i_{jk;\alpha'}\, A_k\, \bigl(C^0_{ii;\beta}\, C^i_{jk;\beta'}\bigr)^*,$$
where $\alpha, \beta = 1, \ldots, n_i$ and $\alpha', \beta' = 1, \ldots, n_j$. Again, the claim follows from the completeness and orthogonality relation (4.10) for the strings $C^k_{ij;\alpha}$.
In view of all these properties, we may regard the amplimorphisms $\rho_i$ as variants of the DHR morphisms of algebraic quantum field theory.
6. Summary and Open Problems
In this article, we have discussed c(2, q) minimal models of the Virasoro algebra and established the global aspects of a description in the framework of algebraic quantum field theory. Starting from path representations for the irreducible Virasoro modules, we could introduce global observable algebras for these CFTs as the string algebras associated with the path spaces. The other important ingredient of the DHR formulation are morphisms of the observable algebra which implement the charged superselection sectors on the vacuum state space. We were able to derive global morphisms by constructing covariant field multiplets within a field algebra extension of the global observables. The field algebra in turn was defined with the help of a quantum symmetry algebra acting on an appropriately enlarged Hilbert space and commuting with the observables. Since the QSA we used is finite-dimensional, we were led to a comparatively small field algebra. The QSA appeared as an auxiliary object, and indeed only the simplest part (the co-product) of the structure of a weak quasi-triangular quasi-Hopf algebra was needed for our construction. One could now try to extract further information on the c(2, q) models from the amplimorphisms we have constructed. Since the algebraic approach is superior to all others when it comes to the discussion of braid group statistics, one should in particular try to compute intertwiners (statistics operators), left inverses and Markov traces associated to the morphisms of the observable algebra. In this way, interesting braid group representations and knot invariants might arise and, as a by-product, one could supply the QSA g with the full explicit data of a weak quasi-triangular quasi-Hopf algebra.
The results of [68] indicate that dealing with non-unital amplimorphisms rather than unital endomorphisms does not pose severe problems, and we hope that this program can be carried out even in the absence of local information in our path algebras and morphisms. Nevertheless, if we aim at a complete description of the c(2, q) models within the algebraic framework, it is important to recover the local net structure inside the global algebras and to show that our amplimorphisms are equivalent to localizable ones. One possible starting point for the construction of local sub-algebras of the path algebra is provided by the su(1,1) action on the path spaces constructed in [62]. But although there seems to be no problem of principle, it is technically rather difficult to implement the conditions of Möbius covariance on the local sub-algebras. In this context, we conjecture that it is precisely because they are filtration-preserving that our global amplimorphisms have good chances to be equivalent to covariant and localized morphisms: The su(1,1)-action of [62] is designed in analogy with the excitation of quasi-particles and therefore respects the finite length filtration of the path spaces as much as possible. Given an action of su(1,1) or even the Virasoro algebra on the path spaces, we could also make our somewhat abstract amplimorphisms more concrete: Although their action on path algebra elements can be made as explicit as we wish, it would be interesting to have formulas for ρi applied to Virasoro generators, similar to [49] where neat expressions could be obtained because the Virasoro modes have simple expansions in terms of the fermion modes, which in turn are acted on by the endomorphisms in a simple fashion. In addition, once sub-algebras of local observables have been identified within the global path algebra, one would also like to make contact to the von Neumann algebra description of local QFTs with all its particular merits.
It remains to be seen whether one then meets problems with the non-unitarity of the c(2, q) models, which apparently played no role at the purely algebraic level of our global considerations.
We have seen in this paper that the path representations of the c(2, q) minimal models open up a lot of interesting possibilities. They allow one to organize a great deal of information on these CFTs into a single labeled graph; they naturally lead to an AF-algebraic description of theories which a priori are defined in terms of unbounded Virasoro modes; they can, in this respect, be regarded as an alternative to the usual free field constructions; they seem to encode, via the quasi-particle reformulation, structural details of non-conformal relatives of the c(2, q) minimal models within the CFT and might, therefore, even be useful in the context of massive integrable quantum field theories. In view of these facts, it is desirable to find similar path representations for other conformal models as well. In [46], this has been achieved for a subset of modules in the c(even, odd) Virasoro minimal models, with our graphs Gq again playing an important role. However, since some of the sectors still lack a path description, our construction cannot yet be applied to those models. Whereas the results in [46] have been obtained by factorization of some of the characters of the minimal models, i.e. by purely combinatorial means, one could alternatively try to imitate the FNO procedure in order to determine explicit bases of the irreducible modules. However, for general minimal models the structure of the annihilating ideal is more involved than in the c(2, q) cases, and it seems necessary to pass to the maximally extended chiral observable algebra before progress can be made. We also came across some problems of a more abstract nature: In the introduction, we raised the speculation that there is a general relation between AF-algebras and conformal field theories.
Somewhat related to this conjecture, we have seen that the global path observable algebras of the c(2, q) models are of the same type as the string algebras which show up as intertwiner (symmetry) algebras in the DHR framework. Thus it seems that in the case of these conformal models, internal and space-time symmetries are indeed “inexorably linked” [65]. Still, the relationship remains to be made precise. On the other hand, we have shown that there also is a canonical semi-simple QSA for any quasi-rational CFT. The question is whether one can find an axiomatic foundation of the notions introduced in [53]. Since the dimensions of the small spaces provide new invariants of quasi-rational CFTs, this might lead to interesting developments within algebraic QFT and even in the theory of operator algebras.
Acknowledgement. I would like to thank A. Alekseev, D. Buchholz, K. Fredenhagen, J. Fröhlich, J. Fuchs, A. Ganchev, K. Gawędzki, W. Nahm, V. Schomerus, B. Schroer, K. Szlachanyi and P. Vecsernyes for useful discussions, comments and encouragement. I am particularly indebted to M. Rösgen whose collaboration on path representations was invaluable. This work was partially supported by an HCM Fellowship of the European Union.
References
1. Alekseev, A.Yu., Faddeev, L.D., Semenov-Tian-Shansky, M.A.: Hidden quantum groups inside Kac–Moody algebras. Commun. Math. Phys. 149, 335–345 (1992)
2. Alekseev, A.Yu., Recknagel, A.: The embedding structure and the shift operator of the U(1) lattice current algebra. Lett. Math. Phys. 37, 15–27 (1996)
3. Alekseev, A.Yu., Recknagel, A., Schomerus, V.: Generalization of the Knizhnik–Zamolodchikov equations. Lett. Math. Phys. 41, 169–180 (1997)
4. Alvarez-Gaumé, L., Gomez, C., Sierra, G.: Quantum group interpretation of some conformal field theories. Phys. Lett. B 220, 142–152 (1989)
5. Andrews, G.E.: q-series: their development and application in analysis, number theory, combinatorics, physics, and computer algebra. Conf. Board of the Math. Sciences, Reg. Conf. Ser. in Math. 66 (1986)
6. Andrews, G.E., Baxter, R.J., Forrester, P.J.: Eight-vertex SOS model and generalized Rogers–Ramanujan-type identities. J. Stat. Phys. 35, 193–266 (1984)
7. Bauer, M., Di Francesco, P., Itzykson, C., Zuber, J.-B.: Covariant differential equations and singular vectors in Virasoro representations. Nucl. Phys. B 362, 515–562 (1991)
8. Belavin, A.A., Polyakov, A.M., Zamolodchikov, A.B.: Infinite conformal symmetry in two-dimensional quantum field theory. Nucl. Phys. B 241, 333–380 (1984)
9. Benoit, L., Saint-Aubin, Y.: Degenerate conformal field theories and explicit expressions of some null vectors. Phys. Lett. B 215, 517–522 (1988)
10. Berkovich, A., McCoy, B.M., Schilling, A.: Rogers–Schur–Ramanujan type identities for the M(p, p′) minimal models of conformal field theory. Commun. Math. Phys. 191, 325–395 (1998)
11. Blackadar, B.: K-Theory for operator algebras. New York: MSRI Publ. 5, 1986
12. Blumenhagen, R., Flohr, M., Kliem, A., Nahm, W., Recknagel, A., Varnhagen, R.: W-algebras with two and three generators. Nucl. Phys. B 361, 255–289 (1991)
13. Böckenhauer, J.: Localized endomorphisms of the chiral Ising model. Commun. Math. Phys. 177, 265–304 (1996)
14. Böckenhauer, J., Fuchs, J.: Higher level WZW sectors from free fermions. J. Math. Phys. 38, 1227–1256 (1997)
15. Buchholz, D., Mack, G., Todorov, I.T.: The current algebra on the circle as a germ of local field theories. Nucl. Phys. B (Proc. Suppl.) 5B, 20–56 (1988)
16. Cardy, J.L.: Conformal invariance and the Yang–Lee edge singularity in two dimensions. Phys. Rev. Lett. 54, 1354–1356 (1985)
17. Caselle, M., Ponzano, G., Ravanini, F.: Towards a classification of fusion rule algebras in rational conformal field theories. Int. J. Mod. Phys. B 6, 2075–2090 (1992)
18. Doplicher, S., Haag, R., Roberts, J.E.: Local observables and particle statistics I,II. Commun. Math. Phys. 23, 199–230 (1971); 35, 49–85 (1974)
19. Doplicher, S., Roberts, J.E.: A new duality theory for compact groups. Invent. Math. 98, 157–218 (1989); Why there is a field algebra with a compact gauge group describing the superselection structure in particle physics. Commun. Math. Phys. 131, 51–107 (1990)
20. Drinfeld, V.G.: Quasi-Hopf algebras and Knizhnik–Zamolodchikov equations. In: Problems of modern quantum field theory, Proceedings Alushta 1989, Research reports in physics, Berlin–Heidelberg–New York: Springer 1989; Quasi-Hopf algebras. Leningrad Math. J. Vol. 1, No. 6 (1990)
21. Feigin, B.L., Fuchs, D.B.: Invariant skew-symmetric differential operators on the line and Verma modules over the Virasoro algebra. Funct. Anal. Appl. 16, 114–126 (1982); Verma modules over the Virasoro algebra. Lect. Notes in Math. 1060, Berlin–Heidelberg–New York: Springer 1984, pp. 230–245
22. Feigin, B.L., Nakanishi, T., Ooguri, H.: The annihilating ideals of minimal models. Int. J. Mod. Phys. A 7 Suppl. 1A, 217–238 (1992)
23. Feigin, B.L., Stoyanovsky, A.V.: Quasi-particles models for the representations of Lie algebras and geometry of flag manifold. Preprint, 35 pp., hep-th/9308079
24. Felder, G., Fröhlich, J., Keller, G.: On the structure of unitary conformal field theory I,II. Commun. Math. Phys. 124, 417–463 (1989), 130, 1–49 (1990)
25. Forrester, P.J., Baxter, R.J.: Further exact solutions of the eight-vertex SOS model and generalizations of the Rogers–Ramanujan identities. J. Stat. Phys. 38, 435 (1985)
26. Fredenhagen, K., Rehren, K.-H., Schroer, B.: Superselection sectors with braid group statistics and exchange algebras I,II. Commun. Math. Phys. 125, 201–226 (1989); Rev. Math. Phys. Special issue 111–154 (1992)
27. Freund, P.G.O., Klassen, T.R., Melzer, E.: S-Matrices for perturbations of certain conformal field theories. Phys. Lett. B 229, 243–247 (1989)
28. Friedan, D., Qiu, Z., Shenker, S.H.: Conformal invariance, unitarity and two-dimensional critical exponents. Phys. Rev. Lett. 52, 1575–1578 (1984)
29. Fröhlich, J.: Statistics of fields, the Yang–Baxter equation and the theory of knots and links. In: Nonperturbative quantum field theory, eds. G. 't Hooft, A. Jaffe, G. Mack, P.K. Mitter and R. Stora, Plenum 1988
30. Fröhlich, J., Gabbiani, F.: Braid statistics in local quantum theory. Rev. Math. Phys. 2, 251–353 (1990)
31. Fröhlich, J., Kerler, T.: Quantum groups, quantum categories and quantum field theory. Lect. Notes in Math. 1542, Berlin–Heidelberg–New York: Springer, 1993
32. Fuchs, J.: Algebraic conformal field theory. Proceedings Gosen 1991, Ahrenshoop Symp. 1991, pp. 99–114
33. Fuchs, J., Ganchev, A., Vecsernyes, P.: Level 1 WZW superselection sectors. Commun. Math. Phys. 146, 553–584 (1992); Simple WZW superselection sectors. Lett. Math. Phys. 28, 31–42 (1993)
34. Fuchs, J., Ganchev, A., Vecsernyes, P.: Towards a classification of rational Hopf algebras. NIKHEF preprint, 44 pp., hep-th/9402153; On the quantum symmetry of rational field theories. Theor. Math. Phys. 98, 266–276 (1994)
35. Gabbiani, F., Fröhlich, J.: Operator algebras and conformal field theory. Commun. Math. Phys. 155, 569–640 (1993)
36. Gaberdiel, M.R.: An explicit construction of the quantum group in chiral WZW-models. Commun. Math. Phys. 173, 357–377 (1995)
37. Gaberdiel, M.R.: Fusion of twisted representations. Int. J. Mod. Phys. A 12, 5183–5208 (1997)
38. Gawędzki, K.: Quantum group symmetries in conformal field theory. In: Quantum and non-commutative analysis. Proc. Kyoto 1992, 239–251, hep-th/9210100
39. Goodman, F.M., de la Harpe, P., Jones, V.F.R.: Coxeter graphs and towers of algebras. New York: MSRI Publ. 14, 1989
40. Haag, R.: Local quantum physics. Berlin–Heidelberg–New York: Springer, 1992
41. Haag, R., Kastler, D.: An algebraic approach to field theory. J. Math. Phys. 5, 848–861 (1964)
42. Kaufmann, R.M.: Path space decompositions for the Virasoro algebra and its Verma modules. Int. J. Mod. Phys. A 10, 943–962 (1995)
43. Kedem, R., Klassen, T.R., McCoy, B.M., Melzer, E.: Fermionic quasiparticle representations for characters of $(G^{(1)})_1 \times (G^{(1)})_1 / (G^{(1)})_2$. Phys. Lett. B 304, 263–270 (1993); Fermionic sum representations for conformal field theory characters. Phys. Lett. B 307, 68–76 (1993)
44. Kedem, R., McCoy, B.M.: Construction of modular branching functions from Bethe’s equations in the 3-state Potts chain. Stony Brook preprint, 34 pp., hep-th/9210129
45. Kellendonk, J., Recknagel, A.: Virasoro representations on fusion graphs. Phys. Lett. B 298, 329–334 (1993)
46. Kellendonk, J., Rösgen, M., Varnhagen, R.: Path spaces and W-fusion in minimal models. Int. J. Mod. Phys. A 9, 1009–1024 (1994)
47. Knizhnik, V.G., Zamolodchikov, A.B.: Current algebra and Wess–Zumino model in two dimensions. Nucl. Phys. B 247, 83–103 (1984)
48. Longo, R.: Index for subfactors and statistics of quantum fields I,II. Commun. Math. Phys. 126, 217–247 (1989), 130, 285–309 (1990)
49. Mack, G., Schomerus, V.: Conformal field algebras with quantum symmetry from the theory of superselection sectors. Commun. Math. Phys. 134, 139–196 (1990)
50. Mack, G., Schomerus, V.: Quasi-Hopf quantum symmetry in quantum theory. Nucl. Phys. B 370, 185–230 (1991)
51. Moore, G., Seiberg, N.: Polynomial equations for rational conformal field theories. Phys. Lett. B 212, 451–460 (1988); Classical and conformal quantum field theory. Commun. Math. Phys. 123, 177–254 (1989)
52. Nahm, W.: Chiral algebras of two-dimensional chiral field theories and their normal ordered products. In: Recent developments in conformal field theories, Proc. Trieste October 1989, Proc. 3 Reg. Conf. on Math. Phys., Islamabad 1989
53. Nahm, W.: Quasi-rational fusion products. Int. J. Mod. Phys. B 8, 3693–3702 (1994)
54. Ocneanu, A.: Quantized groups, string algebras and Galois theory for algebras. In: Operator algebras and applications II, Vol. 135, Cambridge: London Math. Soc. 1988; Quantum symmetry, differential geometry of finite graphs and classification of subfactors. University of Tokyo Seminary Notes 45, recorded by Y. Kawahigashi, July 1990
55. Pasquier, V.: Two-dimensional critical systems labeled by Dynkin diagrams. Nucl. Phys. B 285, 162–172 (1987); Etiology of IRF models. Commun. Math. Phys. 118, 355–364 (1988)
56. Pasquier, V., Saleur, H.: Common structures between finite systems and conformal field theories through quantum groups. Nucl. Phys. B 330, 523–556 (1990)
57. Recknagel, A.: AF-algebras and applications of K-theory in conformal field theory. Talk given at the International Congress in Mathematical Physics, Satellite colloquium on “New Problems in the General Theory of Fields and Particles”, Paris, July 1994
58. Rehren, K.-H.: Markov traces as characters for local algebras. Nucl. Phys. B (Proc. Suppl.) 18B, 259–268 (1990); Braid group statistics and their superselection rules. In: The algebraic theory of superselection sectors. Introduction and recent results. ed. D. Kastler, Singapore: World Scientific, 1990
59. Rehren, K.-H.: Field operators for anyons and plektons. Commun. Math. Phys. 145, 123–148 (1992)
60. Rehren, K.-H.: Quantum symmetry associated with braid group statistics. In: Quantum groups, Proc. Clausthal 1989, eds. H.-D. Doebner et al., Lect. Notes in Phys. 370, Berlin–Heidelberg–New York: Springer 1990, pp. 318–339; Quantum symmetry associated with braid group statistics II. In: Quantum symmetries, Proc. Clausthal 1991, eds. H.-D. Doebner et al., 14–23, Singapore: World Scientific, 1993
61. Rehren, K.-H.: Subfactors and coset models. In: Generalized symmetries in physics, Proc. Clausthal, 1993, hep-th/9308145; On the range of the index of subfactors. Vienna preprint ESI-93-14, 9 pp.
62. Rösgen, M.: Pfaddarstellungen minimaler Modelle. Diplomarbeit BONN-IR-93-24; Rösgen, M., Varnhagen, R.: Steps towards lattice Virasoro algebras: su(1,1). Phys. Lett. B 350, 203–211 (1995)
63. Schellekens, A.N., Yankielowicz, S.: Extended chiral algebras and modular invariant partition functions. Nucl. Phys. B 327, 673–703 (1989)
64. Schomerus, V.: Construction of field algebras with quantum symmetry from local observables. Commun. Math. Phys. 169, 193–236 (1995)
65. Schroer, B.: Reminiscences about many pitfalls and some successes of QFT within the last three decades. Rev. Math. Phys. 7, 645–688 (1995)
66. Szlachanyi, K., Vecsernyes, P.: Quantum symmetry and braid group statistics in G-spin models. Commun. Math. Phys. 156, 127–168 (1993)
67. Terhoeven, M.: Lift of dilogarithm to partition identities. Bonn preprint, 8 pp., hep-th/9211120
68. Vecsernyes, P.: On the quantum symmetry of the chiral Ising model. Nucl. Phys. B 415, 557–588 (1994)
69. Wassermann, A.: Operator algebras and conformal field theory. Proceedings ICM Zürich 1994, Basel–Boston: Birkhäuser; Operator algebras and conformal field theory II,III. Cambridge preprints
70. Wiesbrock, H.W.: Conformal quantum field theory and half sided modular inclusions of von Neumann algebras. Commun. Math. Phys. 158, 537–544 (1993); A note on strongly additive conformal field theory and half sided modular conformal standard inclusions. Lett. Math. Phys. 31, 303–308 (1994)
71. Witten, E.: Non-abelian bosonization in two dimensions. Commun. Math. Phys. 92, 455–472 (1984)
72. Zamolodchikov, A.B.: Integrable field theory from conformal field theory. Adv. Studies in Pure Math. 19, 641–674 (1989)
Communicated by G. Felder
Commun. Math. Phys. 201, 411 – 421 (1999)
Communications in
Mathematical Physics © Springer-Verlag 1999
A Note on a Symplectic Structure on the Space of G-Monopoles
Michael Finkelberg1,*, Alexander Kuznetsov1,*, Nikita Markarian1, Ivan Mirković2,*
1 Independent Moscow University, 11 Bolshoj Vlasjevskij Pereulok, Moscow 121002, Russia
2 Dept. of Mathematics and Statistics, University of Massachusetts at Amherst, Amherst, MA 01003-4515, USA
Received: 20 April 1998 / Accepted: 29 August 1998
To Sasha Shen on his 40th birthday
Abstract: Let $G$ be a semisimple complex Lie group with a Borel subgroup $B$. Let $X = G/B$ be the flag manifold of $G$. Let $C = \mathbb{P}^1 \ni \infty$ be the projective line. Let $\alpha \in H_2(X, \mathbb{Z})$. The moduli space of $G$-monopoles of topological charge $\alpha$ is naturally identified with the space $M_b(X, \alpha)$ of based maps from $(C, \infty)$ to $(X, B)$ of degree $\alpha$. The moduli space of $G$-monopoles carries a natural hyperkähler structure, and hence a holomorphic symplectic structure. It was explicitly computed by R. Bielawski in case $G = SL_n$. We propose a simple explicit formula for another natural symplectic structure on $M_b(X, \alpha)$. It is tantalizingly similar to R. Bielawski’s formula, but in general (rank > 1) the two structures do not coincide. Let $P \supset B$ be a parabolic subgroup. The construction of the Poisson structure on $M_b(X, \alpha)$ generalizes verbatim to the space of based maps $M = M_b(G/P, \beta)$. In most cases the corresponding map $T^*M \to TM$ is not an isomorphism, i.e. $M$ splits into nontrivial symplectic leaves. These leaves are explicitly described.
1. Introduction
1.1. Let $G$ be a semisimple complex Lie group with the Cartan datum $(I, \cdot)$ and the root datum $(Y, X, \ldots)$. Let $H \subset B = B_+, B_- \subset G$ be a Cartan subgroup and a pair of opposite Borel subgroups respectively. Let $X = G/B$ be the flag manifold of $G$. Let $C = \mathbb{P}^1 \ni \infty$ be the projective line. Let $\alpha = \sum_{i \in I} a_i i \in \mathbb{N}[I] \subset H_2(X, \mathbb{Z})$. We propose a simple explicit formula for a symplectic structure on the moduli space $M_b(X, \alpha)$ of based maps from $(C, \infty)$ to $(X, B_+)$ of degree $\alpha$. It generalizes the well known formula for $G = SL_2$ [1].
1.2. Recall that for $G = SL_2$ we have $(X, B_+) = (\mathbb{P}^1, \infty)$. Recall the natural local coordinates on $M_b(\mathbb{P}^1, a)$ (see [1]). We fix a coordinate $z$ on $C$ such that $z(\infty) = \infty$.
* M.F. and A.K. were partially supported by CRDF grant RM1-265, and I.M. by NSF grant DMS-9622863
Then a based map $\phi : (C, \infty) \to (\mathbb{P}^1, 0)$ of degree $a$ is a rational function $\frac{p(z)}{q(z)}$, where $q(z)$ is a degree $a$ polynomial with the leading coefficient 1, and $p(z)$ is a degree $< a$ polynomial. Let $U$ be the open subset of based maps such that the roots $x^1, \ldots, x^a$ of $q(z)$ are multiplicity free. Let $y^k$ be the value of $p(z)$ at $x^k$. Then $x^1, \ldots, x^a, y^1, \ldots, y^a$ form an étale coordinate system on $U$. The symplectic form on $M_b(\mathbb{P}^1, a)$ equals $\sum_{k=1}^{a} \frac{dy^k \wedge dx^k}{y^k}$. In other words, the Poisson brackets of these coordinates are as follows: $\{x^k, x^m\} = 0 = \{y^k, y^m\}$; $\{x^k, y^m\} = \delta_{km}\, y^m$.
For an arbitrary $G$ and $i \in I$ let $X_i \subset X$ be the corresponding $B_-$-orbit of codimension 1 (Schubert cell), and let $\overline{X}_i \supset X_i$ be its closure (Schubert variety). For $\phi \in M_b(X, \alpha)$ we define $x_i^1, \ldots, x_i^{a_i} \in \mathbb{A}^1$ as the points of intersection of $\phi(\mathbb{P}^1)$ with $\overline{X}_i$. This way we obtain the projection $\pi^\alpha : M_b(X, \alpha) \to \mathbb{A}^\alpha$ (the configuration space of $I$-colored divisors of degree $\alpha$ on $\mathbb{A}^1$). Let $U \subset M_b(X, \alpha)$ be the open subset of based maps such that $\phi(\mathbb{P}^1) \cap \overline{X}_i \subset X_i$ for any $i \in I$, and $x_i^k \neq x_j^l$ for any $i, j \in I$, $1 \le k \le a_i$, $1 \le l \le a_j$. Locally in $X$ the cell $X_i$ is the zero divisor of a function $\varphi_i$ (globally, $\varphi_i$ is a section of the line bundle $L_{\omega_i}$ corresponding to the fundamental weight $\omega_i \in X$). The rational function $\varphi_i \circ \phi$ on $C$ is of the form $\frac{p_i(z)}{q_i(z)}$, where $p_i(z)$ is a degree $a_i$ polynomial with the leading coefficient 1, and $q_i(z)$ is a degree $< a_i$ polynomial. Let $y_i^k$ be the value of $q_i(z)$ at $x_i^k$. Then $x_i^k, y_i^k$, $i \in I$, $1 \le k \le a_i$, form an étale coordinate system on $U$. The Poisson brackets of these coordinates are as follows:
$$\{x_i^k, x_j^l\} = 0 = \{y_i^k, y_i^l\}; \qquad \{x_i^k, y_j^l\} = \delta_{ij}\, \delta_{kl}\, y_j^l; \qquad \{y_i^k, y_j^l\} = i \cdot j\, \frac{y_i^k\, y_j^l}{x_i^k - x_j^l} \quad \text{for } i \neq j.$$
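As a quick plausibility check of the mixed bracket, one can verify one instance of the Jacobi identity by hand. The following is only a sketch for a special case, not a substitute for the general proof: take two distinct colors $i \neq j$ with $a_i = a_j = 1$, drop the upper indices, and use nothing but the relations listed above together with antisymmetry and the Leibniz rule (in particular $\{x_i, x_i - x_j\} = 0$, so the denominator passes through the bracket with $x_i$).

```latex
\begin{align*}
\{x_i, \{y_i, y_j\}\} &= i \cdot j\,
    \frac{\{x_i, y_i\}\, y_j + y_i\, \{x_i, y_j\}}{x_i - x_j}
  = i \cdot j\, \frac{y_i\, y_j}{x_i - x_j}, \\
\{y_i, \{y_j, x_i\}\} &= 0
  \qquad \text{since } \{y_j, x_i\} = -\{x_i, y_j\} = 0, \\
\{y_j, \{x_i, y_i\}\} &= \{y_j, y_i\}
  = -\, i \cdot j\, \frac{y_i\, y_j}{x_i - x_j}.
\end{align*}
```

The three terms sum to zero, as the Jacobi identity requires for the triple $(x_i, y_i, y_j)$.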
1.3. It follows that the symmetric functions of the $x$-coordinates (well defined on the whole $M_b(X, \alpha)$) are in involution. In other words, the projection $\pi^\alpha : M_b(X, \alpha) \to \mathbb{A}^\alpha$ is an integrable system on $M_b(X, \alpha)$. The fibers of $\pi^\alpha : U \to \mathbb{A}^\alpha$ are Lagrangian submanifolds of $U$. It is known that all the fibers of $\pi^\alpha : M_b(X, \alpha) \to \mathbb{A}^\alpha$ are equidimensional of the same dimension $|\alpha|$ (see [3]), hence $\pi^\alpha$ is flat, hence all the fibers are Lagrangian.
1.4. Let $P \supset B$ be a parabolic subgroup. The construction of the Poisson structure on $M_b(X, \alpha)$ generalizes verbatim to the space of based maps $M = M_b(G/P, \beta)$. In most cases the corresponding map $P : T^*M \to TM$ is not an isomorphism, i.e. $M$ splits into nontrivial symplectic leaves. For certain degrees $\alpha \in \mathbb{N}[I]$ we have the natural embedding $\Pi : M_b(X, \alpha) \hookrightarrow M$, and the image is a symplectic leaf of $P$. Moreover, all the symplectic leaves are of the form $g\Pi(M_b(X, \alpha))$ for certain $\alpha \in \mathbb{N}[I]$, $g \in P$, see Theorem 2.
1.5. The above Poisson structure is a baby (rational) version of the Poisson structure on the moduli space of $B$-bundles over an elliptic curve [2]. We learnt of its definition (as a differential in the hypercohomology spectral sequence, see Sect. 2) from B. Feigin. Thus, our modest contribution reduces just to a proof of Jacobi identity. Note that the Poisson structure of [2] arises as a quasiclassical limit of elliptic algebras. On the other hand, $M_b(X, \alpha)$ is an open subset in the moduli space $\widehat{M}_b(X, \alpha)$ of $B$-bundles on $C$ trivialized at $\infty$, such that the induced $H$-bundle has degree $\alpha$. One can see easily that $\widehat{M}_b(X, \alpha)$ is isomorphic to an affine space $\mathbb{A}^{2|\alpha|}$, and the symplectic structure on $M_b(X, \alpha)$ extends to the Poisson structure on $\widehat{M}_b(X, \alpha)$. The latter one can be quantized along the lines of [2].
1.6. The moduli space of $G$-monopoles of topological charge $\alpha$ (see e.g. [4]) is naturally identified with the space $M_b(X, \alpha)$. The moduli space of $G$-monopoles carries a natural hyperkähler structure, and hence a holomorphic symplectic form (for each choice of complex structure). For $G = SL_n$ this form was computed by R. Bielawski in his thesis. The corresponding brackets of the above coordinates are as follows: $\{x_i^k, x_j^l\} = 0 = \{y_i^k, y_i^l\}$; $\{x_i^k, y_j^l\} = \delta_{ij}\, \delta_{kl}\, y_j^l$; $\{y_i^k, y_j^l\} = 0$. Thus for $n > 2$ our form is different from the form arising from this hyperkähler structure.
1.7. Notations. For a subset $J \subset I$ we denote by $P_J \supset B$ the corresponding parabolic subgroup. Thus, $P_\emptyset = B$. Denote by $X_J = G/P_J$ the corresponding parabolic flag variety; thus, $X_\emptyset = X$. We denote by $\varpi : X \to X_J$ the natural projection. We denote by $x \in X_J$ the marked point $\varpi(B_+)$. Let $M = M_b(X_J, \alpha)$ denote the space of based algebraic maps $\phi : (C, \infty) \to (X_J, x)$ of degree $\alpha \in H_2(X_J, \mathbb{Z})$. Let $\mathfrak{g}$ denote the Lie algebra of $G$. Let $\mathfrak{g}_{X_J}$ denote the trivial vector bundle with the fiber $\mathfrak{g}$ over $X_J$ and let $\mathfrak{p}_{X_J} \subset \mathfrak{g}_{X_J}$ (resp. $\mathfrak{r}_{X_J} \subset \mathfrak{p}_{X_J} \subset \mathfrak{g}_{X_J}$) be its subbundle with the fiber over a point $P$ equal to the corresponding Lie subalgebra $\mathfrak{p} \subset \mathfrak{g}$ (resp. its nilpotent radical $\mathfrak{r} \subset \mathfrak{p} \subset \mathfrak{g}$). In case $J = \emptyset$ we will also denote $\mathfrak{r}_{X_\emptyset}$ by $\mathfrak{n}_X$, and $\mathfrak{p}_{X_\emptyset}$ by $\mathfrak{b}_X$. Note that the quotient bundle $\mathfrak{h}_X := \mathfrak{b}_X/\mathfrak{n}_X$ is trivial (abstract Cartan algebra). Recall that the tangent bundle $TX_J$ of $X_J$ (resp. cotangent bundle $T^*X_J$) is canonically isomorphic to the bundle $\mathfrak{g}_{X_J}/\mathfrak{p}_{X_J}$ (resp. $\mathfrak{r}_{X_J}$).
2. The Poisson Structure
2.1. The fibers of the tangent and cotangent bundles of the space $M$ at the point $\phi$ are computed as follows:
$$T_\phi M = H^0(C, (\phi^* TX_J) \otimes \mathcal{O}_C(-1)) = H^0(C, (\phi^* \mathfrak{g}_{X_J}/\mathfrak{p}_{X_J}) \otimes \mathcal{O}_C(-1)),$$
$$T^*_\phi M = H^1(C, (\phi^* T^*X_J) \otimes \mathcal{O}_C(-1)) = H^1(C, (\phi^* \mathfrak{r}_{X_J}) \otimes \mathcal{O}_C(-1)).$$
The second identification follows from the first by Serre duality.
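We have a tautological complex of vector bundles on $X_J$:
$$\mathfrak{r}_{X_J} \to \mathfrak{g}_{X_J} \to \mathfrak{g}_{X_J}/\mathfrak{p}_{X_J}. \tag{1}$$
The pull-back via $\phi$ of this complex twisted by $\mathcal{O}_C(-1)$ gives the following complex of vector bundles on $C$:
$$(\phi^* \mathfrak{r}_{X_J}) \otimes \mathcal{O}_C(-1) \to (\phi^* \mathfrak{g}_{X_J}) \otimes \mathcal{O}_C(-1) \to (\phi^* \mathfrak{g}_{X_J}/\mathfrak{p}_{X_J}) \otimes \mathcal{O}_C(-1). \tag{2}$$
Consider the hypercohomology spectral sequence of the complex (2). Since $\mathfrak{g}_{X_J}$ is the trivial vector bundle we have $H^\bullet(C, (\phi^* \mathfrak{g}_{X_J}) \otimes \mathcal{O}_C(-1)) = 0$, hence the second differential of the spectral sequence induces a map
$$d_2 : H^1(C, (\phi^* \mathfrak{r}_{X_J}) \otimes \mathcal{O}_C(-1)) \to H^0(C, (\phi^* \mathfrak{g}_{X_J}/\mathfrak{p}_{X_J}) \otimes \mathcal{O}_C(-1)),$$
that is a map $P^{X_J}_\phi : T^*_\phi M \to T_\phi M$. This construction easily globalizes to give a morphism $P^{X_J} : T^*M \to TM$.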
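The vanishing used above is the standard computation of the cohomology of $\mathcal{O}(-1)$ on the projective line. The sketch below records it, together with the resulting shape of the differential; the indexing $E_1^{p,q} = H^q(C, K^p)$, with $K^0 \to K^1 \to K^2$ denoting the three terms of the complex (2), is a convention chosen here for illustration.

```latex
\begin{align*}
H^0(\mathbb{P}^1, \mathcal{O}(-1)) &= 0
   && \text{(no sections in negative degree)}, \\
H^1(\mathbb{P}^1, \mathcal{O}(-1)) &\cong H^0(\mathbb{P}^1, \mathcal{O}(-1))^{*} = 0
   && \text{(Serre duality, canonical bundle } \mathcal{O}(-2)\text{)}, \\
H^\bullet\bigl(C, (\phi^*\mathfrak{g}_{X_J}) \otimes \mathcal{O}_C(-1)\bigr)
   &\cong \mathfrak{g} \otimes H^\bullet(\mathbb{P}^1, \mathcal{O}(-1)) = 0
   && \text{(triviality of } \mathfrak{g}_{X_J}\text{)}, \\
E_1^{1,q} = 0 \ \text{for all } q
   \ &\Longrightarrow\ d_1 = 0, \quad
   d_2 : E_2^{0,1} = H^1(C, K^0) \longrightarrow E_2^{2,0} = H^0(C, K^2).
\end{align*}
```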
M. Finkelberg, A. Kuznetsov, N. Markarian, I. Mirković
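Theorem 1. $P$ defines a Poisson structure on $M$.

Here we will reduce the theorem to the case $X_J = X$. This case will be treated in the next section.

2.2. Let $\varpi_* : H_2(X, \mathbb{Z}) \to H_2(X_J, \mathbb{Z})$ be the push-forward map. The map $\varpi$ induces a map $\Pi : M_b(X, \alpha) \to M_b(X_J, \varpi_*\alpha)$.

Proposition 1. The map $\Pi$ respects $P$, that is, the following square is commutative:
\[
\begin{array}{ccc}
T^* M_b(X, \alpha) & \xrightarrow{\;P^X\;} & T M_b(X, \alpha) \\
\Big\uparrow{\scriptstyle \Pi^*} & & \Big\downarrow{\scriptstyle \Pi_*} \\
\Pi^* T^* M_b(X_J, \varpi_*\alpha) & \xrightarrow{\;P^{X_J}\;} & \Pi^* T M_b(X_J, \varpi_*\alpha).
\end{array}
\]

Proof. We have the following commutative diagram on $X$:
\[
\begin{array}{ccccc}
\mathfrak{n}_X & \longrightarrow & \mathfrak{g}_X & \longrightarrow & \mathfrak{g}_X/\mathfrak{b}_X \\
\Big\uparrow & & \Big\| & & \Big\| \\
\varpi^* \mathfrak{r}_{X_J} & \longrightarrow & \mathfrak{g}_X & \longrightarrow & \mathfrak{g}_X/\mathfrak{b}_X \\
\Big\| & & \Big\| & & \Big\downarrow \\
\varpi^* \mathfrak{r}_{X_J} & \longrightarrow & \varpi^* \mathfrak{g}_{X_J} & \longrightarrow & \varpi^* (\mathfrak{g}_{X_J}/\mathfrak{p}_{X_J}).
\end{array}
\]
Consider its pull-back via $\varphi \in M_b(X, \alpha)$ twisted by $\mathcal{O}_C(-1)$. Let $d_2$ denote the second differential of the hypercohomology spectral sequence of the middle row. Then we evidently have $P^X \cdot \Pi^* = d_2$, $\Pi_* \cdot d_2 = P^{X_J}$, and the proposition follows.

Now, assume that we have proved that $P^X$ defines a Poisson structure. For any $\beta \in H_2(X_J, \mathbb{Z})$ we can choose $\alpha \in H_2(X, \mathbb{Z})$ such that $\varpi_*\alpha = \beta$ and the map $\Pi$ is open. Then the algebra of functions on $M_b(X_J, \beta)$ is embedded into the algebra of functions on $M_b(X, \alpha)$ and Proposition 1 shows that the former bracket is induced by the latter one. Hence it is also a Poisson bracket.

3. The Case of X

In this section we will denote $M_b(X, \alpha)$ simply by $M$.

3.1. Since $\mathfrak{h}_X$ is a trivial vector bundle on $X$, the exact sequences
\[
0 \to \mathfrak{n}_X \to \mathfrak{b}_X \to \mathfrak{h}_X \to 0, \qquad 0 \to \mathfrak{h}_X \to \mathfrak{g}_X/\mathfrak{n}_X \to \mathfrak{g}_X/\mathfrak{b}_X \to 0
\]
induce the isomorphisms
\[
T_\varphi M = H^0(C, (\varphi^* \mathfrak{g}_X/\mathfrak{b}_X) \otimes \mathcal{O}_C(-1)) = H^0(C, (\varphi^* \mathfrak{g}_X/\mathfrak{n}_X) \otimes \mathcal{O}_C(-1)),
\]
\[
T^*_\varphi M = H^1(C, (\varphi^* \mathfrak{n}_X) \otimes \mathcal{O}_C(-1)) = H^1(C, (\varphi^* \mathfrak{b}_X) \otimes \mathcal{O}_C(-1)).
\]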
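The two isomorphisms of 3.1 come from the long exact cohomology sequences of the twisted short exact sequences. A minimal sketch, assuming $C \cong \mathbb{P}^1$ and writing $\mathcal{E}(-1)$ for the pull-back of a bundle $\mathcal{E}$ twisted by $\mathcal{O}_C(-1)$ (a shorthand introduced here only):

```latex
% The pull-back of the trivial bundle h_X is trivial, so after the twist
\[
H^q(C, \mathfrak{h}_X(-1)) = \mathfrak{h} \otimes H^q(\mathbb{P}^1, \mathcal{O}(-1)) = 0
\qquad (q = 0, 1).
\]
% Long exact sequence of 0 -> n_X -> b_X -> h_X -> 0, twisted:
\[
0 = H^0(\mathfrak{h}_X(-1)) \to H^1(\mathfrak{n}_X(-1)) \to H^1(\mathfrak{b}_X(-1))
\to H^1(\mathfrak{h}_X(-1)) = 0 .
\]
% Long exact sequence of 0 -> h_X -> g_X/n_X -> g_X/b_X -> 0, twisted:
\[
0 = H^0(\mathfrak{h}_X(-1)) \to H^0((\mathfrak{g}_X/\mathfrak{n}_X)(-1))
\to H^0((\mathfrak{g}_X/\mathfrak{b}_X)(-1)) \to H^1(\mathfrak{h}_X(-1)) = 0 .
\]
% In both sequences the middle arrow is therefore an isomorphism.
```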
Symplectic Structure on Space of G-Monopoles
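Applying the construction of 2.1 to the following tautological complex of vector bundles on $X$,
\[
\mathfrak{b}_X \to \mathfrak{g}_X \oplus \mathfrak{h}_X \to \mathfrak{g}_X/\mathfrak{n}_X, \tag{3}
\]
and taking into account the above isomorphisms we get a map $\widetilde{P}^X_\varphi : T^*_\varphi M \to T_\varphi M$.

Lemma 1. We have $\widetilde{P}^X_\varphi = P^X_\varphi$.

Proof. The same reasons as in the proof of Proposition 1 work if we consider the following commutative diagram:
\[
\begin{array}{ccccc}
\mathfrak{n}_X & \longrightarrow & \mathfrak{g}_X & \longrightarrow & \mathfrak{g}_X/\mathfrak{b}_X \\
\Big\| & & \Big\| & & \Big\uparrow \\
\mathfrak{n}_X & \longrightarrow & \mathfrak{g}_X & \longrightarrow & \mathfrak{g}_X/\mathfrak{n}_X \\
\Big\downarrow & & \Big\downarrow & & \Big\| \\
\mathfrak{b}_X & \longrightarrow & \mathfrak{g}_X \oplus \mathfrak{h}_X & \longrightarrow & \mathfrak{g}_X/\mathfrak{n}_X.
\end{array}
\]
It will be convenient for us to use the complex (3) for the definition of the map $P^X_\varphi$ instead of (1).

3.2. Here we will describe the Plücker embedding of the space $M$. Let $X \supset \Omega \cong I$ be the set of fundamental weights: $\langle i, \omega_j \rangle = \delta_{ij}$. We denote by $(\,,)$ the scalar product on $X$ such that $(i', j') = i \cdot j$ for the simple roots $i', j'$. For a dominant weight $\lambda \in X$ we denote by $V_\lambda$ the irreducible $G$-module with highest weight $\lambda$. Recall that $X$ is canonically embedded into the product of projective spaces
\[
X \subset \prod_{\omega \in \Omega} \mathbb{P}(V_\omega).
\]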
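This induces the embedding
\[
M \subset \prod_{\omega \in \Omega} M_b(\mathbb{P}(V_\omega), \langle \alpha, \omega \rangle),
\]
the product being taken over the fundamental weights $\omega$. Note that the marked point of the space $\mathbb{P}(V_\omega)$ is just the highest weight vector $v_\omega$ with respect to the Borel subgroup $B$. A degree $d$ based map $\varphi_\omega : (C, \infty) \to (\mathbb{P}(V_\omega), v_\omega)$ can be represented by a $V_\omega$-valued degree $d$ polynomial in $z$, taking the value $v_\omega$ at infinity. Let us denote the affine space of such polynomials by $R_d(V_\omega)$. The Plücker embedding of the space $M$ is the embedding into the product of affine spaces
\[
M \subset \prod_{\omega \in \Omega} R_{\langle \alpha, \omega \rangle}(V_\omega).
\]
A map $\varphi \in M$ will be represented by a collection of polynomials $(\varphi_\omega \in R_{\langle \alpha, \omega \rangle}(V_\omega))_{\omega \in \Omega}$.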
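3.3. The coordinates. The dual representation $V_\omega^*$ decomposes into the sum of weight subspaces $V_\omega^* = \oplus_{\lambda \in X} (V_\omega^*)^\lambda$. We choose a weight base $(f_\omega^\lambda)$ of $V_\omega^*$, such that $f_\omega^{-\omega}(v_\omega) = 1$. Suppose $\langle i, \omega \rangle = 1$. Then $\dim (V_\omega^*)^{-\omega} = \dim (V_\omega^*)^{-\omega + i'} = 1$, and $\dim (V_\omega^*)^{-\omega + i' + j'} = 0$ if $i \cdot j = 0$, and $\dim (V_\omega^*)^{-\omega + i' + j'} = 1$ if $i \cdot j \neq 0$. Hence, in the latter case, the vectors $f_\omega^{-\omega}$, $f_\omega^{i' - \omega}$ and $f_\omega^{i' + j' - \omega}$ are defined uniquely up to multiplication by a constant. Let $E_i, F_i, H_i$ be the standard generators of $\mathfrak{g}$. Then we will take $f_\omega^{i' - \omega} := E_i f_\omega^{-\omega}$, $f_\omega^{i' + j' - \omega} := E_j E_i f_\omega^{-\omega}$.

We consider the polynomials $\varphi_\omega^\lambda := f_\omega^\lambda(\varphi_\omega)$: the $\lambda$ weight components of $\varphi_\omega$. In particular, $\varphi_\omega^{-\omega}$ is a monic polynomial of degree $\langle \alpha, \omega \rangle$ and $\varphi_\omega^{i' - \omega}$ is a polynomial of degree $< \langle \alpha, \omega \rangle$.

Let $x_\omega^1, \ldots, x_\omega^{\langle \alpha, \omega \rangle}$ be the roots of $\varphi_\omega^{-\omega}$ and let $y_\omega^1, \ldots, y_\omega^{\langle \alpha, \omega \rangle}$ be the values of $\varphi_\omega^{i' - \omega}$ at the points $x_\omega^1, \ldots, x_\omega^{\langle \alpha, \omega \rangle}$ respectively. Consider the open subset $U \subset M$ formed by all the maps $\varphi$ such that all $x_\omega^k$ are distinct and all $y_\omega^k$ are non-zero. On this open set we have
\[
\varphi_\omega^{-\omega}(z) = \prod_{k=1}^{\langle \alpha, \omega \rangle} (z - x_\omega^k), \qquad
\varphi_\omega^{i' - \omega}(z) = \sum_{k=1}^{\langle \alpha, \omega \rangle} \frac{y_\omega^k\, \varphi_\omega^{-\omega}(z)}{(\varphi_\omega^{-\omega})'(x_\omega^k)\,(z - x_\omega^k)}.
\]
The collection of $2|\alpha|$ functions
\[
(x_\omega^k, y_\omega^k), \qquad (\omega \in \Omega,\ 1 \le k \le \langle \alpha, \omega \rangle) \tag{4}
\]
is an étale coordinate system in $U$. One can either check this straightforwardly, or just note that the matrix of $P^X$ in these coordinates has a maximal rank, see Remark 2 below. So let us compute the map $P^X$ in these coordinates.

3.4. The action of $\mathfrak{g}$ on $V_\omega$ induces an embedding
\[
\mathfrak{g}_X/\mathfrak{n}_X \subset \bigoplus_{\omega \in \Omega} V_\omega \otimes \mathcal{L}_\omega
\]
of vector bundles over $X$ and the dual surjection
\[
\bigoplus_{\omega \in \Omega} V_\omega^* \otimes \mathcal{L}_\omega^* \to \mathfrak{b}_X,
\]
where $\mathcal{L}_\omega$ stands for the line bundle corresponding to the weight $\omega$. Hence we have the following complex:
\[
\bigoplus_{\omega \in \Omega} V_\omega^* \otimes \mathcal{L}_\omega^* \to \mathfrak{g}_X \oplus \mathfrak{h}_X \to \bigoplus_{\omega \in \Omega} V_\omega \otimes \mathcal{L}_\omega. \tag{5}
\]

Remark 1. The differentials in the above complex in the fiber over a point $B' \in X$ are computed as follows:
\[
\phi \in V_\omega^* \mapsto \sum \phi(\xi^k v')\,\xi_k \oplus \sum \omega(h^i)\,\phi(v')\,h_i \in \mathfrak{g} \oplus \mathfrak{h}, \qquad
\xi \oplus h \in \mathfrak{g} \oplus \mathfrak{h} \mapsto \xi v' - \omega(h)\, v' \in V_\omega.
\]
Here $v'$ is a highest weight vector of $V_\omega$ with respect to $B'$; $(\xi_k)$, $(\xi^k)$ are dual (with respect to the standard scalar product) bases of $\mathfrak{g}$; and $(h_i)$, $(h^i)$ are dual bases of $\mathfrak{h}$.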
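Returning to the coordinates of 3.3: the formula for the second polynomial there is Lagrange interpolation through the points $(x^k, y^k)$. As an illustration only, here is the case of degree $\langle \alpha, \omega \rangle = 2$; the shorthand $a(z)$, $b(z)$ for the two polynomials and the dropped index $\omega$ are conventions introduced for this sketch.

```latex
% Assumed shorthand: a(z) is the monic polynomial with roots x^1, x^2,
% b(z) is the polynomial of degree < 2 with b(x^k) = y^k.
\[
a(z) = (z - x^1)(z - x^2), \qquad a'(x^1) = x^1 - x^2, \quad a'(x^2) = x^2 - x^1 .
\]
\[
b(z) = \sum_{k=1}^{2} \frac{y^k\, a(z)}{a'(x^k)\,(z - x^k)}
     = y^1\, \frac{z - x^2}{x^1 - x^2} + y^2\, \frac{z - x^1}{x^2 - x^1} .
\]
% Check: b(x^1) = y^1 and b(x^2) = y^2, so (x^1, x^2, y^1, y^2) determine (a, b)
% as long as x^1 and x^2 are distinct.
```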
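3.5. In order to compute the brackets of the coordinates (4) at a point $\varphi \in M$ we need to take the pull-back of the complex (5) via $\varphi$, twist it by $\mathcal{O}_C(-1)$ and compute the second differential of the hypercohomology spectral sequence. The following lemma describes this differential in a general situation.

Lemma 2. Consider a complex $K^\bullet = (\mathcal{F} \xrightarrow{\;f\;} A \otimes \mathcal{O}_C \xrightarrow{\;g\;} \mathcal{G})$ on $C$, where $A$ is a vector space and
\[
f \in \mathrm{Hom}(\mathcal{F}, A \otimes \mathcal{O}_C) = A \otimes H^0(C, \mathcal{F}^*), \qquad g \in \mathrm{Hom}(A \otimes \mathcal{O}_C, \mathcal{G}) = A^* \otimes H^0(C, \mathcal{G}).
\]
Consider
\[
D = \mathrm{tr}(f \otimes g) \in H^0(C, \mathcal{F}^*) \otimes H^0(C, \mathcal{G}) = H^0(C \times C, \mathcal{F}^* \boxtimes \mathcal{G}),
\]
where $\mathrm{tr} : A \otimes A^* \to \mathbb{C}$ is the trace homomorphism. Then

1) The restriction of $D$ to the diagonal $\Delta \subset C \times C$ vanishes, hence $D = \widetilde{D}\Delta$ for some
\[
\widetilde{D} \in H^0(C \times C, (\mathcal{F}^* \boxtimes \mathcal{G})(-\Delta)) = H^0(C, \mathcal{F}^*(-1)) \otimes H^0(C, \mathcal{G}(-1)) = H^1(C, \mathcal{F}(-1))^* \otimes H^0(C, \mathcal{G}(-1)).
\]

2) The second differential $d_2 : H^1(C, \mathcal{F}(-1)) \to H^0(C, \mathcal{G}(-1))$ of the hypercohomology spectral sequence of $K^\bullet \otimes \mathcal{O}_C(-1)$ is induced by the section $\widetilde{D}$.

Proof. The first statement is evident. To prove the second statement consider the following commutative diagram on $C \times C$:
\[
\begin{array}{ccccc}
\mathcal{F}(-1) \boxtimes \mathcal{O}_C & \xrightarrow{\;f(-1) \boxtimes 1\;} & A \otimes \mathcal{O}_{C \times C}(-1, 0) & \xrightarrow{\;(1 \boxtimes g)|_\Delta\;} & (\mathcal{O}_C(-1) \boxtimes \mathcal{G})|_\Delta \\
\Big\downarrow{\scriptstyle \widetilde{D}} & & \Big\downarrow{\scriptstyle 1 \boxtimes g} & & \Big\| \\
\mathcal{O}_C(-2) \boxtimes \mathcal{G}(-1) & \xrightarrow{\;\Delta\;} & \mathcal{O}_C(-1) \boxtimes \mathcal{G} & \xrightarrow{\;|_\Delta\;} & (\mathcal{O}_C(-1) \boxtimes \mathcal{G})|_\Delta.
\end{array}
\]
Both rows are complexes with acyclic middle term, hence the second differentials of the hypercohomology spectral sequences commute with the maps induced on cohomology by the vertical arrows:
\[
\begin{array}{ccc}
H^1(C \times C, \mathcal{F}(-1) \boxtimes \mathcal{O}_C) & \xrightarrow{\;d_2\;} & H^0(C \times C, (\mathcal{O}_C(-1) \boxtimes \mathcal{G})|_\Delta) \\
\Big\downarrow{\scriptstyle \widetilde{D}} & & \Big\| \\
H^1(C \times C, \mathcal{O}_C(-2) \boxtimes \mathcal{G}(-1)) & \xrightarrow{\;d_2\;} & H^0(C \times C, (\mathcal{O}_C(-1) \boxtimes \mathcal{G})|_\Delta).
\end{array}
\]
Now it remains to note that
\[
H^1(C \times C, \mathcal{F}(-1) \boxtimes \mathcal{O}_C) = H^1(C, \mathcal{F}(-1)), \quad
H^1(C \times C, \mathcal{O}_C(-2) \boxtimes \mathcal{G}(-1)) = H^0(C, \mathcal{G}(-1)), \quad
H^0(C \times C, (\mathcal{O}_C(-1) \boxtimes \mathcal{G})|_\Delta) = H^0(C, \mathcal{G}(-1)),
\]
and that the map $H^0(C, \mathcal{G}(-1)) \to H^0(C, \mathcal{G}(-1))$ induced by the map $d_2$ in the second row of the above diagram is the identity.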
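The first two identifications at the end of the proof follow from the Künneth formula. A short sketch, under the assumption $C \cong \mathbb{P}^1$, so that $H^0(\mathcal{O}) = \mathbb{C}$, $H^1(\mathcal{O}) = 0$, $H^0(\mathcal{O}(-2)) = 0$ and $H^1(\mathcal{O}(-2)) = \mathbb{C}$; the script letters F and G stand for the two sheaves of Lemma 2.

```latex
% Kuenneth on C x C, with C = P^1 assumed:
\[
H^1(C \times C, \mathcal{F}(-1) \boxtimes \mathcal{O}_C)
 = \bigl(H^1(\mathcal{F}(-1)) \otimes H^0(\mathcal{O}_C)\bigr)
   \oplus \bigl(H^0(\mathcal{F}(-1)) \otimes H^1(\mathcal{O}_C)\bigr)
 = H^1(C, \mathcal{F}(-1)),
\]
\[
H^1(C \times C, \mathcal{O}_C(-2) \boxtimes \mathcal{G}(-1))
 = \bigl(H^1(\mathcal{O}_C(-2)) \otimes H^0(\mathcal{G}(-1))\bigr)
   \oplus \bigl(H^0(\mathcal{O}_C(-2)) \otimes H^1(\mathcal{G}(-1))\bigr)
 = H^0(C, \mathcal{G}(-1)).
\]
% The same vanishing gives O(-Delta) = O(-1) [x] O(-1) on P^1 x P^1,
% which is what turns the section D-tilde into an element of
% H^0(F^*(-1)) (x) H^0(G(-1)) in part 1) of the lemma.
```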
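3.6. Consider the pullback of (5) via $\varphi \in M$, and twist it by $\mathcal{O}_C(-1)$. We want to apply Lemma 2 to compute the $(\omega_i, \omega_j)$-component of the second differential of the hypercohomology spectral sequence. In the notation of the lemma we have
\[
D_{\omega_i, \omega_j}(z, w) = \sum \xi^k \varphi_{\omega_i}(z) \otimes \xi_k \varphi_{\omega_j}(w) - \sum \omega_i(h^i)\, \varphi_{\omega_i}(z) \otimes \omega_j(h_i)\, \varphi_{\omega_j}(w)
\]
\[
= \sum \xi^k \varphi_{\omega_i}(z) \otimes \xi_k \varphi_{\omega_j}(w) - (\omega_i, \omega_j)\, \varphi_{\omega_i}(z) \otimes \varphi_{\omega_j}(w) \in V_{\omega_i} \otimes V_{\omega_j}(z, w).
\]

Lemma 3. The operator $\sum \xi^k \otimes \xi_k - (\omega_i, \omega_j)$ acts as a scalar multiplication on every irreducible summand $V_\lambda \subset V_{\omega_i} \otimes V_{\omega_j}$. On $V_{\omega_i + \omega_j}$ it acts as a multiplication by $0$. If $\omega_i = \omega_j$, then on $V_{2\omega_i - i'} \subset V_{\omega_i} \otimes V_{\omega_i}$ it acts as a multiplication by $(-2)$. If $i \neq j$, $i \cdot j \neq 0$, then on $V_{\omega_i + \omega_j - i' - j'} \subset V_{\omega_i} \otimes V_{\omega_j}$ it acts as a multiplication by $(i \cdot j - 2)$.

Proof. It is easy to check that $\sum \xi^k \otimes \xi_k$ commutes with the natural action of $\mathfrak{g}$ on $V_{\omega_i} \otimes V_{\omega_j}$. The first part of the lemma follows. The rest of the lemma can be checked by a straightforward computation of the action of $\sum \xi^k \otimes \xi_k$ on the highest vectors of the corresponding subrepresentations.

3.7. If we want to compute the brackets of the coordinates (4) we are interested in the components of $D_{\omega_i, \omega_j}(z, w)$ in the weights
\[
\omega_i + \omega_j, \quad \omega_i + \omega_j - i', \quad \omega_i + \omega_j - j', \quad \omega_i + \omega_j - i' - j'. \tag{6}
\]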
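The scalars in Lemma 3 can be recovered from the usual Casimir eigenvalue formula. The computation below is a sketch under assumptions that the text does not state explicitly: the form $(\,,)$ is normalized so that $(i', i') = 2$ for every simple root (simply laced case), $(i', \omega_j) = \delta_{ij}$, $\rho$ is the half-sum of positive roots with $(\rho, i') = 1$, and $C = \sum \xi^k \xi_k$ acts on $V_\lambda$ by $(\lambda, \lambda + 2\rho)$. The letters $\mu, \nu, \gamma, c$ are notation for this sketch only.

```latex
% On V_mu (x) V_nu the operator sum xi^k (x) xi_k equals
% (1/2) ( C on the tensor product - C (x) 1 - 1 (x) C ).
% On a summand V_lambda with lambda = mu + nu - gamma this gives
\[
\tfrac12\bigl[(\lambda, \lambda + 2\rho) - (\mu, \mu + 2\rho) - (\nu, \nu + 2\rho)\bigr]
 = (\mu, \nu) - (\gamma, \mu + \nu + \rho) + \tfrac12 (\gamma, \gamma),
\]
% so that, subtracting (mu, nu), the scalar of Lemma 3 is
\[
c(\gamma) = \tfrac12 (\gamma, \gamma) - (\gamma, \mu + \nu + \rho).
\]
% gamma = 0:                 c = 0.
% gamma = i', mu = nu = omega_i:
\[
c(i') = 1 - (2 + 1) = -2 .
\]
% gamma = i' + j', mu = omega_i, nu = omega_j, i \ne j:
\[
c(i' + j') = \tfrac12\,(4 + 2\, i \cdot j) - (1 + 1 + 2) = i \cdot j - 2 .
\]
```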
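The following lemma describes the corresponding weight components of the tensor product $V_{\omega_i} \otimes V_{\omega_j}$.

Lemma 4. The embedding $V_{\omega_i + \omega_j} \subset V_{\omega_i} \otimes V_{\omega_j}$ induces an isomorphism in the weights (6) with the following two exceptions:

(1) $(V_{\omega_i} \otimes V_{\omega_i})^{2\omega_i - i'} = V_{2\omega_i}^{2\omega_i - i'} \oplus V_{2\omega_i - i'}^{2\omega_i - i'}$; the $G$-projection to the second summand is given by the formula
\[
a\,(v_{\omega_i} \otimes F_i v_{\omega_i}) + b\,(F_i v_{\omega_i} \otimes v_{\omega_i}) \mapsto \frac{a - b}{2}\,(v_{\omega_i} \otimes F_i v_{\omega_i} - F_i v_{\omega_i} \otimes v_{\omega_i}).
\]

(2) $(V_{\omega_i} \otimes V_{\omega_j})^{\omega_i + \omega_j - i' - j'} = V_{\omega_i + \omega_j}^{\omega_i + \omega_j - i' - j'} \oplus V_{\omega_i + \omega_j - i' - j'}^{\omega_i + \omega_j - i' - j'}$ if $i \neq j$ and $i \cdot j \neq 0$; the $G$-projection to the second summand is given by the formula
\[
a\,(v_{\omega_i} \otimes F_i F_j v_{\omega_j}) + b\,(F_i v_{\omega_i} \otimes F_j v_{\omega_j}) + c\,(F_j F_i v_{\omega_i} \otimes v_{\omega_j}) \mapsto
\]
\[
\mapsto \frac{b - a - c}{i \cdot j - 2}\,\bigl(v_{\omega_i} \otimes F_i F_j v_{\omega_j} + (i \cdot j)\, F_i v_{\omega_i} \otimes F_j v_{\omega_j} + F_j F_i v_{\omega_i} \otimes v_{\omega_j}\bigr).
\]

Proof. Straightforward.

3.8. Hence (see Lemma 3, Lemma 4) when $\lambda$ is one of the weights (6) the $\lambda$-component $\widetilde{D}^\lambda_{\omega_i, \omega_j}(z, w)$ of the polynomial
\[
\widetilde{D}_{\omega_i, \omega_j}(z, w) = \frac{D_{\omega_i, \omega_j}(z, w)}{z - w}
\]
is zero with the following two exceptions:
\[
\widetilde{D}^{2\omega_i - i'}_{\omega_i, \omega_i} = \frac{\varphi^{\omega_i - i'}_{\omega_i}(z)\, \varphi^{\omega_i}_{\omega_i}(w) - \varphi^{\omega_i}_{\omega_i}(z)\, \varphi^{\omega_i - i'}_{\omega_i}(w)}{z - w}\,(F_i v_{\omega_i} \otimes v_{\omega_i} - v_{\omega_i} \otimes F_i v_{\omega_i}), \tag{7}
\]
\[
\widetilde{D}^{\omega_i + \omega_j - i' - j'}_{\omega_i, \omega_j} = \frac{\varphi^{\omega_i - i'}_{\omega_i}(z)\, \varphi^{\omega_j - j'}_{\omega_j}(w) - \varphi^{\omega_i}_{\omega_i}(z)\, \varphi^{\omega_j - i' - j'}_{\omega_j}(w) - \varphi^{\omega_i - i' - j'}_{\omega_i}(z)\, \varphi^{\omega_j}_{\omega_j}(w)}{z - w} \cdot
\]
\[
\cdot\, \bigl(v_{\omega_i} \otimes F_i F_j v_{\omega_j} + (i \cdot j)\, F_i v_{\omega_i} \otimes F_j v_{\omega_j} + F_j F_i v_{\omega_i} \otimes v_{\omega_j}\bigr). \tag{8}
\]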
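Note that the scalar multipliers of Lemma 3 canceled with the denominators of Lemma 4.

3.9. Now we can compute the brackets.

Proposition 2. We have
\[
\begin{aligned}
\{x^k_{\omega_i}, x^l_{\omega_j}\} &= 0; \\
\{x^k_{\omega_i}, y^l_{\omega_j}\} &= \delta_{kl}\,\delta_{ij}\, y^l_{\omega_j}; \\
\{y^k_{\omega_i}, x^l_{\omega_j}\} &= -\delta_{kl}\,\delta_{ij}\, y^k_{\omega_i}; \\
\{y^k_{\omega_i}, y^l_{\omega_j}\} &= i \cdot j\, \frac{y^k_{\omega_i}\, y^l_{\omega_j}}{x^k_{\omega_i} - x^l_{\omega_j}}, \quad \text{if } i \neq j; \\
\{y^k_{\omega_i}, y^l_{\omega_i}\} &= 0.
\end{aligned} \tag{9}
\]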
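As a quick illustration of these formulas, consider the case of a single fundamental weight (for example $G = SL_2$, an example chosen here and not taken from the text). The index $\omega$ is dropped, and $\log y^l$ is understood locally on the open set where $y^l \neq 0$.

```latex
% With one fundamental weight only the first three lines and the last line of (9) occur:
\[
\{x^k, x^l\} = 0, \qquad \{x^k, y^l\} = \delta_{kl}\, y^l, \qquad \{y^k, y^l\} = 0 .
\]
% By the Leibniz rule, {x^k, log y^l} = {x^k, y^l} / y^l, hence
\[
\{x^k, \log y^l\} = \delta_{kl}, \qquad \{\log y^k, \log y^l\} = 0,
\]
% i.e. (x^k, log y^k) are local canonical (Darboux) coordinates for the bracket.
```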
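Proof. Note that if $p \in V_{\omega_i}(z)$, then
\[
dy^k_{\omega_i}(p) = \bigl\langle f^{i' - \omega_i}_{\omega_i},\, p(x^k_{\omega_i}) \bigr\rangle, \qquad
dx^k_{\omega_i}(p) = \Bigl\langle f^{-\omega_i}_{\omega_i},\, \frac{p(x^k_{\omega_i})}{(\varphi^{\omega_i}_{\omega_i})'(x^k_{\omega_i})} \Bigr\rangle,
\]
where $\langle \bullet, \bullet \rangle$ stands for the natural pairing. Note also that
\[
\varphi^{\omega_i}_{\omega_i}(x^k_{\omega_i}) = 0, \qquad \varphi^{\omega_i - i'}_{\omega_i}(x^k_{\omega_i}) = y^k_{\omega_i},
\]
by definition and
\[
\langle f^{i' - \omega_i}_{\omega_i}, F_i v_{\omega_i} \rangle = \langle E_i f^{-\omega_i}_{\omega_i}, F_i v_{\omega_i} \rangle = -\langle f^{-\omega_i}_{\omega_i}, E_i F_i v_{\omega_i} \rangle = -1.
\]
Now the proposition follows from Lemma 2 and from the formulas of 3.8.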
Remark 2. The matrix of the bivector field $P^X$ in the coordinates $(x^k_\omega, y^k_\omega)$ looks as follows:
\[
\begin{pmatrix} 0 & \mathrm{diag}(y^k_\omega) \\ -\mathrm{diag}(y^k_\omega) & * \end{pmatrix}.
\]
Since on the open set $U$ this matrix is evidently nondegenerate it follows that the functions $(x^k_\omega, y^k_\omega)$ indeed form an étale coordinate system.
3.10. Now we can prove Theorem 1.

Proof of Theorem 1. The reduction to the case $J = \emptyset$ has been done in 2.2. The latter case is straightforward by virtue of Proposition 2.

Corollary 1. The map $P^X$ provides the space $M_b(X, \alpha)$ with a holomorphic symplectic structure.

Proof. Since $P^X$ gives a Poisson structure it suffices to check that $P^X$ is nondegenerate at any point. To this end recall that the hypercohomology spectral sequence of a complex $K^\bullet$ converges to $H^\bullet(X, K^\bullet)$. Since the only nontrivial cohomology of the complex (1) is $\mathfrak{h}_X$ in degree zero, the complex (2) is quasiisomorphic to $(\varphi^* \mathfrak{h}_X) \otimes \mathcal{O}_C(-1)$ in degree zero, hence the hypercohomology spectral sequence of the complex (2) converges to zero, hence $P^X_\varphi$ is an isomorphism.

Remark 3. One can easily write down the corresponding symplectic form in the coordinates (4):
\[
\sum \frac{dy^k_\omega \wedge dx^k_\omega}{y^k_\omega} + \frac{1}{2} \sum_{i \neq j} i \cdot j\, \frac{dx^k_{\omega_i} \wedge dx^l_{\omega_j}}{x^k_{\omega_i} - x^l_{\omega_j}}.
\]
4. Symplectic Leaves

4.1. We fix $\beta \in \mathbb{N}[I - J] \subset \mathbb{Z}[I - J] = H_2(X_J, \mathbb{Z})$, and consider the Poisson structure on $M = M_b(X_J, \beta)$. In this section we will describe the symplectic leaves of this structure.

Consider $\alpha \in \mathbb{N}[I] \subset \mathbb{Z}[I] = H_2(X, \mathbb{Z})$ such that $\varpi_*\alpha = \beta$ (see 2.2). Note that $\varpi_*$ is nothing but the natural projection from $\mathbb{N}[I]$ to $\mathbb{N}[I - J]$. Thus $\alpha - \varpi_*\alpha \in \mathbb{N}[J]$. We will call an element $\gamma \in \mathbb{N}[J]$ antidominant if $\langle \gamma, j' \rangle \le 0$ for any $j \in J$. We will call $\alpha \in \mathbb{N}[I]$ a special lift of $\beta \in \mathbb{N}[I - J]$ if $\varpi_*\alpha = \beta$, and $\alpha - \beta$ is antidominant.

Lemma 5. If $\alpha$ is a special lift of $\beta$, then the natural projection $\Pi : M_b(X, \alpha) \to M_b(X_J, \beta)$ (see 2.2) is an immersion.

Proof. Let $\varphi \in M_b(X, \alpha)$. Then $T_\varphi M_b(X, \alpha) = H^0(C, (\varphi^* \mathfrak{g}_X/\mathfrak{b}_X) \otimes \mathcal{O}_C(-1))$, and $T_{\Pi\varphi} M_b(X_J, \beta) = H^0(C, ((\Pi\varphi)^* \mathfrak{g}_{X_J}/\mathfrak{p}_{X_J}) \otimes \mathcal{O}_C(-1))$. Hence the kernel of the natural map $\Pi_* : T_\varphi M_b(X, \alpha) \to T_{\Pi\varphi} M_b(X_J, \beta)$ equals $H^0(C, \varphi^*(\varpi^* \mathfrak{p}_{X_J}/\mathfrak{b}_X) \otimes \mathcal{O}_C(-1))$. Now $\varpi^* \mathfrak{p}_{X_J}/\mathfrak{b}_X$ has a natural filtration with the successive quotients of the form $\mathcal{L}_\theta$, where $\theta$ is a positive root of the root subsystem spanned by $J \subset I$. Since $\alpha$ is a special lift, $\deg \varphi^* \mathcal{L}_\theta \le 0$. We conclude that $H^0(C, (\varphi^* \mathcal{L}_\theta) \otimes \mathcal{O}_C(-1)) = 0$, and thus $H^0(C, \varphi^*(\varpi^* \mathfrak{p}_{X_J}/\mathfrak{b}_X) \otimes \mathcal{O}_C(-1)) = 0$. The lemma is proved.

Remark 4. For fixed $\beta \in \mathbb{N}[I - J]$ the set of its special lifts is evidently finite. It is nonempty (see e.g. the proof of Theorem 2).

4.2. It follows from Proposition 1 that $\Pi(M_b(X, \alpha))$ is a symplectic leaf of the Poisson structure $P$ on $M$, if $\alpha$ is a special lift of $\beta$. The group $P_J$ acts naturally on $M$; it preserves $P$ since the complex (2) is $P_J$-equivariant. It follows that for $g \in P_J$ the subvariety $g\Pi(M_b(X, \alpha)) \subset M$ is also a symplectic leaf. Certainly, $g\Pi(M_b(X, \alpha)) = \Pi(M_b(X, \alpha))$ whenever $g \in B$.
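The vanishing used at the end of the proof of Lemma 5 is a standard degree count on the projective line. A sketch, assuming $C \cong \mathbb{P}^1$ and writing $E_0 \subset E_1 \subset \cdots \subset E_n = E$ for the filtration of the kernel bundle $E$ with line bundle quotients $L_1, \ldots, L_n$ (notation introduced here only):

```latex
% A line bundle of negative degree on P^1 has no nonzero sections:
\[
\deg L_k \le 0 \;\Longrightarrow\; \deg\bigl(L_k \otimes \mathcal{O}(-1)\bigr) \le -1
\;\Longrightarrow\; H^0\bigl(C, L_k \otimes \mathcal{O}(-1)\bigr) = 0 .
\]
% Induction along the filtration, using left exactness of H^0:
\[
0 \to H^0(E_{k-1}(-1)) \to H^0(E_k(-1)) \to H^0(L_k(-1)) = 0 ,
\]
% so H^0(E_k(-1)) = H^0(E_{k-1}(-1)) = ... = H^0(E_0(-1)) = 0 with E_0 = 0,
% and in particular H^0(C, E(-1)) = 0.
```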
Theorem 2. Any symplectic leaf of $P$ is of the form $g\Pi(M_b(X, \alpha))$, where $\alpha$ is a special lift of $\beta$, and $g \in P_J$.

Proof. We only need to check that for any $\psi \in M$ there exists a special lift $\alpha$, a point $\varphi \in M_b(X, \alpha)$, and $g \in P_J$ such that $\psi = g\Pi\varphi$. In other words, it suffices to find a special lift $\alpha$, and a point $\varphi \in M(X, \alpha)$ (unbased maps!) such that $\varphi(\infty) \in P_J x$ (the smallest $P_J$-orbit in $X$), and $\psi = \Pi\varphi$.

Recall that given a reductive group $\mathcal{G}$ with a Cartan subgroup $\mathcal{H}$ and a set of simple roots $\Delta \subset X(\mathcal{H})$, the isomorphism classes of $\mathcal{G}$-torsors over $C$ are numbered by the set $X^*_+(\mathcal{H})$ of the dominant coweights of $\mathcal{G}$: $\eta \in X^*_+(\mathcal{H})$ iff $\langle \eta, i' \rangle \ge 0$ for any $i' \in \Delta$. For example, if $\mathcal{H} = \mathcal{G} = H$, then $X^*_+(\mathcal{H}) = Y$. If $\varphi \in M(X, \alpha)$, we may view $\varphi$ as a reduction of the trivial $G$-torsor to $B$. Let $\varphi^H$ be the corresponding induced $H$-torsor. Then its isomorphism class equals $-\alpha$.

Let us view $\psi$ as a reduction of the trivial $G$-torsor to $P_J$. Let $L_J$ be the Levi quotient of $P_J$, and let $\psi^{L_J}$ be the corresponding induced $L_J$-torsor. Let $\phi$ be the Harder–Narasimhan flag of $\psi^{L_J}$. We may view it as a reduction of $\psi$ to a parabolic subgroup $P_K$, $K \subset J$. By definition, the isomorphism class $\eta$ of $\phi^{L_K}$ (as an element of $Y$) has the following properties:

a) $\varpi_*\eta = -\beta$;
b) $\langle \eta - \varpi_*\eta, j' \rangle > 0$ for $j \in J - K$;
c) $\langle \eta - \varpi_*\eta, k' \rangle = 0$ for $k \in K$.

In particular, if $L_K'$ stands for the quotient of $L_K$ by its center, then the induced torsor $\phi^{L_K'}$ is trivial. Choosing its trivial reduction to the positive Borel subgroup of $L_K'$ we obtain a reduction $\varphi$ of $\psi$ to $B$. Thus $\varphi$ is a map from $C$ to $X$ of degree $\alpha = -\eta$. We see that $\alpha$ is a special lift of $\beta$, and $\varphi \in M_b(X, \alpha)$ has the desired properties.
Acknowledgement. This paper has been written during the stay of the second author at the Max-Planck-Institut für Mathematik. He would like to express his sincere gratitude to the Institut for the hospitality and the excellent work conditions. It is clear from the above discussion that the present note owes its existence to the generous explanations of B. Feigin. We are very grateful to the referee for the clarifying comments.
References

1. Atiyah, M.F., Hitchin, N.: The Geometry and Dynamics of Magnetic Monopoles. Princeton, NJ: Princeton University Press, 1988
2. Feigin, B., Odesskii, A.: Vector bundles on Elliptic Curves and Sklyanin Algebras. In: Topics in quantum groups and finite type invariants, Mathematics at the Independent University of Moscow, B. Feigin and V. Vassiliev eds, Advances in the Mathematical Sciences 38, AMS Translations, ser. 2, vol. 185, 65–84 (1998)
3. Finkelberg, M., Mirković, I.: Semiinfinite Flags. I. Case of global curve P^1. Preprint alg-geom/9707010
4. Jarvis, S.: Euclidean monopoles and rational maps. To appear in Proc. Lond. Math. Soc.

Communicated by G. Felder
Commun. Math. Phys. 201, 423 – 444 (1999)
Communications in
Mathematical Physics © Springer-Verlag 1999
The ζ-Determinant and the Additivity of the η-Invariant on the Smooth, Self-Adjoint Grassmannian

Krzysztof P. Wojciechowski

Department of Mathematics, IUPUI (Indiana/Purdue), Indianapolis, IN 46202-3216, USA.

Received: 28 March 1996 / Accepted: 2 September 1998
Abstract: In this paper we discuss the existence of the ζ-determinant of a Dirac operator $D$ on a compact manifold with boundary. We show that the determinant is well defined in the case of the operator $D$ with a domain determined by a boundary condition from the smooth, self-adjoint Grassmannian $Gr^*_\infty(D)$ discussed in the papers [5, 13, 29]. We prove a generalization of a pasting formula for the η-invariant (see [34]). The results of the paper are used in the recent proof of the projective equality of the ζ-determinant and Quillen determinant on $Gr^*_\infty(D)$ (see [30, 31]).

Introduction

Recent studies in Quantum Field Theory and Topology have stressed the importance of the correct definition of the renormalized determinant of the Dirac operator over a manifold with boundary. The renormalization successfully used in the case of a closed manifold is the ζ-renormalization introduced by Ray and Singer in [27] (see also [32]). The ζ-determinant of the Dirac operator $D$ on a closed manifold is given by the formula:
\[
\det{}_\zeta D = e^{\frac{i\pi}{2}(\eta_D(0) - \zeta_{D^2}(0))} \cdot e^{-\frac{1}{2}\left.\frac{d}{ds}\zeta_{D^2}(s)\right|_{s=0}}, \tag{0.1}
\]
where $\zeta_{D^2}(s)$ and $\eta_D(s)$ are functions constructed from the eigenvalues of the operator $D$. Now let us assume that $D : C^\infty(M; S) \to C^\infty(M; S)$ is a compatible Dirac operator acting on sections of a bundle of Clifford modules $S$ over a compact Riemannian manifold $M$ with boundary $Y$. In this paper we concentrate on the case of an odd-dimensional manifold $M$, and from now on we assume that $n = \dim M = 2m + 1$. Let us point out, however, that our results are true for Dirac operators on an even-dimensional manifold as well. The necessary modifications due to the different algebraic structure of the spinors in the odd and even case can be found in [7], where we discuss
the applications of our results in the even-dimensional situation (see also [3] for an introductory discussion of applications of the ζ-determinant of elliptic boundary problems in Quantum Chromodynamics).

In the present paper we discuss only the Product Case. Namely we assume that the Riemannian metric on $M$ and the Hermitian structure on $S$ are products in a certain collar neighborhood of the boundary. Let us fix the collar $N = [0, 1] \times Y$. Then in $N$ the operator $D$ has the form
\[
D = G(\partial_u + B), \tag{0.2}
\]
where $G : S|_Y \to S|_Y$ is a unitary bundle isomorphism (Clifford multiplication by the unit normal vector) and $B : C^\infty(Y; S|_Y) \to C^\infty(Y; S|_Y)$ is the corresponding Dirac operator on $Y$, which is an elliptic self-adjoint operator of first order. Furthermore, $G$ and $B$ do not depend on the normal coordinate $u$ and they satisfy the identities
\[
G^2 = -\mathrm{Id} \quad \text{and} \quad GB = -BG. \tag{0.3}
\]
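The two identities can be checked by hand in the chirality splitting described in the next paragraph, where $G$ is diagonal with entries $\pm i$ and $B$ is off-diagonal. The block matrices below are taken from that description; the computation itself is only an illustrative sketch.

```latex
% Block form with respect to the splitting into positive and negative chirality parts:
\[
G = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \qquad
B = \begin{pmatrix} 0 & B^- \\ B^+ & 0 \end{pmatrix}.
\]
\[
G^2 = \begin{pmatrix} i^2 & 0 \\ 0 & (-i)^2 \end{pmatrix} = -\mathrm{Id}, \qquad
GB = \begin{pmatrix} 0 & iB^- \\ -iB^+ & 0 \end{pmatrix}, \qquad
BG = \begin{pmatrix} 0 & -iB^- \\ iB^+ & 0 \end{pmatrix} = -GB .
\]
```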
Since $Y$ has dimension $2m$ the bundle $S|_Y$ decomposes into its positive and negative chirality components $S|_Y = S^+ \oplus S^-$ and we have a corresponding splitting of the operator $B$ into $B^\pm : C^\infty(Y; S^\pm) \to C^\infty(Y; S^\mp)$, where $(B^+)^* = B^-$. Equation (0.2) can be rewritten in the following form
\[
\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} \left( \partial_u + \begin{pmatrix} 0 & B^- \\ B^+ & 0 \end{pmatrix} \right).
\]
In order to obtain a nice unbounded Fredholm operator we have to impose a boundary condition on the operator $D$. Let $\Pi_\ge$ denote the spectral projection of $B$ onto the subspace of $L^2(Y; S|_Y)$ spanned by the eigenvectors corresponding to the nonnegative eigenvalues of $B$. It is well known that $\Pi_\ge$ is an elliptic boundary condition for the operator $D$ (see [1, 6]). The meaning of the ellipticity is as follows. We introduce the unbounded operator $D_{\Pi_\ge}$ equal to the operator $D$ with domain
\[
\mathrm{dom}\, D_{\Pi_\ge} = \{ s \in H^1(M; S) \;;\; \Pi_\ge(s|_Y) = 0 \},
\]
where $H^1$ denotes the first Sobolev space. Then the operator
\[
D_{\Pi_\ge} = D : \mathrm{dom}(D_{\Pi_\ge}) \to L^2(M; S)
\]
is a Fredholm operator with kernel and cokernel consisting only of smooth sections. The orthogonal projection $\Pi_\ge$ is a pseudodifferential operator of order 0 (see [6]). In fact we can take any pseudodifferential operator $R$ of order 0 with principal symbol equal to the principal symbol of $\Pi_\ge$ and obtain an operator $D_R$ which satisfies the aforementioned properties. Let us point out, however, that only the projection onto the kernel of the operator $R$ is used in the construction of the operator $D_R$. Therefore we can restrict ourselves to the study of the Grassmannian $Gr(D)$ of all pseudodifferential projections which differ from $\Pi_\ge$ by an operator of order $-1$. The space $Gr(D)$ has infinitely many connected components and two boundary conditions $P_1$ and $P_2$ belong to the same connected component if and only if
\[
\mathrm{index}\, D_{P_1} = \mathrm{index}\, D_{P_2}.
\]
We are interested however in self-adjoint realizations of the operator $D$. The involution $G : S|_Y \to S|_Y$ equips $L^2(Y; S|_Y)$ with a symplectic structure, and Green's formula (see [6])
\[
(Ds_1, s_2) - (s_1, Ds_2) = -\int_Y \langle G(s_1|_Y) ; s_2|_Y \rangle \, dy, \tag{0.4}
\]
shows that the boundary condition $R$ provides a self-adjoint realization $D_R$ of the operator $D$ if and only if $\ker R$ is a Lagrangian subspace of $L^2(Y; S|_Y)$ (see [5, 6, 13]). It is therefore reasonable to restrict ourselves to those elements of $Gr(D)$ which are Lagrangian subspaces of $L^2(Y; S|_Y)$. More precisely we introduce $Gr^*(D)$, the Grassmannian of orthogonal, pseudodifferential projections $P$ such that $P - \Pi_\ge$ is an operator of order $-1$ and
\[
-GPG = \mathrm{Id} - P. \tag{0.5}
\]
The space $Gr^*(D)$ is contained in the connected component of $Gr(D)$ parameterizing projections $P$ with $\mathrm{index}\, D_P = 0$. In this paper we discuss only the Smooth, Self-adjoint Grassmannian, a dense subset of the space $Gr^*(D)$, defined by
\[
Gr^*_\infty(D) = \{ P \in Gr^*(D) \;;\; P - \Pi_\ge \text{ has a smooth kernel} \}. \tag{0.6}
\]

Remark 0.1. The spectral projection $\Pi_\ge$ is an element of $Gr^*_\infty(D)$ if and only if $\ker B = \{0\}$. However, it is well-known that $P(D)$, the (orthogonal) Calderón projection, is an element of $Gr^*(D)$ (see for instance [5]), and it has been recently observed by Simon Scott (see [28], Proposition 2.2.) that $P(D) - \Pi_\ge$ is a smoothing operator, and hence that $P(D)$ is an element of $Gr^*_\infty(D)$. The finite-dimensional perturbations of $\Pi_\ge$ discussed below (see also [13], [21] and [34]) provide further examples of boundary conditions from $Gr^*_\infty(D)$. The latter were introduced by Jeff Cheeger, who called them Ideal Boundary Conditions (see [10, 11]).

For any $P \in Gr^*(D)$ the operator $D_P$ has a discrete spectrum nicely distributed along the real line (see [5, 13]). Therefore one might expect that $\det_\zeta(D_P)$ is well defined. To see that, we have to study the asymptotic expansion of the heat kernels involved in the construction of the determinant, or equivalently the expansion of the operator $(D_P - \lambda)^{-1}$. The existence of a nice asymptotic expansion of the trace of the heat kernels used in the constructions of $\eta_{D_P}(s)$ and $\zeta_{D_P^2}(s)$ was established in a recent work of Gerd Grubb [18]. She used the machinery developed in her earlier work and her joint work with Bob Seeley (see [15, 16, 17]). However, at the moment, the problem of explicit computation of the coefficients in the expansion is open. From this point of view the existence of the invariants used to define $\det_\zeta$ depends on the vanishing of particular coefficients in the corresponding expansions.

We choose a different route. It follows from our earlier work on Grassmannians (see [5, 6], and [13] Appendix B) that $Gr^*_\infty(D)$ is a path connected space. As a consequence we can perform a Unitary Twist and replace the operator $D_P$ by a unitarily equivalent operator $(D + R)_{\Pi_\sigma}$, where $\Pi_\sigma \in Gr^*_\infty(D)$ denotes an appropriate finite-dimensional modification of $\Pi_\ge$ defined below in Sect. 1. The operator $D_{\Pi_\sigma}$ has a well-defined ζ-determinant and the correction term $R$ lives in the collar $N$. The operator $R$ is no longer a differential operator, but for any $0 \le u \le 1$, $R_u = R|_{\{u\} \times Y}$ is a pseudodifferential operator. If we assume that $P - \Pi_\ge$ has a smooth kernel then the operator $R_u$ has a smooth kernel for each $0 \le u \le 1$ and this is all that one needs in order to study the correction terms appearing in the corresponding heat kernels. This is the reason why we restrict attention to the space $Gr^*_\infty(D)$. The results of the paper should hold also in the case of $P \in Gr^*(D)$. The main result of this paper is the following theorem.
Theorem 0.2. For any projection $P \in Gr^*_\infty(D)$, $\eta_{D_P}(s)$ and $\zeta_{D_P^2}(s)$ are holomorphic functions of $s$ in a neighborhood of $s = 0$.

Corollary 0.3. The ζ-determinant is a well-defined smooth function on $Gr^*_\infty(D)$.
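For orientation, formula (0.1) can be tested in a finite-dimensional toy model, where it reproduces the ordinary determinant. This is only a sanity check under an assumption not made in the paper: $D$ is replaced by an invertible Hermitian matrix with eigenvalues $\lambda_1, \ldots, \lambda_N$, of which $p$ are positive and $q$ are negative, and the ζ- and η-functions are the corresponding finite sums.

```latex
% Toy model (assumption): finite sums instead of spectral functions of an operator.
\[
\eta_D(s) = \sum_k \mathrm{sign}(\lambda_k)\,|\lambda_k|^{-s}, \qquad
\zeta_{D^2}(s) = \sum_k (\lambda_k^2)^{-s},
\]
\[
\eta_D(0) = p - q, \qquad \zeta_{D^2}(0) = p + q, \qquad
\frac{d}{ds}\zeta_{D^2}(s)\Big|_{s=0} = -\sum_k \log \lambda_k^2 .
\]
% Substituting into (0.1):
\[
\det{}_\zeta D
 = e^{\frac{i\pi}{2}\,(p - q - p - q)} \cdot e^{\frac12 \sum_k \log \lambda_k^2}
 = e^{-i\pi q} \prod_k |\lambda_k|
 = (-1)^q \prod_k |\lambda_k|
 = \prod_k \lambda_k = \det D .
\]
```

In this model the phase factor in (0.1) accounts exactly for the sign of the determinant, which is the role played by the η-invariant in the infinite-dimensional case.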
Remark 0.4. (1) The result stated above implies the existence of the Quillen ζ-function metric for families of elliptic boundary value problems. This metric was studied before by Piazza [25] in the context of the b-calculus developed by Melrose and his collaborators.

(2) In fact, we are able to obtain complete asymptotic expansions of the heat kernels for the operator $D_P$. The reason is that Duhamel's Principle allows one to study the interior contribution and the boundary contribution separately and identify the singularities caused by the boundary contribution. This procedure was used before in [13] (Sect. 4 and Appendix A) and [20] (Sect. 1). As the asymptotic expansion has been already studied (see [16, 17, 18]) we leave the details to the reader and in this paper we concentrate instead on the analysis of the ζ-determinant.

(3) A more difficult problem than the existence of the asymptotic expansion is to show that the invariants used in the construction of the determinant are well defined. For instance Grubb and Seeley showed the regularity of the η-function only for finite-dimensional perturbations of the Atiyah–Patodi–Singer boundary condition (see [16]). A similar result was also obtained by Dai and Freed (see [12]). The η-invariant of a more general class of boundary problems was also studied recently by Brüning and Lesch (see [8]). There, however, the authors had to deal with the residue of the η-function at $s = 0$, which is not present in our situation.

(4) The results of this paper were announced in a talk the author gave at the Annual Meeting of the AMS in San Francisco in January 1995. The delay in publication was partly due to work on the applications, which were the motivation for the present work (see [3, 4, 7, 29, 30], and [31]).

We also want to single out one particular result, which is related to the discussion of the dependence of spectral invariants on the symbol of the operator given in [35].

Proposition 0.5. The value of the ζ-function at $s = 0$ is constant on $Gr^*_\infty(D)$, i.e.
\[
\zeta_{D_{P_1}^2}(0) = \zeta_{D_{P_2}^2}(0), \tag{0.7}
\]
for any $P_1, P_2 \in Gr^*_\infty(D)$.
The results of this paper allow us to study the ζ-determinant as a function on $Gr^*_\infty(D)$. In particular, we are interested in the relation of the ζ-determinant and the Quillen determinant defined as a canonical section of the determinant line bundle over the Grassmannian. It was observed by Scott [28] that when restricted to the self-adjoint Grassmannian the determinant line bundle over $Gr(D)$ becomes trivial. Moreover, it has a natural trivialization over $Gr^*_\infty(D)$. The Quillen determinant expressed in this trivialization becomes a function. We refer to the determinant obtained in this way as the canonical determinant and we denote it by $\det_C D_P$ (see [28] for details). In recent work of the author and Simon Scott the relation between $\det_\zeta D_P$ and $\det_C D_P$ is studied. In fact, it has been shown that, up to a natural multiplicative constant, the two determinants are equal. Proposition 0.5 and Proposition 4.7 are used in an essential way in the proof of this result. We refer the reader to [29, 4, 30, 31] for details.

In this paper we discuss another application of Theorem 0.2, the extension of the additivity formula for the η-invariant to boundary conditions from $Gr^*_\infty(D)$. This formula
has been previously known only for finite-dimensional perturbations of the Atiyah–Patodi–Singer condition (see [34]). Let us point out that the additivity formula for the η-invariant stated in Theorem 4.1 and Proposition 0.5 implies the additivity of the phase of the ζ-determinant under the pasting of two manifolds with the same boundary. This extends the result of Dai and Freed (see [12]).

In the first two sections of the paper we study the η-function of $D_P$. We obtain the following result as a conclusion of our computations.

Theorem 0.6. For any $P \in Gr^*_\infty(D)$ the function $\eta_{D_P}(s)$ is a holomorphic function of $s$ in the half-plane $\mathrm{Re}(s) > -1$.

Section 3 contains a discussion of $\zeta_{D_P^2}(s)$ and $\frac{d}{ds}(\zeta_{D_P^2}(s))|_{s=0}$. In Sect. 4 we discuss the additivity formula for the η-invariant. In the Appendix we present proofs of two technical results used in Sect. 2 and Sect. 3.

1. Boundary Contribution to the η-Function. Unitary Twist and Duhamel's Principle

Let us assume for a moment that the manifold $M$ does not have a boundary. The Dirac operator $D$ is then a self-adjoint elliptic operator with a discrete spectrum $\{\lambda_k\}_{k \in \mathbb{Z}}$. We define the η-function of $D$ as follows:
\[
\eta_D(s) = \sum_{\lambda_k \neq 0} \mathrm{sign}(\lambda_k)\,|\lambda_k|^{-s}. \tag{1.1}
\]
The function $\eta_D(s)$ is a holomorphic function of $s$ for $\mathrm{Re}(s) > \dim(M)$ and it has a meromorphic extension to $\mathbb{C}$ with isolated simple poles on the real axis. The point $s = 0$ is not a pole and $\eta_D = \eta_D(0)$, the η-invariant of the operator $D$, is an important invariant, which has found numerous applications in geometry, topology and physics. In the case of a compatible Dirac operator $D$ the η-function is actually a holomorphic function of $s$ for $\mathrm{Re}(s) > -2$. This was shown by Bismut and Freed [2], who used the heat kernel representation of the η-function
\[
\eta_D(s) = \frac{1}{\Gamma(\frac{s+1}{2})} \int_0^\infty t^{\frac{s-1}{2}} \cdot \mathrm{Tr}\, D e^{-tD^2}\, dt, \tag{1.2}
\]
which in particular allows us to express the η-invariant as
\[
\eta_D(0) = \frac{1}{\sqrt{\pi}} \int_0^\infty \frac{1}{\sqrt{t}} \cdot \mathrm{Tr}\, D e^{-tD^2}\, dt. \tag{1.3}
\]
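The representation (1.2) is the Mellin transform of (1.1) taken eigenvalue by eigenvalue. A short derivation, valid as a sketch for $\mathrm{Re}(s)$ large enough that the sum and the integral may be interchanged:

```latex
% For a single nonzero real eigenvalue lambda and Re(s) > -1:
\[
\int_0^\infty t^{\frac{s-1}{2}}\, \lambda\, e^{-t\lambda^2}\, dt
 = \lambda\, (\lambda^2)^{-\frac{s+1}{2}}\, \Gamma\!\left(\tfrac{s+1}{2}\right)
 = \mathrm{sign}(\lambda)\, |\lambda|^{-s}\, \Gamma\!\left(\tfrac{s+1}{2}\right).
\]
% Summing over the nonzero eigenvalues (zero eigenvalues contribute nothing to the trace):
\[
\int_0^\infty t^{\frac{s-1}{2}}\, \mathrm{Tr}\, D e^{-tD^2}\, dt
 = \Gamma\!\left(\tfrac{s+1}{2}\right) \sum_{\lambda_k \neq 0} \mathrm{sign}(\lambda_k)\, |\lambda_k|^{-s}
 = \Gamma\!\left(\tfrac{s+1}{2}\right) \eta_D(s).
\]
% At s = 0 the Gamma factor is Gamma(1/2) = sqrt(pi), which gives (1.3).
```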
It follows from (1.2) that the estimate
\[
|\mathrm{Tr}\, D e^{-tD^2}| < c\sqrt{t}
\]
implies the regularity of the η-function. In fact, Bismut and Freed proved a sharper result: Let $E(t; x, y)$ denote the kernel of the operator $D e^{-tD^2}$, then there exists a positive constant $c$ such that for any $x \in M$ and for any $0 < t < 1$,
\[
|\mathrm{Tr}\, E(t; x, x)| < c\sqrt{t}. \tag{1.4}
\]
We argue along the same lines and prove the following proposition, which implies Theorem 0.6.
Proposition 1.1. For any $P \in Gr^*_\infty(D)$ there exists a positive constant $c > 0$ such that for any $0 < t < 1$ the following estimate holds:
\[
|\mathrm{Tr}\, D_P e^{-tD_P^2}| < c. \tag{1.5}
\]
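The link between this estimate and the half-plane $\mathrm{Re}(s) > -1$ of Theorem 0.6 is a one-line integral bound. The sketch below assumes the heat kernel representation (1.2) for $D_P$ and that the trace decays exponentially as $t \to \infty$ (which holds for a discrete spectrum, since only nonzero eigenvalues contribute); the splitting of the integral at $t = 1$ is a choice made here for illustration.

```latex
% Small-time part of the integral in (1.2), using |Tr D_P exp(-t D_P^2)| < c for 0 < t < 1:
\[
\left| \int_0^1 t^{\frac{s-1}{2}}\, \mathrm{Tr}\, D_P e^{-tD_P^2}\, dt \right|
 \le c \int_0^1 t^{\frac{\mathrm{Re}(s)-1}{2}}\, dt
 = \frac{2c}{\mathrm{Re}(s) + 1}
 \qquad \text{for } \mathrm{Re}(s) > -1 .
\]
% With the sharper bound c sqrt(t) of (1.4) the same computation gives
\[
c \int_0^1 t^{\frac{\mathrm{Re}(s)}{2}}\, dt = \frac{2c}{\mathrm{Re}(s) + 2}
 \qquad \text{for } \mathrm{Re}(s) > -2 ,
\]
% matching the half-plane Re(s) > -2 in the closed case. The integral over
% [1, infinity) converges for every s, and 1/Gamma((s+1)/2) is entire.
```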
The proof of Proposition 1.1 occupies Sect. 1 and Sect. 2 of the paper. Proposition 1.1 is a statement on the small time asymptotics, which by Duhamel's Principle allows us to replace the kernel of the operator by a suitable parametrix built from the heat kernel of the operator on $\tilde M$, the closed double of the manifold $M$, and the heat kernel of the operator $G(\partial_u + B)$ subject to the boundary condition $P$ on a cylinder $[0, \infty) \times Y$. However, we need to start with a concrete representation of the heat kernel on the cylinder. Such a representation is well-known for the original Atiyah–Patodi–Singer condition $\Pi_\ge$ (see [1], or [6] Sect. 22). In general the projection $\Pi_\ge$ is not an element of the Grassmannian $\mathrm{Gr}^*_\infty(D)$. Nevertheless, one can easily find a finite-dimensional modification of $\Pi_\ge$ which belongs to this Grassmannian and then use the explicit formulas for the heat kernel on the cylinder.

We obtain our modification of the Atiyah–Patodi–Singer condition in the following way. The involution $G$ (see (0.3)) restricted to $\ker(B)$ defines a symplectic structure on this subspace of $L^2(Y; S|_Y)$ and the Cobordism Theorem for Dirac Operators (see for instance [6], Corollary 21.16) implies $\dim \ker(B^+) = \dim \ker(B^-)$. The last equality shows the existence of Lagrangian subspaces of $\ker(B)$. We choose such a subspace $W$ and let $\sigma : L^2(Y; S|_Y) \to L^2(Y; S|_Y)$ denote the orthogonal projection of $L^2(Y; S|_Y)$ onto $W$. Let $\Pi_>$ denote the orthogonal projection of $L^2(Y; S|_Y)$ onto the subspace spanned by eigenvectors of $B$ corresponding to the positive eigenvalues. Then
$$\Pi_\sigma = \Pi_> + \sigma \in \mathrm{Gr}^*_\infty(D) \tag{1.6}$$
gives an element of $\mathrm{Gr}^*_\infty(D)$, which is a finite-dimensional perturbation of the Atiyah–Patodi–Singer condition. The operator $D_\sigma = D_{\Pi_\sigma}$ is a self-adjoint operator and the properties of its η-function were studied in [13] (see Sect. 4 and Appendix A). It follows that $\eta_{D_\sigma}(s)$ is a holomorphic function for $\mathrm{Re}(s) > -2$. To make a connection with the operator $D_P$ we need the following result, which is an easy consequence of the topological structure of the Grassmannians studied in [5, 6, 13] (Appendix B).

Lemma 1.2. For any $P \in \mathrm{Gr}^*_\infty(D)$ there exists a smooth path $\{g_u\}_{0 \le u \le 1}$ of unitary operators on $L^2(Y; S|_Y)$ which satisfies
$$Gg_u = g_uG \quad\text{and}\quad g_u - \mathrm{Id} \text{ has a smooth kernel},$$
such that $g_1 = \mathrm{Id}$ and the path $\{P_u = g_u \Pi_\sigma g_u^{-1}\} \subset \mathrm{Gr}^*_\infty(D)$ connects $P_0 = P$ with $P_1 = \Pi_\sigma$.
We can always assume that the path $\{g_u\}$ is constant on the subintervals $[0, 1/4]$ and $[3/4, 1]$. We introduce a unitary operator $U$ on $L^2(M; S)$ using the formula
$$U := \begin{cases} \mathrm{Id} & \text{on } M \setminus N, \\ g_u & \text{on } N. \end{cases} \tag{1.7}$$
The following lemma introduces the Unitary Twist, which allows us to replace the operator $D_P$ by a modified operator $D + R$ subject to the boundary condition $\Pi_\sigma$. This makes possible an explicit construction of the heat kernels on a cylinder.
Lemma 1.3. The operators $D_P$ and $D_{U,\sigma} = (U^{-1}DU)_{\Pi_\sigma}$ are unitarily equivalent operators.

Proof. Let $\{f_k; \mu_k\}_{k \in \mathbb{Z}}$ denote a spectral resolution of the operator $D_P$. This means that for each $k$ we have $Df_k = \mu_k f_k$ and $P(f_k|_Y) = 0$. This implies
$$U^{-1}DU(U^{-1}f_k) = \mu_k(U^{-1}f_k) \quad\text{and}\quad \Pi_\sigma((U^{-1}f_k)|_Y) = g_0^{-1}P(f_k|_Y) = 0,$$
hence $\{U^{-1}f_k; \mu_k\}$ is a spectral resolution of $(U^{-1}DU)_{\Pi_\sigma}$.
In the collar $N$, we have the formulas
$$U^{-1}DU = D + GU^{-1}\frac{\partial U}{\partial u} + GU^{-1}[B, U],$$
and
$$U^{-1}D^2U = D^2 - 2U^{-1}\frac{\partial U}{\partial u}\partial_u - U^{-1}\frac{\partial^2 U}{\partial u^2} + U^{-1}[B^2, U],$$
which restricted to the collar $[0, 1/4] \times Y$ give
$$U^{-1}DU = D + GU^{-1}[B, U] \quad\text{and}\quad U^{-1}D^2U = D^2 + U^{-1}[B^2, U]. \tag{1.8}$$
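The first of these formulas can be checked by a direct computation. The following sketch assumes only that $D = G(\partial_u + B)$ in the collar and that $U$ commutes with $G$ there (as for the family $g_u$ of Lemma 1.2), and writes $U' = \partial U/\partial u$:

```latex
\begin{aligned}
U^{-1} D U
  &= U^{-1} G(\partial_u + B)\,U
   = G\,U^{-1}(\partial_u + B)\,U \\
  &= G\bigl(\partial_u + U^{-1}U' + U^{-1} B U\bigr) \\
  &= G(\partial_u + B) + G\,U^{-1}U' + G\,U^{-1}(BU - UB) \\
  &= D + G\,U^{-1}\frac{\partial U}{\partial u} + G\,U^{-1}[B, U].
\end{aligned}
```

On $[0, 1/4] \times Y$ the path $g_u$ is constant, so $U' = 0$ and only the commutator term survives, which is the first identity in (1.8).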
It follows from Lemma 1.3 that we can study the operator $D_{U,\sigma}$ instead of the operator $D_P$. We use the representation (1.8) in the construction of the parametrix of the kernel of the operator $D_{U,\sigma}e^{-tD_{U,\sigma}^2}$. This parametrix is built from the heat kernel on the double manifold $\tilde M$ and the heat kernel on the cylinder. The bundle $S$ and the operator $D$ extend to the corresponding objects $\tilde S$ and $\tilde D$ on $\tilde M$ (see [13]; see [6] for a detailed discussion of the glueing constructions). There is also the obvious double $\tilde U$ of the unitary transformation $U$. We introduce the operator
$$\tilde U^{-1}\tilde D\tilde U : C^\infty(\tilde M; \tilde S) \to C^\infty(\tilde M; \tilde S),$$
which is unitarily equivalent to $\tilde D$. Therefore the estimate (1.4) holds for the kernel $\tilde E_U(t; x, y)$ of the operator
$$\tilde U^{-1}\tilde D\tilde U e^{-t(\tilde U^{-1}\tilde D\tilde U)^2} = \tilde U^{-1}\tilde D e^{-t\tilde D^2}\tilde U.$$
It follows from Duhamel's Principle that on $M \setminus N$, up to an exponentially small error (in $t$), the kernel of $\tilde U^{-1}\tilde D e^{-t\tilde D^2}\tilde U$ is equal to the kernel of $D_{U,\sigma}e^{-tD_{U,\sigma}^2}$ for $0 < t < 1$ (see [13, 20]; see [6] for a detailed discussion of the variant of Duhamel's Principle we need in this paper). More precisely, we have the following lemma, which takes care of the situation in the interior of $M$.

Lemma 1.4. Let $E_{U,\sigma}(t; x, y)$ denote the kernel of the operator $D_{U,\sigma}e^{-tD_{U,\sigma}^2}$; then there exist positive constants $c_1, c_2$ such that for any $x \in M_{1/8} = M \setminus [0, 1/8] \times Y$ and any $0 < t < 1$ the following estimate holds:
$$\|E_{U,\sigma}(t; x, x) - \tilde E_U(t; x, x)\| \le c_1 e^{-c_2/t}. \tag{1.9}$$
Hence the estimate (1.4) holds for the kernel of the operator $D_{U,\sigma}e^{-tD_{U,\sigma}^2}$ in $M_{1/8}$.
Now, we study the heat kernel in the collar neighborhood of $Y$. Once again we apply Duhamel's Principle to replace the kernel $E_{U,\sigma}(t; x, y)$ of the operator $D_{U,\sigma}e^{-tD_{U,\sigma}^2}$ by the corresponding kernel on $[0, +\infty) \times Y$. It follows from Eq. (1.8) that up to an exponentially small error we can use the kernel of the operator
$$(G(\partial_u + B) + K_1)e^{-t(-\partial_u^2 + B^2 + K_2)_\sigma}, \tag{1.10}$$
where
$$K_1 = GU^{-1}[B, U] \quad\text{and}\quad K_2 = U^{-1}[B^2, U].$$
Let us observe that $K_1$ anticommutes and $K_2$ commutes with the involution $G$. The symbol $\exp(-t(-\partial_u^2 + B^2 + K_2)_\sigma)$ in (1.10) denotes the following operator. We consider the operator $G(\partial_u + B)_{\Pi_\sigma}$ on the infinite cylinder $[0, +\infty) \times Y$ and its square, which we denote by $(-\partial_u^2 + B^2)_\sigma$. The operator $(-\partial_u^2 + B^2)_\sigma$ is an unbounded self-adjoint operator in $L^2([0, +\infty) \times Y; S)$ and the kernel of the operator $\exp(-t(-\partial_u^2 + B^2)_\sigma)$ is given by explicit formulas (see [1, 6]). We add the bounded operator $K_2$ and obtain the operator $(-\partial_u^2 + B^2 + K_2)_\sigma$. It follows from standard theory (see for instance [26]) that the semigroup $\exp(-t(-\partial_u^2 + B^2 + K_2)_\sigma)$ is well defined. We study the trace of the kernel of $(G(\partial_u + B) + K_1)e^{-t(-\partial_u^2 + B^2 + K_2)_\sigma}$ in the next section.

2. Boundary Contribution to the η-Function. Heat Kernel on the Cylinder

In this section we continue the proof of Proposition 1.1. We have to show that the boundary contribution to $\mathrm{Tr}\, D_P e^{-tD_P^2}$ is bounded for $t$ sufficiently small. Let $e(t)$ denote the operator $\exp(-t(-\partial_u^2 + B^2 + K_2)_\sigma)$ and $e_1(t)$ denote the operator $\exp(-t(-\partial_u^2 + B^2)_\sigma)$. We have the formula
$$e(t) = e_1(t) + \sum_{n=1}^{\infty}\{e_1 * K_2e_1 * K_2e_1 * \cdots * K_2e_1\}(t), \tag{2.1}$$
where the term $K_2e_1$ appears $n$ times in the curly bracket under the summation sign and $*$ denotes convolution (see for instance [6], Sect. 22C). It follows from the explicit formulas giving the kernel of the operator $e_1(t)$ (see (2.5) and Appendix formula (A.1)) that
$$\int_Y \mathrm{tr}\, G(\partial_u + B)e_1(t; (u, y), (v, z))\big|_{u=v,\, y=z}\, dy = 0, \quad y, z \in Y.$$
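To make the convolution series (2.1) concrete, here is a minimal numerical sketch in a scalar toy model. The constants `A` and `K` are hypothetical stand-ins for $(-\partial_u^2 + B^2)_\sigma$ and $K_2$, so that $e_1(t) = e^{-tA}$ and $e(t) = e^{-t(A+K)}$. In this convention the $n$-th term enters with the sign $(-1)^n$, which plays no role in norm estimates. The function names are illustrative only.

```python
import math

A, K = 2.0, 0.7   # hypothetical scalar stand-ins for (-d^2/du^2 + B^2)_sigma and K_2


def e1(t):
    """Unperturbed semigroup exp(-t A) in the scalar toy model."""
    return math.exp(-t * A)


def series_term(n, t, steps=40):
    """n-th term {e1 * K e1 * ... * K e1}(t), with K e1 appearing n times.

    The convolution (f * g)(t) = int_0^t f(s) g(t - s) ds is approximated by
    the midpoint rule and applied recursively.
    """
    if n == 0:
        return e1(t)
    h = t / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += e1(s) * K * series_term(n - 1, t - s, steps)
    return h * total


def perturbed(t, order=3):
    """Partial sum of the alternating series approximating exp(-t (A + K))."""
    return sum((-1) ** n * series_term(n, t) for n in range(order + 1))


# In the scalar case the n-th term has the closed form (K t)^n / n! * exp(-t A).
print(series_term(2, 0.5), (K * 0.5) ** 2 / 2 * math.exp(-0.5 * A))
print(perturbed(0.5), math.exp(-0.5 * (A + K)))
```

In the scalar case each term is bounded by $(|K|t)^n/n!$ times $e^{-tA}$, which is the same mechanism that produces the exponential bound at the end of the proof of Proposition 1.1.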
We want to show that there exists a positive constant $C$ such that for any $0 \le u \le 1/8$,
$$\Bigl|\int_Y \mathrm{tr}\,(G(\partial_u + B) + K_1)e(t; (u, y), (v, z))\big|_{u=v,\, y=z}\, dy\Bigr| < C, \quad y, z \in Y. \tag{2.2}$$
It follows from Formula (2.1) that we have to study the trace
$$\int_Y \mathrm{tr}\,(G(\partial_u + B) + K_1)\Bigl\{e_1 + \sum_{n=1}^{\infty}\{e_1 * K_2e_1 * K_2e_1 * \cdots * K_2e_1\}\Bigr\}(t; (u, y), (v, z))\big|_{u=v,\, y=z}\, dy.$$
The involution $G$ commutes with the operators $e_1$ and $K_2$ and anticommutes with $B$ and $K_1$, which gives us
$$\int_Y \mathrm{tr}\, GB\Bigl\{e_1(t) + \sum_{n=1}^{\infty}\{e_1 * K_2e_1 * K_2e_1 * \cdots * K_2e_1\}\Bigr\}(t; (u, y), (v, z))\big|_{u=v,\, y=z}\, dy = 0,$$
and
$$\int_Y \mathrm{tr}\, K_1\Bigl\{e_1(t) + \sum_{n=1}^{\infty}\{e_1 * K_2e_1 * K_2e_1 * \cdots * K_2e_1\}\Bigr\}(t; (u, y), (v, z))\big|_{u=v,\, y=z}\, dy = 0.$$
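The vanishing of both traces is an instance of a standard parity argument. As a formal sketch, assume that $G$ is invertible, that $A$ anticommutes with $G$, that $E$ commutes with $G$, and that the trace is cyclic on the operators involved:

```latex
\operatorname{tr}(AE)
  = \operatorname{tr}(G^{-1} G A E)
  = -\operatorname{tr}(G^{-1} A G E)
  = -\operatorname{tr}(G^{-1} A E G)
  = -\operatorname{tr}(A E),
\qquad\text{hence}\qquad \operatorname{tr}(AE) = 0 .
```

Here $A = GB$ or $A = K_1$ (both anticommute with $G$), and $E$ is the bracketed sum, which is built from $e_1$ and $K_2$ and therefore commutes with $G$.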
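Therefore we have the equality
$$\begin{aligned}
\int_Y \mathrm{tr}\,(G(\partial_u + B) + K_1)e(t; (u, y), (v, z))\big|_{u=v,\, y=z}\, dy
&= \int_Y \mathrm{tr}\, G\partial_u\Bigl\{e_1(t) + \sum_{n=1}^{\infty}\{e_1 * K_2e_1 * \cdots * K_2e_1\}\Bigr\}(t; (u, y), (v, z))\big|_{u=v,\, y=z}\, dy \\
&= \int_Y \mathrm{tr}\, G\partial_u\Bigl\{\sum_{n=1}^{\infty}\{e_1 * K_2e_1 * \cdots * K_2e_1\}\Bigr\}(t; (u, y), (v, z))\big|_{u=v,\, y=z}\, dy.
\end{aligned} \tag{2.3}$$
The last equality in (2.3) follows from the fact that
$$\int_Y \mathrm{tr}\, G(\partial_u e_1)(t; (u, y), (v, z))\big|_{u=v,\, y=z}\, dy = 0,$$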
(see formula (A.1)). We have to study the right side of (2.3). The crucial point here is to estimate the first term
$$\int_Y \mathrm{tr}\, G(\partial_u e_1) * K_2e_1(t; (u, y), (v, z))\big|_{u=v,\, y=z}\, dy.$$
We estimate the trace in the $Y$-direction of the operator
$$G(\partial_u e_1) * K_2e_1(t) = \int_0^t G(\partial_u e_1(s))K_2e_1(t - s)\, ds.$$
Our result essentially follows from the fact that $E_1(t - s; (u, y), (v, z))$, the kernel of the operator $e_1(t - s)$, and $F(s; (u, y), (v, z))$, the kernel of the operator $\partial_u e_1(s)$, have nice "diagonal" representations on the cylinder. We can choose a spectral resolution $\{\varphi_n; \mu_n\}_{n \in \mathbb{Z}\setminus\{0\}}$ of the tangential operator $B$, such that $\varphi_n$ corresponds to a positive eigenvalue or is an element of $\mathrm{Ran}\,\sigma$ and $G\varphi_n = \varphi_{-n}$. This means that one has
$$B\varphi_n = \mu_n\varphi_n \quad\text{and}\quad \Pi_\sigma\varphi_n = 0, \tag{2.4}$$
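for $\mu_n \ge 0$, and $B(G\varphi_n) = -\mu_n(G\varphi_n)$ and $\Pi_\sigma(G\varphi_n) = G\varphi_n$. Now we can represent our kernels in the following way:
$$E_1(t; (u, y), (v, z)) = \sum_{n \in \mathbb{Z}\setminus\{0\}} g_n(t; u, v)\,\varphi_n(y) \otimes \varphi_n^*(z), \tag{2.5}$$
and
$$F(t; (u, y), (v, z)) = \sum_{n \in \mathbb{Z}\setminus\{0\}} h_n(t; u, v)\,\varphi_n(y) \otimes \varphi_n^*(z),$$
where $g_n(t; u, v)$ and $h_n(t; u, v)$ are given by explicit formulas (see (A.1)). We have
$$\mathrm{Tr}_Y\, G(\partial_u e_1(s))K_2e_1(t - s)\big|_{u=u_0} = \sum_{n \in \mathbb{Z}\setminus\{0\}} \bigl(G(\partial_u e_1(s))K_2e_1(t - s)(\varphi_n); \varphi_n\bigr)\big|_{u=u_0},$$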
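and
$$(\partial_u e_1(s))K_2e_1(t - s)(\varphi_n)(y)\big|_{u=u_0} = \sum_{m \in \mathbb{Z}\setminus\{0\}} \int_0^\infty dv \cdot h_m(s; u_0, v)\,g_n(t - s; v, u_0)\,(\varphi_m; K_2\varphi_n)\,\varphi_m(y).$$
This gives us the following expressions:
$$\mathrm{Tr}_Y\, G(\partial_u e_1(s))K_2e_1(t - s)\big|_{u=u_0} = \sum_{n \in \mathbb{Z}\setminus\{0\}} \int_0^\infty h_{-n}(s; u_0, v)\,g_n(t - s; v, u_0)\, dv \cdot (\varphi_n; K_2\varphi_n), \tag{2.6}$$
and
$$\mathrm{Tr}_Y\, e_1(s)K_2e_1(t - s)\big|_{u=u_0} = \sum_{n \in \mathbb{Z}\setminus\{0\}} \int_0^\infty g_n(s; u_0, v)\,g_n(t - s; v, u_0)\, dv \cdot (\varphi_n; K_2\varphi_n).$$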
The existence of the η-invariant for the operator $D_P$ is now a consequence of the first part of the following theorem. The second part of the theorem is used below in Sect. 3, where we deal with the ζ-function and its derivative.

Theorem 2.1. There exists a positive constant $c > 0$ such that for any $n \ne 0$ and for any $0 < t < 1$ we have the following estimates:
$$\Bigl|\int_0^\infty h_{-n}(s; u_0, v)\,g_n(t - s; v, u_0)\, dv\Bigr| < \frac{c}{\sqrt{s(t - s)}}, \tag{2.7}$$
and
$$\Bigl|\int_0^\infty g_n(s; u_0, v)\,g_n(t - s; v, u_0)\, dv\Bigr| < \frac{c}{\sqrt{t}}. \tag{2.8}$$
The proof of Theorem 2.1 is completely elementary and consists of long and tedious computations. We present the proof in the Appendix at the end of the paper. Theorem 2.1 has the following immediate corollary.
Corollary 2.2. Let $\gamma(u)$ denote a non-increasing smooth function equal to 1 for $u \le 1/8$, and equal to 0 for $u \ge 1/4$. Then there exists a positive constant $c$ such that
$$|\mathrm{Tr}\,\gamma(u)\{G(\partial_u e_1) * K_2e_1\}(t)| \le c \cdot \mathrm{Tr}|K_2| \quad\text{and}\quad |\mathrm{Tr}\,\gamma(u)\{e_1 * K_2e_1\}(t)| \le c\sqrt{t} \cdot \mathrm{Tr}|K_2|. \tag{2.9}$$
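Proof. We prove the first estimate in (2.9); the proof of the second is completely analogous.
$$\begin{aligned}
|\mathrm{Tr}\,\gamma(u)\{G(\partial_u e_1) * K_2e_1\}(t)|
&\le \Bigl|\sum_{n \in \mathbb{Z}\setminus\{0\}} \int_0^t ds \int_0^\infty \gamma(u)\, du \int_0^\infty h_{-n}(s; u_0, v)\,g_n(t - s; v, u_0)\, dv \cdot (\varphi_n; K_2\varphi_n)\Bigr| \\
&\le c_1 \cdot \sum_{n \in \mathbb{Z}\setminus\{0\}} \int_0^t ds\, \Bigl|\int_0^\infty h_{-n}(s; u_0, v)\,g_n(t - s; v, u_0)\, dv\Bigr| \cdot |(\varphi_n; K_2\varphi_n)| \\
&< c_2 \cdot \Bigl(\sum_{n \in \mathbb{Z}\setminus\{0\}} |(\varphi_n; K_2\varphi_n)|\Bigr) \cdot \int_0^\infty \gamma(u)\, du \int_0^t \frac{ds}{\sqrt{s(t - s)}} \\
&\le c_3 \cdot \mathrm{Tr}|K_2| \cdot \int_0^{t/2} \frac{ds}{\sqrt{s(t - s)}} \\
&< c_3 \cdot \mathrm{Tr}|K_2| \cdot \frac{1}{\sqrt{t/2}} \cdot \int_0^{t/2} \frac{ds}{\sqrt{s}} < c_4 \cdot \mathrm{Tr}|K_2|.
\end{aligned}$$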
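The last two steps of this proof rest on the elementary fact that $\int_0^t ds/\sqrt{s(t-s)}$ is finite and does not depend on $t$ (it equals $\pi$). A minimal numerical sketch, not part of the original argument; the function names and the choice of quadrature are illustrative assumptions:

```python
import math


def singular_integral(t, steps=2000):
    """Approximate I(t) = int_0^t ds / sqrt(s (t - s)).

    Uses the symmetry s -> t - s to integrate over [0, t/2] only, and the
    substitution s = x**2 to remove the integrable singularity at s = 0:
    I(t) = 4 * int_0^{sqrt(t/2)} dx / sqrt(t - x**2).
    """
    upper = math.sqrt(t / 2.0)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h              # midpoint rule
        total += 1.0 / math.sqrt(t - x * x)
    return 4.0 * h * total


def half_interval_bound(t):
    """Bound from the proof: on [0, t/2] one has 1/sqrt(t - s) <= 1/sqrt(t/2),
    so I(t) <= 2 * (1/sqrt(t/2)) * int_0^{t/2} ds/sqrt(s), which equals 4."""
    return 2.0 * (1.0 / math.sqrt(t / 2.0)) * (2.0 * math.sqrt(t / 2.0))


for t in (0.01, 0.1, 0.5, 0.99):
    print(t, singular_integral(t), half_interval_bound(t))
```

Both quantities are independent of $t$, which is why the constant $c_4$ can be chosen uniformly for $0 < t < 1$.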
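Proof of Proposition 1.1. We have
$$\begin{aligned}
|\mathrm{Tr}\,\gamma(u)\{(G(\partial_u + B) + K_1)e\}(t)|
&= \Bigl|\mathrm{Tr}\,\gamma(u)G\partial_u\Bigl\{\sum_{n=1}^{\infty}\{e_1 * K_2e_1 * K_2e_1 * \cdots * K_2e_1\}\Bigr\}(t)\Bigr| \\
&\le |\mathrm{Tr}\,\gamma(u)\{G(\partial_u e_1) * K_2e_1\}(t)| + \Bigl|\mathrm{Tr}\sum_{n=2}^{\infty}\gamma(u)\{G(\partial_u e_1) * K_2e_1 * K_2e_1 * \cdots * K_2e_1\}(t)\Bigr| \\
&\le c \cdot \mathrm{Tr}|K_2| + \sum_{n=2}^{\infty}\Bigl|\mathrm{Tr}\int_0^t ds_1 \int_0^{s_1} ds_2 \cdots \int_0^{s_{n-1}} ds_n \cdot \gamma(u)(G\partial_u e_1)(s_n) \circ (K_2e_1)(s_{n-1} - s_n) \circ \cdots \circ (K_2e_1)(t - s_1)\Bigr| \\
&\le c \cdot \mathrm{Tr}|K_2| + \sum_{n=2}^{\infty}\int_0^t ds_1 \int_0^{s_1} ds_2 \cdots \int_0^{s_{n-1}} ds_n \bigl\{\mathrm{Tr}|\gamma(u)(G\partial_u e_1)(s_n) \circ (K_2e_1)(s_{n-1} - s_n)| \cdot \|(K_2e_1)(s_{n-2} - s_{n-1})\| \cdots \|(K_2e_1)(t - s_1)\|\bigr\} \\
&\le c \cdot \mathrm{Tr}|K_2| + \sum_{n=2}^{\infty}\int_0^t ds_1 \int_0^{s_1} ds_2 \cdots \int_0^{s_{n-1}} ds_n \cdot \mathrm{Tr}|\gamma(u)(G\partial_u e_1)(s_n) \circ (K_2e_1)(s_{n-1} - s_n)| \cdot \|K_2\|^{n-1} \\
&\le c \cdot \mathrm{Tr}|K_2|\Bigl\{1 + c \cdot \sum_{n=2}^{\infty}\|K_2\|^{n-1}\int_0^t ds_1 \int_0^{s_1} ds_2 \cdots \int_0^{s_{n-2}} ds_{n-1}\Bigr\} \\
&= c \cdot \mathrm{Tr}|K_2|\Bigl\{1 + c \cdot \|K_2\| \cdot \sum_{n=2}^{\infty}\frac{(\|K_2\|t)^{n-2}}{(n - 2)!}\Bigr\} \le c_1 \cdot \mathrm{Tr}|K_2| \cdot e^{c_2 t\|K_2\|}
\end{aligned}$$
for some positive constants $c_1$ and $c_2$. This ends the proof of Proposition 1.1.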
3. The Modulus of the ζ-Determinant on the Grassmannian

In this section we study the spectral invariants of the operator $D_P^2$ used in the construction of the ζ-determinant, namely $\zeta_{D_P^2}(0)$ and $\frac{d}{ds}(\zeta_{D_P^2}(s))|_{s=0}$ (see (0.1)). Let us review briefly the situation in the case of a closed manifold $M$. We follow here the presentation in [32] and the necessary technicalities can be found in [14]. We assume that $D$ is an invertible operator. Otherwise $\det_\zeta D = 0$. We have
$$\zeta_{D^2}(s) = \mathrm{Tr}\,(D^2)^{-s} = \frac{1}{\Gamma(s)}\int_0^\infty t^{s-1}\,\mathrm{Tr}\, e^{-tD^2}\, dt, \tag{3.1}$$
which is a well defined holomorphic function of $s$ for $\mathrm{Re}(s) > \frac{n}{2}$, where $n = \dim M$, and has a meromorphic extension to the whole complex plane with only simple poles. The poles and residues are determined by the small time asymptotics of the heat kernel. Let $E(t; x, y)$ denote the kernel of the operator $\exp(-tD^2)$. The pointwise trace $\mathrm{tr}\, E(t; x, x)$ has an asymptotic expansion as $t \to 0$,
$$\mathrm{tr}\, E(t; x, x) = t^{-n/2}\sum_{k=0}^{N} t^{k/2}a_k(D^2; x) + o\bigl(t^{\frac{N-n}{2}}\bigr), \tag{3.2}$$
where $a_k(D^2; x)$ are computed from the coefficients of the operator $D^2$ at the point $x$ (see [14]). It follows that the meromorphic extension of $\zeta_{D^2}(s)$ has poles at the points $s_k = \frac{n-k}{2}$ with residues given by
$$\mathrm{Res}_{s=s_k}\,\zeta_{D^2}(s) = \frac{1}{\Gamma(\frac{n-k}{2})}\,a_k(D^2), \tag{3.3}$$
where $a_k(D^2)$ denotes the integral
$$a_k(D^2) = \int_M a_k(D^2; x)\, dx.$$
In particular, there are no poles at non-positive integers and $\zeta_{D^2}(0)$ is given by
$$\zeta_{D^2}(0) = a_n(D^2). \tag{3.4}$$
The functions $\Gamma(s)$ and $\int_0^\infty t^{s-1}\,\mathrm{Tr}\, e^{-tD^2}\, dt$ have the following asymptotic expansions in a neighborhood of $s = 0$:
$$\int_0^\infty t^{s-1}\,\mathrm{Tr}\, e^{-tD^2}\, dt = \frac{a_n(D^2)}{s} + b + sf(s) \quad\text{and}\quad \Gamma(s) = \frac{1}{s} + \gamma + sh(s), \tag{3.5}$$
where $f(s)$ and $h(s)$ are holomorphic functions of $s$ and $\gamma$ denotes Euler's constant. The number $b$ denotes the regularized value of the integral $\int_0^\infty t^{-1}\,\mathrm{Tr}\, e^{-tD^2}\, dt$. Now we differentiate
$$-\ln\det\nolimits_\zeta(D^2) = \frac{d}{ds}(\zeta_{D^2}(s))\Big|_{s=0} = \frac{d}{ds}\Bigl\{\frac{\frac{a_n(D^2)}{s} + b + sf(s)}{\frac{1}{s} + \gamma + sh(s)}\Bigr\}\Big|_{s=0} = b - \gamma \cdot a_n(D^2), \tag{3.6}$$
and obtain the formula for the derivative of $\zeta_{D^2}(s)$ at $s = 0$.

We want to study the corresponding invariants on a manifold with boundary for the operator $D_P$, where $P \in \mathrm{Gr}^*_\infty(D)$. We show that despite the additional poles, caused by the boundary contribution, at least in a neighborhood of $s = 0$ the situation is not different from the case of a closed manifold. First we have the following result which holds in the case of the operator $D_\sigma$.

Proposition 3.1. The function $\Gamma(s)\zeta_{D_\sigma^2}(s) = \int_0^\infty t^{s-1}\,\mathrm{Tr}\, e^{-tD_\sigma^2}\, dt$ has a simple pole at $s = 0$. Hence $\zeta_{D_\sigma^2}(0)$ and, according to formula (3.6), $\ln\det_\zeta(D_\sigma)^2 = -\frac{d}{ds}(\zeta_{D_\sigma^2}(s))|_{s=0}$ are well defined.

The proof of Proposition 3.1 consists of a straightforward computation of the boundary contribution and is included in the Appendix. Now the fact that $\zeta_{D_P^2}(0)$ and $\frac{d}{ds}(\zeta_{D_P^2}(s))|_{s=0}$ are well defined is an immediate consequence of the next theorem.

Theorem 3.2. For any $P \in \mathrm{Gr}^*_\infty(D)$ there exists a constant $c > 0$ such that the following estimate holds for any $0 < t < 1$:
$$|\mathrm{Tr}\, e^{-tD_P^2} - \mathrm{Tr}\, e^{-tD_\sigma^2}| < c\sqrt{t} \cdot \mathrm{Tr}|K_2|\,e^{t\|K_2\|}. \tag{3.7}$$
Proof. We essentially repeat the proof of Proposition 1.1. We replace the operator $D_P$ by the operator $D_{U,\sigma}$ and use Duhamel's Principle to obtain
$$|\mathrm{Tr}\, e^{-tD_P^2} - \mathrm{Tr}\, e^{-tD_{\Pi_\sigma}^2}| = \Bigl|\mathrm{Tr}\sum_{n=1}^{\infty}\{e_1 * K_2e_1 * K_2e_1 * \cdots * K_2e_1\}(t)\Bigr| + O(e^{-c/t}). \tag{3.8}$$
Now we use the second part of Theorem 2.1 in order to estimate the sum on the right side of (3.8) in exactly the same way as in the proof of Proposition 1.1.

Theorem 3.2 shows that the difference $\zeta_{D_P^2}(s) - \zeta_{D_\sigma^2}(s)$ is a holomorphic function of $s$ for $\mathrm{Re}(s) > -\frac{1}{2}$. Therefore $\zeta_{D_P^2}(s)$ is a holomorphic function of $s$ in a neighborhood of $s = 0$. The proof of Theorem 0.2 is now complete.

Proof of Proposition 0.5. The proposition is an easy corollary of Theorem 3.2. It follows from (3.1) and (3.5) that we have the equality
$$\zeta_{D_P^2}(0) - \zeta_{D_\sigma^2}(0) = \lim_{s \to 0}\frac{1}{\Gamma(s)}\int_0^\infty t^{s-1}\,\mathrm{Tr}\,(e^{-tD_P^2} - e^{-tD_\sigma^2})\, dt = \lim_{s \to 0} s\int_0^1 t^{s-1}\,\mathrm{Tr}\,(e^{-tD_P^2} - e^{-tD_\sigma^2})\, dt.$$
Now we apply Theorem 3.2 and obtain
$$|\zeta_{D_P^2}(0) - \zeta_{D_\sigma^2}(0)| < \lim_{s \to 0} s\int_0^1 t^{s-1}\,|\mathrm{Tr}\, e^{-tD_P^2} - \mathrm{Tr}\, e^{-tD_\sigma^2}|\, dt < c \cdot \lim_{s \to 0} s\int_0^1 t^{s-1/2}\, dt = 0.$$
This ends the Proof of Proposition 0.5.
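For orientation, the ζ-regularized determinant studied in this section reduces to the ordinary determinant when the spectrum is finite. The following sketch is a finite-dimensional toy illustration only: the list `spectrum` is a hypothetical set of eigenvalues of a positive operator, and the derivative at $s = 0$ is taken by a finite difference. In the infinite-dimensional setting $\zeta(0)$ and $\zeta'(0)$ exist only through the meromorphic continuation discussed above.

```python
import math


def zeta(eigenvalues, s):
    """Spectral zeta function zeta(s) = sum of lambda^(-s) over a finite positive spectrum."""
    return sum(lam ** (-s) for lam in eigenvalues)


def log_det_zeta(eigenvalues, h=1e-5):
    """ln det_zeta = -zeta'(0), with zeta'(0) from a central finite difference."""
    derivative = (zeta(eigenvalues, h) - zeta(eigenvalues, -h)) / (2.0 * h)
    return -derivative


spectrum = [1.0, 4.0, 9.0, 2.5]   # hypothetical eigenvalues of a positive operator D^2

# zeta(0) counts the eigenvalues; -zeta'(0) is the logarithm of their product.
print(zeta(spectrum, 0.0), len(spectrum))
print(log_det_zeta(spectrum), math.log(math.prod(spectrum)))
```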
4. The Additivity of the η-Invariant on the Grassmannian $\mathrm{Gr}^*_\infty(D)$

Now we assume that $D : C^\infty(X; S) \to C^\infty(X; S)$ is a compatible Dirac operator acting on sections of a bundle of Clifford modules $S$ over a closed partitioned odd-dimensional manifold $X$. In this section we study the decomposition of the η-invariant $\eta_D = \eta_D(0)$ of the operator $D$ into contributions coming from the different parts of the manifold. So, assume that we have a decomposition of $X$ as $M_1 \cup M_2$, where $M_1$ and $M_2$ are compact manifolds with boundary such that
$$M_1 \cap M_2 = Y = \partial M_1 = \partial M_2. \tag{4.1}$$
We also assume that the Riemannian metric on $X$ and the Hermitian product on $S$ are products in the bicollar neighborhood $\tilde N = [-1, 1] \times Y$ of $Y$, where $M_1 \cap \tilde N = [-1, 0] \times Y$. The operator $D$ is given by the formula (0.2) in $\tilde N$. Let $D_i = D|_{M_i}$ ($i = 1, 2$) and $P_1, P_2 \in \mathrm{Gr}^*_\infty(D)$. Let $\eta(P_1, P_2)$ denote the η-invariant of the operator $G(\partial_u + B)$ on the manifold $[-1, 1] \times Y$ subject to the boundary condition $P_1$ at $u = -1$ and the boundary condition $\mathrm{Id} - P_2$ at $u = 1$. The following theorem is a generalization of the additivity formula for the η-invariant proved in [34] (see also [13, 20, 21, 33] for partial results and discussion of related topics).

Theorem 4.1. For any $P_1, P_2 \in \mathrm{Gr}^*_\infty(D)$ one has the following formula:
$$\eta_D = \eta_{(D_1)_{\mathrm{Id}-P_1}} + \eta_{(D_2)_{P_2}} + \eta(P_1, P_2) \mod \mathbb{Z}. \tag{4.2}$$
Remark 4.2. (1) Theorem 4.1 extends Theorem 0.2 of [34] to boundary conditions from $\mathrm{Gr}^*_\infty(D)$. In [34], Formula (4.2) was proved only for projections of the form $\Pi_\sigma$, though the method extends without problems to all finite-dimensional Lagrangian perturbations of the Atiyah–Patodi–Singer condition. However, the Calderon projection of $D$ is seldom of this type. On the other hand it is an element of $\mathrm{Gr}^*_\infty(D)$.
(2) Results analogous to Theorem 0.2 of [34] were obtained and discussed by other authors. We refer especially to the papers [9, 12, 19, 22, 23, 24].
(3) The crucial point in the proof of Theorem 4.1 is the extension of the formula on the variation of the η-invariant under a change of boundary condition from the work [21].

Let $\Pi_\sigma$ denote a projection given by Formula (1.6). The following special case of the additivity formula from [34] is the starting point of the proof of Theorem 4.1:
$$\eta_D = \eta_{(D_1)_{\mathrm{Id}-\Pi_\sigma}} + \eta_{(D_2)_{\Pi_\sigma}} \mod \mathbb{Z}. \tag{4.3}$$
Now we have to study the variation of the η-invariant under a change of boundary condition. Let $P \in \mathrm{Gr}^*_\infty(D)$ and let us choose a path $\{P_r\}_{0 \le r \le 1} \subset \mathrm{Gr}^*_\infty(D)$ such that $P_0 = \Pi_\sigma$ and $P_1 = P$. It follows from Lemma 1.2 that we have a smooth family $\{g_r\}$ of unitary operators of the form $\mathrm{Id}|_{(S|_Y)}$ + smoothing operator which commutes with $G$ and such that
$$g_0 = \mathrm{Id} \quad\text{and}\quad g_1\Pi_\sigma g_1^{-1} = P.$$
Next, we choose a smooth non-increasing function $\gamma(u)$ such that
$$\gamma(u) = 1 \text{ for } u < 1/4 \quad\text{and}\quad \gamma(u) = 0 \text{ for } u > 3/4,$$
and for each $0 \le r \le 1$ use the family
$$g_{r,u} = g_{r\gamma(u)} \quad\text{for } 0 \le u \le 1, \tag{4.4}$$
in order to construct a corresponding unitary operator $U_r$ on $M_2$ (see Formula (1.7)). The operator $(D_2)_{U_r,\sigma}$ is unitarily equivalent to the operator $(D_2)_{P_r}$. This allows us to study the η-invariant of $(D_2)_{U_r,\sigma}$ instead of the η-invariant of $(D_2)_{P_r}$. It follows from Proposition 1.1 that the η-invariant of $(D_2)_{U_r,\sigma}$ is given by the formula (1.3), hence the variation of the η-invariant is given by the standard formula
$$\frac{d}{dr}\bigl(\eta_{(D_2)_{U_r,\sigma}}\bigr) = -\frac{2}{\sqrt{\pi}}\lim_{\varepsilon \to 0}\sqrt{\varepsilon} \cdot \mathrm{Tr}\,\bigl(d((D_2)_{U_r,\sigma})/dr\bigr)\,e^{-\varepsilon (D_2)_{U_r,\sigma}^2} \mod \mathbb{Z}. \tag{4.5}$$
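The main technical result of this section is the following theorem.

Theorem 4.3. For any $P \in \mathrm{Gr}^*_\infty(D)$, and any path $g = \{g_{r,u}\}$ connecting $\Pi_\sigma$ with $P$, as described above, the following formula holds:
$$\eta_{(D_2)_P} - \eta_{(D_2)_{\Pi_\sigma}} = -\frac{1}{\pi}\int_0^1 dr \int_0^1 du\, \mathrm{Tr}\, G\Bigl(g^{-1}\frac{\partial g}{\partial u}\Bigr)^{\cdot}\Big|_r \mod \mathbb{Z}, \tag{4.6}$$
where $\bigl(g^{-1}\frac{\partial g}{\partial u}\bigr)^{\cdot}\big|_{r_0} = \frac{d}{dr}\bigl(g^{-1}\frac{\partial g}{\partial u}\bigr)\big|_{r=r_0}$.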
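Proof. We show that
$$\frac{2}{\sqrt{\pi}}\lim_{\varepsilon \to 0}\sqrt{\varepsilon} \cdot \mathrm{Tr}\,\bigl(d((D_2)_{U_r,\sigma})/dr\bigr)\big|_{r=r_0}\,e^{-\varepsilon (D_2)_{U_{r_0},\sigma}^2} = \frac{1}{\pi}\int_0^1 \mathrm{Tr}\, G\Bigl(g^{-1}\frac{\partial g}{\partial u}\Bigr)^{\cdot}\Big|_{r=r_0}\, du. \tag{4.7}$$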
(U −1˙DU ) = G U −1
!
∂u
+ G[U
−1
BU, U
−1
˙ ∂g
U˙ ] = G g −1
∂u
Thus lim
ε→0
√
−εD22
ε·Tr (d(D2Ur ,σ )/dr)|r=r0 e
Ur ,σ 0
,
! + G[g −1 Bg, g −1 g]. ˙
438
K. P. Wojciechowski
contains two terms. Let us start with lim
ε→0
√
−εD22
ε·Tr G[g −1 Bg, g −1 g]e ˙
Ur ,σ 0
.
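Once again we use Duhamel's Principle and replace $e^{-\varepsilon (D_2)_{U_{r_0},\sigma}^2}$ by the operator $\exp(-\varepsilon(-\partial_u^2 + B^2 + K_2)_\sigma)$ on the cylinder. The point here is that the kernel of this operator commutes with $G$ and the operator $G[g^{-1}Bg, g^{-1}\dot g]$ anticommutes with the involution $G$. It follows that
$$\mathrm{Tr}\, G[g^{-1}Bg, g^{-1}\dot g]\,e^{-\varepsilon D_{U_r,\sigma}^2} = O(e^{-c/\varepsilon}),$$
and one is left with
$$\frac{2}{\sqrt{\pi}}\lim_{\varepsilon \to 0}\sqrt{\varepsilon} \cdot \mathrm{Tr}\, G\Bigl(U^{-1}\frac{\partial U}{\partial u}\Bigr)^{\cdot}\,e^{-\varepsilon (D_2)_{U_{r_0},\sigma}^2}.$$
The term $-G\bigl(U^{-1}\frac{\partial U}{\partial u}\bigr)^{\cdot}\big|_{r=r_0} = G\bigl(g^{-1}\frac{\partial g}{\partial u}\bigr)^{\cdot}\big|_{r=r_0}$ is supported in $[1/4, 3/4] \times Y$, and so we replace the kernel of the operator $e^{-\varepsilon (D_2)_{U_{r_0},\sigma}^2}$ by the kernel of the operator $\exp(-\varepsilon(-\partial_u^2 + B^2))$ on the infinite cylinder $(-\infty, +\infty) \times Y$. More precisely, this is the kernel of the operator
$$\exp(-\varepsilon U^{-1}(-\partial_u^2 + B^2)U) = U^{-1}e^{-\varepsilon(-\partial_u^2 + B^2)}U = g^{-1}e^{-\varepsilon(-\partial_u^2 + B^2)}g,$$
at $r = r_0$. Now we have
$$\begin{aligned}
-\frac{2}{\sqrt{\pi}}\lim_{\varepsilon \to 0}\sqrt{\varepsilon} \cdot \mathrm{Tr}\, G\Bigl(g^{-1}\frac{\partial g}{\partial u}\Bigr)^{\cdot}\Big|_{r_0}\,e^{-\varepsilon (D_2)_{U_r,\sigma}^2}
&= -\frac{2}{\sqrt{\pi}}\lim_{\varepsilon \to 0}\sqrt{\varepsilon}\int_0^1 du\, \mathrm{Tr}_Y\, G\Bigl(g^{-1}\frac{\partial g}{\partial u}\Bigr)^{\cdot}\Big|_{r_0}\,g^{-1}e^{-\varepsilon(-\partial_u^2 + B^2)}g \\
&= -\frac{2}{\sqrt{\pi}}\lim_{\varepsilon \to 0}\sqrt{\varepsilon}\int_0^1 du\, \mathrm{Tr}_Y\Bigl\{g^{-1}G\Bigl(g^{-1}\frac{\partial g}{\partial u}\Bigr)^{\cdot}\Big|_{r_0}\,e^{-\varepsilon(-\partial_u^2 + B^2)}g\Bigr\} \\
&= -\frac{2}{\sqrt{\pi}}\lim_{\varepsilon \to 0}\sqrt{\varepsilon}\int_0^1 du\, \mathrm{Tr}_Y\, G\Bigl(g^{-1}\frac{\partial g}{\partial u}\Bigr)^{\cdot}\Big|_{r_0}\,e^{-\varepsilon(-\partial_u^2 + B^2)} \\
&= -\frac{2}{\sqrt{\pi}}\lim_{\varepsilon \to 0}\sqrt{\varepsilon}\,\frac{1}{\sqrt{4\pi\varepsilon}}\int_0^1 du\, \mathrm{Tr}_Y\, G\Bigl(g^{-1}\frac{\partial g}{\partial u}\Bigr)^{\cdot}\Big|_{r_0}\,e^{-\varepsilon B^2} \\
&= -\frac{1}{\pi}\int_0^1 du \lim_{\varepsilon \to 0}\mathrm{Tr}_Y\, G\Bigl(g^{-1}\frac{\partial g}{\partial u}\Bigr)^{\cdot}\Big|_{r_0}\,e^{-\varepsilon B^2} = -\frac{1}{\pi}\int_0^1 du\, \mathrm{Tr}_Y\, G\Bigl(g^{-1}\frac{\partial g}{\partial u}\Bigr)^{\cdot}\Big|_{r_0}.
\end{aligned}$$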
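The factor $1/\sqrt{4\pi\varepsilon}$ in this computation is the value on the diagonal of the heat kernel of $-\partial_u^2$ on the real line, and it combines with $\frac{2}{\sqrt{\pi}}\sqrt{\varepsilon}$ to give the constant $\frac{1}{\pi}$. A minimal numerical sketch of this bookkeeping, assuming the Fourier representation $\frac{1}{2\pi}\int e^{-\varepsilon\xi^2}\, d\xi$ of the diagonal value; the function names are illustrative:

```python
import math


def heat_kernel_diagonal(eps, steps=4000):
    """Diagonal value of the kernel of exp(-eps * (-d^2/du^2)) on the real line,
    computed as (1 / 2 pi) * integral of exp(-eps * xi^2) with the midpoint rule."""
    limit = 8.0 / math.sqrt(eps)       # exp(-eps * limit**2) = exp(-64) is negligible
    h = 2.0 * limit / steps
    total = sum(math.exp(-eps * (-limit + (i + 0.5) * h) ** 2) for i in range(steps))
    return h * total / (2.0 * math.pi)


def prefactor(eps):
    """(2 / sqrt(pi)) * sqrt(eps) * (diagonal heat kernel); expected to equal 1 / pi."""
    return (2.0 / math.sqrt(math.pi)) * math.sqrt(eps) * heat_kernel_diagonal(eps)


for eps in (0.5, 0.1, 0.01):
    print(eps, heat_kernel_diagonal(eps), 1.0 / math.sqrt(4.0 * math.pi * eps), prefactor(eps))
```

The prefactor does not depend on $\varepsilon$, so the limit $\varepsilon \to 0$ only acts on $\mathrm{Tr}_Y$ of the remaining factor.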
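Remark 4.4. In the paper [29] we discussed the special case of (4.6) with $g_r(u)$ given by the formula
$$g_r(u) = \begin{pmatrix} \mathrm{Id} & 0 \\ 0 & \exp(ir\gamma(u)\Theta) \end{pmatrix},$$
where $\Theta : C^\infty(Y; S^-|_Y) \to C^\infty(Y; S^-|_Y)$ is a self-adjoint operator with a smooth kernel. In this case our formula gives
$$\eta_{(D_2)_P} - \eta_{(D_2)_{\Pi_\sigma}} = -\frac{1}{\pi}\int_0^1 dr \int_0^1 du\, \gamma'(u)\,\mathrm{Tr}\,\Theta = \frac{\mathrm{Tr}\,\Theta}{\pi} \mod \mathbb{Z}.$$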
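The value $\mathrm{Tr}\,\Theta/\pi$ in this special case only uses $\int_0^1 \gamma'(u)\, du = \gamma(1) - \gamma(0) = -1$. The sketch below checks this numerically for one concrete cut-off function. The particular choice of `cutoff`, built from the standard smooth function $e^{-1/x}$, is a hypothetical example; any smooth non-increasing $\gamma$ with the stated boundary behaviour would do, and `trace_theta` stands for the real number $\mathrm{Tr}\,\Theta$.

```python
import math


def smooth_step(x):
    """exp(-1/x) for x > 0 and 0 otherwise: the standard smooth building block."""
    return math.exp(-1.0 / x) if x > 0 else 0.0


def cutoff(u):
    """Hypothetical cut-off gamma(u): 1 for u <= 1/4, 0 for u >= 3/4, smooth and non-increasing."""
    a = smooth_step(0.75 - u)
    b = smooth_step(u - 0.25)
    return a / (a + b)


def cutoff_derivative(u, h=1e-6):
    """Central finite difference approximation of gamma'(u)."""
    return (cutoff(u + h) - cutoff(u - h)) / (2.0 * h)


def eta_difference(trace_theta, steps=2000):
    """-(1/pi) * int_0^1 dr int_0^1 du gamma'(u) Tr(Theta).

    The integrand does not depend on r, so the r-integral contributes a factor 1.
    """
    h = 1.0 / steps
    du_integral = h * sum(cutoff_derivative((i + 0.5) * h) for i in range(steps))
    return -(1.0 / math.pi) * du_integral * trace_theta


print(eta_difference(3.0), 3.0 / math.pi)
```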
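Corollary 4.5. Let $P_1, P_2 \in \mathrm{Gr}^*_\infty(D)$; then
$$\eta_{(D_2)_{P_1}} - \eta_{(D_2)_{P_2}} = -\frac{1}{\pi}\int_0^1 dr \int_0^1 du\, \mathrm{Tr}\, G\Bigl(g^{-1}\frac{\partial g}{\partial u}\Bigr)^{\cdot}\Big|_r \mod \mathbb{Z}, \tag{4.8}$$
where $\{g_{r,u}\}$ is any family connecting $P_1$ with $P_2$ in the way described above (see (4.4)).

Corollary 4.6.
$$\eta(P_1, P_2) = -\frac{1}{\pi}\int_0^1 dr \int_0^1 du\, \mathrm{Tr}\, G\Bigl(g^{-1}\frac{\partial g}{\partial u}\Bigr)^{\cdot}\Big|_r \mod \mathbb{Z}, \tag{4.9}$$
where $\{g_{r,u}\}$ is any family connecting $P_1$ with $P_2$ as described above (see (4.4)).

The next result follows directly from the formula (4.8), which can be written as follows:
$$\frac{d}{dr}\bigl(\eta_{(D_2)_{P_r}}\bigr)\Big|_{r=r_0} = -\frac{1}{\pi}\int_0^1 du\, \mathrm{Tr}\, G\Bigl(g^{-1}\frac{\partial g}{\partial u}\Bigr)^{\cdot}\Big|_{r_0}. \tag{4.10}$$

Proposition 4.7. The variation of the η-invariant $\frac{d}{dr}\bigl(\eta_{(D_2)_{P_r}}\bigr)\big|_{r=0}$ does not depend on the choice of the base projection $P = P_0$. It depends only on the family of unitary operators $\{g_r\}$.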
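This result plays an important role in the proof of equality of ζ-determinant and C-determinant (see [29, 4, 30, 31]).

Proof of Theorem 4.1. As a result of Corollary 4.5 and Corollary 4.6 the following sequence of equalities holds mod $\mathbb{Z}$:
$$\begin{aligned}
\eta_D &= \eta_{(D_1)_{\mathrm{Id}-\Pi_\sigma}} + \eta_{(D_2)_{\Pi_\sigma}} \\
&= \bigl(\eta_{(D_1)_{\mathrm{Id}-\Pi_\sigma}} + \eta(\Pi_\sigma, P_1)\bigr) - \eta(\Pi_\sigma, P_1) + \bigl(\eta_{(D_2)_{\Pi_\sigma}} + \eta(P_2, \Pi_\sigma)\bigr) - \eta(P_2, \Pi_\sigma) \\
&= \eta_{(D_1)_{\mathrm{Id}-P_1}} + \eta_{(D_2)_{P_2}} + \eta(P_1, \Pi_\sigma) + \eta(\Pi_\sigma, P_2) = \eta_{(D_1)_{\mathrm{Id}-P_1}} + \eta_{(D_2)_{P_2}} + \eta(P_1, P_2).
\end{aligned}$$
In the last line we use the elementary identities
$$\eta(P_1, P_2) = -\eta(P_2, P_1) \quad\text{and}\quad \eta(P_1, P_2) + \eta(P_2, P_3) = \eta(P_1, P_3) \mod \mathbb{Z}.$$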
A. Proof of Theorem 2.1 and Proposition 3.1

We start with a discussion of Theorem 2.1. Recall the formulas for the functions $g_n(t; u, v)$ (see for instance [6], (22.33) and (22.35)):
$$g_n(t; u, v) = \frac{e^{-\mu_n^2 t}}{2\sqrt{\pi t}} \cdot \Bigl\{e^{-\frac{(u-v)^2}{4t}} - e^{-\frac{(u+v)^2}{4t}}\Bigr\} \quad\text{for } n > 0, \tag{A.1}$$
and
$$g_n(t; u, v) = \frac{e^{-(-\mu_n)^2 t}}{2\sqrt{\pi t}} \cdot \Bigl\{e^{-\frac{(u-v)^2}{4t}} + e^{-\frac{(u+v)^2}{4t}}\Bigr\} + (-\mu_n)e^{-(-\mu_n)(u+v)} \cdot \mathrm{erfc}\Bigl(\frac{u+v}{2\sqrt{t}} - (-\mu_n)\sqrt{t}\Bigr) \quad\text{for } n < 0,$$
where
$$\mathrm{erfc}(x) = \frac{2}{\sqrt{\pi}}\int_x^\infty e^{-r^2}\, dr < \frac{2}{\sqrt{\pi}}e^{-x^2}.$$
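The bound on $\mathrm{erfc}$ is used repeatedly in what follows. The sketch below samples it numerically on $0 \le x \le 8$, the sampling range being an illustrative choice. The restriction to $x \ge 0$ is an assumption made here: it matches the arguments $\frac{u+v}{2\sqrt{t}} + \mu_n\sqrt{t}$ with $u, v \ge 0$ and $\mu_n \ge 0$, and the inequality does fail for sufficiently negative $x$, where $\mathrm{erfc}(x)$ approaches 2.

```python
import math


def erfc_bound(x):
    """Right-hand side (2 / sqrt(pi)) * exp(-x^2) of the bound on erfc(x)."""
    return 2.0 / math.sqrt(math.pi) * math.exp(-x * x)


# Largest ratio erfc(x) / bound(x) over a grid of non-negative x.
worst_ratio = max(math.erfc(0.01 * k) / erfc_bound(0.01 * k) for k in range(0, 801))
print(worst_ratio)                       # stays below 1 on the sampled range

# For negative arguments the bound is not valid.
print(math.erfc(-2.0), erfc_bound(-2.0))
```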
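We begin with the estimate of the integral $\int_0^\infty g_n(s; u_0, v)\,g_n(t - s; v, u_0)\, dv$. The most singular term is
$$\begin{aligned}
\int_0^\infty \frac{e^{-\mu_n^2 s}}{2\sqrt{\pi s}} \cdot e^{-\frac{(u_0-v)^2}{4s}} \cdot \frac{e^{-\mu_n^2(t-s)}}{2\sqrt{\pi(t-s)}} \cdot e^{-\frac{(u_0-v)^2}{4(t-s)}}\, dv
&= \frac{e^{-\mu_n^2 t}}{4\pi}\frac{1}{\sqrt{s(t-s)}}\int_0^\infty e^{-\frac{t(u_0-v)^2}{4s(t-s)}}\, dv \\
&< \frac{e^{-\mu_n^2 t}}{4\pi}\frac{1}{\sqrt{s(t-s)}}\,2\sqrt{\frac{s(t-s)}{t}}\int_{-\infty}^{+\infty} e^{-r^2}\, dr = \frac{1}{2\sqrt{\pi t}}e^{-\mu_n^2 t} < \frac{1}{2\sqrt{\pi t}}.
\end{aligned}$$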
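We also have the inequality
$$e^{-\frac{(u+v)^2}{4t}} \le e^{-\frac{(u-v)^2}{4t}},$$
which holds for $u, v \ge 0$ and implies the estimate
$$\int_0^\infty \frac{e^{-\mu_n^2 s}}{2\sqrt{\pi s}} \cdot e^{-\frac{(u_0 \mp v)^2}{4s}} \cdot \frac{e^{-\mu_n^2(t-s)}}{2\sqrt{\pi(t-s)}} \cdot e^{-\frac{(u_0 \pm v)^2}{4(t-s)}}\, dv \le \frac{1}{2\sqrt{\pi t}}.$$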
Z
∞ 0
2
1 gn (s; u0 , v)gn (t − s; v, u0 )dv < √ , 2 πt
for positive n. If n < 0 we also have to discuss the terms of the form Z
∞ 0
√ e−µn s − (u0 −v)2 u0 + v 4s √ ·e + µn t − s)dv. ·µn eµn (u0 +v) ·erf c( √ 2 πs 2 t−s 2
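We have
$$\begin{aligned}
\int_0^\infty \frac{e^{-\mu_n^2 s}}{2\sqrt{\pi s}} \cdot e^{-\frac{(u_0-v)^2}{4s}} \cdot \mu_n e^{\mu_n(u+v)} \cdot \mathrm{erfc}\Bigl(\frac{u+v}{2\sqrt{t-s}} + \mu_n\sqrt{t-s}\Bigr)\, dv
&< \frac{\mu_n e^{-\mu_n^2 t}}{\pi\sqrt{s}}\int_0^\infty e^{-\frac{t(u_0-v)^2}{4s(t-s)}}\, dv \\
&< \frac{2\mu_n e^{-\mu_n^2 t}}{\sqrt{\pi}}\sqrt{\frac{t-s}{t}} < c\,e^{-\mu_n^2 t}.
\end{aligned}$$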
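Finally, we have to estimate the term in which the erfc function appears twice:
$$\begin{aligned}
&\int_0^\infty \mu_n e^{\mu_n(u_0+v)}\,\mathrm{erfc}\Bigl(\frac{u_0+v}{2\sqrt{s}} + \mu_n\sqrt{s}\Bigr) \cdot \mu_n e^{\mu_n(u_0+v)}\,\mathrm{erfc}\Bigl(\frac{u_0+v}{2\sqrt{t-s}} + \mu_n\sqrt{t-s}\Bigr)\, dv \\
&\quad< \frac{4}{\pi}\int_0^\infty \mu_n^2 e^{2\mu_n(u_0+v)}\,e^{-\bigl(\frac{u_0+v}{2\sqrt{s}} + \mu_n\sqrt{s}\bigr)^2} \cdot e^{-\bigl(\frac{u_0+v}{2\sqrt{t-s}} + \mu_n\sqrt{t-s}\bigr)^2}\, dv \\
&\quad= \frac{4}{\pi}\mu_n^2 e^{-\mu_n^2 t}\int_0^\infty e^{-\frac{t(u_0+v)^2}{4s(t-s)}}\, dv < \frac{4}{\pi}\mu_n^2 e^{-\mu_n^2 t}\,2\sqrt{\frac{s(t-s)}{t}}\int_{-\infty}^{+\infty} e^{-r^2}\, dr \\
&\quad\le \frac{8}{\sqrt{\pi}}\mu_n^2 e^{-\mu_n^2 t}\sqrt{t} < c\sqrt{t}\,e^{-\mu_n^2 t}.
\end{aligned}$$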
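As a sanity check on the case $n > 0$ of (2.8), note that for $n > 0$ the function $g_n$ recalled in (A.1) is $e^{-\mu_n^2 t}$ times the Dirichlet heat kernel on the half-line, so by the semigroup property the integral in (2.8) should equal $g_n(t; u_0, u_0)$, which is below $\frac{1}{2\sqrt{\pi t}}$. The sketch below checks this numerically; the parameter values, the truncation of the $v$-integral and the function names are illustrative assumptions.

```python
import math


def g_positive(mu, t, u, v):
    """g_n(t; u, v) for n > 0 as in (A.1): exp(-mu^2 t) times the Dirichlet
    heat kernel on the half-line."""
    gauss = math.exp(-(u - v) ** 2 / (4 * t)) - math.exp(-(u + v) ** 2 / (4 * t))
    return math.exp(-mu * mu * t) / (2 * math.sqrt(math.pi * t)) * gauss


def convolution_integral(mu, s, t, u0, v_max=10.0, steps=5000):
    """int_0^infinity g_n(s; u0, v) g_n(t - s; v, u0) dv, truncated at v_max
    and approximated with the midpoint rule."""
    h = v_max / steps
    total = 0.0
    for i in range(steps):
        v = (i + 0.5) * h
        total += g_positive(mu, s, u0, v) * g_positive(mu, t - s, v, u0)
    return h * total


mu, s, t, u0 = 1.5, 0.2, 0.5, 0.3        # hypothetical sample values with 0 < s < t < 1
value = convolution_integral(mu, s, t, u0)
print(value, g_positive(mu, t, u0, u0), 1.0 / (2.0 * math.sqrt(math.pi * t)))
```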
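The computations given above finish the proof of (2.8). We work the same way in order to obtain the estimate (2.7). The only difference is the appearance of $h_n(t; u_0, v) = \frac{\partial g_n}{\partial u}(t; u_0, v)$. The first term we have to consider has the form
$$\begin{aligned}
&\int_0^\infty \frac{e^{-\mu_n^2 s}}{2\sqrt{\pi s}}\,\frac{|u_0 - v|}{2s}\,e^{-\frac{(u_0-v)^2}{4s}} \cdot \frac{e^{-\mu_n^2(t-s)}}{2\sqrt{\pi(t-s)}}\,e^{-\frac{(u_0-v)^2}{4(t-s)}}\, dv \\
&\quad< \frac{e^{-\mu_n^2 t}}{4\pi}\frac{1}{\sqrt{s(t-s)}}\int_0^\infty \frac{|u_0 - v|}{2s}\,e^{-\frac{(u_0-v)^2}{4s}}\, dv \\
&\quad= \frac{e^{-\mu_n^2 t}}{4\pi}\frac{1}{\sqrt{s(t-s)}}\Bigl\{\int_0^{u_0}\frac{u_0 - v}{2s}\,e^{-\frac{(u_0-v)^2}{4s}}\, dv + \int_{u_0}^\infty \frac{v - u_0}{2s}\,e^{-\frac{(u_0-v)^2}{4s}}\, dv\Bigr\} \\
&\quad= \frac{e^{-\mu_n^2 t}}{4\pi}\frac{1}{\sqrt{s(t-s)}}\Bigl\{\int_0^{u_0} d\bigl(e^{-\frac{(u_0-v)^2}{4s}}\bigr) - \int_{u_0}^\infty d\bigl(e^{-\frac{(u_0-v)^2}{4s}}\bigr)\Bigr\} < \frac{1}{2\pi} \cdot \frac{e^{-\mu_n^2 t}}{\sqrt{s(t-s)}}.
\end{aligned}$$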
We work on the other terms which appear in $\int_0^\infty h_{-n}(s; u_0, v)\,g_n(t - s; v, u_0)\, dv$ in the same way. The details are left to the reader.

Now, we show that the function $\Gamma(s)\zeta_{D_\sigma^2}(s)$ has a simple pole at $s = 0$. We follow the method applied in Sect. 4 of [13] to study the η-invariant of $D_\sigma$. Let us point out that the situation is simpler in the case of the η-invariant due to the absence of the boundary contribution. Nevertheless, the result corresponding to Lemma 4.2 of [13] holds also in the present case. Namely, modulo a function holomorphic on the whole complex plane, $\Gamma(s)\zeta_{D_\sigma^2}(s)$ splits into an interior contribution and a cylinder contribution. This again follows from Duhamel's Principle. First of all,
$$\int_1^\infty t^{s-1}\,\mathrm{Tr}\, e^{-tD_\sigma^2}\, dt$$
is a holomorphic function on the whole complex plane. For $0 < t < 1$, we replace $\exp(-tD_\sigma^2)$ by the operator $\exp(-t\tilde D^2)$ inside of $M$ and by the operator $\exp(-t(-\partial_u^2 + B^2)_\sigma)$ on $N$. The interior contribution produces simple poles at the points $s_k = \frac{n-k}{2}$ with residues given by the formula
$$\int_M a_k(\tilde D^2; x)\, dx,$$
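where $\tilde D$ denotes the double of the Dirac operator $D$ on $\tilde M$, the closed double of $M$ (see formulas (3.2) and (3.3)). In particular the contribution to the residuum at $s = 0$ is equal to
$$\int_M a_n(\tilde D^2; x)\, dx = 0.$$
This is due to the point-wise vanishing of $a_n(\tilde D^2; x)$, which follows from the fact that $n = \dim M$ is odd (see for instance [14]). The cylinder contribution has the form
$$\int_0^\infty t^{s-1}\,\mathrm{Tr}\,\gamma(u)\,e^{-t(-\partial_u^2 + B^2)_{\Pi_\sigma}}\, dt = \int_0^\infty t^{s-1}\, dt \sum_{n \in \mathbb{Z}\setminus\{0\}}\int_0^\infty \gamma(u)\,g_n(t; u, u)\, du,$$
where $\gamma(u)$ denotes the cut-off function. The integral
$$\int_0^\infty t^{s-1}\, dt \int_0^\infty \gamma(u)\,g_n(t; u, u)\, du$$
consists of two terms. The first term produces the contribution
$$\int_0^\infty t^{s-1}\, dt\Bigl\{\sum_{n \in \mathbb{Z}\setminus\{0\}}\int_0^\infty \gamma(u)\,\frac{e^{-\mu_n^2 t}}{2\sqrt{\pi t}} \cdot \bigl(1 - \mathrm{sign}(n)\,e^{-\frac{u^2}{t}}\bigr)\, du\Bigr\}.$$
This is convergent for $\mathrm{Re}(s) > n/2$ and in fact it is equal to
$$\int_0^\infty t^{s-1}\, dt\Bigl\{\sum_{n \in \mathbb{Z}\setminus\{0\}}\int_0^\infty \gamma(u)\,\frac{e^{-\mu_n^2 t}}{2\sqrt{\pi t}} \cdot \bigl(1 - \mathrm{sign}(n)\,e^{-\frac{u^2}{t}}\bigr)\, du\Bigr\} = \frac{\int_0^\infty \gamma(u)\, du}{2\sqrt{\pi}} \cdot \int_0^\infty t^{s-3/2}\,\mathrm{Tr}\, e^{-tB^2}\, dt.$$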
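It follows now from (3.2) and (3.3) that the expression on the right side has a meromorphic extension to the whole complex plane with simple poles. Moreover, it is regular at $s = 0$. The reason is that the residuum at $s = 0$ is given by the formula
$$\frac{\int_0^\infty \gamma(u)\, du}{2\sqrt{\pi}}\,a_{\dim(Y)+1}(B^2),$$
and $a_{\dim(Y)+1}(B^2)$ is equal to 0 due to the fact that $\dim(Y) + 1$ is an odd number (see for instance [14]). We are left with
$$\int_0^\infty t^{s-1}\, dt \int_0^\infty \gamma(u)\Bigl\{\sum_{n>0}\mu_n e^{2u\mu_n}\,\mathrm{erfc}\Bigl(\frac{u}{\sqrt{t}} + \mu_n\sqrt{t}\Bigr)\Bigr\}\, du.$$
We only have to show that this term produces at most a simple pole at $s = 0$. We can neglect the presence of the cut-off function $\gamma(u)$ and then we obtain for large $\mathrm{Re}(s)$
$$\begin{aligned}
&\frac{1}{2}\int_0^\infty t^{s-1}\, dt \int_0^\infty \Bigl\{\sum_{n>0}\frac{d}{du}\bigl(e^{2u\mu_n}\bigr)\,\mathrm{erfc}\Bigl(\frac{u}{\sqrt{t}} + \mu_n\sqrt{t}\Bigr)\Bigr\}\, du \\
&\quad= \frac{1}{2}\int_0^\infty t^{s-1}\, dt\Bigl\{\sum_{n>0}\Bigl\{e^{2u\mu_n}\,\mathrm{erfc}\Bigl(\frac{u}{\sqrt{t}} + \mu_n\sqrt{t}\Bigr)\Bigr\}\Big|_0^\infty - \int_0^\infty e^{2u\mu_n}\Bigl(\frac{d}{du}\Bigl\{\mathrm{erfc}\Bigl(\frac{u}{\sqrt{t}} + \mu_n\sqrt{t}\Bigr)\Bigr\}\Bigr)\, du\Bigr\} \\
&\quad= \frac{1}{2}\int_0^\infty t^{s-1}\Bigl\{\sum_{n>0} e^{-\mu_n^2 t}\Bigl(\frac{2}{\sqrt{\pi}}\int_0^\infty e^{-u^2/t}\,\frac{du}{\sqrt{t}}\Bigr) - \sum_{n>0}\mathrm{erfc}(\mu_n\sqrt{t})\Bigr\}\, dt \\
&\quad= \frac{1}{2}\int_0^\infty t^{s-1}\,\mathrm{Tr}\, e^{-tB^2}\, dt - \frac{1}{2}\int_0^\infty t^{s-1}\Bigl\{\sum_{n>0}\mathrm{erfc}(\mu_n\sqrt{t})\Bigr\}\, dt.
\end{aligned}$$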
The first sum on the right side is equal to $\frac{1}{2}\Gamma(s)\zeta_{B^2}(s)$, hence it produces the correct asymptotic expansion with a simple pole at $s = 0$. We need the next identity in order to deal with the second sum:
$$\begin{aligned}
\int_0^\infty t^{s-1}\,\mathrm{erfc}(\mu_n\sqrt{t})\, dt
&= \frac{1}{s}\int_0^\infty \frac{d}{dt}(t^s)\,\mathrm{erfc}(\mu_n\sqrt{t})\, dt \\
&= \frac{1}{s}\bigl\{t^s\,\mathrm{erfc}(\mu_n\sqrt{t})\bigr\}\Big|_0^\infty - \frac{1}{s}\int_0^\infty t^s e^{-\mu_n^2 t}\,\frac{\mu_n}{2\sqrt{t}}\, dt \\
&= -\frac{\mu_n}{2s}\int_0^\infty t^{s-1/2}e^{-\mu_n^2 t}\, dt = -\frac{\Gamma(s + 1/2)}{2s}(\mu_n^2)^{-s}.
\end{aligned}$$
Therefore we obtain
$$-\frac{1}{2}\int_0^\infty t^{s-1}\Bigl\{\sum_{n>0}\mathrm{erfc}(\mu_n\sqrt{t})\Bigr\}\, dt = \frac{\Gamma(s + 1/2)}{4s} \cdot \frac{1}{2}\zeta_{B^2}(s) = \frac{\Gamma(s + 1/2)}{8s}\zeta_{B^2}(s),$$
and the expression on the right side has a meromorphic extension to the whole complex plane, with a simple pole at $s = 0$ with residuum
$$\mathrm{Res}_{s=0}\,\frac{\Gamma(s + 1/2)}{8s}\zeta_{B^2}(s) = \frac{\sqrt{\pi}}{8}\,a_{\dim(Y)}(B^2).$$
We have shown that the cylinder contribution to the trace integral $\int_0^\infty t^{s-1}\,\mathrm{Tr}\, e^{-tD_{\Pi_\sigma}^2}\, dt$ has a meromorphic extension to the whole complex plane with an isolated simple pole at $s = 0$, which ends the proof of Proposition 3.1.

References
1. Atiyah, M.F., Patodi, V.K., and Singer, I.M.: Spectral asymmetry and Riemannian geometry. I, Math. Proc. Cambridge Phil. Soc. 77, 43–69 (1975)
2. Bismut, J.-M. and Freed, D.S.: The analysis of elliptic families. II. Dirac operators, eta invariants, and the holonomy theorem. Commun. Math. Phys. 107, 103–163 (1986)
3. Booß–Bavnbek, B., Morchio, G., Strocchi, F., and Wojciechowski, K.P.: Grassmannian and chiral anomaly. J. Geom. Phys. 22, 219–244 (1997)
4. Booß–Bavnbek, B., Scott, S.G., and Wojciechowski, K.P.: The ζ-determinant and C-determinant on the Grassmannian in dimension one. Lett. Math. Phys. 45, 353–362 (1998)
5. Booß–Bavnbek, B., and Wojciechowski, K.P.: Pseudo-differential projections and the topology of certain spaces of elliptic boundary value problems. Commun. Math. Phys. 121, 1–9 (1989)
6. Booß–Bavnbek, B., and Wojciechowski, K.P.: Elliptic Boundary Problems for Dirac Operators. Boston: Birkhäuser, 1993
7. Booß–Bavnbek, B., and Wojciechowski, K.P.: Grassmannian and Boundary Contribution to the ζ-Determinant: Introduction into the 4-dimensional case. Preprint (1996)
8. Brüning, J. and Lesch, M.: On the η-invariant of certain non-local boundary problems. Preprint (1996)
9. Bunke, U.: On the glueing problem for the η-invariant. J. Diff. Geom. 41/2, 397–448 (1995)
10. Cheeger, J.: Spectral geometry of singular Riemannian spaces. J. Diff. Geom. 26, 575–657 (1983)
11. Cheeger, J.: η-invariants, the adiabatic approximation and conical singularities. J. Diff. Geom. 18, 175–221 (1987)
12. Dai, X., and Freed, D.: η-invariants and determinant lines. J. Math. Phys. 35, 5155–5195 (1994)
13. Douglas, R.G., and Wojciechowski, K.P.: Adiabatic limits of the η-invariants. The odd-dimensional Atiyah–Patodi–Singer problem. Commun. Math. Phys. 142, 139–168 (1991)
14. Gilkey, P.B.: Invariance Theory, the Heat Equation, and the Atiyah–Singer Index Theory. (Second Edition), Boca Raton, Florida: CRC Press, 1995
15. Grubb, G.: Heat operator trace expansion and index for general Atiyah–Patodi–Singer problem. Comm. Partial Diff. Eq. 17, 2031–2077 (1992)
16. Grubb, G. and Seeley, R.T.: Weakly parametric pseudodifferential operators and Atiyah–Patodi–Singer boundary problems. Invent. Math. 121, 481–529 (1995)
17. Grubb, G. and Seeley, R.T.: Zeta and eta functions for Atiyah–Patodi–Singer operators. J. Geom. Anal. 6, 31–77 (1996)
18. Grubb, G.: Preprint (1997)
19. Hassell, A., Mazzeo, M. and Melrose, R.B.: A signature formula for manifolds with corners of codimension 2. Topology 36, 1055–1075 (1997)
20. Klimek, S., and Wojciechowski, K.P.: Adiabatic cobordism theorems for analytic torsion and η-invariant. J. Funct. Anal. 136, 269–293 (1996)
21. Lesch, M., and Wojciechowski, K.P.: On the η-invariant of generalized Atiyah–Patodi–Singer problems. Illinois J. Math. 40, 30–46 (1996)
22. Mazzeo, R.R. and Melrose, R.B.: Analytic surgery and the η-invariant. GAFA 5, 14–75 (1995)
23. Müller, W.: Eta invariants and manifolds with boundary. J. Diff. Geom. 40, 311–377
24. Müller, W.: On the index of Dirac operator on manifolds with corners of codimension two. I, J. Diff. Geom. 44, 97–177
25. Piazza, P.: Determinant bundles, manifolds with boundary and surgery. Commun. Math. Phys. 178, 597–626
26. Reed, M., and Simon, B.: Methods of Modern Mathematical Physics, vol. II. New York: Academic Press, 1975
27. Ray, D., and Singer, I.M.: R-torsion and the Laplacian on Riemannian manifolds. Adv. Math. 7, 145–210 (1971)
28. Scott, S.G.: Determinants of Dirac boundary value problems over odd-dimensional manifolds. Commun. Math. Phys. 173, 43–76 (1995)
29. Scott, S.G., and Wojciechowski, K.P.: Determinants, Grassmannians and Elliptic Boundary Problems for the Dirac Operator. Lett. Math. Phys. 40, 135–145 (1997)
30. Scott, S.G., and Wojciechowski, K.P.: ζ-determinant and the Quillen determinant on the Grassmannian of elliptic self-adjoint boundary conditions. C. R. Acad. Sci. Paris Sér. Math. To appear (1999)
31. Scott, S.G., and Wojciechowski, K.P.: The ζ-determinant and Quillen determinant for a Dirac operator on a manifold with boundary. Preprint (1998)
32. Singer, I.M.: Families of Dirac operators with applications to physics. Astérisque, hors série. 323–340 (1985)
33. Wojciechowski, K.P.: The additivity of the η-invariant: The case of an invertible tangential operator. Houston J. Math. 20, 603–621 (1994)
34. Wojciechowski, K.P.: The additivity of the η-invariant. The case of a singular tangential operator. Commun. Math. Phys. 169, 315–327 (1995)
35. Wojciechowski, K.P.: ζ-determinant, spectral asymmetry, and total symbol of elliptic operators. Preprint (1997)

Communicated by A. Jaffe
Commun. Math. Phys. 201, 445 – 460 (1999)
Communications in
Mathematical Physics
On the Stability of the Relativistic Electron-Positron Field*
Volker Bach1, Jean-Marie Barbaroux2, Bernard Helffer3, Heinz Siedentop2
1 Fachbereich Mathematik, Technische Universität Berlin, D-10623 Berlin, Germany. E-mail: [email protected]
2 Lehrstuhl für Mathematik I, Universität Regensburg, D-93040 Regensburg, Germany. E-mail: [email protected]; [email protected]
3 Département de Mathématiques, Bâtiment 425, Université Paris-Sud, F-91405 Orsay Cédex, France. E-mail: [email protected]
Received: 27 April 1998 / Accepted: 3 September 1998
Abstract: We study the energy of relativistic electrons and positrons interacting through the second quantized Coulomb interaction and a self-generated magnetic field. As states we allow generalized Hartree–Fock states in the Fock space. Our main result is the assertion of positivity of the energy, if the atomic numbers and the fine structure constant are not too big. We also discuss the dependence of the result on the dressing of the electrons (choice of subspaces defining the electrons).
1. Introduction
Relativistic particles should be described by quantum electrodynamics. This theory dates back to the very early days of quantum mechanics. It can be traced back to Born and Jordan's [5] famous paper clarifying Heisenberg's matrix mechanics, which was followed by many important works. But despite these efforts it is still unclear whether the energy in quantum electrodynamics is bounded from below; the existence of a vacuum state, i.e., a state with lowest but finite energy (ground state), remains elusive. In view of this unsatisfactory situation various approximate models have been studied. In this context one can mention the work of Conlon [8], Fefferman [9], and Lieb et al. [13, 14, 15, 16]. All of these models assume that the Hamiltonian is particle number conserving. In fact the positron number is assumed to be zero and the electron number enters only as a parameter. There is, however, an interesting observation of Chaix, Iracane, and Lions [6, 7] that concerns an approximate quantum electrodynamics. They consider the second quantized relativistic electron-positron field interacting via the second quantized electric field, disregarding any contribution from the magnetic field and with no external field present. They argue that the energy of generalized Hartree–Fock states is nonnegative provided the fine structure constant is less than 4/π. Here, we prove and extend this result in various directions.
The paper is organized as follows: Sect. 2 introduces the basic terminology of the subject and allows us to formulate our result properly. Section 3 contains the above mentioned stability result of Chaix et al. [7]. To describe atoms and molecules one also needs to include the external potentials of the nuclei. Section 4 extends the result to the atomic case. Besides a constraint on α we also obtain a constraint on Zα. In Sect. 5 we are able to include the classical magnetic field generated by the particles. As usual it destabilizes the problem further. The result is condensed in Fig. 1, where we obtain a critical curve which is the boundary of the region where we obtain positivity of the energy. Since quantum electrodynamics depends on the definition of the electron space, it is interesting to study the consequences for the lowest possible energy. This is done in Sect. 6. The result is that taking the positive spectral subspace of the one-particle Dirac operator including the external field is the optimal choice.
* © 1999 The authors. Reproduction of this article for non-commercial purposes by any means is permitted.
2. Definition of the Problem
For the readers' convenience and to state our results precisely, we introduce some notation, following mainly Thaller [19] and partially [1] and [10].

Dirac operator. The Dirac operator of a particle of charge $-e$ in the electric field $e\nabla V$ and magnetic field $\nabla\times A$ is
\[ D^{A,V} = \boldsymbol{\alpha}\cdot\Bigl(\tfrac{1}{i}\nabla + eA\Bigr) + m\beta + e^2 V \]
acting in $\mathfrak{H} = L^2(\mathbb{R}^3)\otimes\mathbb{C}^4$. The $4\times 4$ matrices $\boldsymbol{\alpha}$ and $\beta$ are the four Dirac matrices,
\[ \beta = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix}, \]
in standard representation, where $\boldsymbol{\sigma} = (\sigma_1, \sigma_2, \sigma_3)$ are the three Pauli matrices
\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]
Although the operator can be defined for more general vector potentials $A$, it suffices for our purpose to assume $B = \nabla\times A$ to be square-integrable, which we will do henceforth. (Physically $B$ is the magnetic induction and $e\nabla V$ the electric field.) We are primarily interested in describing atoms, i.e., we pick $V(x) = -Z/|x|$, where $Z$ is the atomic number of the atom under consideration. Again, we remark that our results transcribe easily to more general potentials. We also set $\alpha = e^2$, the Sommerfeld fine structure constant. We assume that $D^{A,V}$ is self-adjoint with form domain $H^{1/2}(\mathbb{R}^3)\otimes\mathbb{C}^4$. This assumption is fulfilled, e.g., if $A = 0$ and $V = -Z/|x|$ with $\alpha Z < 2/\pi$.

Projections. The spectral projection on the positive subspace of $D^{A,V}$ is denoted by $P_+^{A,V} := \chi_{[0,\infty)}(D^{A,V})$. We also set $P_-^{A,V} := 1 - P_+^{A,V}$ and use the following notation:
\[ |D^{A,V}| = P_+^{A,V} D^{A,V} - P_-^{A,V} D^{A,V}. \]

Fock space. Given a closed subspace $\mathfrak{H}_+$ of $\mathfrak{H}$, we define the one-particle electron subspace as $\mathfrak{F}_+^{(1)} := \mathfrak{H}_+$. The one-particle positron subspace is $\mathfrak{F}_-^{(1)} = C(\mathfrak{H}_+^\perp)$, where $C$ is the charge conjugation operator defined as $(C\psi)(x) = i\beta\alpha_2\overline{\psi(x)}$. The projections onto $\mathfrak{H}_+$ and $\mathfrak{H}_- := (\mathfrak{H}_+)^\perp$ are denoted by $P_{\mathfrak{H}_+}$ and $P_{\mathfrak{H}_-}$. The $n$-particle electron subspace is
\[ \mathfrak{F}_+^{(n)} := \bigwedge_{\nu=1}^{n} \mathfrak{H}_+, \]
and the $m$-particle positron subspace is
\[ \mathfrak{F}_-^{(m)} := \bigwedge_{\nu=1}^{m} C\mathfrak{H}_-. \]
We also set $\mathfrak{F}_+^{(0)} := \mathfrak{F}_-^{(0)} := \mathbb{C}$. The total Fock space is
\[ \mathfrak{F} := \bigoplus_{n,m=0}^{\infty} \mathfrak{F}_+^{(n)}\otimes\mathfrak{F}_-^{(m)}. \tag{1} \]
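As a quick sanity check on the standard representation of the Dirac matrices introduced earlier in this section, the following sketch verifies numerically that the three matrices α_k and β anticommute pairwise and square to the identity. It is only an illustration: the helper names (`matmul`, `anticommutator`, `off_diagonal_block`, `clifford_relations_hold`) are hypothetical and not part of the paper, and plain Python lists are used so that no third-party package is assumed.

```python
# Illustrative check (helper names are assumptions, not from the paper) of the
# relations {alpha_j, alpha_k} = 2 delta_jk, {alpha_k, beta} = 0, beta^2 = 1
# for the Dirac matrices in the standard representation.

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def anticommutator(a, b):
    """Return ab + ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + y for x, y in zip(row_ab, row_ba)]
            for row_ab, row_ba in zip(ab, ba)]

def identity(n, scale=1):
    """Return scale times the n x n identity matrix."""
    return [[scale if i == j else 0 for j in range(n)] for i in range(n)]

# Pauli matrices sigma_1, sigma_2, sigma_3.
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def off_diagonal_block(s):
    """Return the 4x4 matrix [[0, s], [s, 0]] for a 2x2 block s."""
    return [
        [0, 0, s[0][0], s[0][1]],
        [0, 0, s[1][0], s[1][1]],
        [s[0][0], s[0][1], 0, 0],
        [s[1][0], s[1][1], 0, 0],
    ]

ALPHA = [off_diagonal_block(s) for s in PAULI]
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def clifford_relations_hold():
    """True if all anticommutation relations listed above are satisfied."""
    for j in range(3):
        for k in range(3):
            expected = identity(4, 2 if j == k else 0)
            if anticommutator(ALPHA[j], ALPHA[k]) != expected:
                return False
        if anticommutator(ALPHA[j], BETA) != identity(4, 0):
            return False
    return matmul(BETA, BETA) == identity(4)
```

These relations are what make the square of the free Dirac operator equal to −Δ + m² on each spinor component, a fact used implicitly in the proof of Lemma 1.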
Field operators. We denote by $x = (\mathbf{x}, s)$ an element of $G := \mathbb{R}^3\times\{1,2,3,4\}$. Assume $f, g \in \mathfrak{H}$. First, we define in $\mathfrak{F}$ the electron annihilation operator $a(f)$. On the subspace $\mathfrak{F}^{(n+1,m)}$ it acts as
\[ (a(f)\psi)^{(n,m)}(x_1,\dots,x_n;y_1,\dots,y_m) = \sqrt{n+1}\int_G dx\,\overline{(P_{\mathfrak{H}_+}f)(x)}\,\psi^{(n+1,m)}(x,x_1,\dots,x_n;y_1,\dots,y_m), \tag{2} \]
where $dx$ denotes the product measure (Lebesgue measure in the first factor and counting measure in the second factor) on $G$. By linearity it extends to a bounded operator on $\mathfrak{F}$. Note that the map $f\mapsto a(f)$ is anti-linear. Next, we introduce the electron creation operator by defining it first on $\mathfrak{F}^{(n-1,m)}$,
\[ (a^*(f)\psi)^{(n,m)} = \frac{1}{\sqrt{n}}\sum_{j=1}^{n}(-1)^{j+1}(P_{\mathfrak{H}_+}f)(x_j)\,\psi^{(n-1,m)}(x_1,\dots,\hat{x}_j,\dots,x_n;y_1,\dots,y_m) \tag{3} \]
(with the understanding that a negative superscript indicates the zero element). As before it extends by linearity to a bounded operator on $\mathfrak{F}$. The operator $a^*(f)$ is the adjoint of $a(f)$ and the map $f\mapsto a^*(f)$ is linear. We also have the canonical anti-commutation relations
\[ \{a(f),a(g)\} = \{a^*(f),a^*(g)\} = 0, \tag{4} \]
\[ \{a(f),a^*(g)\} = (f, P_{\mathfrak{H}_+}g). \tag{5} \]
We define the positron annihilation operator by
\[ (b(g)\psi)^{(n,m)}(x_1,\dots,x_n;y_1,\dots,y_m) = (-1)^n\sqrt{m+1}\int_G dy\,\overline{[CP_{\mathfrak{H}_-}g](y)}\,\psi^{(n,m+1)}(x_1,\dots,x_n;y,y_1,\dots,y_m), \tag{6} \]
and the positron creation operator by
\[ (b^*(g)\psi)^{(n,m)}(x_1,\dots,x_n;y_1,\dots,y_m) = \frac{(-1)^n}{\sqrt{m}}\sum_{k=1}^{m}(-1)^{k+1}[CP_{\mathfrak{H}_-}g](y_k)\,\psi^{(n,m-1)}(x_1,\dots,x_n;y_1,\dots,\hat{y}_k,\dots,y_m). \tag{7} \]
We observe that, due to the anti-linearity of $C$, the map $g\mapsto b(g)$ is linear and the map $g\mapsto b^*(g)$ is anti-linear. Similarly to (4) and (5), we have for these operators the anti-commutation relations
\[ \{b(f),b(g)\} = \{b^*(f),b^*(g)\} = 0, \qquad \{b^*(f),b(g)\} = (f, P_{\mathfrak{H}_-}g). \]
Note that, for all $f\in\mathfrak{H}$, $a(f)$ and $b(f)$ annihilate the vacuum vector $\Omega := 1$. This property characterizes the vacuum vector uniquely up to phase factors. We then define the field operator $\Psi(f)$ on the Fock space $\mathfrak{F}$ by
\[ \Psi(f) = a(f) + b^*(f). \]
This field operator is bounded on $\mathfrak{F}$ and depends anti-linearly on $f$. The adjoint operator $\Psi^*(f) = a^*(f) + b(f)$ is also bounded and depends linearly on $f$.

Matrix elements. We fix an orthonormal basis $\{\dots, e_{-2}, e_{-1}, e_0, e_1, \dots\}$ of $\mathfrak{H}$ where vectors with negative indices are in $\mathfrak{H}_-$ and vectors with nonnegative indices are in $\mathfrak{H}_+$. We also assume that all basis vectors $e_i$ are in $H^{1/2}(\mathbb{R}^3)\otimes\mathbb{C}^4$. In addition we set $a_i := a(e_i)$, $a_i^* := a^*(e_i)$, $b_i := b(e_i)$, $b_i^* := b^*(e_i)$, $\Psi_i := a_i + b_i^*$ and $\Psi_i^* := a_i^* + b_i$. By $(D^{A,V})_{i,j}$ we denote the matrix elements of the Dirac operator, i.e., $(D^{A,V})_{i,j} = (e_i, D^{A,V}e_j)$, and by $W_{i,j;k,l}$ we denote the matrix elements of the two-body Coulomb potential $W(x,y) = 1/|\mathbf{x}-\mathbf{y}|$, i.e.,
\[ W_{i,j;k,l} = (e_i\otimes e_j, W e_k\otimes e_l) = \int_G dx\int_G dy\,\frac{\overline{e_i(x)}\,\overline{e_j(y)}\,e_k(x)\,e_l(y)}{|\mathbf{x}-\mathbf{y}|}. \]
Note that $W_{i,j;k,l}\in\mathbb{C}$ under our assumptions on the basis $\{e_i\}_{i\in\mathbb{Z}}$.

Energy. We define $\mathcal{D}$ to be the set of all states on $B(\mathfrak{F})$, i.e., all positive continuous linear forms $\rho$ on the set of bounded operators $B(\mathfrak{F})$ with $\rho(1) = 1$, which also have finite kinetic energy, i.e., $\sum_{i,j\in\mathbb{Z}}(D^{0,0})_{i,j}\,\rho(:\Psi_i^*\Psi_j:) < \infty$, where colons denote normal ordering (Thaller [19], p. 285). Then, the energy $\mathcal{E}_{A,V,\alpha}:\mathcal{D}\to\mathbb{R}$ in the state $\rho$ is
\[ \mathcal{E}_{A,V,\alpha}(\rho) = \sum_{i,j\in\mathbb{Z}}(D^{A,V})_{i,j}\,\rho(:\Psi_i^*\Psi_j:) + \frac{\alpha}{2}\sum_{i,j,k,l\in\mathbb{Z}}W_{i,j;k,l}\,\rho(:\Psi_i^*\Psi_j^*\Psi_l\Psi_k:) + \frac{1}{8\pi}\int_{\mathbb{R}^3}B^2, \tag{8} \]
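where the Sommerfeld fine structure constant $\alpha$ has the physical value of about 1/137.0359895 [2]. The main purpose of this paper is to estimate when $\mathcal{E}_{A,V,\alpha}$ is bounded from below or not as a function of $\alpha$ and $Z$.

One-Particle Density Matrix. A trace class operator $\Gamma$ on $\mathfrak{H}\times\mathfrak{H}$ is called a one-particle density operator (1-pdm), if it has the following properties:
• $\Gamma = \Gamma^*$ and $-1\le\Gamma\le 1$.
• \[ \Gamma = \begin{pmatrix} \gamma & \upsilon \\ \upsilon^* & -\bar{\gamma} \end{pmatrix} \tag{9} \]
with
\[ \gamma^* = \gamma \quad\text{and}\quad \upsilon^t = -\upsilon, \tag{10} \]
where the superscript $t$ refers to transposition, i.e., given our basis fixed initially, the matrix elements of $B^t$ are $(B^t)_{i,j} := B_{j,i}$.
We note that the Hilbert space $\mathfrak{H}$ is the orthogonal sum of $\mathfrak{H}_+$ and $\mathfrak{H}_-$. This means that we can write
\[ \Gamma = \begin{pmatrix} \gamma_{++} & \gamma_{+-} & \upsilon_{++} & \upsilon_{+-} \\ \gamma_{-+} & \gamma_{--} & \upsilon_{-+} & \upsilon_{--} \\ \upsilon_{++}^* & \upsilon_{-+}^* & -\bar{\gamma}_{++} & -\bar{\gamma}_{+-} \\ \upsilon_{+-}^* & \upsilon_{--}^* & -\bar{\gamma}_{-+} & -\bar{\gamma}_{--} \end{pmatrix} \]
with $\gamma_{++} := P_{\mathfrak{H}_+}\gamma P_{\mathfrak{H}_+}$, $\gamma_{+-} := P_{\mathfrak{H}_+}\gamma P_{\mathfrak{H}_-}$, $\gamma_{-+} := P_{\mathfrak{H}_-}\gamma P_{\mathfrak{H}_+} = \gamma_{+-}^*$, and $\gamma_{--} := P_{\mathfrak{H}_-}\gamma P_{\mathfrak{H}_-}$ appropriately restricted. Similarly $\upsilon_{++} := P_{\mathfrak{H}_+}\upsilon P_{\mathfrak{H}_+}$, $\upsilon_{+-} := P_{\mathfrak{H}_+}\upsilon P_{\mathfrak{H}_-}$, $\upsilon_{-+} := P_{\mathfrak{H}_-}\upsilon P_{\mathfrak{H}_+} = -\upsilon_{+-}^t$, and $\upsilon_{--} := P_{\mathfrak{H}_-}\upsilon P_{\mathfrak{H}_-}$, also appropriately restricted.
To conclude this item we note that each state $\rho$ defines a 1-pdm $\Gamma_\rho$ in a natural way through the formula
\[ (h,\Gamma_\rho g) = \rho\bigl(:[\Psi^*(g_1) + \Psi(\tilde{g}_2)][\Psi(h_1) + \Psi^*(\tilde{h}_2)]:\bigr), \tag{11} \]
where $h := (h_1,h_2)\in\mathfrak{H}^2$, $g := (g_1,g_2)\in\mathfrak{H}^2$ and, given $f = \sum_{k\in\mathbb{Z}}\lambda_k e_k$, we define $\tilde{f} = \sum_{k\in\mathbb{Z}}\bar{\lambda}_k e_k$. Note that the conjugation defined this way traces back to the fact that the density matrix is given with respect to a fixed basis (see also [1], beginning of Sect. 2.b). The matrix elements are in this case $\gamma_{i,j} = \rho(:\Psi_j^*\Psi_i:)$, $(\gamma_{++})_{i,j} = \rho(a_j^*a_i)$, $(\gamma_{+-})_{i,j} = \rho(b_ja_i)$, $(\gamma_{--})_{i,j} = -\rho(b_i^*b_j)$ and $\upsilon_{i,j} = \rho(:\Psi_j\Psi_i:)$, $(\upsilon_{++})_{i,j} = \rho(a_ja_i)$, $(\upsilon_{+-})_{i,j} = \rho(b_j^*a_i)$, $(\upsilon_{--})_{i,j} = \rho(b_j^*b_i^*)$.
In order to characterize all 1-pdm for which there also exist corresponding states (representability), it is natural to introduce unrenormalized one-particle density matrices
\[ \Gamma_{ur} := \begin{pmatrix} \gamma + P_{\mathfrak{H}_-} & \upsilon \\ \upsilon^* & 1 - \overline{(\gamma + P_{\mathfrak{H}_-})} \end{pmatrix}. \]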
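Before we go on, we prove that any one-particle density matrix $\Gamma$ of the form (9) with the additional requirement that $0\le\Gamma_{ur}\le 1$ has a quasi-free preimage among the states, i.e., that there exists a quasi-free state such that $\Gamma$ is its one-particle density matrix. Equivalently, we may show that the unrenormalized one-particle density matrix
\[ \Gamma_{ur} = \begin{pmatrix} \gamma_{++} & \gamma_{+-} & \upsilon_{++} & \upsilon_{+-} \\ \gamma_{-+} & 1+\gamma_{--} & \upsilon_{-+} & \upsilon_{--} \\ \upsilon_{++}^* & \upsilon_{-+}^* & 1-\bar{\gamma}_{++} & -\bar{\gamma}_{+-} \\ \upsilon_{+-}^* & \upsilon_{--}^* & -\bar{\gamma}_{-+} & -\bar{\gamma}_{--} \end{pmatrix}, \tag{12} \]
represented as a $4\times 4$ matrix of operators on $\mathfrak{H}_+\oplus\mathfrak{H}_-\oplus\mathfrak{H}_+\oplus\mathfrak{H}_-$, is the one-particle density matrix of some quasi-free state. To this end, we conjugate $\Gamma_{ur}$ by the unitary operator
\[ W = W^* = W^{-1} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \tag{13} \]
which yields
\[ W\Gamma_{ur}W^* = \begin{pmatrix} \gamma_{++} & \upsilon_{+-} & \upsilon_{++} & \gamma_{+-} \\ \upsilon_{+-}^* & -\bar{\gamma}_{--} & -\bar{\gamma}_{-+} & \upsilon_{--}^* \\ \upsilon_{++}^* & -\bar{\gamma}_{+-} & 1-\bar{\gamma}_{++} & \upsilon_{-+}^* \\ \gamma_{-+} & \upsilon_{--} & \upsilon_{-+} & 1+\gamma_{--} \end{pmatrix} = \begin{pmatrix} \gamma_{ur,W} & \upsilon_{ur,W} \\ \upsilon_{ur,W}^* & 1-\bar{\gamma}_{ur,W} \end{pmatrix}, \tag{14} \]
where
\[ \gamma_{ur,W} := \begin{pmatrix} \gamma_{++} & \upsilon_{+-} \\ \upsilon_{+-}^* & -\bar{\gamma}_{--} \end{pmatrix} \quad\text{and}\quad \upsilon_{ur,W} := \begin{pmatrix} \upsilon_{++} & \gamma_{+-} \\ -\bar{\gamma}_{-+} & \upsilon_{--}^* \end{pmatrix}. \tag{15} \]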
Note that $\gamma_{ur,W} = \gamma_{ur,W}^*$ and $\upsilon_{ur,W}^* = -\bar{\upsilon}_{ur,W}$, since $\gamma = \gamma^*$ and $\upsilon^* = -\bar{\upsilon}$. Further note that $0\le W\Gamma_{ur}W^*\le 1$ because $0\le\Gamma_{ur}\le 1$. Since $\Gamma$ is assumed to be trace class, so is $\gamma_{ur,W}$. We may therefore apply [1, Theorem 2.3] and conclude the existence of a quasi-free state $\rho_W$ whose one-particle density matrix equals $W\Gamma_{ur}W^*$. (In passing note that this state is a linear form on the Fock space over $\mathfrak{H}$, the whole Hilbert space, with corresponding creation operators $\dots, c_{-2}^* = b_{-2}, c_{-1}^* = b_{-1}, c_0^* := a_0^*, c_1^* = a_1^*, c_2^* = a_2^*, \dots$ and annihilation operators $\dots, c_{-2} = b_{-2}^*, c_{-1} = b_{-1}^*, c_0 := a_0, c_1 = a_1, c_2 = a_2, \dots$, given our basis $\dots, e_{-2}, e_{-1}, e_0, e_1, e_2, \dots$. The property of being a quasi-free state, however, is independent of this representation.) Now, although $W$ has the form [1, Eq. (2a.9)], it does not correspond to a Bogoliubov transformation because the projection $P_{\mathfrak{H}_-}$ onto $\mathfrak{H}_-$ is not Hilbert-Schmidt. Nevertheless it represents the following linear transformation on the CAR,
\[ a_i^*\mapsto Q(a_i^*) := a_i^*, \quad a_i\mapsto Q(a_i) := a_i, \quad \forall i\ge 0; \qquad b_i^*\mapsto Q(b_i^*) := b_i, \quad b_i\mapsto Q(b_i) := b_i^*, \quad \forall i<0, \tag{16} \]
and, as such, it leaves the set of quasi-free states invariant. (This generalizes the remark made between Eqs. (2a.12) and (2a.13) in [1].) Thus, there exists a quasi-free state $\rho_{ur}$ whose one-particle density matrix is $\Gamma_{ur}$. More concretely, we obtain this state from $\rho_W$ by setting
\[ \rho_{ur}(d_id_j) := \rho_W\bigl(Q(d_i)\,Q(d_j)\bigr), \tag{17} \]
where $d_i\in\{a_i,a_i^*\}$ for $i\ge 0$, and $d_i\in\{b_i,b_i^*\}$ for $i<0$. This finishes the existence proof of a quasi-free state for a given 1-pdm.
We go on by defining, for any given $\rho$,
\[ \gamma' = \begin{pmatrix} \gamma_{++} & \gamma_{+-} \\ \gamma_{-+} & 1+\gamma_{--} \end{pmatrix}, \]
or equivalently, for $g,h\in\mathfrak{H}$, $(h,\gamma'g) = \rho\bigl(\Psi^*(g)\Psi(h)\bigr)$. The canonical anti-commutation relations imply $0\le\gamma'\le 1$. Thus, using $\gamma_{-+} = \gamma_{+-}^*$, we get
\[ \gamma_{++}^2 + \gamma_{+-}\gamma_{-+} \le \gamma_{++}, \tag{18} \]
\[ \gamma_{-+}\gamma_{+-} + \gamma_{--}^2 \le -\gamma_{--}. \tag{19} \]
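The step from 0 ≤ γ′ ≤ 1 to (18) and (19) can be written out as follows. This is a sketch of the standard argument rather than a quotation from the paper: a self-adjoint operator with spectrum in [0, 1] satisfies γ′² ≤ γ′, and an operator inequality passes to the diagonal blocks with respect to the decomposition into the electron and positron subspaces.

```latex
\begin{align*}
(\gamma'^2)_{++} &= \gamma_{++}^2 + \gamma_{+-}\gamma_{-+} \;\le\; \gamma_{++}, \\
(\gamma'^2)_{--} &= \gamma_{-+}\gamma_{+-} + (1+\gamma_{--})^2 \;\le\; 1 + \gamma_{--} \\
 &\Longleftrightarrow\quad
 \gamma_{-+}\gamma_{+-} + \gamma_{--}^2 + 2\gamma_{--} + 1 \;\le\; 1 + \gamma_{--} \\
 &\Longleftrightarrow\quad
 \gamma_{-+}\gamma_{+-} + \gamma_{--}^2 \;\le\; -\gamma_{--}.
\end{align*}
```

Since the left-hand sides of (18) and (19) are nonnegative operators, these inequalities also show that γ₊₊ ≥ 0 and γ₋₋ ≤ 0.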
Hartree–Fock states. The set of generalized Hartree–Fock states (or quasi-free states with finite particle number) is the set of states $\rho$ that fulfill
i) For all finite sequences of operators $d_1, d_2, \dots, d_{2K}$, where $d_i$ stands for $a(f)$, $a^*(f)$, $b(f)$, or $b^*(f)$, we have $\rho(d_1d_2\cdots d_{2K-1}) = 0$ and
\[ \rho(d_1d_2\cdots d_{2K}) = \sum_{\sigma\in S}\mathrm{sgn}(\sigma)\,\rho(d_{\sigma(1)}d_{\sigma(2)})\cdots\rho(d_{\sigma(2K-1)}d_{\sigma(2K)}), \]
where $S$ is the set of permutations $\sigma$ such that $\sigma(1)<\sigma(3)<\dots<\sigma(2K-1)$ and $\sigma(2i-1)<\sigma(2i)$ for all $1\le i\le K$. This implies in particular
\[ \rho(d_1d_2d_3d_4) = \rho(d_1d_2)\rho(d_3d_4) - \rho(d_1d_3)\rho(d_2d_4) + \rho(d_1d_4)\rho(d_2d_3). \tag{20} \]
ii) The state $\rho$ has a finite particle number, i.e., if $N := \sum_{i\in\mathbb{Z}}(a_i^*a_i + b_i^*b_i)$ denotes the particle number operator, we have $\rho(N)<\infty$, or equivalently, written in terms of the one-particle density matrix, $\mathrm{tr}(\gamma_{++} - \gamma_{--})<\infty$.
We write $\mathcal{D}_{HF}$ for the set of all generalized Hartree–Fock states $\rho$ with finite kinetic energy, i.e., $\sum_{i,j\in\mathbb{Z}}(D^{0,0})_{i,j}\,\rho(:\Psi_i^*\Psi_j:)$ is absolutely convergent.

Hartree–Fock functional. By $X$ we denote the set of all one-particle density matrices with finite kinetic energy. The Hartree–Fock functional $\mathcal{E}^{HF}_{A,V,\alpha}: X\to\mathbb{R}$ is defined by
\[ \mathcal{E}^{HF}_{A,V,\alpha}(\Gamma) = \mathrm{tr}(D^{A,V}\gamma) + \frac{\alpha}{2}\int dx\,dy\,\frac{|\upsilon(x,y)|^2}{|\mathbf{x}-\mathbf{y}|} + \alpha D(\rho_\gamma,\rho_\gamma) - \frac{\alpha}{2}\int dx\,dy\,\frac{|\gamma(x,y)|^2}{|\mathbf{x}-\mathbf{y}|} + \frac{1}{8\pi}\int_{\mathbb{R}^3}B^2, \tag{21} \]
where $D(f,g) := (1/2)\int_{\mathbb{R}^6}d\mathbf{x}\,d\mathbf{y}\,f(\mathbf{x})g(\mathbf{y})|\mathbf{x}-\mathbf{y}|^{-1}$ is the Coulomb scalar product, $\upsilon(x,y) := \sum_{i,j\in\mathbb{Z}}(e_i,\upsilon e_j)\,e_i(x)e_j(y)$, $\gamma(x,y) := \sum_{i,j\in\mathbb{Z}}(e_i,\gamma e_j)\,e_i(x)\overline{e_j(y)}$ (note the difference to $\upsilon$), and $\rho_\gamma(\mathbf{x}) := \sum_{\sigma=1}^{4}\gamma(x,x)$ with $x = (\mathbf{x},\sigma)$. In Sect. 3, Eq. (27), we will be able to identify it as the energy of Hartree–Fock states written in terms of the one-particle density matrix. Since the representability of any unrenormalized 1-pdm holds because of the above remark, the infimum over the one-particle density matrices with finite kinetic energy equals the infimum of $\mathcal{E}_{A,V,\alpha}$ over $\mathcal{D}_{HF}$.
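The combinatorics behind property i) of Hartree–Fock states, and in particular behind Eq. (20), can be made concrete with a short enumeration. The sketch below lists, for 2K operators, every permutation in the set S together with its sign; the function names `permutation_sign` and `wick_pairings` are hypothetical helpers introduced only for this illustration.

```python
from itertools import permutations

def permutation_sign(perm):
    """Sign of a permutation given as a tuple of 0-based indices."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def wick_pairings(two_k):
    """List (sign, pairs) for all permutations sigma in the set S.

    S consists of the permutations with sigma(1) < sigma(3) < ... and
    sigma(2i-1) < sigma(2i); pairs are reported with 1-based labels.
    """
    result = []
    for perm in permutations(range(two_k)):
        firsts = perm[0::2]
        ordered_firsts = all(
            firsts[i] < firsts[i + 1] for i in range(len(firsts) - 1)
        )
        ordered_pairs = all(
            perm[2 * i] < perm[2 * i + 1] for i in range(two_k // 2)
        )
        if ordered_firsts and ordered_pairs:
            pairs = tuple(
                (perm[2 * i] + 1, perm[2 * i + 1] + 1)
                for i in range(two_k // 2)
            )
            result.append((permutation_sign(perm), pairs))
    return result
```

For four operators, `wick_pairings(4)` returns the three pairings (1,2)(3,4), (1,3)(2,4) and (1,4)(2,3) with signs +, −, +, which is the sign pattern of Eq. (20). The brute-force loop over all permutations is chosen for transparency, not efficiency; it is adequate for the small values of K relevant here.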
3. The Existence of the Hartree–Fock Vacuum Without External Potentials
We are now ready to state our first result, which has been claimed and argued for by Chaix et al. [7].

Theorem 1. Pick for the one-particle electron subspace $\mathfrak{H}_+ := P_+^{0,0}\mathfrak{H}$. Then, $\mathcal{E}_{0,0,\alpha}|_{\mathcal{D}_{HF}}$ is nonnegative for $\alpha\in[0,4/\pi]$.

3.1. Reduction to 1-pdm. First, we express the energy $\mathcal{E}_{0,0,\alpha}(\rho)$ in terms of the matrix elements of the 1-pdm $\Gamma_\rho$, where $\rho$ is a Hartree–Fock state. Expanding the quartic term in (8) yields the 48 summands
\begin{align*}
\rho(:\Psi_i^*\Psi_j^*\Psi_l\Psi_k:) ={}& \rho(a_i^*a_j^*)\rho(a_la_k) - \rho(a_i^*a_l)\rho(a_j^*a_k) + \rho(a_i^*a_k)\rho(a_j^*a_l) \\
& - \rho(a_i^*a_j^*)\rho(b_k^*a_l) + \rho(a_i^*b_k^*)\rho(a_j^*a_l) - \rho(a_i^*a_l)\rho(a_j^*b_k^*) \\
& + \rho(a_i^*a_j^*)\rho(b_l^*a_k) - \rho(a_i^*b_l^*)\rho(a_j^*a_k) + \rho(a_i^*a_k)\rho(a_j^*b_l^*) \\
& + \rho(a_i^*a_j^*)\rho(b_l^*b_k^*) - \rho(a_i^*b_l^*)\rho(a_j^*b_k^*) + \rho(a_i^*b_k^*)\rho(a_j^*b_l^*) \\
& + \rho(a_i^*b_j)\rho(a_la_k) - \rho(a_i^*a_l)\rho(b_ja_k) + \rho(a_i^*a_k)\rho(b_ja_l) \\
& + \rho(a_i^*b_k^*)\rho(b_ja_l) - \rho(a_i^*b_j)\rho(b_k^*a_l) + \rho(a_i^*a_l)\rho(b_k^*b_j) \\
& - \rho(a_i^*b_l^*)\rho(b_ja_k) + \rho(a_i^*b_j)\rho(b_l^*a_k) - \rho(a_i^*a_k)\rho(b_l^*b_j) \\
& + \rho(a_i^*b_l^*)\rho(b_k^*b_j) - \rho(a_i^*b_k^*)\rho(b_l^*b_j) + \rho(a_i^*b_j)\rho(b_l^*b_k^*) \\
& - \rho(a_j^*b_i)\rho(a_la_k) + \rho(a_j^*a_l)\rho(b_ia_k) - \rho(a_j^*a_k)\rho(b_ia_l) \\
& - \rho(a_j^*b_k^*)\rho(b_ia_l) + \rho(a_j^*b_i)\rho(b_k^*a_l) - \rho(a_j^*a_l)\rho(b_k^*b_i) \\
& + \rho(a_j^*b_l^*)\rho(b_ia_k) - \rho(a_j^*b_i)\rho(b_l^*a_k) + \rho(a_j^*a_k)\rho(b_l^*b_i) \\
& - \rho(a_j^*b_l^*)\rho(b_k^*b_i) + \rho(a_j^*b_k^*)\rho(b_l^*b_i) - \rho(a_j^*b_i)\rho(b_l^*b_k^*) \\
& + \rho(b_ib_j)\rho(a_la_k) - \rho(b_ia_l)\rho(b_ja_k) + \rho(b_ia_k)\rho(b_ja_l) \\
& - \rho(b_k^*b_i)\rho(b_ja_l) + \rho(b_k^*b_j)\rho(b_ia_l) - \rho(b_k^*a_l)\rho(b_ib_j) \\
& + \rho(b_l^*b_i)\rho(b_ja_k) - \rho(b_l^*b_j)\rho(b_ia_k) + \rho(b_l^*a_k)\rho(b_ib_j) \\
& + \rho(b_l^*b_k^*)\rho(b_ib_j) - \rho(b_l^*b_i)\rho(b_k^*b_j) + \rho(b_l^*b_j)\rho(b_k^*b_i). \tag{22}
\end{align*}
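After factoring some of the terms in (22), we obtain
\[ \mathcal{E}_{0,0,\alpha}(\rho) = T(\rho) + \frac{\alpha}{2}\bigl(W_P(\rho) + W_D(\rho) - W_X(\rho)\bigr), \tag{23} \]
where
\[ T(\rho) = \sum_{i,j\in\mathbb{Z}}(D^{0,0})_{i,j}\,\rho(:\Psi_i^*\Psi_j:) = \mathrm{tr}\bigl(D^{0,0}\gamma\bigr) = \mathrm{tr}\bigl[|D^{0,0}|(\gamma_{++} - \gamma_{--})\bigr] \]
is the kinetic energy in the state $\rho$. (The terms of the form $\rho(a_i^*b_j^*)$ and $\rho(b_ia_j)$ disappeared in the sum because $(D^{0,0})_{i,j} = (e_i, D^{0,0}e_j) = 0$ if $i\in\mathbb{Z}$ and $j\in\mathbb{Z}$ have opposite sign.) The pairing part of the potential is
\begin{align*}
W_P(\rho) :={}& \sum_{i,j,k,l\in\mathbb{Z}} W_{i,j;k,l}\,\bigl[\rho(a_i^*a_j^*) + \rho(b_ib_j) - \rho(a_j^*b_i) + \rho(a_i^*b_j)\bigr]\,\overline{\bigl[\rho(a_k^*a_l^*) + \rho(b_kb_l) - \rho(a_l^*b_k) + \rho(a_k^*b_l)\bigr]} \\
={}& \sum_{i,j,k,l\in\mathbb{Z}} W_{i,j;k,l}\,\overline{(\upsilon_{++} + \upsilon_{--} + \upsilon_{-+} + \upsilon_{+-})_{i,j}}\,(\upsilon_{++} + \upsilon_{--} + \upsilon_{-+} + \upsilon_{+-})_{k,l} \\
={}& \int dx\,dy\,\frac{|\upsilon(x,y)|^2}{|\mathbf{x}-\mathbf{y}|}. \tag{24}
\end{align*}
We note that none of the matrix elements of $\Gamma_\rho$ involved in this expression is charge-conserving. Furthermore,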
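\begin{align*}
W_D(\rho) :={}& \sum_{i,j,k,l\in\mathbb{Z}} W_{i,j;k,l}\,\bigl[\rho(a_i^*a_k) + \rho(a_i^*b_k^*) + \rho(b_ia_k) - \rho(b_k^*b_i)\bigr]\,\overline{\bigl[\rho(a_l^*a_j) + \rho(a_l^*b_j^*) + \rho(b_la_j) - \rho(b_j^*b_l)\bigr]} \\
={}& \sum_{i,j,k,l\in\mathbb{Z}} W_{i,j;k,l}\,(\gamma_{++} + \gamma_{-+} + \gamma_{+-} + \gamma_{--})_{k,i}\,\overline{(\gamma_{++} + \gamma_{-+} + \gamma_{+-} + \gamma_{--})_{j,l}} \\
={}& \int dx\,dy\,\frac{\gamma(x,x)\,\gamma(y,y)}{|\mathbf{x}-\mathbf{y}|} \tag{25}
\end{align*}
is the direct part, i.e., the classical electrostatic energy. It exclusively contains matrix elements of charge conserving operators. Finally, the exchange energy
\begin{align*}
W_X(\rho) :={}& \sum_{i,j,k,l\in\mathbb{Z}} W_{i,j;l,k}\,\bigl[\rho(a_i^*a_k) + \rho(a_i^*b_k^*) + \rho(b_ia_k) - \rho(b_k^*b_i)\bigr]\,\overline{\bigl[\rho(a_l^*a_j) + \rho(a_l^*b_j^*) + \rho(b_la_j) - \rho(b_j^*b_l)\bigr]} \\
={}& \sum_{i,j,k,l\in\mathbb{Z}} W_{i,j;l,k}\,(\gamma_{++} + \gamma_{-+} + \gamma_{+-} + \gamma_{--})_{k,i}\,\overline{(\gamma_{++} + \gamma_{-+} + \gamma_{+-} + \gamma_{--})_{j,l}} \\
={}& \int dx\,dy\,\frac{|\gamma(x,y)|^2}{|\mathbf{x}-\mathbf{y}|} \tag{26}
\end{align*}
also contains only matrix elements of charge conserving operators. (The difference between the direct and the exchange part of the energy is the permutation of the indices $k$ and $l$ in the matrix elements of $W$.) Adding up (24)–(26) and inserting into (23), we verify that
\[ \mathcal{E}_{0,0,\alpha}(\rho) = \mathcal{E}^{HF}_{0,0,\alpha}(\Gamma_\rho). \tag{27} \]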
The positivity of the operator $W$ immediately implies that $W_P(\rho)$ is positive, for all states $\rho$. Thus, the minimization of the energy $\mathcal{E}_{0,0,\alpha}(\rho)$ over the whole space $\mathcal{D}_{HF}$ is equivalent to the minimization over the subspace $\mathcal{D}_{HFC}$ of $\mathcal{D}_{HF}$ for which all non-charge conserving terms vanish, i.e., $\upsilon = 0$. (We note that the resulting operator $\Gamma_0$ is again a one-particle density matrix: If $\Gamma\in X$ then $\Gamma'$, resulting from $\Gamma$ by conjugation with the unitary matrix $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, i.e., substituting $\upsilon$ by its negative, is a one-particle density matrix. Since $X$ is convex, $(\Gamma + \Gamma')/2 = \Gamma_0$ is again a one-particle density matrix.) Thus
\[ \inf_{\rho\in\mathcal{D}_{HF}}\mathcal{E}_{0,0,\alpha}(\rho) = \inf_{\rho\in\mathcal{D}_{HFC}}\Bigl[T(\rho) + \frac{\alpha}{2}\bigl(W_D(\rho) - W_X(\rho)\bigr)\Bigr]. \tag{28} \]
Note also that $W_D(\rho)$ is positive for all $\rho$.

3.2. Bound on the interaction potential. Now we prove that $\alpha W_X(\rho)$ can be controlled by the kinetic energy $T(\rho)$, if $\alpha$ is smaller than $4/\pi$. Assume $k$ to be a measurable complex function on $\mathbb{R}^3\times\mathbb{R}^3$ which is tempered in the second variable and whose Fourier transform in the second variable is also measurable. Then Kato's inequality (Kato [12], V, §5, Formula (5.33), see also Herbst [11]),
\[ \frac{2}{\pi}\int_{\mathbb{R}^3}\frac{|u(x)|^2}{|x|}\,dx \le (u,|\nabla|u), \]
gives, for any fixed $x$,
\[ \frac{2}{\pi}\Bigl(k(x,\cdot),\frac{1}{|\cdot - x|}\,k(x,\cdot)\Bigr) \le \bigl(k(x,\cdot),|\nabla|k(x,\cdot)\bigr). \tag{29} \]
Since $\rho$ has finite kinetic energy, and thus finite particle number, we have $\gamma_{++}\in\mathfrak{S}_1(\mathfrak{H}_+)$ and $\gamma_{--}\in\mathfrak{S}_1(\mathfrak{H}_-)$. Moreover, from inequality (18), one deduces that $\gamma_{+-}$ and $\gamma_{-+}$, and thus $\gamma$, are Hilbert-Schmidt operators. Recall also that $\gamma$ is self-adjoint. By abuse of notation we denote by $\gamma(x,y)$ the integral kernel of $\gamma$. From (29) we obtain
\[ \sum_{s,t=1}^{4}\int d\mathbf{x}\,d\mathbf{y}\,\frac{|\gamma(x,y)|^2}{|\mathbf{x}-\mathbf{y}|} \le \frac{\pi}{2}\,\mathrm{tr}\bigl[(|\nabla|\otimes 1)\gamma^2\bigr] \le \frac{\pi}{2}\,\mathrm{tr}\bigl(|D^{0,0}|\gamma^2\bigr). \tag{30} \]
Now, on one hand, we have the relation
\begin{align*}
\mathrm{tr}\bigl(|D^{0,0}|\gamma^2\bigr) &= \mathrm{tr}\bigl[|D^{0,0}|(\gamma_{++} + \gamma_{-+} + \gamma_{+-} + \gamma_{--})^2\bigr] \\
&= \mathrm{tr}\bigl[|D^{0,0}|\bigl((\gamma_{++}^2 + \gamma_{+-}\gamma_{-+}) + (\gamma_{--}^2 + \gamma_{-+}\gamma_{+-})\bigr)\bigr] \\
&\le \mathrm{tr}\bigl[|D^{0,0}|(\gamma_{++} - \gamma_{--})\bigr] = T(\rho), \tag{31}
\end{align*}
where in the last inequality we have used the relations (18) and (19). On the other hand we have
\[ W_X(\rho) = \int_G dx\int_G dy\,\frac{|\gamma(x,y)|^2}{|\mathbf{x}-\mathbf{y}|}. \tag{32} \]
Thus, (32) together with (30) and (31) gives
\[ \frac{2}{\pi}\,W_X(\rho) \le T(\rho). \tag{33} \]
This inequality and (28) prove Theorem 1.
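The last step can be spelled out as follows. This is only a restatement of how (28) and (33) combine, using in addition that the direct term W_D(ρ) is nonnegative and that the kinetic energy T(ρ) is nonnegative (the latter because (18) and (19) force γ₊₊ ≥ 0 and γ₋₋ ≤ 0).

```latex
\begin{align*}
T(\rho) + \frac{\alpha}{2}\bigl(W_D(\rho) - W_X(\rho)\bigr)
 &\ge T(\rho) - \frac{\alpha}{2}\,W_X(\rho) \\
 &\ge T(\rho) - \frac{\alpha}{2}\cdot\frac{\pi}{2}\,T(\rho)
  = \Bigl(1 - \frac{\pi\alpha}{4}\Bigr)\,T(\rho) \;\ge\; 0
 \qquad\text{for } 0 \le \alpha \le \frac{4}{\pi}.
\end{align*}
```

The threshold 4/π is thus exactly the value of α at which the prefactor 1 − πα/4 changes sign.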
4. The Existence of the Hartree–Fock Vacuum with External Potentials
We are interested in describing also systems with an external potential. To be definite, we will restrict to the most interesting case, namely $V(x) := -\frac{Z}{|x|}$, i.e., a nucleus that interacts via Coulomb forces. As the space $\mathfrak{H}_+$ defining the Fock space we take $P_+^{0,V}(\mathfrak{H})$. The energy $\mathcal{E}_{0,V,\alpha}$ in a state $\rho$ then becomes
\[ \mathcal{E}_{0,V,\alpha}(\rho) := \sum_{i,j\in\mathbb{Z}}(D^{0,V})_{i,j}\,\rho(:\Psi_i^*\Psi_j:) + \frac{\alpha}{2}\sum_{i,j,k,l\in\mathbb{Z}}W_{i,j;k,l}\,\rho(:\Psi_i^*\Psi_j^*\Psi_l\Psi_k:), \tag{34} \]
where $(D^{0,V})_{i,j}$ are the matrix elements of $D^{0,V} := D^{0,0} + \alpha V$ and $W_{i,j;k,l} := (e_i\otimes e_j, We_k\otimes e_l)$ are given as in Sect. 2. We begin with

Lemma 1. For $c\in[0,1/2)$ we have
\[ |D^{0,0}| \le \bigl|D^{0,0} - c/|x|\bigr|\,/\,(1-2c). \]
Proof. We abbreviate $k := 1/(1-2c)$. Since the square root is an operator monotone function it suffices to prove
\[ (D^{0,0})^2 \le k^2\bigl(D^{0,0} - c/|x|\bigr)^2, \tag{35} \]
which is equivalent to the requirement
\[ \|D^{0,0}\psi\| \le \|k(D^{0,0} - c/|x|)\psi\|, \tag{36} \]
for all $\psi$ in $\mathcal{S}$. If we estimate the right hand side from below by the triangle inequality, we get the stronger inequality
\[ \|D^{0,0}\psi\| \le k\|D^{0,0}\psi\| - kc\,\|\psi/|x|\|. \tag{37} \]
Rearranging turns (37) into
\[ \frac{kc}{k-1}\,\|\psi/|x|\| \le \|(-\Delta + m^2)^{1/2}\psi\|, \tag{38} \]
which is true for $kc/(k-1)\le 1/2$ because of Hardy's inequality
\[ \frac{1}{4}\int_{\mathbb{R}^3}\frac{|u(x)|^2}{|x|^2}\,dx \le \int_{\mathbb{R}^3}|\nabla u(x)|^2\,dx. \]
Solving for $k$ gives the desired inequality.
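The phrase "solving for k" hides only elementary algebra, which can be written out as follows for 0 < c < 1/2 and k > 1 (a sketch added for the reader's convenience):

```latex
\begin{align*}
\frac{kc}{k-1} \le \frac12
 \;\Longleftrightarrow\; 2kc \le k - 1
 \;\Longleftrightarrow\; k\,(1-2c) \ge 1
 \;\Longleftrightarrow\; k \ge \frac{1}{1-2c}.
\end{align*}
```

The smallest admissible constant is therefore k = 1/(1 − 2c), which is the factor appearing in Lemma 1; for this choice the condition kc/(k − 1) ≤ 1/2 holds with equality.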
The main result of this section is

Theorem 2. If the one-particle electron subspace is picked to be $\mathfrak{H}_+ := P_+^{0,V}(\mathfrak{H})$, then the energy $\mathcal{E}_{0,V,\alpha}$ is nonnegative on $\mathcal{D}_{HF}$, if $\alpha, Z\ge 0$ fulfill $\alpha\le 4(1-2\alpha Z)/\pi$.

Proof. We can imitate the proof of Theorem 1 up to (32). We merely need to use Lemma 1 in addition. We have
\[ \frac{\alpha}{2|x|} \le \frac{\pi}{4}\,\alpha\,|D^{0,0}| \le \frac{\pi\alpha}{4(1-2\alpha Z)}\,\Bigl|D^{0,0} - \frac{\alpha Z}{|x|}\Bigr| \tag{39} \]
under the condition that $2\alpha Z < 1$. We need to estimate the right hand side by $\bigl|D^{0,0} - (\alpha Z)/|x|\bigr|$, which is immediate under the stated hypothesis on $\alpha$.

5. The Inclusion of Magnetic Fields
As charged particles, electrons and positrons are also coupled to the magnetic field $B := \nabla\times A$, where $A$ is a vector potential. Taking this into account, the energy of electrons and positrons interacting via a quantized electric field (which has been integrated out in the Coulomb gauge) and a classical magnetic field is given by
\[ \mathcal{E}_{A,V,\alpha}(\rho) := \sum_{i,j\in\mathbb{Z}}(D^{A,V})_{i,j}\,\rho(:\Psi_i^*\Psi_j:) + \frac{\alpha}{2}\sum_{i,j,k,l\in\mathbb{Z}}W_{i,j;k,l}\,\rho(:\Psi_i^*\Psi_j^*\Psi_l\Psi_k:) + \frac{1}{8\pi}\int_{\mathbb{R}^3}B^2, \tag{40} \]
where $(D^{A,V})_{i,j}$ are the matrix elements of $D^{A,V} := \boldsymbol{\alpha}\cdot(-i\nabla + eA) + m\beta + \alpha V$. The potentials $V$ and $W$ are the same as in Sect. 4. We pick the electron subspace depending on the vector potential, i.e., explicitly $\mathfrak{H}_+ := P_+^{A,V}(\mathfrak{H})$, where we require the vector potential to have finite energy, i.e., $\int_{\mathbb{R}^3}B^2 < \infty$.
Theorem 3. For $\mathfrak{H}_+ = P_+^{A,V}(\mathfrak{H})$, $\varepsilon\in(0,1]$, $\alpha\in[0,4/\pi]$ and $Z\in[0,\infty)$,
\[ \varepsilon(1-\varepsilon)^{0}\cdot 0 + (1-\varepsilon) - 4\Bigl(\frac{1}{\varepsilon} - 1\Bigr)(\alpha Z)^2 - \frac{\pi^2\alpha^2}{16} > 0 \]
and
\[ L_{\frac12,3}\,\Bigl[(1-\varepsilon) - 4\Bigl(\frac{1}{\varepsilon} - 1\Bigr)(\alpha Z)^2 - \frac{\pi^2\alpha^2}{16}\Bigr]^{-\frac32}(1-\varepsilon)^2\,\alpha \le \frac{1}{8\pi} \tag{41} \]
imply that $\mathcal{E}_{A,V,\alpha}$ is nonnegative on $\mathcal{D}_{HF}$, where $L_{\frac12,3}$ is the constant in the Lieb-Thirring inequality for moments of order 1/2.

Before proving the theorem we remark that the $\varepsilon$ saturating the above inequality is uniquely determined through a third order equation. Since the resulting expression is somewhat lengthy, we will skip it here. Instead we show a plot of the curve (see Fig. 1), defined by equality in (41), which estimates from below the critical curve separating stability and instability.
Fig. 1. The shown curve estimates from below the critical value of the pair (α, αZ), for which the energy EA,V,α is positive. It is obtained from Theorem 3 and Formula (41). On and below the line, non-negativity of the energy in Hartree–Fock states is guaranteed. Note that we assert stability from α0 ≈ 0.5235 to α = 4/π, if Z = 0. However, we do not assert stability if α > 4/π even if Z = 0. For the physical value α ≈ 1/137.0359895 we obtain αZ ≈ 0.49500002, i.e., Z ≈ 67.8328171
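For orientation, the sufficient conditions of Theorems 2 and 3 can be evaluated numerically. The sketch below is not the computation used for Fig. 1; it is a minimal illustration under stated assumptions: condition (41) is read as L·[(1−ε) − 4(1/ε − 1)(αZ)² − π²α²/16]^(−3/2)·(1−ε)²·α ≤ 1/(8π) with a positive bracket, the Lieb–Thirring constant is replaced by the upper bound 0.06003 quoted in the proof of Theorem 3, and the optimization over ε is done by a crude grid scan. All function names are hypothetical.

```python
import math

ALPHA_PHYS = 1 / 137.0359895  # physical value quoted in the text
L_HALF_3 = 0.06003            # upper bound on L_{1/2,3} quoted in the proof

def theorem2_holds(alpha, Z):
    """Field-free sufficient condition of Theorem 2."""
    return alpha <= 4 * (1 - 2 * alpha * Z) / math.pi

def theorem2_max_Z(alpha):
    """Largest real Z allowed by Theorem 2 (condition solved for Z)."""
    return (1 - math.pi * alpha / 4) / (2 * alpha)

def condition_41(alpha, alpha_Z, eps, lt_const=L_HALF_3):
    """Positivity of the bracket together with inequality (41), for one eps."""
    bracket = ((1 - eps)
               - 4 * (1 / eps - 1) * alpha_Z ** 2
               - (math.pi * alpha) ** 2 / 16)
    if bracket <= 0:
        return False
    lhs = lt_const * bracket ** -1.5 * (1 - eps) ** 2 * alpha
    return lhs <= 1 / (8 * math.pi)

def theorem3_holds(alpha, alpha_Z, steps=10000):
    """True if some eps on a uniform grid in (0, 1] satisfies condition (41)."""
    if not 0 <= alpha <= 4 / math.pi:
        return False
    return any(condition_41(alpha, alpha_Z, k / steps)
               for k in range(1, steps + 1))
```

With these definitions, `theorem2_max_Z(ALPHA_PHYS)` evaluates to roughly 68.1, so the field-free condition of Theorem 2 admits slightly larger nuclear charges than the value Z ≈ 67.83 quoted in the caption for the case with magnetic field, in line with the remark that the magnetic field destabilizes the problem further. A finer grid or a proper one-dimensional optimizer would be needed to trace the critical curve itself.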
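Proof. Analogously to the proofs of Theorems 1 and 2 and, in addition, using the diamagnetic inequality (Simon [18] and [13] [note that the last reference contains a typo in Formula (5.4): in the exponent $-t$ should be subtracted]), we estimate
\[ \frac{\alpha}{2}\int dx\int dy\,\frac{|\gamma(x,y)|^2}{|\mathbf{x}-\mathbf{y}|} \le \frac{\pi\alpha}{4}\,\mathrm{tr}\bigl(\gamma^*\,|{-i\nabla} + \sqrt{\alpha}A|\,\gamma\bigr). \tag{42} \]
Note that $e = \sqrt{\alpha}$. To show positivity of (40) it hence suffices to show
\[ -\mathrm{tr}\Bigl[|D^{A,V}| - \frac{\pi\alpha}{4}\,|{-i\nabla} + \sqrt{\alpha}A|\Bigr]_- + \frac{1}{8\pi}\int B^2 \ge 0, \tag{43} \]
since $0\le\gamma^*\gamma\le 1$. At this point we are in a position to follow the argument in [14]: By the Birman-Koplienko-Solomyak inequality [3] the first summand in (43) is estimated from below by
\[ -\mathrm{tr}\Bigl\{\Bigl[(D^{A,V})^2 - \frac{\alpha^2\pi^2}{4^2}\,(-i\nabla + \sqrt{\alpha}A)^2\Bigr]_-^{\frac12}\Bigr\}. \tag{44} \]
Separating the mixed terms in the first summand and observing that $m = 0$ gives a lower bound, we obtain the following lower bound for (44):
\[ -\mathrm{tr}\Bigl\{\Bigl[(1-\varepsilon)\bigl[\boldsymbol{\alpha}\cdot(-i\nabla + \sqrt{\alpha}A)\bigr]^2 - \Bigl(\frac{1}{\varepsilon} - 1\Bigr)\frac{(\alpha Z)^2}{|x|^2} - \frac{\pi^2\alpha^2}{16}\,(-i\nabla + \sqrt{\alpha}A)^2\Bigr]_-^{\frac12}\Bigr\}. \tag{45} \]
The first summand is essentially the Pauli operator. Thus, using Hardy's inequality, (45) is estimated from below by
\[ -\mathrm{tr}\Bigl\{\Bigl[\Bigl((1-\varepsilon) - 4\Bigl(\frac{1}{\varepsilon} - 1\Bigr)(\alpha Z)^2 - \frac{\pi^2\alpha^2}{16}\Bigr)(-i\nabla + \sqrt{\alpha}A)^2 - (1-\varepsilon)\sqrt{\alpha}\,|B|\Bigr]_-^{\frac12}\Bigr\}. \tag{46} \]
Assuming now that the parameters $\varepsilon$, $\alpha$ and $Z$ are such that
\[ (1-\varepsilon) - 4\Bigl(\frac{1}{\varepsilon} - 1\Bigr)(\alpha Z)^2 - \frac{\pi^2\alpha^2}{16} > 0, \]
we apply the Lieb–Thirring inequality for the moment 1/2 and see that (46) is estimated from below by
\[ -L_{\frac12,3}\,\Bigl[(1-\varepsilon) - 4\Bigl(\frac{1}{\varepsilon} - 1\Bigr)(\alpha Z)^2 - \frac{\alpha^2\pi^2}{16}\Bigr]^{-\frac32}(1-\varepsilon)^2\,\alpha\int B^2, \tag{47} \]
where $L_{\frac12,3}\le 0.06003$. Combining (43) through (47) shows that it suffices to have
\[ L_{\frac12,3}\,\Bigl[(1-\varepsilon) - 4\Bigl(\frac{1}{\varepsilon} - 1\Bigr)(\alpha Z)^2 - \frac{\pi^2\alpha^2}{16}\Bigr]^{-\frac32}(1-\varepsilon)^2\,\alpha \le \frac{1}{8\pi}. \tag{48} \]
Optimizing with respect to $\varepsilon$ yields the desired result.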
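As an illustration of the optimization over ε, the case Z = 0 can be carried out by hand. The following sketch assumes the value 0.06003 for the Lieb–Thirring constant quoted above and writes t := 1 − ε and c := π²α²/16; both abbreviations are introduced here only for convenience.

```latex
\begin{align*}
&\text{For } Z = 0,\ \text{(48) reads}\quad
  L_{\frac12,3}\,\alpha\,\frac{t^2}{(t-c)^{3/2}} \le \frac{1}{8\pi},
  \qquad c < t < 1. \\
&\frac{d}{dt}\,\frac{t^2}{(t-c)^{3/2}}
  = \frac{t\,(t-4c)}{2\,(t-c)^{5/2}} = 0
  \;\Longleftrightarrow\; t = 4c
  \quad(\text{admissible if } 4c < 1,\ \text{i.e. } \alpha < 2/\pi). \\
&\min_{t}\frac{t^2}{(t-c)^{3/2}} = \frac{16\,c^2}{(3c)^{3/2}}
  = \frac{16}{3\sqrt3}\,\sqrt{c} = \frac{4\pi\alpha}{3\sqrt3}. \\
&\text{Hence the condition becomes}\quad
  \alpha^2 \le \frac{3\sqrt3}{32\,\pi^2\,L_{\frac12,3}} \approx 0.2741,
  \qquad\text{i.e. } \alpha \lesssim 0.5235.
\end{align*}
```

This reproduces the value α₀ ≈ 0.5235 mentioned in the caption of Fig. 1 for Z = 0.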
6. Optimality of the Quantization
The second quantization is not unique, since (as is obvious from the definition of the Fock space (1) and the energy (8)) it depends on what is considered the one-particle electron space $\mathfrak{H}_+$. Since we are working with a functional that turns out to be unbounded from below, we would like to make use of the freedom of choice with respect to $\mathfrak{H}_+$ and investigate the maximal stability range (see also [17]). Thus, we consider
\[ E(\mathfrak{H}_+) := \inf_{\rho\in\mathcal{D}_{HF}}\mathcal{E}_{A,V,\alpha}(\rho), \tag{49} \]
where $\mathfrak{H}_+$ is any closed subspace of $\mathfrak{H}$ such that the orthogonal projections onto $\mathfrak{H}_\pm$ leave $D(D^{A,V})$ invariant, where $\mathfrak{H}_- := (\mathfrak{H}_+)^\perp$. We denote by $S$ the set of all such subspaces. The optimal $\mathfrak{H}_+$ in $S$ can indeed be found and is unique:

Theorem 4. Consider $D^{0,V}$ with values of $\alpha$ and $Z$ as in Theorem 3 and such that $0\notin\sigma(D^{A,V})$. For any $\mathfrak{H}_+$ in $S\setminus\{P_+^{0,V}\mathfrak{H}\}$ we get
\[ E(\mathfrak{H}_+) < E(P_+^{0,V}\mathfrak{H}) = 0. \tag{50} \]
Before we turn to the proof, we mention that the use of the Coulomb potential $V$ in the energy functional is not essential: our proof easily extends to other external potentials where zero is not in the spectrum of the one-particle operator and the energy is bounded from below.

Proof. First note that as a direct consequence of Theorem 2, $E(P_+^{0,V}(\mathfrak{H})) = 0$. We now prove that if the orthogonal projection $P_{\mathfrak{H}_+}$ onto $\mathfrak{H}_+$ does not commute with $P_+^{0,V}$ and $P_-^{0,V}$, then the corresponding $E(\mathfrak{H}_+)$ is strictly negative. Assume to the contrary that there exist two normalized vectors $p\in\mathfrak{H}_+\cap D(D^{0,V})$ and $n\in\mathfrak{H}_-\cap D(D^{0,V})$ such that the real part of $(p, D^{0,V}n)$ is non-zero. Without loss of generality we may pick the orthonormal system $\dots, e_{-1}, e_0, \dots$ of $\mathfrak{H}$ such that $e_{-1} = n$ and $e_0 = p$. We consider the vector $\varphi\in\mathfrak{F}$ defined as
\[ \varphi = (1-\varepsilon^2)^{1/2}\,\Omega + \varepsilon\,a_0^*b_{-1}^*\Omega, \]
where $\varepsilon\in(0,1)$ and $\Omega$ is the vacuum. Since 1-pdms are representable by Hartree–Fock states (see Sect. 2, paragraph on 1-pdms), it suffices to show that the infimum of the Hartree–Fock functional is negative in order to show that the energy of the electron positron field is also negative. Alternatively we can show explicitly that $\varphi$ is a Hartree–Fock state. This follows from the fact that
\[ \varphi = (1-\varepsilon^2)^{1/2}\exp\bigl[(\varepsilon/(1-\varepsilon^2)^{1/2})\,a_0^*b_{-1}^*\bigr]\Omega \]
and by applying Thaller [19], Theorem 10.6. The Bogoliubov transform $\mathbb{U}:\mathfrak{F}\to\mathfrak{F}$ of this theorem, which also has the property $\varphi = \mathbb{U}\Omega$, is implemented by the unitary operator $U$ on $\mathfrak{H}$ given by
\[ Ue_j := \begin{cases} e_j & j\ne -1, 0, \\ -\varepsilon e_0 + (1-\varepsilon^2)^{1/2}e_{-1} & j = -1, \\ (1-\varepsilon^2)^{1/2}e_0 + \varepsilon e_{-1} & j = 0. \end{cases} \]
Finally, note also that images of quasi-free states under Bogoliubov transforms are again quasi-free states, which follows directly from the definition. We now compute the one particle density matrix of $\varphi$. It is
\[ \Gamma := \begin{pmatrix} \gamma & 0 \\ 0 & -\bar{\gamma} \end{pmatrix}, \]
with
\[ \gamma = \begin{pmatrix} \varepsilon^2\,\delta_{0,i}\delta_{0,j} & \varepsilon(1-\varepsilon^2)^{\frac12}\,\delta_{j,-1}\delta_{i,0} \\ \varepsilon(1-\varepsilon^2)^{\frac12}\,\delta_{i,-1}\delta_{j,0} & -\varepsilon^2\,\delta_{j,-1}\delta_{i,-1} \end{pmatrix}, \]
where $\delta_{k,l}$ is the Kronecker symbol. Computing the Hartree–Fock energy $\mathcal{E}^{HF}_{0,V,\alpha}$ yields
\[ \mathcal{E}^{HF}_{0,V,\alpha}(\Gamma) = \varepsilon\bigl((D^{0,V})_{0,-1} + (D^{0,V})_{-1,0}\bigr) + O(\varepsilon^2), \tag{51} \]
i.e., we have a non-vanishing term linear in $\varepsilon$ plus a term of higher order in $\varepsilon$, which can always be made negative by appropriately choosing the sign of $\varepsilon$ and picking $\varepsilon$ sufficiently close to zero. Thus, an element $\mathfrak{H}_+$ in $S$ can give nonnegative $E(\mathfrak{H}_+)$ only if for all $p\in\mathfrak{H}_+\cap D(D^{0,V})$ and $n\in\mathfrak{H}_-\cap D(D^{0,V})$ we have $(p, D^{0,V}n) = 0$. Under our assumptions on $S$, this implies that the projection $P_{\mathfrak{H}_+}$ commutes with $D^{0,V}$ and therefore commutes with all spectral projections of $D^{0,V}$ (see e.g. Birman and Solomyak [4], §6.3, Theorem 2), and in particular with $P_+^{0,V}$ and $P_-^{0,V}$.
Now we consider $\mathfrak{H}_+$ in $S$ such that the projection $P_{\mathfrak{H}_+}$ commutes with $P_+^{0,V}$ and $P_-^{0,V}$. Suppose that there exists $u\in\mathfrak{H}_+$ such that $u\notin P_+^{0,V}(\mathfrak{H})$. Then, under our assumptions on $\mathfrak{H}_+$, there exists a non-zero vector $\tilde{u}$ in $D(D^{0,V})\cap\mathfrak{H}_+\cap P_-^{0,V}(\mathfrak{H})$. Without loss of generality, we can take an orthonormal basis $\dots, e_{-1}, e_0, \dots$ of $\mathfrak{H}$ such that $e_0 = \tilde{u}$. We now pick the following one-particle density matrix,
\[ \Gamma := \begin{pmatrix} \delta_{i,0}\delta_{j,0} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & -\delta_{i,0}\delta_{j,0} & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \]
The corresponding Hartree–Fock state is the pure state associated to $\varphi = a_0^*\Omega$. The Hartree–Fock energy is easily computed and we get $\mathcal{E}^{HF}_{0,V,\alpha}(\Gamma) = (D^{0,V})_{0,0} < 0$. Similarly, if $\mathfrak{H}_+^\perp\cap(P_-^{0,V}(\mathfrak{H}))^c\ne\emptyset$, then the corresponding Hartree–Fock energy $E(\mathfrak{H}_+)$ is also strictly negative. Thus, a Hilbert space $\mathfrak{H}_+$ in $S$ is a maximizer only if $\mathfrak{H}_+ = P_+^{0,V}(\mathfrak{H})$.
Theorem 4 generalizes to $A\ne 0$. An additional minimization over $A$ is required in this case.

Acknowledgement. Thanks go to Christian Tix for useful help with the numerical optimization of Formula (41) resulting in Figure 1. Financial support of the European Union through the TMR network FMRX-CT 96-0001 is gratefully acknowledged.
References 1. Bach, V., Lieb, E.H. and Solovej, J.P.: Generalized Hartree–Fock theory and the Hubbard model. J. Stat. Phys. 76(1&2), 3–89 (1994) 2. Barnett, R.M., Carone, C.D., Groom, D.E., Trippe, T.G., Wohl, C.G., Armstrong, B., Gee, P.S., Wagman, G.S., James, F., Mangano, M., M¨onig, K., Montanet, L., Feng, J.L., Murayama, H., Hern´andez, J.J.,Manohar, A., Aguilar-Benitez, M.,Caso, C., Crawford, R.L., Roos, R.L., T¨ornqvist, N.A., Hayes, K.G., Hagiwara, K., Nakamura, K., Tanabashi, M., Olive, K. Honscheid, K., Burchat, P.R., Shrock, R.E.,Eidelman, S., Schindler, R.H., Gurtu, A., Hikasa, K., Conforto, G., Workman, R.L., Grab, C. and Amsler, C.: Review of particle physics. Phys. Rev. D 54(1), 1–720 (July 1996) 3. Birman, M.Sh., Koplienko, L.S. and Solomyak, M.Z.: Estimates for the spectrum of the difference between fractional powers of two self-adjoint operators. Soviet Math. 19(3), 1–6, (1975) Translation of Izv. Vysˇs. Uˇcebn. Zaved. Matematika 4. Birman, M.Sh. and Solomjak, M.Z.: Spectral theory of selfadjoint operators in Hilbert space. Mathematics and its Applications (Soviet Series). Dordrecht: D. Reidel Publishing Co., 1987, Translated from the 1980 Russian original by S. Khrushch¨ev and V. Peller 5. Born, M. and Jordan, P.: Zur Quantenmechanik. Zeitschrift f¨ur Physik 34, 858–888 (1925) 6. Chaix, P. and Iracane, D.: From quantum electrodynamics to mean-field theory: I. The BogoliubovDirac-Fock formalism. J. Phys. B. 22(23), 3791–3814 (December 1989) 7. Chaix, P., Iracane, D. and Lions, P.L.: From quantum electrodynamics to mean-field theory: II.Variational stability of the vacuum of quantum electrodynamics in the mean-field approximation. J. Phys. B. 22(23), 3815–3828 (December 1989) 8. Conlon, J.G.: The ground state energy of a classical gas. Commun. Math. Phys. 94(4), 439–458 (August 1984)
9. Fefferman, C.L. and de la Llave, R.: Relativistic stability of matter – I. Revista Matematica Iberoamericana 2(1,2), 119–161 (1986)
10. Helffer, B. and Siedentop, H.: Form perturbations of the second quantized Dirac field. Mathematical Physics Electronic Journal 4, Paper 4 (1998)
11. Herbst, I.W.: Spectral theory of the operator $(p^2 + m^2)^{1/2} - Ze^2/r$. Commun. Math. Phys. 53, 285–294 (1977)
12. Kato, T.: Perturbation Theory for Linear Operators. Vol. 132 of Grundlehren der mathematischen Wissenschaften, 1 edition, Berlin: Springer-Verlag, 1966
13. Lieb, E.H., Loss, M. and Siedentop, H.: Stability of relativistic matter via Thomas-Fermi theory. Helv. Phys. Acta 69(5/6), 974–984 (December 1996)
14. Lieb, E.H., Siedentop, H. and Solovej, J.P.: Stability and instability of relativistic electrons in classical electromagnetic fields. J. Statist. Phys. 89(1-2), 37–59 (1997). Dedicated to Bernard Jancovici.
15. Lieb, E.H., Siedentop, H. and Solovej, J.P.: Stability of relativistic matter with magnetic fields. Phys. Rev. Lett. 79(10), 1785–1788 (September 1997)
16. Lieb, E.H. and Yau, H.-T.: The stability and instability of relativistic matter. Commun. Math. Phys. 118, 177–213 (1988)
17. Mittleman, M.H.: Theory of relativistic effects on atoms: Configuration-space Hamiltonian. Phys. Rev. A 24(3), 1167–1175 (September 1981)
18. Simon, B.: Kato's inequality and the comparison of semigroups. J. Funct. Anal. 32(1), 97–101 (1979)
19. Thaller, B.: The Dirac Equation. Texts and Monographs in Physics. 1 edition, Berlin: Springer-Verlag, 1992
Communicated by B. Simon
Commun. Math. Phys. 201, 461 – 470 (1999)
Communications in
Mathematical Physics © Springer-Verlag 1999
Lowest-Energy Representations of Non-Centrally Extended Diffeomorphism Algebras
T. A. Larsson
Vanadisvägen 29, S-113 23 Stockholm, Sweden. E-mail: [email protected]
Received: 24 June 1997 / Accepted: 3 September 1998
Abstract: We describe a class of non-central extensions of the diffeomorphism algebra in N-dimensional spacetime, and construct lowest-energy modules thereof, thus generalizing work of Eswara Rao and Moody. There is one representation for each representation of $\mathrm{Vir}\,\tilde{\ltimes}_{k_0}\,\widehat{gl(N)}$ (an extension of the semi-direct product). Similar modules are constructed for gauge algebras.

1. Introduction
Let diff(N) denote the diffeomorphism algebra in N-dimensional spacetime. In a significant paper, Eswara Rao and Moody constructed the first interesting lowest-energy representations of a non-central extension thereof [5], but they failed to explicitly describe the extension (except for the “spatial” subalgebra generated by time-independent vector fields). In the present paper, a four-parameter non-central extension $\widetilde{\mathrm{diff}}(N; c_1, c_2, c_3, c_4)$ is described explicitly, and a realization of this algebra is constructed for each representation of $\mathrm{Vir}\,\tilde{\ltimes}_{k_0}\,\widehat{gl(N)}$ (an extension of $\mathrm{Vir} \ltimes \widehat{gl(N)}$). The representations in [5] are recovered (in a Fourier basis) by picking a particular vertex operator module for Vir and the trivial module for $\widehat{gl(N)}$. Thus, my results are related to theirs in about the same way as tensor densities are related to functions. Similar results also hold for the algebra of gauge transformations on spacetime; special cases were previously found by [4, 6, 10]. A supersymmetric generalization of the present work can be found on the web [9]. After this work was completed I became aware of related references [1, 2].

2. Extension
Let $\xi = \xi^\mu \partial_\mu$ be a vector field and $L_\xi$ the Lie derivative. Greek indices $\mu, \nu = 0, 1, .., N-1$ label spacetime coordinates and the summation convention is used. The diffeomorphism algebra (algebra of vector fields, Witt algebra) diff(N) is generated by Lie
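derivatives satisfying $[L_\xi, L_\eta] = L_{[\xi,\eta]}$. Define two families of operators $S_n^{\nu_1..\nu_n}(g_{\nu_1..\nu_n})$ and $R_n^{\rho|\nu_1..\nu_n}(h_{\rho|\nu_1..\nu_n})$, where $g_{\nu_1..\nu_n}$ and $h_{\rho|\nu_1..\nu_n}$ are arbitrary functions on spacetime. The operators are linear in the arguments and totally symmetric in the indices $\nu_1..\nu_n$. The following relations (2.1–2.3) define a Lie algebra extension of diff(N), denoted by $\widetilde{\mathrm{diff}}(N; c_1, c_2, c_3, c_4)$:

\[
\begin{aligned}
{}[L_\xi, S_n^{\nu_1..\nu_n}(g_{\nu_1..\nu_n})] &= S_n^{\nu_1..\nu_n}\Big(\xi^\mu \partial_\mu g_{\nu_1..\nu_n} + \sum_{j=1}^n \partial_{\nu_j}\xi^\mu\, g_{\nu_1..\mu..\nu_n}\Big) \\
&\quad - (n-1)\, S_{n+1}^{\mu\nu_1..\nu_n}(\partial_\mu \xi^0\, g_{\nu_1..\nu_n}), \\
S_1^{\nu}(\partial_\nu f) &\equiv 0, \\
S_{n+1}^{0\nu_1..\nu_n}(g_{\nu_1..\nu_n}) &= S_n^{\nu_1..\nu_n}(g_{\nu_1..\nu_n}), \\
S_0(f) &= \frac{1}{2\pi i}\int dt\, f(t), \quad \text{if } f(t) \text{ depends on time only,}
\end{aligned}
\tag{2.1}
\]

\[
\begin{aligned}
{}[L_\xi, R_n^{\rho|\nu_1..\nu_n}(h_{\rho|\nu_1..\nu_n})] &= R_n^{\rho|\nu_1..\nu_n}\Big(\xi^\mu \partial_\mu h_{\rho|\nu_1..\nu_n} + \partial_\rho \xi^\mu\, h_{\mu|\nu_1..\nu_n} + \sum_{j=1}^n \partial_{\nu_j}\xi^\mu\, h_{\rho|\nu_1..\mu..\nu_n}\Big) \\
&\quad - (n+1)\, R_{n+1}^{\rho|\mu\nu_1..\nu_n}(\partial_\mu \xi^0\, h_{\rho|\nu_1..\nu_n}) - R_{n+1}^{\rho|\mu\nu_1..\nu_n}(\partial_\rho \xi^0\, h_{\mu|\nu_1..\nu_n}) \\
&\quad + S_{n+2}^{\rho\sigma\nu_1..\nu_n}(\partial_\rho\partial_\sigma \xi^\mu\, h_{\mu|\nu_1..\nu_n}) - S_{n+3}^{\rho\sigma\mu\nu_1..\nu_n}(\partial_\rho\partial_\sigma \xi^0\, h_{\mu|\nu_1..\nu_n}), \\
S_{n+1}^{\mu\nu_1..\nu_n}(\partial_\mu h_{\nu_1..\nu_n}) &+ \sum_{j=1}^n R_{n-1}^{\nu_j|\nu_1..\check{\nu}_j..\nu_n}(h_{\nu_1..\nu_n}) \equiv 0, \quad \text{where } h_{\nu_1..\nu_n} \text{ is symmetric in } \nu_1..\nu_n, \\
R_{n+1}^{\rho|0\nu_1..\nu_n}(h_{\rho|\nu_1..\nu_n}) &\equiv R_n^{\rho|\nu_1..\nu_n}(h_{\rho|\nu_1..\nu_n}), \\
R_n^{0|\nu_1..\nu_n}(h_{\nu_1..\nu_n}) &\equiv 0,
\end{aligned}
\tag{2.2}
\]

\[
\begin{aligned}
{}[L_\xi, L_\eta] &= L_{[\xi,\eta]} + c_1 S_1^\rho(\partial_\rho\partial_\nu \xi^\mu\, \partial_\mu \eta^\nu) + c_2 S_1^\rho(\partial_\rho\partial_\mu \xi^\mu\, \partial_\nu \eta^\nu) \\
&\quad + c_3 \big(R_1^{\mu|\nu}(\partial_\mu \xi^0\, \partial_\nu \eta^0) + S_3^{\mu\nu\rho}(\partial_\rho\partial_\mu \xi^0\, \partial_\nu \eta^0)\big) \\
&\quad + \frac{c_4}{2} S_2^{\rho\sigma}(\partial_\rho \eta^0\, \partial_\sigma\partial_\mu \xi^\mu - \partial_\rho \xi^0\, \partial_\sigma\partial_\nu \eta^\nu) \\
&\quad + a_1 \big(S_2^{\rho\sigma}(\partial_\rho\partial_\sigma \xi^\mu\, \partial_\mu \eta^0 - \partial_\rho\partial_\sigma \eta^\nu\, \partial_\nu \xi^0) - S_3^{\rho\mu\nu}(\partial_\rho\partial_\mu \xi^0\, \partial_\nu \eta^0 - \partial_\rho\partial_\nu \eta^0\, \partial_\mu \xi^0)\big) \\
&\quad - a_2 S_1^\rho(\partial_\rho \xi^0\, \eta^0) + a_3 S_1^\rho(\partial_\rho \eta^0\, \partial_\mu \xi^\mu - \partial_\rho \xi^0\, \partial_\nu \eta^\nu), \\
{}[S_m^{\mu_1..\mu_m}(g_{\mu_1..\mu_m}), S_n^{\nu_1..\nu_n}(h_{\nu_1..\nu_n})] &= [S_m^{\mu_1..\mu_m}(g_{\mu_1..\mu_m}), R_n^{\rho|\nu_1..\nu_n}(h_{\rho|\nu_1..\nu_n})] \\
&= [R_m^{\rho|\mu_1..\mu_m}(g_{\rho|\mu_1..\mu_m}), R_n^{\sigma|\nu_1..\nu_n}(h_{\sigma|\nu_1..\nu_n})] = 0.
\end{aligned}
\tag{2.3}
\]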
Verification of all Jacobi identities is straightforward; that these equations define a Lie algebra also follows from the explicit realization in Theorem 3.1 below. Indeed, the extensions were discovered by working out which algebra is generated by (3.6). The extensions a1 – a3 (cocycles are labelled by the factors multiplying them) are cohomologically trivial and may be eliminated by the redefinition
\[
L'_\xi = L_\xi + a_1 S_2^{\mu\nu}(\partial_\mu\partial_\nu \xi^0) + \frac{a_2}{2} S_0(\xi^0) + a_3 S_0(\partial_\mu \xi^\mu).
\tag{2.4}
\]
The remaining extensions are non-trivial, which is most easily seen by restricting to the “temporal” subalgebra generated by vector fields on the line $x^\mu = t\,\delta^\mu_0$; it is a Virasoro algebra with central charge $12(c_1 + c_2 + c_3 + c_4)$. In particular, this is the whole story in one dimension, and hence (2.1–2.3) is a natural higher-dimensional generalization of the Virasoro algebra. The extensions $c_1$ and $c_2$ were first described by Eswara Rao and Moody [5] and myself [7, 8], respectively, while $c_3$ and $c_4$ are new. $S_1^\rho(g_\rho)$ is a linear operator acting on one-forms $g_\rho\, dx^\rho$, and as such it may be viewed as a closed one-chain on spacetime. One possibility is that it is an exact one-chain:

\[
\begin{aligned}
S_1^\rho(g_\rho) &= C^{\nu\rho}(\partial_\nu g_\rho), \\
C^{\nu\mu}(j_{\mu\nu}) &= -C^{\mu\nu}(j_{\mu\nu}), \\
{}[L_\xi, C^{\nu\rho}(j_{\nu\rho})] &= C^{\nu\rho}(\xi^\mu \partial_\mu j_{\nu\rho} + \partial_\nu \xi^\mu\, j_{\mu\rho} + \partial_\rho \xi^\mu\, j_{\nu\mu}).
\end{aligned}
\tag{2.5}
\]
Dzhumadil'daev [3] has given a list of diff(N) extensions by irreducible modules, but it seems that the extension $c_1$ is missing; however, it is essentially $\psi_1^W - \psi_3^W + \psi_4^W$ in his notation. Moreover, $c_2$ is $\psi_1^W$, and $c_1$ and $c_2$ become $\psi_4^W$ and $\psi_3^W$ upon the substitution (2.5), respectively. The remaining two cocycles are not covered by his theorem, however, because they are extensions by reducible but indecomposable modules.
3. Realization
Consider the Heisenberg algebra generated by operators $q^i(s)$, $p_j(t)$, $s, t \in S^1$, where latin indices $i, j = 1, \ldots, N-1$ run over spatial coordinates only,

\[
[p_j(s), q^i(t)] = \delta^i_j\, \delta(s-t), \qquad [p_i(s), p_j(t)] = [q^i(s), q^j(t)] = 0.
\tag{3.1}
\]
These operators can be expanded in a Fourier series; e.g.,

\[
p_j(t) = \sum_{n=-\infty}^{\infty} \hat{p}_j(n)\, e^{int}.
\tag{3.2}
\]
This algebra has a Fock module F (Z-graded by the frequency n) generated by finite strings in the non-negative Fourier modes of $q^i(t)$ and the positive modes of $p_j(t)$. Define time components by $q^0(t) = t$ and $p_0(t) = -\dot{q}^i(t) p_i(t)$; in an obvious notation, $q^\mu(t) = (t, q^i(t))$, etc. The following relations hold.

\[
\begin{aligned}
{}[q^\mu(s), q^\nu(t)] &= 0, \\
[p_\nu(s), q^\mu(t)] &= (\delta^\mu_\nu - \dot{q}^\mu(s)\, \delta^0_\nu)\, \delta(s-t), \\
[p_\mu(s), p_\nu(t)] &= (\delta^0_\mu\, p_\nu(s) + \delta^0_\nu\, p_\mu(t))\, \dot\delta(s-t).
\end{aligned}
\tag{3.3}
\]
Normal ordering is necessary to remove infinities and to obtain a well defined action of diffeomorphisms on F. For any function of q(t) and its derivatives, let

\[
{:}\, f(q(t), \dot{q}(t))\, p_j(t)\, {:} \;\equiv\; f(q(t), \dot{q}(t))\, p_j^{\le}(t) + p_j^{>}(t)\, f(q(t), \dot{q}(t)),
\tag{3.4}
\]
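where $p_j^{>}(t)$ ($p_j^{\le}(t)$) is the sum (3.2) over positive (non-positive) Fourier modes only. Let $L(s)$ and $T^\mu_\nu(t)$ generate the following algebra $\mathrm{Vir}_c\,\tilde{\ltimes}_{k_0}\,\widehat{gl(N)}_{k_1,k_2}$:

\[
\begin{aligned}
{}[L(s), L(t)] &= (L(s) + L(t))\, \dot\delta(s-t) + \frac{c}{24\pi i}\big(\dddot\delta(s-t) + \dot\delta(s-t)\big), \\
[L(s), T^\mu_\nu(t)] &= T^\mu_\nu(s)\, \dot\delta(s-t) + \frac{k_0}{4\pi i}\, \delta^\mu_\nu\, \ddot\delta(s-t), \\
[T^\mu_\nu(s), T^\sigma_\tau(t)] &= (\delta^\sigma_\nu\, T^\mu_\tau(s) - \delta^\mu_\tau\, T^\sigma_\nu(s))\, \delta(s-t) - \frac{1}{2\pi i}(k_1\, \delta^\mu_\tau \delta^\sigma_\nu + k_2\, \delta^\mu_\nu \delta^\sigma_\tau)\, \dot\delta(s-t).
\end{aligned}
\tag{3.5}
\]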
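Theorem 3.1. Under the conditions above, the following expressions:

\[
\begin{aligned}
L_\xi &= \int dt\, \big\{ {:}\, \xi^\mu(q(t))\, p_\mu(t)\, {:} + \xi^0(q(t))\, L(t) + \partial_\nu \xi^\mu(q(t))\, T^\nu_\mu(t) \big\} \\
&\equiv \int dt\, \big\{ {:}\, \xi^i(q(t))\, p_i(t)\, {:} - {:}\, \xi^0(q(t))\, \dot{q}^i(t)\, p_i(t)\, {:} + \xi^0(q(t))\, L(t) + \partial_\nu \xi^\mu(q(t))\, T^\nu_\mu(t) \big\}, \\
S_n^{\nu_1..\nu_n}(g_{\nu_1..\nu_n}) &= \frac{1}{2\pi i}\int dt\, \dot{q}^{\nu_1}(t) \ldots \dot{q}^{\nu_n}(t)\, g_{\nu_1..\nu_n}(q(t)), \\
R_n^{\rho|\nu_1..\nu_n}(h_{\rho|\nu_1..\nu_n}) &= \frac{1}{2\pi i}\int dt\, \ddot{q}^{\rho}(t)\, \dot{q}^{\nu_1}(t) \ldots \dot{q}^{\nu_n}(t)\, h_{\rho|\nu_1..\nu_n}(q(t)),
\end{aligned}
\tag{3.6}
\]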
realize the Lie algebra $\widetilde{\mathrm{diff}}(N; 1 + k_1, k_2, -2 + (c + 2N - 2)/12, 1 + k_0)$, while the cohomologically trivial parameters are $a_1 = -1$, $a_2 = (c + 2N - 2)/12$, $a_3 = i/2$.

The proof is deferred to the appendix. Consequently, this algebra acts on $F \otimes M$ for every $\mathrm{Vir}_c\,\tilde{\ltimes}_{k_0}\,\widehat{gl(N)}_{k_1,k_2}$ module M. It should be stressed that this action is manifestly well defined, at least for the subalgebra of vector fields that are polynomial in the spatial coordinates and a Fourier polynomial in $x^0$, because finiteness is preserved when all operators in (3.6) act on finite strings in non-negative Fourier modes in that case. The Hamiltonian

\[
L_{-i\partial_0} = -i \int dt\, \big( -{:}\, \dot{q}^i(t)\, p_i(t)\, {:} + L(t) \big)
\tag{3.7}
\]

is the operator responsible for computing the Z-grading. In the absence of normal ordering and central charges in (3.5), (3.6) yields a proper realization of diff(N). The higher-dimensional analogue of a primary field depends on five parameters $\lambda$, $w$ (defined up to an integer), $\kappa$, $p$, and $q$:

\[
\begin{aligned}
{}[L_\xi, \phi^{\sigma_1..\sigma_p}_{\tau_1..\tau_q}(t)] &= -\xi^0(q(t))\, \dot\phi^{\sigma_1..\sigma_p}_{\tau_1..\tau_q}(t) - \lambda\, \dot\xi^0(q(t))\, \phi^{\sigma_1..\sigma_p}_{\tau_1..\tau_q}(t) \\
&\quad + i w\, \xi^0(q(t))\, \phi^{\sigma_1..\sigma_p}_{\tau_1..\tau_q}(t) - \kappa\, \partial_\mu \xi^\mu(q(t))\, \phi^{\sigma_1..\sigma_p}_{\tau_1..\tau_q}(t) \\
&\quad + \sum_{i=1}^{p} \partial_\mu \xi^{\sigma_i}(q(t))\, \phi^{\sigma_1..\mu..\sigma_p}_{\tau_1..\tau_q}(t) - \sum_{j=1}^{q} \partial_{\tau_j} \xi^\mu(q(t))\, \phi^{\sigma_1..\sigma_p}_{\tau_1..\mu..\tau_q}(t),
\end{aligned}
\tag{3.8}
\]

where $[L_\xi, q^\mu(t)] = \xi^\mu(q(t)) - \dot{q}^\mu(t)\, \xi^0(q(t))$. The result of Eswara Rao and Moody [5] is recovered as follows: they work in a Fourier basis on the torus, and denote $q^i(t) = \delta_i(z)$ and $p_j(t) = d_j(z)$, where $z = \exp(it)$.
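As a small illustration of how the parameters in Theorem 3.1 feed into the central charge $12(c_1 + c_2 + c_3 + c_4)$ of the temporal Virasoro subalgebra quoted in Section 2, the following sketch evaluates both with exact rational arithmetic. The function names `realized_parameters` and `temporal_central_charge` are hypothetical helpers introduced here, not notation from the paper; the formulas are only those stated in the theorem.

```python
from fractions import Fraction


def realized_parameters(c, N, k0=0, k1=0, k2=0):
    """Cocycle parameters (c1, c2, c3, c4) as read off from Theorem 3.1.

    c is the Virasoro central charge of L(t), N the spacetime dimension,
    and k0, k1, k2 the levels appearing in (3.5).
    """
    shift = (Fraction(c) + 2 * N - 2) / 12
    c1 = Fraction(1) + k1
    c2 = Fraction(k2)
    c3 = -2 + shift
    c4 = Fraction(1) + k0
    return c1, c2, c3, c4


def temporal_central_charge(c, N, k0=0, k1=0, k2=0):
    """Central charge 12 * (c1 + c2 + c3 + c4) of the temporal subalgebra."""
    return 12 * sum(realized_parameters(c, N, k0, k1, k2))


if __name__ == "__main__":
    # One dimension, vanishing levels: the result is the input charge c.
    print(temporal_central_charge(26, 1))
    # N = 4 with c = 0: only the Fock module contributes.
    print(temporal_central_charge(0, 4))
```

Under these formulas the total works out to $c + 2(N-1) + 12(k_0 + k_1 + k_2)$, so in one dimension with vanishing levels the sketch simply returns $c$, in line with the remark that the Virasoro algebra is the whole story for $N = 1$.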
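A standard vertex operator realization for the Virasoro generator $L(t)$ was given, based on the remaining roots $\alpha_p$, but they missed the appearance of $\widehat{gl(N)}$. Consequently, $T^\mu_\nu(t) = 0$ and $k_0 = k_1 = k_2 = 0$ in their work.

4. Gauge Algebras
Consider the gauge algebra $\mathrm{map}(N, \mathfrak{g})$, i.e. maps from N-dimensional space-time to a finite-dimensional Lie algebra $\mathfrak{g}$, where $\mathfrak{g}$ has basis $J^a$, structure constants $f^{ab}{}_c$, and Killing metric $\delta^{ab}$. Define constants $g^a$ and $g'^a$ satisfying $f^{ab}{}_c\, g^c = f^{ab}{}_c\, g'^c = 0$. Clearly, $g^a = g'^a = 0$ if $J^a \in [\mathfrak{g}, \mathfrak{g}]$, but they may be non-zero on abelian factors. Let $X = X_a(x) J^a$, $x \in R^N$, be a $\mathfrak{g}$-valued function and define $[X, Y]_c = i f^{ab}{}_c X_a Y_b$. $\mathrm{diff}(N) \ltimes \mathrm{map}(N, \mathfrak{g})$ has the non-central extension $\widetilde{\mathrm{diff}}(N; c_1, c_2, c_3, c_4)\,\tilde{\ltimes}_{g,g'}\,\widetilde{\mathrm{map}}(N, \mathfrak{g}; k)$, with brackets

\[
\begin{aligned}
{}[J_X, J_Y] &= J_{[X,Y]} - k\, \delta^{ab} S_1^\rho(\partial_\rho X_a\, Y_b), \\
[L_\xi, J_X] &= J_{\xi^\mu \partial_\mu X} - g^a S_2^{\mu\nu}(\partial_\mu \xi^0\, \partial_\nu X_a) - g'^a S_1^\rho(\partial_\rho\partial_\mu \xi^\mu\, X_a), \\
[J_X, S_n^{\nu_1..\nu_n}(g_{\nu_1..\nu_n})] &= [J_X, R_n^{\rho|\nu_1..\nu_n}(h_{\rho|\nu_1..\nu_n})] = 0,
\end{aligned}
\tag{4.1}
\]

in addition to (2.3). Let $J^a(t)$, $t \in S^1$, generate the Kac–Moody algebra $\hat{\mathfrak{g}}_k$. Consider the algebra $\mathrm{Vir}_c\,\tilde{\ltimes}_{k_0,g}\,(\widehat{gl(N)}_{k_1,k_2} \oplus_{g'} \hat{\mathfrak{g}}_k)$, with brackets (3.5) and

\[
\begin{aligned}
{}[J^a(s), J^b(t)] &= i f^{ab}{}_c\, J^c(s)\, \delta(s-t) + \frac{k}{2\pi i}\, \delta^{ab}\, \dot\delta(s-t), \\
[T^\mu_\nu(s), J^a(t)] &= \frac{g'^a}{2\pi i}\, \delta^\mu_\nu\, \dot\delta(s-t), \\
[L(s), J^a(t)] &= J^a(s)\, \dot\delta(s-t) + \frac{g^a}{2\pi i}\, \ddot\delta(s-t).
\end{aligned}
\tag{4.2}
\]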
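Then

\[
J_X = \int dt\, X_a(q(t))\, J^a(t)
\tag{4.3}
\]

yields a realization of $\widetilde{\mathrm{map}}(N, \mathfrak{g}; k)$, with the intertwining action of $\widetilde{\mathrm{diff}}(N; c_1, c_2, c_3, c_4)$ described above, and the parameters $k$, $g^a$ and $g'^a$ in (4.1) and (4.2) agree.

A. Proof of Theorem 3.1
We first prove that in absence of normal ordering, (3.6) defines a proper realization of diff(N). The operators $\tilde{p}_\nu(t) = p_\nu(t) + \delta^0_\nu L(t)$ satisfy relations (3.3) and also

\[
[\tilde{p}_\mu(s), T^\nu_\sigma(t)] = \delta^0_\mu\, T^\nu_\sigma(s)\, \dot\delta(s-t).
\tag{A.1}
\]

Introduce the abbreviated notation $\xi^\mu(t) \equiv \xi^\mu(q(t))$. Now,

\[
\begin{aligned}
{}[L_\xi, L_\eta] &= \iint ds\, dt\, \big[\xi^\mu(s)\tilde{p}_\mu(s) + \partial_\sigma \xi^\mu(s) T^\sigma_\mu(s),\; \eta^\nu(t)\tilde{p}_\nu(t) + \partial_\tau \eta^\nu(t) T^\tau_\nu(t)\big] \\
&= \iint ds\, dt\, \Big\{ \xi^\mu(s)\big( \partial_\rho \eta^\nu(t)(\delta^\rho_\mu - \delta^0_\mu \dot{q}^\rho(s))\, \delta(s-t)\, \tilde{p}_\nu(t) + \eta^\nu(t)\, \delta^0_\mu\, \tilde{p}_\nu(s)\, \dot\delta(s-t) \big) \\
&\qquad + \xi^\mu(s)\big( \partial_\rho\partial_\tau \eta^\nu(t)(\delta^\rho_\mu - \delta^0_\mu \dot{q}^\rho(s))\, \delta(s-t)\, T^\tau_\nu(t) + \partial_\tau \eta^\nu(t)\, \delta^0_\mu\, T^\tau_\nu(s)\, \dot\delta(s-t) \big) \\
&\qquad + \partial_\sigma \xi^\mu(s)\, \partial_\tau \eta^\nu(t)\, \delta^\tau_\mu\, T^\sigma_\nu(s)\, \delta(s-t) \Big\} - \xi \leftrightarrow \eta,
\end{aligned}
\tag{A.2}
\]

where $\xi \leftrightarrow \eta$ stands for the same expression with $\xi$ and $\eta$ interchanged everywhere. Rewrite the terms proportional to the derivative of the delta function by noting that

\[
\iint ds\, dt\, f(s)\, g(t)\, \dot\delta(s-t) = \int f \dot{g} = -\int \dot{f} g.
\tag{A.3}
\]

The function arguments were suppressed in the single integrals, because no confusion is possible. This leaves us with

\[
\begin{aligned}
&\int \big\{ \xi^\mu(\partial_\mu \eta^\nu - \delta^0_\mu \dot{q}^\rho \partial_\rho \eta^\nu)\tilde{p}_\nu + \xi^\mu \dot\eta^\nu \delta^0_\mu \tilde{p}_\nu + \xi^\mu \partial_\rho\partial_\tau \eta^\nu(\delta^\rho_\mu - \delta^0_\mu \dot{q}^\rho) T^\tau_\nu + \xi^\mu \partial_\tau \dot\eta^\nu \delta^0_\mu T^\tau_\nu + \partial_\sigma \xi^\mu \partial_\mu \eta^\nu T^\sigma_\nu \big\} - \xi \leftrightarrow \eta \\
&\quad = \int \big\{ \xi^\mu \partial_\mu \eta^\nu\, \tilde{p}_\nu + \xi^\mu \partial_\mu\partial_\tau \eta^\nu\, T^\tau_\nu + \partial_\sigma \xi^\mu \partial_\mu \eta^\nu\, T^\sigma_\nu \big\} - \xi \leftrightarrow \eta \\
&\quad = \int \big\{ (\xi^\mu \partial_\mu \eta)^\nu\, \tilde{p}_\nu + \partial_\tau(\xi^\mu \partial_\mu \eta^\nu)\, T^\tau_\nu \big\} - \xi \leftrightarrow \eta = L_{\xi^\mu \partial_\mu \eta} - \xi \leftrightarrow \eta,
\end{aligned}
\tag{A.4}
\]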
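where we used that $\dot\eta^\nu = \dot{q}^\rho \partial_\rho \eta^\nu$. Hence $[L_\xi, L_\eta] = L_{[\xi,\eta]}$, and it is clear that normal ordering must result in some abelian extension of diff(N). We now proceed to calculate it. Split the delta function into positive and negative energy parts,

\[
\delta^{>}(t) = \frac{1}{2\pi}\sum_{m>0} e^{-imt}, \qquad \delta^{\le}(t) = \frac{1}{2\pi}\sum_{m\le 0} e^{-imt}.
\tag{A.5}
\]

Lemma A.1.

\[
\begin{aligned}
\text{i.}\quad & \delta^{>}(t)\,\delta^{\le}(-t) - \delta^{>}(-t)\,\delta^{\le}(t) = -\frac{1}{2\pi i}\,\dot\delta(t), \\
\text{ii.}\quad & \delta^{>}(t)\,\dot\delta^{\le}(-t) - \dot\delta^{>}(-t)\,\delta^{\le}(t) = \frac{1}{4\pi i}\big(\ddot\delta(t) + i\dot\delta(t)\big), \\
\text{iii.}\quad & \dot\delta^{>}(t)\,\dot\delta^{\le}(-t) - \dot\delta^{>}(-t)\,\dot\delta^{\le}(t) = \frac{1}{12\pi i}\big(\dddot\delta(t) + \dot\delta(t)\big).
\end{aligned}
\]

Proof. In each case set $k = m - n$.

\[
\begin{aligned}
\text{i.}\quad 4\pi^2 \cdot \mathrm{LHS} &= \sum_{m>0}\sum_{n\le 0}\big(e^{-i(m-n)t} - e^{i(m-n)t}\big) = \sum_{k>0}\sum_{m=1}^{k}\big(e^{-ikt} - e^{ikt}\big) \\
&= \sum_{k>0} k\big(e^{-ikt} - e^{ikt}\big) = \sum_{k} k\, e^{-ikt} = 2\pi i\, \dot\delta(t). \\
\text{ii.}\quad 4\pi^2 i \cdot \mathrm{LHS} &= \sum_{m>0}\sum_{n\le 0}\big(n\, e^{-i(m-n)t} - m\, e^{i(m-n)t}\big) = \sum_{k>0}\sum_{m=1}^{k}\big((m-k)\, e^{-ikt} - m\, e^{ikt}\big) \\
&= \sum_{k>0}\Big(-\frac{k(k-1)}{2}\, e^{-ikt} - \frac{k(k+1)}{2}\, e^{ikt}\Big) = -\sum_{k}\frac{k(k-1)}{2}\, e^{-ikt} = \pi\ddot\delta(t) + \pi i\,\dot\delta(t). \\
\text{iii.}\quad -4\pi^2 \cdot \mathrm{LHS} &= \sum_{m>0}\sum_{n\le 0}\big(mn\, e^{-i(m-n)t} - mn\, e^{i(m-n)t}\big) = \sum_{k>0}\sum_{m=1}^{k} m(m-k)\big(e^{-ikt} - e^{ikt}\big) \\
&= \sum_{k>0} -\frac{k^3 - k}{6}\big(e^{-ikt} - e^{ikt}\big) = -\sum_{k}\frac{k^3 - k}{6}\, e^{-ikt} = \frac{1}{6}\big(2\pi i\,\dddot\delta(t) + 2\pi i\,\dot\delta(t)\big).
\end{aligned}
\]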
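The proof of Lemma A.1 rests on three finite sums over the pairs $(m, n)$ with $m > 0$, $n \le 0$ and $m - n = k$. The sketch below enumerates those pairs explicitly and compares the sums with the closed forms used above; the helper name `pair_sums` is introduced here purely for illustration.

```python
def pair_sums(k):
    """Sums over all pairs (m, n) with m > 0, n <= 0 and m - n = k.

    Returns the three coefficients of exp(-i k t) that appear in parts
    i, ii and iii of the proof: the number of pairs, the sum of n, and
    the sum of m * n.
    """
    # m runs from 1 to k; n = m - k is then automatically <= 0.
    pairs = [(m, m - k) for m in range(1, k + 1)]
    count = len(pairs)
    sum_n = sum(n for _, n in pairs)
    sum_mn = sum(m * n for m, n in pairs)
    return count, sum_n, sum_mn


if __name__ == "__main__":
    for k in range(1, 200):
        count, sum_n, sum_mn = pair_sums(k)
        assert count == k                       # part i
        assert sum_n == -k * (k - 1) // 2       # part ii
        assert sum_mn == -(k**3 - k) // 6       # part iii
    print("closed forms confirmed for k = 1..199")
```

This only checks the combinatorial coefficients; the identification of the resulting Fourier series with derivatives of the delta function is the separate step carried out in the proof.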
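Define

\[
\tilde\xi^i(t) \equiv \tilde\xi^i(q(t), \dot{q}(t)) = \xi^i(q(t)) - \xi^0(q(t))\, \dot{q}^i(t),
\tag{A.6}
\]

\[
\chi^{>i}_{\xi j}(t, s) \equiv [p_j^{>}(t), \tilde\xi^i(s)] = \partial_j \tilde\xi^i(s)\, \delta^{>}(t-s) + \delta^i_j\, \xi^0(s)\, \dot\delta^{>}(t-s),
\tag{A.7}
\]

and $\chi^{\le i}_{\xi j}(t, s)$ analogously. Moreover, set $\chi^{i}_{\xi j}(t, s) = \chi^{>i}_{\xi j}(t, s) + \chi^{\le i}_{\xi j}(t, s)$.

Lemma A.2. The expressions defined in (A.6) satisfy the following relations:

\[
\partial_i \tilde\xi^i = \partial_\mu \xi^\mu - \dot\xi^0,
\tag{A.8}
\]

\[
\begin{aligned}
\partial_j \dot{\tilde\xi}^i\, \partial_i \tilde\eta^j &= \partial_\nu \dot\xi^\mu\, \partial_\mu \eta^\nu + \partial_\nu \xi^0\, \dot{q}^\rho \partial_\rho \dot\eta^\nu - \dot{q}^\rho \partial_\rho \dot\xi^\mu\, \partial_\mu \eta^0 - \ddot\xi^0 \dot\eta^0 \\
&\quad - \dot\xi^0\, \dot{q}^\rho \partial_\rho \dot\eta^0 + \dot{q}^\rho \partial_\rho \dot\xi^0\, \dot\eta^0 + \frac{d}{dt}\big(\dot\xi^0 \dot\eta^0 - \partial_\nu \xi^0\, \dot\eta^\nu\big).
\end{aligned}
\tag{A.9}
\]

Proof. We use that $\tilde\xi^0 \equiv 0$. Equation (A.8) thus equals

\[
\partial_\mu \tilde\xi^\mu = \partial_\mu \xi^\mu - \partial_\mu \xi^0\, \dot{q}^\mu,
\tag{A.10}
\]

whereas (A.9) becomes

\[
\begin{aligned}
\partial_\nu \dot{\tilde\xi}^\mu\, \partial_\mu \tilde\eta^\nu &= (\partial_\nu \dot\xi^\mu - \partial_\nu \xi^0\, \ddot{q}^\mu - \partial_\nu \dot\xi^0\, \dot{q}^\mu)(\partial_\mu \eta^\nu - \partial_\mu \eta^0\, \dot{q}^\nu) \\
&= \partial_\nu \dot\xi^\mu\, \partial_\mu \eta^\nu - \partial_\nu \xi^0(\ddot\eta^\nu - \dot{q}^\rho \partial_\rho \dot\eta^\nu) - \partial_\nu \dot\xi^0\, \dot\eta^\nu \\
&\quad - \dot{q}^\rho \partial_\rho \dot\xi^\mu\, \partial_\mu \eta^0 + \dot\xi^0(\ddot\eta^0 - \dot{q}^\rho \partial_\rho \dot\eta^0) + \dot{q}^\rho \partial_\rho \dot\xi^0\, \dot\eta^0.
\end{aligned}
\tag{A.11}
\]

Consider

\[
L^0_\xi = \int dt\, {:}\, \xi^\mu(q(t))\, p_\mu(t)\, {:} \;\equiv\; \int dt\, \big(\tilde\xi^i(t)\, p_i^{\le}(t) + p_i^{>}(t)\, \tilde\xi^i(t)\big),
\tag{A.12}
\]

\[
[L^0_\xi, L^0_\eta] = \iint ds\, dt\, \big[\tilde\xi^i(s)\, p_i^{\le}(s) + p_i^{>}(s)\, \tilde\xi^i(s),\; \tilde\eta^j(t)\, p_j^{\le}(t) + p_j^{>}(t)\, \tilde\eta^j(t)\big]
\tag{A.13}
\]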
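\[
\begin{aligned}
= \iint ds\, dt\, \Big\{ & \tilde\xi^i(s)\, \chi^{\le j}_{\eta i}(s,t)\, p_j^{\le}(t) - \tilde\eta^j(t)\, \chi^{\le i}_{\xi j}(t,s)\, p_i^{\le}(s) \\
& + \tilde\xi^i(s)\, p_j^{>}(t)\, \chi^{\le j}_{\eta i}(s,t) - \chi^{>i}_{\xi j}(t,s)\, \tilde\eta^j(t)\, p_i^{\le}(s) \\
& + \chi^{>j}_{\eta i}(s,t)\, p_j^{\le}(t)\, \tilde\xi^i(s) - p_i^{>}(s)\, \tilde\eta^j(t)\, \chi^{\le i}_{\xi j}(t,s) \\
& - p_i^{>}(s)\, \chi^{>i}_{\xi j}(t,s)\, \tilde\eta^j(t) + p_j^{>}(t)\, \chi^{>j}_{\eta i}(s,t)\, \tilde\xi^i(s) \Big\}.
\end{aligned}
\]

Of these eight terms, the third can be rewritten as

\[
p_j^{>}(t)\, \tilde\xi^i(s)\, \chi^{\le j}_{\eta i}(s,t) - \chi^{>i}_{\xi j}(t,s)\, \chi^{\le j}_{\eta i}(s,t)
\tag{A.14}
\]

and the fifth as

\[
\chi^{>j}_{\eta i}(s,t)\, \tilde\xi^i(s)\, p_j^{\le}(t) + \chi^{>j}_{\eta i}(s,t)\, \chi^{\le i}_{\xi j}(t,s).
\tag{A.15}
\]

Hence

\[
\begin{aligned}
{}[L^0_\xi, L^0_\eta] = \iint ds\, dt\, \Big\{ & \tilde\xi^i(s)\, \chi^{\le j}_{\eta i}(s,t)\, p_j^{\le}(t) - \tilde\eta^j(t)\, \chi^{\le i}_{\xi j}(t,s)\, p_i^{\le}(s) \\
& + p_j^{>}(t)\, \tilde\xi^i(s)\, \chi^{\le j}_{\eta i}(s,t) - \tilde\eta^j(t)\, \chi^{>i}_{\xi j}(t,s)\, p_i^{\le}(s) \\
& + \tilde\xi^i(s)\, \chi^{>j}_{\eta i}(s,t)\, p_j^{\le}(t) - p_i^{>}(s)\, \tilde\eta^j(t)\, \chi^{\le i}_{\xi j}(t,s) \\
& - p_i^{>}(s)\, \tilde\eta^j(t)\, \chi^{>i}_{\xi j}(t,s) + p_j^{>}(t)\, \tilde\xi^i(s)\, \chi^{>j}_{\eta i}(s,t) \\
& - \chi^{>i}_{\xi j}(t,s)\, \chi^{\le j}_{\eta i}(s,t) + \chi^{>j}_{\eta i}(s,t)\, \chi^{\le i}_{\xi j}(t,s) \Big\}.
\end{aligned}
\tag{A.16}
\]

The regular piece is

\[
\iint ds\, dt\, \big\{ \tilde\xi^i(s)\, \chi^{j}_{\eta i}(s,t)\, p_j^{\le}(t) + p_j^{>}(t)\, \tilde\xi^i(s)\, \chi^{j}_{\eta i}(s,t) \big\} - \xi \leftrightarrow \eta.
\tag{A.17}
\]

We focus on the first term,

\[
\begin{aligned}
& \iint ds\, dt\, \tilde\xi^i(s)\, \chi^{j}_{\eta i}(s,t)\, p_j^{\le}(t) - \xi \leftrightarrow \eta \\
&\quad = \iint ds\, dt\, \tilde\xi^\mu(s)\big(\partial_\mu \tilde\eta^j(t)\, \delta(s-t) + \eta^0(t)\, \delta^j_\mu\, \dot\delta(s-t)\big)\, p_j^{\le}(t) - \xi \leftrightarrow \eta \\
&\quad = \int \big\{ \widetilde{(\xi^\mu \partial_\mu \eta)}{}^j - \xi^0(\dot\eta^j - \dot\eta^0 \dot{q}^j) - \dot{\tilde\xi}^j \eta^0 \big\}\, p_j^{\le} - \xi \leftrightarrow \eta,
\end{aligned}
\tag{A.18}
\]

which equals $L^0_{[\xi,\eta]}$. We again suppress the integration variable in single integrals, and write $\xi^\mu(s) \equiv \xi^\mu(q(s))$, etc. The extension $\mathrm{ext}^0(\xi, \eta)$ becomes

\[
\begin{aligned}
& \iint ds\, dt\, \big\{ -\chi^{>i}_{\xi j}(t,s)\, \chi^{\le j}_{\eta i}(s,t) + \chi^{>j}_{\eta i}(s,t)\, \chi^{\le i}_{\xi j}(t,s) \big\} \\
&\quad = -\iint ds\, dt\, \Big\{ \big(\partial_j \tilde\xi^i(s)\, \delta^{>}(t-s) + \delta^i_j\, \xi^0(s)\, \dot\delta^{>}(t-s)\big)\big(\partial_i \tilde\eta^j(t)\, \delta^{\le}(s-t) + \delta^j_i\, \eta^0(t)\, \dot\delta^{\le}(s-t)\big) - \xi \leftrightarrow \eta \Big\} \\
&\quad = -\iint ds\, dt\, \Big\{ \partial_j \tilde\xi^i(s)\, \partial_i \tilde\eta^j(t)\, \delta^{>}(t-s)\, \delta^{\le}(s-t) + \xi^0(s)\, \partial_j \tilde\eta^j(t)\, \dot\delta^{>}(t-s)\, \delta^{\le}(s-t) \\
&\qquad\qquad + \partial_i \tilde\xi^i(s)\, \eta^0(t)\, \delta^{>}(t-s)\, \dot\delta^{\le}(s-t) + \delta^i_i\, \xi^0(s)\, \eta^0(t)\, \dot\delta^{>}(t-s)\, \dot\delta^{\le}(s-t) - \xi \leftrightarrow \eta \Big\} \\
&\quad = \frac{1}{2\pi i}\iint ds\, dt\, \Big\{ \partial_j \tilde\xi^i(s)\, \partial_i \tilde\eta^j(t)\, \dot\delta(t-s) + \frac{1}{2}\, \xi^0(s)\, \partial_j \tilde\eta^j(t)\big(\ddot\delta(t-s) - i\dot\delta(t-s)\big) \\
&\qquad\qquad - \frac{1}{2}\, \partial_i \tilde\xi^i(s)\, \eta^0(t)\big(\ddot\delta(t-s) + i\dot\delta(t-s)\big) - \frac{N-1}{6}\, \xi^0(s)\, \eta^0(t)\big(\dddot\delta(t-s) + \dot\delta(t-s)\big) \Big\} \\
&\quad = \frac{1}{2\pi i}\int \Big\{ \partial_j \dot{\tilde\xi}^i\, \partial_i \tilde\eta^j - \frac{1}{2}\, \dot\xi^0\, \partial_j \dot{\tilde\eta}^j + \frac{1}{2}\, \partial_i \dot{\tilde\xi}^i\, \dot\eta^0 - \frac{N-1}{6}\big(-\ddot\xi^0 \dot\eta^0 + \dot\xi^0 \eta^0\big) + \frac{i}{2}\big(-\dot\xi^0\, \partial_j \tilde\eta^j + \partial_i \tilde\xi^i\, \dot\eta^0\big) \Big\},
\end{aligned}
\tag{A.19}
\]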
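where we used Lemma A.1 and the fact that $\delta^i_i = N - 1$. Now consider the full algebra. The regular piece follows from the following calculation.

\[
\begin{aligned}
& \int (\xi^\mu \partial_\mu \eta^\nu - \xi^0 \dot\eta^\nu)\, p_\nu + \iint ds\, dt\, \xi^\mu(s)\, \eta^\nu(t)\, \delta^0_\mu\, p_\nu(s)\, \dot\delta(s-t) \\
&\quad + \int (\xi^\mu \partial_\mu \eta^0 - \xi^0 \dot\eta^0)\, L + \iint ds\, dt\, \xi^0(s)\, \eta^0(t)\, L(s)\, \dot\delta(s-t) \\
&\quad + \int \big\{ (\xi^\mu \partial_\mu\partial_\sigma \eta^\tau - \xi^0\, \partial_\sigma \dot\eta^\tau)\, T^\sigma_\tau + \partial_\mu \xi^\nu\, \partial_\sigma \eta^\tau\, \delta^\sigma_\nu\, T^\mu_\tau \big\} \\
&\quad + \iint ds\, dt\, \xi^0(s)\, \partial_\sigma \eta^\tau(t)\, T^\sigma_\tau(s)\, \dot\delta(s-t) - \xi \leftrightarrow \eta \\
&= \int \big\{ [\xi,\eta]^\nu\, p_\nu + [\xi,\eta]^0\, L + \partial_\mu [\xi,\eta]^\nu\, T^\mu_\nu \big\},
\end{aligned}
\tag{A.20}
\]

and the full extension is

\[
\begin{aligned}
\mathrm{ext}(\xi, \eta) &= \mathrm{ext}^0(\xi, \eta) + \iint ds\, dt\, \Big\{ \frac{c}{24\pi i}\, \xi^0(s)\, \eta^0(t)\big(\dddot\delta(s-t) + \dot\delta(s-t)\big) \\
&\qquad + \frac{k_0}{4\pi i}\big(\xi^0(s)\, \partial_\nu \eta^\nu(t) - \partial_\mu \xi^\mu(s)\, \eta^0(t)\big)\, \ddot\delta(s-t) \\
&\qquad - \partial_\sigma \xi^\mu(s)\, \partial_\tau \eta^\nu(t)\Big(\frac{k_1}{2\pi i}\, \delta^\sigma_\nu \delta^\tau_\mu + \frac{k_2}{2\pi i}\, \delta^\sigma_\mu \delta^\tau_\nu\Big)\, \dot\delta(s-t) \Big\} \\
&= \mathrm{ext}^0(\xi, \eta) + \frac{1}{2\pi i}\int \Big\{ \frac{c}{12}\big(\ddot\xi^0 \dot\eta^0 - \dot\xi^0 \eta^0\big) + \frac{k_0}{2}\big(-\dot\xi^0\, \partial_\nu \dot\eta^\nu + \dot\eta^0\, \partial_\mu \dot\xi^\mu\big) \\
&\qquad + k_1\, \partial_\nu \dot\xi^\mu\, \partial_\mu \eta^\nu + k_2\, \partial_\mu \dot\xi^\mu\, \partial_\nu \eta^\nu \Big\}.
\end{aligned}
\tag{A.21}
\]

The result now follows by means of Lemma A.2,

\[
\begin{aligned}
\mathrm{ext}(\xi, \eta) = \frac{1}{2\pi i}\int dt\, \Big\{ & (1 + k_1)\, \partial_\nu \dot\xi^\mu\, \partial_\mu \eta^\nu + k_2\, \partial_\mu \dot\xi^\mu\, \partial_\nu \eta^\nu \\
& + \partial_\nu \xi^0\, \dot{q}^\rho \partial_\rho \dot\eta^\nu - \dot{q}^\rho \partial_\rho \dot\xi^\mu\, \partial_\mu \eta^0 - \dot\xi^0\, \dot{q}^\rho \partial_\rho \dot\eta^0 + \dot{q}^\rho \partial_\rho \dot\xi^0\, \dot\eta^0 \\
& + \frac{1 + k_0}{2}\big(\partial_\mu \dot\xi^\mu\, \dot\eta^0 - \dot\xi^0\, \partial_\nu \dot\eta^\nu\big) - \Big(2 - \frac{c + 2(N-1)}{12}\Big)\ddot\xi^0 \dot\eta^0 \\
& - \frac{c + 2(N-1)}{12}\, \dot\xi^0 \eta^0 + \frac{i}{2}\big(\partial_\mu \xi^\mu\, \dot\eta^0 - \dot\xi^0\, \partial_\nu \eta^\nu\big) \Big\},
\end{aligned}
\tag{A.22}
\]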
where $\dot{f} = \dot{q}^\rho \partial_\rho f$. As a consistency check we note that the extension satisfies $\mathrm{ext}(\eta, \xi) = -\mathrm{ext}(\xi, \eta)$. To calculate the remaining brackets is a straightforward task. Note that normal ordering is irrelevant here, because $S_n^{\nu_1..\nu_n}$ and $R_n^{\rho|\nu_1..\nu_n}$ depend on $q^\mu$ only whereas $L_\xi$ depends only linearly on $p_\nu$.

Note added
A. Dzhumadil'daev has explained his results [3], which I had slightly misunderstood. The Rao–Moody cocycle $c_1$ is included in his list; it is equivalent to his cocycle $\psi_4^W$, with coefficients in $\Omega^1_{DeRham}/B^1_{DeRham} \cong B^2_{DeRham} \oplus H^1_{DeRham}$. Similarly, $c_2$ is his $\psi_3^W$. $H^1_{DeRham}$ is an N-dimensional trivial diff(N) module; setting it to zero gives the substitution (2.5). The closedness condition $S_1^\rho(\partial_\rho f) = 0$ can be lifted for the cocycle $c_2$ (but not for $c_1$). One then obtains $\psi_1^W$, first discovered in [7]. Dzhumadil'daev considered extensions by modules of tensor fields, not necessarily irreducible. $c_3$ and $c_4$ are not included in his list, because they are extensions by other types of modules.

References
1. Berman, S. and Billig, Y.: Irreducible representations for toroidal Lie algebras. Preprint (1998)
2. Billig, Y.: Principal vertex operator representations for toroidal Lie algebras. J. Math. Phys. 7, 3844–3864 (1998)
3. Dzhumadil'daev, A.: Virasoro type Lie algebras and deformations. Z. Phys. C 72, 509–517 (1996)
4. Eswara Rao, S., Moody, R.V. and Yokonuma, T.: Lie algebras and Weyl groups arising from vertex operator representations. Nova J. of Algebra and Geometry 1, 15–57 (1992)
5. Eswara Rao, S. and Moody, R.V.: Vertex representations for N-toroidal Lie algebras and a generalization of the Virasoro algebra. Commun. Math. Phys. 159, 239–264 (1994)
6. Fabbri, M. and Moody, R.V.: Irreducible representations of Virasoro-toroidal Lie algebras. Commun. Math. Phys. 159, 1–13 (1994)
7. Larsson, T.A.: Multi-dimensional Virasoro algebra. Phys. Lett. A 231, 94–96 (1989)
8. Larsson, T.A.: Central and non-central extensions of multi-graded Lie algebras. J. Phys. A 25, 1177–1184 (1992)
9. Larsson, T.A.: Fock representations of non-centrally extended super-diffeomorphism algebras. physics/9710022 (1997)
10. Moody, R.V., Eswara Rao, S. and Yokonuma, T.: Toroidal Lie algebras and vertex representations. Geom. Ded. 35, 283–307 (1990)
Communicated by G. Felder
Commun. Math. Phys. 201, 471 – 491 (1999)
Communications in
Mathematical Physics © Springer-Verlag 1999
Null Line Preserving Bijections of Schwarzschild Spacetime*
Wen-ling Huang
Mathematisches Seminar, Universität Hamburg, Bundesstraße 55, 20146 Hamburg, Germany
Received: 16 June 1998 / Accepted: 7 September 1998
* Research supported by the Deutsche Forschungsgemeinschaft

Abstract: Let $n \in N$, $n \ge 3$. A bijection of n-dimensional (exterior) Schwarzschild spacetime is an isometry, if, and only if, images and pre-images of null lines are null lines.

1. Introduction
A theorem of A. D. Alexandrov [2, 3] states that any bijection f of flat n-dimensional Lorentz–Minkowski spacetime $M^n$ ($n \ge 3$) is a Lorentz transformation up to a dilatation, if f and $f^{-1}$ preserve distance zero between pairs of points. The significance of this result is that, unlike Einstein's original derivation of Lorentz transformations, it assumes no regularity conditions (such as linearity, differentiability or even continuity) for the transformations. A related result is Zeeman's theorem [20, 4]: A bijection of four-dimensional Lorentz–Minkowski spacetime which satisfies

y − x is timelike and future-pointing ⇔ f(y) − f(x) is timelike and future-pointing

for all $x, y \in M^4$, is an orthochronous Lorentz transformation up to a dilatation. Interest in these and similar characterizations has been growing in recent years (see e.g. [5, 6, 9, 17]). J. A. Lester has examined Robertson–Walker spacetimes [13, 14, 15, 16] in this sense. A result which covers a larger class of spacetimes is due to S. W. Hawking [11]: A conformal diffeomorphism of a four-dimensional, strongly causal spacetime (M, g) can be characterized as a homeomorphism of (M, g) which takes null geodesic curves (as point sets) to null geodesic curves. The homeomorphy condition may be omitted if we demand that also pre-images of null geodesic curves are null geodesic curves [12]:
Let (M, g) be a strongly causal spacetime. Let f : M → M be a bijection such that f and $f^{-1}$ take null geodesic curves (as point sets) to null geodesic curves. Then f is a homeomorphism and, by Hawking's theorem, a conformal transformation.

Finally, we mention R. Göbel [8] who determined the bijections of a spacetime which preserve the generalized Zeeman topology in both directions. A null line is an inextendible null geodesic curve (as a point set). Alexandrov's theorem may be reformulated in the following way. Any bijection f of $M^n$, $n \ge 3$, such that f and $f^{-1}$ take null lines to null lines is a Lorentz transformation, up to a dilatation. The hypothesis that null lines are mapped to null lines is much milder than the hypothesis that null geodesic curves are mapped to null geodesic curves, which is used in [11, 12].

In the present paper we examine null line preserving bijections of Schwarzschild spacetime. The four-dimensional Schwarzschild solution of Einstein's equations represents the spherically symmetric, empty spacetime outside a static, spherically symmetric massive body. Therefore it is a good approximation for the gravitational field of the earth or our solar system. Mercurial perihelion advance and deflection of light in the gravitational field near the sun have been successfully predicted using this solution. Also many differential geometric results underline the fundamental role of the Schwarzschild solution in physics (see e.g. [10]). We determine all bijections of n-dimensional (exterior) Schwarzschild spacetime such that images and pre-images of null lines are null lines. For $n \ge 2$ all such bijections are a group $(G_n, \circ)$ which contains the isometry group $(\Delta_n, \circ)$ as a subgroup. If $n > 2$, then $G_n = \Delta_n$. The two-dimensional Schwarzschild half-plane is conformal to the flat Lorentz–Minkowski plane. Hence the group $(G_2, \circ)$ is isomorphic to the corresponding group of the Lorentz–Minkowski plane.

1.1. Mathematical preliminaries.
Definition 1. Let $n \in N$, $n \ge 2$, $m \in R_{>0}$. The n-dimensional (exterior) Schwarzschild spacetime with mass m is the following semi-Riemannian manifold. The manifold is the point set $M^n_m$,

\[
\begin{aligned}
M^2_m &:= \{ (x_1, t) \in R^2 \mid x_1 > 2m \}, \\
M^n_m &:= \big\{ (x_1, \ldots, x_{n-1}, t) \in R^n \;\big|\; \sqrt{(x_1)^2 + \ldots + (x_{n-1})^2} > 2m \big\}, \quad n \ge 3,
\end{aligned}
\]

together with the complete atlas induced by the chart $K := (M^n_m, \mathrm{id}_{M^n_m} : M^n_m \to M^n_m \subset R^n)$. The metric tensor is given by

\[
ds^2 = (dx_1)^2 + \ldots + (dx_{n-1})^2 + \frac{2m}{r - 2m}\, dr^2 - h(r)\, dt^2,
\]

where $r := \sqrt{(x_1)^2 + \ldots + (x_{n-1})^2}$ and $h(r) := 1 - \frac{2m}{r}$.
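A quick sanity check on Definition 1: in polar coordinates the flat part $(dx_1)^2 + \ldots + (dx_{n-1})^2$ already contributes $dr^2$ with coefficient 1, so the total radial coefficient is $1 + 2m/(r - 2m)$, which should coincide with the familiar $1/h(r)$. The sketch below confirms this with exact rational arithmetic; the function names are illustrative only.

```python
from fractions import Fraction


def h(r, m):
    """h(r) = 1 - 2m/r from Definition 1 (requires r > 2m > 0)."""
    return 1 - Fraction(2 * m) / r


def radial_coefficient(r, m):
    """Total coefficient of dr^2: 1 from the flat part plus 2m/(r - 2m)."""
    return 1 + Fraction(2 * m) / (r - 2 * m)


if __name__ == "__main__":
    m = 1
    for r in (Fraction(5, 2), 3, 10, Fraction(201, 100)):
        # Both expressions reduce to r / (r - 2m).
        assert radial_coefficient(r, m) == 1 / h(r, m)
    print("radial coefficient equals 1/h(r) at all sampled radii")
```

With this identification the metric reads $h(r)^{-1} dr^2 - h(r)\, dt^2$ plus the angular part, which is the form used for the half-plane in Section 2.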
Remark 1. The interest of Schwarzschild spacetime in four dimensions is that it solves the vacuum Einstein equations. Definition 1 generalizes this solution to dimension n = 2, 3, . . . . However, the Ricci tensor $R_{ij}$ of $M^n_m$ vanishes only in dimension n = 4.

Lemma 1. The isometry group of $M^n_m$ is the group of all bijections

\[
\delta : M^n_m \to M^n_m, \qquad \delta(X) = X \begin{pmatrix} A & \begin{matrix} 0 \\ \vdots \\ 0 \end{matrix} \\ 0 \cdots 0 & \varepsilon \end{pmatrix} + (0, \ldots, 0, b),
\]

where $X = (x_1, \ldots, x_{n-1}, t)$ are the Cartesian coordinates of a point of $M^n_m$, b is a real number, $\varepsilon \in \{-1, 1\}$. A is an orthogonal $(n-1) \times (n-1)$-matrix in the case $n \ge 3$, and A = (1) for n = 2.

Lemma 2. Let $I_1$, $I_2$ be open, not necessarily bounded real intervals. Let $\gamma_1 : I_1 \to M^n_m$, $\gamma_2 : I_2 \to M^n_m$ be two null geodesics. If there exists an $\eta_0 \in I_1 \cap I_2$ with $\gamma_1(\eta_0) = \gamma_2(\eta_0)$ and $\gamma_1'(\eta_0) = \gamma_2'(\eta_0)$, then $\gamma_1(\eta) = \gamma_2(\eta)$ for all $\eta \in I_1 \cap I_2$.

Definition 2. A null geodesic $\gamma_1 : I_1 \to M^n_m$ is called maximal if we have $I_2 \subseteq I_1$ for any null geodesic $\gamma_2 : I_2 \to M^n_m$ with $\gamma_1(\eta_0) = \gamma_2(\eta_0)$ and $\gamma_1'(\eta_0) = \gamma_2'(\eta_0)$. A null line in $M^n_m$ is the image of a maximal null geodesic.
From now on, let $m \in R_{>0}$ be fixed. Instead of $M^n_m$ we simply write $M^n$. The points of $M^n$ are denoted by capitals, and null lines are denoted by small letters.

2. The Schwarzschild Half-Plane
Let n = 2. The metric tensor of the Schwarzschild half-plane $M^2$ is $ds^2 = h(r)^{-1} dr^2 - h(r)\, dt^2$.

Lemma 3. The set of all null lines in $M^2$ is given by $E := E_1 \,\dot\cup\, E_2$, see Fig. 1,

\[
\begin{aligned}
E_1 &:= \big\{ \{ (s + 2m, \mu(s) + b^*) \mid s > 0 \} \;\big|\; b^* \in R \big\}, \\
E_2 &:= \big\{ \{ (s + 2m, -(\mu(s) + b^*)) \mid s > 0 \} \;\big|\; b^* \in R \big\},
\end{aligned}
\]

where the mapping $\mu : R_{>0} \to R$, $\mu(s) := s + 2m \ln s$ is bijective.

Fig. 1. (Figure not reproduced; axes r and t, with the line r = 2m marked.)
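The two facts behind Lemma 3 can be probed numerically: the curves $t = \pm(\mu(s) + b^*)$, $r = s + 2m$ are null for the metric $h^{-1} dr^2 - h\, dt^2$, and $\mu$ is a strictly increasing bijection of $R_{>0}$ onto $R$. The following is a minimal sketch under the assumption of moderate arguments (the bracketing loop is not guarded against extreme values); `mu_inverse` and `null_defect` are names chosen here for illustration.

```python
import math


def mu(s, m):
    """mu(s) = s + 2m ln s from Lemma 3, defined for s > 0."""
    return s + 2.0 * m * math.log(s)


def mu_inverse(y, m):
    """Invert mu by bisection; mu is strictly increasing on (0, inf)."""
    lo, hi = 1.0, 1.0
    while mu(lo, m) > y:      # mu -> -inf as s -> 0+
        lo /= 2.0
    while mu(hi, m) < y:      # mu -> +inf as s -> inf
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mu(mid, m) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def null_defect(s, m):
    """Value of h^{-1} - h (dt/dr)^2 along t = mu(s) + const, r = s + 2m."""
    r = s + 2.0 * m
    h = 1.0 - 2.0 * m / r
    dt_dr = 1.0 + 2.0 * m / s     # derivative of mu
    return 1.0 / h - h * dt_dr ** 2


if __name__ == "__main__":
    m = 1.0
    for y in (-5.0, 0.0, 3.5):
        s = mu_inverse(y, m)
        print(y, s, mu(s, m))
    print(null_defect(0.7, m))
```

The defect vanishes (up to rounding) because $\mu'(s) = 1 + 2m/s = (s + 2m)/s = 1/h(s + 2m)$, which is exactly the null condition $dt/dr = \pm 1/h$.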
Lemma 4. The mapping $f : M^2 \to R^2$, $(r, t) = (s + 2m, t) \mapsto (\mu(s), t)$ is bijective and the images of the sets $E_1$, $E_2$ are

\[
\begin{aligned}
f(E_1) &:= \{ f(g) \mid g \in E_1 \} = \{ \{ (x_1, x_2) \in R^2 \mid x_2 = +x_1 + c \} \mid c \in R \}, \\
f(E_2) &:= \{ f(g) \mid g \in E_2 \} = \{ \{ (x_1, x_2) \in R^2 \mid x_2 = -x_1 + c \} \mid c \in R \}.
\end{aligned}
\]

Proposition 1. The group $(G_2, \circ)$ of all bijections of $M^2$ such that images and preimages of null lines are null lines is isomorphic to the corresponding group of bijections of the Lorentz–Minkowski plane $R^2$. The latter group consists of all permutations $\sigma : R^2 \to R^2$, which preserve the Lorentz–Minkowski distance 0 in both directions,
$d(x, y) = 0 \Leftrightarrow d(\sigma(x), \sigma(y)) = 0$ for all $x, y \in R^2$, where $d((x_1, x_2), (y_1, y_2)) := (x_1 - y_1)^2 - (x_2 - y_2)^2$, cf. [19]. The isomorphism is given by the mapping f in Lemma 4. We remark that f maps $M^2$ conformally into the Lorentz–Minkowski plane $R^2$ whose metric tensor is g = diag(1, −1).

3. Three-Dimensional Schwarzschild Spacetime
3.1. Null geodesics and null lines in $M^3$. Let n = 3. We call a triple $(r, \varphi, t)$ cylinder coordinates of a point $X = (x_1, x_2, x_3) \in M^3$ if $x_1 = r \cos\varphi$, $x_2 = r \sin\varphi$, $x_3 = t$, and $r > 0$. We write $X^r := r$, $X^t := t$, and $X^\varphi := \varphi$ if $-\pi < \varphi \le \pi$.

Lemma 5. Let $\gamma : I \to M^3$ be a maximal null geodesic. There are differentiable functions $r : I \to R_{>2m}$, $\varphi : I \to R$, $t : I \to R$ with $\gamma(\eta) = (r(\eta) \cos\varphi(\eta),\, r(\eta) \sin\varphi(\eta),\, t(\eta))$. The functions $r(\eta)$ and $t(\eta)$ are determined uniquely, the function $\varphi(\eta)$ is determined uniquely modulo $2\pi$.

In the following we always use cylinder coordinates and write $\gamma(\eta) = (r(\eta), \varphi(\eta), t(\eta))$ if not stated otherwise. The functions r, $\varphi$, t given in Lemma 5 satisfy the following system of differential equations:

\[
\begin{aligned}
h^{-1} r'^2 + r^2 \varphi'^2 - h\, t'^2 &= 0, && (1) \\
r'' - \frac{m}{r^2}\big(h^{-1} r'^2 - h\, t'^2\big) - r h\, \varphi'^2 &= 0, && (2) \\
\varphi'' + \frac{2}{r}\, r' \varphi' &= 0, && (3) \\
t'' + \frac{2m}{h}\, \frac{1}{r^2}\, r' t' &= 0. && (4)
\end{aligned}
\]

From (1), (2) we obtain

\[
r'' = (r - 3m)\, \varphi'^2. \qquad (5)
\]
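The step from (1) and (2) to (5) is short but worth writing out once, since the value $r = 3m$ that it singles out recurs throughout the section. The following is a sketch of that elimination, using only the equations above and $h(r) = 1 - 2m/r$:

```latex
% Solve the null condition (1) for the combination appearing in (2):
\[
h^{-1} r'^2 - h\, t'^2 = -\, r^2 \varphi'^2 .
\]
% Insert this into (2):
\[
r'' + \frac{m}{r^2}\, r^2 \varphi'^2 - r h\, \varphi'^2 = 0
\quad\Longrightarrow\quad
r'' = (r h - m)\, \varphi'^2 .
\]
% Finally use r h = r - 2m:
\[
r'' = (r - 2m - m)\, \varphi'^2 = (r - 3m)\, \varphi'^2 .
\]
```

In particular $r'' = 0$ whenever $r = 3m$, irrespective of the value of $\varphi'$.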
3.1.1. Plane null lines.
Definition 3. Let $a \in\, ]-\pi, \pi]$, $b \in R$. The mappings $\gamma^a_{1,b}, \gamma^a_{2,b} : R_{>0} \to M^3$,

\[
\begin{aligned}
\gamma^a_{1,b}(s) &:= \Big( s + 2m,\; a,\; s + 2m \ln\frac{s}{m} - m + b \Big), \\
\gamma^a_{2,b}(s) &:= \Big( s + 2m,\; a,\; -\Big(s + 2m \ln\frac{s}{m}\Big) + m + b \Big)
\end{aligned}
\]

are maximal null geodesics which we call plane null geodesics. Their images $l^a_{i,b} := \gamma^a_{i,b}(R_{>0})$ are called plane null lines, i = 1, 2. We define

\[
E_i(a) := \{ l^a_{i,b} \mid b \in R \},\; i = 1, 2, \qquad E(a) := E_1(a) \cup E_2(a), \qquad E := \bigcup_{a \in ]-\pi, \pi]} E(a),
\]
E(a) := {(r, a, t) | r > 2m, t ∈ R}.
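To see Definition 3 at work, one can intersect a plane null line of the first kind with one of the second kind in the same half-plane. Equating the two time components gives $s + 2m \ln(s/m) = m + (b_2 - b_1)/2$, and adding them gives $t = (b_1 + b_2)/2$. The sketch below solves the first relation by bisection; the function names and the sample values are assumptions made for illustration only.

```python
import math


def nu(s, m):
    """s + 2m ln(s/m), the s-dependent part of the time component."""
    return s + 2.0 * m * math.log(s / m)


def plane_intersection(b1, b2, m):
    """Return (r, t) of the common point of l^a_{1,b1} and l^a_{2,b2}.

    Solves nu(s) = m + (b2 - b1)/2 by bisection; nu is strictly
    increasing on (0, inf), so the solution is unique.
    """
    target = m + 0.5 * (b2 - b1)
    lo, hi = m, m
    while nu(lo, m) > target:
        lo /= 2.0
    while nu(hi, m) < target:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if nu(mid, m) < target:
            lo = mid
        else:
            hi = mid
    s = 0.5 * (lo + hi)
    r = s + 2.0 * m
    t = nu(s, m) - m + b1     # time component of gamma^a_{1,b1}(s)
    return r, t


if __name__ == "__main__":
    print(plane_intersection(1.5, 1.5, 2.0))   # equal offsets
    print(plane_intersection(0.0, 4.0, 1.0))   # unequal offsets
```

For equal offsets $b_1 = b_2$ the solution is $s = m$, so the two lines meet at $r = 3m$; this is the same radius at which $r''$ vanishes in (5).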
Remark 2. Let g, h be two distinct plane null lines. g and h intersect in at most one point. Let $i, j \in \{1, 2\}$, $a_0, a_1 \in\, ]-\pi, \pi]$, such that $g \in E_i(a_0)$, $h \in E_j(a_1)$. Then $\#(g \cap h) = 1$ iff $i \ne j$ and $a_0 = a_1$.

Lemma 6. For any point $P \in M^3$ there are exactly two plane null lines in $E(P^\varphi)$ containing P. Let $Q := l^{a_0}_{1,b_1} \cap l^{a_0}_{2,b_2}$, where $a_0 \in\, ]-\pi, \pi]$, $b_1, b_2 \in R$. Then $Q^t = \frac{b_1 + b_2}{2}$. We have $Q^r = 3m$ iff $b_1 = b_2$. Let $x_1, x_2, y_1, y_2 \in R$, $a_1, a_2 \in\, ]-\pi, \pi]$ and $Q_1 := l^{a_1}_{1,x_1} \cap l^{a_1}_{2,x_2}$, $Q_2 := l^{a_2}_{1,y_1} \cap l^{a_2}_{2,y_2}$. Then $Q_1^r = Q_2^r$ iff $x_2 - x_1 = y_2 - y_1$.

Lemma 7. Let $\gamma : I \to M^3$ be a maximal null geodesic. Let $\gamma(\eta) = (r(\eta), \varphi(\eta), t(\eta))$ in cylinder coordinates. If $\varphi'(\eta_0) = 0$ for an arbitrary $\eta_0 \in I$, then $\gamma(I)$ is contained in a Schwarzschild half-plane.

Proof. Since $L = r^2 \varphi'$ is constant from (3) and (4), $\varphi'(\eta_0) = 0$ implies $\varphi(\eta) = \mathrm{const}$.

3.1.2. Non-plane null lines.
Lemma 8. Let $\gamma : I \to M^3$ be a maximal null geodesic, $\gamma(\eta) = (r(\eta), \varphi(\eta), t(\eta))$, where $\varphi' \ne 0$. There is a $c_0 \in R_{>0}$ such that

\[
\Big(\frac{du}{d\varphi}\Big)^2 = 2mu^3 - u^2 + c_0, \qquad u := \frac{1}{r}. \qquad (6)
\]

Lemma 9. Let $P \in M^3$, $P^r =: \frac{1}{u_0}$, and $c_0 \in R_{>0}$. If $2mu_0^3 - u_0^2 + c_0 > 0$, then there are exactly four null lines containing P which are not plane and whose corresponding maximal null geodesics $\gamma : I \to M^3$, $\gamma(\eta) = (r(\eta), \varphi(\eta), t(\eta))$ satisfy the differential equation (6). If $2mu_0^3 - u_0^2 + c_0 = 0$, then there are exactly two such null lines.

Proof. For $u \in\, ]0, \frac{1}{2m}[$, $c > 0$ let (Fig. 2)

\[
g(u, c) := 2mu^3 - u^2 + c.
\]

If $c > \frac{1}{27m^2}$ then $g(u, c) > 0$ for all $u \in\, ]0, \frac{1}{2m}[$.
If $c = \frac{1}{27m^2}$ then $g(u, c) = 0$ iff $u = \frac{1}{3m}$.
If $c < \frac{1}{27m^2}$ then there are $u_1(c)$, $u_2(c)$ with $u_1$ [...]

[...] 0, since $v_i$ and $v_j$ are linearly independent, $i \ne j$. In the case $g(u_0, c_0) = 0$ we have $g^{(c_0)}_{P,1} = g^{(c_0)}_{P,4} \ne g^{(c_0)}_{P,2} = g^{(c_0)}_{P,3}$.

Definition 4. Let $g^{(c_0)}_{P,i}$ be defined as at the end of the proof of Lemma 9. We define classes of null lines,

\[
K_i(c_0) := \{ g^{(c_0)}_{P,i} \mid P \in M^3 \text{ and } g((P^r)^{-1}, c_0) \ge 0 \},\; i = 1, 2, 3, 4, \qquad K(c_0) := \bigcup_{i=1,2,3,4} K_i(c_0), \qquad K(c > 0) := \bigcup_{c_0 > 0} K(c_0).
\]
Remark 3. Let $L^3$ be the set of all null lines of $M^3$. Then $L^3 = E \cup K(c > 0)$ from Lemma 6 and Lemma 9.

Definition 5. Let
\[ \phi(u, c) := \frac{1}{\sqrt{2mu^3 - u^2 + c}}, \qquad \kappa(u, c) := \frac{\sqrt{1 + u^2(1 - 2mu)\,\phi(u, c)^2}}{u^2(1 - 2mu)}. \]

Lemma 10. Let $l$ be a null line in $K_i(c_0)$, $i \in \{1, 2, 3, 4\}$, $c_0 > 0$. Let $P = (\frac{1}{u_0}, a_0, b_0) \in M^3$ be a point of $l$. We define
\[ \varphi_1(u, c) := \varphi_2(u, c) := +\int_{u_0}^{u} \phi(\tilde u, c)\,d\tilde u + a_0, \]
\[ \varphi_3(u, c) := \varphi_4(u, c) := -\int_{u_0}^{u} \phi(\tilde u, c)\,d\tilde u + a_0, \]
\[ t_1(u, c) := t_3(u, c) := +\int_{u_0}^{u} \kappa(\tilde u, c)\,d\tilde u + b_0, \]
\[ t_2(u, c) := t_4(u, c) := -\int_{u_0}^{u} \kappa(\tilde u, c)\,d\tilde u + b_0. \]
Null Line Preserving Bijections of Schwarzschild Spacetime
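If $c_0 \geq \frac{1}{27m^2}$, $u_0 \neq \frac{1}{3m}$, then
\[ l = \left\{ \left(\tfrac{1}{u}, \varphi_i(u, c_0), t_i(u, c_0)\right) \,\middle|\, u \in D(u_0, c_0) \right\}, \]
where
\[ D(u_0, c_0) := \begin{cases} ]0, \frac{1}{2m}[ & \text{in the case } c_0 > \frac{1}{27m^2}, \\ ]\frac{1}{3m}, \frac{1}{2m}[ & \text{in the case } c_0 = \frac{1}{27m^2},\ u_0 > \frac{1}{3m}, \\ ]0, \frac{1}{3m}[ & \text{in the case } c_0 = \frac{1}{27m^2},\ u_0 < \frac{1}{3m}. \end{cases} \]
If $c_0 = \frac{1}{27m^2}$, $u_0 = \frac{1}{3m}$,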
$\frac{1}{27m^2}) \cup E$.

Lemma 16. Let $g \in E_i(a_0)$, $h \in K_j(c_0)$, $i = 1, 2$, $j = 1, 2, 3, 4$, $c_0 \geq \frac{1}{27m^2}$, then $g$ and $h$ intersect in at most one point.
Lemma 17. From Lemmas 11, 14, 15, 16 and Remark 6 it follows that, for any distinct $g, h$, $\#(g \cap h) = \infty$ iff
$g \in S_i$, $h \in S_j$, $\{i, j\} = \{1, 2\}$,
or $g \in K_i^{\pm}(\frac{1}{27m^2})$, $h \in K_j^{\pm}(\frac{1}{27m^2})$ with the same sign in $\{+, -\}$, $|j - i| = 2$, $t_g = t_h$.

Lemma 18. Let $g \in K(\frac{1}{27m^2}) \setminus S$. Then there is a null line $h$ in $K(\frac{1}{27m^2}) \setminus S$ which intersects $g$ in exactly one point.

Lemma 19. Let $g \in K_i^{+}(\frac{1}{27m^2})$, $h \in K_j^{+}(\frac{1}{27m^2})$ (resp. $g \in K_i^{-}(\frac{1}{27m^2})$, $h \in K_j^{-}(\frac{1}{27m^2})$), $i = 1, 2$, $j = 3, 4$. Then there exist points $P \in g$, $Q \in h$ such that $P^r = Q^r$, $P^\varphi = Q^\varphi$.

Proof. Let $a_1, a_2 \in\, ]-\pi, \pi]$, $u_0 \in\, ]0, \frac{1}{3m}[$ (resp. $u_0 \in\, ]\frac{1}{3m}, \frac{1}{2m}[$). Let
\[ \varphi_g(u) =: +\int_{u_0}^{u} \phi(\tilde u, \tfrac{1}{27m^2})\,d\tilde u + a_1, \qquad \varphi_h(u) =: -\int_{u_0}^{u} \phi(\tilde u, \tfrac{1}{27m^2})\,d\tilde u + a_2. \]
The congruence $2\int_{u_0}^{u} \phi(\tilde u, \frac{1}{27m^2})\,d\tilde u + (a_1 - a_2) \equiv 0 \pmod{2\pi}$ with variable $u$ has a solution, i.e. there is a $u^*$ with $\varphi_g(u^*) \equiv \varphi_h(u^*) \pmod{2\pi}$.
Lemma 20. Let $P, Q \in M^3$ with $P^r = Q^r$, $P^\varphi = Q^\varphi$, $P^t > Q^t$. Then there are a point $R \in M^3$ and null lines $g, h$ in $K(\frac{1}{27m^2})$ such that $P, R \in g$ and $Q, R \in h$.
W.-l. Huang
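Proof. The assertion follows from Lemma 11 in the case $P^r = Q^r = 3m$. Let $P^r = Q^r \neq 3m$, $u_0 := \frac{1}{P^r}$. The function $\int_{u_0}^{u} \kappa(\tilde u, \frac{1}{27m^2})\,d\tilde u$ is continuous in $]0, \frac{1}{3m}[$ resp. in $]\frac{1}{3m}, \frac{1}{2m}[$, and for $u_0 \in\, ]0, \frac{1}{3m}[$ we have
\[ \lim_{u \to \frac{1}{3m}^-} \int_{u_0}^{u} \kappa(\tilde u, \tfrac{1}{27m^2})\,d\tilde u = +\infty, \qquad \lim_{u \to 0^+} \int_{u_0}^{u} \kappa(\tilde u, \tfrac{1}{27m^2})\,d\tilde u = -\infty, \]
resp. for $u_0 \in\, ]\frac{1}{3m}, \frac{1}{2m}[$ we have
\[ \lim_{u \to \frac{1}{3m}^+} \int_{u_0}^{u} \kappa(\tilde u, \tfrac{1}{27m^2})\,d\tilde u = -\infty, \qquad \lim_{u \to \frac{1}{2m}^-} \int_{u_0}^{u} \kappa(\tilde u, \tfrac{1}{27m^2})\,d\tilde u = +\infty. \]
Thus there is a $u^* \in\, ]0, \frac{1}{3m}[$ resp. $u^* \in\, ]\frac{1}{3m}, \frac{1}{2m}[$ satisfying $\int_{u_0}^{u^*} \kappa(\tilde u, \frac{1}{27m^2})\,d\tilde u = \frac{1}{2}(Q^t - P^t)$. Let $g \in K_1(\frac{1}{27m^2})$ with $P \in g$ and $h \in K_2(\frac{1}{27m^2})$ with $Q \in h$. Then $R := (\frac{1}{u^*}, \varphi_g(u^*), t_g(u^*)) \in h$, since $\varphi_g(u^*) = \varphi_h(u^*)$ and
\[ t_h(u^*) = -\int_{u_0}^{u^*} \kappa(\tilde u, \tfrac{1}{27m^2})\,d\tilde u + Q^t = \tfrac{1}{2}(Q^t + P^t) = t_g(u^*). \]

Lemma 21. Let $g, h \in K(c > \frac{1}{27m^2})$ be null lines with $\{R\} = g \cap h \neq \emptyset$ and
\[ \Delta\varphi_g := \sup_{u_1, u_2 \in ]0, \frac{1}{2m}[} |\varphi_g(u_1) - \varphi_g(u_2)| \leq \pi, \qquad \Delta\varphi_h := \sup_{u_1, u_2 \in ]0, \frac{1}{2m}[} |\varphi_h(u_1) - \varphi_h(u_2)| \leq \pi. \]
Let $P \in g$, $Q \in h$ with $P^r = Q^r =: \frac{1}{u_0} \neq R^r =: \frac{1}{u_1}$ and $P^\varphi = Q^\varphi =: a$. Then $\frac{d\varphi_g}{du} = \frac{d\varphi_h}{du}$ for all $u$.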
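The existence argument in the proof of Lemma 20 can be mimicked numerically. The sketch below is a hedged illustration and not the paper's construction: it takes $m = 1$ by default, uses a plain midpoint rule, and the helper names `integral_kappa` and `meeting_point` are hypothetical. For $u_0 \in\, ]0, \frac{1}{3m}[$ it solves $\int_{u_0}^{u^*} \kappa(\tilde u, \frac{1}{27m^2})\,d\tilde u = \frac{1}{2}(Q^t - P^t)$ by bisection and returns $u^*$ together with the $t$-coordinate of $R$, which should equal $\frac{1}{2}(P^t + Q^t)$.

```python
import math


def phi(u, c, m):
    """phi(u, c) = 1 / sqrt(2 m u^3 - u^2 + c) from Definition 5."""
    return 1.0 / math.sqrt(2.0 * m * u**3 - u**2 + c)


def kappa(u, c, m):
    """kappa(u, c) from Definition 5; positive on ]0, 1/(2m)[."""
    w = u * u * (1.0 - 2.0 * m * u)
    return math.sqrt(1.0 + w * phi(u, c, m) ** 2) / w


def integral_kappa(u0, u, c, m, steps=2000):
    """Midpoint-rule approximation of the integral of kappa from u0 to u."""
    h = (u - u0) / steps
    return h * sum(kappa(u0 + (i + 0.5) * h, c, m) for i in range(steps))


def meeting_point(u0, t_p, t_q, m=1.0):
    """Return (u_star, t_R) for points P, Q with P^r = Q^r = 1/u0.

    Assumes 0 < u0 < 1/(3m). The target value of the integral is
    (t_q - t_p) / 2, as in the proof of Lemma 20.
    """
    c = 1.0 / (27.0 * m * m)
    target = 0.5 * (t_q - t_p)
    if target == 0.0:
        return u0, t_p
    # The integral tends to +infinity towards 1/(3m) and to -infinity
    # towards 0, so search towards the endpoint matching the sign of target.
    end = 1.0 / (3.0 * m) if target > 0.0 else 0.0
    near, far = u0, u0
    for k in range(1, 50):
        far = end + (u0 - end) * 2.0 ** (-k)
        if abs(integral_kappa(u0, far, c, m)) >= abs(target):
            break
    # Bisection between near (integral too small) and far (large enough).
    for _ in range(60):
        mid = 0.5 * (near + far)
        if abs(integral_kappa(u0, mid, c, m)) < abs(target):
            near = mid
        else:
            far = mid
    u_star = 0.5 * (near + far)
    t_r = t_p + integral_kappa(u0, u_star, c, m)  # t_g(u*) for g in K_1
    return u_star, t_r


if __name__ == "__main__":
    u_star, t_r = meeting_point(u0=0.2, t_p=3.0, t_q=0.0)
    print("u* =", u_star, " t_R =", t_r)
```

The bracketing loop is only adequate for moderate targets; very large $|Q^t - P^t|$ would call for a quadrature that resolves the singular endpoints.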
3.3. Null line preserving bijections. Any isometry of M3 is a bijection which maps null lines to null lines, and pre-images of null lines are null lines. The converse is also true: Proposition 2. Let : M3 → M3 be a bijection such that and −1 take null lines to null lines. Then is an isometry of M3 . Proof. In the following, we write g¯ := (g) := {(P ) | P ∈ g} for g ∈ L3 . For a set G of null lines and a set P of points of M3 let (G) := {(g) | g ∈ G} = {g¯ | g ∈ G},
(P) := {(P ) | P ∈ P}.
We proceed in steps: 1 Step 1. maps Z(3m) onto Z(3m). takes the null lines in K( 27m 2 ) to the null lines 1 in K( 27m2 ), and the null lines in S to the null lines in S. 3 Proof. Since is bijective, we have (l1 ∩l2 ) = (l1 )∩(l2 ) for all null lines l1 , l2 ∈ L . 1 1 1 Lemma 17 implies K( 27m2 ) = K( 27m2 ); note that S ⊂ K( 27m2 ). Let g ∈ S, then 1 1 ¯ = 1 by Lemma g¯ ∈ K( 27m ¯ 6∈ S, then there is an l¯ ∈ K( 27m ¯ ∩ l) 2 ). Assume g 2 ), #(g 1 1 ¯ ) implies l ∈ K( ), and from g ¯ ∩ l = 6 ∅ follows l ∩ g 6= ∅ and l ∈ S. 18. l¯ ∈ K( 27m 2 27m2 Therefore #(l ∩ g) = ∞ which contradicts #(l ∩ g) = 1, hence g¯ ∈ S. Applying the same arguments to −1 we obtain that g ∈ S iff g¯ ∈ S, and P ∈ Z(3m) iff (P ) ∈ Z(3m).
Step 2. Either (H + ) = H + and (H − ) = H − , or (H + ) = H − and (H − ) = H + . 1 Proof. Let P, Q ∈ M3 . We write P ∼ Q iff there are l1 , l2 , . . . , lι ∈ K( 27m 2 ) such that P ∈ l1 , Q ∈ lι , li ∩li+1 6= ∅, 1 ≤ i ≤ ι. From Step 1 we have P ∼ Q iff (P ) ∼ (Q). It is easy to see that P ∼ Q iff {P, Q} ⊂ H + or {P, Q} ⊂ H − for all P, Q ∈ M3 \Z(3m).
Step 3. For all a ∈] − π, π] there is an a∗ ∈] − π, π] such that (E(a)) = E(a∗ ) (see Definition 3). Proof. At first we will show that if there is a plane null line g ∈ E(a) with g¯ 6∈ E, then 1 g¯ is a null line in the class K(c > 27m 2 ), and 1ϕg¯ :=
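$\sup_{u_1, u_2 \in ]0, \frac{1}{2m}[} |\varphi_{\bar g}(u_1) - \varphi_{\bar g}(u_2)| \leq \pi$. Furthermore, for any null line $h \in E(a)$ we obtain $\bar h \in K(c > \frac{1}{27m^2})$ and $\frac{d\varphi_{\bar g}}{du} = \frac{d\varphi_{\bar h}}{du}$, which will lead to a contradiction.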
(i) Let g ∈ E be a plane null line. #(g ∩ Z(3m)) = 1 implies #(g¯ ∩ Z(3m)) = 1. 1 1 ¯ ∈ K(c > 27m From Remark 6 we have g¯ ∈ E or g¯ ∈ K(c > 27m 2 ). In the case g 2 ) let ¯ 1ϕg¯ := supu1 ,u2 ∈]0, 1 [ |ϕg¯ (u1 )−ϕg¯ (u2 )|.Assume 1ϕg¯ > π, then there is an h ∈ K(c > 2m 1 1 1 ¯ ∩ g) ) \ { g} ¯ with #( h ¯ ≥ 2. From h¯ ∈ K(c > 27m 2 2 ) we have h ∈ E ∪ K(c > 27m2 ) 27m and #(h ∩ g) ≥ 2 which contradicts Remark 2 and Lemma 16. Hence 1ϕg¯ ≤ π. This implies that for any two distinct points A, B of g, 0 < |(A)ϕ −(B)ϕ | < π. Therefore, if there are two distinct points A, B of g with |(A)ϕ − (B)ϕ | ∈ {0, π}, then g¯ ∈ E is a plane null line. a , b ∈ R (see Definition 3). In the following we assume (ii) Let g ∈ E(a), w.l.o.g. g = l1,b 1 a . For i ∈ Z let g¯ ∈ K(c) with c > 27m2 . Let P := g ∩ Z(3m) = (3m, a, b). Let l := l2,b (see Fig. 4) a √ , l0 = l, li := l2,b+6 3mπ·i a √ gi := l1,b+6 , g0 = g, 3mπ·i √ Pi := li ∩ gi = (3m, a, b + 6 3 m π · i), [P ] := {Pi | i ∈ Z}, Qi := li ∩ g0 .
There are two null lines in S containing [P ]. From Step 1 and Lemma 11, |(P )ϕ − (Pi )ϕ | ∈ {0, π}
for all i ∈ Z.
Assume there is an l¯i ∈ E, i 6= 0. Then (Pi )ϕ = (Qi )ϕ implies |(Qi )ϕ − (P )ϕ | ∈ 1 {0, π}. This contradicts (i) because of 1ϕg¯ ≤ π. So l¯i ∈ K(c > 27m 2 ). Similarly we 1 ¯ have g¯ i , li ∈ K(c > 27m2 ) for all i ∈ Z. (iii) Let Pi , li , gi , [P ], Qi be defined as in (ii). Let J0 := {i ∈ Z | (Pi )ϕ = (P )ϕ },
Jπ := {i ∈ Z | |(Pi )ϕ − (P )ϕ | = π}.
Then (Pi )r = (Pj )r = 3m and (Pi )ϕ = (Pj )ϕ for all i, j ∈ J0 and for all dϕ
dϕl¯
i, j ∈ Jπ . From Lemma 21 we have dug¯ i = duj for all i6= j ∈ J0 and for all i6= j ∈ Jπ . (iv) The set {(3m, a, t)} | t ∈ R} is mapped into the set {(3m, a1 , t)} | t ∈ R}, where a1 := (P )ϕ .
Fig. 4.
Proof. Assume there is a point B = (3m, a, b0 ) with (B) = (3m, a2 , b˜ 0 ), 0 < |a2 − a1 | < 2π. Let li , gi be defined as in (ii), (iii). Let J ∈ {J0 , Jπ } with #J = ∞. W.l.o.g. a , A := g ∩ h, u0 := 1/(A)r . Then let 0 ∈ J. Let B ∈ h := l2,b 0 Z u0 Z u0 dϕg¯ dϕh¯ ϕ du + a1 ≡ du + a2 (mod 2π). (A) ≡ 1 1 du du 3m 3m For j ∈ J let Aj := gj ∩ h, u∗j := 1/(Aj )r . Then #{u∗j | j ∈ J} = ∞. Since Z u˜ Z u˜ dϕg¯ j dϕg¯ ˜ = du + a1 = du + a1 ϕg¯ j (u) 1 1 du du 3m 3m from (iii), we obtain l(u∗j ) ≡ a2 − a1 (mod 2π) for all j ∈ J, where Z u dϕh¯ dϕg¯ − du. l(u) := 1 du du 3m dϕ
¯ h On the other hand, ϕg¯ (u), ϕh¯ (u) are bounded, and dug¯ − dϕ du is monotonic. Thus the congruence l(u) ≡ a2 − a1 6≡ 0 (mod 2π) has only a finite number of solutions, a contradiction to #{u∗j | j ∈ J} = ∞.
(v) For all A, B ∈ E(a), we have (A)ϕ = (B)ϕ iff (A)r = (B)r . Proof. Let h ∈ E1 (a) \ {g, l}. Then h intersects g or l. By (iv) and Lemma 21 we have dϕg¯ dϕh¯ du = du . For any two points A, B ∈ E(a) let h1 , h2 be two plain null lines satisfying A ∈ h1 and B ∈ h2 . Then Z 1/(A)r Z 1/(A)r dϕh¯ 1 dϕg¯ ϕ du + a1 = du + a1 (mod 2π), (A) ≡ 1 1 du du 3m 3m Z 1/(B)r Z 1/(B)r dϕh¯ 2 dϕg¯ ϕ du + a1 = du + a1 (mod 2π). (B) ≡ 1 1 du du 3m 3m Thus (A)r = (B)r implies (A)ϕ = (B)ϕ for all A, B ∈ E(a). Since 1ϕg¯ ≤ π, (A)ϕ = (B)ϕ implies (A)r = (B)r . Furthermore 0 ≤ |(A)ϕ − (B)ϕ | < π.
1 1 (vi) We take a point A ∈ E(a), Ar 6= 3m. Let γ1 ∈ K1 ( 27m 2 ), γ2 ∈ K3 ( 27m2 ) with A ∈ γ1 ∩ γ2 . The intersection of γ1 , γ2 , and E(a) contains infinitely many points. Let B be another point in γ1 ∩γ2 ∩E(a). Step 1 and Lemma 14 imply |(A)ϕ −(B)ϕ | ∈ {0, π}. We have (A)ϕ = (B)ϕ , and (A)r = (B)r , a contradiction to the injectivity of tγ¯ 1 (u). 1 ¯ ∈ E for all Hence our assumption in (ii), g¯ ∈ K(c > 27m 2 ), is wrong, and thus g g ∈ E(a). Now let l, g be two distinct null lines in E(a) with non-empty intersection. Then l¯ ∈ E ¯ g¯ ∈ E(a∗ ). If h is an arbitrary null line and l¯ ∩ g¯ 6= ∅, thus there is an a∗ ∈ R such that l, in E(a), then h ∩ g 6= ∅ or h ∩ l 6= ∅. Hence h¯ ∈ E(a∗ ). We obtain (E(a)) ⊆ E(a∗ ). Similarly, −1 (E(a∗ )) ⊆ E(a), and finally (E(a)) = E(a∗ ).
Remark 7. Without loss of generality, we may from now on assume (S1 ) = S1 and (3m, 0, 0) = (3m, 0, 0). Step 4. According to Step 3 we may define µa : R → R, a ∈] − π, π], by (3m, a∗ , µa (t)) := ((3m, a, t)). µa is bijective, and there exists b ∈ R such that µa (t) = −t + b ∀t ∈ R, µa (t) = t + b ∀t ∈ R, or a∗ ∈ {a, a + π, a − π} a∗ ∈ {−a, −a + π, −a − π}. Proof. Since (3m, 0, 0) = (3m, 0, 0) (see Remark 7), we have (E(0)) = E(0). Let µ := µ0 . Then (li,0 ) = li,0 , (li,t ) = li,µ(t) , (l1,t ∩ l2,t ) = l1,µ(t) ∩ l2,µ(t) imply, for all t ∈ R, √ {µ(t + 6 3mπ · i) | i ∈ Z} √ = {µ(t) + 6 3mπ · i | i ∈ Z}. √ √ √ 3mπ] → [0, 6 3mπ] by (l ) =: l and {f (0), f (6 3mπ)} = We define f : [0, 6 1,t 1,f (t) √ {0, 6 3mπ}. Z(3m) l2,x l2,0
Fig. 5.
Let $x, y \in [0, 6\sqrt{3}m\pi]$, $x < y$, then
\[ (l_{1,x} \cap l_{2,y})^\varphi = (l_{2,x} \cap l_{1,y})^\varphi \iff y = x + 3\sqrt{3}m\pi. \tag{7} \]
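With Step 3 it follows that, for all $x, y \in [0, 6\sqrt{3}m\pi]$,
\[ |y - x| = 3\sqrt{3}m\pi \;\Rightarrow\; |f(y) - f(x)| = 3\sqrt{3}m\pi. \tag{8} \]
Thus $f(0) \in \{0, 6\sqrt{3}m\pi\}$ implies $f(3\sqrt{3}m\pi) = 3\sqrt{3}m\pi$. Let $x, y, z \in [0, 3\sqrt{3}m\pi]$ resp. $x, y, z \in [3\sqrt{3}m\pi, 6\sqrt{3}m\pi]$, and $x < y$. Then
\[ (l_{1,x} \cap l_{2,z})^\varphi = (l_{2,y} \cap l_{1,z})^\varphi \iff z = \frac{x + y}{2}. \tag{9} \]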
0 0 , h := l2,y be plain null lines, and let P := g ∩ h, then Let x, y ∈ R, y > x. Let g := l1,x r P > 3m. We have g¯ ∩ {(3m, 0, t) | t ∈ R} = (3m, 0, µ(x)) and h¯ ∩ {(3m, 0, t) | t ∈ R} = (3m, 0, µ(y)). From Step 2 and Step 3 we obtain: If (H + ) = H + and (g) ∈ E1 (0) then µ(y) > µ(x); if (H + ) = H + and (g) ∈ E2 (0) then µ(y) < µ(x); if (H + ) = H − and (g) ∈ E1 (0) then µ(y) < µ(x); if (H + ) = H − and (g) ∈ E2 (0) then µ(y) > µ(x).
Case I. µ is strictly increasing. (i) f = id[0,6√3mπ] . √ √ √ For all x, y ∈ [0, 6 3mπ] with√x < y Proof. Let f (0) = 0 and f (6 3mπ) √= 6 3mπ.√ we have f (x) √ < f (y). Since f (3 3mπ) = 3 3mπ, √ we have f (t) ∈ [0, 3 3mπ] iff t ∈ [0, 3 3mπ]. From (9), for all x, y ∈ [0, 3 3mπ] we have Jensen’s equality f ( 21 (x + y)) = 21 (f (x) + f (y)). It follows from [1] that f is the identity on the interval √ [0, 3 3mπ]. Using (8) and the monotony assumption for f , the assertion follows. √ (ii) µ(t) = t for all t ∈ [0, 6 3mπ]. Proof. We have µ(0) = 0 =√f (0). For t > 0 also µ(t) > 0, and there is an integer k ≥ 0 such that√µ(t) = f (t) + k · 6 3mπ. Since µ is bijective, we obtain µ(t) = f (t) = t for all t ∈ [0, 6 3mπ]. √ √ √ (iii) For all t ∈ [0, 6 3mπ] and k ∈ Z we have µ(t + 6 3mπ k) = t + 6 3mπ k. √ √ √ 3mπ)−t is a Proof. For all t ∈ [0, 6 3mπ[, the mapping ρt : Z → Z, k 7→ µ(t+k·6 6 3mπ ) > ρt (k2 ) iff k1 > k2 . Then ρt (k) = k bijection of Z into itself with ρ√ t (0) = 0, and ρt (k1√ for all k ∈ Z. Thus µ(t + k · 6 3mπ) = t + k · 6 3mπ, i.e. µ is the identity. Case II. µ is √ √ strictly decreasing. (i) f (t) = 6 3mπ − t for all t ∈ [0, 6 3mπ]. √ √ √ √ Proof. Let f (0) = 6 3mπ √ and f (6 3mπ) = 0. Let f : [0, 3 3mπ] √ → [0, 3 √3mπ] be defined by f(t) := 6 3mπ − f (t). f is bijective, f(0) = 0, f(3 3mπ) = 3 3mπ, f( 21 (x + y)) = 21 (f(x) + f(y)), and f(x) < f(y) iff x < y. As in Case I, (i) we obtain √ √ f(t) = t for all t ∈ [0, 3 3mπ]. Then f (t) = 6 3mπ − t.√Using (8) √ and the monotony assumption for f , this equation holds true also for t ∈ [3 3mπ, 6 3mπ]. (ii) In analogy to Case I, (ii) and (iii), we obtain µ(t) = −t for all t ∈ R. Let a ∈] − π, π]. Then (3m, a, 0) is an element of √ l1,b1 ∩ l2,b2 = {(3m, a, 6 3mπ i) | i ∈ Z} √ √ ∪ {(3m, a + π, 6 3mπ i + 3 3mπ) | i ∈ Z},
where $b_1 := -3\sqrt{3}ma$ and $b_2 := 3\sqrt{3}ma$. In the case $\mu(t) = t$, since every point $(3m, 0, t)$ and every line $l_{i,t}$, $t \in \mathbb{R}$, is fixed, the image of $E(a)$ is $E(a)$ or $E(a - \pi)$, and there is a $b \in \mathbb{R}$ with $\mu_a(t) = t + b$ for all $t \in \mathbb{R}$. In the case $\mu(t) = -t$ the image of $E(a)$ is $E(-a)$ or $E(-a - \pi)$, and there is a $b \in \mathbb{R}$ with $\mu_a(t) = -t + b$ for all $t \in \mathbb{R}$.
Remark 8. In the following we may assume without loss of generality that µ(t) = t. Proof. In the case µ(t) = −t we define the isometry δ : M3 → M3 , (r, ϕ, t) 7→ (r, −ϕ, −t). Then (δ◦)(3m, 0, 0) = (3m, 0, 0), (δ◦)(S1 ) = S1 , and (δ◦)(3m, 0, t) = (3m, 0, t). Step 5. Let P, Q be two points of M3 with P r = Qr . Then (P )r = (Q)r . The mapping f : R>2m → R>2m defined by f (Rr ) = (R)r for all R ∈ M3 is strictly monotonic increasing (resp. decreasing) if (H + ) = H + (resp. (H + ) = H − ). a a , l2,b , a := P ϕ Proof. For any point P ∈ M3 , there exist exactly two plane null lines l1,b 1 2 r through P , and there is a correspondence between P and b2 − b1 . The map preserves b2 − b1 in the case (H + ) = H + . In the other case (H + ) = H − , the difference b2 − b1 changes to b1 − b2 under . Hence P r = Qr implies (P )r = (Q)r .
Step 6. (P )ϕ = P ϕ for all P ∈ M3 . Proof. From Step 3 and ((3m, 0, 0)) = (3m, 0, 0), (S1 ) = S1 , µ0 (t) = t, see Remark 7 and 8, we have (E(0)) = E(0), (E(π)) = E(π). Let P = (3m, a, 0), g ∈ K1 (c0 ) with 1 P ∈ g, a ∈]0, π[, c0 > 27m 2 , π < 1ϕg < 2π. Then g∩E(0) 6= ∅ or g∩E(π) 6= ∅. W.l.o.g. let g ∩ E(0) 6= ∅, Q := g ∩ E(0) =: ( u11 , 0, b1 ). From Step 4 we have (E(a)) = E(a) or (E(a)) = E(a − π). 1 . Then 0 < a2 < a. Define We choose R ∈ g, R =: ( u12 , a2 , b2 ) with u1 < u2 < 3m 1 1 r r (Q) =: u3 , (R) =: u4 . Since E(0) = (E(0)) and 1ϕg¯ < 2π, we have ϕg¯ (u3 ) = 0 1 ) = a − π. or ϕg¯ (u3 ) = −2π. Assume that ϕg¯ ( 3m
Fig. 6.
In the case ϕg¯ (u3 ) = 0, since ϕg and ϕg¯ are strictly monotonic, we obtain 0 = 1 1 ) = a and 0 = ϕg¯ (u3 ) > ϕg¯ (u4 ) > ϕg¯ ( 3m ) = a − π, thus ϕg (u1 ) < ϕg (u2 ) < ϕg ( 3m
0 > ϕg¯ (u4 ) − ϕg (u2 ) > −π in contradiction to |ϕg¯ (u4 ) − ϕg (u2 )| ∈ {0, π}. In the 1 ) = a, −2π = ϕg¯ (u3 ) < case ϕg¯ (u3 ) = −2π, we have 0 = ϕg (u1 ) < ϕg (u2 ) < ϕg ( 3m 1 ϕg¯ (u4 ) < ϕg¯ ( 3m ) = a−π. a ≥ 0 implies −2π < −π < a−π, and there is a point N ∈ g¯ 1 and u3 , such that ϕg¯ (u0 ) = −π. With u∗0 := 1/(−1 (N ))r with 1/N r =: u0 between 3m ∗ we obtain 0 < ϕg (u0 ) < a < π, in contradiction to E(π) = (E(π)). 1 1 ) 6= a−π, and ϕg¯ ( 3m ) = a, i.e. we have for all a ∈]0, π[ that (E(a)) = E(a) Thus ϕg¯ ( 3m and hence (E(a − π)) = E(a − π). Step 7. (P )t = P t . Proof. Let a ∈] − π, π], A = ( u1a , a, ta ) ∈ E(a). Since (E(a)) = E(a), from Step 4 √ there exists ka ∈ Z with µa (t) = t + 6 3m ka π. By considering the images of two plane null lines which intersect in A, we obtain (A)t = µa (At ). Let P := (3m, π, 0). Choose 1 R 3m dϕg 1 3π 1 1 g ∈ K1 (c > 27m 2 ) with P ∈ g, 0 du du = 2 , and ϕg ( 3m ) = π. There is a u0 < 3m 1 + such that ϕg (u0 ) = 0. Define Q := ( u0 , 0, t0 ) := g ∩ E(0) ∩ H . Let a ∈]0, π[ and Ra := g ∩ E(a) ∩ H + =:
$\left(\frac{1}{u_a},\, a,\, t_g(u_a)\right)$.
Case 1: (H + ) = H + . Then (Q) = Q, and we have 1 1 √ , a, tg¯ (ua ) = , a, tg (ua ) + 6 3m ka π , (Ra ) = ua ua √ 1 ) = (3m, π, 6 3m kπ π). (P ) = 3m, π, tg¯ ( 3m 2 2 dϕ dϕ dt dt #(g ∩ E(a) ∩ H + ) = 1 implies #(g¯ ∩ E(a) ∩ H + ) = 1 and dug = dug¯ , dug = dug¯ . tg (u) then Since Q = (Q) ∈ g ∩ g¯ we have tg¯ (u) ∈ {tg (u), √ −tg (u) + 2t0 }. If tg¯ (u) = √ ka = kπ = 0; if tg¯ (u) = −tg (u)+2t0 then 2t0 = 6 3m kπ π, 2t0 −2tg (ua ) = 6 3m ka π, t (ua ) ∈ Z for all a ∈]0, π[, a contradiction to the continuity of tg . and 3√g 3mπ Case 2: (H + ) = H − . Then 2 (Q) = Q and Q, 2 (P ) ∈ (g). ¯ For all a ∈]0, π[, ¯ ∩ E(a) ∩ H + ) = 1. #(g ∩ E(a) ∩ H + ) = 1 implies #(g¯ ∩ E(a) ∩ H − ) = 1 and #((g) dϕ(g) dϕ ¯ ¯ ∩ E(a) ∩ H + . Since dug = du , Let Ra := g ∩ E(a) ∩ H + , then 2 (Ra ) = (g) 2 2 dt(g) dtg ¯ we have du = . For Q = ( u10 , 0, t0 ) ∈ g ∩ (g) ¯ we obtain t(g) ¯ (u) ∈ du (u) = t (u). This implies {tg (u), −tg (u) + 2t0 }. Similarly as in Case 1, we have t(g) ¯ g ka = kπ = 0. Thus we have shown that µa (t) = t for all a ∈]0, π[. If a ∈] − π, 0[ we may rotate M3 by σ : M3 → M3 , σ(r, ϕ, t) = (r, ϕ + π, t). Applying the same arguments as in the case a ∈]0, π[ we obtain µa (t) = t. Step 8. (H + ) = H + . a a a a ) = l2,b and (l2,b ) = l1,b . Let Proof. We assume that (H + ) = H − . Then (l1,b a0 a0 r0 −2m b2 −b1 P = (r0 , a0 , b0 ) = l1,b1 ∩ l2,b2 , then r0 − 2m + 2m ln m − m = 2 . Define a0 a0 ∩ l2,b , then (r1 , a0 , b0 ) := (P ) = l1,b 2 1
\[ r_1 + 2m \ln\frac{r_1 - 2m}{m} = -\left(r_0 + 2m \ln\frac{r_0 - 2m}{m}\right) + 6m. \]
We define $f : \mathbb{R}_{>2m} \to \mathbb{R}_{>2m}$ by
\[ f(r) + 2m \ln\frac{f(r) - 2m}{m} = -\left(r + 2m \ln\frac{r - 2m}{m}\right) + 6m. \]
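Then $f^2 = \mathrm{id}$, and by the implicit function theorem, $f$ is continuously differentiable. The bijection maps $(r, \varphi, t)$ to $(f(r), \varphi, t)$. Let $h \in K_1(c_0)$, $c_0 > \frac{1}{27m^2}$. Then there are differentiable functions $\varphi_1 : \mathbb{R}_{>2m} \to \mathbb{R}$, $t_1 : \mathbb{R}_{>2m} \to \mathbb{R}$, such that
\[ h = \{(r, \varphi_1(r), t_1(r)) \mid r > 2m\}, \]
where
\[ \frac{d\varphi_1}{dr} = \frac{d\varphi_1}{du}\frac{du}{dr} = -\frac{1}{r^2}\,\phi(u, c_0), \qquad \frac{dt_1}{dr} = \frac{dt_1}{du}\frac{du}{dr} = -\frac{1}{r^2}\,\kappa(u, c_0). \]
Following Steps 6 and 7, the image of $h$ is $\{(f(r), \varphi_1(r), t_1(r)) \mid r > 2m\} = \{(f(r), \varphi_2(f(r)), t_2(f(r))) \mid r > 2m\}$, where $\varphi_2 := \varphi_1 \circ f^{-1}$ and $t_2 := t_1 \circ f^{-1}$ are continuously differentiable. Since
\[ \frac{\phi(u, c_0)}{\kappa(u, c_0)} = \frac{d\varphi_1}{dt_1} = \frac{d\varphi_2}{dt_2} = \frac{\phi(u, c)}{\kappa(u, c)} \]
for some $c > \frac{1}{27m^2}$ and since $\frac{\phi(u, c)}{\kappa(u, c)}$ is injective in $c$, the image of $h$ lies in $K(c_0)$. Let $y := f(r)$. Then $\frac{dt_2}{dy} \cdot \frac{df}{dr} = \frac{dt_1}{dr}$, and after a short calculation we have
\[ \frac{r}{y}\,\frac{y - 2m}{r - 2m} = \frac{y^2}{r^2} = -\frac{df}{dr}. \]
Integrating the second equation and applying $f(3m) = 3m$, we obtain $f(r) = \frac{r}{\frac{2r}{3m} - 1}$. For $r = 4m$ we have $f(4m) = \frac{12m}{5}$, and
\[ f(4m) + 2m \ln\frac{f(4m) - 2m}{m} = -\left(4m + 2m \ln\frac{2m}{m}\right) + 6m \]
implies $\ln\frac{5}{4} = \frac{1}{5}$, $\ln\left(\frac{5}{4}\right)^5 = 1$. Hence the assumption that $H^+$ is mapped onto $H^-$ is not true.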
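The contradiction at the end of Step 8 is easy to confirm numerically. The sketch below is an illustration under stated assumptions: it takes $m = 1$ by default, the function names are hypothetical, and the closed form $f(r) = r / (\frac{2r}{3m} - 1)$ is the one read off from the integration step above. It evaluates the difference between the two sides of the defining equation of $f$; the residual vanishes at the fixed point $r = 3m$ but not at $r = 4m$, which is the statement that $\ln\frac{5}{4} \neq \frac{1}{5}$.

```python
import math


def residual(r, f_r, m=1.0):
    """Left side minus right side of the equation defining f, with f(r) = f_r."""
    left = f_r + 2.0 * m * math.log((f_r - 2.0 * m) / m)
    right = -(r + 2.0 * m * math.log((r - 2.0 * m) / m)) + 6.0 * m
    return left - right


def f_integrated(r, m=1.0):
    """f(r) = r / (2r/(3m) - 1), from integrating y^2/r^2 = -df/dr with f(3m) = 3m."""
    return r / (2.0 * r / (3.0 * m) - 1.0)


if __name__ == "__main__":
    m = 1.0
    print("f(3m) =", f_integrated(3.0 * m, m))  # the fixed point r = 3m
    print("f(4m) =", f_integrated(4.0 * m, m))  # 12m/5
    print("residual at r = 3m:", residual(3.0 * m, f_integrated(3.0 * m, m), m))
    print("residual at r = 4m:", residual(4.0 * m, f_integrated(4.0 * m, m), m))
    print("ln(5/4) =", math.log(1.25), "versus 1/5 =", 0.2)
```

For $m = 1$ the residual at $r = 4m$ equals $\frac{2}{5} + 2\ln\frac{4}{5}$, which is nonzero, so the two requirements on $f$ cannot hold together.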
Step 9. From (H + ) = H + immediately follows that (P )r = P r for all P ∈ M3 . Together with (P )ϕ = P ϕ and (P )t = P t for all P ∈ M3 we have that = idM3 is an isometry of M3 . This completes the proof of Proposition 2. 4. n-Dimensional Schwarzschild Spacetime 4.1. Null lines of n-dimensional Schwarzschild spacetime. Let n ∈ {4, 5, . . . }. The three-dimensional Schwarzschild spacetime is a submanifold of the n-dimensional Schwarzschild spacetime Mn . The canonical embedding is κ : M3 → Mn , (x1 , x2 , t) 7→ (x1 , x2 , 0, . . . , 0, t) in Cartesian coordinates. Let χ : J → M3 be a geodesic of M3 , then it is trivial that κ ◦ χ : J → κ(M3 ) ⊂ Mn is a geodesic in Mn . In particular, if χ is a null geodesic of M3 then κ ◦ χ is a null geodesic of Mn . Lemma 22. Let γ : I → Mn be a null geodesic. Then there is an isometry δ ∈ 1n such that δ ◦ γ(I) ⊂ κ(M3 ). Definition 7. Let M3 := {δ(κ(M3 )) | δ ∈ 1n0 }, 1n0 := {δ ∈ 1n | δ(P )t = P t ∀P ∈ Mn }, M0 := κ(M3 ).
Let $X$ be a mapping which assigns to any $M \in \mathcal{M}^3$ an isometry $\delta_M := X(M) \in \Delta_n^0$ with $\delta_M(M_0) = M$. Let $\mathcal{P}$ be a subset of $M^3$, and let $\mathcal{G}$ be a set of null lines of $L^3$. Let $M \in \mathcal{M}^3$. We define (with respect to $X$)
\[ \mathcal{P}_M := \{\delta_M(\kappa(P)) \mid P \in \mathcal{P}\}, \qquad \mathcal{G}_M := \{\delta_M(\kappa(g)) \mid g \in \mathcal{G}\}. \]
For $P \in M$ let $P^{\varphi_M} := (\kappa^{-1}\delta_M^{-1}(P))^\varphi$.

Remark 9. Let $L^n$ be the set of all null lines of $M^n$. We have
\[ M^n = \bigcup_{M \in \mathcal{M}^3} M, \qquad L^n = \bigcup_{M \in \mathcal{M}^3} L^3_M. \]
Obviously, images and pre-images of null lines under $\delta_M$ are again null lines, hence $\delta_M$ induces a bijection $L^3_{M_0} \to L^3_M$. Also, images and pre-images of null lines under $\kappa : M^3 \to M_0$ are null lines, and $\kappa$ induces a bijection $L^3 \to L^3_{M_0}$.

Remark 10. $|P^{\varphi_M} - Q^{\varphi_M}|$ is independent of the choice of $M \in \mathcal{M}^3$ for all $P, Q \in M$, and of the choice of $X$. In the following we choose a fixed $X$.

4.2. Intersection of two null lines. Let $g, h$ be two distinct null lines. If $g, h$ are contained in the same $M \in \mathcal{M}^3$, then we have the same results concerning the intersection of $g$ and $h$ as in the three-dimensional case. In the case that $g, h$ are contained in two distinct $M_1, M_2 \in \mathcal{M}^3$, it is $M_1 \cap M_2 = \emptyset$, or there exists $a \in\, ]-\pi, 0]$ such that $g \cap h \subset M_1 \cap M_2 = E_{M_1}(a) \cup E_{M_1}(a + \pi)$.

Lemma 23. Let $g \in L^n$. If there are $P, Q \in g$ with $P, Q \in M_1 \in \mathcal{M}^3$ and $|P^{\varphi_{M_1}} - Q^{\varphi_{M_1}}| \notin \{0, \pi\}$, then $g \subset M_1$.

Definition 8. Let $g \in L^3_M$, $M \in \mathcal{M}^3$. We define $\Delta\varphi_g := \Delta\varphi_{\kappa^{-1}\delta_M^{-1}(g)}$ and
\[ K_M^*(c > \tfrac{1}{27m^2}) := \{ g \in K_M(c > \tfrac{1}{27m^2}) \mid \Delta\varphi_g \leq \pi \}. \]
S 1 ∗ 3 (c > 27m Lemma 24. Let g ∈ M ∈M3 EM ∪ KM 2 ). Let M1 ∈ M . If there exist two distinct points P, Q in M1 ∩ g, then g ⊂ M1 . If |P ϕM1 − QϕM1 | ∈ {0, π}, then g ∈ EM1 . 4.3. Null line preserving bijections. Any isometry of Mn is bijective and preserves null lines of Mn . The converse is also true. Theorem. Let n ∈ N, n ≥ 3. A bijection of Mn is an isometry of Mn , iff images and pre-images of null lines are null lines. In the case n = 3, the theorem is shown in Proposition 2. In the following let n ≥ 4. Let : Mn → Mn be bijective with (g), −1 (g) ∈ Ln for all g ∈ Ln , where (g) := g¯ := {(P ) | P ∈ g} for all g ∈ Ln . Let G ⊂ Ln be a set of null lines and P ⊂ Mn a set of points. We write (G) := {(g) | g ∈ G} = {g¯ | g ∈ G}, (P) := {(P ) | P ∈ P}. At first we show that (M ) ∈ M3 for all M ∈ M3 . The remaining part of the proof is done using the results of Sect. 3. S S 1 1 Lemma S 25. We have g ∈ S M ∈M3 KM ( 27m2 ) iff g¯ ∈ rM ∈M3 KM ( 27m2 r). In particular, g ∈ M ∈M3 SM iff g¯ ∈ M ∈M3 SM . This implies P = 3m iff (P ) = 3m for all P ∈ Mn .
The proof is similar to the proof of Step 1 in Proposition 2. S
Definition 9. Let P, Q ∈ Mn with P r = Qr = 3m. We write P ∼ Q iff there exists a null line in SM , M ∈ M3 which contains P and Q. For M ∈ M3 with P ∈ ZM (3m) let S [P ]M := {Q ∈ M | P ∼ Q, |P ϕM − QϕM | ∈ {0, π}}. Since any two distinct M1 , M2 ∈ M3 are either disjoint or intersect in two Schwarzschild half-planes, we have [P ]M1 = [P ]M2 whenever P ∈ M1 ∩ M2 . Definition 10. For P ∈ Mn , P r = 3m let [P ] := [P ]M for a M ∈ M3 with P ∈ M, [P ]+ := {Q ∈ [P ] | P ϕM = QϕM }, [P ]− := [P ] \ [P ]+ . Remark 11. Lemma 25 implies ([P ]) = [(P )]. Lemma 26. Let M1 ∈ M3 , a ∈]−π, π]. There exists M2 ∈ M3 with (EM1 (a)) ⊂ M2 . Proof. We take three distinct points A1 , A2 , A3 in EM1 (a) ∩ ZM1 (3m) with Ai ∈ [A1 ]+ and (Ai ) ∈ [(A1 )]+ , i = 2, 3. Let l1 ∈ EM1 (a), A1 ∈ l1 , 6 g2 ∩ l1 =: P, g2 ∈ EM1 (a), A2 ∈ g2 , ∅ = 6 g3 ∩ l1 =: Q. g3 ∈ EM1 (a), A3 ∈ g3 , ∅ = Let M2 ∈ M3 with l¯1 ⊂ M2 . Then (Ai ) ∈ M2 , i = 1, 2, 3. We have g¯ 2 , g¯ 3 ⊂ M2 : In the case (P )ϕM2 = (A1 )ϕM2 we have l¯1 ∈ EM2 , a∗ := (Q)ϕM2 = (A1 )ϕM2 , and g¯ 2 , g¯ 3 ∈ EM2 (a∗ ). In the case (P )ϕM2 6= (A1 )ϕM2 , we obtain with Lemma 25 and with the same argumentation as in Step 3, (i) of the proof of Proposition 2, that 1 l¯1 ∈ KM2 (c > 27m 2 ), 1ϕl¯1 ≤ π, and / {0, π}, |(P )ϕM2 − (A2 )ϕM2 | ∈
|(Q)ϕM2 − (A3 )ϕM2 | ∈ / {0, π},
and finally g¯ 2 , g¯ 3 ⊂ M2 from Lemma 23. For any point R in EM1 (a) \ (g2 ∪ g3 ) there is a null line l in EM1 (a) containing R with ∗ (c > ∅= 6 l ∩gi =: Ri , i = 2, 3. We have (Ri ) ∈ M2 , i = 2, 3. Since g¯ 2 , g¯ 3 ∈ EM2 ∪KM 2 1 ¯ ¯ ), also l ∈ M from Lemma 24. Hence (R) ∈ l ⊂ M . 2 2 2 27m Lemma 27. For any M1 ∈ M3 there exists M2 ∈ M3 with (ZM1 (3m)) = ZM2 (3m). The proof follows immediately from Lemma 25, 26. Lemma 28. Let a ∈] − π, π]. Let M1 , M2 ∈ M3 with (EM1 (a)) ⊂ M2 . There exists a∗ ∈] − π, π] with (EM1 (a)) = EM2 (a∗ ). ∗ (c > Proof. Assume (EM1 (a)) 6= EM2 (a∗ ) for all a∗ ∈] − π, π]. Then g¯ ∈ KM 2 1 for all g ∈ EM1 (a). Choose distinct li ∈ KM1 (c > 27m2 ), i = 1, 2, with
1 27m2 )
6 li ∩ ZM1 (3m) =: B ∈ EM1 (a ± π), li ∩ EM1 (a) =: A, π < 1ϕli < 2π, ∅ = / {0, π}. Using then A ∈ / ZM1 (3m). Our assumption implies |(B)ϕM2 − (A)ϕM2 | ∈ Lemma 23 we obtain l¯i ⊂ M2 . But from (A), (B) ∈ l¯1 ∩ l¯2 and Lemma 13 follows |(B)ϕM2 − (A)ϕM2 | ∈ {0, π}, a contradiction.
Proof of the Theorem.. From Lemma 27 and 28 we obtain immediately that takes −1 3 3 elements of M3 to elements of M3 , and κ−1 δ(M ) δM κ : M → M is a null line preserving bijection for any M ∈ M3 . Hence, for all Mi ∈ M3 there are an orthogonal (n − 1) × (n − 1)–matrix Ai and ∈ {−1, 1}, b ∈ R, such that Ai 0 + (0, . . . , 0, b), (10) |Mi (X) = X 0 where (x1 , . . . , xn−1 , t) are Cartesian coordinates for X. For any two points X, Y ∈ Mn there is an element of M3 containing X and Y , so Xt = Y t
⇒
(X)t = (Y )t .
(11)
For any X ∈ Rn \ Mn there is an M1 ∈ M3 with X ∈ < M1 >, where < M1 > denotes the three-dimensional sub-vector space of Rn containing M1 . Define A1 0 + (0, . . . , 0, b), (X) := X 0 where A1 , , b are defined in (10). From (11), and b are fixed for all M ∈ M3 . Let M2 be another three-dimensional subspace with X ∈ < M2 >. Let A2 be the orthogonal (n − 1) × (n − 1)–matrix defined by (10). Then A2 0 A1 0 + (0, . . . , 0, b) = X + (0, . . . , 0, b). X 0 0 This is trivial in the case X = 0. In the case X 6= 0 this follows from 3m Ai 0 + (0, . . . , 0, b), i = 1, 2, (P ) = r · X 0 X 3m n where P := X r · X ∈ M1 ∩ M2 ⊂ M . Hence is well-defined. The restriction of to Rn−1 × {0} is a mapping from Rn−1 × {0} onto Rn−1 × {b} which preserves euclidean distances. Thus there is an orthogonal (n − 1) × (n − 1)–matrix A such that A0 (X) = X + (0, . . . , 0, b) 0
for all X ∈ Rn . From |Mn = it follows that is an isometry of Mn .
References
1. Aczél, J.: Vorlesungen über Funktionalgleichungen und ihre Anwendungen. Basel, Stuttgart: Birkhäuser Verlag, 1961
2. Alexandrov, A.D.: Seminar report. Uspekhi Mat. Nauk 37(3), 187 (1950)
3. Alexandrov, A.D.: A contribution to chronogeometry. Canad. J. Math. 19, 1119–1128 (1967)
4. Alexandrov, A.D., and Ovchinnikova, V.V.: Notes on the foundations of relativity theory. Vestnik Leningrad Univ. Math. 11, 95–100 (1953)
5. Benz, W.: Characterizations of geometrical mappings under mild hypotheses: Über ein modernes Forschungsgebiet der Geometrie. Hamb. Beitr. Wiss.gesch. 15, 393–409 (1994)
6. Benz, W.: Real Geometries. Mannheim, Leipzig, Wien, Zürich: BI Wissenschaftsverlag, 1994
7. Chandrasekhar, S.: The Mathematical Theory of Black Holes. New York: Oxford Univ. Press, 1983
8. Göbel, R.: Zeeman topologies on space-times of general relativity theory. Commun. Math. Phys. 46, 289–307 (1976)
9. Guts, A.K.: Axiomatic relativity theory. Russ. Math. Surv. 37(2), 41–89 (1982)
10. Hawking, S.W., and Ellis, G.F.R.: The Large Scale Structure of Space-time. Cambridge: Cambridge Univ. Press, 1973
11. Hawking, S.W., King, A.R., and McCarthy, P.J.: A new topology for curved space-time which incorporates the causal, differential, and conformal structures. J. Math. Phys. 17(2), 174–181 (1976)
12. Huang, W.-l.: Transformations of strongly causal space-times preserving null geodesics. J. Math. Phys. 39(3), 1637–1641 (1998)
13. Lester, J.A.: Transformations of Robertson-Walker spacetimes preserving separation zero. Aequationes Math. 25, 216–232 (1982)
14. Lester, J.A.: Separation-preserving transformations of de Sitter spacetime. Abh. Math. Sem. Univ. Hamburg 53, 217–224 (1983)
15. Lester, J.A.: The causal automorphisms of de Sitter and Einstein cylinder spacetimes. J. Math. Phys. 25, 111–116 (1984)
16. Lester, J.A.: Zeeman’s lemma on Robertson-Walker space-times. J. Math. Phys. 30(6), 1296–1300 (1989)
17. Lester, J.A.: Distance preserving transformations. In: F. Buekenhout, editor, Handbook of Incidence Geometry, Amsterdam: Elsevier, 1995, pp. 921–944
18. O’Neill, B.: Semi-Riemannian Geometry. New York, London: Academic Press, 1983
19. Rätz, J.: On light-cone-preserving mappings of the plane. In: E.F. Beckenbach and W. Walter, editors, General Inequalities 3, Basel: Birkhäuser Verlag, 1983, pp. 349–367
20. Zeeman, E. C.: Causality implies the Lorentz group. J. Math. Phys. 5(4), 490–493 (1964)

Communicated by H. Nicolai
Commun. Math. Phys. 201, 493 – 505 (1999)
Communications in
Mathematical Physics © Springer-Verlag 1999
On the Asymptotics of the Finite-Perimeter Partition Function of Two-Dimensional Lattice Vesicles
T. Prellberg$^1$, A. L. Owczarek$^2$
$^1$ Department of Theoretical Physics, University of Manchester, Manchester M13 9PL, United Kingdom
$^2$ Department of Mathematics, University of Melbourne, Parkville, 3052, Australia
Received: 17 March 1998 / Accepted: 7 September 1998
Abstract: We derive the dominant asymptotic form and the order of the correction terms of the finite-perimeter partition function of self-avoiding polygons on the square lattice, which are weighted according to their area $A$ as $q^A$, in the inflated regime, $q > 1$. The approach $q \to 1^+$ of the asymptotic form is examined.

1. Introduction

A simple model of a closed, fluctuating membrane in solution (or vesicle), such as those found in biological contexts, is a self-avoiding surface on a $d$-dimensional hypercubic lattice. To take account of the effects of factors such as osmotic pressure and pH differences between the inside and outside of the membrane it is advantageous to sort the configurations according to their volume and surface area. In two dimensions, self-avoiding polygons (SAP) weighted by area and perimeter were investigated by Fisher et al. [9] after the general problem of two-dimensional vesicles was discussed by Leibler et al. [13]. Exact enumerations of SAP by area and perimeter, and some related rigorous results on the mean area of polygons of fixed perimeter have also been given [12, 8], after pioneering work of Hiley and Sykes [11] on their enumeration.

A vesicle in two dimensions will be modelled in this paper by a self-avoiding polygon on the square lattice, where both the perimeter and area are controlled in some fashion. To be more precise, one quantity often considered when investigating the behaviour of lattice vesicles is the finite-perimeter partition function. This is defined as
\[ Z_n(q) = \sum_m c_{nm}\, q^m, \tag{1.1} \]
where $c_{nm}$ is the number of some set of polygon configurations enumerated with respect to their perimeter, $2n$, and area, $m$, and the sum is over all possible values of $m$. (Since only the square lattice is considered here, where the perimeter of the polygons contains an even number of bonds, we will use the convention that $n$ denotes half of the length of the perimeter.) It is this quantity that will be the focus of our work here, more precisely, its asymptotic behaviour as $n \to \infty$ for a fixed value of $q$. Moreover, everywhere we will restrict $q$ to be larger than one, that is, $q > 1$.

In the course of our discussion we will consider several subsets of self-avoiding polygons on the square lattice: these include convex polygons, directed convex polygons, Ferrers diagrams and simple rectangles. The general area-perimeter counting problem for these subsets has been examined previously [4, 5, 7, 1, 2, 3, 6, 14, 15, 16, 17]. In particular, the definitions, including diagrams, of the various polygon models can be found in Bousquet-Mélou [2]. However, their finite-perimeter partition functions’ asymptotics for $q > 1$ have not been explicitly examined.

In this paper we prove that in two dimensions for SAP,
\[ Z_n(q) = A(q)\, q^{n^2/4}\, (1 + O(\rho^n)) \quad \text{as } n \to \infty, \tag{1.2} \]
for some $0 < \rho < 1$, where $A(q) = A_o(q)$ or $A(q) = A_e(q)$ when $n$ is restricted to subsequences with $n$ being odd or even respectively. We give explicit expressions for $A_o(q)$ and $A_e(q)$. In fact we show that these functions coincide with those obtained if one only considered convex polygons. Note also that the odd/even dichotomy implies there is not a unique asymptotic form for $Z_n(q)$ in the regime $q > 1$. We also deduce that there is an essential singularity in both the $A(q)$ functions as $q$ approaches 1 from above; in particular

\[ A(q) \sim \frac{1}{4} \left(\frac{\varepsilon}{\pi}\right)^{3/2} e^{2\pi^2/3\varepsilon} \quad \text{as } \varepsilon = \log q \to 0^+ \tag{1.3} \]
for both even and odd $n$. In Fisher et al. [9] there is an argument giving the leading order factor of the finite-perimeter partition function asymptotics for polygons. The partition function $Z_n(q)$ is bounded for $q > 1$ by

\[ q^{M(n)} \le Z_n(q) \le q^{M(n)} Z_n(1) = q^{M(n)} \mu_{\mathrm{saw}}^{2n+o(n)} , \tag{1.4} \]
where $M(n)$ is the maximal area of a polygon with perimeter $2n$ and $\mu_{\mathrm{saw}}$ is the connectivity constant for self-avoiding walks. From this and the exact value of $M(n)$ (see (3.1)) it follows immediately that

\[ Z_n(q) = q^{n^2/4} e^{O(n)} \quad \text{as } n \to \infty . \tag{1.5} \]
To refine this result, we show that in fact for all $q > 1$ the partition function asymptotics is completely dominated by the convex configurations. This is stated in Theorem 2.1. In Theorem 2.2 we then discuss the asymptotics for various models of convex polygons. Taken together these two theorems enable the following explicit expression, described precisely in Corollary 2.3, for the leading asymptotic behaviour of $Z_n(q)$ to be given

\[ Z_n(q) = \frac{1 + O(\rho^n)}{(q^{-1}; q^{-1})_\infty^4} \sum_{k=-\infty}^{\infty} q^{k(n-k)} \tag{1.6} \]

for some $0 < \rho < 1$. Here,
Asymptotics of Inflated Vesicles
Fig. 1. Pictorial representation of (1.6): the partition function is asymptotically dominated by convex polygons, which are constructed from rectangles by removing corners made of Ferrers diagrams
\[ (x; q)_m \overset{\mathrm{def}}{=} \prod_{k=1}^{m} \left(1 - x q^{k-1}\right) \tag{1.7} \]
is the standard q-product notation. This is the main result of our work. The asymptotic form (1.6) has a straightforward combinatorial interpretation (see Fig. 1). The infinite sum has its origin in the generating function for rectangles

\[ \sum_{k=1}^{n-1} q^{k(n-k)} . \tag{1.8} \]
(A rectangle of perimeter $2n$ may have sides of length $k$ and $n-k$, where $1 \le k \le n-1$, and so an area of $k(n-k)$.) If the range of summation is extended to $\mathbb{Z}$, the change is of the order of $O(q^{-n^2/4})$. Convex polygons can be constructed by removing corner sites from these rectangles while preserving the perimeter. These “corners” are described by Ferrers diagrams, whose area-generating function is

\[ F(q) = \frac{1}{(q; q)_\infty} = \prod_{k=1}^{\infty} \frac{1}{1 - q^k} , \tag{1.9} \]
which is convergent for $|q| < 1$. A removal of one corner (ignoring overlaps) corresponds to multiplication with this area-generating function with the area weight replaced by $q^{-1}$. Correspondingly, the simultaneous removal of four corners corresponds to multiplication with $F(q^{-1})^4$, leading directly to the expression in (1.6).

The rest of the paper is set out as follows: in Sect. 2 we state the two main theorems, where the first theorem compares the asymptotics of the finite-perimeter partition functions of all polygons with those of convex polygons while the second gives the asymptotics of various kinds of convex polygons, and our main result precisely, which combines these theorems to give the finite-perimeter partition function asymptotics for all polygons. In the following Sect. 3 we prove the two main theorems. We end with a
discussion of our results, including the derivation of the asymptotics as $q \to 1^+$ of the dominant asymptotic part (of the right-hand side) of (1.6).

2. Asymptotic Results

Theorem 2.1. Let $Z_n(q)$ and $Z_n^c(q)$ be the finite-perimeter partition functions of polygons and convex polygons, respectively, on the square lattice. Then, $Z_n(q) \sim Z_n^c(q)$ “exponentially fast” as $n \to \infty$: more precisely, for all $q > 1$ there exist $C > 0$ and $0 < \rho < 1$ such that for all integers $n > 1$,

\[ 1 \le \frac{Z_n(q)}{Z_n^c(q)} < 1 + C\rho^n . \tag{2.1} \]
Theorem 2.2. Let $Z_n^{(s)}(q)$ be the finite-perimeter partition function of rectangles ($s = 0$), Ferrers diagrams ($s = 1$), stacks or staircase polygons ($s = 2$), directed convex polygons ($s = 3$), and convex polygons ($s = 4$) on the square lattice. Then

\[ Z_n^{(s)}(q) \sim Z_n^{(s),\mathrm{as}}(q) \overset{\mathrm{def}}{=} \frac{1}{(q^{-1}; q^{-1})_\infty^s} \sum_{k=-\infty}^{\infty} q^{k(n-k)} \tag{2.2} \]
exponentially fast as n → ∞: more precisely, for all q > 1 there exist C > 0 and 0 < ρ < 1 such that for all integers n > 1, 1 − Cρn
1 there exist $C > 0$ and $0 < \rho < 1$ such that for all integers $n > 1$,

\[ \left| \frac{Z_n(q)}{Z_n^{\mathrm{as}}(q)} - 1 \right| < C\rho^n . \tag{2.5} \]

Proof of Corollary 2.3. It follows from Theorem 2.1 and Theorem 2.2 (with $s = 4$) by multiplying the inequalities (2.1) and (2.3) that for $q > 1$ there exist $C > 0$ and $0 < \rho < 1$ such that $1 - C\rho^n$
$\mu_{\mathrm{saw}}^4$ there exist $C > 0$ and $0 < \rho < 1$ such that for all integers $n > 1$,

\[ 1 \le \frac{Z_n(q)}{Z_{n,\ell}^{ac}(q)} < 1 + C\rho^n , \tag{3.2} \]
where $\mu_{\mathrm{saw}} \simeq 2.638$ is the connectivity constant of self-avoiding walks.

Proof of Lemma 3.2. The difference between the set of polygons and at-most-$\ell$-convex polygons is precisely the set of polygons with a convexity index of at least $\ell + 1$. These polygons have a bounding rectangle of half perimeter $\le n - \ell - 1$, hence an area of at most $M(n - \ell - 1)$, and their number is clearly smaller than $c_n$, the total number of polygons with perimeter $2n$. Therefore we have the bound

\[ 0 \le Z_n(q) - Z_{n,\ell}^{ac}(q) \le c_n q^{M(n-\ell-1)} . \tag{3.3} \]
Rearranging terms and estimating $Z_{n,\ell}^{ac}(q) > q^{M(n)}$, this leads to

\[ 1 \le \frac{Z_n(q)}{Z_{n,\ell}^{ac}(q)} \le 1 + c_n q^{M(n-\ell-1)-M(n)} . \tag{3.4} \]
Now $c_n$ grows asymptotically as $\mu_{\mathrm{saw}}^{2n}$ and we calculate

\[ M(n-\ell-1) - M(n) \le -\frac{\ell+1}{2}\, n + \frac{(\ell+1)^2 + 1}{4} . \tag{3.5} \]

Thus, provided that $q^{(\ell+1)/2} > \mu_{\mathrm{saw}}^2$, we can find $C > 0$ and $0 < \rho < 1$ such that

\[ c_n q^{M(n-\ell-1)-M(n)} < C\rho^n , \tag{3.6} \]

which completes the proof.
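The estimate (3.5) and the condition on $\ell$ can be checked with a few lines of exact arithmetic. The sketch below is only an illustration: it assumes that the maximal area is $M(n) = \lfloor n^2/4 \rfloor$ (the area of the largest rectangle of half-perimeter $n$), and the function names `max_area`, `bound_3_5`, `check_3_5` and `min_ell` are hypothetical.

```python
from fractions import Fraction


def max_area(n: int) -> int:
    """Assumed form of M(n): area of the largest rectangle with half-perimeter n."""
    return (n * n) // 4


def bound_3_5(n: int, ell: int) -> Fraction:
    """Right-hand side of (3.5), kept exact with rational arithmetic."""
    return Fraction(-(ell + 1) * n, 2) + Fraction((ell + 1) ** 2 + 1, 4)


def check_3_5(n_max: int = 60, ell_max: int = 8) -> bool:
    """Check M(n - ell - 1) - M(n) <= bound for all n >= ell + 2 in the given range."""
    for ell in range(ell_max + 1):
        for n in range(ell + 2, n_max + 1):
            if max_area(n - ell - 1) - max_area(n) > bound_3_5(n, ell):
                return False
    return True


def min_ell(q: float, mu_saw: float = 2.638) -> int:
    """Smallest non-negative integer ell with q**(ell + 1) > mu_saw**4 (needs q > 1)."""
    ell = 0
    while q ** (ell + 1) <= mu_saw ** 4:
        ell += 1
    return ell
```

With the quoted value $\mu_{\mathrm{saw}} \simeq 2.638$, `min_ell(q)` grows as $q$ decreases towards 1, which is the behaviour the simple estimate suggests.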
This lemma seems to suggest that the closer $q$ is to 1, the larger $\ell$ has to be chosen to get convergence. However, this is just an artefact of the rather simple estimation. One can sharpen the result with the help of the next lemma.

Lemma 3.3. For all non-negative integers $\ell$ and for all $q > 1$ there exist $C > 0$ and $0 < \rho < 1$ such that for all integers $n > 1$,

\[ 1 \le \frac{Z_{n,\ell}^{ac}(q)}{Z_n^c(q)} < 1 + C\rho^n . \tag{3.7} \]
Proof of Lemma 3.3. Let $Z'_{n,\ell}(q)$ denote the finite-perimeter partition function of polygons with convexity index $\ell$. Clearly $Z_{n,\ell}^{ac}(q) = \sum_{k=0}^{\ell} Z'_{n,k}(q)$. We first give an upper bound on $Z'_{n,\ell}(q)$ in terms of $Z'_{n-1,\ell-1}(q)$, valid for $\ell > 0$. To do this let us consider any polygon with perimeter $2n$ and convexity index $\ell$: we can add cells (faces of the lattice) to arrive at some polygon with perimeter $2(n-1)$ and convexity index $\ell - 1$ while preserving the bounding rectangle. As $\ell > 0$, we can always find an indentation within the polygon of the form depicted in Fig. 2. Adding the marked faces to the polygon clearly changes perimeter and convexity index as desired. This implies that every polygon with perimeter $2n$ and convexity index $\ell$ can be constructed by removing cells (faces of the lattice) from a polygon with perimeter $2(n-1)$ and convexity index $\ell - 1$ while preserving the bounding rectangle. By going through this procedure carefully, we will obtain the estimate

\[ Z'_{n,\ell}(q) \le \frac{2n}{q-1}\, Z'_{n-1,\ell-1}(q) . \tag{3.8} \]
To show this, we take any polygon with perimeter $2(n-1)$ and convexity index $\ell - 1$ and count the number of ways to remove faces. As the convexity index increases by exactly one, the faces to be removed have to be at the boundary of the polygon and have to be connected (one can of course get further such polygons by removing other sites that are not directly at the boundary, but then there is a smaller polygon with which we could have started the construction). There are fewer than $2n$ different faces of the polygon at the boundary. If we fix one face and start removing this one and additional faces in a clockwise order, we can remove only a finite number of faces, certainly fewer than $2n$. Each time we remove a face, the weight of the configuration is multiplied by $1/q$, and summing up the weights of all configurations generated in this way, we get a change of weight of at most $1/q + 1/q^2 + \ldots \le 1/(q-1)$ by the removal of faces. Together with a multiplicity of at most $2n$ due to the choice of the first site, this implies the desired inequality (3.8).
Fig. 2. This figure shows the construction in Lemma 3.3. Shown is part of a polygon (shaded faces) with the thick line representing its border. The perimeter of the polygon is decreased by 2 and the convexity index decreased by 1 by adding the faces marked with × to the polygon. Note that the faces marked with ◦ are not part of the polygon, whereas the unmarked faces can be either
Using this inequality, we get by iteration an upper bound for at-most-$\ell$-convex polygons in terms of convex polygons only:

\[ Z_{n,\ell}^{ac}(q) \le \sum_{k=0}^{\ell} \left(\frac{2n}{q-1}\right)^{k} Z_{n-k}^{c}(q) . \tag{3.9} \]
This leads to the need to estimate the terms in the sum on the right-hand side of ` ac X (q) Zn,` 0 such that k c ` X Zn−k (q) 2n ≤ Cρn , (3.12) q−1 Znc (q) k=1
which proves the lemma.
Taken together, Lemma 3.2 and Lemma 3.3 prove Theorem 2.1.

Proof of Theorem 2.1. For any $q > 1$ we can choose $\ell$ fixed such that $q^{\ell+1} > \mu_{\mathrm{saw}}^4$. Now we can write
\[ 1 \le \frac{Z_n(q)}{Z_n^c(q)} = \frac{Z_n(q)}{Z_{n,\ell}^{ac}(q)} \, \frac{Z_{n,\ell}^{ac}(q)}{Z_n^c(q)} \le (1 + C_1\rho_1^n)(1 + C_2\rho_2^n) , \tag{3.13} \]
where the existence of $C_1 > 0$ and $0 < \rho_1 < 1$ is guaranteed by Lemma 3.2, and Lemma 3.3 guarantees the existence of $C_2 > 0$ and $0 < \rho_2 < 1$. It follows that for any $\max(\rho_1, \rho_2) < \rho < 1$ there exists a $C > 0$ such that

\[ 1 \le \frac{Z_n(q)}{Z_n^c(q)} \le 1 + C\rho^n . \tag{3.14} \]
The inequality (3.11) used in the proof of Lemma 3.3 is contained in Lemma 3.4 (with $s = 4$), which we also use in a remark after the proof of Lemma 3.6.

Lemma 3.4. For $s \in \{0, 1, 2, 3, 4\}$ let $Z_n^{(s)}(q)$ be defined as in Theorem 2.2. Then, for any positive $q$ and integer $n > 1$ we have the inequality

\[ Z_{n+2}^{(s)}(q) \ge q^{n+1} Z_n^{(s)}(q) \tag{3.15} \]

and the slightly weaker bound

\[ Z_{n+1}^{(s)}(q) \ge q^{n/2} Z_n^{(s)}(q) . \tag{3.16} \]
Proof of Lemma 3.4. If we increase the width of each row and then the height of each column of a convex polygon with perimeter $2n$ by one (by adding cells appropriately), we increase the perimeter by 4 and the area by $n + 1$. This implies immediately the first inequality. For the second one we have to labour slightly harder. We partition the set of convex polygons with respect to their bounding rectangles. Let $c_{(k,\ell)}^{m,(s)}$ denote the number of convex polygons of class $s$ with width $k$, height $\ell$, and area $m$. Then, by simply increasing the width or height of each row or column, respectively, of a polygon by one, we get the estimates

\[ c_{(k+1,\ell)}^{m,(s)} \ge c_{(k,\ell)}^{m-\ell,(s)} \quad \text{and} \quad c_{(k,\ell+1)}^{m,(s)} \ge c_{(k,\ell)}^{m-k,(s)} . \tag{3.17} \]
(We need to treat both cases, as stacks ($s = 2$) lack reflection symmetry.) If we define the partition function $Z_{(k,\ell)}^{(s)}(q) = \sum_m q^m c_{(k,\ell)}^{m,(s)}$, then this implies the inequalities

\[ Z_{(k+1,\ell)}^{(s)}(q) \ge q^{\ell} Z_{(k,\ell)}^{(s)}(q) \quad \text{and} \quad Z_{(k,\ell+1)}^{(s)}(q) \ge q^{k} Z_{(k,\ell)}^{(s)}(q) . \tag{3.18} \]

As $Z_{n+1}^{(s)}(q) = \sum_{k=0}^{n-1} Z_{(k+1,n-k)}^{(s)}$, we can now estimate
\[ Z_{n+1}^{(s)}(q) \ge Z_{(1,n)}(q) + \sum_{k=1}^{n-1} q^{n-k} Z_{(k,n-k)}^{(s)}(q) \tag{3.19} \]

and

\[ Z_{n+1}^{(s)}(q) \ge Z_{(n,1)}(q) + \sum_{k=1}^{n-1} q^{k} Z_{(k,n-k)}^{(s)}(q) , \tag{3.20} \]
whence it follows that

\[ Z_{n+1}^{(s)}(q) \ge \sum_{k=1}^{n-1} \frac{q^{n-k} + q^{k}}{2}\, Z_{(k,n-k)}^{(s)}(q) \ge q^{n/2} Z_n^{(s)}(q) , \tag{3.21} \]

where we have used the geometric-arithmetic mean inequality.
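For rectangles ($s = 0$) both bounds of Lemma 3.4 can be verified directly, since $Z_n^{(0)}(q) = \sum_{k=1}^{n-1} q^{k(n-k)}$ is explicit. The following sketch is an illustration only; it uses exact rational arithmetic and takes $q = r^2$ with rational $r$ so that $q^{n/2} = r^n$ stays exact. The helper names `rectangles` and `lemma_3_4_holds` are hypothetical.

```python
from fractions import Fraction


def rectangles(n: int, q: Fraction) -> Fraction:
    """Z_n^{(0)}(q): rectangles of half-perimeter n have sides k and n - k."""
    return sum((q ** (k * (n - k)) for k in range(1, n)), Fraction(0))


def lemma_3_4_holds(r: Fraction, n_max: int = 30) -> bool:
    """Check (3.15) and (3.16) for rectangles with q = r**2, for 2 <= n <= n_max."""
    q = r * r
    for n in range(2, n_max + 1):
        z_n = rectangles(n, q)
        if rectangles(n + 2, q) < q ** (n + 1) * z_n:  # inequality (3.15)
            return False
        if rectangles(n + 1, q) < r ** n * z_n:  # inequality (3.16), q**(n/2) = r**n
            return False
    return True
```

Exact fractions are used deliberately: the slack in (3.15) is tiny compared with $Z_{n+2}^{(0)}(q)$ for large $n$, so a floating-point comparison could fail through rounding alone.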
A simple idea of over-counting gives the upper bound for the partition function $Z_n^{(s)}(q)$ in the next lemma.

Lemma 3.5. For $s \in \{0, 1, 2, 3, 4\}$ let $Z_n^{(s)}(q)$ be defined as in Theorem 2.2. Then for any $q > 1$ and integer $n > 1$ we have the bound

\[ Z_n^{(s)}(q) < Z_n^{(s),\mathrm{as}}(q) = \frac{1}{(q^{-1}; q^{-1})_\infty^s} \sum_{k=-\infty}^{\infty} q^{k(n-k)} . \tag{3.22} \]
Proof of Lemma 3.5. Every configuration in these models can be constructed by removing $s$ Ferrers diagrams from specified corners of rectangles with the restriction that the resulting configuration is still a polygon (this procedure does not change the perimeter). If one removes this restriction, one clearly over-counts. As the removal of Ferrers diagrams of arbitrary size is equivalent to multiplying the weight of the rectangle with $(q^{-1}; q^{-1})_\infty^{-1}$, this implies for the generating function the inequality

\[ Z_n^{(s)}(q) \le \frac{Z_n^{(0)}(q)}{(q^{-1}; q^{-1})_\infty^s} . \tag{3.23} \]

Replacing $Z_n^{(0)}(q) = \sum_{k=1}^{n-1} q^{k(n-k)}$ by the infinite sum $\sum_{k=-\infty}^{\infty} q^{k(n-k)}$ proves the lemma.
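The over-counting bound can be illustrated by brute force for the two simplest classes. The sketch below enumerates Ferrers diagrams of half-perimeter $n$ directly, taking a Ferrers diagram to be a partition whose largest part is the width and whose number of parts is the height (this reading, the truncation parameters `terms` and `k_range`, and all function names are assumptions made for the illustration), and compares the resulting partition functions with the right-hand side of (3.22) for $s = 0$ and $s = 1$.

```python
def q_pochhammer_inf(x: float, terms: int = 400) -> float:
    """Truncated product (x; x)_infinity = prod_{k>=1} (1 - x^k), for 0 < x < 1."""
    prod = 1.0
    for k in range(1, terms + 1):
        prod *= 1.0 - x ** k
    return prod


def ferrers_areas(n: int) -> dict:
    """Number of Ferrers diagrams with half-perimeter n, keyed by area."""
    counts = {}

    def extend(rows_left: int, max_part: int, area: int) -> None:
        if rows_left == 0:
            counts[area] = counts.get(area, 0) + 1
            return
        for part in range(1, max_part + 1):  # rows are non-increasing in length
            extend(rows_left - 1, part, area + part)

    for width in range(1, n):
        height = n - width
        extend(height - 1, width, width)  # the first row has full width
    return counts


def rectangles(n: int, q: float) -> float:
    """Z_n^{(0)}(q), the finite sum over k = 1, ..., n - 1."""
    return sum(q ** (k * (n - k)) for k in range(1, n))


def ferrers(n: int, q: float) -> float:
    """Z_n^{(1)}(q) obtained from the brute-force enumeration."""
    return sum(count * q ** area for area, count in ferrers_areas(n).items())


def asymptotic_form(n: int, q: float, s: int, k_range: int = 60) -> float:
    """Right-hand side of (3.22); terms far from k = n/2 underflow harmlessly to 0."""
    theta = sum(q ** (k * (n - k)) for k in range(-k_range, n + k_range + 1))
    return theta / q_pochhammer_inf(1.0 / q) ** s
```

For $q = 2$ and $n \le 10$ the differences are large enough to be resolved in double precision; for much larger $n$ the $s = 0$ comparison would need exact arithmetic.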
As a consequence of Lemma 3.4 and Lemma 3.5 we can now establish the desired convergence to $Z_n^{(s),\mathrm{as}}(q)$. This is done in Lemma 3.6 in which we also establish the rate of convergence.

Lemma 3.6. For $s \in \{0, 1, 2, 3, 4\}$ let $Z_n^{(s)}(q)$ be defined as in Theorem 2.2. Then for all $q > 1$ there exist $C > 0$ and $0 < \rho < 1$ such that for all integers $n > 1$,

\[ q^{-M(n)} \left( Z_n^{(s),\mathrm{as}}(q) - Z_n^{(s)}(q) \right) < C\rho^n . \tag{3.24} \]

Proof of Lemma 3.6. We first consider $Z_{2n}^{(s)}(q)/q^{M(n)}$ and $Z_n^{(s),\mathrm{as}}(q)/q^{M(n)}$ as series in $q^{-1}$ and show that we have convergence for each of the series coefficients. In order to compare the coefficients, we need to look more closely at the error made by the over-counting procedure. The over-counting results from Ferrers diagrams that touch each other, or from Ferrers diagrams that do not fit into the rectangle. In either case, this necessitates a minimal area removal of size $\min(k, n-k)$ from a $k \times (n-k)$-rectangle. Thus, the maximal weight of the excess configurations is

\[ q^{k(n-k) - \min(k,n-k)} . \tag{3.25} \]
As both $Z_n^{(s)}(q)$ and $Z_n^{(0)}(q)/(q^{-1}; q^{-1})_\infty^s$ have a leading power of $q^{M(n)}$, this implies that they agree in their leading $\lfloor n/2 \rfloor$ coefficients, if considered as a series in $q^{-1}$.
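If we define for $k = 0, 1, 2, \ldots$ the positive numbers

\[ d_k^{(s),\mathrm{even}} = [q^{-k}] \frac{1}{(q^{-1}; q^{-1})_\infty^s} \sum_{\ell=-\infty}^{\infty} q^{-\ell^2} , \tag{3.26} \]

\[ d_k^{(s),\mathrm{odd}} = [q^{-k}] \frac{1}{(q^{-1}; q^{-1})_\infty^s} \sum_{\ell=-\infty}^{\infty} q^{-\ell(\ell+1)} , \tag{3.27} \]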
where $[q^{-k}]$ denotes the $k$th coefficient of $Z_n^{(s),\mathrm{as}}(q)/q^{M(n)}$ in $q^{-1}$, then these coefficients coincide with those of $Z_{2n}^{(s)}(q)/q^{M(n)}$, as explained above, for the first $n$ terms. This coincidence and the upper bound in Lemma 3.5 imply that

\[ \sum_{k=0}^{n} q^{-k} d_k^{(s),\mathrm{even}} \le Z_{2n}^{(s)}(q)/q^{n^2} \le \sum_{k=0}^{\infty} q^{-k} d_k^{(s),\mathrm{even}} , \tag{3.28} \]

\[ \sum_{k=0}^{n} q^{-k} d_k^{(s),\mathrm{odd}} \le Z_{2n+1}^{(s)}(q)/q^{n(n+1)} \le \sum_{k=0}^{\infty} q^{-k} d_k^{(s),\mathrm{odd}} , \tag{3.29} \]
which in turn imply that the error is less than the error made by truncating the expansion of the upper bound in $q^{-1}$ after $n$ terms. As the left-hand sides converge exponentially fast in $q^{-1}$ to the right-hand sides, we can now write down the rate of convergence for the middle terms. More precisely, we have shown that for $0 < \rho < 1$ there exists a $C > 0$ such that for all $q > \rho^{-1}$,

\[ \frac{1}{(q^{-1}; q^{-1})_\infty^s} \sum_{k=-\infty}^{\infty} q^{-k^2} - \frac{1}{q^{n^2}} Z_{2n}^{(s)}(q) \le C\rho^n , \tag{3.30} \]

\[ \frac{1}{(q^{-1}; q^{-1})_\infty^s} \sum_{k=-\infty}^{\infty} q^{-k(k+1)} - \frac{1}{q^{n(n+1)}} Z_{2n+1}^{(s)}(q) \le C\rho^n , \tag{3.31} \]

which implies that for $0 < \rho < 1$ there exists a $C > 0$ such that for all $q > \rho^{-2}$,

\[ q^{-M(n)} \left( \frac{1}{(q^{-1}; q^{-1})_\infty^s} \sum_{k=-\infty}^{\infty} q^{k(n-k)} - Z_n^{(s)}(q) \right) < C\rho^n , \tag{3.32} \]

which proves the lemma.
Remark. By Lemma 3.4, we have the inequality

\[ Z_{n+2}^{(s)}(q)/q^{(n+2)^2/4} \ge Z_n^{(s)}(q)/q^{n^2/4} , \tag{3.33} \]
which implies that the sequences $(Z_{2n}^{(s)}(q)/q^{n^2})$ and $(Z_{2n+1}^{(s)}(q)/q^{(n+1/2)^2})$ are monotonically increasing. Rewriting the upper bound of Lemma 3.5 gives the $n$-independent upper bounds

\[ Z_n^{(s)}(q)/q^{n^2/4} < \frac{1}{(q^{-1}; q^{-1})_\infty^s} \begin{cases} \sum_{k=-\infty}^{\infty} q^{-k^2} & n \text{ even} \\ \sum_{k=-\infty}^{\infty} q^{-(k+1/2)^2} & n \text{ odd} \end{cases} . \tag{3.34} \]
Thus, the sequences $(Z_{2n}^{(s)}(q)/q^{n^2})$ and $(Z_{2n+1}^{(s)}(q)/q^{(n+1/2)^2})$ converge. One may be tempted to use this convergence and the fact that the series coefficients of $Z_{2n}^{(s)}(q)/q^{M(n)}$ and $Z_n^{(s),\mathrm{as}}(q)/q^{M(n)}$ coincide for the leading $\lfloor n/2 \rfloor$ terms, as shown in the first part of the proof of Lemma 3.6, to show the sequences, $Z_{2n}^{(s)}(q)/q^{M(n)}$ and $Z_n^{(s),\mathrm{as}}(q)/q^{M(n)}$, converge to the same (odd and even) limits. However, to use this convergence of the formal power series, and the point-wise convergence of the sequences, to imply equality of the limits one needs to utilise the positivity of the coefficients of the power series. This is precisely what was accomplished in the second part of the proof of Lemma 3.6, which also allowed us to estimate the rate of convergence simultaneously.
Proof of Theorem 2.2. This follows now directly from Lemma 3.6.
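The convergence stated in Theorem 2.2 can be observed numerically for Ferrers diagrams ($s = 1$). The sketch below is an illustration under an assumption not taken from the text above: it uses the standard Gaussian binomial count of partitions in a box, so that diagrams of width $k$ and height $n-k$ contribute $q^{n-1}$ times a $q$-binomial coefficient. The names `gaussian_binomial`, `ferrers_partition_function`, `asymptotic_form` and `ratio`, and the truncation parameters, are hypothetical.

```python
def gaussian_binomial(a: int, b: int, q: float) -> float:
    """q-binomial [a+b choose a]_q = prod_{i=1}^{a} (q^(b+i) - 1)/(q^i - 1), q != 1."""
    result = 1.0
    for i in range(1, a + 1):
        result *= (q ** (b + i) - 1.0) / (q ** i - 1.0)
    return result


def ferrers_partition_function(n: int, q: float) -> float:
    """Z_n^{(1)}(q): width k and height n-k contribute q^(n-1) [n-2 choose k-1]_q."""
    return sum(q ** (n - 1) * gaussian_binomial(k - 1, n - k - 1, q)
               for k in range(1, n))


def asymptotic_form(n: int, q: float, terms: int = 400, k_range: int = 60) -> float:
    """Z_n^{(1),as}(q) from (2.2), with truncated sum and product (requires q > 1)."""
    theta = sum(q ** (k * (n - k)) for k in range(-k_range, n + k_range + 1))
    pochhammer = 1.0
    for k in range(1, terms + 1):
        pochhammer *= 1.0 - q ** (-k)
    return theta / pochhammer


def ratio(n: int, q: float) -> float:
    """Z_n^{(1)}(q) / Z_n^{(1),as}(q), expected to approach 1 from below."""
    return ferrers_partition_function(n, q) / asymptotic_form(n, q)
```

Evaluating `ratio(n, 2.0)` separately along even and odd $n$ shows the monotone approach to 1 described in the Remark, with the deficit shrinking roughly geometrically in $n$.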
4. Discussion

In this paper we have derived the leading asymptotic behaviour of the finite-perimeter generating function for polygons on the square lattice for area fugacity larger than one and have given a combinatorial interpretation of the result. We conclude this paper by considering the behaviour of the form (1.6) when $q \to 1^+$. This is clearly far from being enough to determine the asymptotic behaviour of $Z_n(1)$, as one may not interchange the limits $n \to \infty$ and $q \to 1$. We define

\[ A_e(q) = \frac{\sum_{k=-\infty}^{\infty} q^{-k^2}}{(q^{-1}; q^{-1})_\infty^4} \tag{4.1} \]

and

\[ A_o(q) = \frac{\sum_{k=-\infty}^{\infty} q^{-(k+1/2)^2}}{(q^{-1}; q^{-1})_\infty^4} . \tag{4.2} \]

Hence we can write

\[ Z_n^{\mathrm{as}}(q) = A(q)\, q^{n^2/4} , \tag{4.3} \]

where $A(q) = A_o(q)$ or $A(q) = A_e(q)$ when $n$ is restricted to subsequences with $n$ being odd or even respectively. The numerators of the functions $A_e(q)$ and $A_o(q)$ can be identified as limiting cases of the elliptic theta functions [18], that is,

\[ \vartheta_3(0, q^{-1}) = \sum_{k=-\infty}^{\infty} q^{-k^2} \tag{4.4} \]
and

\[ \vartheta_2(0, q^{-1}) = \sum_{k=-\infty}^{\infty} q^{-(k+1/2)^2} . \tag{4.5} \]
This allows the powerful theory of theta functions [18] to be utilised. In particular, the conjugate modulus transformation relates the theta functions of nome $p = e^{-\pi\eta} = q^{-1} < 1$ to theta functions of nome $p' = e^{-\pi/\eta}$. This is useful if we consider the asymptotics as $p \to 1^-$ (that is, $q \to 1^+$) since then $p' \to 0^+$. The conjugate modulus transformation yields

\[ \vartheta_3(0, p) = \eta^{-1/2}\, \vartheta_3(0, p') \tag{4.6} \]
and

\[ \vartheta_2(0, p) = \eta^{-1/2}\, \vartheta_4(0, p') = \eta^{-1/2} \sum_{k=-\infty}^{\infty} (-1)^k (p')^{k^2} . \tag{4.7} \]
Since

\[ \vartheta_3(0, p') \sim \vartheta_4(0, p') \sim 1 \tag{4.8} \]
as $p' \to 0$, and since further [10]

\[ (p; p)_\infty \sim \left(\frac{2}{\eta}\right)^{1/2} \exp\left(\frac{-\pi}{6\eta}\right) \tag{4.9} \]
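as $p \to 1^-$ ($\eta \to 0^+$), the asymptotics of the functions $A_e(q)$ and $A_o(q)$ follow after some algebra. We hence obtain

\[ A_e(q) \sim A_o(q) \sim \frac{1}{4} \left(\frac{\varepsilon}{\pi}\right)^{3/2} e^{2\pi^2/3\varepsilon} \quad \text{as } q \to 1^+ , \tag{4.10} \]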
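where $\varepsilon = \log(q)$. Lastly, we consider exact enumeration data for these models. Comparing

\[ Z_n(q) \Big/ \sum_{k=-\infty}^{\infty} q^{k(n-k)} = \sum_{k=0}^{\infty} a_{n,k}\, q^{-k} \tag{4.11} \]

and

\[ \frac{1}{(q^{-1}; q^{-1})_\infty^4} = \sum_{k=0}^{\infty} b_k\, q^{-k} , \tag{4.12} \]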
we observe that the coefficients $a_{n,k}$ are monotonically increasing in $n$ and bounded above by $b_k$ for $n \le 21$. Hence, we are led to conjecture that $Z_n^{\mathrm{as}}(q)$ from (2.4) may, in fact, be a strict upper bound for $Z_n(q)$. We leave this as an open question.

Acknowledgement. Financial support from the Australian Research Council is gratefully acknowledged by A.L.O. while T.P. thanks the Department of Mathematics at the University of Oslo and the Department of Physics at the University of Manchester, at both of which parts of this work were completed. This work was supported by EC Grant ERBCHBGCT939319 of the “Human Capital and Mobility Program” and EPSRC Grant No. GR/K79307. The authors thank the referees for their careful comments on our work.
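The coefficients $b_k$ in (4.12) are easy to generate, which may help anyone wishing to repeat the comparison with enumeration data. The sketch below is a minimal illustration in exact integer arithmetic; the function name `inverse_pochhammer_power_coeffs` is hypothetical, and the series variable $x$ stands for $q^{-1}$. The enumeration coefficients $a_{n,k}$ themselves are not reproduced here.

```python
def inverse_pochhammer_power_coeffs(s: int, order: int) -> list:
    """Coefficients b_0, ..., b_order of 1/(x; x)_infinity^s as a power series in x."""
    coeffs = [1] + [0] * order
    for _ in range(s):
        for part in range(1, order + 1):
            # Multiply the truncated series in place by 1/(1 - x^part).
            for k in range(part, order + 1):
                coeffs[k] += coeffs[k - part]
    return coeffs
```

For example, `inverse_pochhammer_power_coeffs(4, 4)` gives the first five $b_k$, and with `s = 1` the routine reduces to the ordinary partition numbers.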
References
1. Bousquet-Mélou, M. and Viennot, X.G.: Heaps of Segments and q-Enumeration of Directed Convex Vesicles. J. Comb. Theory Ser. A 60, 196–224 (1992)
2. Bousquet-Mélou, M.: A Method for the Enumeration of Various Classes of Column-Convex Polygons. Discrete Math. 154, 1–25 (1996)
3. Bousquet-Mélou, M. and Fédou, J.M.: The Generating Function of Convex Polyominoes – The Resolution of a q-Differential System. Discrete Math. 137, 53–75 (1995)
4. Brak, R. and Guttmann, A.J.: Exact Solution of the Staircase and Row–Convex Polygon Perimeter and Area Generating Function. J. Phys. A 23, 4581–4588 (1990)
5. Brak, R., Guttmann, A.J. and Enting, I.G.: Exact Solution of the Row–Convex Polygon Perimeter Generating Function. J. Phys. A 23, 2319–2326 (1990)
6. Brak, R., Owczarek, A.L. and Prellberg, T.: Exact Scaling Behavior of Partially Convex Vesicles. J. Stat. Phys. 76, 1101–1128 (1994)
7. Delest, M.: Polyominoes and Animals – Some Recent Results. J. Math. Chem. 8, 3–18 (1991)
8. Enting, I.G. and Guttmann, A.J.: On the Area of Square Lattice Polygons. J. Stat. Phys. 58, 475–484 (1990)
9. Fisher, M.E., Guttmann, A.J. and Whittington, S.: 2–Dimensional Lattice Vesicles and Polygons. J. Phys. A 24, 3095–3106 (1991)
10. Hardy, G.H.: Ramanujan: Twelve lectures on subjects suggested by his life and work. London–New York: Cambridge Univ. Press (reprinted by Chelsea, New York), 1940, pp. 113–131
11. Hiley, B.J. and Sykes, M.F.: Probability of Initial Ring Closure in the Restricted Random-Walk Model of a Macromolecule. J. Chem. Phys. 34, 1531–1537 (1961)
12. Janse van Rensburg, E.J. and Whittington, S.G.: Punctured Discs on the Square Lattice. J. Phys. A 23, 1287–1294 (1990)
13. Leibler, S., Singh, R.R.P. and Fisher, M.E.: Thermodynamic Behaviour of Two-Dimensional Vesicles. Phys. Rev. Lett. 59, 1989–1993 (1987)
14. Prellberg, T. and Brak, R.: Critical Exponents from Nonlinear Functional Equations for Partially Directed Cluster Models. J. Stat. Phys. 78, 701–730 (1995)
15. Prellberg, T. and Owczarek, A.L.: Partially Convex Lattice Vesicles: Methods and Recent Results. In: Proceedings of the Conference ‘Confronting the Infinite’, Singapore: World Scientific, 1995, pp. 204–214
16. Prellberg, T.: Uniform q-Series Expansion for Staircase Polygons. J. Phys. A 28, 1289–1304 (1995)
17. Prellberg, T. and Owczarek, A.L.: Stacking Models of Vesicles and Compact Clusters. J. Stat. Phys. 80, 755–779 (1995)
18. Whittaker, E.T. and Watson, G.N.: A Course of Modern Analysis. 4th ed., Cambridge: Cambridge University Press, 1963, pp. 462–490

Communicated by M. E. Fisher
Commun. Math. Phys. 201, 507 – 517 (1999)
Communications in
Mathematical Physics © Springer-Verlag 1999
Physical Symmetries of Quantum Histories J.D. Maitland Wright Analysis and Combinatorics Research Centre, Mathematics, University of Reading, Reading RG6 6AX, England. E-mail:
[email protected] Received: 2 August 1998/ Accepted: 8 September 1998
Abstract: Gell-Mann and Hartle have proposed a significant generalisation of quantum theory in which decoherence functionals play a key role. Physical symmetries of quantum history systems are shown, under mild physical assumptions, to correspond to isometric Jordan *-automorphisms of the underlying von Neumann algebra. Unitary representations of groups of symmetries of quadratic forms over operator algebras are discussed and related to symmetries of decoherence functionals.
Introduction In [4] it is pointed out that “Gell-Mann and Hartle have proposed a significant generalisation of quantum theory with a scheme whose basic ingredients are ‘histories’ and decoherence functionals”. Building on the formulation of Isham et al. [4–6] we shall define a quantum history system over a von Neumann algebra A to be a pair (A, D), where D is a collection of decoherence functionals for A. A function, defined on pairs of projections in A, is said to be a decoherence functional if it satisfies certain conditions stated below. Elementary decoherence functionals for A can be constructed by fixing a state φ and defining d(p, q) = φ(pq) for all projections p and q in A. Let us recall that a von Neumann algebra is of Type I2 if, and only if, it is isomorphic to the algebra of two-by-two matrices over a commutative von Neumann algebra, that is, is of the form M2 (C) ⊗ L∞ (µ). We recall that A has a Type I2 direct summand if there exists, in A, a non-zero central projection e such that eA is of Type I2 . In particular, L(H) has no direct summand of Type I2 unless the Hilbert space H is two-dimensional. We recall that Gleason’s Theorem breaks down for two-by-two matrices and hence for all von Neumann algebras with a Type I2 direct summand. Unless stated otherwise, we shall suppose from now on that A is a von Neumann algebra with no Type I2 direct summand. Also P (A) will be the set of projections in A.
In a quantum history system (A, D) the projections of A correspond to propositions about histories. (See [4–6, 11, 13–17] and the references given there for a fuller account of this and the related physical background.) We define a physical symmetry of (A, D) to be a bijection w of P(A) onto P(A) such that, whenever d : P(A) × P(A) → C is a bounded decoherence functional in D then the map on P(A) × P(A) defined by (p, q) → d(w(p), w(q)) is a decoherence functional. It is not assumed that w maps orthogonal projections to orthogonal projections, it is not assumed that w(1) = 1, and it is not assumed that (p, q) → d(w(p), w(q)) is in D. However, we shall show that, provided D contains “enough” elementary decoherence functionals, each physical symmetry w has a unique extension to an isometric Jordan *-isomorphism W of A onto A. Moreover, by a celebrated result of Kadison [9], W is the direct sum of a *-isomorphism and a *-anti-isomorphism. Hence, in particular, w does map orthogonal projections to orthogonal projections and w(1) = 1. Schreckenberg [16, 17] has given an interesting account of physical symmetries for the situation where A = L(H) and H is finite dimensional. This has been extended by Rudolph [13] to the situation where H is infinite dimensional. They postulate an apparently more restrictive notion of physical symmetry than ours but, as we shall show in this note, our notion of physical symmetry is equivalent to theirs when A is specialised to L(H). It was shown in [19] that decoherence functionals are intimately related to quadratic forms on von Neumann algebras; each bounded decoherence functional d on the projections of A corresponds to a bounded quadratic form D on A (see Sect. 1 below). The physical symmetries which leave d invariant can be identified with the group of Jordan automorphisms of A which leave D invariant.
It follows from the above, that a quantum history system may be identified with a pair (A, D# ), where D# is a set of quadratic forms on A. In [15] Rudolph and Wright show that “standard homogeneous decoherence functionals” arising from classical quantum mechanics, do not correspond to bounded decoherence functionals on L(H) (unless H is finite dimensional). However it is shown in [15] that they do have natural representations as unbounded quadratic forms on “dense” subalgebras of L(H). Fortunately, all the unbounded quadratic forms which arise in this way are “positive” and have a natural representation which is described below. Thus it seems necessary to extend the Isham formulation to include generalised decoherence functionals which may take infinite values or, equivalently, to affiliate certain unbounded (but positive) “densely defined” quadratic forms to (A, D# ) to obtain an enlarged system (A, DEXT ). Schreckenberg [16, 17] and Rudolph [13] obtain interesting representations of the symmetries of a fixed decoherence functional. By concentrating on symmetries of the associated quadratic form we are led to a somewhat different approach, which is particularly transparent for positive quadratic forms. The representations given here are as groups of unitaries on Hilbert space. The reader should note that, although all decoherence functionals satisfy a “positivity” condition this does not imply that the associated quadratic form is positive. In [23] three distinct positivity conditions for decoherence functionals are identified. Only the strongest of these conditions is equivalent to the associated quadratic form being positive. Let Q be a (possibly unbounded) positive quadratic form whose domain is a subspace of A. Then (see below) there exists a Hilbert space H and a (possibly unbounded) linear map ρ from the domain of Q to a dense subspace of H such that
Q(x, y) = ⟨ρ(x), ρ(y)⟩ for each x and each y in the domain of Q. Let σ be any Jordan automorphism of A such that Q(σ(x), σ(y)) = Q(x, y) for each x and y in the domain of Q and such that σ maps the domain of Q onto itself. Then we shall see that there exists a unique unitary Uσ on H such that ρ(σx) = Uσ ρ(x). Moreover the map σ → Uσ is a representation of the group of all symmetries of Q in the unitary group of L(H). When D is a bounded quadratic form which is not positive then the situation is more complicated. By applying the Haagerup–Pisier–Grothendieck Inequality [8, 12, 7], we can express D as the difference of two positive quadratic forms, D1 and D2. But this decomposition is not unique and a symmetry of D need not be a symmetry of D1 and D2. Let G be a subgroup of the symmetries of D. We investigate conditions where there exists a bounded operator ρ from A into a Hilbert space H, a projection P on H and a unitary representation g → Ug such that, for each x and y in A and each g in G, Ug commutes with P, Ug ρ(x) = ρ(gx) and D(x, y) = ⟨(2P − I)ρ(x), ρ(y)⟩. By applying the results of [22] we find that this is always possible if G is commutative or, more generally, amenable. When specialised to the situation considered by Schreckenberg [16, 17], i.e. when A is finite dimensional, our approach shows that we can find an appropriate ρ, P and unitary representation g → Ug, even when G is the group of all symmetries of D.

1. Decoherence Functionals and Quadratic Forms

In all that follows A shall be a von Neumann algebra and P(A) the lattice of projections in A.

Definition. A decoherence functional associated with A is a function d : P(A) × P(A) → C with the following properties:
1. Hermiticity: For each p and q in P(A), d(p, q) = d(q, p)∗. (Here ∗ denotes complex conjugation.)
2. Additivity: Whenever p1 is perpendicular to p2 and q is an arbitrary projection, d(p1 + p2, q) = d(p1, q) + d(p2, q).
3. Decoherence Positivity: d(p, p) ≥ 0 for each p in P(A).
4. Normalisation: d(1, 1) = 1.
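As a concrete illustration of the four properties, the following sketch evaluates an elementary decoherence functional d(p, q) = φ(pq) on the algebra of 3 × 3 complex matrices, which has no Type I2 direct summand. Everything specific here is an assumption chosen for the example: the state φ(x) = trace(RHO x) with a diagonal density matrix RHO, the particular projections, and the helper names.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def matadd(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def projection_onto(v):
    """Rank-one orthogonal projection onto the line spanned by the vector v."""
    norm_sq = sum(abs(x) ** 2 for x in v)
    return [[complex(x) * complex(y).conjugate() / norm_sq for y in v] for x in v]


# Hypothetical state: phi(x) = trace(RHO x) for a diagonal density matrix RHO.
RHO = [[0.5, 0, 0], [0, 0.3, 0], [0, 0, 0.2]]
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]


def d(p, q):
    """Elementary decoherence functional d(p, q) = phi(pq)."""
    return trace(matmul(RHO, matmul(p, q)))
```

With p1 and p2 the projections onto the first two coordinate axes and q a projection onto a generic complex vector, one can check numerically that d(p1 + p2, q) = d(p1, q) + d(p2, q), that d(p, q) equals the complex conjugate of d(q, p), and that d(1, 1) = 1.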
2# : Countable Additivity. A decoherence functional is said to be countably additive if, whenever {pi : i = 1, 2, . . . } is a countable collection of pairwise orthogonal projections, then, for each q in P(A), d(∑ pi, q) = ∑ d(pi, q). Here the series on the right-hand side is rearrangement invariant and hence is absolutely convergent.

2## : Complete Additivity. A decoherence functional is said to be completely additive if, whenever {pi : i ∈ I} is an infinite collection of pairwise orthogonal projections, d(∑ pi, q) = ∑ d(pi, q) for each q in P(A). Here all but countably many of the terms d(pi, q) are zero and the convergence is absolute.

A decoherence functional is said to be bounded if its range is a bounded set of complex numbers. By a specialisation of the results of [21], when H is an infinite dimensional Hilbert space every completely additive decoherence functional associated with L(H) is bounded. Let B be a complex vector space. Let S : B × B → C be a sesquilinear form on B, that is, S is linear in the first variable and skew linear in the second. Let us recall that S is Hermitian if S(x, y)∗ = S(y, x). We shall define a quadratic form on B to be a Hermitian sesquilinear form on B. Now let B be a normed space. Let ‖S‖ = sup{|S(x, y)| : ‖x‖ ≤ 1, ‖y‖ ≤ 1}; then S is said to be bounded with norm ‖S‖ precisely when ‖S‖ is finite. When B is a C∗-algebra we observe that S : B × B → C is sesquilinear if, and only if, the map (x, y) → S(x, y∗) is bilinear on B × B. When Q is a quadratic form on a von Neumann algebra A then Q is said to be normal if, for each z in A, the functional x → Q(x, z) is normal. Let us recall that a bounded decoherence functional d : P(A) × P(A) → C extends to a unique bounded quadratic form D : A × A → C, provided that A has no Type I2 direct summand [19].
In the other direction, it is easy to see that a bounded quadratic form D on any von Neumann algebra A restricts to a decoherence functional precisely when D(1, 1) = 1 and D(p, p) ≥ 0 for each projection p in A. Moreover the restriction is a completely additive decoherence functional if, and only if, the quadratic form is normal [21]. Since studying physical symmetries of a fixed decoherence functional, d, is, essentially, a specialisation of studying Jordan automorphisms of A which leave a quadratic form D invariant, we shall investigate symmetries of quadratic forms in Sect. 3. 2. Quantum History Systems A quantum history system (over a von Neumann algebra A) is defined to be a pair (A, D), where D is a set of decoherence functionals associated with a von Neumann algebra A. For each state φ on A let dφ be the decoherence functional defined by dφ (p, q) = φ(pq) for each p and each q in P (A). Let us recall that when S is a set of states of A then S is said to be separating if, whenever x is a non-zero element of A, there exists φ in S such that φ(x) is not zero.
For example, if A = L(H) and S is the set of states of the form x → ⟨xξ, ξ⟩ for each unit vector ξ in H, then S is separating. A quantum history system (A, D) is said to be full if there exists a separating set of states S such that, for each φ in S, d_φ is in D. We define a physical symmetry of a quantum history system (A, D) to be a bijection w from P(A) onto P(A) such that whenever d is in D then the function defined on P(A) × P(A) by (p, q) → d(w(p), w(q)) is a decoherence functional.

Theorem 2.1. Let A have no direct summand of Type I2. Let (A, D) be a full quantum history system. Let w be a physical symmetry of (A, D). Then there exists a unique Jordan *-isomorphism W from A onto A such that W extends w. Moreover W is an isometry and there exists a central projection e such that W, restricted to eA, is a *-isomorphism of eA onto w(e)A and W, restricted to (1 − e)A, is a *-anti-isomorphism of (1 − e)A onto (1 − w(e))A.

Proof. (1) First we show that w(1) = 1. Suppose this is false. Then there exists a state φ such that φ(1 − w(1)) is not zero and d_φ is in D. But d_φ(w(1), w(1)) = 1, by the normalisation property of decoherence functionals. So φ(w(1)) = 1 = φ(1). This contradiction shows that w(1) = 1.

(2) The next step is to show that w maps orthogonal projections to orthogonal projections. Let p and q be orthogonal projections in P(A). Let S be a separating set of states of A such that, for each φ in S, d_φ is in D. Fix φ in S. Then by the (orthogonal) additivity property of decoherence functionals we find that d_φ(w(p + q), 1) = d_φ(w(p), 1) + d_φ(w(q), 1). Thus φ(w(p + q)) = φ(w(p)) + φ(w(q)). Since S is a separating family for A it follows that w(p + q) = w(p) + w(q). Thus the projection w(p) is smaller than w(p + q). So w(p) = w(p)w(p + q) = w(p)² + w(p)w(q). Hence 0 = w(p)w(q).

(3) We now use the hypothesis that A does not have a Type I2 direct summand for the first time.
Because of this hypothesis we can apply the Generalised Gleason Theorem [1, 2, 3] to w to deduce the existence of a unique bounded linear operator W from A into A which extends w. Let us call a self-adjoint element y of A simple if it is of the form $y = \sum_j t_j p_j$, where p_1, p_2, . . . , p_n is a finite set of non-zero, pairwise orthogonal projections and t_1, t_2, . . . , t_n are real numbers. By (2), W p_1, W p_2, . . . , W p_n are pairwise orthogonal projections. So W(y²) = (W y)². Furthermore, since w is injective, none of the projections W p_j is zero. Hence ‖y‖ = ‖W(y)‖. It follows by spectral theory that for each x in A_sa there is a sequence (y_n)(n = 1, 2, . . . ) of simple elements which converges to x in norm. Hence W(x) is self-adjoint and ‖x‖ = ‖W(x)‖. Moreover, since w maps P(A) onto P(A), for each y_n there exists a simple b_n such that W b_n = y_n. Since ‖W(b_n − b_m)‖ = ‖b_n − b_m‖ and (W(b_n))(n = 1, 2, . . . ) is convergent, it follows that (b_n)(n = 1, 2, . . . ) is a Cauchy sequence. Thus W is an isometric linear bijection from A_sa onto itself. We see from the identity (x + y)² − (x − y)² = 2(xy + yx) that W is a Jordan isomorphism of A_sa onto
itself. An easy calculation then shows that W is a Jordan *-isomorphism of A onto A. It now follows from a celebrated theorem of Kadison that W is of the required form. (See p. 777 [10] and [9].) Since W is uniquely determined on P(A) it follows by spectral theory that W is unique.

Observation. Let d be a decoherence functional associated with A and let W be any Jordan *-isomorphism of A onto A. Then W maps P(A) onto P(A). Let d_W be defined on P(A) × P(A) by d_W(p, q) = d(W p, W q). Then d_W is a decoherence functional which is completely additive if, and only if, d is completely additive.

Corollary 2.2. Let (A, D) be a full quantum history system and let A have no Type I2 direct summand. Then the physical symmetries of (A, D) form a group which is isomorphic to the group of all Jordan *-isomorphisms of A onto A.

Proof. Each Jordan *-isomorphism of A onto A restricts to a physical symmetry of (A, D). By Theorem 2.1, all physical symmetries of (A, D) arise in this way.

Corollary 2.3. Let (L(H), D) be a full quantum history system and let H have dimension not equal to 2. Then the physical symmetries of (L(H), D) are precisely the set of all *-automorphisms of L(H) and the set of all *-anti-automorphisms of L(H).

Proof. This is an immediate consequence of Theorem 2.1 and the observation that the only central projections in L(H) are 0 and 1.

Remark. Every *-automorphism of L(H) is of the form z → uzu⁻¹ for some unitary u in L(H). See Theorem 9.3.4 [10]. Let α be any *-anti-automorphism of L(H). Let v be the canonical skew linear isometry from H onto its dual H#. Then z → vz*v⁻¹ is a *-anti-isomorphism of L(H) onto L(H#). So z → vα(z*)v⁻¹ is a *-isomorphism of L(H) onto L(H#). It then follows from Theorem 9.3.4 [10] that this isomorphism is implemented by a unitary w from H# onto H. Let u = v⁻¹w⁻¹. Then u is a skew linear isometry from H onto H (an anti-unitary) such that α(z) = uz*u⁻¹.
It follows from these remarks that the notion of physical symmetry presented here coincides with that of Schreckenberg [16, 17] in finite dimensions and with that of Rudolph for L(H) when H is infinite dimensional [13].

Remark. Given a quantum history system (A, D0) it can always be embedded in a full quantum history system (A, D1). To see this, let S be a separating family of normal states of A and let D1 be the union of D0 and {d_φ : φ ∈ S}. Furthermore, by replacing D1 by {d_W : d ∈ D1 and W ∈ J}, where J is the group of all Jordan *-isomorphisms of A onto A, we may enlarge the quantum history system (A, D1) to a system (A, D) which is both full and stable under the action of the group J. (It is easy to see that if each d in D0 is completely additive then this construction of (A, D) ensures that every d in D is also completely additive.) When considering a single decoherence functional d, we shall always suppose that d is embedded in a full quantum history system which is stable under J. Hence we shall define the symmetry group of d, Sym d, to be the group of all W ∈ J such that, for each p and q in P(A), d(W p, W q) = d(p, q).
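The definition of Sym d can be illustrated in a small finite dimensional case. The following is only a sketch under stated assumptions: A is taken to be the 3 × 3 complex matrices, the helper names (`matmul`, `dagger`, `conjugate_by`, `d_trace`, `d_vector`) are hypothetical, and the two states and the unitary are chosen purely for illustration. It checks that conjugation by a permutation unitary leaves d_φ invariant when φ is the normalised trace, but not when φ is the vector state x → ⟨xe1, e1⟩.

```python
# Toy model (an assumption for illustration): A = M_3(C), with projections
# written as 3x3 nested lists. Only the standard library is used.

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def conjugate_by(u, z):
    """The *-automorphism W(z) = u z u^{-1} for a unitary u."""
    return matmul(matmul(u, z), dagger(u))

def d_trace(p, q):
    """d_phi(p, q) = phi(pq) for phi the normalised trace."""
    pq = matmul(p, q)
    return sum(pq[i][i] for i in range(3)) / 3

def d_vector(p, q):
    """d_phi(p, q) = phi(pq) for the vector state phi(z) = <z e1, e1>."""
    return matmul(p, q)[0][0]

p = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]          # projection onto e1
q = [[0.5, 0.5, 0], [0.5, 0.5, 0], [0, 0, 0]]  # projection onto (e1 + e2)/sqrt(2)
u = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]          # permutation e1 -> e2 -> e3 -> e1

wp, wq = conjugate_by(u, p), conjugate_by(u, q)

# W lies in Sym d exactly when d(Wp, Wq) = d(p, q) for all projections;
# here the condition is tested on one pair only.
trace_invariant = abs(d_trace(wp, wq) - d_trace(p, q)) < 1e-12
vector_invariant = abs(d_vector(wp, wq) - d_vector(p, q)) < 1e-12
print(trace_invariant, vector_invariant)
```

A single pair (p, q) can only rule a map out of Sym d, not prove membership; for the trace, invariance under every unitary conjugation follows from tr(uzu⁻¹) = tr(z).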
3. Symmetries of Decoherence Functionals and Quadratic Forms

A Jordan automorphism of a C*-algebra B is a Jordan *-isomorphism of B onto B. For each quadratic form Q on B a symmetry of Q is a Jordan automorphism σ of B which leaves Q invariant, that is, such that Q(σ(x), σ(y)) = Q(x, y). When Q is a bounded quadratic form on an arbitrary von Neumann algebra A, we define Sym Q to be the group of all Jordan automorphisms of A which leave Q invariant.

Now suppose that A is a von Neumann algebra with no Type I2 direct summand. Let d be a bounded decoherence functional for A and let W be a Jordan automorphism of A. Let D be the unique extension of d to a bounded quadratic form on A [19]. Then D(W x, W y) is the canonical extension of d(W p, W q) to a bounded quadratic form. It is straightforward to see that Sym d can be identified with Sym D. As studying the group of physical symmetries of a fixed (bounded) decoherence functional is, essentially, a specialisation of studying Jordan automorphisms of A which leave a quadratic form D invariant, we shall concentrate in this section on symmetries of a fixed quadratic form. From now onward we drop the restriction that A has no Type I2 direct summand.

Definition. Let D be a bounded quadratic form on a von Neumann algebra A. Let G be a subgroup of Sym D. Then D is said to be G-representable if the following two conditions are satisfied.
(i) There exists a Hilbert space H, a bounded operator ρ from A into H and a projection P in L(H) such that D(x, y) = ⟨(2P − I)ρ(x), ρ(y)⟩ for each x and y in A.
(ii) For each g in G there exists a unique unitary U_g in L(H) such that, for each x in A, U_g ρ(x) = ρ(gx), and P commutes with U_g for each g in G.

Definition. Let D be a bounded normal quadratic form on a von Neumann algebra A. Let G be a subgroup of Sym D. Then D is said to be normally G-representable if D is G-representable and the form (x, y) → ⟨ρ(x), ρ(y)⟩ is normal.

Let us recall that, by the Haagerup–Pisier–Grothendieck inequality [8, 12, 7], there exists a universal constant K such that for each bounded quadratic form Q on a von Neumann algebra A there exists a state φ satisfying the inequality

$|Q(x, y)| \le K \|Q\| \, \varphi(xx^* + x^*x)^{1/2} \, \varphi(yy^* + y^*y)^{1/2}$

for each x and each y in A. Furthermore, if Q is normal then, by Proposition 2.3 [8], the state φ can be chosen to be normal. Our main objective in this section is the following theorem:

Theorem 3.1. Let D be a bounded quadratic form on a von Neumann algebra A. Let G be a subgroup of Sym D. Let ψ be a G-invariant state of A such that, for some constant M,

$|D(x, y)| \le M \, \psi(xx^* + x^*x)^{1/2} \, \psi(yy^* + y^*y)^{1/2}$

for each x and y in A. Then D is G-representable.
Before proving Theorem 3.1 we shall obtain an analogous result for positive quadratic forms. Because, as discussed earlier, unbounded positive quadratic forms arise naturally from physical considerations, we shall investigate positive quadratic forms which need not be bounded. The following straightforward algebraic lemma will simplify the discussion.

Lemma 3.2. Let X be a complex vector space. Let Q : X × X → C be a quadratic form such that Q(x, x) ≥ 0 for every x in X. Then there exists a Hilbert space H and a linear map ρ from X onto a dense subspace of H such that, for each x and y in X, Q(x, y) = ⟨ρ(x), ρ(y)⟩. Let G be the group of all linear bijections of X onto X which leave Q invariant, that is, for each g in G, Q(gx, gy) = Q(x, y) for every x and y in X. Then, for each g in G, there exists a unique unitary U_g in L(H) such that, for every x in X, U_g ρ(x) = ρ(gx). The map g → U_g is a group representation of G in the unitary group of H.

Proof. Standard arguments give that the Cauchy–Schwarz inequality holds for Q, that is, for each x and y in X, |Q(x, y)| ≤ Q(x, x)^{1/2} Q(y, y)^{1/2}. Let N = {x ∈ X : Q(x, x) = 0}. It follows from the Cauchy–Schwarz inequality that x ∈ N if, and only if, Q(x, y) = 0 for all y in X. This implies that N is a vector subspace of X. Let ρ be the canonical quotient map from X onto X/N. Again by appealing to the Cauchy–Schwarz inequality, Q(x1, y1) = Q(x2, y2) if x1 − x2 ∈ N and y1 − y2 ∈ N. So we can define Q̂ on X/N by Q̂(ρ(x), ρ(y)) = Q(x, y). Then Q̂ is an inner product on X/N. Let H be the Hilbert space completion of this inner product space. It follows immediately that ρ has dense range in H and that Q(x, y) = ⟨ρ(x), ρ(y)⟩.

Now let ρ(x) = ρ(y). Then Q(x, z) = Q(y, z) for all z in X. Let g ∈ G. Then Q(x, g⁻¹z) = Q(y, g⁻¹z) for all z in X. But since g leaves Q invariant, this gives Q(gx, z) = Q(gy, z) for all z. Thus ρ(gx) = ρ(gy). So we may define a function U_g on X/N by defining U_g ρ(x) = ρ(gx). Clearly U_g is unique. We have, for each x and z,

⟨U_g ρ(x), U_g ρ(z)⟩ = ⟨ρ(gx), ρ(gz)⟩ = Q(gx, gz) = Q(x, z) = ⟨ρ(x), ρ(z)⟩.

It follows that U_g has a unique extension to a bounded linear operator on H. We shall abuse our notation by denoting this extension by U_g. Clearly U_g is unitary. It is straightforward to verify that g → U_g is a group homomorphism.

The preceding lemma leads immediately to:
Proposition 3.3. Let Q be a (possibly unbounded) positive quadratic form on X, a subspace of A. Then there exists a (possibly unbounded) linear operator ρ which maps X onto a dense subspace of a Hilbert space H such that, for each x and y in X, Q(x, y) = ⟨ρ(x), ρ(y)⟩. Let J be the group of all Jordan automorphisms of A which restrict to bijections of X and which leave Q invariant. Then, for each g in J, there exists a unique unitary U_g in L(H) such that, for every x in X, U_g ρ(x) = ρ(gx). The map g → U_g is a group representation of J in the unitary group of H.

The following representation of unbounded quadratic forms by sums of products of functionals (compare with [20]) follows easily from the above representation.

Corollary 3.4. Let Q be a (possibly unbounded) positive quadratic form on X, a subspace of A. Then there exist (possibly unbounded) linear functionals (φ_i)(i ∈ I), each φ_i defined on X, such that $\sum_i |\varphi_i(x)|^2$ converges for each x in X and, for every x and y in X,

$Q(x, y) = \sum_i \varphi_i(x) \varphi_i(y)^*.$

Proof. Let (η_i)(i ∈ I) be an orthonormal basis of H. Let φ_i(x) = ⟨ρ(x), η_i⟩ for each x in X. Then $\sum_i \varphi_i(x) \varphi_i(y)^* = \langle \rho(x), \rho(y) \rangle = Q(x, y)$.

Corollary 3.5. Let Q be a positive and bounded quadratic form on A. Then there exists a bounded linear operator ρ which maps A onto a dense subspace of a Hilbert space H such that, for each x and y in A, Q(x, y) = ⟨ρ(x), ρ(y)⟩. Let J be the group of all Jordan automorphisms of A which leave Q invariant. Then, for each g in J, there exists a unique unitary U_g in L(H) such that, for every x in A, U_g ρ(x) = ρ(gx). The map g → U_g is a group representation of J in the unitary group of H.

Proof. For each x in A, we have ‖ρ(x)‖² = Q(x, x) ≤ ‖Q‖‖x‖². So ρ is bounded.
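The representation in Corollary 3.4 can be checked by hand in finite dimensions. The sketch below is an illustration only, under the following assumptions: X is C² rather than a subspace of a von Neumann algebra, the positive form is Q(x, y) = ⟨Gx, y⟩ for a positive definite Hermitian matrix G chosen arbitrarily, and ρ is obtained from a Cholesky factorisation (the names `cholesky`, `apply`, `inner`, `rho`, `phi` are hypothetical). Since G is positive definite, the null space N of Lemma 3.2 is trivial here.

```python
import math

def cholesky(g):
    """Lower-triangular l with g = l l^dagger, for positive definite Hermitian g."""
    n = len(g)
    l = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k].conjugate() for k in range(j))
            if i == j:
                l[i][i] = math.sqrt((g[i][i] - s).real)
            else:
                l[i][j] = (g[i][j] - s) / l[j][j]
    return l

def apply(m, x):
    """Matrix-vector product."""
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]

def inner(a, b):
    """<a, b>, linear in a and conjugate-linear in b."""
    return sum(s * t.conjugate() for s, t in zip(a, b))

G = [[2, 1j], [-1j, 1]]  # Hermitian, positive definite (illustrative choice)

def Q(x, y):
    """Positive quadratic form Q(x, y) = <G x, y>."""
    return inner(apply(G, x), y)

L = cholesky(G)
R = [[L[j][i].conjugate() for j in range(2)] for i in range(2)]  # R = L^dagger

def rho(x):
    """rho(x) = R x, so that <rho(x), rho(y)> = Q(x, y)."""
    return apply(R, x)

def phi(i, x):
    """phi_i(x) = <rho(x), eta_i> for the standard basis vector eta_i."""
    return rho(x)[i]

x = [1 + 2j, -1j]
y = [0.5, 3 - 1j]
lhs = Q(x, y)
rhs = sum(phi(i, x) * phi(i, y).conjugate() for i in range(2))
print(abs(lhs - rhs) < 1e-12)
```

Any other factorisation G = R†R would serve equally well; different choices of ρ differ by a unitary on H.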
We now give a proof of Theorem 3.1: Let F(x, y) = Mψ(xy* + y*x) for each x and y in A. Then F is a bounded positive quadratic form on A. Also, for each x in A, −F(x, x) ≤ D(x, x) ≤ F(x, x). Let D1 = (D + F)/2 and D2 = (F − D)/2. Then D1 and D2 are positive (bounded) quadratic forms such that D = D1 − D2 and D1 + D2 = F. By Corollary 3.5, there exist Hilbert spaces H1 and H2 and, for j = 1 and j = 2, there exists a bounded linear map ρ_j from A onto a dense subspace of H_j such that, for each x and y in A: D_j(x, y) = ⟨ρ_j(x), ρ_j(y)⟩_j.
Let H = H1 ⊕ H2 and let ρ : A → H be the direct sum of ρ1 and ρ2. Let P be the canonical projection from H onto H1, that is, P(ξ ⊕ ζ) = ξ. In particular, Pρ(x) = ρ1(x) for each x in A. Thus, for every x and y in A,

⟨(2P − I)ρ(x), ρ(y)⟩ = ⟨Pρ(x), Pρ(y)⟩ − ⟨(I − P)ρ(x), (I − P)ρ(y)⟩ = ⟨ρ1(x), ρ1(y)⟩ − ⟨ρ2(x), ρ2(y)⟩ = D1(x, y) − D2(x, y) = D(x, y).

Since G is a group of Jordan automorphisms of A which leaves both F and D invariant, G also leaves D1 and D2 invariant. It now follows from Corollary 3.5 that, for j = 1 and j = 2, there exists a group homomorphism g → U_g^j from G into the unitary group of H_j such that, for all x in A, U_g^j ρ_j(x) = ρ_j(gx). Let U_g = U_g^1 ⊕ U_g^2. Then g → U_g is a group representation of G into the group of all unitaries in L(H). Also

U_g ρ(x) = (U_g^1 ⊕ U_g^2)(ρ1(x) ⊕ ρ2(x)) = ρ1(gx) ⊕ ρ2(gx) = ρ(gx).

For arbitrary x and y in A,

U_g P(ρ1(x) ⊕ ρ2(y)) = U_g(ρ1(x) ⊕ 0) = U_g^1 ρ1(x) ⊕ 0 = ρ1(gx) ⊕ 0 = P(ρ1(gx) ⊕ ρ2(gy)) = P{U_g(ρ1(x) ⊕ 0) + U_g(0 ⊕ ρ2(y))} = P U_g(ρ1(x) ⊕ ρ2(y)).

Since the range of ρ1 is dense in H1 and the range of ρ2 is dense in H2 we have U_g P = P U_g. The uniqueness of U_g then follows from the uniqueness of U_g^1 and U_g^2.

Corollary 3.6. Let A = L(K), where K is finite dimensional and let D be a quadratic form on L(K). Then D is Sym D representable.

Proof. The bilinear operator (x, y) → D(x, y*) may be identified with a linear map from A into the dual of A. Since A is finite dimensional this linear map is bounded. So D is a bounded quadratic form. The existence of ψ satisfying the hypotheses of Theorem 3.1 with respect to Sym D follows from the observation that each state of L(K) is dominated by a positive multiple of the trace (see the remarks in [19]).

Corollary 3.7. Let Q be a bounded quadratic form on a von Neumann algebra A. Let G be any amenable subgroup of Sym Q. Then Q is G-representable. Furthermore, if Q is normal then Q is normally G-representable.

Proof. By [22] there exists a G-invariant state ψ on A such that

$|Q(x, y)| \le 4 \|Q\| \, \psi(xx^* + x^*x)^{1/2} \, \psi(yy^* + y^*y)^{1/2}$

for each x and each y in A. It then follows from Theorem 3.1 that Q is G-representable. Furthermore, if Q is normal then, see [22], the state ψ can be chosen to be normal. So, in the notation of the proof of Theorem 3.1, ⟨ρ(x), ρ(y)⟩ = F(x, y) = Mψ(xy* + y*x). Hence Q is normally G-representable.

Remark. Every commutative group is amenable [18]. In particular, if g is any symmetry of Q and G is the subgroup of Sym Q generated by g, then Q is G-representable.
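The decomposition D = D1 − D2 used in the proof of Theorem 3.1, and the resulting formula D(x, y) = ⟨(2P − I)ρ(x), ρ(y)⟩, can be sketched numerically. The following is a toy model rather than the construction of the paper: the algebra is replaced by C², D is an arbitrary indefinite Hermitian matrix form, and the dominating positive form is taken to be F(x, y) = M⟨x, y⟩ instead of Mψ(xy* + y*x). All function names are hypothetical.

```python
import math

def cholesky(g):
    """Lower-triangular l with g = l l^dagger, for positive definite Hermitian g."""
    n = len(g)
    l = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k].conjugate() for k in range(j))
            if i == j:
                l[i][i] = math.sqrt((g[i][i] - s).real)
            else:
                l[i][j] = (g[i][j] - s) / l[j][j]
    return l

def adjoint(m):
    """Conjugate transpose."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def apply(m, x):
    """Matrix-vector product."""
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]

def inner(a, b):
    """<a, b>, linear in a and conjugate-linear in b."""
    return sum(s * t.conjugate() for s, t in zip(a, b))

n = 2
D = [[1, 2], [2, -1]]  # indefinite Hermitian form D(x, y) = <D x, y> (assumed example)
M = 3                  # assumed bound: eigenvalues of D are +-sqrt(5), so |D(x,x)| <= M<x,x>

F = [[M if i == j else 0 for j in range(n)] for i in range(n)]
D1 = [[(F[i][j] + D[i][j]) / 2 for j in range(n)] for i in range(n)]  # (D + F)/2
D2 = [[(F[i][j] - D[i][j]) / 2 for j in range(n)] for i in range(n)]  # (F - D)/2

R1 = adjoint(cholesky(D1))  # rho_1(x) = R1 x represents D1
R2 = adjoint(cholesky(D2))  # rho_2(x) = R2 x represents D2

def rho(x):
    """rho(x) = rho_1(x) (+) rho_2(x) in H = H1 (+) H2 = C^4."""
    return apply(R1, x) + apply(R2, x)

def two_p_minus_i(v):
    """(2P - I) v, where P projects onto the first summand H1."""
    return v[:n] + [-t for t in v[n:]]

def form_d(x, y):
    return inner(apply(D, x), y)

def represented(x, y):
    return inner(two_p_minus_i(rho(x)), rho(y))

x = [1 + 1j, 2]
y = [-1j, 0.5]
print(abs(form_d(x, y) - represented(x, y)) < 1e-12)
```

The operator 2P − I acts as +1 on H1 and −1 on H2, which is how the sign in D1 − D2 is recovered from a single Hilbert space inner product.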
References
1. Bunce, L.J. and Wright, J.D.M.: The Mackey–Gleason Problem. Bull. Am. Math. Soc. 26, 288–293 (1992)
2. Bunce, L.J. and Wright, J.D.M.: The Mackey–Gleason Problem for vector measures on projections in von Neumann algebras. J. London Math. Soc. 49, 133–149 (1994)
3. Bunce, L.J. and Wright, J.D.M.: Complex measures on projections in von Neumann algebras. J. London Math. Soc. 46, 269–279 (1992)
4. Isham, C.J., Linden, N., and Schreckenberg, S.: The classification of decoherence functionals: An analogue of Gleason's theorem. J. Math. Phys. 35, 6360–6370 (1994)
5. Isham, C.J. and Linden, N.: Quantum temporal logic and decoherence functionals in the histories approach to generalised quantum theory. J. Math. Phys. 35, 5452–5476 (1994)
6. Isham, C.J.: Quantum logic and the histories approach to quantum theory. J. Math. Phys. 35, 2157–2185 (1994)
7. Grothendieck, A.: Résumé de la théorie métrique des produits tensoriels topologiques. Bol. Soc. Mat. São Paulo 8, 1–79 (1956)
8. Haagerup, U.: The Grothendieck Inequality for bilinear forms on C*-algebras. Adv. in Math. 56, 93–116 (1985)
9. Kadison, R.V.: Isometries of operator algebras. Ann. of Math. 54, 325–338 (1951)
10. Kadison, R.V. and Ringrose, J.R.: Fundamentals of the theory of operator algebras (Vol. II). London–New York: Academic Press, 1986
11. Omnès, R.: The interpretation of quantum mechanics. Princeton, NJ: Princeton University Press, 1994
12. Pisier, G.: Grothendieck's theorem for non-commutative C*-algebras with an appendix on Grothendieck's constant. J. Funct. Anal. 29, 397–415 (1978)
13. Rudolph, O.: Symmetries of history quantum theories and decoherence functionals. J. Math. Phys. (to appear)
14. Rudolph, O. and Wright, J.D.M.: On tracial operator representations of quantum decoherence functionals. J. Math. Phys. 38, 5643–5652 (1997)
15. Rudolph, O. and Wright, J.D.M.: Homogeneous decoherence functionals in standard and history quantum mechanics. Commun. Math. Phys. (to appear)
16. Schreckenberg, S.: Symmetry and history quantum theory: an analogue of Wigner's Theorem. J. Math. Phys. 37, 6086–6105 (1996)
17. Schreckenberg, S.: Symmetries of decoherence functionals. J. Math. Phys. 38, 759–769 (1997)
18. Wagon, S.: The Banach–Tarski Paradox. Cambridge: Cambridge University Press, 1985
19. Wright, J.D.M.: The structure of decoherence functionals for von Neumann quantum histories. J. Math. Phys. 36, 5409–5413 (1995)
20. Wright, J.D.M.: Linear representations of bilinear forms on operator algebras. Expos. Math. 16, 75–84 (1998)
21. Wright, J.D.M.: Decoherence functionals for von Neumann quantum histories: Boundedness and countable additivity. Commun. Math. Phys. 191, 493–500 (1998)
22. Wright, J.D.M.: An invariant Haagerup–Pisier–Grothendieck Inequality. Expos. Math. (to appear)
23. Wright, J.D.M.: Quantum decoherence functionals and positivity. Atti Sem. Mat. Fis. Univ. Modena. (to appear)
24. Ylinen, K.: The structure of bounded bilinear forms on products of C*-algebras. Proc. Am. Math. Soc. 102, 599–601 (1988)

Communicated by H. Araki
Commun. Math. Phys. 201, 519 – 548 (1999)
Communications in
Mathematical Physics © Springer-Verlag 1999
p Dissipative Operator

Lixin Tian1, Zengrong Liu2

1 Department of Mathematics and Physics, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu, 212013, P.R. China. E-mail: [email protected]
2 Department of Mathematics, Shanghai University, Shanghai, 201800, and LNM, Institute of Mechanics, Academia Sinica, 100080, P.R. China
Received: 10 September 1997 / Accepted: 8 September 1998
Abstract: In this paper the authors prove that the generalized positive p selfadjoint (GPpS) operators in Banach space satisfy the generalized Schwarz inequality, solve the maximal dissipative extension representation of p dissipative operators in Banach space by using the inequality and introducing the generalized indefinite inner product (GIIP) space, and apply the result to a certain type of Schrödinger operator.
1. Introduction

This research studies a class of p dissipative operators in Banach space by means of the generalized semi-inner product (GSIP) space and the generalized indefinite inner product (GIIP) space. The maximal dissipative extension representation of such operators is of great importance for infinite dimensional dynamical systems in Banach space. Moreover, this paper gives some applications to quantum mechanics and the Schrödinger operator. Research on the Schrödinger operator is currently one of the key problems in the study of soliton waves and quantum mechanics. On the basis of the above discussion the authors make further research on the behavior of the nonlinear Schrödinger equation and the scattering of the corresponding particle collision.

The dissipative operator in Hilbert space comes from the Cauchy problem for hyperbolic partial differential equations with the L² measure. Maximal dissipative operators occur in many applications; for instance, they are the infinitesimal generators of strongly continuous semigroups of contraction operators. Now, with the intensive research on infinite dimensional dynamical systems and such problems as soliton waves and the scattering of particle collisions in quantum mechanics, great attention has been focused on the initial value problems of partial differential equations with the measure of a Banach space. For example, the natural measure of the heat equation is the supremum of the temperature; the measure of the diffusion equation is in L¹; the measure for dealing with the scattering of a particle collision is in L^p (p = 2 or p ≠ 2) (see [7, 12, 13, 16, 23, 29, 30]). It is well
known that the dissipative operator in Hilbert space is one with wider applications (see [12, 16, 19, 22, 29, 30, 42, 43, 44, 50]). In particular, in [12] R. S. Phillips and G. Lumer studied the operator L0 = A − S in Hilbert space, where A is a skew-symmetric operator and S is a positive operator. There is a one to one correspondence between the maximal dissipative extensions of L0 and the maximal negative subspaces of an indefinite inner product space. In this paper we extend the result to Banach space by using the GSIP space and introducing the GIIP space. Because a Banach space does not have the bilinear structure of an inner product, it is difficult to study operators in Banach space.

The research on GSIP and operator theory in GSIP space originated in Lumer's semi-inner product (SIP) in Banach space (see [31]) and Nath's GSIP space in Banach space. Many researchers (see [1, 4, 9, 31–35, 37–40, 51]) studied the geometric properties of GSIP space or SIP space, which include orthogonal projection, isometry and the Riesz Representation Theorem, etc. Researchers (see [2, 3, 4, 8, 13, 15, 24, 27, 28, 32, 33, 38, 39, 41]) also studied the operator theory in SIP or GSIP space: the adjoint operator, adjoint Abelian operator, generalized p selfadjoint operator, generalized p normal operator, etc. They also established the function models of adjoint Abelian operators in Lp () and C(K) (see [15]). These studies added to the theory of the SIP and GSIP spaces. G. Lumer and R. S. Phillips [13] studied the properties of the dissipative operator in Banach space, and paper [10] obtained the properties of the J dissipative operator in indefinite inner product space. Based on these results, the authors of the papers [7, 8, 25] obtained some important results by dealing with the Schrödinger operator on the Banach space L^p[0, 2π]. It is important to know whether the results obtained in [7] can be extended to general Banach space. In this paper we carry out this extension to general Banach space. The most difficult point is to determine whether the generalized Schwarz inequality holds for the generalized positive p selfadjoint operator (GPpS operator).

The research on indefinite inner product space comes from quantum field theory. By now the operator theory in this space is fruitful; see, for example, [9, 10, 11, 14, 42, 43, 44]. It is of real importance in physics to solve the difficulty of divergence by establishing scattering theory with an indefinite inner product space. However, for a general Banach space which cannot be changed into a Hilbert space we may not use the indefinite inner product to deal with the scattering of the particle collision with the measure L^p (p ≠ 2), as there is a bilinear Hermite function in the indefinite inner product space. Therefore, we introduce the GIIP space into the study of the p dissipative operator in Banach space. This extension is of particular significance not only in mathematics but also in a real physical system, as can be seen when the maximal dissipative extension of a class of Schrödinger operators is dealt with in this paper.

The research on linear operators with chaotic character and on the dynamic behavior of linear operators in infinite dimensional Banach space gives the foundation for looking into the complexity and dynamic behavior of infinite dimensional dynamical systems with the metric of the Banach space (see [25, 26, 46, 47, 48]). Based on it we develop the maximal dissipative extension of the p dissipative operator in Banach space and we apply the result to a certain kind of Schrödinger operator.

The paper includes six sections. In Sect. 1 we give the introduction. In the next section we set up the generalized Schwarz inequality of the generalized positive p selfadjoint operator in GSIP space. In Sect. 3 the generalized indefinite inner product space and generalized Krein space are introduced and some properties are obtained. In Sect. 4 we construct the natural boundary space of the p dissipative operator in Banach space. In Sect. 5 we give the maximal dissipative extension representation of a sort of p dissipative
operator in Banach space by the natural boundary space. Finally a kind of Schrödinger operator is studied by using the results obtained in Sects. 4 and 5 and we solve the maximal dissipative extension representation of the operator. The main results in the paper are described as follows.

Theorem 2.1. Let X be a GSIP space and T ∈ L(X). If T is a generalized positive p selfadjoint operator in X, then the generalized Schwarz inequality holds for T.

Theorem 5.1. Let L0 = A − S, where A is p skewsymmetric, Re [Au, u]_p = 0 and S is a reversible generalized positive p selfadjoint operator. Suppose that the maximal dissipative extension of L0 is L. Then there is a one to one correspondence between the maximal dissipative extension L of L0 and the maximal negative subspace Ñ of the GIIP space H̃, and

$Lu = L_1 u + S^{1/2} \varphi(\hat{u}), \quad L_1 = A^* - S,$
$D(L) = \{u \in D(L_1) \mid \hat{u} \in \hat{N},\ \hat{N} \text{ is the projection of } \tilde{N} \text{ from } \tilde{H} \text{ to } \hat{H}\}.$

Theorem 6.3. In X = L^{p′}[0, 2π], 1 < p′ < ∞, suppose the Schrödinger operator L0 = if″ − f, D(L0) = {f | f, f″ ∈ X, f(0) = f(2π), f′(0) = f′(2π)}. If the maximal dissipative extension of L0 is L, then there is a one to one correspondence between the operator L and the maximal negative subspace Ñ of H̃ and

Lu = iu″ − u + u′(0) u(0) f,
D(L) = {u | u, u′, u″ ∈ X, αu′(2π) u(2π) + βu′(0) u(0) = 0, |β| ≤ |α|},

where f ∈ X satisfy the inequality p −2 β/α + 1 u′(0) u(0) + |u′(0) u(0)|^p ‖f‖ ≤ 0.

2. GSIP Space

In order to carry over Hilbert space arguments to the theory of Banach space, Lumer [31] introduced the concept of SIP space, which has a more general axiom system than that of Hilbert space. Furthermore, Nath [1] introduced the GSIP space. From [1], a complex Banach space X is called a complex generalized semi-inner product (GSIP) space if corresponding to an arbitrary pair of elements x, y ∈ X, there exists a complex number [x, y]_p which satisfies the following properties for any x, y, z ∈ X and α, β ∈ C (C denotes the complex field):
(1) [αx + βy, z]_p = α[x, z]_p + β[y, z]_p,
(2) [x, x]_p > 0 for x ≠ 0; x = 0 iff [x, x]_p = 0,
(3) |[x, y]_p| ≤ [x, x]_p^{1/p} [y, y]_p^{1/q}, 1 < p, q < +∞, 1/p + 1/q = 1.
A GSIP [x, y]_p, 1 < p < +∞, generates the norm ‖·‖ given by ‖x‖ = [x, x]_p^{1/p} for x ∈ X. If p = 2, the GSIP space is the SIP space, and we denote the SIP by [·, ·]. From [2, 3], if X is a complex Banach space with norm ‖·‖, then for each p ∈ (1, +∞) there exists a GSIP [x, y]_p which generates the norm ‖·‖, and in this case we have
$[x, \lambda y]_p = |\lambda|^{p-2}\, \bar{\lambda}\, [x, y]_p$, for any x, y ∈ X, λ ∈ C,
$[tx, y]_p = \bigl[x, |t|^{(2-p)/(p-1)}\, \bar{t}\, y\bigr]_p$, for t ∈ C.

Moreover if p ≠ p′, p, p′ ∈ (1, +∞) and [·, ·]_p, [·, ·]_{p′} are respectively the corresponding GSIPs which generate the norm ‖·‖ in the Banach space X, then for all x, y ∈ X, y ≠ 0:

$[x, y]_p = \|y\|^{p-p'}\, [x, y]_{p'}.$

Suppose p ∈ (1, +∞), T ∈ L(X) (L(X) denotes all bounded linear operators). Papers [2, 3] proved that if X is a smooth strictly convex and reflexive Banach space, then there is a unique GSIP [·, ·]_p which generates the norm and for each f ∈ X* there is a unique y ∈ X such that f(x) = [x, y]_p for all x ∈ X, and in this case we have ‖f‖ = ‖y‖^{p−1}. From [3] we have that for each f ∈ X* and p′ ∈ (1, +∞) there is a unique y ∈ X such that f(x) = [x, y]_{p′} for all x ∈ X, where the GSIP [·, ·]_{p′} generates the norm. Throughout the paper, we shall always assume that X is a Banach space which is smooth strictly convex and reflexive.

Definition 2.1 (see [2, 3]). Suppose p ∈ (1, +∞), T ∈ L(X), and y ∈ X. By [T x, y]_p = [x, y*]_p we obtain T_p* satisfying T_p* y = y*; this defines a mapping of X into X, and T_p* is called a generalized p adjoint operator. If T_p* = T, T is called a generalized p selfadjoint operator. If p = 2, the generalized 2 selfadjoint operator is also called a generalized selfadjoint operator. If T satisfies [T x, x]_p ≥ 0, ∀x ∈ X, T is called a generalized positive operator. If [T x, x]_p is real, T is called a generalized Hermite operator. Of course the generalized adjoint operator and the generalized p selfadjoint operator depend on p. Generalized positive operators under the GSIP [·, ·]_{p′}, 1 < p′ < ∞, are generalized positive operators under the GSIP [·, ·]_p, p ≠ p′, 1 < p < ∞. If T ∈ L(X) is both a generalized p selfadjoint operator and a generalized positive operator, then we call it a generalized positive p selfadjoint operator (GPpS operator).

Example 2.1.
There exists an operator which is a generalized positive p selfadjoint (GPpS) operator in Banach space, but is neither a generalized selfadjoint operator in SIP space nor a selfadjoint operator in Hilbert space. Suppose X = l^p, 1 < p < ∞, p ≠ 2. Define the unique GSIP [·, ·]_p in X by

$[x, y]_p = \sum_{i=1}^{\infty} x_i\, |y_i|^{p-2}\, \bar{y}_i$, where x = {x_i}, y = {y_i} ∈ l^p.

Define the operator T : l^p → l^p such that T{x_i} = {x_i′}, where x_1′ = x_1, x_i′ = 0, i ≠ 1. Then [T x, x]_p = |x_1|^p ≥ 0, so T is a generalized positive operator. Since

$[Tx, y]_p = x_1\, \bar{y}_1\, |y_1|^{p-2} = [x, Ty]_p,$

we also have that T is a generalized p selfadjoint operator. Hence T is a GPpS operator in X. Notice that the unique SIP [·, ·] in X is

$[x, y] = \|y\|^{2-p} \sum_{i=1}^{\infty} x_i\, |y_i|^{p-2}\, \bar{y}_i$, where x = {x_i}, y = {y_i} ∈ X.
Then

$[Tx, y] = x_1\, \bar{y}_1\, |y_1|^{p-2} / \|y\|^{p-2}, \quad [x, Ty] = x_1\, \bar{y}_1\, |y_1|^{p-2} / \|Ty\|^{p-2},$

where x = {x_i}, y = {y_i} ∈ X. Hence T is not a generalized selfadjoint operator in SIP space. Since l^p, p ≠ 2, is not a Hilbert space, T also is not a selfadjoint operator in Hilbert space.

Example 2.2. There exists an operator T in Banach space such that T is a generalized p selfadjoint operator but is neither a generalized Hermite operator nor a generalized positive operator. Hence the generalized p selfadjoint operator in Banach space differs from the selfadjoint operator in Hilbert space. Let X = Y ⊕ Y, where ⊕ is the l³-sum and Y is a two dimensional Hilbert space with inner product (·, ·). Define the GSIP, for 1 < p < ∞, by

$[\langle y_1, y_2\rangle, \langle y_1', y_2'\rangle]_p = \|\langle y_1', y_2'\rangle\|^{p-3} \{ (y_1, y_1')\, \|y_1'\| + (y_2, y_2')\, \|y_2'\| \}.$
0 0 Define the operator T = i 0
0 −i 0 0 0 i in X. We easily prove that 0 0 0 −i 0 0
[T x, y]p = [x, T y]p , x, y ∈ X. Hence T is a generalized p selfadjoint operator. But, for x = hy1 , y2 i ∈ X, y1 = x11 , x12 , y2 = x21 , x22 ∈ Y, has p−3
[T x, x]p = khy1 , y2 ik
{(−ix21 x11 + ix22 x12 ) ky1 k
+ (−ix11 x21 + ix12 x22 ) ky2 k}. This is a complex number. So T is not a generalized Hermite operator and generalized positive operator. Example 2.3. There exists an operator T such that T is a generalized Hermitz operator but isn’t a generalized p selfadjoint operator in Banach space. 0 Let X = lp , 1 < p0 0. Choose α such that 1/2 ≤ kKk ≤ 1. From (1), there Step 2 1/(p−1) T y p, is the generalized positive operator A, K = A . By using [αT x, y]p = x, α we have αT = K = A2 , (αT )∗ = K ∗ = α1/(p−1) T.
(2.2)
Then $(A^*)^2 = K^* = \alpha^{(2-p)/(p-1)} A^2$, or $A^2 = \alpha^{(p-2)/(p-1)} (A^*)^2$. Analogously to the discussion in Steps (1) and (2), when $N(T) = \{0\}$ we have
$$\alpha^{(p-2)/(2(p-1))} A^{-1} A^* = E_1 - E_2.$$
As in the discussion in Step (1), we have E2 = 0. Then α(p−2)/2(p−1) A∗ = A. Thus [T x, y]p = α−1 [αT x, y]p = α−1 [Kx, y]p = α−1 A2 x, y p = α−1 Ax, A∗ y p h i −1 (p−2)/2(p−1) Ay = α−1 (α(p−2)/2(p−1) )p−2 Ax, A∗ y p = α Ax, α p 1/p 1/q ≤ α−1 (α(p−2)/2 )1/p (α(p−2)/2 )1/q [Ax, Ax]p [Ay, Ay]p 1/p 1/q = α−1 Ax, A∗ x p Ay, A∗ y p 1/p 1/q = α−1 A2 x, x p A2 y, y p 1/p 1/q 1/p 1/q = α−1 [Kx, x]p [Ky, y]p = α−1 [αT x, x]p [αT y, y]p 1/p 1/q = [T x, x]p [T y, y]p . When N (T ) 6 = {0}, the same result can be obtained by the same method as in Step (2). Hence the generalized Schwarz inequality of T is satisfied. (4) Step four. Let kT k ≥ 1. We suppose K = αT , α > 0, choosing α such that 0 < kKk < 1. By using the following formula: (I − αT )∗ (αT ) = (αT )(I − αT )∗ and similarly to Steps (1), (2) and (3) we easily prove that T satisfies the generalized Schwarz inequality. From Step (1)–(4), we prove that the GPpS operator satisfies the generalized Schwarz inequality. Corollary 2.1. When T is a GPpS operator and T ∗ = αT , α > 0, then the generalized Schwarz inequality of T is satisfied. Theorem 2.2. When T is a GPpS operator, then there exists a GPpS operator P such that T = P 2 and P is unique. P is called a positive square root of T . Proof. When 1/2 ≤ kT k ≤ 1, by the result of Steps (1), (2) in Theorem 2.1, there exists a GPpS operator such that P 2 = T . When 0 < kT k < 1/2, suppose K = (2 kT k)−1 T , then kKk = 1/2. By Steps (1), 2 (2) in Theorem 2.1, there exists the generalized positive operator p A, such that K = A , but A is not a generalized p selfadjoint operator. Suppose P = 2 kT kA. Because hp i hp i h i 2 kT kAx, y = 2 kT kx, A∗ y = x, (2 kT k)1/2(p−1) A∗ y [P x, y]p = p p p h i h p i 1/2(p−1) (p−2)/2(p−1) = x, (2 kT k) (2 kT k) Ay = x, 2 kT kAy p
= [x, P y]p . Thus P is a GPpS operator and T = P 2 . When kT k > 1, the same result for P is obtained by similar reasoning. Now we prove P is unique.
p
If $P$, $Q$ are GPpS operators and $T = P^2 = Q^2$, we prove $P = Q$. By the above discussion of this theorem, for the GPpS operators $P$, $Q$ there exist GPpS operators $P'$ and $Q'$ such that $P'^2 = P$, $Q'^2 = Q$. Let $y = (P - Q)x$; then
$$\|P'y\|^p + \|Q'y\|^p = [P'y, P'y]_p + [Q'y, Q'y]_p = [P'^2 y, y]_p + [Q'^2 y, y]_p = [Py, y]_p + [Qy, y]_p$$
$$= [(P + Q)y, y]_p = [(P + Q)(P - Q)x, y]_p = [(P^2 + QP - PQ - Q^2)x, y]_p.$$
From the construction of $P$ and $Q$, $PT = TP = P^3$, $QT = TQ = Q^3$, and it easily follows that $QP = PQ$. Thus
$$\|P'y\|^p + \|Q'y\|^p = [(P^2 - Q^2)x, y]_p = 0,$$
and $P'y = Q'y = 0$; $Py = Qy = 0$. Then
$$\|(P - Q)^2 x\|^p = [(P - Q)^2 x, (P - Q)^2 x]_p = [(P - Q)y, (P - Q)^2 x]_p = [Py - Qy, (P - Q)^2 x]_p = 0, \quad \forall x \in X,$$
so $(P - Q)^2 x = 0$ for all $x \in X$, or $(P - Q)^2 = 0$. Hence $T - PQ = 0$, $P(P - Q) = 0$, $Q(P - Q) = 0$. Because
$$\|(P - Q)x\|^p = [(P - Q)x, (P - Q)x]_p = [x, P(P - Q)x]_p - [x, Q(P - Q)x]_p = 0,$$
we get $(P - Q)x = 0$ for all $x \in X$. Therefore $P = Q$. The theorem is proved.
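For orientation, in the special case where $X$ is a Hilbert space and $p = 2$, the GSIP is the inner product and Theorem 2.2 reduces to the classical positive square root of a positive semidefinite selfadjoint operator. The sketch below covers only this classical case for real symmetric $2 \times 2$ matrices; it does not implement the Banach space construction of the proof. The closed form $\sqrt{T} = (T + sI)/t$ with $s = \sqrt{\det T}$, $t = \sqrt{\operatorname{tr} T + 2s}$ follows from the Cayley–Hamilton theorem, and the function names are hypothetical.

```python
import math

def sqrt_psd_2x2(a, b, d):
    """Positive square root of the real symmetric PSD matrix [[a, b], [b, d]].

    Returns the entries (a', b', d') of the root [[a', b'], [b', d']].
    """
    s = math.sqrt(max(a * d - b * b, 0.0))   # s = sqrt(det T) = det of the root
    t = math.sqrt(a + d + 2.0 * s)           # t = trace of the root
    if t == 0.0:                             # only happens for the zero matrix
        return (0.0, 0.0, 0.0)
    return ((a + s) / t, b / t, (d + s) / t)

def square_sym_2x2(a, b, d):
    """Entries of the square of the symmetric matrix [[a, b], [b, d]]."""
    return (a * a + b * b, a * b + b * d, b * b + d * d)

T_entries = (2.0, 1.0, 2.0)                  # eigenvalues 1 and 3
P_entries = sqrt_psd_2x2(*T_entries)
P_squared = square_sym_2x2(*P_entries)
print(P_entries, P_squared)
```

The root returned has nonnegative trace and determinant, so it is again positive semidefinite, in line with the uniqueness statement of the theorem in this special case.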
Corollary 2.2. If $T$ is a generalized positive operator and $T^* = \alpha T$, $\alpha > 0$, then there exists a unique generalized positive operator $P$ such that $T = P^2$ and $P^* = \alpha^{1/2} P$.

Definition 2.3. If $U, U^* \in L(X)$ and $UU^* = U^*U = I$, $U$ is called a generalized $p$ unitary operator (see [3]).

It is easy to prove that a generalized $p$ unitary operator is an isometric operator. From Theorems 2.1 and 2.2, and analogously to the Hilbert space results, the following theorem can be obtained.

Theorem 2.3. (1) If $(T^*)^* = T$, $T \in L(X)$, there exist $U, P \in L(X)$ such that $T = UP$, where $U$ is a generalized $p$ unitary operator and $P$ is a GPpS operator. (2) If $(T^*)^* = M_p T$, $M_p > 0$, $T \in L(X)$ (for the definition of the operator see [3]), there exist $U, P \in L(X)$ with $T = UP$, $U^*U = UU^* = M_p^{-(p-1)/2} I$ and $P^* = M_p^{1/2} P$ on $PX$. Then $T = UP$ is called a polar decomposition of $T$.

3. Generalized Indefinite Inner Product (GIIP) Space

In order to investigate the $p$ dissipative operator in a Banach space, we set up the GIIP space in this section. Using the new space we can give the maximal dissipative extension representation of the $p$ dissipative operator by means of the negative subspaces of that space. The GIIP space comes from the indefinite inner product space but differs from it. Many results on indefinite inner product spaces have been published; see, for example, [9, 10, 11, 14]. These spaces have attracted great attention because of their important applications in quantum field theory [9], scattering theory [10] and control theory [14]. Here we set up the GIIP space by means of the GSIP.
Definition 3.1. Let $R$ be a complex (or real) linear space and, for $y, z \in R$, define a complex (or real) number $\langle y, z\rangle$ such that: (1) $\langle x + y, z\rangle = \langle x, z\rangle + \langle y, z\rangle$, $\langle \lambda x, y\rangle = \lambda \langle x, y\rangle$; (2) if $\langle x, y\rangle = 0$ for arbitrary $y$, then $x = 0$. The space $(R, \langle\cdot,\cdot\rangle)$ satisfying (1), (2) is called a generalized indefinite inner product (GIIP) space, and $\langle\cdot,\cdot\rangle$ is called the generalized indefinite inner product (GIIP) in $R$.

Definition 3.2. Let $(R, \langle\cdot,\cdot\rangle)$ be a GIIP space. Suppose it includes two subspaces $H_+$ and $H_-$ with the properties below: (1) $R = H_+ \oplus H_-$, where $\oplus$ is an orthogonal direct sum for $\langle\cdot,\cdot\rangle$, that is, $\langle x, y\rangle = 0$ for arbitrary $x \in H_+$, $y \in H_-$, and $H_+$, $H_-$ are linear subspaces with
$$H_+ = \{x \in R \mid \langle x, x\rangle \geq 0,\ x \neq 0\}; \qquad H_- = \{x \in R \mid \langle x, x\rangle \leq 0,\ x \neq 0\}.$$
(2) For $1 < p < \infty$, the spaces $(H_+, \langle\cdot,\cdot\rangle)$ and $(H_-, \langle\cdot,\cdot\rangle)$ are GSIP spaces. Then we call $(R, \langle\cdot,\cdot\rangle)$ a generalized Krein space. Here $\langle\cdot,\cdot\rangle$ is a GIIP. If $(H_+, \langle\cdot,\cdot\rangle)$, $(H_-, -\langle\cdot,\cdot\rangle)$ become Banach spaces and $\|\cdot\|$ is the norm of the spaces $H_+$, $H_-$, we call the space $(R, \langle\cdot,\cdot\rangle)$ a complete generalized Krein space, and $H_+$, $H_-$ are called a regular decomposition of $(R, \langle\cdot,\cdot\rangle)$. Without loss of generality, we suppose $H_\pm \neq \{0\}$.

Theorem 3.1. In the generalized Krein space $(R, \langle\cdot,\cdot\rangle)$, for $x, y \in R$, $x = x_+ + x_-$, $y = y_+ + y_-$ with $x_+, y_+ \in H_+$, $x_-, y_- \in H_-$, denote $[x, y]_p = \langle x_+, y_+\rangle - \langle x_-, y_-\rangle$, for $1 < p < +\infty$. Then $(R, [\cdot,\cdot]_p)$ is a GSIP space.

Proof. We only need to verify that $[\cdot,\cdot]_p$ satisfies the following inequality:
$$[x, y]_p \leq [x, x]_p^{1/p}\, [y, y]_p^{1/q}, \quad 1/p + 1/q = 1, \qquad (3.1)$$
or
$$|\langle x_+, y_+\rangle - \langle x_-, y_-\rangle|^p \leq |\langle x_+, x_+\rangle - \langle x_-, x_-\rangle| \cdot |\langle y_+, y_+\rangle - \langle y_-, y_-\rangle|^{p-1} = \{\|x_+\|^p + \|x_-\|^p\} \cdot \{\|y_+\|^p + \|y_-\|^p\}^{p-1}.$$
Noticing the basic Young's inequality, for $p > 1$, $a > 0$, $b > 0$, we have
$$ab \leq \frac{1}{p} a^p + \frac{1}{q} b^q, \quad 1/p + 1/q = 1. \qquad (3.2)$$
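Let $x_+ \neq 0$, $y_+ \neq 0$ (otherwise formula (3.1) can easily be proved). Thus
$$|\langle x_+, y_+\rangle - \langle x_-, y_-\rangle|^p \leq (|\langle x_+, y_+\rangle| + |\langle x_-, y_-\rangle|)^p \leq (\|x_+\|\,\|y_+\|^{p-1} + \|x_-\|\,\|y_-\|^{p-1})^p$$
$$= \|x_+\|^p\, \|y_+\|^{p(p-1)} \left(1 + \frac{\|x_-\|}{\|x_+\|} \left(\frac{\|y_-\|}{\|y_+\|}\right)^{p-1}\right)^p. \qquad (3.3)$$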
Let $k = \|x_-\| / \|x_+\|$, $m = \|y_-\| / \|y_+\|$. We might as well assume $k \geq 1$; otherwise we only let $k = \|x_+\| / \|x_-\|$. Then the right side of (3.3) equals $\|x_+\|^p\, \|y_+\|^{p(p-1)} (1 + k m^{p-1})^p$. Thus
$$[x, x]_p^{1/p}\, [y, y]_p^{1/q} = (\|x_+\|^p + \|x_-\|^p)^{1/p} (\|y_+\|^p + \|y_-\|^p)^{1/q}$$
$$= \|x_+\| \left(1 + \left(\frac{\|x_-\|}{\|x_+\|}\right)^p\right)^{1/p} \|y_+\|^{p-1} \left(1 + \left(\frac{\|y_-\|}{\|y_+\|}\right)^p\right)^{1/q} = \|x_+\|\, \|y_+\|^{p-1} (1 + k^p)^{1/p} (1 + m^p)^{1/q}.$$
Therefore, proving formula (3.1) is equivalent to proving the following:
$$(1 + k m^{p-1})^p \leq (1 + k^p)(1 + m^p)^{p-1}.$$
Considering the inequality (3.2), we have $a b^{p-1} \leq a^p / p + b^p (p-1)/p$, $a > 0$, $b > 0$. Thus
$$k^n m^{n(p-1)} \leq \frac{1}{p} k^{np} + \frac{p-1}{p} m^{np}, \quad 1 < p < +\infty,$$
$$p\, \frac{k^n m^{n(p-1)}}{n} \leq \frac{k^{np}}{n} + (p-1) \frac{m^{np}}{n}, \quad n = 1, 2, 3, \cdots.$$
By summing the above formula, we can obtain
$$p \sum_{n=1}^{\infty} \frac{k^n m^{n(p-1)}}{n} \leq \sum_{n=1}^{\infty} \frac{k^{np}}{n} + (p-1) \sum_{n=1}^{\infty} \frac{m^{np}}{n},$$
or
$$p \ln(1 + k m^{p-1}) \leq \ln(1 + k^p) + (p-1) \ln(1 + m^p),$$
$$(1 + k m^{p-1})^p \leq (1 + k^p)(1 + m^p)^{p-1}.$$
This theorem has been proved.
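The key scalar inequality $(1 + k m^{p-1})^p \leq (1 + k^p)(1 + m^p)^{p-1}$ is also an instance of Hölder's inequality applied to the two-term sequences $(1, k)$ and $(1, m^{p-1})$ with exponents $p$ and $q = p/(p-1)$, since $(m^{p-1})^q = m^p$. The following is a minimal numerical sketch, assuming randomly sampled values of $p$, $k$, $m$ in arbitrary ranges; the function names are hypothetical. It tracks the smallest ratio of the right side to the left side.

```python
import random

def left_side(k, m, p):
    """(1 + k m^(p-1))^p."""
    return (1.0 + k * m ** (p - 1)) ** p

def right_side(k, m, p):
    """(1 + k^p)(1 + m^p)^(p-1)."""
    return (1.0 + k ** p) * (1.0 + m ** p) ** (p - 1)

random.seed(0)
worst_ratio = float("inf")
for _ in range(10000):
    p = random.uniform(1.01, 6.0)   # 1 < p < infinity, sampled on a bounded range
    k = random.uniform(0.0, 5.0)
    m = random.uniform(0.0, 5.0)
    worst_ratio = min(worst_ratio, right_side(k, m, p) / left_side(k, m, p))

# The inequality holds when the smallest observed ratio is at least 1.
print(worst_ratio)
```

Equality occurs for $k = m$, where both sides equal $(1 + k^p)^p$, so the smallest ratio can approach 1 but should not fall below it beyond rounding error.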
Example 3.1. Let $H = H_+ \oplus H_-$, where $H_\pm$ are GSIP spaces with GSIPs $[\cdot,\cdot]_\pm$ respectively. We construct the GIIP as follows:
$$\langle x_+ + x_-, y_+ + y_-\rangle = [x_+, y_+]_+ - [x_-, y_-]_-,$$
where $x = x_+ + x_-$, $y = y_+ + y_- \in H$. We can easily prove that $(H, \langle\cdot,\cdot\rangle)$ is a generalized Krein space.
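The construction of Example 3.1 can be illustrated in finite dimensions. The sketch below assumes that $H_+ = \mathbb{R}^2$ and $H_- = \mathbb{R}^1$ both carry the real $l^p$-type GSIP $\sum_i x_i |y_i|^{p-2} y_i$, with $p = 2.5$ and arbitrary sample vectors; all function names are hypothetical. It checks the identity $\langle x, y\rangle = [Jx, y]_p$ with $J = P_+ - P_-$ and the generalized Schwarz inequality for the GSIP of Theorem 3.1.

```python
def gsip(x, y, p):
    """Real l^p-type GSIP: sum_i x_i |y_i|^(p-2) y_i (terms with y_i = 0 vanish)."""
    return sum(xi * abs(yi) ** (p - 2) * yi for xi, yi in zip(x, y) if yi != 0)

def giip(x, y, p):
    """GIIP of Example 3.1: <x, y> = [x+, y+]_+ - [x-, y-]_- for x = (x_plus, x_minus)."""
    (xp, xm), (yp, ym) = x, y
    return gsip(xp, yp, p) - gsip(xm, ym, p)

def J(x):
    """J = P+ - P-: flip the sign of the H- component."""
    xp, xm = x
    return (list(xp), [-t for t in xm])

def induced_gsip(x, y, p):
    """[x, y]_p = <x+, y+> - <x-, y-> as in Theorem 3.1, evaluated through the GIIP."""
    (xp, xm), (yp, ym) = x, y
    zero_p, zero_m = [0.0] * len(xp), [0.0] * len(xm)
    return giip((xp, zero_m), (yp, zero_m), p) - giip((zero_p, xm), (zero_p, ym), p)

p = 2.5
x = ([1.0, -2.0], [0.5])
y = ([0.3, 1.0], [-1.5])

krein_identity_gap = abs(giip(x, y, p) - induced_gsip(J(x), y, p))
schwarz_lhs = abs(induced_gsip(x, y, p))
schwarz_rhs = induced_gsip(x, x, p) ** (1 / p) * induced_gsip(y, y, p) ** ((p - 1) / p)
print(krein_identity_gap, schwarz_lhs, schwarz_rhs)
```

In this model the induced GSIP is just the $l^p$-type GSIP on the concatenated vector, which is why the Schwarz-type bound reduces to Hölder's inequality.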
Example 3.2. Let $(X, [\cdot,\cdot])$ be a GSIP space (for example $X = C(K)$ or $L^{p'}$, $1 < p' < \infty$). Suppose $T$ is a generalized $p$ selfadjoint operator and $N(T) \neq \{0\}$ (such $T$ exists in general; for example, the paper [15] gives the functional models of the adjoint Abelian operators in $C(K)$ and $L^{p'}$ ($1 < p' < \infty$)). From [2], Theorem 2.22, $X = R_p(T) \oplus N_p(T)$, where $R_p(T)$, $N_p(T)$ are the numerical range and kernel space, respectively. Then $R_p(T)$, $N_p(T)$ are closed subspaces in $X$ and $(R_p(T), [\cdot,\cdot])$, $(N_p(T), [\cdot,\cdot])$ are GSIP spaces. Let the GIIP in $X$ be the following:
$$\langle x_1 + x_2, y_1 + y_2\rangle = [x_1, y_1] - [x_2, y_2], \quad \text{for } x_1, y_1 \in R_p(T),\ x_2, y_2 \in N_p(T).$$
It is easily proved that $(X, \langle\cdot,\cdot\rangle)$ is a generalized Krein space and that $R_p(T)$, $N_p(T)$ is a regular decomposition of $X$. For $1 < p < +\infty$, we can obtain different regular decompositions of $X$ because there exist many different generalized $p$ selfadjoint operators in the GSIP space $X$. Hence the regular decomposition of $X$ is not unique.

Definition 3.3. Define projection operators $P_\pm : \Pi = (R, \langle\cdot,\cdot\rangle) \to H_\pm$ such that $x = x_+ + x_- \in \Pi \mapsto x_\pm \in H_\pm$. Denote $J = P_+ - P_-$.

Theorem 3.2. (1) $J^2 = I$, $J = J^*$, where $J^*$ is the generalized $p$ adjoint operator in the generalized Krein space $(R, [\cdot,\cdot]_p)$. (2) $\langle x, y\rangle = [Jx, y]_p$, $[x, y]_p = \langle Jx, y\rangle$.

Proof. We only prove $J = J^*$; the others may be obtained easily:
$$[Jx, y]_p = [P_+x - P_-x, y]_p = [P_+x, y]_p - [P_-x, y]_p = \langle P_+x, y_+\rangle + \langle 0, y_-\rangle - \{\langle 0, y_+\rangle + \langle P_-x, y_-\rangle\}$$
$$= \langle P_+x, y_+\rangle - \langle P_-x, y_-\rangle = \langle P_+x, y_+\rangle - [P_-x, y_-]_p = \langle P_+x, y_+\rangle + [P_-x, -y_-]_p$$
$$= \langle P_+x, y_+\rangle + \langle P_-x, -y_-\rangle = [P_+x + P_-x, y_+ - y_-]_p = [x, Jy]_p.$$
Hence $J$ is a generalized $p$ selfadjoint operator in $(R, [\cdot,\cdot]_p)$. We remark that the GSIP in Theorem 3.1 depends on the regular decomposition of $R$. In general the GSIP isn't unique.

Definition 3.4. Let $(R, \langle\cdot,\cdot\rangle)$ be a GIIP space, and $x \in R$. If $\langle x, x\rangle \geq 0$ (or $\langle x, x\rangle \leq 0$), $x$ is called a semipositive (or seminegative) vector in $R$. If $\langle x, x\rangle > 0$ (or $\langle x, x\rangle < 0$), $x$ is called a positive (or negative) vector in $R$. If $\langle x, x\rangle = 0$, $x$ is called a neutral vector or isotropic vector.

Definition 3.5. A linear subspace $L$ of $R$ is called a positive (or negative, semi-positive, semi-negative, neutral, respectively) subspace if all vectors in $L$ are positive (or negative, semi-positive, semi-negative, neutral, respectively). Suppose that $L$ is a positive (or negative, semi-positive, semi-negative, neutral) subspace, and there is not any positive (or negative, semi-positive, semi-negative, neutral) subspace $L'$ such that $L$ is a proper subspace of $L'$.
Then $L$ is called a maximal positive (or maximal negative, maximal semi-positive, maximal semi-negative, maximal neutral) subspace in $(R, \langle\cdot,\cdot\rangle)$. According to the related results on indefinite inner product spaces (see [9–12]) we can easily obtain: any positive (or negative, semi-positive, semi-negative, neutral) subspace of the GIIP space $R$ can be extended to a maximal positive (or negative, semi-positive, semi-negative, neutral) subspace, but the extension isn't unique.

Theorem 3.3. Suppose that $L$ is a semi-positive subspace in $\Pi = (R, \langle\cdot,\cdot\rangle)$. Then there exist an orthogonal projection $P_+^L$ in $H_+$ and a contraction linear operator $T : P_+^L H_+ \to H_-$ such that
$$L = \{x_+ + T x_+ \mid x_+ \in P_+^L \Pi\} \qquad (3.3)$$
(or $\|T\| < 1$), where the orthogonal projection $P_+^L : L \to H_+$ means that $x \to x_+$, for arbitrary $x = x_+ + x_- \in L$, $\langle x_+, x_-\rangle = 0$. A subspace $L$ satisfying Eq. (3.3) is a semi-positive (or positive) subspace in $\Pi$ if and only if $\|T x_+\| \leq \|x_+\|$ for arbitrary $x \in P_+ \Pi$ (or $\|T\| < 1$).
Proof. For arbitrary $x \in L$, $-\|P_-x\|^p + \|P_+x\|^p = \langle x, x\rangle \geq 0$, so $\|P_-x\| \leq \|P_+x\|$. Moreover
$$\|P_-x\|^p \leq \|x\|^p = [x, x]_p = [P_+x, P_+x]_p + [P_-x, P_-x]_p = \|P_+x\|^p + \|P_-x\|^p \leq 2\|P_+x\|^p.$$
Thus $P_+|_L$ is invertible. Let the inverse be $E$, $E x_+ = x$, $x \in L$. Then $P_+x = x_+$. Denote by $P_+^L : H_+ \to P_+L$ a projection operator. Writing $T : P_+^L H_+ \to H_-$, $P_+x = P_+^L x_+ \to P_- E P_+ x$, for arbitrary $x \in L$, we easily get that $P_+^L$ is an orthogonal projection operator. From the semi-positivity of $L$ we obtain that $T$ is a contraction linear operator, or $\langle Tx, x\rangle \leq \langle x, x\rangle$, $x \in D(T)$. Then $L = \{x_+ + T x_+ : x_+ \in P_+^L \Pi\}$. The other conclusions in the theorem are omitted because their proofs are easy.

Corollary 3.1. The semi-positive subspace $L$ is a maximal semi-positive subspace if and only if $P_+L = H_+$; any semi-positive subspace is contained in a maximal semi-positive subspace.

Corollary 3.2. All maximal semi-positive (or semi-negative) subspaces in a generalized Krein space have identical dimension.

Theorem 3.4. In the Banach space $(R, [\cdot,\cdot]_p)$, if the GSIP $[\cdot,\cdot]_p$ is continuous in the first variable, any norms in $R$ are equivalent to each other.

Proof. Suppose $\|\cdot\|$, $\|\cdot\|'$ are two norms in $R$ and denote $\|x\|'' = \|x\| + \|x\|'$. Now we prove $\|x\|''$ is also a norm of $R$. In fact we only need to prove the completeness in $R$. For arbitrary $\{x_n\} \subset R$, if $\|x_n - x_m\|'' \to 0$, $m, n \to \infty$, then there exist $x_0, x_0' \in R$ such that $\|x_n - x_0\| \to 0$, $\|x_n - x_0'\|' \to 0$, as $n \to \infty$. Hence $[x_n - x_0', y]_p \to 0$ (from the continuity condition) and $[x_n - x_0, y]_p \to 0$, for arbitrary $y \in R$. Therefore, for arbitrary $y \in R$,
$$|[x_0 - x_0', y]_p| \leq |[x_n - x_0', y]_p| + |[x_n - x_0, y]_p| \to 0, \quad \text{as } n \to \infty.$$
Because $y$ is arbitrary, $x_0 = x_0'$ and
$$\|x_n - x_0\|'' = \|x_n - x_0\| + \|x_n - x_0\|' \to 0, \quad \text{as } n \to \infty.$$
If $\|\cdot\|''$ is a norm in $(R, [\cdot,\cdot]_p)$, the space is still complete. We get that $\|\cdot\|''$ is also a norm in $R$ and $\|x\| \leq \|x\|''$. From the Corollary of the Banach Inverse Theorem (see [50]) in Banach space, we have that $\|\cdot\|$ is equivalent to $\|\cdot\|''$. For the same reason, $\|\cdot\|'$ is equivalent to $\|\cdot\|''$. Therefore $\|\cdot\|$ is equivalent to $\|\cdot\|'$. The proof is complete.

4. The Construction of Natural Boundary Space

In this section we set up the natural boundary space of $p$ dissipative operators in a Banach space. The natural boundary space is a GIIP space. In the next section we will give an application of the natural boundary space. For similar results in Hilbert space and indefinite inner product space see M. G. Crandall and R. S. Phillips [12]. Because there doesn't exist an inner product in a Banach space, it is very difficult to extend those results. Using the GSIP space and the GIIP space, we solve this difficulty.

(i) Hypotheses of spaces $H_0$, $H_1$, $H_2$.

$H_0$: Let $(H_0, [\cdot,\cdot]_p)$, $1 < p < \infty$, be a GSIP space whose norm is $\|x\| = [x, x]_p^{1/p}$. Suppose that $S$ is a GPpS operator in $H_0$ ($S$ may be an unbounded operator), and suppose $F$ is a linear operator with the following conditions:
(1) domain $D(F) = D(S)$,
(2) $R(F) = H_0$,
(3) $F$ is a generalized $p$ selfadjoint operator,
(4) $[Fu, u]_p \geq [u, u]_p$ for arbitrary $u \in D(S)$, which we denote by $F - I \geq 0$.
The operator $F$ exists. For example, in an SIP space $X$, let $S$ be a GPpS operator; then $X = R(S) \oplus N(S)$ (see [2], Theorem 2.22). We assume that $Fx = \alpha Sx$ for $x \in R(S)$, $Fx = \beta x$ for $x \in N(S)$, and that $F$ is a linear operator in $X$. If $S$ is a bounded operator, choose $\alpha$ and $\beta$ such that $\alpha \geq 1/\|S\|$, $\beta \geq 1$; if $S$ is an unbounded operator, choose $\alpha$ and $\beta$ such that $\alpha = 1$, $\beta \geq 1$. Then we easily prove that $F$ is a GPpS operator in $X$. From [2], Theorem 2.22, we have $X = R(F) \oplus N(F)$. Because $N(F) = \{0\}$, $X = R(F)$. Therefore $F$ satisfies the above (1)–(4).
Remark 4.1. It is evident that $\|F\| \geq 1$ and $\|F^{-1}\| \leq 1$. From Theorem 2.2, $S$ and $F$ have unique positive square roots, which we denote by $S^{1/2}$ and $F^{1/2}$ respectively; $S^{1/2}$, $F^{1/2}$ are GPpS operators and $D(S^{1/2}) = D(F^{1/2})$.

$H_1$: Let $(H_1, [\cdot,\cdot]_1)$ be a GSIP space with GSIP $[\cdot,\cdot]_1$ such that
$$[u, v]_1 = [F^{1/2}u, F^{1/2}v]_p, \quad u, v \in D(F^{1/2}),$$
and the norm in $H_1$ is denoted $\|x\|_1 = [x, x]_1^{1/p}$, $x \in H_1$. It is easy to see that $H_1$ is a dense subset of $H_0$.

$H_2$: Let $(H_2, [\cdot,\cdot]_2)$ be a GSIP space with the GSIP
$$[u, v]_2 = [F^{-1/2}u, F^{-1/2}v]_p = [F^{-1}u, v]_p, \quad u, v \in H_0,$$
and the norm in $H_2$ is denoted $\|x\|_2 = [x, x]_2^{1/p}$.
Remark 4.2. (1) For arbitrary $u \in H_0$, $\|u\| \leq \|u\|_1$. In fact
$$\|u\| = [u, u]_p^{1/p} \leq [Fu, u]_p^{1/p} = [F^{1/2}u, F^{1/2}u]_p^{1/p} = \|u\|_1.$$
(2) An arbitrary $u \in H_0$ generates a continuous functional $l_u(v)$ on $H_1$ according to the formula $l_u(v) = [v, u]_p$, for arbitrary $v \in H_1$. Its continuity follows from the estimate
$$|l_u(v)| = |[v, u]_p| \leq [v, v]_p^{1/p}\, [u, u]_p^{1/q} = \|v\|\, \|u\|^{p-1} \leq \|v\|_1\, \|u\|^{p-1}.$$
(3) Let $u \in H_0$; then $\|u\|_2 = \|l_u\|^{1/(p-1)}$ and $\|u\|_2 \leq \|u\|$. In fact
$$\|l_u\|^{1/(p-1)} = \sup_{v \in H_1} \left(\frac{|[v, u]_p|}{\|v\|_1}\right)^{1/(p-1)} = \sup_{v \in H_1} \left(\frac{|[F^{-1/2}v, F^{-1/2}u]_p|}{\|F^{-1/2}v\|}\right)^{1/(p-1)} = \|F^{-1/2}u\| = \|u\|_2.$$
From Remark 4.1, $\|F^{-1}\| \leq 1$; then
$$\|u\|_2^p = [F^{-1}u, u]_p \leq \|F^{-1}u\|\, \|u\|^{p-1} \leq \|u\|^p, \quad \|u\|_2 \leq \|u\|.$$
Hence $\{l_u, u \in H_0\}$ is a dense subset of $H_1^*$, where $H_1^*$ is the dual space of $H_1$.

Remark 4.3. By Remarks 4.1 and 4.2, $\|u\|_2 \leq \|u\| \leq \|u\|_1$, $u \in H_0$; $H_1 \subset H_0 \subset H_2$ in the topological sense. If $u \in H_1$, $v \in H_2$, we define $[\cdot,\cdot]_p$ in $H_1 \times H_2$ by the formula
$$[u, v]_p = [F^{1/2}u, F^{-1/2}v]_p = [Fu, v]_2 = [u, F^{-1}v]_1, \quad u \in H_1,\ v \in H_2.$$
If $u \in H_2$, $v \in H_1$, define $[u, v]_p = [F^{-1/2}u, F^{1/2}v]_p$.

Example 4.1. For $1 < p < \infty$, $p \neq 2$, if $H_0 = L^p(R)$, denote the unique SIP $[\cdot,\cdot]$ in $H_0$ by
$$[f, g] = \|g\| \int_R \left(\frac{|g|}{\|g\|}\right)^{p-1} f\, \operatorname{sgn} g\, dx, \quad f, g \in L^p(R).$$
Let $F = \alpha I$, $\alpha > 1$. Then $H_1 = L^p(R, \alpha\, dx)$, $H_2 = L^q(R, \alpha^{-1} dx)$, $q^{-1} + p^{-1} = 1$.

Proposition 4.1. $(H_1, [\cdot,\cdot]_1)$, $(H_2, [\cdot,\cdot]_2)$ are GSIP spaces.

Proposition 4.2. $H_2 = (H_1)'$, where $(H_1)'$ is the dual space of $H_1$.

Proof. It is easy to see that $H_2 \subset (H_1)'$. For arbitrary $l \in (H_1)'$, there exists $a \in H_1$ such that $l(u) = [u, a]_1$, by the Riesz Representation Theorem in GSIP space. Then $l(u) = [u, a]_1 = [u, Fa]_p$, $u \in H_1$. Let $\alpha = Fa \in H_2$. Thus $l(u) = [u, \alpha]_p$, $u \in H_1$. From the Riesz Representation Theorem in GSIP space, $l \in H_2$. Then $(H_1)' \subset H_2$. Hence $(H_1)' = H_2$.

Proposition 4.3. (1) If $u \in H_1$, $v \in H_2$, then
$$|[u, v]_p| \leq \|F^{1/2}u\|\, \|F^{-1/2}v\|^{p-1} \leq \|u\|_1\, \|v\|_2^{p-1}.$$
(2) If $u \in H_2$, $v \in H_1$, then $|[u, v]_p| \leq \|u\|_2\, \|v\|_1^{p-1}$. (3) If $u, v \in H_1$, then $[Su, v]_p = [S^{1/2}u, S^{1/2}v]_p = [u, Sv]_p$, and $S^{1/2}$ defines bounded mappings on $H_1$ to $H_2$ and on $H$ to $H_2$. $S$ defines a bounded mapping on $H_1$ to $H_2$.

(ii) The definition of $p$ dissipative operator.

Definition 4.1. Let $L$ be a linear operator on $H_1$ to $H_2$ with domain $D(L)$ dense in $H_1$. We define $L^*$, the generalized $p$ adjoint operator to $L$, as the operator on $H_1$ to $H_2$ given by: $v$ is in $D(L^*)$ and $L^*v = f$ if $[Lu, v]_p = [u, f]_p$ for all $u$ in $D(L)$.

Definition 4.2. Let $L$ be a densely defined linear operator on $H_1$ to $H_2$. Then $L$ is (1) $p$ symmetric if $L^* \supset L$; (2) $p$ skew-symmetric if $L^* \supset -L$; (3) generalized $p$ selfadjoint if $L^* = L$.
Definition 4.3. Let $L$ be a densely defined linear operator on $H_1$ to $H_2$. $L$ is called a $p$ dissipative operator if $\operatorname{Re}[Lu, u]_p \leq 0$, $u \in D(L)$. $L$ is maximal dissipative if it is dissipative and not a proper restriction of a $p$ dissipative operator.

Remark 4.4. For $1 < p < +\infty$, if the space $H_0$ has the GSIP $[\cdot,\cdot]_p$ and $L$ is a $p$ dissipative operator from $H_1$ to $H_2$, then $L$ is a $p'$ dissipative operator, where $p \neq p'$, $1 < p' < +\infty$, by using the formula
$$\|y\|^{p-p'} [x, y]_{p'} = [x, y]_p, \quad 1 < p, p' < +\infty,$$
and $[x, y]_{p'}$ is a GSIP of $H_0$. Hence, for the $p$ dissipative operator in Definition 4.3, $p$ means that the space $H_0$ has the GSIP $[\cdot,\cdot]_p$, $1 < p < +\infty$.

Let $L_0 = A - S$ on $H_1$ to $H_2$, where $A$ is a $p$ skew-symmetric operator, $D(A)$ is dense in $H_1$, $\operatorname{Re}[Au, u]_p = 0$ for any $u \in D(L_0)$, and $S$ is a GPpS operator. Then $L_0$ is a $p$ dissipative operator because $\operatorname{Re}[L_0u, u]_p = -[Su, u]_p \leq 0$. In this and the next section we investigate the very important operator $L_0$ in a Banach space.

(iii) The construction of the natural boundary space $\hat{H}$ of $L_0$.

We introduce the product space $H_{12} = H_1 \times H_2$ with elements $u = \{u^1, u^2\}$ and the form $Q(\cdot,\cdot)$:
$$Q(u, v) = \operatorname{Re}[u^2, v^1]_p + \operatorname{Re}[Su^1, v^1]_p, \quad \text{for any } u = \{u^1, u^2\},\ v = \{v^1, v^2\} \in H_1 \times H_2.$$
Let the graph of $L_0$ be $G(L_0) = \{\{u, L_0u\} \mid u \in D(L_0)\}$. For $\mathbf{u} = \{u, L_0u\} \in G(L_0)$ we have $Q(\mathbf{u}, \mathbf{u}) = \operatorname{Re}[L_0u, u]_p + \operatorname{Re}[Su, u]_p = 0$. Let $L_1 = A^* - S$; we have the set $\operatorname{span} G(L_1)$, where $G(L_1)$ is the graph of $L_1$. If $\mathbf{u} = \{u, L_1u\}$, $\mathbf{v} = \{v, L_1v\} \in G(L_1)$, then
$$Q(\mathbf{u}, \mathbf{v}) = \operatorname{Re}[L_1u, v]_p + \operatorname{Re}[Su, v]_p = \operatorname{Re}[A^*u, v]_p.$$
The sets $H_+$, $H_-$ are defined by
$$H_+ = \{u : Q(u, u) \geq 0,\ u \in G(L_1)\} \subset G(L_1), \qquad H_- = \{u : Q(u, u) \leq 0,\ u \in G(L_1)\} \subset G(L_1).$$
Because $H_+$, $H_-$ may not be linear spaces, we define $H^+$, $H^-$ as follows: $H^+ = \operatorname{span} H_+$, $H^- = \operatorname{span} H_-$. Then $H^+$, $H^-$ are closed linear subspaces of $\operatorname{span} G(L_1)$ and $G(L_0) \subset H^+ \cap H^-$. Define $Q_+(\cdot,\cdot)$ in $H^+$:
$$Q_+(u, v) = Q(u, v) \operatorname{sgn} Q(v, v), \quad u, v \in H^+.$$
Define $Q_-(\cdot,\cdot)$ in $H^-$:
$$Q_-(u, v) = Q(u, v)(-\operatorname{sgn} Q(v, v)), \quad u, v \in H^-.$$
Let $H$ be the direct sum space $H = H^+ \oplus H^-$, where $\oplus$ means direct sum. Define the form $Q(\cdot,\cdot)$ in $H$, for $u = u^+ + u^-$, $v = v^+ + v^- \in H$:
$$Q(u, v) = Q_+(u^+, v^+) + Q_-(u^-, v^-).$$
Let $\hat{H} = H/G(L_0) = H^+/G(L_0) \oplus H^-/G(L_0)$. Introduce the form $\hat{Q}(\hat{u}, \hat{v})$, $\hat{u}, \hat{v} \in \hat{H}$, such that $\hat{Q}(\hat{u}, \hat{v}) = Q(u, v)$, where $u$, $v$ belong to the cosets $\hat{u}$, $\hat{v}$ in $H$.

Theorem 4.1. $(H^+, Q_+(\cdot,\cdot))$, $(H^-, -Q_-(\cdot,\cdot))$ are GSIP spaces. Let $H^+/G(L_0)$, $H^-/G(L_0)$ be quotient spaces; then $(H^+/G(L_0), Q_+)$, $(H^-/G(L_0), -Q_-)$ are GSIP spaces.

Theorem 4.2. $\hat{Q}(\hat{u}, \hat{v}) = Q(u, v)$ for any $u \in \hat{u}$, $v \in \hat{v}$, where $u, v \in H$, $\hat{u}, \hat{v} \in \hat{H}$.

Theorem 4.3. $(\hat{H}, \hat{Q}(\cdot,\cdot))$ is a GIIP space.

Let $\tilde{H} = \hat{H} \oplus \check{H}$, where $\oplus$ is a direct sum and $\check{H}$ is a GSIP space with GSIP $[\cdot,\cdot]_p$. For any $\tilde{u} = \{\hat{u}, \check{u}\} \in \tilde{H}$, define $\tilde{Q}$:
$$\tilde{Q}(\tilde{u}, \tilde{v}) = \hat{Q}(\hat{u}, \hat{v}) + [\check{u}, \check{v}]_p.$$
It is easy to see that $(\tilde{H}, \tilde{Q})$ is a GIIP space. Let $\tilde{N}$ be a negative subspace on $(\tilde{H}, \tilde{Q})$. Suppose $\hat{N}$ is the projection subspace of $\tilde{N}$ in $\hat{H}$. From the definition of $\tilde{Q}$, $\hat{N}$ is a negative subspace on $(\hat{H}, \hat{Q})$ and
$$\|\check{u}\|^p \leq -\hat{Q}(\hat{u}, \hat{u}) = -Q(u, u) \leq C \|\hat{u}\|^p, \quad \text{for any } \{\hat{u}, \check{u}\} \in \tilde{N}.$$

Theorem 4.4. If $\tilde{N}$ is a maximal negative subspace on $(\tilde{H}, \tilde{Q})$, then $\hat{N}$ is a maximal negative subspace on $(\hat{H}, \hat{Q})$.

Theorem 4.5. If $\tilde{N}$ is a maximal negative subspace on $\tilde{H}$, then $\|\check{u}\|^p \leq -\hat{Q}(\hat{u}, \hat{u})$, for $\tilde{u} = \{\hat{u}, \check{u}\} \in \tilde{N}$. Define a transformation $\varphi : \hat{u} \to \check{u}$, $\tilde{u} = \{\hat{u}, \check{u}\} \in \tilde{N}$. Then it is a linear contraction transformation with respect to the form $-\hat{Q}$ on the maximal negative subspace $\tilde{N}$ to $\check{H}$. Conversely, if the transformation $\varphi$ is a contraction (in this sense) on the maximal negative subspace of $\tilde{H}$ to $\check{H}$, then the graph of $\varphi$ is a maximal negative subspace of $\tilde{H}$.

(iv) The proof of Theorems 4.1 to 4.5.

Proof of Theorem 4.1. We have $Q_\pm(u, u) = Q(u, u)(\pm \operatorname{sgn} Q(u, u)) = |Q(u, u)| > 0$ for $u \neq 0$. Now, to prove that $(H^+, Q_+)$ is a GSIP space, we only need to prove:
$$|Q_+(u, v)| \leq Q_+(u, u)^{1/p}\, Q_+(v, v)^{(p-1)/p}, \quad \text{for any } u, v \in H^+. \qquad (4.1)$$
In fact,
Q+ (u, v) = |Q(u, v)| = Re A∗ u, v p = 0.5 Re A∗ u, v p + Re A∗ u, v p = 0.5 Re A∗ u, v p + Re u, A∗∗ v p .
As A∗ ⊃ −A, −A∗∗ ⊂ A∗ , then Q+ (u, v) = 0.5 Re A∗ u, v + Re u, −A∗ v . p p Construct a new GSIP space H1 × H2 with GSIP [·, ·]12 as follows: [u, v]12 = Re u1 , v 1 1 + Re u2 , v 2 2 , ∀u = {u1 , u2 }, v = {v 1 , v 2 } ∈ H1 × H2 . Imitating Theorem 3.1, [·, ·]12 is a GSIP in H1 × H2 . Let W u = W {u1 , u2 } = {F −1 u2 , F u1 }, ∀u ∈ H1 × H2 . Then W 2 = I and [W u, v]12 =[u, W v]12 , or W is a generalized p selfadjoint operator in GSIP space (H1 × H2 , [·, ·]12 ). Next we prove that W satisfies the generalized Schwarz inequality in (H1 × H2 , [·, ·]12 ): 1/p
|[W x, y]12 | ≤ |[W x, x]12 |
1/q
|[W y, y]12 |
, for x, y ∈ H1 × H2 .
(4.2)
As W 2 = I, from Proposition 2.1, there exist linear operators E1 , E2 satisfying ( 0, i 6 = j and E1 + E2 = I, W = E1 − E2 . E i Ej = Ei , i = j Then, for any f, g ∈ H1 × H2 , f = f1 + f2 , g = g1 + g2 , f1 , g1 ∈ E1 (H1 × H2 ), f2 , g2 ∈ E2 (H1 × H2 ). W f = f1 − f2 , W g = g1 − g2 . Thus E1 (H1 × H2 ), E2 (H1 × H2 ), W (H1 × H2 ) = (E1 − E2 )(H1 × H2 ) are linear subspaces in H1 × H2 . Construct a new GSIP [·, ·]E1 in the product space (E1 (H1 × H2 )) × (W (H1 × H2 )) as follows: [f1 , z]E1 = [f1 , h − g]12 , where f1 , h ∈ E1 (H1 × H2 ) ⊂ H1 × H2 , g ∈ E2 (H1 × H2 ), z = h − g ∈ (E1 − E2 )(H1 × H2 ) = W (H1 × H2 ) ⊂ H1 × H2 .
Construct a new GSIP [·, ·]E2 in the product space (E2 (H1 × H2 )) × (W (H1 × H2 )) as follows: [f2 , z]E2 = [f2 , h − g]12 , where f2 , g ∈ E2 (H1 × H2 ) ⊂ H1 × H2 , h ∈ E1 (H1 × H2 ), z = h − g ∈ (E1 − E2 )(H1 × H2 ) = W (H1 × H2 ) ⊂ H1 × H2 . Because [·, ·]12 is a GSIP, then it is easy to prove that [·, ·]E1 , [·, ·]E2 are GSIPs in spaces (E1 (H1 × H2 )) × (W (H1 × H2 )), (E2 (H1 × H2 )) × (W (H1 × H2 )) respectively. Construct a new GSIP [·, ·]E3 in the space (H1 × H2 ) × W (H1 × H2 ) such that h i = [f1 , g1 − g2 ]E1 + [f2 , g1 − g2 ]E2 , fb, gb E3
where
fb = f1 + f2 , gb = g1 + g2 ∈ H1 × H2 , W gb = g1 − g2 .
Similar to Theorem 3.1, [·, ·]E3 is a GSIP in the space (H1 × H2 ) × (H1 × H2 ) because [·, ·]E1 , [·, ·]E2 , are GSIP in the spaces (E1 (H1 × H2 )) × (W (H1 × H2 )), (E2 (H1 × H2 )) × (W (H1 × H2 )) respectively. Hence we have the generalized Schwarz inequality in ((H1 × H2 ) × (H1 × H2 ), [·, ·]E3 ): h i 1/p 1/q i h fb, gb ≤ fb, fb gb, gb , fb, gb ∈ (H1 × H2 ) × (H1 × H2 ). E E3
E3
3
It is enough to remark that since for any fb,b g, h i fb, gb = [f1 , g1 − g2 ]E + [f2 , g1 − g2 ]E 1 2 E3 = |[f1 , g1 − g2 ]12 + [f2 , g1 − g2 ]12 | h i h i = fb, W gb = W fb, gb , 12
12
the generalized Schwarz inequality follows: h i 1/q fb, gb ≤ [f1 , f1 −f2 ]E +[f2 , f1 −f2 ]E 1/p [g1 , g1 −g2 ]E + g1 , g1 −g2 1 2 1 E2 E3 1/p
1/q
= |[f1 , f1 −f2 ]12 +[f2 , f1 −f2 ]12 | |[g1 , g1 −g2 ]12 +[g2 , g1 −g2 ]12 | h i 1/p i 1/p 1/q h 1/q = fb, W fb gb, W gb 12 = W fb, fb W gb, gb 12 . 12
12
Hence the formula (4.2) is proved. And we easily prove that Q+ (u, v) = 0.5 Re A∗ u, v + Re u, A∗∗ v = 0.5 W u0 , v 0 , p p 12 where u0 = {u, A∗ u}, v 0 = {v, A∗∗ v} ∈ H1 × H2 . By using Eq. (4.2), 1/p 1/q |Q+ (u, v)| ≤ 0.5 W u0 , u0 12 W v 0 , v 0 12 1/p 1/q = 0.5 Re A∗ u, u p + Re u, A∗∗ u p Re A∗ v, v p + Re v, A∗∗ v p 1/p 1/q = Q+ (u, u) Q+ (v, v) .
Therefore (H + , Q+ ) is a GSIP space. By similar reasoning we also conclude that (H − , −Q− ) is a GSIP space. Then the quotient spaces H ± /G(L0 ) exist. The forms Q± of H ± bring about the forms of the quotient spaces H ± /G(L0 ). It is easy to see that H ± /G (L0 ) , ±Q± are GSIP spaces. This completes the proof. b u0 ∈ b u, u b,b v ∈ H, Proof of Theorem 4.2. Obviously Q(b b ∈ G(L0 ). As u b) = 0 when u b u b u b + u0 , vb = Q b, vb , G(L0 ), first we prove that Q b u + u0 , vb) = Q+ (u+ + u0 , v + ) + Q− (u− , v − ), Q(b
(4.4)
Q+ (u+ + u0 , v + ) = Q(u+ + u0 , v + ) sgn Q(v + , v + ) = Re A∗ (u+ + u0 ), v+ p sgn Q(v + , v + ) = Re A∗ u+ , v+ p sgnQ(v + , v + ) + Re [Au0 , v+ ]p sgn Q(v + , v + ) = Q+ (u+ , v + ) + Re [Au0 , v+ ]p sgn Q(v + , v + ), where u+ = {u+ , L1 u+ }, v + = {v+ , L1 v+ } ∈ H + , u0 = {u0 , L0 u0 } ∈ G(L0 ). Now we need to prove Re [Au0 , v+ ]p = 0. By using the same notation for W and the form (4.2) of Theorem 4.1 we have Re [Au0 , v+ ]p = Q(u0 , v + ) = 0.5 W u00 , v+0 12 1/p 1/q ≤ 0.5 W u00 , u00 12 W v+0 , v+0 12 1/q 1/p = 0.5 |Re [Au0 , u0 ]12 | W v+0 , v+0 12 = 0, where u0 = {u, A∗ u}, u0 = {u, L0 u}, v + = {v+ , L1 v+ }, v 0 = {v+ , A∗∗ v+ }. Hence Re [Au0 , v+ ]p = 0. Then Q+ (u+ + u0 , v + ) = Q(u+ , v + ). The form (4.4) changes into b u + u0 , vb) = Q(u + u0 , v) = Q+ (u+ , v + ) + Q− (u− , v − ) = Q(u, v). Q(b Next we prove b u, vb + v 0 ) = Q(u, v), u, v ∈ H + ⊕ H − , v 0 ∈ G(L0 ), u = u+ + v − . Q(b It suffices to note that Q(b u, vb + v 0 ) = Q(u, v) by similar reasoning as in the preceding proof. Obviously v + + v 0 ∈ H + , v − + v 0 ∈ H − . We obtain Q(u, v + v 0 ) = Q+ (u+ , v + + v 0 ) + Q− (u− , v − ) = Q(u+ + 0, v + + v 0 ) + Q− (u− , v − ) = Q+ (u+ , v + ) + Q− (0, v 0 ) + Q− (u− , v − ) = Q+ (u+ , v + ) + Q− (u− , v − ) = Q(u, v). Then Therefore
b u, vb + v 0 ) = Q(u, v + v 0 ) = Q(u, v). Q(b b ∀ u∈u b u, vb) = Q(u, v), u b, vb ∈ H, b, v ∈ vb, Q(b
where u b, v are the cosets of u, v respectively. This completes the proof.
b → H − /G(L0 ) be a projection operator such that Proof of Theorem 4.4. Let P− : H b u+ ∈ H + /G(L0 ), u− ∈ H/G(L0 ). u = u+ + P− u, P− u = u− , u ∈ H, b to be a maximal negFirst we prove that a necessary and sufficient condition for N b b ative subspace is that P− N = H − /G(L0 ). If N is a maximal negative subspace, b ⊂ H − /G(L0 ). If P− N b does not fill out H − /G(L0 ) then there exists then P− N b ). Hence u b and {b b is a negative subspace in H b and b∈ /N u} ∪ N u b ∈ (H − /G(L0 ))\(P− N b = H − /G(L0 ) and N b b . This is a contradiction. Conversely, if P− N properly contain N b b b ∈ H − /G(L0 ), is not maximal negative, there exists u b∈ / N , Q(b u, u b) < 0. Hence P− u b . As u b , where P+ = I − P− . b = (I − P− )b P− u b ∈ P− N u + P− u b, then (I − P− )b u ∈ P+ N b =N b , but u b ; this is a contradiction. Hence N b is a maximal b∈ /N Then u b ∈ (P+ + P− )N negative subspace. e is a maximal negative subspace in H, e then N e is a maximal Now to prove that if N b b b negative subspace in H. As N is a closed negative subspace, then P− N is a closed subb ), 0}∪ N e is space in H − /G(L0 ). If P− N 6 = H − /G(L0 ), then the set {(H/G(L0 ))\(P− N e e e a negative subspace relative to Q in N and properly contains N . This is a contradiction. b is a maximal negative subspace. Therefore N This completes the proof. Proofs of Theorem 4.3 and Theorem 4.5, follow from the above, so we omit the proofs. 5. The Maximal Dissipative Extension Representation of p Dissipative Operator Theorem 5.1. Let L0 = A − S, A is a p skew-symmetric operator, and satisfty Re [Au, u]p = 0, S is an reversible GPpS operator. Suppose the maximal dissipative extension of L0 is L. Then, there is a one to one correspondence between the maximal e of GIIP space H, e dissipative extension L of L0 and the maximal negative subspace N and u), L1 = A∗ − S, Lu = L1 u + S 1/2 ϕ(b b, N b is the projection of N e from H e to H}. b b∈N D(L) = {u ∈ D(L1 ) | u Proof. 
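Assume that $L$ is the maximal dissipative extension; then $\operatorname{Re}[Lu, u]_p \leq 0$, $u \in D(L)$. If $u \in D(L_0)$, $v \in D(L)$, then
$$\operatorname{Re}[L_0u, v]_p = \operatorname{Re}[Au, v]_p - \operatorname{Re}[Su, v]_p, \qquad \operatorname{Re}[Au, v]_p = \operatorname{Re}[L_0u, v]_p + \operatorname{Re}[Su, v]_p,$$
$$|\operatorname{Re}[Au, v]_p| \leq |\operatorname{Re}[L_0u, v]_p| + |\operatorname{Re}[Su, v]_p| \leq (\|L_0u\| + \|Su\|)\, \|v\|^{p-1}, \qquad (5.1)$$
$$|[Au, v]_p| = |\operatorname{Re}[Au, v]_p + i \operatorname{Re}[Au, iv]_p| \leq |\operatorname{Re}[Au, v]_p| + |\operatorname{Re}[Au, iv]_p| \leq 2(\|L_0u\| + \|Su\|)\, \|v\|^{p-1}. \qquad (5.2)$$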
Then v ∈ D(A∗ ). As S : H1 → H is a bounded operator and L1 = A∗ − S, we have D(L1 ) ⊃ D(L). As the operator S is inversive, for arbitrary v ∈ D(L0 ) and u ∈ D(L), then we have i h [v, Lu − L1 u]p = S 1/2 v, S −1/2 (Lu − L1 u) p
p−1
. ≤ S 1/2 v S 1/2 Lu + S 1/2 L1 u 2
2
ˇ so that From the Riesz Representation Theorem of GSIP space, there exists fˇ ∈ H, for any v ∈ D(L0 ), we have i h i h [v, Lu − L1 u]p = S 1/2 v, fˇ = v, S 1/2 fˇ . p
p
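Hence $Lu - L_1 u = S^{1/2}\check f$, or $Lu = L_1 u + S^{1/2}\check f$. As $H_0 = D(L_0) \supset H_1$, then $Lu = L_1 u + S^{1/2}\check f$, $u \in H_1$.
Let $\hat u, \hat v \in \hat H$, where $\hat u, \hat v$ are the cosets of $u, v$ and $u = u^+ + u^-$, $v = v^+ + v^-$, $u^+, v^+ \in H^+$, $u^-, v^- \in H^-$. From Theorem 4.2 we obtain
\[ \hat Q(\hat u, \hat v) = Q(u, v) = Q_+(u^+, v^+) + Q_-(u^-, v^-). \]
The following inequality holds:
\[ \hat Q(\hat{\mathbf u}, \hat{\mathbf u}) - m\,\|S^{1/2}u - \check f\|^p + \|\check f\|^p \le 0, \tag{5.3} \]
where $m$ is a fixed constant, $\mathbf u = \{u, L_1 u\} \in G(L_1)$, and $\hat{\mathbf u}$ is the coset of $\mathbf u$. First, we prove
\[ Q(\mathbf u, \mathbf u) - m\,\|S^{1/2}u - \check f\|^p + \|\check f\|^p \le 0, \quad \mathbf u \in G(L_1). \tag{5.4} \]
As $Q(\mathbf u, \mathbf u) = \mathrm{Re}\,[L_1 u, u]_p + \mathrm{Re}\,[Su, u]_p$, then $\mathrm{Re}\,[L_1 u, u]_p = Q(\mathbf u, \mathbf u) - \mathrm{Re}\,[Su, u]_p$. As $\mathrm{Re}\,[Lu, u]_p \le 0$, then $\mathrm{Re}\,[L_1 u + S^{1/2}\check f, u]_p \le 0$, that is,
\[ Q(\mathbf u, \mathbf u) - \mathrm{Re}\,[Su, u]_p + \mathrm{Re}\,[S^{1/2}\check f, u]_p \le 0. \]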
Hence, to prove (5.4), we only have to prove
\[ -m\,\|S^{1/2}u - \check f\|^p + \|\check f\|^p \le -[Su, u]_p + \mathrm{Re}\,[S^{1/2}\check f, u]_p. \]
In fact
\[ \mathrm{Re}\,[-S^{1/2}u + \check f, S^{1/2}u]_p \le |[-S^{1/2}u + \check f, S^{1/2}u]_p| \le \|S^{1/2}u - \check f\|\,\|S^{1/2}u\|^{p-1} \le \frac{1}{p}\|S^{1/2}u - \check f\|^p + \frac{1}{q}\|S^{1/2}u\|^{q(p-1)}, \tag{5.5} \]
where $1/p + 1/q = 1$, $p > 1$.
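The last step of (5.5) is Young's inequality $a\,b^{p-1} \le a^p/p + b^{q(p-1)}/q$ with $a = \|S^{1/2}u - \check f\|$ and $b = \|S^{1/2}u\|$; since $q(p-1) = p$, the second term is $b^p/q$. The following is a minimal numerical sanity check of that scalar inequality only. The function name `young_gap` and the sample ranges are illustrative assumptions, not part of the paper.

```python
import random


def young_gap(a: float, b: float, p: float) -> float:
    """Right-hand side minus left-hand side of a*b**(p-1) <= a**p/p + b**(q*(p-1))/q."""
    q = p / (p - 1.0)  # conjugate exponent: 1/p + 1/q = 1
    return a**p / p + b ** (q * (p - 1.0)) / q - a * b ** (p - 1.0)


random.seed(0)
# Hypothetical sample of non-negative "norms" a, b and exponents p > 1.
samples = [
    (random.uniform(0.0, 5.0), random.uniform(0.0, 5.0), random.uniform(1.1, 6.0))
    for _ in range(1000)
]
min_gap = min(young_gap(a, b, p) for a, b, p in samples)
print("inequality holds on all samples:", min_gap >= -1e-9)
```

Equality is attained when $a = b$, which is why the check allows a small negative rounding tolerance.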
The left-hand side of the above inequality is equal to $-[Su, u]_p + \mathrm{Re}\,[\check f, S^{1/2}u]_p$. So
\[ [Su, u]_p - \mathrm{Re}\,[\check f, S^{1/2}u]_p \ge -\frac{1}{p}\|S^{1/2}u - \check f\|^p - \frac{1}{q}\|S^{1/2}u\|^p. \]
By using the above inequality in (5.5), we have to prove
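\[ -m\,\|S^{1/2}u - \check f\|^p + \|\check f\|^p - \frac{1}{p}\|S^{1/2}u - \check f\|^p - \frac{1}{q}\|S^{1/2}u\|^p \le 0 \]
or
\[ \left(-m - \frac{1}{p}\right)\|S^{1/2}u - \check f\|^p + \|\check f\|^p \le \frac{1}{q}\|S^{1/2}u\|^p. \tag{5.6} \]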
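Note that
\[ \mathrm{Re}\,[-S^{1/2}u + \check f, \check f]_p \le |[-S^{1/2}u + \check f, \check f]_p| \le \|-S^{1/2}u + \check f\|\,\|\check f\|^{p-1} \le \frac{r}{p}\|S^{1/2}u - \check f\|^p + \frac{1}{qr}\|\check f\|^p, \]
where $1/p + 1/q = 1$, $p > 1$, $r > 0$. Then
\[ q\,\mathrm{Re}\,[-S^{1/2}u + \check f, \check f]_p \le \frac{qr}{p}\|-S^{1/2}u + \check f\|^p + \frac{1}{r}\|\check f\|^p, \]
\[ -q\,\mathrm{Re}\,[S^{1/2}u, \check f]_p + \left(q - \frac{1}{r}\right)\|\check f\|^p \le \frac{qr}{p}\|S^{1/2}u - \check f\|^p, \]
\[ \frac{rq - 1}{r(q-1)}\|\check f\|^p \le \frac{q}{q-1}\mathrm{Re}\,[S^{1/2}u, \check f]_p + \frac{qr}{p(q-1)}\|S^{1/2}u - \check f\|^p, \]
\[ -\frac{q}{q-1}\mathrm{Re}\,[S^{1/2}u, \check f]_p \le \frac{qr}{p(q-1)}\|S^{1/2}u - \check f\|^p - \frac{rq - 1}{r(q-1)}\|\check f\|^p. \]
As $1/p + 1/q = 1$, $q = p(q-1)$, $p = q/(q-1)$, then
\[ \frac{1 - rq}{r(q-1)}\|\check f\|^p - r\,\|S^{1/2}u - \check f\|^p \le p\,\mathrm{Re}\,[-S^{1/2}u, \check f]_p \le p\,\|S^{1/2}u\|\,\|\check f\|^{p-1} \le p\left(\frac{l}{p}\|S^{1/2}u\|^p + \frac{1}{ql}\|\check f\|^{(p-1)q}\right) = l\,\|S^{1/2}u\|^p + \frac{p}{ql}\|\check f\|^p, \]
where $l > 0$. By simplifying, we get
\[ \left(\frac{1 - rq}{r(q-1)} - \frac{p}{ql}\right)\|\check f\|^p - r\,\|S^{1/2}u - \check f\|^p \le l\,\|S^{1/2}u\|^p. \]
Let $r$ be such that $0 < r < \min\left\{\frac{1}{q}, \frac{1}{2\sqrt{q(q-1)} + q}, \frac{1}{ql + 1}\right\}$. Then we have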
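\[ t = \frac{1 - rq}{r(q-1)} - \frac{p}{ql} > 0, \qquad \|\check f\|^p \le \frac{r}{t}\|S^{1/2}u - \check f\|^p + \frac{l}{t}\|S^{1/2}u\|^p. \]
Using Inequality (5.6) it follows that
\[ \left(-m - \frac{1}{p} + \frac{r}{t}\right)\|S^{1/2}u - \check f\|^p \le \left(\frac{1}{q} - \frac{l}{t}\right)\|S^{1/2}u\|^p. \]
Next we choose $l > 0$ such that
\[ \frac{1}{q} - \frac{l}{t} > 0. \tag{5.7} \]
In fact
\[ \frac{1}{q} - \frac{l}{t} = \frac{r(q-1)\left[\,l^2 + \dfrac{1 - rq}{qr(q-1)}\,l + \dfrac{1}{q(q-1)}\right]}{(rq - 1)\left[\,l + \dfrac{1}{rq - 1}\right]}. \]
Because $rq - 1 < 0$ and $r < \dfrac{1}{2\sqrt{q(q-1)} + q}$, the equation
\[ l^2 + \frac{1 - rq}{qr(q-1)}\,l + \frac{1}{q(q-1)} = 0 \]
has two real solutions $a_1$ and $a_2$ such that
\[ a_1, a_2 = \frac{1}{2}\left[-\frac{1 - rq}{qr(q-1)} \pm \sqrt{\frac{(1 - rq)^2}{q^2(q-1)^2 r^2} - \frac{4}{q(q-1)}}\,\right]. \]
Let $a_3 = \dfrac{1}{1 - rq}$. Then the left-hand side of (5.7) can be written as
\[ \frac{1}{q} - \frac{l}{t} = \frac{r(q-1)(l - a_1)(l - a_2)}{(rq - 1)(l - a_3)}. \]
Hence we can choose $l > 0$ such that $\frac{1}{q} - \frac{l}{t} > 0$. We take $m = -\frac{1}{p} + \frac{r}{t}$. Then the left-hand side of the inequality obtained from (5.6) is 0 and the coefficient on its right-hand side is $\frac{1}{q} - \frac{l}{t} > 0$. Therefore that inequality holds. Hence (5.4) holds. So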
\[ Q(\mathbf u, \mathbf u) - m\,\|S^{1/2}u - \check f\|^p + \|\check f\|^p \le 0, \quad \mathbf u = \{u, L_1 u\} \in G(L_1). \tag{5.8} \]
It follows that $Q(\mathbf u, \mathbf u) \le 0$ and $\mathbf u \in H^-$, $Q(\mathbf u, \mathbf u) = Q_-(\mathbf u, \mathbf u)$. Hence $\hat Q(\hat{\mathbf u}, \hat{\mathbf u}) = Q(\mathbf u, \mathbf u) = Q_-(\mathbf u, \mathbf u)$. Equation (5.3) holds.
Analogous to the discussion of the form (1.20) in [12], if $v$ lies in $D(L_0)$, then $Lv = L_1 v = L_0 v$; it follows from this that $\check f$ in $Lu = L_1 u + S^{1/2}\check f$ depends only on the boundary coset to which $u$ belongs; that is, $\check f = \varphi(\hat u)$. Since $D(L_0)$ is dense in $H_1$ and $S^{1/2}$ is bounded on $H_1$ to $H$, we see that $S^{1/2}D(L_0)$ will be dense in $\check H$ and so will the image of $S^{1/2}$ acting on the first components of any boundary coset. Consequently, (5.3) holds for all $u$ in a given boundary coset only if it holds with the middle term omitted. In other words,
\[ \hat Q(\hat u, \hat u) + \|\varphi(\hat u)\|^p \le 0, \quad u \in D(L). \]
Therefore $\{\{\hat u, \varphi(\hat u)\},\ u \in D(L)\}$ forms a negative subspace corresponding to $\tilde Q$ in $\tilde H$.
On the other hand, if
\[ Lu = L_1 u + S^{1/2}\varphi(\hat u), \quad D(L) = \{u \in D(L_1),\ \hat u \in \hat N\} \]
and $L$ is the extension of $L_0$ and $\{\{\hat u, \varphi(\hat u)\} \mid u \in D(L)\}$ is a maximal negative subspace of $\tilde H$, then we can show that (5.3) holds. So $L$ is a p dissipative operator.
Therefore there exists a one to one correspondence between the maximal negative subspace of $\tilde H$ and the maximal dissipative extension representation (5.1), (5.2). This completes the proof.

6. Application and Remark
In this section we study the maximal dissipative extension of the Schrödinger operator by means of the above theory. According to [19], the Schrödinger operator is $-h\Delta + V(x)$, defined in $C^\infty(M)$, where $M$ is a $C^\infty$ compact Riemann manifold. The Schrödinger operator has a unique dissipative extension in the Sobolev space $H^2(M)$. But if the domain isn't a Riemann manifold, the operator becomes complex and the study of the Schrödinger equation becomes difficult (see [8, 9]). For this reason, we study the operator for the domain in Banach space, and give the maximal dissipative extension representation of the operator.
Suppose $X = L^{p_0}[0, 2\pi]$, $p_0 \neq 2$, $1 < p_0 < \infty$, with its GSIP $[\cdot, \cdot]_p$ as follows:
\[ [f, g]_p = \int_0^{2\pi} f\,g\,|g|^{p-2}\,dx, \quad 1 < p < \infty, \]
where $p$ may be different from $p_0$. Obviously the norm in $X$ is $\|f\| = [f, f]_p^{1/p}$. Let
\[ L_0 f = i f'' - f, \]
\[ D(L_0) = \{f : f, f', f'' \in L^{p_0}[0, 2\pi],\ f(0) = f(2\pi),\ f'(0) = f'(2\pi),\ 1 < p_0 < \infty\}; \]
$L_0$ is a certain type of Schrödinger operator which will be studied. Suppose $A : D(L_0) \to X$, $Af = i f''$. Let $G(L_0) = \{\{f, L_0 f\} : f \in D(L_0)\}$. In $X \times X$, construct $Q(\cdot, \cdot)$ such that
\[ Q(f, g) = (fg)'(2\pi) - (fg)'(0) = f'(2\pi)g(2\pi) - f'(0)g(0) + f(2\pi)g'(2\pi) - f(0)g'(0). \]
Let $H = H^+ \oplus H^-$, where
\[ H^+ = \mathrm{span}\{f \in X,\ Q(f, f) \ge 0\}, \quad H^- = \mathrm{span}\{f \in X,\ Q(f, f) \le 0\}. \]
Denote $\hat H = H/G(L_0)$. Construct $\hat Q = \hat Q_+ + \hat Q_-$ in $\hat H$, where
\[ \hat Q_+(\hat f_+, \hat g_+) = Q(\hat f_+, \hat g_+)\,\mathrm{sgn}\,Q(\hat g_+, \hat g_+), \quad \hat f_+, \hat g_+ \in H^+/G(L_0), \]
\[ \hat Q_-(\hat f_-, \hat g_-) = Q(\hat f_-, \hat g_-)\,(-\mathrm{sgn}\,Q(\hat g_-, \hat g_-)), \quad \hat f_-, \hat g_- \in H^-/G(L_0). \]
Suppose $\check H = \{f : f, f' \in X\}$, $\tilde H = \hat H \oplus \check H$. For any $\tilde f = \{\hat f, \check f\}$, define $\tilde Q$ by
\[ \tilde Q(\tilde f, \tilde g) = \hat Q(\hat f, \hat g) + [\check f, \check g]_p. \]
It is easy to see that
\[ \hat H = \{\hat u = \{u(0), u(2\pi), u'(0), u'(2\pi)\},\ u \in X\}. \]
Let $\hat u \in \hat N$, where $\hat N$ is a maximal negative subspace in $(\hat H, \hat Q)$. Then $\hat Q(\hat u, \hat u) \le 0$ and
\[ \hat Q(\hat u, \hat u) = 2\,(u'(2\pi)u(2\pi) - u'(0)u(0)) \le 0, \]
or
\[ \alpha u'(2\pi)u(2\pi) + \beta u'(0)u(0) = 0 \quad \text{and} \quad |\beta| \le |\alpha|. \]
Then $\hat N$ is a two dimensional subspace.
Let $\tilde N$ be a maximal negative subspace in $\tilde H$. From Theorem 4.4, $\hat N$, which is a projection of $\tilde N$ in $\hat H$, is a maximal negative subspace. Because
\[ \check u = \varphi(\hat u), \quad \check u \in \check H,\ \hat u \in \hat H,\ \tilde u = \{\hat u, \check u\} \in \tilde N \]
and $\varphi$ is a linear mapping, the $\check H$-component of $\tilde N$ is linearly dependent on the $\hat H$-component by Theorem 4.5. Thus $\tilde Q(\tilde u, \tilde u) \le 0$, or
\[ \tilde Q(\tilde u, \tilde u) = \hat Q(\hat u, \hat u) + [\check u, \check u]_p \le 0. \]
Then
\[ -2(\beta/\alpha + 1)\,u'(0)u(0) + \|\varphi(\hat u)\|^p \le 0. \]
Hence $\varphi(\hat u) = u'(0)u(0)\,f$, where $f$ satisfies
\[ -2(\beta/\alpha + 1)\,u'(0)u(0) + \|f\|^p\,|u'(0)u(0)|^p \le 0. \]
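The semi-inner product $[f, g]_p$ introduced at the start of this section can be explored numerically. The sketch below is only an illustration under stated assumptions: it approximates the integral by a midpoint Riemann sum, it takes the complex conjugate on the second argument (the usual convention for complex-valued functions in $L^p$, assumed here because the printed formula does not show it), and the names `gsip`, `norm` and the sample functions are hypothetical. It checks that $[f, f]_p$ is real, that the form is homogeneous in its first argument, and that $|[f, g]_p| \le \|f\|\,\|g\|^{p-1}$.

```python
import cmath
import math


def gsip(f, g, p: float, n: int = 2000) -> complex:
    """Midpoint-rule approximation of int_0^{2 pi} f * conj(g) * |g|**(p-2) dx."""
    h = 2.0 * math.pi / n
    total = 0j
    for k in range(n):
        x = (k + 0.5) * h
        gx = complex(g(x))
        if gx != 0:  # the integrand vanishes where g does (and avoids 0**negative)
            total += complex(f(x)) * gx.conjugate() * abs(gx) ** (p - 2.0) * h
    return total


def norm(f, p: float, n: int = 2000) -> float:
    """||f|| = [f, f]_p ** (1/p), computed on the same grid as gsip."""
    return gsip(f, f, p, n).real ** (1.0 / p)


# Hypothetical sample functions on [0, 2 pi]; p = 3 is an arbitrary choice.
f = lambda x: cmath.exp(1j * x) + 0.5
g = lambda x: complex(math.cos(2.0 * x), math.sin(x))
p = 3.0

lhs = abs(gsip(f, g, p))
rhs = norm(f, p) * norm(g, p) ** (p - 1.0)
print("|[f,g]_p| <= ||f|| ||g||^(p-1):", lhs <= rhs + 1e-12)
```

Because both sides use the same quadrature weights, the bound is the discrete Hölder inequality and holds for the sums themselves, not only in the limit of a fine grid.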
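Theorem 6.1. The operator $A$ is a p symmetric operator in the Banach space $L^{p_0}[0, 2\pi]$.
Proof. Let $f, g \in D(L_0)$, then
\begin{aligned}
[Af, g]_p &= \int_0^{2\pi} i f''\,g\,|g|^{p-2}\,dx = \int_0^{2\pi} i f''\,g\,(p-2)\left(\int_0^{|g|} \alpha^{p-3}\,d\alpha\right) dx \\
&= \int_0^{2\pi} i f''\,g\,(p-2) \int_0^\infty \chi_{[0,|g|]}(\alpha)\,\alpha^{p-3}\,d\alpha\,dx \\
&= (p-2) \int_0^\infty \alpha^{p-3} \int_0^{2\pi} i g\,\chi_{[|g|>\alpha]}(x)\,df'\,d\alpha \\
&= (p-2) \int_0^\infty \alpha^{p-3} \left(-\int_0^{2\pi} i g'\,\chi_{[|g|>\alpha]}(x)\,df\right) d\alpha \\
&= (p-2) \int_0^\infty \alpha^{p-3} \int_0^{2\pi} f\,(-i g'')\,\chi_{[|g|>\alpha]}(x)\,dx\,d\alpha \\
&= -(p-2) \int_0^{2\pi} \int_0^\infty \alpha^{p-3} f\,(i g'')\,\chi_{[|g|>\alpha]}(x)\,d\alpha\,dx \\
&= -\int_0^{2\pi} f\,i g'' \int_0^{|g''|} d\alpha^{p-2}\,dx = -\int_0^{2\pi} f\,i g''\,|g''|^{p-2}\,dx = -[f, Ag]_p,
\end{aligned}
where $\chi_E(x)$ denotes the characteristic function of $E$. Hence the operator $A$ is a p symmetric operator.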
Corollary 6.1. $\mathrm{Re}\,[Au, u]_p = 0$ and $L_0$ is a p dissipative operator in $L^{p_0}[0, 2\pi]$.
Theorem 6.2. $(H^+/G(L_0), \hat Q_+)$ and $(H^-/G(L_0), -\hat Q_-)$ are GSIP spaces.
Theorem 6.3. In $X = L^{p_0}[0, 2\pi]$, $1 < p_0 < \infty$, consider the Schrödinger operator
\[ L_0 f = i f'' - f, \quad D(L_0) = \{f : f, f', f'' \in X,\ f(0) = f(2\pi),\ f'(0) = f'(2\pi)\}. \]
If the maximal dissipative extension of $L_0$ is $L$, then there is a one to one correspondence between the operator $L$ and the maximal negative subspace $\tilde N$ of $\tilde H$, and
\[ Lu = i u'' - u + u'(0)u(0)\,f, \]
\[ D(L) = \{u \mid u, u', u'' \in X,\ \alpha u'(2\pi)u(2\pi) + \beta u'(0)u(0) = 0,\ |\beta| \le |\alpha|\}, \]
where $f \in X$ satisfies the following:
\[ -2(\beta/\alpha + 1)\,u'(0)u(0) + |u'(0)u(0)|^p\,\|f\|^p \le 0. \]
By using Theorems 4.1, 4.3 and 5.1, we easily prove Theorems 6.2 and 6.3. Therefore we omit the proofs.
Remark 6.1. In $X = L^p[0, 2\pi]$, $1 < p < \infty$, $p \neq 2$, we may define the SIP (see [4, 15]) as follows:
\[ [f, g] = \frac{1}{\|g\|^{p-2}} \int_0^{2\pi} f\,|g|^{p-1}\,\mathrm{sgn}\,g\,dx, \quad g \in X. \]
Now consider the following operator $L_1$ in $X$:
\[ L_1 f = i f'' - \alpha f, \quad \alpha > 0, \]
\[ f \in D(L_1) = \{f : f, f', f'' \in X,\ f(0) = f(2\pi),\ f'(0) = f'(2\pi),\ 1 < p < \infty\}. \]
Similar to Theorem 6.1 and Corollary 6.1, we have that $L_1$ is a dissipative operator in $X$. Thus we similarly obtain Theorems 6.2 and 6.3.
Remark 6.2. Dissipative operators play an increasingly important role as research on nonselfadjoint operators proceeds. Many interesting initial value problems in partial differential equations are defined in Banach space. In the case considered here, we study the maximal dissipative extension representation of the operator in Banach space by introducing the GIIP space and studying the GSIP space. In particular, we apply the theory to the Schrödinger operator. The Schrödinger operator $-h\Delta + V$ (or $i\Delta - iV$) is considered, where $V(x)$ is the potential. If $V(x)$ is not $L^2$ integrable, it may be $L^p$ integrable or lie in $C[0, 2\pi]$ (see [20, 21]). Then the particles in the Schrödinger equation will undergo collision and scattering. In particular, for $i\Delta - iV$, $V(x)$ is a complex function in $L^p[0, 2\pi]$, and the particles undergo scattering. It is difficult to study scattering in quantum mechanics at present. In this paper we try to study one of the Schrödinger operators in $L^p[0, 2\pi]$, where $V(x)$ is an imaginary number. Perhaps it is a new method to study the scattering of the Schrödinger equation in Banach space. But we do not yet know how to connect the maximal dissipative extension representation of the Schrödinger operator with the scattering of the Schrödinger equation in Banach space. This work is ongoing. Moreover, research on the operator theory in GIIP space will be very meaningful.
Acknowledgement. This project is supported by the National Natural Science Foundation of China, the Science and Technology Foundation of Ministry of Machine-building Industry of China, and the Jiangsu Natural Science Foundation.
References
1. B. Nath: On a generalization of semi-inner product spaces. Math. J. Okayama Univ. 15, 1–6 (1971/1972)
2. D.K. Sen: Generalized p selfadjoint operators on Banach Spaces. Math. Japn. 27 (1), 151–158 (1982)
3. Wei Gouqing, Shen Youqing: The generalized p normal operators and hyponormal operators on Banach space. Chin. Ann. Math. B 88 (1), 70–79 (1987)
4. J.G. Stampfli: Roots of scalar operators. Proc. Am. Math. Soc. 13, 796–798 (1962)
5. P.V. Pethe, N.K. Thakare: Applications of Riesz representation theorem in semi-inner product space. Indian J. Pure & Appli. Math. 7 (9), 1024–1031 (1976)
6. P.R. Halmos: A Hilbert space problem book. Princeton, NJ: Van Nostrand, 1967
7. Tian Lixin, Liu Zengrong: The Schrödinger operator. Proc. Am. Math. Soc. 126 (1), 203–211 (1998)
8. Tian Lixin: The generalized indefinite inner product space. J. Jiangsu Univ. Sci. & Techno. 16 (1), 82–86 (1995)
9. J. Bognár: Indefinite inner product spaces. Ergeb. Math. Grenzgeb. vol. 78, Heidelberg: Springer-Verlag, 1974
10. Yan Shaozong: Operator theory in indefinite inner product space. Adv. in Science of China: Mathematics (Edited by Gu Chaohao, Wang Yuan) vol. 3, China: Science Press, 1990, pp. 99–131
11. Langer, H.: Spectral functions of definitizable operators in Krein space. Lecture Notes in Math. 948, Berlin–Heidelberg–New York: Springer-Verlag, 1984, pp. 1–46
12. Crandall, M.G., Phillips, R.S.: On the extension problem for dissipative operators. J. Funct. Anal. 2, 147–176 (1968)
13. Lumer, G., Phillips, R.S.: Dissipative operators in a Banach space. Pacific J. Math. 11 (2), 679–698 (1961)
14. Branges, L.D.: Krein spaces of analytic functions. J. Funct. Analy. 81 (2), 219–359 (1988)
15. Fleming, R.J., Jamison, J.E.: Adjoint abelian operators on Lp and C(K). Trans. Am. Math. Soc. 217, 87–98 (1976)
16. Yan Yin: Attractors and dimensions for discretizations of a weakly damped Schrödinger equation and a sine-Gordon equation. Nonlinear Anal. 20, 1417–1452 (1993)
17. Soffer, A., Weinstein, M.I.: Multichannel nonlinear scattering for nonintegrable equations. J. Diff. Eq. 98, 376–390 (1992)
18. Nakao Hayashi: The initial value problem for the derivative nonlinear Schrödinger equation in the energy space. Nonlinear Anal. 20, 823–833 (1993)
19. Helffer, B.: Semiclassical analysis for the Schrödinger operator and applications. Lecture Note in Math. 1336, New York: Springer-Verlag, 1988
20. Olsen, P.A.: Fractional integration, Morrey spaces and a Schrödinger equation. Comm. Part. Diff. Eq. 20, 2005–2055 (1995)
21. Hoffmann-Ostenhof, M., Hoffmann-Ostenhof, T.: Interior Hölder estimate for solutions of Schrödinger equations and the regularity of nodal sets. Comm. Part. Diff. Eq. 20, 1241–1273 (1995)
22. Simon, B., Zhu, Y.: The Lyapunov exponents for Schrödinger operators with slowly oscillating potentials. J. Funct. Anal. 140, 541–556 (1996)
23. Kuksin, S.B.: Growth and oscillations of solutions of nonlinear Schrödinger equation. Commun. Math. Phys. 178, 265–280 (1996)
24. Tian Lixin: Spectra and inequality of generalized p selfadjoint operators. J. Jiangsu Univ. Sci. & Tech. 12 (4), 121–127 (1991)
25. Tian Lixin, Lu Dianchen: The property of nonwandering operators. Appl. Math. Mech. 17 (2), 155–162 (1996)
26. Tian Lixin, Lu Dianchen: The nonwandering operators in infinite dimensional linear space. Acta Mathematica Scientia 15 (4), 455–460 (1995)
27. Tian Lixin: Properties of generalized semi-inner product space and p normal operator. J. Jiangsu Univ. Sci. & Tech. 9 (4), 96–106 (1988)
28. Tian Lixin: The generalized p normal operator in Banach space. J. Jiangsu Univ. Sci. & Tech. 8 (1), 103–109 (1987)
29. Sansuc, J.-J., Tkachenko, V.: Spectral parametrization of nonselfadjoint Hill's operators. J. Diff. Eq. 125, 366–384 (1996)
30. Tkachenko, V.: Spectra of nonselfadjoint Hill's operators and a class of Riemann surfaces. Ann. Math. 143, 181–231 (1996)
31. Lumer, G.: Semi-inner product spaces. Trans. Am. Math. Soc. 100, 29–43 (1961)
32. Berkson, E.: Some types of Banach space, Hermitian operators and Bade functionals. Trans. Am. Math. Soc. 116, 376–385 (1975)
33. Giles, J.R.: Classes of Semi-inner product space. Trans. Am. Math. Soc. 129 (3), 436–446 (1967)
34. Pethe, P.V., Thakare, N.K.: Applications of the projection theorem and some results. Indian J. Pure & Appli. Math. 8, 898–902 (1977)
35. Unni, K.R., Puttamadaiah, C.: On orthogonality in semi-inner product space. Tsukuba J. Math. 5 (1), 15–19 (1981)
36. Antoine, J.-P., Gustafson, K.: Partial inner product and semi-inner product spaces. Adv. in Math. 41 (3), 281–300 (1981)
37. G. Lumer: Isometries of Orlicz spaces. Bull. Am. Math. Soc. 68 (1), 28–30 (1962)
38. Stampfli, J.G.: Adjoint abelian operators on Banach space. Canad. J. Math. 21, 505–512 (1969)
39. Faulkner, G.D.: Representation of linear functions in Banach space. Rocky Mountain J. Math. 7 (4), 789–792 (1977)
40. Torrance, E.: Strictly convex spaces via semi-inner product spaces orthogonality. Proc. Am. Math. Soc. 76, 108–110 (1970)
41. Puttamadaiah, C., Huche, G.: On generalized adjoint abelian operators on Banach spaces. Indian J. Pure & Appli. Math. 17 (7), 919–924 (1986)
42. Phillips, R.S.: Dissipative operators and hyperbolic system of partial differential equations. Trans. Am. Math. Soc. 90, 193–254 (1959)
43. Olubummo, A., Phillips, R.S.: Dissipative ordinary differential operators. J. Math. Mech. 14 (6), 929–949 (1965)
44. Phillips, R.S., Sarason, L.: Singular symmetric positive first order differential operators. J. Math. Mech. 15 (2), 235–271 (1966)
45. Tian Lixin, Xu Zhenyuan: The research of longtime dynamics behavior in weakly damped formed KdV equation. Appli. Math. Mech. 18 (10), 1021–1028 (1997)
46. Godefroy, G., Shapiro, J.H.: Operators with dense, invariant, cyclic vector manifolds. J. Funct. Anal. 98, 229–269 (1991)
47. Herrero, D.A.: Hypercyclic operator and chaos. J. Operator Theory 28, 93–103 (1992)
48. Herrero, D.A.: Triangular operators. Bull. Lond. Math. Soc. 23, 513–554 (1991)
49. Huijun Yang: Wave packet Chaos. Nonli. World 1, 1–21 (1994)
50. Yosida, K.: Functional analysis. Berlin–Heidelberg–New York: Springer-Verlag, 1965
51. Shinsen Chang, Yuqing Chen, Byung Soo Lee: On the semi-inner product in locally convex spaces. Int. J. Math. & Math. Sci. 20 (2), 219–224 (1997)
Communicated by H. Araki
Commun. Math. Phys. 201, 549 – 576 (1999)
Communications in
Mathematical Physics © Springer-Verlag 1999
A Time-Dependent Theory of Quantum Resonances*,**
M. Merkli, I. M. Sigal
Department of Mathematics, University of Toronto, Toronto, ON, M5S 3G3, Canada.
Received: 13 May 1998 / Accepted: 14 September 1998
Abstract: In this paper we further develop a general theory of metastable states resulting from perturbation of unstable eigenvalues. We apply this theory to many-body Schrödinger operators and to the problem of quasiclassical tunneling.

* Research on this paper is supported by NSERC under Grant NA7901.
** This work is part of the first author's PhD requirement.

1. Introduction
Though the notion of quantum resonance is one of the central notions in physics, the mathematical theory of this phenomenon is still in its early stages. Usually, the resonance is defined in terms of poles of the S-matrix or Green's function, bumps in the scattering cross-section, or solutions of the Schrödinger equation with certain boundary conditions at infinity, while its physical picture is that of a metastable state. It is the latter picture that still is very poorly understood and to which our paper aims to make its modest contribution.
We consider in this paper a self-adjoint operator $H_0$ on a Hilbert space $\mathcal H$, such that $H_0$ has a (possibly degenerate) eigenvalue $E_0$ which is embedded in the continuous spectrum of $H_0$. We perturb $H_0$ by a symmetric operator $W$ such that the operator $H \equiv H_0 + W$ is self-adjoint, and study the Schrödinger evolution governed by $H$:
\[ i\partial_t \psi = H\psi. \tag{1.1} \]
We assume that initial conditions are spectrally localized (with respect to $H$) in a neighbourhood, $\Delta$, of $E_0$:
\[ \psi_0 \equiv \psi|_{t=0} \in \mathrm{Ran}\,E_\Delta(H). \tag{1.2} \]
Here, $E_\Delta(\lambda)$ is the characteristic function of an interval $\Delta$.
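To make this setting concrete, the following toy computation may help; it is a sketch under explicit assumptions and not the authors' construction. It discretizes a Friedrichs-type model: one level $E_0 = 0$ coupled with strength `kappa` to `n_levels` equally spaced levels on $[-1, 1]$ that stand in for the continuous spectrum. For this hypothetical model the golden-rule rate is $\pi\kappa^2$, and the survival probability $|\langle\varphi, \psi(t)\rangle|^2$ is compared with $e^{-2\pi\kappa^2 t}$. All names and parameter values are illustrative, and the time stepping is a plain fourth-order Runge-Kutta scheme.

```python
import math


def survival_probability(kappa: float, t_final: float,
                         n_levels: int = 200, dt: float = 0.05) -> float:
    """|<phi, psi(t_final)>|^2 for a discretized Friedrichs-type toy model.

    State: amplitude a on the discrete level (E0 = 0) and amplitudes b[k]
    on band levels omega_k in [-1, 1]; coupling v = kappa * sqrt(spacing).
    """
    delta = 2.0 / n_levels
    omegas = [-1.0 + (k + 0.5) * delta for k in range(n_levels)]
    v = kappa * math.sqrt(delta)

    def rhs(a, b):
        # i d/dt (a, b) = H (a, b), written as d/dt = -i H
        da = -1j * v * sum(b)
        db = [-1j * (w * bk + v * a) for w, bk in zip(omegas, b)]
        return da, db

    a = 1.0 + 0.0j
    b = [0j] * n_levels
    for _ in range(round(t_final / dt)):
        k1a, k1b = rhs(a, b)
        k2a, k2b = rhs(a + 0.5 * dt * k1a, [x + 0.5 * dt * y for x, y in zip(b, k1b)])
        k3a, k3b = rhs(a + 0.5 * dt * k2a, [x + 0.5 * dt * y for x, y in zip(b, k2b)])
        k4a, k4b = rhs(a + dt * k3a, [x + dt * y for x, y in zip(b, k3b)])
        a += dt * (k1a + 2 * k2a + 2 * k3a + k4a) / 6.0
        b = [x + dt * (y1 + 2 * y2 + 2 * y3 + y4) / 6.0
             for x, y1, y2, y3, y4 in zip(b, k1b, k2b, k3b, k4b)]
    return abs(a) ** 2


kappa, t_final = 0.1, 20.0
p_numeric = survival_probability(kappa, t_final)
p_golden = math.exp(-2.0 * math.pi * kappa**2 * t_final)
print(f"numerical: {p_numeric:.3f}, golden-rule estimate: {p_golden:.3f}")
```

The agreement is only approximate: the finite band shifts the rate at order $\kappa^4$, and the discretization is meaningful only for times well below the recurrence time $2\pi/\text{spacing}$.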
One expects that the eigenvalue $E_0$ of $H_0$ is unstable under the perturbation $W$, for $W$ sufficiently small. Our goal is to understand how this instability manifests itself in the evolution given by (1.1)–(1.2). An important additional structure we need is given by a self-adjoint operator $A$ on $\mathcal H$, which measures the degree of localization of vectors from $\mathcal H$. Namely, vectors in the domain of $\langle A\rangle^\alpha \equiv (1 + |A|^2)^{\alpha/2}$ for $\alpha$ sufficiently large are said to be well localized. In other words, by localization of $\psi \in \mathcal H$, we mean its localization in the spectral representation of $A$. An orbit $\psi(t) \in \mathcal H$ is called dispersive, or locally decaying, if $\|\langle A\rangle^{-\alpha}\psi(t)\|_{\mathcal H}$ vanishes as $t \to \infty$. An obvious example of an operator $A$ is the coordinate, $x$ (or more precisely, $|x|$), if $\mathcal H$ is an $L^2$-space of functions of $x$. However, sometimes other choices are more convenient, especially the generator of dilation transformations.
Let $P$ be the orthogonal projector of $H_0$ onto $\mathrm{Null}(H_0 - E_0)$, $\bar P = \mathbf 1 - P$, and let $\bar H$ be the reduced Hamiltonian $\bar H = \bar P H \bar P$. We measure the smallness of the perturbation $W$ by the parameter $\kappa \equiv \|\langle A\rangle^\alpha W P\|$. We show that $\psi$ can be written in the form
\[ \psi(t) = \psi_{\mathrm{res}}(t) + \psi_{\mathrm{disp}}(t), \tag{1.3} \]
where $\psi_{\mathrm{res}}(t) = (\mathbf 1 + O(\kappa))P\psi(t)$, $\psi_{\mathrm{disp}}(t)$ is dispersive, i.e. $\|\langle A\rangle^{-\alpha}\psi_{\mathrm{disp}}(t)\|_{\mathcal H} \to 0$ as $t \to +\infty$, and
\[ P\psi(t) = e^{-i\lambda t}P\psi(0) + O(\kappa^{1-4\beta}(t+1)^{-\beta}), \tag{1.4} \]
for some bounded operator $\lambda$ satisfying
\[ \mathrm{Re}\,\lambda = E_0 P + O(\kappa) \quad \text{and} \quad \mathrm{Im}\,\lambda = -\Gamma + O(\kappa^3), \tag{1.5} \]
where $\Gamma = \pi\langle W\bar P\,\delta(\bar H - E_0)\,\bar P W\rangle_P > 0$, and we used the notation $\langle A\rangle_P \equiv PAP|_{\mathrm{Ran}\,P}$ for an operator $A$. The delta-function is defined in (A.24).
Equations (1.3)–(1.4) paint the following picture of the resonance behaviour (see Remark 1 of Sect. 2.2 for a technical discussion). A system which is initially localized in a small spectral interval around an unstable eigenvalue radiates energy/probability to infinity, approaches the unstable unperturbed state, stays near it for a period of time of order $1/\Gamma$, but then eventually loses all the probability to infinity. The decay law (1.5) is given by the celebrated Fermi Golden Rule. We apply the result above to Schrödinger operators and in particular to $N$-body systems and to the problem of tunneling.
Remarks.
1. Equations (1.3)–(1.5) imply that $H$ has no eigenvalues in the interval $\Delta$, i.e. that the eigenvalue $E_0$ of $H_0$ is unstable under the perturbation $W$, and that no new eigenvalues emerge. If one is interested in this result alone, a stationary approach would give a simpler proof (cf. [AHS, Sig3] which contain results in this direction).
2. The perturbation parameter $\kappa$ can be small even when $W$ is large. Exactly this happens in the tunneling resonance problem. In this case $W$ is large but localized in a domain in which $P$ is very small. Our result establishes a relation between the tunneling resonances and the Fermi Golden Rule not observed previously.
3. The bounded operator $\lambda$ describes the resonances of $H$ splitting out of the eigenvalue $E_0$ of $H_0$ to the order $O(\kappa^3)$. One can extend the proof of (1.3)–(1.5) in order to detect the resonance behaviour and obtain resonances to an arbitrary order. This way, one can replace the condition $\Gamma > 0$ by the condition which essentially states that the imaginary
part of the resonance in question does not vanish at some order (see discussions after Eq. (1.11) and in Sect. 3.1).
4. The assumption $\Gamma > 0$ is satisfied generically for one-body Schrödinger operators [AHS]. The same is expected to hold in a much more general context. Under stronger direct assumptions on $H$ this assumption can be essentially removed (see Remark 3 and discussions in Sects. 2.1–2.3 and 3.1).
Strategy. The key tool used in the proof is a linear variant of the Liapunov-Schmidt projection method in the theory of stable and unstable manifolds, or a time-dependent variant of the Feshbach method of perturbation theory. Namely, we project Eq. (1.1) along the subspaces $\mathrm{Ran}\,P$ and $\mathrm{Ran}\,\bar P$ to obtain the new equations
\[ i\partial_t P\psi = (E_0 + \langle W\rangle_P)P\psi + PW\bar P\psi, \tag{1.6} \]
and
\[ i\partial_t \bar P\psi = \bar H\bar P\psi + \bar P W P\psi. \tag{1.7} \]
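The algebra behind (1.6)–(1.7) can be checked on a small matrix. In the sketch below (illustrative only: the 3×3 matrices, the vector and the helper names are hypothetical, and a finite matrix has no continuous spectrum) $H_0$ is diagonal with $P$ projecting onto its first basis vector, so $PH_0 = E_0P$ and $\bar P H_0 P = 0$. The code verifies that $PH\psi$ and $\bar PH\psi$ split exactly as the right-hand sides of (1.6) and (1.7).

```python
def matvec(m, x):
    """Matrix-vector product for nested lists."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]


def add(*vectors):
    """Componentwise sum of several vectors."""
    return [sum(components) for components in zip(*vectors)]


def max_abs_diff(x, y):
    return max(abs(xi - yi) for xi, yi in zip(x, y))


E0 = 0.5
H0 = [[E0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 2.0]]   # eigenvalue E0, vector e_1
W = [[0.1, 0.2, -0.3], [0.2, 0.0, 0.4], [-0.3, 0.4, -0.2]]  # symmetric perturbation
H = [[h + w for h, w in zip(rh, rw)] for rh, rw in zip(H0, W)]
P = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
Pbar = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

psi = [1.0 + 2.0j, -0.5j, 0.3 + 0.0j]
P_psi, Pbar_psi = matvec(P, psi), matvec(Pbar, psi)

# (1.6): P H psi = (E0 + P W P) P psi + P W Pbar psi
lhs_16 = matvec(P, matvec(H, psi))
rhs_16 = add([E0 * c for c in P_psi],
             matvec(P, matvec(W, P_psi)),
             matvec(P, matvec(W, Pbar_psi)))

# (1.7): Pbar H psi = (Pbar H Pbar) Pbar psi + Pbar W P psi
lhs_17 = matvec(Pbar, matvec(H, psi))
rhs_17 = add(matvec(Pbar, matvec(H, Pbar_psi)),
             matvec(Pbar, matvec(W, P_psi)))

residual_16 = max_abs_diff(lhs_16, rhs_16)
residual_17 = max_abs_diff(lhs_17, rhs_17)
print(residual_16, residual_17)
```

Both residuals vanish up to rounding, which reflects only that $P$ commutes with $H_0$; the analytic content of the paper lies in controlling the coupled equations, not in this identity.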
We will not use the second equation in an essential way; instead, we remark the following: in order to control $\bar\psi \equiv \bar P\psi$ uniformly in $\kappa$ (or $W$), we would like to use a local decay estimate for $e^{-i\bar H t}\bar\psi(0)$. The latter however does not hold for arbitrary initial conditions $\bar\psi_0 \equiv \bar\psi(0)$, but in general only for initial conditions from certain spectral intervals with respect to $\bar H$, see Condition (A4) and Sect. 4. Now, even if $\psi_0$ is from an appropriate spectral interval for $H$, the initial condition $\bar\psi_0$ may not be so for $\bar H$. To surmount this problem, we observe that since $H$ is very close to $PHP + \bar P H \bar P$ and since $\psi_0 \in \mathrm{Ran}\,E_\Delta(H)$, the portion of $\bar\psi_0$ orthogonal to $\mathrm{Ran}\,g_{\Delta'}(\bar H)$ is very small. Here, $\Delta'$ is an interval containing $\Delta$, and $g_{\Delta'}(\lambda)$ is a smoothed out characteristic function of the interval $\Delta'$ (in particular, $g_{\Delta'}(\lambda) = 1$, $\lambda \in \Delta'$). So we can express $\bar\psi$ in terms of the state
\[ \psi_d \equiv g_{\Delta'}(\bar H)\bar\psi \tag{1.8} \]
and $P\psi$. This ingenious idea was used first by [SW3]. It yields the following representation for $\bar\psi$:
\[ \bar\psi = B'P\psi + (\mathbf 1 + B')\psi_d, \tag{1.9} \]
where $B' = O(\kappa)$ (i.e. $B'$ is a bounded operator with norm $\|B'\| \le C\kappa$). Since $\psi = P\psi + \bar\psi$, the full solution is of the form (1.3) with $\psi_{\mathrm{res}}(t) = BP\psi(t)$ and $\psi_{\mathrm{disp}} = B\psi_d$, where $B = \mathbf 1 + B' = \mathbf 1 + O(\kappa)$.
With Eqs. (1.6) and (1.8)–(1.9) in place, the stage for analysis is set. The component $\psi_d$ satisfies the local decay property uniformly in $\kappa$ by Assumption (A4), which is verified for a variety of systems. One important step must be made though, before we embark on estimations of $P\psi(t)$ as a solution to Eq. (1.6): we iterate Eqs. (1.6) and (1.7) (or the resulting equation for $\psi_d$). This is the only place where Eq. (1.7) is used. The iteration is a rather subtle affair and it takes us to the equation
\[ i\partial_t P\psi = \lambda P\psi + f, \tag{1.10} \]
where $\lambda$ is of the form (1.5) and $f$ satisfies the estimate
\[ r(t) \equiv -i\int_0^t e^{-i\lambda(t-s)} f(s)\,ds = O(\kappa), \quad t \ge 0. \tag{1.11} \]
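For orientation, (1.10) and (1.11) fit together through Duhamel's formula. The following is the standard variation-of-constants computation, written out under the assumption (stated above) that $\lambda$ is a bounded, time-independent operator on $\mathrm{Ran}\,P$:

```latex
\begin{aligned}
\frac{d}{ds}\Bigl(e^{i\lambda s}P\psi(s)\Bigr)
  &= e^{i\lambda s}\bigl(i\lambda P\psi(s) + \partial_s P\psi(s)\bigr)
   = -i\,e^{i\lambda s} f(s)
   \qquad\text{since } \partial_s P\psi = -i\lambda P\psi - i f, \\
P\psi(t)
  &= e^{-i\lambda t}P\psi(0) - i\int_0^t e^{-i\lambda(t-s)} f(s)\,ds
   = e^{-i\lambda t}P\psi(0) + r(t).
\end{aligned}
```

So a bound on $r(t)$ is exactly what is needed to turn (1.10) into a statement of the form (1.4).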
The purpose of this iteration is to pick up the leading imaginary term in the bounded operator $\lambda$ multiplying the function $P\psi(t)$. This leading term is of the second order by Assumption (A5). As a result one needs only one iteration. In general, one should iterate until such a term appears (see discussion in Sect. 3.1). In that case the bounded operator $\lambda$ yields the resonances of $H$ generated by the perturbation of $E_0$ to a higher order. Equations (1.10)–(1.11) as well as a priori estimates for the dispersive part $\psi_d$ form a foundation of the proof of the result outlined above.
In the analysis above, $\mathrm{Ran}\,P$ plays the role of a stable manifold, so the function $\psi_{\mathrm{res}}(t)$ describes the motion along this manifold and the component $\bar\psi$ is expected to decay, at least locally. (This local decay or radiation to infinity differentiates between infinite and finite dimensional dynamical systems.) However, there are two major differences with the standard case. The first one pertains to the peculiarity of the resonance problem: The stable manifold is not really stable. It behaves as a stable manifold for long time intervals – the life time of the resonance in question – but eventually it disintegrates itself. The second difference is due to the fact that our problem is linear. This allows us to seek a priori estimates for the dispersive part $\bar\psi$, rather than to use the equation for it, Eq. (1.7).
History. Since the early days of quantum physics, the resonance phenomena occupied a central place (see e.g. [BW, KP, S, WW], and [LL] for a textbook discussion). The mathematical theory of resonances could probably be traced to the work of V. Weisskopf and E. Wigner [WW]; in its modern form, it was laid down by B. Simon [Sim1] who used the theory of dilation analytic Hamiltonians due to J. Aguilar and J. M. Combes and E. Balslev and J. M. Combes ([AC, BC]). This approach was further developed in [Sim2, Sig1, Hun2, Ger, HeSj, HM]. Details and extensive references can be found in [HiSig]. It requires the potential in question to have some analyticity properties at least in a neighbourhood of infinity. A theory dispensing with the latter condition was initiated in [Ort].
So far the theory developed was a stationary one, despite the fact that the physical picture (but not the physical definitions!) was that of a metastable state. The time-dependent theory was initiated in works of E. Skibsted and W. Hunziker ([Sk1,2, Hu2]), and a space-time and the phase-space-time and variational analysis was given in [GS] and [PF], respectively. A new powerful approach was suggested by A. Soffer and M. Weinstein [SW3], who also obtained a rather detailed space-time description of evolution of metastable (resonance) states in the one-body Schrödinger case and for a Schrödinger particle coupled to a massive quantum field.
Our paper generalizes the result of [SW3] to many-body Schrödinger operators (and degenerate eigenvalues). Besides, we also treat the case of quasiclassical resonances, not considered in [SW3]. Though our approach follows the same general line as that of [SW3], we had to introduce some essential changes right at the beginning in order to make it applicable to a considerably wider class of systems.
The approach outlined above was introduced in [SW1] (see also [SW2] and [BP]). The latter work was further improved and couched in terms of the stable-unstable manifold theory in [PW]. This approach is in fact what is known in physics as the (Feshbach) projection method (with the projection operator $P$). It is usually applied to the stationary Schrödinger equation $(H - E)\psi = 0$, while we apply it to the Schrödinger equation $(H - i\partial_t)\psi = 0$. It serves also as a starting point to a renormalization group construction in a recent work [BFS1-3].
Notation. We use the following notation besides the one introduced above: $E_\Delta(\lambda)$ denotes the characteristic function of an interval $\Delta$, so that $E_\Delta(H)$ is the spectral projector of
$H$ corresponding to the interval $\Delta$. The length of $\Delta$ is written as $|\Delta|$. A smoothened characteristic function of an open bounded interval $\Delta \subset \mathbb R$ is denoted as $g_\Delta$, i.e. $g_\Delta \in C_0^\infty(\Delta_1)$, where $\Delta_1$ is a slightly bigger interval than $\Delta$, and $g_\Delta(\lambda) = 1$ on $\Delta$. We set $\bar g_\Delta \equiv 1 - g_\Delta$. The norm in the Hilbert space $\mathcal H$ is denoted as $\|\cdot\|_{\mathcal H}$, and $\langle\cdot,\cdot\rangle$ is its inner product. The expectation value of an operator $B$ in the state $\psi$ is written as $\langle B\rangle_\psi \equiv \langle\psi, B\psi\rangle$. Moreover, $\langle B\rangle_P$ stands for the operator $PBP|_{\mathrm{Ran}\,P}$ on the space $\mathrm{Ran}\,P$. The domain of an (unbounded) operator $A$ on $\mathcal H$ is written as $D(A)$. For $t \in \mathbb R$, let $\langle t\rangle \equiv (1 + t^2)^{1/2}$, and $\langle A\rangle = (1 + A^*A)^{1/2}$, where $A^*$ denotes the adjoint. We also let $\mathrm{Re}\,A = (A + A^*)/2$, and $\mathrm{Im}\,A = (A - A^*)/(2i)$. For a self-adjoint operator $H$, we set $\delta(H - \lambda) = \frac{1}{\pi}\mathrm{Im}\,(H - \lambda - i0)^{-1}$, and $\mathrm{P.V.}(H - \lambda)^{-1} = \mathrm{Re}\,(H - \lambda - i0)^{-1}$, where the r.h.s. are assumed to be defined between appropriate Banach spaces.
Let $\mathcal L(\mathcal H)$ be the space of bounded linear operators on $\mathcal H$ with the standard norm $\|\cdot\|$. For a family $\{B_s\}$ of operators in $\mathcal L(\mathcal H)$ depending on the parameter $s \ge 0$, we write $B_s = O(s)$ if $\|B_s\| \le Cs$, where $C$ is a constant independent of $s$. For a complex-valued function $f(s)$, $f(s) = O(s)$ means $|f(s)| \le Cs$, and for $\varphi_s \in \mathcal H$, $\varphi_s = O(s)$ means $\|\varphi_s\|_{\mathcal H} = O(s)$. Finally, for notational convenience, $C$ will denote a generic strictly positive constant whose value can vary from expression to expression ($C$ is allowed to depend on $\alpha$, $\beta$, $|\Delta|$, see below, but not on $\kappa$ or $t$).

2. Main Results
2.1. Assumptions. We will work in the setting presented above, and where $\dim\mathrm{Ran}\,P \le \infty$. The operators $H_0$ and $H$ are assumed to be self-adjoint on the same domain, and $W$ to be symmetric. We also assume that there is a self-adjoint operator $A$ and a number $\alpha > 2$ such that:
(A1) $\|\langle A\rangle^\alpha P\| < \infty$,
(A2) the perturbation $W$ satisfies $\kappa \equiv \|\langle A\rangle^\alpha W P\| < \infty$,
(A3) the multi-commutators $\mathrm{ad}_A^{(k)}(H)$ are $H$-bounded, uniformly in $W$, $k = 1, \ldots, n$, where $n > \alpha + 1$,
(A4) the following local decay estimate holds for all $\varphi \in D(\langle A\rangle^\alpha)$ and $t \ge 0$:
\[ \|\langle A\rangle^{-\alpha} e^{-i\bar H t} g_{\Delta'}(\bar H)\bar P\varphi\|_{\mathcal H} \le C\langle t\rangle^{-\alpha}\|\langle A\rangle^\alpha \bar P\varphi\|_{\mathcal H}, \tag{2.1} \]
for some constant $C > 0$ independent of $W$,
(A5) the Fermi Golden Rule condition holds for $E_0$ in the sense that there is a $C_0 > 0$ such that the bounded, nonnegative operator $\Gamma \equiv \pi\langle W\,\delta(\bar H - E_0)\,\bar P W\rangle_P$ satisfies:
\[ \Gamma \ge C_0\kappa^2. \tag{2.2} \]
Remarks.
1. (A1)–(A3) are easily verified in the applications (see Sect. 2.3).
2. The uniformity clause in (A3) and (A4) is a restriction on the class of $W$'s allowed for a given $H_0$.
3. In Sect. 4, we derive Condition (A4) from the Mourre estimate. It is much easier to check the latter, as we demonstrate in examples below.
4. It is shown in the appendix that $\Gamma$ is well defined and satisfies $\|\Gamma\| \le C\kappa^2$ (see the remark after (A.24)). Hence (A5) gives
\[ C_0\kappa^2 \le \|\Gamma\| \le C\kappa^2. \tag{2.3} \]
5. If the Mourre estimate holds then one can show (see e.g. [Sig2]) that $\Gamma = \gamma\kappa^2 + O(\kappa^{7/3})$, where $\gamma = \pi\langle U\,\delta(\bar H_0 - E_0)\,\bar P U\rangle_P$, and $U = W/\kappa$.
6. Condition (A5), or its more precise form given in Remark 5, is conjectured to hold generically. For a more detailed discussion in the case of Schrödinger operators see Paragraph 2.3. In fact, Condition (A5) can be removed at the expense of requiring a larger $\alpha$ in Conditions (A1)–(A4) and a lengthier proof (see discussion in Sect. 3.1).
7. The smoothness Condition (A3) can be thought of as a $C^n(\mathbb R)$ property of the family $H(\theta) = U(\theta)^{-1}HU(\theta)$, where $U(\theta) = \exp(i\theta A)$, the dilation group.
2.2. Abstract result. We present our main result in the setting of a general Hilbert space (in Theorem 2.1). We treat the case of Schrödinger operators in Sect. 2.3.
Theorem 2.1. Let $\alpha > 2$ and $0 \le \beta < \min\{1/2, \alpha - 2\}$ and let $\psi$ be the solution of $i\partial_t\psi = H\psi$, with initial condition $\psi_0 \in \mathrm{Ran}\,E_\Delta(H) \cap D(\langle A\rangle^\alpha)$. Then there exists a constant $\kappa_0$ (depending on $\alpha$, $\beta$ and $|\Delta|$) such that for $\kappa < \kappa_0$ we have the expansion:
\[ \psi(t) = (\mathbf 1 + O(\kappa))P\psi(t) + \psi_{\mathrm{disp}}(t), \tag{2.4} \]
where $\psi_{\mathrm{disp}}(t)$ is a dispersive wave satisfying for $t \ge 0$:
\[ \|\langle A\rangle^{-\alpha}\psi_{\mathrm{disp}}(t)\|_{\mathcal H} \le C\left(\|\langle A\rangle^\alpha \bar P\psi_0\|_{\mathcal H}\,\langle t\rangle^{-\alpha} + \kappa^{1-2\beta}\langle t\rangle^{-\beta}\right), \tag{2.5} \]
and $P\psi(t)$ satisfies for $t \ge 0$:
\[ P\psi(t) = e^{-i\lambda t}P\psi_0 + O(\kappa^{1-4\beta}\langle t\rangle^{-\beta}). \tag{2.6} \]
Here $\lambda$ is a bounded linear operator on $\mathrm{Ran}\,P$ satisfying
\[ \mathrm{Re}\,\lambda = E_0P + PWP - PW\,\mathrm{P.V.}(\bar H - E_0)^{-1}\,\bar P WP + O(\kappa^3), \tag{2.7} \]
and
\[ \mathrm{Im}\,\lambda = -\Gamma + O(\kappa^3). \tag{2.8} \]
The terms on the r.h.s. of (2.7) and (2.8) are well defined. Corollary 2.2. Under the assumptions of Theorem 2.1, H has no eigenvalues in the interval 1. Remarks. 1. Equations (2.4)–(2.8) imply that though any orbit ψ(t) starting at ψ0 ∈ Ran E1 (H) is dispersive (i.e. locally decaying), for kP ψ0 kH small and for any R > 0, the state χ|A|≤R ψ(t) is close to the “stationary” state χ|A|≤R e−iλt P ψ0 in a time interval of the order κ1 ln(κ−1 ). 2. Our analysis yields a more detailed information about P ψ(t) and ψdisp (t) than the one given in Theorem 2.1. The dispersive wave satisfies
A Time-Dependent Theory of Quantum Resonances
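$$\|\langle A\rangle^{-\alpha}\psi_{disp}(t)\|_{\mathcal H}\le C\|\langle A\rangle^{\alpha}\bar P\psi_0\|_{\mathcal H}\left(\langle t\rangle^{-\alpha}+\langle t\rangle^{-\beta}\kappa^{2-2\beta}\right)+C\|P\psi_0\|_{\mathcal H}\langle t\rangle^{-\beta}\kappa^{1-2\beta}.$$
Set $P\psi(t)=e^{-i\lambda t}P\psi(0)+r(t)$. Then we have in particular
$$\|r(t)\|_{\mathcal H}\le C\left(\|\langle A\rangle^{\alpha}\bar P\psi_0\|_{\mathcal H}\kappa^{1-2\beta}+\|P\psi_0\|_{\mathcal H}\kappa^{1-4\beta}\right)\langle t\rangle^{-\beta}.\tag{2.9}$$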
3. There are various versions of Theorem 2.1 in which the condition $\psi_0\in\operatorname{Ran}E_\Delta(H)\cap D(\langle A\rangle^\alpha)$ is either relaxed or modified (e.g. to $\psi_0\in\operatorname{Ran}P$).

2.3. Schrödinger operators. In this section, we choose our Hilbert space to be $\mathcal H=L^2(X)\equiv L^2$ (with norm $\|\cdot\|_2$), where $X$ is a finite dimensional inner product space, the configuration space of the system in question. The Hamiltonian is given by
$$H_0=-\Delta+V,\tag{2.10}$$
where $\Delta$ is the Laplacian on $X$, and $V$ is a real function on $X$ called the potential. Our choice of the operator $A$ is the dilation generator
$$A=\tfrac12(x\cdot p+p\cdot x),$$
where $p=-i\nabla$, and the dot product is a coupling of $X$ and $X'$. We assume here that $W=\kappa'U$, where $\kappa'$ is a real number, and $U:X\to\mathbb{R}$, $\dim\operatorname{Ran}P<\infty$. Setting $\kappa\equiv\|\langle A\rangle^\alpha WP\|$ as before, we have $\kappa=\kappa'\|\langle A\rangle^\alpha UP\|$, and $\kappa\to0$ is equivalent to $\kappa'\to0$, provided $\|\langle A\rangle^\alpha UP\|<\infty$ (see (2.11)). Making the perturbation small means here making $\kappa'$ small. We assume that for some $n>\alpha+1$:
(S1) $V,U\in C^n$, with bounded derivatives,
(S2) (i) $(x\cdot\nabla)^kV$ is $H$-bounded, $0\le k\le n$, (ii) $(x\cdot\nabla)^kU$ is $H$-bounded, $0\le k\le n$,
(S3) there exists a neighbourhood $\Delta_1$ of $E_0$ such that $E_{\Delta_1}(H_0)\,i[H_0,A]\,E_{\Delta_1}(H_0)\ge\theta_1E_{\Delta_1}(H_0)+K$, where $\theta_1>0$ and $K$ is a compact operator on $L^2$,
(S4) the Fermi Golden Rule Condition holds in the sense that $\Gamma=\pi(\kappa')^2\langle U\delta(\bar H_0-E_0)\bar PU\rangle_P>0$ ($\Gamma$ is positive definite).
Conditions (S2) with $k=0,1$ and (S3) imply that the projection $P$ satisfies the following two estimates:
$$\|\langle A\rangle^\alpha UP\|<\infty,\tag{2.11}$$
and for any multi-indices $m_1$ and $m_2$ with $|m_{1,2}|\le n$:
$$\|x^{m_1}p^{m_2}P\|<\infty.\tag{2.12}$$
Indeed, proceeding as in [HS1,CFKS], one can show that there is a $\delta>0$ such that $\|e^{\delta|x|}P\|<\infty$. Using then $(p^2+V)P=E_0P$ and Condition (S1), it is easily shown that $\|e^{\delta|x|}p^mP\|<\infty$ for $|m|\le n$, from which (2.11) and (2.12) follow.
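For orientation, the commutator entering the Mourre estimate (S3) can be computed explicitly for the dilation generator. The following is a standard computation, sketched here under the conventions of this section ($p=-i\nabla$, $H_0=p^2+V$) and assuming $V$ is differentiable; it indicates why the expressions $(x\cdot\nabla)^kV$ appear in Condition (S2).

```latex
\begin{align*}
[p^2,\, x\cdot p] &= -2i\,p^2, \qquad [V,\, x\cdot p] = i\,x\cdot\nabla V,\\
i[H_0, A] &= i[p^2+V,\, A] = 2p^2 - x\cdot\nabla V
           = 2(H_0 - V) - x\cdot\nabla V .
\end{align*}
```

Since $A$ differs from $x\cdot p$ only by a constant, the two operators give the same commutators.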
Remark. In the one-body case (see e.g. the problem of tunneling in Sect. 2.3.2), Condition (S4) was shown to hold generically at least for $E_0$ simple ([AHS], for more details see below). The latter is still a conjecture in the many-body case. The only general results in that case are those of [AHS]. Translated into our context they state, under some additional conditions, that $\delta(\bar H_0-E_0)\ne0$ and that, given a many-body Hamiltonian $H_0$ as above and an open set $\Omega\subset X$ (for the definition of $X$ see Sect. 2.3.1), there is a real potential $G\in C_0^\infty(\Omega)$ s.t. $\langle G\delta(\bar H-E_0)G\rangle_P>0$, at least for $E_0$ simple.

Theorem 2.3. Theorem 2.1 holds for the Schrödinger operator (2.10) if we replace Conditions (A1)–(A5) by (S1)–(S4).

The proof of Theorem 2.3 is given in Sect. 5. It consists in showing that the Conditions (S) imply the Conditions (A) in the case of Schrödinger operators. The essential part is to show that, starting from the Mourre estimate (S3) for $H_0$, we get a strong Mourre estimate for the reduced Hamiltonian $\bar H$, see Theorem 4.1, and therefore the local decay estimate (A4) for $\bar H$.

2.3.1. N-body systems. For $N$-body systems in the physical space $\mathbb{R}^d$, the configuration space is
$$X=\Big\{x=(x_1,\dots,x_N)\in\mathbb{R}^{dN}\ \Big|\ \sum_i m_ix_i=0\Big\}.$$
Here $m_i$ is the mass of particle $i$. The Hamiltonian $H_0$ is a Schrödinger operator (2.10) with $V_0$, an $N$-body potential:
$$V_0(x)=\sum_{i<j}V_{ij}(x_i-x_j),$$
where the two-body potentials $V_{ij}:\mathbb{R}^d\to\mathbb{R}$ vanish at infinity. In terms of the two-body potentials, (S2)(i) means that $(x\cdot\nabla)^kV_{ij}$ is $\Delta$-bounded, $0\le k\le n$. For example, we can take $(x\cdot\nabla)^kV_{ij}$ to be Kato potentials on $\mathbb{R}^d$, $0\le k\le n$. Similarly, (S2)(ii) is satisfied if $(x\cdot\nabla)^kU$ are Kato potentials on $\mathbb{R}^{dN}$, for $0\le k\le n$. If $E_0$ is separated from the thresholds of $H_0$, then $E_0$ is finitely degenerate, $\dim\operatorname{Ran}P<\infty$ (see [HS1,CFKS]), and the Mourre estimate holds for $H_0$ and $A$ if $\Delta_1$ is sufficiently small. Thus we obtain

Corollary 2.4. The conclusion of Theorem 2.1 holds under the assumptions on the potentials mentioned above, provided that $E_0$ is separated from the thresholds of $H_0$, and if the Fermi Golden Rule Condition holds.

2.3.2. Tunneling in quasiclassical approximation. We consider the following initial value problem on $L^2(\mathbb{R}^d)$:
$$i\hbar\partial_t\psi=H\psi,\tag{2.13}$$
$$\psi|_{t=0}\equiv\psi_0\in\operatorname{Ran}E_{\hbar\Delta}(H),\tag{2.14}$$
where $\hbar>0$ is considered to be a small parameter, $\Delta$ is an interval to be specified, lying at the bottom of the continuous spectrum of $H$, and $\hbar\Delta\equiv\{\hbar E\,|\,E\in\Delta\}$. Here the Schrödinger operator $H$ is given by
$$H=p^2+V,\qquad p=-i\hbar\nabla,\tag{2.15}$$
and $V$ is a volcano-shaped bounded potential with a local minimum at the origin, defined in the Conditions (T) below. We define a reference potential $V_0$, such that
(i) $V_0$ is $C^n(\mathbb{R}^d)$, and such that $(x\cdot\nabla)^kV_0$ is bounded, $0\le k\le n$,
(ii) $V_0(x)=V(x)$ in a neighbourhood $N$ of the origin, and $\inf_{x\in\mathbb{R}^d\setminus N}V_0(x)>V(0)$.
We set
$$H_0=p^2+V_0,\tag{2.16}$$
and therefore $H=H_0+W$ with $W=V-V_0$. For $\hbar$ small enough, $H_0$ has a unique (normalized) ground state $\varphi_0$ with energy $E_0$ separated from the continuous spectrum of $H_0$. Let $A\equiv(x\cdot p+p\cdot x)/(2\hbar)$ and $\kappa\equiv\|\langle A\rangle^\alpha W\varphi_0\|_2$. We now give conditions on the potential $V$, and then state the result of this section, Theorem 2.5.
(T1) $V\ge0$, $V\in C^n(\mathbb{R}^d)$, and $(x\cdot\nabla)^kV$ is bounded, $0\le k\le n$,
(T2) $V$ has a non-degenerate local minimum at $x=0$ and vanishes as $|x|\to\infty$,
(T3) there is a $\delta>0$ such that $\Omega_\delta\equiv\{x\in\mathbb{R}^d\,|\,V(x)<V(0)+\delta\}$ has a connected component $\Omega_\delta^{ext}$ containing $\infty$, on which $V(x)$ is non-trapping: $\forall x\in\Omega_\delta^{ext}$,
$$2(V(0)-V(x))-x\cdot\nabla V(x)\ge\theta,\quad\text{for some }\theta>0,\tag{2.17}$$
(T4) the Fermi Golden Rule Condition holds: $\Gamma\equiv\pi\langle\delta(\bar H-E_0)\rangle_{\bar PW\varphi_0}$ satisfies $\Gamma\ge C_0\hbar^p\kappa^2$, for some $p\ge0$.
Remarks. 1. By non-degeneracy of the local minimum in (T2), we mean that the Hessian of $V$ at $0$ is strictly positive definite. 2. From the harmonic approximation (see e.g. [HiSig], Chapter 11), it follows that $E_0=V(0)+O(\hbar)$. Hence for small $\hbar$, Condition (ii) implies that the classically allowed region $\{x\in\mathbb{R}^d\,|\,V_0(x)\le E_0\}$ is compact. This implies that $\varphi_0$ is localized in $x$ exponentially around the origin: $\forall\epsilon,\epsilon'>0$, $\exists C_{\epsilon,\epsilon'}$ (independent of $\hbar$) such that
$$\|e^{(1-\epsilon)\rho_{E_0}/\hbar}\varphi_0\|_2\le C_{\epsilon,\epsilon'}e^{\epsilon'/\hbar}.\tag{2.18}$$
Here, $\rho_{E_0}(x)$ is the geodesic distance between $x$ and $0$, measured in the Agmon metric corresponding to the energy $E_0$. The proof of (2.18) is easily obtained e.g. from [HiSig], Theorem 3.4. 3. Since $E_0=V(0)+O(\hbar)$, for small $\hbar$, $W$ is supported in the classically forbidden region $\{x\in\mathbb{R}^d\,|\,V_0(x)>E_0\}$. The exponential decay (in $x$) of $\varphi_0$ then implies that $\kappa=\|\langle A\rangle^\alpha W\varphi_0\|_2=O(e^{-\eta/\hbar})$, for some $\eta>0$.

Theorem 2.5. Assume Conditions (T1)–(T4) hold, and let $0\le\beta<\min\{\alpha-2,1/2\}$. Then there is an $\hbar_0>0$ (dependent on $\alpha$, $\beta$, $n$, $|\Delta|$, $p$) such that for $\hbar<\hbar_0$, the solution to (2.13)–(2.14) has the expansion
$$\psi(t)=a(t)\varphi_1+\psi_{disp}(t),\tag{2.19}$$
where $\varphi_1=\varphi_0+O(\kappa/\hbar)$, and $\psi_{disp}(t)$ is a dispersive wave satisfying for $t\ge0$:
$$\|\langle A\rangle^{-\alpha}\psi_{disp}(t)\|_2=\|\langle A\rangle^{\alpha}\bar P\psi_0\|_2\,O(\langle t\rangle^{-\alpha}\hbar^{-q})+O(\langle t\rangle^{-\beta}\hbar^{-q}\kappa^{1-2\beta}),\tag{2.20}$$
with $q$ a positive integer depending on $\alpha$, $\beta$, $n$, $p$. Moreover, $a(t)=\langle\varphi_0,\psi(t)\rangle$ satisfies for $t\ge0$:
$$a(t)=e^{-i\lambda t/\hbar}a(0)+O(\langle t\rangle^{-\beta}\hbar^{-q}\kappa^{1-4\beta}),\tag{2.21}$$
with
$$\operatorname{Re}\lambda=E_0+\langle W\rangle_{\varphi_0}-\langle\mathrm{P.V.}(\bar H-E_0)^{-1}\rangle_{\bar PW\varphi_0}+O(\hbar^{-2}\kappa^3),\tag{2.22}$$
$$\operatorname{Im}\lambda=-\pi\langle\delta(\bar H-E_0)\rangle_{\bar PW\varphi_0}+O(\hbar^{-2}\kappa^3).\tag{2.23}$$
We prove Theorem 2.5 in Sect. 6. A general discussion of the quasiclassical Schrödinger problem and extensive references can be found in [HiSig].

3. Proof of Theorem 2.1

The proof of Theorem 2.1 consists of two main steps. In the first step, we establish differential equations for $P\psi$ and $\psi_d(t)$ (the latter function is defined in (1.8)), write the corresponding integral equations, and bring these equations to a convenient form. We do this in Sect. 3.1. In the second step we use these equations in order to prove the desired estimates on $P\psi(t)$ and $\psi_d(t)$. This is done in Sect. 3.2. Theorem 2.1 is then derived by observing that
$$\psi(t)=B\left(P\psi(t)+\psi_d(t)\right),\tag{3.1}$$
where the operator $B$ satisfies $B=\mathbf{1}+O(\kappa)$.

3.1. Differential equations for $P\psi(t)$ and $\psi_{disp}(t)$. In this section, we establish the coupled differential equations for $P\psi$ and $\psi_d$ and iterate them in a suitable manner. The main result here is Eq. (3.13) together with the set of equations (3.14). Projecting (1.1) onto $\operatorname{Ran}P$ and $\operatorname{Ran}\bar P$ yields
$$i\partial_tP\psi=(E_0+PWP)P\psi+PW\bar P\psi,\tag{3.2}$$
$$i\partial_t\bar P\psi=\bar H\bar P\psi+\bar PWP\psi.\tag{3.3}$$
Recall that $\psi_d=P_d\psi$, where $P_d=g_{\Delta'}(\bar H)\bar P$. To pass from $\bar P\psi$ to $\psi_d$, we multiply both sides of (3.3) by $g_{\Delta'}(\bar H)$ and get
$$i\partial_t\psi_d=\bar H\psi_d+P_dWP\psi.\tag{3.4}$$
Now we express $\bar P\psi$ in (3.2) in terms of $P\psi$ and $\psi_d$. Notice that $\psi_0\in\operatorname{Ran}E_\Delta(H)$, so that $\psi_0=g_\Delta(H)\psi_0$. Hence $\psi=g_\Delta(H)\psi$ and therefore $\bar P\psi=\bar Pg_\Delta(H)\psi$. Introducing $\mathbf{1}=g_{\Delta'}(\bar H)+\bar g_{\Delta'}(\bar H)$ into the last equation yields
$$\begin{aligned}\bar P\psi&=\left(g_{\Delta'}(\bar H)+\bar g_{\Delta'}(\bar H)\right)\bar Pg_\Delta(H)\psi\\&=\bar g_{\Delta'}(\bar H)\bar Pg_\Delta(H)\psi+g_{\Delta'}(\bar H)\bar P\psi\\&=\bar g_{\Delta'}(\bar H)\bar Pg_\Delta(H)P\psi+\bar g_{\Delta'}(\bar H)\bar Pg_\Delta(H)\bar P\psi+P_d\psi.\end{aligned}$$
Hence
$$\left(\mathbf{1}-\bar g_{\Delta'}(\bar H)\bar Pg_\Delta(H)\right)\bar P\psi=\bar g_{\Delta'}(\bar H)\bar Pg_\Delta(H)P\psi+\psi_d.$$
The following proposition is proven in the appendix:

Proposition 3.1. $\bar g_{\Delta'}(\bar H)\bar Pg_\Delta(H)=O(\kappa)$ and consequently, for small $\kappa$, $B\equiv\left(\mathbf{1}-\bar g_{\Delta'}(\bar H)\bar Pg_\Delta(H)\right)^{-1}$ exists as a bounded linear operator on $\mathcal H$, and $\|B\|\le C$ (uniformly in $\kappa$ for small $\kappa$).
We have therefore
$$\bar P\psi=B'P\psi+B\psi_d,\tag{3.5}$$
where $B'=B-\mathbf{1}$, and $B$ is defined in Proposition 3.1 above. Note that $B=\mathbf{1}+O(\kappa)$ and $B'=O(\kappa)$. With expression (3.5) for $\bar P\psi$, we get
$$\psi(t)=BP\psi(t)+\psi_{disp}(t),\tag{3.6}$$
where we defined
$$\psi_{disp}(t)=B\psi_d(t).\tag{3.7}$$
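The step leading to (3.5) and (3.6) can be spelled out as follows. This is only a sketch of the algebra; $K$ is shorthand introduced here (not notation of the text) for the operator $\bar g_{\Delta'}(\bar H)\bar Pg_\Delta(H)$ of Proposition 3.1, so that $B=(\mathbf{1}-K)^{-1}$.

```latex
\begin{align*}
(\mathbf{1}-K)\,\bar P\psi &= K P\psi + \psi_d
  \;\Longrightarrow\; \bar P\psi = BK\,P\psi + B\psi_d ,\\
B(\mathbf{1}-K) &= \mathbf{1}
  \;\Longrightarrow\; BK = B-\mathbf{1} = B' ,\\
\psi &= P\psi + \bar P\psi = (\mathbf{1}+B')P\psi + B\psi_d
      = BP\psi + B\psi_d .
\end{align*}
```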
Equation (3.6) shows that $\psi$ is of the form (2.4). From (3.2) and (3.5), we have the following equation of motion for $P\psi$:
$$i\partial_tP\psi=\lambda_1P\psi+PWB\psi_d,\tag{3.8}$$
where $\lambda_1$ is a bounded linear operator on $\operatorname{Ran}P$:
$$\lambda_1=E_0+PWBP.\tag{3.9}$$
Next, we rewrite Eq. (3.4) in the integral form
$$\psi_d(t)=e^{-i\bar Ht}\psi_d(0)-i\int_0^te^{-i\bar H(t-s)}P_dWP\psi(s)\,ds.\tag{3.10}$$
The last term can be transformed as follows: pick $z\in\mathbb{C}_+$ and integrate by parts in the following way:
$$\begin{aligned}-i\int_0^te^{-i\bar H(t-s)}P_dWP\psi(s)\,ds&=-ie^{-i\bar Ht}\int_0^te^{i(\bar H-z)s}e^{izs}P_dWP\psi(s)\,ds\\&=-(\bar H-z)^{-1}P_dWP\psi(t)+e^{-i\bar Ht}(\bar H-z)^{-1}P_dWP\psi(0)\\&\quad+\int_0^te^{-i\bar H(t-s)}(\bar H-z)^{-1}P_dW\left[izP\psi(s)+(\partial_sP\psi)(s)\right]ds.\end{aligned}\tag{3.11}$$
In order to make the last term on the r.h.s. of (3.11) small, we want to take $z$ to the real axis. Such a procedure is justified in the first two statements of the proposition to follow. The third statement of this proposition shows why $z$ must approach the real axis from above for $t>0$ (the outgoing condition).

Proposition 3.2. For $\omega\in\mathbb{R}$ and $\phi\in\mathcal H\cap D(\langle A\rangle^\alpha)$, we have:
(i) $\forall t\ge0$: $\langle A\rangle^{-\alpha}e^{-i\bar Ht}(\bar H-\omega-i\epsilon)^{-1}P_d\phi$ converges in $\mathcal H$ as $\epsilon\downarrow0$. The limit is denoted by $\langle A\rangle^{-\alpha}e^{-i\bar Ht}(\bar H-\omega-i0)^{-1}P_d\phi$.
(ii) The convergence in (i) is uniform in $t\in\mathbb{R}_+$, and therefore $t\mapsto\langle A\rangle^{-\alpha}e^{-i\bar Ht}(\bar H-\omega-i0)^{-1}P_d\phi$ is continuous as a map from $\mathbb{R}_+$ to $\mathcal H$.
(iii) $\|\langle A\rangle^{-\alpha}e^{-i\bar Ht}(\bar H-\omega-i0)^{-1}P_d\phi\|_{\mathcal H}\le C(t+1)^{-\alpha+1}\|\langle A\rangle^{\alpha}\bar P\phi\|_{\mathcal H}$, $\forall t\ge0$.
This proposition is proven in the appendix. Notice that as $\operatorname{Im}z\to0$, the individual terms in (3.11) do not converge in $\mathcal H$. However, setting $\omega=\operatorname{Re}z$, and using our assumption that $\langle A\rangle^\alpha WP$ is a bounded operator (see (A2)) and Proposition 3.2, we get for $PWB$ acting on the second term on the r.h.s. of (3.10):
$$\begin{aligned}-i\int_0^tPWBe^{-i\bar H(t-s)}P_dWP\psi(s)\,ds&=-i\int_0^tPWB'e^{-i\bar H(t-s)}P_dWP\psi(s)\,ds\\&\quad-PW(\bar H-\omega-i0)^{-1}P_dWP\psi(t)+PWe^{-i\bar Ht}(\bar H-\omega-i0)^{-1}P_dWP\psi(0)\\&\quad+i\int_0^tPWe^{-i\bar H(t-s)}(\bar H-\omega-i0)^{-1}P_dW(\omega-\lambda_1)P\psi(s)\,ds\\&\quad-i\int_0^tPWe^{-i\bar H(t-s)}(\bar H-\omega-i0)^{-1}P_dWPWB\psi_d(s)\,ds,\end{aligned}\tag{3.12}$$
where the term $PW(\bar H-\omega-i0)^{-1}P_dWP\psi(t)$ and similar ones are well defined. To get the last two terms, we replaced $\partial_tP\psi$ in (3.11) using (3.8). Expression (3.12) contains the term of order two in $\kappa$ that acts on $P\psi(t)$. Let us choose $\omega\equiv E_0$, so that $\omega-\lambda_1=-PWBP=O(\kappa)$. Using (3.8), (3.10) and (3.12), we get our final version of the equation of motion for $P\psi$:
$$i\partial_tP\psi=\lambda P\psi+f,\tag{3.13}$$
where $\lambda\equiv\lambda_1-PW(\bar H-E_0-i0)^{-1}P_dWP$, $f=\sum_{j=1}^5f_j$, and the $f_j$'s are given by
$$\begin{aligned}f_1(t)&=PWBe^{-i\bar Ht}\psi_d(0),\\f_2(t)&=PWe^{-i\bar Ht}(\bar H-E_0-i0)^{-1}P_dWP\psi(0),\\f_3(t)&=-i\int_0^tPWe^{-i\bar H(t-s)}(\bar H-E_0-i0)^{-1}P_dWPWBP\psi(s)\,ds,\\f_4(t)&=-i\int_0^tPWe^{-i\bar H(t-s)}(\bar H-E_0-i0)^{-1}P_dWPWB\psi_d(s)\,ds,\\f_5(t)&=-i\int_0^tPWB'e^{-i\bar H(t-s)}P_dWP\psi(s)\,ds.\end{aligned}\tag{3.14}$$
The expression for $\lambda$ is analyzed in

Proposition 3.3. The expansion (2.7)–(2.8) holds for $\lambda$.

The proof is given in the appendix.

Discussion. Our proof is based on estimating Eqs. (3.4) and (3.13) (the estimates obtained are then synthesized into the final theorem with the help of Eqs. (3.6)–(3.7)). Equation (3.13) is a one-step iteration of Eq. (3.8). This iteration is needed so that the bounded operator $\lambda$ multiplying the vector function $P\psi$ on the r.h.s. of the equation for $P\psi$ captures the leading non-zero term of the imaginary part of the resonance. In our case, due to Assumption (A5), this term is of second order. That is why we need only one iteration.
If the leading term of the imaginary part is of higher order (one can show that the leading term is always of even order and positive), one should iterate Eq. (3.13) further ($n-2$ times for order $n$). Controlling the resulting terms would require faster time decay of terms like $W\psi_d(t)$, which would result in a higher power $\alpha$ in Conditions (A1)–(A4).

3.2. Estimates of $P\psi(t)$ and $\psi_d(t)$ and proof of Theorem 2.1. In this section, we show the estimates given in Theorem 2.1. Due to $\psi=B(P\psi+\psi_d)$ (see (3.6), (3.7)) and Lemma A.1 of the appendix, which proves that
$$\|\langle A\rangle^{-\alpha}B\phi\|_{\mathcal H}\le C\|\langle A\rangle^{-\alpha}\phi\|_{\mathcal H},\quad\forall\phi\in\mathcal H,\tag{3.15}$$
and
$$\|\langle A\rangle^{-\alpha}B'\phi\|_{\mathcal H}\le C\kappa\|\langle A\rangle^{-\alpha}\phi\|_{\mathcal H},\quad\forall\phi\in\mathcal H,\tag{3.16}$$
it suffices to demonstrate appropriate estimates on $P\psi(t)$ and $\psi_d(t)$. To this end, write the integral equations for $P\psi(t)$ and $\psi_d(t)$ (cf. (3.13) and (3.10)):
$$P\psi(t)=e^{-i\lambda t}P\psi(0)+r(t),\qquad\psi_d(t)=e^{-i\bar Ht}\psi_d(0)+R(t),$$
where $r(t)=\sum_{j=1}^5r_j(t)$, with
$$r_j(t)=-i\int_0^te^{-i\lambda(t-s)}f_j(s)\,ds,\tag{3.17}$$
$j=1,\dots,5$, and
$$R(t)=-i\int_0^te^{-i\bar H(t-s)}P_dWP\psi(s)\,ds.\tag{3.18}$$
The strategy (see [SW3]) is the following: for $T>0$ and some $\beta\ge0$, introduce the norms
$$[r]_T=\sup_{0\le t\le T}(t+1)^\beta\|r(t)\|_{\mathcal H}\tag{3.19}$$
and
$$[R]_T=\sup_{0\le t\le T}(t+1)^\beta\|\langle A\rangle^{-\alpha}R(t)\|_{\mathcal H}.\tag{3.20}$$
Using (3.17) and (3.18), we then show that $[r]_T\le C\kappa^{1-2\beta}$, where the constant is independent of $T$, $\kappa$ is sufficiently small and $0\le\beta<\min\{1/2,\alpha-2\}$. Taking $T\to\infty$ gives us the desired result: $\|r(t)\|_{\mathcal H}\le C\kappa^{1-2\beta}(t+1)^{-\beta}$. The corresponding estimate for $\|\langle A\rangle^{-\alpha}\psi_d(t)\|_{\mathcal H}$ is obtained similarly. These estimates and Eqs. (3.6), (3.7) and (3.15) imply Theorem 2.1.

Ingredients of the estimations.
• The basic tool in the estimations is the local decay Assumption (A4), and its integrated version given in Proposition 3.2(iii).
• In order to estimate $\|e^{-i\lambda t}\|$, we use the expansion for $\lambda$ given in Proposition 3.3. We have
$$\|e^{-i\lambda t}\|\le e^{C\kappa^3t}e^{-C_0\kappa^2t}.\tag{3.21}$$
In order to prove this inequality, we use the differential inequality
$$\frac{d}{dt}\|e^{-i\lambda t}u\|^2=\langle e^{-i\lambda t}u,\,i(\lambda^*-\lambda)e^{-i\lambda t}u\rangle\le2\sup(\operatorname{Im}\lambda)\|e^{-i\lambda t}u\|^2,$$
and the initial condition $\|e^{-i\lambda t}u\|\,|_{t=0}=\|u\|$ to obtain that $\|e^{-i\lambda t}\|\le\|e^{\operatorname{Im}\lambda\,t}\|$. (The latter inequality can also be derived by taking the norm of the Trotter representation $e^{-i\lambda t}=\lim_{n\to\infty}[e^{-i\operatorname{Re}\lambda\,t/n}e^{\operatorname{Im}\lambda\,t/n}]^n$.) Using that $\operatorname{Im}\lambda=-\Gamma+O(\kappa^3)$ and that the spectrum of $\Gamma$ is bounded below by $C_0\kappa^2$, we arrive at (3.21). Setting $\gamma\equiv C_0\kappa^2/2$, we derive from (3.21) for $\kappa$ small:
$$\|e^{-i\lambda t}\|\le e^{-\gamma t},\quad t\ge0.\tag{3.22}$$
• We have the following uniform bounds in $t\ge0$. a) For $0\le\beta\le\sigma$:
$$(t+1)^\beta\int_0^te^{-\gamma(t-s)}(s+1)^{-\sigma}\,ds\le C(1+\gamma^{-1-\beta})\le C\kappa^{-2-2\beta}.$$
If $0\le\beta\le\sigma-1$ and $\sigma>1$, then the r.h.s. above can be replaced by $C(1+\gamma^{-\beta})\le C\kappa^{-2\beta}$. b) For $\sigma>1$, $0\le\beta\le\sigma-1$:
$$(t+1)^\beta\int_0^t(t-s+1)^{-\sigma}(s+1)^{-\beta}\,ds\le C.$$
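Two of the ingredients above lend themselves to a quick numerical sanity check. The sketch below is purely illustrative and not part of the proof: it checks the improved form of bound a) on a finite grid for the sample values $\sigma=3$, $\beta=0.4$, and it checks the consequence $\|e^{-i\lambda t}u\|\le e^{\mu t}\|u\|$ of the differential inequality, with $\mu=\sup\operatorname{Im}\lambda$, for a hypothetical $2\times2$ matrix standing in for $\lambda$. All function names, the sample matrix and the parameter values are assumptions made for this example.

```python
import math

def conv_integral(t, gamma, sigma, n=5000):
    """Trapezoid rule for int_0^t exp(-gamma*(t-s)) * (s+1)**(-sigma) ds."""
    if t <= 0.0:
        return 0.0
    h = t / n
    total = 0.0
    for i in range(n + 1):
        s = i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(-gamma * (t - s)) * (s + 1.0) ** (-sigma)
    return total * h

def worst_ratio(gamma, sigma, beta, t_grid):
    """Max over t_grid of (t+1)^beta * integral / (1 + gamma^(-beta))."""
    return max(
        (t + 1.0) ** beta * conv_integral(t, gamma, sigma)
        / (1.0 + gamma ** (-beta))
        for t in t_grid
    )

def top_eig_im(lam):
    """Largest eigenvalue of Im(lam) = (lam - lam^*)/(2i), lam a 2x2 matrix."""
    a = lam[0][0].imag
    d = lam[1][1].imag
    b = (lam[0][1] - lam[1][0].conjugate()) / 2j
    return 0.5 * (a + d) + math.sqrt((0.5 * (a - d)) ** 2 + abs(b) ** 2)

def generator(lam, u):
    """Right-hand side of u' = -i * lam * u."""
    return [-1j * (lam[0][0] * u[0] + lam[0][1] * u[1]),
            -1j * (lam[1][0] * u[0] + lam[1][1] * u[1])]

def evolve_norms(lam, u0, t_end, dt):
    """Classical RK4 for u' = -i*lam*u; returns a list of (t, ||u(t)||)."""
    u = list(u0)
    out = [(0.0, math.sqrt(abs(u[0]) ** 2 + abs(u[1]) ** 2))]
    steps = int(round(t_end / dt))
    for k in range(1, steps + 1):
        k1 = generator(lam, u)
        k2 = generator(lam, [u[j] + 0.5 * dt * k1[j] for j in range(2)])
        k3 = generator(lam, [u[j] + 0.5 * dt * k2[j] for j in range(2)])
        k4 = generator(lam, [u[j] + dt * k3[j] for j in range(2)])
        u = [u[j] + dt / 6.0 * (k1[j] + 2 * k2[j] + 2 * k3[j] + k4[j])
             for j in range(2)]
        out.append((k * dt, math.sqrt(abs(u[0]) ** 2 + abs(u[1]) ** 2)))
    return out

# Geometric time grid and a hypothetical non-normal matrix with Im(lam) < 0.
T_GRID = [0.5 * 1.6 ** k for k in range(16)]
LAM = [[1.0 - 0.3j, 0.5 + 0j], [0.2 + 0j, 2.0 - 0.1j]]

if __name__ == "__main__":
    for g in (1.0, 0.1, 0.01):
        print("gamma =", g, "ratio =", worst_ratio(g, 3.0, 0.4, T_GRID))
    print("sup Im(lam) =", top_eig_im(LAM))
```

The ratio in `worst_ratio` staying bounded as $\gamma$ decreases is what bound a) asserts; a grid check of this kind can only illustrate the bound, not establish the constant $C$.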
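Estimations. We use the above mentioned points and the condition $\alpha-2\ge\beta$ to estimate $\|r_j(t)\|_{\mathcal H}$, for $j=1,\dots,5$.
1)
$$\begin{aligned}\|r_1(t)\|_{\mathcal H}&\le\int_0^te^{-\gamma(t-s)}\|PWBe^{-i\bar Hs}\psi_d(0)\|_{\mathcal H}\,ds\\&\le C\|\langle A\rangle^\alpha\bar P\psi_0\|_{\mathcal H}\,\kappa\int_0^te^{-\gamma(t-s)}(s+1)^{-\alpha}\,ds\\&\le C\|\langle A\rangle^\alpha\bar P\psi_0\|_{\mathcal H}\,\kappa^{1-2\beta}(1+t)^{-\beta},\quad t\ge0.\end{aligned}\tag{3.23}$$
In the second step, we used (3.15) and the local decay estimate (A4).
2)
$$\|r_2(t)\|_{\mathcal H}\le C\|P\psi_0\|_{\mathcal H}\,\kappa^2\int_0^te^{-\gamma(t-s)}(s+1)^{-\alpha+1}\,ds\le C\|P\psi_0\|_{\mathcal H}\,\kappa^{2-2\beta}(t+1)^{-\beta},\quad t\ge0.\tag{3.24}$$
3)
$$\|r_3(t)\|_{\mathcal H}\le C\kappa^3\int_0^te^{-\gamma(t-s)}\int_0^s(s-\tau+1)^{-\alpha+1}\|P\psi(\tau)\|_{\mathcal H}\,d\tau\,ds.\tag{3.25}$$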
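We split $\|P\psi(\tau)\|_{\mathcal H}$ as
$$\|P\psi(\tau)\|_{\mathcal H}\le e^{-\gamma\tau}\|P\psi_0\|_{\mathcal H}+\|r(\tau)\|_{\mathcal H}\le e^{-\gamma\tau}\|P\psi_0\|_{\mathcal H}+[r]_T(\tau+1)^{-\beta}\le C\left(\gamma^{-\beta}\|P\psi_0\|_{\mathcal H}+[r]_T\right)(\tau+1)^{-\beta},$$
where we used $(\gamma\tau)^\beta e^{-\gamma\tau}\le C$, uniformly in $\gamma,\tau\ge0$. So
$$\begin{aligned}\|r_3(t)\|_{\mathcal H}&\le C\kappa^3\left(\kappa^{-2\beta}\|P\psi_0\|_{\mathcal H}+[r]_T\right)\int_0^te^{-\gamma(t-s)}(s+1)^{-\beta}\,ds\\&\le C\kappa^{1-2\beta}\left(\kappa^{-2\beta}\|P\psi_0\|_{\mathcal H}+[r]_T\right)(t+1)^{-\beta}.\end{aligned}\tag{3.26}$$
4)
$$\|r_4(t)\|_{\mathcal H}\le C\kappa^3\int_0^te^{-\gamma(t-s)}\int_0^s(s-\tau+1)^{-\alpha+1}\|\langle A\rangle^{-\alpha}\psi_d(\tau)\|_{\mathcal H}\,d\tau\,ds.$$
We decompose the last term in the integral as
$$\|\langle A\rangle^{-\alpha}\psi_d(\tau)\|_{\mathcal H}\le\|\langle A\rangle^{-\alpha}e^{-i\bar H\tau}\psi_d(0)\|_{\mathcal H}+\|\langle A\rangle^{-\alpha}R(\tau)\|_{\mathcal H}\le C(\tau+1)^{-\alpha}\|\langle A\rangle^{\alpha}\bar P\psi_0\|_{\mathcal H}+(\tau+1)^{-\beta}[R]_T.$$
Using this decomposition in the double integral, we get
$$\|r_4(t)\|_{\mathcal H}\le C\left(\|\langle A\rangle^{\alpha}\bar P\psi_0\|_{\mathcal H}\,\kappa^{3-2\beta}+[R]_T\,\kappa^{1-2\beta}\right)(t+1)^{-\beta}.\tag{3.27}$$
5)
$$\begin{aligned}\|r_5(t)\|_{\mathcal H}&\le C\kappa^3\int_0^te^{-\gamma(t-s)}\int_0^s(s-\tau+1)^{-\alpha}\|P\psi(\tau)\|_{\mathcal H}\,d\tau\,ds\\&\le C\kappa^{1-2\beta}\left(\kappa^{-2\beta}\|P\psi_0\|_{\mathcal H}+[r]_T\right)(t+1)^{-\beta}.\end{aligned}\tag{3.28}$$
Summing (3.23), (3.24), (3.26), (3.27) and (3.28), we find for $0\le t\le T$:
$$(t+1)^\beta\|r(t)\|_{\mathcal H}\le C[r]_T\kappa^{1-2\beta}+C[R]_T\kappa^{1-2\beta}+C\|P\psi_0\|_{\mathcal H}\kappa^{1-4\beta}+C\|\langle A\rangle^{\alpha}\bar P\psi_0\|_{\mathcal H}\kappa^{1-2\beta},\tag{3.29}$$
where the r.h.s. is independent of $t$. In order to close the estimates, we express $[R]_T$ in terms of $[r]_T$. From (3.18), we get
$$\|\langle A\rangle^{-\alpha}R(t)\|_{\mathcal H}\le C\kappa\int_0^t(t-s+1)^{-\alpha}\|P\psi(s)\|_{\mathcal H}\,ds\le C\left(\kappa^{1-2\beta}\|P\psi_0\|_{\mathcal H}+[r]_T\,\kappa\right)(t+1)^{-\beta},\tag{3.30}$$
and thus from (3.20): $[R]_T\le C\kappa^{1-2\beta}\|P\psi_0\|_{\mathcal H}+C\kappa[r]_T$. Taking the supremum over $t\in[0,T]$ in (3.29), and replacing $[R]_T$ in (3.29) by the last estimate, we get
$$[r]_T\le C[r]_T\kappa^{1-2\beta}+C\|P\psi_0\|_{\mathcal H}\kappa^{1-4\beta}+C\|\langle A\rangle^{\alpha}\bar P\psi_0\|_{\mathcal H}\kappa^{1-2\beta}.\tag{3.31}$$
We can now isolate $[r]_T$ in this inequality if $\beta<1/2$, and for $\kappa$ small enough. We get
$$[r]_T\le C\|P\psi_0\|_{\mathcal H}\kappa^{1-4\beta}+C\|\langle A\rangle^{\alpha}\bar P\psi_0\|_{\mathcal H}\kappa^{1-2\beta}.\tag{3.32}$$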
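The passage from (3.31) to (3.32) is a standard absorption argument. The following sketch makes it explicit; $D$ is shorthand introduced here for the two $\psi_0$-dependent terms, and the argument assumes, as is implicit in the text, that $[r]_T$ is finite for each finite $T$.

```latex
\begin{align*}
D &\equiv C\|P\psi_0\|_{\mathcal H}\,\kappa^{1-4\beta}
     + C\|\langle A\rangle^{\alpha}\bar P\psi_0\|_{\mathcal H}\,\kappa^{1-2\beta},
 \qquad [r]_T \le C\kappa^{1-2\beta}[r]_T + D ,\\
\beta < \tfrac12 &\;\Longrightarrow\; C\kappa^{1-2\beta}\le\tfrac12
 \ \text{ for $\kappa$ small}
 \;\Longrightarrow\; \tfrac12\,[r]_T \le D
 \;\Longrightarrow\; [r]_T \le 2D .
\end{align*}
```

The factor 2 is absorbed into the generic constant $C$, which gives (3.32) with a bound independent of $T$.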
The estimate is uniform in $T$, and taking $T\to\infty$ yields
$$\|r(t)\|_{\mathcal H}\le C\kappa^{1-4\beta}(t+1)^{-\beta}.\tag{3.33}$$
Similarly, using $[r]_T\le C\kappa^{1-4\beta}$ in (3.30) gives
$$\|\langle A\rangle^{-\alpha}R(t)\|_{\mathcal H}\le C\kappa^{1-2\beta}(t+1)^{-\beta}.\tag{3.34}$$
The estimates given in Remark 2 after Theorem 2.1 are immediate from (3.30) and (3.32).
4. The Mourre Estimate for Reduced Operators and Local Decay

4.1. The Mourre estimate. In this section, we derive the strict Mourre estimate for the operator $\bar H\equiv\bar PH\bar P$ from the Mourre estimate for $H_0$ and use the former to prove the local decay for the operator $\bar H$. We consider perturbations $W$ of the form $W=\kappa'U$, where $U$ is fixed, and $\kappa'$ is assumed to be sufficiently small. The main result of this section is related to some results in [AHS]. It is given in the following

Theorem 4.1. Suppose that $AP$ and $[H_0,A](H_0+i)^{-1}$ are bounded, and that $AWP$, $W(H_0+i)^{-1}$ and $[W,A](H_0+i)^{-1}$ are $O(\kappa')$. Moreover, suppose that there is a neighbourhood $\Delta_1$ of $E_0\ne0$ such that
$$E_{\Delta_1}(H_0)\,i[H_0,A]\,E_{\Delta_1}(H_0)\ge\theta_1E_{\Delta_1}(H_0)+K,\tag{4.1}$$
where $\theta_1>0$ and $K$ is a compact operator. Then $\forall\epsilon>0$ $\exists C(\epsilon)>0$ such that
$$E_{\Delta_3}(\bar H)\,i[\bar H,A]\,E_{\Delta_3}(\bar H)\ge(\theta_1-\epsilon)E_{\Delta_3}(\bar H),\tag{4.2}$$
for any neighbourhood $\Delta_3$ of $E_0$, $\Delta_3\subset\Delta_1$, provided $|\Delta_3|\le C(\epsilon)$ and $\kappa'\le C(\epsilon)$. In particular, $|\Delta_3|$ is so small that $0\notin\Delta_3$.

Proof. We divide the proof into two steps. In Step 1, we pass from the Mourre estimate for $H_0$ to a strong Mourre estimate for $\bar H_0$, in an appropriate interval $\Delta_2$, $0\notin\Delta_2\subset\Delta_1$. This is done by shrinking $\Delta_2$ around $E_0$ so as to make the contribution of the compact operator $K$ arbitrarily small. In the second step, we pass from the strong Mourre estimate for $\bar H_0$ to one for $\bar H$.

Step 1. We show that $\forall\epsilon_1>0$ $\exists C_1(\epsilon_1)>0$ such that
$$E_{\Delta_2}(\bar H_0)\,i[\bar H_0,A]\,E_{\Delta_2}(\bar H_0)\ge(\theta_1-\epsilon_1)E_{\Delta_2}(\bar H_0)\tag{4.3}$$
for any neighbourhood $\Delta_2$ of $E_0$ such that $\Delta_2\subset\Delta_1$ and $|\Delta_2|\le C_1(\epsilon_1)$. Let $\Delta_2$ be an open set containing $E_0$ but not $0$, and $\Delta_2\subset\Delta_1$. Applying $\bar PE_{\Delta_2}(\bar H_0)=E_{\Delta_2}(\bar H_0)$ to both sides of Eq. (4.1) and using $E_{\Delta_1}(H_0)\bar P=E_{\Delta_1}(\bar H_0)\bar P=E_{\Delta_1}(\bar H_0)$, we obtain
$$E_{\Delta_2}(\bar H_0)\,i[\bar H_0,A]\,E_{\Delta_2}(\bar H_0)\ge\theta_1E_{\Delta_2}(\bar H_0)+E_{\Delta_2}(\bar H_0)KE_{\Delta_2}(\bar H_0),\tag{4.4}$$
where the last term on the r.h.s. can be made arbitrarily small by shrinking $\Delta_2$ around $E_0$ ($\bar H_0$ has no eigenvalues in a neighbourhood of $E_0$ if $E_0\ne0$, thus $E_{\Delta_2}(\bar H_0)\to0$ strongly; and $K$ is compact). This shows (4.3).
Step 2. Let now $\Delta_2$ satisfy (4.3). We show that $\forall\epsilon_2>0$ $\exists C_2(\epsilon_2)>0$ such that
$$E_{\Delta_3}(\bar H)\,i[\bar H,A]\,E_{\Delta_3}(\bar H)\ge(\theta_1-\epsilon_1-\epsilon_2)E_{\Delta_3}(\bar H),\tag{4.5}$$
for any neighbourhood $\Delta_3$ of $E_0$ such that $\Delta_3\subset\Delta_2$, provided $\kappa'\le C_2(\epsilon_2)$. We have
$$[\bar H,A]=[\bar H_0,A]+[\bar W,A].\tag{4.6}$$
We claim
$$E_{\Delta_2}(\bar H)[\bar W,A]E_{\Delta_2}(\bar H)=O(\kappa').\tag{4.7}$$
To prove the last estimate, write $[\bar W,A]=[W,A]+[PWP,A]-[PW,A]-[WP,A]$. The last three commutators are $O(\kappa')$ since $\|AP\|<\infty$ and $\|AWP\|=O(\kappa')$. The fact that $[W,A](\bar H+i)^{-1}=O(\kappa')$ follows from the assumptions $[W,A](H_0+i)^{-1}=O(\kappa')$ and $W(H_0+i)^{-1}=O(\kappa')$. Let us now examine $E_{\Delta_3}(\bar H)\,i[\bar H_0,A]\,E_{\Delta_3}(\bar H)$, where $\Delta_3$ is a neighbourhood of $E_0$, and $\Delta_3\subset\Delta_2$. Let $h\in C_0^\infty(\Delta_2)$ be such that $h=1$ on $\Delta_3$. We have $h(\bar H)=h(\bar H_0)+I$, where
$$I\equiv-\int(\bar H_0-z)^{-1}\bar W(\bar H-z)^{-1}\,d\tilde h(z),$$
and since $[\bar H_0,A](\bar H_0+i)^{-1}$ is bounded, and both $I$ and $(\bar H_0+i)I$ are $O(\kappa')$, we get
$$\begin{aligned}E_{\Delta_3}(\bar H)\,i[\bar H_0,A]\,E_{\Delta_3}(\bar H)&=E_{\Delta_3}(\bar H)h(\bar H)\,i[\bar H_0,A]\,h(\bar H)E_{\Delta_3}(\bar H)\\&=E_{\Delta_3}(\bar H)h(\bar H_0)\,i[\bar H_0,A]\,h(\bar H_0)E_{\Delta_3}(\bar H)+O(\kappa').\end{aligned}\tag{4.8}$$
Using (4.3) and $h^2(\bar H_0)=h^2(\bar H)+O(\kappa')$, we estimate the first term on the r.h.s. of (4.8) from below by $(\theta_1-\epsilon_1)E_{\Delta_3}(\bar H)h^2(\bar H_0)E_{\Delta_3}(\bar H)=(\theta_1-\epsilon_1)E_{\Delta_3}(\bar H)+O(\kappa')$. This together with (4.8) implies
$$E_{\Delta_3}(\bar H)\,i[\bar H_0,A]\,E_{\Delta_3}(\bar H)\ge(\theta_1-\epsilon_1)E_{\Delta_3}(\bar H)+O(\kappa').\tag{4.9}$$
Multiplying (4.6) from both sides by $E_{\Delta_3}(\bar H)$ and taking into account (4.7) and (4.9) yields the desired result (4.5).

4.2. Local decay.

Theorem 4.2. Suppose (A3) and (4.2) hold, $\|W(H_0+i)^{-1}\|<1$, and $(H_0+i)^m\mathrm{ad}_A^{(r)}(P)(H_0+i)^{-m}$ are bounded for $m=0,1$ and $r\le n$. Then there is an interval $\Delta'$, $E_0\in\Delta'\subset\Delta_3$, such that the local decay estimate (A4) holds.
Proof. We verify first that we have the upper bound (uniformly in $W$):
$$\|\mathrm{ad}_A^{(k)}(\bar H)(\bar H+i)^{-1}\|\le C,\quad k=1,\dots,n.\tag{4.10}$$
Expanding the multicommutator of order $k$ gives
$$\mathrm{ad}_A^{(k)}(\bar H)=\mathrm{ad}_A^{(k)}(\bar PH\bar P)=\sum_{r_1+r_2+r_3=k}C_{r_1,r_2,r_3}\,\mathrm{ad}_A^{(r_1)}(\bar P)\,\mathrm{ad}_A^{(r_2)}(H)\,\mathrm{ad}_A^{(r_3)}(\bar P).\tag{4.11}$$
Equation (4.10) follows since all the operators
$$\mathrm{ad}_A^{(r_1)}(\bar P),\quad\mathrm{ad}_A^{(r_2)}(H)(H_0+i)^{-1},\quad(H_0+i)\,\mathrm{ad}_A^{(r_3)}(\bar P)(H_0+i)^{-1},$$
and $(H_0+i)(\bar H+i)^{-1}$ are bounded (the last fact follows from $\|W(H_0+i)^{-1}\|<1$). Theorem 4.1 together with (4.10) implies, due to a result of [HSS], that
$$\|\langle A\rangle^{-\alpha}e^{-i\bar Ht}P_d\phi\|_{\mathcal H}\le\mathrm{const.}\,(t+1)^{-\alpha}\|\langle A\rangle^{\alpha}\bar P\phi\|_{\mathcal H},\quad t\ge0.$$
The constant depends on $\bar H$ (and $\kappa$) only through $\|\mathrm{ad}_A^{(k)}(\bar H)(\bar H+i)^{-1}\|$, $k=1,\dots,n$.
5. Proof of Theorem 2.3

• We show (A1), i.e. $\|\langle A\rangle^\alpha P\|<\infty$. From $\|\langle A\rangle^\alpha P\|\le C(1+\|A^\alpha P\|)$, we see that it is enough to show that $A^nP$ is bounded, for some integer $n\ge\alpha$. Now
$$A^n=\left(\tfrac12(x\cdot p+p\cdot x)\right)^n=\sum_{k=0}^nc_k(x\cdot p)^k.\tag{5.1}$$
But $(x\cdot p)^kP$ is bounded by (2.12), $k\le n$: in fact, $(x\cdot p)^k$ is a sum of terms of the form $x^mp^m$, where $m\equiv(m_1,\dots,m_\nu)$ are multi-indices with $|m|\equiv\sum_{j=1}^\nu m_j\le k$, $x^m=x_1^{m_1}\cdots x_\nu^{m_\nu}$, and $p^m=(-i)^{|m|}\partial_{x_1}^{m_1}\cdots\partial_{x_\nu}^{m_\nu}$, with $\nu=\dim X$.
• (A2) follows directly from $W=\kappa'U$ and (2.11).
• We show that (A3) is satisfied. From
$$\mathrm{ad}_A^{(k)}(H)=(-2i)^kp^2+i^k(x\cdot\nabla)^kV+i^k(x\cdot\nabla)^kW,$$
$W=\kappa'U$ and (S2), it is clear that $\mathrm{ad}_A^{(k)}(H)$ is $H$-bounded, uniformly in $\kappa'$.
• We show now the local decay estimate (A4). Equation (2.11) shows that $AWP=O(\kappa')$, and (S2)(ii) with $k=0,1$ gives that $W(H_0+i)^{-1}$ and $[W,A](H_0+i)^{-1}$ are $O(\kappa')$. Boundedness of $AP$ follows from $\|\langle A\rangle^\alpha P\|<\infty$, which we have shown above, and $[H_0,A](H_0+i)^{-1}$ is bounded by (S2)(i) with $k=1$. This together with (S3) shows that the conditions of Theorem 4.1 are met. Due to Theorem 4.2, it is then enough to check that $(H_0+i)^m\mathrm{ad}_A^{(r)}(P)(H_0+i)^{-m}$ is bounded for $r\le n$, $m=0,1$. We do this now.
Let $m=0$. Clearly $\mathrm{ad}_A^{(0)}(P)=P$ is bounded. If $r\ge1$, then $\mathrm{ad}_A^{(r)}(P)=\mathrm{ad}_{(x\cdot p)}^{(r)}(P)$. This multi-commutator is a sum of terms of the form $(x\cdot p)^lP(x\cdot p)^m$, $l+m=r$. It is thus enough to show that $x^lp^lPp^mx^m$ is bounded, $|l|+|m|\le r$, $r\le n$. This is guaranteed by (2.12). We conclude that $\mathrm{ad}_A^{(r)}(P)$ is bounded (and does not depend on $\kappa'$).
Let $m=1$. We write
$$(H_0+i)\,\mathrm{ad}_A^{(r)}(P)(H_0+i)^{-1}=\mathrm{ad}_A^{(r)}(P)+\left[H_0,\mathrm{ad}_A^{(r)}(P)\right](H_0+i)^{-1}.$$
For $r=0$, the commutator is zero. For $r>0$, the general term in the commutator is of the form $H_0(x\cdot p)^lP(x\cdot p)^m-(x\cdot p)^lP(x\cdot p)^mH_0$. Because $H_0$ is $p^2$-bounded, it is enough to show that $p^2(x\cdot p)^lP(x\cdot p)^m$ is bounded for any $l,m$, $l+m\le n$. This is again ensured by (2.12).
• (A5) follows immediately from (S4).

6. Proof of Theorem 2.5

The proof of Theorem 2.5 is analogous to the proof of Theorem 2.1 for a non-degenerate eigenvalue $E_0$. In the quasiclassical case, however, we must keep track of the parameter $\hbar$. The delicate point in the proof of Theorem 2.5 is to show local decay. The latter is deduced from the quasiclassical Mourre estimate (cf. [G, Gr, HN]). This estimate is of independent interest. We begin with it.

6.1. The quasiclassical Mourre estimate.

Theorem 6.1. Assume the non-trapping Condition (T3). Then $\forall\epsilon>0$, there is a $C_\epsilon>0$ and a neighbourhood $\Delta_1$ of $E_0/\hbar$ (with $|\Delta_1|$ independent of $\hbar$) such that for $\hbar\le C_\epsilon$, we have
$$E_{\hbar\Delta_1}(\bar H)\,\frac{i}{\hbar}[\bar H,A]\,E_{\hbar\Delta_1}(\bar H)\ge(\theta-\epsilon)E_{\hbar\Delta_1}(\bar H).\tag{6.1}$$
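Proof. We write $[\bar H,A]=\bar P[H,A]\bar P+\bar PH[\bar P,A]+[\bar P,A]H\bar P$. Since $PA$ and $AP$ are bounded, and $\bar PHP=\bar PWP=O(\kappa)$, we get $\bar P[\bar H,A]\bar P=\bar P[H,A]\bar P+O(\kappa)$. Let $\Delta_1$ be an interval containing $E_0/\hbar$, of fixed length as specified in Proposition 6.2(i) below, and such that $0\notin\hbar\Delta_1$. Then we have $E_{\hbar\Delta_1}(\bar H)=E_{\hbar\Delta_1}(\bar H)\bar P$, and hence
$$E_{\hbar\Delta_1}(\bar H)\,\frac{i}{\hbar}[\bar H,A]\,E_{\hbar\Delta_1}(\bar H)=E_{\hbar\Delta_1}(\bar H)\,\frac{i}{\hbar}[H,A]\,E_{\hbar\Delta_1}(\bar H)+O(\kappa).\tag{6.2}$$
Now $\frac{i}{\hbar}[H,A]=2(H-E_0)+2(E_0-V(x))-x\cdot\nabla V(x)$, so with $E_0\ge V(0)$, it follows that
$$E_{\hbar\Delta_1}(\bar H)\,\frac{i}{\hbar}[\bar H,A]\,E_{\hbar\Delta_1}(\bar H)\ge E_{\hbar\Delta_1}(\bar H)\left(2(V(0)-V(x))-x\cdot\nabla V(x)\right)E_{\hbar\Delta_1}(\bar H)+O(\hbar).\tag{6.3}$$
Let $N$ be a bounded neighbourhood of $0\in\mathbb{R}^d$ on which $V(x)=V_0(x)$ (see (ii) in Paragraph 2.3.2). Let $\partial N\equiv\bar N\setminus N^\circ$ be the boundary of $N$. We put $\delta_1=\min_{x\in\partial N}(V(x)-V(0))$, and $\delta_2=\min\{\delta_1,\delta\}/2$, where $\delta$ is given in (T3). The sets $\Omega_\delta^{ext}$, $N$ and $\Omega_{\delta_2}^C\equiv\{x\in\mathbb{R}^d\,|\,V(x)\ge\delta_2\}$ cover $\mathbb{R}^d$.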
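We introduce a $C^\infty$ partition of unity: $1=\chi_1(x)+\chi_2(x)+\chi_3(x)$, such that $\operatorname{supp}\chi_1\subset N$, $\operatorname{supp}\chi_2\subset\Omega_\delta^{ext}$, $\operatorname{supp}\chi_3\subset\Omega_{\delta_2}^C$. We estimate the r.h.s. of (6.3) on the supports of $\chi_1$, $\chi_2$ and $\chi_3$. We have
$$\chi_3(x)E_{\hbar\Delta_1}(\bar H)=O(e^{-\rho/\hbar}),\tag{6.4}$$
for some $\rho>0$ (see [Gr], Lemma 6). On the support of $\chi_1$ we have $V(x)=V_0(x)$, suggesting that $\chi_1E_{\hbar\Delta_1}(\bar H)$ is close to $\chi_1E_{\hbar\Delta_1}(\bar H_0)$, which in turn is zero if $\hbar\Delta_1$ does not contain any eigenvalues of $\bar H_0$. We have in fact

Proposition 6.2. There is a neighbourhood $\hbar\Delta_1$ of $E_0$, such that $|\Delta_1|$ is independent of $\hbar$ ($|\Delta_1|$ depends only on the second derivatives of $V$ at the origin), and such that (i) $\hbar\Delta_1\cap\sigma(H_0)=\{E_0\}$, (ii) $\chi_1E_{\hbar\Delta_1}(\bar H)=O(\hbar^{1/2})$.

The proof is given below. Putting $1=\chi_1(x)+\chi_2(x)+\chi_3(x)$ in front of the last factor $E_{\hbar\Delta_1}(\bar H)$ in the r.h.s. of (6.3), we then get, using (6.4), Proposition 6.2 and the non-trapping Condition (T3):
$$\begin{aligned}E_{\hbar\Delta_1}(\bar H)\,\frac{i}{\hbar}[\bar H,A]\,E_{\hbar\Delta_1}(\bar H)&\ge E_{\hbar\Delta_1}(\bar H)\left(2(V(0)-V(x))-x\cdot\nabla V(x)\right)(\chi_1(x)+\chi_2(x))E_{\hbar\Delta_1}(\bar H)+O(\hbar)\\&\ge E_{\hbar\Delta_1}(\bar H)\left(\theta\chi_2(x)+O(\hbar^{1/2})\right)E_{\hbar\Delta_1}(\bar H)+O(\hbar)\\&\ge E_{\hbar\Delta_1}(\bar H)\left(\theta+O(\hbar^{1/2})\right)E_{\hbar\Delta_1}(\bar H).\end{aligned}\tag{6.5}$$
Proof of Proposition 6.2. (i) is a simple consequence of the harmonic approximation. In order to prove (ii), we introduce the unitary transformation $\mathcal U$ on $L^2(\mathbb{R}^d)$ defined by
$$(\mathcal U\psi)(x)=\hbar^{d/4}\psi(\hbar^{1/2}x),\quad\psi\in L^2(\mathbb{R}^d).\tag{6.6}$$
It is easily seen that $\mathcal UH\mathcal U^{-1}=\hbar H'$, $\mathcal UH_0\mathcal U^{-1}=\hbar H_0'$, where the rescaled Hamiltonians are given by $H'=-\Delta+\hbar^{-1}V(\hbar^{1/2}x)$ and $H_0'=-\Delta+\hbar^{-1}V_0(\hbar^{1/2}x)$. The spectra are related as $\sigma(H)=\hbar\sigma(H')$ and $\sigma(H_0)=\hbar\sigma(H_0')$. For a function $g$ of $H$, we have $\mathcal Ug(H)\mathcal U^{-1}=g(\hbar H')$. We let $\bar H'\equiv\bar P'H'\bar P'$, where $\bar P'=\mathbf{1}-P'$, $P'=\mathcal UP\mathcal U^{-1}$, and $\bar H_0'\equiv\bar P'H_0'\bar P'$. Pick now $g\in C_0^\infty$ such that $g=1$ on $\Delta_1$, and $\operatorname{supp}g\cap\sigma(H_0')=\{E_0/\hbar\}$ (this is possible by (i)). Since $g(\bar H_0')=0$, we have
$$\begin{aligned}\chi_1(\hbar^{1/2}x)g(\bar H')&=\chi_1(\hbar^{1/2}x)\left(g(\bar H')-g(\bar H_0')\right)\\&=-\chi_1(\hbar^{1/2}x)\int(\bar H_0'-z)^{-1}\bar P'\,\frac1\hbar W(\hbar^{1/2}x)\,\bar P'(\bar H'-z)^{-1}\,d\tilde g\\&=-\chi_1(\hbar^{1/2}x)\int(\bar H_0'-z)^{-1}\,\frac1\hbar W(\hbar^{1/2}x)\,\bar P'(\bar H'-z)^{-1}\,d\tilde g+O(\hbar^{-1}\kappa).\end{aligned}\tag{6.7}$$
In the last step, we used
$$\frac1\hbar P'W(\hbar^{1/2}x)=\frac1\hbar\,\mathcal UPW\mathcal U^{-1}=O(\hbar^{-1}\kappa).\tag{6.8}$$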
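The r.h.s. of (6.7) is now shown to be small by commuting $\chi_1(\hbar^{1/2}x)$ through the resolvent $(\bar H_0'-z)^{-1}$ to the right, and using $\chi_1(\hbar^{1/2}x)W(\hbar^{1/2}x)=0$:
$$\begin{aligned}\frac1\hbar\chi_1(\hbar^{1/2}x)(\bar H_0'-z)^{-1}W(\hbar^{1/2}x)&=\frac1\hbar(\bar H_0'-z)^{-1}\left[-\Delta,\chi_1(\hbar^{1/2}x)\right](\bar H_0'-z)^{-1}W(\hbar^{1/2}x)\\&=-2\hbar^{1/2}\,\frac1\hbar(\bar H_0'-z)^{-1}\nabla\cdot(\nabla\chi_1)(\hbar^{1/2}x)(\bar H_0'-z)^{-1}W(\hbar^{1/2}x)\\&\quad+\hbar\,\frac1\hbar(\bar H_0'-z)^{-1}(\Delta\chi_1)(\hbar^{1/2}x)(\bar H_0'-z)^{-1}W(\hbar^{1/2}x).\end{aligned}\tag{6.9}$$
Notice that $\|(\bar H_0'-z)^{-1}\nabla\|\le C|\operatorname{Im}z|^{-1}$, uniformly in $\hbar$, and that
$$\frac1\hbar(\bar H_0'-z)^{-1}W(\hbar^{1/2}x)\bar P'(\bar H'-z)^{-1}=(\bar H_0'-z)^{-1}-(\bar H'-z)^{-1}+O(\hbar^{-1}\kappa)=O(|\operatorname{Im}z|^{-1})+O(\hbar^{-1}\kappa).$$
We then get from (6.7) and (6.9):
$$\chi_1(\hbar^{1/2}x)g(\bar H')=O(\hbar^{1/2}).\tag{6.10}$$
Notice that we were able to use $\int|\operatorname{Im}z|^{-4}\,d|\tilde g|<C$, uniformly in $\hbar$, since the size of the support of the function $g$ is independent of $\hbar$. From (6.10), it follows that
$$\mathcal U^{-1}\chi_1(\hbar^{1/2}x)g(\bar H')\mathcal U=\chi_1\,\mathcal U^{-1}g(\bar H')\mathcal U=\chi_1g(\bar H/\hbar)=O(\hbar^{1/2}),\tag{6.11}$$
and (ii) of Proposition 6.2 is proved by multiplying the last equation by $E_{\hbar\Delta_1}(\bar H)$.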
6.2. Local decay in the quasiclassical case.

Theorem 6.3. Suppose that the Mourre estimate (6.1) holds, and that Conditions (ii) and (T1) of Paragraph 2.3.2 are satisfied. Then there is an interval $\Delta'\subset\Delta_1$ such that the local decay estimate holds:
$$\|\langle A\rangle^{-\alpha}e^{-i\bar Ht/\hbar}g_{\hbar\Delta'}(\bar H)\bar P\phi\|_2\le C\hbar^{-N}\langle t\rangle^{-\alpha}\|\langle A\rangle^{\alpha}\bar P\phi\|_2,\tag{6.12}$$
where $N$ is an integer depending on $\alpha$, $n$, and $C$ depends on $|\Delta'|$, but not on $\hbar$.

Proof. Via the unitary transformation $\mathcal U$ (introduced in the previous paragraph), (6.12) is equivalent to
$$\|\langle A\rangle^{-\alpha}e^{-i\bar H't}g_{\Delta'}(\bar H')\bar P'\phi\|_2\le C\hbar^{-N}\langle t\rangle^{-\alpha}\|\langle A\rangle^{\alpha}\bar P'\phi\|_2.\tag{6.13}$$
Notice that $\mathcal U$ commutes with $A$. From (6.1), we get by conjugating with $\mathcal U$:
$$E_{\Delta_1}(\bar H')\,i[\bar H',A]\,E_{\Delta_1}(\bar H')\ge(\theta-\epsilon)E_{\Delta_1}(\bar H').\tag{6.14}$$
We have also
$$\|\mathrm{ad}_A^{(k)}(g_{\Delta'}(\bar H'))\|\le C_k\hbar^{-2k}.\tag{6.15}$$
Estimate (6.15) is obtained by using the representation (A.1) and expanding the multicommutator (as in the proof of Lemma A.1, see (A.12)). From (6.14) and (6.15), we get (6.13) following [HSS] and keeping track of the dependence on $\hbar$ in (6.15).
6.3. Proof of Theorem 2.5. Remarks 2 and 3 after Conditions (T) in Paragraph 2.3.2 show that (A1) and (A2) are satisfied. Clearly, (A3) is also true by (T1) and Condition (i); in fact, $\mathrm{ad}_A^{(k)}(H)(H+i)^{-1} = O(\hbar^k)$. (A4) holds modulo the factor $\hbar^{-N}$ (see Theorem 6.3), and (A5) holds modulo $\hbar^p$ (see (T4)). Theorem 2.5 follows by proceeding as in the proof of Theorem 2.1 and keeping track of $\hbar$.

A. Appendix

We first present an operator calculus, which we then apply to find some norm estimates on the operator $B$ introduced in Proposition 3.1 that were used at various places in this work. In the subsequent sections, we give the proofs of Propositions 3.1–3.3.

A.1. Operator calculus. The operator calculus presented below is based on a formula due to Helffer and Sjöstrand [HeSj] with estimates of the remainders given in [IS, HS2] (see also [D]). We follow [HS2]. Let $A$ be a self-adjoint operator. For a complex-valued $g \in C_0^\infty(\mathbb{R})$, we have the representation
$$g(A) = \int (A - z)^{-1}\, d\tilde g(z), \tag{A.1}$$
where the integral is over $\mathbb{C}$, $\tilde g(z)$ is an almost analytic extension of $g$ to the complex plane, and
$$d\tilde g(z) \equiv \frac{1}{2\pi}(\partial_x + i\partial_y)\tilde g(z)\,dx\,dy. \tag{A.2}$$
The function $\tilde g(z)$ has compact support and satisfies the estimate
$$\int |\mathrm{Im}(z)|^{-p}\, d|\tilde g|(z) < \infty, \qquad p > 0. \tag{A.3}$$
Consequently, the integral (A.1) converges absolutely in norm. We also need estimates on commutators like $[h(H), f(A)]$, where $H$, $A$ are self-adjoint operators, $h \in C_0^\infty$ and $f \in C^\infty$. If the multicommutators $\mathrm{ad}_A^{(k)}(H)$, $k = 1, \dots, n$, are $H$-bounded, and $f$ satisfies the condition given below, then we have the following expansion:
$$[h(H), f(A)] = \sum_{k=1}^{n-1} \frac{1}{k!}\, f^{(k)}(A)\, \mathrm{ad}_A^{(k)}(h(H)) + R_n, \tag{A.4}$$
where
$$R_n = \int (A - z)^{-n}\, \mathrm{ad}_A^{(n)}(h(H))\, (A - z)^{-1}\, d\tilde f(z).$$
We then have
$$\|R_n\| \le C_n\, \|\mathrm{ad}_A^{(n)}(h(H))\| \sum_{k=0}^{n+2} \int \langle x\rangle^{k-n-1}\, |f^{(k)}(x)|\, dx, \tag{A.5}$$
where $\langle x\rangle \equiv (1 + x^2)^{1/2}$. The condition on $f$ is that the integrals in (A.5) exist. For details, see [HS2].
A.2. Proof of Proposition 3.1. We want to show that ||g10 (H)P g1 (H)|| < 1 for small κ, then B is given by the norm-converging Neumann series B = (11 − g10 (H)P g1 (H))−1 =
∞ X n g10 (H)P g1 (H) .
(A.6)
n=0
Since g1 (H) commutes withP , and g10 g1 = 0, we get g10 (H)P g1 (H) = g10 (H)P (g1 (H)− g1 (H)). Using the second resolvent identity gives Z P (g1 (H) − g1 (H)) = −P [(H − z)−1 − (H − z)−1 ]dg˜1 (z) Z (A.7) = − (H − z)−1P W P (H − z)−1 dg˜1 (z). With ||W P || ≤ κ, we get ||(H − z)−1P W P (H − z)−1 || ≤
κ . |Im(z)|2
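Now $|\mathrm{Im}(z)|^{-2}$ is integrable with respect to $d|\tilde g_\Delta|(z)$, see (A.3). The integral in (A.7) is thus bounded by $\kappa C(\Delta)$, where
$$C(\Delta) = \int |\mathrm{Im}(z)|^{-2}\, d|\tilde g_\Delta|(z),$$
and this shows existence of $B$ for $\kappa < 1/C(\Delta)$. Moreover
$$\|B\| \le \sum_{n=0}^{\infty} \kappa^n C(\Delta)^n = (1 - \kappa C(\Delta))^{-1}.$$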
This completes the proof of the proposition.
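As a quick check of the last step, the following sketch spells out the geometric-series argument for a generic bounded operator. The symbols $T$ and $q$ are placeholders introduced here for illustration only: $T$ stands for the operator whose norm is shown to be small in the proof above, and $q$ for the bound ($\kappa$ times the constant of the proof), assumed to satisfy $q < 1$.

```latex
% Sketch (assumption: T is bounded with \|T\| \le q < 1).
\begin{align*}
(1 - T)\sum_{n=0}^{N} T^{n} &= 1 - T^{N+1},
  \qquad \|T^{N+1}\| \le q^{N+1} \to 0 \quad (N \to \infty),\\
\Big\|\sum_{n=0}^{\infty} T^{n}\Big\| &\le \sum_{n=0}^{\infty} q^{n} = \frac{1}{1-q}.
\end{align*}
% Hence (1 - T)^{-1} = \sum_{n \ge 0} T^{n} converges in norm
% and is bounded by (1 - q)^{-1}.
```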
A.3. Norm estimates on the operator B. Lemma A.1. We have ∀φ ∈ H: ||−α Bφ||H ≤ C ||−α φ||H , ||−α B 0 φ||H ≤ Cκ||−α φ||H ,
(A.8) (A.9)
where the constants are independent of κ for small κ. Proof. Due to (A.6) it is enough to show that || −α g10 (H)P g1 (H) α || ≤ Cκ.
(A.10)
Due to (A.7) and g10 (H)g1 (H) = 0, in order to show (A.10), it is enough to show Z −α (H − z)−1P W P (H − z)−1 g1 (H) α dg˜10 (z) ≤ Cκ. (A.11) We have ||W P || ≤ κ. Introduce α −α between P and (H − z)−1 in (A.11) and notice that P α is bounded by Condition (A1). The norm of the integrand in (A.11) is then bounded by
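$$\kappa\, |\mathrm{Im}(z)|^{-1}\, \|\langle A\rangle^{-\alpha} (H - z)^{-1} g_\Delta(H) \langle A\rangle^{\alpha}\| \le \kappa\, |\mathrm{Im}(z)|^{-1} \Big( |\mathrm{Im}(z)|^{-1} + \|\langle A\rangle^{-\alpha} [(H - z)^{-1} g_\Delta(H), \langle A\rangle^{\alpha}]\| \Big).$$
To estimate the commutator in the last expression, notice that for all $z \notin \mathbb{R}$, the function $x \mapsto (x - z)^{-1} g_\Delta(x)$, $x \in \mathbb{R}$, belongs to $C_0^\infty(\Delta)$. Hence we can apply expansion (A.4)–(A.5) with $h(x) \equiv (x - z)^{-1} g_\Delta(x)$ and $f(x) = \langle x\rangle^{\alpha}$. Estimate (A.5) implies $\|R_n\| \le C \|\mathrm{ad}_A^{(n)}((H - z)^{-1} g_\Delta(H))\|$. Now
$$\mathrm{ad}_A^{(k)}((H - z)^{-1} g_\Delta(H)) = \int \mathrm{ad}_A^{(k)}\big((H - z)^{-1}(H - \zeta)^{-1}\big)\, d\tilde g_\Delta(\zeta) = \int \sum_{r_1 + r_2 = k} C_{r_1, r_2}\, \mathrm{ad}_A^{(r_1)}((H - z)^{-1})\, \mathrm{ad}_A^{(r_2)}((H - \zeta)^{-1})\, d\tilde g_\Delta(\zeta),$$
for some numbers $C_{r_1, r_2}$. Therefore
$$\|\mathrm{ad}_A^{(k)}((H - z)^{-1} g_\Delta(H))\| \le C \int \sum_{r_1 + r_2 = k} \|\mathrm{ad}_A^{(r_1)}((H - z)^{-1})\|\, \|\mathrm{ad}_A^{(r_2)}((H - \zeta)^{-1})\|\, d|\tilde g_\Delta|(\zeta). \tag{A.12}$$
In order to estimate (A.12) further, observe that
$$\|\mathrm{ad}_A^{(k)}((H - z)^{-1})\| \le C \sum_{j=2}^{k+1} |\mathrm{Im}(z)|^{-j} \tag{A.13}$$
uniformly in $\kappa$ for small $\kappa$. The integral in (A.12) is thus bounded by
$$C \int \sum_{j=2}^{k+1} |\mathrm{Im}(\zeta)|^{-j}\, d|\tilde g_\Delta|(\zeta) < \infty,$$
uniformly in $\kappa$, for small $\kappa$. So we get
$$\|\mathrm{ad}_A^{(k)}((H - z)^{-1} g_\Delta(H))\| \le C \sum_{j=2}^{k+1} |\mathrm{Im}(z)|^{-j},$$
and hence the l.h.s. of (A.12) is indeed bounded by
$$C\kappa \int \big( |\mathrm{Im}(z)|^{-2} + |\mathrm{Im}(z)|^{-n-2} \big)\, d|\tilde g_{\Delta'}|(z) \le C\kappa, \tag{A.14}$$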
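uniformly in $\kappa$, for small $\kappa$. This shows (A.8). (A.9) is then readily obtained from the fact that $B' = g_{\Delta'}(H) P g_\Delta(H) B$ and (A.10).

A.4. Proof of Proposition 3.2. For $\varphi \in D(\langle A\rangle^{\alpha})$, $\varepsilon \downarrow 0$, $t \ge 0$ fixed, we show that $\varphi_\varepsilon(t) \equiv \langle A\rangle^{-\alpha} e^{-iHt} (H - \omega - i\varepsilon)^{-1} P_d \varphi$ is a Cauchy net in $\mathcal{H}$. We notice that (for $\varepsilon > 0$)
$$(H - \omega - i\varepsilon)^{-1} = i \int_0^\infty e^{-i(H - \omega - i\varepsilon)s}\, ds, \tag{A.15}$$
and so we get, using the first resolvent identity:
$$(H - \omega - i\varepsilon)^{-1} - (H - \omega - i\varepsilon')^{-1} = -i(\varepsilon - \varepsilon') \int_0^\infty e^{-i(H - \omega - i\varepsilon)s}\, ds \int_0^\infty e^{-i(H - \omega - i\varepsilon')\sigma}\, d\sigma.$$
Therefore
$$\begin{aligned}
\|\varphi_\varepsilon(t) - \varphi_{\varepsilon'}(t)\|_{\mathcal{H}}
&\le |\varepsilon - \varepsilon'| \int_0^\infty \!\! \int_0^\infty e^{-\varepsilon s} e^{-\varepsilon' \sigma}\, \|\langle A\rangle^{-\alpha} e^{-iH(t + s + \sigma)} P_d \varphi\|_{\mathcal{H}}\, d\sigma\, ds \\
&\le |\varepsilon - \varepsilon'| \int_0^\infty ds \int_{t+s}^\infty d\tau\, \|\langle A\rangle^{-\alpha} e^{-iH\tau} P_d \varphi\|_{\mathcal{H}} \\
&\le |\varepsilon - \varepsilon'| \int_0^\infty ds \int_s^\infty d\tau\, (\tau + 1)^{-\alpha}\, C\, \|\langle A\rangle^{\alpha} \varphi\|_{\mathcal{H}},
\end{aligned} \tag{A.16}$$
where we used the local decay (2.1). Since $\alpha > 2$, the double integral in (A.16) is finite, so we get
$$\|\varphi_\varepsilon(t) - \varphi_{\varepsilon'}(t)\|_{\mathcal{H}} \le C |\varepsilon - \varepsilon'|\, \|\langle A\rangle^{\alpha} \varphi\|_{\mathcal{H}}, \qquad \forall t \ge 0,$$
which shows that $\varphi_\varepsilon$ is a Cauchy net, and (i) is proved. To prove (ii), we remark that from (A.16) the limit $\varphi_0(t) \equiv \lim_{\varepsilon' \downarrow 0} \varphi_{\varepsilon'}(t)$ exists, and $\varphi_\varepsilon \to \varphi_0$ uniformly in $t$. So $\varphi_0$ is continuous, since the $\varphi_\varepsilon$'s are. To prove (iii), remark that ($t \ge 0$, $\varepsilon > 0$):
$$e^{-iHt} (H - \omega - i\varepsilon)^{-1} = i e^{-i\omega t} e^{\varepsilon t} \int_t^\infty e^{-i(H - \omega - i\varepsilon)s}\, ds,$$
so that we get
$$\begin{aligned}
\|\langle A\rangle^{-\alpha} e^{-iHt} (H - \omega - i\varepsilon)^{-1} P_d \varphi\|_{\mathcal{H}}
&\le e^{\varepsilon t} \int_t^\infty e^{-\varepsilon s}\, \|\langle A\rangle^{-\alpha} e^{-iHs} P_d \varphi\|_{\mathcal{H}}\, ds \\
&\le C\, \|\langle A\rangle^{\alpha} \varphi\|_{\mathcal{H}} \int_t^\infty (s + 1)^{-\alpha}\, ds \\
&\le C (t + 1)^{-\alpha + 1}\, \|\langle A\rangle^{\alpha} \varphi\|_{\mathcal{H}}.
\end{aligned}$$
This completes the proof of the proposition.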
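The two elementary computations behind this proof can be checked on scalars. The sketch below replaces the operator $H$ by a real number $\lambda$ (an assumption made purely for illustration; by the spectral theorem the operator identity reduces to this scalar one) and evaluates the integrals that make the bounds finite for $\alpha > 2$.

```latex
% (a) Scalar version of the resolvent formula (A.15), with \varepsilon > 0:
\[
i\int_0^\infty e^{-i(\lambda-\omega-i\varepsilon)s}\,ds
 = i\left[\frac{e^{-i(\lambda-\omega)s}\,e^{-\varepsilon s}}
               {-i(\lambda-\omega-i\varepsilon)}\right]_{s=0}^{s=\infty}
 = \frac{1}{\lambda-\omega-i\varepsilon}.
\]
% (b) The double integral in (A.16) is finite exactly when \alpha > 2:
\[
\int_0^\infty ds\int_s^\infty (\tau+1)^{-\alpha}\,d\tau
 = \int_0^\infty \frac{(s+1)^{1-\alpha}}{\alpha-1}\,ds
 = \frac{1}{(\alpha-1)(\alpha-2)} .
\]
% (c) The decay rate in (iii):
\[
\int_t^\infty (s+1)^{-\alpha}\,ds = \frac{(t+1)^{1-\alpha}}{\alpha-1}.
\]
```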
A.5. Proof of Proposition 3.3. The definition of λ is (see the sentence after (3.13)) λ = E0 + P W P + P W B 0 P − P W (H − E0 − i0)−1 Pd W P.
(A.17)
We analyze first P W B 0 P . Since B 0 = g10 (H)P g1 (H) + O(κ2 ), we have P W B 0 P = P W g10 (H)P g1 (H)P + O(κ3 ).
(A.18)
Using g10 (H)g1 (H) = 0, we get g10 (H)P g1 (H) = g10 (H)P (g1 (H) − g1 (H)) Z = −g10 (H)P (H − z)−1 W P (H − z)−1 dg˜ 1 (z). Since (H − z)−1 = (H0 − z)−1 (11 − W (H − z)−1 ) and P W and W P are O(κ), we deduce that Z g10 (H)P g1 (H) = −g10 (H)P (H − z)−1 (E0 − z)−1 dg˜ 1 (z)W P + O(κ2 ). (A.19) Now
(H − z)−1 (E0 − z)−1 = −(H − E0 )−1 (H − z)−1 − (E0 − z)−1 , and therefore g10 (H)P g1 (H) = g10 (H)P (H − E0 )−1 g1 (H) − g1 (E0 ) W P + O(κ2 ) = −g10 (H)P (H − E0 )−1 W P + O(κ2 ). Combining this relation with (A.18), we find P W B 0 P = −P W g10 (H)(H − E0 )−1P W P + O(κ3 ).
(A.20)
0
Notice that the term of order two in κ of P W B P is self-adjoint. Let us now examine P W (H − E0 − i0)−1 Pd W P , which will give the non-zero anti-self-adjoint contribution of order two in κ, P W (H − E0 − i0)−1 Pd W P = P W (H − E0 − i0)−1P W P − P W g10 (H)(H − E0 )−1P W P.
(A.21)
Observe that the first term on the r.h.s. exists, because of Proposition 3.2 (and the / supp(g10 ). assumption that α W P is bounded). The second term exists since E0 ∈ Hence it follows from (A.17), (A.20) and (A.21) that λ = E0 + P W P − P W (H − E0 − i0)−1P W P + O(κ3 ).
(A.22)
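Notice that since $P W (H - E_0 - i\varepsilon)^{-1} P W P$ converges strongly as $\varepsilon \downarrow 0$, so does its adjoint $P W (H - E_0 + i\varepsilon)^{-1} P W P$. The proof is now complete if one observes that the principal value and the delta-function have the representations
$$\mathrm{P.V.}\,(H - E_0)^{-1} = \frac{1}{2} \lim_{\varepsilon \downarrow 0} \Big[ (H - E_0 - i\varepsilon)^{-1} + (H - E_0 + i\varepsilon)^{-1} \Big] \tag{A.23}$$
and
$$\delta(H - E_0) = \frac{1}{2\pi i} \lim_{\varepsilon \downarrow 0} \Big[ (H - E_0 - i\varepsilon)^{-1} - (H - E_0 + i\varepsilon)^{-1} \Big]. \tag{A.24}$$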
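The two representations can be verified on scalars. In the sketch below, $x$ is a real placeholder standing for a spectral value of $H - E_0$ (an assumption made for illustration); the first combination gives the regularized principal value and the second a Lorentzian of unit mass, which concentrates at $x = 0$ as $\varepsilon \downarrow 0$.

```latex
% Scalar check of (A.23)-(A.24): x real, \varepsilon > 0.
\begin{align*}
\frac12\left[\frac{1}{x-i\varepsilon}+\frac{1}{x+i\varepsilon}\right]
  &= \frac{x}{x^{2}+\varepsilon^{2}}
  \;\xrightarrow[\varepsilon\downarrow0]{}\; \frac1x \quad (x\neq 0),\\
\frac{1}{2\pi i}\left[\frac{1}{x-i\varepsilon}-\frac{1}{x+i\varepsilon}\right]
  &= \frac1\pi\,\frac{\varepsilon}{x^{2}+\varepsilon^{2}},
  \qquad
  \int_{-\infty}^{\infty}\frac1\pi\,\frac{\varepsilon}{x^{2}+\varepsilon^{2}}\,dx = 1 .
\end{align*}
```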
Remark. To show $\|\Gamma\| \le C\kappa^2$ (see (2.3)), notice that $\|P W (H - E_0 \pm i0)^{-1} P_d W P\| = O(\kappa^2)$. This is shown to hold using Proposition 3.2(ii) with $t = 0$, and the assumption $P W \langle A\rangle^{\alpha} = O(\kappa)$.

Acknowledgement. The authors are grateful to J. Fröhlich, W. Hunziker, A. Soffer and M. Weinstein for useful discussions and to A. Soffer and M. Weinstein for making their work available to us prior to publication. It is a pleasure to thank the referee for useful remarks.
References
[AC] Aguilar, J., and Combes, J.M.: A class of analytic perturbations for one-body Schrödinger Hamiltonians. Commun. Math. Phys. 22, 269–279 (1971)
[AHS] Agmon, S., Herbst, I., and Skibsted, E.: Perturbation of embedded eigenvalues in the generalized N-body problem. Commun. Math. Phys. 122, 411–438 (1989)
[BC] Balslev, E., and Combes, J.M.: Spectral properties of many-body Schrödinger operators with dilation analytic interactions. Commun. Math. Phys. 22, 280–294 (1971)
[BFS1] Bach, V., Fröhlich, J., and Sigal, I.M.: Mathematical theory of non-relativistic matter and radiation. Lett. in Math. Phys. 34, 183–201 (1995)
[BFS2] Bach, V., Fröhlich, J., and Sigal, I.M.: Quantum Electrodynamics of Confined Nonrelativistic Particles. To appear in Advances in Mathematics
[BFS3] Bach, V., Fröhlich, J., and Sigal, I.M.: Renormalization group Analysis of Spectral Problems in QFT. To appear in Advances in Mathematics
[BP] Buslaev, V., and Perelman, G.: On the stability of solitary waves for nonlinear Schrödinger equations. Am. Math. Soc. Transl. (2) 164, 75–98 (1995)
[BW] Breit, W., and Wigner, E.: Capture of slow neutrons. Phys. Rev. 49, 519–531 (1936)
[CFKS] Cycon, H.L., Froese, R.G., Kirsch, W., and Simon, B.: Schrödinger Operators. Texts and Monographs in Physics, Berlin: Springer-Verlag, 1987
[D] Davies, E.B.: Spectral theory of differential operators. Cambridge: Cambridge University Press, 1995
[G] Gérard, C.: Resonance theory for periodic Schrödinger operators. Bull. Soc. Math. Fr. 118, 27–54 (1990)
[Gr] Graf, G.M.: The Mourre estimate in the semiclassical limit. Lett. in Math. Phys. 20, 47–54 (1990)
[GS] Gérard, C., and Sigal, I.M.: Space-time picture of semiclassical resonances. Commun. Math. Phys. 145, 281–328 (1992)
[HM] Helffer, B., and Martinez, A.: Comparaison entre les diverses notions de résonances. Preprint (1987)
[HN] Hislop, P.D., and Nakamura, S.: Semiclassical resolvent estimates. Ann. Inst. H. Poincaré A41 (1989)
[HeSj] Helffer, B., and Sjöstrand, J.: Equation de Schrödinger avec champ magnétique et équation de Harper. In: Schrödinger operators, H. Holden, A. Jensen eds., Lecture notes in Physics 345, Berlin–Heidelberg–New York: Springer-Verlag, 1989
[HiSig] Hislop, P.D., and Sigal, I.M.: Introduction to spectral theory. Applied Mathematical Sciences 133, NY: Springer, 1996
[Hu1] Hunziker, W.: Distortion analyticity and molecular resonance curves. Ann. Inst. Henri Poincaré 45, 339–358 (1986)
[Hu2] Hunziker, W.: Resonances, metastable states, and exponential decay laws in perturbation theory. Commun. Math. Phys. 132, 177–188 (1990)
[HS1] Hunziker, W., and Sigal, I.M.: The general theory of N-body quantum systems. CRM Proceedings and lecture notes 8, 35–72 (1995)
[HS2] Hunziker, W., and Sigal, I.M.: Time-dependent scattering theory of N-body quantum systems. Preprint (1997)
[HSS] Hunziker, W., Sigal, I.M., and Soffer, A.: Minimal escape velocities. Preprint (1997)
[IS] Ivrii, V., and Sigal, I.M.: Asymptotics of the ground state energies of large Coulomb systems. Ann. Math. 138, 243–335 (1993)
[KP] Kapur, P.L., and Peierls, R.: The dispersion formula for nuclear reactions. Proc. Roy. Soc. (London) A166, 277–295 (1938)
[LL] Landau, L.D., and Lifshitz, E.M.: Quantum Mechanics. Oxford: Pergamon Press, 1977
[Ort] Orth, A.: Quantum mechanical resonance and binding absorption: The many-body problem. Commun. Math. Phys. 126, 559–573 (1990)
[PF] Pfeifer, P., and Fröhlich, J.: Generalized time-energy uncertainty relations and bounds on life-times of resonances. Rev. Mod. Phys. 67, 759–779 (1995)
[PW] Pillet, C.A., and Wayne, C.E.: Invariant manifolds for a class of dispersive, Hamiltonian, partial differential equations. Preprint (1997)
[S] Schwinger, J.: Field theory of unstable particles. Ann. Phys. 9, 169–193 (1960)
[Sig1] Sigal, I.M.: Complex transformation method and resonances in one-body quantum systems. Ann. Inst. Henri Poincaré 41, 103–114 (1984)
[Sig2] Sigal, I.M.: Non-linear wave and Schrödinger equations. Commun. Math. Phys. 153, 297–320 (1993)
[Sig3] Sigal, I.M.: Scattering Theory for Many-Body Quantum Mechanical Systems. Lecture Notes in Math. 1011, Berlin–Heidelberg–New York: Springer-Verlag, 1983
[Sim1] Simon, B.: Resonances in n-body quantum systems with dilation analytic potentials and the foundations of time-dependent perturbation theory. Ann. Math. 97, 247–274 (1973)
[Sim2] Simon, B.: Resonances and complex scaling: A rigorous overview. Int. J. Quant. Chem. 14, 529–542 (1978)
[Sk1] Skibsted, E.: Truncated Gamow functions and the exponential decay law. Ann. Inst. H. Poincaré 46, 131–153 (1987)
[Sk2] Skibsted, E.: On the evolution of resonance states. J. Math. Anal. Appl. 141, 27–48 (1989)
[SW1] Soffer, A., and Weinstein, M.I.: Multichannel nonlinear scattering for non-integrable equations. Commun. Math. Phys. 133, 119–146 (1990)
[SW2] Soffer, A., and Weinstein, M.I.: Multichannel nonlinear scattering, II. The case of anisotropic potentials and data. J. Diff. Eqns. 98, 376–390 (1992)
[SW3] Soffer, A., and Weinstein, M.I.: Time dependent resonance theory. GAFA, to appear
[WW] Weisskopf, V., and Wigner, E.: Berechnung der natürlichen Linienbreite auf Grund der Diracschen Lichttheorie. Z. Phys. 63, 54–73 (1930)

Communicated by B. Simon

This article was processed by the author using the LaTeX style file cljour1 from Springer-Verlag
Commun. Math. Phys. 201, 577 – 590 (1999)
Communications in
Mathematical Physics © Springer-Verlag 1999
Asymptotics for Large Time of Global Solutions to the Generalized Kadomtsev–Petviashvili Equation
Nakao Hayashi1, Pavel I. Naumkin2, Jean-Claude Saut3
1 Department of Applied Mathematics, Science University of Tokyo, 1-3, Kagurazaka, Shinjuku-ku, Tokyo 162, Japan. E-mail: [email protected]
2 Instituto de Física y Matemáticas, Universidad Michoacana, AP 2-82, CP 58040, Morelia, Michoacan, Mexico. E-mail: [email protected]
3 Analyse Numérique et EDP, CNRS et Université Paris-Sud, Bât 425, 91405 Orsay, France. E-mail: [email protected]
Received: 10 March 1998 / Accepted: 15 September 1998
Abstract: We study the large time asymptotic behavior of solutions to the generalized Kadomtsev–Petviashvili (KP) equations
$$\begin{cases} u_t + u_{xxx} + \sigma \partial_x^{-1} u_{yy} = -(u^\rho)_x, & (t, x, y) \in \mathbb{R} \times \mathbb{R}^2, \\ u(0, x, y) = u_0(x, y), & (x, y) \in \mathbb{R}^2, \end{cases} \tag{KP}$$
where $\sigma = 1$ or $\sigma = -1$. When $\rho = 2$ and $\sigma = -1$, (KP) is known as the KPI equation, while $\rho = 2$, $\sigma = +1$ corresponds to the KPII equation. The KP equation models the propagation along the $x$-axis of nonlinear dispersive long waves on the surface of a fluid, when the variation along the $y$-axis proceeds slowly [10]. The case $\rho = 3$, $\sigma = -1$ has been found in the modeling of sound waves in antiferromagnetics [15]. We prove that if $\rho \ge 3$ is an integer and the initial data are sufficiently small, then the solution $u$ of (KP) satisfies the following estimates:
$$\|u(t)\|_\infty \le C (1 + |t|)^{-1} (\log(2 + |t|))^{\kappa}, \qquad \|u_x(t)\|_\infty \le C (1 + |t|)^{-1}$$
for all $t \in \mathbb{R}$, where $\kappa = 1$ if $\rho = 3$ and $\kappa = 0$ if $\rho \ge 4$. We also find the large time asymptotics for the solution.
1. Introduction

In this paper we study the asymptotic behavior in time of solutions to the generalized Kadomtsev–Petviashvili (KP) equation
$$\begin{cases} u_t + u_{xxx} + \sigma \partial_x^{-1} u_{yy} = -(u^\rho)_x, & (x, y) \in \mathbb{R}^2,\ t \in \mathbb{R}, \\ u(0, x, y) = u_0(x, y), & (x, y) \in \mathbb{R}^2, \end{cases} \tag{1.1}$$
where $\partial_x^{-1} = \int_{-\infty}^{x} dx'$, and $\sigma = 1$ or $\sigma = -1$. When $\rho = 2$ and $\sigma = -1$, (1.1) is known as the KPI equation, while $\rho = 2$, $\sigma = +1$ corresponds to the KPII equation. The KPI equation models the propagation of nonlinear dispersive long waves on the surface of a fluid along the $x$-axis, when the variation along the $y$-axis proceeds slowly [10]; the KPII equation corresponds to the usual situation in water waves, where capillary effects are weak. The case $\rho = 3$, $\sigma = -1$ has been found in the modeling of sound waves in antiferromagnetics [15].

Various authors have studied the Cauchy problem for the KPI and KPII equations. For instance, Ukai [16] and Isaza–Mejia–Stallbohm [9] have shown the local well-posedness of the Cauchy problem in $H^s(\mathbb{R}^2)$, $s \ge 3$, while Iório and Nunes [8] obtained the local well-posedness in $H^s(\mathbb{R}^2)$, $s > 2$, for a class of generalized KP equations. On the other hand, Bourgain [2] has proved the global well-posedness in $H^s(\mathbb{R}^2)$, $s \ge 0$, for the KPII case. For the generalized KPI case, some rigorous results have recently appeared. In paper [11] it was shown that certain solutions of the generalized KPI equation with $\rho \ge 5$ cannot remain in the Sobolev space $H^1(\mathbb{R}^2)$ for all time. This blow-up phenomenon is essentially due to the transverse effects (namely, the norm of the derivative $\|u_y(t)\|$ blows up in finite time). Schwarz [13] considered the periodic problem for the KPI equation and showed that global solutions exist if the initial data are sufficiently small in $L^2(\mathbb{R}^2)$ (the smallness assumption can be easily removed, see [4, 12]). Some results concerning existence of the solitary wave solutions for the generalized KPI equation were obtained recently in paper [1]. There it was shown that the Cauchy problem (1.1) with $\sigma = -1$ does not possess any nontrivial solitary wave solution if $\rho \ge 5$, while in the case $2 \le \rho < 5$ nontrivial solitary waves do exist.
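In paper [14] Tom has shown the existence of global solutions of (1.1) with $\sigma = -1$, $2 \le p \le 5$ in the natural energy space associated to the KP equations. In particular, for $2 \le p < 7/3$ the existence of global solutions is proved without any size restriction on the data. However, the uniqueness of such solutions is an open problem.

We use the following notation and function spaces. As usual we denote the Lebesgue space as $L^p = \{\varphi \in \mathcal{S}';\ \|\varphi\|_p < \infty\}$, where the norm is $\|\varphi\|_p = \left( \iint |\varphi(x, y)|^p\, dx\, dy \right)^{1/p}$ if $1 \le p < \infty$ and $\|\varphi\|_\infty = \mathrm{ess.sup}\{|\varphi(x, y)|;\ (x, y) \in \mathbb{R}^2\}$ if $p = \infty$. For simplicity we put $\|\varphi\| = \|\varphi\|_2$. The weighted Sobolev space is
$$H_p^{m,s} = \Big\{ \varphi \in \mathcal{S}';\ \|\varphi\|_{m,s,p} = \big\| (1 + x^2 + y^2)^{s/2} (1 - \partial_x^2 - \partial_y^2)^{m/2} \varphi \big\|_p < \infty \Big\}, \qquad m, s \in \mathbb{R},\ 1 \le p \le \infty.$$
We also use the following notations: $H^{m,s} = H_2^{m,s}$, $\|\cdot\|_{m,s} = \|\cdot\|_{m,s,2}$. Let $C(I; B)$ be the space of continuous functions from an interval $I$ to a Banach space $B$. Different positive constants might be denoted by the same letter $C$. By $\mathcal{F}\varphi$ or $\hat\varphi$ we denote the Fourier transform of the function $\varphi$,
$$\hat\varphi(\xi, \eta) = \frac{1}{2\pi} \iint_{\mathbb{R}^2} e^{-ix\xi - iy\eta} \varphi(x, y)\, dx\, dy.$$
The inverse Fourier transform $\mathcal{F}^{-1}\varphi$ or $\check\varphi$ is given by
$$\check\varphi(x, y) = \frac{1}{2\pi} \iint_{\mathbb{R}^2} e^{ix\xi + iy\eta} \varphi(\xi, \eta)\, d\xi\, d\eta.$$
We denote by $U(t)$ the free KP evolution group defined on the space of functions $\varphi \in L^2$ such that $\partial_x^{-1}\varphi \in L^2$, where $\widehat{\partial_x^{-1}\varphi} = \hat\varphi(\xi, \eta)/(i\xi)$, by the formula
$$U(t)\varphi(t) = \mathcal{F}^{-1} e^{it(\xi^3 - \sigma\eta^2/\xi)} \hat\varphi(t, \xi, \eta) = \iint_{\mathbb{R}^2} G(t, x - x', y - y')\, \varphi(t, x', y')\, dx'\, dy',$$
where $G(t, x, y) = \frac{1}{2\pi} \iint_{\mathbb{R}^2} e^{ix\xi + iy\eta + it\xi^3 - it\sigma\eta^2/\xi}\, d\xi\, d\eta$. We introduce the following operators: $J_x = U(t) x U(-t)$ and $J_y = U(t) y U(-t)$. By a simple calculation we see that
$$\begin{aligned}
J_x = U(t) x U(-t) &= \mathcal{F}^{-1} \exp\!\Big( it \Big( \xi^3 - \sigma\frac{\eta^2}{\xi} \Big) \Big)\, i\partial_\xi\, \mathcal{F} U(-t) \\
&= \mathcal{F}^{-1} \Big( i\partial_\xi + t \Big( 3\xi^2 + \sigma\frac{\eta^2}{\xi^2} \Big) \Big) \exp\!\Big( it \Big( \xi^3 - \sigma\frac{\eta^2}{\xi} \Big) \Big) \mathcal{F} U(-t) \\
&= x - t \big( 3\partial_x^2 - \sigma\partial_x^{-2}\partial_y^2 \big),
\end{aligned}$$
and in the same way
$$\begin{aligned}
J_y = U(t) y U(-t) &= \mathcal{F}^{-1} \exp\!\Big( it \Big( \xi^3 - \sigma\frac{\eta^2}{\xi} \Big) \Big)\, i\partial_\eta\, \mathcal{F} U(-t) \\
&= \mathcal{F}^{-1} \Big( i\partial_\eta - 2\sigma t \frac{\eta}{\xi} \Big) \exp\!\Big( it \Big( \xi^3 - \sigma\frac{\eta^2}{\xi} \Big) \Big) \mathcal{F} U(-t) = y - 2\sigma t\, \partial_x^{-1}\partial_y.
\end{aligned}$$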
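The operator $J_x$ is not of the first order and does not work well for the KP equation (1.1) directly, so we also introduce the operator $I = x + 2\partial_x^{-1}\partial_y y + 3t\partial_x^{-1}\partial_t$. Then we easily see that $I = J_x + 2\partial_x^{-1}\partial_y J_y + 3t\partial_x^{-1} L$, where $L = \partial_t + \partial_x^3 + \sigma\partial_x^{-1}\partial_y^2 = \partial_t + \mathcal{F}^{-1} \big( -i\xi^3 + i\sigma\frac{\eta^2}{\xi} \big) \mathcal{F}$ is the linear part of Eq. (1.1). We introduce the function spaces
$$X_T = \Big\{ \varphi \in C([0, T]; L^2);\ |||\varphi|||_{X_T} = \sup_{0 \le t \le T} \Big( \|\partial_x^{-1}\varphi(t)\|_{4,0} + (1 + t)^{1 - \frac{2}{q}} \|\varphi(t)\|_{2,0,q} \Big) < \infty \Big\}$$
and
$$Y = \Big\{ \varphi \in X_\infty;\ \|\varphi\|_Y = \|\varphi\|_{X_\infty} + \sup_{t \in \mathbb{R}^+} \|J_x \partial_x \varphi(t)\|_{1,0} + \sup_{t \in \mathbb{R}^+} (1 + t)^{-\gamma/8} \|J_y \varphi(t)\|_{3,0} + \sup_{t \in \mathbb{R}^+} (1 + t)^{-\gamma/6} \|J_y^2 \partial_x \varphi(t)\|_{1,0} < \infty \Big\}.$$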
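The decomposition of the operator $I$ in terms of $J_x$, $J_y$ and $L$ stated above can be checked term by term. The following sketch uses only the explicit forms $J_x = x - t(3\partial_x^2 - \sigma\partial_x^{-2}\partial_y^2)$, $J_y = y - 2\sigma t\,\partial_x^{-1}\partial_y$ and $L = \partial_t + \partial_x^3 + \sigma\partial_x^{-1}\partial_y^2$; it assumes, as the text does implicitly, that $\partial_x^{-1}$ commutes with $\partial_y$ and with multiplication by $t$.

```latex
% Term-by-term check of I = J_x + 2\partial_x^{-1}\partial_y J_y + 3t\partial_x^{-1}L.
\begin{align*}
J_x &= x - 3t\,\partial_x^{2} + \sigma t\,\partial_x^{-2}\partial_y^{2},\\
2\partial_x^{-1}\partial_y J_y &= 2\partial_x^{-1}\partial_y\,y
      - 4\sigma t\,\partial_x^{-2}\partial_y^{2},\\
3t\,\partial_x^{-1}L &= 3t\,\partial_x^{-1}\partial_t + 3t\,\partial_x^{2}
      + 3\sigma t\,\partial_x^{-2}\partial_y^{2}.
\end{align*}
% Adding the three lines, the \partial_x^2 terms cancel (-3t + 3t = 0) and the
% \partial_x^{-2}\partial_y^2 terms cancel (\sigma t - 4\sigma t + 3\sigma t = 0), leaving
\[
J_x + 2\partial_x^{-1}\partial_y J_y + 3t\,\partial_x^{-1}L
  = x + 2\partial_x^{-1}\partial_y\,y + 3t\,\partial_x^{-1}\partial_t = I .
\]
```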
The main results of this paper are the following. First of all we give a global existence theorem with an optimal time decay estimate in the case of small initial data belonging to a rather wide class.

Theorem 1.1. Let $\rho \ge 3$ be an integer. Let the initial data $u_0 \in H^{2,0}_{\frac{q}{q-1}}$, $\partial_x^{-1} u_0 \in H^{4,0}$ and the value $\varepsilon^2 = \|u_0\|_{2,0,\frac{q}{q-1}} + \|\partial_x^{-1} u_0\|_{4,0}$ be sufficiently small, where $2 + \frac{2}{\rho - 2} < q \le 2\rho$. Then there exists a unique global solution $u \in X_\infty$ of the Cauchy problem (1.1) satisfying the estimate $\|u(t)\|_{2,0,q} \le C (1 + |t|)^{-1 + \frac{2}{q}}$ for all $t \in \mathbb{R}$.

Under the assumptions of Theorem 1.1 we can prove the existence of the final scattering states.

Corollary 1.1. Let $u$ be the solution stated in Theorem 1.1. Then there exists a unique final state $u_+ \in H^{2,0}$ such that
$$\|u(t) - u_+\|_{2,0} \le C t^{1 - (\rho - 1)\left(1 - \frac{2}{q}\right)}, \qquad \|u(t) - u_+\|_{1,0} \le C t^{1 - \rho\left(1 - \frac{2}{q}\right)}$$
for all $t > 1$.
Remark 1.1. Note that the final state $u_+$ can be calculated approximately in the following way. For any desired accuracy $\varepsilon_1 > 0$, via Corollary 1.1 we can find a time $T > 0$ such that $\|u(T) - u_+\|_{2,0} \le \varepsilon_1$, and the solution $u(t)$ of the Cauchy problem (1.1) at the time $t = T$ can be constructed by virtue of any standard approximate algorithm.

Under more restrictions on the decay rate at infinity of the initial data $u_0$ we can estimate the large time decay in the uniform norm of the solution, and we write an asymptotic formula for the derivative $u_x$ of the solution.

Theorem 1.2. Let $\rho \ge 3$ be an integer. Let the initial data $u_0 \in H^{2,0}_{\frac{q}{q-1}}$, $\partial_x^{-1} u_0 \in H^{4,0}$ and the value $\varepsilon^2 = \|u_0\|_{2,0,\frac{q}{q-1}} + \|\partial_x^{-1} u_0\|_{4,0}$ be sufficiently small, where $2 + \frac{2}{\rho - 2} < q \le 2\rho$. Assume also that $x u_0 \in H^{2,0}$, $y u_0 \in H^{3,0}$, and $y^2 u_0 \in H^{2,0}$. Then there exists a unique global solution $u \in Y$ satisfying the following estimates:
$$\|u(t)\|_\infty \le C (1 + |t|)^{-1} (\log(2 + |t|))^{\kappa}, \qquad \|\partial_x u(t)\|_\infty \le C (1 + |t|)^{-1} \tag{1.2}$$
for all $t \in \mathbb{R}$, where $\kappa = 1$ if $\rho = 3$ and $\kappa = 0$ if $\rho \ge 4$. Moreover there exists $V \in L^\infty$ such that the following asymptotics:
$$u_x(t, x, y) = \frac{1}{t}\, \mathrm{Re}\, A(z)\, V\Big( \kappa, \frac{y}{2\sigma t}\kappa \Big) + O\Big( t^{-1 - \gamma/6} \Big( 1 + \frac{|y|}{t} \Big)^{\gamma} \Big) \tag{1.3}$$
is valid for large values of time $t \ge 1$ uniformly with respect to $(x, y) \in \mathbb{R}^2$, where $z = \big( x + \frac{y^2}{4\sigma t} \big) / \sqrt[3]{3t}$, $\kappa = \sqrt{\max(0, -z)} / \sqrt[3]{3t}$, $\gamma \in \big( 0, \frac{1}{4} \big)$ and the function
$$A(z) = \frac{2\sqrt{\pi}}{\sqrt{3}}\, e^{-i\sigma\pi/4} \int_0^\infty d\xi\, \sqrt{\xi}\, \exp\big( iz\xi + i\xi^3/3 \big).$$

Remark 1.2. Note that our asymptotic formula (1.3) has a quasilinear character. The estimate of the remainder term in the asymptotic formula (1.3) grows when $\frac{y}{t} \to \infty$. Therefore, in view of the uniform estimate (1.2), the asymptotics of the solution has to be considered by another approach in the region $\frac{y}{t} \to \infty$. The time decay estimate (1.2) of the solution $u$ in the uniform norm differs from the linear case by a logarithmic correction if $\rho = 3$, but we do not know if it is optimal in this case.

Remark 1.3. The value $V$ can be calculated approximately in the same manner as in Remark 1.1, since from the proof of Theorem 1.2 we see that $V = \lim_{t \to \infty} \mathcal{F} U(-t) u_x$ in the spaces $H^{0,1} \cap L^\infty$.

We organize our paper as follows. In Sect. 2 we state two lemmas. In particular, the crucial Lemma 2.2 gives the time decay estimate and the large time asymptotic formula for the free KP evolution group $U(t)$ in terms of the operators $J_x$ and $J_y$, since $\|x U(-t) u\| = \|J_x u\|$ and $\|y U(-t) u\| = \|J_y u\|$. Our approach here is close to the method of paper [7]. Section 3 is devoted to the proof of Theorems 1.1–1.2 and Corollary 1.1.
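2. Linear Estimates

Lemma 2.1 (The Sobolev inequality). Let $q, r$ be any numbers such that $1 \le q, r \le \infty$, and let $j, m$ be any numbers satisfying $0 \le j < m$. Then the following estimate is true:
$$\big\| (-\partial_x^2 - \partial_y^2)^{j/2} \varphi \big\|_p \le C\, \big\| (-\partial_x^2 - \partial_y^2)^{m/2} \varphi \big\|_r^{\alpha}\, \|\varphi\|_q^{1 - \alpha},$$
where $C$ is a constant depending only on $m, j, q, r, \alpha$, $\frac{1}{p} = \frac{j}{2} + \alpha \big( \frac{1}{r} - \frac{m}{2} \big) + \frac{1 - \alpha}{q}$, and the parameter $\alpha$ is arbitrary in the interval $\frac{j}{m} \le \alpha \le 1$ with the following exception: if $m - j - \frac{2}{r}$ is a nonnegative integer, then $\alpha = \frac{j}{m}$.

For Lemma 2.1, see, e.g., Friedman [6].

Lemma 2.2. Let $\varphi \in L^p \cap L^2$, $1 \le p \le 2$ and $\partial_x^{-1}\varphi \in L^2$. Then
$$\|U(t)\varphi\|_q \le C (1 + |t|)^{-1 + \frac{2}{q}} \|\varphi\|_p, \tag{2.1}$$
where $1/q + 1/p = 1$. Moreover, if $(1 + |x| + y^2)(|\varphi| + |\varphi_x|) \in L^2$, then we have the following asymptotics for large values of time $t \ge 1$ uniformly with respect to $(x, y) \in \mathbb{R}^2$:
$$U(t)\varphi(t) = \frac{1}{t}\, \mathrm{Re}\, A(z)\, \hat\varphi\Big( t, \kappa, \frac{\sigma y}{2t}\kappa \Big) + O\Big( t^{-1 - \gamma/3} \Big( 1 + \frac{|y|}{t} \Big)^{\gamma} \big\| (1 + |x| + y^2)(|\varphi(t, x, y)| + |\varphi_x(t, x, y)|) \big\| \Big), \tag{2.2}$$
where $z = \big( x + \frac{y^2}{4\sigma t} \big) / \sqrt[3]{3t}$, $\kappa = \sqrt{\max(0, -z)} / \sqrt[3]{3t}$, $\gamma \in \big( 0, \frac{1}{4} \big)$ and the function $A(z) = \frac{2\sqrt{\pi}}{\sqrt{3}}\, e^{-i\sigma\pi/4} \int_0^\infty d\xi\, \sqrt{\xi}\, \exp\big( iz\xi + i\xi^3/3 \big)$ is a half-derivative of the Airy function.

Proof. For the proof of the first estimate (2.1) see Proposition 2.3 and Lemma 2.4 in paper [11]. To prove the asymptotic formula (2.2) for all $t \ge 1$ we write the representation
$$U(t)\varphi(t) = \iint_{\mathbb{R}^2} G(t, x - x', y - y')\, \varphi(t, x', y')\, dx'\, dy',$$
where the kernel $G(t, x, y) = \frac{1}{2\pi} \iint_{\mathbb{R}^2} e^{ix\xi + iy\eta + it\xi^3 - it\sigma\eta^2/\xi}\, d\xi\, d\eta$ (see [11]). Making a change of the variable of integration $\eta = \eta' \sqrt{|\xi|/t}$ and after that setting $\xi = \xi' / \sqrt[3]{3t}$, we get for the kernel (omitting the prime)
$$\begin{aligned}
G(t, x, y) &= \frac{1}{2\pi\sqrt{t}} \int_{\mathbb{R}} d\xi\, \sqrt{|\xi|}\, \exp\big( ix\xi + it\xi^3 \big) \int_{\mathbb{R}} d\eta\, \exp\Big( iy\eta \sqrt{\frac{|\xi|}{t}} - i\sigma\eta^2\, \mathrm{sgn}\,\xi \Big) \\
&= \frac{1}{\sqrt{\pi t}}\, \mathrm{Re}\, e^{-i\sigma\pi/4} \int_0^\infty d\xi\, \sqrt{\xi}\, \exp\Big( i\xi \Big( x + \frac{y^2}{4\sigma t} \Big) + it\xi^3 \Big) = \frac{1}{2\pi t}\, \mathrm{Re}\, A(z).
\end{aligned}$$
First let us consider the case $z < 0$; then there is a stationary point $\xi = \sqrt{-z} > 0$ in the integral $A(z)$. To move this stationary point into the origin we make a change of the variable of integration $\xi = \xi' + a$ with $a = \sqrt{-z}$, and we have
$$A(z) = \frac{2\sqrt{\pi}}{\sqrt{3}}\, B(a)\, e^{-2ia^3/3 - i\sigma\pi/4},$$
where $B(r) = \int_{-r}^\infty d\xi\, \sqrt{\xi + r}\, \exp\big( ir\xi^2 + i\xi^3/3 \big)$ for $r \ge 0$. We will show that $B(r) \in C^\gamma(0, \infty)$ with $\gamma \in \big( 0, \frac{1}{4} \big)$. First let us prove the estimate $\sup_{r > 0} |B(r)| \le C$. Using the identity
$$e^{ir\xi^2 + i\xi^3/3} = \big( 1 + i\xi^2(2r + \xi) \big)^{-1} \frac{d}{d\xi} \Big( \xi e^{ir\xi^2 + i\xi^3/3} \Big),$$
we integrate by parts with respect to $\xi$ in the integral $B(r)$ to obtain
$$B(r) = \int_{-r}^\infty \frac{e^{ir\xi^2 + i\xi^3/3}}{\sqrt{r + \xi}}\, f(\xi, r)\, d\xi, \tag{2.3}$$
where we denote $f(\xi, r) = \frac{1}{1 + i\xi^2(2r + \xi)} \Big( \frac{i\xi^2 (r + \xi)(4r + 3\xi)}{1 + i\xi^2(2r + \xi)} - \frac{\xi}{2} \Big)$. In view of the inequality $|1 + i\xi^2(2r + \xi)| \ge C(1 + r\xi^2 + |\xi|^3)$ for $\xi \ge -r$ we get the estimate $\Big| \frac{\xi^2 (r + \xi)(4r + 3\xi)}{1 + i\xi^2(2r + \xi)} \Big| \le C(r + |\xi|)$. Therefore we obtain $|f(\xi, r)| \le C \frac{|\xi| + r}{1 + \xi^2(2r + \xi)}$ for all $\xi \ge -r$. So we easily get
$$\begin{aligned}
|B(r)| &\le C \int_{-r}^\infty \frac{(|\xi| + r)\, d\xi}{(1 + \xi^2(2r + \xi)) \sqrt{r + \xi}} \\
&\le \frac{C}{1 + r^2} \int_{-r}^{-r/2} \frac{d\xi}{\sqrt{r + \xi}} + C\sqrt{r} \int_{-r/2}^{r} \frac{d\xi}{1 + r\xi^2} + C \int_r^\infty \frac{\sqrt{\xi}\, d\xi}{1 + \xi^3} \le C
\end{aligned} \tag{2.4}$$
uniformly with respect to $r \ge 0$. Now we prove the following estimate
$$|B(r) - B(s)| \le C |r - s|^{\gamma} \tag{2.5}$$
uniformly in $r, s \ge 0$, where $\gamma \in \big( 0, \frac{1}{4} \big)$. In view of estimate (2.4) it is sufficient to consider only the case $0 < r - s < 1$. Via (2.3) and the following estimates: $|f_r(\xi, r)| \le \frac{C}{1 + \xi^2(2r + \xi)}$, $\big| e^{-ir\xi^2} - e^{-is\xi^2} \big| \le C (r - s)^{\gamma} |\xi|^{2\gamma}$ and $\Big| \frac{1}{\sqrt{s + \xi}} - \frac{1}{\sqrt{r + \xi}} \Big| \le \frac{C (r - s)^{\gamma}}{(s + \xi)^{\frac{1}{2} + \gamma}}$ for $\xi \ge -s$, we get
$$\begin{aligned}
|B(r) - B(s)| &\le \int_{-r}^{-s} \frac{|f(\xi, r)|\, d\xi}{\sqrt{r + \xi}} + \int_{-s}^\infty \frac{|f(\xi, r) - f(\xi, s)|\, d\xi}{\sqrt{r + \xi}} \\
&\quad + \int_{-s}^\infty \frac{\big| e^{ir\xi^2} - e^{is\xi^2} \big|}{\sqrt{r + \xi}}\, |f(\xi, s)|\, d\xi + \int_{-s}^\infty \Big| \frac{1}{\sqrt{s + \xi}} - \frac{1}{\sqrt{r + \xi}} \Big|\, |f(\xi, s)|\, d\xi \\
&\le C \int_{-r}^{-s} \frac{d\xi}{\sqrt{r + \xi}} + C (r - s) \int_{-s}^\infty \frac{d\xi}{(1 + \xi^2(2s + \xi)) \sqrt{r + \xi}} \\
&\quad + C (r - s)^{\gamma} \int_{-s}^\infty \frac{|\xi|^{2\gamma} (|\xi| + s)\, d\xi}{(1 + \xi^2(2s + \xi)) \sqrt{r + \xi}} + C (r - s)^{\gamma} \int_{-s}^\infty \frac{(|\xi| + s)\, d\xi}{(1 + \xi^2(2s + \xi)) (s + \xi)^{\frac{1}{2} + \gamma}} \le C (r - s)^{\gamma}.
\end{aligned}$$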
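Thus we have proved that $B(r) \in C^\gamma(\mathbb{R}^+)$. Denoting $b = \sqrt{\max(0, -\zeta)}$, $\zeta = \big( x - x' + \frac{(y - y')^2}{4\sigma t} \big) / \sqrt[3]{3t}$ (we recall that $a = \sqrt{-z}$, and that we are now considering the case $z = \big( x + \frac{y^2}{4\sigma t} \big) / \sqrt[3]{3t} < 0$) we obtain
$$|a - b| \le \sqrt{|z - \zeta|} \le C t^{-1/6} \Big( \sqrt{|x'|} + t^{-1/2} |y'| + \sqrt{\frac{|y|}{t}} \sqrt{|y'|} \Big) \tag{2.6}$$
for all $x, x', y, y' \in \mathbb{R}$, $t \ge 1$. By virtue of the Taylor formula we have $(-z)^{3/2} - (-\zeta)^{3/2} = \frac{3}{2} \sqrt{-z}\, (\zeta - z) + O\big( |\zeta - z|^{3/2} \big)$ for $z < 0$ and $\zeta < 0$. And in the case $\zeta \ge 0$ we have $b = 0$, $a^3 = O\big( |\zeta - z|^{3/2} \big)$ and $a(\zeta - z) = O\big( |\zeta - z|^{3/2} \big)$. Therefore we get
$$\frac{2}{3} a^3 - \frac{2}{3} b^3 = -\frac{a}{\sqrt[3]{3t}} \Big( x' + \frac{y' y}{2\sigma t} - \frac{(y')^2}{4\sigma t} \Big) + O\Big( \frac{1}{\sqrt{t}} \Big( \sqrt{|x'|^3} + t^{-3/2} |y'|^3 + \Big( \frac{|y|}{t} \Big)^{3/2} \sqrt{|y'|^3} \Big) \Big) \tag{2.7}$$
for all $x, x', y, y' \in \mathbb{R}$, $t \ge 1$. Now estimates (2.4)–(2.7) yield
$$\begin{aligned}
U(t)\varphi(t) &= \frac{1}{2\pi t}\, \mathrm{Re} \iint_{\mathbb{R}^2} A(\zeta)\, \varphi(t, x', y')\, dx'\, dy' \\
&= \frac{2\sqrt{\pi}}{2\pi t \sqrt{3}}\, \mathrm{Re}\, B(a) \iint_{\mathbb{R}^2} e^{-2ib^3/3 - i\sigma\pi/4}\, \varphi(t, x', y')\, dx'\, dy' + O\Big( \Big( 1 + \frac{|y|}{t} \Big)^{\gamma} t^{-1 - \gamma/3}\, \big\| (1 + |x|^{\gamma} + |y|^{\gamma}) \varphi(t, x, y) \big\|_{L^1} \Big) \\
&= \frac{1}{2\pi t}\, \mathrm{Re}\, A(z) \iint_{\mathbb{R}^2} \exp\Big( -i \frac{a}{\sqrt[3]{3t}} \Big( x' + \frac{y' y}{2\sigma t} - \frac{(y')^2}{4\sigma t} \Big) \Big) \varphi(t, x', y')\, dx'\, dy' \\
&\quad + O\Big( \Big( 1 + \frac{|y|}{t} \Big)^{\gamma} t^{-1 - \gamma/3}\, \big\| (1 + |x| + y^2) \varphi(t, x, y) \big\| \Big),
\end{aligned} \tag{2.8}$$
since $\frac{1 + |x|^{\gamma} + |y|^{\gamma}}{1 + |x| + y^2} \in L^2(\mathbb{R}^2)$ for $\gamma \in \big( 0, \frac{1}{4} \big)$. If $\frac{a}{t^{2/3}} < 1$, then via the estimate $\Big| \exp\Big( i \frac{a (y')^2}{4\sigma t \sqrt[3]{3t}} \Big) - 1 \Big| \le C \frac{|y'|^{\gamma}}{t^{\gamma/3}}$ we get
$$\frac{1}{2\pi} \iint_{\mathbb{R}^2} \exp\Big( -i \frac{a}{\sqrt[3]{3t}} \Big( x' + \frac{y' y}{2\sigma t} - \frac{(y')^2}{4\sigma t} \Big) \Big) \varphi(t, x', y')\, dx'\, dy' = \hat\varphi\Big( t, \kappa, \frac{y}{2\sigma t}\kappa \Big) + O\Big( t^{-\gamma/3} \big\| (1 + |x| + y^2) \varphi(t, x, y) \big\| \Big). \tag{2.9}$$
And if $\frac{a}{t^{2/3}} \ge 1$, then integration by parts with respect to $x'$ yields
$$\Big| \frac{1}{2\pi} \iint_{\mathbb{R}^2} \exp\Big( -i \frac{a}{\sqrt[3]{3t}} \Big( x' + \frac{y' y}{2\sigma t} - \frac{(y')^2}{4\sigma t} \Big) \Big) \varphi(t, x', y')\, dx'\, dy' \Big| \le \frac{C \sqrt[3]{3t}}{a} \|\varphi_x\|_1 \le \frac{C}{\sqrt[3]{t}} \big\| (1 + |x| + y^2) \varphi_x(t, x, y) \big\|$$
and in the same way
$$\Big| \hat\varphi\Big( t, \kappa, \frac{y}{2\sigma t}\kappa \Big) \Big| \le \frac{C}{|\kappa|} \|\varphi_x\|_1 \le \frac{C}{\sqrt[3]{t}} \big\| (1 + |x| + y^2) \varphi_x(t, x, y) \big\|,$$
hence in the case $\frac{a}{t^{2/3}} \ge 1$ we again obtain (2.9). Therefore (2.8), (2.9) give us the asymptotics (2.2) for the case $z < 0$.

We now consider the case $z \ge 0$. For this case we can easily obtain that $A \in C^1(\mathbb{R}^+)$. Indeed, using the identity
$$e^{iz\xi + i\xi^3/3} = \big( 1 + i\xi(z + \xi^2) \big)^{-1} \frac{d}{d\xi} \Big( \xi e^{iz\xi + i\xi^3/3} \Big),$$
we integrate by parts with respect to $\xi$ in the integrals $A(z)$ and $A'(z)$ to get
$$\Big| \frac{d^k}{dz^k} A(z) \Big| = C \Big| \int_0^\infty \frac{e^{iz\xi + i\xi^3/3}\, \xi^k \sqrt{\xi}}{1 + i\xi(z + \xi^2)} \Big( \frac{2i\xi (z + 3\xi^2)}{1 + i\xi(z + \xi^2)} - 2k - 1 \Big) d\xi \Big| \le C \int_0^\infty \frac{\xi^k \sqrt{\xi}\, d\xi}{1 + \xi^3} \le C$$
for $k = 0, 1$. Therefore we have the following estimate: $|A(\zeta) - A(z)| \le C |\zeta - z|^{\gamma} \le C \big( 1 + \frac{|y|}{t} \big)^{\gamma} \frac{1 + |x'|^{\gamma} + |y'|^{\gamma}}{t^{\gamma/3}}$. Thus we can write
$$\begin{aligned}
U(t)\varphi(t) &= \frac{1}{2\pi t}\, \mathrm{Re} \iint_{\mathbb{R}^2} A(\zeta)\, \varphi(t, x', y')\, dx'\, dy' \\
&= \frac{1}{2\pi t}\, \mathrm{Re}\, A(z) \iint_{\mathbb{R}^2} \varphi(t, x', y')\, dx'\, dy' + \frac{1}{2\pi t}\, \mathrm{Re} \iint_{\mathbb{R}^2} (A(\zeta) - A(z))\, \varphi(t, x', y')\, dx'\, dy' \\
&= \frac{1}{t}\, \mathrm{Re}\, A(z)\, \hat\varphi(t, 0, 0) + O\Big( \Big( 1 + \frac{|y|}{t} \Big)^{\gamma} t^{-1 - \gamma/3}\, \big\| (1 + |x| + y^2) \varphi(t, x, y) \big\| \Big).
\end{aligned} \tag{2.10}$$
Now for the case $z \ge 0$ the asymptotic formula (2.2) follows from (2.10). Lemma 2.2 is proved.

Lemma 2.3. The following commutator relations are valid:
$$[J_x, \partial_y] = [J_y, \partial_x] = 0, \qquad [J_x, \partial_x] = [J_y, \partial_y] = -1 \tag{2.11}$$
and
$$[L, J_x] = [L, J_y] = 0, \qquad [L, I] = 3\partial_x^{-1} L. \tag{2.12}$$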
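Proof. We have $[J_x, \partial_y] = [x, \partial_y] = 0$ and similarly $[J_y, \partial_x] = [y, \partial_x] = 0$. On the other hand, $[J_x, \partial_x] = [x, \partial_x] = -1$ and $[J_y, \partial_y] = [y, \partial_y] = -1$, whence (2.11) follows. By the Leibniz rule we have $[\partial_x^3, x] = 3\partial_x^2$ and $[\partial_y^2, y] = 2\partial_y$. Further, integration by parts yields $[\partial_x^{-1}, x] = -\partial_x^{-2}$. Hence we see that the operators $J_x$ and $J_y$ commute with the linear part $L = \partial_t + \partial_x^3 + \sigma\partial_x^{-1}\partial_y^2$ of Eq. (1.1), namely
$$[L, J_x] = -[\partial_t, t] \big( 3\partial_x^2 - \sigma\partial_x^{-2}\partial_y^2 \big) + [\partial_x^3, x] + \sigma [\partial_x^{-1}, x] \partial_y^2 = 0$$
and
$$[L, J_y] = -2\sigma [\partial_t, t]\, \partial_x^{-1}\partial_y + [\partial_x^3, y] + \sigma\partial_x^{-1} [\partial_y^2, y] = 0.$$
A direct calculation shows that the operator $I$ almost commutes with the operator $L$, i.e.
$$[L, I] = 3 [\partial_t, t]\, \partial_x^{-1}\partial_t + [\partial_x^3, x] + \sigma [\partial_x^{-1}, x] \partial_y^2 + 2\sigma\partial_x^{-2}\partial_y [\partial_y^2, y] = 3\partial_x^{-1} L.$$
Thus formulas (2.12) are true. Lemma 2.3 is proved.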
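For the reader who wants to see the cancellations explicitly, the sketch below evaluates each commutator in (2.12) using only the elementary relations $[\partial_t, t] = 1$, $[\partial_x^3, x] = 3\partial_x^2$, $[\partial_y^2, y] = 2\partial_y$ and $[\partial_x^{-1}, x] = -\partial_x^{-2}$ quoted in the proof. It is a formal operator computation and does not address domains.

```latex
% Formal check of (2.12), with L = \partial_t + \partial_x^3 + \sigma\partial_x^{-1}\partial_y^2.
\begin{align*}
[L, J_x] &= -\big(3\partial_x^{2} - \sigma\partial_x^{-2}\partial_y^{2}\big)
            + 3\partial_x^{2} - \sigma\partial_x^{-2}\partial_y^{2} = 0,\\
[L, J_y] &= -2\sigma\,\partial_x^{-1}\partial_y
            + \sigma\,\partial_x^{-1}\cdot 2\partial_y = 0,\\
[L, I]   &= 3\partial_x^{-1}\partial_t + 3\partial_x^{2}
            - \sigma\partial_x^{-2}\partial_y^{2}
            + 2\sigma\,\partial_x^{-2}\partial_y\cdot 2\partial_y\\
         &= 3\partial_x^{-1}\big(\partial_t + \partial_x^{3}
            + \sigma\partial_x^{-1}\partial_y^{2}\big) = 3\partial_x^{-1}L .
\end{align*}
```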
3. Nonlinear Estimates

Here we use the following local existence theorem (for the proof see [9, 12]).

Theorem 3.1. Let $\rho \ge 3$ be an integer. We assume that the initial data $u_0 \in H^{2,0}_{\frac{q}{q-1}}$, $\partial_x^{-1} u_0 \in H^{4,0}$ and the value $\varepsilon^2 = \|\partial_x^{-1} u_0\|_{4,0} + \|u_0\|_{2,0,\frac{q}{q-1}}$ is sufficiently small, where $2 + \frac{2}{\rho - 2} < q \le 2\rho$. Then there exists a unique solution $u \in X_T$ of (1.1) for some time $T > 1$ such that $\|u\|_{X_T} \le C\varepsilon^2$.

Remark 3.1. We demand that the power $\rho$ of the nonlinear term in Eq. (1.1) be an integer because of the local existence Theorem 3.1. Our proofs of Theorems 1.1–1.2 are valid also for fractional values of $\rho$.

Proof of Theorem 1.1. It is sufficient to prove a-priori estimates of solutions to the generalized KP equation (1.1). The integral equation associated with (1.1) is written as
$$u(t) = U(t) u_0 - \rho \int_0^t U(t - s) \big( u^{\rho - 1} \partial_x u \big)\, ds. \tag{3.1}$$
We apply Lemma 2.1 with $p = q \in \big( 2 + \frac{2}{\rho - 2}, 2\rho \big]$, $m = 1 - \frac{2}{q}$, $j = 0$, $\alpha = 1$, $r = 2$ to get $\|\varphi\|_q \le C \|\varphi\|_{1,0}$. By virtue of the Hölder inequality we obtain
$$\|u^{\rho - 1} \partial_x u\|_{2,0,\frac{q}{q-1}} \le C \|u\|^{\rho - 1}_{1,0,\frac{2q(\rho - 1)}{q - 2}} \|u\|_{3,0} \le C \|u\|^{\rho - 1}_{2,0,q} \|u\|_{3,0},$$
where we have used the estimate of Lemma 2.1, in which we took the parameters to be equal to $p = \frac{2q(\rho - 1)}{q - 2}$, $r = q$, $j = 0$, $m = \frac{2}{q} - \frac{q - 2}{q(\rho - 1)}$, $\alpha = 1$ and $2 < q \le 2\rho$. Then via estimate (2.1) of Lemma 2.2 we have from (3.1),
$$\|u(t)\|_{2,0,q} \le C (1 + t)^{-1 + \frac{2}{q}} \Big( \|u_0\|_{2,0,\frac{q}{q-1}} + \|u_0\|_{3,0} \Big) + C \int_0^t (t - s)^{-1 + \frac{2}{q}} \|u(s)\|^{\rho - 1}_{2,0,q} \|u(s)\|_{3,0}\, ds \le C\varepsilon^2 (1 + t)^{-1 + \frac{2}{q}}. \tag{3.2}$$
And by the classical energy estimate we get
$$\|\partial_x^{-1} u\|^2_{4,0} \le C\varepsilon^4 + C \int_0^t \|u(s)\|^{\rho - 1}_{2,0,q} \|u(s)\|_{3,0} \|\partial_x^{-1} u(s)\|_{4,0}\, ds \le C\varepsilon^4 + C\varepsilon^6 \int_0^t (1 + s)^{-(\rho - 1)\left(1 - \frac{2}{q}\right)}\, ds \le C(\varepsilon^4 + \varepsilon^6), \tag{3.3}$$
since $\|\varphi\|_{1,0,\infty} \le \|\varphi\|_{2,0,q}$ by virtue of Lemma 2.1 with $r = q$, $p = \infty$, $j = 0$, $m = \frac{2}{q} < 1$, $\alpha = 1$, and since $\big( 1 - \frac{2}{q} \big)(\rho - 1) > 1$ when $q > 2 + \frac{2}{\rho - 2}$. Now from (3.2)–(3.3) it follows that $\|u\|_{X_T} \le \varepsilon$. A standard continuation argument yields the result of Theorem 1.1.
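The role of the lower bound on $q$ can be made explicit. The short computation below is an elementary check, valid for $\rho \ge 3$ and $q > 2$, that the time integral in the energy estimate converges exactly under the stated condition $q > 2 + \frac{2}{\rho-2}$; the exponent name $\beta$ is a placeholder introduced here.

```latex
% When is \beta := (\rho-1)(1 - 2/q) > 1 ?   (\rho \ge 3, q > 2)
\begin{align*}
(\rho-1)\Big(1-\frac2q\Big) > 1
 &\iff \frac2q < 1 - \frac{1}{\rho-1} = \frac{\rho-2}{\rho-1}\\
 &\iff q > \frac{2(\rho-1)}{\rho-2} = 2 + \frac{2}{\rho-2}.
\end{align*}
% For such q the time integral is bounded uniformly in t:
\[
\int_0^t (1+s)^{-\beta}\,ds \le \int_0^\infty (1+s)^{-\beta}\,ds = \frac{1}{\beta-1}.
\]
% Example: \rho = 3 requires q > 4, and q = 2\rho = 6 gives \beta = 4/3.
```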
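Proof of Corollary 1.1. We have by (3.1) and Theorem 1.1,
$$\|u(t) - u(s)\|_{1,0} \le C \int_s^t \|u^{\rho - 1} u_x\|_{1,0}\, d\tau \le C \int_s^t \|u\|^{\rho}_{2,0,q}\, d\tau \le C\varepsilon^3 \int_s^t (1 + \tau)^{-\rho\left(1 - \frac{2}{q}\right)}\, d\tau \le C\varepsilon^3 (1 + s)^{1 - \rho\left(1 - \frac{2}{q}\right)} \tag{3.4}$$
for $t > s$. And analogously
$$\|u(t) - u(s)\|_{2,0} \le C \int_s^t \|u\|^{\rho - 1}_{2,0,q} \|u\|_{3,0}\, d\tau \le C\varepsilon^3 \int_s^t (1 + \tau)^{-(\rho - 1)\left(1 - \frac{2}{q}\right)}\, d\tau \le C\varepsilon^3 (1 + s)^{1 - (\rho - 1)\left(1 - \frac{2}{q}\right)} \tag{3.5}$$
for $t > s$. This implies that there exists a unique final state $u_+ \in H^{2,0}$ such that $\lim_{t \to \infty} \|u(t) - u_+\|_{2,0} = 0$. We let $t \to \infty$ in (3.4) and (3.5) to get the result of the corollary.

Proof of Theorem 1.2. Applying the operator $I = x + 2\partial_x^{-1}\partial_y y + 3t\partial_x^{-1}\partial_t$ to (1.1) and using the identity $[L, I] = 3\partial_x^{-1} L$ (see Lemma 2.3), we obtain
$$L I u = -\rho u^{\rho - 1} (Iu)_x + (3\rho - 5) u^{\rho}, \tag{3.6}$$
since the operator $I$ acts on the nonlinearity as a first order differential operator: $I (u^{\rho})_x = \rho u^{\rho - 1} (Iu)_x - (3\rho - 2) u^{\rho}$. We multiply both sides of (3.6) by $(Iu)_{xx}$ and integrate by parts in $x$ to get
$$\frac{d}{dt} \|(Iu)_x\|^2 \le C \|u\|^{\rho - 1}_{1,0,\infty} \|(Iu)_x\|^2 + C \|u\|^{\rho - 1}_{1,0,\infty} \|u\| \|(Iu)_x\| \le C\varepsilon^2 (1 + t)^{-(\rho - 1)\left(1 - \frac{2}{q}\right)} \big( \|u\|^2 + \|(Iu)_x\|^2 \big),$$
since $\|u\|_{1,0,\infty} \le C \|u\|_{2,0,q} \le C (1 + t)^{-1 + \frac{2}{q}}$. From the $L^2$ conservation law we have $\frac{d}{dt} \|u\|^2 = 0$, whence we obtain
$$\frac{d}{dt} \big( \|u\|^2 + \|(Iu)_x\|^2 \big) \le C\varepsilon^2 (1 + t)^{-(\rho - 1)\left(1 - \frac{2}{q}\right)} \big( \|u\|^2 + \|(Iu)_x\|^2 \big).$$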
Similarly we have
d k(Iu)xx k ≤ C uρ−3 u2x (Iu)x + C uρ−2 uxx (Iu)x dt
+ C uρ−2 ux (Iu)xx + C uρ−2 u2x + C uρ−1 uxx ≤ Ckukρ−1 2,0,q kuk2,0 + k(Iu)x k1,0 −(ρ−1) 1− q2 kuk22,0 + k(Iu)x k21,0 ≤ C2 (1 + t) and
d kuxxx k + kuyyy k ≤ Ckukρ−1 2,0,q kux k2,0 + kuy k2,0 dt −(ρ−1) 1− q2
≤ C2 (1 + t)
kuk3,0 .
Generalized Kadomtsev–Petviashvili Equation
We apply Gronwall’s inequality to the above to get 2
ku(t)k23,0 + kIux (t)k21,0 ≤ Cku0 k23,0 + Ckxu0 k22,0 + C ky∂y u0 k1,0 ≤ C.
(3.7)
Multiplying (1.1) by Jy we obtain LJy u = −ρuρ−1 (Jy u)x .
(3.8)
In the same way as in the proof of (3.7) we have from (3.8),
d kJy uxxx k ≤ C(ρ − 3) uρ−4 u3x Jy ux + C uρ−3 ux uxx Jy ux dt
+ C uρ−3 u2x Jy uxx + C uρ−2 uxxx Jy ux + C uρ−2 uxx Jy uxx ≤ Ckukρ−1 2,0,q kuk3,0 + kJy ux k2,0 −(ρ−1) 1− q2 2 kuk23,0 + kJy ux k2,0 ≤ C2 (1 + t) and similarly d −(ρ−1) kJy uyyy k ≤ C2 (1 + t) dt whence Gronwall’s inequality yields
1− q2
kuk23,0 + kJy ux k22,0 ,
kJy u(t)k3,0 ≤ Ckyu0 k3,0 ≤ C.
(3.9)
By the identity I∂x = Jx ∂x + 2∂y Jy + 3tL, estimates (3.7), (3.9) and the time decay estimate of Theorem 1.1 we obtain kJx ∂x u(t)k1,0 ≤ C kIux (t)k1,0 + C k∂y Jy uk1,0 + Ct kLuk1,0 ≤ C.
(3.10)
Now we derive a rough estimate for the norm ku(t)k1,0,∞ . Since kuk1,0,∞ ≤ Ckuk2,0,4 , using the integral equation associated with (1.1), estimates of Lemma 2.2 and Theorem 1.1, we have for t ≥ 1, Z t−1
(t − s)−1 uρ−1 ux ds ku(t)k1,0,∞ ≤ C(1 + t)−1 ku0 k1,0,1 + C Z
0
1,0,1
1
uρ−1 ux ds 2,0,4/3 t−s t−1 Z t−1 2 −1 (t − s)−1 kuk23,0 kukρ−2 ≤ C (1 + t) + C 1,0,∞ ds 0 Z t 1 3/2 ρ−3/2 √ kuk3,0 kuk1,0,∞ ds, +C t−s t−1 +C
t
√
γ/6−1 whence by Gronwall’s inequality .
a rough estimate ku(t)k1,0,∞ ≤ C(1 + t)
we get Now let us estimate the norm Jy2 ux 1,0 . We have
d
Jy2 ux ≤ C uρ−2 (Jy ux )2 + C uρ−1 Jy2 uxx dt
2 ρ−1 2
≤ Ckukρ−3 1,0,∞ kJy ux k2,0 + Ckuk2,0,q Jy ux
≤ C(1 + t)γ/6−1 + C(1 + t)−(ρ−1)(1−2/q) Jy2 ux
and
d
Jy2 uxx ≤ C uρ−3 ux (Jy ux )2 + C uρ−2 (Jy ux )(Jy uxx ) dt
2 ρ−1 2
+ C uρ−2 ux Jy2 uxx ≤ Ckukρ−3 1,0,∞ kJy ux k2,0 + Ckuk2,0,q Jy uxx
≤ C(1 + t)γ/6−1 + C(1 + t)−(ρ−1)(1−2/q) Jy2 uxx .
The value Jy2 uxy is considered in the same way. Therefore by Gronwall’s inequality
we get the desired estimate Jy2 ux 1,0 ≤ C(1 + t)γ/6 . Thus via (3.9) and (3.10) we have kukY < ∞. Now the second estimate (1.2) of Theorem 1.2 follows from Lemma 2.2. By the integral equation associated with (1.1), Lemma 2.2 and estimates of Theorem 1.1 we have for t ≥ 1, Z t−1
(t − s)−1 uρ−1 ux 1 ds ku(t)k∞ ≤ C(1 + t)−1 ku0 k1 + C Z +C +C
0
t
t−1 Z t t−1
ρ−1
u ux ds ≤ C2 (1 + t)−1 + C 2,0 kukρ−1 2,0,q kuk3,0 ds
+ C(1 + t)
−(ρ−1) 1− q2
−1
≤ C(1 + t)
Z
t−1
0
Z
(t − s)−1 kuk2 kux kρ−2 ∞ ds
t−1
+ C
3
(t − s)−1 (1 + s)2−ρ ds
0
≤ C(1 + t)−1 (log(1 + t))κ ,
where κ = 1 if ρ = 3 and κ = 0 if ρ ≥ 4. Thus the estimates (1.2) are true. To prove asymptotic formula (1.3) we use the same method as in paper [7]. We have by Eq. (1.1) (U(−t)u)t = −U(−t)(uρ )x , and from (3.6) we get (U(−t)(Iu)x )t = −U(−t) ρuρ−1 (Iu)x − (2ρ − 5)uρ x . Hence we obtain as in (3.5), Z kU(−t)u(t) − U(−s)u(s)k2,0 ≤ C
t
s
ku(τ )kρ−1 1,0,∞ ku(τ )k3,0 dτ
Z
≤ C2
s
t
τ
−(ρ−1) 1− q2
1−(ρ−1) 1− q2
dτ ≤ C2 s
(3.11)
and Z t ku(τ )kρ−1 kU(−t)Iux (t) − U(−s)Iux (s)k1,0 ≤ C 1,0,∞ kIux (τ )k1,0 + ku(τ )k dτ s Z t −(ρ−1) 1− q2 1−(ρ−1) 1− q2 2 τ dτ ≤ C2 s . ≤ C s
(3.12)
Similarly we get kU(−t)∂y Jy u(t) − U(−s)∂y Jy u(s)k1,0
1−(ρ−1) + U(−t)∂x Jy2 u(t) − U(−s)∂x Jy2 u(s) ≤ C2 s
1− q2
.
(3.13)
As in (3.10) we obtain
U(−t) I∂x − Jx ∂x − 2∂y Jy u(t) ≤ Ct k(uρ )x k ≤ C2 s1−(ρ−1)
1− q2
. (3.14)
By virtue of (3.11)–(3.14) we find kF(U(−t)u)(t) − F(U(−s)u)(s)k∞ ≤ kU(−t)u(t) − U(−s)u(s)k1
≤ C 1 + |x| + y 2 (U(−t)u(t) − U(−s)u(s)) ≤ CkU(−t)u(t) − U(−s)u(s)k1,0 + CkU(−t)Iux (t) − U(−s)Iux (s)k + C kU(−t)∂y Jy u(t) − U(−s)∂y Jy u(s)k
1−(ρ−1) 1− q2 . + C U(−t)Jy2 ux (t) − U(−s)Jy2 ux (s) ≤ C2 s
(3.15)
The estimate (3.15) implies that there exists a unique function V ∈ L∞ ∩ H0,1 such that
lim U(−t)ux (t) − F −1 V 1,0 + U(−t)ux (t) − F −1 V 1 = 0. t→∞
Asymptotic formula (1.3) follows from formula (2.2) and the estimate k(1 + |x| + y 2 )(|(F −1 V )x | + |(F −1 V )xx )|k ≤ C(1 + t)γ/6 which comes from the estimates (3.15) and kukY ≤ C. This completes the proof of Theorem 1.2.
References
1. de Bouard, A. and Saut, J.C.: Solitary waves of generalized Kadomtsev–Petviashvili equations. Annales IHP, Analyse non Linéaire 14, 211–236 (1997)
2. Bourgain, J.: On the Cauchy problem for the Kadomtsev–Petviashvili equations. Geom. Funct. Anal. 3, 315–341 (1993)
3. Cazenave, T.: Equations de Schrödinger non linéaires en dimension deux. Proc. Roy. Soc. Edinburgh 84A, 327–346 (1979)
4. Colliander, J.E.: Globalizing estimates for the periodic KP I equation. Illinois J. Math. 40, 692–698 (1996)
5. Fokas, A.S. and Sung, L.Y.: On the solvability of the N-wave, Davey–Stewartson and Kadomtsev–Petviashvili equations. Inverse Problems 8, 673–708 (1992)
6. Friedman, A.: Partial Differential Equations. New York: Holt-Rinehart and Winston, 1969
7. Hayashi, N. and Naumkin, P.I.: Large time asymptotics to the generalized Benjamin–Ono equation. Trans. A.M.S. 351, 109–130 (1999)
8. Iório, R.J. and Leite Nunes, W.V.: On equations of KP type. Proc. Roy. Soc. Edinburgh 128A, 725–743 (1998)
9. Isaza, P., Mejia, J. and Stallbohm, V.: Local solution for the Kadomtsev–Petviashvili equation in $\mathbb{R}^2$. J. Math. Anal. and Appl. 196, 566–587 (1995)
10. Kadomtsev, B.B. and Petviashvili, V.I.: On the stability of solitary waves in weakly dispersive media. Soviet Phys. Dokl. 15, 539–541 (1970)
11. Saut, J.C.: Remarks on the generalized Kadomtsev–Petviashvili equations. Indiana Univ. Math. J. 42, 1011–1026 (1993)
12. Saut, J.C.: Recent results on the generalized Kadomtsev–Petviashvili equations. Acta Applicandae Mathematicae 39, 477–487 (1995)
13. Schwarz, M.: Periodic solutions of Kadomtsev–Petviashvili equations. Adv. in Math. 66, 217–233 (1987)
14. Tom, M.M.: On a generalized Kadomtsev–Petviashvili equation. In: Contemporary Mathematics A.M.S. 200, 193–210 (1996)
15. Turitsyn, S.K. and Fal’kovitch, G.E.: Stability of magnetoelastic solitons and self-focusing of sound in antiferromagnet. Soviet Phys. JETP 62, 146–152 (1985)
16. Ukai, S.: Local solutions of the Kadomtsev–Petviashvili equation. J. Fac. Sci. Univ. Tokyo, Sect. A, Math. 36, 193–209 (1989)
17. Zhou, X.: Inverse scattering transform for the time dependent Schrödinger equation with applications to the KP-I equation. Commun. Math. Phys. 128, 551–564 (1990)

Communicated by A. Kupiainen
Commun. Math. Phys. 201, 591 – 618 (1999)
Communications in
Mathematical Physics © Springer-Verlag 1999
A Basis for Representations of Symplectic Lie Algebras
A. I. Molev
School of Mathematics and Statistics, University of Sydney, Sydney NSW 2006, Australia. E-mail: [email protected]
Received: 30 March 1998 / Accepted: 15 September 1998
Abstract: A basis for each finite-dimensional irreducible representation of the symplectic Lie algebra sp(2n) is constructed. The basis vectors are expressed in terms of the Mickelsson lowering operators. Explicit formulas for the matrix elements of generators of sp(2n) in this basis are given. The basis is natural from the viewpoint of the representation theory of the Yangians. The key role in the construction is played by the fact that the subspace of sp(2n − 2) highest vectors in any finite-dimensional irreducible representation of sp(2n) admits a natural structure of a representation of the Yangian Y(gl(2)).
0. Introduction

One of the central problems of the representation theory is to construct a basis in the representation space and to find the representation matrices in the basis. A solution of this problem for the general linear Lie algebra gl(N) and the orthogonal Lie algebra o(N) was given by Gelfand and Tsetlin [GT1, GT2]. They proposed a parameterization of basis vectors and gave formulas for the matrix elements of the generators of the Lie algebras in this basis. An explicit construction of the Gelfand–Tsetlin basis vectors in terms of lowering operators was given by Zhelobenko [Z1, Z2], Nagel–Moshinsky [NM] (gl(N)-case); Pang–Hecht [PH], Wong [Wo] (o(N)-case). Different formulas for the lowering operators are also obtained by Asherova–Smirnov–Tolstoy [AST2], Gould [G1, G2], Nazarov–Tarasov [NT1], Molev [Mo2].

A quite different approach to construct modules over the classical Lie algebras is developed in the papers by King–El-Sharkaway [KS], Berele [B], King–Welsh [KW], Koike–Terada [KT], Proctor [P2]. It is based on the Weyl realization of the representations of the classical groups in tensor spaces; see [W]. In particular, bases in the representations of the orthogonal and symplectic Lie algebras parameterized by o(N)-standard or sp(2n)-standard Young tableaux are constructed. Although the subset of the
standard Young tableaux is not preserved by the action of the Lie algebra, explicit trace relations and Garnir relations between the Young tableaux allow one to get an algorithm for calculation of the matrix elements of the generators of the Lie algebras. Bases with special properties in the universal enveloping algebra for a simple Lie algebra g and in some g-modules were constructed by Lakshmibai–Musili–Seshadri [LMS], Littelmann [L] (monomial bases); De Concini–Kazhdan [CK] (combinatorial bases for GL(n)); Gelfand–Zelevinsky [GZ2], Retakh–Zelevinsky [RZ], Mathieu [M1] (‘good’ bases); Lusztig [Lu], Kashiwara [Ka] (canonical or crystal bases); see also Mathieu [M2] for a review and more references. The problem of constructing an analog of the Gelfand–Tsetlin basis for the symplectic Lie algebra sp(2n) has been addressed by many authors. The branching rule for the reduction sp(2n) ↓ sp(2n − 2) is obtained by Zhelobenko [Z1]. Contrary to the case of the Lie algebras gl(N ) and o(N ) this reduction turns out to be not multiplicity-free, which makes the problem of constructing a basis for representations of the symplectic Lie algebra sp(2n) more complicated. Raising and lowering operators acting on the subspace V (λ)+ of sp(2n − 2) highest vectors in a representation V (λ) of sp(2n) are constructed by Mickelsson [Mi1] (see also Bincer [Bi1, Bi2]). They are explicitly expressed as elements of the universal enveloping algebra U(sp(2n)). Applying the lowering operators consequently to the sp(2n) highest vector one obtains a basis in V (λ)+ and then by induction one constructs a basis in V (λ). However, the monomials in the lowering operators can be chosen arbitrarily (since the operators do not commute) and none of the bases is distinguished. The problem of calculating the matrix elements of generators of sp(2n) in such a basis appears to be very difficult. 
The algebra generated by the raising and lowering operators, and more general algebras $Z(\mathfrak{g}, \mathfrak{g}')$ associated with a Lie algebra $\mathfrak{g}$ and a reductive subalgebra $\mathfrak{g}' \subset \mathfrak{g}$ were studied by Mickelsson [Mi2], Van den Hombergh [Ho]. The theory of these algebras was further developed by Zhelobenko [Z3]–[Z6] with the use of the extremal projection method originated from [AST1]–[AST3].

A basis for the representations of the symplectic Lie algebras was constructed by Gould and Kalnins [GK, G3] with the use of the restriction gl(2n) ↓ sp(2n). The basis vectors are parameterized by a subset of the Gelfand–Tsetlin gl(2n)-patterns. Some matrix element formulas are also derived by using the gl(2n)-action. A similar observation is made independently by Kirillov [K] and Proctor [P1]. A description of the Gelfand–Tsetlin patterns for sp(2n) and o(N) can be obtained by regarding them as fixed points of involutions of the Gelfand–Tsetlin patterns for the corresponding Lie algebra gl(N).

The problem of separation of multiplicities in the reduction sp(2n) ↓ sp(2n − 2) can be approached by investigating the restriction of sp(2n)-modules to an intermediate (nonreductive) subalgebra sp(2n − 1) ⊂ sp(2n). Such subalgebras and their representations are studied by Gelfand–Zelevinsky [GZ1], Proctor [P1], Shtepin [S]. The separation of multiplicities can be achieved by constructing a filtration of sp(2n − 1)-modules [S]. Matrix elements of generators of sp(2n) are obtained by Wong–Yeh [WY] for certain degenerate irreducible representations.

In this paper we give a construction of a weight basis in $V(\lambda)$ and obtain explicit formulas for the matrix elements of generators of $\mathfrak{g}_n = \mathfrak{sp}(2n)$ in this basis; see Theorem 1.1. Our approach is based on the theory of Mickelsson algebras and the representation theory of the Yangians.

It is well-known [D, Sect. 9.1] that the subspace $V(\lambda)^+_\mu$ of $\mathfrak{g}_{n-1}$ highest vectors of a given weight $\mu$ is an irreducible representation of the centralizer algebra $C_n = U(\mathfrak{g}_n)^{\mathfrak{g}_{n-1}}$. However, the algebraic structure of $C_n$ is very complicated which makes the problem of studying their representations very difficult. An approach to solve this problem is developed by Olshanski [O3]; see also [MO]. He constructed a chain of natural homomorphisms
$$C_1 \leftarrow C_2 \leftarrow \cdots \leftarrow C_n \leftarrow C_{n+1} \leftarrow \cdots$$
analogous to the Harish-Chandra homomorphism [D, Sect. 7.4]. The projective limit of this chain is an algebra isomorphic to the tensor product of an algebra of polynomials and a quantized enveloping algebra $Y^-(2)$ which was called the twisted Yangian. (This centralizer construction can be applied to any pairs of Lie algebras $\mathfrak{a}(N - M) \subset \mathfrak{a}(N)$ of type A–D, where $N \to \infty$ with $M$ fixed. In the result one obtains either the Yangian $Y(M) := Y(\mathfrak{gl}(M))$ for the Lie algebra $\mathfrak{gl}(M)$ (see Olshanski [O1, O2]), or the orthogonal $Y^+(M)$ or symplectic twisted Yangian $Y^-(M)$ (see [O3, MO]).) In particular, one has an algebra homomorphism $Y^-(2) \to C_n$ so that the subspace $V(\lambda)^+_\mu$ admits a structure of a representation of $Y^-(2)$ which can be shown to be irreducible.

The algebra $Y^-(2)$ can be either defined as a subalgebra in the Yangian $Y(2)$ or can be presented by generators and defining relations. The algebraic structure of the twisted Yangians is studied in [O3] and [MNO], and their finite-dimensional irreducible representations are described in [Mo4] in terms of the highest weights. In particular, it is proved that any finite-dimensional irreducible representation of $Y^-(2)$ can be extended to that of the Yangian $Y(2)$ thus providing the subspace $V(\lambda)^+_\mu$ with a structure of an irreducible $Y(2)$-module (see Theorem 5.2 below).

Lowering operators for the Yangian reduction $Y(M) \downarrow Y(M - 1)$ and Gelfand–Tsetlin-type bases for representations of $Y(M)$ were constructed in [Mo2] and [NT2] (see also [C, NT1]). We use a special case of these constructions to get a Gelfand–Tsetlin-type basis in the $Y(2)$-module $V(\lambda)^+_\mu$; cf. [T, Dr, CP].

The basis corresponds to an inclusion $Y(1) \subset Y(2)$ which can be naturally chosen at least in two different ways. However, to compute the action of generators of $\mathfrak{g}_n$ in this basis we need to express the basis vectors in terms of the elements of the twisted Yangian $Y^-(2)$. In other words, the two inclusions $Y(1) \subset Y(2)$, $Y^-(2) \subset Y(2)$ must be compatible with each other in some sense (see Remark 4.3) which makes the choice of the first inclusion unique and brings the necessary rigidity into the construction of the basis in $V(\lambda)^+_\mu$.

To calculate the matrix elements of generators of $\mathfrak{g}_n$ in this basis we explicitly express the elements of the twisted Yangian $Y^-(2)$, regarded as operators in $V(\lambda)^+_\mu$, in terms of the Mickelsson raising and lowering operators. Our main instrument is Theorem 5.1 which provides explicit formulas for the images of generators of $Y^-(2)$ under the natural homomorphism to the Mickelsson algebra $Z(\mathfrak{g}_n, \mathfrak{g}_{n-1})$:
$$Y^-(2) \to C_n \to Z(\mathfrak{g}_n, \mathfrak{g}_{n-1}).$$
The use of the quadratic relations in both the algebras $Y^-(2)$ and $Z(\mathfrak{g}_n, \mathfrak{g}_{n-1})$ allows us to avoid long calculations.

The sections are organized as follows. The main result (Theorem 1.1) is formulated in Sect. 1. Sections 2–4 contain preliminary results which are used in the proof of Theorem 1.1. In Sect. 2 following [Z3]–[Z6] we introduce the Mickelsson raising and lowering operators and describe the algebraic structure of the algebra $Z(\mathfrak{g}_n, \mathfrak{g}_{n-1})$. In Sect. 3 we formulate some known results on the algebraic structure of the Yangian
$Y(2n)$ and the twisted Yangian $Y^-(2n)$; see [O3, MNO]. In Sect. 4 we describe a particular case of the construction of Gelfand–Tsetlin-type basis for a certain class of representations of $Y(2)$ and $Y^-(2)$; see [Mo2, NT2]. Our main arguments are given in Sects. 5 and 6. We construct the highest vector and find the highest weight of $V(\lambda)^+_\mu$ as a representation of $Y^-(2)$. As a corollary we obtain a proof of the Zhelobenko branching rule for representations of the symplectic Lie algebras [Z1] (see also Hegerfeldt [H], King [Ki], Proctor [P2], Okounkov [Ok]). In Sect. 6 we construct a basis in $V(\lambda)$ and derive the formulas for the matrix elements of generators of $\mathfrak{g}_n$ in this basis. They have a multiplicative form which exhibits some similarity with the Gelfand–Tsetlin formulas in the case of gl(N) and o(N).

1. Main Theorem

We shall enumerate the rows and columns of $2n \times 2n$-matrices over $\mathbb{C}$ by the indices $-n, -n+1, \dots, -1, 1, \dots, n$. We shall also assume throughout the paper that the index 0 is skipped in a sum or in a product. The canonical basis $\{e_i\}$ in the space $\mathbb{C}^{2n}$ will be enumerated by the same set of indices. We let the $E_{ij}$, $i, j = -n, \dots, -1, 1, \dots, n$ denote the standard basis of the Lie algebra $\mathfrak{gl}(2n)$. Introduce the elements
$$F_{ij} = E_{ij} - \theta_{ij} E_{-j,-i}, \qquad \theta_{ij} = \operatorname{sgn} i \cdot \operatorname{sgn} j. \tag{1.1}$$
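The elements (1.1) can be realized as explicit matrices. The following is a minimal sketch, not part of the original argument, that builds them for n = 3 and checks two elementary consequences of the definition: the symmetry $F_{-j,-i} = -\theta_{ij} F_{ij}$, and skew-symmetry under the twisted transposition $(E_{ij})^t = \theta_{ij} E_{-j,-i}$. It also counts the distinct elements up to sign, which should give $n(2n+1) = \dim \mathfrak{sp}(2n)$. The helper names (`E`, `F`, `transpose_t`, `canonical`) are hypothetical and chosen only for this illustration.

```python
from itertools import product

def sgn(i):
    return 1 if i > 0 else -1

def theta(i, j):
    return sgn(i) * sgn(j)

def indices(n):
    # Labels -n, ..., -1, 1, ..., n; the index 0 is skipped.
    return [i for i in range(-n, n + 1) if i != 0]

def E(i, j):
    # Matrix unit E_ij stored sparsely as {(row, col): coefficient}.
    return {(i, j): 1}

def add(X, Y, c=1):
    # Return X + c*Y, dropping entries that cancel to zero.
    Z = dict(X)
    for key, val in Y.items():
        Z[key] = Z.get(key, 0) + c * val
        if Z[key] == 0:
            del Z[key]
    return Z

def scale(X, c):
    return {key: c * val for key, val in X.items() if c * val != 0}

def F(i, j):
    # F_ij = E_ij - theta_ij E_{-j,-i}, as in (1.1).
    return add(E(i, j), E(-j, -i), -theta(i, j))

def transpose_t(X):
    # Twisted transposition (E_ij)^t = theta_ij E_{-j,-i}, extended linearly.
    Z = {}
    for (i, j), val in X.items():
        Z = add(Z, E(-j, -i), theta(i, j) * val)
    return Z

def canonical(X):
    # Fix the overall sign so that F_ij and -F_ij are identified.
    first = min(X)
    s = 1 if X[first] > 0 else -1
    return frozenset(scale(X, s).items())

n = 3
idx = indices(n)
pairs = list(product(idx, idx))
symmetry_ok = all(F(-j, -i) == scale(F(i, j), -theta(i, j)) for i, j in pairs)
skew_ok = all(transpose_t(F(i, j)) == scale(F(i, j), -1) for i, j in pairs)
distinct = {canonical(F(i, j)) for i, j in pairs}
print(symmetry_ok, skew_ok, len(distinct))
```

The distinct elements have pairwise disjoint supports, so counting them is enough to read off the dimension of their span in this sketch.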
The symplectic Lie algebra $\mathfrak{g}_n = \mathfrak{sp}(2n)$ can be identified with the subalgebra in $\mathfrak{gl}(2n)$ spanned by the elements $F_{ij}$, $i, j = -n, \dots, n$. They satisfy the following symmetry property:
$$F_{-j,-i} = -\theta_{ij} F_{ij}. \tag{1.2}$$
The elements $F_{k,-k}$, $F_{-k,k}$ with $k = 1, \dots, n$ and $F_{k-1,-k}$ with $k = 2, \dots, n$ generate $\mathfrak{g}_n$ as a Lie algebra. The subalgebra $\mathfrak{g}_{n-1}$ is spanned by the elements (1.1) with the indices $i, j$ running over the set $\{-n+1, \dots, n-1\}$. Denote by $\mathfrak{h} = \mathfrak{h}_n$ the diagonal Cartan subalgebra in $\mathfrak{g}_n$. The elements $F_{11}, \dots, F_{nn}$ form a basis of $\mathfrak{h}$.

The finite-dimensional irreducible representations of $\mathfrak{g}_n$ are in a one-to-one correspondence with $n$-tuples of integers $\lambda = (\lambda_1, \dots, \lambda_n)$ satisfying the inequalities
$$0 \ge \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n.$$
Such an $n$-tuple $\lambda$ is called the highest weight of the corresponding representation which we shall denote by $V(\lambda)$. It contains a unique, up to a multiple, nonzero vector $\xi$ (the highest vector) such that
$$F_{ii}\,\xi = \lambda_i\,\xi, \quad i = 1, \dots, n; \qquad F_{ij}\,\xi = 0, \quad -n \le i < j \le n.$$
We shall sometimes use the numbers¹ $\lambda_{-i} := -\lambda_i$. They are eigenvalues of $\xi$ with respect to the operators $F_{-i,-i}$.

¹ The non-negative labels $\lambda_{-n} \ge \cdots \ge \lambda_{-1} \ge 0$ are usually used to parameterize the irreducible finite-dimensional representations of $\mathfrak{g}_n$. We have chosen to work with positive subindices. Both parameterizations can be easily obtained from each other.

The restriction of $V(\lambda)$ to the subalgebra $\mathfrak{g}_{n-1}$ is isomorphic to a direct sum of irreducible finite-dimensional representations $V'(\mu)$, $\mu = (\mu_1, \dots, \mu_{n-1})$ of $\mathfrak{g}_{n-1}$ with certain multiplicities:
$$V(\lambda) = \bigoplus_{\mu} c(\mu)\, V'(\mu). \tag{1.3}$$
The multiplicity $c(\mu)$ is equal to the number of $n$-tuples of integers $(\nu_1, \dots, \nu_n)$ satisfying the inequalities [Z1] (see also Corollary 5.3 below):
$$\begin{aligned}
0 &\ge \nu_1 \ge \lambda_1 \ge \nu_2 \ge \lambda_2 \ge \cdots \ge \nu_{n-1} \ge \lambda_{n-1} \ge \nu_n \ge \lambda_n, \\
0 &\ge \nu_1 \ge \mu_1 \ge \nu_2 \ge \mu_2 \ge \cdots \ge \nu_{n-1} \ge \mu_{n-1} \ge \nu_n.
\end{aligned} \tag{1.4}$$
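The branching rule (1.3), (1.4) is easy to test numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: it counts the tuples $\nu$ for given $\lambda$ and $\mu$, using the observation that the two chains in (1.4) confine each $\nu_i$ to an interval independent of the other $\nu_j$, and then computes $\dim V(\lambda)$ recursively from (1.3). The function names `multiplicity` and `dimension` are hypothetical. The expected values in the usage lines are the well-known dimensions of small representations of sp(2), sp(4) and sp(6), written in the convention $0 \ge \lambda_1 \ge \cdots \ge \lambda_n$ of this section.

```python
from itertools import product

def multiplicity(lam, mu):
    """Number of integer tuples nu satisfying both chains in (1.4)."""
    n = len(lam)
    if len(mu) != n - 1:
        raise ValueError("mu must have one entry fewer than lam")
    count = 1
    for i in range(n):
        # nu_{i+1} is bounded above by 0 (first entry) or by lam_i and mu_i,
        # and bounded below by lam_{i+1} and, if present, mu_{i+1}.
        upper = 0 if i == 0 else min(lam[i - 1], mu[i - 1])
        lower = lam[i] if i == n - 1 else max(lam[i], mu[i])
        count *= max(0, upper - lower + 1)
    return count

def dimension(lam):
    """dim V(lam) from (1.3): sum over mu of c(mu) * dim V'(mu)."""
    n = len(lam)
    if n == 0:
        return 1
    # (1.4) forces lam_{i+1} <= mu_i <= 0, so only these mu can contribute.
    ranges = [range(lam[i + 1], 1) for i in range(n - 1)]
    total = 0
    for mu in product(*ranges):
        c = multiplicity(lam, mu)
        if c:
            total += c * dimension(mu)
    return total

print(dimension((-3,)))        # sp(2): four-dimensional representation
print(dimension((0, -1)))      # sp(4): vector representation
print(dimension((0, -2)))      # sp(4): adjoint representation
print(dimension((0, 0, -1)))   # sp(6): vector representation
```

For the vector representation of sp(4), $\lambda = (0, -1)$, the weight $\mu = (0)$ occurs with multiplicity 2, which illustrates that the reduction is not multiplicity-free.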
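Denote by $V(\lambda)^+$ the subspace of $\mathfrak{g}_{n-1}$ highest vectors in $V(\lambda)$:
$$V(\lambda)^+ = \{\eta \in V(\lambda) \mid F_{ij}\,\eta = 0, \ -n < i < j < n\}.$$
Given $\mu = (\mu_1, \dots, \mu_{n-1})$ we denote by $V(\lambda)^+_\mu$ the corresponding weight subspace in $V(\lambda)^+$:
$$V(\lambda)^+_\mu = \{\eta \in V(\lambda)^+ \mid F_{ii}\,\eta = \mu_i\,\eta, \ i = 1, \dots, n-1\}.$$
We obviously have $\dim V(\lambda)^+_\mu = c(\mu)$. Any nonzero vector $\eta \in V(\lambda)^+_\mu$ generates a $\mathfrak{g}_{n-1}$-submodule in $V(\lambda)$ isomorphic to $V'(\mu)$.

A parameterization of basis vectors in $V(\lambda)$ is obtained by using its further restrictions to the subalgebras of the chain
$$\mathfrak{g}_1 \subset \mathfrak{g}_2 \subset \cdots \subset \mathfrak{g}_{n-1} \subset \mathfrak{g}_n.$$
Define the pattern $\Lambda$ associated with $\lambda$ as an array of integer row vectors of the form
$$\begin{array}{cccc}
\lambda_{n1} & \lambda_{n2} & \cdots & \lambda_{nn} \\
\lambda'_{n1} & \lambda'_{n2} & \cdots & \lambda'_{nn} \\
\lambda_{n-1,1} & \cdots & \lambda_{n-1,n-1} & \\
\lambda'_{n-1,1} & \cdots & \lambda'_{n-1,n-1} & \\
\cdots & \cdots & & \\
\lambda_{11} & & & \\
\lambda'_{11} & & &
\end{array}$$
such that the upper row coincides with $\lambda$ and the following inequalities hold:
$$0 \ge \lambda'_{k1} \ge \lambda_{k1} \ge \lambda'_{k2} \ge \lambda_{k2} \ge \cdots \ge \lambda'_{k,k-1} \ge \lambda_{k,k-1} \ge \lambda'_{kk} \ge \lambda_{kk}$$
for $k = 1, \dots, n$; and
$$0 \ge \lambda'_{k1} \ge \lambda_{k-1,1} \ge \lambda'_{k2} \ge \lambda_{k-1,2} \ge \cdots \ge \lambda'_{k,k-1} \ge \lambda_{k-1,k-1} \ge \lambda'_{kk}$$
for $k = 2, \dots, n$. Let us set
$$l_{ki} = \lambda_{ki} - i, \qquad l'_{ki} = \lambda'_{ki} - i, \qquad 1 \le i \le k \le n. \tag{1.5}$$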
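Since the patterns are meant to label a basis of $V(\lambda)$, their number should equal $\dim V(\lambda)$. The following sketch, an illustration rather than part of the paper, enumerates patterns directly from the two families of inequalities above and compares the count with the Weyl dimension formula for type $C_n$. The Weyl formula is a standard fact that is not stated in this section; it is applied here to the usual non-negative labels $(-\lambda_n, \dots, -\lambda_1)$. The function names `count_patterns` and `weyl_dimension` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def boxes(lowers, uppers):
    # All integer tuples x with lowers[i] <= x[i] <= uppers[i].
    return product(*[range(lo, hi + 1) for lo, hi in zip(lowers, uppers)])

def count_patterns(row):
    """Number of patterns whose upper row is row = (l_1, ..., l_k),
    assuming 0 >= l_1 >= ... >= l_k."""
    k = len(row)
    if k == 0:
        return 1
    total = 0
    # Primed row: 0 >= l'_1 >= l_1 >= l'_2 >= l_2 >= ... >= l'_k >= l_k.
    for primed in boxes(row, (0,) + tuple(row[:-1])):
        # Next row: l'_1 >= m_1 >= l'_2 >= ... >= m_{k-1} >= l'_k.
        for nxt in boxes(primed[1:], primed[:-1]):
            total += count_patterns(nxt)
    return total

def weyl_dimension(lam):
    """Weyl dimension formula for sp(2n), with lam in the convention
    0 >= lam_1 >= ... >= lam_n used in this section."""
    n = len(lam)
    part = [-x for x in reversed(lam)]        # usual non-negative labels
    m = [part[i] + n - i for i in range(n)]   # labels shifted by rho
    dim = Fraction(1)
    for i in range(n):
        dim *= Fraction(m[i], n - i)
        for j in range(i + 1, n):
            dim *= Fraction((m[i] - m[j]) * (m[i] + m[j]),
                            (j - i) * (2 * n - i - j))
    return int(dim)

for lam in [(-3,), (0, -1), (-1, -1), (-1, -2), (0, 0, -1)]:
    print(lam, count_patterns(lam), weyl_dimension(lam))
```

The recursion mirrors the chain of subalgebras: fixing the primed row and the next unprimed row corresponds to one step of the restriction from $\mathfrak{g}_k$ to $\mathfrak{g}_{k-1}$.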
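Theorem 1.1. There exists a basis $\{\zeta_\Lambda\}$ in $V(\lambda)$ parameterized by all patterns $\Lambda$ associated with $\lambda$ such that the action of generators of $\mathfrak{g}_n$ is given by the formulas
$$\begin{aligned}
F_{kk}\,\zeta_\Lambda &= \left( 2\sum_{i=1}^{k} \lambda'_{ki} - \sum_{i=1}^{k} \lambda_{ki} - \sum_{i=1}^{k-1} \lambda_{k-1,i} \right) \zeta_\Lambda, \\
F_{k,-k}\,\zeta_\Lambda &= \sum_{i=1}^{k} A_{ki}(\Lambda)\, \zeta_{\Lambda + \delta'_{ki}}, \\
F_{-k,k}\,\zeta_\Lambda &= \sum_{i=1}^{k} B_{ki}(\Lambda)\, \zeta_{\Lambda - \delta'_{ki}}, \\
F_{k-1,-k}\,\zeta_\Lambda &= \sum_{i=1}^{k-1} C_{ki}(\Lambda)\, \zeta_{\Lambda - \delta_{k-1,i}} + \sum_{i=1}^{k} \sum_{j,m=1}^{k-1} D_{kijm}(\Lambda)\, \zeta_{\Lambda + \delta'_{ki} + \delta_{k-1,j} + \delta'_{k-1,m}}.
\end{aligned}$$
Here
$$A_{ki}(\Lambda) = \prod_{a=1,\, a \ne i}^{k} \frac{1}{l'_{ka} - l'_{ki}},$$
$$B_{ki}(\Lambda) = 4\, A_{ki}(\Lambda)\, l'_{ki} \prod_{a=1}^{k} (l_{ka} - l'_{ki}) \prod_{a=1}^{k-1} (l_{k-1,a} - l'_{ki}),$$
$$C_{ki}(\Lambda) = \frac{1}{2\, l_{k-1,i}} \prod_{a=1,\, a \ne i}^{k-1} \frac{1}{l_{k-1,i}^2 - l_{k-1,a}^2},$$
and
$$D_{kijm}(\Lambda) = A_{ki}(\Lambda)\, A_{k-1,m}(\Lambda)\, C_{kj}(\Lambda) \prod_{a=1,\, a \ne i}^{k} (l_{k-1,j} - l'_{ka})(l_{k-1,j} + l'_{ka} + 1) \times \prod_{a=1,\, a \ne m}^{k-1} (l_{k-1,j} - l'_{k-1,a})(l_{k-1,j} + l'_{k-1,a} + 1).$$
The arrays $\Lambda \pm \delta_{ki}$ and $\Lambda \pm \delta'_{ki}$ are obtained from $\Lambda$ by replacing $\lambda_{ki}$ and $\lambda'_{ki}$ by $\lambda_{ki} \pm 1$ and $\lambda'_{ki} \pm 1$ respectively. It is supposed that $\zeta_\Lambda = 0$ if the array $\Lambda$ is not a pattern.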
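The smallest case n = 1 gives a quick sanity check of these formulas. A pattern then consists of $\lambda_{11} = \lambda_1$ and a single integer $\lambda'_{11}$ with $0 \ge \lambda'_{11} \ge \lambda_{11}$; the empty product gives $A_{11} = 1$, and $B_{11} = 4\, l'_{11}(l_{11} - l'_{11})$. The sketch below, an illustration only, builds the resulting matrices and verifies the relations $[F_{1,-1}, F_{-1,1}] = 4F_{11}$ and $[F_{11}, F_{1,-1}] = 2F_{1,-1}$. These two relations are not quoted from the text; they are computed here from the definition (1.1), which gives $F_{1,-1} = 2E_{1,-1}$, $F_{-1,1} = 2E_{-1,1}$ and $F_{11} = E_{11} - E_{-1,-1}$. All function names are hypothetical.

```python
def matmul(A, B):
    d = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    d = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(d)] for i in range(d)]

def theorem_matrices_n1(lam):
    """Matrices of F_11, F_{1,-1}, F_{-1,1} in the basis zeta_{l'},
    l' = lam, ..., 0, read off from the n = 1 case (lam <= 0)."""
    primes = list(range(lam, 1))
    pos = {p: c for c, p in enumerate(primes)}
    d = len(primes)
    F11 = [[0] * d for _ in range(d)]
    Fup = [[0] * d for _ in range(d)]     # F_{1,-1}: raises l' by 1
    Fdown = [[0] * d for _ in range(d)]   # F_{-1,1}: lowers l' by 1
    for p, c in pos.items():
        F11[c][c] = 2 * p - lam           # eigenvalue 2*l' - lam
        if p + 1 in pos:
            Fup[pos[p + 1]][c] = 1        # A_11 = 1 (empty product)
        if p - 1 in pos:
            # B_11 = 4 * l'_11 * (l_11 - l'_11) with l = lam - 1, l' = p - 1
            Fdown[pos[p - 1]][c] = 4 * (p - 1) * ((lam - 1) - (p - 1))
    return F11, Fup, Fdown

def relations_hold(lam):
    F11, Fup, Fdown = theorem_matrices_n1(lam)
    four_F11 = [[4 * x for x in row] for row in F11]
    two_Fup = [[2 * x for x in row] for row in Fup]
    return (commutator(Fup, Fdown) == four_F11
            and commutator(F11, Fup) == two_Fup)

print(all(relations_hold(lam) for lam in range(0, -6, -1)))
```

Columns of each matrix hold the image of the corresponding basis vector, so arrays that are not patterns simply contribute nothing, in line with the convention $\zeta_\Lambda = 0$.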
Theorem 1.1 will be proved in Sects. 5 and 6.

2. Mickelsson Algebra $Z(\mathfrak{g}_n, \mathfrak{g}_{n-1})$

This section contains preliminary results on the algebraic structure of the Mickelsson algebra $Z(\mathfrak{g}_n, \mathfrak{g}_{n-1})$; see [Z3]–[Z6] for further details. Consider the extension of the universal enveloping algebra $U(\mathfrak{g}_n)$
$$U'(\mathfrak{g}_n) = U(\mathfrak{g}_n) \otimes_{U(\mathfrak{h})} R(\mathfrak{h}),$$
where R(h) is the field of fractions of the commutative algebra U(h). Let J denote the left ideal in U0 (gn ) generated by the elements Fij with −n < i < j < n. The Mickelsson algebra Z(gn , gn−1 ) is the quotient algebra of the normalizer Norm J = {x ∈ U0 (gn ) | J x ⊆ J} modulo the two-sided ideal J. It is an algebra over C and an R(h)-bimodule. The algebraic structure of Z(gn , gn−1 ) can be described by using the extremal projection p = pn−1 for the Lie algebra gn−1 [AST1]–[AST3]. The projection p is, up to a factor from R(hn−1 ), a unique element of an extension of U0 (gn−1 ) to an algebra of formal series, satisfying the condition (2.1) Fij p = p Fji = 0 for − n < i < j < n. Explicit formulas for p are given in [AST1, Z3]. The element p is of zero weight (with respect to the adjoint action of hn−1 ) and it can be normalized to satisfy the condition p2 = p. The projection p is a well-defined operator in the quotient U0 (gn )/ J and the Mickelsson algebra Z(gn , gn−1 ) can be naturally identified with the image of p (see [Z3]): Z(gn , gn−1 ) = p U0 (gn )/ J . An analog of the Poincar´e–Birkhoff–Witt theorem holds for the algebra Z(gn , gn−1 ) so that ordered monomials in the elements Fn,−n , F−n,n , pFin , pFni , i = −n + 1, . . . , n − 1 form a basis of Z(gn , gn−1 ) as a left or right R(h)-module [Z4]. We have the following equalities in U0 (gn ) mod J: X
pFin = Fin +
Fii1 Fi1 i2 · · · Fis−1 is Fis n
i>i1 >···>is >−n
X
pFni = Fni +
Fi1 i Fi2 i1 · · · Fis is−1 Fnis
i · · · > is > −n we have in U0 (gn ) mod J, pFni1 Fi1 i2 · · · Fis−1 is Fis ,−n = 2 pFnis Fis ,−n ,
(2.10)
if im + im+1 = 0 for a certain m; otherwise this equals pFnis Fis ,−n . This allows one to write the right hand side (2.8) in the form n−1 X
pFni Fi,−n ai + Fn,−n b, ai , b ∈ R(h).
(2.11)
i=1
It remains to check that the coefficients ai and b remain unchanged if we replace fn by f−n − 1 = −fn − 1. This can be done by a straightforward calculation which implies that the right-hand side of (2.9) coincides with (2.11).
Proposition 2.1. We have the relations in $U'(\mathfrak{g}_n)$ mod $J$:
$$F_{n,-n} = \sum_{i=-n+1}^{n} z_{ni}\, z_{n,-i} \prod_{\substack{a=-n+1 \\ a \ne i}}^{n} \frac{1}{f_i - f_a}, \tag{2.12}$$
$$F_{n-1,-n} = \sum_{i=-n+1}^{n-1} z_{n-1,i}\, z_{i,-n} \prod_{\substack{a=-n+1 \\ a \ne i}}^{n-1} \frac{1}{f_i - f_a}, \tag{2.13}$$
where $z_{nn} = z_{n-1,n-1} := 1$.

Proof. Both relations are proved in the same way, so we only give a proof of (2.12). The following equality in $U'(\mathfrak{g}_n)$ mod $J$ is implied by the explicit formulas for the $pF_{i,-n}$ (see (2.2)):
$$F_{n,-n} = z_{n,-n}\, \frac{1}{(f_n - f_{-n+1}) \cdots (f_n - f_{n-1})} + \sum_{n > i_1 > \cdots > i_s > -n} F_{n i_1} F_{i_1 i_2} \cdots F_{i_{s-1} i_s} \cdot pF_{i_s,-n}\, \frac{1}{(f_{i_s} - f_n)(f_{i_s} - f_{i_1}) \cdots (f_{i_s} - f_{i_{s-1}})},$$
where $s = 1, 2, \dots$. Now (2.12) follows from (2.4).
3. Yangian $Y(2n)$ and Twisted Yangian $Y^-(2n)$

Proofs of the results formulated in this section and further details concerning the algebraic structure of the (twisted) Yangians can be found in [MNO]. The Yangian $Y(2n) = Y(\mathfrak{gl}(2n))$ is the complex associative algebra with the generators $t_{ij}^{(1)}, t_{ij}^{(2)}, \dots$, where $i, j = -n, \dots, -1, 1, \dots, n$, and the defining relations
$$[t_{ij}(u), t_{kl}(v)] = \frac{1}{u - v}\bigl(t_{kj}(u)\, t_{il}(v) - t_{kj}(v)\, t_{il}(u)\bigr), \tag{3.1}$$
where
$$t_{ij}(u) := \delta_{ij} + t_{ij}^{(1)} u^{-1} + t_{ij}^{(2)} u^{-2} + \cdots \in Y(2n)[[u^{-1}]].$$
One can rewrite (3.1) as a ternary relation for the matrix
$$T(u) := \sum_{i,j} t_{ij}(u) \otimes E_{ij} \in Y(2n)[[u^{-1}]] \otimes \operatorname{End} \mathbb{C}^{2n}.$$
To do this introduce the following notation. For an operator $X \in \operatorname{End} \mathbb{C}^{2n}$ and a number $m = 1, 2, \dots$ we set
$$X_k := 1^{\otimes (k-1)} \otimes X \otimes 1^{\otimes (m-k)} \in \bigl(\operatorname{End} \mathbb{C}^{2n}\bigr)^{\otimes m}, \qquad 1 \le k \le m. \tag{3.2}$$
If $X \in (\operatorname{End} \mathbb{C}^{2n})^{\otimes 2}$, then for any $k, l$ such that $1 \le k, l \le m$ and $k \ne l$, we denote by $X_{kl}$ the operator in $(\mathbb{C}^{2n})^{\otimes m}$ which acts as $X$ in the product of $k$th and $l$th copies and as 1 in all other copies. That is,
$$X = \sum_{r,s,t,u} a_{rstu}\, E_{rs} \otimes E_{tu} \quad \Rightarrow \quad X_{kl} = \sum_{r,s,t,u} a_{rstu}\, (E_{rs})_k\, (E_{tu})_l, \tag{3.3}$$
where $a_{rstu} \in \mathbb{C}$. The ternary relation has the form
$$R(u - v)\, T_1(u)\, T_2(v) = T_2(v)\, T_1(u)\, R(u - v), \tag{3.4}$$
where $R(u) = R_{12}(u) = 1 - u^{-1} P$ and $P$ is the permutation operator in $\mathbb{C}^{2n} \otimes \mathbb{C}^{2n}$. The Yangian $Y(2n)$ is a Hopf algebra with the coproduct
$$\Delta(t_{ij}(u)) = \sum_{a=-n}^{n} t_{ia}(u) \otimes t_{aj}(u). \tag{3.5}$$
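The matrix $R(u) = 1 - u^{-1}P$ entering (3.4) is Yang's rational R-matrix. It satisfies the Yang–Baxter equation $R_{12}(u)R_{13}(u+v)R_{23}(v) = R_{23}(v)R_{13}(u+v)R_{12}(u)$, a standard fact that is not stated in this section. The sketch below, an illustration using exact rational arithmetic, checks the equation on $(\mathbb{C}^N)^{\otimes 3}$ for the small value N = 2 chosen only to keep the matrices at size 8 × 8, together with the identity $R(u)R(-u) = 1 - u^{-2}$, which follows from $P^2 = 1$. The operators $R_{kl}$ are built in the sense of the notation (3.3). All helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

N = 2                                   # dimension of each tensor factor
basis = list(product(range(N), repeat=3))
index = {b: i for i, b in enumerate(basis)}
D = len(basis)

def identity():
    return [[Fraction(int(i == j)) for j in range(D)] for i in range(D)]

def perm(k, l):
    # P_kl: swaps tensor factors k and l (0-based) of a basis vector.
    M = [[Fraction(0)] * D for _ in range(D)]
    for b in basis:
        c = list(b)
        c[k], c[l] = c[l], c[k]
        M[index[tuple(c)]][index[b]] = Fraction(1)
    return M

def R(k, l, u):
    # R_kl(u) = 1 - P_kl / u acting on the triple tensor product.
    I, P = identity(), perm(k, l)
    return [[I[i][j] - P[i][j] / u for j in range(D)] for i in range(D)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(D)) for j in range(D)]
            for i in range(D)]

def yang_baxter_holds(u, v):
    lhs = matmul(matmul(R(0, 1, u), R(0, 2, u + v)), R(1, 2, v))
    rhs = matmul(matmul(R(1, 2, v), R(0, 2, u + v)), R(0, 1, u))
    return lhs == rhs

def unitarity_holds(u):
    prod_ = matmul(R(0, 1, u), R(0, 1, -u))
    I = identity()
    scalar = 1 - 1 / (u * u)
    return prod_ == [[scalar * I[i][j] for j in range(D)] for i in range(D)]

u, v = Fraction(3), Fraction(5, 7)
print(yang_baxter_holds(u, v), unitarity_holds(u))
```

Exact fractions are used instead of floating point so that the two sides can be compared with `==`.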
The twisted Yangian $Y^-(2n)$ corresponding to the symplectic Lie algebra $\mathfrak{g}_n = \mathfrak{sp}(2n)$ is defined as follows. By $X \mapsto X^t$ we will denote the matrix transposition such that $(E_{ij})^t = \theta_{ij} E_{-j,-i}$. Introduce the matrix $S(u) = (s_{ij}(u))$ by setting $S(u) := T(u)\, T^t(-u)$, or, in terms of matrix elements,
$$s_{ij}(u) = \sum_{a=-n}^{n} \theta_{aj}\, t_{ia}(u)\, t_{-j,-a}(-u). \tag{3.6}$$
Write $s_{ij}(u) = \delta_{ij} + s_{ij}^{(1)} u^{-1} + s_{ij}^{(2)} u^{-2} + \cdots$. The twisted Yangian $Y^-(2n)$ is the subalgebra of $Y(2n)$ generated by the elements $s_{ij}^{(1)}, s_{ij}^{(2)}, \dots$, where $-n \le i, j \le n$. The matrix $S(u)$ satisfies the following quaternary relation and symmetry relation which follow from (3.4):
$$\begin{aligned}
R(u - v)\, S_1(u)\, R^t(-u - v)\, S_2(v) &= S_2(v)\, R^t(-u - v)\, S_1(u)\, R(u - v), \\
S^t(-u) &= \frac{2u - 1}{2u}\, S(u) + \frac{1}{2u}\, S(-u).
\end{aligned} \tag{3.7}$$
Here we use the notation (3.2), where $R^t(u)$ is obtained from $R(u)$ by applying the transposition $t$ in either of the two copies of $\operatorname{End} \mathbb{C}^{2n}$:
$$R^t(u) = 1 - u^{-1} \sum_{i,j} \theta_{ij}\, E_{-j,-i} \otimes E_{ji}.$$
Relations (3.7) are defining relations for the algebra $Y^-(2n)$ and they can be rewritten in terms of the generating series $s_{ij}(u)$ as follows:
$$\begin{aligned}
[s_{ij}(u), s_{kl}(v)] ={}& \frac{1}{u - v}\bigl(s_{kj}(u)\, s_{il}(v) - s_{kj}(v)\, s_{il}(u)\bigr) \\
&- \frac{1}{u + v}\bigl(\theta_{k,-j}\, s_{i,-k}(u)\, s_{-j,l}(v) - \theta_{i,-l}\, s_{k,-i}(v)\, s_{-l,j}(u)\bigr) \\
&+ \frac{1}{u^2 - v^2}\bigl(\theta_{i,-j}\, s_{k,-i}(u)\, s_{-j,l}(v) - \theta_{i,-j}\, s_{k,-i}(v)\, s_{-j,l}(u)\bigr)
\end{aligned} \tag{3.8}$$
and
$$\theta_{ij}\, s_{-j,-i}(-u) = \frac{2u - 1}{2u}\, s_{ij}(u) + \frac{1}{2u}\, s_{ij}(-u). \tag{3.9}$$
This allows one to regard $Y^-(2n)$ as an abstract algebra with generators $s_{ij}^{(r)}$ and the relations (3.8), (3.9).

The mapping $F_{ij} \mapsto s_{ij}^{(1)}$ defines an inclusion $U(\mathfrak{g}_n) \hookrightarrow Y^-(2n)$ while the mapping
$$s_{ij}(u) \mapsto \delta_{ij} + \frac{F_{ij}}{u - 1/2} \tag{3.10}$$
defines an algebra homomorphism $Y^-(2n) \to U(\mathfrak{g}_n)$. Any even formal series $c(u) \in 1 + u^{-2}\mathbb{C}[[u^{-2}]]$ defines an automorphism of $Y^-(2n)$ given by
$$s_{ij}(u) \mapsto c(u)\, s_{ij}(u). \tag{3.11}$$
The Sklyanin determinant $\operatorname{sdet} S(u)$ is a formal series in $u^{-1}$ with coefficients from the center of the algebra $Y^-(2n)$. It can be defined by the formula (see (3.2), (3.3)):
$$A_{2n}\, S_1(u)\, R^t_{12} \cdots R^t_{1,2n}\, S_2(u - 1)\, R^t_{23} \cdots R^t_{2,2n}\, S_3(u - 2) \cdots S_{2n-1}(u - 2n + 2)\, R^t_{2n-1,2n}\, S_{2n}(u - 2n + 1) = \operatorname{sdet} S(u)\, A_{2n}, \tag{3.12}$$
where $R^t_{ij} := R^t_{ij}(-2u + i + j - 2)$, and $A_{2n}$ is the normalized antisymmetrizer in the tensor space $(\mathbb{C}^{2n})^{\otimes 2n}$ so that $A_{2n}^2 = A_{2n}$. Explicit formulas for $\operatorname{sdet} S(u)$ are given in [O3, MNO, Mo3].

The Sklyanin comatrix $\widehat{S}(u) = (\widehat{s}_{ij}(u))$ is defined by
$$\operatorname{sdet} S(u) = \widehat{S}(u)\, S(u - 2n + 1). \tag{3.13}$$
The mapping
$$S(u) \mapsto \frac{2u + 1}{2u - 2n + 1}\, \widehat{S}(-u + n - 1) \tag{3.14}$$
defines an automorphism of the algebra $Y^-(2n)$; see [Mo4, Proposition 2.1].

Let us denote by $S^{(n-1)}(u)$ and $\widetilde{S}(u)$ the submatrices of $S(u)$ whose rows and columns are enumerated by the sets of indices $\{-n+1, \dots, -1, 1, \dots, n-1\}$ and $\{-n+1, \dots, -1, 1, \dots, n\}$ respectively. Introduce the $nn$th quasi-determinant of the matrix $\widetilde{S}(u)$ by
$$|\widetilde{S}(u)|_{nn} = \Bigl( \bigl(\widetilde{S}(u)^{-1}\bigr)_{nn} \Bigr)^{-1};$$
see [GKLLRT]. We shall need the following expression for the matrix element $\widehat{s}_{nn}(u)$ of the Sklyanin comatrix $\widehat{S}(u)$.

Proposition 3.1. We have the formula
$$\widehat{s}_{nn}(u) = \frac{2u + 1}{2u - 1}\, |\widetilde{S}(-u)|_{nn}\, \operatorname{sdet} S^{(n-1)}(u - 1). \tag{3.15}$$

Proof. Multiplying both sides of (3.12) by $S_{2n}^{-1}(u - 2n + 1)$ from the right and using (3.13) we obtain the relation
$$A_{2n}\, S_1(u)\, R^t_{12} \cdots R^t_{1,2n}\, S_2(u - 1)\, R^t_{23} \cdots R^t_{2,2n}\, S_3(u - 2) \cdots S_{2n-1}(u - 2n + 2)\, R^t_{2n-1,2n} = A_{2n}\, \widehat{S}_{2n}(u). \tag{3.16}$$
It can be easily verified by using the symmetry relation (3.9) (see also [Mo3]) that
$$A_{2n}\, S_1(u)\, R^t_{12} \cdots R^t_{1,2n} = \frac{2u + 1}{2u - 1}\, A_{2n}\, S_1^t(-u).$$
Denote by $A^{(2)}_{2n}$ the normalized antisymmetrizer corresponding to the subgroup $S_{\{2,\dots,2n\}}$ of the symmetric group $S_{2n}$. Clearly, $A_{2n} = A_{2n} A^{(2)}_{2n}$. Note that $A^{(2)}_{2n}$ is permutable with $S_1^t(-u)$, while $R^t_{ij}$ is permutable with $R^t_{kl}$ and $S_k(u)$ provided that the indices $i, j, k, l$ are distinct. So, we can rewrite formula (3.16) in the form:
$$\frac{2u + 1}{2u - 1}\, A_{2n}\, S_1^t(-u)\, A^{(2)}_{2n}\, S_2(u - 1)\, R^t_{23} \cdots R^t_{2,2n-1}\, S_3(u - 2) \cdots S_{2n-1}(u - 2n + 2)\, R^t_{2,2n} \cdots R^t_{2n-1,2n} = A_{2n}\, \widehat{S}_{2n}(u). \tag{3.17}$$
Let us apply the operators in both sides of this formula to the vector
$$v_i = e_{-i} \otimes e_{-n+1} \otimes e_{-n+2} \otimes \cdots \otimes e_{n-1} \otimes e_n,$$
where $i \in \{-n+1, \dots, n\}$. For the right-hand side we clearly obtain
$$A_{2n}\, \widehat{S}_{2n}(u)\, v_i = \delta_{in}\, \widehat{s}_{nn}(u)\, \zeta, \tag{3.18}$$
where $\zeta := A_{2n}(e_{-n} \otimes e_{-n+1} \otimes \cdots \otimes e_n)$. To calculate the left hand side we note first that
$$R^t_{2,2n} \cdots R^t_{2n-1,2n}\, v_i = v_i.$$
Further, let us introduce the formal series $\Phi_{a_2,\dots,a_{2n-1}}(u - 1) \in Y^-(2n)[[u^{-1}]]$, $-n \le a_i \le n$, as follows:
$$A^{(2)}_{2n}\, S_2(u - 1)\, R^t_{23} \cdots R^t_{2,2n-1}\, S_3(u - 2) \cdots S_{2n-1}(u - 2n + 2)\, (e_{-n+1} \otimes \cdots \otimes e_{n-1}) = \sum_{a_2,\dots,a_{2n-1}} \Phi_{a_2,\dots,a_{2n-1}}(u - 1)\, (e_{a_2} \otimes \cdots \otimes e_{a_{2n-1}}).$$
In particular,
$$(2n - 2)!\; \Phi_{-n+1,\dots,n-1}(u - 1) = \operatorname{sdet} S^{(n-1)}(u - 1), \tag{3.19}$$
and the series $\Phi_{a_2,\dots,a_{2n-1}}(u - 1)$ is skew symmetric with respect to permutations of the indices $a_2, \dots, a_{2n-1}$; see [MNO, Sect. 4]. This allows us to write the left hand side of (3.17) applied to $v_i$ in the form:
$$\begin{aligned}
&\frac{2u + 1}{2u - 1}\, (2n - 2)! \sum_{k=1}^{2n-1} (-1)^{k-1}\, s^t_{b_k,-i}(-u)\, \Phi_{b_1,\dots,\widehat{b}_k,\dots,b_{2n-1}}(u - 1)\, \zeta \\
&\quad = \frac{2u + 1}{2u - 1}\, (2n - 2)! \sum_{k=1}^{2n-1} \theta_{in}\, s_{i,-b_k}(-u)\, (-1)^{k-1}\, \theta_{-b_k,n}\, \Phi_{b_1,\dots,\widehat{b}_k,\dots,b_{2n-1}}(u - 1)\, \zeta,
\end{aligned}$$
where $(b_1, \dots, b_{2n-1}) = (-n, -n+1, \dots, n-1)$ and the hat indicates the index to be omitted. Put
$$\Phi_{-b_k}(u - 1) := (2n - 2)!\, (-1)^{k-1}\, \theta_{-b_k,n}\, \Phi_{b_1,\dots,\widehat{b}_k,\dots,b_{2n-1}}(u - 1).$$
Then, taking into account (3.18), we get the following matrix relation:
$$\frac{2u + 1}{2u - 1}\, \widetilde{S}(-u) \begin{pmatrix} \Phi_{-n+1}(u - 1) \\ \vdots \\ \Phi_n(u - 1) \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ \widehat{s}_{nn}(u) \end{pmatrix}.$$
Multiplying both sides by the matrix $\widetilde{S}(-u)^{-1}$ from the left and comparing the $n$th coordinates of the vectors, we obtain using (3.19) that
$$\frac{2u + 1}{2u - 1}\, \operatorname{sdet} S^{(n-1)}(u - 1) = \bigl(\widetilde{S}(-u)^{-1}\bigr)_{nn}\, \widehat{s}_{nn}(u),$$
which implies (3.15).
4. Representations of the Algebras $Y(2)$ and $Y^-(2)$

Here we formulate some necessary results on representations of the algebras $Y(2)$ and $Y^-(2)$; see [T, Dr, CP, NT2, Mo2, Mo4]. Having in mind their applications in Sects. 5 and 6 we shall enumerate the generators of $Y(2)$ and $Y^-(2)$, as well as rows and columns of $2\times 2$-matrices, by the symbols $-n, n$ instead of the usual $-1, 1$.

A representation of the Yangian $Y(2)$ is a highest weight representation if it is generated by a nonzero vector $\eta$ (the highest vector) such that
\[
t_{ii}(u)\,\eta = \lambda_i(u)\,\eta, \quad i = -n, n, \qquad t_{-n,n}(u)\,\eta = 0,
\]
for certain formal series $\lambda_i(u)\in 1 + u^{-1}\mathbb{C}[[u^{-1}]]$. The pair $(\lambda_{-n}(u),\lambda_n(u))$ is called the highest weight of the representation. Given arbitrary series $\lambda_{-n}(u),\lambda_n(u)$ there exists a unique, up to an isomorphism, irreducible highest weight representation of $Y(2)$ with the highest weight $(\lambda_{-n}(u),\lambda_n(u))$ which will be denoted by $L(\lambda_{-n}(u),\lambda_n(u))$.

Similarly, a representation of the Yangian $Y^-(2)$ is a highest weight representation if it is generated by a nonzero vector $\eta$ (the highest vector) such that
\[
s_{n,n}(u)\,\eta = \mu(u)\,\eta, \qquad s_{-n,n}(u)\,\eta = 0,
\]
for a certain formal series $\mu(u)\in 1 + u^{-1}\mathbb{C}[[u^{-1}]]$ called the highest weight of the representation. Given an arbitrary series $\mu(u)$ there exists a unique, up to an isomorphism, irreducible highest weight representation of $Y^-(2)$ with the highest weight $\mu(u)$ which will be denoted by $V(\mu(u))$. Every irreducible finite-dimensional representation of the algebra $Y^-(2)$ is isomorphic to a unique $V(\mu(u))$.

Given a pair of complex numbers $(\alpha,\beta)$ such that $\alpha-\beta\in\mathbb{Z}_+$ we denote by $L(\alpha,\beta)$ the irreducible representation of the Lie algebra $\mathfrak{gl}(2)$ with the highest weight $(\alpha,\beta)$ with respect to the upper triangular Borel subalgebra. We have $\dim L(\alpha,\beta) = \alpha-\beta+1$. We may regard $L(\alpha,\beta)$ as a $Y(2)$-module by using the algebra homomorphism $Y(2)\to U(\mathfrak{gl}(2))$ given by
\[
t_{ij}(u)\mapsto \delta_{ij} + E_{ij}\,u^{-1}, \qquad i, j\in\{-n, n\}. \tag{4.1}
\]
The coproduct (3.5) allows one to construct representations of $Y(2)$ of the form
\[
L = L(\alpha_1,\beta_1)\otimes\cdots\otimes L(\alpha_n,\beta_n). \tag{4.2}
\]
One easily obtains from (3.5) and (4.1) that the tensor product
\[
\eta = \omega_1\otimes\cdots\otimes\omega_n \tag{4.3}
\]
of the highest vectors $\omega_i$ of the $L(\alpha_i,\beta_i)$ generates a highest weight submodule in $L$ with the highest weight $(\alpha(u),\beta(u))$, where
\[
\alpha(u) = (1+\alpha_1u^{-1})\cdots(1+\alpha_nu^{-1}), \qquad \beta(u) = (1+\beta_1u^{-1})\cdots(1+\beta_nu^{-1}). \tag{4.4}
\]
Hence, if the representation $L$ is irreducible then it is isomorphic to $L(\alpha(u),\beta(u))$. An irreducibility criterion of representation (4.2) is given by Tarasov [T] and Chari–Pressley [CP]; see also [Mo4]. To formulate the result, with each $L(\alpha,\beta)$ associate the string
\[
S(\alpha,\beta) = \{\beta,\beta+1,\dots,\alpha-1\}\subset\mathbb{C}.
\]
We say that two strings $S_1$ and $S_2$ are in general position if either
(i) $S_1\cup S_2$ is not a string, or
(ii) $S_1\subseteq S_2$, or $S_2\subseteq S_1$.
The representation (4.2) of $Y(2)$ is irreducible if and only if the strings $S(\alpha_i,\beta_i)$ and $S(\alpha_j,\beta_j)$ are in general position for all $i<j$ [CP].

The tensor product (4.2) can also be regarded as a representation of the subalgebra $Y^-(2)\subset Y(2)$. The following criterion of its irreducibility is given in [Mo4] and will be used in Sect. 5.

Proposition 4.1. The representation (4.2) of $Y^-(2)$ is irreducible if and only if each pair of strings $(S(\alpha_i,\beta_i), S(\alpha_j,\beta_j))$ and $(S(\alpha_i,\beta_i), S(-\beta_j,-\alpha_j))$ is in general position for all $i<j$.

If the representation $L$ of $Y^-(2)$ defined in (4.2) is irreducible then by (3.6) it is isomorphic to $V(\mu(u))$ with
\[
\mu(u) = \alpha(-u)\,\beta(u), \tag{4.5}
\]
where $\alpha(u)$ and $\beta(u)$ are given by (4.4).

It follows from (3.5) and (4.1) that the elements $t^{(r)}_{ij}\in Y(2)$ with $r\ge n+1$ act as zero operators in (4.2). Therefore, the operators
\[
T_{ij}(u) = u^n\, t_{ij}(u) \tag{4.6}
\]
are polynomials in $u$. By (3.6) the same is true for the operators $S_{ij}(u) = u^{2n}s_{ij}(u)$. Note that the defining relations (3.1) allow us to rewrite the formula (3.6) for $s_{n,-n}(u)$ in the form
\[
s_{n,-n}(u) = \frac{u+1/2}{u}\,\big(t_{n,-n}(u)\,t_{nn}(-u) - t_{n,-n}(-u)\,t_{nn}(u)\big).
\]
Therefore we may introduce another polynomial operator in $L$ by
\[
S^{\natural}_{n,-n}(u) = \frac{1}{u+1/2}\,S_{n,-n}(u) = \frac{(-1)^n}{u}\,\big(T_{n,-n}(u)\,T_{nn}(-u) - T_{n,-n}(-u)\,T_{nn}(u)\big). \tag{4.7}
\]
Note that by (3.8) we have $[S^{\natural}_{n,-n}(u), S^{\natural}_{n,-n}(v)] = 0$.

Let $\gamma = (\gamma_1,\dots,\gamma_n)$ be an $n$-tuple of complex numbers such that
\[
\alpha_i-\gamma_i\in\mathbb{Z}_+, \qquad \gamma_i-\beta_i\in\mathbb{Z}_+, \qquad i = 1,\dots,n. \tag{4.8}
\]
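The general-position test for strings, and with it the two irreducibility criteria quoted above, can be checked mechanically. The following Python sketch is only an illustration: the function names are hypothetical, and it assumes that all parameters $\alpha_i,\beta_i$ are rational (exact `Fraction` values), which covers the half-integer parameters used in Sect. 5 but not arbitrary complex numbers.

```python
from fractions import Fraction


def string(alpha, beta):
    """Return S(alpha, beta) = {beta, beta + 1, ..., alpha - 1} as a set of Fractions."""
    alpha, beta = Fraction(alpha), Fraction(beta)
    length = alpha - beta
    if length.denominator != 1 or length < 0:
        raise ValueError("alpha - beta must be a non-negative integer")
    return {beta + k for k in range(int(length))}


def is_string(points):
    """True if the points are consecutive, i.e. of the form {b, b + 1, ..., b + m}."""
    pts = sorted(points)
    return all(q - p == 1 for p, q in zip(pts, pts[1:]))


def in_general_position(s1, s2):
    """Condition (i) or (ii): the union is not a string, or one string contains the other."""
    return (not is_string(s1 | s2)) or s1 <= s2 or s2 <= s1


def irreducible_over_yangian(pairs):
    """Criterion quoted from [T, CP] for the Y(2)-module (4.2); pairs = [(alpha_i, beta_i)]."""
    s = [string(a, b) for a, b in pairs]
    return all(in_general_position(s[i], s[j])
               for i in range(len(s)) for j in range(i + 1, len(s)))


def irreducible_over_twisted_yangian(pairs):
    """Criterion of Proposition 4.1: also compare S(alpha_i, beta_i) with S(-beta_j, -alpha_j)."""
    s = [string(a, b) for a, b in pairs]
    reflected = [string(-Fraction(b), -Fraction(a)) for a, b in pairs]
    return all(in_general_position(s[i], s[j]) and in_general_position(s[i], reflected[j])
               for i in range(len(s)) for j in range(i + 1, len(s)))
```

For instance, the pairs $(1,0)$ and $(-1,-2)$ give the strings $\{0\}$ and $\{-2\}$, which are in general position, while $S(2,1)=\{1\}$ together with $\{0\}$ forms a string without inclusion; in this sketch the tensor product therefore passes the $Y(2)$ test but fails the $Y^-(2)$ test.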
Introduce the following vector in $L$:
\[
\eta_\gamma = \prod_{i=1}^{n} S^{\natural}_{n,-n}(-\gamma_i+1)\cdots S^{\natural}_{n,-n}(-\beta_i-1)\,S^{\natural}_{n,-n}(-\beta_i)\;\eta, \tag{4.9}
\]
where $\eta$ is defined in (4.3).

Proposition 4.2. Suppose that the representation $L$ of $Y^-(2)$ given by (4.2) is irreducible and the strings $S(\alpha_i,\beta_i)$ satisfy the condition
\[
S(\alpha_i,\beta_i)\cap S(\alpha_j,\beta_j) = \emptyset \quad\text{for } i\ne j. \tag{4.10}
\]
Then the vectors $\eta_\gamma$ with $\gamma$ satisfying (4.8) form a basis in $L$. Moreover, one has the formulas
\[
T_{nn}(u)\,\eta_\gamma = (u+\gamma_1)\cdots(u+\gamma_n)\,\eta_\gamma, \tag{4.11}
\]
\[
T_{n,-n}(-\gamma_i)\,\eta_\gamma = \frac{1}{2}\prod_{a=1,\,a\ne i}^{n}\frac{1}{-\gamma_i-\gamma_a}\;\eta_{\gamma+\delta_i}, \tag{4.12}
\]
\[
T_{-n,n}(-\gamma_i)\,\eta_\gamma = -2\prod_{k=1}^{n}(\alpha_k-\gamma_i+1)(\beta_k-\gamma_i)\cdot\prod_{a=1,\,a\ne i}^{n}(-\gamma_i-\gamma_a+1)\;\eta_{\gamma-\delta_i}, \tag{4.13}
\]
where $\delta_i$ is the $n$-tuple which has 1 on the $i$th position and zeroes as remaining entries; it is assumed that $\eta_\gamma = 0$ if $\gamma$ does not satisfy (4.8).

Proof. Since $L$ is irreducible as a $Y(2)$-module it is isomorphic to the highest weight representation $L(\alpha(u),\beta(u))$. For each $\gamma$ satisfying (4.8) introduce the vector
\[
\tilde\eta_\gamma = \prod_{i=1}^{n} T_{n,-n}(-\gamma_i+1)\cdots T_{n,-n}(-\beta_i-1)\,T_{n,-n}(-\beta_i)\;\eta.
\]
The vectors $\{\tilde\eta_\gamma\}$ form a basis in $L$ and the following relations hold:
\[
T_{nn}(u)\,\tilde\eta_\gamma = (u+\gamma_1)\cdots(u+\gamma_n)\,\tilde\eta_\gamma, \tag{4.14}
\]
\[
T_{n,-n}(-\gamma_i)\,\tilde\eta_\gamma = \tilde\eta_{\gamma+\delta_i}, \tag{4.15}
\]
\[
T_{-n,n}(-\gamma_i)\,\tilde\eta_\gamma = -\prod_{k=1}^{n}(\alpha_k-\gamma_i+1)(\beta_k-\gamma_i)\;\tilde\eta_{\gamma-\delta_i}. \tag{4.16}
\]
This is a special case of the construction of Gelfand–Tsetlin-type bases for representations of the Yangian $Y(m)$ [Mo2, NT2] (see also [T, NT1]). The formulas (4.7), (4.14) and (4.15) imply that
\[
S^{\natural}_{n,-n}(-\gamma_i)\,\tilde\eta_\gamma = 2\prod_{a=1,\,a\ne i}^{n}(-\gamma_i-\gamma_a)\;\tilde\eta_{\gamma+\delta_i}. \tag{4.17}
\]
Hence, for each $\gamma$ the vectors $\eta_\gamma$ and $\tilde\eta_\gamma$ coincide up to a nonzero multiple. This proves (4.11). Now (4.12) and (4.13) follow from (4.15) and (4.16).

Remark 4.3. The above proof of (4.17) relies on the fact that the $\tilde\eta_\gamma$ are eigenvectors for the operators $T_{nn}(u)$; see (4.7). That is, the Gelfand–Tsetlin-type basis $\{\tilde\eta_\gamma\}$ in $L$ corresponds to the inclusion $Y(1)\subset Y(2)$ with $Y(1)$ generated by the coefficients of $t_{nn}(u)$.
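The scalar in (4.17) can be traced through (4.7) step by step: at $u=-\gamma_i$ the term containing $T_{nn}(u)$ vanishes on $\tilde\eta_\gamma$ by (4.14), and the remaining term contributes $\frac{(-1)^n}{u}$ times the eigenvalue of $T_{nn}(-u)$. The Python sketch below checks this arithmetic with exact fractions. It is a numerical sanity check under the stated formulas rather than part of the proof; the function names are hypothetical and the index `i` is 0-based.

```python
from fractions import Fraction
from math import prod


def t_nn_eigenvalue(gammas, u):
    """Eigenvalue (u + gamma_1)...(u + gamma_n) of T_nn(u), as in (4.14)."""
    return prod(u + g for g in gammas)


def coefficient_from_4_7(gammas, i):
    """Scalar produced by the right-hand side of (4.7) at u = -gamma_i."""
    n = len(gammas)
    u = -gammas[i]
    # The second term of (4.7) carries T_nn(u) = T_nn(-gamma_i), whose eigenvalue is zero.
    assert t_nn_eigenvalue(gammas, u) == 0
    return Fraction((-1) ** n) / u * t_nn_eigenvalue(gammas, -u)


def coefficient_in_4_17(gammas, i):
    """Right-hand side of (4.17): 2 * prod_{a != i} (-gamma_i - gamma_a)."""
    return 2 * prod(-gammas[i] - g for a, g in enumerate(gammas) if a != i)


def coefficient_in_4_12(gammas, i):
    """Coefficient in (4.12), the reciprocal of the scalar in (4.17)."""
    return Fraction(1) / coefficient_in_4_17(gammas, i)
```

With the sample values $\gamma = (-1/2, -7/2, -13/2)$ both routes give 56 for the first index, so the coefficient in (4.12) is $1/56$ there.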
5. Yangian Action on $V(\lambda)^+_\mu$

Let us introduce the $2n\times 2n$-matrix $F = (F_{ij})$ whose $ij$th entry is the element $F_{ij}\in\mathfrak{g}_n$ (see (1.1)) and set
\[
F(u) = 1 + \frac{F}{u-1/2}.
\]
Denote by $\hat F(u)$ the image of the Sklyanin comatrix $\hat S(u)$ under the homomorphism $S(u)\mapsto F(u)$; see (3.10). By (3.11) and (3.14) the mapping
\[
\pi: S(u)\mapsto c(u)\,\frac{2u+1}{2u-2n+1}\,\hat F(-u+n-1), \qquad\text{where}\quad c(u) = \prod_{k=1}^{n}\big(1-(k-1/2)^2u^{-2}\big),
\]
defines a homomorphism $Y^-(2n)\to U(\mathfrak{g}_n)$; cf. [O3, MO]. The series $(1+nu^{-1})^{-1}\,\mathrm{sdet}\,S(u+n-1/2)$ is even in $u$ (see [MNO, Sect. 4.11]) and so by (3.14) the image of the generator $s^{(1)}_{ij}$ with respect to $\pi$ coincides with $F_{ij}$. By (3.8) we then have
\[
[F_{ij}, s^{\pi}_{kl}(u)] = \delta_{kj}\,s^{\pi}_{il}(u) - \delta_{il}\,s^{\pi}_{kj}(u) - \theta_{k,-j}\,\delta_{i,-k}\,s^{\pi}_{-j,l}(u) + \theta_{i,-l}\,\delta_{-l,j}\,s^{\pi}_{k,-i}(u), \tag{5.1}
\]
where $s^{\pi}_{ij}(u) := \pi(s_{ij}(u))$. This implies that the image of the restriction of $\pi$ to the subalgebra $Y^-(2)$ generated by the elements $s_{ij}(u)$ with $i, j\in\{-n, n\}$ is contained in the centralizer $C_n = U(\mathfrak{g}_n)^{\mathfrak{g}_{n-1}}$ and thus defines an algebra homomorphism
\[
\pi: Y^-(2)\to C_n. \tag{5.2}
\]
However, the subspace $V(\lambda)^+_\mu$ is an irreducible representation of the centralizer $C_n$; see [D, Sect. 9.1]. It follows from [MO, Proposition 4.9] that the algebra $C_n$ is generated by the image of $\pi$ and the center of $U(\mathfrak{g}_n)$. Since the elements of the center of $U(\mathfrak{g}_n)$ act as scalar operators in $V(\lambda)$, the $Y^-(2)$-module $V(\lambda)^+_\mu$ defined via the homomorphism (5.2) is irreducible.

Note that $C_n$ is a subalgebra in the normalizer $\mathrm{Norm}\,J$ (see Sect. 2): $C_n\hookrightarrow\mathrm{Norm}\,J$. Thus, using (5.2) and the definition of the Mickelsson algebra $Z(\mathfrak{g}_n,\mathfrak{g}_{n-1})$ we obtain an algebra homomorphism which we denote by $\pi'$:
\[
\pi': Y^-(2)\to Z(\mathfrak{g}_n,\mathfrak{g}_{n-1}). \tag{5.3}
\]
In other words, the elements of $Y^-(2)$, as operators in the space $V(\lambda)^+_\mu$, can be expressed as elements of the Mickelsson algebra $Z(\mathfrak{g}_n,\mathfrak{g}_{n-1})$. An explicit form of the images of the generators of $Y^-(2)$ under the homomorphism (5.3) is given in the following theorem.
Introduce the polynomials $Z_{ij}(u)$, $i, j\in\{-n, n\}$ with coefficients in the Mickelsson algebra $Z(\mathfrak{g}_n,\mathfrak{g}_{n-1})$:
\[
Z_{n,-n}(u) = \sum_{i=1}^{n-1} z_{ni}\,z_{n,-i}\prod_{a=1,\,a\ne i}^{n-1}\frac{u^2-g_a^2}{g_i^2-g_a^2} + F_{n,-n}\prod_{a=1}^{n-1}(u^2-g_a^2), \tag{5.4}
\]
\[
Z_{-n,n}(u) = \sum_{i=1}^{n-1} z_{-i,n}\,z_{in}\prod_{a=1,\,a\ne i}^{n-1}\frac{u^2-g_a^2}{g_i^2-g_a^2} + F_{-n,n}\prod_{a=1}^{n-1}(u^2-g_a^2), \tag{5.5}
\]
\[
Z_{n,n}(u) = \sum_{i=-n+1}^{n-1} z_{ni}\,z_{-n,-i}\prod_{a=-n+1,\,a\ne i}^{n-1}\frac{u+g_a}{g_i-g_a} + (u+g_n)\prod_{a=-n+1}^{n-1}(u+g_a), \tag{5.6}
\]
\[
Z_{-n,-n}(u) = -\sum_{i=-n+1}^{n-1} z_{-i,n}\,z_{i,-n}\prod_{a=-n+1,\,a\ne i}^{n-1}\frac{u+g_a}{g_i-g_a} + (u+g_n')\prod_{a=-n+1}^{n-1}(u+g_a), \tag{5.7}
\]
where $g_i := f_i + 1/2 = F_{ii} - i + 1/2$ and $g_n' = -g_n - 2n + 1$.

Theorem 5.1. The images of the generators $s_{ij}(u)$, $i, j\in\{-n, n\}$ of $Y^-(2)$ under the homomorphism (5.3) are given by the formulas
\[
s_{-n,-n}(u)\mapsto\frac{u+1/2}{u^{2n}}\,Z_{-n,-n}(u), \qquad s_{-n,n}(u)\mapsto\frac{u+1/2}{u^{2n}}\,Z_{-n,n}(u),
\]
\[
s_{n,-n}(u)\mapsto\frac{u+1/2}{u^{2n}}\,Z_{n,-n}(u), \qquad s_{n,n}(u)\mapsto\frac{u+1/2}{u^{2n}}\,Z_{n,n}(u). \tag{5.8}
\]
Proof. Consider first the generator $s_{n,n}(u)$. Using Proposition 3.1 we obtain the following formula for the $nn$th matrix element of the matrix $\hat F(u-1/2)$:
\[
\hat F(u-1/2)_{nn} = \frac{u}{u-1}\,\big|1-\tilde F u^{-1}\big|_{nn}\;\mathrm{sdet}\,F^{(n-1)}(u-3/2), \tag{5.9}
\]
where $\tilde F$ is the submatrix of $F$ obtained by removing the row and column enumerated by $-n$, and $\mathrm{sdet}\,F^{(n-1)}(u)$ is the image of the Sklyanin determinant $\mathrm{sdet}\,S^{(n-1)}(u)$ under the homomorphism (3.10). Using the combinatorial interpretation of the quasi-determinant $|1-\tilde F u^{-1}|_{nn}$ [GKLLRT, Prop. 7.20] we obtain the formula:
\[
\big|1-\tilde F u^{-1}\big|_{nn} = 1 - \sum_{k=1}^{\infty} F^{(k)}_{nn}\,u^{-k}, \tag{5.10}
\]
where
\[
F^{(k)}_{ab} = \sum F_{a i_1}F_{i_1 i_2}\cdots F_{i_{k-1} b},
\]
summed over all values of the indices $i_m\in\{-n+1,\dots,-1,1,\dots,n-1\}$. Let us show that for $k\ge 2$ we have the equality in $Z(\mathfrak{g}_n,\mathfrak{g}_{n-1})$: $F^{(k)}_{nn}$
=
n−1 X i=−n+1
(k−1) p Fni ·
Y
(fi − fa ) · pFin
i −n then in U0 (gn ) mod J, (k−1) (k−1) · Fii1 Fi1 i2 · · · Fis−1 is Fis n = 2 pFni Fi s n , pFni s (k−1) Fis n (cf. if i + i1 = 0 or im + im+1 = 0 for a certain m; otherwise this equals pFni s (2.10)). Using this, we verify by a straightforward calculation that the coefficient at each (k−1) Fin on the right hand side of (5.11) equals 1, and so it is given by product pFni n−1 X i=−n+1
$pF^{(k-1)}_{ni}F_{in} = pF^{(k)}_{nn} = F^{(k)}_{nn}$,
which proves (5.11). Further, we have in $Z(\mathfrak{g}_n,\mathfrak{g}_{n-1})$,
\[
pF^{(k-1)}_{ni} = pF_{ni}\,(f_i+n-1)^{k-2}, \qquad i = -n+1,\dots,n-1. \tag{5.13}
\]
Indeed,
\[
pF^{(k-1)}_{ni} = \sum_{a=-n+1}^{n-1} pF^{(k-2)}_{na}F_{ai} = \sum_{a=i}^{n-1} pF^{(k-2)}_{na}F_{ai} = pF^{(k-2)}_{ni}\,(F_{ii}+n-i-1),
\]
where we have used (2.1) and (5.12). Now (5.13) follows by induction. Thus, rewriting (5.11) and (5.13) in terms of the generators $z_{ni}$ and $z_{in}$ (see (2.3)) we obtain from (5.10) that in $Z(\mathfrak{g}_n,\mathfrak{g}_{n-1})$
\[
\big|1-\tilde F u^{-1}\big|_{nn} = 1 - F_{nn}u^{-1} + \sum_{i=-n+1}^{n-1} z_{ni}\,z_{-n,-i}\,\frac{1}{u(u-f_i-n)}\prod_{a=-n+1,\,a\ne i}^{n-1}\frac{1}{f_i-f_a}. \tag{5.14}
\]
The coefficients of the series $\mathrm{sdet}\,F^{(n-1)}(u)$ belong to the center of $U(\mathfrak{g}_{n-1})$ and so its image under $\pi$ coincides with its image with respect to the Harish-Chandra homomorphism. The latter was found in different ways in [Mo3, Sect. 5] and [MN, Sect. 6]. The result can be written as follows: in $Z(\mathfrak{g}_n,\mathfrak{g}_{n-1})$ we have
\[
\mathrm{sdet}\,F^{(n-1)}(u) = \prod_{a=1}^{n-1}\big((u-n+3/2)^2-f_a^2\big)\cdot\prod_{a=1}^{2n-2}\frac{1}{u-a+1/2}. \tag{5.15}
\]
Finally, using (5.9), (5.14) and (5.15) we find that the image $s^{\pi'}_{n,n}(u)$ of $s_{n,n}(u)$ under the homomorphism $\pi'$ is
\[
s^{\pi'}_{n,n}(u) = c(u)\,\frac{2u+1}{2u-2n+1}\,\hat F(-u+n-1)_{nn} = \frac{u+1/2}{u^{2n}}\,Z_{n,n}(u).
\]
To find the image $s^{\pi'}_{n,-n}(u)$ of $s_{n,-n}(u)$ under $\pi'$ we note that by (5.1), $[F_{n,-n}, s^{\pi}_{n,n}(u)] = -2\,s^{\pi}_{n,-n}(u)$. This implies that the series
\[
\frac{u^{2n}}{u+1/2}\,s^{\pi'}_{n,-n}(u) = -\frac{1}{2}\,[F_{n,-n}, Z_{n,n}(u)] \tag{5.16}
\]
is a polynomial in $u$, and by the symmetry relation (3.9) this polynomial is even. By (5.6) it is of degree $n-1$ in $u^2$ with the highest coefficient $F_{n,-n}$. Moreover, we see from (5.6) that $Z_{n,n}(-g_i) = -z_{ni}\,z_{-n,-i}$, $i = -n+1,\dots,n-1$. (If the polynomials $Z_{ij}(u)$ are evaluated in $R(\mathfrak{h})$, we assume, to avoid an ambiguity, that they are written in such a way that the elements of $Z(\mathfrak{g}_n,\mathfrak{g}_{n-1})$ are to the left from coefficients belonging to $R(\mathfrak{h})(u)$, as appears in their definition.) Hence, using (2.5) we obtain
\[
[F_{n,-n}, Z_{n,n}(-g_i)] = -2\,z_{ni}\,z_{n,-i}. \tag{5.17}
\]
Applying the Lagrange interpolation formula to the polynomial (5.16) by using its values at the $n-1$ points $-g_i$, $i = 1,\dots,n-1$ we prove that
\[
-\frac{1}{2}\,[F_{n,-n}, Z_{n,n}(u)] = Z_{n,-n}(u). \tag{5.18}
\]
Similarly, replacing $F_{n,-n}$ by $F_{-n,n}$ in the above argument we get the formula for the image of $s_{-n,n}(u)$ under $\pi'$. Finally, to find the image of $s_{-n,-n}(u)$ we use the following formula implied by (5.1):
\[
[F_{n,-n}, [F_{-n,n}, s^{\pi}_{n,n}(u)]] = 2\,s^{\pi}_{n,n}(u) - 2\,s^{\pi}_{-n,-n}(u).
\]
The weight subspace $V(\lambda)^+_\mu$ (see Sect. 1) is nonzero if and only if
\[
\lambda_{i+1}\le\mu_i\le\lambda_{i-1} \quad\text{for } i = 1,\dots,n-1, \qquad (\lambda_0 := 0), \tag{5.19}
\]
see (1.4). We assume that these conditions are satisfied.

Theorem 5.2. We have an isomorphism of $Y^-(2)$-modules:
\[
V(\lambda)^+_\mu\simeq L(\alpha_1,\beta_1)\otimes\cdots\otimes L(\alpha_n,\beta_n), \tag{5.20}
\]
where
\[
\alpha_1 = -1/2; \qquad \alpha_i = \min\{\lambda_{i-1},\mu_{i-1}\} - i + 1/2, \quad i = 2,\dots,n,
\]
\[
\beta_i = \max\{\lambda_i,\mu_i\} - i + 1/2, \quad i = 1,\dots,n-1, \qquad \beta_n = \lambda_n - n + 1/2. \tag{5.21}
\]
In particular, $V(\lambda)^+_\mu$ is a $Y(2)$-module.
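The parameters (5.21) are easy to tabulate for a concrete pair $(\lambda,\mu)$. The Python sketch below is an illustration only, with hypothetical function names: it computes $(\alpha_i,\beta_i)$ from (5.21), the dimension $\prod_i(\alpha_i-\beta_i+1)$ of the tensor product (5.20) via $\dim L(\alpha,\beta)=\alpha-\beta+1$, and, as a cross-check, enumerates the tuples $\gamma$ allowed by (4.8). It assumes integer entries for $\lambda$ and $\mu$ satisfying (5.19).

```python
from fractions import Fraction
from itertools import product
from math import prod

HALF = Fraction(1, 2)


def satisfies_5_19(lam, mu):
    """Check lam_{i+1} <= mu_i <= lam_{i-1} for i = 1..n-1, with lam_0 := 0."""
    ext = [0] + list(lam)          # ext[i] = lam_i, ext[0] = lam_0 = 0
    return all(ext[i + 1] <= mu[i - 1] <= ext[i - 1] for i in range(1, len(lam)))


def alpha_beta(lam, mu):
    """Parameters (5.21) for lam = (lam_1..lam_n) and mu = (mu_1..mu_{n-1})."""
    n = len(lam)
    alpha = [-HALF] + [min(lam[i - 2], mu[i - 2]) - i + HALF for i in range(2, n + 1)]
    beta = [max(lam[i - 1], mu[i - 1]) - i + HALF for i in range(1, n)]
    beta.append(lam[n - 1] - n + HALF)
    return alpha, beta


def dimension(lam, mu):
    """Dimension of L(alpha_1, beta_1) x ... x L(alpha_n, beta_n)."""
    alpha, beta = alpha_beta(lam, mu)
    return prod(int(a - b) + 1 for a, b in zip(alpha, beta))


def gamma_tuples(lam, mu):
    """All gamma with alpha_i - gamma_i and gamma_i - beta_i in Z_+, as in (4.8)."""
    alpha, beta = alpha_beta(lam, mu)
    ranges = [[b + k for k in range(int(a - b) + 1)] for a, b in zip(alpha, beta)]
    return list(product(*ranges))
```

For $n=2$, $\lambda=(-1,-3)$, $\mu=(-2)$ this gives $\alpha=(-1/2,-7/2)$, $\beta=(-3/2,-9/2)$ and dimension 4, matching the four admissible tuples $\gamma$.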
Proof.² Let us consider the following vector $\eta_\mu$ in $V(\lambda)^+_\mu$:
\[
\eta_\mu = z_{n1'}^{\lambda_{1'}-\mu_{1'}}\cdots z_{n(n-1)'}^{\lambda_{(n-1)'}-\mu_{(n-1)'}}\;\xi, \tag{5.22}
\]
where
\[
a' = \begin{cases} a, & \text{if } \lambda_a-\mu_a\ge 0,\\ -a, & \text{if } \lambda_a-\mu_a<0\end{cases}
\]
(recall that $\lambda_{-a} = -\lambda_a$, $\mu_{-a} = -\mu_a$). Let us verify that the vector $\eta_\mu$ is nonzero and satisfies the relations
\[
Z_{-n,n}(u)\,\eta_\mu = 0 \tag{5.23}
\]
and
\[
Z_{nn}(u)\,\eta_\mu = (u-\alpha_2)\cdots(u-\alpha_n)(u+\beta_1)\cdots(u+\beta_n)\,\eta_\mu. \tag{5.24}
\]
Using (2.6) and (2.7) we easily obtain by induction on the degree of the monomial (5.22) that
\[
z_{-i',n}\,\eta_\mu = 0, \qquad i = 1,\dots,n-1. \tag{5.25}
\]
Let us verify that for $i = 1,\dots,n-1$,
\[
z_{in}\,\eta_\mu = -(m_i+\alpha_2)\cdots(m_i+\alpha_n)(m_i-\beta_1)\cdots(m_i-\beta_n)\,\eta_{\mu+\delta_i}, \tag{5.26}
\]
if $\lambda_i\ge\mu_i$; and
\[
z_{-i,n}\,\eta_\mu = -(m_i-\alpha_2-1)\cdots(m_i-\alpha_n-1)(m_i+\beta_1-1)\cdots(m_i+\beta_n-1)\,\eta_{\mu-\delta_i}, \tag{5.27}
\]
if $\lambda_i\le\mu_i$; here we have set $m_i = \mu_i - i + 1/2$.

We shall prove (5.24), (5.26) and (5.27) simultaneously by induction on the degree of the monomial (5.22). If the degree is zero all the relations are obvious. If $\lambda_i = \mu_i$ then both (5.26) and (5.27) hold by (5.25) because in this case $\beta_i = \alpha_{i+1}+1 = m_i$. Suppose now that $\lambda_i>\mu_i$. By (2.6) we can write $z_{in}\,\eta_\mu = z_{in}z_{ni}\,\eta_{\mu+\delta_i}$. Formula (5.7) gives $z_{in}z_{ni} = -Z_{-n,-n}(-g_{-i})$. However, $(-g_{-i})\,\eta_{\mu+\delta_i} = m_i\,\eta_{\mu+\delta_i}$. Applying Theorem 5.1 and the symmetry relation (3.9) we obtain
\[
Z_{-n,-n}(m_i) = \Big(\frac{1}{2m_i}-1\Big)Z_{n,n}(-m_i) - \frac{1}{2m_i}\,Z_{n,n}(m_i).
\]
Using the induction hypotheses we can find $Z_{n,n}(\pm m_i)\,\eta_{\mu+\delta_i}$ by (5.24). Since $m_i = \alpha_{i+1}$ we have $Z_{n,n}(m_i)\,\eta_{\mu+\delta_i} = 0$ and (5.26) follows. The same argument can be applied to prove (5.27). Here we use the relation $z_{-i,n}z_{n,-i} = Z_{-n,-n}(-g_i)$ implied by (5.7).

To prove (5.24) it suffices to check this relation for $2n-1$ different values of $u$ because $Z_{n,n}(u)$ is a monic polynomial in $u$ of degree $2n-1$. For these take the eigenvalues of $-g_i$ with $i = -n+1,\dots,n$ on the vector $\eta_\mu$ and then use (5.6), (5.26) and (5.27).

² Theorem 5.2 was announced in [Mo1].
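The induction step for (5.26) comes down to a scalar identity: $-Z_{-n,-n}(m_i)$, evaluated on $\eta_{\mu+\delta_i}$ through the relation between $Z_{-n,-n}$ and $Z_{n,n}$ and the eigenvalue (5.24) for $\mu+\delta_i$, must reproduce the coefficient in (5.26) written with the parameters (5.21) of $\mu$. The Python sketch below checks this with exact fractions for sample weights. The function names are hypothetical, the index `i` is 1-based as in the text, and only the case $\lambda_i>\mu_i$ is covered.

```python
from fractions import Fraction
from math import prod

HALF = Fraction(1, 2)


def alpha_beta(lam, mu):
    """Parameters (5.21) for lam = (lam_1..lam_n) and mu = (mu_1..mu_{n-1})."""
    n = len(lam)
    alpha = [-HALF] + [min(lam[i - 2], mu[i - 2]) - i + HALF for i in range(2, n + 1)]
    beta = [max(lam[i - 1], mu[i - 1]) - i + HALF for i in range(1, n)]
    beta.append(lam[n - 1] - n + HALF)
    return alpha, beta


def z_nn_eigenvalue(lam, mu, u):
    """Eigenvalue (5.24): (u - alpha_2)...(u - alpha_n)(u + beta_1)...(u + beta_n)."""
    alpha, beta = alpha_beta(lam, mu)
    return prod(u - a for a in alpha[1:]) * prod(u + b for b in beta)


def coefficient_via_symmetry(lam, mu, i):
    """-Z_{-n,-n}(m_i) on eta_{mu + delta_i}, expressed through Z_{n,n}(-m_i) and Z_{n,n}(m_i)."""
    if not lam[i - 1] > mu[i - 1]:
        raise ValueError("this sketch assumes lambda_i > mu_i")
    m = mu[i - 1] - i + HALF
    mu_plus = list(mu)
    mu_plus[i - 1] += 1            # the weight mu + delta_i
    z_minus = ((1 / (2 * m) - 1) * z_nn_eigenvalue(lam, mu_plus, -m)
               - 1 / (2 * m) * z_nn_eigenvalue(lam, mu_plus, m))
    return -z_minus


def coefficient_5_26(lam, mu, i):
    """Coefficient in (5.26) computed directly from the parameters of mu."""
    alpha, beta = alpha_beta(lam, mu)
    m = mu[i - 1] - i + HALF
    return -prod(m + a for a in alpha[1:]) * prod(m - b for b in beta)
```

For $\lambda=(-1,-3)$, $\mu=(-2)$, $i=1$ both functions return $-12$; the term with $Z_{n,n}(m_i)$ drops out because $m_i$ equals $\alpha_{i+1}$ computed for $\mu+\delta_i$.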
Relation (5.23) follows from (2.5) and (5.25)–(5.27). The fact that $\eta_\mu\ne 0$ is implied by (5.26) and (5.27). Indeed, applying appropriate operators $z_{in}$ or $z_{-i,n}$ to the vector $\eta_\mu$ repeatedly we can obtain the highest vector $\xi$ of $V(\lambda)$ with a nonzero coefficient. Finally, using formulas (5.8) we deduce from (5.23) and (5.24) that $\eta_\mu$ is the highest vector of the representation $V(\lambda)^+_\mu$ of the algebra $Y^-(2)$, and the highest weight is given by
\[
\mu(u) = (1-\alpha_1u^{-1})\cdots(1-\alpha_nu^{-1})(1+\beta_1u^{-1})\cdots(1+\beta_nu^{-1}).
\]
On the other hand, Proposition 4.1 implies that the tensor product in (5.20) is an irreducible representation of $Y^-(2)$ because the strings
\[
S(\alpha_1,\beta_1),\dots,S(\alpha_n,\beta_n),\ S(-\beta_1,-\alpha_1),\dots,S(-\beta_n,-\alpha_n)
\]
are pairwise in general position. Therefore, it is isomorphic to the highest weight representation $V(\mu(u))\simeq V(\lambda)^+_\mu$; see (4.5).

The branching rule for representations of the symplectic Lie algebras [Z1] follows immediately from Theorem 5.2.

Corollary 5.3. The restriction of $V(\lambda)$ to the subalgebra $\mathfrak{g}_{n-1}$ is isomorphic to the direct sum (1.3) of irreducible finite-dimensional representations $V'(\mu)$ of $\mathfrak{g}_{n-1}$, where the multiplicity $c(\mu)$ equals the number of $n$-tuples of integers $\nu$ satisfying the inequalities (1.4).

Proof. Suppose first that the $\mathfrak{g}_{n-1}$ highest weight $\mu$ satisfies the conditions (5.19). In this case the multiplicity $c(\mu)$ coincides with the dimension of the space $V(\lambda)^+_\mu$. By Theorem 5.2,
\[
\dim V(\lambda)^+_\mu = \prod_{i=1}^{n}(\alpha_i-\beta_i+1),
\]
which is equal to the number of solutions of the inequalities (1.4). To complete the proof we check that the dimensions of the spaces on both sides of (1.3), with $\mu$ satisfying (5.19), coincide. For this we use the formulas for degrees of irreducible characters given in [W, Chapter VII, Sect. 8].

6. Construction of the Basis in $V(\lambda)$

In this section we complete the proof of Theorem 1.1. We first construct a basis in the space $V(\lambda)^+_\mu$. A basis in $V(\lambda)$ will then be obtained by induction with the use of the branching rule (1.3).

Note that by (2.12), (5.6) and (5.18) we have $Z_{n,-n}(g_n) = z_{n,-n}$ and so the polynomial $Z_{n,-n}(u)$ can be written as follows:
\[
Z_{n,-n}(u) = \sum_{i=1}^{n} z_{ni}\,z_{n,-i}\prod_{a=1,\,a\ne i}^{n}\frac{u^2-g_a^2}{g_i^2-g_a^2},
\]
where, as before, $z_{nn} = 1$ and $g_i = F_{ii} - i + 1/2$. We shall use the notation
\[
l_i = \lambda_i - i + 1/2, \qquad \gamma_i = \nu_i - i + 1/2, \qquad m_i = \mu_i - i + 1/2 \tag{6.1}
\]
with $i$ ranging over $\{1,\dots,n\}$ or $\{1,\dots,n-1\}$ respectively. Given $\nu$ satisfying the inequalities (1.4) consider the vector
\[
\xi_\nu = \prod_{i=1}^{n-1} z_{ni}^{\nu_i-\mu_i}\,z_{n,-i}^{\nu_i-\lambda_i}\cdot\prod_{k=l_n}^{\gamma_n-1} Z_{n,-n}(k)\;\xi, \tag{6.2}
\]
where the operators $z_{nj}$ and $Z_{n,-n}(u)$ are defined in (2.3) and (5.4). Here the action of elements of $R(\mathfrak{h})$ is obtained by the extension from that of $U(\mathfrak{h})$. The action is well-defined for those elements whose denominators are not zero operators. One easily checks that the vector $\xi_\nu$ is well-defined. By definition of the algebra $Z(\mathfrak{g}_n,\mathfrak{g}_{n-1})$ we have $\xi_\nu\in V(\lambda)^+$. Moreover, $\xi_\nu$ is clearly of $\mathfrak{g}_{n-1}$-weight $\mu$.

Proposition 6.1. The vectors $\xi_\nu$ with $\nu$ satisfying the inequalities (1.4) form a basis in the space $V(\lambda)^+_\mu$.

Proof. We use Theorem 5.2. The strings $S(\alpha_i,\beta_i)$ obviously satisfy the condition (4.10). By Proposition 4.2 the vectors $\eta_\gamma$ defined by (4.9) with the $n$-tuple $\gamma = (\gamma_i)$ given in (6.1) form a basis in $V(\lambda)^+_\mu$. On the other hand, the operator $S^{\natural}_{n,-n}(u)$ coincides with $Z_{n,-n}(u)$ by Theorem 5.1. That is, the vectors
\[
\eta_\gamma = \prod_{i=1}^{n} Z_{n,-n}(\gamma_i-1)\cdots Z_{n,-n}(\beta_i+1)\,Z_{n,-n}(\beta_i)\;\eta_\mu, \tag{6.3}
\]
with $\eta_\mu$ defined in (5.22) form a basis in $V(\lambda)^+_\mu$. Let us show that for each $\nu$ satisfying (1.4) we have the equality of corresponding vectors:
\[
\eta_\gamma = \xi_\nu. \tag{6.4}
\]
Note first that for any $i = -n+1,\dots,n-1$ and any value $u\in\mathbb{C}$ one has
\[
[Z_{n,-n}(u), z_{ni}] = 0. \tag{6.5}
\]
Indeed, by (5.1) and (5.8) we have $[Z_{n,-n}(u), F_{ni}] = 0$. It remains to apply the extremal projection $p = p_{n-1}$ and use the fact that $Z_{n,-n}(u)$ commutes with $\mathfrak{g}_{n-1}$.

Let $b_1,\dots,b_k$ be all the indices $a$ among $1,\dots,n-1$ for which the difference $\lambda_a-\mu_a$ is positive and let $c_1,\dots,c_l$ be the remaining indices; $k+l = n-1$. Using (2.6) and (6.5) rewrite the vector (6.3) as follows:
\[
\eta_\gamma = \prod_{i=1}^{k} z_{nb_i}^{\lambda_{b_i}-\mu_{b_i}}\cdot\prod_{i=1}^{n} Z_{n,-n}(\gamma_i-1)\cdots Z_{n,-n}(\beta_i)\cdot\prod_{i=1}^{l} z_{n,-c_i}^{\mu_{c_i}-\lambda_{c_i}}\;\xi. \tag{6.6}
\]
Further, by (5.4) we have $Z_{n,-n}(g_i) = z_{ni}\,z_{n,-i}$, $i = 1,\dots,n-1$. However,
\[
g_i\prod_{j=1}^{l} z_{n,-c_j}^{\mu_{c_j}-\lambda_{c_j}}\;\xi = \beta_i\prod_{j=1}^{l} z_{n,-c_j}^{\mu_{c_j}-\lambda_{c_j}}\;\xi.
\]
Therefore, given $i\in\{1,\dots,n-1\}$ the operator $Z_{n,-n}(\beta_i)$ in (6.6) can be replaced with $z_{ni}\,z_{n,-i}$. Moving the $z_{ni}$ to the left, permuting it with the operators of the form $Z_{n,-n}(u)$, we represent the vector again in the form (6.6). Proceeding by induction we shall get the expression for the vector $\eta_\gamma$ which coincides with (6.2).
Given a pattern $\Lambda$ associated with $\lambda$ (see Sect. 1) define the vector $\xi_\Lambda$ by the formula
\[
\xi_\Lambda = \prod_{k=1,\dots,n}^{\rightarrow}\Bigg(\prod_{i=1}^{k-1} z_{ki}^{\lambda'_{ki}-\lambda_{k-1,i}}\,z_{k,-i}^{\lambda'_{ki}-\lambda_{ki}}\cdot\prod_{a=l_{kk}}^{l'_{kk}-1} Z_{k,-k}(a+1/2)\Bigg)\,\xi.
\]
Here the $z_{kj}$ and $Z_{k,-k}(u)$ are elements of the Mickelsson algebra $Z(\mathfrak{g}_k,\mathfrak{g}_{k-1})$, and the numbers $l_{kk}$, $l'_{kk}$ are defined in (1.5). The branching rule (1.3) and Proposition 6.1 immediately imply the following.

Proposition 6.2. The vectors $\xi_\Lambda$ where $\Lambda$ runs over all patterns associated with $\lambda$ form a basis in the representation $V(\lambda)$ of $\mathfrak{g}_n$.

Our next task is to calculate the matrix elements of the generators $F_{kk}$, $F_{k,-k}$, $F_{-k,k}$, $F_{k-1,-k}$ of the Lie algebra $\mathfrak{g}_n$ in the basis $\{\xi_\Lambda\}$. Note that the elements $F_{kk}$, $F_{k,-k}$, $F_{-k,k}$ belong to the centralizer of the subalgebra $\mathfrak{g}_{k-1}$ in $U(\mathfrak{g}_k)$. Therefore, these operators preserve the subspace of $\mathfrak{g}_{k-1}$ highest vectors in $V(\lambda)$. So, it suffices to compute the action of these operators with $k = n$ in the basis $\{\xi_\nu\}$ of the space $V(\lambda)^+_\mu$; see Proposition 6.1. For $F_{nn}$ we immediately get
\[
F_{nn}\,\xi_\nu = \Bigg(2\sum_{i=1}^{n}\nu_i - \sum_{i=1}^{n}\lambda_i - \sum_{i=1}^{n-1}\mu_i\Bigg)\,\xi_\nu. \tag{6.7}
\]
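Formula (6.7) can be tabulated for a small example. In the Python sketch below (hypothetical names) the admissible ranges for $\nu$ are read off from (4.8), (5.21) and (6.1), namely $\max\{\lambda_i,\mu_i\}\le\nu_i\le\min\{\lambda_{i-1},\mu_{i-1}\}$ with upper bound 0 for $i=1$ and lower bound $\lambda_n$ for $i=n$. This is an assumption standing in for the inequalities (1.4), which are stated in Sect. 1.

```python
from itertools import product


def nu_ranges(lam, mu):
    """Integer ranges for nu_1..nu_n derived from (4.8), (5.21) and (6.1)."""
    n = len(lam)
    lower = [max(lam[i], mu[i]) for i in range(n - 1)] + [lam[n - 1]]
    upper = [0] + [min(lam[i], mu[i]) for i in range(n - 1)]
    return [range(lo, hi + 1) for lo, hi in zip(lower, upper)]


def f_nn_eigenvalue(lam, mu, nu):
    """Eigenvalue in (6.7): 2 * sum(nu) - sum(lam) - sum(mu)."""
    return 2 * sum(nu) - sum(lam) - sum(mu)


def f_nn_spectrum(lam, mu):
    """Sorted list of F_nn eigenvalues over all admissible nu."""
    return sorted(f_nn_eigenvalue(lam, mu, nu) for nu in product(*nu_ranges(lam, mu)))
```

For $\lambda=(-1,-3)$, $\mu=(-2)$ the eigenvalues are $-2, 0, 0, 2$. The list is symmetric about zero, as one would expect since $F_{nn}$ is proportional to the commutator $[F_{n,-n},F_{-n,n}]$ and both of these elements preserve $V(\lambda)^+_\mu$.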
Further, by (6.3) and (6.4), $Z_{n,-n}(\gamma_i)\,\xi_\nu = \xi_{\nu+\delta_i}$, $i = 1,\dots,n$. However, $Z_{n,-n}(u)$ is a polynomial in $u^2$ of degree $n-1$ with the highest coefficient $F_{n,-n}$; see (5.4). Applying the Lagrange interpolation formula with the interpolation points $\gamma_i$, $i = 1,\dots,n$ we obtain
\[
Z_{n,-n}(u)\,\xi_\nu = \sum_{i=1}^{n}\prod_{a=1,\,a\ne i}^{n}\frac{u^2-\gamma_a^2}{\gamma_i^2-\gamma_a^2}\;\xi_{\nu+\delta_i}.
\]
Taking here the coefficient at $u^{2n-2}$ we get
\[
F_{n,-n}\,\xi_\nu = \sum_{i=1}^{n}\prod_{a=1,\,a\ne i}^{n}\frac{1}{\gamma_i^2-\gamma_a^2}\;\xi_{\nu+\delta_i}. \tag{6.8}
\]
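The passage from the interpolation formula to (6.8) uses only the fact that the Lagrange basis polynomial $\prod_{a\ne i}(x-x_a)/(x_i-x_a)$ in $x=u^2$ has top coefficient $\prod_{a\ne i}1/(x_i-x_a)$. A minimal Python sketch with exact fractions makes this explicit. The helper names are hypothetical, and the nodes $\gamma_a^2$ are assumed to be pairwise distinct.

```python
from fractions import Fraction
from math import prod


def poly_mul_linear(coeffs, root):
    """Multiply a polynomial (coefficients in increasing degree) by (x - root)."""
    out = [Fraction(0)] * (len(coeffs) + 1)
    for k, c in enumerate(coeffs):
        out[k + 1] += c
        out[k] -= root * c
    return out


def lagrange_basis(nodes, i):
    """Coefficients of prod_{a != i} (x - x_a) / (x_i - x_a), in increasing degree."""
    coeffs = [Fraction(1)]
    for a, xa in enumerate(nodes):
        if a != i:
            coeffs = poly_mul_linear(coeffs, xa)
            coeffs = [c / (nodes[i] - xa) for c in coeffs]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate a polynomial given by its coefficients in increasing degree."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def coefficients_6_8(gammas):
    """Coefficients prod_{a != i} 1 / (gamma_i^2 - gamma_a^2) appearing in (6.8)."""
    return [Fraction(1) / prod(gi * gi - ga * ga for a, ga in enumerate(gammas) if a != i)
            for i, gi in enumerate(gammas)]
```

Taking the last entry of `lagrange_basis(nodes, i)` with `nodes = [g * g for g in gammas]` reproduces `coefficients_6_8(gammas)[i]`; for $\gamma=(-1/2,-5/2,-9/2)$ the first coefficient is $1/120$.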
Similarly, we see from (5.5) that $Z_{-n,n}(u)$ is a polynomial in $u^2$ of degree $n-1$ with the highest coefficient $F_{-n,n}$. Using the defining relations (3.1) we can write the following formula for the operator $S_{-n,n}(u)$ in $V(\lambda)^+_\mu$:
\[
S_{-n,n}(u) = \frac{(-1)^n(u+1/2)}{u}\,\big(T_{-n,n}(u)\,T_{-n,-n}(-u) - T_{-n,n}(-u)\,T_{-n,-n}(u)\big).
\]
Hence, by Theorem 5.1 we obtain the equality of operators in $V(\lambda)^+_\mu$:
\[
Z_{-n,n}(u) = \frac{(-1)^n}{u}\,\big(T_{-n,n}(u)\,T_{-n,-n}(-u) - T_{-n,n}(-u)\,T_{-n,-n}(u)\big).
\]
This implies that $F_{-n,n}$, as an operator in $V(\lambda)^+_\mu$, coincides with $2\,t^{(1)}_{-n,n}$, where $t^{(1)}_{-n,n}$ is the highest coefficient of the polynomial $T_{-n,n}(u)$ which has degree $n-1$; see (4.6). Using (4.13) we find that
\[
T_{-n,n}(-\gamma_i)\,\xi_\nu = -2\prod_{k=1}^{n}(\alpha_k-\gamma_i+1)(\beta_k-\gamma_i)\cdot\prod_{a=1,\,a\ne i}^{n}(-\gamma_i-\gamma_a+1)\;\xi_{\nu-\delta_i}.
\]
Note that by (5.21) we have
\[
\prod_{k=1}^{n}(\alpha_k-\gamma_i+1)(\beta_k-\gamma_i) = (1/2-\gamma_i)\prod_{k=1}^{n}(l_k-\gamma_i)\prod_{k=1}^{n-1}(m_k-\gamma_i);
\]
see (6.1). Applying the Lagrange interpolation formula to the polynomial $T_{-n,n}(u)$ with the interpolation points $-\gamma_i$, $i = 1,\dots,n$ and taking the coefficient at $u^{n-1}$ we finally obtain that
\[
F_{-n,n}\,\xi_\nu = 2\sum_{i=1}^{n}\frac{\prod_{a=1}^{n-1}(m_a-\gamma_i)\,\prod_{a=1}^{n}(l_a-\gamma_i)(\gamma_a+\gamma_i-1)}{\prod_{a=1,\,a\ne i}^{n}(\gamma_i-\gamma_a)}\;\xi_{\nu-\delta_i}. \tag{6.9}
\]
To compute the action of the elements $F_{k-1,-k}$ we may only consider the case $k = n$. The operator $F_{n-1,-n}$ preserves the subspace of $\mathfrak{g}_{n-2}$ highest vectors in $V(\lambda)$. Therefore it suffices to calculate its action on the basis vectors of the form
\[
\xi_{\nu\mu\nu'} = \prod_{i=1}^{n-2} z_{n-1,i}^{\nu'_i-\mu'_i}\,z_{n-1,-i}^{\nu'_i-\mu_i}\cdot\prod_{a=m_{n-1}}^{\gamma'_{n-1}-1} Z_{n-1,-n+1}(a)\;\xi_{\nu\mu},
\]
where $\xi_{\nu\mu} = \xi_\nu$ is defined in (6.2), $\mu'$ is a fixed $\mathfrak{g}_{n-2}$ highest weight, $\nu'$ is an $(n-1)$-tuple of integers such that the inequalities (1.4) are satisfied with $\lambda$, $\nu$, $\mu$ respectively replaced by $\mu$, $\nu'$, $\mu'$, and we set $\gamma'_i = \nu'_i - i + 1/2$. The operator $F_{n-1,-n}$ is permutable with the elements $z_{n-1,i}$ and $Z_{n-1,-n+1}(u)$ which follows from the explicit formulas (2.3) and (5.4). Hence, we can write
\[
F_{n-1,-n}\,\xi_{\nu\mu\nu'} = X_{\mu\nu'}\,F_{n-1,-n}\,\xi_{\nu\mu},
\]
where $X_{\mu\nu'}$ denotes the operator
\[
X_{\mu\nu'} = \prod_{i=1}^{n-2} z_{n-1,i}^{\nu'_i-\mu'_i}\,z_{n-1,-i}^{\nu'_i-\mu_i}\cdot\prod_{a=m_{n-1}}^{\gamma'_{n-1}-1} Z_{n-1,-n+1}(a).
\]
Let us now apply formula (2.13). We have
\[
f_a\,\xi_{\nu\mu} = (m_a-1/2)\,\xi_{\nu\mu}, \qquad a = 1,\dots,n-1; \tag{6.10}
\]
see (6.2). Further, for $i = 1,\dots,n-1$,
\[
X_{\mu\nu'}\,z_{n-1,-i}\,z_{ni}\,\xi_{\nu\mu} = \xi_{\nu,\mu-\delta_i,\nu'}. \tag{6.11}
\]
Indeed, by (6.2) $z_{ni}\,\xi_{\nu\mu} = \xi_{\nu,\mu-\delta_i}$, and $X_{\mu\nu'}\,z_{n-1,-i} = X_{\mu-\delta_i,\nu'}$ for $i<n-1$, where we have used (2.6) and (6.5). Let us verify that the latter formula holds for $i = n-1$ as well. By (2.12) and (5.4) we can write $z_{n-1,-n+1} = Z_{n-1,-n+1}(g_{n-1})$. However, $g_{n-1}\,\xi_{\nu,\mu-\delta_{n-1}} = (m_{n-1}-1)\,\xi_{\nu,\mu-\delta_{n-1}}$, and so
\[
X_{\mu\nu'}\,z_{n-1,-n+1} = X_{\mu\nu'}\,Z_{n-1,-n+1}(m_{n-1}-1) = X_{\mu-\delta_{n-1},\nu'},
\]
as desired.

Finally, for $j = 1,\dots,n-1$ consider the expression
\[
X_{\mu\nu'}\,z_{n-1,j}\,z_{n,-j}\,\xi_{\nu\mu}. \tag{6.12}
\]
First transform the vector $z_{n,-j}\,\xi_\nu$. The calculation is trivial if $\nu_j = \mu_j$, so we shall assume that $\nu_j-\mu_j\ge 1$. We have $z_{n,-j}\,\xi_{\nu\mu} = z_{n,-j}\,z_{nj}\,\xi_{\nu,\mu+\delta_j}$. By (5.17) and (5.18) we have $z_{n,-j}\,z_{nj} = Z_{n,-n}(g_j-1)$. However, $(g_j-1)\,\xi_{\nu,\mu+\delta_j} = m_j\,\xi_{\nu,\mu+\delta_j}$. To calculate $Z_{n,-n}(m_j)\,\xi_{\nu,\mu+\delta_j}$ we apply again the Lagrange interpolation formula (cf. the proof of (6.8)) for the polynomial $Z_{n,-n}(u)$ at the interpolation points $\gamma_i$, $i = 1,\dots,n$ and then put $u = m_j$. The result is
\[
Z_{n,-n}(m_j)\,\xi_{\nu,\mu+\delta_j} = \sum_{i=1}^{n}\prod_{a=1,\,a\ne i}^{n}\frac{m_j^2-\gamma_a^2}{\gamma_i^2-\gamma_a^2}\;\xi_{\nu+\delta_i,\mu+\delta_j}. \tag{6.13}
\]
Let us now transform the operator $X_{\mu\nu'}\,z_{n-1,j}$, $j<n-1$. Here the calculation is very similar to the previous one. We shall assume that $\nu'_j-\mu_j\ge 1$. We have $X_{\mu\nu'}\,z_{n-1,j} = X_{\mu+\delta_j,\nu'}\,z_{n-1,-j}\,z_{n-1,j}$. Using (5.17) and (5.18) again, we can write $z_{n-1,-j}\,z_{n-1,j} = Z_{n-1,-n+1}(g_j-1)$. We have $(g_j-1)\,\xi_{\nu+\delta_i,\mu+\delta_j} = m_j\,\xi_{\nu+\delta_i,\mu+\delta_j}$. Exactly as above, we use the Lagrange interpolation formula for the polynomial $Z_{n-1,-n+1}(u)$ with the interpolation points $\gamma'_r$, $r = 1,\dots,n-1$ and then put $u = m_j$. This gives
\[
X_{\mu+\delta_j,\nu'}\,Z_{n-1,-n+1}(m_j) = \sum_{r=1}^{n-1}\prod_{a=1,\,a\ne r}^{n-1}\frac{m_j^2-{\gamma'_a}^2}{{\gamma'_r}^2-{\gamma'_a}^2}\;X_{\mu+\delta_j,\nu'+\delta_r}. \tag{6.14}
\]
In the case $j = n-1$ we write $X_{\mu\nu'} = X_{\mu+\delta_{n-1},\nu'}\,Z_{n-1,-n+1}(m_{n-1})$ and (6.14) holds for this case as well. Combining (6.10)–(6.14) we obtain from (2.13)
\[
F_{n-1,-n}\,\xi_{\nu\mu\nu'} = \sum_{i=1}^{n-1}\frac{1}{2m_i-1}\prod_{a=1,\,a\ne i}^{n-1}\frac{1}{(m_i-m_a)(m_i+m_a-1)}\;\xi_{\nu,\mu-\delta_i,\nu'}
\]
\[
+ \sum_{i=1}^{n}\sum_{j,r=1}^{n-1}\frac{1}{2m_j-1}\prod_{a=1,\,a\ne j}^{n-1}\frac{1}{(m_j-m_a)(m_j+m_a-1)}\prod_{a=1,\,a\ne i}^{n}\frac{m_j^2-\gamma_a^2}{\gamma_i^2-\gamma_a^2}\prod_{a=1,\,a\ne r}^{n-1}\frac{m_j^2-{\gamma'_a}^2}{{\gamma'_r}^2-{\gamma'_a}^2}\;\xi_{\nu+\delta_i,\mu+\delta_j,\nu'+\delta_r}. \tag{6.15}
\]
To complete the proof of Theorem 1.1 we rewrite the formulas (6.7)–(6.9) and (6.15) in the notation related to the patterns $\Lambda$ (see Sect. 1) to get the matrix elements of the generators $F_{kk}$, $F_{k,-k}$, $F_{-k,k}$, $F_{k-1,-k}$ in the basis $\{\xi_\Lambda\}$ of $V(\lambda)$. The formulas of Theorem 1.1 are given in the normalized basis $\{\zeta_\Lambda\}$, where $\zeta_\Lambda = N_\Lambda\,\xi_\Lambda$, and
\[
N_\Lambda = \prod_{k=2}^{n}\ \prod_{1\le i<j\le k}(-l'_{ki}-l'_{kj}-1)!
\]
Theorem 1.1 is proved.

Remark 6.3. The space $V(\lambda)$ is equipped with the standard inner product $\langle\ ,\ \rangle$ such that for all $i, j$ the adjoint operator to $F_{ij}$ is $F_{ji}$. The basis $\{\zeta_\Lambda\}$ is not orthogonal with respect to this inner product. A possible way to get an orthogonal basis is to find the eigenvectors in $V(\lambda)^+_\mu$ for a commutative subalgebra in $Y^-(2)$. Such a subalgebra is generated by the coefficients of the series $s_{-1,-1}(u) + s_{1,1}(u) = \mathrm{tr}\,S(u)$.

Acknowledgement. This project was initiated in collaboration with G. Olshanski to whom I would like to express my deep gratitude. I would like to thank M. Nazarov, V. Tolstoy and D. P. Zhelobenko for useful remarks and discussions.
References

[AST1] Asherova, R. M., Smirnov, Yu. F., Tolstoy, V. N.: Projection operators for simple Lie groups. Theor. Math. Phys. 8, 813–825 (1971)
[AST2] Asherova, R. M., Smirnov, Yu. F., Tolstoy, V. N.: Projection operators for simple Lie groups. II. General scheme for constructing lowering operators. The groups SU(n). Theor. Math. Phys. 15, 392–401 (1973)
[AST3] Asherova, R. M., Smirnov, Yu. F., Tolstoy, V. N.: Description of a certain class of projection operators for complex semisimple Lie algebras. Math. Notes 26, no. 1–2, 499–504 (1979)
[B] Berele, A.: Construction of Sp-modules by tableaux. Linear and Multilinear Algebra 19, 299–307 (1986)
[Bi1] Bincer, A.: Missing label operators in the reduction Sp(2n) ↓ Sp(2n − 2). J. Math. Phys. 21, 671–674 (1980)
[Bi2] Bincer, A.: Mickelsson lowering operators for the symplectic group. J. Math. Phys. 23, 347–349 (1982)
[CP] Chari, V., Pressley, A.: Yangians and R-matrices. L’Enseign. Math. 36, 267–302 (1990)
[C] Cherednik, I. V.: A new interpretation of Gelfand–Tzetlin bases. Duke Math. J. 54, 563–577 (1987)
[CK] De Concini, C., Kazhdan, D.: Special bases for S_N and GL(n). Israel J. Math. 40, no. 3–4, 275–290 (1981)
[D] Dixmier, J.: Algèbres Enveloppantes. Paris: Gauthier-Villars, 1974
[Dr] Drinfeld, V. G.: A new realization of Yangians and quantized affine algebras. Soviet Math. Dokl. 36, 212–216 (1988)
[GKLLRT] Gelfand, I. M., Krob, D., Lascoux, A., Leclerc, B., Retakh, V. S., Thibon, J.-Y.: Noncommutative symmetric functions. Adv. Math. 112, 218–348 (1995)
[GT1] Gelfand, I. M., Tsetlin, M. L.: Finite-dimensional representations of the group of unimodular matrices. Dokl. Akad. Nauk SSSR 71, 825–828 (1950) (Russian). English transl. in: Gelfand, I. M. Collected papers. Vol II, Berlin: Springer-Verlag, 1988
[GT2] Gelfand, I. M., Tsetlin, M. L.: Finite-dimensional representations of groups of orthogonal matrices. Dokl. Akad. Nauk SSSR 71, 1017–1020 (1950) (Russian). English transl. in: Gelfand, I. M. Collected papers. Vol II, Berlin: Springer-Verlag, 1988
[GZ1] Gelfand, I. M., Zelevinsky, A.: Models of representations of classical groups and their hidden symmetries. Funct. Anal. Appl. 18, 183–198 (1984)
[GZ2] Gelfand, I. M., Zelevinsky, A.: Multiplicities and proper bases for gl_n. In: Group theoretical methods in physics. Vol. II, Yurmala 1985, Utrecht: VNU Sci. Press, 1986, pp. 147–159
[G1] Gould, M. D.: On the matrix elements of the U(n) generators. J. Math. Phys. 22, 15–22 (1981)
[G2] Gould, M. D.: Wigner coefficients for a semisimple Lie group and the matrix elements of the O(n) generators. J. Math. Phys. 22, 2376–2388 (1981)
[G3] Gould, M. D.: Representation theory of the symplectic groups. I. J. Math. Phys. 30, 1205–1218 (1989)
[GK] Gould, M. D., Kalnins, E. G.: A projection-based solution to the Sp(2n) state labeling problem. J. Math. Phys. 26, 1446–1457 (1985)
[H] Hegerfeldt, G. C.: Branching theorem for the symplectic groups. J. Math. Phys. 8, 1195–1196 (1967)
[Ho] Van den Hombergh, A.: A note on Mickelsson’s step algebra. Indag. Math. 37, no. 1, 42–47 (1975)
[Ka] Kashiwara, M.: Crystalizing the q-analogue of universal enveloping algebras. Commun. Math. Phys. 133, 249–260 (1990)
[Ki] King, R. C.: Weight multiplicities for the classical groups. In: Group theoretical methods in physics. Fourth Internat. Colloq., Nijmegen 1975. Lecture Notes in Phys., Vol. 50, Berlin: Springer, 1976, pp. 490–499
[KS] King, R. C., El-Sharkaway, N. G. I.: Standard Young tableaux and weight multiplicities of the classical Lie groups. J. Phys. A 16, 3153–3177 (1983)
[KW] King, R. C., Welsh, T. A.: Construction of orthogonal group modules using tableaux. Linear and Multilinear Algebra 33, 251–283 (1993)
[K] Kirillov, A. A.: A remark on the Gelfand-Tsetlin patterns for symplectic groups. J. Geom. Phys. 5, 473–482 (1988)
[KT] Koike, K., Terada, I.: Young-diagrammatic methods for the representation theory of the classical groups of type B_n, C_n, D_n. J. Algebra 107, 466–511 (1987)
[LMS] Lakshmibai, V., Musili, C., Seshadri, C. S.: Geometry of G/P. IV. Standard monomial theory for classical types. Proc. Indian Acad. Sci. Sect. A Math. Sci. 88, no. 4, 279–362 (1979)
[L] Littelmann, P.: An algorithm to compute bases and representation matrices for SL_{n+1} representations. J. Pure Appl. Algebra 117/118, 447–468 (1997)
[Lu] Lusztig, G.: Canonical bases arising from quantized enveloping algebras. J. Am. Math. Soc. 3, 447–498 (1990)
[M1] Mathieu, O.: Good bases for G-modules. Geom. Dedicata 36, 51–66 (1990)
[M2] Mathieu, O.: Bases des représentations des groupes simples complexes (d’après Kashiwara, Lusztig, Ringel et al.). Sémin. Bourbaki, Vol. 1990/91. Astérisque no. 201–203. Exp. no. 743, 421–442 (1992)
[Mi1] Mickelsson, J.: Lowering operators and the symplectic group. Rep. Math. Phys. 3, 193–199 (1972)
[Mi2] Mickelsson, J.: Step algebras of semi-simple subalgebras of Lie algebras. Rep. Math. Phys. 4, 307–318 (1973)
[Mo1] Molev, A. I.: Representations of twisted Yangians. Lett. Math. Phys. 26, 211–218 (1992)
[Mo2] Molev, A.: Gelfand–Tsetlin basis for representations of Yangians. Lett. Math. Phys. 30, 53–60 (1994)
[Mo3] Molev, A.: Sklyanin determinant, Laplace operators, and characteristic identities for classical Lie algebras. J. Math. Phys. 36, 923–943 (1995)
[Mo4] Molev, A. I.: Finite-dimensional irreducible representations of twisted Yangians. J. Math. Phys. 39, 5559–5600 (1998)
[MN] Molev, A., Nazarov, M.: Capelli identities for classical Lie algebras. Math. Annalen, to appear; q-alg/9712021
[MNO] Molev, A., Nazarov, M., Olshanski, G.: Yangians and classical Lie algebras. Russian Math. Surveys 51:2, 205–282 (1996)
[MO] Molev, A., Olshanski, G.: Centralizer construction for twisted Yangians. Preprint CMA 065-97, Austral. Nat. University, Canberra; q-alg/9712050
[NM] Nagel, J. G., Moshinsky, M.: Operators that lower or raise the irreducible vector spaces of U_{n−1} contained in an irreducible vector space of U_n. J. Math. Phys. 6, 682–694 (1965)
[NT1] Nazarov, M., Tarasov, V.: Yangians and Gelfand–Zetlin bases. Publ. RIMS, Kyoto Univ. 30, 459–478 (1994)
618
[NT2] [Ok] [O1] [O2]
[O3]
[PH] [P1] [P2] [RZ] [S] [T] [W] [Wo] [WY] [Z1] [Z2] [Z3] [Z4] [Z5] [Z6]
A. I. Molev
Nazarov, M., Tarasov,V.: Representations ofYangians with Gelfand–Zetlin bases. J. ReineAngew. Math. 496, 181–212 (1998) Okounkov, A.: Multiplicities and Newton polytopes. In: Olshanski, G. (ed.) Kirillov’s Seminar on Representation Theory. Am. Math. Soc. Transl. 181, Providence, RI: AMS, 1998, pp. 231–244 Olshanski, G. I.: Extension of the algebra U (g) for infinite-dimensional classical Lie algebras g, and the Yangians Y (gl(m)). Soviet Math. Dokl. 36, 569–573 (1988) Olshanski, G. I.: Representations of infinite-dimensional classical groups, limits of enveloping algebras, and Yangians. In: Kirillov, A. A. (ed.) Topics in Representation Theory. Advances in Soviet Math. 2, Providence, RI: AMS, 1991, pp. 1–66 Olshanski, G. I.: TwistedYangians and infinite-dimensional classical Lie algebras. In: Kulish, P. P. (ed.) Quantum Groups, Lecture Notes in Math. 1510, Berlin-Heidelberg: Springer, 1992, pp. 103–120 Pang, S. C., Hecht, K. T.: Lowering and raising operators for the orthogonal group in the chain O(n) ⊃ O(n − 1) ⊃ · · · , and their graphs. J. Math. Phys. 8, 1233–1251 (1967) Proctor, R.: Odd symplectic groups. Invent. Math. 92, 307–332 (1988) Proctor, R.:Young tableaux, Gelfand patterns, and branching rules for classical groups. J. Algebra 164, 299–360 (1994) Retakh, V., Zelevinsky, A.: Base affine space and canonical basis in irreducible representations of Sp(4). Dokl. Acad. Nauk USSR 300, 31–35 (1988) Shtepin, V. V.: Intermediate Lie algebras and their finite-dimensional representations. Russian Akad. Sci. Izv. Math. 43, 559–579 (1994) Tarasov, V. O.: Irreducible monodromy matrices for the R-matrix of the XXZ-model and lattice local quantum Hamiltonians. Theor. Math. Phys. 63, 440–454 (1985) Weyl, H.: Classical Groups, their Invariants and Representations. Princeton, NJ: Princeton Univ. Press, 1946 Wong, M. K. F.: Representations of the orthogonal group. I. Lowering and raising operators of the orthogonal group and matrix elements of the generators. J. Math. Phys. 
8, 1899–1911 (1967) Wong, M. K. F., Yeh, H.-Y.: The most degenerate irreducible representations of the symplectic group. J. Math. Phys. 21, 630–635 (1980) Zhelobenko, D. P.: The classical groups. Spectral analysis of their finite-dimensional representations. Russ. Math. Surv. 17, 1–94 (1962) Zhelobenko, D. P.: Compact Lie groups and their representations. Transl. of Math. Monographs 40 Providence, RI: AMS, 1973 Zhelobenko, D. P.: S-algebras and Verma modules over reductive Lie algebras. Soviet. Math. Dokl. 28, 696–700 (1983) Zhelobenko, D. P.: Z-algebras over reductive Lie algebras. Soviet. Math. Dokl. 28, 777–781 (1983) Zhelobenko, D. P.: On Gelfand–Zetlin bases for classical Lie algebras. In: Kirillov, A. A. (ed.) Representations of Lie groups and Lie algebras, Budapest: Akademiai Kiado: 1985, pp. 79–106 Zhelobenko, D. P.: Extremal projectors and generalized Mickelsson algebras on reductive Lie algebras. Math. USSR-Izv. 33, 85–100 (1989)
Communicated by T. Miwa
Commun. Math. Phys. 201, 619 – 655 (1999)
Communications in
Mathematical Physics © Springer-Verlag 1999
The Initial Boundary Value Problem for Einstein’s Vacuum Field Equation
Helmut Friedrich, Gabriel Nagy
Albert-Einstein-Institut, Max-Planck-Institut für Gravitationsphysik, Schlaatzweg 1, 14473 Potsdam, Germany
Received: 11 June 1998 / Accepted: 15 September 1998
Abstract: We study the initial boundary value problem for Einstein’s vacuum field equation. We prescribe initial data on an orientable, compact, 3-dimensional manifold $S$ with boundary $\Sigma \neq \emptyset$ and boundary conditions on the manifold $T = \mathbb{R}^+_0 \times \Sigma$. We assume the boundaries $\Sigma$ and $\{0\} \times \Sigma$ of $S$ and $T$ to be identified in the natural way. Furthermore, we prescribe certain gauge source functions which determine the evolution of the fields. Provided that all data are smooth and certain consistency conditions are met on $\Sigma$, we show that there exists a smooth solution to Einstein’s equation $\mathrm{Ric}[g] = 0$ on a manifold which has (after an identification) a neighbourhood of $S$ in $T \cup S$ as a boundary. The solution is such that $S$ is space-like, the initial data are induced by the solution on $S$, and, in the region of $T$ where the solution is defined, $T$ is time-like and the boundary conditions are satisfied.
1. Introduction
In this article we study the initial boundary value problem for Einstein’s vacuum field equation. Let $S$ be a smooth, orientable 3-dimensional manifold with boundary $\Sigma \neq \emptyset$. The boundary of the manifold $M = \mathbb{R}^+_0 \times S$ consists then of $S \simeq \{0\} \times S$ and $T = \mathbb{R}^+_0 \times \Sigma$, which are identified along the edge $\Sigma \simeq \{0\} \times \Sigma$ of $M$. We are interested in answering the following question: Which data do we have to prescribe on $S$ and $T$ such that there exists a (unique) smooth solution $g$ of Einstein’s equation
\[ \mathrm{Ric}[g] = 0, \tag{1.1} \]
on M for which S is space-like, T is time-like and which is such that g induces the given data on S and T ? The answer to this question will be of potential interest in any problem concerned with solutions to (1.1) which contain a distinguished time-like hypersurface. It will provide possibilities to construct examples or counterexamples to various conjectures
and will give us tools to construct space-times with certain specified properties. We mention just a few such problems. The motion of ideal fluid bodies with exterior vacuum field is of considerable interest in general relativity but its analytical properties are not well understood. The free time-like boundary, along which the transition of the Einstein–Euler equations into the Einstein vacuum field equations occurs, poses analytical difficulties. Though this situation is different from the one considered in the present article, the study of the initial boundary value problem sheds some light on the problem of the floating fluid balls. Our interest in this problem was one of the reasons to analyse the field equations in this article in a representation which is close to the one considered in [6]. In [2] the modeling of isolated systems in terms of asymptotically flat fields has been criticized. It has been suggested to separate instead the (massive) system of interest by a judiciously chosen time-like cut from the rest of the universe and to study the space-time so obtained as an object of its own. Whether such an approach leads to useful notions characterizing the behaviour of the system as a whole (energy momentum, angular momentum, etc.) and, in particular, whether it allows us to introduce meaningful concepts of incoming/outgoing radiation, etc. requires the understanding of the initial boundary value problem which has de facto been introduced in [2] without ever mentioning it. In many numerical calculations in general relativity artificial time-like boundaries are introduced to restrict the calculations to finite grids (cf. [7] for possibilities to avoid such boundaries in certain relevant cases). A thorough understanding of the analytical features of the initial boundary value problem for Einstein’s equation should be a prerequisite for successful numerical calculations near the boundary. 
There are available in the literature various discussions of the Einstein equation in the neighbourhood of time-like boundaries (see e.g. [1, 10, 15]), but it appears that the existence of solutions to the initial boundary value problem for Einstein’s equation has not been discussed so far in any generality. A general study of the initial boundary value problem for Einstein’s equation with negative cosmological constant has been given in [4], but there the boundary data are prescribed on the conformal boundary at space-like and null infinity. Due to the fact that this boundary is defined intrinsically by the nature of the geometry, there occur certain simplifications which allow us to characterize the data on the boundary in a covariant way. In contrast to this, in the situation studied in this paper the boundary is not singled out by a geometric consideration but “is put in by hand”. This leads to various complications in the detailed analysis of the present problem. Nevertheless, some general ideas and some specific techniques developed in [4] apply to the present problem. It may be of interest to compare the methods and the results obtained in the present article with the completely different techniques for analysing the field near time-like boundaries used in [10]. This may give a deeper understanding of the problem and should shed light on the relative efficiencies of the different methods. The basic step in our study is to reduce the geometrical initial boundary value problem for Einstein’s equation to an initial boundary value problem for a hyperbolic system to which the general results on “maximally dissipative” initial boundary value problems (cf. Sect. 3) apply. A central difficulty here arises from the need to control the conservation of the constraints if the fields are evolved by a suitable hyperbolic system of reduced equations. 
In our treatment of the problem this means essentially that we have to show that those equations in the system (2.5) which contain only derivatives in space-like directions are preserved in the course of the evolution. This difficulty largely motivates our choice of the basic equations in Sect. 2, our choice of the gauge conditions in Sect. 4, and our choice of the reduced equations in Sect. 5. It is shown in Theorem 6.1 that our
reduced problem, to which certain general results available in the mathematical literature apply, yields in fact solutions to the Einstein equation. It is a most remarkable feature of Einstein’s equation that the nature of the boundary condition does not play any role in this conclusion. This is most important for us, since the way we prescribe the boundary conditions on $T$ does not allow us to check by direct calculations on $T$ whether any constraints on $T$, either the intrinsic constraints induced on $T$ or the constraints mentioned above, are satisfied on $T$. In Sect. 7 we discuss the initial and boundary data which can be prescribed freely. While the initial data are well known from the study of the Cauchy problem for Einstein’s equation and while it is also clear that the initial and the boundary data will have to satisfy certain consistency conditions on the edge $\Sigma$, the boundary conditions require a more careful study. In the local problem the boundary conditions are suggested by the nature of our reduced equations and by the theory of maximally dissipative boundary value problems. The question of how to prescribe boundary conditions in regions which cannot be handled solely in terms of one choice of gauge sheds sharp light on some peculiar features of our problem. It turns out that we need to specify, in an implicit form, a time-like unit vector field $e_0$ tangent to the boundary $T$. All other boundary conditions refer to this vector field in one way or another. The boundary hypersurface is essentially singled out (imagining our prospective solution for the moment as a part of a larger space-time) by prescribing the mean extrinsic curvature $\chi$ of the boundary. However, the specification of $\chi$ is tied to that of $e_0$ and while locally the boundary could be specified by one real function, the situation is more complicated if long time evolutions are studied (cf. Sect. 8).
After the specification of the boundary in terms of $\chi$, the basic freedom on the boundary consists in prescribing on $T$ two arbitrary real functions and their coupling to the conformal Weyl curvature. We provide some explanation of the nature of this coupling (cf. Sect. 7) but we avoid speaking of incoming/outgoing gravitational radiation. Any such interpretation would depend on the time-like unit vector field $e_0$ on $T$, the choice of which is rather arbitrary as long as no further assumptions are introduced. In Theorem 8.1 we state our general existence result, which is local in time but global along the edge $\Sigma$. We do not show the uniqueness of the solution in the general case. This is due to an open question concerning our gauge conditions (cf. Sect. 4) which we intend to make a topic of a separate investigation. However, in the particular case where the mean extrinsic curvature is constant on the boundary, local uniqueness of the solution is demonstrated. There are certainly many possibilities to discuss the initial boundary value problem and there will be as many ways of stating boundary conditions. However, all of these should be just modifications of the boundary conditions given in our theorem.
2. The Field Equations
We shall use a frame formalism in which the metric $g$ will be represented in terms of a frame field $\{e_k\}_{k=0,1,2,3}$ which satisfies the orthonormality condition $g_{ik} \equiv g(e_i, e_k) = \mathrm{diag}(1, -1, -1, -1)$ and for which $e_0$ is future directed. All fields (with the possible exception of the fields $e_k$ themselves) will in the following be expressed in terms of this frame.
The basic unknowns in our representation of the field equations are given by the fields $e^\mu{}_k$, $\Gamma_k{}^i{}_j$, $C^i{}_{jkl}$. The functions $e^\mu{}_k = e_k(x^\mu)$ are the coefficients of the frame in a suitably chosen coordinate system $\{x^\mu\}_{\mu=0,1,2,3}$. In these coordinates the coefficients of the contravariant form of the metric are then given by $g^{\mu\nu} = g^{jk} e^\mu{}_j e^\nu{}_k$. The $\Gamma_k{}^i{}_j$ are the connection coefficients in our frame such that $\nabla_k e_j \equiv \nabla_{e_k} e_j = \Gamma_k{}^i{}_j e_i$, where $\nabla$ denotes the Levi–Civita connection of $g$. The fact that the connection is metric is expressed by the condition $\Gamma_i{}^l{}_k g_{lj} = -\Gamma_i{}^l{}_j g_{lk}$. Finally, $C^i{}_{jkl}$ is a tensor field which is required to possess the algebraic properties of a conformal Weyl tensor and which will in fact represent that tensor. The curvature of the connection $\nabla$ is given by
\[ r^i{}_{jkl} = e_k(\Gamma_l{}^i{}_j) - e_l(\Gamma_k{}^i{}_j) + \Gamma_k{}^i{}_m \Gamma_l{}^m{}_j - \Gamma_l{}^i{}_m \Gamma_k{}^m{}_j - \Gamma_m{}^i{}_j (\Gamma_k{}^m{}_l - \Gamma_l{}^m{}_k). \tag{2.1} \]
For later discussions it will be convenient to introduce tensor fields $T_i{}^k{}_j$, $\Delta^i{}_{jkl}$, $H_{jkl}$ by setting
\[ T_i{}^k{}_j e_k = -[e_i, e_j] + (\Gamma_i{}^k{}_j - \Gamma_j{}^k{}_i) e_k, \tag{2.2} \]
\[ \Delta^i{}_{jkl} = r^i{}_{jkl} - C^i{}_{jkl}, \tag{2.3} \]
\[ H_{jkl} = \nabla_i C^i{}_{jkl}. \tag{2.4} \]
The Einstein equation can then be expressed by the equations
\[ T_i{}^k{}_j = 0, \qquad \Delta^i{}_{jkl} = 0, \qquad H_{jkl} = 0. \tag{2.5} \]
The first of these equations implies that the connection $\nabla$ is torsion free and therefore, since it is metric, that it is the Levi–Civita connection of the metric $g$. The equation allows us to determine the connection coefficients in terms of the frame coefficients and their first derivatives. The second equation requires that the curvature of $\nabla$ coincides with the Weyl curvature and thus implies Einstein’s equation (1.1). The third equation is the once contracted vacuum Bianchi identity. We refer to it as the Bianchi equation. One of the reasons why we chose this representation of the Einstein equation is that it simplifies the analysis of our problem. The equations contain direct geometric information. They are easily adapted to our situation and then entail immediate projection formalisms. Moreover, certain features of the Bianchi equation which are important for the discussion of initial boundary value problems are well understood [4]. Finally, the evolution equations for gravitating ideal fluid bodies derived in [6], which we want to use for analysing the problem of the “floating fluid ball”, extend the equations above.
In the frame formalism there exists a natural decomposition of the Bianchi equation. We set $n = e_0$ and study the decomposition of $H_{jkl}$ with respect to $n$ and its orthogonal complement, which carries the induced metric $h_{ij} = g_{ij} - n_i n_j$. We denote by $\epsilon_{lijk}$ the totally antisymmetric tensor with $\epsilon_{0123} = 1$ and set $\epsilon_{ijk} = n^l \epsilon_{lijk}$. Furthermore we set $l_{ij} = h_{ij} - n_i n_j$. The electric and magnetic parts of the conformal Weyl tensor are defined with respect to $n$ by $E_{ik} = h_i{}^m h_k{}^n C_{mjnl} n^j n^l$ and $B_{ik} = h_i{}^m h_k{}^n C^*_{mjnl} n^j n^l$, with the right dual of the conformal Weyl tensor given by $C^*_{ijkl} = \frac{1}{2} C_{ijmn} \epsilon^{mn}{}_{kl}$. We have $E_{ij} = E_{ji}$, $E_{ij} n^j = 0$, $E_i{}^i = 0$. The same relations hold for $B_{ij}$. With these conventions the conformal Weyl tensor can be written
\[ C_{ijkl} = 2 \left( l_{j[k} E_{l]i} - l_{i[k} E_{l]j} - n_{[k} B_{l]m} \epsilon^m{}_{ij} - n_{[i} B_{j]m} \epsilon^m{}_{kl} \right). \]
Using the symmetries of the Weyl tensor and the identities
\[ \epsilon_{ijp} \epsilon^{klp} = -2 h_{[i}{}^k h_{j]}{}^l, \qquad \epsilon_{ipq} \epsilon^{jpq} = -2 h_i{}^j, \]
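we get the decomposition
\[ H_{jkl} = 2 n_j P_{[k} n_{l]} + h_{j[k} P_{l]} + Q_i \left( n_j \epsilon^i{}_{kl} - \epsilon^i{}_{j[k} n_{l]} \right) - 2 P_{j[k} n_{l]} - Q_{ji} \epsilon^i{}_{kl}, \tag{2.6} \]
where we set
\[ P_k = n^j h_k{}^l n^m \nabla_i C^i{}_{jlm}, \qquad Q_k = -\frac{1}{2} n^j \epsilon_k{}^{lm} \nabla_i C^i{}_{jlm}, \tag{2.7} \]
\[ P_{jk} = h_{(j}{}^m h_{k)}{}^n n^l \nabla_i C^i{}_{mln}, \qquad Q_{jk} = \frac{1}{2} h_{(j}{}^l \epsilon_{k)}{}^{mn} \nabla_i C^i{}_{lmn}. \tag{2.8} \]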
In terms of these fields the Bianchi equation is equivalent to
\[ P_k = 0, \qquad Q_k = 0, \qquad P_{jk} = 0, \qquad Q_{jk} = 0. \tag{2.9} \]
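The two $\epsilon$-identities used to obtain the decomposition (2.6) can be checked numerically in the orthonormal frame. The following is a minimal sketch, assuming the conventions stated above ($g_{ij} = \mathrm{diag}(1,-1,-1,-1)$, $n = e_0$, $\epsilon_{0123} = 1$, $\epsilon_{ijk} = n^l \epsilon_{lijk}$); the function names are illustrative and not taken from the paper.

```python
from itertools import permutations, product

# Assumed conventions: frame metric diag(1, -1, -1, -1), n = e_0.
G = [1, -1, -1, -1]                           # diagonal of g_ij (and of g^ij)
N_UP = [1, 0, 0, 0]                           # n^i
N_DOWN = [G[i] * N_UP[i] for i in range(4)]   # n_i


def perm_sign(p):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(
        1 for a in range(len(p)) for b in range(a + 1, len(p)) if p[a] > p[b]
    )
    return -1 if inversions % 2 else 1


# eps_{lijk} with eps_{0123} = +1; index tuples with repeats are absent (value 0).
EPS4 = {p: perm_sign(p) for p in permutations(range(4))}


def eps3_down(i, j, k):
    """eps_{ijk} = n^l eps_{lijk}."""
    return sum(N_UP[l] * EPS4.get((l, i, j, k), 0) for l in range(4))


def eps3_up(i, j, k):
    """eps^{ijk}, all indices raised with the diagonal frame metric."""
    return G[i] * G[j] * G[k] * eps3_down(i, j, k)


def h_mixed(i, k):
    """h_i^k = delta_i^k - n_i n^k."""
    return (1 if i == k else 0) - N_DOWN[i] * N_UP[k]


def check_identities():
    """Check eps_{ijp} eps^{klp} = -2 h_[i^k h_j]^l and eps_{ipq} eps^{jpq} = -2 h_i^j."""
    for i, j, k, l in product(range(4), repeat=4):
        lhs = sum(eps3_down(i, j, p) * eps3_up(k, l, p) for p in range(4))
        rhs = -(h_mixed(i, k) * h_mixed(j, l) - h_mixed(j, k) * h_mixed(i, l))
        if lhs != rhs:
            return False
    for i, j in product(range(4), repeat=2):
        lhs = sum(
            eps3_down(i, p, q) * eps3_up(j, p, q)
            for p in range(4) for q in range(4)
        )
        if lhs != -2 * h_mixed(i, j):
            return False
    return True


print(check_identities())
```

The check only exercises the algebraic identities in a fixed orthonormal frame; it says nothing about the differential content of (2.6).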
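To obtain more explicit expressions we set $K_{ij} = h_i{}^k \nabla_k n_j$, $K = h^{ij} K_{ij}$, $a_i = n^j \nabla_j n_i$, $D_k E_{ij} = h_k{}^l h_i{}^m h_j{}^n \nabla_l E_{mn}$, etc. such that
\[ K_{ij} = -h_i{}^k \Gamma_k{}^0{}_j, \qquad K = -h^{pq} \Gamma_p{}^0{}_q, \qquad a^i = \Gamma_0{}^i{}_0. \]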
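Observing that $h_i{}^m h_j{}^n n^k \nabla_k E_{mn} = \mathcal{L}_n E_{ij} - E_{lj} K_i{}^l - E_{il} K_j{}^l$, where $\mathcal{L}_n$ denotes the Lie derivative in the direction of $n$, we get
\[ P_i = D^j E_{ji} + 2 K^{jl} \epsilon^k{}_{l(i} B_{j)k}, \tag{2.10} \]
\[ Q_i = D^j B_{ji} + \epsilon_i{}^{kl} \left( 2 K^j{}_k - K_k{}^j \right) E_{lj}, \tag{2.11} \]
\[ P_{jl} = \mathcal{L}_n E_{jl} + D_i B_{k(j} \epsilon_{l)}{}^{ik} - 3 K_{(j}{}^i E_{l)i} - 2 K^i{}_{(j} E_{l)i} - 2 a_i \epsilon^{ik}{}_{(j} B_{l)k} + h_{jl} K^{ik} E_{ik} + 2 K E_{jl}, \tag{2.12} \]
\[ Q_{jl} = \mathcal{L}_n B_{jl} - D_i E_{k(j} \epsilon_{l)}{}^{ik} + 2 a_i \epsilon^{ik}{}_{(j} E_{l)k} - K^i{}_{(j} B_{l)i} - 2 K_{(j}{}^i B_{l)i} + K B_{jl} - K_{ik} B_{pq} \epsilon^{pi}{}_{(j} \epsilon^{kq}{}_{l)}. \tag{2.13} \]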
3. Maximally Dissipative Boundary Value Problems
We need to remove the gauge freedom in Eqs. (2.5) and to extract from the resulting equations a “reduced system” which will allow us to discuss initial boundary value problems. To motivate our choice of gauge conditions and reduced system, we shall outline briefly the argument which leads to maximally dissipative boundary conditions. We consider on $M = \{x \in \mathbb{R}^4 \mid x^0 \geq 0,\ x^3 \geq 0\}$ a real linear symmetric hyperbolic system
\[ A^\mu \partial_\mu u = B u + f(x) \tag{3.1} \]
for an $\mathbb{R}^N$-valued unknown $u$ on $M$, i.e. the matrices $A^\mu = A^\mu(x)$, $\mu = 0, 1, 2, 3$, are smooth functions on $M$ which take values in the set of symmetric $N \times N$ matrices, there exists a 1-form $\xi_\mu$ such that $A^\mu \xi_\mu$ is positive definite, $B = B(x)$ is a smooth matrix-valued function and $f(x)$ a smooth $\mathbb{R}^N$-valued function on $M$. For convenience we assume that the positivity condition is satisfied with $\xi_\mu = \delta^0{}_\mu$. Set $S = \{x \in M \mid x^0 = 0\}$, $T = \{x \in M \mid x^3 = 0\}$, and define for $\tau \geq 0$ the sets $M_\tau = \{x \in M \mid 0 \leq x^0 \leq \tau\}$, $S_\tau = \{x \in M \mid x^0 = \tau\}$, $T_\tau = \{x \in M \mid 0 \leq x^0 \leq \tau,\ x^3 = 0\}$. We prescribe data as follows: We choose $g \in C^\infty(S, \mathbb{R}^N)$ and require as initial condition $u(x) = g(x)$, $x \in S$. We choose a smooth map $Q$ of $T$ into the set of linear subspaces of $\mathbb{R}^N$ and require as boundary condition $u(x) \in Q(x)$, $x \in T$. The type of maps $Q$ admitted here is suggested by the structure of the equations. Suppose that $u$ is a solution of (3.1) of spatially compact support in $M$. Then (3.1) implies
\[ \partial_\mu \left( {}^t u\, A^\mu u \right) = {}^t u\, K u + 2\, {}^t u\, f \]
with $K = B + {}^t B + \partial_\mu A^\mu$. Integration over $M_\tau$ gives
\[ \int_{S_\tau} {}^t u\, A^0 u \, dS = \int_S {}^t u\, A^0 u \, dS + \int_{M_\tau} \left\{ {}^t u\, K u + 2\, {}^t u\, f \right\} dV + \int_{T_\tau} {}^t u\, A^3 u \, dS. \]
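For orientation, the divergence identity behind this energy equation follows in two lines from the symmetry of the $A^\mu$. The following worked step is a sketch using only the symbols introduced above; the sign bookkeeping for the boundary terms assumes the outward conormals $+dx^0$ on $S_\tau$, $-dx^0$ on $S$ and $-dx^3$ on $T_\tau$.

```latex
\begin{align*}
\partial_\mu\left({}^t u\, A^\mu u\right)
  &= 2\,{}^t u\, A^\mu \partial_\mu u + {}^t u\,(\partial_\mu A^\mu)\,u
     && \text{(each $A^\mu$ is symmetric)} \\
  &= 2\,{}^t u\,(B u + f) + {}^t u\,(\partial_\mu A^\mu)\,u
     && \text{(by (3.1))} \\
  &= {}^t u\,\left(B + {}^t B + \partial_\mu A^\mu\right)u + 2\,{}^t u\, f .
\end{align*}
% Gauss' theorem on M_\tau (u has spatially compact support):
\[
\int_{S_\tau} {}^t u\, A^0 u\, dS - \int_{S} {}^t u\, A^0 u\, dS
  - \int_{T_\tau} {}^t u\, A^3 u\, dS
  = \int_{M_\tau} \left\{ {}^t u\, K u + 2\,{}^t u\, f \right\} dV .
\]
```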
If the last term on the right-hand side were non-positive, we could use this equation to obtain the energy estimates which are basic for proving existence and uniqueness of solutions to symmetric hyperbolic systems. Thus the structure of the “normal matrix” $A^3$ plays a prominent role in formulating the boundary conditions. We shall assume the following conditions to be satisfied:
(i) The set $T$ is a characteristic of (3.1) of constant multiplicity, i.e. $\dim(\ker A^3(x)) = \mathrm{const.} > 0$, $x \in T$.
(ii) The map $Q$ is chosen such as to ensure the desired non-positivity ${}^t u\, A^3(x)\, u \leq 0$, $u \in Q(x)$, $x \in T$.
(iii) The subspace $Q(x)$, $x \in T$, is maximal with (ii), i.e. $\dim(Q(x))$ = number of non-positive eigenvalues of $A^3$ counting multiplicity.
The last condition implies in particular that $\ker A^3(x) \subset Q(x)$. We discuss the specification of $Q$ in terms of linear equations. Since $A^3$ is symmetric, we can assume, possibly after a transformation of the dependent unknown, that at $x \in T$,
\[ A^3 = \kappa \begin{pmatrix} -I_p & 0 & 0 \\ 0 & 0_k & 0 \\ 0 & 0 & I_q \end{pmatrix}, \qquad \kappa > 0, \]
where $I_p$ is a $p \times p$ unit matrix, $0_k$ is a $k \times k$ zero matrix, etc. and $p + k + q = N$. Writing $u = {}^t(a, b, c) \in \mathbb{R}^p \times \mathbb{R}^k \times \mathbb{R}^q$ we find that at $x$ the linear subspaces admitted as values of $Q$ are necessarily given by equations of the form $0 = c - H a$, where $H = H(x)$ is a $q \times p$ matrix satisfying
\[ -\,{}^t a\, a + {}^t a\, {}^t H H\, a \leq 0, \quad a \in \mathbb{R}^p, \qquad \text{i.e.} \qquad {}^t H H \leq I_p. \]
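The role of the condition ${}^t H H \leq I_p$ can be illustrated numerically. The sketch below assumes the block form of $A^3$ given above and samples vectors $u = {}^t(a, b, c)$ with $c = H a$; the function names and the sample matrices are illustrative assumptions, and random sampling only gives evidence for dissipativity, not a proof.

```python
import random


def boundary_form(a, b, c, kappa=1.0):
    """Return t(u) A^3 u for u = (a, b, c), A^3 = kappa * diag(-I_p, 0_k, I_q).

    The kernel component b does not contribute to the quadratic form.
    """
    return kappa * (sum(x * x for x in c) - sum(x * x for x in a))


def apply_matrix(h, a):
    """Matrix-vector product H a, with H given as a list of rows."""
    return [sum(entry * x for entry, x in zip(row, a)) for row in h]


def is_dissipative(h, p, k, trials=1000, seed=0):
    """Sample u in Q = {c = H a} and test t(u) A^3 u <= 0 on every sample."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = [rng.uniform(-1.0, 1.0) for _ in range(p)]
        b = [rng.uniform(-1.0, 1.0) for _ in range(k)]
        c = apply_matrix(h, a)
        if boundary_form(a, b, c) > 1e-12:
            return False
    return True


# A contraction (half of a rotation) satisfies t(H) H <= I_2.
contraction = [[0.0, 0.5], [-0.5, 0.0]]
# A matrix with norm 2 violates it.
expanding = [[2.0, 0.0], [0.0, 0.0]]

print(is_dissipative(contraction, p=2, k=1))
print(is_dissipative(expanding, p=2, k=1))
```

The component $b$ is sampled freely because, as the form of $A^3$ shows, it drops out of the boundary term.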
We note that there is no freedom to prescribe data for the component $b$ of $u$ associated with the kernel of $A^3$. More specifically, if $A^3 \equiv 0$ on $T$, energy estimates are obtained without imposing conditions on $T$ and the solutions are determined uniquely by the initial condition on $S$. By subtracting a suitable smooth function from $u$ and redefining the function $f$, we can convert the homogeneous problem above to an inhomogeneous problem and vice versa. Inhomogeneous maximally dissipative boundary conditions are of the form $q = c - H a$, with $q = q(x)$, $x \in T$, a given $\mathbb{R}^q$-valued function representing the free boundary data on $T$. The linear maximally dissipative boundary value problem has been analysed by Rauch [12] under weak smoothness assumptions; results for higher smoothness can be found e.g. in [13] and in the literature given there. In the case of quasi-linear equations the matrices $A^\mu$ depend on the unknown $u$ as well. Thus the fact that the normal matrix $A^3$ depends also on $u$ has to be observed in formulating the boundary conditions. Initial boundary value problems for quasi-linear equations with a general form of $A^0$ and boundary conditions as indicated above have been discussed e.g. in [8, 14].
To illustrate the discussion above in a simple case we consider Maxwell fields. This will also allow us to point out some specific features of Einstein’s equation. We assume that a metric $g$ is given on the set $M$ above and that Maxwell’s equations are expressed in an orthonormal frame $e_k$. The notation introduced in the previous section will be employed throughout. Maxwell’s equations are given by
\[ 0 = H_k \equiv \nabla^j F_{jk} - 4 \pi J_k, \qquad 0 = H^*_k \equiv \nabla^j F^*_{jk}, \]
with $F^*_{jk} = \frac{1}{2} \epsilon_{jk}{}^{il} F_{il}$. In terms of the electric field $E^i = -h^i{}_j n_k F^{jk}$ and the magnetic field $B^i = h^i{}_j n_k F^{*jk}$ we get decompositions
\[ F_{ij} = -2 E_{[i} n_{j]} + \epsilon_{ijk} B^k, \qquad F^*_{ij} = \epsilon_{ijk} E^k + 2 B_{[i} n_{j]}, \]
which entail further decompositions $H_k = P_k - n_k P$, $H^*_k = -Q_k + n_k Q$, with
\[ P = D^i E_i + K^{ij} B^k \epsilon_{ijk} + 4 \pi \rho, \qquad Q = D^i B_i - K^{ij} E^k \epsilon_{ijk}, \]
\[ P_i = \mathcal{L}_n E_i - \epsilon_i{}^{jk} D_j B_k - E_j (K_i{}^j + K^j{}_i) + E_i K_j{}^j + a^j B^k \epsilon_{jki} - 4 \pi j_i, \]
\[ Q_i = \mathcal{L}_n B_i + \epsilon_i{}^{jk} D_j E_k - B_j (K_i{}^j + K^j{}_i) + B_i K_j{}^j - a^j E^k \epsilon_{jki}, \]
where we set $\rho = n^k J_k$, $j_l = h_l{}^k J_k$. Notice that the terms in the first two equations which involve $K_{ij}$ drop out if $n$ is hypersurface orthogonal. It holds that $n^k P_k = 0$, $n^k Q_k = 0$.
For convenience we shall assume that the normals to $S$ are tangent to $T$ on $\Sigma$. The frame $e_k$ is now chosen on $M$ such that $e_0$ and $e_3$ coincide on $S$, respectively $T$, with the unit normals pointing into $M$. Set $x^0 = 0$ on $S$ and let $x^\alpha$, $\alpha = 1, 2, 3$, be coordinates on $S$ with $x^3 = 0$ on $\Sigma$ and $x^3 > 0$ on $S \setminus \Sigma$. These coordinates are extended into $M$ such that $e_0(x^\mu) = \delta^\mu{}_0$ on $M$. We have
\[ e^3{}_k = e_k(x^3) = e^3{}_3\, \delta^3{}_k, \qquad e^3{}_3 > 0 \ \text{on } T, \qquad e^\mu{}_0 = \delta^\mu{}_0 \ \text{on } M. \]
Choose now $J_k$ on $M$ such that the conservation law $\nabla^k J_k = 0$ holds on $M$ and prescribe data $E_i$, $B_i$ on $S$ satisfying the constraints:
\[ P = 0, \qquad Q = 0 \quad \text{on } S. \tag{3.2} \]
To study the time evolution we observe that by our formalism the equations $P_0 = 0$, $Q_0 = 0$ are trivially satisfied and we consider the propagation equations
\[ P_r = 0, \qquad Q_r = 0 \quad \text{on } M, \quad r = 1, 2, 3. \tag{3.3} \]
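If we write these equations in the form (3.1) with $u = {}^t(u_1, \ldots, u_6)$, where $u_r = E_r$, $u_{3+r} = B_r$ for $r = 1, 2, 3$, we find $A^\mu = I\, \delta^\mu{}_0 + F^\mu$ on $M$, with
\[ F^\mu = \begin{pmatrix}
0 & 0 & 0 & 0 & e^\mu{}_3 & -e^\mu{}_2 \\
0 & 0 & 0 & -e^\mu{}_3 & 0 & e^\mu{}_1 \\
0 & 0 & 0 & e^\mu{}_2 & -e^\mu{}_1 & 0 \\
0 & -e^\mu{}_3 & e^\mu{}_2 & 0 & 0 & 0 \\
e^\mu{}_3 & 0 & -e^\mu{}_1 & 0 & 0 & 0 \\
-e^\mu{}_2 & e^\mu{}_1 & 0 & 0 & 0 & 0
\end{pmatrix}. \]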
Since $A^\mu g_{\mu\nu} e^\nu{}_0 = I$, the $6 \times 6$ unit matrix, and the matrices $A^\mu$ are symmetric, we see that Eqs. (3.3) form a symmetric hyperbolic system. On $S$ we have $F^0 = 0$. We shall assume that $A^0$ is positive definite on $M$. For the normal matrix on $T$ we find
\[ A^3 = e^3{}_3 \begin{pmatrix}
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}, \]
which tells us that $T$ is a characteristic of constant multiplicity since $e^3{}_3 > 0$ on $T$. To study the maximally dissipative boundary condition we perform a transformation $u \to v = C u$ such that $A^3 = {}^t C\, D\, C$ with $D = e^3{}_3\, \mathrm{diag}(-1, -1, 0, 0, 1, 1)$. Such a transformation is given by
\[ v_1 = \tfrac{1}{\sqrt{2}} (u_1 - u_5), \quad v_2 = \tfrac{1}{\sqrt{2}} (u_2 + u_4), \quad v_3 = u_3, \quad v_4 = u_6, \quad v_5 = \tfrac{1}{\sqrt{2}} (u_1 + u_5), \quad v_6 = \tfrac{1}{\sqrt{2}} (u_2 - u_4). \]
As discussed above we introduce now a real matrix-valued function $H$ on $T$,
\[ H = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \quad \text{with} \quad (a v_1 + b v_2)^2 + (c v_1 + d v_2)^2 \leq v_1^2 + v_2^2, \quad (v_1, v_2) \in \mathbb{R}^2, \tag{3.4} \]
to write down inhomogeneous maximally dissipative boundary conditions. Translated back into the original unknowns these conditions read
\[ q_1 = E_1 + B_2 - a (E_1 - B_2) - b (E_2 + B_1), \qquad q_2 = E_2 - B_1 - c (E_1 - B_2) - d (E_2 + B_1), \]
where $q_1$ and $q_2$ are smooth functions, prescribed on $T$. In terms of the spin frame formalism of Newman and Penrose [11] this equation takes the form $q = \phi_2 + \alpha\, \phi_0 + \beta\, \bar{\phi}_0$, where we set
\[ q = -\tfrac{1}{2} (q_1 + i q_2), \qquad \alpha = \tfrac{1}{2} (a + d - i b + i c), \qquad \beta = \tfrac{1}{2} (a - d + i b + i c), \tag{3.5} \]
\[ l = \tfrac{1}{\sqrt{2}} (e_0 + e_3), \qquad k = \tfrac{1}{\sqrt{2}} (e_0 - e_3), \qquad m = \tfrac{1}{\sqrt{2}} (e_1 - i e_2), \tag{3.6} \]
\[ \phi_0 = F_{ij}\, l^i m^j, \qquad \phi_1 = F_{ij} (l^i k^j + \bar{m}^i m^j), \qquad \phi_2 = F_{ij}\, \bar{m}^i k^j. \]
By picking the matrix $H$ appropriately, we see that we could alternatively prescribe e.g. the components $(E_1, E_2)$ or $(B_1, B_2)$ or $\phi_2$ freely on $T$. The function $\phi_2$ can be interpreted as the component of the Maxwell field which is transverse to and travels in the direction of $e_3$. We note that all the prescriptions above depend on $e_0$, for which there exists no privileged choice on $T$. The non-positivity condition (ii) implies for the Poynting vector $S^i = -\frac{1}{4\pi} \epsilon^{ijk} E_j B_k$ on $T$ the relation
\[ S^3 = \frac{1}{4\pi} (E_1 B_2 - E_2 B_1) = \frac{1}{8 \pi\, e^3{}_3}\, {}^t u\, A^3 u \leq 0, \tag{3.7} \]
with $e_3$ pointing towards $M$ on $T$. Given the field equations and the data on $S$, we can derive a formal expansion of the prospective solution $u$ on $S$, in particular on $\Sigma$, in terms of $x^0$. If we want to ensure the smoothness of $u$, we need to give the boundary conditions such that they are consistent with the formal expansion of $u$ on $\Sigma$. We shall not discuss these “consistency conditions” (cf. [8]) any further.
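The algebra of the Maxwell example can be checked directly. The following sketch builds the normal matrix $A^3$ and the transformation $C$ described above, verifies the factorisation $A^3 = {}^t C\, D\, C$, and compares the quadratic form with the Poynting component in (3.7). The numerical value chosen for $e^3{}_3$ and all function names are assumptions made for illustration only.

```python
import math

E33 = 1.5  # assumed sample value of the frame coefficient e^3_3 > 0 on T


def normal_matrix(e33):
    """A^3 on T for u = (E_1, E_2, E_3, B_1, B_2, B_3)."""
    a3 = [[0.0] * 6 for _ in range(6)]
    a3[0][4] = a3[4][0] = e33     # couples u1 = E_1 with u5 = B_2
    a3[1][3] = a3[3][1] = -e33    # couples u2 = E_2 with u4 = B_1
    return a3


R = 1.0 / math.sqrt(2.0)
# Rows of C encode v = C u as given in the text.
C = [
    [R, 0, 0, 0, -R, 0],   # v1 = (u1 - u5) / sqrt(2)
    [0, R, 0, R, 0, 0],    # v2 = (u2 + u4) / sqrt(2)
    [0, 0, 1, 0, 0, 0],    # v3 = u3
    [0, 0, 0, 0, 0, 1],    # v4 = u6
    [R, 0, 0, 0, R, 0],    # v5 = (u1 + u5) / sqrt(2)
    [0, R, 0, -R, 0, 0],   # v6 = (u2 - u4) / sqrt(2)
]


def congruence(c, d):
    """Return t(C) diag(d) C as a list of rows."""
    n = len(d)
    return [
        [sum(c[m][i] * d[m] * c[m][j] for m in range(n)) for j in range(n)]
        for i in range(n)
    ]


def quadratic_form(a, u):
    """Return t(u) A u."""
    n = len(u)
    return sum(u[i] * a[i][j] * u[j] for i in range(n) for j in range(n))


def poynting_3(e, b):
    """S^3 = (E_1 B_2 - E_2 B_1) / (4 pi), as in (3.7)."""
    return (e[0] * b[1] - e[1] * b[0]) / (4.0 * math.pi)


def max_abs_diff(x, y):
    """Largest entrywise deviation between two matrices of equal shape."""
    return max(abs(p - q) for rx, ry in zip(x, y) for p, q in zip(rx, ry))


D = [-E33, -E33, 0.0, 0.0, E33, E33]
A3 = normal_matrix(E33)
print(max_abs_diff(congruence(C, D), A3))  # factorisation error; should be at rounding level
```

As a usage example, the homogeneous condition with $H = 0$ reads $E_1 + B_2 = 0$, $E_2 - B_1 = 0$; for such data `poynting_3` returns a non-positive value, in line with (3.7).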
The initial boundary value problem which we have outlined here admits a unique smooth solution $E_k$, $B_k$ of (3.3) on a suitably given neighbourhood of $\Sigma$ in $M$. We still need to show that these fields satisfy also the constraints $P = 0$, $Q = 0$. In the case of Maxwell equations the argument is straightforward. A direct calculation shows that the fields $H_k$, $H^*_k$ satisfy for arbitrary fields $E_k$, $B_k$ the identities
\[ \nabla^k H_k = -R_{jk} F^{jk} = 0, \qquad \nabla^k H^*_k = -R_{jk} F^{*jk} = 0. \]
On the other hand, observing the decompositions of $H_k$, $H^*_k$ given above and the fact that our fields solve (3.3), we find for $P$ and $Q$ the “subsidiary equations”
\[ 0 = -\nabla^k H_k = \mathcal{L}_n P + P\, K_j{}^j, \qquad 0 = \nabla^k H^*_k = \mathcal{L}_n Q + Q\, K_j{}^j. \]
Because of (3.2) it follows from these ODE’s that $P$ and $Q$ also vanish off $S$.
In the following we shall reduce the “geometric” initial boundary value problem for Einstein’s equation to a maximally dissipative boundary value problem for a suitably chosen reduced system. We have seen above that at least three important conditions have to be met by the gauge conditions and the reduced system: The system should be symmetric hyperbolic, the resulting problem should satisfy the condition of maximal dissipativity, and the problem should allow us to demonstrate the preservation of the constraints. Besides studying the Bianchi equation, which is similar to Maxwell’s equations, we need to take care of the equations for the frame and the connection coefficients and we need to characterize the boundary itself in terms of some data. The enormous freedom available here allows for reduced systems which satisfy the first two conditions but which lead to difficulties when it comes to verifying the third condition. This should be kept in mind when we now study the gauge conditions and then extract the reduced system.
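The step from the subsidiary equations of the Maxwell example to the vanishing of $P$ and $Q$ can be written out explicitly. The following is a sketch, assuming the coordinates chosen above with $e^\mu{}_0 = \delta^\mu{}_0$, so that $\mathcal{L}_n$ acts on scalars as $\partial / \partial x^0$ along the integral curves of $e_0$; the integration variable $s$ is introduced here for illustration.

```latex
\[
\frac{\partial P}{\partial x^0} = -K_j{}^j\, P
\quad\Longrightarrow\quad
P(x^0, x^\alpha) = P(0, x^\alpha)\,
  \exp\!\left( -\int_0^{x^0} K_j{}^j(s, x^\alpha)\, ds \right),
\qquad \alpha = 1, 2, 3,
\]
% and likewise for Q. Since P(0, x^\alpha) = Q(0, x^\alpha) = 0 by (3.2),
% both functions vanish along every integral curve of e_0 starting on S.
```

The argument is a linear homogeneous ODE along each curve, so no boundary condition on $T$ enters at this point.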
4. The Gauge Conditions
Consider the 4-manifold $M = \mathbb{R}^+_0 \times S$, where $S$ is a smooth orientable 3-manifold with boundary $\Sigma \equiv \partial S \neq \emptyset$. We write $\partial M = S \cup T$ and $S \cap T = \{0\} \times \Sigma \equiv \Sigma$, where we identify $S$ in the obvious way with $\{0\} \times S \subset M$ and set $T = \mathbb{R}^+_0 \times \Sigma$. Let $g$ be a smooth Lorentz metric on $M$ for which $S$ is space-like and $T$ is time-like. Given a point $p \in \Sigma$ we want to construct in some appropriate neighbourhood $U$ of $p$ coordinates $x^\mu$ and an orthonormal frame field $e_k$ which are conveniently adapted to $S \cap U$ and $T \cap U$. It will be seen that our construction works for a suitably chosen neighbourhood $U$ of $p$. Set $x^0 = 0$ on $S$ and let $x^\alpha$, $\alpha = 1, 2, 3$, be local coordinates on $S \cap U$ with $x^3 = 0$ on $\Sigma \cap U$ and $x^3 > 0$ elsewhere. Choose a time-like unit vector field $e_0$ on $U$ which is tangent to $T \cap U$, orthogonal to the 2-surfaces $\{x^3 = c = \mathrm{const.} > 0\}$ in $S \cap U$, and points towards $M$ on $S \cap U$. We assume that the integral curves of $e_0$ starting on $S \cap U$ generate $U$. We extend the functions $x^\mu$ to $U$ such that $e^\mu{}_0 \equiv e_0(x^\mu) = \delta^\mu{}_0$ on $U$, i.e. $x^0$ is the parameter on the integral curves of $e_0$ which vanishes on $S \cap U$ and the $x^\alpha$ are constant on these curves. The $x^\mu$ provide smooth coordinates on $U$. The sets $T_c = \{x^3 = c\}$ are smooth time-like hypersurfaces in $U$ with $T_0 = T \cap U$. Let $e_3$ be the smooth unit vector field normal to $T_c$ which points towards $M$ on $T_0$. We denote by $D$ the Levi–Civita connection defined by the metric induced on $T_c$. Choose vector fields $e_A$, $A = 1, 2$, on $S \cap U$ which are tangent to $T_c \cap S$ and which form with $e_0$, $e_3$ a smooth orthonormal frame field on $S \cap U$. Using the connection $D$, we extend the fields $e_A$ to $T_c$ by Fermi transport on $T_c$ in the direction of $e_0$ such that (in signature-independent form)
\[ g(e_0, e_0)\, D_{e_0} e_A + g(e_A, D_{e_0} e_0)\, e_0 - g(e_A, e_0)\, D_{e_0} e_0 = 0 \quad \text{on } U. \tag{4.1} \]
The $e_k$ form a smooth orthonormal frame field on $U$. We shall refer to the type of gauge considered above as an “adapted gauge”. In the further discussion we will have to consider three types of projections. Since our frame is well adapted to our geometrical situation we can avoid the introduction of corresponding projection formalisms by distinguishing four groups of indices. The latter are given, together with the values they take, as follows:
\[ a, c, d, e, f = 0, 1, 2; \qquad i, j, k, l, m, n = 0, 1, 2, 3; \qquad p, q, r, s, t = 1, 2, 3; \qquad A, B, C, D = 1, 2. \]
We assume the summation rule for each group. The frame coefficients $e^\mu{}_k$ satisfy
\[ e^\mu{}_0 = \delta^\mu{}_0, \qquad e^3{}_a = 0, \qquad e^3{}_3 > 0 \quad \text{on } U. \tag{4.2} \]
A part of the connection coefficients defines the inner connection $D$ on $T_c$; we have
\[ D_a e_c \equiv D_{e_a} e_c = \Gamma_a{}^b{}_c\, e_b. \tag{4.3} \]
The condition (4.1) reads in terms of the connection coefficients
\[ \Gamma_0{}^A{}_B = 0. \tag{4.4} \]
As a consequence the fields $e_a$ satisfy on $T_c$ the equations
\[ D_{e_0} e_0 = \Gamma_0{}^A{}_0\, e_A, \qquad D_{e_0} e_A = -g_{AB}\, \Gamma_0{}^B{}_0\, e_0. \tag{4.5} \]
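The passage from the Fermi transport law (4.1) to (4.4) and (4.5) can be spelled out as follows. This is a sketch that uses only the orthonormality of the frame, the definition (4.3), and the metric condition $\Gamma_i{}^l{}_k g_{lj} = -\Gamma_i{}^l{}_j g_{lk}$ stated in Sect. 2.

```latex
% With g(e_0, e_0) = 1 and g(e_A, e_0) = 0, (4.1) reduces to
\[
D_{e_0} e_A + g(e_A, D_{e_0} e_0)\, e_0 = 0 .
\]
% Insert D_{e_0} e_A = \Gamma_0{}^0{}_A e_0 + \Gamma_0{}^B{}_A e_B and
% g(e_A, D_{e_0} e_0) = g_{AB} \Gamma_0{}^B{}_0 :
\[
\left( \Gamma_0{}^0{}_A + g_{AB}\, \Gamma_0{}^B{}_0 \right) e_0
  + \Gamma_0{}^B{}_A\, e_B = 0 .
\]
% The bracket vanishes identically by the metric condition (take i = 0, k = A, j = 0),
% so (4.1) is equivalent to \Gamma_0{}^B{}_A = 0, which is (4.4).
% The same condition with i = k = j = 0 gives \Gamma_0{}^0{}_0 = 0, hence
\[
D_{e_0} e_0 = \Gamma_0{}^A{}_0\, e_A, \qquad
D_{e_0} e_A = \Gamma_0{}^0{}_A\, e_0 = -g_{AB}\, \Gamma_0{}^B{}_0\, e_0 ,
\]
% which is (4.5).
```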
Given the hypersurfaces $T_c$, the coefficients $\Gamma_0{}^A{}_0$ can be considered as gauge source functions (cf. [5]) which govern the evolution of the coordinates and the frame field off $S$.

Lemma 4.1. Suppose that the hypersurfaces $T_c$ are given on $U$ and let the coordinates $x^\mu$ and the frame field $e_k$ described above be given on $S \cap U$. If $F^{A'} = F^{A'}(x^{\mu'})$, $A = 1, 2$, are smooth functions on $\{x' \in \mathbb{R}^4 \,|\, x^{0'} \geq 0,\ x^{3'} \geq 0\}$, there exist unique coordinates $x^{\mu'}$ and unique frame vector fields $e'_k$ on some neighbourhood $U'$ of $p$ in $U$ which represent an adapted gauge and which are such that on $S \cap U'$ $x^{\mu'} = x^\mu$, $e'_k = e_k$ holds, and on $U'$ $x^{3'} = x^3$, $\Gamma'_0{}^A{}_0(x^{\mu'}) = F^{A'}(x^{\mu'})$ holds, where $\Gamma'_i{}^j{}_k$ denote the connection coefficients with respect to $e'_k$.

Proof. The new coordinates and frame vector fields would need to satisfy the equations
$$D_{e'_0} e'_0 = F^{A'}(x^{\mu'})\, e'_A, \qquad D_{e'_0} e'_A = -g_{AB}\, F^{B'}(x^{\mu'})\, e'_0, \qquad e'_0(x^{\mu'}) = \delta^\mu{}_0,$$
with $x^{3'} = c$ on the hypersurface $T_c \cap U'$, $c \geq 0$. Since the connection $D$ on $T_c$ can be considered as known, we can read the equations above as a system of ODE’s on $T_c$ for the coordinates $x^{\alpha'}(x^\beta, c)$ and the coefficients $e^{\alpha'}{}_a(x^\beta, c)$ of the vector fields $e'_a$ in the
630
H. Friedrich, G. Nagy
coordinates $x^\alpha$, where $\alpha, \beta = 0, 1, 2$. For the given data on $S \cap U$ this system of ODE’s has a unique solution in some neighbourhood $U'$ of $p$ which depends smoothly on the initial data and the parameter $c$. We set $e^{3'}{}_a(x^\alpha, c) = 0$ and express the frame in the new coordinates. Equations (4.5) imply a system of ODE’s for the quantities $g(e'_a, e'_b)$ which allows us to show that the frame is indeed orthonormal.

The second fundamental form of $T_c$ in the frame $e_a$ is given by
$$\chi_{ab} \equiv g(\nabla_{e_a} e_3, e_b) = \Gamma_a{}^j{}_3\, g_{jb} = \Gamma_a{}^3{}_b = \Gamma_{(a}{}^3{}_{b)}, \qquad (4.6)$$
the mean extrinsic curvature of the hypersurfaces $T_c$ is given by
$$\chi \equiv g^{ab} \chi_{ab} = g^{jk}\, \Gamma_j{}^3{}_k = \nabla_\mu e^\mu{}_3. \qquad (4.7)$$
Lemma 4.2. Consider the smooth functions $\chi(x^\alpha, 0)$, $\Gamma_0{}^A{}_0(x^\alpha, 0)$, $\alpha = 0, 1, 2$, which are implied on $T \cap U$ in the adapted gauge considered above. Let $x^\mu$ be smooth functions on $S \cap U$ with $x^0 = 0$ and such that $x^1, x^2, x^3$ are local coordinates on $S \cap U$ with $x^3 = 0$ on $\Sigma \cap U$ and $x^3 > 0$ elsewhere. Let $\{e_k\}_{k=0,\ldots,3}$ be a smooth orthonormal frame field on $S$ with the following properties. The vector field $e_0$ points towards $M$, it is tangent to $T$ on $\Sigma \cap U$, and for given number $c$, $0 \leq c \leq \sup_U x^3$, it is orthogonal to the sets $S_c \equiv \{x^3 = c\} \subset S$. The vector fields $e_A$ are tangent to the sets $S_c$ and the vector field $e_3$ points towards $M$ on $\Sigma \cap U$. If $f = f(x^\mu)$, $F^A = F^A(x^\mu)$, $A = 1, 2$, are smooth functions on $\{x \in \mathbb{R}^4 \,|\, x^0 \geq 0,\ x^3 \geq 0\}$ satisfying
$$f(x^\alpha, 0) = \chi(x^\alpha, 0), \qquad F^A(x^\alpha, 0) = \Gamma_0{}^A{}_0(x^\alpha, 0),$$
then, if there exists a smooth extension of the functions $x^\mu$ and the vector fields $e_k$ to some neighbourhood $U'$ of $p$ in $U$ such that $x^\mu$, $e_k$ represent the coordinates and the frame field in an adapted gauge on $U'$ for which $\chi(x^\mu) = f(x^\mu)$ and $\Gamma_0{}^A{}_0(x^\mu) = F^A(x^\mu)$ on $U'$, the extension is unique. If $\chi(x^\alpha, 0) = \chi_0 = \text{const.}$ and $f$ is chosen to be constant and equal to $\chi_0$, there exists a smooth extension of $x^\mu$ and $e_k$ with the properties listed above.

Remark 4.1. We shall in the following consider the functions $F^A$ and, for $x^3 > 0$, the function $f$ as gauge source functions which determine the foliation by hypersurfaces $\{x^3 = \text{const.}\}$ and the evolution of the field $e_0$ on these hypersurfaces. Therefore the existence of the extensions of the $x^\mu$, $e_k$ is important for us also in the case of general functions $\chi(x^\alpha, 0)$, $f(x^\mu)$ (cf. Sect. 8). Since the general existence proof appears to require techniques which are different from the ones used in this article we will make it a topic of separate investigation.

Proof. Let $x^{\mu'}$ be coordinates on $U$ such that we have $x^{\mu'} = x^\mu$ for $\mu' = \mu$ as well as $e_0(x^{3'}) = 0$ on $S \cap U$ and $x^{3'} = 0$ on $T \cap U$. In the following indices $\alpha, \beta, \alpha', \beta'$ take values $0, 1, 2$. For given number $c$ we wish to construct a hypersurface $T_c = \{x^3 = c\}$ such that $T_c \cap S = S_c$ and the mean extrinsic curvature of $T_c$ satisfies $\chi(x^\alpha, c) = f(x^\alpha, c)$. The hypersurface will be given as the graph of a smooth function $\phi(x^{\alpha'}, c)$. We set $\Phi(x^{\mu'}, c) = x^{3'} - \phi(x^{\alpha'}, c)$ and require
$$p \in T_c \quad \text{iff} \quad \Phi(x^{\mu'}(p), c) = c. \qquad (4.8)$$
In the following the dependence of the various quantities on the parameter $c$ will not always be written out explicitly but it should be kept in mind.
Initial Boundary Value Problem for Einstein’s Vacuum Field Equation
631
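The unit normal $N^{\mu'}$ to $T_c$ with $N^{3'} > 0$ on $S$ will be given by
$$N^{\mu'} = N\, \nabla^{\mu'} \Phi, \quad \text{with} \quad N = -\left(-\nabla_{\nu'} \Phi\, \nabla^{\nu'} \Phi\right)^{-1/2}.$$
It will be ensured by condition (4.11) that $N \neq 0$ close to $S \cap U$. The second fundamental form of $T_c$ will be given by
$$\chi_{\mu'\nu'} = k_{\mu'}{}^{\rho'}\, k_{\nu'}{}^{\lambda'}\, \nabla_{\rho'} N_{\lambda'} = N\, k_{\mu'}{}^{\rho'}\, k_{\nu'}{}^{\lambda'}\, \nabla_{\rho'} \nabla_{\lambda'} \Phi. \qquad (4.9)$$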
Here indices are raised and lowered by using the metric $g$ and we denote by $k_{\mu'\nu'} = g_{\mu'\nu'} + N_{\mu'} N_{\nu'}$ the metric which will be induced on the hypersurface $T_c$. The equation which relates the function $f$ to the mean extrinsic curvature of $T_c$ takes the form
$$\nabla_{\mu'} N^{\mu'} = N\, k^{\mu'\nu'}\, \nabla_{\mu'} \nabla_{\nu'} \Phi = -N\, k^{\alpha'\beta'}\, \partial_{\alpha'} \partial_{\beta'} \phi + h(x^{\alpha'}, \phi, \partial_{\beta'} \phi) = f(x^\alpha, c), \qquad (4.10)$$
with some smooth function $h$. To ensure that $x^3 = c$ on $T_c \cap S$ and $e_0$ is tangent to $T_c$, we require
$$\phi = 0, \qquad 0 = e_0(\Phi) = e^{3'}{}_0 - \phi_{,\alpha'}\, e^{\alpha'}{}_0 = -\phi_{,\alpha'}\, e^{\alpha'}{}_0 \quad \text{on } S_c. \qquad (4.11)$$
In (4.9), (4.10) and (4.11) it is assumed that $x^{3'} = c + \phi(x^{\alpha'})$ in the arguments of the background fields $g_{\mu'\nu'}$, $\Gamma_{\mu'}{}^{\rho'}{}_{\nu'}$, $e^{\mu'}{}_0$ entering the equations. It follows from (4.11) that $N_{\mu'} = -(-g^{3'3'})^{-1/2}\, \delta^{3'}{}_{\mu'} \neq 0$ at $S_c$. The metric induced by $k_{\mu'\nu'}$ on the tangent spaces of $T_c$ at points of $S_c$ is Lorentzian and this property will be preserved in some neighbourhood of $S_c$ in $T_c$ if $\phi$ is smooth. The quasilinear wave equation (4.10) and the initial conditions (4.11) suggest finding $T_c$ by solving a Cauchy problem for $\phi$.

In the particular case where $f = \text{const.} = \chi_0$, the existence of a unique smooth solution can be inferred from general theorems (cf. [9]) which also entail the smooth dependence of the solution on the initial data. This allows us to construct (sufficiently close to $S$) a family of hypersurfaces $T_c$ with mean extrinsic curvature $\chi_0$, which can be described as the set of level hypersurfaces of a smooth function $x^3$ with $dx^3 \neq 0$ and $x^3 = 0$ on the intersection of its domain of definition with $T$. In view of Lemma 4.1 this entails the last statement of the lemma above.

However, if $\partial_\alpha f \neq 0$, $\alpha = 0, 1, 2$, we cannot proceed in this way. While the left hand side of (4.10) is expressed in terms of the coordinates $x^{\alpha'}$, the function $f$ on the far right hand side is given in terms of the coordinates $x^\alpha$ which still need to be determined as functions of the $x^{\mu'}$. This leads us to consider Eqs. (4.5) again.

We begin with a few basic remarks. A vector field $s$ is tangent to $T_c$ if and only if $s(\Phi) = 0$ or, equivalently, if
$$s^{3'} = \phi_{,\alpha'}\, s^{\alpha'}. \qquad (4.12)$$
Thus we only need to determine $\phi$ and $s^{\alpha'}$ to find $s^{\mu'}$ on $T_c$. We shall consider equations, for unknowns on $T_c$, in which vector fields $e_a$ tangent to $T_c$ act as operators. Any such unknown $h$ will be thought of as being induced by a function $H$ defined on some neighbourhood of $T_c$. In our coordinates $x^{\mu'}$, which are not adapted to $T_c$, the usual expression $e_a(h) = h_{,\mu'}\, e^{\mu'}{}_a$ is not directly defined. By our procedure above, $T_c$ is parametrized by the $x^{\alpha'}$ and we have $h = h(x^{\alpha'}) = H(x^{\alpha'}, c + \phi(x^{\alpha'}))$, which entails
$$h_{,\alpha'}\, e^{\alpha'}{}_a = H_{,\alpha'}\, e^{\alpha'}{}_a + H_{,3'}\, \phi_{,\alpha'}\, e^{\alpha'}{}_a = H_{,\alpha'}\, e^{\alpha'}{}_a + H_{,3'}\, e^{3'}{}_a = e_a(H).$$
Therefore, any expression $e_a(h)$ with $h$ defined on $T_c$ and $e_a$ tangent to $T_c$ will be interpreted in the following by
$$e_a(h) = h_{,\alpha'}\, e^{\alpha'}{}_a. \qquad (4.13)$$
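The interpretation (4.13) is just the chain rule on the graph $x^{3'} = c + \phi(x^{\alpha'})$. The sketch below checks it numerically for one tangent vector; the functions `phi` and `H`, the point `p` and the vector components are hypothetical choices made only for illustration and do not come from the paper.

```python
import math

C = 0.3  # assumed value of the parameter c


def phi(x0, x1, x2):
    """Hypothetical graph function phi(x^{alpha'})."""
    return 0.1 * math.sin(x0) * x1 + 0.05 * x2 ** 2


def dphi(x0, x1, x2):
    """Exact partial derivatives of phi."""
    return (0.1 * math.cos(x0) * x1, 0.1 * math.sin(x0), 0.1 * x2)


def H(x0, x1, x2, x3):
    """Hypothetical function on a neighbourhood of T_c."""
    return x3 ** 2 * math.cos(x1) + x0 * x2 * x3


def dH(x0, x1, x2, x3):
    """Exact partial derivatives of H with respect to all four coordinates."""
    return (x2 * x3,
            -x3 ** 2 * math.sin(x1),
            x0 * x3,
            2.0 * x3 * math.cos(x1) + x0 * x2)


def h(x0, x1, x2):
    """Function induced on T_c: h = H restricted to the graph."""
    return H(x0, x1, x2, C + phi(x0, x1, x2))


def dh_numeric(p, eps=1e-6):
    """Central finite differences of h in the three parameters x^{alpha'}."""
    out = []
    for i in range(3):
        up, dn = list(p), list(p)
        up[i] += eps
        dn[i] -= eps
        out.append((h(*up) - h(*dn)) / (2.0 * eps))
    return tuple(out)


p = (0.4, -0.7, 1.1)
s_tan = (1.0, 0.3, -0.2)                              # alpha'-components of s
s3 = sum(d * s for d, s in zip(dphi(*p), s_tan))      # 3'-component from (4.12)

lhs = sum(d * s for d, s in zip(dh_numeric(p), s_tan))       # h_{,alpha'} s^{alpha'}
grad = dH(*p, C + phi(*p))
rhs = sum(grad[i] * s_tan[i] for i in range(3)) + grad[3] * s3  # s(H) on T_c
residual = abs(lhs - rhs)
```

The quantity `residual` compares the left and right ends of the chain of equalities preceding (4.13) and should vanish up to finite-difference error.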
Since the connection $D$ induced on $T_c$ is not known yet, we express $D$ in terms of the derivative operator $\nabla$ and the second fundamental form $\chi_{\mu'\nu'}$ on $T_c$. For any vector fields $e_0$, $s$ tangent to $T_c$ we should have on $T_c$,
$$D_{e_0} s^{\mu'} = \nabla_{e_0} s^{\mu'} - N^{\mu'}\, \chi_{\rho'\lambda'}\, s^{\rho'} e^{\lambda'}{}_0.$$
Because of (4.12) it is sufficient to consider the $\alpha'$-components of this equation, the $3'$-component will be a consequence. Thus Eqs. (4.5) take the form
$$\nabla_{e_0} e^{\alpha'}{}_0 - N^{\alpha'} \chi_{00} = F^A(x^\alpha, c)\, e^{\alpha'}{}_A, \qquad (4.14)$$
$$\nabla_{e_0} e^{\alpha'}{}_A - N^{\alpha'} \chi_{0A} = -g_{AB}\, F^B(x^\alpha, c)\, e^{\alpha'}{}_0, \qquad (4.15)$$
with
$$e^{3'}{}_a = \phi_{,\alpha'}\, e^{\alpha'}{}_a \qquad (4.16)$$
being used wherever $e^{3'}{}_a$ occurs in the equations. The transformation $x^\alpha = x^\alpha(x^{\alpha'})$ will be obtained as solution to
$$e_0(x^\alpha) = \delta^\alpha{}_0. \qquad (4.17)$$
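In Eqs. (4.14), (4.15) we set $N^{\mu'} = e^{\mu'}{}_3 = N\, \nabla^{\mu'} \Phi$. However, writing $\chi_{ab} = \chi_{\mu'\nu'}\, e^{\mu'}{}_a e^{\nu'}{}_b$ with the expression (4.9) of the second fundamental form, would introduce terms of second order in $\phi$ which would spoil the hyperbolicity of the system. We shall derive instead propagation equations for $\chi_{ab}$. On the hypersurface $T_c$ we will have Codazzi’s and Gauss’ equations which will take in the frame $e_a$ on $T_c$ the form
$$D_b \chi_{ca} - D_c \chi_{ba} = R^3{}_{abc}, \qquad (4.18)$$
$$e_c(\Gamma_d{}^a{}_b) - e_d(\Gamma_c{}^a{}_b) + \Gamma_c{}^a{}_e \Gamma_d{}^e{}_b - \Gamma_d{}^a{}_e \Gamma_c{}^e{}_b - \Gamma_e{}^a{}_b (\Gamma_c{}^e{}_d - \Gamma_d{}^e{}_c) + \chi_c{}^a \chi_{db} - \chi_d{}^a \chi_{cb} = R^a{}_{bcd}, \qquad (4.19)$$
respectively, with $\Gamma_a{}^c{}_b$ denoting the connection coefficients of $D$ in the frame $e_a$. We write as usual $D_b \chi_{ca} = e_b(\chi_{ca}) - \Gamma_b{}^e{}_c\, \chi_{ea} - \Gamma_b{}^e{}_a\, \chi_{ce}$ and assume the interpretation (4.13) of directional derivatives. Equation (4.18) implies the system
$$D_0 \chi_{01} - D_1 \chi_{11} - D_2 \chi_{12} = e_1(f) + g^{ab} R^3{}_{ab1}, \qquad (4.20)$$
$$D_0 \chi_{02} - D_1 \chi_{12} - D_2 \chi_{22} = e_2(f) + g^{ab} R^3{}_{ab2}, \qquad (4.21)$$
$$D_0 \chi_{11} - D_1 \chi_{01} = R^3{}_{101}, \qquad (4.22)$$
$$2\, D_0 \chi_{12} - D_1 \chi_{02} - D_2 \chi_{01} = R^3{}_{102} + R^3{}_{201}, \qquad (4.23)$$
$$D_0 \chi_{22} - D_2 \chi_{02} = R^3{}_{202}, \qquad (4.24)$$
where we set $f = g^{ab} \chi_{ab}$ and assume that the function $\chi_{00}$, of which no derivative is taken in the equations, is given by $\chi_{00} = f + \chi_{11} + \chi_{22}$. We write $R_{jabc} = R_{\mu'\nu'\lambda'\rho'}\, e^{\mu'}{}_j e^{\nu'}{}_a e^{\lambda'}{}_b e^{\rho'}{}_c$ and use (4.12) to express $e^{3'}{}_a$ in terms of $\phi_{,\alpha'}$ and $e^{\alpha'}{}_a$. With the gauge conditions $\Gamma_0{}^A{}_B = 0$, $\Gamma_0{}^A{}_0 = F^A$, Eq. (4.19) implies the system
$$e_0(\Gamma_A{}^B{}_0) = e_A(F^B) - \Gamma_C{}^B{}_0 \Gamma_A{}^C{}_0 + \Gamma_A{}^B{}_C F^C - F^B F^C g_{AC} - \chi_0{}^B \chi_{A0} + \chi_A{}^B \chi_{00} + R^B{}_{00A}, \qquad (4.25)$$
$$e_0(\Gamma_A{}^B{}_C) = -F^B \Gamma_A{}^0{}_C - \Gamma_A{}^B{}_0 F^D g_{CD} - \Gamma_D{}^B{}_C \Gamma_A{}^D{}_0 - \chi_0{}^B \chi_{AC} + \chi_A{}^B \chi_{0C} + R^B{}_{C0A}. \qquad (4.26)$$
Here the sign of the term $\Gamma_A{}^B{}_C F^C$ in (4.25) is the one obtained by setting $a = B$, $b = c = 0$, $d = A$ in (4.19); it agrees with the corresponding term in the Gauss equation (5.2).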
It remains to explain the meaning of the expressions $e_A(f)$, $e_A(F^B)$. We should have
$$e_a(f) = f_{,\mu}\, \frac{\partial x^\mu}{\partial x^{\mu'}}\, e^{\mu'}{}_a = f_{,\mu}\, e^\mu{}_a = f_{,\alpha}\, e^\alpha{}_a,$$
where $e^\mu{}_a$ denotes the coefficients of the frame field $e_a$ in the coordinates $x^\mu$. We derive equations for the quantities $e^\alpha{}_A$. Because the intrinsic connection on $T_c$ will be torsion free we should have
$$0 = D_{e_a} D_{e_b} x^\alpha - D_{e_b} D_{e_a} x^\alpha = D_{e_a} e^\alpha{}_b - D_{e_b} e^\alpha{}_a,$$
where $e^\alpha{}_a$ is considered for given $\alpha$ as the expression of $dx^\alpha$ in our frame. Observing our gauge conditions, in particular their implication $e^\alpha{}_0 = \delta^\alpha{}_0$, we get the equation
$$e_0(e^\alpha{}_A) = -g_{AB}\, F^B\, \delta^\alpha{}_0 - \Gamma_A{}^B{}_0\, e^\alpha{}_B. \qquad (4.27)$$
In the equations above we set now
$$e_A(f) = f_{,\alpha}\, e^\alpha{}_A, \qquad e_A(F^B) = F^B{}_{,\alpha}\, e^\alpha{}_A. \qquad (4.28)$$
With the interpretations and gauge conditions given above Eqs. (4.10), (4.14), (4.15), (4.17), (4.20) to (4.27), form a quasi-linear system of equations for the unknowns $\phi$, $e^{\alpha'}{}_a$, $x^\alpha(x^{\alpha'})$, $\chi_{ab}$, $\Gamma_a{}^b{}_c$, $e^\alpha{}_a$.

The initial data for the coordinates $x^\alpha(x^{\alpha'})$ and the frame coefficients $e^{\alpha'}{}_a$ are given in the statement of the lemma. The initial data for $\phi$ are given by (4.11). Using Eq. (4.10), we can calculate $\phi$ to second order on $S_c$, which allows us to obtain the initial data for $\chi_{ab}$ from (4.9) and the data for the frame. From Eqs. (4.14), (4.15) we can determine the frame coefficients to first order on $S_c$ which allows us to calculate the coefficients $\Gamma_a{}^b{}_c$. Finally, we get from (4.17) the coordinate transformation to first order, which allows us to determine the coefficients $e^\alpha{}_a$ on $S_c$.

Equation (4.10) is of wave equation type while the remaining equations form a symmetric hyperbolic system if $\phi$ is thought of as being given. The coupled system can be dealt with either directly or by using the well known procedure to write the wave equation as a symmetric hyperbolic system. Then the whole system will be symmetric hyperbolic and the existence and uniqueness of smooth solutions to this system for the given data follows from known results [9]. This implies in particular the first assertion of the lemma.
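The “well known procedure” mentioned above can be illustrated on the simplest model case. The sketch below is not taken from the paper: it assumes the linear wave equation $u_{tt} = u_{xx}$ in one space dimension, introduces the first-order unknowns $(v, w) = (u_t, u_x)$, and checks that the resulting system $A^0 \partial_t U + A^1 \partial_x U = 0$ has symmetric coefficient matrices and is satisfied by the travelling wave $u = \sin(x - t)$. All names are hypothetical.

```python
import math

# First-order form of u_tt = u_xx for U = (v, w) = (u_t, u_x):
#   v_t - w_x = 0,  w_t - v_x = 0.
A0 = [[1.0, 0.0], [0.0, 1.0]]     # symmetric and positive definite
A1 = [[0.0, -1.0], [-1.0, 0.0]]   # symmetric


def is_symmetric(M):
    """Check M[i][j] == M[j][i] for a square list-of-lists matrix."""
    n = len(M)
    return all(M[i][j] == M[j][i] for i in range(n) for j in range(n))


def state(t, x):
    """(u_t, u_x) for the exact solution u = sin(x - t)."""
    return (-math.cos(x - t), math.cos(x - t))


def residual(t, x, eps=1e-5):
    """Components of A0 dU/dt + A1 dU/dx using central differences."""
    dt = [(a - b) / (2.0 * eps)
          for a, b in zip(state(t + eps, x), state(t - eps, x))]
    dx = [(a - b) / (2.0 * eps)
          for a, b in zip(state(t, x + eps), state(t, x - eps))]
    return [sum(A0[i][j] * dt[j] + A1[i][j] * dx[j] for j in range(2))
            for i in range(2)]


max_residual = max(abs(r)
                   for t in (0.0, 0.3, 1.7)
                   for x in (-1.0, 0.2, 2.5)
                   for r in residual(t, x))
```

For the quasilinear equation (4.10) the same idea applies with coefficient matrices depending on the unknowns; the sketch only shows the linear constant-coefficient pattern.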
We comment on the open problem. The data as well as the coefficients of our differential system depend smoothly on the coordinates $x^{\alpha'}$ and the parameter $c$. Thus the solutions will be jointly smooth in $x^{\alpha'}$ and $c$ in some neighbourhood $U'$ of $p$. Choosing $U'$ small enough, we can define the hypersurfaces $T_c \cap U'$ to be the level hypersurfaces $\{x^3 = c\}$ of a smooth function $x^3$ which together with the functions $x^\alpha$ on $T_c \cap U'$ provides a smooth coordinate system $x^\mu$ on $U'$.

To solve the existence problem with this type of argument we would need to show e.g. that $\chi_{ab} = \hat{\chi}_{ab}$ on $T_c$, where $\chi_{ab}$ denotes the symmetric tensor obtained as solution to Eqs. (4.20) to (4.24), while $\hat{\chi}_{ab} = \chi_{\mu'\nu'}\, e^{\mu'}{}_a e^{\nu'}{}_b$ with $\chi_{\mu'\nu'}$ denoting the second fundamental form on $T_c$ given by (4.9). This will be discussed elsewhere.

5. The Reduced Equations

While the constraints induced by the Einstein equations on space- or time-like hypersurfaces are defined uniquely, there are many ways to extract evolution equations from the Einstein equations. Our choice of “reduced equations” or “propagation equations” (and in fact also the representation of the field equations and the gauge conditions introduced in the previous sections) is motivated by the following observations: (i) Our propagation equations are symmetric hyperbolic and allow us to formulate a maximally dissipative boundary value problem. (ii) The constraints are preserved by our propagation equations irrespective of the chosen maximally dissipative boundary condition. While the requirements in (i) are met by many systems, property (ii) imposes a strong restriction on the choice of propagation equations. There appears to be no systematic way to derive such equations and a priori there appears to be no reason why propagation equations satisfying (ii) should exist at all.

Observing (2.2) and (4.2), we obtain for the coefficients $e^\mu{}_p$ the propagation equations
$$0 = -T_0{}^k{}_p\, e^\mu{}_k = \partial_{x^0} e^\mu{}_p - (\Gamma_0{}^q{}_p - \Gamma_p{}^q{}_0)\, e^\mu{}_q - \Gamma_0{}^0{}_p\, \delta^\mu{}_0. \qquad (5.1)$$
The functions $F^A(x^\mu) = \Gamma_0{}^A{}_0(x^\mu)$ and, for $x^3 > 0$, $f(x^\mu) = \chi(x^\mu)$ will be considered in the following as gauge source functions. They are to be prescribed in accordance with the boundary conditions but are free otherwise. Observing (4.4), (4.6), and the symmetries of the connection coefficients, we have to derive equations for the connection coefficients $\Gamma_A{}^b{}_c$, $\chi_{ab} = \Gamma_a{}^3{}_b$, and $\Gamma_3{}^j{}_b$. The Gauss equations with respect to the hypersurfaces $T_c$ provide the equations
$$0 = \Delta^B{}_{00A} = e_0(\Gamma_A{}^B{}_0) - e_A(F^B) + \Gamma_C{}^B{}_0 \Gamma_A{}^C{}_0 - \Gamma_A{}^B{}_C F^C + F^B F^C g_{AC} + \chi_0{}^B \chi_{A0} - \chi_A{}^B \chi_{00} - C^B{}_{00A}, \qquad (5.2)$$
$$0 = \Delta^B{}_{C0A} = e_0(\Gamma_A{}^B{}_C) + F^B \Gamma_A{}^0{}_C + \Gamma_A{}^B{}_0 F^D g_{CD} + \Gamma_D{}^B{}_C \Gamma_A{}^D{}_0 + \chi_0{}^B \chi_{AC} - \chi_A{}^B \chi_{0C} - C^B{}_{C0A}. \qquad (5.3)$$
Codazzi’s equations, $0 = \Delta^3{}_{abc} = D_b \chi_{ca} - D_c \chi_{ba} - C^3{}_{abc}$, imply propagation equations
$$0 = g^{ab} \Delta^3{}_{ab1} = D_0 \chi_{01} - D_1 \chi_{11} - D_2 \chi_{12} - D_1(f), \qquad (5.4)$$
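$$0 = g^{ab} \Delta^3{}_{ab2} = D_0 \chi_{02} - D_1 \chi_{12} - D_2 \chi_{22} - D_2(f), \qquad (5.5)$$
$$0 = \Delta^3{}_{101} = D_0 \chi_{11} - D_1 \chi_{01} - C^3{}_{101}, \qquad (5.6)$$
$$0 = \Delta^3{}_{201} + \Delta^3{}_{102} = 2\, D_0 \chi_{12} - D_1 \chi_{02} - D_2 \chi_{01} - C^3{}_{201} - C^3{}_{102}, \qquad (5.7)$$
$$0 = \Delta^3{}_{202} = D_0 \chi_{22} - D_2 \chi_{02} - C^3{}_{202}. \qquad (5.8)$$
In these equations it is understood that the component $\chi_{00}$, which appears only in undifferentiated form, is given by $\chi_{00} = \chi_{11} + \chi_{22} + f$. Using again the relation $g^{ab}\, \Gamma_a{}^3{}_b = f$, we get for the coefficients $\Gamma_3{}^j{}_b$ the equations
$$0 = \Delta^A{}_{B03} = e_0(\Gamma_3{}^A{}_B) + F^A \Gamma_3{}^0{}_B + \Gamma_3{}^A{}_0 F^C g_{BC} + \Gamma_C{}^A{}_B \Gamma_3{}^C{}_0 + \Gamma_3{}^A{}_B \Gamma_3{}^3{}_0 + \chi_0{}^A \Gamma_3{}^3{}_B - \Gamma_3{}^A{}_3\, \chi_{0B} - \Gamma_C{}^A{}_B\, \chi_0{}^C - C^A{}_{B03}, \qquad (5.9)$$
$$0 = \Delta^A{}_{003} = e_0(\Gamma_3{}^A{}_0) - e_3(F^A) + \chi_0{}^A \Gamma_3{}^3{}_0 - \Gamma_3{}^A{}_B F^B + \Gamma_B{}^A{}_0 \Gamma_3{}^B{}_0 + \Gamma_3{}^A{}_0 \Gamma_3{}^3{}_0 - \Gamma_B{}^A{}_0\, \chi_0{}^B - \Gamma_3{}^3{}_B\, g^{BA} \chi_{00} - F^A \chi_{00} - C^A{}_{003}, \qquad (5.10)$$
$$0 = \Delta^3{}_{A03} + \Delta^3{}_{03A} = e_0(\Gamma_3{}^3{}_A) - e_A(\Gamma_3{}^3{}_0) + \Gamma_3{}^3{}_0 F^B g_{BA} + \Gamma_3{}^3{}_C \Gamma_A{}^C{}_0, \qquad (5.11)$$
$$0 = g^{ab} \Delta^3{}_{ab3} = e_0(\Gamma_3{}^3{}_0) + g^{AB} e_A(\Gamma_3{}^3{}_B) - e_3(f) - g^{ab}\, \Gamma_3{}^3{}_k \Gamma_b{}^k{}_a + g^{ab}\, \Gamma_b{}^3{}_k \Gamma_3{}^k{}_a + g^{ab}\, \Gamma_m{}^3{}_a (\Gamma_3{}^m{}_b - \Gamma_b{}^m{}_3). \qquad (5.12)$$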
There are various ways to extract symmetric hyperbolic propagation equations from the overdetermined Bianchi equation. We shall use the boundary adapted system introduced in [4] because this is particularly well suited for the discussion of initial boundary value problems. We denote by $N = e_3$ the vector orthogonal to the family of hypersurfaces $T_c$ and write again $n = e_0$, $\epsilon_{ijk} = n^l \epsilon_{lijk}$. Using the fact that the electric and magnetic parts of the conformal Weyl tensor are symmetric and trace-free, the boundary adapted system is written as a system for the unknowns $E_{ij}$, $B_{ij}$, $1 \leq i \leq j$, $i < 3$. It is understood that the relations $g^{ij} E_{ij} = 0$ and $g^{ij} B_{ij} = 0$ are used everywhere in the following equations to replace the fields $E_{33}$ and $B_{33}$ by our unknowns. The boundary adapted system is then given by
$$P_{ij} + N_{(i}\, \epsilon_{j)}{}^{kl} N_k Q_l = 0, \qquad Q_{ij} - N_{(i}\, \epsilon_{j)}{}^{kl} N_k P_l = 0, \qquad 1 \leq i \leq j,\ i < 3. \qquad (5.13)$$
Writing it in the equivalent form
$$\begin{aligned}
P_{11} - P_{22} &= 0, & Q_{11} - Q_{22} &= 0, \\
2\, P_{12} &= 0, & 2\, Q_{12} &= 0, \\
P_{11} + P_{22} &= 0, & Q_{11} + Q_{22} &= 0, \\
P_{13} &= \tfrac{1}{2} Q_2, & Q_{13} &= -\tfrac{1}{2} P_2, \\
P_{23} &= -\tfrac{1}{2} Q_1, & Q_{23} &= \tfrac{1}{2} P_1,
\end{aligned} \qquad (5.14)$$
as a system for the unknown “vector” $u$ which is the transpose of
$$((E_-, 2E_{12}, E_+, E_{13}, E_{23}),\ (B_-, 2B_{12}, B_+, B_{13}, B_{23})),$$
where $E_\pm = E_{11} \pm E_{22}$, and $B_\pm = B_{11} \pm B_{22}$, it takes the form $(\mathbf{I}^\mu + \mathbf{A}^\mu)\, \partial_\mu u = b$, with
$$\mathbf{I}^\mu = \begin{pmatrix} I^\mu & 0 \\ 0 & I^\mu \end{pmatrix}, \qquad \mathbf{A}^\mu = \begin{pmatrix} 0 & A^\mu \\ {}^t\!A^\mu & 0 \end{pmatrix},$$
where
$$I^\mu = \delta^\mu{}_0 \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}, \qquad (5.15)$$
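$$A^\mu = \begin{pmatrix}
0 & -e^\mu{}_3 & 0 & e^\mu{}_2 & e^\mu{}_1 \\
e^\mu{}_3 & 0 & 0 & -e^\mu{}_1 & e^\mu{}_2 \\
0 & 0 & 0 & e^\mu{}_2 & -e^\mu{}_1 \\
-e^\mu{}_2 & e^\mu{}_1 & -e^\mu{}_2 & 0 & 0 \\
-e^\mu{}_1 & -e^\mu{}_2 & e^\mu{}_1 & 0 & 0
\end{pmatrix}. \qquad (5.16)$$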
The reduced equations consisting of (5.1) to (5.13), in which our gauge conditions, in particular (4.2), (4.4), are assumed, are thus seen to form a symmetric hyperbolic system. The following specific feature of the system (5.13) should be noticed here. As discussed in [4], we could have taken the system $P_{jk} = 0$, $Q_{jk} = 0$, suitably interpreted, as propagation equations. This would also have resulted in a symmetric hyperbolic system of propagation equations. However, in that case the rank of the matrix $A^3$, and with it the freedom to prescribe boundary data for the reduced system, would have been larger than in the present case. Another important reason for the choice of (5.13) will be pointed out in our discussion of the subsidiary system.

6. The Subsidiary System

We show now that solutions to the reduced system which satisfy the constraints on $S$ are indeed solutions to the full Einstein equations. Let $g'$ be a time-oriented Lorentz metric on $M$ for which $T$ is time-like and $S$ is space-like and such that it is in the past of $M \setminus S$. For a given subset $V$ of $T \cup S$ and an open subset $U$ of $M$ we define the domain of dependence of $V$ in $U$ with respect to $g'$ as the set of points $p \in U$ such that (i) $I^-(p)$, the chronological past of $p$ in $(M, g')$, is contained in $U$, (ii) every past inextendible $g'$-causal curve through $p$ meets $V \cap U$.

Theorem 6.1. Suppose that the fields $e^\mu{}_k$, $\Gamma_i{}^j{}_k$, $C^i{}_{jkl}$, with $\chi_{ab} \equiv \Gamma_a{}^3{}_b$ symmetric, are smooth on some open neighbourhood $U$ of $p \in \Sigma$ in $M$ and satisfy the gauge conditions (4.2), (4.4) as well as the reduced equations (5.1) to (5.13) on $U$. Let $g$ be the metric for which the frame $e_k$ is orthonormal and denote by $D$ the domain of dependence of $(S \cup T) \cap U$ in $U$ with respect to $g$. Then the Einstein equations (2.5) will be satisfied on $D$ if they are satisfied on $S \cap U$.

Remark 6.1. It is a remarkable feature of Einstein’s equation that it admits a hyperbolic reduced system which allows us to draw such a conclusion without any reference to the behaviour of the fields on $T$.

Proof. Since we have to show that the tensor fields defined by the left-hand sides of Eqs. (2.2), (2.3), (2.4) vanish on $D$, we shall refer to these fields as the “zero quantities”. The reduced equations are equivalent to the equations
$$T_0{}^k{}_j = 0, \qquad (6.1)$$
$$\Delta^a{}_{b0k} = 0, \quad \Delta^3{}_{A0A} = 0, \quad g^{ab} \Delta^3{}_{ab3} = 0, \quad \Delta^3{}_{A03} + \Delta^3{}_{03A} = 0, \quad \Delta^3{}_{201} + \Delta^3{}_{102} = 0, \quad g^{ab} \Delta^3{}_{abA} = 0, \qquad (6.2)$$
$$P_{ij} + N_{(i}\, \epsilon_{j)}{}^{kl} N_k Q_l = 0, \qquad Q_{ij} - N_{(i}\, \epsilon_{j)}{}^{kl} N_k P_l = 0. \qquad (6.3)$$
We get slightly more information on the torsion tensor. Observing the assumed symmetry of $\chi_{ab}$, we get
$$-T_1{}^i{}_2\, e^\mu{}_i = e^\nu{}_1 \partial_\nu e^\mu{}_2 - e^\nu{}_2 \partial_\nu e^\mu{}_1 - (\Gamma_1{}^a{}_2 - \Gamma_2{}^a{}_1)\, e^\mu{}_a.$$
Evaluating this expression for $\mu = 3$ we get $T_1{}^3{}_2\, e^3{}_3 = 0$, from which we conclude by (4.2) that
$$T_A{}^3{}_B = 0. \qquad (6.4)$$
Since $\Gamma_{ijk} = -\Gamma_{ikj}$, we know that the connection defined by the connection coefficients is metric. However, it is not clear at this stage whether it is torsion free, since so far we only know that (6.1) holds. For this reason the curvature tensor defined by the connection coefficients is not given by $r^i{}_{jkl}$ but by
$$R^i{}_{jkl} = e_k(\Gamma_l{}^i{}_j) - e_l(\Gamma_k{}^i{}_j) + \Gamma_k{}^i{}_m \Gamma_l{}^m{}_j - \Gamma_l{}^i{}_m \Gamma_k{}^m{}_j - \Gamma_m{}^i{}_j (\Gamma_k{}^m{}_l - \Gamma_l{}^m{}_k - T_k{}^m{}_l),$$
which is equivalent to
$$R^i{}_{jkl} = \Delta^i{}_{jkl} + C^i{}_{jkl} + \Gamma_m{}^i{}_j\, T_k{}^m{}_l.$$
Furthermore, it is not known at this stage whether the tensor $C^i{}_{jkl}$ is indeed the conformal Weyl tensor of the metric defined by the frame coefficients $e^\mu{}_k$. Together with the torsion tensor the curvature tensor satisfies the Bianchi identities
$$\sum_{(jkl)} \nabla_j T_k{}^i{}_l = \sum_{(jkl)} \left(R^i{}_{jkl} + T_j{}^m{}_k\, T_l{}^i{}_m\right), \qquad \sum_{(jkl)} \nabla_j R^i{}_{mkl} = -\sum_{(jkl)} R^i{}_{mnj}\, T_k{}^n{}_l,$$
where $\sum_{(jkl)}$ denotes the sum over the cyclic permutation of the indices $jkl$. Observing that we assumed the symmetry $\sum_{(jkl)} C^i{}_{jkl} = 0$, the first identity can be written in the form
$$\sum_{(jkl)} \nabla_j T_k{}^i{}_l = \sum_{(jkl)} \left(\Delta^i{}_{jkl} + \Gamma_m{}^i{}_j\, T_k{}^m{}_l + T_j{}^m{}_k\, T_l{}^i{}_m\right). \qquad (6.5)$$
Again, observing this equation and that the tensor $C^i{}_{jkl}$ has the algebraic properties of a conformal Weyl tensor, we get the relation
$$\sum_{(jkl)} \nabla_j C^i{}_{mkl} = -\frac{1}{2}\, \epsilon^{k'l'i}{}_m\, \epsilon_{jkl}{}^{m'}\, \nabla_{i'} C^{i'}{}_{m'k'l'}.$$
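This allows us to write the second Bianchi identity in the form
$$\sum_{(jkl)} \nabla_j \Delta^i{}_{mkl} = L^i{}_{mjkl}, \qquad (6.6)$$
where we set
$$L^i{}_{mjkl} = \frac{1}{2}\, \epsilon^{k'l'i}{}_m\, \epsilon_{jkl}{}^{j'} H_{j'k'l'} - \sum_{(jkl)} \Big\{ \Gamma_n{}^i{}_m\, \Delta^n{}_{jkl} + \left(R^i{}_{mnj} + \nabla_j \Gamma_n{}^i{}_m + \Gamma_{n'}{}^i{}_m \Gamma_n{}^{n'}{}_j\right) T_k{}^n{}_l + \Gamma_n{}^i{}_m\, T_j{}^{n'}{}_k\, T_l{}^n{}_{n'} \Big\}, \qquad (6.7)$$
with $\nabla_j \Gamma_n{}^i{}_m = e_j(\Gamma_n{}^i{}_m) - \Gamma_j{}^l{}_n \Gamma_l{}^i{}_m + \Gamma_j{}^i{}_l \Gamma_n{}^l{}_m - \Gamma_j{}^l{}_m \Gamma_n{}^i{}_l$. Notice that the field $L^i{}_{mjkl}$ is a polynomial in the zero quantities which vanishes if the zero quantities vanish. The identities above will be used to derive certain systems of differential equations, the “subsidiary systems”, which are satisfied by the zero quantities. In view of (6.1), we get from (6.5) the equation
$$\nabla_0 T_k{}^i{}_l + \nabla_l T_0{}^i{}_k + \nabla_k T_l{}^i{}_0 = \Delta^i{}_{0kl} + \Delta^i{}_{l0k} + \Delta^i{}_{kl0} + \Gamma_m{}^i{}_0\, T_k{}^m{}_l, \qquad (6.8)$$
which can be rewritten
$$e_0(T_k{}^i{}_l) = \Delta^i{}_{0kl} + \Delta^i{}_{l0k} + \Delta^i{}_{kl0} + (\Gamma_m{}^i{}_0 - \Gamma_0{}^i{}_m)\, T_k{}^m{}_l + (\Gamma_l{}^m{}_0 - \Gamma_0{}^m{}_l)\, T_m{}^i{}_k + (\Gamma_k{}^m{}_0 - \Gamma_0{}^m{}_k)\, T_l{}^i{}_m. \qquad (6.9)$$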
With (6.2), (6.4) this implies in our gauge $e_0(T_A{}^3{}_3) = -\Gamma_A{}^B{}_0\, T_B{}^3{}_3$, from which we conclude that $T_A{}^3{}_3 = 0$. Combined with (6.1), (6.4) this gives
$$T_i{}^3{}_j = 0. \qquad (6.10)$$
Using this equation in (6.5) we get the relation
$$\sum_{(jkl)} \Delta^3{}_{jkl} = \sum_{(jkl)} (\Gamma_j{}^3{}_a - \Gamma_a{}^3{}_j)\, T_k{}^a{}_l, \qquad (6.11)$$
which implies
$$\Delta^3{}_{012} + \Delta^3{}_{201} + \Delta^3{}_{120} = 0, \qquad (6.12)$$
$$\Delta^3{}_{123} + \Delta^3{}_{231} = \Gamma_3{}^3{}_a\, T_1{}^a{}_2. \qquad (6.13)$$
We shall show now that the “vector” $v$ which is the transpose of
$$(T_1{}^a{}_2,\ \Delta^A{}_{012},\ \Delta^1{}_{221},\ \Delta^3{}_{001},\ \Delta^3{}_{002},\ \Delta^3{}_{012}),$$
vanishes on $D$. For this purpose we derive a homogeneous symmetric hyperbolic system for $v$ as follows.
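The equations for $T_1{}^a{}_2$, obtained from (6.9), are given by
$$e_0(T_1{}^a{}_2) = \Delta^a{}_{012} + (\Gamma_A{}^a{}_0 - \Gamma_0{}^a{}_A)\, T_1{}^A{}_2 - (\Gamma_1{}^1{}_0 + \Gamma_2{}^2{}_0)\, T_1{}^a{}_2. \qquad (6.14)$$
The equations for the remaining components of $v$ are obtained by observing (6.2), the symmetry of $\Gamma_a{}^3{}_b$ as well as the results obtained so far, and by writing out in detail the six equations of (6.6) where the quantities $L^1{}_{0012}$, $L^2{}_{0012}$, $L^1{}_{2021}$, $L^3{}_{0012}$, $L^3{}_{2210}$, $L^3{}_{1120}$ occur on the left-hand sides. It is important to note that there is one component of the tensor $H_{jkl}$ for each component of the tensor $L^i{}_{mjkl}$ coming from $\epsilon^{k'l'i}{}_m\, \epsilon_{jkl}{}^{j'} H_{j'k'l'}$. For the quantities mentioned above these components are respectively $H_{323}$, $H_{331}$, $H_{330}$, $H_{310}$, $H_{302}$, $H_{312}$. However all these quantities vanish due to the reduced equations (6.3). A straightforward calculation shows us that
$$\begin{aligned}
H_{323} &= Q_{13} + \tfrac{1}{2} P_2 = 0, & H_{310} &= P_{13} - \tfrac{1}{2} Q_2 = 0, \\
H_{331} &= Q_{23} - \tfrac{1}{2} P_1 = 0, & H_{302} &= P_{23} + \tfrac{1}{2} Q_1 = 0, \\
H_{312} &= Q_{33} = 0, & H_{330} &= P_{33} = 0,
\end{aligned}$$
where the last two equations follow from the reduced equations since $P_{ij}$ and $Q_{ij}$ are trace free. After a somewhat lengthy though straightforward calculation we get
$$\begin{aligned}
e_0(\Delta^1{}_{012}) ={}& \tfrac{3}{2}\, \Gamma_0{}^3{}_1\, \Delta^3{}_{012} - (\Gamma_0{}^3{}_0 + \Gamma_1{}^3{}_1)\, \Delta^3{}_{002} + \Gamma_2{}^3{}_1\, \Delta^3{}_{001} - \Gamma_0{}^2{}_0\, \Delta^1{}_{221} \\
& - (2\, \Gamma_1{}^1{}_0 + \Gamma_2{}^2{}_0)\, \Delta^1{}_{012} - \Gamma_2{}^1{}_0\, \Delta^2{}_{012} - \left(R^1{}_{0a0} + \nabla_0 \Gamma_a{}^1{}_0 + \Gamma_k{}^1{}_0 \Gamma_a{}^k{}_0\right) T_1{}^a{}_2,
\end{aligned} \qquad (6.15)$$
$$\begin{aligned}
e_0(\Delta^2{}_{012}) ={}& \tfrac{3}{2}\, \Gamma_0{}^3{}_2\, \Delta^3{}_{012} + (\Gamma_0{}^3{}_0 + \Gamma_2{}^3{}_2)\, \Delta^3{}_{001} - \Gamma_1{}^3{}_2\, \Delta^3{}_{002} + \Gamma_0{}^1{}_0\, \Delta^1{}_{221} \\
& - (\Gamma_1{}^1{}_0 + 2\, \Gamma_2{}^2{}_0)\, \Delta^2{}_{012} - \Gamma_1{}^2{}_0\, \Delta^1{}_{012} - \left(R^2{}_{0a0} + \nabla_0 \Gamma_a{}^2{}_0 + \Gamma_k{}^2{}_0 \Gamma_a{}^k{}_0\right) T_1{}^a{}_2,
\end{aligned} \qquad (6.16)$$
$$\begin{aligned}
e_0(\Delta^1{}_{221}) ={}& \Gamma_0{}^3{}_1\, \Delta^3{}_{001} + \Gamma_0{}^3{}_2\, \Delta^3{}_{002} - (\Gamma_0{}^2{}_0 - \Gamma_1{}^1{}_2)\, \Delta^1{}_{012} + (\Gamma_0{}^1{}_0 + \Gamma_2{}^1{}_2)\, \Delta^2{}_{012} \\
& - (\Gamma_1{}^1{}_0 + \Gamma_2{}^2{}_0)\, \Delta^1{}_{221} + \left(R^1{}_{2a0} + \nabla_0 \Gamma_a{}^1{}_2 + \Gamma_k{}^1{}_2 \Gamma_a{}^k{}_0\right) T_1{}^a{}_2,
\end{aligned} \qquad (6.17)$$
$$\begin{aligned}
e_0(\Delta^3{}_{012}) - e_1(\Delta^3{}_{002}) + e_2(\Delta^3{}_{001}) ={}& -(2\, \Gamma_0{}^2{}_0 + \Gamma_1{}^1{}_2)\, \Delta^3{}_{001} + (2\, \Gamma_0{}^1{}_0 + \Gamma_2{}^2{}_1)\, \Delta^3{}_{002} - 2\, \Gamma_0{}^3{}_A\, \Delta^A{}_{012} \\
& - \tfrac{3}{2} (\Gamma_1{}^1{}_0 + \Gamma_2{}^2{}_0)\, \Delta^3{}_{012} - \left(R^3{}_{0a0} + \nabla_0 \Gamma_a{}^3{}_0 + \Gamma_k{}^3{}_0 \Gamma_a{}^k{}_0\right) T_1{}^a{}_2,
\end{aligned} \qquad (6.18)$$
$$\begin{aligned}
e_0(\Delta^3{}_{001}) + \tfrac{1}{2}\, e_2(\Delta^3{}_{012}) ={}& -(\Gamma_1{}^1{}_0 + 2\, \Gamma_2{}^2{}_0)\, \Delta^3{}_{001} + \Gamma_1{}^2{}_0\, \Delta^3{}_{002} - \tfrac{3}{2}\, \Gamma_0{}^2{}_0\, \Delta^3{}_{012} \\
& + \Gamma_0{}^3{}_0\, \Delta^2{}_{012} + \Gamma_A{}^3{}_2\, \Delta^A{}_{012} + \Gamma_0{}^3{}_1\, \Delta^1{}_{221} + \left(R^3{}_{2a0} + \nabla_0 \Gamma_a{}^3{}_2 + \Gamma_k{}^3{}_2 \Gamma_a{}^k{}_0\right) T_1{}^a{}_2,
\end{aligned} \qquad (6.19)$$
$$\begin{aligned}
e_0(\Delta^3{}_{002}) - \tfrac{1}{2}\, e_1(\Delta^3{}_{012}) ={}& \Gamma_2{}^1{}_0\, \Delta^3{}_{001} - (2\, \Gamma_1{}^1{}_0 + \Gamma_2{}^2{}_0)\, \Delta^3{}_{002} + \tfrac{3}{2}\, \Gamma_0{}^1{}_0\, \Delta^3{}_{012} \\
& - (\Gamma_0{}^3{}_0 + \Gamma_1{}^3{}_1)\, \Delta^1{}_{012} - \Gamma_2{}^3{}_1\, \Delta^2{}_{012} - \Gamma_0{}^3{}_2\, \Delta^1{}_{221} - \left(R^3{}_{1a0} + \nabla_0 \Gamma_a{}^3{}_1 + \Gamma_k{}^3{}_1 \Gamma_a{}^k{}_0\right) T_1{}^a{}_2.
\end{aligned} \qquad (6.20)$$
Multiplying the last two equations by 2 we obtain a system for $v$ which is symmetric hyperbolic. A calculation shows that its characteristics are non-space-like for $g$. Moreover, it does not contain the directional derivative operator $e_3$. As pointed out in our discussion of maximally dissipative boundary value problems this allows us to obtain energy estimates for our solution regardless of its behaviour on $T$. Since $v = 0$ on $S$ by assumption, it follows that $v$ vanishes in $D$. Thus we have
$$T_1{}^a{}_2 = 0, \quad \Delta^A{}_{012} = 0, \quad \Delta^1{}_{221} = 0, \quad \Delta^3{}_{001} = 0, \quad \Delta^3{}_{002} = 0, \quad \Delta^3{}_{012} = 0,$$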
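whence, by (6.2), (6.12), (6.13), $\Delta^3{}_{201} = 0$, $\Delta^3{}_{102} = 0$, $\Delta^3{}_{123} + \Delta^3{}_{231} = 0$.

To write out the equations for the “vector” $u$ which is the transpose of
$$(\Delta^3{}_{103},\ \Delta^3{}_{203},\ \Delta^3{}_{123},\ \Delta^3{}_{113},\ \Delta^3{}_{223}),$$
it will be convenient to use the following notation. For any tensor field $T^{ij\ldots}{}_{kl\ldots}$ we write
$$\Gamma_m T^{i\ldots j}{}_{k\ldots l} \equiv (\nabla_m - e_m)\, T^{i\ldots j}{}_{k\ldots l} = \Gamma_m{}^i{}_n\, T^{n\ldots j}{}_{k\ldots l} + \ldots - \Gamma_m{}^n{}_l\, T^{i\ldots j}{}_{k\ldots n},$$
which is a bi-linear expression in the components of the tensor field and the connection coefficients. Observing the results obtained so far, we get, again from (6.6), the equations
$$e_0(\Delta^3{}_{103}) - e_1(\Delta^3{}_{113}) - e_2(\Delta^3{}_{123}) = g^{ij} \left(L^3{}_{ij13} - \Gamma_j \Delta^3{}_{i13} + \Gamma_3 \Delta^3{}_{ij1} + \Gamma_1 \Delta^3{}_{i3j}\right), \qquad (6.21)$$
$$e_0(\Delta^3{}_{203}) - e_1(\Delta^3{}_{123}) - e_2(\Delta^3{}_{223}) = g^{ij} \left(L^3{}_{ij23} - \Gamma_j \Delta^3{}_{i23} + \Gamma_3 \Delta^3{}_{ij2} + \Gamma_2 \Delta^3{}_{i3j}\right), \qquad (6.22)$$
$$\begin{aligned}
2\, e_0(\Delta^3{}_{123}) - e_1(\Delta^3{}_{203}) - e_2(\Delta^3{}_{103}) ={}& L^3{}_{1023} + L^3{}_{2130} - \Gamma_0 \Delta^3{}_{123} + \Gamma_3 \Delta^3{}_{102} + \Gamma_2 \Delta^3{}_{130} \\
& - \Gamma_1 \Delta^3{}_{230} + \Gamma_0 \Delta^3{}_{213} + \Gamma_3 \Delta^3{}_{201},
\end{aligned} \qquad (6.23)$$
$$e_0(\Delta^3{}_{113}) - e_1(\Delta^3{}_{103}) = L^3{}_{1013} - \Gamma_0 \Delta^3{}_{113} + \Gamma_3 \Delta^3{}_{101} + \Gamma_1 \Delta^3{}_{130}, \qquad (6.24)$$
$$e_0(\Delta^3{}_{223}) - e_2(\Delta^3{}_{203}) = L^3{}_{2023} - \Gamma_0 \Delta^3{}_{223} + \Gamma_3 \Delta^3{}_{202} + \Gamma_2 \Delta^3{}_{230}. \qquad (6.25)$$
The system (6.21) to (6.25) for $u$ takes the form $A^\mu \partial_\mu u = b$ with
$$A^\mu = \begin{pmatrix}
e^\mu{}_0 & 0 & -e^\mu{}_2 & -e^\mu{}_1 & 0 \\
0 & e^\mu{}_0 & -e^\mu{}_1 & 0 & -e^\mu{}_2 \\
-e^\mu{}_2 & -e^\mu{}_1 & 2\, e^\mu{}_0 & 0 & 0 \\
-e^\mu{}_1 & 0 & 0 & e^\mu{}_0 & 0 \\
0 & -e^\mu{}_2 & 0 & 0 & e^\mu{}_0
\end{pmatrix}.$$
Using finally the definition of $H_{jkl}$ we set
$$N_{kl} = \nabla^j H_{jkl} = \nabla^i \nabla^j C_{ijkl}. \qquad (6.26)$$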
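As a plausibility check on the coefficient matrix of the system (6.21) to (6.25), the sketch below assembles it row by row from the principal parts of those five equations and tests symmetry and positive definiteness. It is an illustration under stated assumptions, not part of the proof: the arguments `xi0`, `xi1`, `xi2` are assumed to be the frame components $\xi_\mu e^\mu{}_a$ of a covector, and the function names are hypothetical.

```python
import math


def subsidiary_matrix(xi0, xi1, xi2):
    """Symbol A^mu xi_mu for u = (D103, D203, D123, D113, D223).

    Each row lists the coefficients of e_0, e_1, e_2 acting on the
    components of u in (6.21), (6.22), (6.23), (6.24), (6.25).
    """
    return [
        [xi0, 0.0, -xi2, -xi1, 0.0],
        [0.0, xi0, -xi1, 0.0, -xi2],
        [-xi2, -xi1, 2.0 * xi0, 0.0, 0.0],
        [-xi1, 0.0, 0.0, xi0, 0.0],
        [0.0, -xi2, 0.0, 0.0, xi0],
    ]


def is_symmetric(M):
    """Check M[i][j] == M[j][i] for a square list-of-lists matrix."""
    n = len(M)
    return all(M[i][j] == M[j][i] for i in range(n) for j in range(n))


def is_positive_definite(M):
    """Attempt a Cholesky factorisation; it succeeds iff M is positive definite."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True


symmetric = is_symmetric(subsidiary_matrix(0.3, -1.2, 2.0))
pd_unit_normal = is_positive_definite(subsidiary_matrix(1.0, 0.0, 0.0))
pd_timelike = is_positive_definite(subsidiary_matrix(1.0, 0.6, 0.3))
pd_spacelike = is_positive_definite(subsidiary_matrix(1.0, 0.9, 0.9))
```

With these sample values the matrix is positive definite for the covectors with $\xi_0^2 > \xi_1^2 + \xi_2^2$ and fails to be so for the last one, where $\xi_0^2 < \xi_1^2 + \xi_2^2$. This is in line with the statement that the system is symmetric hyperbolic with characteristics that are non-space-like; the sampled points do not replace the calculation referred to in the text.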
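Observing that our solution $C_{ijkl}$ has by definition all the symmetries of a conformal Weyl tensor, i.e. $C_{ijkl} = C_{[ij][kl]}$, $C_{ijkl} = C_{klij}$, $C_{ijk}{}^j = 0$, $C_{i[jkl]} = 0$, which imply $C_{ij[k}{}^m C^{ij}{}_{l]m} = 0$, we find
$$\begin{aligned}
N_{kl} &= R_{mi}{}^m{}_j\, C^{ij}{}_{kl} - R_{ij[k}{}^m\, C^{ij}{}_{l]m} - \frac{1}{2}\, T_i{}^m{}_j\, \nabla_m C^{ij}{}_{kl} \\
&= -\Delta_{mi}{}^m{}_j\, C^{ij}{}_{kl} - \Gamma_m{}^n{}_i\, T_n{}^m{}_j\, C^{ij}{}_{kl} + \Delta_{ij[k}{}^m\, C^{ij}{}_{l]m} - \Gamma_{mij}\, C^{ijn}{}_{[l}\, T_{k]}{}^m{}_n + \frac{1}{2}\, T_i{}^m{}_j\, \nabla_m C^{ij}{}_{kl}.
\end{aligned} \qquad (6.27)$$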
We note that the expression on the far right-hand side is linear in the zero quantities. On the other hand we obtain from the identity (2.6) and the reduced equations (5.13) a relation Hjkl = Pi ui jkl + Qi v i jkl ,
(6.28)
with ui jkl = −2 nj n[k hl] i + hj[k hl] i − N(j n) mi Nm n kl , v i jkl = nj i kl − i j[k nl] + nl N(j k) mi Nm − nk N(j l) mi Nm . Contracting (6.28) with 2 hq k nl and −p kl respectively, we finally get 2 Ln Pq + (qji + 2 N(j q)mi N m ) Dj Qi
(6.29)
= 2 Kq i Pi − 2 Pi hq k nl ∇j ui jkl − 2 Qi hq k nl ∇j v i jkl + 2 hq k nl Nkl , 2 Ln Qp − (pji + 2 N(j p)mi N m ) Dj P i i
= 2 Kq Qi + Pi p
kl
j
∇ u
i
jkl
kl
j
+ Qi p ∇ v
i
jkl
− p
(6.30) kl
Nkl .
If we write these equations as a system for the unknown w which is the transpose of ((P1 , P2 , P3 ), (Q1 , Q2 , Q3 )), they take the form
Iµ ∂µ w + Aµ ∂µ w = b, with I = µ
2 eµ 0 0 0
Iµ 0 , 0 Iµ 0 0 2 eµ 0 0 , 0 eµ 0
Iµ =
0 Aµ , t µ A 0 0 0 eµ 2 0 − eµ 1 . Aµ = 0 − eµ 2 eµ 1 0
Aµ =
Equations (6.9) for the remaining components of the torsion tensor, Eqs. (6.21) to (6.25), and Eqs. (6.29), (6.30) provide the subsidiary system for those zero quantities of which we do not know yet whether they vanish. The system is symmetric hyperbolic and a calculation shows that its characteristics are non-space-like for $g_{\mu\nu}$. The derivative operator $e_3$ does not occur in the system. Since the zero quantities vanish on $S$, we conclude that they vanish on $D$.

The requirement that the operator $e_3$ does not occur in the subsidiary systems was one of our main criteria for choosing the reduced system. Otherwise we would have been confronted with the task to analyse in detail the structure of boundary data for the subsidiary systems which are determined by the reduced system from the data on $S$ as well as on $T$.

7. Initial and Boundary Data

In the following we discuss how to prepare initial and boundary data for the reduced equations. The discussion of the initial data is somewhat complicated by the fact that we do not require the unit normal $e_0$ of $\Sigma$ in $T$ to be orthogonal to $S$. Without this generality our results would be of rather restricted applicability.

7.1. The construction of initial data. Experience with the standard Cauchy problem for Einstein’s vacuum field equation tells us that we have to assume as initial data on $S$ a smooth (negative) Riemannian metric $\gamma_{\alpha\beta}$ and a smooth symmetric tensor field $\kappa_{\alpha\beta}$ satisfying the Hamiltonian and the momentum constraint
$$R' - (\kappa_\alpha{}^\alpha)^2 + \kappa_{\alpha\beta}\, \kappa^{\alpha\beta} = 0, \qquad \delta'^{\alpha} \kappa_{\alpha\beta} - \delta'_\beta\, \kappa_\alpha{}^\alpha = 0, \qquad (7.1)$$
on $S$. Here $\delta'$ denotes the Levi–Civita connection and $R'$ the Ricci scalar of the metric $\gamma$. To derive initial data for the reduced equations we shall first determine data in terms of coordinates $x^{\mu'}$ and a frame $e'_k$ which are suitably adapted to the initial hypersurface $S$ and shall then express these data in the coordinates $x^\mu$ and the frame field $e_k$ which satisfy the conditions described in Sect. 4.

Let $x^{\mu'}$ be functions on $S \cap U$ with $x^{0'} = 0$ on $S \cap U$, $x^{3'} = 0$ on $\Sigma \cap U$, $x^{3'} > 0$ on $(S \setminus \Sigma) \cap U$, such that the $x^{\alpha'}$, $\alpha = 1, 2, 3$, define a smooth coordinate system on $S \cap U$ and the $x^{\alpha'}$, $\alpha = 1, 2$, are constant along the integral curves of the gradient of $x^{3'}$. For numbers $c \geq 0$ in the range of $x^{3'}$ we set $S_c = \{x^{3'} = c\}$. Let $\{e'_p\}_{p=1,2,3}$ be a smooth frame field on $S \cap U$ such that $e'_3$ is orthogonal to the surfaces $S_c$, pointing towards $S$ on $S_0 = \Sigma \cap U$, and such that
$$\gamma(e'_p, e'_q) = g'_{pq} = \mathrm{diag}(-1, -1, -1).$$
The information on the metric $\gamma$ is contained in the coefficients $e^{\alpha'}{}_p = e'_p(x^{\alpha'})$. We write $\kappa'_{pq} = \kappa_{\alpha\beta}\, e^{\alpha'}{}_p e^{\beta'}{}_q$.

Imagine now the initial data set $(S, \gamma, \kappa)$ as being isometrically embedded into a solution $(M, g_{\mu\nu})$ of the field equations and denote by $\nabla$ the connection defined by $g$. Let $e'_0$ be the (future directed) unit normal of $S$. We assume that the orthonormal frame field $\{e'_k\}_{k=0,\ldots,3}$ and the functions $x^{\mu'}$ on $S$ are extended off $S$ such that the frame is parallely propagated in the direction of $e'_0$ and that the coordinates $x^{\alpha'}$, $\alpha = 1, 2, 3$, are constant on the integral curves of $e'_0$, while $x^{0'}$ is a natural parameter on these curves. The connection coefficients defined by $\nabla_{e'_k} e'_j = \Gamma'_k{}^i{}_j\, e'_i$ then satisfy on $S$ (cf. condition (7.27) added below)
$$\delta'_{e'_p} e'_q = \Gamma'_p{}^r{}_q\, e'_r, \qquad \Gamma'_0{}^i{}_j = 0, \qquad \Gamma'_p{}^0{}_q = -\kappa'_{pq}, \qquad (7.2)$$
and we have for the coefficients $e^{\mu'}{}_k = e'_k(x^{\mu'})$,
$$e^{\mu'}{}_0 = \delta^\mu{}_0, \qquad e^{0'}{}_p = 0, \qquad e^{\mu'}{}_3 = \delta^\mu{}_3\, e^{3'}{}_3, \qquad e^{3'}{}_3 > 0.$$
The electric part of the conformal Weyl tensor with respect to $\nu' = e'_0$ then follows under our assumptions from the Gauss equation on $S$. It is given by
$$E'_{pq} = C'_{p0q0} = R'_{pq} - \frac{1}{4} R'\, g'_{pq} - \Big\{ \kappa'_r{}^r \big(\kappa'_{pq} - \frac{1}{4}\, \kappa'_s{}^s\, g'_{pq}\big) - \kappa'_{sp}\, \kappa'_q{}^s + \frac{1}{4}\, \kappa'_{rs}\, \kappa'^{rs}\, g'_{pq} \Big\},$$
where $R'_{pq}$ denotes the Ricci tensor of the metric $\gamma$ in the frame $e'_p$ and $R'$ is the Ricci scalar of $\gamma$. The tensor above is obviously symmetric, the Hamiltonian constraint ensures that it is trace free. The magnetic part follows under our assumptions from the Codazzi equation. It is given by
$$B'_{pq} = \frac{1}{2}\, C'_{p0ik}\, \epsilon'_{q0}{}^{ik} = -\epsilon'_q{}^{rs}\, \delta'_{e'_r} \kappa'_{sp},$$
where $\epsilon'_{ijkl}$ is totally antisymmetric, $\epsilon'_{0123} = 1$, and $\epsilon'_{pqr} = \nu'^{i} \epsilon'_{ipqr}$. The symmetry of the tensor above is a consequence of the momentum constraint, it is trace free because of the symmetry of $\kappa'_{pq}$. These fields together determine the conformal Weyl tensor $C'_{ijkl}$ in the frame $e'_j$ by the formula
$$C'_{ijkl} = 2 \left( l'_{j[k} E'_{l]i} - l'_{i[k} E'_{l]j} - \nu'_{[k} B'_{l]m}\, \epsilon'^{m}{}_{ij} - \nu'_{[i} B'_{j]m}\, \epsilon'^{m}{}_{kl} \right),$$
where we set $l'_{ij} = g'_{ij} - 2\, \nu'_i \nu'_j$.

Only projections into $S$ of expressions (2.1) to (2.4) can be determined from our data $e^{\mu'}{}_k$, $\Gamma'_i{}^j{}_k$, $E'_{pq}$, $B'_{pq}$, $C'_{ijkl}$. Using the projector $\hat{h}'_i{}^j = g'_i{}^j - \nu'_i \nu'^{j}$ and the fact that in 3 dimensions the Riemann tensor is given in terms of the Ricci tensor, we find by the way we derived our data from $\gamma$ and $\kappa$ that
$$\hat{h}'_p{}^i\, \hat{h}'_q{}^j\, T'_i{}^k{}_j = 0, \qquad (7.3)$$
$$\hat{h}'_p{}^k\, \hat{h}'_q{}^l\, \Delta'^{i}{}_{jkl} = 0, \qquad (7.4)$$
$$P'_s = \nu'^{j}\, \hat{h}'_s{}^l\, \nu'^{m}\, \nabla_{e'_i} C'^{i}{}_{jlm} = 0, \qquad Q'_s = -\frac{1}{2}\, \nu'^{j}\, \epsilon'_s{}^{lm}\, \nabla_{e'_i} C'^{i}{}_{jlm} = 0, \qquad (7.5)$$
i.e. the constraints induced by Eqs. (2.5) on $S \cap U$ are satisfied by our data.
If the normal $n$ of $\Sigma \cap U$ in $T$ were orthogonal to $S$, we could set $e_k = e'_k$ on $S \cap U$ and use the data determined above as initial data for the reduced field equations. To include the case where $n$ does not necessarily coincide with $e'_0$ on $\Sigma$ we proceed as follows. We choose functions $x^{\mu}$ such that $x^{\mu} = x'^{\mu}$ on $S \cap U$. We set $e_A = e'_A$, $A = 1, 2$, and
$e_0 = \cosh(\theta)\, e'_0 + \sinh(\theta)\, e'_3, \qquad e_3 = \sinh(\theta)\, e'_0 + \cosh(\theta)\, e'_3,$ (7.6)
with $\theta \in C^{\infty}(S \cap U)$ chosen such that $e_0 = n$ on $\Sigma \cap U$. We write the relations above in the form $e_k = \Lambda^{j}{}_{k}\, e'_j$ with a Lorentz transformation $\Lambda^{j}{}_{k}$. We note here that $\Theta \equiv \theta|_{\Sigma \cap U}$ is a free datum which determines in part the geometry of the space-time we wish to construct, while on the remaining part of $S \cap U$ the function $\theta$ must be regarded as a gauge source function. Near $S$ the coordinates $x^{\mu}$ will be chosen such that $e^{\mu}{}_{0} = e_0(x^{\mu}) = \delta^{\mu}{}_{0}$. This implies on $S$ the relation
$\delta^{\mu}{}_{0} = e_0(x^{\mu}) = \cosh(\theta)\, e'_0(x^{\mu}) + \sinh(\theta)\, e'_3(x^{\mu}) = \cosh(\theta)\, \frac{\partial x^{\mu}}{\partial x'^{0}} + \sinh(\theta)\, e'^{3}{}_{3}\, \frac{\partial x^{\mu}}{\partial x'^{3}},$
which allows us to determine $\partial x^{\mu} / \partial x'^{\nu}$ and thus the frame coefficients $e^{\mu}{}_{k} = e_k(x^{\mu})$ on $S \cap U$. The transformation law between the connection coefficients defined by $\nabla_{e_i} e_k = \Gamma_i{}^{j}{}_{k}\, e_j$ and the connection coefficients $\Gamma'_i{}^{j}{}_{k}$ reads
$\Gamma_i{}^{k}{}_{j} = e_i(\Lambda^{m}{}_{j})\, \Lambda_m{}^{k} + \Lambda^{l}{}_{i}\, \Lambda^{n}{}_{j}\, \Gamma'_l{}^{m}{}_{n}\, \Lambda_m{}^{k},$ (7.7)
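with $\Lambda_i{}^{k} = g_{ji}\, \Lambda^{j}{}_{l}\, g^{lk}$, which satisfies $\Lambda_i{}^{k} \Lambda^{i}{}_{l} = \delta^{k}{}_{l}$. To determine the left hand side of (7.7) we need to determine the derivatives of $\Lambda^{i}{}_{k}$. The requirement that the latter is a Lorentz transformation implies $e_i(\Lambda^{m}{}_{j})\, \Lambda_{mk} + e_i(\Lambda^{m}{}_{k})\, \Lambda_{mj} = 0$, which translates into the equivalent conditions
$e_i(\Lambda^{A}{}_{B}) = -e_i(\Lambda^{B}{}_{A}),$
$e_i(\Lambda^{A}{}_{0}) = \cosh(\theta)\, e_i(\Lambda^{0}{}_{A}) - \sinh(\theta)\, e_i(\Lambda^{3}{}_{A}),$
$e_i(\Lambda^{A}{}_{3}) = \sinh(\theta)\, e_i(\Lambda^{0}{}_{A}) - \cosh(\theta)\, e_i(\Lambda^{3}{}_{A}),$ (7.8)
$e_i(\Lambda^{0}{}_{0}) = e_i(\Lambda^{3}{}_{3}), \qquad e_i(\Lambda^{3}{}_{0}) = e_i(\Lambda^{0}{}_{3}), \qquad \cosh(\theta)\, e_i(\Lambda^{0}{}_{0}) = \sinh(\theta)\, e_i(\Lambda^{3}{}_{0}).$ (7.9)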
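Observing that $e_0(\Lambda^{i}{}_{j}) = \cosh(\theta)\, e'_0(\Lambda^{i}{}_{j}) + \sinh(\theta)\, e'_3(\Lambda^{i}{}_{j})$ and that we can calculate the tangential derivatives $e'_p(\Lambda^{i}{}_{j})$ for $p = 1, 2, 3$, on $S \cap U$, we find from the gauge condition $\Gamma_0{}^{A}{}_{B} = 0$ and (7.7),
$e'_0(\Lambda^{A}{}_{B}) = -\Gamma'_0{}^{A}{}_{B} - \tanh(\theta)\, \Gamma'_3{}^{A}{}_{B}.$
The gauge condition $\Gamma_0{}^{0}{}_{A} = -g_{AB} F^{B}$ gives with (7.8),
$\cosh(\theta)\, e'_0(\Lambda^{A}{}_{0}) = -g_{AB} F^{B} - \Lambda^{k}{}_{0}\, \Gamma'_k{}^{l}{}_{A}\, \Lambda_l{}^{0}.$
The requirement that $e_3$ is hypersurface orthogonal at $S_c$, i.e. $\chi_{0A} = \chi_{A0}$, implies with (7.8)
$\cosh(\theta)\, e'_0(\Lambda^{A}{}_{3}) = -e'_A(\theta) - \Lambda^{k}{}_{0}\, (\Gamma'_A{}^{l}{}_{k} - \Gamma'_k{}^{l}{}_{A})\, \Lambda_l{}^{3}.$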
Using again (7.8) we obtain from these relations the quantities $e'_0(\Lambda^{i}{}_{j})$ with $i = A = 1, 2$, or $j = B = 1, 2$. The gauge condition $g^{ab}\, \Gamma_a{}^{3}{}_{b} = f$ implies on $S \cap U$,
$e'_0(\Lambda^{3}{}_{0}) = -\sinh(\theta)\, e'_3(\theta) + f - g^{ab}\, \Lambda^{k}{}_{a}\, \Lambda^{l}{}_{b}\, \Gamma'_k{}^{i}{}_{l}\, \Lambda_i{}^{3},$
from which we determine the quantities $e'_0(\Lambda^{i}{}_{j})$ with $i, j = 0, 3$, by using (7.9). We note that the quantity $\Gamma_A{}^{3}{}_{B}$ is indeed symmetric because of the relation
$\Gamma_A{}^{3}{}_{B} = -\sinh(\theta)\, \Gamma'_A{}^{0}{}_{B} + \cosh(\theta)\, \Gamma'_A{}^{3}{}_{B},$
implied by (7.7). The terms on the right hand side are symmetric because $e'_0$ and $e'_3$ are orthogonal to $S \cap U$ and $S_c$ respectively. Finally, the conformal Weyl tensor is given in our gauge by
$C_{ijkl} = C'_{i'j'k'l'}\, \Lambda^{i'}{}_{i}\, \Lambda^{j'}{}_{j}\, \Lambda^{k'}{}_{k}\, \Lambda^{l'}{}_{l},$
where the primed indices take values $0, \dots, 3$. The data so obtained are useful for our purpose because we have

Lemma 7.1. Suppose $e^{\mu}{}_{k}$, $\Gamma_i{}^{j}{}_{k}$, $C_{ijkl}$ coincide on $S \cap U$ with the data determined above and satisfy the reduced field equations (5.1) to (5.13) as well as our gauge conditions in some neighbourhood of $S \cap U \simeq \{0\} \times (S \cap U) \subset \mathbb{R} \times (S \cap U)$. Then $e^{\mu}{}_{k}$, $\Gamma_i{}^{j}{}_{k}$, $C_{ijkl}$ satisfy Eqs. (2.5) on $S$.

Proof. We have to show that the tensor fields $T_i{}^{k}{}_{j}$, $\Delta^{i}{}_{jkl}$, $H_{jkl}$ vanish on $S \cap U$. Given the metric for which $e^{\mu}{}_{k}$ is orthonormal, we can extend the coordinates $x'^{\mu}$ and the frame $e'_k$ off $S \cap U$ as described above and express the tensor fields in terms of this gauge. We have $n = n'^{i} e'_i$ with $n'^{i} = \cosh(\theta)\, \nu'^{i} + \sinh(\theta)\, \delta^{i}{}_{3}$. In terms of $T'_i{}^{j}{}_{k} = T'_i{}^{j}{}_{k}[e', \Gamma']$, Eq. (5.1) reads $0 = n'^{i}\, T'_i{}^{j}{}_{k} = \cosh(\theta)\, \nu'^{i}\, T'_i{}^{j}{}_{k} + \sinh(\theta)\, T'_3{}^{j}{}_{k}$. Using (7.3) we obtain from this relation that $\nu'^{i}\, T'_i{}^{j}{}_{k} = 0$ on $S \cap U$. This equation and (7.3) imply by the tensorial nature of $T$ that $T_i{}^{j}{}_{k} = 0$ on $S \cap U$. We have a decomposition $\Delta^{i}{}_{jkl} = D^{i}{}_{jkl} + 2\, D^{i}{}_{j[l}\, n_{k]}$, with fields
$D_{ijkl} = \Delta_{ijmn}\, h^{m}{}_{k}\, h^{n}{}_{l}, \qquad D_{ijl} = \Delta_{ijkn}\, n^{k}\, h^{n}{}_{l},$
which are anti-symmetric in the indices $i, j$. If the torsion tensor vanished to first order on $S \cap U$ we could use the first Bianchi identity to deduce the identity $2\, D^{3}{}_{[AB]} = D^{3}{}_{0AB}$. However, observing the assumed symmetry of $\chi_{ab} = \Gamma_a{}^{3}{}_{b}$, this relation can be verified in our case by a direct calculation. The reduced equations (6.2) can then be rewritten in the form
$D^{a}{}_{bp} = 0, \qquad D^{3}{}_{0p} = g^{ab}\, D^{3}{}_{apb}, \qquad D^{3}{}_{AB} = \frac{1}{2}\, D^{3}{}_{0AB}, \qquad D^{3}{}_{A3} = D^{3}{}_{0A3}.$ (7.10)
Equation (7.4) reads $\Delta^{i}{}_{jmn}\, \hat h^{m}{}_{k}\, \hat h^{n}{}_{l} = 0$, where $\hat h^{j}{}_{k} = g^{j}{}_{k} - \nu^{j} \nu_k$ with $\nu^{i} = \cosh(\theta)\, n^{i} - \sinh(\theta)\, N^{i}$. Transvecting this equation suitably with $h^{j}{}_{k}$ and $n^{i}$ we find that it is equivalent to
$D^{i}{}_{jAB} = 0, \qquad D^{i}{}_{j3A} = \tanh(\theta)\, D^{i}{}_{jA}.$ (7.11)
It is now a matter of straightforward algebra to show that Eqs. (7.10), (7.11) imply $D^{i}{}_{jp} = 0$, $D^{i}{}_{jpq} = 0$.
Consider the field $H'_{ijk} = H'_{ijk}[e', \Gamma', C']$ decomposed with respect to $\nu$ according to the rule (2.6). The fact that the constraints (7.5) are satisfied on $S \cap U$ is expressed equivalently by the equation $\nu'^{i} H'_{ijk} = 0$, which is in turn equivalent to
$0 = \nu^{i} H_{ijk} = \cosh(\theta)\, n^{i} H_{ijk} - \sinh(\theta)\, N^{i} H_{ijk},$
on $S \cap U$. On the other hand we have by our assumptions the relation (6.28). Together these two equations imply
$0 = \cosh(\theta)\, (P_i n_j u^{ij}{}_{kl} + Q_i n_j v^{ij}{}_{kl}) - \sinh(\theta)\, (P_i N_j u^{ij}{}_{kl} + Q_i N_j v^{ij}{}_{kl}) = \cosh(\theta)\, (-2\, n_{[k} P_{l]} + Q_i\, \epsilon^{i}{}_{kl}) - \sinh(\theta)\, 2\, N_{[k} P_{l]},$
which entails $P_l = 0$ and $Q_k = 0$ on $S \cap U$.
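The boost (7.6), and its inverse $\nu = \cosh(\theta)\, n - \sinh(\theta)\, N$ used in the proof above, can be checked numerically. The following is a minimal sketch, not part of the original argument. It assumes signature $(+,-,-,-)$ in the primed orthonormal frame, consistent with $g'_{pq} = \mathrm{diag}(-1,-1,-1)$, and it identifies $n$ with $e_0$ and $N$ with $e_3$. The function names are hypothetical.

```python
import math

# Metric components in the primed orthonormal frame e'_k, signature (+,-,-,-).
# This sign convention is an assumption matching g'_{pq} = diag(-1,-1,-1).
ETA = (1.0, -1.0, -1.0, -1.0)


def inner(u, v):
    """Inner product g(u, v) of two vectors given by components in e'_k."""
    return sum(s * a * b for s, a, b in zip(ETA, u, v))


def boosted_frame(theta):
    """Components of e_0 and e_3 from (7.6) with respect to e'_0, ..., e'_3."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    e0 = (ch, 0.0, 0.0, sh)  # e_0 = cosh(theta) e'_0 + sinh(theta) e'_3
    e3 = (sh, 0.0, 0.0, ch)  # e_3 = sinh(theta) e'_0 + cosh(theta) e'_3
    return e0, e3


def inverse_boost(theta):
    """nu = cosh(theta) n - sinh(theta) N, assuming n = e_0 and N = e_3."""
    e0, e3 = boosted_frame(theta)
    ch, sh = math.cosh(theta), math.sinh(theta)
    return tuple(ch * a - sh * b for a, b in zip(e0, e3))
```

For any $\theta$ the pair $e_0, e_3$ stays orthonormal, `inverse_boost` returns the components of $e'_0$, and `inner(e0, (0, 0, 0, 1))` equals $-\sinh(\theta)$, the relation $g(e_0, e'_3) = -\sinh(\Theta)$ that reappears in Theorem 8.1 (ii).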
7.2. The boundary conditions. The boundary conditions for the reduced system are determined by the rules described in Sect. 3. In the reduced system the only contribution to the normal matrix comes from (5.15) and the boundary conditions thus only involve the conformal Weyl tensor. By (5.16) we find
${}^t u\, A^{3}\, u = 4\, B_{-} E_{12} - 4\, E_{-} B_{12} = -\{\frac{1}{\sqrt{2}} (E_{-} + 2 B_{12})\}^2 - \{\frac{1}{\sqrt{2}} (B_{-} - 2 E_{12})\}^2 + \{\frac{1}{\sqrt{2}} (E_{-} - 2 B_{12})\}^2 + \{\frac{1}{\sqrt{2}} (B_{-} + 2 E_{12})\}^2.$
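The decomposition of the quadratic form into two negative and two positive squares is a purely algebraic identity. The sketch below checks it numerically. It treats $E_{-}$, $B_{-}$, $E_{12}$, $B_{12}$ as four independent real numbers, which is an assumption, since their definition in (5.16) lies outside this section. The function names are hypothetical.

```python
import math


def quadratic_form(e_minus, b_minus, e12, b12):
    """Left-hand expression: 4 B_- E_12 - 4 E_- B_12."""
    return 4.0 * b_minus * e12 - 4.0 * e_minus * b12


def sum_of_squares(e_minus, b_minus, e12, b12):
    """Right-hand expression: two negative and two positive squares."""
    s = 1.0 / math.sqrt(2.0)
    return (
        -(s * (e_minus + 2.0 * b12)) ** 2
        - (s * (b_minus - 2.0 * e12)) ** 2
        + (s * (e_minus - 2.0 * b12)) ** 2
        + (s * (b_minus + 2.0 * e12)) ** 2
    )
```

The two combinations entering with a positive sign, $E_{-} - 2 B_{12}$ and $B_{-} + 2 E_{12}$, are the ones that appear as leading terms in the boundary conditions (7.12), (7.13).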
Choosing a smooth matrix-valued function $H$ on $T$ as in (3.4), we can thus write the boundary conditions in the form
$q_1 = E_{11} - E_{22} - 2 B_{12} - a\, (E_{11} - E_{22} + 2 B_{12}) - b\, (B_{11} - B_{22} - 2 E_{12}),$ (7.12)
$q_2 = B_{11} - B_{22} + 2 E_{12} - c\, (E_{11} - E_{22} + 2 B_{12}) - d\, (B_{11} - B_{22} - 2 E_{12}),$ (7.13)
with some given smooth functions $q_1$, $q_2$ on $T$. The components of the conformal Weyl tensor which enter these conditions are obtained by projecting its $e_0$-electric and $e_0$-magnetic parts into the plane orthogonal to $e_3$ and by taking then the trace-free parts. The resulting tensors are given in our notation by
$\eta_{AB} = E_{AB} - \frac{1}{2}\, g_{AB}\, g^{CD} E_{CD}, \qquad \beta_{AB} = B_{AB} - \frac{1}{2}\, g_{AB}\, g^{CD} B_{CD}.$
In terms of the null frame defined by (3.6), the relevant components of the conformal Weyl tensor are given in NP notation by
$\Psi_0 = C_{\mu\nu\sigma\pi}\, l^{\mu} m^{\nu} l^{\sigma} m^{\pi} = \eta_{11} + \beta_{12} + i\, (\beta_{11} - \eta_{12}),$
$\Psi_4 = C_{\mu\nu\sigma\pi}\, \bar m^{\mu} k^{\nu} \bar m^{\sigma} k^{\pi} = \eta_{11} - \beta_{12} + i\, (\beta_{11} + \eta_{12}),$
and the boundary conditions (7.12), (7.13) take the form
$q = -\Psi_4 + \alpha\, \Psi_0 + \beta\, \bar\Psi_0,$ (7.14)
where $q$, $\alpha$, $\beta$ are defined as in (3.5). The form of (7.14) can be understood as follows. In our frame the components $\Psi_0$, $\Psi_4$ of the conformal Weyl tensor can be interpreted as parts of the field transverse to $e_3$, traveling into the directions $-e_3$, $e_3$ respectively (cf. also [5]). Assume there were a family of outgoing null hypersurfaces tangent to the vector field $k$ on $T$. Then the field equations would imply on these hypersurfaces propagation equations of the form
$\Psi_{0,\mu}\, k^{\mu} - \Psi_{1,\mu}\, m^{\mu} = L(\Gamma_i{}^{j}{}_{k}, \Psi_l).$
This shows clearly that the values of $\Psi_0$ will be determined on $T$ by the evolution equations once the other fields are given. This is consistent with the fact that the conditions on $\alpha$, $\beta$ prevent us from prescribing $\Psi_0$ on $T$. On the other hand, if there were a family of ingoing null hypersurfaces tangent to $l$ on $T$, the field equations would imply on these hypersurfaces propagation equations of the form
$\Psi_{4,\mu}\, l^{\mu} - \Psi_{3,\mu}\, \bar m^{\mu} = L'(\Gamma_i{}^{j}{}_{k}, \Psi_l),$
and the quantity $\Psi_0$ would in fact represent the null datum on these hypersurfaces. Therefore it is natural that we can prescribe the value of $\Psi_4$ freely on $T$ and couple parts of $\Psi_0$, $\bar\Psi_0$ back to it as it is realized in (7.14).
In trying to give along these lines any explanation of (7.14) in terms of “ingoing/outgoing gravitational radiation” it should be observed that our gauge conditions, in particular the components $\Psi_0$, $\Psi_4$ of the conformal Weyl tensor, depend on the choice of the vector field $e_0$ on $T$, which so far is rather arbitrary. This situation should be compared with that at null infinity, where one causal direction is singled out by the causal nature of the boundary and a natural concept of “radiation field” is obtained.
The condition on the coefficients of $H$ in (3.4) can be expressed in terms of $\alpha$, $\beta$, and $v = (v_1, v_2)$ in the form
${}^t v\, B\, v \le \frac{1 - |\alpha|^2 - |\beta|^2}{2}\; {}^t v\, v, \qquad v \in \mathbb{R}^2,$ (7.15)
where the symmetric bi-linear form in $v$ on the left hand side is defined by the matrix
$B = B(\alpha, \beta) = \begin{pmatrix} \mathrm{Re}(\bar\alpha \beta) & \mathrm{Im}(\bar\alpha \beta) \\ \mathrm{Im}(\bar\alpha \beta) & -\mathrm{Re}(\bar\alpha \beta) \end{pmatrix}.$
Since $v \neq 0$ can always be chosen such that the term on the left-hand side of (7.15) is non-negative, it follows that $|\alpha|^2 + |\beta|^2 \le 1$. Moreover, since $v$ is arbitrary in (7.15), it follows then that $|\alpha|^2 + |\beta|^2 = 1$ if and only if $\alpha = 0$ or $\beta = 0$. We take this opportunity to correct a mistake in [4] (which is of no consequence in that article). Equation (5.49) in [4] should be replaced (observing the different notation) by (7.15).
The closest analogue to (3.7) appears to be the following. Observing that the Bel–Robinson tensor is given in spinor notation by $T_{aa'bb'cc'dd'} = \Psi_{abcd}\, \bar\Psi_{a'b'c'd'}$, we find that the non-positivity condition (ii) in Sect. 3 takes the form
${}^t u\, A^{3}\, u = -\Psi_0 \bar\Psi_0 + \Psi_4 \bar\Psi_4 = -2\, (T_{0333} + T_{3000}) \le 0,$
where we assume the Bel–Robinson tensor to be given in the frame $e_k$.

7.3. Boundary conditions and gauge conditions. So far our considerations were based on a fixed choice of a local gauge. If we want to go beyond the study of local solutions,
we have to glue together different local solutions and therefore need to discuss the transformation behaviour of the initial and boundary conditions under changes of the gauge. The transformation behaviour of the initial data is obvious and will not be considered any further. The transformation behaviour of the boundary conditions is more complicated. Since the time-like unit vector field $e_0$ is assumed to be given on $T$ and $e_3$ is by definition the inward pointing unit normal of $T$, the remaining gauge freedom on $T$ consists of smooth coordinate transformations
$x^{\alpha} \to x'^{\beta}(x^{\alpha}), \qquad \alpha, \beta = 1, 2,$ (7.16)
and rotations of the frame
$e_A \to e'_A = \Lambda^{B}{}_{A}\, e_B, \qquad (\Lambda^{B}{}_{A}) = \Lambda(\Phi) = \begin{pmatrix} \cos\Phi & -\sin\Phi \\ \sin\Phi & \cos\Phi \end{pmatrix},$ (7.17)
or, in terms of the null frame (3.6),
$m \to m' = e^{i \Phi}\, m.$ (7.18)
Since we assume the vectors $e_A$ to be Fermi-propagated in the direction of $e_0$ with respect to the intrinsic connection on $T$, the function $\Phi$ in (7.17) is independent of $x^0$. Further, the coordinates on $\Sigma$ are dragged along with $e_0$. Thus the remaining gauge transformations can be specified on $T$ completely in terms of their behaviour on $\Sigma$. The connection coefficients $\Gamma_0{}^{A}{}_{0}$, which are specified in terms of the gauge source functions $F^{A}$, transform under (7.17) according to $\Gamma_0{}^{A}{}_{0} \to \Gamma'_0{}^{A}{}_{0} = \Lambda(\Phi)_B{}^{A}\, \Gamma_0{}^{B}{}_{0}$. It will be convenient to give $F^{A}$ in terms of the complex function
$F = F^{1} + i\, F^{2}.$ (7.19)
The transformation behaviour above is then reflected by
$F \to F' = e^{i \Phi}\, F.$ (7.20)
The components of the conformal Weyl tensor transform under (7.18) according to
$\Psi_0 \to \Psi'_0 = e^{2 i \Phi}\, \Psi_0, \qquad \Psi_4 \to \Psi'_4 = e^{-2 i \Phi}\, \Psi_4.$
To make sense of the boundary condition (7.14) in a covariant way we require that the functions on $T$ which enter this condition transform under (7.18) as
$q \to q' = e^{-2 i \Phi}\, q, \qquad \alpha \to \alpha' = e^{-4 i \Phi}\, \alpha, \qquad \beta \to \beta' = \beta.$ (7.21)
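That the laws (7.21) are exactly what the covariance of (7.14) requires can be seen from a short numerical check. The sketch below is illustrative only and the function names are hypothetical. The helper `admissible` rests on an observation that is not spelled out in the text: the symmetric trace-free matrix $B(\alpha, \beta)$ of (7.15) has eigenvalues $\pm|\alpha||\beta|$, so (7.15) holds for all $v$ precisely when $|\alpha||\beta| \le (1 - |\alpha|^2 - |\beta|^2)/2$.

```python
import cmath


def boundary_datum(psi0, psi4, alpha, beta):
    """Right-hand side of (7.14): q = -Psi_4 + alpha Psi_0 + beta conj(Psi_0)."""
    return -psi4 + alpha * psi0 + beta * psi0.conjugate()


def rotate(psi0, psi4, alpha, beta, phi):
    """Apply the frame rotation (7.18) using the stated laws for Psi_0, Psi_4 and (7.21)."""
    return (
        cmath.exp(2j * phi) * psi0,
        cmath.exp(-2j * phi) * psi4,
        cmath.exp(-4j * phi) * alpha,
        beta,
    )


def admissible(alpha, beta):
    """Condition (7.15), rewritten via the largest eigenvalue |alpha||beta| of B.

    This reformulation is an assumption of this sketch, derived from the
    explicit 2x2 form of B, not a statement taken from the text.
    """
    bound = (1.0 - abs(alpha) ** 2 - abs(beta) ** 2) / 2.0
    return abs(alpha) * abs(beta) <= bound
```

Evaluating `boundary_datum` on rotated arguments reproduces $e^{-2 i \Phi} q$, and `admissible` depends only on $|\alpha|$ and $|\beta|$, which the rotation leaves unchanged.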
It is important to note that (7.15) is invariant under (7.21). This follows from the facts that $B$ transforms according to $B(\alpha', \beta') = {}^t\Lambda(-\Phi)\, B(\alpha, \beta)\, \Lambda(-\Phi)$ under (7.21) and that the quadratic expressions on the right-hand side of (7.15) are invariant under (7.21) and rotations of $v$. To describe the boundary conditions in a covariant way we introduce some tensor fields which contract to zero with any vector orthogonal to $e_A$, $A = 1, 2$. We refer to such
tensors as “$e_A$-tensors”. The symmetric trace-free $e_A$-tensors of rank 2 are generated by tensors $t_{ij}$, $s_{ij}$ with non-trivial (i.e. not necessarily vanishing) components
$t_{AB} = \delta^{1}{}_{A}\, \delta^{1}{}_{B} - \delta^{2}{}_{A}\, \delta^{2}{}_{B}, \qquad s_{AB} = \delta^{1}{}_{A}\, \delta^{2}{}_{B} + \delta^{2}{}_{A}\, \delta^{1}{}_{B},$
respectively. The $e_A$-tensor $J_{ijkl}$ with non-trivial components
$J_{ABCD} = \frac{\mathrm{Re}(\alpha)}{2}\, (t_{AB}\, t_{CD} - s_{AB}\, s_{CD}) + \frac{\mathrm{Im}(\alpha)}{2}\, (t_{AB}\, s_{CD} + s_{AB}\, t_{CD}),$ (7.22)
is completely symmetric and trace-free. In fact, any symmetric trace-free $e_A$-tensor of rank 4 has this form with certain coefficients $\mathrm{Re}(\alpha)$, $\mathrm{Im}(\alpha)$. Therefore the form is necessarily preserved under the transformations (7.17) and it turns out that the coefficients transform under (7.17) into $\mathrm{Re}(e^{-4 i \Phi} \alpha)$, $\mathrm{Im}(e^{-4 i \Phi} \alpha)$, in accordance with (7.21). We finally need the tensor $\epsilon_{jk} = N^{i} \epsilon_{ijk}$ and the induced metric on the subspaces orthogonal to $e_0$, $e_3$, which is represented by $g_{AB}$. We use these tensors and the function $\beta$, which transforms under (7.17) according to (7.21), to define the $e_A$-tensor $I_{ijkl}$ with non-trivial components
$I_{AB}{}^{CD} = \mathrm{Re}(\beta)\, g_{(A}{}^{C}\, g_{B)}{}^{D} + \mathrm{Im}(\beta)\, g_{(A}{}^{C}\, \epsilon_{B)}{}^{D}.$ (7.23)
It is invariant under (7.17) and contracts with a symmetric trace-free $e_A$-tensor of rank 2 to yield another such tensor. Setting now
$\rho^{\pm}_{AB} = \eta_{AB} \pm \beta_{AC}\, \epsilon_{B}{}^{C},$ (7.24)
and introducing as the free datum on $T$ the symmetric trace-free $e_A$-tensor $q_{ij}$ with non-trivial components
$q_{AB} = \mathrm{Re}(q)\, t_{AB} + \mathrm{Im}(q)\, s_{AB},$ (7.25)
we find that the boundary conditions (7.12), (7.13) can be written as a tensor equation on $T$ which has non-trivial components
$q_{AB} = -\rho^{-}_{AB} + I_{AB}{}^{CD}\, \rho^{+}_{CD} + J_{AB}{}^{CD}\, \rho^{+}_{CD}.$ (7.26)
The main property of the tensor fields $J_{ijkl}$, $I_{ijkl}$, $q_{ij}$ is that the form of their expressions in the frame $e_k$ is universal, that they do not depend on the vectors $e_0$, $e_3$ orthogonal to $e_A$, and that they are uniquely determined by the functions $\alpha$, $\beta$, $q$.
Consider a neighbourhood $W$ of $\Sigma$ in $T$. Assume a fixed orientation of $\Sigma$, and denote by $O_+(\Sigma)$ the bundle of oriented orthonormal frames on $\Sigma$. Because of the transformation laws (7.20), (7.21) the complex-valued functions $F$, $\alpha$, $\beta$, $q$ should at $\Sigma$ not be considered as functions on $\Sigma$ but as spin weighted functions on $O_+(\Sigma)$. Suppose that the time-like vector field $e_0$ is given on $W$, the flow lines of $e_0$ generate $W$, and $x^0$ maps each flow line onto the interval $[0, x^0_*[$, where $x^0_*$ is a fixed positive number or infinity. For $p \in W$ denote by $p^* \in \Sigma$ the point at which the flow line of $e_0$ passing through $p$ meets $\Sigma$. For given $e_A$ at $p$ denote by $e^*_A$ the frame of $\Sigma$ at $p^*$ which is transported into $e_A$ by $T$-intrinsic Fermi transport along the flow line. With the values of $F$, $\alpha$, $\beta$, $q$ in the 2-frame $e_A$ at $p$ we associate the same complex numbers $F(x^0(p))$, $\alpha(x^0(p))$, $\beta(x^0(p))$, $q(x^0(p))$ in the 2-frame $e^*_A$ at $p^*$. For given $x^0 \in [0, x^0_*[$ we thus get a set of smooth complex-valued functions $F$, $\alpha$, $\beta$, $q$ on $[0, x^0_*[ \times O_+(\Sigma)$ of spin weight $1$, $-4$, $0$, $-2$ respectively. Giving these functions is equivalent to giving the tensor fields $J_{ijkl}$, $I_{ijkl}$, $q_{ij}$ on $W$ because of the universal form of the local expressions (7.22), (7.23), (7.25).
We will need to extend the gauge source function $F$ into a neighbourhood of $T$ in the prospective solution space-time.
Definition 7.1. We call a smooth function $x^3$ defined on some open neighbourhood $U$ of $\Sigma$ in $S$ a boundary defining function on $S$ if
$d x^3 \neq 0, \qquad x^3|_{\Sigma} = 0, \qquad x^3|_{S \setminus \Sigma} > 0,$
and if the sets $S_c = \{x^3 = c\} \subset S$ are diffeomorphic to $\Sigma$ for $0 \le c < x^3_* \equiv \sup_U x^3$ and are obtained by pushing forward $\Sigma$ with the flow of the vector field $-(|d x^3|_{\gamma})^{-2}\, \mathrm{grad}_{\gamma}\, x^3$.
Given a boundary defining function $x^3$, we denote by $e'_3$ the smooth unit vector field which is orthogonal to $S_c$ on $U$ and points towards $S$ on $S_0 = \Sigma$. Using the flow lines of $e'_3$ we can map $S_c$ diffeomorphically onto $\Sigma$ and get the representation $U = \Sigma \times [0, x^3_*[$. Following our discussion in Sect. 4, we consider time-like hypersurfaces $T_c$ having intersection $S_c$ with $S$. The hypersurfaces $T_c$ and the coordinate $x^0$ on them are generated by a time-like vector field $e_0$. Repeating the discussion above, we find that for given $c$ the gauge source functions $F^{A}$ on $T_c$ can be represented by a smooth complex-valued function on $[0, x^0_*[ \times O_+(S_c)$ of spin weight 1. Using $S$-intrinsic Fermi-transport of local frames $e_A$ on $\Sigma$ in the direction of $e'_3$ and parametrizing the integral curves by $x^3$, we obtain bundle morphisms of the $O_+(S_c)$ onto $O_+(\Sigma)$. This allows us to specify the information about the gauge source functions $F^{A}$ in a neighbourhood of $T$ in $M$ in terms of a smooth complex-valued function $F$ on $[0, x^0_*[ \times O_+(\Sigma) \times [0, x^3_*[$ of spin-weight 1. We shall say that $F$ is based on the boundary defining function $x^3$ on $S$. Conversely, given $F$ as above and a local section of $O_+(\Sigma)$, i.e. an oriented orthonormal frame $e_A$ on some open subset $V$ of $\Sigma$, we can use Fermi-transport of $e_A$ in the directions of $e'_3$ and $e_0$ to obtain the gauge source function $F^{A}$ in the frame $e_A$ (resp. $e_k$). The requirement that the fields $e_A$ be Fermi-transported along $e'_3$ should be added to (7.2) in the form
$\Gamma'_3{}^{A}{}_{B} = 0$ on $U$. (7.27)
The following (where we set $\mathbb{R}^+_0 = [0, \infty[$) summarizes our main observations about the initial and boundary data and the gauge source functions.

Definition 7.2. A smooth initial boundary data set for Einstein’s vacuum field equation consists of the following. A smooth, orientable, compact, 3-dimensional initial manifold $S$ with boundary $\Sigma \neq \emptyset$ and the boundary manifold $T = \mathbb{R}^+_0 \times \Sigma$ (with this product structure distinguished). The boundaries $\Sigma$ and $\{0\} \times \Sigma$ of these manifolds are identified in the natural way. A smooth (negative) Riemannian metric $\gamma_{\alpha\beta}$ and a smooth symmetric tensor field $\kappa_{\alpha\beta}$ on $S$ which satisfy the constraints (7.1). A smooth real function $\Theta$ on $\Sigma$. A smooth real function $T \ni (x^0, p) \to \chi(x^0, p) \in \mathbb{R}$. Smooth complex functions $F$, $\alpha$, $\beta$, $q$ on $\mathbb{R}^+_0 \times O_+(\Sigma)$ of spin weight $1$, $-4$, $0$, $-2$ respectively, such that the functions $\alpha$, $\beta$ satisfy condition (7.15).

It may be surprising that the function $F$ is listed as part of the initial data set. The somewhat complicated situation concerning the pair $\chi$, $F$ will be discussed in the next section.

Definition 7.3. Given an initial boundary data set as in Definition 7.2, an associated set of gauge source functions consists of:
A smooth real function $\theta$ on $S$ with $\theta|_{\Sigma} = \Theta$. A smooth real function $\mathbb{R}^+_0 \times \Sigma \times [0, x^3_*[\; \ni (x^0, p, x^3) \to f(x^0, p, x^3) \in \mathbb{R}$ such that $f(x^0, p, 0) = \chi(x^0, p)$, $p \in \Sigma$. Here we use a boundary defining function $x^3$ on $S$ to represent a neighbourhood $U$ of $\Sigma$ in $S$ in the form $U = \Sigma \times [0, x^3_*[$ with some $x^3_* > 0$. A smooth complex function $F : [0, x^0_*[ \times O_+(\Sigma) \times [0, x^3_*[\; \ni (x^0, p, x^3) \to F(x^0, p, x^3) \in \mathbb{C}$, of spin-weight 1, based on the boundary defining function $x^3$ above, which coincides on $[0, x^0_*[ \times O_+(\Sigma) \times \{0\}$ with the function $F$ given in Definition 7.2.

7.4. The consistency condition. Given the data $\gamma$, $\kappa$ in Definition 7.2 and the gauge source functions $f$, $\theta$, $F$ in Definition 7.3, we can, by taking formal derivatives of the reduced field equations, determine a formal expansion in terms of $x^0$ on a neighbourhood of $\Sigma$ in $S$ for the fields $e^{\mu}{}_{k}$, $\Gamma_i{}^{j}{}_{k}$, $C_{ijkl}$, and thus in particular for $\eta_{AB}$, $\beta_{AB}$. On the other hand, we also have at $\Sigma$ a formal expansion of the functions $\alpha$, $\beta$, $q$ on $T$ in terms of $x^0$. Therefore, to obtain a smooth solution, the formal expansions obtained for the quantities entering the two sides of the boundary condition (7.26) should coincide at any order, i.e. the data need to satisfy a certain “consistency condition”. To meet this condition we may e.g. choose all fields except $q$, determine the formal expansion of the expression on the right-hand side of (7.26), and choose $q$ on $T$ such that it has the same formal expansion at $\Sigma$.

8. The Existence Result

Given an initial boundary data set as in Definition 7.2, we set $M = \mathbb{R}^+_0 \times S$ such that $S$ and $T$, identified along their boundary $\Sigma$, can be considered in a natural way as the boundary of $M$. We define the function $x^0$ on $M$ such that it induces the natural coordinate on the factor $\mathbb{R}^+_0$. Then $S = \{x^0 = 0\}$. We can now formulate our main result.

Theorem 8.1.
Suppose we are given a smooth initial boundary data set as in Definition 7.2 and an associated set of smooth gauge source functions as in Definition 7.3 such that the consistency conditions on $\Sigma$ are satisfied at any order. Then we can find some open set $M'$ in $M$, with $\{p \in M \,|\, x^0(p) < \tau\} \subset M'$ for some $\tau > 0$, and on $M'$ a solution $g$ to Einstein’s field equation $\mathrm{Ric}[g] = 0$ such that $M'$ coincides with the domain of dependence of $S \cup T'$ in $(M', g)$, where $T' = T \cap M'$, and the following properties hold.
(i) $S$ is space-like and $T'$ is time-like for $g$. The first and second fundamental forms induced by $g$ on $S$ are given (up to a common diffeomorphism) by $\gamma$ and $\kappa$ respectively. The mean extrinsic curvature induced by $g$ on $T'$ is given by $\chi$.
(ii) The curves $\mathbb{R}^+_0 \ni x^0 \to (x^0, p) \in T$, $p \in \Sigma$, induce curves on $T'$ whose tangent vectors define a smooth time-like unit vector field $e_0$ on $T'$ orthogonal to $\Sigma$. If $e'_3$ denotes the unit normal of $\Sigma$ in $S$ pointing towards $S$, we have $g(e_0, e'_3) = -\sinh(\Theta)$.
(iii) Denote by $e_3$ the inward pointing $g$-unit normal of $T'$. Let $e_A$, $A = 1, 2$, be an oriented frame on some open subset $V$ of $\Sigma$ such that $e_k$, $k = 0, 1, 2, 3$, defines an orthonormal frame for $M'$ on $V$. Extend $e_A$ into $T'$ by $T'$-intrinsic Fermi-transport in the direction of $e_0$. For the components $F^{A}$ of $F$ in the frame $e_A$ we have then $F^{A} = g^{AB}\, g(\nabla_{e_0} e_0, e_B)$. If $\alpha$, $\beta$, $q$ are given in the frame $e_A$ the associated $e_A$-tensors (cf. (7.22), (7.23), (7.25)) satisfy the boundary condition (7.26).
(iv) In the particular case where $\chi = \chi_0$ is constant and $F = 0$ on $T$ the solution is locally (geometrically) unique near $S$.

Remark 8.1. (i) That the solution is (geometrically) unique in the domain of dependence of the set $S$ is well known from the study of the standard Cauchy problem. To demonstrate in general the uniqueness of the solution locally in time of our initial-boundary value problem, we would have to show that the solution is independent of the choice of gauge source functions. To show this we would need the existence statement which is missing in Lemma 4.2.
(ii) We have included $F$ as a datum in Definition 7.2. Given a solution, we can according to Lemma 4.1 always redefine the vector field $e_0$ and the associated coordinates on $T$ close to $\Sigma$ to achieve a transition
$(\chi(x^{\alpha}), F(x^{\alpha})) \to (\chi'(x'^{\alpha}), F' \equiv 0).$ (8.1)
This shows that locally the freedom encoded in the pair $\chi$, $F$ corresponds to that of one real-valued function and Theorem 8.1 tells us that this function is not restricted by any condition if questions concerning the life-time of the solutions are ignored.
(iii) If we could perform the transition (8.1) globally on $T'$, irrespective of the life time of the solution, it would be natural to use the particular gauge with $F' = 0$ and specify $\chi'$ as the part of the data which characterizes the nature of the boundary. However, the integral curves of the vector field $e_0$ will then be $T'$-intrinsic geodesics. In general, we can therefore expect that the gauge with $F' = 0$ will, due to focussing phenomena of the geodesics, have a lifetime much shorter than the lifetime of the solution which was specified in terms of $\chi$ and $F$.
(iv) This suggests considering equivalence classes of pairs $(\chi, F)$ as a datum to characterize the boundary. However, which pairs are equivalent in this sense does depend also on the other data (which, incidentally, are related on the boundary to $(\chi, F)$ by the vector field $e_0$ which is specified implicitly in terms of $F$) and can only be decided after the solutions are available. There appears to be no way to compare different pairs by calculations on $T$ solely in terms of the data prescribed on $T$. For the same reason it is not possible to determine which pairs $(\chi, F)$ are particularly “good” for specifying a space-time and which pairs are locally equivalent but not particularly useful because they refer to a gauge which breaks down quickly.
(v) These difficulties, which are intrinsic to the initial-boundary value problem and do not represent a peculiar feature of our specific type of analysis, arise because the coordinates on the boundary in which $\chi$ is given are related in a direct way to the evolution of the fields.
(vi) In the case of the Anti-de Sitter-type space-times studied in [4] boundary data are prescribed on the boundary at infinity which is singled out in a geometric way. There the difficulties pointed out above do not arise due to the special geometric features of the boundary. (vii) For convenience we assume all data to be smooth and we obtain smooth solutions. If weaker smoothness requirements are imposed on the data, a loss of smoothness along the boundary may occur for the solution to the reduced equations. We do not analyse whether due to particular features of the Einstein equations (such as the presence of constraints) more smoothness will be preserved than suggested by the general results (cf. [8, 14]). (viii) From the following proof it can be seen immediately that a result similar to Theorem 8.1 is obtained in the case where S has only inner boundaries and asymptotically flat ends or asymptotically hyperboloidal (cf. [3]) ends with smooth asymptotics.
(ix) We finally remark that all smooth solutions to Einstein’s vacuum field equations on a region bounded by a space-like and a time-like hypersurface as considered in the introduction can be characterized in terms of data as considered above. Furthermore, if the boundary $T$ and the boundary conditions are extended suitably “backward in time”, the existence of a solution characterized by such data also follows from our result.

Proof. Since we are dealing with a hyperbolic problem, we can show the existence of a solution by patching together local solutions. A basic step consists in solving the initial boundary value problem in some neighbourhood $U$ of a given point $p \in \Sigma$ in $M$. Let $x^1$, $x^2$ be coordinates and $e_A$ a smooth oriented $\gamma$-orthonormal frame field on some open neighbourhood $V$ of $p$ in $\Sigma$. Let $e'_3$ be the smooth unit normal to the surfaces $S_c$ in $S$ defined by the boundary defining function $x^3$ such that $e'_3$ points towards $S$ on $\Sigma$. We extend the coordinates $x^1$, $x^2$ into $S$ such that they are constant on the integral curves of $e'_3$ and form together with $x^3$ a coordinate system on some neighbourhood of $p$ in $S$ which is denoted in our notation by $V \times [0, x^3_*[$. We assume the frame $e_A$ to be extended to $V \times [0, x^3_*[$ by $S$-intrinsic Fermi-transport along the integral curves of $e'_3$. The vector fields $e_A$ are then tangent to $S_c$. Observing now the gauge conditions in Sect. 4 and the meaning of the gauge source functions, we use the gauge source functions in the gauge determined by the coordinates $x^1$, $x^2$ and the frame $e_A$ on $V$ to obtain the reduced equations described in Sect. 5. The initial data on $V \times [0, x^3_*[ \subset S$ and the boundary conditions on $\mathbb{R}^+_0 \times V \subset T$ for the reduced equations are determined as described in Sect. 7. With the help of suitable cut-off functions we can put the initial boundary value problem so obtained into the setting considered in [8] (cf. [4] for the details of such a procedure).
The results in [8] then imply the existence of some neighbourhood $U$ of $p$ in $M$, with $S \cap U \subset V \times [0, x^3_*[$ and $T \cap U \subset \mathbb{R}^+_0 \times V$, on which there exists a unique smooth solution $u = (e^{\mu}{}_{k}, \Gamma_i{}^{j}{}_{k}, C^{i}{}_{jkl})$ of the reduced field equations which satisfies our gauge conditions on $U$ and the initial and boundary conditions on $S \cap U$ and $T \cap U$ respectively. We assume that the neighbourhood $U$ is chosen such that it coincides with the domain of dependence of the set $(S \cup T) \cap U$ in $U$ with respect to the metric $g$ for which the frame $e_k$ is orthonormal. By Theorem 6.1 and Lemma 7.1 we conclude that $u$ satisfies indeed Eqs. (2.5) and thus $\mathrm{Ric}[g] = 0$.
The local solutions can be patched together to yield a solution on some neighbourhood of $\Sigma$ in $M$. Consider $p, q \in \Sigma$ and solutions $u_p$, $u_q$ to (2.5) on neighbourhoods $U_p$, $U_q$ of these points respectively which are obtained as described above. If $U_p \cap U_q \cap \Sigma = \emptyset$ we have also $U_p \cap U_q = \emptyset$. If $U_p \cap U_q \cap \Sigma \neq \emptyset$, the initial data given on $U_p \cap S$ and $U_q \cap S$ can be related on their intersection $U_p \cap U_q \cap S$ by the explicitly known simple gauge transformations (7.16), (7.17) which also relate on $U_p \cap U_q \cap S$ the boundary conditions given on $U_p \cap T$ and $U_q \cap T$. These transformations imply also transformations of the gauge source functions. Using the uniqueness property for the solution of the initial boundary value problem for the reduced equations (which is an immediate consequence of the energy estimates) we can thus show that the solution induced by $u_p$ on the domain of dependence $D_p$ (with respect to $u_p$) of $U_p \cap U_q \cap (S \cup T)$ in $U_p$ is related by a gauge transformation to the solution induced by $u_q$ on the domain of dependence $D_q$ (with respect to $u_q$) of $U_q \cap U_p \cap (S \cup T)$ in $U_q$. Thus we can identify $(U_p, u_p)$, $(U_q, u_q)$ on $U_p \cap U_q$ via the gauge transformation to obtain a solution on $U_p \cup U_q$.
Since the time-like frame vectors on $U_p$ and $U_q$ are not affected by the gauge transformations (7.16), (7.17) they are also identified on $U_p \cap U_q$ and we obtain a unique time-like vector field $e_0$ on $U_p \cup U_q$. Proceeding along these lines we can construct a neighbourhood $Z$ of $\Sigma$ in $M$ on which there exists a smooth solution $u$ of (2.5) such that the initial and boundary
conditions are satisfied on $Z \cap S$ and $Z \cap T$ respectively and $Z$ coincides with the domain of dependence of $Z \cap (S \cup T)$ with respect to $u$. Furthermore we get on $Z$ a unique time-like unit vector field $e_0$ which is in particular tangent to $Z \cap T$.
It is well known from the study of the Cauchy problem for Einstein’s field equation that the data $\gamma$, $\kappa$ on $S \setminus \Sigma$ determine a (geometrically) unique, smooth, maximal, globally hyperbolic solution $(M_S, g_S)$ to the vacuum field equations. Denote by $D_Z$ the domain of dependence of $(S \setminus \Sigma) \cap Z$ in $(Z, g)$, with $g$ the metric determined from $u$, and by $D$ the domain of dependence of $(S \setminus \Sigma) \cap Z$ in $(M_S, g_S)$. The results on the Cauchy problem then allow us to conclude that there must exist an isometric embedding $\psi$ of $D_Z$ into $D$. Using $\psi$ to identify $D_Z$ with $\psi(D_Z)$, we obtain a solution $(M', g)$ to the vacuum field equations. We can, possibly after shrinking $M'$ slightly, extend the vector field $e_0$ given in a neighbourhood of $Z \cap T$ to a time-like unit vector field $e_0$ in $(M', g)$ and define a smooth function $x^0$ which vanishes on $S$ and satisfies $\langle e_0, d x^0 \rangle = 1$ on $M'$. Choosing $\tau > 0$ small enough, the integral curves of $e_0$ starting on $S$ will have length not smaller than $\tau$. This proves assertions (i)–(iii) of the theorem.
The proof of assertion (iv) relies on the fact that we can bring the solutions into a standard form near $\Sigma$ if the mean extrinsic curvature is constant on the boundary. Assume that $\chi = \chi_0 = \mathrm{const.}$ and that $(M', g)$, $(\hat M', \hat g)$ are solutions of the vacuum equations satisfying conditions (i)–(iii). Denote by $D$, $\hat D$ the domain of dependence of $S \setminus \Sigma$ in $(M', g)$, $(\hat M', \hat g)$ respectively. We can assume, possibly after shrinking $D$, $\hat D$ in time, that there exists an isometry $\psi$ of $(D, g|_D)$ onto $(\hat D, \hat g|_{\hat D})$ which induces the identity on $S \setminus \Sigma$. Let $x^3$ be a boundary defining function on $S$ with level sets $S_c$ and $\theta$ a smooth function on $S$ with $\theta|_{\Sigma} = \Theta$. Denote by $e'_3$ the normalized gradient of $x^3$ pointing towards $S$ on $\Sigma$.
Notice thatit does not matter whether we use g or gˆ here. The following constructions will be done on (M 0 , g). Let e0 be the time-like unit vector field on Sc which is orthogonal to Sc and satisfies g(e0 , e03 ) = − sinh(θ). Following the discussion in Sect. 4 we can construct a slicing of a neighbourhood R0 of 6 in M 0 by hypersurfaces Tc , 0 ≤ c < sup x3 , such that Tc ∩ S = Sc , e0 is tangent to Tc on Sc , Tc has constant mean extrinsic curvature χ0 , the vector field e0 on Sc can be extended to a Tc -intrinsic geodesics vector field e0 on Tc with connected integral curves. We denote by x0 the function on R0 which vanishes on S and induces the natural (affine) parameter on the integral curves of e0 . Let h ∈ C ∞ (R, R) be a decreasing function with h(0) > 0, h(a) = 0 for some a, 0 < a < sup x3 such that the set R which is bounded by T 0 , S and {p ∈ M 0 | x0 < h(x3 )} is relative compact in M 0 and coincides with the domain of dependence of R ∩ (S ∪ T 0 ) in M 0 . ˆ 0 , g) We can repeat this discussion with (M ˆ replacing (M 0 , g) to obtain analogous ˆ vector field eˆ0 , and function xˆ 0 sets Tˆc (with mean extrinsic curvature equal to χ0 ), R, 3 based on x and θ. By a suitable choice of h we can assume that the same functions are ˆ used to define R and R. ˆ If p ∈ R, there is a unique number c and a We define now a map ψ¯ from R onto R. unique q ∈ Sc such the Tc -intrinsic geodesic on Tc with tangent vector e0 at q meets p. ¯ We define ψ(p) to be the unique point on the Tˆc -intrinsic geodesic through q for which ¯ xˆ 0 (ψ(p)) = x0 (p). The map ψ¯ then defines a bijection which implies the identity on R ∩ S = Rˆ ∩ S. By Lemma 4.2 we can express the solutions on R, Rˆ locally in terms of a gauge as described in Sect. 4 with the gauge source function being in both cases given by θ, F = 0, f = χ0 . In terms of such a gauge the data related by ψ¯ are identical and the reduced field equations take the same form. 
The uniqueness of the local solutions, implied by the energy estimates, allows us to conclude that ψ¯ is in fact an isometry.
We show that the restrictions of $\bar\psi$ and $\psi$ to $R \cap D$ define identical maps from $R \cap D$ onto $\hat R \cap \hat D$. Since $\psi$ is an isometry which leaves $(S \setminus \Sigma) \cap R$ pointwise invariant, the sets $\bar T_c = \psi(T_c \cap D \cap R)$ have constant mean extrinsic curvature equal to $\chi_0$, satisfy $\bar T_c \cap S = S_c$, and are tangent to $\hat e_0$ on $S_c$, because $\hat g(T(\psi) e_0, e'_3) = \hat g(T(\psi) e_0, T(\psi) e'_3) = g(e_0, e'_3) = -\sinh(\theta)$ entails $\hat e_0 = T(\psi) e_0$. This implies that $\bar T_c \subset \hat T_c$. Since isometries map geodesic vector fields again onto such vector fields, it follows that $T(\psi) e_0 = \hat e_0$ on $\bar T_c$. Since isometries preserve affine parameters, we have $x^0 = \hat x^0 \circ \psi$ on $T_c \cap D \cap R$. This implies our assertion.

Defining the map $\Psi$ from $M'' = R \cup D$ onto $\hat M'' = \hat R \cup \hat D$ to be equal to $\psi$ on $D$ and equal to $\bar\psi$ elsewhere, we get an isometry for the metrics induced by $g$ and $\hat g$ on $M''$ and $\hat M''$ respectively. This proves assertion (iv).

References
1. Bartnik, R.: Einstein equations in the null quasispherical gauge. Class. Quantum Grav. 14, 2185–2194 (1997)
2. Ellis, G.F.R.: Relativistic cosmology: Its nature, aims and problems. In: General Relativity and Gravitation, B. Bertotti et al. (eds.), Dordrecht: Reidel, 1984
3. Friedrich, H.: Cauchy Problems for the Conformal Vacuum Field Equations in General Relativity. Commun. Math. Phys. 91, 445–472 (1983)
4. Friedrich, H.: Einstein Equations and Conformal Structure: Existence of Anti-de Sitter-Type Space-Times. J. Geom. Phys. 17, 125–184 (1995)
5. Friedrich, H.: Hyperbolic reductions for Einstein's field equations. Class. Quantum Grav. 13, 1451–1469 (1996)
6. Friedrich, H.: Evolution equations for gravitating ideal fluid bodies in general relativity. Phys. Rev. D 57, 2317–2322 (1998)
7. Friedrich, H.: Einstein's equation and geometric asymptotics. AEI-Preprint 059, 1998, gr-qc 9804009
8. Guès, O.: Problème mixte hyperbolique quasi-linéaire caractéristique. Commun. Part. Diff. Eq. 15, 595–645 (1990)
9. Kato, T.: The Cauchy problem for quasi-linear symmetric hyperbolic systems. Arch. Ration. Mech. Anal. 58, 181–205 (1975)
10. Kijowski, J.: A simple derivation of canonical structure and quasi-local Hamiltonians in general relativity. Gen. Rel. Grav. 29, 307–343 (1997)
11. Newman, E.T., Penrose, R.: An Approach to Gravitational Radiation by a Method of Spin Coefficients. J. Math. Phys. 3, 566–578 (1962)
12. Rauch, J.: Symmetric positive systems with boundary characteristics of constant multiplicity. Trans. Am. Math. Soc. 291, 167–187 (1985)
13. Secchi, P.: The initial boundary value problem for linear symmetric hyperbolic systems with characteristic boundary of constant multiplicity. Diff. Int. Eqs 9, 671–700 (1996)
14. Secchi, P.: Well-Posedness of Characteristic Symmetric Hyperbolic Systems. Arch. Ration. Mech. Anal. 134, 155–197 (1996)
15. Tamburino, L.A., Winicour, J.H.: Gravitational Fields in Finite and Conformal Bondi Frames. Phys. Rev. 150, 1039–1053 (1966)

Communicated by H. Nicolai
Commun. Math. Phys. 201, 657 – 697 (1999)
Communications in
Mathematical Physics © Springer-Verlag 1999
Non-Equilibrium Statistical Mechanics of Anharmonic Chains Coupled to Two Heat Baths at Different Temperatures
J.-P. Eckmann$^{1,2}$, C.-A. Pillet$^{3,4}$, L. Rey-Bellet$^{2}$
1 Département de Physique Théorique, Université de Genève, CH-1211 Genève 4, Switzerland
2 Section de Mathématiques, Université de Genève, CH-1211 Genève 4, Switzerland
3 PHYMAT, Université de Toulon, F-83957 La Garde Cedex, France
4 CPT-CNRS Luminy, F-13288 Marseille Cedex 09, France
Received: 1 April 1998 / Accepted: 17 September 1998
Abstract: We study the statistical mechanics of a finite-dimensional non-linear Hamiltonian system (a chain of anharmonic oscillators) coupled to two heat baths (described by wave equations). Assuming that the initial conditions of the heat baths are distributed according to the Gibbs measures at two different temperatures we study the dynamics of the oscillators. Under suitable assumptions on the potential and on the coupling between the chain and the heat baths, we prove the existence of an invariant measure for any temperature difference, i.e., we prove the existence of steady states. Furthermore, if the temperature difference is sufficiently small, we prove that the invariant measure is unique and mixing. In particular, we develop new techniques for proving the existence of invariant measures for random processes on a non-compact phase space. These techniques are based on an extension of the commutator method of Hörmander used in the study of hypoelliptic differential operators.

1. Introduction

In this paper, we consider the non-equilibrium statistical mechanics of a finite-dimensional non-linear Hamiltonian system coupled to two infinite heat baths which are at different temperatures. We show that under certain conditions on the initial data the system goes to a unique non-equilibrium steady state. Several of the ideas of this paper have been developed in the Ph.D. thesis of one of us [R-B1]. To put this new result into perspective, we situate it among other results in equilibrium and non-equilibrium statistical mechanics. First of all, for the case of only one heat bath one expects of course “return to equilibrium.” This problem has a long history, and a proof of return to equilibrium under quite general conditions on the non-linear small system and its coupling to the heat bath has been recently obtained in [JP1-4], see also [GLR].
Viewed from the context of our present problem, the main simplifying feature of the one-bath problem is that the final state can be guessed, a priori, to be the familiar Boltzmann distribution.
For the case of two heat baths, there are no results of such generality available, among other things precisely because one cannot guess in general what the steady state is going to be. Since we are dealing with systems on a non-compact phase space and without energy conservation, there is nothing like an SRB Ansatz for our problem [GC]. Worse, even the existence of any stationary state is not obvious at all. The only notable exceptions are problems where the small system and its coupling to the heat baths are linear. Then the problem can be formulated in terms of Gaussian measures, and approach to a steady state has been proved in this case in [RLL, CL, OL] for Markovian heat baths and in [SL] for the general case. For other “boundary driven models” see [GLP, GKI, FGS]. Our approach in the present paper will consist in using the spirit of [FKM] and [FK] to give a microscopic derivation of the equations of motion: under suitable assumptions, we will reduce the study of the dynamics of the coupled system (an infinite dimensional Hamiltonian system) to the study of a random finite dimensional dynamical system. However, we will not achieve the generality of [JP1-4]. Each heat bath is an infinite dimensional linear Hamiltonian system, in our case it will be chosen as the classical field theory associated with the wave equation. The small system is a non-linear Hamiltonian system with an arbitrary (but finite) number of degrees of freedom, in our case it is chosen as a chain of anharmonic oscillators with nearest neighbor couplings. The potential must be of quadratic type near infinite energies. The two heat baths are coupled respectively to the first and the last particle of the chain. The initial conditions of the heat baths will be distributed according to thermal equilibrium at inverse temperatures βL , βR . Integrating the variables of the heat baths leads to a system of random integro-differential equations: the generalized Langevin equations. 
They differ from the Newton equations of motion by the addition of two kinds of force: on the one hand there is a (random) force exerted by the heat baths on the chain of oscillators, and on the other hand there is a dissipative force with memory which describes the genuine retro-action from the heat bath on the small system. We will choose the couplings between the baths and the chain such that the random forces exerted by the baths have an exponentially decaying covariance. With this assumption (see [Tr]), the resulting equations are quasi-Markovian. By this, we mean that one can introduce a finite number of auxiliary variables in such a way that the evolution of the chain, together with these variables, is described by a system of Markovian stochastic differential equations. With this set-up, we are led to a classical problem in probability theory: the study of invariant measures for diffusion processes. In our problem, the main difficulties stem from the facts that the phase space is not compact and that the resulting diffusion process is degenerate and not self-adjoint. The standard techniques used to prove the existence of invariant measures do not seem to work in our case and, in this paper, we develop new methods to solve this problem, which rely on methods of spectral analysis. Our proof of existence is based on a compactness argument, as often in the proof of existence of invariant measures. More precisely, we will prove that the generator of the diffusion process, a second order differential operator, has compact resolvent in a suitably chosen Hilbert space. This is done by generalizing the commutator method of Hörmander [Hö], used in the study of hypoelliptic operators. Similar methods have been used to study the spectrum of Schrödinger operators with magnetic fields, see [HM, He, HN]. The restriction to a chain is mostly for convenience.
Other geometries can be accommodated with our methods, and the number of heat baths is not restricted to two. Furthermore, the techniques developed in this paper can be applied to other interesting
models of non-equilibrium statistical mechanics, for example, an electric field acting on a system of particles [R-B2].
2. Description of the Model and Derivation of the Effective Equations

In this section we define a model of two heat baths coupled to a small system, and derive the stochastic equations which describe the time evolution of the small system. The heat baths are classical field theories associated with the wave equation, the small system is a chain of oscillators, and the coupling between them is linear in the field. We begin the description of the model by defining the “small” system. It is a chain of $d$-dimensional anharmonic oscillators. The phase space of the chain is $\mathbf{R}^{2dn}$ with $n$ and $d$ arbitrary and its dynamics is described by a $C^\infty$ Hamiltonian function of the form
\[
H_S(p,q) = \sum_{j=1}^{n} \frac{p_j^2}{2} + \sum_{j=1}^{n} U_j(q_j) + \sum_{i=1}^{n-1} U_i^{(2)}(q_i, q_{i+1}) \equiv \sum_{j=1}^{n} \frac{p_j^2}{2} + V(q),
\tag{2.1}
\]
where $q = (q_1, \dots, q_n)$, $p = (p_1, \dots, p_n)$, with $p_i, q_i \in \mathbf{R}^d$. The potential energy will be assumed “quadratic + bounded” in the following sense. Let $\mathcal{F}$ be the space of $C^\infty$ functions $F$ on $\mathbf{R}^{dn}$ for which $\partial^\alpha F(q)$ is bounded uniformly in $q \in \mathbf{R}^{dn}$ for all multi-indices $\alpha$. Then our hypotheses are

H1) Behavior at infinity: We assume that $V$ is of the form
\[
V(q) = \tfrac{1}{2} \langle q - a, Q(q - a) \rangle + F(q),
\]
where $Q$ is a positive definite $(dn \times dn)$ matrix, $a$ is a vector, and $\partial_{q_i^{(\nu)}} F \in \mathcal{F}$ for $i = 1, \dots, n$ and $\nu = 1, \dots, d$.

H2) Coupling: Each of the $(d \times d)$ matrices $M_{i,i+1}(q) \equiv \nabla_{q_i} \nabla_{q_{i+1}} U_i^{(2)}(q_i, q_{i+1})$, $i = 1, \dots, n-1$, is either uniformly positive or negative definite.

Remark. The first hypothesis makes sure the particles do not “fly away.” The second hypothesis makes sure that the nearest neighbor interaction can transmit energy. As such, this condition is of the hypoelliptic type.

Example. A typical case (in dimension $d$) covered by these hypotheses is given by
\[
U_j(q) = q^2 + 5 \sin\sqrt{1 + q^2}, \qquad U_i^{(2)}(q, q') = (q - q')^2 + \sin\sqrt{1 + (q - q')^2}\,/(2d).
\]

As a model of a heat bath we consider the classical field theory associated with the $d$-dimensional wave equation. The field $\varphi$ and its conjugate momentum field $\pi$ are elements of the real Hilbert space $\mathcal{H} = H^1_{\mathbf{R}}(\mathbf{R}^d) \oplus L^2_{\mathbf{R}}(\mathbf{R}^d)$ which is the completion of $C_0^\infty(\mathbf{R}^d) \oplus C_0^\infty(\mathbf{R}^d)$ with respect to the norm defined by the scalar product:
\[
\left( \begin{pmatrix} \varphi \\ \pi \end{pmatrix}, \begin{pmatrix} \varphi \\ \pi \end{pmatrix} \right)_{\mathcal{H}} = \int dx \left( |\nabla\varphi(x)|^2 + |\pi(x)|^2 \right).
\tag{2.2}
\]
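As a rough numerical sanity check of Hypothesis H2 for the example potential, the following sketch evaluates the mixed second derivative of $U_i^{(2)}$ on a grid. It assumes $d = 1$ and reads the example as having only the sine term divided by $2d$; the function names and the grid are illustrative choices, not part of the paper.

```python
import math

def d2_pair_potential(x, d=1):
    """g''(x) for g(x) = x**2 + sin(sqrt(1 + x**2)) / (2*d), with x = q - q'.

    Assumption: only the sine term of the example is divided by 2d.
    """
    s = math.sqrt(1.0 + x * x)
    # f(x) = sin(s) has f''(x) = cos(s)/s**3 - sin(s) * x**2 / s**2
    f2 = math.cos(s) / s**3 - math.sin(s) * x * x / (s * s)
    return 2.0 + f2 / (2.0 * d)

def mixed_derivative(q, q_next, d=1):
    """M_{i,i+1} = d/dq d/dq' U2(q, q') = -g''(q - q') in one dimension."""
    return -d2_pair_potential(q - q_next, d)

# Grid of differences q - q' in [-50, 50]; H2 asks for a uniform sign.
xs = [-50.0 + 0.01 * k for k in range(10001)]
values = [mixed_derivative(x, 0.0) for x in xs]
worst = max(values)   # closest approach to zero on the grid
print(worst)
```

On this grid the mixed derivative stays strictly negative, which is the "uniformly negative definite" alternative of H2 in this one-dimensional reading.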
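The Hamilton function of the free heat bath is
\[
H_B(\varphi, \pi) = \frac{1}{2} \int dx \left( |\nabla\varphi(x)|^2 + |\pi(x)|^2 \right),
\]
and the corresponding equation of motion is the ordinary wave equation which we write in the form
\[
\begin{pmatrix} \dot\varphi(t) \\ \dot\pi(t) \end{pmatrix} = L \begin{pmatrix} \varphi \\ \pi \end{pmatrix},
\qquad\text{where}\qquad
L \equiv \begin{pmatrix} 0 & 1 \\ \Delta & 0 \end{pmatrix}.
\]
Let us turn to the coupling between the chain and the heat baths. The baths will be called “L” and “R”; the left bath couples to the coordinate $q_1$ and the right bath couples to the other end of the chain ($q_n$). Since we consider two heat baths, the phase space of the coupled system, for finite energy configurations, is $\mathbf{R}^{2dn} \times \mathcal{H} \times \mathcal{H}$ and its Hamiltonian will be chosen as
\[
\begin{aligned}
H(p, q, \varphi_L, \pi_L, \varphi_R, \pi_R) ={}& H_S(p,q) + H_B(\varphi_L, \pi_L) + H_B(\varphi_R, \pi_R) \\
&+ q_1 \cdot \int dx\, \rho_L(x) \nabla\varphi_L(x) + q_n \cdot \int dx\, \rho_R(x) \nabla\varphi_R(x).
\end{aligned}
\tag{2.3}
\]
Here, the $\rho_j(x) \in L^1(\mathbf{R}^d)$ are charge densities which we assume for simplicity to be spherically symmetric functions. The choice of the Hamiltonian Eq. (2.3) is motivated by the dipole approximation of classical electrodynamics. For notational purposes we use in the sequel the shorthand
\[
\phi_i \equiv \begin{pmatrix} \varphi_i \\ \pi_i \end{pmatrix}.
\]
We set $\alpha_i = (\alpha_i^{(1)}, \dots, \alpha_i^{(d)})$, $i \in \{L, R\}$, with
\[
\hat\alpha_i^{(\nu)}(k) \equiv \begin{pmatrix} -i k^{(\nu)} \hat\rho_i(k)/k^2 \\ 0 \end{pmatrix}.
\]
Here and in the sequel the “hat” means the Fourier transform
\[
\hat f(k) \equiv \frac{1}{(2\pi)^{d/2}} \int dx\, f(x) e^{-ik\cdot x}.
\]
With this notation the Hamiltonian becomes
\[
H(p, q, \phi_L, \phi_R) = H_S(p,q) + H_B(\phi_L) + H_B(\phi_R) + q_1 \cdot (\phi_L, \alpha_L)_{\mathcal{H}} + q_n \cdot (\phi_R, \alpha_R)_{\mathcal{H}},
\]
where $H_B(\phi) = \frac{1}{2} \|\phi\|^2_{\mathcal{H}}$. We next study the equations of motion. They take the form
\[
\begin{aligned}
\dot q_j(t) &= p_j(t), && j = 1, \dots, n, \\
\dot p_1(t) &= -\nabla_{q_1} V(q(t)) - (\phi_L(t), \alpha_L)_{\mathcal{H}}, \\
\dot p_j(t) &= -\nabla_{q_j} V(q(t)), && j = 2, \dots, n-1, \\
\dot p_n(t) &= -\nabla_{q_n} V(q(t)) - (\phi_R(t), \alpha_R)_{\mathcal{H}}, \\
\dot\phi_L(t) &= L\left( \phi_L(t) + \alpha_L \cdot q_1(t) \right), \\
\dot\phi_R(t) &= L\left( \phi_R(t) + \alpha_R \cdot q_n(t) \right).
\end{aligned}
\tag{2.4}
\]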
The last two equations of (2.4) are easily integrated and lead to
\[
\begin{aligned}
\phi_L(t) &= e^{Lt} \phi_L(0) + \int_0^t ds\, L e^{L(t-s)} \alpha_L \cdot q_1(s), \\
\phi_R(t) &= e^{Lt} \phi_R(0) + \int_0^t ds\, L e^{L(t-s)} \alpha_R \cdot q_n(s),
\end{aligned}
\]
where the $\phi_i(0)$, $i \in \{L, R\}$, are the initial conditions of the heat baths. Inserting into the first $2n$ equations of (2.4) gives the following system of integro-differential equations:
\[
\begin{aligned}
\dot q_j(t) &= p_j(t), && j = 1, \dots, n, \\
\dot p_1(t) &= -\nabla_{q_1} V(q(t)) - \left( \phi_L(0), e^{-Lt} \alpha_L \right)_{\mathcal{H}} - \int_0^t ds\, D_L(t-s) q_1(s), \\
\dot p_j(t) &= -\nabla_{q_j} V(q(t)), && j = 2, \dots, n-1, \\
\dot p_n(t) &= -\nabla_{q_n} V(q(t)) - \left( \phi_R(0), e^{-Lt} \alpha_R \right)_{\mathcal{H}} - \int_0^t ds\, D_R(t-s) q_n(s),
\end{aligned}
\tag{2.5}
\]
where the $d \times d$ dissipation matrices $D_i^{(\mu,\nu)}(t-s)$, $i \in \{L, R\}$, are given by
\[
D_i^{(\mu,\nu)}(t-s) = \left( \alpha_i^{(\mu)}, L e^{L(t-s)} \alpha_i^{(\nu)} \right)_{\mathcal{H}}
= -\frac{1}{d}\, \delta_{\mu,\nu} \int dk\, |\hat\rho_i(k)|^2 |k| \sin(|k|(t-s)).
\]
The last expression is obtained by observing that
\[
e^{Lt} = \begin{pmatrix} \cos(|k|t) & |k|^{-1} \sin(|k|t) \\ -|k| \sin(|k|t) & \cos(|k|t) \end{pmatrix},
\]
written in Fourier space.

So far we only discussed the finite energy configurations of the heat baths. From now on, we will assume that the two reservoirs are in thermal equilibrium at inverse temperatures $\beta_L$ and $\beta_R$. This means that the initial conditions $\Phi(0) \equiv \{\phi_L(0), \phi_R(0)\}$ are distributed according to the Gaussian measure with mean zero and covariance $\langle \phi_i(f) \phi_j(g) \rangle = \delta_{ij} (1/\beta_i) (f, g)_{\mathcal{H}}$. (Recall that the Hamiltonian of the heat baths is given by $\sum_{i \in \{L,R\}} (\phi_i, \phi_i)_{\mathcal{H}}$.) If we assume that the coupling functions $\alpha_i^{(\nu)}$ are in $\mathcal{H}$, $i \in \{L, R\}$, and $\nu \in \{1, \dots, d\}$, then the $\xi_i(t) \equiv \phi_i(0)(e^{-Lt} \alpha_i)$ become $d$-dimensional Gaussian random processes with mean zero and covariance
\[
\langle \xi_i(t) \xi_j(s) \rangle = \delta_{i,j} \frac{1}{\beta_i} C_i(t-s), \qquad i, j \in \{L, R\},
\]
and the $d \times d$ covariance matrices $C_i(t-s)$ are given by
\[
C_i^{(\mu,\nu)}(t-s) = \left( \alpha_i^{(\mu)}, e^{L(t-s)} \alpha_i^{(\nu)} \right)_{\mathcal{H}}
= \frac{1}{d}\, \delta_{\mu,\nu} \int dk\, |\hat\rho_i(k)|^2 \cos(|k|(t-s)).
\tag{2.6}
\]
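The Fourier-space form of $e^{Lt}$ used to evaluate the dissipation and covariance matrices can be checked mechanically. The sketch below assumes that $L$ acts in Fourier space as the matrix with rows $(0, 1)$ and $(-k^2, 0)$ (the Laplacian becoming multiplication by $-k^2$); it verifies the group property and the derivative $\frac{d}{dt} e^{Lt} = L e^{Lt}$ for one arbitrarily chosen wave number. All names are illustrative.

```python
import math

def exp_Lt(k, t):
    """Fourier-space matrix of e^{Lt} for a wave number of magnitude k > 0."""
    c, s = math.cos(k * t), math.sin(k * t)
    return [[c, s / k], [-k * s, c]]

def generator(k):
    """Assumed Fourier symbol of L: the Laplacian becomes -k**2."""
    return [[0.0, 1.0], [-k * k, 0.0]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

k, t, s, h = 1.7, 0.9, 0.4, 1e-6

# Group property: e^{Lt} e^{Ls} = e^{L(t+s)}
group_error = max_abs_diff(matmul(exp_Lt(k, t), exp_Lt(k, s)), exp_Lt(k, t + s))

# Central finite difference of t -> e^{Lt}, compared with L e^{Lt}
plus, minus = exp_Lt(k, t + h), exp_Lt(k, t - h)
fd = [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
derivative_error = max_abs_diff(fd, matmul(generator(k), exp_Lt(k, t)))
print(group_error, derivative_error)
```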
The relation
\[
\dot C_i(t) = D_i(t),
\tag{2.7}
\]
which is checked easily by inspection, is known as the fluctuation dissipation theorem. It is characteristic of the Hamiltonian nature of the system. After these assumptions and transformations, the equations of motion (2.5) become a system of random integro-differential equations on $\mathbf{R}^{2dn}$ which we will analyze in the sequel. Finally, we impose a condition on the random force exerted by the heat baths on the chain. We assume that

H3) The covariances of the random processes $\xi_i(t)$ with $i \in \{L, R\}$ satisfy
\[
C_i^{(\mu,\nu)}(t-s) = \delta_{\mu,\nu} \sum_{m=1}^{M} \lambda_{i,m}^2 e^{-\gamma_{i,m}|t-s|},
\]
with $\gamma_{i,m} > 0$ and $\lambda_{i,m} > 0$.

This can be achieved by a suitable choice of the coupling functions $\rho_i(x)$, for example
\[
\hat\rho_i(k) = \mathrm{const.} \prod_{m=1}^{M} \frac{1}{(k^2 + \gamma_{i,m}^2)^{1/2}},
\]
where all the $\gamma_{i,m}$ are distinct. To keep the notation from still further accumulating, we choose $M$ the same on the left and the right. We will call the random process given by Eq. (2.5) quasi-Markovian if Condition H3 is satisfied. Indeed, using Condition H3 together with the fluctuation-dissipation relation (2.7) and enlarging the phase space one may eliminate the memory terms (both deterministic and random) of the equations of motion (2.5) and rewrite them as a system of Markovian stochastic differential equations. By Condition H3 we can rewrite the stochastic processes $\xi_i(t)$ as Itô stochastic integrals
\[
\xi_i(t) = \sum_{m=1}^{M} \lambda_{i,m} \sqrt{\frac{2\gamma_{i,m}}{\beta_i}} \int_{-\infty}^{t} e^{-\gamma_{i,m}(t-s)}\, dw_{i,m}(s),
\]
where the $w_{i,m}(s)$ are $d$-dimensional Wiener processes with covariance
\[
E\left[ \left( w_{i,m}^{(\mu)}(t) - w_{i,m}^{(\mu)}(s) \right) \left( w_{j,m'}^{(\nu)}(t') - w_{j,m'}^{(\nu)}(s') \right) \right] = \delta_{i,j}\, \delta_{\mu,\nu}\, \delta_{m,m'}\, \left| [s,t] \cap [s',t'] \right|,
\tag{2.8}
\]
where $s < t$ and $s' < t'$, $E$ is the expectation on the probability space of the Wiener process and $|\cdot|$ denotes the Lebesgue measure. We introduce new “effective” variables $r_{L,m}, r_{R,m} \in \mathbf{R}^d$, with $m = 1, \dots, M$, which describe both the retro-action of the heat bath onto the system and the random force exerted by the heat baths:
\[
\begin{aligned}
r_{L,m}(t) &= \lambda_{L,m}^2 \gamma_{L,m} \int_0^t ds\, e^{-\gamma_{L,m}(t-s)} q_1(s) - \lambda_{L,m} \sqrt{\frac{2\gamma_{L,m}}{\beta_L}} \int_{-\infty}^{t} e^{-\gamma_{L,m}(t-s)}\, dw_{L,m}(s), \\
r_{R,m}(t) &= \lambda_{R,m}^2 \gamma_{R,m} \int_0^t ds\, e^{-\gamma_{R,m}(t-s)} q_n(s) - \lambda_{R,m} \sqrt{\frac{2\gamma_{R,m}}{\beta_R}} \int_{-\infty}^{t} e^{-\gamma_{R,m}(t-s)}\, dw_{R,m}(s).
\end{aligned}
\]
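The Itô representation of $\xi_i(t)$ can be cross-checked against Condition H3: by the Itô isometry and the independence of the $w_{i,m}$, its covariance is a sum over $m$ of deterministic integrals of kernel products. The sketch below evaluates these integrals with a midpoint rule for one spatial component and compares the result with $(1/\beta_i) C_i(t-s)$. The parameter values, the cutoff of the lower integration limit, and the function names are illustrative assumptions.

```python
import math

def covariance_target(t, s, lam, gam, beta):
    """(1/beta) * C(t - s), with C as in Condition H3 (one component)."""
    return sum(l * l * math.exp(-g * abs(t - s)) for l, g in zip(lam, gam)) / beta

def covariance_ito(t, s, lam, gam, beta, cutoff=40.0, steps=100000):
    """Ito isometry: integrate the kernel product over u <= min(t, s).

    The lower limit -infinity is truncated at min(t, s) - cutoff.
    """
    upper = min(t, s)
    du = cutoff / steps
    total = 0.0
    for j in range(steps):
        u = upper - (j + 0.5) * du   # midpoint of the j-th subinterval
        for l, g in zip(lam, gam):
            kernel_t = math.exp(-g * (t - u))
            kernel_s = math.exp(-g * (s - u))
            total += l * l * (2.0 * g / beta) * kernel_t * kernel_s * du
    return total

lam, gam, beta = [0.3, 0.5], [0.7, 2.0], 1.5   # hypothetical M = 2 parameters
target = covariance_target(1.2, 0.4, lam, gam, beta)
numeric = covariance_ito(1.2, 0.4, lam, gam, beta)
print(target, numeric)
```

The agreement of the two numbers reflects the identity $\frac{2\gamma}{\beta} \int_{-\infty}^{\min(t,s)} e^{-\gamma(t-u)} e^{-\gamma(s-u)}\, du = \frac{1}{\beta} e^{-\gamma|t-s|}$ for each $m$.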
We get the following system of Markovian stochastic differential equations:
\[
\begin{aligned}
dq_j(t) &= p_j(t)\, dt, && j = 1, \dots, n, \\
dp_1(t) &= -\nabla_{q_1} V(q(t))\, dt + \sum_{m=1}^{M} r_{L,m}(t)\, dt, \\
dp_j(t) &= -\nabla_{q_j} V(q(t))\, dt, && j = 2, \dots, n-1, \\
dp_n(t) &= -\nabla_{q_n} V(q(t))\, dt + \sum_{m=1}^{M} r_{R,m}(t)\, dt, \\
dr_{L,m}(t) &= -\gamma_{L,m} r_{L,m}(t)\, dt + \lambda_{L,m}^2 \gamma_{L,m} q_1(t)\, dt - \lambda_{L,m} \sqrt{\frac{2\gamma_{L,m}}{\beta_L}}\, dw_{L,m}(t), \\
dr_{R,m}(t) &= -\gamma_{R,m} r_{R,m}(t)\, dt + \lambda_{R,m}^2 \gamma_{R,m} q_n(t)\, dt - \lambda_{R,m} \sqrt{\frac{2\gamma_{R,m}}{\beta_R}}\, dw_{R,m}(t),
\end{aligned}
\tag{2.9}
\]
which defines a Markov diffusion process on $\mathbf{R}^{2d(n+M)}$. This system of equations is our main object of study. Our main results are the following:

Theorem 2.1. If Conditions H1-H2 hold, there is a constant $\lambda^* > 0$, such that for $|\lambda_{L,m}|, |\lambda_{R,m}| \in (0, \lambda^*)$ with $m = 1, \dots, M$, the solution of Eq. (2.9) is a Markov process which has an absolutely continuous invariant measure $\mu$ with a $C^\infty$ density.

Remark. In Proposition 3.6 we will show even more. Let $h_0(\beta)$ be the Gibbs distribution for our system when the heat baths are both at temperature $1/\beta$. If $h$ denotes the density of the invariant measure found in Theorem 2.1, we find that $h/h_0(\beta)$ is in the Schwartz space $\mathcal{S}$ for all $\beta < \min(\beta_L, \beta_R)$. This mathematical statement reflects the intuitively obvious fact that the chain cannot get hotter than either of the baths.

Concerning the uniqueness and the ergodic properties of the invariant measure, our results are restricted to small temperature differences. We have the following result.

Theorem 2.2. If Conditions H1-H2 hold, there are constants $\lambda^* > 0$ and $\varepsilon > 0$ such that for $|\lambda_{L,m}|, |\lambda_{R,m}| \in (0, \lambda^*)$ with $m = 1, \dots, M$, and $|\beta_L - \beta_R|/(\beta_L + \beta_R) < \varepsilon$, the Markov process (2.9) has a unique invariant measure and this measure is mixing.

Remark. The restriction on the couplings $\lambda_{L,m}$, $\lambda_{R,m}$ between the small system and the baths is non-perturbative: it is a condition of stability of the coupled small system plus heat baths. Indeed, the baths have the effect of renormalizing the deterministic potential seen by the small system. The constant $\lambda^*$ depends only on the potential $V(q)$: if the coupling constants $\lambda_{L,m}$, $\lambda_{R,m}$ are too large, the effective potential ceases to be stable and, at least at equilibrium (i.e., for $\beta_L = \beta_R$), there is no invariant probability measure for the Markov process (2.9), but only a $\sigma$-finite invariant measure (see Eq. (3.7) and Eq. (3.9)).
This restriction is related to Condition H1 on the potential: for potentials which grow at infinity faster than quadratically, this restriction would not be present (see [JP1]). On the other hand, the restriction on the temperature difference is of perturbative origin.
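To make the system (2.9) concrete, here is a minimal simulation sketch for a hypothetical harmonic chain with $n = 2$, $d = 1$, $M = 1$ and equal coupling constants on both sides. It uses a simple Euler-type time stepping (momenta updated before positions, with Gaussian increments for the Wiener terms), which is only one of many possible discretizations and is not taken from the paper. With the noise switched off, the effective Hamiltonian $G$ of Eq. (3.7) should decrease along trajectories, since the drift dissipates it at the rate $\lambda^2 \gamma (|W_L|^2 + |W_R|^2)$; the script checks this numerically.

```python
import math
import random

def grad_V(q):
    """Gradient of the hypothetical V(q) = (q1^2 + q2^2)/2 + (q1 - q2)^2/2."""
    q1, q2 = q
    return [q1 + (q1 - q2), q2 + (q2 - q1)]

def effective_energy(q, p, r, lam):
    """G of Eq. (3.7) for n = 2, d = 1, M = 1 and equal couplings lam."""
    q1, q2 = q
    h_s = 0.5 * (p[0] ** 2 + p[1] ** 2) + 0.5 * (q1 * q1 + q2 * q2) + 0.5 * (q1 - q2) ** 2
    return h_s + (r[0] ** 2 + r[1] ** 2) / (2.0 * lam * lam) - q1 * r[0] - q2 * r[1]

def simulate(steps, dt, lam, gam, beta_L, beta_R, seed=0, noise=True):
    """Euler-type discretization of (2.9); returns the final (q, p, r)."""
    rng = random.Random(seed)
    q, p, r = [1.0, 0.0], [0.0, 0.0], [0.0, 0.0]
    amp = [lam * math.sqrt(2.0 * gam / beta_L), lam * math.sqrt(2.0 * gam / beta_R)]
    for _ in range(steps):
        g = grad_V(q)
        p = [p[0] + (r[0] - g[0]) * dt, p[1] + (r[1] - g[1]) * dt]
        q_old = q                      # q_1 drives r_L, q_n drives r_R
        q = [q[0] + p[0] * dt, q[1] + p[1] * dt]
        for i in range(2):
            dw = rng.gauss(0.0, math.sqrt(dt)) if noise else 0.0
            r[i] += (-gam * r[i] + lam * lam * gam * q_old[i]) * dt - amp[i] * dw
    return q, p, r

g_start = effective_energy([1.0, 0.0], [0.0, 0.0], [0.0, 0.0], lam=0.5)
q, p, r = simulate(5000, 1e-3, 0.5, 1.0, 1.0, 2.0, noise=False)
g_end = effective_energy(q, p, r, lam=0.5)
print(g_start, g_end)
```

For this quadratic $V$, completing the square in $r$ shows that $G$ is bounded below as long as $\lambda < 1$, which is the kind of stability threshold the remark on $\lambda^*$ refers to.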
Remark. Another, more physical interpretation of the problem addressed above was suggested by a referee. One starts from a translation invariant coupling between the chain and either of the baths, which is of the form
\[
\int dx\, \varphi(x) \rho(x - q_1).
\]
The dipole expansion for this coupling leads to terms of the form
\[
q_1 \cdot \int dx\, \varphi(x) \nabla\rho(x) + \frac{q_1^2}{2d} \int dx\, |\rho(x)|^2.
\]
We have only taken the first term in (2.3). If one takes both terms, one can take $\lambda_{L,m}$ and $\lambda_{R,m}$ arbitrarily large.

A more physical formulation of the results of Theorem 2.2 is obtained by going back to Eq. (2.5), which expresses all the quantities in terms of the phase space of the small system and the initial conditions $\Phi(0)$ of the heat baths. Let us introduce some notation: For given initial conditions $\Phi(0)$, we let $\Theta_{t,\Phi(0)}(p,q)$ denote the solution of Eq. (2.5). Finally, define
\[
\nu(dp, dq) = \int_{r \in \mathbf{R}^{2dM}} \mu(dp, dq, dr),
\]
where $\mu$ is the invariant measure of Theorem 2.1.

Corollary 2.3. Under the hypotheses of Theorem 2.2, the system Eq. (2.5) reaches a stationary state and is mixing in the following sense: For any observables $F, G \in L^2(\mathbf{R}^{2dn}, \nu(dp, dq))$ and for any probability measure $\nu_0(dp, dq)$ which is absolutely continuous with respect to $\nu(dp, dq)$ we have
\[
\begin{aligned}
\lim_{t\to\infty} \int \nu_0(dp, dq)\, \langle F \circ \Theta_{t,\Phi(0)}(p,q) \rangle &= \int \nu(dp, dq)\, F(p,q), \\
\lim_{t\to\infty} \int \nu(dp, dq)\, \langle F \circ \Theta_{t,\Phi(0)}(p,q)\, G(p,q) \rangle &= \int \nu(dp, dq)\, F(p,q) \int \nu(dp, dq)\, G(p,q).
\end{aligned}
\tag{2.10}
\]
Here, $\langle \cdot \rangle$ denotes the integration over the Gaussian measures of the two heat baths, introduced earlier.

We explain next the strategy of our argument. Our proof is based on a detailed study of Eq. (2.9). Let $x = (p, q, r)$. For a Markov process $x(t)$ with phase space $X$ and an invariant measure $\mu(dx)$, its ergodic properties may be deduced from the study of the associated semi-group $T^t$ on the Hilbert space $L^2(X, \mu(dx))$. To prove the existence of the invariant measure in Theorem 2.1 we proceed as follows: We consider first the semi-group $T^t$ on the auxiliary Hilbert space $\mathcal{H}_0 \equiv L^2(X, \mu_0(dx))$, where the reference measure $\mu_0(dx)$ is a generalized Gibbs state for a suitably chosen reference temperature. Our main technical result consists in proving that the generator $L$ of the semi-group $T^t$ on $\mathcal{H}_0$ and its adjoint $L^*$ have compact resolvent. This is proved by generalizing the commutator method developed by Hörmander to study hypoelliptic operators. From this follows the existence of a solution to the eigenvalue equation $(T^t)^* g = g$ in $\mathcal{H}_0$ and this implies immediately the existence of an invariant measure. To prove Theorem 2.2 we use
a perturbation argument: indeed, at equilibrium (i.e., for $\beta_L = \beta_R$) the invariant measure is unique and 0 is a simple eigenvalue of the generator $L$ in $\mathcal{H}_0$. Using the compactness properties of $L$, we show that 0 is a simple eigenvalue of the generator $L$ in $\mathcal{H}_0$ for $|\beta_L - \beta_R|/(\beta_L + \beta_R)$ small enough. This can be used to prove the uniqueness claim of Theorem 2.2, while the mixing properties will be shown by extending the method of [Tr].

This paper is organized as follows: In Sect. 3 we prove Theorem 2.1 and Theorem 2.2 except for our main estimates Proposition 3.4 and Proposition 3.5 which are proven in Sect. 4. In Appendices A, B, and C, we prove some auxiliary results.

3. Invariant Measure: Existence and Ergodic Properties

In this section, our main aim is to prove Theorem 2.1 and Theorem 2.2. We first prove some basic consequences of our Assumptions H1 and H2. In particular, we define the semi-group $T^t$ describing the solutions of Eq. (2.9) on the auxiliary Hilbert space $\mathcal{H}_0$ described in the introduction. Furthermore we recall some basic facts on hypoelliptic differential operators. Once these preliminaries are in place, we can attack the proof of Theorem 2.1 and Theorem 2.2 proper.

3.1. Existence and fundamental properties of the dynamics. Let $X = \mathbf{R}^{2d(n+M)}$ and write the stochastic differential equation (2.9) in the abbreviated form
\[
dx(t) = b(x(t))\, dt + \sigma\, dw(t),
\tag{3.1}
\]
where
(i) $b$ is a $C^\infty$ vector field which satisfies, by Condition H1,
\[
\sup_{x \in X} |\partial^\alpha b(x)| < \infty,
\]
for any multi-index $\alpha$ such that $|\alpha| \ge 1$. In particular
\[
B \equiv \| \operatorname{div} b \|_\infty < \infty.
\tag{3.2}
\]
(ii) $\sigma : \mathbf{R}^{2dM} \to X$ is a linear map. We also define
\[
D \equiv \frac{1}{2} \sigma \sigma^T \ge 0.
\tag{3.3}
\]
(iii) $w \in \mathcal{W} \equiv C(\mathbf{R}; \mathbf{R}^{2dM})$ is a standard $2dM$-dimensional Wiener process.

Equation (3.1) is a customary abbreviated form of the integral equation
\[
\xi(t, w; x) = x + \int_0^t ds\, b(\xi(s, w; x)) + \sigma (w(t) - w(0)).
\tag{3.4}
\]
It follows from an elementary contraction argument (see e.g. [Ne], Theorem 8.1) that (3.4) has a unique solution $\mathbf{R} \ni t \mapsto x(t) = \xi(t, w; x) \in C(\mathbf{R}; X)$, for arbitrary initial condition $x \in X$ and $w \in \mathcal{W}$.
The difference $w(t) - w(0)$ has the statistics of a standard Brownian motion and we denote by $E[\cdot]$ the corresponding expectation. By well-known results on stochastic differential equations, this induces on $\xi(t, w; x)$ the statistics of a Markovian diffusion process with generator
\[
\nabla \cdot D \nabla + b(x) \cdot \nabla.
\tag{3.5}
\]
More precisely (see [Ne], Theorem 8.1): Let $C_\infty(X)$ denote the continuous functions which vanish at infinity with the sup-norm and let $\mathcal{F}^t$ be the $\sigma$-field generated by $x$ and $\{w(s) - w(0) ;\ 0 < s \le t\}$. Then for $0 \le s \le t$ and $f \in C_\infty(X)$ we have
\[
E\left[ f(x(t)) \mid \mathcal{F}^s \right] = T^{t-s} f(x(s)) \quad \text{a.s.},
\tag{3.6}
\]
where $T^t$ is a strongly continuous contraction semi-group of positivity preserving operators on $C_\infty(X)$ whose generator reduces to (3.5) on $C_0^\infty(X)$. In the sequel we denote by $L$ the differential operator (3.5) with domain $D(L) = C_0^\infty(X)$.

To prove the existence of an invariant measure we will study the semi-group $T^t$, or rather an extension of it, on the auxiliary weighted Hilbert space $\mathcal{H}_0$ described in the introduction. To define $\mathcal{H}_0$ precisely, we consider the “effective Hamiltonian”
\[
G(p, q, r) = H_S(p, q) + \sum_{m=1}^{M} \left( \frac{1}{\lambda_{L,m}^2} \frac{r_{L,m}^2}{2} + \frac{1}{\lambda_{R,m}^2} \frac{r_{R,m}^2}{2} - q_1 \cdot r_{L,m} - q_n \cdot r_{R,m} \right).
\tag{3.7}
\]
We note that, due to Condition H1, $G(x) \to +\infty$ as $|x| \to \infty$ as long as $|\lambda_{L,m}|, |\lambda_{R,m}| < \lambda^*$ for some $\lambda^*$ depending only on the potential $V(q)$. We choose further a “reference temperature” $\beta_0$, which is arbitrary subject to the condition
\[
\beta_0 < 2 \min(\beta_L, \beta_R).
\tag{3.8}
\]
For example we could take $\beta_0$ as the inverse of the mean temperature of the heat baths: $\beta_0^{-1} = (\beta_L^{-1} + \beta_R^{-1})/2$. For the time being, it will be convenient not to fix $\beta_0$. Then, we let
\[
\mathcal{H}_0 = L^2(X, Z_0^{-1} e^{-\beta_0 G}\, dx),
\tag{3.9}
\]
and we denote by $(\cdot, \cdot)_{\mathcal{H}_0}$ and $\| \cdot \|_{\mathcal{H}_0}$ the corresponding scalar product and norm.

Remark. With a proper choice of $Z_0$, it is easy to check that the quantity $Z_0^{-1} e^{-\beta_0 G(q,p,r)}\, dx$ is the invariant measure for the Markov process Eq. (2.9) when $\beta_L = \beta_R = \beta_0$ and $|\lambda_{L,m}|, |\lambda_{R,m}| < \lambda^*$.

Lemma 3.1. If the potential $V$ satisfies Condition H1 and if $\beta_0 < 2 \min(\beta_L, \beta_R)$, there is a $\lambda^* > 0$ such that if the couplings satisfy $|\lambda_{L,m}|, |\lambda_{R,m}| \in (0, \lambda^*)$, then the semi-group $T^t$ given by Eq. (3.6) extends to a strongly continuous quasi-bounded semi-group $T^t_{\mathcal{H}_0}$ on $\mathcal{H}_0$:
\[
\| T^t_{\mathcal{H}_0} \|_{\mathcal{H}_0} \le e^{\alpha t},
\]
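where $\alpha$ is given by
\[
\alpha = d \sum_{m=1}^{M} \left( \gamma_{L,m} \left( \frac{1}{2} - \frac{\sqrt{(\beta_L - \beta_0/2)\beta_0/2}}{\beta_L} \right) + \gamma_{R,m} \left( \frac{1}{2} - \frac{\sqrt{(\beta_R - \beta_0/2)\beta_0/2}}{\beta_R} \right) \right) \ge 0.
\tag{3.10}
\]
The generator $L_{\mathcal{H}_0}$ of $T^t_{\mathcal{H}_0}$ is the closure of the differential operator $L$ with domain $C_0^\infty$ given by
\[
\begin{aligned}
L ={}& \sum_{m=1}^{M} \frac{\lambda_{L,m}^2 \gamma_{L,m}}{\beta_L} \left( \nabla_{r_{L,m}} - \beta_L W_{L,m} \right) \cdot \nabla_{r_{L,m}}
+ \sum_{m=1}^{M} \frac{\lambda_{R,m}^2 \gamma_{R,m}}{\beta_R} \left( \nabla_{r_{R,m}} - \beta_R W_{R,m} \right) \cdot \nabla_{r_{R,m}} \\
&+ \sum_{m=1}^{M} r_{L,m} \cdot \nabla_{p_1} + L_S + \sum_{m=1}^{M} r_{R,m} \cdot \nabla_{p_n},
\end{aligned}
\tag{3.11}
\]
with the abbreviations
\[
W_{L,m} = \lambda_{L,m}^{-2} r_{L,m} - q_1, \qquad W_{R,m} = \lambda_{R,m}^{-2} r_{R,m} - q_n,
\tag{3.12}
\]
and where $L_S$ is the Liouville operator associated with the Hamiltonian $H_S(q, p)$:
\[
L_S = \sum_{j=1}^{n} p_j \cdot \nabla_{q_j} - (\nabla_{q_j} V) \cdot \nabla_{p_j}.
\tag{3.13}
\]
Moreover, $T^t_{\mathcal{H}_0}$ is positivity preserving:
\[
T^t_{\mathcal{H}_0} f \ge 0 \quad \text{if } f \ge 0,
\tag{3.14}
\]
and
\[
T^t_{\mathcal{H}_0} 1 = 1.
\tag{3.15}
\]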
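The growth rate $\alpha$ of Eq. (3.10) is easy to evaluate numerically, which gives a quick check that it is non-negative and vanishes when $\beta_L = \beta_R = \beta_0$. The following is a small sketch; the function name and the sample parameter values are illustrative only.

```python
import math

def growth_rate(d, gam_L, gam_R, beta_L, beta_R, beta_0):
    """alpha from Eq. (3.10); gam_L and gam_R are the lists of gamma_{i,m}.

    Requires the reference temperature condition (3.8):
    0 < beta_0 < 2 * min(beta_L, beta_R).
    """
    if not 0.0 < beta_0 < 2.0 * min(beta_L, beta_R):
        raise ValueError("beta_0 must satisfy condition (3.8)")

    def term(beta):
        # 1/2 - sqrt((beta - beta_0/2) * beta_0/2) / beta, non-negative by AM-GM
        return 0.5 - math.sqrt((beta - beta_0 / 2.0) * beta_0 / 2.0) / beta

    return d * (sum(gam_L) * term(beta_L) + sum(gam_R) * term(beta_R))

alpha_equilibrium = growth_rate(1, [1.0, 2.0], [0.5, 1.5], 2.0, 2.0, 2.0)
alpha_gradient = growth_rate(3, [1.0], [1.0], 1.0, 2.0, 1.5)
print(alpha_equilibrium, alpha_gradient)
```

Each bracket in (3.10) is non-negative because $\sqrt{ab} \le (a+b)/2$ with $a = \beta_i - \beta_0/2$ and $b = \beta_0/2$, with equality exactly when $\beta_i = \beta_0$.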
Remark. We have $\alpha = 0$ if and only if $\beta_L = \beta_R = \beta_0$.

Proof. The proof uses standard tools of stochastic analysis and is given in Appendix A.

Having shown a priori bounds using Condition H1, we will state one basic consequence of Condition H2. We recall that a differential operator $P$ is called hypoelliptic if
\[
\operatorname{sing\,supp} u = \operatorname{sing\,supp} P u \quad \text{for all } u \in \mathcal{D}'(X).
\]
Here $\mathcal{D}'(X)$ is the usual space of distributions on the infinitely differentiable functions with compact support and, for $u \in \mathcal{D}'(X)$, $\operatorname{sing\,supp} u$ is the set of points $x \in X$ such that there is no open neighborhood of $x$ to which the restriction of $u$ is a $C^\infty$ function. Let $P$ be of the form
\[
P = \sum_{j=1}^{J} Y_j^2 + Y_0,
\tag{3.16}
\]
where $Y_j$, $j \in \{0, \dots, J\}$, are real $C^\infty$ vector fields. Then by Hörmander's Theorem, [Hö], Thm. 22.2.1, if the Lie algebra generated by $Y_j$, $j \in \{0, \dots, J\}$, has rank $\dim X$ at every point, then $P$ is hypoelliptic. Differential operators arising from diffusion problems are of the form (3.16). Let $L$ be the differential operator given in Eq. (3.11) and let $L^T$ denote its formal adjoint. Then one may easily check that Condition H2 implies that any of the operators $L$, $L^T$, $\partial_t + L$, $\partial_t + L^T$ satisfies the condition of Hörmander's Theorem and thus is hypoelliptic. As an immediate consequence we have:

Corollary 3.2. If Condition H2 is satisfied then the eigenvectors of $L$ and $L^T$ are $C^\infty$ functions.

Next, let $P(t, x, E)$, $t \ge 0$, $x \in X$, $E \in \mathcal{B}$, denote the transition probabilities of the Markov process $\xi(t, w; x)$ solving the stochastic differential equation (2.9) with initial condition $x$, i.e.,
\[
P(t, x, E) = \mathbf{P}\left( \xi(t, w; x) \in E \right),
\]
where $\mathbf{P}$ denotes the probability associated with the Wiener process. Then by the forward and backward Kolmogorov equations we obtain

Corollary 3.3. If Conditions H1 and H2 are satisfied then the transition probabilities of the Markov process $\xi(t, w; x)$ have a smooth density $P(t, x, y) \in C^\infty((0, \infty) \times X \times X)$.

3.2. Proof of Theorem 2.1 and Theorem 2.2. After these preliminaries we now turn to the study of spectral properties of the generator $L_{\mathcal{H}_0}$ of the semi-group $T^t_{\mathcal{H}_0}$. The proof of the existence of the invariant measure will be a consequence of the following key property which we prove in Sect. 4.

Proposition 3.4. If the potential $V$ satisfies Conditions H1, H2 and if $\beta_0 < 2 \min(\beta_L, \beta_R)$, there is a $\lambda^* > 0$ such that if the couplings satisfy $|\lambda_{L,m}|, |\lambda_{R,m}| \in (0, \lambda^*)$, then both $L_{\mathcal{H}_0}$ and $L^*_{\mathcal{H}_0}$ have compact resolvent.

A useful by-product of the proof of Proposition 3.4 are some additional smoothness and decay properties of the eigenvectors of $L_{\mathcal{H}_0}$ and $L^*_{\mathcal{H}_0}$ on $\mathcal{H}_0$.

Proposition 3.5. Let $g$ denote an eigenvector of $L_{\mathcal{H}_0}$ or $L^*_{\mathcal{H}_0}$. If the assumptions of Proposition 3.4 are satisfied then we have
\[
g e^{-\beta_0 G/2} \in \mathcal{S}(X),
\]
where $\mathcal{S}(X)$ denotes the Schwartz space.

Using these results, we come back to the Markov process defined by Eq. (2.9), whose semi-group $T^t$ was defined in Eq. (3.6). We prove the existence of an invariant measure with a smooth density and give a bound which shows that, in some sense, the chain does not get hotter than the hottest heat bath.

Proposition 3.6. Under the assumptions H1–H2 there is a $\lambda^* > 0$ such that if the couplings satisfy $|\lambda_{L,m}|, |\lambda_{R,m}| \in (0, \lambda^*)$, the Markov process $T^t$ has an invariant measure $\mu$ which is absolutely continuous with respect to the Lebesgue measure. Its density $h$ satisfies the following: $h \exp(\beta G) \in \mathcal{S}(X)$ for all $\beta < \min(\beta_L, \beta_R)$.
Proof. The function 1 is obviously a solution of $Lf = 0$ with $L$ defined in Eq. (3.11). Note next that the function 1 is in $\mathcal{H}_0$, as is seen from Eq. (3.9) (if $|\lambda_{L,m}|$ and $|\lambda_{R,m}|$ are sufficiently small). Since, by Proposition 3.4, the operator $L_{\mathcal{H}_0}$ has compact resolvent on $\mathcal{H}_0$, it follows that 0 is also an eigenvalue of $L^*_{\mathcal{H}_0}$. Let us denote the corresponding eigenvector by $g$. We will choose the normalization $(g, 1)_{\mathcal{H}_0} = 1$. We assume first that $g \ge 0$. Then the function
\[ h(x) = Z_0^{-1}\, g(x)\, e^{-\beta_0 G(x)}, \tag{3.17} \]
with $\beta_0$ and $G$ defined in Eqs. (3.8) and (3.7), is the density of an invariant measure for the process $T^t$: Indeed, we note first that $\|h\|_{L^1(X,dx)} = (1, g)_{\mathcal{H}_0}$ is finite and thus $\mu(dx) = h(x)\,dx$ is a probability measure. Let $E$ be some Borel set. Then the characteristic function $\chi_E$ of $E$ belongs to $\mathcal{H}_0$. We have
\begin{align*}
\int \mu(dx)\, T^t \chi_E &= Z_0^{-1} \int dx\, e^{-\beta_0 G(x)} g(x)\, T^t \chi_E \\
&= Z_0^{-1} \int dx\, e^{-\beta_0 G(x)} \bigl((T^t_{\mathcal{H}_0})^* g\bigr)(x)\, \chi_E \\
&= \mu(E),
\end{align*}
and therefore $\mu(dx)$ is an invariant measure for the Markov process (2.9).

To complete the first part of the proof of Proposition 3.6 it remains to show that $g \ge 0$. We will do this by checking that $h \ge 0$. We need some notation. Let $L^T$ denote the formal adjoint of $L$. Then one has $L^T h = 0$. This follows from the identities
\begin{align*}
\int dx\, f\, L^T h &= Z_0^{-1} \int dx\, f\, L^T\bigl(g e^{-\beta_0 G}\bigr) = Z_0^{-1} \int dx\, (Lf)\, g e^{-\beta_0 G} \\
&= (Lf, g)_{\mathcal{H}_0} = (f, L^*_{\mathcal{H}_0} g)_{\mathcal{H}_0} = (f, 0)_{\mathcal{H}_0} = 0,
\end{align*}
which hold for all $f \in C_0^\infty(X)$.

Consider now the semi-group $T^t$ acting on the space $C_\infty(X)$ defined at the beginning of Sect. 3. The operator $T^t$ induces an action $(T^t)^*$ defined on the dual space $C_\infty^*(X)$, which consists of finite measures. Since $T^t$ is Markovian, $(T^t)^*$ maps probability measures to probability measures. Furthermore, if a measure $\nu$ has a density $f$ in $L^1(X, dx)$, then $(T^t)^* \nu$ is a measure which has again a density in $L^1(X, dx)$: Indeed, by Corollary 3.3 the transition probabilities $P(t, x, y)$ of the Markov process are in $C^\infty((0, \infty) \times X \times X)$. If we denote by $(T^t)^T$ the induced action of $(T^t)^*$ on the densities, we have for $g \ge 0$,
\begin{align*}
(T^t)^*\bigl(g(x)\,dx\bigr) &= \int dy\, g(y)\, P(t, y, dx) \\
&= dx \int dy\, g(y)\, P(t, y, x) = \bigl((T^t)^T g\bigr)(x)\, dx,
\end{align*}
and $\|(T^t)^T g\|_{L^1} = \|g\|_{L^1}$. Coming back to the invariant density $h$, we know that $(T^t)^T h = h$.
We next show $(T^t)^T |h| = |h|$. Since $|h| \pm h \ge 0$, we have $(T^t)^T(|h| \pm h) \ge 0$. This can be rewritten as $|(T^t)^T h| \le (T^t)^T |h|$. Therefore,
\[ |h| = |(T^t)^T h| \le (T^t)^T |h|. \]
Since $(T^t)^T$ preserves the $L^1$-norm, we conclude that
\[ |h| = (T^t)^T |h|. \tag{3.18} \]
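The passage from the inequality to the equality (3.18) can be spelled out as follows. This is only a short sketch, using the positivity and $L^1$-norm preservation of $(T^t)^T$ stated above; the function $u$ is auxiliary notation introduced here for illustration.

```latex
% Sketch: |h| <= (T^t)^T |h| plus L^1-norm preservation forces equality.
% u is auxiliary notation, not part of the original text.
\begin{align*}
u &:= (T^t)^T |h| - |h| \;\ge\; 0 , \\
\int_X u \, dx &= \bigl\| (T^t)^T |h| \bigr\|_{L^1} - \bigl\| h \bigr\|_{L^1} = 0 , \\
&\Longrightarrow\quad u = 0 \ \text{a.e.}, \qquad\text{i.e.}\qquad (T^t)^T |h| = |h| .
\end{align*}
```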
This shows the existence of an invariant measure. Now, by Proposition 3.5, we have $h \exp(\beta G/2) \in \mathcal{S}(X)$ for all $\beta < 2\min(\beta_L, \beta_R)$ and so for $\beta < \min(\beta_L, \beta_R)$ it follows that $h \exp(\beta G) \in \mathcal{S}(X)$. This concludes the proof of Proposition 3.6.

We next prove the uniqueness of the invariant measure and the ergodic properties of the Markov process. We start by fixing an inverse temperature $\beta_0$. If $\beta_L = \beta_R = \beta_0$, the two heat baths are at the same temperature, and the equilibrium state of the system is known, since it is given by the generalized Gibbs distribution $Z_0^{-1} e^{-\beta_0 G}$. For the equilibrium case, this distribution is the unique invariant measure. The existence is obvious from what we showed for the case of arbitrary temperatures. To show uniqueness, assume that there is a second invariant measure. Since $L^T$ is hypoelliptic, then by Corollary 3.2 this measure has a smooth density. Since different smooth invariant measures have mutually disjoint supports and $e^{-\beta_0 G}$ has support everywhere, uniqueness follows. If the invariant measure is unique, it is ergodic and hence (see [Yo] and [Ho]) 0 is a simple eigenvalue of $L_{\mathcal{H}_0}$.

The case of different temperatures will be handled by a perturbation argument around the equilibrium situation we just described. This perturbation argument will take place in the fixed Hilbert space $\mathcal{H}_0$ defined in Eq. (3.9). Thus, we will consider values of $\beta_L$ and $\beta_R$ such that
\[ \frac{1}{\beta_0} = \frac{1}{2}\left(\frac{1}{\beta_L} + \frac{1}{\beta_R}\right), \qquad \frac{|\beta_L - \beta_R|}{\beta_L + \beta_R} < \varepsilon, \tag{3.19} \]
for some small $\varepsilon > 0$ (which does not depend on $\beta_0$). We first show that 0 remains a simple eigenvalue of the generator $L_{\mathcal{H}_0}$ when the temperature difference satisfies (3.19) for a sufficiently small $\varepsilon$.

Lemma 3.7. Under the assumptions H1–H2 there are constants $\lambda^* > 0$ and $\varepsilon > 0$ such that if the couplings satisfy $|\lambda_{L,m}|, |\lambda_{R,m}| \in (0, \lambda^*)$ and moreover $\beta_L$, $\beta_R$ satisfy (3.19), then 0 is a simple eigenvalue of the generator $L_{\mathcal{H}_0}$.

Proof. It will be convenient to work in the flat Hilbert space $L^2(X, dx)$. Note that $K = \exp(-\beta_0 G/2)\, L\, \exp(+\beta_0 G/2)$ is a function $K \equiv K(\beta_L, \beta_R, \beta_0)$. We write this as $K(\beta_L, \beta_R, \beta_0) = K(\beta_0, \beta_0, \beta_0) + \delta Z$, where
\[ \delta = \frac{\beta_R - \beta_L}{\beta_R + \beta_L}. \]
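The reason $K$ is affine in $\delta$ can be made explicit. The sketch below assumes that $\beta_0$ is fixed by the first relation in (3.19) and that, as in the definition (4.1) of $L$, the temperatures enter $L$ only through the prefactors $1/\beta_L$ and $1/\beta_R$.

```latex
% Sketch under the assumption 1/beta_0 = (1/beta_L + 1/beta_R)/2, cf. (3.19).
\begin{align*}
\beta_0 &= \frac{2\beta_L\beta_R}{\beta_L+\beta_R}, \qquad
\frac{1}{\beta_L} - \frac{1}{\beta_0}
   = \frac12\Bigl(\frac{1}{\beta_L} - \frac{1}{\beta_R}\Bigr)
   = \frac{\beta_R-\beta_L}{2\beta_L\beta_R}
   = \frac{\delta}{\beta_0}, \\
\frac{1}{\beta_L} &= \frac{1+\delta}{\beta_0}, \qquad
\frac{1}{\beta_R} = \frac{1-\delta}{\beta_0}.
\end{align*}
```

With $\beta_0$ held fixed, the conjugation by $\exp(\pm\beta_0 G/2)$ does not depend on $\delta$, so $K(\beta_L, \beta_R, \beta_0)$ is a first-order polynomial in $\delta$.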
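One finds
\begin{align*}
K(\beta_0, \beta_0, \beta_0) ={}& \sum_{m=1}^{M} \frac{\lambda_{L,m}^2 \gamma_{L,m}}{2} \left( \frac{2}{\beta_0} \nabla_{r_{L,m}}^2 - \frac{\beta_0}{2} W_{L,m}^2 + \frac{d}{\lambda_{L,m}^2} \right) \\
&+ \sum_{m=1}^{M} \frac{\lambda_{R,m}^2 \gamma_{R,m}}{2} \left( \frac{2}{\beta_0} \nabla_{r_{R,m}}^2 - \frac{\beta_0}{2} W_{R,m}^2 + \frac{d}{\lambda_{R,m}^2} \right) \\
&+ \sum_{m=1}^{M} r_{L,m} \cdot \nabla_{p_1} + L_S + \sum_{m=1}^{M} r_{R,m} \cdot \nabla_{p_n},
\end{align*}
and
\begin{align*}
Z ={}& \sum_{m=1}^{M} \frac{\lambda_{L,m}^2 \gamma_{L,m}}{2} \left( \frac{2}{\beta_0} \nabla_{r_{L,m}}^2 + W_{L,m} \cdot \nabla_{r_{L,m}} + \nabla_{r_{L,m}} \cdot W_{L,m} + \frac{\beta_0}{2} W_{L,m}^2 \right) \\
&- \sum_{m=1}^{M} \frac{\lambda_{R,m}^2 \gamma_{R,m}}{2} \left( \frac{2}{\beta_0} \nabla_{r_{R,m}}^2 + W_{R,m} \cdot \nabla_{r_{R,m}} + \nabla_{r_{R,m}} \cdot W_{R,m} + \frac{\beta_0}{2} W_{R,m}^2 \right).
\end{align*}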
Furthermore, by Proposition 3.4, $R_0 \equiv (1 - K(\beta_0, \beta_0, \beta_0))^{-1}$ is a compact operator, and therefore the simple eigenvalue 1 of $R_0$ is isolated. From now on we assume for convenience that $\alpha \equiv \alpha(\beta_L, \beta_R)$ is strictly smaller than one. Note that this is no restriction of generality: if $\alpha \in [n-1, n)$ with $n > 1$, we replace $(1 - K)^{-1}$ by $(1 - \frac{1}{n} K)^{-1}$ in the following discussion. We show next that the resolvent $R(\beta_L, \beta_R, \beta_0) \equiv (1 - K(\beta_L, \beta_R, \beta_0))^{-1}$ depends analytically on the parameter $\delta$. It is convenient to write the perturbation $Z$ as
\[ Z = \sum_{j=1}^{N} E_j F_j, \]
where the $E_j$ and $F_j$ are of the form const. $\partial_{r_{i,m}^{(\nu)}}$ or const. $W_{i,m}^{(\nu)}$, $i \in \{L, R\}$, $m = 1, \dots, M$, and $N = 8dM$. With the matrix notation
\[ F = \begin{pmatrix} F_1 \\ \vdots \\ F_N \end{pmatrix}, \qquad E^T = (E_1, \dots, E_N), \]
we can write $Z$ as $Z = E^T F$. We will use the following resolvent formula:
\[ R(\beta_L, \beta_R, \beta_0) = R_0 + \delta R_0 E^T \bigl(1 - \delta F R_0 E^T\bigr)^{-1} F R_0. \tag{3.20} \]
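One standard way to arrive at a resolvent formula of this type is sketched below. It is a formal computation which assumes that $1 - \delta F R_0 E^T$ is invertible (this is what the text goes on to justify); the shorthand $K_{\mathrm{eq}}$ for $K(\beta_0, \beta_0, \beta_0)$ is introduced here only for readability.

```latex
% Formal sketch; K_eq := K(beta_0, beta_0, beta_0) is local shorthand.
\begin{align*}
R &= \bigl(1 - K_{\mathrm{eq}} - \delta E^T F\bigr)^{-1}, \qquad
R_0 = \bigl(1 - K_{\mathrm{eq}}\bigr)^{-1}, \\
R &= R_0 + \delta R_0 E^T F R
   && \text{(second resolvent identity)}, \\
F R &= F R_0 + \delta \, F R_0 E^T \, F R
   \;\Longrightarrow\;
   F R = \bigl(1 - \delta F R_0 E^T\bigr)^{-1} F R_0 , \\
R &= R_0 + \delta R_0 E^T \bigl(1 - \delta F R_0 E^T\bigr)^{-1} F R_0 .
\end{align*}
```

Only the operator-valued matrix $F R_0 E^T$, with entries $F_j R_0 E_k$, needs to be controlled, which is why boundedness of these entries suffices.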
To justify Eq. (3.20) we have to show that for $\delta$ small enough the operator-valued matrix $1 - \delta F R_0 E^T$ is invertible. It is enough to show that $F_j R_0 E_k$ is a bounded operator, for all $j, k$. For this we decompose $1 - K(\beta_0, \beta_0, \beta_0)$ into its symmetric and antisymmetric parts: $1 - K(\beta_0, \beta_0, \beta_0) = X + iY$,
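where
\begin{align*}
X = 1 &+ \sum_{m=1}^{M} \frac{\lambda_{L,m}^2 \gamma_{L,m}}{2} \left( -\frac{2}{\beta_0} \nabla_{r_{L,m}}^2 + \frac{\beta_0}{2} W_{L,m}^2 - \frac{d}{\lambda_{L,m}^2} \right) \\
&+ \sum_{m=1}^{M} \frac{\lambda_{R,m}^2 \gamma_{R,m}}{2} \left( -\frac{2}{\beta_0} \nabla_{r_{R,m}}^2 + \frac{\beta_0}{2} W_{R,m}^2 - \frac{d}{\lambda_{R,m}^2} \right).
\end{align*}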
From the simple estimates $\|E_j f\|^2 \le (f, Xf)$, $\|F_j f\|^2 \le (f, Xf)$, which hold for $f \in C_0^\infty(X)$, and since $X$ is a strictly positive operator, we see that $E_j X^{-1/2}$ and $F_j X^{-1/2}$ are bounded operators. From the identity
\[ F_j R_0 E_k = F_j (X + iY)^{-1} E_k = F_j X^{-1/2} \bigl(1 + i X^{-1/2} Y X^{-1/2}\bigr)^{-1} X^{-1/2} E_k, \]
we see that $F_j R_0 E_k$ are bounded operators for all $j, k$. Therefore, the r.h.s. of Eq. (3.20) is well defined for sufficiently small $\delta$. An immediate consequence of the resolvent formula (3.20) is that for sufficiently small $\delta$ the spectrum of $R(\beta_L, \beta_R, \beta_0)$ has the same form as the spectrum of $R_0$: 1 is an eigenvalue and there is a spectral gap and, in particular, 1 is a simple eigenvalue. This concludes the proof of Lemma 3.7.

Next we use this lemma to prove uniqueness of the invariant measure. We have the following

Theorem 3.8. Under the assumptions H1–H2 there are constants $\lambda^* > 0$ and $\varepsilon > 0$ such that if the couplings satisfy $|\lambda_{L,m}|, |\lambda_{R,m}| \in (0, \lambda^*)$ and the temperatures satisfy $|\beta_R - \beta_L|/(\beta_L + \beta_R) < \varepsilon$, the Markov process $T^t$ has a unique (and hence ergodic) invariant measure.

Proof. The proof uses a dynamical argument. By Proposition 3.4 we have in the Hilbert space $\mathcal{H}_0$ (with $\beta_0$ given as in (3.19)) the eigenvalue equations $T^t_{\mathcal{H}_0} 1 = 1$ and $(T^t_{\mathcal{H}_0})^* g = g$. Let the eigenvectors be normalized such that $(g, 1)_{\mathcal{H}_0} = 1$. By Lemma 3.7, 0 is a simple eigenvalue of the generator $L_{\mathcal{H}_0}$ if $(\beta_R - \beta_L)/(\beta_L + \beta_R)$ is small enough and by Proposition 3.6, the measure $\mu(dx) = Z_0^{-1} g \exp(-\beta_0 G)\, dx$ is an invariant measure for the Markov process. It is absolutely continuous with respect to the Lebesgue measure which we denote by $\lambda$. Assume now that $\nu$ is another invariant probability measure. By the hypoellipticity of $L$ it must have a smooth density. Therefore there is a Borel set $A \subset X$, which we may assume bounded, with the following properties: we have $\nu(A) > 0$ and $\lambda(A) > 0$ but $\mu(A) = 0$, because the measures have disjoint supports. Let $\chi_A$ denote the characteristic function of the set $A$.
By the pointwise ergodic theorem, see [Yo] and [Ho], we have, denoting by $\sigma_s(x)$ the ergodic average $\sigma_s(x) = (1/s) \int_0^s dt\, T^t \chi_A(x)$,
\[ \lim_{s \to \infty} \sigma_s(x) = \nu(A), \qquad \nu\text{-a.e.} \tag{3.21} \]
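Since $T^t$ is a contraction semi-group on $B(X, \mathcal{B})$, and $\chi_A \le 1$, we find $\|\sigma_s\|_\infty \le 1$ for all $s > 0$. From the easy bound $\|\sigma_s\|_{\mathcal{H}_0} \le \|\sigma_s\|_\infty$, we see that the set $\{\sigma_s,\ s > 0\}$ is a bounded subset of $\mathcal{H}_0$ and hence weakly sequentially precompact. Therefore, there is a sequence $s_n \uparrow \infty$ such that
\[ \operatorname{w-lim}_{n \to \infty} \sigma_{s_n} = \sigma^*, \]
where w-lim denotes the weak limit in $\mathcal{H}_0$. Since $T^u$ is a bounded operator for all $u > 0$, we have
\[ \operatorname{w-lim}_{n \to \infty} T^u \sigma_{s_n} = T^u \sigma^*. \]
We next show
\[ T^u \sigma^* = \sigma^*. \tag{3.22} \]
Indeed,
\begin{align*}
T^u \sigma_{s_n}(x) &= \frac{1}{s_n} \int_u^{s_n + u} dt\, T^t \chi_A(x) \\
&= \sigma_{s_n}(x) - \frac{1}{s_n} \int_0^u dt\, T^t \chi_A(x) + \frac{1}{s_n} \int_{s_n}^{s_n + u} dt\, T^t \chi_A(x). \tag{3.23}
\end{align*}
The last two terms in (3.23) are bounded by $u/s_n$ and we obtain $T^u \sigma^* = \sigma^*$ for all $u > 0$ by taking the limit $n \to \infty$ in (3.23). Therefore, $\sigma^*$ is in the eigenspace of the eigenvalue 1 of $T^t_{\mathcal{H}_0}$, $t > 0$, and so, by Lemma 3.7, $\sigma^* = c\,1$. To compute $c$ we note that $c = (g, \sigma^*)_{\mathcal{H}_0}$ and, using the invariance of the measure, we get
\begin{align*}
c &= \lim_{n \to \infty} \Bigl( g, \frac{1}{s_n} \int_0^{s_n} dt\, T^t_{\mathcal{H}_0} \chi_A \Bigr)_{\mathcal{H}_0} \\
&= \lim_{n \to \infty} \frac{1}{s_n} \int_0^{s_n} dt \int \mu(dx)\, T^t \chi_A = \mu(A).
\end{align*}
So we have $c = \mu(A)$ and $\mu(A) = 0$ by hypothesis. Using this information, we consider $(\chi_A, \sigma_{s_n})_{\mathcal{H}_0}$. We have, on one hand,
\[ \lim_{n \to \infty} (\chi_A, \sigma_{s_n})_{\mathcal{H}_0} = (\chi_A, \sigma^*)_{\mathcal{H}_0} = 0, \]
and on the other hand we have, by Eq. (3.21) and by the dominated convergence theorem,
\[ \lim_{n \to \infty} (\chi_A, \sigma_{s_n})_{\mathcal{H}_0} = \int dx\, \nu(A)\, \chi_A\, Z_0^{-1} e^{-\beta_0 G} > 0, \]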
and this is a contradiction. This shows that there is a unique invariant measure for the Markov process T t and as a consequence the measure is ergodic. We will now strengthen the last statement by showing that the invariant measure is in fact mixing. This will be done by extending the proof of return to equilibrium given in [Tr].
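Proposition 3.9. Assume that the conditions of Theorem 3.8 are satisfied. Then the invariant measure $\mu(dx)$ for the Markov process $T^t$ is mixing, i.e.,
\[ \lim_{t \to \infty} \int \mu(dx)\, f(x)\, T^t g(x) = \int \mu(dx)\, f(x) \int \mu(dx)\, g(x), \]
for all $f, g \in L^2(X, \mu(dx))$.

Proof. We denote $\mathcal{H} = L^2(X, \mu(dx))$, its scalar product by $(\cdot, \cdot)_{\mathcal{H}}$ and its norm by $\|\cdot\|_{\mathcal{H}}$. By [Yo], Chap. XIII.1, Thm. 1, $T^t$ defines a contraction semi-group on $\mathcal{H}$. Since $T^t$ is a strongly continuous semi-group on $C_\infty(X)$ (see [Ne]) and since $C_\infty(X)$ is dense in $\mathcal{H}$, we can extend $T^t$ to a strongly continuous semi-group $T^t_{\mathcal{H}}$ on $\mathcal{H}$. The property of mixing is equivalent to
\[ \operatorname{w-lim}_{t \to \infty} T^t_{\mathcal{H}} f = (1, f)_{\mathcal{H}} \quad \text{for all } f \in \mathcal{H}. \tag{3.24} \]
By a simple density argument it is enough to show (3.24) for a dense subset of $\mathcal{H}$. Let $C^2(X)$ denote the bounded continuous functions whose first and second partial derivatives are bounded and continuous. Then, by [GS], Part II, §9, if $f \in C^2(X)$, then $T^t_{\mathcal{H}} f \in C^2(X)$ and for any $\tau < \infty$, $T^t_{\mathcal{H}} f$ is uniformly differentiable w.r.t. $t \in [0, \tau]$ and
\[ \frac{\partial}{\partial t} T^t_{\mathcal{H}} f = L\, T^t_{\mathcal{H}} f, \]
where $L$ is the differential operator given in (3.11). Let $f \in C^2(X)$. Using the fact, see Proposition 3.5 and Proposition 3.6, that the density of the invariant measure is of the form $h(x) = \tilde{g}\, e^{-\beta_0 G/2}$ with $\tilde{g} \in \mathcal{S}(X)$, we may differentiate under the integral and integrate by parts, and using the invariance of the measure we obtain
\begin{align*}
\frac{d}{dt} \|T^t_{\mathcal{H}} f\|_{\mathcal{H}}^2 &= (L T^t_{\mathcal{H}} f, T^t_{\mathcal{H}} f)_{\mathcal{H}} + (T^t_{\mathcal{H}} f, L T^t_{\mathcal{H}} f)_{\mathcal{H}} \\
&= - \sum_{\substack{i \in \{L, R\} \\ m \in \{1, \dots, M\} \\ \nu \in \{1, \dots, d\}}} \frac{2 \lambda_{i,m}^2 \gamma_{i,m}}{\beta_i}\, \bigl\| \partial_{r_{i,m}^{(\nu)}} T^t_{\mathcal{H}} f \bigr\|_{\mathcal{H}}^2, \tag{3.25}
\end{align*}
where $\partial_{r_{i,m}^{(\nu)}}$ is the differential operator with domain $C^2(X)$. Thus $\|T^t_{\mathcal{H}} f\|_{\mathcal{H}}^2$ is decreasing, bounded below and continuous, and so $\lim_{t \to \infty} \|T^t_{\mathcal{H}} f\|_{\mathcal{H}}^2$ exists. As a consequence we find
\[ \bigl\| \partial_{r_{i,m}^{(\nu)}} T^t_{\mathcal{H}} f \bigr\|_{\mathcal{H}}^2 \in L^1([0, \infty), dt). \tag{3.26} \]
Following [Tr] and [Br], we call a sequence $\{t_n\}$ a (∗)-sequence if $t_n \uparrow \infty$ and
\[ \lim_{n \to \infty} \bigl\| \partial_{r_{i,m}^{(\nu)}} T^{t_n}_{\mathcal{H}} f \bigr\|_{\mathcal{H}}^2 = 0. \tag{3.27} \]
The existence of (∗)-sequences for our problem follows easily from (3.26). Further we define an almost (∗)-sequence as a sequence $s_n \uparrow \infty$ for which there exists a (∗)-sequence $\{t_n\}$ with $|s_n - t_n| \to 0$ as $n \to \infty$. As in [Tr] we next show that $\operatorname{w-lim}_{n \to \infty} T^{t_n}_{\mathcal{H}} f = \operatorname{w-lim}_{n \to \infty} T^{s_n}_{\mathcal{H}} f$. Indeed, let us choose $\tau < \min(t_1, s_1)$. From the inequality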
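\[ \Bigl\| \frac{d}{dt} T^t_{\mathcal{H}} f \Bigr\|_{\mathcal{H}} = \bigl\| T^{t - \tau}_{\mathcal{H}} L T^{\tau}_{\mathcal{H}} f \bigr\|_{\mathcal{H}} \le \bigl\| L T^{\tau}_{\mathcal{H}} f \bigr\|_{\mathcal{H}}, \]
which holds for $t > \tau$, we have
\[ \bigl\| (T^{s_n}_{\mathcal{H}} - T^{t_n}_{\mathcal{H}}) f \bigr\|_{\mathcal{H}} \le |s_n - t_n|\, \bigl\| L T^{\tau}_{\mathcal{H}} f \bigr\|_{\mathcal{H}} \to 0, \qquad n \to \infty, \]
which shows that $T^{t_n}_{\mathcal{H}} f$ and $T^{s_n}_{\mathcal{H}} f$ have the same weak limit. The set $\{T^t_{\mathcal{H}} f,\ t \ge 0\}$ is bounded and hence sequentially weakly precompact, so that by passing to a subsequence, we may assume that
\[ \operatorname{w-lim}_{n \to \infty} T^{t_n}_{\mathcal{H}} f = \gamma, \qquad \gamma \in \mathcal{H}. \]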
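Next we show that for all (∗)-sequences $\operatorname{w-lim}_{n \to \infty} T^{t_n}_{\mathcal{H}} f$ is a locally constant function $\mu$-a.e. We begin by showing that $\gamma$ does not depend on the variables $r$. Let $\partial^*_{r_{i,m}^{(\nu)}}$ denote the adjoint of $\partial_{r_{i,m}^{(\nu)}}$ in $\mathcal{H}$ and let $\psi \in C_0^\infty(X)$. By the smoothness properties of the density of the invariant measure we see that the function $\psi$ is in the domain of $\partial^*_{r_{i,m}^{(\nu)}}$ and we have, using (3.27),
\[ (\gamma, \partial^*_{r_{i,m}^{(\nu)}} \psi)_{\mathcal{H}} = \lim_{n \to \infty} (T^{t_n}_{\mathcal{H}} f, \partial^*_{r_{i,m}^{(\nu)}} \psi)_{\mathcal{H}} = \lim_{n \to \infty} (\partial_{r_{i,m}^{(\nu)}} T^{t_n}_{\mathcal{H}} f, \psi)_{\mathcal{H}} = 0. \]
Written explicitly,
\[ \int dp\, dq\, dr\, \gamma(p, q, r)\, \partial_{r_{i,m}^{(\nu)}} \bigl( \psi(p, q, r)\, h(p, q, r) \bigr) = 0, \tag{3.28} \]
for any $\psi \in C_0^\infty(X)$. Since $\gamma \in \mathcal{H}$, we may set $\gamma = 0$ on the set $A \equiv \{x \in X\ ;\ h(x) = 0\}$, and $\gamma$ is locally integrable and thus defines a distribution in $\mathcal{D}'(X)$. By Eq. (3.28) the support of the distribution $\partial_{r_{i,m}^{(\nu)}} \gamma(p, q, r)$ does not intersect the set $A$ and thus $\gamma(p, q, r)$ is $\mu$-a.e. independent of $r$.

Let $t > 0$. Then $\operatorname{w-lim}_{n \to \infty} T^{t_n + t}_{\mathcal{H}} f = T^t_{\mathcal{H}} \gamma$. Since $t + t_n \uparrow \infty$, it is easy to show, see [Br], that $t_n + t$ has an almost (∗)-subsequence $s_n$ and from the above arguments we conclude that $T^t_{\mathcal{H}} \gamma$ is independent of $r$.

Next we show inductively, using Condition H2, that $\gamma$ does not depend on the variables $p$, $q$. Let $(T^t_{\mathcal{H}})^*$ denote the semi-group dual to $T^t_{\mathcal{H}}$ on $\mathcal{H}$ and denote by $Z$ its generator. Note that for $\psi \in C_0^\infty(X)$ we have, upon integrating by parts,
\begin{align*}
\frac{d}{dt} (\psi, T^t_{\mathcal{H}} f)_{\mathcal{H}} &= (\psi, L T^t_{\mathcal{H}} f)_{\mathcal{H}} \\
&= \sum_{\substack{i \in \{L, R\} \\ m \in \{1, \dots, M\}}} \frac{\lambda_{i,m}^2 \gamma_{i,m}}{\beta_i} \int dp\, dq\, dr\, \Bigl( \nabla_{r_{i,m}} \cdot \bigl( \nabla_{r_{i,m}} + \beta_i W_{i,m} \bigr) (\psi h) \Bigr)\, T^t_{\mathcal{H}} f \\
&\quad - \int dp\, dq\, dr\, L_0 (\psi h)\, T^t_{\mathcal{H}} f,
\end{align*}
where $L_0$ is given by
\[ L_0 = \sum_{m=1}^{M} r_{L,m} \cdot \nabla_{p_1} + \sum_{m=1}^{M} r_{R,m} \cdot \nabla_{p_n} + \sum_{j=1}^{n} \bigl( p_j \cdot \nabla_{q_j} - (\nabla_{q_j} V) \cdot \nabla_{p_j} \bigr). \tag{3.29} \]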
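Since $C_0^\infty(X)$ is in the domain of $Z$, we get
\begin{align*}
\frac{d}{dt} (\psi, T^t_{\mathcal{H}} \gamma)_{\mathcal{H}} &= (Z\psi, T^t_{\mathcal{H}} \gamma)_{\mathcal{H}} = \lim_{n \to \infty} (Z\psi, T^{t_n + t}_{\mathcal{H}} f)_{\mathcal{H}} \\
&= (Z\psi, T^t_{\mathcal{H}} \gamma)_{\mathcal{H}} = - \int dp\, dq\, dr\, L_0(\psi h)\, T^t_{\mathcal{H}} \gamma.
\end{align*}
The last equality follows from (3.29) since $T^t_{\mathcal{H}} \gamma$ is independent of $r$. We next choose $\psi(p, q, r) \in C_0^\infty(X)$ of the form $\psi(p, q, r) = \varphi_1(r) \varphi_2(p, q) h^{-1}(p, q, r)$ with $\operatorname{supp}(\varphi_1(r) \varphi_2(p, q)) \cap A = \emptyset$ and $\int dr\, \varphi_1(r) = 0$. For this choice of $\psi$ we have
\[ (\psi, T^t_{\mathcal{H}} \gamma)_{\mathcal{H}} = \int dp\, dq\, \bigl( T^t_{\mathcal{H}} \gamma \bigr)(p, q)\, \varphi_2(p, q) \cdot \int dr\, \varphi_1(r) = 0, \]
and therefore
\begin{align*}
0 &= \int dp\, dq\, dr\, \gamma(p, q)\, L_0 \bigl( \varphi_1(r) \varphi_2(p, q) \bigr) \\
&= \int dp\, dq\, \gamma(p, q)\, \nabla_{p_1} \varphi_2(p, q) \cdot \int dr \sum_{m=1}^{M} r_{L,m}\, \varphi_1(r) \\
&\quad + \int dp\, dq\, \gamma(p, q)\, \nabla_{p_n} \varphi_2(p, q) \cdot \int dr \sum_{m=1}^{M} r_{R,m}\, \varphi_1(r).
\end{align*}
Since $\varphi_1(r)$ is arbitrary, it follows that
\[ \int dp\, dq\, \gamma(p, q)\, \nabla_{p_1} \varphi_2(p, q) = \int dp\, dq\, \gamma(p, q)\, \nabla_{p_n} \varphi_2(p, q) = 0, \]
and thus, by a similar argument as above, $\gamma(p, q)$ must be $\mu$-a.e. independent of $p_1$ and $p_n$: Thus $\gamma$ is a function $\gamma(p_2, \dots, p_{n-1}, q)$. Using this information, we choose now
\[ \psi(p, q, r) = \varphi_1(p_1, p_n)\, \varphi_2(p_2, \dots, p_{n-1}, q)\, \varphi_3(r)\, h^{-1}(p, q, r), \]
with $\operatorname{supp}(\varphi_1 \varphi_2 \varphi_3) \cap A = \emptyset$ and $\int dp_1\, dp_n\, \varphi_1(p_1, p_n) = 0$. For such a choice of $\psi$ we obtain
\begin{align*}
0 &= \int dp\, dq\, dr\, L_0(\varphi_1 \varphi_2 \varphi_3)\, \gamma \\
&= \int dp_2 \cdots dp_{n-1}\, dq\, \gamma\, (\nabla_{q_1} \varphi_2) \cdot \int dp_1\, dp_n\, p_1 \varphi_1 \int dr\, \varphi_3 \\
&\quad + \int dp_2 \cdots dp_{n-1}\, dq\, \gamma\, (\nabla_{q_n} \varphi_2) \cdot \int dp_1\, dp_n\, p_n \varphi_1 \int dr\, \varphi_3,
\end{align*}
and from the arbitrariness of $\varphi_1$, $\varphi_2$, $\varphi_3$ we conclude that $\gamma$ is independent of $q_1$, $q_n$ (all our statements hold $\mu$-a.e.). Finally, choose
\[ \psi(p, q, r) = \varphi_1(q_1, q_n)\, \varphi_2(p_2, \dots, p_{n-1}, q_2, \dots, q_{n-1})\, \varphi_3(p_1, p_n, r)\, h^{-1}(p, q, r), \]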
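with $\operatorname{supp}(\varphi_1 \varphi_2 \varphi_3) \cap A = \emptyset$ and $\int dq_1\, dq_n\, \varphi_1(q_1, q_n) = 0$. Then we obtain
\begin{align*}
0 &= \int dp_2 \dots dp_{n-1}\, dq_2 \dots dq_{n-1}\, \gamma\, (\nabla_{p_2} \varphi_2) \cdot \int dq_1\, dq_n\, (\nabla_{q_2} V)\, \varphi_1 \int dp_1\, dp_n\, dr\, \varphi_3 \\
&\quad + \int dp_2 \dots dp_{n-1}\, dq_2 \dots dq_{n-1}\, \gamma\, (\nabla_{p_{n-1}} \varphi_2) \cdot \int dq_1\, dq_n\, (\nabla_{q_{n-1}} V)\, \varphi_1 \int dp_1\, dp_n\, dr\, \varphi_3.
\end{align*}
From the arbitrariness of the $\varphi_i$ we conclude in particular that
\[ 0 = \int dp_2 \dots dp_{n-1}\, dq_2 \dots dq_{n-1}\, \gamma\, (\nabla_{p_2} \varphi_2) \cdot \int dq_1\, dq_n\, (\nabla_{q_2} V)\, \varphi_1. \tag{3.30} \]
We may choose $\varphi_1(q_1, q_n) = \partial_{q_1^{(\nu')}} \tilde{\varphi}(q_1, q_n)$ for some $\nu' \in \{1, \dots, d\}$ and a positive $\tilde{\varphi}(q_1, q_n)$. By Condition H2 we see that
\[ X^{\nu, \nu'}(q_2) \equiv \int dq_1\, dq_n\, (\partial_{q_2^{(\nu)}} V)\, \varphi_1(q_1, q_n) = - \int dq_1\, dq_n\, (\partial_{q_1^{(\nu')}} \partial_{q_2^{(\nu)}} V)\, \tilde{\varphi}(q_1, q_n) \]
is uniformly positive or negative. We can rewrite (3.30) as
\[ 0 = \sum_{\nu \in \{1, \dots, d\}} \int dp_2 \dots dp_{n-1}\, dq_2 \dots dq_{n-1}\, \gamma\, \partial_{p_2^{(\nu)}} X^{\nu, \nu'}(q_2)\, \varphi_2, \]
and we conclude that $\gamma$ is independent of $p_2$. A similar argument shows that $\gamma$ is independent of $q_{n-1}$, and iterating the above procedure we conclude that $\gamma$ is locally constant $\mu$-a.e.

So far, we have shown that for all (∗)-sequences $\{t_n\}$ one has $\operatorname{w-lim}_{n \to \infty} T^{t_n}_{\mathcal{H}} f = \gamma = \text{const}$. From the invariance of the measure $\mu$ and its ergodicity we conclude that
\[ \gamma = (1, f)_{\mathcal{H}} = \int \mu(dx)\, f(x). \]
We conclude as in [Tr]: suppose that $\operatorname{w-lim}_{t \to \infty} T^t_{\mathcal{H}} f \ne (1, f)_{\mathcal{H}}$. Then by the weak sequential precompactness of $\{T^t_{\mathcal{H}} f\ ;\ t \ge 0\}$, there exists a sequence $u_n \uparrow \infty$ for which $\operatorname{w-lim}_{n \to \infty} T^{u_n}_{\mathcal{H}} f = \eta \ne (1, f)_{\mathcal{H}}$. But, referring again to [Br], the sequence $\{u_n\}$ has an almost (∗)-subsequence $\{s_n\}$. This implies that there is a (∗)-sequence $\{t_n\}$ such that $\operatorname{w-lim}_{n \to \infty} T^{t_n}_{\mathcal{H}} f = \eta$. This is a contradiction, since we have seen that $\operatorname{w-lim}_{n \to \infty} T^{t_n}_{\mathcal{H}} f = (1, f)_{\mathcal{H}}$ for all (∗)-sequences. By a simple density argument this implies that
\[ \lim_{t \to \infty} \int \mu(dx)\, f(x)\, T^t_{\mathcal{H}} g(x) = \int \mu(dx)\, f(x) \int \mu(dx)\, g(x), \]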
for all f, g ∈ H and the proof of Proposition 3.9 is complete.
With Proposition 3.9 the proof of Theorem 2.2 is now complete.
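4. Commutator Estimates and Spectral Properties of $L_{\mathcal{H}_0}$

In this section, we prove Proposition 3.4 and Proposition 3.5. We generalize the commutator method of Hörmander to study the spectral properties of the operator $L_{\mathcal{H}_0}$ which is, by Lemma 3.1, the closure of the differential operator $L$ with domain $C_0^\infty(X)$ which we defined in Eq. (3.11). We recall the definition:
\begin{align*}
L ={}& \sum_{m=1}^{M} \frac{\lambda_{L,m}^2 \gamma_{L,m}}{\beta_L} \bigl( \nabla_{r_{L,m}} - \beta_L W_{L,m} \bigr) \cdot \nabla_{r_{L,m}} \\
&+ \sum_{m=1}^{M} \frac{\lambda_{R,m}^2 \gamma_{R,m}}{\beta_R} \bigl( \nabla_{r_{R,m}} - \beta_R W_{R,m} \bigr) \cdot \nabla_{r_{R,m}} \\
&+ \sum_{m=1}^{M} r_{L,m} \cdot \nabla_{p_1} + L_S + \sum_{m=1}^{M} r_{R,m} \cdot \nabla_{p_n}, \tag{4.1}
\end{align*}
with the abbreviations
\[ W_{L,m} = \lambda_{L,m}^{-2}\, r_{L,m} - q_1, \qquad W_{R,m} = \lambda_{R,m}^{-2}\, r_{R,m} - q_n, \tag{4.2} \]
and where $L_S$ is the Liouville operator associated with the Hamiltonian $H_S(q, p)$:
\[ L_S = \sum_{j=1}^{n} \bigl( p_j \cdot \nabla_{q_j} - (\nabla_{q_j} V) \cdot \nabla_{p_j} \bigr). \tag{4.3} \]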
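For the following estimates it will be convenient to work in the flat Hilbert space $L^2(X, dx)$. The differential operator $L$ is unitarily equivalent to the operator $K$ on $L^2(X, dx)$ with domain $C_0^\infty(X)$ given by
\begin{align*}
K &= e^{-\beta_0 G/2}\, L\, e^{\beta_0 G/2} \\
&= \alpha - \sum_{m=1}^{M} \frac{\lambda_{L,m}^2 \gamma_{L,m}}{\beta_L} R_{L,m}^* R_{L,m} - \sum_{m=1}^{M} \frac{\lambda_{R,m}^2 \gamma_{R,m}}{\beta_R} R_{R,m}^* R_{R,m} + K_{\mathrm{as}}, \tag{4.4}
\end{align*}
where $\alpha$ is given by (3.10) and
\begin{align*}
R_{L,m} &= \nabla_{r_{L,m}} + \sqrt{(\beta_L - \beta_0/2)\beta_0/2}\; W_{L,m}, \\
R_{R,m} &= \nabla_{r_{R,m}} + \sqrt{(\beta_R - \beta_0/2)\beta_0/2}\; W_{R,m}, \\
K_{\mathrm{as}} &= \sum_{m=1}^{M} r_{L,m} \cdot \nabla_{p_1} + L_S + \sum_{m=1}^{M} r_{R,m} \cdot \nabla_{p_n} \\
&\quad - \frac{\beta_L - \beta_R}{\beta_L + \beta_R} \sum_{m=1}^{M} \frac{\lambda_{L,m}^2 \gamma_{L,m}}{2} \bigl( \nabla_{r_{L,m}} \cdot W_{L,m} + W_{L,m} \cdot \nabla_{r_{L,m}} \bigr) \\
&\quad + \frac{\beta_L - \beta_R}{\beta_L + \beta_R} \sum_{m=1}^{M} \frac{\lambda_{R,m}^2 \gamma_{R,m}}{2} \bigl( \nabla_{r_{R,m}} \cdot W_{R,m} + W_{R,m} \cdot \nabla_{r_{R,m}} \bigr).
\end{align*}
All subsequent estimates will be valid for any $f \in \mathcal{S}(X)$ and thus for all functions in the domain of $K$.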
It is convenient to introduce the following notation and to recall some earlier definitions. Let $n' = [n/2]$ denote the integer part of $n/2$. We define
\begin{align*}
P_j &= \nabla_{p_j} + a_L\, p_j, & j &= 1, \dots, n', \\
P_j &= \nabla_{p_j} + a_R\, p_j, & j &= n' + 1, \dots, n, \\
Q_j &= \nabla_{q_j} + a_L\, W_j(q, r), & j &= 1, \dots, n', \\
Q_j &= \nabla_{q_j} + a_R\, W_j(q, r), & j &= n' + 1, \dots, n, \\
R_{L,m} &= \nabla_{r_{L,m}} + a_L\, W_{L,m}(q, r), & m &= 1, \dots, M, \\
R_{R,m} &= \nabla_{r_{R,m}} + a_R\, W_{R,m}(q, r), & m &= 1, \dots, M,
\end{align*}
where
\begin{align*}
a_L &= \bigl( (\beta_L - \beta_0/2)\beta_0/2 \bigr)^{1/2}, \\
a_R &= \bigl( (\beta_R - \beta_0/2)\beta_0/2 \bigr)^{1/2}, \\
W_1(q, r) &= \nabla_{q_1} V(q) - \sum_{m=1}^{M} r_{L,m}, \\
W_j(q, r) &= \nabla_{q_j} V(q), \qquad j = 2, \dots, n - 1, \\
W_n(q, r) &= \nabla_{q_n} V(q) - \sum_{m=1}^{M} r_{R,m}, \\
W_{L,m}(q, r) &= \lambda_{L,m}^{-2}\, r_{L,m} - q_1, \qquad m = 1, \dots, M, \\
W_{R,m}(q, r) &= \lambda_{R,m}^{-2}\, r_{R,m} - q_n, \qquad m = 1, \dots, M.
\end{align*}
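We next define the operators $K_0$, $\mathcal{K}$, and $\Lambda$ which will be used in the statement of our main bound:
\begin{align*}
\Lambda &= \Bigl( 1 + \sum_{j=1}^{n} P_j^* P_j + \sum_{j=1}^{n} Q_j^* Q_j + \sum_{m=1}^{M} \bigl( R_{L,m}^* R_{L,m} + R_{R,m}^* R_{R,m} \bigr) \Bigr)^{1/2}, \\
K_0 &= K_{\mathrm{as}}, \tag{4.5} \\
\mathcal{K} &= \alpha - K = -K_0 + \sum_{m=1}^{M} \bigl( b_{L,m} R_{L,m}^* R_{L,m} + b_{R,m} R_{R,m}^* R_{R,m} \bigr).
\end{align*}
Here, we use $b_{L,m} = \lambda_{L,m}^2 \gamma_{L,m}/\beta_L$, $b_{R,m} = \lambda_{R,m}^2 \gamma_{R,m}/\beta_R$. Our main estimate is

Theorem 4.1. Under the Assumptions H1, H2 on $V$, there are an $\varepsilon > 0$ and a $C < \infty$ such that for all $f \in \mathcal{S}(X)$ one has
\[ \|\Lambda^{\varepsilon} f\| \le C \bigl( \|\mathcal{K} f\| + \|f\| \bigr). \tag{4.6} \]
Proof. The proof will be an easy consequence of the following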
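Proposition 4.2. There are finite constants $C_j$, $C_j'$, and $C$ such that for all $f \in \mathcal{S}(X)$ one has with $n' = [n/2]$,
\begin{align*}
\|\Lambda^{\varepsilon_j - 1} P_j f\| &\le C_j \bigl( \|\mathcal{K} f\| + \|f\| \bigr), & j &= 1, \dots, n', \tag{4.7} \\
\|\Lambda^{\varepsilon_j - 1} P_{n+1-j} f\| &\le C_j \bigl( \|\mathcal{K} f\| + \|f\| \bigr), & j &= 1, \dots, n - n', \tag{4.8} \\
\|\Lambda^{\varepsilon_j' - 1} Q_j f\| &\le C_j' \bigl( \|\mathcal{K} f\| + \|f\| \bigr), & j &= 1, \dots, n', \tag{4.9} \\
\|\Lambda^{\varepsilon_j' - 1} Q_{n+1-j} f\| &\le C_j' \bigl( \|\mathcal{K} f\| + \|f\| \bigr), & j &= 1, \dots, n - n', \tag{4.10} \\
\|R_{L,m} f\| + \|R_{R,m} f\| &\le C \bigl( \|\mathcal{K} f\| + \|f\| \bigr), & m &= 1, \dots, M, \tag{4.11}
\end{align*}
where $\varepsilon_j = 4^{1-2j}$ and $\varepsilon_j' = 4^{-2j}$.

Proof of Proposition 4.2. For the $R_{i,m}$, we have the easy estimate
\begin{align*}
\|R_{i,m} f\|^2 = (f, R_{i,m}^* R_{i,m} f) &\le b_{i,m}^{-1} \operatorname{Re}(f, \mathcal{K} f) \\
&\le b_{i,m}^{-1} \|\mathcal{K} f\|\, \|f\| \le b_{i,m}^{-1} \bigl( \|\mathcal{K} f\| + \|f\| \bigr)^2. \tag{4.12}
\end{align*}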
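The first inequality in (4.12) can be unpacked as follows. This is a sketch which assumes, as the notation $K_{\mathrm{as}}$ suggests, that $K_0$ is antisymmetric on $\mathcal{S}(X)$, so that it does not contribute to the real part of the quadratic form; $\mathcal{K}$ denotes the operator $\alpha - K$ from (4.5).

```latex
% Sketch: assumes K_0 = K_as is antisymmetric, so Re (f, K_0 f) = 0.
\begin{align*}
\operatorname{Re}(f, \mathcal{K} f)
  &= -\operatorname{Re}(f, K_0 f)
     + \sum_{m=1}^{M} \bigl( b_{L,m} \| R_{L,m} f \|^2 + b_{R,m} \| R_{R,m} f \|^2 \bigr) \\
  &= \sum_{m=1}^{M} \bigl( b_{L,m} \| R_{L,m} f \|^2 + b_{R,m} \| R_{R,m} f \|^2 \bigr)
  \;\ge\; b_{i,m} \| R_{i,m} f \|^2 , \\
\operatorname{Re}(f, \mathcal{K} f)
  &\le \| \mathcal{K} f \| \, \| f \|
  \;\le\; \bigl( \| \mathcal{K} f \| + \| f \| \bigr)^2 .
\end{align*}
```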
This proves Eq. (4.11) for these cases. For the other cases, the proof will proceed by induction: It will proceed by bounds on $P_1, Q_1, P_2, \dots, Q_{n'}$, and a totally symmetric argument, which is left to the reader, can be used from the other end of the chain, proceeding over $P_n, Q_n, P_{n-1}$, until the bounds reach the “center” of the chain.

We next prepare the inductive proof. To make the result of this calculation clearer, we define the matrices
\[ \mathcal{M}_{j,k} = \nabla_{q_j} \nabla_{q_k} V, \qquad j, k = 1, \dots, n. \]
In components, this means, for $\mu, \nu \in \{1, \dots, d\}$,
\[ \mathcal{M}_{j,k}^{(\mu, \nu)} = \nabla_{q_j^{(\mu)}} \nabla_{q_k^{(\nu)}} V, \qquad j, k = 1, \dots, n. \]
By our choice of potential $V$ all the $\mathcal{M}_{j,k}$ vanish, except $\mathcal{M}_{j,j}$, with $j = 1, \dots, n$, and $\mathcal{M}_{j+1,j} = \mathcal{M}_{j,j+1}$, with $j = 1, \dots, n - 1$. Furthermore, by Condition H1, all the $\mathcal{M}_{j,k}$ are uniformly bounded functions of $q$. Finally, by Assumption H2, the matrices $\mathcal{M}_{j,j+1}$ are definite, with uniformly bounded inverse. One verifies easily the relations:
\begin{align*}
[R_{L,m}, K_0] &= P_1 + c_m R_{L,m}, \\
[P_1, K_0] &= Q_1, \\
[Q_1, K_0] &= -\mathcal{M}_{1,1} P_1 - \mathcal{M}_{2,1} P_2 + \sum_{m=1}^{M} c_m' \bigl( R_{L,m} + R_{L,m}^* \bigr), \tag{4.13} \\
[P_j, K_0] &= Q_j, \qquad j = 2, \dots, n - 1, \\
[Q_j, K_0] &= -\mathcal{M}_{j-1,j} P_{j-1} - \mathcal{M}_{j,j} P_j - \mathcal{M}_{j+1,j} P_{j+1}, \qquad j = 2, \dots, n - 1,
\end{align*}
where $c_m = \gamma_{L,m} (\beta_L - \beta_0)/\beta_L$, $c_m' = b_{L,m}\, a_L (\beta_L - \beta_0)$.
Symmetrical relations hold at the other end of the chain. With these notations, we can rewrite (among several possibilities):
\begin{align*}
P_1 &= [R_{L,1}, K_0] - c_1 R_{L,1}, \\
Q_1 &= [P_1, K_0], \\
P_2 &= -\mathcal{M}_{2,1}^{-1} \Bigl( [Q_1, K_0] + \mathcal{M}_{1,1} P_1 - \sum_{m=1}^{M} c_m' \bigl( R_{L,m} + R_{L,m}^* \bigr) \Bigr), \tag{4.14} \\
Q_j &= [P_j, K_0], \qquad j = 2, \dots, n, \\
P_{j+1} &= -\mathcal{M}_{j+1,j}^{-1} \Bigl( [Q_j, K_0] + \mathcal{M}_{j-1,j} P_{j-1} + \mathcal{M}_{j,j} P_j \Bigr), \qquad j = 2, \dots, n',
\end{align*}
with symmetrical relations at the other end of the chain. We can streamline this representation by defining $Q_0 = R_{L,1}$ and $\mathcal{M}_{1,0} = -1$. Then we can write, for $j = 1, \dots, n'$:
\begin{align*}
P_j &= -\mathcal{M}_{j,j-1}^{-1} [Q_{j-1}, K_0] + S_j, \tag{4.15} \\
Q_j &= [P_j, K_0], \tag{4.16}
\end{align*}
where the operators $S_j$ depend linearly on $\{P_1, \dots, P_{j-1}\}$, $\{Q_1, \dots, Q_{j-1}\}$, and the $R_{L,m}$. The relations Eqs. (4.15) and (4.16) will be used in the inductive proof. Such relations are of course reminiscent of those appearing in the study of hypoelliptic operators. The novelty here will be that we obtain bounds which are valid not only in a compact domain, but in the unbounded domain of the $p$'s and $q$'s. The following bounds will be used repeatedly:

Proposition 4.3. Let $Z$ denote one of the operators $Q_j$, $Q_j^*$, $P_j$, or $P_j^*$. Let $\mathcal{M}$ denote one of the $\mathcal{M}_{j,k}$. Assume that $\alpha \in (0, 2)$. Then the following operators are bounded in $L^2(X, dx)$:
1) $\Lambda^{\beta} [\mathcal{M}, \Lambda^{-\alpha}] \Lambda^{\gamma}$, if $\beta + \gamma \le \alpha + 1$,
2) $\Lambda^{\beta} Z \Lambda^{\gamma}$, if $\beta + \gamma \le -1$,
3) $\Lambda^{\beta} [K_0, Z] \Lambda^{\gamma}$, if $\beta + \gamma \le -1$,
4) $\Lambda^{\beta} [Z, \Lambda^{-\alpha}] \Lambda^{\gamma}$, if $\beta + \gamma \le \alpha + 1$,
5) $\Lambda^{\beta} [\Lambda^{-\alpha}, K_0] \Lambda^{\gamma}$, if $\beta + \gamma \le \alpha$.

Proof. The proof will be given in Appendix B.

Because we are working in an infinite domain, and work with non-linear couplings, we will not bound the l.h.s. of Eq. (4.7) directly, but instead the more convenient quantity$^1$:
\[ R_j(f) = \bigl( \Lambda^{\varepsilon_j - 1} \mathcal{M}_{j,j-1} P_j f,\ \Lambda^{\varepsilon_j - 1} P_j f \bigr). \]
We have the

Lemma 4.4. There is a constant $C$ such that for all $j \in \{1, \dots, n\}$ and all $f \in \mathcal{S}(X)$ one has the inequality
\[ \|\Lambda^{\varepsilon_j - 1} P_j f\|^2 \le C \bigl( |R_j(f)| + \|f\|^2 \bigr). \]

$^1$ For readers familiar with the method of Hörmander, we wish to point out that this device seemed necessary because we do not have good bounds on $[K_0, [Q_1, K_0]]$.
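Therefore, to prove Eq. (4.7), it suffices to prove the corresponding inequality for the $R_j$.

Proof of Lemma 4.4. Let $\mathcal{M} = \mathcal{M}_{j,j-1}$, $\varepsilon = \varepsilon_j$, and $P = P_j$. Then, by our Assumption H2, there is a constant $m > 0$ for which $\mathcal{M} > m$. Therefore,
\begin{align*}
\|\Lambda^{\varepsilon - 1} P f\|^2 &= (\Lambda^{\varepsilon - 1} P f, \Lambda^{\varepsilon - 1} P f) \le m^{-1} \bigl| (\mathcal{M} \Lambda^{\varepsilon - 1} P f, \Lambda^{\varepsilon - 1} P f) \bigr| \\
&\le m^{-1} \bigl| (\Lambda^{\varepsilon - 1} \mathcal{M} P f, \Lambda^{\varepsilon - 1} P f) \bigr| + m^{-1} \bigl| ([\Lambda^{\varepsilon - 1}, \mathcal{M}] P f, \Lambda^{\varepsilon - 1} P f) \bigr| \\
&\le m^{-1} |R_j(f)| + m^{-1} \bigl| \bigl( (\Lambda^{\varepsilon} [\Lambda^{\varepsilon - 1}, \mathcal{M}] \Lambda) (\Lambda^{-1} P) f,\ (\Lambda^{-1} P) f \bigr) \bigr|.
\end{align*}
The proof of Lemma 4.4 is completed by using the bounds 1) and 2) of Proposition 4.3.

The inductive step. We begin with the induction step for the $P_j$. We assume now that the bounds (4.7) and (4.9) have been shown for all $j \le k$. We want to show (4.7) for $j = k + 1$. Using Eq. (4.15) and Lemma 4.4, we start by writing
\begin{align*}
R_{k+1}(f) &\equiv \bigl( \Lambda^{\varepsilon_{k+1} - 1} \mathcal{M}_{k+1,k} P_{k+1} f,\ \Lambda^{\varepsilon_{k+1} - 1} P_{k+1} f \bigr) \\
&= \bigl( \Lambda^{2\varepsilon_{k+1} - 1} [K_0, Q_k] f,\ \Lambda^{-1} P_{k+1} f \bigr) - \bigl( \Lambda^{2\varepsilon_{k+1} - 1} S_{k+1} f,\ \Lambda^{-1} P_{k+1} f \bigr) \\
&\equiv X_1 - X_2.
\end{align*}
We first bound $X_2$. Note that $S_{k+1}$ is a sum of terms of the form $\mathcal{M} T$ where $T$ is equal to $P_j$ or $Q_j$ with $j \le k$, and $\mathcal{M}$ is either a constant or equal to one of the $\mathcal{M}_{k,\ell}$. Therefore, we obtain, using Proposition 4.3, the inductive hypothesis, and the choice $2\varepsilon_{k+1} \le \min_{j \le k}(\varepsilon_j, \varepsilon_j') = \varepsilon_k'$:
\begin{align*}
\bigl| (\Lambda^{2\varepsilon_{k+1} - 1} \mathcal{M} T f, \Lambda^{-1} P_{k+1} f) \bigr| &\le \bigl| (\mathcal{M} \Lambda^{2\varepsilon_{k+1} - 1} T f, \Lambda^{-1} P_{k+1} f) \bigr| \\
&\quad + \bigl| \bigl( ([\Lambda^{2\varepsilon_{k+1} - 1}, \mathcal{M}] \Lambda) (\Lambda^{-1} T) f,\ \Lambda^{-1} P_{k+1} f \bigr) \bigr| \\
&\le O(1) \bigl( \|\mathcal{K} f\| + \|f\| \bigr) \|f\| + O(1) \|f\|^2 \le O(1) \bigl( \|\mathcal{K} f\| + \|f\| \bigr)^2.
\end{align*}
This proves the desired bound. We now come to the “interesting” term $X_1$. Since $\mathcal{K} = -K_0 + \frac{1}{2}(\mathcal{K} + \mathcal{K}^*)$ and $\mathcal{K}^* = K_0 + \frac{1}{2}(\mathcal{K} + \mathcal{K}^*)$, the commutator is rewritten as
\[ [Q_k, K_0] = -[K_0, Q_k] = -Q_k \mathcal{K} - \mathcal{K}^* Q_k + \tfrac{1}{2} \bigl( Q_k (\mathcal{K} + \mathcal{K}^*) + (\mathcal{K} + \mathcal{K}^*) Q_k \bigr) \equiv X_3 + X_4 + X_5. \]
The overall sign is irrelevant for the bounds below. We discuss the 3 corresponding bounds:

Term $X_3$. In this case, we are led to bound, with $\varepsilon = \varepsilon_{k+1}$,
\begin{align*}
T_3 &\equiv \bigl| (Q_k \mathcal{K} f, \Lambda^{2\varepsilon - 2} P_{k+1} f) \bigr| = \bigl| (\mathcal{K} f, Q_k^* \Lambda^{2\varepsilon - 2} P_{k+1} f) \bigr| \\
&= \bigl| (\mathcal{K} f, (Q_k^* \Lambda^{2\varepsilon - 1}) (\Lambda^{-1} P_{k+1}) f) \bigr| \\
&\le \bigl| (\mathcal{K} f, (\Lambda^{-1} P_{k+1}) (Q_k^* \Lambda^{2\varepsilon - 1}) f) \bigr| + \bigl| (\mathcal{K} f, [Q_k^* \Lambda^{2\varepsilon - 1}, \Lambda^{-1} P_{k+1}] f) \bigr| \\
&\equiv X_{3,1} + X_{3,2}. \tag{4.17}
\end{align*}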
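The splitting of the commutator into the three terms $X_3$, $X_4$, $X_5$ can be checked by a direct computation. The sketch below assumes the decomposition (4.5), with $K_0$ antisymmetric and $S$ standing for the symmetric part of $\mathcal{K} = \alpha - K$; the symbol $S$ is local shorthand introduced here and is unrelated to the operators $S_j$ of (4.15).

```latex
% Sketch: S := (\mathcal{K} + \mathcal{K}^*)/2 is local shorthand; K_0^* = -K_0 is assumed.
\begin{align*}
\mathcal{K} &= -K_0 + S, \qquad \mathcal{K}^* = K_0 + S, \\
- Q_k \mathcal{K} - \mathcal{K}^* Q_k + \bigl( Q_k S + S Q_k \bigr)
  &= Q_k K_0 - Q_k S - K_0 Q_k - S Q_k + Q_k S + S Q_k \\
  &= Q_k K_0 - K_0 Q_k .
\end{align*}
```

The right-hand side is the commutator of $Q_k$ and $K_0$, so the three terms reproduce the commutator in $X_1$ up to its ordering convention; since every term is estimated in absolute value, the overall sign does not affect the bounds.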
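We start by bounding $X_{3,1}$. Since $\Lambda^{-1} P_{k+1}$ is bounded by Proposition 4.3, it suffices to show that
\[ \|Q_k^* \Lambda^{2\varepsilon - 1} f\| \le C \bigl( \|\mathcal{K} f\| + \|f\| \bigr). \tag{4.18} \]
To see this we first write, using $Q = Q_k$,
\[ \|Q^* \Lambda^{2\varepsilon - 1} f\|^2 = (f, \Lambda^{2\varepsilon - 1} Q Q^* \Lambda^{2\varepsilon - 1} f) = \|\Lambda^{2\varepsilon - 1} Q f\|^2 + (f, [\Lambda^{2\varepsilon - 1} Q, Q^* \Lambda^{2\varepsilon - 1}] f). \]
The first term is bounded by $O(1) \bigl( \|\mathcal{K} f\| + \|f\| \bigr)^2$, by the inductive hypothesis and the choice of $\varepsilon_{k+1}$, while the second can be bounded by $O(1) \|f\|^2$ by expanding the commutator (and using Proposition 4.3):
\begin{align*}
[\Lambda^{2\varepsilon - 1} Q, Q^* \Lambda^{2\varepsilon - 1}] ={}& (\Lambda^{2\varepsilon - 1} Q^* \Lambda^{-2\varepsilon})\, \Lambda^{2\varepsilon} [Q, \Lambda^{2\varepsilon - 1}] \\
&+ \Lambda^{2\varepsilon - 1} [Q, Q^*] \Lambda^{2\varepsilon - 1} + [\Lambda^{2\varepsilon - 1}, Q^*]\, \Lambda^{2\varepsilon} \Lambda^{-1} Q.
\end{align*}
This proves Eq. (4.18).

To bound $X_{3,2}$, we use $[P_{k+1}^*, Q_k] = 0$ and we write
\[ [Q_k^* \Lambda^{2\varepsilon - 1}, \Lambda^{-1} P_{k+1}] = Q_k^* \Lambda^{-1} [\Lambda^{2\varepsilon - 1}, P_{k+1}] + [Q_k^*, \Lambda^{-1}]\, \Lambda^{2\varepsilon} \Lambda^{-2\varepsilon} P_{k+1} \Lambda^{2\varepsilon - 1}. \]
Since each factor above is bounded by Proposition 4.3, the desired bound follows:
\[ T_3 \le O(1) \bigl( \|\mathcal{K} f\| + \|f\| \bigr)^2. \]

Term $X_4$. Here, we want to bound $T_4 \equiv |(\mathcal{K}^* Q_k f, \Lambda^{2\varepsilon - 2} P_{k+1} f)|$. We get
\begin{align*}
T_4 &= \bigl| (\mathcal{K}^* Q_k f, \Lambda^{2\varepsilon - 2} P_{k+1} f) \bigr| = \bigl| (Q_k f, \mathcal{K} \Lambda^{2\varepsilon - 2} P_{k+1} f) \bigr| \\
&\le \bigl| (\Lambda^{2\varepsilon - 1} Q_k f, \Lambda^{-1} P_{k+1} \mathcal{K} f) \bigr| + \bigl| (Q_k f, [\mathcal{K}, \Lambda^{2\varepsilon - 2} P_{k+1}] f) \bigr| \\
&\equiv X_{4,1} + X_{4,2}. \tag{4.19}
\end{align*}
Using the inductive hypothesis, and the bound $\|\Lambda^{-1} P_{k+1}\| \le O(1)$, the term $X_{4,1}$ is bounded by
\[ \|\Lambda^{2\varepsilon - 1} Q_k f\|\, \|\Lambda^{-1} P_{k+1} \mathcal{K} f\| \le O(1) \bigl( \|\mathcal{K} f\| + \|f\| \bigr)^2. \]
We write the commutator of $X_{4,2}$ as
\[ [\mathcal{K}, \Lambda^{2\varepsilon - 2} P_{k+1}] = \Lambda^{2\varepsilon - 1} \Bigl( \Lambda^{-1} [\mathcal{K}, P_{k+1}] + \Lambda^{-1} [K_0, \Lambda^{2 - 2\varepsilon}]\, \Lambda^{2\varepsilon - 1} (\Lambda^{-1} P_{k+1}) \Bigr), \]
since $\mathcal{K} + K_0$ commutes with $\Lambda$. Using Proposition 4.3 and the inductive hypothesis this leads to the following bound for $X_{4,2}$:
\begin{align*}
X_{4,2} &\le \bigl| (\Lambda^{2\varepsilon - 1} Q_k f, \Lambda^{-1} [\mathcal{K}, P_{k+1}] f) \bigr| \\
&\quad + \bigl| \bigl( \Lambda^{2\varepsilon - 1} Q_k f,\ (\Lambda^{-1} [K_0, \Lambda^{2 - 2\varepsilon}] \Lambda^{2\varepsilon - 1}) (\Lambda^{-1} P_{k+1}) f \bigr) \bigr| \\
&\le O(1) \bigl( \|\mathcal{K} f\| + \|f\| \bigr) \|f\|.
\end{align*}
This completes the bounds involving $X_4$.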
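Term $X_5$. Here, we bound
\[ T_5 \equiv \Bigl| \Bigl( \tfrac{1}{2} \bigl( Q_k (\mathcal{K} + \mathcal{K}^*) + (\mathcal{K} + \mathcal{K}^*) Q_k \bigr) f,\ \Lambda^{2\varepsilon - 2} P_{k+1} f \Bigr) \Bigr|. \]
Assume first $k > 1$ (and in any case we have $k < n$). Looking at the definition of $\mathcal{K}$, we see that in this case $Q_k$ commutes with $\frac{1}{2}(\mathcal{K} + \mathcal{K}^*) = \operatorname{Re} \mathcal{K}$, and we can rewrite $T_5$ as
\[ T_5' = 2 \bigl( (\operatorname{Re} \mathcal{K}) f,\ Q_k^* \Lambda^{2\varepsilon - 2} P_{k+1} f \bigr). \]
Using the Schwarz inequality and the positivity of $\operatorname{Re} \mathcal{K}$, we get a bound
\begin{align*}
|T_5'| &\le \bigl( (\operatorname{Re} \mathcal{K}) f, f \bigr)^{1/2} \bigl( (\operatorname{Re} \mathcal{K}) Q_k^* \Lambda^{2\varepsilon - 2} P_{k+1} f,\ Q_k^* \Lambda^{2\varepsilon - 2} P_{k+1} f \bigr)^{1/2} \\
&= \bigl( \operatorname{Re}(\mathcal{K} f, f) \bigr)^{1/2} \bigl( \operatorname{Re}(\mathcal{K} Q_k^* \Lambda^{2\varepsilon - 2} P_{k+1} f,\ Q_k^* \Lambda^{2\varepsilon - 2} P_{k+1} f) \bigr)^{1/2} \\
&= \bigl( \operatorname{Re}(\mathcal{K} f, f) \bigr)^{1/2} \bigl( \operatorname{Re}(\Lambda^{-2\varepsilon} \mathcal{K} Q_k^* \Lambda^{2\varepsilon - 2} P_{k+1} f,\ \Lambda^{2\varepsilon} Q_k^* \Lambda^{2\varepsilon - 2} P_{k+1} f) \bigr)^{1/2} \\
&\equiv \bigl( \operatorname{Re}(\mathcal{K} f, f) \bigr)^{1/2} \bigl( \operatorname{Re}(f_1, f_2) \bigr)^{1/2}.
\end{align*}
The first factor is clearly bounded by $(\|\mathcal{K} f\|\, \|f\|)^{1/2} \le \|\mathcal{K} f\| + \|f\|$. To bound $f_1$, we expand again (writing $Q = Q_k$, $P = P_{k+1}$):
\begin{align*}
f_1 = \Lambda^{-2\varepsilon} \mathcal{K} Q^* \Lambda^{2\varepsilon - 2} P f ={}& (\Lambda^{-2\varepsilon} Q^* \Lambda^{2\varepsilon - 1}) (\Lambda^{-1} P) \mathcal{K} f + \Lambda^{-2\varepsilon} [\mathcal{K}, Q^*] \Lambda^{2\varepsilon - 2} P f \\
&+ \Lambda^{-2\varepsilon} Q^* [\mathcal{K}, \Lambda^{2\varepsilon - 2}] P f + \Lambda^{-2\varepsilon} Q^* \Lambda^{2\varepsilon - 2} [\mathcal{K}, P] f.
\end{align*}
The norm of the first term is bounded by $O(1) \bigl( \|\mathcal{K} f\| + \|f\| \bigr)$. Using Proposition 4.3, the other terms are bounded by $O(1) \|f\|$. To bound $f_2$ we write
\begin{align*}
f_2 = \Lambda^{2\varepsilon} Q^* \Lambda^{2\varepsilon - 2} P f ={}& \Lambda^{-1} P Q^* \Lambda^{4\varepsilon - 1} f + \Lambda^{2\varepsilon} Q^* \Lambda^{-2\varepsilon - 1} [\Lambda^{4\varepsilon - 1}, P] f \\
&+ \Lambda^{2\varepsilon} [Q^*, \Lambda^{-2\varepsilon - 1}] P \Lambda^{4\varepsilon - 1} f.
\end{align*}
We control the first term using the inductive hypothesis (it is here that we use the factor $4\varepsilon_{k+1} \le \varepsilon_k'$) and the two others by Proposition 4.3. Combining these bounds, we finally get the bound $T_5 \le O(1) \bigl( \|\mathcal{K} f\| + \|f\| \bigr)^2$, and hence the inequality (4.7) is shown for all $j$.

It remains to discuss the cases $k = 0, 1$ for the term $X_5$. The commutators of $\operatorname{Re} \mathcal{K}$ with $Q_0 \equiv R_{L,1}$ or with $Q_1$ do not vanish and hence there are additional terms in $T_5'$. They are of the form
\begin{align*}
&\sum_{m=1}^{M} b_{L,m} \bigl( [R_{L,m}^* R_{L,m}, R_{L,1}] f,\ \Lambda^{2\varepsilon - 2} P_{k+1} f \bigr), \\
&\sum_{m=1}^{M} b_{L,m} \bigl( [R_{L,m}^* R_{L,m}, Q_1] f,\ \Lambda^{2\varepsilon - 2} P_{k+1} f \bigr).
\end{align*}
Since $[R_{L,m}^* R_{L,m}, R_{L,1}] = \text{const.}\, R_{L,1}\, \delta_{m,1}$ and $[R_{L,m}^* R_{L,m}, Q_1] = \text{const.}\, R_{L,m}$, this is obviously bounded by $O(1) \bigl( \|\mathcal{K} f\| + \|f\| \bigr) \|f\|$.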
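The exponent bookkeeping used throughout the induction can be verified directly from the definitions $\varepsilon_j = 4^{1-2j}$ and $\varepsilon_j' = 4^{-2j}$ of Proposition 4.2. The following short check is an illustration, with the first few values written out.

```latex
% Worked check of the exponents epsilon_j = 4^{1-2j}, epsilon'_j = 4^{-2j}.
\begin{align*}
\varepsilon_1 &= \tfrac14, \quad \varepsilon_1' = \tfrac1{16}, \quad
\varepsilon_2 = \tfrac1{64}, \quad \varepsilon_2' = \tfrac1{256}, \quad \dots \\
\varepsilon_j' &= \tfrac14\, \varepsilon_j , \qquad
\varepsilon_{j+1} = 4^{-2j-1} = \tfrac14\, \varepsilon_j' , \\
\min_{j \le k} (\varepsilon_j, \varepsilon_j') &= \varepsilon_k' , \qquad
2 \varepsilon_{k+1} = \tfrac12\, \varepsilon_k' \le \varepsilon_k' , \qquad
4 \varepsilon_{k+1} = \varepsilon_k' .
\end{align*}
```

Each step of the induction thus costs a factor of 4 in the exponent, which is why the final exponent $\varepsilon$ in Theorem 4.1 is small but strictly positive for every finite chain length $n$.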
We have discussed now all the cases for the inductive bound on the Pj . The discussion of this step for the Qj is the same, except that some simplifications appear because of the simpler relations Qj = [Pj , K0 ]. The proof of Proposition 4.2 is complete. Proof of Theorem 4.1. Let ε ≤ ε0n+1 . We rewrite 32ε = 32ε−2 1 +
n+1 X
Q∗j Qj +
j=0
n X
Pj∗ Pj .
(4.20)
j=1
Note now that for Q = Qj , 32ε−2 Q∗ Q = Q∗ 32ε−2 Q + [32ε−1 , Q∗ ]Q. Using Proposition 4.2 and Proposition 4.3, we get a bound (f, 32ε−2 Q∗ Qf ) ≤ O(1) kKf k + kf k
2
+ O(1)kf k2 .
Of course, the $P$ satisfy analogous relations. Since $\|\Lambda^{2\varepsilon-2} f\| \le O(1)\|f\|$, the assertion (4.6) follows by summing the terms in Eq. (4.20). The proof of Theorem 4.1 is complete.

Using Theorem 4.1 we can now prove Proposition 3.4. We have

Proposition 4.5. If the potential $V$ satisfies Conditions H1, H2 and if $\beta_0 < 2\min(\beta_L, \beta_R)$ there is a $\lambda^* > 0$ such that if the couplings satisfy $|\lambda_{L,m}|, |\lambda_{R,m}| \in (0, \lambda^*)$ then both $L_{H_0}$ and $L^*_{H_0}$ have compact resolvent.

Proof. We show that the operator $K$ on $L^2(X, dx)$ has compact resolvent. From Theorem 4.1 we get the bound
\[
\|\Lambda^{\varepsilon} f\| \le C\bigl(\|(K - \alpha - 1)f\| + \|f\|\bigr), \tag{4.21}
\]
for all $f \in S(X)$. Since, by Lemma 3.1, $C_0^\infty(X)$ is a core of $K$, we see, by taking limits, that the estimate (4.21) holds for all $f$ in $D(K)$. We note that $\Lambda^2$ has compact resolvent. Indeed, recall the definition Eq. (3.7) of the effective Hamiltonian $G$. It is easily checked that, first of all, $G$ grows quadratically in every direction of $\mathbb{R}^{2d(n+M)}$, for sufficiently small $|\lambda_{i,m}|$. Second, it is also easily verified that
\[
\Lambda^2 = 1 - \Bigl(\sum_{j=1}^{n} (\Delta_{p_j} + \Delta_{q_j}) + \sum_{m=1}^{M} (\Delta_{r_{L,m}} + \Delta_{r_{R,m}})\Bigr) + W(p, q, r),
\]
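and, by construction
\[
W(p, q, r) \approx \sum_{j=1}^{n'} a_L^2 \bigl((\nabla_{p_j} G)^2 + (\nabla_{q_j} G)^2\bigr) + \sum_{j=n'+1}^{n} a_R^2 \bigl((\nabla_{p_j} G)^2 + (\nabla_{q_j} G)^2\bigr) + \sum_{m=1}^{M} \bigl(a_L^2 (\nabla_{r_{L,m}} G)^2 + a_R^2 (\nabla_{r_{R,m}} G)^2\bigr),
\]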
J.-P. Eckmann, C.-A. Pillet, L. Rey-Bellet
up to bounded terms. Thus $W(p, q, r)$ diverges in all directions of $\mathbb{R}^{2d(n+M)}$. Using the Rellich criterion (see [RS], Thm. XII.67) we conclude that $\Lambda^{\varepsilon}$ has compact resolvent for every $\varepsilon > 0$. Therefore, Eq. (4.21) implies, using again the Rellich criterion, that $(K - \alpha - 1)^*(K - \alpha - 1)$ has compact resolvent. We claim this implies that $K$ itself has compact resolvent. Indeed, since $K - \alpha - 1$ is strictly m-accretive, its inverse exists, and therefore the operator
\[
\bigl((K - \alpha - 1)^*(K - \alpha - 1)\bigr)^{-1} = (K - \alpha - 1)^{-1}\bigl((K - \alpha - 1)^*\bigr)^{-1}
\]
exists and is compact. This implies that $(K - \alpha - 1)^{-1}$ is compact and hence $K$ has compact resolvent as asserted.

Finally, we prove Proposition 3.5. We have the following

Proposition 4.6. Let $g$ denote an eigenvector of $L_{H_0}$ or $L^*_{H_0}$. If the assumptions of Proposition 3.4 are satisfied then $g \exp(\beta_0 G/2)$ is in the Schwartz space $S(X)$.

Proof. We prove the corresponding statement for the operator $K$ on $L^2(X, dx)$. We consider the set of $C^\infty$ vectors of $e^{Kt}$, i.e., the set
\[
C^\infty(K) \equiv \{f \in L^2(X, dx) \,;\, e^{Kt} f \in C^\infty(\mathbb{R}^+, L^2(X, dx))\}.
\]
The set $C^\infty(K)$ obviously contains all eigenvectors of $K$. Therefore Proposition 4.6 is a direct consequence of the following proposition.

Proposition 4.7. $C^\infty(K) = S(X)$.

Proof. By Theorem 1.43 in [Da] we have the following characterization of $C^\infty(K)$:
\[
C^\infty(K) = \cap_{n \ge 0} D(K^n),
\]
where $D(K^n) = \{f \in D(K^{n-1}),\ K^{n-1} f \in D(K)\}$. Since $S(X) \subset D(K)$ and $K S(X) \subset S(X)$, we have the easy inclusion
\[
S(X) \subset \cap_n D(K^n) = C^\infty(K).
\]
To show the inclusion in the other direction we will need the following theorem which we will prove in Appendix C. This is a (slight) generalization of the core theorem, [Da], Thm. 1.9.

Theorem 4.8. Let $B$ be a Banach space. Let $A : D(A) \to B$ be m-accretive. For all $n = 1, 2, \dots$, if $D$ is a subset of $D(A^n)$ and is dense in $B$ and furthermore $D$ is invariant under the semi-group $e^{At}$, then $D$ is a core for $A^n$.

Given this result we first show that $S(X)$ is invariant under $e^{Kt}$.
For $s \ge 0$ we consider the scale of spaces $N_s$ given by $N_s = D(\Lambda^s)$, with the norm $\|f\|_{(s)} = \|\Lambda^s f\|$. For $s \le 0$ we let $N_s$ be the dual of $N_{-s}$. From the definition of $\Lambda^2$, it is easy to see that $\{\|\cdot\|_{(s)} \,;\, s = 0, 1, \dots\}$ is a system of semi-norms for the topology of $S(X)$ and hence $S(X) = \cap_s N_s$. To show that $S(X)$ is left invariant by the semi-group $e^{Kt}$ generated by $K$, it is enough to show that $e^{Kt} N_s \subset N_s$ for all $s \ge 0$.
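For $f$, $g$ in $S(X)$ we have the identity
\begin{align*}
\bigl(\Lambda^{-s} e^{K^* t} \Lambda^{s} f, g\bigr) &= \bigl(f, \Lambda^{s} e^{Kt} \Lambda^{-s} g\bigr) \\
&= (f, g) + \int_0^t d\tau\, \bigl(f, \Lambda^{s} K e^{K\tau} \Lambda^{-s} g\bigr) \\
&= (f, g) + \int_0^t d\tau\, \bigl(f, (K + B) \Lambda^{s} e^{K\tau} \Lambda^{-s} g\bigr) \\
&= (f, g) + \int_0^t d\tau\, \bigl(\Lambda^{-s} e^{K^* \tau} \Lambda^{s} (K^* + B^*) f, g\bigr), \tag{4.22}
\end{align*}
where
\[
B = [\Lambda^{s}, K] \Lambda^{-s},
\]
is a bounded operator by Proposition 4.3. From (4.22) we see that
\[
\frac{d}{dt} \Lambda^{-s} e^{K^* t} \Lambda^{s} f = \Lambda^{-s} e^{K^* t} \Lambda^{s} (K^* + B^*) f. \tag{4.23}
\]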
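The passage from the second to the third line of (4.22) uses only the definition of $B$. As a worked sketch (a formal manipulation on $S(X)$, writing $\Lambda$ for the weight operator of Theorem 4.1 and assuming all products below are defined there):

```latex
\begin{align*}
\Lambda^{s} K &= K \Lambda^{s} + [\Lambda^{s}, K] \\
              &= K \Lambda^{s} + [\Lambda^{s}, K]\,\Lambda^{-s}\,\Lambda^{s} \\
              &= (K + B)\,\Lambda^{s},
              \qquad B = [\Lambda^{s}, K]\,\Lambda^{-s}.
\end{align*}
```

Moving $K + B$ to the other side of the pairing as $K^* + B^*$, and using that $\Lambda^{s}$ is symmetric, then gives the last line of (4.22).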
Now $K^*$ is the generator of a strongly continuous quasi-bounded semi-group, $B^*$ is bounded and so, [Ka], Chap. 9, Thm. 2.7, $K^* + B^*$ with domain $D(K^*)$ is the generator of a strongly continuous quasi-bounded semi-group $e^{(K^* + B^*)t}$ with $\|e^{(K^* + B^*)t}\| \le e^{(\alpha + \|B^*\|)t}$. From (4.23) we see that
\[
e^{(K^* + B^*)t} = \Lambda^{-s} e^{K^* t} \Lambda^{s}.
\]
Thus we obtain
\[
\|\Lambda^{-s} e^{K^* t} \Lambda^{s}\| \le e^{(\alpha + \|B^*\|)t},
\]
and so $e^{K^* t} : N_{-s} \to N_{-s}$, $s > 0$, is bounded. By duality $e^{Kt} : N_s \to N_s$, $s > 0$, is also bounded. This implies that $e^{Kt} N_s \subset N_s$, $s > 0$, and therefore $S(X)$ is invariant under $e^{Kt}$.

We now use Theorem 4.1. Let $f \in S(X)$, then replacing $f$ by $\Lambda^m f$ in Eq. (4.6), we obtain
\[
\|f\|_{(m+\varepsilon)} \le O(1)\bigl(\|K \Lambda^m f\| + \|f\|_{(m)}\bigr) \le O(1)\bigl(\|Kf\|_{(m)} + \|[K, \Lambda^m] f\| + \|f\|_{(m)}\bigr).
\]
Since
\[
\|[K, \Lambda^m] f\| = \|\Lambda^m [K, \Lambda^{-m}] \Lambda^m f\|,
\]
and since $\Lambda^m [K, \Lambda^{-m}]$ is bounded by Proposition 4.3 we obtain the bound
\[
\|f\|_{(m+\varepsilon)} \le O(1)\bigl(\|Kf\|_{(m)} + \|f\|_{(m)}\bigr). \tag{4.24}
\]
Using (4.24) it is easy to see, by induction, that, for $n = 1, 2, \dots$ we have
\[
\|f\|_{(n\varepsilon)} \le O(1) \sum_{j=0}^{n} \binom{n}{j} \|K^j f\|. \tag{4.25}
\]
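The induction behind (4.25) can be spelled out as follows. This is only a sketch: it assumes (4.24) may be applied with $m = (n-1)\varepsilon$, uses $K S(X) \subset S(X)$ so that the induction hypothesis applies to both $f$ and $Kf$, and writes $C$ and $C_{n-1}$ (names introduced here) for the $O(1)$ constants in (4.24) and in the hypothesis at step $n-1$. Binomial coefficients with lower index outside $\{0, \dots, n-1\}$ are taken to be zero.

```latex
\begin{align*}
\|f\|_{(n\varepsilon)}
 &\le C\bigl(\|Kf\|_{((n-1)\varepsilon)} + \|f\|_{((n-1)\varepsilon)}\bigr) \\
 &\le C\,C_{n-1} \sum_{j=0}^{n-1} \binom{n-1}{j}
      \bigl(\|K^{j+1} f\| + \|K^{j} f\|\bigr) \\
 &= C\,C_{n-1} \sum_{j=0}^{n}
      \Bigl[\binom{n-1}{j-1} + \binom{n-1}{j}\Bigr] \|K^{j} f\| \\
 &= C\,C_{n-1} \sum_{j=0}^{n} \binom{n}{j}\, \|K^{j} f\| .
\end{align*}
```

The base case $n = 1$ is (4.24) with $m = 0$, and the last equality is Pascal's rule.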
Since $S(X)$ is a core for $K^n$ by Theorem 4.8, we see, by taking limits, that $D(K^n) \subset N_{n\varepsilon}$. Therefore
\[
C^\infty(K) = \cap_n D(K^n) \subset \cap_n N_{n\varepsilon} = S(X).
\]
And this concludes the proof of Proposition 4.7.
Appendix A: Proof of Lemma 3.1

If $x(t) = \xi(t, w; x)$ denotes the solution of (3.1), it has the cocycle property
\[
\xi(t, \tau^s w; \xi(s, w; x)) = \xi(t + s, w; x),
\]
which holds for all $t, s \in \mathbb{R}$, $x \in X$ and $w \in W$. Here we have introduced the shift $(\tau^t w)(s) = w(t + s)$ on $W$. In particular the map $x \mapsto \xi(t, w; x)$ is a bijection with inverse $x \mapsto \xi(-t, \tau^t w; x)$. A standard argument shows that these maps are actually diffeomorphisms (see e.g. [IW], Ch. V.2). The Jacobian of $\xi(t, w; \cdot)$ is given by
\[
J(t, w; \cdot) = |\det D_x \xi(t, w; \cdot)| = \exp\Bigl(\int_0^t ds\, \operatorname{div} b \circ \xi(s, w; \cdot)\Bigr),
\]
and according to (3.2) the Jacobian satisfies
\[
e^{-B|t|} \le J(t, w; x) \le e^{B|t|}.
\]
Remark. In our case we have in fact
\[
\operatorname{div} b = -d \sum_{i,m} \gamma_{i,m} \equiv -\Gamma < 0,
\]
so that
\[
J(t, w; \cdot) = e^{-\Gamma t}.
\]
Lemma 3.1 is an immediate consequence of the following lemmata.

Lemma A.1. $T^t$ extends to a strongly continuous, quasi-bounded semi-group of positivity preserving operators on $L^2(X, dx)$. Its generator is the closure of $L$.

Proof. Let $f \in C_0^\infty$, then we have $\int dx\, |T^t f(x)|^2 = \lim \int dx\, \chi$ {|x| …

… 0. There have been traditionally two approaches to the problem:

1. The KAM approach. (1.3) is solved by a Newton method that constructs a sequence of symplectic changes of coordinates defined on shrinking domains that, in the limit, transform the problem to the $\lambda = 0$ case [1, 2, 17, 18].

2. Perturbation theory. For $U$ analytic (see below) one can attempt to solve (1.3) by iteration. This leads to a power series in $\lambda$, the Lindstedt series: $Z = \sum_n Z_n \lambda^n$. Each $Z_n$ is given as a sum of several terms (see Sect. 9), some of which are very large, proportional to $(n!)^a$ with $a > 0$, due to piling up of “small denominators” $(\omega \cdot q)$ from the momentum space representation of operator $D^{-1}$. However, the KAM method also yields the analyticity of $Z$ in $\lambda$ [19]. Thus the Lindstedt series must converge. To see this directly turned out to be rather hard and was finally done by Eliasson [8] who, by regrouping terms, was able to produce an absolutely convergent series that gives the quasiperiodic solution. Subsequently Eliasson’s work was simplified and extended by Gallavotti [9, 10, 11], by Chierchia and Falcolini [6, 7] and by Bonetto, Gentile, Mastropietro [12, 13, 14, 15, 3, 4].

In the present paper we shall develop a new iterative scheme to solve Eq. (1.3). It is based on a direct application of the renormalization group (RG) idea of quantum field theory (QFT) to the problem. The idea is to split the operator $D$ (or rather its inverse, see Sect. 2) into a small denominator and large denominator part, where small and large are defined with respect to a scale of order unity.
The next step is to solve the large denominator problem which results in a new effective equation of the type (1.3) for the small denominator part, with a new right-hand side. The procedure is iterated, with the scale separating small and large at the $n$th step equal to $\eta^n$ for some fixed $\eta < 1$. As a result we get a sequence of effective problems that converge to a trivial one as $n \to \infty$. A generic step is solved by a simple application of the Banach Fixed Point Theorem in a big space of functionals of $Z$ representing the right-hand side of Eq. (1.3) in the $n$th iteration step. Our iteration can be viewed as an iterative resummation of the Lindstedt series, as will be discussed in Sect. 9. This iterative approach trivializes the rather formidable
KAM Theorem and Quantum Field Theory
combinatorics of the small denominators. The functional formulation in terms of effective problems removes also the mystery behind the subtle cancellations in the Lindstedt series: they turn out to be an easy consequence of a symmetry in the problem as formulated in terms of the so-called Ward identities of QFT.

The QFT analogy of the problem (1.3) has been forcefully emphasized by Gallavotti et al. [11, 12]. The proof of Eliasson’s theorem by these authors was based on a separation into scales of the graphical expressions entering the Lindstedt series and was a direct inspiration for the present work.

An important part of the standard RG theory is an approximate scale invariance of the problem that is exhibited and exploited by the RG method. The KAM problem also is expected to have this aspect: as the coupling $\lambda$ is increased the solution with a given $\omega$ eventually ceases to exist. For suitable “scale invariant” $\omega$ (e.g. in $d = 2$ for $\omega = (1, \gamma)$ with $\gamma$ a “noble” irrational) the solution at the critical $\lambda$ is expected to exhibit a power law decay of Fourier coefficients and periodic orbits converging to it have peculiar “universal” scaling properties [16, 21]. We hope that the present approach will shed some light on these problems in the future.

While the main goal of this paper is to develop a new method, we use it to reprove the following (classical) result:

Theorem 1. Let $U$ be real analytic in $\phi$ and analytic in $I$ in a neighborhood of $I = 0$. Assume that $\omega$ satisfies condition (1.6). Then Eq. (1.3) has a solution which is analytic in $\lambda$ and real analytic in $\phi$ provided that either

(a) (the non-isochronous case) $\mu$ is an invertible matrix and $|\lambda|$ is small enough (in a $\mu$-dependent way).

(b) (the isochronous case) $\mu = 0$, $\int_{\mathbb{T}^d} \partial_I U(\phi, 0)\, d\phi = 0$, the $d \times d$ matrix with elements $\int_{\mathbb{T}^d} \partial_{I_k} \partial_{I_l} U(\phi, 0)\, d\phi$, $k, l = 1, \dots, d$, is invertible and $|\lambda|$ is small enough.

The above solutions are unique up to translations (1.5).

Remark.
Actually, we show that the solution is an analytic function not only of $\lambda$, but of the potential $U$, when the latter belongs to a small ball in a Banach space of analytic functions (see Sect. 3 for the introduction of such spaces). This allows us to consider more general Hamiltonians of the form $H(I, \phi) = H_0(I) + U(\phi, I)$, with $H_0$ and $U$ analytic and $U$ small. Indeed, we may expand $H_0$ around $I_0$ s.t. $\partial_I H_0(I_0) = \omega$, with $\omega$ satisfying condition (1.6):
\[
H_0(I) = H_0(I_0) + \omega \cdot (I - I_0) + \tfrac{1}{2} (I - I_0) \cdot \mu (I - I_0) + \tilde{H}_0(I),
\]
and define $\tilde{U} = U + \tilde{H}_0$ so as to include in it all the terms of order higher than two in the expansion of $H_0$. Replacing $I - I_0$ by $I$, we may apply Theorem 1 provided that $\tilde{U}$ satisfies the corresponding hypotheses. Also, more general cases where $\mu$ is a degenerate matrix can be treated.

The organization of the paper is as follows. In Sect. 2 we explain the RG formalism. In Sect. 3, we introduce spaces of analytic functions on Banach spaces; such spaces will be used to solve our RG equations. In Sect. 4, we state the main inductive estimates which are proved in Sect. 6 after an interlude on the Ward identities in Sect. 5. Theorem 1 is then proved in Sect. 7. Section 8 explains the connection of our formalism to QFT
J. Bricmont, K. Gaw¸edzki, A. Kupiainen
for those familiar with the latter. We should emphasize that the QFT is solely a source of intuition; the simple RG formalism of Sect. 2 is independent of it. Finally, in Sect. 9, the connection with the Lindstedt series is explained.

2. Renormalization Group Scheme

In this section we explain the iterative RG scheme without spelling out the technical assumptions that are needed to carry it out. We refer the reader to Sect. 9 for a graphical representation of the main quantities introduced here.

We shall work with Fourier transforms, denoting by lower case letters the Fourier transforms of functions of $\phi$, the latter being denoted by capital letters:
\[
F(\phi) = \sum_{q \in \mathbb{Z}^d} e^{-i q \cdot \phi} f(q), \quad \text{where} \quad f(q) = \int_{\mathbb{T}^d} e^{i q \cdot \phi} F(\phi)\, d\phi
\]
with $d\phi$ standing for the normalized Lebesgue measure on $\mathbb{T}^d$. Note first that we may use the translations (1.5) to limit our search for the solution of Eq. (1.3) to the subspace of $\Theta$ with zero average, i.e. with $\theta(0) = 0$ in the Fourier language. It will be convenient to separate the constant mode of $J$ explicitly by writing $Z = X + (0, \zeta)$, where $X$ has zero average. Let us define
\[
W_0(\phi; X, \zeta) = \lambda\, \partial U((\phi, \zeta) + X(\phi)). \tag{2.1}
\]
Denote by $G_0$ the operator $-D^{-1}$ acting on $\mathbb{R}^{2d}$-valued functions on $\mathbb{T}^d$ with zero average. In terms of the Fourier transforms,
\[
(G_0 x)(q) = \begin{pmatrix} \mu (\omega \cdot q)^{-2} & i (\omega \cdot q)^{-1} \\ -i (\omega \cdot q)^{-1} & 0 \end{pmatrix} x(q), \tag{2.2}
\]
for $q \ne 0$ and $(G_0 x)(0) = 0$. Writing Eq. (1.3) separately for the averages (i.e. $q = 0$) and the rest, we may rewrite it as the fixed point equations
\[
X = G_0 P W_0(X, \zeta), \tag{2.3}
\]
\[
(0, \mu \zeta) = -\int_{\mathbb{T}^d} W_0(\phi; X, \zeta)\, d\phi, \tag{2.4}
\]
where $P$ projects out the constants: $P F = F - \int_{\mathbb{T}^d} F(\phi)\, d\phi$. Our strategy is to solve Eq. (2.3) by an inductive RG method for given $\zeta$. This turns out to be possible quite generally without any nondegeneracy assumptions on $U$. The latter enter only in the solution of Eq. (2.4).

Below, we shall treat $W_0$ given by the Eq. (2.1) as a map on a space of $\mathbb{R}^{2d}$-valued functions $X$ on $\mathbb{T}^d$ with arbitrary averages¹. The vector $\zeta$ will be treated as a parameter and we shall often suppress it in the notation for $W_0$. For the inductive construction of the solution of Eq. (2.3), we shall decompose
\[
G_0 = G_1 + \Gamma_0, \tag{2.5}
\]
where $\Gamma_0$ will effectively involve only the Fourier components with $|\omega \cdot q|$ larger than $O(1)$ and $G_1$ the ones with $|\omega \cdot q|$ smaller than that (see Sect. 4). In particular, we shall have $\Gamma_0 = \Gamma_0 P$. Upon writing $X = Y + \tilde{Y}$, Eq. (2.3) becomes
\[
Y + \tilde{Y} = (G_1 + \Gamma_0) P W_0(Y + \tilde{Y}). \tag{2.6}
\]

¹ That the solution $X$ of Eq. (2.3) has zero average follows from the form of the equation.
Suppose that $\tilde{Y} = \tilde{Y}_0$, where $\tilde{Y}_0$ solves for fixed $Y$ the “large denominator” equation:
\[
\tilde{Y}_0 = \Gamma_0 W_0(Y + \tilde{Y}_0). \tag{2.7}
\]
Then Eq. (2.6) reduces to the relation
\[
Y = G_1 P W_1(Y) \tag{2.8}
\]
if we define $W_1(Y) = W_0(Y + \tilde{Y}_0)$. We have thus reduced the original problem (2.3) to the one from which the largest denominators were eliminated, at the cost of solving the easy large denominator problem (2.7) and of replacing the map $W_0$ by $W_1$. Note that, with these definitions, $\tilde{Y}_0 = \Gamma_0 W_1(Y)$ and thus $W_1$ satisfies the fixed point equation
\[
W_1(Y) = W_0(Y + \Gamma_0 W_1(Y)). \tag{2.9}
\]
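The elimination step (2.7)-(2.9) can be tried out on a finite-dimensional toy problem. The sketch below is only an illustration under loud assumptions: four "modes" with made-up denominators stand in for $|\omega \cdot q|$, the propagators are diagonal, the projection $P$ is dropped, and the nonlinearity `w0` is an arbitrary small smooth map chosen so that every fixed-point iteration is a contraction. None of these names or values come from the paper. It checks numerically that solving the reduced equation $Y = G_1 W_1(Y)$ and setting $X = Y + \Gamma_0 W_1(Y)$ reproduces the solution of the original equation $X = G_0 W_0(X)$.

```python
import math

# Hypothetical toy data: denominators standing in for |omega . q|.
DENOMS = [0.5, 1.0, 2.0, 4.0]
G0 = [1.0 / d for d in DENOMS]                                   # full propagator
GAMMA0 = [g if d >= 1.5 else 0.0 for g, d in zip(G0, DENOMS)]    # large denominators
G1 = [g - c for g, c in zip(G0, GAMMA0)]                         # small denominators

LAM = 0.1  # small coupling, so all maps below are contractions
MIX = [[0.5, 0.5, 0.0, 0.0],
       [0.0, 0.5, 0.5, 0.0],
       [0.0, 0.0, 0.5, 0.5],
       [0.5, 0.0, 0.0, 0.5]]
SHIFT = [0.3, -0.2, 0.5, 0.1]


def w0(x):
    """Toy nonlinearity playing the role of W0 (not the paper's map)."""
    return [LAM * math.sin(sum(m * xi for m, xi in zip(row, x)) + s)
            for row, s in zip(MIX, SHIFT)]


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def mul(diag, v):
    """Apply a diagonal operator to a vector."""
    return [d * a for d, a in zip(diag, v)]


def fixed_point(func, dim=4, iters=60):
    """Plain Banach iteration started from zero."""
    x = [0.0] * dim
    for _ in range(iters):
        x = func(x)
    return x


def w1(y):
    """Solve W1(y) = W0(y + Gamma0 W1(y)), the analogue of Eq. (2.9)."""
    return fixed_point(lambda w: w0(add(y, mul(GAMMA0, w))))


# Reduced problem Y = G1 W1(Y), then reconstruct X = Y + Gamma0 W1(Y).
y = fixed_point(lambda v: mul(G1, w1(v)))
x_rg = add(y, mul(GAMMA0, w1(y)))

# Direct solution of X = G0 W0(X) for comparison.
x_direct = fixed_point(lambda v: mul(G0, w0(v)))

gap = max(abs(a - b) for a, b in zip(x_rg, x_direct))
print("max difference between the two solutions:", gap)
```

In this toy setting the reduced unknown `y` lives only on the small-denominator modes, since `G1` vanishes on the others; the large-denominator modes are recovered entirely through `GAMMA0` applied to `w1(y)`.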
Conversely, this equation, which we shall solve for $W_1$ by the Banach Fixed Point Theorem in a suitable space, implies that $\tilde{Y}_0 = \Gamma_0 W_1(Y)$ satisfies Eq. (2.7) and thus that
\[
X = Y + \Gamma_0 W_1(Y) \equiv F_1(Y) \tag{2.10}
\]
is a solution of Eq. (2.3) if and only if $Y$ solves Eq. (2.8).

After $n - 1$ inductive steps, the solution of Eq. (2.3) will be given as
\[
X = F_{n-1}(Y), \tag{2.11}
\]
where $Y$ solves the equation
\[
Y = G_{n-1} P W_{n-1}(Y) \tag{2.12}
\]
and $G_{n-1}$ contains only the denominators $|\omega \cdot q| \le O(\eta^n)$, where $\eta$ is a positive number smaller than 1 fixed once for all. The next inductive step consists of decomposing $G_{n-1} = G_n + \Gamma_{n-1}$, where $\Gamma_{n-1}$ involves $|\omega \cdot q|$ of order $\eta^n$ and $G_n$ the ones smaller than that. We define $W_n(Y)$ as the solution of the fixed point equation
\[
W_n(Y) = W_{n-1}(Y + \Gamma_{n-1} W_n(Y)) \tag{2.13}
\]
and set
\[
F_n(Y) = F_{n-1}(Y + \Gamma_{n-1} W_n(Y)) \tag{2.14}
\]
(which is consistent with relation (2.10) if we take $F_0(Y) = Y$). Then replacing $Y$ in Eqs. (2.11) and (2.12) by $Y + \Gamma_{n-1} W_n(Y)$, we infer that $X = F_n(Y)$ if $Y = G_n P W_n(Y)$ completing the next inductive step. Note also the cumulative formulas that follow easily by induction: Wn (Y ) = W0 (Y + 0