February 19, 2003 14:25 WSPC/148-RMP
00158
Reviews in Mathematical Physics, Vol. 15, No. 1 (2003) 1–78 © World Scientific Publishing Company
SELF ORGANIZATION IN THE LOW TEMPERATURE REGION OF A SPIN GLASS MODEL
MICHEL TALAGRAND
Université Paris VI, Equipe d’Analyse, Institut Mathématique, UMR n° 1074, Boite 186, 4 Place Jussieu, 75230 Paris Cedex 05
Received 31 October 2002
Revised 25 November 2002
We obtain an almost complete description of the structure of the p-spin interaction model down to temperatures that decrease exponentially with p. We prove in particular the spontaneous creation of pure states, and we describe the distribution of their weights. This confirms the picture of “one step of symmetry breaking” predicted by the physicists. Similar results are obtained when a small external field is added, provided one agrees to add a lower order “generic” perturbation to the Hamiltonian.
Keywords: p-spin interaction model; replica-symmetry breaking; Poisson-Dirichlet distribution; pure states; cavity method.
Contents
1. Introduction
2. A Priori Estimates
3. Construction of the Lumps
4. Pure States
5. Orthogonality in the Absence of External Field
6. The Ghirlanda–Guerra Relations and the Poisson–Dirichlet Distribution
7. Conditioning and the Relative Weights
8. Conditioning and the Cavity Method
9. The Perturbed Hamiltonian and the Extended Ghirlanda–Guerra Identities
10. The Model with External Field
References
1. Introduction
The study of the supremum of a family of random variables (r.v.) is obviously a topic of considerable importance. A collection of r.v. is also called a stochastic process. A main use of these is to model phenomena that evolve with time, and a stochastic process is then a collection $(X_t)_{t\in\mathbb{R}}$ of r.v. The use of an index set with such precise features as $\mathbb{R}$ (in particular an order) motivates the consideration of dependent structures where, typically, the correlation
of $X_s$ and $X_t$ decreases as $|s-t|$ increases. A large part of probability theory consists in the study of such situations. In a somewhat different direction, one can consider a stochastic process $(X_t)_{t\in T}$ where $T$ is now an “abstract set”. This point of view is extremely useful in the theory of Gaussian processes, and more generally in probabilistic arguments in analysis (see e.g. [1]). Concerning Gaussian processes, it can be said that for such a process $X_t$ the order of magnitude of $\sup_{t\in T} X_t$ is understood “within a constant multiplicative factor” (through the theory of majorizing measures, see [2]). Due to the variety of possible situations, it seems difficult to obtain a better description in a general setting. In a different but connected order of ideas, when the r.v. $(X_t)_{t\in T}$ are independent, there is a very satisfactory theory of the “extreme values” taken by this family. Theoretical physicists discovered in the 80s a new direction of investigations (although probably they did not quite formulate it in the present terms) [3]. They discovered that very natural, and apparently simple processes display a very rich behavior of their “extreme values”. The present paper is devoted to the study of such a situation. Given an integer $p$, we will consider a family $(H_N(\sigma))_{\sigma\in\Sigma_N}$ of Gaussian r.v., where
$$\Sigma_N = \{-1, 1\}^N \tag{1.1}$$
such that
$$E\,H_N(\sigma^1)H_N(\sigma^2) \simeq \frac{N}{2}\Big(\frac{1}{N}\sum_{i\le N}\sigma^1_i\sigma^2_i\Big)^p, \tag{1.2}$$
where of course $\sigma^\ell = (\sigma^\ell_i)_{i\le N}$ and where $\simeq$ means equality within terms of order 1. (See the exact formula (1.4) below.) For N large we want to understand, for a given (but typical) realization of these variables, what are the large values among this realization. The somewhat canonical character of this situation should be apparent. The richness and the depth of the situation are largely due to the choice of the index set $\Sigma_N$. The natural distance on $\Sigma_N$ is the Hamming distance given by
$$d(\sigma^1,\sigma^2) = \frac{1}{N}\,\mathrm{card}\{i\le N : \sigma^1_i \ne \sigma^2_i\}$$
and we observe that
$$\frac{1}{N}\sum_{i\le N}\sigma^1_i\sigma^2_i = 1 - 2d(\sigma^1,\sigma^2),$$
so that (1.2) clearly relates the structure of the process $(H_N(\sigma))_{\sigma\in\Sigma_N}$ to the metric structure of $(\Sigma_N, d)$. The “high dimensional” character of the correlation (1.2) sharply contrasts with the “one dimensional” situation of many processes $(X_t)_{t\in\mathbb{R}}$. Condition (1.2) occurs with p = 2 in the famous Sherrington–Kirkpatrick (SK) model [4]. In this model, the energy $H_N(\sigma)$ of a configuration $\sigma\in\Sigma_N$ is given by
$$-H_N(\sigma) = \frac{1}{\sqrt N}\sum_{1\le i<j\le N} g_{ij}\sigma_i\sigma_j, \tag{1.3}$$
where $(g_{ij})$ are independent standard normal r.v. (the minus sign follows the convention of physics). It was completely unexpected that the study of this model (even at the non-rigorous level) should prove very difficult. The predictions made by G. Parisi, the so-called “Parisi solution”, are stunningly beautiful, and indicate that the simple, canonical formula (1.3) creates structures of great intricacy. While investigating the relevance of Parisi’s ideas to other situations, it was discovered [5, 6] that a simpler version of Parisi’s structure should occur if in (1.2), one replaces the “2-spin” interaction by a “p-spin” interaction, i.e. one considers
$$-H_N(\sigma) = \Big(\frac{p!}{2N^{p-1}}\Big)^{1/2}\sum_{i_1<\cdots<i_p} g_{i_1\cdots i_p}\,\sigma_{i_1}\cdots\sigma_{i_p}, \tag{1.4}$$
[…] $2\sqrt{\log 2}$ there exists a (random) partition $(C_\alpha)_{\alpha\ge 1}$ of $\Sigma_N$ such that if two configurations belong to the same set $C_\alpha$, their overlap is (typically) about $q_N(\beta)$, while if they belong to two different sets $C_\alpha$, $C_\gamma$, their overlap is about zero. The sequence of weights $w_\alpha = G_N(C_\alpha)$ is a random sequence with a precisely understood distribution (namely, a Poisson–Dirichlet distribution, as will be explained later). The Gibbs measure thus breaks into an asymptotically infinite sequence of non trivial pieces. These pieces are as far apart as they can be, and contain no further structure. Their existence is certainly not obvious from (1.4) (and the way they depend upon the randomness remains a mystery). There is a kind of “self-organization”, and one of the most remarkable predictions of [3] is verified.
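The relation between the overlap and the Hamming distance used above can be checked mechanically. The following is only an illustrative sketch: the helper names (`overlap`, `hamming`, `random_config`), the seed and the size N = 200 are assumptions made for the example and do not come from the paper.

```python
import random

def overlap(s1, s2):
    """Overlap R(s1, s2) = (1/N) * sum_i s1[i] * s2[i]."""
    return sum(a * b for a, b in zip(s1, s2)) / len(s1)

def hamming(s1, s2):
    """Normalized Hamming distance d(s1, s2) = (1/N) * #{i : s1[i] != s2[i]}."""
    return sum(1 for a, b in zip(s1, s2) if a != b) / len(s1)

def random_config(n, rng):
    """Uniform random configuration in {-1, 1}^n."""
    return [rng.choice((-1, 1)) for _ in range(n)]

rng = random.Random(0)  # fixed seed, chosen arbitrarily
N = 200
s1 = random_config(N, rng)
s2 = random_config(N, rng)
R = overlap(s1, s2)
d = hamming(s1, s2)
print(R, 1 - 2 * d)  # the two numbers agree up to rounding
```

For two independent uniform configurations the overlap is of order $N^{-1/2}$, so the covariance (1.2) between their energies is very small compared with the variance when p is large.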
What does this tell us about the original question, the large values of a given realization of the r.v. −HN (σ)? It turns out that max(−HN (σ)) is of order N , and that such is also the case of EhHN (σ)i. By general principles (that will be detailed in the proof of the Ghirlanda–Guerra identities) it is typically true that 2 ! 1 1 HN − EhHN i = o(1) , (1.10) E N N so that, in some sense, Gibbs’ measure looks only at the configurations σ for which HN (σ) = EhHN i + o(N ) .
(1.11)
A consequence of the structure previously described is that (in a certain sense) for a suitable range of values of β, the configurations satisfying (1.11) do not appear “everywhere” in the configuration space but only in the small, far apart clusters $C_\alpha$. We now turn toward a precise formulation of Theorem 1.1. Throughout the paper, we set
$$T_N(\beta) = E\langle R_{12}^p\rangle. \tag{1.12}$$
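We will prove that if p is large enough, for $1\le\beta\le 2^p$, the system of equations
$$m = 1 - \frac{T_N(\beta)}{q^p}, \tag{1.13}$$
$$q = \frac{E\,\mathrm{th}^2 X\,\mathrm{ch}^m X}{E\,\mathrm{ch}^m X}, \tag{1.14}$$
where
$$X = \beta\sqrt{\frac{p\,q^{p-1}}{2}}\;g, \tag{1.15}$$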
and where g is N(0, 1) (i.e. standard normal) has a unique solution $m_N(\beta)$, $q_N(\beta)$.
Theorem 1.2 (Formal version). There exists a number L such that if p ≥ L, p odd, for each ε > 0 we have
$$\lim_{N\to\infty}\int_1^{2^{p/L}} E\big(G^{\otimes 2}_{\beta,N}(D_\varepsilon)\big)\,d\beta = 0, \tag{1.16}$$
where
$$D_\varepsilon = \{(\sigma^1,\sigma^2);\ |R_{12}|\ge\varepsilon,\ |R_{12}-q_N(\beta)|\ge\varepsilon\}. \tag{1.17}$$
In (1.16), the notation $G_{\beta,N}$ stresses the fact that Gibbs’ measure depends on the parameter β. The reason why in (1.16) the integral is over $[1, 2^{p/L}]$ is that $q_N(\beta)$ is not defined for β small; but the case β ≤ 1 is not interesting, because then
$$\lim_{N\to\infty} E\langle|R_{12}|\rangle = 0. \tag{1.18}$$
This is shown in [10] and will be shown again here. (On the other hand (1.18) does not hold for $\beta > 2\sqrt{\log 2}$.)
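To make the system (1.13)–(1.15) more concrete, here is a hedged numerical sketch of the right-hand side of (1.14) for given values of q, m, β and p. The function names (`log_cosh`, `rhs_q`), the trapezoid rule and the integration window are choices made for this illustration only; they are not taken from the paper, and the sketch is meant for moderate values of β.

```python
import math

def log_cosh(x):
    """Overflow-safe log(cosh(x))."""
    a = abs(x)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)

def rhs_q(q, m, beta, p, steps=4000):
    """Right-hand side of (1.14), E[th^2 X ch^m X] / E[ch^m X],
    with X = beta * sqrt(p * q**(p-1) / 2) * g and g standard normal.
    Both Gaussian integrals are approximated by the trapezoid rule;
    the common normalisation 1/sqrt(2*pi) cancels in the ratio."""
    scale = beta * math.sqrt(p * q ** (p - 1) / 2.0)
    half_width = 10.0 + scale          # window covering the tilted Gaussian peak
    step = 2.0 * half_width / steps
    num = den = 0.0
    for k in range(steps + 1):
        g = -half_width + k * step
        x = scale * g
        # weight ch^m(x) * exp(-g^2 / 2), computed through its logarithm
        w = math.exp(m * log_cosh(x) - 0.5 * g * g)
        if k in (0, steps):
            w *= 0.5                   # trapezoid end points
        num += w * math.tanh(x) ** 2
        den += w
    return num / den

print(rhs_q(0.9, 0.5, 2.0, 3))
```

A fixed point of the map $q\mapsto$ `rhs_q(q, m, beta, p)`, combined with (1.13), is what the pair $(m_N(\beta), q_N(\beta))$ is required to satisfy; the sketch does not attempt to solve the system.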
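We will not only consider (1.4), but also the more general case
$$-H_N(\sigma) = \Big(\frac{p!}{2N^{p-1}}\Big)^{1/2}\sum_{i_1<\cdots<i_p} g_{i_1\cdots i_p}\,\sigma_{i_1}\cdots\sigma_{i_p} + h\sum_{i\le N}\sigma_i. \tag{1.19}$$
[…] 0,
$$L(a) = -a\log a. \tag{2.8}$$
Consider the function ξ(β, h, t) given by
$$\xi(\beta,h,t) = \begin{cases} \psi(t) + \dfrac{\beta^2}{4} + \beta h t & \text{if } \beta\le 2\sqrt{\psi(t)} \\[2mm] \beta\sqrt{\psi(t)} + \beta h t & \text{if } \beta\ge 2\sqrt{\psi(t)}. \end{cases} \tag{2.9}$$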
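The two branches of (2.9) agree at $\beta = 2\sqrt{\psi(t)}$, and the second branch is linear in β. The sketch below illustrates both facts. It assumes that ψ is the usual entropy function $\psi(t)=\log 2-\frac12\big((1+t)\log(1+t)+(1-t)\log(1-t)\big)$, which is consistent with (2.11) and (2.55) but whose definition does not appear in the text available here; the names `psi` and `xi` and the values of h and t are illustrative.

```python
import math

def psi(t):
    """ASSUMED entropy function: log 2 - ((1+t)log(1+t) + (1-t)log(1-t)) / 2."""
    def xlogx(x):
        return x * math.log(x) if x > 0.0 else 0.0
    return math.log(2.0) - 0.5 * (xlogx(1.0 + t) + xlogx(1.0 - t))

def xi(beta, h, t):
    """The function xi(beta, h, t) of (2.9)."""
    root = math.sqrt(max(psi(t), 0.0))   # guard against tiny negative rounding
    if beta <= 2.0 * root:
        return psi(t) + beta * beta / 4.0 + beta * h * t
    return beta * root + beta * h * t

h, t = 0.1, 0.2                          # illustrative values
beta_c = 2.0 * math.sqrt(psi(t))         # threshold between the two branches
print(xi(beta_c - 1e-9, h, t), xi(beta_c + 1e-9, h, t))  # nearly equal
```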
An important step in the proof of Theorem 2.1 is the following lower bound for $Z_N(\beta,h,t)$.
Theorem 2.2. There exists a number L such that, if p ≥ L, h ≤ 1/L, $\beta\le 2^{p/L}$, |t| ≤ 1/4, then
$$P\big(Z_N(\beta,h,t)\le\exp N(\xi(\beta,h,t)-2^{-p/L})\big)\le K\exp\Big(-\frac{N}{K}\Big). \tag{2.10}$$
This is an accurate result, because, as we will also show, $Z_N(\beta,h,t)$ is hardly ever larger than $\exp N(\xi(\beta,h,t)+\varepsilon)$ if ε > 0. The simple observation (overlooked in [10]) is that such large values of β as in Theorem 2.2 can be reached by using
the fact that the function $\beta\mapsto\xi(\beta,h,t)$ is linear for $\beta\ge 2\sqrt{\psi(t)}$ and by proving (2.10) for $\beta\le 2\sqrt{\psi(t)}$. This is explained in Proposition 2.8 below. The proof of Theorem 2.2 is not very complicated; but the “upper bound” argument needed in Theorem 2.1 will require more struggling. We now collect simple facts. The reason for the occurrence of the function ψ is the following well known estimate.
Lemma 2.3. We have
$$\frac{1}{L\sqrt N}\exp N\psi(t)\le\mathrm{card}\,S(t)\le\exp N\psi(t). \tag{2.11}$$
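Lemma 2.3 can be checked numerically for a fixed N. The sketch below rests on two assumptions that are not visible in the text available here: that $S(t)=\{\sigma;\ \sum_{i\le N}\sigma_i = Nt\}$, so that card S(t) is a binomial coefficient, and that ψ is the entropy function written in the code. The value L = 2 is only a constant that happens to work in this experiment, not the constant of the lemma.

```python
import math

def psi(t):
    """ASSUMED entropy function: log 2 - ((1+t)log(1+t) + (1-t)log(1-t)) / 2."""
    def xlogx(x):
        return x * math.log(x) if x > 0.0 else 0.0
    return math.log(2.0) - 0.5 * (xlogx(1.0 + t) + xlogx(1.0 - t))

def log_card_S(N, nt):
    """log card S(t) for N*t = nt, ASSUMING S(t) = {sigma : sum_i sigma_i = N t};
    the number of such configurations is C(N, (N + N t) / 2)."""
    return math.log(math.comb(N, (N + nt) // 2))

N = 200
L = 2.0  # illustrative constant only
worst_upper = worst_lower = float("-inf")
for nt in range(-N, N + 1, 2):            # N t has the same parity as N
    t = nt / N
    gap = log_card_S(N, nt) - N * psi(t)  # should be <= 0 (upper bound)
    worst_upper = max(worst_upper, gap)
    # lower bound: log card S(t) >= N psi(t) - log(L sqrt(N))
    worst_lower = max(worst_lower, -gap - math.log(L * math.sqrt(N)))
print(worst_upper, worst_lower)           # both should be <= 0 up to rounding
```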
It is of course understood here and everywhere that we consider only values of t for which S(t) is not empty. To distinguish between the Hamiltonians (1.4) and (1.19), we will denote by $-H_{N,0}(\sigma)$ the quantity (1.4), so that (1.19) reads
$$-H_N(\sigma) = -H_{N,0}(\sigma) + h\sum_{i\le N}\sigma_i. \tag{2.12}$$
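Lemma 2.4. We have
$$\forall\,\sigma,\quad \frac{N}{2}-K\le E H_{N,0}^2(\sigma)\le\frac{N}{2}, \tag{2.13}$$
$$\forall\,\sigma^1,\sigma^2,\quad \Big|E\big(H_{N,0}(\sigma^1)H_{N,0}(\sigma^2)\big)-\frac{N}{2}R(\sigma^1,\sigma^2)^p\Big|\le K. \tag{2.14}$$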
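Since the coefficients in (1.4) are independent standard Gaussians, the covariance in Lemma 2.4 is $\frac{p!}{2N^{p-1}}$ times the elementary symmetric polynomial of degree p in the products $\sigma^1_i\sigma^2_i$, which can be computed exactly. The sketch below compares it with $\frac N2 R^p$ for p = 3; the helper names, the seed and the sizes are illustrative assumptions.

```python
import math
import random

def elementary_symmetric(xs, p):
    """e_p(xs) = sum over i1 < ... < ip of xs[i1] * ... * xs[ip] (dynamic programming)."""
    e = [1] + [0] * p
    for x in xs:
        for k in range(p, 0, -1):
            e[k] += x * e[k - 1]
    return e[p]

def exact_covariance(s1, s2, p):
    """E H_{N,0}(s1) H_{N,0}(s2) for the Hamiltonian (1.4):
    only identical index sets i1 < ... < ip contribute."""
    N = len(s1)
    products = [a * b for a, b in zip(s1, s2)]
    return math.factorial(p) * elementary_symmetric(products, p) / (2 * N ** (p - 1))

rng = random.Random(1)  # fixed seed, chosen arbitrarily
N, p = 60, 3
s1 = [rng.choice((-1, 1)) for _ in range(N)]
s2 = [rng.choice((-1, 1)) for _ in range(N)]
R = sum(a * b for a, b in zip(s1, s2)) / N
gap = abs(exact_covariance(s1, s2, p) - N / 2 * R ** p)
variance = exact_covariance(s1, s1, p)
print(gap, variance, N / 2)
```

For p = 3 a direct expansion of $(\sum_i\sigma^1_i\sigma^2_i)^3$ shows that the gap equals $(3N-2)|R|/(2N)\le 3/2$, which is an instance of the bounded error term K in (2.14).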
Proof. For (2.13), we write
$$E H_{N,0}^2(\sigma) = \frac{p!}{2N^{p-1}}\binom{N}{p} = \frac{N}{2}\Big(1-\frac{1}{N}\Big)\cdots\Big(1-\frac{p-1}{N}\Big)$$
because there are $\binom{N}{p}$ choices for $i_1<\cdots<i_p$. To prove (2.14) we note that
$$2E H_{N,0}(\sigma^1)H_{N,0}(\sigma^2) = \frac{p!}{N^{p-1}}\sum_{i_1<\cdots<i_p}\sigma^1_{i_1}\cdots\sigma^1_{i_p}\,\sigma^2_{i_1}\cdots\sigma^2_{i_p}$$
[…] 0 such that
$$h\le h_0\ \Rightarrow\ \forall\,\beta,\quad 0\le t_m(\beta,h)\le\frac{1}{8}. \tag{2.51}$$
Moreover, if |t| ≤ 1,
$$\xi(\beta,h,t)\le\xi(\beta,h,t_m)-\frac{(t-t_m)^2}{4}. \tag{2.52}$$
Proof. Fixing β, h we have
$$\xi(\beta,h,t) = \frac{\beta^2}{4}+\psi(t)+t\beta h$$
if $\beta\le 2\sqrt{\psi(t)}$, while otherwise
$$\xi(\beta,h,t) = \beta\big(\sqrt{\psi(t)}+th\big),$$
so that $t_m$ satisfies either
$$-\psi'(t_m) = \beta h \tag{2.53}$$
or else
$$-\frac{\psi'(t_m)}{2\sqrt{\psi(t_m)}} = h. \tag{2.54}$$
Now
$$\psi'(t) = -\frac{1}{2}\log\frac{1+t}{1-t}, \tag{2.55}$$
and (2.53) means that $t_m = \mathrm{th}\,\beta h$. The case (2.53) can occur only if $\beta\le 2\sqrt{\log 2}$; since $\psi(t_m)\le 2\log 2$ the solution of (2.54) goes to zero with h. Thus (2.51) should be obvious. Next, from (2.55) we have
$$\psi''(t) = -\frac{1}{1-t^2}\le -1$$
and
$$\big(\sqrt{\psi(t)}\big)'' = \frac{\psi''(t)}{2\sqrt{\psi(t)}}-\frac{\psi'(t)^2}{4\psi(t)^{3/2}}\le-\frac{1}{2\sqrt{\psi(t)}},$$
so that $\big(\beta\sqrt{\psi(t)}\big)''\le -1$, whenever $\beta\ge 2\sqrt{\psi(t)}$. Clearly this implies (2.52).
Given $t_1$, $t_2$, u we set
$$D(\beta,h,t_1,t_2,u) = \sum\exp\big(-\beta H_N(\sigma^1)-\beta H_N(\sigma^2)\big), \tag{2.56}$$
where the summation is over $\sigma^1\in S(t_1)$, $\sigma^2\in S(t_2)$, $R_{12}=u$. The reason for considering this quantity is that
$$G^{\otimes 2}\big(\{(\sigma^1,\sigma^2);\ R_{12}\in U\}\big) = \frac{A}{Z_N(\beta,h)^2}, \tag{2.57}$$
where
$$A = \sum D(\beta,h,t_1,t_2,u), \tag{2.58}$$
for a summation over $|t_1|, |t_2|, |u|\le 1$, $u\in U$, $Nt_1$, $Nt_2$, $Nu$ integers. We set
$$\eta(\beta,h,t_1,t_2,u) = \frac{1}{N}E\log D(\beta,h,t_1,t_2,u). \tag{2.59}$$
We now turn to the proof of Theorem 2.1. In this theorem we will have $q(\beta,h) = t_m^2$. For clarity, we consider a parameter c, and, when p is odd, we define
$$U = \{x\in[-1,1];\ |x-t_m^2|\ge c,\ x\le 1-c\}, \tag{2.60}$$
while, when p is even, we define
$$U = \{x\in[-1,1];\ |x-t_m^2|\ge c,\ |x|\le 1-c\}. \tag{2.61}$$
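Returning to (2.52)–(2.53), the location of the maximum $t_m=\mathrm{th}\,\beta h$ and the quadratic bound can be illustrated on a grid. This is a sketch under stated assumptions: ψ is taken to be the entropy function written in the code (consistent with (2.55)), and the values β = 1, h = 0.1 are arbitrary illustrative choices for which the case (2.53) applies; the threshold $h_0$ of the text is not used.

```python
import math

def psi(t):
    """ASSUMED entropy function, consistent with (2.55)."""
    def xlogx(x):
        return x * math.log(x) if x > 0.0 else 0.0
    return math.log(2.0) - 0.5 * (xlogx(1.0 + t) + xlogx(1.0 - t))

def xi(beta, h, t):
    """The function xi(beta, h, t) of (2.9)."""
    root = math.sqrt(max(psi(t), 0.0))
    if beta <= 2.0 * root:
        return psi(t) + beta * beta / 4.0 + beta * h * t
    return beta * root + beta * h * t

beta, h = 1.0, 0.1                               # illustrative values
grid = [k / 1000.0 for k in range(-1000, 1001)]  # t in [-1, 1]
t_max = max(grid, key=lambda t: xi(beta, h, t))  # grid maximiser of xi
t_m = math.tanh(beta * h)                        # solution of (2.53)
# largest value of xi(t) - xi(t_m) + (t - t_m)^2 / 4; (2.52) says it is <= 0
worst = max(xi(beta, h, t) - xi(beta, h, t_m) + (t - t_m) ** 2 / 4.0 for t in grid)
print(t_max, t_m, worst)
```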
To prove Theorem 2.1 it suffices to prove the following.
Lemma 2.17. Given a number $L_0$, we can find $L_1$ such that if $p\ge L_1$, if $c = 2^{-p/L_1}$, then for all $t_1$, $t_2$, all $h\le h_0$, and all β with $1\le\beta\le 2^{p/L_1}$ we have
$$u\in U\ \Rightarrow\ \eta(\beta,h,t_1,t_2,u)\le 2\xi(\beta,h,t_m)-2^{-p/L_0+1}. \tag{2.62}$$
To see this, we take for $L_0$ the number L of Theorem 2.2. We bound the sum in (2.58) by $(2N+1)^2$ times its largest term, and we see that from (2.59), (2.62), we have
$$\frac{1}{N}E\log A\le 2\,\frac{\log(2N+1)}{N}+2\xi(\beta,h,t_m)-2^{-p/L_0+1}. \tag{2.63}$$
We use Theorem 2.2 with $t=t_m$ to control from below the denominator of (2.57). Now, mimicking (2.27), we have
$$P\Big(\frac{1}{N}\log A-\frac{1}{N}E\log A\ge u\Big)\le\exp\Big(-\frac{Nu^2}{4\beta^2}\Big),$$
and together with (2.63) this controls the numerator of (2.57).
Proof of Lemma 2.17. First, we observe that
$$D(\beta,h,t_1,t_2,u)\le Z_N(\beta,h,t_1)\,Z_N(\beta,h,t_2),$$
so that, taking logarithms, expectation, and using (2.18) we get
$$\eta(\beta,h,t_1,t_2,u)\le\xi(\beta,h,t_1)+\xi(\beta,h,t_2).$$
Thus (2.52) shows that to prove (2.62), we can assume (as we do in the rest of the proof) that
$$|t_1-t_m|,\ |t_2-t_m|\le 4d \tag{2.64}$$
where $d = 2^{-p/2L_0}$.
For a, b > 0, we define $F(a,b) = a+b$ if a ≤ b and $F(a,b) = 2\sqrt{ab}$ if a ≥ b. We observe that
$$2\xi(\beta,h,t) = F\Big(\frac{\beta^2}{2},\,2\psi(t)\Big)+2\beta h t \tag{2.65}$$
and that
$$\eta(\beta,h,t_1,t_2,u)\le F\Big(\frac{\beta^2}{2}(1+u^p),\,\psi(t_1,t_2,u)\Big)+\beta h t_1+\beta h t_2, \tag{2.66}$$
a fact that follows from Proposition 2.6, using (2.15), (2.35). Next, we show that if $p\ge L_1$, $c = 2^{-p/L_1}$, $L_1$ large enough, then
$$u\in U\ \Rightarrow\ (1+u^p)\,\psi(t_1,t_2,u)\le 2\psi(t_m)-\frac{c^2}{L}. \tag{2.67}$$
If $u^p\le 2^{-7}$, this follows from (2.42), since
$$|\psi(t_j)-\psi(t_m)|\le L\,L(d) \tag{2.68}$$
for j = 1, 2, by (2.64). To treat the case $u^p\ge 2^{-7}$, we observe that if x ≥ c, $c = 2^{-p/L_1}$, $L_1$ large enough, then
$$L\,L(d)+L\,L\Big(\frac{x^p}{4}\Big)-\frac{x}{L}\le-\frac{c}{L}\le-\frac{c^2}{L},$$
and we use (2.47) (resp. (2.48)) when u > 0 (resp. p even and u < 0). We now prove (2.62) when
$$\frac{\beta^2}{2}(1+u^p)\ge\psi(t_1,t_2,u). \tag{2.69}$$
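By (2.66), we have, using (2.67) in the second line
$$\begin{aligned}\eta(\beta,h,t_1,t_2,u)&\le 2\beta\sqrt{\frac{1}{2}(1+u^p)\,\psi(t_1,t_2,u)}+\beta h(t_1+t_2)\\ &\le 2\beta\sqrt{\psi(t_m)-\frac{c^2}{L}}+\beta h(t_1+t_2)\\ &\le 2\beta\sqrt{\psi(t_m)}+2\beta h t_m+2\beta d-\frac{\beta c^2}{L}\\ &\le 2\xi(\beta,h,t_m)+2\beta d-\frac{\beta c^2}{L}.\end{aligned} \tag{2.70}$$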
Since we assume β ≥ 1 (and $c = 2^{-p/L_1}$ where $L_1$ is large enough) this finishes the proof of (2.62) under (2.69). Finally we prove (2.62) when (2.69) fails, i.e.
$$\frac{\beta^2}{2}(1+u^p)<\psi(t_1,t_2,u). \tag{2.71}$$
First, we assume $u^p>0$. Then, from (2.71)
$$\frac{\beta^2}{2}<\frac{\beta^2}{2}(1+u^p)<\psi(t_1,t_2,u)\le\psi(t_1)+\psi(t_2)\le 2\psi(t_m)+L\,L(d), \tag{2.72}$$
using (2.68). Thus, if $\beta^2>4\psi(t_m)$, then
$$F\Big(\frac{\beta^2}{2},\,2\psi(t_m)\Big)\ge F\big(2\psi(t_m),\,2\psi(t_m)\big) = 4\psi(t_m)\ge\frac{\beta^2}{2}+2\psi(t_m)-L\,L(d) \tag{2.73}$$
by (2.72). If $\beta^2\le 4\psi(t_m)$, (2.73) remains true since $F(\beta^2/2,\,2\psi(t_m)) = \beta^2/2+2\psi(t_m)$. Now, under (2.71)
$$\begin{aligned}F\Big(\frac{\beta^2}{2}(1+u^p),\,\psi(t_1,t_2,u)\Big)&=\frac{\beta^2}{2}+\frac{\beta^2u^p}{2}+\psi(t_1,t_2,u)\\ &\le\frac{\beta^2}{2}+(1+u^p)\,\psi(t_1,t_2,u)\\ &\le\frac{\beta^2}{2}+2\psi(t_m)-\frac{c^2}{L},\end{aligned} \tag{2.74}$$
using (2.72) in the second line and (2.67) in the last line. Combining with (2.65), (2.66), (2.73), this proves again (2.62). The much easier case u < 0 is left to the reader.
We have proved Theorem 2.1. The following is also worth noting.
Proposition 2.18. If $h\le\frac{1}{2}$, β […] $2\sqrt{\log 2}$, the sequence of the weights $G_N(\{\sigma\})$ (when ranked in decreasing order) has asymptotically a distribution $\Lambda_m$. The distribution $\Lambda_m$ is very well understood (see [14, 15]). Quite interestingly some properties crucial
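in the present work are a simple consequence of the previous description using the REM. First, it should be obvious that if $\beta>2\sqrt{\log 2}$,
$$E\Big\langle\Big(-\frac{H_N}{N}-\sqrt{\log 2}\Big)^2\Big\rangle\to 0, \tag{6.13}$$
because only the values of σ for which $-H_N/N\sim\sqrt{\log 2}$ are relevant for Gibbs’ measure. In particular
$$\lim_{N\to\infty}E\Big\langle-\frac{H_N}{N}\Big\rangle = \sqrt{\log 2}. \tag{6.14}$$
We integrate by parts, using now the formula $E g_1u(g_2) = E g_1g_2\,Eu'(g_2)$ for $g_1 = H_N(\sigma^1)$, $g_2 = H_N(\sigma^2)$ and we get using (6.10) that
$$\begin{aligned}E\Big\langle-\frac{H_N}{N}\Big\rangle&=\frac{\beta}{N}E\Big(\big\langle E H_N^2(\sigma)\big\rangle-\big\langle E(H_N(\sigma^1)H_N(\sigma^2))\big\rangle\Big)\\ &=\frac{\beta}{2}\big(1-E\langle 1_{\{\sigma^1=\sigma^2\}}\rangle\big)\\ &=\frac{\beta}{2}\Big(1-E\sum G_N^2(\{\sigma\})\Big),\end{aligned} \tag{6.15}$$
and comparing with (6.13) we get in the limit that
$$E\sum_{\alpha\ge 1}v_\alpha^2 = 1-\frac{2\sqrt{\log 2}}{\beta} = 1-m, \tag{6.16}$$
(which is of course well known). Consider now a function f on k replicas, |f| ≤ 1. If we use (6.13) to mimic the proof of Theorem 6.2 (integrating now by parts using (6.10), as in (6.15)) we get
$$E\langle 1_{\{\sigma^1=\sigma^{k+1}\}}f\rangle = \frac{1}{k}T_N\,E\langle f\rangle+\frac{1}{k}\sum_{2\le\ell\le k}E\langle 1_{\{\sigma^1=\sigma^\ell\}}f\rangle+o(1) \tag{6.17}$$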
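where $T_N = E\langle 1_{\{\sigma^1=\sigma^2\}}\rangle = E\sum G_N^2(\{\sigma\})$. Above and below, o(1) goes to zero as N → ∞. Consider an integer n, and for s ≤ n consider $k_s$ different replicas $\sigma^{s,\ell}$, $\ell\le k_s$. Consider the function of $k = \sum_{s\le n}k_s$ replicas given by
$$f = \prod_{s\le n}\prod_{\ell<k_s}1_{\{\sigma^{s,\ell}=\sigma^{s,\ell+1}\}}.$$
[…] 0 and an integer k such that, for each 0 < m < 1, if we have
$$\forall\,n,\ \forall\,k_1,\ldots,k_n,\quad\sum_{s\le n}k_s\le k\ \Rightarrow\ \Big|E\Big(\prod_{s\le n}\sum_{\alpha\ge 1}\eta_\alpha^{k_s}\Big)-S^{(m)}(k_1,\ldots,k_n)\Big|\le\varepsilon, \tag{6.32}$$
then, if the r.v. U, V satisfy V ≥ 1 and $EU^2\le A$, we have
$$\Big|E\,\frac{\sum_{\alpha\ge 1}\eta_\alpha U_\alpha}{\sum_{\alpha\ge 1}\eta_\alpha V_\alpha}-\frac{E(UV^{m-1})}{EV^m}\Big|\le\varepsilon', \tag{6.33}$$
$$\Big|E\,\frac{\sum_{\alpha\ge 1}\eta_\alpha^2U_\alpha^2}{(\sum_{\alpha\ge 1}\eta_\alpha V_\alpha)^2}-(1-m)\frac{E(U^2V^{m-2})}{EV^m}\Big|\le\varepsilon', \tag{6.34}$$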
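$$\Big|E\,\frac{\sum_{\alpha\ne\beta}\eta_\alpha\eta_\beta U_\alpha U_\beta}{(\sum_{\alpha\ge 1}\eta_\alpha V_\alpha)^2}-m\Big(\frac{E(UV^{m-1})}{EV^m}\Big)^2\Big|\le\varepsilon'. \tag{6.35}$$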
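The identity (6.16), which is also the case U = V = 1 of (6.34) for weights distributed exactly according to $\Lambda_m$, can be explored by simulation. The sketch below assumes that $\Lambda_m$ is the Poisson–Dirichlet distribution with parameters (m, 0) in the Pitman–Yor parametrization, and samples it through the standard stick-breaking construction with independent Beta(1 − m, km) variables; this construction, the truncation to 100 sticks, the sample size and the function names are assumptions of the illustration, not part of the paper.

```python
import random

def sum_of_squares_pd(m, rng, sticks=100):
    """One sample of sum_alpha v_alpha^2 for stick-breaking weights
    v_k = V_k * prod_{j<k} (1 - V_j), with V_k ~ Beta(1 - m, k * m) independent.
    The sum of squares does not depend on the order of the weights."""
    remaining = 1.0   # mass not yet distributed
    total = 0.0
    for k in range(1, sticks + 1):
        v = remaining * rng.betavariate(1.0 - m, k * m)
        total += v * v
        remaining -= v
        if remaining < 1e-12:
            break
    return total

def mean_sum_of_squares(m, samples=1500, seed=0):
    """Monte Carlo estimate of E sum_alpha v_alpha^2."""
    rng = random.Random(seed)
    return sum(sum_of_squares_pd(m, rng) for _ in range(samples)) / samples

estimates = {m: mean_sum_of_squares(m) for m in (0.3, 0.6)}
for m, value in estimates.items():
    print(m, value, 1.0 - m)   # the estimate should be close to 1 - m
```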
Proof. If we fix m, U, V the result follows from Proposition 6.3 because the conditions (6.22) force the distribution of $(\eta_\alpha)$ to resemble $\Lambda_m$. The only issue could be the uniformity over m. But as m → 0, the largest weight becomes close to one (in which case everything is obvious) and as m → 1, it becomes small, in which case everything is also obvious by the law of large numbers.
7. Conditioning and the Relative Weights
In our quest for a proof of Theorem 1.1 we have (roughly speaking and assuming p odd) reached the following stage. If $\sigma^1$, $\sigma^2$ do not belong to the same pure state, then their overlap is zero. If they belong to the pure state $C_\alpha$, then $R(\sigma^1,\sigma^2)\simeq q_\alpha = \langle\sigma^1,\sigma^2\rangle_\alpha$. We would like to show that the numbers $q_\alpha$ do not depend upon α and are not random. The first approach that comes to mind is to try a kind of iteration procedure in the spirit of Sec. 5, but that does not seem to work. Rather, our approach relies upon the following observation. If we knew that $q_\alpha\simeq q$ (q non-random) we would know the distribution of $w_\alpha$. We would then make precise computations with the cavity method and prove that q must satisfy a relation such as (1.14), so that q would be completely determined by $T_N(\beta)$. To follow this line of attack, we will show that, given a number q, it is possible to make sense of “the part of the system consisting of the pure states for which $q_\alpha\simeq q$”, and most importantly, that this “partial system” satisfies relations similar to the Ghirlanda–Guerra identities. These relations are proved in a manner similar to the Ghirlanda–Guerra identities, but seem strictly more general. They involve rather interesting combinatorics. The effect of this construction is that we obtain an object similar to the whole system, but where we now know that $q_\alpha$ is always near a given value of q. In the next section we will then learn how to use the cavity method to prove that q (nearly) satisfies (1.14).
This will mean that the solution of (1.14) is the only possible value for qα ; this will finish the proof of Theorem 1.1. While, as mentioned, the proof contains an interesting and unexpected combinatorial ingredient it is obscured by a number of unimportant technical complications. Thus it seems appropriate to first give an informal sketch of this proof. We consider 1/2 ≤ q ≤ 1 and ε > 0. The role of ε is that we want to consider only those pure states Cα for which qα is within distance ε of q. Consider the function given by ψ(x) = 1{|x−q| 0 such that
The “edge effect” will occur in the region $\varepsilon\le|x-q|\le\varepsilon+\varepsilon'$. Typically ε′ will be much smaller than ε. We assume that
$$\varepsilon\le|x-q|\le\varepsilon+\varepsilon'\ \Rightarrow\ \psi(x) = 1-\frac{|x-q|-\varepsilon}{\varepsilon'}$$
so that
$$|\psi(x)-\psi(y)|\le\frac{1}{\varepsilon'}|x-y|. \tag{7.13}$$
We define
$$V = \langle\psi(R_{12})\rangle. \tag{7.14}$$
Lemma 7.1. We have
$$E\Big|V-\sum_{\alpha\ge 1}w_\alpha^2\psi(q_\alpha)\Big|\le\frac{\delta}{\varepsilon'}. \tag{7.15}$$
Here and below δ denotes a quantity depending upon N, β, but independent of all the various parameters (ε, q, ε′, ℓ, …) of our construction, and which goes to zero as N → ∞ when one averages over $\beta\le 2^{p/L}$. In fact, in (7.15), δ goes to zero as N → ∞ for each value of β, and the need to average over β will arise only when we will use (6.2).
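Proof. We have $\psi(R_{12})\ne 0\Rightarrow R_{12}\ge q-\varepsilon-\varepsilon'\ge 1/2$. From Theorem 4.1 we have
$$E\,G_N^{\otimes 2}\Big(\Big\{(\sigma^1,\sigma^2);\ R_{12}\ge\frac{1}{2}\Big\}\setminus\bigcup_{\alpha\ge 1}C_\alpha^2\Big)\le\exp\Big(-\frac{N}{K}\Big).$$
Thus, we make only an exponentially small error if we pretend that
$$V = \sum_{\alpha\ge 1}\langle 1_{C_\alpha^2}\psi(R_{12})\rangle = \sum_{\alpha\ge 1}w_\alpha^2\langle\psi(R_{12})\rangle_\alpha$$
using the notation (4.6). To improve clarity, we will not mention these exponentially small errors. Using (7.13), we have
$$\big|\langle\psi(R_{12})\rangle_\alpha-\psi(q_\alpha)\big|\le\langle|\psi(R_{12})-\psi(q_\alpha)|\rangle_\alpha\le\frac{1}{\varepsilon'}\langle|R_{12}-q_\alpha|\rangle_\alpha.$$
Now, using Jensen’s and Cauchy–Schwarz inequalities
$$\begin{aligned}\langle|R_{12}-q_\alpha|\rangle_\alpha&=\langle|R_{12}-\langle R_{12}\rangle_\alpha|\rangle_\alpha\le\langle|R_{12}-R_{34}|\rangle_\alpha\\ &\le\langle|R_{12}-R_{13}|\rangle_\alpha+\langle|R_{13}-R_{34}|\rangle_\alpha = 2\langle|R_{12}-R_{13}|\rangle_\alpha\\ &=2\Big\langle\Big|\frac{1}{N}\sigma^1\cdot(\sigma^2-\sigma^3)\Big|\Big\rangle_\alpha\le 2I_\alpha^{1/2},\end{aligned}$$
where $I_\alpha$ is defined in (4.7). Thus, since $\sum_{\alpha\ge 1}w_\alpha\le 1$, we have
$$\Big|V-\sum_{\alpha\ge 1}w_\alpha^2\psi(q_\alpha)\Big|\le\frac{2}{\varepsilon'}\sum_{\alpha\ge 1}w_\alpha^2I_\alpha^{1/2}\le\frac{2}{\varepsilon'}\Big(\sum_{\alpha\ge 1}w_\alpha^3I_\alpha\Big)^{1/2}\le\frac{2}{\varepsilon'}U_N(\beta)^{1/2}, \tag{7.16}$$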
which implies the result by Theorem 4.1.
To express that V is not too small, we consider another smooth function 0 ≤ φ ≤ 1, and we assume φ is differentiable and
$$|\varphi'|\le 2, \tag{7.17}$$
$$x\le 1\Rightarrow\varphi(x) = 0,\qquad x\ge 2\Rightarrow\varphi(x) = 1. \tag{7.18}$$
We set
$$W = \varphi(2^\ell V), \tag{7.19}$$
where ℓ is an integer. Thus, saying that W is not zero is a smooth version of saying that V is larger than $2^{-\ell}$. We recall the notation (7.5).
Lemma 7.2. We have
$$\langle f\rangle\le V^k. \tag{7.20}$$
Proof. Since 0 ≤ ψ ≤ 1, we have
$$\prod_{s\le n}\ \prod_{1\le k\le 2k_s-1}\psi\big(R(\sigma^{s,k},\sigma^{s,k+1})\big)\le\prod_{s\le n}\ \prod_{1\le k\le k_s}\psi\big(R(\sigma^{s,2k-1},\sigma^{s,2k})\big)$$
because there are more terms on the left-hand side. The k terms on the right-hand side depend upon different replicas, so the thermal average of this quantity is $V^k$.
We consider the function
$$\varphi_{\ell,k}(x) = \frac{\varphi(2^\ell x)}{x^k}. \tag{7.21}$$
Lemma 7.3. The derivative of $\varphi_{\ell,k}$ is bounded. Moreover
$$\varphi_{\ell,k}(x)\le 2^{\ell k}. \tag{7.22}$$
Proof. φ ≤ 1 and $\varphi(2^\ell x) = 0$ unless $x\ge 2^{-\ell}$.
In order to control the “edge effect” (the region where ψ ∉ {0, 1}), we introduce
$$\psi^\sim(x) = 1_{\{\varepsilon\le|x-q|\le\varepsilon+\varepsilon'\}} \tag{7.23}$$
so that
$$0<\psi(x)<1\ \Rightarrow\ \psi^\sim(x) = 1. \tag{7.24}$$
Lemma 7.4. We have
$$E\Big|\frac{\langle f\rangle}{V^k}\varphi(2^\ell V)-\prod_{s\le n}\sum_{\alpha\ge 1}\eta_\alpha^{k_s}\,\varphi(2^\ell V)\Big|\le K(k,\ell)\Big[E\sum_{\alpha\ge 1}w_\alpha^2\psi^\sim(q_\alpha)+\frac{\delta}{\varepsilon'}\Big]. \tag{7.25}$$
Here of course K(k, ℓ) is a number depending only upon k, ℓ. The control of the error terms requires great care, so we will explicitly write all the quantities on which the various constants K(···) depend, except p, that is fixed once and for all. We recall that δ depends only on N, β, but not on k, ℓ, …. We note the different nature of the error terms: the term δ/ε′ will go to zero as N → ∞ (if one averages over β). The “edge effect” $E\sum_{\alpha\ge 1}w_\alpha^2\psi^\sim(q_\alpha)$ will be made small by taking ε′ very small and averaging over ε.
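The bound (7.22) of Lemma 7.3 only uses the properties (7.17)–(7.18) of the cutoff φ. The sketch below picks one admissible φ (a cubic “smoothstep” between 1 and 2, which is an arbitrary choice made for this illustration and not the function of the paper) and checks the bound on a grid; the values ℓ = 4, k = 3 are likewise illustrative.

```python
def phi(x):
    """One admissible cutoff: 0 for x <= 1, 1 for x >= 2, cubic smoothstep in between.
    Its derivative 6 s (1 - s), s = x - 1, is at most 1.5, so (7.17) holds."""
    if x <= 1.0:
        return 0.0
    if x >= 2.0:
        return 1.0
    s = x - 1.0
    return s * s * (3.0 - 2.0 * s)

def phi_lk(x, l, k):
    """phi_{l,k}(x) = phi(2^l x) / x^k for x > 0, as in (7.21)."""
    return phi(2.0 ** l * x) / x ** k

l, k = 4, 3                                    # illustrative values
grid = [j / 10000.0 for j in range(1, 10001)]  # x in (0, 1]
largest = max(phi_lk(x, l, k) for x in grid)
print(largest, 2.0 ** (l * k))                 # largest value versus the bound 2^(l k)
```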
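Proof. We will proceed as in Lemma 7.1. The one difficulty is that one has to perform the various approximations in the right order to avoid potential problems with small denominators. We first note that if $-1\le a_s, b_s\le 1$, then
$$\Big|\prod_{s\le n}a_s-\prod_{s\le n}b_s\Big|\le\sum_{s\le n}|a_s-b_s|. \tag{7.26}$$
From (7.5) we have
$$\langle f\rangle = \prod_{s\le n}\Big\langle\prod_{1\le k\le 2k_s-1}\psi\big(R(\sigma^{s,k},\sigma^{s,k+1})\big)\Big\rangle = \prod_{s\le n}\sum_{\alpha\ge 1}w_\alpha^{2k_s}\Big\langle\prod_{1\le k\le 2k_s-1}\psi\big(R(\sigma^{s,k},\sigma^{s,k+1})\big)\Big\rangle_\alpha.$$
We set
$$U_1 = \prod_{s\le n}\sum_{\alpha\ge 1}w_\alpha^{2k_s}\psi(q_\alpha)^{2k_s-1}.$$
Using (7.26) twice, we see that
$$|\langle f\rangle-U_1|\le K(k)\sum_{\alpha\ge 1}w_\alpha^2\langle|\psi(R_{12})-\psi(q_\alpha)|\rangle_\alpha,$$
so that, as shown in the proof of Lemma 7.1 we have
$$E|\langle f\rangle-U_1|\le K(k)\frac{\delta}{\varepsilon'}.$$
Thus
$$E\big(|\langle f\rangle-U_1|\,\varphi_{\ell,k}(V)\big)\le K(k,\ell)\frac{\delta}{\varepsilon'},$$
i.e.
$$E\Big|\frac{\langle f\rangle}{V^k}\varphi(2^\ell V)-U_1\varphi_{\ell,k}(V)\Big|\le K(k,\ell)\frac{\delta}{\varepsilon'}.$$
Next, we set
$$U_2 = \prod_{s\le n}\sum_{\alpha\ge 1}w_\alpha^{2k_s}\psi(q_\alpha)^{k_s}.$$
We note that $\psi(q_\alpha)^{2k_s-1}\ne\psi(q_\alpha)^{k_s}\Rightarrow\psi^\sim(q_\alpha) = 1$. Using (7.26), we get
$$|U_2-U_1|\le n\sum_{\alpha\ge 1}w_\alpha^2\psi^\sim(q_\alpha), \tag{7.27}$$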
and thus
$$E|(U_1-U_2)\varphi_{\ell,k}(V)|\le K(k,\ell)\,E\sum_{\alpha\ge 1}w_\alpha^2\psi^\sim(q_\alpha). \tag{7.28}$$
Next, we set
$$V_1 = \sum_{\alpha\ge 1}w_\alpha^2\psi(q_\alpha).$$
Combining Lemmas 7.1 and 7.3 we get, since $U_2\le 1$
$$E|U_2(\varphi_{\ell,k}(V)-\varphi_{\ell,k}(V_1))|\le K(k,\ell)\frac{\delta}{\varepsilon'}. \tag{7.29}$$
Now
$$U_2\varphi_{\ell,k}(V_1) = \frac{U_2}{V_1^k}\varphi(2^\ell V_1),$$
and, obviously $U_2\le V_1^k$. Thus using again Lemma 7.1,
$$E\Big|\frac{U_2}{V_1^k}\big(\varphi(2^\ell V_1)-\varphi(2^\ell V)\big)\Big|\le K(k,\ell)\frac{\delta}{\varepsilon'}. \tag{7.30}$$
Now
$$\frac{U_2}{V_1^k} = \prod_{s\le n}\sum_{\alpha\ge 1}\eta_\alpha^{k_s},$$
and combining (7.27) to (7.30), this proves (7.25).
We set
$$W(k_1,\ldots,k_n) = E\Big(\prod_{s\le n}\sum_{\alpha\ge 1}\eta_\alpha^{k_s}\,\varphi(2^\ell V)\Big), \tag{7.31}$$
$$S(k_1,\ldots,k_n) = \frac{1}{E\varphi(2^\ell V)}W(k_1,\ldots,k_n). \tag{7.32}$$
(The notation does not indicate that these quantities depend upon ε, ε′, ℓ.) This is where the conditioning argument appears. We replace the basic probability P by a probability P′ having a density proportional to $\varphi(2^\ell V)$. The relations among the quantities $S(k_1,\ldots,k_n)$ will follow from the next estimate, where $m_0$ is given in (1.10).
Proposition 7.5. Given k, there exist for ℓ ≥ 1 numbers a(ℓ, N), K(k, ℓ) with the following properties
$$\forall\,N,\quad\sum_{\ell\ge 1}a(\ell,N)\le 8. \tag{7.33}$$
If we consider $k_1,\ldots,k_n\ge 1$ and $\sum_{s\le n}k_s\le k$, then
$$\begin{aligned}&\big|kW(k_1+1,k_2,\ldots,k_n)-[(k_1-m_0)W(k_1,\ldots,k_n)+k_2W(k_1+k_2,k_3,\ldots,k_n)+\cdots+k_nW(k_1+k_n,\ldots,k_{n-1})]\big|\\ &\qquad\le a(\ell,N)+K(k,\ell)\Big[E\sum_{\alpha\ge 1}w_\alpha^2\psi^\sim(q_\alpha)+\frac{\delta}{\varepsilon'}\Big]+E(\varphi(2^\ell V))\,K(k)\,p\,\varepsilon.\end{aligned} \tag{7.34}$$
Compared to Lemma 7.4, we get two new error terms. The term a(ℓ, N) is an “edge effect” produced by $\varphi(2^\ell V)$. Condition (7.33) shows that it can be made small by averaging over ℓ. The last term will be made small by taking ε small. (Of course parameters will have to be chosen in an appropriate order.)
Proof. Since $\varphi_{\ell,k}$ is bounded, we see by the Cauchy–Schwarz inequality and Proposition 4.1 that
$$\begin{aligned}E\Big(\Big\langle-\frac{1}{N}H_{N,0}(\sigma^{1,2k_1})f\Big\rangle\varphi_{\ell,k}(V)\Big)&=E\Big\langle-\frac{1}{N}H_{N,0}(\sigma)\Big\rangle\,E(\langle f\rangle\varphi_{\ell,k}(V))+K(k,\ell)\delta\\ &=\frac{\beta}{2}(1-T_N(\beta))\,E(\langle f\rangle\varphi_{\ell,k}(V))+K(k,\ell)\delta.\end{aligned} \tag{7.35}$$
To integrate by parts the left-hand side we write
$$\frac{\langle-\frac{1}{N}H_{N,0}(\sigma^{1,2k_1})f\rangle}{V^k} = \frac{\langle-\frac{1}{N}H_{N,0}(\sigma^{1,2k_1})f\rangle\,Z_N^{2k}}{(Z_N^2V)^k}.$$
In words, the denominators of the brackets on the numerator and the denominator are the same and cancel out. When integrating by parts, we get three terms corresponding respectively to dependence on the disorder of the Boltzmann factors occurring in
$$\Big\langle-\frac{H_{N,0}(\sigma^{1,2k_1})}{N}f\Big\rangle Z_N^{2k}$$
and in
$$Z_N^2V,$$
and to the dependence on the disorder of $\varphi(2^\ell V) = \varphi(2^\ell\langle\psi(R_{12})\rangle)$. These terms are labeled respectively I, II, III. The term III is the all important error term, so we handle it first. Introducing three new replicas $\sigma^1$, $\sigma^2$, $\sigma^3$, we have
$$\mathrm{III} = \frac{\beta}{2}E\Big(\frac{2^\ell\varphi'(2^\ell V)}{V^k}\Big(\big\langle f\,\mathcal{E}(\sigma^{1,2k_1},\sigma^1)\psi(R_{12})+f\,\mathcal{E}(\sigma^{1,2k_1},\sigma^2)\psi(R_{12})\big\rangle-2\big\langle f\,\psi(R_{12})\,\mathcal{E}(\sigma^{1,2k_1},\sigma^3)\big\rangle\Big)\Big),$$
where
$$\mathcal{E}(\sigma^1,\sigma^2) = \frac{1}{N}E\big(H_{N,0}(\sigma^1)H_{N,0}(\sigma^2)\big).$$
We bound crudely $|\mathcal{E}(\sigma^1,\sigma^2)|$ by 1. Since $\psi(R_{12})$ is thermally independent of f, we get a bound
$$\mathrm{III}\le 2\beta E\Big(\frac{2^\ell V\langle f\rangle\varphi'(2^\ell V)}{V^k}\Big)\le 2\beta E\big(2^\ell V\varphi'(2^\ell V)\big)$$
since $\langle f\rangle\le V^k$ by Lemma 7.2. Now, it is obvious from the definition of φ that |xφ′(x)| ≤ 4 · 1{1 < x […] 0, we have that $E\sum_{\alpha\ge 1}\{w_\alpha^2;\ |q_\alpha-q|\le\varepsilon\}\to 0$. Thus if (1.14) has a unique solution, this unique solution is asymptotically the only possible value for $q_\alpha$. One essential step in this line of reasoning is to find an argument proving that if $q_\alpha = q$ for each α, then q must satisfy (1.14). It is then not difficult (although cumbersome) to reproduce this argument in the setting of Sec. 7. Our first task is to outline this argument. We will stay informal about the details, since these will be fully covered in the case of “conditioning” which is the one we really need. This argument is designed to involve only the squares of the weights $w_\alpha$ (which is what the numbers $\eta_\alpha$ are). Remembering that we assume that $R_{12}$ can essentially be only 0 or q, we write
$$q\simeq E\frac{\langle R_{12}^2\rangle}{\langle R_{12}\rangle} = E\frac{\langle\sigma^1_N\sigma^2_NR_{12}\rangle}{\langle R_{12}\rangle}, \tag{8.1}$$
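by symmetry upon the sites. Changing N into N + 1 and β into β′, and setting
$$R'_{12} = R'_{12}(\rho^1,\rho^2) = \frac{N}{N+1}R_{12}+\frac{\sigma^1_{N+1}\sigma^2_{N+1}}{N+1}, \tag{8.2}$$
we get
$$q\simeq E\frac{\langle\sigma^1_{N+1}\sigma^2_{N+1}R'_{12}\rangle'}{\langle R'_{12}\rangle'}. \tag{8.3}$$
One can expect a limited effect of replacing $R'_{12}$ by $R_{12}$ so that
$$q\simeq E\frac{\langle\sigma^1_{N+1}\sigma^2_{N+1}R_{12}\rangle'}{\langle R_{12}\rangle'}$$
and using cavity
$$q\simeq E\frac{\big\langle R_{12}\,\mathrm{Av}\,\varepsilon_1\varepsilon_2\exp\big(\sum_{\ell\le 2}\varepsilon_\ell g(\sigma^\ell)\big)\big\rangle}{\big\langle R_{12}\,\mathrm{Av}\exp\sum_{\ell\le 2}\varepsilon_\ell g(\sigma^\ell)\big\rangle}. \tag{8.4}$$
Remembering that $R_{12}\simeq 0$ if $\sigma^1$, $\sigma^2$ do not belong to the same set $C_\alpha$ we have
$$q\simeq E\frac{\sum_{\alpha\ge 1}w_\alpha^2\langle\mathrm{sh}\,g(\sigma)\rangle_\alpha^2}{\sum_{\alpha\ge 1}w_\alpha^2\langle\mathrm{ch}\,g(\sigma)\rangle_\alpha^2} = E\frac{\sum_{\alpha\ge 1}\eta_\alpha\langle\mathrm{sh}\,g(\sigma)\rangle_\alpha^2}{\sum_{\alpha\ge 1}\eta_\alpha\langle\mathrm{ch}\,g(\sigma)\rangle_\alpha^2}. \tag{8.5}$$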
Next, we use the fact that $R_{12}$ takes essentially values close to 0 or q to prove that
$$q\simeq E\frac{\sum_{\alpha\ge 1}\eta_\alpha U_\alpha}{\sum_{\alpha\ge 1}\eta_\alpha V_\alpha} \tag{8.6}$$
where $(U_\alpha,V_\alpha)_{\alpha\ge 1}$ are i.i.d., and the law of the couple $(U_\alpha,V_\alpha)$ is that of the couple $(\mathrm{sh}^2(X),\mathrm{ch}^2(X))$, where X is Gaussian, $EX^2 = \beta^2pq^{p-1}/2$. This step might not be so intuitive, so let us at least mention that the fact that the couples $(\langle\mathrm{sh}^2g(\sigma)\rangle_\alpha,\langle\mathrm{ch}^2g(\sigma)\rangle_\alpha)$ are nearly independent as α varies follows from the fact, proved in Sec. 5, that g(σ) is typically nearly independent of g(σ′) if $\sigma\in C_\alpha$, $\sigma'\in C_{\alpha'}$, α ≠ α′. Next, as proved in Sec. 6, the sequence $(w_\alpha)$ has nearly distribution $\Lambda_m$ for $m = m_N(\beta)$, so that the sequence $(\eta_\alpha)$ has nearly distribution $\Lambda_{m/2}$. Thus (6.33) yields
$$q\simeq\frac{E\,\mathrm{th}^2X\,\mathrm{ch}^mX}{E\,\mathrm{ch}^mX},$$
which determines q. We now repeat the previous argument “under conditioning”, taking care of all the details. The delicate step (control of the distribution of the weights $\eta_\alpha$) was performed in Sec. 7. The rest of the proof is not difficult, but it is made cumbersome by the need of averaging to control the error terms of Theorem 7.6. We will use the notation of Sec. 6. The starting point (that corresponds to (8.1)) is as follows. Since ψ(x) = 0 if |x − q| ≥ ε + ε′, we have
$$|q\psi(x)-x\psi(x)|\le(\varepsilon+\varepsilon')\psi(x)\le 2\varepsilon\psi(x). \tag{8.7}$$
Thus we have
$$|q\langle\psi(R_{12})\rangle-\langle R_{12}\psi(R_{12})\rangle|\le 2\varepsilon\langle\psi(R_{12})\rangle.$$
Dividing by $V = \langle\psi(R_{12})\rangle$, multiplying by $\varphi(2^\ell V)$ and taking expectation, we get
$$\Big|qE\varphi(2^\ell V)-E\Big(\langle R_{12}\psi(R_{12})\rangle\frac{\varphi(2^\ell V)}{V}\Big)\Big|\le 2\varepsilon E\varphi(2^\ell V). \tag{8.8}$$
We use (8.8) and symmetry between sites to get
$$\Big|qE\varphi(2^\ell V)-E\Big(\langle\sigma^1_N\sigma^2_N\psi(R_{12})\rangle\frac{\varphi(2^\ell V)}{V}\Big)\Big|\le 2\varepsilon E\varphi(2^\ell V). \tag{8.9}$$
We rewrite (8.9) changing N into (N + 1) and β into β′. We recall the notation (8.2) and we set $V' = \langle\psi(R'_{12})\rangle'$, so that (8.9) implies
$$\Big|qE\varphi(2^\ell V')-E\Big(\langle\sigma^1_{N+1}\sigma^2_{N+1}\psi(R'_{12})\rangle'\frac{\varphi(2^\ell V')}{V'}\Big)\Big|\le 2\varepsilon E\varphi(2^\ell V'). \tag{8.10}$$
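We recall that φ, ψ are Lipschitz, and that $|R'_{12}-R_{12}|\le 2/N$. Thus, at the cost of adding to the right-hand side of (8.10) an error term K(ℓ)δ, we can replace in (8.10) $R'_{12}$ by $R_{12}$ and V′ by
$$V_1 = \langle\psi(R_{12})\rangle'. \tag{8.11}$$
We use the cavity method to write
$$\langle\sigma^1_{N+1}\sigma^2_{N+1}\psi(R_{12})\rangle' = \frac{\langle\psi(R_{12})\,\mathrm{sh}\,g(\sigma^1)\,\mathrm{sh}\,g(\sigma^2)\rangle}{\langle\mathrm{ch}\,g(\sigma)\rangle^2}, \tag{8.12}$$
$$V_1 = \langle\psi(R_{12})\rangle' = \frac{\langle\psi(R_{12})\,\mathrm{ch}\,g(\sigma^1)\,\mathrm{ch}\,g(\sigma^2)\rangle}{\langle\mathrm{ch}\,g(\sigma)\rangle^2}, \tag{8.13}$$
so that (8.10) yields
$$\Big|qE\varphi(2^\ell V_1)-E\Big(\frac{\langle\psi(R_{12})\,\mathrm{sh}\,g(\sigma^1)\,\mathrm{sh}\,g(\sigma^2)\rangle}{\langle\psi(R_{12})\,\mathrm{ch}\,g(\sigma^1)\,\mathrm{ch}\,g(\sigma^2)\rangle}\varphi(2^\ell V_1)\Big)\Big|\le 2\varepsilon E\varphi(2^\ell V_1)+K(\ell)\delta. \tag{8.14}$$
At some later stage of the proof we will want to use the results of Sec. 7. These involve a new probability having a density proportional to $\varphi(2^\ell V)$, not $\varphi(2^\ell V_1)$, so in (8.14) we would like to replace $\varphi(2^\ell V_1)$ by $\varphi(2^\ell V)$. The idea to do this is simply that the ratio $V_1/V$ is not too different from 1, so that $\varphi(2^\ell V_1)$ and $\varphi(2^\ell V)$ are equal except for a few values of ℓ; this should not make much difference. We recall that the dependence of the various constants on p is implicit.
Lemma 8.1. We have
$$E_g\frac{V_1}{V}\le K;\qquad E_g\frac{V}{V_1}\le 1. \tag{8.15}$$
Proof. By (8.13)
$$\frac{V_1}{V}\le\frac{\langle\psi(R_{12})\,\mathrm{ch}\,g(\sigma^1)\,\mathrm{ch}\,g(\sigma^2)\rangle}{\langle\psi(R_{12})\rangle}$$
so that
$$E_g\frac{V_1}{V}\le\exp 2\beta^2p\le K,$$
since $\beta\le 2^{p/2}$. The rest is obvious.
We use again the notation $a(\ell,N) = P(1<2^\ell V\le 2)$.
Lemma 8.2. Given r ≥ 1, we have
$$E(|\varphi(2^\ell V)-\varphi(2^\ell V_1)|)\le\sum_{\ell-r\le s\le\ell+r}a(s,N)+K2^{-r}. \tag{8.16}$$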
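Proof. It should be obvious from the properties of $\varphi$ that
\[ E(|\varphi(2^\ell V) - \varphi(2^\ell V_1)|) \le P(2^{-r} \le 2^\ell V \le 2^r) + P(2^\ell V \le 2^{-r},\ 2^\ell V_1 \ge 1) + P(2^\ell V \ge 2^r,\ 2^\ell V_1 \le 2) . \]
Using Lemma 8.1 and Markov inequality,
\[ P(2^\ell V \le 2^{-r},\ 2^\ell V_1 \ge 1) \le K2^{-r} , \qquad P(2^\ell V \ge 2^{r},\ 2^\ell V_1 \le 2) \le K2^{-r} , \]
and also
\[ P(2^{-r} \le 2^\ell V \le 2^r) \le \sum_{\ell-r\le s\le \ell+r} a(s, N) . \]

Combining Lemma 8.2 and (8.14), we have shown that
\[ \left| qE\varphi(2^\ell V) - E\left( \frac{\langle \psi(R_{12})\,\mathrm{sh}\,g(\sigma^1)\,\mathrm{sh}\,g(\sigma^2)\rangle}{\langle \psi(R_{12})\,\mathrm{ch}\,g(\sigma^1)\,\mathrm{ch}\,g(\sigma^2)\rangle}\,\varphi(2^\ell V) \right) \right| \le 2\varepsilon E\varphi(2^\ell V) + K(\ell)\delta + K2^{-r} + b(\ell, N) \tag{8.17} \]
where
\[ \sum_{\ell\ge 1} b(\ell, N) \le K(r) . \tag{8.18} \]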
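One way to see where a bound of the form (8.18) can come from is sketched below. It assumes that $b(\ell,N)$ is a constant multiple $C$ of the window sum appearing in (8.16); both the constant $C$ and this identification are assumptions made for illustration, since the text does not spell out $b(\ell,N)$. The events $\{1 < 2^s V \le 2\}$ are disjoint for different $s$, so $\sum_s a(s,N) \le 1$, and each index $s$ belongs to at most $2r+1$ windows $[\ell-r, \ell+r]$.

```latex
\sum_{\ell\ge 1} b(\ell,N)
  = C\sum_{\ell\ge 1}\ \sum_{\ell-r\le s\le \ell+r} a(s,N)
  \le C\,(2r+1)\sum_{s} a(s,N)
  \le C\,(2r+1) =: K(r).
```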
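The next step is to replace in (8.17) the process $(g(\sigma))$ by a simpler process. In the scheme of proof described early in this section, the step we are going to perform corresponds to going from (8.5) to (8.6). We consider i.i.d. Gaussian variables $(g_\alpha)_{\alpha\ge 1}$, with $Eg_\alpha^2 = \beta^2 p q^{p-1}/2$, and we define the process $g'(\sigma)$ by $g'(\sigma) = g_\alpha$ if $\sigma \in C_\alpha$.

Lemma 8.3. We have
\[ E\left( \left| \frac{\langle \psi(R_{12})\,\mathrm{sh}\,g(\sigma^1)\,\mathrm{sh}\,g(\sigma^2)\rangle}{\langle \psi(R_{12})\,\mathrm{ch}\,g(\sigma^1)\,\mathrm{ch}\,g(\sigma^2)\rangle} - \frac{\langle \psi(R_{12})\,\mathrm{sh}\,g'(\sigma^1)\,\mathrm{sh}\,g'(\sigma^2)\rangle}{\langle \psi(R_{12})\,\mathrm{ch}\,g'(\sigma^1)\,\mathrm{ch}\,g'(\sigma^2)\rangle} \right| \varphi(2^\ell V) \right) \le \varepsilon K E\varphi(2^\ell V) + K(\ell)\frac{\delta}{\varepsilon'} . \tag{8.19} \]

Proof. We write $g_t(\sigma) = \sqrt{t}\,g(\sigma) + \sqrt{1-t}\,g'(\sigma)$ and
\[ \xi(t) = E_g \frac{\langle \psi(R_{12})\,\mathrm{sh}\,g_t(\sigma^1)\,\mathrm{sh}\,g_t(\sigma^2)\rangle}{\langle \psi(R_{12})\,\mathrm{ch}\,g_t(\sigma^1)\,\mathrm{ch}\,g_t(\sigma^2)\rangle} . \tag{8.20} \]
Writing
\[ \Delta_{\ell\ell'} = Eg(\sigma^\ell)g(\sigma^{\ell'}) - Eg'(\sigma^\ell)g'(\sigma^{\ell'}) = 2E\left( \frac{d}{dt}g_t(\sigma^\ell) \right) g_t(\sigma^{\ell'}) , \tag{8.21} \]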
we obtain from (8.20), after integration by parts, as in Lemma 5.2 (and using the fact ∆`` is a constant so that the corresponding terms cancel out) ξ 0 (t) = Eg
hψ12 ∆12 it hψ12 ψ34 ∆13 ε2 ε3 it − 4Eg hψ12 it hψ12 i2t
+ 3Eg
hψ12 ψ34 ψ56 ∆35 ε1 ε2 ε3 ε5 it . hψ12 i3t
(8.22)
Here, we write ψ``0 = ψ(R``0 ) and we use compact notation as follows. Given n (= 2, 4 or 6), and a function f of σ 1 , . . . , σ n , ε1 , . . . , εn we write hf it = hAv f Et i for Et = exp
X
ε` gt (σ ` ) .
`≤n
Thus, we get |ξ 0 (t)| ≤ Eg
hψ12 |∆12 |it hψ12 ψ34 |∆13 |it + 7Eg . hψ12 i2t hψ12 i2t
(8.23)
We will explain only how to take care of the last term (the hardest), i.e. to control hψ12 ψ34 |∆13 |it ` ϕ(2 V ) . (8.24) E Eg hψ12 i2t Consider a function ψ ∗ , such that 0 ≤ ψ ∗ ≤ 1, ψ ∗ (x) = 0 if x ≤ 2βp2 ε, while ψ (x) = 1 if |x| ≥ 3βp2 ε, ψ ∗ Lipschitz. Then the term (8.24) is at most ϕ(2` V ) 3εβp2 Eϕ(2` V ) + E Eg hψ12 ψ34 |∆13 |ψ ∗ (∆13 )it hψ12 i2t ∗
≤ 3εβp2 Eϕ(2` V ) + K(`)Ehψ12 ψ34 |∆13 |ψ ∗ (∆13 )i ,
(8.25)
because, since ch x ≥ 1, we have hψ12 it ≥ hψ12 i = V . All we have to do is to show that the last term of (8.25) is bounded by δ/. We recall Lemma 4.2, so that ∆13 − βp (Rp−1 − δ(σ 1 , σ 3 )) ≤ K , (8.26) 13 N 2 S where δ(σ, σ 0 ) = q p−1 if (σ, σ 0 ) ∈ α≥1 Cα2 , while δ(σ, σ 0 ) = 0 otherwise. Setting ∆013 =
βp p−1 (R13 − δ(σ 1 , σ 3 )) , 2
we have to show that Ehψ12 ψ34 |∆013 |ψ ∗ (∆013 )i ≤
δ . ε0
(8.27)
We first show that Ehψ12 ψ34 |∆013 |ψ ∗ (∆013 )1{|R13 |≤1/2} i ≤
δ . ε0
(8.28)
Indeed, if |R(σ 1 , σ 3 )| ≤ 1/2 then δ(σ 1 , σ3 ) = 0, so that the quantity (8.28) is at most βp p−1 Eh|R13 |1{|R13 |≤1/2} i , 2
(8.29)
and this goes to zero by Theorem 5.1. Next, the method of Lemma 7.1 shows that Ehψ12 ψ34 |∆013 |ψ ∗ (∆013 )1{|R13 |≥1/2} i X βp p−1 βp p−1 δ 4 2 p−1 ∗ p−1 (q (q wα ψ(qα ) −q ) ψ −q ) + 0, =E 2 α 2 α ε α α≥1
(8.30) but the first term on the right-hand side of (8.30) is zero because ψ(qα ) 6= 0 ⇒ |qα − q| ≤ ε + ε0 ≤ 2ε βp p−1 βp p−1 p−1 2 ∗ p−1 (q ) ≤ εβp ⇒ ψ −q ) = 0. ⇒ (qα − q 2 2 α We have proved (8.19). The proof of the following mimics that of Lemma 7.4. Lemma 8.4. We have P ηα sh2 gα hψ(R12 ) sh g 0 (σ 1 ) sh g 0 (σ 2 )i ` −P ϕ(2 V ) E hψ(R12 ) ch g 0 (σ 1 ) ch g 0 (σ 2 )i ηα ch2 gα ! X δ 2 ∼ wα ψ (qα ) . ≤ K(`) 0 + E ε
(8.31)
α≥1
If we combine this with (8.17), (8.19), we have shown the following. Lemma 8.5. We have P 2 qEϕ(2` V ) − E P ηα sh gα ϕ(2` V ) 2 ηα ch gα ≤ εKEϕ(2` V ) + K2−` +
! X δ +E wα2 ψ ∼ (qα ) + b(`, N ) . ε0 α≥1
where b(`, N ) satisfies (8.18). If we can control the error terms, (8.32) means that P ηα sh2 gα 1 ` E Eϕ(2 V ) , q' P Eϕ(2` V ) ηα ch2 gα
(8.32)
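which, when combined with (6.33) and the work in Sec. 7, will yield (1.14). Thus, we turn to the control of the error terms and the averaging arguments. The function $\varphi$ is fixed once and for all. Our “conditioning” construction depends upon the parameters $\varepsilon, \varepsilon', \ell$. We write, for a r.v. Y,
\[ E_{\varepsilon,\varepsilon',\ell}(Y) = \frac{1}{E\varphi(2^\ell V)}\, E(Y\varphi(2^\ell V)) . \tag{8.33} \]
Considering a parameter $d > 0$, we write
\[ F(\beta, \varepsilon, \varepsilon', \ell, d) = \max\left( \left| q - E_{\varepsilon,\varepsilon',\ell}\left( \frac{\sum_{\alpha\ge 1}\eta_\alpha\,\mathrm{sh}^2 X_\alpha}{\sum_{\alpha\ge 1}\eta_\alpha\,\mathrm{ch}^2 X_\alpha} \right) \right| - d,\ 0 \right) , \tag{8.34} \]
so $F(\beta, \varepsilon, \varepsilon', \ell, d) \ge 0$ and
\[ \left| q - E_{\varepsilon,\varepsilon',\ell}\left( \frac{\sum_{\alpha\ge 1}\eta_\alpha\,\mathrm{sh}^2 X_\alpha}{\sum_{\alpha\ge 1}\eta_\alpha\,\mathrm{ch}^2 X_\alpha} \right) \right| \le d + F(\beta, \varepsilon, \varepsilon', \ell, d) . \tag{8.35} \]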
Considering an integer `0 , andε0 > 0, we write Z 1 X 1 2ε0 F (β, ε, ε0 , `, d)dε . AvF (β, ε0 , ε0 , `0 , d) = `0 ε0 ε0 `0 ≤` 0 depending only upon η such that if 2 m q − E th Xmch X ≥ η , (8.40) E ch X then, for each ε0 , `0 , we have 0
Av F (β, ε0 , ε0 , `0 , K(k)ε0 ) + Av Gm (β, ε0 , ε0 , `0 , k, K(k)ε0 ) ≥ θ .
(8.41)
Proof. It follows from Proposition 6.4 that we can find a number k and a number ξ > 0, depending only upon η such that, if the random weights (ηα )α≥1 satisfy X ks ≤ k ∀ n, ∀ k1 , . . . , kn ≥ 1, s≤n
⇒ E then
E
Y
! ηαks
s≤n
P P
2 α≥1 ηα sh Xα
α≥1
ηα ch2 Xα
!
− S (m ) (k1 , . . . , kn ) ≤ ξ , 0
Eth2 X chm X η − ≤ . E chm X 2
(8.42)
(8.43)
Let us denote by c the left-hand side of (8.41). If two functions have an average less than c, there exists a point where both are at most 2c. Thus we can find ε and ` such that ! P 2 α≥1 ηα sh Xα (8.44) ≤ 2c + K(k)ε0 . q − Eε,ε0 ,` P 2 α≥1 ηα ch Xα ∀ k1 , . . . , kn ,
X
0
ks ≤ k, |S(k1 , . . . , kn ) − S (m ) (k1 , . . . , kn )|
s≤n
≤ 2c + K(k)ε0 .
(8.45)
Combining (8.44) and (8.40), we have ! P 2 E th2 X chm X α≥1 ηα sh Xα − Eε,ε0 ,` P ≥ η − 2c − K(k)ε . m 2 E ch X α≥1 ηα ch Xα Recalling that S(k1 , . . . , kn ) = Eε,ε0 ,`
Y X
(8.46)
! ηαks
,
s≤n α≥1
we see from (8.45) and the implication (8.42) ⇒ (8.43) that we must have either 2c + K(k)ε0 > ξ or η − 2c − K(k)ε0 < η/2, so that in any case we have η − K(k)ε0 . 2c ≥ min ξ, 2 We conclude the proof by taking θ = min(ξ, η/2)/4 and ε0 small enough, depending only upon η. Corollary 8.9. Given η > 0, there exists ε0 > 0, depending only upon η such that, if (8.40) holds, then, for each integer `0 , we have ! 0 X ε δ K(r, η) 2 −`0 +1 wα 1|q−qα |≥ε0 ≥ 2 + 0 + K(η)2−r + . ≤ K(`0 , η) P ε0 ε `0 α≥1
(8.47)
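Proof. Combine the three previous lemmas.

We now define the set
\[ J(\eta, \beta, N) = \left\{ x \in \left[\tfrac12, 1\right] ;\ \exists q \in \left[\tfrac12, 1\right],\ |x - q| \le \eta ;\ \left| q - \frac{E\,\mathrm{th}^2 X\,\mathrm{ch}^m X}{E\,\mathrm{ch}^m X} \right| \le \eta \right\} , \tag{8.48} \]
where, as usual, $EX^2 = \beta^2 p q^{p-1}/2$ and $m = T_N(\beta)/q^{p-1}$.

Theorem 8.10. We have
\[ E\left( \sum_{\alpha\ge 1} w_\alpha^2 ;\ q_\alpha \notin J(\eta, \beta, N) \right) = \delta . \tag{8.49} \]
What (8.49) means is that the left-hand side of (8.49), at $h'$ fixed, goes to zero if we average over $\beta$ in an interval (while staying in the domain $\beta \le 2^{p/2}$, $(\beta, h)$ in the region of Theorem 2.1).

Proof. We consider $\varepsilon_0 > 0$, depending upon $\eta$ only as provided by Corollary 8.9. We can assume $\varepsilon_0 \le \eta$. Considering x in [0, 1], there is an integer n such that $|x - n\varepsilon_0| \le \varepsilon_0 \le \eta$. If $q = \varepsilon_0 n$ fails (8.40), then $x \in J(\eta, \beta, N)$. Thus if $x \notin J(\eta, \beta, N)$, $q = \varepsilon_0 n$ must satisfy (8.40). Thus
\[ E\left( \sum_{\alpha\ge 1} w_\alpha^2 ;\ q_\alpha \notin J(\eta, \beta, N) \right) \le \sum_{n} E\left( \sum_{\alpha\ge 1} w_\alpha^2 ;\ |q_\alpha - n\varepsilon_0| \le \varepsilon_0 \right) \]
where the sum on the right is over $n \le 1/\varepsilon_0$ such that $q = n\varepsilon_0$ satisfies (8.40). Which terms are in that sum depends upon N, $\beta$ (through $T_N(\beta)$). It now suffices to show that for each q,
\[ A\,E\left( \sum_{\alpha\ge 1} w_\alpha^2\, 1_{\{|q - q_\alpha| \le \varepsilon_0\}} \right) = \delta \tag{8.50} \]
where A = 1 if (8.40) holds and A = 0 otherwise. For a r.v. $0 \le X \le 1$, we have
\[ E(X) \le 2^{-\ell_0+1} + P(X \ge 2^{-\ell_0+1}) , \]
and combining this with Corollary 8.9 shows that the left-hand side of (8.50) is at most
\[ 2^{-\ell_0+1} + K(\ell_0, \eta)\left( \frac{\varepsilon'}{\varepsilon_0} + \frac{\delta}{\varepsilon'} \right) + K(\eta)2^{-r} + \frac{K(r, \eta)}{\ell_0} . \]
Thus
\[ \limsup_{N\to\infty} \int_0^{2^{p/L}} A\,E\left( \sum_{\alpha\ge 1} w_\alpha^2\, 1_{\{|q - q_\alpha| \le \varepsilon_0\}} \right) d\beta \le K2^{-\ell_0+1} + K(\ell_0, \eta)\frac{\varepsilon'}{\varepsilon_0} + K(\eta)2^{-r} + \frac{K(r, \eta)}{\ell_0} . \]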
We let ε0 → 0, then `0 → ∞, then r → ∞ to finish the proof. To finish the proof of Theorem 1.1, it suffices to prove the following. Proposition 8.11. If 1 ≤ β ≤ 2p/L , p ≥ L, the system of Eqs. (1.13), (1.14) has a unique solution such that q ≥ 1 − 2−p/L . Proof. Considering the function Φ(q, m) = we show that the map
E th2 X chm X , E chm X
TN (q, m) 7→ Φ(q, m), 1 − p q
sends [1 − 2−p/L , 1] × [1/Lβ, 1] into itself. First, we note that Φ(q, m) ≤ 1, 1 − TN /q p ≤ 1. We observe that TN ≤ 1 − 1/Lβ. This follows from (6.8), using that, √ by (2.22), we have |Eh−HN /N i| ≤ log 2. Thus, if q ≥ 1 − 2−p/L , we have TN ≥ 1 − TN (1 − 2−p/L )−1 qp 1 1 ≥ 1− 1− (1 − 2−p/L )−1 ≥ Lβ Lβ
m = 1−
(8.51)
for β ≥ 1, β ≤ 2p/L . Now Φ(q, m) = 1 −
1 E chm−2 X ≥1− , E chm X E chm X
and E chm X ≥ 2−m E exp mX = 2−m exp
m2 β 2 p p−2 q 2
p 1 exp , 2 L using again that m ≥ 1/Lβ. Thus indeed Φ(q, m) ≥ 1 − 2 exp(−p/L). The function TN f (q) = Φ q, 1 − p q ≥
(8.52)
satisfies $f(1 - 2^{-p/L}) > 1 - 2^{-p/L}$, $f(1) < 1$, so in between these values there is a number q with q = f(q). To show that this number is unique we show that $f' < 1$ on the previous interval. This is because the partial derivatives of $\Phi$ with respect to q, m are exponentially small in p. This follows from (8.51), and elementary considerations. We have finished the proof of Theorem 1.1.

Now that we know that the only possible value of $q_\alpha$ is given by (1.13), (1.14), we can use the argument of Sec. 6 (see (6.27)) to see that the distribution of the sequence $(w_\alpha)_{\alpha\ge 1}$ is about $\Lambda_{m_N(\beta)}$. (When $m_N(\beta) = 1$, i.e. $T_N(\beta) = 0$, this means of course that there are “no macroscopic weights”.)

Theorem 1.1 deals only with the case p odd, and we now investigate the case p even. In that case, Gibbs’ measure is invariant under the symmetry $\sigma \to -\sigma$. The pure states $C_\alpha$ go by pairs, $C_\alpha$, $C_{\varphi(\alpha)} = -C_\alpha$, and $G_N(C_\alpha) = G_N(-C_\alpha)$. The only change to make in the proof of Theorem 1.1 is that in (8.5) it is no longer true that the terms in the sums are nearly independent; but this is true after one regroups the contributions of $C_\alpha$ and $-C_\alpha$. We then conclude that $q_\alpha$ can essentially only be equal to q, so that the overlaps are asymptotically q, 0, or −q, the latter being obtained as the overlap of a configuration in $C_\alpha$ and one in $-C_\alpha$.

9. The Perturbed Hamiltonian and the Extended Ghirlanda–Guerra Identities

We would like to have the identities (6.4) when $R^p$ is replaced by any other power of R. Following an idea of [13] this is possible if one adds to the Hamiltonian a smaller order term that “contains an s-spin interaction for each integer s > 0”. More precisely, we consider
\[ g_N^{(s)}(\sigma) = \left( \frac{s!}{N^{s-1}} \right)^{1/2} \sum g^{(s)}_{i_1\cdots i_s}\, \sigma_{i_1}\cdots\sigma_{i_s} , \tag{9.1} \]
where the summation is over $1 \le i_1 < \cdots < i_s \le N$. (Thus $g_N^{(s)}(\sigma) = 0$ if s > N.) In (9.1), the r.v. $g^{(s)}_{i_1\cdots i_s}$ are all independent standard normal. Given $\boldsymbol{\beta}$, we define the “perturbation term of the Hamiltonian” by
\[ -\beta H_N^{\mathrm{per}}(\sigma) = \xi(N) \sum_{s\le N} 2^{-s}\beta_s\, g_N^{(s)}(\sigma) , \tag{9.2} \]
where $-1 \le \beta_s \le 1$, and where $\xi(N) = N^{-1/6}$. The purpose of the factor $2^{-s}$ is to ensure convergence. There is nothing magical about the power $N^{-1/6}$. One could also take $N^{-a}$, 0 < a < 1/4. The full Hamiltonian is now given by
\[ -\beta H_N^{\mathrm{full}}(\sigma) = -\beta H_N(\sigma) - \beta H_N^{\mathrm{per}}(\sigma) \tag{9.3} \]
and $N^{-1}E\log Z_N$ is now a function $p_N(\beta, h, \boldsymbol{\beta})$ where $\boldsymbol{\beta} = (\beta_1, \beta_2, \ldots)$.
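The normalisation in (9.1) can be checked by a short computation: the sum has $\binom{N}{s}$ independent standard normal terms, each multiplied by a product of spins equal to ±1, so the variance of $g_N^{(s)}(\sigma)$ is $s!\binom{N}{s}/N^{s-1} \le N$. The sketch below evaluates this, together with the variance per site of the perturbation term (9.2) with $\xi(N) = N^{-1/6}$. The function names are hypothetical and chosen only for this illustration.

```python
from math import comb, factorial


def variance_g(n_sites: int, s: int) -> float:
    """Variance of g_N^{(s)}(sigma) under the normalisation (9.1).

    There are comb(N, s) independent standard normal terms, each multiplied
    by a product of spins equal to +1 or -1, then scaled by (s!/N^(s-1))^(1/2).
    comb returns 0 when s > N, matching the remark that the sum is then empty.
    """
    return factorial(s) * comb(n_sites, s) / n_sites ** (s - 1)


def perturbation_variance_per_site(n_sites: int, betas: list[float]) -> float:
    """Variance of -beta * H^per_N(sigma) from (9.2), divided by N.

    betas[s - 1] plays the role of beta_s; xi(N) = N^(-1/6) as in the text.
    """
    xi_squared = n_sites ** (-1.0 / 3.0)
    total = sum(
        4.0 ** (-s) * b * b * variance_g(n_sites, s)
        for s, b in enumerate(betas, start=1)
    )
    return xi_squared * total / n_sites


if __name__ == "__main__":
    for s in (1, 2, 3, 10):
        print(s, variance_g(1000, s))
    print(perturbation_variance_per_site(1000, [1.0] * 20))
```

Since each variance is at most N and $\sum_s 4^{-s} = 1/3$, the per-site variance is at most $\xi(N)^2/3$ when $|\beta_s| \le 1$, which is the sense in which the perturbation is of lower order.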
The following proves that the perturbation term is indeed small in some sense.

Lemma 9.1. We have
\[ p_N(\beta, h, 0) \le p_N(\beta, h, \boldsymbol{\beta}) \le p_N(\beta, h, 0) + \sum_{s\ge 1} \frac{2^{-s}\beta_s^2}{2}\, \xi^2(N) . \]

Proof. The right-hand side follows by Jensen’s inequality, integrating in the r.v. $g^{(s)}_{i_1\cdots i_s}$ inside the log rather than outside. The left-hand side is obtained by observing that
\[ p_N(\beta, h, \boldsymbol{\beta}) - p_N(\beta, h, 0) = \frac{1}{N} E\log\langle \exp(-\beta H_N^{\mathrm{per}}(\sigma)) \rangle \]
where the bracket is for the choice of parameters corresponding to $p_N(\beta, h, 0)$. Now
\[ E\log\langle \exp(-\beta H_N^{\mathrm{per}}(\sigma)) \rangle \ge E\log\exp(-\beta\langle H_N^{\mathrm{per}}(\sigma) \rangle) = -\beta E\langle H_N^{\mathrm{per}}(\sigma) \rangle = 0 , \]
as is seen by integrating first in gi1 ···is at gi1 ···ip fixed. Lemma 9.2. If β ≤ 2p we have (s) 2 + Z 1 * (s) g (σ) 22s g (σ) 2s √ . −E + E dβs ≤ K(p) N N N ξ(N ) ξ(N )2 N −1 It is understood that in all the brackets, the parameters are β, h, β. It is in this lemma that the condition ξ(N )N 1/4 → ∞ arises. Proof. The proof mimics that of Lemma 6.1. We start with + * (s) g (σ) ∂pN N , (β, h, β) = 2−s ξ(N )E ∂βs N * !2 + * +2 (s) (s) g (σ) (σ) g ∂ 2 pN N N , (β, h, β) = 2−2s N ξ(N )2 E − ∂βs2 N N so that Z
1
−1
* E
(s)
gN (σ) N
!2 +
* −
(s)
gN (σ) N
+2 dβs ≤
L2s . N ξ(N )
As in the proof of Lemma 6.1, we deduce from Proposition 3.4 of [10] that * +2 +!2 * Z 1 (s) (s) g g (σ) (σ) N dβs ≤ K(p) √ . E N − E (2−s ξ(N ))2 N N N −1
Proposition 9.3 (Extended Ghirlanda–Guerra identities). Given a function f on k replicas, $|f| \le 1$, and a continuous function $\xi$, we have
\[ E\langle \xi(R_{1,k+1}) f(\sigma^1, \ldots, \sigma^k) \rangle = \frac{1}{k} E\langle \xi(R_{12}) \rangle E\langle f \rangle + \frac{1}{k} \sum_{2\le \ell\le k} E\langle \xi(R_{1,\ell}) f \rangle + \delta , \tag{9.4} \]
where
\[ \lim_{N\to\infty} \int \delta\, d\boldsymbol{\beta} = 0 , \tag{9.5} \]
for an integral over $-1 \le \beta_s \le 1$ for each $s \ge 1$.

Proof. By approximation one can assume that $\xi$ is a polynomial, and by linearity that it is a power, in which case Lemma 9.2 allows one to prove (9.4) as in Theorem 6.2.
Throughout the rest of the paper, $\delta$ will denote a quantity such as in (9.5). Proposition 9.3 is a statement of amazing power as we will now show. There is nothing to change in the work of Sec. 2 in the case of the full Hamiltonian (as is seen along the lines of Lemma 9.1). Thus we can construct the sets $(C_\alpha)$ as in Sec. 3, and we denote $w_\alpha = G_N(C_\alpha)$ (where the Gibbs’ measure now corresponds to the full Hamiltonian). Throughout the rest of the paper, we write
\[ m = m_N(\beta, h, \boldsymbol{\beta}) = E\langle 1_{\{R_{12} \ge 3/4\}} \rangle . \tag{9.6} \]
We recall the notation $S^{(m)}(k_1, \ldots, k_n)$ of (6.20).

Theorem 9.4. For any integers $n, k_1, \ldots, k_n$ we have
\[ E\left( \prod_{s\le n} \sum_{\alpha\ge 1} w_\alpha^{k_s} \right) - S^{(m)}(k_1, \ldots, k_n) = \delta . \tag{9.7} \]
This fact should be obvious following the method of Sec. 6, i.e. recursive use of (9.4) for a function ξ such that ξ(x) = 0 if x ≤ 1/2, ξ(x) = 1 if x ≥ 3/4. (Let us insist that the argument makes essential use of the fact that we know a priori that R12 essentially never belongs to the interval [1/2, 3/4].) The meaning of Theorem 9.4 is essentially that the weights of the lumps have a Poisson–Dirichlet distribution Λm . As we mentioned earlier, not controlling this distribution was the main obstacle in using the cavity method. Once this obstacle has been passed (almost effortlessly) the problem becomes much easier, as will be shown in Sec. 10. It is of course disturbing that the perturbation term seems to bring information out of nowhere. A possible explanation is that at a certain deep level (not yet understood) this information is “generically present” and that adding the perturbation term eliminates the exceptional “unstable” situations that escape the general rule.
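To give a feel for what a Poisson–Dirichlet law of the weights looks like, here is a minimal simulation sketch. It assumes that $\Lambda_m$ coincides with the two-parameter Poisson–Dirichlet law PD(m, 0) and uses the standard stick-breaking representation with independent Beta(1 − m, i·m) variables; these identifications, the truncation to finitely many sticks, and all function names are assumptions of the sketch rather than statements of the paper. It relies on the standard fact that for PD(m, 0) the expected sum of squared weights is 1 − m, which is the probability that two independent draws fall in the same lump.

```python
import random


def sample_weights(m: float, n_sticks: int, rng: random.Random) -> list[float]:
    """Stick-breaking sample of PD(m, 0) weights, truncated to n_sticks pieces.

    The weights appear in size-biased order, not in decreasing order.
    """
    weights = []
    remaining = 1.0
    for i in range(1, n_sticks + 1):
        v = rng.betavariate(1.0 - m, i * m)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights


def mean_sum_of_squares(m: float, n_sticks: int = 100,
                        n_samples: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of E sum_alpha w_alpha^2 for the truncated sample."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += sum(w * w for w in sample_weights(m, n_sticks, rng))
    return total / n_samples


if __name__ == "__main__":
    for m in (0.3, 0.5, 0.7):
        print(m, mean_sum_of_squares(m), 1.0 - m)
```

The estimate should be close to 1 − m up to Monte Carlo error and a small truncation bias.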
One clear occurrence of this is when h = 0, p is even. Without the perturbation term, the pure states go by symmetric pairs. We will show in the next section that the perturbation term breaks the symmetry. How do we change the problem by adding the perturbation term to the Hamiltonian? The answer to that question really depends on what we study. If we study the overlap of two configurations, the example of h = 0, p even shows that we do change the problem (the overlap takes essentially two values rather than three). On the other hand, if we are not interested in the detailed structure of Gibbs’ measure, but only in the asymptotic computation of $p_N$, Lemma 9.1 shows that we have not changed the problem.

10. The Model with External Field

The purpose of this section is to prove Theorem 10.1, which extends Theorem 1.1 to the case $h \ne 0$, provided we accept to add the perturbation term of Sec. 9 to the Hamiltonian. Given two numbers $q_0 \le q_1$, and two independent standard normal r.v. z, g, we consider
\[ X = \beta\sqrt{\frac{p}{2}} \left( g\sqrt{q_1^{p-1} - q_0^{p-1}} + z\sqrt{q_0^{p-1}} \right) + \beta h . \tag{10.1} \]
We denote by $E_g$ (resp. $E_z$) expectation at z (resp. g) fixed. We set
\[ m = m_N = 1 - E(\langle 1_{\{R_{12} \ge 1/2\}} \rangle) . \tag{10.2} \]

Theorem 10.1. There exists a number L with the following property. If p > L, $1 \le \beta \le 2^{p/L}$, $h \le 1/L$, then the system of equations
\[ q_0 = E_z \left( \frac{E_g(\mathrm{th}\,X\,\mathrm{ch}^m X)}{E_g\,\mathrm{ch}^m X} \right)^2 , \tag{10.3} \]
\[ q_1 = E_z\, \frac{E_g(\mathrm{th}^2 X\,\mathrm{ch}^m X)}{E_g\,\mathrm{ch}^m X} \tag{10.4} \]
has a unique solution $(q_0, q_1)$. Given $\varepsilon > 0$, we have
\[ \lim_{N\to\infty} \int E G_N^{\otimes 2}(\{ |R_{12} - q_0| \ge \varepsilon \text{ and } |R_{12} - q_1| \ge \varepsilon \})\, d\boldsymbol{\beta} = 0 . \tag{10.5} \]
In (10.5), $\beta$ is fixed, and the average is over $\boldsymbol{\beta}$, such that $-1 \le \beta_s \le 1$, for $s \ge 1$. Gibbs’ measure in (10.5) refers to the Hamiltonian (9.3). The only reason for the requirement $\beta \ge 1$ is to ensure that there is a solution to (10.3), (10.4). In fact, in the setting of Proposition 2.18 (or more generally when $m \to 1$ as $N \to \infty$) one can interpret (10.3) as meaning
\[ q_0 = E_z\, \mathrm{th}^2\left( \beta\sqrt{\frac{p\,q_0^{p-1}}{2}}\, z + \beta h \right) . \]
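In that case, asymptotically, $R_{12}$ takes only the value $q_0$. (This is the so-called replica-symmetric solution.) Before we start the proof of Theorem 10.1, we need to know that the overlap cannot take values close to −1, even if p is even.

Lemma 10.2. If $\beta \le 2^{p/L}$, $h \le 1/L$, we have
\[ E\langle 1_{\{R_{12} \le -1/2\}} \rangle = \delta . \tag{10.6} \]

Proof. With the notation of the discussion following Theorem 3.2, we have, combining (3.7) and (3.9), that
\[ E G_N^{\otimes 2}\left( \left\{ R_{12} \le -\tfrac12 \right\} \setminus \bigcup_{\alpha\ge 1} C_\alpha \times C_{\varphi(\alpha)} \right) \le K\exp\left( -\frac{N}{K} \right) \]
and all we need to show is that
\[ E G_N^{\otimes 2}\left( \bigcup_{\alpha\ge 1} C_\alpha \times C_{\varphi(\alpha)} \right) = E\left( \sum_{\alpha\ge 1} w_\alpha w_{\varphi(\alpha)} \right) = \delta . \tag{10.7} \]
To do this, we observe that for two continuous functions $\theta$, $\psi$ on $\mathbb{R}$, the extended Ghirlanda–Guerra relations imply that
\[ E\langle \theta(R_{13})\psi(R_{12}) \rangle = \frac12 E\langle \theta(R_{12}) \rangle E\langle \psi(R_{12}) \rangle + \frac12 E\langle \theta(R_{12})\psi(R_{12}) \rangle + \delta . \tag{10.8} \]
We take $\psi$ such that $\psi(x) = 1$ if $x \le -3/4$, $\psi(x) = 0$ if $x \ge -1/2$. Thus it is (essentially) true that $\psi(R_{12}) = 1$ if and only if $\sigma^2 \in C_{\varphi(\alpha)}$, where $\alpha$ is such that $\sigma^1 \in C_\alpha$. Taking $\theta = \psi$, we see that (10.8) implies
\[ E \sum_{\alpha\ge 1} w_\alpha w_{\varphi(\alpha)}^2 = \frac12 \left[ \left( E \sum_{\alpha\ge 1} w_\alpha w_{\varphi(\alpha)} \right)^2 + E\left( \sum_{\alpha\ge 1} w_\alpha w_{\varphi(\alpha)} \right) \right] + \delta . \tag{10.9} \]
Taking $\theta(x) = \psi(-x)$, since $\theta(x)\psi(x) = 0$, we get now from (10.8) that
\[ E \sum_{\alpha\ge 1} w_\alpha^2 w_{\varphi(\alpha)} = \frac12\, E \sum_{\alpha\ge 1} w_\alpha w_{\varphi(\alpha)}\ E \sum_{\alpha\ge 1} w_\alpha^2 + \delta . \tag{10.10} \]
Since $\sum_{\alpha\ge 1} w_\alpha^2 w_{\varphi(\alpha)} = \sum_{\alpha\ge 1} w_{\varphi(\alpha)}^2 w_\alpha$, comparing (10.9) and (10.10) gives
\[ E\left( \sum_{\alpha\ge 1} w_\alpha w_{\varphi(\alpha)} \right) \left[ 1 + E \sum_{\alpha\ge 1} w_\alpha w_{\varphi(\alpha)} - E \sum_{\alpha\ge 1} w_\alpha^2 \right] = \delta \]
and since $\sum_{\alpha\ge 1} w_\alpha^2 \le 1$, this implies (10.7).

We now start the main argument of the proof of Theorem 10.1. Considering two numbers $q_0$, $q_1$ to be determined later, we define
\[ q_{12} = q_{12}(\sigma^1, \sigma^2) = \begin{cases} q_1 & \text{if } R_{12} \ge \frac34 , \\ q_0 & \text{if } R_{12} < \frac34 . \end{cases} \tag{10.11} \]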
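We will study the quantity
\[ A_N(\beta, h', \boldsymbol{\beta}) = E\langle (R_{12} - q_{12})^2 \rangle , \tag{10.12} \]
where Gibbs’ measure is of course for the values $\beta, h', \boldsymbol{\beta}$ of the parameters. Using the symmetry between sites,
\[ A_N(\beta, h, \boldsymbol{\beta}) \le \frac{4}{N} + E\langle (\sigma^1_{N-1}\sigma^2_{N-1} - q_{12})(\sigma^1_N\sigma^2_N - q_{12}) \rangle . \tag{10.13} \]
We will use a technique related to that of Secs. 4 and 5, but to make the proof work it seems required to distinguish two coordinates rather than one. We set
\[ \beta'' = \left( \frac{N+2}{N} \right)^{(p-1)/2} \beta , \tag{10.14} \]
\[ \boldsymbol{\beta}'' = (\beta_s'') , \quad \text{where } \beta_s''\,\xi(N+2) = \beta_s\,\xi(N) \text{ for } s \ge 1 . \tag{10.15} \]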
Lemma 10.3. We have
\[ A_{N+2}(\beta'', h', \boldsymbol{\beta}'') = E\, \frac{\langle \mathrm{Av}\,(\eta_1\eta_2 - q_{12})(\varepsilon_1\varepsilon_2 - q_{12})\,\mathcal{E} \rangle}{\langle \mathrm{Av}\,\mathcal{E} \rangle} + \delta , \tag{10.16} \]
where Av is average over $\eta_1, \eta_2, \varepsilon_1, \varepsilon_2 = \pm 1$,
\[ \mathcal{E} = \exp\left( \sum_{\ell\le 2} \left( \eta_\ell (g'(\sigma^\ell) + h') + \varepsilon_\ell (g(\sigma^\ell) + h') \right) \right) , \tag{10.17} \]
$g(\sigma)$ is given by (4.12) and the process $(g'(\sigma))$ is an independent copy of the process $(g(\sigma))$. In the right-hand side of (10.16), Gibbs’ measure is for the value $(\beta, h', \boldsymbol{\beta})$ of the parameters.

Proof. This formula is clearly related to (4.15). One uses (10.13) for N + 2 rather than N, and one makes explicit the contribution of the last two spins. The right-hand side of (10.16) does not however exactly arise from the second term in (10.13). For equality to hold, in $\mathcal{E}$ there would be terms taking into account the perturbation term in the Hamiltonian and there would also be an interaction term between the (N + 1)th spin $\eta$ and the (N + 2)th spin $\varepsilon$. These extra terms are obviously of lower order. Also, in an identity, we would have to define $q_{12}$ as in (10.11) but using $R''_{12}$ rather than $R_{12}$, where
\[ R''_{12} = \frac{1}{N+2} (N R_{12} + \eta_1\eta_2 + \varepsilon_1\varepsilon_2) . \tag{10.18} \]
But (as we used several times) this makes little difference since $R_{12}$ is essentially never in $[1/2, 1 - 2^{-p/L}]$.

To take advantage of (10.16), we will replace the processes $g(\sigma)$, $g'(\sigma)$ by simpler ones. We consider i.i.d. N(0, 1) r.v. z, $g_\alpha$, and we define
\[ \gamma(\sigma) = \beta\sqrt{\frac{p}{2}} \left( \sqrt{q_0^{p-1}}\, z + \sqrt{q_1^{p-1} - q_0^{p-1}}\, g_\alpha \right) \tag{10.19} \]
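for $\sigma \in C_\alpha$. Thus we have
\[ E\gamma(\sigma^1)\gamma(\sigma^2) = \frac{\beta^2 p}{2}\, q_1^{p-1} = \frac{\beta^2 p}{2}\, q_{12}^{p-1} \tag{10.20} \]
if $(\sigma^1, \sigma^2) \in \bigcup_{\alpha\ge 1} C_\alpha^2$, while we have
\[ E\gamma(\sigma^1)\gamma(\sigma^2) = \frac{\beta^2 p}{2}\, q_0^{p-1} = \frac{\beta^2 p}{2}\, q_{12}^{p-1} \tag{10.21} \]
otherwise. We consider an independent copy $(\gamma'(\sigma))$ of this process. For $0 \le t \le 1$ we define
\[ g_t(\sigma) = \sqrt{t}\, g(\sigma) + \sqrt{1-t}\, \gamma(\sigma) \tag{10.22} \]
and we define $g'_t(\sigma)$ similarly. We define
\[ \mathcal{E}_t = \exp\left( \sum_{\ell\le 2} \left( \eta_\ell (g'_t(\sigma^\ell) + h') + \varepsilon_\ell (g_t(\sigma^\ell) + h') \right) \right) \tag{10.23} \]
and
\[ \theta(t) = E\, \frac{\langle \mathrm{Av}\,(\eta_1\eta_2 - q_{12})(\varepsilon_1\varepsilon_2 - q_{12})\,\mathcal{E}_t \rangle}{\langle \mathrm{Av}\,\mathcal{E}_t \rangle} . \tag{10.24} \]
To study $\theta(1)$, we will use the relation
\[ \theta(1) = \theta(0) + \theta'(1) - \frac12 \theta''(1) + \int_0^1 \frac{t^2}{2}\, \theta'''(t)\, dt \tag{10.25} \]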
1 θ(t) ≤ θ(0) + θ0 (1) − θ00 (1) + max |θ000 (t)| . 0 0. Differentiating by t we have V ∗ (∆ + t)−2 D1
1/2
= (∆0 + t)−2 T (D1 )1/2
(9)
and we infer kV ∗ (∆ + t)−1 D1 k2 = h(∆0 + t)−2 T (D1 )1/2 , T (D1 )1/2 i 1/2
= hV ∗ (∆ + t)−2 D1 , T (D1 )1/2 i 1/2
= k(∆ + t)−1 D1 k2 . 1/2
When kV ∗ ξk = kξk holds for a contraction V , it follows that V V ∗ ξ = ξ. In the light of this remark we arrive at the condition V V ∗ (∆ + t)−1 D1
1/2
= (∆ + t)−1 D1
1/2
and V (∆0 + t)−1 T (D1 )1/2 = V V ∗ (∆ + t)−1 D1
1/2
= (∆ + t)−1 D1
1/2
.
By Stone–Weierstrass approximation we have 1/2
V f (∆0 )T (D1 )1/2 = f (∆)D1
(10)
February 18, 2003 10:23 WSPC/148-RMP
00157
Monotonicity of Quantum Relative Entropy Revisited
83
for continuous functions. In particular for $f(x) = x^{it}$ we have
\[ T^*(T(D_2)^{it}\, T(D_1)^{-it}) = D_2^{it} D_1^{-it} . \tag{11} \]
This condition is necessary and sufficient for the equality.

Theorem 3.1. Let $T : B(H) \to B(K)$ be a 2-positive trace preserving mapping and let $D_1, D_2 \in B(H)$, $T(D_1), T(D_2) \in B(K)$ be invertible density matrices. Then the equality $S(D_1, D_2) = S(T(D_1), T(D_2))$ holds if and only if the following equivalent conditions are satisfied:
(i) $T^*(T(D_1)^{it}\, T(D_2)^{-it}) = D_1^{it} D_2^{-it}$ for all real t.
(ii) $T^*(\log T(D_1) - \log T(D_2)) = \log D_1 - \log D_2$.

The equality implies (11), which is equivalent to Theorem 3.1(i). Differentiating (i) at t = 0 we obtain the second condition, which obviously implies the equality of the relative entropies. The above proof follows the lines of [17]. The original paper is in the setting of arbitrary von Neumann algebras and hence slightly more technical (due to the unbounded feature of the relative modular operators). Condition (ii) of Theorem 3.1 appears also in the paper [22] in which different methods are used.

Next we recall a property of 2-positive mappings. When T is assumed to be 2-positive, the set
\[ A_T := \{ X \in B(H) : T(XX^*) = T(X)T(X^*) \text{ and } T(X^*X) = T(X^*)T(X) \} \]
is a ∗-sub-algebra of B(H) and
\[ T(XY) = T(X)T(Y) \quad \text{for all } X \in A_T \text{ and } Y \in B(H) . \tag{12} \]

Corollary 3.1. Let $T : B(H) \to B(K)$ be a 2-positive trace preserving mapping and let $D_1, D_2 \in B(H)$, $T(D_1), T(D_2) \in B(K)$ be invertible density matrices. Assume that $T(D_1)$ and $T(D_2)$ commute. Then the equality $S(D_1, D_2) = S(T(D_1), T(D_2))$ implies that $D_1$ and $D_2$ commute.

Under the hypothesis $u_t := T(D_1)^{it}\, T(D_2)^{-it}$ and $w_t := D_1^{it} D_2^{-it}$ are unitaries. Since $T^*$ is unital, $u_t \in A_{T^*}$ for every $t \in \mathbb{R}$. We have
\[ w_{t+s} = T^*(u_{t+s}) = T^*(u_t u_s) = T^*(u_t)\, T^*(u_s) = w_t w_s \]
which shows that $w_t$ and $w_s$ commute and so do $D_1$ and $D_2$.

4. Consequences and Related Inequalities

4.1. The Golden–Thompson inequality

The Golden–Thompson inequality states that
\[ \mathrm{Tr}\, e^{A+B} \le \mathrm{Tr}\, e^A e^B \]
D. Petz
holds for self-adjoint matrices A and B. It was shown in [18] that this inequality can be reformulated as a particular case of monotonicity when $e^A/\mathrm{Tr}\, e^A$ is considered as a density matrix and $e^{A+B}/\mathrm{Tr}\, e^{A+B}$ is the so-called perturbation by B. Corollary 5 of the original paper is formulated in the context of von Neumann algebras but the argument was adapted to the finite dimensional case in [19], see also [14, p. 128]. The equality holds in the Golden–Thompson inequality if and only if AB = BA. One of the possible extensions of the Golden–Thompson inequality is the statement that the function
\[ p \mapsto \mathrm{Tr}\, (e^{pB/2} e^{pA} e^{pB/2})^{1/p} \tag{13} \]
is increasing for p > 0. The limit at p = 0 is $\mathrm{Tr}\, e^{A+B}$ [5]. It was proved by Friedland and So that the function (13) is strictly monotone or constant [7]. The latter case corresponds to the commutativity of A and B.

4.2. A posteriori relative entropy

Let $E_j$ ($1 \le j \le m$) be a partition of unity in $B(H)^+$, that is $\sum_j E_j = I$. (The operators $E_j$ could describe a measurement giving finitely many possible outcomes.) Any density matrix $D_i \in B(H)$ determines a probability distribution
\[ \mu_i = (\mathrm{Tr}\, D_i E_1, \mathrm{Tr}\, D_i E_2, \ldots, \mathrm{Tr}\, D_i E_m) . \]
It follows from Uhlmann’s theorem that
\[ S(\mu_1, \mu_2) \le S(D_1, D_2) . \tag{14} \]
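Inequality (14) is easy to test numerically in the commuting case. The sketch below assumes diagonal densities $D_i = \mathrm{Diag}(d_i)$, for which $S(D_1, D_2)$ reduces to the classical relative entropy of the diagonals and $\mathrm{Tr}\, D_i E_j$ depends only on the diagonal of $E_j$. The function names and the numerical values are hypothetical choices made for illustration.

```python
from math import log


def relative_entropy(p, q):
    """S(p, q) = sum_k p_k (log p_k - log q_k), with the convention 0 log 0 = 0."""
    return sum(a * (log(a) - log(b)) for a, b in zip(p, q) if a > 0)


def outcome_distribution(d, e_diagonals):
    """(Tr D E_1, ..., Tr D E_m) for D = Diag(d); only diagonals of E_j matter."""
    return [sum(dk * ek for dk, ek in zip(d, e)) for e in e_diagonals]


# Diagonals of a partition of unity E_1 + E_2 + E_3 = I on a 3-dimensional space.
x = 0.3
e_diagonals = [(1.0, 0.0, 0.0), (0.0, x, x), (0.0, 1.0 - x, 1.0 - x)]
d2 = (1 / 3, 1 / 3, 1 / 3)

# Case 1: D1 is constant on the common support of E_2 and E_3, equality in (14).
d1_equal = (0.6, 0.2, 0.2)
# Case 2: D1 is not constant there, strict inequality in (14).
d1_strict = (0.6, 0.3, 0.1)

for d1 in (d1_equal, d1_strict):
    mu1 = outcome_distribution(d1, e_diagonals)
    mu2 = outcome_distribution(d2, e_diagonals)
    print(relative_entropy(mu1, mu2), relative_entropy(d1, d2))
```

In the first case the measurement loses no information about the pair of densities, while in the second it does, which is the distinction the equality analysis of this subsection is about.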
We give an example showing that the equality in (14) may appear non-trivially.

Example 4.1. Let $D_2 = \mathrm{Diag}(1/3, 1/3, 1/3)$, $D_1 = \mathrm{Diag}(1 - 2\mu, \mu, \mu)$, $E_1 = \mathrm{Diag}(1, 0, 0)$ and
\[ E_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & x & z \\ 0 & \bar z & x \end{pmatrix} , \qquad E_3 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 - x & -z \\ 0 & -\bar z & 1 - x \end{pmatrix} . \]
When $0 < \mu < 1/2$, $0 < x < 1$ and the modulus of the complex number z is small enough, we have a partition of unity and $S(\mu_1, \mu_2) = S(D_1, D_2)$ holds.

First we prove a lemma.

Lemma 4.1. If $D_2$ is an invertible density then the equality in (14) implies that $D_2$ commutes with $D_1, E_1, E_2, \ldots, E_m$.

The linear operator T associates a diagonal matrix
\[ \mathrm{Diag}(\mathrm{Tr}\, DE_1, \mathrm{Tr}\, DE_2, \ldots, \mathrm{Tr}\, DE_m) \]
to the density D acting on H and under the hypothesis (11) is at our disposal. We have
\[ \langle D_2^{1/2}, T^*(T(D_1)^{it}\, T(D_2)^{-it}) D_2^{1/2} \rangle = \langle D_2^{1/2}, D_1^{it} D_2^{-it} D_2^{1/2} \rangle . \]
Actually we benefit from the analytic continuation and we put −i/2 in place of t. Hence
\[ \sum_{j=1}^m (\mathrm{Tr}\, E_j D_1)^{1/2} (\mathrm{Tr}\, E_j D_2)^{1/2} = \mathrm{Tr}\, D_1^{1/2} D_2^{1/2} . \tag{15} \]
The Schwarz inequality tells us that
\[ \mathrm{Tr}\, D_1^{1/2} D_2^{1/2} = \langle D_1^{1/2}, D_2^{1/2} \rangle = \sum_{j=1}^m \langle D_1^{1/2} E_j^{1/2}, D_2^{1/2} E_j^{1/2} \rangle \]
\[ \le \sum_{j=1}^m \sqrt{\langle D_1^{1/2} E_j^{1/2}, D_1^{1/2} E_j^{1/2} \rangle}\, \sqrt{\langle D_2^{1/2} E_j^{1/2}, D_2^{1/2} E_j^{1/2} \rangle} = \sum_{j=1}^m (\mathrm{Tr}\, E_j D_1)^{1/2} (\mathrm{Tr}\, E_j D_2)^{1/2} . \]
The condition for equality in the Schwarz inequality is well-known: there are some complex numbers $\lambda_j \in \mathbb{C}$ such that
\[ D_1^{1/2} E_j^{1/2} = \lambda_j D_2^{1/2} E_j^{1/2} . \tag{16} \]
(Since both sides have positive trace, the $\lambda_j$ are actually positive.) The operators $E_j$ and $E_j^{1/2}$ have the same range, therefore
\[ D_1^{1/2} E_j = \lambda_j D_2^{1/2} E_j . \tag{17} \]
Summing over j we obtain
\[ D_2^{-1/2} D_1^{1/2} = \sum_{j=1}^m \lambda_j E_j . \]
Here the right hand side is self-adjoint, so $D_2^{-1/2} D_1^{1/2} = D_1^{1/2} D_2^{-1/2}$ and $D_1 D_2 = D_2 D_1$. Now it follows from (16) that $E_j$ commutes with $D_2$.

Next we analyse the equality in (14). If $D_2$ is invertible, then the previous lemma tells us that $D_1$ and $D_2$ are diagonal in an appropriate basis. In this case $S(\mu_1, \mu_2)$ is determined by the diagonal elements of the matrices $E_j$. Let $E(A)$ denote the diagonal matrix whose diagonal coincides with that of A. If $E_j$ is a partition of unity, then so is $E(E_j)$. However, given a partition of unity $F_j$ of diagonal matrices, there could in general be many choices of a partition of unity $E_j$ such that $E(E_j) = F_j$. At the moment we do not want to deal with this ambiguity, and we assume that we have a basis $e_1, e_2, \ldots, e_n$ consisting of common eigenvectors of the operators $D_1, D_2, E(E_1), E(E_2), \ldots, E(E_m)$:
\[ D_i e_k = v_k^i e_k \quad \text{and} \quad E(E_j) e_k = w_{kj} e_k \qquad (i = 1, 2,\ j = 1, 2, \ldots, m,\ k = 1, 2, \ldots, n) . \]
The matrix $[w_{kj}]_{kj}$ is (row) stochastic and condition (17) gives
\[ \frac{v_k^1}{v_k^2}\, w_{kj} = (\lambda_j)^2 w_{kj} . \]
This means that $w_{kj} \ne 0$ implies that $v_k^1/v_k^2$ does not depend on k. In other words, $D_1 D_2^{-1}$ is constant on the support of any $E_j$. Let j be equivalent with k, if the support of $E(E_j)$ intersects the support of $E(E_k)$. We denote by [j] the equivalence class of j and let J be the set of equivalence classes.
\[ P_{[j]} := \sum_{k\in[j]} E(E_k) \]
must be a projection and $\{P_{[j]} : [j] \in J\}$ is a partition of unity. We deduced above that
\[ D_1 D_2^{-1} P_{[j]} = \lambda_j P_{[j]} . \]
One cannot say more about the condition for equality. All these extracted conditions hold in the above example and the $E(E_k)$’s do not determine the $E_k$’s, see the freedom for the variable z in the example. We can summarise our analysis as follows. The case of equality in (14) implies some commutation relation and the whole problem is reduced to the commutative case. It is not necessary that the positive-operator-valued measure $E_j$ should have projection values.

4.3. The Holevo bound

Let $E_j$ ($1 \le j \le m$) be a partition of unity in $B(K)^+$, $\sum_j E_j = I$. We assume that the density matrix $D \in B(H)$ is in the form of a convex combination $D = \sum_i p_i D_i$ of other densities $D_i$. Given a coarse graining $T : B(H) \to B(K)$ we can say that our signal i appears with probability $p_i$, it is encoded by the density matrix $D_i$, after transmission the density $T(D_i)$ appears in the output and the receiver decides that the signal j was sent with the probability $\mathrm{Tr}\, T(D_i)E_j$. This is the standard scheme of quantum information transmission. Any density matrix $D_i \in B(H)$ determines a probability distribution
\[ \mu_i = (\mathrm{Tr}\, T(D_i)E_1, \mathrm{Tr}\, T(D_i)E_2, \ldots, \mathrm{Tr}\, T(D_i)E_m) \]
on the output. The inequality
\[ S(\mu) - \sum_i p_i S(\mu_i) \le S(D) - \sum_i p_i S(D_i) \tag{18} \]
(where $\mu := \sum_i p_i \mu_i$ and $D := \sum_i p_i D_i$) is the so-called Holevo bound for the amount of information passing through the communication channel. Note that the Holevo bound appeared before the use of quantum relative entropy and the first proof was more complicated.
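A small numerical instance of (18) may help fix the notation. The sketch below is an illustration under stated assumptions: the channel T is taken to be the identity on a qubit, the two signals are the pure states $|0\rangle$ and $|+\rangle$ with equal probabilities, and the measurement is in the computational basis. All names and values are hypothetical; real symmetric 2×2 matrices [[a, b], [b, c]] are represented by the triple (a, b, c).

```python
from math import log, sqrt


def shannon(probabilities):
    """Entropy -sum p log p (natural logarithm), with 0 log 0 = 0."""
    return -sum(p * log(p) for p in probabilities if p > 0)


def eigenvalues_2x2(a, b, c):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2
    radius = sqrt(((a - c) / 2) ** 2 + b * b)
    return [mean + radius, mean - radius]


def von_neumann(density):
    """von Neumann entropy of a real symmetric 2x2 density matrix (a, b, c)."""
    return shannon(eigenvalues_2x2(*density))


signals = [(1.0, 0.0, 0.0), (0.5, 0.5, 0.5)]  # |0><0| and |+><+|
probs = [0.5, 0.5]

# Average density D = sum_i p_i D_i.
mixture = tuple(sum(p * d[k] for p, d in zip(probs, signals)) for k in range(3))

# Measuring E_1 = |0><0|, E_2 = |1><1| reads off the diagonal entries.
mus = [(d[0], d[2]) for d in signals]
mu = (mixture[0], mixture[2])

left = shannon(mu) - sum(p * shannon(m) for p, m in zip(probs, mus))
right = von_neumann(mixture) - sum(p * von_neumann(d) for p, d in zip(probs, signals))
print(left, right)
```

The left-hand side is the information the receiver extracts with this particular measurement, and the right-hand side is the bound in (18), which does not depend on the measurement.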
$\mu_i$ is a coarse-graining of $T(D_i)$, therefore inequality (18) is of the form
\[ \sum_i p_i S(R(D_i), R(D)) \le \sum_i p_i S(D_i, D) . \]
On the one hand, this form shows that the bound (18) is a consequence of the monotonicity, on the other hand, we can make an analysis of the equality. Since the states $D_i$ are the codes of the messages to be transmitted, it would be too much to assume that all of them are invertible. However, we may assume that D and T(D) are invertible. Under this hypothesis Lemma 4.1 applies and tells us that the equality in (18) implies that all the operators T(D), $T(D_i)$ and $E_j$ commute.

4.4. α-entropies

The α-divergence of the densities $D_1$ and $D_2$ is
\[ S_\alpha(D_1, D_2) = \frac{4}{1 - \alpha^2}\, \mathrm{Tr}\left( D_1 - D_1^{\frac{1+\alpha}{2}} D_2^{\frac{1-\alpha}{2}} \right) , \tag{19} \]
which is essentially
\[ \langle D_2^{1/2}, \Delta^{\frac{1+\alpha}{2}} D_2^{1/2} \rangle \]
up to constants in the notation of Sec. 2. The proof of the monotonicity works for this more general quantity with a small alteration. What we need is
\[ \langle D_2^{1/2}, \Delta^\beta D_2^{1/2} \rangle = \frac{\sin \pi\beta}{\pi} \int_0^\infty \left( -t^\beta \langle D_2^{1/2}, (\Delta + t)^{-1} D_2^{1/2} \rangle + t^{\beta-1} \right) dt \]
for $0 < \beta < 1$. Therefore for $0 < \alpha < 2$ the proof of the above Theorem 3.1 goes through for the α-entropies. The monotonicity holds for the α-entropies, moreover (i) and (ii) from Theorem 3.1 are necessary and sufficient for the equality. The role of the α-entropies is smaller than that of the relative entropy but they are used for approximation of the relative entropy and for some other purposes (see [9], for example).

5. Strong Subadditivity of Entropy and the Markov Property

The strong subadditivity is a crucial property of the von Neumann entropy; it follows easily from the monotonicity of the relative entropy. (The first proof of this property of entropy was given by Lieb and Ruskai [11] before Uhlmann’s monotonicity theorem.) The strong subadditivity property is related to the composition of three different systems. It is used, for example, in the analysis of the translation invariant states of quantum lattice systems: the proof of the existence of the global entropy density functional is based on the subadditivity, and a monotonicity property of local entropies is obtained by the strong subadditivity [20]. Consider three Hilbert spaces, $H_j$, j = 1, 2, 3 and a statistical operator $D_{123}$ on the tensor product $H_1 \otimes H_2 \otimes H_3$. This statistical operator has marginals on all subproducts, let $D_{12}$, $D_2$ and $D_{23}$ be the marginals on $H_1 \otimes H_2$, $H_2$
and $H_2 \otimes H_3$, respectively. (For example, $D_{12}$ is determined by the requirement $\mathrm{Tr}\, D_{123}(A_{12} \otimes I_3) = \mathrm{Tr}\, D_{12} A_{12}$ for every operator $A_{12}$ acting on $H_1 \otimes H_2$; $D_2$ and $D_{23}$ are similarly defined.) The strong subadditivity asserts the following:

$$S(D_{123}) + S(D_2) \le S(D_{12}) + S(D_{23}) . \tag{20}$$

In order to prove the strong subadditivity, one can start with the identities

$$S(D_{123}, \mathrm{tr}_{123}) = S(D_{12}, \mathrm{tr}_{12}) + S(D_{123}, D_{12} \otimes \mathrm{tr}_3) ,$$
$$S(D_2, \mathrm{tr}_2) + S(D_{23}, D_2 \otimes \mathrm{tr}_3) = S(D_{23}, \mathrm{tr}_{23}) ,$$

where tr with a subscript denotes the density of the corresponding tracial state, for example $\mathrm{tr}_{12} = I_{12} / \dim(H_1 \otimes H_2)$. From these equalities we arrive at a new one,

$$S(D_{123}, \mathrm{tr}_{123}) + S(D_2, \mathrm{tr}_2) = S(D_{12}, \mathrm{tr}_{12}) + S(D_{23}, \mathrm{tr}_{23}) + S(D_{123}, D_{12} \otimes \mathrm{tr}_3) - S(D_{23}, D_2 \otimes \mathrm{tr}_3) .$$

If we know that

$$S(D_{123}, D_{12} \otimes \mathrm{tr}_3) \ge S(D_{23}, D_2 \otimes \mathrm{tr}_3) , \tag{21}$$

then the strong subadditivity (20) follows. Define a linear transformation $T : B(H_1 \otimes H_2 \otimes H_3) \to B(H_2 \otimes H_3)$ as follows:

$$T(A \otimes B \otimes C) := B \otimes C\,(\mathrm{Tr}\, A) . \tag{22}$$

$T$ is completely positive and trace preserving. Moreover, $T(D_{123}) = D_{23}$ and $T(D_{12} \otimes \mathrm{tr}_3) = D_2 \otimes \mathrm{tr}_3$. Hence the monotonicity theorem gives (21). This proof is very transparent and makes the equality case visible. The equality in the strong subadditivity holds if and only if we have equality in (21). Note that $T$ is the partial trace over the first system and

$$T^*(B \otimes C) = I \otimes B \otimes C . \tag{23}$$
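The passage from (21) to (20) can be made explicit. The following worked step is only a sketch: it assumes the convention $S(D_1, D_2) = \mathrm{Tr}\, D_1(\log D_1 - \log D_2)$ for the relative entropy, and the shorthand $d_j = \dim H_j$ is introduced here for illustration.

```latex
% Assumed convention: S(D_1, D_2) = Tr D_1 (log D_1 - log D_2).
% Shorthand (not in the text): d_j := dim H_j.
\begin{align*}
S(D_{123}, \mathrm{tr}_{123}) &= \log(d_1 d_2 d_3) - S(D_{123}), \\
S(D_{123}, D_{12} \otimes \mathrm{tr}_3) &= S(D_{12}) - S(D_{123}) + \log d_3, \\
S(D_{23}, D_2 \otimes \mathrm{tr}_3) &= S(D_2) - S(D_{23}) + \log d_3 .
\end{align*}
% Substituting the last two lines into (21) gives
\[
S(D_{12}) - S(D_{123}) \ge S(D_2) - S(D_{23}),
\]
% which is the strong subadditivity (20) after rearranging.
```

Under this convention the $\log d_3$ terms cancel, so (21) and (20) are the same inequality written in two ways.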
Theorem 5.1. Assume that $D_{123}$ is invertible. The equality holds in the strong subadditivity (20) if and only if the following equivalent conditions hold:
(i) $D_{123}^{it} D_{12}^{-it} = D_{23}^{it} D_2^{-it}$ for all real $t$.
(ii) $\log D_{123} - \log D_{12} = \log D_{23} - \log D_2$.
Note that both conditions (i) and (ii) implicitly contain tensor products; all operators should be viewed in the three-fold product. Theorem 3.1 applies due to (23), and this is the proof. The meaning of conditions (i) and (ii) in Theorem 5.1 is not obvious. The easy choice is

$$\log D_{12} = H_1 + H_2 + H_{12}, \qquad \log D_{23} = H_2 + H_3 + H_{23}, \qquad \log D_2 = H_2$$
for a commutative family of self-adjoint operators $H_1$, $H_2$, $H_3$, $H_{12}$, $H_{23}$, and to define $\log D_{123}$ by condition (ii) itself. This example lives in an abelian subalgebra of $B(H_1 \otimes H_2 \otimes H_3)$, and a probabilistic representation can be given. $D_{123}$ may be regarded as the joint probability distribution of some random variables $\xi_1$, $\xi_2$ and $\xi_3$. In this language we can rewrite (i) in the form

$$\frac{\mathrm{Prob}(\xi_1 = x_1, \xi_2 = x_2, \xi_3 = x_3)}{\mathrm{Prob}(\xi_1 = x_1, \xi_2 = x_2)} = \frac{\mathrm{Prob}(\xi_2 = x_2, \xi_3 = x_3)}{\mathrm{Prob}(\xi_2 = x_2)} \tag{24}$$

or in terms of conditional probabilities

$$\mathrm{Prob}(\xi_3 = x_3 \mid \xi_1 = x_1, \xi_2 = x_2) = \mathrm{Prob}(\xi_3 = x_3 \mid \xi_2 = x_2) . \tag{25}$$
In this form one recognizes the Markov property for the variables $\xi_1$, $\xi_2$ and $\xi_3$; subscripts 1, 2 and 3 stand for "past", "present" and "future". It must be well-known that for classical random variables the equality case in the strong subadditivity of the entropy is equivalent to the Markov property. The equality

$$S(D_{123}) - S(D_{12}) = S(D_{23}) - S(D_2) \tag{26}$$

means an equality of entropy increments. Concerning the Markov property, see [2] or [14, pp. 200–203].

Theorem 5.2. Assume that $D_{123}$ is invertible. The equality holds in the strong subadditivity (20) if and only if there exists a completely positive unital mapping $\gamma : B(H_1 \otimes H_2 \otimes H_3) \to B(H_2 \otimes H_3)$ such that
(i) $\mathrm{Tr}(D_{123}\, \gamma(x)) = \mathrm{Tr}(D_{123}\, x)$ for all $x$.
(ii) $\gamma|_{B(H_2)} \equiv$ identity.

If $\gamma$ has properties (i) and (ii), then $\gamma^*(D_{23}) = D_{123}$ and $\gamma^*(D_2 \otimes \mathrm{tr}_3) = D_{12} \otimes \mathrm{tr}_3$ for its dual, and we have equality in (21). To prove the converse let

$$E(A \otimes B \otimes C) := B \otimes C\,(\mathrm{Tr}\, A / \dim H_1) , \tag{27}$$

which is completely positive and unital. Set

$$\gamma(\,\cdot\,) := D_{23}^{-1/2}\, E\bigl(D_{123}^{1/2} \,\cdot\, D_{123}^{1/2}\bigr)\, D_{23}^{-1/2} . \tag{28}$$
If the equality holds in the strong subadditivity, then property (i) from Theorem 3.1 is at our disposal and it gives $\gamma(x) = x$ for $x \in B(H_2)$. In a probabilistic interpretation, $E$ and $\gamma$ are conditional expectations. $E$ preserves the tracial state and it is a projection of norm one. $\gamma$ leaves the state with density $D_{123}$ invariant; however, it is not a projection. (Accardi and Cecchini called this $\gamma$ a generalised conditional expectation [1].) It is interesting to construct translation invariant states on the infinite tensor product of matrix algebras (that is, a quantum spin chain over $\mathbb{Z}$) such that condition (26) holds for all ordered subsystems 1, 2 and 3.
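The classical statement in this section, that equality in the strong subadditivity is equivalent to the Markov property (25), can be checked numerically on a small example. The sketch below is illustrative only: the distributions `p12`, `p3_given_2` and `non_markov` are hypothetical numbers chosen here, and Shannon entropy with the natural logarithm stands in for the von Neumann entropy of commuting densities.

```python
import math
from itertools import product

def entropy(dist):
    """Shannon entropy (natural log) of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

def marginal(joint, keep):
    """Marginal of a joint distribution over the coordinate positions in `keep`."""
    out = {}
    for x, p in joint.items():
        key = tuple(x[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def ssa_gap(joint):
    """S(12) + S(23) - S(123) - S(2): right side of (20) minus left side."""
    return (entropy(marginal(joint, (0, 1))) + entropy(marginal(joint, (1, 2)))
            - entropy(joint) - entropy(marginal(joint, (1,))))

# Hypothetical Markov chain xi1 -> xi2 -> xi3 on {0, 1}: the joint law is
# p(x1, x2) * p(x3 | x2), which is exactly condition (25).
p12 = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
p3_given_2 = {0: (0.7, 0.3), 1: (0.2, 0.8)}
markov = {(a, b, c): p12[(a, b)] * p3_given_2[b][c]
          for a, b, c in product((0, 1), repeat=3)}

# Hypothetical non-Markov law: xi3 copies xi1, while xi2 is independent noise.
non_markov = {(a, b, a): 0.25 for a, b in product((0, 1), repeat=2)}

markov_gap = ssa_gap(markov)          # vanishes up to rounding
non_markov_gap = ssa_gap(non_markov)  # equals log 2 for this example
```

The gap computed here is the classical conditional mutual information of $\xi_1$ and $\xi_3$ given $\xi_2$, which is why it vanishes for the Markov chain and is strictly positive otherwise.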
Acknowledgment The work was supported by the Hungarian OTKA T032662.
References
[1] L. Accardi and C. Cecchini, Conditional expectations in von Neumann algebras and a theorem of Takesaki, J. Funct. Anal. 45 (1982), 245–273.
[2] L. Accardi and A. Frigerio, Markovian cocycles, Proc. R. Ir. Acad. 83 (1983), 251–263.
[3] H. Araki, Relative entropy for states of von Neumann algebras, Publ. RIMS Kyoto Univ. 11 (1976), 809–833.
[4] H. Araki and T. Masuda, Positive cones and Lp-spaces for von Neumann algebras, Publ. RIMS Kyoto Univ. 18 (1982), 339–411.
[5] H. Araki, On an inequality of Lieb and Thirring, Lett. Math. Phys. 19 (1990), 167–170.
[6] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, 2000.
[7] S. Friedland and W. So, On the product of matrix exponentials, Lin. Alg. Appl. 196 (1994), 193–205.
[8] F. Hansen and G. K. Pedersen, Jensen's inequality for operator and Löwner's theorem, Math. Anal. 258 (1982), 229–241.
[9] H. Hasegawa and D. Petz, Non-commutative extension of information geometry II, in Quantum Communication, Computing and Measurement, eds. Hirota et al., Plenum Press, New York, 1997.
[10] A. S. Holevo, Information theoretical aspects of quantum measurement, Prob. Inf. Transmission USSR 9 (1973), 31–42.
[11] E. H. Lieb and M. B. Ruskai, Proof of the strong subadditivity of quantum mechanical entropy, J. Math. Phys. 14 (1973), 1938–1941.
[12] G. Lindblad, Completely positive maps and entropy inequalities, Comm. Math. Phys. 40 (1975), 147–151.
[13] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, 2000.
[14] M. Ohya and D. Petz, Quantum Entropy and Its Use, Springer-Verlag, Heidelberg, 1993.
[15] D. Petz, Quasi-entropies for finite quantum systems, Rep. Math. Phys. 23 (1986), 57–65.
[16] D. Petz, A dual in von Neumann algebras, Quart. J. Math. Oxford 35 (1984), 475–483.
[17] D. Petz, Sufficiency of channels over von Neumann algebras, Quart. J. Math. Oxford 39 (1988), 907–1008.
[18] D. Petz, A variational expression for the relative entropy, Commun. Math. Phys. 114 (1998), 345–348.
[19] D. Petz, A survey of trace inequalities, in Functional Analysis and Operator Theory, Banach Center Publications 30 (Warszawa 1994), pp. 287–298.
[20] D. Petz, Entropy density in quantum statistical mechanics and information theory, in Contributions in Probability, ed. C. Cecchini, Forum, Udine, 1996, pp. 221–226.
[21] M. B. Ruskai, Beyond strong subadditivity? Improved bounds on the contraction of generalized relative entropy, Rev. Math. Phys. 6 (1994), 1147–1161.
[22] M. B. Ruskai, Inequalities for quantum entropy: A review with conditions with equality, quant-ph/0205064 (2002).
[23] A. Uhlmann, Relative entropy and the Wigner-Yanase-Dyson-Lieb concavity in an interpolation theory, Commun. Math. Phys. 54 (1977), 21–32.
[24] H. Umegaki, Conditional expectations in an operator algebra IV (entropy and information), Kodai Math. Sem. Rep. 14 (1962), 59–85.
April 11, 2003 14:43 WSPC/148-RMP
00160
Reviews in Mathematical Physics, Vol. 15, No. 2 (2003) 93–198 © World Scientific Publishing Company
EQUILIBRIUM STATISTICAL MECHANICS OF FERMION LATTICE SYSTEMS
HUZIHIRO ARAKI
Research Institute for Mathematical Sciences, Kyoto University
Kitashirakawa-Oiwakecho, Sakyoku, Kyoto 606-8502, Japan

HAJIME MORIYA
Institute of Particle and Nuclear Studies
High Energy Accelerator Research Organization (KEK)
1-1 Oho, Tsukuba, Ibaraki, 305-0801, Japan

Received 1 July 2002
Revised 30 November 2002

We study equilibrium statistical mechanics of Fermion lattice systems, which require a different treatment compared with spin lattice systems due to the non-commutativity of local algebras for disjoint regions. Our major result is the equivalence of the KMS condition and the variational principle with a minimal assumption for the dynamics and without any explicit assumption on the potential. Its proof applies to spin lattice systems as well, yielding a vast improvement over known results. All formulations are in terms of a C∗-dynamical system for the Fermion (CAR) algebra A with all or a part of the following assumptions:
(I) The interaction is even, namely, the dynamics αt commutes with the even-oddness automorphism Θ. (Automatically satisfied when (IV) is assumed.)
(II) The domain of the generator δα of αt contains the set A◦ of all strictly local elements of A.
(III) The set A◦ is the core of δα.
(IV) The dynamics αt commutes with the lattice translation automorphism group τ of A.
A major technical tool is the conditional expectation from A onto its C∗-subalgebras A(I) for any subset I of the lattice, which induces a system of commuting squares. This technique overcomes the lack of tensor product structures for Fermion systems and even simplifies many known arguments for spin lattice systems. In particular, this tool is used for obtaining the isomorphism between the real vector space of all ∗-derivations with their domain A◦, commuting with Θ, and that of all Θ-even standard potentials which satisfy a specific norm convergence condition for the one point interaction energy. This makes it possible to associate a unique standard potential to every dynamics satisfying (I) and (II). The convergence condition for the potential is a consequence of its definition in terms of the ∗-derivation and not an additional assumption. If translation invariance is imposed on ∗-derivations and potentials, then the isomorphism is kept and the space of translation covariant standard potentials becomes a separable Banach space with respect to the norm of the one point interaction energy.
This is a crucial basis for an application of convex analysis to the equivalence proof in the major result. Everything goes in parallel for spin lattice systems without the evenness assumption (I).
Contents
1. Introduction
2. Conditional Expectations
   2.1. Basic properties
   2.2. Geometrical lemma
   2.3. Commuting square
3. Entropy and Relative Entropy
   3.1. Definitions
   3.2. Monotone property
   3.3. Strong subadditivity
4. Fermion Lattice Systems
   4.1. Fermion algebra
   4.2. Product property of the tracial state
   4.3. Conditional expectations for Fermion algebras
   4.4. Commuting squares for Fermion algebras
   4.5. Commutants of subalgebras
5. Dynamics
   5.1. Assumptions
   5.2. Local Hamiltonians
   5.3. Internal energy
   5.4. Potential
   5.5. General potential
6. KMS Condition
   6.1. KMS condition
   6.2. Differential KMS condition
7. Gibbs Condition
   7.1. Inner perturbation
   7.2. Surface energy
   7.3. Gibbs condition
   7.4. Equivalence to KMS condition
   7.5. Product form of the Gibbs condition
8. Translation Invariant Dynamics
   8.1. Translation invariance and covariance
   8.2. Finite range potentials
9. Thermodynamic Limit
   9.1. Surface energy estimate
   9.2. Pressure
   9.3. Mean energy
10. Entropy for Fermion Systems
   10.1. SSA for Fermion systems
   10.2. Mean entropy
   10.3. Entropy inequalities for translation invariant states
11. Variational Principle
   11.1. Extension of even states
   11.2. Variational inequality
   11.3. Variational equality
   11.4. Variational principle
12. Equivalence of Variational Principle and KMS Condition
   12.1. Variational principle from Gibbs condition
   12.2. Some tools of convex analysis
   12.3. Differential KMS condition from variational principle
13. Use of Entropy in the Variational Equality
   13.1. CNT-entropy
   13.2. Variational equality in terms of CNT-entropy
14. Discussion
Appendix: Van Hove Limit
   A.1. Van Hove net
   A.2. Van Hove limit
References
1. Introduction We investigate the equilibrium statistical mechanics of Fermion lattice systems. While equilibrium statistical mechanics of spin lattice systems has been well studied (see e.g. [17], [23] and [40]), there is a crucial difference between spin and Fermion cases. Namely, local algebras for disjoint regions commute elementwise for spin lattice systems, but do not commute for Fermion lattice systems. Due to this difference, the known formulations and proof in the case of spin lattice systems do not necessarily go over to the case of Fermion lattice systems and that is the motivation for this investigation. An example of a Fermion lattice system is the well-studied Hubbard model, to which our results apply. It turned out that, in the matter of the equivalence of the KMS condition and the variational principle (i.e. the minimum free energy) for translation invariant states, we obtain its proof without any explicit assumption on the potential except for the condition that it is the standard potential corresponding to a translation invariant even dynamics, a minimal condition for a proper formulation of the problem. Without any change in the methods of proof, this strong result holds for spin lattice systems as well — a vast improvement over known results for spin lattice systems and a solution of a problem posed by Bratteli and Robinson (Remark after Theorem 6.2.42. [17]). In addition to this major result, we hope that the present work supplies a general mathematical foundation for equilibrium statistical mechanics of Fermion lattice systems, which was lacking so far. There are two distinctive features of our approach. One feature is the central role of the time derivative (i.e. the generator of the dynamics). On one hand, this enables us to deal with all types of potentials without any explicit conditions on their long range or many body behavior, as long as the first time derivative of strictly localized operators can be defined. 
On the other hand, the existence of the dynamics for a given potential is separated from the problems treated here and we can bypass that existence problem via Assumption (III) below.
Another feature is the use of conditional expectations instead of the tensor product structure traditionally used for spin lattice systems. They provide not only a substitute tool (for the tensor product structure), which is applicable to both spin and Fermion lattice systems, but also a method of estimates which does not use the norm of individual potentials, for which we do not impose any explicit condition.

The main subject of our paper is the characterization of equilibrium states in terms of the KMS condition and the variational principle, which have an entirely different appearance but are shown to be equivalent. They refer to canonical ensembles in the infinite volume limit. However, they also refer to grand canonical ensembles if the dynamics is modified by gauge transformations with respect to Fermion numbers [11]. Namely, in the language of potentials, we may add a one-body potential, which consists of the particle number operator(s) times c-number chemical potential(s). The canonical ensemble for the so-modified potential is then the grand canonical ensemble for the original potential, so that the grand canonical ensemble can be studied as a canonical ensemble for a modified potential, which is within the scope of our theory.

For the sake of notational simplicity, our presentation is for the case of one Fermion at each lattice site. Our results and proofs hold without any essential change for the more general case where a finite number of Fermions and finite spins coexist at each lattice site. The even-oddness in that case refers to the total Fermion number. For example, for the Hubbard model, there are two Fermions at each lattice site, representing the two components of a spin 1/2 Fermion.
Our starting point is a C∗-dynamical system $(\mathcal{A}, \alpha_t)$, where $\mathcal{A}$ is the C∗-algebra of Fermion creation and annihilation operators on lattice sites of $\mathbb{Z}^\nu$ with local subalgebras $\mathcal{A}(I)$ for finite subsets $I \subset \mathbb{Z}^\nu$ and $\alpha_t$ is a given strongly continuous one-parameter group of ∗-automorphisms of $\mathcal{A}$.

Since the normal starting point in statistical mechanics is a potential, a digression on our formulation and strategy starting from a given dynamics may be appropriate at this point. The KMS condition, which is formulated in terms of the dynamics, is one of the two main components of our equivalence result. The variational principle, which is formulated in terms of the potential, is the other main component. Therefore both dynamics and potential are indispensable for our main results and their mutual relation is of utmost importance. The key equation for that relation is the following formula. For any operator $A$ localized in a finite subset $I$ of the lattice, its time derivative is given by

$$\frac{d}{dt}\,\alpha_t(A) = \alpha_t\bigl(i[H(I), A]\bigr) ,$$

where $H(I)$ is described as a sum of potentials $\Phi(J)$, each based on a finite subset $J$ of the lattice, the sum being over all $J$ except those $J$ for which $\Phi(J)$ commutes with every $A$ localized in $I$; thus $H(I)$ depends on $I$.

The problem of construction of $\alpha_t$ from a given class of potentials is not a straightforward task and has been studied by many people. As a result, a large
number of results are known for quantum spin lattice systems (see e.g. [17]) and most of them can be applied to Fermion lattice systems. There are also some specific analyses for Fermion lattice systems (see e.g. [29]). In parallel, the equivalence of the KMS condition and the variational principle for translation invariant states has been proved for a wide class of potentials for quantum spin lattice systems. The same proof also works for Fermion lattice systems in most cases; for example this is the case for finite range potentials (see e.g. p. 113 of [30]). While these results cover a wide range of explicit models, it seems difficult to decide exactly which class of potentials determine a dynamics and to show the equivalence in question in most general cases (which is not explicitly known) from the potential point of view. In the present work, we do not intend to make any contribution to the problem of either construction of a dynamics from a potential, or giving a complete criterion for potentials, which give rise to a unique dynamics. (Thus we do not directly contribute to the study of explicit models.) On the contrary, we avoid these difficult problems by assuming that the dynamics is already given (since this is needed in any case for the KMS condition) and prove the equivalence result in question under minimal (general) assumptions on the dynamics, explained immediately below. Note that we do not make any explicit assumptions about the existence of a potential for a given dynamics nor about its property (such as the absolute convergence of the sum defining H(I) in terms of the potential). For any given dynamics, for which all finitely localized operators have the time derivative at t = 0 (Assumption (II) below) and which is lattice translation invariant (Assumption (IV) below), we show the existence of a corresponding potential, of which H(I) is a sum (as in usual formulation) convergent in a well-defined sense. 
We now explain our assumptions and interconnection of dynamics with potentials in more detail. The following two assumptions make it possible to associate a potential to any given dynamics satisfying them. (I) The dynamics is even. In other words, αt Θ = Θ αt for any t ∈ R, where Θ is an involutive automorphism of A, multiplying −1 on all creation and annihilation operators. (II) The domain D(δα ) of the generator δα of αt includes A◦ , the union of all A(I) for all finite subsets I of the lattice. It should be noted that Assumption (I) follows from Assumption (IV) below. (See Proposition 8.1.) We denote by ∆(A◦ ) the set of all ∗-derivations with A◦ as their domain and their values in A, commuting with Θ (on A◦ ). Then the generator δα of our αt , when restricted to A◦ , belongs to ∆(A◦ ). It is shown that ∆(A◦ ) is in one-to-one correspondence with the set P of standard even potentials, which are functionals Φ(I) of all finite subsets I of the lattice
with values in the self-adjoint Θ-even part of the local algebra $\mathcal{A}(I)$, satisfying our standardness condition and a topological convergence condition (Theorem 5.13).

The topological convergence condition ((Φ-e) in Definition 5.10) is required in order that the potential is associated with a ∗-derivation on $\mathcal{A}_\circ$ and refers to the convergence of the interaction energy operator for every finite subset $I$,

$$H(I) = \sum_{K} \bigl\{ \Phi(K) ;\ K \cap I \neq \emptyset \bigr\} ,$$

where a finite sum is first taken over $K$ contained in a finite subset $J$ and the limit of $J$ tending to the whole lattice is to converge in the norm topology of $\mathcal{A}$. (If this condition is satisfied for every one-point set $I = \{n\}$ ($n \in \mathbb{Z}^\nu$), then it is satisfied for all finite subsets $I$.) Note the difference from conventional topological conditions, such as summability of $\|\Phi(I)\|$ over all $I$ containing a point $n$, which are assumed for the sake of mathematical convenience.

For $\Phi \in \mathcal{P}$, the internal energy $U(I)$ and the surface energy $W(I)$ are also given in terms of Φ by the conventional formulae for every finite $I$. The connection of the derivation δ and the corresponding potential Φ is given by

$$\delta A = i[H(I), A] \qquad (A \in \mathcal{A}(I)) .$$
Due to the Θ-evenness assumption (I), the replacement of H(I) by H(K) with K ⊃ I gives the same δ on A(I), a necessary condition for consistency. The standardness ((Φ-d) in Definition 5.10) is formulated in terms of conditional expectations and picks up a unique potential for each δ ∈ ∆(A◦ ). Without the standardness condition, there are many different potentials (called equivalent potentials) which yield exactly the same δ through the above formulae. Through the one-to-one correspondence between δ(∈ ∆(A◦ )) and Φ(∈ P), any dynamics αt satisfying our standing assumptions (I) and (II) is associated with a unique standard potential Φ ∈ P. This is a crucial point of our formulation, leading to our major result. When we want to derive a statement involving αt from a condition involving the potential Φ, we need the following assumption, guaranteeing the unique determination of αt from the given Φ: (III) A◦ is the core of the generator δα of the dynamics αt . For the discussion of variational principle, we need the translation invariance assumption for the dynamics: (IV) αt τk = τk αt , where τk , k ∈ Zν , is the automorphism group of A representing the lattice translations. The above Assumptions (I)–(IV) are the only assumptions needed for our theory below. On the other hand, if a potential Φ (say, in the class P) is first given for any model, it is a hard problem in general to show that the corresponding derivation
$\delta_\Phi \in \Delta(\mathcal{A}_\circ)$ is given by some dynamics satisfying Assumptions (II) and (III), or equivalently that the closure of $\delta_\Phi$ is a generator of a dynamics (i.e. it can be exponentiated to a one-parameter group of automorphisms of $\mathcal{A}$).

We now present our main theorem after the explanation about the variational principle and its ingredients. The set $\mathcal{P}_\tau$ of all translation covariant potentials in $\mathcal{P}$ forms a Banach space (Proposition 8.8) with respect to the norm

$$\|\Phi\| \equiv \|H(\{n\})\| ,$$

which is independent of the lattice point $n$. The finite range potentials are shown to be dense in $\mathcal{P}_\tau$ with respect to this norm and to imply separability of $\mathcal{P}_\tau$ (Theorem 8.12 and Corollary 8.13). In terms of this norm, we obtain the energy estimate

$$\|U(I)\| \le \|H(I)\| \le \|\Phi\| \cdot |I| ,$$

where $|I|$ is the cardinality of $I$ (Lemma 8.6). Then the conventional estimate for $W(I)$ follows. These estimates are used to show the existence of the thermodynamic functionals, such as the pressure $P(\Phi)$ and the mean energy $e_\Phi(\omega)$. All these estimates are carried out by the technique of conditional expectations without using the norm of the individual $\Phi(I)$.

For any state ω of $\mathcal{A}$, its local entropy $S_{\mathcal{A}(I)}(\omega) = S(\omega|_{\mathcal{A}(I)})$ is given as usual by the von Neumann entropy $S(\cdot)$. Due to the non-commutativity of local algebras for disjoint regions, not all known properties of entropy for spin lattice systems hold for our Fermion case [33]. However, the strong subadditivity of entropy (SSA) for Fermion systems holds. Then the existence of the mean entropy $s(\omega)$ for any translation invariant state ω for Fermion lattice systems follows by a known method of spin lattice systems.

The variational principle refers to the following equation for a translation invariant state φ of $\mathcal{A}$ for a given translation covariant potential $\Phi \in \mathcal{P}_\tau$ and $\beta \in \mathbb{R}$:

$$P(\beta\Phi) = s(\varphi) - \beta e_\Phi(\varphi) . \tag{1.1}$$
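Equation (1.1) has a familiar finite-dimensional counterpart, which may help in reading it. The sketch below is not the infinite-volume statement of the paper: it assumes a single classical system with hypothetical energy levels `energies`, takes the logarithm of the partition function as the analogue of the pressure, and checks that the Gibbs distribution is the one for which entropy minus β times mean energy reaches that value.

```python
import math

def pressure(energies, beta):
    """Finite-system analogue of P(beta * Phi): log of the partition function."""
    return math.log(sum(math.exp(-beta * e) for e in energies))

def free_functional(probs, energies, beta):
    """Entropy minus beta times mean energy, the analogue of the right side of (1.1)."""
    s = -sum(p * math.log(p) for p in probs if p > 0)
    e = sum(p * en for p, en in zip(probs, energies))
    return s - beta * e

def gibbs(energies, beta):
    """Gibbs distribution with weights exp(-beta * energy)."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

energies = [0.0, 1.0, 2.5]     # hypothetical energy levels
beta = 0.8                     # hypothetical inverse temperature
p_gibbs = gibbs(energies, beta)
p_other = [1 / 3, 1 / 3, 1 / 3]  # any other distribution, here the uniform one

# The gap is the relative entropy of the trial distribution with respect to
# the Gibbs distribution: zero for the Gibbs state, positive otherwise.
gap_gibbs = pressure(energies, beta) - free_functional(p_gibbs, energies, beta)
gap_other = pressure(energies, beta) - free_functional(p_other, energies, beta)
```

In this toy setting the Gibbs distribution is the unique maximizer; in the lattice setting of the paper the corresponding quantities are densities per site and the solutions of (1.1) need not be unique.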
Our major result can be formulated as the following two theorems. Theorem A. Under Assumptions (II) and (IV) for the dynamics αt , any translation invariant state, which satisfies the KMS condition for αt at the inverse temperature β, is a solution of Eq. (1.1), where Φ is the unique standard potential corresponding to αt . Theorem B. Under Assumptions (II), (III) and (IV) for the dynamics αt , any solution ϕ of (1.1) satisfies the KMS condition for αt at β. Remark. These two theorems hold also for spin lattice systems. We now present an over-all picture of the proof of our main results above. The proof of Theorem A and Theorem B will be carried out through the following steps:
(1) KMS condition ⇒ Gibbs condition.
(2) Gibbs condition ⇒ Variational principle.
(3) Variational principle ⇒ dKMS condition on A◦.
(4) dKMS condition on A◦ ⇒ dKMS condition on D(δα).
(5) dKMS condition on D(δα) ⇒ KMS condition.
Assumptions (I) and (II) are used throughout (1)–(5). Assumption (IV) is used for the formulation of the variational principle and necessarily for (2) and (3). It is also used to derive Assumption (I), which is not included in the premise of Theorems A and B. Assumption (III) is used only for (4). The differential KMS (abbreviated as dKMS) condition in (4) and (5) refers to a known condition, which is entirely described in terms of the generator δα of αt and without use of αt (Definition 6.3). This condition on the full domain D(δα ) of the generator δα of αt is known to be equivalent to the KMS condition (which is Step (5)). The differential KMS condition for our purpose is the condition for the restriction of δα to A◦ . Thus we need to show Step (4) using the additional assumption (III) on αt . For Steps (1) and (2), we follow the proof for spin lattice systems in principle. However, the Gibbs condition for Fermion lattice systems requires a careful definition. We define the Gibbs condition for a state ϕ as the requirement that the local algebra A(I) is in the centralizer of the perturbed functional ϕβH(I) , which is obtained from ϕ by a perturbation βH(I), for each finite subset I of the lattice (Definition 7.1 and Lemma 7.2). When A(I) and A(Ic ) commute (as in the case of spin lattice systems), this condition reduces to the product type characterization which was introduced and called the Gibbs condition by Araki and Ion for quantum spin lattice systems [5]. With our definition of the Gibbs condition, we have been able to prove Steps (1) and (2). The product type characterization mentioned above is the condition that ϕβH(I) is the product of the tracial state of A(I) and its restriction to the complement algebra A(Ic ). In the present case of Fermion lattice systems, we show that a Gibbs state satisfies this condition if and only if it is an even state of A (Proposition 7.7). The same kind of formulation and result are valid for a perturbation βW (I). 
For Step (3) as well as for the proof of the variational equality

$$P(\beta\Phi) = \sup_{\omega \in \mathcal{A}^{*\,\tau}_{+,1}} \bigl\{ s(\omega) - \beta e_\Phi(\omega) \bigr\} , \tag{1.2}$$
which is crucial for the variational principle, we need a product state of local Gibbs states. For this purpose, we have a technical result about the existence of a joint extension from states of local algebras for disjoint subsets of the lattice to a state of the algebra for their union, which holds if all the individual states, with the possible exception of one, are even (Theorem 11.2).
As an aside, the converse of Step (1) is shown under Assumptions (I), (II) and (III) (Theorem 7.6).

A major tool of our analysis is the C∗-algebra conditional expectation $E_I : \mathcal{A} \to \mathcal{A}(I)$ with respect to the unique tracial state τ of $\mathcal{A}$. Its existence is shown not only for finite subsets but for all subsets $I$ of the lattice (Theorem 4.7). Based on the product property of τ for subalgebras $\mathcal{A}(I)$ and $\mathcal{A}(J)$ for disjoint $I$ and $J$, we obtain the following commuting square of C∗-subalgebras (Theorem 4.13) for Fermion systems. (It holds trivially for spin systems.)

$$\begin{array}{ccc}
\mathcal{A}(I \cap J) & \xrightarrow{\ E_I\ } & \mathcal{A}(I) \\
\Big\downarrow {\scriptstyle E_J} & & \Big\downarrow {\scriptstyle E_{I \cap J}} \\
\mathcal{A}(J) & \xrightarrow{\ E_{I \cap J}\ } & \mathcal{A}(I \cap J) .
\end{array}$$
This serves as a replacement for the tensor-product structure in traditional arguments for spin lattice systems. As by-products, we obtain a few useful results on the CAR algebra: the even-odd automorphism Θ is shown to be outer for any infinite CAR algebra (Corollary 4.20), and formulae for commutants of $\mathcal{A}(I)$ and $\mathcal{A}(I)_+$ in $\mathcal{A}$ for finite and infinite $I$ are obtained (Theorem 4.17 and Theorem 4.19).

Some more results contained in this paper are as follows. We show the validity of the variational equality (1.2) when the Connes–Narnhofer–Thirring entropy $h_\omega(\tau)$ with respect to the group of lattice translation automorphisms τ is used in place of the mean entropy $s(\omega)$ (Theorem 13.2). Note that our system $(\mathcal{A}, \tau)$, where τ denotes the group of lattice translation automorphisms, does not belong to the class of C∗-systems considered in [34], being a non-abelian system.

We define general potentials as those which satisfy all conditions for those in $\mathcal{P}$ except for the standardness. They include all potentials satisfying the following condition:

$$\sum_{I \ni n} \|\Phi(I)\| < \infty \tag{1.3}$$
for every lattice point n. For each general potential, the corresponding H(I) and δ are defined and there is a unique standard potential in P with the same δ as a given general potential as described earlier. Restricting our attention to those general potentials satisfying (1.3) (a condition which is introduced also in some discussion of spin lattice systems), we are able to show by a straightforward argument that the set of solutions of variational principle for a general translation covariant potential satisfying (1.3) coincide with those for the equivalent standard potential (which is automatically translation covariant) (Remark 1 to Proposition 14.1), although the pressure and the mean energy may be different between the two potentials.
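2. Conditional Expectations

2.1. Basic properties

The following proposition is well-known (see, e.g. Proposition 2.36, Chapter V [43]).

Proposition 2.1. Let $\mathcal{M}$ be a von Neumann algebra with a faithful normal tracial state τ and $\mathcal{N}$ be its von Neumann subalgebra. Then there exists a unique conditional expectation

$$E^{\mathcal{M}}_{\mathcal{N}} : a \in \mathcal{M} \to E^{\mathcal{M}}_{\mathcal{N}}(a) \in \mathcal{N}$$

satisfying

$$\tau(ab) = \tau\bigl(E^{\mathcal{M}}_{\mathcal{N}}(a)\, b\bigr) \tag{2.1}$$

for any $b \in \mathcal{N}$.

Remark. A conditional expectation $E^{\mathcal{M}}_{\mathcal{N}}$ is linear, positive, unital, and satisfies

$$E^{\mathcal{M}}_{\mathcal{N}}(ab) = E^{\mathcal{M}}_{\mathcal{N}}(a)\, b , \qquad E^{\mathcal{M}}_{\mathcal{N}}(ba) = b\, E^{\mathcal{M}}_{\mathcal{N}}(a) , \tag{2.2}$$

for any $a \in \mathcal{M}$ and $b \in \mathcal{N}$, and

$$\|E^{\mathcal{M}}_{\mathcal{N}}\| = 1 . \tag{2.3}$$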
We shall obtain a C∗-version of this proposition for the Fermion algebra in Sec. 4, where $M$ and $N$ are C∗-algebras with a unique tracial state $\tau$. The main step of its proof is the existence of $E^M_N(a) \in N$ for every $a \in M$ satisfying (2.1). Once it is established, the map $E^M_N$ is a conditional expectation by standard argument, which we formulate for the sake of completeness as follows.

Lemma 2.2. Let $M$ be a unital C∗-algebra with a faithful tracial state $\tau$ and $N$ be its subalgebra containing the identity of $M$. Suppose that for every $a \in M$ there exists an element $E^M_N(a)$ of $N$ satisfying (2.1). Then the map $E^M_N$ from $M$ to $N$ is the unique conditional expectation from $M$ to $N$ with respect to $\tau$, possessing the following properties:
(1) $E^M_N$ is a linear, positive and unital map from $M$ onto $N$.
(2) For any $a \in M$ and $b \in N$,
\[ E^M_N(ab) = E^M_N(a)\,b , \qquad E^M_N(ba) = b\,E^M_N(a) . \]
(3) $E^M_N$ is a projection of norm 1.

Proof. First we prove the uniqueness of $E^M_N(a) \in N$ satisfying (2.1) for a given $a \in M$. Let $a'$ and $a''$ in $N$ satisfy (2.1), namely,
\[ \tau(ab) = \tau(a'b) = \tau(a''b) \]
for all $b \in N$. Then
\[ \tau(b(a' - a'')) = 0 . \]
By taking $b = (a' - a'')^*$ and using the faithfulness of $\tau$, we obtain $a' - a'' = 0$, hence the uniqueness of $E^M_N(a) \in N$ for each $a \in M$.

Except for the positivity, (1) and (2) can be shown in the same pattern as follows. Let $a = c_1 a_1 + c_2 a_2$ where $a_1, a_2 \in M$ and $c_1, c_2 \in \mathbb{C}$. Then for any $b \in N$,
\begin{align*}
\tau(ab) &= c_1 \tau(a_1 b) + c_2 \tau(a_2 b) = c_1 \tau(E^M_N(a_1) b) + c_2 \tau(E^M_N(a_2) b) \\
&= \tau(\{c_1 E^M_N(a_1) + c_2 E^M_N(a_2)\} b) .
\end{align*}
Since $c_1 E^M_N(a_1) + c_2 E^M_N(a_2) \in N$, the uniqueness already shown implies
\[ c_1 E^M_N(a_1) + c_2 E^M_N(a_2) = E^M_N(a) . \]
Therefore, $E^M_N$ is linear. In the same way, for any $a \in M$ and $b \in N$,
\[ \tau(abb') = \tau(E^M_N(a)\,bb') \]
holds for all $b' \in N$ and hence
\[ E^M_N(ab) = E^M_N(a)\,b . \]
Also
\[ \tau(bab') = \tau(ab'b) = \tau(E^M_N(a)\,b'b) = \tau(b\,E^M_N(a)\,b') \]
implies
\[ E^M_N(ba) = b\,E^M_N(a) . \]
If $a \in N$, then the identity $\tau(ab) = \tau(E^M_N(a) b)$ with $b \in N$ and the uniqueness result imply
\[ E^M_N(a) = a . \]
Therefore $E^M_N$ is a map onto $N$. By taking $a = 1\ (\in N)$, we have
\[ E^M_N(1) = 1 . \]
Hence $E^M_N$ is unital. Since $E^M_N(a) \in N$ for any $a \in M$, we have $E^M_N(E^M_N(a)) = E^M_N(a)$. Therefore $E^M_N$ is a projection.

To show the positivity of the map $E^M_N$, we consider the GNS triplet for the tracial state $\tau_N$ of $N$ (which is the restriction of $\tau$ to $N$) consisting of a Hilbert space $H^N_\tau$, a representation $\pi^N_\tau$ of $N$ on $H^N_\tau$ and a unit vector $\Omega^N_\tau \in H^N_\tau$, giving rise to the state $\tau_N(A) = \tau(A) = (\Omega^N_\tau, \pi^N_\tau(A)\Omega^N_\tau)$ for $A \in N$. If $a \in M$ and $a \ge 0$, then for $b \in N$
\begin{align*}
(\pi^N_\tau(b)\Omega^N_\tau, \pi^N_\tau(E^M_N(a))\pi^N_\tau(b)\Omega^N_\tau) &= \tau_N(b^* E^M_N(a) b) = \tau_N(E^M_N(a) bb^*) \\
&= \tau(abb^*) = \tau(b^*ab) \ge 0 .
\end{align*}
Since $\{\pi^N_\tau(b)\Omega^N_\tau ;\ b \in N\}$ is dense in $H^N_\tau$, we obtain
\[ \pi^N_\tau(E^M_N(a)) \ge 0 . \]
Since $\pi^N_\tau$ is faithful,
\[ E^M_N(a) \ge 0 , \]
and the positivity of $E^M_N$ is shown. For any $a \in M$, the faithfulness of $\pi^N_\tau$ implies
\begin{align*}
\|E^M_N(a)\| &= \|\pi^N_\tau(E^M_N(a))\| \\
&= \sup_{b_1, b_2 \in N} \{ |(\pi^N_\tau(b_1)\Omega^N_\tau, \pi^N_\tau(E^M_N(a))\pi^N_\tau(b_2)\Omega^N_\tau)| ;\ \|\pi^N_\tau(b_1)\Omega^N_\tau\| \le 1,\ \|\pi^N_\tau(b_2)\Omega^N_\tau\| \le 1 \} \\
&= \sup_{b_1, b_2 \in N} \{ |\tau(b_1^* E^M_N(a) b_2)| ;\ \tau(b_1^* b_1) \le 1,\ \tau(b_2^* b_2) \le 1 \} \\
&= \sup_{b_1, b_2 \in N} \{ |\tau(b_1^* a b_2)| ;\ \tau(b_1^* b_1) \le 1,\ \tau(b_2^* b_2) \le 1 \} \\
&= \sup_{b_1, b_2 \in N} \{ |(\pi^M_\tau(b_1)\Omega^M_\tau, \pi^M_\tau(a)\pi^M_\tau(b_2)\Omega^M_\tau)| ;\ \|\pi^M_\tau(b_1)\Omega^M_\tau\| \le 1,\ \|\pi^M_\tau(b_2)\Omega^M_\tau\| \le 1 \} \\
&\le \|\pi^M_\tau(a)\| = \|a\| , \tag{2.4}
\end{align*}
where we have used the cyclicity of $\pi^N_\tau(N)$ for $H^N_\tau$ for the second equality,
\[ \tau(b_1^* E^M_N(a) b_2) = \tau(E^M_N(a) b_2 b_1^*) = \tau(a b_2 b_1^*) = \tau(b_1^* a b_2) \]
for the fourth equality, and the same computation backwards replacing $N$ by $M$ for the fifth equality. Due to $E^M_N(1) = 1$ and (2.4), we have
\[ \|E^M_N\| = 1 . \]
We have completed the proof.

2.2. Geometrical lemma

Let us consider finite type I factors (i.e. full matrix algebras) $M$ and $N$ such that $M \supset N$. We have the isomorphisms $M \simeq N \otimes N_1$, $N \simeq N \otimes 1$, and $\tau = \tau_N \otimes \tau_{N_1}$, where $N_1 \equiv M \cap N'$ is a finite type I factor. A conditional expectation satisfying (2.1) is given by the slice map:
\[ E^M_N(b b_1) = \tau(b_1)\, b \qquad (b \in N,\ b_1 \in N_1) . \tag{2.5} \]
We give this $E^M_N$ a geometrical picture which we find useful. We introduce the following inner product on $M$:
\[ \langle a, b \rangle \equiv \tau(a^* b) , \qquad (a, b \in M) . \]
$M$ is then a (finite-dimensional) Hilbert space with this inner product. Let $P^M_N$ be the orthogonal projection onto the subspace $N$ of $M$. We show that $P^M_N$ is the same as $E^M_N$ as a map $M \to N$.

Lemma 2.3. With the notation above,
\[ P^M_N a = E^M_N(a) \tag{2.6} \]
for any $a \in M$.

Proof. Any $a \in M$ can be decomposed as $a = P^M_N a + a'$ where $a' \in N^\perp$. For any $b \in N$, we have $b^* \in N$ and hence
\begin{align*}
\tau(ab) &= \langle b^*, a \rangle = \langle b^*, P^M_N a \rangle + \langle b^*, a' \rangle \\
&= \langle b^*, P^M_N a \rangle = \tau((P^M_N a) b) .
\end{align*}
Since $P^M_N a \in N$, it follows from Proposition 2.1 that
\[ P^M_N a = E^M_N(a) . \]
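As an informal numerical illustration of (2.1), (2.5) and Lemma 2.3, the following minimal sketch takes $M = M_2(\mathbb{C}) \otimes M_2(\mathbb{C})$ and $N = M_2(\mathbb{C}) \otimes 1$. The concrete matrix model, the helper names (`embed`, `slice_map`, `random_matrix`) and the random test matrices are assumptions made only for this example; they are not taken from the paper.

```python
import random

def matmul(x, y):
    """Product of two square matrices stored as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(x):
    """Conjugate transpose."""
    n = len(x)
    return [[x[j][i].conjugate() for j in range(n)] for i in range(n)]

def tau(x):
    """Normalized trace, playing the role of the tracial state."""
    return sum(x[i][i] for i in range(len(x))) / len(x)

def embed(c):
    """Embed a 2x2 matrix c as c (x) 1 in M_2 (x) M_2 (hypothetical helper)."""
    return [[c[r // 2][s // 2] if r % 2 == s % 2 else 0j for s in range(4)]
            for r in range(4)]

def slice_map(a):
    """Slice map (2.5): normalized partial trace over the second tensor factor."""
    c = [[(a[2 * i][2 * j] + a[2 * i + 1][2 * j + 1]) / 2 for j in range(2)]
         for i in range(2)]
    return embed(c)

def random_matrix(n, rng):
    return [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
            for _ in range(n)]

rng = random.Random(0)
a = random_matrix(4, rng)           # arbitrary element of M
b = embed(random_matrix(2, rng))    # arbitrary element of N
ea = slice_map(a)

# Defect in tau(ab) = tau(E(a) b), i.e. in (2.1).
defect_21 = abs(tau(matmul(a, b)) - tau(matmul(ea, b)))

# Defect in <b, a - E(a)> = 0, i.e. a - E(a) is orthogonal to N (Lemma 2.3).
residual = [[a[r][s] - ea[r][s] for s in range(4)] for r in range(4)]
defect_orth = abs(tau(matmul(adjoint(b), residual)))

# Defect in E(E(a)) = E(a) (projection property).
eea = slice_map(ea)
defect_idem = max(abs(eea[r][s] - ea[r][s]) for r in range(4) for s in range(4))
```

All three defects should vanish up to floating-point rounding, which is consistent with the slice map being the orthogonal projection onto $N$ for the inner product $\langle a, b \rangle = \tau(a^* b)$.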
2.3. Commuting square

We introduce the following equivalent conditions for a commuting square. (See e.g. [21].)

Proposition 2.4. Let $M$, $N_1$, $N_2$ and $P$ be finite type I factors satisfying
\[ M \supset N_1 \supset P , \qquad M \supset N_2 \supset P . \]
Then the following conditions are equivalent:
(1) $E^M_{N_1}|_{N_2} = E^{N_2}_P$.
(2) $E^M_{N_2}|_{N_1} = E^{N_1}_P$.
(3) $E^M_{N_1} E^M_{N_2} = E^M_{N_2} E^M_{N_1}$ and $P = N_1 \cap N_2$.
(4) $E^M_{N_1} E^M_{N_2} = E^M_P$.
(5) $E^M_{N_2} E^M_{N_1} = E^M_P$.

Proof. (1) ⇔ (4): Assume (1). Let $a \in M$ and $b \in P$. By the assumption, we have
\[ E^M_{N_1}(E^M_{N_2}(a)) = E^{N_2}_P(E^M_{N_2}(a)) \in P \]
due to $E^M_{N_2}(a) \in N_2$. On the other hand,
\begin{align*}
\tau(E^M_{N_1}(E^M_{N_2}(a))\, b) &= \tau(E^M_{N_2}(a)\, b) && (\text{due to } b \in (P \subset) N_1) \\
&= \tau(ab) && (\text{due to } b \in (P \subset) N_2) .
\end{align*}
Hence $E^M_{N_1}(E^M_{N_2}(a)) = E^M_P(a)$ and so $E^M_{N_1} E^M_{N_2} = E^M_P$. The converse is obvious: for $a \in N_2$, (4) implies
\[ E^{N_2}_P(a) = E^M_P(a) = E^M_{N_1} E^M_{N_2}(a) = E^M_{N_1}(a) \]
and hence (1).

(2) ⇔ (5): Exactly the same proof as above, with $N_1$ and $N_2$ interchanged.

(4) ⇔ (3): Assume (4). By Lemma 2.3, (4) implies
\[ P^M_{N_1} P^M_{N_2} = P^M_P . \]
Taking adjoints, we obtain
\[ P^M_{N_2} P^M_{N_1} = P^M_P . \]
This implies
\[ E^M_{N_2} E^M_{N_1} = E^M_P = E^M_{N_1} E^M_{N_2} , \]
the last equality being due to (4). Due to $N_1 \supset P$ and $N_2 \supset P$, we have $P \subset N_1 \cap N_2$. If $b \in N_1 \cap N_2$, then
\[ b = E^M_{N_1} E^M_{N_2}(b) = E^M_P(b) \in P \]
by (4). Hence $P = N_1 \cap N_2$. This completes the proof of (4) ⇒ (3).

Assume (3). For any $a \in M$, (3) implies $E^M_{N_1}(E^M_{N_2}(a)) = E^M_{N_2}(E^M_{N_1}(a)) \in N_1 \cap N_2 = P$ because the range of $E^M_{N_1}$ is $N_1$ and the range of $E^M_{N_2}$ is $N_2$. For any $b \in P$ and $a \in M$,
\[ \tau(E^M_{N_1}(E^M_{N_2}(a))\, b) = \tau(E^M_{N_2}(a)\, b) = \tau(ab) . \]
Hence $E^M_{N_1}(E^M_{N_2}(a)) = E^M_P(a)$. This implies (4).

(5) ⇔ (3): Exactly the same proof as above, with $N_1$ and $N_2$ interchanged.
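A simple tensor-product instance may help to fix ideas. The following worked example is only a sketch: the choice of three $2 \times 2$ factors and the symbols $x, y, z$ are assumptions made for illustration and do not appear in the paper. The expectations are computed from the slice map (2.5) on elementary tensors.

```latex
% Assumed setting: M = M_2 \otimes M_2 \otimes M_2 with tracial state \tau, and
%   N_1 = M_2 \otimes M_2 \otimes 1, \quad N_2 = 1 \otimes M_2 \otimes M_2, \quad
%   P = 1 \otimes M_2 \otimes 1 = N_1 \cap N_2 .
\begin{align*}
E^M_{N_1}(x \otimes y \otimes z) &= \tau(z)\, x \otimes y \otimes 1 , \\
E^M_{N_2}(x \otimes y \otimes z) &= \tau(x)\, 1 \otimes y \otimes z , \\
E^M_{N_1} E^M_{N_2}(x \otimes y \otimes z)
  &= \tau(x)\, E^M_{N_1}(1 \otimes y \otimes z)
   = \tau(x)\tau(z)\, 1 \otimes y \otimes 1 \\
  &= E^M_P(x \otimes y \otimes z)
   = E^M_{N_2} E^M_{N_1}(x \otimes y \otimes z) .
\end{align*}
```

By linearity this gives conditions (3), (4) and (5) of Proposition 2.4 in this particular case.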
3. Entropy and Relative Entropy

3.1. Definitions

We introduce some definitions and related lemmas needed for formulation of the main result of this section.

Lemma 3.1. Let $M$ be a finite type I factor.
(i) Let $\varphi$ be a positive linear functional on $M$. Then there exists a unique $\hat\rho_\varphi \in M_+$ (called adjusted density matrix) satisfying
\[ \varphi(a) = \tau(\hat\rho_\varphi\, a) \]
for all $a \in M$.
(ii) Let $N$ be a subfactor of $M$ and $\varphi_N$ be the restriction of $\varphi$ to $N$. Then
\[ \hat\rho_{\varphi_N} = E^M_N(\hat\rho_\varphi) . \]
Proof. (i) is well-known.
(ii) For $b \in N$, $\varphi_N(b) = \varphi(b) = \tau(\hat\rho_\varphi b) = \tau(E^M_N(\hat\rho_\varphi) b)$. Since $E^M_N(\hat\rho_\varphi) \in N_+$, we have
\[ \hat\rho_{\varphi_N} = E^M_N(\hat\rho_\varphi) . \]

Remark. The above definition of density matrix is given in terms of the tracial state in contrast to the standard definition using the matrix trace Tr. Hence we use the word ‘adjusted’.

Definition 3.2. Let $\hat\rho_\varphi$ be the adjusted density matrix of a positive linear functional $\varphi$ of a finite type I factor. Then
\[ \hat S(\varphi) \equiv -\varphi(\log \hat\rho_\varphi) \]
is called the adjusted entropy of $\varphi$.

Remark. The adjusted density matrix and the adjusted entropy for a type $I_n$ factor $M$ with the dimension $\mathrm{Tr}(1) = n$ are related to the usual ones by the following relations:
\[ \hat\rho_\varphi = n \rho_\varphi , \qquad \hat S(\varphi) = S(\varphi) - \varphi(1) \log n . \tag{3.1} \]
The range of the values of entropy is given by the following well-known lemma.

Lemma 3.3. If $M$ is a type $I_n$ factor and $\varphi$ is a state of $M$, then
\[ 0 \le S(\varphi) \le \log n . \tag{3.2} \]
The equality $S(\varphi) = 0$ holds if and only if $\varphi$ is a pure state of $M$. The equality $S(\varphi) = \log n$ holds if and only if $\varphi$ is the tracial state $\tau$ of $M$.

Definition 3.4. The relative entropy of $\rho$ and $\sigma$ in $M_+$ as well as that of positive linear functionals $\varphi$ and $\psi$ are defined by
\[ S(\sigma, \rho) = \tau(\rho(\log \rho - \log \sigma)) , \tag{3.3} \]
\[ S(\psi, \varphi) = \varphi(\log \hat\rho_\varphi - \log \hat\rho_\psi) \ (= \tau(\hat\rho_\varphi \log \hat\rho_\varphi - \hat\rho_\varphi \log \hat\rho_\psi)) . \tag{3.4} \]

Remark. $S(\psi, \varphi)$ remains the same if $\hat\rho_\varphi$ and $\hat\rho_\psi$ are replaced by the density matrices $\rho_\varphi$ and $\rho_\psi$ with respect to Tr. The right-hand sides of (3.3) and (3.4) are well-defined when $\rho$, $\sigma$, $\hat\rho_\varphi$ and $\hat\rho_\psi$ are regular. Otherwise, one may define them as the limit of regular cases, for example by taking the limit $\varepsilon \to 0$ for $(1 - \varepsilon)\varphi + \varepsilon\tau$, $(1 - \varepsilon)\psi + \varepsilon\tau$ for (3.4), and similarly for (3.3). The value of $S(\psi, \varphi)$ is real or $+\infty$ for positive linear functionals $\varphi$ and $\psi$.

The following lemma is also well-known.

Lemma 3.5. Let $\varphi$ and $\psi$ be states. Then $S(\psi, \varphi)$ is non-negative. It vanishes if and only if $\varphi = \psi$.
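The relations (3.1), (3.2) and Lemma 3.5 can be checked by hand in the simplest situation. The sketch below assumes that all density matrices are diagonal in a common basis, so that a state is just a probability vector of eigenvalues of its density matrix with respect to Tr; this restriction, the function names and the sample vectors are assumptions made for illustration only.

```python
import math

def entropy(p):
    """Usual entropy S = -sum p_i log p_i of a diagonal density matrix (w.r.t. Tr)."""
    return -sum(x * math.log(x) for x in p if x > 0)

def adjusted_entropy(p):
    """Adjusted entropy: the adjusted density is n * rho, so S_hat = -sum p_i log(n p_i)."""
    n = len(p)
    return -sum(x * math.log(n * x) for x in p if x > 0)

def relative_entropy(q, p):
    """S(psi, phi) = sum p_i (log p_i - log q_i); q belongs to psi, p to phi, all q_i > 0."""
    return sum(x * (math.log(x) - math.log(y)) for x, y in zip(p, q) if x > 0)

n = 4
phi = [0.5, 0.25, 0.125, 0.125]   # sample state (assumed data)
psi = [0.4, 0.3, 0.2, 0.1]        # second sample state (assumed data)
tracial = [1.0 / n] * n           # tracial state
pure = [1.0, 0.0, 0.0, 0.0]       # a pure state

# (3.1) for a state: S_hat(phi) = S(phi) - log n.
gap_31 = abs(adjusted_entropy(phi) - (entropy(phi) - math.log(n)))

# (3.2): 0 <= S <= log n, with the extreme values at pure and tracial states.
in_range = 0.0 <= entropy(phi) <= math.log(n)

# Lemma 3.5: non-negativity, and vanishing when both states coincide.
rel = relative_entropy(psi, phi)
rel_same = relative_entropy(phi, phi)
```

In this commuting case the relative entropy reduces to the classical Kullback–Leibler divergence, so non-negativity is the classical Gibbs inequality.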
Remark. We note that there are different notations for the relative entropy and that we adopt that of Araki [8] and Kosaki [25]. In comparison with our notation, the order of two states is reversed in that of Umegaki [45], while both the order of states and the sign are reversed in that of Bratteli and Robinson [17].

3.2. Monotone property

Under any conditional expectation $E$ and under restriction to any subalgebra, the relative entropy is known to be non-increasing:
\[ S(\psi \circ E, \varphi \circ E) \le S(\psi, \varphi) , \tag{3.5} \]
\[ S(\psi_N, \varphi_N) \le S(\psi, \varphi) . \tag{3.6} \]
(For example, (3.6) is Theorem 4.1(iv) of [25]. (3.5) follows from Theorem 4.1(v) of [25], because $E$ is a Schwarz map [44].)

When we want to exhibit the dependence of entropy on $M$ more explicitly, we use the notation $S_M$ and $\hat S_M$ instead of $S$ and $\hat S$. The relation between the entropy and the relative entropy for a state $\varphi$ is given by
\[ \hat S(\varphi) = -S(\tau, \varphi) = S(\varphi) - S(\tau) . \]
Note that $S(\tau) = \log n$ for a type $I_n$ factor $M$.

We identify $M$ with $N \otimes (M \cap N')$ and use the notation $\varphi_N \otimes \tau_{M \cap N'}$. We also identify $A \in N \subset M$ with $A \otimes 1 \in N \otimes (M \cap N')$.

Lemma 3.6. Let $M \supset N$ be finite type I factors, and $\varphi$ be a state on $M$. Then
\[ \hat S_N(\varphi_N) - \hat S_M(\varphi) = S_M(\varphi_N \otimes \tau_{M \cap N'}, \varphi) = S_M(\varphi \circ E^M_N, \varphi) . \tag{3.7} \]

Proof. If $\varphi$ is a faithful state, we show the above identity by a straightforward calculation. If $\varphi$ is not faithful, we add $\varepsilon \cdot \tau$ to $(1 - \varepsilon)\varphi$ and then take the limit $\varepsilon \to 0$.

Remark. $\hat S$ in the above Lemma cannot be replaced by $S$.

3.3. Strong subadditivity

If the system under consideration enjoys the commuting square property with respect to a tracial state, the strong subadditivity property for the adjusted entropy $\hat S$ holds (see Theorem 12 in [35]).

Theorem 3.7. Let $M$, $N_1$, $N_2$ and $P$ be finite type I factors satisfying one of the equivalent conditions of Proposition 2.4. Let $\psi$ be a state on $M$. Then
\[ \hat S(\psi) - \hat S(\psi_{N_1}) - \hat S(\psi_{N_2}) + \hat S(\psi_P) \le 0 . \tag{3.8} \]

Proof. By (3.7) and (3.5)
\[ \hat S_{N_2}(\psi_{N_2}) - \hat S_M(\psi) = S_M(\psi \circ E^M_{N_2}, \psi) \ge S_M(\psi \circ E^M_{N_2} \circ E^M_{N_1}, \psi \circ E^M_{N_1}) . \]
By the assumption, $E^M_{N_2} E^M_{N_1} = E^M_{N_1} E^M_{N_2} = E^M_P$. Hence,
\begin{align*}
S_M(\psi \circ E^M_{N_2} \circ E^M_{N_1}, \psi \circ E^M_{N_1}) &= S_M(\psi_P \otimes \tau_{M \cap P'}, \psi_{N_1} \otimes \tau_{M \cap N_1'}) \\
&= S_{N_1}(\psi_P \otimes \tau_{N_1 \cap P'}, \psi_{N_1}) = \hat S_P(\psi_P) - \hat S_{N_1}(\psi_{N_1}) ,
\end{align*}
where the second equality is due to $\tau_{M \cap P'} = \tau_{N_1 \cap P'} \otimes \tau_{M \cap N_1'}$ and the last equality due to (3.7). Therefore we obtain (3.8).

4. Fermion Lattice Systems

4.1. Fermion algebra

We introduce Fermion lattice systems where there exists one spinless Fermion at each lattice site and they interact with each other. The restriction to spinless particles (i.e. one degree of freedom for each site) is just a matter of simplification of notation. All results and their proofs in the present work go over to the case of an arbitrary (constant) finite number of degrees of freedom at each lattice site without any essential alteration. The lattice we consider is the $\nu$-dimensional lattice $\mathbb{Z}^\nu$ ($\nu \in \mathbb{N}$, an arbitrary positive integer).

Definition 4.1. The Fermion C∗-algebra $\mathcal{A}$ is a unital C∗-algebra satisfying the following conditions and generated by elements in (1-1):
(1-1) For each lattice site $i \in \mathbb{Z}^\nu$, there are elements $a_i$ and $a_i^*$ of $\mathcal{A}$ called annihilation and creation operators, respectively, where $a_i^*$ is the adjoint of $a_i$.
(1-2) The CAR (canonical anticommutation relations) are satisfied for any $i, j \in \mathbb{Z}^\nu$:
\[ \{a_i^*, a_j\} = \delta_{i,j} 1 , \qquad \{a_i^*, a_j^*\} = \{a_i, a_j\} = 0 . \tag{4.1} \]
Here $\{A, B\} = AB + BA$ (anticommutator), $\delta_{i,j} = 1$ for $i = j$, and $\delta_{i,j} = 0$ for $i \ne j$.
(1-3) Let $\mathcal{A}_\circ$ be the ∗-algebra generated by all $a_i$ and $a_i^*$ ($i \in \mathbb{Z}^\nu$), namely the (algebraic) linear span of their monomials $A_1 \cdots A_n$ where $A_k$ is $a_{i_k}$ or $a_{i_k}^*$, $i_k \in \mathbb{Z}^\nu$.
(2) For each subset $I$ of $\mathbb{Z}^\nu$, the C∗-subalgebra of $\mathcal{A}$ generated by $a_i$, $a_i^*$, $i \in I$, is denoted by $\mathcal{A}(I)$. If the cardinality $|I|$ of the set $I$ is finite, then $\mathcal{A}(I)$ is referred to as a local algebra or more specifically the local algebra for $I$. For the empty set $\emptyset$, we define $\mathcal{A}(\emptyset) = \mathbb{C}1$.

Remark 1. $\mathcal{A}_\circ$ is dense in $\mathcal{A}$.

Remark 2. For finite $I$, $\mathcal{A}(I)$ is known to be isomorphic to the tensor product of $|I|$ copies of the full $2 \times 2$ matrix algebra $M_2(\mathbb{C})$ and hence isomorphic to $M_{2^{|I|}}(\mathbb{C})$.
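For a finite set of sites, the relations (4.1) and the dimension count of Remark 2 can be verified in a concrete matrix model. The sketch below uses the standard Jordan–Wigner construction, which is an assumption of this example (the paper works abstractly and does not fix a representation); the names `jordan_wigner` and `car_holds` are likewise hypothetical.

```python
from functools import reduce

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y):
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]

def transpose(x):
    # For the real matrices used here the adjoint is the transpose.
    return [list(row) for row in zip(*x)]

def kron(x, y):
    """Kronecker product of two square matrices."""
    return [[x[i][j] * y[k][l] for j in range(len(x)) for l in range(len(y))]
            for i in range(len(x)) for k in range(len(y))]

A = [[0, 1], [0, 0]]    # single-site annihilation matrix
Z = [[1, 0], [0, -1]]   # parity matrix carrying the Fermi sign
I2 = [[1, 0], [0, 1]]

def jordan_wigner(n):
    """Annihilation operators a_0, ..., a_{n-1} on n sites as 2^n x 2^n matrices."""
    return [reduce(kron, [Z] * j + [A] + [I2] * (n - j - 1)) for j in range(n)]

def car_holds(ops):
    """Check {a_i^*, a_j} = delta_ij 1 and {a_i, a_j} = 0 exactly (integer arithmetic)."""
    dim = len(ops[0])
    identity = [[int(r == s) for s in range(dim)] for r in range(dim)]
    zero = [[0] * dim for _ in range(dim)]
    for i, ai in enumerate(ops):
        for j, aj in enumerate(ops):
            mixed = add(matmul(transpose(ai), aj), matmul(aj, transpose(ai)))
            plain = add(matmul(ai, aj), matmul(aj, ai))
            if mixed != (identity if i == j else zero) or plain != zero:
                return False
    return True

ops = jordan_wigner(3)
car_ok = car_holds(ops)
dimension = len(ops[0])   # 2**3, matching M_{2^{|I|}}(C) for |I| = 3
```

The relation $\{a_i^*, a_j^*\} = 0$ follows from $\{a_i, a_j\} = 0$ by taking adjoints, so it is not tested separately.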
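Then
\[ \mathcal{A}_\circ = \bigcup_{|I| < \infty} \mathcal{A}(I) . \]

…

… $\varepsilon > 0$ and a given $a_\sigma$, $\sigma = +$ or $-$, there exists a polynomial $p$, i.e. a linear combination $p$ of monomials of $a_i$ and $a_i^*$, $i \in I$, satisfying $\|a_\sigma - p\| < \varepsilon$. Since $E_\sigma \equiv (1/2)(\mathrm{id} + \sigma\Theta)$ satisfies $E_\sigma a_\sigma = a_\sigma$ and $\|E_\sigma\| = 1$, we have $\|E_\sigma(a_\sigma - p)\| = \|a_\sigma - p_\sigma\| < \varepsilon$ where $p_\sigma = E_\sigma p$. Since $E_\sigma$ selects even or odd monomials (annihilating others) according as $\sigma$ is $+$ or $-$, $p_\sigma$ is a linear combination of even or odd monomials of $a_i$ and $a_i^*$, $i \in I$. Similarly there exists a linear combination $q_{\sigma'}$ of even or odd monomials of $a_j$ and $a_j^*$, $j \in J$, satisfying $\|b_{\sigma'} - q_{\sigma'}\| < \varepsilon$. Since the graded commutation relation (4.8) holds for $p_\sigma$ and $q_{\sigma'}$, it holds for $a_\sigma$ and $b_{\sigma'}$.

Definition 4.4. (1) For each $k \in \mathbb{Z}^\nu$, $\tau_k$ denotes a unique automorphism of $\mathcal{A}$ satisfying
\[ \tau_k(a_i^*) = a_{i+k}^* , \qquad \tau_k(a_i) = a_{i+k} , \qquad (i \in \mathbb{Z}^\nu) . \tag{4.9} \]
(2) For a state $\varphi$ of $\mathcal{A}$, the adjoint action of $\tau_k$ is defined by
\[ (\tau_k^* \varphi)(A) = \varphi(\tau_k(A)) , \qquad (A \in \mathcal{A}) . \tag{4.10} \]

Remark. The automorphism $\tau_k$ represents the lattice translation by the amount $k \in \mathbb{Z}^\nu$. The map $k \in \mathbb{Z}^\nu \mapsto \tau_k$ is a group of automorphisms:
\[ \tau_k \tau_l = \tau_{k+l} , \qquad (k, l \in \mathbb{Z}^\nu) . \]
The subalgebras transform covariantly under this group:
\[ \tau_k(\mathcal{A}(I)) = \mathcal{A}(I + k) , \tag{4.11} \]
where $I + k = \{i + k ;\ i \in I\}$ for any subset $I$ of $\mathbb{Z}^\nu$ and any $k \in \mathbb{Z}^\nu$.

Definition 4.5. The sets of all states and all positive linear functionals of $\mathcal{A}$ are denoted by $\mathcal{A}^*_{+,1}$ and $\mathcal{A}^*_+$; the sets of all $\Theta$ invariant and all $\tau$ invariant ones by $\mathcal{A}^{*\Theta}_{+,1}$, $\mathcal{A}^{*\Theta}_+$ and $\mathcal{A}^{*\tau}_{+,1}$, $\mathcal{A}^{*\tau}_+$, respectively. For any subset $I$ of $\mathbb{Z}^\nu$, the set of all states of $\mathcal{A}(I)$ is denoted by $\mathcal{A}(I)^*_{+,1}$; the set of all $\Theta$ invariant ones by $\mathcal{A}(I)^{*\Theta}_{+,1}$.

Remark 1. Any translation invariant state is automatically even (see, e.g. Example 5.2.21 of [17]):
\[ \mathcal{A}^{*\tau}_{+,1} \subset \mathcal{A}^{*\Theta}_{+,1} . \tag{4.12} \]

Remark 2. For each subset $I$ of $\mathbb{Z}^\nu$, we can consider the set of all states $\{\mathcal{A}(I)_+\}^*_{+,1}$ on the even subalgebra $\mathcal{A}(I)_+$. There exists an obvious one-to-one correspondence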
between $\mathcal{A}(I)^{*\Theta}_{+,1}$ and $\{\mathcal{A}(I)_+\}^*_{+,1}$ due to (4.7) by the restriction and the unique $\Theta$ invariant extension.
4.2. Product property of the tracial state

The following proposition provides a basis for the present section.

Proposition 4.6. If $J_1$ and $J_2$ are disjoint, then
\[ \tau(ab) = \tau(a)\tau(b) \tag{4.13} \]
for arbitrary $a \in \mathcal{A}(J_1)$ and $b \in \mathcal{A}(J_2)$.

Proof. It is enough to prove the formula when $a$ and $b$ are monomials of the form (4.2). Let $a = A_i a'$, where $i \in J_1$, $a' \in \mathcal{A}(J_1 \setminus \{i\})$ is a monomial of the form (4.2) and $A_i$ is one of $a_i^*$, $a_i$, $a_i^* a_i$, $a_i a_i^*$. We will now show
\[ \tau(ab) = \tau(A_i)\tau(a'b) . \tag{4.14} \]
If $a'b$ is a $\Theta$-odd monomial, then $\tau(a'b) = 0$ by (4.7). If $A_i$ is $\Theta$-even, then $ab$ is odd and $\tau(ab) = 0$, implying (4.14). If $A_i$ is odd, then $A_i(a'b) = -(a'b)A_i$. Hence $\tau(ab) = \tau(A_i(a'b)) = -\tau((a'b)A_i) = -\tau(A_i(a'b)) = 0$, where the third equality is due to the tracial property of $\tau$. So (4.14) holds in either case.

If $a'b$ is even and $A_i$ is odd, then $\tau(A_i) = 0$ because $A_i$ is odd and $\tau(ab) = 0$ because $ab = A_i(a'b)$ is odd. Again (4.14) holds.

Finally, if $a'b$ is even and $A_i = a_i^* a_i$, then $a_i^*$ commutes with $a'b$ due to CAR and hence
\begin{align*}
\tau(ab) &= \tau((a_i^* a_i)(a'b)) = \tau(a_i (a'b) a_i^*) \\
&= \tau(a_i a_i^* (a'b)) && (\text{due to } [a_i^*, a'b] = 0) \\
&= \frac{1}{2}\tau((a_i^* a_i + a_i a_i^*)(a'b)) = \frac{1}{2}\tau(a'b) .
\end{align*}
The same formula for $a'b = 1$ yields
\[ \tau(A_i) = \frac{1}{2} \]
and hence
\[ \tau(ab) = \tau(A_i)\tau(a'b) . \]
If $a'b$ is even and $A_i = a_i a_i^*$, the above formula holds in the same way. We have now proved (4.14) for all cases. Let $a$ be now given by (4.2). By using (4.14) for $i_1, i_2, \ldots, i_k$ successively, we obtain
\[ \tau(ab) = \tau(A_{i_1}) \cdots \tau(A_{i_k})\tau(b) . \]
The same equality for $b = 1$ yields
\[ \tau(a) = \tau(A_{i_1}) \cdots \tau(A_{i_k}) . \]
Hence we have
\[ \tau(ab) = \tau(a)\tau(b) . \]
This completes the proof.

We may say that the tracial state $\tau$ is a ‘product’ state although $\mathcal{A}(J_1)$ and $\mathcal{A}(J_2)$ do not commute. We will show in the next subsections that this product property of the tracial state implies the commuting square property for the conditional expectations.

4.3. Conditional expectations for Fermion algebras

We prove the C∗-algebraic version of Proposition 2.1 for the Fermion algebra $\mathcal{A}$ and its subalgebras. We note that $\mathcal{A}(I)$ is not a von Neumann algebra unless $I$ is a finite subset of $\mathbb{Z}^\nu$. Hence Proposition 2.1 is not directly applicable to the Fermion algebra.

Theorem 4.7. For any subset $I$ of $\mathbb{Z}^\nu$, there exists a conditional expectation
\[ E_I : a \in \mathcal{A} \mapsto E_I(a) \in \mathcal{A}(I) \tag{4.15} \]
uniquely determined by $E_I(a) \in \mathcal{A}(I)$ and
\[ \tau(ab) = \tau(E_I(a) b) \qquad (b \in \mathcal{A}(I)) . \tag{4.16} \]
For any second subset $J$ of $\mathbb{Z}^\nu$,
\[ E_I(a) \in \mathcal{A}(I \cap J) \tag{4.17} \]
for any $a \in \mathcal{A}(J)$, and
\[ E_I E_J = E_J E_I = E_{I \cap J} . \tag{4.18} \]
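Proof. The C∗-subalgebra of $\mathcal{A}$ generated by $\mathcal{A}(I)$ and $\mathcal{A}(I^c)_+$ is isomorphic to their tensor product and will be denoted as $\mathcal{A}(I) \otimes \mathcal{A}(I^c)_+$. Let
\[ E_I^{(1)} \equiv \frac{1}{2}(\mathrm{id} + \Theta^{I^c}) . \]
It maps $\mathcal{A}$ onto $\mathcal{A}(I) \otimes \mathcal{A}(I^c)_+$. Since
\[ \tau(\Theta^{I^c}(a) b) = \tau(\Theta^{I^c}(ab)) = \tau(ab) \tag{4.19} \]
for all $a \in \mathcal{A}$ and $b \in \mathcal{A}(I) \otimes \mathcal{A}(I^c)_+$, $E_I^{(1)}$ satisfies (4.16).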
Since $\tau$ is a product state for the tensor product $\mathcal{A}(I) \otimes \mathcal{A}(I^c)_+$, there exists a conditional expectation $E_I^{(2)}$ from $\mathcal{A}(I) \otimes \mathcal{A}(I^c)_+$ onto $\mathcal{A}(I)$ satisfying (4.16), characterized by $E_I^{(2)}(cd) = \tau(d) c$ for $c \in \mathcal{A}(I)$ and $d \in \mathcal{A}(I^c)_+$ and called a slice map. Therefore
\[ E_I = E_I^{(2)} E_I^{(1)} \tag{4.20} \]
is a map from $\mathcal{A}$ onto $\mathcal{A}(I)$ satisfying (4.16). By Lemma 2.2, it is a unique conditional expectation from $\mathcal{A}$ onto $\mathcal{A}(I)$ satisfying (4.16).

To show (4.17), note that $\mathcal{A}(J)$ is generated by $\mathcal{A}(J \cap I)$ and $\mathcal{A}(J \cap I^c)$, namely, the linear span of products $ab$ with $a \in \mathcal{A}(J \cap I)$ and $b \in \mathcal{A}(J \cap I^c)$ is dense in $\mathcal{A}(J)$. Due to the linearity of $E_I$ and $\|E_I\| = 1$, it is enough to show (4.17) for such products. We have $E_I^{(1)}(b) \in \mathcal{A}(I^c)_+$ and hence
\[ E_I(ab) = E_I^{(2)}(a E_I^{(1)}(b)) = a\,\tau(E_I^{(1)}(b)) = a\,\tau(b) \in \mathcal{A}(J \cap I) , \]
which proves (4.17).

For any $a \in \mathcal{A}$, $E_J(a) \in \mathcal{A}(J)$ and hence $E_I(E_J(a)) \in \mathcal{A}(I \cap J)$. For $b \in \mathcal{A}(I \cap J)$, (4.16) implies
\[ \tau(E_I(E_J(a)) b) = \tau(E_J(a) b) = \tau(ab) , \]
where the first equality is due to $b \in \mathcal{A}(I)$, while the second equality is due to $b \in \mathcal{A}(J)$. This equality and $E_I(E_J(a)) \in \mathcal{A}(I \cap J)$ imply
\[ E_{I \cap J}(a) = E_I(E_J(a)) \]
by the uniqueness result. By interchanging $I$ and $J$, we obtain
\[ E_I E_J = E_J E_I = E_{I \cap J} , \]
which proves the last statement (4.18).

Remark 1. For spin lattice systems, the conditional expectation $E_I$ can be obtained simply as a slice map with respect to the tracial state $\tau$. When spins and Fermions coexist at each lattice site, $E_I$ can be obtained in exactly the same way as Theorem 4.7 (by including spin operators in the even part $\mathcal{A}(I)_+$), provided that the degree of freedom at each lattice site is finite (i.e. $\mathcal{A}(I)$ is a finite factor of type I for any finite $I$). In all these cases, the results of our paper are valid as they are proved by the use of conditional expectations $E_I$.

Remark 2. Theorem 4.7 can be shown by a more elementary (lengthy) method by giving $E_I$ explicitly for a finite $I$ and then giving $E_J$ for an infinite $J$ as a limit of $E_{I_n}$ for an increasing sequence of finite subsets $I_n$ of $\mathbb{Z}^\nu$ tending to $J$. The proof presented above follows a suggestion of a referee.

Corollary 4.8. For each subset $I$ of $\mathbb{Z}^\nu$,
\[ E_I \Theta = \Theta E_I . \tag{4.21} \]
Proof. For any $a \in \mathcal{A}$ and $b \in \mathcal{A}(I)$,
\begin{align*}
\tau(E_I(\Theta(a)) b) &= \tau(\Theta(a) b) = \tau(\Theta\{\Theta(a) b\}) = \tau(a \Theta(b)) \\
&= \tau(E_I(a)\Theta(b)) = \tau(\Theta\{E_I(a)\Theta(b)\}) = \tau(\Theta(E_I(a)) b) .
\end{align*}
Since $\mathcal{A}(I)$ is invariant under $\Theta$ as a set, we have $\Theta(E_I(a)) = E_I(\Theta(a))$ due to the uniqueness of $E_I$ in the preceding theorem.

We now show a continuous dependence of $E_I$ on the subsets $I$ of $\mathbb{Z}^\nu$. We use the following notation for various limits of subsets of $\mathbb{Z}^\nu$. If $\{I_\alpha\}$ is a monotone (not necessarily strictly) increasing or decreasing net of subsets converging to a subset $I$ of $\mathbb{Z}^\nu$, we write $I_\alpha \nearrow I$ or $I_\alpha \searrow I$. For these cases, $I = \cup_\alpha I_\alpha$ or $I = \cap_\alpha I_\alpha$, respectively. We use $I_\alpha \to I$ for the standard convergence of a net $I_\alpha$ to $I$ (i.e. $\limsup_\alpha I_\alpha = \liminf_\alpha I_\alpha = I$). By $J \nearrow \mathbb{Z}^\nu$ (which is written without any index), we mean a net of all finite subsets tending to $\mathbb{Z}^\nu$ with the set inclusion as its partial ordering. (In the same way, we use $J \nearrow I$.) In this case, $J$ itself serves as the net index and it is a monotone increasing net. Later in Secs. 9 and 10, we use a more restrictive notion of a van Hove net $\{I_\alpha\}$ tending to $\mathbb{Z}^\nu$ or to ‘$\infty$’ (see Appendix for detailed explanation).

Lemma 4.9. Let $\{I_\alpha\}$ be an increasing net of (finite or infinite) subsets of $I$ such that their union is $I$. For any $a \in \mathcal{A}$,
\[ \lim_\alpha E_{I_\alpha}(a) = E_I(a) . \tag{4.22} \]
As a special case $I = \mathbb{Z}^\nu$,
\[ \lim_{I_\alpha \nearrow \mathbb{Z}^\nu} E_{I_\alpha}(a) = a . \tag{4.23} \]

Proof. Since polynomials of $a_i$ and $a_i^*$, $i \in I$, are dense in $\mathcal{A}(I)$, there exists a finite subset $J_n$ of $I$ and $a_n \in \mathcal{A}(J_n)$ such that
\[ \|E_I(a) - a_n\| < \frac{1}{n} . \]
Because $J_n$ is a finite subset of $I$ and $\cup_\alpha I_\alpha = I$, there exists a finite number of $I_\alpha$, say, $I_{\alpha(1)}, \ldots, I_{\alpha(k)}$, such that $\cup_{l=1}^k I_{\alpha(l)} \supset J_n$. Since $I_\alpha$ is a net, there exists an index $\alpha_n > \alpha(1), \ldots, \alpha(k)$. Since $I_\alpha$ is increasing, $I_{\alpha_n} \supset I_{\alpha(1)} \cup \cdots \cup I_{\alpha(k)} \supset J_n$. For any $\alpha \ge \alpha_n$, $I_\alpha \supset J_n$ and so $E_{I_\alpha}(a_n) = a_n$. Hence by $I \supset I_\alpha$, we have
\[ \|E_{I_\alpha}(a) - a_n\| = \|E_{I_\alpha}(E_I(a) - a_n)\| \le \|E_I(a) - a_n\| < \frac{1}{n} \]
due to $\|E_{I_\alpha}\| \le 1$. Thus
\[ \|E_{I_\alpha}(a) - E_I(a)\| \le \|E_{I_\alpha}(a) - a_n\| + \|E_I(a) - a_n\| < \frac{2}{n} , \]
for all $\alpha \ge \alpha_n$, which proves the assertion (4.22).
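Lemma 4.10. Let $\{I_\alpha\}$ be a decreasing net of (finite or infinite) subsets of $\mathbb{Z}^\nu$ such that their intersection is $I$. For any $a \in \mathcal{A}$,
\[ \lim_\alpha E_{I_\alpha}(a) = E_I(a) . \tag{4.24} \]

Proof. Let $L_k$ be a monotone increasing sequence of finite subsets of $\mathbb{Z}^\nu$ such that their union is $\mathbb{Z}^\nu$. For any $\varepsilon > 0$, there exists $k_\varepsilon$ such that $\|a - E_{L_k}(a)\| < \varepsilon$ for all $k \ge k_\varepsilon$ by Lemma 4.9. Hence
\[ \|E_I(a) - E_{I \cap L_k}(a)\| = \|E_I(a - E_{L_k}(a))\| < \varepsilon , \tag{4.25} \]
\[ \|E_{I_\alpha}(a) - E_{I_\alpha \cap L_k}(a)\| = \|E_{I_\alpha}(a - E_{L_k}(a))\| < \varepsilon \tag{4.26} \]
for all $k \ge k_\varepsilon$ and all $\alpha$ due to $\|E_I\| \le 1$ and $\|E_{I_\alpha}\| \le 1$. Since $I_\alpha \searrow I$, we have $(I_\alpha \cap L_k) \searrow (I \cap L_k)$. Since $L_{k_\varepsilon}$ is a finite set, there exists $\alpha_\varepsilon$ such that $I_\alpha \cap L_{k_\varepsilon} = I \cap L_{k_\varepsilon}$ and hence $E_{I_\alpha \cap L_{k_\varepsilon}} = E_{I \cap L_{k_\varepsilon}}$ for all $\alpha \ge \alpha_\varepsilon$. Therefore, we obtain
\begin{align*}
\|E_{I_\alpha}(a) - E_I(a)\| &\le \|E_{I_\alpha}(a) - E_{I_\alpha \cap L_{k_\varepsilon}}(a)\| + \|E_{I_\alpha \cap L_{k_\varepsilon}}(a) - E_I(a)\| \\
&= \|E_{I_\alpha}(a) - E_{I_\alpha \cap L_{k_\varepsilon}}(a)\| + \|E_{I \cap L_{k_\varepsilon}}(a) - E_I(a)\| < 2\varepsilon
\end{align*}
for all $\alpha \ge \alpha_\varepsilon$, where the first term is estimated by (4.26), and the second by (4.25). Hence we obtain
\[ \lim_\alpha E_{I_\alpha}(a) = E_I(a) . \]

Theorem 4.11. If a net $\{I_\alpha\}$ converges to $I$, then
\[ \lim_\alpha E_{I_\alpha}(a) = E_I(a) \tag{4.27} \]
for all $a \in \mathcal{A}$.

Proof. By definition, $I_\alpha \to I$ means
\[ I = \cap_\beta (\cup_{\alpha \ge \beta} I_\alpha) = \cup_\beta (\cap_{\alpha \ge \beta} I_\alpha) . \]
Set
\[ J^\beta \equiv \cup_{\alpha \ge \beta} I_\alpha , \qquad J_\beta \equiv \cap_{\alpha \ge \beta} I_\alpha . \]
Then $J^\beta \searrow I$ and $J_\beta \nearrow I$. By Lemmas 4.9 and 4.10, there exists a $\beta_\varepsilon$ for any given $\varepsilon > 0$ such that for all $\beta \ge \beta_\varepsilon$,
\[ \|E_{J^\beta}(a) - E_I(a)\| < \varepsilon , \qquad \|E_{J_\beta}(a) - E_I(a)\| < \varepsilon . \]
Hence
\[ \|E_{J^\beta}(a) - E_{J_\beta}(a)\| < 2\varepsilon . \]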
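Since $J^\beta \supset I_\beta \supset J_\beta$ (with $J^\beta$ the union and $J_\beta$ the intersection of the sets $I_\alpha$, $\alpha \ge \beta$), we have $E_{I_\beta} E_{J^\beta} = E_{I_\beta}$, $E_{I_\beta} E_{J_\beta} = E_{J_\beta}$ and
\[ \|E_{I_\beta}(a) - E_{J_\beta}(a)\| = \|E_{I_\beta}(E_{J^\beta}(a) - E_{J_\beta}(a))\| < 2\varepsilon . \]
Therefore,
\[ \|E_{I_\beta}(a) - E_I(a)\| < 3\varepsilon \]
for all $\beta \ge \beta_\varepsilon$. This proves (4.27).

The following corollary follows immediately from the results obtained in this subsection.

Corollary 4.12. For any countable family $\{I_n\}$ of subsets of $\mathbb{Z}^\nu$,
\[ \cap_{n=1}^\infty \mathcal{A}(I_n) = \mathcal{A}(\cap_{n=1}^\infty I_n) . \tag{4.28} \]

Proof. Let $J_n \equiv \cap_{k=1}^n I_k$ and $I \equiv \cap_{n=1}^\infty I_n$. Then $J_n \searrow I$. By (4.18), $E_{J_{n-1}} E_{I_n} = E_{J_n}$ and hence $E_{J_n} = \prod_{k=1}^n E_{I_k}$. On one hand, $J_n \subset I_k$ for $k = 1, \ldots, n$, and hence $\mathcal{A}(J_n) \subset \cap_{k=1}^n \mathcal{A}(I_k)$. On the other hand, $a \in \cap_{k=1}^n \mathcal{A}(I_k)$ satisfies $E_{I_k}(a) = a$ for all $k = 1, \ldots, n$ and hence $E_{J_n}(a) = a \in \mathcal{A}(J_n)$. Therefore
\[ \mathcal{A}(J_n) = \cap_{k=1}^n \mathcal{A}(I_k) . \]
Since $J_n \supset I$, we have $\mathcal{A}(J_n) \supset \mathcal{A}(I)$ and hence
\[ \cap_{n=1}^\infty \mathcal{A}(I_n) = \cap_{n=1}^\infty \mathcal{A}(J_n) \supset \mathcal{A}(I) . \]
For $a \in \cap_{n=1}^\infty \mathcal{A}(J_n)$, $E_{J_n}(a) = a$ for any $n$. Since $\lim_n E_{J_n}(a) = E_I(a)$ by Lemma 4.10, we have $a = E_I(a) \in \mathcal{A}(I)$. Now we obtain the desired conclusion
\[ \cap_{n=1}^\infty \mathcal{A}(I_n) = \mathcal{A}(I) . \]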
4.4. Commuting squares for Fermion algebras

In the following theorem, we show that any two subsets $I$ and $J$ of $\mathbb{Z}^\nu$ are associated with a commuting square of the conditional expectations with respect to the tracial state $\tau$. For $K \subset L \subset \mathbb{Z}^\nu$, denote the restriction of $E_K$ to $\mathcal{A}(L)$ by $E^L_K$. Then it is a conditional expectation from $\mathcal{A}(L)$ to $\mathcal{A}(K)$ with respect to the tracial state.

Theorem 4.13. For any subsets $I$ and $J$ of $\mathbb{Z}^\nu$, the following subalgebras of $\mathcal{A}$ form a commuting square:
\[
\begin{array}{ccc}
\mathcal{A}(I \cup J) & \longrightarrow & \mathcal{A}(I) \\
\downarrow & & \downarrow \\
\mathcal{A}(J) & \longrightarrow & \mathcal{A}(I \cap J)
\end{array}
\]
Here the arrow from $\mathcal{A}(L)$ to $\mathcal{A}(K)$ represents the conditional expectation $E^L_K$.
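Proof. It follows from (4.18) that
\[ E^I_{I \cap J} E^{I \cup J}_I = E^{I \cup J}_{I \cap J} = E^J_{I \cap J} E^{I \cup J}_J , \]
which shows the assertion.

4.5. Commutants of subalgebras

We are going to determine the commutants of subalgebras of $\mathcal{A}$.

Lemma 4.14. For a finite $I$,
\[ (\mathcal{A}(I)_+)' \cap \mathcal{A} = \mathcal{A}(I^c) + v_I \mathcal{A}(I^c) , \tag{4.29} \]
where $v_I$ is a self-adjoint unitary in $\mathcal{A}(I)_+$ given by
\[ v_I \equiv \prod_{i \in I} v_i , \qquad v_i \equiv a_i^* a_i - a_i a_i^* \tag{4.30} \]
and implementing $\Theta^I$ on $\mathcal{A}$.

Proof. By CAR,
\[ a_i^* v_i = -a_i^* , \qquad a_i v_i = a_i , \qquad v_i a_i^* = a_i^* , \qquad v_i a_i = -a_i . \]
Thus $v_i$ anticommutes with $a_i$ and $a_i^*$. If $j \ne i$, $v_i$ commutes with $a_j$ and $a_j^*$ due to $v_i \in \mathcal{A}(\{i\})_+$. Therefore for any $a \in \mathcal{A}(I)$, we have
\[ (\mathrm{Ad}\, v_I) a \equiv v_I a v_I^* = \Theta(a) , \tag{4.31} \]
or equivalently,
\[ v_I a = \Theta(a) v_I . \tag{4.32} \]
For any $a \in \mathcal{A}(I^c)$,
\[ v_I a = a v_I . \tag{4.33} \]
Due to $v_I^* = v_I$ and $v_I^2 = 1$, $v_I$ is a self-adjoint unitary implementing $\Theta^I$ on $\mathcal{A}$.

Since $v_I \in \mathcal{A}(I)_+$ implements $\Theta^I$, $(\mathcal{A}(I)_+)'$ is contained in the fixed point subalgebra $\mathcal{A}^{\Theta^I}$. In terms of $E^{(1)}_{I^c} = \frac{1}{2}(\mathrm{id} + \Theta^I)$, we have
\[ (\mathcal{A}(I)_+)' \subset \mathcal{A}^{\Theta^I} = E^{(1)}_{I^c}(\mathcal{A}) = \mathcal{A}(I)_+ \otimes \mathcal{A}(I^c) . \]
Since $\mathcal{A}(I^c)$ is in $(\mathcal{A}(I)_+)'$, we have
\[ (\mathcal{A}(I)_+)' = Z(\mathcal{A}(I)_+) \otimes \mathcal{A}(I^c) \tag{4.34} \]
where $Z(\mathcal{A}(I)_+)$ is the center of $\mathcal{A}(I)_+$. Since $\mathcal{A}(I)_+ = \{v_I\}' \cap \mathcal{A}(I)$, $v_I$ is a self-adjoint unitary in $\mathcal{A}(I)$ and $\mathcal{A}(I)$ is a full matrix algebra for a finite $I$, we have
\[ Z(\mathcal{A}(I)_+) = \mathbb{C}1 + \mathbb{C}v_I . \tag{4.35} \]
By (4.34) and (4.35), we obtain (4.29).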
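The algebraic properties of $v_I$ used in this proof can be checked in a finite matrix model. The following sketch again assumes a Jordan–Wigner representation on three sites labelled 0, 1, 2 with $I = \{0, 1\}$; the representation, the site labels and the helper names (`annihilator`, `v_site`) are assumptions of this example rather than part of the paper.

```python
from functools import reduce

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(x):
    # Real matrices: the adjoint is the transpose.
    return [list(row) for row in zip(*x)]

def sub(x, y):
    return [[p - q for p, q in zip(r, s)] for r, s in zip(x, y)]

def neg(x):
    return [[-p for p in r] for r in x]

def kron(x, y):
    return [[x[i][j] * y[k][l] for j in range(len(x)) for l in range(len(y))]
            for i in range(len(x)) for k in range(len(y))]

A = [[0, 1], [0, 0]]
Z = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]

def annihilator(j, n):
    """Jordan-Wigner annihilation operator at site j out of n sites."""
    return reduce(kron, [Z] * j + [A] + [I2] * (n - j - 1))

def v_site(j, n):
    """v_j = a_j^* a_j - a_j a_j^*, as in (4.30)."""
    a = annihilator(j, n)
    return sub(matmul(transpose(a), a), matmul(a, transpose(a)))

n = 3
a0, a1, a2 = (annihilator(j, n) for j in range(n))
v_I = matmul(v_site(0, n), v_site(1, n))    # I = {0, 1}
identity = reduce(kron, [I2] * n)

is_selfadjoint = transpose(v_I) == v_I
squares_to_one = matmul(v_I, v_I) == identity
# (4.31): Ad v_I acts as Theta on generators inside I (odd elements change sign).
flips_inside = all(matmul(matmul(v_I, a), v_I) == neg(a) for a in (a0, a1))
# (4.33): v_I commutes with generators outside I.
fixes_outside = matmul(matmul(v_I, a2), v_I) == a2
# v_i has vanishing trace, the fact behind tau(v_i) = 0 used later in (4.40).
trace_v0 = sum(v_site(0, n)[r][r] for r in range(2 ** n))
```

Because all entries are integers, the comparisons are exact and no tolerance is needed.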
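Lemma 4.15. For a finite $I$,
\[ \mathcal{A}(I)' \cap \mathcal{A} = \mathcal{A}(I^c)_+ + v_I \mathcal{A}(I^c)_- . \tag{4.36} \]

Proof. By Lemma 4.14 and $\mathcal{A}(I)' \subset (\mathcal{A}(I)_+)'$, any element $a \in \mathcal{A}(I)'$ is of the form
\[ a = a_1 + v_I a_2 , \qquad a_1, a_2 \in \mathcal{A}(I^c) . \]
Take any unitary $u \in \mathcal{A}(I)_-$ (e.g. $u = a_i + a_i^*$, $i \in I$). Then we have
\begin{align*}
a &= \frac{1}{2}(a + u a u^*) = \frac{1}{2}(a_1 + u a_1 u^*) + \frac{1}{2} v_I (a_2 - u a_2 u^*) \\
&= (a_1)_+ + v_I (a_2)_-
\end{align*}
due to $u v_I = -v_I u$, where
\[ (a_1)_+ = \frac{1}{2}(a_1 + \Theta(a_1)) \in \mathcal{A}(I^c)_+ , \qquad (a_2)_- = \frac{1}{2}(a_2 - \Theta(a_2)) \in \mathcal{A}(I^c)_- . \]
Hence $\mathcal{A}(I)' \cap \mathcal{A} \subset \mathcal{A}(I^c)_+ + v_I \mathcal{A}(I^c)_-$. The inverse inclusion follows from (4.32) and Lemma 4.3. Hence (4.36) holds.

Lemma 4.16. For an infinite $I$,
\[ \mathcal{A}(I)' \cap \mathcal{A} = \mathcal{A}(I^c)_+ . \tag{4.37} \]

Proof. It is clear that elements of $\mathcal{A}(I^c)_+$ and $\mathcal{A}(I)$ commute. Hence it is enough to prove
\[ \mathcal{A}(I)' \cap \mathcal{A} \subset \mathcal{A}(I^c)_+ . \]
Let $a \in \mathcal{A}(I)' \cap \mathcal{A}$. Then
\[ a_\pm = \frac{1}{2}(a \pm \Theta(a)) \in \mathcal{A}(I)' \cap \mathcal{A} \]
because $\Theta(\mathcal{A}(I)) = \mathcal{A}(I)$. For any finite subset $K$ of $I$, $a_\pm \in (\mathcal{A}(K)')_\pm$. Hence by Lemma 4.15, $a_+ \in \mathcal{A}(K^c)_+$. Consider an increasing sequence of finite subsets $K_n \nearrow I$. We apply Corollary 4.12 to $(K_n)^c \searrow I^c$, and obtain
\[ a_+ \in \cap_{n=1}^\infty \mathcal{A}((K_n)^c)_+ = \mathcal{A}(I^c)_+ . \tag{4.38} \]
We now prove $a_- = 0$, which yields the desired conclusion due to $a = a_+ + a_-$ and (4.38). For a monotone increasing sequence of finite subsets $L_n$ of $\mathbb{Z}^\nu$ such that $L_n \nearrow \mathbb{Z}^\nu$, we have $\lim_n E_{L_n}(a_-) = a_-$ and hence there exists $n_\varepsilon$ for any given $\varepsilon > 0$ such that
\[ \|E_{L_n}(a_-) - a_-\| < \varepsilon \tag{4.39} \]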
for $n \ge n_\varepsilon$. For any $k$, we set $K_k \equiv I \cap L_k\ (\subset I)$. Then $a_- \in \mathcal{A}(K_k)'$ and by Lemma 4.15 we have
\[ a_- = v_{K_k} b_k \]
for some $b_k \in \mathcal{A}((K_k)^c)_-$. For any $i \in K_k$,
\[ E_{\{i\}^c}(a_-) = \tau(v_i)\, v_{(K_k \setminus \{i\})} b_k = 0 . \tag{4.40} \]
Now take an $n_0 \ge n_\varepsilon$. Since $K_k \nearrow I$ and $I$ is an infinite set while any $L_{n_0}$ is a finite set, there exists a number $k$ such that $K_k$ contains a point $i$ of $\mathbb{Z}^\nu$ such that $i \notin L_{n_0}$. Then $L_{n_0} \subset \{i\}^c$. It follows from (4.40) that
\[ E_{L_{n_0}}(a_-) = E_{L_{n_0}} E_{\{i\}^c}(a_-) = 0 . \]
This and (4.39) imply
\[ \|a_-\| < \varepsilon . \]
Since $\varepsilon$ is arbitrary, we obtain $a_- = 0$.

Combining Lemma 4.15 and Lemma 4.16, we obtain

Theorem 4.17. (1) For a finite $I$,
\[ \mathcal{A}(I)' \cap \mathcal{A} = \mathcal{A}(I^c)_+ + v_I \mathcal{A}(I^c)_- , \]
where $v_I$ is given by (4.30).
(2) For an infinite $I$,
\[ \mathcal{A}(I)' \cap \mathcal{A} = \mathcal{A}(I^c)_+ . \]

As a preparation for the remaining case (the commutant of $\mathcal{A}(I)_+$ for infinite $I$), we present the following technical Lemma for the sake of completeness. We define
\[ u^{(i)}_{11} \equiv a_i^* a_i , \qquad u^{(i)}_{12} \equiv a_i^* , \qquad u^{(i)}_{21} \equiv a_i , \qquad u^{(i)}_{22} \equiv a_i a_i^* . \tag{4.41} \]

Lemma 4.18. Let $I = (i_1, \ldots, i_{|I|})$ be a finite subset of $\mathbb{Z}^\nu$. Put
\[ u'^{(i_j)}_{\alpha\alpha} \equiv u^{(i_j)}_{\alpha\alpha} \ \text{ for } \alpha = 1, 2 , \qquad u'^{(i_j)}_{\alpha\beta} \equiv u^{(i_j)}_{\alpha\beta}\, v_{\{i_1, \ldots, i_{j-1}\}} \ \text{ for } \alpha \ne \beta . \tag{4.42} \]
Define
\[ u_{kl} \equiv \prod_{j=1}^{|I|} u'^{(i_j)}_{k_j l_j} , \tag{4.43} \]
where $k_n$ and $l_n$ are either 1 or 2, respectively, $k = (k_1, \ldots, k_{|I|})$ and $l = (l_1, \ldots, l_{|I|})$. Then the following holds.
(1) The set of all $u_{kl}$ form a self-adjoint system of matrix units of $\mathcal{A}(I)$.
(2) Let $\sigma(k, l)$ be the number of $n$ such that $k_n \ne l_n$. Then
\[ \Theta(u_{kl}) = (-1)^{\sigma(k,l)} u_{kl} . \tag{4.44} \]
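(3) Any $a \in \mathcal{A}$ has a unique expansion
\[ a = \sum_{k,l} u_{kl} a_{kl} \tag{4.45} \]
with $a_{kl} \in \mathcal{A}(I^c)$ and $a_{kl}$ is uniquely given by
\[ a_{kl} = 2^{|I|} E_{I^c}(u_{lk} a) . \tag{4.46} \]

Proof. (1) By using (4.1) for the case of $i = j$, $\{u^{(i)}_{\alpha\beta}\}_{\alpha\beta}$ ($\alpha, \beta = 1, 2$) satisfies the relations
\[ (u^{(i)}_{\alpha\beta})^* = u^{(i)}_{\beta\alpha} , \qquad u^{(i)}_{\alpha\beta} u^{(i)}_{\alpha'\beta'} = \delta_{\beta\alpha'} u^{(i)}_{\alpha\beta'} , \qquad \sum_\alpha u^{(i)}_{\alpha\alpha} = 1 , \tag{4.47} \]
for a self-adjoint system of matrix units. Since $v_{\{i_1, \ldots, i_{j-1}\}}$ is a self-adjoint unitary commuting with $a_{i_j}$ and $a_{i_j}^*$, the same computation shows that $\{u'^{(i_j)}_{\alpha\beta}\}_{\alpha\beta}$ ($\alpha, \beta = 1, 2$) satisfies the same relations.

Since $v_{\{i_1, \ldots, i_{j-1}\}}$ anticommutes with $a_{i_k}$ and $a_{i_k}^*$ for $k < j$ and commutes with them for $k \ge j$, $\{u'^{(i_j)}_{\alpha\beta}\}_{\alpha\beta}$ commutes with each other for different $j$. Since they generate all $\mathcal{A}(\{i_k\})$ recursively for $k = 1, \ldots, n$, they form a self-adjoint system of matrix units of $\mathcal{A}(I)$.

(2) $\Theta(u^{(i)}_{\alpha\alpha}) = u^{(i)}_{\alpha\alpha}$, $\Theta(u^{(i)}_{\alpha\beta}) = -u^{(i)}_{\alpha\beta}$ for $\alpha \ne \beta$, and $\Theta(v_{\{i_1, \ldots, i_{j-1}\}}) = v_{\{i_1, \ldots, i_{j-1}\}}$ imply (4.44).

(3) For a full matrix algebra $\mathcal{A}(I)$ contained in a C∗-algebra $\mathcal{A}$, the following expansion of any $a \in \mathcal{A}$ in terms of a self-adjoint system of matrix units $\{u_{kl}\}$ of $\mathcal{A}(I)$ is well-known.
\[ a = \sum_{k,l} u_{kl} b_{kl} , \qquad b_{kl} = \sum_m u_{mk}\, a\, u_{lm} \in \mathcal{A}(I)' . \tag{4.48} \]
By Lemma 4.15, there are $b_{kl1}$ and $b_{kl2}$ in $\mathcal{A}(I^c)$ satisfying
\[ b_{kl} = b_{kl1} + v_I b_{kl2} . \tag{4.49} \]
By direct computation, $u_{kl} v_I = \pm u_{kl}$ where the sign depends on $k$ and $l$. Thus we have the expansion (4.45) with $a_{kl} = b_{kl1} \pm b_{kl2} \in \mathcal{A}(I^c)$. The coefficient $a_{kl} \in \mathcal{A}(I^c)$ is uniquely determined by the following computation and given by (4.46).
\begin{align*}
E_{I^c}(u_{lk} a) &= E_{I^c}\Big( \sum_{l'} u_{ll'} a_{kl'} \Big) \\
&= \sum_{l'} E_{I^c}(u_{ll'}) a_{kl'} = \sum_{l'} \tau(u_{ll'}) a_{kl'} = 2^{-|I|} a_{kl} .
\end{align*}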
April 11, 2003 14:43 WSPC/148-RMP
122
00160
H. Araki & H. Moriya
Here we have used the following relation: τ (ukl ) = τ (ukm uml ) = τ (uml ukm ) = δkl τ (umm ) = δkl 2
−|I|
τ
X
umm
m
!
= 2−|I| δkl .
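Theorem 4.19. (1) For a finite $I$,
\[ (\mathcal{A}(I)_+)' \cap \mathcal{A} = \mathcal{A}(I^c) + v_I \mathcal{A}(I^c), \tag{4.50} \]
where $v_I$ is given by (4.30).

(2) For an infinite $I$,
\[ (\mathcal{A}(I)_+)' \cap \mathcal{A} = \mathcal{A}(I^c). \tag{4.51} \]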
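Proof. (1) is given by Lemma 4.14. To prove (2), we consider an infinite $I$. Clearly $(\mathcal{A}(I)_+)' \cap \mathcal{A} \supset \mathcal{A}(I^c)$ due to (4.8). Hence it is enough to prove that any $b \in (\mathcal{A}(I)_+)' \cap \mathcal{A}$ belongs to $\mathcal{A}(I^c)$.

Let $\{L_n\}$ be an increasing sequence of finite subsets of $\mathbb{Z}^\nu$ such that their union is $\mathbb{Z}^\nu$. Set $I_n \equiv L_n \cap I$. Then $I_n \nearrow I$. For any $\varepsilon > 0$, there exist a positive integer $l_\varepsilon$ and an element $b_\varepsilon$ of $\mathcal{A}(L_{l_\varepsilon})$ satisfying
\[ \|b - b_\varepsilon\| < \varepsilon. \]
For any $n$, $b \in (\mathcal{A}(I_n)_+)'$ due to $I_n \subset I$ and $b \in (\mathcal{A}(I)_+)'$. The conclusion of (1) implies
\[ b = b_{0n} + v_{I_n} b_{1n}, \tag{4.52} \]
where $b_{0n}, b_{1n} \in \mathcal{A}((I_n)^c)$. Since $I_n \nearrow I$ and $I$ is infinite, there exists an $n_\varepsilon$ such that $I_{n_\varepsilon}$ contains a point $i$ which does not belong to $L_{l_\varepsilon}$. Then $i \in I_n$ for all $n \geq n_\varepsilon$. Due to $b_\varepsilon \in \mathcal{A}(L_{l_\varepsilon})$ and $\{i\}^c \supset L_{l_\varepsilon}$,
\[ E_{\{i\}^c}(b_\varepsilon) = b_\varepsilon. \tag{4.53} \]
Since $b_{0n}, b_{1n} \in \mathcal{A}((I_n)^c) \subset \mathcal{A}(\{i\}^c)$ for all $n \geq n_\varepsilon$, we have
\[ E_{\{i\}^c}(b_{0n}) = b_{0n}, \qquad E_{\{i\}^c}(v_{I_n} b_{1n}) = \tau(v_i)\, v_{I_n \setminus \{i\}}\, b_{1n} = 0. \tag{4.54} \]
This implies
\[ E_{\{i\}^c}(b) = E_{\{i\}^c}(b_{0n}) + E_{\{i\}^c}(v_{I_n} b_{1n}) = b_{0n}. \tag{4.55} \]
It follows from (4.53) and (4.55) that
\[ \|b_\varepsilon - b_{0n}\| = \|E_{\{i\}^c}(b_\varepsilon) - E_{\{i\}^c}(b)\| \leq \|b_\varepsilon - b\| < \varepsilon. \]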
Therefore,
\[ \|b - b_{0n}\| \leq \|b - b_\varepsilon\| + \|b_\varepsilon - b_{0n}\| < 2\varepsilon \tag{4.56} \]
for all $n \geq n_\varepsilon$. Hence
\[ b = \lim_{n} b_{0n}. \]
For any fixed $m \in \mathbb{N}$, $b_{0n} \in \mathcal{A}((I_n)^c) \subset \mathcal{A}((I_m)^c)$ for all $n \geq m$ due to $I_n \supset I_m$. Thus $b \in \mathcal{A}((I_m)^c)$ for any $m$. By Corollary 4.12,
\[ b \in \bigcap_m \mathcal{A}((I_m)^c) = \mathcal{A}\Bigl(\bigcap_m (I_m)^c\Bigr) = \mathcal{A}\Bigl(\Bigl(\bigcup_m I_m\Bigr)^c\Bigr) = \mathcal{A}(I^c). \]

As a by-product, we obtain the following.

Corollary 4.20. For any infinite $I$, the restriction of $\Theta$ to $\mathcal{A}(I)$ is outer.

Proof. We denote the restriction of $\Theta$ by the same letter. For any infinite subsets $I$ and $J$, $(\mathcal{A}(I), \Theta)$ is isomorphic to $(\mathcal{A}(J), \Theta)$ as a pair of a C∗-algebra and its automorphism, through any bijective map between $I$ and $J$. Therefore it is enough to show the assertion for a proper infinite subset $I$ of $\mathbb{Z}^\nu$.

Suppose that $u$ is a unitary element in $\mathcal{A}(I)$ such that
\[ u^* a u = \Theta(a) \]
for all $a \in \mathcal{A}(I)$. Substituting $u$ into $a$, we have $\Theta(u) = u$. Let $b \in \mathcal{A}(I^c)_-$ and $b \neq 0$. Then $ub \in \mathcal{A}_-$. By (4.8),
\[ ba = \Theta(a) b. \]
Hence $ub \in \mathcal{A}(I)'$. Therefore $ub \in (\mathcal{A}(I)')_-$, which implies $ub = 0$ by Lemma 4.16. This implies
\[ b = u^*(ub) = 0, \]
a contradiction.

5. Dynamics

5.1. Assumptions

We consider a one-parameter group of ∗-automorphisms $\alpha_t$ of the Fermion algebra $\mathcal{A}$. Throughout this work, $\alpha_t$ is assumed to be strongly continuous, that is, $t \in \mathbb{R} \mapsto \alpha_t(A) \in \mathcal{A}$ is norm continuous for each $A \in \mathcal{A}$.

In order to associate a potential to the dynamics $\alpha_t$ (see Sec. 5.4 for details), we need the following two assumptions on $\alpha_t$ and its generator $\delta_\alpha$ with the domain $D(\delta_\alpha)$:

(I) $\alpha_t \Theta = \Theta \alpha_t$ for all $t \in \mathbb{R}$.
(II) $\mathcal{A}_\circ$ is in the domain of $\delta_\alpha$, namely, $\mathcal{A}_\circ \subset D(\delta_\alpha)$.
The assumption (I) of Θ-even dynamics comes from two sources. On the physical side, the generator of the time translation $\alpha_t$ should be $i = \sqrt{-1}$ times the commutator with the energy operator, which is a physical observable and hence Θ-even. On the technical side, the potential to be introduced below has to commute with a fixed local element of $\mathcal{A}$ when the support region of the potential is far away, in order that the expression for the action of the generator on that local element converges and makes sense.

For $\alpha_t$ to be uniquely specified by the associated potential to be introduced in Sec. 5.4, we need the following assumption:

(III) $\mathcal{A}_\circ$ is the core of $\delta_\alpha$, namely, if $\delta$ denotes the restriction of $\delta_\alpha$ to $\mathcal{A}_\circ$, its closure $\bar{\delta}$ is $\delta_\alpha$.

The assumption (III) will be used to derive a conclusion involving $\alpha_t$, such as the KMS condition, from other conditions involving the associated potential, such as the Gibbs condition and the variational principle.

Later, when we discuss translation invariant equilibrium states, we will add the assumption of translation invariance:

(IV) $\alpha_t \tau_k = \tau_k \alpha_t$ for any $t \in \mathbb{R}$, $k \in \mathbb{Z}^\nu$.

Later in Proposition 8.1, it will be shown that Assumption (IV) implies Assumption (I).

By Assumptions (I) and (II), the restriction $\delta$ of $\delta_\alpha$ to $\mathcal{A}_\circ$ satisfies
\[ \delta\Theta(A) = \Theta(\delta A) \tag{5.1} \]
for any $A \in \mathcal{A}_\circ$. In the rest of this section, we deal with an arbitrary ∗-derivation $\delta$ with the domain $\mathcal{A}_\circ$ commuting with $\Theta$ (Eq. (5.1)), irrespective of whether it comes from a dynamics $\alpha_t$ or not. Of course, we can use the results about such a general $\delta$ for the restriction of $\delta_\alpha$ to $\mathcal{A}_\circ$.

5.2. Local Hamiltonians

Since $\mathcal{A}(I)$ is a finite type I factor for each finite subset $I$ of $\mathbb{Z}^\nu$, there exists a self-adjoint element $H_I^0 \in \mathcal{A}$ satisfying
\[ \delta A = i[H_I^0, A] \tag{5.2} \]
for any $A \in \mathcal{A}(I)$, where $\delta$ is any ∗-derivation with its domain $\mathcal{A}_\circ$ and values in $\mathcal{A}$ (i.e. $\delta$ is a linear map from $\mathcal{A}_\circ$ into $\mathcal{A}$ satisfying $\delta(AB) = (\delta A)B + A(\delta B)$ and $\delta(A^*) = (\delta A)^*$). Although this is well known (see e.g. [38]), we include its proof for the sake of completeness.

Lemma 5.1. Let $\{u_{ij}\}$ be a self-adjoint system of matrix units of $\mathcal{A}(I)$. Define
\[ h_{ij} \equiv \sum_{l} u_{li}\, \delta u_{jl} - \delta_{ij}\, 2^{-|I|} \sum_{l} \sum_{m} u_{lm}\, \delta u_{ml}. \]
Then $h_{ij} \in \mathcal{A}(I)'$. Define
\[ iH \equiv \sum_{i,j} u_{ij} h_{ij}. \]
It satisfies $H^* = H$ and
\[ [iH, A] = \delta A \]
for $A \in \mathcal{A}(I)$. Furthermore,
\[ E_{I^c}(H) = 0. \tag{5.3} \]
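Proof. (1) We first prove $h_{ij} \in \mathcal{A}(I)'$. If $i \neq j$,
\begin{align*}
[h_{ij}, u_{\alpha\beta}] &= \sum_{l} u_{li} (\delta u_{jl}) u_{\alpha\beta} - u_{\alpha i}\, \delta u_{j\beta} \\
&= \sum_{l} u_{li} \bigl(\delta(u_{jl} u_{\alpha\beta}) - u_{jl}\, \delta u_{\alpha\beta}\bigr) - u_{\alpha i}\, \delta u_{j\beta} \\
&= u_{\alpha i}\, \delta u_{j\beta} - u_{\alpha i}\, \delta u_{j\beta} = 0.
\end{align*}
If $i = j$,
\begin{align*}
[h_{ii}, u_{\alpha\beta}] &= \sum_{l} u_{li} (\delta u_{il}) u_{\alpha\beta} - u_{\alpha i}\, \delta u_{i\beta} - \sum_{l} \sum_{m} 2^{-|I|} \{u_{lm} (\delta u_{ml}) u_{\alpha\beta} - u_{\alpha\beta} u_{lm}\, \delta u_{ml}\} \\
&= \sum_{l} u_{li} \bigl(\delta(u_{il} u_{\alpha\beta}) - u_{il}\, \delta u_{\alpha\beta}\bigr) - u_{\alpha i}\, \delta u_{i\beta} \\
&\quad - \sum_{l} \sum_{m} 2^{-|I|} \{u_{lm} \bigl(\delta(u_{ml} u_{\alpha\beta}) - u_{ml}\, \delta u_{\alpha\beta}\bigr) - u_{\alpha\beta} u_{lm}\, \delta u_{ml}\} \\
&= u_{\alpha i}\, \delta u_{i\beta} - \delta u_{\alpha\beta} - u_{\alpha i}\, \delta u_{i\beta} - \sum_{m} 2^{-|I|} u_{\alpha m}\, \delta u_{m\beta} + 2^{-|I|} (2^{|I|} 1)\, \delta u_{\alpha\beta} + 2^{-|I|} \sum_{m} u_{\alpha m}\, \delta u_{m\beta} \\
&= 0.
\end{align*}

(2) We prove $[iH, u_{\alpha\beta}] = \delta u_{\alpha\beta}$, which yields $[iH, A] = \delta A$ for any $A \in \mathcal{A}(I)$ by linearity.
\begin{align*}
[iH, u_{\alpha\beta}] &= \sum_{i,j} [u_{ij}, u_{\alpha\beta}]\, h_{ij} = \sum_{i} u_{i\beta} h_{i\alpha} - \sum_{j} u_{\alpha j} h_{\beta j} \\
&= \sum_{i} u_{ii}\, \delta u_{\alpha\beta} - \sum_{m} 2^{-|I|} u_{\alpha m}\, \delta u_{m\beta} - u_{\alpha\beta} \sum_{j} \delta u_{jj} + \sum_{m} 2^{-|I|} u_{\alpha m}\, \delta u_{m\beta} \\
&= \delta u_{\alpha\beta} - u_{\alpha\beta}\, \delta\Bigl(\sum_{j} u_{jj}\Bigr) \\
&= \delta u_{\alpha\beta} - u_{\alpha\beta}\, \delta 1 \\
&= \delta u_{\alpha\beta},
\end{align*}
where we have used $h_{ij} \in \mathcal{A}(I)'$ for the first equality.

(3) Next we prove $H^* = H$ or $iH + (iH)^* = 0$. By using $u_{ij}^* = u_{ji}$ and $(\delta a)^* = \delta a^*$, we obtain
\[ iH + (iH)^* = \sum_{i,j} u_{ij} (h_{ij} + h_{ji}^*), \]
\begin{align*}
h_{ij} + h_{ji}^* &= \sum_{l} \{u_{li}\, \delta u_{jl} + (\delta u_{li}) u_{jl}\} - \delta_{ij}\, 2^{-|I|} \sum_{l} \sum_{m} \{u_{lm}\, \delta u_{ml} + (\delta u_{lm}) u_{ml}\} \\
&= \sum_{l} \delta(u_{li} u_{jl}) - \delta_{ij}\, 2^{-|I|} \sum_{l} \sum_{m} \delta(u_{lm} u_{ml}) \\
&= \delta_{ij}\, \delta\Bigl(\sum_{l} u_{ll}\Bigr) - \delta_{ij}\, \delta\Bigl(\sum_{l} u_{ll}\Bigr) \\
&= 0.
\end{align*}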
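Hence $iH + (iH)^* = 0$.

(4) We prove the last statement. Note that $\tau(u_{ij}) = 2^{-|I|} \delta_{ij}$. Hence
\[ i E_{I^c}(H) = 2^{-|I|} \sum_{i} h_{ii} = 2^{-|I|} \sum_{i} \Bigl\{ \sum_{l} u_{li}\, \delta u_{il} - 2^{-|I|} \sum_{l} \sum_{m} u_{lm}\, \delta u_{ml} \Bigr\} = 0. \]

We denote this $H$ by $H_I^0$.

Lemma 5.2. If $\delta$ is a ∗-derivation with domain $\mathcal{A}_\circ$ and values in $\mathcal{A}$ commuting with $\Theta$, then there exists a self-adjoint element $H(I) \in \mathcal{A}_+$ satisfying
\[ \delta A = i[H(I), A] \]
for all $A \in \mathcal{A}(I)$ and
\[ E_{I^c}(H(I)) = 0. \]

Proof. Due to commutativity of $\delta$ and $\Theta$ and $\Theta^2 = 1$, we have
\[ \delta A = \Theta(\delta\Theta(A)) = \Theta(i[H_I^0, \Theta(A)]) = i[\Theta(H_I^0), A] \]
for any $A \in \mathcal{A}(I)$. Set
\[ H(I) \equiv (H_I^0)_+ = \frac{1}{2}\bigl(H_I^0 + \Theta(H_I^0)\bigr) \quad (\in \mathcal{A}_+). \tag{5.4} \]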
Then we have $H(I)^* = H(I)$ and
\[ \delta A = i[H(I), A] \quad (A \in \mathcal{A}(I)). \]
Since $E_{I^c}(H_I^0) = 0$, it follows from (5.4) and (4.21) that
\[ E_{I^c}(H(I)) = 0. \]

The local Hamiltonian operator $H(I)$ obtained in the above lemma has the following properties:

(H-i) $H(I)^* = H(I) \in \mathcal{A}$.
(H-ii) $\Theta(H(I)) = H(I)$ (i.e. $H(I) \in \mathcal{A}_+$).
(H-iii) $\delta A = i[H(I), A]$ ($A \in \mathcal{A}(I)$).
(H-iv) $E_{I^c}(H(I)) = 0$.

Remark. The property (H-iv) implies
\[ \tau(H(I)) = \tau(E_{I^c}(H(I))) = 0. \tag{5.5} \]
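The construction of Lemma 5.1 can be checked numerically in the simplest situation. The following is a minimal sketch under a strong simplifying assumption that is not part of the text: the whole algebra is taken to be the full matrix algebra $M_n$ (so $\mathcal{A}(I) = M_n$, the complement carries no observables, and $E_{I^c}$ reduces to $\tau(\cdot)1$ with $2^{-|I|}$ replaced by $1/n$). The derivation is chosen as $\delta = i[H_0, \cdot\,]$ for a hypothetical Hermitian matrix `H0`. In this toy case the formulas of Lemma 5.1 should return $H = H_0 - \tau(H_0)1$, which is self-adjoint and has vanishing trace.

```python
# Toy check of Lemma 5.1 on M_n (assumption: the complement of I is trivial,
# so E_{I^c}(x) = tau(x) 1 and 2^{-|I|} is replaced by 1/n).

def zeros(n):
    return [[0j] * n for _ in range(n)]

def unit(n, i, j):
    """Matrix unit u_ij: 1 in entry (i, j), 0 elsewhere."""
    m = zeros(n)
    m[i][j] = 1 + 0j
    return m

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def comm(a, b):
    return add(mul(a, b), scale(-1, mul(b, a)))

def standard_hamiltonian(h0):
    """Build H from delta = i[h0, .] via h_ij and iH = sum_ij u_ij h_ij."""
    n = len(h0)
    delta = lambda x: scale(1j, comm(h0, x))
    u = [[unit(n, i, j) for j in range(n)] for i in range(n)]
    # s = sum_l sum_m u_lm delta(u_ml), the term subtracted when i == j
    s = zeros(n)
    for l in range(n):
        for m in range(n):
            s = add(s, mul(u[l][m], delta(u[m][l])))
    i_h = zeros(n)
    for i in range(n):
        for j in range(n):
            h_ij = zeros(n)
            for l in range(n):
                h_ij = add(h_ij, mul(u[l][i], delta(u[j][l])))
            if i == j:
                h_ij = add(h_ij, scale(-1.0 / n, s))
            i_h = add(i_h, mul(u[i][j], h_ij))
    return scale(-1j, i_h)  # H = -i * (iH)

# Hypothetical Hermitian generator, chosen only for illustration.
H0 = [[2, 1 - 1j, 0, 3j],
      [1 + 1j, -1, 2, 0],
      [0, 2, 4, 1j],
      [-3j, 0, -1j, 3]]
H = standard_hamiltonian(H0)
```

Since $H_0$ and $H$ differ by a scalar multiple of the identity, they generate the same derivation, which is the toy analogue of $[iH, A] = \delta A$.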
Lemma 5.3. $H(I)$ satisfying (H-ii)–(H-iv) is uniquely determined by $\delta$.

Proof. If $H(I)$ and $H(I)'$ satisfy (H-ii)–(H-iv), then $\Delta = H(I) - H(I)'$ satisfies $[\Delta, A] = 0$ for all $A \in \mathcal{A}(I)$ due to (H-iii). By Lemma 4.15 and (H-ii) for $\Delta$,
\[ \Delta \in \mathcal{A}(I)' \cap \mathcal{A}_+ = \mathcal{A}(I^c)_+. \]
Hence (H-iv) implies
\[ \Delta = E_{I^c}(\Delta) = E_{I^c}(H(I)) - E_{I^c}(H(I)') = 0. \]
Therefore $H(I)$ satisfying (H-ii)–(H-iv) is unique.

We call $H(I)$ the standard Hamiltonian for the region $I$.

Remark. For the empty set $\emptyset$, $H(\emptyset) = 0$ by (H-iv). Under the conditions (H-ii)–(H-iv), the property $H(I)^* = H(I)$ of (H-i) and the property $(\delta A)^* = \delta A^*$ ($A \in \mathcal{A}(I)$) for $\delta$ are equivalent, for the following reason. If $H(I)^* = H(I)$, then $(\delta A)^* = \delta A^*$ immediately follows from (H-iii). If $(\delta A)^* = \delta A^*$, then $H(I)^*$ satisfies (H-iii) along with (H-ii) and (H-iv). Hence $H(I)^* = H(I)$ by the uniqueness result, Lemma 5.3.

Lemma 5.4. If $I \subset J$ is a pair of finite subsets, then
\[ H(I) = H(J) - E_{I^c}(H(J)). \tag{5.6} \]

Proof. $H(J)$ satisfies (H-ii) and (H-iii) for the region $I$ ($\subset J$). Furthermore, $E_{I^c}(H(J)) \in \mathcal{A}(I^c)_+$ due to (H-ii) for $H(J)$, and hence it commutes with $A \in \mathcal{A}(I)$. Therefore $H(J) - E_{I^c}(H(J))$ satisfies (H-ii)–(H-iv) for the region $I$. By the uniqueness (Lemma 5.3), we obtain $H(I) = H(J) - E_{I^c}(H(J))$.
We give the number (H-v) to the condition above:

(H-v) $H(I) = H(J) - E_{I^c}(H(J))$ for any finite subsets $I \subset J$ of $\mathbb{Z}^\nu$.

The proof above has shown that (H-v) is derived from (H-ii)–(H-iv).

So far we have derived the properties (H-i), (H-ii), (H-iv) and (H-v) for the family $\{H(I)\}$ from its definition in terms of $\delta$ through the relation (H-iii). In the converse direction, any such family of elements $H(I) \in \mathcal{A}$, one for each finite subset $I$ of $\mathbb{Z}^\nu$, defines a derivation $\delta$ on $\mathcal{A}_\circ$ by (H-iii). This definition requires a consistency check: if $A \in \mathcal{A}(I)$ and $A \in \mathcal{A}(J)$, we have a definition of $\delta(A)$ by $H(I)$ and one by $H(J)$. The proof that they are the same is given as follows. First we note that $A \in \mathcal{A}(I) \cap \mathcal{A}(J) = \mathcal{A}(I \cap J)$. Thus it is enough to show
\[ [H(I), A] = [H(K), A] \tag{5.7} \]
for any $K \subset I$ and $A \in \mathcal{A}(K)$, because, using this identity for the pairs $I \supset K = I \cap J$ and $J \supset K$, we obtain $[H(I), A] = [H(J), A]$ for any $A \in \mathcal{A}(I \cap J)$. Since $E_{K^c}(H(I))$ is Θ-even by (H-ii) and (4.21), $E_{K^c}(H(I))$ is in $\mathcal{A}(K^c)_+$ and commutes with $A \in \mathcal{A}(K)$. By (H-v), $H(K) = H(I) - E_{K^c}(H(I))$, which leads to the consistency equation (5.7).

The map $\delta$ defined by (H-iii) is a ∗-derivation with domain $\mathcal{A}_\circ$ due to (H-i), and commutes with $\Theta$ by (H-ii). We have not used (H-iv) in this argument, but have imposed it on $H(I)$ to obtain the uniqueness of $H(I)$ for a given $\delta$. Namely, by Lemmas 5.2 and 5.3, the correspondence of $\delta$ and $H(I)$ is bijective, for which the condition (H-iv) is used.

Summarizing the argument so far, we have obtained Theorem 5.7, stated below after the introduction of two definitions.

Definition 5.5. The real vector space of all ∗-derivations with their definition domain $\mathcal{A}_\circ$ and commuting with $\Theta$ (on $\mathcal{A}_\circ$) is denoted by $\Delta(\mathcal{A}_\circ)$.

Remark. Under Assumptions (I) and (II), the restriction $\delta$ of the generator $\delta_\alpha$ of $\alpha_t$ belongs to $\Delta(\mathcal{A}_\circ)$.

Definition 5.6. The real vector space of functions $H(I)$ of finite subsets $I$ satisfying the following four conditions is denoted by $\mathcal{H}$, and its element $H$ is called a local Hamiltonian.

(H-i) $H(I)^* = H(I) \in \mathcal{A}$,
(H-ii) $\Theta(H(I)) = H(I)$ (i.e. $H(I) \in \mathcal{A}_+$),
(H-iv) $E_{I^c}(H(I)) = 0$,
(H-v) $H(I) = H(J) - E_{I^c}(H(J))$ for any finite subsets $I \subset J$ of $\mathbb{Z}^\nu$.
Theorem 5.7. The following relation between $H \in \mathcal{H}$ and $\delta \in \Delta(\mathcal{A}_\circ)$ gives a bijective, real linear map from $\mathcal{H}$ to $\Delta(\mathcal{A}_\circ)$.

(H-iii) $\delta A = i[H(I), A]$ ($A \in \mathcal{A}(I)$).

Remark. The value $\delta A$ of the derivation $\delta \in \Delta(\mathcal{A}_\circ)$ for $A \in \mathcal{A}_\circ$ is in general not in $\mathcal{A}_\circ$.

5.3. Internal energy

For a finite subset $I$ of $\mathbb{Z}^\nu$, set
\[ U(I) \equiv E_I(H(I)) \quad (\in \mathcal{A}(I)) \tag{5.8} \]
and call it the internal energy for the region $I$. Due to $H(\emptyset) = 0$, $U(\emptyset) = 0$. Due to the property (5.5),
\[ E_I E_{I^c}(H(J)) = \tau(H(J)) = 0. \]
By (H-v), we obtain for $I \subset J$
\[ U(I) = E_I H(I) = E_I(\{H(J) - E_{I^c}(H(J))\}) = E_I H(J) = E_I E_J H(J) = E_I U(J). \tag{5.9} \]
Furthermore, for any finite subset $I$ and any subset $J$ of $\mathbb{Z}^\nu$, we have
\[ E_J(U(I)) = E_J E_I(U(I)) = E_{J \cap I}(U(I)) = U(I \cap J), \tag{5.10} \]
where the last equality is due to (5.9). Due to (5.5),
\[ \tau(U(I)) = \tau(E_I(H(I))) = \tau(H(I)) = 0. \tag{5.11} \]
Let us denote
\[ H_J(I) \equiv E_J(H(I)). \tag{5.12} \]
Lemma 5.8. (1) For any pair of finite subsets $I$ and $J$,
\[ H_J(I) = U(J) - U(I^c \cap J). \tag{5.13} \]
(2) For any finite subset $I$,
\[ H(I) = \lim_{J \nearrow \mathbb{Z}^\nu} \bigl(U(J) - U(I^c \cap J)\bigr). \tag{5.14} \]

Proof. (1): By applying (H-v) for the pairs $I \supset I \cap J$ and $J \supset I \cap J$, we obtain
\[ H(I \cap J) = H(I) - E_{(I \cap J)^c}(H(I)), \qquad H(I \cap J) = H(J) - E_{(I \cap J)^c}(H(J)). \]
Therefore,
\[ H(I) = H(J) - E_{(I \cap J)^c}(H(J) - H(I)). \]
By applying $E_J$ to this equation, we obtain
\[ H_J(I) = U(J) - E_J E_{(I \cap J)^c}(H(J) - H(I)). \]
Since
\[ J \cap (I \cap J)^c = J \cap (I^c \cup J^c) = (J \cap I^c) \cup (J \cap J^c) = J \cap I^c, \]
we obtain
\[ E_J E_{(I \cap J)^c} = E_{J \cap (I \cap J)^c} = E_{J \cap I^c} = E_J E_{I^c} = E_{I^c} E_J. \]
Since $E_{I^c}(H(I)) = 0$ by (H-iv), we have
\[ E_J E_{(I \cap J)^c}(H(J) - H(I)) = E_{I^c} E_J(H(J)) = E_{I^c}(U(J)). \]
Thus
\[ H_J(I) = U(J) - E_{I^c}(U(J)). \]
By this and (5.10), we arrive at (5.13).

(2): By (4.23), we have
\[ H(I) = \lim_{J \nearrow \mathbb{Z}^\nu} H_J(I). \tag{5.15} \]
This and (5.13) imply the desired (5.14).

5.4. Potential

We introduce the potential $\{\Phi(I)\}$ in terms of $\{H(I)\}$ and derive its characterizing properties. As a consequence, we establish the one-to-one correspondence between $\{\Phi(I)\}$ and $\{H(I)\}$.

Lemma 5.9. For a given $\{H(I)\} \in \mathcal{H}$ and the corresponding $\{U(I)\}$, there exists one and only one family $\{\Phi(I) \in \mathcal{A};\ \text{finite } I \subset \mathbb{Z}^\nu\}$ satisfying the following conditions:

(1) $\Phi(I) \in \mathcal{A}(I)$.
(2) $\Phi(I)^* = \Phi(I)$, $\Theta(\Phi(I)) = \Phi(I)$, $\Phi(\emptyset) = 0$.
(3) $E_J(\Phi(I)) = 0$ if $J \subset I$ and $J \neq I$.
(4) $U(I) = \sum_{K \subset I} \Phi(K)$.
(5) $H(I) = \lim_{J \nearrow \mathbb{Z}^\nu} \sum_{K} \{\Phi(K);\ K \cap I \neq \emptyset,\ K \subset J\}$.

Proof. We show this lemma in several steps.

Step 1. Existence of $\Phi$ satisfying (1) and (4) for all finite $I$.

The following expression for $\Phi(I)$ in terms of $U(K)$, $K \subset I$, satisfies (1) and (4) for all $I$, and hence proves the existence:
\[ \Phi(I) = \sum_{K \subset I} (-1)^{|I| - |K|}\, U(K). \tag{5.16} \]
In fact, substituting this expression into $\sum_{J \subset I} \Phi(J)$, we obtain
\[ \sum_{J \subset I} \sum_{K \subset J} (-1)^{|J| - |K|}\, U(K) = \sum_{K \subset I} \alpha(K)\, U(K), \]
\[ \alpha(K) = \sum_{J:\, K \subset J \subset I} (-1)^{|J| - |K|} = \sum_{m = |K|}^{|I|} (-1)^{m - |K|} \beta_m, \tag{5.17} \]
where $\beta_m$ is the number of distinct $J$ satisfying
\[ K \subset J \subset I, \qquad |J| = m. \]
This is the number of ways of choosing $m - |K|$ elements (for $J \setminus K$) out of $I \setminus K$, which is $\binom{|I| - |K|}{m - |K|}$. Putting $l = m - |K|$ and $n = |I| - |K|$, we obtain
\[ \alpha(K) = \sum_{l=0}^{n} (-1)^l \binom{n}{l} = (1 - 1)^n = 0 \]
for all $K \neq I$ (then $n \geq 1$), while we have $\alpha(I) = 1$. Hence (4) is satisfied by $\Phi(I)$ given as (5.16) for all $I$.

Step 2. Uniqueness of $\Phi$ satisfying (4).

The relation (4) implies
\[ \Phi(I) = U(I) - \sum_{K \subset I,\ K \neq I} \Phi(K), \tag{5.18} \]
which obviously determines $\Phi(I)$ uniquely for a given $\{U(I)\}$ by mathematical induction on $|I| = m$, starting from $\Phi(\emptyset) = U(\emptyset) = 0$.

Step 3. Property (2).

We have already obtained $\Phi(\emptyset) = 0$. Since $U(I)^* = U(I)$ and $\Theta(U(I)) = U(I)$, $\Phi(I)$ defined by (5.16) as a real linear combination of $U(K)$, $K \subset I$, satisfies (2).

Step 4. Property (3).

We note that (3) is equivalent to the following condition:
\[ E_J(\Phi(I)) = 0 \quad \text{for } J \not\supset I, \tag{5.19} \]
because $E_J(\Phi(I)) = E_J E_I(\Phi(I)) = E_{J \cap I}(\Phi(I))$ by Theorem 4.7, $J \cap I \subset I$, and $J \cap I \neq I$ if and only if $J \not\supset I$. On the other hand, $E_J(\Phi(I)) = \Phi(I)$ if $J \supset I$, due to $\Phi(I) \in \mathcal{A}(I) \subset \mathcal{A}(J)$.

We now prove (3) by mathematical induction on $|I| = m$. For $m = 1$, the only $J$ satisfying $J \subset I$ and $J \neq I$ is $J = \emptyset$, for which $\Phi(J) = 0$. Then $\Phi(I) = U(I)$ and $E_J(\Phi(I)) = \tau(\Phi(I))1 = \tau(U(I))1 = 0$ due to (5.11).

Suppose (3) holds for $|I| < m$. We consider $I$ with $|I| = m$. We apply $E_J$ (for $J \subset I$, $J \neq I$) to both sides of (5.18). All $K$ in the summation on the right-hand side satisfy $|K| < m$ due to $K \subset I$ and $K \neq I$. Hence the inductive assumption is applicable to $\Phi(K)$ on the right-hand side. If $K \not\subset J$, we have $E_J(\Phi(K)) = 0$ by (5.19). If $K \subset J$, we have $E_J(\Phi(K)) = \Phi(K)$. Therefore, by using $E_J U(I) = U(J)$ (due to $J \subset I$), we obtain
\[ E_J \Phi(I) = E_J U(I) - \sum_{K \subset I,\ K \neq I} E_J \Phi(K) = U(J) - \sum_{K \subset J} \Phi(K) = 0. \]
This proves (3).

Step 5. Property (5).

For a finite subset $J$ and $I \subset J$, $H_J(I)$ is written in terms of $\Phi$ by (5.13) and (4) as
\[ H_J(I)\ (= E_J(H(I))) = \sum_{K} \{\Phi(K);\ K \cap I \neq \emptyset,\ K \subset J\}. \tag{5.20} \]
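Due to (5.15), $\Phi$ satisfies (5).

We collect useful formulae for $U$ and $H$ in terms of $\Phi$ which have been obtained above:
\[ U(I) = \sum_{K \subset I} \Phi(K), \tag{5.21} \]
\[ H_J(I) = \sum_{K} \{\Phi(K);\ K \cap I \neq \emptyset,\ K \subset J\}, \tag{5.22} \]
\[ H(I) = \lim_{J \nearrow \mathbb{Z}^\nu} \sum_{K} \{\Phi(K);\ K \cap I \neq \emptyset,\ K \subset J\} \quad \Bigl(= \lim_{J \nearrow \mathbb{Z}^\nu} H_J(I)\Bigr). \tag{5.23} \]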
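The combinatorial core of Steps 1 and 2, namely the inversion between (5.16) and (5.21), can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the algebra-valued $U(K)$ are replaced by hypothetical integers (only the linear structure matters for the inversion), and the names `subsets` and `potential_from_energy` are introduced here for the example. It also evaluates the alternating binomial sum $\alpha(K)$ for $n = |I| - |K| \geq 1$.

```python
from itertools import combinations
from math import comb

def subsets(region):
    """Yield all subsets of a finite set as frozensets."""
    items = sorted(region)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield frozenset(c)

def potential_from_energy(U, region):
    """Eq. (5.16): Phi(I) = sum over K subset I of (-1)^(|I|-|K|) U(K)."""
    return {
        I: sum((-1) ** (len(I) - len(K)) * U[K] for K in subsets(I))
        for I in subsets(region)
    }

region = frozenset({1, 2, 3, 4})
# Hypothetical integer stand-ins for the internal energies, with U(empty) = 0.
U = {K: sum(k * k for k in K) + len(K) ** 3 - len(K) for K in subsets(region)}
Phi = potential_from_energy(U, region)

# Eq. (5.21): summing Phi(K) over K subset I should give back U(I).
recovered = {I: sum(Phi[K] for K in subsets(I)) for I in subsets(region)}

# alpha(K) for n = |I| - |K| = 1, ..., 5: the alternating binomial sum.
alternating = [sum((-1) ** l * comb(n, l) for l in range(n + 1))
               for n in range(1, 6)]
```

The scalar stand-ins do not model the conditional expectations, so this checks only (4) and the vanishing of $\alpha(K)$ for $K \neq I$, not properties (2), (3) or (5).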
Definition 5.10. A function $\Phi$ of finite subsets $I$ of $\mathbb{Z}^\nu$ with the value $\Phi(I)$ in $\mathcal{A}$ is called a standard potential if it satisfies the following conditions:

(Φ-a) $\Phi(I) \in \mathcal{A}(I)$, $\Phi(\emptyset) = 0$.
(Φ-b) $\Phi(I)^* = \Phi(I)$.
(Φ-c) $\Theta(\Phi(I)) = \Phi(I)$.
(Φ-d) $E_J(\Phi(I)) = 0$ if $J \subset I$ and $J \neq I$.
(Φ-e) For each fixed finite subset $I$ of $\mathbb{Z}^\nu$, the net
\[ H_J(I) = \sum_{K} \{\Phi(K);\ K \cap I \neq \emptyset,\ K \subset J\} \]
is a Cauchy net in the norm topology of $\mathcal{A}$ for $J \nearrow \mathbb{Z}^\nu$. The index set for the net is the set of all finite subsets $J$ of $\mathbb{Z}^\nu$, partially ordered by set inclusion.

Remark. (Φ-d) is equivalent to the following condition:

(Φ-d)′ $E_J(\Phi(I)) = 0$ unless $I \subset J$,

because $E_J(\Phi(I)) = E_J E_I(\Phi(I)) = E_{J \cap I}(\Phi(I))$.

Definition 5.11. The real vector space of all standard potentials is denoted by $\mathcal{P}$.
Remark. $\mathcal{P}$ is a real vector space as a function space, where the linear operation is defined by
\[ (c\Phi + d\Psi)(I) = c\Phi(I) + d\Psi(I), \qquad c, d \in \mathbb{R}, \quad \Phi, \Psi \in \mathcal{P}. \tag{5.24} \]
We show the one-to-one correspondence of $\Phi \in \mathcal{P}$ and $H \in \mathcal{H}$.

Theorem 5.12. The equations (5.22) and (5.23) for $\Phi \in \mathcal{P}$ and $H \in \mathcal{H}$ give a bijective, real linear map from $\mathcal{P}$ to $\mathcal{H}$.

Proof. First note that (4) of Lemma 5.9 is satisfied for $U(I) = E_I(H(I))$ due to (Φ-d), if (5.22) and (5.23) are satisfied. By Lemma 5.9, there exists a unique $\Phi \in \mathcal{P}$ satisfying (5.22) and (5.23) for any given $H \in \mathcal{H}$. The map is evidently linear. The only remaining task is to prove the properties (H-i), (H-ii), (H-iv) and (H-v) for the $H(I)$ given by (5.22) and (5.23), on the basis of (Φ-a)–(Φ-e).

(H-i), (H-ii) and (H-iv) follow from (Φ-b), (Φ-c) and (Φ-d)′, respectively. To show (H-v), let $L$ be a finite subset containing $J \supset I$. Then
\begin{align*}
H_L(J) - H_L(I) &= \sum_{K} \{\Phi(K);\ K \cap J \neq \emptyset,\ K \cap I = \emptyset,\ K \subset L\} \\
&= E_{I^c}\Bigl(\sum_{K} \{\Phi(K);\ K \cap J \neq \emptyset,\ K \subset L\}\Bigr) \\
&= E_{I^c}(H_L(J))
\end{align*}
due to (5.22), (Φ-a) and (Φ-d)′. By taking the limit $L \nearrow \mathbb{Z}^\nu$, we obtain
\[ H(J) - H(I) = E_{I^c}(H(J)), \]
where the convergence is due to (Φ-e) and $\|E_{I^c}\| = 1$.

Remark. We will use later the real linearity of the above map:
\[ H_{c\Phi + d\Psi}(I) = cH_\Phi(I) + dH_\Psi(I), \qquad c, d \in \mathbb{R}, \quad \Phi, \Psi \in \mathcal{P}, \tag{5.25} \]
\[ U_{c\Phi + d\Psi}(I) = cU_\Phi(I) + dU_\Psi(I), \qquad c, d \in \mathbb{R}, \quad \Phi, \Psi \in \mathcal{P}, \tag{5.26} \]
where $H_\Phi(I)$ and $U_\Phi(I)$ denote $H(I)$ and $U(I)$ corresponding to $\Phi \in \mathcal{P}$.

Theorem 5.13. The following relation between $\Phi \in \mathcal{P}$ and $\delta_\Phi \in \Delta(\mathcal{A}_\circ)$ gives a bijective, real linear map from $\mathcal{P}$ to $\Delta(\mathcal{A}_\circ)$:
\[ \delta_\Phi A = i[H(I), A] \quad (A \in \mathcal{A}(I)), \tag{5.27} \]
\[ H(I) = \lim_{J \nearrow \mathbb{Z}^\nu} \sum_{K} \{\Phi(K);\ K \cap I \neq \emptyset,\ K \subset J\}. \tag{5.28} \]

Proof. This is a consequence of Theorem 5.7 and Theorem 5.12.
Remark 1. The technique using the conditional expectations for associating a unique standard potential with a given ∗-derivation has been developed for quantum spin lattice systems by one of the authors [12]. The corresponding formalism for classical lattice systems is developed in [13]. See also [23], where $E_I$ for the quantum spin case is called a partial trace.

Remark 2. We note that $\mathcal{P}$ is a Fréchet space with respect to a countable family of seminorms $\{\|H(\{i\})\|\}$, $i \in \mathbb{Z}^\nu$.

5.5. General potential

If the function
\[ \Phi : I \in \{\text{finite subsets of } \mathbb{Z}^\nu\} \mapsto \Phi(I) \tag{5.29} \]
satisfies (Φ-a), (Φ-b), (Φ-c) and (Φ-e), we call it a general potential. By (Φ-e), we define $H(I)$ by (5.23) and (5.22). Then, for any finite subsets $K \supset I$,
\[ H(K) - H(I) = \lim_{J \nearrow \mathbb{Z}^\nu} \sum_{L} \{\Phi(L);\ L \cap K \neq \emptyset,\ L \cap I = \emptyset,\ L \subset J\} \tag{5.30} \]
due to (Φ-e). Therefore, we can define $\delta_\Phi$ with the domain $\mathcal{A}_\circ$ by
\[ \delta_\Phi A = i[H(I), A] \quad \text{for } A \in \mathcal{A}(I), \tag{5.31} \]
which is a consistent definition due to (5.30), by essentially the same argument as the one leading to (5.7). The properties (Φ-a), (Φ-b), (Φ-c) and (Φ-e) imply that $\delta_\Phi \in \Delta(\mathcal{A}_\circ)$.

Two general potentials $\Phi$ and $\Phi'$ are said to be equivalent if $\delta_\Phi = \delta_{\Phi'}$. It follows from Theorem 5.13 that there is a unique standard potential which is equivalent to any given general potential defined above. The equivalence is discussed, e.g., in [23] and [40] under the name of physical equivalence. We will consider the consequence of equivalence for a specific class of general potentials in Sec. 14.

6. KMS Condition

6.1. KMS condition

We recall the definition of the KMS condition for a given dynamics $\alpha_t$ of $\mathcal{A}$ (see e.g. [17]).

Definition 6.1. A state $\varphi$ of $\mathcal{A}$ is called an $\alpha_t$-KMS state at the inverse temperature $\beta \in \mathbb{R}$, or $(\alpha_t, \beta)$-KMS state (or more simply KMS state), if it satisfies one of the following two equivalent conditions:

(A) Let $D_\beta$ be the strip region
\begin{align*}
D_\beta &= \{z \in \mathbb{C};\ 0 \leq \operatorname{Im} z \leq \beta\} && \text{if } \beta \geq 0, \\
&= \{z \in \mathbb{C};\ \beta \leq \operatorname{Im} z \leq 0\} && \text{if } \beta < 0,
\end{align*}
in the complex plane $\mathbb{C}$ and $D_\beta^\circ$ be its interior.
For every $A$ and $B$ in $\mathcal{A}$, there exists a function $F(z)$ of $z \in D_\beta$ (depending on $A$ and $B$) such that

(1) $F(z)$ is analytic in $D_\beta^\circ$,
(2) $F(z)$ is continuous and bounded on $D_\beta$,
(3) for all real $t \in \mathbb{R}$,
\[ F(t) = \varphi(A\alpha_t(B)), \qquad F(t + i\beta) = \varphi(\alpha_t(B)A). \]

(B) Let $\mathcal{A}_{\mathrm{ent}}$ be the set of all $B \in \mathcal{A}$ for which $\alpha_t(B)$ has an analytic extension to an $\mathcal{A}$-valued entire function $\alpha_z(B)$ as a function of $z \in \mathbb{C}$. For $A \in \mathcal{A}$ and $B \in \mathcal{A}_{\mathrm{ent}}$,
\[ \varphi(A\alpha_{i\beta}(B)) = \varphi(BA). \]

Remark. In (A), the condition (1) is empty if $\beta = 0$. The boundedness in (2) can be omitted (see e.g. Proposition 5.3.7 in [17]). $\mathcal{A}_{\mathrm{ent}}$ is known to be dense in $\mathcal{A}$.

For a state $\varphi$ on $\mathcal{A}$, let $\{\mathcal{H}_\varphi, \pi_\varphi, \Omega_\varphi\}$ denote its GNS triplet, namely, $\pi_\varphi$ is a (GNS) representation of $\mathcal{A}$ on the Hilbert space $\mathcal{H}_\varphi$, and $\Omega_\varphi$ is a cyclic unit vector in $\mathcal{H}_\varphi$, representing $\varphi$ as the vector state. If $\varphi$ is an $(\alpha_t, \beta)$-KMS state, then $\Omega_\varphi$ is separating for the generated von Neumann algebra $\mathcal{M} \equiv \pi_\varphi(\mathcal{A})''$. Let $\Delta_\varphi$ and $\sigma_t^\varphi$ be the modular operator and modular automorphisms for $\Omega_\varphi$ and $\varphi$, respectively [42]. The KMS condition implies that
\[ \sigma_t^\varphi(\pi_\varphi(A)) = \pi_\varphi(\alpha_{-\beta t}(A)), \qquad A \in \mathcal{A}. \tag{6.1} \]

It is a result of Takesaki [42] that the KMS condition of a one-parameter automorphism group of a von Neumann algebra with respect to a cyclic vector implies the separating property of the vector, and that the modular automorphism group of the von Neumann algebra with respect to the cyclic and separating vector is characterized by the KMS condition at $\beta = -1$ with respect to the state given by that vector.

For the sake of brevity in stating an assumption later, we use the following terminology.

Definition 6.2. A state $\varphi$ is said to be modular if $\Omega_\varphi$ is separating for $\pi_\varphi(\mathcal{A})''$.

6.2. Differential KMS condition

It is convenient to introduce the following condition in terms of the generator $\delta_\alpha$ of the dynamics $\alpha_t$, equivalent to the KMS condition with respect to $\alpha_t$.

Definition 6.3. Let $\delta$ be a ∗-derivation of $\mathcal{A}$ with its domain $D(\delta)$. A state $\varphi$ is said to satisfy the differential $(\delta, \beta)$-KMS condition (or briefly, $(\delta, \beta)$-dKMS condition) if the following two conditions are satisfied:

(C-1) $\varphi(A^* \delta A)$ is pure imaginary for all $A \in D(\delta)$.
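(C-2) $-i\beta\varphi(A^* \delta A) \geq S(\varphi(AA^*), \varphi(A^* A))$ for all $A \in D(\delta)$, where the function $S(x, y)$ is given for $x \geq 0$, $y \geq 0$ by:
\begin{align*}
S(x, y) &= y \log y - y \log x && \text{if } x > 0,\ y > 0, \\
S(x, y) &= +\infty && \text{if } x = 0,\ y > 0, \\
S(x, y) &= 0 && \text{if } x \geq 0,\ y = 0.
\end{align*}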
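The three cases in the definition of $S(x, y)$ can be made concrete with a short sketch. The code below is an illustration only: the function names `S` and `midpoint_convex` and the sample grid are choices made here, and the midpoint test is a numerical spot check of convexity on a few points, not a proof.

```python
import math

def S(x, y):
    """S(x, y) as in (C-2), defined for x >= 0 and y >= 0."""
    if x < 0 or y < 0:
        raise ValueError("S is defined only for x >= 0, y >= 0")
    if y == 0:
        return 0.0
    if x == 0:
        return math.inf
    return y * math.log(y) - y * math.log(x)

def midpoint_convex(p, q, tol=1e-12):
    """Check S((p + q) / 2) <= (S(p) + S(q)) / 2 up to a rounding tolerance."""
    mid = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
    return S(*mid) <= (S(*p) + S(*q)) / 2 + tol

# Sample points in the closed quadrant, including the boundary cases.
grid = [(x / 2, y / 2) for x in range(7) for y in range(7)]
convex_on_grid = all(midpoint_convex(p, q) for p in grid for q in grid)
```

Returning `math.inf` for $x = 0$, $y > 0$ keeps the comparison in (C-2) meaningful: the inequality then fails for any finite left-hand side, so such a pair $(\varphi(AA^*), \varphi(A^*A))$ cannot occur for a state satisfying the condition.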
We use the following known result (see e.g., Theorem 5.3.15 in [17]).

Theorem 6.4. Let $\delta_\alpha$ be a generator of $\alpha_t$, namely, $e^{t\delta_\alpha} = \alpha_t$. Then the $(\delta_\alpha, \beta)$-dKMS condition and the $(\alpha_t, \beta)$-KMS condition are equivalent.

Remark. The function $S(x, y)$ is the relative entropy for linear functionals of a one-dimensional ∗-algebra. The order of the arguments $x, y$ in our notation is opposite to that of the definition in [45]. (Both the order of the arguments and the sign are opposite to those in [17].) Our definition here is in accordance with our definition of the relative entropy previously given.

Lemma 6.5. $S(x, y)$ is convex and lower semi-continuous in $x, y$.

Proof. A convenient expression for $S(x, y)$ is
\[ S(x, y) = \sup_{n} \sup_{s(t)} \Bigl\{ y \log n - \int_{1/n}^{\infty} \bigl(y\, s(t)^2 + t^{-1} x \{1 - s(t)\}^2\bigr) \frac{dt}{t} \Bigr\}, \tag{6.2} \]
where $s(t)$ varies over the linear span of characteristic functions of finite intervals in $[0, +\infty)$. The equality is immediate for $x = 0$, $y > 0$ as well as for $x \geq 0$, $y = 0$. For $x > 0$, $y > 0$, (6.2) follows from the following identities for $\lambda = x/y$:
\begin{align*}
y(\log y - \log x) &= \sup_{n} \Bigl\{ -y \log\Bigl(\frac{1}{n} + \frac{x}{y}\Bigr) \Bigr\} \\
&= \sup_{n} \Bigl\{ y \log n - \int_{1/n}^{\infty} y\, \frac{\lambda}{t + \lambda}\, \frac{dt}{t} \Bigr\}, \\
-y\, \frac{\lambda}{t + \lambda} &= \sup_{s \in \mathbb{R}} \{ -(y s^2 + x t^{-1} (1 - s)^2) \}.
\end{align*}
From the expression above, $S(x, y)$ is seen to be convex and lower semi-continuous in $(x, y)$ because it is a supremum of homogeneous linear functions of $(x, y)$. (The variational expression (6.2) for general von Neumann algebras is established by Kosaki [25]. This expression indicates manifestly some basic properties of relative entropy for the general case.)

Lemma 6.6. The conditions (C-1) and (C-2) are stable under the simultaneous limit of $A$ and $\delta A$ in norm topology and $\varphi$ in the weak∗ topology, as well as under the convex combination of states $\varphi$.
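Proof. Let $A_n, A \in D(\delta)$, $\|A_n - A\| \to 0$, $\|\delta A_n - \delta A\| \to 0$, and $|\varphi_n(B) - \varphi(B)| \to 0$ for every $B \in \mathcal{A}$. Then
\[ |\varphi_n(A_n^* \delta A_n) - \varphi(A^* \delta A)| \leq |\varphi_n(A_n^* \delta A_n - A^* \delta A)| + |\varphi_n(A^* \delta A) - \varphi(A^* \delta A)|, \]
which converges to 0 as $n \to \infty$: the first term is bounded by $\|A_n^* \delta A_n - A^* \delta A\|$ because $\|\varphi_n\| = 1$, and the second tends to 0 by the weak∗ convergence. Therefore, the condition (C-1) holds for $\varphi$ and $A$ if it holds for $\varphi_n$ and $A_n$. Similarly,
\[ \varphi_n(A_n A_n^*) \to \varphi(AA^*), \qquad \varphi_n(A_n^* A_n) \to \varphi(A^* A), \]
as $n \to \infty$. By the lower semi-continuity of $S(x, y)$ in $(x, y)$, we then obtain
\[ S(\varphi(AA^*), \varphi(A^* A)) \leq \liminf_{n} S(\varphi_n(A_n A_n^*), \varphi_n(A_n^* A_n)). \]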
Hence we obtain the condition (C-2) for $\varphi$ and $A$ if it holds for $\varphi_n$ and $A_n$. Since $\varphi(A^* \delta A)$ is affine in $\varphi$ while $S(\varphi(AA^*), \varphi(A^* A))$ is convex in $\varphi$, the conditions (C-1) and (C-2) are stable under the convex combination of $\varphi$.

Corollary 6.7. Let $\alpha_t$ be a one-parameter group of ∗-automorphisms of $\mathcal{A}$ satisfying the conditions (II) and (III). Let $\delta_\alpha$ be the generator of $\alpha_t$. Then a state $\varphi$ is an $(\alpha_t, \beta)$-KMS state if and only if it is a $(\delta, \beta)$-dKMS state, where $\delta$ denotes the restriction of $\delta_\alpha$ to $\mathcal{A}_\circ$.

Proof. The restriction $\delta$ of $\delta_\alpha$ to $\mathcal{A}_\circ$ makes sense due to the assumption (II). By Theorem 6.4, it suffices to prove that the dKMS condition for $\delta$ implies the same for $\delta_\alpha$. By Assumption (III), there exists a sequence $A_n \in \mathcal{A}_\circ$ for any given $A \in D(\delta_\alpha)$ such that $\|A_n - A\| \to 0$ and $\|\delta A_n - \delta_\alpha A\| \to 0$. Hence the conditions (C-1) and (C-2) for $\delta$ imply the same for $\delta_\alpha$ due to Lemma 6.6.

7. Gibbs Condition

In this section, we define the Gibbs condition. We first recall the notion of perturbation of dynamics and states.

7.1. Inner perturbation

Consider a given dynamics $\alpha_t$ of $\mathcal{A}$ with its generator $\delta$ on the domain $D(\delta)$. For each $h = h^* \in \mathcal{A}$, there exists the unique perturbed dynamics $\alpha_t^h$ of $\mathcal{A}$ with its generator $\delta^h$ given by
\[ \delta^h(A) \equiv \delta(A) + i[h, A] \quad (A \in D(\delta)) \tag{7.1} \]
on the same domain as the generator $\delta$ of $\alpha_t$. This $\alpha_t^h(A)$ is explicitly given by
\[ \alpha_t^h(A) = u_t^h\, \alpha_t(A)\, (u_t^h)^*, \tag{7.2} \]
where
\[ u_t^h \equiv 1 + \sum_{m=1}^{\infty} i^m \int_0^t dt_1 \int_0^{t_1} dt_2 \cdots \int_0^{t_{m-1}} dt_m\, \alpha_{t_m}(h) \cdots \alpha_{t_1}(h). \tag{7.3} \]
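This is unitary and satisfies the following cocycle equation:
\[ u_s^h\, \alpha_s(u_t^h) = u_{s+t}^h. \]
The same statements hold for a von Neumann algebra $\mathcal{M}$ and its one-parameter group of ∗-automorphisms $\alpha_t$; the $t$-continuity of $\alpha_t$ for each fixed $x \in \mathcal{M}$ in the strong operator topology of $\mathcal{M}$ is to be assumed.

Let $\Omega$ be a cyclic and separating vector for $\mathcal{M}$. Let $\Delta_\Omega$ be the modular operator for $\Omega$ and $\sigma_t^\omega$ be the corresponding modular automorphism group
\[ \sigma_t^\omega(x) = \Delta_\Omega^{it}\, x\, \Delta_\Omega^{-it}, \]
where $\omega$ indicates the positive linear functional
\[ \omega(x) = (\Omega, x\Omega) \quad (x \in \mathcal{M}). \]
For $h = h^* \in \mathcal{M}$, the perturbed vector $\Omega^h$ is given by
\begin{align*}
\Omega^h &\equiv \sum_{m=0}^{\infty} \int_0^{1/2} dt_1 \int_0^{t_1} dt_2 \cdots \int_0^{t_{m-1}} dt_m\, \Delta_\varphi^{t_m} \pi_\varphi(h) \Delta_\varphi^{t_{m-1} - t_m} \pi_\varphi(h) \cdots \Delta_\varphi^{t_1 - t_2} \pi_\varphi(h)\, \Omega \\
&= \mathrm{Exp}_r\Bigl(\int_0^{1/2} ;\ \Delta_\varphi^{t} \pi_\varphi(h) \Delta_\varphi^{-t}\, dt\Bigr) \Omega, \tag{7.4}
\end{align*}
where the sum is known to converge absolutely ([2]). The notation $\mathrm{Exp}_r$ is taken from [3]. The positive linear functional $\omega^h$ on $\mathcal{M}$ is defined by
\[ \omega^h(x) \equiv (\Omega^h, x\Omega^h) \quad (x \in \mathcal{M}). \tag{7.5} \]
The vector $\Omega^h$ defined above is cyclic and separating for $\mathcal{M}$. Its modular automorphism group $\sigma_t^{\omega^h}$ of $\mathcal{M}$ coincides with $(\sigma_t^\omega)^h$, i.e. the perturbed dynamics of $(\sigma_t^\omega, \mathcal{M})$ by $h$. $\Omega^h$ is in the natural positive cone of $(\Omega, \mathcal{M})$ (see e.g. [43] and [17]) for any self-adjoint element $h \in \mathcal{M}$ and satisfies
\[ (\Omega^{h_1})^{h_2} = \Omega^{h_1 + h_2} \tag{7.6} \]
for any self-adjoint elements $h_1, h_2 \in \mathcal{M}$. We have
\[ (\omega^{h_1})^{h_2} = \omega^{h_1 + h_2}, \qquad \sigma_t^{\{\omega^{(h_1 + h_2)}\}}\ \bigl(= (\sigma_t^\omega)^{(h_1 + h_2)}\bigr) = \{(\sigma_t^\omega)^{h_1}\}^{h_2}, \tag{7.7} \]
where $\{(\sigma_t^\omega)^{h_1}\}^{h_2}$ indicates the dynamics which is given by the successive perturbations, first by $h_1$ and then by $h_2$.

We denote the normalization of $\omega^h$ by $[\omega^h]$:
\[ [\omega^h] = \omega^h(1)^{-1} \omega^h = \omega^{(h - \{\log \omega^h(1)\} 1)}. \tag{7.8} \]
We use the following estimates (Theorem 2 of [4]) and a formula (e.g. (3.5) of [7] and Theorem 3.10 of [9]) later:
\[ \|\Omega^h\| \leq \exp\Bigl(\frac{1}{2} \|h\|\Bigr), \qquad \log \omega^h(1) \leq \|h\|, \tag{7.9} \]
\[ S(\varphi^h, \varphi) = -\varphi(h). \tag{7.10} \]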
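7.2. Surface energy

Let us consider $\Phi \in \mathcal{P}$. For any finite subset $I$ of $\mathbb{Z}^\nu$, we define
\[ W(I) \equiv H(I) - U(I). \tag{7.11} \]
By (5.21), (5.22) and (5.23), the expression for $W(I)$ in terms of the potential is given as follows:
\[ W(I) = \sum_{K} \{\Phi(K);\ K \cap I \neq \emptyset,\ K \cap I^c \neq \emptyset\} \quad \Bigl(= \lim_{J \nearrow \mathbb{Z}^\nu} \Bigl( \sum_{K} \{\Phi(K);\ K \cap I \neq \emptyset,\ K \cap I^c \neq \emptyset,\ K \subset J\} \Bigr)\Bigr). \tag{7.12} \]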
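The bookkeeping behind (7.12) can be illustrated in a finite volume. The sketch below works under simplifying assumptions that are not in the text: the potential takes hypothetical integer values instead of values in $\mathcal{A}$, the lattice is replaced by a finite set $J \supset I$, and the helper names are chosen here. It checks that the sum over all $K \subset J$ meeting $I$ splits into the part inside $I$, which is $U(I)$ by (5.21), and the part crossing the boundary of $I$, which is the finite-volume surface term.

```python
from itertools import combinations

def subsets(region):
    """Yield all subsets of a finite set as frozensets."""
    items = sorted(region)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield frozenset(c)

def H_J(Phi, I, J):
    """Eq. (5.22): sum of Phi(K) over K subset J with K meeting I."""
    return sum(Phi[K] for K in subsets(J) if K & I)

def U(Phi, I):
    """Eq. (5.21): sum of Phi(K) over K subset I."""
    return sum(Phi[K] for K in subsets(I))

def W_J(Phi, I, J):
    """Finite-volume version of (7.12): K meets both I and its complement."""
    return sum(Phi[K] for K in subsets(J) if (K & I) and (K - I))

J = frozenset(range(6))
I = frozenset({1, 2, 3})
# Hypothetical integer-valued potential with Phi(empty set) = 0.
Phi = {K: (0 if not K else (-1) ** len(K) * len(K) * (max(K) - min(K) + 1))
       for K in subsets(J)}

surface = W_J(Phi, I, J)
difference = H_J(Phi, I, J) - U(Phi, I)
```

In the text the same splitting is applied to algebra-valued $\Phi(K)$ and followed by the limit $J \nearrow \mathbb{Z}^\nu$, whose existence is guaranteed by (Φ-e).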
$W(I)$ is the sum of all (interaction) potentials between the inside and the outside of $I$ by definition, and will be called the surface energy.

7.3. Gibbs condition

We are now in a position to introduce our Gibbs condition for a state $\varphi$ of $\mathcal{A}$ for a given $\delta \in \Delta(\mathcal{A}_\circ)$. We use the following notation in its definition below.

As in Sec. 6.1, $\{\mathcal{H}_\varphi, \pi_\varphi, \Omega_\varphi\}$ is the GNS triplet for $\varphi$. The normal extension of $\varphi$ to the weak closure $\mathcal{M}$ ($= \pi_\varphi(\mathcal{A})''$) is denoted by the same letter $\varphi$:
\[ \varphi(x) = (\Omega_\varphi, x\Omega_\varphi) \quad (x \in \mathcal{M}), \qquad \varphi(\pi_\varphi(a)) = \varphi(a) \quad (a \in \mathcal{A}). \]
Let $\Phi(I)$, $H(I)$, $U(I)$ and $W(I)$ be those uniquely associated with $\delta$. The following operators will be used for perturbations of dynamics and states:
\[ \hat{h} = \pi_\varphi(\beta H(I)), \qquad \hat{u} = \pi_\varphi(\beta U(I)), \qquad \hat{w} = \pi_\varphi(\beta W(I)). \tag{7.13} \]
Definition 7.1. For $\delta \in \Delta(\mathcal{A}_\circ)$, a state $\varphi$ of $\mathcal{A}$ is said to satisfy the $(\delta, \beta)$-Gibbs condition, or alternatively the $(\Phi, \beta)$-Gibbs condition, if the following two conditions are satisfied.

(D-1) $\varphi$ is a modular state. (See Definition 6.2.)
(D-2) For each finite subset $I$ of $\mathbb{Z}^\nu$, $\sigma_t^{\varphi^{\hat{w}}}$ satisfies
\[ \sigma_t^{\varphi^{\hat{w}}}(\pi_\varphi(A)) = \pi_\varphi(e^{-i\beta U(I)t} A e^{i\beta U(I)t}) \]
for all $A \in \mathcal{A}(I)$.

The condition (D-2) is equivalent to the following condition (D-2)′, as shown in the subsequent lemma, and hence we may define the $(\delta, \beta)$-Gibbs condition by (D-1) and (D-2)′.

(D-2)′ For each finite subset $I$ of $\mathbb{Z}^\nu$ and $A \in \mathcal{A}(I)$, $\pi_\varphi(A)$ is $\sigma_t^{\varphi^{\hat{h}}}$-invariant, namely, $\pi_\varphi(\mathcal{A}(I))$ is in the centralizer of the positive linear functional $\varphi^{\hat{h}}$.
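Lemma 7.2. The conditions (D-2) and (D-2)′ are equivalent.

Proof. First assume (D-2). Since $\hat h = \hat w + \hat u$, we have $\varphi^{\hat h} = (\varphi^{\hat w})^{\hat u}$ and hence
\[ \sigma_t^{\varphi^{\hat h}} = \{(\sigma_t^{\varphi})^{\hat w}\}^{\hat u} = (\sigma_t^{\varphi^{\hat w}})^{\hat u}. \]
Since $e^{-i\beta U(I)t}\, U(I)\, e^{i\beta U(I)t} = U(I)$, $\pi_\varphi(U(I))$ is invariant under $\sigma_t^{\varphi^{\hat w}}$ by (D-2). Then the unitary cocycle bridging $\sigma_t^{\varphi^{\hat h}}$ and $\sigma_t^{\varphi^{\hat w}}$ becomes $e^{i\hat u t}$. Hence
\[ \sigma_t^{\varphi^{\hat h}} = \mathrm{Ad}(e^{i\hat u t}) \circ \sigma_t^{\varphi^{\hat w}}. \]
Therefore, for $\pi_\varphi(A)$, $A \in \mathcal{A}(I)$, we have
\[ \sigma_t^{\varphi^{\hat h}}(\pi_\varphi(A)) = e^{i\hat u t}\, \sigma_t^{\varphi^{\hat w}}(\pi_\varphi(A))\, e^{-i\hat u t} = \pi_\varphi(\mathrm{Ad}(e^{i\beta U(I)t}) \circ \mathrm{Ad}(e^{-i\beta U(I)t}) \circ A) = \pi_\varphi(A). \]
Thus (D-2)′ is satisfied.

We show the converse. Assume (D-2)′. Since $\hat w = \hat h - \hat u$, $\sigma_t^{\varphi^{\hat w}}$ is the perturbed dynamics of $\sigma_t^{\varphi^{\hat h}}$ by $-\hat u$. Since $\hat u$ (with $U(I) \in \mathcal{A}(I)$) is $\sigma_t^{\varphi^{\hat h}}$-invariant (being in the centralizer), the corresponding unitary cocycle is $e^{-i\hat u t}$. Hence, for $\pi_\varphi(A)$, $A \in \mathcal{A}(I)$, we have
\[ \sigma_t^{\varphi^{\hat w}}(\pi_\varphi(A)) = e^{-i\hat u t}\, \sigma_t^{\varphi^{\hat h}}(\pi_\varphi(A))\, e^{+i\hat u t} = e^{-i\beta \pi_\varphi(U(I))t}\, \pi_\varphi(A)\, e^{i\beta \pi_\varphi(U(I))t} = \pi_\varphi(e^{-i\beta U(I)t} A e^{i\beta U(I)t}), \]
and (D-2) is derived.

We introduce the local Gibbs state.

Definition 7.3. For finite $I$, the local Gibbs state of $\mathcal{A}(I)$ (or local Gibbs state for $I$) with respect to $(\delta, \beta)$ is given by
\[ \varphi_I^c(A) \equiv \frac{\tau(e^{-\beta U(I)} A)}{\tau(e^{-\beta U(I)})}, \qquad A \in \mathcal{A}(I). \tag{7.14} \]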
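As a concrete illustration of (7.14), the following minimal sketch evaluates the local Gibbs state on a one-site algebra realized by 2×2 matrices. The internal energy $U = \mu\, a^* a$ and the parameter name `mu` are hypothetical choices made only for this example; they are not taken from the text. Under that assumption the occupation number comes out as the familiar Fermi-Dirac factor $1/(e^{\beta\mu}+1)$.

```python
import math

def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def tau(x):
    """Normalized trace on 2x2 matrices (tracial state of a one-site algebra)."""
    return (x[0][0] + x[1][1]) / 2.0

def local_gibbs(A, beta, mu):
    """Evaluate tau(exp(-beta U) A) / tau(exp(-beta U)) for the assumed U = mu * a^* a.

    In the basis (empty, occupied) the number operator a^* a is diag(0, 1),
    so exp(-beta U) is diag(1, exp(-beta * mu)).
    """
    boltzmann = [[1.0, 0.0], [0.0, math.exp(-beta * mu)]]
    return tau(matmul(boltzmann, A)) / tau(boltzmann)

NUMBER = [[0.0, 0.0], [0.0, 1.0]]    # a^* a in the (empty, occupied) basis
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

beta, mu = 2.0, 0.7                  # illustrative values only
occupation = local_gibbs(NUMBER, beta, mu)
```

The normalization by $\tau(e^{-\beta U(I)})$ is what makes the functional a state: `local_gibbs(IDENTITY, beta, mu)` evaluates to 1 for any `beta` and `mu`.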
Corollary 7.4. If $\varphi$ satisfies the $(\delta, \beta)$-Gibbs condition, then the restriction of $\varphi^{\hat h}$ to $\mathcal{A}(I)$ is $\varphi^{\hat h}(1)$ times the tracial state $\tau$ and that of $\varphi^{\hat w}$ is $\varphi^{\hat w}(1)$ times the local Gibbs state $\varphi_I^c$ given by (7.14).

Proof. Since $\varphi^{\hat h}$ has the tracial property for $\mathcal{A}(I)$ by (D-2)′, its restriction to $\mathcal{A}(I)$ must be $\varphi^{\hat h}(1)$ times the unique tracial state $\tau$.
April 11, 2003 14:43 WSPC/148-RMP
00160
Equilibrium Statistical Mechanics of Fermion Lattice Systems
141
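Since the inner automorphism group
\[ \alpha_t^{I} \equiv \mathrm{Ad}(e^{-i\beta U(I)t}) \tag{7.15} \]
leaves $\mathcal{A}(I)$ invariant and has the same action on $\mathcal{A}(I)$ as the modular automorphism of $\varphi^{\hat w}|_{\mathcal{A}(I)}$ (the restriction of $\varphi^{\hat w}$ to $\mathcal{A}(I)$), $\varphi^{\hat w}|_{\mathcal{A}(I)}$ satisfies the $(\alpha_t^I, -1)$-KMS condition and hence must be $\varphi^{\hat w}(1)$ times the unique KMS state given by the local Gibbs state $\varphi_I^c$.

7.4. Equivalence to KMS condition

Theorem 7.5. Let $\alpha_t$ be dynamics of $\mathcal{A}$ satisfying conditions (I) and (II) and $\delta$ be the restriction of its generator $\delta_\alpha$ to $\mathcal{A}_\circ$. Then any $(\alpha_t, \beta)$-KMS state $\varphi$ of $\mathcal{A}$ satisfies the $(\delta, \beta)$-Gibbs condition.

Proof. As already indicated, it is known that the KMS condition implies (D-1). It remains to show (D-2). We have
\[ (d/ds)\bigl(\sigma_s^{\varphi^{\hat w}}(x) - \sigma_s^{\varphi}(x)\bigr)\big|_{s=0} = i[\hat w, x], \]
for $x \in \mathfrak{M}$. By the group property of the automorphisms,
\[ (d/dt)\,\sigma_t^{\varphi^{\hat w}}(x) = \sigma_t^{\varphi^{\hat w}}\{(d/ds)\,\sigma_s^{\varphi^{\hat w}}(x)|_{s=0}\} \]
for $x$ in the domain of the generator of $\sigma_t^{\varphi^{\hat w}}$. For the same $x$, we have
\[ (d/dt)\,\sigma_t^{\varphi^{\hat w}}(x) = \sigma_t^{\varphi^{\hat w}}\{(d/ds)\,\sigma_s^{\varphi}(x)|_{s=0} + i[\hat w, x]\}. \]
The KMS condition implies that
\[ \sigma_s^{\varphi}(\pi_\varphi(A)) = \pi_\varphi(\alpha_{-\beta s}(A)), \qquad A \in \mathcal{A}. \]
Therefore, if $A \in \mathcal{A}$ is in the domain of the generator of $\alpha_t$, we have
\[ (d/dt)\,\sigma_t^{\varphi^{\hat w}}(\pi_\varphi(A)) = \sigma_t^{\varphi^{\hat w}}\{(d/ds)(\pi_\varphi\{\alpha_{-\beta s}(A)\})|_{s=0}\} + \sigma_t^{\varphi^{\hat w}}(\pi_\varphi\{[i\beta W(I), A]\}). \]
Now we take $A \in \mathcal{A}(I)$. By (H-iii),
\[ (d/dt)\,\sigma_t^{\varphi^{\hat w}}(\pi_\varphi(A)) = \sigma_t^{\varphi^{\hat w}}(-i\beta \pi_\varphi\{[H(I), A]\}) + \sigma_t^{\varphi^{\hat w}}(i\beta \pi_\varphi\{[W(I), A]\}) = -i\beta\, \sigma_t^{\varphi^{\hat w}}(\pi_\varphi\{[U(I), A]\}). \]
For $A \in \mathcal{A}(I)$, $e^{i\beta U(I)t} A e^{-i\beta U(I)t} \in \mathcal{A}(I)$, and we have
\begin{align*}
(d/dt)\,\sigma_t^{\varphi^{\hat w}}(\pi_\varphi\{e^{i\beta U(I)t} A e^{-i\beta U(I)t}\})
&= \sigma_t^{\varphi^{\hat w}}\{(d/ds)\,\sigma_s^{\varphi^{\hat w}}(\pi_\varphi\{e^{i\beta U(I)(t+s)} A e^{-i\beta U(I)(t+s)}\})|_{s=0}\} \\
&= \sigma_t^{\varphi^{\hat w}}\bigl(-i\beta \pi_\varphi\{[U(I), e^{i\beta U(I)t} A e^{-i\beta U(I)t}]\} + \pi_\varphi\{d/ds(e^{i\beta U(I)(t+s)} A e^{-i\beta U(I)(t+s)})|_{s=0}\}\bigr) = 0.
\end{align*}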
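This implies that
\[ \sigma_t^{\varphi^{\hat w}}(\pi_\varphi\{e^{i\beta U(I)t} A e^{-i\beta U(I)t}\}) \]
is a constant function of $t$ and hence equals its value at $t = 0$, which is $\pi_\varphi(A)$. Thus
\[ \sigma_{-t}^{\varphi^{\hat w}}(\pi_\varphi(A)) = \pi_\varphi(e^{i\beta U(I)t} A e^{-i\beta U(I)t}) \]
and (D-2) is shown.

To show the converse, we need the assumption (III) for the dynamics $\alpha_t$.

Theorem 7.6. Let $\alpha_t$ be a dynamics of $\mathcal{A}$ satisfying the conditions (I), (II) and (III). Let $\delta$ be the restriction of its generator $\delta_\alpha$ to $\mathcal{A}_\circ$. Then any $(\delta, \beta)$-Gibbs state $\varphi$ of $\mathcal{A}$ satisfies the $(\alpha_t, \beta)$-KMS condition.

Proof. We use (D-2)′. It says that
\[ (d/dt)\,\sigma_t^{\varphi^{\hat h}}(\pi_\varphi(A)) = 0 \]
for all $A \in \mathcal{A}(I)$. By the group property of the automorphism,
\[ (d/dt)\,\sigma_t^{\varphi}(x) = \sigma_t^{\varphi}\{(d/ds)\,\sigma_s^{\varphi}(x)|_{s=0}\}. \]
For any $A \in \mathcal{A}_\circ$, there exists a finite subset $I$ such that $A \in \mathcal{A}(I)$. Since $\varphi = (\varphi^{\hat h})^{-\hat h}$, we have
\begin{align*}
(d/dt)\,\sigma_t^{\varphi}(\pi_\varphi(A)) &= \sigma_t^{\varphi}\{(d/ds)\,\sigma_s^{\varphi^{\hat h}}(\pi_\varphi(A))|_{s=0} - [i\hat h, \pi_\varphi(A)]\} \\
&= \sigma_t^{\varphi}(-i\beta \pi_\varphi([H(I), A])) = -\beta\, \sigma_t^{\varphi}(\pi_\varphi(\delta A)). \tag{7.16}
\end{align*}
We note that for any $A \in \mathcal{A}$
\[ \sigma_t^{\varphi}(\pi_\varphi(A)) = \Delta_\varphi^{it}\, \pi_\varphi(A)\, \Delta_\varphi^{-it}, \qquad \Delta_\varphi \Omega_\varphi = \Omega_\varphi. \]
By applying (7.16) on $\Omega_\varphi$ and setting $t = 0$, we conclude that $\pi_\varphi(A)\Omega_\varphi$ is in the domain of $\log \Delta_\varphi$ and
\[ i(\log \Delta_\varphi)\, \pi_\varphi(A)\Omega_\varphi = -\beta\, \pi_\varphi(\delta(A))\Omega_\varphi \tag{7.17} \]
for all $A \in \mathcal{A}_\circ$.

By Assumption (III), for every $A \in D(\delta_\alpha)$, there exists a sequence $\{A_n\}$, $A_n \in \mathcal{A}_\circ$, such that $\{A_n\}$ and $\{\delta A_n (= \delta_\alpha A_n)\}$ converge to $A$ and $\delta_\alpha A (= \bar\delta A)$, respectively, in the norm topology of $\mathcal{A}$. Since $\log \Delta_\varphi$ is a (self-adjoint) closed operator, $\pi_\varphi(A)\Omega_\varphi$ must be in the domain of $\log \Delta_\varphi$ and (7.17) holds for any $A \in D(\delta_\alpha)$.

For $A \in D(\delta_\alpha)$ and $t \in \mathbb{R}$, we set
\[ \xi_t \equiv \sigma_t^{\varphi}(\pi_\varphi\{\alpha_{\beta t}(A)\})\Omega_\varphi = \Delta_\varphi^{it}\, \pi_\varphi(\alpha_{\beta t}(A))\Omega_\varphi. \]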
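For $A \in D(\delta_\alpha)$, $\alpha_t(A)$ is in $D(\delta_\alpha)$ for any $t \in \mathbb{R}$. Therefore, we can substitute $\alpha_{\beta t}(A)$ into $A$ of (7.17) and obtain
\begin{align*}
(d/dt)\,\xi_t &= \Delta_\varphi^{it}\{(d/ds)\,\Delta_\varphi^{is}\, \pi_\varphi\{\alpha_{\beta t}(A)\}\Omega_\varphi|_{s=0}\} + \Delta_\varphi^{it}\bigl((d/dt)\,\pi_\varphi\{\alpha_{\beta t}(A)\}\Omega_\varphi\bigr) \\
&= \Delta_\varphi^{it}\{-\beta\, \pi_\varphi\{\delta(\alpha_{\beta t}(A))\}\Omega_\varphi + \pi_\varphi\{\beta\, \delta(\alpha_{\beta t}(A))\}\Omega_\varphi\} = 0.
\end{align*}
Therefore, we have $\xi_t = \xi_0$ and
\[ \sigma_t^{\varphi}(\pi_\varphi\{\alpha_{\beta t}(A)\})\Omega_\varphi = \pi_\varphi(A)\Omega_\varphi. \]
Since $\Omega_\varphi$ is separating for $\mathfrak{M}_\varphi$, we obtain
\[ \sigma_t^{\varphi}(\pi_\varphi\{\alpha_{\beta t}(A)\}) = \pi_\varphi(A). \]
This implies
\[ \pi_\varphi\{\alpha_{\beta t}(A)\} = \sigma_{-t}^{\varphi}(\pi_\varphi(A)). \]
Since $D(\delta_\alpha)\,(\supset \mathcal{A}_\circ)$ is norm dense in $\mathcal{A}$, we have
\[ \pi_\varphi\{\alpha_{-\beta t}(A)\} = \sigma_t^{\varphi}(\pi_\varphi(A)) \]
for every $A \in \mathcal{A}$. Since $\varphi$ satisfies the $(\sigma_t^{\varphi}, -1)$-KMS condition as a state of $\mathfrak{M}_\varphi$, we obtain the $(\alpha_t, \beta)$-KMS condition for $\varphi$.

7.5. Product form of the Gibbs condition

In the case of quantum spin lattice systems, for any region $I \subset \mathbb{Z}^\nu$, $\mathcal{A} = \mathcal{A}(I) \otimes \mathcal{A}(I^c)$. In this situation, the Gibbs condition implies that $\varphi^{\hat w}$ $(= \varphi^{\pi_\varphi(\beta W(I))})$ is a product of the local Gibbs state of $\mathcal{A}(I)$ and its restriction to $\mathcal{A}(I^c)$, or equivalently $\varphi^{\hat h}$ $(= \varphi^{\pi_\varphi(\beta H(I))})$ is a product of the tracial state of $\mathcal{A}(I)$ and its restriction to $\mathcal{A}(I^c)$ for any finite region $I$ [5].

However, this product property for $\varphi^{\hat w}$ and $\varphi^{\hat h}$ for the present Fermion case does not seem to be automatic in general. We show that such a product property holds if and only if the Gibbs state $\varphi$ is $\Theta$-even, where the product property refers to the validity of the formula
\[ \psi(AB) = \psi(A)\psi(B)/\psi(1), \qquad A \in \mathcal{A}(I),\ B \in \mathcal{A}(I^c) \tag{7.18} \]
for $\psi = \varphi^{\hat h}$ and for $\psi = \varphi^{\hat w}$.

Proposition 7.7. Assume the conditions (I) and (II) for the dynamics. Let $I$ be a non-empty finite subset of $\mathbb{Z}^\nu$. If $\varphi$ satisfies the Gibbs condition, then $\varphi^{\pi_\varphi(\beta W(I))}$ has the product property (7.18) if and only if $\varphi$ is $\Theta$-even. The same is true for $\varphi^{\pi_\varphi(\beta H(I))}$.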
Proof. First assume that $\varphi$ is even. It follows from the Gibbs condition that $\mathcal{A}(I)$ is in the centralizer of $\varphi^{\hat h}$ and the restriction of $\varphi^{\hat h}$ to $\mathcal{A}(I)$ is tracial. We will show
\[ \varphi^{\hat h}([A_1, A_2]B) = 0 \tag{7.19} \]
for any $A_1, A_2 \in \mathcal{A}(I)$ and any $B \in \mathcal{A}(I^c)$. It is enough to show this for all combinations of even and odd $A_1$, $A_2$ and $B$ because the general case follows from these cases by linearity.

Since $A_1$ and $A_2$ are in the centralizer of $\varphi^{\hat h}$, we have
\[ \varphi^{\hat h}(A_1 A_2 B) = \varphi^{\hat h}(A_2 B A_1), \qquad \varphi^{\hat h}(A_2 A_1 B) = \varphi^{\hat h}(A_1 B A_2). \]
If one or more of $A_1$, $A_2$, $B$ is even, then $BA_1 = A_1B$ or $BA_2 = A_2B$ holds. Hence (7.19) follows for this case.

The remaining case is when $A_1$, $A_2$, $B$ are all odd. We now show that $\varphi^{\hat h}$ is even so that (7.19) holds in this case. Since $\varphi$ is assumed to be even in this part of the proof, $\Theta$ leaves $\varphi$ invariant and hence there exists an involutive unitary $U_\Theta$ on the GNS representation space $\mathcal{H}_\varphi$ of $\varphi$, satisfying
\[ U_\Theta\, \pi_\varphi(A)\, U_\Theta^* = \pi_\varphi(\Theta(A)), \qquad (A \in \mathcal{A}), \tag{7.20} \]
\[ U_\Theta \Omega_\varphi = \Omega_\varphi. \tag{7.21} \]
Since $H(I)$ is even by assumption, it follows from the commutativity of $U_\Theta$ with $\Delta_\varphi$ [42] and the above equations (7.20), (7.21) that the perturbed vector $\Omega_\varphi^{\hat h}$ is $U_\Theta$-invariant. Therefore $\varphi^{\hat h}$ is even, since it is the vector functional by $\Omega_\varphi^{\hat h}$. Hence $\varphi^{\hat h}$ vanishes on every odd element and (7.19) is satisfied if $A_1$, $A_2$ and $B$ are all odd. Now (7.19) is proved for all the cases.

Since $\mathcal{A}(I)$ is a $2^{|I|} \times 2^{|I|}$ full matrix algebra, any element $A \in \mathcal{A}(I)$ can be written as
\[ A = \tau(A) + \sum_j [A_{j1}, A_{j2}] \]
for some $A_{j1}, A_{j2} \in \mathcal{A}(I)$. Hence (7.19) implies
\[ \varphi^{\hat h}(AB) = \tau(A)\, \varphi^{\hat h}(B) \tag{7.22} \]
for any $A \in \mathcal{A}(I)$ and $B \in \mathcal{A}(I^c)$. This means that $\varphi^{\hat h}$ has the form of the product of $\tau$ of $\mathcal{A}(I)$ and its restriction to $\mathcal{A}(I^c)$.

Since $U(I)$ is in the centralizer of $\varphi^{\hat h}$, we have
\[ \varphi^{\hat w} = \{\varphi^{\hat h}\}^{-\hat u} = \varphi^{\hat h} \cdot e^{-\hat u}. \]
Hence, for any $A \in \mathcal{A}(I)$ and $B \in \mathcal{A}(I^c)$,
\[ \varphi^{\hat w}(AB) = \tau(e^{-\hat u})\, \varphi_I^c(A)\, \varphi^{\hat h}(B). \]
By setting $A = 1$, we have
\[ \varphi^{\hat w}(B) = \tau(e^{-\hat u})\, \varphi^{\hat h}(B). \]
Therefore,
\[ \varphi^{\hat w}(AB) = \varphi_I^c(A)\, \varphi^{\hat w}(B). \tag{7.23} \]
Hence we have the desired product property of $\varphi^{\hat w}$.

We now prove the converse, starting from the assumption that $\varphi^{\hat h}$ has a product form (7.18). We note that
\[ \tau(a_i a_i^*) = \tau(a_i^* a_i) = \tfrac{1}{2}\tau(a_i a_i^* + a_i^* a_i) = \tfrac{1}{2}\tau(1) = \tfrac{1}{2} \]
due to CAR. On the other hand, $a_i$ anticommutes with any odd element $B$ in $\mathcal{A}(I^c)$ and hence
\[ \varphi^{\hat h}(a_i a_i^* B) = \varphi^{\hat h}(a_i^* B a_i) = -\varphi^{\hat h}(a_i^* a_i B), \tag{7.24} \]
where the first equality follows because $a_i$ is in the centralizer of $\varphi^{\hat h}$ due to the Gibbs condition. By the product form assumption,
\[ \varphi^{\hat h}(AB) = \varphi^{\hat h}(A)\, \varphi^{\hat h}(B)/\varphi^{\hat h}(1) \]
for $A \in \mathcal{A}(I)$ and $B \in \mathcal{A}(I^c)$. Since $A$ is in the centralizer, $\varphi^{\hat h}(A)/\varphi^{\hat h}(1) = \tau(A)$ for the unique tracial state $\tau$ of $\mathcal{A}(I)$. Hence
\[ \varphi^{\hat h}(a_i a_i^* B) = \tau(a_i a_i^*)\, \varphi^{\hat h}(B) = \tfrac{1}{2}\varphi^{\hat h}(B), \qquad \varphi^{\hat h}(a_i^* a_i B) = \tau(a_i^* a_i)\, \varphi^{\hat h}(B) = \tfrac{1}{2}\varphi^{\hat h}(B). \tag{7.25} \]
From (7.24) and (7.25), we obtain
\[ \varphi^{\hat h}(B) = 0 \tag{7.26} \]
for any $B \in \mathcal{A}(I^c)_-$. Since $\mathcal{A}_- = \mathcal{A}(I)_+ \mathcal{A}(I^c)_- + \mathcal{A}(I)_- \mathcal{A}(I^c)_+$ for a finite $I$, $\varphi^{\hat h}$ vanishes on odd elements of $\mathcal{A}$. We conclude that $\varphi^{\hat h}$ is even. This implies that $\varphi$ is also even by the same argument as in the first part of this proof due to $\varphi = \{\varphi^{\hat h}\}^{-\hat h}$.

Remark. By the above Proposition, we have already shown that if a Gibbs state $\varphi$ satisfies the condition that $\varphi^{\pi_\varphi(\beta W(I))}$ has the product property (7.18) for the pair $(\mathcal{A}(I), \mathcal{A}(I^c))$ for one non-empty finite $I$, then $\varphi$ has this product property for every finite subset $I$.

In connection with Proposition 7.7, if $\mathcal{A}(I^c)$ is replaced by the commutant algebra $\mathcal{A}(I)'$ in the product property (7.18), then $\varphi^{\hat w}$ is a product of the local Gibbs state of $\mathcal{A}(I)$ and its restriction to $\mathcal{A}(I)'$ for every finite region $I$ irrespective of
whether $\varphi$ is even or not, as is shown in the following corollary. This situation is much the same as in quantum spin lattice systems.

Corollary 7.8. Assume the conditions (I) and (II) for the dynamics. Let $\varphi$ be a modular state. The state $\varphi$ satisfies the Gibbs condition if and only if the perturbed functional $\varphi^{\hat w}$ is a product of the local Gibbs state $\varphi_I^c$ of $\mathcal{A}(I)$ and its restriction to $\mathcal{A}(I)'$ for every finite $I$.

Proof. For a finite $I$, $\mathcal{A}(I)$ is a full matrix algebra and hence $\mathcal{A}$ is an (algebraic) tensor product of $\mathcal{A}(I)$ and $\mathcal{A}(I)'$. If $\varphi^{\hat w}$ has the product property described above, then the GNS representation of $\mathcal{A}$ associated with $\varphi^{\hat w}$ is the tensor product of those for $(\mathcal{A}(I), \varphi_I^c)$ and $(\mathcal{A}(I)', \psi)$, where $\psi = \varphi^{\hat w}|_{\mathcal{A}(I)'}$. Therefore the product of the modular automorphisms for these two pairs satisfies the KMS condition (with $\beta = -1$) for $(\mathcal{A}, \varphi^{\hat w})$ and must be the modular automorphism group for $(\mathcal{A}, \varphi^{\hat w})$. In particular, the restriction of the modular automorphisms of $(\mathcal{A}, \varphi^{\hat w})$ to $\mathcal{A}(I)$ coincides with the modular automorphisms $\alpha_t^I$ $(= \mathrm{Ad}(e^{-i\beta U(I)t}))$ for $(\mathcal{A}(I), \varphi_I^c)$. Hence the Gibbs condition is satisfied.

Conversely, assume that the Gibbs condition is satisfied for $\varphi$. By the elementwise commutativity of $\mathcal{A}(I)$ and $\mathcal{A}(I)'$, we can show (7.19) of Proposition 7.7 directly in this case for any $A_1, A_2 \in \mathcal{A}(I)$ and $B \in \mathcal{A}(I)'$, skipping the previous discussion about even and odd elements. The arguments showing (7.22) and (7.23) are still valid after we replace $\mathcal{A}(I^c)$ by $\mathcal{A}(I)'$.

8. Translation Invariant Dynamics

8.1. Translation invariance and covariance

From now on, we need the following assumption for the dynamics $\alpha_t$ for the most part of our theory.

(IV) $\alpha_t \tau_k = \tau_k \alpha_t$ for all $t \in \mathbb{R}$ and $k \in \mathbb{Z}^\nu$.

If (IV) holds, $\alpha_t$ is said to be translation invariant. This assumption implies our earlier assumption (I) due to the following Proposition, which we owe to a referee.

Proposition 8.1. Any automorphism $\alpha$ commuting with the lattice translation $\tau_k$, $k \in \mathbb{Z}^\nu$, must commute with $\Theta$.

For its proof, we need the following Lemma.

Lemma 8.2. An element $x \in \mathcal{A}$ is $\Theta$-even if and only if the following asymptotically central property holds:
\[ \lim_{k \to \infty} \|[\tau_k(x), y]\| = 0 \quad \text{for all } y \in \mathcal{A}. \tag{8.1} \]
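The proof of Lemma 8.2 uses a unitary odd element such as $a_i + a_i^*$, and the proof of Proposition 7.7 uses $\tau(a_i a_i^*) = 1/2$. Both facts can be checked by hand on a single Fermion mode. The sketch below is only an illustration under stated assumptions: it represents one mode by 2×2 integer matrices and realizes $\Theta$ on that one-site algebra as conjugation by the parity matrix `PARITY`; the matrix representation and all names are choices made for this example.

```python
def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y):
    """Entrywise sum of two matrices."""
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

def transpose(x):
    """Transpose; for real matrices this is the adjoint."""
    return [list(col) for col in zip(*x)]

def tau(x):
    """Normalized trace (the tracial state on the one-site algebra)."""
    return sum(x[i][i] for i in range(len(x))) / len(x)

IDENTITY = [[1, 0], [0, 1]]
PARITY = [[1, 0], [0, -1]]        # (-1)^{a^* a} in the basis (empty, occupied)
A_OP = [[0, 1], [0, 0]]           # annihilation operator a
A_STAR = transpose(A_OP)          # creation operator a^*
Y = add(A_OP, A_STAR)             # candidate unitary odd element a + a^*

def theta(x):
    """Parity automorphism on the one-site algebra: x -> P x P."""
    return matmul(matmul(PARITY, x), PARITY)

# CAR for one mode: a a^* + a^* a = 1 and a a = 0.
anticommutator = add(matmul(A_OP, A_STAR), matmul(A_STAR, A_OP))
a_squared = matmul(A_OP, A_OP)
```

With these definitions `matmul(Y, Y)` is the identity and `Y` equals its transpose, so `Y` is a self-adjoint unitary, while `theta(Y)` is `-Y`, so it is odd.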
Proof. If $x \in (\mathcal{A}_\circ)_+$ and $y \in \mathcal{A}_\circ$, then $[\tau_k(x), y] = 0$ for sufficiently large $k$. By the density of $(\mathcal{A}_\circ)_+$ in $\mathcal{A}_+$ and $\mathcal{A}_\circ$ in $\mathcal{A}$, we obtain (8.1) for $x \in \mathcal{A}_+$ and $y \in \mathcal{A}$.

In the converse direction, consider a general $x \in \mathcal{A}$ and define $x_\pm = \tfrac{1}{2}(x \pm \Theta(x)) \in \mathcal{A}_\pm$. Due to the validity of (8.1) for $x_+$, which has just been shown, we have
\[ \lim_{k \to \infty} \|[\tau_k(x), y]\| = \lim_{k \to \infty} \|[\tau_k(x_-), y]\|. \]
Take a unitary $y \in \mathcal{A}_-$ (e.g., $a_i + a_i^*$). Then
\[ \|[\tau_k(x_-), y]\| = 2\|\tau_k(x_-)y\| = 2\|x_-\|. \]
Hence (8.1) for $x$ implies $x_- = 0$, namely $x \in \mathcal{A}_+$.

Proof of Proposition 8.1. Due to $\tau_k \alpha = \alpha \tau_k$, we have
\[ \|[\tau_k(\alpha(x)), \alpha(y)]\| = \|\alpha\{[\tau_k(x), y]\}\| = \|[\tau_k(x), y]\|. \]
Hence $\alpha(x) \in \mathcal{A}_+$ if and only if $x \in \mathcal{A}_+$ by Lemma 8.2. Let
\[ E_+ \equiv \tfrac{1}{2}(\mathrm{id} + \Theta). \tag{8.2} \]
It is the conditional expectation from $\mathcal{A}$ onto $\mathcal{A}_+$, characterized by $E_+(x) \in \mathcal{A}_+$ for all $x \in \mathcal{A}$ and $\tau(xy) = \tau(E_+(x)y)$ for all $x \in \mathcal{A}$ and $y \in \mathcal{A}_+$. Then $\alpha(\alpha^{-1}(y)) = y \in \mathcal{A}_+$ implies $\alpha^{-1}(y) \in \mathcal{A}_+$ and
\begin{align*}
\tau(E_+(\alpha(x))y) &= \tau(\alpha(x)y) = \tau(\alpha(x\alpha^{-1}(y))) = \tau(x\alpha^{-1}(y)) \\
&= \tau(E_+(x)\alpha^{-1}(y)) = \tau(\alpha^{-1}\{\alpha(E_+(x))y\}) = \tau(\alpha(E_+(x))y),
\end{align*}
where we have used $\alpha^{-1}(y) \in \mathcal{A}_+$ in the fourth equality. Since $E_+(\alpha(x)) \in \mathcal{A}_+$ and $\alpha(E_+(x)) \in \mathcal{A}_+$ (due to $E_+(x) \in \mathcal{A}_+$), we have $E_+(\alpha(x)) = \alpha(E_+(x))$. Therefore $E_+\alpha = \alpha E_+$ and $\alpha$ commutes with $\Theta$.

Remark. A referee pointed out the following approach (which we have not adopted). Under assumption (IV), any $\alpha_t|_{\mathcal{A}_+}$-KMS state of $\mathcal{A}_+$ has a unique even extension to an $\alpha_t$-KMS state of $\mathcal{A}$ (e.g. by [11]). This allows one to reduce the analysis of KMS states to the case of an asymptotically abelian system due to (8.1).

The dynamics $\alpha_t$ is translation invariant if and only if its generator $\delta_\alpha$ commutes with every $\tau_k$ ($k \in \mathbb{Z}^\nu$). (This statement includes the $\tau_k$-invariance of the domain of the generator.) The corresponding standard potential (which exists under the assumptions (I) and (II)) satisfies the following translation covariance condition:

($\Phi$-f) $\tau_k \Phi(I) = \Phi(I + k)$, for all finite subsets $I$ of $\mathbb{Z}^\nu$ and all $k \in \mathbb{Z}^\nu$.

Such a potential will be said to be translation covariant. We consider the set $\mathcal{P}_\tau$ of all translation covariant potentials in $\mathcal{P}$. Namely, $\mathcal{P}_\tau$ is defined to be the set of all $\Phi$ satisfying all conditions of Definition 5.10, i.e. ($\Phi$-a,b,c,d,e), and the translation covariance ($\Phi$-f).
We make $\mathcal{P}_\tau$ a real vector space as a function space on the set of finite subsets of $\mathbb{Z}^\nu$ by the linear operation given in (5.24). In the same way, we define $\mathcal{H}_\tau$ to be the subspace of $\mathcal{H}$ such that each element $H$ satisfies the following translation covariance condition:

(H-vi) $\tau_k(H(I)) = H(I + k)$ for all $k \in \mathbb{Z}^\nu$.

We denote the set of all translation invariant derivations in $\Delta(\mathcal{A}_\circ)$ by $\Delta_\tau(\mathcal{A}_\circ)$. Namely, $\Delta_\tau(\mathcal{A}_\circ)$ is the set of all $*$-derivations with $\mathcal{A}_\circ$ as their domain, commuting with $\Theta$ and also with $\tau$. The following corollaries follow directly from Theorems 5.7, 5.12 and 5.13.

Corollary 8.3. The relation (H-iii) (as given in Sec. 5.2) between $H \in \mathcal{H}_\tau$ and $\delta \in \Delta_\tau(\mathcal{A}_\circ)$ gives a bijective, real linear map from $\mathcal{H}_\tau$ to $\Delta_\tau(\mathcal{A}_\circ)$.

Corollary 8.4. The equations (5.22) and (5.23) for $\Phi \in \mathcal{P}_\tau$ and $H \in \mathcal{H}_\tau$ give a bijective, real linear map from $\mathcal{P}_\tau$ to $\mathcal{H}_\tau$.

Corollary 8.5. The equations (5.27) and (5.28) between $\Phi \in \mathcal{P}_\tau$ and $\delta_\Phi \in \Delta_\tau(\mathcal{A}_\circ)$ give a bijective, real linear map from $\mathcal{P}_\tau$ to $\Delta_\tau(\mathcal{A}_\circ)$.

For $\Phi \in \mathcal{P}_\tau$, we define
\[ \|\Phi\| \equiv \|H(\{n\})\|, \]
which is independent of $n \in \mathbb{Z}^\nu$ due to the translation covariance of $\Phi$. It defines a norm on $\mathcal{P}_\tau$. We show that this norm makes $\mathcal{P}_\tau$ a Banach space, after giving the following energy estimates.

Lemma 8.6. For $\Phi \in \mathcal{P}_\tau$, the following estimates hold:
\[ \|U(I)\| \le \|H(I)\| \le \|\Phi\| \cdot |I|. \tag{8.3} \]
In particular, if $\|\Phi\| = 0$, then $H = U = \Phi = 0$ (as functions of finite subsets $I$ of $\mathbb{Z}^\nu$).

Proof. For $I = \emptyset$, both sides of the above inequalities are 0. For $I = \{n_1, \ldots, n_{|I|}\}$, we obtain
\begin{align*}
H(I) &= \lim_{J \nearrow \mathbb{Z}^\nu} \sum_K \{\Phi(K);\ K \cap I \ne \emptyset,\ K \subset J\} \\
&= \lim_{J \nearrow \mathbb{Z}^\nu} \sum_{i=1}^{|I|} \sum_K \{\Phi(K);\ K \ni n_i,\ K \not\ni n_1, \ldots, n_{i-1},\ K \subset J\} \\
&= \lim_{J \nearrow \mathbb{Z}^\nu} \sum_{i=1}^{|I|} E_{\{n_1, \ldots, n_{i-1}\}^c} \sum_K \{\Phi(K);\ K \ni n_i,\ K \subset J\} \\
&= \sum_{i=1}^{|I|} E_{\{n_1, \ldots, n_{i-1}\}^c} H(\{n_i\}),
\end{align*}
where the third equality comes from the following identities
\[ E_{\{n_1, \ldots, n_{i-1}\}^c} \Phi(K) = \begin{cases} 0 & \text{if } \{n_1, \ldots, n_{i-1}\} \cap K \ne \emptyset, \text{ i.e. } \{n_1, \ldots, n_{i-1}\}^c \not\supset K, \\ \Phi(K) & \text{if } n_1, \ldots, n_{i-1} \notin K, \text{ i.e. } \{n_1, \ldots, n_{i-1}\}^c \supset K, \end{cases} \]
and the interchange of $\lim_{J \nearrow \mathbb{Z}^\nu}$ and $E_{\{n_1, \ldots, n_{i-1}\}^c}$ in the fourth equality is allowed due to $\|E_{\{n_1, \ldots, n_{i-1}\}^c}\| = 1$. The following estimate follows:
\[ \|H(I)\| \le \sum_{i=1}^{|I|} \|E_{\{n_1, \ldots, n_{i-1}\}^c} H(\{n_i\})\| \le \sum_{i=1}^{|I|} \|H(\{n_i\})\| = |I| \cdot \|\Phi\|. \tag{8.4} \]
Since $U(I) = E_I(H(I))$ and $\|E_I\| = 1$, we obtain
\[ \|U(I)\| \le \|H(I)\| \le \|\Phi\| \cdot |I|. \]
If $\|\Phi\| = 0$, then $H(I) = U(I) = 0$ for all $I$ by this estimate and hence $\Phi(I) = 0$ by (5.16).

The following estimate will be used later.

Lemma 8.7. For disjoint finite subsets $I$ and $J$ of $\mathbb{Z}^\nu$,
\[ \|U(I \cup J) - U(I)\| \le \|\Phi\| \cdot |J|. \tag{8.5} \]
Proof. Due to $I \cap J = \emptyset$,
\[ U(I \cup J) - U(I) = \sum_K \{\Phi(K);\ K \cap J \ne \emptyset,\ K \subset I \cup J\}. \]
Therefore, we have
\[ U(I \cup J) - U(I) = E_{I \cup J} H(J), \]
because $H(J)$ is the sum of $\Phi(K)$ for all $K$ satisfying $K \cap J \ne \emptyset$, and $E_{I \cup J}$ annihilates all $\Phi(K)$ for which $K$ is not contained in $I \cup J$ while it retains $\Phi(K)$ unchanged if $K$ is contained in $I \cup J$. Hence
\[ \|U(I \cup J) - U(I)\| = \|E_{I \cup J} H(J)\| \le \|H(J)\| \le \|\Phi\| \cdot |J|. \]

Proposition 8.8. $\mathcal{P}_\tau$ is a real Banach space with respect to the norm $\|\Phi\| = \|H(\{n\})\|$.

Proof. $\mathcal{P}_\tau$ is a normed space with respect to $\|\Phi\|$, because
\[ \|\Phi_1 + \Phi_2\| = \|H_{\Phi_1 + \Phi_2}(\{n\})\| = \|H_{\Phi_1}(\{n\}) + H_{\Phi_2}(\{n\})\| \]
\[ \le \|H_{\Phi_1}(\{n\})\| + \|H_{\Phi_2}(\{n\})\| = \|\Phi_1\| + \|\Phi_2\|, \]
\[ \|c\Phi\| = \|cH_\Phi(\{n\})\| = |c|\,\|H_\Phi(\{n\})\| = |c|\,\|\Phi\|, \]
for $\Phi_1, \Phi_2, \Phi \in \mathcal{P}_\tau$ and $c \in \mathbb{R}$, due to the linear dependence of $H_\Phi$ on $\Phi$, and because $\|\Phi\| = 0$ implies $\Phi(I) = 0$ for all $I$ due to Lemma 8.6 and (5.16).

We now show its completeness. Suppose $\{\Phi_n\}$ is a Cauchy sequence in $\mathcal{P}_\tau$ with respect to the norm $\|\cdot\|$. Let us denote the corresponding $H(I)$ and $U(I)$ for $\Phi_n$ by $H_n(I)$ and $U_n(I)$, respectively. The linear dependence of $H(I)$ on $\Phi$ and Lemma 8.6 imply that $\{H_n(I)\}$ is a Cauchy sequence in $\mathcal{A}$ with respect to the C$^*$-norm. Since $\mathcal{A}$ is a C$^*$-algebra, $\{H_n(I)\}$ has a unique limit in $\mathcal{A}$, which will be denoted by $H_\infty(I)$. Since $U(I) = E_I(H(I))$ with $\|E_I\| = 1$, $\{U_n(I)\}$ is also a Cauchy sequence in $\mathcal{A}$, has a unique limit $U_\infty(I)$, and $U_\infty(I) = E_I(H_\infty(I))$. For each finite subset $I$ of $\mathbb{Z}^\nu$, $\{\Phi_n(I)\}$ also converges to the potential $\Phi_\infty(I)$ for $U_\infty(I)$ in the C$^*$-norm because $\Phi(I)$ is a finite linear combination of $U(J)$, $J \subset I$, due to (5.16), and $\{U_n(J)\}$ converges to $U_\infty(J)$ in the C$^*$-norm for every such $J$.

For any finite subsets $I$, $J$ of $\mathbb{Z}^\nu$, we obtain
\begin{align*}
\sum_K \{\Phi_\infty(K);\ K \cap I \ne \emptyset,\ K \subset J\} &= \sum_K \lim_n \{\Phi_n(K);\ K \cap I \ne \emptyset,\ K \subset J\} \\
&= \lim_n \sum_K \{\Phi_n(K);\ K \cap I \ne \emptyset,\ K \subset J\} \\
&= \lim_n E_J(H_n(I)) = E_J(\lim_n H_n(I)) \\
&= E_J(H_\infty(I)),
\end{align*}
where the third equality is due to (5.20). Hence, by (4.23) we have
\[ \lim_{J \nearrow \mathbb{Z}^\nu} \Bigl( \sum_K \{\Phi_\infty(K);\ K \cap I \ne \emptyset,\ K \subset J\} \Bigr) = \lim_{J \nearrow \mathbb{Z}^\nu} E_J(H_\infty(I)) = H_\infty(I). \]
Thus $\Phi_\infty$ satisfies the condition ($\Phi$-e) in the definition of $\mathcal{P}_\tau$. The other conditions ($\Phi$-a), ($\Phi$-b), ($\Phi$-c), ($\Phi$-d), and ($\Phi$-f) are satisfied since each $\Phi_n$ satisfies them and $\lim_n \Phi_n(I) = \Phi_\infty(I)$ for every finite subset $I$ of $\mathbb{Z}^\nu$. In conclusion, we have $\Phi_\infty \in \mathcal{P}_\tau$. Finally, we have
\[ \lim_n \|\Phi_n - \Phi_\infty\| = \lim_n \|H_n(\{0\}) - H_\infty(\{0\})\| = 0. \]
We have now shown the completeness of $\mathcal{P}_\tau$.
8.2. Finite range potentials

Definition 8.9. (1) A potential $\Phi \in \mathcal{P}_\tau$ is said to be of a finite range if there exists an $r \ge 0$ such that $\Phi(I) = 0$ whenever
\[ \mathrm{diam}(I) = \max\{|i - j|;\ i, j \in I\} > r. \tag{8.6} \]
The infimum of such $r$ is called the range of $\Phi$.

(2) The subspace of $\mathcal{P}$ consisting of all potentials $\Phi \in \mathcal{P}$ of a finite range is denoted by $\mathcal{P}^f$. Furthermore, we denote
\[ \mathcal{P}_\tau^f \equiv \mathcal{P}^f \cap \mathcal{P}_\tau. \tag{8.7} \]
For $a \in \mathbb{N}$, $C_a$ denotes the following cube in $\mathbb{Z}^\nu$:
\[ C_a \equiv \{x \in \mathbb{Z}^\nu;\ 0 \le x_i \le a - 1,\ i = 1, \ldots, \nu\}. \tag{8.8} \]
We introduce the following averaged conditional expectation:
\[ E_a \equiv \frac{1}{|C_a|} \sum_{i \in C_a} E_{C_a - i}, \tag{8.9} \]
where $|C_a| = a^\nu$ is the number of lattice points in $C_a$, called the volume of $C_a$. (The sum in the above equation is over all translates of $C_a$ which contain the origin $0 \in \mathbb{Z}^\nu$.) For any finite subset $I \subset \mathbb{Z}^\nu$, $l(a, I)$ denotes the number of translates of $C_a$ containing $I$. By definition, for any $m \in \mathbb{Z}^\nu$,
\[ l(a, I) = l(a, I + m). \tag{8.10} \]
We need the following lemma in this subsection and later.

Lemma 8.10. For a finite $I$,
\[ \lim_{a \to \infty} \frac{l(a, I)}{|C_a|} = 1. \tag{8.11} \]
Proof. Let $d \in \mathbb{N}$ be fixed such that there exists a translate $C_d + k$ ($k \in \mathbb{Z}^\nu$) of $C_d$ containing $I$. For $a > d$, a translate of $C_a$ contains $I$ if it contains $C_d + k$. Hence $l(a, I)$ is at least the number of translates of $C_a$ which contain $C_d$, which is $(a - d + 1)^\nu$. Hence
\[ 1 \ge \frac{l(a, I)}{|C_a|} \ge \frac{(a - d + 1)^\nu}{|C_a|} = \Bigl(1 - \frac{d - 1}{a}\Bigr)^\nu \to 1 \quad (a \to \infty). \]
This shows (8.11).
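The counting behind Lemma 8.10 can be tried out numerically. The following is a minimal sketch, not part of the original argument: it counts $l(a, I)$ by brute force for a small hypothetical set $I \subset \mathbb{Z}^2$ and compares the result with the closed form $\prod_i \max(0,\, a - e_i)$, where $e_i$ is the extent of $I$ along axis $i$. The function names and the sample set are chosen only for this example.

```python
from itertools import product

def count_translates(a, points):
    """Brute-force l(a, I): number of k in Z^nu with I contained in C_a + k."""
    nu = len(points[0])
    lo = [min(p[i] for p in points) for i in range(nu)]
    hi = [max(p[i] for p in points) for i in range(nu)]
    # Any translate containing at least one point of I has
    # lo[i] - a + 1 <= k[i] <= hi[i] on every axis, so this range suffices.
    ranges = [range(lo[i] - a + 1, hi[i] + 1) for i in range(nu)]
    count = 0
    for k in product(*ranges):
        if all(0 <= p[i] - k[i] <= a - 1 for p in points for i in range(nu)):
            count += 1
    return count

def closed_form(a, points):
    """Product over axes of max(0, a - extent), with extent = max - min."""
    nu = len(points[0])
    result = 1
    for i in range(nu):
        extent = max(p[i] for p in points) - min(p[i] for p in points)
        result *= max(0, a - extent)
    return result

I_SET = [(0, 0), (2, 1)]                       # hypothetical finite subset of Z^2
SHIFTED = [(x + 5, y - 3) for x, y in I_SET]   # a translate I + m

ratios = {a: count_translates(a, I_SET) / a ** 2 for a in (3, 10, 100)}
```

For this sample set the ratio $l(a, I)/|C_a|$ equals $(a-2)(a-1)/a^2$, which increases towards 1 as $a$ grows, in line with (8.11); the translated set gives the same counts, in line with (8.10).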
In order to prove that the subspace $\mathcal{P}_\tau^f$ is dense in $\mathcal{P}_\tau$, we need the following Lemma.

Lemma 8.11. For any $A \in \mathcal{A}$,
\[ \lim_{a \to \infty} E_a(A) = A. \tag{8.12} \]
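Proof. Since $\mathcal{A}_\circ$ is dense in $\mathcal{A}$, there exists $A_\varepsilon \in \mathcal{A}_\circ$ for any $\varepsilon > 0$ such that
\[ \|A_\varepsilon - A\| < \varepsilon. \tag{8.13} \]
Let $A_\varepsilon \in \mathcal{A}(I_\varepsilon)$ for a finite $I_\varepsilon$. Then there exists a sufficiently large positive integer $b$ such that a translate of $C_b$, say $C_b - k$, contains both 0 (the origin of $\mathbb{Z}^\nu$) and $I_\varepsilon$. If a translate $C_a - i$ of $C_a$ contains $C_b - k$, then $E_{C_a - i}(A_\varepsilon) = A_\varepsilon$ because $C_a - i \supset C_b - k \supset I_\varepsilon$ and $A_\varepsilon \in \mathcal{A}(I_\varepsilon)$. Such $i$ belongs to $C_a$ due to $0 \in C_b - k \subset C_a - i$. The number of translates $C_a - i$ of $C_a$ which contain $C_b - k$ is equal to $l(a, C_b)$ (the number of translates of $C_a$ which contain $C_b$). Therefore, we obtain
\[ \|A_\varepsilon - E_a(A_\varepsilon)\| = \Bigl\| \Bigl(1 - \frac{l(a, C_b)}{|C_a|}\Bigr) A_\varepsilon - \frac{1}{|C_a|} \sum \{E_{C_a - i}(A_\varepsilon);\ i \in C_a,\ C_a - i \not\supset C_b - k\} \Bigr\|. \]
Hence, by using $\|E_{C_a - i}(A_\varepsilon)\| \le \|A_\varepsilon\|$ due to $\|E_{C_a - i}\| = 1$, we obtain
\[ \|A_\varepsilon - E_a(A_\varepsilon)\| \le \Bigl\{ \Bigl(1 - \frac{l(a, C_b)}{|C_a|}\Bigr) + \frac{1}{|C_a|}\{|C_a| - l(a, C_b)\} \Bigr\} \|A_\varepsilon\| = 2\Bigl(1 - \frac{l(a, C_b)}{|C_a|}\Bigr) \|A_\varepsilon\|. \]
By Lemma 8.10,
\[ \lim_{a \to \infty} \frac{l(a, C_b)}{|C_a|} = 1. \]
Hence, there exists $n_\varepsilon \in \mathbb{N}$ such that for $a \ge n_\varepsilon$,
\[ \|A_\varepsilon - E_a(A_\varepsilon)\| < \varepsilon. \tag{8.14} \]
Hence, for $a \ge n_\varepsilon$,
\[ \|A - E_a(A)\| \le \|A - A_\varepsilon\| + \|A_\varepsilon - E_a(A_\varepsilon)\| + \|E_a(A_\varepsilon - A)\| < 3\varepsilon \]
by (8.13), (8.14) and $\|E_a\| = 1$.

Theorem 8.12. $\mathcal{P}_\tau^f$ is dense in $\mathcal{P}_\tau$.

Proof. Let $\Phi \in \mathcal{P}_\tau$. For any finite $I \subset \mathbb{Z}^\nu$ containing the origin 0 of $\mathbb{Z}^\nu$,
\[ E_a(\Phi(I)) = \frac{l(a, I)}{|C_a|} \Phi(I), \tag{8.15} \]
because $E_{C_a - i}(\Phi(I)) = \Phi(I)$ if $C_a - i$ contains $I$ while $E_{C_a - i}(\Phi(I)) = 0$ if $C_a - i$ does not contain $I$ due to ($\Phi$-d). Note that all translates of $C_a$ which contain $I$ appear in the sum (8.9) since $I$ is assumed to contain 0. We now consider the following potential
\[ \Phi_a(I) = \frac{l(a, I)}{|C_a|} \Phi(I). \tag{8.16} \]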
Due to $\Phi \in \mathcal{P}_\tau$, ($\Phi$-a), ($\Phi$-b), ($\Phi$-c) and ($\Phi$-d) for $\Phi_a$ follow automatically. Since $\Phi \in \mathcal{P}_\tau$ is translation covariant and $l(a, I)$ is invariant under translation of $I$ by (8.10), $\Phi_a$ satisfies the translation covariance ($\Phi$-f). $\Phi_a$ is of a finite range because there are no translates of $C_a$ containing $I$ if $\mathrm{diam}(I) > \sqrt{\nu}\,(a - 1)$ and hence $l(a, I) = 0$ for such $I$ and $a\ (\in \mathbb{N})$. Hence ($\Phi$-e) is automatically satisfied. Therefore we conclude that $\Phi_a \in \mathcal{P}_\tau^f$.

We compute
\[ E_a(H_\Phi(\{0\})) = \sum_{J \ni 0} \frac{1}{|C_a|} \sum_{i \in C_a} E_{C_a - i}(\Phi(J)) = \sum_{J \ni 0} \frac{l(a, J)}{|C_a|} \Phi(J) = H_{\Phi_a}(\{0\}), \]
where we have used $E_{C_a - i}(\Phi(J)) = \Phi(J)$ for $C_a - i \supset J$ and $E_{C_a - i}(\Phi(J)) = 0$ for $C_a - i \not\supset J$ due to ($\Phi$-d). (Note that if a translate $C_a - i$ contains $J$, then $i \in C_a$ due to $0 \in J$ and hence the number of $i \in C_a$, for which $C_a - i \supset J$, is $l(a, J)$.) By Lemma 8.11, we obtain
\[ \lim_{a \to \infty} \|\Phi - \Phi_a\| = \lim_{a \to \infty} \|H_\Phi(\{0\}) - H_{\Phi_a}(\{0\})\| = \lim_{a \to \infty} \|H_\Phi(\{0\}) - E_a(H_\Phi(\{0\}))\| = 0. \]
This completes the proof.

Corollary 8.13. $\mathcal{P}_\tau$ is a separable Banach space.

Proof. For each $n \in \mathbb{N}$, the set of all $\Phi \in \mathcal{P}_\tau^f$ with range not exceeding $n$ is a finite dimensional subspace of $\mathcal{P}_\tau$, because such $\Phi$ is determined by $\Phi(I)$ for a finite number of $I$ containing the origin and satisfying $\mathrm{diam}(I) \le n$, and so has a dense countable subset. Taking the union over $n \in \mathbb{N}$, we have a countable dense subset of $\mathcal{P}_\tau^f$. By Theorem 8.12, the same countable subset is dense in $\mathcal{P}_\tau$. We have now shown that $\mathcal{P}_\tau$ is separable.

9. Thermodynamic Limit

The van Hove limits of the densities (volume averages) of extensive quantities are usually called thermodynamic limits. We now provide their existence theorems. The same proof as in the case of spin lattice systems (see e.g. [17], [23] and [40]) is applicable to the present Fermion lattice case. We present, however, a slightly simplified proof by using methods different from those of the known proof. First we derive a surface energy estimate, which is crucial to the argument of the present section.
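9.1. Surface energy estimate

Lemma 9.1. For $\Phi \in \mathcal{P}_\tau$,
\[ \mathrm{v.H.}\!\lim_{I \to \infty} \frac{\|W(I)\|}{|I|} = 0. \tag{9.1} \]
Proof. Let $\{I_\alpha\}$ be an arbitrary van Hove net of $\mathbb{Z}^\nu$. For $n \in \mathbb{Z}^\nu$ and a finite subset $I$ of $\mathbb{Z}^\nu$, let
\begin{align*}
W_n(I) &\equiv \lim_{J \nearrow \mathbb{Z}^\nu} \sum_K \{\Phi(K);\ K \ni n,\ K \cap I^c \ne \emptyset,\ K \subset J\} \\
&= \lim_{J \nearrow \mathbb{Z}^\nu} (H_J(\{n\}) - E_I\{H_J(\{n\})\}) \\
&= H(\{n\}) - E_I\{H(\{n\})\}.
\end{align*}
Let $B_r^{\mathbb{Z}^\nu}(n)$ be the intersection of $B_r(n)$ (the ball with its center $n$ and radius $r$) and $\mathbb{Z}^\nu$. If $n \in I$ and $n \notin \mathrm{surf}_r(I)$, then $B_r^{\mathbb{Z}^\nu}(n) \subset I$ and hence
\[ E_I(H_{B_r^{\mathbb{Z}^\nu}(n)}(\{n\})) = H_{B_r^{\mathbb{Z}^\nu}(n)}(\{n\}). \]
Therefore,
\[ W_n(I) = H(\{n\}) - H_{B_r^{\mathbb{Z}^\nu}(n)}(\{n\}) - E_I\{H(\{n\}) - H_{B_r^{\mathbb{Z}^\nu}(n)}(\{n\})\}. \]
From this, we obtain
\[ \|W_n(I)\| \le 2\|H(\{n\}) - H_{B_r^{\mathbb{Z}^\nu}(n)}(\{n\})\|. \]
By (5.23), for given $\varepsilon > 0$, we can take sufficiently large $r > 0$ (hence sufficiently large $B_r^{\mathbb{Z}^\nu}(0)$) satisfying
\[ \|H(\{0\}) - H_{B_r^{\mathbb{Z}^\nu}(0)}(\{0\})\| < \frac{\varepsilon}{4}. \]
By the translation covariance assumption on $\Phi$, we have
\[ \|H(\{n\}) - H_{B_r^{\mathbb{Z}^\nu}(n)}(\{n\})\| = \|\tau_n\{H(\{0\}) - H_{B_r^{\mathbb{Z}^\nu}(0)}(\{0\})\}\| = \|H(\{0\}) - H_{B_r^{\mathbb{Z}^\nu}(0)}(\{0\})\| \]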
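0, take an $a_0 \in \mathbb{N}$ such that $\|W(C_a)\| < |C_a|\,\varepsilon/2$ for all $a > a_0$. For any such $a$, there exists an $\alpha_0(a)$ such that, for $\alpha > \alpha_0(a)$,
\[ \Bigl\| H(I_\alpha) - \sum_{i=1}^{n_a^-(I_\alpha)} U(D_i^{(a,\alpha)}) \Bigr\| < n_a^-(I_\alpha)\,|C_a|\,\varepsilon, \tag{9.6} \]
\[ \Bigl\| U(I_\alpha) - \sum_{i=1}^{n_a^-(I_\alpha)} U(D_i^{(a,\alpha)}) \Bigr\| < n_a^-(I_\alpha)\,|C_a|\,\varepsilon, \tag{9.7} \]
and
\[ 1 \ge \frac{n_a^-(I_\alpha)\,|C_a|}{|I_\alpha|} \ge 1 - \frac{\varepsilon}{\|\Phi\|}. \tag{9.8} \]
Proof. Before we start the proof, we note that the existence of $a_0$ is guaranteed by Lemma 9.1. Let us set
\[ D^{(a,\alpha)} \equiv \bigcup_{i=1}^{n_a^-(I_\alpha)} D_i^{(a,\alpha)}, \qquad D'^{(a,\alpha)} \equiv I_\alpha \setminus D^{(a,\alpha)}. \]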
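Obviously
\[ |D'^{(a,\alpha)}| \le (n_a^+(I_\alpha) - n_a^-(I_\alpha))\,|C_a|, \]
and
\[ n_a^+(I_\alpha)\,|C_a| \ge |I_\alpha| \ge n_a^-(I_\alpha)\,|C_a|. \]
From this, we obtain
\[ 1 \ge \frac{|I_\alpha|}{n_a^+(I_\alpha)\,|C_a|} \ge \frac{n_a^-(I_\alpha)}{n_a^+(I_\alpha)}, \qquad 1 \ge \frac{n_a^-(I_\alpha)\,|C_a|}{|I_\alpha|} \ge \frac{n_a^-(I_\alpha)}{n_a^+(I_\alpha)}. \tag{9.9} \]
On the other hand,
\[ H(I_\alpha) - \sum_{i=1}^{n_a^-(I_\alpha)} U(D_i^{(a,\alpha)}) = \sum_{i=1}^{n_a^-(I_\alpha)} E_{\{D_1^{(a,\alpha)} \cup \cdots \cup D_{i-1}^{(a,\alpha)}\}^c}(W(D_i^{(a,\alpha)})) + E_{\{D^{(a,\alpha)}\}^c}(H(D'^{(a,\alpha)})). \]
Therefore,
\begin{align*}
\Bigl\| H(I_\alpha) - \sum_{i=1}^{n_a^-(I_\alpha)} U(D_i^{(a,\alpha)}) \Bigr\| &\le \sum_{i=1}^{n_a^-(I_\alpha)} \|W(D_i^{(a,\alpha)})\| + \|H(D'^{(a,\alpha)})\| \\
&\le n_a^-(I_\alpha)\,|C_a| \cdot \frac{\varepsilon}{2} + \|\Phi\|\,|D'^{(a,\alpha)}|, \tag{9.10}
\end{align*}
where in the second inequality the assumption $\|W(C_a)\| < |C_a|\,\varepsilon/2$ together with the translation covariance of $\Phi$ are used for $\|W(D_i^{(a,\alpha)})\|$, and Lemma 8.6 is used for $\|H(D'^{(a,\alpha)})\|$. Due to condition (1) for the van Hove limit, there exists $\alpha_0(a)$ for given $\varepsilon_1 > 0$ such that, for $\alpha \ge \alpha_0(a)$,
\[ 0 \le 1 - \frac{n_a^-(I_\alpha)}{n_a^+(I_\alpha)} < \varepsilon_1. \tag{9.11} \]
If $\varepsilon_1 < 1$, then
\[ n_a^+(I_\alpha) < \frac{1}{1 - \varepsilon_1}\, n_a^-(I_\alpha), \qquad |D'^{(a,\alpha)}| \le n_a^+(I_\alpha)\,\varepsilon_1\,|C_a| \]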
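0 such that for all $\alpha > \alpha_0(a)$,
\[ \Bigl| \frac{1}{|I_\alpha|}\,\omega(H(I_\alpha)) - \frac{1}{|C_a|}\,\omega(U(C_a)) \Bigr| < 2\varepsilon, \tag{9.30} \]
where we can take the same $a \in \mathbb{N}$ and $\alpha_0(a)$ uniformly in $\omega \in \mathcal{A}^*_{+,1}$. This estimate implies that $\{\frac{1}{|I_\alpha|}\,\omega(H(I_\alpha))\}_\alpha$ is a Cauchy net in $\mathbb{R}$ and hence converges.

Since $\omega(H(I))$ is linear in $\Phi$ and affine in $\omega$, so is $e_\Phi(\omega)$. Due to (8.3), we obtain $|e_\Phi(\omega)| \le \|\Phi\|$.

Finally we show the continuity in $\omega$. Let $\{\omega_\gamma\}_\gamma$ be a net of states converging to $\omega$ in the weak$^*$ topology. For any $\varepsilon > 0$, we fix $a \in \mathbb{N}$ satisfying (9.30) for all $\alpha > \alpha_0(a)$ and for all states $\omega$. From the weak$^*$ convergence of $\{\omega_\gamma\}_\gamma$ to $\omega$, there exists $\gamma_\varepsilon$ such that for all $\gamma \ge \gamma_\varepsilon$
\[ \frac{1}{|C_a|}\,|\omega(U(C_a)) - \omega_\gamma(U(C_a))| < \varepsilon. \]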
Thus we have
\[ \Bigl| \frac{1}{|I_\alpha|}\,\omega(H(I_\alpha)) - \frac{1}{|I_\alpha|}\,\omega_\gamma(H(I_\alpha)) \Bigr| < 5\varepsilon \]
for all $\alpha > \alpha_0(a)$. By taking the van Hove limit, we obtain
\[ |e_\Phi(\omega) - e_\Phi(\omega_\gamma)| \le 5\varepsilon \]
for all $\gamma \ge \gamma_\varepsilon$. Hence $e_\Phi(\omega)$ is continuous in $\omega$ relative to the weak$^*$ topology.

10. Entropy for Fermion Systems

10.1. SSA for Fermion systems

We first show the SSA property of entropy for the Fermion case, which is a consequence of the results on the conditional expectations in Secs. 3 and 4.

Theorem 10.1. For finite subsets $I$ and $J$ of $\mathbb{Z}^\nu$, the strong subadditivity (SSA) of $\hat S$ holds for any state $\psi$ of $\mathcal{A}$:
\[ \hat S(\psi_{I \cup J}) - \hat S(\psi_I) - \hat S(\psi_J) + \hat S(\psi_{I \cap J}) \le 0, \tag{10.1} \]
where $\psi_K$ denotes the restriction of $\psi$ to $\mathcal{A}(K)$. $\hat S$ in this inequality can be replaced by $S$:
\[ S(\psi_{I \cup J}) - S(\psi_I) - S(\psi_J) + S(\psi_{I \cap J}) \le 0. \tag{10.2} \]
Proof. The SSA of $\hat S$ follows from Theorem 3.7 and Theorem 4.13. By (3.1) and
\[ \log 2^{|I \cup J|} - \log 2^{|I|} - \log 2^{|J|} + \log 2^{|I \cap J|} = 0, \]
the SSA of $\hat S$ implies that of $S$.

Remark 1. The strong subadditivity can be rewritten as
\[ S(\psi_{123}) - S(\psi_{13}) - S(\psi_{23}) + S(\psi_3) \le 0, \tag{10.3} \]
for any disjoint subsets $I_1$, $I_2$ and $I_3$ of $\mathbb{Z}^\nu$, where $\psi_{123}$ denotes the restriction of $\psi$ to $\mathcal{A}(I_1 \cup I_2 \cup I_3)$, and so on.

Remark 2. The SSA for Fermion systems above does not seem to follow from those for the tensor product systems ([27, 28]) in any obvious way.

Remark 3. Note that the SSA for Fermion systems holds whether the state $\psi$ is $\Theta$-even or not. For two disjoint finite regions $I$ and $J$, the so-called triangle inequality of entropy
\[ |S(\psi_I) - S(\psi_J)| \le S(\psi_{I \cup J}) \]
is known to hold for quantum spin lattice systems [1]. However, it can fail for Fermion lattice systems when $\psi$ breaks $\Theta$-evenness (see a concrete example in [33]).

The following is a special case of Theorem 10.1 when $I \cap J = \emptyset$.

Corollary 10.2. For disjoint finite subsets $I$ and $J$, the following subadditivity holds:
\[ \hat S(\psi_{I \cup J}) \le \hat S(\psi_I) + \hat S(\psi_J), \tag{10.4} \]
\[ S(\psi_{I \cup J}) \le S(\psi_I) + S(\psi_J). \tag{10.5} \]
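The inequalities (10.3) and (10.5) can be checked numerically in the commutative special case. The sketch below is only a classical analogue and not the Fermionic proof: it assumes a state whose entropy on each region reduces to the Shannon entropy of a joint probability distribution of three occupation bits (sites 1, 2, 3), with restrictions given by marginals. The random distribution, the fixed seed and all function names are choices made for this example.

```python
import math
import random
from itertools import product

def entropy(dist):
    """Shannon entropy -sum p log p of a dict {outcome: probability}."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0.0)

def marginal(joint, keep):
    """Marginal of a joint distribution on bit-tuples, keeping the listed positions."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def ssa_gap(joint):
    """S(123) - S(13) - S(23) + S(3) for a joint distribution on three bits."""
    return (entropy(joint)
            - entropy(marginal(joint, (0, 2)))
            - entropy(marginal(joint, (1, 2)))
            + entropy(marginal(joint, (2,))))

def subadditivity_gap(joint):
    """S(12) - S(1) - S(2): the case of (10.3) with the third region empty."""
    pair = marginal(joint, (0, 1))
    return (entropy(pair)
            - entropy(marginal(pair, (0,)))
            - entropy(marginal(pair, (1,))))

rng = random.Random(0)               # fixed seed, purely illustrative
weights = {bits: rng.random() for bits in product((0, 1), repeat=3)}
total = sum(weights.values())
joint = {bits: w / total for bits, w in weights.items()}
uniform = {bits: 1.0 / 8.0 for bits in product((0, 1), repeat=3)}

gap = ssa_gap(joint)
```

Both `ssa_gap(joint)` and `subadditivity_gap(joint)` are non-positive for any probability distribution, and both vanish for `uniform`, the analogue of the tracial state.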
10.2. Mean entropy

We now show the existence of mean entropy (von Neumann entropy density) for translation invariant states of $\mathcal{A}$.

For $s = (s_1, \ldots, s_\nu) \in \mathbb{N}^\nu$, we define $R_s$ as the following box region with edges of length $s_i - 1$ containing $s_i$ points of $\mathbb{Z}^\nu$ and with the volume $|R_s| = \prod_{i=1}^{\nu} s_i$:
\[ R_s \equiv \{x \in \mathbb{Z}^\nu;\ 0 \le x_i \le s_i - 1,\ i = 1, \ldots, \nu\}. \tag{10.6} \]
Theorem 10.3. Let $\omega$ be a translation invariant state. The van Hove limit
\[ s(\omega) \equiv \mathrm{v.H.}\!\lim_{I \to \infty} \frac{1}{|I|} S(\omega_I) \tag{10.7} \]
exists and is given as the following infimum
\[ s(\omega) = \inf_{s \in \mathbb{N}^\nu} \frac{1}{|R_s|} S(\omega_{R_s}). \tag{10.8} \]
The mean entropy functional
\[ \omega \mapsto s(\omega) \in [0, \log 2] \tag{10.9} \]
defined on the set $\mathcal{A}^{*\tau}_{+,1}$ of translation invariant states is affine and upper semicontinuous with respect to the weak$^*$ topology.

Proof. The SSA property of von Neumann entropy proved in Theorem 10.1 is sufficient for the same proof of this Theorem as in the case of quantum spin lattice systems. (See e.g. Proposition 6.2.38 of [17].)

The following results about Lipschitz continuity of bounded affine functions on a state space and, in particular, of entropy density are known.

Proposition 10.4. A bounded affine function $f$ on $\mathcal{A}^{*\tau}_{+,1}$ satisfies
\[ |f(\omega_1) - f(\omega_2)| \le (M/2)\,\|\omega_1 - \omega_2\| \]
for any $\omega_1, \omega_2 \in \mathcal{A}^{*\tau}_{+,1}$, where
\[ M \equiv \sup\{|f(\omega_1) - f(\omega_2)|;\ \omega_1, \omega_2 \in \mathcal{A}^{*\tau}_{+,1}\}. \tag{10.10} \]
Corollary 10.5. The mean entropy $s(\omega)$ satisfies
\[ |s(\omega_1) - s(\omega_2)| \le \frac{1}{2}(\log 2)\,\|\omega_1 - \omega_2\| \tag{10.11} \]
for any $\omega_1, \omega_2 \in \mathcal{A}^{*\tau}_{+,1}$.

Proposition 10.4 is the first equation on p. 108 of [23] and Corollary 10.5 is Corollary IV.4.3 on the same page of [23]. The inequality (10.11) without the factor $\frac{1}{2}$ is obtained in [20]. The coefficient $\frac{1}{2}\log 2$ is best possible, the equality being reached by $\omega_1 = \tau$ and any pure translation invariant state $\omega_2$ with vanishing mean entropy $s(\omega_2) = 0$, in which case $\|\omega_1 - \omega_2\| = 2$ because $\pi_\tau$ (type II) and $\pi_{\omega_2}$ (type I) are disjoint. An example of such an $\omega_2$ is given by Theorem 11.2 as a 'product state extension' of $\Theta$-even pure states $\varphi_i$ of $\mathcal{A}(\{i\})$ ($i \in \mathbb{Z}^\nu$) satisfying the covariance condition $\tau_k^* \varphi_i = \varphi_{i+k}$ for all $k \in \mathbb{Z}^\nu$.

We define mean entropy $\hat s(\omega)$ for $\omega \in \mathcal{A}^{*\tau}_{+,1}$ by using the trace $\tau$ instead of the matrix trace $\mathrm{Tr}_I$ for each finite $I$:
\[ \hat s(\omega) \equiv \mathrm{v.H.}\!\lim_{I \to \infty} \frac{1}{|I|} \hat S(\omega_I). \tag{10.12} \]
It is obviously related to $s(\omega)$ by
\[ s(\omega) = \hat s(\omega) + \log 2 \tag{10.13} \]
for any $\omega \in \mathcal{A}^{*\tau}_{+,1}$.
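The relation (10.13) can be traced to the different normalizations of the two traces. The following short derivation is a sketch under assumptions that this section does not restate: that $S(\omega_I) = -\mathrm{Tr}_I(\rho \log \rho)$ with $\omega_I(A) = \mathrm{Tr}_I(\rho A)$, that $\hat S(\omega_I) = -\tau(\varrho \log \varrho)$ with the adjusted density matrix $\omega_I(A) = \tau(\varrho A)$, and that $\mathrm{Tr}_I = 2^{|I|}\tau$ on $\mathcal{A}(I)$.

```latex
% Assumed normalizations: Tr_I = 2^{|I|} \tau, hence \rho = 2^{-|I|} \varrho.
\begin{align*}
S(\omega_I) &= -\mathrm{Tr}_I(\rho \log \rho)
             = -2^{|I|}\,\tau\bigl(2^{-|I|}\varrho\,(\log \varrho - |I| \log 2)\bigr) \\
            &= -\tau(\varrho \log \varrho) + |I| \log 2 \cdot \tau(\varrho)
             = \hat S(\omega_I) + |I| \log 2 ,
\end{align*}
% using \tau(\varrho) = \omega_I(1) = 1.  Dividing by |I| and taking the
% van Hove limit gives s(\omega) = \hat s(\omega) + \log 2.
```

Under the same assumptions, the tracial state has $\varrho = 1$, so $\hat S(\tau_I) = 0$ and $S(\tau_I) = |I| \log 2$, which is the upper end of the range in (10.9).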
10.3. Entropy inequalities for translation invariant states

In addition to Theorem 10.3, the SSA property of von Neumann entropy plays an essential role in the derivation of some basic entropy inequalities for the present Fermion lattice systems in the same way as for quantum spin lattice systems. The following two consequences are about monotone properties of entropy as a function on the set of box regions of the lattice; the first one is a monotone decreasing property of the finite-volume entropy density and the second one is a monotone increasing property of the entropy.

Theorem 10.6. Let ω be a translation invariant state on $\mathcal{A}$ and let $R_s$ and $R_{s'}$ be finite boxes of $\mathbb{Z}^\nu$ such that $R_s \subset R_{s'}$. Then
\[ \frac{1}{|R_s|}\,S(\omega_{R_s}) \ge \frac{1}{|R_{s'}|}\,S(\omega_{R_{s'}}), \tag{10.14} \]
\[ S(\omega_{R_s}) \le S(\omega_{R_{s'}}). \tag{10.15} \]
This theorem follows from [24], where (10.14) and (10.15) are derived from the following properties without any other input.
• Positivity and finiteness of the entropy of every local region.
• Strong subadditivity.
• Shift invariance.
In [16], sufficient conditions are given for a sequence of regions of more general shape than boxes which guarantee a monotone decreasing property of the form (10.14) for any translation invariant state ω. This result also applies to our Fermion lattice systems.

11. Variational Principle

We first prove the existence of a (unique) product state extension of given states in any (finite or infinite) number of mutually disjoint regions under the condition that all given states except for at most one are Θ-even. This result is a crucial tool to overcome possible difficulties which originate in the non-commutativity of Fermion systems in connection with the proof of variational equality in this section and in the equivalence proof of the variational principle with the KMS condition in the next section.

11.1. Extension of even states

For each I, $\mathcal{A}(I)$ is invariant under Θ and hence the restriction of Θ to $\mathcal{A}(I)$ is an automorphism of $\mathcal{A}(I)$ and will be denoted by the same symbol Θ. We need the following lemma.

Lemma 11.1. Let I be a finite subset of $\mathbb{Z}^\nu$. Let ϕ be a state of $\mathcal{A}(I)$ and $\varrho \in \mathcal{A}(I)$ be its adjusted density matrix:
\[ \varphi(A) = \tau(\varrho A) = \tau(A\varrho), \quad (A \in \mathcal{A}(I)). \]
Then ϕ is an even state if and only if $\varrho$ is Θ-even.

Proof. Since the tracial state τ is invariant under any automorphism, we obtain
\[ \varphi(A) = \varphi(\Theta(A)) = \tau(\varrho\,\Theta(A)) = \tau(\Theta\{\varrho\,\Theta(A)\}) = \tau(\Theta(\varrho)A) \]
if ϕ is even. By the uniqueness of the density matrix, we have $\Theta(\varrho) = \varrho$. By the same computation, $\varphi(\Theta(A)) = \varphi(A)$ for every $A \in \mathcal{A}(I)$ if $\Theta(\varrho) = \varrho$.

Theorem 11.2. Let $\{I_i\}$ be a (finite or infinite) family of mutually disjoint subsets of $\mathbb{Z}^\nu$ and $\varphi_i$ be a state of $\mathcal{A}(I_i)$ for each i. Let $I = \bigcup_i I_i$. Then there exists a state ϕ of $\mathcal{A}(I)$ satisfying
\[ \varphi(A_{i_1} \cdots A_{i_n}) = \prod_{j=1}^{n} \varphi_{i_j}(A_{i_j}) \tag{11.1} \]
for any set $(i_1, \ldots, i_n)$ of distinct indices and for any $A_{i_j} \in \mathcal{A}(I_{i_j})$ if all states $\varphi_i$ except for at most one are Θ-even. When such ϕ exists, it is unique.

Proof. (Case 1) A finite family of finite subsets $\{I_i\}$, $i = 1, \ldots, n$.
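For each i, let $\varrho_i$ be the density matrix of $\varphi_i$:
\[ \varphi_i(A) = \tau(\varrho_i A) = \tau(A\varrho_i), \quad (A \in \mathcal{A}(I_i)), \]
\[ \varrho_i \in \mathcal{A}(I_i), \quad \varrho_i \ge 0, \quad \varrho_i(1) = 1. \]
If $\varphi_i$ is Θ-even, then $\varrho_i$ is Θ-even, namely, $\varrho_i \in \mathcal{A}(I_i)_+$. If all states $\varphi_i$ except for one are even, all $\varrho_i$ except for one belong to $\mathcal{A}(I_i)_+$. Thus each $\varrho_i$ commutes with any $\varrho_j$. The product
\[ \varrho = \varrho_n \cdots \varrho_1 \tag{11.2} \]
is a product of mutually commuting non-negative hermitian operators and hence it is positive. Define
\[ \varphi(A) \equiv \tau(\varrho A), \quad A \in \mathcal{A}(I). \tag{11.3} \]
By the product property of τ (4.13), we have
\[ \begin{aligned} \varphi(A_1 \cdots A_n) &= \tau(\varrho A_1 \cdots A_n) = \tau(\varrho_{n-1} \cdots \varrho_1 A_1 \cdots A_{n-1} A_n \varrho_n) \\ &= \tau(\varrho_{n-1} \cdots \varrho_1 A_1 \cdots A_{n-1})\,\tau(A_n \varrho_n) \\ &= \tau(\varrho_{n-1} \cdots \varrho_1 A_1 \cdots A_{n-1})\,\varphi_n(A_n). \end{aligned} \]
Using this recursively, we obtain
\[ \varphi(A_1 \cdots A_n) = \prod_{i=1}^{n} \varphi_i(A_i). \]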
This also shows $\varphi(1) = 1$. Hence the existence is proved for Case 1. Since the monomials of the form (4.2) with all indices in I are total in $\mathcal{A}(I)$, the uniqueness of a state ϕ of $\mathcal{A}(I)$ satisfying the product property (11.1) follows.

(Case 2) A general family $\{I_i\}$.

Let $\{L_k\}$ be an increasing sequence of finite subsets of $\mathbb{Z}^\nu$ such that their union is $\mathbb{Z}^\nu$. Set $I^k_i \equiv I_i \cap L_k$ and $I^k \equiv I \cap L_k$ for each k. For each k, only a finite number (which will be denoted by n(k)) of the $I^k_i$ are non-empty and all of them are finite subsets of $\mathbb{Z}^\nu$. Note that the restriction of an even state $\varphi_i$ to $\mathcal{A}(I^k_i)$ is even. Hence we can apply the result for Case 1 to $\{I^k_i\}$. We obtain a unique product state $\varphi^k$ of $\mathcal{A}(I^k)$ satisfying
\[ \varphi^k(A_{i_1} \cdots A_{i_{n(k)}}) = \prod_{j=1}^{n(k)} \varphi^k_{i_j}(A_{i_j}), \quad A_{i_j} \in \mathcal{A}(I^k_{i_j}). \tag{11.4} \]
By the uniqueness already proved, the restriction of $\varphi^k$ to $\mathcal{A}(I^l)$ for $l < k$ coincides with $\varphi^l$. There exists a state $\varphi^\circ$ of the ∗-algebra $\bigcup_k \mathcal{A}(I^k)$ defined by
\[ \varphi^\circ(A) = \varphi^k(A) \]
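for $A \in \mathcal{A}(I^k)$. Since $\bigcup_k I^k = I$, $\bigcup_k \mathcal{A}(I^k)$ is dense in $\mathcal{A}(I)$. Then there exists a unique continuous extension ϕ of $\varphi^\circ$ to $\mathcal{A}(I)$ and ϕ is a state of $\mathcal{A}(I)$.

Take an arbitrary index n. Let
\[ A = A_1 \cdots A_n, \quad A_i \in \mathcal{A}(I_i). \]
Set $A^k_i \equiv E_{L_k}(A_i) \in \mathcal{A}(I^k_i)$. Since $L_k \nearrow \mathbb{Z}^\nu$,
\[ A_i = \lim_k A^k_i, \quad A = \lim_k (A^k_1 \cdots A^k_n). \]
Hence
\[ \begin{aligned} \varphi(A) &= \lim_k \varphi(A^k_1 \cdots A^k_n) = \lim_k \varphi^k(A^k_1 \cdots A^k_n) \\ &= \lim_k \prod_{i=1}^{n} \varphi_i(A^k_i) = \prod_{i=1}^{n} \varphi_i(A_i). \end{aligned} \]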
Thus ϕ satisfies the product property (11.1). The uniqueness of ϕ is proved in the same way as Case 1.

Remark 1. This result is given in Theorem 5.4 of Powers' thesis [36].

Remark 2. The unique product state extension ϕ is even if and only if all $\varphi_i$ are even.

Remark 3. The condition that all $\varphi_i$ except for at most one are Θ-even can be shown to be necessary for the existence of the product state extension ϕ satisfying (11.1) [14].

Lemma 11.3. Let $\{I_i\}$ be a finite family of mutually disjoint finite subsets of $\mathbb{Z}^\nu$. Let $\varphi_i$ be a state of $\mathcal{A}(I_i)$ for each i and all $\varphi_i$ be Θ-even with at most one exception. Let ϕ be their product state extension given by Theorem 11.2. Then
\[ S(\varphi) = \sum_i S(\varphi_i), \quad \hat{S}(\varphi) = \sum_i \hat{S}(\varphi_i). \tag{11.5} \]

Proof. This follows from the computation using the density matrix (11.2):
\[ \hat{S}(\varphi) = -\varphi(\log \varrho) = -\sum_i \varphi(\log \varrho_i) = -\sum_i \varphi_i(\log \varrho_i) = \sum_i \hat{S}(\varphi_i). \tag{11.6} \]
Here the mutual commutativity of the $\varrho_i$ is used. Due to $|I| = \sum_i |I_i|$, we can replace $\hat{S}$ by $S$.
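The additivity (11.5) can be illustrated in the simplest commuting situation. The sketch below is a toy check only: it replaces the density matrices by diagonal ones, that is, by classical probability vectors, so that the product state is the product distribution. The names and the numerical weights are illustrative assumptions and do not capture the Fermionic graded structure handled by Theorem 11.2.

```python
import math
from itertools import product

def entropy(p):
    """Shannon/von Neumann entropy of a diagonal density matrix with eigenvalues p."""
    return -sum(x * math.log(x) for x in p if x > 0)

# Two illustrative local states, given by their eigenvalues.
p1 = [0.7, 0.3]
p2 = [0.5, 0.25, 0.125, 0.125]

# Eigenvalues of the product state: all pairwise products.
joint = [a * b for a, b in product(p1, p2)]

lhs = entropy(joint)
rhs = entropy(p1) + entropy(p2)
print(lhs, rhs)
```

Because log(ab) = log a + log b for commuting factors, the two printed values agree up to rounding, mirroring the step from log ϱ to the sum of the log ϱ_i in (11.6).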
11.2. Variational inequality

We have already quoted the positivity of relative entropy:
\[ S(\psi, \varphi) = \tau(\hat{\rho}_\varphi \log \hat{\rho}_\varphi - \hat{\rho}_\varphi \log \hat{\rho}_\psi) \ge 0, \tag{11.7} \]
where the equality holds if and only if $\varphi = \psi$.

Recall our notation (7.14) for the local Gibbs state $\varphi^c_I$ of $\mathcal{A}(I)$ with respect to (Φ, β). Let ω be a state of $\mathcal{A}$. Substituting $\psi = \varphi^c_I$ and $\varphi = \omega_I$ into (11.7), we obtain
\[ S(\varphi^c_I, \omega_I) = -\hat{S}(\omega_I) + \beta\omega(U(I)) + \log \tau(e^{-\beta U(I)}) \ge 0. \tag{11.8} \]
Now we assume that ω is translation invariant. By dividing the above inequality by |I| and then taking the van Hove limit $I \to \infty$, we obtain the following variational inequality
\[ p(\beta\Phi) \ge \hat{s}(\omega) - \beta e_\Phi(\omega), \tag{11.9} \]
where $\hat{s}(\omega)$ is given by (10.12). Equivalently, we have
\[ P(\beta\Phi) \ge s(\omega) - \beta e_\Phi(\omega). \tag{11.10} \]
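The finite-volume inequality (11.8) has a familiar classical analogue that is easy to test numerically. The following sketch assumes a finite set of energy levels (the values in `energies` and the inverse temperature `beta` are arbitrary illustrations) and checks that the functional entropy minus β times mean energy never exceeds log Z, with equality at the Gibbs distribution. It is a toy model of the inequality, not the lattice statement (11.10) itself.

```python
import math
import random

def entropy(p):
    """Entropy of a probability vector p."""
    return -sum(x * math.log(x) for x in p if x > 0)

def functional(p, energies, beta):
    """Entropy minus beta times mean energy for the distribution p."""
    mean_energy = sum(pi * e for pi, e in zip(p, energies))
    return entropy(p) - beta * mean_energy

energies = [0.0, 1.0, 1.5, 3.0]   # illustrative energy levels
beta = 0.8                        # illustrative inverse temperature

log_Z = math.log(sum(math.exp(-beta * e) for e in energies))
gibbs = [math.exp(-beta * e - log_Z) for e in energies]

random.seed(0)
trial_values = []
for _ in range(1000):
    w = [random.random() for _ in energies]
    total = sum(w)
    trial_values.append(functional([x / total for x in w], energies, beta))

print(max(trial_values), functional(gibbs, energies, beta), log_Z)
```

The gap between log Z and the value at a trial distribution is the relative entropy with respect to the Gibbs distribution, which is the finite-dimensional content of (11.7) and (11.8).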
11.3. Variational equality

The variational inequality in the preceding subsection is now strengthened to the following variational equality.

Theorem 11.4. Let $\Phi \in \mathcal{P}_\tau$. Then
\[ P(\beta\Phi) = \sup_{\omega \in \mathcal{A}^{*\tau}_{+,1}} \{s(\omega) - \beta e_\Phi(\omega)\}, \tag{11.11} \]
where $P(\beta\Phi)$, $s(\omega)$ and $e_\Phi$ denote the pressure, mean entropy and mean energy, respectively, and $\mathcal{A}^{*\tau}_{+,1}$ denotes the set of all translation invariant states of $\mathcal{A}$.

Proof. The proof below will be carried out in the same way as for classical or quantum lattice systems ([37] or e.g. Theorem III.4.5 in [40]), with the help of the product state extension provided by Theorem 11.2. By the variational inequality (11.10), we only have to find a sequence $\{\rho_n\}$ of translation invariant states of $\mathcal{A}$ satisfying
\[ \{s(\rho_n) - \beta e_\Phi(\rho_n)\} \to P(\beta\Phi) \quad (n \to \infty). \tag{11.12} \]
For this purpose, we interrupt the proof and show the following lemma about mean entropy and mean energy of periodic states. It corresponds to Theorem 10.3 and Theorem 9.5 for translation invariant states.

Lemma 11.5. Let $a \in \mathbb{N}$, ω be an $a\mathbb{Z}^\nu$-invariant state and $\Phi \in \mathcal{P}_\tau$.

(1) The mean entropy
\[ s(\omega) = \lim_{n\to\infty} \frac{S(\omega|_{\mathcal{A}(C_{na})})}{|C_{na}|} \tag{11.13} \]
exists. It is affine, weak∗ upper semicontinuous in ω and translation invariant:
\[ s(\omega) = s(\tau_k^*(\omega)), \quad (k \in \mathbb{Z}^\nu). \tag{11.14} \]

(2) The mean energy
\[ e_\Phi(\omega) = \lim_{n\to\infty} \frac{\omega(U(C_{na}))}{|C_{na}|} \tag{11.15} \]
exists. It is linear in Φ, bounded by $\|\Phi\|$, affine and weak∗ continuous in ω, and translation invariant:
\[ e_\Phi(\omega) = e_\Phi(\tau_k^*(\omega)), \quad (k \in \mathbb{Z}^\nu). \tag{11.16} \]
Proof. We introduce a new lattice system $(\mathcal{A}^a, \{\mathcal{A}^a(I)\})$ where the total algebra $\mathcal{A}^a$ is equal to $\mathcal{A}$ and its local algebra is
\[ \mathcal{A}^a(I) \equiv \mathcal{A}\Bigl(\bigcup_{m \in I} (C_a + am)\Bigr) \]
for each finite subset I of $\mathbb{Z}^\nu$. For this new system $(\mathcal{A}^a, \{\mathcal{A}^a(I)\})$, we assign its local Hamiltonian
\[ H^a(I) \equiv H\Bigl(\bigcup_{m \in I} (C_a + am)\Bigr) \]
to each finite I, where $H(\cdot)$ denotes a local Hamiltonian of the original system $(\mathcal{A}, \{\mathcal{A}(I)\})$. If ω is an $a\mathbb{Z}^\nu$-invariant state of the system $(\mathcal{A}, \{\mathcal{A}(I)\})$, then it goes over to a translation invariant state of the new system $(\mathcal{A}^a, \{\mathcal{A}^a(I)\})$. We denote mean entropy and mean energy of ω for the system $(\mathcal{A}^a, \{\mathcal{A}^a(I)\})$ by $s^a(\omega)$ and $e^a_\Phi(\omega)$, which are shown to exist by Theorems 10.3 and 9.5. Because of the scale change, we have
\[ s(\omega) = \lim_{n\to\infty} \frac{S(\omega_{C_{na}})}{|C_{na}|} = |C_a|^{-1} s^a(\omega), \tag{11.17} \]
\[ e_\Phi(\omega) = \lim_{n\to\infty} \frac{\omega(U(C_{na}))}{|C_{na}|} = |C_a|^{-1} e^a_\Phi(\omega). \tag{11.18} \]
Hence those properties of mean entropy and mean energy of translation invariant states given in Theorems 10.3 and 9.5 go over to those for periodic states.

Now we show (11.14) for any $a\mathbb{Z}^\nu$-invariant state ω and any $k \in \mathbb{Z}^\nu$. Due to the $a\mathbb{Z}^\nu$-invariance of ω, we only have to show the assertion for any $k \in C_a$. For any $n \in \mathbb{N}$, we have
\[ S(\tau_k^*\omega|_{\mathcal{A}(C_{na})}) = S(\omega|_{\mathcal{A}(C_{na}+k)}), \tag{11.19} \]
which is to be compared with $S(\omega|_{\mathcal{A}(C_{na})})$. Since $k \in C_a$, we have
\[ C_{(n-1)a} + a(1, \ldots, 1) \subset C_{na} + k \subset C_{(n+1)a}. \]
By (3.2), (10.5), and the periodicity of ω,
\[ \begin{aligned} S(\omega|_{\mathcal{A}(C_{na}+k)}) &\le S(\omega|_{\mathcal{A}(C_{(n-1)a})}) + \{|C_{na}| - |C_{(n-1)a}|\}\log 2, \\ S(\omega|_{\mathcal{A}(C_{na}+k)}) &\ge S(\omega|_{\mathcal{A}(C_{(n+1)a})}) - \{|C_{(n+1)a}| - |C_{na}|\}\log 2. \end{aligned} \tag{11.20} \]
Due to
\[ \lim_{n\to\infty} \frac{|C_{na}|}{|C_{(n-1)a}|} = 1, \quad \lim_{n\to\infty} \frac{|C_{na}|}{|C_{(n+1)a}|} = 1, \tag{11.21} \]
and (11.19), we obtain
\[ s(\tau_k^*\omega) = \lim_{n\to\infty} \frac{S(\omega|_{\mathcal{A}(C_{na}+k)})}{|C_{na}|} = \lim_{n\to\infty} \frac{S(\omega|_{\mathcal{A}(C_{na})})}{|C_{na}|} = s(\omega), \]
which is the desired equality (11.14).

It remains to show (11.16). Applying the inequality (8.5) to the pair $I = (C_{(n-1)a} + a(1, \ldots, 1))$, $J = (C_{na} + k) \setminus \{C_{(n-1)a} + a(1, \ldots, 1)\}$ and to the pair $I = (C_{(n-1)a} + a(1, \ldots, 1))$, $J = C_{na} \setminus \{C_{(n-1)a} + a(1, \ldots, 1)\}$, we obtain
\[ \begin{aligned} \|U(C_{na}) - U(C_{na} + k)\| &\le \|U(C_{na}) - U(I)\| + \|U(I) - U(C_{na} + k)\| \\ &\le 2\|\Phi\|\{|C_{na}| - |C_{(n-1)a}|\}, \end{aligned} \]
where $I = (C_{(n-1)a} + a(1, \ldots, 1))$. Hence due to (11.21) and the periodicity of ω,
\[ e_\Phi(\tau_k^*\omega) = \lim_{n\to\infty} \frac{\omega(U(C_{na} + k))}{|C_{na}|} = \lim_{n\to\infty} \frac{\omega(U(C_{na}))}{|C_{na}|} = e_\Phi(\omega), \]
which is the desired equality (11.16).

Now we resume the proof of Theorem 11.4.

Proof of Theorem 11.4 (continued). Due to Θ-evenness of the internal energy U(I) for every finite $I \subset \mathbb{Z}^\nu$, we have
\[ \varphi^c_I \in \mathcal{A}(I)^{*\Theta}_{+,1}. \tag{11.22} \]
Let $a \in \mathbb{N}$. For distinct $m \in \mathbb{Z}^\nu$, $\{C_a + am\}$ are mutually disjoint and their union for all $m \in \mathbb{Z}^\nu$ is $\mathbb{Z}^\nu$.

We apply Theorem 11.2 to the local Gibbs states $\varphi^c_{C_a+am} \in \mathcal{A}(C_a + am)^{*\Theta}_{+,1}$, $m \in \mathbb{Z}^\nu$, and obtain an even product state of $\mathcal{A}$, which we denote by $\varphi^c_a$.

For any $k \in \mathbb{Z}^\nu$, $\tau_{ak}^*\varphi^c_a = \varphi^c_a$ by the uniqueness of the product state with the same component states. Thus $\varphi^c_a$ is an $a\mathbb{Z}^\nu$-invariant state.

By using $\varphi^c_a$ we construct an averaged state $\widehat{\varphi^c_a}$ which is translation invariant as follows:
\[ \widehat{\varphi^c_a} \equiv \sum_{m \in C_a} \frac{\tau_m^*\varphi^c_a}{|C_a|} \in \mathcal{A}^{*\tau}_{+,1}. \tag{11.23} \]
We now show (11.12) by taking $\rho_n = \widehat{\varphi^c_n}$. By affine dependence of s and $e_\Phi$ on the space of periodic states in Lemma 11.5,
\[ s(\widehat{\varphi^c_a}) = |C_a|^{-1} \sum_{m \in C_a} s(\tau_m^*\varphi^c_a), \quad e_\Phi(\widehat{\varphi^c_a}) = |C_a|^{-1} \sum_{m \in C_a} e_\Phi(\tau_m^*\varphi^c_a). \]
Due to (11.14) and (11.16), they imply
\[ s(\widehat{\varphi^c_a}) = s(\varphi^c_a), \tag{11.24} \]
\[ e_\Phi(\widehat{\varphi^c_a}) = e_\Phi(\varphi^c_a). \tag{11.25} \]
By (11.24), we have
\[ s(\widehat{\varphi^c_a}) = s(\varphi^c_a) = \frac{1}{|C_a|} S(\varphi^c_{C_a}) = \frac{1}{|C_a|} \{\log \mathrm{Tr}_{C_a}(e^{-\beta U(C_a)}) + \beta\varphi^c_a(U(C_a))\}, \tag{11.26} \]
where the last equality is given by the substitution of an explicit form of the density matrix of the local Gibbs state $\varphi^c_{C_a}$ in Definition 7.3.

In order to show (11.12), we now compare $e_\Phi(\varphi^c_a)$ with $\frac{1}{|C_a|}\varphi^c_a(U(C_a))$ in (11.26). Let $k \in \mathbb{N}$ and consider the following division of $C_{ka}$ as a disjoint union of translates of $C_a$:
\[ C_{ka} = \bigcup_{m \in C_k} (C_a + am). \tag{11.27} \]
We give the lexicographic ordering for elements in $C_k$ and set
\[ C^m_{ka} \equiv \bigcup_{m' < m} (C_a + am') \]
for $m \in C_k$. For any $k \in \mathbb{N}$,
\[ \sum_{m \in C_k} U(C_a + am) = U(C_{ka}) - \sum_{m \in C_k} E_{\{C_{ka} \setminus C^m_{ka}\}} W(C_a + am). \]
By $\|E\| \le 1$ and the translation covariance (Φ-f) of the potential Φ, we obtain
\[ \frac{1}{|C_{ka}|} \Bigl\| U(C_{ka}) - \sum_{m \in C_k} U(C_a + am) \Bigr\| \le \frac{1}{|C_{ka}|} (|C_k| \cdot \|W(C_a)\|) = \frac{\|W(C_a)\|}{|C_a|}. \tag{11.28} \]
Therefore, by (9.1), there exists $a_0 \in \mathbb{N}$ for any $\varepsilon > 0$ such that for all $a > a_0$
\[ \frac{1}{|C_{ka}|} \Bigl\| U(C_{ka}) - \sum_{m \in C_k} U(C_a + am) \Bigr\| < \varepsilon. \tag{11.29} \]
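Note that the above $a_0$ can be taken independent of $k \in \mathbb{N}$. For any $a \in \mathbb{N}$,
\[ \varphi^c_a(U(C_a + am)) = \varphi^c_a(U(C_a)) \]
for any $m \in \mathbb{Z}^\nu$, due to the $a\mathbb{Z}^\nu$-invariance of $\varphi^c_a$. Therefore, we obtain
\[ \Bigl| \frac{1}{|C_{ka}|} \varphi^c_a(U(C_{ka})) - \frac{1}{|C_a|} \varphi^c_a(U(C_a)) \Bigr| < \varepsilon \]
for $a > a_0$. By taking the limit $k \to \infty$, we have
\[ \Bigl| e_\Phi(\varphi^c_a) - \frac{1}{|C_a|} \varphi^c_a(U(C_a)) \Bigr| < \varepsilon. \]
From this estimate, (11.25) and (11.26), it follows that
\[ \Bigl| s(\widehat{\varphi^c_a}) - \beta e_\Phi(\widehat{\varphi^c_a}) - \frac{1}{|C_a|} \log \mathrm{Tr}_{C_a}(e^{-\beta U(C_a)}) \Bigr| < |\beta|\varepsilon \]
for all $a \ge a_0$. This proves (11.12) for $\rho_n = \widehat{\varphi^c_n}$ in view of (9.22).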
11.4. Variational principle
Definition 11.6. Any translation invariant state ϕ satisfying
\[ P(\beta\Phi) = s(\varphi) - \beta e_\Phi(\varphi) \tag{11.30} \]
(namely, maximizing the functional $s - \beta e_\Phi$) is called a solution of the (Φ, β)-variational principle (or a translation invariant equilibrium state for Φ at the inverse temperature β). The set of all solutions of the (Φ, β)-variational principle is denoted by $\Lambda_{\beta\Phi}$:
\[ \Lambda_{\beta\Phi} \equiv \{\varphi;\ \varphi \in \mathcal{A}^{*\tau}_{+,1},\ P(\beta\Phi) = s(\varphi) - \beta e_\Phi(\varphi)\}. \tag{11.31} \]

Remark 1. Since $\beta e_\Phi(\varphi) = e_{\beta\Phi}(\varphi)$, the condition $\varphi \in \Lambda_{\beta\Phi}$ is equivalent to the condition that ϕ is a solution of the (βΦ, 1)-variational principle, and hence $\Lambda_{\beta\Phi}$ is a consistent notation.

Remark 2. In the usual physical convention, the functional $s - \beta e_\Phi$ is −β times the free energy functional.

Theorem 11.7. For any $\Phi \in \mathcal{P}_\tau$ and $\beta \in \mathbb{R}$, there exists a solution $\varphi\ (\in \mathcal{A}^{*\tau}_{+,1})$ of the (Φ, β)-variational principle, namely,
\[ \Lambda_{\beta\Phi} \ne \emptyset. \]

Proof. $\{\widehat{\varphi^c_a}\}$ in the proof of Theorem 11.4 has an accumulation point in $\mathcal{A}^{*\tau}_{+,1}$ by the weak∗-compactness of $\mathcal{A}^{*\tau}_{+,1}$. Let ϕ be any such accumulation point. By the proof of Theorem 11.4, the weak∗ continuity of $e_\Phi$ and the weak∗ upper
semicontinuity of s in ω, the state ϕ satisfies
\[ P(\beta\Phi) = \lim_{a\to\infty} (s(\widehat{\varphi^c_a}) - \beta e_\Phi(\widehat{\varphi^c_a})) \le s(\varphi) - \beta e_\Phi(\varphi). \tag{11.32} \]
By (11.10), we obtain (11.30).
Our Fermion algebra $\mathcal{A}$ is not asymptotically abelian with respect to the lattice translations, but if ω is a translation invariant state of $\mathcal{A}$, it is well known that the pair $(\mathcal{A}, \omega)$ is $\mathbb{Z}^\nu$-abelian and that ω is automatically even (see, for example, Example 5.2.21 in [17]). From this consideration and Theorem 11.4, we obtain the following result, which corresponds to Theorem 6.2.44 in [17] in the case of quantum spin lattice systems, by the same argument as for that theorem. For a convex set K, we denote the set of extremal points of K by E(K).

Proposition 11.8. For $\Phi \in \mathcal{P}_\tau$ and $\beta \in \mathbb{R}$, $\Lambda_{\beta\Phi}$ is a simplex with $E(\Lambda_{\beta\Phi}) \subset E(\mathcal{A}^{*\tau}_{+,1})$ and the unique barycentric decomposition of each ϕ in $\Lambda_{\beta\Phi}$ coincides with its unique ergodic decomposition.

12. Equivalence of Variational Principle and KMS Condition

Among the five steps for establishing the equivalence stated in the title (which are described in Sec. 1), Step (1) "KMS condition ⇒ Gibbs condition" is obtained in Theorem 7.5 in Sec. 7.4, Step (4) "dKMS condition on $\mathcal{A}_\circ$ ⇒ dKMS condition on $D(\delta_\alpha)$" is obtained in Corollary 6.7, and Step (5) "dKMS condition on $D(\delta_\alpha)$ ⇒ KMS condition" is stated in Theorem 6.4. In this section, we complete the remaining two steps of the proof by showing Step (2) "Gibbs condition ⇒ Variational principle" in Sec. 12.1 and Step (3) "Variational principle ⇒ dKMS condition on $\mathcal{A}_\circ$" in Sec. 12.3. As a preparation for the latter, some tools of convex analysis are gathered in Sec. 12.2.

12.1. Variational principle from Gibbs condition

Proposition 12.1. For $\Phi \in \mathcal{P}_\tau$, each translation invariant state ϕ satisfying the (Φ, β)-Gibbs condition is a solution of the (Φ, β)-variational principle.

Proof. We follow the method of proof in [6]. The Gibbs condition for ϕ implies
\[ [\varphi^{\beta W(I)}]|_{\mathcal{A}(I)} = \varphi^c_I \tag{12.1} \]
for every finite subset I, where $\varphi^c_I$ is given by (7.14), and $[\varphi^{\beta W(I)}]$ denotes the normalization of $\varphi^{\beta W(I)}$ given by (7.8). By (11.8) with ω replaced by ϕ, we have
\[ \begin{aligned} S(\varphi^c_I, \varphi_I) &= -\hat{S}(\varphi_I) + \beta\varphi(U(I)) + \log \tau(e^{-\beta U(I)}) \\ &= -S(\varphi_I) + \beta\varphi(U(I)) + \log \mathrm{Tr}_I(e^{-\beta U(I)}). \end{aligned} \tag{12.2} \]
Since relative entropy is nonnegative and is monotone nonincreasing under restriction of states, it follows that
\[ 0 \le S(\varphi^c_I, \varphi_I) \le S([\varphi^{\beta W(I)}], \varphi). \]
By (7.8), (7.10) and (7.9), we have
\[ S([\varphi^{\beta W(I)}], \varphi) = \log(\varphi^{\beta W(I)}(1)) - \varphi(\beta W(I)) \le 2\|\beta W(I)\|. \]
From these estimates and (12.2), it follows that
\[ 0 \le S(\varphi^c_I, \varphi_I) = -S(\varphi_I) + \beta\varphi(U(I)) + \log \mathrm{Tr}_I(e^{-\beta U(I)}) \le 2\|\beta W(I)\|. \]
(Up to this point, the assumption of translation invariance of ϕ is irrelevant.) We now divide the above inequality by |I| and take the van Hove limit $I \to \infty$. Then by the translation invariance of ϕ and (9.1), we obtain
\[ s(\varphi) - \beta e_\Phi(\varphi) = P(\beta\Phi), \]
which completes the proof.

Combining this proposition with Theorem 7.5, we immediately obtain the following.

Corollary 12.2. Let $\alpha_t$ be a dynamics of $\mathcal{A}$ satisfying the Assumptions (II) and (IV) in Sec. 5 and Φ be the (translation covariant) standard potential uniquely corresponding to this $\alpha_t$. If ϕ is a translation invariant $(\alpha_t, \beta)$-KMS state of $\mathcal{A}$, then ϕ is a solution of the (Φ, β)-variational principle.

We have now completed the proof of Theorem A.

12.2. Some tools of convex analysis

We use the pressure functional $\Phi \in \mathcal{P}_\tau \mapsto P(\Phi) \in \mathbb{R}$, which is a norm continuous convex function on the Banach space $\mathcal{P}_\tau$ due to Corollary 9.4. A continuous linear functional $\alpha \in \mathcal{P}_\tau^*$ (the dual of $\mathcal{P}_\tau$) is called a tangent of the functional P at $\Phi \in \mathcal{P}_\tau$ if it satisfies
\[ P(\Phi + \Psi) \ge P(\Phi) + \alpha(\Psi) \tag{12.3} \]
for all $\Psi \in \mathcal{P}_\tau$.

Proposition 12.3. For any solution ϕ of the (Φ, 1)-variational principle, define
\[ \alpha_\varphi(\Psi) \equiv -e_\Psi(\varphi) \tag{12.4} \]
for all $\Psi \in \mathcal{P}_\tau$. Then $\alpha_\varphi$ is a tangent of P at Φ.

Proof. By linear dependence (9.26) of $e_\Psi$ on Ψ, $\alpha_\varphi$ is a linear functional on $\mathcal{P}_\tau$. Due to $|e_\Psi(\varphi)| \le \|\Psi\|$ given by (9.28), we have $\alpha_\varphi \in \mathcal{P}_\tau^*$. Due to the variational
inequality (11.10),
\[ P(\Phi + \Psi) \ge s(\varphi) - e_{\Phi+\Psi}(\varphi) = s(\varphi) - e_\Phi(\varphi) - e_\Psi(\varphi) = P(\Phi) + \alpha_\varphi(\Psi) \]
for all $\Psi \in \mathcal{P}_\tau$, where the last equality is due to the assumption that ϕ is a solution of the (Φ, 1)-variational principle.

(We will establish the bijectivity between solutions of the (Φ, β)-variational principle and tangents of P at βΦ through (12.4) in Theorem 12.10.)

Since $P(\Phi + k\Psi)$ is a convex continuous function of $k \in \mathbb{R}$ for any fixed $\Phi, \Psi \in \mathcal{P}_\tau$, there exist its right and left derivatives at k = 0,
\[ (D^\pm_\Psi P)(\Phi) = \lim_{k \to \pm 0} \frac{P(\Phi + k\Psi) - P(\Phi)}{k}. \]
By the convexity of P,
\[ (D^+_\Psi P)(\Phi) \ge (D^-_\Psi P)(\Phi). \]
If and only if they coincide, $P(\Phi + k\Psi)$ is differentiable at k = 0. Then we define
\[ (D^+_\Psi P)(\Phi) = (D^-_\Psi P)(\Phi) = (D_\Psi P)(\Phi). \tag{12.5} \]
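The behaviour of the one-sided derivatives can be seen on a one-variable toy function. In the sketch below, the convex function `f` with a kink at 0 is a hypothetical stand-in for k ↦ P(Φ + kΨ); it is chosen only to make the right and left difference quotients visibly different and is not derived from any potential.

```python
def f(k):
    """A convex function of one real variable with a kink at k = 0."""
    return abs(k) + k * k

def difference_quotient(k):
    """(f(k) - f(0)) / k, the quotient whose limits define the one-sided derivatives."""
    return (f(k) - f(0.0)) / k

steps = (1e-1, 1e-3, 1e-6)
right = [difference_quotient(h) for h in steps]    # k -> +0
left = [difference_quotient(-h) for h in steps]    # k -> -0

print(right)  # decreases towards the right derivative
print(left)   # increases towards the left derivative
```

Here the right quotients decrease to 1 and the left quotients increase to −1, so the right derivative dominates the left one and the function is not differentiable at 0. Differentiability in the sense of (12.5) corresponds to the case where the two limits agree.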
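The derivatives $(D^\pm_\Psi P)(\Phi)$ and hence $(D_\Psi P)(\Phi)$ (when it exists) satisfy
\[ \begin{aligned} |(D^\pm_{\Psi_1} P)(\Phi) - (D^\pm_{\Psi_2} P)(\Phi)| &\le \|\Psi_1 - \Psi_2\|, \\ |(D_{\Psi_1} P)(\Phi) - (D_{\Psi_2} P)(\Phi)| &\le \|\Psi_1 - \Psi_2\|, \end{aligned} \tag{12.6} \]
as is shown by the following computation in the limit $k \to \pm 0$:
\[ \begin{aligned} \left| \frac{\{P(\Phi + k\Psi_1) - P(\Phi)\} - \{P(\Phi + k\Psi_2) - P(\Phi)\}}{k} \right| &= \left| \frac{P(\Phi + k\Psi_1) - P(\Phi + k\Psi_2)}{k} \right| \\ &\le |k|^{-1} \|k(\Psi_1 - \Psi_2)\| = \|\Psi_1 - \Psi_2\|, \end{aligned} \]
where (9.23) is used for the inequality.

If (12.5) holds for all Ψ, then P is said to be differentiable at Φ. Let $\mathcal{P}^1_\tau$ be the set of all $\Phi \in \mathcal{P}_\tau$ where P is differentiable.

Proposition 12.4. If $\Phi \in \mathcal{P}^1_\tau$,
\[ \alpha_\Phi(\Psi) = (D_\Psi P)(\Phi), \quad (\Psi \in \mathcal{P}_\tau), \tag{12.7} \]
defines an $\alpha_\Phi \in \mathcal{P}_\tau^*$ which is the unique tangent of P at Φ. Then any solution ϕ of the (Φ, 1)-variational principle satisfies
\[ \alpha_\Phi(\Psi) = \alpha_\varphi(\Psi) \tag{12.8} \]
for all $\Psi \in \mathcal{P}_\tau$, where $\alpha_\varphi$ is given by (12.4).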
Proof. By Theorem 11.7, there is a solution ϕ of the (Φ, 1)-variational principle and, by Proposition 12.3, $\alpha_\varphi$ is a tangent of P at Φ.

Let $\alpha'$ be any tangent of P at $\Phi \in \mathcal{P}^1_\tau$. We have for k > 0
\[ P(\Phi + k\Psi) \ge P(\Phi) + k\alpha'(\Psi), \quad P(\Phi - k\Psi) \ge P(\Phi) - k\alpha'(\Psi). \]
Hence
\[ (D^+_\Psi P)(\Phi) = \lim_{k \to +0} \frac{P(\Phi + k\Psi) - P(\Phi)}{k} \ge \alpha'(\Psi), \]
\[ (D^-_\Psi P)(\Phi) = \lim_{k \to +0} \frac{P(\Phi - k\Psi) - P(\Phi)}{(-k)} \le \alpha'(\Psi). \]
By (12.5) for $\Phi \in \mathcal{P}^1_\tau$, we obtain
\[ \alpha'(\Psi) = (D_\Psi P)(\Phi). \]
Then $\alpha'$ is unique and (12.8) holds.

Lemma 12.5. For each $A \in \mathcal{A}_\circ$ such that $A = A^* = \Theta(A)$, there exists $\Psi_A \in \mathcal{P}^f_\tau$ such that
\[ e_{\Psi_A}(\varphi) = \varphi(A) - \tau(A) \tag{12.9} \]
for all translation invariant states ϕ.

Proof. Let $A = A^* = \Theta(A) \in \mathcal{A}(I)$ for some finite I and
\[ A_1 \equiv A - \tau(A)1 \ (\in \mathcal{A}(I)). \]
Since $E_{I^c}(A_1) = \tau(A_1)1 = 0$, there exists a unique decomposition
\[ A_1 = \sum_{J \subset I,\ J \ne \emptyset} A(J), \quad A(J) \in \mathcal{A}(J), \tag{12.10} \]
\[ E_K(A(J)) = 0 \quad \text{for } K \not\supset J. \tag{12.11} \]
To show these formulae, let
\[ A(J) = \sum_{K \subset J} (-1)^{|J| - |K|} E_K(A_1) \tag{12.12} \]
for all non-empty $J \subset I$, a formula in parallel with (5.16). Then
\[ E_J(A_1) = \sum_{K \subset J,\ K \ne \emptyset} A(K) \tag{12.13} \]
for $J \subset I$ by exactly the same computation as Step 1 of the proof of Lemma 5.9. (When $J = \emptyset$, the right-hand side is interpreted as 0 and $E_\emptyset(A_1) = 0$.) We have
\[ A(J)^* = A(J) = \Theta(A(J)) \in \mathcal{A}(J), \tag{12.14} \]
because A(J) is a real linear combination of $E_K(A_1)$, $K \subset J$, and all $E_K(A_1)$ satisfy the same equation.

We note that Step 4 of Lemma 5.9 uses only the following properties of U(K),
\[ U(\emptyset) = 0, \quad \tau(U(K)) = 0, \quad E_K(U(J)) = U(K), \]
for $K \subset J \subset I$, and that all of them are satisfied also by $E_K(A_1)$. Therefore, (12.11) follows from the same argument as Step 4 of Lemma 5.9.

We now construct $\Psi_J \in \mathcal{P}^f_\tau$ for each A(J) in (12.10) such that
\[ e_{\Psi_J}(\varphi) = \varphi(A(J)) \tag{12.15} \]
for all translation invariant states ϕ. Then by linear dependence of $e_\Psi$ on $\Psi \in \mathcal{P}_\tau$, we obtain for $\Psi = \sum_{J \subset I} \Psi_J$ the desired relation (12.9):
\[ e_\Psi(\varphi) = \sum_{J \subset I} e_{\Psi_J}(\varphi) = \sum_{J \subset I} \varphi(A(J)) = \varphi(A_1) = \varphi(A) - \tau(A). \]
We define a potential $\Psi_J$ for each $J \subset I$, $J \ne \emptyset$ by
\[ \begin{aligned} \Psi_J(J + m) &= \tau_m(A(J)), \quad (m \in \mathbb{Z}^\nu), \\ \Psi_J(K) &= 0 \quad \text{if } K \text{ is not a translate of } J. \end{aligned} \tag{12.16} \]
Due to the properties (12.14) and (12.11), $\Psi_J$ belongs to $\mathcal{P}^f_\tau$. We compute
\[ \frac{1}{|C_a|} \varphi(U_{\Psi_J}(C_a)) = \frac{1}{|C_a|} \varphi\Bigl(\sum \{\Psi_J(J + m);\ J + m \subset C_a\}\Bigr) = \frac{N_a}{|C_a|} \varphi(A(J)), \]
where $N_a$ is the number of m such that $J + m \subset C_a$.

We now show that $N_a/|C_a| \to 1$ as $a \to \infty$. Since $J + m \subset C_a$ is equivalent to $J \subset C_a - m$, $N_a$ is the same as l(a, J) (the number of translates of $C_a$ containing J). By (8.11),
\[ \lim_{a\to\infty} \frac{N_a}{|C_a|} = \lim_{a\to\infty} \frac{l(a, J)}{|C_a|} = 1. \]
Hence
\[ e_{\Psi_J}(\varphi) = \lim_{a\to\infty} \frac{1}{|C_a|} \varphi(U_{\Psi_J}(C_a)) = \varphi(A(J)). \]
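The pair of formulas (12.12) and (12.13) used in the proof of Lemma 12.5 is an instance of inclusion-exclusion (Möbius inversion) over subsets. The sketch below checks this combinatorial step only. It assumes that the operators $E_K(A_1)$ can be replaced by real numbers, and the particular values stored in `E` are hypothetical placeholders with E[∅] = 0, as required in the text.

```python
from itertools import combinations

def subsets(s):
    """Yield every subset of s (including the empty set) as a frozenset."""
    items = sorted(s)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield frozenset(c)

region = frozenset({0, 1, 2})   # plays the role of the finite set I

# Hypothetical numbers standing in for E_K(A_1); the empty set gets 0.
E = {K: (0.0 if not K else float(sum(K) ** 2 + 3 * len(K) + 1))
     for K in subsets(region)}

# Analogue of (12.12): A(J) = sum over K in J of (-1)^(|J|-|K|) E_K.
A = {J: sum((-1) ** (len(J) - len(K)) * E[K] for K in subsets(J))
     for J in subsets(region) if J}

# Analogue of (12.13): E_J is recovered as the sum of A(K) over non-empty K in J.
recovered = {J: sum(A[K] for K in subsets(J) if K) for J in subsets(region)}

print(all(abs(recovered[J] - E[J]) < 1e-9 for J in subsets(region)))
```

The same cancellation of alternating signs works term by term for operator-valued $E_K(A_1)$, since only linearity is used.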
Corollary 12.6. If $\varphi_1$ and $\varphi_2$ are distinct solutions of the (Φ, 1)-variational principle for $\Phi \in \mathcal{P}_\tau$, then the corresponding tangents of P at Φ are distinct, that is, $\alpha_{\varphi_1} \ne \alpha_{\varphi_2}$.

Proof. If $\varphi_1 \ne \varphi_2$, there exists an $A \in \mathcal{A}_\circ$ such that $\varphi_1(A) \ne \varphi_2(A)$. Let $A_\pm = \frac{1}{2}(A \pm \Theta(A))$. Then $A = A_+ + A_-$. Since $\varphi_1$ and $\varphi_2$ are translation invariant, both of
them are Θ-even, and hence $\varphi_1(A_-) = \varphi_2(A_-) = 0$. Thus $\varphi_1(A_+) \ne \varphi_2(A_+)$. So we may assume that $\Theta(A) = A$. Let $A_1 = \frac{1}{2}(A + A^*)$, $A_2 = \frac{1}{2i}(A - A^*)$, $A = A_1 + iA_2$. Then either $\varphi_1(A_1) \ne \varphi_2(A_1)$ or $\varphi_1(A_2) \ne \varphi_2(A_2)$. Since $A_1^* = A_1$ and $A_2^* = A_2$, we may assume $A = A^* = \Theta(A)$. Let $\Psi_A \in \mathcal{P}^f_\tau$ be given as in Lemma 12.5 for this $A \in \mathcal{A}_\circ$. Then
\[ \alpha_{\varphi_1}(\Psi_A) = -e_{\Psi_A}(\varphi_1) = -\varphi_1(A) + \tau(A) \ne -\varphi_2(A) + \tau(A) = -e_{\Psi_A}(\varphi_2) = \alpha_{\varphi_2}(\Psi_A). \]
Hence $\alpha_{\varphi_1} \ne \alpha_{\varphi_2}$.

Corollary 12.7. For $\Phi \in \mathcal{P}^1_\tau$, a solution of the (Φ, 1)-variational principle is unique.

Proof. This follows from Proposition 12.4 and Corollary 12.6.

We will use the following result in the proof of Theorem 12.11.

Theorem 12.8. (1) The set $\mathcal{P}^1_\tau$ of points of unique tangent of P is residual (an intersection of a countable number of dense open sets) and dense in $\mathcal{P}_\tau$.
(2) For any $\Phi \in \mathcal{P}_\tau$, any tangent of P at Φ is contained in the weak∗ closed convex hull of the set Γ(Φ) which is defined by
\[ \Gamma(\Phi) \equiv \{\alpha \in \mathcal{P}_\tau^*;\ \text{there exists a net } \Phi_\gamma \in \mathcal{P}^1_\tau \text{ such that } \|\Phi_\gamma - \Phi\| \to 0 \text{ and } \alpha_{\Phi_\gamma} \to \alpha \text{ in the weak∗ topology of } \mathcal{P}_\tau^*\}, \tag{12.17} \]
where $\alpha_{\Phi_\gamma}$ is the unique tangent of P at $\Phi_\gamma$.

Proof. (1) is Mazur's theorem [31]. (2) is Theorem 1 of [26] where the function f is to be set $f(\Psi) = P(\Phi + \Psi)$ for our purpose. The proof in [26] is by the Hahn–Banach theorem. (Separability of $\mathcal{P}_\tau$ given by Corollary 8.13 is needed for both (1) and (2).)

We now show a bijective correspondence between solutions of the (Φ, β)-variational principle and tangents of P at βΦ. We first prove a lemma about stability of solutions of the variational principle under the limiting procedure in (12.17).

Lemma 12.9. Let $\{\Phi_\gamma\}$ be a net in $\mathcal{P}_\tau$ and $\{\varphi_\gamma\}$ be a net consisting of a solution $\varphi_\gamma$ of the $(\Phi_\gamma, \beta_\gamma)$-variational principle for each index γ such that
\[ \|\Phi_\gamma - \Phi\| \to 0 \ (\Phi \in \mathcal{P}_\tau), \quad \beta_\gamma \to \beta \in \mathbb{R}, \]
\[ \varphi_\gamma \to \varphi \in \mathcal{A}^{*\tau}_{+,1} \text{ in the weak∗ topology of } \mathcal{A}^{*\tau}_{+,1}. \]
Then ϕ is a solution of the (Φ, β)-variational principle.
Proof. By the norm continuity (9.23) of P, the weak∗ upper semicontinuity of s (Theorem 10.3) and the continuous dependence of $e_\Phi(\varphi)$ on Φ in the norm topology (uniformly in ϕ) and on ϕ in the weak∗ topology (Theorem 9.5), we have
\[ P(\beta\Phi) = \lim_\gamma P(\beta_\gamma \Phi_\gamma), \quad s(\varphi) \ge \limsup_\gamma s(\varphi_\gamma), \quad e_\Phi(\varphi) = \lim_\gamma e_{\Phi_\gamma}(\varphi_\gamma). \]
Since $\varphi_\gamma$ is a solution of the $(\Phi_\gamma, \beta_\gamma)$-variational principle, we have
\[ P(\beta_\gamma \Phi_\gamma) = s(\varphi_\gamma) - \beta_\gamma e_{\Phi_\gamma}(\varphi_\gamma). \]
Hence
\[ P(\beta\Phi) \le s(\varphi) - \beta e_\Phi(\varphi). \]
By the variational inequality (11.10), we have
\[ P(\beta\Phi) = s(\varphi) - \beta e_\Phi(\varphi). \]

Theorem 12.10. For any $\Phi \in \mathcal{P}_\tau$ and $\beta \in \mathbb{R}$, there exists a bijective affine map $\varphi \mapsto \alpha_\varphi$ from the set $\Lambda_{\beta\Phi}$ to the set of all tangents of the functional P at βΦ, given by
\[ \alpha_\varphi(\Psi) = -e_\Psi(\varphi), \quad \Psi \in \mathcal{P}_\tau. \tag{12.18} \]

Proof. By Remark 1 after Definition 11.6, all solutions of the (Φ, β)- and (βΦ, 1)-variational principle coincide. Furthermore, if ϕ is a solution of the (Φ, β)-variational principle, then
\[ P(\beta\Phi + \Psi) \ge s(\varphi) - e_{\beta\Phi + \Psi}(\varphi) = s(\varphi) - \beta e_\Phi(\varphi) - e_\Psi(\varphi) = P(\beta\Phi) + \alpha_\varphi(\Psi). \]
Namely, $\alpha_\varphi$ is a tangent of P at βΦ, exactly the same statement as for a solution ϕ of the (βΦ, 1)-variational principle. Therefore, it is enough to prove the case of β = 1.

The map $\varphi \mapsto \alpha_\varphi$ is an affine map from the set of all solutions of the (Φ, 1)-variational principle into the set of all tangents of P at Φ. The map is injective by Corollary 12.6.

To show the surjectivity of the map, let α be a tangent of P at Φ. By Theorem 12.8, there exists a net $\Phi_\gamma \in \mathcal{P}^1_\tau$ such that $\|\Phi_\gamma - \Phi\| \to 0$, and $\alpha_{\Phi_\gamma} \to \alpha$ in the weak∗ topology of $\mathcal{P}_\tau^*$, where $\alpha_{\Phi_\gamma}$ is the unique tangent of P at $\Phi_\gamma$. By Theorem 11.7, there exists a solution $\varphi_\gamma$ of the $(\Phi_\gamma, 1)$-variational
principle. By Proposition 12.3, $\alpha_{\varphi_\gamma}$ is a tangent of P at $\Phi_\gamma$ and hence must coincide with the unique tangent $\alpha_{\Phi_\gamma}$. Due to the weak∗ compactness of $\mathcal{A}^{*\tau}_{+,1}$, there exists a subnet $\{\varphi_{\gamma(\mu)}\}_\mu$ which converges to some $\varphi \in \mathcal{A}^{*\tau}_{+,1}$. By Lemma 12.9 and by $\|\Phi_{\gamma(\mu)} - \Phi\| \to 0$, ϕ must be a solution of the (Φ, 1)-variational principle. Furthermore, for any $\Psi \in \mathcal{P}_\tau$, we have
\[ \alpha_\varphi(\Psi) = -e_\Psi(\varphi) = -\lim_\mu e_\Psi(\varphi_{\gamma(\mu)}) = \lim_\mu \alpha_{\Phi_{\gamma(\mu)}}(\Psi) = \alpha(\Psi). \]
Hence $\alpha = \alpha_\varphi$ and the map $\varphi \mapsto \alpha_\varphi$ is surjective.

12.3. Differential KMS condition from variational principle

In this subsection, we give a proof for Step (3).

Theorem 12.11. Let $\Phi \in \mathcal{P}_\tau$ and ϕ be a translation invariant state. If ϕ is a solution of the (Φ, β)-variational principle, then ϕ is a $(\delta_\Phi, \beta)$-dKMS state, where $\delta_\Phi \in \Delta(\mathcal{A}_\circ)$ corresponds to Φ by the bijective linear map of Corollary 8.5.

Remark. We note that this theorem holds for any $\Phi \in \mathcal{P}_\tau$ without any further assumption on Φ and we do not need $\alpha_t$. Note that the domain $D(\delta_\Phi)$ is $\mathcal{A}_\circ$ by definition.

First we present an estimate needed in the proof of this theorem in the form of the following lemma.

Lemma 12.12. Let I and J be finite subsets of $\mathbb{Z}^\nu$. If $A \in \mathcal{A}(J)$, then
\[ \|[U(I), A]\| \le 2\|\Phi\| \cdot \|A\| \cdot |I \cap J|. \tag{12.19} \]

Proof. Let $I_0$ be the complement of $I \cap J$ in I. Then $I_0 \cap J = \emptyset$ and hence $U(I_0)$ commutes with $A\ (\in \mathcal{A}(J))$ due to $U(I_0) \in \mathcal{A}(I_0)_+ \subset \mathcal{A}(J)'$. Since $I_0$ and $I \cap J$ are disjoint and have the union I, the following computation proves (12.19):
\[ \|[U(I), A]\| = \|[U(I) - U(I_0), A]\| \le 2\|U(I) - U(I_0)\|\,\|A\| \le 2\|\Phi\| \cdot \|A\| \cdot |I \cap J|, \]
where the last inequality is due to (8.5).

Proof of Theorem 12.11. We note that the (Φ, β)-variational principle and the (βΦ, 1)-variational principle are the same and the $(\delta_\Phi, \beta)$-dKMS condition and the $(\delta_{\beta\Phi}, 1)$-dKMS condition are the same. By taking βΦ as a new Φ, we only have to prove the case β = 1.
Let $\widehat{\varphi^c_a}$ be the translation invariant state defined by (11.23) in the proof of Theorem 11.4. Let ϕ be any accumulation point of $\{\widehat{\varphi^c_a}\}_{a \in \mathbb{N}}$. Then this ϕ is a solution of the (Φ, 1)-variational principle as shown in Theorem 11.7.

For the moment, let us assume $\Phi \in \mathcal{P}^1_\tau$ (the set of $\Phi \in \mathcal{P}_\tau$ where P is differentiable, defined in Sec. 12.2). Due to the assumption $\Phi \in \mathcal{P}^1_\tau$, any accumulation point of $\{\widehat{\varphi^c_a}\}_{a \in \mathbb{N}}$ coincides with the unique solution ϕ of the (Φ, 1)-variational principle, and hence
\[ \lim_{a\to\infty} \widehat{\varphi^c_a} = \varphi. \tag{12.20} \]
We now prove that the above ϕ satisfies the conditions (C-1) and (C-2) of Definition 6.3 for each $A \in \mathcal{A}_\circ$ by using (12.20).

Let $A \in \mathcal{A}(I)$ for a finite subset I of $\mathbb{Z}^\nu$. Suppose $C_a - k \supset I$ ($a \in \mathbb{N}$, $k \in \mathbb{Z}^\nu$). Since $\tau_k^*\varphi^c_a$ is the $(\mathrm{Ad}\, e^{itU(C_a - k)}, 1)$-KMS state on $\mathcal{A}(C_a - k)$, we have
\[ \mathrm{Re}(\tau_k^*\varphi^c_a)(A^*[iU(C_a - k), A]) = 0, \tag{12.21} \]
\[ \mathrm{Im}(\tau_k^*\varphi^c_a)(A^*[iU(C_a - k), A]) \ge S(\tau_k^*\varphi^c_a(AA^*), \tau_k^*\varphi^c_a(A^*A)). \tag{12.22} \]
Our strategy of the proof is to replace $\tau_k^*\varphi^c_a$ and $[iU(C_a - k), A]$ by ϕ and $\delta_\Phi(A)$, respectively, by using an approximation argument.

By (4.23) for $J \nearrow \mathbb{Z}^\nu$, there exists a finite subset $J_\varepsilon$ of $\mathbb{Z}^\nu$ for any given $\varepsilon > 0$ such that
\[ \|H(I) - E_J(H(I))\| < \varepsilon \tag{12.23} \]
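for all $J \supset J_\varepsilon$. Let b be sufficiently large so that there exists a translate $C_b - l_0$ of $C_b$ containing both I and $J_\varepsilon$.

We will use the following convenient expression for $\widehat{\varphi^c_a}\ (\in \mathcal{A}^{*\tau}_{+,1})$ which is equivalent to (11.23):
\[ \widehat{\varphi^c_a} = \tau_l^*\widehat{\varphi^c_a} = \sum_{m \in C_a} \frac{\tau_{l+m}^*\varphi^c_a}{|C_a|} = \sum_{m \in (C_a + l)} \frac{\tau_m^*\varphi^c_a}{|C_a|} \tag{12.24} \]
for any $l \in \mathbb{Z}^\nu$. We will take $l = l_0$.

We divide $C_a + l_0$ into the following two disjoint subsets when a > b:
\[ C^1 \equiv C_{a-b} + l_0, \quad C^2 \equiv (C_a + l_0) \setminus C^1. \tag{12.25} \]
Then
\[ C_a - k \supset C_b - l_0 \supset I \cup J_\varepsilon \tag{12.26} \]
if $k \in C^1$, while
\[ \frac{|C^2|}{|C_a|} = 1 - \frac{|C_{a-b}|}{|C_a|} \to 0 \tag{12.27} \]
as $a \to \infty$.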
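The decay in (12.27) is a surface-to-volume effect and can be tabulated directly. The sketch below assumes that $C_a$ is a cube containing $a^\nu$ lattice points (an assumption about the definition of $C_a$, which is given outside this section); the function name `boundary_fraction` and the sample values of a, b and ν are illustrative.

```python
def boundary_fraction(a, b, nu):
    """1 - |C_{a-b}| / |C_a|, assuming |C_a| = a**nu for a cube of side a."""
    return 1.0 - ((a - b) / a) ** nu

b, nu = 5, 3   # fixed margin b and lattice dimension nu, chosen for illustration
fractions = [boundary_fraction(a, b, nu) for a in (10, 100, 1000, 10000)]
print(fractions)
```

For fixed b the fraction behaves like νb/a for large a, so the sites k in the second subset carry a vanishing share of the average (12.24).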
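For $k \in C^1$, $A\ (\in \mathcal{A}(I))$ belongs to $\mathcal{A}(C_a - k)$ due to $I \subset C_a - k$. By using the general property of the conditional expectation, we have
\[ i[U(C_a - k), A] = iE_{C_a - k}([H(C_a - k), A]) = iE_{C_a - k}([H(I), A]) = i[E_{C_a - k}(H(I)), A]. \]
By (12.23) for $J = C_a - k\ (\supset J_\varepsilon)$, this implies
\[ \|i[H(I), A] - i[U(C_a - k), A]\| < 2\varepsilon\|A\|. \]
Noting that $\delta_\Phi(A) = i[H(I), A]$, we have
\[ \|\delta_\Phi(A) - i[U(C_a - k), A]\| < 2\varepsilon\|A\|. \tag{12.28} \]
It follows from (12.21) and (12.28) that
\[ |\mathrm{Re}(\tau_k^*\varphi^c_a)(A^*\delta_\Phi(A))| < 2\varepsilon\|A\|^2 \tag{12.29} \]
for $k \in C^1$. For $k \in C^2$, we use the following obvious estimate:
\[ |\mathrm{Re}(\tau_k^*\varphi^c_a)(A^*\delta_\Phi(A))| \le \|A^*\delta_\Phi(A)\|. \tag{12.30} \]
Substituting (12.29) and (12.30) into (12.24), we obtain
\[ \begin{aligned} |\mathrm{Re}\,\widehat{\varphi^c_a}(A^*\delta_\Phi(A))| &\le \Bigl| \mathrm{Re}\Bigl(\sum_{k \in C^1} \frac{1}{|C_a|}\tau_k^*\varphi^c_a\Bigr)(A^*\delta_\Phi(A)) \Bigr| + \Bigl| \mathrm{Re}\Bigl(\sum_{k \in C^2} \frac{1}{|C_a|}\tau_k^*\varphi^c_a\Bigr)(A^*\delta_\Phi(A)) \Bigr| \\ &\le 2\varepsilon\|A\|^2 + \frac{|C^2|}{|C_a|}\|A^*\delta_\Phi(A)\|. \end{aligned} \]
Taking the limit $a \to \infty$ and using (12.27), we obtain
\[ |\mathrm{Re}\,\varphi(A^*\delta_\Phi(A))| \le 2\varepsilon\|A\|^2. \]
Due to arbitrariness of $\varepsilon > 0$, we obtain
\[ |\mathrm{Re}\,\varphi(A^*\delta_\Phi(A))| = 0. \]
Hence the condition (C-1) holds.

By (12.22) and (12.28), we have the following inequality for $k \in C^1$:
\[ \mathrm{Im}(\tau_k^*\varphi^c_a)(A^*\delta_\Phi(A)) \ge S(\tau_k^*\varphi^c_a(AA^*), \tau_k^*\varphi^c_a(A^*A)) - 2\varepsilon\|A\|^2. \]
For $k \in C^2$, we simply use the following estimate:
\[ \mathrm{Im}(\tau_k^*\varphi^c_a)(A^*\delta_\Phi(A)) \ge -\|A^*\delta_\Phi(A)\|. \tag{12.31} \]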
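From these inequalities, we obtain
\[ \begin{aligned} \mathrm{Im}\,\widehat{\varphi^c_a}(A^*\delta_\Phi(A)) &= \mathrm{Im}\Bigl(\sum_{k \in C^1} \frac{1}{|C_a|}\tau_k^*\varphi^c_a\Bigr)(A^*\delta_\Phi(A)) + \mathrm{Im}\Bigl(\sum_{k \in C^2} \frac{1}{|C_a|}\tau_k^*\varphi^c_a\Bigr)(A^*\delta_\Phi(A)) \\ &\ge \frac{1}{|C_a|} \sum_{k \in C^1} S(\tau_k^*\varphi^c_a(AA^*), \tau_k^*\varphi^c_a(A^*A)) - 2\frac{|C^1|}{|C_a|}\varepsilon\|A\|^2 - \frac{|C^2|}{|C_a|}\|A^*\delta_\Phi(A)\|. \end{aligned} \tag{12.32} \]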
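Due to the estimate (12.27), the last term tends to 0 as $a \to \infty$, while the second last term tends to $-2\varepsilon\|A\|^2$ as $a \to \infty$. Due to the convexity of $S(\cdot,\cdot)$ in two variables, the first term on the right-hand side has the following lower bound:
$$
\frac{1}{|C_a|}\sum_{k\in C_1} S(\tau_k^*\varphi^c_a(AA^*), \tau_k^*\varphi^c_a(A^*A)) \ge \frac{|C_1|}{|C_a|}\, S\big(\widehat{\varphi^c_a}'(AA^*), \widehat{\varphi^c_a}'(A^*A)\big) , \tag{12.33}
$$
where $\widehat{\varphi^c_a}'$ is a state of $\mathcal{A}$ defined by
$$
\widehat{\varphi^c_a}'(B) \equiv \frac{1}{|C_1|}\sum_{k\in C_1}\tau_k^*\varphi^c_a(B) , \qquad B \in \mathcal{A} .
$$
The difference of the states $\widehat{\varphi^c_a}'$ and $\widehat{\varphi^c_a}$ can be estimated as
$$
\begin{aligned}
\widehat{\varphi^c_a}' - \widehat{\varphi^c_a}
&= \Big(\frac{1}{|C_1|} - \frac{1}{|C_a|}\Big)\sum_{k\in C_1}\tau_k^*\varphi^c_a - \frac{1}{|C_a|}\sum_{k\in C_2}\tau_k^*\varphi^c_a \\
&= \frac{|C_2|}{|C_a|}\,\widehat{\varphi^c_a}' - \frac{1}{|C_a|}\sum_{k\in C_2}\tau_k^*\varphi^c_a .
\end{aligned}
$$
Hence
$$
\|\widehat{\varphi^c_a}' - \widehat{\varphi^c_a}\| \le 2\,\frac{|C_2|}{|C_a|} ,
$$
which tends to 0 as $a \to \infty$ by (12.27). We note
$$
\lim_a \widehat{\varphi^c_a}'(AA^*) = \lim_a \widehat{\varphi^c_a}(AA^*) = \varphi(AA^*) , \qquad
\lim_a \widehat{\varphi^c_a}'(A^*A) = \lim_a \widehat{\varphi^c_a}(A^*A) = \varphi(A^*A) .
$$
By the lower semi-continuity of $S(\cdot,\cdot)$, we obtain
$$
\liminf_a S\big(\widehat{\varphi^c_a}'(AA^*), \widehat{\varphi^c_a}'(A^*A)\big) \ge S(\varphi(AA^*), \varphi(A^*A)) . \tag{12.34}
$$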
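The norm estimate for the difference of the two averaged states only uses that each summand is a state and that the index set splits into $C_1$ and $C_2$. As a hedged illustration, the following Python sketch checks the same bound in a classical toy setting, assuming probability vectors in place of the states $\tau_k^*\varphi^c_a$ and the $\ell^1$ norm in place of the norm of functionals. All names (`average`, `l1_distance`, the sizes 90 and 10) are choices made for this example only.

```python
import random

def average(vectors):
    """Uniform convex combination of a list of probability vectors."""
    count = len(vectors)
    dim = len(vectors[0])
    return [sum(v[j] for v in vectors) / count for j in range(dim)]

def l1_distance(p, q):
    """l^1 distance, the classical counterpart of the norm of functionals."""
    return sum(abs(x - y) for x, y in zip(p, q))

def random_probability_vector(rng, dim):
    """A random point of the probability simplex (toy stand-in for a state)."""
    weights = [rng.random() + 1e-9 for _ in range(dim)]
    total = sum(weights)
    return [w / total for w in weights]

rng = random.Random(0)
size_c1, size_c2, dim = 90, 10, 5  # hypothetical |C_1|, |C_2| and dimension
states_c1 = [random_probability_vector(rng, dim) for _ in range(size_c1)]
states_c2 = [random_probability_vector(rng, dim) for _ in range(size_c2)]

full_average = average(states_c1 + states_c2)  # average over all of C_a
bulk_average = average(states_c1)              # average over C_1 only
gap = l1_distance(bulk_average, full_average)
bound = 2 * size_c2 / (size_c1 + size_c2)      # 2 |C_2| / |C_a|
print(gap, bound)
```

The gap stays below the bound for any choice of the vectors, because the difference equals $(|C_2|/|C_a|)$ times the difference of two probability vectors.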
Combining the estimates (12.32), (12.33), (12.34) as well as (12.27), we obtain the following inequality in the limit $a \to \infty$:
$$
\mathrm{Im}\,\varphi(A^*\delta_\Phi(A)) \ge S(\varphi(AA^*), \varphi(A^*A)) - 2\varepsilon\|A\|^2 .
$$
Due to arbitrariness of $\varepsilon$, we have
$$
\mathrm{Im}\,\varphi(A^*\delta_\Phi(A)) \ge S(\varphi(AA^*), \varphi(A^*A))
$$
for $A \in \mathcal{A}_\circ$. Hence the condition (C-2) holds.

Thus, we have shown that $\varphi$ satisfies the $(\delta_\Phi, 1)$-dKMS condition if $\varphi$ is the (unique) solution of the $(\Phi, 1)$-variational principle when $\Phi \in \mathcal{P}_\tau^1$. For general $\Phi \in \mathcal{P}_\tau$, we use the standard argument of convex analysis in the same way as [26], or Theorem 6.2.42 in [17]. By Theorem 12.8, any solution of the $(\Phi, 1)$-variational principle can be obtained by successive use of the following procedures, starting with the unique solution $\varphi_\alpha$ of the $(\Phi_\alpha, 1)$-variational principle for $\Phi_\alpha \in \mathcal{P}_\tau^1$.
(1) Weak∗ limits of any converging nets $\varphi_\alpha$ such that $\|\Phi_\alpha - \Phi\| \to 0$.
(2) Convex combinations of limits obtained in (1).
(3) Weak∗ limits of a converging net of states obtained in (2).
By Lemma 6.6, the conditions (C-1) and (C-2) are stable under these procedures. As we have already shown these conditions for $\varphi_\alpha$ when $\Phi_\alpha$ belongs to $\mathcal{P}_\tau^1$, the same holds for any $\Phi \in \mathcal{P}_\tau$. We have now shown Theorem B.

13. Use of Other Entropy in the Variational Equality

We now consider the possibility of replacing the mean entropy $s(\omega)$ in Theorem 11.4 by another entropy. We take up the CNT entropy $h_\omega(\tau)$ with respect to the lattice translation automorphism group $\tau$ as one example. Readers will find, however, that any other entropy will do if it has those basic properties of the CNT entropy which are used in the proof of Theorem 13.2. Note that it is not known so far whether the CNT entropy is equal to the mean entropy, either in some general context or in the present case.

13.1. CNT-entropy

The CNT-entropy was introduced by Connes–Narnhofer–Thirring [19] for a single automorphism and its invariant state, and was extended by Hudetz [22] to the multidimensional case of the group $\mathbb{Z}^\nu$ generated by a finite number ($=\nu$) of commuting automorphisms. We will use the latter extended version for the group of lattice translation automorphisms $\tau_m$ ($m \in \mathbb{Z}^\nu$).
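For a positive integer $k$, we consider a finite decomposition of a state $\omega$ in the state space $\mathcal{A}^*_{+,1}$:
$$
\omega = \sum_{i(1),i(2),\ldots,i(k)} \omega_{i(1)i(2)\cdots i(k)} , \tag{13.1}
$$
where each $i(l)$ runs over a finite subset of $\mathbb{N}$, $l = 1, \ldots, k$, and $\omega_{i(1)i(2)\cdots i(k)}$ is a nonzero positive linear functional of $\mathcal{A}$. For each fixed $l$ and $i(l)$, let
$$
\omega^l_{i(l)} \equiv \sum_{\substack{i(1),i(2),\ldots,i(k) \\ i(l):\,\text{fixed}}} \omega_{i(1)i(2)\cdots i(k)} , \qquad
\hat{\omega}^l_{i(l)} \equiv \frac{\omega^l_{i(l)}}{\omega^l_{i(l)}(\mathbf{1})} . \tag{13.2}
$$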
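Let $\eta(x) \equiv -x\log x$ for $x > 0$ and $\eta(0) = 0$. For finite dimensional subalgebras $\mathcal{A}_1, \mathcal{A}_2, \ldots, \mathcal{A}_k$ of $\mathcal{A}$, the so-called algebraic entropy $H_\omega(\mathcal{A}_1, \mathcal{A}_2, \ldots, \mathcal{A}_k)$ is defined by
$$
\begin{aligned}
H_\omega(\mathcal{A}_1, \mathcal{A}_2, \ldots, \mathcal{A}_k) \equiv \sup \Bigg[
& \sum_{i(1),i(2),\ldots,i(k)} \eta\big(\omega_{i(1)i(2)\cdots i(k)}(\mathbf{1})\big)
 - \sum_{l=1}^{k}\sum_{i(l)} \eta\big(\omega^l_{i(l)}(\mathbf{1})\big) \\
& + \sum_{l=1}^{k} S(\omega|_{\mathcal{A}_l})
 - \sum_{l=1}^{k}\sum_{i(l)} \omega^l_{i(l)}(\mathbf{1})\, S(\hat{\omega}^l_{i(l)}|_{\mathcal{A}_l}) \Bigg] ,
\end{aligned} \tag{13.3}
$$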
where the supremum is taken over all finite decompositions (13.1) of $\omega$ with a fixed $k$. If $\omega$ is $\tau$-invariant, the following limit (denoted by $h_{\omega,\tau}(N)$) is known to exist (as the infimum over $a$) for any finite dimensional subalgebra $N \subset \mathcal{A}$:
$$
h_{\omega,\tau}(N) \equiv \lim_{a\to\infty} \frac{1}{|C_a|}\, H_\omega\big(N, \ldots, \tau^{k}(N), \ldots, \tau^{(a-1,\ldots,a-1)}(N)\big) ,
$$
where there are $|C_a|$ arguments for $H_\omega(\cdots)$ and each of them is $\tau^k(N)$, $k \in C_a$. Let $N_1 \subset N_2 \subset \cdots \subset N_n \subset \cdots$ be an increasing sequence of finite algebras such that the norm closure of $\cup_n N_n$ is equal to $\mathcal{A}$. By a Kolmogorov–Sinai type theorem (Corollary V.4 in [19]), the CNT-entropy $h_\omega(\tau)$ is given by
$$
h_\omega(\tau) = \lim_{n\to\infty} h_{\omega,\tau}(N_n) . \tag{13.4}
$$
13.2. Variational equality in terms of CNT-entropy

Let $J_1, J_2, \ldots, J_k$ be disjoint finite subsets of $\mathbb{Z}^\nu$ with their union $J$. From Lemma VIII.1 in [19] it follows that
$$
H_\omega(\mathcal{A}(J_1), \mathcal{A}(J_2), \ldots, \mathcal{A}(J_k)) \le S(\omega_J) . \tag{13.5}
$$
When $\omega$ is an even ‘product state’, the equality holds as follows (the following simple proof is due to a referee).

Lemma 13.1. Let $J_1, J_2, \ldots, J_k$ be disjoint finite subsets with their union $J$. Let $\omega$ be a $\Theta$-even state of $\mathcal{A}$. Assume that $\omega$ has the following product property:
$$
\omega(A_1 A_2 \cdots A_k B) = \omega(A_1)\omega(A_2)\cdots\omega(A_k)\omega(B) , \tag{13.6}
$$
where $A_j$ is an arbitrary element in $\mathcal{A}(J_j)$ ($j = 1, \ldots, k$) and $B$ is an arbitrary element in $\mathcal{A}(J^c)$. Then
$$
H_\omega(\mathcal{A}(J_1), \mathcal{A}(J_2), \ldots, \mathcal{A}(J_k)) = S(\omega_J) = \sum_{l=1}^{k} S(\omega_{J_l}) , \tag{13.7}
$$
and
$$
H_\omega(\mathcal{A}(J_1), \mathcal{A}(J_2), \ldots, \mathcal{A}(J_k)) = \sum_{l=1}^{k} H_\omega(\mathcal{A}(J_l)) . \tag{13.8}
$$
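The additivity $S(\omega_J) = \sum_l S(\omega_{J_l})$ in (13.7) has a familiar commutative counterpart: the Shannon entropy of a product distribution is the sum of the entropies of its marginals. The following Python sketch checks only this classical analogue, assuming probability vectors in place of the restrictions $\omega_{J_l}$; it says nothing about the algebraic entropy $H_\omega$ itself. The function names and the three marginals are illustrative choices.

```python
import math
from itertools import product

def eta(x):
    """eta(x) = -x log x for x > 0, and eta(0) = 0."""
    return 0.0 if x == 0 else -x * math.log(x)

def entropy(distribution):
    """Shannon entropy, the commutative counterpart of S."""
    return sum(eta(p) for p in distribution)

def product_distribution(marginals):
    """Joint distribution of independent variables with the given marginals."""
    return [math.prod(probabilities) for probabilities in product(*marginals)]

# Hypothetical marginals playing the role of the restrictions to J_1, J_2, J_3.
marginals = [[0.5, 0.5], [0.2, 0.8], [0.1, 0.3, 0.6]]
joint = product_distribution(marginals)

joint_entropy = entropy(joint)
sum_of_parts = sum(entropy(m) for m in marginals)
print(joint_entropy, sum_of_parts)
```

Up to floating-point rounding the two printed numbers agree, which is the classical shadow of the second equality in (13.7).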
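Proof. We define
$$
E_{J_i}(+) \equiv \frac{1}{2}\big(\mathrm{id} + \Theta^{J_i}\big) .
$$
Then $E_{J_1,\ldots,J_k}(+) \equiv E_{J_1}(+)\cdots E_{J_k}(+)$ is the conditional expectation from $\mathcal{A}$ onto $\mathcal{A}(J_1)_+ \otimes \cdots \otimes \mathcal{A}(J_k)_+ \otimes \mathcal{A}(J^c)$. Since $\omega$ is a product state for the tensor product $(\mathcal{A}(J_1)_+ \otimes \cdots \otimes \mathcal{A}(J_k)_+) \otimes \mathcal{A}(J^c)$, there exists an $\omega$-preserving conditional expectation $E^\omega_0$ from $(\mathcal{A}(J_1)_+ \otimes \cdots \otimes \mathcal{A}(J_k)_+) \otimes \mathcal{A}(J^c)$ onto $\mathcal{A}(J_1)_+ \otimes \cdots \otimes \mathcal{A}(J_k)_+$. Hence
$$
E^\omega_{J_1,\ldots,J_k}(+) \equiv E^\omega_0\, E_{J_1,\ldots,J_k}(+)
$$
is an $\omega$-preserving conditional expectation from $\mathcal{A}$ onto $\mathcal{A}(J_1)_+ \otimes \cdots \otimes \mathcal{A}(J_k)_+$. Hence
$$
\begin{aligned}
H_\omega(\mathcal{A}(J_1)_+, \mathcal{A}(J_2)_+, \ldots, \mathcal{A}(J_k)_+)
&= H_{\omega|_{\mathcal{A}(J_1)_+ \otimes \cdots \otimes \mathcal{A}(J_k)_+}}(\mathcal{A}(J_1)_+, \mathcal{A}(J_2)_+, \ldots, \mathcal{A}(J_k)_+) \\
&= \sum_{l=1}^{k} S(\omega|_{\mathcal{A}(J_l)_+}) = S(\omega_J) .
\end{aligned}
$$
On the other hand,
$$
H_\omega(\mathcal{A}(J_1)_+, \mathcal{A}(J_2)_+, \ldots, \mathcal{A}(J_k)_+) \le H_\omega(\mathcal{A}(J_1), \mathcal{A}(J_2), \ldots, \mathcal{A}(J_k)) \le S(\omega_J) .
$$

We are now in a position to give the main theorem of this subsection.

Theorem 13.2. Assume the same conditions on $\Phi$ as Theorem 11.4. Then
$$
P(\beta\Phi) = \sup_{\omega \in \mathcal{A}^{*,\tau}_{+,1}} \big[ h_\omega(\tau) - \beta e_\Phi(\omega) \big] , \tag{13.9}
$$
where $h_\omega(\tau)$ is the CNT-entropy of $\omega$ with respect to the lattice translation $\tau$.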
Proof. Based on Lemma 13.1, the proof goes in the same way as in the case of quantum lattice systems [32]. The basic properties of the CNT-entropy which we use in the proof are as follows.
(i) Covariance under an automorphism of $\mathcal{A}$ (the adjoint action on states and conjugacy action on the shift).
(ii) Scaling property under the scaling of the automorphism group.
(iii) Concave dependence on states.
Due to (13.5), we have
$$
h_\omega(\tau) \le s(\omega) \tag{13.10}
$$
for any translation invariant state $\omega$. Hence the variational inequality (11.10) obviously holds when $s(\omega)$ is replaced by $h_\omega(\tau)$. Due to Lemma 13.1 and the product property of $\varphi^c_a$, the translation invariant state $\widehat{\varphi^c_a}$ defined in (11.23) will play an identical role as in the proof of Theorem 11.4. Therefore the sequence
$$
\big\{ h_{\widehat{\varphi^c_a}}(\tau) - e_\Phi(\widehat{\varphi^c_a}) \big\}
$$
tends to the supremum value $P(\Phi)$ of the variational inequality as $a \to \infty$. Hence the theorem follows.

Remark. (iii) is a general property of CNT-entropy (see e.g. [41]) and is enough for the proof. But in the situation of the above proof, the affinity holds due to the specific nature of the states to be considered.

The preceding result is the variational equality. We are then interested in the variational principle.

Proposition 13.3. Suppose that a translation invariant state $\varphi$ satisfies
$$
P(\beta\Phi) = h_\varphi(\tau) - \beta e_\Phi(\varphi) . \tag{13.11}
$$
Then $\varphi$ is a solution of the $(\Phi, \beta)$-variational principle and
$$
h_\varphi(\tau) = s(\varphi) . \tag{13.12}
$$

Proof. By (13.5), we have
$$
s(\varphi) - \beta e_\Phi(\varphi) \ge h_\varphi(\tau) - \beta e_\Phi(\varphi) = P(\beta\Phi) .
$$
By the variational inequality (11.10), we have
$$
s(\varphi) - \beta e_\Phi(\varphi) = P(\beta\Phi) . \tag{13.13}
$$
Therefore $\varphi$ is a solution of the $(\Phi, \beta)$-variational principle. From (13.11) and (13.13), we obtain (13.12).

Remark 1. We have no result about the existence theorem for a solution of the variational principle (13.11) in terms of the CNT-entropy for a general $\Phi \in \mathcal{P}_\tau$
(like Theorem 11.7) nor the stability of solutions of such a variational principle (like Lemma 12.9), the obstacle in applying the usual method being the absence of any result about weak∗ upper semicontinuity of $h_\omega(\tau)$ in $\omega$. In this sense, Proposition 13.3 is a superficial result, and Theorem 11.4 is short of ‘the variational principle’ in terms of the CNT-entropy. See also the discussion in Sec. 4 of [32].

Remark 2. Although we have used CNT entropy throughout this section, other entropy such as $ht_\omega(\sigma)$ defined by Choda [18] can be substituted for $h_\omega(\tau)$, yielding similar results.

14. Discussion

The following are some of the remaining problems about equilibrium statistical mechanics of Fermion lattice systems which are not covered in this paper.

1. Dynamics which does not commute with Θ

Obviously, there is an inner one-parameter group of ∗-automorphisms which does not commute with $\Theta$. Examples of outer dynamics not commuting with $\Theta$ can be constructed in the following way (suggested by one of the referees). Let $\{I_i\}_{i=1,2,\ldots}$ be a partition of the lattice $\mathbb{Z}^\nu$ into mutually disjoint finite subsets $I_i$ and let $J_j \equiv \cup_{i\le j} I_i$. Choose a self-adjoint $b_i$ in $\mathcal{A}(I_i)_-$ for each $i$ and set $\Phi(J_i) \equiv v_{J_{i-1}} b_i$ where $v_J$ is given by (4.30). By Theorem 4.17(1), they mutually commute and $\Phi(J_i) \in \mathcal{A}(J_{i-1})'$ for each $i$. Hence $\alpha_t^{(i)} \equiv \mathrm{Ad}\, e^{it\Phi(J_i)}$, $i = 1, 2, \ldots$, are mutually commuting dynamics of $\mathcal{A}$, $\alpha_t^{(i)}$ leaving elements of $\mathcal{A}(J_{i-1})$ invariant. Hence $\alpha_t \equiv \prod_{i=1}^{\infty} \alpha_t^{(i)}$ gives a dynamics of $\mathcal{A}$ satisfying $\Theta\alpha_t = \alpha_{-t}\Theta$. (Namely, its generator anticommutes with $\Theta$.) The corresponding potential is given by $\Phi(I) = 0$ if $I \ne J_i$ for any $i$ and $\Phi(I) = \Phi(J_i)$ if $I = J_i$. This potential satisfies the standardness condition (Φ-d) if each $b_i$ satisfies it for the set $I_i$. By looking at the behavior of
$$
U_{n,N}^*\,\alpha_t(U_{n,N}) = e^{-2it\sum_{i=0}^{N}\Phi(J_{n+i})} \quad \text{for } U_{n,N} \equiv \prod_{i=0}^{N} v_{I_{n+i}}
$$
as $n \to \infty$, the dynamics is seen to be outer unless $\sum_i \Phi(J_i)$ is convergent.

2. Broken Θ-invariance of equilibrium states

In connection with the Gibbs condition, we have shown in Sec. 7.7 that the state perturbed either by the surface energy or by the local interaction energy satisfies the product property if and only if the equilibrium state is $\Theta$-invariant. However, we do not know an example of an equilibrium state which is not $\Theta$-invariant. Existence or non-existence of such a state seems to be an important question. It seems to be closely related to problem 3 below. Note that any translation invariant state is $\Theta$-invariant. So we need broken translation invariance of an equilibrium state for its broken $\Theta$-invariance.

3. Local Thermodynamical Stability (LTS)

In parallel with the case of quantum spin lattice systems, one can formulate the local stability condition ([10], [39]) for our Fermion lattice system. However, there
seem to be two choices of the outside system for a local algebra $\mathcal{A}(I)$ ($I$ finite): (1) the commutant $\mathcal{A}(I)'$; (2) $\mathcal{A}(I^c)$. For the choice (1), all arguments in the case of quantum spin lattice systems seem to go through for the Fermion lattice system, leading to equivalence of LTS with the KMS condition under our basic Assumptions (I), (II) and (III). On the other hand, (2) seems to be the physically correct choice, although we do not have an equivalence proof for (2) so far. In this connection, problem 2 is crucial. If every equilibrium state is $\Theta$-invariant, then the choice (2) also seems to give the LTS which is equivalent to the KMS condition under our basic assumptions. A paper on this problem is forthcoming [15].

4. Downstairs Equivalence

We may say that the dynamics $\alpha_t$ is working upstairs while its generator is working downstairs. In particular, our arena for the downstairs activity is $\mathcal{A}_\circ$. The stair going upstairs seems to be not wide open. On the other hand, there seems to be a lot more room downstairs. There, we have established the one-to-one correspondence between ($\Theta$-invariant) derivations on $\mathcal{A}_\circ$ and standard potentials. We have shown that the solution of the variational principle (described in terms of a translation covariant potential) satisfies the dKMS condition on $\mathcal{A}_\circ$ (described in terms of the corresponding derivation). How about the converse? There is also the problem of equivalence of the LTS condition (in terms of a potential) and the dKMS condition on $\mathcal{A}_\circ$ (in terms of the corresponding derivation), where the translation invariance is not needed. Some aspects of this problem will also be included in the forthcoming paper [15].

5. Equivalent Potentials

We have introduced the notion of general potentials and equivalence among them in Sec. 5.5. Our theory is developed only for the unique standard potential among each equivalence class. Natural questions about general potentials arise. Does the existence of the limits defining the pressure $P(\beta\Phi)$ and the mean energy $e_\Phi(\varphi)$ hold also for translation covariant general potentials $\Phi$? Assuming the existence, are $P(\beta\Phi)$ and $e_\Phi(\varphi)$ the same as those for the unique standard potential $\Phi_s$ equivalent to $\Phi$? If they are different, how about the solution of their variational principle? We give a partial answer to these questions.

Proposition 14.1. Let $\Phi$ be a translation covariant potential (which satisfies (Φ-a,b,c,e,f) by definition) fulfilling the following additional condition: the surface energy
$$
W_\Phi(I) = \lim_{J \nearrow \mathbb{Z}^\nu} \sum_{K} \{\Phi(K);\ K \cap I \ne \emptyset,\ K \cap I^c \ne \emptyset,\ K \subset J\} \tag{14.1}
$$
satisfies
$$
\operatorname*{v.H.\,lim}_{I\to\infty} \frac{\|W_\Phi(I)\|}{|I|} = 0 . \tag{14.2}
$$
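Let $\Phi_s$ be the standard potential (in $\mathcal{P}_\tau$) which is equivalent to $\Phi$. Then both van Hove limits defining $P(\beta\Phi)$ and $e_\Phi(\omega)$ for all $\omega \in \mathcal{A}^{*,\tau}_{+,1}$ exist if and only if
$$
C_\Phi \equiv \operatorname*{v.H.\,lim}_{I\to\infty} \frac{\tau(H_\Phi(I))}{|I|} \tag{14.3}
$$
exists. If this is the case, then the following relations hold:
$$
\begin{aligned}
P(\beta\Phi) &= \operatorname*{v.H.\,lim}_{I\to\infty} \frac{1}{|I|}\log \mathrm{Tr}_I\big(e^{-\beta H(I)}\big) \\
&= \operatorname*{v.H.\,lim}_{I\to\infty} \frac{1}{|I|}\log \mathrm{Tr}_I\big(e^{-\beta U(I)}\big) \\
&= P(\beta\Phi_s) - \beta C_\Phi ,
\end{aligned} \tag{14.4}
$$
$$
\begin{aligned}
e_\Phi(\omega) &= \operatorname*{v.H.\,lim}_{I\to\infty} \frac{1}{|I|}\,\omega(H(I)) \\
&= \operatorname*{v.H.\,lim}_{I\to\infty} \frac{1}{|I|}\,\omega(U(I)) \\
&= e_{\Phi_s}(\omega) + C_\Phi .
\end{aligned} \tag{14.5}
$$
Furthermore, the $(\Phi, \beta)$- and $(\Phi_s, \beta)$-variational principles give the same set of solutions.

Remark. If $\tau(\Phi(I)) = 0$ for all $I$, then (14.3) exists and $C_\Phi = 0$. Hence $P(\beta\Phi) = P(\beta\Phi_s)$ and $e_\Phi(\omega) = e_{\Phi_s}(\omega)$. This can be achieved for any general potential $\Phi$ by changing it to $\Phi_1 = \Phi - \Phi_0$, where $\Phi_0$ is a scalar-valued potential given by $\Phi_0(I) = \tau(\Phi(I))\mathbf{1}$.

Proof. Since $\Phi$ and $\Phi_s$ are equivalent, we have
$$
H_\Phi(I) - H_{\Phi_s}(I) \in \mathcal{A}(I)' .
$$
Since $H_\Phi(I) - H_{\Phi_s}(I)$ is $\Theta$-even by (Φ-c) for $\Phi$ and $\Phi_s$, we have
$$
H_\Phi(I) - H_{\Phi_s}(I) \in \mathcal{A}(I^c)_+ .
$$
Hence,
$$
\begin{aligned}
U_\Phi(I) - U_{\Phi_s}(I) &= E_I(U_\Phi(I) - U_{\Phi_s}(I)) \\
&= E_I(H_\Phi(I) - H_{\Phi_s}(I)) - E_I(W_\Phi(I) - W_{\Phi_s}(I)) \\
&= \tau(H_\Phi(I) - H_{\Phi_s}(I)) - E_I(W_\Phi(I) - W_{\Phi_s}(I)) ,
\end{aligned}
$$
due to (14.6). By $\tau(H_{\Phi_s}(I)) = 0$ and $E_I(W_{\Phi_s}(I)) = 0$ due to (Φ-d), we have
$$
U_\Phi(I) - U_{\Phi_s}(I) = \tau(H_\Phi(I)) - E_I(W_\Phi(I)) . \tag{14.6}
$$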
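By (14.2), we have
$$
\operatorname*{v.H.\,lim}_{I\to\infty} \frac{1}{|I|}\,\|U_\Phi(I) - U_{\Phi_s}(I) - \tau(H_\Phi(I))\| = 0 .
$$
Also by (14.2),
$$
\operatorname*{v.H.\,lim}_{I\to\infty} \frac{1}{|I|}\,\|H_\Phi(I) - U_\Phi(I)\| = 0 .
$$
Hence (14.5) follows:
$$
\begin{aligned}
\operatorname*{v.H.\,lim}_{I\to\infty} \frac{1}{|I|}\,\omega(H_\Phi(I))
&= \operatorname*{v.H.\,lim}_{I\to\infty} \frac{1}{|I|}\,\omega(U_\Phi(I)) \\
&= \operatorname*{v.H.\,lim}_{I\to\infty} \frac{1}{|I|}\,\omega(U_{\Phi_s}(I)) + \operatorname*{v.H.\,lim}_{I\to\infty} \frac{1}{|I|}\,\tau(H_\Phi(I)) \\
&= e_{\Phi_s}(\omega) + \operatorname*{v.H.\,lim}_{I\to\infty} \frac{1}{|I|}\,\tau(H_\Phi(I)) .
\end{aligned}
$$
We also have
$$
\begin{aligned}
\operatorname*{v.H.\,lim}_{I\to\infty} \frac{1}{|I|}\log \mathrm{Tr}_I\big(e^{-\beta H(I)}\big)
&= \operatorname*{v.H.\,lim}_{I\to\infty} \frac{1}{|I|}\log \mathrm{Tr}_I\big(e^{-\beta U(I)}\big) \\
&= P(\beta\Phi_s) - \beta \operatorname*{v.H.\,lim}_{I\to\infty} \frac{1}{|I|}\,\tau(H_\Phi(I)) ,
\end{aligned}
$$
which shows (14.4).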
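The shift by $-\beta C_\Phi$ in (14.4) reflects an elementary fact already visible at finite volume: adding a scalar $c|I|$ to a local Hamiltonian changes $\frac{1}{|I|}\log\mathrm{Tr}\,e^{-\beta H}$ by exactly $-\beta c$. The following Python sketch illustrates only this finite-dimensional toy statement, assuming a Hamiltonian given through a list of eigenvalues. The spectrum, the volume and the constant `c` are made-up values, and the surface term controlled by (14.2) is ignored.

```python
import math

def log_partition_per_site(eigenvalues, beta, volume):
    """(1/|I|) log Tr exp(-beta H) for a Hamiltonian with the given spectrum."""
    exponents = [-beta * energy for energy in eigenvalues]
    largest = max(exponents)  # log-sum-exp for numerical stability
    total = sum(math.exp(x - largest) for x in exponents)
    return (largest + math.log(total)) / volume

volume = 4   # hypothetical |I|
beta = 0.7
c = 0.25     # hypothetical scalar density, playing the role of C_Phi

spectrum = [-1.5, -0.5, 0.0, 0.3, 1.2, 2.0]         # toy spectrum of U(I)
shifted = [energy + c * volume for energy in spectrum]  # spectrum of U(I) + c|I| 1

p_standard = log_partition_per_site(spectrum, beta, volume)
p_shifted = log_partition_per_site(shifted, beta, volume)
print(p_shifted, p_standard - beta * c)
```

The two printed values coincide up to rounding, in line with $P(\beta\Phi) = P(\beta\Phi_s) - \beta C_\Phi$.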
Remark. Suppose that $\Phi$ satisfies (Φ-a), (Φ-b), (Φ-c), (Φ-f) and
$$
\sum_{I \ni 0} \|\Phi(I)\| < \infty . \tag{14.7}
$$
Then it satisfies (Φ-e) automatically and is a general potential. Furthermore, (14.2) is known to be satisfied (the same proof as Lemma 9.1 holds except for estimates (9.2), (9.3), (9.4) and (9.5), which follow from the absolute convergence of (14.7) due to (7.12)) and
$$
C_\Phi = \operatorname*{v.H.\,lim}_{I\to\infty} \frac{\tau(H_\Phi(I))}{|I|} = \operatorname*{v.H.\,lim}_{I\to\infty} \frac{\tau(U_\Phi(I))}{|I|} = e_\Phi(\tau) \tag{14.8}
$$
is known to converge. (The same proof as Theorem 9.5 holds except for a modification of the proof of some estimates for Lemma 9.2 on the basis of the absolute convergence of (14.7). See also e.g. Proposition 6.2.39 of [17].) Therefore (14.4) and (14.5) hold and the solutions of the $(\Phi, \beta)$- and $(\Phi_s, \beta)$-variational principles coincide.

Appendix: Van Hove Limit

For the sake of mathematical precision, we present some digression about the van Hove limit.
A.1. Van Hove net

We introduce two mutually equivalent types of conditions for the van Hove limit. First we start with our notation about the shapes of regions of $\mathbb{Z}^\nu$, which will be used hereafter. Recall that $C_a$ is a cube of size $a$ given by (8.8). For a finite subset $I$ of $\mathbb{Z}^\nu$ and $a \in \mathbb{N}$, let $n^+_a(I)$ be the smallest number of translates of $C_a$ whose union covers $I$, and let $n^-_a(I)$ be the largest number of mutually disjoint translates of $C_a$ that can be packed in $I$. Let $B_r(n)$ be a closed ball in $\mathbb{R}^\nu$ ($\supset \mathbb{Z}^\nu$) with the center $n \in \mathbb{Z}^\nu$ and the radius $r \in \mathbb{R}$. Denote the surface of $I$ with a thickness $r$ ($> 0$) by
$$
\mathrm{surf}_r(I) \equiv \{ n \in I;\ B_r(n) \cap I^c \ne \emptyset \} . \tag{A.1}
$$
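To make definition (A.1) concrete, the following Python sketch computes $\mathrm{surf}_r(I)$ by brute force for cubes and prints the ratio $|\mathrm{surf}_r(I)|/|I|$. It is a hedged illustration: the cube is assumed here to be $\{0, \ldots, a-1\}^\nu$ (the definition (8.8) is not reproduced in this part of the text), the complement is taken inside $\mathbb{Z}^\nu$, and the names `cube` and `surface` are chosen for this example.

```python
import math
from itertools import product

def cube(a, nu):
    """The cube {0, ..., a-1}^nu, assumed here to match C_a."""
    return set(product(range(a), repeat=nu))

def surface(region, r, nu):
    """Points n of `region` whose closed ball B_r(n) meets the complement in Z^nu."""
    reach = math.floor(r)
    # Lattice offsets m with |m| <= r; n + m ranges over the lattice points of B_r(n).
    offsets = [m for m in product(range(-reach, reach + 1), repeat=nu)
               if sum(x * x for x in m) <= r * r]
    return {n for n in region
            if any(tuple(x + d for x, d in zip(n, m)) not in region
                   for m in offsets)}

nu, r = 2, 1.5
ratios = []
for a in (4, 8, 16, 32):
    region = cube(a, nu)
    ratios.append(len(surface(region, r, nu)) / len(region))
print(ratios)
```

For these parameters the surface is the outermost layer of the square, so the ratio equals $1 - ((a-2)/a)^2$ and decreases towards 0 as $a$ grows.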
In what follows, we consider a net of finite subsets $I_\alpha$ of $\mathbb{Z}^\nu$ where the set of indices $\alpha$ is a directed set. Its partial ordering need not have any relation with the set inclusion partial ordering of $I_\alpha$.

Lemma A.1. For a net of finite subsets $I_\alpha$ of $\mathbb{Z}^\nu$, the following two conditions are equivalent:
(1) For any $a \in \mathbb{N}$,
$$
\lim_\alpha \frac{n^-_a(I_\alpha)}{n^+_a(I_\alpha)} = 1 . \tag{A.2}
$$
(2) For any $r > 0$,
$$
\lim_\alpha \frac{|\mathrm{surf}_r(I_\alpha)|}{|I_\alpha|} = 0 . \tag{A.3}
$$
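Condition (1) can be checked by hand for cubes. If $I = C_b$ and the cubes are taken to be $\{0,\ldots,b-1\}^\nu$ (an assumption about (8.8)), then a sublattice counting argument gives $n^-_a(C_b) = \lfloor b/a \rfloor^\nu$ and $n^+_a(C_b) = \lceil b/a \rceil^\nu$: every translate of $C_a$ contains exactly one point of each coset of $(a\mathbb{Z})^\nu$. The following Python sketch simply evaluates these closed forms; the function names are illustrative.

```python
def packing_number(b, a, nu):
    """n^-_a(C_b): maximal number of disjoint translates of C_a inside C_b."""
    return (b // a) ** nu

def covering_number(b, a, nu):
    """n^+_a(C_b): minimal number of translates of C_a covering C_b."""
    return (-(-b // a)) ** nu  # integer ceiling of b / a

a, nu = 3, 2
ratios = [packing_number(b, a, nu) / covering_number(b, a, nu)
          for b in (10, 100, 1000)]
print(ratios)
```

The printed ratios increase towards 1 as $b$ grows, as condition (A.2) requires for the sequence of cubes.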
Proof. (1) → (2): Let $\varepsilon > 0$ and $r > 0$ be given. Let $a \in \mathbb{N}$ be sufficiently large so that $a \ge 2r + 1$ and
$$
\varepsilon_1 \equiv 1 - \frac{[a - 2r]^\nu}{a^\nu} < \frac{\varepsilon}{2} ,
$$
where $[b]$ indicates the maximal integer not exceeding $b$. By the condition (1), there exists an index $\alpha_0$ of the net $\{I_\alpha\}$ such that, for $\alpha \ge \alpha_0$,
$$
\varepsilon_2 \equiv 1 - \frac{n^-_a(I_\alpha)}{n^+_a(I_\alpha)} < \frac{\varepsilon}{2} .
$$
Let $D_1, \ldots, D_N$, with $N = n^-_a(I_\alpha)$, be mutually disjoint translates of $C_a$ contained in $I_\alpha$. For each $i = 1, \ldots, N$, let $D_i'$ be a translate of $C_{[a-2r]}$ placed in $D_i$ with a distance larger than $r$ from the complement of $D_i$ in $\mathbb{Z}^\nu$; such a translate exists. Then
$$
\frac{|D_i'|}{|D_i|} = \frac{[a - 2r]^\nu}{a^\nu} = 1 - \varepsilon_1 .
$$
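Let $D$ be the union of $D_1, \ldots, D_N$ and $D'$ be the union of $D_1', \ldots, D_N'$. Then
$$
\frac{|D \setminus D'|}{|D|} = 1 - \frac{|D'|}{|D|} = 1 - (1 - \varepsilon_1) = \varepsilon_1 .
$$
Since $n^+_a(I_\alpha)$ translates of $C_a$ cover $I_\alpha$, we have
$$
|I_\alpha| \le n^+_a(I_\alpha)|C_a| = n^+_a(I_\alpha)\,a^\nu .
$$
Hence
$$
\frac{|I_\alpha \setminus D|}{|I_\alpha|} = 1 - \frac{|D|}{|I_\alpha|} \le 1 - \frac{|D|}{n^+_a(I_\alpha)\,a^\nu} = 1 - \frac{n^-_a(I_\alpha)}{n^+_a(I_\alpha)} = \varepsilon_2 .
$$
Due to $I_\alpha \supset D$,
$$
\frac{|D \setminus D'|}{|I_\alpha|} \le \frac{|D \setminus D'|}{|D|} = \varepsilon_1 .
$$
By construction, the distance between $D_i'$ and the complement of $D_i$ (in $\mathbb{Z}^\nu$) is larger than $r$, and hence the distance between $D_i'$ and the complement of $I_\alpha$ is larger than $r$. Thus,
$$
\mathrm{surf}_r(I_\alpha) \subset I_\alpha \setminus D' = (D \setminus D') \cup (I_\alpha \setminus D) .
$$
For $\alpha \ge \alpha_0$, we obtain
$$
\frac{|\mathrm{surf}_r(I_\alpha)|}{|I_\alpha|} \le \varepsilon_1 + \varepsilon_2 < \varepsilon .
$$
Now (1) → (2) is proved.

(2) → (1): Let $\varepsilon > 0$ and $a \in \mathbb{N}$ be given. Take $r > \sqrt{\nu}\,a$. Let $\alpha_0$ be an index of the net $I_\alpha$ such that, for $\alpha \ge \alpha_0$,
$$
\frac{|\mathrm{surf}_r(I_\alpha)|}{|I_\alpha|} < a^{-\nu}\varepsilon .
$$
The translates $C_a + an$ of $C_a$ are disjoint for distinct $n \in \mathbb{Z}^\nu$ and their union over $n \in \mathbb{Z}^\nu$ is $\mathbb{Z}^\nu$. Let $O_\alpha$ be the union of all those $C_a + an$ contained in $I_\alpha$ and $N_1$ be their number. Let $O_\alpha'$ be the union of all those $C_a + an$ which have nonempty intersections with both $I_\alpha$ and $(I_\alpha)^c$, and $N_2$ be their number. From the construction, the following estimates follow:
$$
N_1 \le n^-_a(I_\alpha) \le n^+_a(I_\alpha) \le N_1 + N_2 .
$$
Furthermore, since each $C_a + an$ in $O_\alpha'$ contains a point in $I_\alpha$ as well as a point in $(I_\alpha)^c$, and the distance of any two points in it is at most $\sqrt{\nu}\,a < r$, it has a non-empty intersection with $I_\alpha$, which is contained in $\mathrm{surf}_r(I_\alpha)$. Therefore,
$$
|\mathrm{surf}_r(I_\alpha)| \ge N_2 = (N_1 + N_2) - N_1 \ge n^+_a(I_\alpha) - n^-_a(I_\alpha) .
$$
We have also
$$
|I_\alpha| \le n^+_a(I_\alpha)|C_a| = n^+_a(I_\alpha)\,a^\nu .
$$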
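Combining the above estimates, we obtain for $\alpha \ge \alpha_0$
$$
0 \le 1 - \frac{n^-_a(I_\alpha)}{n^+_a(I_\alpha)} = \frac{n^+_a(I_\alpha) - n^-_a(I_\alpha)}{n^+_a(I_\alpha)} \le \frac{|\mathrm{surf}_r(I_\alpha)|\,a^\nu}{|I_\alpha|} < \varepsilon .
$$
Hence, (2) → (1) is now proved.

Definition A.2. If a net of finite subsets $\{I_\alpha\}$ satisfies the above condition (1) (or equivalently (2)), then it is said to be a van Hove net (in $\mathbb{Z}^\nu$).

We introduce the third condition on a net of finite subsets $I_\alpha$ of $\mathbb{Z}^\nu$:
(3) For any finite subset $I$ of $\mathbb{Z}^\nu$, there exists an index $\alpha_\circ$ such that $I_\alpha \supset I$ for all $\alpha \ge \alpha_\circ$.

Definition A.3. If a net $\{I_\alpha\}$ (in $\mathbb{Z}^\nu$) satisfies the conditions (1) (or equivalently (2)) and (3), then it is said to be a van Hove net tending to $\mathbb{Z}^\nu$.

Remark. The condition (1) (or equivalently (2)) does not imply the condition (3). $\{C_n\}_{n\in\mathbb{N}}$ of (8.8) is obviously a van Hove sequence. But it does not cover the whole $\mathbb{Z}^\nu$. Hence it is not a van Hove sequence tending to $\mathbb{Z}^\nu$.

Lemma A.4. For any van Hove net and for any van Hove net tending to $\mathbb{Z}^\nu$, the directed set cannot have a maximal element.

Proof. Let $\{I_\alpha\}_{\alpha\in A}$ be a van Hove net where $A$ is a directed set of indices. We show that for any $\alpha_\circ \in A$, there exists $\alpha' \in A$ satisfying $\alpha' \ge \alpha_\circ$, $\alpha' \ne \alpha_\circ$. In fact, for a given $\alpha_\circ$, there exist $a(\alpha_\circ) \in \mathbb{N}$ and $n \in \mathbb{Z}^\nu$ such that $I_{\alpha_\circ} \subset C_{a(\alpha_\circ)} - n$, and hence $n^-_{a(\alpha_\circ)}(I_{\alpha_\circ}) = 0$. On the other hand, for the above $a(\alpha_\circ) \in \mathbb{N}$ there exists $\alpha_1$ such that
$$
1 - \frac{n^-_{a(\alpha_\circ)}(I_\alpha)}{n^+_{a(\alpha_\circ)}(I_\alpha)}
$$
[…] $\alpha'$ and $\beta > \beta'$ or if $\alpha = \alpha'$, $\beta = \beta'$ and $i \ge i'$. For any $(\alpha_1, \beta_1, i_1) \in C$ and $(\alpha_2, \beta_2, i_2) \in C$, there exist $\alpha \in A$ and $\beta \in B$ such that $\alpha > \alpha_1$, $\alpha > \alpha_2$, $\beta > \beta_1$, $\beta > \beta_2$, because $A$ and $B$ are directed sets without maximal elements due to Lemma A.4. Hence $(\alpha, \beta, 2)$ ($\in C$) obviously satisfies
$$
(\alpha, \beta, 2) > (\alpha_1, \beta_1, i_1) , \qquad (\alpha, \beta, 2) > (\alpha_2, \beta_2, i_2) .
$$
So $C$ is a directed set. Let
$$
I_{(\alpha,\beta,i)} =
\begin{cases}
I_{1\alpha} & \text{if } i = 1 , \\
I_{2\beta} & \text{if } i = 2 .
\end{cases}
$$
Since $\{I_{1\alpha}\}$ and $\{I_{2\beta}\}$ are van Hove nets, for any $d > 0$ and $\varepsilon > 0$ there exist $\alpha_\circ \in A$ and $\beta_\circ \in B$ such that
$$
\frac{|\mathrm{surf}_d(I_{1\alpha})|}{|I_{1\alpha}|} < \varepsilon \ \text{ if } \alpha \ge \alpha_\circ , \qquad
\frac{|\mathrm{surf}_d(I_{2\beta})|}{|I_{2\beta}|} < \varepsilon \ \text{ if } \beta \ge \beta_\circ .
$$
Set $\gamma_\circ \equiv (\alpha_\circ, \beta_\circ, 1)$. For any $\gamma = (\alpha, \beta, i) \ge \gamma_\circ$, we have obviously $\alpha \ge \alpha_\circ$ and $\beta \ge \beta_\circ$ by the definition of the ordering. Hence,
$$
\frac{|\mathrm{surf}_d(I_\gamma)|}{|I_\gamma|} \le \max\left\{ \frac{|\mathrm{surf}_d(I_{1\alpha})|}{|I_{1\alpha}|} , \frac{|\mathrm{surf}_d(I_{2\beta})|}{|I_{2\beta}|} \right\} < \varepsilon .
$$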
Thus $\{I_\gamma\}_{\gamma\in C}$ is also a van Hove net. If $\{I_{1\alpha}\}$ and $\{I_{2\beta}\}$ are van Hove nets tending to $\mathbb{Z}^\nu$, then $\{I_\gamma\}$ is also a van Hove net tending to $\mathbb{Z}^\nu$ by its definition. Since $\{I_\gamma\}_{\gamma\in C}$ is a van Hove net (van Hove net tending to $\mathbb{Z}^\nu$), $f$ has the following limit by the assumption on $f$:
$$
f_\infty = \lim_\gamma \{ f(I_\gamma),\ \gamma \in C \} .
$$
Thus for any $\varepsilon$, there exists a $\gamma_\circ = (\alpha_\circ, \beta_\circ, 1)$ or $\gamma_\circ = (\alpha_\circ, \beta_\circ, 2)$ such that
$$
|f_\infty - f(I_\gamma)| < \varepsilon
$$
for $\gamma \ge \gamma_\circ$. This inequality holds especially for $\gamma = (\alpha, \beta, 1) \ge \gamma_\circ$ with $\alpha > \alpha_\circ$ and $\beta > \beta_\circ$. For this $\gamma$, $I_\gamma = I_{1\alpha}$, and hence $f(I_\gamma) = f(I_{1\alpha})$. Thus we have
$$
|f_\infty - f(I_{1\alpha})| < \varepsilon
$$
for $\alpha > \alpha_\circ$. Therefore, we obtain
$$
f_\infty = \lim_\alpha f(I_{1\alpha}) .
$$
Similarly,
$$
f_\infty = \lim_\beta f(I_{2\beta}) .
$$
Now we have shown that the limit is the same for $\{I_{1\alpha}\}_{\alpha\in A}$ and $\{I_{2\beta}\}_{\beta\in B}$. Hence the independence of the limit on the choice of the net follows.

Definition A.6. If $f(I_\alpha)$ has a limit for any van Hove net $\{I_\alpha\}$, then $f(I)$ is said to have the van Hove limit for large $I$, and its limit is denoted by
$$
\operatorname*{v.H.\,lim}_{I\to\infty} f(I) . \tag{A.4}
$$
If $f(I_\alpha)$ has a limit for any van Hove net $\{I_\alpha\}$ tending to $\mathbb{Z}^\nu$, then $f(I)$ is said to have the van Hove limit for $I$ tending to $\mathbb{Z}^\nu$, and its limit is denoted by
$$
\operatorname*{v.H.\,lim}_{I\to\mathbb{Z}^\nu} f(I) . \tag{A.5}
$$
In general, the first condition is stronger than the second. If $f(I)$ is translation invariant, however, the existence of the two limits is equivalent, as shown below.

Lemma A.7. If $f(I)$ is translation invariant in the sense that $f(I + n) = f(I)$ for any finite subset $I$ of $\mathbb{Z}^\nu$ and $n \in \mathbb{Z}^\nu$, then $f(I)$ has the van Hove limit for large $I$ if and only if $f$ has the van Hove limit for $I$ tending to $\mathbb{Z}^\nu$.

Proof. The only if part is obvious. Let $\{I_\alpha\}_{\alpha\in A}$ be an arbitrary van Hove net. Let $a(\alpha)$ be the largest integer $a$ such that a translate of $C_a$ is contained in $I_\alpha$. Let $C_{a(\alpha)} + n \subset I_\alpha$ and hence $C_{a(\alpha)} \subset I_\alpha - n$. Now we shift an approximate center of
$C_{a(\alpha)}$ to the origin of $\mathbb{Z}^\nu$ and simultaneously shift $I_\alpha - n$ by the same amount. More precisely, $I_\alpha - n$ is shifted to
$$
I_\alpha' \equiv I_\alpha - n - \left[\frac{a(\alpha) - 1}{2}\right](1, \ldots, 1) .
$$
Obviously,
$$
\frac{|\mathrm{surf}_d(I_\alpha)|}{|I_\alpha|} = \frac{|\mathrm{surf}_d(I_\alpha')|}{|I_\alpha'|}
$$
for all $d > 0$ and $\alpha \in A$. We show that this $\{I_\alpha'\}$ ($\alpha \in A$) is tending to $\mathbb{Z}^\nu$. Let $I$ be a finite subset of $\mathbb{Z}^\nu$. For a sufficiently large integer $a$,
$$
I \subset C_a - \left[\frac{a-1}{2}\right](1, \ldots, 1) .
$$
For this $a$, there exists $\alpha_1$ such that
$$
n^-_a(I_\alpha) > 0
$$
for $\alpha \ge \alpha_1$. Then $a(\alpha) \ge a$ and
$$
I_\alpha' \supset C_{a(\alpha)} - \left[\frac{a(\alpha)-1}{2}\right](1, \ldots, 1) \supset C_a - \left[\frac{a-1}{2}\right](1, \ldots, 1) \supset I
$$
for $\alpha \ge \alpha_1$. Thus $\{I_\alpha'\}$ ($\alpha \in A$) is a van Hove net tending to $\mathbb{Z}^\nu$. Since $f$ is translation invariant,
$$
f(I_\alpha) = f(I_\alpha') .
$$
By the assumption that $f$ has the van Hove limit tending to $\mathbb{Z}^\nu$, $\lim_\alpha f(I_\alpha')$ exists, and hence $\lim_\alpha f(I_\alpha)$ exists.

References

[1] H. Araki and E. H. Lieb, Entropy inequalities, Comm. Math. Phys. 18 (1970), 160–170.
[2] H. Araki, Relative hamiltonian for faithful normal states of a von Neumann algebra, Publ. RIMS, Kyoto Univ. 7 (1973), 165–209.
[3] H. Araki, Expansional in Banach algebra, Ann. Sci. École Norm. Sup. Sér. 4 6 (1973), 67–84.
[4] H. Araki, Golden–Thompson and Peierls–Bogoliubov inequalities for a general von Neumann algebra, Comm. Math. Phys. 34 (1973), 167–178.
[5] H. Araki and P. D. F. Ion, On the equivalence of KMS and Gibbs conditions for states of quantum lattice systems, Comm. Math. Phys. 35 (1974), 1–12.
[6] H. Araki, On the equivalence of the KMS condition and the variational principle for quantum lattice systems, Comm. Math. Phys. 38 (1974), 1–10.
[7] H. Araki, Relative entropy and its application, in Colloques Internationaux du C.N.R.S. No. 248 Les Methodes Mathematiques de la Theorie Quantique des Champs, eds. F. Guerra, D. W. Robinson and R. Stora, CNRS, Paris, 1976.
[8] H. Araki, Relative entropy of states of von Neumann algebras, Publ. RIMS, Kyoto Univ. 11 (1976), 809–833.
[9] H. Araki, Relative entropy of states of von Neumann algebras II, Publ. RIMS, Kyoto Univ. 13 (1977), 173–192.
[10] H. Araki and G. L. Sewell, KMS conditions and local thermodynamical stability of quantum lattice systems, Comm. Math. Phys. 52 (1977), 103–109.
[11] H. Araki, D. Kastler, M. Takesaki and R. Haag, Extension of KMS states and chemical potentials, Comm. Math. Phys. 53 (1977), 97–134.
[12] H. Araki, On KMS states of a C∗ dynamical system, Lecture Notes in Math. 650, Springer-Verlag, 1978.
[13] H. Araki, Toukeirikigaku no suuri, Iwanami (Japanese), 1994.
[14] H. Araki and H. Moriya, Joint extension of states of subsystems for a CAR system, to appear in Comm. Math. Phys.
[15] H. Araki and H. Moriya, Local thermodynamical stability of Fermion lattice systems, Lett. Math. Phys. 62 (2002), 33–45.
[16] B. Baumgartner, A partial ordering of sets, making mean entropy monotone, J. Phys. A: Math. Gen. 35 (2002), 3163–3182.
[17] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics 2, 2nd edition, Springer-Verlag, 1996.
[18] M. Choda, A C∗-Dynamical Entropy and Applications to Canonical Endomorphisms, J. Funct. Anal. 173 (2000), 453–480.
[19] A. Connes, H. Narnhofer and W. Thirring, Dynamical Entropy of C∗ Algebras and von Neumann Algebras, Comm. Math. Phys. 112 (1987), 691–719.
[20] M. Fannes, A continuity property of the entropy density for spin lattice systems, Comm. Math. Phys. 31 (1973), 291–294.
[21] F. M. Goodman, P. de la Harpe and V. F. R. Jones, Coxeter Graphs and Towers of Algebras, Springer-Verlag, 1989.
[22] T. Hudetz, Spacetime Dynamical Entropy of Quantum Systems, Lett. Math. Phys. 16 (1988), 151–161.
[23] R. B. Israel, Convexity in the Theory of Lattice Gases, Princeton University Press, 1979.
[24] A. R. Kay and B. S. Kay, Monotonicity with volume of entropy and of mean entropy for translationally invariant systems as consequences of strong subadditivity, J. Phys. A: Math. Gen. 34 (2001), 365–382.
[25] H. Kosaki, Relative entropy for states: a variational expression, J. Operator Theory 16 (1986), 335–348.
[26] O. E. Lanford III and D. W. Robinson, Statistical mechanics of quantum spin systems III, Comm. Math. Phys. 9 (1968), 327–338.
[27] E. H. Lieb and M. B. Ruskai, Proof of the strong subadditivity of quantum-mechanical entropy, J. Math. Phys. 14 (1973), 1938–1941.
[28] E. H. Lieb and M. B. Ruskai, A fundamental property of quantum-mechanical entropy, Phys. Rev. Lett. 30 (1973), 434–436.
[29] T. Matsui, Ground states of fermions on lattices, Comm. Math. Phys. 182 (1996), 723–751.
[30] T. Matsui, Quantum statistical mechanics and Feller semigroup, Quantum Probability Communication 10 (1998), 101–124.
[31] S. Mazur, Über konvexe Mengen in linearen normierten Räumen, Studia Math. 4 (1933), 70–84.
[32] H. Moriya, Variational principle and the dynamical entropy of space translation, Rev. Math. Phys. 11 (1999), 1315–1328.
[33] H. Moriya, Some aspects of quantum entanglement for CAR systems, Lett. Math. Phys. 60 (2002), 109–121.
[34] S. Neshveyev and E. Størmer, The variational principle for a class of asymptotically abelian C∗-algebras, Comm. Math. Phys. 215 (2000), 177–196.
[35] D. Petz, On certain properties of the relative entropy of states of operator algebras, Math. Z. 206 (1991), 351–361.
[36] R. T. Powers, Representations of the canonical anticommutation relations, Thesis, Princeton University, 1967.
[37] D. Ruelle, A variational formulation of equilibrium statistical mechanics and the Gibbs phase rule, Comm. Math. Phys. 5 (1967), 324–329.
[38] S. Sakai, On one-parameter subgroups of ∗-automorphisms on operator algebras and the corresponding unbounded derivations, Am. J. Math. 98 (1976), 427–440.
[39] G. L. Sewell, KMS conditions and local thermodynamical stability of quantum lattice systems II, Comm. Math. Phys. 55 (1977), 53–61.
[40] B. Simon, The Statistical Mechanics of Lattice Gases, Princeton University Press, 1993.
[41] E. Størmer, A survey of noncommutative dynamical entropy, Oslo preprint, Dep. of Mathematics 18 (2000).
[42] M. Takesaki, Tomita’s Theory of Modular Hilbert-Algebras and its Application, Lecture Notes in Math. 128, Springer-Verlag, 1970.
[43] M. Takesaki, Theory of Operator Algebras I, Springer-Verlag, 1979.
[44] J. Tomiyama, On the projection of norm one in W∗-algebras, Proc. Japan Acad. 33 (1957), 609–612.
[45] H. Umegaki, Conditional expectation in an operator algebra IV (entropy and information), Kodai Math. Sem. Rep. 14 (1962), 59–85.
April 11, 2003 14:51 WSPC/148-RMP
00159
Reviews in Mathematical Physics Vol. 15, No. 2 (2003) 199–215 c World Scientific Publishing Company
ON THE GEOMETRY OF THE CHARACTERISTIC CLASS OF A STAR PRODUCT ON A SYMPLECTIC MANIFOLD∗
PIERRE BIELIAVSKY
Université Libre de Bruxelles, Brussels, Belgium
[email protected]

PHILIPPE BONNEAU
Université de Bourgogne, Dijon, France
[email protected]

Received 7 May 2002
Revised 11 October 2002

The characteristic class of a star product on a symplectic manifold appears as the class of a deformation of a given symplectic connection, as described by Fedosov. In contrast, one usually thinks of the characteristic class of a star product as the class of a deformation of the Poisson structure (as in Kontsevich’s work). In this paper, we present, in the symplectic framework, a natural procedure for constructing a star product by directly quantizing a deformation of the symplectic structure. Basically, in Fedosov’s recursive formula for the star product with zero characteristic class, we replace the symplectic structure by one of its formal deformations in the parameter $\hbar$. We then show that every equivalence class of star products contains such an element. Moreover, within a given class, equivalences between such star products are realized by formal one-parameter families of diffeomorphisms, as produced by Moser’s argument.

Keywords: Deformation quantization; characteristic class of star products; reduction.
∗ Research supported by the Communauté française de Belgique, through an Action de Recherche Concertée de la Direction de la Recherche Scientifique.
1. Introduction
Inspired by the pioneering work of Weyl [16, 17], Wigner [18] and Moyal [10], a rigorous description of quantum mechanics as a deformation of classical mechanics has been given in [1, 2]. These are the foundational papers of what is now called “deformation quantization”. A fundamental problem is the construction, for a given smooth manifold $N$, of a formal associative product on $C^\infty(N)[[t]]$ that is a deformation of the natural pointwise product, i.e. a product $\star$ such that
$f \star g = f\cdot g + \sum_{n\ge 1} t^n P_n(f, g)$, where $f, g \in C^\infty(N)$ and the $P_n$’s are bidifferential operators. Such a product is called a “star product”. If it exists, then it is straightforward to see that $N$ is a Poisson manifold. So a natural question arises: does there exist a star product on every Poisson manifold? An affirmative answer has been given in [9]. Two star products $\star_1$ and $\star_2$ on a Poisson manifold $N$ are called equivalent if there exists a formal series $T = \mathrm{id} + \sum_{k\ge 1} t^k T_k$ of differential operators $\{T_k : C^\infty(N) \to C^\infty(N)\}$ such that $T(f \star_2 g) = Tf \star_1 Tg$. In the general case of Poisson manifolds, a classifying space for equivalence classes of star products is described in [9]. For the particular case of symplectic manifolds, this has been known for quite a while [7, 8]: equivalence classes of star products are in one-to-one correspondence with sequences of elements of de Rham’s $H^2(N)$. The sequence $\Omega \in H^2(N)[[t]]$ associated to the equivalence class of a given star product is called the characteristic class of the star product.
The characteristic class of a star product on a symplectic manifold appears as the class of a deformation of a given symplectic connection, as described by Fedosov [7, 8]. In contrast, one usually thinks of the characteristic class of a star product as the class of a deformation of the Poisson structure [9]. In this paper, we present, in the symplectic framework, a natural procedure for constructing a star product by directly quantizing a deformation of the symplectic structure. Basically, in Fedosov’s recursive formula for the star product with zero characteristic class, we replace the symplectic structure by one of its formal deformations in the parameter $\hbar$. We then show that every equivalence class of star products contains such an element. Moreover, within a given class, equivalences between such star products are realized by formal one-parameter families of diffeomorphisms, as produced by Moser’s argument.
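As a concrete instance of the definition of a star product recalled above, one may keep in mind the Moyal product on $\mathbb{R}^2$. The sketch below is only illustrative: the coordinates $(q,p)$, the constant Poisson bracket $\{q,p\}=1$ and the normalisation $P_1 = \frac12\{\,,\}$ are assumptions of this example (conventions differ between authors), not notation taken from the article.

```latex
% Moyal product on R^2, coordinates (q,p), Poisson bracket {q,p} = 1 (assumed normalisation)
\[
f \star g \;=\; \sum_{n\ge 0} \frac{1}{n!}\Big(\frac{t}{2}\Big)^{n}
   \sum_{j=0}^{n} (-1)^{j}\binom{n}{j}\,
   \big(\partial_q^{\,n-j}\partial_p^{\,j} f\big)\,\big(\partial_p^{\,n-j}\partial_q^{\,j} g\big)
 \;=\; f\cdot g + \frac{t}{2}\{f,g\} + O(t^2).
\]
% On the coordinate functions only the terms n = 0 and n = 1 survive:
\[
q \star p = qp + \frac{t}{2}, \qquad p \star q = qp - \frac{t}{2}, \qquad
q \star p - p \star q = t\,\{q,p\} = t .
\]
```

Here the bidifferential operators $P_n$ are the order-$n$ terms of the sum, and the antisymmetric part of the first-order term recovers the Poisson bracket, in line with the remark that the existence of $\star$ forces $N$ to be a Poisson manifold.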
More precisely, let $(M, \omega)$ be a compact symplectic manifold. Let $\{\Omega(t)\}_{t\in]-\epsilon,\epsilon[}$ be a smooth path of symplectic structures on $M$ such that $\Omega(0) = \omega$. The pair $(M, \{\Omega(t)\})$ defines a regular Poisson structure $\hat\Omega$ on $\hat M = M \times ]-\epsilon,\epsilon[$ whose symplectic leaves are $\{(M \times \{t\}, \Omega(t))\}$. Applying Fedosov’s method to $(\hat M, \hat\Omega)$, one obtains a tangential star product $\hat\star$ on $(\hat M, \hat\Omega)$ with zero characteristic class. The “infinite jet at 0 of $\hat\star$ in $t = \hbar$” then defines a star product $\star$ on $(M, \omega)$ to which is associated the de Rham class $[\Omega^\hbar]_{\text{de Rham}}$, where $\Omega^\hbar$ denotes the infinite jet at 0 of $\{\Omega(t)\}$ in $t = \hbar$. If $\{\Omega'(t)\}$ is such that $[\Omega'(t)]_{\text{de Rham}} = [\Omega(t)]_{\text{de Rham}}$ $\forall t$, then an equivalence between the corresponding star products is realized as the infinite jet of a family of diffeomorphisms $\{\varphi_t\}$ (whose existence is guaranteed by Moser’s argument) such that $\varphi_t^\star \Omega'(t) = \Omega(t)$.
This work is motivated by the question of obtaining a quantum analogue of Kirwan’s map when considering the problem of commutation between Marsden–Weinstein reduction and deformation quantization. However, this point is not investigated in the present article.
2. Fedosov Construction on Regular Poisson Manifolds
We present Fedosov star products on regular Poisson manifolds [7, 8] by means of a partial connection defined (only) on the characteristic distribution of the Poisson
structure. By this we avoid considering Poisson affine connections (cf. Lemma 2.8). This little point excepted, there is essentially nothing new in the present section. But it sets the notations and presents Fedosov’s construction in a completely intrinsic way.
2.1. Linear Weyl algebra
Let $(V, \omega)$ be a real symplectic vector space and consider the associated Heisenberg Lie algebra $H$ over the dual space $V^\star$. That is, $H = V^\star \oplus \mathbb{R}\hbar$, where $\hbar$ is central and where the Lie bracket of two elements $y, y' \in V^\star$ is defined by
\[ [y, y'] = y'({}^\sharp y)\,\hbar , \]
the map $V^\star \xrightarrow{\sharp} V$ being the isomorphism induced by $\omega$. Denote by $S(H)$ (resp. $U(H)$) the symmetric (resp. the universal enveloping) algebra of $H$ and consider the complete symmetrization map $\varphi : S(H) \to U(H)$ given by the Poincaré–Birkhoff–Witt theorem. The symmetric product on $S(H)$ will be denoted by $\bullet$, while $\star$ will denote the product on $S(H)$ transported via $\varphi$ of the universal product on $U(H)$.
Lemma 2.1. There exists one and only one grading $S(H) =: \bigoplus_{r\ge 0} S^{(r)}(H)$ on $S(H)$ such that:
(i) $S^r(V^\star) \subset S^{(r)}(H)$;
(ii) $S^{(r)}(H) \star S^{(s)}(H) \subset S^{(r+s)}(H)$,
where $S^r(V^\star)$ denotes the $r$th symmetric power of $V^\star$. This grading is compatible with the symmetric product $\bullet$ as well.
One then defines the linear Weyl algebra $W(H)$ as the direct product $W(H) := \prod_{r=0}^{\infty} S^{(r)}(H)$ endowed with the extended product $\star$. Note that the symmetric product $\bullet$ extends to $W(H)$ as well. The center $ZW(H)$ of $(W(H), \star)$ is canonically isomorphic to the space of power series $\mathbb{R}[[\hbar]]$. By using the symplectic structure, one gets an identification between the Lie algebra $sp(V, \omega)$ and the second symmetric power $S^2(V^\star)$:
\[ sp(V, \omega) \to S^2(V^\star) \subset W(H), \qquad A \mapsto A . \]
Lemma 2.2. For all $a \in W(H)$ and $A \in sp(V, \omega)$, one has
\[ [A, a] = 2\hbar A(a) , \]
where $[\ ,\ ]$ denotes the Lie bracket on $W(H)$ induced by the associative product $\star$.
Proof. Both $\mathrm{ad}(A)$ and $\hbar A$ are derivations of $(W(H), \star)$. It is therefore sufficient to verify formula (i) on generators.
The isomorphism $V \xrightarrow{\flat} V^\star$ defines an injection $V \xrightarrow{\mu} W(H)$ which we call the linear moment. Observe that, viewed as an element of $W(H) \otimes V^\star$, $\mu$ is fixed under the action of the symplectic group $Sp(V, \omega)$.
Both products $\star$ and $\bullet$ extend naturally to the space $W(H) \otimes \Lambda^\bullet(V^\star)$ of multilinear forms on $V$ valued in $W(H)$. We define the total degree $t$ of an element $a \otimes \omega$, $a \in S^{(r)}(H)$, $\omega \in \Lambda^p(V^\star)$, by $t = p + r$. With respect to this degree on $W(H) \otimes \Lambda^\bullet(V^\star)$, the extended multiplications, again denoted by $\star$ and $\bullet$, are graded. The bracket $[\ ,\ ]$ mentioned in Lemma 2.2 therefore extends to $W(H) \otimes \Lambda^\bullet(V^\star)$ as well, and $(W(H) \otimes \Lambda^\bullet(V^\star), [\ ,\ ])$ is a graded Lie algebra.
To an element $a \otimes x \in W(H) \otimes \Lambda^p(V)$, one can associate the operator
\[ i_{a\otimes x} : W(H) \otimes \Lambda^\bullet(V^\star) \to W(H) \otimes \Lambda^{\bullet-p}(V^\star) , \]
defined by $i_{a\otimes x}(b \otimes \omega) := a \bullet b \otimes i_x\omega$, where $i_x\omega$ denotes the usual interior product. Using the universal property, one gets a map
\[ (W(H) \otimes V) \times W(H) \otimes \Lambda^\bullet(V^\star) \to W(H) \otimes \Lambda^{\bullet-p}(V^\star), \qquad (X, s) \mapsto i_X s . \]
In the case where $p$ is odd, since $i_X$ acts “symmetrically” on the “Weyl part” and “anti-symmetrically” on the “form part”, one has $i_X^2 = 0$.
In the same way, if $Y \subset W(H)$ is a subspace such that $[Y, Y] \subset ZW(H)$ (e.g. $Y = S^{(1)}(H)$), to any element $U \in Y \otimes \Lambda^p(V^\star)$, one can associate the operator
\[ \mathrm{ad}(U) : W(H) \otimes \Lambda^\bullet(V^\star) \to W(H) \otimes \Lambda^{\bullet+p}(V^\star) . \]
Using the Jacobi identity on the “Weyl part”, one observes that, if $p$ is odd, one has $\mathrm{ad}(U)^2 = 0$.
Definition 2.3. Using the duality
\[ S^{(1)}(H) \otimes V^\star \to S^{(1)}(H) \otimes V, \qquad U \mapsto {}^\sharp U , \]
one defines the cohomology (resp. homology) operator $\delta$ (resp. $\delta^\star$) by
\[ \hbar\,\delta := \mathrm{ad}(\mu), \qquad \delta^\star := i_{{}^\sharp\mu} , \]
where the linear moment $\mu$ is viewed as an element of $S^{(1)}(H) \otimes V^\star$. For a form $a \in S^\bullet(V^\star) \otimes \Lambda^\bullet(V^\star) \subset W(H) \otimes \Lambda^\bullet(V^\star)$ with total degree $t$, we set
\[ \delta^{-1} a := \begin{cases} \frac{1}{t}\,\delta^\star a & \text{if } t > 0, \\ 0 & \text{if } t = 0 . \end{cases} \]
One extends this definition $\mathbb{C}[[\hbar]]$-linearly to the whole $W(H) \otimes \Lambda^\bullet(V^\star)$.
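Lemma 2.4. (“Hodge decomposition”)
\[ \delta\delta^{-1} + \delta^{-1}\delta = \mathrm{Id} - \mathrm{pr}_0 , \]
where $\mathrm{pr}_0$ is the canonical projection $W(H) \otimes \Lambda^\bullet(V^\star) \xrightarrow{\mathrm{pr}_0} ZW(H)$.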
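The identity of Lemma 2.4 can be checked by hand on a simple element. The sketch below assumes the coordinate description familiar from Fedosov's work, namely $\delta = dx^i \wedge \partial/\partial y^i$ and $\delta^\star = y^i\, i(\partial/\partial x^i)$ for $\dim V = 2$, with the $y^i$ generating the Weyl part and the $dx^i$ the form part. These formulas and names are assumptions made for illustration and may differ from Definition 2.3 by sign conventions (a simultaneous change of sign of $\delta$ and $\delta^\star$ does not affect the check).

```latex
% Test element: a = y^1 dx^2, with r = 1, p = 1, total degree t = 2.
\[
\begin{aligned}
\delta a &= dx^1 \wedge dx^2, &\qquad \delta^\star a &= y^1 y^2, \\
\delta\,\delta^\star a &= y^2\,dx^1 + y^1\,dx^2, &\qquad
\delta^\star \delta\, a &= y^1\,dx^2 - y^2\,dx^1, \\
(\delta\delta^\star + \delta^\star\delta)\,a &= 2\,y^1 dx^2 = t\,a
&\;\Longrightarrow\;
(\delta\delta^{-1} + \delta^{-1}\delta)\,a &= a = (\mathrm{Id} - \mathrm{pr}_0)\,a .
\end{aligned}
\]
```

Since $\delta$ and $\delta^{-1}$ both preserve the total degree $t = p + r$, dividing by $t$ on each homogeneous component is what turns the "Laplacian" $\delta\delta^\star + \delta^\star\delta$ into the identity away from the center.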
Proof. We observe that $\delta$ and $\delta^\star$ are anti-derivations of degree $\pm 1$ of $(W(H) \otimes \Lambda^\bullet(V^\star), \bullet)$. Their anti-commutator being a derivation of degree 0, it is therefore sufficient to check the formula on generators.
Observe that $\delta$ is an anti-derivation of degree $+1$ of $(W(H) \otimes \Lambda^\bullet(V^\star), \star)$.
2.2. The Weyl bundle
Let $(N, \Lambda)$ be a regular Poisson manifold. The Poisson bivector $\Lambda$ induces a short exact sequence of vector bundles over $N$:
\[ 0 \to \mathrm{rad}(\Lambda) \to T^\star(N) \xrightarrow{\iota^\star} D^\star \to 0 , \]
where $D \xrightarrow{\iota} T(N)$ denotes the characteristic distribution associated to $\Lambda$ [15], and where $\mathrm{rad}(\Lambda)$ is the radical of $\Lambda$ in $T^\star(N)$. One therefore gets a non-degenerate foliated 2-form $\omega^D \in \Omega^2(D)$, dual to the canonical one on the quotient $T^\star(N)/\mathrm{rad}(\Lambda) = D^\star$. Fix a $\mathrm{rank}(D)$-dimensional symplectic vector space $(V, \omega)$, and, for all $x \in N$, define
\[ P_x = \{ b \in \mathrm{Hom}_{\mathbb{R}}(V, D_x) \mid b^\star \omega^D_x = \omega \} . \]
Then $P = \bigcup_{x\in N} P_x$ is naturally endowed with a structure of $Sp(V, \omega)$-principal bundle over $N$ (analogous to the symplectic frames in the symplectic case, except that here, one does not have a $G$-structure in general).
Definition 2.5. The Weyl bundle is the associated bundle
\[ W = P \times_{Sp(V,\omega)} W(H) , \]
where $W(H)$ is the vector space underlying the linear Weyl algebra defined from the data of $(V, \omega)$.
The space of $p$-forms with values in the sections of $W$ is denoted by $\Omega^p(W)$; it is canonically isomorphic to the space of sections of the associated bundle $P \times_{Sp(V,\omega)} (W(H) \otimes \Lambda^p(V^\star))$. The $Sp(V, \omega)$-invariance, at the linear level, of both products $\star$ and $\bullet$ on $W(H) \otimes \Lambda^\bullet(V^\star)$ provides graded products, again denoted by $\star$ and $\bullet$, on $\Omega^\bullet(W)$. In the same way, the operators $\delta$ and $\delta^{-1}$ on $W(H) \otimes \Lambda^\bullet(V^\star)$ define operators on sections:
\[ \Omega^\bullet(W) \;\underset{\delta^{-1}}{\overset{\delta}{\rightleftarrows}}\; \Omega^{\bullet+1}(W) , \]
leading to a Hodge decomposition of sections as in Lemma 2.4. Note that the bundle $ZW = P \times_{Sp(V,\omega)} ZW(H)$ being trivial, its space of sections is isomorphic to $C^\infty(N)[[\hbar]]$.
Remark 2.6. Observe that, as a vector bundle, $W$ is defined as soon as the distribution $D$ is given (cf. Lemma 2.1). The full data of the Poisson tensor $\Lambda$ is only needed to define the algebra structure on its space of sections.
2.3. Fedosov–Moyal star products
Definition 2.7. A foliated connection is a linear map
\[ \nabla : D \otimes D \to D, \qquad u \otimes v \mapsto \nabla_u v , \]
verifying ($f \in C^\infty(N)$)
(i) $\nabla_{fu} v = f \nabla_u v$,
(ii) $\nabla_u (fv) = f \nabla_u v + (L_{\iota(u)} f)\, v$.
A foliated connection is said to be symplectic if
(iii) $\nabla_u v - \nabla_v u - [u, v] = 0$,
(iv) $\nabla \omega^D = 0$.
Lemma 2.8. On a regular Poisson manifold, a symplectic foliated connection always exists.
Proof. Choose any linear connection $\nabla^0$ in the vector bundle $D \to N$. Since $D$ is an involutive tangent distribution, the torsion $T^0$ of the connection is well defined as a section of $D^\star \otimes \mathrm{End}(D)$. One then obtains a “torsion-free” connection $\nabla^1 = \nabla^0 - \frac12 T^0$ in $D$. Now, the formula
\[ \omega^D(S(u, v), w) = \frac13 \big( (\nabla^1_u \omega^D)(v, w) + (\nabla^1_v \omega^D)(u, w) \big) \]
defines a tensor $S$, section of $D^\star \otimes D^\star \otimes D$, such that $\nabla = \nabla^1 + S$ is as desired.
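The reason why the tensor $S$ in the proof of Lemma 2.8 does the job can be spelled out in a short computation. This is a sketch under the assumptions, implicit in the proof, that the new connection is $\nabla_u v = \nabla^1_u v + S(u, v)$ and that the leafwise exterior differential of $\omega^D$, written $d_D\,\omega^D$ here (a name introduced only for this example), vanishes.

```latex
\[
\begin{aligned}
(\nabla_u \omega^D)(v, w)
 &= (\nabla^1_u \omega^D)(v, w) - \omega^D(S(u, v), w) - \omega^D(v, S(u, w)) \\
 &= (\nabla^1_u \omega^D)(v, w) - \omega^D(S(u, v), w) + \omega^D(S(u, w), v) \\
 &= \tfrac13 \Big[ (\nabla^1_u \omega^D)(v, w) + (\nabla^1_v \omega^D)(w, u) + (\nabla^1_w \omega^D)(u, v) \Big] \\
 &= \tfrac13\, (d_D\, \omega^D)(u, v, w) \;=\; 0 .
\end{aligned}
\]
```

The third line uses the defining formula for $S$ twice together with the antisymmetry of $\nabla^1_u \omega^D$, and the last line uses that $\nabla^1$ is torsion-free. Since $S(u, v)$ is symmetric in $(u, v)$, adding it does not reintroduce torsion.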
Now, fix such a foliated symplectic connection $\nabla$ in $D$ and consider its associated covariant exterior derivative
\[ \Omega^p(W) \xrightarrow{\ \partial\ } \Omega^{p+1}(W) , \]
defined by
\[ \partial s(u_1, \ldots, u_{p+1}) = \sum_{i=1}^{p+1} (-1)^{i-1} (\nabla_{u_i} s)(u_1, \ldots, \hat u_i, \ldots, u_{p+1}) . \]
Lemma 2.2 then provides a 2-form $R \in \Omega^2(D \otimes D) \subset \Omega^2(W)$ defined by the formula
\[ 2\hbar\,\partial^2 = \mathrm{ad}(R) . \]
Inductively on the degree, one sees [8, Theorem 5.2.2] that the equation
\[ R + 2\hbar(\partial\gamma - \delta\gamma + \gamma^2) = 0 \]
has a unique solution $\gamma \in \Omega^1(W)$ such that $\delta^{-1}\gamma = 0$. This implies that the graded derivation
\[ D = \partial - \delta + \mathrm{ad}(\gamma) \]
of $(\Omega^\bullet(W), \star)$ is flat, i.e. $D^2 = 0$. One then proves, again inductively, that the projection
\[ W_D \xrightarrow{\ \mathrm{pr}_0\ } C^\infty(N)[[\hbar]] , \]
where $W_D$ is the kernel of $D$ restricted to the sections of $W$, is a linear isomorphism. The space of flat sections $W_D$ being a subalgebra of the sections of $W$ with respect to the product $\star$ ($D$ is a derivation), the above linear isomorphism yields a star product on $C^\infty(N)$ called the Fedosov–Moyal star product on $(N, \Lambda)$.
Remark 2.9. The Fedosov–Moyal star product constructed above is tangential. That means that it restricts well on the leaves, or, to say it more technically, that $f \star g = f\cdot g$ for $f, g$ leafwise constant functions. Indeed, the only differential operators used in the construction are generated by the sections of $D$, which are the vector fields on $N$ that vanish on leafwise constant functions.
3. The Main Construction
3.1. A particular Poisson manifold
Notations
Let $(M, \omega)$ be a compact symplectic manifold. Let $\Omega \in C^\infty(]-\epsilon,\epsilon[, \Omega^2(M))$ be a smooth path of symplectic structures on $M$ such that $\Omega(0) = \omega$. The smooth family $\{\Omega(t)\}_{t\in]-\epsilon,\epsilon[}$ then canonically defines on $\hat M := M \times ]-\epsilon,\epsilon[$ a Poisson structure $\hat\Omega$ whose symplectic leaves are $\{(M \times \{t\}, \Omega(t))\}$. We will denote by $D \subset T\hat M$ the characteristic distribution of the Poisson structure $\hat\Omega$ (i.e. $D_{(x,t)} = T_{(x,t)}(M \times \{t\})$).
The spaces $C^\infty(\hat M)[[\hbar]]$ (resp. $C^\infty(M)[[\hbar]]$) of power series in $\hbar$ with values in the algebra of smooth functions on $\hat M$ (resp. $M$) are $\mathbb{R}[[\hbar]]$-algebras. The quotient $\mathbb{R}[[\hbar]]$-algebra $C^\infty(M)[[\hbar]]/\hbar^{n+1} C^\infty(M)[[\hbar]]$ will be denoted by $C^\infty(M)[[\hbar]]_n$. It will often be identified with the space of polynomials in $\hbar$ of degree at most $n$ with values in $C^\infty(M)$.
We will consider the natural inclusion $i : C^\infty(M)[[\hbar]]_n \hookrightarrow C^\infty(\hat M)[[\hbar]]$ defined by $i(f)(x, t) = f(x)$, $\forall t \in ]-\epsilon,\epsilon[$. We will often denote $i(f)$ by $\hat f$.
By $DO_D(\hat M)$ we will denote the algebra of tangential (with respect to the distribution $D$) differential operators on $\hat M$, i.e. the set of all differential operators on $\hat M$ vanishing on leafwise constant functions. By $DO(M)$ we will denote the algebra of differential operators on $M$. As above we can consider the $\mathbb{R}[[\hbar]]$-algebras $DO_D(\hat M)[[\hbar]]$ and $DO(M)[[\hbar]]/\hbar^{n+1} DO(M)[[\hbar]]$ (abbreviated by $DO(M)[[\hbar]]_n$). When dealing with bidifferential operators, we will use the prefix “biDO”.
3.2. Taylor expansions
We have $C^\infty(\hat M) \simeq C^\infty(]-\epsilon,\epsilon[, C^\infty(M))$, seeing every element $a \in C^\infty(\hat M)$ as a function of one variable with values in a Fréchet space. We can therefore consider [4] its Taylor expansion of order $n$ at 0:
\[ a(t) = \sum_{k=0}^{n} \frac{1}{k!}\, t^k a^{(k)}(0) + t^n R_n(a)(t) \quad \text{with } R_n(a)(t) \to 0 \text{ as } t \to 0 . \]
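We define the $\mathbb{R}$-linear map $j_n^\hbar : C^\infty(\hat M) \to C^\infty(M)[[\hbar]]_n$ by
\[ j_n^\hbar a = \sum_{k=0}^{n} \hbar^k\, \frac{1}{k!}\, a^{(k)}(0) . \]
It is extended to $C^\infty(\hat M)[[\hbar]]$ in the following way:
\[ C^\infty(\hat M)[[\hbar]] \to C^\infty(M)[[\hbar]]_n , \qquad a = \sum_{l\ge 0} \hbar^l a_l \;\mapsto\; j_n^\hbar a = \sum_{l=0}^{n} \hbar^l\, j_{n-l}^\hbar a_l . \]
One then has
\[ j_n^\hbar a = \sum_{0 \le k+l \le n} \hbar^{k+l}\, \frac{1}{k!}\, a_l^{(k)}(0) . \]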
Lemma 3.1.
(1) $j_n^\hbar a = j_n^\hbar (a \bmod \hbar^{n+1})$.
(2) $j_n^\hbar$ is an $\mathbb{R}[[\hbar]]$-algebra homomorphism.
We now extend the map $j_n^\hbar$ to $DO_D(\hat M)[[\hbar]]$ in the natural way.
Definition 3.2.
(1) For $\Phi \in DO_D(\hat M)[[\hbar]]$, we define the operator $j_n^\hbar\Phi : C^\infty(M)[[\hbar]] \to C^\infty(M)[[\hbar]]_n$ by $j_n^\hbar\Phi\,.\,f = j_n^\hbar(\Phi.\hat f)$, $\forall f \in C^\infty(M)[[\hbar]]$.
(2) Similarly, for $B \in biDO_D(\hat M)[[\hbar]]$, we set $j_n^\hbar B\,.\,(f, g) = j_n^\hbar(B.(\hat f, \hat g))$, $\forall f, g \in C^\infty(M)[[\hbar]]$.
Lemma 3.3.
(1) One has $j_n^\hbar\Phi \in DO(M)[[\hbar]]_n$ and $j_n^\hbar B \in biDO(M)[[\hbar]]_n$.
(2) For all $a, b \in C^\infty(\hat M)[[\hbar]]$ one has $j_n^\hbar(\Phi.a) = j_n^\hbar\Phi\,.\,j_n^\hbar a$ and $j_n^\hbar(B.(a, b)) = j_n^\hbar B\,.\,(j_n^\hbar a, j_n^\hbar b)$.
Proof. We will show that $j_n^\hbar\Phi$ and $j_n^\hbar B$ are local, hence differential by Peetre’s theorem [5, 12, 13]. Let $f \in C^\infty(M)$ and $U$ be an open set in $M$ such that $f|_U \equiv 0$. Since $\hat f(x, t) = 0$ $\forall (x, t) \in U \times ]-\epsilon,\epsilon[$ and $\Phi$ is differential, one has $(\Phi.\hat f)|_{U \times ]-\epsilon,\epsilon[} \equiv 0$. Hence
\[ ((j_n^\hbar\Phi).f)|_U = (j_n^\hbar(\Phi.\hat f))|_U = \sum_{0 \le k+l \le n} \frac{\hbar^{k+l}}{l!} \big( (\Phi.\hat f)|_{U \times ]-\epsilon,\epsilon[} \big)^{(l)}(0) = 0 . \]
The bidifferential case follows in the same way. This proves the first part of the lemma. The second one follows from simple computations.
Remark 3.4. Lemma 3.3 implies that $j_n^\hbar$, defined as a map from $DO_D(\hat M)[[\hbar]]$ to $DO(M)[[\hbar]]_n$, is an $\mathbb{R}[[\hbar]]$-algebra homomorphism for the composition product on both algebras.
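To make the truncated Taylor map $j_n^\hbar$ of this subsection concrete, here is a small worked example. The functions $f, g \in C^\infty(M)$ and the particular element $a$ are hypothetical choices made only for illustration.

```latex
% a = a_0 + \hbar a_1 with a_0(x,t) = e^{t} f(x), a_1(x,t) = t g(x), and n = 2
\[
\begin{aligned}
j_2^\hbar a_0 &= f + \hbar f + \tfrac{\hbar^2}{2} f, \qquad
j_1^\hbar a_1 = \hbar\, g, \\
j_2^\hbar a &= j_2^\hbar a_0 + \hbar\, j_1^\hbar a_1
            = f + \hbar f + \hbar^2 \big( \tfrac12 f + g \big) \;\in\; C^\infty(M)[[\hbar]]_2 .
\end{aligned}
\]
```

In effect $j_n^\hbar$ substitutes $t = \hbar$ in the Taylor expansion at $t = 0$ and discards every term of order higher than $n$ in $\hbar$, which is the intuition behind its multiplicativity (Lemma 3.1).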
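3.3. Induced star-products
Let now $\hat\star$ be any tangential star product on $(\hat M, \hat\Omega)$; for instance consider the Moyal–Fedosov star product defined in Sec. 2.
Definition 3.5.
(1) We define $\star_n$ to be the map from $C^\infty(M)[[\hbar]]_n \times C^\infty(M)[[\hbar]]_n$ to $C^\infty(M)[[\hbar]]_n$ given by
\[ f \star_n g = j_n^\hbar(\hat f\, \hat\star\, \hat g) . \]
Equivalently (by Lemma 3.3), seeing $\hat\star$ as an element of $biDO_D(\hat M)[[\hbar]]$, one has $\star_n = j_n^\hbar\, \hat\star$.
(2) We define $\star$ to be the operation from $C^\infty(M)[[\hbar]] \times C^\infty(M)[[\hbar]]$ to $C^\infty(M)[[\hbar]]$ given by
\[ f \star g \bmod \hbar^{n+1} = (f \bmod \hbar^{n+1}) \star_n (g \bmod \hbar^{n+1}) \quad \text{for all } n \in \mathbb{N} . \]
Lemma 3.6.
(1) $\star_n$ is an associative product on the $\mathbb{R}[[\hbar]]$-algebra $C^\infty(M)[[\hbar]]_n$.
(2) $\star$ is a star-product on $M$, called the induced star product on $M$ by $\hat\star$.
Proof. For $f, g, h \in C^\infty(M)[[\hbar]]_n$, one has $(\hat f\, \hat\star\, \hat g)\, \hat\star\, \hat h = \hat f\, \hat\star\, (\hat g\, \hat\star\, \hat h)$. Therefore, $j_n^\hbar((\hat f\, \hat\star\, \hat g)\, \hat\star\, \hat h) = j_n^\hbar(\hat f\, \hat\star\, (\hat g\, \hat\star\, \hat h))$, and this equality holds if and only if
\[
\begin{aligned}
& j_n^\hbar(\hat\star.(\hat\star.(\hat f, \hat g), \hat h)) = j_n^\hbar(\hat\star.(\hat f, \hat\star.(\hat g, \hat h))) && \text{(reformulation)} \\
\Leftrightarrow\ & (j_n^\hbar \hat\star).(j_n^\hbar(\hat\star.(\hat f, \hat g)), j_n^\hbar \hat h) = (j_n^\hbar \hat\star).(j_n^\hbar \hat f, j_n^\hbar(\hat\star.(\hat g, \hat h))) && \text{(by Lemma 3.3)} \\
\Leftrightarrow\ & (j_n^\hbar \hat\star).((j_n^\hbar \hat\star).(f, g), h) = (j_n^\hbar \hat\star).(f, (j_n^\hbar \hat\star).(g, h)) && \text{(by Lemma 3.3)} \\
\Leftrightarrow\ & (f \star_n g) \star_n h = f \star_n (g \star_n h) && \text{(by Definition 3.5).}
\end{aligned}
\]
This proves item (1), in what is a classical way to show that a star-product is associative.
Corollary 3.7. If $\hat\star_1$ and $\hat\star_2$ are tangentially equivalent tangential star products on $(\hat M, \hat\Omega)$, then the induced star products $\star_1$ and $\star_2$ on $(M, \omega)$ are equivalent.
Proof. The hypothesis implies that there exists an equivalence $\Phi \in DO_D(\hat M)[[\hbar]]$ such that $\Phi.(a\, \hat\star_1\, b) = \Phi.a\ \hat\star_2\ \Phi.b$ for all $a, b \in C^\infty(\hat M)[[\hbar]]$. We then check, as in the proof of Lemma 3.6, that the operator $\Psi \bmod \hbar^{n+1} := j_n^\hbar\Phi$, $n \in \mathbb{N}$, defines an equivalence between $\star_1$ and $\star_2$.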
4. Characteristic Classes
Let $\Omega^\hbar = \sum_{k\ge 0} \hbar^k \omega^k \in Z^2(M)[[\hbar]]$ be a formal power series of closed 2-forms on $M$. A refinement of the classical Borel lemma (see the appendix) yields
Lemma 4.1. Let $\Omega^\hbar_i \in Z^2(M)[[\hbar]]$ ($i = 1, 2$). Assume that $[\Omega^\hbar_1] = [\Omega^\hbar_2]$ in $H^2(M)[[\hbar]]$ or, equivalently, that there exists $\nu^\hbar \in \Omega^1(M)[[\hbar]]$ such that $\Omega^\hbar_2 - \Omega^\hbar_1 = d\nu^\hbar$. Then there exist smooth functions $\Omega_i \in C^\infty(]-\epsilon,\epsilon[, \Omega^2(M))$ and $\nu \in C^\infty(]-\epsilon,\epsilon[, \Omega^1(M))$ such that
(i) $\frac{1}{k!} \frac{d^k}{dt^k} \Omega_i |_{t=0} = \omega_i^k$;
(ii) $\forall t$, $\Omega_i(t)$ is symplectic;
(iii) $\forall t$, $\Omega_2(t) - \Omega_1(t) = d(\nu(t))$ or, equivalently, $[\Omega_1(t)] = [\Omega_2(t)]$.
Definition 4.2. Let us fix a connection $\nabla^0$ in the vector bundle $D \to \hat M = M \times ]-\epsilon,\epsilon[$. Let $\Omega^\hbar \in \Omega^2(M)[[\hbar]]$ be a series of closed 2-forms on $M$ such that $\Omega^\hbar \bmod \hbar = \omega$. Let $\Omega \in C^\infty(]-\epsilon,\epsilon[, \Omega^2(M))$ be a smooth family of symplectic structures on $M$ admitting $\Omega^\hbar$ as $\infty$-jet (cf. Lemma 4.1). Let $\nabla$ be the symplectic foliated connection on $\hat M$ obtained from the data of $\nabla^0$ and $\Omega$ (cf. Sec. 2). Let $\hat\star$ be the Moyal–Fedosov star product on $(\hat M, \hat\Omega)$ associated to $\nabla$. The star product $\star_{\Omega^\hbar}$ on $(M, \omega)$ induced by $\hat\star$ will be called the star product associated to the series $\Omega^\hbar$.
Proposition 4.3. Let $\Omega^\hbar_i$ ($i = 1, 2$) be two series of closed 2-forms on $M$ such that $\Omega^\hbar_i \bmod \hbar = \omega$. Denote by $\star_i$ ($i = 1, 2$) the associated star products on $(M, \omega)$. Then $\star_1$ and $\star_2$ are equivalent star products if and only if $[\Omega^\hbar_1] = [\Omega^\hbar_2]$ in $H^2_{\text{de Rham}}[[\hbar]]$.
The proof of Proposition 4.3 is postponed to the end of this section.
Definition 4.4. A diffeomorphism $\hat\varphi : \hat M \to \hat M$ preserves the foliation if
(i) $\hat\varphi(M_t) \subset M_t$ $\forall t$ and
(ii) $\hat\varphi|_{M_0} = \mathrm{id}_{M_0}$.
We first adapt Moser’s lemma to our parametric situation.
Lemma 4.5. Let $\{\Omega_i(t)\}_{t\in]-\epsilon,\epsilon[}$ ($i = 1, 2$) be two smooth families of symplectic structures on $M$ such that $\Omega_1(0) = \Omega_2(0) = \omega$. Assume that, for all $t \in ]-\epsilon,\epsilon[$, they have the same de Rham class: $[\Omega_1(t)] = [\Omega_2(t)]$ in $H^2(M)$. Then there exists a Poisson diffeomorphism $\hat\varphi : (\hat M, \hat\Omega_2) \to (\hat M, \hat\Omega_1)$ which preserves the foliation.
Proof. By Hodge’s theory one has that $\Omega_1(t) - \Omega_2(t) = d\nu^t$ where $\nu^t \in \Omega^1(M)$ is smooth in $t$. Set $\omega^t_s = \Omega_2(t) + s\, d\nu^t$, $s \in [0, 1]$. The form $\omega^0_s = \omega$ is symplectic on $M$ for all $s \in [0, 1]$; hence by compactness, one can choose $\epsilon > 0$ such that $\omega^t_s$ is symplectic for all $t \in ]-\epsilon,\epsilon[$ and $s \in [0, 1]$.
Consider $N = M \times [0, 1]$ endowed with the natural foliation $F = \{M \times \{s\}\}$. Define the following smooth families of 2-forms on $N$:
\[ (\tilde\omega_t)_{(x,s)} := (\omega^t_s)_x \quad \text{and} \quad (\omega_t)_{(x,s)} := (\tilde\omega_t)_{(x,s)} - (\nu^t)_x \wedge ds . \]
Then $d_N(\omega_t) = d_N(\tilde\omega_t) - d_M(\nu^t) \wedge ds = 0$ for all $t$. Moreover, $\mathrm{rad}_{T_{(x,s)}(N)}(\omega_t)$ is not entirely contained in $T(F)$; hence one can find a smooth family of vector fields of the form $X_t = \frac{\partial}{\partial s} + Y_t$ ($Y_t \in \Gamma(T(F))$) generating the smooth family of smooth distributions $\mathrm{rad}(\omega_t)$. One has therefore
\[ L_{X_t}\omega_t = d(i_{X_t}\omega_t) + i_{X_t} d\omega_t = 0 . \]
Denoting by $\{\varphi^u_{X_t}\}$ the flow of $X_t$, one has
\[ (\varphi^u_{X_t})^\star \omega_t = \omega_t \quad \text{and} \quad \varphi^u_{X_t}(M \times \{s\}) = M \times \{s + u\} . \]
One then gets a smooth family $\{\varphi_t\}$ of diffeomorphisms of $M$ defined by $\varphi^1_{X_t} \circ i_0 = i_1 \circ \varphi_t$ such that $\varphi_t^\star(\Omega_1(t)) = \Omega_2(t)$ ($i_s : M \to N$ denotes the natural inclusion $i_s(x) = (x, s)$). Shrinking $\epsilon$ once more if necessary, one gets the desired Poisson map by setting $\hat\varphi(x, t) = (\varphi_t(x), t)$. Observe that $X_0 = \partial_s$, hence $\varphi_0 = \mathrm{id}_M$.
Lemma 4.6. Let $\hat\star_i$ ($i = 1, 2$) be tangential star products on $\hat M$. Suppose there exists a diffeomorphism $\hat\varphi : \hat M \to \hat M$ preserving the foliation such that $\hat\star_1^{\hat\varphi} = \hat\star_2 \bmod (\hbar^n)$. Then $\star_1$ and $\star_2$ are equivalent star products up to order $n$.
Proof. The right action of the diffeomorphism group, $C^\infty(\hat M) \times \mathrm{Diff}(\hat M) \xrightarrow{\rho} C^\infty(\hat M)$, $\rho(\hat\varphi) u = \hat\varphi^\star u$, yields a map
\[ \rho^\hbar_n : \mathrm{Diff}(\hat M) \to \mathrm{Hom}_{\mathbb{R}}(C^\infty(M), C^\infty(M)[[\hbar]]_n) : \quad \rho^\hbar_n(\hat\varphi) f = j_n^\hbar(\hat\varphi^\star \hat f) . \]
Definition 4.4 implies that if $\hat\varphi$ preserves the foliation, then $\rho^\hbar_n(\hat\varphi) \in DO(M)[[\hbar]]_n$ and $\rho^\hbar_0(\hat\varphi) = \mathrm{id}$. Therefore an argument similar to the one used for Lemma 3.6 yields the conclusion.
Corollary 4.7. With the notations of Proposition 4.3, if $\Omega^\hbar_1$ and $\Omega^\hbar_2$ are cohomologous in $H^2(M)[[\hbar]]$, then the star products $\star_1$ and $\star_2$ are equivalent.
Proof. The first $N$ cochains of a Fedosov star product are entirely determined by the $N$ first terms of its Weyl curvature. Therefore, the above lemmas imply that $\star_1$ and $\star_2$ are equivalent up to any order. It is then classical that they are equivalent [1].
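Returning briefly to the proof of Lemma 4.5, the two facts used there (closedness of $\omega_t$ and the form of the generating vector field) can be verified as follows. This is a sketch in the notation of that proof; it assumes that $\nu^t$ is regarded as a form on $N = M \times [0, 1]$ independent of $s$, and that the radical of $\omega_t$ is cut out by the condition $i_{X_t}\omega_t = 0$.

```latex
\[
\begin{aligned}
d_N \tilde\omega_t &= d_N\big( \Omega_2(t) + s\, d_M \nu^t \big) = ds \wedge d_M \nu^t, \qquad
d_N(\nu^t \wedge ds) = d_M \nu^t \wedge ds = ds \wedge d_M \nu^t, \\
d_N \omega_t &= d_N \tilde\omega_t - d_N(\nu^t \wedge ds) = 0, \\
i_{X_t} \omega_t &= i_{Y_t} \omega^t_s + \nu^t - \nu^t(Y_t)\, ds
 \qquad \text{for } X_t = \partial_s + Y_t,\ Y_t \in \Gamma(T(F)) .
\end{aligned}
\]
```

Hence $i_{X_t}\omega_t = 0$ as soon as $i_{Y_t}\omega^t_s = -\nu^t$, which determines $Y_t$ uniquely because $\omega^t_s$ is symplectic; the remaining term then vanishes, since $\nu^t(Y_t) = -\omega^t_s(Y_t, Y_t) = 0$. This is the vector field of the classical Moser argument.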
Proof of Proposition 4.3. We first consider a particular case. Let $\alpha^\hbar = \alpha_o + \hbar\alpha_1 + \cdots \in Z^2(M)[[\hbar]]$ be a sequence of closed 2-forms on $M$. Set $\Omega'^\hbar = \Omega^\hbar + \hbar^k \alpha^\hbar$. Denote by $\Omega^t$, $\alpha^t$ and $\Omega'^t = \Omega^t + t^k \alpha^t$ respectively the smooth functions associated to the series $\Omega^\hbar$, $\alpha^\hbar$ and $\Omega'^\hbar$ as in Lemma 4.1. The functions $\Omega^t$ and $\Omega'^t$ define two different Poisson structures on $\hat M$. We denote by $\Lambda^t$ and $\Lambda'^t$ (resp. $\omega^t$ and $\omega'^t$) the corresponding bivector fields (resp. $D$-2-forms). One has
\[ \omega'^t = \omega^t + t^k \alpha^t \quad \text{and} \quad \Lambda'^t = \Lambda^t - t^k\, {}^\sharp\alpha_o + t^{k+1} \lambda , \tag{4.1} \]
where we denote again by $\alpha^t$ the $D$-2-form corresponding to $\alpha^t$, where $\sharp$ is the musical isomorphism between $D^*$ and $D$ induced by $\Omega^t$ and where $\lambda \in C^\infty(]-\epsilon,\epsilon[, \Gamma(\wedge^2 D))$. Let $\nabla$ be the symplectic foliated connection on $(\hat M, \Lambda^t)$ obtained from the data of $\nabla^0$ (cf. Definition 4.2). Let $\star'$ be the star-product on $M$ induced by the Moyal–Fedosov star-product $\hat\star'$ on $(\hat M, \Lambda'^t)$. We now define a specific foliated symplectic connection $\nabla'$ adapted to $\omega'^t$. Let us look for $\nabla'$ of the form $\nabla + S$ where $S$ is a symmetric 2-$D$-tensor field. We set
\[ \omega'^t(\nabla'_u v, w) = \omega'^t(\nabla_u v, w) + \frac13 (\nabla_u \omega'^t)(v, w) + \frac13 (\nabla_v \omega'^t)(u, w) . \]
This leads to $(\omega^t + t^k \alpha^t)(S(u, v), w) = \frac{t^k}{3} [(\nabla_u \alpha^t)(v, w) + (\nabla_v \alpha^t)(u, w)]$ as $\nabla\omega^t = 0$. By construction $\omega^t + t^k \alpha^t$ is invertible, so $S(u, v)$ is completely determined and of the form $S(u, v) = t^k s(u, v)$. We thus have
\[ \nabla' = \nabla + t^k s . \tag{4.2} \]
Let now $\circ_t$ (resp. $\circ'_t$) be the associative product on the sections of the Weyl bundle $W$ over $\hat M$ determined by the data of $\Lambda$ (resp. $\Lambda'$) (cf. Sec. 2 and Remark 2.6). By construction, we then get $\forall u, v \in W$,
\[ \frac{d^l}{dt^l} (u \circ_t v - u \circ'_t v)(0) = 0 \quad \forall\, l \le k - 1 . \tag{4.3} \]
Similarly for the Moyal–Fedosov star products $\hat\star$ and $\hat\star'$, associated to $(\Omega, \nabla)$ and $(\Omega', \nabla')$, Eqs. (4.2) and (4.3) yield
\[ \frac{d^l}{dt^l} (a\, \hat\star\, b - a\, \hat\star'\, b)(0) = 0 \quad \forall\, l \le k - 1 . \tag{4.4} \]
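Now let us see what happens for $\star$ and $\star'$. Let $f, g \in C^\infty(M)$ and write $\hat\star = \sum_{i\ge 0} \hbar^i \hat C_i$ and $\hat\star' = \sum_{i\ge 0} \hbar^i \hat C'_i$. Setting $u^{(l)} := \frac{d^l}{dt^l} u$, we have
\[
\begin{aligned}
f \star g - f \star' g
 &= \sum_{j\ge 0} \frac{\hbar^j}{j!} (f\, \hat\star\, g - f\, \hat\star'\, g)^{(j)}(0) \\
 &= \sum_{j\ge k} \frac{\hbar^j}{j!} (f\, \hat\star\, g - f\, \hat\star'\, g)^{(j)}(0) \qquad \text{(cf. Eq. (4.4))} \\
 &= \sum_{j\ge k} \frac{\hbar^j}{j!} \Big( \sum_{i\ge 0} \hbar^i (\hat C_i(f, g) - \hat C'_i(f, g)) \Big)^{(j)}(0) \\
 &= \sum_{j\ge k,\, i\ge 0} \frac{\hbar^{i+j}}{j!} (\hat C_i(f, g) - \hat C'_i(f, g))^{(j)}(0) \\
 &= \sum_{m\ge k} \hbar^m \sum_{m=i+j,\, j\ge k,\, i\ge 0} \frac{1}{j!} (\hat C_i(f, g) - \hat C'_i(f, g))^{(j)}(0) \\
 &= \frac{\hbar^k}{k!} (fg - gf) + \frac{\hbar^{k+1}}{(k+1)!} (fg - gf) + \frac{\hbar^{k+1}}{k!} (\hat C_1(f, g) - \hat C'_1(f, g))^{(k)}(0) \\
 &\quad + \sum_{m\ge k+2} \hbar^m \sum_{m=i+j,\, j\ge k,\, i\ge 0} \frac{1}{j!} (\hat C_i(f, g) - \hat C'_i(f, g))^{(j)}(0) \\
 &= \frac{\hbar^{k+1}}{k!} (\Lambda^t(df, dg) - \Lambda'^t(df, dg))^{(k)}(0) + o(\hbar^{k+1}) \\
 &= \frac{\hbar^{k+1}}{k!} (\Lambda^t(df, dg) - \Lambda^t(df, dg) + t^k\, {}^\sharp\alpha_o(df, dg) - t^{k+1} \lambda(df, dg))^{(k)}(0) + o(\hbar^{k+1}) \\
 &= \hbar^{k+1}\, {}^\sharp\alpha_o(df, dg) + o(\hbar^{k+1}) .
\end{aligned}
\]
Then, setting $\star = \sum_{i\ge 0} \hbar^i C_i$ and $\star' = \sum_{i\ge 0} \hbar^i C'_i$, we have
\[ C'_i = C_i , \quad i = 0, \ldots, k \quad \text{and} \quad C'_{k+1} = C_{k+1} + {}^\sharp\alpha_o . \tag{4.5} \]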
Let us pass to the general case. Suppose that $[\Omega^\hbar_1] \ne [\Omega^\hbar_2]$. We denote by $k$ the smallest integer such that $[\omega_1^k] \ne [\omega_2^k]$. Let us consider
\[ \Omega^\hbar_3 = \hbar\omega_1^1 + \hbar^2\omega_1^2 + \cdots + \hbar^{k-1}\omega_1^{k-1} + \hbar^k\omega_2^k + \hbar^{k+1}\omega_2^{k+1} + \cdots . \]
We have $[\Omega^\hbar_3] = [\Omega^\hbar_2]$ and $\Omega^\hbar_1 = \Omega^\hbar_3 + \hbar^k(\omega_1^k - \omega_2^k) + \hbar^{k+1}\cdots$. Denoting by $\star_i$ the product associated with $\Omega^\hbar_i$, we know that $\star_2$ and $\star_3$ are equivalent. What has been done previously implies $\star_1 = \star_3 \bmod \hbar^{k+1}$ and $C^{(1)}_{k+1} = C^{(3)}_{k+1} \pm {}^\sharp\alpha_o$ with $\alpha_o = \omega_1^k - \omega_2^k$. But in this case, we know that $\star_1 \sim \star_3 \bmod \hbar^{k+2}$ if and only if $\alpha_o$ is exact [3]. Since $\omega_1^k - \omega_2^k$ is not exact by hypothesis, $\star_1 \not\sim \star_3$ and thus $\star_1 \not\sim \star_2$.
Remark 4.8. An alternative construction would be to directly consider formal deformations of the Weyl algebra bundle and related structures based on the preliminary data of a formal deformation of the symplectic structure (as opposed to a smooth deformation as considered here). This direction allows one to treat the case of non-compact symplectic manifolds as well. Indeed, one could either observe that the completeness of the vector field occurring in the proof of Lemma 4.5 is not necessary in the formal category, or design a formal version of Moser’s argument.
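In the present article we chose to remain in the smooth category to underline the link with the classical Moser’s lemma.
5. Appendix: Borel’s Lemma
We did not find in the literature a suitable reference for a “Borel’s lemma” applying in our framework; we therefore establish such a result here. For a classical statement of this lemma see [11].
Proposition 5.1 (Borel’s Lemma). Let $E$ be a Fréchet space and $\{\alpha_n \in E \mid n \in \mathbb{N}\}$ be a sequence in $E$. Then there exists $f \in C^\infty(\mathbb{R}, E)$ such that $f^{(n)}(0) = \alpha_n$.
Proof. Let $\varphi \in C^\infty(\mathbb{R}, \mathbb{R})$ be nonnegative, such that $\varphi(t) = 1$ for $|t| \le \frac12$ and $\varphi(t) = 0$ for $|t| \ge 1$, and define $f_n \in C^\infty(\mathbb{R}, E)$ by $f_n(t) = \frac{\alpha_n}{n!}\, t^n \varphi(\lambda_n t)$, where the numbers $\{\lambda_n\}$ ($\lambda_n \ge 1$) will be defined later. As $E$ is a Fréchet space there exists $\{p_0, \ldots, p_r, \ldots\}$ a nondecreasing countable basis of continuous seminorms on $E$ [14].
Lemma 5.2. For all $n \in \mathbb{N}$ one can find $\lambda_n \in \mathbb{R}$ such that
\[ \sup_{t\in\mathbb{R}} p_{n-1}(f_n^{(k)}(t)) \le \frac{1}{2^n} \quad \forall\, k \in \mathbb{N} \text{ s.t. } 0 \le k \le n - 1 . \]
Proof. (Lemma 5.2) Let us define $K_n = p_{n-1}(\alpha_n)$ and $M_n = \sum_{j=1}^{n} \sup_{t\in\mathbb{R}} |\varphi^{(j)}(t)|$. We have
\[
\begin{aligned}
p_{n-1}(f_n^{(k)}(t))
 &= p_{n-1}\Big( \sum_{p=0}^{k} \binom{k}{p} \frac{n!}{(n-p)!} \frac{\alpha_n}{n!}\, t^{n-p} \lambda_n^{k-p} \varphi^{(k-p)}(\lambda_n t) \Big) \\
 &\le \sum_{p=0}^{k} \binom{k}{p} \frac{p_{n-1}(\alpha_n)}{(n-p)!}\, |t|^{n-p} \lambda_n^{k-p} |\varphi^{(k-p)}(\lambda_n t)| \\
 &\le \sum_{p=0}^{k} \binom{k}{p} \frac{p_{n-1}(\alpha_n)}{(n-p)!}\, |t|^{n-p} \lambda_n^{n-p} |\varphi^{(k-p)}(\lambda_n t)|\, \frac{1}{\lambda_n^{n-k}} \\
 &\le \frac{K_n M_n}{\lambda_n^{n-k}} \sum_{p=0}^{k} \binom{k}{p} \frac{1}{(n-p)!}
  \;\le\; \frac{K_n M_n}{\lambda_n} \sum_{p=0}^{k} \binom{k}{p} \frac{1}{(n-p)!} \\
 &\le \frac{K_n M_n}{\lambda_n} \sum_{p=0}^{n-1} \binom{n-1}{p} \frac{1}{(n-p)!}
\end{aligned}
\]
with the appropriate justifications for the inequalities: on the support of $\varphi^{(k-p)}(\lambda_n t)$ we have $|\lambda_n t| \le 1$ and, as $n - k \ge 1$, we have $\lambda_n^{n-k} \ge \lambda_n$ if $\lambda_n \ge 1$, and $\binom{k}{p} \le \binom{n-1}{p}$.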
Thus a choice of the $\lambda_n$’s such that
\[ \lambda_n \ge \max\Big\{ 1,\ 2^n K_n M_n \sum_{p=0}^{n-1} \binom{n-1}{p} \frac{1}{(n-p)!} \Big\} \]
yields the assertion.
Lemma 5.3. For the preceding choice of the $\lambda_n$’s, $\sum_{n\ge 0} f_n$ is convergent in $C^\infty(\mathbb{R}, E)$.
Proof. (Lemma 5.3) For the details about the topology of $C^\infty(\mathbb{R}, E)$ we refer to [14, Chap. 20]. The set $\{P_{r,k,m} \mid (r, k, m) \in \mathbb{N}^3\}$ with $P_{r,k,m}(g) := \sup_{t\in[-m,m]} p_r(g^{(k)}(t))$ forms a basis of seminorms for the (Fréchet) topology of $C^\infty(\mathbb{R}, E)$. If we show that, $\forall (r, k, m) \in \mathbb{N}^3$, $\sum_{n\ge 0} P_{r,k,m}(f_n)$ converges as a real series, the lemma is proved. As the $f_n$’s are compactly supported, it is sufficient to prove that, $\forall (k, r) \in \mathbb{N}^2$, $\sum_{n\ge 0} \sup_{t\in\mathbb{R}} p_r(f_n^{(k)}(t))$ converges as a real series. So let us fix $(k, r) \in \mathbb{N}^2$ and define $s = \max\{k, r\}$. We have
\[ \sum_{n\ge 0} \sup_{t\in\mathbb{R}} p_r(f_n^{(k)}(t)) = \sum_{n=0}^{s} \sup_{t\in\mathbb{R}} p_r(f_n^{(k)}(t)) + \sum_{n\ge s+1} \sup_{t\in\mathbb{R}} p_r(f_n^{(k)}(t)) . \]
As $f_n \in C^\infty(\mathbb{R}, E)$ for each $n$ there is no problem for the first (finite) sum. For the second one we have $\sup_{t\in\mathbb{R}} p_r(f_n^{(k)}(t)) \le \sup_{t\in\mathbb{R}} p_{n-1}(f_n^{(k)}(t)) \le \frac{1}{2^n}$ since, first, $r \le s \le n - 1$ and the countable basis of seminorms is nondecreasing and, secondly, $k \le s \le n - 1$ and we can apply Lemma 5.2. Thus it is convergent.
So, by the two lemmas we have constructed $f = \sum_{n=0}^{\infty} f_n \in C^\infty(\mathbb{R}, E)$. Let us finally see that this function fulfills the desired property. We have
\[ f^{(k)}(t) = \sum_{n=0}^{k} \frac{\alpha_n}{n!} \big( t^n \varphi(\lambda_n t) \big)^{(k)} + \sum_{n=k+1}^{\infty} \sum_{p=0}^{k} \binom{k}{p} \frac{n!}{(n-p)!} \frac{\alpha_n}{n!}\, t^{n-p} \lambda_n^{k-p} \varphi^{(k-p)}(\lambda_n t) . \]
In the second sum, we have $n - p \ge n - k \ge 1$. Thus it vanishes for $t = 0$. In the first sum, if $n \le k - 1$, $\varphi$ is differentiated at least once. As $\varphi^{(j)}(0) = 0$ for $j \ge 1$, it vanishes for $t = 0$. Therefore, we have
\[ f^{(k)}(0) = \frac{\alpha_k}{k!} \big( t^k \varphi(\lambda_k t) \big)^{(k)}(0) = \frac{\alpha_k}{k!} \sum_{p=0}^{k} \binom{k}{p} \frac{k!}{(k-p)!}\, t^{k-p} \lambda_k^{k-p} \varphi^{(k-p)}(\lambda_k t) \Big|_{t=0} . \]
For $p \le k - 1$, we have $k - p \ge 1$ and the corresponding term vanishes. Hence $f^{(k)}(0) = \frac{\alpha_k}{k!}\, k!\, \varphi(0) = \alpha_k$.
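A scalar example illustrates why the cut-off factors $\varphi(\lambda_n t)$ are needed in the construction above. The choice $E = \mathbb{R}$ (with $p_r = |\cdot|$ for all $r$) and $\alpha_n = (n!)^2$ is a hypothetical one made only for illustration.

```latex
% E = R, prescribed derivatives \alpha_n = (n!)^2
\[
\sum_{n\ge 0} \frac{\alpha_n}{n!}\, t^n = \sum_{n\ge 0} n!\, t^n
\quad \text{diverges for every } t \ne 0,
\qquad \text{whereas} \qquad
f(t) = \sum_{n\ge 0} n!\, t^n\, \varphi(\lambda_n t)
\quad \text{is smooth with } f^{(n)}(0) = (n!)^2 .
\]
```

With the $\lambda_n$ chosen as in Lemma 5.2, the second series converges in $C^\infty(\mathbb{R}, \mathbb{R})$ by Lemma 5.3. In this example the bound of Lemma 5.2 forces $\lambda_n \to \infty$, so for each fixed $t \ne 0$ only the finitely many terms with $|\lambda_n t| < 1$ are nonzero.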
Corollary 5.4. Let $N$ be a smooth manifold and $\{\alpha_n \in \Omega^q(N) \mid n \in \mathbb{N}\}$ be a sequence in $\Omega^q(N)$. Then there exists $f \in C^\infty(\mathbb{R}, \Omega^q(N))$ such that $f^{(n)}(0) = \alpha_n$.
Proof. As $\Omega^q(N)$ is a Fréchet space [6], it is a straightforward application of the preceding proposition.
Corollary 5.5. Let $(\alpha^1_n)_{n\in\mathbb{N}}$, $(\alpha^2_n)_{n\in\mathbb{N}}$ and $(\nu_n)_{n\in\mathbb{N}}$ be sequences of forms on a smooth manifold $N$. Then there exist smooth functions $f^1$, $f^2$ and $f$ corresponding respectively to $(\alpha^1_n)_{n\in\mathbb{N}}$, $(\alpha^2_n)_{n\in\mathbb{N}}$ and $(\nu_n)_{n\in\mathbb{N}}$ as in Corollary 5.4 such that
(1) If $d\alpha^1_n = 0$, $\forall n \in \mathbb{N}$, then $d(f^1(t)) = 0$, $\forall t \in \mathbb{R}$.
(2) If $\alpha^2_n - \alpha^1_n = d\nu_n$ $\forall n \in \mathbb{N}$, then $f^2(t) - f^1(t) = d(f(t))$, $\forall t \in \mathbb{R}$.
Proof. (1) We have $f^1_n(t) = \frac{\alpha^1_n}{n!}\, t^n \varphi(\lambda_n t)$, hence $d(f^1_n(t)) = 0$, $\forall t \in \mathbb{R}$, and for each $t$, $f^1(t) = \sum_{n=0}^{\infty} f^1_n(t)$ converges in $\Gamma^1(N, \wedge^q T^*N)$.
(2) Let $\lambda^1_n$, $\lambda^2_n$ and $\lambda_n$ be three real sequences defining smooth functions $\tilde f^1$, $\tilde f^2$ and $\tilde f$ corresponding respectively to $(\alpha^1_n)_{n\in\mathbb{N}}$, $(\alpha^2_n)_{n\in\mathbb{N}}$ and $(\nu_n)_{n\in\mathbb{N}}$ as in the proof of Proposition 5.1. Consider the sequence $\mu_n = \max\{\lambda^1_n, \lambda^2_n, \lambda_n\}$. Replacing $\lambda^1_n$, $\lambda^2_n$ and $\lambda_n$ by $\mu_n$ we get new functions $f^1$, $f^2$ and $f$ again corresponding respectively to $(\alpha^1_n)_{n\in\mathbb{N}}$, $(\alpha^2_n)_{n\in\mathbb{N}}$ and $(\nu_n)_{n\in\mathbb{N}}$ such that $f^2_n - f^1_n = df_n$ $\forall n \in \mathbb{N}$. Since for $t$ fixed the series converges in $\Gamma^0(N, \wedge^q T^*N)$, we obtain the result.
Acknowledgment
We warmly thank the referee for several improvements of the manuscript and interesting suggestions.
References
[1] F. Bayen, M. Flato, C. Fronsdal, A. Lichnerowicz and D. Sternheimer, Deformation theory and quantization. I. Deformations of symplectic structures, Ann. Phys. 111(1) (1978), 61–110.
[2] F. Bayen, M. Flato, C. Fronsdal, A. Lichnerowicz and D. Sternheimer, Deformation theory and quantization. II. Physical applications, Ann. Phys. 111(1) (1978), 111–151.
[3] M. Bertelson, M. Cahen and S. Gutt, Equivalence of star products, Class. Quantum Grav. 14(1A) (1997), A93–A107.
[4] N. Bourbaki, Éléments de mathématique. Variétés différentielles et analytiques. Fascicule de résultats (Paragraphes 1 à 15), Masson, 1983.
[5] M. Cahen, S. Gutt and M. De Wilde, Local cohomology of the algebra of $C^\infty$ functions on a connected manifold, Lett. Math. Phys. 4(3) (1990), 157–167.
[6] J. Dieudonné, Éléments d’analyse, Tome III, Chap. XVI et XVII, Gauthier-Villars, 1970.
[7] B. V. Fedosov, A simple geometrical construction of deformation quantization, J. Diff. Geom. 40(2) (1994), 213–238.
[8] B. V. Fedosov, Deformation quantization and index theory, Akademie Verlag, Berlin, 1996.
[9] M. Kontsevich, Deformation quantization of Poisson manifolds I. (preprint math.QA/9709040) 1997.
[10] J. E. Moyal, Quantum mechanics as a statistical theory, Proc. Camb. Philos. Soc. 45 (1949), 99–124.
[11] R. Narasimhan, Analysis on real and complex manifolds (Advanced Studies in Pure Mathematics, Vol. 1), Paris: Masson et Cie, Editeur; Amsterdam: North-Holland Publishing Company, 1968.
[12] J. Peetre, Une caracterisation abstraite des operateurs differentiels, Math. Scand. 7 (1959), 211–218.
[13] J. Peetre, Rectification a l'article "Une caracterisation abstraite des operateurs differentiels", Math. Scand. 8 (1960), 116–120.
[14] F. Trèves, Topological vector spaces, distributions and kernels, Academic Press, 1967.
[15] I. Vaisman, Lectures on the geometry of Poisson manifolds, Birkhäuser Verlag, Basel, 1994.
[16] H. Weyl, Gruppentheorie und quantenmechanik, Z. Physik, 1927.
[17] H. Weyl, The theory of groups and quantum mechanics, Dover, 1931.
[18] E. P. Wigner, On the quantum correction for thermodynamic equilibrium, Phys. Rev., II. Ser. 40 (1932), 749–759.
May 26, 2003 12:17 WSPC/148-RMP
00163
Reviews in Mathematical Physics Vol. 15, No. 3 (2003) 217–243 c World Scientific Publishing Company
DECOHERENCE INDUCED TRANSITION FROM QUANTUM TO CLASSICAL DYNAMICS
PH. BLANCHARD and R. OLKIEWICZ∗
Physics Faculty and BiBoS, University of Bielefeld, 33615 Bielefeld, Germany
∗Institute of Theoretical Physics, University of Wroclaw, 50-204 Wroclaw, Poland
Received 28 May 2002
Revised 19 November 2002

A framework for a general discussion of environmentally induced classical properties, such as superselection rules, privileged bases and classical behavior, in quantum systems with both finitely and infinitely many degrees of freedom is proposed. A number of examples are given showing that classical properties do not have to be postulated as an independent ingredient. In particular, it is shown that infinite open quantum systems may in some cases behave like simple classical dynamical systems.

Keywords: Quantum open systems; decoherence; dynamical semigroups; superselection rules; classical behavior.
1. Introduction

Quantum mechanics is usually thought of as a generalization of classical mechanics in which commutation relations are imposed on dynamical variables. This might suggest that one needs a full deterministic theory first, and should then apply to it a recipe called quantization to get a more fundamental theory. Such a procedure has great heuristic value and has been used in many concrete cases. However, there is no fundamental reason for such a way of reasoning. Why can quantum theory not be completely formulated without regard to an underlying classical picture, all the more since some observables seem to possess a genuine quantum character without classical counterparts? Therefore, it is much more natural to consider quantum systems as primary objects and try to derive classical properties, like superselection operators, pointer states, and, at the extreme, emergence of classical dynamical systems, from quantum theory.

The origin of deterministic laws that govern the classical domain of our everyday experience has attracted much attention in recent years. For example, the question in which asymptotic regime non-relativistic quantum mechanics reduces to its ancestor, i.e. Hamiltonian mechanics, was addressed in [19, 20]. It was shown there that for very many bosons with weak two-body interactions there is a class of states for which time evolution of expectation values of certain operators in these
states is approximately described by a nonlinear Hartree equation. The problem of under what circumstances such an equation reduces to the Newtonian mechanics of point particles was also discussed. A program of deriving irreversible transport equations for macroscopic quantum systems was also carried out. For example, in [17] the time evolution of a spinless quantum particle moving in a Gaussian random environment was discussed. It was shown there that in the weak coupling limit the Wigner distribution of a wave function converges globally in time to a solution of the linear Boltzmann equation. The connection between the reversible dynamics of classical macroscopic observables of infinite mean-field quantum systems and a Hamiltonian flow on a generalized phase space was described in [10, 44, 45]. As was shown in [27], a collective dynamical behavior of a system consisting of infinitely many two-level atoms leads to a flow on the classical phase space of the atoms and this results in periodic time dependence of the asymptotic states. Finally, the classical $\hbar \to 0$ limit for quantum mechanical correlation functions of systems with both finitely and infinitely many degrees of freedom was discussed in [26].

A different point of view was taken in a seminal paper by Gell-Mann and Hartle [21]. They gave a thorough analysis of the role of decoherence in the derivation of phenomenological classical equations of motion. Various forms of decoherence (weak, strong) and realistic mechanisms for the emergence of various degrees of classicality were also presented. Since quantum interferences are damped in the presence of an environment, one may hope that the classical $\hbar \to 0$ limit for quantum dissipative dynamics may exist for arbitrarily large times. Such a problem was discussed in [25].
In this work we adopt a different point of view and follow the idea of environmentally induced decoherence whose potential impact on behavior of quantum open systems was briefly described by Zeh: “All quasi-classical phenomena, even those representing reversible mechanics, are based on de facto irreversible decoherence” [48]. The main objective of the present paper is to provide an algebraic framework which will enable a general discussion of the environmentally induced decoherence and, as a consequence, the appearance of classical properties in quantum systems with both finite and infinite number of particles. It is worth noting that our approach is dynamical and so it constitutes an alternative way to the classical limit. A number of examples showing that classical concepts do not have to be presumed as an independent fundamental ingredient are also discussed. 2. Mathematical Description of Quantum Systems In order to describe a quantum system we apply the idea of Segal [40], and Haag and Kastler [24] that all relevant information about the system is contained in a certain abstract noncommutative algebra A. Thus, as a primary object of the mathematical formalism of quantum theory, we take the algebra generated by bounded
observables of the system equipped with a norm topology, a so-called $C^*$-algebra. It is believed that such an algebra reflects intrinsic properties of the corresponding quantum system. In this view Hilbert spaces play only a secondary role and they appear as representation spaces of the algebra. In general, $C^*$-algebras admit an uncountable number of unitarily inequivalent representations, most of which presumably have no physical interpretation. Only quantum systems with a finite number of degrees of freedom, due to the Stone–von Neumann uniqueness theorem, possess one (up to unitary equivalence) irreducible representation, the so-called Schrödinger representation. However, in order to enter the traditional framework of quantum theory, which postulates that with any physical system one can associate a definite Hilbert space with physical properties of the system expressed in terms involving only mathematical objects related to this Hilbert space, one has to select a subset of admissible states, which through the GNS construction would lead to physically meaningful structures.

Let us discuss this point more precisely. Suppose that $\phi$ is a faithful state on a $C^*$-algebra $\mathcal{A}$ of a quantum system and let $\pi_\phi : \mathcal{A} \to B(\mathcal{H})$ be the corresponding (faithful) GNS representation [11]. Let $\mathcal{M}$ be the von Neumann algebra generated by $\pi_\phi(\mathcal{A})$, that is $\mathcal{M} = \pi_\phi(\mathcal{A})''$, the bicommutant of $\pi_\phi(\mathcal{A})$. $\mathcal{M}$ is sometimes called the algebra of contextual operators [38]. We argue now that if $\mathcal{M}$ has to describe a system with pure quantum character it has to be a factor, i.e. an algebra with a trivial center. By the pure quantum character we mean the following property of the system:

For any two distinct orthogonal pure states $|\psi_1\rangle, |\psi_2\rangle \in \mathcal{H}$, their superpositions should be physically distinguishable from the corresponding statistical mixtures.
In other words, there should exist at least one Hermitian operator $A$ in $\mathcal{M}$ such that the following expectation values are different,
\[ \langle \psi, A\psi \rangle \neq \mathrm{tr}(\rho A), \]
where $\psi = z_1\psi_1 + z_2\psi_2$, $|z_1|^2 + |z_2|^2 = 1$, $z_1 z_2 \neq 0$, and $\rho = |z_1|^2 |\psi_1\rangle\langle\psi_1| + |z_2|^2 |\psi_2\rangle\langle\psi_2|$. However, if the center of $\mathcal{M}$ is nontrivial we may take a central (different from zero and identity) projection $E$ and choose $|\psi_1\rangle \in E\mathcal{H}$ and $|\psi_2\rangle \in E^\perp\mathcal{H}$, where $E^\perp = 1 - E$. Then, for any $A = A^* \in \mathcal{M}$ one would have that $\langle \psi, A\psi \rangle$ coincides with $\mathrm{tr}(\rho A)$, and so the pure state $\psi$ and the statistical mixture $\rho$ would be physically indistinguishable. Therefore, in the following we assume that $\phi$ is a factor state. It is worth noting that a discussion concerning the coherent superposition of states and a complete classification of the coherence classes of states in factors were presented in [39].

Another constraint on the algebra $\mathcal{M}$ is expressed in the so-called Dirac's requirement. It states that there should exist at least one complete set of mutually compatible observables. Expressing this in the algebraic language we say that the commutant of $\mathcal{M}$ is Abelian or, equivalently, that there exists a maximal (in $B(\mathcal{H})$) Abelian algebra $\mathcal{C}$ contained in $\mathcal{M}$. The following observation is clear.
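Before stating it, the role of the center in the argument above can be checked by hand on $\mathbb{C}^2$. The sketch below is only an illustration under simplifying assumptions and is not part of the original argument: the algebra of diagonal $2 \times 2$ matrices stands in for an algebra with nontrivial center (with $E = |\psi_1\rangle\langle\psi_1|$ central), the full matrix algebra stands in for a factor, and the helper names `expect_pure` and `expect_mixture` are hypothetical.

```python
# Toy check on C^2; psi1, psi2 are the standard basis vectors.

def expect_pure(z1, z2, A):
    """<psi, A psi> for psi = z1*psi1 + z2*psi2 and a 2x2 matrix A."""
    psi = (z1, z2)
    return sum(psi[i].conjugate() * A[i][j] * psi[j]
               for i in range(2) for j in range(2))

def expect_mixture(z1, z2, A):
    """tr(rho A) for rho = |z1|^2 |psi1><psi1| + |z2|^2 |psi2><psi2|."""
    return abs(z1) ** 2 * A[0][0] + abs(z2) ** 2 * A[1][1]

# Equal-weight superposition, z1 * z2 != 0.
z1 = z2 = complex(2 ** -0.5)

diagonal = [[3.0, 0.0], [0.0, -1.0]]  # Hermitian, commutes with E
sigma_x = [[0.0, 1.0], [1.0, 0.0]]    # Hermitian, does not commute with E

# Difference between the pure-state and the mixture expectation values.
gap_diagonal = abs(expect_pure(z1, z2, diagonal) - expect_mixture(z1, z2, diagonal))
gap_sigma_x = abs(expect_pure(z1, z2, sigma_x) - expect_mixture(z1, z2, sigma_x))
```

Within the diagonal algebra the gap vanishes for every Hermitian element, so the superposition and the mixture cannot be told apart; an off-diagonal observable, available only in the full matrix algebra, separates them.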
Theorem 1. The postulate about pure quantum character of the system together with Dirac's requirement is true if and only if $\mathcal{M} = B(\mathcal{H})$, i.e. $\mathcal{M}$ is a type I factor.

It follows that Dirac's requirement is an additional condition, which specifies the type of factor representation of the algebra $\mathcal{A}$ and leads to the framework of standard quantum theory. Since we want to consider quantum systems in the thermodynamic limit, which are known to be represented by other types of factors, we drop Dirac's condition, keeping only the postulate about pure quantum character of the system.

Physical observables are Hermitian operators from the algebra $\mathcal{M}$, or, more generally, self-adjoint operators affiliated to $\mathcal{M}$. Generalizing the notion of a density matrix representing a mixture of states we say that statistical states of the system are represented by positive normal and normalized functionals on $\mathcal{M}$. The set of statistical states we denote by $\mathcal{D}$. Hence $\phi \in \mathcal{D}$ iff $\phi(A) \ge 0$ whenever $A \ge 0$, $\phi(1) = 1$, where $1$ is the identity operator, and $\phi$ is continuous in the σ-weak topology on $\mathcal{M}$ (see, for example, [11] for the definition of these terms). The linear space generated by $\mathcal{D}$ is called the predual space of $\mathcal{M}$ and denoted by $\mathcal{M}_*$. The connection between a Hermitian operator $A$ representing an observable and experimentally measured values of this observable, when the system is described by a statistical state $\phi$, is the following one. Suppose $\int \lambda\, dE(\lambda)$ is the spectral decomposition of $A$. The probability that the measured value is in an interval $[a, b]$ is given by $\phi(E[a, b])$, and so the expectation value of $A$ in the state $\phi$ equals $\langle A \rangle = \int \lambda\, d\phi(E(\lambda))$. Let us observe that $d\phi(E(\lambda))$ is a probability measure on $\sigma(A)$, the spectrum of $A$.

Let us now consider the dynamics of a quantum system. If a system is closed (conservative), then the time development of any observable is given by a continuous symmetry transformation, i.e.
$A \to A(t) = \alpha_t(A)$, where $\alpha_t$ is a σ-weakly continuous one parameter group of $*$-automorphisms of $\mathcal{M}$. If there exists an energy observable $H$ for the system, then the automorphisms $\alpha_t$ are inner, given by $\alpha_t(A) = e^{\frac{i}{\hbar}tH} A e^{-\frac{i}{\hbar}tH}$. However, if a system interacts with an environment, then its evolution becomes irreversible. In fact, although the whole system evolves unitarily according to the total Hamiltonian $H = H_S + H_E + H_I$, where the three parts represent respectively the system, environment and interaction Hamiltonians, the evolution of a system observable $A$ is given by
\[ T_t(A) = P_E\big(e^{\frac{i}{\hbar}tH}(A \otimes 1_E)e^{-\frac{i}{\hbar}tH}\big), \tag{1} \]
where $1_E$ is the identity operator in the Hilbert space of the environment, and $P_E$ denotes the conditional expectation onto the algebra $\mathcal{M}$ with respect to a reference state $\phi_E$ of the environment. Equivalently, we may define $T_t$ as the adjoint map to the operator $T_{t*} : \mathcal{M}_* \to \mathcal{M}_*$ given by
\[ T_{t*}(\phi) = \mathrm{Tr}_E\big(e^{-\frac{i}{\hbar}tH}(\phi \otimes \phi_E)e^{\frac{i}{\hbar}tH}\big), \tag{2} \]
where TrE denotes the partial trace with respect to the environmental variables. Tt being the composition of ∗ -automorphisms and a conditional expectation is a
family of maps (superoperators) which in general satisfies a complicated integro-differential equation describing an irreversible dynamics. For this reason we consider only the forward evolution, i.e. assume that $t \ge 0$. Nevertheless, some important properties of $T_t$ may be explicitly derived.

(a) For any observable $A \in \mathcal{M}$, the function $t \to T_t(A)$ is σ-weakly continuous.
(b) For all $t \ge 0$, the superoperators $T_t$ are completely positive, normal and unital. Moreover, $T_t$ are contractive in the operator norm, i.e. $\|T_t A\|_\infty \le \|A\|_\infty$.

In case when $T_t \circ T_s = T_{t+s}$, i.e. when the memory effect can be neglected, the family $\{T_t\}$ is called a quantum Markov semigroup. A general discussion of the limiting procedures, like the weak coupling limit and the singular coupling limit, which lead to the Markovian approximation, can be found in [1]. Let us point out, however, that many physical models possess an additional property, namely that there exists a faithful and normal state preserved by the evolution, a so-called equilibrium state. Generalizing this concept we assume that:

(c) There is a faithful, normal and semifinite weight $\omega_0$ on $\mathcal{M}$ such that $\omega_0 \circ T_t = \omega_0$ for all $t \ge 0$.

Roughly speaking, the passage from a state to a weight in the noncommutative framework corresponds to the replacement of a compact space with a probability measure by a locally compact space with a σ-finite measure. For a broad discussion of weights see, for example, [41].

Summing up this section: An open system with pure quantum character is described by a factor $\mathcal{M}$ acting on a separable Hilbert space of the system. The evolution of observables of the system is given by a family of superoperators $\{T_t\}_{t \ge 0}$ on $\mathcal{M}$ which satisfy conditions (a)–(c). Having described the framework for quantum systems we now turn to the Hilbert space description of classical dynamical systems.

3. Koopman's Formalism for Classical Systems

Everybody agrees that concepts of classical and quantum physics are opposite in many aspects.
Therefore, in order to demonstrate how quanta become classical, it is necessary to express them in one mathematical framework. Since, as was shown in the previous section, a natural language for quantum system is that of von Neumann algebras, we reformulate now the concept of a classical dynamical system in a similar way. The idea of using the same algebraic formalism for description of both quantum and classical mechanics was suggested in [2]. The use of the Koopman formalism together with the reverse procedure are essential for a rigorous analysis of the decoherence induced classical dynamical systems, see Definition 9 and Example 7. Suppose that M is a configuration space of a classical system. We assume that M is a locally compact metric space. A continuous evolution of the system is given by a (continuous) flow on M , i.e. a continuous mapping g : R × M → M such that gt : M → M is a homeomorphism for all t ∈ R, and t → gt is a group
homomorphism. The map $t \to g_t(x)$ is called a trajectory of a point $x \in M$. From the very definition, all trajectories are continuous. We assume also that there exists a σ-finite Borel measure $\mu_0$ on $M$, finite on compact sets, and such that $\mu_0(g_t^{-1}(B)) = \mu_0(B)$ for all $t \in \mathbb{R}$ and all $\mu_0$-finite Borel subsets $B \subset M$. In addition, we assume that $\int f\, d\mu_0 > 0$ whenever $f \ge 0$ and $f \neq 0$. The triple $(M, g_t, \mu_0)$ is called a (classical) topological dynamical system. The following result is clear.

Proposition 2. Suppose that $g_t$ is a flow on $M$. Then $\gamma_t : C_0(M) \to C_0(M)$, $\gamma_t(f)(x) = f(g_t x)$, is a strongly continuous one parameter group of $*$-automorphisms of $C_0(M)$, where $C_0(M)$ is the $C^*$-algebra of continuous functions on $M$ vanishing at infinity.

It follows that a dynamical system may be equivalently described by the triple $(C_0(M), \gamma_t, \phi_0)$, where $\phi_0$ is a $\gamma_t$-invariant weight on $C_0(M)$ determined by the measure $\mu_0$. If $M$ is compact and therefore $\mu_0$ is finite, we always assume that $\mu_0$ is a probability measure, which implies that $\phi_0$ is a state on $C(M)$. So far we have gone half of the way. What we really need is a Hilbert space representation of the system. Suppose $\mathcal{H} = L^2(M, \mu_0)$. There is a natural representation of the algebra $C_0(M)$ in $\mathcal{H}$ given by $\pi(f)\psi(x) = f(x)\psi(x)$. Let us define $\mathcal{A} = \pi(C_0(M))''$. Then $\mathcal{A}$ is the von Neumann algebra $L^\infty(M, \mu_0)$ of essentially bounded functions on $M$, acting in the Hilbert space $\mathcal{H}$. Moreover, $\gamma_t$ extends uniquely to a σ-weakly continuous group of $*$-automorphisms of $\mathcal{A}$, and $\mu_0$ determines a $\gamma_t$-invariant, faithful, normal and semifinite weight $\phi_0$ on $\mathcal{A}$. We call the triple $(\mathcal{A}, \gamma_t, \phi_0)$ a Hilbert space representation of the dynamical system $(M, g_t, \mu_0)$.

Let us now discuss the reverse procedure. Suppose we start with a triple $(\mathcal{A}, \gamma_t, \phi_0)$, where $\mathcal{A}$ is a commutative von Neumann algebra, $\gamma_t$ is a σ-weakly continuous group of $*$-automorphisms of $\mathcal{A}$, and $\phi_0$ is a $\gamma_t$-invariant, faithful, normal and semifinite weight on $\mathcal{A}$.
The problem of how one can determine the underlying topological space is not a trivial one since there are essentially two non-isomorphic examples of $\mathcal{A}$, namely the algebra of bounded sequences over a discrete set and the algebra of essentially bounded functions on the unit interval with respect to the Lebesgue measure. So if $L^\infty([0, 1], dx)$ and $L^\infty(S^3, \mu_0)$, where $S^3$ is a three-dimensional sphere and $\mu_0$ is a normalized rotationally invariant measure, and $L^\infty(\mathbb{R}^n, dx^n)$ are all isomorphic, how can we choose an appropriate space? To answer this question we propose the following reduction procedure. Let us start with an arbitrary representation of $\mathcal{A}$, say $\mathcal{A} = L^\infty(\Omega, \mu)$, $\Omega$ being a locally compact space, arising in a particular model in a natural way. It is obvious that the $C^*$-algebra $C_b(\Omega)$ of continuous and bounded functions on $\Omega$ is contained in $\mathcal{A}$. Let $\mathcal{A}_0$ be the maximal $C^*$-subalgebra in $C_b(\Omega)$ such that $\gamma_t : \mathcal{A}_0 \to \mathcal{A}_0$ and is strongly continuous on $\mathcal{A}_0$. Let us comment on this point. Let $X = (\mathcal{A}_i)_{i \in I}$ be the family of all unital $C^*$-subalgebras $\mathcal{A}_i \subset C_b(\Omega)$ such that $\gamma_t : \mathcal{A}_i \to \mathcal{A}_i$, and is strongly continuous on it. Then $(X, \subset)$ is a non-empty ordered set. Suppose that $(\mathcal{A}_j)_{j \in J}$ is a linearly ordered set (chain) of elements from $X$. Then $\mathcal{A}_J = \overline{\bigcup_{j \in J} \mathcal{A}_j} \in X$,
where the closure is taken in the sup-norm of $C_b(\Omega)$. In fact, $\mathcal{A}_J$ is a $C^*$-subalgebra of $C_b(\Omega)$. It is also clear that $\gamma_t : \bigcup_{j \in J} \mathcal{A}_j \to \bigcup_{j \in J} \mathcal{A}_j$ and is strongly continuous on it. Because $\gamma_t$ is contractive in the sup-norm, $\gamma_t : \mathcal{A}_J \to \mathcal{A}_J$. The strong continuity on $\mathcal{A}_J$ follows from the standard $\epsilon/3$ argument. Moreover, for any $j \in J$, $\mathcal{A}_j \subset \mathcal{A}_J$. Hence, by the Kuratowski–Zorn lemma, there exists a maximal element in $X$.

By the Gelfand construction [11], $\mathcal{A}_0$ is isomorphic with $C(M)$, where $M$ is a compact Hausdorff space, the spectrum of $\mathcal{A}_0$. In order to avoid pathological situations we assume that the topology on $M$ is metrizable, i.e. given by a metric on $M$. This property would be ensured if we additionally assumed that the spectra of all $\mathcal{A}_i \in X$ are metrizable. Because $\mathcal{A}_J$ is the direct limit of unital commutative $C^*$-algebras, its spectrum is the inverse limit of the spectra of $\mathcal{A}_j$, $j \in J$, and hence would be metrizable. Thus $M$ would also be metrizable. If $\phi_0$ is a state, then we choose $M$ as the space of the system. Next we define a probability Borel measure $\mu_0$ on $M$ by the formula
\[ \int_M \hat f(x)\, d\mu_0(x) = \phi_0(f), \]
where $\hat f \in C(M)$ is associated with $f \in \mathcal{A}_0$ by the Gelfand isomorphism. The corresponding group of automorphisms of $C(M)$ we denote by $\hat\gamma_t$. It is worth pointing out that $\hat\gamma_t$ is implemented by a strongly continuous group of unitary operators defined on the Hilbert space $L^2(M, \mu_0)$. Let us recall that $M$ is the space of multiplicative states (characters) on $\mathcal{A}_0$ with $x(f) = \hat f(x)$. Therefore, for any $x \in M$ and $t \in \mathbb{R}$ we may define a new point $g_t x \in M$ by the formula $(g_t x)(f) := x(\gamma_t f)$, $f \in \mathcal{A}_0$. Hence, the group $\hat\gamma_t$ is induced by a flow, i.e. $\hat\gamma_t \hat f(x) = \hat f(g_t x)$.

We show now that $g_t$ is continuous. Suppose it is not. Then there is a sequence $x_n \to x$ such that $g_t(x_n)$ does not converge to $g_t x$. It means that there exist $\epsilon > 0$ and a subsequence $\{x_{n_m}\}$ such that $d(g_t(x_{n_m}), g_t x) > \epsilon$, where $d$ is the metric on $M$. Let $\hat f_0$ be a continuous function such that $\hat f_0(g_t x) = 1$ and $\mathrm{supp}\, \hat f_0 \subset K(g_t x, \epsilon)$, the ball of radius $\epsilon$ centered at $g_t x$. Because $\hat\gamma_t \hat f_0$ is also continuous, $\hat\gamma_t \hat f_0(x_{n_m}) \to \hat\gamma_t \hat f_0(x) = 1$. However, $\hat\gamma_t \hat f_0(x_{n_m}) = \hat f_0(g_t x_{n_m}) = 0$ for all natural $m$, a contradiction. Hence $g_t$ is continuous. Since $(g_t)^{-1} = g_{-t}$, it is a homeomorphism of $M$. By the strong continuity of $\hat\gamma_t$ we conclude that the flow $g$ is continuous. Because, by definition, the measure $\mu_0$ is $g_t$-invariant, we have obtained in this way a topological dynamical system $(M, g_t, \mu_0)$.

Suppose now that $\phi_0$ is a weight. Let $\mathcal{A}_{00}$ be the $C^*$-algebra generated by the following set
\[ C = \{f \in \mathcal{A}_0 : f \ge 0 \text{ and } \phi_0(f) < \infty\}. \]
It is clear that $\gamma_t : C \to C$ and so $\gamma_t : \mathcal{A}_{00} \to \mathcal{A}_{00}$. Because $\mathcal{A}_{00}$ does not possess the identity, it is isomorphic with $C_0(M)$, where $M$ is a locally compact Hausdorff space. Assuming again that $M$ is metrizable (if the spectrum of $\mathcal{A}_0$ is metrizable, then this property is automatically satisfied; because the spectrum of $\mathcal{A}_{00} + \mathbb{C} \cdot 1$,
being the image of a continuous and closed mapping of the spectrum of $\mathcal{A}_0$ [16], is metrizable, so is the spectrum of $\mathcal{A}_{00}$) we obtain, by using similar arguments as in the previous case, a topological dynamical system $(M, g_t, \mu_0)$, with $M$ being locally compact and $\mu_0$ being σ-finite.

Summing up: It is the dynamics and the invariant state or weight which determine the underlying space for an abstract commutative dynamical system. It is worth noting, however, that such a reduction procedure is a "minimal" one since we aimed at getting topological dynamical systems. In some cases it may be convenient to impose additional conditions on $\mathcal{A}_0$ or $\mathcal{A}_{00}$. For example, to obtain a smooth dynamical system one has to require that the group $\gamma_t$ preserves a subspace of smooth functions.

4. Decoherence in Action

In recent years decoherence has been widely discussed and accepted as the mechanism responsible for the appearance of classicality in quantum measurements and the absence, in the real world, of Schrödinger-cat-like states [6, 22, 28, 47, 49, 50]. The basic idea behind it is that classicality is an emergent property induced in quantum systems by unavoidable and practically irreversible interaction with their environment. It is marked by the dynamical suppression of quantum interferences and so the transformation of the vast majority of pure states of the system to statistical mixtures. It should be pointed out, however, that classicality in quantum systems may also be introduced in another way. For example, it was shown in [3] that broken symmetries in infinite systems give rise to classical observables based on a system of imprimitivity. A different approach to the description of classical states and the classical observables associated with them, based on the algebraic theory of superselection sectors, was proposed in [30].
It was also shown there that nonautomorphic time evolution leads to the transition between different folia (equivalence classes of pure states) in the way required to find a mixed state after the measurement. A loss of phase coherence as the consequence of the coupling with an environment has been established both in the Markovian regime [43, 46] and for a system with a non-Markovian evolution (decoherence through emission of Bremsstrahlung) [13]. This idea has also been experimentally verified. For example, Brune et al. [14] created a mesoscopic superposition of quantum states involving radiation fields with classically distinct phases and observed its progressive decoherence to a statistical mixture through two-atom correlation measurements. Moreover, Schrödinger-cat-like states were also created in an ion trap experiment using a single beryllium ion and a combination of static and oscillating electric fields, and their decoherence was observed [33].

In spite of the progress in the theoretical and experimental understanding of decoherence, the models studied so far do not answer the question concerning its nature satisfactorily. Dynamical diagonalization of pure states with respect to a preferred basis essentially explains the measurement results but it is only an
example of possible scenarios. Other possibilities include: environmentally induced superselection rules of discrete and continuous types, and completely classical behavior of the quantum system. Let us now discuss this issue in a detailed way. Because the evolution introduced in Sec. 2 is so general that it also embraces a unitary evolution, we first distinguish the case of a nontrivial coupling between a system and its environment.

Definition 3A. Environmentally induced decoherence is said to take place in the system, if there exists at least one projection $P \in \mathcal{M}$ such that $T_t(P)$ is not a projection for some instant $t > 0$.

The above definition excludes only automorphic evolutions. For the discussion on emergence of classical properties we find it more useful to strengthen it in the following way.

Definition 3B. We say that environmentally induced decoherence takes place in the system, if there are two Banach $*$-invariant subspaces $\mathcal{M}_1$ and $\mathcal{M}_2$ in $\mathcal{M}$ such that:
(i) $\mathcal{M} = \mathcal{M}_1 \oplus \mathcal{M}_2$ with $\mathcal{M}_2 \neq 0$. Moreover, both $\mathcal{M}_1$ and $\mathcal{M}_2$ are $T_t$-invariant.
(ii) $\mathcal{M}_1$ represents a decoherence free part of the system. It is a von Neumann algebra (the image of a conditional expectation of $\mathcal{M}$) generated by all projections $P$ in $\mathcal{M}$ such that $T_t(P)$ remains a projection for all $t > 0$. We additionally assume that for any projection $P \in \mathcal{M}_1$ and any $t > 0$ there exists a projection $Q \in \mathcal{M}_1$ such that $T_t(Q) = P$.
(iii) $\mathcal{M}_2$ represents those observables of the system which, after some time, are not detectable by measurements, i.e. all their expectation values vanish with time. More precisely,
\[ \lim_{t \to \infty} \phi(T_t B) = 0 \tag{3} \]
for all $\phi \in \mathcal{D}$ and any $B = B^* \in \mathcal{M}_2$.

If the process of decoherence is efficient, and usually it is, then Hermitian operators from $\mathcal{M}_1$ are those which can be detected in practice. It should be noted that by property (ii) of Definition 3B, $\mathcal{M}_1$ is determined in a unique way. Moreover, the evolution restricted to this subalgebra has a nice automorphic property.

Theorem 4. For any $t \ge 0$, $T_t|_{\mathcal{M}_1}$ is a $*$-automorphism.

Proof. See the Appendix.

The above properties justify the following name.

Definition 5. $\mathcal{M}_1$ is called the algebra of effective observables.

Using the decomposition from Definition 3B we now discuss the dynamical appearance of classical properties in the quantum system.
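As a toy illustration of Definition 3B, one may consider pure dephasing of a single qubit. This model is an assumption made here for illustration and is not one of the examples treated in the paper: the map damps the off-diagonal matrix entries by $e^{-\gamma t}$, so the diagonal matrices play the role of the decoherence free part and the off-diagonal matrices play the role of the decaying part. The rate `GAMMA` and all function names below are hypothetical.

```python
import math

GAMMA = 1.0  # assumed dephasing rate (hypothetical parameter)

def evolve(A, t, gamma=GAMMA):
    """Heisenberg-picture pure dephasing on 2x2 matrices:
    diagonal entries are kept, off-diagonal entries are damped."""
    d = math.exp(-gamma * t)
    return [[A[0][0], d * A[0][1]],
            [d * A[1][0], A[1][1]]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def is_projection(A, tol=1e-12):
    """Check A*A == A entrywise (the matrices used here are real symmetric)."""
    AA = matmul(A, A)
    return all(abs(AA[i][j] - A[i][j]) < tol
               for i in range(2) for j in range(2))

def expectation(rho, A):
    """tr(rho A) for 2x2 real matrices."""
    return sum(rho[i][j] * A[j][i] for i in range(2) for j in range(2))

P_diag = [[1.0, 0.0], [0.0, 0.0]]  # projection in the diagonal part
P_plus = [[0.5, 0.5], [0.5, 0.5]]  # projection onto (e1 + e2)/sqrt(2)
B = [[0.0, 1.0], [1.0, 0.0]]       # Hermitian, purely off-diagonal
rho = P_plus                       # state with maximal coherence

stays_projection = is_projection(evolve(P_diag, 2.0))
loses_projection = not is_projection(evolve(P_plus, 2.0))
# Expectation of the evolved off-diagonal observable at increasing times.
decay = [expectation(rho, evolve(B, t)) for t in (0.0, 1.0, 5.0, 20.0)]
```

In this sketch the diagonal projection is left untouched, the projection `P_plus` stops being a projection for $t > 0$ (so decoherence takes place in the sense of Definition 3A), and the expectation of the off-diagonal observable decays as $e^{-\gamma t}$, in line with condition (iii).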
Definition 6. If $\mathcal{M}_1$ is noncommutative with $Z(\mathcal{M}_1) \neq \mathbb{C} \cdot 1$, where $Z(\mathcal{M}_1)$ denotes the center of $\mathcal{M}_1$, and $T_t \circ T_s(A) = T_{t+s}(A)$ for all $A \in \mathcal{M}_1$, then we speak of environmentally induced superselection rules in the system.

In such a case we may define $T_{-t} := (T_t)^{-1}$ and obtain in this way a one parameter group of $*$-automorphisms on the algebra $\mathcal{M}_1$. Hence the system dynamically loses its pure quantum character and behaves like a conservative one, however, with nontrivial superselection operators. If there are minimal projections in $Z(\mathcal{M}_1)$, which are not minimal in $\mathcal{M}_1$, then the superselection rules are of discrete type. In such a case $\mathcal{M}_1 = \sum_i P_i \mathcal{M} P_i$ and the evolution preserves each superselection sector. If $Z(\mathcal{M}_1)$ does not possess any minimal projections the induced superselection rules are continuous.

Definition 7. We say that the environment induces a classical structure in the system, if $\mathcal{M}_1$ is a commutative algebra greater than $\mathbb{C} \cdot 1$. If $\mathcal{M}_1 = \mathbb{C} \cdot 1$, then we say that the system is ergodic.

Ergodic systems possess the property of return to equilibrium in the following sense [32]. The decomposition of any observable $A \in \mathcal{M}$ is now given by $A = \phi_0(A)1 + A_2$, where $\phi_0 \in \mathcal{D}$ and $A_2 \in \mathcal{M}_2$. Hence, for any $\phi \in \mathcal{D}$,
\[ (T_{t*}\phi)(A) = \phi(T_t A) = \phi(\phi_0(A)1 + T_t(A_2)) \to \phi_0(A), \tag{4} \]
when $t \to \infty$.

Definition 8. Suppose that the environment induces a classical structure. The classical structure is said to be discrete, if $\mathcal{M}_1$ contains minimal projections $P_1, P_2, \ldots$.

Since minimal projections (there are always countably many of them) are necessarily orthogonal and they sum up to the identity operator, it follows that any observable $A \in \mathcal{M}_1$ may be written as $A = \sum_i a_i P_i$, where $a_i \in \mathbb{R}$. Because the evolution restricted to $\mathcal{M}_1$ is trivial (since $T_t(P_i) = P_i$ for all $t \ge 0$ and all indices $i$), this case corresponds to a dynamical selection of the so-called privileged basis. Let us emphasize, however, that in general $P_i$ may not be one-dimensional, and so they represent generalized rays.

Definition 9. Suppose that the environment induces a classical structure. The classical structure is said to represent a classical dynamical system, if $T_t|_{\mathcal{M}_1}$ is a semigroup and $(\mathcal{M}_1, T_t, \omega_0)$ is isomorphic with $(L^\infty(M), \hat T_t, \mu_0)$, where $M$ is a locally compact space, $\hat T_t$ is a one parameter group of $*$-automorphisms on $L^\infty(M)$ induced by a continuous flow $g_t$ on $M$, and $\mu_0$ is a $\hat T_t$-invariant σ-finite Borel measure on $M$.

Let us notice that, due to the semigroup property, the restriction of $T_t$ to $\mathcal{M}_1$ extends to negative times by the formula $T_{-t} := (T_t)^{-1}$ (the existence of $(T_t)^{-1}$ on $\mathcal{M}_1$ is guaranteed by Theorem 4). The procedure of retrieving the space $M$ from an abstract commutative von Neumann algebra $\mathcal{M}_1$ was discussed in Sec. 3. By the isomorphism of two dynamical systems we mean a map $\lambda : \mathcal{M}_1 \to L^\infty(M)$
which is a $*$-isomorphism intertwining between $T_t$ and $\hat T_t$, i.e. $\hat T_t \circ \lambda = \lambda \circ T_t$, and such that $\mu_0(\Lambda) = \omega_0(\lambda^{-1}\chi_\Lambda)$, where $\chi_\Lambda$ is the characteristic function of a Borel subset $\Lambda \subset M$. It follows that the above definition describes a process of dynamical de-quantization of a quantum system,
\[ (\mathcal{M}, \{T_t\}_{t \ge 0}, \phi_0) \to (\mathcal{M}_1 = \mathcal{A}, \{T_t\}_{t \in \mathbb{R}}, \phi_0) \to (\mathcal{A}_0, \{T_t\}_{t \in \mathbb{R}}, \phi_0) \to (C(M), \{\hat T_t\}_{t \in \mathbb{R}}, \hat\phi_0) \to (M, g_t, \mu_0), \]
if $\phi_0$ is a state. The first arrow represents the process of decoherence, the second the reduction procedure, the third corresponds to the Gelfand isomorphism, and the last one represents the passage from statistical description to individual one expressed in terms of trajectories. A similar scheme holds true also if $\phi_0$ is a weight (we just replace $\mathcal{A}_0$ by $\mathcal{A}_{00}$ and $C(M)$ by $C_0(M)$). In the next section we present a number of examples showing how these definitions work in practice.

5. Examples

We start with the following theorem for quantum systems which additionally satisfy Dirac's requirement.

Theorem 10. Suppose $\mathcal{M}$ is a type I factor, i.e. $\mathcal{M} = B(\mathcal{H}_S)$, where $\mathcal{H}_S$ is a separable (finite or infinite dimensional) Hilbert space. Let the evolution of the system be given by a family of maps $\{T_t\}_{t \ge 0}$ which fulfils the conditions (a)–(c) from Sec. 2 with $\omega_0 = \mathrm{Tr}$, the standard trace. If $\{T_t\}$ satisfies the semigroup property $T_t \circ T_s = T_{t+s}$, and if there exists a faithful density matrix $\rho_0$ subinvariant with respect to $T_t$, i.e. $\mathrm{Tr}\, \rho_0 T_t(A) \le \mathrm{Tr}\, \rho_0 A$ for all $A \ge 0$, then the decomposition $\mathcal{M} = \mathcal{M}_1 \oplus \mathcal{M}_2$ from Definition 3B always exists. Moreover, the effective part of any observable $A \in \mathcal{M}$ is given by a Tr-compatible conditional expectation from $\mathcal{M}$ onto $\mathcal{M}_1$, the automorphic evolution of the algebra $\mathcal{M}_1$ is a Hamiltonian one, and the limit in equation (3) is uniform on bounded sets of $\mathcal{M}_2$.

Remark. If $\dim \mathcal{H}_S = n$, then $\rho_0 = \frac{1}{n} 1$ is obviously a $T_t$-invariant faithful density matrix, so the last assumption of the theorem may be omitted.

Proof. See the Appendix.

Example 1. The Araki–Zurek model: Superselection rules. We follow a mathematical description of the model [4, 49] as given by Kupsch [29]. Suppose the total Hamiltonian
\[ H = H_S \otimes 1_E + 1_S \otimes H_E + A \otimes B, \tag{5} \]
defined on a Hilbert space $\mathcal{H}_S \otimes \mathcal{H}_E$, satisfies the following assumptions:
• $[H_E, B] = 0$,
• $B = B^*$ has an absolutely continuous spectrum,
May 26, 2003 12:17 WSPC/148-RMP
228
00163
Ph. Blanchard & R. Olkiewicz
• $A = \sum_{n=1}^{N} \lambda_n P_n$, $\lambda_n \in \mathbb{R}$, and $P_n$ are mutually orthogonal projections summing up to the identity operator,
• $H_S = \sum_{n=1}^{N} P_n H_S P_n$, i.e. $H_S = \bigoplus_n H_n$, where $H_n$ is a self-adjoint operator in the Hilbert space $P_n \mathcal{H}_S$,
• $\omega_E$ is an arbitrary statistical state of the environment represented by a density matrix $\rho$.

Because all three terms in Eq. (5) commute (we say that two self-adjoint operators commute when their spectral measures commute), we have $e^{itH} = e^{itH_S \otimes 1_E}\, e^{it 1_S \otimes H_E}\, e^{itH_I}$, where $H_I = A \otimes B$ is the interaction term. In order to simplify notation we have put $\hbar = 1$. The Hamiltonian (5) is just the generator of the above one parameter group of unitary operators. Let $P_E$ be the conditional expectation from $B(\mathcal{H}_S) \otimes B(\mathcal{H}_E)$ onto $B(\mathcal{H}_S)$ with respect to the state $\omega_E$. Then, for any $X \in B(\mathcal{H}_S)$,
\[ T_t(X) = P_E[e^{itH} X \otimes 1_E\, e^{-itH}] = e^{itH_S} P_E[e^{itH_I} X \otimes 1_E\, e^{-itH_I}] e^{-itH_S} . \]
Because
\[ e^{itH_I} = \sum_{n=1}^{N} P_n \otimes e^{it\lambda_n B} , \]
we obtain
\[ T_t(X) = \sum_{n,m=1}^{N} \chi_{n,m}(t)\, e^{itH_S} P_n X P_m e^{-itH_S} , \tag{6} \]
where
\[ \chi_{n,m}(t) = \int e^{it(\lambda_n - \lambda_m)s}\, d\,\mathrm{Tr}(\rho E(s)) , \]
and $dE(s)$ is the spectral measure of $B$. Since this measure is absolutely continuous, $d\,\mathrm{Tr}(\rho E(s))$ is a probability measure absolutely continuous with respect to the Lebesgue measure. Hence, by the Riemann–Lebesgue lemma, $\chi_{n,m} \in C_0(\mathbb{R})$. Because there are finitely many $\lambda_n$, $\min_{n \neq m} |\lambda_n - \lambda_m| = \delta > 0$. It implies that for any $\epsilon > 0$ there exists $t_0 > 0$ such that $|\chi_{n,m}(t)| < \epsilon$ for all $n \neq m$ and all $t > t_0$. It is clear now that
\[ \mathcal{M}_1 = \hat{P}(B(\mathcal{H}_S)) := \sum_{n=1}^{N} P_n B(\mathcal{H}_S) P_n . \]
It means that M1 describes what is called a quantum system with superselection rules. In each superselection sector Pn HS the evolution is given by the Hamiltonian Hn . Finally, we show that all expectation values of observables from M2 decrease to zero uniformly on bounded sets. Because in general {Tt } is not a semigroup we
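cannot therefore apply Theorem 10. However, this may be done explicitly. Suppose $B \in \mathcal{M}_2$, i.e. $B = (\mathrm{id} - \hat{P})B$, and $\|B\|_\infty \le 1$. Then for any $\rho_S \in D$,
\[ |\mathrm{Tr}\,\rho_S T_t(B)| \le \|T_{t*}\rho_S - \hat{P}(T_{t*}\rho_S)\|_1 \, \|B\|_\infty \le \Big\| \sum_{n \neq m} P_n \rho_S P_m\, \chi_{n,m}(t) \Big\|_1 , \]
where $\|\cdot\|_1$ is the trace norm. Suppose that $\epsilon > 0$ is given. Let us take $t_0 > 0$ such that for all $t > t_0$, $|\chi_{n,m}(t)| < \frac{\epsilon}{N(N-1)}$. Then
\[ \Big\| \sum_{n \neq m} P_n \rho_S P_m\, \chi_{n,m}(t) \Big\|_1 \le \sum_{n<m} \| P_n \rho_S P_m\, \chi_{n,m}(t) + P_m \rho_S P_n\, \chi_{m,n}(t) \|_1 = \sum_{n<m} |\chi_{n,m}(t)| \cdot \| P_n \rho_S P_m + P_m \rho_S P_n \|_1 . \]
Because $\| P_n \rho_S P_m + P_m \rho_S P_n \|_1 \le \|\rho_S\|_1 + \|\hat{P}(\rho_S)\|_1 = 2$, we obtain
\[ \Big\| \sum_{n \neq m} P_n \rho_S P_m\, \chi_{n,m}(t) \Big\|_1 < \frac{N(N-1)}{2} \cdot 2 \cdot \frac{\epsilon}{N(N-1)} = \epsilon . \]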
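The mechanism of Example 1 can be illustrated numerically. The following Python sketch is not part of the original model: it sets $H_S = 0$, takes projections $P_n$ that are diagonal in the standard basis, and assumes, purely for illustration, that the measure $d\,\mathrm{Tr}(\rho E(s))$ is the standard normal law, so that $\chi_{n,m}(t) = \exp(-t^2(\lambda_n - \lambda_m)^2/2)$. The names `dephasing_map`, `sector_of` and `gaussian_chi` are hypothetical.

```python
import math

def dephasing_map(X, sector_of, lambdas, t, char_fn):
    """Apply X -> sum_{n,m} chi_{n,m}(t) P_n X P_m for diagonal projections.

    sector_of[i] is the index n of the sector P_n containing basis vector i,
    and chi_{n,m}(t) = char_fn(t * (lambda_n - lambda_m)).  H_S is set to zero.
    """
    dim = len(X)
    return [[char_fn(t * (lambdas[sector_of[i]] - lambdas[sector_of[j]])) * X[i][j]
             for j in range(dim)] for i in range(dim)]

def gaussian_chi(u):
    # Assumption: Tr(rho E(ds)) is the standard normal law, whose
    # characteristic function is exp(-u^2 / 2).
    return math.exp(-0.5 * u * u)

sector_of = [0, 1, 1]          # P_1 = span{e_0}, P_2 = span{e_1, e_2}
lambdas = [0.0, 1.0]           # eigenvalues of A on the two sectors
X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]

X_start = dephasing_map(X, sector_of, lambdas, 0.0, gaussian_chi)
X_late = dephasing_map(X, sector_of, lambdas, 10.0, gaussian_chi)
```

Under these assumptions `X_start` reproduces `X`, while in `X_late` the entries connecting different sectors are suppressed and the blocks $P_n X P_n$ are left unchanged, in line with the description of $\mathcal{M}_1$ above.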
0, $n \in \mathbb{CP}^{n-1}$ (the complex projective space), and $n \mapsto P_A(n)$ is a tautological map which assigns to a point $n \in \mathbb{CP}^{n-1}$ the corresponding one-dimensional projection in the system $A$. By $\mu(n)$ we denote the unique $U(n)$-invariant measure on $\mathbb{CP}^{n-1}$ normalized in such a way that
\[ \int_{\mathbb{CP}^{n-1}} d\mu(n)\, P_{A(B)}(n) = 1_{A(B)} . \]
By direct calculations we obtain that, for an appropriately chosen value of the coupling constant $\kappa$ (depending on the dimension $n$), $L_D(X) = \mathrm{tr}_A X + \mathrm{tr}_B X - 2X$, where $\mathrm{tr}_A$ ($\mathrm{tr}_B$) denotes the conditional expectation from $M_{n^2 \times n^2}$ onto $1_A \otimes M_{n \times n}$ ($M_{n \times n} \otimes 1_B$, respectively). In other words, $\mathrm{tr}_A(X) = \frac{1}{n} 1_A \otimes \mathrm{Tr}_A(X)$, where $\mathrm{Tr}_A$ is the partial trace taken over the system $A$. Let us further assume that there is no interaction between systems $A$ and $B$. Then the Hamiltonian $H$ of the system $AB$ equals $H = H_A \otimes 1_B + 1_A \otimes H_B$. In such a case the superoperator $L(X) = i[H, X] + L_D(X)$ generates a quantum Markov semigroup
\[ T_t(X) = e^{itH}\big( e^{-2t} X + e^{-t}(1 - e^{-t})(\mathrm{tr}_A X + \mathrm{tr}_B X) + (1 - e^{-t})^2\, \mathrm{tr}_{AB} X \big) e^{-itH} , \tag{8} \]
where $\mathrm{tr}_{AB}$ is the conditional expectation from $M_{n^2 \times n^2}$ onto the trivial subalgebra $\mathbb{C} \cdot 1_{AB}$. If we take $\rho_0 = \frac{1}{n^2} 1_{AB}$, then the semigroup $T_t$ satisfies all assumptions of Theorem 10. Hence $M_{n^2 \times n^2} = \mathcal{M}_1 \oplus \mathcal{M}_2$, with $\mathcal{M}_1 = \mathbb{C} \cdot 1_{AB}$. It is easy to check that all statistical states of the system $AB$ evolve to the completely mixed state $\rho_0$.

Example 4. Quantum stochastic process. Quantum stochastic processes were introduced by Davies [15] to describe rigorously certain continuous measurement processes. They can be constructed from two infinitesimal generators. The first is the generator $Z$ of a strongly continuous semigroup on a Hilbert space $\mathcal{H}_S$, and the second is a stochastic kernel $J$, describing how the measuring apparatus interacts with the system. Let us recall that a stochastic kernel is a measure defined on the $\sigma$-algebra of Borel sets in some locally compact space and with values in the space of bounded positive linear operators on $\mathrm{Tr}(\mathcal{H}_S)$, the Banach space of trace class operators on $\mathcal{H}_S$. In this example we take the Poincaré disc $D_1 = \{\zeta \in \mathbb{C} : |\zeta| < 1\}$ as the underlying topological space, and define $Z = iH_S - \frac{\kappa}{2} 1_S$, where $H_S$ is the Hamiltonian of the system and $\kappa > 0$ is the coupling constant. For $E \subset D_1$ and $\rho \in \mathrm{Tr}(\mathcal{H}_S)$ the stochastic kernel is defined by
\[ \mathrm{Tr}[J(E, \rho)A] = \kappa \int_E d\mu(\zeta)\, \mathrm{Tr}(e_\zeta \rho\, e_\zeta A) , \]
where $A \in B(\mathcal{H}_S)$, $e_\zeta = |\zeta\rangle\langle\zeta|$ with $|\zeta\rangle$ being an SU(1, 1) coherent state, i.e. a holomorphic function on $D_1$ [37],
\[ |\zeta\rangle(z) = (1 - |\zeta|^2)(1 - z\zeta)^{-2} , \]
and
\[ d\mu(\zeta) = \frac{1}{\pi} \frac{d\zeta\, d\bar{\zeta}}{(1 - |\zeta|^2)^2} \]
is an SU(1, 1)-invariant measure on $D_1$. It should be pointed out that $|\zeta\rangle$ are coherent states in the Hilbert space $\mathcal{H}_S = L^2(D_1, \frac{dz\, d\bar{z}}{\pi})$, which is the space of a unitary irreducible representation $\pi$ of the group SU(1, 1) given by
\[ \pi(g) f(z) = (\beta z + \bar{\alpha})\, f\Big( \frac{\alpha z + \bar{\beta}}{\beta z + \bar{\alpha}} \Big) , \]
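where
\[ g = \begin{pmatrix} \alpha & \beta \\ \bar{\beta} & \bar{\alpha} \end{pmatrix} \]
with $|\alpha|^2 - |\beta|^2 = 1$. In order to define a quantum stochastic process, $Z$ and $J$ have to satisfy the following relation
\[ \mathrm{Tr}[J(D_1, e_\psi)] = -2\, \mathrm{Re}\langle\psi, Z\psi\rangle , \qquad e_\psi = |\psi\rangle\langle\psi| , \]
for all normalized vectors $\psi \in D(Z)$, the domain of $Z$. It is straightforward to check that
\[ \mathrm{Tr}[J(D_1, e_\psi)] = \kappa \int_{D_1} d\mu(\zeta)\, \mathrm{Tr}(e_\zeta e_\psi e_\zeta) = \kappa = -2\, \mathrm{Re}\langle\psi, Z\psi\rangle . \]
As was shown in [8], the generator of the semigroup $T_t$ associated with the process is given by
\[ L(X) = i[H_S, X] + \kappa \int_{D_1} d\mu(\zeta)\, e_\zeta X e_\zeta - \kappa X . \tag{9} \]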
From Eq. (9) it is clear that $T_t$ satisfies all but the last assumption of Theorem 10. However, although the decomposition $B(\mathcal{H}_S) = \mathcal{M}_1 \oplus \mathcal{M}_2$ does not hold in this case, $T_t$ describes a very efficient decoherence in the quantum system in the spirit of Definition 3A. In fact, if $H_S$ is the operator closure of $(d\pi(h), D_G)$, where $h \in su(1, 1)$, the Lie algebra of the group SU(1, 1), and $D_G$ is the Gårding domain, then
\[ \lim_{t \to \infty} \|T_t A\|_\infty = 0 \]
for all $A \in K(\mathcal{H}_S)$, the space of compact operators on $\mathcal{H}_S$ (see Theorem 3.10 in [9]). It follows that $K(\mathcal{H}_S) \subset \mathcal{M}_2$, and so all pure states of the system instantaneously deteriorate to statistical states. Let us notice that the pre-adjoint semigroup $T_{t*}$ is asymptotically stable, i.e.
\[ \lim_{t \to \infty} \|T_{t*}\rho_1 - T_{t*}\rho_2\|_1 = 0 \]
for all density matrices $\rho_1$ and $\rho_2$. Hence, the set of $T_t$-invariant, or even subinvariant, density matrices is empty. Moreover, since $\|T_{t*}\rho\|_2 \to 0$, where $\|\cdot\|_2$ is the Hilbert–Schmidt norm, and $\|T_{t*}\rho\|_1 = 1$ for all $t \ge 0$, we have $\rho_t(z, z') \to 0$ for $dz\,d\bar{z}$-almost all $z$ and $dz\,d\bar{z}$-almost all $z' \neq z$, where $\rho_t(z, z')$ stands for the integral kernel of the density matrix $T_{t*}\rho$.

So far we have restricted the discussion of the emergence of classical properties to quantum open systems associated with factors of type $\mathrm{I}_n$ and $\mathrm{I}_\infty$. A generic feature of such factors is that they possess minimal projections. Hence, the only possible classical structure induced in such factors is a discrete one, and so the dynamics restricted to it has to be trivial. Let us now turn to infinite quantum spin systems whose GNS representation with respect to a normalized trace tr is known
to be a hyperfinite factor of type $\mathrm{II}_1$ [18]. Such a factor is continuous, i.e. there are no minimal projections in it. We start with the following general theorem.

Theorem 11. Suppose $\mathcal{M}$ is a type $\mathrm{II}_1$ factor. Let its evolution be given by a family of maps $\{T_t\}$ satisfying the conditions (a)–(c) from Sec. 2 with $\omega_0 = \mathrm{tr}$, the normalized trace on $\mathcal{M}$. If $\{T_t\}$ is a semigroup, then the decomposition $\mathcal{M} = \mathcal{M}_1 \oplus \mathcal{M}_2$ always exists. Moreover, the effective part of any observable in $\mathcal{M}$ is given by a tr-compatible conditional expectation from $\mathcal{M}$ onto $\mathcal{M}_1$.

Remark. Notice that $\mathrm{tr}(1_S) = 1$, so the identity operator is a $T_t$-invariant faithful statistical state in this case.

Proof. See the Appendix.

Example 5. Apparatus with continuous readings. Suppose an apparatus, a semi-infinite linear array of spin-$\frac{1}{2}$ particles, fixed at positions $k = 1, 2, 3, \ldots$, interacts with a quantum particle moving along the $x$-axis. Then, the algebra $\mathcal{M}$ of the measuring device is a hyperfinite factor of type $\mathrm{II}_1$, and the algebra of the system is $B(\mathcal{H}_S)$, where $\mathcal{H}_S = L^2(\mathbb{R}, dx)$. More precisely, $\mathcal{M} = \pi(\bigotimes_1^\infty M_{2 \times 2})''$, where $\pi$ is the GNS representation with respect to the normalized trace on the Glimm algebra $\bigotimes_1^\infty M_{2 \times 2}$. The evolution of the joint system is determined by a Hamiltonian
\[ H = H_A \otimes 1_S + 1_A \otimes H_S + A \otimes B , \tag{10} \]
where $H_A$, $H_S$, $A$ and $B$ are assumed to satisfy the following conditions:
• $[H_A, A] = 0$,
• $A = \pi\big(\sum_{n=1}^{\infty} (\frac{1}{2})^n \sigma_n^3\big)$, where $\sigma_n^3$ is the third Pauli matrix located at position $n$, and so $A \in \mathcal{M}$,
• $H_S = -\frac{1}{2m}\Delta$, the kinetic energy operator,
• $B = \hat{p}$, the momentum operator,
• $\omega_S = |\psi\rangle\langle\psi|$, where $\psi(x) = \frac{1}{\sqrt{2\pi}} \int \frac{e^{ixp}}{\sqrt{\pi(1 + p^2)}}\, dp$.

Let $P_S$ denote the conditional expectation from $\mathcal{M} \otimes B(\mathcal{H}_S)$ onto $\mathcal{M}$ with respect to the state $\omega_S$. Then, for any $X \in \mathcal{M}$,
\[ T_t(X) = e^{itH_A} P_S[e^{itA \otimes B} X \otimes 1_S\, e^{-itA \otimes B}] e^{-itH_A} . \tag{11} \]
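As a worked illustration, which is not spelled out in the text, formula (11) can be made explicit on matrix elements between spectral subspaces of $A$. The sketch assumes that $a$ and $a'$ are two points of the spectrum of $A$ and uses only the fact that the momentum distribution of $\psi$ is the Cauchy density $1/(\pi(1+p^2))$, which is how the last condition above is read here:

```latex
\[
  \langle \psi,\, e^{it(a-a')\hat{p}}\, \psi \rangle
  = \int_{\mathbb{R}} \frac{e^{it(a-a')p}}{\pi(1+p^{2})}\, dp
  = e^{-t|a-a'|}, \qquad t \ge 0 .
\]
```

Under this reading the part of $X$ connecting the spectral values $a$ and $a'$ is damped by the factor $e^{-t|a-a'|}$, which is multiplicative in $t$ and is therefore compatible with a semigroup law, while the part commuting with $A$ is left untouched by the reduction.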
By Theorem 12 in [31], Tt satisfies all assumptions of Theorem 11. Hence, M = M1 ⊕ M2 with M1 being the von Neumann algebra generated by spectral projections of all σn3 , n ∈ N. It is easy to check that M1 = L∞ (C, µ), where C is the Cantor set and µ is a continuous, regular, Borel, probability measure on C, see [31] for more details. It is worth pointing out that M1 is unitarily isomorphic with L∞ ([0, 1], dx), and the trace state tr corresponds to the Lebesgue integral on [0, 1].
Because $H_A$ commutes with all spectral projections of the operator $A$, the evolution restricted to $\mathcal{M}_1$ is trivial. Hence, this case may be considered as a continuous analog of the selection of pointer states from Example 2.

Example 6. Continuous superselection rules. This example is a slight modification of the previous one. Suppose that, in addition to the apparatus and system introduced in Example 5, there is another quantum system represented by a factor algebra $\mathcal{N}$ acting in a Hilbert space $\mathcal{H}_q$. As the Hamiltonian of the total system we take
\[ H = H_q \otimes 1_A \otimes 1_S + 1_q \otimes H_A \otimes 1_S + 1_q \otimes 1_A \otimes H_S + 1_q \otimes A \otimes B , \tag{12} \]
where $H_q$ is a self-adjoint operator affiliated to $\mathcal{N}$. Since $H$ was formed by adding a free evolution of the system $\mathcal{N}$ to the Hamiltonian (10), $\mathcal{N} \otimes \mathcal{M} = (\mathcal{N} \otimes \mathcal{M}_1) \oplus (\mathcal{N} \otimes \mathcal{M}_2)$ with $\mathcal{N} \otimes \mathcal{M}_1$ being the subalgebra of effective observables. Its center $\mathcal{Z} = 1_q \otimes \mathcal{M}_1$ is a continuous commutative algebra. The evolution restricted to $\mathcal{N} \otimes \mathcal{M}_1$ is Hamiltonian and preserves each superselection sector of $\int^{\oplus}_{\mathcal{Z}} \mathcal{N}\, d\mu$. This model may be easily generalized to the case when the Hamiltonian $H_q$ depends on $x \in C$, and so is different in each superselection sector.

Example 7. Classical dynamical system. Suppose that a quantum system is a semi-infinite linear array of spin-$\frac{1}{2}$ particles fixed at positions $k \in \mathbb{N}$. The algebra $\mathcal{M}$ of the system is a hyperfinite factor of type $\mathrm{II}_1$ defined as $\mathcal{M} = \pi(\bigotimes_1^\infty M_{2 \times 2})'' \subset B(\mathcal{H}_S)$, where $\pi$ is the GNS representation with respect to the normalized trace on the Glimm algebra. The free evolution of the system is given by a $\sigma$-weakly continuous one parameter group of automorphisms $\alpha_t : \mathcal{M} \to \mathcal{M}$ constructed in the following way. Suppose $U(\frac{k}{2^n})$, $k = 0, 1, \ldots, 2^n - 1$, is a representation of the cyclic group $\{\frac{k}{2^n}\}$, with addition modulo 1, in the space $\mathbb{C}^{2^n}$, such that
\[ U\Big(\frac{1}{2^n}\Big)(z_1, \ldots, z_{2^n}) = (z_{2^n}, z_1, \ldots, z_{2^n - 1}) . \]
Since it is a restriction of the standard unitary representation of the permutation group $S_{2^n}$, the $U(\frac{k}{2^n})$ are unitary matrices in $M_{2^n \times 2^n}$. Because there is an embedding of $M_{2^n \times 2^n}$ into $\bigotimes_1^\infty M_{2 \times 2}$, they may be considered as operators in the Glimm algebra. Hence, they induce a discrete group of unitary automorphisms of $\mathcal{M}$ by the formula
\[ \alpha_{\frac{k}{2^n}}(X) = \pi\Big(U\Big(\frac{k}{2^n}\Big)\Big)\, X\, \pi\Big(U\Big(\frac{k}{2^n}\Big)\Big)^{*} . \]
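The cyclic representation used in this construction is easy to write down in finite dimension. The Python sketch below is only an illustration under its own conventions: `shift_matrix(n, k)` (a hypothetical name) builds the $2^n \times 2^n$ permutation matrix playing the role of $U(k/2^n)$, and the helper functions `matmul` and `apply` are plain list-based linear algebra introduced here for self-containment. The embedding into the Glimm algebra is not modelled.

```python
def shift_matrix(n, k):
    """Permutation matrix standing for U(k / 2**n) on C^(2**n)."""
    dim = 2 ** n
    # U(1/2^n) maps (z_1, ..., z_dim) to (z_dim, z_1, ..., z_{dim-1}),
    # so output entry i equals input entry (i - k) mod dim after k steps.
    return [[1 if j == (i - k) % dim else 0 for j in range(dim)]
            for i in range(dim)]

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(A[i][m] * B[m][j] for m in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def apply(U, z):
    """Matrix-vector product."""
    return [sum(U[i][j] * z[j] for j in range(len(z))) for i in range(len(U))]

n = 2
shifted = apply(shift_matrix(n, 1), [1, 2, 3, 4])            # one elementary shift
composed = matmul(shift_matrix(n, 1), shift_matrix(n, 3))    # 1/4 + 3/4 = 1 (mod 1)
identity = shift_matrix(n, 0)
```

In this sketch `composed` equals `identity`, which is the finite-dimensional counterpart of addition modulo 1 in the group $\{k/2^n\}$.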
Because $n$ was arbitrary, we obtain in this way a homomorphism $d \mapsto \alpha_d$, where $d$ is a dyadic number, i.e. $d = \frac{k}{2^n}$ for some $n \in \mathbb{N}$ and some $k = 0, 1, \ldots, 2^n - 1$. By Theorem 13 in [31], this homomorphism extends to the whole set of real numbers
yielding a group of unitary (but not inner) automorphisms $\alpha_t(X) = e^{itH} X e^{-itH}$, where $H$ is a self-adjoint operator on $\mathcal{H}_S$. It is clear from the construction that $\alpha_m = \mathrm{id}$ for any integer $m$.

The reservoir is chosen to consist of phonons of an infinitely extended harmonic crystal. The Hilbert space $\mathcal{H}$ representing pure states of a single phonon is given by $\mathcal{H} = L^2(\mathbb{R}^3, dk)$. It follows that the Hilbert space of the environment is the symmetric Fock space $\mathcal{F}$ over $\mathcal{H}$, and its algebra $\mathcal{M}_E$ is the von Neumann algebra generated by the Weyl operators $W(f) = e^{i\phi(f)}$, $\phi(f) = \frac{1}{\sqrt{2}}(a^*(f) + a(f))$, where $a^*(f)$ denotes the creation operator of the one particle state $f \in \mathcal{H}$, and $a(f) = (a^*(f))^*$ [12]. Because the Fock representation is irreducible, $\mathcal{M}_E = B(\mathcal{F})$. The reference state of the reservoir $\omega_E$ is taken to be a pure state $\omega_E = |\tilde{f}\rangle\langle\tilde{f}|$, where $|\tilde{f}\rangle$ is a coherent state in the Fock space, i.e.
\[ |\tilde{f}\rangle = e^{-\frac{1}{2}\|f\|^2} \sum_{n=0}^{\infty} \frac{[a^*(f)]^n}{n!}\, \Omega , \]
where $f \in \mathcal{H}$, and $\Omega$ is the vacuum state. Such a state represents a state of phonons associated with a classical acoustic wave. The free evolution of the reservoir is determined by the Hamiltonian $H_E = \int dk\, \omega(k)\, a^*(k) a(k)$, where $\omega(k)$ is the dispersion function.

Suppose now that these two systems interact. The coupling between the matter and the boson field is given by an interaction Hamiltonian $H_I$, a self-adjoint operator on $\mathcal{H}_S \otimes \mathcal{F}$. To derive an explicit form of $H_I$ we use the formula (I.20) in [5], in which we put $G(k) = A\, \frac{g(k)}{\sqrt{2}}$, $g \neq 0 \in \mathcal{H}$, and $A$ is the same as in Example 5. Hence, $H_I = A \otimes \phi(g)$. For simplicity we put the coupling constant equal to one. It should be pointed out that, due to the form of $A$, $H_I$ is a straightforward generalization of the interaction term of the spin-boson model. Because $H_I$ commutes neither with $H$ nor with $H_E$, in order to determine the reduced dynamics of the system we do not follow the general strategy of the previous cases. Instead we use a simplified procedure: first we calculate the reduced dynamics generated by $H_I$ only, and next add to it the automorphic evolution $\alpha_t$. Hence, for any $X \in \mathcal{M}$,
\[ T_t(X) = \alpha_t\big( P_E[e^{itH_I} X \otimes 1_E\, e^{-itH_I}] \big) , \tag{13} \]
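where $P_E$ is the conditional expectation onto the algebra $\mathcal{M}$ with respect to the state $\omega_E$. In order to calculate the explicit form of the superoperators $T_t$ we suppose that $X \in \pi(M_{2^n \times 2^n})$. Then
\[ e^{itH_I} X \otimes 1_E\, e^{-itH_I} = \sum_{\substack{i_1, \ldots, i_n \\ j_1, \ldots, j_n}} P_{i_1 \cdots i_n} X P_{j_1 \cdots j_n} \otimes e^{it(a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n})\phi(g)} , \]
where $P_{i_1 \cdots i_n} = P_{i_1} \otimes \cdots \otimes P_{i_n}$, $i_k \in \{0, 1\}$, and $P_{i_k}$ are spectral projections of $\pi(\sigma_k^3)$. The parameters $a_{i_1 \cdots i_n}$ are given by
\[ a_{i_1 \cdots i_n} = \sum_{k=1}^{n} (-1)^{i_k} \frac{1}{2^k} . \]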
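The parameters just defined can be tabulated exactly for small $n$. The following Python sketch, with the hypothetical helper `a_value` and the illustrative choice $n = 4$, enumerates all index strings and computes the smallest distance between two different values.

```python
from fractions import Fraction
from itertools import product

def a_value(bits):
    """a_{i_1...i_n} = sum_k (-1)^{i_k} / 2^k for indices i_k in {0, 1}."""
    return sum(Fraction((-1) ** i, 2 ** k) for k, i in enumerate(bits, start=1))

n = 4  # illustrative truncation length
values = {bits: a_value(bits) for bits in product((0, 1), repeat=n)}

# Smallest gap between the values attached to two different index strings.
min_gap = min(abs(values[p] - values[q])
              for p in values for q in values if p != q)
```

For this choice the $2^n$ values are pairwise distinct, so every pair of different index strings carries a nonzero frequency difference $a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n}$.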
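Hence,
\[ P_E[e^{itH_I} X \otimes 1_E\, e^{-itH_I}] = \sum_{\substack{i_1, \ldots, i_n \\ j_1, \ldots, j_n}} P_{i_1 \cdots i_n} X P_{j_1 \cdots j_n}\, \langle \tilde{f}, W(t(a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n})g) \tilde{f} \rangle . \]
Because
\[ W(-i\sqrt{2} f)\Omega = e^{-\frac{1}{2}\|f\|^2} e^{a^*(f)} \Omega = |\tilde{f}\rangle , \]
we have
\[ \langle \tilde{f}, W(t(a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n})g) \tilde{f} \rangle = \langle \Omega, W(i\sqrt{2} f)\, W(t(a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n})g)\, W(-i\sqrt{2} f) \Omega \rangle = e^{-\frac{1}{4} t^2 (a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n})^2 \|g\|^2 + it(a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n}) \sqrt{2}\, \mathrm{Re}\langle f, g \rangle} . \]
Hence (with $f \to \sqrt{2} f$),
\[ T_t(X) = \alpha_t\Big( \sum_{\substack{i_1, \ldots, i_n \\ j_1, \ldots, j_n}} P_{i_1 \cdots i_n} X P_{j_1 \cdots j_n}\, e^{-\frac{1}{4} t^2 (a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n})^2 \|g\|^2 + it(a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n}) \mathrm{Re}\langle f, g \rangle} \Big) . \tag{14} \]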
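The scalar factor appearing in Eq. (14) can be evaluated directly. The Python sketch below is an illustration only: the function name `decoherence_factor` and the numerical values chosen for $\|g\|^2$ and $\mathrm{Re}\langle f, g\rangle$ are assumptions made for the example, not data from the model.

```python
import cmath
import math

def decoherence_factor(t, a_i, a_j, g_norm_sq, re_fg):
    """Scalar multiplying P_i X P_j in Eq. (14), before alpha_t is applied."""
    delta = a_i - a_j
    return cmath.exp(-0.25 * t**2 * delta**2 * g_norm_sq + 1j * t * delta * re_fg)

g_norm_sq, re_fg = 1.0, 0.3   # illustrative values of ||g||^2 and Re<f, g>

diag = decoherence_factor(5.0, 0.5, 0.5, g_norm_sq, re_fg)     # equal indices
late = decoherence_factor(10.0, 0.5, -0.5, g_norm_sq, re_fg)   # large time
one_step = decoherence_factor(2.0, 0.5, -0.5, g_norm_sq, re_fg)
two_steps = decoherence_factor(1.0, 0.5, -0.5, g_norm_sq, re_fg) ** 2
```

With these values `diag` equals 1, the modulus of `late` is $e^{-25}$, and `one_step` differs from `two_steps`. The Gaussian dependence on $t$ is thus not multiplicative in time, which is the elementary reason why a family of this form cannot satisfy a semigroup law.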
Because $a_{i_1 \cdots i_n} \neq a_{j_1 \cdots j_n}$ if $i_k \neq j_k$ for at least one $k$, the off-diagonal terms vanish when $t \to \infty$. Let us point out that in this case $T_t$ is not a semigroup. However, the subalgebra of effective observables may be determined in the same way as in Example 5, yielding the same result, i.e. $\mathcal{M}_1 = L^\infty(C, \mu)$, where $C$ is the Cantor set. If $X \in \mathcal{M}_1$, then $T_t(X) = \alpha_t(X)$. In this way we have obtained an abstract commutative dynamical system $(\mathcal{M}_1, \alpha_t, \mathrm{tr})$. In the final step we apply to it the reduction procedure to determine what classical system it represents. Suppose $h$ is a continuous function on $C$. Let us recall that each point $c \in C$ is represented by an infinite sequence $(i_1, i_2, \ldots)$, $i_k = 0, 1$, as $c = \sum_k \frac{2 i_k}{3^k}$. Let
\[ c_1 = (i_1, i_2, \ldots, i_n, 1, 0, 0, \ldots) , \qquad c_2 = (i_1, i_2, \ldots, i_n, 0, 1, 1, \ldots) . \]
Points $c_1$ and $c_2$ are such that there are no points in $C$ between them.

Lemma 12. Suppose $t \to \alpha_t(h)$ is strongly continuous. Then $h(c_1) = h(c_2)$. Moreover, $h(0) = h(1)$.

Proof. See the Appendix.

Let $\mathcal{A}_0$ be a $C^*$-algebra of continuous functions on $C$ such that $\alpha_t$ is strongly continuous on it. Suppose $S^1 = \{e^{ia}, a \in \mathbb{R}\}$, and let $\lambda : C \to S^1$ be given by
\[ \lambda(i_1, i_2, \ldots) = \exp\Big( 2\pi i \sum_{k=1}^{\infty} \frac{i_k}{2^k} \Big) . \]
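The way $\lambda$ glues the two endpoints $c_1$ and $c_2$ of a gap of the Cantor set can be checked on truncated sequences. The Python sketch below uses exact rational arithmetic; the helper names and the choice of prefix $(1, 0)$ and truncation length 30 are illustrative assumptions.

```python
from fractions import Fraction

def cantor_coordinate(bits):
    """Point of the Cantor set c = sum_k 2 i_k / 3^k for a finite sequence."""
    return sum(Fraction(2 * i, 3 ** k) for k, i in enumerate(bits, start=1))

def circle_angle(bits):
    """sum_k i_k / 2^k reduced mod 1, so that lambda(c) = exp(2*pi*i*angle)."""
    return sum(Fraction(i, 2 ** k) for k, i in enumerate(bits, start=1)) % 1

def truncated_pair(prefix, length):
    """Truncations of c_1 = (prefix, 1, 0, 0, ...) and c_2 = (prefix, 0, 1, 1, ...)."""
    tail = length - len(prefix) - 1
    c1 = tuple(prefix) + (1,) + (0,) * tail
    c2 = tuple(prefix) + (0,) + (1,) * tail
    return c1, c2

c1, c2 = truncated_pair((1, 0), 30)
angle_gap = circle_angle(c1) - circle_angle(c2)
cantor_gap = cantor_coordinate(c1) - cantor_coordinate(c2)
```

As the truncation length $K$ grows, `angle_gap` equals $2^{-K}$ and tends to zero, whereas `cantor_gap` stays above $3^{-(n+1)}$, where $n$ is the length of the prefix. In the limit the two points remain distinct in $C$ but have the same image under $\lambda$.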
Because $\lambda$ is surjective, $\hat{\lambda} : C(S^1) \to C(C)$, $\hat{\lambda}(f)(c) = f(\lambda(c))$, $c \in C$, is an embedding. By Lemma 12, $\mathrm{im}\,\hat{\lambda} = \mathcal{A}_0$. Hence $M = S^1$, and the group of automorphisms $\hat{\alpha}_t : C(S^1) \to C(S^1)$ is induced by a uniform flow $g_t(e^{ia}) = e^{i(2\pi t + a)}$. Thus, we may conclude that $(S^1, g_t, \mu_0)$, where $\mu_0$ is the normalized Lebesgue measure on $S^1$, is the induced classical system.

Example 8. Ergodic spin system. Suppose again that a quantum system is a semi-infinite linear array of spin-$\frac{1}{2}$ particles fixed at positions $k \in \mathbb{N}$. The quasi-local algebra $\mathcal{A}$ is the norm closure of the algebra $\mathcal{A}_0 = \bigcup \mathcal{A}_n$ of local observables. Here, by $\mathcal{A}_n$ we denote the local algebra associated with the set $\Lambda_n = \{1, 2, \ldots, n\}$. It is clear that $\mathcal{A}_n = \bigotimes_{k=1}^{n} \mathcal{A}^{(k)}$, where $\mathcal{A}^{(k)}$ is isomorphic with the algebra of $2 \times 2$ matrices. As in the previous example, $\mathcal{M} = \pi(\mathcal{A})'' \subset B(\mathcal{H}_S)$. Suppose that the system interacts with its environment represented by the algebra $B(\mathcal{H}_E)$. The evolution of the joint system is determined by a Hamiltonian
\[ H = H_S \otimes 1_E + 1_S \otimes H_E + A \otimes B , \tag{15} \]
where $H_S$ and $A$ are given by
• $H_S = \pi\big(\prod_{k=1}^{\infty} (1 + b_k \sigma_k^1)\big)$, where $\sigma_k^1 \in \mathcal{A}^{(k)}$ is the first Pauli matrix, $b_k > 0$, and $\sum_{k=1}^{\infty} b_k < \infty$,
• $A = \pi\big(\sum_{k=1}^{\infty} \frac{1}{2^k} \sigma_k^3\big)$, as in Example 5.

Because $\|H_S\|_\infty = \prod_{k=1}^{\infty} (1 + b_k) < \infty$, both $H_S$ and $A$ are bounded and belong to $\pi(\mathcal{A})$. We do not specify the form of the operators $H_E$ and $B$. Instead, we assume that the so-called singular coupling limit [1] may be applied for the derivation of the reduced dynamics of the system. Hence, the Markovian master equation for $x \in \mathcal{M}$ reads
\[ \dot{x} = L(x) = i[H, x] + L_D(x) , \]
where $H = H_S + \alpha A^2$ and
\[ L_D(x) = \gamma A x A - \frac{\gamma}{2} \{x, A^2\} . \]
The coefficients $\alpha \in \mathbb{R}$ and $\gamma > 0$ are given by the formula
\[ \int_0^\infty \mathrm{Tr}(\rho_E\, e^{itH_E} B e^{-itH_E} B)\, dt = \frac{\gamma}{2} + i\alpha , \]
where $\rho_E$ is a density matrix of the environment. It is clear that the semigroup $T_t = e^{tL}$ on $\mathcal{M}$ preserves the trace tr and therefore satisfies the assumptions of Theorem 11. Hence $\mathcal{M} = \mathcal{M}_1 \oplus \mathcal{M}_2$.

Theorem 13. The system $(\mathcal{M}, T_t)$ is ergodic, i.e. $\mathcal{M}_1 = \mathbb{C} \cdot 1$.

Proof. See the Appendix.
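The action of the dissipative part $L_D$ of the generator in Example 8 can be made concrete on a finite truncation. The Python sketch below is an illustration under stated assumptions: it replaces $A$ by the diagonal $4 \times 4$ matrix with eigenvalues $\pm\frac{1}{2} \pm \frac{1}{4}$ (the two-site truncation), takes $\gamma = 2$, and uses an arbitrary test matrix `x`. The helper names are hypothetical.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(A[i][m] * B[m][j] for m in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dissipator(x, A, gamma):
    """L_D(x) = gamma * A x A - (gamma / 2) * (x A^2 + A^2 x)."""
    A2 = matmul(A, A)
    AxA = matmul(matmul(A, x), A)
    xA2, A2x = matmul(x, A2), matmul(A2, x)
    dim = len(x)
    return [[gamma * AxA[i][j] - gamma / 2 * (xA2[i][j] + A2x[i][j])
             for j in range(dim)] for i in range(dim)]

# Assumed two-site truncation: eigenvalues (+-1/2) + (+-1/4) of A.
a = [Fraction(3, 4), Fraction(1, 4), Fraction(-1, 4), Fraction(-3, 4)]
A = [[a[i] if i == j else Fraction(0) for j in range(4)] for i in range(4)]
gamma = Fraction(2)
x = [[Fraction(1 + i + 4 * j) for j in range(4)] for i in range(4)]

result = dissipator(x, A, gamma)
expected = [[-gamma / 2 * x[i][j] * (a[i] - a[j]) ** 2 for j in range(4)]
            for i in range(4)]
```

In this truncation `result` coincides with `expected`: the dissipator multiplies the entry $x_{ij}$ by $-\frac{\gamma}{2}(a_i - a_j)^2$, so it annihilates exactly the matrices that are diagonal in the eigenbasis of $A$.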
The above results are purely mathematical and invite the following question: what is the relation between the physical implications and the proposed mathematical procedure of de-quantization? To establish such a connection it is essential to consider more examples, especially those with nontrivial evolutions on the decoherence-free part of a system. However, Example 7, although a bit contrived, shows directly that in principle infinite quantum systems may behave like simple classical dynamical systems. It means that, when we neglect terms which deteriorate to zero, the rest of the system may be described by a set of classical parameters which evolve according to the laws of classical physics. And from a mathematical point of view there is potentially a full range of classical systems emerging in this way. Since for any compact metric space $M$ there is a surjective map $\lambda : C \to M$, the $C^*$-algebra $C(C)$ contains all algebras $C(M)$ as subalgebras. The open problem is how to construct physically plausible dynamics of the infinite fermion system so that $C(M)$ would be selected as $\mathcal{A}_0$.

Appendix

Proof of Theorem 4. Let $t > 0$ be fixed. We show that $T_t : \mathcal{M}_1 \to \mathcal{M}_1$ is a $*$-automorphism. Let $P(\mathcal{M}_1)$ denote the set of all projections in $\mathcal{M}_1$.

Step 1. Suppose $P, Q \in P(\mathcal{M}_1)$ and $PQ = 0$. Then $T_t(P + Q)$ is a projection and so $T_t(P)T_t(Q) + T_t(Q)T_t(P) = 0$. Since both $T_t(P)$ and $T_t(Q)$ are projections, $T_t(P)T_t(Q) = 0$. It means that $T_t$ maps orthogonal projections to orthogonal ones.

Step 2. Suppose that $A = A^* \in \mathcal{M}_1$. Let $A = \int \lambda\, dE(\lambda)$ be its spectral decomposition. By Step 1, $dT_t E(\lambda)$ is another spectral measure. Because $T_t$ is normal, $T_t(A) = \int \lambda\, dT_t E(\lambda)$ and hence $T_t(A^2) = (T_t A)^2$. It follows that $T_t$ is a Jordan homomorphism.

Step 3. Suppose that $A \in \mathcal{M}_1$. Then, by Step 2,
\[ T_t(A^* A) + T_t(A A^*) = T_t(A)^* T_t(A) + T_t(A) T_t(A)^* . \]
Because, by condition (i) of Sec. 2, $T_t$ is completely positive and preserves the identity operator, it satisfies the Schwarz inequality. Hence $T_t(A^* A) \ge T_t(A)^* T_t(A)$ and $T_t(A A^*) \ge T_t(A) T_t(A)^*$, which implies that $T_t(A^* A) = T_t(A)^* T_t(A)$.

Step 4. Suppose that $\phi \in D$. A sesquilinear form $b_\phi$ on $\mathcal{M}_1$ given by
\[ b_\phi(A, B) = \phi\big( T_t(A^* B) - T_t(A^*) T_t(B) \big) \]
is positive. By Step 3, $b_\phi(A, A) = 0$. Hence also $b_\phi(A, B) = 0$. Because the state $\phi$ was arbitrary, $T_t(A^* B) = T_t(A^*) T_t(B)$. In this way we prove that $T_t$ is a homomorphism.

Step 5. Next we show that $T_t : P(\mathcal{M}_1) \to P(\mathcal{M}_1)$ is bijective. By property (ii) of Definition 3B, it is onto. Suppose that there are $P_1, P_2 \in P(\mathcal{M}_1)$ such that $T_t(P_1) = T_t(P_2)$. Then
\[ T_t((P_1 - P_2)^2) = (T_t(P_1) - T_t(P_2))^2 = 0 \]
and so, by property (iii) of Sec. 2, $\omega_0((P_1 - P_2)^2) = 0$. Because $\omega_0$ is faithful, $P_1 = P_2$.

Step 6. Since $T_t$ is normal, $\ker T_t$ is a von Neumann algebra. As such it is generated by projections. By Step 5, $T_t$ transforms non-zero projections to non-zero ones. Hence $\ker T_t = 0$. The range of $T_t$ is also a von Neumann algebra. Since it contains $P(\mathcal{M}_1)$, it coincides with $\mathcal{M}_1$. Hence $T_t$ is a $*$-automorphism of $\mathcal{M}_1$.

Proof of Theorem 10. Because $T_t : B(\mathcal{H}_S) \to B(\mathcal{H}_S)$ are normal and contractive in the operator norm, there exists a pre-adjoint semigroup $T_{t*} : \mathrm{Tr}(\mathcal{H}_S) \to \mathrm{Tr}(\mathcal{H}_S)$, which is contractive in the trace norm $\|\cdot\|_1$. By $\mathrm{Tr}(\mathcal{H}_S)$ we denote the Banach space of trace class operators on $\mathcal{H}_S$, the predual space of $B(\mathcal{H}_S)$. The set of density matrices $D$ consists of those $\rho \in \mathrm{Tr}(\mathcal{H}_S)$ which are positive and normalized. The cone of positive and normal functionals we denote by $\mathrm{Tr}(\mathcal{H}_S)_+$. Since $T_t$ is Tr-invariant, $T_t : \mathrm{Tr}(\mathcal{H}_S) \to \mathrm{Tr}(\mathcal{H}_S)$, and it is bounded in the trace norm. Thus, the adjoint map $\hat{T}_t = (T_t|_{\mathrm{Tr}(\mathcal{H}_S)})^* : B(\mathcal{H}_S) \to B(\mathcal{H}_S)$ is bounded in the operator norm. Moreover, for any $\rho \in D$,
\[ \mathrm{Tr}(\rho\, \hat{T}_t(1)) = \mathrm{Tr}(T_t \rho) = 1 , \]
which implies that $\hat{T}_t$ is unital. Hence it satisfies the Schwarz inequality. Using this property we show now that $\hat{T}_t$ is in fact contractive in the operator norm. Suppose it is not. Then there exists a minimal constant $C > 1$ such that $\|\hat{T}_t A\|_\infty \le C \|A\|_\infty$ for all $A \in B(\mathcal{H}_S)$. Let us take $v \in \mathcal{H}_S$ with $\|v\| = 1$. Then
\[ \|\hat{T}_t(A) v\|^2 \le \langle v, \hat{T}_t(A^* A) v \rangle = \|(\hat{T}_t(A^* A))^{1/2} v\|^2 \le \|(\hat{T}_t(A^* A))^{1/2}\|_\infty^2 = \|\hat{T}_t(A^* A)\|_\infty \le C \|A\|_\infty^2 . \]
Hence $\|\hat{T}_t(A)\|_\infty \le C^{1/2} \|A\|_\infty$, a contradiction since the constant $C$ was assumed to be minimal. The contractivity of $\hat{T}_t$ implies that $T_t|_{\mathrm{Tr}(\mathcal{H}_S)}$ must also be contractive. Next we show that $T_{t*}$ is also contractive in the operator norm. Suppose $\phi \in \mathrm{Tr}(\mathcal{H}_S)$. Because $\mathrm{Tr}(\mathcal{H}_S) \subset K(\mathcal{H}_S)$, the Banach space (and $C^*$-algebra in fact) of compact operators on $\mathcal{H}_S$, and $K(\mathcal{H}_S)^* = \mathrm{Tr}(\mathcal{H}_S)$, there exists $\psi \in \mathrm{Tr}(\mathcal{H}_S)$ with $\|\psi\|_1 = 1$ such that $\|T_{t*}\phi\|_\infty = |\mathrm{Tr}(T_{t*}\phi)\psi|$.
Hence
\[ \|T_{t*}\phi\|_\infty = |\mathrm{Tr}\, \phi (T_t \psi)| \le \|\phi\|_\infty \|T_t \psi\|_1 \le \|\phi\|_\infty . \]
Summing up: for all $t \ge 0$, $T_{t*} : \mathrm{Tr}(\mathcal{H}_S) \to \mathrm{Tr}(\mathcal{H}_S)$ is completely positive and contractive in both the trace and operator norm.

Next we consider topological properties of the semigroup $\{T_{t*}\}$. Since, by assumption, it possesses a faithful and $T_{t*}$-subinvariant density matrix $\rho_0$, it is relatively compact in the weak operator topology [23]. It means that the set $K_\phi = \{T_{t*}\phi\}_{t \ge 0}$ is relatively compact for any $\phi \in \mathrm{Tr}(\mathcal{H}_S)$ in the weak topology on $\mathrm{Tr}(\mathcal{H}_S)$. Suppose that $\phi \ge 0$. By the Eberlein–Šmulian theorem $K_\phi$ is weakly sequentially compact. Let $\{\phi_n\}$ be an arbitrary sequence in $K_\phi$. Then there exists a subsequence $\{\phi_{m_n}\}$ such that $\text{w-lim}\, \phi_{m_n} = \psi$, where $\psi \in \mathrm{Tr}(\mathcal{H}_S)_+$. However,
$\phi_{m_n} \in \mathrm{Tr}(\mathcal{H}_S)_+$, so, by Corollary 5.11 in [42], $\lim_{n \to \infty} \|\phi_{m_n} - \psi\|_1 = 0$. This implies that $K_\phi$ is sequentially compact, and so it is relatively compact (in the trace norm topology). Suppose now that $\phi \in \mathrm{Tr}(\mathcal{H}_S)$. Then $\phi = \phi_1 - \phi_2 + i\phi_3 - i\phi_4$, where all $\phi_j \in \mathrm{Tr}(\mathcal{H}_S)_+$. Each set $K_j = \{T_{t*}\phi_j\}_{t \ge 0}$ is relatively compact. Because the function $f(\psi_1, \psi_2, \psi_3, \psi_4) = \psi_1 - \psi_2 + i\psi_3 - i\psi_4$, $\psi_j \in \mathrm{Tr}(\mathcal{H}_S)$, is norm continuous, the set $f(\times K_j)$ is compact in $\mathrm{Tr}(\mathcal{H}_S)$. However, for all $t \ge 0$, $T_{t*}\phi \in f(\times K_j)$, which implies that $K_\phi$ is compact. Hence, the semigroup $\{T_{t*}\}$ is relatively compact in the strong operator topology.

We are now in a position to apply Theorem 24 from [34]. It states that $\mathrm{Tr}(\mathcal{H}_S)$ decomposes into an isometric and a sweeping part, $\mathrm{Tr}(\mathcal{H}_S) = \mathrm{Tr}(\mathcal{H}_S)_{iso} \oplus \mathrm{Tr}(\mathcal{H}_S)_s$, such that $T_{t*}(\phi_1) = U_t \phi_1 U_t^*$, $\phi_1 \in \mathrm{Tr}(\mathcal{H}_S)_{iso}$, where $U_t$ is a strongly continuous group of unitary operators, and $\lim_{t \to \infty} \|T_{t*}\phi_2\|_1 = 0$ for all $\phi_2 \in \mathrm{Tr}(\mathcal{H}_S)_s$. Moreover, there exists a Tr-compatible projection $\hat{P}$ (a linear, contractive and completely positive superoperator which satisfies $\hat{P}^2 = \hat{P}$) on $\mathrm{Tr}(\mathcal{H}_S)$ such that its range is equal to $\mathrm{Tr}(\mathcal{H}_S)_{iso}$.

In the final step we translate these results to the operator algebra framework. By Theorem 4.1 in [35], the dual projection $\hat{P}^*$ is a Tr-compatible conditional expectation on $B(\mathcal{H}_S)$. Hence $B(\mathcal{H}_S) = \mathcal{M}_1 \oplus \mathcal{M}_2$, where $\mathcal{M}_1$ is the range of $\hat{P}^*$, and $\mathcal{M}_2$ is the range of $(\mathrm{id} - \hat{P}^*)$. Moreover, $\mathcal{M}_1$ is a von Neumann algebra and the evolution on it is given by $T_t(A_1) = U_t^* A_1 U_t$. What remains to be proven is the uniform decrease to zero of all expectation values of observables belonging to $\mathcal{M}_2$. To this end suppose that $A_2 \in \mathcal{M}_2$ with $\|A_2\|_\infty \le 1$ and $\rho \in D$. Then, by Theorem 24 in [34],
\[ \lim_{t \to \infty} |\mathrm{Tr}\, \rho\, T_t(A_2)| = \lim_{t \to \infty} |\mathrm{Tr}\, (T_{t*}\rho)(\mathrm{id} - \hat{P}^*) A_2| = \lim_{t \to \infty} |\mathrm{Tr}\, (\mathrm{id} - \hat{P})(T_{t*}\rho) A_2| \le \|A_2\|_\infty \lim_{t \to \infty} \|T_{t*}\rho - \hat{P}(T_{t*}\rho)\|_1 = 0 , \]
and the limit is uniform in $A_2$ provided it belongs to the unit ball of $\mathcal{M}_2$.

Proof of Theorem 11. Since $\mathrm{tr} \circ T_t = \mathrm{tr}$, $T_t$ is bounded in the trace norm. Hence, it may be extended to a map $T_t : L^1(\mathcal{M}) \to L^1(\mathcal{M})$. However, $L^1(\mathcal{M}) = \mathcal{M}_*$, so the adjoint map $T_t^* : \mathcal{M} \to \mathcal{M}$ is bounded and unital. Because it is also completely positive, it satisfies the Schwarz inequality. Using the same argument as in the proof of Theorem 10, we conclude that $T_t^*$ is contractive in the operator norm, which further implies the contractivity of $T_t$ in the trace norm. Hence, the assertion follows from Theorem 7 and Corollary 9 in [31].

Proof of Lemma 12. Suppose on the contrary that $h(c_1) \neq h(c_2)$, where
\[ c_1 = (i_1, i_2, \ldots, i_n, 1, 0, 0, \ldots) , \qquad c_2 = (i_1, i_2, \ldots, i_n, 0, 1, 1, \ldots) . \]
Let us take a sequence of dyadic numbers $d_m = \frac{1}{2^m}$. Then, for $m > n + 1$,
\[ (\alpha_{d_m} h)(c_2) = h(U(d_m) c_2) = h(c_m) , \]
where $c_m = (i_1, i_2, \ldots, i_n, 1, 0, \ldots, i_m = 0, 1, 1, \ldots)$. It is clear that $c_m \to c_1$. Because $h$ is continuous, for $\epsilon = \frac{1}{2}|h(c_1) - h(c_2)|$ there exists $N$ such that for all $m > N$, $|h(c_m) - h(c_1)| < \epsilon$. Hence
\[ \|\alpha_{d_m}(h) - h\|_{\sup} \ge |(\alpha_{d_m} h)(c_2) - h(c_2)| = |h(c_m) - h(c_2)| > \epsilon , \]
a contradiction. The condition $h(0) = h(1)$ may be shown in the same way.

Proof of Theorem 13. Let us observe that $T_t : \pi(\mathcal{A}_n) \to \pi(\mathcal{A}_n)$ for any $n \in \mathbb{N}$. Let us first show that the algebra of effective observables for $T_t|_{\pi(\mathcal{A}_n)}$ consists only of operators proportional to the identity operator. Using the isomorphism of $\pi(\mathcal{A}_n)$ with the algebra $M_{2^n \times 2^n}$, we reduce this problem to determining this algebra for $T_t^{(n)} = e^{tL_n}$, where $L_n(x) = i[H_n, x] + L_D^n(x)$ and $x \in M_{2^n \times 2^n}$. Here
\[ H_n = \bigotimes_{k=1}^{n} (1_{2 \times 2} + b_k \sigma^1) + \alpha A_n^2 , \qquad L_D^n(x) = \gamma A_n x A_n - \frac{\gamma}{2} \{x, A_n^2\} , \]
and
\[ A_n = \sum_{i_1, \ldots, i_n = 1}^{2} a_{i_1 \cdots i_n} P_{i_1 \cdots i_n} , \qquad a_{i_1 \cdots i_n} = \sum_{k=1}^{n} (-1)^{i_k - 1} \frac{1}{2^k} , \qquad P_{i_1 \cdots i_n} = P_{i_1} \otimes \cdots \otimes P_{i_n} . \]
$P_1$ and $P_2$ are the spectral projections of $\sigma^3$, i.e. $\sigma^3 = P_1 - P_2$. It is clear that if $x = z 1_{2^n \times 2^n}$, $z \in \mathbb{C}$, then $T_t^{(n)} x = T_t^{(n)*} x = x$ for all $t \ge 0$. Conversely, suppose that $T_t^{(n)} T_t^{(n)*} x = T_t^{(n)*} T_t^{(n)} x = x$ for some $x \in M_{2^n \times 2^n}$. Then, by calculating the first and second time derivative at $t = 0$ of the above equation, we obtain that $L_D^n(x) = 0$ and $L_D^n([H_n, x]) = 0$. Because
\[ (L_D^n(x))_{i_1 \cdots i_n, j_1 \cdots j_n} = -\frac{\gamma}{2}\, x_{i_1 \cdots i_n, j_1 \cdots j_n} (a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n})^2 , \]
$L_D^n(x) = 0$ if and only if $x$ is diagonal, i.e.
\[ x = \sum_{i_1, \ldots, i_n = 1}^{2} x_{i_1 \cdots i_n} P_{i_1 \cdots i_n} . \]
However, $[H_n, x]$ is diagonal as well, so
\[ P_{i_1 \cdots i_n} [H_n, x] P_{j_1 \cdots j_n} = 0 \]
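for all $(i_1 \cdots i_n) \neq (j_1 \cdots j_n)$. Because
\[ P_{i_1 \cdots i_n} [H_n, x] P_{j_1 \cdots j_n} = (x_{j_1 \cdots j_n} - x_{i_1 \cdots i_n}) P_{i_1 \cdots i_n} H_n P_{j_1 \cdots j_n} \]
and
\[ (H_n)_{i_1 \cdots i_n, j_1 \cdots j_n} = \prod_{k=1}^{n} (1_{2 \times 2} + b_k \sigma^1)_{i_k j_k} = \prod_{k=1}^{n} (\delta_{i_k j_k} + b_k (\sigma^1)_{i_k j_k}) \neq 0 , \]
we have $x_{i_1 \cdots i_n} = x_{j_1 \cdots j_n}$ for all $(i_1 \cdots i_n) \neq (j_1 \cdots j_n)$. Hence $x = z 1_{2^n \times 2^n}$. Suppose now that $y \in \pi(\mathcal{A}_n)$ and $\mathrm{tr}\, y = 0$. Then, for any $x \in \pi(\mathcal{A}_n)$,
\[ \lim_{t \to \infty} \mathrm{tr}(x T_t y) = 0 . \]
Since $\pi(\mathcal{A}_n)$ is finite dimensional, all topologies coincide on it. Hence $\|T_t y\|_2 \to 0$ when $t \to \infty$. Finally, we show that $\mathcal{M}_1 = \mathbb{C} \cdot 1$. Suppose on the contrary that $x \in \mathcal{M}_1$ and $x \neq z1$. Then $y = x - (\mathrm{tr}\, x)1 \in \mathcal{M}_1$ and $y \neq 0$. Hence, we may assume that $\|y\|_2 = 1$. Let $(x_n)$ be a sequence such that $x_n \in \pi(\mathcal{A}_n)$ and $x_n \to x$ in $L^2(\mathcal{M})$. Then $y_n = x_n - (\mathrm{tr}\, x_n)1 \in \pi(\mathcal{A}_n)$ and $y_n \to y$. Hence, there exists $n_0 \in \mathbb{N}$ such that $\|y - y_{n_0}\|_2 < \frac{1}{4}$. On the other hand, since $\mathrm{tr}\, y_{n_0} = 0$, there exists $t_0 > 0$ such that $\|T_{t_0} y_{n_0}\|_2 < \frac{1}{4}$. Thus
\[ 1 = \|T_{t_0} y\|_2 \le \|T_{t_0}(y - y_{n_0})\|_2 + \|T_{t_0} y_{n_0}\|_2 < \frac{1}{4} + \frac{1}{4} = \frac{1}{2} , \]
a contradiction.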
0 of self-adjoint operators (as $\kappa \to \infty$) which are defined by “cutting off” $S$ and may be tractable. We apply this abstract method to the construction of a self-adjoint extension of the Pauli–Fierz Hamiltonian without spin (Appendix B) and that with spin 1/2 (Sec. 3.3).

2. The Dirac–Maxwell Operator and the Pauli–Fierz Hamiltonian

For a linear operator $T$ on a Hilbert space, we denote its domain by $D(T)$, and its adjoint by $T^*$ (provided that $T$ is densely defined). For two objects $a = (a_1, a_2, a_3)$ and $b = (b_1, b_2, b_3)$ such that the products $a_j b_j$ ($j = 1, 2, 3$) and their sum can be defined, we set $a \cdot b := \sum_{j=1}^{3} a_j b_j$. We use the physical unit system in which $c$ (the speed of light) $= 1$ and $\hbar = 1$ ($\hbar := h/(2\pi)$; $h$ is the Planck constant).

2.1. The Dirac operator

Let $D_j$ ($j = 1, 2, 3$) be the generalized partial differential operator in the variable $x_j$, the $j$th component of $x = (x_1, x_2, x_3) \in \mathbb{R}^3$, and $\nabla := (D_1, D_2, D_3)$. We denote the mass and the charge of the Dirac particle by $m > 0$ and $q \in \mathbb{R} \setminus \{0\}$ respectively. We consider the situation where the Dirac particle is in a potential $V$ which is a Hermitian-matrix-valued Borel measurable function on $\mathbb{R}^3$. Then the Hamiltonian of the Dirac particle is given by the Dirac operator
\[ H_D := \alpha \cdot (-i\nabla) + m\beta + V \tag{2.1} \]
May 26, 2003 16:37 WSPC/148-RMP
00162
Non-Relativistic Limit of a Dirac–Maxwell Operator
247
acting in the Hilbert space
\[ \mathcal{H}_D := \oplus^4 L^2(\mathbb{R}^3) \tag{2.2} \]
with domain $D(H_D) := [\oplus^4 H^1(\mathbb{R}^3)] \cap D(V)$ ($H^1(\mathbb{R}^3)$ is the Sobolev space of order 1) and $\alpha := (\alpha_1, \alpha_2, \alpha_3)$, where $\alpha_j$ ($j = 1, 2, 3$) and $\beta$ are $4 \times 4$ Hermitian matrices satisfying the anticommutation relations
\[ \{\alpha_j, \alpha_k\} = 2\delta_{jk} , \qquad j, k = 1, 2, 3 , \tag{2.3} \]
\[ \{\alpha_j, \beta\} = 0 , \qquad \beta^2 = I_4 , \qquad j = 1, 2, 3 , \tag{2.4} \]
where $\{A, B\} := AB + BA$, $\delta_{jk}$ is the Kronecker delta and $I_4$ is the $4 \times 4$ identity matrix. We assume the following:

Hypothesis (A). Each matrix element of $V$ is almost everywhere (a.e.) finite with respect to the three-dimensional Lebesgue measure $dx$ and the subspace $\cap_{j=1}^{3} [D(D_j) \cap D(V)]$ is dense in $\mathcal{H}_D$.

Under this hypothesis, $H_D$ is a symmetric operator. Detailed analysis of the Dirac operator is given in [11].

Example 2.1. A typical example for $V$ is
\[ V_{em} := \phi I_4 - q\alpha \cdot A^{ex} \]
with $\phi : \mathbb{R}^3 \to \mathbb{R}$ an external scalar potential and $A^{ex} := (A_1^{ex}, A_2^{ex}, A_3^{ex}) : \mathbb{R}^3 \to \mathbb{R}^3$ an external vector potential, where $A_j^{ex}$ and $\phi$ are in the set
\[ L^2_{loc}(\mathbb{R}^3) := \Big\{ f : \mathbb{R}^3 \to \mathbb{C};\ \text{Borel measurable},\ \int_{|x| \le R} |f(x)|^2\, dx < \infty,\ \forall R > 0 \Big\} . \]
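Returning to the anticommutation relations (2.3) and (2.4), one concrete realization can be checked by direct matrix multiplication. The Python sketch below assumes the standard (Dirac) representation, $\alpha_j$ with off-diagonal blocks $\sigma_j$ and $\beta = \mathrm{diag}(1, 1, -1, -1)$; the text itself does not fix a representation, and the helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two 4x4 matrices given as lists of rows."""
    return [[sum(A[i][m] * B[m][j] for m in range(4)) for j in range(4)]
            for i in range(4)]

def anticommutator(A, B):
    """{A, B} = AB + BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] + BA[i][j] for j in range(4)] for i in range(4)]

def off_diagonal_blocks(sigma):
    """4x4 matrix [[0, sigma], [sigma, 0]] built from a 2x2 matrix sigma."""
    zero = [0, 0]
    return [zero + sigma[0], zero + sigma[1], sigma[0] + zero, sigma[1] + zero]

def scaled_identity(c):
    """c times the 4x4 identity matrix."""
    return [[c if i == j else 0 for j in range(4)] for i in range(4)]

# Pauli matrices sigma_1, sigma_2, sigma_3.
sigma = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
alpha = [off_diagonal_blocks(s) for s in sigma]   # assumed Dirac representation
beta = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
```

With this choice each $\{\alpha_j, \alpha_k\}$ equals $2\delta_{jk} I_4$, each $\{\alpha_j, \beta\}$ vanishes and $\beta^2 = I_4$. Any other representation satisfying (2.3) and (2.4) would serve equally well for the discussion above.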
Then $D(V_{\rm em}) \supset \oplus^4 C_0^\infty(\mathbb{R}^3)$, where $C_0^\infty(\mathbb{R}^3)$ is the set of $C^\infty$-functions on $\mathbb{R}^3$ with compact support. Hence $\cap_{j=1}^{3} [D(D_j) \cap D(V_{\rm em})]$ is dense. Thus $V_{\rm em}$ obeys Hypothesis (A).

2.2. The quantum radiation field

The Hilbert space of one-photon states in momentum representation is given by
$$\mathcal{H}_{\rm ph} := L^2(\mathbb{R}^3) \oplus L^2(\mathbb{R}^3), \tag{2.5}$$
where $\mathbb{R}^3 := \{k = (k_1, k_2, k_3) \mid k_j \in \mathbb{R},\ j = 1, 2, 3\}$ physically means the momentum space of photons. Then a Hilbert space for the quantum radiation field in the Coulomb gauge is given by
$$\mathcal{F}_{\rm rad} := \oplus_{n=0}^{\infty} \otimes_s^n \mathcal{H}_{\rm ph}, \tag{2.6}$$
the Boson Fock space over $\mathcal{H}_{\rm ph}$, where $\otimes_s^n \mathcal{H}_{\rm ph}$ denotes the $n$-fold symmetric tensor product of $\mathcal{H}_{\rm ph}$ and $\otimes_s^0 \mathcal{H}_{\rm ph} := \mathbb{C}$. For basic facts on the theory of the Boson Fock space, we refer the reader to [9, §X.7].
We denote by $a(F)$ ($F \in \mathcal{H}_{\rm ph}$) the annihilation operator with test vector $F$ on $\mathcal{F}_{\rm rad}$; its adjoint is given by
$$(a(F)^* \Psi)^{(n)} = \sqrt{n}\, S_n (F \otimes \Psi^{(n-1)}), \quad n \ge 0, \quad \Psi = \{\Psi^{(n)}\}_{n=0}^{\infty} \in D(a(F)^*),$$
where $S_n$ is the symmetrization operator on $\otimes^n \mathcal{H}_{\rm ph}$ and $\Psi^{(-1)} := 0$. For each $f \in L^2(\mathbb{R}^3)$, we define
$$a^{(1)}(f) := a(f, 0), \quad a^{(2)}(f) := a(0, f). \tag{2.7}$$
The mapping $f \to a^{(r)}(f^*)$ restricted to $\mathcal{S}(\mathbb{R}^3)$ (the Schwartz space of rapidly decreasing $C^\infty$-functions on $\mathbb{R}^3$) defines an operator-valued distribution ($f^*$ denotes the complex conjugate of $f$). We denote its symbolical kernel by $a^{(r)}(k)$: $a^{(r)}(f) = \int a^{(r)}(k) f(k)^*\, dk$.

We take a nonnegative Borel measurable function $\omega$ on $\mathbb{R}^3$ to denote the energy of one free photon. We assume that, for a.e. $k \in \mathbb{R}^3$ with respect to the Lebesgue measure on $\mathbb{R}^3$, $0 < \omega(k) < \infty$. Then the function $\omega$ defines uniquely a multiplication operator on $\mathcal{H}_{\rm ph}$ which is nonnegative, self-adjoint and injective. We denote it by the same symbol $\omega$. The free Hamiltonian of the quantum radiation field is then defined by
$$H_{\rm rad} := d\Gamma(\omega), \tag{2.8}$$
the second quantization of $\omega$ ([8, p. 302, Example 2] and [9, §X.7]). The operator $H_{\rm rad}$ is a nonnegative self-adjoint operator. The symbolical expression of $H_{\rm rad}$ is
$$H_{\rm rad} = \sum_{r=1}^{2} \int \omega(k)\, a^{(r)}(k)^* a^{(r)}(k)\, dk.$$

Remark 2.1. Usually $\omega$ is taken to be of the form $\omega_{\rm phys}(k) := |k|$, $k \in \mathbb{R}^3$, but, in this paper, for mathematical generality, we do not restrict ourselves to this case.

There exist $\mathbb{R}^3$-valued Borel measurable functions $e^{(r)}$ ($r = 1, 2$) on $\mathbb{R}^3$ such that, for a.e. $k$,
$$e^{(r)}(k) \cdot e^{(s)}(k) = \delta_{rs}, \quad e^{(r)}(k) \cdot k = 0, \quad r, s = 1, 2. \tag{2.9}$$
These vector-valued functions $e^{(r)}$ are called the polarization vectors of a photon. The time-zero quantum radiation field is given by $A(x) := (A_1(x), A_2(x), A_3(x))$ with
$$A_j(x) := \sum_{r=1}^{2} \int dk\, \frac{e_j^{(r)}(k)}{\sqrt{2(2\pi)^3 \omega(k)}} \big\{ a^{(r)}(k)^* e^{-ik \cdot x} + a^{(r)}(k) e^{ik \cdot x} \big\}, \quad j = 1, 2, 3, \tag{2.10}$$
in the sense of operator-valued distribution. Let $\varrho$ be a real tempered distribution on $\mathbb{R}^3$ such that
$$\frac{\hat{\varrho}}{\sqrt{\omega}},\ \frac{\hat{\varrho}}{\omega} \in L^2(\mathbb{R}^3), \tag{2.11}$$
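where $\hat{\varrho}$ denotes the Fourier transform of $\varrho$. The quantum radiation field $A^{\varrho} := (A_1^{\varrho}, A_2^{\varrho}, A_3^{\varrho})$ with momentum cut-off $\hat{\varrho}$ is defined by
$$A_j^{\varrho}(x) := \sum_{r=1}^{2} \int dk\, \frac{e_j^{(r)}(k)}{\sqrt{2\omega(k)}} \big\{ a^{(r)}(k)^* e^{-ik \cdot x} \hat{\varrho}(k)^* + a^{(r)}(k) e^{ik \cdot x} \hat{\varrho}(k) \big\}. \tag{2.12}$$
Symbolically
$$A_j^{\varrho}(x) = \int A_j(x - y) \varrho(y)\, dy. \tag{2.13}$$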
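As an illustration of the polarization condition (2.9), the following Python sketch builds one possible pair of polarization vectors for a given nonzero momentum $k$ using cross products. The function names (`polarization_vectors`, `cross`, `dot`, `normalize`) and the particular construction are assumptions made for this example; the paper only asserts that Borel measurable functions with property (2.9) exist and does not fix a choice.

```python
import math


def dot(u, v):
    """Euclidean inner product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    """Cross product of two 3-vectors."""
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]


def normalize(u):
    """Return u scaled to unit length."""
    norm = math.sqrt(dot(u, u))
    return [a / norm for a in u]


def polarization_vectors(k):
    """Return unit vectors e1, e2 with e1.e2 = 0 and e1.k = e2.k = 0.

    Hypothetical construction: take the coordinate axis least aligned
    with k, so that the cross product with k cannot vanish.
    """
    if dot(k, k) == 0:
        raise ValueError("k must be nonzero")
    idx = min(range(3), key=lambda i: abs(k[i]))
    axis = [0.0, 0.0, 0.0]
    axis[idx] = 1.0
    e1 = normalize(cross(k, axis))   # orthogonal to k
    e2 = normalize(cross(k, e1))     # orthogonal to both k and e1
    return e1, e2
```

For example, `polarization_vectors([1.0, 2.0, 3.0])` returns two unit vectors that are mutually orthogonal and orthogonal to `k`, which is exactly what (2.9) requires at that point.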
2.3. The Dirac–Maxwell operator

The Hilbert space of state vectors for the coupled system of the Dirac particle and the quantum radiation field is taken to be
$$\mathcal{F} := \mathcal{H}_D \otimes \mathcal{F}_{\rm rad}. \tag{2.14}$$
This Hilbert space can be identified as
$$\mathcal{F} = L^2(\mathbb{R}^3; \oplus^4 \mathcal{F}_{\rm rad}) = \int_{\mathbb{R}^3}^{\oplus} \oplus^4 \mathcal{F}_{\rm rad}\, dx, \tag{2.15}$$
the Hilbert space of $\oplus^4 \mathcal{F}_{\rm rad}$-valued Lebesgue square integrable functions on $\mathbb{R}^3$ (the constant fibre direct integral with base space $(\mathbb{R}^3, dx)$ and fibre $\oplus^4 \mathcal{F}_{\rm rad}$ [10, §XIII.6]). We freely use this identification. The total Hamiltonian of the coupled system (a particle-field Hamiltonian) is defined by
$$H := H_D + H_{\rm rad} - q\alpha \cdot A^{\varrho} = \alpha \cdot (-i\nabla - qA^{\varrho}) + m\beta + V + H_{\rm rad}. \tag{2.16}$$
We call $H$ a Dirac–Maxwell operator. The (essential) self-adjointness of $H$ is discussed in [3].

2.4. The Pauli–Fierz Hamiltonian with spin 1/2

A Hamiltonian which describes a quantum system of non-relativistic charged particles interacting with the quantum radiation field is called a Pauli–Fierz Hamiltonian [7]. Here we consider a non-relativistic charged particle with mass $m$, charge $q$ and spin 1/2. Suppose that the particle is in an external electromagnetic vector potential $(A^{\rm ex}, \phi)$, where $A^{\rm ex} := (A_1^{\rm ex}, A_2^{\rm ex}, A_3^{\rm ex}) : \mathbb{R}^3 \to \mathbb{R}^3$ and $\phi : \mathbb{R}^3 \to \mathbb{R}$ are Borel measurable and a.e. finite with respect to $dx$. Let
$$\sigma_1 := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 := \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{2.17}$$
the Pauli spin matrices, and set
$$\sigma := (\sigma_1, \sigma_2, \sigma_3). \tag{2.18}$$
Then the Pauli–Fierz Hamiltonian of this quantum system is defined by
$$H_{\rm PF} := \frac{\{\sigma \cdot (-i\nabla - qA^{\varrho} - qA^{\rm ex})\}^2}{2m} + \phi + H_{\rm rad} \tag{2.19}$$
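acting in the Hilbert space
$$\mathcal{F}_{\rm PF} := L^2(\mathbb{R}^3; \mathbb{C}^2) \otimes \mathcal{F}_{\rm rad} = L^2(\mathbb{R}^3; \oplus^2 \mathcal{F}_{\rm rad}) = \int_{\mathbb{R}^3}^{\oplus} \oplus^2 \mathcal{F}_{\rm rad}\, dx. \tag{2.20}$$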
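To make the role of the Pauli matrices (2.17) in the squared expression of (2.19) concrete, the sketch below checks numerically that $(\sigma \cdot a)^2 = |a|^2 I_2$ for a constant real vector $a$. This is only a finite-dimensional illustration under the assumption that the components of $a$ commute; in (2.19) the components of $-i\nabla - qA^{\varrho} - qA^{\rm ex}$ do not commute in general, so the square there contains additional spin-dependent terms. The helper names are hypothetical.

```python
# Pauli matrices (2.17) as nested lists of complex numbers.
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]


def matmul(x, y):
    """Product of two matrices given as nested lists."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def sigma_dot(a):
    """Return the 2x2 matrix sigma . a for a real 3-vector a."""
    return [
        [a[0] * S1[i][j] + a[1] * S2[i][j] + a[2] * S3[i][j] for j in range(2)]
        for i in range(2)
    ]


def sigma_dot_squared(a):
    """Return (sigma . a)^2, which equals |a|^2 times the identity."""
    m = sigma_dot(a)
    return matmul(m, m)
```

With `a = [1.0, 2.0, 3.0]` the result is 14 times the identity matrix, since $|a|^2 = 14$.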
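For the Pauli–Fierz Hamiltonian without spin, see Appendix B.

3. Main Result

3.1. A Dirac operator coupled to the quantum radiation field

We use the following representation of $\alpha_j$ and $\beta$ [11, p. 3]:
$$\alpha_j := \begin{pmatrix} 0 & \sigma_j \\ \sigma_j & 0 \end{pmatrix}, \quad \beta := \begin{pmatrix} I_2 & 0 \\ 0 & -I_2 \end{pmatrix}, \tag{3.1}$$
where $I_2$ is the $2 \times 2$ identity matrix. Hence the eigenspaces $\mathcal{H}_D^{\pm}$ of $\beta$ with eigenvalue $\pm 1$ take the forms respectively
$$\mathcal{H}_D^{+} = \left\{ \begin{pmatrix} f \\ g \\ 0 \\ 0 \end{pmatrix} \,\middle|\, f, g \in L^2(\mathbb{R}^3) \right\}, \quad \mathcal{H}_D^{-} = \left\{ \begin{pmatrix} 0 \\ 0 \\ f \\ g \end{pmatrix} \,\middle|\, f, g \in L^2(\mathbb{R}^3) \right\} \tag{3.2}$$
and we have
$$\mathcal{H}_D = \mathcal{H}_D^{+} \oplus \mathcal{H}_D^{-}. \tag{3.3}$$
Let $P_{\pm}$ be the orthogonal projections onto $\mathcal{H}_D^{\pm}$. Then we have
$$V = V_0 + V_1 \tag{3.4}$$
with
$$V_0 = P_+ V P_+ + P_- V P_-, \quad V_1 = P_+ V P_- + P_- V P_+. \tag{3.5}$$
Note that
$$[V_0, \beta] = 0, \quad \{V_1, \beta\} = 0,$$
where $[A, B] := AB - BA$. In operator-matrix form relative to the orthogonal decomposition (3.3), we have
$$V_0 = \begin{pmatrix} U_+ & 0 \\ 0 & U_- \end{pmatrix}, \quad V_1 = \begin{pmatrix} 0 & W^* \\ W & 0 \end{pmatrix}, \tag{3.6}$$
where $U_{\pm}$ are $2 \times 2$ Hermitian matrix-valued functions on $\mathbb{R}^3$ and $W$ is a $2 \times 2$ complex matrix-valued function on $\mathbb{R}^3$. Let
$$\slashed{D}(V_1) := \alpha \cdot (-i\nabla - qA^{\varrho}) + V_1, \tag{3.7}$$
then, recalling that $A_j^{\varrho}$ is $H_{\rm rad}^{1/2}$-bounded [3] by (2.11), we see that $\slashed{D}(V_1)$ is densely defined and symmetric with $D(\slashed{D}(V_1)) \supset (\cap_{j=1}^{3} [D(D_j) \cap D(V)]) \otimes_{\rm alg} D(H_{\rm rad}^{1/2})$, where $\otimes_{\rm alg}$ means algebraic tensor product.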
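The algebra behind (3.1)–(3.6) can be checked on constant matrices. The Python sketch below builds $\alpha_j$ and $\beta$ in the representation (3.1), and splits a constant Hermitian $4 \times 4$ matrix $V$ into $V_0$ and $V_1$ as in (3.5). It is a finite-dimensional illustration only: in the paper $V$ is a matrix-valued function, and all helper names (`block`, `split_potential`, and so on) are assumptions introduced for this example.

```python
I2 = [[1, 0], [0, 1]]
NEG_I2 = [[-1, 0], [0, -1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


def matmul(x, y):
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def add(x, y, sign=1):
    """Return x + sign * y entrywise."""
    return [[x[i][j] + sign * y[i][j] for j in range(len(x[0]))] for i in range(len(x))]


def scale(c, x):
    return [[c * entry for entry in row] for row in x]


def comm(x, y):
    return add(matmul(x, y), matmul(y, x), -1)


def anticomm(x, y):
    return add(matmul(x, y), matmul(y, x))


def max_abs(x):
    return max(abs(entry) for row in x for entry in row)


# Representation (3.1).
ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]
BETA = block(I2, Z2, Z2, NEG_I2)
I4 = block(I2, Z2, Z2, I2)


def split_potential(v):
    """Return (V0, V1) as in (3.5), with P_pm = (1 pm beta) / 2."""
    p_plus = scale(0.5, add(I4, BETA))
    p_minus = scale(0.5, add(I4, BETA, -1))
    v0 = add(matmul(matmul(p_plus, v), p_plus), matmul(matmul(p_minus, v), p_minus))
    v1 = add(matmul(matmul(p_plus, v), p_minus), matmul(matmul(p_minus, v), p_plus))
    return v0, v1
```

One can confirm with these helpers that the matrices satisfy (2.3) and (2.4), and that for any Hermitian sample `v` the two parts returned by `split_potential(v)` sum to `v`, with `comm(v0, BETA)` and `anticomm(v1, BETA)` both vanishing.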
By (3.3), we have the following orthogonal decomposition of $\mathcal{F}$:
$$\mathcal{F} = \mathcal{F}_+ \oplus \mathcal{F}_-, \tag{3.8}$$
where
$$\mathcal{F}_{\pm} := \mathcal{H}_D^{\pm} \otimes \mathcal{F}_{\rm rad} \cong \mathcal{F}_{\rm PF}. \tag{3.9}$$
Relative to this orthogonal decomposition, we can write
$$\slashed{D}(V_1) = \begin{pmatrix} 0 & D_{W^*} \\ D_W & 0 \end{pmatrix}, \tag{3.10}$$
where
$$D_W := \sigma \cdot (-i\nabla - qA^{\varrho}) + W, \tag{3.11}$$
$$D_{W^*} := \sigma \cdot (-i\nabla - qA^{\varrho}) + W^* \tag{3.12}$$
acting in $\mathcal{F}_{\rm PF}$. For a closable linear operator $T$ on a Hilbert space, we denote its closure by $\bar{T}$ unless otherwise stated. Note that $D_W$ is densely defined as an operator on $\mathcal{F}_{\rm PF}$ and $(D_W)^* \supset D_{W^*}$. Hence $(D_W)^*$ is densely defined. Thus $D_W$ is closable. Based on this fact, we can define
$$\tilde{\slashed{D}}(V_1) := \begin{pmatrix} 0 & (\bar{D}_W)^* \\ \bar{D}_W & 0 \end{pmatrix}. \tag{3.13}$$

Lemma 3.1. Under Hypothesis (A), $\tilde{\slashed{D}}(V_1)$ is a self-adjoint extension of $\slashed{D}(V_1)$.

Proof. The self-adjointness of $\tilde{\slashed{D}}(V_1)$ follows from a general theorem (e.g. [11, p. 142, Lemma 5.3]). It is obvious that $\tilde{\slashed{D}}(V_1)|[D(D_W) \oplus D(D_{W^*})] = \slashed{D}(V_1)$, where, for a linear operator $T$ and a subspace $\mathcal{D} \subset D(T)$, $T|\mathcal{D}$ denotes the restriction of $T$ to $\mathcal{D}$. Hence $\tilde{\slashed{D}}(V_1)$ is a self-adjoint extension of $\slashed{D}(V_1)$.

Remark 3.1. The operator
$$\hat{\slashed{D}}(V_1) := \begin{pmatrix} 0 & \bar{D}_{W^*} \\ (\bar{D}_{W^*})^* & 0 \end{pmatrix} \tag{3.14}$$
is also a self-adjoint extension of $\slashed{D}(V_1)$. But, for simplicity, we consider here only $\tilde{\slashed{D}}(V_1)$. Discussions on $\tilde{\slashed{D}}(V_1)$ presented below apply also to $\hat{\slashed{D}}(V_1)$ with suitable modifications.

3.2. A scaled Dirac–Maxwell operator

For a self-adjoint operator $A$, we denote the spectrum and the spectral measure of $A$ by $\sigma(A)$ and $E_A(\cdot)$ respectively. In the case where $A$ is bounded from below, we set
$$E_0(A) := \inf \sigma(A), \quad A_0 := A - E_0(A) \ge 0.$$
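Let $\Lambda : (0, \infty) \to (0, \infty)$ be a nondecreasing function such that $\Lambda(\kappa) \to \infty$ as $\kappa \to \infty$ and $A$ be a self-adjoint operator on a Hilbert space. Then, for each $\kappa > 0$, we define $A^{(\kappa)}$ by
$$A^{(\kappa)} := \begin{cases} E_{A_0}([0, \Lambda(\kappa)])\, A_0\, E_{A_0}([0, \Lambda(\kappa)]) + E_0(A) & \text{if } A \text{ is bounded from below and } E_0(A) < 0, \\ E_{|A|}([0, \Lambda(\kappa)])\, A\, E_{|A|}([0, \Lambda(\kappa)]) & \text{if } A \text{ is nonnegative or } A \text{ is not bounded from below}. \end{cases} \tag{3.15}$$
Then $A^{(\kappa)}$ is a bounded self-adjoint operator with
$$\|A^{(\kappa)}\| \le \Lambda(\kappa). \tag{3.16}$$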
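The cut-off (3.15) is easiest to read for an operator that is already diagonal. The following Python sketch applies the two cases of (3.15) to a finite list of eigenvalues, which stands in for a self-adjoint operator in its spectral representation. This is a toy model under stated assumptions: a finite list is always bounded from below, so the branch "not bounded from below" cannot occur here, and the function name `cutoff_eigenvalues` is hypothetical.

```python
def cutoff_eigenvalues(eigs, lam):
    """Apply the cut-off (3.15) to a diagonal operator with eigenvalues eigs.

    eigs: list of real eigenvalues (a finite-dimensional stand-in for A).
    lam:  the cut-off level Lambda(kappa) > 0.
    """
    e0 = min(eigs)  # E_0(A) = inf of the spectrum
    if e0 < 0:
        # First case: cut off A_0 = A - E_0(A), then add E_0(A) back.
        # Kept eigenvalues are unchanged; removed ones are replaced by E_0(A).
        return [x if (x - e0) <= lam else e0 for x in eigs]
    # Second case (A nonnegative): keep eigenvalues with |x| <= lam, else 0.
    return [x if abs(x) <= lam else 0.0 for x in eigs]
```

For instance, with eigenvalues `[-2.0, 0.0, 5.0, 40.0]` and cut-off level `10.0`, the value 40 lies outside the window for $A_0$ and is replaced by $E_0(A) = -2$; every other eigenvalue is left unchanged. As the cut-off level grows, each fixed eigenvalue is eventually kept, which is the finite-dimensional shadow of the strong convergence stated in Proposition 3.2(i).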
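Proposition 3.2. The following hold:
(i) For all $\psi \in D(A)$, $\operatorname{s-lim}_{\kappa \to \infty} A^{(\kappa)} \psi = A\psi$, where s-lim means strong limit.
(ii) For all $z \in \mathbb{C} \setminus \mathbb{R}$, $\operatorname{s-lim}_{\kappa \to \infty} (A^{(\kappa)} - z)^{-1} = (A - z)^{-1}$.

Proof. Part (i) follows from the functional calculus of $A$. Part (ii) follows from (i) and a general convergence theorem [8, p. 292, Theorem VIII.25(a)].

With this preliminary, we define for $\kappa > 0$ a scaled Dirac–Maxwell operator
$$H(\kappa) := \kappa \tilde{\slashed{D}}(V_1) + \kappa^2 m\beta - \kappa^2 m + V_{0,\kappa} + H_{\rm rad}^{(\kappa)}, \tag{3.17}$$
where
$$V_{0,\kappa} := \begin{pmatrix} U_+^{(\kappa)} & 0 \\ 0 & U_-^{(\kappa)} \end{pmatrix}. \tag{3.18}$$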
Some remarks may be in order on this definition. The parameter $\kappa$ in $H(\kappa)$ means the speed of light concerning the Dirac particle only. The speed of light related to the external potential $V = V_0 + V_1$ and the quantum radiation field $A^{\varrho}$ is absorbed in them respectively. The third term $-\kappa^2 m$ on the right hand side of (3.17) is a subtraction of the rest energy of the Dirac particle. Hence taking the scaling limit $\kappa \to \infty$ in $H(\kappa)$ in a suitable sense corresponds in fact to a partial non-relativistic limit of the quantum system under consideration.

If one considers the non-relativistic limit in a way similar to the usual Dirac operator $H_D$, then one may define
$$\hat{H}(\kappa) := \kappa \tilde{\slashed{D}}(V_1) + \kappa^2 m\beta - \kappa^2 m + V_0 + H_{\rm rad} \tag{3.19}$$
as a scaled Dirac–Maxwell operator, where no cut-offs on $V_0$ and $H_{\rm rad}$ are made. In this form, however, we find that, besides the (essential) self-adjointness problem of $\hat{H}(\kappa)$, the methods used in the usual Dirac type operators ([11, Chapter 6] or those in [2]) seem not to work. This is because of the existence of the operator $H_{\rm rad}$ in $\hat{H}(\kappa)$ which is singular as a perturbation of $H_0(\kappa) := \kappa \tilde{\slashed{D}}(V_1) + \kappa^2 m\beta - \kappa^2 m + V_0$
(if one would try to apply the methods on scaling limits discussed in the cited literature, then one would have to treat $H_{\rm rad}$ as a perturbation of $H_0(\kappa)$). To avoid this difficulty, we replace $H_{\rm rad}$ in $\hat{H}(\kappa)$ by a bounded self-adjoint operator which is obtained by cutting off $H_{\rm rad}$. This is one of the basic ideas of the present paper. We apply the same idea to $V_0$ which also may be singular as a perturbation of $\kappa \tilde{\slashed{D}}(V_1) + \kappa^2 m\beta - \kappa^2 m$. In this way we arrive at Definition (3.17) of a scaled Dirac–Maxwell operator.

Lemma 3.3. Under Hypothesis (A), $H(\kappa)$ is self-adjoint with $D(H(\kappa)) = D(\tilde{\slashed{D}}(V_1))$.

Proof. The operator $\kappa^2 m\beta - \kappa^2 m + V_{0,\kappa} + H_{\rm rad}^{(\kappa)}$ is a bounded self-adjoint operator. Hence, by the Kato–Rellich theorem, the assertion follows.

3.3. Self-adjoint extension of the Pauli–Fierz Hamiltonian

Essential self-adjointness of the Pauli–Fierz Hamiltonian $H_{\rm PF}$ given by (2.19) and its generalizations is discussed in [4, 5]. These papers show that, under additional conditions on $\hat{\varrho}$, $\omega$, $A^{\rm ex}$ and $\phi$, the Pauli–Fierz Hamiltonians are essentially self-adjoint. In the present paper, we do not intend to discuss the essential self-adjointness problem of the Pauli–Fierz type Hamiltonians. Instead, we define a self-adjoint extension of $H_{\rm PF}$, which may not have been known before. We define
$$H_{\rm PF}(\kappa; W, U_+) := \frac{(\bar{D}_W)^* \bar{D}_W}{2m} + U_+^{(\kappa)} + H_{\rm rad}^{(\kappa)}, \quad \kappa > 0, \tag{3.20}$$
acting in $\mathcal{F}_{\rm PF}$.

Lemma 3.4. Under Hypothesis (A), $H_{\rm PF}(\kappa; W, U_+)$ is self-adjoint and bounded from below.

Proof. By von Neumann’s theorem (e.g. [9, p. 180, Theorem X.25]), the operator $(2m)^{-1} (\bar{D}_W)^* \bar{D}_W$ is self-adjoint and nonnegative. The operator $U_+^{(\kappa)} + H_{\rm rad}^{(\kappa)}$ is bounded and self-adjoint. Hence, by the Kato–Rellich theorem, $H_{\rm PF}(\kappa; W, U_+)$ is self-adjoint and bounded from below.

A generalization of the Pauli–Fierz Hamiltonian $H_{\rm PF}$ is defined by
$$H_{\rm PF}(W, U_+) := \frac{D_{W^*} D_W}{2m} + U_+ + H_{\rm rad} \tag{3.21}$$
acting in $\mathcal{F}_{\rm PF}$. We formulate additional conditions:

Hypothesis (B). The function $U_+$ is bounded from below. In this case we set $u_0 := E_0(U_+)$.
Remark 3.2. Under Hypothesis (A), $D(H_{\rm PF}(W, U_+))$ is not necessarily dense in $\mathcal{F}_{\rm PF}$, but $D(\bar{D}_W) \cap D(U_+) \cap D(H_{\rm rad})$ is dense in $\mathcal{F}_{\rm PF}$. Hence $D(\bar{D}_W) \cap D(|U_+|^{1/2}) \cap D(H_{\rm rad}^{1/2})$ is also dense in $\mathcal{F}_{\rm PF}$. Therefore we can define a densely defined symmetric form $s_{\rm PF}$ as follows:
$$D(s_{\rm PF}) := D(\bar{D}_W) \cap D(|U_+|^{1/2}) \cap D(H_{\rm rad}^{1/2}) \quad \text{(form domain)}, \tag{3.22}$$
$$s_{\rm PF}(\Psi, \Phi) := \frac{1}{2m} (\bar{D}_W \Psi, \bar{D}_W \Phi) + (\Psi, U_+ \Phi) + (H_{\rm rad}^{1/2} \Psi, H_{\rm rad}^{1/2} \Phi), \tag{3.23}$$
$$\Psi, \Phi \in D(s_{\rm PF}). \tag{3.24}$$
Assume Hypothesis (B) in addition to Hypothesis (A). Then it is easy to see that $s_{\rm PF}$ is closed. Let $H_{\rm PF}^{(f)}$ be the self-adjoint operator associated with $s_{\rm PF}$. Then $H_{\rm PF}^{(f)} \ge u_0$ and $H_{\rm PF}^{(f)}$ is a self-adjoint extension of $H_{\rm PF}(W, U_+)$.

Theorem 3.5. Under Hypotheses (A) and (B), there exists a self-adjoint extension $\tilde{H}_{\rm PF}(W, U_+)$ of $H_{\rm PF}(W, U_+)$ which has the following properties:
(i) $\tilde{H}_{\rm PF}(W, U_+) \ge u_0$.
(ii) $D(|\tilde{H}_{\rm PF}(W, U_+)|^{1/2}) \subset D(\bar{D}_W) \cap D(|U_+|^{1/2}) \cap D(H_{\rm rad}^{1/2})$.
(iii) For all $z \in (\mathbb{C} \setminus \mathbb{R}) \cup \{\xi \in \mathbb{R} \mid \xi < u_0\}$,
$$\operatorname{s-lim}_{\kappa \to \infty} (H_{\rm PF}(\kappa; W, U_+) - z)^{-1} = (\tilde{H}_{\rm PF}(W, U_+) - z)^{-1},$$
where s-lim means strong limit.
(iv) For all $\xi < u_0$ and $\Psi \in D(|\tilde{H}_{\rm PF}(W, U_+)|^{1/2})$,
$$\operatorname{s-lim}_{\kappa \to \infty} (H_{\rm PF}(\kappa; W, U_+) - \xi)^{1/2} \Psi = (\tilde{H}_{\rm PF}(W, U_+) - \xi)^{1/2} \Psi.$$

Proof. We need only to apply Theorem A.1 in Appendix A to the following case:
$$\mathcal{H} = \mathcal{F}_{\rm PF}, \quad N = 2, \quad A = \frac{(\bar{D}_W)^* \bar{D}_W}{2m}, \quad B_1 = U_+, \quad B_2 = H_{\rm rad}, \quad L = \Lambda.$$

Remark 3.3. As for conditions on $\hat{\varrho}$ and $\omega$ for Theorem 3.5 to hold, we only need condition (2.11); no additional condition is necessary.

Remark 3.4. In the same manner as in Theorem 3.5, we can define a self-adjoint extension of the Pauli–Fierz Hamiltonian without spin (see Appendix B).

Remark 3.5. Under Hypotheses (A), (B) and the assumption that $D(H_{\rm PF}(W, U_+))$ is dense, $H_{\rm PF}(W, U_+)$ is a symmetric operator bounded from below. Hence it has the Friedrichs extension $\hat{H}_{\rm PF}(W, U_+)$. But, in the case where $H_{\rm PF}(W, U_+)$ is not essentially self-adjoint, it is not clear whether $\tilde{H}_{\rm PF}(W, U_+) = \hat{H}_{\rm PF}(W, U_+)$ holds, or $\tilde{H}_{\rm PF}(W, U_+) = H_{\rm PF}^{(f)}$ (Remark 3.2) holds, or neither of them holds.
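3.4. Main theorems

We now state the main results on the non-relativistic limit of $H(\kappa)$.

Theorem 3.6. Let Hypotheses (A) and (B) be satisfied. Suppose that
$$\lim_{\kappa \to \infty} \frac{\Lambda(\kappa)^2}{\kappa} = 0. \tag{3.25}$$
Then, for all $z \in \mathbb{C} \setminus \mathbb{R}$,
$$\operatorname{s-lim}_{\kappa \to \infty} (H(\kappa) - z)^{-1} = \begin{pmatrix} (\tilde{H}_{\rm PF}(W, U_+) - z)^{-1} & 0 \\ 0 & 0 \end{pmatrix}. \tag{3.26}$$

In the case where $U_+$ is not necessarily bounded from below, we have the following.

Theorem 3.7. Let Hypothesis (A) and (3.25) be satisfied. Suppose that $H_{\rm PF}(W, U_+)$ is essentially self-adjoint. Then, for all $z \in \mathbb{C} \setminus \mathbb{R}$,
$$\operatorname{s-lim}_{\kappa \to \infty} (H(\kappa) - z)^{-1} = \begin{pmatrix} (\overline{H_{\rm PF}(W, U_+)} - z)^{-1} & 0 \\ 0 & 0 \end{pmatrix}. \tag{3.27}$$

Remark 3.6. Under additional conditions on $\varrho$, $\omega$, $W$ and $U_+$, one can prove that $H_{\rm PF}(W, U_+)$ is essentially self-adjoint for all values of the coupling constant $q$ [4, 5].

We now apply Theorems 3.6 and 3.7 to the case where $V = V_{\rm em} = \phi I_4 - q\alpha \cdot A^{\rm ex}$ (Example 2.1), i.e. the case where $W = -q\sigma \cdot A^{\rm ex}$ and $U_{\pm} = \phi I_2$. We assume the following.

Hypothesis (C).
(C.1) The subspace $\cap_{j=1}^{3} [D(D_j) \cap D(A_j^{\rm ex}) \cap D(\phi)]$ is dense in $L^2(\mathbb{R}^3)$.
(C.2) $\phi$ is bounded from below. In this case we set $\phi_0 := \inf \sigma(\phi)$.

Under Hypothesis (C), we have a self-adjoint operator
$$\tilde{H}_{\rm PF} := \tilde{H}_{\rm PF}(-q\sigma \cdot A^{\rm ex}, \phi), \tag{3.28}$$
which is a self-adjoint extension of the original Pauli–Fierz Hamiltonian $H_{\rm PF}$ given by (2.19). Let
$$H_{\rm DM}(\kappa) := \kappa \slashed{D}(-q\alpha \cdot A^{\rm ex}) + \kappa^2 m\beta - \kappa^2 m + \phi^{(\kappa)} + H_{\rm rad}^{(\kappa)}, \tag{3.29}$$
then $H_{\rm DM}(\kappa)$ is the Dirac–Maxwell operator $H(\kappa)$ with $V_1 = -q\alpha \cdot A^{\rm ex}$ and $V_0 = \phi I_4$. Theorems 3.6 and 3.7 immediately yield the following results on the non-relativistic limit of $H_{\rm DM}(\kappa)$.

Corollary 3.8. Let Hypothesis (C) and (3.25) be satisfied. Then, for all $z \in \mathbb{C} \setminus \mathbb{R}$,
$$\operatorname{s-lim}_{\kappa \to \infty} (H_{\rm DM}(\kappa) - z)^{-1} = \begin{pmatrix} (\tilde{H}_{\rm PF} - z)^{-1} & 0 \\ 0 & 0 \end{pmatrix}. \tag{3.30}$$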
Corollary 3.9. Assume (C.1) and (3.25). Suppose that $H_{\rm PF}$ is essentially self-adjoint. Then, for all $z \in \mathbb{C} \setminus \mathbb{R}$,
$$\operatorname{s-lim}_{\kappa \to \infty} (H_{\rm DM}(\kappa) - z)^{-1} = \begin{pmatrix} (\bar{H}_{\rm PF} - z)^{-1} & 0 \\ 0 & 0 \end{pmatrix}. \tag{3.31}$$

Thus a mathematically rigorous connection of relativistic QED to non-relativistic QED is established.

4. Limit Theorem on Strongly Anticommuting Self-Adjoint Operators

In this section we prove a limit theorem concerning strongly anticommuting self-adjoint operators. For a review of the fundamental abstract theory of strongly anticommuting self-adjoint operators, see [1].

Definition 4.1. Let $A$ and $B$ be self-adjoint operators on a Hilbert space $\mathcal{H}$.
(i) We say that $A$ and $B$ strongly commute if their spectral measures $E_A$ and $E_B$ commute (i.e. for all Borel sets $J, K \subset \mathbb{R}$, $E_A(J) E_B(K) = E_B(K) E_A(J)$).
(ii) We say that $A$ and $B$ strongly anticommute if, for all $\psi \in D(A)$ and $t \in \mathbb{R}$, $e^{-itB} \psi \in D(A)$ and $A e^{-itB} \psi = e^{itB} A \psi$ (i.e. $e^{itB} A \subset A e^{-itB}$).

Let $A \ne 0$ and $B$ be strongly anticommuting self-adjoint operators on a Hilbert space $\mathcal{H}$. We assume that $B$ is injective. For each $\kappa > 0$, we define
$$T_0(\kappa) := \kappa A + \kappa^2 (B - |B|). \tag{4.1}$$
The operator $\kappa A + \kappa^2 B$ is an abstract form of Dirac-type operators and $-\kappa^2 |B|$ is a “renormalization” term. It is shown that $T_0(\kappa)$ is essentially self-adjoint (Lemma 3.1 in [2]). We consider a perturbation of $T_0(\kappa)$. Let $C(\kappa)$ ($\kappa > 0$) be a symmetric operator on $\mathcal{H}$ and
$$T(\kappa) := T_0(\kappa) + C(\kappa). \tag{4.2}$$
The main purpose of this section is to consider the limit $\kappa \to \infty$ of $T(\kappa)$ in the strong resolvent sense under a general condition for $C(\kappa)$. A basic assumption for $C(\kappa)$ is as follows:

Hypothesis (I). $D(T_0(\kappa)) \subset D(C(\kappa))$ and $T(\kappa)$ is self-adjoint with $D(T(\kappa)) = D(T_0(\kappa))$.

To state the main result we need some preliminaries. Let $B = U_B |B|$ be the polar decomposition. Then $U_B$ is self-adjoint and unitary and $\sigma(U_B) = \{\pm 1\}$, where, for a linear operator $T$, $\sigma(T)$ denotes the spectrum of $T$ (see p. 141 in [2]). The operators
$$P_{\pm}^{B} := \frac{1}{2} (I \pm U_B) \tag{4.3}$$
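are respectively the orthogonal projections onto the eigenspaces
$$\mathcal{H}_{\pm} := \ker(U_B \mp I) \tag{4.4}$$
of $U_B$ with eigenvalues $\pm 1$ and we have the orthogonal decomposition
$$\mathcal{H} = \mathcal{H}_+ \oplus \mathcal{H}_-. \tag{4.5}$$
It is known that $A$ and $|B|$ strongly commute (Lemma 2.2(v) in [2]). Hence the product spectral measure $E := E_A \otimes E_{|B|}$ of $A$ and $|B|$ can be defined with spectral representations
$$A = \int_{\mathbb{R}^2} \lambda\, dE(\lambda, \mu), \quad |B| = \int_{\mathbb{R}^2} \mu\, dE(\lambda, \mu).$$
With the spectral measure $E$, we can define a nonnegative self-adjoint operator
$$K_0 := \frac{1}{2} \int_{\mathbb{R}^2} \frac{\lambda^2}{\mu}\, dE(\lambda, \mu) \ge 0. \tag{4.6}$$
Note that
$$K_0 = \frac{A^2 |B|^{-1}}{2} \quad \text{on } D(A^2 |B|^{-1}) \cap D(|B|^{-1} A^2). \tag{4.7}$$
It is shown that $K_0$ is reduced by $\mathcal{H}_{\pm}$ (see Lemma 2.4 in [2]). We denote by $K_{0,\pm}$ the reduced part of $K_0$ to $\mathcal{H}_{\pm}$ respectively. Thus we have
$$K_0 = \begin{pmatrix} K_{0,+} & 0 \\ 0 & K_{0,-} \end{pmatrix}, \tag{4.8}$$
where the operator-matrix representation is relative to the orthogonal decomposition (4.5):
$$P_+^B = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}, \quad P_-^B = \begin{pmatrix} 0 & 0 \\ 0 & I \end{pmatrix}. \tag{4.9}$$
We define
$$K(\kappa) := K_0 + P_+^B C(\kappa) P_+^B. \tag{4.10}$$

Hypothesis (II). Let $\kappa_0 > 0$ be a constant.
(II.1) For all $\kappa \ge \kappa_0$, $C(\kappa)$ is reduced by $\mathcal{H}_{\pm}$ so that it has the operator-matrix representation
$$C(\kappa) = \begin{pmatrix} C_+(\kappa) & 0 \\ 0 & C_-(\kappa) \end{pmatrix}, \tag{4.11}$$
where $C_{\pm}(\kappa)$ are the reduced parts of $C(\kappa)$ to $\mathcal{H}_{\pm}$ respectively.
(II.2) For all $\kappa \ge \kappa_0$, $D(K_0^{1/2}) \subset D(C(\kappa))$ and there exist nonnegative constants $a(\kappa)$ and $b(\kappa)$ such that
$$\|C(\kappa) f\| \le a(\kappa) \|K_0^{1/2} f\| + b(\kappa) \|f\|, \quad f \in D(K_0^{1/2}). \tag{4.12}$$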
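Lemma 4.2. Let Hypothesis (II) be satisfied and let
$$K_+(\kappa) := K_{0,+} + C_+(\kappa). \tag{4.13}$$
Then, for all $\kappa \ge \kappa_0$, $K(\kappa)$ is self-adjoint with $D(K(\kappa)) = D(K_0)$ and bounded from below. Moreover, $K(\kappa)$ is reduced by $\mathcal{H}_{\pm}$ with
$$K(\kappa) = K_+(\kappa) \oplus K_{0,-} = \begin{pmatrix} K_+(\kappa) & 0 \\ 0 & K_{0,-} \end{pmatrix}. \tag{4.14}$$

Proof. By (II.2), $D(K_0) \subset D(C(\kappa)) \subset D(P_+^B C(\kappa) P_+^B)$. Hence $D(K(\kappa)) = D(K_0)$. Let $f \in D(K_0)$. Then we have for all $\varepsilon > 0$,
$$\|K_0^{1/2} f\|^2 \le \|f\| \|K_0 f\| \le \varepsilon^2 \|K_0 f\|^2 + \frac{\|f\|^2}{4\varepsilon^2}.$$
Hence
$$\|K_0^{1/2} f\| \le \varepsilon \|K_0 f\| + \frac{\|f\|}{2\varepsilon}. \tag{4.15}$$
This estimate and (4.12) imply
$$\|C(\kappa) f\| \le a(\kappa) \varepsilon \|K_0 f\| + \left( \frac{a(\kappa)}{2\varepsilon} + b(\kappa) \right) \|f\|. \tag{4.16}$$
By the reducibility of $C(\kappa)$ by $\mathcal{H}_{\pm}$, we have $\|P_+^B C(\kappa) P_+^B f\| \le \|C(\kappa) f\|$. Since $\varepsilon > 0$ is arbitrary, it follows from the Kato–Rellich theorem that $K(\kappa)$ is self-adjoint and bounded from below. The last assertion is easy to prove.

Hypothesis (III). Under Hypothesis (II) (so that, by Lemma 4.2, for all $\kappa \ge \kappa_0$, $K_+(\kappa)$ is self-adjoint), there exists a self-adjoint operator $K_+$ on $\mathcal{H}_+$ such that, for all $z \in \mathbb{C} \setminus \mathbb{R}$,
$$\operatorname{s-lim}_{\kappa \to \infty} (K_+(\kappa) - z)^{-1} = (K_+ - z)^{-1}. \tag{4.17}$$

The main result of this section is the following:

Theorem 4.3. Assume Hypotheses (I)–(III). Suppose that
$$\lim_{\kappa \to \infty} \frac{a(\kappa)^3}{\kappa} = 0, \quad \lim_{\kappa \to \infty} \frac{b(\kappa)^2}{\kappa} = 0, \quad \lim_{\kappa \to \infty} \frac{a(\kappa)^2 b(\kappa)}{\kappa} = 0 \tag{4.18}$$
and
$$M := \inf \sigma(|B|) > 0. \tag{4.19}$$
Then, for all $z \in \mathbb{C} \setminus \mathbb{R}$,
$$\operatorname{s-lim}_{\kappa \to \infty} (T(\kappa) - z)^{-1} = \begin{pmatrix} (K_+ - z)^{-1} & 0 \\ 0 & 0 \end{pmatrix}. \tag{4.20}$$

We prove Theorem 4.3 by a series of lemmas.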
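A two-dimensional toy model makes the statement (4.20) tangible. In the Python sketch below we take, as assumptions for illustration only, $\mathcal{H} = \mathbb{C}^2$, $A = \begin{pmatrix} 0 & d \\ d & 0 \end{pmatrix}$, $B = \mathrm{diag}(m, -m)$ and $C(\kappa) = 0$. These bounded matrices anticommute, $|B| = m$, $M = m$, and $K_+ = d^2/(2m)$ acts on the one-dimensional space $\mathcal{H}_+$. The function names and parameter values are hypothetical.

```python
def resolvent_T0(kappa, z, d=1.5, m=2.0):
    """Return (T0(kappa) - z)^{-1} for the 2x2 toy model.

    T0(kappa) = kappa*A + kappa^2*(B - |B|)
              = [[0, kappa*d], [kappa*d, -2*kappa^2*m]].
    """
    a11 = -z
    a12 = kappa * d
    a22 = -2.0 * kappa ** 2 * m - z
    det = a11 * a22 - a12 * a12
    # Inverse of the symmetric 2x2 matrix [[a11, a12], [a12, a22]].
    return [[a22 / det, -a12 / det], [-a12 / det, a11 / det]]


def limit_resolvent(z, d=1.5, m=2.0):
    """Right-hand side of (4.20) with K_+ = d^2 / (2m)."""
    return [[1.0 / (d * d / (2.0 * m) - z), 0.0], [0.0, 0.0]]


def max_entry_gap(kappa, z, d=1.5, m=2.0):
    """Largest entrywise distance between the two resolvents."""
    r = resolvent_T0(kappa, z, d, m)
    lim = limit_resolvent(z, d, m)
    return max(abs(r[i][j] - lim[i][j]) for i in range(2) for j in range(2))
```

Evaluating `max_entry_gap(kappa, 1j)` for increasing `kappa` shows the gap shrinking, dominated by the off-diagonal entries, which decay like $1/\kappa$ in this model.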
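In what follows, we assume (4.19). Then $|B|^{-1}$ is bounded with
$$\||B|^{-1}\| \le \frac{1}{M}. \tag{4.21}$$
For $z \in \mathbb{C} \setminus \mathbb{R}$, we define
$$K(\kappa, z) := K(\kappa) - z - \frac{z^2}{2\kappa^2} |B|^{-1} \tag{4.22}$$
and set
$$d(\kappa, z) := \frac{|z|^2}{2\kappa^2 M |{\rm Im}\, z|} > 0. \tag{4.23}$$

Lemma 4.4. Assume Hypothesis (II) and (4.19). Let $z \in \mathbb{C} \setminus \mathbb{R}$, $\kappa \ge \kappa_0$ and
$$L(\kappa, z) := 1 - \frac{z^2}{2\kappa^2} |B|^{-1} (K(\kappa) - z)^{-1}. \tag{4.24}$$
Let
$$d(\kappa, z) < 1. \tag{4.25}$$
Then the following statements hold:
(i) $L(\kappa, z)$ is bijective with
$$L(\kappa, z)^{-1} = \sum_{n=0}^{\infty} \left( \frac{z^2}{2\kappa^2} \right)^n \left( |B|^{-1} (K(\kappa) - z)^{-1} \right)^n \tag{4.26}$$
in operator norm topology and
$$\|L(\kappa, z)^{-1}\| \le \frac{1}{1 - d(\kappa, z)}. \tag{4.27}$$
(ii) $K(\kappa, z)$ is bijective and
$$K(\kappa, z)^{-1} = (K(\kappa) - z)^{-1} L(\kappa, z)^{-1} \tag{4.28}$$
$$= \sum_{n=0}^{\infty} \left( \frac{z^2}{2\kappa^2} \right)^n (K(\kappa) - z)^{-1} \left( |B|^{-1} (K(\kappa) - z)^{-1} \right)^n \tag{4.29}$$
in operator norm topology with
$$\|K(\kappa, z)^{-1}\| \le r(\kappa, z), \tag{4.30}$$
where
$$r(\kappa, z) := \frac{1}{|{\rm Im}\, z| (1 - d(\kappa, z))}.$$

Proof. (i) We have by (4.21)
$$\left\| \frac{z^2}{2\kappa^2} |B|^{-1} (K(\kappa) - z)^{-1} \right\| \le d(\kappa, z) < 1. \tag{4.31}$$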
Hence the bijectivity of $L(\kappa, z)$ follows with Neumann expansion (4.26). Inequality (4.27) follows from the general fact that, for all bounded linear operators $T$ with $\|T\| < 1$, $\|(1 - T)^{-1}\| \le (1 - \|T\|)^{-1}$.
(ii) We have $K(\kappa, z) = L(\kappa, z)(K(\kappa) - z)$, which implies that $K(\kappa, z)$ is bijective with (4.28). Expansion (4.29) follows from (4.28) and (4.26). Using (4.27) and (4.28), we obtain (4.30).

The following fact is an important key to the analysis here.

Theorem 4.5. Assume Hypotheses (I), (II) and (4.19). Let $z \in \mathbb{C} \setminus \mathbb{R}$ and $d(\kappa, z) < 1$ with $\kappa \ge \kappa_0$. Then the operator $1 + \frac{C(\kappa)}{2\kappa^2} (\kappa A + z) |B|^{-1} K(\kappa, z)^{-1}$ is bijective and
$$(T(\kappa) - z)^{-1} = \left( P_+^B + \frac{1}{2\kappa^2} (\kappa A + z) |B|^{-1} \right) K(\kappa, z)^{-1} \left( 1 + \frac{C(\kappa)}{2\kappa^2} (\kappa A + z) |B|^{-1} K(\kappa, z)^{-1} \right)^{-1}. \tag{4.32}$$

Proof. Informal (heuristic) manipulations to obtain (4.32) are similar to the case of an abstract Dirac operator [11, p. 180, Theorem 6.4] or to a case previously discussed by the present author [2, p. 155, Theorem 4.3]. But, for completeness (since the assumption here is slightly different from those in [2, 11]), we give an outline of proof. Introducing an operator
$$W(\kappa, z) := 1 + C(\kappa) (T_0(\kappa) - z)^{-1},$$
which is well-defined by Hypothesis (I), we have
$$T(\kappa) - z = W(\kappa, z) (T_0(\kappa) - z).$$
This implies that $W(\kappa, z)$ is bijective and
$$(T(\kappa) - z)^{-1} = (T_0(\kappa) - z)^{-1} W(\kappa, z)^{-1}.$$
On the other hand, we have
$$(T_0(\kappa) - z)^{-1} = \frac{1}{2\kappa^2} (S_0(\kappa) + z) |B|^{-1} K_0(\kappa, z)^{-1}, \tag{4.33}$$
where
$$S_0(\kappa) := \kappa A + \kappa^2 (B + |B|), \quad K_0(\kappa, z) := K_0 - z - \frac{z^2}{2\kappa^2} |B|^{-1} = K(\kappa, z) - P_+^B C(\kappa) P_+^B,$$
see [2, (3.17) and (3.18)]. Hence
$$(T(\kappa) - z)^{-1} = \frac{1}{2\kappa^2} (S_0(\kappa) + z) |B|^{-1} K_0(\kappa, z)^{-1} W(\kappa, z)^{-1}. \tag{4.34}$$
Let
$$X(\kappa, z) := 1 + P_+^B C(\kappa) P_+^B K_0(\kappa, z)^{-1}.$$
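Using (4.33), we have
$$W(\kappa, z) = X(\kappa, z) + \frac{C(\kappa)}{2\kappa^2} (\kappa A + z) |B|^{-1} K_0(\kappa, z)^{-1},$$
where we have used that $B + |B| = 2 P_+^B |B|$ and $C(\kappa) P_+^B = P_+^B C(\kappa) P_+^B$. Note that
$$K(\kappa, z) = X(\kappa, z) K_0(\kappa, z).$$
This implies that $X(\kappa, z)$ is bijective with $X(\kappa, z)^{-1} = K_0(\kappa, z) K(\kappa, z)^{-1}$. Hence we obtain
$$W(\kappa, z) = \left( 1 + \frac{C(\kappa)}{2\kappa^2} (\kappa A + z) |B|^{-1} K_0(\kappa, z)^{-1} X(\kappa, z)^{-1} \right) X(\kappa, z) = \left( 1 + \frac{C(\kappa)}{2\kappa^2} (\kappa A + z) |B|^{-1} K(\kappa, z)^{-1} \right) X(\kappa, z),$$
which implies that
$$Y(\kappa, z) := 1 + \frac{C(\kappa)}{2\kappa^2} (\kappa A + z) |B|^{-1} K(\kappa, z)^{-1}$$
is also bijective with
$$W(\kappa, z)^{-1} = X(\kappa, z)^{-1} Y(\kappa, z)^{-1} = K_0(\kappa, z) K(\kappa, z)^{-1} Y(\kappa, z)^{-1}.$$
Putting this equation into (4.34), we obtain (4.32).

Lemma 4.6. Assume Hypothesis (II) and (4.19). Let $\varepsilon > 0$. Then, for all $f \in D(K_0)$,
$$\|C(\kappa) |B|^{-1} f\| \le \frac{\varepsilon a(\kappa)}{M} \|K_0 f\| + \frac{1}{M} \left( \frac{a(\kappa)}{2\varepsilon} + b(\kappa) \right) \|f\|. \tag{4.35}$$

Proof. We see by functional calculus that, for all $f \in D(K_0)$, $|B|^{-1} f \in D(K_0)$ and $K_0 |B|^{-1} f = |B|^{-1} K_0 f$. Using this fact, (4.16) and (4.21), we obtain (4.35).

Lemma 4.7. Assume (4.19). Then $D(K_0) \subset D(A |B|^{-1})$ and
$$\|A |B|^{-1} f\| \le \varepsilon \|K_0 f\| + \frac{1}{\varepsilon M} \|f\|, \quad f \in D(K_0),$$
where $\varepsilon > 0$ is arbitrary.

Proof. Let $g \in \mathcal{D} := D(A^2 |B|^{-1}) \cap D(|B|^{-1} A^2)$. Then we have
$$\|A |B|^{-1} g\|^2 = 2 (|B|^{-1} g, K_0 g) \le \frac{2 \|g\|}{M} \|K_0 g\| \le \varepsilon^2 \|K_0 g\|^2 + \frac{1}{\varepsilon^2 M^2} \|g\|^2,$$
where $\varepsilon > 0$ is arbitrary. Hence
$$\|A |B|^{-1} g\| \le \varepsilon \|K_0 g\| + \frac{1}{\varepsilon M} \|g\|. \tag{4.36}$$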
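Since $\mathcal{D}$ is a core of $K_0$ (p. 143, Lemma 2.4 in [2]) and $|B|^{-1}$ is bounded, the assertion follows from a limiting argument.

Lemma 4.8. Assume Hypothesis (II) and (4.19). Then $D(K_0) \subset D(C(\kappa) A |B|^{-1})$ and
$$\|C(\kappa) A |B|^{-1} f\| \le \left( \frac{\sqrt{2}\, a(\kappa)}{\sqrt{M}} + \varepsilon b(\kappa) \right) \|K_0 f\| + \frac{b(\kappa)}{\varepsilon M} \|f\|, \quad f \in D(K_0), \tag{4.37}$$
where $\varepsilon > 0$ is arbitrary.

Proof. Let $f \in D(K_0)$. Then it follows from the functional calculus on the product spectral measure $E$ and (4.12) that $f \in D(K_0^{1/2} A |B|^{-1}) \subset D(C(\kappa) A |B|^{-1})$ and
$$\|C(\kappa) A |B|^{-1} f\| \le a(\kappa) \|K_0^{1/2} A |B|^{-1} f\| + b(\kappa) \|A |B|^{-1} f\| = a(\kappa) \|\sqrt{2}\, |B|^{-1/2} K_0 f\| + b(\kappa) \|A |B|^{-1} f\| \le \frac{\sqrt{2}\, a(\kappa)}{\sqrt{M}} \|K_0 f\| + b(\kappa) \|A |B|^{-1} f\|.$$
This estimate and (4.36) give (4.37).

Lemma 4.9. Assume Hypothesis (II) and (4.19). Let $\delta > 0$ be a constant such that $a(\kappa) \delta < 1$. Then, for all $f \in D(K_0)$ and $\kappa \ge \kappa_0$,
$$\|K_0 f\| \le \frac{1}{1 - a(\kappa)\delta} \|K(\kappa, z) f\| + \frac{1}{1 - a(\kappa)\delta} \left( |z| + \frac{|z|^2}{2\kappa^2 M} + \frac{a(\kappa)}{2\delta} + b(\kappa) \right) \|f\|. \tag{4.38}$$

Proof. Using (4.16), we have
$$\|K_0 f\| \le \|K(\kappa) f\| + \|C(\kappa) P_+^B f\| \le \|K(\kappa) f\| + a(\kappa) \delta \|K_0 f\| + \left( \frac{a(\kappa)}{2\delta} + b(\kappa) \right) \|f\|,$$
where $\delta > 0$ is arbitrary. Taking $\delta > 0$ such that $a(\kappa) \delta < 1$, we obtain
$$\|K_0 f\| \le \frac{1}{1 - a(\kappa)\delta} \|K(\kappa) f\| + \frac{1}{1 - a(\kappa)\delta} \left( \frac{a(\kappa)}{2\delta} + b(\kappa) \right) \|f\|. \tag{4.39}$$
On the other hand, we have
$$\|K(\kappa) f\| \le \|K(\kappa, z) f\| + \left( |z| + \frac{|z|^2}{2\kappa^2 M} \right) \|f\|.$$
Thus (4.38) follows.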
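Lemma 4.10. Assume Hypothesis (II), (4.19) and (4.25). Let $\delta > 0$ be a constant such that $a(\kappa) \delta < 1$ and $\varepsilon > 0$. Let
$$G_1(\kappa, z, \varepsilon, \delta) := \frac{\varepsilon a(\kappa)}{M (1 - a(\kappa)\delta)} \left[ 1 + r(\kappa, z) \left( |z| + \frac{|z|^2}{2\kappa^2 M} + \frac{a(\kappa)}{2\delta} + b(\kappa) \right) \right] + r(\kappa, z) \left( \frac{a(\kappa)}{2\varepsilon M} + \frac{b(\kappa)}{M} \right). \tag{4.40}$$
Then $C(\kappa) |B|^{-1} K(\kappa, z)^{-1}$ is bounded with
$$\|C(\kappa) |B|^{-1} K(\kappa, z)^{-1}\| \le G_1(\kappa, z, \varepsilon, \delta). \tag{4.41}$$

Proof. This follows from Lemma 4.6 and Lemma 4.9.

Lemma 4.11. Assume Hypothesis (II), (4.19) and (4.25). Let $\delta > 0$ be a constant such that $a(\kappa) \delta < 1$ and $\varepsilon > 0$. Let
$$G_2(\kappa, z, \varepsilon, \delta) := \frac{1}{1 - a(\kappa)\delta} \left( \frac{\sqrt{2}\, a(\kappa)}{\sqrt{M}} + \varepsilon b(\kappa) \right) \left[ 1 + r(\kappa, z) \left( |z| + \frac{|z|^2}{2\kappa^2 M} + \frac{a(\kappa)}{2\delta} + b(\kappa) \right) \right] + \frac{r(\kappa, z) b(\kappa)}{\varepsilon M}. \tag{4.42}$$
Then $C(\kappa) A |B|^{-1} K(\kappa, z)^{-1}$ is bounded with
$$\|C(\kappa) A |B|^{-1} K(\kappa, z)^{-1}\| \le G_2(\kappa, z, \varepsilon, \delta). \tag{4.43}$$

Proof. This follows from Lemma 4.8 and Lemma 4.9.

Lemma 4.12. Assume Hypotheses (II), (III) and (4.19). Then
$$\operatorname{s-lim}_{\kappa \to \infty} P_+^B K(\kappa, z)^{-1} = \begin{pmatrix} (K_+ - z)^{-1} & 0 \\ 0 & 0 \end{pmatrix}. \tag{4.44}$$

Proof. Let $K := K_+ \oplus K_{0,-}$. By Lemma 4.4, we have
$$K(\kappa, z)^{-1} = (K(\kappa) - z)^{-1} + (K(\kappa) - z)^{-1} V(\kappa)$$
with $V(\kappa) := \sum_{n=1}^{\infty} \left( \frac{z^2}{2\kappa^2} \right)^n \left( |B|^{-1} (K(\kappa) - z)^{-1} \right)^n$. Hence
$$K(\kappa, z)^{-1} - (K - z)^{-1} = (K(\kappa) - z)^{-1} - (K - z)^{-1} + (K(\kappa) - z)^{-1} V(\kappa).$$
It is easy to see that $\|V(\kappa)\| \to 0$ as $\kappa \to \infty$. By Hypothesis (III) and (4.14), we have
$$\operatorname{s-lim}_{\kappa \to \infty} (K(\kappa) - z)^{-1} = (K - z)^{-1}.$$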
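Hence
$$\operatorname{s-lim}_{\kappa \to \infty} K(\kappa, z)^{-1} = (K - z)^{-1},$$
which implies that
$$\operatorname{s-lim}_{\kappa \to \infty} P_+^B K(\kappa, z)^{-1} = P_+^B (K - z)^{-1} = \begin{pmatrix} (K_+ - z)^{-1} & 0 \\ 0 & 0 \end{pmatrix}.$$
Thus (4.44) holds.

Proof of Theorem 4.3. By Lemmas 4.10 and 4.11, we have
$$\left\| \frac{C(\kappa)}{2\kappa^2} (\kappa A + z) |B|^{-1} K(\kappa, z)^{-1} \right\| \le \frac{G_2(\kappa, z, \varepsilon, \delta)}{2\kappa} + \frac{|z|}{2\kappa^2} G_1(\kappa, z, \varepsilon, \delta).$$
Let $0 < \alpha < 1$ be fixed and set $\delta = \alpha / a(\kappa)$ so that $a(\kappa) \delta = \alpha < 1$. Let $\kappa_1 > 0$ be a constant such that $d(\kappa_1, z) < 1$ and $\kappa_1 \ge \max\{\kappa_0, 1\}$. Let $\kappa \ge \kappa_1$. Then
$$G_1(\kappa, z, \varepsilon, \delta) \le C_1 [a(\kappa) + a(\kappa)^3 + a(\kappa) b(\kappa) + b(\kappa)],$$
$$G_2(\kappa, z, \varepsilon, \delta) \le C_2 [a(\kappa) + a(\kappa)^3 + a(\kappa) b(\kappa) + b(\kappa) + b(\kappa) a(\kappa)^2 + b(\kappa)^2],$$
where $C_1$ and $C_2$ are constants independent of $\kappa \ge \kappa_1$. Hence, under condition (4.18), we have
$$\lim_{\kappa \to \infty} \frac{G_1(\kappa, z, \varepsilon, \delta)}{\kappa^2} = 0, \quad \lim_{\kappa \to \infty} \frac{G_2(\kappa, z, \varepsilon, \delta)}{\kappa} = 0.$$
Hence
$$\lim_{\kappa \to \infty} \left\| \frac{C(\kappa)}{2\kappa^2} (\kappa A + z) |B|^{-1} K(\kappa, z)^{-1} \right\| = 0,$$
which implies that
$$\lim_{\kappa \to \infty} \left( 1 + \frac{C(\kappa)}{2\kappa^2} (\kappa A + z) |B|^{-1} K(\kappa, z)^{-1} \right)^{-1} = 1 \tag{4.45}$$
in operator-norm topology. By Lemmas 4.7 and 4.9, we have
$$\|A |B|^{-1} K(\kappa, z)^{-1}\| \le \frac{\varepsilon}{1 - a(\kappa)\delta} + \frac{r(\kappa, z) \varepsilon}{1 - a(\kappa)\delta} \left( |z| + \frac{|z|^2}{2\kappa^2 M} + \frac{a(\kappa)}{2\delta} + b(\kappa) \right) + \frac{r(\kappa, z)}{\varepsilon M}.$$
Hence, in the same way as above, we can show that
$$\lim_{\kappa \to \infty} \frac{1}{2\kappa^2} (\kappa A + z) |B|^{-1} K(\kappa, z)^{-1} = 0$$
in operator-norm topology. These facts together with Theorem 4.5 and Lemma 4.12 imply (4.20).

Remark 4.5. Higher order corrections to the limiting formula (4.20) can be computed by using Theorem 4.5 and (4.29).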
5. Proof of the Main Theorems

5.1. Proof of Theorem 3.6

We apply Theorem 4.3. For this purpose, we first prove the following lemma.

Lemma 5.1. The self-adjoint operator $\tilde{\slashed{D}}(V_1)$ strongly anticommutes with $m\beta$.

Proof. We have for all $t \in \mathbb{R}$
$$e^{-itm\beta} = \begin{pmatrix} e^{-itm} I_2 & 0 \\ 0 & e^{itm} I_2 \end{pmatrix}.$$
This implies that, for all $\Psi \in D(\tilde{\slashed{D}}(V_1)) = D(\bar{D}_W) \oplus D((\bar{D}_W)^*)$, $e^{-itm\beta} \Psi \in D(\tilde{\slashed{D}}(V_1))$ and $\tilde{\slashed{D}}(V_1) e^{-itm\beta} \Psi = e^{itm\beta} \tilde{\slashed{D}}(V_1) \Psi$. Hence $\tilde{\slashed{D}}(V_1)$ strongly anticommutes with $m\beta$.

Let
$$A = \tilde{\slashed{D}}(V_1), \quad B = m\beta, \quad C(\kappa) = V_{0,\kappa} + H_{\rm rad}^{(\kappa)}.$$
Then $|B| = m$ and we can write
$$H(\kappa) = \kappa A + \kappa^2 (B - |B|) + C(\kappa).$$
By Lemma 5.1, $A$ and $B$ strongly anticommute. Hence $H(\kappa)$ is of the form $T(\kappa)$ in Sec. 4. We need only to check that $T(\kappa) = H(\kappa)$ satisfies the assumption of Theorem 4.3. Since $C(\kappa)$ is bounded, Hypothesis (I) holds. In the present case we have $P_{\pm}^B = P_{\pm}$ and $C(\kappa)$ is reduced by $\mathcal{F}_{\pm}$ with
$$C(\kappa) = \begin{pmatrix} U_+^{(\kappa)} + H_{\rm rad}^{(\kappa)} & 0 \\ 0 & U_-^{(\kappa)} + H_{\rm rad}^{(\kappa)} \end{pmatrix}. \tag{5.1}$$
Hence Hypothesis (II.1) holds. In the present case we have
$$K_0 = \frac{\tilde{\slashed{D}}(V_1)^2}{2m} = \begin{pmatrix} \dfrac{(\bar{D}_W)^* \bar{D}_W}{2m} & 0 \\ 0 & \dfrac{\bar{D}_W (\bar{D}_W)^*}{2m} \end{pmatrix}. \tag{5.2}$$
By (3.16), $\|C(\kappa) \Psi\| \le 2\Lambda(\kappa) \|\Psi\|$ for all $\Psi \in \mathcal{F}$. Hence Hypothesis (II.2) holds with
$$a(\kappa) = 0, \quad b(\kappa) = 2\Lambda(\kappa). \tag{5.3}$$
By (5.1) and (5.2), we have
$$K_+(\kappa) = H_{\rm PF}(\kappa; W, U_+).$$
By Theorem 3.5, Hypothesis (III) holds with $K_+ = \tilde{H}_{\rm PF}(W, U_+)$. By (5.3) and (3.25), (4.18) holds. Thus the assumption of Theorem 4.3 is satisfied. Hence we can apply Theorem 4.3 to obtain (3.26).
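The relation used in Lemma 5.1 can be seen directly in a scalar block model. The Python sketch below replaces $\tilde{\slashed{D}}(V_1)$ by the off-diagonal $2 \times 2$ matrix $D = \begin{pmatrix} 0 & \bar{w} \\ w & 0 \end{pmatrix}$ and $\beta$ by $\mathrm{diag}(1, -1)$, then measures the entrywise difference between $D e^{-itm\beta}$ and $e^{itm\beta} D$. The model, the complex number `w`, and the function names are assumptions for illustration; the lemma itself concerns unbounded operators.

```python
import cmath


def matmul2(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def exp_i_t_m_beta(t, m, sign):
    """Return exp(sign * i * t * m * beta) for beta = diag(1, -1)."""
    phase = cmath.exp(sign * 1j * t * m)
    return [[phase, 0], [0, phase.conjugate()]]


def anticommutation_defect(w, t, m):
    """Largest entry of D exp(-itm beta) - exp(+itm beta) D."""
    w = complex(w)
    d = [[0, w.conjugate()], [w, 0]]
    left = matmul2(d, exp_i_t_m_beta(t, m, -1))
    right = matmul2(exp_i_t_m_beta(t, m, +1), d)
    return max(abs(left[i][j] - right[i][j]) for i in range(2) for j in range(2))
```

For any choice of `w`, `t` and `m` the defect vanishes up to rounding, because moving the diagonal phase matrix across the off-diagonal block swaps the two phases.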
5.2. Proof of Theorem 3.7

Hypotheses (I) and (II) hold in this case too. But it is not immediately obvious whether Hypothesis (III) holds, since, in this case, we cannot use Theorem 3.5. We note that
$$\lim_{\kappa \to \infty} H_{\rm PF}(\kappa; W, U_+) \Psi = H_{\rm PF}(W, U_+) \Psi, \quad \Psi \in D(H_{\rm PF}(W, U_+)).$$
By the assumption on the essential self-adjointness of $H_{\rm PF}(W, U_+)$, we can apply a general convergence theorem [8, p. 292, Theorem VIII.25(a)] to conclude that, for all $z \in \mathbb{C} \setminus \mathbb{R}$,
$$\operatorname{s-lim}_{\kappa \to \infty} (H_{\rm PF}(\kappa; W, U_+) - z)^{-1} = (\overline{H_{\rm PF}(W, U_+)} - z)^{-1}.$$
Hence Hypothesis (III) holds with $K_+ = \overline{H_{\rm PF}(W, U_+)}$. Then, in the same way as in the proof of Theorem 3.6, we obtain Theorem 3.7.

Appendix A. A Class of Self-Adjoint Extensions of Hermitian Operators

We say that a linear operator $S$ on a Hilbert space $\mathcal{H}$ is Hermitian if $(\psi, S\phi) = (S\psi, \phi)$ for all $\psi, \phi \in D(S)$. In this definition, we do not assume the denseness of $D(S)$. A densely defined Hermitian operator is called a symmetric operator. In this appendix we present a class of self-adjoint extensions of Hermitian operators. To the author’s best knowledge, this class is new.

Let $\mathcal{H}$ be a complex Hilbert space. Let $A$ be a nonnegative self-adjoint operator on $\mathcal{H}$ and $B_j$ ($j = 1, 2, \ldots, N$, $N \in \mathbb{N}$) be self-adjoint operators bounded from below with $B_j \ge b_j$ ($b_j \in \mathbb{R}$ is a constant) such that $\cap_{j=1}^{N} [D(A^{1/2}) \cap D(|B_j|^{1/2})]$ is dense in $\mathcal{H}$. Let
$$c_0 := \sum_{j=1}^{N} b_j.$$
Then the operator
$$S := A + \sum_{j=1}^{N} B_j$$
is Hermitian and bounded from below with $S \ge c_0$.

Remark A.1. If $S$ is densely defined (i.e. $D(S) = \cap_{j=1}^{N} [D(A) \cap D(B_j)]$ is dense), then $S$ is a symmetric operator bounded from below and hence $S$ has a self-adjoint extension $S_F$, called the Friedrichs extension (e.g. [9, p. 177, Theorem X.23]).

Remark A.2. The operator $S$ has another type of self-adjoint extension $S_f$ which is given by the form sum $S_f := A \dot{+} B_1 \dot{+} \cdots \dot{+} B_N$, i.e. the self-adjoint operator
associated with the densely defined symmetric closed form $s_0$ given by
\[ D(s_0) := \bigcap_{j=1}^N [D(A^{1/2}) \cap D(|B_j|^{1/2})] \quad \text{(form domain)} , \]
\[ s_0(\psi, \phi) := (A^{1/2}\psi, A^{1/2}\phi) + \sum_{j=1}^N (\hat B_j^{1/2}\psi, \hat B_j^{1/2}\phi) + c_0(\psi, \phi) , \qquad \psi, \phi \in D(s_0) , \]
where
\[ \hat B_j := B_j - b_j \]
and $(\cdot, \cdot)$ denotes the inner product of $\mathcal{H}$.

Here we want to construct a self-adjoint extension of $S$ which may be different from $S_{\rm F}$ and $S_{\rm f}$ if $S$ is symmetric, but not essentially self-adjoint. For this purpose we first introduce an approximate or "cut-off" version of $S$.

Remark A.3. If each $B_j$ is bounded, then, by the Kato–Rellich theorem, $S$ is self-adjoint. Thus the arguments below are nontrivial only if $A$ and at least one of the $B_j$ ($j = 1, \dots, N$) are unbounded.

Let $L : (0, \infty) \to (0, \infty)$ be a nondecreasing function such that $L(\kappa) \to \infty$ as $\kappa \to \infty$ and
\[ \hat B_j(\kappa) := E_{\hat B_j}([0, L(\kappa)])\, \hat B_j\, E_{\hat B_j}([0, L(\kappa)]) , \qquad \kappa > 0 , \]
where $E_{\hat B_j}$ is the spectral measure of $\hat B_j$. It is easy to see that each $\hat B_j(\kappa)$ is a nonnegative bounded self-adjoint operator with $\|\hat B_j(\kappa)\| \le L(\kappa)$. Let
\[ S(\kappa) := A + \sum_{j=1}^N \hat B_j(\kappa) + c_0 . \]
Then, by the Kato–Rellich theorem, $S(\kappa)$ is self-adjoint with $S(\kappa) \ge c_0$. Moreover, for all $\psi \in \bigcap_{j=1}^N [D(A) \cap D(B_j)]$, we have
\[ \lim_{\kappa\to\infty} S(\kappa)\psi = S\psi \quad \text{(strongly)} . \]
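In this sense $S(\kappa)$ may be regarded as an approximate version of $S$.

Theorem A.1. Let $A$, $B_j$, $S$ and $S(\kappa)$ be as above. Then there exists a unique self-adjoint extension $\tilde S$ of $S$ such that the following properties hold:
(i) $\tilde S \ge c_0$.
(ii) $D(|\tilde S|^{1/2}) \subset \bigcap_{j=1}^N [D(A^{1/2}) \cap D(\hat B_j^{1/2})]$.
(iii) For all $z \in (\mathbb{C}\setminus\mathbb{R}) \cup \{\xi \in \mathbb{R} \mid \xi < c_0\}$,
\[ \operatorname*{s\text{-}lim}_{\kappa\to\infty} (S(\kappa) - z)^{-1} = (\tilde S - z)^{-1} . \]
(iv) For all $\xi < c_0$ and $\psi \in D(|\tilde S|^{1/2})$,
\[ \lim_{\kappa\to\infty} (S(\kappa) - \xi)^{1/2}\psi = (\tilde S - \xi)^{1/2}\psi \quad \text{(strongly)} . \]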
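As an illustration of the cut-off operators $\hat B_j(\kappa)$ (not taken from the paper), consider the hypothetical case $N = 1$, $\mathcal{H} = L^2(\mathbb{R}^d)$, with $B_1$ the multiplication operator by a real measurable function $V \ge b_1$. Under these assumptions the spectral projection of $\hat B_1$ is itself a multiplication operator, so the truncation acts pointwise:

```latex
% Assumed setting (illustrative only): B_1 = multiplication by V, with V(x) >= b_1.
\begin{align*}
\hat B_1 &= V - b_1 , \qquad
E_{\hat B_1}([0, L(\kappa)]) = \chi_{\{x \,:\, 0 \le V(x) - b_1 \le L(\kappa)\}} , \\
(\hat B_1(\kappa)\psi)(x) &=
\begin{cases}
(V(x) - b_1)\,\psi(x) , & V(x) - b_1 \le L(\kappa) , \\
0 , & V(x) - b_1 > L(\kappa) ,
\end{cases}
\qquad \|\hat B_1(\kappa)\| \le L(\kappa) .
\end{align*}
```

In this sketch $S(\kappa) = A + \hat B_1(\kappa) + b_1$ replaces the potential by its truncation at height $L(\kappa)$ above $b_1$, and $(\psi, \hat B_1(\kappa)\psi)$ is visibly nondecreasing in $\kappa$.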
May 26, 2003 16:37 WSPC/148-RMP
268
00162
A. Arai
Proof. For each $\kappa > 0$, we define a symmetric form $s_\kappa$ with form domain $D(s_\kappa) = D(A^{1/2})$ by
\[ s_\kappa(\psi, \phi) := (A^{1/2}\psi, A^{1/2}\phi) + \sum_{j=1}^N (\psi, \hat B_j(\kappa)\phi) + c_0(\psi, \phi) , \qquad \psi, \phi \in D(A^{1/2}) . \]
This is the densely defined closed symmetric form associated with the self-adjoint operator $S(\kappa)$. Since $(\psi, \hat B_j(\kappa)\psi)$ is nondecreasing in $\kappa$ for all $\psi \in \mathcal{H}$ with
\[ 0 \le (\phi, \hat B_j(\kappa)\phi) \le (\hat B_j^{1/2}\phi, \hat B_j^{1/2}\phi) , \qquad \phi \in D(\hat B_j^{1/2}) , \]
it follows that, for all $\kappa, \kappa' > 0$ with $\kappa < \kappa'$,
\[ c_0 \le s_\kappa \le s_{\kappa'} \le s_0 . \]
Hence we can apply a general convergence theorem on nondecreasing symmetric forms ([6, p. 461, Theorem 3.13]) to conclude that there exists a self-adjoint operator $\tilde S$ on $\mathcal{H}$ such that (i), (iii) and (iv) hold with $s_\kappa \le s$, where $s$ is the symmetric form associated with $\tilde S$, so that $D(|\tilde S|^{1/2}) \subset D(A^{1/2})$.

To show that $\tilde S$ is a self-adjoint extension of $S$, let $\psi \in D(S) = \bigcap_{j=1}^N [D(A) \cap D(B_j)]$ and $\phi \in D(\tilde S) = D(\tilde S - c_0 + 1)$. Then
\[ (\psi, (\tilde S - c_0 + 1)\phi) = ((S(\kappa) - c_0 + 1)\psi, (S(\kappa) - c_0 + 1)^{-1}(\tilde S - c_0 + 1)\phi) . \]
Note that $\lim_{\kappa\to\infty} (S(\kappa) - c_0 + 1)\psi = (S - c_0 + 1)\psi$ strongly and, by property (iii),
\[ \operatorname*{s\text{-}lim}_{\kappa\to\infty} (S(\kappa) - c_0 + 1)^{-1} = (\tilde S - c_0 + 1)^{-1} . \]
Hence
\[ (\psi, (\tilde S - c_0 + 1)\phi) = ((S - c_0 + 1)\psi, (\tilde S - c_0 + 1)^{-1}(\tilde S - c_0 + 1)\phi) = ((S - c_0 + 1)\psi, \phi) , \]
which implies that $\psi \in D(\tilde S - c_0 + 1) = D(\tilde S)$ and $(\tilde S - c_0 + 1)\psi = (S - c_0 + 1)\psi$, i.e. $\tilde S\psi = S\psi$. Thus $\tilde S$ is a self-adjoint extension of $S$.

We next prove (ii). It follows from the inequality $s_\kappa \le s$ shown above and the nondecreasingness of $s_\kappa$ in $\kappa$ that $D(s) \subset D(s_\kappa) = D(A^{1/2})$ and that, for all $\psi \in D(s) = D(|\tilde S|^{1/2})$, $\lim_{\kappa\to\infty} s_\kappa(\psi, \psi)$ exists. This implies that $\lim_{\kappa\to\infty} (\hat B_j(\kappa)^{1/2}\psi, \hat B_j(\kappa)^{1/2}\psi)$ exists ($j = 1, \dots, N$). By using the spectral representation for $(\hat B_j(\kappa)^{1/2}\psi, \hat B_j(\kappa)^{1/2}\psi)$ and the monotone convergence theorem, we see that $\psi \in D(\hat B_j^{1/2})$, $j = 1, \dots, N$. Thus part (ii) follows.

The uniqueness of $\tilde S$ follows from property (iii).

Remark A.4. The self-adjoint extension $\tilde S$ may depend on the choice of the function $L$. Unfortunately we have been unable to make clear whether $S_\# = \tilde S$ or not ($\# = {\rm F}, {\rm f}$) in the case where $S$ is symmetric, but not essentially self-adjoint.
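Appendix B. Self-Adjoint Extension of the Pauli–Fierz Hamiltonian Without Spin

Let $A^{\rm ex}$ and $\phi$ be as in Example 2.1 in Sec. 2 and $P_j := -iD_j - qA^{\varrho}_j - qA^{\rm ex}_j$. We set $P = (P_1, P_2, P_3)$. Then the Pauli–Fierz Hamiltonian without spin is given by
\[ h_{\rm PF} := \frac{P^2}{2m} + \phi + H_{\rm rad} \]
acting in the Hilbert space $L^2(\mathbb{R}^3) \otimes \mathcal{F}_{\rm rad} = L^2(\mathbb{R}^3; \mathcal{F}_{\rm rad}) = \int^{\oplus}_{\mathbb{R}^3} \mathcal{F}_{\rm rad}\, dx$. It is easy to see that $h_{\rm PF}$ is Hermitian.

We assume Hypothesis (C) in Sec. 3. Then each $P_j$ is symmetric. Hence we can define a nonnegative self-adjoint operator $K^{({\rm f})}_{\rm PF}$ as the form sum
\[ K^{({\rm f})}_{\rm PF} := \frac{1}{2m}\left\{ (\bar P_1)^* \bar P_1 \,\dot{+}\, (\bar P_2)^* \bar P_2 \,\dot{+}\, (\bar P_3)^* \bar P_3 \right\} , \]
which is a self-adjoint extension of $K_{{\rm PF},0} := (2m)^{-1} P^2$. Hence $K_{{\rm PF},0}$ has a self-adjoint extension which is nonnegative.

Let $K_{\rm PF}$ be any self-adjoint extension of $K_{{\rm PF},0}$ such that $K_{\rm PF} \ge 0$ and $D(K_{\rm PF}^{1/2}) \cap D(|\phi|^{1/2}) \cap D(H_{\rm rad}^{1/2})$ is dense. Then we define
\[ h_{\rm PF}(\kappa) := K_{\rm PF} + H_{\rm rad}^{(\kappa)} + \phi(\kappa) , \]
where
\[ H_{\rm rad}^{(\kappa)} := E_{H_{\rm rad}}([0, L(\kappa)])\, H_{\rm rad}\, E_{H_{\rm rad}}([0, L(\kappa)]) , \qquad \phi(\kappa) := (\phi - \phi_0)\,\chi_{[0, L(\kappa)]}(\phi - \phi_0) + \phi_0 , \]
where $\chi_{[0, L(\kappa)]}$ is the characteristic function of the interval $[0, L(\kappa)]$. Since $H_{\rm rad}^{(\kappa)} + \phi(\kappa)$ is bounded and symmetric, $h_{\rm PF}(\kappa)$ is self-adjoint and bounded from below with $h_{\rm PF}(\kappa) \ge \phi_0$.

Theorem B.1. Assume Hypothesis (C) in Sec. 3. Then there exists a unique self-adjoint extension $\tilde h_{\rm PF}$ of $h_{\rm PF}$ such that the following properties hold:
(i) $\tilde h_{\rm PF} \ge \phi_0$.
(ii) $D(|\tilde h_{\rm PF}|^{1/2}) \subset D(K_{\rm PF}^{1/2}) \cap D(|\phi|^{1/2}) \cap D(H_{\rm rad}^{1/2})$.
(iii) For all $z \in (\mathbb{C}\setminus\mathbb{R}) \cup \{\xi \in \mathbb{R} \mid \xi < \phi_0\}$,
\[ \operatorname*{s\text{-}lim}_{\kappa\to\infty} (h_{\rm PF}(\kappa) - z)^{-1} = (\tilde h_{\rm PF} - z)^{-1} . \]
(iv) For all $\xi < \phi_0$ and $\Psi \in D(|\tilde h_{\rm PF}|^{1/2})$,
\[ \lim_{\kappa\to\infty} (h_{\rm PF}(\kappa) - \xi)^{1/2}\Psi = (\tilde h_{\rm PF} - \xi)^{1/2}\Psi \quad \text{(strongly)} . \]

Proof. We only need to apply Theorem A.1 to the following case: $\mathcal{H} = L^2(\mathbb{R}^3; \mathcal{F}_{\rm rad})$, $A = K_{\rm PF}$, $N = 2$, $B_1 = \phi$, $B_2 = H_{\rm rad}$.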
Acknowledgment

This work was supported by the Grant-in-Aid No. 13440039 for Scientific Research from the JSPS.

References

[1] A. Arai, Analysis on anticommuting self-adjoint operators, Adv. Stud. Pure Math. 23 (1994), 1–15.
[2] A. Arai, Scaling limit of anticommuting self-adjoint operators and applications to Dirac operators, Integr. Equat. Oper. Theory 21 (1995), 139–173.
[3] A. Arai, A particle-field Hamiltonian in relativistic quantum electrodynamics, J. Math. Phys. 41 (2000), 4271–4283.
[4] F. Hiroshima, Essential self-adjointness of translation-invariant quantum field models for arbitrary coupling constants, Comm. Math. Phys. 211 (2000), 585–613.
[5] F. Hiroshima, Self-adjointness of the Pauli-Fierz Hamiltonian for arbitrary values of coupling constants, Ann. Henri Poincaré 3 (2002), 171–201.
[6] T. Kato, Perturbation Theory for Linear Operators, 2nd Edition, Springer, Berlin Heidelberg New York, 1976.
[7] W. Pauli and M. Fierz, Zur Theorie der Emission langwelliger Lichtquanten, Nuovo Cimento 15 (1938), 167–188.
[8] M. Reed and B. Simon, Methods of Modern Mathematical Physics I: Functional Analysis, Academic Press, New York, 1972.
[9] M. Reed and B. Simon, Methods of Modern Mathematical Physics II: Fourier Analysis, Self-adjointness, Academic Press, New York, 1975.
[10] M. Reed and B. Simon, Methods of Modern Mathematical Physics IV: Analysis of Operators, Academic Press, New York, 1978.
[11] B. Thaller, The Dirac Equation, Springer-Verlag, Berlin, Heidelberg, 1992.
May 26, 2003 16:50 WSPC/148-RMP
00166
Reviews in Mathematical Physics Vol. 15, No. 3 (2003) 271–312 © World Scientific Publishing Company
LOCALIZATION OF THE NUMBER OF PHOTONS OF GROUND STATES IN NONRELATIVISTIC QED
FUMIO HIROSHIMA Department of Mathematics and Physics, Setsunan University 572-8508, Osaka, Japan
Received 6 November 2002
Revised 23 January 2003

A one-electron system minimally coupled to a quantized radiation field is considered. It is assumed that the quantized radiation field is massless, and no infrared cutoff is imposed. The Hamiltonian, $H$, of this system is defined as a self-adjoint operator acting on $L^2(\mathbb{R}^3) \otimes \mathcal{F} \cong L^2(\mathbb{R}^3; \mathcal{F})$, where $\mathcal{F}$ is the Boson Fock space over $L^2(\mathbb{R}^3 \times \{1, 2\})$. It is shown that the ground state, $\psi_g$, of $H$ belongs to $\bigcap_{k=1}^{\infty} D(1 \otimes N^k)$, where $N$ denotes the number operator of $\mathcal{F}$. Moreover, it is shown that for almost every electron position variable $x \in \mathbb{R}^3$ and for arbitrary $k \ge 0$, $\|(1 \otimes N^{k/2})\psi_g(x)\|_{\mathcal{F}} \le D_k e^{-\delta|x|^{m+1}}$ with some constants $m \ge 0$, $D_k > 0$, and $\delta > 0$ independent of $k$. In particular $\psi_g \in \bigcap_{k=1}^{\infty} D(e^{\beta|x|^{m+1}} \otimes N^k)$ for $0 < \beta < \delta/2$ is obtained.

Keywords: Pauli–Fierz model; ground states; number operators; pull-through formula.
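1. Introduction

1.1. The Pauli–Fierz Hamiltonian

In this paper one spinless electron minimally coupled to a massless quantized radiation field is considered. It is the so-called Pauli–Fierz model of nonrelativistic QED. The Hilbert space of state vectors of the system is given by
\[ \mathcal{H} = L^2(\mathbb{R}^3) \otimes \mathcal{F} , \]
where $\mathcal{F}$ denotes the Boson Fock space defined by
\[ \mathcal{F} = \bigoplus_{n=0}^{\infty} \left[ \otimes_s^n L^2(\mathbb{R}^3 \times \{1, 2\}) \right] , \]
where $\otimes_s^n L^2(\mathbb{R}^3 \times \{1, 2\})$, $n \ge 1$, denotes the $n$-fold symmetric tensor product of $L^2(\mathbb{R}^3 \times \{1, 2\})$ and $\otimes_s^0 L^2(\mathbb{R}^3 \times \{1, 2\}) = \mathbb{C}$. The Fock vacuum $\Omega$ is defined by $\Omega = \{1, 0, 0, \dots\}$. Let
\[ \mathcal{F}_0 = \left\{ \bigoplus_{n=0}^{\infty} \Psi^{(n)} \in \mathcal{F} \;\Big|\; \Psi^{(n)} = 0 \text{ for } n \ge m \text{ with some } m \right\} . \]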
For each $\{k, j\} \in \mathbb{R}^3 \times \{1, 2\}$, the annihilation operator $a(k, j)$ is defined by, for $\Psi = \oplus_{n=0}^{\infty} \Psi^{(n)} \in \mathcal{F}_0$,
\[ (a(k, j)\Psi)^{(n)}(k_1, j_1, \dots, k_n, j_n) = \sqrt{n+1}\, \Psi^{(n+1)}(k, j, k_1, j_1, \dots, k_n, j_n) . \]
The creation operator $a^*(k, j)$ is given by $a^*(k, j) = (a(k, j)\lceil_{\mathcal{F}_0})^*$. They satisfy the canonical commutation relations on $\mathcal{F}_0$:
\[ [a(k, j), a^*(k', j')] = \delta(k - k')\delta_{jj'} , \qquad [a(k, j), a(k', j')] = 0 , \qquad [a^*(k, j), a^*(k', j')] = 0 . \]
The closed extensions of $a(k, j)$ and $a^*(k, j)$ are denoted by the same symbols respectively. The annihilation and creation operators smeared by $f \in L^2(\mathbb{R}^3)$ are formally written as
\[ a^\sharp(f, j) = \int a^\sharp(k, j) f(k)\, dk , \qquad a^\sharp = a \text{ or } a^* , \]
and act as
\[ (a(f, j)\Psi)^{(n)} = \sqrt{n+1} \int f(k)\, \Psi^{(n+1)}(k, j, k_1, j_1, \dots, k_n, j_n)\, dk , \]
\[ (a^*(f, j)\Psi)^{(n)} = \frac{1}{\sqrt{n}} \sum_{j_l = j} f(k_l)\, \Psi^{(n-1)}(k_1, j_1, \dots, \widehat{k_l, j_l}, \dots, k_n, j_n) , \]
where $\sum_{j_l = j}$ denotes the sum over those $l$ such that $j_l = j$, and $\hat X$ means neglecting $X$. We work with the unit $\hbar = 1 = c$. The dispersion relation is given by
\[ \omega(k) = |k| . \]
Then the free Hamiltonian $H_f$ of $\mathcal{F}$ is formally written as
\[ H_f = \sum_{j=1,2} \int \omega(k)\, a^*(k, j) a(k, j)\, dk , \]
and acts as
\[ (H_f\Psi)^{(n)}(k_1, j_1, \dots, k_n, j_n) = \sum_{j=1}^{n} \omega(k_j)\, \Psi^{(n)}(k_1, j_1, \dots, k_n, j_n) , \quad n \ge 1 , \qquad (H_f\Psi)^{(0)} = 0 , \]
with the domain
\[ D(H_f) = \left\{ \Psi = \oplus_{n=0}^{\infty} \Psi^{(n)} \;\Big|\; \sum_{n=0}^{\infty} \|(H_f\Psi)^{(n)}\|^2_{\otimes^n L^2(\mathbb{R}^3 \times \{1,2\})} < \infty \right\} . \]
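To make the action formulas concrete, here is a short worked instance (an illustration, not part of the original text) obtained by inserting the vacuum $\Omega$ and a hypothetical pure two-photon vector $\Psi = (0, 0, \Psi^{(2)}, 0, \dots)$ into the definitions above, with $f \in L^2(\mathbb{R}^3)$ arbitrary:

```latex
% Illustrative check of the definitions; \Psi is an assumed two-photon vector.
\begin{align*}
a(f, j)\,\Omega &= 0 , \\
(a^*(f, j)\,\Omega)^{(1)}(k_1, j_1) &= \delta_{j j_1}\, f(k_1) , \qquad
(a^*(f, j)\,\Omega)^{(n)} = 0 \quad (n \neq 1) , \\
(H_f \Psi)^{(2)}(k_1, j_1, k_2, j_2) &= (|k_1| + |k_2|)\, \Psi^{(2)}(k_1, j_1, k_2, j_2) .
\end{align*}
```

So, in this sketch, $a^*(f, j)$ creates one photon of polarization $j$ with wave function $f$, and $H_f$ multiplies each $n$-photon component by the total photon energy.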
Since $H_f$ is essentially self-adjoint and nonnegative, we denote the self-adjoint extension of $H_f$ by the same symbol $H_f$. Under the identification
\[ \mathcal{H} \cong \int^{\oplus}_{\mathbb{R}^3} \mathcal{F}\, dx , \]
the quantized radiation field $A$ with a form factor $\varphi$ is given by the constant fiber direct integral
\[ A = \int^{\oplus}_{\mathbb{R}^3} A(x)\, dx , \]
where $A(x)$ is the operator acting on $\mathcal{F}$ defined by
\[ A(x) = \frac{1}{\sqrt{2}} \sum_{j=1,2} \int \frac{e(k, j)}{\sqrt{\omega(k)}} \left\{ a^*(k, j) e^{-ik\cdot x} \hat\varphi(-k) + a(k, j) e^{ik\cdot x} \hat\varphi(k) \right\} dk . \]
Here $\hat\varphi$ denotes the Fourier transform of $\varphi$ and $e(k, j)$, $j = 1, 2$, are polarization vectors such that $(e(k, 1), e(k, 2), k/|k|)$ forms a right-handed system, i.e. $k \cdot e(k, j) = 0$, $e(k, j) \cdot e(k, j') = \delta_{jj'}$, and $e(k, 1) \times e(k, 2) = k/|k|$ for almost every $k \in \mathbb{R}^3$. We fix polarization vectors throughout this paper. The decoupled Hamiltonian is given by
\[ H_0 = H_p \otimes 1 + 1 \otimes H_f . \]
Here
\[ H_p = \frac{1}{2} p^2 + V \]
denotes a particle Hamiltonian, where $p = (-i\nabla_{x_1}, -i\nabla_{x_2}, -i\nabla_{x_3})$ and $x = (x_1, x_2, x_3)$ are the momentum operator and its conjugate position operator in $L^2(\mathbb{R}^3)$, respectively, and $V : \mathbb{R}^3 \to \mathbb{R}$ is an external potential. We are now prepared to define the total Hamiltonian, $H$, of this system, which is given by the minimal coupling to $H_0$, i.e. we replace $p \otimes 1$ with $p \otimes 1 - eA$:
\[ H = \frac{1}{2}(p \otimes 1 - eA)^2 + V \otimes 1 + 1 \otimes H_f , \]
where $e$ denotes the charge of an electron.
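For orientation, the square in $H$ can be expanded formally (a sketch on a suitable domain, not a statement from the original text). Since $k \cdot e(k, j) = 0$, the field $A$ is divergence-free, so $(p \otimes 1) \cdot A = A \cdot (p \otimes 1)$; and since $A(x)$ is linear in $a$ and $a^*$, its vacuum expectation vanishes. Under these assumptions, for $f$ in the domain of $H_p$:

```latex
% Formal expansion of the minimal coupling; domain questions are ignored here.
\begin{align*}
H &= H_0 - e\,(p \otimes 1) \cdot A + \frac{e^2}{2} A^2 , \\
(f \otimes \Omega,\, H\, f \otimes \Omega)_{\mathcal{H}}
  &= (f, H_p f)_{L^2(\mathbb{R}^3)}
   + \frac{e^2}{2}\,(f \otimes \Omega,\, A^2 f \otimes \Omega)_{\mathcal{H}} .
\end{align*}
```

The second line uses $(\Omega, H_f \Omega) = 0$ and $(\Omega, A_\mu(x)\Omega) = 0$; only the quadratic term $A^2$ contributes to the interaction energy of a product state with the vacuum.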
1.2. Assumptions on V and fundamental facts

We give assumptions on external potentials. We say $V \in K_3$ (the three-dimensional Kato class [23]) if and only if
\[ \lim_{\epsilon \downarrow 0} \sup_{x \in \mathbb{R}^3} \int_{|x-y| < \epsilon} \frac{|V(y)|}{|x-y|}\, dy = 0 . \]
[...] $> -\infty$, $Z \in L^1_{\rm loc}(\mathbb{R}^3)$, $W < 0$, and $W \in L^p(\mathbb{R}^3)$ for some $p > 3/2$.

For $V \in K$, a functional integral representation of $e^{-t(-\frac{1}{2}\Delta + V)}$ by means of the Wiener measure on $C([0, \infty); \mathbb{R}^3)$ is obtained. See e.g. [23]. For $V \in K \cap V_{\rm exp}$, using this functional integral representation, it can be proven that a ground state, $f_p$, of $-\frac{1}{2}\Delta + V$ decays exponentially, i.e.
\[ |f_p(x)| \le c_1 e^{-c_2 |x|^{c_3}} \qquad (1.1) \]
for almost every $x \in \mathbb{R}^3$ with some positive constants $c_1$, $c_2$, $c_3$. Similar estimates are available for the Pauli–Fierz Hamiltonian $H$ with $V \in K \cap V_{\rm exp}$. See Proposition 1.5. Furthermore we need to define the class $V(m)$, $m = 0, 1, 2, \dots$, to estimate the constant $c_3$ in (1.1) precisely.

Definition 1.2. Suppose that $V = Z + W \in V_{\rm exp} \cap K$, where the decomposition $Z + W$ is that of the definition of $V_{\rm exp}$.
(1) We say $V \in V(m)$, $m \ge 1$, if and only if $Z(x) \ge \gamma |x|^{2m}$ for $x \notin O$ with a certain compact set $O$ and with some $\gamma > 0$.
(2) We say $V \in V(0)$ if and only if $\liminf_{|x| \to \infty} Z(x) > \inf \sigma(H)$, where $\sigma(H)$ denotes the spectrum of $H$.

A physically reasonable example of $V$ is the Coulomb potential $-eZ/(4\pi|x|)$, where $Z > 0$ denotes the charge of a nucleus. Actually we see the following proposition.

Proposition 1.3. Assume that
\[ \int_{\mathbb{R}^3} \frac{|\hat\varphi(k)|^2}{\omega(k)}\, dk < \frac{Z^2}{2(4\pi)^2} . \]
Then
\[ -\frac{eZ}{4\pi|x|} \in V(0) \]
for all $e > 0$.

Proof. It is known that $-1/|x| \in K_3 \cap V_{\rm exp}$. Then we shall show $\inf \sigma(H) < 0$. Let $V = -eZ/(4\pi|x|)$ and $f$ be a normalized ground state of $H_p = -\frac{1}{2}\Delta + V$, $H_p f = -E_0 f$, where
\[ E_0 = \frac{e^2 Z^2}{2(4\pi)^2} . \]
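The value of $E_0$ can be checked directly. The following sketch assumes the standard hydrogen-like ground state and introduces the shorthand $g := eZ/(4\pi)$, which is not notation from the original text; the computation is for $x \neq 0$:

```latex
% Illustrative check; g is an assumed abbreviation for the Coulomb coupling.
\begin{align*}
g &:= \frac{eZ}{4\pi} , \qquad
f(x) = \Big(\frac{g^3}{\pi}\Big)^{1/2} e^{-g|x|} , \qquad \|f\|_{L^2(\mathbb{R}^3)} = 1 , \\
\Delta f &= \Big(g^2 - \frac{2g}{|x|}\Big) f , \\
\Big(-\tfrac{1}{2}\Delta - \frac{g}{|x|}\Big) f
  &= -\frac{g^2}{2}\, f = -\frac{e^2 Z^2}{2(4\pi)^2}\, f .
\end{align*}
```

The radial Laplacian $f'' + (2/r) f'$ was used in the second line, and the normalization follows from $\int_0^\infty 4\pi r^2 e^{-2gr}\, dr = \pi/g^3$.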
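Then we have
\begin{align*}
\inf \sigma(H) &\le (f \otimes \Omega, H f \otimes \Omega)_{\mathcal{H}}
 = (f, H_p f)_{L^2(\mathbb{R}^3)} + \frac{e^2}{2}(f \otimes \Omega, A^2 f \otimes \Omega)_{\mathcal{H}} \\
&= -E_0 + \frac{e^2}{2} \sum_{\mu=1,2,3} \int_{\mathbb{R}^3} \left(1 - \frac{k_\mu^2}{|k|^2}\right) \frac{|\hat\varphi(k)|^2}{\omega(k)}\, dk \\
&= -\frac{e^2}{2} \left( \frac{Z^2}{(4\pi)^2} - 2 \int_{\mathbb{R}^3} \frac{|\hat\varphi(k)|^2}{\omega(k)}\, dk \right) < 0 .
\end{align*}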
Thus the proposition follows.

We introduce Hypothesis $H_m$, $m = 0, 1, 2, \dots$.

Hypothesis $H_m$
(1) $D(\Delta) \subset D(V)$ and there exist $0 \le a < 1$ and $0 \le b$ such that for $f \in D(\Delta)$, $\|Vf\|_{L^2(\mathbb{R}^3)} \le a\|\Delta f\|_{L^2(\mathbb{R}^3)} + b\|f\|_{L^2(\mathbb{R}^3)}$.
(2) $\hat\varphi(-k) = \hat\varphi(k)$, and $\hat\varphi/\omega, \sqrt{\omega}\hat\varphi \in L^2(\mathbb{R}^3)$.
(3) $\inf \sigma_{\rm ess}(H_p) - \inf \sigma(H_p) > 0$, where $\sigma(H_p)$ (resp. $\sigma_{\rm ess}(H_p)$) denotes the spectrum (resp. essential spectrum) of $H_p$.
(4) $V \in V(m)$.

Proposition 1.4. We assume (1) and (2) of $H_m$. Then for arbitrary $e \in \mathbb{R}$, $H$ is self-adjoint on $D(\Delta \otimes 1) \cap D(1 \otimes H_f)$ and bounded from below; moreover, it is essentially self-adjoint on any core of $-\Delta \otimes 1 + 1 \otimes H_f$.

Proof. See [14, 15].

The number operator of $\mathcal{F}$ is defined by
\[ N = \sum_{j=1,2} \int a^*(k, j) a(k, j)\, dk . \]
The operator $N^k$, $k \ge 0$, acts as, for $\Psi = \oplus_{n=0}^{\infty} \Psi^{(n)}$,
\[ (N^k\Psi)^{(n)} = n^k \Psi^{(n)} \]
with the domain
\[ D(N^k) = \left\{ \Psi = \oplus_{n=0}^{\infty} \Psi^{(n)} \;\Big|\; \sum_{n=0}^{\infty} n^{2k} \|\Psi^{(n)}\|^2_{\otimes^n L^2(\mathbb{R}^3 \times \{1,2\})} < \infty \right\} . \]
[...] $> 0$ and $\delta > 0$.

Proof. See [5, 10] for (i) and (iii), [13] for (ii) and [16] for (iv).

Remark 1.6. It is not clear from Proposition 1.5 that $\psi_g \in D(e^{\delta|x|^{m+1}} \otimes N^{1/2})$. See Corollary 1.11.
The condition
\[ I = \int_{\mathbb{R}^3} \frac{|\hat\varphi(k)|^2}{\omega(k)^3}\, dk < \infty \qquad (1.3) \]
is called the infrared cutoff condition. Condition (1.3) is not assumed in Proposition 1.5. For suitable external potentials, $e_0 = \infty$ is available in Proposition 1.5. This is established in [10]. In the case where $\inf \sigma_{\rm ess}(H_p) - \inf \sigma(H_p) = 0$, examples for which $H$ has a ground state are investigated in [17, 19]. It is unknown, however, whether such a ground state decays exponentially in $x$ or not. When the electron has spin, $H$ has a twofold degenerate ground state for sufficiently small $|e|$, which is shown in [18].

1.3. Localization of the number of bosons and infrared singularities for a linear coupling model

The Nelson Hamiltonian [22] describes a linear coupling between a nonrelativistic particle and a scalar quantum field with a form factor $\varphi$. Let $\mathcal{H}_N = L^2(\mathbb{R}^3) \otimes \mathcal{F}_N$, where $\mathcal{F}_N = \bigoplus_{n=0}^{\infty} [\otimes_s^n L^2(\mathbb{R}^3)]$. The Nelson Hamiltonian is defined as a self-adjoint operator acting in the Hilbert space $\mathcal{H}_N$, which is given by
\[ H_N = H_p \otimes 1 + 1 \otimes H_f^N + g\phi , \]
where $g$ denotes a coupling constant, $H_f^N = \int \omega(k)\, a^*(k) a(k)\, dk$ is the free Hamiltonian in $\mathcal{F}_N$, and under the identification $\mathcal{H}_N \cong \int^{\oplus}_{\mathbb{R}^3} \mathcal{F}_N\, dx$, $\phi$ is defined by $\phi = \int^{\oplus}_{\mathbb{R}^3} \phi(x)\, dx$ with
\[ \phi(x) = \frac{1}{\sqrt{2}} \int \left\{ a^*(k) e^{-ikx} \frac{\hat\varphi(-k)}{\sqrt{\omega(k)}} + a(k) e^{ikx} \frac{\hat\varphi(k)}{\sqrt{\omega(k)}} \right\} dk . \]
It has been established in [2, 4, 9, 25] that the Nelson Hamiltonian has the unique ground state, $\psi_g^N$, under the condition $I < \infty$. Let us denote the number operator of $\mathcal{F}_N$ by the same symbol $N$ as that of $\mathcal{F}$. In [6] it has been proven that $\psi_g^N$ decays superexponentially, i.e.
\[ \|e^{+\beta(1 \otimes N)} \psi_g^N\|_{\mathcal{H}_N} < \infty \qquad (1.4) \]
for arbitrary $\beta > 0$. Results of this kind have been obtained in [11, Sec. 3] and [24] for relativistic polaron models, and in [26, Sec. 8] for spin-boson models. Moreover in [6] we see that
\[ \lim_{I \to \infty} \|(1 \otimes N^{1/2}) \psi_g^N\|_{\mathcal{H}_N} = \infty . \qquad (1.5) \]
Actually in the infrared divergence case,
\[ I = \infty , \qquad (1.6) \]
it is shown in [20] that the Nelson Hamiltonian with some confining external potentials has no ground states in $\mathcal{H}_N$. Then we have to take a non-Fock representation to investigate a ground state with (1.6). See [1, 3, 21] for details. That is to say, as the infrared cutoff is removed, the number of bosons of $\psi_g^N$ diverges and the ground state disappears.

A method to show (1.4) and (1.5) is based on a path integral representation of $(\psi_g^N, e^{+\beta(1 \otimes N)} \psi_g^N)_{\mathcal{H}_N}$. Precisely, it can be shown that in the case $I < \infty$ there exists a probability measure $\mu$ on $C(\mathbb{R}; \mathbb{R}^3)$ such that for arbitrary $\beta > 0$,
\[ (\psi_g^N, e^{+\beta(1 \otimes N)} \psi_g^N)_{\mathcal{H}_N} = \int_{C(\mathbb{R}; \mathbb{R}^3)} e^{-(g^2/2)(1 - e^{+\beta}) \int_{-\infty}^{0} ds \int_{0}^{\infty} dt\, W(q_s - q_t, s - t)}\, \mu(dq) , \qquad (1.7) \]
where (qt )−∞i
and $\varphi_i(A_i) = B_i$. To complete the proof note that $F_\Theta$ and $C_{\tilde\pi}$ are C*-subalgebras of $\bigotimes_{i=1}^{d} A_i$ and $\bigotimes_{i=1}^{d} B_i$, respectively, and $\tilde\nu : F_\Theta \to C_{\tilde\pi}$ is the restriction of
\[ \bigotimes_{i=1}^{d} \varphi_i : \bigotimes_{i=1}^{d} A_i \to \bigotimes_{i=1}^{d} B_i . \]
(2) Assume now that $\pi$ corresponds to $\Phi \neq \{1, \dots, d\}$. We will use induction on $d$. Suppose that the assertion is true for algebras with $d - 1$ generators. Assume that $\pi$ is an irreducible representation of $A_{0,\Theta}$ and $\ker(\pi(s_1^*)) \neq \{0\}$. Let us denote the C*-algebra generated by the operators of $\pi$ by $C_\pi$. Then one can deduce that
\[ \pi(s_1) = S \otimes 1 , \qquad \pi(s_j) = d(\lambda_{1j}) \otimes \tilde\pi(s_j) , \quad j \ge 2 , \]
June 19, 2003 15:50 WSPC/148-RMP
00161
The Generalized CCR: Representations and Enveloping C ∗ -Algebra
335
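where $\tilde\pi$ is an irreducible representation of $A_{0,\tilde\Theta}$, $\tilde\Theta = (\theta_{ij})_{i,j=2}^{d}$. The C*-algebra generated by the operators of $\tilde\pi$ will be denoted by $C_{\tilde\pi}$. Analogously
\[ \pi_F(s_1) = S \otimes 1 , \qquad \pi_F(s_j) = d(\lambda_{1j}) \otimes \tilde\pi_F(s_j) , \quad j \ge 2 , \]
where $\tilde\pi_F$ is the Fock representation of $A_{0,\tilde\Theta}$. Let $F_{\tilde\Theta}$ be the C*-algebra generated by the operators of the Fock representation of $A_{\tilde\Theta}$. By the induction assumption we have a homomorphism
\[ \varphi_{\tilde\pi} : F_{\tilde\Theta} \to C_{\tilde\pi} , \qquad \tilde\pi = \varphi_{\tilde\pi} \tilde\pi_F . \]
Let $D = C^*(s, d(\lambda_{1j}), j \ge 2)$. Construct $D \otimes F_{\tilde\Theta}$ and $D \otimes C_{\tilde\pi}$. The C*-crossnorms are uniquely defined as all algebras are nuclear. Evidently $F_\Theta$ and $C_\pi$ are C*-subalgebras of these algebras. By a property of the tensor product we have a homomorphism
\[ {\rm id} \otimes \varphi_{\tilde\pi} : D \otimes F_{\tilde\Theta} \to D \otimes C_{\tilde\pi} . \]
Denote by $\varphi_\pi$ the restriction of this homomorphism to $F_\Theta$. It is easily seen that $\varphi_\pi(\pi_F(s_i)) = \pi(s_i)$, $i = 1, \dots, d$, hence
\[ \varphi_\pi : F_\Theta \to C_\pi \]
and $\pi = \varphi_\pi \pi_F$.

In the following Proposition we clarify the structure of the ideal $M$ in the case $d = 2$.

Proposition A.2. The sequence $0 \to \mathcal{K} \to M \to \mathcal{K} \otimes (C(\mathbb{T}) \oplus C(\mathbb{T})) \to 0$ is exact.

Proof. We work with the Fock realization, i.e. with
\[ S_1 = S \otimes 1 , \qquad S_2 = d(\lambda) \otimes S , \qquad \lambda = \lambda_{12} . \]
Let
\[ P^1_{ijkl} = S_1^i (1 - S_1 S_1^*) S_1^{*j} S_2^k S_2^{*l} , \qquad P^2_{ijkl} = S_2^i (1 - S_2 S_2^*) S_2^{*j} S_1^k S_1^{*l} , \]
and let $M_1$ and $M_2$ be the ideals generated by the sets $\{P^1_{ijkl}\}$ and $\{P^2_{ijkl}\}$ respectively. Then, since
\[ P^1_{ijkl} = S^i (1 - SS^*) S^{*j} d(\lambda)^{k-l} \otimes S^k S^{*l} , \]
one has $M_1 \simeq \mathcal{K} \otimes \mathcal{T}$ ($\mathcal{T}$ is the Toeplitz algebra). To prove that $M_2 \simeq \mathcal{K} \otimes \mathcal{T}$ one has to change the basis in $l_2(\mathbb{N}) \otimes l_2(\mathbb{N})$ so that in the new basis $S_1 = d(\lambda) \otimes S$ and $S_2 = S \otimes 1$. Further, $M_1 \cap M_2 = M_1 M_2 = \mathcal{K}(l_2(\mathbb{N}) \otimes l_2(\mathbb{N}))$ since $P^1_{i_1 j_1 k_1 l_1} \cdot P^2_{i_2 j_2 k_2 l_2} \in \mathcal{K}(l_2(\mathbb{N}) \otimes l_2(\mathbb{N}))$. Then
\[ M/\mathcal{K} \simeq M_1/\mathcal{K} \oplus M_2/\mathcal{K} \simeq (\mathcal{K} \otimes \mathcal{T} / \mathcal{K} \otimes \mathcal{K}) \oplus (\mathcal{K} \otimes \mathcal{T} / \mathcal{K} \otimes \mathcal{K}) . \]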
June 19, 2003 15:50 WSPC/148-RMP
336
00161
C. S. Kim et al.
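But the sequence $0 \to \mathcal{K} \to \mathcal{T} \to C(\mathbb{T}) \to 0$ is exact and $\mathcal{K}$ is nuclear, so the sequence $0 \to \mathcal{K} \otimes \mathcal{K} \to \mathcal{K} \otimes \mathcal{T} \to \mathcal{K} \otimes C(\mathbb{T}) \to 0$ is exact and
\[ M/\mathcal{K} \simeq \mathcal{K} \otimes (C(\mathbb{T}) \oplus C(\mathbb{T})) . \]

K-theory for TCCR

We use the stability of $B_\mu$ at $\mu = 0$ and the faithfulness of the Fock representation of $B_\mu$ to compute the K-groups for the TCCR. Namely, we consider the Fock realization of $B_0 \simeq B_\mu$. In the Fock representation the generators of $B_0$ have the following form:
\[ s_i = \bigotimes_{j<i} (1 - SS^*) \otimes S \otimes \bigotimes_{j>i} 1 , \qquad i = 1, \dots, d . \]

Proposition A.3. $K_0(B_\mu) = \mathbb{Z}$, $K_1(B_\mu) = \{0\}$.

Proof. As noted above, we can identify $B_0$ with $C^*(s_i, s_i^*, i = 1, \dots, d)$, where
\[ s_i = \bigotimes_{j<i} (1 - ss^*) \otimes s \otimes \bigotimes_{j>i} 1 , \qquad i = 1, \dots, d . \]
Let us consider the case $d = 2$. Let $\tilde{\mathcal{T}}_0$ be the ideal generated by the element $(1 - ss^*) \otimes (1 - s)$. It is easy to see that $\tilde{\mathcal{T}}_0 \simeq \mathcal{K} \otimes \mathcal{T}_0$, where $\mathcal{T}_0$ is the ideal in the Toeplitz algebra $\mathcal{T}$ generated by the element $1 - s$. It was shown by J. Cuntz (see [9]) that $K_i(\mathcal{T}_0) = \{0\}$. Further, $B_0/\tilde{\mathcal{T}}_0 \simeq \mathcal{T}$, i.e. one has the following short exact sequence:
\[ 0 \to \tilde{\mathcal{T}}_0 \to B_0 \to \mathcal{T} \to 0 . \]
Since $K_0(\mathcal{T}) \simeq \mathbb{Z}$ and $K_1(\mathcal{T}) = \{0\}$, the corresponding six-term exact sequence becomes
\[ \begin{array}{ccccc}
0 & \longrightarrow & K_0(B_0) & \longrightarrow & \mathbb{Z} \\
\uparrow & & & & \downarrow \\
0 & \longleftarrow & K_1(B_0) & \longleftarrow & 0
\end{array} \]
In the general case we consider the ideal $\hat{\mathcal{T}}_0$ generated by the element $\bigotimes_{i=1}^{d-1} (1 - ss^*) \otimes (1 - s)$. Then
\[ \hat{\mathcal{T}}_0 \simeq \bigotimes_{i=1}^{d-1} \mathcal{K} \otimes \mathcal{T}_0 \simeq \mathcal{K} \otimes \mathcal{T}_0 \]
and $B_0(d)/\hat{\mathcal{T}}_0 \simeq B_0(d-1)$. Applying again the six-term sequence corresponding to $0 \to \mathcal{K} \otimes \mathcal{T}_0 \to B_0(d) \to B_0(d-1) \to 0$ and induction on $d$, we get $K_0(B_0(d)) \simeq \mathbb{Z}$ and $K_1(B_0(d)) = \{0\}$.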
Remark A.1. In fact, via the stability of $A_{\alpha,k,\Theta}$, the result of the previous theorem is true for the C*-algebra generated by the GCCR corresponding to $k = (d, d, \dots, d)$.
References

[1] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Springer Verlag, Berlin, Heidelberg, New York, 1981.
[2] L. C. Biedenharn, The quantum group $SU_q(2)$ and q-analogue of the boson operators, J. Phys. A. 22 (1989), L873–L878.
[3] L. A. Coburn, The C*-algebra generated by an isometry, I, Bull. Am. Math. Soc. 73 (1967), 722–726.
[4] L. A. Coburn, The C*-algebra generated by an isometry, II, Trans. Am. Math. Soc. 137 (1969), 211–217.
[5] P. R. Halmos, A Hilbert Space Problem Book, D. Van Nostrand Co. (1967).
[6] M. Bożejko and R. Speicher, An example of a generalized Brownian motion, Comm. Math. Phys. 137 (1991), 519–531.
[7] M. Bożejko and R. Speicher, Completely positive maps on Coxeter groups, deformed commutation relations, and operator spaces, Math. Ann. 300 (1994), 97–120.
[8] J. Cuntz, Simple C*-algebras generated by isometries, Comm. Math. Phys. 57 (1977), 173–185.
[9] J. Cuntz, K-theory and C*-algebras, Proc. Conf. on K-Theory (Bielefeld 1982), Springer Lecture Notes in Math. 1046, 55–79.
[10] C. Daskaloyannis, Generalized deformed oscillator and nonlinear algebras, J. Phys. A. 24 (1991), L789–L794.
[11] D. I. Fivel, Interpolation between Fermi and Bose statistics using generalized commutators, Phys. Rev. Lett. 65 (1990), 3361–3364.
[12] K. Dykema and A. Nica, On the Fock representation of the q-commutation relations, J. Reine Angew. Math. 440 (1993), 201–212.
[13] O. W. Greenberg, Particles with small violations of Fermi or Bose statistics, Phys. Rev. D. 43 (1991), 4111–4120.
[14] P. E. T. Jørgensen, L. M. Schmitt and R. F. Werner, Positive representations of general commutation relations allowing Wick ordering, J. Funct. Anal. 134 (1995), 33–99.
[15] P. E. T. Jørgensen, L. M. Schmitt and R. F. Werner, q-canonical commutation relations and stability of the Cuntz algebra, Pacific J. Math. 163, 1 (1994), 131–151.
[16] P. E. T. Jørgensen, L. M. Schmitt and R. F. Werner, Coherent states of the q-canonical commutation relations, funct-an/9303002.
[17] W. Marcinek, On commutation relations for quons, Rep. Math. Phys. 41 (1998), 155–172.
[18] W. Marcinek and R. Ralowski, On Wick algebras with braid relations, J. Math. Phys. 36 (1995), 2803–2820.
[19] O. Bratteli and P. E. T. Jørgensen, Iterated function systems and permutation representations of the Cuntz algebra, Mem. Am. Math. Soc. 139 (1999), 663.
[20] O. Bratteli, P. E. T. Jørgensen and V. L. Ostrovskyĭ, Representation Theory and Numerical AF-invariants: The representations and centralizers of certain states on $O_d$, math.OA/9907036.
[21] O. Bratteli, P. E. T. Jørgensen, A. Kishimoto and R. F. Werner, Pure states on $O_d$, J. Oper. Theory 43 (2000), 97–143.
[22] P. E. T. Jørgensen, Representations of Cuntz algebras, loop groups and wavelets, XIIIth International Congress on Mathematical Physics (London, 2000) (A. Fokas, A. Grigoryan, T. Kibble and B. Zegarlinski, eds.), International Press, Boston, 2001, pp. 327–332.
[23] P. E. T. Jørgensen, D. P. Proskurin and Yu. S. Samoĭlenko, The kernel of Fock representation of Wick algebras with braided operator of coefficients, Pacific J. Math. 198 (2001), 109–122.
[24] A. J. Macfarlane, On q-analogues of the quantum harmonic oscillator and the quantum group $SU(2)_q$, J. Phys. A. 22 (1989), 4581–4588.
[25] V. Mazorchuk and L. Turowska, ∗-Representations of twisted generalized Weyl constructions, Algebras and Rep. Theory 5, 2 (2002), 163–186.
[26] V. L. Ostrovskyĭ and D. P. Proskurin, Operator relations, dynamical systems, and representations of a class of Wick algebras, Oper. Theory Adv. Appl., Birkhäuser Verlag 118 (2000), 335–345.
[27] V. Ostrovskyĭ and Yu. Samoĭlenko, Introduction to the Theory of Representations of Finitely Presented ∗-Algebras. I. Representations by bounded operators, The Gordon and Breach Publishing Group, London (1999).
[28] W. Pusz and S. L. Woronowicz, Twisted second quantization, Rep. Math. Phys. 27 (1989), 251–263.
[29] D. Proskurin, Stability of a special class of $q_{ij}$-CCR and extensions of higher-dimensional noncommutative tori, Lett. Math. Phys. 52, 2 (2000), 165–175.
[30] D. Proskurin and Yu. Samoĭlenko, Stability of the C*-algebra associated with the twisted CCR, Algebras and Rep. Theory 5 (2002), 433–444.
[31] J. Slawny, On factor representations and the C*-algebra of canonical commutation relations, Comm. Math. Phys. 24 (1972), 151–170.
[32] B. Brenken, A classification of some noncommutative tori, Rocky Mountain J. Math. 20, 2 (1990), 389–397.
[33] L. Vaksman, Lectures on q-analogues of Cartan domains and associated Harish-Chandra modules, math.QA/0109198.
[34] G. A. Elliott, On the classification of C*-algebras of real rank zero, J. Reine Angew. Math. 443 (1993), 179–219.
[35] G. A. Elliott and D. E. Evans, The structure of the irrational rotation C*-algebra, Ann. Math. 138 (1993), 477–501.
[36] G. A. Elliott and G. Gong, On inductive limits of matrix algebras over the two-torus, Am. J. Math. 118 (1996), 263–290.
[37] G. A. Elliott and Q. Lin, Cut-down method in the inductive limit decomposition of noncommutative tori, J. London Math. Soc. 54 (1996), 121–134.
[38] G. A. Elliott and M. Rørdam, The automorphism group of the irrational rotation algebra, Comm. Math. Phys. 155 (1993), 3–26.
[39] M. Pimsner and D. Voiculescu, Imbedding of the irrational C*-algebra into AF algebras, J. Oper. Theory 4 (1980), 201–210.
[40] M. Pimsner and D. Voiculescu, Exact sequence for K-groups and Ext groups of certain cross-product C*-algebras, J. Oper. Theory 4 (1980), 93–118.
[41] S. Disney and I. Raeburn, Homogeneous C*-algebras whose spectra are tori, J. Austral. Math. Soc. Ser. A 38, 1 (1985), 9–39.
[42] M. A. Rieffel, C*-algebras associated with irrational rotations, Pacific J. Math. 93 (1981), 415–429.
[43] M. A. Rieffel, Projective modules over higher-dimensional non-commutative tori, Canad. J. Math. 40 (1988), 257–338.
June 19, 2003 16:13 WSPC/148-RMP
00165
Reviews in Mathematical Physics Vol. 15, No. 4 (2003) 339–386 © World Scientific Publishing Company
EXPONENTIALLY SMALL SPLITTING AND ARNOLD DIFFUSION FOR MULTIPLE TIME SCALE SYSTEMS
MICHELA PROCESI S.I.S.S.A. Functional Analysis Sector, 34014 Trieste
Received 8 May 2002
Revised 10 February 2003

We consider the class of Hamiltonians:
\[ \frac{1}{2} \sum_{j=1}^{n-1} I_j^2 + \frac{1}{2}\varepsilon I_n^2 + \frac{p^2}{2} + \varepsilon\left[(\cos q - 1) - b^2(\cos 2q - 1)\right] + \varepsilon\mu f(q) \sum_{i=1}^{n} \sin(\psi_i) , \]
where $0 \le b < \frac{1}{2}$, and the perturbing function $f(q)$ is a rational function of $e^{iq}$. We prove upper and lower bounds on the splitting for such a class of systems, in regions of the phase space characterized by one fast frequency. Finally, using an appropriate Normal Form theorem, we prove the existence of chains of heteroclinic intersections.

Keywords: Homoclinic splitting; Arnold diffusion; whiskered tori; perturbation theory; diagrammatic expansion.
Contents

1. Presentation of the Model and Main Theorems
2. Perturbative Construction of the Homoclinic Trajectories
2.1 Whisker calculus, the "primitive" $\Im^t$
2.2 The recursive equations
3. Proofs of the Theorems
3.1 The formal linear equation
3.2 Lower bounds on the Melnikov term
3.3 Heteroclinic intersection for systems with one fast frequency
4. Tree Representation
4.1 Definitions of trees
4.2 Admissible trees
4.3 Values of trees
4.4 Tree identities
4.4.1 Mark adding functions
4.4.2 Fruit adding functions
4.4.3 Changing the first node
4.5 Upper bounds on the values of trees
A. Appendix
A.1 Proof of Proposition 4.16
A.2 Normal form theorem
A.3 Proof of Lemma 4.23
References
1. Presentation of the Model and Main Theorems

The general setting of this paper is the problem of homoclinic splitting and Arnol'd diffusion in a priori stable systems with three or more relevant time scales. The general strategy is the one proposed in [1] and [2], and in particular the application to a priori stable systems proposed in [3] and further developed in [4].

More precisely, we consider a class of close to integrable Hamiltonian systems with $n$ degrees of freedom for which one can prove the existence of $(n-1)$-dimensional unstable KAM tori together with their stable and unstable manifolds. We use a perturbative diagrammatic construction (proposed and developed in [3], [4] and [5]) to prove upper bounds on the angles of intersection of the stable and unstable manifolds of a KAM torus (homoclinic splitting). Such bounds are generally exponentially small in the perturbation parameter and depend on the chosen torus, in particular on the number of fast degrees of freedom. For systems with one fast degree of freedom we also prove lower bounds on the homoclinic splitting through the mechanism of Melnikov dominance. Finally, for such systems we prove the existence of "long" chains of heteroclinic intersections; namely, we produce a list of unstable KAM tori $T_1, \dots, T_h$ such that $T_1$, $T_h$ are at distances of order one in the action variables and the unstable manifold of each $T_i$ intersects the stable manifold of $T_{i+1}$.

This paper is a generalization of the results of [4], [5], [6]; therefore, in proving our claims we will rely heavily on intermediate results proved in those papers, which we will not prove again.

Consider the class of Hamiltonians
\[ \frac{1}{2} \sum_{j=1}^{n-1} \tilde I_j^2 + \frac{1}{2}\varepsilon \tilde I_n^2 + \frac{\tilde p^2}{2} + \varepsilon\left[(\cos \tilde q - 1) - \frac{1 - c^2}{4}(\cos 2\tilde q - 1)\right] + \varepsilon\mu f(\tilde q) \sum_{i=1}^{n} \sin \tilde\psi_i , \qquad (1.1) \]
where the pairs $\tilde I \in \mathbb{R}^n$, $\tilde\psi \in \mathbb{T}^n$ and $\tilde p \in \mathbb{R}$, $\tilde q \in \mathbb{T}$ are conjugate action-angle coordinates, $0 < c \le 1$, $f(\tilde q)$ is odd and analytic on the torus and $\mu$, $\varepsilon$ are small parameters. We will treat them as independent and then show that Arnold diffusion occurs for $\mu \le \varepsilon^P$, for an appropriate $P$.

This class of Hamiltonians is a model for a near to integrable system close to a simple resonance where the dependence on the hyperbolic variables is not through the standard pendulum, but still maintains various qualitative properties of the pendulum. Namely, we have a "generalized pendulum",
\[ \frac{\tilde p^2}{2} + \varepsilon\left[(\cos \tilde q - 1) - \frac{1 - c^2}{4}(\cos 2\tilde q - 1)\right] , \]
which has an unstable fixed point in $\tilde p = \tilde q = 0$ with Lyapunov exponent $\lambda = c\sqrt{\varepsilon}$.
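The stated Lyapunov exponent can be read off from the quadratic part of the generalized pendulum at the origin. The following short expansion is a sketch added for the reader, not part of the original derivation:

```latex
% Linearization of the generalized pendulum at \tilde p = \tilde q = 0.
\begin{align*}
\cos \tilde q - 1 &= -\tfrac{1}{2}\tilde q^2 + O(\tilde q^4) , \qquad
\cos 2\tilde q - 1 = -2\tilde q^2 + O(\tilde q^4) , \\
\frac{\tilde p^2}{2} + \varepsilon\Big[(\cos \tilde q - 1) - \frac{1 - c^2}{4}(\cos 2\tilde q - 1)\Big]
  &= \frac{\tilde p^2}{2} - \frac{\varepsilon c^2}{2}\,\tilde q^2 + O(\tilde q^4) , \\
\dot{\tilde q} = \tilde p , \quad \dot{\tilde p} = \varepsilon c^2\, \tilde q
  \quad &\Longrightarrow \quad \lambda = c\sqrt{\varepsilon} .
\end{align*}
```

For $c = 1$ the second harmonic drops out and one recovers the standard pendulum with exponent $\sqrt{\varepsilon}$.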
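Generally one rescales the time and action variables so that the Lyapunov exponent is one:
\[
I(t) = \frac{\tilde I\big(\frac{t}{c\sqrt{\varepsilon}}\big)}{c\sqrt{\varepsilon}} , \quad \psi(t) = \tilde\psi\Big(\frac{t}{c\sqrt{\varepsilon}}\Big) , \quad p(t) = \frac{\tilde p\big(\frac{t}{c\sqrt{\varepsilon}}\big)}{c\sqrt{\varepsilon}} , \quad q(t) = \tilde q\Big(\frac{t}{c\sqrt{\varepsilon}}\Big) . \qquad (1.2)
\]
Such rescaling sends Hamiltonian (1.1) into
\[
\frac{(I, A(\varepsilon)I)}{2} + \frac{p^2}{2} + \frac{1}{c^2}\Big[(\cos q - 1) - \frac{1-c^2}{4}(\cos 2q - 1)\Big] + \mu f(q)\sum_{i=1}^{n}\sin(\psi_i) \qquad (1.3)
\]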
where $A(\varepsilon)$ is the diagonal matrix with eigenvalues $a_i = 1$ for $i = 1, \dots, n-1$ and $a_n = \varepsilon$. So from now on we will work on Hamiltonian (1.3) and turn back to Hamiltonian (1.1) only to prove the existence of heteroclinic chains. The system (1.3) is integrable for $\mu = 0$. It represents a list of n uncoupled rotators and a generalized pendulum (depending on the parameter c). We will denote the frequency of the rotators (which determines the initial data I(0)) by $\omega$ so that
\[
I(t) = I(0) = A^{-1}\omega , \qquad \psi(t) = \psi(0) + \omega t .
\]
The initial data are chosen in an appropriate domain (physically interesting in the variables $\tilde I$) so that there are at least three characteristic orders of magnitude for the frequencies of the unperturbed system.

Definition 1.1. In frequency space we first consider the ellipsoid
\[
\Sigma := \Big\{ x \in \mathbb{R}^n : \sum_{i=1}^{n} x_i^2/a_i = 2E \Big\}
\]
where E is an order one constant, $E \sim O_\varepsilon(1)$.^a For notational convenience we split the frequency $\omega$ in two vectorial components: $\omega = \big(\frac{\omega_1}{\sqrt{\varepsilon}}, \varepsilon^{\alpha}\omega_2\big)$ with $\omega_1 \in \mathbb{R}^m$, $\omega_2 \in \mathbb{R}^{n-m}$, and $0 \le \alpha \le \frac12$. Finally, given two suitable order one constants $R, r \sim O_\varepsilon(1)$, we consider the region
\[
\Omega \equiv \{ \omega \in \mathbb{R}^n : \sqrt{\varepsilon}\,\omega \in \Sigma , \ r < |\omega_{1,i}| < R \ \text{and} \ r < |\omega_2| < R , \ \varepsilon^{\alpha}|\omega_{2,i}| \ge \sqrt{\varepsilon} , \ \varepsilon^{\alpha}|\omega_{2,n-m}| \sim \sqrt{\varepsilon} \} .
\]
We have chosen the generalized pendulum so that its dynamics on the separatrix is particularly simple,^b namely
\[
q(t) = 2\,\mathrm{arccotg}\Big(\frac{1}{c}\sinh(\pm t)\Big) , \qquad e^{iq(t)} = \frac{\sinh(\pm t) + ic}{\sinh(\pm t) - ic} . \qquad (1.4)
\]

^a Now and in the following we will say $a(\varepsilon) \sim O_\varepsilon(f(\varepsilon))$ if $\lim_{\varepsilon\to 0^+} \frac{a(\varepsilon)}{f(\varepsilon)} = L \ne 0$.
^b The motion on the separatrix can be easily obtained by direct computation; the main feature is that the motion on the separatrix is such that $e^{iq(t)}$ is a rational function of $e^t$. Here we are considering the simplest class of examples, which contains the standard pendulum c = 1.
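The separatrix formula (1.4) can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the "+" branch of (1.4), takes the pendulum part of (1.3) with $\mu = 0$, and uses finite differences; the function names are illustrative only.

```python
import math

def q_sep(t, c):
    """Separatrix angle of (1.4) on the '+' branch: q(t) = 2*arccot(sinh(t)/c)."""
    # For c > 0, arccot(sinh(t)/c) in (0, pi) equals atan2(c, sinh(t)).
    return 2.0 * math.atan2(c, math.sinh(t))

def pendulum_force(q, c):
    """q'' for the rescaled generalized pendulum of (1.3) with mu = 0."""
    return math.sin(q) * (1.0 - (1.0 - c * c) * math.cos(q)) / (c * c)

def pendulum_energy(q, p, c):
    """Pendulum part of Hamiltonian (1.3); it vanishes at p = q = 0."""
    potential = (math.cos(q) - 1.0) - 0.25 * (1.0 - c * c) * (math.cos(2.0 * q) - 1.0)
    return 0.5 * p * p + potential / (c * c)

def separatrix_residuals(t, c, h=1e-4):
    """Finite-difference residuals of the equation of motion and of the energy."""
    q_minus, q_mid, q_plus = q_sep(t - h, c), q_sep(t, c), q_sep(t + h, c)
    p = (q_plus - q_minus) / (2.0 * h)                   # q' = p
    q_ddot = (q_plus - 2.0 * q_mid + q_minus) / (h * h)  # q''
    return abs(q_ddot - pendulum_force(q_mid, c)), abs(pendulum_energy(q_mid, p, c))
```

Both residuals should be small (limited only by the finite-difference step) for any $0 < c \le 1$, and the linearization of `pendulum_force` at q = 0 has slope one, in agreement with the unit Lyapunov exponent of the rescaled system.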
M. Procesi
There are at least three characteristic time scales $O_\varepsilon(\varepsilon^{-\frac12})$, $O_\varepsilon(\varepsilon^{\alpha})$, $O_\varepsilon(\sqrt{\varepsilon})$ (coming from the degenerate variable $I_n$) and 1, which is the Lyapunov exponent of the unperturbed pendulum. We will call $\psi_1, \dots, \psi_m$ the fast variables and we will sometimes denote them as $\psi_F \in \mathbb{T}^m$. Conversely we will call $\psi_{m+1}, \dots, \psi_n$ the slow variables $\psi_S \in \mathbb{T}^{n-m}$. The perturbing function is a trigonometric polynomial of degree one in the rotators $\psi$ and a rational function^c in $e^{iq}$. We have decoupled the dependence of $\psi$ and q only to simplify the computations. For each $\omega \in \mathbb{R}^n$ the unperturbed system has an unstable fixed torus,
\[
p(t) = q(t) = 0 , \qquad I(t) = I(0) = A^{-1}\omega , \qquad \psi(t) = \psi(0) + \omega t .
\]
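The stable and unstable manifolds of such tori coincide and can be expressed as graphs on the angles.

Definition 1.2. Given any $\gamma \in \mathbb{R}$, $\varepsilon < \gamma \le O(\varepsilon^{\frac12})$ and a fixed $\tau > n - 1$, we define the set
\[
\Omega_\gamma \equiv \Big\{ \omega \in \Omega : |\omega \cdot l| > \frac{\gamma}{|l|^{\tau}} , \ \forall\, l \in \mathbb{Z}^n \setminus \{0\} \Big\}
\]
of $\gamma, \tau$ Diophantine vectors in $\Omega$. Now we consider
\[
\Omega^*_\gamma \equiv \Omega_\gamma \times \Big[-\frac12, \frac12\Big]
\]
and for all $(\omega, \rho) \in \Omega^*_\gamma$ we set $\omega_\rho = (1 + \rho)\omega$.

For all $(\omega, \rho) \in \Omega^*_\gamma$ and for all $l \in \mathbb{Z}^n \setminus \{0\}$, $|\omega_\rho \cdot l| > \frac{\gamma}{2|l|^{\tau}}$; $\omega \in \Omega_\gamma$ implies that $\omega_1$ and $\omega_2$ are Diophantine as well; we will call $\tau_F$ and $\tau_S$ their exponents. KAM like theorems (see [2], [5]) imply that there exists $\mu_0(\varepsilon, \gamma) \sim \varepsilon^2$ such that if $|\mu| \le \mu_0$ and if $(\omega, \rho) \in \Omega^*_\gamma$, there exists one and only one n-dimensional $H_\mu$ invariant unstable torus $T_\mu(\omega, \rho)$ whose Hamiltonian flow is analytically conjugated to the flow $\mathbb{T}^n \ni \vartheta \to \vartheta + \omega_\rho t$. Moreover one can parameterize the stable and unstable manifolds of $T_\mu(\omega, \rho)$ by functions $I^{\pm}(\omega, \varphi, q, \mu)$, analytic in the last three arguments, with $(\varphi, q) \in \mathbb{T}^n \times [-\frac32\pi, \frac32\pi]$. Namely given
\[
z^{\pm}(\omega, \varphi, q, \mu) = (I^{\pm}(\omega, \varphi, q, \mu), p^{\pm}(\omega, \varphi, q, \mu), \varphi, q) ,
\]
where the pendulum action is derived by energy conservation, the trajectory^d
\[
z(\omega, \varphi, q, \mu, t) = \begin{cases} \Phi^t_H z^{+}(\omega, \varphi, q, \mu) & \text{if } t > 0 \\ \Phi^t_H z^{-}(\omega, \varphi, q, \mu) & \text{if } t < 0 \end{cases}
\]
tends exponentially to a quasi-periodic function of frequency $\omega$.

^c Actually it is sufficient that the singularity of $f(\psi(t), q(t))$ which is nearest to the real axis is polar and isolated.
^d $\Phi^t_H$ is the evolution at time t of the Hamiltonian flow (1.3).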
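The Diophantine condition of Definition 1.2 can only be tested numerically up to a finite cutoff. The sketch below is an illustration under stated assumptions: $|l|$ is taken to be the 1-norm (the text does not fix the norm), the cutoff `max_norm` is hypothetical, and the sample frequency vectors are not drawn from the region $\Omega$ of the paper.

```python
import itertools
import math

def diophantine_violations(omega, gamma, tau, max_norm):
    """Integer vectors l with 0 < |l|_1 <= max_norm violating |omega.l| > gamma/|l|^tau.

    Assumption: |l| is the 1-norm. An empty result only certifies the
    condition up to the cutoff, not for all l in Z^n minus {0}.
    """
    n = len(omega)
    bad = []
    for l in itertools.product(range(-max_norm, max_norm + 1), repeat=n):
        norm = sum(abs(x) for x in l)
        if norm == 0 or norm > max_norm:
            continue
        small_divisor = abs(sum(w * x for w, x in zip(omega, l)))
        if small_divisor <= gamma / norm ** tau:
            bad.append(l)
    return bad
```

For instance, a frequency vector built on the golden mean passes the test at moderate cutoffs, while a resonant vector such as (1, 2) fails immediately at l = (2, −1).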
Remark 1.3. We have introduced the variable $\rho$ in order to fix the energy of the perturbed system,^e namely given a list of $\omega_i \in \Omega_\gamma$ one can find $\rho(\omega_i, \mu)$ such that all the corresponding whiskered tori are on the same energy surface, see for instance [5].

Definition 1.4. We will study the difference between the stable and unstable manifolds on a hyper-plane transverse to the flow (a Poincaré section); we choose the hyper-plane $q = \pi$ and consequently drop the dependence on q. We call
\[
G^0_j(\varphi, \omega) = \frac12 a_j \big(I_j(\varphi, \omega, 0^-) - I_j(\varphi, \omega, 0^+)\big)
\]
the splitting vector and prove that $G^0_j(\varphi = 0, \omega) = 0$. A measure of the transversality is
\[
\Delta^0_{ij} = \partial_{\varphi_j} G^0_i(\varphi)\big|_{\varphi = 0} ,
\]
called splitting matrix.

We will prove the following theorems:

Theorem 1. The splitting matrix $\Delta^0$ satisfies the formal power series relation^f:
\[
\Delta^0 \sim A D^0 B
\]
where A, B are close to identity matrices and $D^0$ is the “holomorphic part” of the splitting matrix; namely its entries are expressed as integrals over $\mathbb{R}$ of analytic functions. Moreover the formal power series involved are all asymptotic.^g

This statement was posed as a conjecture in [7] Paragraph 3.

Corollary 1.5. The preceding Theorem implies that Hamiltonian (1.3), in regions of the action variables corresponding to $m \ne 0$ fast time scales, has exponentially small upper bounds on the determinant of the splitting matrix:
\[
|\det \Delta^0| \le C e^{-\frac{c}{\varepsilon^{b}}} , \quad \text{with } b = \frac{1}{2m} ,
\]
provided that $\mu < \varepsilon^{1 + 2\frac{n}{m}}$.

Theorem 2. Consider Hamiltonian (1.3) in regions of the action variables corresponding to m = 1 fast variables and for perturbing functions f(q) such that the pole of $f(q(t))$ closest to the imaginary axis, say $\bar t$, is such that $|\mathrm{Im}\, \bar t| = d \le \arcsin c$. Setting $\mu \le \varepsilon^P$ with $P = p/2 + 8 + 4n$, where p is the degree of the pole of $f(q(t))$ in $\bar t$, we prove that
\[
C_1 \varepsilon^{-p_1} e^{-\frac{d|\omega_1|}{\sqrt{\varepsilon}}} \le |\det \Delta^0| \le C_2 \varepsilon^{-p_2} e^{-\frac{d|\omega_1|}{\sqrt{\varepsilon}}}
\]
where $C_1, C_2, p_1, p_2$ are appropriate order one constants.

^e The final goal is to find heteroclinic intersections on the fixed energy surface, and so “Arnold diffusion”, but in the following sections we will discuss only homoclinic intersections and so we will drop the parameter $\rho$.
^f We denote formal power series identities with the symbol $A \sim B$.
^g A formal power series $\sum_n \mu^n a_n(\varepsilon)$ is asymptotic if for all q > 0 there exists Q > 0 such that for all $n \le \varepsilon^{-q}$ then $a_n(\varepsilon) \le \varepsilon^{-Qn}$.
Corollary 1.6. Under the conditions of Theorem 2 the Hamiltonian (1.1) has heteroclinic chains, namely a set of $N \ge 1$ trajectories $z^1(t), \dots, z^N(t)$ together with N + 1 different minimal sets^h $T_0, \dots, T_N$ such that for all $1 \le i \le N$
\[
\lim_{t \to -\infty} \mathrm{dist}(z^i(t), T_{i-1}) = 0 = \lim_{t \to \infty} \mathrm{dist}(z^i(t), T_i) .
\]
Moreover one can construct such chains between tori $T(\omega^a; \mu)$, $T(\omega^b; \mu)$ such that $\omega^a, \omega^b \in \bar\Omega \subset \Omega_\gamma$ and
\[
|\varepsilon^{-\frac12}(\omega^a_n - \omega^b_n)| \sim O_\varepsilon(1) .
\]

The techniques used for proving the Theorems are those proposed in [3] and developed in [4] for partially isochronous three time scale systems with three degrees of freedom. In this paper, particular attention is given to the formalization of the tree expansions and of the “Dyson equation” and relative cancellations proposed in [4]. This enables us to extend Theorem 1 to systems with n degrees of freedom and at least two time scales; moreover the proof is definitely simplified and quite compact. In this article we have considered completely anisochronous systems only to fix an example; generalizing to partially (or totally, thus recovering the results of [8]) isochronous systems is completely trivial.

Theorem 1 and hence Corollary 1.5 are purely formal, relying only on general features of the perturbation series for the homoclinic trajectory; indeed they can be proved for very general systems, as we will show in a forthcoming paper. Moreover we have generalized the class of perturbing functions and the “pendulum” (the literature considers only trigonometric polynomials and the standard pendulum); the latter generalizations are quite technical but nevertheless non-trivial and, we think, interesting, as the techniques we propose are easily generalizable and give a clear picture of the limits of proving Arnold diffusion via Melnikov dominance.
2. Perturbative Construction of the Homoclinic Trajectories

One can use perturbation theory to find the (analytic for $\mu \le \mu_0$) trajectories on the S/U manifolds of Hamiltonian^i (1.3)
\[
z(\varphi, \omega, t) = \sum_k (\mu)^k z^k(\varphi, \omega, t) .
\]

^h A closed subset of the phase space is called minimal (with respect to a Hamiltonian flow $\Phi^t_h$) if it is non-empty, invariant for $\Phi^t_h$ and contains a dense orbit. In our case the minimal sets will be unstable tori T(I) with $\omega(I)$ Diophantine.
^i Notice that the apex k on the functions I, $\psi$ represents the order in the expansion in $\mu$, NOT an exponent. To avoid confusion, when we need to exponentiate we always set the argument in parentheses.
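Namely we insert the expansion in $\mu$ in the Hamilton equations of system (1.3),
\[
\dot\psi_j = a_j I_j , \qquad \dot I_j = -(\mu)\cos\psi_j\, f(q) ,
\]
\[
\dot q = p , \qquad \dot p = \frac{1}{c^2}\sin q\,\big(1 - (1 - c^2)\cos q\big) - (\mu)\sum_{i=1}^{n}\sin\psi_i\, \frac{df}{dq}(q) , \qquad (2.1)
\]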
and find initial data $I(\omega, \varphi, \mu, 0^{\pm})$ (and consequently $p(\omega, \varphi, \mu, 0^{\pm})$) such that the solution of (2.1) tends exponentially to a quasi-periodic function of frequency $\omega$. Inserting in the Hamilton equations the convergent power series representation:
\[
I(t, \varphi, \mu) = \sum_{k=0}^{\infty} (\mu)^k I^k(t, \varphi) , \qquad \psi(t, \varphi, \mu) = \sum_{k=0}^{\infty} (\mu)^k \psi^k(t, \varphi) ,
\]
\[
p(t, \varphi, \mu) = \sum_{k=0}^{\infty} (\mu)^k p^k(t, \varphi) , \qquad q(t, \varphi, \mu) = q^0(t) + \sum_{k=1}^{\infty} (\mu)^k \psi_0^k(t, \varphi)
\]
we obtain, for k > 0, the hierarchy of linear non-homogeneous equations,^j $\dot\psi_j^k = a_j I_j^k$ ,
I˙jk = Fjk ({ψih }i=0,...,n ) , h 0 we have a linear non-homogeneous ODE that we can solve by variation of constants. The fundamental solution of the linearized pendulum equation is given by, w˙ 0 x˙ 00 , w0 = 1 σ(t)x1 where σ(t) = sign(t) W (t) = 0 0 2 w 0 x0 x00 =
$\dfrac{c^2\cosh(t)}{c^2 + \sinh(t)^2}$ , $\qquad x_0^1 = \dfrac{\sigma(t)\, x_0^0}{2c^4}\,\big(2(-3 + 4c^2)\, t + \sinh(2t) + 4(-1 + c^2)^2 \tanh(t)\big)$ . (2.3)
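The two functions in (2.3) can be tested numerically. The sketch below is an illustration, not part of the original proof; it assumes that the linearized pendulum equation is $\ddot x = \frac{1}{c^2}\big(\cos q_0 - (1-c^2)\cos 2q_0\big)x$ along the separatrix (1.4), obtained by differentiating the force term of (2.1) at $\mu = 0$, and that $x_0^0 = c^2\cosh(t)/(c^2+\sinh(t)^2)$. It checks, by finite differences and away from t = 0, that both functions solve this equation and that $x_0^0$ and $\frac12\sigma(t)x_0^1$ have unit Wronskian.

```python
import math

def sign(t):
    return 1.0 if t > 0 else (-1.0 if t < 0 else 0.0)

def x00(t, c):
    """First solution in (2.3); proportional to dq/dt on the separatrix."""
    return c * c * math.cosh(t) / (c * c + math.sinh(t) ** 2)

def x01(t, c):
    """Second solution in (2.3), including the factor sigma(t)."""
    g = (2.0 * (-3.0 + 4.0 * c * c) * t + math.sinh(2.0 * t)
         + 4.0 * (-1.0 + c * c) ** 2 * math.tanh(t))
    return sign(t) * x00(t, c) * g / (2.0 * c ** 4)

def lin_coeff(t, c):
    """Coefficient of x in the linearized equation, using cos q0 from (1.4)."""
    s2 = math.sinh(t) ** 2
    cos_q = (s2 - c * c) / (s2 + c * c)
    cos_2q = 2.0 * cos_q ** 2 - 1.0
    return (cos_q - (1.0 - c * c) * cos_2q) / (c * c)

def ode_residual(x, t, c, h=1e-4):
    """|x'' - lin_coeff * x| with a centered second difference (use t != 0)."""
    x_dd = (x(t + h, c) - 2.0 * x(t, c) + x(t - h, c)) / (h * h)
    return abs(x_dd - lin_coeff(t, c) * x(t, c))

def wronskian(t, c, h=1e-5):
    """Wronskian of x00 and w = sigma(t) * x01 / 2 (use t != 0)."""
    w = lambda u: 0.5 * sign(u) * x01(u, c)
    dx = (x00(t + h, c) - x00(t - h, c)) / (2.0 * h)
    dw = (w(t + h) - w(t - h)) / (2.0 * h)
    return x00(t, c) * dw - dx * w(t)
```

For c = 1 the functions reduce to the familiar standard-pendulum pair $1/\cosh t$ and $\sigma(t)(\sinh t + t/\cosh t)$.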
It is easily seen (see [3] or [5]) that one can choose an appropriate “primitive” in the right hand side of the first column of Eqs. (2.2) so that the solutions are exponentially quasi-periodic.

2.1. Whisker calculus, the “primitive” $\Im^t$

Let us first define the function spaces on which we work; all the definitions and statements of this Subsection and of the following one are proposed and explained in detail in [3], we are simply reformulating them to suit our needs.

Definition 2.1. (i) $\mathcal{H}$ is the vector space (on $\mathbb{C}$) generated by monomials of the form
\[
m = \sigma(t)^a \frac{|t|^j}{j!} x^h e^{i(\varphi + \omega t)\cdot\nu} , \qquad (2.4)
\]
where $x = e^{-|t|}$, $a = 0, 1$, $h \in \mathbb{Z}$, $\nu \in \mathbb{Z}^n$, $j \in \mathbb{N}$, $\sigma(t) = \mathrm{sign}(t)$.

(ii) Given two positive constants b and d, $\mathcal{H}(b, d)$ is the subset of functions f(t) analytic on the real axis in $t \ne 0$ that admit, separately for t > 0 and t < 0, a (unique) representation,
\[
f(t) = \sum_{j=0}^{k} \frac{|t|^j}{j!} M_j^{\sigma(t)}(x, \varphi + \omega t) , \qquad (2.5)
\]
with $M_j^{\sigma(t)}(x, \varphi)$ trigonometric polynomials in $\varphi$ and the function $M_k^{\sigma(t)}$ not identically zero.
The Fourier coefficients $M_{j\nu}^{\sigma(t)}(x)$ are all holomorphic in the x-plane in a region $\{0 < |x| < e^{-b}\} \cup \{|\arg x| < d\}$ and have possible polar singularities at x = 0. k is called the t degree of f. In Fig. 1 we have represented a possible domain of analyticity for the $M_{\nu j}$. Notice that $\mathcal{H}$ is contained in all the spaces $\mathcal{H}(b, d)$; moreover if |t| > b, f(t) can be represented as an absolutely convergent series of monomials of the type m, separately for t > b and t < −b.

Fig. 1.

One can easily check that the functional that acts on monomials m of the form (2.4) as
\[
\Im^t(m) = \begin{cases} -\sigma^{a+1} \displaystyle\sum_{p=0}^{j} \frac{|t|^{j-p}}{(j-p)!\,(h - i\sigma\omega\cdot\nu)^{p+1}}\, x^h e^{i(\varphi + \omega t)\cdot\nu} & \text{if } |h| + |\nu| \ne 0 \\[2ex] -\dfrac{\sigma^{a+1} |t|^{j+1}}{(j+1)!} & \text{if } |h| + |\nu| = 0 \end{cases} \qquad (2.6)
\]
is a primitive of m. We can extend $\Im^t$, with |t| > b, to a primitive on functions $f \in \mathcal{H}(b, d)$ by expanding f in the monomials m (we obtain absolutely convergent series) and applying (2.6). Then if $|t| \le b$ we set
\[
\Im^t \equiv \Im^{2\sigma(t)b} + \int_{2\sigma(t)b}^{t} , \qquad (2.7)
\]
obviously the choice of 2b is arbitrary and this is still the same primitive of f. In $\mathcal{H}(b, d)$ we can extend $\Im^t$ to complex values of t such that $t \in C(b, d)$ where
\[
C(b, d) := \{t \in \mathbb{C} : |\mathrm{Im}\, t| \le d, \ |\mathrm{Re}\, t| \le b\} \cup \{t \in \mathbb{C} : |\mathrm{Im}\, t| \le 2\pi, \ |\mathrm{Re}\, t| > b\}
\]
is the domain in Fig. 1 in the t variables. An equivalent (and quite useful) definition of $\Im^t$ is
\[
\Im^t f = \oint \frac{du}{2i\pi u} \int_{\sigma(t)\infty + is}^{t} e^{-\sigma(\tau) u \tau} f(\tau)\, d\tau , \qquad (2.8)
\]
where $\sigma(t) = \mathrm{sign}(\mathrm{Re}\, t)$, $t = t_1 + is$, with $t_1, s \in \mathbb{R}$ and the integral is performed on the line $\mathrm{Im}\, \tau = s$; finally the integrals in u have to be considered to be the analytic continuation on u from u positive and large. This definition is clearly compatible with the formal definition given above and one easily sees that $\mathcal{H}(b, d)$ is closed under the application of $\Im^t$.

Definition 2.2. $\mathcal{H}_0(b, d)$ is the subspace of $\mathcal{H}(b, d)$ of functions that can be extended to analytic functions in C(b, d).
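The first case of (2.6) can be checked by differentiating numerically. The following sketch is an illustration under stated assumptions: the scalar products $\omega\cdot\nu$ and $\varphi\cdot\nu$ are passed directly as real numbers (hypothetical parameters `nu_omega`, `nu_phi`), and only the case $h - i\sigma\omega\cdot\nu \ne 0$ is covered.

```python
import cmath
import math

def sign(t):
    return 1.0 if t > 0 else -1.0

def monomial(t, a, j, h, nu_omega, nu_phi=0.0):
    """Monomial (2.4): sigma^a |t|^j/j! x^h exp(i(phi + omega t).nu), x = exp(-|t|)."""
    x = math.exp(-abs(t))
    phase = cmath.exp(1j * (nu_phi + nu_omega * t))
    return sign(t) ** a * abs(t) ** j / math.factorial(j) * x ** h * phase

def primitive(t, a, j, h, nu_omega, nu_phi=0.0):
    """First case of (2.6); requires h - i*sigma*omega.nu != 0."""
    s = sign(t)
    lam = h - 1j * s * nu_omega
    total = sum(abs(t) ** (j - p) / (math.factorial(j - p) * lam ** (p + 1))
                for p in range(j + 1))
    x = math.exp(-abs(t))
    phase = cmath.exp(1j * (nu_phi + nu_omega * t))
    return -(s ** (a + 1)) * total * x ** h * phase

def derivative_defect(t, a, j, h, nu_omega, step=1e-6):
    """|d/dt primitive - monomial| via a centered difference (use t != 0)."""
    d = (primitive(t + step, a, j, h, nu_omega)
         - primitive(t - step, a, j, h, nu_omega)) / (2.0 * step)
    return abs(d - monomial(t, a, j, h, nu_omega))
```

The defect should be at the level of the finite-difference error for both signs of t, which is the statement that (2.6) defines a primitive separately on t > 0 and t < 0.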
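Notice that f is in $\mathcal{H}_0(b, d)$ if it is in $\mathcal{H}(b, d)$ and f(t) is analytic at t = 0.

Remark 2.3. If $f \in \mathcal{H}_0(b, d)$ then generally $\Im f \notin \mathcal{H}_0(b, d)$ and has a discontinuity in t = 0. For instance if $f \in L^1$ is positive, then
\[
\Im(f) := (\Im^{0^-} - \Im^{0^+}) f = \int_{-\infty}^{\infty} f \ne 0 .
\]

We can construct operators which preserve $\mathcal{H}_0(b, d)$; let $\Im = \Im^{0^-} - \Im^{0^+}$ and
\[
\Im^t_{+} = \begin{cases} \Im^t & \text{if } t \ge 0 \\ \Im^t - \Im & \text{if } t < 0 , \end{cases} \qquad \Im^t_{-} = \begin{cases} \Im^t & \text{if } t \le 0 \\ \Im^t + \Im & \text{if } t > 0 . \end{cases}
\]
The operator
\[
\frac12 \sum_{\rho = \pm 1} \Im^t_{\rho} = \Im^t - \frac12 \sigma(t)\, \Im
\]
preserves the analyticity.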
Now let us cite two important properties of H0 (b, d), proved in [3]. Lemma 2.4. In H0 (b, d) we have the following shift of contour formulas: ∀f ∈ H0 (b, d) and for all d > s ∈ R, (i) =f (τ ) = =f (τ + is) , Z I X dR X t −Rσ(τ )(τ +is) t+is e f (τ + is)dτ . (ii) =ρ f (τ ) = 2iπR ρ=±1 ρ∞ ρ=±1 2.2. The recursive equations One can easily verify that f 1 (ψ0 (t), q0 (t)) and f 0 (q0 (t)) are in H0 (a, d) (and bounded at infinity) for some “optimal” values a, d corresponding respectively to the maximal distance from the imaginary axis and the minimal distance from the real axis of the poles of such functions. One can prove by induction, see [3] or [5] for the details, that the solutions of Eqs. (2.2) tend to quasi-periodic functions provided that the initial data are chosen to be: X X ± ± Ijk (ϕ, ω, 0± ) = µk =0 Fjk , p(ϕ, ω, 0± ) = µk =0 x00 F0k . k
k
Fjk (ϕ, ω, t)
Moreover one can prove that has no constant component. Consequently it is convenient to express the trajectories in terms of the “primitives” =t in the form (a0 = 1): 1 0k (µ)k ψjk (ϕ, t) = (µ)k aj Qtj Fjk + x0j G1k j + xj Gj
where x0j = 1, x1j = |t| for j 6= 0 while the xi0 are defined in Eq. 2.3,
1 k1 Qtj [f ] = (=t+ +=t− )[(x0j (t)σ(τ )x1j (τ )−x1j (t)σ(t)x0j (τ ))f (τ )] , Gik aj =xij Fjk . j = (µ) 2 2 For the proofs of these assertions see [3] or [5].
Notice that by our definitions,
\[
I_j(\varphi, 0^-) - I_j(\varphi, 0^+) = 2a_j^{-1} \sum_k G_j^{0k} \equiv 2a_j^{-1} G_j^0 , \qquad 2G_0^0 = p(\varphi, 0^-) - p(\varphi, 0^+) .
\]
We define the formal power series
\[
\sum_k G_j^{lk}(\varphi) \equiv G_j^l(\varphi) , \qquad j = 0, \dots, n , \quad l = 0, 1 .
\]
Notice that by the KAM theorem the $G_j^0$ are convergent series.

Remark 2.5. (i) We will often use formal power series and in particular formal power series identities, namely identities which hold only at each order k in the series expansion in $\mu$; we will mark such identities with the symbol $A \sim B$. In Sec. 4.5 we will prove that the formal power series we use are “asymptotic”. As a definition of asymptotic power series we will assume that a formal power series $\sum_n \mu^n a_n(\varepsilon)$ is asymptotic if for all q > 0 there exists Q > 0 such that, for all $n \le \varepsilon^{-q}$, $a_n(\varepsilon) \le \varepsilon^{-Qn}$. This implies that we can control the first $\varepsilon^{-q}$ terms provided that $\mu < \varepsilon^{Q}$.

(ii) It should be stressed that we do not need to prove convergence for all the asymptotic power series involved in a given identity to obtain information on those series which are known to be convergent (by the KAM theorem).

The following Proposition contains some important properties of the operators $Q_j$, all proved in [3].

Proposition 2.6 (Chierchia). (i) The operators $Q_j$ are “symmetric” on $\mathcal{H}(a, d)$: $\Im(f\, Q_j g) = \Im(g\, Q_j f)$ .
(ii) H0 (a, d) is closed under the application of Qtj . (iii) The operators Qj preserve parities and if f ∈ H0 (a, d) is odd then =f = 0. (iv) If F, G ∈ H(a, d) are such that the projection on polynomials, πP F · G, has no constant component, then σ
σ
=0 G(τ )∂τ F (τ ) = F (0σ )G(0σ ) − =0 F (τ )∂τ G(τ ) . Proposition 2.6(iii) immediately implies the following (again proved in [3]) Corollary 2.7. For all k ∈ N, j = 0, . . . , n, i = 0, 1, the function Gik j (ϕ) is zero for ϕ = 0. In particular the splitting vector is zero for ϕ = 0 and the system has an homoclinic point. Proof. We proceed by induction; by Proposition 2.6(iii) Gij 1 (ϕ = 0) = 0 as it is the integral of an odd analytic function. Consequently ψj1 (ϕ = 0, t) is both odd and in H0 (a, d). Now we suppose that Gij h (ϕ = 0) = 0 and ψjh (ϕ = 0, t) is odd and in H0 (a, d) for all h < k and j = 0, . . . , n. The function Fjk is an odd analytic P function of the angles ψi (∂ψj f δ ) computed at ψ = h m (m is the number of fast degrees of 1 1 freedom). We choose N = C1 ε− 2m¯ (where C1 ≤ (γF /|ω2 |) 2m¯ ) if α = 0) so that we can remove the absolute value in e−c|ω·ν| and for all frequencies ν such that νF 6= 0: |Sˆ0k (ν)| ≤ (k!)c1 ε−k e
cγ
− √ε(N )Fm−1 +c|ω2 ||ν| ¯
;
we can sum on the frequencies ν : νF = 6 0 in X 0k Dij = νi νj S 0k (ν) |ν|≤k
with ϕi or ϕj fast. 0k Dij ≤ (k!)c1 k 3 ε−k e−˜cε
−1/2m ¯
X
0≤l≤k
n−m
ec|ω2 |l
˜ −1 )k e−˜cε−1/2m¯ . ≤ (k!)c1 (Cε
¯ So we can sum the asymptotic series D 0 for k ≤ N and ν < ε1+2(τ +1)/m , N −1/2m ¯ µ ≤ Ce−˜cε . min |det D0≤N |, µ0
Finally we can take any m ¯ > m and τ > n − 1 so we choose m ¯ −1 = m−1 − − 21 −1 − 12 −1 −1/2m ¯ (log(ε )) (similarly τ + 1 = n + (log(ε )) ) so that ε = e−1 ε−1/2m and 1+2(τ +1)/m ¯ 1+2n/m ε ≥ Cε for some order one C. If we have only one fast variable we can give better bounds on det D 0 , namely we use |Sˆ0k (ν)| ≤ (k!)c1 (Cε−
p+7 2
)k e−|ω·ν|d
and the fact that for one fast frequency |ω1 | |ω · ν| ≥ √ |ν1 | − εα |ω2 ||νS | , ε 1
provided that ν1 6= 0 and N ≤ cε− 2 (with c < |ω1 |/|ω2 | if α = 0), so by summing up the formal power series we have that: 0≤N Dij
=
01 Dij
+
N X k=2
0k Dij ,
˜ p+7 2 +2n : where if |µ| ≤ Cε N X k=2
0k Dij ≤
N X
˜ − (µCε
p+7 2 −2n
)k
X
[e
−
|ω1 |dν1 √ ε
0 d, not ε close to any singularity (if ω1 < 0 we shift to Im t = −l < −d). l The
symbol f¯(z) := f (¯ z ). we could deal with any finite number of poles with this property.
m Naturally
Z Im
∞ −∞
e
iω1 t
ω i √1 t f (q0 (t)) ≥ 2π|Re[Res(e ε (f (q0 (t)), t0 ) ω
i √1ε t
(f (q0 (t)), −t¯0 )]| Z ∞ |ω | ω − √1ε l i √1ε t −e e (f (q0 (t + il)) − f (0)) , Im
+ Res(e
−∞
(3.7)
the last integral is again the integral of a bounded ε independent function so we bound it by an order one constant. The residue at the poles can be computed: (iω1 )k−1 (g (t ) − (−1)k g¯k (t0 )) (k−1)/2 k 0 (k − 1)!ε k=1,p X
which is real and generally greater than Ce
−
|ω1 | √ d ε
ε−
p−1 2
.
(3.8)
Proof of Theorem 2. We choose $|\mu| \le \varepsilon^{p/2 + 8 + 4n}$ so that (3.8) dominates on (3.6).

3.3. Heteroclinic intersection for systems with one fast frequency

In the following we will consider systems with one fast frequency and in the a priori stable variables of Hamiltonian (1.1). We can fix $\mu = \varepsilon^P$ and ensure Melnikov dominance, as discussed in the previous sections. This means that we have lower and upper bounds on the splitting determinant (and on the eigenvalues of the splitting matrix) of the type:
\[
a \varepsilon^{p} e^{-c\varepsilon^{-\frac12}} \le \det \Delta^0(\omega) \le b \varepsilon^{-p} e^{-c\varepsilon^{-\frac12}} .
\]
The coefficients p, a, b, c depend on the perturbing function f. We consider the function:
\[
F(\varphi, \omega_0, \omega) = \tilde I^-_\mu(\varphi, \omega, \rho(\omega)) - \tilde I^+_\mu(\varphi, \omega_0, \rho(\omega_0)) \equiv c\sqrt{\varepsilon}\,\big(I^-_\mu(\varphi, \omega, \rho(\omega)) - I^+_\mu(\varphi, \omega_0, \rho(\omega_0))\big)
\]
where $\omega, \omega_0 \in \Omega_\gamma$. Notice that
\[
F(0, \omega_0, \omega_0) = 0 , \qquad \det \frac{\partial F}{\partial \varphi}(0, \omega_0, \omega_0) = 2^n \varepsilon^{n/2} \det \Delta^0(\omega_0) .
\]
Hence from the implicit function theorem there exists a function $\varphi(\omega, \omega_0, \varepsilon)$ for which
\[
F_\mu(\varphi(\omega, \omega_0, \varepsilon), \omega, \omega_0) \equiv 0 ,
\]
provided $|\omega - \omega_0|$ is small enough. For fixed $\omega_0$, standard computations (see [5]) show that the smallness condition is
\[
|\omega - \omega_0| \le C \varepsilon^{-2p} e^{-2c\varepsilon^{-\frac12}} .
\]
To prove the existence of heteroclinic intersections, we have to prove the existence of a chain of KAM tori at distances of order $B = O_\varepsilon(e^{-C\varepsilon^{-\frac12}})$ for some C > 2c, namely we have to adapt to our anisotropic setting (one fast and many slow time scales) the classical techniques discussed in detail in [2] or [5].

Proposition 3.5. There exists a list of Diophantine frequencies $\omega_1, \dots, \omega_h \in \Omega_\gamma$ such that:
\[
\text{(i)} \ \sqrt{\varepsilon}\,|\omega_i - \omega_{i+1}| \le e^{-C_1 \varepsilon^{-\frac12}} , \qquad \text{(ii)} \ \varepsilon^{-\frac12}\,|\Pi_n(\omega_1 - \omega_h)| \sim O_\varepsilon(1) , \qquad (3.9)
\]
where $\Pi_n$ is the projection on the nth component. To each of the frequencies $\omega_i$ is associated a preserved unstable invariant torus of Hamiltonian (1.1), $T(\omega_i, \rho_i)$ (with $\rho_i \in [-\frac12, \frac12]$) of frequency $\sqrt{\varepsilon}\rho_i\omega_i$. The scaling factor $\rho_i$ is chosen so that all the invariant tori are on the same energy surface, as explained in Remark 1.3.

To prove the Proposition we proceed in two steps:

(1) Define an appropriate set $\bar\Omega$ of Diophantine frequencies respecting condition (3.9).
(2) Prove the existence of unstable KAM tori of frequency $\sqrt{\varepsilon}\rho\omega$ for $\rho \in [-\frac12, \frac12]$ and $\omega \in \bar\Omega$. We will only sketch the proof of this second point.

Definition 3.6. Given an order one $C_1 > 2c$, set $A_1 = e^{-C_1\varepsilon^{-\frac12}}$ and consider the set:
\[
\bar\Omega := \left\{ \omega \in \Omega : \begin{array}{ll} \text{(a)} \ \sqrt{\varepsilon}\,|\omega\cdot l| \ge \dfrac{A_1}{|l|^{\tau}} & \forall\, l \in \mathbb{Z}^n \setminus \{0\} : l_1 \ne 0 \\[2ex] \text{(b)} \ \sqrt{\varepsilon}\,|\omega\cdot l| \ge \dfrac{\varepsilon^2}{|l|^{\tau}} & \forall\, l \in \mathbb{Z}^n \setminus \{0\} : l_1 = 0 \end{array} \right\} .
\]
As there is only one fast time scale the condition $\omega \in \Omega$ can be given only on the slow variables, while the fast variable is obtained by “energy conservation” $\omega \in \Sigma$ ($\Sigma$ is the ellipsoid of Definition 1.1), namely we consider a function $F : \mathbb{R}^{n-1} \to \Sigma$:
\[
F(x) := \Big( \sqrt{2E - \sum_{i=2}^{n-1} x_i^2 - \varepsilon^{-1} x_n^2}\,,\ x_2, \dots, x_n \Big) ,
\]
so that given $\beta = \frac12 + \alpha$ ($\frac12 \le \beta \le 1$) and $R, r, R_1, r_1, r_2$ appropriate order one constants^n and defining:
\[
\tilde\Omega := \{ \tilde\omega \in \mathbb{R}^n : \tilde\omega\,\varepsilon^{-\frac12} \in \Omega \} ,
\]
we have $\tilde\Omega = F(B(R, r) \cap M)$

^n This condition automatically implies $\bar r \le \sqrt{\varepsilon}\,\omega_1 \le \bar R$; notice that we are not using the same notation as in (1.1), here $\omega_i$ is always the ith component of $\omega$.
where B(R, r) ⊂ Rn−1 is the spherical shello of radiuses εβ R, εβ r and M := {ω ∈ Rn−1 : εr1 ≤ ωn ≤ εR1 , ωi > r2 εβ , i = 2, . . . , n − 1} . √ As we always deal with ω ˜ = εω we will omit the tilde rescaling all the relations. The Jacobian of F in B(R, r) ∩ M is bounded from above and below by order one constants so that given a measurable setp S ⊂ Ω meas(F −1 (S)) ∼ meas(S). Condition (b) naturally defines subsets of B(R, r)∩M . Moreover we can project the set respecting condition (a) on the subspace of the slow variables. Call this set ¯ 4 ⊂ B(R, r) ∩ M . Ω Let us call S(x) the (n − 2)-dimensional sphere centered in the origin and of ¯ so that radius εβ x. We take 2r < R and consider R ¯ < R 1 , r > r1 . (3.10) R1 /2 < R ¯ R R Definition 3.7. Consider the sets ¯ ¯ + (R1 − R)/4), ¯ S2 := {ω ∈ S(R) : ε(R1 − (R1 − R)/4) ≤ ωn ≤ ε(R ω i ≥ r 2 εβ , ∀ i 6= n} , ¯, S3 := {ω ∈ S(R) : εR1 ≤ ωn ≤ εR
ω i ≥ r 2 εβ ,
∀ i 6= n} .
M ∩ S(R) ⊃ S3 ⊃ S2 ; and the sets all have measure of order ε(n−3)β+1 . Given a set X ∈ S(R), its cone C(X) is the set of semilines stemming from the origin and reaching points of X. We consider truncated cones T (X) := C(X) ∩ B(R, r), and, for any r < a < b < R, Ta,b (X) = T (X) ∩ B(b, a). Notice that by (3.10) if X ∈ S3 , then T (X) ∈ M ∩ B(R, r). Remark 3.8. Recall that given a measurable set X ∈ S(R), the cone of X is measurable and measT (X) ∼ εβ meas(X), meas Ta,b (X) ∼ εβ (b − a) meas(X). −1
Definition 3.9. Given A2 = e−C2 ε 2 with 2c < C2 < C1 and for all s ∈ R, 1 < s < 4R/r, we consider the sets: 2 −1 n−1 ¯ 2 (s) = ω ∈ B(R, r) : |ω · l| ≥ sε , ∀ l ∈ Z /{0} |l| ≤ A Ω 2 |l|τ sε2 Ω3 (s) = ω ∈ B(R, r) : |ω · l| ≥ τ ∀ l ∈ Zn−1 /{0} . |l| ¯ i (s)∩ Remark 3.10. Standard measure theoretic arguments imply that the sets (Ω C (n−3)β+2 ¯ i (s)∩ S(R)) ∩S(R) all have measure of order ε ; this implies as well that (Ω C S2 ) ∩ S2 has measure of the same order and the same holds for intersections with ¯ 2 (s) ∩ Ω ¯ 3 (s) ∩ S2 )C ∩ S2 . We will repeatedly use such relations. S3 and for (Ω o We
call spherical shell of radiuses b, a the (n − 1)-dimensional domain {x ∈ Rn−1 : a ≤ |x| ≤ b}. symbol ∼ means that the two measures are of the same order in ε.
p The
¯ 2 (2R/r) ∩ S2 , the whole solid ball Bρ (ω) of Lemma 3.11. (i) Given a point ω ∈ Ω 2 1+τ ¯ 2 (R/r) and its intersection with center ω and radius ρ = ε A2 is contained in Ω S(R) is contained in S3 . ¯ 2 (R/r) ∩ S3 ) is in Ω ¯ 2 (1), same for Ω ¯ 3. (ii) The whole truncated cone T (Ω Proof. (i) First notice that any (n − 2)-dimensional “ball”, Bρ (x) ∩ S(R) ∈ S3 if x ∈ S2 . Now consider ω ∈ Ω2 (2R/r) ∩ S2 and a vector x ∈ Rn−1 on the unit sphere: |l| r|l|τ +1 |l| ≤ , as |(ω + ρx) · l| ≥ ||ω · l| − |l|ρ| ≥ |ω · l| 1 − ρ |ω · l| |ω · l| 2Rε2
|l| < 12 . and |l| ≤ A2 , setting ρ = ε2 A1+τ we have 0 < ρ |ω·l| 2 (ii) Given a point x ∈ Ω3 (R/r) ∩ S(R) (or in x ∈ Ω2 (R/r) ∩ S(R)) then rx/R ∈ S(r). Moreover for r/R ≤ t ≤ 1:
|tx · l| = t|x · l| ≥ r/R
ε2 Rε2 = . r|l|τ |l|τ
¯ 2 (R/r) ∩ S(R) is union of a finite number of disjoint Lemma 3.12. The set Ω convex domains. Each domain is contained in a (n−2)-dimensional “ball” of radius C3 εβ A2 for an appropriately fixed order one C3 . Proof. ¯ 2 (R/r) ∩ S(R)) (Ω \ Rε2 Rε2 n−1 x ∈ Rn−1 : (x · l) > ≡ S(R) ∪ x ∈ R : (x · l) < − , r|l|τ r|l|τ n−1 l∈Z |l|≤A2
now the intersection of sets such that each connected component is convex has the same property. Suppose, by contradiction, that there are points x1 , x2 ∈ Ω2 (R/r) ∩ S(R) such that the arc x_ 1 x2 is all in Ω2 (R/r) ∩ S(R) and has length greater than √ 2R−1 nεβ A2 . Let hx1 , x2 i be the plane generated by the vectors x1 , x2 , and on it consider the sector S of unit vectors orthogonal to x_ 1 x2 , this sector has angle √ ϑ = 2 nA2 . The product space of hx1 , x2 i⊥ with the sector S is a multi-cylinder in which there cannot be entire vectors l ∈ Zn−1 with |l| ≤ A−1 2 . Now we consider the intersection of the multi-cylinder with the sphere |x| = √ √ A−1 2 n, on hx1 , x2 i it is an arc of length greater than 2 n so that a ball of 2 −√ √ radius n is contained in the multi-cylinder. Now in each ball of radius n there is at least one entire vector. Namely let x be the center of the ball then [x] (entire part of each component) is entire and |x − [x]|∞ ≤ 1. ¯ 2 (R/r) ∩ S(R) contained in S3 . Let N be the number of connected domains of Ω Each domain contains an (n − 2)-dimensional “ball” of radius ρ = ε2 A1+τ , so that 2 −(n−2)(τ +1) β(n−2)−2n+5 N ≤ A2 ε .
¯ 3 (R/r) ∩ S3 , by Remark 3.10 we have that Let us now consider the Cantor set Ω C ¯ 3 (R/r) ∩ S3 ) ∩ S3 has measure of order ε(n−3)β+2 . This implies that Ω ¯ 3 (R/r) ∩ (Ω ¯ ¯ ¯ S3 ∩ Ω2 (R/r) is not empty and the measure of (Ω3 (R/r) ∩ S3 ∩ Ω2 (R/r))C ∩ S3 is of order ε(n−3)β+2 . Lemma 3.13. There exists a connected domain D of Ω2 (R/r) ∩ S3 such that (n−2)(τ +1)+1
¯ 3 (R/r)) ≥ A meas(D ∩ Ω 2
.
Proof. Suppose the assertion to be false, then calling Di , i = 1, . . . , N the connected domains: N X ¯ 2 (R/r) ∩ S3 ∩ Ω ¯ 3 (R/r)) = ¯ 3 ) ≤ A(n−2)(τ +1)+1 N meas S3 ∼ meas(Ω meas(Di ∩ Ω 2 i=1
which is absurd.
¯ 2 (1), Then we can use Lemma 3.11(ii) and consider the truncated cone T (D) ⊂ Ω ¯ 3 (1) has measure of order A(1+τ )(n−2)+1 εβ ; namely by Lemma 3.13 P = T (D) ∩ Ω 2 ¯ 3 (R/r) the Cantor set P contains all radial segments having an endpoint in D ∩ Ω and the other on S(r). Consider an (n − 1)-dimensional ball of radius ρ ∼ εβ A2 centered on a point x ∈ D and which contains D (such ball exists by Lemma 3.13). Given h = [ 2(R−r) 3ρR ], consider the points xi = ti x with ti = 1 − 3/2iρ h ≥ i ∈ N0 and let us cover T (D) with a finite number of balls Bi of radius ρ and centered on points xi . Setting ρ = 2C3 εβ A2 we have that Bi ∩ Bj is empty if |i − j| > 1 and each Bi ∩ Bi+1 contains a truncated cone Tai ,bi (D) with bi − ai ≥ ρ/4. We consider the sets Pi = Tai ,bi (D) ∩ Ω3 (1), by Lemma 3.13 each Pi has measure of order (1+τ )(n−2)+2 εβ A2 . ¯ 4 whose complementary set in M ∩ B(R, r) Now we consider the Cantor set Ω (n−2)β+1 has measure of order ε A1 . Its intersection with Pi has measure of order (1+τ )(n−2)+2 (τ +1)(n−2)+3 εβ A2 , provided that A1 < A2 . Consider a list ωi ∈ Pi ∩ ¯ Ω4 ; for each i we have that ωi , ωai+1 ∈ Bi+1 so the list respects condition 3.9(i) moreover ¯ − 2Cεβ A2 and max yn ≤ r R1 + 2Cεβ A2 min yn ≥ R y∈B0 y∈Bh R for some order one C so the list respects condition 3.9(ii). In the Appendix A.2 we have proved, generalizing similar results of [9], that there exists a symplectic transformation, well defined in a region W of the phase ˜ ψ), which sends Hamiltonian (1.1) in the local normal form: space (I, √ √ 1 (J, AJ) + εG1 (P Q, ε) + µg1 (φS , J, P, Q) + αf1 (φ, J, P, Q) (3.11) 2 −1
where α = Oε (e−Cε 2 ) for any order one C. W is of order one in the actions both in the fast direction J1 and in the degenerate one Jn , namely there exists
points $w_1, w_2 \in W$ such that $|\Pi_{J_n}(w_1 - w_2)| = O_\varepsilon(1)$. We can then prove a KAM theorem for the Hamiltonian (3.11) for $\mu < \varepsilon^4$ with the frequencies $\omega$ in $\bar\Omega$ by choosing $(A_1)^2 \gg \alpha$.

Roughly speaking, KAM theorems are proved by performing an infinite sequence of symplectic transformations defined in a set of nested domains whose intersection is not trivial. Each approximation step reduces the order of the perturbation quadratically and is well defined provided an appropriate smallness condition is verified. Such a condition is of the type $\mu\gamma^{-2} \ll 1$, where $\mu$ is the small parameter and $\gamma$ is the Diophantine constant of the frequency $\omega$ of the preserved torus. To apply this scheme to Hamiltonian (3.11), we first perform a finite number of approximation steps on the slow variables with $J_1$ as a parameter; the small denominators involved are $|\omega_S \cdot l|$, on which we have the stronger Diophantine condition, so that the approximation scheme works provided that $\mu\varepsilon^{-4} \ll 1$. Eventually we will reduce the $\mu$ perturbation to order $\alpha$ and then continue with the classical KAM scheme on all the variables; now the smallness condition is $\alpha A_1^{-2} \ll 1$. This completes the proof of Proposition 3.5.

4. Tree Representation

4.1. Definitions of trees

We briefly review the tree representation of the homoclinic trajectory. The definitions contained in this Subsection are all adapted from [10].

Definition 4.1. A graph G consists of two sets V(G) (vertices), E(G) (edges) such that E(G) is a subset of the unordered pairs of distinct elements of V(G). We will always consider finite graphs, i.e. graphs such that N(G) = |V(G)| is finite. Two vertices $i, j \in V(G)$ are said to be adjacent if $(i, j) \in E(G)$. It is customary to write $n \in G$ in place of $n \in V(G)$ and $(i, j) \in G$ in place of $(i, j) \in E(G)$. Two graphs $G_1$, $G_2$ are equal if and only if they have the same vertex set and the same edge set.

Definition 4.2. A path joining the vertices $i, j \in G$ is a subset $P_{ij}$ of E(G) of the form
\[
P_{ij} := \{(i, v_1), (v_1, v_2), \dots, (v_k, j)\} .
\]
A graph G is connected and without loops if for all $i, j \in G$, there exists one and only one path that connects them. Such graphs are called trees. Their vertices are called nodes and their edges are called branches. A tree T such that the set V(T) = {1, 2, . . . , N(T)} is called a numbered tree.

Definition 4.3. A labeled tree is a tree A plus a label $L_A(v) \ge 0$ which is generally a set of functions $f_A^i(v)$ defined on the nodes. When possible we will omit the subscript A in the functions $f^i$.
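Definitions 4.1 and 4.2 translate directly into a small data structure. The sketch below is only an illustration with hypothetical names: a graph is a vertex collection plus a collection of unordered pairs, and a finite non-empty graph is a tree exactly when it is connected and has one edge fewer than vertices (an equivalent characterization of "one and only one path between any two vertices", used here as a shortcut).

```python
def is_tree(vertices, edges):
    """True if the finite graph (V, E) is connected and without loops."""
    vertices = set(vertices)
    edge_set = {frozenset(e) for e in edges}
    # Edges must be unordered pairs of distinct elements of V (Definition 4.1).
    for e in edge_set:
        if len(e) != 2 or not e <= vertices:
            raise ValueError("edges must be pairs of distinct vertices of V")
    if not vertices:
        return False  # convention chosen here for the empty graph
    if len(edge_set) != len(vertices) - 1:
        return False
    adjacency = {v: set() for v in vertices}
    for a, b in (tuple(e) for e in edge_set):
        adjacency[a].add(b)
        adjacency[b].add(a)
    # Depth-first search from an arbitrary vertex to test connectedness.
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adjacency[v] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == len(vertices)
```

A numbered tree in the sense of Definition 4.2 is then simply a tree whose vertex set is `{1, ..., N}`.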
June 19, 2003 16:13 WSPC/148-RMP
360
00165
M. Procesi
Fig. 2.
Definition 4.4. Two labeled trees X, Y are isomorphic if there is a bijection, say h, from V(X) to V(Y) such that for all a ∈ V(X), $L_X(a) \equiv L_Y(h(a))$, and moreover (a, b) ∈ E(X) if and only if (h(a), h(b)) ∈ E(Y). We say that h is an isomorphism from X to Y. Notice that since h is a bijection, $h^{-1}$ is well defined and is an isomorphism from Y to X. We will call symmetries or automorphisms of X the isomorphisms from X to X.

It is often convenient and more compact to represent a tree by a diagram, with points for the nodes and lines for the branches, as in Fig. 2. In these diagrams the positions of the points and lines do not matter: the only information a diagram conveys is which pairs of nodes are joined by a branch. This means that the two diagrams in Fig. 2 are equal by definition. Strictly speaking these diagrams do not define graphs, since the set V is not specified. However, if the diagram has N points, we may assign distinct natural numbers 1, 2, ..., N to the points (which we still call nodes), so obtaining a labeled numbered tree. Then it is easily seen that the two trees in Fig. 2 are isomorphic.

Definition 4.5. We will call diagrams the equivalence classes of labeled trees via the relation A ≅ B if and only if A and B are isomorphic. An obvious consequence of this definition is that $L_A(v)$ and N(A) are well defined on the equivalence classes. We can choose a representative A′ of the equivalence class A by giving a numbering 1, 2, ..., N(A) to the nodes of A.

Remark 4.6. Given an equivalence class of labeled trees A and a numbering A′, the group of automorphisms of A′ can be identified with a subgroup of the group $S_{N(A)}$ of permutations on N(A) elements; we denote such subgroup by S(A′). S(A′) is the subgroup of the permutations $\sigma \in S_{N(A)}$ which fix both E(A′) and the labels L(A′). Namely,^q σ ∈ S(A) → σE = E and L(n) = L(σ(n)) for all n ≤ N(A). Given two isomorphic trees A′ and A″, representatives of A, let h be a bijection such that E(A′) = hE(A″). The groups S(A′) and S(A″) = $h^{-1}S(A')h$ are isomorphic. We will improperly call the equivalence classes via this relation the symmetry group S(A) of the diagram A.

^q With standard abuse of notation we denote by σE(A′) the function such that σ(a, b) = (σa, σb) for all (a, b) in E(A′).
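The symmetry group of Remark 4.6 can be computed by brute force for small numbered trees. The sketch below is only an illustration under stated assumptions: it enumerates all permutations of the nodes (so it is practical only for small N), the function name `automorphisms` is hypothetical, and the labels are taken to be arbitrary comparable Python values rather than the sets of functions used in the text.

```python
from itertools import permutations


def automorphisms(nodes, edges, labels):
    """Return the permutations sigma with sigma E = E and L(n) = L(sigma(n))."""
    nodes = sorted(nodes)
    edge_set = {frozenset(e) for e in edges}
    group = []
    for image in permutations(nodes):
        sigma = dict(zip(nodes, image))
        # The labels must be preserved node by node.
        if any(labels[n] != labels[sigma[n]] for n in nodes):
            continue
        # The edge set must be mapped onto itself.
        if {frozenset((sigma[a], sigma[b])) for a, b in edges} == edge_set:
            group.append(sigma)
    return group


# Hypothetical example: a star with centre 1 and three end-nodes.
star_edges = [(1, 2), (1, 3), (1, 4)]
same_labels = {1: "centre", 2: "leaf", 3: "leaf", 4: "leaf"}
one_marked = {1: "centre", 2: "leaf", 3: "leaf", 4: "marked"}

full_group = automorphisms([1, 2, 3, 4], star_edges, same_labels)   # 6 elements
reduced_group = automorphisms([1, 2, 3, 4], star_edges, one_marked)  # 2 elements
```

Changing one label shrinks the group from the six permutations of the end-nodes to the two that fix the distinguished node, which is the mechanism used later when markings are added to trees.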
Exponentially Small Splitting and Arnold Diffusion
Using standard notation (see for instance [11]) we denote by $a := (i_1, i_2, \dots, i_m)$, with $\mathbb{N} \ni i_j \le N(A)$, the permutation such that $a(i_h) = i_{h+1}$, $a(i_m) = i_1$, and a(n) = n for all $\mathbb{N} \ni n \le N(A)$ such that $n \notin \{i_1, i_2, \dots, i_m\}$. Moreover we denote by ab the composition of a and b. As an example, consider the numbered tree A in Fig. 3 (N(A) = 6); its symmetries are the identity and a := (1, 4); b := (2, 3); c ≡ a ∘ b; d ≡ (5, 6)(1, 2)(4, 3); e := (5, 6)(1, 3)(2, 4); f := (5, 6)(1, 2, 4, 3); g := f ∘ a. Clearly any other numbering on A would give an isomorphic symmetry group.

Definition 4.7. Given a tree A and a node v ∈ A, we define its orbit: [v] := {w ∈ A : w = g(v) for some g ∈ S(A)}, i.e. the list of nodes obtained by applying the whole group S(A) to v; notice that this is an equivalence relation (a proof of this statement is in [10]). In the example of Fig. 3 there are two orbits, which in the chosen numbering are: [1] ≡ {1, 2, 3, 4} and [5] ≡ {5, 6}.

Remark 4.8. The orbits are well defined on the equivalence classes of labeled trees; it should be clear, for instance, that the nodes marked in black in the diagram of Fig. 4 are an orbit.

Fig. 3.

Fig. 4.

Definition 4.9. A rooted labeled tree is a labeled tree A plus one of its nodes called the first node ($v_A$ or $v_0$); this gives a partial ordering to the tree, namely we say that i > j if $P_{v_0 j} \subset P_{v_0 i}$. Moreover choosing a first node induces a natural ordering on the couples of nodes representing the branches, namely (a, b) ∈ E(A) implies that a < b. We recall some definitions on rooted trees:
(a) the level of v, l(v), is the cardinality of $P_{v_0 v}$;
(b) the nodes subsequent to v, s(v), are the nodes adjacent to v and of higher level; the node preceding v is the only node adjacent to v and of lower level;
(c) given a node v of A, we call $A^{\ge v}$ the rooted tree (with first node v) of the nodes w ≥ v; we call A\v the remaining part of the tree A.

An isomorphism between rooted trees $(A, v_A)$, $(B, v_B)$ is an isomorphism between A and B which sends $v_A$ to $v_B$. The symmetries of a rooted labeled tree $(A, v_A)$, which we denote again by $S(A, v_A)$, are the subgroup of the symmetries of the corresponding unrooted tree that fix the first node $v_A$. As done for trees, we can represent the equivalence classes of rooted trees with diagrams, representing by convention the first node on the left and all the nodes of the same level aligned vertically (it should be obvious that the definitions v > w, A\v and $A^{\ge v}$ are well posed on the equivalence classes).

4.2. Admissible trees

Definition 4.10. We consider rooted labeled trees such that some nodes are distinguished by having a different set of labels.^r An admissible tree is a symbol: $A, \{v_A\}, \{v_1, \dots, v_m\}, \{w_1, \dots, w_h\}$ such that A is a tree, all the $v_i$, $w_j$ and $v_A$ are nodes of A, the $v_i$ are all end-nodes, $\{v_i\}_{i=1}^m \cap \{w_j\}_{j=1}^h = \emptyset$
and the $v_i$ are all different. We call $\{v_i\}_{i=1}^m \equiv F(A)$ the fruits of A, $\{w_j\}_{j=1}^h \equiv M(A)$ the marked nodes^s of A and the set $\overset{0}{A}$: {v ∉ F(A)} the free nodes of A. Finally s0(v) are the free nodes in s(v). The labels are distributed in the following way:
(a) For each node $v \ne v_A$, one angle label $j_v \in \{0, \dots, n\}$ (remember that we are considering a system with n + 1 degrees of freedom).
(b) For each node v, one order label $\delta_v = 0, 1$ if $v \in \overset{0}{A}$ and $\delta_v \in \mathbb{N}$ otherwise.
(c) For each node v ∈ M(A), one angle-marking J = 0, ..., n and one function-marking h(t) ∈ H.
(d) For each node v ∈ F(A), one type label i = 0, 1.

^r The dynamical meaning of the labels will be clear when we define the “value” of a tree.
^s A node v can appear many times in M(A); we will say it carries more than one marking.
Fig. 5. Examples of trees in $\mathcal{A}^5$ and in $(\overset{0}{\mathcal{T}}{}^0_0)^5$ (see Definition 4.13).
We set a grammar on the so defined labeled rooted trees, namely: $\delta_v = 0 \rightarrow \{j_v = J_v = 0,\ |s(v)| \ge 2,\ j_{v'} = 0\ \forall\, v' \in s(v)\}$. To draw the diagrams without writing down the labels we give a color to each j = 1, ..., n (which forces δ = 1) and two different colors for the couples of labels j = 0, δ = 1 and j = 0, δ = 0. In all the pictures we will set n = 1 and choose the colors gray, black and white, see Fig. 5. The fruits F(A) will be represented as “bigger” end-nodes colored with the color corresponding to their angle label and with their order and type written on a side. The marked nodes will be distinguished by a box of the color corresponding to their angle-marking and with their function-marking written on a side. If the function-marking is h(t) = 1 we will omit it. By convention the first node is set on the left, and the nodes of the same level are aligned vertically.

Definition 4.11.
(1) We will call fruitless trees the (labeled rooted) trees A such that F(A) is empty. We will say that a fruit v stems from w if v ∈ s(w).
(2) We will call $\mathcal{T}$ the set of equivalence classes (as in Definition 4.5) of admissible trees, $\overset{0}{\mathcal{T}}$ the subset of $\mathcal{T}$ of trees with at least a free node and $\mathcal{A}$ the subset of $\overset{0}{\mathcal{T}}$ of “fruitless” trees. Finally we will call $\overset{m}{\mathcal{A}}$ the subset of $\mathcal{A}$ of fruitless trees with no marking.
(3) We will call $F_j^{ik}$ the “tree” composed of one fruit of order k, angle j and type i; clearly
$$\mathcal{T} \equiv \overset{0}{\mathcal{T}} \cup \bigcup_{\substack{i=0,1 \\ j=0,\dots,n \\ k>0}} F_j^{ik}.$$
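The grammar stated at the beginning of this subsection is a local condition on each node, so it can be checked mechanically. The sketch below is an illustration only: the dictionary-based representation (`children` for s(v), `delta` for the order labels, `angle` for the labels $j_v$, `marking` for the angle-markings $J_v$) is a hypothetical encoding, and a missing angle label or marking is assumed to be compatible with the rule, as is the case for the first node, which carries no label j.

```python
def satisfies_grammar(children, delta, angle, marking=None):
    """Check: delta_v = 0 implies j_v = J_v = 0, |s(v)| >= 2, j_w = 0 for w in s(v)."""
    marking = marking or {}
    for v, d in delta.items():
        if d != 0:
            continue  # the rule only constrains nodes of order zero
        if angle.get(v, 0) != 0 or marking.get(v, 0) != 0:
            return False
        successors = children.get(v, [])
        if len(successors) < 2:
            return False
        if any(angle.get(w, 0) != 0 for w in successors):
            return False
    return True


# Hypothetical rooted tree: first node 0 (order 0) followed by nodes 1 and 2.
children = {0: [1, 2], 1: [], 2: []}
delta = {0: 0, 1: 1, 2: 1}

ok = satisfies_grammar(children, delta, angle={1: 0, 2: 0})
bad_angle = satisfies_grammar(children, delta, angle={1: 1, 2: 0})
bad_degree = satisfies_grammar({0: [1], 1: []}, {0: 0, 1: 1}, angle={1: 0})
```

Only the first call returns True; the other two violate the condition on the angle labels of the successors and the condition |s(v)| ≥ 2 respectively.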
Notational Convention 1. Using standard notation we represent the equivalence classes by [A] where A is an admissible tree. Moreover given a tree A we will write $A \in \mathcal{T}$ if it is a representative of an equivalence class in $\mathcal{T}$.

Definition 4.12. The order of a tree $A \in \mathcal{T}$ is
$$o(A) = \sum_{v \in A} \delta_v .$$
The order of a node v of A is $o(v) = o(A^{\ge v})$.
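Definition 4.12 amounts to a post-order sum over the rooted tree. A minimal sketch follows, assuming a hypothetical encoding in which `children[v]` lists the nodes s(v) and `delta[v]` is the order label; the function name is not from the source.

```python
def node_orders(children, delta, root):
    """Return {v: o(v)} where o(v) is the sum of delta over the subtree A^{>=v}."""
    order = {}
    stack = [(root, False)]
    while stack:
        v, expanded = stack.pop()
        if expanded:
            # All nodes following v have been processed already.
            order[v] = delta[v] + sum(order[w] for w in children.get(v, []))
        else:
            stack.append((v, True))
            for w in children.get(v, []):
                stack.append((w, False))
    return order


# Hypothetical tree: first node 0, s(0) = {1, 2}, s(2) = {3}.
children = {0: [1, 2], 2: [3]}
delta = {0: 0, 1: 1, 2: 1, 3: 2}
orders = node_orders(children, delta, root=0)
# o(A) is the order of the first node.
tree_order = orders[0]
```

In this example o(3) = 2, o(2) = 3, o(1) = 1 and o(A) = o(0) = 4.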
Given a tree $A \in \overset{0}{\mathcal{T}}$ and one of its nodes v we call $A^{\ge v}$ the tree composed of the nodes greater or equal to v; if $A^{\ge v}$ is not a fruit then it is not admissible as it carries a label j in the first node. In such case, we conventionally set $A^{\ge v} \in \mathcal{T}$ by setting a mark $J(v) = j_v$, h(v, t) = 1 on v and subsequently “forgetting” the label $j_v$. It is easily seen that o(A) > 0 for all $A \in \mathcal{T}$ and that
$$\mathcal{T}^k \equiv \{A \in \mathcal{T} : o(A) = k\}$$
is a finite set; clearly the same is true in $\overset{0}{\mathcal{T}}$ and in $\mathcal{A}$.

Notational Convention 2. In all our sets an apex k means we consider the subset of trees of order k.

We list here the subsets of $\overset{0}{\mathcal{T}}$ and $\mathcal{A}$ that we will need in the following sections.

Definition 4.13.
(a) $\mathcal{A}_j^a$ ($\overset{0}{\mathcal{T}}{}_j^a$) with j = 0, ..., n, a = 0, 1, is the subset of $\mathcal{A}$ ($\overset{0}{\mathcal{T}}$) such that $M(A) \equiv \{v_A\}$ and $J(v_A) = j$, $h(v_A, t) = x_j^a(t)$.
(b) $\mathcal{A}_{ij}^{ab}$ ($\overset{0}{\mathcal{T}}{}_{ij}^{ab}$), with i, j = 0, ..., n, a, b = 0, 1, is the subset of $\mathcal{A}$ ($\overset{0}{\mathcal{T}}$) such that $M(A) \equiv \{v_A, v\}$ for some v ∈ A; moreover $J(v_A) = i$, $h(v_A, t) = x_i^a$, $J(v) = j$, $h(v, t) = x_j^b$.
Given a set S one can consider a vector space on ℚ generated by formal linear combinations of the elements of the set; we represent it by V(S).

Definition 4.14. V(S) is the vector space of linear combinations of elements of S with rational coefficients:
$$[A] \in S \rightarrow [A] \in V(S), \qquad [A], [B] \in V(S) \rightarrow q_1[A] + q_2[B] \in V(S), \quad \forall\, q_1, q_2 \in \mathbb{Q}.$$

We construct V(S) for the sets in Definition 4.13; we obtain infinite dimensional vector spaces that can be expressed as direct sums of finite dimensional spaces generated by the sets $S^k$ (we call these spaces $V^k(S)$).^t

Definition 4.15. In particular, we will be interested in the following vectors:
$$f^k = \frac{1}{k} \sum_{\substack{A \in (\overset{m}{\mathcal{A}})^k \\ \delta_{v_A} = 1}} \frac{A}{|S(A)|}, \qquad \Lambda_i^{ak} = \sum_{A \in (\overset{0}{\mathcal{T}}{}_i^a)^k} \frac{A}{|S(A)|},$$
$$f_i^{ak} = \sum_{A \in (\mathcal{A}_i^a)^k} \frac{A}{|S(A)|}, \qquad f_{ij}^{abk} = \sum_{A \in \mathcal{A}_{ij}^{abk}} \frac{A}{|S(A)|},$$

^t We are using the fact that the sets are disjoint unions of the corresponding “fixed order” sets $S^k$.
where the sum $A \in S^k$ means choosing one representative A for each equivalence class (diagram) of the set $S^k$. Clearly the vectors are determined only up to isomorphisms. The same vectors without the apex k will represent the formal series:^u $V = \sum_{k=1}^{\infty} V^k$.

4.3. Values of trees

We link the vectors defined in Definition 4.15 to the dynamics by defining an appropriate tree “value” V(A) where $A \in \mathcal{T}$. This definition can be extended to diagrams provided that V(A) = V(B) if A and B are isomorphic; moreover we can uniquely extend V to a linear function on $V(\mathcal{T})$. The presentation is very schematic as these definitions can be found in [3] and following papers; let us only write the $F_j^k$ explicitly (using well known formulas on the derivatives of composite functions), with $e_j$ the vectors of the canonical basis:
$$F_j^k = -\sum_{\delta=0,1}\ \sum_{\vec{m} \in \mathbb{N}_0^n} \left(\nabla^{\vec{m}+e_j} f^{\delta}(t)\right) \sum_{\{p_j^h\}_{\vec{m},k-\delta}}\ \prod_{j=0}^{n} \prod_{h=1}^{k-1} \frac{1}{p_j^h!} (\psi_j^h)^{p_j^h},$$
where $\{p_j^h\}_{\vec{m},k}$ is a list of numbers in $\mathbb{N}_0 \equiv \mathbb{N} \cup \{0\}$ which respect the relations $\sum_h p_j^h = m_j$, $\sum_{j,h} h\, p_j^h = k$; finally we define
$$\nabla^{\vec{m}} f(t) = \left[\prod_{j=0}^{n} \partial_{\psi_j}^{m_j} f(\psi)\right]_{\substack{\psi_i = \varphi_i + \omega_i t \\ \psi_0 = q_0(t)}}.$$

So we define
$$V_\varphi(A) = \prod_{v > v_0} \left(\Im_+^{\tau_w} + \Im_-^{\tau_w}\right) \Psi_\varphi(A),$$
where
$$\Psi_\varphi(A) = \prod_{\substack{v \in \overset{0}{A} \\ v > v_0}} w_{j_v}(\tau_w, \tau_v) \prod_{v \in \overset{0}{A}} \left[-\frac{1}{2} a_{j_v} \mu^{\delta_v}\, \nabla^{\sum_{j=0}^{n} m_v(j) e_j} f^{\delta_v} \prod_{\beta \in M(v)} h_\beta(v, \tau_v) \prod_{\alpha \in F(v)} x_{j_\alpha}^{[i_\alpha]}\right] \times \prod_{\alpha \in F(A)} G_{j(\alpha)}^{o(\alpha), i(\alpha)}.$$
F(v) are the fruits stemming from v, M(v) is the list of markings of the node v, w is the node preceding v and finally $m_v(j)$ is the number of elements in {v, s0(v), F(v), M(v)} having angle label (or angle marking) equal to j. We write s0(v), F(v) instead of s(v) to remark that the fruits are not considered proper nodes. Notice that $\Psi_\varphi(A)$ contains the kernels of the integral operators $Q_j$, so that V is obtained by “integrating” on the times $\tau_v$, $v > v_0$; clearly the integrations must be performed in the correct order, first the end-nodes, and so on. The following proposition is standard (it is proved in [3] for numbered trees instead of equivalence classes); we sketch the proof in the Appendix.

^u Remember that the apex k is NOT an exponent.

Proposition 4.16. The value of the splitting vectors $G_j^{ik}(\varphi)$ is
$$G_j^{ik}(\varphi) = \Im V_\varphi(\Lambda_j^{ik}).$$
The value of the homoclinic trajectory $\psi_j^k$ is
$$(\mu)^k \psi_j^k(t, \varphi) = \left(\Im_+^t + \Im_-^t\right) w_j(t, \tau_0) V_\varphi(\Lambda_j^{ik}) + \sum_{a=0,1} x_j^{[a]} G_j^{ak}.$$

Definition 4.17 (Equivalent trees). We are mainly interested in the splitting vectors and splitting matrix, so we will consider two trees to be equal if they have the same value in the computation of the $G_j^a$: $A \cong B$ iff $\Im V_\varphi(A) = \Im V_\varphi(B)$ for all $\varphi \in \mathbb{T}^n$; such identity can hold only for some initial data $\bar\varphi$, in such case we write $(A \cong B)_{\bar\varphi}$.

4.4. Tree identities

4.4.1. Mark adding functions

We can define linear functions^v on $V(\mathcal{T})$; for instance we can add markings to a tree. Given $A \in \overset{0}{\mathcal{T}}$ the symbol $h(v, t)\partial_l^v A$ represents the application of an angle-marking J(v) = l and a function-marking h(v, t) in the node v; formally
$$\left(A, \{v_A\}, \{v_i\}_{i=1}^m, \{w_j\}_{j=1}^h\right) \rightarrow \left(A, \{v_A\}, \{v_i\}_{i=1}^m, \{w_j\}_{j=1}^h \cup \{v\}\right);$$
notice that given two nodes v, w in the same orbit [v], $\partial_l^v A$ is isomorphic to $\partial_l^w A$. We can define the linear function:
$$M_j(h(t))[A] := \sum_{v \in \overset{0}{A}} h(v, t)\, \partial_j^v A. \qquad (4.1)$$
Particularly interesting mark adding functions are $M_j^b \equiv M_j(x_j^b(t))$.

Lemma 4.18. The vector $f_{ij}^{ab}$ is obtained from $f_i^a$ by the mark adding function:
$$M_j^b[f_i^a] = f_{ij}^{ab}.$$

^v We always define functions F on trees. Then one should verify that F(A) and F(B) are isomorphic if A, B are so. This implies that one can uniquely extend the functions on the vector spaces by linearity.
Proof. We need to show that
$$\sum_{A \in \mathcal{A}_i^a} \sum_{v \in \overset{0}{A}} \frac{1}{|S(A)|}\, \partial_j^v A = \sum_{A \in \mathcal{A}_i^a} \sum_{[v] \in \overset{0}{A}} \frac{m[v]}{|S(A)|}\, \partial_j^{[v]} A = \sum_{B \in \mathcal{A}_{ij}^{ab}} \frac{B}{|S(B)|};$$
in the second equality m[v] is the cardinality of the orbit of v and the sum over [v] means we choose one representative from each equivalence class; similarly the symbol $\partial_j^{[v]}$ is the application of the angle marking j to one of the nodes of the orbit [v]. We are simply grouping the isomorphic trees $\partial_j^w A$ with w ∈ [v] and choosing a representative of the equivalence class. Given each tree $B \in \mathcal{A}_{ij}^{ab}$ there is one and only one couple $A \in \mathcal{A}_i^a$, [v] ∈ A such that $\partial_j^{[v]} A = B$ (there is a common representative). The symmetry group of B fixes both the marked nodes, so |S(A)| = m[v]|S(B)| by the Lagrange theorem.^w
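The counting identity |S(A)| = m[v]|S(B)| used at the end of this proof can be checked numerically on the six-node example of Fig. 3. The sketch below is an illustration under an explicit assumption: the branch list is not given in the text and is inferred here from the listed symmetries (nodes 1 and 4 attached to node 5, nodes 2 and 3 attached to node 6, and a branch joining 5 and 6). All labels are taken equal, marking a node v is modelled by keeping only the symmetries that fix v, and the function names are hypothetical.

```python
from itertools import permutations

# Assumed branches of the tree of Fig. 3, inferred from its listed symmetries.
NODES = [1, 2, 3, 4, 5, 6]
EDGES = [(5, 6), (5, 1), (5, 4), (6, 2), (6, 3)]


def symmetry_group(nodes, edges):
    """Brute-force list of the permutations of the nodes that fix the edge set."""
    edge_set = {frozenset(e) for e in edges}
    group = []
    for image in permutations(nodes):
        sigma = dict(zip(nodes, image))
        if {frozenset((sigma[a], sigma[b])) for a, b in edges} == edge_set:
            group.append(sigma)
    return group


def orbit(group, v):
    """The orbit [v]: all images of v under the group."""
    return {g[v] for g in group}


def stabilizer(group, v):
    """Symmetries that fix v, i.e. the symmetries of the tree with v marked."""
    return [g for g in group if g[v] == v]


group = symmetry_group(NODES, EDGES)
orbit_sizes = {v: len(orbit(group, v)) for v in NODES}
stabilizer_sizes = {v: len(stabilizer(group, v)) for v in NODES}
```

With this branch list the group has 8 elements, the orbits are {1, 2, 3, 4} and {5, 6} as stated after Definition 4.7, and for every node the product of orbit size and stabilizer size equals 8.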
Lemma 4.19. The function $M_j^0$ with j = 1, ..., n is a function on the values of trees. Given a fruitless tree $A \in \mathcal{A}$ the mark-adding function $M_j^0$ with j = 1, ..., n acts as the derivative on the angle $\varphi_j$:
$$\partial_{\varphi_j} \Im V_\varphi(A) = \Im V_\varphi(M_j^0[A]).$$

Proof. Adding an angle marking j to the node v is equivalent to adding $e_j$ to $\vec{m}_v$ in $\nabla^{\vec{m}} f^{\delta}$, so we add a derivative in $\psi_j$ to the function $f^{\delta_v}(\psi)$ which is to be evaluated in $\psi_j = \varphi_j + \omega_j \tau_v$, $\psi_0 = q_0(t)$. If j ≠ 0 this is equivalent to applying a $\varphi_j$ derivative to the node v. As the dependence on φ comes only from the functions $f^1$ we have proved our assertion.

4.4.2. Fruit adding functions

Remark 4.20. Notice that by our definition of equivalent trees adding a fruit of order k, type i and angle j in the free node v of a tree $A \in \overset{0}{\mathcal{T}}$ is equivalent to adding a mark $x_j^{[i]}(t)\partial_j^v$ to the node v and multiplying by the φ dependent function $G_j^{ik}$.

As we have seen in Eq. (3.4) the only contributions to $\Delta_{ij}^{0k}$ come from the parts of $G_i^{0k}$ which are at most linear in the $G_l^{mh}$ with l = 0, ..., n; h < k, m = 0, 1. In tree representation we can say that the only contribution comes from trees with one fruit. So to find the matrices $N^a$ and $n^a$ (a = 0, 1) we have to understand how to pass from fruitless trees to trees with one fruit. First of all let us notice that the fruitless contribution to $\Lambda_j^0$ is clearly $f_j^0$, so that
$$D_{ij}^0 = \Im V_{\varphi=0}\, f_{ij}^{00}.$$

^w We refer to the Lagrange theorem which states that the order of a group G acting on a set V is the order of the orbit of a point v ∈ V times the order of the subgroup of G which fixes v.
Now we can add a fruit $F_j^{ik}$ to the node v of a tree $A \in \overset{0}{\mathcal{T}}$ by adding a node y labeled (i, k, j) to the list F(A) and setting y ∈ s(v); given a tree A we apply this function to each node $v \in \overset{0}{A}$ and then sum over the nodes v. By Remark 4.20 this is equivalent to applying the function $G_j^{ik}(\varphi) M_j^{[i]}$, where [i] = |i − 1|, to A.

Proof of Proposition 3.2(i). If j ≠ 0 we can obtain each tree with one fruit by adding the fruit to a node of a fruitless tree as described above; so that by Lemma 4.18,
$$N_{ij}^{ak} = \Im V_{\varphi=0}(f_{ij}^{0a})$$
and consequently $N^0 = D^0$.

If j = 0 we have trees with one fruit attached to nodes with $\delta_v = 0$, so that detaching the fruit we do not obtain an acceptable tree (the node has only one successive free node). We construct such trees from fruitless ones by using a different function: given a tree $A \in \mathcal{A}$ and a node v ∈ A, $v \ne v_A$ and $j_v = 0$, we attach the node y of the tree in Fig. 6 to v and w (by convention the node preceding v). Formally we set
$$l_i(A, v) = E(A) \setminus (w, v) \cup (w, y) \cup (y, v);$$
then $G_0^{[i]h}\, l_i(A, v)$ is a tree with one fruit, stemming from y ($\delta_y = 0$), and y has only one successive free node. We apply $l_i(A, v)$ to the nodes of A, and set $l_i(A, v) = 0$ if $v = v_A$ or if $j_v \ne 0$:
$$L_i(A) = \sum_{v \in A} l_i(A, v);$$
notice that this is NOT well defined as a function $\mathcal{A} \rightarrow \mathcal{A}$. However $G_0^{ih}\, l^{[i]}(A)$ is well defined as a function $\mathcal{A} \rightarrow \overset{0}{\mathcal{T}}$ and so we can define the “value” of $L_i(A)$; in the next subsection we will prove that $L_i(A)$ is equivalent to an acceptable tree.

Fig. 6.

Fig. 7. The fruit adding functions.
Lemma 4.21. Calling $\overset{0}{\mathcal{T}}{}^{1F}$ the set of trees with one fruit and
$$\Lambda_j^{0(1F)} = \sum_{A \in \overset{0}{\mathcal{T}}{}_j^{1F}} \frac{A}{|S(A)|},$$
we have that
$$\Lambda_j^{0(1F)} = \sum_{l=0,1} \sum_{i=0}^{n} G_i^l \left(M_i^{[l]}(f_j^0) + \delta_{i0} L^{[l]}(f_j^0)\right),$$
and consequently
$$n_j^0 = \Im V_{\varphi=0}\left(f_{j0}^{00} + L^0(f_j^0)\right).$$

Proof. Consider a tree B with one fruit, of angle i, order k and type l, attached to a node v′. If such node has more than one successor or $\delta_{v'} \ne 0$, then it can be obtained by applying $G_i^{[l]h} x_i^l \partial_i^v$ to a tree $A \in \mathcal{A}_j^0$. If the node has $\delta_v = 0$ and only one successor then there exists one and only one couple A, v with $A \in \mathcal{A}_j^0$, v node of A, such that $G_0^{[l]h}\, l_l(A, [v]) = B$ (as usual the symbol [v] means choosing one representative for the equivalence class). The symmetry group of B fixes both the first node and the fruit (and so consequently all the path joining the fruit to the first node), so if we divide by $G_i^{lk}$ we obtain a tree with two marked nodes which again fixes the first node and the node v′ where the fruit was attached; if v′ has only one successive free node, say v, then that is fixed as well. This proves the proposition as, given $A \in \mathcal{A}_j^0$,
$$L^a(A) = \sum_{[v] \in A} m[v]\, l(A, [v]) \quad \text{and moreover} \quad |S(A)| = m[v]\,|S(l(A, [v]))|.$$
Finally $n_j^0$ is the linear term in $G_0^1$ in the expansion of $G_j^0$, so it is given by trees with one fruit of angle j = 0 and type l = 0.

Remark 4.22. As $f^{\delta}(t) = F(\psi_i(0) + \tilde\omega_i t, \psi_0(t))$ and $\dot\psi_0(t) = -\frac{2}{c} x_0^0(t)$, we have that:
$$\partial_{\tau_v} \nabla^{\vec{m}} f^{\delta}(\tau_v) = \sum_{j=1,\dots,n} \omega_j \nabla^{\vec{m}+e_j} f^{\delta}(\tau_v) - \frac{2}{c}\, x_0^0\, \nabla^{\vec{m}+e_0} f^{\delta}(\tau_v).$$
For notational convenience we define a symbol^x $\partial_t^y A$, where A is a fruitless tree and y is one of its nodes, by setting
$$\Psi_\varphi(\partial_t^y A) = \prod_{\substack{v \in A \\ v > v_0}} w_{j_v}(\tau_w, \tau_v) \prod_{v \in A} \left[-\frac{1}{2} a_{j_v} \mu^{\delta_v} \prod_{\beta \in M(v)} h_\beta(v, \tau_v)\right] \times \prod_{\substack{v \in A \\ v \ne y}} \nabla^{\sum_{j=0}^{n} m_v(j) e_j} f^{\delta_v}\ \partial_{\tau_y}\left(\nabla^{\sum_{j=0}^{n} m_y(j) e_j} f^{\delta_y}\right).$$
This definition implies that^y
$$\sum_{v \in A} \partial_t^v A \cong \sum_{j=1,\dots,n} \omega_j M_j^0(A) - \frac{2}{c} M_0^0(A).$$

^x We could define $\partial_t^v(A)$ to be a special marked tree.
Lemma 4.23. Given an odd function $G \in H_0$ the following relation holds:
$$\partial_t Q_j^t(G) = Q_j^t\left[\partial_\tau G(\tau) + \delta_{j0} \frac{2}{c}\, x_0^0(\tau)\, \partial_0^3 f^0(\tau)\, Q_0^\tau(G)\right].$$
The proof of this Lemma (proposed in [4]) is straightforward but quite long; we report it in the Appendix.

Lemma 4.24. Given a tree $A \in \mathcal{A}_i^0$, i = 1, ..., n, then
$$V_{\varphi=0}\left(\sum_{v \in A} \partial_t^v A - l_0(A, v)\right) = \partial_t V_{\varphi=0}(A).$$

Proof. We drop the φ = 0 in V for notational convenience. The assertion is trivially true for trees with only one node, so we prove it by induction on the order of the trees. Let us define $\mathcal{A}_j^h$ as the set of fruitless trees of order h with only one marking, placed on the first node, $J_{v_0} = j$ and $h(v_0, t) = 1$; for j ≠ 0, $\mathcal{A}_j^h \cong \mathcal{A}_j^{0h}$. Suppose Lemma 4.24 holds for all trees in $\mathcal{A}_j^h$, h < k, for j = 0, ..., n; then for^z $A \in \mathcal{A}_i^{0k}$,
$$\partial_{\tau_0} V(A) = -\frac{1}{2}\left(\partial_{\tau_0} \nabla^{\vec{m}(v_0)} f^{\delta_{v_0}}\right) \prod_{v \in s(v_0)} Q_{j_v}[V(A^{\ge v})] + \sum_{v \in s(v_0)} V(A/v)\, \partial_{\tau_0}[Q_{j_v} V(A^{\ge v})].$$
Now we set $V(A^{\ge v}) = F$ (which is odd when φ = 0) and apply Lemma 4.23 to $F \in H_0$:
$$\partial_{\tau_0} Q_{j_v}(F) = Q_{j_v}(\partial_{\tau_v} F) + \delta_{j0} \frac{2}{c}\, Q_0\left(x_0^0(\tau_y)\, \partial_0^3 f^0(\tau_y)\, Q_0(F)\right);$$
clearly $\partial_{\tau_v} F = \partial_{\tau_v}[V(A^{\ge v})]$ and
$$\delta_{j0}\, Q_0\left(x_0^0(\tau_y)\, \partial_0^3 f^0(\tau_y)\, Q_0(F)\right) = -V(l_0(A, v)).$$
So we obtain
$$\partial_{\tau_0} V(A) = V(\partial_t^{v_0} A) - \frac{2}{c} \sum_{v \in s(v_0)} V(l_0(A, v)) + \sum_{v \in s(v_0)} V(A/v)\,[Q_{j_v} \partial_{\tau_v} V(A^{\ge v})];$$
by definition $A^{\ge v} \in \mathcal{A}_j^h$ for some j, h. So we consider trees of lower order, for which the statement is true by the inductive hypothesis.

^y Remember that A ≅ B means that $\Im V(A) = \Im V(B)$.
^z We recall that $V(A) = V(A/v)\, Q_{j_v} V(A^{\ge v})$.

Proof of Proposition 3.2(ii). By Lemma 4.21 we must show that
$$n_i^0 = \Im V_{\varphi=0}\left(f_{i0}^{00} + L^0(f_i^0)\right) = \Im V_{\varphi=0}\left(\frac{c}{2} \sum_{j=1}^{n} f_{ij}^{00}\, \omega_j\right). \qquad (4.2)$$
Now for j ≠ 0, $\Im \partial_t V_{\varphi=0}(f_j^0) = 0$ as the integrand has no constant component. So we can use Lemma 4.24 and Remark 4.22 to obtain Eq. (4.2).

4.4.3. Changing the first node

Another way of manipulating trees is to change the first node (which is distinguishable as it does not have the label j). Generally one can obtain various trees in $\overset{0}{\mathcal{T}}$ by simply changing the uncolored node (for example one can shift the angle labels down along a path joining any node v to the uncolored one $v_A$). However not all the trees obtained in such a way are in $\mathcal{T}$.

Definition 4.25. Given a tree $A \in \overset{0}{\mathcal{T}}$, let $v_A$ be the first node and v a free node; the change of first node $P(A, v): \overset{0}{\mathcal{T}} \rightarrow \overset{0}{\mathcal{T}}$ is so defined: let $v_A = v_0, v_1, \dots, v_m = v$ be the nodes of the path $P_{v_A, v}$. P(A, v) is obtained from $\left(A, \{v_A\}, \{v_i\}_{i=1}^m, \{w_j\}_{j=1}^h\right)$ by shifting only the j labels of the nodes of $P_{v_A, v}$ in the direction of $v_A$. This automatically implies that v is left j-uncolored and is the first node of P(A, v). If we obtain a tree not in $\mathcal{T}$ we set P(A, v) = 0. $P: V(\mathcal{T}) \rightarrow V(\mathcal{T})$ is the linear function such that for all $A \in \mathcal{T}$, $P(A) = \sum_{v \in \overset{0}{A}} P(A, v)$.

Lemma 4.26. P(A, v) = 0 if and only if $\delta_{v_A} = 0$, $|s(v_A)| = 2$. This means that the possibility of applying the change of first node does not depend on the chosen $v \ne v_A$.

Proof. Consider the trees A and P(A, v) and the nodes $v_A = v_0, v_1, \dots, v_m = v$ of the path $P_{v_A, v}$. For each i = 0, ..., m − 1, $v_i$ precedes $v_{i+1}$ in A and follows it in P(A, v). So for each node $w \ne v_A, v$, the number of following nodes s(w) is the same in A and P(A, v); $s(v_A)$ decreases by one and s(v) consequently increases by one. This implies that all trees A with $\delta_{v_A} = 0$ and $|s(v_A)| = 2$ have P(A, v) = 0 for all v. Moreover if $v_i$ has δ = 0, then it has j = 0, as well as all the nodes (including $v_{i+1}$) following it. This means that in P(A, v) it will still have δ = j = 0 and the same $|s(v_i)| \ge 2$; moreover $v_{i-1}$, that follows $v_i$ in P(A, v), has j = 0.

We will call $\overset{r}{\mathcal{T}}$ the trees whose first node can be changed.
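The counting step in the proof of Lemma 4.26 (the number of following nodes changes only at the old and at the new first node) can be verified on a small example. The sketch below is an illustration only: it recomputes the sets s(w) after moving the first node and ignores all labels, so it does not implement the shift of the j labels in Definition 4.25. The tree and the function name are hypothetical.

```python
def following_nodes(nodes, edges, first_node):
    """Return {w: s(w)} for the tree rooted at first_node."""
    adjacency = {v: set() for v in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    s = {v: [] for v in nodes}
    seen = {first_node}
    stack = [first_node]
    while stack:
        u = stack.pop()
        for w in sorted(adjacency[u]):
            if w not in seen:
                seen.add(w)
                s[u].append(w)  # w is adjacent to u and of higher level
                stack.append(w)
    return s


# Hypothetical tree with first node 0; the first node is then moved to node 3.
nodes = [0, 1, 2, 3, 4, 5]
edges = [(0, 1), (0, 2), (1, 3), (1, 4), (3, 5)]
before = following_nodes(nodes, edges, first_node=0)
after = following_nodes(nodes, edges, first_node=3)
change = {v: len(after[v]) - len(before[v]) for v in nodes}
```

Here |s(0)| drops from 2 to 1, |s(3)| rises from 1 to 2, and every other node keeps the same number of following nodes. Since the old first node ends with one following node, the grammar condition |s(v)| ≥ 2 would fail if it had δ = 0, which is the case excluded by the lemma.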
Fig. 8. An example of trees that are equivalent by changing the first node.
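Lemma 4.27. By Proposition 2.6(a), we have:
$$\forall\, A \in \overset{r}{\mathcal{T}},\ \forall\, v \in A: \quad P(A, v) - A \in \ker \Im V_\varphi, \qquad (4.3)$$
$$\forall\, A \in \overset{r}{\mathcal{T}}{}^{(j,f)(i,h)}: \quad P_1(A) - A \in \ker \Im V_\varphi.$$

Proof. Notice that given a tree A and one of its nodes v, if $w \in P(v_A, v)$ then P(A, v) = P(P(A, w), v), so we only need to prove the assertion for $v \in s(v_A)$. Given $A \in \overset{r}{\mathcal{T}}$ and $v \in s(v_A)$ such that $j_v = j$, we compare $\Im V(A)$ and $\Im V(B)$ with B = P(A, v), so B has first node v (no label $j_v$) and a node $v_A$ in s(v) with $j_{v_A} = j$.
$$\Im V(A) = -\frac{1}{2} a_{j_{v_A}} (\mu)^{\delta_{v_A}}\, \Im\, \nabla^{\sum_j m_{v_A}(j) e_j} f^{\delta_{v_A}} \prod_{\substack{w \in s(v_A) \\ w \ne v}} Q_{j_w}[V(A^{\ge w})] \times Q_j\left[(-\mu)^{\delta_v} \nabla^{\sum_j m_v(j) e_j} f^{\delta_v} \prod_{w_1 \in s(v)} V(A^{\ge w_1})\right],$$
which by the symmetry of $Q_j$ is equal to
$$\Im\, \nabla^{\sum_j m_v(j) e_j} (-\mu)^{\delta_v} f^{\delta_v} \prod_{w_1 \in s(v)} V(A^{\ge w_1})\ Q_j\left[(-\mu)^{\delta_{v_A}} \nabla^{\sum_j m_{v_A}(j) e_j} f^{\delta_{v_A}} \prod_{\substack{w \in s(v_A) \\ w \ne v}} Q_{j_w}[V(A^{\ge w})]\right].$$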
This is the value of B; namely, both in A and in B, $m_v(i)$ with i ≠ j is the number of elements in (s(v), M(v), F(v)) having label i and $m_v(j) - 1$ is the number of elements in (s(v), M(v), F(v)) having label j.

Lemma 4.28. For each i = 1, ..., n, we have $f_i^0 \cong M_i^0(f)$.

Proof. The proof of this statement is in [7]; we report it here for completeness. By Lemma 4.27 we have that for A in $\mathcal{A}_j^k$,
$$A \cong \frac{1}{k} \sum_{[v]:\, \delta_v = 1} m[v]\, P(A, v) \quad \text{so} \quad \sum_{A \in \mathcal{A}_j^k} \frac{A}{|S(A)|} = \frac{1}{k} \sum_{A \in \mathcal{A}_j^k}\ \sum_{[v]:\, \delta_v = 1} \frac{m[v]}{|S(A)|}\, P(A, v),$$
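now there exists one and only one couple $B \in \overset{m}{\mathcal{A}}$, $[v_A] \in B$ such that $\delta_{v_B} = 1$ and $\partial_j^{v_A} B = P(A, v_B)$. Finally by the Lagrange theorem, $(m[v_B])^{-1}|S(A)| = (m[v_A])^{-1}|S(B)|$. This completes the proof of Proposition 3.2.

4.5. Upper bounds on the values of trees

Given a fruitless tree $A \in \overset{m}{\mathcal{A}}$ of order k (so with at most 2k − 1 nodes), its value through $\Im V_\varphi^1$ is of the form:
$$\left(-\frac{1}{2}\right)^{N(A)} \prod_{v > v_0} a_{j_v}\ \Im \prod_{v > v_0} \left(\Im_+^{\tau_w} + \Im_-^{\tau_w}\right) (\mu)^{\delta_{v_0}} \nabla^{\sum_j m_{v_0}(j) e_j} f^{\delta_{v_0}} \times \prod_{v > v_0} (\mu)^{\delta_v} \nabla^{\sum_j m_v(j) e_j} f^{\delta_v}\, w(\tau_w, \tau_v). \qquad \text{(a)}$$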
We expand $f^1$ in Fourier series in the rotator angles,
$$f^1(\psi, q) = \sum_{|\nu|=1} e^{i\nu\cdot\psi} f_\nu(q),$$
so that each node has one more label $\nu_v \in \mathbb{Z}^n$. We will represent as A(ν) a tree A with labels $\nu_v$ such that
$$\sum_{v \in A} \nu_v = \nu.$$
In each node v with δ = 1 we have as factor the function $d^{n_v} f_{\nu_v}(q(t))$ where $n_v = m_v(0)$. The functions $f_\nu(q)$ and q(t) are such that $f_\nu(q(t)) = F_\nu(e^t) \in H_0(a, d)$. Naturally by our analyticity assumptions $f_\nu(q(t))$ is bounded for |t| → ∞ in |Im t| < 2Π. We are considering rational functions $F_\nu(e^t)$; let us call $t_{i\nu}$ their (finite number of) poles in |Im t| ≤ Π (all with Im t ≠ 0), then
$$d = \min_{\nu, i} |\mathrm{Im}(t_{i\nu})|; \qquad a = \max_{\nu, i} |\mathrm{Re}(t_{i\nu})|. \qquad (4.4)$$
Moreover the following proposition holds.

Lemma 4.29. The functions $\partial_0^k f_\nu(q(t)) = F_\nu^k(e^t)$ respect the bound:
$$\max_{t \in C(a+2,\, d-\sqrt{\varepsilon})} |F_\nu^k(e^t)| \le C\, k!\, \varepsilon^{\frac{p+k}{2}}. \qquad (4.5)$$

Proof. We can use Cauchy estimates on $\partial_0^k f_\nu(q)$ provided that the images in the q variables of $C(a+2, d-\sqrt{\varepsilon})$ and of $C(a+1, d-\frac{1}{2}\sqrt{\varepsilon})$ via the function $q_0(t)^{-1}$ have distance of the order of $c\sqrt{\varepsilon}$ for some order one c. This can be verified by direct computation or proved using simple geometric arguments.
Having fixed $\nu = \sum_v \nu_v$, in integral (a) we shift the integration to $\mathbb{R} + i\sigma(\omega_\nu)d'$ where d′ < d (we will then fix $d' = d - \sqrt{\varepsilon}$ to obtain optimal estimates and d′ = c ≤ d/2 to obtain simply exponentially small estimates), $\omega_\nu = \omega \cdot \nu$ and σ(x) is the sign of x. As the functions are all analytic in |Im(t)| ≤ d′ the integral (a) is unchanged. Notice that in integral (a) we cannot choose the sign of the shift in the single node integrals, and so we need to work in the (symmetric) domains C(a, d′) to guarantee the indifference of extending in the lower or upper half-plane. To simplify the notation we set
σ(ων ) = + and define E(d0 , ν) = e−|ων |d . If A has k nodes with δ = 1, let {νv }kν be the lists of k vectors νv ∈ Zn such that m P νv = ν. The value of A(ν) (tree A ∈ A with total frequency ν) in integral (a) is: N (A) I Z ∞ X Y dRv0 1 mv (s) (iν ) − dτv0 e−σ(τv0 )Rv0 eiν·ϕ E(d0 , ν) vs 2iπRv 2 −∞ 0 k {νv }ν
s=1,...,n δv =1 ,v≥v0
× [dnv0 fνδvv0 (q(τv0 + id0 ))]eiωv τv0
Y I
v>v0
0
dRv 2iπRv
× e−σ(τv )Rv (τv +id ) wjv (τw + id0 , τv + id0 )
Y
Z
τw
dτv + −∞
Z
τw
dτv ∞
[dnv fνδv (q(τv + id0 ))]eiωv τv ;
(a)
v≥v0
naturally $f_\nu^0 = 0$ for all non-zero ν. As usual w is the node preceding v, $m_v(s)$ is the number of nodes in the list v, s(v) with label j = s, n(v) the number of those with label j = 0 and $\omega_v = \omega_{\nu_v}$. The residues in R are introduced by using Definition 2.8. The factors $(i\nu_{vs})^{m_v(s)}$ come only from nodes with $\delta_v = 1$ and their product is bounded by 1. Now we want estimates on the integrals that depend only on the order k; we start by splitting the sums in monomials.

(1) Split $w_j(\tau_w + id, \tau_v + id)$ into 6 terms if j = 0 or 2 terms if j ≠ 0: so we obtain $6^{3k-1}$ terms. Each of these terms is of the form
$$\tau_v^{h}\, x_v^{-l}\, y(x_v)\, \tau_w^{h'}\, x_w^{-l'}\, y'(x_w),$$
where $x_v = e^{-|\tau_v|}$, 0 ≤ h, h′, l′, l ≤ 1 and both y(x), y′(x) are analytic in |x| ≤ 1 (we will call this the limited x dependent part of the Wronskian).

(2) Separate $\int_{-\infty}^{\tau_w} d\tau_v + \int_{\infty}^{\tau_w} d\tau_v$, and $\Im\, d\tau_{v_0}$ in integral (a). We get other $2^k$ terms like
$$\prod_{v \ge v_0} \left(\oint \frac{dR_v}{2i\pi R_v} \int_{\rho_v \infty}^{\tau_w} d\tau_v\, e^{-\sigma(\tau_v) R_v (\tau_v + id)}\, e^{i\omega_v \tau_v}\, (\tau_v)^{h_v}\, x_v^{l_v} \prod_{j=1}^{|s(v)|+2} y_j^v(x_v)\right),$$
where $0 \le l_v, h_v \le |s(v)| + 1$. Notice that $\rho_v$ is not the sign of $\tau_v$ but an extra label. The functions $y_j^v$ are chosen in the following way:
(i) One of the yvj is either coming from ∂0nv f 0 , (i.e. it is in the list cos(mq(τv +id)), sin(mq(τv + id)) with m = 1, 2) or is one of the Fνkv . (ii) One is the limited xv dependent part of a term from the Wronskian at the node v. (iii) For each node v 0 following v there is one function yvj which is the xv dependent part of a term coming from the Wronskian w(τv , τv0 ). Notice that the functions y are by definition in H(a, d) and respect condition 4.29. Rτ 0 R0 R0 (3) Given a node v ∈ s(v0 ) split the integral ρvv∞ dτv as ρv ∞ dτv − ρv ∞ dτv + 0 R τv0 2k+1 terms). We consider ρv0 ∞ dτv and proceed recursively for all nodes (other 3 Rτ first the contributions from the term with ρvw ∞ dτv for all nodes (the others will 0 be expressed as products of the same kind of integrals). Set ρv0 = −1, we want to estimate: ! Z τw |s(v)|+2 Y Y I dRv Rv (τv +id) iωv τv hv −lv v dτv e e (τv ) xv yj (τv ) . I− (A) = 2iπRv −∞ j=1 v≥v0
(4.6) R −a0 R 0 Finally we split the first integral −∞ = −∞ + −a0 , where a0 > 0 is suitably Q P large (a0 = a + 2 log 2). We set yjv (τv ) = r=0 yjv,r xr and C{rv } = v yjvrv . The integral is X Y ∂ nv Y Z τw a0 Im = Res C{rv } dτv (eRv (τv +id)+Ev τv eiωv τv xrvv ) (4.7) hv ∂E −∞ v v v {r } R0
v
with τw0 = −a0 . Starting from the end-nodes we now perform the integrals in dτv then the derivatives in Ev and finally the residues in Rv , we do this first for all the end-nodes and then proceed to the inner nodes hierarchically. Lemma 4.30. Integral (4.7) produces the bounds " !# Y |s(v)|+2 Y X v,h −m 2τ +2 k a0 h C1 |yj ||x0 | Im ≤ ε (m!) ; v
j=1
h
−a0
x0 = e , m is the number of nodes (≤ 2k−1), |s(v)| the number of nodes following v and C1 is some order one constant. Finally τ is the Diophantine exponent of √ωε , 1
|ω · n| > ε 2 γ|n|−τ
for some γ = Oε (1) .
If we choose a0 > a the series are all convergent (by the analyticity of the y j ’s in x0 ). We choose x0 = |x| ≤ e−a−2 :
e−a 8
and estimate the coefficients of the Taylor series in the ball ∞ X k=0
|yjv,k |xk0 ≤ 4 max (yjv (x)) . |x|≤2x0
Proof of Lemma 4.30. This is taken from [3]. Z t xK e(iA+B)t xK eiAτ eBτ = , The integral K + B + iA −∞ so the Ev derivatives in the end-node v give 2hv terms of the form: hv1 !
v xrwv eidRv e(iωv +Rv )τw (τw )h2 rv + Rv + iωv
hv1 + hv2 = hv .
(4.8)
The residue of Rv−1 times (4.8) is (4.8) if |rv | + |ωv | 6= 0 and v v hv2 ! (τw )h1 (τw + id)h2 +1 v (h2 + 1)!
if |rv | + |ωv | = 0 .
Developing the binomial we obtain other 2hv +1 terms, all of the type ˜
Ghv +1 m!x ¯ rwv eiωv τw (τw )hv . The constant G is the maximum between one (rv 6= 0), (min|ν|≤N |ω · ν|)−1 or ( Π2 ) (we use that d < Π2 ). After integrating all the end-nodes following a node w we can P integrate in dτw a sum of 22 v∈s(w) hv +1 terms of the type ¯
ˆ
¯ r˜w eiΩv τw (τw )h Gh h!x w P P P ¯+ˆ where r˜v = v∈s(w) rv , Ωv = v∈s(w) ωv and h h ≤ v∈s(w) hv + 1. We have proved that the integrals derivatives and residues correspond to calculating the integrands in (4.7) at the limiting point a0 , ignoring the oscillating factors eiΩa0 , substituting the Taylor coefficients with their moduli and multiplying by a factor bounded by 26k−3 (k!)4
max (|ω · ν|)−2τ (2k−1) ≤ C k (k!)4τ +4 .
0 0 such that limK→∞ εK = 0 and Σλ ≤ Σλ,K + εK .
(4.33)
Proof. By (4.27) and a general theorem [16, p. 168, Theorem X.18], one can show that (1 − aK )A(λ) − [µ − E0 (A(λ))]aK ≤ AK (λ) ≤ (1 + aK )A(λ) + [µ − E0 (A(λ))]aK
(4.34)
with |E0 (A(λ))| βK (λ) aK := λ2 αK (λ) 1 + + µ µ
June 19, 2003 16:22 WSPC/148-RMP
406
00168
A. Arai & H. Kawano
where µ > 0 is arbitrary. For all sufficiently large K, we have 1 − aK > 0. We fix such a K. Then it follows from the first inequality in (4.34) and the min–max principle [17, p. 76, Theorem XIII.1] that (1 − aK )Σλ ≤ Σλ,K + [µ − E0 (A(λ))]aK , which implies that Σλ ≤ Σλ,K + εK with εK := aK [Σλ + µ − E0 (A(λ))]. We have limK→∞ εK = 0. Let Σλ,K,V := inf σess (AK,V (λ)) . Lemma 4.9. There exists a constant ηK,V > 0 such that limV →∞ ηK,V = 0 and Σλ,K ≤ Σλ,K,V + ηK,V .
(4.35)
Proof. Similar to the proof of Lemma 4.8.
Lemma 4.10.
$$\lim_{K\to\infty} E_0(A_K(\lambda)) = E_0(A(\lambda)), \tag{4.36}$$
$$\lim_{V\to\infty} E_0(A_{K,V}(\lambda)) = E_0(A_K(\lambda)). \tag{4.37}$$
Proof. By (4.34) and the variational principle,
$$(1-a_K)E_0(A(\lambda)) - \mu a_K \le E_0(A_K(\lambda)) \le (1+a_K)E_0(A(\lambda)) + \mu a_K,$$
which implies (4.36). Similarly one can prove (4.37).
Lemma 4.11. Suppose that the same hypothesis as in Theorem 2.2 and Hypothesis VIII hold. Then, for all sufficiently large $K$ and $V$, $H_{K,V}(\lambda)$ has purely discrete spectrum in $[E_0(H_{K,V}(\lambda)), E_0(H_{K,V}(\lambda)) + m)$.
Proof. By Lemma 4.1, we need only to show that $\tilde H_{K,V}(\lambda)$ has purely discrete spectrum in $[E_0(H_{K,V}(\lambda)), E_0(H_{K,V}(\lambda)) + m)$ [note that $E_0(h_{K,V}) = E_0(\tilde H_{K,V}(\lambda)) = E_0(H_{K,V}(\lambda))$]. By Lemma 4.7, it is sufficient to show that the reduced part $h_{K,V} := \tilde H_{K,V}(\lambda)|F_V$ has such a property. By Lemma 4.2, we have
$$|\langle\Psi, \delta A_{1,K,V}(\lambda)\Psi\rangle| \le \varepsilon\langle\Psi, I\otimes H_{b,V}\Psi\rangle + b_\varepsilon\|\Psi\|^2, \qquad \Psi\in D(I\otimes H_{b,V}), \tag{4.38}$$
where $\varepsilon>0$ is arbitrary and
$$b_\varepsilon := \frac{\lambda^2 c_{3/2,K,V}(g)^2}{4\varepsilon} + \frac12|\lambda|c_{1,K,V}(g). \tag{4.39}$$
June 19, 2003 16:22 WSPC/148-RMP
00168
Enhanced Binding in a General Class of Quantum Field Models
407
By condition (2.25) and Lemmas 4.8–4.10, we have Σλ,K,V − E0 (AK,V (λ)) > m + 2bε
(4.40)
if $\varepsilon<1$ is sufficiently close to 1 and $K$ and $V$ are sufficiently large. Note that the spectrum of $h_0 := H_{b,V}|F_b(W_V)$ is purely discrete with $\mathrm{Ran}(E_{h_0}([0,s]))$ being finite dimensional for all $s>0$. Hence we can apply Theorem B.1 with Remark B.1 in Appendix B to conclude that $h_{K,V}$ has purely discrete spectrum in $[E_0(h_{K,V}), E_0(h_{K,V})+m)$.
Proof of Theorem 2.2. Let
$$H_K(\lambda) := A\otimes I + I\otimes H_b + \lambda\sum_{j=1}^J B_j\otimes\phi(g_{j,K}).$$
Then, in the same way as in [4, Lemma 3.5], one can show that $H_{K,V}(\lambda)$ converges to $H_K(\lambda)$ in the norm resolvent sense as $V\to\infty$. Hence, by Lemma 4.11 and an application of [4, Lemma 3.12], we conclude that $H_K(\lambda)$ has purely discrete spectrum in $[E_0(H_K(\lambda)), E_0(H_K(\lambda))+m)$. In the same way as in [4, Lemma 3.11], one can show that $H_K(\lambda)$ converges to $H(\lambda)$ in the norm resolvent sense as $K\to\infty$. Hence, by the preceding result and [4, Lemma 3.12] again, we see that $H(\lambda)$ has purely discrete spectrum in $[E_0(H(\lambda)), E_0(H(\lambda))+m)$.
Finally we consider the case where each $g_j$ is not necessarily continuous. In this case we can take a sequence of continuous functions $\{g_j^{(n)}\}_{n=1}^\infty$ such that $g_j^{(n)}\in W$ and $\|g_j^{(n)} - g_j\|\to0$ ($n\to\infty$). Let $H_n$ be the operator $H(\lambda)$ with $g_j$ replaced by $g_j^{(n)}$ ($j=1,\dots,J$). Then one can show that $H_n$ converges to $H(\lambda)$ in the norm resolvent sense as $n\to\infty$. By the result of the last paragraph, $H_n$ has purely discrete spectrum in $[E_0(H_n), E_0(H_n)+m)$. Hence, by [4, Lemma 3.12] once again, we see that $H(\lambda)$ has purely discrete spectrum in $[E_0(H(\lambda)), E_0(H(\lambda))+m)$.
5. Existence of a Ground State in the Massless Case
This section is devoted to the proof of Theorem 2.3. Throughout the section, all the hypotheses of Theorem 2.3 are assumed. For each constant $M>0$, we define $\omega_M:\mathbb{R}^d\to[M,\infty)$ by $\omega_M(k) := \omega(k)+M$, so that $\inf_{k\in\mathbb{R}^d}\omega_M(k) = M > 0$. We set
$$g_j^{(M)} := \frac{\omega_M}{\omega}\,g_j, \qquad k\in\mathbb{R}^d,$$
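which is in $W$. We introduce a “regularized” version of the Hamiltonian $H(\lambda)$:
$$H_M := A\otimes I + I\otimes H_{b,M} + \lambda\sum_{j=1}^J B_j\otimes\phi(g_j^{(M)}),$$
where $H_{b,M} := d\Gamma(\omega_M)$. Let
$$A_M := A - \lambda^2 R_B^{(M)}$$
with
$$R_B^{(M)} := \frac12\sum_{j,l=1}^J\Bigl\langle\frac{g_j^{(M)}}{\sqrt{\omega_M}},\frac{g_l^{(M)}}{\sqrt{\omega_M}}\Bigr\rangle B_jB_l,$$
and
$$\tilde H_M := A_M\otimes I + I\otimes H_{b,M} + \delta A_1(\lambda).$$
Then, by Lemma 3.7,
$$U(\lambda)H_MU(\lambda)^{-1} = \tilde H_M.$$
By applying the Lebesgue dominated convergence theorem, one can show that
$$\lim_{M\to0}\Bigl\|\frac{g_j^{(M)}}{\omega_M^{s}} - \frac{g_j}{\omega^{s}}\Bigr\| = 0 \tag{5.1}$$
for all $s\ge0$ such that $g_j/\omega^{s+1}\in W$. We write
$$A_M = A(\lambda) + W_M$$
with
$$W_M := \lambda^2\bigl(R_B - R_B^{(M)}\bigr).$$
We put
$$c_M := \lambda^2\sum_{j,l=1}^J\Bigl|\Bigl\langle\frac{g_j}{\sqrt{\omega}},\frac{g_l}{\sqrt{\omega}}\Bigr\rangle - \Bigl\langle\frac{g_j^{(M)}}{\sqrt{\omega_M}},\frac{g_l^{(M)}}{\sqrt{\omega_M}}\Bigr\rangle\Bigr|.$$
Then we can show that
$$\|W_Mu\| \le c_M\bigl(a\|A(\lambda)u\| + b\|u\|\bigr), \qquad u\in D(A_0),$$
where $a$ and $b$ are constants independent of $M$ (cf. the proof of Lemma 4.3). In the same way as in Lemma 4.10, one can show that
$$\lim_{M\to0}E_0(A_M) = E_0(A(\lambda)). \tag{5.2}$$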
By this fact, (5.1) and (2.26), we can take $M>0$ (sufficiently small) satisfying
$$\Sigma_\lambda - E_0(A_M) > M + \frac12\lambda^2c_{3/2}^{(M)}(g)^2 + |\lambda|c_1^{(M)}(g), \tag{5.3}$$
where $c_s^{(M)}(g)$ is the $c_s(g)$ with $\omega$ and $g_j$ replaced by $\omega_M$ and $g_j^{(M)}$ respectively. It follows from Theorem 2.2 that $H_M$ has a ground state and so does $\tilde H_M$. We denote a normalized ground state of $\tilde H_M$ by $\Psi_M$.
Lemma 5.1. For all $f\in W$ with $\omega f\in W$, $I\otimes a(f)\Psi_M\in D(\tilde H_M)$ and
$$\bigl(\tilde H_M - E_0(\tilde H_M)\bigr)I\otimes a(f)\Psi_M = -\Bigl\{a(\omega_Mf) - \frac{\lambda}{\sqrt2}\sum_{j=1}^J\Bigl\langle f,\frac{g_j}{\omega}\Bigr\rangle U(\lambda)[B_j,A_1]U(\lambda)^{-1}\Bigr\}\Psi_M.$$
Proof. Similar to the proof of [4, Lemma 4.1] except that, in the present case, one uses an easily proven formula
$$U(\lambda)\,I\otimes a(f)\,U(\lambda)^{-1} = I\otimes a(f) - \frac{\lambda}{\sqrt2}\sum_{j=1}^J\Bigl\langle f,\frac{g_j}{\omega}\Bigr\rangle B_j\otimes I$$
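on $D(A_0\otimes I)\cap D(I\otimes H_{b,M})$.
We set
$$N_b := d\Gamma(I),$$
the number operator on $F_b(W)$. Then we have for all $f\in W$ and $\psi\in D(N_b^{1/2})$,
$$\|a(f)\psi\| \le \|f\|\,\|N_b^{1/2}\psi\|, \qquad \|a(f)^*\psi\| \le \|f\|\,\|N_b^{1/2}\psi\| + \|f\|\,\|\psi\|,$$
which implies that
$$\|\phi(f)\psi\| \le \sqrt2\,\|f\|\,\|N_b^{1/2}\psi\| + \frac{1}{\sqrt2}\|f\|\,\|\psi\|. \tag{5.4}$$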
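The bound (5.4) can be checked numerically in the simplest possible setting. The following is only a sketch under assumptions that are not in the text: a single boson mode (so $f$ is just a complex number), a Fock vector stored as a finite list of coefficients `(c_0, c_1, ...)` whose last entry is zero so that no truncation error occurs, and hypothetical helper names (`annihilate`, `create`, `check_bound`).

```python
import math
import random


def annihilate(f, psi):
    """a(f) on a single-mode Fock vector psi = (c_0, c_1, ...); antilinear in f."""
    n = len(psi)
    fbar = complex(f).conjugate()
    return [fbar * math.sqrt(k + 1) * psi[k + 1] if k + 1 < n else 0j
            for k in range(n)]


def create(f, psi):
    """a(f)^* on the same representation; exact only if the last entry of psi is 0."""
    return [complex(f) * math.sqrt(k) * psi[k - 1] if k >= 1 else 0j
            for k in range(len(psi))]


def norm(v):
    return math.sqrt(sum(abs(c) ** 2 for c in v))


def sqrt_number(psi):
    """N_b^{1/2} psi: multiply the k-particle component by sqrt(k)."""
    return [math.sqrt(k) * c for k, c in enumerate(psi)]


def phi(f, psi):
    """Segal field operator phi(f) = (a(f) + a(f)^*) / sqrt(2)."""
    return [(x + y) / math.sqrt(2)
            for x, y in zip(annihilate(f, psi), create(f, psi))]


def check_bound(f, psi):
    """Return (left side, right side) of (5.4) for the given f and psi."""
    lhs = norm(phi(f, psi))
    rhs = (math.sqrt(2) * abs(f) * norm(sqrt_number(psi))
           + abs(f) * norm(psi) / math.sqrt(2))
    return lhs, rhs


if __name__ == "__main__":
    random.seed(1)
    psi = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(5)] + [0j]
    print(check_bound(0.7 + 0.2j, psi))
```

In this toy model $\|a(f)\psi\| = |f|\,\|N_b^{1/2}\psi\|$ holds with equality, and the extra term $\|f\|\|\psi\|$ comes entirely from $a(f)^*$.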
Now we can apply Theorem B.2 in Appendix B with $(K,B,C) = (A(\lambda), H_b, \delta A_1(\lambda))$ so that $H = \tilde H(\lambda)$. By (2.25), Hypothesis (B.1) holds. Hypothesis (B.2) is satisfied with $m = M$, $K_m = A_M$, $B_m = H_{b,M}$, $\psi_0 = \Omega$ (the Fock vacuum) and $D = F_0(W)\cap D(H_b)$. We denote by $P_\Omega$ the orthogonal projection onto the one-dimensional subspace $\{\alpha\Omega\,|\,\alpha\in\mathbb{C}\}$ and set $P_\Omega^\perp := I - P_\Omega$. By (3.23) and the fact that $H_{b,M}\Omega = 0$ and $H_b\Omega = 0$, we have for all $u\in D(A_0)$ with $\|u\| = 1$,
$$\langle u\otimes\Omega, \tilde H_M\,u\otimes\Omega\rangle \le \langle u, A_Mu\rangle + \frac{|\lambda|}{2}c_1(g),$$
which implies that $E_0(\tilde H_M) \le E_0(A_M) + |\lambda|c_1(g)/2$. By (5.2), for every $\eta>0$, there exists a constant $M_0>0$ such that $|E_0(A_M) - E_0(A(\lambda))| < \eta$ for all $M<M_0$. By this fact and (2.26), we have
$$E_0(\tilde H_M) \le \eta + E_0(A(\lambda)) + \frac{|\lambda|}{2}c_1(g) < \Sigma_\lambda$$
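for all $M<M_0$, where we take $\eta$ sufficiently small. Thus, if we show that
$$\frac{1}{(\Sigma_\lambda - E_0(\tilde H_M))^2}\|\delta A_1(\lambda)\Psi_M\|^2 + \|I\otimes P_\Omega^\perp\Psi_M\|^2 < \delta \tag{5.5}$$
for some $\delta<1$, then $\tilde H(\lambda)$ has a ground state and so does $H(\lambda)$. Let us prove (5.5). Using estimate (5.4) and (3.24), we have
$$\|\delta A_1(\lambda)\Psi_M\| \le |\lambda|c_1(g)\|N_b^{1/2}\Psi_M\| + \frac{|\lambda|c_1(g)}{2}.$$
It is well known or easy to see that $N_b\ge P_\Omega^\perp$. Hence
$$\|I\otimes P_\Omega^\perp\Psi_M\|^2 \le \|N_b^{1/2}\Psi_M\|^2.$$
Therefore, if
$$\Bigl(\frac{2|\lambda|^2c_1(g)^2}{[\Sigma_\lambda - E_0(\tilde H_M)]^2} + 1\Bigr)\|N_b^{1/2}\Psi_M\|^2 + \frac{|\lambda|^2c_1(g)^2}{[\Sigma_\lambda - E_0(\tilde H_M)]^2} < \delta, \tag{5.6}$$
then (5.5) follows.
To estimate $\|N_b^{1/2}\Psi_M\|$, we follow the method given in the proof of [4, Lemma 4.3]. Indeed, by Lemma 5.1, we can show that
$$\|N_b^{1/2}\Psi_M\| \le \frac{|\lambda|}{\sqrt2}\sum_{j=1}^J\Bigl\|\frac{g_j}{\omega\omega_M}\Bigr\|\,\|[B_j,A_1]\|.$$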
2 λ 2|λ|2 c1 (g)2 +1 2 ˜ M )] 2 [Σλ − E0 (H +
J X
gj
ωωM k[Bj , A1 ]k j=1
|λ|2 c1 (g)2 0 We have E0 (HM ) = E0 (H sufficiently small. This completes the proof of Theorem 2.3. 6. The Pauli Fierz Type Model In this section, we apply Theorem 2.3 to a model of the Pauli–Fierz type in nonrelativistic QED. Namely we consider the case where the system S is a system of n nonrelativistic quantum particles moving in Rd under the influence of a potential V : Rdn → R (d, n ∈ N). We set ν := nd. We assume for simplicity that ν ≥ 3,
$V\in C_0^\infty(\mathbb{R}^\nu)$,
$V_- := \min\{V,0\}\ne0$.
The Hilbert space of the particle system is taken to be
$$\mathcal{H} = L^2(\mathbb{R}^\nu).$$
Hence the Hamiltonian of the particle system is
$$H_p := -\Delta + \alpha V$$
acting in $L^2(\mathbb{R}^\nu)$, where $\Delta$ is the generalized Laplacian on $L^2(\mathbb{R}^\nu)$ and $\alpha>0$ is a parameter. We write $x = (x_1,\dots,x_\nu)\in\mathbb{R}^\nu$ and define $p_j := -iD_j$ with $D_j$ being the generalized partial differential operator in the variable $x_j$. By the Cwikel–Lieb–Rosenbljum bound [17, Theorem XIII.12], $H_p$ has no ground state for all sufficiently small $\alpha$.
Let $g_j:\mathbb{R}^d\to\mathbb{R}^N$ ($j=1,\dots,\nu$) be such that
$$g_j,\ \frac{g_j}{\omega^2}\in W. \tag{6.1}$$
We take as the total Hamiltonian of the composed system
$$H_{\mathrm{PF}}(\lambda) := H_p\otimes I + I\otimes H_b + \lambda\sum_{j=1}^\nu p_j\otimes\phi(g_j).$$
This model is a concrete realization of the abstract model $H(\lambda)$ with the following choice:
$$A_0 = -\Delta, \qquad A_1 = \alpha V, \qquad J = \nu, \qquad B_j = p_j.$$
It is straightforward to see that Hypotheses I–V hold with $[B_j,A_1]|D(A_0) = -i\alpha D_jV$. Suppose that, for all $\xi = (\xi_1,\dots,\xi_\nu)\in\mathbb{R}^\nu$,
$$\frac12\sum_{j,l=1}^\nu\Bigl\langle\frac{g_j}{\sqrt\omega},\frac{g_l}{\sqrt\omega}\Bigr\rangle\xi_j\xi_l = G(g)\xi^2 \tag{6.2}$$
with $G(g)>0$ a constant independent of $\xi$. This condition is satisfied in the original Pauli–Fierz model without $A^2$-term in the dipole approximation [2] (see Example 6.2 below). Condition (6.2) implies that
$$\frac12\sum_{j,l=1}^\nu\Bigl\langle\frac{g_j}{\sqrt\omega},\frac{g_l}{\sqrt\omega}\Bigr\rangle p_jp_l = -G(g)\Delta.$$
Hence, in the present case, $A(\lambda)$ takes the following form:
$$A(\lambda) = -(1-\lambda^2G(g))\Delta + \alpha V.$$
Therefore, in the present case,
$$\Lambda = (-\lambda(g),0)\cup(0,\lambda(g)) \ne \emptyset,$$
where
$$\lambda(g) := \frac{1}{\sqrt{G(g)}}.$$
Thus Hypothesis VI holds. Also we have
$$c_s(g) = \sqrt2\,|\alpha|\sum_{j=1}^\nu\Bigl\|\frac{g_j}{\omega^s}\Bigr\|\,\|D_jV\|_\infty, \tag{6.3}$$
where $\|F\|_\infty := \sup_{x\in\mathbb{R}^\nu}|F(x)|$ ($F:\mathbb{R}^\nu\to\mathbb{C}$). We set
$$V_0 := \inf_{x\in\mathbb{R}^\nu}V(x) < 0.$$
We first consider the massive case.
Theorem 6.1. Consider the case $m>0$. Let $\nu\ge3$ and Hypothesis VII be satisfied. Suppose that
$$\alpha|V_0| > \frac12\lambda(g)^2c_{3/2}(g)^2 + \lambda(g)c_1(g) \tag{6.4}$$
and the constant $m$ satisfies
$$\alpha|V_0| > m + \frac12\lambda(g)^2c_{3/2}(g)^2 + \lambda(g)c_1(g). \tag{6.5}$$
Then there exists a constant $\delta$ such that, for all $|\lambda|\in(\lambda(g)-\delta,\lambda(g))$, $H_{\mathrm{PF}}(\lambda)$ has purely discrete spectrum in the interval $[E_0(H_{\mathrm{PF}}(\lambda)), E_0(H_{\mathrm{PF}}(\lambda))+m)$. In particular, $H_{\mathrm{PF}}(\lambda)$ has a ground state.
Proof. Let $0<|\lambda|<\lambda(g)$. By [17, Theorem XIII.15],
$$\Sigma_\lambda = \inf\sigma_{\mathrm{ess}}(A(\lambda)) = 0.$$
Therefore, by Theorem 2.2, we need only to show
$$-E_0(A(\lambda)) > m + \frac12\lambda^2c_{3/2}(g)^2 + |\lambda|c_1(g) \tag{6.6}$$
for all $|\lambda|$ sufficiently close to $\lambda(g)$. We can take a constant $\varepsilon>0$ such that $E := V_0+\varepsilon<0$. Then $D_E := \{x\in\mathbb{R}^\nu\,|\,V(x)<E\}$ is a nonempty bounded open set. Let $d_E := \inf_{u\in C_0^\infty(D_E),\,\|u\|=1}\langle u,-\Delta u\rangle$. Then, by the strict positivity of the Dirichlet Laplacian in a bounded open set, $d_E>0$. By the variational principle and the fact that $\langle u,Vu\rangle\le E\|u\|^2$, $u\in C_0^\infty(D_E)$,
$$E_0(A(\lambda)) \le (1-\lambda^2G(g))d_E + \alpha E. \tag{6.7}$$
Note that $-\alpha E = \alpha|V_0| - \alpha\varepsilon$. Hence, by (6.5), for all sufficiently small $\varepsilon>0$ and $|\lambda|$ sufficiently close to $\lambda(g)$,
$$-\alpha E - (1-\lambda^2G(g))d_E > m + \frac12\lambda(g)^2c_{3/2}(g)^2 + \lambda(g)c_1(g). \tag{6.8}$$
Hence
$$-E_0(A(\lambda)) > m + \frac12\lambda(g)^2c_{3/2}(g)^2 + \lambda(g)c_1(g) > m + \frac12\lambda^2c_{3/2}(g)^2 + |\lambda|c_1(g).$$
Thus (6.6) holds.
We next consider the massless case.
Theorem 6.2. Consider the case $m=0$. Let $\nu\ge3$. Assume Hypothesis VII and (6.4). Suppose in addition that
$$\frac{4\lambda(g)^2c_1(g)^2}{\alpha^2V_0^2} + \Bigl(\frac{8\lambda(g)^2c_1(g)^2}{\alpha^2V_0^2} + 1\Bigr)\frac12\lambda(g)^2c_2(g)^2 < 1. \tag{6.9}$$
Then there exists a constant $\delta$ such that, for all $|\lambda|\in(\lambda(g)-\delta,\lambda(g))$, $H_{\mathrm{PF}}(\lambda)$ has a ground state.
Proof. Let $0<|\lambda|<\lambda(g)$. By Theorem 2.3 and by the proof of Theorem 6.1, we need only to check that
$$\frac{\lambda^2c_1(g)^2}{E_0(H_{\mathrm{PF}}(\lambda))^2} + \Bigl(\frac{2\lambda^2c_1(g)^2}{E_0(H_{\mathrm{PF}}(\lambda))^2} + 1\Bigr)\frac12\lambda^2c_2(g)^2 < 1 \tag{6.10}$$
for all $|\lambda|$ sufficiently close to $\lambda(g)$. By (6.7) and Corollary 3.1, we have
$$E_0(H_{\mathrm{PF}}(\lambda)) \le (1-\lambda^2G(g))d_E + \alpha E + \frac{\lambda(g)}{2}c_1(g) \le (1-\lambda^2G(g))d_E - \frac{\alpha}{2}|V_0| + \alpha\varepsilon, \tag{6.11}$$
where, in the last step, we have used (6.4). For all $|\lambda|$ sufficiently close to $\lambda(g)$ and sufficiently small $\varepsilon$, the right hand side of (6.11) is negative. Hence, if we show that
$$\frac{\lambda^2c_1(g)^2}{\bigl[\frac{\alpha|V_0|}{2} - \alpha\varepsilon - (1-\lambda^2G(g))d_E\bigr]^2} + \Bigl(\frac{2\lambda^2c_1(g)^2}{\bigl[\frac{\alpha|V_0|}{2} - \alpha\varepsilon - (1-\lambda^2G(g))d_E\bigr]^2} + 1\Bigr)\frac12\lambda^2c_2(g)^2 < 1, \tag{6.12}$$
for all $|\lambda|$ sufficiently close to $\lambda(g)$ and sufficiently small $\varepsilon$, then (6.10) follows. It is easy to see that (6.9) implies (6.12) for all $|\lambda|$ sufficiently close to $\lambda(g)$ and sufficiently small $\varepsilon>0$.
Remark 6.1. Theorems 6.1 and 6.2 give only sufficient conditions for $H_{\mathrm{PF}}(\lambda)$ to have a ground state with $|\lambda|$ in an “intermediate” region. Suppose that $H_p$ has no ground state. Then it would be an interesting problem to investigate if there is a constant $\lambda_0>0$ such that, for all $|\lambda|\in(0,\lambda_0)$, $H_{\mathrm{PF}}(\lambda)$ has no ground state. Unfortunately we have been unable to give an answer to this problem.
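The mechanism behind Theorems 6.1 and 6.2 is that the kinetic coefficient $1-\lambda^2G(g)$ in $A(\lambda)$ shrinks as $|\lambda|$ approaches $\lambda(g)$, so the upper bound (6.7) eventually drops below $\Sigma_\lambda = 0$. The sketch below only evaluates the right-hand side of (6.7); the function names and all numerical values (`G`, `d_E`, `alpha`, `E`) are hypothetical placeholders chosen for illustration, not data from the paper.

```python
import math


def critical_coupling(G):
    """lambda(g) = 1 / sqrt(G(g)), the endpoint of the admissible coupling range."""
    if G <= 0:
        raise ValueError("G(g) must be positive")
    return 1.0 / math.sqrt(G)


def upper_bound_E0(lam, G, d_E, alpha, E):
    """Right-hand side of (6.7): (1 - lam^2 G) d_E + alpha E, for |lam| < lambda(g)."""
    if abs(lam) >= critical_coupling(G):
        raise ValueError("need |lam| < lambda(g)")
    return (1.0 - lam * lam * G) * d_E + alpha * E


if __name__ == "__main__":
    # Hypothetical values; E plays the role of V_0 + eps < 0.
    G, d_E, alpha, E = 0.25, 10.0, 1.0, -2.0
    lam_c = critical_coupling(G)
    for frac in (0.0, 0.5, 0.9, 0.99):
        print(frac, upper_bound_E0(frac * lam_c, G, d_E, alpha, E))
```

With these placeholder numbers the bound is positive (hence uninformative) for small coupling and becomes negative only when $|\lambda|$ is close to $\lambda(g)$, which is the "intermediate" regime referred to in Remark 6.1.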
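Example 6.1. Assume Hypothesis VII. Let $\kappa>0$ and $H_{\mathrm{PF}}^\kappa(\lambda)$ be the $H_{\mathrm{PF}}(\lambda)$ with $\omega$ replaced by $\kappa\omega$. Then conditions (6.4) and (6.9) take the following forms respectively:
$$\alpha|V_0| > \frac{1}{2\kappa^2}\lambda(g)^2c_{3/2}(g)^2 + \frac{1}{\sqrt\kappa}\lambda(g)c_1(g), \tag{6.13}$$
$$\frac{4\lambda(g)^2c_1(g)^2}{\kappa\alpha^2V_0^2} + \Bigl(\frac{8\lambda(g)^2c_1(g)^2}{\kappa\alpha^2V_0^2} + 1\Bigr)\frac{1}{2\kappa^3}\lambda(g)^2c_2(g)^2 < 1. \tag{6.14}$$
For a given $\alpha|V_0|$, these inequalities are satisfied if $\kappa$ is sufficiently large. Thus $H_{\mathrm{PF}}^\kappa(\lambda)$ has a ground state for all sufficiently large $\kappa$ and $|\lambda|<\sqrt\kappa\,\lambda(g)$ sufficiently close to $\sqrt\kappa\,\lambda(g)$. This result is somewhat analogous to the results by Hiroshima and Spohn [14, Lemma 3.3, Theorem 3.4], except that the regime of the coupling constant is different.
Example 6.2. Consider the original Pauli–Fierz model with one nonrelativistic particle in $\mathbb{R}^3$ so that $n=1$, $d=3$ and $N=2$. We take $\omega(k) = |k|$, $k\in\mathbb{R}^3$, and the momentum cutoff function $g_j:\mathbb{R}^3\to\mathbb{R}^2$ ($j=1,2,3$) as
$$g_j(k) = \Bigl(\frac{\chi_{[\sigma,L]}(|k|)}{\sqrt{(2\pi)^3|k|}}\,e_j^{(1)}(k),\ \frac{\chi_{[\sigma,L]}(|k|)}{\sqrt{(2\pi)^3|k|}}\,e_j^{(2)}(k)\Bigr),$$
where $\chi_{[\sigma,L]}$ is the characteristic function of the interval $[\sigma,L]$ ($\sigma>0$ is an infrared cutoff and $L>\sigma$ is an ultraviolet cutoff) and $e^{(r)} = (e_1^{(r)}, e_2^{(r)}, e_3^{(r)}):\mathbb{R}^3\to\mathbb{R}^3$ ($r=1,2$) is Borel measurable such that
$$\langle e^{(s)}(k), e^{(r)}(k)\rangle = \delta_{sr}, \qquad \langle e^{(r)}(k), k\rangle = 0, \qquad r,s=1,2,\ \text{a.e. } k\in\mathbb{R}^3.$$
By the identity
$$\sum_{r=1}^2e_j^{(r)}(k)e_l^{(r)}(k) = \delta_{jl} - \frac{k_jk_l}{|k|^2}, \qquad \text{a.e. } k$$
and the easily proven fact that
$$\int_{\mathbb{R}^3}f(|k|)k_jk_l\,dk = \delta_{jl}\frac13\int_{\mathbb{R}^3}f(|k|)k^2\,dk$$
for all $f:[0,\infty)\to\mathbb{C}$ such that $\int_{\mathbb{R}^3}f(|k|)k^2\,dk<\infty$, we can show that
$$\Bigl\langle\frac{g_j}{\omega^s},\frac{g_l}{\omega^s}\Bigr\rangle = \delta_{jl}\frac{2}{3(2\pi)^3}\int_{\mathbb{R}^3}\frac{\chi_{[\sigma,L]}(|k|)}{|k|^{2s+1}}\,dk = \begin{cases}\delta_{jl}\dfrac{8\pi}{3(2\pi)^3}\log\dfrac{L}{\sigma}; & s=1,\\[2mm] \delta_{jl}\dfrac{8\pi}{3(2\pi)^3}\dfrac{1}{2(1-s)}\bigl(L^{2(1-s)} - \sigma^{2(1-s)}\bigr); & s\ne1.\end{cases}$$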
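The closed form for $\langle g_j/\omega^s, g_l/\omega^s\rangle$ stated above reduces, in polar coordinates, to the one-dimensional integral $4\pi\int_\sigma^L r^{1-2s}\,dr$. As a hedged sanity check (the function names and the Simpson-rule quadrature are illustrative choices, not part of the paper), one can compare the closed form with a direct numerical evaluation:

```python
import math


def radial_integral_closed_form(s, sigma, L):
    """Integral of |k|^{-(2s+1)} over sigma <= |k| <= L in R^3, in closed form."""
    if not 0 < sigma < L:
        raise ValueError("need 0 < sigma < L")
    if s == 1:
        return 4.0 * math.pi * math.log(L / sigma)
    p = 2.0 * (1.0 - s)
    return 4.0 * math.pi * (L ** p - sigma ** p) / p


def radial_integral_numeric(s, sigma, L, n=20000):
    """Composite Simpson rule for 4*pi * int_sigma^L r^(1-2s) dr (n must be even)."""
    h = (L - sigma) / n

    def g(r):
        return r ** (1.0 - 2.0 * s)

    total = g(sigma) + g(L)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(sigma + i * h)
    return 4.0 * math.pi * total * h / 3.0


def inner_product_coefficient(s, sigma, L):
    """Coefficient of delta_{jl} in <g_j / omega^s, g_l / omega^s>."""
    return 2.0 / (3.0 * (2.0 * math.pi) ** 3) * radial_integral_closed_form(s, sigma, L)


if __name__ == "__main__":
    for s in (0.5, 1, 1.5, 2):
        print(s, radial_integral_closed_form(s, 0.5, 3.0), radial_integral_numeric(s, 0.5, 3.0))
```

The values $s = 1/2, 1, 3/2, 2$ are the ones that enter $G(g)$, $c_1(g)$, $c_{3/2}(g)$ and $c_2(g)$.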
Hence, in the present example, we have
$$G(g) = \frac{4\pi}{(2\pi)^3}(L-\sigma), \qquad \lambda(g) = \sqrt{\frac{(2\pi)^3}{4\pi}}\,\frac{1}{\sqrt{L-\sigma}},$$
$$c_1(g) = \sqrt2\,|\alpha|\Bigl(\sum_{j=1}^3\|D_jV\|_\infty\Bigr)\sqrt{\frac{8\pi}{3(2\pi)^3}}\,\sqrt{\log\frac{L}{\sigma}},$$
$$c_{3/2}(g) = \sqrt2\,|\alpha|\Bigl(\sum_{j=1}^3\|D_jV\|_\infty\Bigr)\sqrt{\frac{8\pi}{3(2\pi)^3}}\,\sqrt{\frac1\sigma - \frac1L},$$
$$c_2(g) = \sqrt2\,|\alpha|\Bigl(\sum_{j=1}^3\|D_jV\|_\infty\Bigr)\sqrt{\frac{8\pi}{3(2\pi)^3}}\,\frac{1}{\sqrt2}\sqrt{\frac{1}{\sigma^2} - \frac{1}{L^2}}.$$
From these formulas, we see that
$$\lambda(g)c_1(g)\sim\mathrm{const.}\sqrt{\frac{\log L}{L}}, \qquad \lambda(g)c_{3/2}(g)\sim\mathrm{const.}\frac{1}{\sqrt L}, \qquad \lambda(g)c_2(g)\sim\mathrm{const.}\frac{1}{\sqrt L}$$
as $L\to\infty$, where “const.” denotes a constant independent of $L$ sufficiently large. Hence, for all sufficiently large $L$, all the assumptions of Theorem 6.2 are satisfied. Thus, in the present example, $H_{\mathrm{PF}}(\lambda)$ has a ground state for all sufficiently large $L$ and $|\lambda|<\lambda(g)$ sufficiently close to $\lambda(g)$. A possible physical picture of this result is that the coupling of nonrelativistic quantum particles to photons with larger momenta makes it more likely for $H_{\mathrm{PF}}(\lambda)$ to have a ground state.
Appendix A. Weak Differentiability of a Heisenberg Operator
Let $\mathcal{X}$ be a Hilbert space. Let $H$ be a self-adjoint operator and $S$ a symmetric operator on $\mathcal{X}$. Then the Heisenberg operator (“time evolution”) of $S$ with respect to $H$ is defined by
$$S(t) := e^{itH}Se^{-itH}, \qquad t\in\mathbb{R}. \tag{A.1}$$
Proposition A.1. Suppose that there exists a self-adjoint operator $K$ on $\mathcal{X}$ such that the following (K.1) and (K.2) hold:
(K.1) $K$ strongly commutes with $H$.
(K.2) $D(K)\subset D(S)$ and there exist constants $a,b\ge0$ such that
$$\|S\psi\| \le a\|K\psi\| + b\|\psi\|, \qquad \psi\in D(K).$$
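Then, for all $\psi,\phi\in D(K)\cap D(H)$, the function $t\mapsto\langle\psi,S(t)\phi\rangle$ ($t\in\mathbb{R}$) is differentiable and
$$\frac{d}{dt}\langle\psi,S(t)\phi\rangle = i\bigl\{\langle He^{-itH}\psi, Se^{-itH}\phi\rangle - \langle Se^{-itH}\psi, He^{-itH}\phi\rangle\bigr\}. \tag{A.2}$$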
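In finite dimensions the hypotheses (K.1) and (K.2) hold trivially (take $K = I$), so (A.2) can be checked directly. The sketch below assumes a diagonal $H = \mathrm{diag}(h_1,\dots,h_n)$ and a Hermitian matrix $S$, and compares a central finite difference of $f(t) = \langle\psi,S(t)\phi\rangle$ with the right-hand side of (A.2); all names are illustrative, and the inner product is taken antilinear in the first argument.

```python
import cmath


def inner(x, y):
    """Inner product <x, y>, antilinear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))


def matvec(M, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]


def evolve(h, t, x):
    """Apply exp(-i t H) for the diagonal operator H = diag(h)."""
    return [cmath.exp(-1j * t * hk) * xk for hk, xk in zip(h, x)]


def f(h, S, psi, phi, t):
    """f(t) = <psi, S(t) phi> = <e^{-itH} psi, S e^{-itH} phi>."""
    return inner(evolve(h, t, psi), matvec(S, evolve(h, t, phi)))


def rhs_A2(h, S, psi, phi, t):
    """Right-hand side of (A.2)."""
    x, y = evolve(h, t, psi), evolve(h, t, phi)
    Hx = [hk * xk for hk, xk in zip(h, x)]
    Hy = [hk * yk for hk, yk in zip(h, y)]
    return 1j * (inner(Hx, matvec(S, y)) - inner(matvec(S, x), Hy))


if __name__ == "__main__":
    h = [0.3, 1.7]
    S = [[1.0, 2 - 1j], [2 + 1j, -0.5]]
    psi, phi = [1, 2j], [0.5 - 1j, 1]
    t, eps = 0.4, 1e-5
    numeric = (f(h, S, psi, phi, t + eps) - f(h, S, psi, phi, t - eps)) / (2 * eps)
    print(numeric, rhs_A2(h, S, psi, phi, t))
```

The content of Proposition A.1 is that the same identity survives for unbounded $S$ once $S$ is controlled by an operator $K$ commuting with $H$; the finite-dimensional check says nothing about that analytic part.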
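Proof. It follows from the strong commutativity of $K$ with $H$ and the two-variable functional calculus that $e^{itH}D(K)\cap D(H) = D(K)\cap D(H)$ for all $t\in\mathbb{R}$. Let $f(t) := \langle\psi,S(t)\phi\rangle$, $F_\varepsilon := (e^{-i\varepsilon H}-1)/\varepsilon$ and $G_\varepsilon := e^{-i\varepsilon H}-1$ with $\varepsilon\in\mathbb{R}\setminus\{0\}$. Then
$$\frac{f(t+\varepsilon)-f(t)}{\varepsilon} = \langle F_\varepsilon e^{-itH}\psi, Se^{-itH}G_\varepsilon\phi\rangle + \langle F_\varepsilon e^{-itH}\psi, Se^{-itH}\phi\rangle + \langle Se^{-itH}\psi, e^{-itH}F_\varepsilon\phi\rangle.$$
The first term on the right hand side is estimated as follows:
$$|\langle F_\varepsilon e^{-itH}\psi, Se^{-itH}G_\varepsilon\phi\rangle| \le \|F_\varepsilon\psi\|\bigl(a\|Ke^{-itH}G_\varepsilon\phi\| + b\|G_\varepsilon\phi\|\bigr) = \|F_\varepsilon\psi\|\bigl(a\|G_\varepsilon K\phi\| + b\|G_\varepsilon\phi\|\bigr),$$
where, in the last step, we have used the strong commutativity of $K$ and $H$. Note that $F_\varepsilon\psi\to-iH\psi$ and $G_\varepsilon\phi\to0$ strongly as $\varepsilon\to0$. Hence
$$\lim_{\varepsilon\to0}\langle F_\varepsilon e^{-itH}\psi, Se^{-itH}G_\varepsilon\phi\rangle = 0$$
and (A.2) follows.
Appendix B. Ground States of Self-Adjoint Operators
In this section we establish two abstract theorems on existence of ground states of a self-adjoint operator on an abstract Hilbert space. They reveal general structures of methods used in previous papers [7, 12, 13] to prove the existence of ground states of particle-field interaction models.
B.1. Existence of a ground state of a self-adjoint operator
For a self-adjoint operator $S$ on a Hilbert space $\mathcal{X}$, we denote the form domain of $S$ by $Q(S)$:
$$Q(S) := \Bigl\{\psi\in\mathcal{X}\ \Big|\ \int_{\mathbb{R}}|\lambda|\,d\|E_S(\lambda)\psi\|^2<\infty\Bigr\} = D(|S|^{1/2}), \tag{B.1}$$
where $E_S(\cdot)$ denotes the spectral measure of $S$ (see Sec. 2 for notations). For $\psi,\phi\in Q(S)$, we define $\langle\psi,S\phi\rangle$ by
$$\langle\psi,S\phi\rangle := \int_{\mathbb{R}}\lambda\,d\langle\psi,E_S(\lambda)\phi\rangle.$$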
For symmetric operators $A$, $B$ and a subspace $D\subset D(A)\cap D(B)$, we mean by “$A\le B$ on $D$” that $\langle\psi,A\psi\rangle\le\langle\psi,B\psi\rangle$ for all $\psi\in D$.
Let $\mathcal{H}$ and $\mathcal{K}$ be separable Hilbert spaces. Let $A$ and $B$ be self-adjoint operators on $\mathcal{H}$ and $\mathcal{K}$ respectively. We assume the following:
Hypothesis A. The operator $A$ is bounded from below and $B$ is nonnegative with $E_0(B)=0$.
We set
$$T_0 := A\otimes I + I\otimes B, \tag{B.2}$$
which is self-adjoint and bounded from below by $E_0(A)$. For a sesquilinear form $Z$ on a Hilbert space, we denote its form domain by $Q(Z)$. Let $Z$ be a symmetric sesquilinear form on $\mathcal{H}\otimes\mathcal{K}$ obeying the following conditions:
(i) $Q(Z)\supset Q(I\otimes B)$;
(ii) There exist constants $a\in[0,1)$ and $b\ge0$ such that, for all $\psi\in Q(I\otimes B)$,
$$|Z(\psi,\psi)| \le a\langle\psi, I\otimes B\psi\rangle + b\|\psi\|^2.$$
Lemma B.1. Assume Hypothesis A and let $Z$ be as above. Then there exists a unique self-adjoint operator $T$ on $\mathcal{H}\otimes\mathcal{K}$ such that $Q(T) = Q(T_0)$ and
$$\langle\psi,T\phi\rangle = \langle\psi,T_0\phi\rangle + Z(\psi,\phi), \qquad \psi,\phi\in Q(T_0).$$
$T$ is bounded from below by $E_0(A)-b$ and every domain of essential self-adjointness for $T_0$ is a form core for $T$.
Proof. Let $\hat A := A - E_0(A)$, which is nonnegative. By the present assumption, we have for all $\psi\in Q(T_0)$,
$$|Z(\psi,\psi)| \le a\langle\psi,(\hat A\otimes I + I\otimes B)\psi\rangle + b\|\psi\|^2.$$
Note that $\hat A\otimes I + I\otimes B\ge0$. Hence we can apply the KLMN theorem [16, Theorem X.17] to conclude that there exists a unique self-adjoint operator $T'$ on $\mathcal{H}\otimes\mathcal{K}$ such that $Q(T') = Q(\hat A\otimes I + I\otimes B) = Q(T_0)$ and $T' = \hat A\otimes I + I\otimes B + Z$ in the sense of sesquilinear form on $Q(T_0)$ with $T'\ge-b$. Then the operator $T$ defined by $T := T' + E_0(A)$ is the desired one.
Lemma B.2. Under the same hypothesis as in Lemma B.1,
$$|E_0(T) - E_0(A)| \le b. \tag{B.3}$$
Proof. By the variational principle and Lemma B.1, we have
$$E_0(T) \ge E_0(A) - b. \tag{B.4}$$
On the other hand, for all $f\in D(A)$ and $g\in D(B)$ with $\|f\|=1$ and $\|g\|=1$, we have
$$E_0(T) \le (f,Af) + (1+a)(g,Bg) + b,$$
which, together with the variational principle, implies that $E_0(T)\le E_0(A)+b$, where we have used the condition $E_0(B)=0$. Hence (B.3) follows.
We set
$$\Sigma := \inf\sigma_{\mathrm{ess}}(A) \tag{B.5}$$
and, for $s>0$,
$$\beta(s) := \frac{E_0(T) - E_0(A) + b + s}{1-a}. \tag{B.6}$$
By (B.3), we have
$$\beta(s) \ge \frac{s}{1-a} > 0. \tag{B.7}$$
Theorem B.1. Assume Hypothesis A and let $Z$ be as above. Suppose that
$$\Sigma - E_0(T) > b \tag{B.8}$$
and, for some $s_0>0$, $\mathrm{Ran}(E_B([0,\beta(s_0)]))$ is finite dimensional. Let $m$ be a constant such that
$$\Sigma - E_0(T) > m + b, \tag{B.9}$$
$$0<m<s_0. \tag{B.10}$$
Then $T$ has purely discrete spectrum in the interval $[E_0(T), E_0(T)+m)$. In particular, $T$ has a ground state.
Remark B.1. By Lemma B.2, condition (B.8) is satisfied if $\Sigma - E_0(A) > 2b$.
Proof. Let
$$\hat A := A - E_0(A), \qquad T_m := T - E_0(T) - m.$$
Then we have on $D(T_0)$,
$$T_m = \hat A\otimes I + I\otimes B + Z + E_0(A) - E_0(T) - m \ge \hat A\otimes I + I\otimes(1-a)B - \alpha_0,$$
where
$$\alpha_0 := E_0(T) - E_0(A) + b + m \ge m. \tag{B.11}$$
Since, by (B.9), $\alpha_0<\Sigma-E_0(A)$, one can take a constant $\delta>0$ such that $\alpha_0\le\delta<\Sigma-E_0(A)$. Let $P_\delta := E_{\hat A}([0,\delta])$ and $P_\delta^\perp := I - P_\delta = E_{\hat A}((\delta,\infty))$, so that $P_\delta + P_\delta^\perp = I$. Then we have
$$\hat A\otimes I + I\otimes(1-a)B - \alpha_0 = P_\delta\hat A\otimes I + P_\delta^\perp\hat A\otimes I + P_\delta\otimes(1-a)B + P_\delta^\perp\otimes(1-a)B - \alpha_0P_\delta\otimes I - \alpha_0P_\delta^\perp\otimes I.$$
Note that
$$P_\delta\hat A\otimes I\ge0, \qquad P_\delta^\perp\hat A\otimes I\ge\delta P_\delta^\perp\otimes I, \qquad P_\delta^\perp\otimes(1-a)B\ge0.$$
Hence
$$\hat A\otimes I + I\otimes(1-a)B - \alpha_0 \ge (\delta-\alpha_0)P_\delta^\perp\otimes I + P_\delta\otimes[(1-a)B-\alpha_0] \ge P_\delta\otimes[(1-a)B-\alpha_0],$$
where we have used the condition $\delta\ge\alpha_0$. Hence we have on $D(T_0)$,
$$T_m \ge P_\delta\otimes[(1-a)B-\alpha_0] \ge P_\delta\otimes[(1-a)B-\alpha_0]_-,$$
where $[(1-a)B-\alpha_0]_-$ means the negative part of $(1-a)B-\alpha_0$. Let $J_m := E_{T_m}([-m,0))$.
(i) The case where $\mathrm{Ran}(J_m)$ is finite dimensional. In this case $T_m$ has a purely discrete spectrum in $[-m,0)$. This means that the spectrum of $T$ in $[E_0(T),E_0(T)+m)$ is purely discrete. In particular $T$ has a ground state.
(ii) The case where $\mathrm{Ran}(J_m)$ is infinite dimensional. Note that
$$[(1-a)B-\alpha_0]_- = E_B([0,\beta(m)))\,[(1-a)B-\alpha_0]\,E_B([0,\beta(m))).$$
By condition (B.10) and the present assumption, $\mathrm{Ran}(E_B([0,\beta(m))))$ is finite dimensional. Hence $[(1-a)B-\alpha_0]_-$ is trace class. Therefore $P_\delta\otimes[(1-a)B-\alpha_0]_-$ is trace class. Let $\{\psi_n\}_{n=1}^\infty$ be a complete orthonormal system of $\mathrm{Ran}(J_m)$. Then, for all $N\in\mathbb{N}$,
$$0 \ge \sum_{n=1}^N\langle\psi_n,T_m\psi_n\rangle \ge \sum_{n=1}^N\langle\psi_n,P_\delta\otimes[(1-a)B-\alpha_0]_-\psi_n\rangle \ge \mathrm{Tr}\{P_\delta\otimes[(1-a)B-\alpha_0]_-\}.$$
Hence $\sum_{n=1}^N\langle\psi_n,T_m\psi_n\rangle$ is convergent as $N\to\infty$, which implies $J_mT_mJ_m$ is trace class and hence it is compact. Thus $T_m$ has purely discrete spectrum in $[-m,0)$, which implies that $T$ has purely discrete spectrum in $[E_0(T),E_0(T)+m)$. In particular $T$ has a ground state.
B.2. A limit theorem on ground states
Let $K$ be a self-adjoint operator on $\mathcal{H}$ bounded from below and $B$ be a nonnegative self-adjoint operator on $\mathcal{K}$ with $E_0(B)=0$. Let $C$ be a symmetric operator on $\mathcal{H}\otimes\mathcal{K}$ with $D(K\otimes I)\cap D(I\otimes B)\subset D(C)$ such that
$$H := K\otimes I + I\otimes B + C \tag{B.12}$$
is self-adjoint and bounded from below. Let
$$\Sigma := \inf\sigma_{\mathrm{ess}}(K). \tag{B.13}$$
Hypothesis (B.1). $\Sigma>E_0(K)$.
Hypothesis (B.2). There are a family $\{K_m\}_{m\in(0,m_0]}$ of symmetric operators on $\mathcal{H}$ with $D(K)\subset\bigcap_{m\in(0,m_0]}D(K_m)$ and a family $\{B_m\}_{m\in(0,m_0]}$ of nonnegative self-adjoint operators on $\mathcal{K}$ with $E_0(B_m)=0$ such that the following hold:
(i) There exists a constant $c_m>0$ such that, for all $u\in D(K)$, $\|(K-K_m)u\|\le c_m(\|Ku\|+\|u\|)$ and $\lim_{m\to0}c_m=0$.
(ii) There exists a nonzero vector $\psi_0$ such that, for all $m\in(0,m_0]$, $B_m\psi_0=0$. We denote the orthogonal projection onto the one-dimensional subspace $\{\alpha\psi_0\,|\,\alpha\in\mathbb{C}\}$ by $P_0$.
(iii) For each $m\in(0,m_0]$, $D(K_m\otimes I)\cap D(I\otimes B_m)\subset D(C)$ and the operator
$$H_m := K_m\otimes I + I\otimes B_m + C \tag{B.14}$$
is self-adjoint and bounded from below.
(iv) There exists a dense subspace $D\subset[\bigcap_{m\in(0,m_0]}D(B_m)]\cap D(B)$ such that, for all $\psi\in D$, $\lim_{m\to0}B_m\psi=B\psi$ and $D(K)\otimes_{\mathrm{alg}}D$ is a core of $H$, where $\otimes_{\mathrm{alg}}$ means algebraic tensor product.
For an orthogonal projection $P$ on a Hilbert space, we set $P^\perp := I-P$.
Theorem B.2. Assume Hypotheses (B.1) and (B.2). Suppose that
$$\inf_{m\in(0,m_0]}E_0(H_m)>-\infty, \qquad \sup_{m\in(0,m_0]}E_0(H_m)<\Sigma. \tag{B.15}$$
Suppose that, for all $m\in(0,m_0]$, $H_m$ has a ground state $\Psi_m$ with $\|\Psi_m\|=1$ and there exists a constant $\delta<1$ independent of $m\in(0,m_0]$ such that
$$\frac{1}{(\Sigma-E_0(H_m))^2}\|C\Psi_m\|^2 + \|I\otimes P_0^\perp\Psi_m\|^2 < \delta. \tag{B.16}$$
Then there exists a subsequence $\{\Psi_{m_j}\}_{j=1}^\infty$ with
$$m_1>m_2>\cdots>m_j>m_{j+1}>\cdots, \qquad \lim_{j\to\infty}m_j=0$$
such that the weak limit $\Psi_0 := \mathrm{w\text{-}}\lim_{j\to\infty}\Psi_{m_j}$ is a ground state of $H$.
Proof. We divide the proof of this theorem into two steps.
(1) By Hypothesis (B.2)-(i), there exists a positive constant $\varepsilon_0<m_0$ such that, for all $0<m<\varepsilon_0$, $c_m<1$. Then, by the Kato–Rellich theorem, $K_m = K+(K_m-K)$ is self-adjoint with $D(K_m)=D(K)$ and bounded from below. We can take a constant $\xi$ such that $\max\{\sup_{m\in(0,m_0]}E_0(H_m), E_0(K)\}<\xi<\Sigma$ and
$$\frac{1}{(\xi-E_0(H_m))^2}\|C\Psi_m\|^2 + \|I\otimes P_0^\perp\Psi_m\|^2 \le \delta. \tag{B.17}$$
Let
$$P_K := E_K([E_0(K),\xi]).$$
Then, by Hypothesis (B.1), $\dim\mathrm{Ran}\,P_K<\infty$. Let $K_m(\beta) := K+\beta L_m$ with $L_m := (K_m-K)/c_m$ and $\beta\in\mathbb{C}$. Since $L_m$ is relatively bounded with respect to $K$, it follows from [17, p. 16, Lemma] that $K_m(\beta)$ is an analytic family of type (A) near $\beta=0$. Hence it is an analytic family in the sense of Kato [17, p. 17, Theorem XII.9] and $K_m(\beta)$ is self-adjoint for real $\beta$ with $|\beta|$ sufficiently small. We define
$$Q_m(\beta) := E_{K_m(\beta)}([E_0(K),\xi])$$
and
$$Q_m := E_{K_m}([E_0(K),\xi]).$$
Then $Q_m(\beta)$ is analytic near $\beta=0$. In particular,
$$\lim_{\beta\to0}\|Q_m(\beta)-P_K\| = 0$$
and hence $\dim\mathrm{Ran}\,Q_m(\beta) = \dim\mathrm{Ran}\,P_K<\infty$ for all sufficiently small $|\beta|$. Note that $K_m(c_m)=K_m$. Therefore, for every $\varepsilon>0$, there exists a constant $\eta_0>0$ such that, for all $m\in(0,\eta_0)$,
$$\|Q_m-P_K\|<\varepsilon \tag{B.18}$$
and $\dim\mathrm{Ran}\,Q_m = \dim\mathrm{Ran}\,P_K$.
(2) By the weak compactness of the unit ball of a Hilbert space and condition (B.15), there exists a subsequence $\{\Psi_{m_j}\}_{j=1}^\infty$ ($m_1>m_2>\cdots>m_j>m_{j+1}>\cdots$, $\lim_{j\to\infty}m_j=0$) such that the weak limit $\Psi_0 := \mathrm{w\text{-}}\lim_{j\to\infty}\Psi_{m_j}$ and $E_0 := \lim_{j\to\infty}E_0(H_{m_j})$ exist. By Hypothesis (B.2)-(iii), we have $\lim_{m\to0}H_m\Psi = H\Psi$ for all $\Psi\in D(K)\otimes_{\mathrm{alg}}D$. Hence, by an application of [4, Lemma 4.9], if we show that $\Psi_0\ne0$, then we can conclude that $\Psi_0$ is a ground state with $E_0 = E_0(H)$. We have $\dim\mathrm{Ran}\,P_0=1$. Hence, to show that $\Psi_0\ne0$, we need only to prove
$$\langle\Psi_m, P_K\otimes P_0\Psi_m\rangle \ge 1-\delta' \tag{B.19}$$
with a constant $\delta'<1$ independent of $m$. Then, passing to the subsequence $\{\Psi_{m_j}\}_j$ and taking the limit $j\to\infty$, we obtain $\langle\Psi_0, P_K\otimes P_0\Psi_0\rangle\ge1-\delta'>0$, which implies that $\Psi_0\ne0$.
To prove (B.19), we first prove
$$\langle\Psi_m, Q_m\otimes P_0\Psi_m\rangle \ge 1-\delta. \tag{B.20}$$
Then, by (B.18), we obtain (B.19) for all $m<\eta_0$ with $\delta' = \delta+\varepsilon<1$ and hence the proof is completed. Now we note that (B.20) is equivalent to
$$\langle\Psi_m, (Q_m^\perp\otimes P_0 + I\otimes P_0^\perp)\Psi_m\rangle \le \delta. \tag{B.21}$$
This is seen by using the identity $1 = \langle\Psi_m, (Q_m+Q_m^\perp)\otimes(P_0+P_0^\perp)\Psi_m\rangle$. We prove (B.21). We have
$$(Q_m^\perp\otimes P_0)H_m = Q_m^\perp K_m\otimes P_0 + (Q_m^\perp\otimes P_0)C.$$
Hence
$$0 = (Q_m^\perp\otimes P_0)(H_m-E_0(H_m))\Psi_m = (Q_m^\perp(K_m-E_0(H_m))\otimes P_0)\Psi_m + (Q_m^\perp\otimes P_0)C\Psi_m,$$
which implies that
$$\langle\Psi_m, (Q_m^\perp\otimes P_0)C\Psi_m\rangle = -\langle\Psi_m, Q_m^\perp(K_m-E_0(H_m))\otimes P_0\Psi_m\rangle \le -(\xi-E_0(H_m))\langle\Psi_m, Q_m^\perp\otimes P_0\Psi_m\rangle.$$
Hence
$$\langle\Psi_m, Q_m^\perp\otimes P_0\Psi_m\rangle \le -\frac{1}{\xi-E_0(H_m)}\langle\Psi_m, (Q_m^\perp\otimes P_0)C\Psi_m\rangle \le \frac{1}{\xi-E_0(H_m)}\|Q_m^\perp\otimes P_0\Psi_m\|\,\|C\Psi_m\|.$$
Hence
$$\langle\Psi_m, Q_m^\perp\otimes P_0\Psi_m\rangle \le \frac{1}{(\xi-E_0(H_m))^2}\|C\Psi_m\|^2,$$
which, together with (B.17), yields (B.21).
Acknowledgments
This work was completed during the stay of the first author (A. A.) at the Erwin Schrödinger International Institute for Mathematical Physics (ESI) in the autumn, 2002. A. A. would like to thank Professor H. Grosse for giving him an opportunity to participate in the ESI program: Noncommutative Geometry and Quantum Field Theory and warm hospitality. A. A. also acknowledges the support given by the ESI. This work was supported in part also by the Grant-In-Aid No.13440039 for scientific research from the Japan Society for Promotion of Science.
References
[1] A. Arai, Self-adjointness and spectrum of Hamiltonians in nonrelativistic quantum electrodynamics, J. Math. Phys. 22 (1981), 534–537.
[2] A. Arai, An asymptotic analysis and its application to the nonrelativistic limit of the Pauli–Fierz and a spin-boson model, J. Math. Phys. 31 (1990), 2653–2663.
[3] A. Arai, Fock Spaces and Quantum Fields, Nippon-Hyouronsha, Tokyo, 2000 (in Japanese).
[4] A. Arai and M. Hirokawa, On the existence and uniqueness of ground states of a generalized spin-boson model, J. Funct. Anal. 151 (1997), 455–503.
[5] A. Arai and M. Hirokawa, Ground states of a general class of quantum field Hamiltonians, Rev. Math. Phys. 8 (2000), 1085–1135.
[6] A. Arai, M. Hirokawa and F. Hiroshima, On the absence of eigenvectors of Hamiltonians in a class of massless quantum field models without infrared cutoff, J. Funct. Anal. 168 (1999), 470–497.
[7] V. Bach, J. Fröhlich and I. M. Sigal, Quantum electrodynamics of confined nonrelativistic particles, Adv. Math. 137 (1998), 299–395.
[8] I. Catto and C. Hainzl, Self-energy of one electron in nonrelativistic QED, math-ph/0207036, 2002.
[9] T. Chen, V. Vougalter and S. A. Vugalter, The increase of binding energy and enhanced binding in nonrelativistic QED, J. Math. Phys. 44 (2003), 1961–1970.
[10] C. Hainzl, One nonrelativistic particle coupled to a photon field, Ann. Henri Poincaré 4 (2003), 217–237.
[11] C. Hainzl, V. Vougalter and S. A. Vugalter, Enhanced binding in nonrelativistic QED, Commun. Math. Phys. 233 (2003), 13–26.
[12] F. Hiroshima, Ground states of a model in nonrelativistic quantum electrodynamics I, J. Math. Phys. 40 (1999), 6209–6222.
[13] F. Hiroshima, Analysis of ground states of atoms interacting with a quantized radiation field, preprint, 2002.
[14] F. Hiroshima and H. Spohn, Enhanced binding through coupling to a quantum field, Ann. Henri Poincaré 2 (2001), 1159–1187.
[15] M. Reed and B. Simon, Methods of Modern Mathematical Physics Vol. I, Academic Press, New York, 1972.
[16] M. Reed and B. Simon, Methods of Modern Mathematical Physics Vol. II, Academic Press, New York, 1975.
[17] M. Reed and B. Simon, Methods of Modern Mathematical Physics Vol. IV, Academic Press, New York, 1978.
July 14, 2003 10:1 WSPC/148-RMP
00164
Reviews in Mathematical Physics Vol. 15, No. 5 (2003) 425–445 c World Scientific Publishing Company
TRACES FOR STAR PRODUCTS ON THE DUAL OF A LIE ALGEBRA
PIERRE BIELIAVSKY∗ and SIMONE GUTT†
Département de Mathématique, Université Libre de Bruxelles, Campus Plaine, C. P. 218, Boulevard du Triomphe, B-1050 Bruxelles, Belgique
∗[email protected]
†[email protected]
MARTIN BORDEMANN
Laboratoire de Mathématiques, Université de Haute-Alsace Mulhouse, 4, Rue des Frères Lumière, F.68093 Mulhouse, France
[email protected]
STEFAN WALDMANN
Fakultät für Mathematik und Physik, Albert-Ludwigs-Universität Freiburg, Physikalisches Institut, Hermann Herder Straße 3, D 79104 Freiburg, Germany
[email protected]
Received 20 March 2002
Revised 29 January 2003
In this paper, we describe all traces for the BCH star-product on the dual of a Lie algebra. First we show by an elementary argument that the BCH as well as the Kontsevich star-product are strongly closed if and only if the Lie algebra is unimodular. In a next step we show that the traces of the BCH star-product are given by the ad-invariant functionals. Particular examples are the integration over coadjoint orbits. We show that for a compact Lie group and a regular orbit one can even achieve that this integration becomes a positive trace functional. In this case we explicitly describe the corresponding GNS representation. Finally we discuss how invariant deformations on a group can be used to induce deformations of spaces on which the group acts.
Keywords: Deformation quantization; closed star-product; trace functionals.
1. Introduction

Trace functionals play an important role in deformation quantization [4] (for recent reviews on deformation quantization we refer to [20, 25, 37, 41]; existence and classification results can be found in [5, 21, 29, 32, 33, 43]). Physically, traces correspond to states of thermodynamical equilibrium characterized by the KMS condition at infinite temperature [3, 11]. Note, however, that for a reasonable physical interpretation one has to impose an additional positivity condition on the traces [12, 40].

On the mathematical side, traces are one half of the index theorem, namely the part of cyclic cohomology. The other half comes from the K-theory part. Given a trace functional $\mathrm{tr} : \mathcal{A} \to \mathsf{C}$ of an associative algebra $\mathcal{A}$ over some commutative ring $\mathsf{C}$ and a projection $P = P^2 \in M_n(\mathcal{A})$ representing an element $[P] \in K_0(\mathcal{A})$, the value $\mathrm{tr}(P) \in \mathsf{C}$ does not depend on $P$ but only on its class $[P]$. This is just the usual natural pairing of cyclic cohomology with K-theory, see e.g. [17, Chap. III.3], and the value $\mathrm{ind}([P]) = \mathrm{tr}(P)$ is called the index of $[P]$ with respect to the chosen trace.

In the case of deformation quantization the situation is as follows. The starting point is a star-product $\star$ for a Poisson manifold $(M, \pi)$, whence the algebra of interest is $\mathcal{A} = (C^\infty(M)[[\nu]], \star)$ viewed as an algebra over $\mathbb{C}[[\nu]]$. Then a trace is a $\mathbb{C}[[\nu]]$-linear functional $\mathrm{tr} : C^\infty(M)[[\nu]] \to \mathbb{C}[[\nu]]$ such that
\[ \mathrm{tr}(f \star g) = \mathrm{tr}(g \star f), \tag{1.1} \]
whenever one function has compact support. For the K-theory part of the index theorem one knows that K-theory is stable under deformation, see e.g. [36]: any projection $P_0$ of the undeformed algebra $M_n(C^\infty(M))$ can be deformed into a projection
\[ P = \frac{1}{2} + \Bigl(P_0 - \frac{1}{2}\Bigr) \star \frac{1}{\sqrt[\star]{1 + 4(P_0 \star P_0 - P_0)}} \tag{1.2} \]
with respect to $\star$, see [21, Eq. (6.1.4)]. Moreover, this deformation is unique up to equivalence of projections and any projection of the deformed algebra arises this way. It follows that $\mathrm{ind}([P])$ only depends on $[P_0] \in K_0(C^\infty(M))$, which is the isomorphism class of the vector bundle defined by $P_0$, see also [13] for a more detailed discussion.

Now let $\tilde\star$ be an equivalent star product with equivalence transformation $T(f \star g) = Tf \mathbin{\tilde\star} Tg$. Then clearly $\widetilde{\mathrm{tr}} = \mathrm{tr} \circ T^{-1}$ defines a trace functional with respect to $\tilde\star$. From (1.2) we see that $\mathrm{ind}([P]) = \widetilde{\mathrm{ind}}([\tilde P])$, where $\widetilde{\mathrm{ind}}$ is the index with respect to the trace $\widetilde{\mathrm{tr}}$ and $\tilde\star$. Thus the index transforms well under equivalences of star products provided one uses the 'correct' corresponding trace. It happens that in the symplectic case there is only one trace up to normalization [32]. So suppose that $M$ is compact and that for each star product $\star$ we have chosen a trace $\mathrm{tr}_\star$ normalized such that $\mathrm{tr}_\star(1) = c$, where $c$ does not depend on $\star$. Then $T1 = 1$ implies $\mathrm{tr}_{\tilde\star} = \mathrm{tr}_\star \circ T^{-1}$, and thus the index does not depend on the choice of $\star$ but only on the equivalence class $[\star]$. This simple reasoning already explains the structure of Fedosov's index formula [21, Theorem 6.1.6], see also the algebraic index theorem of Nest and Tsygan [32]. Nevertheless we would like to mention that the computation of $\mathrm{ind}([P])$ in geometrical terms is a quite non-trivial task.
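As a quick consistency check of (1.2), one can verify directly that $P$ is a projection. The following sketch assumes only that $\star$-power series in the single element $A := P_0 - \tfrac12$ commute with one another, and that the $\star$-square root is the formal one with leading term $1$ (which makes sense because $P_0 \star P_0 - P_0$ is of order $\nu$); the abbreviation $A$ is ours and does not appear in the text.

```latex
\begin{align*}
A &:= P_0 - \tfrac12, \qquad
4\,A \star A = 1 + 4\,(P_0 \star P_0 - P_0), \\
P - \tfrac12 &= A \star \bigl(4\,A \star A\bigr)^{-1/2}_{\star}, \\
\bigl(P - \tfrac12\bigr) \star \bigl(P - \tfrac12\bigr)
  &= A \star A \star \bigl(4\,A \star A\bigr)^{-1}_{\star} = \tfrac14, \\
P \star P - P &= \bigl(P - \tfrac12\bigr) \star \bigl(P - \tfrac12\bigr) - \tfrac14 = 0 .
\end{align*}
```

At order zero $A \cdot A = \tfrac14$, so the square root reduces to $1$ and $P = P_0 + O(\nu)$, as a deformation of $P_0$ should.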
For a formulation of the index theorem in the general Poisson case we refer to [39]. Here the situation is far less trivial, as in general there is no longer a unique trace. In [22] it is shown that integration over $M$ with respect to some smooth density $\Omega$ is a trace for Kontsevich's star product provided the Poisson tensor is $\Omega$-divergence free. However, there are many more traces, typically involving integrations over the symplectic leaves.

An elementary proof that in the symplectic case one has a unique trace is presented in [27]. This approach uses the canonical normalization of the trace introduced by Karabegov [28] using local $\nu$-Euler derivations, see [26], and the elementary proof of the uniqueness up to scaling of a trace as given in [11]: here one uses the fact that in the whole algebraic dual of $C^\infty(M)$ there is only one Poisson trace,
\[ \tau_0(\{f, g\}) = 0, \tag{1.3} \]
namely the integration with respect to the Liouville measure.

In this article we shall consider the simplest case of a Poisson manifold: the dual of a Lie algebra. Here we shall determine all the traces for the BCH star product on $\mathfrak{g}^*$ by very elementary arguments.

The paper is organized as follows. In Sec. 2 we recall the construction of various star products on the dual $\mathfrak{g}^*$ of a Lie algebra $\mathfrak{g}$ as well as their relation to star products on $T^*G$, where $G$ is a Lie group with Lie algebra $\mathfrak{g}$. Then we prove the strong closedness of homogeneous star products on $\mathfrak{g}^*$ by elementary computations in Sec. 3, and in Sec. 4 we show that any ad-invariant functional is a trace for the BCH star product. In Sec. 5 we prove the positivity of a trace $\tau_{\mathcal{O}}$ associated to a regular orbit $\mathcal{O} \subseteq \mathfrak{g}^*$ for compact $G$ by a BRST construction of a star product on $\mathcal{O}$. Section 6 contains a characterization of the GNS representation obtained from the positive trace $\tau_{\mathcal{O}}$. Finally, Sec. 7 is devoted to a construction of trace functionals by a group action using a 'universal deformation' on the group, inspired by techniques developed in [6, 23].

2. Star Products on $\mathfrak{g}^*$ and $T^*G$

In this section we recall the construction of several star products on the dual $\mathfrak{g}^*$ of a Lie algebra $\mathfrak{g}$ and on $T^*G$, where $G$ is a Lie group with Lie algebra $\mathfrak{g}$. First we establish some notation. By $e_1, \ldots, e_n$ we denote a basis of $\mathfrak{g}$ with dual basis $e^1, \ldots, e^n \in \mathfrak{g}^*$. Such a basis gives rise to linear coordinates $x = x^i e_i$ on $\mathfrak{g}$ and $\xi = \xi_i e^i$ on $\mathfrak{g}^*$. Here and in the following we use Einstein's summation convention. With a capital letter $X$ we denote the left-invariant vector field $X \in \Gamma^\infty(TG)$ corresponding to $x \in \mathfrak{g}$, i.e. $X_e = x$. A vector $x \in \mathfrak{g}$ determines a linear function $\hat{x} \in \mathrm{Pol}^1(\mathfrak{g}^*)$ by $\hat{x}(\xi) = \xi(x)$. Analogously, $X \in \Gamma^\infty(TG)$ determines a function $\hat{X} \in \mathrm{Pol}^1(T^*G)$, linear in the fibers, by setting $\hat{X}(\alpha_g) = \alpha_g(X_g)$, where $\alpha_g \in T^*_g G$ and $g \in G$. We use the same symbol $\hat{\ }$ for the corresponding graded algebra isomorphism between the symmetric algebra $\bigvee^\bullet \mathfrak{g}$ of $\mathfrak{g}$ and all polynomials $\mathrm{Pol}^\bullet(\mathfrak{g}^*)$ on $\mathfrak{g}^*$. Similarly we have a graded algebra isomorphism between $\Gamma^\infty(\bigvee^\bullet TG)$ and $\mathrm{Pol}^\bullet(T^*G)$.

By use of left-invariant vector fields and one-forms, $TG$ and $T^*G$ trivialize canonically. This yields $TG \cong G \times \mathfrak{g}$ and $T^*G \cong G \times \mathfrak{g}^*$. The corresponding projections are denoted by
\[ G \xleftarrow{\;\pi\;} G \times \mathfrak{g}^* \xrightarrow{\;\varrho\;} \mathfrak{g}^*, \tag{2.1} \]
whence in particular $\hat{X} = \varrho^* \hat{x}$ for a left-invariant vector field $X$. More generally, $\mathrm{Pol}^\bullet(T^*G)^G = \varrho^* \mathrm{Pol}^\bullet(\mathfrak{g}^*)$. For the symplectic Poisson bracket on $T^*G$ we use the sign convention such that the map $\hat{\ } : \Gamma^\infty(TG) \to \mathrm{Pol}^1(T^*G)$ becomes an isomorphism of Lie algebras (and not an anti-isomorphism as in [8]). Then the canonical linear Poisson bracket on $\mathfrak{g}^*$ can be obtained by the observation that left-invariant functions on $T^*G$ (with respect to the lifted action) form a Poisson sub-algebra which is in linear bijection with $C^\infty(\mathfrak{g}^*)$ via $\varrho^*$. Thus it is meaningful to require $\varrho^*$ to be a morphism of Poisson algebras. In the global coordinates $\xi_1, \ldots, \xi_n$ the resulting Poisson bracket on $\mathfrak{g}^*$ reads
\[ \{f, g\} = \xi_k c^k_{ij} \frac{\partial f}{\partial \xi_i} \frac{\partial g}{\partial \xi_j}, \tag{2.2} \]
where $c^k_{ij} = e^k([e_i, e_j])$ are the structure constants of $\mathfrak{g}$ and $f, g \in C^\infty(\mathfrak{g}^*)$.

The first star-product on $\mathfrak{g}^*$ is essentially given by the Baker–Campbell–Hausdorff series of $\mathfrak{g}$. One uses the total symmetrization map $\sigma_\nu : \mathrm{Pol}^\bullet(\mathfrak{g}^*)[\nu] \to U(\mathfrak{g})[\nu]$ into the universal enveloping algebra of $\mathfrak{g}$, defined by
\[ \sigma_\nu(\hat{x}_1 \cdots \hat{x}_k) = \frac{\nu^k}{k!} \sum_{\tau \in S_k} x_{\tau(1)} \bullet \cdots \bullet x_{\tau(k)}, \tag{2.3} \]
where we have built in the formal parameter $\nu$ already at this stage. Then
\[ \sigma_\nu(f \star_{\mathrm{BCH}} g) = \sigma_\nu(f) \bullet \sigma_\nu(g) \tag{2.4} \]
indeed yields a deformed product $\star_{\mathrm{BCH}}$ for $f, g \in \mathrm{Pol}^\bullet(\mathfrak{g}^*)[\nu]$, which turns out to extend to a differential star-product for $C^\infty(\mathfrak{g}^*)[[\nu]]$, see [24] for a detailed discussion. Here we just mention a few properties of $\star_{\mathrm{BCH}}$. First, $\star_{\mathrm{BCH}}$ is strongly $\mathfrak{g}$-invariant, i.e. for $f \in C^\infty(\mathfrak{g}^*)[[\nu]]$ and $x \in \mathfrak{g}$ we have
\[ \hat{x} \star_{\mathrm{BCH}} f - f \star_{\mathrm{BCH}} \hat{x} = \nu \{\hat{x}, f\}. \tag{2.5} \]
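Strong invariance (2.5) can be checked by hand in the simplest non-abelian example. The sketch below is an illustration under stated assumptions, not the construction of [24]: it takes $\mathfrak{g}$ to be the three-dimensional Heisenberg algebra with $[e_1, e_2] = e_3$, and it assumes the standard fact that for a two-step nilpotent Lie algebra the BCH series stops after the first commutator, so that $\star_{\mathrm{BCH}}$ reduces to the Moyal-type formula $f \star g = \sum_n \frac{1}{n!} \bigl(\frac{\nu \xi_3}{2}\bigr)^n (\partial_1 \otimes \partial_2 - \partial_2 \otimes \partial_1)^n (f \otimes g)$. The helper names `d`, `poisson`, `star` and `commutator` are ours.

```python
import sympy as sp

xi1, xi2, xi3, nu = sp.symbols("xi1 xi2 xi3 nu")


def d(expr, var, n):
    """Apply d/dvar to expr n times (n may be zero)."""
    for _ in range(n):
        expr = sp.diff(expr, var)
    return expr


def poisson(f, g):
    """Linear Poisson bracket (2.2) for the Heisenberg algebra (c^3_{12} = 1)."""
    return sp.expand(
        xi3 * (sp.diff(f, xi1) * sp.diff(g, xi2) - sp.diff(f, xi2) * sp.diff(g, xi1))
    )


def star(f, g, order=8):
    """Assumed closed form of the BCH star product for the Heisenberg algebra.

    The sum is exact on polynomials whose degree in (xi1, xi2) is at most
    `order`, because higher terms differentiate them to zero.
    """
    total = sp.Integer(0)
    for n in range(order + 1):
        term = sp.Integer(0)
        for k in range(n + 1):
            # Binomial expansion of (d1 (x) d2 - d2 (x) d1)^n applied to f (x) g.
            left = d(d(f, xi1, n - k), xi2, k)
            right = d(d(g, xi2, n - k), xi1, k)
            term += (-1) ** k * sp.binomial(n, k) * left * right
        total += (nu * xi3 / 2) ** n / sp.factorial(n) * term
    return sp.expand(total)


def commutator(f, g):
    """Star commutator f * g - g * f."""
    return sp.expand(star(f, g) - star(g, f))


# A linear function x-hat and a sample polynomial f (both chosen arbitrarily).
x_hat = 3 * xi1 + 2 * xi2 - xi3
f = xi1**2 * xi2 + xi2 * xi3 + xi1 * xi2**3

# (2.5) states that this difference vanishes identically.
defect = sp.expand(commutator(x_hat, f) - nu * poisson(x_hat, f))
print(defect)
```

For a linear $\hat{x}$ only the terms $n = 0, 1$ of the sum contribute, which is why the star commutator reproduces $\nu\{\hat{x}, f\}$ exactly rather than only to first order.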
Moreover, $\star_{\mathrm{BCH}}$ is homogeneous: this means that the operator $H = \nu \frac{\partial}{\partial \nu} + L_E$, where $E = \xi_i \frac{\partial}{\partial \xi_i}$ is the Euler vector field, is a derivation of $\star_{\mathrm{BCH}}$, i.e.
\[ H(f \star_{\mathrm{BCH}} g) = Hf \star_{\mathrm{BCH}} g + f \star_{\mathrm{BCH}} Hg \tag{2.6} \]
for all $f, g \in C^\infty(\mathfrak{g}^*)[[\nu]]$. It follows immediately that $\mathrm{Pol}^\bullet(\mathfrak{g}^*)[\nu]$ is a 'convergent' sub-algebra generated by the constant and linear polynomials. The relation to the BCH series can be seen as follows: consider the exponential functions $e_x(\xi) := e^{\xi(x)}$. Then for all $x, y \in \mathfrak{g}$ one has
\[ e_x \star_{\mathrm{BCH}} e_y = e_{\frac{1}{\nu} H(\nu x, \nu y)}, \tag{2.7} \]
where $H(\cdot, \cdot)$ is the BCH series of $\mathfrak{g}$. Since bidifferential operators on $\mathfrak{g}^*$ are already determined by their values on the exponential functions $e_x$, $x \in \mathfrak{g}$, the star-product $\star_{\mathrm{BCH}}$ is already determined by (2.7). For a more detailed analysis and proofs of the above statements we refer to [8, 24].

The other star-product we shall mention is the Kontsevich star-product $\star_K$ for $\mathfrak{g}^*$. Kontsevich's general construction of a star-product for arbitrary Poisson structures on $\mathbb{R}^n$ simplifies drastically in the case of a linear Poisson structure (2.2). We shall not enter the general construction but refer to [1, 2, 19, 29, 30] for more details and just mention a few properties of $\star_K$. First, $\star_K$ is $\mathfrak{g}$-covariant, i.e. one has
\[ \hat{x} \star_K \hat{y} - \hat{y} \star_K \hat{x} = \nu \{\hat{x}, \hat{y}\} = \nu \widehat{[x, y]} \tag{2.8} \]
for all $x, y \in \mathfrak{g}$. This is a weaker compatibility with the (classical) $\mathfrak{g}$-action than (2.5). Moreover, $\star_K$ is homogeneous, too, but in general $\star_K$ and $\star_{\mathrm{BCH}}$ do not coincide but are only equivalent, see [19].

Let us now recall how the star-product $\star_{\mathrm{BCH}}$ on $\mathfrak{g}^*$ is related to star-products on $T^*G$. The main idea is to make the Poisson morphism $\varrho^*$ into an algebra morphism of star-product algebras. This requirement does not determine the star-product on $T^*G$ completely, and the remaining freedom (essentially the choice of an 'ordering prescription' between functions depending only on $G$ and on $\mathfrak{g}^*$, respectively) can be used to impose further properties. In [24] a star-product $\star_G$ of Weyl-type was constructed by inserting additional derivatives in $G$-direction into the bidifferential operators of $\star_{\mathrm{BCH}}$. In [8] a star product $\star_S$ of standard-ordered type was obtained by a (standard-ordered) Fedosov construction using the lift of the half-commutator connection on $G$ to a symplectic connection on $T^*G$. The star-product $\star_S$ can also be understood as the resulting composition law of symbols from the standard-ordered symbol and differential operator calculus induced by the half-commutator connection. A further 'Weyl-symmetrization' yields a star-product $\star_W$ of Weyl-type which does not coincide in general with the original Fedosov star-product $\star_F$ built out of the half-commutator connection directly. However, it was shown in [8, Sec. 8] that $\star_W$ coincides with $\star_G$. Moreover, the pull-back $\varrho^*$ is indeed an algebra morphism for both star-products $\star_G$ and $\star_S$, i.e. one has
\[ \varrho^* f \star_{G/S} \varrho^* g = \varrho^*(f \star_{\mathrm{BCH}} g) \tag{2.9} \]
for all $f, g \in C^\infty(\mathfrak{g}^*)[[\nu]]$. All the star products $\star_G$, $\star_S$, and $\star_F$ are homogeneous in the sense of star-products on cotangent bundles, whence it follows that they are all strongly closed: integration over $T^*G$ with respect to the Liouville form defines a trace on the functions with compact support, see [9, Sec. 8].

3. Strong Closedness of $\star_{\mathrm{BCH}}$ and $\star_K$

We shall now discuss an elementary proof of the fact that $\star_{\mathrm{BCH}}$ as well as $\star_K$ are strongly closed with respect to the constant volume form $d^n\xi$ on $\mathfrak{g}^*$ if and only if the Lie algebra $\mathfrak{g}$ is unimodular, i.e. $\mathrm{tr}\,\mathrm{ad}(x) = 0$ for all $x \in \mathfrak{g}$, or, equivalently, $c^i_{ij} = 0$. The unimodularity of $\mathfrak{g}$ is easily seen to be necessary since it is exactly the condition that the integration is a Poisson trace, see also [42, Sec. 4] for the Poisson case and [22] for a different and more general proof for Kontsevich's star product on $\mathbb{R}^n$.

Before we discuss the general case let us consider the case where $G$ is compact. In this case $\mathfrak{g}$ is known to be unimodular.

Proposition 3.1. Let $G$ be compact. Then $\star_{\mathrm{BCH}}$ is strongly closed.

Proof. Let $f, g \in C_0^\infty(\mathfrak{g}^*)$. Since $G$ is compact, $\varrho^* f, \varrho^* g \in C_0^\infty(T^*G)$, and thus the strong closedness of $\star_G$ and (2.9) imply
\[ 0 = \int_{T^*G} (\varrho^* f \star_G \varrho^* g - \varrho^* g \star_G \varrho^* f)\, \Omega = \mathrm{vol}(G) \int_{\mathfrak{g}^*} (f \star_{\mathrm{BCH}} g - g \star_{\mathrm{BCH}} f)\, d^n\xi, \]
where $\Omega$ is the (suitably normalized) Liouville measure on $T^*G$.

Clearly the above proof relies on the compactness of $G$; otherwise the integration would not be defined. As an amusing observation we remark that one can also use the above proposition to obtain the well-known fact that compact Lie groups have unimodular Lie algebras.

For the general unimodular case we use a different argument which is essentially the same as for homogeneous star-products on a cotangent bundle [9, Sec. 8], see also [10, 34] for more details on star products on cotangent bundles and their traces. A differential operator $D$ on $\mathfrak{g}^*$ is called homogeneous of degree $r \in \mathbb{Z}$ if $[L_E, D] = rD$, where $L_E$ is the Lie derivative with respect to the Euler vector field.

Lemma 3.2. Let $D$ be a homogeneous differential operator of degree $-r$ with $r \geq 1$. Then for all $f \in C_0^\infty(\mathfrak{g}^*)$ one has
\[ \int_{\mathfrak{g}^*} Df\, d^n\xi = 0. \tag{3.1} \]
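The text does not spell out a proof of Lemma 3.2. The following is one possible argument, offered as a sketch under the assumption that $D$ has smooth coefficients on all of $\mathfrak{g}^* \cong \mathbb{R}^n$; the multi-index notation $a_I$, $\partial^I$ is ours.

```latex
\begin{align*}
D &= \sum_I a_I(\xi)\, \partial^I, \qquad
[L_E,\, a_I\, \partial^I] = \bigl(E a_I - |I|\, a_I\bigr)\, \partial^I, \\
[L_E, D] = -r D \;&\Longrightarrow\; E a_I = (|I| - r)\, a_I
\quad\text{for every multi-index } I, \\
\int_{\mathfrak{g}^*} a_I\, \partial^I f \; d^n\xi
 &= (-1)^{|I|} \int_{\mathfrak{g}^*} f\, \partial^I a_I \; d^n\xi = 0 .
\end{align*}
```

The last step uses that a smooth function which is Euler-homogeneous of integer degree $m$ vanishes for $m < 0$ and is a homogeneous polynomial of degree $m$ for $m \geq 0$. Hence each $a_I$ is a polynomial of degree $|I| - r < |I|$, so $\partial^I a_I = 0$, and the boundary terms in the integration by parts vanish because $f$ has compact support.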
From here we can follow [9] almost literally: if $f \in \mathrm{Pol}^k(\mathfrak{g}^*)$ and $g \in C_0^\infty(\mathfrak{g}^*)$, then for every homogeneous star product $\star$ on $\mathfrak{g}^*$ one has
\[ \int_{\mathfrak{g}^*} f \star g\, d^n\xi = \sum_{r=0}^{k} \nu^r \int_{\mathfrak{g}^*} C_r(f, g)\, d^n\xi, \tag{3.2} \]
where $C_r$ is the $r$th bidifferential operator of $\star$. This follows from Lemma 3.2 since $C_r(f, \cdot)$ is homogeneous of degree $k - r$. The analogous statement holds for the integral over $g \star f$. From this we conclude the following lemma:

Lemma 3.3. Let $\star$ be a homogeneous star-product for $\mathfrak{g}^*$, $f \in \mathrm{Pol}^\bullet(\mathfrak{g}^*)$, and $g \in C_0^\infty(\mathfrak{g}^*)$. Then
\[ \int_{\mathfrak{g}^*} (f \star g - g \star f)\, d^n\xi = 0 \tag{3.3} \]
if and only if $\mathfrak{g}$ is unimodular.

Proof. The proof is done by induction on the polynomial degree $k$ of $f$. For $k = 0$ the statement (3.3) is true by (3.2). For $k = 1$ we obtain (3.3) by (3.2) if and only if the integral vanishes on Poisson brackets, i.e. if and only if $\mathfrak{g}$ is unimodular. For $k \geq 2$ we can write $f$ as a $\star$-polynomial in at most linear polynomials, since these polynomials generate $\mathrm{Pol}^\bullet(\mathfrak{g}^*)[\nu]$ by the homogeneity of $\star$. Then we can use the cases $k = 0, 1$ to prove (3.3).

Having the trace property for polynomials and compactly supported functions, we only have to use a density argument, i.e. the Stone–Weierstraß theorem, to conclude the trace property in general:

Theorem 3.4. Let $\star$ be a homogeneous star-product for $\mathfrak{g}^*$. Then the integration over $\mathfrak{g}^*$ with respect to the constant volume $d^n\xi$ is a trace if and only if $\mathfrak{g}$ is unimodular.

Since $\star_{\mathrm{BCH}}$ as well as $\star_K$ are homogeneous, this theorem proves in an elementary way that they are strongly closed in the sense of [18].

4. Trace Properties of $\mathfrak{g}$-Invariant Functionals

Quite contrary to the symplectic case, it turns out that in the Poisson case traces are no longer unique in general. Before we give an elementary proof in the case of $\mathfrak{g}^*$ we make a few comments on the general situation. As we have seen before, the trace functionals are typically not defined on the whole algebra but on a certain subspace, e.g. the functions with compact support. On the other hand, the property of being a trace only becomes interesting if this subspace is not only a sub-algebra but even an ideal. This motivates the following terminology: for an associative algebra $\mathcal{A}$ we call a functional $\tau$ defined on $\mathcal{J} \subseteq \mathcal{A}$ a trace on $\mathcal{J}$ if $\mathcal{J}$ is a two-sided ideal and for all $A \in \mathcal{A}$ and $B \in \mathcal{J}$ one has $\tau([A, B]) = 0$. Similarly we define a Poisson trace on a Poisson ideal of a Poisson algebra. With this notation the traces which are given by integrations are traces on the ideals $C_0^\infty(\mathfrak{g}^*)$ and $C_0^\infty(\mathfrak{g}^*)[[\nu]]$, respectively.
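The unimodularity condition of Theorem 3.4, which decides whether integration against $d^n\xi$ is a (Poisson) trace on $C_0^\infty(\mathfrak{g}^*)$, is easy to test from structure constants. The following is a minimal sketch, assuming the convention $c^k_{ij} = e^k([e_i, e_j])$ of (2.2) and two illustrative algebras chosen by us: $\mathfrak{so}(3)$ and the two-dimensional non-abelian algebra with $[e_1, e_2] = e_2$. It computes $\mathrm{tr}\,\mathrm{ad}(e_j) = \sum_i c^i_{ji}$ and, independently, the divergence of the Hamiltonian vector field of the linear function $\xi_j$; all function names are ours.

```python
import sympy as sp


def structure_constants(n, brackets):
    """Build c[k][i][j] = c^k_{ij} from {(i, j): {k: value}} (0-based, i < j)."""
    c = [[[sp.Integer(0) for _ in range(n)] for _ in range(n)] for _ in range(n)]
    for (i, j), components in brackets.items():
        for k, value in components.items():
            c[k][i][j] = sp.Integer(value)
            c[k][j][i] = -sp.Integer(value)  # antisymmetry of the bracket
    return c


def trace_ad(c):
    """Return [tr ad(e_j)] for all j, using ad(e_j) e_i = c^k_{ji} e_k."""
    n = len(c)
    return [sum(c[i][j][i] for i in range(n)) for j in range(n)]


def hamiltonian_divergence(c, j):
    """Divergence, w.r.t. d^n xi, of the vector field {xi_j, .} from (2.2)."""
    n = len(c)
    xi = sp.symbols(f"xi0:{n}")
    # Component i of {xi_j, .} is sum_k xi_k c^k_{ji}.
    components = [sum(xi[k] * c[k][j][i] for k in range(n)) for i in range(n)]
    return sp.simplify(sum(sp.diff(components[i], xi[i]) for i in range(n)))


# so(3): [e1, e2] = e3, [e2, e3] = e1, [e3, e1] = e2 (unimodular).
so3 = structure_constants(3, {(0, 1): {2: 1}, (1, 2): {0: 1}, (0, 2): {1: -1}})
# Two-dimensional non-abelian algebra: [e1, e2] = e2 (not unimodular).
affine = structure_constants(2, {(0, 1): {1: 1}})

print(trace_ad(so3), trace_ad(affine))
```

For the second algebra the divergence of $\{\xi_1, \cdot\}$ is a non-zero constant, so the integral of $\{\xi_1, g\}$ against $d^n\xi$ does not vanish for a general $g \in C_0^\infty(\mathfrak{g}^*)$, in line with the 'only if' part of the theorem.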
However, there will be some interesting traces with a slightly different domain. If we want to integrate over a sub-manifold $\iota : N \hookrightarrow M$ then the following space becomes important. Here and in the following we only consider the case where $\iota$ is an embedding. We define
\[ C_N^\infty(M) := \{ f \in C^\infty(M) \mid \iota(N) \cap \mathrm{supp}\, f \text{ is compact} \}. \tag{4.1} \]
If $N$ is a closed embedded sub-manifold then $C_0^\infty(M) \subseteq C_N^\infty(M)$. Moreover, the locality of a star-product ensures that $C_N^\infty(M)[[\nu]]$ is a two-sided ideal of $C^\infty(M)[[\nu]]$. Taking such a subspace as an example, we consider more generally domains of the form $\mathcal{D}[[\nu]]$ where $\mathcal{D} \subseteq C^\infty(M)$. In this case $\mathcal{D}$ is necessarily a Poisson ideal, which follows immediately from the ideal properties of $\mathcal{D}[[\nu]]$. Moreover, if $\tau : \mathcal{D}[[\nu]] \to \mathbb{R}[[\nu]]$ is a trace for a local star-product $\star$ on $M$ with domain $\mathcal{D}[[\nu]]$, then $\tau = \sum_{r=0}^\infty \nu^r \tau_r$ with linear functionals $\tau_r : \mathcal{D} \to \mathbb{R}$. For the following we assume that all $\tau_r$ have some reasonable continuity property, e.g. with respect to the locally convex topology of smooth functions. This requirement seems to be reasonable as long as we are dealing with star-products having at least continuous cochains in every order of $\nu$.

Now let us come back to the case of $\mathfrak{g}^*$ with the star product $\star_{\mathrm{BCH}}$. As a first observation we remark that the strong $\mathfrak{g}$-invariance of $\star_{\mathrm{BCH}}$ implies that for a two-sided ideal $\mathcal{D}[[\nu]]$ the space $\mathcal{D}$ is $\mathfrak{g}$-invariant. Moreover, we have the following theorem:

Theorem 4.1. Let $\mathcal{D} \subseteq C^\infty(\mathfrak{g}^*)$ be a subspace such that $\mathcal{D}[[\nu]]$ is a two-sided ideal with respect to $\star_{\mathrm{BCH}}$ and let $\tau = \sum_{r=0}^\infty \nu^r \tau_r$ be an $\mathbb{R}[[\nu]]$-linear functional on $\mathcal{D}[[\nu]]$ with the following continuity property: for a given $f \in C^\infty(\mathfrak{g}^*)$ and $g \in \mathcal{D}$ and a sequence $p_n \in \mathrm{Pol}^\bullet(\mathfrak{g}^*)$ such that $p_n \to f$ in the locally convex topology of smooth functions we have $\tau_r([p_n, g]_{\star_{\mathrm{BCH}}}) \to \tau_r([f, g]_{\star_{\mathrm{BCH}}})$ (in each order of $\nu$). Then $\tau$ is a $\star_{\mathrm{BCH}}$-trace on $\mathcal{D}[[\nu]]$ if and only if $\tau$ is a Poisson trace on $\mathcal{D}$, which is the case if and only if $\tau$ is $\mathfrak{g}$-invariant.

Proof. The continuity ensures that $\mathfrak{g}$-invariance coincides with the property of being a Poisson trace. Now let $\tau_0$ be a Poisson trace and let $g \in \mathcal{D}$. Then for all $x \in \mathfrak{g}$ we have $\tau_0([\hat{x}, g]_{\star_{\mathrm{BCH}}}) = \nu \tau_0(\{\hat{x}, g\}) = 0$ by the strong invariance of $\star_{\mathrm{BCH}}$. But since $\mathrm{Pol}^1(\mathfrak{g}^*)[\nu]$ together with the constants generates $\mathrm{Pol}^\bullet(\mathfrak{g}^*)[\nu]$, we have $\tau_0([p, g]_{\star_{\mathrm{BCH}}}) = 0$ for every polynomial $p$. Together with the fact that the polynomials are dense in $C^\infty(\mathfrak{g}^*)$ and $\tau_0$ has the above continuity, it follows that $\tau_0$ is a $\star_{\mathrm{BCH}}$-trace. Now if $\tau$ is a $\star_{\mathrm{BCH}}$-trace then $\tau_0$ is a Poisson trace and hence a $\star_{\mathrm{BCH}}$-trace itself. Thus $\tau - \tau_0$ is still a $\star_{\mathrm{BCH}}$-trace and a simple induction proves the theorem.

The somewhat technical continuity property needed above turns out to be rather mild. In the main example it is trivially fulfilled:

Example 4.2.
(i) Let $\iota : \mathcal{O} \hookrightarrow \mathfrak{g}^*$ be a not necessarily closed but embedded coadjoint orbit and consider $\mathcal{D} = C_{\mathcal{O}}^\infty(\mathfrak{g}^*)$. Then the integration with respect to the Liouville measure $\Omega_{\mathcal{O}}$ on $\mathcal{O}$,
\[ \tau_{\mathcal{O}}(f) := \int_{\mathcal{O}} \iota^* f\, \Omega_{\mathcal{O}}, \tag{4.2} \]
is a $\star_{\mathrm{BCH}}$-trace on $C_{\mathcal{O}}^\infty(\mathfrak{g}^*)[[\nu]]$.
(ii) If in addition $\Delta$ is a $\mathfrak{g}$-invariant differential operator on $\mathfrak{g}^*$ then $\tau_{\mathcal{O}}^\Delta$, defined by
\[ \tau_{\mathcal{O}}^\Delta(f) := \tau_{\mathcal{O}}(\Delta f) = \int_{\mathcal{O}} \iota^*(\Delta f)\, \Omega_{\mathcal{O}}, \tag{4.3} \]
is still a trace on $C_{\mathcal{O}}^\infty(\mathfrak{g}^*)[[\nu]]$.
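Example 4.2 (i) can be made concrete for $\mathfrak{g} = \mathfrak{so}(3)$, where $\mathfrak{g}^* \cong \mathbb{R}^3$, the bracket (2.2) becomes $\{f, g\}(\xi) = \xi \cdot (\nabla f \times \nabla g)$, and the non-trivial coadjoint orbits are the spheres around the origin. The sketch below checks the Poisson-trace property $\tau_{\mathcal{O}}(\{f, g\}) = 0$ on two sample polynomials. It assumes that the Liouville measure on a sphere is a constant multiple of the rotation-invariant element $\sin\theta\, d\theta\, d\varphi$ (the constant plays no role for the trace property); the functions `poisson` and `orbit_integral` and the sample polynomials are ours.

```python
import sympy as sp

x, y, z = sp.symbols("x y z", real=True)
theta, phi = sp.symbols("theta phi", real=True)


def poisson(f, g):
    """so(3) bracket on R^3: {f, g}(xi) = xi . (grad f x grad g)."""
    grad_f = sp.Matrix([sp.diff(f, v) for v in (x, y, z)])
    grad_g = sp.Matrix([sp.diff(g, v) for v in (x, y, z)])
    return sp.expand(sp.Matrix([x, y, z]).dot(grad_f.cross(grad_g)))


def orbit_integral(f, radius=1):
    """Integrate f over the sphere of given radius against sin(theta) dtheta dphi."""
    on_orbit = sp.sympify(f).subs(
        {
            x: radius * sp.sin(theta) * sp.cos(phi),
            y: radius * sp.sin(theta) * sp.sin(phi),
            z: radius * sp.cos(theta),
        },
        simultaneous=True,
    )
    inner = sp.integrate(sp.expand(on_orbit * sp.sin(theta)), (phi, 0, 2 * sp.pi))
    return sp.simplify(sp.integrate(inner, (theta, 0, sp.pi)))


# Sample polynomials (arbitrary choices) and the quadratic Casimir.
f = x**2 + y * z
g = x * y + z**2
casimir = x**2 + y**2 + z**2

bracket = poisson(f, g)
trace_of_bracket = orbit_integral(bracket, radius=2)
print(bracket, trace_of_bracket)
```

The Casimir Poisson-commutes with every function, which is the reason the bracket restricts to each sphere; the vanishing of the orbit integral on brackets is then the invariance of the measure under rotations.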
5. Positivity of Traces

If one replaces the formal parameter $\nu$ by a new formal parameter $\lambda$ such that $\nu = i\lambda$ and if one treats $\lambda$ as a real quantity, i.e. $\bar\lambda = \lambda$, then it is well known that the complex conjugation of functions in $C^\infty(\mathfrak{g}^*)[[\lambda]]$ becomes a $^*$-involution for $\star_{\mathrm{BCH}}$. One has
\[ \overline{f \star_{\mathrm{BCH}} g} = \bar{g} \star_{\mathrm{BCH}} \bar{f} \tag{5.1} \]
for all $f, g \in C^\infty(\mathfrak{g}^*)[[\lambda]]$. Such a star product is also called a Hermitian star-product, see e.g. [14] for a detailed discussion. Thus one enters the realm of $^*$-algebras over ordered rings, see [12, 15]. In particular one can ask whether the traces for $\star_{\mathrm{BCH}}$ are positive linear functionals, i.e. satisfy $\tau(\bar{f} \star_{\mathrm{BCH}} f) \geq 0$ in the sense of formal power series, if the corresponding classical functional $\tau_0$ comes from a positive Borel measure on $\mathfrak{g}^*$. In general a classically positive linear functional is no longer positive for a deformed product, see e.g. [12, Sec. 2] for a simple example and [14]. But sometimes one can deform the functional as well in order to make it positive again: in the case of star-products on symplectic manifolds this is always possible [14, Proposition 5.1]. Such deformations are called positive deformations.

In our case we are faced with the question whether we can deform the traces $\tau_{\mathcal{O}}$ such that on the one hand they are still traces and on the other hand they are positive. One strategy could be the following: first prove that the trace can be deformed into a positive functional, perhaps losing the trace property. Second, average over the group in order to obtain a $\mathfrak{g}$-invariant functional and hence a trace. This would require a compact group. However, we shall follow another idea giving some additional insight into the problem. Nevertheless we first ask the following question as a general problem in deformation quantization of Poisson manifolds:

Question 5.1. Is every Hermitian star-product on a Poisson manifold a positive deformation?

We shall now consider the following more particular case. We assume the group $G$ to be compact and $\iota : \mathcal{O} \hookrightarrow \mathfrak{g}^*$ to be a regular coadjoint orbit. Then we want to find a positive trace for $\star_{\mathrm{BCH}}$ with zeroth order given by $\tau_{\mathcal{O}}$ as in (4.2). The construction is based on the following theorem, which is of independent interest:

Theorem 5.2. Let $G$ be compact and let $\iota : \mathcal{O} \hookrightarrow \mathfrak{g}^*$ be a regular coadjoint orbit. Then there exists a star-product $\star_{\mathcal{O}}$ on the symplectic manifold $\mathcal{O}$ and a series of $\mathfrak{g}$-invariant differential operators $S = \mathrm{id} + \sum_{r=1}^\infty \lambda^r S_r$ on $\mathfrak{g}^*$ such that the deformed restriction map
\[ \boldsymbol{\iota}^* = \iota^* \circ S : C^\infty(\mathfrak{g}^*)[[\lambda]] \to C^\infty(\mathcal{O})[[\lambda]] \tag{5.2} \]
becomes a real surjective homomorphism of star-products, i.e.
\[ \boldsymbol{\iota}^* f \star_{\mathcal{O}} \boldsymbol{\iota}^* g = \boldsymbol{\iota}^*(f \star_{\mathrm{BCH}} g) \quad\text{and}\quad \overline{\boldsymbol{\iota}^* f} = \boldsymbol{\iota}^* \bar{f} \tag{5.3} \]
for all $f, g \in C^\infty(\mathfrak{g}^*)[[\lambda]]$. Hence $\star_{\mathcal{O}}$ becomes a Hermitian deformation.
One can view this theorem as a certain 'deformed tangentiality property' of the star product $\star_{\mathrm{BCH}}$: though $\star_{\mathrm{BCH}}$ is not tangential (i.e. it does not restrict to all orbits), for a particular orbit it can be arranged that it restricts by deforming the restriction map, see [16] for a more detailed discussion. From this theorem and [12, Lemma 2] we immediately obtain a positive trace deforming $\tau_{\mathcal{O}}$:

Corollary 5.3. Let $G$ be compact and $\iota : \mathcal{O} \hookrightarrow \mathfrak{g}^*$ a regular orbit with deformed restriction map $\boldsymbol{\iota}^*$ as in (5.2). Then the functional
\[ \boldsymbol{\tau}_{\mathcal{O}}(f) := \int_{\mathcal{O}} \boldsymbol{\iota}^* f\, \Omega_{\mathcal{O}} \tag{5.4} \]
is a positive trace with classical limit $\tau_{\mathcal{O}}$. In particular, $\star_{\mathcal{O}}$ is strongly closed.

Thus it remains to prove Theorem 5.2. We shall use arguments from phase space reduction of star-products via the BRST formalism as discussed in detail in [7]. In order to make this article self-contained we recall the basic steps of [7] adapted to the case of Poisson manifolds.

Proof of Theorem 5.2. Since $\mathcal{O}$ is assumed to be a regular orbit, there are real-valued Casimir polynomials $J_1, \ldots, J_k \in \mathrm{Pol}^\bullet(\mathfrak{g}^*)$ such that $\mathcal{O}$ can be written as the level surface $\mathcal{O} = J^{-1}(\{0\})$ for the map $J = (J_1, \ldots, J_k) : \mathfrak{g}^* \to \mathbb{R}^k$, where $0$ is a regular value. Since the components of $J$ commute with respect to the Poisson bracket, this can be viewed as a moment map $J : \mathfrak{g}^* \to \mathfrak{t}^*$, where $\mathfrak{t}^*$ is the dual of the $k$-dimensional Abelian Lie algebra $\mathfrak{t}$. Moreover, the $J$'s are in the Poisson center, whence the corresponding torus action is trivial.

Since the differential operators $S_r$ will only be needed near $\mathcal{O}$, it is sufficient to construct them in a tubular neighbourhood around $\mathcal{O}$. In fact, a globalization beyond is also easily obtained, see [7, Lemma 6]. As $0$ is a regular value of $J$ we can use $J$ for the transversal coordinates and find a $G$-invariant tubular neighbourhood $U$ of $\mathcal{O}$. On $U$ we can define the following maps. First we need a prolongation map $\mathrm{prol} : C^\infty(\mathcal{O}) \hookrightarrow C^\infty(U)$ given by
\[ (\mathrm{prol}\, \phi)(o, \mu) = \phi(o), \tag{5.5} \]
where $o \in \mathcal{O}$ and $\mu \in \mathfrak{t}^*$ is the transversal coordinate in $U$. Next we consider $\bigwedge^\bullet(\mathfrak{t}) \otimes C^\infty(\mathfrak{g}^*)$ and define the Koszul coboundary operator $\partial$ by the (left-)insertion of $J$, i.e. $\partial(t \otimes f) = \sum_l i(e^l) t \otimes J_l f$, where $J = \sum_l e^l J_l$. Clearly $\partial$ is $G$-invariant with respect to the $G$-action $g^*(t \otimes f) = t \otimes g^* f$, and $\partial^2 = 0$. We denote the homogeneous components of $\partial$ by $\partial_l : \bigwedge^l(\mathfrak{t}) \otimes C^\infty(\mathfrak{g}^*) \to \bigwedge^{l-1}(\mathfrak{t}) \otimes C^\infty(\mathfrak{g}^*)$ for $l \geq 1$. In the case $l = 0$ we set $\partial_0 = \iota^*$, and clearly $\iota^* \partial_1 = 0$. Finally, we define the chain homotopy $h$ on $\bigwedge^\bullet(\mathfrak{t}) \otimes C^\infty(U)$ by
\[ h(t \otimes f)(o, \mu) = \sum_{l=1}^{k} e_l \wedge t \otimes \int_0^1 \frac{\partial f}{\partial \mu_l}(o, s\mu)\, s^k\, ds, \tag{5.6} \]
and denote the corresponding homogeneous components by $h_l$. For convenience we set $h_{-1} = \mathrm{prol}$. Then $h$ is obviously $G$-invariant and it is indeed a chain homotopy for $\partial$, i.e. for all $l = 0, \ldots, k$ we have
\[ h_{l-1} \partial_l + \partial_{l+1} h_l = \mathrm{id}_{\bigwedge^l(\mathfrak{t}) \otimes C^\infty(U)}. \tag{5.7} \]
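In the lowest degree $l = 0$ the homotopy identity (5.7) reads $\mathrm{prol}\, \iota^* f + \sum_l J_l\, (h_0 f)_l = f$, which in the tubular coordinates $(o, \mu)$ with $J_l = \mu_l$ is just the fundamental theorem of calculus along the ray $s \mapsto s\mu$. The sketch below verifies this symbolically for two transversal coordinates. It is an illustration under our own assumptions: the orbit is represented by a single coordinate `o`, and the power of $s$ in (5.6) is read as the form degree of $t$, hence $s^0 = 1$ on functions. All function names are ours.

```python
import sympy as sp

o, s = sp.symbols("o s")
mu = sp.symbols("mu1 mu2")  # transversal coordinates, J_l = mu_l on U


def restrict(f):
    """iota^*: restriction to the orbit mu = 0."""
    return f.subs({m: 0 for m in mu})


def prol(phi):
    """Prolongation (5.5): extend phi(o) constantly in the transversal directions."""
    return phi


def h0(f):
    """Degree-zero part of (5.6): components of the 1-form sum_l e_l (x) g_l."""
    scaled = {m: s * m for m in mu}
    return [
        sp.integrate(sp.diff(f, m).subs(scaled, simultaneous=True), (s, 0, 1))
        for m in mu
    ]


def koszul1(components):
    """partial_1: insert J into a 1-form, i.e. sum_l mu_l * g_l."""
    return sum(m * g for m, g in zip(mu, components))


# Sample function on the tubular neighbourhood (an arbitrary choice).
f = o**2 * mu[0] + sp.sin(o) * mu[0] * mu[1] ** 2 + 3 * mu[1] + sp.cos(o)

# Left-hand side of (5.7) for l = 0: h_{-1} partial_0 f + partial_1 h_0 f.
lhs = prol(restrict(f)) + koszul1(h0(f))
residual = sp.simplify(lhs - f)
print(residual)
```

The same helpers also give $\iota^*\, \mathrm{prol} = \mathrm{id}$ and $h_0\, \mathrm{prol} = 0$, since a prolonged function does not depend on $\mu$.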
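Moreover, one has the obvious identities
\[ \iota^*\, \mathrm{prol} = \mathrm{id}_{C^\infty(\mathcal{O})} \quad\text{and}\quad h_0\, \mathrm{prol} = 0. \tag{5.8} \]
In a next step we quantize the above chain complex and its homotopy. The first easy observation is that the star-product $\star_{\mathrm{BCH}}$ is strongly $\mathfrak{t}$-invariant, i.e. the components of $J$ are in the center of $\star_{\mathrm{BCH}}$, too. Thus we can define a deformed Koszul operator $\boldsymbol{\partial}$ on the space $(\bigwedge^\bullet(\mathfrak{t}) \otimes C^\infty(\mathfrak{g}^*))[[\lambda]]$ by
\[ \boldsymbol{\partial}(t \otimes f) = \sum_l i(e^l) t \otimes f \star_{\mathrm{BCH}} J_l. \tag{5.9} \]
Then we still have $\boldsymbol{\partial}^2 = 0$ as well as $\overline{\boldsymbol{\partial}(t \otimes f)} = \boldsymbol{\partial}(t \otimes \bar{f})$, since the $J_l$ commute and are real. Moreover, $\boldsymbol{\partial}$ is still $G$-invariant. In a next step one constructs the deformations of $h$ and $\iota^*$ as follows. We define $\boldsymbol{h}_{-1} = \mathrm{prol}$ without deformation and set
\[ \boldsymbol{\partial}_0 := \boldsymbol{\iota}^* := \iota^* \bigl(\mathrm{id} - (\partial_1 - \boldsymbol{\partial}_1) h_0\bigr)^{-1} \quad\text{and}\quad \boldsymbol{h}_l := h_l \bigl(h_{l-1} \boldsymbol{\partial}_l + \boldsymbol{\partial}_{l+1} h_l\bigr)^{-1}. \tag{5.10} \]
Clearly the inverse operators used here exist as formal power series thanks to (5.7). The proof of the following lemma is completely analogous to the proofs of [7, Propositions 25 and 26]. The $G$-invariance is obvious.

Lemma 5.4. The operators $\boldsymbol{\iota}^*$ and $\boldsymbol{h}$ are $G$-invariant and fulfill the relations
\[ \boldsymbol{h}_{l-1} \boldsymbol{\partial}_l + \boldsymbol{\partial}_{l+1} \boldsymbol{h}_l = \mathrm{id}_{\bigwedge^l(\mathfrak{t}) \otimes C^\infty(U)[[\lambda]]} \tag{5.11} \]
as well as
\[ \boldsymbol{\iota}^* \boldsymbol{\partial}_1 = 0 \quad\text{and}\quad \boldsymbol{\iota}^*\, \mathrm{prol} = \mathrm{id}_{C^\infty(\mathcal{O})[[\lambda]]}. \tag{5.12} \]

Having the deformed restriction map and the chain homotopy, it is quite easy to characterize the ideal generated by the 'constraints' $J$:

Lemma 5.5. Let $I(J)$ be the (automatically two-sided) ideal generated by $J_1, \ldots, J_k$. Then the map $\boldsymbol{\iota}^* : C^\infty(U)[[\lambda]] \to C^\infty(\mathcal{O})[[\lambda]]$ is surjective and
\[ \ker \boldsymbol{\iota}^* = \mathrm{im}\, \boldsymbol{\partial}_1 = I(J). \tag{5.13} \]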
Thus we can simply define $\star_{\mathcal{O}}$ by (5.3), which gives a well-defined star-product on the quotient. It is an easy computation that the first order commutator of $\star_{\mathcal{O}}$ indeed gives the desired Poisson bracket. Moreover, since the $J$'s are real, the ideal generated by them is automatically a $^*$-ideal. Since $h_0$ as well as $\partial$ and $\boldsymbol{\partial}$ are real operators, it follows that $\boldsymbol{\iota}^*$ is real, too.

It remains to show that $\boldsymbol{\iota}^*$ can be written by use of a series of differential operators $S_r$. This is not obvious, as we used the non-local homotopy $h_0$ in order to define $\boldsymbol{\iota}^*$. However, one can show the existence of the $S_r$ in the same manner as in [7, Lemma 27]. Note that this is not even necessary for Corollary 5.3.

Note that in the above construction one does not need the 'full' machinery of the BRST reduction but only the Koszul part. The reason is that in this case the coadjoint orbit plays the role of the 'constraint surface' and the reduced phase space at once.

Remark 5.6. It seems that the above statement is not the most general one can obtain: there are certainly more general orbits and also non-compact groups where one can find such deformed restriction maps. We leave this as an open question for future projects.

6. GNS Representation of the Positive Traces

Throughout this section we assume that $G$ is compact and $\iota : \mathcal{O} \hookrightarrow \mathfrak{g}^*$ is a regular orbit. Then we investigate the GNS representation of the positive trace $\boldsymbol{\tau}_{\mathcal{O}}$ as constructed in the last section.

Let us briefly recall the basic steps of the GNS construction, see [12]. Given a $^*$-algebra $\mathcal{A}$ over $\mathbb{C}[[\lambda]]$ with a positive linear functional $\omega : \mathcal{A} \to \mathbb{C}[[\lambda]]$, one finds that $\mathcal{J}_\omega = \{A \in \mathcal{A} \mid \omega(A^* A) = 0\}$ is a left ideal of $\mathcal{A}$, the so-called Gel'fand ideal of $\omega$. Then $\mathcal{H}_\omega := \mathcal{A}/\mathcal{J}_\omega$ becomes a pre-Hilbert space over $\mathbb{C}[[\lambda]]$ via $\langle \psi_A, \psi_B \rangle_\omega := \omega(A^* B)$, where $\psi_A \in \mathcal{H}_\omega$ denotes the equivalence class of $A$. Finally, the left representation $\pi_\omega(A) \psi_B = \psi_{AB}$ of $\mathcal{A}$ on $\mathcal{H}_\omega$ turns out to be a $^*$-representation, i.e. one has $\langle \psi_B, \pi_\omega(A) \psi_C \rangle_\omega = \langle \pi_\omega(A^*) \psi_B, \psi_C \rangle_\omega$.

According to Theorem 5.2 we have in our case a surjective $^*$-homomorphism
\[ \boldsymbol{\iota}^* : C^\infty(\mathfrak{g}^*)[[\lambda]] \to C^\infty(\mathcal{O})[[\lambda]] \tag{6.1} \]
and a positive linear functional $\boldsymbol{\tau}_{\mathcal{O}}$ which is the pull-back of a positive linear functional on $C^\infty(\mathcal{O})[[\lambda]]$ under $\boldsymbol{\iota}^*$, namely the trace $\mathrm{tr}_{\mathcal{O}}$ on $\mathcal{O}$. Thus we can use the functoriality properties of the GNS construction, see [9, Proposition 5.1 and Corollary 5.2], in order to relate the GNS construction for $\boldsymbol{\tau}_{\mathcal{O}}$ with the one for $\mathrm{tr}_{\mathcal{O}}$, which is well known, see [40, Sec. 5] and [11, Lemma 4.3]. Since $\mathrm{tr}_{\mathcal{O}}$ is a faithful functional, the GNS representation of $C^\infty(\mathcal{O})[[\lambda]]$ with respect to $\mathrm{tr}_{\mathcal{O}}$ is simply given by left multiplication $L$ with respect to $\star_{\mathcal{O}}$, where $\mathcal{H}_{\mathrm{tr}_{\mathcal{O}}} = C^\infty(\mathcal{O})[[\lambda]]$. Thus we arrive at the following theorem, which can also be checked directly:

Theorem 6.1. Let $G$ be compact, $\iota : \mathcal{O} \hookrightarrow \mathfrak{g}^*$ a regular orbit, and $\boldsymbol{\tau}_{\mathcal{O}} = \mathrm{tr}_{\mathcal{O}} \circ \boldsymbol{\iota}^*$ the positive trace as in (5.4).
(i) $\mathrm{supp}\, \boldsymbol{\tau}_{\mathcal{O}} = \iota(\mathcal{O})$.
(ii) The Gel'fand ideal $\mathcal{J}_{\boldsymbol{\tau}_{\mathcal{O}}}$ of $\boldsymbol{\tau}_{\mathcal{O}}$ coincides with $\ker \boldsymbol{\iota}^*$.
(iii) The GNS pre-Hilbert space $\mathcal{H}_{\boldsymbol{\tau}_{\mathcal{O}}}$ is unitarily isomorphic to $C^\infty(\mathcal{O})[[\lambda]]$ endowed with the inner product $\langle \phi, \chi \rangle_{\mathcal{O}} := \mathrm{tr}_{\mathcal{O}}(\bar\phi \star_{\mathcal{O}} \chi)$ via
\[ U : \mathcal{H}_{\boldsymbol{\tau}_{\mathcal{O}}} \ni \psi_f \mapsto \boldsymbol{\iota}^* f \in C^\infty(\mathcal{O})[[\lambda]] \quad\text{with inverse}\quad U^{-1} : \phi \mapsto \psi_{\mathrm{prol}\, \phi}. \tag{6.2} \]
(iv) For the GNS representation $\pi_{\boldsymbol{\tau}_{\mathcal{O}}}$ one obtains
\[ \pi_{\mathcal{O}}(f) \phi := U \pi_{\boldsymbol{\tau}_{\mathcal{O}}}(f) U^{-1} \phi = \boldsymbol{\iota}^*(f \star_{\mathrm{BCH}} \mathrm{prol}\, \phi) = L_{\boldsymbol{\iota}^* f}\, \phi. \tag{6.3} \]

Since the group $G$ acts on $\mathcal{O}$ and since all relevant maps are $G$-invariant/equivariant, we arrive at the following $G$-invariance of the representation. This can be checked either directly or follows again from [9, Proposition 5.1 and Corollary 5.2].

Lemma 6.2. The GNS representation $\pi_{\mathcal{O}}$ is $G$-equivariant in the sense that
\[ \pi_{\mathcal{O}}(g^* f)\, g^* \phi = g^*(\pi_{\mathcal{O}}(f) \phi) \tag{6.4} \]
for all $\phi \in C^\infty(\mathcal{O})[[\lambda]]$, $f \in C^\infty(\mathfrak{g}^*)[[\lambda]]$ and $g \in G$. Moreover, the $G$-representation on $C^\infty(\mathcal{O})[[\lambda]]$ is unitary.

Let us finally mention a few properties of the commutant of $\pi_{\mathcal{O}}$ and the 'baby-version' of the Tomita–Takesaki theory arising from this representation. The following statements follow almost directly from the considerations in [40, Sec. 7]. We consider the anti-linear map
\[ J : \phi \mapsto \bar\phi, \tag{6.5} \]
where $\phi \in C^\infty(\mathcal{O})[[\lambda]]$, which is clearly anti-unitary with respect to the inner product $\langle \cdot, \cdot \rangle_{\mathcal{O}}$ and involutive. This map plays the role of the modular conjugation. The modular operator $\Delta$ is just the identity map since in our case the linear functional is a trace, i.e. a KMS functional for inverse temperature $\beta = 0$. Then we can characterize the commutant of the representation $\pi_{\mathcal{O}}$ as follows:

Proposition 6.3. For $f \in C^\infty(\mathfrak{g}^*)[[\lambda]]$ we denote by $R_{\boldsymbol{\iota}^* f}$ the right multiplication with $\boldsymbol{\iota}^* f$ with respect to the star-product $\star_{\mathcal{O}}$. Then the map
\[ \pi_{\mathcal{O}}(f) = L_{\boldsymbol{\iota}^* f} \mapsto J L_{\boldsymbol{\iota}^* f} J = R_{\boldsymbol{\iota}^* \bar{f}} \tag{6.6} \]
is an anti-linear bijection onto the commutant $\pi_{\mathcal{O}}'$ of $\pi_{\mathcal{O}}$.
Note that in this particularly simple case the modular one-parameter group Ut is just the identity Ut = idC ∞ (O)[[λ]] , since we have a trace. More generally, one could also consider KMS functionals of the form f 7→ τO (Exp(−βH) ?BCH f ) where H ∈ C ∞ (g∗ )[[λ]] and Exp denotes the star exponential with respect to ?BCH and β ∈ R is the ‘inverse temperature’. From the above proposition we immediately have the following result on the relation between the g-representations on C ∞ (O)[[λ]] arising from the GNS construction. Lemma 6.4. For x, y ∈ g we have [ πO (ˆ x)πO (ˆ y ) − πO (ˆ y )πO (ˆ x) = iλπO ([x, y]) Rι∗ xˆ Rι∗ yˆ − Rι∗ yˆRι∗ xˆ = −iλRι∗ [x,y] [
(6.7) (6.8)
and
\[ \pi_O(\hat x) - R_{\iota^*\hat x} = i\lambda\, \mathcal{L}_{x_O}, \tag{6.9} \]
where $\mathcal{L}_{x_O}$ denotes the Lie derivative in the direction of the fundamental vector field of $x$.

7. Traces for Deformations via Group Actions

Let us now describe a quite general mechanism for constructing deformations and traces via group actions. We first consider the algebraic part of the construction. Let $G$ be a group and denote the right translations by $R_g : h \mapsto hg$, where $g, h \in G$. The left translations are denoted by $L_g$, respectively. Moreover, let $A_G \subseteq \mathrm{Fun}(G)$ be a sub-algebra of the complex-valued functions on $G$, closed under complex conjugation. We require $R_g^* A_G \subseteq A_G$ for all $g \in G$. Then an associative formal deformation $(A_G[[\lambda]], \star_G)$ of $A_G$ is called a (right) universal deformation if it is right-invariant, i.e.
\[ R_g^*(f_1 \star_G f_2) = R_g^* f_1 \star_G R_g^* f_2 \tag{7.1} \]
for all $g \in G$ and $f_1, f_2 \in A_G[[\lambda]]$. Thus the right translations act as automorphisms of $\star_G$. In the sequel we shall always assume that $1 \in A_G$ and $1 \star_G f = f = f \star_G 1$.

Remark 7.1. If $G$ is a Lie group and $A_G$ consists of all smooth functions on $G$, then the existence of a right-invariant deformation imposes quite strong conditions on $G$. However, in typical examples one may only deform a smaller class of functions. For instance, the data of a $G$-invariant star product on a homogeneous symplectic space $G \xrightarrow{\pi} H\backslash G$ determines a right deformation of $A_G := \pi^* C^\infty(H\backslash G)$. In the extreme case where $H = \{e\}$, the pair $(A_G, \star_G)$ becomes a star product algebra $(C^\infty(G)[[\lambda]], \star_\lambda)$. The Poisson structure on $G$ associated to the first order term of $\star_\lambda$ is then right-invariant. Its characteristic distribution (generated by Hamiltonian vector fields), being integrable and right-invariant, determines a Lie subalgebra $S$ of $\mathfrak{g} = \mathrm{Lie}(G)$ endowed with a non-degenerate Chevalley 2-cocycle $\Omega$ with respect to the trivial representation of $S$ on $\mathbb{R}$. Lie algebras $(S, \Omega)$ of this type (or rather their associated Lie groups) have been studied by Lichnerowicz et al. When unimodular, such a Lie algebra is solvable [31].

Now consider a set $X$ with a left action $\tau : G \times X \to X$ of $G$. For brevity we shall sometimes write $g.x$ instead of $\tau(g, x)$. We shall use the universal deformation $\star_G$ in order to induce a deformation of a certain sub-algebra of $\mathrm{Fun}(X)$. First we define $\alpha_x : \mathrm{Fun}(X) \to \mathrm{Fun}(G)$ by
\[ (\alpha_x f)(g) = (\tau_g^* f)(x) \tag{7.2} \]
for $x \in X$ and $g \in G$. Having specified $A_G$ we define the space
\[ A_X = \{ f \in \mathrm{Fun}(X) \mid \alpha_x f \in A_G \text{ for all } x \in X \}, \tag{7.3} \]
which is clearly a sub-algebra of $\mathrm{Fun}(X)$ stable under complex conjugation. Let us remark that $A_X$ contains at least those functions on $X$ which are constant along the orbits of $\tau$. Indeed, let $f \in \mathrm{Fun}(X)$ satisfy $f(g.x) = f(x)$ for all $x \in X$ and $g \in G$. Then $(\alpha_x f)(g) = f(g.x) = f(x)$ is constant (not depending on $g$). The deformation $\star_G$ induces canonically an associative deformation $\star_X$ of $A_X$, thereby justifying the name 'universal deformation'. Indeed, define
\[ (f_1 \star_X f_2)(x) = (\alpha_x f_1 \star_G \alpha_x f_2)(e), \tag{7.4} \]
where $e \in G$ denotes the unit element. Then we have the following proposition:

Proposition 7.2. Let $(A_G[[\lambda]], \star_G)$ be a universal deformation and $(A_X[[\lambda]], \star_X)$ as above.
(i) Then $(A_X[[\lambda]], \star_X)$ is an associative formal deformation of $A_X$ which is Hermitian if $\star_G$ is Hermitian. Moreover, $\alpha_x : (A_X[[\lambda]], \star_X) \to (A_G[[\lambda]], \star_G)$ is a homomorphism of associative algebras.
(ii) If $f_1$ is constant on some orbit $G.x_0$ then
\[ (f_1 \star_X f_2)(g.x_0) = f_1(g.x_0) f_2(g.x_0) = (f_2 \star_X f_1)(g.x_0) \tag{7.5} \]
for all functions $f_2 \in A_X[[\lambda]]$. In particular, the $\star_X$-product with a function which is constant along all orbits is the undeformed product. Thus $\star_X$ is 'tangential' to the orbits in a very strong sense.

Proof. Let us first recall a few basic properties of $\alpha_x$, $\tau$, $R$, and $L$. The following relations are straightforward computations:
\[ R_g^* \alpha_x = \alpha_{g.x} \quad\text{and}\quad L_g^* \alpha_x = \alpha_x \tau_g^* . \tag{7.6} \]
Using the right invariance of $\star_G$ and the above rules we find the following relation
\[ \alpha_x(f_1 \star_X f_2) = \alpha_x f_1 \star_G \alpha_x f_2 \tag{7.7} \]
for $f_1, f_2 \in A_X[[\lambda]]$. This implies on the one hand that $A_X[[\lambda]]$ is indeed closed under the multiplication law $\star_X$. On the other hand it follows that $\alpha_x$ is a homomorphism. With (7.7) the associativity of $\star_X$ is a straightforward computation. Finally, if $\star_G$ is Hermitian then $\star_X$ is Hermitian, too, since all involved maps are real, i.e. commute with complex conjugation. For the second part one computes
\[ (f_1 \star_X f_2)(g.x_0) = (\alpha_{x_0} f_1 \star_G \alpha_{x_0} f_2)(g). \tag{7.8} \]
Now $\alpha_{x_0} f_1$ is constant, whence the $\star_G$-product is the pointwise product. Thus the claim easily follows. If this holds even for all orbits and not just for $G.x_0$, then the $\star_X$-product with $f_1$ is the pointwise product globally.

Remark 7.3. From (7.5) we conclude that, heuristically speaking, the deformation $\star_X$ becomes more non-trivial the larger the orbits of $\tau$ are.
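The relation (7.7), on which the proof of Proposition 7.2 rests, follows in a few lines from the definition (7.4), the first rule in (7.6) and the right invariance (7.1). The following is a sketch of that computation, evaluated at an arbitrary $g \in G$; it uses only the notation of the text and the convention $(\alpha_x f)(g) = f(g.x)$.

```latex
\begin{align*}
\bigl(\alpha_x(f_1 \star_X f_2)\bigr)(g)
  &= (f_1 \star_X f_2)(g.x) \\
  &= (\alpha_{g.x} f_1 \star_G \alpha_{g.x} f_2)(e)
     && \text{by (7.4)} \\
  &= (R_g^*\alpha_x f_1 \star_G R_g^*\alpha_x f_2)(e)
     && \text{by (7.6)} \\
  &= \bigl(R_g^*(\alpha_x f_1 \star_G \alpha_x f_2)\bigr)(e)
     && \text{by (7.1)} \\
  &= (\alpha_x f_1 \star_G \alpha_x f_2)(g).
\end{align*}
```

Associativity of $\star_X$ then reduces to that of $\star_G$: by (7.4) and (7.7), both $((f_1 \star_X f_2) \star_X f_3)(x)$ and $(f_1 \star_X (f_2 \star_X f_3))(x)$ equal the value at $e$ of the triple $\star_G$-product of $\alpha_x f_1$, $\alpha_x f_2$, $\alpha_x f_3$.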
Remark 7.4. Given a right universal deformation $(A_G, \star_R)$, one gets a left universal deformation $(A_G, \star_L)$ via the formula
\[ a \star_L b = \iota^*(\iota^* a \star_R \iota^* b) \tag{7.9} \]
provided $A_G$ is a bi-invariant subspace. Here $\iota : G \to G$ denotes the inversion map $g \mapsto g^{-1}$. Starting with a left-invariant deformation $(A_G, \star_G)$ of $G$ and an action $\tau : G \times X \to X$, the associated deformation of $A_X$ is then defined by the formula
\[ (f_1 \star f_2)(x) = (\iota^*\alpha_x f_1 \star_G \iota^*\alpha_x f_2)(e). \tag{7.10} \]
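That (7.9) indeed yields a left-invariant product can be checked from two intertwining identities between the translations and the inversion. The following is a sketch, assuming (as the bi-invariance hypothesis of Remark 7.4 is read here) that $A_G$ is stable under $\iota^*$ and under both kinds of translations, so that every pull-back below stays in $A_G$.

```latex
\begin{align*}
L_g^*\,\iota^* &= \iota^* R_{g^{-1}}^*,
\qquad
R_{g^{-1}}^*\,\iota^* = \iota^* L_g^*, \\[4pt]
L_g^*(a \star_L b)
  &= L_g^*\,\iota^*\bigl(\iota^* a \star_R \iota^* b\bigr) \\
  &= \iota^* R_{g^{-1}}^*\bigl(\iota^* a \star_R \iota^* b\bigr) \\
  &= \iota^*\bigl(R_{g^{-1}}^*\iota^* a \star_R R_{g^{-1}}^*\iota^* b\bigr)
     && \text{right invariance of } \star_R \\
  &= \iota^*\bigl(\iota^* L_g^* a \star_R \iota^* L_g^* b\bigr)
   = L_g^* a \star_L L_g^* b .
\end{align*}
```

Both intertwining identities follow by evaluating at $h \in G$: for instance $(L_g^*\iota^* a)(h) = a(h^{-1}g^{-1}) = (\iota^* R_{g^{-1}}^* a)(h)$.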
In some interesting cases, in particular in the Abelian case, the universal deformation $\star_G$ is also left-invariant, i.e. the left translations $L_g^*$ act as automorphisms of $\star_G$, too. In this situation the induced deformation $\star_X$ is invariant under $\tau_g^*$:

Lemma 7.5. Let $A_G$ be in addition left-invariant and let $\star_G$ be a bi-invariant universal deformation. Then $A_X$ is invariant under $\tau_g^*$ for all $g \in G$ and
\[ \tau_g^*(f_1 \star_X f_2) = \tau_g^* f_1 \star_X \tau_g^* f_2 . \tag{7.11} \]
Proof. This is a straightforward computation using only the definitions and (7.6).

Our main interest in the universal deformations comes from the following simple observation:

Theorem 7.6. Let $(A_G[[\lambda]], \star_G)$ be a right universal deformation and let $\mathrm{tr}_G : A_G[[\lambda]] \to \mathbb{C}[[\lambda]]$ be a trace with respect to $\star_G$. Let $\Phi : \mathrm{Fun}(X)[[\lambda]] \to \mathbb{C}[[\lambda]]$ be an arbitrary $\mathbb{C}[[\lambda]]$-linear functional. Then $\mathrm{tr}_\Phi : A_X[[\lambda]] \to \mathbb{C}[[\lambda]]$ defined by
\[ \mathrm{tr}_\Phi(f) = \Phi\bigl(x \mapsto \mathrm{tr}_G(\alpha_x f)\bigr) \tag{7.12} \]
is a trace with respect to $\star_X$.

Proof. This follows directly from the homomorphism property of $\alpha_x$ and the trace property of $\mathrm{tr}_G$.

In particular, the trace $\mathrm{tr}_G$ combined with the evaluation functional at some point $x \in X$,
\[ \mathrm{tr}_x : f \mapsto \mathrm{tr}_G(\alpha_x f), \tag{7.13} \]
yields a trace for $\star_X$. Thus the only difficult task is to find traces for $\star_G$. As a last remark we shall discuss the positivity of the traces $\mathrm{tr}_\Phi$. We assume that $\mathrm{tr}_G$ is a positive trace, whence $\mathrm{tr}_G(\bar f \star_G f) \ge 0$ in the sense of formal power series for all $f \in A_G[[\lambda]]$.

Lemma 7.7. Assume $\mathrm{tr}_G$ is a positive trace and $\Phi$ takes non-negative values on non-negative valued functions on $X$. Then $\mathrm{tr}_\Phi$ is positive. In particular $\mathrm{tr}_x$ is always positive.
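Both Theorem 7.6 and Lemma 7.7 reduce to short computations based on (7.7). The sketch below spells them out; it uses that $\alpha_x$, being a pull-back, commutes with complex conjugation, and it leaves aside the finer points of positivity for formal power series.

```latex
\begin{align*}
\operatorname{tr}_\Phi(f_1 \star_X f_2)
  &= \Phi\bigl(x \mapsto \operatorname{tr}_G(\alpha_x f_1 \star_G \alpha_x f_2)\bigr)
     && \text{by (7.7)} \\
  &= \Phi\bigl(x \mapsto \operatorname{tr}_G(\alpha_x f_2 \star_G \alpha_x f_1)\bigr)
     && \text{trace property of } \operatorname{tr}_G \\
  &= \operatorname{tr}_\Phi(f_2 \star_X f_1), \\[4pt]
\operatorname{tr}_\Phi(\bar f \star_X f)
  &= \Phi\bigl(x \mapsto
       \operatorname{tr}_G(\overline{\alpha_x f} \star_G \alpha_x f)\bigr)
   \;\ge\; 0 .
\end{align*}
```

In the last line the argument of $\Phi$ is a non-negative function of $x$ because $\operatorname{tr}_G$ is positive, so the hypothesis on $\Phi$ applies; for the evaluation functional at a point this hypothesis holds automatically.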
Remark 7.8. The above construction has the big advantage that it can be transferred to the framework of topological deformations instead of formal deformations. This has indeed been done by Rieffel [35] in a $C^*$-algebraic framework for actions of $\mathbb{R}^d$. For a class of non-abelian groups this has been done in [6].

Let us finally mention two examples. The first one is the well-known example of the Weyl–Moyal product for $\mathbb{R}^{2n}$ and the second is obtained as the asymptotic version of [6] for rank one Iwasawa subgroups of $SU(1, n)$.

Example 7.9. Let $\star_{\mathrm{Weyl}}$ be the Weyl–Moyal star product on $\mathbb{R}^{2n}$, explicitly given by
\[ f \star_{\mathrm{Weyl}} g = \mu \circ e^{\frac{\lambda}{2i} \sum_k (\partial_{q^k} \otimes \partial_{p_k} - \partial_{p_k} \otimes \partial_{q^k})} f \otimes g, \tag{7.14} \]
where $\mu(f \otimes g) = fg$ is the pointwise product and $q^1, \ldots, p_n$ are the canonical Darboux coordinates on $\mathbb{R}^{2n}$. Clearly $\star_{\mathrm{Weyl}}$ is invariant under translations, whence it is a bi-invariant universal deformation of $C^\infty(\mathbb{R}^{2n})[[\lambda]]$. Moreover, it is well known that $\star_{\mathrm{Weyl}}$ is strongly closed, whence the integration with respect to the Liouville measure provides a trace, which is positive. Thus one can apply the above general results to this situation.

Example 7.10. This example is the asymptotic version of [6]. The groups we consider are Iwasawa subgroups $G = AN$ of $SU(1, n)$, where $SU(1, n) = ANK$ is an Iwasawa decomposition. One has the obvious $G$-equivariant diffeomorphism $G \to SU(1, n)/K$ (here $K = U(n)$). The group $G$ therefore inherits a left-invariant symplectic (Kähler) structure coming from the one on the rank one Hermitian symmetric space $SU(1, n)/U(n)$. The symplectic group may then be described as follows. As a manifold, one has
\[ G = \mathbb{R} \times \mathbb{R}^{2n} \times \mathbb{R} . \tag{7.15} \]
In these coordinates the group multiplication law reads
\[ L_{(a,x,z)}(a', x', z') = \Bigl( a + a',\; e^{-a'} x + x',\; e^{-2a'} z + z' + \tfrac{1}{2}\, \Omega(x, x')\, e^{-a'} \Bigr), \tag{7.16} \]
where $\Omega$ is a constant symplectic structure on the vector space $\mathbb{R}^{2n}$. The 2-form
\[ \omega = \Omega + da \wedge dz \tag{7.17} \]
then defines a left-invariant symplectic structure on $G$. The universal deformation $\star_{BM}$ we are looking for is a star product for this symplectic structure. Since on $\mathbb{R}^{2n+2}$ all symplectic star products are equivalent, it will be sufficient to describe $\star_{BM}$ by means of an equivalence transformation $T = \mathrm{id} + \sum_{r=1}^{\infty} \lambda^r T_r$ relating $\star_{BM}$ and $\star_{\mathrm{Weyl}}$. In [6] an explicit integral formula for $T$ has been given, which is defined on the Schwartz space $\mathcal{S}(\mathbb{R}^{2n+2})$. It allows for an asymptotic expansion in $\hbar$ and gives indeed the desired equivalence transformation $T$. Then $\star_{BM}$ defined by
\[ f \star_{BM} g = T^{-1}(Tf \star_{\mathrm{Weyl}} Tg) \tag{7.18} \]
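is a left-invariant universal deformation of $G$ and again we can use this to apply the above results on universal deformations. Moreover, since $\star_{\mathrm{Weyl}}$ is strongly closed, the functional
\[ \mathrm{tr}^G(f) := \int_G T(f)\, \omega^{n+1} \tag{7.19} \]
defines a trace functional for $\star_{BM}$ on $C_0^\infty(G)[[\lambda]]$. This is again positive since $T$ is real, i.e. $\overline{Tf} = T\bar f$.

In what follows we give a precise description of the star product $\star_{BM}$ in the two-dimensional case, i.e. on the group $ax + b$. The higher-dimensional case is similar but more intricate. The non-formal deformed product in the $ax + b$ case is obtained by transforming Weyl's product on $(\mathbb{R}^2, da \wedge d\ell)$ under the equivalence
\[ T = F^{-1} \circ \phi_\hbar^* \circ F, \tag{7.20} \]
where
\[ F u(a, \alpha) = \int e^{-i\alpha\ell}\, u(a, \ell)\, d\ell \quad\text{with}\quad u \in \mathcal{S}(\mathbb{R}^2) \tag{7.21} \]
is the partial Fourier transform in the second variable and where $\phi_\hbar : \mathbb{R}^2 \to \mathbb{R}^2$ is the one-parameter family of diffeomorphisms given by
\[ \phi_\hbar(a, \alpha) = \Bigl( a, \tfrac{1}{\hbar} \sinh(\alpha\hbar) \Bigr) \qquad (\hbar \in \mathbb{R}). \tag{7.22} \]
One has
\[ T u(a, \ell) = c \int e^{i\alpha\ell}\, e^{-\frac{i}{\hbar} \sinh(\alpha\hbar) q}\, u(a, q)\, dq\, d\alpha = c \int e^{i\alpha(\ell - q)}\, e^{-i\psi_\hbar(\alpha) q}\, u(a, q)\, dq\, d\alpha \tag{7.23} \]
with
\[ \psi_\hbar(\alpha) = \sum_{k \ge 1} \frac{\hbar^{2k} \alpha^{2k+1}}{(2k+1)!} . \tag{7.24} \]
Setting $p = \hbar\alpha$, one gets
\[ T u(a, \ell) = \frac{c}{\hbar} \int e^{\frac{i}{\hbar} p(\ell - q)}\, e^{-\frac{i}{\hbar} \psi_1(p) q}\, u(a, q)\, dq\, dp, \tag{7.25} \]
which precisely coincides with
\[ \bigl( \mathrm{id} \otimes \mathrm{Op}_{\hbar,1}( e^{-\frac{i}{\hbar} \psi_1(p) q} ) \bigr) u(a, \ell), \tag{7.26} \]
where $\mathrm{Op}_{\hbar,1} f(p, q)$ denotes the anti-normally ordered quantization of the function $f(q, p)$. Recall that the $\kappa$-ordered pseudodifferential quantization rule on $(\mathbb{R}^2, dq \wedge dp)$ is defined (at the level of test functions) by $\mathrm{Op}_{\hbar,\kappa} : \mathcal{D}(\mathbb{R}^2) \to \mathrm{End}(L^2(\mathbb{R}))$ with
\[ \mathrm{Op}_{\hbar,\kappa}(f)\varphi(q) = \frac{c}{\hbar} \int e^{\frac{i}{\hbar} p(q - \xi)}\, f(\kappa\xi + (1 - \kappa) q, p)\, \varphi(\xi)\, d\xi\, dp \qquad (\kappa \in [0, 1]). \tag{7.27} \]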
The explicit asymptotic expansion formula for $\mathrm{Op}_{\hbar,\kappa}(f)$ is well known, see e.g. [38, Sec. 1.2, p. 231 and Eq. (58), p. 258]. It yields an expression for the equivalence $T$ at the formal level which we write, with natural delicacy, as
\[ T = \mathrm{id} \otimes \exp\Bigl( \tfrac{i}{\lambda}\, \psi_1\bigl( \tfrac{\lambda}{i} \partial_\ell \bigr) . \ell \Bigr), \tag{7.28} \]
where the operator $T^{(\ell)} := \exp\bigl( \tfrac{i}{\lambda} \psi_1( \tfrac{\lambda}{i} \partial_\ell ) . \ell \bigr)$ is to be understood as anti-normally ordered ($\kappa = 1$). Observe the reality of the equivalence, which may be directly checked using the fact that the function $\psi_1$ is odd. Moreover, for every right-invariant vector field $X$ on $G = ax + b$, one checks [6] that $T \circ X \circ T^{-1}$ is an inner derivation of the Moyal–Weyl product $\star_{\mathrm{Weyl}}$. In other words, the star product $\star_{BM}$ is left-invariant on $G$.

Acknowledgments

We would like to thank the organizers of the Warwick workshop on quantisation for the excellent working conditions; many ideas of this paper were developed during this workshop. We would also like to thank the referee for his useful suggestions.

References

[1] D. Arnal and N. Ben Amar, Kontsevich's wheels and invariant polynomial functions on the dual of Lie algebras, Lett. Math. Phys. 52 (2000), 291–300.
[2] D. Arnal, N. Ben Amar and M. Masmoudi, Cohomology of good graphs and Kontsevich linear star products, Lett. Math. Phys. 48 (1999), 291–306.
[3] H. Basart, M. Flato, A. Lichnerowicz and D. Sternheimer, Deformation theory applied to quantization and statistical mechanics, Lett. Math. Phys. 8 (1984), 483–494.
[4] F. Bayen, M. Flato, C. Frønsdal, A. Lichnerowicz and D. Sternheimer, Deformation theory and quantization, Ann. Phys. 111 (1978), 61–151.
[5] M. Bertelson, M. Cahen and S. Gutt, Equivalence of star products, Class. Quantum Grav. 14 (1997), A93–A107.
[6] P. Bieliavsky and M. Massar, Strict deformation quantizations for actions of a class of symplectic Lie groups, Prog. Theo. Phys. Suppl. 144 (2001), 1–21.
[7] M. Bordemann, H.-C. Herbig and S. Waldmann, BRST cohomology and phase space reduction in deformation quantization, Commun. Math. Phys. 210 (2000), 107–144.
[8] M. Bordemann, N. Neumaier and S. Waldmann, Homogeneous Fedosov star products on cotangent bundles I: Weyl and standard ordering with differential operator representation, Commun. Math. Phys. 198 (1998), 363–396.
[9] M. Bordemann, N. Neumaier and S. Waldmann, Homogeneous Fedosov star products on cotangent bundles II: GNS representations, the WKB expansion, traces, and applications, J. Geom. Phys. 29 (1999), 199–234.
[10] M. Bordemann, N. Neumaier, M. J. Pflaum and S. Waldmann, On representations of star product algebras over cotangent spaces on Hermitian line bundles, Preprint math.QA/9811055 (November 1998), J. Funct. Anal. 199 (2003), 1–47.
[11] M. Bordemann, H. Römer and S. Waldmann, A remark on formal KMS states in deformation quantization, Lett. Math. Phys. 45 (1998), 49–61.
[12] M. Bordemann and S. Waldmann, Formal GNS construction and states in deformation quantization, Commun. Math. Phys. 195 (1998), 549–583.
[13] H. Bursztyn and S. Waldmann, Deformation quantization of Hermitian vector bundles, Lett. Math. Phys. 53 (2000), 349–365.
[14] H. Bursztyn and S. Waldmann, On positive deformations of ∗-algebras. In: G. Dito, D. Sternheimer (eds.): Conférence Moshé Flato 1999. Quantization, Deformations, and Symmetries, Mathematical Physics Studies no. 22, 69–80. Kluwer Academic Publishers, Dordrecht, Boston, London, 2000.
[15] H. Bursztyn and S. Waldmann, Algebraic Rieffel induction, formal Morita equivalence and applications to deformation quantization, J. Geom. Phys. 37 (2001), 307–364.
[16] M. Cahen, S. Gutt and J. Rawnsley, On tangential star products for the coadjoint Poisson structure, Commun. Math. Phys. 180 (1996), 99–108.
[17] A. Connes, Noncommutative Geometry, Academic Press, San Diego, New York, London, 1994.
[18] A. Connes, M. Flato and D. Sternheimer, Closed star products and cyclic cohomology, Lett. Math. Phys. 24 (1992), 1–12.
[19] G. Dito, Kontsevich star product on the dual of a Lie algebra, Lett. Math. Phys. 48 (1999), 307–322.
[20] G. Dito and D. Sternheimer, Deformation Quantization: Genesis, Developments and Metamorphoses. To appear in the Proceedings of the meeting between mathematicians and theoretical physicists, Strasbourg, 2001. IRMA Lectures in Math. Theoret. Phys., Vol. 1, Walter de Gruyter, Berlin 2002, pp. 9–54.
[21] B. V. Fedosov, Deformation Quantization and Index Theory, Akademie Verlag, Berlin, 1996.
[22] G. Felder and B. Shoikhet, Deformation quantization with traces, Lett. Math. Phys. 53 (2000), 75–86.
[23] A. Giaquinto and J. J. Zhang, Bialgebra actions, twists, and universal deformation formulas, J. Pure Appl. Algebra 128(2) (1998), 133–152.
[24] S. Gutt, An explicit ∗-product on the cotangent bundle of a Lie group, Lett. Math. Phys. 7 (1983), 249–258.
[25] S. Gutt, Variations on deformation quantization. In: G. Dito, D. Sternheimer (eds.): Conférence Moshé Flato 1999. Quantization, Deformations, and Symmetries, Mathematical Physics Studies no. 21, 217–254. Kluwer Academic Publishers, Dordrecht, Boston, London, 2000.
[26] S. Gutt and J. Rawnsley, Equivalence of star products on a symplectic manifold; an introduction to Deligne's Čech cohomology classes, J. Geom. Phys. 29 (1999), 347–392.
[27] S. Gutt and J. Rawnsley, Traces for star products on symplectic manifolds, J. Geom. Phys. 42 (2002), 12–18.
[28] A. V. Karabegov, On the canonical normalization of a trace density of deformation quantization, Lett. Math. Phys. 45 (1998), 217–228.
[29] M. Kontsevich, Deformation Quantization of Poisson Manifolds, I. Preprint q-alg/9709040 (September 1997).
[30] M. Kontsevich, Operads and motives in deformation quantization, Lett. Math. Phys. 48 (1999), 35–72.
[31] A. Lichnerowicz and A. Medina, Groupes a structures symplectiques ou kaehleriennes invariantes, C. R. Acad. Sci., Paris, Ser. I 306, No. 3 (1988), 133–138.
[32] R. Nest and B. Tsygan, Algebraic index theorem, Commun. Math. Phys. 172 (1995), 223–262.
[33] H. Omori, Y. Maeda and A. Yoshioka, Weyl manifolds and deformation quantization, Adv. Math. 85 (1991), 224–255.
[34] M. J. Pflaum, A deformation-theoretical approach to Weyl quantization on Riemannian manifolds, Lett. Math. Phys. 45 (1998), 277–294.
[35] M. A. Rieffel, Deformation quantization for actions of $\mathbb{R}^d$, Mem. Am. Math. Soc. 106(506) (1993).
[36] J. Rosenberg, Rigidity of K-theory under deformation quantization. Preprint q-alg/9607021 (July 1996).
[37] D. Sternheimer, Deformation Quantization: Twenty Years After. In: J. Rembieliński (ed.): Particles, Fields, and Gravitation, AIP Press, New York, 1998.
[38] E. M. Stein, Harmonic Analysis: Real-Variable Methods, Orthogonality, and Oscillatory Integrals, Princeton Mathematical Series, Princeton University Press (1993).
[39] D. Tamarkin and B. Tsygan, Cyclic formality and index theorems, Lett. Math. Phys. 56 (2001), 85–97.
[40] S. Waldmann, Locality in GNS representations of deformation quantization, Commun. Math. Phys. 210 (2000), 467–495.
[41] A. Weinstein, Deformation quantization, Séminaire Bourbaki 46ème année 789 (1994).
[42] A. Weinstein, The modular automorphism group of a Poisson manifold, J. Geom. Phys. 23 (1997), 379–394.
[43] A. Weinstein and P. Xu, Hochschild cohomology and characteristic classes for star-products. In: A. Khovanskij, A. Varchenko, V. Vassiliev (eds.): Geometry of differential equations. Dedicated to V. I. Arnold on the occasion of his 60th birthday, 177–194, American Mathematical Society, Providence, 1998.
July 14, 2003 10:12 WSPC/148-RMP
00167
Reviews in Mathematical Physics, Vol. 15, No. 5 (2003) 447–489 © World Scientific Publishing Company

PERTURBATION THEORY OF W*-DYNAMICS, LIOUVILLEANS AND KMS-STATES

J. DEREZIŃSKI
Department of Mathematical Methods in Physics, Warsaw University, Hoża 74, 00-682 Warszawa, Poland

V. JAKŠIĆ
Department of Mathematics and Statistics, McGill University, 805 Sherbrooke Street West, Montreal, QC, H3A 2K6, Canada

C.-A. PILLET
PHYMAT, Université de Toulon, B.P. 132, F-83957 La Garde Cedex, France
CPT-CNRS Luminy, Case 907, F-13288 Marseille Cedex 9, France

Received 26 March 2002
Revised 14 February 2003

Given a $W^*$-algebra $M$ with a $W^*$-dynamics $\tau$, we prove the existence of the perturbed $W^*$-dynamics for a large class of unbounded perturbations. We compute its Liouvillean. If $\tau$ has a $\beta$-KMS state, and the perturbation satisfies some mild assumptions related to the Golden–Thompson inequality, we prove the existence of a $\beta$-KMS state for the perturbed $W^*$-dynamics. These results extend the well known constructions due to Araki valid for bounded perturbations.

Keywords: $W^*$-algebra; $W^*$-dynamics; perturbation theory; KMS states; Liouvilleans.
1. Introduction

1.1. $W^*$-dynamics and KMS states

Let $M$ be a $W^*$-algebra equipped with a $W^*$-dynamics (a 1-parameter pointwise σ-weakly continuous group of ∗-automorphisms) $\mathbb{R} \ni t \mapsto \tau^t$. The pair $(M, \tau)$ is often called a $W^*$-dynamical system. Let $Q$ be a self-adjoint element of $M$. A well known convergent power series expansion, that can be traced back at least to Schwinger and Dyson, can be used to define the perturbed $W^*$-dynamics which we denote by $\mathbb{R} \ni t \mapsto \tau_Q^t$. The difference of the generators of $\tau_Q$ and $\tau$ equals $i[Q, \cdot]$; in fact, the $W^*$-dynamics $\tau_Q$ is uniquely characterized by this property.

Suppose in addition that $\beta > 0$ and that $\tau$ possesses a $\beta$-KMS state $\omega$. Araki proved that in this case the dynamics $\tau_Q$ also possesses a canonical $\beta$-KMS state $\omega_Q$. More precisely, if $\omega(A) = (\Omega|A\Omega)$, where $\Omega$ is the vector representative of
the state $\omega$ in the standard positive cone, and $L$ is the so-called Liouvillean of $\tau$, then the vector $\Omega_Q := e^{-\beta(L+Q)/2}\Omega$ is well defined and the state $\omega_Q(A) := (\Omega_Q|A\Omega_Q)/\|\Omega_Q\|^2$ is $\beta$-KMS for the $W^*$-dynamics $\tau_Q$.

The above two constructions play an important role in applications of operator algebras to quantum statistical physics. Whereas the construction of the perturbed $W^*$-dynamics $\tau_Q$ is relatively easy and not very surprising, the construction of the perturbed KMS state $\omega_Q$ is more subtle and has a far-reaching physical importance. Both constructions, however, have one technical weakness which restricts the range of their applications: the perturbation $Q$ is assumed to be bounded. In many physical applications the operator $Q$ is unbounded and is only affiliated to $M$.

In this paper we extend the construction of the perturbed $W^*$-dynamics $\tau_Q$ and the $(\tau_Q, \beta)$-KMS state $\omega_Q$ to a large class of unbounded perturbations $Q$ affiliated to $M$. An application of these results is discussed in [1] and concerns spectral and ergodic theory of Pauli–Fierz systems.

The proof of the first result (the construction of $\tau_Q$) is again relatively simple and does not involve much more than an application of the Trotter product formula. The proof of the second result (the construction of $\omega_Q$) is more involved. Its main idea is the use of the so-called Golden–Thompson inequality. The Golden–Thompson inequality in its original form says that if $A$ and $B$ are self-adjoint matrices, then $\mathrm{Tr}\, e^{A+B} \le \mathrm{Tr}\, e^{A} e^{B}$. Translated into the language of $W^*$-algebras and KMS states, the Golden–Thompson inequality can be put into the form
\[ \|\Omega_Q\| \le \|e^{-\beta Q/2}\Omega\| . \tag{1.1} \]
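The matrix form of the Golden–Thompson inequality, $\mathrm{Tr}\, e^{A+B} \le \mathrm{Tr}\, e^{A} e^{B}$, is easy to probe numerically. The following is a minimal sketch, not part of the paper: it samples real symmetric (hence self-adjoint) $3 \times 3$ matrices and evaluates the gap between the two sides. All function names are illustrative, and the matrix exponential is a plain scaling-and-squaring Taylor approximation written out by hand so that the block has no dependencies beyond the standard library.

```python
import random


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_exp(a, squarings=6, terms=20):
    """Approximate exp(a): Taylor series of a / 2**squarings, then repeated squaring."""
    n = len(a)
    scale = 2.0 ** squarings
    small = [[x / scale for x in row] for row in a]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term now holds small**k / k!
        term = [[x / k for x in row] for row in mat_mul(term, small)]
        result = mat_add(result, term)
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def random_symmetric(n, rng):
    """Random real symmetric matrix with entries in [-1, 1]."""
    m = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    return [[(m[i][j] + m[j][i]) / 2.0 for j in range(n)] for i in range(n)]


def golden_thompson_gap(a, b):
    """Tr(e^A e^B) - Tr(e^(A+B)); the inequality states that this is >= 0."""
    lhs = trace(mat_exp(mat_add(a, b)))
    rhs = trace(mat_mul(mat_exp(a), mat_exp(b)))
    return rhs - lhs


rng = random.Random(0)
gaps = [golden_thompson_gap(random_symmetric(3, rng), random_symmetric(3, rng))
        for _ in range(50)]
min_gap = min(gaps)
```

For commuting matrices the two sides coincide, so the gap vanishes up to rounding; for generic pairs it is strictly positive. A numerical experiment of this kind is of course no substitute for the operator-algebraic form (1.1), which is the version used in the paper.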
In our approach, the Golden–Thompson inequality is used to control the perturbed KMS states and gives an upper bound which, combined with a weak convergence argument, enables us to construct $\Omega_Q$ for a large class of unbounded $Q$.

In the literature there exists a different approach to the construction of the perturbed KMS states for unbounded perturbations, which is restricted to perturbations bounded from below. One of its versions has been developed by Sakai [2]; another version (applicable to generalized positive operators which may not have a dense domain) is due to Donald [3] (his method is also discussed in the monograph [4]). The Sakai–Donald theory does not cover perturbations which are unbounded from both sides, and in particular is not applicable to Pauli–Fierz systems.

The $W^*$-algebraic form (1.1) of the Golden–Thompson inequality was first proven by Araki [5]. A different proof, based on an application of Uhlmann's monotonicity theorem for the relative entropy [6], was given in [3].

1.2. Liouvilleans

The term Liouvillean has become quite popular in the recent literature on algebraic quantum statistical physics. The meaning of this term can vary depending on the
author. Therefore, we would like to devote some space to a discussion of possible meanings of the term Liouvillean in the context of $W^*$-dynamical systems.

Let $(M, \tau)$ be a $W^*$-dynamical system. It is often important to construct a representation of $M$ equipped with a unitary implementation of the $W^*$-dynamics $\tau$. There are two natural approaches to such a construction.

The first approach presupposes that $\tau$ has an invariant normal state $\omega$. In the corresponding GNS representation this state is represented by a cyclic vector $\Omega$. Then it is easy to see that there exists a unique self-adjoint operator $L$ such that
\[ \tau^t(A) = e^{itL} A e^{-itL}, \qquad L\Omega = 0 . \]
The operator $L$ defined this way can be called the $\Omega$-Liouvillean of $\tau$.

In the second approach one chooses a standard representation of $M$ on a Hilbert space $H$. One of the objects that go together with the standard representation is the positive cone $H^+$. A general theory of standard representations implies that there exists a unique self-adjoint operator $L$ such that
\[ \tau^t(A) = e^{itL} A e^{-itL}, \qquad e^{itL} H^+ \subset H^+ . \]
The operator $L$ defined in this way can be called the standard Liouvillean of $\tau$, or simply the Liouvillean of $\tau$.

The two setups overlap if the invariant state $\omega$ is faithful and $\Omega \in H^+$. In this case the $\Omega$-Liouvillean of $\tau$ coincides with the standard Liouvillean of $\tau$. This fact is important for applications of $W^*$-algebras to quantum statistical physics. If one is interested in the case of equilibrium, then the first approach to the Liouvillean suffices. In nonequilibrium situations one needs the second approach.

The (standard) Liouvillean encodes in a particularly convenient way the properties of the dynamics. This has been demonstrated in many places in the recent literature [1, 7–10]. The Liouvillean is also one of the main technical tools of our paper.

If $L$ is the Liouvillean for the $W^*$-dynamics $\tau$, then one may ask what is the Liouvillean for $\tau_Q$. If $Q$ is bounded, then the answer is $L_Q = L + Q - JQJ$, where $J$ is the modular conjugation. We will establish the same result for unbounded $Q$ under some mild technical assumptions.

1.3. Organization of the paper

We start our paper with a concise review of some aspects of the theory of $W^*$-algebras. The choice of topics is motivated by some recent applications of $W^*$-algebras to quantum statistical mechanics [1, 7–11]. Among other things, we will discuss the two possible definitions of the Liouvillean. For most of the proofs in Sec. 2 the reader is referred to the literature, especially [12, 13].

In Sec. 3 we describe the perturbation theory of $W^*$-dynamics and Liouvilleans. We describe in particular the case of unbounded perturbations, which goes beyond what we could find in the literature.
To make our paper more accessible, we have included in Sec. 4 the proof of Uhlmann's monotonicity theorem [6] and Donald's proof of the Golden–Thompson inequality [3]. A somewhat different presentation of this topic can be found in [4].

Section 5 contains the perturbation theory of KMS states. The subject naturally splits into three levels. The most restrictive level concerns analytic perturbations. In this case the proofs are essentially algebraic and relatively simple. The next level concerns bounded $Q$. This is the case considered by Araki [14], see also [13, 15–17]. Finally, we develop perturbation theory for a class of unbounded $Q$. In all the cases we prove a number of properties of $\Omega_Q$, including the Peierls–Bogoliubov and the Golden–Thompson inequalities. We stress that the Golden–Thompson inequality is at the same time an important ingredient of our proof of the existence of $\Omega_Q$. We also prove a number of estimates that can be used to compare the vectors $\Omega$ and $\Omega_Q$. Some of these estimates appear to be new.

We have attempted to make the paper reasonably self-contained so that it can serve as a brief introduction to some recent works on algebraic quantum statistical physics. Our presentation is in some respects complementary to the presentation in the standard literature such as [4, 12, 13]. In particular, we tried to emphasize the use of the standard representation and the Liouvillean.

In Appendix B we give a concise description of the Pauli–Fierz systems at positive densities. The material of this appendix is based on [1]. We include this material at the request of the referee to briefly explain the main physical motivation and application of the results of our paper.

2. General Facts about $W^*$-Algebras

In this section we recall some basic definitions and facts about $W^*$-algebras which will play a role in our paper. For additional information and proofs we refer the reader to [12, 13, 18–20].

There are two approaches to the theory of $W^*$-algebras: the concrete and the abstract approach. In the concrete approach one starts with the notion of a concrete $W^*$-algebra (also called a von Neumann algebra), defined as a ∗-algebra of bounded operators on a Hilbert space which equals its double commutant. This is in fact the original definition that dates back to the works of von Neumann. In the abstract approach, due to Sakai [18], one defines an abstract $W^*$-algebra as a $C^*$-algebra that possesses a predual. These approaches are essentially equivalent: every abstract $W^*$-algebra can be represented as a concrete $W^*$-algebra and every concrete $W^*$-algebra is an abstract $W^*$-algebra.

The concrete approach is historically the first and is used in most monographs, e.g. [12, 13, 19]. The abstract approach has been developed in [18]. In some respects the abstract approach is more difficult from the pedagogical point of view: many basic properties of $W^*$-algebras are more difficult to show starting from Sakai's definition than starting from von Neumann's definition. Nevertheless, one
can argue that Sakai's approach is conceptually superior: it helps to distinguish the notions that are intrinsic from the notions that are representation dependent. In our presentation we will stress the abstract approach.

2.1. Abstract $W^*$-algebras

If $X$ is a Banach space, then a Banach space $Y$ is called a predual of $X$ iff $X$ is isomorphic to the dual of $Y$. $M$ is an (abstract) $W^*$-algebra if it is a $C^*$-algebra which possesses a predual. It can be shown that every $W^*$-algebra $M$ possesses a unique predual (up to isomorphism). It will be denoted by $M_*$. Elements of $M_*$ will be called normal functionals on $M$.

The topology on $M$ generated by the seminorms $|\omega(A)|$, $\omega \in M_*$, is called the σ-weak topology. The topology on $M$ generated by the seminorms $|\omega(A^*A)|^{1/2}$, $\omega \in M_*$, is called the σ-strong topology.

$M_*^+$ denotes the set of positive elements of $M_*$. Elements of $M_*^+$ satisfying $\omega(1) = 1$ are called normal states. The set of normal states is denoted $M_*^{+,1}$.

Let $\omega \in M_*^+$ and let $N$ be a $W^*$-subalgebra of $M$. The support of $\omega$ with respect to $N$ is defined as
\[ s_\omega^N := \inf\{ P \in N : P \text{ is an orthogonal projection and } \omega(1 - P) = 0 \} . \]
In particular, the support with respect to $M$ will be called just the support of $\omega$ and denoted $s_\omega$. The support of $\omega$ with respect to the center of $M$ will be called the central support of $\omega$ and denoted $z_\omega$.

$\omega \in M_*^+$ is called faithful iff $s_\omega = 1$. A $W^*$-algebra is called σ-finite if it possesses a faithful state.

Let $M$, $N$ be $W^*$-algebras and $\pi : M \to N$ a homomorphism. We say that $\pi$ is normal iff $\pi$ is σ-weakly continuous.

2.2. Concrete $W^*$-algebras

Let $H$ be a Hilbert space. $(\Psi|\Phi)$ will denote the scalar product of the vectors $\Psi, \Phi \in H$. We adopt the "physicist's convention" and our scalar product is antilinear with respect to the first argument.

If $C \subset B(H)$, then the commutant of $C$ will be denoted by $C'$. We will say that $M$ is a concrete $W^*$-algebra (or a von Neumann algebra) iff $M \subset B(H)$ for some Hilbert space $H$ and $M'' = M$.

A concrete $W^*$-algebra in $B(H)$ is a $W^*$-algebra inside $B(H)$ containing the identity of $B(H)$. Every abstract $W^*$-algebra is ∗-isomorphic to a concrete $W^*$-algebra.

Let $M$ be an abstract $W^*$-algebra and $\pi : M \to B(H)$ a representation. Then $\pi(M)$ is a concrete $W^*$-algebra iff $\pi$ is unital and normal. Given an injective unital normal representation $\pi : M \to B(H)$, we will often identify $M$ with $\pi(M)$.
2.3. Concrete affiliations

In the following two subsections we recall the concept of operators affiliated to a $W^*$-algebra. This concept is well known in the case of concrete $W^*$-algebras, see e.g. [12].

Let $\mathfrak{M} \subset B(\mathcal{H})$ be a concrete $W^*$-algebra. Let $A$ be a closed densely defined operator on $\mathcal{H}$ and $D(A)$ its domain. We say that $A$ is affiliated to $\mathfrak{M}$ iff for all $A' \in \mathfrak{M}'$, $A' D(A) \subset D(A)$ and $AA' = A'A$ on $D(A)$. Let $\mathfrak{M}^{(\eta)}$ be the set of operators affiliated to $\mathfrak{M}$.

Theorem 2.1. (1) If $A$ is self-adjoint on $\mathcal{H}$, then $A$ is affiliated to $\mathfrak{M}$ iff all bounded Borel functions of $A$ belong to $\mathfrak{M}$.
(2) If $A$ is a closed operator, then $A$ is affiliated to $\mathfrak{M}$ iff $A(1 + A^*A)^{-1/2} \in \mathfrak{M}$.

2.4. Abstract affiliations

The concept of affiliation can be introduced for abstract $W^*$-algebras in a fashion independent of representations. Our definition of an operator affiliated to an abstract $W^*$-algebra is directly inspired by the definition of affiliation in the context of $C^*$-algebras due originally to Baaj and Julg [21] and elaborated by Woronowicz [22]. We are grateful to S. L. Woronowicz for a discussion of this issue.

Let $\mathfrak{M}$ be an abstract $W^*$-algebra. In this subsection we will consider linear operators acting on $\mathfrak{M}$. The domain of an operator $A$ on $\mathfrak{M}$ will be denoted by $\mathrm{Dom}(A)$. (We reserve the notation $D(A)$ for the domain of an operator $A$ acting on a Hilbert space.)

Let $A$ be a linear mapping acting on $\mathfrak{M}$. We say that $A$ is affiliated to $\mathfrak{M}$, and write $A \in \mathfrak{M}^{\eta}$, iff there exists $B \in \mathfrak{M}$ such that $\|B\| \le 1$, $(1 - BB^*)\mathfrak{M}$ is $\sigma$-weakly dense in $\mathfrak{M}$ and, for any $C, D \in \mathfrak{M}$,
\[ C \in \mathrm{Dom}(A) \ \text{and} \ AC = D \iff BC = (1 - BB^*)^{1/2} D . \]
If such $B$ exists, then it is unique. We set $z(A) := B$. In [22], $z(A)$ is called the $z$-transform of $A$. One can show that if $A \in \mathfrak{M}^{\eta}$, then $\mathrm{Dom}(A)$ is $\sigma$-weakly dense and $A$ is closed, both in the norm topology and in the $\sigma$-weak topology.

Note that every $A \in \mathfrak{M}$ may be identified with a linear map on $\mathfrak{M}$ with $\mathrm{Dom}(A) = \mathfrak{M}$ (given by $A(C) = AC$) and thus it is an element of $\mathfrak{M}^{\eta}$. The $z$-transform of $A \in \mathfrak{M}$ equals
\[ z(A) = (1 + AA^*)^{-1/2} A . \]

The following theorem describes the relationship between abstract and concrete affiliations. It shows that in the case of an injective normal representation we can identify abstract and concrete affiliated operators.

Theorem 2.2. Let $\pi : \mathfrak{M} \to B(\mathcal{H})$ be a normal representation preserving the identity. Then there exists a unique extension of $\pi$ to a surjective map $\pi : \mathfrak{M}^{\eta} \to \pi(\mathfrak{M})^{(\eta)}$ satisfying
\[ (1 + \pi(A)\pi(A)^*)^{-1/2} \pi(A) = \pi(z(A)) . \]
If $\pi$ is injective on $\mathfrak{M}$, then its extension to $\mathfrak{M}^{\eta}$ is injective as well.

2.5. Vector representatives of states

Let $\mathfrak{M} \subset B(\mathcal{H})$ be a concrete $W^*$-algebra and $\Omega$ a vector in $\mathcal{H}$. Then
\[ \omega_\Omega(A) := (\Omega|A\Omega) , \qquad A \in \mathfrak{M} , \]
defines a normal positive functional on $\mathfrak{M}$. We say that $\Omega$ is a vector representative of $\omega_\Omega$. $\omega_\Omega$ is a state iff $\Omega$ is normalized. The support and the central support of $\omega_\Omega$ are also called the support and the central support of $\Omega$ and denoted $s_\Omega$ and $z_\Omega$ respectively. We thus have
\[ s_{\omega_\Omega} = s_\Omega , \qquad z_{\omega_\Omega} = z_\Omega . \]
The support of $\Omega$ with respect to the $W^*$-algebra $\mathfrak{M}'$ will be denoted $s'_\Omega$. One shows that
\[ \mathrm{Ran}\, s_\Omega = (\mathfrak{M}'\Omega)^{\rm cl} , \qquad \mathrm{Ran}\, s'_\Omega = (\mathfrak{M}\Omega)^{\rm cl} , \]
where cl stands for the closure.

A vector $\Omega \in \mathcal{H}$ is called cyclic if $s'_\Omega = 1$. A vector $\Omega$ is called separating if $s_\Omega = 1$, or equivalently, if it is a vector representative of a faithful state.

The following construction, named after Gelfand, Naimark and Segal, associates to every normal state a normal representation equipped with a cyclic vector.

Theorem 2.3 (The GNS construction). Let $\omega$ be a normal state. Then there exist a (unique up to unitary equivalence) Hilbert space $\mathcal{H}$, a normal unital representation $\pi : \mathfrak{M} \to B(\mathcal{H})$ and a cyclic vector $\Omega \in \mathcal{H}$, such that
\[ \omega(A) = (\Omega|\pi(A)\Omega) . \]
The representation $\pi$ is injective on $z_\omega\mathfrak{M}$ and zero on $(1 - z_\omega)\mathfrak{M}$.

2.6. Automorphisms of $W^*$-algebras

Let $\mathrm{Aut}(\mathfrak{M})$ denote the group of $*$-automorphisms of a $W^*$-algebra $\mathfrak{M}$. We equip $\mathrm{Aut}(\mathfrak{M})$ with the following topology: if $\rho_\alpha$ is a net in $\mathrm{Aut}(\mathfrak{M})$ and $\rho \in \mathrm{Aut}(\mathfrak{M})$, then $\rho_\alpha \to \rho$ iff for all $A \in \mathfrak{M}$, $\rho_\alpha(A) \to \rho(A)$ $\sigma$-weakly. This topology is called the pointwise $\sigma$-weak topology.

A one-parameter pointwise $\sigma$-weakly continuous group $\mathbb{R} \ni t \mapsto \tau^t \in \mathrm{Aut}(\mathfrak{M})$ is called a $W^*$-dynamics on $\mathfrak{M}$. The pair $(\mathfrak{M}, \tau)$ is called a $W^*$-dynamical system.

Let $\mathfrak{M} \subset B(\mathcal{H})$ be a concrete $W^*$-algebra and $\rho \in \mathrm{Aut}(\mathfrak{M})$. We say that $\rho$ is implemented by $U \in \mathcal{U}(\mathcal{H})$, where $\mathcal{U}(\mathcal{H})$ denotes the set of unitary operators on $\mathcal{H}$, iff
\[ \rho(A) = U A U^* . \tag{2.1} \]
Let $t \mapsto \tau^t$ be a $W^*$-dynamics on $\mathfrak{M}$ and $t \mapsto U(t) \in \mathcal{U}(\mathcal{H})$ a strongly continuous group. We say that $\tau^t$ is implemented by $U(t)$ iff
\[ \tau^t(A) = U(t) A U(t)^* . \tag{2.2} \]
In general, neither $*$-automorphisms nor $W^*$-dynamics need be implementable. If they are, the implementation is not unique. In the next subsections we will describe two situations where there exist distinguished implementations.

2.7. Automorphisms with a fixed invariant state

Let $\omega \in \mathfrak{M}_*^+$ and $\rho \in \mathrm{Aut}(\mathfrak{M})$. We define $\rho^*\omega \in \mathfrak{M}_*^+$ by $\rho^*\omega(A) = \omega(\rho(A))$. We say that $\omega$ is $\rho$-invariant if $\omega = \rho^*\omega$. The automorphisms that leave $\omega$ invariant form a group denoted $\mathrm{Aut}_\omega(\mathfrak{M})$.

If $\rho \in \mathrm{Aut}_\omega(\mathfrak{M})$, then $\rho(z_\omega) = z_\omega$ and $\rho(s_\omega) = s_\omega$. Thus $\rho$ maps $z_\omega\mathfrak{M}$ and $(1 - z_\omega)\mathfrak{M}$ into itself, and without loss of generality we may assume that $z_\omega = 1$. By passing to the GNS representation we may assume that $\mathfrak{M} \subset B(\mathcal{H})$ and that $\Omega$ is a cyclic vector representative of $\omega$.

Proposition 2.1. There exists a unique representation $\mathrm{Aut}_\omega(\mathfrak{M}) \ni \rho \mapsto U^\Omega(\rho) \in \mathcal{U}(\mathcal{H})$ such that
\[ U^\Omega(\rho)\Omega = \Omega , \qquad U^\Omega(\rho) A U^\Omega(\rho)^* = \rho(A) . \]
It is continuous if we equip $\mathrm{Aut}_\omega(\mathfrak{M})$ with the pointwise $\sigma$-weak topology and $\mathcal{U}(\mathcal{H})$ with the strong operator topology.

Proof. One just sets
\[ U^\Omega(\rho) A \Omega = \rho(A)\Omega , \qquad A \in \mathfrak{M} . \]

$U^\Omega(\rho)$ will be called the $\Omega$-implementation of $\rho$.

Suppose now that $t \mapsto \tau^t$ is a $W^*$-dynamics that leaves $\omega$ invariant. Then, by Proposition 2.1, $\tau$ is implemented by a strongly continuous unitary group $\mathbb{R} \ni t \mapsto U^\Omega(\tau^t) \in \mathcal{U}(\mathcal{H})$. The self-adjoint generator of $U^\Omega(\tau^t)$ will be denoted $L^\Omega$ and called the $\Omega$-Liouvillean of $\tau^t$. (Thus $U^\Omega(\tau^t) = e^{itL^\Omega}$.) The following fact is a corollary of Proposition 2.1:

Proposition 2.2. The operator $L^\Omega$ is the unique self-adjoint operator such that
\[ L^\Omega \Omega = 0 , \qquad e^{itL^\Omega} A e^{-itL^\Omega} = \tau^t(A) , \qquad A \in \mathfrak{M} . \]
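To make Propositions 2.1 and 2.2 concrete, here is a finite-dimensional sketch that is not part of the original text. It assumes that $\mathfrak{M}$ is the algebra of $n \times n$ matrices acting by left multiplication on the space of $n \times n$ matrices with the Hilbert–Schmidt scalar product; the Hamiltonian $h$ and the density matrix $\rho$ are hypothetical data chosen for the example.

```latex
% Assumed setting: M = M_n(C) acting on H = M_n(C) by left multiplication,
% (X|Y) = Tr(X^* Y),  h = h^*,  rho > 0,  Tr rho = 1,  [h, rho] = 0.
\begin{align*}
\Omega &= \rho^{1/2}, &
\omega(A) &= (\Omega|A\Omega) = \operatorname{Tr}(\rho^{1/2} A \rho^{1/2}) = \operatorname{Tr}(\rho A), \\
\tau^t(A) &= e^{ith} A e^{-ith}, &
\omega(\tau^t(A)) &= \omega(A) \quad \text{since } [h,\rho] = 0 .
\end{align*}
% Omega is cyclic because rho^{1/2} is invertible: M Omega = M_n(C).
% The Omega-implementation is fixed by U A Omega = tau^t(A) Omega:
\begin{align*}
U^\Omega(\tau^t)\, A\rho^{1/2}
  &= e^{ith} A e^{-ith} \rho^{1/2}
   = e^{ith} \bigl(A \rho^{1/2}\bigr) e^{-ith}, \\
U^\Omega(\tau^t)\, X &= e^{ith} X e^{-ith}, \qquad
L^\Omega X = hX - Xh, \qquad
L^\Omega \Omega = [h, \rho^{1/2}] = 0 .
\end{align*}
```

In this model the $\Omega$-Liouvillean is the commutator with $h$; the operator $X \mapsto hX$ also implements $\tau$ but does not annihilate $\Omega$ unless $h\rho^{1/2} = 0$, which illustrates the non-uniqueness of implementations.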
2.8. The Tomita–Takesaki theory

Let $\omega$ be a faithful state on $\mathfrak{M}$. By passing to the GNS representation we may assume that $\mathfrak{M} \subset B(\mathcal{H})$ and that $\omega$ has a vector representative $\Omega$ which is cyclic and separating. The following theorem summarizes the results of the well-known Tomita–Takesaki theory.

Theorem 2.4. (1) Define the operator $S_\Omega$ with the domain $\mathfrak{M}\Omega$ by
\[ S_\Omega A \Omega = A^*\Omega . \]
Then $S_\Omega$ is antilinear, closable, and has a zero kernel and cokernel. Its closure will also be denoted $S_\Omega$. Let $S_\Omega = J\Delta_\Omega^{1/2}$ be its polar decomposition;
(2) $J$ is an antiunitary involution;
(3) $\Delta_\Omega$ is a positive operator satisfying $J\Delta_\Omega J = \Delta_\Omega^{-1}$ and $\Delta_\Omega\Omega = \Omega$;
(4) The map
\[ \tau_\omega^t(A) := \Delta_\Omega^{-it} A \Delta_\Omega^{it} \in \mathfrak{M} , \qquad A \in \mathfrak{M} , \]
is a $W^*$-dynamics on $\mathfrak{M}$ and $-\log\Delta_\Omega$ is its $\Omega$-Liouvillean.

The $W^*$-dynamics $\mathbb{R} \ni t \mapsto \tau_\omega^{-t}$ is called the modular dynamics and $\Delta_\Omega$ is called the modular operator.

2.9. Standard form

One of the central notions of the theory of $W^*$-algebras is the so-called standard form. It was introduced by Haagerup [23], following the work of Araki [24] and Connes [25].

A $W^*$-algebra in a standard form is a quadruple $(\mathfrak{M}, \mathcal{H}, J, \mathcal{H}^+)$, where $\mathcal{H}$ is a Hilbert space, $\mathfrak{M} \subset B(\mathcal{H})$ is a concrete $W^*$-algebra, $J$ is an antiunitary involution on $\mathcal{H}$ (that is, $J$ is antilinear, $J^2 = 1$, $J^* = J$) and $\mathcal{H}^+$ is a self-dual cone in $\mathcal{H}$ such that:
(1) $J\mathfrak{M}J = \mathfrak{M}'$;
(2) $JAJ = A^*$ for $A$ in the center of $\mathfrak{M}$;
(3) $J\Psi = \Psi$ for $\Psi \in \mathcal{H}^+$;
(4) $AJA\mathcal{H}^+ \subset \mathcal{H}^+$ for $A \in \mathfrak{M}$.
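The objects of Theorem 2.4 and the four conditions defining a standard form can be checked by hand in finite dimension. The following sketch is not part of the original text; it assumes the matrix algebra $M_n(\mathbb{C})$ acting by left multiplication on the Hilbert–Schmidt space, and the faithful density matrix $\rho$ is a hypothetical choice.

```latex
% Assumed setting: M = M_n(C) acting on H = M_n(C) by left multiplication,
% (X|Y) = Tr(X^* Y),  rho > 0,  Tr rho = 1,  Omega = rho^{1/2}.
\begin{align*}
J X &= X^*, &
\Delta_\Omega X &= \rho X \rho^{-1}, &
\mathcal{H}^+ &= \{ X \in M_n(\mathbb{C}) : X \ge 0 \} .
\end{align*}
% Check of the polar decomposition S_Omega = J Delta_Omega^{1/2}:
\begin{align*}
J \Delta_\Omega^{1/2} (A \rho^{1/2})
  = J\bigl(\rho^{1/2} A \rho^{1/2} \rho^{-1/2}\bigr)
  = (\rho^{1/2} A)^* = A^* \rho^{1/2} = S_\Omega (A\rho^{1/2}) .
\end{align*}
% Check of the standard form conditions (1)-(4):
\begin{align*}
(1)\quad & JAJ\,X = (A X^*)^* = X A^*
   \ \text{ (right multiplication, and all of } \mathfrak{M}' \text{ arises this way)}; \\
(2)\quad & J c J = \bar{c} = c^* \ \text{ for } c \in \mathbb{C}1 \text{, the center of } \mathfrak{M}; \\
(3)\quad & X \ge 0 \ \Rightarrow\ X^* = X; \\
(4)\quad & AJA\,X = A (A X)^* = A X^* A^* = A X A^* \ge 0 \ \text{ for } X \ge 0 .
\end{align*}
% Self-duality of the cone: Tr(XY) >= 0 for all Y >= 0 iff X >= 0.
```

Note that $J$ and $\mathcal{H}^+$ do not depend on $\rho$ in this model, in agreement with the uniqueness statements for standard representations.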
If $\mathfrak{M}$ is an abstract $W^*$-algebra, then we will say that $(\pi, \mathcal{H}, J, \mathcal{H}^+)$ is its standard representation if $\pi : \mathfrak{M} \to B(\mathcal{H})$ is an injective unital representation and $(\pi(\mathfrak{M}), \mathcal{H}, J, \mathcal{H}^+)$ is a standard form.

Theorem 2.5. Let $\mathfrak{M}$ be a $W^*$-algebra with a faithful state $\omega$. Let $\pi : \mathfrak{M} \to B(\mathcal{H})$ be the corresponding GNS representation with the cyclic vector $\Omega$. Let $J$ be the modular conjugation obtained by the Tomita–Takesaki theory and
\[ \mathcal{H}^+ := \{\pi(A) J \pi(A) \Omega : A \in \mathfrak{M}\}^{\rm cl} . \]
Then $\mathcal{H}^+$ is a self-dual cone and $(\pi, \mathcal{H}, J, \mathcal{H}^+)$ is a standard representation of $\mathfrak{M}$. If $(\pi, \mathcal{H}, J_1, \mathcal{H}_1^+)$ is another standard representation of $\mathfrak{M}$ and $\Omega \in \mathcal{H}_1^+$, then $\mathcal{H}_1^+ = \mathcal{H}^+$ and $J_1 = J$.

Theorem 2.6. Every $W^*$-algebra $\mathfrak{M}$ possesses a standard representation. Moreover, if $(\pi_1, \mathcal{H}_1, J_1, \mathcal{H}_1^+)$ and $(\pi_2, \mathcal{H}_2, J_2, \mathcal{H}_2^+)$ are two standard representations of $\mathfrak{M}$, then there exists a unique unitary operator $W : \mathcal{H}_1 \to \mathcal{H}_2$ such that
\[ W \pi_1(A) = \pi_2(A) W , \qquad W \mathcal{H}_1^+ = \mathcal{H}_2^+ . \]
We then automatically have $W J_1 = J_2 W$.

If $\mathfrak{M}$ is $\sigma$-finite, then Theorem 2.6 is proven e.g. in [12]. In this case the existence part follows from Theorem 2.5. If $\mathfrak{M}$ is not $\sigma$-finite, the theorem is proven using weights instead of states. The details can be found in [20, 23].

2.10. States and automorphisms in the standard representation

In this subsection we fix a $W^*$-algebra in the standard form $(\mathfrak{M}, \mathcal{H}, J, \mathcal{H}^+)$.

Theorem 2.7. (1) $\mathcal{H}^+ \ni \Omega \mapsto \omega_\Omega \in \mathfrak{M}_*^+$ is a bijection. Its inverse will be denoted
\[ \mathfrak{M}_*^+ \ni \omega \mapsto \Omega_\omega \in \mathcal{H}^+ . \]
(2) If $\Psi, \Phi \in \mathcal{H}^+$, then
\[ \|\Psi - \Phi\|^2 \le \|\omega_\Psi - \omega_\Phi\| \le \|\Psi - \Phi\| \, \|\Psi + \Phi\| . \]
(3) If $\Omega \in \mathcal{H}^+$, then $\Omega$ is cyclic $\Leftrightarrow$ $\Omega$ is separating $\Leftrightarrow$ $\omega_\Omega$ is faithful.
(4) For $\Omega \in \mathcal{H}^+$, $s'_\Omega = J s_\Omega J$.

The vector $\Omega_\omega \in \mathcal{H}^+$ will be called the standard vector representative of $\omega$.

A unitary operator $U$ on $\mathcal{H}$ is called a standard unitary operator iff
(1) $U\mathcal{H}^+ = \mathcal{H}^+$,
(2) $U\mathfrak{M}U^* = \mathfrak{M}$.

Theorem 2.8. (1) If $U$ is a standard unitary operator, then $JU = UJ$ and $U\mathfrak{M}'U^* = \mathfrak{M}'$.
(2) There exists a unique unitary representation
\[ \mathrm{Aut}(\mathfrak{M}) \ni \rho \mapsto U(\rho) \in \mathcal{U}(\mathcal{H}) \tag{2.3} \]
satisfying the following conditions:
(a) $U(\rho) A U(\rho)^* = \rho(A)$, $A \in \mathfrak{M}$;
(b) $U(\rho)\mathcal{H}^+ \subset \mathcal{H}^+$.
(3) The image of (2.3) is the group of all standard unitary operators.
(4) (2.3) is continuous if $\mathrm{Aut}(\mathfrak{M})$ is equipped with the pointwise $\sigma$-weak topology and $\mathcal{U}(\mathcal{H})$ with the strong operator topology.
(5) $U(\rho)\Omega_\omega = \Omega_{(\rho^{-1})^*\omega}$ for all $\omega \in \mathfrak{M}_*^+$.

$U(\rho)$ will be called the standard implementation of $\rho$.

Suppose that $t \mapsto \tau^t$ is a $W^*$-dynamics on $\mathfrak{M}$ and let $U(\tau^t)$ be as in Theorem 2.8. Then there exists a unique self-adjoint $L$ such that $U(\tau^t) = e^{itL}$. The operator $L$ will be called the standard Liouvillean of the $W^*$-dynamics $\tau$, or simply the Liouvillean of $\tau$.

Theorem 2.9. The Liouvillean of $\tau$ is the unique self-adjoint operator $L$ satisfying
\[ e^{itL}\mathcal{H}^+ \subset \mathcal{H}^+ , \qquad e^{itL} A e^{-itL} = \tau^t(A) , \qquad A \in \mathfrak{M} , \]
for all $t \in \mathbb{R}$.

The final result we wish to mention follows easily from Theorems 2.7 and 2.8. It has been a key tool in recent investigations of invariant states of a certain class of $W^*$-dynamical systems called Pauli–Fierz systems [1, 7–10].

Theorem 2.10. Let $\tau$ be a $W^*$-dynamics and $L$ the corresponding Liouvillean. Then
\[ \{\omega_\Phi : \Phi \in \mathcal{H}^+ \cap \mathrm{Ker}\, L\} = \{\omega \in \mathfrak{M}_*^+ : \omega \text{ is } \tau^t \text{ invariant}\} . \]
Consequently,
(1) $\dim \mathrm{Ker}\, L = 0$ $\Leftrightarrow$ there are no normal $\tau$-invariant states.
(2) $\dim \mathrm{Ker}\, L = 1$ $\Leftrightarrow$ there exists exactly one normal $\tau$-invariant state.

We will not make use of this result in our paper.

2.11. Comparison

In some circumstances the setups of Subsecs. 2.7 and 2.10 overlap. Recall that in Subsec. 2.7 we have a $W^*$-algebra $\mathfrak{M}$ with a faithful state $\omega$. We can assume that $\mathfrak{M} \subset B(\mathcal{H})$ and that $\omega$ has a cyclic vector representative $\Omega$. By Theorem 2.5, we can construct $J$ and $\mathcal{H}^+$ so that $(\mathfrak{M}, \mathcal{H}, J, \mathcal{H}^+)$ is a standard form and $\Omega \in \mathcal{H}^+$.

Proposition 2.3. Let $\rho \in \mathrm{Aut}_\omega(\mathfrak{M})$. Suppose that $U \in \mathcal{U}(\mathcal{H})$ implements $\rho$, that is, $\rho(A) = U A U^*$, $A \in \mathfrak{M}$. Then the following conditions are equivalent:
(1) $U\Omega = \Omega$ ($U = U^\Omega(\rho)$ is the $\Omega$-implementation of $\rho$);
(2) $U\mathcal{H}^+ = \mathcal{H}^+$ ($U = U(\rho)$ is the standard implementation of $\rho$).
Proof. We know from Proposition 2.1 that the $\Omega$-implementation of $\rho$ exists and is unique. We also know from Theorem 2.8 that the standard implementation of $\rho$ exists and is unique. Hence, it is sufficient to show the implication in one direction.

(2) $\Rightarrow$ (1). The vector $U\Omega$ determines the state $\rho^*\omega = \omega$. Hence the vectors $U\Omega$, $\Omega$ belong to the cone $\mathcal{H}^+$ and determine the same state. This implies $U\Omega = \Omega$.

As a corollary, if the invariant state $\omega$ is faithful, then the concepts of the $\Omega$-Liouvillean and the standard Liouvillean coincide.

Proposition 2.4. Let $t \mapsto \tau^t$ be a $W^*$-dynamics on $\mathfrak{M}$ that leaves invariant a faithful state $\omega$. Suppose that $L$ is a self-adjoint operator such that $\tau^t(A) = e^{itL} A e^{-itL}$. Then the following conditions are equivalent:
(1) $L\Omega = 0$ ($L = L^\Omega$ is the $\Omega$-Liouvillean of $\tau$);
(2) For $t \in \mathbb{R}$, $e^{itL}\mathcal{H}^+ \subset \mathcal{H}^+$ ($L$ is the standard Liouvillean of $\tau$).

2.12. KMS states

In this subsection we recall basic properties of KMS states. Let $(\mathfrak{M}, \tau^t)$ be a $W^*$-dynamical system.

Definition 2.1. Let $\beta > 0$. $\omega \in \mathfrak{M}_*^{+,1}$ is called a $(\tau, \beta)$-KMS state if for any $A, B \in \mathfrak{M}$ there exists a function $F_{A,B}(z)$, analytic in the strip $\{z : 0 < \mathrm{Im}\, z < \beta\}$, continuous on its closure, and satisfying the KMS boundary conditions for $t \in \mathbb{R}$:
\[ F_{A,B}(t) = \omega(A\tau^t(B)) , \qquad F_{A,B}(t + i\beta) = \omega(\tau^t(B)A) . \]

Theorem 2.11. Let $\omega$ be a $(\tau, \beta)$-KMS state and $\beta > 0$. Then
(1) $\omega$ is $\tau$-invariant.
(2) $s_\omega = z_\omega$. (In particular, $\omega$ is faithful on $z_\omega\mathfrak{M}$.)
(3) If $B \in z_\omega\mathfrak{Z}$, where $\mathfrak{Z}$ is the center of $\mathfrak{M}$, then $\tau^t(B) = B$.
(4) Let $\tau_\omega$ be the dynamics on $z_\omega\mathfrak{M}$ generated by $\omega$. Then $\tau^t|_{z_\omega\mathfrak{M}} = \tau_\omega^{\beta t}$.

Theorem 2.12. Let $\omega$ be a faithful state on $\mathfrak{M}$ and $\tau_\omega$ the corresponding dynamics. Then $\omega$ is a $(\tau_\omega, 1)$-KMS state.

Let $(\mathfrak{M}, \mathcal{H}, J, \mathcal{H}^+)$ be a standard form. We say that $\Omega$ is a standard $(\tau, \beta)$-KMS vector iff it is a standard vector representative of a $(\tau, \beta)$-KMS state. Suppose that $L$ is the Liouvillean of $\tau$. The following theorem gives a criterion for the KMS property expressed in terms of Hilbert spaces.

Theorem 2.13. Let $\Omega \in \mathcal{H}^+$ be a unit vector. Then
(1) $\Omega$ is a standard $(\tau, \beta)$-KMS vector iff $\mathfrak{M}\Omega \subset D(e^{-\beta L/2})$ and
\[ e^{-\beta L/2} A \Omega = J A^* \Omega , \qquad A \in \mathfrak{M} . \]
(2) If in addition $\Omega$ is cyclic and $\Delta_\Omega$ is the corresponding modular operator, then $\Delta_\Omega = e^{-\beta L}$.

2.13. Convergence

It is often convenient to reduce the study of $W^*$-dynamics and normal states to the study of the corresponding Liouvilleans and standard vector representatives. In this subsection we apply this point of view to the convergence properties of $W^*$-dynamics, invariant states and KMS states.

Theorem 2.14. Assume that $(\mathfrak{M}, \mathcal{H}, J, \mathcal{H}^+)$ is a $W^*$-algebra in the standard form.
(1) Suppose that $\tau_n$ is a sequence of $W^*$-dynamics with Liouvilleans $L_n$, $L$ is a self-adjoint operator, and $L_n \to L$ in the strong resolvent sense. Then $\tau^t(A) := e^{itL} A e^{-itL}$ is a $W^*$-dynamics on $\mathfrak{M}$ and $L$ is its Liouvillean.
(2) Assume in addition that $\omega_n \in \mathfrak{M}_*^+$ are $\tau_n$-invariant and $\Omega_n$ are their standard vector representatives. Suppose also that $\text{w-}\lim_n \Omega_n = \Omega$. Then $\Omega \in \mathcal{H}^+$ and the functional $\omega_\Omega$ is $\tau$-invariant.
(3) Assume in addition that $\omega_n$ are $(\tau_n, \beta)$-KMS states and that $\Omega \ne 0$. Then $\omega_{\Omega/\|\Omega\|}$ is a $(\tau, \beta)$-KMS state.

Proof. (1) Let $A \in \mathfrak{M}$. We have $\text{s-}\lim_{n\to\infty} e^{\pm itL_n} = e^{\pm itL}$, hence
\[ \text{s-}\lim_{n\to\infty} e^{itL_n} A e^{-itL_n} = e^{itL} A e^{-itL} \in \mathfrak{M} . \]
Therefore $\tau$ is a $W^*$-dynamics. Since $\mathcal{H}^+$ is closed and $e^{itL_n}$ preserve $\mathcal{H}^+$, $e^{itL}$ preserves $\mathcal{H}^+$. Hence $L$ is the Liouvillean of $\tau$.

(2) Since $\mathcal{H}^+$ is weakly closed, $\Omega \in \mathcal{H}^+$. Moreover, since $\Omega_n \in D(L_n)$ and $L_n\Omega_n = 0$, by Proposition A.4, $\Omega \in D(L)$ and $L\Omega = 0$.

(3) Let $A \in \mathfrak{M}$. $\Omega_n$ are $(\tau_n, \beta)$-KMS vectors, hence
\[ \exp(-\beta L_n/2) A \Omega_n = J A^* \Omega_n . \]
Since $\exp(-\beta L_n/2) \to \exp(-\beta L/2)$ in the strong resolvent sense, $JA^*\Omega_n \to JA^*\Omega$ weakly, and $A\Omega_n \to A\Omega$ weakly, it follows from Proposition A.4 that $A\Omega \in D(e^{-\beta L/2})$ and
\[ e^{-\beta L/2} A \Omega = J A^* \Omega . \]
Hence $\Omega/\|\Omega\|$ is a $(\tau, \beta)$-KMS vector.
(2.4)
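As a sanity check of Definition 2.1, the KMS boundary condition can be verified directly for a finite-dimensional Gibbs state. The sketch below is not part of the original text; the matrix algebra $M_n(\mathbb{C})$, the Hamiltonian $h$ and the partition function $Z$ are assumptions made for the example.

```latex
% Assumed setting: M = M_n(C),  h = h^*,  tau^t(A) = e^{ith} A e^{-ith},
% omega(A) = Tr(e^{-beta h} A)/Z,  Z = Tr e^{-beta h}.
\begin{align*}
F_{A,B}(z) &:= \frac{1}{Z}\operatorname{Tr}\bigl(e^{-\beta h} A\, e^{izh} B e^{-izh}\bigr)
  \quad \text{(entire in } z\text{)}, \qquad
F_{A,B}(t) = \omega(A\tau^t(B)), \\
F_{A,B}(t+i\beta)
  &= \frac{1}{Z}\operatorname{Tr}\bigl(e^{-\beta h} A\, e^{ith} e^{-\beta h} B\, e^{-ith} e^{\beta h}\bigr)
   = \frac{1}{Z}\operatorname{Tr}\bigl(A\, e^{-\beta h} e^{ith} B e^{-ith}\bigr) \\
  &= \frac{1}{Z}\operatorname{Tr}\bigl(e^{-\beta h} \tau^t(B) A\bigr)
   = \omega(\tau^t(B)A) .
\end{align*}
% Only cyclicity of the trace and [e^{ith}, e^{-beta h}] = 0 were used.
```

Thus the Gibbs state at inverse temperature $\beta$ is a $(\tau, \beta)$-KMS state in this model.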
2.14. Analytic elements

Let $(\mathfrak{M}, \tau)$ be a $W^*$-dynamical system. An element $A \in \mathfrak{M}$ is called $\tau$-analytic if there exists a strip $I(r) = \{z : |\mathrm{Im}\, z| < r\}$ and a function $f : I(r) \to \mathfrak{M}$ such that:
(1) $f(t) = \tau^t(A)$ for $t \in \mathbb{R}$;
(2) $I(r) \ni z \mapsto \phi(f(z))$ is analytic for all $\phi \in \mathfrak{M}_*$.

Under these conditions we write $f(z) = \tau^z(A)$. A standard argument based on the uniform boundedness theorem shows that $f(z)$ is actually analytic in the norm of $\mathfrak{M}$. If $r = \infty$, then we say that $A$ is $\tau$-entire.

For $A \in \mathfrak{M}$ and $n \in \mathbb{N}$ let
\[ A_n = \left(\frac{n}{\pi}\right)^{1/2} \int_{\mathbb{R}} e^{-nt^2} \tau^t(A)\, dt . \]

Theorem 2.15. $A_n$ is $\tau$-entire and $A_n \to A$ in the $\sigma$-strong topology.

Thus the $\tau$-entire elements form a $\sigma$-strongly dense subspace of $\mathfrak{M}$. This subspace is denoted by $\mathfrak{M}_\tau$. For additional discussion of analytic elements we refer the reader to [12].

3. The Perturbation Theory of $W^*$-Dynamics

In this section, given a $W^*$-dynamics $\tau$ and a perturbation $Q$, we construct a perturbed $W^*$-dynamics $\tau_Q$. We also construct the so-called Araki–Dyson expansionals $E_Q^\tau(t)$ which intertwine these two dynamics. We describe these objects in three cases: for analytic perturbations, for bounded perturbations, and for a large class of unbounded perturbations. The constructions in the first two cases are well known, see [13, 26].

3.1. Bounded perturbations

Let $(\mathfrak{M}, \tau)$ be a $W^*$-dynamical system and $Q$ a self-adjoint element of $\mathfrak{M}$. The following formula defines the $W^*$-dynamics $\tau_Q$ on $\mathfrak{M}$:
\[ \tau_Q^t(A) = \sum_{n \ge 0} i^n \int_{0 \le t_n \le \cdots \le t_1 \le t} [\tau^{t_n}(Q), [\ldots, [\tau^{t_1}(Q), \tau^t(A)] \cdots]]\, dt_1 \cdots dt_n . \tag{3.1} \]
If $\delta$ is the generator of $\tau$, then the generator of $\tau_Q^t$ has the same domain as $\delta$ and equals
\[ \delta_Q(A) = \delta(A) + i[Q, A] . \]
Let $E_Q^\tau(t)$ be a one-parameter family of elements of $\mathfrak{M}$ given by
\[ E_Q^\tau(t) = \sum_{n \ge 0} i^n \int_{0 \le t_n \le \cdots \le t_1 \le t} \tau^{t_n}(Q) \cdots \tau^{t_1}(Q)\, dt_1 \cdots dt_n . \tag{3.2} \]
We will call $E_Q^\tau(t)$ the Araki–Dyson expansionals. Whenever there is no danger of confusion we will write $E_Q(t)$ for $E_Q^\tau(t)$. We remark that the integrals in (3.1) and (3.2) converge in the $\sigma$-weak topology and define a norm-convergent series of bounded operators.

The expansions (3.1) and (3.2) played an important role in the works of Schwinger, Tomonaga and Dyson on QED. The operators $E_Q^\tau(t)$ are closely related to the so-called Connes cocycles [25].

Let us list some properties of Araki–Dyson expansionals:

Theorem 3.1. Let $t, t_1, t_2 \in \mathbb{R}$. Then
(1) $E_Q(t)$ are unitary elements of $\mathfrak{M}$;
(2) $\tau_Q^t(A) = E_Q^\tau(t) \tau^t(A) E_Q^\tau(t)^{-1}$;
(3) $E_Q(t)^{-1} = E_Q(t)^* = \tau^t(E_Q(-t))$;
(4) $E_Q(t_1 + t_2) = E_Q(t_1) \tau^{t_1}(E_Q(t_2))$.

Assume in addition that $\mathfrak{M}$ is a concrete $W^*$-algebra in $B(\mathcal{H})$ and that $L$ is a self-adjoint operator on $\mathcal{H}$ such that $\tau^t(A) = e^{itL} A e^{-itL}$ for $A \in \mathfrak{M}$. Then
(5) $\tau_Q^t(A) = e^{it(L+Q)} A e^{-it(L+Q)}$ for $A \in \mathfrak{M}$;
(6) $E_Q(t) = e^{it(L+Q)} e^{-itL}$.
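The link between the series (3.2) and the closed formula (6) can be seen from a short formal computation. The sketch below is not part of the original text; it assumes the setting of items (5) and (6) and, to avoid domain questions, treats $L$ as if it were bounded so that differentiation in $t$ is unproblematic.

```latex
% Differentiate E_Q(t) = e^{it(L+Q)} e^{-itL} (formal, L treated as bounded):
\begin{align*}
\frac{d}{dt} E_Q(t)
  &= e^{it(L+Q)}\, i(L+Q)\, e^{-itL} - e^{it(L+Q)}\, iL\, e^{-itL}
   = i\, e^{it(L+Q)} Q\, e^{-itL} \\
  &= i\, E_Q(t)\, e^{itL} Q e^{-itL}
   = i\, E_Q(t)\, \tau^t(Q), \qquad E_Q(0) = 1 .
\end{align*}
% Integrate and iterate; the ordering matches (3.2), with later times to the right:
\begin{align*}
E_Q(t) &= 1 + i \int_0^t E_Q(t_1)\, \tau^{t_1}(Q)\, dt_1 \\
       &= 1 + i \int_0^t \tau^{t_1}(Q)\, dt_1
          + i^2 \int_0^t \!\! \int_0^{t_1} \tau^{t_2}(Q)\, \tau^{t_1}(Q)\, dt_2\, dt_1 + \cdots .
\end{align*}
% The cocycle identity (4) follows from the same formula:
\begin{align*}
E_Q(t_1+t_2)
  = e^{it_1(L+Q)} \bigl[e^{it_2(L+Q)} e^{-it_2 L}\bigr] e^{-it_1 L}
  = E_Q(t_1)\, e^{it_1 L} E_Q(t_2) e^{-it_1 L}
  = E_Q(t_1)\, \tau^{t_1}(E_Q(t_2)) .
\end{align*}
```

Since $\|\tau^{s}(Q)\| = \|Q\|$, the $n$-th term of the iterated series is bounded in norm by $(|t|\,\|Q\|)^n/n!$, which is consistent with the norm convergence of the series.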
3.2. Analytic perturbations

In this subsection we assume that $Q$ is $\tau$-entire. Then $\tau_Q$ extends to $\mathbb{C}$ by the formula
\[ \tau_Q^z(A) = \sum_{n \ge 0} (iz)^n \int_{0 \le s_n \le \cdots \le s_1 \le 1} [\tau^{s_n z}(Q), [\ldots, [\tau^{s_1 z}(Q), \tau^z(A)] \cdots]]\, ds_1 \cdots ds_n , \tag{3.3} \]
valid for $A \in \mathfrak{M}_\tau$. Thus $\mathfrak{M}_\tau = \mathfrak{M}_{\tau_Q}$.

For $\tau$-analytic $Q$, the Araki–Dyson expansionals can be defined for all complex $z$ by
\[ E_Q^\tau(z) = \sum_{n \ge 0} (iz)^n \int_{0 \le s_n \le \cdots \le s_1 \le 1} \tau^{s_n z}(Q) \cdots \tau^{s_1 z}(Q)\, ds_1 \cdots ds_n . \tag{3.4} \]
The series (3.3) and (3.4) converge in norm uniformly for $z$ in compact sets and define analytic functions with values in $\mathfrak{M}$.

Theorem 3.2. Let $z, z_1, z_2 \in \mathbb{C}$. Then
(1) $E_Q(z) \in \mathfrak{M}_\tau$;
(2) $\tau_Q^z(A) = E_Q^\tau(z) \tau^z(A) E_Q^\tau(z)^{-1}$;
(3) $E_Q(z)^{-1} = E_Q(\bar{z})^* = \tau^z(E_Q(-z))$;
(4) $E_Q(z_1 + z_2) = E_Q(z_1) \tau^{z_1}(E_Q(z_2))$.
Assume in addition that $\mathfrak{M}$ is a concrete $W^*$-algebra in $B(\mathcal{H})$ and that $L$ is a self-adjoint operator on $\mathcal{H}$ such that $\tau^t(A) = e^{itL} A e^{-itL}$ for $A \in \mathfrak{M}$. Then
(5) $\tau_Q^z(A) e^{iz(L+Q)} = e^{iz(L+Q)} A$ for $A \in \mathfrak{M}_\tau$;
(6) $E_Q(z) e^{izL} = e^{iz(L+Q)}$.

3.3. Unbounded perturbations

In this subsection we consider a concrete $W^*$-algebra $\mathfrak{M} \subset B(\mathcal{H})$ with a $W^*$-dynamics $\tau$ implemented by a self-adjoint operator $L$ and assume that $Q$ is a self-adjoint operator affiliated to $\mathfrak{M}$. We formulate the following assumption on $Q$:

Assumption 3.1. $L + Q$ is essentially self-adjoint on $D(L) \cap D(Q)$.

Theorem 3.3. Suppose that Assumption 3.1 holds and let
\[ \tau_Q^t(A) = e^{it(L+Q)} A e^{-it(L+Q)} . \tag{3.5} \]
Then
(1) $\tau_Q$ is a $W^*$-dynamics on $\mathfrak{M}$;
(2) If $Q$ is bounded, then $\tau_Q$ defined by (3.5) coincides with $\tau_Q$ defined by (3.1).

Proof. Let $A \in \mathfrak{M}$. The Trotter product formula (Theorem A.1) yields that
\[ \tau_Q^t(A) = \text{s-}\lim_{n\to\infty} (e^{itL/n} e^{itQ/n})^n A (e^{-itQ/n} e^{-itL/n})^n . \]
Since $\exp(\pm itQ/n) \in \mathfrak{M}$, $\tau_Q^t(A) \in \mathfrak{M}$. Therefore, $\tau_Q$ is a $W^*$-dynamics and (1) is proven. (2) follows from Theorem 3.1 (5).

Under Assumption 3.1 we set
\[ E_Q^\tau(t) := e^{it(L+Q)} e^{-itL} . \tag{3.6} \]
Again, for simplicity we will often write $E_Q(t)$ for $E_Q^\tau(t)$. By the Trotter product formula
\[ E_Q(t) = \text{s-}\lim_{n\to\infty} \exp(itQ/n) \exp(it\tau^{t/n}(Q)/n) \cdots \exp(it\tau^{t(n-1)/n}(Q)/n) , \]
hence $E_Q(t) \in \mathfrak{M}$.

Theorem 3.4. Suppose that Assumption 3.1 holds. Then all the statements of Theorem 3.1 hold.

3.4. Perturbations of Liouvilleans

We continue with the setup of the previous subsection. In addition, we suppose that $(\mathfrak{M}, \mathcal{H}, J, \mathcal{H}^+)$ is a standard form and that $L$ is the Liouvillean of $\tau$. Define
\[ L_Q := L + Q - JQJ . \tag{3.7} \]
We set an additional hypothesis:

Assumption 3.2. The operator $L_Q$ is essentially self-adjoint on $D(L) \cap D(Q) \cap D(JQJ)$.

The main result of this section is:

Theorem 3.5. Assume that Assumptions 3.1 and 3.2 hold. Then $L_Q$ is the Liouvillean for $\tau_Q$.

Proof. We have to show that for $t \in \mathbb{R}$:
(1) $\tau_Q^t(A) = e^{itL_Q} A e^{-itL_Q}$, $A \in \mathfrak{M}$;
(2) $e^{itL_Q}\mathcal{H}^+ \subset \mathcal{H}^+$.

Clearly,
\[ e^{itJQJ} = J e^{-itQ} J \in \mathfrak{M}' . \tag{3.8} \]
By definition, $D(L + Q) \supset D(L) \cap D(Q)$. Therefore, $D(L + Q) \cap D(JQJ) \supset D(L) \cap D(Q) \cap D(JQJ)$. Hence, by Assumption 3.2, $L_Q$ is essentially self-adjoint on $D(L + Q) \cap D(JQJ)$, and we can use the Trotter formula (Theorem A.1) to write
\[ e^{itL_Q} = \text{s-}\lim_{n\to\infty} (e^{it(L+Q)/n} e^{-itJQJ/n})^n . \]
Therefore, for all $A \in \mathfrak{M}$,
\begin{align*}
\tau_Q^t(A) &= e^{it(L+Q)} A e^{-it(L+Q)} \\
 &= \text{s-}\lim_{n\to\infty} (e^{it(L+Q)/n} e^{-itJQJ/n})^n A (e^{itJQJ/n} e^{-it(L+Q)/n})^n \\
 &= e^{itL_Q} A e^{-itL_Q} . \tag{3.9}
\end{align*}
This yields (1).

To establish (2), note that since $e^{itQ}$ and $e^{itJQJ}$ commute,
\[ e^{it(Q - JQJ)} = e^{itQ} J e^{itQ} J . \]
Hence $e^{it(Q-JQJ)}\mathcal{H}^+ \subset \mathcal{H}^+$. Moreover, $e^{itL}\mathcal{H}^+ \subset \mathcal{H}^+$. By definition, $D(Q) \cap D(JQJ) \subset D(Q - JQJ)$. Therefore, $D(L) \cap D(Q - JQJ) \supset D(L) \cap D(Q) \cap D(JQJ)$. Hence $L_Q$ is essentially self-adjoint on $D(L) \cap D(Q - JQJ)$ and it follows from Theorem A.1 that
\[ e^{itL_Q} = \text{s-}\lim_{n\to\infty} (e^{itL/n} e^{it(Q-JQJ)/n})^n . \]
This and the fact that $\mathcal{H}^+$ is a closed set imply (2).
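In finite dimension the content of Theorem 3.5 becomes transparent. The following sketch is not part of the original text; it assumes the matrix algebra $M_n(\mathbb{C})$ acting by left multiplication on the Hilbert–Schmidt space, with $JX = X^*$, $\mathcal{H}^+$ the positive semidefinite matrices, and hypothetical self-adjoint matrices $h$ and $Q$.

```latex
% Assumed setting: M = M_n(C) on H = M_n(C), (X|Y) = Tr(X^* Y), J X = X^*,
% H^+ = {X >= 0}, tau^t(A) = e^{ith} A e^{-ith}, L X = hX - Xh, Q = Q^* in M.
\begin{align*}
JQJ\, X &= (Q X^*)^* = XQ, \\
L_Q X &= (L + Q - JQJ) X = hX - Xh + QX - XQ = (h+Q)X - X(h+Q), \\
e^{itL_Q} X &= e^{it(h+Q)}\, X\, e^{-it(h+Q)} .
\end{align*}
% (1) e^{itL_Q} implements the perturbed dynamics on left multiplications:
\begin{align*}
e^{itL_Q} A\, e^{-itL_Q} X
  = e^{it(h+Q)} A\, e^{-it(h+Q)} X
  = \tau_Q^t(A)\, X .
\end{align*}
% (2) Conjugation by a unitary preserves positivity:
%     X >= 0  implies  e^{it(h+Q)} X e^{-it(h+Q)} >= 0.
```

By contrast, the operator $L + Q$, which acts as $X \mapsto (h+Q)X - Xh$, also implements $\tau_Q$ but in general does not preserve the cone; subtracting $JQJ$ is what restores property (2).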
The following formulas are sometimes useful:

Theorem 3.6. (1) Assume that Assumptions 3.1 and 3.2 hold. Then for $t \in \mathbb{R}$,
\[ E_Q(t) = e^{itL_Q} e^{-it(L - JQJ)} , \qquad e^{itL_Q} = J E_Q(t) J e^{itL} E_Q(-t)^{-1} . \]
(2) Assume that $Q$ is $\tau$-analytic. Then for $z \in \mathbb{C}$,
\[ E_Q(z) = e^{izL_Q} e^{-iz(L - JQJ)} , \qquad e^{izL_Q} = J E_Q(\bar{z}) J e^{izL} E_Q(-z)^{-1} . \]

4. Relative Modular Theory and Relative Entropy

One of the main tools used in our paper is the relative modular theory and relative entropy. We devote this section to a concise introduction to this subject. Our presentation follows partly [3, 4, 6, 27, 28].

4.1. Relative modular operator

Let $\mathfrak{M} \subset B(\mathcal{H})$ be a $W^*$-algebra. Let $\Phi, \Psi \in \mathcal{H}$. Following Araki [28], we define the operator $S_{\Phi,\Psi}$ on the domain $\mathfrak{M}\Psi + (1 - s'_\Psi)\mathcal{H}$ by
\[ S_{\Phi,\Psi}(A\Psi + \Theta) = s_\Psi A^* \Phi , \]
where $A \in \mathfrak{M}$ and $\Theta \in (1 - s'_\Psi)\mathcal{H} = (\mathfrak{M}\Psi)^\perp$. It is easy to check that $S_{\Phi,\Psi}$ is a well-defined antilinear closable operator. Its closure will be denoted by the same symbol. It is useful to note that
\[ \mathfrak{M}\Psi = \{A\Psi : A \in \mathfrak{M},\ A s_\Psi = A\} , \]
and that for $A \in \mathfrak{M}$ satisfying $A s_\Psi = A$ and $\Theta$ as above we have
\[ S_{\Phi,\Psi}(A\Psi + \Theta) = A^* \Phi . \tag{4.1} \]
The positive operator
\[ \Delta_{\Phi,\Psi} = S_{\Phi,\Psi}^* S_{\Phi,\Psi} \]
will be called the relative modular operator. The following facts are proven in [28]:

Theorem 4.1. (1) $\mathrm{Ker}\, \Delta_{\Phi,\Psi} = \mathrm{Ker}\, s'_\Psi s_\Phi$;
(2) $\Delta_{\lambda\Phi,\mu\Psi} = \frac{\lambda^2}{\mu^2} \Delta_{\Phi,\Psi}$, $\lambda, \mu \in \mathbb{R}$;
(3) if $B$ belongs to the center of $\mathfrak{M}$, then $B$ commutes with $\Delta_{\Phi,\Psi}$.
In the remaining part of the theorem we assume that $(\mathfrak{M}, \mathcal{H}, J, \mathcal{H}^+)$ is a standard form and $\Phi, \Psi \in \mathcal{H}^+$. Then
(4) $S_{\Phi,\Psi} = J \Delta_{\Phi,\Psi}^{1/2}$;
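(5) $\Delta_{\Phi,\Psi}^{1/2} \Psi = \Delta_{\Phi,\Psi}^{1/2} s_\Phi \Psi = s'_\Psi \Phi$;
(6) $J \Delta_{\Psi,\Phi} J \Delta_{\Phi,\Psi} = \Delta_{\Phi,\Psi} J \Delta_{\Psi,\Phi} J = s'_\Psi s_\Phi$.

The following convergence property of relative modular operators will be useful.

Theorem 4.2. Let $(\mathfrak{M}, \mathcal{H}, J, \mathcal{H}^+)$ be a standard form. Suppose that $\Psi_n, \Phi_n \in \mathcal{H}^+$, that $\Delta_{\Phi_n,\Psi_n} \to M$ in the strong resolvent sense, and that $\text{w-}\lim_n \Psi_n = \Psi$, $\text{s-}\lim_n s_{\Psi_n} = s_\Psi$ and $\text{w-}\lim_n \Phi_n = \Phi$. Then $M = \Delta_{\Phi,\Psi}$.

Proof. For $A \in \mathfrak{M}$,
\[ \Delta_{\Phi_n,\Psi_n}^{1/2} A \Psi_n = J s_{\Psi_n} A^* \Phi_n . \]
Note that $A\Psi_n \to A\Psi$ weakly and $J s_{\Psi_n} A^* \Phi_n \to J s_\Psi A^* \Phi$ weakly. Hence, by Proposition A.4 and the remark after it, $A\Psi \in D(M)$ and
\[ M A \Psi = J s_\Psi A^* \Phi . \]
Now let $\Theta \in (1 - s'_\Psi)\mathcal{H}$ and $\Theta_n := (1 - s'_{\Psi_n})\Theta$. Since $s'_{\Psi_n} \to s'_\Psi$ strongly, $\Theta_n \to \Theta$ strongly. Since $\Delta_{\Phi_n,\Psi_n}\Theta_n = 0$, $\Theta \in D(M)$ and $M\Theta = 0$. This yields $M = \Delta_{\Phi,\Psi}$.

4.2. Relative entropy

Let $\mathfrak{M}$ be a $W^*$-algebra. The relative entropy of two functionals $\psi, \phi \in \mathfrak{M}_*^+$, denoted $\mathrm{Ent}(\psi|\phi)$, is defined as follows. Choose a standard form $(\pi, \mathcal{H}, J, \mathcal{H}^+)$ of $\mathfrak{M}$ and let $\Psi$, $\Phi$ be the standard vector representatives of $\psi$, $\phi$. Then
\[ \mathrm{Ent}(\psi|\phi) = \begin{cases} (\Psi| \log \Delta_{\Phi,\Psi} \Psi) & \text{if } s_\psi \le s_\phi , \\ -\infty & \text{otherwise} . \end{cases} \]
The relative entropy was introduced by Araki in the fundamental papers [27, 28]. In the above definition we used the sign and ordering convention of [13]. The relative entropy is discussed in detail in the monograph [4]. We will need the following well-known facts [3, 4, 27, 28].

Theorem 4.3. (1)
\[ \mathrm{Ent}(\psi|\phi) = \lim_{t \downarrow 0} t^{-1} \bigl( \|\Delta_{\Phi,\Psi}^{t/2} \Psi\|^2 - \|\Psi\|^2 \bigr) ; \tag{4.2} \]
(2) for $\mu, \lambda \in \mathbb{R}_+$,
\[ \mathrm{Ent}(\lambda\psi|\mu\phi) = \lambda\, \mathrm{Ent}(\psi|\phi) + \lambda\psi(1)(\log\mu - \log\lambda) ; \]
(3)
\[ \mathrm{Ent}(\psi|\phi) \le \psi(1)\bigl(\log\phi(s_\psi) - \log\psi(1)\bigr) , \]
in particular, if $\phi(s_\psi) = \psi(1)$ then $\mathrm{Ent}(\psi|\phi) \le 0$;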
(4) if $Q$ is a self-adjoint element in the center of $\mathfrak{M}$ and $\psi(1) = 1$, then
\[ \mathrm{Ent}(\psi|\phi) + \psi(Q) \le \log\phi(e^Q) . \]

Proof. (1) Assume first that $s_\Psi \le s_\Phi$. Then the statement follows from the spectral theorem, the monotone convergence theorem and the fact that
\[ \lim_{t \downarrow 0} t^{-1}(x^t - 1) = \log x , \]
decreasingly on $]0, \infty[$. If $s_\Phi \Psi \ne \Psi$, then $\Psi = \Psi_1 + \Psi_2$, where $\Psi_1 \ne 0$, $\Psi_1 \perp \Psi_2$ and $\Psi_1 \in \mathrm{Ker}\, \Delta_{\Phi,\Psi}$, and one easily shows that the limit in (4.2) is $-\infty$.

The scaling property of Theorem 4.1 yields (2).

We first prove part (3) under the assumption $\phi(s_\psi) = \psi(1) = 1$. Using
\[ \log x \le x - 1 , \qquad x > 0 , \tag{4.3} \]
we get $\log \Delta_{\Phi,\Psi} \le \Delta_{\Phi,\Psi} - 1$. Thus
\[ \mathrm{Ent}(\psi|\phi) \le \|\Delta_{\Phi,\Psi}^{1/2} \Psi\|^2 - \|\Psi\|^2 = \phi(s_\psi) - \psi(1) = 0 . \]
(We used $\Delta_{\Phi,\Psi}^{1/2} \Psi = s'_\Psi \Phi = J s_\Psi \Phi$.) To extend (3) to arbitrary $\phi$, $\psi$, use (2).

To prove (4), note that since $e^Q$ commutes with $\Delta_{\Phi,\Psi}$,
\[ \log \Delta_{\Phi,\Psi} + Q - \log\phi(e^Q s_\psi) = \log\bigl(\Delta_{\Phi,\Psi} e^Q / \phi(e^Q s_\psi)\bigr) . \]
The inequality (4.3) yields
\[ \log\bigl(\Delta_{\Phi,\Psi} e^Q / \phi(e^Q s_\psi)\bigr) \le \Delta_{\Phi,\Psi} e^Q / \phi(e^Q s_\psi) - 1 . \]
Hence
\begin{align*}
\mathrm{Ent}(\psi|\phi) + \psi(Q) - \log\phi(e^Q s_\psi)
 &\le \|\Delta_{\Phi,\Psi}^{1/2} e^{Q/2} \Psi\|^2 / \phi(e^Q s_\psi) - 1 \\
 &= \|e^{Q/2} s'_\psi \Phi\|^2 / \phi(e^Q s_\psi) - 1 = 0 ,
\end{align*}
where we used $\|e^{Q/2} s'_\psi \Phi\| = \|e^{Q/2} J s_\psi J \Phi\| = \|J e^{Q/2} s_\psi \Phi\| = \|e^{Q/2} s_\psi \Phi\|$.

4.3. Uhlmann's monotonicity theorem

In this subsection we prove a relative entropy inequality due to Uhlmann [6]. Our proof follows the steps of an argument in [4] and is based on an interpolation theorem for self-adjoint operators (Theorem A.2 in the appendix). A different proof can be found in [29].

Let $\mathfrak{M}_1$ and $\mathfrak{M}_2$ be $W^*$-algebras. A map $\gamma : \mathfrak{M}_1 \to \mathfrak{M}_2$ is called a Schwarz map iff $\gamma(1) = 1$ and $\gamma(A^*A) \ge \gamma(A)^*\gamma(A)$.
Theorem 4.4 (Uhlmann's monotonicity theorem). Let $\psi_i$, $\phi_i$ be normal states on $\mathfrak{M}_i$, $i = 1, 2$, and let $\gamma : \mathfrak{M}_1 \to \mathfrak{M}_2$ be a Schwarz map such that
\[ \psi_2 \circ \gamma = \psi_1 , \tag{4.4} \]
\[ \phi_2 \circ \gamma = \phi_1 . \tag{4.5} \]
Then
\[ \mathrm{Ent}(\psi_2|\phi_2) \le \mathrm{Ent}(\psi_1|\phi_1) . \]

The following inequality is a consequence of Uhlmann's theorem:

Corollary 4.1. Let $\mathfrak{N} \subset \mathfrak{M}$ be $W^*$-algebras with common identity and $\psi, \phi \in \mathfrak{M}_*^{+,1}$. Then
\[ \mathrm{Ent}(\psi|\phi) \le \mathrm{Ent}(\psi|_{\mathfrak{N}} \,|\, \phi|_{\mathfrak{N}}) . \]

Proof. The inclusion map $\gamma : \mathfrak{N} \to \mathfrak{M}$ is a Schwarz map and satisfies the conditions of Theorem 4.4 with respect to $\psi$, $\phi$ and the restricted states $\psi|_{\mathfrak{N}}$, $\phi|_{\mathfrak{N}}$.

To prove Uhlmann's theorem it is convenient to work in the standard representation and to translate the problem into the language of operators on Hilbert spaces. Hence we assume that $\mathfrak{M}_i \subset B(\mathcal{H}_i)$ and that $(\mathfrak{M}_i, \mathcal{H}_i, J_i, \mathcal{H}_i^+)$ is a standard form.

Let $\gamma : \mathfrak{M}_1 \to \mathfrak{M}_2$ be a Schwarz map. Let $\psi_i \in \mathfrak{M}_{i,*}^+$ satisfy (4.4) and let $\Psi_i$ be the standard vector representatives of $\psi_i$. Set $\mathcal{D}_1 := \mathfrak{M}_1\Psi_1 + (\mathfrak{M}_1\Psi_1)^\perp$. We define a linear map $T : \mathcal{D}_1 \to \mathcal{H}_2$ by
\[ T(A\Psi_1 + \Theta_1) := \gamma(A)\Psi_2 \]
for $A \in \mathfrak{M}_1$ and $\Theta_1 \in (\mathfrak{M}_1\Psi_1)^\perp$. Since $\gamma(1) = 1$, $T\Psi_1 = \Psi_2$.

Lemma 4.1. The map $T$ is well defined and extends to a contraction from $\mathcal{H}_1$ to $\mathcal{H}_2$.

Proof.
\[ \|\gamma(A)\Psi_2\|^2 = \psi_2(\gamma(A)^*\gamma(A)) \le \psi_2(\gamma(A^*A)) = \psi_1(A^*A) = \|A\Psi_1\|^2 . \tag{4.6} \]
Hence if $(A - B)\Psi_1 = 0$, then $(\gamma(A) - \gamma(B))\Psi_2 = 0$. Therefore, $T$ is well defined. By (4.6), $T$ is a contraction.

Let $\Phi_i$ be the standard vector representative of $\phi_i$. The main step of the proof of Theorem 4.4 is the following interpolation estimate for the relative modular operator:
Lemma 4.2. For $0 \le t \le 1$,
\[ \|\Delta_{\Phi_2,\Psi_2}^{t/2} \Psi_2\| \le \|\Delta_{\Phi_1,\Psi_1}^{t/2} \Psi_1\| . \]

Proof. The space $\mathcal{D}_1$, defined above, is a core for $\Delta_{\Phi_1,\Psi_1}^{1/2}$. Let $A \in \mathfrak{M}_1$ with $A = A s_{\Psi_1}$. For $\Omega_1 = A\Psi_1 + \Theta_1 \in \mathcal{D}_1$ we get
\begin{align*}
\Delta_{\Phi_2,\Psi_2}^{1/2} T \Omega_1 &= \Delta_{\Phi_2,\Psi_2}^{1/2} \gamma(A) \Psi_2 = J s_{\Psi_2} \gamma(A)^* \Phi_2 , \\
\Delta_{\Phi_1,\Psi_1}^{1/2} \Omega_1 &= \Delta_{\Phi_1,\Psi_1}^{1/2} A \Psi_1 = J A^* \Phi_1 .
\end{align*}
By (4.5),
\[ \|J s_{\Psi_2} \gamma(A)^* \Phi_2\|^2 \le \phi_2(\gamma(A)\gamma(A)^*) \le \phi_2(\gamma(AA^*)) = \phi_1(AA^*) = \|J A^* \Phi_1\|^2 . \]
Hence
\[ \|\Delta_{\Phi_2,\Psi_2}^{1/2} T \Omega_1\| \le \|\Delta_{\Phi_1,\Psi_1}^{1/2} \Omega_1\| . \]
By Lemma 4.1, $T$ is a contraction. Hence, by Theorem A.2, for $t \in [0, 1]$,
\[ \|\Delta_{\Phi_2,\Psi_2}^{t/2} T \Omega_1\| \le \|\Delta_{\Phi_1,\Psi_1}^{t/2} \Omega_1\| . \]
Setting $\Omega_1 = \Psi_1$ we derive the statement.

Proof of Theorem 4.4. Using Theorem 4.3 (1), Lemma 4.2 and $1 = \|\Psi_1\|^2 = \|\Psi_2\|^2$, we obtain
\begin{align*}
\mathrm{Ent}(\psi_2|\phi_2) &= \lim_{t \downarrow 0} t^{-1} \bigl( \|\Delta_{\Phi_2,\Psi_2}^{t/2} \Psi_2\|^2 - \|\Psi_2\|^2 \bigr) \\
 &\le \lim_{t \downarrow 0} t^{-1} \bigl( \|\Delta_{\Phi_1,\Psi_1}^{t/2} \Psi_1\|^2 - \|\Psi_1\|^2 \bigr) \\
 &= \mathrm{Ent}(\psi_1|\phi_1) .
\end{align*}

5. Perturbation Theory of KMS States

Let $\beta > 0$. In this section, given a $(\tau, \beta)$-KMS state $\omega$ and a perturbation $Q$, we describe the construction of the perturbed $\beta$-KMS state $\omega_Q$. We also prove various properties of this state, including the Peierls–Bogoliubov and the Golden–Thompson inequalities. The Golden–Thompson inequality plays an important role in our construction.

The construction is performed on three levels: for analytic perturbations, bounded perturbations and a class of unbounded perturbations. Although the results on the first two levels are well known, the method of the proof on the second level (bounded perturbations) is new. The results concerning unbounded perturbations are new and they are the main results of our paper.
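5.1. Bounded perturbations

Let $(\mathfrak{M}, \mathcal{H}, J, \mathcal{H}^+)$ be a $W^*$-algebra in the standard form. Let $\tau$ be the $W^*$-dynamics on $\mathfrak{M}$ with the standard Liouvillean $L$. Let $\omega$ be a faithful $(\tau, \beta)$-KMS state with the standard vector representative $\Omega$. Let $Q \in \mathfrak{M}$ be self-adjoint and $\tau_Q$ the perturbed $W^*$-dynamics defined by (3.1). By Theorem 3.5, $L_Q = L + Q - JQJ$ is the standard Liouvillean of $\tau_Q$. The following two theorems summarize the (bounded) perturbation theory of KMS states developed by Araki.

Theorem 5.1. (1) $\Omega \in D(e^{-\beta(L+Q)/2})$. Set
\[ \Omega_Q := e^{-\beta(L+Q)/2}\Omega , \qquad \omega_Q(A) = (\Omega_Q|A\Omega_Q)/\|\Omega_Q\|^2 . \]
(2) $\Omega_Q \in \mathcal{H}^+$.
(3) $\Omega_Q$ is a cyclic and separating vector for $\mathfrak{M}$.
(4) The state $\omega_Q$ is a $(\tau_Q, \beta)$-KMS state.
(5) $\log\Delta_{\Omega_Q} = -\beta L_Q$.
(6) For all self-adjoint $Q_1, Q_2 \in \mathfrak{M}$,
\[ (\Omega_{Q_1})_{Q_2} = \Omega_{Q_1+Q_2} , \qquad (\omega_{Q_1})_{Q_2} = \omega_{Q_1+Q_2} . \]
(7) $\log\Delta_{\Omega_Q,\Omega} = \log\Delta_\Omega - \beta Q$.
(8) $\log\Delta_{\Omega,\Omega_Q} = \log\Delta_{\Omega_Q} + \beta Q$.
(9) $\mathrm{Ent}(\omega|\omega_Q) + \beta\omega(Q) = -\log\|\Omega_Q\|^2$.
(10) $\mathrm{Ent}(\omega_Q|\omega) - \beta\omega_Q(Q) = \log\|\Omega_Q\|^2$.
(11) The Peierls–Bogoliubov inequality holds: $e^{-\beta(\Omega|Q\Omega)/2} \le \|\Omega_Q\|$.
(12) The Golden–Thompson inequality holds: $\|\Omega_Q\| \le \|e^{-\beta Q/2}\Omega\|$.
(13) Assume that $Q_n \in \mathfrak{M}$ are self-adjoint and $Q_n \to Q$ strongly. Then $\Omega_{Q_n} \to \Omega_Q$ and $\omega_{Q_n} \to \omega_Q$ in norm.

Theorem 5.2. Let
\[ T_{\beta,n} = \{(\beta_1, \ldots, \beta_n) \in \mathbb{R}^n : \beta_i \ge 0,\ i = 1, \ldots, n,\ \beta_1 + \cdots + \beta_n \le \beta/2\} . \]
Then $\Omega \in D(e^{-\beta_1 L} Q \cdots e^{-\beta_n L} Q)$ for $(\beta_1, \ldots, \beta_n) \in T_{\beta,n}$, the function
\[ T_{\beta,n} \ni (\beta_1, \ldots, \beta_n) \mapsto e^{-\beta_1 L} Q \cdots e^{-\beta_n L} Q \Omega \]
is norm continuous,
\[ \sup_{(\beta_1,\ldots,\beta_n) \in T_{\beta,n}} \|e^{-\beta_1 L} Q \cdots e^{-\beta_n L} Q \Omega\| \le \|Q\|^n , \tag{5.1} \]
and
\[ \Omega_Q = \sum_{n=0}^\infty (-1)^n \int \cdots \int_{T_{\beta,n}} e^{-\beta_1 L} Q \cdots e^{-\beta_n L} Q \Omega\, d\beta_1 \cdots d\beta_n . \tag{5.2} \]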
J. Dereziński, V. Jakšić & C.-A. Pillet
We have separated Theorem 5.2 from the other results of Araki’s theory for several reasons. Theorem 5.2 contains the main idea of Araki’s original proof of Theorem 5.1. In fact, his proof was centered around the expansion (5.2). Our methods are in a certain sense orthogonal to Araki’s and we do not need Theorem 5.2 to prove Theorem 5.1. The expansion (5.2) is additional information about $\Omega_Q$ which, strictly speaking, cannot be derived by our methods alone. Hence, for bounded perturbations our method yields a slightly weaker result than Araki’s method. On the other hand, our method is simpler and easily extends to a large class of unbounded perturbations $Q$.

Both Araki’s method and ours start with analytic perturbations. In this case, the proofs of Theorems 5.1 and 5.2 are essentially algebraic and relatively easy. For a general bounded $Q$ one picks a sequence of analytic $Q_n$ with $Q_n \to Q$ and uses various limit arguments to establish the theorems. The key difference between the two methods concerns these limit arguments: we use weak limits while Araki uses strong limits. The use of weak limits leads to some technical simplifications and the method naturally extends to unbounded perturbations.

Finally, we mention some additional estimates which can be used to compare $\Omega$ with $\Omega_Q$.

Theorem 5.3.
(1) $\|\Omega_Q - \Omega\| \le e^{\beta\|Q\|/2} - 1$.
(2) $\beta(\Omega|Q\Omega)/2 \ge \|\Omega\|^2 - (\Omega|\Omega_Q) \ge \beta(\Omega|Q\Omega_Q)/2 \ge (\Omega|\Omega_Q) - \|\Omega_Q\|^2 \ge \beta(\Omega_Q|Q\Omega_Q)/2$.
(3) $\beta(\Omega|Q\Omega) \ge \|\Omega\|^2 - \|\Omega_Q\|^2 \ge \beta(\Omega_Q|Q\Omega_Q)$.
(4) $\|\Omega_Q - \Omega\|^2 \le \beta(\Omega|Q\Omega)/2 - \beta(\Omega_Q|Q\Omega_Q)/2$.
(5) $\|\Omega_Q - \Omega\| \le \beta f(\|Q\Omega\|, \|Q\Omega_Q\|)/2$, where, for $x, y > 0$, we set
$$f(x, y) := \begin{cases} \dfrac{x - y}{\log x - \log y} , & x \ne y ; \\ x , & x = y . \end{cases}$$
The estimate (1) follows immediately from (5.1) and is of course well-known. The estimates (2)–(5) appear to be new.
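The step from (5.1) to estimate (1) can be spelled out: the simplex $T_{\beta,n}$ has volume $(\beta/2)^n/n!$, so the $n$-th term of (5.2) has norm at most $(\beta\|Q\|/2)^n/n!$, and summing over $n \ge 1$ gives $e^{\beta\|Q\|/2} - 1$. The following minimal sketch checks this arithmetic numerically; the function names and the sample values of $\beta$ and $\|Q\|$ are illustrative assumptions, not part of the paper.

```python
import math

def simplex_volume(beta, n):
    """Volume of T_{beta,n} = {b_i >= 0, b_1 + ... + b_n <= beta/2}."""
    return (beta / 2) ** n / math.factorial(n)

def series_bound(beta, q_norm, terms=60):
    """Sum over n >= 1 of vol(T_{beta,n}) * ||Q||^n: termwise bound on (5.2) minus its n = 0 term."""
    return sum(simplex_volume(beta, n) * q_norm ** n for n in range(1, terms + 1))

# Hypothetical sample values for beta and ||Q||.
beta, q_norm = 2.0, 1.5
bound = series_bound(beta, q_norm)
closed_form = math.exp(beta * q_norm / 2) - 1   # right-hand side of Theorem 5.3 (1)
print(bound, closed_form)
```

The two printed numbers agree up to rounding, which is all that estimate (1) needs from (5.1).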
5.2. Analytic perturbations - proofs
In this section we prove Theorem 5.1 for analytic self-adjoint perturbations Q ∈ M τ . The proofs are based on the algebraic arguments and are relatively easy. Proof of Theorem 5.1 in the analytic case. (1) For t real, EQ (t)Ω = eit(L+Q) e−itL Ω = eit(L+Q) Ω . Since EQ (t) has an analytic continuation to an entire function z 7→ EQ (z), Ω ∈ D(eiz(L+Q) ) for all z ∈ C and EQ (z)Ω = eiz(L+Q) Ω. In particular, ΩQ = EQ (iβ/2)Ω .
(5.3)
(2) We have
$$E_Q(i\beta/2) = E_Q(i\beta/4)\,\tau^{i\beta/4}(E_Q(i\beta/4)) = E_Q(i\beta/4)\,\tau^{i\beta/2}(E_Q(i\beta/4)^*) .$$
Hence, by (5.3),
$$\Omega_Q = E_Q(i\beta/4)\,e^{-\beta L/2}E_Q(i\beta/4)^*\Omega = E_Q(i\beta/4)\,J E_Q(i\beta/4)\Omega .$$
Therefore, $\Omega_Q \in \mathcal{H}^+$.

(3) Since $E_Q(i\beta/2)$ is an invertible element of $M$, $\Omega_Q$ is obviously a cyclic and separating vector for $M$.

(4) Theorem 3.6 yields
$$e^{-\beta L_Q/2} = J E_Q(-i\beta/2) J\, e^{-\beta L/2} E_Q(-i\beta/2)^{-1} ,$$
and $M\Omega_Q = M\Omega \subset D(e^{-\beta L_Q/2})$. Moreover, for $A \in M$,
$$e^{-\beta L_Q/2} A\Omega_Q = J E_Q(-i\beta/2) J e^{-\beta L/2} E_Q(-i\beta/2)^{-1} A E_Q(i\beta/2)\Omega$$
$$= J E_Q(-i\beta/2) E_Q(i\beta/2)^* A^* E_Q(-i\beta/2)^{-1*}\Omega$$
$$= J E_Q(-i\beta/2) E_Q(-i\beta/2)^{-1} A^* E_Q(i\beta/2)\Omega = J A^*\Omega_Q .$$

(5) By Theorem 3.5, we know that $L_Q := L + Q - JQJ$ is the Liouvillean of $\tau_Q$. By Theorem 2.13 we know that $\Delta_{\Omega_Q} = e^{-\beta L_Q}$.

(6) follows from
$$E^{\tau_{Q_1}}_{Q_2}(i\beta/2)\, E^{\tau}_{Q_1}(i\beta/2) = E^{\tau}_{Q_1+Q_2}(i\beta/2) ,$$
which is an immediate consequence of Theorem 3.1 (6), where $L + Q_1$ is to be used for $L$ in the expression for $E^{\tau_{Q_1}}_{Q_2}(t)$.

(7) The relation
$$S_\Omega E_Q(i\beta/2)^* A\Omega = A^*\Omega_Q = S_{\Omega_Q,\Omega} A\Omega$$
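implies that $S_{\Omega_Q,\Omega} = S_\Omega E_Q(i\beta/2)^*$. Hence
$$\Delta_{\Omega_Q,\Omega} = S^*_{\Omega_Q,\Omega} S_{\Omega_Q,\Omega} = E_Q(i\beta/2)\Delta_\Omega E_Q(i\beta/2)^* = (E_Q(i\beta/2)e^{-\beta L/2})(e^{-\beta L/2}E_Q(i\beta/2)^*) = e^{-\beta(L+Q)} ,$$
where we used $\Delta_\Omega = e^{-\beta L}$.

(8) follows from (7) if we note that, by (6), $(\Omega_Q)_{-Q} = \Omega$.

(9) Set $\tilde Q := Q + \beta^{-1}\log\|\Omega_Q\|^2$. Then $\omega_Q = \omega_{\tilde Q}$ and $\Omega_{\tilde Q} = \Omega_Q/\|\Omega_Q\|$. Using (7) we get
$$\log \Delta_{\Omega_{\tilde Q},\Omega} = \log \Delta_\Omega - \beta\tilde Q ,$$
which implies
$$\mathrm{Ent}(\omega|\omega_Q) = -\beta\omega(\tilde Q) .$$

(10) Similarly, using (8) we get
$$\log \Delta_{\Omega,\Omega_{\tilde Q}} = \log \Delta_{\Omega_{\tilde Q}} + \beta\tilde Q ,$$
which implies
$$\mathrm{Ent}(\omega_Q|\omega) = \beta\omega_Q(\tilde Q) .$$

(11) Since $\mathrm{Ent}(\omega|\omega_Q) \le 0$, (9) yields that
$$e^{-\beta(\Omega|Q\Omega)/2} \le \|\Omega_Q\| .$$
This is the Peierls–Bogoliubov inequality.

(12) Let $N$ be the Abelian von Neumann subalgebra of $M$ generated by $Q$. Then,
$$\log\|\Omega_Q\|^2 = \mathrm{Ent}(\omega_Q|\omega) - \beta\omega_Q(Q) \le \mathrm{Ent}(\omega_Q|_N \,|\, \omega|_N) - \beta\omega_Q(Q) \le \log\omega(e^{-\beta Q}) = \log\|e^{-\beta Q/2}\Omega\|^2 , \tag{5.4}$$
and so
$$\|\Omega_Q\| \le \|e^{-\beta Q/2}\Omega\| .$$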
This is the Golden–Thompson inequality. In the first step of (5.4) we used (10), in the second Uhlmann’s estimate of Corollary 4.1, and in the third Theorem 4.3 (4) with $Q$ replaced by $-\beta Q$.

(13) is a general fact which has the same proof for analytic and bounded perturbations. Its proof is given in the next section.

We remark that the Golden–Thompson inequality was first proven by Araki [5]. The proof described in (12) is due to Donald [3].

5.3. Bounded perturbations - proofs
In this subsection we prove Theorem 5.1. We assume that $Q$ is an arbitrary self-adjoint element of $M$. By Theorem 2.15, we can find a sequence $Q_n$ of self-adjoint $\tau$-analytic elements such that $Q_n \to Q$ $\sigma$-strongly. This implies that $Q_n \to Q$ strongly and the following lemma holds:

Lemma 5.1. (1) $L + Q_n \to L + Q$ in the strong resolvent sense.
(2) $L_{Q_n} \to L_Q$ in the strong resolvent sense.

Proof of Theorem 5.1. (1) Clearly, $\lim_n e^{-\beta Q_n/2}\Omega = e^{-\beta Q/2}\Omega$. Hence there exists $C$ such that for all $n$,
$$\|e^{-\beta Q_n/2}\Omega\| \le C .$$
By the Golden–Thompson inequality for analytic perturbations,
$$\|\Omega_{Q_n}\| \le \|e^{-\beta Q_n/2}\Omega\| .$$
Hence $\|\Omega_{Q_n}\| \le C$. Now by Proposition A.4, $\Omega \in D(e^{-\beta(L+Q)/2})$ and
$$\mathrm{w\text{-}}\lim_{n\to\infty} e^{-\beta(L+Q_n)/2}\Omega = e^{-\beta(L+Q)/2}\Omega .$$

(2) follows from the analytic case of (2) and the fact that $\mathcal{H}^+$ is weakly closed.

(3) Let $P := 1 - s_{\Omega_Q}$. Clearly, $P \in M$, $\tau_Q^t(P) = P$ and $P\Omega_Q = 0$. Set
$$\Omega(z) = e^{-z(L+Q)}\Omega .$$
By Proposition A.1, the vector-valued function $\Omega(z)$ is analytic inside the strip $0 < \mathrm{Re}\, z < \beta/2$ and norm continuous on its closure. Moreover, $\Omega(\beta/2) = \Omega_Q$ and
$$e^{it(L+Q)} P\,\Omega(it + \beta/2) = e^{it(L+Q)} P e^{-it(L+Q)}\Omega(\beta/2) = \tau_Q^t(P)\Omega_Q = P\Omega_Q = 0 .$$
Thus, for all real $t$, $P\Omega(it + \beta/2) = 0$. This implies that $P\Omega(z) = 0$ for all $z$ in the strip $0 \le \mathrm{Re}\, z \le \beta/2$. In particular, $P\Omega(0) = P\Omega = 0$. Since $\Omega$ is a separating vector for $M$, $P = 0$. Hence $s_{\Omega_Q} = 1$ and $\Omega_Q$ is a separating vector for $M$. Since $\Omega_Q$ is separating, (2) and Theorem 2.7 (3) imply that $\Omega_Q$ is also cyclic.
(4) follows from the analytic case of (4) and Theorem 2.14. (5), (7) and (8) follow from their analytic versions and Theorem 4.2. (6) Let now Q1 , Q2 be two self-adjoint elements and Q1,n , Q2,n the sequences of the corresponding analytic approximations. Then, by the analytic case of (6), (ΩQ1,m )Q2,n = ΩQ1,m +Q2,n . As n → ∞, (ΩQ1,m )Q2,n → (ΩQ1,m )Q2 weakly, ΩQ1,m +Q2,n → ΩQ1,m +Q2 weakly, and so (ΩQ1,m )Q2 = ΩQ1,m +Q2 .
(5.5)
By the arguments of the proof of (1), as $m \to \infty$, $\Omega_{Q_{1,m}+Q_2} \to \Omega_{Q_1+Q_2}$ weakly. Moreover,
$$(\Omega_{Q_{1,m}})_{Q_2} = e^{-\beta(L+Q_{1,m}-JQ_{1,m}J+Q_2)/2}\,\Omega_{Q_{1,m}} ,$$
$\Omega_{Q_{1,m}} \to \Omega_{Q_1}$ weakly and $L + Q_{1,m} - JQ_{1,m}J + Q_2 \to L + Q_1 - JQ_1J + Q_2$ in the strong resolvent sense. Hence by Proposition A.4, $\Omega_{Q_1} \in D(e^{-\beta(L+Q_1-JQ_1J+Q_2)/2})$ and
$$(\Omega_{Q_1})_{Q_2} = e^{-\beta(L+Q_1-JQ_1J+Q_2)/2}\,\Omega_{Q_1} = \Omega_{Q_1+Q_2} .$$

(9) and (10) follow from (7) and (8) precisely as in the analytic case.

(11) (The Peierls–Bogoliubov inequality) follows from (9) just as in the analytic case.

(12) $\lim_n e^{-\beta Q_n/2}\Omega = e^{-\beta Q/2}\Omega$ implies
$$\lim_{n\to\infty} \|e^{-\beta Q_n/2}\Omega\| = \|e^{-\beta Q/2}\Omega\| . \tag{5.6}$$
Moreover, $\mathrm{w\text{-}}\lim_n \Omega_{Q_n} = \Omega_Q$ implies
$$\|\Omega_Q\| \le \liminf_{n\to\infty} \|\Omega_{Q_n}\| . \tag{5.7}$$
By the Golden–Thompson inequality for analytic perturbations,
$$\|\Omega_{Q_n}\| \le \|e^{-\beta Q_n/2}\Omega\| . \tag{5.8}$$
Now (5.6), (5.7) and (5.8) imply the Golden–Thompson inequality:
$$\|\Omega_Q\| \le \|e^{-\beta Q/2}\Omega\| . \tag{5.9}$$
(13) Let Qn ∈ M be an arbitrary sequence of self-adjoint elements which converges strongly to Q. The proof of (1) yields that ΩQn → ΩQ weakly. Using first the chain rule and then the Golden–Thompson inequality we get kΩQn k = k(ΩQ )Qn −Q k ≤ ke−β(Qn −Q)/2 ΩQ k . Hence, lim supn kΩQn k ≤ kΩQ k. Combining this estimate with (5.7) we get kΩQn k → kΩQ k, and so ΩQn → ΩQ in norm. By Theorem 2.7, this implies that ωQn → ωQ in norm.
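In finite dimensions the two inequalities reduce to familiar matrix statements, which gives a quick sanity check. The sketch below assumes $M$ is the algebra of $2\times 2$ matrices and $\omega$ is the Gibbs state $\omega(A) = \mathrm{Tr}(e^{-\beta H}A)/\mathrm{Tr}\,e^{-\beta H}$ for a Hamiltonian $H$; in that setting $\|\Omega_Q\|^2 = \mathrm{Tr}\,e^{-\beta(H+Q)}/\mathrm{Tr}\,e^{-\beta H}$ and $\|e^{-\beta Q/2}\Omega\|^2 = \omega(e^{-\beta Q})$. The matrices `H` and `Q`, the value of `beta`, and all helper names are hypothetical choices for illustration and do not come from the paper.

```python
import math

def expm_sym2(a, b, d):
    """exp of the real symmetric matrix [[a, b], [b, d]], returned as (a', b', d')."""
    m = (a + d) / 2.0
    r = math.hypot((a - d) / 2.0, b)
    c = math.cosh(r)
    s = math.sinh(r) / r if r > 0 else 1.0
    # A = m*I + N with N traceless and N*N = r^2 * I, hence
    # exp(A) = e^m * (cosh(r) * I + (sinh(r) / r) * N).
    em = math.exp(m)
    return (em * (c + s * (a - m)), em * s * b, em * (c + s * (d - m)))

def trace(x):
    """Trace of a symmetric 2x2 matrix stored as (a, b, d)."""
    return x[0] + x[2]

def trace_prod(x, y):
    """Tr(XY) for symmetric 2x2 matrices stored as (a, b, d)."""
    return x[0] * y[0] + 2.0 * x[1] * y[1] + x[2] * y[2]

def scaled_sum(t, x, y=(0.0, 0.0, 0.0)):
    """Entries of t * (X + Y)."""
    return tuple(t * (u + v) for u, v in zip(x, y))

beta = 1.5
H = (0.0, 0.0, 1.0)     # hypothetical Hamiltonian generating the dynamics
Q = (0.3, 0.7, -0.2)    # hypothetical self-adjoint perturbation

gibbs = expm_sym2(*scaled_sum(-beta, H))                    # e^{-beta H}
Z = trace(gibbs)
omega_Q = trace_prod(gibbs, Q) / Z                          # omega(Q)
norm_sq = trace(expm_sym2(*scaled_sum(-beta, H, Q))) / Z    # ||Omega_Q||^2
peierls_bogoliubov = math.exp(-beta * omega_Q)              # squared lower bound of (11)
golden_thompson = trace_prod(gibbs, expm_sym2(*scaled_sum(-beta, Q))) / Z  # squared upper bound of (12)

print(peierls_bogoliubov <= norm_sq <= golden_thompson)
```

Both bounds are compared in squared form, so the printed comparison is equivalent to (11) and (12) for this example.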
5.4. Perturbative expansion of $\Omega_Q$ and the estimates

In this subsection we prove Theorems 5.2 and 5.3. The proof of Theorem 5.2 is based on the following technical result of Araki.

Theorem 5.4. (1) Set
$$S_{\beta,n} := \{(z_1, \dots, z_n) : \mathrm{Im}\, z_i \ge 0,\ i = 1, \dots, n,\ \mathrm{Im}\, z_1 + \cdots + \mathrm{Im}\, z_n \le \beta/2\} .$$
Then for $(z_1, \dots, z_n) \in S_{\beta,n}$, $\Omega$ belongs to $D(e^{iz_nL}Q_n \cdots e^{iz_1L}Q_1)$, the function
$$S_{\beta,n} \ni (z_n, \dots, z_1) \mapsto e^{iz_nL}Q_n \cdots e^{iz_1L}Q_1\Omega \tag{5.10}$$
is norm continuous on $S_{\beta,n}$, analytic on its interior, and
$$\sup_{(z_1,\dots,z_n) \in S_{\beta,n}} \|e^{iz_nL}Q_n \cdots e^{iz_1L}Q_1\Omega\| \le \|Q_n\| \cdots \|Q_1\| . \tag{5.11}$$
(2) Let $Q_{i,m} \to Q_i$ strongly and $Q^*_{i,m} \to Q^*_i$ strongly as $m \to \infty$. Then
$$\lim_{m\to\infty} e^{iz_nL}Q_{n,m} \cdots e^{iz_1L}Q_{1,m}\Omega = e^{iz_nL}Q_n \cdots e^{iz_1L}Q_1\Omega , \tag{5.12}$$
uniformly for (z1 , . . . , zn ) in compact subsets of Sβ,n . Proof. The proof follows by induction wrt n. For n = 1, the statement follows from the Proposition A.1 and the KMS condition (Theorem 2.13). Suppose that the statement is true for n − 1. Set Ω(z1 , . . . , zn−1 ) := Qn eizn−1 L Qn−1 · · · eiz1 L Q1 Ω , Ω∗ (z1 , . . . , zn−1 ) := JQ∗1 e−i¯z1 L Q∗2 · · · e−i¯zn−1 L Q∗n−1 Ω . Consider Φ ∈ D(e−βL/2 ) and the function F (z1 , . . . , zn−1 ) := (Φ|Ω∗ (z1 , . . . , zn−1 )) . By the induction assumption, the function F is continuous on Sβ,n−1 , analytic on its interior, and satisfies the estimate |F (z1 , . . . , zn−1 )| ≤ kΦkkQ1k · · · kQn k ,
(5.13)
which gives the estimate (5.11) for zn = 0. The function G(z1 , . . . , zn−1 ) := (e(i¯z1 +···+i¯zn−1 −β/2)L Φ|Ω(z1 , . . . , zn−1 ))
(5.14)
is also analytic and continuous on the same domain. (Here we used the induction assumption, the assumption Φ ∈ D(e−βL/2 ) and Proposition A.1). For z1 , . . . , zn−1 ∈ R, set s2 = z1 , s3 := z2 + z1 , . . . , sn = zn−1 + · · · + z1 . Then F (z1 , . . . , zn−1 ) = (Φ|JQ∗1 τ −s2 (Q∗2 ) · · · τ −sn (Q∗n )Ω) = (Φ|e−βL/2 τ −sn (Qn ) · · · τ −s2 (Q2 )Q1 Ω) = G(z1 , . . . , zn−1 ) ,
and by the edge of wedge theorem, the functions F and G coincide on their whole domains. Thus, by (5.13), |G(z1 , . . . , zn−1 )| ≤ kΦkkQ1k · · · kQn k .
(5.15)
For zn = iβ/2 − z1 − · · · − zn−1 and (z1 , . . . , zn−1 ) ∈ Sβ,n−1 , this implies that Ω(z1 , . . . , zn−1 ) ∈ D(eizn L ) , and Ω∗ (z1 , . . . , zn−1 ) = eizn L Ω(z1 , . . . , zn−1 ) .
(5.16)
(5.15) gives also the estimate (5.11) for zn = iβ/2 − z1 − · · · − zn−1 . The estimate (5.11) for 0 ≤ Im zn ≤ β/2 − Im z1 − · · · − Im zn−1 follows from (5.13), (5.15) and Proposition A.1. By Proposition A.1 and Hartog’s theorem of holomorphy, (ei¯zn L Φ|Ω(z1 , . . . , zn )) is analytic on the interior of Sβ,n , for Φ ∈ D(e−βL/2 ). Using the estimate (5.11) we see that it is analytic for all Φ. Hence we can conclude that the function (5.10) is weakly analytic. Since the weak analyticity is equivalent to the norm analyticity, we have proven all the statements of (1) except that (5.10) is norm continuous on the whole Sβ,n . Next we turn to the proof of (2) for n. Set Ωm (z1 , . . . , zn−1 ) := Qn,m eizn−1 L Qn−1,m · · · eiz1 L Q1,m Ω , Ω∗m (z1 , . . . , zn−1 ) := JQ∗1,m e−i¯z1 L Q∗2,m · · · e−i¯zn−1 L Q∗n−1,m Ω . By the uniform boundedness principle, independently of m, we have kQi,m k ≤ C ,
i = 1, . . . , n .
(5.17)
Now kΩ∗m (z1 , . . . , zn−1 ) − Ω∗ (z1 , . . . , zn−1 )k ≤ kQ1,m kke−i¯z1 L Q∗2,m · · · e−i¯zn−1 L Q∗n−1,m Ω − e−i¯z1 L Q∗2 · · · e−i¯zn−1 L Q∗n−1 Ωk + k(Q∗1,m − Q∗1 )e−i¯z1 L Q∗2 · · · e−i¯zn−1 L Q∗n−1 Ωk . The first term on the right goes to zero uniformly on compact subsets of Sβ,n−1 by the induction assumption and (5.17) for i = 1. The second term on the right goes to zero uniformly on compact subsets of Sβ,n−1 by the induction assumption, Lemma A.1 and the strong convergence Q∗1,m → Q∗1 . By the proof of (1) (see the identity (5.16)), we have for z1 , . . . , zn−1 ∈ Sβ,n−1 , Ω(z1 , . . . , zn−1 ) − Ωm (z1 , . . . , zn−1 ) ∈ D(e(−iz1 −···−izn−1 −β/2)L ) , Ω∗ (z1 , . . . , zn−1 ) − Ω∗m (z1 , . . . , zn−1 ) = e(−iz1 −···−izn−1 −β/2)L (Ω(z1 , . . . , zn−1 ) − Ωm (z1 , . . . , zn−1 )) .
Hence,
$$\lim_{m\to\infty} \|e^{(-iz_1 - \cdots - iz_{n-1} - \beta/2)L}(\Omega(z_1, \dots, z_{n-1}) - \Omega_m(z_1, \dots, z_{n-1}))\| = 0$$
uniformly on compact subsets of $S_{\beta,n-1}$. By the induction assumption,
$$\lim_{m\to\infty} \|\Omega(z_1, \dots, z_{n-1}) - \Omega_m(z_1, \dots, z_{n-1})\| = 0$$
uniformly on compact subsets of $S_{\beta,n-1}$. Hence, by Proposition A.1,
$$\lim_{m\to\infty} \|e^{iz_nL}(\Omega(z_1, \dots, z_{n-1}) - \Omega_m(z_1, \dots, z_{n-1}))\| = 0$$
uniformly for $0 \le \mathrm{Im}\, z_n \le \beta/2 - \mathrm{Im}\, z_1 - \cdots - \mathrm{Im}\, z_{n-1}$ and $(z_1, \dots, z_{n-1})$ in compact subsets of $S_{\beta,n-1}$. In particular, the convergence is uniform on compact subsets of $S_{\beta,n}$. This ends the proof of (2) for $n$.

It remains to prove the norm continuity part of (1). Let $Q_{i,m} \in M_\tau$ be such that $Q_{i,m} \to Q_i$ strongly and $Q^*_{i,m} \to Q^*_i$ strongly as $m \to \infty$. The function
$$\mathbb{C}^n \ni (z_1, \dots, z_n) \mapsto e^{iz_nL}Q_{n,m} \cdots e^{iz_1L}Q_{1,m}\Omega$$
is entire analytic and in particular, it is norm continuous. By the uniform convergence on compact subsets of $S_{\beta,n}$, proven in (2), and the local compactness of $S_{\beta,n}$ we conclude that (5.10) is norm continuous on $S_{\beta,n}$.

Proof of Theorem 5.2. Let $Q_n \in M_\tau$ be such that $Q_n \to Q$ strongly. Since $\Omega_{Q_n} = E_{Q_n}(i\beta/2)\Omega$, the expansion (3.4) yields that Theorem 5.2 holds for $Q_n$. Moreover,
$$\Omega_Q = \mathrm{w\text{-}}\lim_{n\to\infty} \Omega_{Q_n}$$
$$= \mathrm{w\text{-}}\lim_{n\to\infty} \sum_{m=0}^{\infty} (-1)^m \int \cdots \int_{T_{\beta,m}} e^{-\beta_1L}Q_n \cdots e^{-\beta_mL}Q_n\Omega \; d\beta_1 \cdots d\beta_m$$
$$= \sum_{m=0}^{\infty} (-1)^m \int \cdots \int_{T_{\beta,m}} e^{-\beta_1L}Q \cdots e^{-\beta_mL}Q\,\Omega \; d\beta_1 \cdots d\beta_m .$$
The first identity follows from Theorem 5.1 (recall the proof of (1) or use (13)), the second is obvious, and the third follows from Theorem 5.4. Proof of Theorem 5.3. Theorem 5.2 yields (1). By Theorem 5.1 (13) it suffices to prove (2)–(5) for Q ∈ Mτ . (2)–(3). Our proof is motivated by [2]. By Theorem 3.2, Ω ∈ D(e−z(L+Q) ) for all z and EQ (iz)Ω = e−z(L+Q) Ω is an entire vector-valued function. Set f (z) := (Ω|e−z(L+Q) Ω) = (Ω|EQ (iz)Ω) .
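Then $f$ is an entire function, $f''(x) \ge 0$ for $x \in \mathbb{R}$, and
$$f(0) = \|\Omega\|^2 = 1 , \qquad f(\beta/2) = (\Omega|\Omega_Q) , \qquad f(\beta) = \|\Omega_Q\|^2 ,$$
$$f'(0) = -(\Omega|(L+Q)\Omega) = -(\Omega|Q\Omega) ,$$
$$f'(\beta/2) = -(\Omega|(L+Q)\Omega_Q) = -(\Omega|JQJ\Omega_Q) = -(\Omega|Q\Omega_Q) ,$$
$$f'(\beta) = -(\Omega_Q|(L+Q)\Omega_Q) = -(\Omega_Q|JQJ\Omega_Q) = -(\Omega_Q|Q\Omega_Q)$$
(we used $L\Omega = 0$ and $(L + Q - JQJ)\Omega_Q = 0$). These relations combined with the mean-value theorem yield (2)–(3).

(4) follows easily from (2).

To prove (5), consider the function
$$F(z) := \tau_Q^z(Q)E_Q(z)\Omega .$$
Since $\tau_Q^z(Q)$ and $E_Q(z)$ are uniformly bounded on the strip $0 \le \mathrm{Im}\, z \le \beta/2$, $F(z)$ is also bounded on this strip. Moreover,
$$\|F(z)\| \le \begin{cases} \|Q\Omega\| & \text{if } \mathrm{Im}\, z = 0 ; \\ \|\tau_Q^{i\beta/2}(Q)\Omega_Q\| & \text{if } \mathrm{Im}\, z = \beta/2 . \end{cases}$$
Since $\tau_Q^{i\beta/2}(Q)\Omega_Q = e^{-\beta L_Q/2}Q\Omega_Q = JQ\Omega_Q$,
$$\|F(z)\| \le \|Q\Omega_Q\| \quad \text{if } \mathrm{Im}\, z = \beta/2 .$$
Hence, by the three-line theorem, for $0 \le t \le \beta/2$,
$$\|F(it)\| \le \|Q\Omega\|^{1-2t/\beta}\,\|Q\Omega_Q\|^{2t/\beta} .$$
Since
$$\Omega_Q - \Omega = -\int_0^{\beta/2} \tau_Q^{it}(Q)E_Q(it)\Omega \, dt ,$$
we derive
$$\|\Omega_Q - \Omega\| \le \int_0^{\beta/2} \|F(it)\| \, dt \le \frac{\beta}{2}\int_0^1 \|Q\Omega\|^{1-s}\|Q\Omega_Q\|^{s} \, ds = \beta f(\|Q\Omega\|, \|Q\Omega_Q\|)/2 .$$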
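The last equality in this proof uses the elementary identity $\int_0^1 x^{1-s}y^{s}\,ds = (x-y)/(\log x - \log y)$ for $x \ne y$, that is, the function $f$ of Theorem 5.3 (5). A minimal numerical sketch of that identity follows; the helper names, the midpoint rule and the sample values are illustrative assumptions.

```python
import math

def log_mean(x, y):
    """f(x, y) from Theorem 5.3 (5): (x - y)/(log x - log y), or x when x == y."""
    if x == y:
        return x
    return (x - y) / (math.log(x) - math.log(y))

def interpolation_integral(x, y, steps=20000):
    """Midpoint-rule value of the integral of x^(1-s) * y^s over s in [0, 1]."""
    h = 1.0 / steps
    return h * sum(x ** (1 - (k + 0.5) * h) * y ** ((k + 0.5) * h)
                   for k in range(steps))

# Hypothetical values standing in for ||Q Omega|| and ||Q Omega_Q||.
x, y = 2.0, 8.0
print(interpolation_integral(x, y), log_mean(x, y))
```

The two printed values agree to quadrature accuracy, and $f(x,y)$ always lies between $\min(x,y)$ and $\max(x,y)$, so (5) never gives a worse bound than $\beta\max(\|Q\Omega\|, \|Q\Omega_Q\|)/2$.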
5.5. Unbounded perturbations This subsection contains our main results. It extends the construction of KMS states to a large class of unbounded perturbations. Let Q be a self-adjoint operator affiliated to M satisfying Assumptions 3.1 and 3.2. Let τQ be the dynamics defined as in Subsec. 3.3. Recall that by Theorem 3.5 its Liouvillean equals LQ = L + Q − JQJ .
In order to construct the perturbed KMS state we will need an additional assumption:

Assumption 5.1. $\|e^{-\beta Q/2}\Omega\| < \infty$.

Theorem 5.5. Assume 3.1, 3.2 and 5.1. Then
(1) $\Omega \in D(e^{-\beta(L+Q)/2})$. Set
$$\Omega_Q := e^{-\beta(L+Q)/2}\Omega , \qquad \omega_Q(A) := (\Omega_Q|A\Omega_Q)/\|\Omega_Q\|^2 .$$
(2) $\Omega_Q \in \mathcal{H}^+$.
(3) $\Omega_Q$ is cyclic and separating.
(4) $\omega_Q$ is a $(\tau_Q, \beta)$-KMS state.
(5) $\log \Delta_{\Omega_Q} = -\beta L_Q$.
(6) $\log \Delta_{\Omega_Q,\Omega} = -\beta L - \beta Q$.
(7) $\mathrm{Ent}(\omega|\omega_Q) = -\beta\omega(Q) - \log\|\Omega_Q\|^2$.
(8) The Peierls–Bogoliubov inequality holds: $e^{-\beta(\Omega|Q\Omega)/2} \le \|\Omega_Q\|$.
(9) The Golden–Thompson inequality holds: $\|\Omega_Q\| \le \|e^{-\beta Q/2}\Omega\|$.
(10) For any $0 \le \lambda \le 1$, $\lambda Q$ satisfies the assumptions of the theorem, hence $\Omega_{\lambda Q}$ is well defined. Moreover, $\lim_{\lambda\downarrow 0}\|\Omega_{\lambda Q} - \Omega\| = 0$.

Remark. The formula for relative entropy of (7) requires a comment. Because of Assumption 5.1, $\omega(Q^-)$ is finite, where $Q^- = 1_{]-\infty,0]}(Q)Q$. Therefore, $\omega(Q)$ is a finite number or $+\infty$.

Set
$$Q_n := 1_{[-n,n]}(Q)Q ,$$
where $1_{[-n,n]}(Q)$ is the spectral projection of $Q$ on the interval $[-n, n]$.

Theorem 5.6. (1) $L + Q_n \to L + Q$ in the strong resolvent sense.
(2) $L_{Q_n} \to L_Q$ in the strong resolvent sense.

Proof. We prove only (2) (the proof of (1) is similar). Let $D_0 = D(L) \cap D(Q) \cap D(JQJ)$. By Assumption 3.2, $L_Q$ is essentially self-adjoint on $D_0$. Moreover, $L_{Q_n}\Psi \to L_Q\Psi$ for $\Psi \in D_0$. Hence the statement follows from Proposition A.3.

Proof of Theorem 5.5. Given the approximating sequence $Q_n$ defined above and Lemma 5.6, the parts (1)–(9) follow from Theorem 5.1 in the same way as the analogous parts of Theorem 5.1 followed from the analytic case of Theorem 5.1.
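The only part requiring a separate argument is (10). To prove it, note that $L + \lambda Q \to L$ in the strong resolvent sense as $\lambda \downarrow 0$. This implies that $\Omega_{\lambda Q} \to \Omega$ weakly as $\lambda \downarrow 0$ and
$$\|\Omega\| \le \liminf_{\lambda\downarrow 0}\|\Omega_{\lambda Q}\| \le \limsup_{\lambda\downarrow 0}\|\Omega_{\lambda Q}\| \le \lim_{\lambda\downarrow 0}\|e^{-\beta\lambda Q/2}\Omega\| = \|\Omega\| .$$
Hence, $\|\Omega_{\lambda Q}\| \to \|\Omega\|$ as $\lambda \downarrow 0$, and this implies that $\Omega_{\lambda Q} \to \Omega$ in norm as $\lambda \downarrow 0$.

5.6. Perturbations of Liouvilleans revisited

In Theorem 3.5 we have shown that $L_Q$ is the Liouvillean of $\tau_Q$ by invoking Theorem 2.9 and checking that
$$\tau_Q^t(A) = e^{itL_Q}Ae^{-itL_Q} , \qquad e^{itL_Q}\mathcal{H}^+ \subset \mathcal{H}^+ . \tag{5.18}$$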
Under the conditions of Theorem 5.5 (recall Proposition 2.4), the second relation in (5.18) is equivalent to LQ ΩQ = 0 .
(5.19)
In this section we give an elementary direct proof of (5.19). This verifies that $L_Q$ is the Liouvillean of $\tau_Q$ without resort to Theorem 2.9. We consider only the case of analytic perturbations $Q \in M_\tau$. The extension to bounded $Q$ and to unbounded $Q$ satisfying the conditions of Theorem 5.5 is immediate using the strong resolvent convergence of Liouvilleans and the weak convergence of $\beta$-KMS vectors.

First, the relation
$$e^{it(L+Q)}\Omega_Q = E_Q(t + i\beta/2)\Omega ,$$
which follows from (5.3), and analytic continuation yield that $\Omega_Q \in D(\exp(iz(L+Q)))$ for all $z$, and so $\Omega_Q \in D(L+Q) = D(L_Q)$. Since $e^{itL}M'e^{-itL} = M'$, $JQJ \in M'$, and $e^{itL}J = Je^{itL}$, the Trotter product formula yields
$$e^{it(L+Q)}JQJe^{-it(L+Q)} = e^{itL}JQJe^{-itL} = Je^{itL}Qe^{-itL}J .$$
By analytic continuation, the relation
$$(e^{\beta(L+Q)/2}\Phi|JQJe^{-\beta(L+Q)/2}\Omega) = (\Phi|J\tau^{i\beta/2}(Q)J\Omega)$$
holds for all $\Phi$ in the dense domain $\tilde D = \bigcup_{r>0} \mathrm{Ran}\, 1_{[-r,r]}(L+Q)$. Using
$$J\tau^{i\beta/2}(Q)J\Omega = J\Delta^{1/2}Q\Omega = Q\Omega ,$$
we derive
$$(e^{\beta(L+Q)/2}\Phi|JQJ\Omega_Q) = (\Phi|Q\Omega) .$$
This relation yields
$$(e^{\beta(L+Q)/2}\Phi|(L+Q-JQJ)\Omega_Q) = (e^{\beta(L+Q)/2}\Phi|(L+Q)\Omega_Q) - (\Phi|Q\Omega) = (\Phi|(L+Q)\Omega) - (\Phi|Q\Omega) = (\Phi|L\Omega) = 0 .$$
Since $e^{\beta(L+Q)/2}\tilde D = \tilde D$ is dense in $\mathcal{H}$, $L_Q\Omega_Q = 0$.
Appendix A. Technical Facts In this appendix we collect some technical facts which have been used throughout the paper. A.1. Operators and resolvent convergence First, we recall the Trotter product formula (see [30], Theorem VIII.31). Theorem A.1. If A and B are self-adjoint operators and A + B is essentially self-adjoint on D(A) ∩ D(B), then s- lim (eitA/n eitB/n )n = eit(A+B) . n→∞
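As a finite-dimensional illustration of Theorem A.1, the sketch below compares $(e^{itA/n}e^{itB/n})^n$ with $e^{it(A+B)}$ for two non-commuting real symmetric $2\times 2$ matrices. The choice of $A$ and $B$ (Pauli matrices), the value $t = 1$ and all helper names are assumptions made for the example; for bounded operators the convergence is in norm, with an error of order $1/n$.

```python
import cmath
import math

def exp_i_sym2(t, a, b, d):
    """exp(i t A) for the real symmetric matrix A = [[a, b], [b, d]]."""
    m = (a + d) / 2.0
    r = math.hypot((a - d) / 2.0, b)
    c = math.cos(t * r)
    s = math.sin(t * r) / r if r > 0 else t
    # A = m*I + N with N*N = r^2 * I, so exp(itA) = e^{itm} (cos(tr) I + i sin(tr)/r N).
    phase = cmath.exp(1j * t * m)
    return [[phase * (c + 1j * s * (a - m)), phase * 1j * s * b],
            [phase * 1j * s * b, phase * (c + 1j * s * (d - m))]]

def mat_mul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_pow(x, n):
    """n-th power of a 2x2 matrix by repeated squaring."""
    result = [[1, 0], [0, 1]]
    while n:
        if n & 1:
            result = mat_mul(result, x)
        x = mat_mul(x, x)
        n >>= 1
    return result

def trotter_error(n, t=1.0):
    """Largest entrywise deviation of the n-step Trotter product from exp(it(A+B))."""
    A = (1.0, 0.0, -1.0)   # sigma_z, hypothetical choice
    B = (0.0, 1.0, 0.0)    # sigma_x, hypothetical choice
    step = mat_mul(exp_i_sym2(t / n, *A), exp_i_sym2(t / n, *B))
    approx = mat_pow(step, n)
    exact = exp_i_sym2(t, *(u + v for u, v in zip(A, B)))
    return max(abs(approx[i][j] - exact[i][j]) for i in range(2) for j in range(2))

for n in (10, 100, 1000):
    print(n, trotter_error(n))
```

The printed errors shrink roughly in proportion to $1/n$, consistent with the strong (here norm) convergence asserted by the theorem.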
The next proposition follows easily from the spectral theorem and the three-line theorem (see also Lemma 4 in [5]).

Proposition A.1. Let $H$ be a self-adjoint operator and $\Omega \in D(e^{\delta H})$ for some $\delta > 0$. Then the vector-valued function $e^{zH}\Omega$ is analytic inside the strip $0 < \mathrm{Re}\, z < \delta$, norm continuous on its closure and
$$\|e^{zH}\Omega\| \le \|e^{\delta H}\Omega\|^{\mathrm{Re}\, z/\delta}\,\|\Omega\|^{1-\mathrm{Re}\, z/\delta} .$$

Lemma A.1. Let $Z$ be a compact metric space and $Z \ni z \mapsto \Omega(z) \in \mathcal{H}$ a norm continuous function. Let $A_n$ be bounded operators and assume that $A_n \to A$ strongly. Then
$$\lim_{n\to\infty} \|(A_n - A)\Omega(z)\| = 0$$
uniformly on $Z$.

Proof. Note first that $\{\Omega(z) : z \in Z\}$ is a compact subset of $\mathcal{H}$ and that by the uniform boundedness principle $C := \sup_n \|A_n\| < \infty$. Let $\epsilon > 0$ be given. Then there exists a finite dimensional projection $P$ such that $\sup_{z\in Z}\|(1-P)\Omega(z)\| < \epsilon$. Since
$$\|(A_n - A)\Omega(z)\| \le \|(A_n - A)P\| \sup_{z\in Z}\|\Omega(z)\| + \sup_n \|A_n - A\| \sup_{z\in Z}\|(1-P)\Omega(z)\| ,$$
and $\|(A_n - A)P\| \to 0$ because $P$ has finite rank, we derive $\limsup_n \|(A_n - A)\Omega(z)\| < 2C\epsilon$ uniformly in $z \in Z$.
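The bound of Proposition A.1 can be checked directly when $H$ is diagonal, where it reduces to Hölder's inequality for the spectral weights. In the sketch below the eigenvalues, the amplitudes of $\Omega$ and the value of $\delta$ are hypothetical, and the function names are illustrative only.

```python
import math

def norm_exp(x, eigenvalues, amplitudes):
    """||e^{zH} Omega|| for diagonal H; it depends only on x = Re z."""
    return math.sqrt(sum(math.exp(2.0 * x * h) * abs(w) ** 2
                         for h, w in zip(eigenvalues, amplitudes)))

def three_line_bound_holds(eigenvalues, amplitudes, delta, steps=50):
    """Check ||e^{xH} Omega|| <= ||e^{delta H} Omega||^(x/delta) * ||Omega||^(1 - x/delta) on a grid."""
    top = norm_exp(delta, eigenvalues, amplitudes)
    bottom = norm_exp(0.0, eigenvalues, amplitudes)
    for k in range(steps + 1):
        x = delta * k / steps
        bound = top ** (x / delta) * bottom ** (1.0 - x / delta)
        # Small relative slack: the bound is an equality at x = 0 and x = delta.
        if norm_exp(x, eigenvalues, amplitudes) > bound * (1.0 + 1e-12):
            return False
    return True

# Hypothetical spectrum of H and components of Omega in the eigenbasis.
print(three_line_bound_holds([-2.0, 0.5, 3.0], [1.0, 0.5 - 0.5j, 0.25], delta=0.7))
```

The check says that $x \mapsto \log\|e^{xH}\Omega\|$ is convex on $[0, \delta]$, which is the content of the inequality.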
The following properties of the strong convergence of functions of self-adjoint operators are proven e.g. in [30]:

Proposition A.2. Suppose that $H_n$, $H$ are self-adjoint operators. Then the following conditions are equivalent:
(1) Let $z_0 \notin \left(\bigcup_{n=1}^{\infty}\sigma(H_n)\right)^{\mathrm{cl}}$ (for instance, $\mathrm{Im}\, z_0 \ne 0$). Then
$$\mathrm{s\text{-}}\lim_{n\to\infty}(z_0 - H_n)^{-1} = (z_0 - H)^{-1} .$$
(2) If $f$ is a bounded continuous function on $\left(\bigcup_{n=1}^{\infty}\sigma(H_n)\right)^{\mathrm{cl}}$, then $f(H_n) \to f(H)$ strongly.

Note that (1) in the above proposition holds for any choice of $z_0$ if it holds for one choice of $z_0$. If the conditions of the above proposition are satisfied we say that $H_n \to H$ in the strong resolvent sense.

Proposition A.3. Suppose that $H_n$, $H$ are self-adjoint operators, $H$ is essentially self-adjoint on $D$ and $\lim_n H_n\Psi = H\Psi$ for $\Psi \in D$. Then $H_n \to H$ in the strong resolvent sense.

Proof. Let $\mathrm{Im}\, z \ne 0$. Then $(z - H)D =: D_1$ is dense in $\mathcal{H}$. For $\Psi \in D_1$,
$$(z - H)^{-1}\Psi - (z - H_n)^{-1}\Psi = (z - H_n)^{-1}(H - H_n)(z - H)^{-1}\Psi \to 0 .$$

The following proposition plays an important role in several arguments in our paper.

Proposition A.4. Suppose that $H_n$, $H$ are self-adjoint operators and $H_n \to H$ in the strong resolvent sense. Suppose that $\Omega_n, \Omega \in \mathcal{H}$ are such that $\Omega_n \to \Omega$ weakly and $\|H_n\Omega_n\| \le C$. Then $\Omega \in D(H)$, $\mathrm{w\text{-}}\lim_n H_n\Omega_n$ exists and $H\Omega = \mathrm{w\text{-}}\lim_n H_n\Omega_n$.

Remark. By the uniform boundedness principle, the condition $\|H_n\Omega_n\| \le C$ can be replaced by the existence of $\mathrm{w\text{-}}\lim_{n\to\infty} H_n\Omega_n$.

Proof. Since the ball of radius $C$ in a Hilbert space is weakly sequentially compact, one can find a weakly convergent subsequence $H_{n_k}\Omega_{n_k}$. Set $\Psi = \mathrm{w\text{-}}\lim_{k\to\infty} H_{n_k}\Omega_{n_k}$. Let $D := \bigcup_{r>0}\mathrm{Ran}\, 1_{[-r,r]}(H)$. Let $\Phi \in D$ and $f \in C_0^\infty(\mathbb{R})$ be such that $f(H)\Phi = \Phi$. Then
$$\Phi = f(H)\Phi = \lim_{n\to\infty} f(H_n)\Phi , \qquad H\Phi = f(H)H\Phi = \lim_{n\to\infty} f(H_n)H_n\Phi ,$$
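and
$$(H\Phi|\Omega) = \lim_{k\to\infty}(H_{n_k}f(H_{n_k})\Phi|\Omega_{n_k}) = \lim_{k\to\infty}(f(H_{n_k})\Phi|H_{n_k}\Omega_{n_k}) = (\Phi|\Psi) . \tag{A.1}$$
Since $D$ is a core for $H$, $\Omega \in D(H)$ and $H\Omega = \Psi$.

Now assume that $\mathrm{w\text{-}}\lim_{n\to\infty} H_n\Omega_n$ does not exist. Then there exist $\Phi \in \mathcal{H}$ and a subsequence $H_{n_k}\Omega_{n_k}$ such that
$$|(\Phi|H_{n_k}\Omega_{n_k}) - (\Phi|H\Omega)| \ge \epsilon > 0 . \tag{A.2}$$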
Using again the weak sequential compactness of the ball of radius $C$ and passing to a subsubsequence we may assume that $\mathrm{w\text{-}}\lim_{k\to\infty} H_{n_k}\Omega_{n_k}$ exists. Repeating the arguments of (A.1), we see that $\mathrm{w\text{-}}\lim_{k\to\infty} H_{n_k}\Omega_{n_k} = H\Omega$. This contradicts (A.2).

A.2. An interpolation theorem

Various versions of the following interpolation theorem for linear operators can be found throughout the literature, see e.g. [4] (where a different proof is outlined) and [31].

Theorem A.2. Let $\mathcal{H}_1$, $\mathcal{H}_2$ be Hilbert spaces and let $H_i$ be a positive (possibly unbounded) operator on $\mathcal{H}_i$. Let $D_1$ be a core of $H_1$. Let $T \in B(\mathcal{H}_1, \mathcal{H}_2)$ with $\|T\| = c_0$ be such that:
(a) $TD_1 \subset D(H_2)$.
(b) For $\Psi \in D_1$, $\|H_2T\Psi\| \le c_1\|H_1\Psi\|$.
Then, for any $0 \le \lambda \le 1$, $TD(H_1^\lambda) \subset D(H_2^\lambda)$ and for $\Psi \in D(H_1^\lambda)$,
$$\|H_2^\lambda T\Psi\| \le c_0^{1-\lambda}c_1^{\lambda}\|H_1^\lambda\Psi\| . \tag{A.3}$$

Proof. Clearly, we may assume that $c_0 = c_1 = 1$. First let us show that $TD(H_1) \subset D(H_2)$ and
$$\|H_2T\Psi\| \le \|H_1\Psi\| , \qquad \Psi \in D(H_1) . \tag{A.4}$$
Let $\Psi \in D(H_1)$. Then there exist $\Psi_n \in D_1$ such that $\Psi_n \to \Psi$ and $H_1\Psi_n \to H_1\Psi$. Now
$$\|H_2(T\Psi_n - T\Psi_m)\| \le \|H_1(\Psi_n - \Psi_m)\| .$$
Thus $H_2T\Psi_n$ is Cauchy, hence convergent. $T\Psi_n$ is obviously convergent. $H_2$ is closed. Hence $T\Psi \in D(H_2)$. (A.4) follows by passing to the limit in $\|H_2T\Psi_n\| \le \|H_1\Psi_n\|$.

Let $\Phi \in D(H_2)$, $\Omega \in \mathcal{H}_1$ and $\epsilon > 0$. For $0 \le \mathrm{Re}\, z \le 1$ set
$$F(z) := (H_2^{\bar z}\Phi|T(H_1 + \epsilon)^{-z}\Omega) .$$
$F(z)$ is a continuous function in the strip $0 \le \mathrm{Re}\, z \le 1$, analytic in its interior, and
$$|F(z)| \le \|(H_2 + 1)\Phi\|\,\epsilon^{-1}\|\Omega\| .$$
For $\mathrm{Re}\, z = 0$,
$$|F(z)| \le \|\Phi\|\|\Omega\| .$$
For $\mathrm{Re}\, z = 1$, $(H_1 + \epsilon)^{-z}\Omega \in D(H_1)$, and
$$|F(z)| \le \|\Phi\|\|H_2T(H_1 + \epsilon)^{-z}\Omega\| \le \|\Phi\|\|H_1(H_1 + \epsilon)^{-z}\Omega\| \le \|\Phi\|\|\Omega\| .$$
These estimates and the three-line theorem yield that for $0 \le \lambda \le 1$,
$$|F(\lambda)| \le \|\Phi\|\|\Omega\| .$$
Therefore, for $\Omega \in \mathcal{H}_1$,
$$\|H_2^\lambda T(H_1 + \epsilon)^{-\lambda}\Omega\| \le \|\Omega\| ,$$
and for $\Psi \in D(H_1^\lambda)$,
$$\|H_2^\lambda T\Psi\| = \lim_{\epsilon\downarrow 0}\|H_2^\lambda T(H_1 + \epsilon)^{-\lambda}(H_1 + \epsilon)^{\lambda}\Psi\| \le \lim_{\epsilon\downarrow 0}\|(H_1 + \epsilon)^{\lambda}\Psi\| = \|H_1^\lambda\Psi\| .$$

Appendix B. Pauli–Fierz Systems

B.1. Introduction

A large part of the motivation for the formalism and the results of our paper comes from quantum statistical physics. A detailed description of their application to Pauli–Fierz systems, a certain class of physically motivated $W^*$-dynamical systems, can be found in [1]. In this appendix we briefly describe these applications.

Pauli–Fierz systems describe a small quantum system (an atom or a molecule) interacting with a large bosonic reservoir. They arise as an approximation to nonrelativistic QED (see e.g. [11, 32]), and they have been widely used in the physics literature as a basic paradigm of an open quantum system [33, 34].

We are interested in the case where the radiation density of the bosonic reservoir is not zero (in particular, the reservoir is not at zero temperature). For example, the radiation density can be given by the Planck law at the inverse temperature $\beta < \infty$, see (B.3) below. This corresponds to the case of bosons in thermal equilibrium. We are also interested in situations outside thermal equilibrium. For example, the reservoir may consist of several subreservoirs at distinct temperatures. $W^*$-dynamical systems provide a natural framework to describe Pauli–Fierz systems with nonzero radiation density, as sketched below.
B.2. Bose gas at density ρ - Araki–Woods algebras
If $\mathcal{Z}$ is a Hilbert space, then we will write $\Gamma_s(\mathcal{Z})$ for the bosonic Fock space over the 1-particle space $\mathcal{Z}$. $\Omega$ will denote the vacuum vector.

For definiteness, we will consider the Bose gas with the 1-particle space $L^2(\mathbb{R}^d)$. Assume that $\mathbb{R}^d \ni \xi \mapsto \rho(\xi)$ is a nonnegative real measurable function describing the density of bosons with the momentum $\xi \in \mathbb{R}^d$. To describe the Bose gas at density $\rho$ one uses a special von Neumann algebra first described by Araki and Woods in [35]. It can be defined by its representation in the Hilbert space
$$\mathcal{H}^{\mathrm{AW}} := \Gamma_s(L^2(\mathbb{R}^d) \oplus L^2(\mathbb{R}^d)) .$$
We will write $a_l(\xi)$, $a_l^*(\xi)$, $a_r(\xi)$, $a_r^*(\xi)$ for the creation and annihilation operators corresponding to the left and right $L^2(\mathbb{R}^d)$ respectively. We define the left/right Araki–Woods creation and annihilation operators
$$a^*_{\rho,l}(\xi) := \sqrt{1 + \rho(\xi)}\,a_l^*(\xi) + \sqrt{\rho(\xi)}\,a_r(\xi) ,$$
$$a_{\rho,l}(\xi) := \sqrt{1 + \rho(\xi)}\,a_l(\xi) + \sqrt{\rho(\xi)}\,a_r^*(\xi) ,$$
$$a^*_{\rho,r}(\xi) := \sqrt{\rho(\xi)}\,a_l(\xi) + \sqrt{1 + \rho(\xi)}\,a_r^*(\xi) ,$$
$$a_{\rho,r}(\xi) := \sqrt{\rho(\xi)}\,a_l^*(\xi) + \sqrt{1 + \rho(\xi)}\,a_r(\xi) .$$
The left Araki–Woods algebra is denoted by $M^{\mathrm{AW}}_{\rho,l}$ and defined as the $W^*$-algebra generated by the operators
$$\exp\left(i\int (f(\xi)a^*_{\rho,l}(\xi) + \bar f(\xi)a_{\rho,l}(\xi))\,d\xi\right) .$$
Let $J^{\mathrm{AW}} := \Gamma(\epsilon)$, where $\epsilon$ is the antilinear involution on $L^2(\mathbb{R}^d) \oplus L^2(\mathbb{R}^d)$ given by
$$\epsilon(f_1, \bar f_2) := (f_2, \bar f_1) ,$$
and $\Gamma$ is the second quantization functor, and let $\mathcal{H}^{\mathrm{AW},+}_\rho$ be the closure of the cone in $\mathcal{H}^{\mathrm{AW}}$ generated by
$$AJA\Omega , \qquad A \in M^{\mathrm{AW}}_{\rho,l} .$$
Then $(M^{\mathrm{AW}}_{\rho,l}, \mathcal{H}^{\mathrm{AW}}, J^{\mathrm{AW}}, \mathcal{H}^{\mathrm{AW},+}_\rho)$ is a von Neumann algebra in a standard form. It describes the Bose gas at density $\rho$.
B.3. Araki–Woods algebra coupled to a type I factor

We denote by $\mathcal{K}$ the Hilbert space of the small quantum system. For simplicity, we assume that $\dim \mathcal{K} < \infty$. We would like to describe the $W^*$-algebra of the joint system consisting of the small system with the algebra of observables $B(\mathcal{K})$ and the Bose gas at density $\rho$. One way to define this algebra is to identify it with the von Neumann algebra
$$M_\rho := B(\mathcal{K}) \otimes M^{\mathrm{AW}}_{\rho,l}$$
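acting on the Hilbert space $\mathcal{K} \otimes \Gamma_s(L^2(\mathbb{R}^d) \oplus L^2(\mathbb{R}^d))$. The identity representation of this algebra on $\mathcal{K} \otimes \Gamma_s(L^2(\mathbb{R}^d) \oplus L^2(\mathbb{R}^d))$ will be called the semi-standard representation of $M_\rho$.

It is easy to describe the standard representation of $M_\rho$. Let $\bar{\mathcal{K}}$ be the Hilbert space complex conjugate to $\mathcal{K}$ (see e.g. Sec. 4.6 in [1]). The standard representation acts on the space
$$\mathcal{K} \otimes \bar{\mathcal{K}} \otimes \Gamma_s(L^2(\mathbb{R}^d) \oplus L^2(\mathbb{R}^d)) \tag{B.1}$$
and is given by
$$\pi(A \otimes B) := A \otimes 1_{\bar{\mathcal{K}}} \otimes B$$
for $A \in B(\mathcal{K})$, $B \in M^{\mathrm{AW}}_{\rho,l}$. The modular conjugation is given by
$$J\,\Psi_1 \otimes \bar\Psi_2 \otimes \Phi := \Psi_2 \otimes \bar\Psi_1 \otimes J^{\mathrm{AW}}\Phi .$$
Note that it is useful to consider the two representations of $M_\rho$, the semi-standard and the standard one, in parallel. The semi-standard representation is simpler whereas the standard representation has special mathematical properties.

B.4. Pauli–Fierz $W^*$-dynamical systems

Suppose that $K$ is a self-adjoint operator on $\mathcal{K}$ describing the Hamiltonian of the small system. Let $|\xi|$ be the energy of the boson of momentum $\xi$. Let $\mathbb{R}^d \ni \xi \mapsto v(\xi) \in B(\mathcal{K})$ describe the coupling of the small system to the Bose gas. We assume that the Bose gas is at the density $\rho$. Let $\lambda \in \mathbb{R}$. We introduce the following operators on $\mathcal{K} \otimes \Gamma_s(L^2(\mathbb{R}^d) \oplus L^2(\mathbb{R}^d))$:
$$L^{\mathrm{semi}}_{\mathrm{fr}} := K \otimes 1 + 1 \otimes \int |\xi|(a_l^*(\xi)a_l(\xi) - a_r^*(\xi)a_r(\xi))\,d\xi ,$$
$$Q^{\mathrm{semi}}_\rho := \int (v(\xi) \otimes a^*_{\rho,l}(\xi) + v^*(\xi) \otimes a_{\rho,l}(\xi))\,d\xi .$$
The operator $L^{\mathrm{semi}}_{\mathrm{fr}}$ will be called the free semi-Liouvillean. The full semi-Liouvillean for the density $\rho$ equals
$$L^{\mathrm{semi}}_\rho := L^{\mathrm{semi}}_{\mathrm{fr}} + \lambda Q^{\mathrm{semi}}_\rho . \tag{B.2}$$
For $A \in M_\rho$ we set
$$\tau^t_{\mathrm{fr}}(A) := e^{itL^{\mathrm{semi}}_{\mathrm{fr}}}Ae^{-itL^{\mathrm{semi}}_{\mathrm{fr}}} , \qquad \tau^t_\rho(A) := e^{itL^{\mathrm{semi}}_\rho}Ae^{-itL^{\mathrm{semi}}_\rho} .$$
We also introduce the following operators on $\mathcal{K} \otimes \bar{\mathcal{K}} \otimes \Gamma_s(L^2(\mathbb{R}^d) \oplus L^2(\mathbb{R}^d))$:
$$L_{\mathrm{fr}} = K \otimes 1 \otimes 1 - 1 \otimes \bar K \otimes 1 + 1 \otimes 1 \otimes \int |\xi|(a_l^*(\xi)a_l(\xi) - a_r^*(\xi)a_r(\xi))\,d\xi ,$$
$$Q_\rho = \int (v(\xi) \otimes 1 \otimes a^*_{\rho,l}(\xi) + v^*(\xi) \otimes 1 \otimes a_{\rho,l}(\xi))\,d\xi .$$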
It is easy to see that
$$JQ_\rho J = \int (1 \otimes \bar v(\xi) \otimes a^*_{\rho,r}(\xi) + 1 \otimes \bar v^*(\xi) \otimes a_{\rho,r}(\xi))\,d\xi .$$
Set
$$L_\rho := L_{\mathrm{fr}} + \lambda Q_\rho - \lambda JQ_\rho J .$$
We denote by $l^2(\mathcal{K})$ the vector space $B(\mathcal{K})$ equipped with the inner product $(A|B) = \mathrm{Tr}(A^*B)$ (recall that $\dim \mathcal{K} < \infty$).

The following theorem describes the case of the free dynamics.

Theorem B.1. (1) For any $t$, $\tau^t_{\mathrm{fr}}$ preserves the algebra $M_\rho$ and $(M_\rho, \tau_{\mathrm{fr}})$ is a $W^*$-dynamical system.
(2) $L_{\mathrm{fr}}$ is the Liouvillean for the dynamics $\tau_{\mathrm{fr}}$.
(3) Let $\beta > 0$,
$$\rho(\xi) = (e^{\beta|\xi|} - 1)^{-1} , \tag{B.3}$$
and $\Psi_{\mathrm{fr}} := e^{-\beta K/2} \otimes \Omega$. Using the natural identification of $l^2(\mathcal{K})$ with $\mathcal{K} \otimes \bar{\mathcal{K}}$, $\Psi_{\mathrm{fr}}$ can be understood as an element of the Hilbert space $\mathcal{K} \otimes \bar{\mathcal{K}} \otimes \Gamma_s(L^2(\mathbb{R}^d) \oplus L^2(\mathbb{R}^d))$. Then $\Psi_{\mathrm{fr}}$ is a $\beta$-KMS vector for $\tau_{\mathrm{fr}}$.

The results of our paper are the main technical input in the proof of the following theorem, which describes the interacting dynamics:

Theorem B.2. (1) Assume that
$$\int (1 + |\xi|^2)(1 + \rho(\xi))\|v(\xi)\|^2\,d\xi < \infty . \tag{B.4}$$
Then for any $t$, $\tau^t_\rho$ preserves the algebra $M_\rho$ and $(M_\rho, \tau_\rho)$ is a $W^*$-dynamical system.
(2) $L_\rho$ is the Liouvillean for the dynamics $\tau_\rho$.
(3) Assume that (B.3) holds and that
$$\int (|\xi|^{-1} + |\xi|^2)\|v(\xi)\|^2\,d\xi < \infty .$$
Then (B.4) holds, and there exists a $\beta$-KMS vector for $\tau_\rho$.

The $W^*$-dynamical system $(M_\rho, \tau_\rho)$ is called the Pauli–Fierz system at density $\rho$. It is canonically defined given $\mathcal{K}$, $K$, $v$ and $\rho$.

The proof of Theorem B.2 is given in [1]. To prove (1) we check that $Q^{\mathrm{semi}}_\rho$ is affiliated to $M_\rho$ and that $L^{\mathrm{semi}}_{\mathrm{fr}} + \lambda Q^{\mathrm{semi}}_\rho$ is essentially self-adjoint on $D(L^{\mathrm{semi}}_{\mathrm{fr}}) \cap D(Q^{\mathrm{semi}}_\rho)$. Then we apply Theorem 3.3. To prove (2), in a similar way we apply Theorem 3.5. Finally, to show (3) we use Theorem 5.5. The details can be found in [1].

We finish with several remarks. The perturbation $Q^{\mathrm{semi}}_\rho$ is unbounded from above and below, and the existing results in the literature [2, 3, 14] are not applicable to Pauli–Fierz systems.
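The claim in Theorem B.2 (3) that the infrared condition implies (B.4) under the Planck law (B.3) rests on an elementary pointwise bound: since $e^{y} - 1 \ge y$, one has $1 + \rho(\xi) \le 1 + (\beta|\xi|)^{-1}$, and hence $(1 + |\xi|^2)(1 + \rho(\xi)) \le 2\max(1, \beta^{-1})\,(|\xi|^{-1} + |\xi|^2)$. The sketch below checks this bound on a logarithmic grid; the constant, the grid and the function names are our own illustrative choices rather than statements taken from [1].

```python
import math

def planck_density(x, beta):
    """rho at |xi| = x for the Planck law (B.3): 1/(exp(beta*x) - 1)."""
    return 1.0 / math.expm1(beta * x)

def weight_ratio(x, beta):
    """Weight in (B.4) divided by the infrared weight |xi|^-1 + |xi|^2, at |xi| = x."""
    return (1.0 + x * x) * (1.0 + planck_density(x, beta)) / (1.0 / x + x * x)

def max_ratio(beta):
    """Largest ratio over a logarithmic grid of |xi| from 1e-6 to 1e2."""
    grid = [10.0 ** (k / 10.0) for k in range(-60, 21)]
    return max(weight_ratio(x, beta) for x in grid)

for beta in (0.5, 1.0, 3.0):
    print(beta, max_ratio(beta), 2.0 * max(1.0, 1.0 / beta))
```

For each sampled $\beta$ the observed maximum stays below the constant $2\max(1, \beta^{-1})$, so the integrand of (B.4) is dominated by a multiple of the integrand in the infrared condition.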
The first result about existence of KMS-states for Pauli–Fierz systems goes back to [36] where the spin-boson system was considered. A result similar to Theorem B.2 was proven in [7] under a more restrictive infrared condition. Theorem B.2 covers the physical infrared regime of nonrelativistic QED (often called the ohmic case in the context of Pauli–Fierz systems, see e.g. [1, 11, 33, 34]).

Acknowledgments

The research of the first author was partly supported by the Postdoctoral Training Program HPRN-CT-2002-00277 and by the grant SPUB127 financed by Komitet Badań Naukowych. A part of this work was done during a visit of the first author at the Aarhus University supported by MaPhySto funded by the Danish National Research Foundation and during a visit to University of Montreal. The research of the second author was partly supported by NSERC. We wish to thank the referees for the careful reading of the manuscript and for numerous helpful remarks.

References

[1] J. Dereziński and V. Jakšić, Return to equilibrium for Pauli–Fierz systems, preprint, Maphysto, Aarhus University, 2001, to appear in Ann. H. Poincaré.
[2] S. Sakai, Perturbations of KMS states in C*-dynamical systems (Generalization of the absence theorem of phase transition to continuous quantum systems), Contemp. Math. 62 (1987), 187.
[3] M. J. Donald, Relative Hamiltonians which are not bounded from above, J. Func. Anal. 91 (1990), 143.
[4] M. Ohya and D. Petz, Quantum Entropy and its Use, Springer-Verlag, Berlin, 1993.
[5] H. Araki, Golden–Thompson and Peierls–Bogoliubov inequalities for a general von Neumann algebra, Comm. Math. Phys. 34 (1973), 167.
[6] A. Uhlmann, Relative entropy and the Wigner-Yanase-Dyson-Lieb concavity in an interpolation theory, Comm. Math. Phys. 54 (1977), 123.
[7] V. Bach, J. Fröhlich and I. Sigal, Return to equilibrium, J. Math. Phys. 41 (2000), 3985.
[8] V. Jakšić and C.-A. Pillet, On a model for quantum friction III. Ergodic properties of the spin-boson system, Comm. Math. Phys. 178 (1996), 627.
[9] V. Jakšić and C.-A. Pillet, Nonequilibrium steady states for finite quantum systems coupled to thermal reservoirs, to appear in Comm. Math. Phys.
[10] M. Merkli, Positive commutators in nonequilibrium quantum statistical mechanics, preprint.
[11] J. Dereziński and V. Jakšić, Spectral theory of Pauli–Fierz operators, J. Func. Anal. 180 (2001), 243.
[12] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics 1, 2nd edition, Springer-Verlag, Berlin, 1987.
[13] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics 2, 2nd edition, Springer-Verlag, Berlin, 1996.
[14] H. Araki, Relative Hamiltonian for a faithful normal states of a von Neumann algebra, Pub. R.I.M.S. Kyoto Univ. 9 (1973), 165.
[15] A. Klein and L. Landau, Stochastic processes associated with KMS states, J. Func. Anal. 42 (1981), 368.
[16] S. Sakai, Operator Algebras in Dynamical Systems. The Theory of Unbounded Derivations in C*-algebras, Encyclopedia of Mathematics and its Applications 41, Cambridge University Press, Cambridge, 1991.
[17] B. Simon, Statistical Mechanics of Lattice Gases, Volume 1, Princeton University Press, Princeton, 1991.
[18] S. Sakai, C*-algebras and W*-algebras, Springer-Verlag, Berlin, 1971.
[19] S. Stratila and L. Zsido, Lectures on von Neumann algebras, Abacus Press, Tunbridge Wells, 1979.
[20] S. Stratila, Modular theory in operator algebras, Abacus Press, Tunbridge Wells, 1981.
[21] S. Baaj and P. Julg, Théorie bivariante de Kasparov et opérateurs non bornés dans les C*-modules hilbertiens, C. R. Acad. Sci. Paris, Série I 296 (1983), 875.
[22] S. L. Woronowicz, Unbounded elements affiliated with C*-algebras and non-compact quantum groups, Comm. Math. Phys. 136 (1991), 399.
[23] U. Haagerup, The standard form of von Neumann algebras, Math. Scand. 37 (1975), 271.
[24] H. Araki, Positive cone, Radon–Nikodym theorems, relative Hamiltonian and the Gibbs condition in statistical mechanics. An application of the Tomita–Takesaki theory, in C*-algebras and their Applications to Statistical Mechanics and Quantum Field Theory, ed. D. Kastler, North-Holland, Amsterdam, 1976.
[25] A. Connes, Une classification des facteurs de type III, Ann. Sci. Ecole Norm. Sup. 6 (1973), 133.
[26] H. Araki, Expansionals in Banach algebras, Ann. Sci. Ecole Norm. Sup. 6 (1973), 67.
[27] H. Araki, Relative entropy of states of von Neumann algebras, Pub. R.I.M.S. Kyoto Univ. 11 (1976), 809.
[28] H. Araki, Relative entropy of states of von Neumann algebras II, Pub. R.I.M.S. Kyoto Univ. 13 (1977), 173.
[29] W. Pusz and S. L. Woronowicz, Form convex functions and the WYDL and other inequalities, Lett. Math. Phys. 2 (1978), 505.
[30] M. Reed and B. Simon, Methods of Modern Mathematical Physics I. Functional Analysis, Academic Press, San Diego, 1973.
[31] M. Reed and B. Simon, Methods of Modern Mathematical Physics II. Fourier Analysis, Self-Adjointness, Academic Press, San Diego, 1975.
[32] G. A. Raggio and S. H. Zivi, Semiclassical description of N-level systems interacting with radiation fields, in “Quantum Probability II, Heidelberg 1984”, Lecture Notes in Math., Vol. 1136, Springer-Verlag, Berlin-New York, 1985.
[33] A. J. Leggett, S. Chakravarty, A. T. Dorsey, M. P. A. Fisher, A. Garg and W. Zwerger, Dynamics of the dissipative two-state system, Rev. Mod. Phys. 59 (1987), 1.
[34] U. Weiss, Quantum Dissipative Systems, Series in Modern Condensed Matter Physics, Vol. 2 (second enlarged edition), World Scientific, Singapore, 1999.
[35] H. Araki and E. J. Woods, Representations of the canonical commutation relations describing a nonrelativistic infinite free Bose gas, J. Math. Phys. 4 (1963), 637.
[36] M. Fannes, B. Nachtergaele and A. Verbeure, The equilibrium states of the spin-boson model, Comm. Math. Phys. 114 (1988), 537.
July 14, 2003 11:19 WSPC/148-RMP
00169
Reviews in Mathematical Physics Vol. 15, No. 5 (2003) 491–558 c World Scientific Publishing Company
PERTURBATIVE RENORMALIZATION BY FLOW EQUATIONS
VOLKHARD F. MÜLLER
Fachbereich Physik, Universität Kaiserslautern, D-67653 Kaiserslautern, Germany
[email protected] Received 9 September 2002 Revised 6 March 2003 In this article a self-contained exposition of proving perturbative renormalizability of a quantum field theory based on an adaption of Wilson’s differential renormalization group equation to perturbation theory is given. The topics treated include the spontaneously broken SU(2) Yang–Mills theory. Although mainly a coherent but selective review, the article contains also some simplifications and extensions with respect to the literature. Keywords: Renormalization group.
Contents
1. Introduction
2. The Method
2.1. Properties of Gaussian measures
2.2. The flow equation
2.3. Proof of perturbative renormalizability
2.4. Insertion of a composite field
2.5. Finite temperature field theory
2.6. Elementary estimates
3. The Quantum Action Principle
3.1. Field equation
3.2. Variation of a coupling constant
3.3. Flow equations for proper vertex functions
4. Spontaneously Broken SU(2) Yang–Mills Theory
4.1. The classical action
4.2. Flow equations: Renormalizability without Slavnov–Taylor identities
4.3. Violated Slavnov–Taylor identities
4.4. Restoration of the Slavnov–Taylor identities
Appendix A. The Relevant Part of Γ
Appendix B. The Relevant Part of the BRS-Insertions
Acknowledgements
References
1. Introduction
Dyson’s pioneering work [1, 2] opened the era of a systematic perturbative renormalization theory long ago, and in the late sixties of the last century the rigorous BPHZ-version [3–5] was accomplished. In place of the momentum space subtractions of BPHZ to circumvent UV-divergences, various intermediate regularization schemes were invented: Pauli–Villars regularization [6], analytical regularization [7], dimensional regularization [8–11]. These different methods, each with its proper merits, are equivalent up to finite counterterms; for a review see e.g. [12]. All of them are based on the analysis of multiple integrals corresponding to individual Feynman diagrams, the combinatorial complexity of which grows rapidly with increasing order of the perturbative expansion. Zimmermann’s famous forest formula [13] provides the clue to disentangle overlapping divergences, organizing the order of subintegrations to be followed. The BPHZ-renormalization, originally developed for massive theories, was extended by Lowenstein [14] to cover also zero mass particles. From the point of view of elementary particle physics, renormalization theory culminated in the work of ’t Hooft and Veltman [15–17], demonstrating the renormalizability of non-Abelian gauge theories.
At the time of these achievements, Wilson’s view [18–20] of renormalization as a continuous evolution of effective actions, a primarily non-perturbative notion, began to pervade the whole area of quantum field theory and soon proved its fertility. In the domain of rigorous mathematical analysis beyond formal perturbation expansion, the renormalizable UV-asymptotically free Gross–Neveu model in two space-time dimensions has been constructed [21, 22] by decomposing, in the functional integral, the full momentum range into a union of discrete, disjoint “slices” and integrating successively the corresponding quantum fluctuations, thereby generating a sequence of effective actions.
This slicing can be seen as the equivalent of introducing block-spins in lattices of Statistical Mechanics. Rigorous nonperturbative analysis of the renormalization flow is the general subject of the lecture notes [23, 24] and of the monograph [25] which, among other topics, also treats the problem of summing the formal perturbation series. In these lecture notes and in the monograph, references to the original work on non-perturbative renormalization can be found.
In the realm of perturbative renormalization Wilson’s ideas have proved beneficial, too. Gallavotti and Nicolò [26, 27] split the propagator of a free scalar field into disjoint momentum slices, i.e. decomposed the field into a sum of independent (generalized) random variables, and developed a tree expansion for perturbative renormalization. Here, due to the slicing, the degrees of freedom are again integrated in finite steps. This method has been applied in the monograph [28] to present a proof of renormalizability of QED which only involves gauge-invariant counterterms. Polchinski [29] realized that considering renormalization in terms of relevant and irrelevant operators, as Wilson did, is also effective in perturbation theory, and he gave, in the case of the $\Phi^4_4$-theory, an inductive self-contained proof of perturbative renormalization, based on Wilson’s renormalization (semi-) group differential
equation. His method completely avoids the combinatorial complexity of generating Feynman diagrams and the subsequent cumbersome analysis of Feynman integrals with their overlapping divergences. It rather treats an n-point Green function of a given perturbative order as a whole. Due to this fact, his method is particularly transparent. Polchinski’s approach has proved very stimulating in various directions:
(i) In mathematical physics it has been extended to present new proofs of general results in perturbatively renormalized quantum field theory, which are simpler than those achieved before: renormalization of the nonlinear σ-model [30], a rigorous version of Polchinski’s argument, together with physical renormalization conditions [31], renormalization of composite operators and Zimmermann identities [32], Wilson’s operator product expansion [33], Symanzik’s improved actions [34, 35], large order bounds [36], renormalization of massless $\Phi^4_4$-theory [37], renormalization of QED [38, 39], decoupling theorems [40], renormalization of spontaneously broken Yang–Mills theory [42], temperature independent renormalization of finite temperature field theories [43]. The monograph [44] contains a clear and detailed introduction to Polchinski’s method formulated with Wick-ordered field products [34], and, in addition, the application of a similar renormalization flow to the Fermi surface problem of condensed matter physics.
(ii) In the domain of theoretical physics there is a vast number of contributions with diverse applications of Polchinski’s approach. Flow equations for vertex functions have been introduced [45, 46] and also employed to investigate perturbative renormalizability of gauge, chiral and supersymmetric theories [47–53]. In these articles several explicit one-loop calculations are also performed. An effective quantum action principle has been formulated for the renormalization flow breaking gauge invariance [54].
There are also interesting attempts to combine a gauge invariant regularization with a flow equation [55].
Besides aims within perturbation theory, there have been many efforts to use truncated versions of flow equations as appropriate nonperturbative approximations in strong interaction physics, more in accord with Wilson’s original goal. In the physically distinguished case of (non-Abelian) gauge theories [56–62] the effective action is restricted in a local approximation to its relevant part for all values of the flowing scale. As a consequence, the flow equation for the effective action reduces to a system of r ordinary differential equations, r being the number of relevant coefficients appearing. This system is integrated from the UV-scale downward. In these nonperturbative approaches the problem arises of reconciling the truncation with the gauge symmetry. This problem is also discussed in [54]. In a very different field of interest, the question of the nonperturbative renormalizability of Quantum Einstein Gravity has been investigated on the basis of truncated flow equations [63, 64]. These authors restrict the average effective action to the Hilbert action together with a small number of additional local terms. The flow of the coupling coefficients is studied numerically and the
existence of a non-Gaussian fixed point in the ultraviolet is found. This result is then interpreted as supporting the conjecture that Quantum Einstein Gravity is “asymptotically safe” in Weinberg’s sense. We would like to point out that physical applications of flow equations are reviewed in [65], which contains an extensive list of references.
The present article is intended to provide a self-contained exposition of perturbative renormalization based on Polchinski’s inductive method, employing the differential renormalization group equation of Wilson. Therefore, emphasis is laid on a coherent presentation of the topics considered. A comprehensive overview of the literature on the subject will not be pursued. The quantum field theories considered are treated in their Euclidean formulation on d = 4 dimensional (Euclidean) space-time by means of functional integration. Accordingly, their correlation functions are called Schwinger functions to distinguish them from the Green functions on Minkowski space. In the intermediate steps of the derivations regularized functionals are always used, the controlled removal (within perturbation theory) of this regularization being our main concern. We avoid any manipulation of unregularized “path integrals”.
The plan of this article is as follows. In Sec. 2 Polchinski’s method to prove perturbative renormalizability is elaborated, treating the nonsymmetric $\Phi^4$-theory in detail. Besides the system of Schwinger functions of this theory, the Schwinger functions with one composite field (operator) inserted are also dealt with. The presentation is mainly based on [31, 32] and on some simplifications [66]. Moreover, considering the theory at finite temperature, its temperature independent renormalizability is reviewed, following closely [43]. In Sec. 3 two simple cases of the quantum action principle are demonstrated, again treating the nonsymmetric $\Phi^4$-theory: the field equation and the variation of a coupling constant. 
These applications of the method seem not to have been treated in the literature. Hereafter, somewhat disconnected, flow equations for proper vertex functions are dealt with, [41, 45, 46]. Section 4 is devoted to the proof of renormalizability of the physically most important spontaneously broken Yang–Mills theory. Because of the necessity to implement nonlinear field variations, this problem can be regarded as a further instance of the quantum action principle. The presentation follows the line of [42]. Due to a modified cutoff function for the flow, however, one can refrain from introducing irrelevant terms into the bare interaction, thereby simplifying the treatment. In addition, the restoration of the irrelevant part of the Slavnov–Taylor identities is also dealt with.

2. The Method
2.1. Properties of Gaussian measures
Our point of departure is a Gaussian probability measure dµ on the space C(Ω) of continuous real-valued functions on a d-dimensional torus Ω. Such a function we
identify with a periodic function on $\mathbb{R}^d$, i.e. $\phi(x) = \phi(x + nl)$, where $x \in \mathbb{R}^d$, $n \in \mathbb{Z}^d$, $l = (l_1, \dots, l_d) \in \mathbb{R}^d_+$ and $nl = n_1 l_1 + \cdots + n_d l_d$. A Gaussian measure with mean zero is uniquely defined by its covariance $C(x, y)$,
$$\int d\mu_C(\phi)\,\phi(x)\phi(y) = C(x, y) = C(y, x) . \tag{2.1}$$
The covariance is a positive non-degenerate bilinear form on $C^\infty(\Omega) \times C^\infty(\Omega)$; we assume it to be translation invariant, too: $C(x, y) = C(x - y)$. Moreover, the function $C(x)$ is assumed to have a given number $N \in \mathbb{N}$ of derivatives continuous everywhere on $\Omega$. We list some properties of this Gaussian measure employed in the sequel; proofs can be found e.g. in [67].
• Using the notation
$$\langle \phi, J\rangle = \int_\Omega dx\,\phi(x)J(x) , \qquad \langle J, CJ\rangle = \int_\Omega dx \int_\Omega dy\, J(x)\,C(x - y)\,J(y) ,$$
where $J \in C^\infty(\Omega)$ is a test function, the generating functional of the correlation functions is given explicitly as
$$\int d\mu_C(\phi)\, e^{\langle \phi, J\rangle} = e^{\frac{1}{2}\langle J, CJ\rangle} . \tag{2.2}$$
• The translation of the Gaussian measure by a function $\varphi \in C^\infty(\Omega)$ results in
$$d\mu_C(\phi - \varphi) = e^{-\frac{1}{2}\langle \varphi, C^{-1}\varphi\rangle}\, d\mu_C(\phi)\, e^{\langle \phi, C^{-1}\varphi\rangle} . \tag{2.3}$$
• Let $A(\phi)$ denote a polynomial formed of local powers of the field, $\phi(x)^n$, $n \in \mathbb{N}$, and of its derivatives $(\partial_\mu \phi(x))^m$, $m \in \mathbb{N}$, $2m < N$, at various points $x$. If the covariance $C$ is the sum of two covariances, $C = C_1 + C_2$, then
$$\int d\mu_C(\phi)\,A(\phi) = \int d\mu_{C_1}(\phi_1) \int d\mu_{C_2}(\phi_2)\, A(\phi_1 + \phi_2) . \tag{2.4}$$
• Integration by parts of a function $A(\phi)$ as considered in (2.4) yields
$$\int d\mu_C(\phi)\,\phi(x)\,A(\phi) = \int d\mu_C(\phi) \int_\Omega dy\, C(x - y)\, \frac{\delta}{\delta\phi(y)} A(\phi) . \tag{2.5}$$
• Finally, let the covariance of the Gaussian measure depend differentiably on a parameter,
$$C(x - y) = C_t(x - y) , \qquad \dot{C}_t \equiv \frac{d}{dt}\, C_t(x - y) .$$
Given again a function $A(\phi)$ as in (2.4), then
$$\frac{d}{dt} \int d\mu_{C_t}(\phi)\,A(\phi) = \frac{1}{2} \int d\mu_{C_t}(\phi) \left\langle \frac{\delta}{\delta\phi}, \dot{C}_t \frac{\delta}{\delta\phi} \right\rangle A(\phi) . \tag{2.6}$$
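The properties (2.5) and (2.6) can be checked by hand in the simplest possible setting. The following is a minimal sketch, assuming a single real variable in place of the field, so that the covariance is just a positive number c and the functional derivative is an ordinary derivative. The helper names and the parameter family c(t) = 1 + t^2 are hypothetical choices made for illustration only; they do not appear in the text.

```python
def gaussian_moment(n, c):
    """E[phi^n] for a mean-zero Gaussian of variance c: (n-1)!! * c^(n/2), zero for odd n."""
    if n % 2:
        return 0.0
    value = 1.0
    for k in range(1, n, 2):  # accumulates 1*3*5*...*(n-1), one factor of c per step
        value *= k * c
    return value


def expect(poly, c):
    """Expectation of the polynomial A(phi) = sum_n poly[n] * phi^n."""
    return sum(a * gaussian_moment(n, c) for n, a in enumerate(poly))


def derivative(poly):
    """Coefficient list of dA/dphi."""
    return [n * a for n, a in enumerate(poly)][1:]


def times_phi(poly):
    """Coefficient list of phi * A(phi)."""
    return [0.0] + list(poly)


def check_integration_by_parts(poly, c):
    """One-variable form of (2.5): E[phi A] = c * E[A']. Returns both sides."""
    return expect(times_phi(poly), c), c * expect(derivative(poly), c)


def check_covariance_derivative(poly, t, h=1e-5):
    """One-variable form of (2.6) for the hypothetical family c(t) = 1 + t^2.

    Left side: central finite difference of E_t[A] in t.
    Right side: (1/2) * c'(t) * E_t[A''].
    """
    def c(s):
        return 1.0 + s * s

    c_dot = 2.0 * t
    lhs = (expect(poly, c(t + h)) - expect(poly, c(t - h))) / (2.0 * h)
    rhs = 0.5 * c_dot * expect(derivative(derivative(poly)), c(t))
    return lhs, rhs


if __name__ == "__main__":
    A = [1.0, -2.0, 0.5, 3.0, 1.0]  # A(phi) = 1 - 2 phi + phi^2/2 + 3 phi^3 + phi^4
    print(check_integration_by_parts(A, 0.7))
    print(check_covariance_derivative(A, 0.4))
```

For the polynomial used in the demo both pairs agree, the second one up to the finite-difference error. The sketch says nothing about the infinite-dimensional statements themselves, for which the text refers to [67].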
As an example of the class of covariances considered, we present already here the particular covariance which will be mainly used in the flow equations envisaged. The torus $\Omega$ has volume $|\Omega| = l_1 l_2 \cdots l_d$ and a point $x \in \Omega$ has coordinates $-\frac{1}{2} l_i \le x_i < \frac{1}{2} l_i$, $i = 1, \dots, d$. Hence, the dual Fourier variables (momentum vectors) $k$ form a discrete set:
$$k = k(n) = \left( \frac{2\pi n_1}{l_1}, \dots, \frac{2\pi n_d}{l_d} \right) , \qquad n \in \mathbb{Z}^d .$$
Let $m$, $\Lambda_0$ be positive constants, $0 < m \ll \Lambda_0$, and let the nonnegative parameter $\Lambda$ satisfy $0 \le \Lambda \le \Lambda_0$. We define the covariance
$$C^{\Lambda,\Lambda_0}(x - y) = \frac{1}{|\Omega|} \sum_{n \in \mathbb{Z}^d} \frac{e^{ik(x-y)}}{k^2 + m^2} \left( e^{-\frac{k^2+m^2}{\Lambda_0^2}} - e^{-\frac{k^2+m^2}{\Lambda^2}} \right) . \tag{2.7}$$
This covariance obviously has the well-defined infinite volume limit $\Omega \to \mathbb{R}^d$, with $x, y, k \in \mathbb{R}^d$,
$$C^{\Lambda,\Lambda_0}(x - y) = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} dk\, \frac{e^{ik(x-y)}}{k^2 + m^2} \left( e^{-\frac{k^2+m^2}{\Lambda_0^2}} - e^{-\frac{k^2+m^2}{\Lambda^2}} \right) . \tag{2.8}$$
Abusing the notation slightly, we did not choose a different symbol for the limit. Later on, however, the case referred to will be clearly stated. Choosing the values $\Lambda_0 = \infty$, $\Lambda = 0$, the covariances (2.7) and (2.8) become the Euclidean propagator of a free real scalar field with mass $m$ on $\Omega$ and $\mathbb{R}^d$, respectively. A finite value of $\Lambda_0$ generates a UV-cutoff, thus regularizing the covariances: they now satisfy the regularity condition assumed for all $N$. This property is kept when introducing the additional term governed by the “flowing” parameter $0 \le \Lambda \le \Lambda_0$. Its role is to interpolate differentiably between a vanishing covariance at $\Lambda = \Lambda_0$, corresponding to a δ-measure on the function space, and the free UV-regularized covariance at $\Lambda = 0$. As a consequence we remark that the Gaussian measure with covariance (2.7) is supported with probability one on (the nuclear space) $C^\infty(\Omega)$, [68, 69].
Clearly, a modification of the Euclidean propagator showing these properties can be accomplished with a large variety of cutoff functions. In (2.7) and (2.8) a factor of the form
$$R^{\Lambda,\Lambda_0}(k) = \sigma_{\Lambda_0}(k^2) - \sigma_\Lambda(k^2) \tag{2.9}$$
has been introduced, with the particular function
$$\sigma_\Lambda(k^2) = e^{-\frac{k^2+m^2}{\Lambda^2}} . \tag{2.10}$$
We observe that regularization and interpolation are achieved by any positive function $\sigma_\Lambda(k^2)$ satisfying:
(i) For fixed $\Lambda$ it decreases as a function of $k^2$, vanishing rapidly for $k^2 > \Lambda^2$.
(ii) For fixed $k^2$ it increases with $\Lambda$ from the value zero at $\Lambda = 0$ to the value one at $\Lambda = \infty$.
Later on, our particular choice will prove advantageous.
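The interpolating role of the cutoff factor can be made concrete in momentum space. The following is a minimal sketch, assuming the exponential choice (2.10) and working with the Fourier transform of the covariance, i.e. the factor (2.9) divided by k^2 + m^2. The function names are hypothetical. The closed form used for the Λ-derivative, -(2/Λ^3) exp(-(k^2+m^2)/Λ^2), is obtained here by differentiating (2.7) directly and is compared against a finite difference.

```python
from math import exp


def sigma(k2, lam, m):
    """Cutoff function (2.10); the value at lam = 0 is defined as its limit, zero."""
    if lam == 0.0:
        return 0.0
    return exp(-(k2 + m * m) / (lam * lam))


def regulated_propagator(k2, lam, lam0, m):
    """Momentum-space covariance: (sigma_{lam0} - sigma_{lam}) / (k^2 + m^2)."""
    return (sigma(k2, lam0, m) - sigma(k2, lam, m)) / (k2 + m * m)


def propagator_flow(k2, lam, m):
    """Derivative of the momentum-space covariance with respect to lam (lam > 0)."""
    return -2.0 / lam**3 * exp(-(k2 + m * m) / (lam * lam))


def propagator_flow_numeric(k2, lam, lam0, m, h=1e-6):
    """Central finite difference in lam, used only to cross-check propagator_flow."""
    up = regulated_propagator(k2, lam + h, lam0, m)
    down = regulated_propagator(k2, lam - h, lam0, m)
    return (up - down) / (2.0 * h)


if __name__ == "__main__":
    m, lam0, k2 = 1.0, 50.0, 4.0
    for lam in (0.0, 1.0, 5.0, 20.0, 50.0):
        print(lam, regulated_propagator(k2, lam, lam0, m))
    print(propagator_flow(k2, 2.0, m), propagator_flow_numeric(k2, 2.0, lam0, m))
```

The printed values decrease monotonically from nearly 1/(k^2+m^2) at Λ = 0 to exactly zero at Λ = Λ0, which is the interpolation described above. The flow derivative does not depend on Λ0 and is concentrated at momenta of the order of Λ.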
2.2. The flow equation
Perturbative renormalizability of a quantum field theory is based on locality of its action and on power counting. The qualification “perturbative” means expansion of the theory’s Green (or Schwinger) functions as formal power series in the loop parameter^a $\hbar$, and treating them order-by-order, i.e. disregarding questions of convergence. The notions of locality and power counting can be introduced looking at the classical precursor of the quantum field theory to be constructed. There, a local action in $d$ space-time dimensions is the space-time integral of a Lagrangian (density), having the form of a polynomial in the fields entering the theory and their derivatives. The propagators are determined by the free part, which is bilinear in the fields. For a scalar field and a spin-$\frac{1}{2}$ field this free part is of second and first order in the derivatives, respectively. Defining the canonical (mass) dimension of the corresponding fields to be $\frac{1}{2}(d - 2)$ and $\frac{1}{2}(d - 1)$, respectively, and attributing the mass dimension 1 to each partial derivative, the free part of the Lagrangian has the dimension $d$; the action thus is dimensionless. Vector fields, especially (non-Abelian) gauge fields, pose particular problems to be considered later. Still looking at the classical theory, local interaction terms in the Lagrangian involve by definition more than two fields. Their respective coupling constant has a mass dimension, derived from the mass dimension of the interaction term, the coupling constant of an interaction term of mass dimension $d$ being dimensionless. Any local term entering the Lagrangian is called a relevant operator [18–20] if it has a mass dimension $\le d$, but irrelevant if its mass dimension is greater than $d$.
In the physically distinguished case $d = 4$ the central result of perturbative renormalization theory is that UV-finite Green (or Schwinger) functions can be obtained in any order, if the interaction terms have mass dimensions $\le 4$, by prescribing a finite number of renormalization conditions. This number equals the number of relevant operators forming the full classical Lagrangian.
We consider the quantum field theory of a real scalar field $\phi$ with mass $m$ on four-dimensional Euclidean space-time within the framework of functional integration. The emerging vacuum effects require a finite space-time volume. Therefore we start with a real-valued field $\phi \in C^1(\Omega)$ on a four-dimensional torus $\Omega$. Its bare interaction, labeled by a UV-cutoff $\Lambda_0 \in \mathbb{R}_+$, is chosen as
$$L^{\Lambda_0,\Lambda_0}(\phi) = \int_\Omega dx \left( \frac{f}{3!}\,\phi^3(x) + \frac{g}{4!}\,\phi^4(x) \right) + \int_\Omega dx \left( v(\Lambda_0)\,\phi(x) + \frac{1}{2}\,a(\Lambda_0)\,\phi^2(x) + \frac{1}{2}\,z(\Lambda_0)\,(\partial_\mu\phi)^2(x) + \frac{1}{3!}\,b(\Lambda_0)\,\phi^3(x) + \frac{1}{4!}\,c(\Lambda_0)\,\phi^4(x) \right) . \tag{2.11}$$

^a If one considers Feynman diagrams, the power of $\hbar$ counts the number of loops formed by such a diagram.
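The power-counting criterion for relevant operators can be illustrated by a small enumeration. The following is a rough sketch, assuming a single real scalar field of canonical dimension (d - 2)/2 and monomials characterized only by their number of fields and (even) number of derivatives. Two simplifications are assumptions of this sketch and not statements from the text: terms with one field and at least two derivatives are dropped as total derivatives, and at most one independent operator is counted per pair (fields, derivatives), which is adequate for d = 4 but not in general. The function name is hypothetical.

```python
from fractions import Fraction


def relevant_monomials(d=4, max_fields=8):
    """List (fields, derivatives, mass dimension) of candidate relevant operators.

    Only rotation-invariant monomials are considered, hence an even number of
    derivatives. A single field carrying derivatives integrates to zero on the
    torus (total derivative) and is skipped.
    """
    field_dim = Fraction(d - 2, 2)
    found = []
    for n in range(1, max_fields + 1):
        derivs = 0
        while n * field_dim + derivs <= d:
            if not (n == 1 and derivs > 0):
                found.append((n, derivs, n * field_dim + derivs))
            derivs += 2
    return found


if __name__ == "__main__":
    for n, derivs, dim in relevant_monomials(4):
        print(f"fields={n} derivatives={derivs} mass dimension={dim}")
```

For d = 4 the enumeration returns five entries: one, two, three and four fields without derivatives, and two fields with two derivatives. These correspond to the five terms with coefficients v, a, b, c and z in the second integral of (2.11).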
The first integral has classical roots: its integrand is formed of the field’s self-interaction with real coupling constants $f$ and $g$ having mass dimension equal to one and zero, respectively.^b The second integral contains the related counterterms, determined according to the following rule. The canonical mass dimension of the field $\phi$ is equal to one. All local terms of mass dimension $\le 4$ that can be formed of the field and of its derivatives, while respecting the (Euclidean) O(4)-symmetry, have to appear as counterterms in the integrand of the bare interaction. This symmetry is not violated by the intermediate UV-regularization procedure and can thus be maintained. In contradistinction to the coupling constants $f$, $g$, the five coefficients $v(\Lambda_0)$, $a(\Lambda_0)$, $z(\Lambda_0)$, $b(\Lambda_0)$, $c(\Lambda_0)$ of the counterterms cannot be chosen freely but have to depend on the UV-cutoff $\Lambda_0$. This dependence is dictated by the aim that after functional integration the UV-regularization can be removed, i.e. the limit $\Lambda_0 \to \infty$ can be performed keeping the physical content of the theory finite. As a consequence the coefficients stated above, however, turn out to diverge with $\Lambda_0 \to \infty$. If we restrict the bare interaction (2.11) to the case $f = 0$, $v(\Lambda_0) = b(\Lambda_0) = 0$, it is also invariant under the mirror transformation $\phi(x) \to -\phi(x)$, implying an additional symmetry of the theory.
The regularized quantum field theory on finite volume is defined by the generating functional of its Schwinger functions
$$Z^{\Lambda,\Lambda_0}(J) = \int d\mu_{\Lambda,\Lambda_0}(\phi)\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\phi) + \frac{1}{\hbar}\langle \phi, J\rangle} \tag{2.12}$$
with a real source $J \in C^\infty(\Omega)$, bare interaction (2.11) and a Gaussian measure $d\mu_{\Lambda,\Lambda_0}$ with mean zero and covariance $\hbar C^{\Lambda,\Lambda_0}$, (2.7). The positive parameter $\hbar$ has been introduced with regard to a systematic loop expansion considered later. For fixed $\Omega$ and $\Lambda_0$, and assuming $g + c(\Lambda_0) > 0$, $z(\Lambda_0) \ge 0$ in the bare interaction (2.11), the functional integral (2.12) is well-defined. As a functional on $C^\infty(\Omega)$, the support of the Gaussian measure $d\mu_{\Lambda,\Lambda_0}(\phi)$, the bare interaction is continuous in any Sobolev norm of order $n \ge 1$ and, furthermore, bounded below, $L^{\Lambda_0,\Lambda_0}(\phi) > \kappa$. Hence, with $\Lambda_0$ fixed, we have the uniform bound for $0 \le \Lambda \le \Lambda_0$,
$$|Z^{\Lambda,\Lambda_0}(J)| < e^{-\kappa} \int d\mu_{\Lambda,\Lambda_0}(\phi)\, e^{\frac{1}{\hbar}\langle \phi, J\rangle} \le e^{-\kappa + \frac{1}{2\hbar}\langle J, C^{0,\Lambda_0} J\rangle} . \tag{2.13}$$
From (2.12) one obtains the generating functional $W^{\Lambda,\Lambda_0}(J)$ of the truncated Schwinger functions^c
$$e^{\frac{1}{\hbar} W^{\Lambda,\Lambda_0}(J)} = \frac{Z^{\Lambda,\Lambda_0}(J)}{Z^{\Lambda,\Lambda_0}(0)} \tag{2.14}$$
which provides the n-point functions, $n \in \mathbb{N}$, upon functional derivation:
$$W_n^{\Lambda,\Lambda_0}(x_1, \dots, x_n) = \frac{\delta^n}{\delta J(x_1) \cdots \delta J(x_n)}\, W^{\Lambda,\Lambda_0}(J)\Big|_{J=0} . \tag{2.15}$$

^b Stability requires $g$ to be positive, but this property is not felt in a perturbative treatment.
^c In a representation of these functions in terms of (Feynman) diagrams only connected diagrams appear.
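The definitions (2.12), (2.14) and (2.15) can be explored in a zero-dimensional toy model. The following is a minimal sketch, assuming the field is replaced by a single real variable, the covariance by a number C, and the bare interaction by g φ^4/4! alone (no counterterms, f = 0). The function names, the trapezoidal quadrature and all numerical parameters are hypothetical choices for illustration. The comparison value C - (g/2) ħ C^3 is the first-order perturbative result in this toy model, obtained from Gaussian moments, and not a formula from the text.

```python
from math import exp, log, sqrt


def partition_function(J, g, C=1.0, hbar=1.0, n=4001, width=12.0):
    """Zero-dimensional analogue of (2.12), up to a J-independent normalisation.

    Integrates exp(-phi^2/(2 hbar C)) * exp((-g phi^4/4! + phi J)/hbar)
    over phi with the trapezoidal rule on [-width*sigma, width*sigma].
    """
    sigma = sqrt(hbar * C)
    half = width * sigma
    step = 2.0 * half / (n - 1)
    total = 0.0
    for i in range(n):
        phi = -half + i * step
        gauss = exp(-phi * phi / (2.0 * hbar * C))
        value = gauss * exp((-g * phi**4 / 24.0 + phi * J) / hbar)
        total += value * (0.5 if i in (0, n - 1) else 1.0)
    return total * step


def W(J, g, C=1.0, hbar=1.0):
    """Analogue of (2.14): W(J) = hbar * log(Z(J)/Z(0))."""
    return hbar * log(partition_function(J, g, C, hbar) / partition_function(0.0, g, C, hbar))


def two_point(g, C=1.0, hbar=1.0, h=1e-2):
    """Analogue of (2.15) for n = 2: second derivative of W at J = 0 (central difference)."""
    return (W(h, g, C, hbar) + W(-h, g, C, hbar) - 2.0 * W(0.0, g, C, hbar)) / (h * h)


if __name__ == "__main__":
    g = 0.1
    print("numerical W_2:        ", two_point(g))
    print("tree + one-loop value:", 1.0 - 0.5 * g)
```

For g = 0 the second derivative of W reproduces C, the free propagator. For small g the numerical value differs from C - (g/2) ħ C^3 only at order g^2, and the correction of order ħ is the one-loop (tadpole) contribution, in line with footnote a.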
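Besides the UV-regularization determined by the cutoff $\Lambda_0$, imperative to have a well-defined functional integral (2.12), an additional flowing cutoff $\Lambda$ has been built in, suppressing smaller momenta. It is a merely technical device, introduced by Polchinski [29] and inspired by Wilson’s view of renormalization [18–20]. Decreasing $\Lambda$ from its maximal value $\Lambda = \Lambda_0$ to its physical value $\Lambda = 0$ gradually takes into account the momentum domain, starting at high momenta. In mathematical terms: the parameter $\Lambda$ interpolates continuously between a δ-measure (i.e. absence of quantum effects) at $\Lambda = \Lambda_0$ and the Gaussian measure $d\mu_{0,\Lambda_0}$ on a UV-regularized field at $\Lambda = 0$. Of course, as stressed after Eq. (2.10), such an interpolation can also be realized by cutoff functions other than the one in (2.7) used here.
In order to make use of the flow parameter $\Lambda$ it is advantageous to consider the (free propagator-) amputated truncated Schwinger functions with generating functional $L^{\Lambda,\Lambda_0}(\varphi)$, $\varphi \in C^\infty(\Omega)$, defined as
$$e^{-\frac{1}{\hbar}\left( L^{\Lambda,\Lambda_0}(\varphi) + I^{\Lambda,\Lambda_0} \right)} = \int d\mu_{\Lambda,\Lambda_0}(\phi)\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\phi + \varphi)} , \tag{2.16}$$
$$L^{\Lambda,\Lambda_0}(0) = 0 . \tag{2.17}$$
The constant $I^{\Lambda,\Lambda_0}$ is the vacuum part of the theory. Translating on the r.h.s. in (2.16) the source function $\varphi$ to the measure and using (2.3) leads to
$$e^{-\frac{1}{\hbar}\left( L^{\Lambda,\Lambda_0}(\varphi) + I^{\Lambda,\Lambda_0} \right)} = e^{-\frac{1}{2}\langle \varphi, (\hbar C^{\Lambda,\Lambda_0})^{-1}\varphi\rangle}\, Z^{\Lambda,\Lambda_0}\big( (C^{\Lambda,\Lambda_0})^{-1}\varphi \big) , \tag{2.18}$$
relating the generating functionals $Z$ and $L$. From this, together with the definition (2.14), it finally follows that
$$L^{\Lambda,\Lambda_0}(\varphi) = \frac{1}{2}\langle \varphi, (C^{\Lambda,\Lambda_0})^{-1}\varphi\rangle - W^{\Lambda,\Lambda_0}\big( (C^{\Lambda,\Lambda_0})^{-1}\varphi \big) . \tag{2.19}$$
Denoting by $\dot{C}^{\Lambda,\Lambda_0}$ the derivative of the covariance $C^{\Lambda,\Lambda_0}$ with respect to the flow parameter $\Lambda$ we observe, with $\Lambda_0$ kept fixed,
$$\frac{d}{d\Lambda} \int d\mu_{\Lambda,\Lambda_0}(\phi)\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\phi+\varphi)} = \frac{\hbar}{2} \int d\mu_{\Lambda,\Lambda_0}(\phi) \left\langle \frac{\delta}{\delta\phi}, \dot{C}^{\Lambda,\Lambda_0} \frac{\delta}{\delta\phi} \right\rangle e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\phi+\varphi)} = \frac{\hbar}{2} \left\langle \frac{\delta}{\delta\varphi}, \dot{C}^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi} \right\rangle \int d\mu_{\Lambda,\Lambda_0}(\phi)\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\phi+\varphi)} ,$$
where in the first step (2.6) has been used, whereas the second step follows from the integrand’s particular dependence on the field $\phi$. Hence, because of Eq. (2.16) we obtain the differential equation
$$\frac{d}{d\Lambda}\, e^{-\frac{1}{\hbar}\left( L^{\Lambda,\Lambda_0}(\varphi) + I^{\Lambda,\Lambda_0} \right)} = \frac{\hbar}{2} \left\langle \frac{\delta}{\delta\varphi}, \dot{C}^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi} \right\rangle e^{-\frac{1}{\hbar}\left( L^{\Lambda,\Lambda_0}(\varphi) + I^{\Lambda,\Lambda_0} \right)} . \tag{2.20}$$
The reader notices that the relation (2.6) has been used in the case of a nonpolynomial function. Therefore this extension has to be understood in terms of a formal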
power series expansion, i.e. disregarding the question of convergence. Upon explicit differentiation in (2.20) follows the Wilson flow equation
$$\frac{d}{d\Lambda} \left( L^{\Lambda,\Lambda_0}(\varphi) + I^{\Lambda,\Lambda_0} \right) = \frac{\hbar}{2} \left\langle \frac{\delta}{\delta\varphi}, \dot{C}^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi} \right\rangle L^{\Lambda,\Lambda_0}(\varphi) - \frac{1}{2} \left\langle \frac{\delta}{\delta\varphi} L^{\Lambda,\Lambda_0}(\varphi), \dot{C}^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi} L^{\Lambda,\Lambda_0}(\varphi) \right\rangle . \tag{2.21}$$
The form of Eq. (2.20) strongly resembles the heat equation. Defining the functional Laplace operator
$$\Delta_{\Lambda,\Lambda_0} = \frac{1}{2} \left\langle \frac{\delta}{\delta\varphi}, C^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi} \right\rangle , \tag{2.22}$$
the unique solution of the differential equation (2.20), already given in the form (2.16), can also be written as
$$e^{-\frac{1}{\hbar}\left( L^{\Lambda,\Lambda_0}(\varphi) + I^{\Lambda,\Lambda_0} \right)} = e^{\hbar \Delta_{\Lambda,\Lambda_0}}\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\varphi)} . \tag{2.23}$$
Since $\Delta_{\Lambda,\Lambda_0}$ commutes with its derivative $\dot{\Delta}_{\Lambda,\Lambda_0}$ with respect to $\Lambda$, the r.h.s. of (2.23) satisfies the differential equation. Moreover, the initial condition holds because of $\Delta_{\Lambda_0,\Lambda_0} = 0$ and $I^{\Lambda_0,\Lambda_0} = 0$.
At this point, several remarks concerning the mathematical aspect of the steps performed are in order:
(i) Our aim with these preparatory steps is to generate the system of flow equations satisfied by the regularized Schwinger functions of the theory, when considered in the perturbative sense of formal power series. This system then is taken as the starting point for a proof of perturbative renormalizability. The basic “root” is the UV-regularized finite-volume generating functional (2.12) or one of its direct descendants (2.14) and (2.16). Expanding in their respective integrands the exponential function in a power series would provide the standard perturbation expansion in terms of (regularized) Feynman integrals. Bearing in mind our stated goal, we could already view the steps performed in the restricted sense of formal power series.
(ii) We mention that in [44], to begin on safe ground, the generating functional of the theory has first been formulated on a finite space-time lattice in order to derive the (perturbative) flow equation (implying a finite-dimensional Gaussian integral), and the limit to continuous infinite space-time taken afterwards.
(iii) Rigorous analysis beyond perturbation theory of flow equations of the Wilson type (2.21) is the subject dealt with in [23], using convergent expansion techniques. Such techniques are developed in the monograph [25].
The flow equation (2.21) for the generating functional $L^{\Lambda,\Lambda_0}(\varphi)$ encodes a system of flow equations for the corresponding n-point functions, $n \in \mathbb{N}$, and for the vacuum part $I^{\Lambda,\Lambda_0}$. The flow of the latter is determined by considering Eq. (2.21) at $\varphi = 0$.
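The structure of the flow equation (2.21) can be verified numerically in a zero-dimensional toy model. The following is a minimal sketch, assuming a single real variable instead of a field, the bare interaction g φ^4/4!, and the covariance C itself taken as the flow parameter, so that the derivative of the covariance equals one. In this setting (2.21) reads dG/dC = (ħ/2) G'' - (1/2) (G')^2 for G = L + I. All function names, the quadrature and the step sizes are hypothetical choices for illustration.

```python
from math import exp, log, pi, sqrt


def bare_interaction(phi, g=1.0):
    """Toy bare interaction g * phi^4 / 4! (no counterterms)."""
    return g * phi**4 / 24.0


def effective(phi_ext, C, hbar=1.0, n=2001, width=10.0):
    """Zero-dimensional analogue of (2.16): returns L(phi_ext) + I.

    Computes -hbar * log of the integral of exp(-L0(phi + phi_ext)/hbar)
    against the normalised Gaussian of variance hbar * C. The integrand is
    negligible at the ends of the grid, so a plain Riemann sum with uniform
    weights coincides with the trapezoidal rule to machine precision.
    """
    var = hbar * C
    half = width * sqrt(var)
    step = 2.0 * half / (n - 1)
    norm = 1.0 / sqrt(2.0 * pi * var)
    total = 0.0
    for i in range(n):
        phi = -half + i * step
        total += norm * exp(-phi * phi / (2.0 * var)) * exp(-bare_interaction(phi + phi_ext) / hbar)
    return -hbar * log(total * step)


def flow_residual(phi_ext, C, hbar=1.0, h=1e-3):
    """Left side minus right side of the toy flow equation, via central differences."""
    g0 = effective(phi_ext, C, hbar)
    d_c = (effective(phi_ext, C + h, hbar) - effective(phi_ext, C - h, hbar)) / (2.0 * h)
    up = effective(phi_ext + h, C, hbar)
    down = effective(phi_ext - h, C, hbar)
    d1 = (up - down) / (2.0 * h)
    d2 = (up - 2.0 * g0 + down) / (h * h)
    return d_c - (0.5 * hbar * d2 - 0.5 * d1 * d1)


if __name__ == "__main__":
    print("residual of the flow equation:", flow_residual(0.3, 0.5))
    print("initial condition:", effective(0.3, 1e-8), bare_interaction(0.3))
```

The residual vanishes up to the finite-difference error, and for C close to zero the functional returns the bare interaction, mirroring the initial condition of (2.23). In the text the covariance decreases to zero as Λ increases to Λ0, so that the same equation is read with the sign of the covariance derivative reversed.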
From the translation invariance of the theory it follows that the 2-point function is a distribution depending on the difference variable $x - y$ only (suppressing momentarily the superscript $\Lambda, \Lambda_0$),
$$\frac{\delta}{\delta\varphi(x)} \frac{\delta}{\delta\varphi(y)}\, L(\varphi)\Big|_{\varphi=0} =: L_2(x - y) ,$$
and thus
$$\left\langle \frac{\delta}{\delta\varphi}, \dot{C} \frac{\delta}{\delta\varphi} \right\rangle L(\varphi)\Big|_{\varphi=0} = \int_\Omega dx \int_\Omega dy\, \dot{C}(x - y)\, L_2(x - y) = |\Omega| \int_\Omega dz\, \dot{C}(z)\, L_2(z) .$$
Because of the emerging dependence on the volume $|\Omega|$ the flow equation of the vacuum part cannot be treated in the infinite volume limit. However, due to the covariance (2.8), which corresponds to a massive particle and thus decays exponentially, the flow equation for the n-point functions can and in the sequel will be treated in this limit. Hence, at least one functional derivative has to act on the flow equation (2.21). Due to the translation invariance of the theory it is convenient to consider the generating functional $L^{\Lambda,\Lambda_0}(\varphi)$, $\varphi \in \mathcal{S}(\mathbb{R}^4)$, in terms of the Fourier transformed source field $\hat{\varphi}$; the conventions used are
$$\varphi(x) = \int_p e^{ipx}\, \hat{\varphi}(p) , \qquad \int_p := \int_{\mathbb{R}^4} \frac{d^4p}{(2\pi)^4} , \tag{2.24}$$
implying for the functional derivative $\delta_{\varphi(x)} := \frac{\delta}{\delta\varphi(x)}$ the transformation
$$\delta_{\varphi(x)} = (2\pi)^4 \int_p e^{-ipx}\, \delta_{\hat{\varphi}(p)} .$$
From the generating functional $L^{\Lambda,\Lambda_0}(\varphi)$ the correlation functions are obtained by functional derivation, $n \in \mathbb{N}$,
$$(2\pi)^{4(n-1)}\, \delta_{\hat{\varphi}(p_n)} \cdots \delta_{\hat{\varphi}(p_1)}\, L^{\Lambda,\Lambda_0}(\varphi)\Big|_{\varphi=0} = \delta(p_1 + \cdots + p_n)\, L_n^{\Lambda,\Lambda_0}(p_1, \dots, p_n) . \tag{2.25}$$
The amputated truncated n-point function $L_n^{\Lambda,\Lambda_0}(p_1, \dots, p_n)$ is a totally symmetric function of the momenta $p_1, \dots, p_n$ and, moreover, due to the δ-function, $p_n := -p_1 - \cdots - p_{n-1}$. (In the case where the bare interaction (2.11) shows the mirror symmetry $L^{\Lambda_0,\Lambda_0}(-\phi) = L^{\Lambda_0,\Lambda_0}(\phi)$ all n-point functions with $n$ odd vanish.) Observing the definition (2.25) we obtain from (2.21) the system of flow equations for the n-point functions, $n \in \mathbb{N}$,
$$\partial_\Lambda L_n^{\Lambda,\Lambda_0}(p_1, \dots, p_n) = \frac{\hbar}{2} \int_k \partial_\Lambda C^{\Lambda,\Lambda_0}(k) \cdot L_{n+2}^{\Lambda,\Lambda_0}(k, p_1, \dots, p_n, -k) - \frac{1}{2} \sum_{r=0}^{n}\ \sum_{i_1 < \cdots} L_{r+1}^{\Lambda,\Lambda_0}(p_{i_1}, \dots, p_{i_r}, p)\, \partial_\Lambda C^{\Lambda,\Lambda_0}(p)\ [\ldots]$$

[…]

[…] $n + |w| > 4$ have always to be considered first (for fixed $l$, $n$). In these cases the bound (2.48) is integrated downwards, observing (2.45), similarly as in the tree order, (2.44). In place of the pure power behavior, however, we now have
$$\int_\Lambda^{\Lambda_0} d\lambda\, (\lambda + m)^{4-n-|w|-1}\, P\!\left( \log\frac{\lambda + m}{m} \right) < (\Lambda + m)^{4-n-|w|}\, P_1\!\left( \log\frac{\Lambda + m}{m} \right)$$
with a new polynomial on the r.h.s., see the end of this section, Sec. 2.6. Thus, the assertion is established in the cases $n + |w| > 4$.
(a2) In the cases $n + |w| \le 4$ the claim (2.41) has to be deduced from the respective integrated flow equation (2.46) at the renormalization point, followed by an extension to general momenta by way of (2.47), proceeding in the order of induction. That is, one starts with the particular (momentum independent) case $n = 1$ and continues successively with the cases ($n = 2$, $|w| = 2$), ($n = 2$, $|w| = 1$), and so on. Converting in the obvious way Eq. (2.46) into an inequality for
absolute values, a bound on the integral is gained using the bound (2.48) at vanishing momenta:

$$\Big|\int_0^\Lambda d\lambda\,\partial_\lambda\partial^w L_{l,n}^{\lambda,\Lambda_0}(0,\dots,0)\Big| \le \int_0^\Lambda d\lambda\,(\lambda+m)^{4-n-|w|-1}\,P\Big(\log\frac{\lambda+m}{m}\Big) \le (\Lambda+m)^{4-n-|w|}\,P_1\Big(\log\frac{\Lambda+m}{m}\Big)\,,$$

where $P_1$ is a new polynomial, see Sec. 2.6. Hence, the assertion (2.41) is established at the renormalization point. In each case extension to general momenta via (2.47) is guaranteed by bounds established before. This concludes the proof of Proposition 2.1.

The boundedness due to Proposition 2.1 would still allow an oscillatory dependence on Λ0. Such an (implausible) behaviour is excluded by

Proposition 2.2 (Convergence). For all l ∈ N0, n ∈ N, w from (2.30) and for 0 ≤ Λ ≤ Λ0 the following bound holds:

$$\big|\partial_{\Lambda_0}\partial^w L_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n)\big| \le \frac{(\Lambda+m)^{5-n-|w|}}{(\Lambda_0+m)^2}\,P_3\Big(\log\frac{\Lambda_0+m}{m}\Big)\,P_4\Big(\frac{|p_i|}{\Lambda+m}\Big)\,. \tag{2.49}$$

Since we need this proposition for large values of Λ0 only, we then obviously can write

$$\big|\partial_{\Lambda_0}\partial^w L_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n)\big| \le \frac{(\Lambda+m)^{5-n-|w|}}{(\Lambda_0)^2}\,\Big(\log\frac{\Lambda_0}{m}\Big)^{\nu}\,P_4\Big(\frac{|p_i|}{\Lambda+m}\Big) \tag{2.50}$$

with a positive integer ν depending on l, n, w. Integration of these bounds with respect to Λ0 finally shows that for fixed Λ all $L_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n)$ converge to finite limits with Λ0 → ∞. In particular, one obtains for all $\Lambda_0' > \Lambda_0$:

$$\big|L_{l,n}^{0,\Lambda_0}(p_1,\dots,p_n) - L_{l,n}^{0,\Lambda_0'}(p_1,\dots,p_n)\big| < \frac{m^{5-n}}{\Lambda_0}\,\Big(\log\frac{\Lambda_0}{m}\Big)^{\nu}\,P_5\Big(\frac{|p_i|}{m}\Big)\,.$$
Thus, due to the Cauchy criterion, finite limits (2.33) exist, i.e. perturbative renormalizability of the theory considered is demonstrated.

Proof of Proposition 2.2. We integrate the system of flow equations (2.31) according to the induction scheme employed before and differentiate the individual n-point functions with respect to Λ0. The r.h.s. of (2.31) will be denoted by the shorthand $\partial^w R_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n)$. Due to (2.38)–(2.40) the cases (l = 0, n + |w| ≤ 4) evidently satisfy the claim (2.49).

(b1) n + |w| > 4: In these cases, because of the initial condition (2.45), we have

$$-\partial^w L_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n) = \int_\Lambda^{\Lambda_0} d\lambda\,\partial^w R_{l,n}^{\lambda,\Lambda_0}(p_1,\dots,p_n)$$

and hence

$$-\partial_{\Lambda_0}\partial^w L_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n) = \partial^w R_{l,n}^{\Lambda_0,\Lambda_0}(p_1,\dots,p_n) + \int_\Lambda^{\Lambda_0} d\lambda\,\partial_{\Lambda_0}\partial^w R_{l,n}^{\lambda,\Lambda_0}(p_1,\dots,p_n)\,. \tag{2.51}$$
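To the first term on the r.h.s. only the quadratic part of (2.31) contributes, cf. (2.45). It is bounded using Proposition 2.1 and the bound (2.43):

$$\big|\partial^w R_{l,n}^{\Lambda_0,\Lambda_0}(p_1,\dots,p_n)\big| \le (\Lambda_0+m)^{3-n-|w|}\,P_1\Big(\log\frac{\Lambda_0+m}{m}\Big)\,P_2\Big(\frac{|p_i|}{\Lambda_0+m}\Big) \le \frac{(\Lambda+m)^{5-n-|w|}}{(\Lambda_0+m)^2}\,P_1\Big(\log\frac{\Lambda_0+m}{m}\Big)\,P_2\Big(\frac{|p_i|}{\Lambda+m}\Big)\,, \tag{2.52}$$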
valid for 0 ≤ Λ ≤ Λ0, since n + |w| > 4. The integrand of the second term on the r.h.s. of (2.51) is the derivative with respect to Λ0 of the r.h.s. of (2.31). Observing

$$\partial_{\Lambda_0}\partial_\Lambda C^{\Lambda,\Lambda_0}(k) = 0\,, \tag{2.53}$$

we bound the Λ0-derivative considered using Proposition 2.1 and, in accord with the induction hypothesis, Proposition 2.2 together with the bounds (2.42) and (2.43), which are employed in the linear and in the quadratic part, respectively. Proceeding then similarly as in deducing (2.48) yields

$$\big|\partial_{\Lambda_0}\partial^w R_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n)\big| \le \frac{(\Lambda+m)^{5-n-|w|-1}}{(\Lambda_0+m)^2}\,P_7\Big(\log\frac{\Lambda_0+m}{m}\Big)\,P_8\Big(\frac{|p_i|}{\Lambda+m}\Big)\,. \tag{2.54}$$
From this follows upon integration, with the bound on the momenta majorized, a bound on the second term on the r.h.s. of (2.51) that has the form (2.52). Therefore, the assertion (2.49) is deduced if n + |w| > 4.

(b2) n + |w| ≤ 4. Here, the respective flow equations integrated at the renormalization point (2.46) are differentiated with respect to Λ0. They imply, observing that the initial conditions, i.e. the renormalization constants (2.34)–(2.37), do not depend on Λ0, the bound

$$\big|\partial_{\Lambda_0}\partial^w L_{l,n}^{\Lambda,\Lambda_0}(0,\dots,0)\big| \le \int_0^\Lambda d\lambda\,\big|\partial_{\Lambda_0}\partial^w R_{l,n}^{\lambda,\Lambda_0}(0,\dots,0)\big|\,, \tag{2.55}$$

where on the r.h.s. the shorthand introduced before has been used. In deducing the bound (2.54) no restriction on n, w entered. Therefore, we can use it in (2.55) also and obtain upon integration

$$\big|\partial_{\Lambda_0}\partial^w L_{l,n}^{\Lambda,\Lambda_0}(0,\dots,0)\big| \le \frac{(\Lambda+m)^{5-n-|w|}}{(\Lambda_0+m)^2}\,P_3\Big(\log\frac{\Lambda_0+m}{m}\Big)\,. \tag{2.56}$$
Extension of these bounds to general momenta is again achieved via the Taylor formula (2.47) as in the proof of Proposition 2.1. Thus, Proposition 2.2 is proven.

Remark. Renormalizability is a consequence of Proposition 2.2 at the value Λ = 0. From this point of view Proposition 2.1 is of preparatory, technical nature. The bounds established in both propositions are not optimal but sufficient. Their virtue is to allow a concise and complete proof of renormalizability. These bounds can be refined in various ways. We mention that Kopper and Meunier [70], by sharpening the induction hypothesis with respect to momentum derivatives of n-point functions, obtained optimal bounds on the momentum behaviour related to Weinberg's theorem [71].

With Propositions 2.1 and 2.2 established, it is physically important to notice that they even remain valid when the original bare interaction (2.11) is extended by appropriately chosen irrelevant terms: It is sufficient to replace the condition (2.45) by requiring for n + |w| > 4:

$$\big|\partial^w L_{l,n}^{\Lambda_0,\Lambda_0}(p_1,\dots,p_n)\big| \le (\Lambda_0+m)^{4-n-|w|}\,P_1\Big(\log\frac{\Lambda_0+m}{m}\Big)\,P_2\Big(\frac{|p_i|}{\Lambda_0+m}\Big)\,,$$

$$\big|\partial_{\Lambda_0}\partial^w L_{l,n}^{\Lambda_0,\Lambda_0}(p_1,\dots,p_n)\big| \le (\Lambda_0+m)^{3-n-|w|}\,P_3\Big(\log\frac{\Lambda_0+m}{m}\Big)\,P_4\Big(\frac{|p_i|}{\Lambda_0+m}\Big)\,; \tag{2.57}$$
evidently, we can also write Λ0 instead of Λ0 + m everywhere. One first observes that these bounds agree with Propositions 2.1 and 2.2 considered at Λ = Λ0. Moreover, as bounds on the initial conditions to be added in (2.44), (a1) and (2.51), respectively, they can be absorbed in the corresponding bounds on the integrals appearing.

2.4. Insertion of a composite field

Besides the system of n-point functions dealt with up to now, n-point functions with one or more additional composite fields inserted are of considerable physical interest. In particular, the generators of symmetry transformations of a theory appear generally in the form of composite fields. But there are further instances where inserted composite fields (sometimes also called inserted operators) occur. In the sequel we treat the perturbative renormalization of one composite field inserted. Since, by definition, a composite field depends nonlinearly on the basic field (or fields) of the theory considered, new divergences have to be circumvented and hence additional renormalization conditions are required.

As before, we examine the quantum field theory of a real scalar field φ(x) with mass m in four-dimensional Euclidean space-time. Then, a composite field Q(x) is a local polynomial formed in general of the field φ(x) and of its space-time derivatives. It is determined by its classical version $Q_{\rm class}(x)$. If we restrict to achieve
a renormalized theory with one insertion, Q(x) has to be chosen as follows: Let $Q_{\rm class}(x)$ be a monomial having the canonical mass dimension D, then

$$Q(x) = Q_{\rm class}(x) + Q_{\rm c.t.}(x)\,, \tag{2.58}$$

where $Q_{\rm c.t.}(x)$ is a polynomial which is formed of all local terms of canonical mass dimension ≤ D. This polynomial $Q_{\rm c.t.}(x)$ acts as counterterm. If $Q_{\rm class}(x)$ shows a symmetry not violated in the intermediate process of regularization this symmetry can be imposed on $Q_{\rm c.t.}(x)$, too. Since the regularization (2.8) keeps the Euclidean symmetry the counterterms $Q_{\rm c.t.}(x)$ can be restricted to those showing the same tensor type as $Q_{\rm class}(x)$. We illustrate the notion introduced with the example of a scalar composite field having D = 3:

$$Q_{\rm class}(x) = \frac{1}{3!}\,\phi(x)^3\,, \tag{2.59}$$

$$Q_{\rm c.t.}(x) = \frac{1}{3!}\,r_1(\Lambda_0)\,\phi(x)^3 - r_2(\Lambda_0)\,\Delta\phi(x) + \frac{1}{2!}\,r_3(\Lambda_0)\,\phi(x)^2 + r_4(\Lambda_0)\,\phi(x) + r_5(\Lambda_0)\,, \tag{2.60}$$
with coefficients $r_i(\Lambda_0) = O(\hbar)$, i = 1, . . . , 5.

Our aim here is to show the renormalizability of the theory considered in Sec. 2.3 with one insertion of a scalar composite field Q(x) of mass dimension D. It will turn out that we can essentially proceed as before, taking minor modifications into account. Hence, we can refrain from repeating definitions and arguments already introduced. In place of the bare interaction (2.11) one starts with a modified one:

$$\tilde L^{\Lambda_0,\Lambda_0}(\varrho;\phi) + \tilde I^{\Lambda_0,\Lambda_0}(\varrho) = L^{\Lambda_0,\Lambda_0}(\phi) + \int dx\,\varrho(x)\,Q(x)\,,\qquad \tilde L^{\Lambda_0,\Lambda_0}(\varrho;0) = 0\,, \tag{2.61}$$

where the composite field (2.58), coupled to an external source $\varrho \in C^\infty(\Omega)$, has been added (footnote e). Then, as in (2.12), the generating functional of regularized Schwinger functions with insertions Q is obtained upon functional integration:

$$\tilde Z^{\Lambda,\Lambda_0}(\varrho;J) = \int d\mu_{\Lambda,\Lambda_0}(\phi)\;e^{-\frac1\hbar\left(\tilde L^{\Lambda_0,\Lambda_0}(\varrho;\phi)+\tilde I^{\Lambda_0,\Lambda_0}(\varrho)\right)+\frac1\hbar\langle\phi,J\rangle}\,. \tag{2.62}$$

Moreover, passing similarly as before to regularized amputated truncated Schwinger functions with insertions, the Eqs. (2.16) and (2.17) are replaced by

$$e^{-\frac1\hbar\left(\tilde L^{\Lambda,\Lambda_0}(\varrho;\varphi)+\tilde I^{\Lambda,\Lambda_0}(\varrho)\right)} = \int d\mu_{\Lambda,\Lambda_0}(\phi)\;e^{-\frac1\hbar\left(\tilde L^{\Lambda_0,\Lambda_0}(\varrho;\phi+\varphi)+\tilde I^{\Lambda_0,\Lambda_0}(\varrho)\right)}\,, \tag{2.63}$$

$$\tilde L^{\Lambda,\Lambda_0}(\varrho;0) = 0\,. \tag{2.64}$$

Footnote e: $\tilde I^{\Lambda_0,\Lambda_0}(\varrho)$ is the field independent part that possibly has to enter the modified bare action, as e.g. in (2.60).
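In view of the implicit notation (2.58) we stress that the shift of the field to φ + ϕ involves the field dependent insertion Q(x) in (2.61), too. In exactly the same way as (2.18) was obtained, we find the relation between the generating functionals $\tilde L$ and $\tilde Z$:

$$e^{-\frac1\hbar\left(\tilde L^{\Lambda,\Lambda_0}(\varrho;\varphi)+\tilde I^{\Lambda,\Lambda_0}(\varrho)\right)} = e^{-\frac12\langle\varphi,(\hbar C^{\Lambda,\Lambda_0})^{-1}\varphi\rangle}\;\tilde Z^{\Lambda,\Lambda_0}\big(\varrho;(C^{\Lambda,\Lambda_0})^{-1}\varphi\big)\,. \tag{2.65}$$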
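Since (2.63) and (2.16) have the same form we obtain the flow equation of the functional $\tilde L^{\Lambda,\Lambda_0}(\varrho;\varphi)$ by substituting in the flow equation (2.21):

$$L^{\Lambda,\Lambda_0}(\varphi) \to \tilde L^{\Lambda,\Lambda_0}(\varrho;\varphi)\,,\qquad I^{\Lambda,\Lambda_0} \to \tilde I^{\Lambda,\Lambda_0}(\varrho)\,.$$

As a consequence the generating functional of the amputated truncated Schwinger functions with one insertion Q,

$$L_{(1)}^{\Lambda,\Lambda_0}(x;\varphi) := \frac{\delta}{\delta\varrho(x)}\,\tilde L^{\Lambda,\Lambda_0}(\varrho;\varphi)\Big|_{\varrho(x)=0}\,, \tag{2.66}$$

then satisfies the flow equation

$$\frac{d}{d\Lambda}\Big(L_{(1)}^{\Lambda,\Lambda_0}(x;\varphi) + I_{(1)}^{\Lambda,\Lambda_0}(x)\Big) = \frac{\hbar}{2}\Big\langle\frac{\delta}{\delta\varphi},\,\dot C^{\Lambda,\Lambda_0}\frac{\delta}{\delta\varphi}\Big\rangle L_{(1)}^{\Lambda,\Lambda_0}(x;\varphi) - \Big\langle\frac{\delta}{\delta\varphi}L^{\Lambda,\Lambda_0}(\varphi),\,\dot C^{\Lambda,\Lambda_0}\frac{\delta}{\delta\varphi}L_{(1)}^{\Lambda,\Lambda_0}(x;\varphi)\Big\rangle\,, \tag{2.67}$$

involving the vacuum part with one insertion

$$I_{(1)}^{\Lambda,\Lambda_0}(x) := \frac{\delta}{\delta\varrho(x)}\,\tilde I^{\Lambda,\Lambda_0}(\varrho)\Big|_{\varrho(x)=0}\,. \tag{2.68}$$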
In deriving the r.h.s. of (2.67) use of the symmetry $\dot C^{\Lambda,\Lambda_0}(x-y) = \dot C^{\Lambda,\Lambda_0}(y-x)$ has been made. We note that the functional $L_{(1)}$ satisfies a linear equation. Because of the insertion the full flow equation (2.67) can be studied in the infinite volume limit $\Omega \to \mathbb R^4$, $\varphi \in \mathcal S(\mathbb R^4)$. The Fourier transform with respect to the insertion is defined as

$$\hat L_{(1)}^{\Lambda,\Lambda_0}(q;\varphi) := \int dx\,e^{iqx}\,L_{(1)}^{\Lambda,\Lambda_0}(x;\varphi)\,, \tag{2.69}$$

$$\hat I_{(1)}^{\Lambda,\Lambda_0}(q) := \int dx\,e^{iqx}\,I_{(1)}^{\Lambda,\Lambda_0}(x) = (2\pi)^4\,i^{\Lambda,\Lambda_0}\,\delta(q)\,. \tag{2.70}$$

Furthermore, the generating functional is decomposed, observing the conventions (2.24), n ∈ N,

$$(2\pi)^{4(n-1)}\,\delta_{\hat\varphi(p_n)}\cdots\delta_{\hat\varphi(p_1)}\,\hat L_{(1)}^{\Lambda,\Lambda_0}(q;\varphi)\Big|_{\varphi=0} = \delta(q+p_1+\cdots+p_n)\,L_{(1)\,n}^{\Lambda,\Lambda_0}(q;p_1,\dots,p_n)\,. \tag{2.71}$$

The amputated truncated n-point function with one insertion carrying the momentum q, $L_{(1)\,n}^{\Lambda,\Lambda_0}(q;p_1,\dots,p_n)$,
is at fixed q totally symmetric in the momenta $p_1,\dots,p_n$. Furthermore, the sum of all momenta has to vanish because of the δ-constraint in (2.71). From (2.67) and by proceeding exactly as before from (2.25) to (2.31), after a loop expansion of the n-point functions, n ∈ N, and of the vacuum part $i^{\Lambda,\Lambda_0}$,

$$L_{(1)\,n}^{\Lambda,\Lambda_0}(q;p_1,\dots,p_n) = \sum_{l=0}^{\infty}\hbar^l\,L_{(1)\,l,n}^{\Lambda,\Lambda_0}(q;p_1,\dots,p_n)\,, \tag{2.72}$$
we arrive at the system of flow equations with one insertion:

$$\partial_\Lambda i_l^{\Lambda,\Lambda_0} = \frac12\int_k \partial_\Lambda C^{\Lambda,\Lambda_0}(k)\cdot L_{(1)\,l-1,2}^{\Lambda,\Lambda_0}(0;k,-k) - {\sum_{l_1,l_2}}' L_{l_1,1}^{\Lambda,\Lambda_0}(0)\,\partial_\Lambda C^{\Lambda,\Lambda_0}(0)\cdot L_{(1)\,l_2,1}^{\Lambda,\Lambda_0}(0;0)\,, \tag{2.73}$$

$$\partial_\Lambda\partial^w L_{(1)\,l,n}^{\Lambda,\Lambda_0}(q;p_1,\dots,p_n) = \frac12\int_k \partial_\Lambda C^{\Lambda,\Lambda_0}(k)\cdot\partial^w L_{(1)\,l-1,n+2}^{\Lambda,\Lambda_0}(q;k,-k,p_1,\dots,p_n) - {\sum_{n_1,n_2}}'\,{\sum_{l_1,l_2}}'\,{\sum_{w_1,w_2,w_3}}' c_{\{w_i\}}\Big[\partial^{w_1}L_{l_1,n_1+1}^{\Lambda,\Lambda_0}(p_1,\dots,p_{n_1},p)\cdot\partial^{w_3}\partial_\Lambda C^{\Lambda,\Lambda_0}(p)\cdot\partial^{w_2}L_{(1)\,l_2,n_2+1}^{\Lambda,\Lambda_0}(q;-p,p_{n_1+1},\dots,p_n)\Big]_{\rm rsym}\,, \tag{2.74}$$

$$p = -p_1-\cdots-p_{n_1} = q+p_{n_1+1}+\cdots+p_n\,.$$

The notation used above has been introduced in (2.28)–(2.31). Furthermore, the derivations $\partial^w$ can be restricted to the momenta $p_1,\dots,p_n$, because of $q = -p_1-\cdots-p_n$. The vacuum part does not act back on the functional $L_{(1)}$. We therefore disregard its flow (2.73) and just state that in each loop order the bare parameter $i_l^{\Lambda_0,\Lambda_0}$ is determined by a renormalization constant $i_l^{0,\Lambda_0}$ prescribed at Λ = 0. The task is to show that finite limits

$$\lim_{\Lambda_0\to\infty}\,\lim_{\Lambda\to0}\,L_{(1)\,l,n}^{\Lambda,\Lambda_0}(q;p_1,\dots,p_n)\,,\qquad n\in\mathbb N,\ l\in\mathbb N_0\,, \tag{2.75}$$
can be obtained, given the n-point functions without insertion which satisfy the Propositions 2.1 and 2.2. This can be achieved following closely the corresponding steps performed before without insertion in Sec. 2.3; the demonstration here can thus be presented in a concise way. Due to (2.69)–(2.70), (2.66), (2.61), the bare functional is given by

$$\hat L_{(1)}^{\Lambda_0,\Lambda_0}(q;\varphi) + \hat I_{(1)}^{\Lambda_0,\Lambda_0}(q) := \int dx\,e^{iqx}\,Q(x)\,. \tag{2.76}$$

The tree order is completely determined by the classical part of the insertion (2.58). This classical part $Q_{\rm class}$ yields the initial condition in integrating the system of flow equations (2.74) for l = 0 and general momenta from the initial point Λ = Λ0
downwards to smaller values of Λ ascending successively in n. It is not necessary to prescribe the order in which the derivations $\partial^w$ are treated. Since $Q_{\rm class}$ is assumed to be a monomial of canonical mass dimension D and containing $n_0$ field factors, $2 \le n_0 \le D$, the key properties (2.38) imply for all w:

$$\partial^w L_{(1)\,0,1}^{\Lambda,\Lambda_0}(q;p) = 0\,,\qquad \partial_\Lambda\partial^w L_{(1)\,0,2}^{\Lambda,\Lambda_0}(q;p_1,p_2) = 0\,. \tag{2.77}$$

The nonvanishing tree order starts at $n = n_0$ with

$$\partial^w L_{(1)\,0,n_0}^{\Lambda,\Lambda_0}(q;\{p_i\}) = \partial^w L_{(1)\,0,n_0}^{\Lambda_0,\Lambda_0}(q;\{p_i\})\,, \tag{2.78}$$

i.e. the initial condition. Of course the limits (2.75) exist in the tree order (since no integrations occur). In view of the mass dimension D of the insertion, the bounds

$$\big|\partial^w L_{(1)\,0,n}^{\Lambda,\Lambda_0}(q;p_1,\dots,p_n)\big| \le (\Lambda+m)^{D-n-|w|}\,P\Big(\frac{|p_i|}{\Lambda+m}\Big) \tag{2.79}$$

are established for later use. If $n_0 < D$, apart from (2.78) there are other relevant instances n + |w| ≤ D; they should be evaluated explicitly without using the bounds in the flow equation. (It is instructive to compare the examples $Q_{\rm class} = \phi\Delta\phi,\ \phi^4$, both having D = 4.) The irrelevant cases n + |w| > D in (2.79) are established similarly as (2.44).

For l > 0 the coefficients of the counterterms inherent in (2.76), extracted via (2.71) and (2.72), have to depend on Λ0,

$$\partial^w L_{(1)\,l,n}^{\Lambda_0,\Lambda_0}(0;0,\dots,0) = r_{l,n,w}(\Lambda_0)\,,\qquad n+|w|\le D,\ l>0\,. \tag{2.80}$$

They are determined by prescribing related renormalization conditions for vanishing flow parameter Λ at a renormalization point which is again chosen at vanishing momenta. Hence, order-by-order, the real constants, l > 0,

$$\partial^w L_{(1)\,l,n}^{0,\Lambda_0}(0;0,\dots,0) =: r^{R}_{l,n,w}\,,\qquad n+|w|\le D\,, \tag{2.81}$$

can be freely chosen, provided they respect the symmetry of $L_{(1)}^{\Lambda,\Lambda_0}$.
Proposition 2.3 (Boundedness). Let l ∈ N0, n ∈ N, w from (2.30) and 0 ≤ Λ ≤ Λ0, then

$$\big|\partial^w L_{(1)\,l,n}^{\Lambda,\Lambda_0}(q;p_1,\dots,p_n)\big| \le (\Lambda+m)^{D-n-|w|}\,P_1\Big(\log\frac{\Lambda+m}{m}\Big)\,P_2\Big(\frac{|p_i|}{\Lambda+m}\Big)\,. \tag{2.82}$$
The symbol P denotes polynomials with nonnegative coefficients which depend on l, n, w but not on {pi }, Λ, Λ0 . For l = 0 all polynomials P1 reduce to positive constants. Proof. Due to (2.79), the assertion (2.82) is already shown in the tree order. Given the set of n-point functions without insertions satisfying Proposition 2.1, one proceeds for l > 0 inductively as in the proof of the latter proposition, i.e. (i) ascending
in l, (ii) at fixed l ascending in n, (iii) at fixed l, n descending in w. Inspecting the flow equations (2.74), it is easily seen that for any given l, n, w on the l.h.s. the contributions to the r.h.s., because of the key properties (2.38), always precede those on the l.h.s. in the order of induction adopted. Imitating the steps leading to (2.48) provides the bound for l, n ∈ N,

$$\big|\partial_\Lambda\partial^w L_{(1)\,l,n}^{\Lambda,\Lambda_0}(q;p_1,\dots,p_n)\big| \le (\Lambda+m)^{D-n-|w|-1}\,P_5\Big(\log\frac{\Lambda+m}{m}\Big)\,P_6\Big(\frac{|p_i|}{\Lambda+m}\Big)\,. \tag{2.83}$$
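The order of induction (i)–(iii) stated above, together with the split into irrelevant cases n + |w| > D and relevant cases n + |w| ≤ D, can be made concrete by a short enumeration. The following Python sketch is an illustration only: the function name `induction_order` and the cut-offs `l_max`, `n_max`, `w_max` are hypothetical (the proof itself involves no truncation), and only the order |w| of the momentum derivative is tracked, not the multi-index w.

```python
def induction_order(l_max, n_max, w_max, D):
    """Yield (l, n, |w|, case) in the order used in the proof:
    l ascending, at fixed l n ascending, at fixed (l, n) |w| descending.

    case == "irrelevant": n + |w| > D, flow integrated downwards from Lambda_0
    case == "relevant":   n + |w| <= D, flow integrated upwards from Lambda = 0
                          at the renormalization point
    The cut-offs l_max, n_max, w_max are assumptions made for illustration.
    """
    for l in range(1, l_max + 1):                 # (i) ascending in l
        for n in range(1, n_max + 1):             # (ii) ascending in n
            for w_abs in range(w_max, -1, -1):    # (iii) descending in |w|
                case = "irrelevant" if n + w_abs > D else "relevant"
                yield l, n, w_abs, case


if __name__ == "__main__":
    # Example: an insertion of mass dimension D = 3, one loop order.
    for step in induction_order(l_max=1, n_max=2, w_max=3, D=3):
        print(step)
```

Descending in |w| at fixed (l, n) means that the irrelevant cases are always visited before the relevant ones, which is what the Taylor extension (2.47) to general momenta relies on.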
(a1) In the cases n + |w| > D this bound is integrated downwards from the initial point Λ = Λ0 with vanishing initial conditions, see (2.76). From this follows easily (2.82) for n + |w| > D.

(a2) If n + |w| ≤ D, however, the respective flow equation (2.74) has to be integrated upwards at the renormalization point, as in (2.46), employing now the renormalization conditions (2.81). The bound (2.83) then implies the claim (2.82) at vanishing momenta. Again as in Sec. 2.3, extension to general momenta is accomplished appealing to the Taylor formula (2.47). Thus, Proposition 2.3 is proven.

In complete analogy with the steps taken in Sec. 2.3, the proposition just proven prepares the decisive

Proposition 2.4 (Convergence). Let l ∈ N0, n ∈ N, w from (2.30), 0 ≤ Λ ≤ Λ0, and Λ0 sufficiently large, then

$$\big|\partial_{\Lambda_0}\partial^w L_{(1)\,l,n}^{\Lambda,\Lambda_0}(q;p_1,\dots,p_n)\big| \le \frac{(\Lambda+m)^{D+1-n-|w|}}{(\Lambda_0)^2}\,\Big(\log\frac{\Lambda_0}{m}\Big)^{\nu}\,P_4\Big(\frac{|p_i|}{\Lambda+m}\Big) \tag{2.84}$$

with a positive integer ν depending on l, n, w only.
Proof. One first verifies the assertion directly in the tree order for the relevant cases n + |w| ≤ D. From there on, the proof is just a replica of the proof given for Proposition 2.2: wherever the exponent 4 + 1 − n − |w| appears there, it is replaced here by D + 1 − n − |w|, thus proving the assertion (2.84).

Finally, integration of the bound (2.84) of Proposition 2.4 demonstrates that the renormalized regularized n-point functions with one insertion have finite limits (2.75).

2.5. Finite temperature field theory

There are essentially two formulations of quantum fields at finite temperature: a real-time approach to treat dynamical effects, and an imaginary-time approach to
describe equilibrium properties [72]. In this section the problem of renormalization in a temperature independent way is considered. Such a renormalization is required studying the T -dependence of observables, since then the relation between bare and renormalized coupling constants must not depend on the temperature. Our aim here is to show within the imaginary-time formalism, that a quantum field theory renormalized at T = 0 stays also renormalized at any T > 0. For the sake of a succinct presentation the symmetric Φ4 -theory is treated and the generalization to the nonsymmetric theory is stated at the end. The first steps to be taken do not differ from the zero temperature case: Starting from a finite domain, given by a 4-dimensional torus Ω, and the Gaussian measure with the regularized covariance (2.7), we obtain Wilson’s flow equation (2.21). Here, the bare interaction (2.11) is restricted to the symmetric theory, as already mentioned, i.e. putting f = v(Λ0 ) = b(Λ0 ) ≡ 0 .
(2.85)
Disregarding as before the flow of the vacuum part $I^{\Lambda,\Lambda_0}$, we imagine at least one functional derivative acting on the flow equation (2.21). Then we can pass to the spatial infinite volume limit, but keeping the periodicity in the imaginary time $x_4$ and choosing the period equal to the inverse temperature: $l_4 = \beta \equiv 1/T$. Hence, in this limit the space-time domain is $\mathbb R^3 \times S^1$ and the theory shows the reduced symmetry $O(3)\times Z_2$, as compared to the O(4)-symmetry at T = 0. Correspondingly, the dual Fourier variables (momentum vectors) are

$$\underline p := (\vec p,\ \underline p_4)\,,\qquad \vec p\in\mathbb R^3\,,\qquad \underline p_4 = 2\pi nT\,,\quad n\in\mathbb Z\,, \tag{2.86}$$

and hence we define

$$\int_{\underline p} := T\sum_{n\in\mathbb Z}\int_{\mathbb R^3}\frac{d^3p}{(2\pi)^3}\,. \tag{2.87}$$
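In the sequel we underline a symbol denoting a quantity at finite temperature or write the T-dependence explicitly. In place of (2.24) the Fourier transform takes the form

$$\varphi(x) = \int_{\underline p} e^{i\underline p x}\,\hat\varphi(\underline p)\,,\qquad \hat\varphi(\underline p) = \int_0^\beta dx_4\int_{\mathbb R^3} d^3x\;e^{-i\underline p x}\,\varphi(x)\,, \tag{2.88}$$

implying for a functional derivation:

$$\delta_{\varphi(x)} = \frac{(2\pi)^3}{T}\int_{\underline p} e^{-i\underline p x}\,\delta_{\hat\varphi(\underline p)}\,,\qquad \delta_{\hat\varphi(\underline p)} = \frac{T}{(2\pi)^3}\int_0^\beta dx_4\int_{\mathbb R^3} d^3x\;e^{i\underline p x}\,\delta_{\varphi(x)}\,. \tag{2.89}$$

Furthermore, the regularized covariance (2.27) is restricted to momenta (2.86),

$$C^{\Lambda,\Lambda_0}(\underline p) = \frac{1}{\underline p^2+m^2}\Big(e^{-\frac{\underline p^2+m^2}{\Lambda_0^2}} - e^{-\frac{\underline p^2+m^2}{\Lambda^2}}\Big)\,. \tag{2.90}$$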
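The structure of the regularized covariance (2.90) can be checked numerically. The Python sketch below is a minimal illustration (function names and parameter values are assumptions, and Λ > 0 is required): it implements (2.90) as a function of p², compares a central finite difference in Λ with the closed form ∂_Λ C = −(2/Λ³) exp(−(p² + m²)/Λ²) obtained by differentiating (2.90), and thereby shows that the Λ-derivative does not depend on Λ0, which is the content of (2.53).

```python
import math


def covariance(p_sq, lam, lam0, m):
    """Regularized covariance (2.90) as a function of p^2, for 0 < lam <= lam0."""
    a = p_sq + m * m
    return (math.exp(-a / lam0 ** 2) - math.exp(-a / lam ** 2)) / a


def d_lambda_covariance(p_sq, lam, m):
    """Closed form of the Lambda-derivative of (2.90); Lambda_0 drops out."""
    a = p_sq + m * m
    return -2.0 / lam ** 3 * math.exp(-a / lam ** 2)


def finite_difference(p_sq, lam, lam0, m, h=1e-5):
    """Central difference approximation of the Lambda-derivative."""
    return (covariance(p_sq, lam + h, lam0, m)
            - covariance(p_sq, lam - h, lam0, m)) / (2.0 * h)


if __name__ == "__main__":
    for lam0 in (5.0, 50.0):
        print(finite_difference(2.0, 1.5, lam0, 1.0), d_lambda_covariance(2.0, 1.5, 1.0))
```

The covariance vanishes at Λ = Λ0 and is positive for Λ < Λ0, in line with its role as a propagator with both cutoffs in place.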
Denoting by $L^{\Lambda,\Lambda_0}(\varphi;T)$ the generating functional of the amputated truncated Schwinger functions at finite temperature T, we define the n-point functions, n ∈ N, similar to (2.25) as (footnote f)

$$\Big(\frac{(2\pi)^3}{T}\Big)^{n-1}\,\delta_{\hat\varphi(\underline p_1)}\cdots\delta_{\hat\varphi(\underline p_n)}\,L^{\Lambda,\Lambda_0}(\varphi;T)\Big|_{\varphi\equiv0} = \delta(\vec p_1+\cdots+\vec p_n)\;\delta_{0,(\underline p_1+\cdots+\underline p_n)_4}\;L_n^{\Lambda,\Lambda_0}(\underline p_1,\dots,\underline p_n;T)\,. \tag{2.91}$$

These n-point functions, after a respective loop expansion in complete analogy to (2.29), then satisfy a system of flow equations obtained from (2.31) by replacing every momentum vector appearing by its underlined analogue, and moreover, restricting the momentum derivatives $\partial^w$ to spatial momentum components. Employing this system of flow equations, we could prove renormalizability of the theory at finite temperature proceeding similarly as in the case of zero temperature. However, because of the reduced space-time symmetry, the renormalization conditions (2.34)–(2.37) for l ≥ 1 would have to be extended by an additional constant:

$$L_{l,2}^{0,\Lambda_0}(\underline p,-\underline p;T) = a_l^R(T) + z_l^{R,1}(T)\,\vec p^{\,2} + z_l^{R,2}(T)\,\underline p_4^{\,2} + O(\underline p^4)\,, \tag{2.92}$$

$$L_{l,4}^{0,\Lambda_0}(0,0,0,0;T) = c_l^R(T)\,. \tag{2.93}$$

The constants for n = 1, 3 are set equal to zero in the symmetric theory (2.85). Only at T = 0, the emerging O(4)-symmetry implies the equality $z_l^{R,1}(0) = z_l^{R,2}(0)$. Our aim is to prove renormalizability in a temperature independent way, i.e. with counterterms that do not depend on the temperature. In this case, the renormalization constants $a_l^R(T)$, $z_l^{R,1}(T)$, $z_l^{R,2}(T)$, $c_l^R(T)$ cannot be prescribed arbitrarily, since they are related dynamically to the three renormalization constants at T = 0. Therefore, we follow a different course and study the respective difference of an n-point function at T > 0 and at T = 0, n ∈ N, with momenta $\{\underline p\} \equiv (\underline p_1,\dots,\underline p_n)$ of the form (2.86):

$$D_{l,n}^{\Lambda,\Lambda_0}(\{\underline p\}) := L_{l,n}^{\Lambda,\Lambda_0}(\{\underline p\};T) - L_{l,n}^{\Lambda,\Lambda_0}(\{\underline p\})\,. \tag{2.94}$$
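These functions are well-defined. From the system of flow equations (2.31) and from its analogue at finite temperature follows the system of flow equations satisfied by the difference functions (2.94), with l, n ∈ N:

$$\begin{aligned}
\partial_\Lambda D_{l,n}^{\Lambda,\Lambda_0}(\{\underline p\}) ={}& \frac12\int_{\underline k}\partial_\Lambda C^{\Lambda,\Lambda_0}(\underline k)\cdot D_{l-1,n+2}^{\Lambda,\Lambda_0}(\underline k,-\underline k,\{\underline p\}) \\
&+ \frac12\bigg(\int_{\underline k}\partial_\Lambda C^{\Lambda,\Lambda_0}(\underline k)\cdot L_{l-1,n+2}^{\Lambda,\Lambda_0}(\underline k,-\underline k,\{\underline p\}) - \int_{k}\partial_\Lambda C^{\Lambda,\Lambda_0}(k)\cdot L_{l-1,n+2}^{\Lambda,\Lambda_0}(k,-k,\{\underline p\})\bigg) \\
&- \frac12\,{\sum_{n_1,n_2}}'\,{\sum_{l_1,l_2}}'\Big[L_{l_1,n_1+1}^{\Lambda,\Lambda_0}(\underline p_1,\dots,\underline p_{n_1},\underline p;T)\,\partial_\Lambda C^{\Lambda,\Lambda_0}(\underline p)\cdot D_{l_2,n_2+1}^{\Lambda,\Lambda_0}(-\underline p,\underline p_{n_1+1},\dots,\underline p_n)\Big]_{\rm rsym} \\
&- \frac12\,{\sum_{n_1,n_2}}'\,{\sum_{l_1,l_2}}'\Big[D_{l_1,n_1+1}^{\Lambda,\Lambda_0}(\underline p_1,\dots,\underline p_{n_1},\underline p)\,\partial_\Lambda C^{\Lambda,\Lambda_0}(\underline p)\cdot L_{l_2,n_2+1}^{\Lambda,\Lambda_0}(-\underline p,\underline p_{n_1+1},\dots,\underline p_n)\Big]_{\rm rsym}\,,
\end{aligned} \tag{2.95}$$

$$\underline p = -\underline p_1-\cdots-\underline p_{n_1} = \underline p_{n_1+1}+\cdots+\underline p_n\,.$$

Footnote f: In the symmetric theory the n-point functions with odd n vanish.

In the (tree) order l = 0 we infer directly

$$D_{0,n}^{\Lambda,\Lambda_0}(\underline p_1,\dots,\underline p_n) = 0\,,\qquad n\in\mathbb N\,, \tag{2.96}$$

since on the set of momenta considered the n-point function at T = 0 is equal to the n-point function at T > 0 in this order. The assertion of a temperature independent renormalization now requires the bare difference functions to vanish for l ≥ 1:

$$D_{l,n}^{\Lambda_0,\Lambda_0}(\underline p_1,\dots,\underline p_n) = 0\,,\qquad l,n\in\mathbb N\,. \tag{2.97}$$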
Given the bounds (2.41) and (2.49) satisfied by the n-point functions at zero temperature, then follows the

Theorem. For l, n ∈ N and for 0 ≤ Λ ≤ Λ0 holds,

$$\big|D_{l,n}^{\Lambda,\Lambda_0}(\underline p_1,\dots,\underline p_n)\big| \le (\Lambda+m)^{-s-n}\,P_1\Big(\log\frac{\Lambda+m}{m}\Big)\,P_2\Big(\Big\{\frac{|\underline p_i|}{\Lambda+m}\Big\}\Big)\,, \tag{2.98}$$

$$\big|\partial_{\Lambda_0}D_{l,n}^{\Lambda,\Lambda_0}(\underline p_1,\dots,\underline p_n)\big| \le \frac{(\Lambda+m)^{-s-n}}{(\Lambda_0)^2}\,P_3\Big(\log\frac{\Lambda_0}{m}\Big)\,P_4\Big(\Big\{\frac{|\underline p_i|}{\Lambda+m}\Big\}\Big)\,. \tag{2.99}$$

The polynomials P have positive coefficients, which depend on l, n, s, m and (smoothly) on T, but not on $\{\underline p\}$, Λ, Λ0. The positive integer s may be chosen arbitrarily.

The n-point functions at finite temperature T, $L_{l,n}^{\Lambda,\Lambda_0}(\underline p_1,\dots,\underline p_n;T)$, when renormalized with the same counterterms as the zero temperature functions, (2.97), satisfy the bounds (2.41) and (2.49) restricted to the case w = 0 and to momenta (2.86). The coefficients in the polynomials P may now depend also (smoothly) on T.

For the proof we refer to [43], pp. 396–399, and just indicate that the system of flow equations (2.95) is integrated inductively from the initial point Λ = Λ0 downwards, observing (2.97). The difference of the two terms not involving any function $D_{l',n'}^{\Lambda,\Lambda_0}$, which appears in (2.95), however, is not accessible by induction. It is bounded separately, matching the sharp bound on Λ asserted, by use of the Euler–MacLaurin formula, see e.g. [73].
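Why the difference between a frequency sum and the corresponding integral can decay faster than any power of Λ may be seen in a toy example. The Python sketch below is an illustration under stated assumptions, not the Euler–MacLaurin argument of [43]: it takes the pure Gaussian exp(−p₄²/Λ²), which is the p₄-dependence of ∂_Λ C in (2.90) up to a prefactor, and compares T Σ_n with ∫ dp₄/(2π). For this integrand Poisson summation gives the difference in closed form, (Λ/√π) Σ_{k≥1} exp(−k²Λ²/(4T²)). All function names and truncation parameters are hypothetical.

```python
import math


def frequency_sum(lam, T, n_max=200):
    """T * sum_{|n| <= n_max} exp(-(2*pi*n*T)^2 / lam^2)."""
    return T * sum(math.exp(-(2.0 * math.pi * n * T / lam) ** 2)
                   for n in range(-n_max, n_max + 1))


def frequency_integral(lam):
    """Integral of exp(-p4^2 / lam^2) dp4 / (2*pi) = lam / (2*sqrt(pi))."""
    return lam / (2.0 * math.sqrt(math.pi))


def poisson_difference(lam, T, k_max=50):
    """Sum minus integral from Poisson summation (Gaussian integrand only)."""
    return lam / math.sqrt(math.pi) * sum(
        math.exp(-(k * lam / (2.0 * T)) ** 2) for k in range(1, k_max + 1))


if __name__ == "__main__":
    T = 1.0
    for lam in (2.0, 5.0, 10.0):
        direct = frequency_sum(lam, T) - frequency_integral(lam)
        print(lam, direct, poisson_difference(lam, T))
```

At fixed T the difference falls off like exp(−Λ²/(4T²)) for large Λ, which is compatible with a bound of the form (Λ + m)^{−s−n} for arbitrary positive integer s.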
Due to the theorem, the n-point functions of the theory at T > 0, renormalized at zero temperature, satisfy the bound (2.49) for w = 0 and momenta (2.86). Hence, they have finite limits

$$\lim_{\Lambda_0\to\infty} L_{l,n}^{0,\Lambda_0}(\underline p_1,\dots,\underline p_n;T)\,,\qquad l,n\in\mathbb N\,,$$

upon removing the UV-cutoff Λ0.

As already indicated before, a finite theory at given temperature $T_0 > 0$ could also be generated imposing renormalization conditions at this temperature. The price to be paid (in the symmetric theory considered) are in each loop order the four constants $a_l^R(T_0)$, $z_l^{R,1}(T_0)$, $z_l^{R,2}(T_0)$, $c_l^R(T_0)$, (2.92) and (2.93), instead of the three constants $a_l^R$, $z_l^R$, $c_l^R$ at zero temperature. However, an arbitrary choice of $z_l^{R,1}(T_0)$, $z_l^{R,2}(T_0)$ would not correspond to a theory at zero temperature, which shows the O(4)-symmetry of Euclidean space-time. Starting from an O(4)-invariant theory at zero temperature, the functional

$$L^{\Lambda,\Lambda_0}(\varphi;T) - L^{\Lambda,\Lambda_0}(\varphi) \tag{2.100}$$

with initial condition (2.97) has been proven to satisfy the bound (2.99). Hence, the function $D_{l,2}^{0,\Lambda_0}(\underline p,-\underline p)$, converging for all l with Λ0 → ∞ to a finite limit, produces a dynamical relation between the renormalization constants $z_l^{R,1}(T_0)$ and $z_l^{R,2}(T_0)$, i.e. fixing one of them determines the other. Thus, a renormalization does not depend on temperature, if this relation is satisfied. It becomes manifest in the equality $z_l^1(\Lambda_0) = z_l^2(\Lambda_0)$ of the corresponding bare parameters.

Concluding we remark that the proof can be easily extended to the nonsymmetric Φ⁴-theory. In this case, the n-point functions with n odd no longer vanish, since the Z2-symmetry is now lacking. Hence, the bare interaction will be of the general form (2.11). Correspondingly, the theory at zero temperature is renormalized by the conditions (2.34)–(2.37), involving five renormalization constants. Proceeding inductively as before, considering now odd and even values of n, establishes the theorem for the nonsymmetric theory, too.

2.6. Elementary estimates

Here, rather obvious estimates on some elementary integrals are listed, which we used repeatedly in generating inductive bounds on Schwinger functions.

(a1) In the irrelevant cases, the integrals have the form

$$\int_a^b dx\;x^{-r-1}(\log x)^s\,,\qquad \text{with } 1\le a\le b \text{ and } r\in\mathbb N,\ s\in\mathbb N_0\,.$$
Defining correspondingly the function

$$f_{r,s}(x) := \frac1r\,x^{-r}\Big((\log x)^s + \frac{s}{r}(\log x)^{s-1} + \frac{s(s-1)}{r^2}(\log x)^{s-2} + \cdots + \frac{1\cdot2\cdots s}{r^s}\Big)$$

we observe $f'_{r,s}(x) = -x^{-r-1}(\log x)^s < 0$ and $f_{r,s}(x) > 0$ for x > 1, hence

$$\int_a^b dx\;x^{-r-1}(\log x)^s = f_{r,s}(a) - f_{r,s}(b) < f_{r,s}(a)\,.$$

(a2) The integrals to be bounded in the relevant cases have the form

$$\int_1^b dx\;x^{r-1}(\log x)^s\,,\qquad \text{with } 1\le b \text{ and } r,s\in\mathbb N_0\,.$$

If r = 0, we just integrate. For r > 0, defining

$$g_{r,s}(x) := \frac1r\,x^{r}\Big((\log x)^s - \frac{s}{r}(\log x)^{s-1} + \frac{s(s-1)}{r^2}(\log x)^{s-2} - \cdots + (-)^s\,\frac{1\cdot2\cdots s}{r^s}\Big)\,,$$

we notice $g'_{r,s}(x) = x^{r-1}(\log x)^s$ and hence

$$\int_1^b dx\;x^{r-1}(\log x)^s = g_{r,s}(b) - g_{r,s}(1) \le \frac{1\cdot2\cdots s}{r^s} + |g_{r,s}(b)|\,.$$
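The two antiderivatives of this section can be checked numerically. The Python sketch below is an illustration only: the function names, the parameter values and the composite Simpson rule used for the quadrature are choices made for the example. It implements f_{r,s} and g_{r,s} as written above and compares f_{r,s}(a) − f_{r,s}(b) and g_{r,s}(b) − g_{r,s}(1) with a direct numerical integration.

```python
import math


def f_rs(x, r, s):
    """f_{r,s}(x) = (1/r) x^{-r} sum_{j=0}^{s} [s!/(s-j)!] r^{-j} (log x)^{s-j}."""
    lx = math.log(x)
    total = sum(math.factorial(s) // math.factorial(s - j) * lx ** (s - j) / r ** j
                for j in range(s + 1))
    return x ** (-r) * total / r


def g_rs(x, r, s):
    """g_{r,s}(x): same sum with alternating signs and x^{+r}, for r > 0."""
    lx = math.log(x)
    total = sum((-1) ** j * (math.factorial(s) // math.factorial(s - j))
                * lx ** (s - j) / r ** j
                for j in range(s + 1))
    return x ** r * total / r


def simpson(func, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    acc = func(a) + func(b)
    for i in range(1, n):
        acc += (4 if i % 2 else 2) * func(a + i * h)
    return acc * h / 3.0


if __name__ == "__main__":
    r, s = 2, 3
    irrelevant = simpson(lambda x: x ** (-r - 1) * math.log(x) ** s, 1.5, 7.0)
    print(irrelevant, f_rs(1.5, r, s) - f_rs(7.0, r, s))
    relevant = simpson(lambda x: x ** (r - 1) * math.log(x) ** s, 1.0, 5.0)
    print(relevant, g_rs(5.0, r, s) - g_rs(1.0, r, s))
```

In the irrelevant case the integral is bounded by its value at the lower limit, uniformly in the upper limit b; in the relevant case the bound grows with b like b^r times a polynomial in log b. This is the mechanism behind the factors (Λ + m)^{4−n−|w|} P(log) in Propositions 2.1 and 2.2.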
3. The Quantum Action Principle

The Green functions of a relativistic quantum field theory depend on the adjustable parameters of this theory and are in general related according to the inherent symmetries of the theory. Clearly, all types of Green functions, whether truncated, amputated, or one-particle-irreducible, show these properties. The quantum action principle deals with the variation of Green functions caused by diverse operations performed:

(i) applying the differential operator appearing in the (classical) field equation,
(ii) (nonlinear) variations of the fields,
(iii) variation of an adjustable parameter of the theory.

The quantum action principle relates each of these different operations on Green functions to the insertion of a corresponding composite field into the Green functions: as a local operator in the first two cases, whereas integrated over space-time in the third. Moreover, in general the local operation has a precursor within classical field theory (e.g. the field equation, the Noether theorem). Then the local composite field to be inserted in the case of a quantum field theory is a sum formed of its classical precursor and of assigned local counterterms, whose canonical mass dimensions are equal to or smaller than the canonical mass dimension ascribed to the term of classical descent. The quantum action principle has been established first by Lam [74, 75] and Lowenstein [76] using the BPHZ-formulation of perturbation theory. This principle is extensively used in the method of algebraic renormalization [77].
Our aim is to demonstrate the parts (i) and (iii) of the quantum action principle by means of flow equations in the case of the scalar field theory. The particularly interesting part (ii) is deferred to a later section, where nonlinear BRST-transformations have to be implemented in showing the renormalizability of a non-Abelian gauge theory.

3.1. Field equation

We consider again the quantum field theory of a real scalar field on four-dimensional Euclidean space-time, which has been treated in the preceding sections. To derive a field equation, we act on the generating functional of its regularized Schwinger functions (2.12) as follows:

$$\hbar\int dy\,(C^{\Lambda,\Lambda_0})^{-1}(x-y)\,\frac{\delta}{\delta J(y)}\,Z^{\Lambda,\Lambda_0}(J) = \int dy\,(C^{\Lambda,\Lambda_0})^{-1}(x-y)\int d\mu_{\Lambda,\Lambda_0}(\phi)\,\phi(y)\,e^{-\frac1\hbar L^{\Lambda_0,\Lambda_0}(\phi)+\frac1\hbar\langle\phi,J\rangle}\,.$$

In presence of the regularization the inverse of the regularized covariance (2.8) replaces the differential operator $-\Delta + m^2$. Integration by parts (2.5) on the r.h.s. and recalling that the covariance of the Gaussian measure $d\mu_{\Lambda,\Lambda_0}(\phi)$ is $\hbar C^{\Lambda,\Lambda_0}$ yields the field equation of the regularized generating functional (2.12),

$$\Big(J(x) - \hbar\int dy\,(C^{\Lambda,\Lambda_0})^{-1}(x-y)\,\frac{\delta}{\delta J(y)}\Big)Z^{\Lambda,\Lambda_0}(J) = \int d\mu_{\Lambda,\Lambda_0}(\phi)\,Q(x)\,e^{-\frac1\hbar L^{\Lambda_0,\Lambda_0}(\phi)+\frac1\hbar\langle\phi,J\rangle}\,. \tag{3.1}$$

On the r.h.s. the inserted composite field Q(x) is given by

$$Q(x) = \frac{\delta}{\delta\phi(x)}\,L^{\Lambda_0,\Lambda_0}(\phi)\,. \tag{3.2}$$

If we employ the generating functional of regularized Schwinger functions with insertions (2.61) and (2.62) we can rewrite the field equation (3.1) in the form

$$\Big(J(x) - \hbar\int dy\,(C^{\Lambda,\Lambda_0})^{-1}(x-y)\,\frac{\delta}{\delta J(y)}\Big)Z^{\Lambda,\Lambda_0}(J) = -\hbar\,\frac{\delta}{\delta\varrho(x)}\,\tilde Z^{\Lambda,\Lambda_0}(\varrho;J)\Big|_{\varrho(x)=0}\,. \tag{3.3}$$

Taking into account the relations (2.18) and (2.65) on the l.h.s. and on the r.h.s. of this equation, respectively, provides the field equation for the generating functional of the amputated truncated Schwinger functions,

$$\frac{\delta}{\delta\varphi(x)}\,L^{\Lambda,\Lambda_0}(\varphi) = L_{(1)}^{\Lambda,\Lambda_0}(x;\varphi) + I_{(1)}^{\Lambda,\Lambda_0}(x)\,. \tag{3.4}$$
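Hence, in momentum space we have

$$(2\pi)^4\, \frac{\delta}{\delta\hat\varphi(q)}\, L^{\Lambda,\Lambda_0}(\varphi) = \hat L^{\Lambda,\Lambda_0}_{(1)}(q;\varphi) + \hat I^{\Lambda,\Lambda_0}_{(1)}(q) , \tag{3.5}$$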
using the conventions (2.24) and (2.69). Our goal is to show within perturbation theory, i.e. in a formal loop expansion, that the field equation (3.5) remains valid taking the limit $\Lambda = 0$, $\Lambda_0 \to \infty$. To this end we proceed as follows:

(α) In Sec. 2.3 it has been shown that the generating functional $L^{\Lambda,\Lambda_0}(\varphi)$ of the theory considered (perturbatively) converges to a finite limit with $\Lambda_0 \to \infty$. The limit theory is determined by the choice of renormalization conditions (2.34)–(2.37) at the renormalization point (chosen at vanishing momenta).

(β) The generating functional $\hat L^{\Lambda,\Lambda_0}_{(1)}(q;\varphi)$ with insertion of one composite field of mass dimension $D$ and momentum $q$ has been shown in Sec. 2.4 to have a finite limit with $\Lambda_0 \to \infty$, too, provided the counterterms of the insertion (2.58) are introduced at first as indeterminate functions of $\Lambda_0$ and then determined by a choice of renormalization conditions at the renormalization point. In the case entering the field equation, however, the dependence on $\Lambda_0$ of the insertion (3.2) is already given by (2.11), the counterterms of the theory without insertion. In order to maintain in an intermediate stage the freedom in choosing renormalization conditions, we use instead of (3.2) indeterminate counterterms to begin with:

$$Q(x) = \frac{f}{2!}\,\phi^2(x) + \frac{g}{3!}\,\phi^3(x) + v_1(\Lambda_0) + a_1(\Lambda_0)\,\phi(x) - z_1(\Lambda_0)\,\Delta\phi(x) + \frac{1}{2!}\, b_1(\Lambda_0)\,\phi^2(x) + \frac{1}{3!}\, c_1(\Lambda_0)\,\phi^3(x) . \tag{3.6}$$
Then, as has been demonstrated in Sec. 2.4, any choice of admissible renormalization conditions leads to a finite limit of the generating functional $\hat L^{\Lambda,\Lambda_0}_{(1)}(q;\varphi)$ in sending $\Lambda_0 \to \infty$, and thereby a related dependence of the coefficients $v_1(\Lambda_0), \dots, c_1(\Lambda_0)$ on $\Lambda_0$ arises.

(γ) We define the functional

$$\hat D^{\Lambda,\Lambda_0}(q;\varphi) := \hat L^{\Lambda,\Lambda_0}_{(1)}(q;\varphi) + \hat I^{\Lambda,\Lambda_0}_{(1)}(q) - (2\pi)^4\, \frac{\delta}{\delta\hat\varphi(q)}\, L^{\Lambda,\Lambda_0}(\varphi) . \tag{3.7}$$
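If this functional can be forced to vanish at $\Lambda = 0$ and for all $\Lambda_0$ above some fixed value, by an appropriate fixed choice of renormalization conditions in (β), then (3.5) converges to a finite renormalized field equation for ($\Lambda = 0$, $\Lambda_0 \to \infty$). The functional (3.7) obeys the linear flow equation

$$\frac{d}{d\Lambda}\, \hat D^{\Lambda,\Lambda_0}(q;\varphi) = \frac{\hbar}{2} \left\langle \frac{\delta}{\delta\varphi},\, \dot C^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi} \right\rangle \hat D^{\Lambda,\Lambda_0}(q;\varphi) - \left\langle \frac{\delta}{\delta\varphi}\, L^{\Lambda,\Lambda_0}(\varphi),\, \dot C^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi}\, \hat D^{\Lambda,\Lambda_0}(q;\varphi) \right\rangle , \tag{3.8}$$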
which follows directly from the flow equations (2.21) and (2.67), performing a functional derivation of the first and Fourier transforming the latter. To make use of it we decompose

$$(2\pi)^{-4}\, \hat D^{\Lambda,\Lambda_0}(q;0) = \delta(q)\, D_0^{\Lambda,\Lambda_0} ,$$

$$(2\pi)^{4(n-1)}\, \delta_{\hat\varphi(p_n)} \cdots \delta_{\hat\varphi(p_1)}\, \hat D^{\Lambda,\Lambda_0}(q;\varphi)\Big|_{\varphi=0} = \delta(q + p_1 + \cdots + p_n)\, D_n^{\Lambda,\Lambda_0}(q; p_1, \dots, p_n) . \tag{3.9}$$
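From (2.25), (2.69) and (2.70) then results

$$D_0^{\Lambda,\Lambda_0} = i^{\Lambda,\Lambda_0} - L_1^{\Lambda,\Lambda_0}(0) , \tag{3.10}$$

$$D_n^{\Lambda,\Lambda_0}(q; p_1, \dots, p_n) = L^{\Lambda,\Lambda_0}_{(1),n}(q; p_1, \dots, p_n) - L^{\Lambda,\Lambda_0}_{n+1}(q, p_1, \dots, p_n) . \tag{3.11}$$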
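We notice that the flow equations (3.8) and (2.67) have the same form. Thus, after a loop expansion, the strict analogue of the system of flow equations (2.73) and (2.74) is obtained, $l \in \mathbb{N}_0$, $n \in \mathbb{N}$:

$$\partial_\Lambda D^{\Lambda,\Lambda_0}_{l,0} = \frac{1}{2} \int_k \partial_\Lambda C^{\Lambda,\Lambda_0}(k) \cdot D^{\Lambda,\Lambda_0}_{l-1,2}(0; k, -k) - {\sum_{l_1,l_2}}' L^{\Lambda,\Lambda_0}_{l_1,1}(0)\, \partial_\Lambda C^{\Lambda,\Lambda_0}(0) \cdot D^{\Lambda,\Lambda_0}_{l_2,1}(0;0) , \tag{3.12}$$

$$\partial_\Lambda \partial^w D^{\Lambda,\Lambda_0}_{l,n}(q; p_1, \dots, p_n) = \frac{1}{2} \int_k \partial_\Lambda C^{\Lambda,\Lambda_0}(k) \cdot \partial^w D^{\Lambda,\Lambda_0}_{l-1,n+2}(q; k, -k, p_1, \dots, p_n) - {\sum_{n_1,n_2}}' {\sum_{l_1,l_2}}' {\sum_{w_1,w_2,w_3}}' c_{\{w_i\}} \Big[ \partial^{w_1} L^{\Lambda,\Lambda_0}_{l_1,n_1+1}(p_1, \dots, p_{n_1}, p) \cdot \partial^{w_3} \partial_\Lambda C^{\Lambda,\Lambda_0}(p) \cdot \partial^{w_2} D^{\Lambda,\Lambda_0}_{l_2,n_2+1}(q; -p, p_{n_1+1}, \dots, p_n) \Big]_{\mathrm{rsym}} , \tag{3.13}$$

$$p = -p_1 - \cdots - p_{n_1} = q + p_{n_1+1} + \cdots + p_n .$$

We first treat the tree order. From (2.11), (2.76) follows $\hat D^{\Lambda_0,\Lambda_0}(q;\varphi)|_{l=0} = 0$. Integrating the flow equations with $l = 0$ and general momenta from the initial point $\Lambda = \Lambda_0$ downwards to smaller values of $\Lambda$, ascending successively in $n$, we find due to the properties (2.38) for $n \in \mathbb{N}$, $0 \le \Lambda \le \Lambda_0$,

$$D^{\Lambda,\Lambda_0}_{0,0} = 0 , \qquad D^{\Lambda,\Lambda_0}_{0,n}(q; p_1, \dots, p_n) = 0 . \tag{3.14}$$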
The extension to all loop orders $l$ is achieved by the

Proposition 3.1. For all $l \in \mathbb{N}$ and $n + |w| \le 3$ let

$$D^{0,\Lambda_0}_{l,0} = 0 , \qquad \partial^w D^{0,\Lambda_0}_{l,n}(0; 0, \dots, 0) = 0 , \tag{3.15}$$
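then for $l \in \mathbb{N}_0$, $n \in \mathbb{N}$, $|w| \le 3$, and $0 \le \Lambda \le \Lambda_0$:

$$D^{\Lambda,\Lambda_0}_{l,0} = 0 , \qquad \partial^w D^{\Lambda,\Lambda_0}_{l,n}(q; p_1, \dots, p_n) = 0 . \tag{3.16}$$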
Proof. In the order $l = 0$ the assertion is already established because of (3.14). We now assume (3.16) to hold for all orders smaller than a fixed order $l$. As a consequence, on the respective r.h.s. of the flow equations (3.12) and (3.13) the first term vanishes and in the second term only the pair ($l_1 = 0$, $l_2 = l$) has to be taken into account. Looking first at the vacuum part, we observe that the r.h.s. of (3.12) vanishes due to (2.38). Thus, integration from the initial point $\Lambda = 0$ yields $D^{\Lambda,\Lambda_0}_{l,0} = 0$. To demonstrate the assertion for general $n$ we proceed inductively: ascending in $n$, and for fixed $n$ descending with $w$ from $|w| = 3$. Thus, for each $n$ the irrelevant cases $n + |w| > 3$ always precede the relevant ones, $n + |w| \le 3$, if present at all. Since

$$\partial^w D^{\Lambda_0,\Lambda_0}_{l,n}(q; p_1, \dots, p_n) = 0 , \qquad n + |w| > 3 , \tag{3.17}$$
the flow equations (3.13) of these cases are integrated from the initial point $\Lambda = \Lambda_0$ downwards. On the other hand, the respective flow equation of the cases $n + |w| \le 3$ is first integrated at zero momentum from the initial point $\Lambda = 0$ with vanishing initial condition (3.15) and in a now familiar second step the result is extended to general momenta via the Taylor formula (2.47). Following the inductive order stated one notices that for each pair $(n, w)$ occurring, the r.h.s. of (3.13) vanishes due to the key properties (2.38) and preceding instances of (3.16). Hence, (3.16) also holds for all $n$ in the order $l$ and the proposition is proven.

From (α), (β) we know that, letting $\Lambda_0 \to \infty$, each term on the r.h.s. of Eq. (3.7) converges to a finite limit. Hence, if the renormalization conditions chosen for $\hat L^{\Lambda,\Lambda_0}_{(1)}(q;\varphi)$ are inferred from those of $L^{\Lambda,\Lambda_0}(\varphi)$ to satisfy (3.15), the l.h.s. of (3.7) vanishes for $0 \le \Lambda \le \Lambda_0$. Thus, the field equation (3.5) remains valid after removing the cutoffs $\Lambda$, $\Lambda_0$, written suggestively as

$$(2\pi)^4\, \frac{\delta}{\delta\hat\varphi(q)}\, L^{0,\infty}(\varphi) = \hat L^{0,\infty}_{(1)}(q;\varphi) + \hat I^{0,\infty}_{(1)}(q) ; \tag{3.18}$$
in the realm of a formal loop expansion, of course. Considering the relations (3.16) at $\Lambda = \Lambda_0$ reveals that the counterterms entering the insertion (3.6) have to be chosen identical to those of the bare interaction (2.11), $l \in \mathbb{N}$:

$$v_{1,l}(\Lambda_0) = v_l(\Lambda_0) , \; \dots , \; c_{1,l}(\Lambda_0) = c_l(\Lambda_0) . \tag{3.19}$$
3.2. Variation of a coupling constant The renormalized amputated truncated Schwinger functions (2.33) depend on the coupling constants f and g, which can be freely chosen in the bare interaction (2.11). Our aim is to find a representation for the derivative of these Schwinger
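functions with respect to $f$ or $g$. To this end we start from the defining Eq. (2.16) of the regularized generating functional. Denoting by $\kappa$ either $f$ or $g$, and defining

$$W_\kappa(\phi) := \frac{\partial}{\partial\kappa}\, L^{\Lambda_0,\Lambda_0}(\phi) =: \int_\Omega dx\, Q_\kappa(x) , \tag{3.20}$$

where the integrand $Q_\kappa(x)$ is a composite field and $W_\kappa(\phi)$ the space-time integral of it, we obtain by differentiating (2.16):

$$\partial_\kappa \big( L^{\Lambda,\Lambda_0}(\varphi) + I^{\Lambda,\Lambda_0} \big) \cdot e^{-\frac{1}{\hbar} ( L^{\Lambda,\Lambda_0}(\varphi) + I^{\Lambda,\Lambda_0} )} = \int d\mu_{\Lambda,\Lambda_0}(\phi)\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\phi+\varphi)}\, W_\kappa(\phi + \varphi) . \tag{3.21}$$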
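The structure of (3.21) can be illustrated in a zero-dimensional toy model in which the functional integral reduces to an ordinary Gaussian integral and the coupling derivative is taken by a finite difference. This is a minimal sketch under these assumptions; the quartic toy interaction, the numerical values and the function names are hypothetical and only serve the illustration.

```python
import math

HBAR = 1.0  # loop-counting parameter; illustrative value
COV = 0.5   # positive number standing in for the covariance C^{Lambda,Lambda_0}


def gaussian_average(f, half_width=12.0, steps=6000):
    """Simpson approximation of the Gaussian integral of f with covariance HBAR*COV."""
    sigma = math.sqrt(HBAR * COV)
    a = half_width * sigma
    h = 2.0 * a / steps
    norm = sigma * math.sqrt(2.0 * math.pi)
    total = 0.0
    for i in range(steps + 1):
        phi = -a + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * math.exp(-phi * phi / (2.0 * sigma**2)) / norm * f(phi)
    return total * h / 3.0


def bare_interaction(phi, g):
    """Toy bare interaction g * phi^4 / 4! (an assumption of this sketch)."""
    return g * phi**4 / 24.0


def w_kappa(phi):
    """Derivative of the toy bare interaction with respect to g, cf. (3.20)."""
    return phi**4 / 24.0


def effective_action(varphi, g):
    """L + I = -hbar * log of the Gaussian average of exp(-L0(phi + varphi)/hbar)."""
    avg = gaussian_average(
        lambda phi: math.exp(-bare_interaction(phi + varphi, g) / HBAR))
    return -HBAR * math.log(avg)


def coupling_derivative_sides(varphi, g, dg=1e-4):
    """Return (l.h.s., r.h.s.) of the zero-dimensional analogue of (3.21)."""
    d_eff = (effective_action(varphi, g + dg)
             - effective_action(varphi, g - dg)) / (2.0 * dg)
    lhs = d_eff * math.exp(-effective_action(varphi, g) / HBAR)
    rhs = gaussian_average(
        lambda phi: math.exp(-bare_interaction(phi + varphi, g) / HBAR)
        * w_kappa(phi + varphi))
    return lhs, rhs
```

For example, `coupling_derivative_sides(0.4, 0.8)` should return two values that coincide within the accuracy of the finite difference.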
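On the other hand, the functional derivation of Eq. (2.63) with respect to $\varrho(x)$ at $\varrho(x) = 0$ yields, observing the shift $\phi \to \phi + \varphi$ to be performed in (2.61) and employing the notations (2.66), (2.68):

$$\big( L^{\Lambda,\Lambda_0}_{(1)}(x;\varphi) + I^{\Lambda,\Lambda_0}_{(1)}(x) \big)\, e^{-\frac{1}{\hbar} ( L^{\Lambda,\Lambda_0}(\varphi) + I^{\Lambda,\Lambda_0} )} = \int d\mu_{\Lambda,\Lambda_0}(\phi)\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\phi+\varphi)}\, Q(x)\big|_{\phi \to \phi + \varphi} . \tag{3.22}$$

In writing this equation we have already taken account of the identities

$$L^{\Lambda,\Lambda_0}(0;\varphi) = L^{\Lambda,\Lambda_0}(\varphi) , \qquad I^{\Lambda,\Lambda_0}(0) = I^{\Lambda,\Lambda_0} .$$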
Choosing in (3.22) the particular composite field $Q(x) = Q_\kappa(x)$ introduced in (3.20), and integrating over the finite space-time $\Omega$, implies by comparison with (3.21),

$$\partial_\kappa L^{\Lambda,\Lambda_0}(\varphi) = \int_\Omega dx\, L^{\Lambda,\Lambda_0}_{(1)}(x;\varphi) .$$

We can now pass to the infinite volume limit $\Omega \to \mathbb{R}^4$, $\varphi \in \mathcal{S}(\mathbb{R}^4)$. Hence,

$$\partial_\kappa L^{\Lambda,\Lambda_0}(\varphi) = \hat L^{\Lambda,\Lambda_0}_{(1)}(0;\varphi) , \tag{3.23}$$

with the Fourier transform (2.69) at vanishing momentum. In the sequel, $\hat L^{\Lambda,\Lambda_0}_{(1)}(0;\varphi)$ is always understood as the generating functional with the insertion (3.20). The task posed is to produce a finite limit of the Eq. (3.23) upon removing the cutoffs, i.e. letting $\Lambda = 0$, $\Lambda_0 \to \infty$. A finite limit of $L^{\Lambda,\Lambda_0}(\varphi)$ has been established in Sec. 2.3. Furthermore, the insertion appearing in (3.23) is a particular instance of the insertion of a composite field $Q(x)$ dealt with in Sec. 2.4. The composite field $Q_\kappa(x)$ involved here follows from (3.20) and (2.11) as

$$Q_\kappa(x) = \delta_{\kappa f}\, \frac{1}{3!}\, \phi^3(x) + \delta_{\kappa g}\, \frac{1}{4!}\, \phi^4(x) + v_\kappa(\Lambda_0)\, \phi(x) + \frac{1}{2}\, a_\kappa(\Lambda_0)\, \phi^2(x) + \frac{1}{2}\, z_\kappa(\Lambda_0)\, (\partial_\mu \phi)^2(x) + \frac{1}{3!}\, b_\kappa(\Lambda_0)\, \phi^3(x) + \frac{1}{4!}\, c_\kappa(\Lambda_0)\, \phi^4(x) , \tag{3.24}$$
where $\delta_{\kappa f}$ is the Kronecker symbol: $\delta_{\kappa f} = 1$, if $\kappa = f$, and $\delta_{\kappa f} = 0$, if $\kappa \neq f$. One should note that this composite field in both cases $\kappa = f$ or $\kappa = g$ has the canonical mass dimension $D = 4$, in contrast to its classical part. The coefficients of the counterterms appearing in (3.24) are the coefficients entering the bare interaction (2.11) derived with respect to $\kappa$. However, since in the process of renormalization the counterterms are determined by the renormalization conditions chosen, we at first treat the counterterms in (3.24) as free functions of $\Lambda_0$, which are then determined by the renormalization conditions prescribed in the case of the insertion. We do not assume the renormalization conditions (2.34)–(2.37) of $L^{\Lambda,\Lambda_0}(\varphi)$ to depend on $f$ and $g$; hence, their derivative with respect to $\kappa$ vanishes. Requiring (3.23) to be valid at the renormalization point for all values of $\Lambda_0$ then implies that in all loop orders $l \ge 1$ an $n$-point function of $\hat L^{\Lambda,\Lambda_0}_{(1)}(0;\varphi)$ and its momentum derivatives vanish for $\Lambda = 0$ at zero momenta, if $n + |w| \le 4$. The renormalization conditions fixed, we know from Proposition 2.4 that the functional $\hat L^{\Lambda,\Lambda_0}_{(1)}(0;\varphi)$ has a finite limit $\Lambda = 0$, $\Lambda_0 \to \infty$. To control the renormalization of Eq. (3.23) we define

$$D^{\Lambda,\Lambda_0}(\varphi) := \hat L^{\Lambda,\Lambda_0}_{(1)}(0;\varphi) - \partial_\kappa L^{\Lambda,\Lambda_0}(\varphi) . \tag{3.25}$$
This functional satisfies, as easily seen, a linear flow equation of the form (3.8); hence, after decomposition, a system of flow equations of the form (3.13) results. (Here, no vacuum part appears.) Comparing the bare interaction (2.11) with the insertion (3.24) we observe that $D^{\Lambda_0,\Lambda_0}$ vanishes in the tree order,

$$D^{\Lambda_0,\Lambda_0}\big|_{l=0} = 0 ,$$

and its irrelevant part vanishes for $l > 0$,

$$\partial^w D^{\Lambda_0,\Lambda_0}_{l,n}(p_1, \dots, p_n) = 0 , \qquad n + |w| > 4 .$$

Given these initial conditions we have the

Proposition 3.2. Assume for all $l \in \mathbb{N}$, $n + |w| \le 4$:

$$\partial^w D^{0,\Lambda_0}_{l,n}(0, \dots, 0) = 0 , \tag{3.26}$$

then follows for $l \in \mathbb{N}_0$, $n \in \mathbb{N}$, $|w| \le 4$, and $0 \le \Lambda \le \Lambda_0$:

$$\partial^w D^{\Lambda,\Lambda_0}_{l,n}(p_1, \dots, p_n) = 0 . \tag{3.27}$$
The proof by induction proceeds exactly as the proof of Proposition 3.1 and is omitted. Proposition 3.2 implies that Eq. (3.23) has a finite limit for $\Lambda = 0$, $\Lambda_0 \to \infty$,

$$\partial_\kappa L^{0,\infty}(\varphi) = \hat L^{0,\infty}_{(1)}(0;\varphi) , \tag{3.28}$$

again to be read in terms of a formal loop expansion. Furthermore, from (3.27) at $\Lambda = \Lambda_0$ follows the relation of the counterterms

$$v_{\kappa,l}(\Lambda_0) = \partial_\kappa v_l(\Lambda_0) , \; \dots , \; c_{\kappa,l}(\Lambda_0) = \partial_\kappa c_l(\Lambda_0) . \tag{3.29}$$
3.3. Flow equations for proper vertex functions

In perturbative renormalization based on the analysis of Feynman integrals, (proper) vertex functions form the building blocks. They are represented by one-particle-irreducible (1PI) Feynman diagrams, see e.g. [78]. Although their generating functional has no representation as a functional integral, flow equations for vertex functions can be derived [45, 46]. Our goal in this section is to deduce in the case of the symmetric $\Phi^4$-theory from Wilson's differential flow equation (2.21) for the L-functional the system of flow equations satisfied by the regularized $n$-point vertex functions, $n \in \mathbb{N}$. After that, an inductive proof of renormalizability based on them is outlined. We start from the regularized generating functional $W^{\Lambda,\Lambda_0}(J)$ of the truncated Schwinger functions, (2.14), decomposed as

$$W^{\Lambda,\Lambda_0}(J) = \sum_{n=1}^{\infty} \frac{1}{(2n)!} \int dx_1 \cdots \int dx_{2n}\, W^{\Lambda,\Lambda_0}_{2n}(x_1, \dots, x_{2n})\, J(x_1) \cdots J(x_{2n}) , \tag{3.30}$$
according to (2.15). Due to the symmetry $\phi \to -\phi$ of the theory, all $n$-point functions with $n$ odd vanish identically. Defining the "classical field",

$$\underline{\varphi}(x) := \frac{\delta W^{\Lambda,\Lambda_0}(J)}{\delta J(x)} , \tag{3.31}$$
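we then notice that $\underline{\varphi}(x)|_{J \equiv 0} = 0$, and, moreover, that $\underline{\varphi}$ depends on the flow parameter $\Lambda$ (and on $\Lambda_0$). Since the 2-point function $W_2^{\Lambda,\Lambda_0}$ is different from zero, (3.31) can be inverted iteratively as a formal series in $\underline{\varphi}(x)$ to yield the source $J(x)$ in the form

$$J(x) = \mathcal{J}(\underline{\varphi}(x)) \equiv \sum_{n=0}^{\infty} \frac{1}{(2n+1)!} \int dx_1 \cdots \int dx_{2n+1}\, F_{2n+2}(x, x_1, \dots, x_{2n+1})\, \underline{\varphi}(x_1) \cdots \underline{\varphi}(x_{2n+1}) , \tag{3.32}$$

where

$$\int dy\, F_2(x,y)\, W_2^{\Lambda,\Lambda_0}(y,z) = \delta(x - z) .$$

The generating functional $\Gamma^{\Lambda,\Lambda_0}(\underline{\varphi})$ of the regularized vertex functions results from the Legendre transformation

$$\Gamma^{\Lambda,\Lambda_0}(\underline{\varphi}) := \left[ -W^{\Lambda,\Lambda_0}(J) + \int dy\, J(y)\, \underline{\varphi}(y) \right]_{J = \mathcal{J}(\underline{\varphi})} \tag{3.33}$$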
implying, due to (3.31),

$$\frac{\delta \Gamma^{\Lambda,\Lambda_0}(\underline{\varphi})}{\delta \underline{\varphi}(x)} = J(x) . \tag{3.34}$$
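The iterative inversion of (3.31) and the Legendre transformation (3.33) can be made concrete for a function of a single variable. The sketch below is an illustration under stated assumptions: the generating function is truncated after the quartic term, and the coefficients `W2`, `W4` as well as all function names are hypothetical. It checks that the derivative of the transformed function returns the source, as in (3.34), and that the second derivatives of the two functions are inverse to each other.

```python
W2 = 0.7   # toy 2-point coefficient (assumed value, must be nonzero)
W4 = -0.3  # toy 4-point coefficient (assumed value)


def w_function(j):
    """Truncated even generating function W(J) = W2 J^2/2! + W4 J^4/4!."""
    return W2 * j**2 / 2.0 + W4 * j**4 / 24.0


def classical_field(j):
    """Classical field dW/dJ, the one-variable analogue of (3.31)."""
    return W2 * j + W4 * j**3 / 6.0


def d2w(j):
    """Second derivative of W with respect to J."""
    return W2 + W4 * j**2 / 2.0


def source(phi, iterations=60):
    """Invert the relation phi = dW/dJ iteratively, starting from J = phi / W2."""
    j = phi / W2
    for _ in range(iterations):
        j = (phi - W4 * j**3 / 6.0) / W2
    return j


def series_source(phi):
    """First two terms of the formal series: F2 = 1/W2, F4 = -W4/W2^4."""
    return phi / W2 - W4 * phi**3 / (6.0 * W2**4)


def gamma(phi):
    """Legendre transform -W(J) + J*phi, evaluated at J = source(phi)."""
    j = source(phi)
    return -w_function(j) + j * phi


def d_gamma(phi, h=1e-4):
    """Central finite difference for the first derivative of gamma."""
    return (gamma(phi + h) - gamma(phi - h)) / (2.0 * h)


def d2_gamma(phi, h=1e-3):
    """Central finite difference for the second derivative of gamma."""
    return (gamma(phi + h) - 2.0 * gamma(phi) + gamma(phi - h)) / h**2
```

With these definitions, `d_gamma(0.2)` should reproduce `source(0.2)`, and `d2_gamma(0.2) * d2w(source(0.2))` should be close to one. The fixed-point iteration is only expected to converge for small fields, in keeping with the formal-series character of (3.32).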
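The functional $\Gamma^{\Lambda,\Lambda_0}(\underline{\varphi})$ is even under $\underline{\varphi} \to -\underline{\varphi}$ and vanishes at $\underline{\varphi} = 0$. Finally, performing the functional derivation of (3.34) with respect to $J(y)$ and using again (3.31), provides the crucial functional relation:

$$\int dz\, \frac{\delta^2 W^{\Lambda,\Lambda_0}(J)}{\delta J(y)\, \delta J(z)} \cdot \frac{\delta^2 \Gamma^{\Lambda,\Lambda_0}(\underline{\varphi})}{\delta \underline{\varphi}(z)\, \delta \underline{\varphi}(x)} = \delta(y - x) . \tag{3.35}$$

As an immediate consequence follows

$$\int dz\, W_2^{\Lambda,\Lambda_0}(y,z)\, \Gamma_2^{\Lambda,\Lambda_0}(z,x) = \delta(y - x) ,$$

considering (3.35) at $\underline{\varphi} = 0$, and thus also at $J = 0$. In order to obtain the relation between the functionals $L^{\Lambda,\Lambda_0}(\varphi)$ and $\Gamma^{\Lambda,\Lambda_0}(\underline{\varphi})$, we write (2.19) in the form

$$W^{\Lambda,\Lambda_0}(J) = -L^{\Lambda,\Lambda_0}(\varphi) + \frac{1}{2} \langle J, C^{\Lambda,\Lambda_0} J \rangle , \tag{3.36}$$

$$\varphi(x) = \int dy\, C^{\Lambda,\Lambda_0}(x - y)\, J(y) . \tag{3.37}$$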
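Differentiating (3.36) twice with respect to $J$ as required in (3.35) one obtains after operating on this equation with $(C^{\Lambda,\Lambda_0})^{-1}$,

$$(C^{\Lambda,\Lambda_0})^{-1}(y - x) = -\int dz \int du\, \frac{\delta^2 L^{\Lambda,\Lambda_0}(\varphi)}{\delta\varphi(y)\, \delta\varphi(u)}\, C^{\Lambda,\Lambda_0}(u - z)\, \frac{\delta^2 \Gamma^{\Lambda,\Lambda_0}(\underline{\varphi})}{\delta\underline{\varphi}(z)\, \delta\underline{\varphi}(x)} + \frac{\delta^2 \Gamma^{\Lambda,\Lambda_0}(\underline{\varphi})}{\delta\underline{\varphi}(y)\, \delta\underline{\varphi}(x)} . \tag{3.38}$$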
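From this analogue of (3.35) follow the relations between the respective $n$-point functions of $\Gamma^{\Lambda,\Lambda_0}(\underline{\varphi})$ and $L^{\Lambda,\Lambda_0}(\varphi)$ upon repeated functional derivation with respect to $\underline{\varphi}$, employing the chain rule together with the relation

$$\varphi(x) = \int dy\, C^{\Lambda,\Lambda_0}(x - y)\, \frac{\delta \Gamma^{\Lambda,\Lambda_0}(\underline{\varphi})}{\delta \underline{\varphi}(y)} , \tag{3.39}$$

due to (3.37) and (3.34). With our conventions (2.24), (2.27) for the Fourier transformation, the Eqs. (3.38) and (3.39) appear in momentum space as

$$(2\pi)^{-4}\, \frac{\delta(p + q)}{C^{\Lambda,\Lambda_0}(p)} = -(2\pi)^8 \int_k \frac{\delta^2 L^{\Lambda,\Lambda_0}}{\delta\hat\varphi(p)\, \delta\hat\varphi(k)}\, C^{\Lambda,\Lambda_0}(k)\, \frac{\delta^2 \Gamma^{\Lambda,\Lambda_0}}{\delta\hat{\underline{\varphi}}(-k)\, \delta\hat{\underline{\varphi}}(q)} + \frac{\delta^2 \Gamma^{\Lambda,\Lambda_0}}{\delta\hat{\underline{\varphi}}(p)\, \delta\hat{\underline{\varphi}}(q)} , \tag{3.40}$$

$$\hat\varphi(q) = (2\pi)^4\, C^{\Lambda,\Lambda_0}(q)\, \frac{\delta \Gamma^{\Lambda,\Lambda_0}}{\delta \hat{\underline{\varphi}}(-q)} . \tag{3.41}$$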
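In the tree order, $L^{\Lambda,\Lambda_0}(\varphi)$ contains no 2-point function, (2.38). Hence, setting in (3.40) $\varphi = \underline{\varphi} = 0$ yields

$$\frac{\delta^2 \Gamma^{\Lambda,\Lambda_0}(\underline{\varphi})}{\delta\hat{\underline{\varphi}}(p)\, \delta\hat{\underline{\varphi}}(q)} \bigg|^{l=0}_{\underline{\varphi} \equiv 0} = (2\pi)^{-4}\, \frac{\delta(p + q)}{C^{\Lambda,\Lambda_0}(p)} . \tag{3.42}$$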
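To deduce the flow equation for the vertex functional, we differentiate Eq. (3.33) with respect to the flow parameter $\Lambda$,

$$(\partial_\Lambda \Gamma^{\Lambda,\Lambda_0})(\underline{\varphi}) + \int dy\, \frac{\delta \Gamma^{\Lambda,\Lambda_0}}{\delta \underline{\varphi}(y)}\, \partial_\Lambda \underline{\varphi}(y) = -\partial_\Lambda W^{\Lambda,\Lambda_0}(J) + \int dy\, J(y)\, \partial_\Lambda \underline{\varphi}(y) .$$

Hence, because of (3.34),

$$(\partial_\Lambda \Gamma^{\Lambda,\Lambda_0})(\underline{\varphi}) + \partial_\Lambda W^{\Lambda,\Lambda_0}(J) = 0 . \tag{3.43}$$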
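Substituting $W^{\Lambda,\Lambda_0}$ by $L^{\Lambda,\Lambda_0}$ according to (3.36) then yields

$$(\partial_\Lambda \Gamma^{\Lambda,\Lambda_0})(\underline{\varphi}) - (\partial_\Lambda L^{\Lambda,\Lambda_0})(\varphi) - \int dz \int dy\, \frac{\delta L^{\Lambda,\Lambda_0}}{\delta\varphi(y)}\, \dot C^{\Lambda,\Lambda_0}(y - z)\, J(z) + \frac{1}{2} \langle J, \dot C^{\Lambda,\Lambda_0} J \rangle = 0 , \tag{3.44}$$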
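$\dot C^{\Lambda,\Lambda_0}$ denoting the derivative of the covariance $C^{\Lambda,\Lambda_0}$ with respect to $\Lambda$. There is an alternative way [41] to arrive at Eq. (3.44), starting from (3.33) but treating $\underline{\varphi}$ as $\Lambda$-independent and thus $J$ to depend on $\Lambda$ according to (3.31). Differentiating (3.33) with respect to $\Lambda$ then reads

$$\partial_\Lambda \Gamma^{\Lambda,\Lambda_0}(\underline{\varphi}) = \left[ -(\partial_\Lambda W^{\Lambda,\Lambda_0})(J) - \int dy\, \frac{\delta W^{\Lambda,\Lambda_0}}{\delta J(y)}\, \partial_\Lambda J(y) + \int dy\, \partial_\Lambda J(y)\, \underline{\varphi}(y) \right]_{J = \mathcal{J}(\underline{\varphi})} = -(\partial_\Lambda W^{\Lambda,\Lambda_0})(J)\big|_{J = \mathcal{J}(\underline{\varphi})} . \tag{3.45}$$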
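The substitution of $W^{\Lambda,\Lambda_0}$ by $L^{\Lambda,\Lambda_0}$ due to (3.36) and (3.37) again provides (3.44). Given (3.44) the flow equation (2.21) with its vacuum part subtracted can be taken into account, leading to

$$(\partial_\Lambda \Gamma^{\Lambda,\Lambda_0})(\underline{\varphi}) + \frac{1}{2} \left\langle \frac{\delta L^{\Lambda,\Lambda_0}}{\delta\varphi} - J,\; \dot C^{\Lambda,\Lambda_0} \left( \frac{\delta L^{\Lambda,\Lambda_0}}{\delta\varphi} - J \right) \right\rangle = \frac{\hbar}{2} \left[ \left\langle \frac{\delta}{\delta\varphi},\, \dot C^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi} \right\rangle L^{\Lambda,\Lambda_0}(\varphi) - \left\langle \frac{\delta}{\delta\varphi},\, \dot C^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi} \right\rangle L^{\Lambda,\Lambda_0}(\varphi) \Big|_{\varphi \equiv 0} \right] .$$

In the second term on the l.h.s. we use the relation

$$\int dx\, (C^{\Lambda,\Lambda_0})^{-1}(z - x)\, \underline{\varphi}(x) = -\frac{\delta L^{\Lambda,\Lambda_0}}{\delta\varphi(z)} + J(z) , \tag{3.46}$$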
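resulting from (3.31) together with (3.36), and note $(\partial_\Lambda C)\, C^{-1} + C\, \partial_\Lambda C^{-1} = 0$. Hence, the flow equation for the vertex functional turns out as

$$(\partial_\Lambda \Gamma^{\Lambda,\Lambda_0})(\underline{\varphi}) - \frac{1}{2} \big\langle \underline{\varphi},\, (\partial_\Lambda (C^{\Lambda,\Lambda_0})^{-1})\, \underline{\varphi} \big\rangle = \frac{\hbar}{2} \left\langle \frac{\delta}{\delta\varphi},\, (\partial_\Lambda C^{\Lambda,\Lambda_0}) \frac{\delta}{\delta\varphi} \right\rangle \tilde\Gamma^{\Lambda,\Lambda_0}(\underline{\varphi}) , \tag{3.47}$$

with the r.h.s. defined by

$$\frac{\delta^2 \tilde\Gamma^{\Lambda,\Lambda_0}(\underline{\varphi})}{\delta\varphi(x)\, \delta\varphi(y)} := \frac{\delta^2 L^{\Lambda,\Lambda_0}(\varphi)}{\delta\varphi(x)\, \delta\varphi(y)} \bigg|_{\varphi = C^{\Lambda,\Lambda_0} \mathcal{J}(\underline{\varphi})} - \frac{\delta^2 L^{\Lambda,\Lambda_0}(\varphi)}{\delta\varphi(x)\, \delta\varphi(y)} \bigg|_{\varphi = 0} . \tag{3.48}$$

Looking at this definition of the functional $\tilde\Gamma^{\Lambda,\Lambda_0}(\underline{\varphi})$ we notice that its 2-point function vanishes, and furthermore, that its higher $n$-point functions, $n = 4, 6, 8, \dots$, emerge from the first term on the r.h.s. These latter are recursively determined by the functional Eqs. (3.38) or (3.40) (which could also be obtained via (3.46) and (3.34)), by performing successively two, four, six, ... functional derivations with respect to $\underline{\varphi}$. The r.h.s. of the flow equation (3.47) can also be given another form, expressing (3.48) first in terms of the functional $W^{\Lambda,\Lambda_0}(J)$ by way of (3.36), (3.37) and then using the functional relation (3.35),

$$\frac{\hbar}{2} \left\langle \frac{\delta}{\delta\varphi},\, (\partial_\Lambda C^{\Lambda,\Lambda_0}) \frac{\delta}{\delta\varphi} \right\rangle \tilde\Gamma^{\Lambda,\Lambda_0}(\underline{\varphi}) = \frac{\hbar}{2} \int dx \int dy\, \partial_\Lambda (C^{\Lambda,\Lambda_0})^{-1}(y - x) \left( \frac{\delta^2 W^{\Lambda,\Lambda_0}(J)}{\delta J(x)\, \delta J(y)} - \frac{\delta^2 W^{\Lambda,\Lambda_0}(J)}{\delta J(x)\, \delta J(y)} \bigg|_{J=0} \right)$$

$$= \frac{\hbar}{2} \int dx \int dy\, \partial_\Lambda (C^{\Lambda,\Lambda_0})^{-1}(y - x) \cdot \left( \left( \frac{\delta^2 \Gamma^{\Lambda,\Lambda_0}(\underline{\varphi})}{\delta\underline{\varphi}\, \delta\underline{\varphi}} \right)^{-1}(x,y) - \left( \frac{\delta^2 \Gamma^{\Lambda,\Lambda_0}(\underline{\varphi})}{\delta\underline{\varphi}\, \delta\underline{\varphi}} \right)^{-1}(x,y) \bigg|_{\underline{\varphi} = 0} \right) . \tag{3.49}$$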
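This form is (also) met in the literature [54, 59–62], where the flow equation (3.47) is called "exact renormalization group". Similar to (2.25) regarding the functional $L^{\Lambda,\Lambda_0}(\varphi)$ we define the $n$-point functions, $n \in 2\mathbb{N}$, of the functional $\Gamma^{\Lambda,\Lambda_0}(\underline{\varphi})$ in momentum space as

$$(2\pi)^{4(n-1)}\, \frac{\delta}{\delta\hat{\underline{\varphi}}(p_1)} \cdots \frac{\delta}{\delta\hat{\underline{\varphi}}(p_n)}\, \Gamma^{\Lambda,\Lambda_0}(\underline{\varphi}) \bigg|_{\underline{\varphi} \equiv 0} = \delta(p_1 + \cdots + p_n)\, \Gamma_n^{\Lambda,\Lambda_0}(p_1, \dots, p_n) , \tag{3.50}$$

and analogously in the case of the functional $\tilde\Gamma^{\Lambda,\Lambda_0}(\underline{\varphi})$. Performing in addition a respective loop expansion, $n \in 2\mathbb{N}$,

$$\Gamma_n^{\Lambda,\Lambda_0}(p_1, \dots, p_n) = \sum_{l=0}^{\infty} \hbar^l\, \Gamma_{l,n}^{\Lambda,\Lambda_0}(p_1, \dots, p_n) , \tag{3.51}$$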
and for the functions $\tilde\Gamma_n^{\Lambda,\Lambda_0}(p_1, \dots, p_n)$ alike, the flow equation (3.47) is finally converted into the system of flow equations satisfied by the $n$-point functions, $n \in 2\mathbb{N}$, $l \in \mathbb{N}$,

$$\partial_\Lambda \Gamma_{l,n}^{\Lambda,\Lambda_0}(p_1, \dots, p_n) = \frac{1}{2} \int_k \partial_\Lambda C^{\Lambda,\Lambda_0}(k) \cdot \tilde\Gamma_{l-1,n+2}^{\Lambda,\Lambda_0}(k, -k, p_1, \dots, p_n) . \tag{3.52}$$

In contrast to the system of flow equations (2.31) satisfied by the amputated truncated Schwinger functions, here the r.h.s. is in total of lower loop order, but there is no closed form for it. As explained before, it has to be determined recursively via (3.40), treated in a loop expansion and using (3.42). It then emerges in the form

$$\tilde\Gamma_{l,n+2}^{\Lambda,\Lambda_0}(k, -k, p_1, \dots, p_n) = \Gamma_{l,n+2}^{\Lambda,\Lambda_0}(k, -k, p_1, \dots, p_n) - \sum_{r \ge 2}\, {\sum_{\{n_i\},\{l_i\}}}' \sigma\, \Gamma_{l_1,n_1+1}^{\Lambda,\Lambda_0}(k, \dots)\, C^{\Lambda,\Lambda_0}\, \Gamma_{l_2,n_2+2}^{\Lambda,\Lambda_0} \cdots \Gamma_{l_{r-1},n_{r-1}+2}^{\Lambda,\Lambda_0}\, C^{\Lambda,\Lambda_0}\, \Gamma_{l_r,n_r+1}^{\Lambda,\Lambda_0}(-k, \dots) . \tag{3.53}$$
The prime restricts summation to $l_1 + l_2 + \cdots + l_r = l$ and $n_1 + n_2 + \cdots + n_r = n + 2$; in addition, 2-point functions in the tree order are excluded as factors. The momentum assignment has been suppressed; it goes without saying that the sum inherits from the l.h.s. the complete symmetry in the momenta $p_1, \dots, p_n$. Moreover, there is a sign factor $\sigma$ depending on $\{n_i\}$ and $\{l_i\}$. The form of (3.53) is easily understood when represented by Feynman diagrams: To the first term (on the r.h.s.) correspond 1PI-diagrams, whereas to the sum correspond chains of 1PI-diagrams, minimally connected by single lines and thus not of 1PI-type. These chains are closed to 1PI-diagrams by the contraction involved in the flow equation. The system of flow equations (3.52) can alternatively be employed to prove the renormalizability of the theory considered. In the tree order, only the 2-point function (3.42) and the 4-point function $\Gamma_{0,4}^{\Lambda,\Lambda_0}(p_1, \dots, p_4) = g$ are different from zero. The latter is easily obtained via (3.40)–(3.42) from (2.40) observing $f = 0$ there. In each loop order $l \ge 1$, the three counterterms

$$\Gamma_{l,2}^{\Lambda_0,\Lambda_0}(p, -p) = a_l(\Lambda_0) + z_l(\Lambda_0)\, p^2 , \qquad \Gamma_{l,4}^{\Lambda_0,\Lambda_0}(p_1, \dots, p_4) = c_l(\Lambda_0) \tag{3.54}$$

form the respective bare action, determined in the end by the renormalization conditions, $l \ge 1$,

$$\Gamma_{l,2}^{0,\Lambda_0}(p, -p) = a_l^R + z_l^R\, p^2 + O((p^2)^2) , \qquad \Gamma_{l,4}^{0,\Lambda_0}(0, 0, 0, 0) = c_l^R . \tag{3.55}$$
The renormalization constants $a_l^R$, $z_l^R$, $c_l^R$ can be freely chosen. To prove renormalizability, we also have to make use of momentum derivatives of the flow equations (3.52), i.e. acting on them with $\partial^w$, (2.30). Then, the proof by induction follows step-by-step the proof given in Sec. 2.3 considering amputated truncated Schwinger
functions. It will therefore not be repeated. As a result, the analogue of Propositions 2.1 and 2.2 is established, where the function $L_{l,n}^{\Lambda,\Lambda_0}$ appearing there is now replaced by the function $\Gamma_{l,n}^{\Lambda,\Lambda_0}$.
4. Spontaneously Broken SU(2) Yang–Mills Theory

Attempting to prove renormalizability of a non-Abelian gauge theory via flow equations, following the path taken before in the case of a scalar field theory, one finds oneself confronted with a serious obstacle to be surmounted. By their definition the Schwinger functions of a non-Abelian gauge theory are not gauge invariant individually; the local gauge invariance of the theory, however, compels them to satisfy the system of Slavnov–Taylor identities [79, 80]. These identities are inevitably violated, if one employs in the intermediate regularization procedure a momentum cutoff. Moreover, the Slavnov–Taylor identities are generated by nonlinear transformations of the fields (the BRS-transformations [81, 82]) which, being composite fields, have to be renormalized, too.

In the sequel we follow the general line of [42], hereafter referred to as "I", incorporating the simplifications due to a more appropriate cutoff function. For the sake of readability, however, a coherent detailed argumentation is kept up. As concerns a number of intermediate proofs and technical derivations, we refer to the original article. After presenting in Sec. 4.1 the classical action of the theory considered, as a first step we disregard in Sec. 4.2 the Slavnov–Taylor identities and establish for an arbitrary set of renormalization conditions at a physical renormalization point a finite UV-behavior of the Schwinger functions without and with the insertion of one BRS-variation or of the gauge symmetry violation. The procedure is essentially the same as in the case of the scalar theory. Having thus established a family of finite theories, in Sec. 4.3 the violation of the Slavnov–Taylor identities of the amputated truncated Schwinger functions at the physical value $\Lambda = 0$ for the flow parameter and fixed $\Lambda_0$ is worked out, as well as the BRS-variation of the bare action.
Moreover, the violated Slavnov–Taylor identities of the vertex functions at Λ = 0, Λ0 fixed, are deduced. Flow equations for the vertex functions, however, are not used. Finally, in Sec. 4.4, the Slavnov–Taylor identities are accomplished by a proper choice of physical renormalization conditions. To this end, first termwise equivalence relations between the violated Slavnov–Taylor identities of the vertex functions and the BRS-variation of the bare action are established. Choosing freely 9 physical renormalization conditions, we then determine a related bare action such that the relevant part of its BRS-variation vanishes. Hence, due to the equivalence relations, the relevant part of the violated Slavnov–Taylor identities vanishes, too. From these 53 equations, the remaining 37 + 7 − 9 renormalization conditions are obtained, determining a UV-finite theory. It satisfies the Slavnov–Taylor identities, since the irrelevant part of the violation has a vanishing UV-limit, too.
4.1. The classical action

We begin collecting some basic properties of the classical Euclidean SU(2) Yang–Mills–Higgs model on four-dimensional Euclidean space-time, following closely the monograph of Faddeev and Slavnov [83]. This model involves the real Yang–Mills field $\{A^a_\mu\}_{a=1,2,3}$ and the complex scalar doublet $\{\phi_\alpha\}_{\alpha=1,2}$ assumed to be smooth functions which fall off rapidly. The classical action has the form

$$S_{\rm inv} = \int dx \left\{ \frac{1}{4}\, F^a_{\mu\nu} F^a_{\mu\nu} + \frac{1}{2}\, (\nabla_\mu \phi)^* \nabla_\mu \phi + \lambda\, (\phi^* \phi - \rho^2)^2 \right\} , \tag{4.1}$$

with the curvature tensor

$$F^a_{\mu\nu}(x) = \partial_\mu A^a_\nu(x) - \partial_\nu A^a_\mu(x) + g\, \epsilon^{abc} A^b_\mu(x) A^c_\nu(x) \tag{4.2}$$

and the covariant derivative

$$\nabla_\mu = \partial_\mu + g\, \frac{1}{2i}\, \sigma^a A^a_\mu(x) \tag{4.3}$$
acting on the SU(2)-spinor $\phi$. The parameters $g$, $\lambda$, $\rho$ are real positive, $\epsilon^{abc}$ is totally skew symmetric, $\epsilon^{123} = +1$, and $\{\sigma^a\}_{a=1,2,3}$ are the standard Pauli matrices. For simplicity the wave function normalizations of the fields are chosen equal to one. The action (4.1) is invariant under local gauge transformations of the fields

$$\frac{1}{2i}\, \sigma^a A^a_\mu(x) \to u(x)\, \frac{1}{2i}\, \sigma^a A^a_\mu(x)\, u^*(x) + g^{-1} u(x)\, \partial_\mu u^*(x) , \qquad \phi(x) \to u(x)\, \phi(x) , \tag{4.4}$$

with $u : \mathbb{R}^4 \to SU(2)$, smooth. A stable ground state of the action (4.1) implies spontaneous symmetry breaking, taken into account by reparametrizing the complex scalar doublet as

$$\phi(x) = \begin{pmatrix} B^2(x) + i B^1(x) \\ \rho + h(x) - i B^3(x) \end{pmatrix} \tag{4.5}$$

where $\{B^a(x)\}_{a=1,2,3}$ is a real triplet and $h(x)$ the real Higgs field. Moreover, in place of the parameters $\rho$, $\lambda$ the masses

$$m = \frac{1}{2}\, g \rho , \qquad M = (8 \lambda \rho^2)^{\frac{1}{2}} \tag{4.6}$$

are used. Since we aim at a quantized theory, pure gauge degrees of freedom have to be eliminated. We choose the 't Hooft gauge fixing$^{g}$

$$S_{\rm g.f.} = \frac{1}{2\alpha} \int dx\, (\partial_\mu A^a_\mu - \alpha m B^a)^2 , \tag{4.7}$$

with $\alpha \in \mathbb{R}_+$, implying complete spontaneous symmetry breaking. With regard to functional integration this condition is implemented by introducing anticommuting
g The general α-gauge [83] would lead to mixed propagators, in the Lorentz gauge the fields {B a } would be massless.
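Faddeev–Popov ghost and antighost fields $\{c^a\}_{a=1,2,3}$ and $\{\bar c^a\}_{a=1,2,3}$, respectively, and forming with these six independent scalar fields the additional interaction term

$$S_{\rm gh} = -\int dx\, \bar c^a \left\{ (-\partial_\mu \partial_\mu + \alpha m^2)\, \delta^{ab} + \frac{1}{2}\, \alpha g m\, h\, \delta^{ab} + \frac{1}{2}\, \alpha g m\, \epsilon^{acb} B^c - g\, \partial_\mu\, \epsilon^{acb} A^c_\mu \right\} c^b . \tag{4.8}$$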
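Hence, the total "classical action" is

$$S_{\rm BRS} = S_{\rm inv} + S_{\rm g.f.} + S_{\rm gh} , \tag{4.9}$$

which we decompose as

$$S_{\rm BRS} = \int dx\, \{ \mathcal{L}_{\rm quad}(x) + \mathcal{L}_{\rm int}(x) \} \tag{4.10}$$

into its quadratic part, with $\Delta \equiv \partial_\mu \partial_\mu$,

$$\mathcal{L}_{\rm quad} = \frac{1}{4}\, (\partial_\mu A^a_\nu - \partial_\nu A^a_\mu)^2 + \frac{1}{2\alpha}\, (\partial_\mu A^a_\mu)^2 + \frac{1}{2}\, m^2 A^a_\mu A^a_\mu + \frac{1}{2}\, h\, (-\Delta + M^2)\, h + \frac{1}{2}\, B^a (-\Delta + \alpha m^2) B^a - \bar c^a (-\Delta + \alpha m^2)\, c^a , \tag{4.11}$$

and into its interaction part:

$$\begin{aligned}
\mathcal{L}_{\rm int} ={}& g\, \epsilon^{abc} (\partial_\mu A^a_\nu) A^b_\mu A^c_\nu + \frac{1}{4}\, g^2 (\epsilon^{abc} A^b_\mu A^c_\nu)^2 \\
&+ \frac{1}{2}\, g \left\{ (\partial_\mu h) A^a_\mu B^a - h A^a_\mu \partial_\mu B^a - \epsilon^{abc} A^a_\mu (\partial_\mu B^b) B^c \right\} \\
&+ \frac{1}{8}\, g A^a_\mu A^a_\mu \left\{ 4 m h + g (h^2 + B^a B^a) \right\} \\
&+ \frac{1}{4}\, g\, \frac{M^2}{m}\, h\, (h^2 + B^a B^a) + \frac{1}{32}\, g^2 \left( \frac{M}{m} \right)^2 (h^2 + B^a B^a)^2 \\
&- \frac{1}{2}\, \alpha g m\, \bar c^a \{ h\, \delta^{ab} + \epsilon^{acb} B^c \}\, c^b - g\, \epsilon^{acb} (\partial_\mu \bar c^a) A^c_\mu c^b .
\end{aligned} \tag{4.12}$$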
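The purely scalar terms can be cross-checked numerically: inserting the parametrization (4.5) and the masses (4.6) into the potential $\lambda(\phi^*\phi - \rho^2)^2$ of (4.1) should reproduce the Higgs mass term of (4.11) together with the cubic and quartic scalar self-interactions of (4.12). The following is a minimal sketch of such a check at constant field values; the function names and the sample values are illustrative assumptions.

```python
import math


def potential_from_doublet(h, b, g, m, big_m):
    """lambda * (phi^* phi - rho^2)^2 with phi parametrized as in (4.5)."""
    rho = 2.0 * m / g                    # from m = g * rho / 2, cf. (4.6)
    lam = big_m**2 / (8.0 * rho**2)      # from M = (8 * lambda * rho^2)^(1/2)
    upper = complex(b[1], b[0])          # B^2 + i B^1
    lower = complex(rho + h, -b[2])      # rho + h - i B^3
    phi_sq = abs(upper)**2 + abs(lower)**2
    return lam * (phi_sq - rho**2)**2


def potential_from_lagrangian(h, b, g, m, big_m):
    """Higgs mass term of (4.11) plus the scalar self-interactions of (4.12)."""
    s = h * h + sum(x * x for x in b)    # h^2 + B^a B^a
    mass_term = 0.5 * big_m**2 * h**2
    cubic = 0.25 * g * (big_m**2 / m) * h * s
    quartic = (g**2 / 32.0) * (big_m / m)**2 * s**2
    return mass_term + cubic + quartic


def potentials_agree(h, b, g=0.65, m=0.8, big_m=1.2):
    """Compare both expressions at one constant field configuration."""
    left = potential_from_doublet(h, b, g, m, big_m)
    right = potential_from_lagrangian(h, b, g, m, big_m)
    return math.isclose(left, right, rel_tol=1e-9, abs_tol=1e-12)
```

For instance, `potentials_agree(0.3, (0.1, -0.4, 0.7))` compares the two expressions at one sample point; only the potential is tested here, not the derivative couplings.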
In (4.11) we recognize the important properties that all fields are massive and that no coupling term $A^a_\mu \partial_\mu B^a$ appears. The classical action $S_{\rm BRS}$, (4.10), shows the following symmetries:

(i) Euclidean invariance: $S_{\rm BRS}$ is an O(4)-scalar.

(ii) Rigid SO(3)-isosymmetry: The fields $\{A^a_\mu\}$, $\{B^a\}$, $\{c^a\}$, $\{\bar c^a\}$ are isovectors and $h$ an isoscalar; $S_{\rm BRS}$ is invariant under spacetime independent SO(3)-transformations.
(iii) BRS-invariance: Introducing the classical composite fields

$$\begin{aligned}
\psi^a_\mu(x) &= \{ \partial_\mu \delta^{ab} + g\, \epsilon^{arb} A^r_\mu(x) \}\, c^b(x) , \\
\psi(x) &= -\frac{1}{2}\, g\, B^a(x)\, c^a(x) , \\
\psi^a(x) &= \left\{ \left( m + \frac{1}{2}\, g\, h(x) \right) \delta^{ab} + \frac{1}{2}\, g\, \epsilon^{arb} B^r(x) \right\} c^b(x) , \\
\Omega^a(x) &= \frac{1}{2}\, g\, \epsilon^{apq} c^p(x)\, c^q(x) ,
\end{aligned} \tag{4.13}$$

the BRS-transformations of the basic fields are defined as

$$\begin{aligned}
A^a_\mu(x) &\to A^a_\mu(x) - \psi^a_\mu(x)\, \varepsilon , \\
h(x) &\to h(x) - \psi(x)\, \varepsilon , \\
B^a(x) &\to B^a(x) - \psi^a(x)\, \varepsilon , \\
c^a(x) &\to c^a(x) - \Omega^a(x)\, \varepsilon , \\
\bar c^a(x) &\to \bar c^a(x) - \frac{1}{\alpha}\, \big( \partial_\nu A^a_\nu(x) - \alpha m B^a(x) \big)\, \varepsilon .
\end{aligned} \tag{4.14}$$

In these transformations $\varepsilon$ is a spacetime independent Grassmann element that commutes with the fields $\{A^a_\mu, h, B^a\}$ but anticommutes with the (anti-)ghosts $\{c^a, \bar c^a\}$. To show the BRS-invariance of the total classical action (4.9) one first observes that the composite classical fields (4.13) are themselves invariant under the BRS-transformations (4.14). Moreover, we can write (4.8) in the form

$$S_{\rm gh} = -\int dx\, \bar c^a \{ -\partial_\mu \psi^a_\mu + \alpha m\, \psi^a \} . \tag{4.15}$$

Using these properties the BRS-invariance of the classical action (4.9) follows upon direct verification. It is convenient to add to the classical action (4.9) source terms both for the fields and the composite fields introduced, defining the extended action

$$S_c = S_{\rm BRS} + \int dx\, \{ \gamma^a_\mu \psi^a_\mu + \gamma \psi + \gamma^a \psi^a + \omega^a \Omega^a \} - \int dx\, \{ j^a_\mu A^a_\mu + s h + b^a B^a + \bar\eta^a c^a + \bar c^a \eta^a \} . \tag{4.16}$$
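The sources $\gamma^a_\mu$, $\gamma$, $\gamma^a$ all have canonical dimension 2, ghost number $-1$ and are Grassmann elements, whereas $\omega^a$ has canonical dimension 2 and ghost number $-2$; the sources $\eta^a$ and $\bar\eta^a$ have ghost number $+1$ and $-1$, respectively, and are Grassmann elements. Employing the BRS-operator $\mathcal{D}$, defined by

$$\mathcal{D} = \int dx \left\{ j^a_\mu \frac{\delta}{\delta\gamma^a_\mu} + s\, \frac{\delta}{\delta\gamma} + b^a \frac{\delta}{\delta\gamma^a} + \bar\eta^a \frac{\delta}{\delta\omega^a} + \eta^a \left( \frac{1}{\alpha}\, \partial_\nu \frac{\delta}{\delta j^a_\nu} - m\, \frac{\delta}{\delta b^a} \right) \right\} , \tag{4.17}$$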
the BRS-transformation of the extended action $S_c$, (4.16), can be written as

$$S_c \to S_c + \mathcal{D} S_c\, \varepsilon . \tag{4.18}$$

Of course, $\varepsilon$ also anticommutes with the sources of Grassmannian type.

4.2. Flow equations: Renormalizability without Slavnov–Taylor identities

In view of the various fields present, it is convenient to introduce a short collective notation for the fields and sources. We denote:

(i) the bosonic fields and the corresponding sources, respectively, by

$$\varphi_\tau = (A^a_\mu, h, B^a) , \qquad J_\tau = (j^a_\mu, s, b^a) , \tag{4.19}$$

(ii) all fields and their respective sources by

$$\Phi = (\varphi_\tau, c^a, \bar c^a) , \qquad K = (J_\tau, \bar\eta^a, \eta^a) , \tag{4.20}$$

(iii) the insertions and their sources

$$\psi_\tau = (\psi^a_\mu, \psi, \psi^a) , \qquad \gamma_\tau = (\gamma^a_\mu, \gamma, \gamma^a) , \qquad \xi = (\gamma_\tau, \omega^a) . \tag{4.21}$$
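The quadratic part of $S_{\rm BRS}$, (4.10), defines the inverses of the various unregularized free propagators. We start from the theory defined on finite volume, as described in Sec. 2.1. With the notation introduced there, we have

$$\int dx\, \mathcal{L}_{\rm quad}(x) \equiv Q(\Phi) = \frac{1}{2} \langle A^a_\mu, (C^{-1})_{\mu\nu} A^a_\nu \rangle + \frac{1}{2} \langle h, C^{-1} h \rangle + \frac{1}{2} \langle B^a, S^{-1} B^a \rangle - \langle \bar c^a, S^{-1} c^a \rangle , \tag{4.22}$$

where the Fourier transforms of these free propagators (compare (2.7) with $\Lambda = 0$, $\Lambda_0 = \infty$) turn out to be

$$C(k) = \frac{1}{k^2 + M^2} , \qquad S(k) = \frac{1}{k^2 + \alpha m^2} , \qquad C_{\mu\nu}(k) = \frac{1}{k^2 + m^2} \left( \delta_{\mu\nu} - (1 - \alpha)\, \frac{k_\mu k_\nu}{k^2 + \alpha m^2} \right) . \tag{4.23}$$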
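That $C_{\mu\nu}(k)$ in (4.23) inverts the quadratic form of the vector field in (4.11) can be verified with a few lines of arithmetic. In the sketch below, the momentum-space kernel $(k^2 + m^2)\delta_{\mu\nu} - (1 - 1/\alpha) k_\mu k_\nu$ is read off from (4.11) by hand, which is an assumption of this illustration; the function names and sample values are likewise hypothetical.

```python
def inverse_propagator(k, m, alpha):
    """Quadratic-form kernel of the vector field read off from (4.11)."""
    k2 = sum(x * x for x in k)
    return [[(k2 + m * m) * (1.0 if mu == nu else 0.0)
             - (1.0 - 1.0 / alpha) * k[mu] * k[nu]
             for nu in range(4)] for mu in range(4)]


def vector_propagator(k, m, alpha):
    """C_{mu nu}(k) as given in (4.23)."""
    k2 = sum(x * x for x in k)
    return [[((1.0 if mu == nu else 0.0)
              - (1.0 - alpha) * k[mu] * k[nu] / (k2 + alpha * m * m))
             / (k2 + m * m)
             for nu in range(4)] for mu in range(4)]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][r] * b[r][j] for r in range(4)) for j in range(4)]
            for i in range(4)]


def max_deviation_from_identity(k, m, alpha):
    """Largest entry of |K(k) C(k) - 1| for the given Euclidean momentum."""
    product = matmul(inverse_propagator(k, m, alpha),
                     vector_propagator(k, m, alpha))
    return max(abs(product[i][j] - (1.0 if i == j else 0.0))
               for i in range(4) for j in range(4))
```

A call such as `max_deviation_from_identity((0.3, -1.2, 0.7, 2.0), 0.9, 2.5)` should return a value at the level of floating-point rounding. For $\alpha = 1$ the propagator reduces to $\delta_{\mu\nu}/(k^2 + m^2)$.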
Again the notation has been abused omitting the "hat". Furthermore, we shall use $C(k)$ as a collective symbol for these propagators. A Gaussian product measure, the covariances of which are a regularized version of the propagators (4.23), forms the basis to quantize the theory by functional integration. Although gauge symmetry is violated by any momentum cutoff one should try to reduce the bothersome consequences as far as possible. Instead of the simple form (2.10) we choose the cutoff function

$$\sigma_\Lambda(k^2) = e^{-\frac{(k^2 + m^2)(k^2 + \alpha m^2)(k^2 + M^2)}{\Lambda^6}} \left( 1 + \frac{(1 + \alpha)\, m^2 M^2 + \alpha m^4}{\Lambda^6}\, k^2 \right) . \tag{4.24}$$
It is positive, invertible and analytic as the former, but satisfies in addition

$$\frac{d}{dk^2}\, \sigma_\Lambda(k^2) \Big|_{k^2 = 0} = 0 . \tag{4.25}$$
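The vanishing slope (4.25) results from a cancellation between the exponential and the linear factor in (4.24), which can be confirmed by a finite difference. The sketch below assumes the reading of (4.24) as an exponential of the cubic polynomial in $k^2$ times a factor linear in $k^2$; the parameter values and function names are illustrative only. A cutoff without the linear factor is included for contrast.

```python
import math


def cutoff(k2, lam, m, big_m, alpha):
    """Cutoff function sigma_Lambda(k^2) in the form (4.24)."""
    poly = (k2 + m * m) * (k2 + alpha * m * m) * (k2 + big_m * big_m)
    slope = (1.0 + alpha) * m * m * big_m * big_m + alpha * m**4
    return math.exp(-poly / lam**6) * (1.0 + slope * k2 / lam**6)


def plain_cutoff(k2, lam, m, big_m, alpha):
    """Same exponential without the linear correction factor, for comparison."""
    poly = (k2 + m * m) * (k2 + alpha * m * m) * (k2 + big_m * big_m)
    return math.exp(-poly / lam**6)


def slope_at_zero(func, lam, m, big_m, alpha, step=1e-5):
    """Central finite difference of func with respect to k^2 at k^2 = 0."""
    upper = func(step, lam, m, big_m, alpha)
    lower = func(-step, lam, m, big_m, alpha)
    return (upper - lower) / (2.0 * step)
```

With sample values such as `lam=1.5, m=0.9, big_m=1.3, alpha=2.5`, the slope of `cutoff` at the origin should vanish within the finite-difference accuracy, whereas the slope of `plain_cutoff` is clearly negative.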
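The property (4.25) is the raison d'être for the particular choice (4.24), compared with I (30). Employing this cutoff function we define the regularized propagators, $0 \le \Lambda \le \Lambda_0 < \infty$,

$$C^{\Lambda,\Lambda_0}(k) \equiv C(k)\, \sigma_{\Lambda,\Lambda_0}(k^2) := C(k)\, \frac{\sigma_{\Lambda_0}(k^2) - \sigma_\Lambda(k^2)}{\sigma_{\Lambda_0}(0)} . \tag{4.26}$$

They satisfy the bounds, valid for $|w| \le 4$,

$$\left| \left( \prod_{i=1}^{|w|} \frac{\partial}{\partial k_{\mu_i}} \right) \frac{\partial}{\partial \Lambda}\, C^{\Lambda,\Lambda_0}(k) \right| \le \begin{cases} c_{|w|}\, e^{-\frac{(k^2 + m^2)(k^2 + \alpha m^2)(k^2 + M^2)}{2 m^6}} & \text{for } 0 \le \Lambda \le m , \\[4pt] \Lambda^{-3-|w|}\, P_{|w|}\!\left( \frac{|k|}{\Lambda} \right) e^{-\frac{(k^2 + m^2)(k^2 + \alpha m^2)(k^2 + M^2)}{\Lambda^6}} & \text{for } \Lambda > m , \end{cases} \tag{4.27}$$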
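with polynomials $P_{|w|}$ having positive coefficients. These coefficients, as well as the constants $c_{|w|}$, only depend on $\alpha$, $m$, $M$, $|w|$. The bounds (4.27), only valid in the cases $|w| \le 4$, are sufficient for our purpose and have the same form as the bounds (2.42). Writing

$$\langle \Phi, K \rangle := \int dx \left( \sum_\tau \varphi_\tau(x) J_\tau(x) + \bar c^a(x)\, \eta^a(x) + \bar\eta^a(x)\, c^a(x) \right) , \tag{4.28}$$

the characteristic functional of the Gaussian product measure with covariances $\hbar C^{\Lambda,\Lambda_0}$, (4.26), (4.23), is given by

$$\int d\mu_{\Lambda,\Lambda_0}(\Phi)\, e^{\frac{1}{\hbar} \langle \Phi, K \rangle} = e^{\frac{1}{\hbar} P(K)} , \tag{4.29}$$

$$P(K) = \frac{1}{2} \langle j^a_\mu, C^{\Lambda,\Lambda_0}_{\mu\nu} j^a_\nu \rangle + \frac{1}{2} \langle s, C^{\Lambda,\Lambda_0} s \rangle + \frac{1}{2} \langle b^a, S^{\Lambda,\Lambda_0} b^a \rangle - \langle \bar\eta^a, S^{\Lambda,\Lambda_0} \eta^a \rangle . \tag{4.30}$$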
The free propagators (4.23) reveal the mass dimensions of the corresponding quantum fields: each of the fields has a mass dimension equal to one, attributing equal values to the ghost and antighost field. To promote the classical model to a quantum field theory we consider the generating functional $L^{\Lambda,\Lambda_0}(\Phi)$ of the amputated truncated Schwinger functions. It unfolds according to the integrated flow equation, cf. (2.23),

$$e^{-\frac1\hbar\left(L^{\Lambda,\Lambda_0}(\Phi) + I^{\Lambda,\Lambda_0}\right)} = e^{\hbar\Delta_{\Lambda,\Lambda_0}}\, e^{-\frac1\hbar L^{\Lambda_0,\Lambda_0}(\Phi)} \tag{4.31}$$
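from the bare functional $L^{\Lambda_0,\Lambda_0}(\Phi)$, which forms its initial value at $\Lambda = \Lambda_0$. The functional Laplace operator appearing has the form

$$\Delta_{\Lambda,\Lambda_0} = \frac12\left\langle\frac{\delta}{\delta A^a_\mu}, C^{\Lambda,\Lambda_0}_{\mu\nu}\frac{\delta}{\delta A^a_\nu}\right\rangle + \frac12\left\langle\frac{\delta}{\delta h}, C^{\Lambda,\Lambda_0}\frac{\delta}{\delta h}\right\rangle + \frac12\left\langle\frac{\delta}{\delta B^a}, S^{\Lambda,\Lambda_0}\frac{\delta}{\delta B^a}\right\rangle + \left\langle\frac{\delta}{\delta c^a}, S^{\Lambda,\Lambda_0}\frac{\delta}{\delta\bar c^a}\right\rangle . \tag{4.32}$$

Since the local gauge symmetry is violated by the regularization, the bare functional

$$L^{\Lambda_0,\Lambda_0}(\Phi) = \int dx\, L_{\rm int}(x) + L^{\Lambda_0,\Lambda_0}_{\rm c.t.}(\Phi) \tag{4.33}$$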
has at first to be chosen sufficiently general in order to allow the restoration of the Slavnov–Taylor identities at the end. Therefore, we add as counterterms to the given interaction part (4.12) of classical origin all local terms of mass dimension $\le 4$ which are permitted by the unbroken global symmetries, i.e. Euclidean O(4)-invariance and SO(3)-isosymmetry. There are 37 such terms, by definition all of order $O(\hbar)$. The bare functional is presented in Appendix A. We remark that no irrelevant terms I, (107) and (108) are introduced in the bare interaction, now unnecessary due to (4.25). The decomposition of the generating functional $L^{\Lambda,\Lambda_0}(\Phi)$ is written employing a multiindex $n$, the components of which denote the number of each source field species appearing:

$$n = (n_A, n_h, n_B, n_{\bar c}, n_c) , \qquad |n| = n_A + n_h + n_B + n_{\bar c} + n_c . \tag{4.34}$$
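Moreover, we consider the functional within a formal loop expansion, hence

$$L^{\Lambda,\Lambda_0}(\Phi) = \sum_{|n|=3}^{\infty} L^{\Lambda,\Lambda_0}_{l=0,n}(\Phi) + \sum_{l=1}^{\infty}\hbar^l\sum_{|n|=1}^{\infty} L^{\Lambda,\Lambda_0}_{l,n}(\Phi) . \tag{4.35}$$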
Disregarding the vacuum part, we can study the flow of the n-point functions in the infinite volume limit $\Omega \to \mathbb{R}^4$, $\Phi \in \mathcal{S}(\mathbb{R}^4)$. With our conventions (2.24) of the Fourier transformation, the momentum representation of the n-point function with multiindex $n$, (4.34), at loop order $l$ is obtained as an $|n|$-fold functional derivative:

$$(2\pi)^{4(|n|-1)}\,\delta^n_{\hat\Phi(p)} L^{\Lambda,\Lambda_0}_l\Big|_{\hat\Phi=0} = \delta(p_1 + \cdots + p_{|n|})\, L^{\Lambda,\Lambda_0}_{l,n}(p_1, \ldots, p_{|n|}) . \tag{4.36}$$
To avoid clumsiness, the notation does not reveal how the momenta are assigned to the multiindex n, and in addition, it suppresses the O(4)- and SO(3)-tensor structure. From the definition (4.36) of the n-point function follows that it is completely symmetric (antisymmetric) upon permuting the variables belonging to each of the bosonic (fermionic) species occurring. Proceeding exactly as in the case of the scalar field, the flow equation (4.31) is converted into a system of flow equations relating the n-point functions. It looks like (2.31), where n is now a multiindex and the residual symmetrization has to be extended to a corresponding antisymmetrization in case of the (anti)ghost fields, cf. I (37). The system is integrated in the familiar
way. At first the tree order $l = 0$ has to be gained, fully determined by the classical descendant (4.12) appearing in the initial condition (4.33) at $\Lambda = \Lambda_0$. Given the tree order $l = 0$, the inductive integration ascends in the loop order $l$, for fixed $l$ ascends in $|n|$, and for fixed $l, n$ descends in $w$ from $|w| = 4$ to $w = 0$, with initial conditions as follows:

(A1) For $|n| + |w| > 4$ at $\Lambda = \Lambda_0$,

$$\partial^w L^{\Lambda_0,\Lambda_0}_{l,n}(p_1, \ldots, p_{|n|}) = 0 , \tag{4.37}$$
due to the choice of the bare functional (4.33).

(A2) For the (relevant) cases $|n| + |w| \le 4$, renormalization conditions at the physical value $\Lambda = 0$ and a chosen renormalization point are freely prescribed order-by-order, subject only to the unbroken O(4)- and SO(3)-symmetries. These conditions determine the 37 local counterterms entering the bare functional (4.33). For simplicity we choose, as before, vanishing momenta as renormalization point.

Repeating exactly the steps that in the case of the scalar field led to the Propositions 2.1 and 2.2, one establishes in the present case analogous bounds, just reading now $n$ as a multiindex (cf. I, Propositions 1 and 2). As a consequence of these bounds a finite theory results in the limit $\Lambda_0 \to \infty$; however, it is not yet the gauge theory looked for! The problem to be solved is to select renormalization conditions (A2) such that the n-point functions in the limit $\Lambda = 0$, $\Lambda_0 \to \infty$ satisfy the Slavnov–Taylor identities. As worked out in the next section, establishing the Slavnov–Taylor identities requires considering Schwinger functions with a composite field inserted. There will appear two kinds of such insertions: the composite BRS-fields forming local insertions, and a space-time integrated insertion describing the intermediate violation of the Slavnov–Taylor identities. The classical composite BRS-fields (4.13) all have mass dimension 2 and transform as vector-isovector, scalar-isoscalar, scalar-isovector and scalar-isovector, respectively. Moreover, the first three have ghost number 1, whereas the last has ghost number 2. Thus, adding counterterms according to the rules formulated in Sec. 2.4, we introduce the bare composite fields

$$\begin{aligned} \psi^a_\mu(x) &= R_1^0\,\partial_\mu c^a(x) + R_2^0\, g\,\epsilon^{arb}A^r_\mu(x)c^b(x) ,\\ \psi(x) &= -R_3^0\,\frac{g}{2}\,B^a(x)c^a(x) ,\\ \psi^a(x) &= R_4^0\, m\,c^a(x) + R_5^0\,\frac{g}{2}\,h(x)c^a(x) + R_6^0\,\frac{g}{2}\,\epsilon^{arb}B^r(x)c^b(x) ,\\ \Omega^a(x) &= R_7^0\,\frac{g}{2}\,\epsilon^{apq}c^p(x)c^q(x) , \end{aligned} \tag{4.38}$$
keeping the notation introduced for the classical terms and using it henceforth exclusively according to (4.38). We set

$$R_i^0 = 1 + O(\hbar) , \tag{4.39}$$

thus viewing the counterterms again as formal power series in $\hbar$; the tree order $\hbar^0$ provides the classical terms (4.13). The reader notices that there is no insertion attributed to the linear variation of the antighost field. It will be seen that the Slavnov–Taylor identities can be established generating this variation by functional derivation with respect to the sources of the fields involved. We shall have to deal with Schwinger functions with one insertion. Similarly as in Sec. 2.4, the bare interaction (4.33) is modified adding the composite fields (4.38) coupled to corresponding sources, introduced in (4.16),

$$\tilde L^{\Lambda_0,\Lambda_0} := L^{\Lambda_0,\Lambda_0} + L^{\Lambda_0,\Lambda_0}(\xi) , \tag{4.40}$$

$$L^{\Lambda_0,\Lambda_0}(\xi) = \int dx\,\{\gamma^a_\mu(x)\psi^a_\mu(x) + \gamma(x)\psi(x) + \gamma^a(x)\psi^a(x) + \omega^a(x)\Omega^a(x)\} . \tag{4.41}$$

Then, from the corresponding generating functional of the regularized amputated truncated Schwinger functions with one insertion $\psi(x)$,

$$L^{\Lambda,\Lambda_0}_\gamma(x;\Phi) := \frac{\delta}{\delta\gamma(x)}\tilde L^{\Lambda,\Lambda_0}\Big|_{\xi=0} , \qquad \hat L^{\Lambda,\Lambda_0}_\gamma(q;\Phi) = \int dx\, e^{iqx}L^{\Lambda,\Lambda_0}_\gamma(x;\Phi) , \tag{4.42}$$

with analogous expressions for the other insertions, after a loop expansion, follows for the n-point functions with one insertion $\psi$,

$$\delta(q + p_1 + \cdots + p_{|n|})\, L^{\Lambda,\Lambda_0}_{\gamma;l,n}(q; p_1, \ldots, p_{|n|}) := (2\pi)^{4(|n|-1)}\,\delta^n_{\hat\Phi(p)}\hat L^{\Lambda,\Lambda_0}_{\gamma;l}(q;\Phi)\Big|_{\Phi=0} , \tag{4.43}$$
a system of flow equations (cf. I (46)–(51)). From each of these systems the renormalizability of the amputated truncated Schwinger functions with one insertion can be deduced inductively in the familiar way. We denote by $\xi$ any of the labels $\gamma^a_\mu, \gamma, \gamma^a, \omega^a$. First, the tree order $l = 0$ is obtained from its initial condition at $\Lambda = \Lambda_0$. For $l \ge 1$ the initial conditions are:

(B1) If $|n| + |w| > 2$, at $\Lambda = \Lambda_0$,

$$\partial^w L^{\Lambda_0,\Lambda_0}_{\xi;l,n}(q; p_1, \ldots, p_{|n|}) = 0 . \tag{4.44}$$
(B2) If $|n| + |w| \le 2$, at $\Lambda = 0$ and at vanishing momenta (the renormalization point) the initial condition can be fixed freely in each loop order, provided the Euclidean symmetry and the isosymmetry are respected. In total, there are 7 such renormalization conditions, which then determine the 7 parameters $R_i^0$ entering the bare insertions (4.38). Given the bounds of the case without insertion one deduces inductively the analogues of the Propositions 2.3 and 2.4, with $n$ now a multiindex and $D = 2$ (cf. I, Prop. 3). Hence, we
have boundedness and convergence of the amputated truncated Schwinger functions with the insertion of one BRS-variation. The intermediate violation of the Slavnov–Taylor identities, as will be derived in the following section, leads to a bare space-time integrated insertion of the form

$$L_1^{\Lambda_0,\Lambda_0}(\Phi) = \int dx\, N(x) , \tag{4.45}$$

$$N(x) = Q(x) + Q'(x; (\Lambda_0)^{-1}) . \tag{4.46}$$
Here $Q(x)$ is a local polynomial in the fields and their derivatives, having canonical mass dimension $D = 5$, whereas $Q'(x; (\Lambda_0)^{-1})$ is nonpolynomial in the field derivatives but with powers of $(\Lambda_0)^{-1}$ as coefficients such that it becomes irrelevant. The individual terms composing $N(x)$ involve at most five fields and have ghost number equal to one. We have to control $L_1^{\Lambda,\Lambda_0}(\Phi)$, the L-functional with one (bare) insertion (4.45). Hence, in analogy to the local case, cf. (2.61), a modified bare action

$$L^{\Lambda_0,\Lambda_0}(\Phi) + \chi L_1^{\Lambda_0,\Lambda_0}(\Phi) \tag{4.47}$$

is introduced as initial condition in the (integrated form of the) flow equation

$$e^{-\frac1\hbar\left(L_\chi^{\Lambda,\Lambda_0} + I^{\Lambda,\Lambda_0}\right)} := e^{\hbar\Delta_{\Lambda,\Lambda_0}}\, e^{-\frac1\hbar\left(L^{\Lambda_0,\Lambda_0} + \chi L_1^{\Lambda_0,\Lambda_0}\right)} . \tag{4.48}$$

Herefrom results the generating functional of the (regularized) amputated truncated Schwinger functions with one insertion (4.45) as

$$L_1^{\Lambda,\Lambda_0}(\Phi) = \frac{\partial}{\partial\chi}L_\chi^{\Lambda,\Lambda_0}(\Phi)\Big|_{\chi=0} . \tag{4.49}$$

It satisfies a linear differential flow equation which is easily obtained by relating it to the case of a bare local insertion $\int dx\,\varrho(x)N(x)$, cf. (2.61)–(2.69), and observing

$$\frac{\partial}{\partial\chi}L_\chi^{\Lambda,\Lambda_0}(\Phi)\Big|_{\chi=0} = \int dx\,\frac{\delta}{\delta\varrho(x)}\tilde L^{\Lambda,\Lambda_0}(\varrho;\Phi)\Big|_{\varrho=0} = \int dx\, L^{\Lambda,\Lambda_0}_{(1)}(x;\Phi) = \hat L^{\Lambda,\Lambda_0}_{(1)}(0;\Phi) .$$

Hence, the differential flow equation satisfied by the functional $L_1^{\Lambda,\Lambda_0}(\Phi)$ is the space-time integrated analogue of (2.67). Performing a loop expansion, the amputated truncated n-point functions with one insertion (4.45), $n$ a multiindex (4.34),

$$\delta(p_1 + \cdots + p_{|n|})\, L^{\Lambda,\Lambda_0}_{1;l,n}(p_1, \ldots, p_{|n|}) := (2\pi)^{4(|n|-1)}\,\delta^n_{\hat\Phi(p)}L^{\Lambda,\Lambda_0}_{1;l}(\Phi)\Big|_{\Phi=0} \tag{4.50}$$
then satisfy a system of flow equations similar to the case of the local BRS-insertions, letting there the momentum take the value $q = 0$. As a consequence, we
obtain analogous bounds, but observing in the present case the dimension $D = 5$ (cf. I, Prop. 3). The irrelevant part appearing in the bare insertion (4.45) and (4.46) satisfies the required bounds to be admitted, cf. (2.57).

4.3. Violated Slavnov–Taylor identities

The Schwinger functions of the spontaneously broken Yang–Mills theory should be uniquely determined by its free physical parameters $g, \lambda, m$ and the gauge fixing parameter $\alpha$, once the normalization of the fields has been fixed. This uniqueness, as well as the physical gauge invariance, is accomplished by requiring the Schwinger functions to satisfy the Slavnov–Taylor identities. These identities, however, are inevitably violated by the intermediate regularization in momentum space. Our ultimate goal is to show that by a proper choice of the renormalization conditions the Slavnov–Taylor identities emerge upon removing the regularization. To this end we first examine the violation of the Slavnov–Taylor identities produced by the UV-cutoff $\Lambda_0$. Our starting point is the generating functional of the regularized Schwinger functions, here considered at the physical value $\Lambda = 0$ of the flow parameter,

$$Z^{0,\Lambda_0}(K) = \int d\mu_{0,\Lambda_0}(\Phi)\, e^{-\frac1\hbar L^{\Lambda_0,\Lambda_0} + \frac1\hbar\langle\Phi,K\rangle} .$$

The Gaussian measure $d\mu_{0,\Lambda_0}(\Phi)$ corresponds to the quadratic form $\frac1\hbar Q^{0,\Lambda_0}(\Phi)$, cf. (4.22), (4.26), to wit:

$$Q^{0,\Lambda_0}(\Phi) = \frac12\langle A^a_\mu, (C^{0,\Lambda_0})^{-1}_{\mu\nu}A^a_\nu\rangle + \frac12\langle h, (C^{0,\Lambda_0})^{-1}h\rangle + \frac12\langle B^a, (S^{0,\Lambda_0})^{-1}B^a\rangle - \langle\bar c^a, (S^{0,\Lambda_0})^{-1}c^a\rangle . \tag{4.51}$$

Defining regularized BRS-variations (4.14), (4.38) of the fields by
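$$\delta_{BRS}\varphi_\tau(x) = -(\sigma_{0,\Lambda_0}\psi_\tau)(x) , \qquad \delta_{BRS}c^a(x) = -(\sigma_{0,\Lambda_0}\Omega^a)(x) , \qquad \delta_{BRS}\bar c^a(x) = -\left(\sigma_{0,\Lambda_0}\left(\frac1\alpha\partial_\nu A^a_\nu - mB^a\right)\right)(x) , \tag{4.52}$$

the BRS-variation of the Gaussian measure follows as

$$d\mu_{0,\Lambda_0}(\Phi) \mapsto d\mu_{0,\Lambda_0}(\Phi)\left(1 - \frac1\hbar\,\delta_{BRS}Q^{0,\Lambda_0}(\Phi)\right) . \tag{4.53}$$

Written more explicitly,

$$\delta_{BRS}Q^{0,\Lambda_0}(\Phi) = \left(-\sum_\tau\langle\varphi_\tau, (C_\tau^{0,\Lambda_0})^{-1}\sigma_{0,\Lambda_0}\psi_\tau\rangle + \langle\bar c^a, (S^{0,\Lambda_0})^{-1}\sigma_{0,\Lambda_0}\Omega^a\rangle - \left\langle\frac1\alpha\partial_\nu A^a_\nu - mB^a, \sigma_{0,\Lambda_0}(S^{0,\Lambda_0})^{-1}c^a\right\rangle\right) , \tag{4.54}$$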
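it reveals that $\sigma_{0,\Lambda_0}$ just cancels its inverse appearing in the inverted propagators, and as a consequence, the BRS-variation of the Gaussian measure has mass dimension $D = 5$. The essential reason for using regularized BRS-variations (4.52) is to assure this property. From the requirement that the regularized generating functional $Z^{0,\Lambda_0}(K)$ be invariant under the BRS-variations (4.52) result the violated Slavnov–Taylor identities$^{h}$

$$0 \overset{!}{=} \int d\mu_{0,\Lambda_0}(\Phi)\, e^{-\frac1\hbar L^{\Lambda_0,\Lambda_0} + \frac1\hbar\langle\Phi,K\rangle}\left(\delta_{BRS}\langle\Phi,K\rangle - \delta_{BRS}(Q^{0,\Lambda_0} + L^{\Lambda_0,\Lambda_0})\right) . \tag{4.55}$$

$^{h}$ As long as the vacuum part is involved, one has to stay in finite volume.

This equation can be rewritten, introducing modified generating functionals:

(i) With the modified bare interaction (4.40) we define

$$\tilde Z^{0,\Lambda_0}(K,\xi) := \int d\mu_{0,\Lambda_0}(\Phi)\, e^{-\frac1\hbar\tilde L^{\Lambda_0,\Lambda_0} + \frac1\hbar\langle\Phi,K\rangle} , \tag{4.56}$$

in combination with a regularized version of the BRS-operator (4.17),

$$D_{\Lambda_0} = \sum_\tau\left\langle J_\tau, \sigma_{0,\Lambda_0}\frac{\delta}{\delta\gamma_\tau}\right\rangle + \left\langle\bar\eta^a, \sigma_{0,\Lambda_0}\frac{\delta}{\delta\omega^a}\right\rangle + \left\langle\frac1\alpha\partial_\nu\frac{\delta}{\delta j^a_\nu} - m\frac{\delta}{\delta b^a}, \sigma_{0,\Lambda_0}\eta^a\right\rangle . \tag{4.57}$$

(ii) In addition, we treat the BRS-variation of the bare action,

$$L_1^{\Lambda_0,\Lambda_0} := -\delta_{BRS}(Q^{0,\Lambda_0} + L^{\Lambda_0,\Lambda_0}) , \tag{4.58}$$

as a space-time integrated insertion with ghost number 1. Because of the regularizing factor $\sigma_{0,\Lambda_0}$, cf. (4.52), the integrand is not a polynomial in the fields and their derivatives. With $\chi \in \mathbb{R}$, we then define

$$Z_\chi^{0,\Lambda_0}(K) := \int d\mu_{0,\Lambda_0}(\Phi)\, e^{-\frac1\hbar\left(L^{\Lambda_0,\Lambda_0} + \chi L_1^{\Lambda_0,\Lambda_0}\right) + \frac1\hbar\langle\Phi,K\rangle} . \tag{4.59}$$

Due to these definitions, the violated Slavnov–Taylor identities (4.55) can be written in the form

$$D_{\Lambda_0}\tilde Z^{0,\Lambda_0}(K,\xi)\Big|_{\xi=0} = \frac{d}{d\chi}Z_\chi^{0,\Lambda_0}(K)\Big|_{\chi=0} . \tag{4.60}$$

From the modified functionals (4.56) and (4.59) follow, cf. (2.65), the generating functionals of the corresponding amputated truncated Schwinger functions

$$\tilde Z^{0,\Lambda_0}(K,\xi) = e^{\frac1\hbar P(K)}\, e^{-\frac1\hbar\left(\tilde L^{0,\Lambda_0}(\varphi_\tau, c, \bar c;\xi) + I^{0,\Lambda_0}\right)} , \tag{4.61}$$

$$Z_\chi^{0,\Lambda_0}(K) = e^{\frac1\hbar P(K)}\, e^{-\frac1\hbar\left(L_\chi^{0,\Lambda_0}(\varphi_\tau, c, \bar c) + I^{0,\Lambda_0}\right)} , \tag{4.62}$$

where the variables of the Z- and the L-functional are related as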
$$\varphi_\tau(x) = \int dy\, C_\tau^{0,\Lambda_0}(x-y)J_\tau(y) , \qquad c^a(x) = -\int dy\, S^{0,\Lambda_0}(x-y)\eta^a(y) , \qquad \bar c^a(x) = -\int dy\, S^{0,\Lambda_0}(x-y)\bar\eta^a(y) . \tag{4.63}$$
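Furthermore, $P(K)$, (4.30), has to be taken here at $\Lambda = 0$. We observe that the vacuum part $I^{0,\Lambda_0}$ present without insertions appears, since both insertions have positive ghost number. To have a less cumbersome notation in the rest of this section, we abbreviate

$$\begin{aligned} L &\equiv L^{0,\Lambda_0} = \tilde L^{0,\Lambda_0}\big|_{\xi=0} = L_\chi^{0,\Lambda_0}\big|_{\chi=0} , & L^0 &\equiv L^{\Lambda_0,\Lambda_0} ,\\ L_1 &\equiv L_1^{0,\Lambda_0} := \frac{d}{d\chi}L_\chi^{0,\Lambda_0}\Big|_{\chi=0} , & L_1^0 &\equiv L_1^{\Lambda_0,\Lambda_0} ,\\ L_\gamma &\equiv L_\gamma^{0,\Lambda_0}(x;\Phi) , & L_\gamma^0 &\equiv L_\gamma^{\Lambda_0,\Lambda_0}(x;\Phi) = \frac{\delta}{\delta\gamma(x)}L^{\Lambda_0,\Lambda_0}(\xi)\Big|_{\xi=0} , \end{aligned} \tag{4.64}$$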
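see (4.40)–(4.42). Moreover, we denote the inverted unregularized propagators by

$$D_\tau \equiv \left((-\Delta + m^2)\delta_{\mu\nu} - \frac{1-\alpha}{\alpha}\partial_\mu\partial_\nu ,\; -\Delta + M^2 ,\; -\Delta + \alpha m^2 \equiv D\right) . \tag{4.65}$$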
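From (4.60) we derive via (4.61)–(4.63), employing the previous abbreviations, the violated Slavnov–Taylor identities of the amputated truncated Schwinger functions:

$$\left\langle c^a, D\left(\frac1\alpha\partial_\nu A^a_\nu - mB^a\right)\right\rangle - \left\langle c^a, \sigma_{0,\Lambda_0}\left(\partial_\nu\frac{\delta L}{\delta A^a_\nu} - m\frac{\delta L}{\delta B^a}\right)\right\rangle + \sum_\tau\langle\varphi_\tau, D_\tau L_{\gamma_\tau}\rangle - \langle\bar c^a, D L_{\omega^a}\rangle = L_1 . \tag{4.66}$$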
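One elementary identity, which follows from (4.23) alone, is presumably what turns the operator $\frac1\alpha\partial_\nu\,\delta/\delta j^a_\nu$ of (4.57) into a plain divergence acting together with the ghost propagator when passing from (4.60) to the amputated functions. This remark is an editorial reading of the derivation, not a statement from the text; the computation itself is a direct consequence of the unregularized propagators, in momentum space:

$$\frac1\alpha\,k_\nu C_{\nu\mu}(k) = \frac{k_\mu}{\alpha(k^2+m^2)}\left(1 - (1-\alpha)\frac{k^2}{k^2+\alpha m^2}\right) = \frac{k_\mu}{\alpha(k^2+m^2)}\cdot\frac{\alpha(k^2+m^2)}{k^2+\alpha m^2} = k_\mu\,S(k) .$$

The longitudinal part of the gauge propagator thus coincides, up to the factor $\alpha$, with the propagator $S(k)$ shared by the field $B^a$ and the ghosts.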
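As will turn out, we also need the explicit form of $L_1^0$, (4.58), i.e. the BRS-variation of the bare action. From its definition (4.58) follows directly, using (4.40) and (4.41), avoiding the detour in I, Eqs. (91)–(98),

$$\begin{aligned} L_1^0 ={}& \left\langle c^a, D\left(\frac1\alpha\partial_\nu A^a_\nu - mB^a\right)\right\rangle - \left\langle\frac{\delta L^0}{\delta\bar c^a}, \sigma_{0,\Lambda_0}\left(\frac1\alpha\partial_\nu A^a_\nu - mB^a\right)\right\rangle\\ &+ \sum_\tau\langle\varphi_\tau, D_\tau L^0_{\gamma_\tau}\rangle - \langle\bar c^a, D L^0_{\omega^a}\rangle + \sum_\tau\left\langle\frac{\delta L^0}{\delta\varphi_\tau}, \sigma_{0,\Lambda_0}L^0_{\gamma_\tau}\right\rangle - \left\langle\frac{\delta L^0}{\delta c^a}, \sigma_{0,\Lambda_0}L^0_{\omega^a}\right\rangle . \end{aligned} \tag{4.67}$$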
Moreover, to restore the Slavnov–Taylor identities we shall rely on proper vertex functions, too. Therefore, the violated form in terms of these functions is derived here as well. In the following, all functionals appearing should carry the superscript $0, \Lambda_0$, which is omitted, cf. (4.64). Considering the generating functional of the truncated Schwinger functions

$$e^{\frac1\hbar\tilde W(K,\xi)} = \frac{\tilde Z(K,\xi)}{\tilde Z(0,0)} , \tag{4.68}$$

it follows from (4.60), together with (4.61) and (4.62) and using notation defined in (4.64), that

$$D_{\Lambda_0}\tilde W(K,\xi)\Big|_{\xi=0} = -L_1(\varphi_\tau, c^a, \bar c^a) , \tag{4.69}$$

with arguments according to (4.63). Because of the inherent symmetries, the functional $L$, and hence also $W$, contain only one 1-point function, which we force to vanish by the renormalization condition

$$\frac{\delta L}{\delta h(x)}\Big|_{\Phi=0} \overset{!}{=} 0 , \quad\to\quad \frac{\delta\tilde L}{\delta h(x)}\Big|_{\Phi=0} = 0 . \tag{4.70}$$

A Legendre transformation yields the (modified) generating functional of the proper vertex functions,

$$\tilde\Gamma(\underline\varphi_\tau, \underline c^a, \underline{\bar c}^a; \xi) + \tilde W(J_\tau, \eta^a, \bar\eta^a; \xi) = \int dx\left(\sum_\tau\underline\varphi_\tau J_\tau + \bar\eta^a\underline c^a + \underline{\bar c}^a\eta^a\right) , \tag{4.71}$$
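with variables related by

$$\begin{aligned} \underline\varphi_\tau(x) &= \frac{\delta\tilde W}{\delta J_\tau(x)} , & J_\tau(x) &= \frac{\delta\tilde\Gamma}{\delta\underline\varphi_\tau(x)} ,\\ \underline c^a(x) &= \frac{\delta\tilde W}{\delta\bar\eta^a(x)} , & \bar\eta^a(x) &= -\frac{\delta\tilde\Gamma}{\delta\underline c^a(x)} ,\\ \underline{\bar c}^a(x) &= -\frac{\delta\tilde W}{\delta\eta^a(x)} , & \eta^a(x) &= \frac{\delta\tilde\Gamma}{\delta\underline{\bar c}^a(x)} . \end{aligned} \tag{4.72}$$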
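The sign conventions of the bosonic relations in (4.71) and (4.72) can be illustrated with a one-variable toy model. The sketch assumes a single commuting variable and the quadratic choice $W(J) = \frac12 C J^2$; this is a stand-in chosen for the example and ignores the anticommuting ghost variables and all interactions.

```python
# Toy illustration (one commuting variable, quadratic W); not the functionals of the text.

def W(J, C):
    """Assumed toy generating function W(J) = C J^2 / 2."""
    return 0.5 * C * J * J


def phi_of_J(J, C):
    """phi = dW/dJ, the analogue of the left column of (4.72)."""
    return C * J


def Gamma(phi, C):
    """Gamma(phi) = phi J - W(J) with J expressed through phi, cf. (4.71)."""
    J = phi / C  # inversion of phi = C J
    return phi * J - W(J, C)


def dGamma_dphi(phi, C, h=1e-6):
    """Central-difference derivative of Gamma; should reproduce J."""
    return (Gamma(phi + h, C) - Gamma(phi - h, C)) / (2.0 * h)


if __name__ == "__main__":
    C, J = 2.0, 0.7  # sample values only
    phi = phi_of_J(J, C)
    print(phi, Gamma(phi, C), dGamma_dphi(phi, C))
```

In this toy case $\Gamma(\varphi) = \varphi^2/(2C)$, so the derivative returns the source $J$, in line with the right column of (4.72) for the bosonic fields.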
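Since $\tilde W$ does not contain 1-point functions, because of (4.70), but begins with 2-point functions, the equations on the left in (4.72) imply that the variables $\underline\varphi_\tau, \underline c^a, \underline{\bar c}^a$ of $\tilde\Gamma$ vanish if the variables $J_\tau, \eta^a, \bar\eta^a$ are equal to zero. Inverting these equations provides $J_\tau, \eta^a, \bar\eta^a$ as respective functions of $\underline\varphi_\tau, \underline c^a, \underline{\bar c}^a$, to be used in the definition (4.71) of $\tilde\Gamma$. It follows that there is no 1-point proper vertex function, i.e.

$$\frac{\delta\Gamma}{\delta\underline h(x)}\Big|_{\underline\Phi=0} = 0 . \tag{4.73}$$

From the functional derivation of (4.71) with respect to the source $\gamma(x)$ at fixed $\underline\Phi$,

$$\begin{aligned} &\frac{\delta\tilde\Gamma}{\delta\gamma(x)}\Big|_{\underline\Phi} + \frac{\delta\tilde W}{\delta\gamma(x)}\Big|_K + \int dy\left(\sum_\tau\frac{\delta\tilde W}{\delta J_\tau(y)}\frac{\delta J_\tau(y)}{\delta\gamma(x)} + \frac{\delta\bar\eta^a(y)}{\delta\gamma(x)}\frac{\delta\tilde W}{\delta\bar\eta^a(y)} + \frac{\delta\eta^a(y)}{\delta\gamma(x)}\frac{\delta\tilde W}{\delta\eta^a(y)}\right)\\ &\qquad = \int dy\left(\sum_\tau\underline\varphi_\tau(y)\frac{\delta J_\tau(y)}{\delta\gamma(x)} + \frac{\delta\bar\eta^a(y)}{\delta\gamma(x)}\underline c^a(y) - \frac{\delta\eta^a(y)}{\delta\gamma(x)}\underline{\bar c}^a(y)\right) , \end{aligned}$$

we infer, because of (4.72),

$$\frac{\delta\tilde\Gamma}{\delta\gamma(x)}\Big|_{\underline\Phi} = -\frac{\delta\tilde W}{\delta\gamma(x)}\Big|_K , \tag{4.74}$$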
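and similar relations for the derivatives with respect to the sources $\gamma^a_\mu$, $\gamma^a$ and $\omega^a$. These relations are employed at $\xi = 0$. Using a notation in accord with (4.64),

$$\Gamma \equiv \tilde\Gamma^{0,\Lambda_0}\big|_{\xi=0} , \qquad \Gamma_{\gamma_\tau}(x) \equiv \frac{\delta\tilde\Gamma^{0,\Lambda_0}}{\delta\gamma_\tau(x)}\Big|_{\xi=0} , \tag{4.75}$$

the violated Slavnov–Taylor identities for proper vertex functions emerge from (4.69) via (4.72), (4.74) as

$$\sum_\tau\left\langle\frac{\delta\Gamma}{\delta\underline\varphi_\tau}, \sigma_{0,\Lambda_0}\Gamma_{\gamma_\tau}\right\rangle - \left\langle\frac{\delta\Gamma}{\delta\underline c^a}, \sigma_{0,\Lambda_0}\Gamma_{\omega^a}\right\rangle - \left\langle\frac1\alpha\partial_\nu\underline A^a_\nu - m\underline B^a, \sigma_{0,\Lambda_0}\frac{\delta\Gamma}{\delta\underline{\bar c}^a}\right\rangle = \Gamma_1(\underline\varphi_\tau, \underline c^a, \underline{\bar c}^a) , \tag{4.76}$$

with

$$\Gamma_1(\underline\varphi_\tau, \underline c^a, \underline{\bar c}^a) = L_1(\varphi_\tau, c^a, \bar c^a) . \tag{4.77}$$

In (4.77) the variables are related, suppressing the superscript $0, \Lambda_0$ of the propagators, as

$$\varphi_\tau(x) = \int dy\, C_\tau(x-y)\frac{\delta\Gamma}{\delta\underline\varphi_\tau(y)} , \qquad c^a(x) = -\int dy\, S(x-y)\frac{\delta\Gamma}{\delta\underline{\bar c}^a(y)} , \qquad \bar c^a(x) = \int dy\,\frac{\delta\Gamma}{\delta\underline c^a(y)}S(y-x) . \tag{4.78}$$

Comparing (4.67) with (4.76) we observe that $L_1^0$ and $\Gamma_1$ have the same form! The apparently additional terms in $L_1^0$ result from the quadratic part of the classical action, which by definition is excluded from the bare interaction $L^0$, but is contained in $\Gamma$. The relevant part of $\Gamma_1$, and hence of $L_1^0$, is listed in I, Appendix C, (I–XXIX), which consists of 53 different local parts. Due to our new cutoff function, however, the following simplifications occur here: (i) All conditions whose numbering carries a superscript zero are deleted, since no irrelevant bare terms I, (107) and (108) have been introduced, (ii) $\dot\sigma = 0$, cf. (4.25), (iii) The symbols Σ(0) have to be read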
as $\dot\Sigma(0)$, cf. our Appendix A. In the case of $\Gamma_1$ there appear contributions from irrelevant terms of $\Gamma$, indistinctly denoted by “irr”. They have to be deleted if one reads the list as the relevant part of $L_1^0$.

4.4. Restoration of the Slavnov–Taylor identities

The systems of amputated truncated Schwinger functions and of proper vertex functions are equivalent formulations of the theory. To analyze the Slavnov–Taylor identities, however, turns out to be simpler in the case of the proper vertex functions. Starting from the violated Slavnov–Taylor identities (4.76), the restoration is accomplished if a set of the 37 + 7 relevant parameters of the theory and the BRS-variations can be given such that the limit

$$\lim_{\Lambda_0\to\infty}\Gamma_1^{0,\Lambda_0}(\underline\varphi_\tau, \underline c^a, \underline{\bar c}^a) = 0 \tag{4.79}$$

results. To achieve this we have also to resort to the functionals $L_1$, (4.66), and $L_1^0$, (4.67), derived as well as $\Gamma_1$ in the foregoing section. This triplet of functionals is invoked to uncover linear dependences in the relevant parts of $\Gamma$ and $L^0$. To this end we shall establish termwise equivalence relations between the relevant parts of $\Gamma_1$ and $L_1^0$. In these relations the functional $L_1$ acts as a connecting link. In the sequel we keep the notation introduced in (4.64), (4.77) for the various functionals and use in addition the shorthand

$$\partial^w\delta^n_\Phi F\big|_0 \tag{4.80}$$

where $\Phi$ should always be read as the Fourier transformed field $\hat\Phi(p)$, to denote the $|n|$-fold field derivative of the functional $F$ (which is an L- or a Γ-functional) corresponding to the multiindex $n$ of fields$^{i}$ $\hat\Phi$, evaluated at $\hat\Phi = 0$, then followed by removing the momentum δ-function and afterwards performing the momentum derivative $\partial^w$. Setting furthermore in (4.80) all momenta equal to zero is written as

$$\partial^w\delta^n_\Phi F\big|_{0,0} , \tag{4.81}$$

and finally, in a loop expansion of (4.81), the coefficient of loop order $l$ is denoted by

$$\partial^w\delta^n_\Phi F\big|_{0,0,l} . \tag{4.82}$$

In accord with (4.70), (4.73) we require for the possible 1-point function in each loop order $l \ge 1$ the renormalization conditions

$$\kappa_l := \delta_h\Gamma\big|_{0,l} \overset{!}{=} 0 , \qquad \delta_h L\big|_{0,l} \overset{!}{=} 0 . \tag{4.83}$$
(The 1-point functions are constants.) We first present two Lemmata invoked later to establish the limit (4.79). For the multiindices $n, w$ a somewhat hybrid convention is used: $n' \subset n$ means strict inclusion in the set-theoretic sense, and correspondingly for $(n, w)$.

$^{i}$ In the case of a Γ-functional the fields $\underline{\hat\Phi}$ appear.

Lemma 1. Let $l, n, w$ be given and assume (4.83). If
(i) for $l' < l$ and $(n', w') \subseteq (n, w)$,
(ii) for $l' = l$ and $(n', w') \subset (n, w)$:

$$\partial^{w'}\delta^{n'}_\Phi\Gamma_1\big|_{0,0,l'} = 0$$

is valid, then

$$\partial^w\delta^n_\Phi\Gamma_1\big|_{0,0,l} = 0 \;\Leftrightarrow\; \partial^w\delta^n_\Phi L_1\big|_{0,0,l} = 0 . \tag{4.84}$$

Lemma 2. Let $l, n, w$ with $|n| + |w| \le 5$ be given and (4.83) satisfied. Assuming
(i) for $l' < l$, $(n', w')$ with $|n'| + |w'| \le 5$:

$$\partial^{w'}\delta^{n'}_\Phi\Gamma_1\big|_{0,0,l'} = 0 , \qquad \partial^{w'}\delta^{n'}_\Phi L_1^0\big|_{0,0,l'} = 0 ,$$

(ii) for $l' = l$, $n' \subset n$ and $(n', w') \subset (n, w)$:

$$\partial^{w'}\delta^{n'}_\Phi L_1\big|_{0,0,l} = 0 , \qquad \partial^{w'}\delta^{n'}_\Phi L_1^0\big|_{0,0,l} = 0 ,$$

then follows the equality

$$\partial^w\delta^n_\Phi L_1\big|_{0,0,l} = \partial^w\delta^n_\Phi L_1^0\big|_{0,0,l} . \tag{4.85}$$
Proofs are given in I, p. 501 and pp. 503–504, respectively. After these preparations we turn to the proof of (4.79). Our first goal is to determine renormalization conditions for the functional $\Gamma$ in such a way that the relevant part of $\Gamma_1$ vanishes, proceeding inductively in the loop order $l$. The Lemmata indicate that we should ascend for given $l$ in the number $|n|$ of fields, and for given $l, n$ ascend in $|w|$. Requiring the relevant part of $\Gamma_1$ to vanish amounts to the 53 conditions listed in Appendix C of I; however, observe the qualifications stated at the end of Sec. 4.3. The loop order $l = 0$ of these conditions$^{j}$ is satisfied by the (classical) parameters displayed in our Appendices A and B, i.e.

$$\Gamma_{1,l=0}\big|_{\rm rel} = 0 , \tag{4.86}$$

and hence we also have

$$L^0_{1,l=0}\big|_{\rm rel} = 0 . \tag{4.87}$$

$^{j}$ There are no contributions “irr” in the order $l = 0$.

Induction hypothesis. Given $l \in \mathbb{N}$, we assume for all loop orders $l' < l$ the functional $\Gamma$ (or equivalently the L-functional) to be renormalized according to (A2) stated after Eq. (4.37) such that

$$\partial^w\delta^n_\Phi\Gamma_1\big|_{0,0,l'} = 0 , \qquad \partial^w\delta^n_\Phi L_1^0\big|_{0,0,l'} = 0 \tag{4.88}$$

is satisfied for all $(n, w)$ with $|n| + |w| \le 5$.

Theorem. The induction hypothesis holds in the loop order $l$.
Proof. The 37 + 7 relevant parameters appearing in the theory and in the BRS-transformations are defined in Appendix A and B, respectively. In the loop order $l$ we fix a priori 26 of them as follows, suppressing an index $l$:

(A) Besides $\kappa_l = 0$, (4.83), already fixed before, we choose freely in $\Gamma^{0,\Lambda_0}(\Phi)$:

$$\Sigma_{\rm trans} ,\; \Sigma_{\rm long} ,\; \dot\Sigma^{BB} ,\; \dot\Sigma^{\bar cc} ,\; \Sigma^{AB} ,\; F^{BBh} ,\; F^{AAA} ,\; R_3 . \tag{4.89}$$

This means that the normalizations of all fields except the Higgs field,$^{k}$ the two couplings $F^{AAA}$ and $F^{BBh}$ and one global normalization for the BRS-transformations are freely chosen.

Moreover, we restrict the bare functional $\tilde L^{\Lambda_0,\Lambda_0}$, (4.40), by requiring (!):

(B) the bare parameters of the BRS-insertions (4.38) to obey

$$R_6^0 = R_7^0 = R_2^0 , \qquad R_5^0 = \frac{(R_2^0)^2}{R_3^0} , \tag{4.90}$$

(C) the 11 $r_0$-terms listed in Appendix A which have no correspondence in the loop order $l = 0$ to vanish, i.e.

$$r_{20}^{hBA} = \cdots = r_0^{\bar c\bar ccc} = 0 , \tag{4.91}$$

(D) and finally the additional relations

$$F_0^{BBA} = -\frac{R_3^0}{2R_2^0}F_{10}^{hBA} , \qquad F_0^{\bar ccB} = -\frac{R_3^0}{R_2^0}F_0^{\bar cch} , \qquad F_0^{AAhh} = \frac{R_5^0}{R_3^0}F_{10}^{AABB} . \tag{4.92}$$
Since by the conditions (B)–(D) we have fixed 17 relevant parameters on the “wrong side” $\Lambda = \Lambda_0$, these restrictions have to be shown to provide a finite limit theory upon removing the cutoff $\Lambda_0$. The values of the bare parameters

$$\kappa_0 ,\; \Sigma_0^{\rm trans} ,\; \Sigma_0^{\rm long} ,\; \dot\Sigma_0^{BB} ,\; \dot\Sigma_0^{\bar cc} ,\; \Sigma_0^{AB} ,\; F_0^{BBh} ,\; F_0^{AAA} ,\; R_3^0 \tag{4.93}$$

follow from the renormalization conditions chosen in (4.89). The remaining relevant parameters of $\tilde L^0 \equiv \tilde L^{\Lambda_0,\Lambda_0}$ are now determined by requiring $L_1^0|_{\rm rel}$ to vanish, taking into account the relations (B)–(D) already introduced. We list these parameters determined successively, writing in brackets the number of the particular equation of Appendix C of I from which the parameter is determined. This way each bare parameter different from (4.93) is fixed in terms of the free parameters (4.93), directly or via parameters already determined before in proceeding:

$R_1^0\,(I_b^0)$, $R_4^0\,(II_b^0)$, $R_2^0\,(III_b^0)$ → $R_6^0, R_7^0, R_5^0$, due to (B), $F_{10}^{\bar ccA}\,(III_a^0)$,
$F_0^{BBA}\,(V^0)$ → $F_{10}^{hBA}$, due to (D), $F_0^{\bar ccB}\,(IV_a^0)$ → $F_0^{\bar cch}$, due to (D),
$\Sigma_0^{\bar cc}\,(VIII_a^0)$, $\delta m_0^2\,(I_a^0)$, $\Sigma_0^{BB}\,(II_a^0)$, $\Sigma_0^{hh}\,(VII_a^0)$, $F_0^{AAh}\,(VI_b^0)$, $\dot\Sigma_0^{hh}\,(VII_b^0)$.

$^{k}$ The field $h$ emerges together with the field $B^a$ from the complex scalar doublet $\phi$, (4.5).
There are 4 linearly dependent equations: Denoting by $\{X\}$ the content of the bracket $\{\cdots\}$ appearing in equation $X$, and using also (B), (D) we find

$$-\alpha^{-1}\{VIII_b^0\} = R_1^0(\{III_a^0\} + \{III_b^0\}) + gR_2^0\{I_b^0\} , \tag{4.94}$$

$$R_1^0\{IV_b^0\} = 2mR_4^0\{V^0\} - gR_2^0\{II_b^0\} + m\{VIII_b^0\} , \tag{4.95}$$

$$\{VI_a^0\} = \frac{R_2^0}{R_3^0}\{IV_a^0\} , \tag{4.96}$$

$$\{VI_b^0\} + \{VII_c^0\} = \frac{R_2^0}{R_3^0}\{V^0\} . \tag{4.97}$$

Moreover, $VII_d^0$, $VIII_c^0$ are satisfied because of (C). At this stage all contributions to $L_1^0|_{\rm rel}$ involving 2 or 3 fields do vanish. We then continue as follows:

$F_0^{BBBB}\,(X^0)$, $F_0^{BBhh}\,(XX^0)$, $F_0^{hhhh}\,(XIX^0)$, $F_0^{hhh}\,(IX^0)$,
$F_{10}^{AABB}\,(XIII_2^0)$ → $F_0^{AAhh}$, due to (D), $F_{10}^{AAAA}\,(XIV_c^0)$.

Now all bare parameters entering $\tilde L^0$ are fixed. The remaining equations have to be fulfilled identically. Indeed, using (B), (C):

$$mR_4^0\{XV_{1a}^0\} = R_1^0\{XIII_2^0\} - \frac{g}{2}R_3^0\{VI_b^0\} , \tag{4.98}$$

$$R_3^0\{XVII_a^0\} = R_5^0\{XV_{1a}^0\} , \qquad \{XIV_a^0\} = -\{XIV_c^0\} . \tag{4.99}$$

Finally, one easily sees that the remaining 26 equations are satisfied because of (B), (C), (D). Thus we have achieved

$$L_1^0\big|_{\rm rel} \equiv L_1^{\Lambda_0,\Lambda_0}\big|_{\rm rel} = 0 . \tag{4.100}$$

From this property, we now determine $\Gamma_1|_{\rm rel}$, based on the induction hypothesis. The relevant parts of $L_1^0$, $L_1$, $\Gamma_1$ are formed by terms that fulfil $|n| + |w| \le 5$, i.e. involve at most $|n| = 5$ fields. Ascending in $|w|$ for given $n$, one treats first the cases $|n| = 2$: From $\partial^w L^0_{1;n} = 0$, (4.100), one infers via Lemma 2 that $\partial^w L_{1;n} = 0$, and hereupon via Lemma 1 that $\partial^w\Gamma_{1;n} = 0$ holds, too. The cases $|n| = 2$ established, treating this way successively the cases $|n| = 3, 4, 5$, we arrive at

$$\Gamma_1\big|_{\rm rel} \equiv \Gamma_1^{0,\Lambda_0}\big|_{\rm rel} = 0 . \tag{4.101}$$

Hence, the 53 equations listed in Appendix C of I are fulfilled. Given the freely chosen renormalization conditions (4.89) and $\kappa = 0$ in the loop order $l$, the remaining 37 + 7 − 9 (dependent) ones in this loop order are extracted one after the other from correspondingly chosen equations. We now list these renormalization constants in the order they are obtained, together with the label of the respective determining equation written in brackets. In this succession each determining equation then only contains parameters already determined before. Furthermore,
one easily infers in each case from the equation considered that the renormalization condition imposed in such a way stays finite in the limit $\Lambda_0 \to \infty$. We obtain

$R_1\,(I_b)$, $R_4\,(II_b)$, $(-\alpha\,\delta m^2 + \Sigma^{\bar cc})\,(I_a)$, $(\Sigma^{BB} - \Sigma^{\bar cc})\,(II_a)$, $R_2\,(III_b)$,
$(F_1^{\bar ccA} - r_2^{\bar ccA})\,(III_a)$, $F^{BBA}\,(IV_b)$, $R_6\,(V)$, $r_2^{hBA}\,(VII_d)$, $R_7\,(VIII_b)$,
$r_2^{\bar ccA}\,(VIII_c)$, $F^{\bar ccB}\,(IV_a)$, $\Sigma^{\bar cc}\,(VIII_a)$, $F_1^{hBA}\,(XV_{2a} - XV_{2b})$,
$F^{AAh}\,(VI_b)$, $F^{\bar cch}\,(VI_a)$, $R_5\,(VII_c)$, $\Sigma^{hh}\,(VII_a)$, $\dot\Sigma^{hh}\,(VII_b)$.

(Now all equations I–VIII have been used, and the difference $XV_{2a} - XV_{2b}$, to determine all $R_i$ and all terms with $|n| = 2, 3$ apart from $F^{hhh}$.)

$F_1^{AAAA}\,(XIV_c)$, $r_1^{AA\bar cc}\,(XIV_b)$, $r_2^{AA\bar cc}\,(XIV_e)$, $r_2^{AAAA}\,(XIV_a)$, $r_2^{AABB}\,(XIII_1)$,
$F_1^{AABB}\,(XV_{1a})$, $F^{AAhh}\,(XVII_a)$, $r_1^{BB\bar cc}\,(XV_{1b})$, $r_2^{BB\bar cc}\,(XV_{2c})$, $r^{hB\bar cc}\,(XI)$,
$F^{BBBB}\,(X)$, $r^{hh\bar cc}\,(XVII_b)$, $r^{\bar c\bar ccc}\,(XVIII_a)$, $F^{BBhh}\,(XX)$,
$F^{hhhh}\,(XIX)$, $F^{hhh}\,(IX)$.

Thus all 37 + 7 relevant parameters are fixed in the loop order $l$ and have finite limits in removing the cutoff $\Lambda_0$. This completes the proof.

In view of the Theorem and Lemma 1, the restoration of the Slavnov–Taylor identities is finally accomplished if the irrelevant parts of $\Gamma_1^{0,\Lambda_0}$ or of $L_1^{0,\Lambda_0}$ are shown to vanish, too, sending the UV-cutoff $\Lambda_0$ to infinity. This behavior follows from the

Proposition. Let $l \in \mathbb{N}_0$, $|w| \le 4$, $n$ a multiindex and $0 \le \Lambda \le \Lambda_0$, then

$$\left|\partial^w L^{\Lambda,\Lambda_0}_{1;l,n}(p_1, \ldots, p_{|n|})\right| \le \frac{(\Lambda + m)^{5+1-|n|-|w|}}{\Lambda_0}\left(\log\frac{\Lambda_0}{m}\right)^{\nu} P\!\left(\frac{|p_i|}{\Lambda + m}\right) , \tag{4.102}$$

with nonnegative integers $\nu$ and polynomials $P$ as before.

The proof is given in [84]. We just recall that all irrelevant terms occurring in the BRS-variation $L_1^{\Lambda_0,\Lambda_0}$, (4.67), of the bare interaction result from momentum derivatives of the cutoff function $\sigma_{0,\Lambda_0}(k^2)$, (4.26), and have $|n| \le 5$. The Proposition now directly implies vanishing limits

$$\lim_{\Lambda_0\to\infty}L^{0,\Lambda_0}_{1;l,n}(p_1, \ldots, p_{|n|}) = 0 \tag{4.103}$$

for all sets of fields $n$, (4.34), and in all loop orders $l$. Thus, the Slavnov–Taylor identities are restored upon removing the UV-cutoff $\Lambda_0$. This completes the proof of perturbative renormalizability of the spontaneously broken Yang–Mills theory based on flow equations.
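The step from the bound (4.102) to the vanishing limit (4.103) rests on the fact that $\Lambda_0^{-1}(\log\Lambda_0/m)^{\nu}$ tends to zero for any fixed $\nu$. The following sketch tabulates the $\Lambda_0$-dependent prefactor of (4.102) at $\Lambda = 0$ for fixed momenta, so that the polynomial $P$ is a constant and is left out. The value $\nu = 3$ and the sample cutoffs are assumptions made for illustration only; the text does not specify $\nu$.

```python
import math


def bound_prefactor(lam0, lam, m, n_abs, w_abs, nu):
    """Lambda_0-dependent prefactor of (4.102), with the polynomial P omitted."""
    power = 5 + 1 - n_abs - w_abs
    return (lam + m) ** power / lam0 * math.log(lam0 / m) ** nu


def prefactor_table(cutoffs, lam=0.0, m=1.0, n_abs=2, w_abs=0, nu=3):
    """Prefactor values for a list of UV cutoffs (sample parameters only)."""
    return [bound_prefactor(lam0, lam, m, n_abs, w_abs, nu) for lam0 in cutoffs]


if __name__ == "__main__":
    for lam0, value in zip([1e2, 1e4, 1e6, 1e8], prefactor_table([1e2, 1e4, 1e6, 1e8])):
        print(lam0, value)
```

The logarithmic factor only delays the decay: once $\Lambda_0/m$ exceeds $e^{\nu}$, the prefactor decreases monotonically with $\Lambda_0$.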
In retrospect, one notes that we used only flow equations for L-functionals, i.e. for $L^{\Lambda,\Lambda_0}$, $\hat L_\xi^{\Lambda,\Lambda_0}(q)$, $L_1^{\Lambda,\Lambda_0}$. There are flow equations for Γ-functionals, too, albeit of a more implicit form. The only reason we passed in an intermediate stage to vertex functions was to facilitate extracting the relevant parameters of the theory upon analyzing the relevant part of the violated Slavnov–Taylor identities. This task is greatly reduced dealing with the vertex functions, indeed.

Appendix A. The Relevant Part of Γ

The bare functional $L^{\Lambda_0,\Lambda_0}$ and the relevant part of the generating functional $\Gamma^{\Lambda,\Lambda_0}$ for the proper vertex functions have the same general form. We present the latter and give the tree order of both explicitly. The cutoff symbols $\Lambda$ and $\Lambda_0$ are suppressed. We write

$$\Gamma(A, h, B, \bar c, c) = \sum_{|n|=1}^{4}\Gamma_{|n|} + \Gamma^{(|n|>4)} ,$$

$|n|$ counting the number of fields, and extract its relevant part, i.e. its local field content with mass dimension not greater than four. Generally we will not underline the field variable symbols in the Appendices, though of course all arguments in the Γ-functional should appear underlined. The modification to obtain the bare functional $L^{\Lambda_0,\Lambda_0}$ is stated at the end.

(1) One-point function:

$$\Gamma_1 = \kappa\,\hat h(0) .$$

(2) Two-point functions:

$$\Gamma_2 = \int_p\Big\{\frac12 A^a_\mu(p)A^a_\nu(-p)\Gamma^{AA}_{\mu\nu}(p) + \frac12 h(p)h(-p)\Gamma^{hh}(p) + \frac12 B^a(p)B^a(-p)\Gamma^{BB}(p) - \bar c^a(p)c^a(-p)\Gamma^{\bar cc}(p) + A^a_\mu(p)B^a(-p)\Gamma^{AB}_\mu(p)\Big\} ,$$

$$\Gamma^{AA}_{\mu\nu}(p) = \delta_{\mu\nu}(m^2 + \delta m^2) + (p^2\delta_{\mu\nu} - p_\mu p_\nu)(1 + \Sigma_{\rm trans}(p^2)) + \frac1\alpha\,p_\mu p_\nu(1 + \Sigma_{\rm long}(p^2)) ,$$

$$\Gamma^{hh}(p) = p^2 + M^2 + \Sigma^{hh}(p^2) , \qquad \Gamma^{BB}(p) = p^2 + \alpha m^2 + \Sigma^{BB}(p^2) ,$$

$$\Gamma^{\bar cc}(p) = p^2 + \alpha m^2 + \Sigma^{\bar cc}(p^2) , \qquad \Gamma^{AB}_\mu(p) = ip_\mu\Sigma^{AB}(p^2) .$$
Besides the unregularized tree order explicitly stated, there emerge 10 relevant parameters from the various self-energies:

\[
\delta m^2,\ \Sigma^{\mathrm{trans}}(0),\ \Sigma^{\mathrm{long}}(0),\ \Sigma^{hh}(0),\ \dot\Sigma^{hh}(0),\ \Sigma^{BB}(0),\ \dot\Sigma^{BB}(0),\ \Sigma^{\bar c c}(0),\ \dot\Sigma^{\bar c c}(0),\ \Sigma^{AB}(0) ,
\]

where the notation $\dot\Sigma(0) \equiv (\partial_{p^2}\Sigma)(0)$ has been used. We note that in transforming the regularized $L$-functional into the corresponding $\Gamma$-functional the inverses of the regularized propagators (4.26) become the 2-point functions of the latter in the tree order $l = 0$. The factor $(\sigma_{\Lambda,\Lambda_0}(p^2))^{-1}$ thus appearing, however, does not contribute to the relevant part due to the property (4.25).

(3) Three-point functions: Only the relevant part is given explicitly: $r \in O(\hbar)$ denotes a relevant parameter which vanishes in the tree order, otherwise a relevant parameter is denoted by $F$. Moreover, we indicate an irrelevant part by a symbol $O_n$, $n \in \mathbb{N}$, indicating that this part vanishes as an $n$th power of the momentum in the limit when all momenta tend to zero homogeneously.

\[
\begin{aligned}
\Gamma_3 = \int_p \int_q \Bigl\{ & \epsilon^{rst} A^r_\mu(p) A^s_\nu(q) A^t_\lambda(-p-q)\,\Gamma^{AAA}_{\mu\nu\lambda}(p,q) + A^r_\mu(p) A^r_\nu(q) h(-p-q)\,\Gamma^{AAh}_{\mu\nu}(p,q) \\
& + \epsilon^{rst} B^r(p) B^s(q) A^t_\mu(-p-q)\,\Gamma^{BBA}_\mu(p,q) + h(p) B^r(q) A^r_\mu(-p-q)\,\Gamma^{hBA}_\mu(p,q) \\
& + \epsilon^{rst} \bar c^r(p) c^s(q) A^t_\mu(-p-q)\,\Gamma^{\bar c cA}_\mu(p,q) + B^r(p) B^r(q) h(-p-q)\,\Gamma^{BBh}(p,q) \\
& + h(p) h(q) h(-p-q)\,\Gamma^{hhh}(p,q) + \bar c^r(p) c^r(q) h(-p-q)\,\Gamma^{\bar c ch}(p,q) \\
& + \epsilon^{rst} \bar c^r(p) c^s(q) B^t(-p-q)\,\Gamma^{\bar c cB}(p,q) \Bigr\} ,
\end{aligned}
\]

\[
\begin{aligned}
\Gamma^{AAA}_{\mu\nu\lambda}(p,q) &= \delta_{\mu\nu}\, i(p-q)_\lambda F^{AAA} + O_3 , & F^{AAA} &= -\tfrac12 g + r^{AAA} , \\
\Gamma^{AAh}_{\mu\nu}(p,q) &= \delta_{\mu\nu} F^{AAh} + O_2 , & F^{AAh} &= \tfrac12 m g + r^{AAh} , \\
\Gamma^{BBA}_\mu(p,q) &= i(p-q)_\mu F^{BBA} + O_3 , & F^{BBA} &= -\tfrac14 g + r^{BBA} , \\
\Gamma^{hBA}_\mu(p,q) &= i(p-q)_\mu F_1^{hBA} + i(p+q)_\mu r_2^{hBA} + O_3 , & F_1^{hBA} &= \tfrac12 g + r_1^{hBA} , \\
\Gamma^{\bar c cA}_\mu(p,q) &= i p_\mu F_1^{\bar c cA} + i q_\mu r_2^{\bar c cA} + O_3 , & F_1^{\bar c cA} &= g + r_1^{\bar c cA} , \\
\Gamma^{BBh}(p,q) &= F^{BBh} + O_2 , & F^{BBh} &= \tfrac14 g \frac{M^2}{m} + r^{BBh} , \\
\Gamma^{hhh}(p,q) &= F^{hhh} + O_2 , & F^{hhh} &= \tfrac14 g \frac{M^2}{m} + r^{hhh} ,
\end{aligned}
\]
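\[
\begin{aligned}
\Gamma^{\bar c ch}(p,q) &= F^{\bar c ch} + O_2 , & F^{\bar c ch} &= -\tfrac12 \alpha g m + r^{\bar c ch} , \\
\Gamma^{\bar c cB}(p,q) &= F^{\bar c cB} + O_2 , & F^{\bar c cB} &= \tfrac12 \alpha g m + r^{\bar c cB} .
\end{aligned}
\]

The 3-point functions $AAB$ and $BBB$ have no relevant local content.

(4) Four-point functions: With parameters $r$ and $F$ defined as before

\[
\begin{aligned}
\Gamma_4|_{\mathrm{rel}} = \int_k \int_p \int_q \Bigl\{ & \epsilon^{abc}\epsilon^{ars} A^b_\mu(k) A^c_\nu(p) A^r_\mu(q) A^s_\nu(-k-p-q)\, F_1^{AAAA} \\
& + A^r_\mu(k) A^r_\mu(p) A^s_\nu(q) A^s_\nu(-k-p-q)\, r_2^{AAAA} \\
& + A^a_\mu(k) A^b_\mu(p) \bar c^r(q) c^s(-k-p-q)\,\bigl(\delta^{ab}\delta^{rs} r_1^{AA\bar c c} + \delta^{ar}\delta^{bs} r_2^{AA\bar c c}\bigr) \\
& + A^a_\mu(k) A^b_\mu(p) B^r(q) B^s(-k-p-q)\,\bigl(\delta^{ab}\delta^{rs} F_1^{AABB} + \delta^{ar}\delta^{bs} r_2^{AABB}\bigr) \\
& + B^a(k) B^b(p) \bar c^r(q) c^s(-k-p-q)\,\bigl(\delta^{ab}\delta^{rs} r_1^{BB\bar c c} + \delta^{ar}\delta^{bs} r_2^{BB\bar c c}\bigr) \\
& + h(k) h(p) h(q) h(-k-p-q)\, F^{hhhh} + B^r(k) B^r(p) h(q) h(-k-p-q)\, F^{BBhh} \\
& + B^r(k) B^r(p) B^s(q) B^s(-k-p-q)\, F^{BBBB} + A^r_\mu(k) A^r_\mu(p) h(q) h(-k-p-q)\, F^{AAhh} \\
& + h(k) h(p) \bar c^r(q) c^r(-k-p-q)\, r^{hh\bar c c} + \bar c^a(k) c^a(p) \bar c^r(q) c^r(-k-p-q)\, r^{\bar c c\bar c c} \\
& + \epsilon^{rst} h(k) B^r(p) \bar c^s(q) c^t(-k-p-q)\, r^{hB\bar c c} \Bigr\} ,
\end{aligned}
\]

\[
\begin{aligned}
F_1^{AAAA} &= \tfrac14 g^2 + r_1^{AAAA} , & F_1^{AABB} &= \tfrac18 g^2 + r_1^{AABB} , \\
F^{hhhh} &= \tfrac{1}{32} g^2 \Bigl(\frac{M}{m}\Bigr)^2 + r^{hhhh} , & F^{BBhh} &= \tfrac{1}{16} g^2 \Bigl(\frac{M}{m}\Bigr)^2 + r^{BBhh} , \\
F^{BBBB} &= \tfrac{1}{32} g^2 \Bigl(\frac{M}{m}\Bigr)^2 + r^{BBBB} , & F^{AAhh} &= \tfrac18 g^2 + r^{AAhh} .
\end{aligned}
\]

Hence, in total $\Gamma$ involves 1 + 10 + 11 + 15 = 37 relevant parameters. After deleting in the two-point functions the contributions of the order $l = 0$, i.e. keeping only the 10 parameters which appear in the various self-energies, we have the form of the bare functional $L^{\Lambda_0,\Lambda_0}$, and its order $l = 0$ also given explicitly.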
Appendix B. The Relevant Part of the BRS-Insertions

We also have to consider the vertex functions (4.74) and (4.75) with one operator insertion, generated by the BRS-variations. These insertions have mass dimension
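D = 2. Performing the Fourier-transform

\[
\hat\Gamma_\gamma^{0,\Lambda_0}(q) = \int dx\, e^{iqx}\, \Gamma_\gamma^{0,\Lambda_0}(x)
\]

and similarly in the other cases, we list the respective relevant part of these four vertex functions with one insertion, suppressing the superscript $0, \Lambda_0$:

\[
\begin{aligned}
\hat\Gamma_{\gamma^a_\mu}(q)|_{\mathrm{rel}} &= -i q_\mu\, c^a(-q)\, R_1 + \epsilon^{arb} \int_k A^r_\mu(k)\, c^b(-q-k)\, g R_2 , \\
\hat\Gamma_{\gamma}(q)|_{\mathrm{rel}} &= \int_k B^r(k)\, c^r(-q-k)\,\Bigl(-\tfrac12 g R_3\Bigr) , \\
\hat\Gamma_{\gamma^a}(q)|_{\mathrm{rel}} &= m\, c^a(-q)\, R_4 + \int_k h(k)\, c^a(-q-k)\, \tfrac12 g R_5 + \epsilon^{arb} \int_k B^r(k)\, c^b(-q-k)\, \tfrac12 g R_6 , \\
\hat\Gamma_{\omega^a}(q)|_{\mathrm{rel}} &= \epsilon^{ars} \int_k c^r(k)\, c^s(-q-k)\, \tfrac12 g R_7 .
\end{aligned}
\]

There appear 7 relevant parameters:

\[
R_i = 1 + r_i , \qquad r_i = O(\hbar) , \qquad i = 1, \dots, 7 .
\]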
All the other 2-point functions, and the higher ones, of course, are of irrelevant type.

Acknowledgments

The author is much indebted to Christoph Kopper for his encouragement and gratefully acknowledges his careful reading of an earlier version of the manuscript and his numerous valuable suggestions. Thanks are also due to the anonymous referees for constructive proposals.

References

[1] F. J. Dyson, The Radiation theories of Tomonaga, Schwinger, and Feynman, Phys. Rev. 75 (1949), 486–502.
[2] F. J. Dyson, The S matrix in quantum electrodynamics, Phys. Rev. 75 (1949), 1736–1755.
[3] N. N. Bogoliubov and O. S. Parasiuk, Über die Multiplikation der Kausalfunktionen in der Quantentheorie der Felder, Acta Math. 97 (1957), 227–266.
[4] K. Hepp, Théorie de la Renormalisation, Lecture Notes in Physics, Springer, Berlin, 1969.
[5] W. Zimmermann, Local Operator Products and Renormalization in Quantum Field Theory, in Lectures on Elementary Particles and Quantum Field Theory, 1970 Brandeis University Summer Institute in Theoretical Physics, Vol. 1, M.I.T. Press, Cambridge, 1970.
[6] W. Pauli and F. Villars, On invariant regularization in relativistic quantum theory, Rev. Mod. Phys. 21 (1949), 434–444.
[7] E. Speer, Generalized Feynman Amplitudes, Princeton University, Princeton, 1969.
[8] G. ’t Hooft and M. Veltman, Regularization and renormalization of Gauge fields, Nucl. Phys. B44 (1972), 189–213.
[9] C. G. Bollini and J. J. Giambiagi, Lowest order “divergent” Graphs in ν-dimensional space, Phys. Lett. 40B (1972), 566–568.
[10] P. Breitenlohner and D. Maison, Dimensional Renormalization and the Action Principle, Comm. Math. Phys. 52 (1977), 11–38.
[11] P. Breitenlohner and D. Maison, Dimensionally renormalized Green’s functions for theories with massless particles. I, II, Comm. Math. Phys. 52 (1977), 39–54, 55–75.
[12] A. S. Wightman and G. Velo (eds.), Renormalization Theory, Erice 1975, Reidel, 1976.
[13] W. Zimmermann, Convergence of Bogoliubov’s method of renormalization in momentum space, Comm. Math. Phys. 15 (1969), 208–234.
[14] J. H. Lowenstein, Convergence theorems for renormalized Feynman integrals with zero-mass propagators, Comm. Math. Phys. 47 (1976), 53–68.
[15] G. ’t Hooft, Renormalization of massless Yang–Mills fields, Nucl. Phys. B33 (1971), 173–199.
[16] G. ’t Hooft, Renormalizable Lagrangians for massive Yang–Mills fields, Nucl. Phys. B35 (1971), 167–188.
[17] G. ’t Hooft and M. Veltman, Combinatorics of Gauge fields, Nucl. Phys. B50 (1972), 318–353.
[18] K. G. Wilson, Renormalization group and critical phenomena. I. renormalization group and the Kadanoff scaling picture, Phys. Rev. B4 (1971), 3174–3183.
[19] K. G. Wilson, Renormalization group and critical phenomena. II. phase-space-cell analysis of critical behaviour, Phys. Rev. B4 (1971), 3184–3205.
[20] K. G. Wilson and J. Kogut, The Renormalization Group and the ε Expansion, Phys. Rep. 12C (1974), 75–199.
[21] K. Gawedzki and A. Kupiainen, Gross-Neveu model through convergent perturbation expansions, Comm. Math. Phys. 102 (1985), 1–30.
[22] J. Feldman, J. Magnen, V. Rivasseau and R. Sénéor, A renormalizable field theory: the massive Gross-Neveu model in two dimensions, Comm. Math. Phys. 103 (1986), 67–103.
[23] D. C. Brydges, Functional Integrals and their Applications, Lausanne lectures 1992, mp-arc 93–24.
[24] V. Rivasseau, Constructive Renormalization Theory, arXiv: math-ph/9902023.
[25] V. Rivasseau, From Perturbative to Constructive Renormalization, Princeton University Press, 1991.
[26] G. Gallavotti and F. Nicolo, Renormalization theory in four-dimensional scalar fields I, II, Comm. Math. Phys. 100 (1985), 545–590; 101 (1986), 247–282.
[27] G. Gallavotti, Renormalization theory and ultraviolet stability for scalar fields via renormalization group methods, Rev. Mod. Phys. 57 (1985), 471–562.
[28] J. S. Feldman, T. R. Hurd, L. Rosen and J. D. Wright, QED: A proof of renormalizability, Springer, Lecture Notes in Physics 312, 1988.
[29] J. Polchinski, Renormalization And Effective Lagrangians, Nucl. Phys. B231 (1984), 269–295.
[30] P. K. Mitter and T. R. Ramadas, The two-dimensional O(N) nonlinear σ-model: renormalisation and effective actions, Comm. Math. Phys. 122 (1989), 575–596.
[31] G. Keller, Ch. Kopper and M. Salmhofer, Perturbative renormalization and effective Lagrangians in $\Phi^4_4$, Helv. Phys. Acta 65 (1992), 32–52.
[32] G. Keller and Ch. Kopper, Perturbative renormalization of composite operators via flow equations I, Comm. Math. Phys. 148 (1992), 445–467.
[33] G. Keller and Ch. Kopper, Perturbative renormalization of composite operators via flow equations II: Short distance expansion, Comm. Math. Phys. 153 (1993), 245–276.
[34] Ch. Wieczerkowski, Symanzik’s improved actions from the viewpoint of the renormalization group, Comm. Math. Phys. 120 (1988), 148–176.
[35] G. Keller, The perturbative construction of Symanzik’s improved action for $\Phi^4_4$ and QED$_4$, Helv. Phys. Acta 66 (1993), 453–470.
[36] G. Keller, Local Borel summability of Euclidean $\Phi^4_4$: A simple proof via differential flow equations, Comm. Math. Phys. 161 (1994), 311–323.
[37] G. Keller and Ch. Kopper, Perturbative renormalization of massless $\Phi^4_4$ with flow equations, Comm. Math. Phys. 161 (1994), 515–532.
[38] G. Keller and Ch. Kopper, Perturbative renormalization of QED via flow equations, Phys. Lett. B273 (1991), 323–332.
[39] G. Keller and Ch. Kopper, Renormalizability proof for QED based on flow equations, Comm. Math. Phys. 176 (1996), 193–226.
[40] C. Kim, A renormalization group flow approach to decoupling and irrelevant operators, Ann. Phys. (N.Y.) 243 (1995), 117–143.
[41] G. Keller, Ch. Kopper and C. Schophaus, Perturbative renormalization with flow equations in Minkowski space, Helv. Phys. Acta 70 (1997), 247–274.
[42] Ch. Kopper and V. F. Müller, Renormalization proof for spontaneously broken Yang–Mills theory with flow equations, Comm. Math. Phys. 209 (2000), 477–516.
[43] Ch. Kopper, V. F. Müller and Th. Reisz, Temperature independent renormalization of finite temperature field theory, Ann. Henri Poincaré 2 (2001), 387–402.
[44] M. Salmhofer, Renormalization, An Introduction, Springer, 1999.
[45] M. Bonini, M. D’Attanasio and G. Marchesini, Perturbative renormalization and infrared finiteness in the Wilson renormalization group: the massless scalar case, Nucl. Phys. B409 (1993), 441–464.
[46] Ch. Wetterich, Exact evolution equation for the effective potential, Phys. Lett. B301 (1993), 90–94.
[47] M. Bonini, M. D’Attanasio and G. Marchesini, Ward identities and Wilson renormalization group in QED, Nucl. Phys. B418 (1994), 81–112.
[48] M. Bonini, M. D’Attanasio and G. Marchesini, Renormalization group flow for SU(2) Yang–Mills theory and gauge invariance, Nucl. Phys. B421 (1994), 429–455.
[49] M. Bonini, M. D’Attanasio and G. Marchesini, BRS-symmetry for Yang–Mills theory and exact renormalization group, Nucl. Phys. B437 (1994), 163–186.
[50] L. Girardello and A. Zaffaroni, Exact renormalization group equation and decoupling in quantum field theory, Nucl. Phys. B424 (1994), 219–238.
[51] C. Becchi, On the construction of renormalized gauge theories using renormalization group techniques, arXiv: hep-th/9607188.
[52] M. Bonini and F. Vian, Chiral gauge theories and anomalies in the Wilson renormalization group approach, Nucl. Phys. B511 (1998), 479–494.
[53] M. Bonini and F. Vian, Wilson renormalization group for supersymmetric gauge theories and gauge anomalies, Nucl. Phys. B532 (1998), 473–497.
[54] M. D’Attanasio and T. R. Morris, Gauge invariance, the quantum action principle, and the renormalization group, Phys. Lett. B378 (1996), 213–221.
[55] S. Arnone, Y. A. Kubyshin, T. R. Morris and J. F. Tighe, A gauge invariant regulator for the ERG, Int. J. Mod. Phys. A16 (2001), 1989.
[56] U. Ellwanger, Flow equations and BRS invariance for Yang–Mills theories, Phys. Lett. B335 (1994), 364–370.
[57] U. Ellwanger, M. Hirsch and A. Weber, Flow equations for the relevant part of the pure Yang–Mills action, Z. Phys. C69 (1996), 687–697.
[58] U. Ellwanger, Confinement, monopoles and Wilsonian effective action, Nucl. Phys. B531 (1998), 593–612.
[59] M. Reuter and C. Wetterich, Effective average action for gauge theories and exact evolution equations, Nucl. Phys. B417 (1994), 181–214.
[60] M. Reuter and C. Wetterich, Exact evolution equation for scalar electrodynamics, Nucl. Phys. B427 (1994), 291–324.
[61] M. Reuter and C. Wetterich, Gluon condensation in nonperturbative flow equations, Phys. Rev. D56 (1997), 7893–7916.
[62] H. Gies, Running coupling in Yang–Mills theory: a flow equation study, Phys. Rev. D66 (2002), 025006.
[63] O. Lauscher and M. Reuter, Towards nonperturbative renormalizability of quantum Einstein gravity, Int. J. Mod. Phys. A17 (2002), 993–1002.
[64] O. Lauscher and M. Reuter, Is quantum Einstein gravity nonperturbatively renormalizable? Class. Quant. Grav. 19 (2002), 483–492.
[65] C. Bagnuls and C. Bervillier, Exact renormalization group equations. An introductory review, Phys. Rep. 348 (2001), 91–157.
[66] Ch. Kopper, Renormierungstheorie mit Flussgleichungen, Shaker Verlag, Aachen, 1998.
[67] J. Glimm and A. Jaffe, Quantum Physics, Sec. ed., Springer, New York, 1987, Chap. 9.1.
[68] T. Hida, Stationary Stochastic Processes, Princeton, 1970, Theo. 4.2.
[69] Y. Yamasaki, Measures on Infinite Dimensional Spaces, World Scientific, Singapore, 1985, Part A, Theo. 17.1.
[70] Ch. Kopper and F. Meunier, Large momentum bounds from flow equations, Ann. Henri Poincaré 3 (2002), 435–449.
[71] S. Weinberg, High energy behaviour in quantum field theory, Phys. Rev. 118 (1960), 838–849.
[72] N. P. Landsman and C. van Weert, Real- and imaginary-time field theory at finite temperature and density, Phys. Rep. 145 (1987), 141–249.
[73] N. Bourbaki, Fonctions d’une variable réelle, Editions Hermann, Paris, 1976, Chap. 6.
[74] Y. M. P. Lam, Perturbation Lagrangian theory for scalar fields: Ward–Takahashi identity and current algebra, Phys. Rev. D6 (1972), 2145–2161.
[75] Y. M. P. Lam, Equivalence theorem on Bogoliubov–Parasiuk–Hepp–Zimmermann-renormalized Lagrangian field theories, Phys. Rev. D7 (1973), 2943–2949.
[76] J. H. Lowenstein, Differential vertex operations in Lagrangian field theory, Comm. Math. Phys. 24 (1971), 1–21.
[77] O. Piguet and S. P. Sorella, Algebraic Renormalization, Springer, Berlin, 1995.
[78] J. Zinn-Justin, Quantum Field Theory and Critical Phenomena, Sec. ed., Clarendon Press, Oxford, 1993.
[79] A. A. Slavnov, Math. Theor. Phys. 10 (1972), 99.
[80] J. C. Taylor, Ward identities and charge renormalization of the Yang–Mills field, Nucl. Phys. B33 (1971), 436–444.
[81] C. Becchi, A. Rouet and R. Stora, Renormalization of the Abelian Higgs–Kibble Model, Comm. Math. Phys. 42 (1975), 127–162.
[82] C. Becchi, A. Rouet and R. Stora, Renormalization of gauge theories, Ann. Phys. (N.Y.) 98 (1976), 287–321.
[83] L. D. Faddeev and A. A. Slavnov, Gauge Fields: Introduction to Quantum Theory, Benjamin, Reading MA, 1980.
[84] V. F. Müller, Perturbative renormalization by flow equations, arXiv hep-th/0208211, Proposition 4.4.
September 1, 2003 11:49 WSPC/148-RMP
00172
Reviews in Mathematical Physics Vol. 15, No. 6 (2003) 559–628 © World Scientific Publishing Company
ON THE MODULI OF A QUANTIZED ELASTICA IN P AND KdV FLOWS: STUDY OF HYPERELLIPTIC CURVES AS AN EXTENSION OF EULER’S PERSPECTIVE OF ELASTICA I
SHIGEKI MATSUTANI∗ and YOSHIHIRO ÔNISHI†
∗8-21-1 Higashi-Linkan, Sagamihara, 228-0811 Japan
†Faculty of Humanities and Social Sciences, Iwate University, Ueda, Morioka, Iwate, 020-8550 Japan
Received 25 June 1999
Revised 4 May 2003
Quantization requires evaluating all states of a quantized object rather than only its stationary states with respect to its energy. In this paper, we investigate the moduli $M^{\mathbb{P}}_{\mathrm{elas}}$ of a quantized elastica, a quantized loop with an energy functional associated with the Schwarz derivative, on a Riemann sphere $\mathbb{P}$. We prove that its moduli space decomposes into a set of equivalence classes determined by flows obeying the Korteweg-de Vries (KdV) hierarchy, which conserve the energy. Since the flow obeying the KdV hierarchy has a natural topology, it induces a topology on the moduli space $M^{\mathbb{P}}_{\mathrm{elas}}$. Using this topology, $M^{\mathbb{P}}_{\mathrm{elas}}$ is classified. Studies of a loop space in the category of topological spaces Top are well established, and its cohomological properties are well known. As the moduli space of a quantized elastica can be regarded as a loop space in the category of differential geometry DGeom, we also prove the existence of a functor between a triangle category related to a loop space in Top and that in DGeom, using the induced topology. As Euler investigated the elliptic integrals and their moduli by observing the shape of a classical elastica on $\mathbb{C}$, this paper is devoted to the relations between hyperelliptic curves and a quantized elastica on $\mathbb{P}$ as an extension of Euler’s perspective of elastica.

Keywords: Statistical mechanics; elastica; polymer; loop space; hyperelliptic function.
1. Introduction

According to Truesdell’s inquiry [1, 2], the history of investigations of the elastica was opened by James Bernoulli in 1691. He named the shape of a thin non-stretching elastic rod an elastica and proposed the elastica problem: what shape does an elastica take for a given boundary condition? It should further be noted that he also proposed the lemniscate problem and discovered an elliptic integral corresponding to the lemniscate function through his investigation of the elastica. He considered a smooth curve parametrized by arc-length in a plane $\mathbb{C}$,

\[
\tilde\gamma : [0, l] \hookrightarrow \mathbb{C} , \qquad (s \mapsto \tilde\gamma(s)) .
\]
Following his studies, his nephew Daniel Bernoulli discovered that the elastica obeys a minimal principle: the shape of the elastica is realized as a stationary point of an energy functional, nowadays called the Euler–Bernoulli functional [1–3],

\[
E[\tilde\gamma] = \int k^2\, ds ,
\]

where $k$ is the curvature of the curve $\tilde\gamma$ in $\mathbb{C}$, $k = -\sqrt{-1}\,\partial_s^2\tilde\gamma/\partial_s\tilde\gamma$, $\partial_s := d/ds$, and $s$ is the arc-length of the curve with respect to the induced metric in $\mathbb{C}$. (It should be noted that this functional differs from that of a “string” in the literature of string theory in elementary particle physics: although an elastica is a model of the string of a chordophone, e.g. the guitar, the “string” of string theory cannot be realized in the classical mechanical regime.) Since the curvature $k$ is expressed as $k = \partial_s\phi$, where $\phi$ is the tangential angle, and the energy is given by $E = \int |\partial_s\phi|^2\, ds$, the elastica problem can be interpreted as the oldest problem of a harmonic map into the target space U(1): if we write $\partial_s\phi\, ds = g^{-1}dg$ for a U(1)-valued function $g$ over $\mathbb{C}$, then the Hodge-star dual is $*g^{-1}dg = \partial_s\phi$ and $E = \int \langle g^{-1}dg \wedge *g^{-1}dg\rangle$.

The elastica problem is to investigate the moduli $\tilde M_{\mathrm{elas,cls}}$,

\[
\tilde M_{\mathrm{elas,cls}} := \{\tilde\gamma : [0, 1] \hookrightarrow \mathbb{C} \mid \delta E[\tilde\gamma]/\delta\tilde\gamma = 0\}/\sim .
\]

Here “$\sim$” means modulo Euclidean motions in $\mathbb{C}$ and dilatation. We sometimes call this space the moduli space of classical elasticas. The classification of this moduli space $\tilde M_{\mathrm{elas,cls}}$ was essentially done by Euler in 1744 by means of numerical computations [4]. The moduli space $\tilde M_{\mathrm{elas,cls}}$ is classified by the moduli of the elliptic curves [1–3, 5]. It is noted that before Euler referred to Fagnano’s paper on his discovery of an algebraic property of the lemniscate function (an elliptic function of a special modulus) on December 31, 1751, elliptic integrals of more general modulus had been investigated in the study of this classical harmonic map problem. (It is known that Jacobi regarded that day as the birthday of the elliptic function. Thus we think of the elastica as a kind of fetal movement of algebraic curves.) We also emphasize that, from the beginning, the harmonic map problem (classical field theory in physics) has been closely related to algebraic varieties.
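As a small numerical illustration of the Euler–Bernoulli functional and of the relation $k = \partial_s\phi$, the following Python sketch evaluates a discretized version of $\int k^2\, ds$ for a closed polygon. The function names `bending_energy` and `circle` and the particular discretization (turning angle divided by the mean adjacent edge length) are choices made here for illustration and are not taken from the paper. For a circle of radius $R$ the exact value is $2\pi/R$.

```python
import math


def bending_energy(points):
    """Discrete Euler-Bernoulli energy, sum of k^2 * ds, of a closed polygon.

    `points` is a list of (x, y) vertices; the polygon is closed implicitly.
    """
    n = len(points)
    energy = 0.0
    for i in range(n):
        x0, y0 = points[i - 1]          # previous vertex (wraps around at i = 0)
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]    # next vertex (wraps around at i = n - 1)
        # Edge vectors meeting at vertex i.
        ax, ay = x1 - x0, y1 - y0
        bx, by = x2 - x1, y2 - y1
        # Turning angle at vertex i: the increment of the tangential angle phi.
        dphi = math.atan2(ax * by - ay * bx, ax * bx + ay * by)
        # Arc length attributed to vertex i: mean of the two adjacent edge lengths.
        ds = 0.5 * (math.hypot(ax, ay) + math.hypot(bx, by))
        # Discrete curvature k = dphi / ds; contribution k^2 * ds.
        energy += (dphi / ds) ** 2 * ds
    return energy


def circle(radius, n):
    """Vertices of a regular n-gon inscribed in a circle of the given radius."""
    return [(radius * math.cos(2 * math.pi * j / n),
             radius * math.sin(2 * math.pi * j / n)) for j in range(n)]


if __name__ == "__main__":
    for radius in (1.0, 2.0):
        approx = bending_energy(circle(radius, 2000))
        print(radius, approx, 2 * math.pi / radius)
```

The energy scales like $1/R$ under dilatation, so when curves are compared modulo dilatation, as in the moduli spaces above, one would first fix the total length.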
Recently Mumford investigated this elastica problem from the viewpoint of applied mathematics and gave simple and deep expressions for the shape of the elastica, which show the depth, importance and beauty of this problem [6]. For a closed elastica in particular, Euler showed that its moduli space,

\[
M_{\mathrm{elas,cls}} := \{\gamma : S^1 \hookrightarrow \mathbb{C} \mid \delta E[\gamma]/\delta\gamma = 0\}/\sim ,
\]

consists of two disjoint points: the corresponding moduli $\tau$ of the elliptic curves consist of the two points $\tau = 0$ and $\tau = 0.70946\ldots$ [1–4].

Recently the loop space has become one of the most studied objects in mathematics, and there have been many efforts to investigate it [7–11 and references therein]. Further, it is well known that soliton equations are closely related to loop spaces, loop
groups and loop algebras [8, 11]. However, these studies are sometimes too abstract to be related to physical problems, except for problems in elementary particle physics; for example, the embedding space is often a group manifold, e.g. U(N). Further, little attention is paid to the energy functional in these studies. On the other hand, the object of our concern is a non-stretching elastica, which serves as a physical model of a large polymer such as deoxyribonucleic acid (DNA) [12–15]. An elastica has an energy functional, as described above. Thus our problem basically differs from the arguments on an ordinary loop space in [8, 10, 11], with the exception of [7, 9], though it is closely related to them.

One of the authors (S.M.) considered the quantization of a closed elastica (precisely speaking, the statistical mechanics of elasticas) [13]. He defined the moduli space of the closed quantized elastica, which consists of isometric immersions of $S^1$ into $\mathbb{C}$ modulo Euclidean motions and dilatation,

\[
M^{\mathbb{C}}_{\mathrm{elas}} := \{\gamma : S^1 \hookrightarrow \mathbb{C} \mid \text{isometric immersion}\}/\sim .
\]

He investigated the partition function from a physical point of view, which has not been mathematically justified:

\[
Z : M^{\mathbb{C}}_{\mathrm{elas}} \times \mathbb{R}_{>0} \to \mathbb{R} , \quad \text{with} \quad Z[\beta] = \int_{M^{\mathbb{C}}_{\mathrm{elas}}} D\gamma\, \exp(-\beta E[\gamma]) ,
\]

where $\beta \in \mathbb{R}_{>0} := \{x \in \mathbb{R} \mid x > 0\}$ and $D\gamma$ is the Feynman measure. For the quantization of an elastica, we need more information on the moduli space of curves than that around its stationary points. To evaluate this map $Z$, he classified the moduli space of a quantized closed elastica $M^{\mathbb{C}}_{\mathrm{elas}}$ and attempted to redefine the Feynman measure by replacing it with a series of Riemann integrals over $M^{\mathbb{C}}_{\mathrm{elas}}$. His quantization is somewhat novel for an elastica. He proved, at a physical level of rigor, that the moduli space of the quantized elastica is given as a subspace of the moduli space of the modified Korteweg-de Vries (MKdV) equation [13].

Here we should emphasize that it is very surprising that a physical system is completely described by a soliton equation, as mentioned in [13]. Even in physical phenomena known to be represented by soliton equations, such as shallow water waves, plasma waves, charge density waves and so on, the higher soliton solutions are, in general, outside the region where the approximation is valid; of course, one- or two-soliton solutions do represent these phenomena well. On the other hand, in the quantized elastica problem, the functional space is completely expressed by the MKdV hierarchy, even though problems in polymer physics are, in general, too complex to be solved exactly [15].

In this paper, we rewrite the physical theorem in [13] from a mathematical point of view and extend it. Pedit gave a lecture on a loop space over a Riemann sphere $\mathbb{P}$ at Tokyo Metropolitan University in 1998 [16]. There he showed that the
loop space is related to the Korteweg-de Vries (KdV) flow by considering a loop in $\mathbb{C}^2\setminus\{0\}$. As his treatment is given in the framework of pure mathematics, we follow Pedit’s formulation and deal with the KdV flow instead of the MKdV flow here. Due to the Miura map (a Riccati-type differential equation), the MKdV flow and the KdV flow can be regarded as different aspects of the same object, so this choice is not significant. Mathematical investigation of the KdV flow leads us to our main results, Theorems 3.4, 4.2 and 7.4.

As we will show later, our investigation of a quantized elastica leads us to study hyperelliptic curves and their moduli space, just as Euler encountered the elliptic integrals and studied the moduli of the elliptic functions by observing the shape of a classical elastica on $\mathbb{C}$. One purpose of this study is to understand hyperelliptic functions and their moduli by investigating a quantized elastica in $\mathbb{P}$, as an extension of Euler’s perspective of elastica. After we submitted the first version of this paper, these works progressed [17–20]. Hence, in this revised version, we have also rewritten the related parts.

The contents of this paper are as follows. Section 2 gives an expression of a real curve immersed in a Riemann sphere $\mathbb{P}$ according to the lecture of Pedit [16]. Using his expressions, we define the moduli space of a real smooth curve immersed in $\mathbb{P}$ and an energy functional of the curve whose integrand is the Schwarz derivative along the curve. When we regard $\mathbb{P}$ as a complex plane with the point at infinity, $\mathbb{C} \cup \{\infty\}$, the energy functional is identified with the Euler–Bernoulli energy functional around the origin $\{0\}$ of $\mathbb{C} \cup \{\infty\}$, and the curve with this energy reduces to the quantized elastica studied by one of the authors [13]. Thus we continue to refer to such a curve in $\mathbb{P}$ as a “quantized elastica in $\mathbb{P}$”.
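To see concretely why an energy built from the Schwarz derivative can reduce to the Euler–Bernoulli energy, the following short computation may help. It assumes the standard normalization of the Schwarz derivative and an arc-length parametrized closed curve in $\mathbb{C}$ with $\partial_s\gamma = e^{\sqrt{-1}\phi}$; the normalization actually used in the paper's Definition 2.6 may differ by constant factors.

```latex
% Standard Schwarz derivative (an assumed normalization):
\[
\{\gamma, s\}_{SD}
  = \partial_s\!\left(\frac{\partial_s^2\gamma}{\partial_s\gamma}\right)
  - \frac12\left(\frac{\partial_s^2\gamma}{\partial_s\gamma}\right)^{2} .
\]
% Arc-length parametrization: \partial_s\gamma = e^{i\phi}, curvature k = \partial_s\phi, hence
\[
\frac{\partial_s^2\gamma}{\partial_s\gamma} = i\,k ,
\qquad
\{\gamma, s\}_{SD} = i\,\partial_s k + \frac12\,k^{2} .
\]
% On a closed curve the total derivative i\,\partial_s k integrates to zero, so
\[
\oint \{\gamma, s\}_{SD}\, ds = \frac12 \oint k^{2}\, ds = \frac12\, E[\gamma] .
\]
```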
In order to consider quantum effects, we need knowledge of a set of curves with different energies, instead of investigating only the stationary points of the energy functional, even though we are dealing with a single elastica. Thus, in this paper, we call the moduli $M^{\mathbb{P}}_{\mathrm{elas}}$ defined in Definitions 2.10 and 2.12 the “moduli of a quantized elastica” rather than the moduli of loops. In Sec. 2, we give an equivalence, in a certain sense, between a loop space over $\mathbb{C}^2\setminus\{0\}$ and one over $\mathbb{P}$. Further, following MacLaughlin and Brylinski [21, 7], we introduce a natural topology of the loop space, induced from the topology of the base space.

In Sec. 3, we introduce infinite-dimensional parameters $t = (t_1, t_2, t_3, \ldots)$ which deform a given curve, and define a flow obeying the KdV hierarchy along $t$, called the KdVH flow. There we give the first main theorem of this paper, Theorem 3.4. Since the energy functional of a curve turns out to be a first integral with respect to the parameter $t$, we prove that the KdVH flow can be used to classify the moduli $M^{\mathbb{P}}_{\mathrm{elas}}$ of a quantized elastica in $\mathbb{P}$. In other words, the moduli space $M^{\mathbb{P}}_{\mathrm{elas}}$ decomposes into a set of equivalence classes with respect to the KdVH flow. As Sec. 3 gives the differential geometric and dynamical properties of the quantized elastica, we attempt to express the theorem in the language of differential geometers. Remark 3.10 is a key to the study in Sec. 3.
Elementary considerations lead to the fact that the moduli space of a quantized elastica $M^{\mathbb{P}}_{\mathrm{elas}}$ is a subspace of the moduli space of the KdVH flow $M_{\mathrm{KdV}}$, as shown in Proposition 4.29. The system of the KdV hierarchy has a natural topology, which essentially determines the algebraic properties of the KdV hierarchy [11, 22–25]. Using results from these studies of the KdV hierarchy, we give a finer classification of the moduli space $M^{\mathbb{P}}_{\mathrm{elas}}$ in Theorem 4.2 and Proposition 4.33, which is our second main theorem. There a dense subspace of $M^{\mathbb{P}}_{\mathrm{elas}}$ is decomposed into subspaces, each characterized by a natural number. As defined below Lemma 4.1, we encounter a finite type of the KdVH flow, which corresponds to the finite type solutions of the KdV equation and is related to a hyperelliptic curve. The natural number is related to the genus of the hyperelliptic curve.

In order to state our second main result, Theorem 4.2 and Proposition 4.33, Sec. 4 reviews the algebro-geometric properties of the KdV hierarchy based upon the so-called Sato–Mulase theory [24–26]. As the completion of the set of finite type solutions is equal to $M_{\mathrm{KdV}}$, we concentrate on the finite type solutions of the KdV flow and consider $M_{\mathrm{KdV}}$ algebro-geometrically. As Sato–Mulase theory belongs to algebraic analysis and is based upon the formal power series ring, we replace the base ring of smooth functions by the formal power series. There we find that a commutative differential ring is connected with the geometry of a commutative ring, i.e. a hyperelliptic curve. Using the inclusion $M^{\mathbb{P}}_{\mathrm{elas}} \subset M_{\mathrm{KdV}}$, we introduce the relative topology on $M^{\mathbb{P}}_{\mathrm{elas}}$ induced from the topology of $M_{\mathrm{KdV}}$.

In Sec. 5, we show another algorithm for the explicit computation of solutions of the KdV flow. There we reconsider the KdV equation in the framework of the inverse scattering method and comment again on the meaning of Theorem 4.2 at Proposition 5.20. In other words, we restate our second result more analytically. Readers can therefore skip this section except for Example 5.21. There we also review Krichever’s construction of algebro-geometric solutions [27, 28] and Baker’s original method, given about one hundred years ago [17, 29]. Using it, we show that there is an injection from the moduli space $M_{\mathrm{hyp}}$ of hyperelliptic curves to the moduli space $M_{\mathrm{KdV}}$ of the KdV equation up to an ambiguity; this correspondence enables us to determine the functional forms of hyperelliptic $\wp$ functions as solutions of the KdV equations for any algebraically given hyperelliptic curve, including degenerate curves.

Section 6 is a digression, in which we review a result on a loop space over $S^2$ in the category of topological spaces Top, whose morphisms are continuous maps, following the arguments in the textbook of Bott and Tu [30]. Studies of a loop space in Top are well established, and its cohomological properties are well known. On the other hand, the moduli space of a quantized elastica in $\mathbb{P}$ can be regarded as a loop space in the category of differential geometry DGeom. Thus, by loosening the properties in DGeom and regarding them as those in Top, it is expected that the moduli of a quantized elastica $M^{\mathbb{P}}_{\mathrm{elas}}$ in $\mathbb{P}$ are topologically related to those of a loop space in Top. Thus in Sec. 6, we review a loop space in Top and show its cohomological properties.
In Sec. 7, we discuss the topological properties of the moduli of a quantized elastica $M^{\mathbb{P}}_{\mathrm{elas}}$ and give our third main theorem. As loop spaces in both Top and DGeom are not finite-dimensional spaces when we regard them as manifolds in an appropriate sense, it is not known whether de Rham’s theorem is applicable to them. However, it is expected that cohomological sequences in both categories should correspond to each other. In other words, it is important to discuss the existence of a functor between the triangle categories related to them, i.e. a quasi-isomorphism. Precisely speaking, though the closedness condition and the reality condition in the moduli $M^{\mathbb{P}}_{\mathrm{elas}}$ make its topological properties difficult to treat, we tune the low-dimensional parts of the chain complex of $M^{\mathbb{P}}_{\mathrm{elas}}$ and consider a complex of quotient spaces $C M^{\mathbb{P}}_{\mathrm{elas}}$. Then we show the existence of a functor between the triangle categories of loop spaces in Top and DGeom as our third main theorem, Theorem 7.4. The existence of the functor means that our theory in DGeom is justified from the viewpoint of topological investigation. We believe that this result is meaningful for the investigation of loop spaces.

Section 8 gives remarks and comments on our results. First we comment on sequences of homotopy of loop spaces in both Top and DGeom. Next, we point out a possibility of computing the partition function of a quantized elastica in $\mathbb{C}$. Even in the quantized system, we show that the orbit space is meaningful, whereas it is well known that in a noncommutative space the concepts of orbit and geometry are sometimes meaningless [58]. So we comment on this fact. Further, we remark on the relations between our system and the Painlevé equation of the first kind [13, 31], and between our system and conformal field theory. Finally, we comment on our results from the point of view of recent progress on the Dirac operator related to immersed objects, based upon [12–14, 33, 34].
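We will also mention the possibility of a higher dimensional case of our consideration there.

Notations

$\mathbb{R}$ and $\mathbb{C}$ are the real and complex number fields respectively. $\mathbb{R}_{\ge 0}$ is the set of the non-negative real numbers. $\mathbb{Z}$ is the set of integers and $\mathbb{N}$ is the set of natural numbers 1, 2, 3, . . . . $\mathbb{Z}_{\ge 0}$ is the set of the non-negative integers. $C^\infty(A, B)$ means the set of $B$-valued smooth functions over $A$. $\mathbb{R}[x_1, \ldots, x_n]$ is the set of polynomials in $x_1, \ldots, x_n$ with $\mathbb{R}$-valued coefficients and $\mathbb{R}[[x_1, \ldots, x_n]]$ is the set of formal power series in $x_1, \ldots, x_n$ with $\mathbb{R}$-valued coefficients. Other important quantities are listed as follows (symbol : meaning : where introduced).

$\mathcal{M}^{\mathbb{P}}$ : Moduli of loops in $\mathbb{P}$ : Definition 2.4
$\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}$ : Moduli of loops in $\mathbb{C}^2\setminus\{0\}$, $\varpi : \mathcal{M}^{\mathbb{C}^2\setminus\{0\}} \to \mathcal{M}^{\mathbb{P}}$ : Definition 2.4, Remark 2.5
$\{\gamma, s\}_{\rm SD}$ : Schwarz derivative : Definition 2.6
$\mathcal{M}^{\mathbb{P}}_{\rm elas}$ : Moduli of quantized elastica in $\mathbb{P}$ : Definition 2.10
$\mathcal{M}^{\mathbb{C}}_{\rm elas}$ : Moduli of quantized elastica in $\mathbb{C}$ : Definition 2.10
$\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}$ : $\varpi : \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas} \to \mathcal{M}^{\mathbb{P}}_{\rm elas}$ : Definition 2.10, Remark 2.11
$\mathcal{M}^{\mathbb{P}}_{\rm elas}/{\rm Isom}(S^1)$ : $\pi^{\mathbb{P}}_{\rm elas} : \mathcal{M}^{\mathbb{P}}_{\rm elas} \to \mathcal{M}^{\mathbb{P}}_{\rm elas}/{\rm Isom}(S^1)$ : Definition 2.12
$\mathcal{M}^{\mathbb{C}}_{\rm elas}/{\rm Isom}(S^1)$ : $\pi^{\mathbb{C}}_{\rm elas} : \mathcal{M}^{\mathbb{C}}_{\rm elas} \to \mathcal{M}^{\mathbb{C}}_{\rm elas}/{\rm Isom}(S^1)$ : Definition 2.12
$\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}/{\rm Isom}(S^1)$ : $\pi^{\mathbb{C}^2\setminus\{0\}}_{\rm elas} : \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas} \to \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}/{\rm Isom}(S^1)$ : Definition 2.12
$E[\gamma]$ : energy of elastica in $\mathbb{P}$ : Definition 2.18
$\mathcal{D}_s$ : Differential ring over $C^\infty(S^1, \mathbb{C})$ : Definition 3.1
$\mathcal{E}_s$ : Micro differential ring to $\mathcal{D}_s$ : Definition 3.1
$V^\infty$ : $S^1 \times (\prod_{n=1}^{\infty} \mathbb{R})$ : Definition 3.2
$\bar\phi_{\partial_s u, t}$, $\bar\varphi_{\partial_s u, t}$ : the KdVH flow : Definition 3.2, Proposition 3.11
$\Omega$, $\Omega$ : Recursion differential operator : Definition 3.2, Lemma 3.6
$\sim_{\rm KdVHf}$ : Equivalence relation related to the KdVH flow : Definition 3.2
$\phi_{A,t}$ : a flow for $A$ : Definition 3.7
$\mathcal{M}^{\mathbb{P}}_{\rm elas}/\sim_{\rm KdVHf}$ : $\pi_{\rm elas} : \mathcal{M}^{\mathbb{P}}_{\rm elas} \to \mathcal{M}^{\mathbb{P}}_{\rm elas}/\sim_{\rm KdVHf}$ : Definition 3.18
$\mathcal{M}^{\mathbb{P}}_{\rm elas, finite}$, $\mathcal{M}^{\mathbb{P}}_{{\rm elas}, g}$ : finite type of the KdV flow and finite $g$-type flow : Theorem 4.2
$\mathcal{D}_f$ : Differential ring over $\mathbb{C}[[t_1]]$ : Definition 4.4
$\mathcal{E}_f$ : Micro differential ring to $\mathcal{D}_f$ : Definition 4.4
$\mathcal{E}_c$ : Micro differential ring with coefficient $\mathbb{C}$ : Definition 4.4
$W_f$, $W_c$ : Subsets of $\mathcal{E}_f$ and $\mathcal{E}_c$ : Definition 4.4
$L$ : Subset of $\mathcal{E}_f$ : Lemma 4.8
$A_f$, $A_c$ : commutative subrings of $\mathcal{E}_f$ and $\mathcal{E}_c$ : Lemma 4.8
$\mathcal{A}_c$ : set of the commutative subrings in $\mathcal{E}_c$ : Definition 4.12
$\mathcal{D}_t$, $\mathcal{E}_t$, $W_t$ : Differential rings over $\mathbb{C}[[t_1, t_2, \ldots]]$ : Definition 4.19
$\mathcal{M}_{\rm KdV}$, $\mathcal{M}^{\infty}_{\rm KdV}$ : Moduli of the KdV hierarchy : Definition 4.20, Proposition 4.32
$\mathcal{M}_{{\rm KdV}, g}$ : $\pi^{g}_{\rm KdV} : \mathcal{M}_{\rm KdV} \to \mathcal{M}_{{\rm KdV}, g}$ : Above Proposition 4.27
$F_g\mathcal{M}_{\rm KdV}$, $\mathcal{M}^{g}_{\rm KdV}$ : Filter of Moduli of the KdV hierarchy : Definition 4.26
$W^{g}_f$, $W^{0,1}_f$ : Gauge freedom : Lemma 4.28
$F_g\mathcal{M}^{\mathbb{P}}_{\rm elas}$ : Filter of Moduli of quantized elastica : Proposition 4.29
$\mathcal{M}_{{\rm hyp}, g}$ : Moduli of hyperelliptic curves of genus $g$ : Proposition 5.4
$P(X)$ : Path space over $X$ in Top : Proposition 6.2
$\Omega X$ : Loop space over $X$ in Top : Proposition 6.2
$D\mathcal{M}^{\mathbb{P}}_{\rm elas}$, $C\mathcal{M}^{\mathbb{P}}_{\rm elas}$ : Complex related to quantized elastica : Proposition 7.1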
2. A Loop in P

In this section we will give an expression of a real curve immersed in the Riemann sphere $\mathbb{P}$, following that of Pedit [16]. His expression is based upon the oldest theory of a complex curve embedded in a complex plane $\mathbb{C}$ or an upper half plane $\mathbb{H}$, which was found at the end of the nineteenth century and studied by Klein, Schwarz, Fuchs, Poincaré and others [35]. Using the expression, we will define the moduli spaces $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ ($\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}$) and their quotients by ${\rm Isom}(S^1)$ of smooth curves in $\mathbb{P}$ ($\mathbb{C}^2\setminus\{0\}$) in Definitions 2.10 and 2.12, and an energy functional of a curve in Definition 2.18, whose integrand is the Schwarz derivative along the curve. As mentioned in Introduction, we will call $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ a moduli space of a quantized elastica.

Let us consider a smooth immersion of a circle into a two-dimensional complex space without origin,
$$\psi : S^1 (:= \mathbb{R}/\mathbb{Z}) \hookrightarrow \mathbb{C}^2\setminus\{0\}, \qquad s \mapsto \psi(s) = \begin{pmatrix} \psi_1(s) \\ \psi_2(s) \end{pmatrix}.$$
Using this map and the natural projection $\varpi$ of $\mathbb{C}^2\setminus\{0\}$ to the complex projective space (Riemann sphere) $\mathbb{P}$, we can define the immersion of a loop in $\mathbb{P}$:

Definition 2.1. We define an immersion $\gamma : S^1 \hookrightarrow \mathbb{P}$ by the commutative diagram as $\gamma = \varpi \circ \psi$,
$$\begin{array}{ccc} S^1 & \xrightarrow{\ \psi\ } & \mathbb{C}^2\setminus\{0\} \\ & {\scriptstyle\gamma}\searrow & \downarrow{\scriptstyle\varpi} \\ & & \mathbb{P} \end{array}$$
For a chart around $\psi_2 \neq 0$, $s \mapsto \gamma(s) = \dfrac{\psi_1}{\psi_2}(s)$.
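Definition 2.2. (1) The special linear map ${\rm SL}_2(\mathbb{C}) : \mathbb{C}^2\setminus\{0\} \to \mathbb{C}^2\setminus\{0\}$,
$$m \in {\rm SL}_2(\mathbb{C}) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \,\middle|\, a, b, c, d \in \mathbb{C},\ ad - bc = 1 \right\},$$
acts on $\mathbb{P}$ through the Möbius transformation as a symmetry group of $\mathbb{P}$: $g_m : \varpi \circ \psi \mapsto \varpi \circ m\psi$ for $m \in {\rm SL}_2(\mathbb{C})$, and for a point $\gamma \in \mathbb{P}$,
$$g_m : \gamma \mapsto \frac{a\gamma + b}{c\gamma + d}, \qquad \text{for } m = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$
Let ${\rm PSL}_2(\mathbb{C})$ denote this group including the group action.
(2) Let $\Gamma_0(\mathbb{C})$ denote the subgroup which is characterized by the vanishing of the (2, 1)-component,
$$\Gamma_0(\mathbb{C}) := \left\{ \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \in {\rm SL}_2(\mathbb{C}) \right\},$$
and let $E_0(\mathbb{C})$ denote its action on $\mathbb{C} \cup \{\infty\}$ using the Möbius transformation.
(3) Let $\Gamma_1(\mathbb{C})$ denote the other subgroup which is characterized by
$$\Gamma_1(\mathbb{C}) := \left\{ \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \in {\rm SL}_2(\mathbb{C}) \,\middle|\, |a| = 1 \right\},$$
and let $E_1(\mathbb{C})$ denote its action on $\mathbb{C} \cup \{\infty\}$ using the Möbius transformation.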
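Remark 2.3. ${\rm PSL}_2(\mathbb{C})$ has the following properties:
(1) Translation, rotation and global dilatation: for $\begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \in \Gamma_0(\mathbb{C})$ (so that $d = 1/a$),
$$z \mapsto \frac{az + b}{d} = a^2 z + ab.$$
If we restrict the action to $\Gamma_1(\mathbb{C})$, it generates a Euclidean motion induced from $\mathbb{C} = \mathbb{P}\setminus\{\infty\}$.
(2) Coordinate transformation from the chart around 0 to the chart around $\infty$: $z \mapsto -1/z$.

In Definition 2.10, we give the definitions of moduli spaces of a quantized elastica in $\mathbb{P}$, which are our main objects in this article. However, as Proposition 2.8 is correct for a more complicated system, we will first give provisional moduli spaces of loops.

Definition 2.4. We define the moduli spaces of loops as sets as follows:
(1) $\mathcal{M}^{\mathbb{P}} := \{\gamma : S^1 \hookrightarrow \mathbb{P} \mid \gamma \text{ is a smooth immersion}\}/{\rm PSL}_2(\mathbb{C})$.
(2) $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}} := \{\psi : S^1 \hookrightarrow \mathbb{C}^2\setminus\{0\} \mid \psi \text{ is a smooth immersion},\ \det(\psi(s), \partial_s\psi(s)) = 1\}/{\rm SL}_2(\mathbb{C})$.
Here $\partial_s := d/ds$.

Remark 2.5. (1) Let $[\gamma]$ denote an element in $\mathcal{M}^{\mathbb{P}}$ for a representative element $\gamma \in \mathbb{P}$, and $[\psi]$ an element in $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}$ for a representative element $\psi \in \mathbb{C}^2\setminus\{0\}$.
(2) For a free loop space $\mathcal{M}$ over a base space $M$, $\mathcal{M} := \{\delta : S^1 \to M \mid \delta \text{ is a smooth immersion}\}$, we can define an evaluation map ev from $S^1 \times \mathcal{M}$ to $M$ by ${\rm ev}(s, \delta) = \delta(s)$ [7]. For $\mathcal{M}^{\circ}$ ($\circ$ is $\mathbb{P}$ or $\mathbb{C}^2\setminus\{0\}$), we have the evaluation map, whose image is somewhat complicated.
(3) For loops $\psi_1$ and $\psi_2$ in $\mathbb{C}^2\setminus\{0\}$ such that $[\psi_1] = [\psi_2] \in \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}$, we obviously obtain $[\varpi\psi_1] = [\varpi\psi_2]$ in $\mathcal{M}^{\mathbb{P}}$. Thus we also use the notation $\varpi$ for the map
$$\varpi : \mathcal{M}^{\mathbb{C}^2\setminus\{0\}} \to \mathcal{M}^{\mathbb{P}}.$$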
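Definition 2.6 (Schwarz derivative [35]). $\{\gamma(s), s\}_{\rm SD}$ is called the Schwarz derivative, which is defined for a smooth map $\gamma : S^1 \to \mathbb{P}$ equipped with a parameter $s \in S^1$ by
$$\{\gamma(s), s\}_{\rm SD} := \partial_s\left(\frac{\partial_s^2\gamma(s)}{\partial_s\gamma(s)}\right) - \frac{1}{2}\left(\frac{\partial_s^2\gamma(s)}{\partial_s\gamma(s)}\right)^2.$$
We write it as $\{\gamma, s\}_{\rm SD}$ or $\{\gamma, s\}_{\rm SD}(s)$ for brevity.

By elementary computations, the Schwarz derivative is also expressed by
$$\{\gamma, s\}_{\rm SD} = \frac{\partial_s^3\gamma(s)}{\partial_s\gamma(s)} - \frac{3}{2}\left(\frac{\partial_s^2\gamma(s)}{\partial_s\gamma(s)}\right)^2.$$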
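The equivalence of the two expressions is a direct application of the quotient rule. A worked version, writing $\gamma' = \partial_s\gamma$ for brevity (a shorthand used only in this display), might read:

```latex
\begin{align*}
\partial_s\!\left(\frac{\gamma''}{\gamma'}\right)
  &= \frac{\gamma'''\gamma' - (\gamma'')^2}{(\gamma')^2}
   = \frac{\gamma'''}{\gamma'} - \left(\frac{\gamma''}{\gamma'}\right)^2, \\
\{\gamma, s\}_{\mathrm{SD}}
  &= \frac{\gamma'''}{\gamma'} - \left(\frac{\gamma''}{\gamma'}\right)^2
     - \frac{1}{2}\left(\frac{\gamma''}{\gamma'}\right)^2
   = \frac{\gamma'''}{\gamma'} - \frac{3}{2}\left(\frac{\gamma''}{\gamma'}\right)^2 .
\end{align*}
```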
Straightforward computations give the following lemma.
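Lemma 2.7 ([35]). (1) For the action of $g \in {\rm PSL}_2(\mathbb{C})$, the Schwarz derivative $\{\gamma, s\}_{\rm SD}$ is invariant:
$$\{\gamma, s\}_{\rm SD} = \{g(\gamma), s\}_{\rm SD}.$$
(2) For a diffeomorphism $s' \in {\rm Diff}^+(S^1)$,
$$\{\gamma, s\}_{\rm SD} = (\partial_s s')^2\left(\{\gamma, s'\}_{\rm SD} - \{s, s'\}_{\rm SD}\right),$$
and for the U(1) action on $S^1$, i.e. $s' = s + \alpha$,
$$\{\gamma(s), s\}_{\rm SD} = \{\gamma(s' - \alpha), s'\}_{\rm SD}.$$

Definition/Proposition 2.8 [35]. There is a natural one-to-one correspondence between $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}$ and $\mathcal{M}^{\mathbb{P}}$ with the following properties.
(1) If $[\gamma]$ is an element of $\mathcal{M}^{\mathbb{P}}$, there exists a unique lifted curve $[\psi]$ as an inverse of the map $\varpi$ ($\varpi[\psi] = [\gamma]$). Let the correspondence be denoted by $\tilde\sigma$, i.e. $\tilde\sigma : \mathcal{M}^{\mathbb{P}} \to \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}$, ($[\psi] = \tilde\sigma([\gamma])$). Then we have $\varpi \circ \tilde\sigma([\gamma]) = [\gamma]$ and $\tilde\sigma \circ \varpi([\psi]) = [\psi]$.
(2) For a map $\gamma : S^1 \to \mathbb{P}$ representing $[\gamma]$, there is a curve $\psi$ in $\mathbb{C}^2\setminus\{0\}$ as a solution of the differential equation
$$\left(-\partial_s^2 - \frac{1}{2}\{\gamma, s\}_{\rm SD}(s)\right)\psi(s) = 0,$$
so that $\psi$ defines an element $[\psi] \in \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}$ and $[\varpi\psi] = [\gamma]$. This algorithm is a realization of $\tilde\sigma$.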
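Lemma 2.7(1) and the lift in Definition/Proposition 2.8(2) can be checked numerically at a single point. The following is a minimal sketch, not the authors' method: it works with truncated Taylor series around a base point $s_0$, uses a test loop $e^{is} + 0.3\,e^{2is}$ and a Möbius matrix chosen by us, and takes the lift in the explicit form $\psi = (\sqrt{-1}\,\gamma, \sqrt{-1})/\sqrt{\partial_s\gamma}$. All function names (`series_mul`, `lift`, `check`, and so on) are hypothetical helpers introduced for this illustration.

```python
import cmath
from math import factorial

ORDER = 6  # Taylor coefficients kept: orders 0..5 around the base point


def series_mul(a, b):
    """Product of two truncated power series."""
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(ORDER)]


def series_inv(a):
    """Reciprocal of a truncated power series with a[0] != 0."""
    b = [1 / a[0]]
    for n in range(1, ORDER):
        b.append(-sum(a[i] * b[n - i] for i in range(1, n + 1)) / a[0])
    return b


def series_sqrt(a):
    """Square root of a truncated power series with a[0] != 0."""
    b = [cmath.sqrt(a[0])]
    for n in range(1, ORDER):
        cross = sum(b[i] * b[n - i] for i in range(1, n))
        b.append((a[n] - cross) / (2 * b[0]))
    return b


def series_diff(a):
    """Derivative; the top coefficient is unknown and padded with 0."""
    return [(k + 1) * a[k + 1] for k in range(ORDER - 1)] + [0]


def schwarzian(g):
    """{gamma, s}_SD at the base point, from the 3-jet of gamma."""
    d1, d2, d3 = g[1], 2 * g[2], 6 * g[3]
    return d3 / d1 - 1.5 * (d2 / d1) ** 2


def moebius(g, a, b, c, d):
    """Taylor series of (a*gamma + b) / (c*gamma + d)."""
    num = [a * x for x in g]
    den = [c * x for x in g]
    num[0] += b
    den[0] += d
    return series_mul(num, series_inv(den))


def sample_loop(s0):
    """Taylor coefficients at s0 of the test loop exp(i s) + 0.3 exp(2 i s)."""
    return [((1j) ** k * cmath.exp(1j * s0)
             + 0.3 * (2j) ** k * cmath.exp(2j * s0)) / factorial(k)
            for k in range(ORDER)]


def lift(g):
    """Explicit lift psi = (i*gamma, i) / sqrt(gamma'); only low orders are used."""
    r = series_inv(series_sqrt(series_diff(g)))
    psi2 = [1j * x for x in r]
    psi1 = series_mul(g, psi2)
    return psi1, psi2


def check(s0):
    """Return (Moebius-invariance error, Wronskian error, ODE residual) at s0."""
    g = sample_loop(s0)
    a, b, c = 2.0, 1j, 0.5
    d = (1 + b * c) / a          # enforces ad - bc = 1
    S = schwarzian(g)
    invariance_err = abs(schwarzian(moebius(g, a, b, c, d)) - S)
    psi1, psi2 = lift(g)
    # det(psi, psi') = psi1 * psi2' - psi2 * psi1', evaluated at s0
    wronskian_err = abs(psi1[0] * psi2[1] - psi2[0] * psi1[1] - 1)
    # residual of -psi'' - (1/2) S psi = 0, with psi''(s0) = 2 * (order-2 coefficient)
    ode_err = max(abs(2 * p[2] + 0.5 * S * p[0]) for p in (psi1, psi2))
    return invariance_err, wronskian_err, ode_err


if __name__ == "__main__":
    print(check(0.7))
```

All three returned quantities should vanish up to floating-point rounding. The series approach is used here only because it avoids finite-difference error in the third derivative.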
Proof. In this proof, we will deal only with representative elements $\gamma$ and $\psi$ of $\mathcal{M}^{\mathbb{P}}$ and $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}$. First we will check the well-definedness of $\tilde\sigma$ in (2). Without loss of generality, we use the chart $\psi_2 \neq 0$ for a loop $\psi := \begin{pmatrix}\psi_1 \\ \psi_2\end{pmatrix} \in \mathbb{C}^2\setminus\{0\}$. Noting Remark 2.5(3), the well-definedness means that the lift of the loop $\gamma(S^1) := \varpi\psi(S^1)$ is uniquely $\psi$ up to ${\rm SL}_2(\mathbb{C})$. By differentiating $\det(\psi, \partial_s\psi) = 1$ in $s$, we have $(\partial_s^2\psi_2)/\psi_2 = (\partial_s^2\psi_1)/\psi_1$. After straightforward computation, for $\gamma = \psi_1/\psi_2$, we obtain the relation $(\partial_s^2\psi_2)/\psi_2 = -\{\gamma, s\}_{\rm SD}/2$. Up to ${\rm SL}_2(\mathbb{C})$, $\psi$ is identified with a solution of (2). Hence well-definedness is asserted. Further, existence of a solution of the equation in (2) is guaranteed by a special solution,
$$\psi = \begin{pmatrix} \sqrt{-1}\,\gamma/\sqrt{\partial_s\gamma} \\ \sqrt{-1}/\sqrt{\partial_s\gamma} \end{pmatrix},$$
whose $\det(\psi, \partial_s\psi)$ is unit. The Wronskian property $\det(\psi, \partial_s\psi) = 1$ and the uniqueness of the solutions of a second order differential equation confirm uniqueness of the solution of (2) up to ${\rm SL}_2(\mathbb{C})$. Further, due to the construction of the solutions, we will consider the effect of ${\rm Diff}^+(S^1)$; for $s'(s)$, the Schwarz derivative changes as in Lemma 2.7, and with $\partial_{s'}$ given by the chain rule, $\psi(s') := \psi(s)/\sqrt{\partial_{s'} s}$. Then $\psi_1(s') : \psi_2(s') = \gamma(s')$, and $\gamma(S^1)$ and $\psi(S^1)$ do not depend on the parameterization. Thus (1) and (2) are completely proved.

Remark 2.9 (Poincaré and Schwarz [35, 36]). By the analytic continuation of $s \in S^1$, $\gamma$ can be complexified to $\gamma_c$. If $\gamma_c$ is also embedded in $\mathbb{C}$, $\gamma_c^{-1}$ is an automorphic function. (In general, even though $\gamma$ is immersed or embedded in $\mathbb{P}$, $\gamma_c$ cannot be immersed in $\mathbb{P}$.) For example, in the case $s = \wp(\xi)$ ($\xi = \wp^{-1}(s) \in X_1$) for $s \in \mathbb{P}$, $\xi \in \mathbb{C}$, $\{\xi, s\}_{\rm SD}$ is a meromorphic function of $s$, where $\wp(\xi)$ is the Weierstrass elliptic function and $X_1 = \mathbb{C}/\Lambda$ is an elliptic curve. These studies are due to Klein, Riemann, Poincaré, Schwarz and others. In this article, we will not restrict ourselves to meromorphic functions. We will consider transcendental functions of $s$ because our problem is related to a physical problem, the elastica problem, just as the catenary, another physical curve, is also given by a transcendental function.

In this article, we are concerned with a loop with an energy functional in Definition 2.18. However, the integrand $\{\gamma, s\}_{\rm SD}$ in the energy integration depends upon the parameterization of $S^1$, i.e. on ${\rm Diff}^+(S^1)$, by Lemma 2.7. Hence we must fix the parameterization of the loop in order to treat a loop with the energy functional. Even in $\mathbb{P}$, we can locally define the metric because its tangent space $T\mathbb{P}$ is isomorphic to $\mathbb{C}$, but the action of ${\rm PSL}_2(\mathbb{C})$ prevents the metric from becoming global. Hence we restrict ourselves to an action of the subgroup $\Gamma_0(\mathbb{C})$ instead of ${\rm PSL}_2(\mathbb{C})$. Let us introduce our main objects in this article.

Definition 2.10. We define the moduli spaces of loops, which are called moduli of a quantized elastica or moduli spaces of a quantized elastica, as follows:
(1) $\mathcal{M}^{\mathbb{P}}_{\rm elas} := \{\gamma : S^1 \hookrightarrow \mathbb{P} \mid \text{smooth immersion},\ |\partial_s\gamma(s)| = 1\}/E_0(\mathbb{C})$.
(2) $\mathcal{M}^{\mathbb{C}}_{\rm elas} := \{\gamma : S^1 \hookrightarrow \mathbb{C} \mid \text{smooth immersion},\ |\partial_s\gamma(s)| = 1\}/E_0(\mathbb{C})$.
(3) $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas} := \{\psi : S^1 \hookrightarrow \mathbb{C}^2\setminus\{0\} \mid \text{smooth immersion},\ \det(\psi(s), \partial_s\psi(s)) = 1,\ |\psi_a(s)| = 1\ (a = 1 \text{ or } 2)\}/\Gamma_0(\mathbb{C})$.
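Remark 2.11. (1) The condition $|\partial_s\gamma(s)| = 1$ means that hereafter we will treat only loops with the arc-length parameter $s$ in $\mathbb{C}$ or $\mathbb{P} = \mathbb{C} \cup \{\infty\}$ equipped with the standard flat metric. We call the condition $|\partial_s\gamma(s)| = 1$ the reality condition or the arc-length condition.
(2) For a while, we continue to express the elements in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ and $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}$ by $[\gamma]$ and $[\psi]$ for loops $\gamma \in \mathbb{P}$ and $\psi \in \mathbb{C}^2\setminus\{0\}$ satisfying the appropriate conditions respectively.
(3) Further, similar to Remark 2.5(3), we can define the map from $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}$ to $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ by $\varpi$, noting that the reality condition $|\partial_s\gamma(s)| = 1$ means $|\psi_a(s)| = 1$ ($a$ = 1 or 2) under the condition $\det(\psi(s), \partial_s\psi(s)) = 1$, since $\partial_s\gamma(s) = -\det(\psi(s), \partial_s\psi(s))/\psi_2(s)^2$ for $\psi_2(s) \neq 0$.
(4) We can find a representative element by tuning the dilation of $E_0(\mathbb{C})$. By letting $\oint|d\gamma| = 2\pi$ for a curve with finite length in $\mathbb{C} \cup \{\infty\}$, we have a natural isomorphism,
$$\mathcal{M}^{\mathbb{C}}_{\rm elas} \approx \mathcal{M}^{\mathbb{C},2\pi}_{\rm elas} := \left\{\gamma : S^1 \hookrightarrow \mathbb{C} \cup \{\infty\} \,\middle|\, \text{smooth immersion},\ |\partial_s\gamma(s)| = 1,\ \oint|d\gamma| = 2\pi\right\}\Big/ E_1(\mathbb{C}).$$
Here $|d\gamma| = |\partial_s\gamma(s)|\,ds$.
(5) Using (1), we have a decomposition according to whether the length is finite or not, i.e.
$$\mathcal{M}^{\mathbb{P}}_{\rm elas} \approx \mathcal{M}^{\mathbb{C},2\pi}_{\rm elas} \coprod \mathcal{M}^{\infty}_{\rm elas}.$$
This picture is also confirmed if one considers a smooth loop in a two-sphere $S^2$ and its stereographic projection. For $\mathcal{M}^{\infty}_{\rm elas}$, where $\oint|d\gamma| = \infty$, we have another representative element,
$$\mathcal{M}^{\infty}_{\rm elas} \approx \mathcal{M}^{\infty,{\rm cvtr}}_{\rm elas} := \left\{\gamma : S^1 \hookrightarrow \mathbb{C} \cup \{\infty\} \,\middle|\, \text{smooth immersion},\ |\partial_s\gamma(s)| = 1,\ \sup_{s \in S^1}|\partial_s\log\partial_s\gamma(s)| = 1\right\}\Big/ E_1(\mathbb{C}).$$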
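Using the equivalences and such a representative element, we can introduce a scale in our system.

Next let us introduce smaller moduli spaces for later convenience. Even under the reality condition $|\partial_s\gamma(s)| = 1$, there is a freedom to choose the origin of the loop, which is denoted by ${\rm Isom}(S^1) = {\rm U}(1)$. Thus let us define smaller sets with projections to these sets of moduli spaces, e.g. $\pi^{\mathbb{P}}_{\rm elas} : \mathcal{M}^{\mathbb{P}}_{\rm elas} \to \mathcal{M}^{\mathbb{P}}_{\rm elas}/{\rm Isom}(S^1)$.

Definition 2.12 (Moduli of a quantized elastica). We define moduli spaces of loops, which are also called moduli of a quantized elastica, or moduli spaces of a quantized elastica, as the following quotients together with their projections:
(1) $\pi^{\mathbb{P}}_{\rm elas} : \mathcal{M}^{\mathbb{P}}_{\rm elas} \to \mathcal{M}^{\mathbb{P}}_{\rm elas}/{\rm Isom}(S^1)$.
(2) $\pi^{\mathbb{C}}_{\rm elas} : \mathcal{M}^{\mathbb{C}}_{\rm elas} \to \mathcal{M}^{\mathbb{C}}_{\rm elas}/{\rm Isom}(S^1)$.
(3) $\pi^{\mathbb{C}^2\setminus\{0\}}_{\rm elas} : \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas} \to \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}/{\rm Isom}(S^1)$.

In physics, we are concerned only with the shape of elastica, so the quotient $\mathcal{M}^{\circ}_{\rm elas}/{\rm Isom}(S^1)$ is more important than $\mathcal{M}^{\circ}_{\rm elas}$. Further we remark here that we have a natural isomorphism $\mathcal{M}^{\mathbb{P}}_{\rm elas}/{\rm Isom}(S^1) \approx \mathcal{M}^{\mathbb{P}}/{\rm Diff}^+(S^1)$ as a connection between $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ and $\mathcal{M}^{\mathbb{P}}$.

Next we give our correspondence between $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ and $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}$ based on Proposition 2.8. We also take the quotient $\mathcal{M}^{\mathbb{C},2\pi}_{\rm elas}/{\rm Isom}(S^1)$, which is naturally isomorphic to $\mathcal{M}^{\mathbb{C}}_{\rm elas}/{\rm Isom}(S^1)$.

Definition/Proposition 2.13. (1) There is a natural one-to-one and continuous correspondence between $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}$ and $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ with the following properties.
(1-1) If $[\gamma]$ is an element of $\mathcal{M}^{\mathbb{P}}_{\rm elas}$, there exists a unique lifted curve $[\psi]$ in $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}$ as an inverse of the map $\varpi$, ($\varpi[\psi] = [\gamma]$). Let the correspondence be denoted by
$$\sigma : \mathcal{M}^{\mathbb{P}}_{\rm elas} \to \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}, \qquad ([\psi] = \sigma([\gamma])).$$
Then we have $\varpi \circ \sigma([\gamma]) = [\gamma]$ and $\sigma \circ \varpi([\psi]) = [\psi]$.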
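(1-2) For a curve $\gamma(s) \in \mathbb{P}$ representing $[\gamma] \in \mathcal{M}^{\mathbb{P}}_{\rm elas}$, there is a curve $\psi$ in $\mathbb{C}^2\setminus\{0\}$ as a solution of the differential equation
$$\left(-\partial_s^2 - \frac{1}{2}\{\gamma, s\}_{\rm SD}(s)\right)\psi(s) = 0,$$
so that $\psi$ uniquely defines an element $[\psi]$ in $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}$ and $\varpi[\psi] = [\gamma]$. This algorithm is a realization of $\sigma$.
(2) There is a natural one-to-one correspondence between $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}/{\rm Isom}(S^1)$ and $\mathcal{M}^{\mathbb{P}}_{\rm elas}/{\rm Isom}(S^1)$ induced from $\sigma$ and $\varpi$.
(3) $\mathcal{M}^{\mathbb{C}}_{\rm elas}$ and $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}$ (and likewise their quotients by ${\rm Isom}(S^1)$) are connected with
$$\mathcal{M}^{\mathbb{C}}_{\rm elas} = \varpi\,\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}, \qquad \mathcal{M}^{\mathbb{C}}_{\rm elas}/{\rm Isom}(S^1) = \varpi\left(\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}/{\rm Isom}(S^1)\right).$$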
Proof. (1) and (2) are essentially the same as the proofs in Proposition 2.8 if we check the compatibility between $|\partial_s\gamma(s)| = 1$ and $|\psi_2(s)| = 1$, and the continuity of the map. Due to Remark 2.11(3), the condition $|\psi_2(s)| = 1$ in $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}$ essentially means the reality condition $|\partial_s\gamma(s)| = 1$ of the map $\gamma$ on the chart around $\psi_2 \neq 0$. Let us consider the continuity. For a loop in $\mathbb{P}$, $\gamma'(s) := \gamma(s) + \varepsilon v(s)$ with a small number $\varepsilon$ and an element $v$ of $C^\infty(S^1, \mathbb{C})$, the Schwarz derivative changes to first order in $\varepsilon$ as
$$\{\gamma', s\}_{\rm SD} = \{\gamma, s\}_{\rm SD} + \varepsilon\left[\frac{\partial_s^3 v}{\partial_s\gamma} - \frac{\partial_s^3\gamma}{(\partial_s\gamma)^2}\partial_s v - 3\,\frac{\partial_s^2\gamma}{\partial_s\gamma}\left(\frac{\partial_s^2 v}{\partial_s\gamma} - \frac{\partial_s^2\gamma}{(\partial_s\gamma)^2}\partial_s v\right)\right].$$
From the proof of Proposition 2.8, a solution $\psi'$ of its differential equation is periodic and thus is a loop in $\mathbb{C}^2\setminus\{0\}$. For sufficiently small $\varepsilon$, the second term becomes small enough. Then using perturbation theory, we have $\psi' = \psi + \varepsilon\eta$ as a solution of
$$\partial_s^2\psi' + \frac{1}{2}\{\gamma', s\}_{\rm SD}\,\psi' = 0.$$
We note that $\{\gamma', s\}_{\rm SD}$ is also invariant under ${\rm PSL}_2(\mathbb{C})$. For the condition $|\psi_2| = 1$, we replace the parameter $s$ by $s'$ using the fact in the proof of Proposition 2.8. On the other hand, for $\psi' = \psi + \varepsilon\eta'$ we can find $v' \in C^\infty(S^1, \mathbb{C})$ such that $\gamma' := \psi'_1/\psi'_2 = \psi_1/\psi_2 + \varepsilon v'$. Hence both maps are continuous.

Here we will consider the natural neighborhood in the moduli space $\mathcal{M}^{\mathbb{P}}_{\rm elas}$.

Corollary 2.14. (1) There is a continuous injective map from $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ to the function space $C^\infty(S^1, \mathbb{C})$.
(2) $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ has a natural topology generated by neighborhoods in the function space $C^\infty(S^1, \mathbb{C})$.

Proof. (2) is obvious if (1) is proved. For a given $u \in C^\infty(S^1, \mathbb{C})$, the function $\psi \in C^\infty([0, 2\pi), \mathbb{C}^2)$ satisfying $(-\partial_s^2 - u)\psi = 0$
is uniquely determined up to ${\rm SL}_2(\mathbb{C})$. In general, even though $u$ is periodic and a function over $S^1$, $\psi$ is not periodic, due to the Floquet theorem [49]. However, if $\psi$ is periodic for some $u$, by letting $|\psi_2| = 1$, $\gamma \in \mathbb{P}$ is uniquely determined by $\gamma = \psi_1/\psi_2$ up to $E_0(\mathbb{C})$. We note that for such $\gamma$ and $u$, $u$ is given as $u = \{\gamma, s\}_{\rm SD}/2$ and $\{\gamma, s\}_{\rm SD}/2$ is invariant under the action of $E_1(\mathbb{C})$. (Under the action of ${\rm Diff}^+(S^1)$ by reparameterization of the coordinate $s$, $\{\gamma, s\}_{\rm SD}/2$ is not invariant and changes its value. However, we have considered only the arc-length parameterization of $s$, $|\partial_s\gamma(s)| = 1$.) Hence if $\psi$ is periodic for some $u$, by letting $|\psi_2| = 1$, it determines $[\gamma] \in \mathcal{M}^{\mathbb{P}}_{\rm elas}$. The continuity is obvious from the previous proposition.

As the injective map in Corollary 2.14 is a continuous map, the above neighborhood can be geometrically interpreted as a neighborhood of $\gamma$.

Remark 2.15. It is known that the free loop space can be a metric space and has a natural topology if the base space is a Riemannian manifold [7, 21]. As we can regard an element in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ with finite length $\int|d\gamma(s)| < \infty$ as represented by a loop of length $2\pi$ whose center of gravity is at the origin of $\mathbb{C}$, it is not difficult to treat the quotient by $E_0$ or $E_1$. Consider the image $C$ of the map $\gamma : S^1 \to \mathbb{P}$; $C := \gamma(S^1)$. For such a loop $C \subset \mathbb{P}$, which represents a point $[C] \in \mathcal{M}^{\mathbb{P}}_{\rm elas}$, there is a normal bundle characterized by the exact sequence of the tangent bundles of $C$ and $\mathbb{P}$,
$$0 \to TC \to T\mathbb{P}|_C \to N_C \to 0.$$
Any element in $T\mathbb{P}|_C$ is decomposed as $T\mathbb{P}|_C \equiv N_C \oplus TC$. For a smooth section $v \in C^\infty(C, T\mathbb{P}|_C)$ of $T\mathbb{P}|_C$ over $C$, and for an infinitesimal real parameter $\varepsilon$, we have $C + \varepsilon v$ as a loop in $\mathbb{P}$. Here $+$ means the natural addition in the local chart $\mathbb{C}$ of $\mathbb{P}$ in the sense of Euclidean geometry.

Of course, it is important to check whether such a loop is a different point in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ or not, but if it is, we can find an infinitesimal path from $[\gamma]$ to $[\gamma + \varepsilon v_\gamma]$ in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ by letting $v_\gamma \in C^\infty(S^1, T\mathbb{P}|_{\gamma(S^1)})$, $v_\gamma := v \circ \gamma$. The condition $|\partial_s(\gamma(s) + \varepsilon v_\gamma(s))| = 1$ is not difficult to treat, by reparameterizing $s$ by $s'$ in a primitive sense. Further, even for the case $[\gamma] = [\gamma + \varepsilon v_\gamma]$, we can regard it as a trivial path. If $[\gamma]$ and $[\gamma + \varepsilon v_\gamma]$ are different points for an infinitesimally small $\varepsilon$, we can regard such $[v_\gamma]$ as an element in a set of smooth sections of the tangent bundle of $\mathcal{M}^{\mathbb{P}}_{\rm elas}$, $C^\infty(T_{[\gamma]}\mathcal{M}^{\mathbb{P}}_{\rm elas}) \equiv C^\infty(\mathcal{M}^{\mathbb{P}}_{\rm elas}, T_{[\gamma]}\mathcal{M}^{\mathbb{P}}_{\rm elas})$.

We show that there exist such different points. From Corollary 2.14, $C^\infty(T_{[\gamma]}\mathcal{M}^{\mathbb{P}}_{\rm elas})$ is not the empty set. Let us find an element in $C^\infty(T_{[\gamma]}\mathcal{M}^{\mathbb{P}}_{\rm elas})$ for each $[\gamma] \in \mathcal{M}^{\mathbb{P}}_{\rm elas}$. As the fiber of $T_p\mathbb{P}$ is isomorphic to $\mathbb{C}$, we define a norm of $v \in C^\infty(S^1, T\mathbb{P}|_{\gamma(S^1)})$ by the sup-norm. (In our article, our argument does not strongly depend upon the norm in $C^\infty(S^1, T\mathbb{P}|_{\gamma(S^1)})$.) As it is difficult to define a length in a scaleless space, we might consider an element in $\mathcal{M}^{\mathbb{C},2\pi}_{\rm elas}$ rather than $\mathcal{M}^{\mathbb{P}}_{\rm elas}$, due to Remark 2.11. Consider $[\gamma] \in \mathcal{M}^{\mathbb{C},2\pi}_{\rm elas}$ for $\gamma : S^1 \to \mathbb{P}$ satisfying the reality condition $|\partial_s\gamma(s)| = 1$ and $\int|d\gamma| = 2\pi$, and $v \in C^\infty(S^1, T\mathbb{P}|_{\gamma(S^1)})$. Suppose that $\gamma(S^1) + \varepsilon v(S^1)$ preserves the local and total length of $\gamma(S^1)$ for sufficiently small $\varepsilon$,
i.e. $|\partial_s(\gamma(S^1) + \varepsilon v(S^1))| = 1$ and $\int|d(\gamma + \varepsilon v)| = 2\pi$ are satisfied. We call such a deformation isometric. Then we regard $[\gamma + \varepsilon v]$ as an element in $\mathcal{M}^{\mathbb{C},2\pi}_{\rm elas}$. If the vector field $v$ ($\notin$ {Euclidean move}) is an isometric deformation, $[\gamma + \varepsilon v]$ is a different point from $[\gamma]$ in $\mathcal{M}^{\mathbb{C},2\pi}_{\rm elas}$ and thus they are different points in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$. Then we can naturally define a neighborhood around a loop with finite length in our moduli space $\mathcal{M}^{\mathbb{P}}_{\rm elas}$.

Similarly, for an element in $\mathcal{M}^{\infty}_{\rm elas}$ we can define the neighborhood. For an element $[\gamma]$ in $\mathcal{M}^{\infty,{\rm cvtr}}_{\rm elas}$, we define its tangent space and velocity $C^\infty(S^1, T\mathbb{P}|_{\gamma(S^1)})$. We can constrain the velocity to an isometric path which locally preserves the length. However, since $\mathcal{M}^{\infty,{\rm cvtr}}_{\rm elas}$ is defined by the sup-norm in Remark 2.11, for an element $[\gamma] \in \mathcal{M}^{\infty,{\rm cvtr}}_{\rm elas}$ and an isometric path for $v \in C^\infty(S^1, T\mathbb{P}|_{\gamma(S^1)})$, $[\gamma + \varepsilon v]$ generally does not belong to $\mathcal{M}^{\infty,{\rm cvtr}}_{\rm elas}$ even with a sufficiently small $\varepsilon$. However, $[\gamma + \varepsilon v]$ is in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ and $[\gamma + \varepsilon v]$ generates a point in $\mathcal{M}^{\infty}_{\rm elas}$ again. Hence the path is well-defined by a local argument. Accordingly we can naturally define a neighborhood in our moduli space $\mathcal{M}^{\mathbb{P}}_{\rm elas}$.

Further, as we also define neighborhoods in $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}$ using Proposition 2.13, we define a topology in our moduli spaces $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ and $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}$. As the topology comes from that of the loop space, we call it the topology of loop space [7, 21].

As the topology of loop space is generated by $C^\infty(T_\gamma\mathcal{M}^{\mathbb{P}}_{\rm elas})$, we will consider in detail an infinitesimal deformation parameterized by $t \in [0, \varepsilon]$ for a sufficiently small $\varepsilon$.

Remark 2.16. Due to the arguments in Remark 2.15, we wish to find a one-parameter family $[\gamma_t]$ in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ such that $\partial_t[\gamma_t]$ belongs to $C^\infty(T_\gamma\mathcal{M}^{\mathbb{P}}_{\rm elas})$.
(1) First, we will consider an isometric deformation which locally preserves the arc-length of a one-parameter family of loops immersed in $\mathbb{P}$: $\gamma_\circ : S^1 \times [0, \varepsilon] \to \mathbb{P}$, ($\gamma_t(s) := \gamma(s, t) \in \mathbb{P}$) satisfying
$$[\partial_t, \partial_s]\gamma_t(s) = 0.$$
Here $\partial_s := \partial/\partial s$ and $\partial_t := \partial/\partial t$. We call this condition the isometric condition. Then if $|\partial_s\gamma_{t=0}(s)| = 1$ for $s \in S^1$, $|\partial_s\gamma_t(s)| = 1$ for $(s, t) \in S^1 \times [0, \varepsilon]$. For $(s, [\gamma_t]) \in S^1 \times \mathcal{M}^{\mathbb{P}}_{\rm elas}$, $\partial_s$ acts only on $S^1$ whereas $\partial_t$ acts only on $[\gamma_t] \in \mathcal{M}^{\mathbb{P}}_{\rm elas}$. Of course the relation $[\partial_t, \partial_s](s, [\gamma_t]) = 0$ trivially holds. On the other hand, for the evaluation map ${\rm ev}(s, [\gamma_t])$ as in Remark 2.5(2), the action of $[\partial_t, \partial_s]$ is not trivial. However, by dealing only with isometric deformations, we can avoid the noncommutativity between a deformation and the evaluation map.
(2) Let us consider a one-parameter family of loops immersed in $\mathbb{P}$, $\gamma_\circ : S^1 \times [0, \varepsilon] \to \mathbb{P}$, given by a differential equation whose right hand side depends upon $\gamma_t$ itself. First assume that the differential equation is $\partial_t\gamma_t = f(\gamma_t)$ for a given functional $f$. In this case, the deformation depends upon the affine coordinate $\gamma_t$ in $\mathbb{P}$ and it is not invariant under the action of $E_1$. Further we note that $\partial_s\gamma_t(s)$ is the tangential vector of the circle $\gamma_t(S^1)$ and $\phi := \log\partial_s\gamma_t(s)/\sqrt{-1}$ denotes its
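tangential angle if $|\partial_s\gamma_t(s)| = 1$. Provided that the deformation $\partial_t\gamma_t$ is governed by a function of $\partial_s\gamma_t(s)$ itself, the deformation must depend upon the angle of $C$ in $\mathbb{P}$ and a Euclidean move. Hence such deformations cannot be deformations in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$, and a deformation in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ does not include $\gamma(s)$ and $\phi$.
(3) From Lemma 2.7(1), $u_t(s) \equiv u(s, t) := \{\gamma_t(s), s\}_{\rm SD}/2$ is a function on $\mathcal{M}^{\mathbb{P}}_{\rm elas}$. Further, $u_t(s)$ depends only on $\partial_s\log(\partial_s\gamma_t(s))$ and $\partial_s^2\log(\partial_s\gamma_t(s))$ due to Definition 2.6. We might consider the deformation of an element $\gamma_t$ in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ through an equation,
$$\partial_t u_t = f(u_t, \partial_s u_t, \partial_s^2 u_t, \ldots, A),$$
for an appropriate functional $f$ and function $A \in C^\infty(S^1 \times [0, \varepsilon], \mathbb{C})$; the function $A$ must be invariant under the action of $E_0(\mathbb{C})$. If $u_t$ is determined at a time $t$, $\gamma_t$ can be reconstructed by Proposition 2.13. If neither $\gamma_t(s)$ nor $\partial_s\gamma_t(s)$ themselves appear in the right hand side, the deformation is invariant under the action of $E_1(\mathbb{C})$ for an appropriate $A$.

Due to the above consideration, we can consider an infinitesimal deformation in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ by $\partial_t u = f(u, \partial_s u, \partial_s^2 u, \ldots, A)$ and $[\partial_t, \partial_s]\gamma(s, t) = 0$.

As we have prepared the tools, it is not difficult to deal with the quotient spaces of $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ and $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}$. From here, let $\gamma$ itself denote an element of $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ instead of $[\gamma]$ for a loop $\gamma(S^1) \in \mathbb{P}$ satisfying the conditions. Similarly we write $\psi \in \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}$ for the sake of simplicity. Further we will consider a flow in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$.

Remark 2.17. Let us consider the situation that for a point $\gamma \in \mathcal{M}^{\mathbb{P}}_{\rm elas}$ and its neighborhood $U_\gamma$ in terms of the loop topology, we can find another point $\gamma' \in U_\gamma$ such that $\gamma' = \gamma + \varepsilon v$ for a sufficiently small $\varepsilon$ and some velocity $v \in C^\infty(T_\gamma\mathcal{M}^{\mathbb{P}}_{\rm elas})$. Suppose that by sequentially finding such points, we construct a curve $\gamma_t$, $t \in [0, 1]$, in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ connecting a starting point $\gamma_0 = \gamma$ and a terminal point $\gamma_1 = \gamma''$ for some $\gamma'' \in U_\gamma$. Then we may write the velocity as $\partial_t\gamma_t$ at $\gamma_t$. In this way, for each point $\gamma$ in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$, we can define an immersion of $[0, 1]$ in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ for a smooth section $v \in C^\infty(T_\gamma\mathcal{M}^{\mathbb{P}}_{\rm elas})$ if it is well-defined. We call such an immersion a flow in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$.

Further, for each point $\gamma_t$ in a flow, $t \in [0, 1]$ with $v_t \in C^\infty(T_{\gamma_t}\mathcal{M}^{\mathbb{P}}_{\rm elas})$, let us assume that we can choose another element $v'_t \in C^\infty(T_{\gamma_t}\mathcal{M}^{\mathbb{P}}_{\rm elas})$ and find a point $\gamma_{t,\varepsilon}$ in the neighborhood of $\gamma_t$ such that $\gamma_{t,\varepsilon} = \gamma_t + \varepsilon v'_t$ for a sufficiently small parameter $\varepsilon$. Then we can consider a duplex flow such as $\gamma_{t,t'}$ for $[0, 1]^2$ in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$. Similarly we can deal with an immersion of $[0, 1]^m$ in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$. In that case, we call the immersion $\gamma_t$ of $t \in [0, 1]^m$ a multiple flow. Further, in certain cases, the immersion of $[0, 1]^m$ in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ can be extended to an immersion of $\mathbb{R}^m$ in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$, where $m$ is a positive integer or infinity. Similarly we can deal with flows in $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}$. We define the KdV and KdVH flows as an extension of a $[0, 1]^m$ immersion to an $\mathbb{R}^m$ immersion in Sec. 3.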
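Definition 2.18 (Energy of a quantized elastica). We introduce an energy functional of $\gamma \in \mathcal{M}^{\mathbb{C},2\pi}_{\rm elas} \approx \mathcal{M}^{\mathbb{C}}_{\rm elas}$, called the Euler–Bernoulli energy functional, by
$$E[\gamma] := \frac{1}{2\pi}\int_{S^1}\{\gamma(s), s\}_{\rm SD}\,ds.$$

Lemma 2.19. For $\gamma \in \mathcal{M}^{\mathbb{C}}_{\rm elas}$, the energy $E[\gamma]$ is non-negative real.

Proof. The Schwarz derivative can be expressed by
$$\{\gamma, s\}_{\rm SD} = \partial_s^2\log(\partial_s\gamma) - \frac{1}{2}\left(\partial_s\log(\partial_s\gamma)\right)^2.$$
Due to the reality condition $|\partial_s\gamma(s)| = 1$ in Definition 2.10, we let $\partial_s\gamma = \exp(\sqrt{-1}\,\phi)$, where $\phi$ is a real smooth function over $S^1$ with $\phi(0) = \phi(2\pi)$ modulo $2\pi$. The term $\partial_s^2\log(\partial_s\gamma) = \sqrt{-1}\,\partial_s^2\phi$ is a total derivative of the periodic function $\partial_s\phi$ and integrates to zero. Hence
$$\int_{S^1}\{\gamma, s\}_{\rm SD}\,ds = \frac{1}{2}\int_{S^1}ds\,(\partial_s\phi)^2,$$
which is real and non-negative.

Remark 2.20 ([13]). (1) By Lemma 2.7, the integrand in the energy $E$ is invariant under the action of ${\rm PSL}_2(\mathbb{C})$. However, the diffeomorphism group of $S^1$, ${\rm Diff}^+(S^1)$, changes the energy. Hence we cannot find a well-defined energy over $\mathcal{M}^{\mathbb{P}}$. Further, for $\gamma \in \pi^{\mathbb{P}}_{\rm elas}\mathcal{M}^{\infty,{\rm cvtr}}_{\rm elas}$, we can also consider a correspondence
$$\int_{S^1}\{\gamma, s\}_{\rm SD}\,ds,$$
by giving up the dilatation symmetry. As we wish to neglect the problem for $\pi^{\mathbb{P}}_{\rm elas}\mathcal{M}^{\infty}_{\rm elas}$, we restrict ourselves to $\mathcal{M}^{\mathbb{C}}_{\rm elas}$. In other words, we will consider the energy functional only for $\mathcal{M}^{\mathbb{C}}_{\rm elas}$.
(2) We regard the energy function as a section of a line bundle over $\mathcal{M}^{\mathbb{C}}_{\rm elas}$,
$$\begin{array}{ccc} \mathbb{R} & \longrightarrow & {\rm Energy}(\mathcal{M}^{\mathbb{C}}_{\rm elas}) \\ & & \downarrow \\ & & \mathcal{M}^{\mathbb{C}}_{\rm elas} \end{array}$$
(3) As mentioned in Introduction, for $\gamma \in \mathcal{M}^{\mathbb{C}}_{\rm elas}$, this energy functional $E = \int_{S^1}\{\gamma, s\}_{\rm SD}\,ds$ is identified with $\int_{S^1}(\partial_s^2\gamma/\partial_s\gamma)^2\,ds$; thus we call it the Euler–Bernoulli energy functional. The stationary points of $E$ in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ in the sense of (1) were investigated by Euler [1–4]. Even though we will not touch the problem in this paper, we are implicitly considering the partition function of an elastica as a problem of physics [13],
$$Z = \int_{\mathcal{M}^{\mathbb{C}}_{\rm elas}} D\gamma\,\exp\left(-\beta\int_{S^1}\{\gamma, s\}_{\rm SD}\,ds\right).$$
In order to know this partition function (which is not yet mathematically well-defined), we must investigate the moduli space of curves $\mathcal{M}^{\mathbb{C}}_{\rm elas} \subset \mathcal{M}^{\mathbb{P}}_{\rm elas}$, and we will do so in this paper.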
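For orientation, the energy of an $n$-fold covered circle can be computed directly from the formula in the proof of Lemma 2.19. The sketch below assumes the parameter range $s \in [0, 2\pi)$ used in that proof; the loop $\gamma_n$ is a hypothetical example chosen by us and does not appear in the text.

```latex
\begin{align*}
\gamma_n(s) &:= \frac{1}{\sqrt{-1}\,n}\,e^{\sqrt{-1}\,n s}, \qquad
\partial_s\gamma_n = e^{\sqrt{-1}\,n s}, \qquad
|\partial_s\gamma_n| = 1, \qquad \phi(s) = n s, \\
\{\gamma_n, s\}_{\mathrm{SD}}
  &= \sqrt{-1}\,\partial_s^2\phi + \frac{1}{2}(\partial_s\phi)^2
   = \frac{n^2}{2}, \\
E[\gamma_n] &= \frac{1}{2\pi}\int_0^{2\pi}\frac{n^2}{2}\,ds = \frac{n^2}{2}.
\end{align*}
```

Under these assumptions the energy grows quadratically with the winding number $n$, and it is positive for every such circle, in line with Lemma 2.19.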
3. KdVH Flow

Our studies are based upon the discovery of Goldstein and Petrich [37, 38] on the MKdV flow for a loop in $\mathbb{C}$ and that of Langer and Perline [9] on the nonlinear Schrödinger flow for a loop in $\mathbb{R}^3$. Using their results, one of the authors studied the moduli of loops in $\mathbb{C}$ [13] and loops in $\mathbb{R}^3$ [14]. Our purpose is to give mathematical implications of these works [13, 14] using results of Pedit [16]. In this section, we will give our main Theorem 3.4 and its proof, which concern a relation between the moduli of a quantized elastica in $\mathbb{P}$ and the KdV flow. In order to express the system of the KdV equation, we will introduce the differential algebra and its division algebra before our main arguments in this section.

Definition 3.1 ([24]). (1) The differential ring $\mathcal{D}_s$ is defined by
$$\mathcal{D}_s := \left\{\sum_{k \ge 0}^{N} a_k(s)\,\partial_s^k \,\middle|\, N < \infty,\ a_k(s) \in C^\infty(S^1, \mathbb{C}),\ s \in S^1\right\}.$$
(2) The degree of a differential operator $D \in \mathcal{D}_s$ is denoted by $\deg D$, $\deg : \mathcal{D}_s \to \mathbb{Z}_{\ge 0}$, where $\mathbb{Z}_{\ge 0}$ is the set of non-negative integers.
(3) The micro-differential ring $\mathcal{E}_s$ to $\mathcal{D}_s$ is defined by
$$\mathcal{E}_s := \left\{\sum_{k = -\infty}^{N} a_k(s)\,\partial_s^k \,\middle|\, N < \infty,\ a_k(s) \in C^\infty(S^1, \mathbb{C}),\ s \in S^1\right\},$$
where $\deg : \mathcal{E}_s \to \mathbb{Z}$ and the product is defined by the extended Leibniz rule,
$$\partial_s^n a = \sum_{r=0}^{\infty}\binom{n}{r}(\partial_s^r a)\,\partial_s^{n-r}, \qquad \binom{n}{r} := \frac{1}{r!}\,n(n-1)\cdots(n-r+1).$$
(4) The projections $+$ and $-$ are defined by
$$+ : \mathcal{E}_s \to \mathcal{D}_s, \quad (L \mapsto L_+), \qquad - : \mathcal{E}_s \to \mathcal{E}_s\setminus\mathcal{D}_s, \quad (L \mapsto L_-,\ L = L_+ + L_-).$$
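The extended Leibniz rule in (3) is easiest to read off for small negative $n$. A minimal sketch in Python that tabulates the generalized binomial coefficients; the helper names `gbinom` and `leibniz_coefficients` are ours, introduced only for this illustration:

```python
from fractions import Fraction


def gbinom(n, r):
    """Generalized binomial coefficient n(n-1)...(n-r+1)/r! for integer n, r >= 0."""
    c = Fraction(1)
    for j in range(r):
        c = c * (n - j) / (j + 1)
    return c


def leibniz_coefficients(n, order):
    """Coefficients c_r in  d^n a = sum_r c_r (d^r a) d^(n-r),  for r = 0..order."""
    return [gbinom(n, r) for r in range(order + 1)]


if __name__ == "__main__":
    # n = -1 gives alternating signs: d^-1 a = a d^-1 - a' d^-2 + a'' d^-3 - ...
    print(leibniz_coefficients(-1, 3))
    # n = 2 reproduces the ordinary Leibniz rule, and the series terminates
    print(leibniz_coefficients(2, 4))
```

For non-negative $n$ the coefficients vanish for $r > n$, so the sum is finite; for negative $n$ the series is infinite, which is why the ring $\mathcal{E}_s$ allows arbitrarily negative powers of $\partial_s$.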
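Hereafter we will write a map from $S^1$ to $\mathbb{P}$ and the corresponding elements of $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ and of its quotient by ${\rm Isom}(S^1)$ by the same $\gamma$. Noting Remark 2.17, let us define the KdV and KdVH flows, which satisfy the isometric condition as in Proposition 3.11.

Definition 3.2 (KdV flow and KdVH flow). (1) The KdV flow is defined as the immersion
$$\gamma_\circ : \mathbb{R} \hookrightarrow \mathcal{M}^{\mathbb{P}}_{\rm elas} \quad \text{and} \quad \psi_\circ : \mathbb{R} \hookrightarrow \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}, \qquad (t \mapsto (\gamma_t, \psi_t)),$$
which satisfies the following properties:
(1-1) $\gamma_t(s) = \varpi \circ \psi_t(s)$, for each $t \in \mathbb{R}$.
(1-2) $u(s, t) := \{\gamma_t(s), s\}_{\rm SD}/2$ obeys the KdV equation,
$$\partial_t u + 6u\,\partial_s u + \partial_s^3 u = 0.$$
If for a point $\gamma \in \mathcal{M}^{\mathbb{P}}_{\rm elas}$ and its corresponding point $\psi \in \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}$, there is one of the KdV flows such that $\gamma_t = \gamma$ and $\psi_t = \psi$ for some $t \in \mathbb{R}$, we say that $\gamma$ or $\psi$ belongs to the KdV flow $\gamma_\circ$ or $\psi_\circ$.
(2) Let us introduce a formal infinite dimensional parameter space,
$$V^\infty := S^1 \times \left(\prod_{n=1}^{\infty}\mathbb{R}\right), \qquad t = (t_1, t_2, t_3, \ldots) \in V^\infty, \quad t_1 \in S^1.$$
Then the KdVH flow is defined as the immersion
$$(\gamma_\circ, \psi_\circ) \equiv \phi_{\partial_s u, t} : V^\infty \hookrightarrow \mathcal{M}^{\mathbb{P}}_{\rm elas} \times \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}, \qquad (t \mapsto (\gamma_t, \psi_t)),$$
which satisfies the following conditions:
(1-1) $\gamma_t(s) = \varpi \circ \psi_t(s)$,
(1-2) $\phi_{\partial_s u, t}$ is given by
$$\gamma(s, t) \to \gamma(s, t + \delta t) = \exp\left(\sum_{n=1}\delta t_n\,\partial_{t_n}\right)\gamma(s, t),$$
whose each $t_n$ deformation obeys the $n$th KdV equation ($n \ge 1$),
$$\partial_{t_n}u = -\Omega^{n-1}\partial_s u,$$
where $\Omega$ is a micro-differential operator, $\Omega = (\partial_s^2 + 2u + 2\partial_s u\,\partial_s^{-1}) \in \mathcal{E}_s$.
If for a point $\gamma \in \mathcal{M}^{\mathbb{P}}_{\rm elas}$ and its corresponding point $\psi \in \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\rm elas}$, there is one of the KdVH flows such that $\gamma_t = \gamma$ and $\psi_t = \psi$ for some $t \in V^\infty$, we say that $\gamma$ or $\psi$ belongs to the KdVH flow $\gamma_\circ$ or $\psi_\circ$.
(3) We define a relation, $\gamma \underset{\rm KdVHf}{\sim} \gamma'$, for two points $\gamma, \gamma' \in \mathcal{M}^{\mathbb{P}}_{\rm elas}$ if these $\gamma$ and $\gamma'$ are on an orbit of the projection of the KdVH flow $\pi^{\mathbb{P}}_{\rm elas} \circ \phi_{\partial_s u, t}$, i.e. every point in the fibers $(\pi^{\mathbb{P}}_{\rm elas})^{-1}\gamma$ and $(\pi^{\mathbb{P}}_{\rm elas})^{-1}\gamma'$ belongs to the same KdVH flow.

For convenience, from here on let $\gamma_t$ ($\psi_t$) denote the KdV flow or KdVH flow instead of $\gamma_\circ$ ($\psi_\circ$).

3.3. (1) Though the well-definedness of the above definition is investigated later, these flows satisfy the isometric condition as in Proposition 3.11.
(2) We note that the space $V^\infty$ has an algebro-analytic structure induced from the equations
$$\partial_{t_{n+1}}u = \Omega\,\partial_{t_n}u.$$
(3) The $n = 2$ KdVH flow obeying $\partial_{t_2}u = -\Omega\,\partial_s u$ is identified with the KdV flow in (1) of Definition 3.2.

Our first main theorem is as follows:

Theorem 3.4. (1) The relation $\underset{\rm KdVHf}{\sim}$ in Definition 3.2 becomes an equivalence relation; for arbitrary $\gamma$ in $\mathcal{M}^{\mathbb{P}}_{\rm elas}$ there is one of the KdVH flows to which $\gamma$ belongs, and for $\gamma \underset{\rm KdVHf}{\sim} \gamma'$ and $\gamma' \underset{\rm KdVHf}{\sim} \gamma''$, we have a relation $\gamma \underset{\rm KdVHf}{\sim} \gamma''$. By this relation, we can define an equivalence class
$$\mathcal{C}[\gamma] := \{\gamma' \in \mathcal{M}^{\mathbb{P}}_{\rm elas} \mid \gamma' \underset{\rm KdVHf}{\sim} \gamma\}, \qquad \mathcal{M}^{\mathbb{P}}_{\rm elas} = \coprod_{\gamma}\mathcal{C}[\gamma].$$
Similarly we can define
$$\mathcal{C}_{\mathbb{C}}[\gamma] := \{\gamma' \in \mathcal{M}^{\mathbb{C}}_{\rm elas} \mid \gamma' \underset{\rm KdVHf}{\sim} \gamma\}, \qquad \mathcal{M}^{\mathbb{C}}_{\rm elas} = \coprod_{\gamma}\mathcal{C}_{\mathbb{C}}[\gamma].$$
(2) The KdVH flow conserves the energy $E$. In other words, for the subspace of $\mathcal{M}^{\mathbb{C}}_{\rm elas}$,
$$\mathcal{M}^{\mathbb{C}}_{{\rm elas}, E} := \{\gamma \in \mathcal{M}^{\mathbb{C}}_{\rm elas} \mid E[\gamma] - E = 0\},$$
and a curve $\gamma \in \mathcal{M}^{\mathbb{C}}_{\rm elas}$, the following relation holds:
$$\mathcal{M}^{\mathbb{C}}_{{\rm elas}, E[\gamma]} \supset \mathcal{C}_{\mathbb{C}}[\gamma].$$
(3) The moduli space of a quantized elastica $\mathcal{M}^{\mathbb{C}}_{\rm elas}$ is decomposed as
$$\mathcal{M}^{\mathbb{C}}_{\rm elas} = \coprod_{E}\mathcal{M}^{\mathbb{C}}_{{\rm elas}, E}, \qquad \mathcal{M}^{\mathbb{C}}_{{\rm elas}, E} = \coprod_{\gamma,\, E[\gamma] = E}\mathcal{C}_{\mathbb{C}}[\gamma].$$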
Noting Remark 2.16, we will investigate the moduli spaces MPelas and $M_{\mathrm{elas}}^{\mathbb{C}^2\setminus\{0\}}$ by considering flows over them, and prove our theorem. Here we mention the strategy of the proof of the theorem.
3.5. We plan to investigate MPelas by dealing with a group which is generated by a Lie algebra associated with T MPelas. By the correspondence between MPelas and $M_{\mathrm{elas}}^{\mathbb{C}^2\setminus\{0\}}$ in Proposition 2.8, we can identify γ(s) with (ψ1, ψ2)(s). We first deal with a wider class of flows φA,t in Lemma 3.6, which is characterized by a smooth function A over S¹ × [0, 1]. In Lemma 3.9, we find that an arbitrary flow φA,t approximately preserves the energy of elastica in Definition 2.18. Due to the argument in Remark 3.10, we choose a special A as A = ∂s{γ, s}SD and then the flow is identified with the KdV flow in Proposition 3.11. As shown in Propositions 3.15, 3.16, and 3.17, we use the regular properties of the KdV hierarchy and prove the theorem. Noting Remark 2.16, we have the following lemma.
Lemma 3.6 (Goldstein–Petrich, Pedit [37, 38, 16]). Let us consider a flow over $[0, \epsilon]$ for a real number $\epsilon > 0$:
$$[0, \epsilon] \to M^{\mathbb{P}}_{\mathrm{elas}}, \qquad (t \mapsto \gamma_t),$$
i.e. it is realized by an isometric deformation, $[\partial_s, \partial_t]\gamma_t(s) = 0$.
(1) Every isometric deformation γt(s) locally obeys the equation of motion,
$$\partial_t u = -\Omega A(s,t) ,$$
where $u = \{\gamma, s\}_{SD}/2$ and A(s, t) is an appropriate smooth function over $(s,t) \in S^1 \times [0, \epsilon]$.
(2) For the function A(s, t), there exists a smooth function B(s, t) such that $A(s,t) = -\partial_s B(s,t)/2$ and this equation of motion is locally rewritten as
$$\partial_t u = \frac{1}{2}\tilde\Omega B(s,t) ,$$
where $\tilde\Omega := \Omega\partial_s$, $\tilde\Omega = (\partial_s^3 + 2u\partial_s + 2\partial_s u)$.
Proof. Using the one-to-one correspondence between MPelas and $M_{\mathrm{elas}}^{\mathbb{C}^2\setminus\{0\}}$, we lift the flow γt to ψt := σγt. In this proof, we consider representative elements of the image of its evaluation map, γt(s) and ψt(s). Due to the linear independence given by det(∂sψt(s), ψt(s)) = 1, we express the deformation in terms of ψt(s) and ∂sψt(s);
$$\partial_t\psi_t(s) = (A(s,t) + B(s,t)\partial_s)\psi_t(s) ,$$
where A(s, t) and B(s, t) are smooth functions over (s, t). However, from ∂t det(ψt(s), ∂sψt(s)) = 0, we have the constraint,
$$\partial_s B(s,t) = -2A(s,t) , \tag{3.1}$$
using [∂s, ∂t]ψt(s) = 0. Noting $u(s,t) = -(\partial_s^2\psi_2(s))/\psi_2(s)$, a straightforward computation of ∂t u(t, s) gives the equation in (1). Conversely, if the equation is satisfied, we can reduce it to [∂t, ∂s]γt(s) = 0. Similarly we obtain (2).
Let us introduce another formal infinite dimensional parameter space, $t = (t_1, t_2, t_3, \ldots) \in [0, \epsilon]^\infty$, and a formal multiple flow φA,t with the infinite dimensional parameters, which is locally defined.
Definition 3.7. For $t \in [0, \epsilon]^\infty$ with a sufficiently small parameter $\epsilon$, we define an infinitesimal multiple flow,
$$\phi_{A,t} : [0, \epsilon]^\infty \to M^{\mathbb{P}}_{\mathrm{elas}}, \qquad (t \mapsto \gamma_t),$$
induced from the formal variation for a sufficiently small δt ($\delta t_i < \epsilon < N\delta t_i$, for a small natural number N) and the image of the evaluation map $\gamma(s,t) := \gamma_t(s)$,
$$\gamma(s,t) \mapsto \gamma(s,t+\delta t) = \exp\Big(\sum_{n=1}\delta t_n\partial_{t_n}\Big)\gamma(s,t) := \Big(1 + \sum_{n=1}\delta t_n\partial_{t_n}\Big)\gamma(s,t) + O(\delta t^2) ,$$
with local relations,
$$[\partial_s, \partial_{t_n}]\gamma(s,t) = 0, \quad (n \ge 1), \qquad \partial_{t_n}u = -\Omega^{n-1}A(s,t), \quad (n \ge 1),$$
where $u(s,t) = \{\gamma_t(s), s\}_{SD}/2$, and A(s, t) and B(s, t) are appropriate smooth functions over $S^1 \times [0,\epsilon]^\infty$ such that 2A = −∂sB.
Remark 3.8. (1) In terms of the definition of the exponential function to the base e,
$$\exp(O) = \lim_{N' \to \infty}\Big(1 + \frac{O}{N'}\Big)^{N'} ,$$
the development of δtn generates $[0,\epsilon]^\infty$. By tuning N′ compatibly with N in Definition 3.7, we can define the exponential action on γ(s, t).
(2) If $\Omega^{n-1}A(s,t)$ vanishes for n > M for a natural number M, the deformation is finite dimensional. Then the flow φA,t is well-defined for a sufficiently small $\epsilon$.
(3) In general, the above flow φA,t is a formal one and its well-definedness is not guaranteed. However, if it is well-defined, it gives an isometric deformation of a curve γ(s) ∈ MPelas. In fact, due to the relation $\partial_{t_{n+1}} = \Omega\partial_{t_n}$, we have the flow
$$\partial_{t_n}\psi_t(s) = (A_n + B_n\partial_s)\psi_t(s) ,$$
where $A_2 = A = -\partial_s B/2$, $B_2 = B$, $A_1 = \Omega^{-1}A$, and
$$A_n = \Omega A_{n-1}, \qquad \partial_s B_n = \tilde\Omega B_{n-1}, \qquad (n \ge 2) .$$
Then the above relation $\partial_{t_n}u = -\Omega^{n-1}A(s,t)$ turns out to be the standard type of the flow for An in Lemma 3.6.
Lemma 3.9. For γ ∈ $M^{\mathbb{C}}_{\mathrm{elas}}$ and $A \in C^\infty(S^1 \times [0,\epsilon]^\infty, \mathbb{C})$, the infinitesimal flow φA,t preserves the energy functional modulo (δt)², provided that φA,t is well-defined,
$$\frac{1}{2\pi}\int_{S^1}\{\gamma_t, s\}_{SD}\,ds = \frac{1}{2\pi}\int_{S^1}\{\gamma_{t+\delta t}, s\}_{SD}\,ds + O((\delta t)^2) .$$
Proof. Noting Remark 3.8 and by Eq. (3.1) in the proof of Lemma 3.6, we have the relations,
$$\partial_s B_n = -2A_n = -2\Omega A_{n-1}, \qquad \partial_s B_n = \tilde\Omega B_{n-1} .$$
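When we apply these relations to the right hand side of the lemma ($u(s,t) := \{\gamma_t, s\}_{SD}/2$),
$$\begin{aligned}
\int_{S^1} u(s,t+\delta t)\,ds &= \int_{S^1} u(s,t)\,ds + \sum_{n=1}\delta t_n\int_{S^1}\partial_{t_n}u(s,t)\,ds + O((\delta t)^2) \\
&= \int_{S^1} u(s,t)\,ds - \sum_{n=2}\delta t_n\int_{S^1}\Omega A_n\,ds + \frac{\delta t_1}{2}\int_{S^1}\partial_s B\,ds + O((\delta t)^2) \\
&= \int_{S^1} u(s,t)\,ds + \frac{1}{2}\sum_{n}\delta t_n\int_{S^1}\partial_s B_{n+1}(s,t)\,ds + O((\delta t)^2) \\
&= \int_{S^1} u(s,t)\,ds + O((\delta t)^2) .
\end{aligned}$$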
This completes the proof of the lemma.
Remark 3.10. (1) This flow φA,t could be regarded as an infinitesimal action of a diffeomorphism of MPelas, which forms an (infinite dimensional) Lie group GA if it can be well-defined.
(2) We can regard S¹ as a Riemannian manifold with a metric ds². Then ∂s is the Killing vector and $\exp(\sqrt{-1}s)$ is a geodesic flow. They are a generator and an element of the Isom(S¹) = U(1) group respectively;
$$U(1) : S^1 \to S^1, \qquad \exp(\sqrt{-1}s) \mapsto \exp(\sqrt{-1}(s + s_0)) ,$$
for g0 ∈ U(1), g0 gives a natural automorphism of MPelas.
(3) Since there is the natural projection $\pi^{\mathbb{P}}_{\mathrm{elas}}$ : MPelas → MPelas, the U(1) action on γ ∈ MPelas must be trivial, g0γ = γ for g0 ∈ U(1), and we have the relation $g_0 \circ \pi^{\mathbb{P}}_{\mathrm{elas}} = \pi^{\mathbb{P}}_{\mathrm{elas}} \circ g_0$. It implies that the immersion of the loop S¹ is consistent with the U(1) action.
(4) For a curve γ(s) ∈ MPelas, we can locally express the U(1) action,
$$(\partial_s - \partial_{s_0})\{\gamma, s\}_{SD}(s, s_0) = 0 , \qquad (\partial_s - \partial_{s_0})\gamma(s, s_0) = 0 .$$
These equations faithfully represent the u(1) symmetry or translation, γ(s) → γ(s − s0) in MPelas.
(5) Due to the above remarks, GA, if it exists, should include G0 = U(1) as its normal subgroup. Accordingly it is natural that A in Definition 3.7 starts with the internal symmetry: A = ∂su and ∂t1u = ∂su for u = {γ, s}SD.
(6) When we consider the multiple flow generated by φ∂su,t (A = ∂su), it means that we deal with the variation,
$$\gamma(s,t) \to \gamma(s,t+\delta t) = \exp\Big(\sum_{n}\delta t_n\partial_{t_n}\Big)\gamma(s,t) ,$$
which obeys
$$\partial_{t_n}u = -\Omega^{n-1}\partial_s u .$$
Following Definition 3.2, they are locally identified with the KdVH flow.
(7) Due to Remark 2.16, this multiple flow is locally well-defined in MPelas.
(8) Physically speaking, in the above arguments we are implicitly investigating the partition function of an “elastic” curve in P. We require that the partition function naturally include classical shapes which have the above trivial translation symmetry as the Goldstone bosons or the Jacobi fields [39]. This requirement makes the group structure acting on MPelas (if it exists) contain this trivial symmetry [13].
We summarize the above results as a proposition.
Proposition 3.11. (1) The multiple flow φ∂su,t contains a subflow φ∂su,t1 generated by
$$(\partial_s - \partial_{s_0})\{\gamma, s\}_{SD} = 0 .$$
The domain of $t_1 \in [0, \epsilon]$ is extended to S¹ and is consistent with the projection $\pi^{\mathbb{P}}_{\mathrm{elas}}$ : MPelas → MPelas, i.e. there exists ϕ∂su,t such that $\pi^{\mathbb{P}}_{\mathrm{elas}} \circ \phi_{\partial_s u,t} = \varphi_{\partial_s u,t} \circ \pi^{\mathbb{P}}_{\mathrm{elas}}$.
(2) By choosing A = ∂su for u = {γ, s}SD/2, the flow φ∂su,t defined in Definition 3.7 is well-defined as a flow in MPelas and its domain can be extended, $[0, \epsilon]^\infty \to V^\infty$.
(3) φ∂su,t is identified with the KdVH flow φ∂su,t by extending $[0, \epsilon]^\infty$ to $V^\infty$.
(4) There exists a flow ϕ∂su,t in MPelas such that $\pi^{\mathbb{P}}_{\mathrm{elas}} \circ \phi_{\partial_s u,t} = \varphi_{\partial_s u,t} \circ \pi^{\mathbb{P}}_{\mathrm{elas}}$; we also call it the KdVH flow.
(5) For the KdVH flow, we have algebraic relations among the multi-times $t_n$ as $\partial_{t_{n+1}}u = \Omega\,\partial_{t_n}u$.
(6) The KdVH flow preserves the decomposition in Remark 2.11.
(7) The restriction of the KdVH flow to $M^{\mathbb{C}}_{\mathrm{elas}}$ preserves the energy functional exactly.
Proof. (1) is obvious from Remark 3.10. If (2) is satisfied, (3), (4) and (5) follow naturally from Remark 3.10. Since the KdVH flow consists of isometric deformations, (6) is obvious. (2) and (7) will be established by Propositions 3.15 and 3.16.
First we note that (7) should be compared with Lemma 3.9. Next we also note that in order to prove (2), we should check (i) the well-definedness of the KdVH flow locally and (ii) the extension of the domain to $V^\infty$. If the well-definedness of the KdVH flow is guaranteed, we can find the neighborhood of a point γ ∈ MPelas by applying the KdVH flow to γ as its initial state, because the KdVH flow consists of isometric deformations. We can consider the process in $M^{\mathbb{C},2\pi}_{\mathrm{elas}}$ as mentioned in Remark 2.15. Here we introduce some terminology of dynamical systems, apart from the notations of our main subject [40].
Definition 3.12 ([7, 40]). We consider a manifold M equipped with a closed real 2-form ω. We use the notation: $i_Y v$ is the interior product of a vector field Y and a differential form v.
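(1) A vector field Y is called symplectic if $i_Y\omega$ is closed.
(2) A vector field Y is called a Hamiltonian vector field if there exists a function f such that $i_Y\omega = df$.
Corresponding to Definition 3.12, we will define quantities in the KdV flow in Definition 3.13 and give Proposition 3.15 by regarding MPelas as an (infinite dimensional) manifold [40].
Definition 3.13 ([40]). (1) In our KdVH flow, we define a 2-form ω for vectors Y1 and Y2 over MPelas,
$$\omega(Y_1, Y_2) := \frac{1}{2}\int_{S^1}\int_0^s \big(Y_2(s)Y_1(s') - Y_1(s)Y_2(s')\big)\,ds'\,ds .$$
(2) We define the quantities $X_n$ and $h_n$ and the variation $\bar\delta/\bar\delta u$ for the KdVH flow: $h_0 = u/2$, $X_0 = 0$ and
$$X_n(u) := \Omega^{n-1}\partial_s u , \qquad X_n(u) = \partial_s\frac{\bar\delta h_n}{\bar\delta u} ,$$
where
$$\frac{\bar\delta h_n}{\bar\delta u} = \frac{\delta h_n}{\delta u} - \partial_s\frac{\delta h_n}{\delta(\partial_s u)} + \partial_s^2\frac{\delta h_n}{\delta(\partial_s^2 u)} - \partial_s^3\frac{\delta h_n}{\delta(\partial_s^3 u)} + \cdots .$$
The existence of such $h_n$ will be guaranteed in Proposition 4.18. Noting $\tilde\Omega = \Omega\partial_s$, from the definition we have a recursion relation,
$$\partial_s\frac{\bar\delta h_n}{\bar\delta u} = \tilde\Omega\frac{\bar\delta h_{n-1}}{\bar\delta u} ,$$
if $h_n$ exists. In Proposition 4.18, we show the existence of a set of functionals
$$\bar h_n = \mathrm{res}\,\frac{2^{2n}}{2n-1}L^{(2n-1)/2} ,$$
satisfying $\partial_s\bar h_n = \Omega\,\partial_s\bar h_{n-1}$ with $\bar h_1 = u/2$. Here $L = \partial_s^2 + u$ and “res” means the coefficient of $\partial_s^{-1}$ in the notations of Sec. 4. In other words, $\partial_s\bar h_n = \Omega^n\partial_s\bar h_1 = \Omega^n\partial_s u/2$. Further, from the definition, we have [22]
$$\int\frac{\bar\delta\bar h_n}{\bar\delta u}\,ds \equiv 2(2n-3)\int\bar h_{n-1}\,ds ,$$
due to periodicity, $\int\frac{\bar\delta\bar h_n}{\bar\delta u} \equiv \int\frac{\delta\bar h_n}{\delta u}$, and since
$$\begin{aligned}
\frac{\bar\delta}{\bar\delta u}\int\mathrm{res}\,(L^{1/2})^r\,ds &= \int\mathrm{res}\Big(\sum_{i=1}^{r-1}(L^{1/2})^i\,\frac{\bar\delta L^{1/2}}{\bar\delta u}\,(L^{1/2})^{r-i-1}\Big)ds \\
&= r\int\mathrm{res}\,L^{(r-1)/2}\frac{\bar\delta L^{1/2}}{\bar\delta u}\,ds = \frac{r}{2}\int\mathrm{res}\,L^{(r-2)/2}\frac{\bar\delta L}{\bar\delta u}\,ds \\
&= \frac{r}{2}\int\mathrm{res}\,(L^{(r-2)/2})\,ds .
\end{aligned}$$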
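Let $h_n \equiv 2n\,\bar h_{n-1}/(2n+1)$ modulo periodic functions and $X_n = \partial_s\bar h_n$ with $\bar h_0 = 0$. Hence Definition 3.13 is guaranteed by Proposition 4.18. Here we give the vector fields $X_n$ and quantities $h_n$ explicitly:
Example 3.14 (KdVH flow).
n = 0: $X_0(u) = 0$, $\quad h_0 = \frac{1}{2}u$
n = 1: $X_1(u) = \partial_s u$, $\quad h_1 = \frac{1}{2}u^2$
n = 2: $X_2(u) = \partial_s(3u^2 + \partial_s^2 u)$, $\quad h_2 = u^3 + \frac{1}{2}(\partial_s u)^2$
n = 3: $X_3(u) = \partial_s(10u^3 + 5(\partial_s u)^2 + 10u\,\partial_s^2 u + \partial_s^4 u)$, $\quad h_3 = \frac{5}{2}u^4 + 10u(\partial_s u)^3 + (\partial_s^2 u)^2$.
The corresponding equations are
n = 1: $\partial_{t_1}u + \partial_s u = 0$,
n = 2: $\partial_{t_2}u + 6u\,\partial_s u + \partial_s^3 u = 0$,
n = 3: $\partial_{t_3}u + 30u^2\partial_s u + 20\,\partial_s u\,\partial_s^2 u + 10u\,\partial_s^3 u + \partial_s^5 u = 0$.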
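The vector fields in Example 3.14 can be checked against the recursion $X_{n+1} = \Omega X_n$ with $\Omega = \partial_s^2 + 2u + 2\partial_s u\,\partial_s^{-1}$. The following is only a minimal sketch, not taken from the paper: the dictionary encoding of differential polynomials and the helper names (`d`, `mul`, `omega_on_derivative`) are hypothetical, and it assumes that $\partial_s^{-1}X_n$ is the bracketed polynomial $F_n$ with $X_n = \partial_s F_n$ (integration constant set to zero).

```python
from collections import defaultdict

# A differential polynomial in u is a dict {monomial: coefficient}.
# A monomial is a tuple (e0, e1, e2, ...) standing for u^e0 * (u')^e1 * (u'')^e2 * ...

def _norm(mono):
    """Drop trailing zero exponents so equal monomials have equal keys."""
    mono = list(mono)
    while mono and mono[-1] == 0:
        mono.pop()
    return tuple(mono)

def add(p, q):
    r = defaultdict(int)
    for poly in (p, q):
        for m, c in poly.items():
            r[m] += c
    return {m: c for m, c in r.items() if c != 0}

def scale(p, k):
    return {m: k * c for m, c in p.items() if k * c != 0}

def mul(p, q):
    r = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            n = max(len(m1), len(m2))
            a = m1 + (0,) * (n - len(m1))
            b = m2 + (0,) * (n - len(m2))
            r[_norm(tuple(x + y for x, y in zip(a, b)))] += c1 * c2
    return {m: c for m, c in r.items() if c != 0}

def d(p):
    """Total derivative d/ds, using the Leibniz rule on each monomial."""
    r = defaultdict(int)
    for m, c in p.items():
        for i, e in enumerate(m):
            if e == 0:
                continue
            new = list(m) + [0]
            new[i] -= 1
            new[i + 1] += 1
            r[_norm(new)] += c * e
    return {m: c for m, c in r.items() if c != 0}

U = {(1,): 1}  # the function u itself

def omega_on_derivative(F):
    """Omega applied to X = dF/ds: d^2 X + 2 u X + 2 d(u F)."""
    X = d(F)
    return add(add(d(d(X)), scale(mul(U, X), 2)), scale(d(mul(U, F)), 2))

# Bracketed polynomials from Example 3.14, so that X_n = d(F_n).
F1 = {(1,): 1}                                                # u
F2 = {(2,): 3, (0, 0, 1): 1}                                  # 3u^2 + u''
F3 = {(3,): 10, (0, 2): 5, (1, 0, 1): 10, (0, 0, 0, 0, 1): 1}  # 10u^3 + 5u'^2 + 10uu'' + u''''
```

With this encoding, `omega_on_derivative(F1) == d(F2)` and `omega_on_derivative(F2) == d(F3)` express $\Omega X_1 = X_2$ and $\Omega X_2 = X_3$.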
Proposition 3.15 ([40]). (1) ω is a cocycle 2-form.
(2) The KdVH flow has Hamiltonian structures with Hamiltonians,
$$H_n := \int_{S^1} h_n\,ds , \qquad (n \ge 0) ,$$
with involutive relations for the Poisson bracket, $\{H_n, H_m\} := \omega(X_n, X_m)$,
$$\{H_n, H_m\} = 0 , \qquad \text{for all } n, m .$$
(3) The nth KdV flow has infinitely many conserved quantities $H_m$, $m \in \mathbb{Z}_{\ge 0}$.
(4) We have the relation,
$$[\partial_{t_n}, \partial_{t_m}]u = 0 , \qquad \text{for all } n, m .$$
(5) For an arbitrary curve γ, the nth (n ≥ 1) KdV flow is uniquely determined.
Proof. We prove these following the arguments in [40]. First we show that $i_X\omega$ is exact: for all n > 0, we have the relation,
$$i_{X_n}\omega(v) = \omega(X_n(u), v) = \int_{S^1} ds\,\frac{\bar\delta h_n}{\bar\delta u}\,v = (dH_n)(v) , \qquad \text{for } n \ge 1 .$$
Hence $X_n(u)$ is a Hamiltonian vector field by Definition 3.12(2). Our system is a Hamiltonian system and the nth KdV equation is given by
$$u_{t_n} = X_n(u) .$$
Next we show that the KdVH flow is involutive. As the time $t_m$ development of $H_n$ is given by
$$\partial_{t_m}H_n = \int\frac{\bar\delta h_n}{\bar\delta u}\,\partial_{t_m}u = \{H_n, H_m\} ,$$
the involution relations are important. From Definition 3.13, we have relations for n ≥ 1,
$$X_n = \partial_s\frac{\bar\delta h_n}{\bar\delta u} = \tilde\Omega\frac{\bar\delta h_{n-1}}{\bar\delta u} .$$
Since, in terms of ω in Definition 3.13(1), the Poisson brackets between the $H_n$'s are given by $\{H_n, H_m\} = \omega(X_n, X_m)$, we obtain the following relation for n, m > 0:
$$\{H_n, H_m\} = \int_{S^1} ds\,\frac{\bar\delta h_n}{\bar\delta u}\,X_m(u) = \int_{S^1} ds\,\frac{\bar\delta h_n}{\bar\delta u}\,\tilde\Omega\frac{\bar\delta h_{m-1}}{\bar\delta u} = \int_{S^1} ds\,\tilde\Omega\frac{\bar\delta h_{n-1}}{\bar\delta u}\,\frac{\bar\delta h_m}{\bar\delta u} = \{H_{n+1}, H_{m-1}\} .$$
Using this relation and noting $\{H_n, H_m\} = -\{H_m, H_n\}$, we prove the involutive relation. When both n and m are even or both n and m are odd,
$$\{H_n, H_m\} = \{H_{(n+m)/2}, H_{(n+m)/2}\} = 0 .$$
On the other hand, when n is odd and m is even,
$$\{H_n, H_m\} = \{H_{(n+m-1)/2}, H_{(n+m-1)/2+1}\} = \{H_{(n+m-1)/2+1}, H_{(n+m-1)/2}\} = 0 .$$
Hence the $H_n$'s are involutive and the KdVH flow has infinitely many conserved quantities. We can express the relation $\{H_n, H_m\} = 0$ by using a vector representation for n, m > 0,
$$[X_n, X_m] = 0 .$$
In the solution of the KdV hierarchy, we can identify $\partial_{t_n}$ with $X_n$ itself: $\partial_{t_n} \equiv X_n$. Hence we obtain (4). Further, (5) can be proved as follows. For a given curve γ, we uniquely have the data u, ∂su, ∂s²u, . . . . The KdV equations are given by
$$\partial_t u = f(u, \partial_s u, \partial_s^2 u, \ldots) .$$
Hence for an arbitrary curve γ ∈ MPelas, the KdVH flow is uniquely determined by the KdV hierarchy.
Due to the integrability, the “time” development of γ is stably determined. Since the KdVH flow is a Hamiltonian system with infinitely many time parameters, we can find a group element g ∈ G such that $\gamma_{t+t_0} = g_{t_0}\gamma_t$. The multiplication is given as $g_{t_0}g_t = g_{t_0+t}$; $g_0$ is the unit and $g_{-t}$ is the inverse of $g_t$. Further, Proposition 3.15(4) means that $[\partial_{t_1}, \partial_{t_n}]u = 0$ and the projection $\pi^{\mathbb{P}}_{\mathrm{elas}}$ : MPelas → MPelas is consistent with the KdV flow.
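The index-shifting argument in the proof of involutivity can be illustrated on the lowest cases. The lines below are only a worked illustration using the two facts stated in the proof, the shift $\{H_n, H_m\} = \{H_{n+1}, H_{m-1}\}$ and the antisymmetry of the bracket; the particular indices are chosen here for illustration.

```latex
\begin{align*}
\{H_1, H_3\} &= \{H_2, H_2\} = 0
  && \text{(one shift, then antisymmetry)},\\
\{H_1, H_2\} &= \{H_2, H_1\} = -\{H_1, H_2\}
  \;\Longrightarrow\; \{H_1, H_2\} = 0
  && \text{(one shift swaps the indices)},\\
\{H_2, H_5\} &= \{H_3, H_4\} = \{H_4, H_3\} = -\{H_3, H_4\}
  \;\Longrightarrow\; \{H_2, H_5\} = 0 .
\end{align*}
```

When n + m is even the shifts end on a bracket of a quantity with itself; when n + m is odd they end on a bracket that equals its own negative.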
Further, as solving the KdV hierarchy is an initial value problem of first order in time, for an arbitrary γ ∈ MPelas we can find the KdVH flow to which γ belongs as an initial state. We give a proposition as a summary of the above arguments.
Proposition 3.16. (1) There is an Abelian group $G := \{\exp(\sum_n t_n\partial_{t_n}) \mid t_n \in V^\infty\}$ acting on the moduli spaces MPelas and MPelas, whose orbits are identified with the KdVH flow.
(2) There is a fixed normal subgroup G0 of G, G0 = {gt1 | t1 ∈ R} ≈ U(1); G0 acts trivially upon MPelas: γ = gt1γ for γ ∈ MPelas and gt1 ∈ G0.
(3) The group G/G0 acts on MPelas.
Hence Proposition 3.11(2) is proved. We can express the equivalence class in MPelas by the group action in the following proposition.
Proposition 3.17. (1) Fixing γ ∈ MPelas, the group G/G0, whose elements are written gt2,t3,..., acts transitively upon C[γ]: for an arbitrary γ′ ∈ C[γ], we can find an element gt2,t3,... of the group G/G0 such that γ = gt2,t3,...γ′.
(2) For an arbitrary γ ∈ MPelas, there exists the KdVH flow: MPelas can be decomposed,
$$M^{\mathbb{P}}_{\mathrm{elas}} = \coprod C[\gamma] .$$
(3) For γ ∈ $M^{\mathbb{C}}_{\mathrm{elas}}$, the energy functional E[γ] is exactly conserved for the KdVH flow.
Proof. (1) and (2) are obvious from the properties of the group. (3) is proved because the energy E[γ] of the loop γ given by Definition 2.18 is identified with the conserved quantity H0. Hence Proposition 3.11(7) is proved from Proposition 3.17(3).
By Propositions 3.11, 3.15, 3.16 and 3.17, we have completely proved our main Theorem 3.4. As we have the classification of MPelas, we will use it to investigate the moduli space MPelas in the rest of this paper, because our purpose is to obtain some knowledge of the moduli space MPelas. For later convenience, we introduce a quotient space. Due to Theorem 3.4 and Proposition 3.17, MPelas has natural projections induced by the equivalence relation $\underset{\mathrm{KdVHf}}{\sim}$, i.e. πKdVHf : C[γ] ↦ (γ), where (γ) is a representative element of C[γ].
Definition 3.18. (1) We define a quotient space of the moduli space by, M Pelas := πKdVHf MPelas := MPelas / ∼ . KdVHf
(2) The natural projection is denoted by πelas : MPelas → MPelas .
Remark 3.19. We will comment on Proposition 3.15(4), [∂s , ∂tn ] = 0 for n > 0 .
As the KdVH flow is very regular, we can regard C[γ] × S¹ ∈ MPelas as a manifold. Accordingly, the ∂tn are regarded as vector fields. We will use them as generators of a cohomology in Sec. 7. (1) It means that the local length ds is preserved under the KdVH flow. (2) It can be interpreted as the Frobenius integrability conditions. (3) It is known as the compatibility condition or zero curvature condition in soliton physics.
4. Algebro-Geometric Properties of the KdV Flow I: Algebraic Properties
As we proved Theorem 3.4, in this section we use the relation between the moduli of a quantized elastica MPelas and the KdV flow in order to give a finer classification, which is based on the study of finite type flows in MPelas and MPelas. However, as this classification comes from the algebraic investigation of the KdV flow, we should replace the base function space in the category of smooth functions by that of formal power series in order to explain this classification, though this needs some subtle treatment. This section is devoted to investigations of a commutative differential ring, which were given by Mulase in [26], by Burchnall and Chaundy about seventy years ago [41–43], and by Mumford in [44]. Our argument basically follows the arguments of Mulase for the Schottky problem [26] and of Sato [24, 25]. Following their theories, we consider a part of the moduli of a quantized elastica using formal power series. Since this part is dense in MPelas, as mentioned in Theorem 4.2, the replacement of the base function field is not so critical. Although the investigation of γ as a real one-dimensional curve is our main subject, in this section and the next we deal with a hyperelliptic curve as a complex one-dimensional curve in the context of algebraic geometry. Thus readers should not confuse the term “curve” in the categories of differential geometry and algebraic geometry. We basically refer to a complex algebraic curve as an algebraic curve, hyperelliptic curve or elliptic curve, whereas we call a real curve just a curve. Let us start this section with the following lemma.
Lemma 4.1. If there is a natural number N such that $\partial_{t_N}u$ is an eigenvector of the operator Ω with an eigenvalue k ∈ C, i.e.
$$k\,\partial_{t_N}u = \Omega\,\partial_{t_N}u ,$$
then $\partial_{t_m}$ is a scalar multiple of $\partial_{t_N}$ for all m ≥ N. Further, by introducing $t'_n$ (n > N) and setting $\partial_{t'_n} := \partial_{t_n} - k^{n-N}\partial_{t_N}$, the relation becomes $\partial_{t'_n}u \equiv 0$.
Proof. This follows easily from Definition 3.2 and Proposition 3.11(5).
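The short proof of Lemma 4.1 can be spelled out as follows. This is only a sketch of the intended step, using the recursion $\partial_{t_{n+1}}u = \Omega\,\partial_{t_n}u$ of Proposition 3.11(5) and the eigenvector assumption of the lemma, with Ω acting linearly for fixed u.

```latex
\begin{align*}
\partial_{t_{n}} u
  &= \Omega\,\partial_{t_{n-1}} u
   = \cdots
   = \Omega^{\,n-N}\,\partial_{t_N} u
   = k^{\,n-N}\,\partial_{t_N} u
  && (n \ge N),\\
\partial_{t'_n} u
  &:= \partial_{t_n} u - k^{\,n-N}\,\partial_{t_N} u = 0
  && (n > N).
\end{align*}
```

So all times beyond $t_N$ act on u as multiples of $\partial_{t_N}$, which is the sense in which the flow becomes finite dimensional.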
Lemma 4.1 means that some orbits in an infinite dimensional vector space V ∞ are essentially reduced to an orbit consisting of finite N dimensional vector space. Let us refer this flow finite flow or finite N -type flow. Here we will give our second main theorem: Theorem 4.2. (1) We will write the set of the finite type flow by MPelas, finite and the set of finite g-type flow by MPelas g . The moduli space of the elastica has decomposition, a MPelas g . MPelas, finite := g 0} is called a local ideal of R. (4) Let the metric of K be |x| := e−val(x) for x ∈ K, which is called nonArchimedian metric. For example, the valuation of a commutative ring C[x] is given by its degree, i.e. for f (x) ∈ C[x], val(x) = degx (x). For a more general commutative ring, we can find a local parameter by localization at a prime ideal and its valuation is given by its degree of the local parameter. The valuation ring is a linear topological space due to the non-Archimedian metric [46]. Similarly, we have the following proposition [24], which is naturally obtained. Proposition 4.7 ([24]). (1) When we define Efm := {D ∈ Ef | deg D ≤ m}, Ef has filter, Ef = ∪m Efm ,
{0} = ∩m Efm ,
Efm ⊂ Efm+1 .
(2) Ef is a linear topological space with respect to this filter. (3) Ef is an infinite dimensional algebra given by formal power series whose elements converge in the filter topology. (4) In Ef, we can define valuations in C[[t]] and Ef: for $P = \sum_{i=-\infty}^{\infty} a_i\partial_1^i \in$ Ef,
$$\mathrm{val}(a) := \max\{m \in \mathbb{N} \mid \partial_1^m a \ne 0\}, \qquad \mathrm{val}(P) := \inf\{\mathrm{val}(a_i) - i\} .$$
Formally Proposition 4.7 is obvious from the definitions, but rigorous arguments are needed to justify it mathematically; these are given in [11, 25]. The differential operators appearing in soliton theory and in the following arguments converge in this topology.
Lemma 4.8 ([24, 26]). (1) The adjoint map for W ∈ Wf, Ad(W) : Ef → Ef ($\mathrm{Ad}(W)P = WPW^{-1}$), defines an automorphism of Ef. Ad(W)|Efm is invariant, i.e. Ad(W)Efm = Efm.
(2) For an operator $\tilde L \in \mathcal{L}$, where
$$\mathcal{L} := \Big\{D \in E^f \;\Big|\; D = \partial_s + \sum_{i=1}^{\infty} a_i\partial_s^{-i}\Big\} ,$$
we can find a unique W ∈ Wf modulo Wc such that
$$\mathrm{Ad}(W)\tilde L = \partial_s ,$$
and then this relation induces the isomorphism Wf/Wc ≈ $\mathcal{L}$.
(3) For every standard commutative subalgebra Af ⊂ Df, there is a C-subalgebra Ac ∈ Ec that is C-isomorphic to Af and satisfies
$$A^c \cap E^c_- = \{0\} .$$
Proof. (1) is trivial. (2) Let us find $W = \sum_{i=0}^{\infty} w_i\partial_1^{-i}$ such that $\tilde L = W\partial_1 W^{-1}$. Noting $\tilde L = \partial_1 + \tilde L_-$, the relation is reduced to $[\partial_1, W] = -\tilde L_- W$, i.e.
$$\partial_1 w_{k-1} = -\sum_{i+j+r=k,\ i \ge 2}\binom{1-i}{r}u_i\,\partial_1^r w_j .$$
Then we can recursively determine the $w_i$ starting from small indices, since C[[t]] has indefinite integrals. When we find two such operators $W_1$ and $W_2$, i.e. $\tilde L W_a - W_a\partial_1 = 0$, then $W_1\partial_1 W_1^{-1}W_2 - W_1 W_1^{-1}W_2\partial_1 = 0$, or $[\partial_1, W_1^{-1}W_2] = 0$. Hence $W_1^{-1}W_2 \in$ Wc. (3) Let us take a monic element of Af of the form
$$L_n = \partial_1^n + b_{n-2}\partial_1^{n-2} + \cdots + b_0 .$$
Then we have $\tilde L = L_n \in \mathcal{L}$. Let S ∈ Wf be such that $\tilde L = S\partial_1 S^{-1}$. Then $A^c := S^{-1}A^f S$. For an arbitrary P ∈ Af, $S^{-1}PS$ belongs to Ec, i.e. $[\partial_1, S^{-1}PS] = 0$, because $[\partial_1, S^{-1}PS] = S^{-1}[S\partial_1 S^{-1}, P]S = S^{-1}[L, P]S = 0$ due to the assumption [P, L] = 0. Further, the inner automorphism preserves the order of the operator.
Although we will not prove it here, it is known that if we define a left Ef-module, Vf := Ef/Ef t1, the homomorphism from Ef to the endomorphisms of Vf is injective. In other words, the action is faithful if it is regarded as a representation. Vf has a valuation topology val and becomes a graded module; there the valuation and graded topologies are identified. Further, we have a natural Ec-module isomorphism [24],
$$V^f \approx E^c .$$
Further, we consider an embedding of a submodule Vf0 into Vf with zero-index map for a certain index, which can be regarded as a Grassmannian manifold in a certain sense. We note that the above isomorphism is not an isomorphism of Ef-modules. In this article, we characterize such an embedding by a finite subset of natural numbers F, which can be regarded as the Weierstrass gap at the infinite point of a corresponding algebraic curve. Further, we should note that the adjoint map Ad is the key of the Sato theory, and in this section we sometimes call it a gauge transformation.
Definition 4.9 ([26]). A C-subalgebra Ac in Ec is called a rank one subalgebra if it has a C-linear basis whose orders correspond to all integers except a finite subset F, i.e.
$$\mathbb{N} - F = N_{A^c} := \{n \in \mathbb{N} \mid \exists\,P \in A^c \text{ such that } \mathrm{ord}(P) = n\}$$
and
$$A^c \cap E^c_- = \{0\} .$$
As Ac is a C-algebra, there is a monic element Pn in Ac of order n ∈ N − F with P0 := 1. Then {Pn | n ∈ N − F } forms a C-linear basis of Ac . In other words arbitrary P ∈ Ac can be represented by C-linear combinations of monic Pn elements. In fact if the order of P is m, there exists c ∈ C such that the order of P − cPm ∈ Ac must be less than m. Such a recursion process gives us the representations. Lemma 4.10 ([26]). Let Ac 6= C be a rank one subalgebra, and P and Q be elements in Ac whose orders are coprime. (1) dimC (Ac /C[P, Q]) < +∞. (2) Ac is finite C[P ]-module. There is a nontrivial polynomial f (x, y) ∈ C[x, y] such that f (P, Q) = 0. (3) The transcendence degree of Ac over C is one. (4) By regarding Ac as a graded module with respect to degree of differential operators: Ac(n) := {P ∈ Ac | ord(P ) ≤ n} ,
$$A^c_n := A^c_{(n)} \oplus A^c_{(n-1)}\cdot I \oplus A^c_{(n-2)}\cdot I^2 \oplus \cdots \oplus A^c_{(0)}\cdot I^n ,$$
$$\mathrm{gr}\,A^c = \bigoplus_{n=0}^{\infty} A^c_n ,$$
we regard Proj(gr Ac) as an algebraic curve C. Here I is the identity of Ac. (5) Let $H^1(A^c) = E^c/(A^c \oplus E^c_-)$. We have
$$H^1(C, O_C^{\times}) = H^1(A^c) ,$$
where O is the sheaf of holomorphic functions on C of (4) and $O^{\times}$ is a multiplicative subset of O.
Proof. Let GCD(m, n) denote the greatest common divisor of two non-negative integers m and n. Since the rank of Ac is one, we have the relation
$$1 = \min\{\mathrm{GCD}(\mathrm{ord}(P'), \mathrm{ord}(Q')) \mid P', Q' \in A^c\}$$
and the orders of P and Q are coprime. Hence C[P, Q] ⊂ Ac. As $N_{\mathbb{C}[P,Q]} = \mathbb{N}$ and C[P, Q] is a C-linear vector space, $N_{A^c} - N_{\mathbb{C}[P,Q]}$ must be a finite set. Hence $\dim_{\mathbb{C}}(A^c/\mathbb{C}[P,Q])$ must be finite. On the other hand, since $\mathbb{N} - \{\mathrm{ord}\{P^m, Q^n\} \mid m, n \in \mathbb{Z}_{\ge 0}\}$ must be a finite set, P and Q satisfy an algebraic relation f(P, Q) = 0. Further, the proofs of (4) and (5) follow from the theory of ordinary commutative rings [46].
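The finiteness used in the proof rests on an elementary fact about orders: the products $P^mQ^n$ have orders $m\,\mathrm{ord}(P) + n\,\mathrm{ord}(Q)$, and when the two orders are coprime only finitely many natural numbers are missed. The following minimal sketch, with a hypothetical helper name `order_gaps`, lists the missed orders for two coprime positive integers; it illustrates the counting only and is not part of the construction of Ac.

```python
from math import gcd

def order_gaps(p, q):
    """Natural numbers not of the form m*p + n*q with m, n >= 0, for coprime p, q >= 1."""
    if p < 1 or q < 1 or gcd(p, q) != 1:
        raise ValueError("p and q must be coprime positive integers")
    # For coprime p, q the largest missed number is p*q - p - q, so scanning up to p*q suffices.
    limit = p * q
    reachable = [False] * (limit + 1)
    reachable[0] = True
    for n in range(1, limit + 1):
        reachable[n] = (n >= p and reachable[n - p]) or (n >= q and reachable[n - q])
    return [n for n in range(limit + 1) if not reachable[n]]
```

For ord(P) = 2 and ord(Q) = 2g + 1 the missed orders are the odd numbers 1, 3, ..., 2g − 1, for instance `order_gaps(2, 7)` gives `[1, 3, 5]`.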
We note that F in Definition 4.9 is related to the Weierstrass gap at infinity point of the algebraic curve C. After this point, we will concentrate our attention only on the operator L = ∂12 + u, which is related to the KdV equation: L2 := {D ∈ Ef | D = ∂12 + u, u ∈ C[[t1 ]]} . We give its related operators as examples.
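Example 4.11.
$$\begin{aligned}
L^{1/2} &= \partial_1 + \frac{1}{2}u\,\partial_1^{-1} - \frac{1}{4}(\partial_1 u)\partial_1^{-2} + \frac{1}{8}\big((\partial_1^2 u) - u^2\big)\partial_1^{-3} + \frac{1}{16}\big(6u(\partial_1 u) - \partial_1^3 u\big)\partial_1^{-4} \\
&\quad - \frac{1}{32}\big({-2u^3} + 14u(\partial_1^2 u) + 11(\partial_1 u)^2 - (\partial_1^4 u)\big)\partial_1^{-5} + \cdots , \\
4L^{3/2} &= 4\partial_1^3 + 3\partial_1 u + 3u\,\partial_1 + \Big(\frac{1}{2}\partial_1^2 u + \frac{3}{2}u^2\Big)\partial_1^{-1} + \cdots , \\
16L^{5/2} &= 16\partial_1^5 + 40u\,\partial_1^3 + 60(\partial_1 u)\partial_1^2 + 50(\partial_1^2 u)\partial_1 + 30u^2\partial_1 + 15(\partial_1^3 u) + 30u(\partial_1 u) \\
&\quad + 5\Big(u^3 + \frac{1}{2}(\partial_1 u)^2 + \partial_1 f(u, \partial_1 u, \ldots)\Big)\partial_1^{-1} + \cdots .
\end{aligned}$$
Here $\partial_1 f(u, \partial_1 u, \ldots)$ is a functional of u, ∂1u, . . . .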
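The leading coefficients of $L^{1/2}$ can be recovered by squaring an ansatz and matching orders. The following is a sketch of that computation; the coefficient names $a_1, a_2, a_3$ are introduced here only for illustration, and primes denote $\partial_1$.

```latex
\begin{align*}
&\big(\partial_1 + a_1\partial_1^{-1} + a_2\partial_1^{-2} + a_3\partial_1^{-3} + \cdots\big)^2
   = \partial_1^2 + u, \\
&\partial_1^{0}:\quad 2a_1 = u
   \;\Longrightarrow\; a_1 = \tfrac12 u, \\
&\partial_1^{-1}:\quad a_1' + 2a_2 = 0
   \;\Longrightarrow\; a_2 = -\tfrac14 u', \\
&\partial_1^{-2}:\quad a_2' + 2a_3 + a_1^2 = 0
   \;\Longrightarrow\; a_3 = \tfrac18\,(u'' - u^2).
\end{align*}
```

Here the rule $\partial_1^{-1}a = a\,\partial_1^{-1} - a'\partial_1^{-2} + \cdots$ is used to move coefficients to the left; the same matching at the next two orders reproduces the $\partial_1^{-4}$ and $\partial_1^{-5}$ terms of the example.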
Let us fix the operator $P = \partial_1^2$ of Ac in Lemma 4.10, because we only consider $L = W\partial_1^2 W^{-1}$. From elementary number theory, for an odd number m and an integer n (> m), we can find $a, b \in \mathbb{Z}_{\ge 0}$ such that
$$n = am + 2b , \qquad (a, b \in \mathbb{Z}_{\ge 0}) . \tag{4.1}$$
When we fix Ac as a rank one subalgebra, the partner Q of $P \equiv \partial_1^2$ in Lemma 4.10 is an operator whose order is given by an odd number 2g + 1. Thus F in Definition 4.9 is given by a sequence of smaller odd numbers, {1, 3, 5, 7, 9, . . . , 2g − 1}. Let us introduce a set of such subrings Ac in Ec.
Definition 4.12.
$$\mathcal{A}^c := \{A^c \mid A^c \text{ is a rank one subalgebra},\ \exists\,W \in W^f \text{ such that } WA^cW^{-1} \in D^f \text{ is a commutative subalgebra},\ \mathbb{N} - N_{A^c} \subset \{1, 3, \ldots, 2g-1\},\ g < \infty\} .$$
Similarly, for the case of g = 0, $A^c \equiv W^{-1}\mathbb{C}[L, [L^{1/2}]_+]W$. Since $[L^{1/2}]_+ \equiv \partial_1$, we have $[L, \partial_1] = 0$ and thus u must be a constant in C. In other words, Q must be ∂1, and $\mathbb{C}[\partial_1^2, \partial_1] \equiv \mathbb{C}[\partial_1]$. For the case g = 0, it becomes an ordinary polynomial ring.
4.13. We recall that an algebraic curve with a morphism to P of order two is called a hyperelliptic curve. A hyperelliptic curve Cg of genus g (g ≥ 1), including the case of an elliptic curve, is given by the homogeneous equation,
$$Y^2Z^{2g-1} = h_g(X, Z) := \lambda_0 Z^{2g+1} + \lambda_1 XZ^{2g} + \lambda_2 X^2Z^{2g-1} + \cdots + \lambda_{2g+1}X^{2g+1} ,$$
where $\lambda_{2g+1} \equiv 1$ and the $\lambda_j$ are complex values.
Lemma 4.14. Let $L = \partial_1^2 + u \equiv W\partial_1^2 W^{-1} \in \mathcal{L}_2$ for W ∈ Wf.
(1) $L^{n/2} = W\partial_1^n W^{-1}$.
(2) $[L^{2n}_+, L] \equiv [L^{2n}, L] \equiv 0$.
(3) The set of the differential operators in Df which commute with the given operator L is itself a commutative subalgebra of Df.
(4) Each $A^c \in \mathcal{A}^c$ is a $\mathbb{C}[\partial_1^2]$-module and, by considering
$$\mathcal{A}^c = \sum A^c ,$$
$\mathcal{A}^c$ is also a $\mathbb{C}[\partial_1^2]$-module.
(5) For an arbitrary $A^c \in \mathcal{A}^c$, we can find $Q_g \in A^c$ which satisfies an affine equation,
$$Q_g^2 = h_g(\partial_1^2, 1) ,$$
so that there is a W ∈ Wf/Wc such that $W\partial_1^2 W^{-1} = L$ and $WQW^{-1}$ are commutative in Df. Further, we have found a hyperelliptic curve C = Proj(gr Ac) and
$$H^1(C, O_C) = H^1(A^c) ,$$
which are generated by $\langle\partial_1, \partial_1^3, \ldots, \partial_1^{2g-1}\rangle$.
Proof. (1) and (2) can be shown by direct computations. On (3), we consider a commutative differential ring in Df such that B := {P ∈ Df | [P, L] = 0}. Since $L^{1/2} = W\partial_1 W^{-1}$, we have $[\partial_1, W^{-1}PW] = W^{-1}[L^{1/2}, P]W = 0$ because of the assumption. Hence $W^{-1}PW$ is an element of Ec and thus we can find $A^c \in \mathcal{A}^c$ such that $B = W^{-1}A^cW$. Hence B is a commutative ring. Next, (4) is trivial. From the definition, Lemma 4.10, and Eq. (4.1), we reach (5).
Next we consider a filter structure in $\mathcal{A}^c$ and its completion with respect to the filtration.
Proposition 4.15. Let us define a filter,
$$F_g\mathcal{A}^c := \{A^c \in \mathcal{A}^c \mid \mathbb{N} - N_{A^c} \subset \{1, 3, \ldots, 2g-1\}\} .$$
This satisfies the following relations:
(1) $F_g\mathcal{A}^c \subset F_{g+1}\mathcal{A}^c$, $F_n\mathcal{A}^c \equiv 0$ for n < 0.
(2) By letting $\mathcal{A}^c_g := F_g\mathcal{A}^c/F_{g-1}\mathcal{A}^c$, there is a large gauge transformation between $A^c_1, A^c_2 \in \mathcal{A}^c_g$, i.e. there exists W ∈ Wf such that $A^c_1 = WA^c_2W^{-1}$.
(3) The direct limit of the filtration gives
$$\mathcal{A}^c := \varinjlim F_g\mathcal{A}^c = \{A^c \in E^c \mid \exists\,W \in W^f \text{ such that } WA^cW^{-1} \in D^f \text{ is a subalgebra},\ \mathbb{N} - N_{A^c} \subset 2\mathbb{N} - 1\} .$$
Proof. (1) and (3) are obvious. (2) is due to the proof of Proposition 4.16(2).
For each element $A^c \in \mathcal{A}^c_g$, we consider the correspondence in Lemma 4.10(4), i.e. Proj(gr $\mathcal{A}^c_g$). It turns out that $\mathcal{A}^c_g$ is isomorphic to the set of the hyperelliptic curves with genus g.
Proposition 4.16. (1) The set Af of commutative subrings in Df inherits the above filtration of $\mathcal{A}^c$. (2) For any elements L1 and L2 in $\mathcal{L}_2$, there is a gauge transformation W ∈ Wt/Wc such that $L_1 = WL_2W^{-1}$.
Proof. (1) is trivial. (2) There is an element Wa ∈ Wt/Wc such that $L_a = W_a\partial_1^2 W_a^{-1}$ for (a = 1, 2). Hence $L_2 = W_2W_1^{-1}L_1W_1W_2^{-1}$.
Having described the tools and their properties for the differential ring over C[[t1]], we will extend its base field to C[[t1, t2, . . .]]. However, before giving the extension in Definition 4.19, we digress and show a connection between Ω in Definition 3.2 and L in $\mathcal{L}_2$. Following the arguments in [22], we first prepare a lemma.
Lemma 4.17 ([22]). The “resolvent” operator for $L = \partial_1^2 + u$,
$$T^{(\pm)} := \frac{1}{2z^2}\Big[\sum_{r=-\infty}^{\infty}(\pm z)^r L^{-r/2}\Big]_- ,$$
has the following properties:
(1) $(T^{(+)} + T^{(-)}) = (L - z^2)^{-1}$.
(2) $[T^{(\pm)}(L - z^2)]_- = [(L - z^2)T^{(\pm)}]_- = 0$.
(3) When we define a map for X ∈ Es, called the Adler map [22],
$$h(X) := [(L - z^2)X]_+(L - z^2) - (L - z^2)[X(L - z^2)]_+ ,$$
we have the relation $h(T^{(\pm)}) = 0$.
(4) $T^{(\pm)}$ has a formal expansion,
$$T^{(\pm)} = \sum_{r=1}^{\infty} S_r^{(\pm)}\partial_1^{-r} .$$
Proof. (1) is trivial. (2) is given by the relation,
$$\begin{aligned}
[(L - z^2)T^{(\pm)}]_- &= \Big[(L - z^2)\,\frac{1}{2z^2}\Big[\sum_{r=-\infty}^{\infty}(\pm z)^r L^{-r/2}\Big]_-\Big]_- \\
&= \Big[\frac{1}{2z^2}(L - z^2)\sum_{r=-\infty}^{\infty}(\pm z)^r L^{-r/2}\Big]_- \\
&= \frac{1}{2z^2}\Big[\sum_{r=-\infty}^{\infty}\big((\pm z)^r L - z^2(\pm z)^r\big)L^{-r/2}\Big]_- .
\end{aligned}$$
It is clear that it vanishes, since shifting $r \to r + 2$ in the second term cancels the first. (3) follows from the property of the Adler map,
$$h(X) \equiv -[(L - z^2)X]_- (L - z^2) + (L - z^2)[X(L - z^2)]_-.$$
(4) is obvious from the definition of the resolvent.

Using this lemma, we obtain the connection.

Proposition 4.18 ([22]). (1) $[2^{2(n-1)}[L^{(2n-1)/2}]_+, L] = \Omega_1^{n-1} \partial_1 u$, where $\Omega_1 := \partial_1^2 + 2u + 2\partial_1 u \partial_1^{-1}$.
(2) Letting $\bar h_n = \mathrm{res}\, 2^{2n} L^{(2n-1)/2}$ ("res" means the coefficient of $\partial_1^{-1}$), we have
$$\partial_1 \bar h_n = \Omega_1 \partial_1 \bar h_{n-1}.$$

Proof. From the condition $h(T^{(\pm)}) = 0$, we can determine the first two coefficients $S_1$ and $S_2$ by
$$\partial_s^3 S_1^{(\pm)} + 2(\partial_s u) S_1^{(\pm)} + 4(u + z^2)\partial_s S_1^{(\pm)} = 0, \qquad S_2^{(\pm)} = -\frac{1}{2}\partial_s S_1^{(\pm)}.$$
Let us consider the operator
$$T^{(+)} - T^{(-)} = \frac{1}{z^2}\left[\sum_{r=-\infty}^{\infty} z^{2r+1} L^{-(2r+1)/2}\right]_-.$$
The left-hand side of the relation
$$[L^{(2r+1)/2}]_+ L - L[L^{(2r+1)/2}]_+ = L[L^{(2r+1)/2}]_- - [L^{(2r+1)/2}]_- L$$
appears as a coefficient of $z^{2r-1}$ in the series $h(T^{(+)} - T^{(-)})$ with respect to $z$. Thus we are concerned with $S_r := S_r^{(+)} - S_r^{(-)}$, which must have the expansion
$$S_r = \sum_{i=-\infty}^{\infty} S_r^{(i)} z^{2i+1}.$$
Comparing the coefficients of $z^{2r-1}$, we obtain
$$4\partial_1 S_1^{(i+1)} = (\partial_s^3 + 2(\partial_s u) + 4u\partial_1) S_1^{(i)} = \Omega_1 \partial_1 S_1^{(i)}.$$
We have the relation
$$[[L^{(2r+1)/2}]_+, L] = \frac{1}{4}\partial_1 S_1^{2r+1} = \frac{1}{4^r}\Omega_1^r \partial_1 S_1^{(1)},$$
with $S_1^{1} = -u/2$. Then we identify $h_n$ with $S_n$ by tuning its coefficient.

Having finished the digression, we extend $\mathbb{C}[[t_1]]$ to $\mathbb{C}[[t_1, t_2, \ldots]]$. In extending the valuation over $\mathbb{C}[[t_1]]$ to that of $\mathbb{C}[[t_1, t_2, \ldots]]$, let the degree of $t_i^n$ be $(2i-1)n$.
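The action of the recursion operator $\Omega_1 = \partial_1^2 + 2u + 2\partial_1 u \partial_1^{-1}$ of Proposition 4.18 can be checked on a polynomial potential. The following is a minimal sketch, not part of the original argument: it assumes that $\partial_1^{-1}$ is realized as the antiderivative with zero constant term and that $u(0) = 0$, so that $\partial_1^{-1}\partial_1 u = u$. The helper names (`omega1`, `d_inv` and so on) are hypothetical. Under these assumptions $\Omega_1 \partial_1 u = \partial_1^3 u + 6u\,\partial_1 u$, the right-hand side of the first nontrivial KdV flow.

```python
from fractions import Fraction

# Polynomials in t1 are stored as coefficient lists [a0, a1, a2, ...].

def add(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def scale(c, p):
    return [c * a for a in p]

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def d(p):
    """Derivative with respect to t1."""
    return [k * a for k, a in enumerate(p)][1:] or [Fraction(0)]

def d_inv(p):
    """Antiderivative with zero constant term (an assumed choice of d^{-1})."""
    return [Fraction(0)] + [a / (k + 1) for k, a in enumerate(p)]

def omega1(u, f):
    """Apply Omega_1 = d^2 + 2u + 2 d u d^{-1} to the polynomial f."""
    return add(add(d(d(f)), scale(2, mul(u, f))),
               scale(2, d(mul(u, d_inv(f)))))

def trim(p):
    """Drop trailing zero coefficients."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

# Sample potential with u(0) = 0:  u = t + 3 t^2 - t^3 / 2.
u = [Fraction(0), Fraction(1), Fraction(3), Fraction(-1, 2)]
lhs = trim(omega1(u, d(u)))
rhs = trim(add(d(d(d(u))), scale(6, mul(u, d(u)))))
assert lhs == rhs
```

If $u(0) \neq 0$, this choice of $\partial_1^{-1}$ produces an extra term proportional to $u(0)\,\partial_1 u$, which reflects the usual ambiguity in the constant of integration.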
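Definition 4.19 ([24, 26]). (1) The differential ring $\mathcal{D}^t$ and its related set and ring are defined by
$$\mathcal{D}^t := \left\{\sum_{k \ge 0}^{N} a_k \partial_1^k \;\Big|\; N < \infty,\ a_k \in \mathbb{C}[[t_1, t_2, \ldots]]\right\},$$
$$\mathcal{E}^t := \left\{\sum_{k=-\infty}^{N} a_k \partial_1^k \;\Big|\; N < \infty,\ a_k \in \mathbb{C}[[t_1, t_2, \ldots]]\right\}, \qquad \mathcal{E}^t = \mathcal{D}^t + \mathcal{E}^t_-,$$
$$\mathcal{L}^t_2 := \{D \in \mathcal{E}^t \mid D = \partial_1^2 + u,\ u \in \mathbb{C}[[t_1, t_2, t_3, \ldots]]\}.$$
(2) Letting $\mathrm{val}(t_i^n) := (2i-1)n$, we extend the valuation to $\mathcal{D}^t$ and $\mathcal{E}^t$; the extensions are also called valuations of $\mathcal{D}^t$ and $\mathcal{E}^t$.
(3) $\mathcal{W}^t := \{W \in \mathcal{E}^t \mid W = 1 + \sum_{i=1}^{\infty} w_i \partial_1^{-i}\}$.
(4) $\hat{\mathcal{D}}^t := \{P = \sum_{i=0}^{\infty} a_i \partial_1^i \in \mathcal{D}^t \mid \exists\, N \in \mathbb{N},\ \mathrm{val}(a_i) > i - N \text{ for } \forall\, i \gg 0\}$,
$\hat{\mathcal{E}}^t := \{P = \sum_{i=-\infty}^{\infty} a_i \partial_1^i \in \mathcal{E}^t \mid \exists\, N \in \mathbb{N},\ \mathrm{val}(a_i) > i - N \text{ for } \forall\, i \gg 0\}$.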
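We note that $\mathcal{D}^t$, $\mathcal{E}^t$ and so on admit natural embeddings of $\mathcal{D}^f$, $\mathcal{E}^f$ and so on, e.g. $\mathcal{D}^f \ni P(t_1) \mapsto P(t_1, 0, 0, \ldots) \in \mathcal{D}^t$. Using these embeddings, we regard $\mathcal{D}^f$ as a subring of $\mathcal{D}^t$ in what follows.

Definition 4.20. (1) The moduli space of the KdV equations is defined by
$$\mathfrak{M}_{KdV} := \{u \in \mathbb{C}[[t_1, t_2, \ldots]] \mid \partial_{t_n} u - \Omega_1^{n-1}\partial_1 u = 0 \text{ for } \forall\, n\}, \qquad \mathcal{M}_{KdV} := \mathfrak{M}_{KdV}/(t_1),$$
where $\Omega_1 := \partial_1^2 + 2u + 2\partial_1 u \partial_1^{-1}$.
(2) If $\tilde L \in \mathcal{L}^t := \{D \in \mathcal{E}^t \mid D = \partial_s + \sum_{i=1}^{\infty} a_i \partial_s^{-i}\}$ and $P \in \mathcal{D}^t$ satisfy $[P, \tilde L] \in \mathcal{E}^t_-$, the equation $[\partial_y - P, \tilde L] = 0$ is called the Lax equation and $(P, \tilde L)$ is called a Lax pair. Here $y$ is an element of the vector space generated by $t_1, t_2, \ldots$.

Proposition 4.21 ([26]). Let $L := \partial_1^2 - u \in \mathcal{L}^t_2$.
(1) $[\partial_{t_n} - 2^{2(n-1)}[L^{(2n-1)/2}]_+, L] = 0$ is the Lax equation.
(2) For an arbitrary $P \in \mathcal{D}^t$ of a Lax pair $(P, L)$, $P$ can be expressed as
$$P = \sum_{j=1}^{n} c_j [L^{j/2}]_+,$$
where $c_j \in \mathbb{C}[[t_2, t_3, \ldots]]$.
(3) $[\partial_y - P, L^{1/2}] = 0$ holds if and only if $[\partial_y - P, L] = 0$.
(4) The equation $[\partial_{t_n} - 2^{2(n-1)}[L^{(2n-1)/2}]_+, L] = 0$ gives the $n$th KdV equation, $\partial_{t_n} u - \Omega_1^{n-1}\partial_1 u = 0$, and thus we have a bijection
$$\mathfrak{M}_{KdV} \approx \{L \in \mathcal{L}^t_2 \mid [\partial_{t_n} - 2^{2(n-1)}[L^{(2n-1)/2}]_+, L] = 0,\ n > 1\}.$$
Here $\approx$ is given by the correspondence between $u$ and $L = \partial_1^2 + u$.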
Proof. First we consider (3). Let $L = W \partial_1^2 W^{-1}$. Then $[\partial_y - P, W \partial_1 W^{-1}] = 0$ gives $[W^{-1}(\partial_y - P)W, \partial_1] = 0$, and we obtain $[W^{-1}(\partial_y - P)W, \partial_1^2] = 0$ and $[\partial_y - P, L] = 0$. For an operator $Q \in \mathcal{A}^c$, $[Q, \partial_1^2] = 0$ means $(\partial_1^2 Q) + 2(\partial_1 Q)\partial_1 = 0$, i.e. $(\partial_1^2 Q) = 0$ and $(\partial_1 Q) = 0$. Hence $[W^{-1}(\partial_y - P)W, \partial_1^2] = 0$ also means $[\partial_y - P, L^{1/2}] = 0$. (1) It is known that $[[L^{j/2}]_+, L^{1/2}] \in \mathcal{E}^t_-$. Owing to (3), (1) is proved. Next we consider (2). $[L^{j/2}]_+$ is a monic operator. Hence if the order of $P$ is $n$, there exists $c \in \mathbb{C}$ such that the order of $P - c[L^{n/2}]_+ \in \mathcal{D}^t$ is $n - 1$. By induction, we obtain (2). Proposition 4.18(1) leads to (4).

Here we translate these relations into geometrical language. Owing to Proposition 4.21(4), we also denote the right-hand side there by $\mathfrak{M}_{KdV}$.

Lemma 4.22 ([24–26]). Let $L := \partial_1^2 - u = W^{-1}\partial_1^2 W \in \mathcal{L}^t_2$,
$$dL := \partial_1 L\, dt_1 + \partial_2 L\, dt_2 + \partial_3 L\, dt_3 + \cdots, \qquad dW := \partial_1 W\, dt_1 + \partial_2 W\, dt_2 + \partial_3 W\, dt_3 + \cdots,$$
$$Z := 2L^{1/2} dt_1 + 4L^{3/2} dt_2 + 8L^{5/2} dt_3 + \cdots, \qquad Z_+ := 2[L^{1/2}]_+ dt_1 + 4[L^{3/2}]_+ dt_2 + 8[L^{5/2}]_+ dt_3 + \cdots,$$
$$Z = Z_+ + Z_-.$$
(1) The Lax equation becomes
$$dL = [Z_+, L], \qquad dL = -[Z_-, L].$$
(2) $dZ_+ = \frac{1}{2}[Z_+, Z_+]$, $dZ_- = -\frac{1}{2}[Z_-, Z_-]$.
(3) $dL = [dW \cdot W^{-1}, L]$.
(4) $W^{-1} dW - Z_+ \in \mathcal{D}^c dt_1 + \mathcal{D}^c dt_2 + \mathcal{D}^c dt_3 + \cdots$ or, by using the gauge freedom,
$$dW = Z_+ W, \qquad dW = -Z_- W.$$

Proof. (1) is trivial. (2) Noting $d^2 L \equiv 0$, we have $[L, dZ_+ - Z_+ Z_+] \equiv 0$, and then we obtain (2). (3) From $d(W W^{-1}) \equiv 0$, $dW^{-1} = -W^{-1} dW\, W^{-1}$. Hence $dL = d(W \partial_1^2 W^{-1})$ becomes the right-hand side. (4) Using (2) and (3), $[dW\, W^{-1} - Z_+, L] = 0$, and we obtain $[W^{-1} dW - W^{-1} Z_+ W, \partial_1] = 0$. This implies (4).

Here we note that the conditions $dZ_+ = \frac{1}{2}[Z_+, Z_+]$ and so on are the Frobenius integrability conditions. Owing to these conditions, the orbit as a dynamical system is uniquely determined. In conclusion, we have the following proposition on the orbit of the KdV equations. As its proof is somewhat complicated, we give only the result.

Proposition 4.23 ([26]). For $L(0) = S(0)\partial^2 S(0)^{-1}$,
$$U(t) = \exp(t_1 \partial_1 + t_2 \partial_1^3 + t_3 \partial_1^5 + \cdots) S(0)^{-1} \in \hat{\mathcal{E}}^t,$$
and $U(t) = S(t)^{-1} Y$ with $S(t) \in G$ and $Y \in \hat{\mathcal{D}}^t$, we have the time development $L(t) = S(t)\partial_1^2 S(t)^{-1}$.
Definition 4.24 ([26]). (1) For $L \in \mathcal{L}^t_2$, if the map
$$T_0(R_{t,n}) \ni \frac{\partial}{\partial y} \mapsto \left.\frac{\partial L^{1/2}}{\partial y}\right|_{y=0} \in \mathcal{E}^t_-$$
is injective, we say that $R_{t,n}$ is effective. Here $R_{t,n}$ denotes the orbit space generated by $t_1, t_2, \ldots, t_n$ and $T_0(R_{t,n})$ its tangent space at the origin $0 \in R_{t,n}$.
(2) If, for $L \in \mathcal{L}^t_2$, $R_{t,n}$ is effective for $n \le g$ but not effective for $n > g$, we say that $L = \partial_1^2 + u$, or $u$, is a finite $g$ type solution of the KdV equation.

Lemma 4.25 ([23, 24, 26]). (1) If there is a natural number $N$ such that $\partial_{t_N} u$ is an eigenvector of the operator $\Omega$ with an eigenvalue $k \in \mathbb{C}$, i.e.
$$k\, \partial_{t_N} u = \Omega\, \partial_{t_N} u,$$
then $\partial_{t_m}$ is a scalar multiple of $\partial_{t_N}$ for $m \ge N$. If not, we say that $t_m$ is effective.
(2) For a finite $g$ type solution $L$ and for $n > g$, we have the commutation relation
$$[2^{2(g+1)}[L^{(2g+1)/2}]_+, L] \equiv 0,$$
by constructing $t_n$ as a linear combination in $\mathbb{C}\langle t_1, t_2, \ldots, t_g \rangle$.
(3) Let $L \in \mathcal{L}^t_2$ be such that $[2^{2(g+1)}[L^{(2g+1)/2}]_+, L] \equiv 0$ and
$$[\partial_{t_j} - 2^{2(j-1)}[L^{(2j-1)/2}]_+, L] = 0 \quad \text{for } j < g$$
is effective. Then we have a commutative subring $A^t := \mathbb{C}[L, [L^{(2g+1)/2}]_+] \subset \mathcal{D}^t$ such that, for $W \in \mathcal{W}^t$, $A^c = W^{-1} A^t W \in \mathcal{A}^c$, and an isomorphism of $\mathbb{C}$-vector spaces,
$$H^1(A^c) \approx \mathbb{C}\langle dt_1, dt_2, \ldots, dt_g \rangle.$$

Proof. (1) is essentially the same as Lemma 4.1, and (2) is obvious from (1). So we concentrate on (3). The integrability conditions reduce the conditions in $\mathcal{D}^t$ to those in $\mathcal{D}^f$ as an initial state. Owing to Lemma 4.14(5), (3) is proved.

Definition 4.26. (1) The filter with respect to the effective differential equations is defined by
$$F_g \mathfrak{M}_{KdV} := \{L \in \mathcal{L}^t_2 \mid [\partial_n - 2^{2(n-1)}[L^{(2n-1)/2}]_+, L] = 0 \text{ is not effective for } n > g\} = \{L \in \mathcal{L}^t_2 \mid [[L^{(2g-1)/2}]_+, L] \equiv 0\},$$
and
$$F_g \mathfrak{M}_{KdV} \subset F_{g+1} \mathfrak{M}_{KdV}, \qquad F_n \mathfrak{M}_{KdV} = \emptyset \ \text{ for } n < 0.$$
(2) The set of finite $g$ type solutions of the KdV equation is denoted by $\mathfrak{M}_{KdV\,g} := F_g \mathfrak{M}_{KdV} \setminus F_{g-1} \mathfrak{M}_{KdV}$.
Owing to Lemma 4.25(3), $F_g \mathfrak{M}_{KdV}$ corresponds to $F_g \mathcal{A}^c$, and the correspondence becomes a bijection by considering their appropriate quotient spaces. As the system of the KdV equations is a dynamical system, there is a $g$-dimensional orbit in each solution space in $\mathfrak{M}_{KdV\,g}$, neglecting its periodicity. We can regard it as a fiber bundle,
$$\begin{array}{ccc} \text{orbit} & \longrightarrow & \mathfrak{M}_{KdV\,g} \\ & & \big\downarrow {\scriptstyle \pi^g_{KdV}} \\ & & \mathcal{M}_{KdV\,g}. \end{array}$$
For each orbit space $(\pi^g_{KdV})^{-1}(p)$ at a point $p$ in $\mathcal{M}_{KdV\,g}$, there is a commutative ring $A^c \in \mathcal{A}^c_g$ such that
$$T^* (\pi^g_{KdV})^{-1}(p) \approx H^1(A^c).$$
For later convenience, we also define a space MKdV finite := qg MKdVg . Next we will consider MKdV itself. (Fg , Ac ) has direct limit due to Proposition 4.15. Let us consider the set of subrings in Dt B := {L ∈ Lt2 | ∃ W ∈ Wt , ∃ At ∈ Dt and ∃ Ac ∈ Ac such that At = W Ac W −1 and L = W ∂12 W −1 } . Since solving the KdV equations are an initial value problem, for an arbitrary initial state u ∈ C[[t1 ]] we can find the time-development obeying the KdV equations. Thus we have B ⊂ MKdV . On the other hand, from the definition, we can find C[∂12 ] ∈ Ac which gives NC[∂12 ] = 2N. Further for an arbitrary L ∈ MKdV , there is a gauge transformation, W ∈ Wt such that W −1 LW = ∂12 due to Lemma 4.8(2). Hence B ⊃ MKdV and then B ≡ MKdV . Such a consideration is justified by the direct limit and graded topology of Dt or Ec . Thus MKdV has naturally the topology induced from the linear topology of the micro-differential operator in Proposition 4.7 and the filter of C[∂s2 ]-module in Proposition 4.15, even though MKdV itself is not a vector space. Proposition 4.27. (1) MKdV is a filter space. (2) The set of finite g type solutions of the KdV equation is denoted by M KdVg := Fg MKdV \Fg−1 MKdV and the set of finite type solutions of the KdV equation is denoted by MKdV finite . Then we have decomposition, a MKdV finite = MKdVg . g 0), Wfg := {W ∈ Wf | W LW −1 ∈ MKdVg , for L ∈ MKdVg } , g the projection πKdV along the orbit space in the quotient space of MKdVg by the f action of Wg is given by, g πKdV (MKdV g /Wfg ) ∼ pt .
(3) For Wf0,1 := {W ∈ Wf | W LW −1 ∈ MKdV 0 ∪ MKdV 1 , for L ∈ MKdV 0 ∪ MKdV 1 } , the following relation holds, πKdV (MKdV 0 ∪ MKdV 1 /Wf0,1 ) ∼ pt . Proof. (1) is essentially the same as Proposition 4.16(2). The action of W fg to MKdVg is transitive due to (1) and thus (2) is obtained. Now let us come back to the elastica problem. Firstly we note that the elastica problem is defined over the real functions. Hence we should restrict the above result to a real analytic problem. In other words, we choose a natural complex structure J (J 2 = −1) in the orbits space hdt1 , dt2 , . . . , dtg iC and constraint it by hdt1 , dt2 , . . . , dtg iR using the fact that finite g-type flow is a finite g-type solution of the KdV equation. Further the orbit satisfies the reality condition |∂1 γ| = 1, which characterizes a certain type of hyperelliptic curves. Secondly we should notice the difference of the categories of the previous chapter and this chapter. However as the g-type flow u is expressed by meromorphic functions over a hyperelliptic curve of genus g, elements of the finite real flow exist in the category of the formal power series. Hence the investigation of the finite flow does not depend on the difference. Further the arc-length s corresponds to t1 in the above argument but we are consider only the closed one. Hence firstly t1 must be an element of S 1 = R/Z. Even though u(s) ≡ {γ, s}SD is periodic, γ is not in general. We should restrict the space of the solution space of the KdV equation so that γ(0) = γ(2π) or γ(0) = γ(∞). g We will define a projective structure in MPelas g by πelas : MPelas g → MPelas g so −1
g that for a point p in MPelas g , πelas (p) is the real number orbit, and let MPelas finite = qg MPelas g as did in Definition 3.18. We summary these results in the following proposition.
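Proposition 4.29. There are natural injections
$$i_{KdV} : \mathfrak{M}^P_{elas\ finite} \hookrightarrow \mathfrak{M}_{KdV\ finite}, \qquad \iota_{KdV} : \mathcal{M}^P_{elas\ finite} \hookrightarrow \mathcal{M}_{KdV\ finite},$$
which satisfy
(1) $\iota_{KdV} \circ \pi_{elas} = \pi_{KdV} \circ i_{KdV}$,
(2) $\mathcal{M}_{KdV} \setminus \mathcal{M}^P_{elas} \neq \emptyset$.

Using the above results, there is a filtration in $\mathfrak{M}^P_{elas}$ such that $F_g \mathfrak{M}^P_{elas} := \mathfrak{M}^P_{elas} \cap F_g \mathfrak{M}_{KdV}$, which satisfies
$$F_g \mathfrak{M}^P_{elas} \subset F_{g+1} \mathfrak{M}^P_{elas}, \qquad \mathfrak{M}^P_{elas\,g} = F_g \mathfrak{M}^P_{elas} / F_{g-1} \mathfrak{M}^P_{elas}.$$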
We have written just MPelas as iKdV (MPelas ) and MPelas as ιKdV (MPelas ) for brevity. Next we will consider the real orbits or the “time” development of each finite g type flow in MPelas (instead of MKdV ). Let us recall the fact that rational points in [0, 1), i.e. Q/Z, are measure zero in [0, 1). Further it is known that for a torus √ C/(Z + −1Z), a real direct line (orbit) stemmed from the origin with an angle θ does not stand upon the origin again if θ ∈ / tan−1 (Q/Z). Similarly in general, the real number “time” development of the finite g type solution is not periodic in “time” ti (i > 1), in the g-dimensional torus Jg which is called quasi-periodic solutions. Hence we conclude that such an orbit is homeomorphic to Rg−1 in this sense and show the following proposition. Proposition 4.30. For each pt ∈ MPelas g , we have a restricted action of Wfg and thus the following results are satisfied : g (1) πKdV |MPelas (MPelas g /[Wfg |MPelas ]) = pt. g
g
(2) MPelas g /[Wfg |MPelas ] ≈ S 1 × Rg−1 , MPelas g /[Wfg |MPelas ] ≈ Rg−1 , g
g
for g > 1.
(3) MPelas 0 ∪ MPelas 1 /Wf0,1 |MPelas ∪MPelas ≈ S 1 , (MPelas 0 ∪ MPelas 1 /Wf0,1 |MPelas ∪MPelas )/ 0 1 0 1 Isom(S 1 ) ≈ pt. We will recover the base ring with smooth functions. In other words, we show that the completion in Proposition 4.27 can be extended to Es because the convergence is determined only by the topology of order of the differential operator as shown in the following lemma. Lemma 4.31. (1) When we define Esm := {D ∈ Es | deg D ≤ m}, Es has a filter topology, Es = ∪m Esm ,
{0} = ∩m Esm ,
Esm ⊂ Esm+1 .
(2) Es is a linear topological space with respect to this filter topology.
Owing to Lemma 4.31 and the note below Proposition 4.3, we have the following proposition.

Proposition 4.32. Let us define the moduli space of the KdV equations over the ring of smooth functions:
$$\mathfrak{M}^\infty_{KdV} := \{u \in C^\infty(V) \mid \partial_{t_n} u - \Omega_1^{n-1}\partial_1 u = 0 \text{ for } \forall\, n\}, \qquad \mathcal{M}^\infty_{KdV} := \mathfrak{M}^\infty_{KdV}/(t_1).$$
Then (1) $\mathcal{M}_{KdV}$ is dense in $\mathcal{M}^\infty_{KdV}$. (2) $\mathcal{M}_{KdV\ finite}$ is a subset of $\mathcal{M}^\infty_{KdV}$.

Proof. By the Weierstrass preparation theorem, for an arbitrary germ in $C^\infty(\mathbb{R})$ there is a sequence in $F(\mathbb{R})$ converging to it. Integrability, by Proposition 3.15, asserts that the difference does not grow under the time development. Hence (1) is proved. (2) is obvious.

Hence we have the final statement of this section.

Proposition 4.33. $\mathcal{M}^P_{elas} \subset \mathcal{M}^\infty_{KdV}$ has the filter topology induced from $\mathcal{M}^\infty_{KdV}$.
(1) There is a natural decomposition,
$$\mathcal{M}^P_{elas,\ finite} = \coprod_{g} \mathcal{M}^P_{elas\,g}.$$

… $N$, are trivial ones for an $N$-type solution.

For a while, we will assume that $u$ is real. Let $\mathrm{Spect}(-L)$ denote the set of $\bar x$. Owing to the hermitian property of $-L$, $\mathrm{Spect}(-L)$ is a subset of the real numbers bounded from below. The function $\psi_{\bar x}(t_1)$ is regarded as a section of a line bundle over $\mathrm{Spect}(-L)$. For the basis $y_0$ and $y_1$ of the solution space of $-L\psi_{\bar x} = \bar x \psi_{\bar x}$ (with $\psi_{\bar x} = a y_0 + b y_1$ for $a, b \in \mathbb{C}$) satisfying
$$y_0(0, \bar x) = 1, \qquad y_1(0, \bar x) = 0, \qquad \partial_1 y_0(0, \bar x) = 0, \qquad \partial_1 y_1(0, \bar x) = 1,$$
we have the monodromy matrix defined as
$$M(\bar x) := \begin{pmatrix} y_0(\pi, \bar x) & y_1(\pi, \bar x) \\ \partial_1 y_0(\pi, \bar x) & \partial_1 y_1(\pi, \bar x) \end{pmatrix},$$
whose determinant is unity. If the eigenvalue $\rho$ of this matrix lies on the unit circle in $\mathbb{C}$ ($|\rho| = 1$), the solution $\psi_{\bar x}$ is called stable and exists as a global section of the line bundle over $s \in \mathbb{R}$. Otherwise it is called unstable, which means that there is no global section over $s \in \mathbb{R}$ even though we can find local solutions of $-L\psi_{\bar x} = \bar x \psi_{\bar x}$. We sometimes refer to an unstable state as a "gap state" or "forbidden state". Whether a state is stable or unstable is determined by the characteristic equation,
$$\rho^2 - \Delta_u \rho + 1 = 0,$$
where $\Delta_u := \mathrm{tr}\, M$. If its discriminant $\Delta_u^2 - 4$ is non-positive, the corresponding $\bar x$ is stable. Since $\Delta_u^2 - 4$ is an analytic function over $\mathrm{Spect}(-L) - \{\infty\}$ and has ordered zero points $\bar x_1, \bar x_2, \ldots$, it has an infinite product expression:
$$(\Delta_u^2 - 4) = c \prod_{j=0}^{\infty} (\bar x - \bar x_j),$$
where $c$ is a constant independent of $\bar x$. This fact holds even when $u$ is complex valued, and thus we return to general $u$ here.

Proposition 5.3 ([49]). For $-L\psi_{\bar x} = \bar x \psi_{\bar x}$ with smooth $u(t_1)$ over $\mathbb{R}$, the discriminant $\Delta$ is characterized by infinitely many $\bar x_j$ and can be rewritten as
$$(\Delta_u^2 - 4) = \prod_{j=0,\ \text{single zeros}} (\bar x - \bar x_j)\; h(\bar x)^2,$$
where $h(\bar x) = \sqrt{c} \prod_{j \ge 0,\ \text{double zeros}}^{\infty} (\bar x - \bar x_j)$ is the part of double zeros.

For large $\bar x$, $-L$ asymptotically behaves like $-\partial_1^2$ for bounded $u$, and thus the asymptotic behavior of $\Delta$ can be investigated. Since the ground state corresponds to a single zero of $\Delta^2 - 4$ and each other gap has two single zeros of $\Delta^2 - 4$, the number of single zeros of $\Delta^2 - 4$ must be odd. Here we consider the case with finitely many, $2g + 1$, single zeros:
$$(\Delta_u^2 - 4) = h(\bar x)^2 \prod_{j=1}^{2g+1} (\bar x - \bar x_j).$$
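The stability criterion above can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes $L = \partial_1^2 + u$, so that $-L\psi = \bar x \psi$ reads $\psi'' = -(u + \bar x)\psi$, uses the period $\pi$ of the monodromy matrix above, and integrates with a classical fourth-order Runge-Kutta scheme. The function names `monodromy`, `discriminant` and `is_stable` are hypothetical.

```python
import math

def monodromy(u, xbar, period=math.pi, steps=2000):
    """Integrate psi'' = -(u(t) + xbar) psi over one period with RK4.

    Returns [[y0, y1], [y0', y1']] at t = period, where
    y0(0) = 1, y0'(0) = 0 and y1(0) = 0, y1'(0) = 1.
    """
    def rhs(t, y, dy):
        return dy, -(u(t) + xbar) * y

    def propagate(y, dy):
        h = period / steps
        t = 0.0
        for _ in range(steps):
            k1y, k1d = rhs(t, y, dy)
            k2y, k2d = rhs(t + h / 2, y + h / 2 * k1y, dy + h / 2 * k1d)
            k3y, k3d = rhs(t + h / 2, y + h / 2 * k2y, dy + h / 2 * k2d)
            k4y, k4d = rhs(t + h, y + h * k3y, dy + h * k3d)
            y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
            dy += h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
            t += h
        return y, dy

    y0, dy0 = propagate(1.0, 0.0)
    y1, dy1 = propagate(0.0, 1.0)
    return [[y0, y1], [dy0, dy1]]

def discriminant(u, xbar):
    """Delta_u = tr M(xbar)."""
    m = monodromy(u, xbar)
    return m[0][0] + m[1][1]

def is_stable(u, xbar):
    """Stable when Delta_u^2 - 4 is non-positive."""
    delta = discriminant(u, xbar)
    return delta * delta - 4.0 <= 0.0

# Free case u = 0: Delta = 2 cos(pi sqrt(xbar)) for xbar > 0.
free = lambda t: 0.0
assert abs(discriminant(free, 2.0) - 2 * math.cos(math.pi * math.sqrt(2.0))) < 1e-8
assert is_stable(free, 2.0) and not is_stable(free, -1.0)
```

For $u \equiv 0$ every $\bar x > 0$ is stable and every $\bar x < 0$ is unstable, so there are no gaps in the interior of the spectrum; a nonconstant periodic $u$ generally opens gaps, and the text continues with the case in which only finitely many single zeros of $\Delta_u^2 - 4$ occur.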
We refer to the case of finitely many single zeros as a finite-gap state. It should be noted that $\psi_{\bar x}$ has a natural involution $\pi : \mathrm{Spect}(-L) \to \mathrm{Spect}(-L)$ ($\pi : \bar y \to -\bar y$, $\pi : \infty \to \infty$), where $\bar y = \sqrt{\Delta_u^2 - 4}/h(\bar x)$. By analyticity, we can extend $\mathrm{Spect}(-L)$ to the complex domain. Just as $\mathrm{Spect}(-\partial_1^2)$ is complexified to $P$ in the case $u \equiv 0$ (even though more precise arguments are needed), the energy spectrum $\mathrm{Spect}(-L)$ is, in general, reduced to a hyperelliptic curve $C_g$ owing to its two-fold property. In fact, for $\bar y = \sqrt{\Delta_u^2 - 4}/h(\bar x)$, this relation gives a hyperelliptic curve as defined in 4.13. In this section, we fix a hyperelliptic curve $C_g$ of genus $g$ given by the affine curve
$$\bar y^2 = h_g(\bar x, 1) = (\bar x - c_1) \cdots (\bar x - c_{2g+1}).$$
In other words, we deal with the commutative ring $\mathbb{C}[\bar x, \bar y]/(\bar y^2 - h_g(\bar x, 1)) \cup \{\infty\}$. We should note that for a hyperelliptic curve $C_g$ there exists a differential operator $-L$ with $u$ such that its spectrum $\mathrm{Spect}(-L)$ gives a hyperelliptic curve isomorphic to $C_g$.

Proposition 5.4. Let the moduli space of hyperelliptic curves of genus $g$ be denoted by $\mathcal{M}_{hyp,\,g}$. Then $\mathcal{M}_{hyp,\,g}$ is a $(2g - 1)$-dimensional space.
Proof. A point in the moduli space $\mathcal{M}_{hyp,\,g}$ is characterized by the $2g + 1$ zero points of $h_g(\bar x, 1)$ in the above definition and the point $\infty$. However, in these variables there are several symmetries which express the same compact Riemann surface. The first is the translational symmetry $c_j \to c_j + \alpha_0$, $\alpha_0 \in \mathbb{C}$. The second is the dilatation $c_j \to c_j \alpha_1$, $\alpha_1 \in \mathbb{C}$. The third is $(\bar x, \bar y) \to (1/\bar x, \bar y \prod_j c_j / \bar x^{(2g+1)/2})$, which induces $c_j \to 1/c_j$. Hence the remaining number of degrees of freedom is $2g - 1$.

Remark 5.5. We comment on $\mathcal{M}_{hyp,\,g}$ here. We consider a smooth curve in $\mathcal{M}_{hyp,\,g}$ which is not degenerate: $c_i \neq c_j$ if $i \neq j$, and all $c_j$ are finite values in $\mathbb{C}$. Let us find the largest distance $|c_j - c_k|$ among pairs $(c_j, c_k)$ in $\{c_j\}$; no $|c_j - c_k|$ vanishes because the curve is not degenerate. Let us rename this pair $(c_1, c_{2g+1})$ and define
$$(\alpha_1, \ldots, \alpha_{2g-1}) := \left(\frac{c_2 - c_1}{c_{2g+1} - c_1}, \ldots, \frac{c_{2g} - c_1}{c_{2g+1} - c_1}\right) \in \mathbb{C}^{2g-1}.$$
Since $1 - \alpha_j = (c_{j+1} - c_{2g+1})/(c_1 - c_{2g+1})$ and $|c_1 - c_{2g+1}|$ is the largest distance, each $\alpha_j$ must be constrained by $|\alpha_j| \le 1$ and $|1 - \alpha_j| \le 1$. Next we order the $\alpha_j$ by the following rule:
• if $\mathrm{Re}(\alpha_i) < \mathrm{Re}(\alpha_j)$, then $i < j$;
• if $\mathrm{Re}(\alpha_i) = \mathrm{Re}(\alpha_j)$ and $\mathrm{Im}(\alpha_i) < \mathrm{Im}(\alpha_j)$, then $i < j$.
Hyperelliptic curves of genus $g$ are determined as two-fold coverings of $P^1$ ramified at $0$, $1$, $\infty$ and $2g - 1$ additional points in the above order. However, it is difficult to deal with deformations from non-degenerate hyperelliptic curves to degenerate curves [50–52].

Definition 5.6 ([28, 51, 27, 52, 53]). (1) Let
$$H_1(C_g, \mathbb{Z}) = \bigoplus_{j=1}^{g} \mathbb{Z}\alpha_j \oplus \bigoplus_{j=1}^{g} \mathbb{Z}\beta_j$$
denote the homology of an algebraic curve $C_g$.
(2) We introduce the period matrix of the curve $C_g$ in terms of the normalized one-forms of the first kind $\hat\omega_i$ over $C_g$:
$$1 = \left[\int_{\alpha_j} \hat\omega_i\right], \qquad \mathbf{T} = \left[\int_{\beta_j} \hat\omega_i\right], \qquad \hat\Omega = \begin{bmatrix} 1 \\ \mathbf{T} \end{bmatrix}.$$
(3) Fixing $\mathbf{T}$, we define the theta function $\theta : \mathbb{C}^g \to \mathbb{C}$ by
$$\theta(z) := \theta(z \mid \mathbf{T}) := \sum_{n \in \mathbb{Z}^g} \exp\left[2\pi i\left(\frac{1}{2}\, {}^t n \mathbf{T} n + {}^t n z\right)\right].$$
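Since the imaginary part of $\mathbf{T}$ is positive definite, the series in (3) converges very quickly and a short truncation already gives machine precision. The following is a minimal sketch for genus one only, where $\mathbf{T}$ reduces to a single number $\tau$ in the upper half-plane; the function name `theta` and the truncation parameter `nmax` are assumptions made for this illustration, and the default cutoff is adequate only when $\mathrm{Im}\,\tau$ is not too small.

```python
import cmath

def theta(z, tau, nmax=10):
    """Truncated genus-one theta series sum_n exp(2 pi i (n^2 tau / 2 + n z))."""
    if complex(tau).imag <= 0:
        raise ValueError("tau must lie in the upper half-plane")
    total = 0j
    for n in range(-nmax, nmax + 1):
        total += cmath.exp(2j * cmath.pi * (0.5 * n * n * tau + n * z))
    return total

tau = 0.3 + 1.1j
z = 0.2 + 0.1j

# Periodicity under z -> z + 1.
assert abs(theta(z + 1, tau) - theta(z, tau)) < 1e-12

# Quasi-periodicity under z -> z + tau, obtained by shifting n -> n + 1 in the series.
factor = cmath.exp(-2j * cmath.pi * z - 1j * cmath.pi * tau)
assert abs(theta(z + tau, tau) - factor * theta(z, tau)) < 1e-10
```

The second check follows directly from the definition: completing the square in $n$ gives $\theta(z + \tau) = e^{-2\pi i z - \pi i \tau}\theta(z)$ in genus one.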
Proposition 5.7 ([27, 28, 44, 51–53]). (1) By defining the Abel map for the $g$th symmetric product of the curve $C_g$,
$$\hat w : \mathrm{Sym}^g(C_g) \to \mathbb{C}^g, \qquad \hat w_k(Q) := \sum_{i=1}^{g} \int_{\infty}^{Q_i} \hat\omega_k,$$
the Jacobi variety $\hat J_g$ is realized as a complex torus,
$$\hat J_g = \mathbb{C}^g / \hat\Lambda.$$
Here $\hat\Lambda$ is a lattice generated by $\hat\Omega$. For the Abelian group of divisors of line bundles over a hyperelliptic curve $C_g$, which is called the Picard group $\mathrm{Pic}^0(X)$, the Abel theorem is expressed by $\mathrm{Pic}^0(X) \approx \mathbb{C}^g / \hat\Lambda$.
(2) The theta function has the monodromy properties
$$\theta(z + e_k) = \theta(z), \qquad \theta(z + \tau_k) = e^{-2\pi i z_k - \pi i \tau_{kk}}\, \theta(z).$$
(3) The Riemann theorem gives that
$$\theta\left(\hat w(Q) - \sum_{i=1}^{g} \hat w(P_i) + K\right) \not\equiv 0$$
if and only if the $P_i$ are general points on $C_g$, where $K$ is a constant called the Riemann constant.

As $\mathcal{M}^P_{elas,\,g}$ and $\mathcal{M}_{KdV,\,g}$ have natural projections, we introduce the universal family of hyperelliptic curves of $\mathcal{M}_{hyp,\,g}$ induced from $\pi_{hyp} : J_g \mapsto \mathrm{pt} \in \mathcal{M}_{hyp,\,g}$.

Proposition 5.8 (Krichever, Mulase [26, 27, 44]). (1) A finite $g$ type solution of the KdV equation is given by a meromorphic function over the Jacobi variety $J_g$ of a hyperelliptic curve $C_g$.
(2) There is a natural bijection between the moduli spaces of hyperelliptic curves $\mathcal{M}_{hyp,\,g}$ and $\mathcal{M}_{KdV,\,g}$,
$$\mathcal{M}_{hyp,\,g} \approx \mathcal{M}_{KdV,\,g}.$$

As (2) comes from the previous section, we sketch the idea of (1) as follows [11, 27]. Krichever started with $\psi_x$, a solution of $(-\partial_1^2 - u + x^2)\psi_x = 0$, which is called the Baker–Akhiezer function. His approach is very natural in soliton theory and can be generalized from the case of the KdV hierarchy, which is related to hyperelliptic curves, to that of the KP hierarchy, which is related to more general compact Riemann surfaces.

Lemma 5.9 ([27]). (1) For a solution of the KdV equation whose $\mathrm{Spect}(-L)$ is associated with the hyperelliptic curve $C_g$, we parameterize the eigenvalue as $-x^2$ for $L\psi_x = x^2 \psi_x$. Then $1/x$ is a local parameter at $\infty$ of $C_g$.
(2) $\psi_x$ is meromorphic on $C_g - \infty$, and at the point $\infty$ it has an essential singularity,
$$\psi_x = e^{sx}\psi_W, \qquad \psi_W := \left(1 + \sum_{i=1}^{\infty} a_i(t_1) x^{-i}\right).$$
Here this expansion gives the recursion relation $-2\partial_1 a_i = -L a_{i-1}$ with $a_0 = 1$.
Proof. (1) For sufficiently large $|x|$, this equation can be approximated by $(-\partial_1^2 + x^2)\psi_x \sim 0$. Thus we can regard $\psi_x \sim \exp(sx)$. In other words, for a local coordinate $z = 1/x$ around $\infty \in \mathrm{Spect}(-L)$, $\psi_x \sim \exp(s/z)(1 + O(z))$, where $1/x^2 = 1/\bar x$ is a local coordinate around $\infty \in \mathrm{Spect}(-L)$. (2) can be obtained by straightforward computation.

Using Lemma 5.9, we follow Krichever's construction of the finite $g$ type solution. As we gave the Jacobi varieties and theta functions of the hyperelliptic curve $C_g$ in 4.13 and Proposition 5.7, we introduce a normalized Abelian differential of the second kind, $\hat\eta_{P,n}$,
$$\hat\eta_{P,n} = d\left(\frac{1}{t^{n-1}} + O(1)\right)$$
around $P$, using a local parameter $t$ ($t(P) = 0$), with the normalization
$$\int_{\alpha_j} \hat\eta_{P,n} = 0 \quad \text{for } j = 1, \ldots, g.$$
Having prepared what we need to express the Baker–Akhiezer function, we consider the deformation equation,
$$(\partial_{t_n} - 2^{2(n-1)} L_+^{(2n-1)/2})\psi_x = 0.$$
Since $z = 1/x$ is a local parameter around $\infty$ and $L_+^{(2n-1)/2} \sim \partial_1^{(2n-1)}$ there, we introduce
$$\hat\eta_{\infty,n} = d(x^{2n-1} + O(1)),$$
and consider the function
$$E(t, Q) = \exp\left(\sum_{\alpha,j} 2^{2(j-1)} t_{\alpha,j} \int^{Q} \hat\eta_{P_\alpha,j}\right).$$
Around $\infty$,
$$E(t, Q) \sim \exp\left(\sum_{n=1}^{\infty} 2^{2(n-1)} t_n x^{2n-1}\right)$$
and
$$\partial_{t_n} E(t, Q) \sim 2^{2(n-1)} x^{2n-1} E(t, Q).$$
By Lemmas 4.8 and 5.9, and letting $L = W(s, \partial_1)\partial_1^2 W(s, \partial_1)^{-1}$ in the sense of Lemma 4.14, we obtain the relations $\psi_x = W(s, \partial_1)E(t, Q) + O(\frac{1}{x})$ and
$$L^{n/2} W(s, \partial_1) E(t, Q) = W(s, \partial_1)\partial_1^n E(t, Q) + O\left(\frac{1}{x}\right).$$
From the Lax equations in Proposition 5.1, $\psi_x$ is expressed by $\psi_x/E = (\psi_x/E)(x t_1, 4x^3 t_2, 8x^5 t_3, \ldots) + O(\frac{1}{x})$. On the other hand, even though $E(t, Q)$ satisfies the dispersion relation around $\infty$ and has no monodromy around the $\alpha_j$'s, it has monodromy around $\beta_i$,
$$\exp(2\pi i U_i) := \exp\left(\sum_{j,\alpha} 2^{2(j-1)} t_j H^i_{\alpha,j}\right),$$
where
$$H^i_{\alpha,j} = \frac{1}{2\pi i}\int_{\beta_i} d\hat\eta_{P_\alpha,j}.$$
Noting the monodromy of the theta function in Proposition 5.7, we can find a single-valued function over $C_g$, which is known as the Baker–Akhiezer function:
$$\psi_x = \frac{\theta\left(w(Q) + \sum_{\alpha,j} 2^{2(j-1)} t_{\alpha,j} H_{\alpha,j} - \sum_{i=1}^{g} w(P_i) + K\right)}{\theta\left(w(Q) - \sum_{i=1}^{g} w(P_i) + K\right)}\, E(t, Q).$$
This is a solution of the Lax equations in Proposition 5.1. We can find a finite type solution of the KdV equation from the zero mode by using Proposition 2.8. $\psi_x$ is determined by an analysis of functions over $C_g$ related to the Jacobi variety $\hat J_g$. As the map from the $C_g$'s to the Jacobi variety $\hat J_g$ is known as the Abel map, finding the inverse map from functions over $\hat J_g$ to functions over the $C_g$'s is known as the Jacobi inversion problem. Krichever's scheme should be regarded as a Jacobi inversion method and can be applied even to a generalized Jacobi variety. It shows the existence of an injection from $\mathcal{M}_{hyp}$ to $\mathcal{M}_{KdV}$.

Remark 5.10. For a finite type solution $u$ of the KdV hierarchy, we have the hyperelliptic curve $C_g$ as the spectrum of the operator $-L$ associated with $u$. Then the above arguments give the following results:
(1) The orbit of an equation in the KdV hierarchy is realized as a straight line in the Jacobi variety $J_g$ of $C_g$.
(2) Any finite $g$ solution $u$ is given as a solution of the Jacobi inversion problem of the Jacobi variety $J_g$.

As long as we deal only with hyperelliptic curves and the KdV hierarchy, we can give more concrete arguments based upon Baker's original argument [29, 36].

Definition 5.11 ([29, 36, 54]). We introduce the family of differential forms:
(1) $\omega_1 = \dfrac{d\bar x}{2\bar y}$, $\omega_2 = \dfrac{\bar x\, d\bar x}{2\bar y}$, $\ldots$, $\omega_g = \dfrac{\bar x^{g-1} d\bar x}{2\bar y}$.
(2) $\eta_j = \dfrac{1}{2\bar y}\displaystyle\sum_{k=j}^{2g-j} (k + 1 - j)\lambda_{k+1+j}\, \bar x^k\, d\bar x$, $(j = 1, \ldots, g)$.

Lemma 5.12 ([29, 36, 54]). (1) The $\omega$'s are a basis of the holomorphic function valued cohomology of the hyperelliptic curve $C_g$, which give the unnormalized periods:
$$\Omega' = \left[\int_{\alpha_j} \omega_i\right], \qquad \Omega'' = \left[\int_{\beta_j} \omega_i\right], \qquad \Omega = \begin{bmatrix} \Omega' \\ \Omega'' \end{bmatrix}.$$
(2) They are related to the normalized ones by
$${}^t[\hat\omega_1 \cdots \hat\omega_g] := \Omega'^{-1}\, {}^t[\omega_1 \cdots \omega_g], \qquad \mathbf{T} := \Omega'^{-1}\Omega''.$$
(3) The $\eta$'s are unnormalized one-forms of the second kind over $C_g$, and the complete hyperelliptic integrals of the second kind are given by
$$H' := \left[\int_{\alpha_j} \eta_i\right], \qquad H'' := \left[\int_{\beta_j} \eta_i\right].$$
Here the contours in the integrals are, for example, given on p. 3.83 in [53].

Proof. We check the holomorphicity of the forms in (1) and (3). A zero point of $\bar y$, i.e. a root $c_j$ of $f(\bar x) = 0$, corresponds to a point $(c_j, 0)$ of the curve $C_g$. We use a local coordinate $z^2 := (\bar x - c_j)$, and then $\bar x^m d\bar x/(2\bar y) \sim (z^2 + c_j)^m dz + \cdots$. On the other hand, around the point $\infty$, let us choose the local coordinate $1/x$ with $1/x^2 = 1/\bar x$; then $\bar x^m d\bar x/(2\bar y) \sim (1/x)^{2g-2m+2} dx + \cdots$. Hence $\omega$ is holomorphic all over the curve $C_g$, while $\eta$ is holomorphic except at the point $\infty$.

Definition 5.13. (1) The unnormalized Jacobi variety $J_g$ is defined as the complex torus
$$J_g = \mathbb{C}^g/\Lambda,$$
where $\Lambda$ is a lattice generated by $\Omega$.
(2) We define the theta function with characteristics by
$$\theta\begin{bmatrix} a \\ b \end{bmatrix}(z) = \theta\begin{bmatrix} a \\ b \end{bmatrix}(z; \mathbf{T}) = \sum_{n \in \mathbb{Z}^g} \exp\left[2\pi i\left\{\frac{1}{2}\, {}^t(n + a)\mathbf{T}(n + a) + {}^t(n + a)(z + b)\right\}\right].$$

Proposition 5.14. The Riemann constant of the hyperelliptic curve $C_g$ is given by
$$K = \sum_{j=1}^{g} \int_{\infty}^{A_j} \hat\omega = \delta' + \delta''\mathbf{T},$$
where $\delta' = {}^t\left[\frac{g}{2}\ \frac{g-1}{2}\ \cdots\ \frac{1}{2}\right]$ and $\delta'' = {}^t\left[\frac{1}{2}\ \cdots\ \frac{1}{2}\right]$.

Proof. The proof is on p. 3.82 in [53].

Using the Abel map $\mathrm{Sym}^g(C_g) \to \mathbb{C}^g$, we define the coordinate in $\mathbb{C}^g$,
$$t_j := \sum_{i=1}^{g} \int^{(\bar y_i, \bar x_i)} \omega_j.$$
Here we note that $t_j$ behaves like $(1/x)^{2(g-j)+1}$ around the point $\infty$ if we use the parameter $x^2 = \bar x$.

Definition 5.15 (℘-function, Baker [29, 36]). (1) Using the coordinate $t_j$, the $\sigma$-function, which is a holomorphic function over $\mathbb{C}^g$, is defined by
$$\sigma(t) := \sigma(t; C_g) := \exp\left(-\frac{1}{2}\, {}^t t H' \Omega'^{-1} t\right) \vartheta\begin{bmatrix} \delta'' \\ \delta' \end{bmatrix}(\Omega'^{-1} t; \mathbf{T}).$$
(2) In terms of the $\sigma$-function, the hyperelliptic ℘-function over the hyperelliptic curve $C_g$ is defined by
$$\wp_{ij}(t) := -\frac{\partial^2}{\partial t_i \partial t_j} \log \sigma(t) = \frac{\sigma_i(t)\sigma_j(t) - \sigma_{ij}(t)\sigma(t)}{\sigma(t)^2}.$$
As the $\sigma$-function is an entire function over $\mathbb{C}^g$ and has a simple zero along a $(g-1)$-dimensional subvariety of $\mathbb{C}^g$ called the theta divisor, the hyperelliptic $\wp_{ij}$ has a second-order singularity there and is a function on $J_g$.

Remark 5.16. It is worthwhile noting that, from Definition 5.15, the hyperelliptic ℘-function can be concretely computed for a given hyperelliptic curve $C_g$. The summation in the definition of the $\theta$ function converges rapidly owing to the effect of $\mathbf{T}$, and the other ingredients are integrals of elementary functions. Further, it is known that $\wp_{gi}$ is an elementary symmetric function, i.e. for $F(\bar x) = (\bar x - \bar x_1)(\bar x - \bar x_2) \cdots (\bar x - \bar x_g)$ [36, 54],
$$F(\bar x) = \bar x^g - \sum_{i=1}^{g} \wp_{gi}\, \bar x^{i-1}.$$
Accordingly, we can compute a value of the hyperelliptic ℘-function numerically, just as Euler determined values of the elliptic integral numerically in order to know the shape of a classical elastica [1–4]. This approach was discovered by Baker about one hundred years ago [17–20, 29, 36]. We emphasize that it differs completely from Krichever's approach based upon the Baker–Akhiezer theorem explained in Sec. 4. Krichever's arguments might not give practical algorithms for fixing the parameters of general hyperelliptic functions, except for solutions expressed by elliptic or hyperbolic functions. (Owing to its abstractness, it is a good strategy for constructing soliton theory.) On the other hand, Baker's original method determines concrete forms of the corresponding ℘-functions for any algebraically given hyperelliptic curve (even for degenerate curves in $\mathcal{M}_{hyp,\,g}$). We can expand the ℘-function around a general point and know its parameter dependence. Since Baker's construction in [29] seems, as far as we know, to have faded from the memory of recent researchers, we believe that this review of Baker's work is meaningful and very useful for analysis in physics.

In [29] Baker found that the ℘-functions obey the following differential equations, which contain the KdV hierarchy.

Example 5.17 (Genus = 3 [29]). Let us write $\wp_{ijk} := \partial\wp_{ij}(t)/\partial t_k$ and $\wp_{ijkl} := \partial^2\wp_{ij}(t)/\partial t_k \partial t_l$. The hyperelliptic ℘-function obeys the relations
(1) $\wp_{3333} - 6\wp_{33}^2 = 2\lambda_5\lambda_7 + 4\lambda_6\wp_{33} + 4\lambda_7\wp_{32}$,
(2) $\wp_{3332} - 6\wp_{33}\wp_{32} = 4\lambda_6\wp_{32} + 2\lambda_7(3\wp_{31} - \wp_{22})$,
(3) $\wp_{3331} - 6\wp_{31}\wp_{33} = 4\lambda_6\wp_{31} - 2\lambda_7\wp_{21}$,
(4) $\wp_{3322} - 4\wp_{32}^2 - 2\wp_{33}\wp_{22} = 2\lambda_5\wp_{32} + 4\lambda_6\wp_{31} - 2\lambda_7\wp_{21}$,
(5) $\wp_{3321} - 2\wp_{33}\wp_{21} - 4\wp_{32}\wp_{31} = 2\lambda_5\wp_{31}$,
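(6) $\wp_{3311} - 4\wp_{31}^2 - 2\wp_{33}\wp_{11} = 2\Delta_\wp$,
(7) $\wp_{3222} - 6\wp_{32}\wp_{22} = -4\lambda_2\lambda_7 - 2\lambda_3\wp_{33} + 4\lambda_4\wp_{32} + 4\lambda_5\wp_{31} - 6\lambda_7\wp_{11}$,
(8) $\wp_{3221} - 4\wp_{32}\wp_{21} - 2\wp_{31}\wp_{22} = -2\lambda_1\lambda_7 + 4\lambda_4\wp_{31} - 2\Delta_\wp$,
(9) $\wp_{3211} - 4\wp_{31}\wp_{21} - 2\wp_{32}\wp_{11} = -4\lambda_0\lambda_7 + 2\lambda_3\wp_{31}$,
(10) $\wp_{3111} - 6\wp_{31}\wp_{11} = 4\lambda_0\wp_{33} - 2\lambda_1\wp_{32} + 4\lambda_2\wp_{31}$,
(11) $\wp_{2222} - 6\wp_{22}^2 = -8\lambda_2\lambda_6 + 2\lambda_3\lambda_5 - 6\lambda_1\lambda_7 - 12\lambda_2\wp_{33} + 4\lambda_3\wp_{32} + 4\lambda_4\wp_{22} + 4\lambda_5\wp_{21} - 12\lambda_6\wp_{11} + 12\Delta_\wp$,
(12) $\wp_{2221} - 6\wp_{22}\wp_{21} = -4\lambda_1\lambda_6 - 8\lambda_0\lambda_7 - 6\lambda_1\wp_{33} + 4\lambda_3\wp_{31} + 4\lambda_4\wp_{21} - 2\lambda_5\wp_{11}$,
(13) $\wp_{2211} - 4\wp_{21}^2 - 2\wp_{22}\wp_{11} = -8\lambda_0\lambda_6 - 8\lambda_0\wp_{33} - 2\lambda_1\wp_{32} + 4\lambda_2\wp_{31} + 2\lambda_3\wp_{21}$,
(14) $\wp_{2111} - 6\wp_{21}\wp_{11} = -2\lambda_0\lambda_5 - 8\lambda_0\wp_{32} + 2\lambda_1(3\wp_{31} - \wp_{22}) + 4\lambda_2\wp_{21}$,
(15) $\wp_{1111} - 6\wp_{11}^2 = -4\lambda_0\lambda_4 + 2\lambda_1\lambda_3 + 4\lambda_0(4\wp_{31} - 3\wp_{22}) + 4\lambda_1\wp_{21} + 4\lambda_2\wp_{11}$,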
where ∆℘ = ℘32 ℘21 − ℘31 ℘22 + ℘231 − ℘33 ℘11 . Proposition 5.18. For u = −2(℘gg − λ2g /3) and 3 tg−1 tg−2 tg−1 u(s, t2 , t3 ) = u tg , 2 , 4 + 4 2 2 2 λ2g
obeys the first and the second KdV equations.
Proof. Let us consider g = 3 case. If we regarded as u = −2(℘33 − λ6 /3), it is obvious that (1) in Example 3.14 becomes the KdV equation noting λ7 = 1. By 2 setting 2∂t3 × (2) + ∂t2 × (1) and ∂t3 = 16∂t1 + 16λ 3 ∂t2 , we obtain the second KdV equation. From arguments of Baker [29, 36], even for g > 3 the relations (1) and (2) maintain for g case. Remark 5.19. (1) By above arguments, for given hyperelliptic curve y¯2 = f (¯ x), we can construct a solution of the first and second KdV equations. Further the compatibility of Lax system gives more general argument for the other equations in the KdV hierarchy. Then it implies that we explicitly showed the existence of an injective map Mhyp, g → MKdV, g . This correspondence is valid even for degenerate curves. (2) Our development of the quantized elastica after submitting this article is in [17–20]. In [17–20], we showed more explicit function forms of quantized elastica over C. (3) After submitting this article, we knew the works of Buchstaber, Enolskii, Leykin and related people [reference in [54] as an extension of parts of Baker’s studies. Now let us give another proof of Theorem 4.2(2) and Proposition 4.33(2). Proposition 5.20. Propositions 4.2(2), 4,27(3), and 4.33(2) can be regarded as an approximation theory based upon the Weierstrass preparation theorem.
September 1, 2003 11:49 WSPC/148-RMP
00172
On the Moduli of a Quantized Elastica in P and KdV Flows
613
Proof. Let us recall the moduli space of the KdV equations whose base ring is the ring of smooth functions, as defined in Proposition 4.32,
\[ M^{\infty}_{\mathrm{KdV}} = \{u \in C^{\infty}(V^{\infty}) \mid \partial_{t_n} u - \Omega_1^{n-1} u = 0 \ \text{for } \forall n\}, \qquad M_{\mathrm{KdV}} = M_{\mathrm{KdV}}/(t_1). \]
For an arbitrary $u \in M^{\infty}_{\mathrm{KdV}}$, there is a unique spectrum $\mathrm{Spect}(L := -\partial_s^2 - u)$ up to its orbits, obtained by solving the eigenvalue equation $(-\partial_s^2 - u)\psi_x = \bar x\psi_x$. We assume that the spectrum does not have finitely many gaps, $\{(c_1, c_2), (c_3, c_4), \ldots, (c_{2g-1}, c_{2g}), \ldots\}$, so that the corresponding characteristic equation becomes a transcendental equation $\bar y^2 = f(\bar x)$, where f is a transcendental function with zeros $(c_j)_{j=0,1,\ldots}$. Since u is a smooth function over $S^1$, |u| is bounded from above. Hence around ∞ of Spect(L), $L \sim -\partial_s^2$ and Spect(L) at ∞ is patched by the affine space C; the width of the gaps converges to zero as $\bar x \to \infty$. Thus we can approximate Spect(L) by the finite gap spectrum $\mathrm{Spect}(L_g) := \{(c_1, c_2), (c_3, c_4), \ldots, (c_{2g-1}, c_{2g}), (c_{2g+1}, \infty)\}$. The approximated potential $u_g$ is given by the ℘ function of the hyperelliptic curve $\bar y^2 = h_g(\bar x, 1)$ whose zeros are $(c_j)_{j=1,2,\ldots,2g+1}$. By using the Weierstrass preparation theorem and taking an appropriate g, we can approximate $f(\bar x)$ by $h_g(\bar x, 1)$ to any desired accuracy. Hence, up to the KdVH flow, $u_g$ approaches u as g approaches ∞, by construction. (For an arbitrary finite g, $u_g$ is unique up to its orbits.) Thus for an arbitrary u in $M^{\infty}_{\mathrm{KdV}}$, there is a sequence of points $u_g$ belonging to $F_g M_{\mathrm{KdV}}$ such that $u_g \to u$ as g → ∞, up to orbit. (We note that the finite type solutions do not depend upon the choice of base ring, $C^\infty$ or formal power series.) Hence we have
\[ M^{\infty}_{\mathrm{KdV}} = \cup_g F_g M_{\mathrm{KdV}} . \]
Since $M^{P}_{\mathrm{elas}}$ is a subset of $M^{\infty}_{\mathrm{KdV}}$ and, for an arbitrary curve $\gamma \in M^{P}_{\mathrm{elas}}$, $u := \{\gamma, s\}_{SD}$ has a unique value, the above statement is valid.
Example 5.21 ([1, 2, 4, 13]). An element γ in $M^{P}_{\mathrm{elas}}$ must satisfy the closed condition γ(s + L) = γ(s) in P and the reality condition $|\partial_s\gamma| = 1$. Even though the hyperelliptic function ℘ is a meromorphic function over $J_g$, we can find a trajectory or real line in $J_g$ which avoids the singularities and satisfies the reality and closed conditions. In other words, we will find $M^{P}_{\mathrm{elas}}$ as a subset of $M_{\mathrm{KdV}}$. We give examples of γ ∈ P in terms of the local chart around the origin.
(1) genus g = 0 case: a circle with radius 1.
(2) genus g = 1 cases: $M^{P}_{\mathrm{elas},1}$ consists of two points:
(2-1) Jacobi elliptic modulus l = 1 case:
\[ \gamma(s) = s - \frac{2}{\alpha}\left(\tanh(\alpha s) - \sqrt{-1}\,\operatorname{sech}(\alpha s)\right). \]
(2-2) Jacobi elliptic modulus l = 0.908911···, which gives the figure-eight loop in the complex plane C [4].
Here we note that in [6], Mumford gave simple and deep expression of the shape of elastica, which shows the depth, importance and beauty of this problem. There he showed how the reality condition |∂s γ| = 1 restricts the moduli of elliptic curves.
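The reality condition $|\partial_s\gamma| = 1$ for the genus-one example (2-1) can be checked numerically. The sketch below assumes the loop-soliton form $\gamma(s) = s - (2/\alpha)(\tanh(\alpha s) - \sqrt{-1}\,\mathrm{sech}(\alpha s))$, with the hyperbolic secant in the imaginary part; the function names and the finite-difference step are choices made only for this illustration.

```python
import math


def loop_soliton(s, alpha=1.0):
    """gamma(s) = s - (2/alpha) * (tanh(alpha*s) - i*sech(alpha*s)).

    Assumed form of the l = 1 example; alpha is a free scale parameter.
    """
    return s - (2.0 / alpha) * (math.tanh(alpha * s) - 1j / math.cosh(alpha * s))


def speed(gamma, s, h=1e-6):
    """Central-difference estimate of |d gamma / ds| at the point s."""
    return abs((gamma(s + h) - gamma(s - h)) / (2.0 * h))


if __name__ == "__main__":
    for s in (-3.0, -0.5, 0.0, 1.2, 4.0):
        # Each printed speed should be close to 1 (arc-length parametrization).
        print(s, speed(loop_soliton, s))
```

Analytically, $\partial_s\gamma = (1 - 2\,\mathrm{sech}^2\alpha s) - 2\sqrt{-1}\,\mathrm{sech}(\alpha s)\tanh(\alpha s)$, whose modulus is identically 1, so s is indeed the arc length of this curve.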
September 1, 2003 11:49 WSPC/148-RMP
614
00172
S. Matsutani & Y. Ônishi
6. Cohomology of a Loop Space
As we mentioned in the Introduction, in this section we digress from our analysis of the moduli of a quantized elastica and review arguments on a loop space over $S^2$ in the category of topological spaces Top, whose morphisms are continuous maps (an isomorphism is a homeomorphism, a monomorphism is an injective continuous map, and so on). Studies of loop spaces in Top are well established and their cohomological properties are well known, as in the textbook of Bott and Tu [30]. We can recognize the moduli space of a quantized elastica in P as a loop space in the category of differential geometry DGeom. When we replace smooth functions with continuous functions and P with $S^2$, it is expected that the moduli space of a quantized elastica in P is related to the loop space in Top. In this section, we review a loop space in Top and show its cohomological properties.
Definition 6.1 ([30]). Let E and X be topological spaces, and let X have a good cover U. A map π : E → X is called a fibering if it satisfies the covering homotopy property: for a given map f : Y → E from an arbitrary topological space Y into E and a homotopy $\bar f_t$ of $\bar f = \pi \circ f$ in X ($Y \times [0, 1] \to X$, $\bar f_0 := \bar f$), there is a homotopy $f_t$ of f which covers $\bar f_t$ ($Y \times [0, 1] \to E$ such that $\bar f_t = \pi \circ f_t$).
Definition 6.2 ([30]). (1) The path space of X is defined to be the space P(X) consisting of all the paths in X with initial point ∗:
\[ P(X) := \{\text{maps } \mu : [0, 1] \to X \mid \mu(0) = * \in X\} . \]
(2) The loop space over X with a fixed point is defined by
\[ \Omega X = \{\mu : [0, 1] \to X \mid \mu(0) = \mu(1) = * \in X\} . \]
In the category of topological spaces Top, P and $S^2$ are identified by a homeomorphism. Thus we give the properties of the loop space over $S^2$ in Top as follows.
Theorem 6.3 ([30]). (1) $P(S^2)$ is a fibering over $S^2$ whose fiber is $\Omega(S^2)$:
\[ \begin{array}{ccc} \Omega(S^2) & \longrightarrow & P(S^2) \\ & & \downarrow \\ & & S^2 \end{array} \]
(2) Its cohomology is torsionless and given by
\[ H^q(\Omega S^2, \mathbb{Z}) = \mathbb{Z} \quad \text{for } q \in \mathbb{Z}_{\ge 0} \]
as a module, and its algebraic structure is given by
\[ H^*(\Omega S^2, \mathbb{Z}) = E(x) \otimes_{\mathbb{Z}} \mathbb{Z}_\gamma(e) , \]
where x and e are generators of $H^1(\Omega S^2, \mathbb{Z})$ and $H^2(\Omega S^2, \mathbb{Z})$ respectively (dim x = 1 and dim e = 2). Here E(x) is the exterior algebra $\mathbb{Z}[x]/(x^2)$ and $\mathbb{Z}_\gamma(e)$ is the divided polynomial algebra whose basis is $(1, e, e^2/2!, e^3/3!, \ldots)$. In other words, the generator of $H^{2k+1}(\Omega S^2, \mathbb{Z})$ is $x \cdot e^k/k!$ and that of $H^{2k}(\Omega S^2, \mathbb{Z})$ is $e^k/k!$.
In order to prove Theorem 6.3, we prepare, without proofs, two well-known results in algebraic topology and triangle category [30].
Proposition 6.4 ([30]). For a given double complex $K = \oplus_{q,p \ge 0} K^{p,q}$, there is a spectral sequence $\{E_r, d_r\}$ converging to the total cohomology $H_D(K)$ such that each $E_r$ has a bigrading with $d_r : E_r^{p,q} \to E_r^{p+r,q-r+1}$ and
\[ E_1^{p,q} = H_d^{p,q}(K) , \qquad E_2^{p,q} = H_\delta^{p,q} H_d(K) , \]
where d and δ are the differentials $d : K^{p,q} \to K^{p+1,q}$ and $\delta : K^{p,q} \to K^{p,q+1}$, and $D = d + (-)^p \delta$.
We will consider the double complex for a fibering π : E → M,
\[ K^{p,q} := C^p(\pi^{-1}U, \Omega^q) . \]
Here U is a ramification of M and $\Omega^q$ is a q-form along the fiber.
Proposition 6.5 (Leray–Hirsch theorem [30]). Let π : E → X be a fibering with fiber F over a simply connected topological space X which has a good cover. Then
\[ E_2^{p,q} = H^p(X, H^q(F, A)) , \]
where A is a commutative ring. If $H^q(F, A)$ is a finitely generated A-module, then
\[ E_2 = H^*(X; A) \otimes H^*(F; A) . \]
Proof of Theorem 6.3 [30]. Since P(X) is contractible,
\[ H^q(P(X)) = \begin{cases} \mathbb{Z} & \text{for } q = 0 , \\ 0 & \text{otherwise} , \end{cases} \]
and the spectral sequence must converge to $H^*(P(X))$, the differential on $E_2$ must be an isomorphism except in dimension 0.
E2 :
  q
  5 |  ⋮    ⋮    ⋮    ⋮
  4 |  Z    0    Z    0
  3 |  Z    0    Z    0    ···
  2 |  Z    0    Z    0
  1 |  Z    0    Z    0
  0 |  Z    0    Z    0
    +------------------------------
       0    1    2    3    ···   p
Next we consider the algebraic properties. From Proposition 6.5, $E_2$ is the tensor product $H^*(\Omega S^2) \otimes H^*(S^2)$. Let v be a two-form on $S^2$. If $H^1(\Omega S^2)$ is denoted by $\mathbb{Z}x$, then $E_2^{0,1}$ is expressed by $\mathbb{Z}x \otimes 1$. The differential $d_2$ in $E_2$, which is an isomorphism, acts on x ⊗ 1 as $d_2(x \otimes 1) = 1 \otimes v$. Since
\[ d_2(x^2 \otimes 1) = (d_2 x \otimes 1) \cdot (x \otimes 1) - (x \otimes 1) \cdot (d_2 x \otimes 1) = (1 \otimes v)(x \otimes 1) - (x \otimes 1)(1 \otimes v) = 0 , \]
we have $x^2 = 0$ because $d_2$ is an isomorphism. Thus $d_2^{-1}(x \otimes v)$ is expressed by another generator e in $H^2(\Omega S^2)$, which is algebraically independent of x: $d_2(e \otimes 1) = x \otimes v$. Since $d_2(ex \otimes 1) = e \otimes v$, ex is a generator in dimension 3. Similarly $d_2(e^2 \otimes 1) = 2ex \otimes v$ means that $e^2/2$ is a generator in dimension 4. In other words, we have the following table:
E2 :
  q
  5 |     ⋮          ⋮       ⋮          ⋮
  4 |  e²/2 ⊗ 1     0    e²/2 ⊗ v      0
  3 |  ex ⊗ 1       0    ex ⊗ v        0    ···
  2 |  e ⊗ 1        0    e ⊗ v         0
  1 |  x ⊗ 1        0    x ⊗ v         0
  0 |  1            0    1 ⊗ v         0
    +------------------------------------------------
        0           1      2           3    ···   p
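The appearance of $e^2/2$ rather than $e^2$ as the generator in dimension 4 reflects the multiplication rule of a divided polynomial algebra, in which the basis elements $\gamma_k = e^k/k!$ satisfy $\gamma_k\gamma_m = \binom{k+m}{k}\gamma_{k+m}$. The following minimal sketch tabulates these structure constants; the function name and the pair encoding are illustrative choices, not notation from [30].

```python
from math import comb


def divided_power_product(k, m):
    """Product of divided powers gamma_k * gamma_m, with gamma_k = e^k / k!.

    Returns (coefficient, index) such that
    gamma_k * gamma_m = coefficient * gamma_{k+m}.
    """
    return comb(k + m, k), k + m


if __name__ == "__main__":
    # e * e = 2 * (e^2 / 2): e^2 is twice the generator in dimension 4.
    print(divided_power_product(1, 1))
    # e * (e^2 / 2) = 3 * (e^3 / 3!): e^3 is 3! times the generator in dimension 6.
    print(divided_power_product(1, 2))
```

Over the integers this is what distinguishes $\mathbb{Z}_\gamma(e)$ from the polynomial ring $\mathbb{Z}[e]$: the powers $e^k$ are divisible by $k!$ in the cohomology ring.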
Hence Theorem 6.3 is proved.
Remark 6.6. We have shown the result for the loop space defined in Definition 6.2. However, there are several studies on another loop space,
\[ \{\gamma : S^1 \hookrightarrow S^2 \mid \text{smooth immersion}\} , \]
and its cohomology, which differs from the result in Theorem 6.3 [7]. This loop has a freedom of choice of the starting point of $S^1$ in $S^2$. In this article, however, we are concerned with a loop space with a fixed point, as we mentioned in Remark 2.15 and Definition 2.10. Accordingly we mention only that result.
7. Topological Properties of Moduli $M^{P}_{\mathrm{elas}}$
In the previous section we reviewed the cohomological properties of a loop space in Top; in this section we discuss its relation to our loop space in DGeom, that is, the moduli space of a quantized elastica. We believe that such considerations are important for the quantization of an elastica and the statistical mechanics of polymer physics [12, 13, 32, 55]. The loop spaces in both Top and DGeom are infinite dimensional spaces when we regard them as manifolds in an appropriate sense. Even though it is not known whether de Rham's theorem is applicable to such an infinite dimensional manifold, it is expected that the cohomological sequences should correspond to each other. More precisely, as we will show later, the closed condition and the reality condition $|\partial_s\gamma| = 1$ in the moduli space $M^{P}_{\mathrm{elas}}$ make its topological properties difficult. Thus we must tune the 0-dimension of the cohomology related to $M^{P}_{\mathrm{elas}}$.
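Then we will reach our third main theorem, Theorem 7.4, which implies that the cohomology of MKdV reproduces Theorem 6.3 with R coefficients. Since the loop space in Top is given with the fixed point, there is no translation freedom for the loop in $S^2$, which corresponds to our situation of the quotient of E 0 (C) in Definition 2.2. Further, there is no freedom to change the origin of the loop in Top. Hence we must compare $\Omega S^2$ with MPelas rather than MPelas . Further $M^{P}_{\mathrm{elas},0} \approx M^{P}_{\mathrm{elas},1}/W_{t_1} \approx \mathrm{pt}$, which should be regarded as the same class because both are zero dimensional. The sequence $FM^{P}_{\mathrm{elas}}$ might be the natural one:
\[ FM^{P}_{\mathrm{elas}} : \emptyset \to F_1 M^{P}_{\mathrm{elas}} \hookrightarrow F_2 M^{P}_{\mathrm{elas}} \hookrightarrow \cdots \hookrightarrow F_{g-1} M^{P}_{\mathrm{elas}} \hookrightarrow F_g M^{P}_{\mathrm{elas}} \hookrightarrow F_{g+1} M^{P}_{\mathrm{elas}} \hookrightarrow \cdots . \]
Noting $M^{P}_{\mathrm{elas},g} := F_g M^{P}_{\mathrm{elas}}/F_{g+1} M^{P}_{\mathrm{elas}}$, as we are concerned only with its topological properties, let us consider the related complex of vector spaces,
\[ GM^{P}_{\mathrm{elas}} : \emptyset \xrightarrow{\delta} F_1 M^{P}_{\mathrm{elas}}/W_{t_{0,1}} \xrightarrow{\delta} M^{P}_{\mathrm{elas},2}/W_{t_2} \xrightarrow{\delta} \cdots \xrightarrow{\delta} M^{P}_{\mathrm{elas},g-1}/W_{t_{g-1}} \xrightarrow{\delta} M^{P}_{\mathrm{elas},g}/W_{t_g} \xrightarrow{\delta} M^{P}_{\mathrm{elas},g+1}/W_{t_{g+1}} \xrightarrow{\delta} \cdots , \]
with the trivial map δ = 0 and $\delta^2 = 0$. As each $M^{P}_{\mathrm{elas},g}/W_{t_g}$ is a finite dimensional vector space $\mathbb{R}^{g-1}$ thanks to Proposition 4.30, we have the de Rham complex $DM^{P}_{\mathrm{elas},g}$ (g > 1),
\[ DM^{P}_{\mathrm{elas},g} : 0 \to \Omega^0(M^{P}_{\mathrm{elas},g}/W_{t_g}) \xrightarrow{d} \Omega^1(M^{P}_{\mathrm{elas},g}/W_{t_g}) \xrightarrow{d} \Omega^2(M^{P}_{\mathrm{elas},g}/W_{t_g}) \xrightarrow{d} \cdots , \]
and $DM^{P}_{\mathrm{elas},1}$,
\[ DM^{P}_{\mathrm{elas},1} : 0 \to \Omega^0(F_1 M^{P}_{\mathrm{elas}}/W_{t_{0,1}}) \xrightarrow{d} \Omega^1(F_1 M^{P}_{\mathrm{elas}}/W_{t_{0,1}}) \xrightarrow{d} \Omega^2(F_1 M^{P}_{\mathrm{elas}}/W_{t_{0,1}}) \xrightarrow{d} \cdots , \]
where $\Omega^p(M)$ is the set of p-forms over M.
Proposition 7.1. Let us consider a double complex $CM^{P}_{\mathrm{elas}}$ with the differential $D = d + (-)^g \delta$,
\[ 0 \to DM^{P}_{\mathrm{elas},1} \to DM^{P}_{\mathrm{elas},2} \to \cdots \to DM^{P}_{\mathrm{elas},g-1} \to DM^{P}_{\mathrm{elas},g} \to DM^{P}_{\mathrm{elas},g+1} \to \cdots . \]
Then its cohomology,
\[ H^p(CM^{P}_{\mathrm{elas}}) := \oplus_g H^{p-g+1}(DM^{P}_{\mathrm{elas},g}) , \]
is given by $H^0(CM^{P}_{\mathrm{elas}}) := \mathbb{R}$ and
\[ H^p(CM^{P}_{\mathrm{elas}}) = \mathbb{R}\, dt_2 \wedge dt_3 \wedge \cdots \wedge dt_{p+1} , \qquad p > 0 . \]
Proof. First we note
\[ M^{P}_{\mathrm{elas},g}/W_t \approx \mathbb{R}^{g-1} \ \text{ for } g \ge 1 , \qquad CM^{P}_{\mathrm{elas},1}/W_t \approx \mathrm{pt} . \]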
Since we have, for n ≥ 0 [30],
\[ H^p(\mathbb{R}^n) = \mathbb{R} \quad \text{for } p = 0 . \]
Due to Poincaré duality, we have
\[ H^p(\mathbb{R}^n) = H_c^{n-p}(\mathbb{R}^n) , \]
where $H_c^p$ denotes the cohomology with compact support [30]. The generator is expressed by
\[ dt_2 \wedge dt_3 \wedge \cdots \wedge dt_g , \]
multiplied by a compactly supported function.
First, from Proposition 3.11(5), let us interpret $\Omega : \partial_{t_n} \mapsto \partial_{t_{n+1}}$ as an endomorphism of the tangent space $T_* J_g$ of the Jacobi variety of a hyperelliptic curve related to a point γ in $M^{P}_{\mathrm{elas}}$. Since the Jacobi variety is a quotient space of $\mathbb{C}^g$, its tangent space (and also its cotangent space) can be identified with $\mathbb{C}^g$: $T^* J_g \approx T_* J_g \approx \mathbb{C}^g$. Of course, we are concerned only with its real part $\mathbb{R}^g$. Then using the canonical duality in the real part $\mathbb{R}^g$,
\[ \langle \partial_{t_n}, dt_m \rangle = \delta_{n,m} , \]
we can introduce endomorphisms $\Omega^{-1*}$ and $\Omega^*$ of $M^{P}_{\mathrm{elas}}$,
\[ \Omega^* : dt_n \mapsto dt_{n-1} = \Omega^* dt_n , \qquad \Omega^{-1*} : dt_n \mapsto dt_{n+1} = \Omega^{-1*} dt_n , \]
where $\langle \Omega\partial_{t_n}, dt_m \rangle = \langle \partial_{t_n}, \Omega^* dt_m \rangle$.
Definition 7.2. Let us define an endomorphism of MPelas by, := dt2 Ω−1∗ , where Ω−1∗ is regarded as a right action operator, q = dt2 Ω−1∗ (∧q−1 ) for q > 1 and Ω−1∗ · 1 := 1. Then we have the properties of as follows. Lemma 7.3. (1) We have the relation q · 1 = dt2 ∧ dt3 ∧ · · · ∧ dtq+1 . (2) can be realized by ˜, X ˜ := σ k , 0 := dt2 , k := dtk+2 ∧ (dtk+1 i∂tk+1 ) (k > 0) , k=0
where σ is the permutation operator
\[ \sigma = \begin{pmatrix} 1 & 2 & 3 & \cdots & q-1 & q \\ q & q-1 & q-2 & \cdots & 2 & 1 \end{pmatrix} \]
and i∂tk is an inner product ∗ operator ; i∂tk · dtl = h∂tk , dtl i = δlk . (3) There is a ring isomorphism, ϕ0 : R ⊗R E(x) ⊗R Zγ (e) → R[[2 , dt2 ]] by ϕ0 : (e, x) → (2 , dt2 ) ,
where the product in R[[2 , dt2 ]] is defined by ∗ dt2 = dt2 ∗ := · dt2 ,
∗ = 2 ,
dt2 ∗ dt2 = dt2 ∧ dt2 = 0 .
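Part (1) of Lemma 7.3 can be illustrated with a small combinatorial model. In the sketch below the operator of Definition 7.2 is written `eps` (a placeholder name used only here), a wedge monomial $dt_{i_1} \wedge \cdots \wedge dt_{i_k}$ is encoded as the tuple of its indices, and $\Omega^{-1*}$ is modelled as the index shift $dt_n \mapsto dt_{n+1}$ applied to every factor. This encoding is an assumption made for illustration and ignores signs and coefficients.

```python
def shift(form):
    """Model of Omega^{-1*}: send every factor dt_n to dt_{n+1}."""
    return tuple(n + 1 for n in form)


def eps(form):
    """Model of the operator of Definition 7.2: wedge dt_2 onto the shifted form."""
    return (2,) + shift(form)


def eps_power(q):
    """Apply eps q times to the vacuum 1, encoded as the empty tuple."""
    form = ()
    for _ in range(q):
        form = eps(form)
    return form


if __name__ == "__main__":
    # Three applications give the index tuple of dt_2 ^ dt_3 ^ dt_4.
    print(eps_power(3))
```

In this model each application raises the form degree by one and produces exactly the monomial $dt_2 \wedge dt_3 \wedge \cdots \wedge dt_{q+1}$ after q steps, which is the pattern stated in Lemma 7.3(1).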
Proof. (1) For example 2 · 1 = dt2 Ω−1∗ (∧dt2 Ω−1∗ ) · 1 = dt2 ∧ dt3 and this can be extended to general case. (2) Noting 2k = 0, (k ≥ 0), straightforward computations gives the results. (3) Noting Theorem 6.3, it is obvious. Here we will note that : Hcg (Rg ) → Hcg (Rg+1 ) generates the sequence Rg ,→ Rg+1 , and thus m could be regarded as a generator of the filter topology of MPelas and MKdV . Thus it means that we can evaluate the moduli space of a quantized elastica MPelas using the induced topology and as in Proposition 4.26. Finally we reach our third main theorem. Theorem 7.4. By setting e = 2 , x = dt2 , the cohomology H q (CMPelas ), is a ring isomorphic to H q (ΩS 2 , R), φ : H ∗ (CMPelas ) → ˜ H ∗ (ΩS 2 , R) . Remark 7.5. (1) The closed condition γ(s + L) = γ(s) for some L and the reality condition |∂s γ| = 1 are too strong. For example due to the condition, CMPelas 0 and CMPelas 1 consist only of disjoint points as mentioned in Examples 5.21 and [13]. Thus if we assign real vector bases each point, these cohomology might be H p (CMPelas 0 ) = Rδ0,p and H p (CMPelas 1 ) = R ⊕ Rδ0,p . These phenomena come from a “elasticity” in the category DGeom but we wish to consider the topological properties of the loop space in DGeom. Thus we have replaced MPelas with CMPelas by loosing strongness of the condition and make its topology weak; it implies a replacement to fewer open sets. This replacement comes from modulo computations in the gauge transformation by Wt in the KdV equations, using the natural immersion iKdV : MPelas → MKdV . However for a sufficiently large g case, the closed condition and the reality condition might not have serious effects. Then the quotient by the gauge transformations can be also guaranteed by the fact that each moduli space of compact Riemannian surface of genus g is simply connected [56]. Accordingly we consider that the replacement is not worse. 
The isomorphism φ could be regarded as a functor between the triangle categories of the loop spaces in DGeom and Top and a quasi-isomorphism between CMPelas and ΩS 2 [7]. (These objects in the triangle categories are vector spaces given by n and (xa , em ) respectively. The morphisms are multiplications as their ring structures.) (2) From the definition, m can be regarded as a map from H q (CMPelas ) to q+m H (CMPelas ). We should regard that this map comes from the properties of vertex operator, which change the genus of curves [57] and m · 1 is interpreted as a topological base of CMPelas .
(3) The operator of Definition 7.2 induces the complexes $FM^{P}_{\mathrm{elas}}$ and $GM^{P}_{\mathrm{elas}}$. This essentially exhibits the topology of Sato theory, because in Sato theory [23–25] the existence of the gauge transformation $W_t$ is a key factor. Theorem 7.4 means that its topology is as strong as that of a loop space in Top. It implies that the topology of Sato theory is too weak to express the fine structure of the moduli space, as Harris and Morrison pointed out in [50, pp. 44–45]; they stated that the geometrical approach in [26] does not influence the study of the moduli space of algebraic curves, including Mhyp [50]. In fact, as mentioned in [50, 52], the moduli space Mhyp is, in general, very complicated, whereas our approach is not so difficult. Accordingly we wish to obtain a stronger topology to express the moduli space. We hope that the day will come when the studies of the quantized elastica are connected with those of Mhyp, as Euler did for the case of genus one [1, 2, 4].
(4) As we will comment in Remark 8.9, the correspondence between loop spaces in DGeom and Top can be extended to higher dimensional loop spaces by considering a recent result on a quantized elastica in $\mathbb{R}^3$ [14].
8. Discussion
8.1. Although we have a correspondence between the homological properties of $\Omega S^2$ in Top and those of $M^{P}_{\mathrm{elas},g}$ in DGeom, there is an open problem concerning a correspondence of the homotopy groups between them, e.g.
\[ \pi_{q-1}(\Omega S^2) = \pi_q(S^2) \quad (q \ge 2) , \qquad \pi_{q-1}(\Omega S^2) \times \mathbb{Q} = \begin{cases} \mathbb{Q} & \text{for } q = 1, 2 , \\ 0 & \text{otherwise} . \end{cases} \]
8.2 [13]. We consider $\gamma \in M^{C}_{\mathrm{elas}}$ in this remark. By defining
\[ v = \frac{\partial_s^2\gamma}{2\sqrt{-1}\,\partial_s\gamma} , \]
this problem is related to the quantization of an elastica in C,
\[ Z[\beta] = \int_{M^{C}_{\mathrm{elas}}} D\gamma \exp\left(-\beta \int_{S^1} v^2\, ds\right) . \]
For β > 0, the domain of $E = \int_{S^1} v^2\, ds$ can be extended to the ∞-point and we define
\[ M^{C}_{\mathrm{elas}} = \{\gamma : S^1 \to \mathbb{C} \mid \gamma \text{ is continuous}, \ |\partial_s\gamma(s)| = 1\}/\sim . \]
In other words, as we assign the energy of γ with wild shape to the ∞-point of E, it does not contribute to the partition function Z. Then we can regard the partition function as
\[ Z : M^{C}_{\mathrm{elas}} \times \mathbb{R}_{\ge 0} \to \mathbb{R} . \]
The integral region in Z is recognized as $M^{C}_{\mathrm{elas}}$. Due to our Theorem 3.4, we have a natural projection operator $\Pi_E$:
\[ \Pi_E : M^{C}_{\mathrm{elas}} \to M^{C}_{\mathrm{elas},E} , \qquad \Pi_E^2 = \Pi_E . \]
We have a spectral decomposition,
\[ 1_{M^{C}_{\mathrm{elas}}} = \int dE\, \Pi_E . \]
Hence the partition function becomes
\[ Z[\beta] = \int dE\, \mathrm{Vol}(M^{C}_{\mathrm{elas},E})\, e^{-\beta E} , \]
where $\mathrm{Vol}(M^{C}_{\mathrm{elas},E})$ means the volume of $M^{C}_{\mathrm{elas},E}$. Here we comment on the question of why we can use the concept of the orbits of a "kinematic" system even though, in noncommutative algebra, the concept of an orbit sometimes becomes meaningless, e.g. for the Kronecker foliation [58]. Even in the quantized problem, we can continue to use the concept of orbit and commutative geometry, even though the dimension of the orbit space need not be finite. Let us extend the domain of β from $\mathbb{R}_{\ge 0}$ to $\mathbb{R}_{\ge 0} + \infty$. Note that, as the inverse image
\[ M^{C}_{\mathrm{elas,cls}} = Z^{-1}(Z(\infty)) , \]
the classical moduli space of the harmonic map of the elastica, depending upon the boundary condition, is naturally immersed in our moduli space $M^{C}_{\mathrm{elas}}$. In other words, our analysis naturally contains Euler's perspective of the classical elastica [1–4].
8.3. Due to the projection operator, we can define an order in the moduli space $M^{P}_{\mathrm{elas}}$. Noting that the energy E is real in $M^{C}_{\mathrm{elas}}$, let
\[ M^{C}_{\mathrm{elas},<E} := \coprod_{E' < E} M^{C}_{\mathrm{elas},E'} . \]
For $E_1 < E_2$, we have
\[ M^{C}_{\mathrm{elas},<E_1} \subset M^{C}_{\mathrm{elas},<E_2} . \]
Then the moduli space $M^{C}_{\mathrm{elas}}$ is an ordered space.
8.4. The operator in Lemma 7.3 can be regarded as a creation operator in quantum field theory. The vacuum state is regarded as 1. We can define the dual space of $V^\infty$; $\langle e_m, e_n \rangle = \delta_{nm}$ where $e_n = dt_n$ and $e_m = \partial_{t_m}$. Further by noting modulo 2 , we can reconstruct CMPelas in Lemma 4.31. On the other hand, we can introduce the micro-differential operator $e_m$ (m ∈ Z) as the base of CMPelas as in Definition 3.1 and Proposition 7.1. Then, as the dual of CMPelas , we can define $e_m$ (m < 0), and the vacuum of this field operator in quantum field theory has an affine structure, as physicists think.
8.5. In the differential operator ring $D_s$, the integral $\int_{S^1} \partial_s u = 0$ means that, since the integral is a linear map, its kernel belongs to $D_s/\partial_s D_s$. Using Definition 3.7 and Proposition 3.11, let us define
\[ h := \sum_j h_j\, dt_j , \qquad \delta := \sum_j dt_j\, \partial_{t_j} , \qquad a = u\, ds . \]
We have the transformation in $(D_s/\partial_s D_s)$:
\[ \delta a = \tilde\Omega h , \qquad \tilde\Omega := ds\, \partial_s \frac{\delta}{\delta u} , \qquad \delta * h = 0 . \]
This relation is called the Becchi–Rouet–Stora (BRS) relation [13, 59].
8.6. We introduce a dilatation flow $\partial_t\psi_x = t\partial_s\psi_x$. The intersection between this flow and the KdV flow is governed by the Painlevé equation of the first kind,
\[ s = 3u^2 + \partial_s^2 u . \]
This statement can be proved as follows. The KdV flow in Remark 3.8 is given by $B_1 = u$, while this flow is given by $B_1 = t$. Hence u = t and the KdV flow becomes
\[ \partial_t u = 1 = \partial_s(3u^2 + \partial_s^2 u) , \]
and we obtain the Painlevé equation of the first kind [13, 31].
8.7. Since the Schwarz derivative u is invariant under $\mathrm{PSL}_2(\mathbb{C})$ and $\mathrm{PSL}_2(\mathbb{C})$ acts transitively upon P, we can regard $M^{P}_{\mathrm{elas}}$ as
\[ \Omega\mathrm{SL}_2(\mathbb{C}) := \{\gamma : S^1 \hookrightarrow \mathrm{PSL}_2(\mathbb{C}) \mid \gamma(0) = 1\} . \]
Because $\gamma(s) = g_s\gamma(0)$ for $g \in \mathrm{PSL}_2(\mathbb{C})$, we have the condition g(0) = g(2π). As Witten pointed out, for a loop space we can naturally construct its tangent space as a loop space of the tangent space of the target space [60]. In other words, we can naturally define a loop algebra $\Omega\,\mathrm{sl}_2(\mathbb{C})$. In the loop algebra, we have only the condition $g^{-1}dg(0) = g^{-1}dg(2\pi)$ using $g \in \mathrm{SL}_2(\mathbb{C})$, which is not a stronger condition than g(0) = g(2π). Since there is a smooth map from $S^1$ to $S^1$ as $\mathrm{Diff}(S^1)$, we obtain an expression of the loop algebra,
\[ \mathrm{Diff}(S^1) \otimes \mathrm{sl}_2(\mathbb{C}) \oplus \mathbb{C} , \]
which acts upon $M^{\infty}_{\mathrm{KdV}}$ in Proposition 4.32 with the weaker condition. The KdV flow has a bi-hamiltonian structure and the 2-cocycle
\[ \omega_\Omega(X, Y) := \omega(\Omega X, Y) + \omega(X, \Omega Y) . \]
Using the ordinary functional derivative (Gâteaux derivative, δu(y)/δu(x) = δ(x − y)), we can write down the (second) Poisson relation,
\[ \{u(s), u(s')\} = \Omega\,\delta(s - s') , \]
where δ(s) is the Dirac δ-function. Let
\[ l_n := \frac{1}{2\pi} \int ds\, u\kappa\, e^{isn} \]
denote its Fourier component. Then it obeys the semi-classical Virasoro algebra,
\[ \{l_n, l_m\} = (n - m)\, l_{n+m} + n(n^2 - 1)\,\delta_{n+m,0} , \]
where the second term is the unit central charge. Thus we have the Virasoro algebra. Using the topological relation $\mathbb{C}^* \sim S^1$, the problem of conformal field theory is reduced to that of the loop algebra. Thus our relation can also be interpreted in the regime of conformal field theory, and it is clear that our problem is related to two dimensional quantum gravity [50].
8.8. It is known that for $H_0 := \int u\, ds$, the second Poisson structure of $H_0$ reproduces the KdV equation; when the second Poisson bracket is defined as
\[ \{X, Y\}_\Omega = \omega_\Omega(X, Y) , \]
$\partial_t u = \{u, H_0\}_\Omega$ is $\partial_t u + 6u\partial_s u + \partial_s^3 u = 0$. If we use the Hamiltonian $H_n$ of the higher dimensional KdV as the energy functional of the system, we have another decomposition,
\[ M^{P,(n)}_{\mathrm{elas}} = \coprod_E M^{P,(n)}_{\mathrm{elas},E} , \qquad M^{P,(n)}_{\mathrm{elas},E} := \{\gamma_t \in M^{P}_{\mathrm{elas}} \mid H_n - E = 0\} . \]
The space is determined by the n(> 1)th KdV hierarchy.
8.9 [14, 30]. According to the results in [7], we have the relation
\[ H^q(\Omega S^n, \mathbb{Z}) = \mathbb{Z} \quad \text{for } q = 0 \text{ modulo } n - 1 . \]
As we mentioned in Remark 7.5, it is expected that the moduli space of a quantized elastica in $S^n$ has similar cohomological properties. In fact, one of the authors calculated the quantized elastica in $\mathbb{R}^n$ and obtained the same structure of the moduli space of a quantized elastica in $\mathbb{R}^n$ [14].
8.10. We wish to know the volume of each $M^{C}_{\mathrm{elas},E}$. However, this problem is not easy. In fact, as pointed out in [50], soliton theory might not help to obtain any information on the structure of $M^{C}_{\mathrm{elas},E}$. In other words, our Theorem 7.4 means that the filter topology in soliton theory is too weak and is equivalent to the topological properties of the loop
space. It might have no effect on the study of the geometrical features of the moduli space of hyperelliptic curves. Thus we believe that we must go beyond the ordinary soliton theories to another theoretical world for the study of the moduli space of a quantized elastica, as Euler investigated the elliptic functions by studying the shape of classical elasticas [1–5].
8.11. First we note the relations for P, C and the upper half complex plane H:
\[ \begin{array}{lll} P : & \mathrm{PSL}_2(\mathbb{C}) : & \dfrac{a\gamma + b}{c\gamma + d} , \\ C : & & a\gamma + b , \\ H : & \mathrm{PSL}_2(\mathbb{R}) : & \dfrac{a\gamma + b}{c\gamma + d} . \end{array} \]
We showed that loops on P are related to the KdV flow and that loops on C are related to the MKdV flow. Next we should consider loops on H.
8.12. One of the solutions of
\[ \left(-\partial_s^2 - \frac{1}{2}\{\gamma, s\}_{SD}\right)\psi = 0 \]
is given by $1/\sqrt{\partial_s\gamma}$. The coordinate transformation for $\mathrm{Diff}(S^1)$ leads us to redefine ψ as the invariant form $\sqrt{ds/d\gamma}$. This reminds us of the prime form and the Dirac field, which has a half weight in the same way as the theta function [17, 19, 53]. In fact, for a curve in C ⊂ P, there is a natural topology of γ induced from the distance in C, which is given by the Frenet–Serret relation:
\[ \begin{pmatrix} \partial_s & k/2 \\ -k/2 & \partial_s \end{pmatrix} \begin{pmatrix} 1/\sqrt{\partial_s\gamma} \\ i/\sqrt{\partial_s\gamma} \end{pmatrix} = 0 . \]
This operator is regarded as the Dirac operator. The Dirac operator could be regarded as a translator from the category of analysis to the category of geometry. Hence, as we are dealing with the topology of the Dirac operator, we might have a stronger topology of the curve. We can extend this structure to a conformal surface in $\mathbb{R}^3$ as the generalized Weierstrass relation [33, 34, 55, 61]. We note that this Dirac operator (and the Schrödinger operator in Proposition 2.8) defined upon the loop space differs from the Dirac operator of Witten in [60], because Witten's is related to conformal field theory and the ordinary string, which is determined by intrinsic properties, whereas ours is related to the extrinsic Polyakov string [33, 34, 55].
8.13. As we noticed in 8.2, the partition function Z can be expressed by
\[ Z = \int dE\, \mathrm{Vol}(M^{C}_{\mathrm{elas},E})\, e^{-\beta E} , \]
where $\mathrm{Vol}(M^{C}_{\mathrm{elas},E})$ is formally represented by
\[ \mathrm{Vol}(M^{C}_{\mathrm{elas},E}) = \sum_g \int_{M^{C}_{\mathrm{elas},E,g}} d\,\mathrm{vol}(J) \int_J dt_2\, dt_3 \cdots dt_g , \]
where d vol(J) is the volume form around a point J in $M^{C}_{\mathrm{elas},E,g}$ and $M^{C}_{\mathrm{elas},E,g} := M^{C}_{\mathrm{elas},E} \cap M^{C}_{\mathrm{elas},g}$. Then we leave out the integral over $t_2$ in the above expression and obtain the partition function depending on the time $t_2$,
\[ Z[t_2] = \int dE \sum_g \int_{M^{C}_{\mathrm{elas},E,g}} d\,\mathrm{vol}(J) \int_J dt_3 \cdots dt_g\, e^{-\beta E} . \]
Similarly we obtain $Z[t_2, t_3, \ldots, t_g]$, which is a generating function [39]. Then we can expect that it might obey the KdV equation or a related equation. This situation might be related to Witten's conjecture and Kontsevich's theorem [50].
Acknowledgments
One of us (S.M.) would like to thank Prof. F. Pedit and Prof. K. Tamano for critical discussions and for drawing his attention to this problem. Prof. K. Tamano has taught him algebraic topology and differential geometry based upon [30] and [7] for over a decade and critically read this manuscript. He also thanks Prof. S. Saito, Prof. T. Tokihiro, W. Kawase and H. Mitsuhashi for helpful discussions and comments in the early stage of this study. Before this study started, Prof. K. Sogo privately suggested to him that soliton equations should be expressed in a projective space, and thus this study is one of the answers to his suggestion. He also thanks Prof. A. Kholodenko for telling him of the reference [7] and for much encouragement and many discussions by e-mail, and Prof. B. L. Konopelchenko for kind letters encouraging his work. He is also grateful to Prof. Y. Ohnita, Prof. M. Guest, Dr. R. Aiyama and Prof. K. Akutagawa for inviting him to their seminars and for critical discussions, and especially to Prof. M. Guest for sending him the reference [10]. Further we thank Prof. J. McKay for his interest in this article; his kind comments encouraged us to revise the manuscript. Finally we would like to express our sincere thanks to the referee for appropriate suggestions, which improved this article.
References
[1] C. Truesdell, The influence of elasticity on analysis: The classic heritage, Bull. Amer. Math. Soc. 9 (1983), 293–310.
[2] C. Truesdell, Leonhardi Euleri Opera Omnia ser. Secunda XI; The Rational Mechanics of Flexible or Elastic Bodies 1638–1788, Birkhäuser Verlag, Berlin, 1960.
[3] A. E. H. Love, A Treatise on the Mathematical Theory of Elasticity, Cambridge University Press, Cambridge, 1927.
[4] L. Euler, Methodus inveniendi lineas curvas maximi minimive proprietate gaudentes, Lausanne, 1744.
[5] A. Weil, Number Theory: An Approach through History; From Hammurapi to Legendre, Birkhäuser, Cambridge, 1983.
[6] D. Mumford, "Elastica and computer vision" in Algebraic Geometry and its Applications, ed. C. Bajaj, Springer-Verlag, Berlin, 1993, pp. 507–518.
[7] J.-L. Brylinski, Loop Spaces, Characteristic Classes and Geometric Quantization, Birkhäuser, Boston, 1992.
[8] M. A. Guest, Harmonic Maps, Loop Groups, and Integrable Systems (London Math. Soc. Student Text 38), Cambridge University Press, Cambridge, 1997.
[9] J. Langer and R. Perline, Poisson geometry of the filament equation, J. Nonlinear Sci. 1 (1991), 71–91.
[10] G. Segal, Topological Methods in Quantum Field Theory, eds. W. Nahm et al., World Scientific, Singapore, 1990, pp. 96–106.
[11] G. Segal and G. Wilson, Loop groups and equations of KdV type, IHES 61 (1985), 5–65.
[12] S. Matsutani, Geometrical construction of the Hirota bilinear form of the modified Korteweg-de Vries equation on a thin elastic rod: Bosonic classical theory, Int. J. Mod. Phys. A 22 (1995), 3109–3123.
[13] S. Matsutani, Statistical mechanics of elastica on plane: origin of MKdV hierarchy, J. Phys. A 31 (1998), 2705–2725.
[14] S. Matsutani, Statistical mechanics of elastica in R3, J. Geom. Phys. 29 (1999), 243–259.
[15] A. L. Kholodenko and T. A. Vilgis, Some geometrical and topological problems in polymer physics, Phys. Rep. 298 (1998), 251–370.
[16] F. Pedit, KdV flows on the Riemann sphere, a talk at the meeting on "Study on Integrability in Differential Geometry", Lecture on Tokyo Metropolitan University, Jan. 8–10, 1998.
[17] S. Matsutani, Closed loop solitons and sigma functions: Classical and quantized elasticas with genera one and two, J. Geom. Phys. 39 (2001), 50–61.
[18] S. Matsutani, Hyperelliptic solutions of KdV and KP equations: Reevaluation of Baker's study on hyperelliptic sigma functions, J. Phys. A 34 (2001), 4721–4732.
[19] S. Matsutani, Hyperelliptic loop solitons with genus g: Investigations of a quantized elastica, J. Geom. Phys. 43 (2002), 146–162.
[20] S. Matsutani, Explicit hyperelliptic solutions of modified Korteweg-de Vries equation: Essentials of Miura transformation, J. Phys. A. Math & Gen. 35 (2002), 4321–4333.
[21] C. MacLaughlin, Orientation and string structures on loop spaces, Pac. J. Math 155:1 (1992), 143–156.
[22] L. A. Dickey, Soliton Equations and Hamiltonian Systems, World Scientific, Singapore, 1991.
[23] M. Sato, D-Modules and nonlinear system, Adv. Stud. Pure Math. 19 (1989), 417–434.
[24] M. Sato and M. Noumi, Soliton Equation and Universal Grassmannian Manifold (in Japanese), Sophia University, Tokyo, 1984.
[25] M. Sato and Y. Sato, Soliton equations as dynamical systems on infinite dimensional Grassmann manifold, Nonlinear Partial Differential Equations in Applied Science, eds. H. Fujita, P. D. Lax and G. Strang, Kinokuniya/North-Holland, Tokyo, 1983.
[26] M. Mulase, Cohomological structure in soliton equations and Jacobian varieties, J. Diff. Geom. (1984), 403–430.
[27] I. M. Krichever, Methods of algebraic geometry in the theory of nonlinear equations, Russian Math. Surveys 32 (1977), 185–213.
[28] E. D. Belokolos, A. I. Bobenko, V. Z. Enol'skii, A. R. Its and V. B. Matveev, Algebro-Geometric Approach to Nonlinear Integrable Equations, Springer, New York, 1994.
[29] H. F. Baker, On a system of differential equations leading to periodic functions, Acta Math. 27 (1903), 135–156.
[30] R. Bott and L. W. Tu, Differential Forms in Algebraic Topology, Springer, New York, 1982.
[31] E. L. Ince, Ordinary Differential Equations, Dover, New York, 1956.
[32] S. Matsutani, On density of state of quantized Willmore surface: A way to a quantized extrinsic string in R3, J. Phys. A 31 (1998), 3595–3606.
[33] S. Matsutani, Dirac operator of a conformal surface immersed in R4: Further generalized Weierstrass relation, Rev. Math. Phys. 12 (2000), 431–444.
[34] S. Matsutani, Immersion anomaly of Dirac operator on surface in R3, Rev. Math. Phys. 11 (1999), 171–186.
[35] H. Poincaré, Papers on Fuchsian functions, J. Stillwell, Springer, 1985.
[36] H. F. Baker, Abelian Functions, Cambridge University Press, Cambridge, 1897.
[37] R. E. Goldstein and D. M. Petrich, The Korteweg-de Vries hierarchy as dynamics of closed curves in the plane, Phys. Rev. Lett. 67 (1991), 3203–3206.
[38] R. E. Goldstein and D. M. Petrich, Solitons, Euler's equation, and vortex patch dynamics, Phys. Rev. Lett. 67 (1992), 555–558.
[39] P. Ramond, Field Theory, A Modern Primer, Benjamin/Cummings, Massachusetts, 1981.
[40] R. Abraham and J. E. Marsden, Foundations of Mechanics, 2nd ed., Addison-Wesley, Reading, 1985.
[41] J. L. Burchnall and T. W. Chaundy, Commutative Ordinary Differential Operators, Proc. Royal Society London (A) 118 (1928), 557–583.
[42] J. L. Burchnall and T. W. Chaundy, Commutative Ordinary Differential Operators II, Proc. Royal Society London (A) 134 (1931), 471–485.
[43] H. F. Baker, Note the foregoing paper "Commutative Ordinary Differential Operators" by J. L. Burchnall and T. W. Chaundy, Proc. Royal Society London (A) 118 (1928), 584–593.
[44] D. Mumford, An algebro-geometric construction of commuting operators and of solutions to the Toda lattice equation, Korteweg-de Vries equation and related nonlinear equation, Intl. Symp. on Algebraic Geometry (1977), Kyoto, 115–153.
[45] H. Whitney, Analytic extensions of differentiable functions defined in closed sets, Trans. Amer. Math. Soc. 36 (1934), 63–89.
[46] R. Hartshorne, Algebraic Geometry, Springer, Berlin, 1977.
[47] P. G. Drazin and R. S. Johnson, Solitons: An Introduction, Cambridge University Press, Cambridge, 1989.
[48] S. Matsutani, The physical realization of the Jimbo-Miwa theory of the modified Korteweg-de Vries equation on a thin elastic rod: Fermionic theory, Int. J. Mod. Phys. A 10 (1995), 3091–3107.
[49] H. P. McKean and P. van Moerbeke, The spectrum of Hill's equation, Inventiones Math. 30 (1975), 217–274.
[50] J. Harris and I. Morrison, Moduli of Curves, Springer, New York, 1998.
[51] S. Iitaka, K. Ueno and Y. Namikawa, Spirits of Descartes and Algebraic Geometry (in Japanese), Nihon-Hyouron-Sha, Tokyo, 1980.
[52] D. Mumford, Curves and Their Jacobians, University of Michigan, Michigan, 1975.
[53] D. Mumford, Tata Lectures on Theta, Vol. II, Birkhäuser, Boston, 1983–1984.
[54] V. H. Buchstaber, V. Z. Enolskii and D. V. Leykin, Klein Function, Hyperelliptic Jacobians and applications, Rev. Math. & Math. Phys. 10 (1997), 3–120.
[55] B. G. Konopelchenko and G. Landolfi, Generalized Weierstrass representation for surface in multidimensional Riemann spaces, math.DG/9804144 (1998).
September 1, 2003 11:49 WSPC/148-RMP
628
00172
ˆ S. Matsutani & Y. Onishi
[56] C. Maclachlan, Modulus space is simply-connected, Proc. A.M.S. 29 (1971), 85–86. [57] E. Date, M. Jimbo, M. Kashiwara and T. Miwa, Nonlinear Integrable Systems — Classical Thoery and Quantum Thoery, eds. M. Jimbo and T. Miwa, World Scientific, Singapore, 1983. [58] A. Connes, Noncommutative Geometry, Academic Press, Singapore, 1994. [59] J. M. Leinass and K. Olaussen, Ghosts and geometry, Phys. Lett. 108B (1982), 199-202. [60] E. Witten, Elliptic Curves and Modular Forms in Algebraic Topology, Proceedings Princeton 1986, ed. P. S. Landweber, Springer, Berlin, 1986. [61] B. G. Knopelchenko, Induced surfaces and their integrable dynamics, Studies in Appl. Math. 96 (1996), 9–51.
September 1, 2003 10:14 WSPC/148-RMP
00170
Reviews in Mathematical Physics, Vol. 15, No. 6 (2003) 629–641 © World Scientific Publishing Company
ENTANGLEMENT BREAKING CHANNELS
MICHAEL HORODECKI
Institute of Theoretical Physics and Astrophysics, University of Gdańsk, 80-952 Gdańsk, Poland
PETER W. SHOR
AT&T Labs Research, Florham Park, New Jersey 07922 USA
MARY BETH RUSKAI
Department of Mathematics, Tufts University, Medford, Massachusetts 02155 USA
Received 2 February 2003
Revised 30 May 2003
This paper studies the class of stochastic maps, or channels, for which $(I \otimes \Phi)(\Gamma)$ is always separable (even for entangled $\Gamma$). Such maps are called entanglement breaking, and can always be written in the form $\Phi(\rho) = \sum_k R_k \, \mathrm{Tr}\, F_k \rho$ where each $R_k$ is a density matrix and $F_k > 0$. If, in addition, $\Phi$ is trace-preserving, the $\{F_k\}$ must form a positive operator valued measure (POVM). Some special classes of these maps are considered and other characterizations given. Since the set of entanglement-breaking trace-preserving maps is convex, it can be characterized by its extreme points. The only extreme points of the set of completely positive trace preserving maps which are also entanglement breaking are those known as classical-quantum or CQ. However, for $d \geq 3$, the set of entanglement breaking maps has additional extreme points which are not extreme CQ maps.
Keywords: Quantum channels; entanglement breaking maps; completely positive maps; CQ channels; separable states; extreme points.
1. Introduction
A quantum channel is represented by a stochastic map, i.e. a map which is both completely positive and trace-preserving. We will refer to these as CPT maps. In this paper we consider the special class of quantum channels which can be simulated by a classical channel in the following sense: The sender makes a measurement on the input state $\rho$, and sends the outcome $k$ via a classical channel to the receiver who
then prepares an agreed upon state $R_k$. Such channels can be written in the form
$$\Phi(\rho) = \sum_k R_k \, \mathrm{Tr}\, F_k \rho \tag{1}$$
where each $R_k$ is a density matrix and the $\{F_k\}$ form a positive operator valued measure (POVM). We call this the “Holevo form” because it was introduced by Holevo in [6]. It is also natural to consider the class of channels which break entanglement.
Definition 1. A stochastic map $\Phi$ is called entanglement breaking if $(I \otimes \Phi)(\Gamma)$ is always separable, i.e. any entangled density matrix $\Gamma$ is mapped to a separable one.
It is not hard to see that, as shown in the next section, a map is entanglement-breaking if and only if it can be written in the form
$$\Phi(\rho) = \sum_k |\psi_k\rangle\langle\psi_k| \, \langle\phi_k, \rho\,\phi_k\rangle \tag{2}$$
in which case it is necessarily completely positive. Furthermore, $\Phi$ is trace-preserving if and only if $\sum_k |\phi_k\rangle\langle\phi_k| = I$, in which case (2) is a special case of (1). One can show that the converse also holds, so that we have the following result.
Theorem 2. A channel can be written in the form (1) using positive semi-definite operators $F_k$ if and only if it is entanglement breaking. Such a map is also trace-preserving if and only if the $\{F_k\}$ form a POVM or, equivalently, $\sum_k |\phi_k\rangle\langle\phi_k| = I$.
The rather straightforward proof will be given in the next section together with some additional equivalences. We will refer to stochastic maps which are both entanglement-breaking and trace-preserving as EBT. Of course there are stochastic maps which are not of the form (1). In particular, conjugation with a unitary matrix is not EBT. Channels which break entanglement are particularly noisy in some sense, e.g. a qubit map is EBT if the image of the Bloch sphere collapses to a plane or a line. In the opposite direction, we will show that a channel in $d$ dimensions is not EBT if it can be written using fewer than $d$ Kraus operators.
Theorem 3. The set of EBT maps is convex.
Although this follows easily from the definition of entanglement breaking, it may be instructive to also show directly that the set of maps of the form (1) is convex. Let $\Phi$ and $\tilde\Phi$ denote such maps with density matrices $\{R_j\}_{j=1\cdots m}$ and $\{\tilde R_k\}_{k=1\cdots n}$ and POVM's $\{E_j\}_{j=1\cdots m}$ and $\{\tilde E_k\}_{k=1\cdots n}$ respectively. For any $\alpha \in [0,1]$ the map
$$[\alpha\Phi + (1-\alpha)\tilde\Phi](\rho) = \sum_j R_j \, \mathrm{Tr}(\alpha E_j \rho) + \sum_k \tilde R_k \, \mathrm{Tr}[(1-\alpha)\tilde E_k \rho]$$
has the form (1) since $\{\alpha E_1, \alpha E_2, \ldots, \alpha E_m, (1-\alpha)\tilde E_1, \ldots, (1-\alpha)\tilde E_n\}$ is also a POVM. Note that we have used implicitly the idea of generating a new POVM as the convex combination of two POVM's. In this sense, the set of POVM's is also convex, and one might expect that the extreme points of the set of entanglement-breaking maps are precisely those with an extreme POVM and pure $R_k$. However, this is false; at the end of Sec. 3 of [18], the trine POVM is used to give an example of a qubit channel which is not extreme, despite the fact that the POVM is.
Certain subclasses of EBT maps are particularly important. Holevo called a channel
• classical-quantum (CQ) if each $F_k = |k\rangle\langle k|$ in the POVM is a one-dimensional projection. In this case, (1) reduces to $\Phi(\rho) = \sum_k R_k \langle k, \rho\, k\rangle$.
• quantum-classical (QC) if each density matrix $R_k = |k\rangle\langle k|$ is a one-dimensional projection and $\sum_k R_k = I$.
If a CQ map has the property that each density matrix $R_k = |\psi_k\rangle\langle\psi_k|$ is a pure state, we will call it an extreme CQ map. Note that the pure states $|\psi_k\rangle$ need not be orthonormal, or even linearly independent. We will see in Sec. 3 that extreme CQ maps are always extreme points of the set of EBT maps, but they are only extreme points for the set of CPT maps if all pairs $\langle\psi_j, \psi_k\rangle$ are nonzero.
When all $R_k = R$ are identical, then $\Phi$ is the maximally noisy map $\Phi(\rho) = R$ for all $\rho$. Because it maps all density matrices to the same $R$, its image is a single “point” in the set of density matrices and its capacity is zero. A point channel is extreme if and only if its image $R$ is a pure state. A point channel is a special case of a CQ map; however, because all $R_k = R$ the sum in (1) can be reduced to a single term with $E_1 = I$. For $d > 2$, one can also consider those CQ maps for which some $R_k$ are identical; then the POVM can be written as a projective measurement, and the image is a polyhedron.
It is useful to have Kraus operator representations of EBT maps. For $\Phi$ of the form (1), let $A_{kmn} = \sqrt{R_k}\, |m\rangle\langle n| \sqrt{F_k}$ where $\{|m\rangle\}$ and $\{|n\rangle\}$ are orthonormal bases. Then one easily verifies that
$$\sum_{kmn} A_{kmn}\, \rho\, A_{kmn}^\dagger = \sum_k R_k \, \mathrm{Tr}\, F_k \rho . \tag{3}$$
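As a numerical sanity check of (3), the following minimal Python sketch compares the Kraus sum with the Holevo form for one qubit example. The specific choices (a trine POVM for the $F_k$, three arbitrary density matrices $R_k$, the test state `rho`, and all helper names) are illustrative assumptions and are not taken from the paper; the closed-form 2×2 square root is used only to avoid external libraries.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def sqrt_psd_2x2(m):
    """Square root of a nonzero 2x2 positive semi-definite matrix.

    Closed form: sqrt(M) = (M + s I) / t with s = sqrt(det M) and
    t = sqrt(Tr M + 2 s); this holds only for 2x2 PSD matrices.
    """
    det = complex(m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    s = math.sqrt(max(det, 0.0))  # clamp tiny negative rounding errors
    t = math.sqrt(complex(trace(m)).real + 2 * s)
    return [[(m[i][j] + (s if i == j else 0)) / t for j in range(2)]
            for i in range(2)]

def projector(v, weight=1.0):
    """weight * |v><v| for a 2-component vector v."""
    return [[weight * v[i] * complex(v[j]).conjugate() for j in range(2)]
            for i in range(2)]

# Assumed example data: trine POVM F_k and three density matrices R_k.
h = math.sqrt(3) / 2
F = [projector(v, 2 / 3) for v in ([1, 0], [0.5, h], [0.5, -h])]
R = [[[1, 0], [0, 0]], [[0.5, 0.5], [0.5, 0.5]], [[0.5, 0], [0, 0.5]]]
rho = [[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]]

def holevo_form(state):
    """Right-hand side of (3): sum_k R_k Tr(F_k rho)."""
    out = [[0j, 0j], [0j, 0j]]
    for Rk, Fk in zip(R, F):
        weight = trace(matmul(Fk, state))
        for i in range(2):
            for j in range(2):
                out[i][j] += Rk[i][j] * weight
    return out

def kraus_form(state):
    """Left-hand side of (3) with A_kmn = sqrt(R_k) |m><n| sqrt(F_k)."""
    out = [[0j, 0j], [0j, 0j]]
    for Rk, Fk in zip(R, F):
        sqrt_R, sqrt_F = sqrt_psd_2x2(Rk), sqrt_psd_2x2(Fk)
        for m in range(2):
            for n in range(2):
                unit = [[1 if (i, j) == (m, n) else 0 for j in range(2)]
                        for i in range(2)]
                A = matmul(matmul(sqrt_R, unit), sqrt_F)
                term = matmul(matmul(A, state), dagger(A))
                for i in range(2):
                    for j in range(2):
                        out[i][j] += term[i][j]
    return out

lhs = kraus_form(rho)
rhs = holevo_form(rho)
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print("largest entrywise difference:", max_diff)
```

The two sides agree up to floating-point rounding, and the output has unit trace because the trine elements sum to the identity.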
For CQ and QC maps these operators reduce to $A_{km} = \sqrt{R_k}\, |m\rangle\langle k|$ and $A_{kn} = |k\rangle\langle n| \sqrt{F_k}$ respectively. Moreover, if all density matrices are pure states $R_k = |\psi_k\rangle\langle\psi_k|$, then one can achieve a further reduction to $A_k = |\psi_k\rangle\langle k|$ in the case of CQ maps.
Holevo [6] showed that for EBT maps the Holevo capacity (i.e. the capacity of a quantum channel used for classical communication with product inputs) is additive. This result was extended by King [13] to additivity of the capacity of channels of the form $\Phi \otimes \Omega$ where $\Phi$ is CQ or QC and $\Omega$ is completely arbitrary. Shor [20] then proved the additivity of minimal entropy and Holevo capacity when $\Phi$ is EBT and
$\Omega$ arbitrary. Quite recently, King [14] showed that the maximal p-norms of EBT channels are multiplicative, and used this to give another proof of Shor’s additivity results for minimal entropy and Holevo capacity. In a related development, Vidal, Dür and Cirac [22] used Shor’s techniques to prove additivity of the entanglement of formation for a class of mixed states associated with EBT maps.
As it is important to understand the differences between those channels which break entanglement and those which preserve it, we seek other characterizations of these channels, describe their extreme points, and examine their properties. Results for qubits are given in a related paper [18] which follows. Some analysis of entanglement breaking channels was also independently presented by Verstraete and Verschelde [21].
2. Equivalent Conditions
In this section, we establish a number of equivalent characterizations of EBT maps, some of which were already discussed in the previous section.
Theorem 4. The following are equivalent
(A) $\Phi$ has the Holevo form (1) with $F_k$ positive semi-definite.
(B) $\Phi$ is entanglement breaking.
(C) $(I \otimes \Phi)(|\beta\rangle\langle\beta|)$ is separable for $|\beta\rangle = d^{-1/2} \sum_j |j\rangle \otimes |j\rangle$ a maximally entangled state.
(D) $\Phi$ can be written in operator sum form using only Kraus operators of rank one.
(E) $\Upsilon \circ \Phi$ is completely positive for all positivity preserving maps $\Upsilon$.
(F) $\Phi \circ \Upsilon$ is completely positive for all positivity preserving maps $\Upsilon$.
A corresponding equivalence holds for CPT and EBT maps with the additional conditions that $\{F_k\}$ is a POVM, the Kraus operators $A_k$ satisfy $\sum_k A_k^\dagger A_k = I$, and $\Upsilon$ is trace-preserving.
To prove this result, we will make use of the correspondence [2, 12] between maps and states given by $\Phi \leftrightarrow (I \otimes \Phi)(|\beta\rangle\langle\beta|)$. (Also see [1] in this context.)
Proof. To show that (A) ⇒ (B) note that when $\Phi$ has the form (1),
$$(I \otimes \Phi)(\Gamma) = \sum_k R_k \otimes T_2\big(\sqrt{E_k}\, \Gamma \sqrt{E_k}\big) = \sum_k \gamma_k R_k \otimes Q_k$$
where $T_2$ denotes the partial trace, $\gamma_k = \mathrm{Tr}\, E_k \Gamma$ and $Q_k = \frac{1}{\gamma_k} T_2(\sqrt{E_k}\, \Gamma \sqrt{E_k})$. Thus, for arbitrary $\Gamma$, $(I \otimes \Phi)(\Gamma)$ is separable.
The implication (B) ⇒ (C) is trivial. To see that (C) ⇒ (A), observe that since $(I \otimes \Phi)(|\beta\rangle\langle\beta|)$ is separable, one can find normalized vectors $|v_n\rangle$ and $|w_n\rangle$
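for which
$$(I \otimes \Phi)(|\beta\rangle\langle\beta|) \equiv \frac{1}{d} \sum_{jk} |j\rangle\langle k| \otimes \Phi(|j\rangle\langle k|) \tag{4}$$
$$= \sum_n p_n |v_n\rangle\langle v_n| \otimes |w_n\rangle\langle w_n| . \tag{5}$$
Now let $\Omega$ be the map
$$\Omega(\rho) = d \sum_n |w_n\rangle\langle w_n| \, \mathrm{Tr}(\rho\, p_n |v_n\rangle\langle v_n|) . \tag{6}$$
Then one easily verifies that
$$(I \otimes \Omega)(|\beta\rangle\langle\beta|) = \sum_{jkn} |j\rangle\langle k| \otimes |w_n\rangle\langle w_n| \, p_n \langle j, v_n\rangle\langle v_n, k\rangle = \sum_n p_n |v_n\rangle\langle v_n| \otimes |w_n\rangle\langle w_n|$$
where we have used $|v_n\rangle = \sum_j |j\rangle\langle j, v_n\rangle$. Since a map $\Phi$ is uniquely determined by its action on the basis $|j\rangle\langle k|$, and hence by the action of $(I \otimes \Phi)$ on $|\beta\rangle\langle\beta|$, we can conclude that $\Phi = \Omega$.
For trace-preserving maps, we also need to verify that $\{d\, p_n |v_n\rangle\langle v_n|\}$ is a POVM. Taking the partial trace of (5), and using the fact that $\Phi$ is trace-preserving yields
$$T_2[(I \otimes \Phi)(|\beta\rangle\langle\beta|)] = \frac{1}{d} \sum_{jk} |j\rangle\langle k| \, \mathrm{Tr}(|j\rangle\langle k|) = \frac{1}{d} I = \sum_n p_n |v_n\rangle\langle v_n|$$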
which is the desired result. Moreover, we have also shown that (C) ⇒ (D).
To show that (D) ⇒ (A), suppose that $\Phi(\rho) = \sum_k A_k \rho A_k^\dagger$ with $A_k = |w_k\rangle\langle u_k|$. Then, taking the $|w_k\rangle$ to be normalized (with their norms absorbed into the $|u_k\rangle$), the map $\Phi$ can be written in the form (1) with $R_k = |w_k\rangle\langle w_k|$ and $F_k = |u_k\rangle\langle u_k|$. Moreover, when $\sum_k A_k^\dagger A_k = I$, then $\sum_k |u_k\rangle\langle u_k| = I$ so that $F_k = |u_k\rangle\langle u_k|$ defines a POVM.
The equivalence of (E) and (B) follows easily from the fact that a density matrix $\Gamma$ is separable if and only if $(I \otimes \Omega)(\Gamma) \geq 0$ for all positivity preserving maps $\Omega$ [7]. To see that this is equivalent to (F), it suffices to observe that $\Omega$ is positivity preserving if and only if its adjoint $\hat\Omega$ is and that $\widehat{\Phi \circ \Upsilon} = \hat\Upsilon \circ \hat\Phi$, where the adjoint is taken with respect to the Hilbert–Schmidt inner product so that $\mathrm{Tr}\, [\Omega(A)]^\dagger B = \mathrm{Tr}\, A^\dagger \hat\Omega(B)$.
It may be interesting to recall that $\Upsilon$ is trace-preserving if and only if $\hat\Upsilon$ is unital so that the adjoint of a positivity and trace preserving map preserves POVM’s. Thus, when $\Phi$ has the form (1), the map $\Phi \circ \Upsilon$ is achieved by replacing $E_k$ by $\hat\Upsilon(E_k)$.
Conditions (E) and (F) could be weakened slightly since it would suffice to check either for all $\Upsilon$ in some set of entanglement witnesses for the space on which $\Phi$ acts. However, one does not expect to be able to weaken them beyond this. Indeed, [5] and [9] contain examples of channels which preserve PPT entanglement,
but break other types, i.e. the channel output $(I \otimes \Phi)(\Gamma)$ is entangled, yet the partial transpose $(I \otimes T)$ acting on it always yields a positive semi-definite state $(I \otimes T \circ \Phi)(\Gamma) \geq 0$. Alternatively, one could also consider maps which are not EBT, but break particular types of entanglement.
3. Extreme Points
We now give some results about the extreme points of the convex set of EBT maps. In this section we will use some additional results from Choi [2] who observed that $\Phi$ is completely positive if and only if $(I \otimes \Phi)(|\beta\rangle\langle\beta|)$ is positive semi-definite. When $\Phi$ is written in the operator sum form
$$\Phi(\rho) = \sum_k A_k \rho A_k^\dagger \tag{7}$$
the Kraus operators $A_k$ can be chosen as the eigenvectors of $(I \otimes \Phi)(|\beta\rangle\langle\beta|)$ with strictly positive (i.e. nonzero) eigenvalue. (See Leung [16] for a nice exposition.) Choi [2] also showed that $\Phi$ is extreme in the set of CPT maps if and only if the set $\{A_j^\dagger A_k\}$ is linearly independent. Since both (7) and this linear independence are preserved when $A_i \mapsto \sum_j u_{ij} A_j$, a sufficient condition for $\Phi$ to be an extreme EBT map is that $\{A_j^\dagger A_k\}$ is linearly independent for some set of operators $\{A_k\}$ satisfying (7). Note that the condition that $\Phi$ is also trace-preserving becomes $\sum_k A_k^\dagger A_k = I$.
Recall that an extreme CQ map is one which can be written in the form
$$\Phi(\rho) = \sum_k |\psi_k\rangle\langle\psi_k| \, \langle e_k, \rho\, e_k\rangle \tag{8}$$
with the vectors $\{e_k\}$ orthonormal. We can summarize our results as follows.
Theorem 5.
(A) If $\Phi$ is an extreme CQ map, then $\Phi$ is an extreme point in the set of EBT maps.
(B) If $\Phi$ is an extreme CQ map, then $\Phi$ is an extreme point in the set of CPT maps if and only if $\langle\psi_j, \psi_k\rangle \neq 0 \ \forall\, j, k$ when it is written in the form (8).
(C) If $\Phi$ is both in the set of EBT maps and an extreme point of the CPT maps, then $\Phi$ is an extreme CQ map.
(D) When $d = 2$, the extreme points of the set of EBT maps are precisely the extreme CQ maps. When $d \geq 3$ there are extreme EBT maps which are not CQ.
Proof. To prove (A) we assume that $\Phi = a\Phi_1 + (1-a)\Phi_2$ with $\Phi_1, \Phi_2 \neq \Phi$, $0 < a < 1$ and $\Phi_1, \Phi_2$ both EBT. Both $\Phi_1, \Phi_2$ can be written in the form (2). By combining these, one finds one can write
$$\Phi(\rho) = \sum_j t_j |\phi_j\rangle\langle\phi_j| \, \langle f_j, \rho f_j\rangle \tag{9}$$
with $\Phi_1, \Phi_2$ having the same form, but different $t_j \geq 0$. By assumption, $\Phi$ can be written in the form (8) with $|e_k\rangle$ orthonormal so that
$$\Phi(|e_k\rangle\langle e_k|) = |\psi_k\rangle\langle\psi_k| = \sum_j t_j |\langle e_k, f_j\rangle|^2 \, |\phi_j\rangle\langle\phi_j| . \tag{10}$$
Since all $t_j \geq 0$, the rank one projection $|\psi_k\rangle\langle\psi_k|$ is a linear combination with non-negative coefficients of the projections $|\phi_j\rangle\langle\phi_j|$. This is possible only if those projections $|\phi_j\rangle\langle\phi_j|$ which have nonzero coefficients in (10) are identical to the projection $|\psi_k\rangle\langle\psi_k|$. Hence, we can conclude that every projection $|\phi_j\rangle\langle\phi_j|$ in (9) is equal to one of the projections $|\psi_k\rangle\langle\psi_k|$ in (8). Let us now relabel the projections $|\psi_{k'}\rangle\langle\psi_{k'}|$ so that they are all distinct and let $E_{k'} = \sum_{i \in k'} |e_i\rangle\langle e_i|$ where the sum is taken over those $e_i$ for which the associated projection in (8) is $|\psi_{k'}\rangle\langle\psi_{k'}|$. Then $\{E_{k'}\}$ gives a partition of $I$ into mutually orthogonal projections, i.e. a von Neumann measurement, and we can write (dropping the primes for simplicity)
$$\Phi(\rho) = \sum_k |\psi_k\rangle\langle\psi_k| \, \mathrm{Tr}\, E_k \rho . \tag{11}$$
We can also write
$$\Phi_1(\rho) = \sum_k |\psi_k\rangle\langle\psi_k| \, \mathrm{Tr}\, F_k \rho \tag{12}$$
$$\Phi_2(\rho) = \sum_k |\psi_k\rangle\langle\psi_k| \, \mathrm{Tr}\, G_k \rho \tag{13}$$
with $\{F_k\}$ and $\{G_k\}$ each a POVM. Since the $|\psi_{k'}\rangle\langle\psi_{k'}|$ were chosen to be distinct and the $E_{k'}$ mutually orthogonal projections, it follows that $\Phi = a\Phi_1 + (1-a)\Phi_2$ if and only if $E_k = aF_k + (1-a)G_k$. Since $0 \leq F_k, G_k \leq I$, this is possible only if $F_k = G_k = E_k$. But then we have shown that $\Phi_1 = \Phi_2 = \Phi$, which proves part (A).
To prove (B) note that the Kraus operators can be chosen as $A_k = |\psi_k\rangle\langle e_k|$. Thus, $A_j^\dagger A_k = \langle\psi_j, \psi_k\rangle \, |e_j\rangle\langle e_k|$ which yields a linearly independent set if and only if none of the $\psi_j$ are mutually orthogonal. But this is precisely Choi’s condition for the map to be extreme in the set of all CPT maps.
The proof of part (C) requires Lemma 8 which is of interest in its own right. The proof of (D) when $d = 2$ is given in the following paper [18] on qubit EBT maps, while the counter-example establishing (D) for $d \geq 3$ is given below.
Remark. Recall that a QC map can be written in the form
$$\Phi(\rho) = \sum_k |e_k\rangle\langle e_k| \, \mathrm{Tr}\, \rho F_k \tag{14}$$
with the vectors $\{e_k\}$ orthonormal. Such maps can never be extreme in the set of CPT maps; their Kraus operators always include a subset of the form $A_k = |e_k\rangle\langle v_k| G_k$ which cannot satisfy Choi’s linear independence condition due to the orthogonality of the $\{e_k\}$. In the case of qubits, QC maps are not even extreme
in EBT, unless they are also CQ. However, for $d = 4$, one can have extreme EBT maps which are QC but not CQ.
Example. Let $\{g_k\}$ be orthonormal and consider the POVM consisting of a “trine” on $\mathrm{span}\{g_1, g_2\}$ and the projection onto $\mathrm{span}\{g_3, g_4\}$, i.e.
$$E_1 = \tfrac{2}{3} |g_1\rangle\langle g_1| , \quad E_2 = \tfrac{2}{3} |g_+\rangle\langle g_+| , \quad E_3 = \tfrac{2}{3} |g_-\rangle\langle g_-| , \quad E_4 = |g_3\rangle\langle g_3| + |g_4\rangle\langle g_4|$$
where $|g_\pm\rangle = \tfrac{1}{2} |g_1\rangle \pm \tfrac{\sqrt{3}}{2} |g_2\rangle$. Then $\Phi(\rho) = \sum_{k=1}^{4} |e_k\rangle\langle e_k| \, \mathrm{Tr}\, \rho E_k$ is an extreme EBT map, which is QC, but not CQ.
To see that $\Phi$ is extreme it suffices to observe that it is essentially the direct sum of maps $\Phi_A \oplus \Phi_B$ where $\Phi_A : \mathbf{C}^2 \mapsto \mathbf{C}^3$ with $\Phi_A(\rho) = \sum_{k=1}^{3} |e_k\rangle\langle e_k| \, \mathrm{Tr}\, \rho E_k$ and $\Phi_B : \mathbf{C}^2 \mapsto \mathbf{C}^1$ with $\Phi_B(\rho) = |e_4\rangle\langle e_4|$ for all $\rho$. $\Phi_A$ is extreme because it is the adjoint of an extreme CQ map, and $\Phi_B$ is the only CPT map from $\mathbf{C}^2$ to $\mathbf{C}^1$. We used the fact that the proof of part (A) of Theorem 5 extends easily to maps from $\mathbf{C}^{d'}$ to $\mathbf{C}^d$ with $d' < d$.
A map which is both CQ and QC projects a density matrix $\rho$ onto its diagonal in a fixed orthonormal basis. One can generalize this to CPT maps which take a density matrix to its projection onto a block-diagonal one. Such maps have the form $\Phi(\rho) = \sum_k E_k \rho E_k$ where $E_k$ are the projections in a von Neumann measurement; they are not EBT when at least one of the projections has rank $> 1$. The map in the example above is a generalization of CQ in the sense that it is the composition of a block diagonal projection together with an EBT map, and thus could be regarded as “block CQ”. In a similar spirit, one might regard an extreme CQ map for which the $\psi_k$ can be split into two mutually orthogonal subsets as “block QC”. With respect to CPT, maps which are both block QC and block CQ could be considered as generalizations of the quasi-extreme points introduced in [19] for stochastic maps on $\mathbf{C}^2$.
We now give some results about the number of Kraus operators associated with EBT maps.
Theorem 6. If a CPT map $\Phi$ can be written with fewer than $d$ Kraus operators, then it is not EBT.
Proof. This follows from the fact [2] that $\Phi$ can always be written using at most $r \equiv \mathrm{rank}[(I \otimes \Phi)(|\beta\rangle\langle\beta|)]$ Kraus operators. However, it was shown in [11] that if $r < d$, then $(I \otimes \Phi)(|\beta\rangle\langle\beta|)$ is not separable and, hence, $\Phi$ does not break the entanglement of the state $|\beta\rangle\langle\beta|$.
Alternatively, one could observe that if $r < d$, then at least one eigenvalue of $(I \otimes \Phi)(|\beta\rangle\langle\beta|)$ is greater than $1/d$, while its left reduced density matrix has all eigenvalues equal to $1/d$ (since $\Phi$ is CPT). However, in [8] it was shown that if a state is separable, then its maximal eigenvalue must not exceed the maximal eigenvalue of either of the subsystems.
Lemma 7. If $\Phi$ is a CPT map for which $\mathrm{rank}[(I \otimes \Phi)(|\beta\rangle\langle\beta|)] = d$, then $\Phi$ is EBT if and only if $T \circ \Phi$ is completely positive.
This follows immediately from a (non-trivial) result in [10] which implies that a $d^2 \times d^2$ density matrix of rank $d$ is separable if and only if it has positive partial transpose.
The following lemma is of some interest since one can find examples [4] of separable matrices of rank $d$ whose decomposition into product pure states requires more than $d$ products. The additional hypothesis that the reduced density matrix $\rho_A = \mathrm{Tr}_B\, \rho$ also has rank $d$ is crucial. The lemma was first proven in [10]. Here we present a simpler proof.
Lemma 8. Let $\rho$ be a density matrix on $\mathcal{H}_A \otimes \mathcal{H}_B$. If $\rho$ is separable, $\rho$ has rank $d$, and $\rho_A = \mathrm{Tr}_B\, \rho$ has rank $d$, then $\rho$ can be written as a convex combination of products of pure states using at most $d$ products.
Proof. Since $\rho$ is separable it can be written in the form
$$\rho = \sum_{i=1}^{k} \lambda_i |a_i\rangle\langle a_i| \otimes |b_i\rangle\langle b_i| . \tag{15}$$
Assume that $k > d$ and that $\rho$ cannot be written in the form (15) using less than $k$ products. Since $\rho_A$ has exactly rank $d$, there is no loss of generality in assuming that the vectors above have been chosen so that $|a_1\rangle, |a_2\rangle, \ldots, |a_d\rangle$ are linearly independent. Moreover, since $\rho$ has rank $d < k$, the first $d + 1$ vectors $|a_i\rangle \otimes |b_i\rangle$ must be linearly dependent so that one can find $\alpha_j$ such that
$$\sum_{j=1}^{d+1} \alpha_j |a_j\rangle \otimes |b_j\rangle = 0 . \tag{16}$$
Now let $\{|e_k\rangle\}$ be an orthonormal basis for $\mathcal{H}_B$. Then
$$\sum_{j=1}^{d+1} \alpha_j \langle e_k, b_j\rangle |a_j\rangle = 0 \quad \forall\, k . \tag{17}$$
Since the first $d$ vectors $|a_j\rangle$ are linearly independent, there is a vector $x$ in $\mathbf{C}^{d+1}$ such that $\sum_j v_j |a_j\rangle = 0$ if and only if $v$ is a multiple of $x$. Applying this to the coefficients in (17) one finds that there are numbers $\nu_k$ such that $\alpha_j \langle e_k, b_j\rangle = \nu_k x_j$. Let $|\nu\rangle$ be the vector $\sum_k \nu_k |e_k\rangle$. Then $\alpha_j |b_j\rangle = x_j |\nu\rangle$. Since $|b_j\rangle$ was chosen to have norm 1, it follows that when $\alpha_j \neq 0$, $\left| \frac{x_j}{\alpha_j} \right| = 1$ and $|b_j\rangle = e^{i\theta_j} |\nu\rangle$. Thus, one can rewrite (15) as
$$\rho = \sum_{j : \alpha_j = 0} \lambda_j |a_j\rangle\langle a_j| \otimes |b_j\rangle\langle b_j| + \sum_{j : \alpha_j \neq 0} \lambda_j |a_j\rangle\langle a_j| \otimes |\nu\rangle\langle\nu| \tag{18}$$
where $\alpha_j = 0$ is understood for $j > d + 1$. Suppose that $t$ of the $\alpha_j$ are nonzero. Since the vectors $\{|a_j\rangle : \alpha_j \neq 0\}$ are linearly dependent, the density matrix $\tilde\rho_A = \sum_{j : \alpha_j \neq 0} \lambda_j |a_j\rangle\langle a_j|$ has rank strictly $< t$ and can be rewritten in the form $\tilde\rho_A = \sum_{k=1}^{t'} \lambda'_k |a'_k\rangle\langle a'_k|$ using only $t' < t$ vectors. Substituting this in (18) gives $\rho$ as a linear combination of products using strictly less than $k$ products, contradicting the assumption that (15) used the minimum number.
Proof of (C). If $\Phi$ can be written with fewer than $d$ Kraus operators, it is not entanglement breaking; and if it requires more than $d$ Kraus operators, it is not extreme. Hence we can assume that $\mathrm{rank}[(I \otimes \Phi)(|\beta\rangle\langle\beta|)] = d$. The result then follows from Lemma 8.
We now show that, for $d = 3$, the set of entanglement breaking maps has extreme points which are not CQ. Moreover, unlike the $d = 4$ example considered earlier, there is no decomposition into orthogonal blocks associated with this map.
Counterexample. Let $|0\rangle, |1\rangle, |2\rangle$ be an orthonormal basis for $\mathbf{C}^3$ and consider the following four vectors corresponding to the vertices of a tetrahedron
$$|v_0\rangle = \tfrac{1}{\sqrt{3}} (+|0\rangle + |1\rangle + |2\rangle)$$
$$|v_1\rangle = \tfrac{1}{\sqrt{3}} (+|0\rangle - |1\rangle - |2\rangle)$$
$$|v_2\rangle = \tfrac{1}{\sqrt{3}} (-|0\rangle + |1\rangle - |2\rangle)$$
$$|v_3\rangle = \tfrac{1}{\sqrt{3}} (-|0\rangle - |1\rangle + |2\rangle)$$
and let
$$\Phi(\rho) = \frac{3}{4} \sum_{i=0}^{3} |v_i\rangle\langle v_i| \, \mathrm{Tr}\, \rho |v_i\rangle\langle v_i| . \tag{19}$$
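A few properties of the map (19) can be checked numerically. The minimal Python sketch below, written with plain lists and real arithmetic, confirms that the operators $\frac{3}{4}|v_i\rangle\langle v_i|$ sum to the identity (so the map is trace-preserving), that the four vectors have pairwise overlap $-1/3$ and hence are not orthogonal, and that the input $(|1\rangle + |2\rangle)/\sqrt{2}$, which the text introduces shortly afterwards as $|w_{01}\rangle$, is sent to $\frac{1}{2}(|v_0\rangle\langle v_0| + |v_1\rangle\langle v_1|)$. The helper names and the test state `rho` are assumptions made for illustration only.

```python
import math

s = 1 / math.sqrt(3)
# Tetrahedron vectors |v_0>, ..., |v_3> in the basis |0>, |1>, |2>.
V = [[ s,  s,  s],
     [ s, -s, -s],
     [-s,  s, -s],
     [-s, -s,  s]]

def outer(v):
    """Real rank-one matrix |v><v|."""
    return [[v[a] * v[b] for b in range(3)] for a in range(3)]

def channel(rho):
    """Apply (19): Phi(rho) = (3/4) sum_i |v_i><v_i| Tr(rho |v_i><v_i|)."""
    out = [[0.0] * 3 for _ in range(3)]
    for v in V:
        # Tr(rho |v><v|) equals <v| rho |v>.
        weight = sum(v[a] * rho[a][b] * v[b]
                     for a in range(3) for b in range(3))
        P = outer(v)
        for a in range(3):
            for b in range(3):
                out[a][b] += 0.75 * weight * P[a][b]
    return out

# POVM normalization: (3/4) sum_i |v_i><v_i| should be the identity.
povm_sum = [[0.75 * sum(v[a] * v[b] for v in V) for b in range(3)]
            for a in range(3)]

# Pairwise overlaps <v_i, v_j> for i < j.
overlaps = [sum(V[i][a] * V[j][a] for a in range(3))
            for i in range(4) for j in range(i + 1, 4)]

# Trace preservation on an assumed real symmetric density matrix.
rho = [[0.5, 0.1, 0.0], [0.1, 0.3, 0.2], [0.0, 0.2, 0.2]]
out = channel(rho)
trace_out = sum(out[a][a] for a in range(3))

# Image of |w_01><w_01| compared with (1/2)(|v_0><v_0| + |v_1><v_1|).
w01 = [0.0, 1 / math.sqrt(2), 1 / math.sqrt(2)]
out_w = channel(outer(w01))
P0, P1 = outer(V[0]), outer(V[1])
max_err_w01 = max(abs(out_w[a][b] - 0.5 * (P0[a][b] + P1[a][b]))
                  for a in range(3) for b in range(3))

print(povm_sum, overlaps, trace_out, max_err_w01)
```

Because the four POVM elements are rank one but not mutually orthogonal, this measurement cannot be a von Neumann measurement in $\mathbf{C}^3$, which is the feature the uniqueness argument in the text builds on.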
We now show that $\Phi$ is an extreme point for the set of entanglement-breaking maps. To see this, first recall that any entanglement breaking map $\Psi$ can be written as
$$\Psi(\rho) = \sum_i \alpha_i |y_i\rangle\langle y_i| \, \mathrm{Tr}\, \rho |z_i\rangle\langle z_i| . \tag{20}$$
Let $\Psi$ be one of the entanglement breaking maps whose convex combination is $\Phi$, and let $|y\rangle$ and $|z\rangle$ be $|y_i\rangle$ and $|z_i\rangle$ for some fixed $i$ in the above expression for $\Psi$. Now, consider the six vectors $|w_{ij}\rangle$ for $i < j$, where these are defined so that $\langle w_{ij} | v_k\rangle = 0$ for $k \neq i, j$. For example, $|w_{01}\rangle = \tfrac{1}{\sqrt{2}} (|1\rangle + |2\rangle)$. Then,
$$\Phi(|w_{ij}\rangle\langle w_{ij}|) = \frac{1}{2} (|v_i\rangle\langle v_i| + |v_j\rangle\langle v_j|) \tag{21}$$
so for input $|w_{ij}\rangle\langle w_{ij}|$, the output has rank 2 and is orthogonal to $w_{kl}$, where $i, j, k, l$ are all distinct. We thus have that for $|y\rangle$ and $|z\rangle$,
$$\langle w_{ij} | y\rangle = 0 \quad \text{or} \quad \langle w_{kl} | z\rangle = 0 \tag{22}$$
where $\{i, j, k, l\}$ is any permutation of $\{0, 1, 2, 3\}$, as above.
Now, consider $|y\rangle$. Suppose it is orthogonal to two of $w_{01}$, $w_{02}$, and $w_{12}$. Then, we must have $|y\rangle = |v_3\rangle$. This means that $|y\rangle$ is not orthogonal to $w_{23}$, $w_{13}$ and
$w_{03}$, which implies in turn that $|z\rangle$ is orthogonal to $w_{01}$, $w_{02}$ and $w_{12}$, showing that $|z\rangle = |v_3\rangle$ as well.
The other case is when $y$ is not orthogonal to at least two of the above three vectors $w_{01}, w_{02}, w_{12}$; we can assume by symmetry that these two are $w_{01}$ and $w_{02}$. Then $z$ is orthogonal to $w_{23}$ and $w_{13}$, showing that $|z\rangle = |v_0\rangle$. By the same reasoning as in the last paragraph, we now have that $|y\rangle = |v_0\rangle$ as well.
Thus, all the $y_i$ and $z_i$ in the above expression for $\Psi$ must be one of the four vectors $v_j$. It follows easily from this that $\Psi = \Phi$. Moreover, we have shown that the Holevo form for $\Phi$ is essentially unique. Hence $\Phi$ cannot be written in the form required for it to be a CQ map.
Note that $\Phi$ is not extreme in the set of CPT maps. In fact, it can be represented as a convex combination of CPT maps in several ways. For example, it can be written as the convex combination of the identity map, with weight $\frac{1}{3}$, and the average of the three CP maps that first project the state into one of the three planes $\{|0\rangle, |1\rangle\}$, $\{|0\rangle, |2\rangle\}$, $\{|1\rangle, |2\rangle\}$, and then apply the $\sigma_x$ operator for that plane interchanging the two basis states, with weight $\frac{2}{3}$. It can also be written as a convex combination of the identity and the four maps corresponding to conjugation with a unitary map which reflects across the plane orthogonal to one of the vectors $|v_j\rangle$.
4. Representations in Bases
Let $G_0 = d^{-1/2} I$ and let $G_1 \cdots G_{d^2-1}$ be a basis for the subspace of self-adjoint $d \times d$ matrices with trace zero which is orthonormal in the sense $\mathrm{Tr}\, G_j^* G_k = \delta_{jk}$. Then $\{G_k\}$, $k = 0, 1 \cdots d^2 - 1$ is an orthonormal basis for the subspace of self-adjoint $d \times d$ matrices and every density matrix can be written in the form
$$\rho = \frac{1}{d} I + \sum_{j=1}^{d^2-1} w_j G_j = \sum_{j=0}^{d^2-1} w_j G_j \tag{23}$$
with $w_j = \mathrm{Tr}\, \rho\, G_j$ so that $w_0 = d^{-1/2}$. It then follows that
$$\sum_{j=0}^{d^2-1} w_j^2 = \mathrm{Tr}\, \rho^2 \leq \mathrm{Tr}\, \rho = 1 \quad \text{and} \quad \sum_{j=1}^{d^2-1} w_j^2 \leq \frac{d-1}{d} .$$
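The second bound follows from the first once the $j = 0$ term is separated out. A short worked version is sketched here, assuming only that the $G_j$ are self-adjoint and orthonormal in the stated sense and that $\mathrm{Tr}\, \rho = 1$:

```latex
\begin{align*}
\operatorname{Tr}\rho^2
  &= \sum_{j,k=0}^{d^2-1} w_j w_k \operatorname{Tr} G_j G_k
   = \sum_{j=0}^{d^2-1} w_j^2 , \\
w_0 &= \operatorname{Tr}\rho\, G_0
     = d^{-1/2}\operatorname{Tr}\rho = d^{-1/2} , \\
\sum_{j=1}^{d^2-1} w_j^2
  &= \operatorname{Tr}\rho^2 - w_0^2
   \le 1 - \frac{1}{d} = \frac{d-1}{d} .
\end{align*}
```

Equality holds exactly when $\mathrm{Tr}\, \rho^2 = 1$, i.e. for pure states.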
Then any linear (and hence any stochastic) map $\Phi$ on the self-adjoint $d \times d$ matrices can be represented as a $d^2 \times d^2$ matrix $\mathbf{T}$ with elements $t_{jk} = \mathrm{Tr}\, G_j \Phi(G_k)$. Now let $\Phi$ be a Holevo channel with density matrices $R_k = \sum_j w_{jk} G_j$ and POVM $F_k = \sum_n u_{kn} G_n$ ($k = 1 \cdots N$) and write $\rho = \sum_i x_i G_i$. Then it is straightforward to verify that $t_{jn} = \sum_k w_{jk} u_{kn}$. Thus, $\mathbf{T} = W U^{T}$ where $W$ and $U$ are the $d^2 \times N$ matrices with elements $W_{jk} = w_{jk}$ and $U_{nk} = u_{kn}$ respectively. The condition that $\{F_k\}$ is a POVM is precisely that the first row of $\mathbf{T}$ is $(1, 0, \ldots, 0)$.
Such representations have been studied in more detail for qubits using the Pauli matrices for $G_k$. Recently, several generalizations have been considered for
$d = 3$ [15] and higher [3, 17]. Another natural choice of basis has $G_{jk} = |j\rangle\langle k|$ for some orthonormal basis $|j\rangle$. In this case some modifications are needed since $I = \sum_k G_{kk}$. For $j < k$, one could also replace $G_{jk}, G_{kj}$ by $2^{-1/2} (G_{jk} \pm G_{kj})$ which act like $\sigma_x$ and $i\sigma_y$ for the two-dimensional subspace $\mathrm{span}\{|j\rangle, |k\rangle\}$. Unfortunately, when $d > 2$, the requirement that $R_k$ and $F_k$ are positive semi-definite does not seem easily related to a condition between $u_0$ and $\sum_{j=1}^{d^2-1} u_j^2$ in any of these bases. Hence, such representations seem most useful for qubits, as discussed in [18].
For a CQ or QC channel, $W$ and $U$ are $d^2 \times d$ which implies $\mathrm{rank}(\mathbf{T}) \leq d$. Hence the image of a QC or CQ channel lies in a subspace of dim $\leq d - 1$. This raises the question of whether or not a stochastic map for which the image of the set of density matrices lies in a subspace of sufficiently small dimension is always entanglement breaking. (This is true for qubits for which all planar maps are EBT.)
For a basis in which a necessary condition for positive semi-definiteness is $\sum_{i=1}^{d^2-1} |x_i|^2 \leq x_0^2$, one can show that EBT implies $\sum_{j=1}^{d^2-1} |t_{jj}| \leq 1$. For details, see [18].
In general, a matrix $\mathbf{T}$ can be written as a product in many ways. We have shown that $\mathbf{T}$ represents an entanglement-breaking map if it can be decomposed into a product $\mathbf{T} = W U^{T}$ whose factors $W, U$ have very special properties. There is also a correspondence between the matrix $\mathbf{T}$ which represents $\Phi$ in a basis in the usual sense and the matrix $(I \otimes \Phi)(|\beta\rangle\langle\beta|)$. It would seem that the requirement that $(I \otimes \Phi)(|\beta\rangle\langle\beta|)$ is separable is related to the product decomposition of $\mathbf{T}$; however, we have not analyzed this. It may be more amenable to the filtering approach advocated by Verstraete and Verschelde [21].
Acknowledgment
Part of this work was done while the authors participated in the program on Quantum Computation at the Mathematical Sciences Research Institute at Berkeley in November, 2002. The work of M.H. is supported by EC, grant EQUIP (IST-1999-11053), RESQ (IST-2001-37559) and QUPRODIS (IST-2001-38877). The work of M.B.R. was partially supported by the National Security Agency (NSA) and Advanced Research and Development Activity (ARDA) under Army Research Office (ARO) contract numbers DAAG55-98-1-0374 and DAAD19-02-1-0065, and by the National Science Foundation under Grant number DMS-0074566.
References
[1] C. H. Bennett, D. P. DiVincenzo, J. Smolin and W. K. Wootters, Mixed-state entanglement and quantum error correction, Phys. Rev. A 54 (1996), 3824–3851, quant-ph/9604024.
[2] M.-D. Choi, Completely positive linear maps on complex matrices, Lin. Alg. Appl. 10 (1975), 285–290.
[3] J. Cortese, The Holevo–Schumacher–Westmoreland channel capacity for a class of qudit unital channels, quant-ph/0211093.
[4] D. P. DiVincenzo, B. M. Terhal and A. V. Thapliyal, Optimal decompositions of barely separable states, J. Mod. Optics 47 (2000), 377–385, quant-ph/9904005.
[5] D. P. DiVincenzo, P. W. Shor, J. A. Smolin, B. M. Terhal and A. V. Thapliyal, Evidence for bound entangled states with negative partial transpose, Phys. Rev. A 61, 062312 (2000), quant-ph/9910026.
[6] A. S. Holevo, Coding theorems for quantum channels, Russian Math. Surveys 53 (1999), 1295–1331, quant-ph/9809023.
[7] M. Horodecki, P. Horodecki and R. Horodecki, Separability of mixed states: necessary and sufficient conditions, Phys. Lett. A 223 (1996), 1–8.
[8] M. Horodecki and P. Horodecki, Reduction criterion of separability and limits for a class of protocols of entanglement distillation, Phys. Rev. A 59 (1999), 4206–4216, quant-ph/9708015.
[9] M. Horodecki, P. Horodecki and R. Horodecki, Binding entanglement channels, J. Mod. Opt. 47 (2000), 347–354, quant-ph/9904092.
[10] P. Horodecki, M. Lewenstein, G. Vidal and I. Cirac, Operational criterion and constructive checks for the separability of low rank density matrices, Phys. Rev. A 62, 032310 (2000), quant-ph/0002089.
[11] P. Horodecki, J. Smolin, B. Terhal and A. Thapliyal, Rank two bound entangled states do not exist, J. Theor. Comp. Sci. 292 (2003), 589–596, ArXiv.org preprint quant-ph/9910122.
[12] A. Jamiolkowski, Linear transformations which preserve trace and positive semidefiniteness of operators, Rep. Math. Phys. 3 (1972), 275–278.
[13] C. King, Maximization of capacity and $l_p$ norms for some product channels, J. Math. Phys. 43 (2002), 1247–1260.
[14] C. King, Maximal p-norms of entanglement breaking channels, Quant. Information and Computation 3 (2003), 186–190, quant-ph/0212057.
[15] C. King, Capacity of the depolarizing channel, Lecture in workshop on Quantum Information and Cryptography at Mathematical Sciences Research Institute (November, 2002). http://www.msri.org/publications/ln/msri/2002/quantumcrypto/king/1/index.html
[16] D. Leung, Choi’s proof as a recipe for quantum process tomography, J. Math. Phys. 44 (2003), 528–533, quant-ph/0201119.
[17] A. O. Pittenger and M. H. Rubin, Separability and Fourier representations of density matrices, Phys. Rev. A 62, 032313 (2000), quant-ph/0001014.
[18] M. B. Ruskai, Qubit entanglement breaking maps, quant-ph/0302032, Rev. Math. Phys. 15 (2003), 643–662.
[19] M. B. Ruskai, S. Szarek and W. Werner, An analysis of completely positive trace-preserving maps on $M_2$, Lin. Alg. Appl. 347 (2002), 159–187, quant-ph/0101003.
[20] P. W. Shor, Additivity of the classical capacity of entanglement-breaking quantum channels, J. Math. Phys. 43 (2002), 4334–4340, quant-ph/0201149.
[21] F. Verstraete and H. S. Verschelde, On one-qubit channels, ArXiv.org preprint quant-ph/0202124, version 1.
[22] G. Vidal, W. Dür and J. I. Cirac, Entanglement cost of bipartite mixed states, Phys. Rev. Lett. 89, 027901 (2002), quant-ph/0112131.
September 1, 2003 12:19 WSPC/148-RMP
00171
Reviews in Mathematical Physics Vol. 15, No. 6 (2003) 643–662 c World Scientific Publishing Company
QUBIT ENTANGLEMENT BREAKING CHANNELS
MARY BETH RUSKAI∗ Department of Mathematics Tufts University, Medford, Massachusetts 02155
[email protected]

Received 2 February 2003
Revised 30 May 2003

This paper continues the study of stochastic maps, or channels, for which $(I \otimes \Phi)(\Gamma)$ is always separable in the case of qubits. We give a detailed description of entanglement-breaking qubit channels, and show that such maps are precisely the convex hull of those known as classical-quantum channels. We also review the complete positivity conditions in a canonical parameterization and show how they lead to entanglement-breaking conditions.

Keywords: Quantum channels; entanglement breaking maps; completely positive maps; CQ channels; separable states; extreme points; qubit complete positivity conditions.
1. Introduction

The preceding paper [11] studied the class of stochastic maps which break entanglement. For a given map $\Phi$ this means that $(I \otimes \Phi)(\Gamma)$ is separable for any density matrix $\Gamma$ on a tensor product space. It was observed that a map is entanglement breaking if and only if it can be written in one of the following equivalent forms

\begin{align}
\Phi(\rho) &= \sum_k R_k \, \mathrm{Tr}\, F_k \rho \tag{1} \\
&= \sum_k |\psi_k\rangle\langle\psi_k| \, \langle \phi_k, \rho\, \phi_k \rangle \tag{2}
\end{align}
where each $R_k$ is a density matrix and $F_k$ a positive semi-definite operator. The map $\Phi$ is also trace-preserving if and only if $\sum_k F_k = \sum_k |\phi_k\rangle\langle\phi_k| = I$, in which case the set $\{F_k\}$ forms a POVM. Henceforth we will only consider trace-preserving maps and use the abbreviations CPT for those which are also completely positive and EBT for those which are also entanglement breaking. An EBT map is called classical-quantum (CQ) if each $F_k = |k\rangle\langle k|$ is a one-dimensional projection; it is quantum-classical (QC) if each density matrix $R_k = |k\rangle\langle k|$ is a one-dimensional projection.

∗ Partially supported by the National Security Agency (NSA) and Advanced Research and Development Activity (ARDA) under Army Research Office (ARO) contract numbers DAAG5598-1-0374 and DAAD19-02-1-0065, and by the National Science Foundation under Grant number DMS-0074566.

Maps which break entanglement can always be simulated using a classical channel; thus, one is primarily interested in those which preserve entanglement. Nevertheless, it is important to understand the distinction. In this paper we restrict attention to EBT maps on qubits, for which one can obtain a number of results which do not hold for general EBT maps. The main new result, which does not hold in higher dimensions, is that every qubit EBT map can be written as a convex combination of maps in the subclass of CQ maps defined above. Before proving this result in Sec. 6, we review parameterizations and complete positivity conditions for qubit maps. We also give a number of more specialized results which use the canonical parameterization and/or the fact that positivity of the partial transpose suffices to test entanglement for states on pairs of qubits.

Recall that any CPT map $\Phi$ on qubits can be represented by a matrix in the canonical basis of $\{I, \sigma_1, \sigma_2, \sigma_3\}$. When $\rho = \tfrac{1}{2}[I + v \cdot \sigma]$, then $\Phi(\rho) = \tfrac{1}{2}[I + (t + Tv) \cdot \sigma]$ where $t$ is the vector with elements $t_k = t_{0k}$, $k = 1, 2, 3$ and $T$ is a $3 \times 3$ matrix, i.e. $\mathbf{T} = \begin{pmatrix} 1 & 0 \\ t & T \end{pmatrix}$. Moreover, it was shown in [14] that we can assume without loss of generality (i.e. after suitable change of bases) that $T$ is diagonal so that $\mathbf{T}$ has the canonical form

$$\mathbf{T} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ t_1 & \lambda_1 & 0 & 0 \\ t_2 & 0 & \lambda_2 & 0 \\ t_3 & 0 & 0 & \lambda_3 \end{pmatrix}. \tag{3}$$

The conditions for complete positivity in this representation were obtained in [16] and are summarized in Sec. 4. In the case of qubits, Theorem 4 of [11] can be extended to give several other equivalent characterizations.

Theorem 1. For trace-preserving qubit maps, the following are equivalent
(A) $\Phi$ has the Holevo form (1) with $\{F_k\}$ a POVM.
(B) $\Phi$ is entanglement breaking.
(C) $\Phi \circ T$ is completely positive, where $T(\rho) = \rho^T$ is the transpose.
(D) $\Phi$ has the “sign-change” property that changing any $\lambda_k \to -\lambda_k$ in the canonical form (3) yields another completely positive map.
(E) $\Phi$ is in the convex hull of CQ maps.
Conditions (C) through (E) are special to qubits. Conditions (C) and (D) use the fact [4, 8, 10, 15] that the PPT (positive partial transpose) condition for separability is also sufficient in the case of qubits.
2. Characterizations

In this section, we prove Theorem 1 and provide some results using the canonical parameters. This gives another characterization of qubit EBT maps in the special case of CPT maps which are also unital.

The equivalence (A) ⇔ (B) was proved in [11] where it was also shown that both are equivalent to the condition that $\Upsilon \circ \Phi$ is CPT for all $\Upsilon$ in a set of entanglement witnesses and that $\Phi \circ \Upsilon$ is CPT if and only if $\Upsilon \circ \Phi$ is. In the case of qubits, it is well-known that it suffices to let $\Upsilon$ be the transpose, which proves the equivalence with (C). Furthermore, changing $\Phi \to \Phi \circ T$ is equivalent to changing $\lambda_2 \to -\lambda_2$ in the representation (3), and is unitarily equivalent (via conjugation with a Pauli matrix) to changing the sign of any other $\lambda_k$, which yields (C) ⇔ (D). That (E) ⇒ (A) follows immediately from the facts that CQ maps are a special type of entanglement-breaking map and that the set of entanglement-breaking maps is convex by Theorem 2 of [11]. The proof that (D) ⇒ (E) will be given in Sec. 6.

The proof that (B) ⇒ (A) given in [11] relied on the fact that there is a one-to-one correspondence [5, 10, 12] (but not a unitary equivalence) between maps $\Phi$ and states

$$\Gamma_\Phi = (I \otimes \Phi)(|\beta\rangle\langle\beta|) \tag{4}$$

where $|\beta\rangle = \tfrac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$ is one of the maximally entangled Bell states. Moreover, a map is EBT if and only if $\Gamma_\Phi$ is separable since that was shown to be equivalent to writing it in the form (2). One could then apply the reduction criterion for separability [2, 9, 10] to $\Gamma_\Phi$. This criterion states that a necessary condition for separability of $\rho$ is that $\langle \beta, \rho\, \beta \rangle \le \tfrac{1}{d}$ for all maximally entangled states. In the case of qubits, this criterion is equivalent to the PPT condition, and hence sufficient, and equivalent to $\rho \le \tfrac{1}{2} I$, which gives the following result.

Theorem 2. A qubit CPT map is EBT if and only if $\Gamma_\Phi \le \tfrac{1}{2} I$ with $\Gamma_\Phi$ as in (4).

We now consider entanglement breaking conditions which involve only the parameters $\lambda_k$.

Theorem 3. If $\Phi$ is an entanglement breaking qubit map written in the form (3), then $\sum_j |\lambda_j| \le 1$.

Proof. It is shown in [1, 16] that a necessary condition for complete positivity is

$$(\lambda_1 \pm \lambda_2)^2 \le (1 \pm \lambda_3)^2. \tag{5}$$

When combined with the sign change condition (D), this yields the requirement $|\lambda_1| + |\lambda_2| \le 1 - |\lambda_3|$.

For unital qubit channels, the condition in Theorem 3 is also sufficient for entanglement breaking. For unital maps $t = 0$ and, as observed in [1, 14, 16], the conditions in (5) are also sufficient for complete positivity. Since $\sum_j |\lambda_j| \le 1$ implies
that (5) holds for any choice of sign in $\lambda_k = \pm|\lambda_k|$, it follows that any unital CPT map satisfying this condition is also EBT.

Theorem 4. A unital qubit channel is entanglement breaking if and only if $\sum_j |\lambda_j| \le 1$ [after reduction to the form (3)].
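
As a minimal numerical sketch of Theorems 3 and 4 (not part of the original argument), the following Python fragment tests a unital qubit map in canonical form. It uses condition (5), together with the implicit requirement $|\lambda_3| \le 1$, for complete positivity, and $\sum_j |\lambda_j| \le 1$ for entanglement breaking. The function names and the tolerance `tol` are illustrative assumptions, and the family $\lambda = (p, p, p)$ is chosen only as a convenient test case.

```python
def is_unital_cp(lams, tol=1e-12):
    """Condition (5) for a unital map (t = 0), plus |lambda_3| <= 1."""
    l1, l2, l3 = lams
    if abs(l3) > 1 + tol:
        return False
    return ((l1 + l2) ** 2 <= (1 + l3) ** 2 + tol
            and (l1 - l2) ** 2 <= (1 - l3) ** 2 + tol)


def has_sign_change_property(lams, tol=1e-12):
    """Condition (D): the map stays CP when any single lambda_k changes sign."""
    candidates = [list(lams)]
    for k in range(3):
        flipped = list(lams)
        flipped[k] = -flipped[k]
        candidates.append(flipped)
    return all(is_unital_cp(c, tol) for c in candidates)


def is_unital_ebt(lams, tol=1e-12):
    """Theorem 4: sum_j |lambda_j| <= 1."""
    return sum(abs(l) for l in lams) <= 1 + tol


for p in (0.3, 0.5):
    lams = (p, p, p)  # illustrative one-parameter family, not from the paper
    print(p, is_unital_cp(lams), has_sign_change_property(lams), is_unital_ebt(lams))
```

For unital maps the last two tests should always agree, which is the content of the equivalence between the sign-change property and Theorem 4.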
Moreover, as will be discussed in Sec. 5, the extreme points of the set of unital entanglement breaking maps are those for which two $\lambda_k = 0$. Hence these channels are in the convex hull of CQ maps.

For non-unital maps these conditions need not be sufficient. Consider the so-called amplitude damping channel for which $\lambda_1 = \alpha$, $\lambda_2 = \alpha$, $\lambda_3 = \alpha^2$, $t_1 = t_2 = 0$, and $t_3 = 1 - \alpha^2$. For this map equality holds in the necessary and sufficient conditions

$$(\lambda_1 \pm \lambda_2)^2 \le (1 \pm \lambda_3)^2 - t_3^2. \tag{6}$$

Since the inequalities would be violated if the sign of one $\lambda_k$ is changed, the amplitude damping maps are never entanglement breaking except for the limiting case $\alpha = 0$. Thus there are maps for which $\sum_j |\lambda_j| = 2\alpha + \alpha^2$ can be made arbitrarily small (by taking $\alpha \to 0$), but which are not entanglement-breaking.

3. A Product Representation

We begin by considering the representation of maps in the basis $\{I, \sigma_1, \sigma_2, \sigma_3\}$. Let $\Phi$ have the form (1) and write $R_k = \tfrac{1}{2}[I + w^k \cdot \sigma]$ and $F_k = \tfrac{1}{2}[u_0^k I + u^k \cdot \sigma]$. Let $W$, $U$ be the $n \times 4$ matrices whose rows are $(1, w_1^k, w_2^k, w_3^k)$ and $(u_0^k, u_1^k, u_2^k, u_3^k)$ respectively, i.e. $w_{kj} = w_j^k$, $u_{kj} = u_j^k$, $j = 0, \ldots, 3$. Let $\mathbf{T}$ be the matrix $W^T U$. Note that the requirement that $\{F_k\}$ is a POVM is precisely that the first row of $\mathbf{T}$ is $(1, 0, 0, 0)$. The matrix $\mathbf{T} = W^T U$ is the representative of $\Phi$ in the form (3) (albeit not necessarily diagonal). We can summarize this discussion in the following theorem.

Theorem 5. A qubit channel is entanglement breaking if and only if it can be represented in the form (3) with $\mathbf{T} = W^T U$ where $W$ and $U$ are $n \times 4$ matrices as above, i.e. the rows satisfy $\left(\sum_{j=1}^3 u_{kj}^2\right)^{1/2} \le u_{k0}$ and $\left(\sum_{j=1}^3 w_{kj}^2\right)^{1/2} \le w_{k0} = 1$ for all $k$.

We can use this representation to give alternate proofs of two results of the previous section. To show that (A) ⇔ (D) observe that changing the sign of the $j$th column of $U$ ($j = 1, 2, 3$) is equivalent to replacing $F_k$ by the POVM with $u_{kj} \to -u_{kj}$. The effect on $\mathbf{T}$ is simply to multiply the $j$th column by $-1$. The critical property of qubits is that the condition $F_k \ge 0$ is equivalent to $\left(\sum_j |u_{kj}|^2\right)^{1/2} \le u_{k0}$, which is unaffected by the replacement $u_{kj} \to -u_{kj}$.

Next, we give an alternate proof of Theorem 3 which is of interest because it may be extendable to higher dimensions.
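Proof. Let $W$, $U$ be as in Sec. 3. Then

\begin{align*}
\sum_{j=1}^{3} |\lambda_j| &= \sum_{j=1}^{3} \Big| \sum_{k=1}^{n} w_j^k u_j^k \Big| \\
&\le \sum_{j=1}^{3} \sum_{k=1}^{n} |w_j^k u_j^k| = \sum_{k=1}^{n} \sum_{j=1}^{3} |w_j^k u_j^k| \\
&\le \sum_{k=1}^{n} \Big( \sum_{j=1}^{3} |w_j^k|^2 \Big)^{1/2} \Big( \sum_{j=1}^{3} |u_j^k|^2 \Big)^{1/2} \\
&\le \sum_{k=1}^{n} 1 \cdot u_0^k = 1
\end{align*}

where we have used the fact that $|w^k| \le 1$ and $|u^k| \le u_0^k$. That $\sum_k u_0^k = 1$ is a consequence of the fact that the $\{F_k\}$ form a POVM.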
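We now consider the decomposition $\mathbf{T} = W^T U$ for the special cases of CQ, QC and point channels. If $\Phi$ is a CQ channel, we can assume without loss of generality that $U = \tfrac{1}{2} \begin{pmatrix} 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & -1 \end{pmatrix}$. Now write $W = \begin{pmatrix} 1 & w^1 \\ 1 & w^2 \end{pmatrix}$. Then

$$\mathbf{T} = W^T U = \begin{pmatrix} 1 & 0 & 0 & 0 \\ \dfrac{w^1 + w^2}{2} & 0 & 0 & \dfrac{w^1 - w^2}{2} \end{pmatrix} \tag{7}$$

where the second row stands for a $3 \times 4$ block whose nonzero columns are the indicated vectors. By acting on the left with a unitary matrix of the form $\begin{pmatrix} 1 & 0 \\ 0 & \pm R \end{pmatrix}$ where $R$ is a rotation whose third row is a multiple of $w^1 - w^2$, this can be reduced to the form (3) with $\lambda_1 = \lambda_2 = 0$, $|\lambda_3| = \tfrac{1}{2}|w^1 - w^2|$, and $t = R w^1 - (0, 0, \lambda_3)^T$ [since $\tfrac{1}{2}(w^1 + w^2) = w^1 - \tfrac{1}{2}(w^1 - w^2)$]. Indeed, it suffices to choose

$$W = \begin{pmatrix} 1 & t_1 & t_2 & t_3 + \lambda_3 \\ 1 & t_1 & t_2 & t_3 - \lambda_3 \end{pmatrix}. \tag{8}$$

Note that the requirement $|t| \le 1$ only implies $t_1^2 + t_2^2 + (t_3 + \lambda_3)^2 \le 1$; however, the requirement $|w^k| \le 1$ implies that $t_1^2 + t_2^2 + (t_3 \pm \lambda_3)^2 \le 1$ must hold with both signs and this is equivalent to the stronger condition

$$t_1^2 + t_2^2 + (|t_3| + |\lambda_3|)^2 \le 1 \tag{9}$$

which is necessary and sufficient for a CPT map to reduce the Bloch sphere to a line.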
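
The equivalence between the two-sign requirement $|w^k| \le 1$ and condition (9) can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names and the tolerance `tol` are hypothetical, and the rows of $W$ are taken directly from (8).

```python
def rows_of_W(t, lam3):
    """The two Bloch vectors w^1, w^2 read off from the rows of W in (8)."""
    t1, t2, t3 = t
    return (t1, t2, t3 + lam3), (t1, t2, t3 - lam3)


def both_rows_are_states(t, lam3, tol=1e-12):
    """|w^k| <= 1 for k = 1, 2, i.e. both R_k are density matrices."""
    return all(w[0] * w[0] + w[1] * w[1] + w[2] * w[2] <= 1 + tol
               for w in rows_of_W(t, lam3))


def condition_9(t, lam3, tol=1e-12):
    """Inequality (9): t1^2 + t2^2 + (|t3| + |lam3|)^2 <= 1."""
    t1, t2, t3 = t
    s = abs(t3) + abs(lam3)
    return t1 * t1 + t2 * t2 + s * s <= 1 + tol


t = (0.3, 0.0, 0.4)
for lam3 in (0.5, 0.6):
    print(lam3, both_rows_are_states(t, lam3), condition_9(t, lam3))
```

In the second case only one of the two sign choices satisfies the norm bound, which is why the weaker one-sign inequality does not suffice.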
If $\Phi$ is a QC channel, we can assume without loss of generality that

$$W = \begin{pmatrix} 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & -1 \end{pmatrix} \quad \text{and} \quad U = \begin{pmatrix} u_0 & u_1 & u_2 & u_3 \\ 1 - u_0 & -u_1 & -u_2 & -u_3 \end{pmatrix},$$

from which one easily finds that the second and third rows of $\mathbf{T} = W^T U$ are identically zero and the fourth row is $(2u_0 - 1,\ 2u_1,\ 2u_2,\ 2u_3)$. One then easily verifies that multiplication on the right by a matrix as above with $R$ a rotation whose third column is a multiple of $(u_1, u_2, u_3)$ reduces $\mathbf{T} = W^T U$ to the canonical form (3) with $\lambda_1 = \lambda_2 = 0$, $\lambda_3 = 2\sqrt{u_1^2 + u_2^2 + u_3^2} = 2|u| \le \min\{2u_0, 2(1 - u_0)\} \le 1$, and $t_3 = 2u_0 - 1$. (Note that $t_3 + \lambda_3 \le |2u_0 - 1| + \min\{2u_0, 2(1 - u_0)\} \le 1$ with equality if and only if the image reaches the Bloch sphere.)

It is interesting to note that for qubit channels, every QC channel is unitarily equivalent to a CQ channel. Indeed, a channel which, after reduction to canonical form, has nonzero elements $\lambda_3$ and $t_3$ with $|\lambda_3| + |t_3| \le 1$ and $|t_3| < 1$ can be written as either a QC channel with

$$W = \begin{pmatrix} 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & -1 \end{pmatrix}, \quad U = \frac{1}{2} \begin{pmatrix} 1 + t_3 & 0 & 0 & \lambda_3 \\ 1 - t_3 & 0 & 0 & -\lambda_3 \end{pmatrix}$$

or as a CQ channel with

$$W = \begin{pmatrix} 1 & 0 & 0 & t_3 + \lambda_3 \\ 1 & 0 & 0 & t_3 - \lambda_3 \end{pmatrix}, \quad U = \frac{1}{2} \begin{pmatrix} 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & -1 \end{pmatrix}.$$
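
To make the last claim concrete, the following sketch forms $W^T U$ for the QC and CQ factorizations just given and confirms that both yield the same matrix, with fourth row $(t_3, 0, 0, \lambda_3)$. The helper names are illustrative assumptions, and the values $t_3 = 0.25$, $\lambda_3 = 0.5$ are arbitrary test data.

```python
def transpose_times(W, U):
    """Return the 4 x 4 matrix W^T U for W, U given as lists of rows of length 4."""
    n = len(W)
    return [[sum(W[k][i] * U[k][j] for k in range(n)) for j in range(4)]
            for i in range(4)]


def qc_form(t3, lam3):
    """QC factorization: W fixed, U carries the parameters."""
    W = [[1, 0, 0, 1], [1, 0, 0, -1]]
    U = [[(1 + t3) / 2, 0, 0, lam3 / 2],
         [(1 - t3) / 2, 0, 0, -lam3 / 2]]
    return W, U


def cq_form(t3, lam3):
    """CQ factorization: U fixed, W carries the parameters."""
    W = [[1, 0, 0, t3 + lam3], [1, 0, 0, t3 - lam3]]
    U = [[0.5, 0, 0, 0.5], [0.5, 0, 0, -0.5]]
    return W, U


T_qc = transpose_times(*qc_form(0.25, 0.5))
T_cq = transpose_times(*cq_form(0.25, 0.5))
for row in T_qc:
    print(row)
print(T_qc == T_cq)
```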
For point channels $W = (1\ \ t_1\ \ t_2\ \ t_3)$ and $U = \tfrac{1}{2}(1\ \ 0\ \ 0\ \ 0)$.

We conclude this section with an example of a map of the form (1) with an extreme POVM, for which the corresponding map $\Phi$ is not extreme. Let $E_k = \tfrac{1}{3}[I + w_k \cdot \sigma]$ with $w_1 = (1, 0, 0)$, $w_2 = (-\tfrac{1}{2}, 0, \tfrac{\sqrt{3}}{2})$, $w_3 = (-\tfrac{1}{2}, 0, -\tfrac{\sqrt{3}}{2})$. Then, irrespective of the choice of $R_k$, the third column of $\mathbf{T} = W^T U$ is identically zero, which implies that, after reduction to canonical form, one of the parameters $\lambda_k = 0$. However, it is easy to find density matrices, e.g. $R_k = \tfrac{1}{2}[I + \sigma_k]$, for which the resulting map $\Phi$ is not CQ or point. But by Theorem 10, $\Phi$ is a convex combination of CQ maps and hence, not extreme.

4. Complete Positivity Conditions Revisited

Not only is the set of CPT maps convex, in a fixed basis corresponding to the canonical form (3) the set of $\lambda_k$ corresponding to any fixed choice of $t = (t_1, t_2, t_3)$ is also a convex set which we denote $\Lambda_t$. We will also be interested in the convex subset $\Lambda_{t,\lambda_3}$ of the $\lambda_1$–$\lambda_2$ plane for fixed $t$, $\lambda_3$, and in the convex set $\Xi_{t_3,\lambda_3}$ of points $(t_1, t_2, \lambda_1, \lambda_2)$ corresponding to fixed $t_3$, $\lambda_3$. Although stated somewhat differently, the following result was proved in [16].
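Theorem 6. Let $t$ and $\lambda_3$ be fixed with $|t_3| + |\lambda_3| < 1$. Then the convex set $\Lambda_{t,\lambda_3}$ consists of the points $(\lambda_1, \lambda_2)$ for which $I - R_\Phi R_\Phi^\dagger$ (or, equivalently $I - R_\Phi^\dagger R_\Phi$) is positive semi-definite, where

$$R_\Phi = \begin{pmatrix} \dfrac{t_1 + i t_2}{(1 + t_3 + \lambda_3)^{1/2} (1 - t_3 - \lambda_3)^{1/2}} & \dfrac{\lambda_1 + \lambda_2}{(1 + t_3 + \lambda_3)^{1/2} (1 - t_3 + \lambda_3)^{1/2}} \\[2ex] \dfrac{\lambda_1 - \lambda_2}{(1 + t_3 - \lambda_3)^{1/2} (1 - t_3 - \lambda_3)^{1/2}} & \dfrac{t_1 + i t_2}{(1 + t_3 - \lambda_3)^{1/2} (1 - t_3 + \lambda_3)^{1/2}} \end{pmatrix}. \tag{10}$$

Similarly, $\Xi_{t_3,\lambda_3}$ also consists of the points $(t_1, t_2, \lambda_1, \lambda_2)$ for which $I - R_\Phi R_\Phi^\dagger \ge 0$. Moreover, the extreme points of $\Lambda_{t_3,\lambda_3}$ are those for which $R_\Phi^\dagger R_\Phi = I$.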
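Although this result is stated in a form in which $t_3$ and $\lambda_3$ play a special role and does not appear to be symmetric with respect to interchange of indices, the conditions which result are, in fact, invariant under permutations of 1, 2, 3.

Theorem 6 follows from Choi’s theorem [5] that $\Phi$ is completely positive if and only if $\Gamma_\Phi$, given by (4), is positive semi-definite. As noted in [16], this implies that it can be written in the form

$$\Gamma_\Phi = \begin{pmatrix} \Phi(E_{11}) & \sqrt{\Phi(E_{11})}\, R_\Phi \sqrt{\Phi(E_{22})} \\ \sqrt{\Phi(E_{22})}\, R_\Phi^\dagger \sqrt{\Phi(E_{11})} & \Phi(E_{22}) \end{pmatrix} \tag{11}$$

where $R_\Phi$ is a contraction. (Note, however, that the expression for $R_\Phi$ given in (10) was obtained by applying this result to the adjoint $\widehat{\Phi}$, i.e. to $(I \otimes \widehat{\Phi})(|\beta\rangle\langle\beta|)$.)

Conversely, given a CPT map $\Phi$ and any contraction $U$ on $\mathbb{C}^2$, one can define a $4 \times 4$ matrix in block form,

$$M = \begin{pmatrix} \widehat{\Phi}(E_{11}) & \sqrt{\widehat{\Phi}(E_{11})}\, U \sqrt{\widehat{\Phi}(E_{22})} \\ \sqrt{\widehat{\Phi}(E_{22})}\, U^\dagger \sqrt{\widehat{\Phi}(E_{11})} & \widehat{\Phi}(E_{22}) \end{pmatrix}. \tag{12}$$

It then follows that there is another CPT map which (with a slight abuse of notation) we denote $\Phi_U$ for which $(I \otimes \widehat{\Phi}_U)(|\beta\rangle\langle\beta|) = M$. However, (12) need not, in general, correspond to a map $\Phi$ which has the canonical form (3) since that requires $\widehat{\Phi}_U(E_{12}) = \sqrt{\widehat{\Phi}(E_{11})}\, U \sqrt{\widehat{\Phi}(E_{22})} = (t_1 + i t_2) I + \lambda_1 \sigma_x + i \lambda_2 \sigma_y$. For $U$ an arbitrary unitary or contraction, we can only conclude that

\begin{align*}
\widehat{\Phi}(\sigma_x) &= \sqrt{\widehat{\Phi}(E_{11})}\, U \sqrt{\widehat{\Phi}(E_{22})} + \sqrt{\widehat{\Phi}(E_{22})}\, U^\dagger \sqrt{\widehat{\Phi}(E_{11})} \equiv \sum_{k=0}^{3} t_{1k} \sigma_k \\
\widehat{\Phi}(\sigma_y) &= -i \left[ \sqrt{\widehat{\Phi}(E_{11})}\, U \sqrt{\widehat{\Phi}(E_{22})} - \sqrt{\widehat{\Phi}(E_{22})}\, U^\dagger \sqrt{\widehat{\Phi}(E_{11})} \right] \equiv \sum_{k=0}^{3} t_{2k} \sigma_k
\end{align*}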
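so that the map $\Phi_U$ corresponds to a matrix of the form

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ t_{10} & t_{11} & t_{12} & t_{13} \\ t_{20} & t_{21} & t_{22} & t_{23} \\ t_3 & 0 & 0 & \lambda_3 \end{pmatrix}$$

with $t_{jk}$ real.

In order to study the general case of nonzero $t_k$, it is convenient to rewrite (10) in the following form (using notation similar to that introduced in [13]).

$$R_\Phi = \begin{pmatrix} \dfrac{\tau}{\sqrt{c_{++} c_{--}}} & \dfrac{\lambda_+}{\sqrt{c_{++} c_{+-}}} \\[2ex] \dfrac{\lambda_-}{\sqrt{c_{--} c_{-+}}} & \dfrac{\tau}{\sqrt{c_{+-} c_{-+}}} \end{pmatrix} \tag{13}$$

where $\lambda_\pm = \lambda_1 \pm \lambda_2$, $\tau = t_1 + i t_2$, and $c_{\pm\pm} = 1 \pm \lambda_3 \pm t_3$, e.g. $c_{+-} = 1 + \lambda_3 - t_3$. Then

$$I - R_\Phi^\dagger R_\Phi \equiv M = \begin{pmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{pmatrix} \tag{14}$$

with

$$m_{11} = 1 - \frac{|\tau|^2}{c_{++} c_{--}} - \frac{|\lambda_-|^2}{c_{--} c_{-+}} \tag{15}$$

$$m_{22} = 1 - \frac{|\tau|^2}{c_{+-} c_{-+}} - \frac{|\lambda_+|^2}{c_{++} c_{+-}} \tag{16}$$

$$m_{12} = \overline{m_{21}} = -\left[ \frac{\overline{\tau} \lambda_+}{c_{++} \sqrt{c_{--} c_{+-}}} + \frac{\tau \lambda_-}{c_{-+} \sqrt{c_{--} c_{+-}}} \right]. \tag{17}$$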
Note that the denominators, although somewhat messy, are essentially constants depending only on $t_3$ and $\lambda_3$. Considering $\tau$ as also a fixed constant it suffices to rotate (and dilate) the $\lambda_1$–$\lambda_2$ plane by $\pi/4$ and work instead with the variables $\lambda_\pm$. The diagonal conditions $m_{11} \ge 0$ and $m_{22} \ge 0$ define a rectangle in the $\lambda_+$–$\lambda_-$ plane, namely

$$|\lambda_-|^2 \le c_{--} c_{-+} - \frac{c_{-+}}{c_{++}} |\tau|^2 = (1 - \lambda_3)^2 - t_3^2 - \frac{1 - \lambda_3 + t_3}{1 + \lambda_3 + t_3} |\tau|^2 \tag{18}$$

$$|\lambda_+|^2 \le c_{++} c_{+-} - \frac{c_{++}}{c_{-+}} |\tau|^2 = (1 + \lambda_3)^2 - t_3^2 - \frac{1 + \lambda_3 + t_3}{1 - \lambda_3 + t_3} |\tau|^2. \tag{19}$$

These diagonal conditions imply the necessary conditions

$$|\lambda_\pm|^2 \le (1 \pm \lambda_3)^2 - t_3^2 \tag{20}$$
for complete positivity, which also become sufficient when $\tau = 0$. The determinant condition $m_{11} m_{22} \ge |m_{12}|^2$ is more complicated, but basically has the form

$$[a - b\lambda_+^2][c - d\lambda_-^2] \ge e\lambda_+^2 + f\lambda_-^2 + g\lambda_+ \lambda_-. \tag{21}$$

Fig. 1. The $\lambda_+$–$\lambda_-$ plane showing the regions described by the diagonal conditions (dotted lines) and the curves corresponding to $\det(I - R_\Phi^\dagger R_\Phi) = 0$ for $t = (0.2, 0.3, 0)$ and $\lambda_3 = 0.35$. The closed curve and its interior describes the parameters for which the corresponding map is completely positive.
In particular, we would like to know if the values of $(\lambda_+, \lambda_-)$ satisfying (21) necessarily lie within the rectangle defined by (18) and (19). Extending the lines bounding this rectangle, i.e. $m_{11} = 0$ and $m_{22} = 0$, one sees that the $\lambda_+$–$\lambda_-$ plane is divided into 9 regions, as shown in Fig. 1 and described below.

• the rectangle in the center which we denote ++,
• four (4) outer corners which we denote −− since both $m_{11} < 0$ and $m_{22} < 0$,
• the four (4) remaining regions (directly above, below and to the left and right of the center rectangle) which we denote as +− or −+ according to the signs of $m_{11}$ and $m_{22}$.

We know that the determinant condition (21) is never satisfied in the +− or −+ regions since $m_{11} m_{22} - |m_{12}|^2 < 0$ when $m_{11}$ and $m_{22}$ have opposite signs. This implies that equality in (21) defines a curve which bounds a convex region lying
entirely within the ++ rectangle. Although (21) also has solutions in the −− regions as shown in Fig. 1, one expects that these will typically lie outside the region for which $|t_k| + |\lambda_k| \le 1$, i.e. the rectangle bounded by the line segments satisfying $|\lambda_+ + \lambda_-| \le 2(1 - |t_1|)$ and $|\lambda_+ - \lambda_-| \le 2(1 - |t_2|)$. However, John Cortese [6] has shown that this need not necessarily be the case. Nevertheless, one need only check one of the two conditions $m_{11} > 0$, $m_{22} > 0$, and might substitute a weaker condition, such as $\mathrm{Tr}\, M > 0$, to exclude points in the −− regions. For example, one could substitute for the diagonal conditions, $c_{--} m_{11} + c_{+-} m_{22} \ge 0$ which is equivalent to

$$(\lambda_1^2 + \lambda_2^2)(1 + t_3) + \lambda_3^2 (1 - t_3) \le (1 + t_3)(1 - |t|^2) + 2\lambda_1 \lambda_2 \lambda_3. \tag{22}$$

Fig. 2. The $\lambda_+$–$\lambda_-$ plane showing the region determined by the determinant condition when $t = (0.4, 0.3, 0.0)$ and $\lambda_3 = 0.15$ and the corresponding region with $\lambda_+$ and $\lambda_-$ interchanged. Their intersection corresponds to the entanglement breaking maps with the indicated parameters.
Thus, strict inequality in both (21) and (22) suffices to ensure complete positivity.

In general, when $t \ne 0$, the convex set $\Lambda_{t,\lambda_3}$ is determined by (21), i.e. by the closed curve for which equality holds and its interior. Since changing the sign of $\lambda_1$ or $\lambda_2$ is equivalent to changing $\lambda_+ \leftrightarrow \lambda_-$, the corresponding set of entanglement breaking maps is given by the intersection of this region with the corresponding one with $\lambda_+$ and $\lambda_-$ switched, as shown in Fig. 2.

Remark. If, instead of looking at $I - R_\Phi^\dagger R_\Phi$, we had considered $I - R_\Phi R_\Phi^\dagger$, the matrix $M$ would change slightly and the conditions (18) or (19) would be modified accordingly. (In fact, the only change would be to replace $+t_3$ by $-t_3$ in the fraction multiplying $|\tau|^2$.) However, the determinant condition (21) would not change. Since $R_\Phi^\dagger R_\Phi$ and $R_\Phi R_\Phi^\dagger$ are unitarily equivalent,

$$\det[I - R_\Phi R_\Phi^\dagger] = \det(U [I - R_\Phi^\dagger R_\Phi] U^\dagger) = \det[I - R_\Phi^\dagger R_\Phi].$$
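
The conditions of this section can be evaluated directly. The sketch below is an illustration under stated assumptions, not the paper's own procedure: it builds the entries of $R_\Phi$ from (13), forms $M = I - R_\Phi^\dagger R_\Phi$, and tests positive semi-definiteness through $m_{11} \ge 0$, $m_{22} \ge 0$ and $\det M \ge 0$. Entanglement breaking is then tested by repeating the check with $\lambda_2 \to -\lambda_2$. The function names and the tolerance are hypothetical; the values of $t$ and $\lambda_3$ are those quoted for Fig. 1.

```python
import math


def m_entries(t, lams):
    """Return (m11, m22, m12) for M = I - R^dagger R, with R as in (13)."""
    t1, t2, t3 = t
    l1, l2, l3 = lams
    if abs(t3) + abs(l3) >= 1:
        raise ValueError("Theorem 6 assumes |t3| + |lambda3| < 1")
    tau = complex(t1, t2)
    lam_plus, lam_minus = l1 + l2, l1 - l2
    c_pp, c_mm = 1 + l3 + t3, 1 - l3 - t3
    c_pm, c_mp = 1 + l3 - t3, 1 - l3 + t3
    r11 = tau / math.sqrt(c_pp * c_mm)
    r12 = lam_plus / math.sqrt(c_pp * c_pm)
    r21 = lam_minus / math.sqrt(c_mm * c_mp)   # real number
    r22 = tau / math.sqrt(c_pm * c_mp)
    m11 = 1 - abs(r11) ** 2 - abs(r21) ** 2
    m22 = 1 - abs(r12) ** 2 - abs(r22) ** 2
    m12 = -(r11.conjugate() * r12 + r21 * r22)
    return m11, m22, m12


def is_cp(t, lams, tol=1e-12):
    """R is a contraction iff the 2 x 2 Hermitian matrix M is positive semi-definite."""
    m11, m22, m12 = m_entries(t, lams)
    return m11 >= -tol and m22 >= -tol and m11 * m22 - abs(m12) ** 2 >= -tol


def is_ebt(t, lams, tol=1e-12):
    """CP for both the map and the map with lambda_2 -> -lambda_2."""
    l1, l2, l3 = lams
    return is_cp(t, lams, tol) and is_cp(t, (l1, -l2, l3), tol)


t = (0.2, 0.3, 0.0)
for lams in ((0.0, 0.0, 0.35), (0.4, 0.4, 0.35)):
    print(lams, is_cp(t, lams), is_ebt(t, lams))
```

The second parameter set is completely positive but fails the sign-change test, consistent with Theorem 3 since $\sum_j |\lambda_j| > 1$ there.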
It is worth noting that whether or not $R_\Phi$ is a contraction is not affected by the signs of the $t_k$. (In particular, changing $t_2 \mapsto -t_2$ takes $R_\Phi \mapsto \overline{R_\Phi}$, changing $t_3 \mapsto -t_3$ takes $R_\Phi \mapsto \sigma_x R_\Phi^T \sigma_x$, and changing $t_1 \mapsto -t_1$ takes $R_\Phi \mapsto -\sigma_z \overline{R_\Phi} \sigma_z$.) Therefore, one can change the sign of any one of the $t_k$ without affecting complete positivity. By contrast, one cannot, in general, change $\lambda_k \to -\lambda_k$ without affecting the complete positivity conditions. (Note, however, that one can always change the signs of any two of the $\lambda_k$ since this is equivalent to conjugation with a Pauli matrix on either the domain or range. The latter will also change the signs of two of the $t_k$.) Changing the sign of $\lambda_2$ is equivalent to composing $\Phi$ with the transpose, so that changing the sign of one of the $\lambda_k$ is equivalent to composing $\Phi$ with the transpose and conjugation with one of the Pauli matrices. Furthermore, if changing the sign of one particular $\lambda_k$ does not affect complete positivity, then one can change the sign of any of the $\lambda_k$ without affecting complete positivity. In view of the role of the sign change condition it is worth summarizing these remarks.

Proposition 7. Let $\Phi$ be a CPT map in canonical form (3) and let $T(\rho) = \rho^T$ denote the transpose. Then
(i) $T \circ \Phi \circ T$ is also completely positive, i.e. changing $t_k \to -t_k$ does not affect complete positivity.
(ii) $\Phi \circ T$ is completely positive if and only if changing any $\lambda_k \to -\lambda_k$ does not affect complete positivity.
(iii) $\Phi \circ T$ is completely positive if and only if $T \circ \Phi$ is.

The only difference between $\Phi \circ T$ and $T \circ \Phi$ is that the former changes the sign of $\lambda_2$ while the latter changes the signs of both $t_2$ and $\lambda_2$.

5. Geometry

5.1. Image of the Bloch sphere

We first consider the geometry of entanglement breaking channels in terms of their effect on the Bloch sphere. It follows from the equivalence with the sign change condition in Theorem 1 that any CPT map with some $\lambda_k = 0$ is entanglement breaking.
We call such channels planar since the image lies in a plane within the Bloch sphere. Similarly, we call a channel with two λk = 0 linear. If all three λk = 0, the Bloch sphere is mapped into a point. Note that the subsets of channels whose images lie within points, lines, and planes respectively are not convex. However, they are well-defined and useful classes to consider.
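
A small sketch can illustrate both the classification and the lack of convexity. It assumes two maps written in the same fixed basis, so that their convex combinations act linearly on the canonical parameters; the function names and the label "ellipsoid" for the case with no vanishing $\lambda_k$ are illustrative choices, not terminology from the text.

```python
def image_type(lams, tol=1e-12):
    """Classify the image of the Bloch sphere by the number of vanishing lambda_k."""
    zeros = sum(1 for lam in lams if abs(lam) <= tol)
    return {3: "point", 2: "line", 1: "plane", 0: "ellipsoid"}[zeros]


def midpoint(a, b):
    """Parameters of the equal-weight convex combination of two maps in the same basis."""
    return tuple((x + y) / 2 for x, y in zip(a, b))


line_a = (0.5, 0.0, 0.0)
line_b = (0.0, 0.5, 0.0)
print(image_type(line_a), image_type(line_b), image_type(midpoint(line_a, line_b)))
```

Here two linear channels along different axes average to a planar one, so the set of channels whose image is a line is not closed under convex combinations.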
Points: A channel which maps the Bloch sphere to a point has the Holevo form (1) in which the sum reduces to a single term with $R = \tfrac{1}{2}[I + t \cdot \sigma]$ and $E = I$. Then $\Phi(\rho) = R \, \mathrm{Tr}(E\rho) = R$ for all $\rho$ and $\mathbf{T} = \begin{pmatrix} 1 & 0 \\ t & 0 \end{pmatrix}$. When $|t| = 1$, $R$ is a pure state and the map is extreme. It is also a special case of the so-called amplitude damping channels, and (as noted at the end of Sec. 2) these are the only amplitude damping channels which break entanglement.

Lines: When two of the $\lambda_k = 0$ so that the image of the Bloch sphere is a line, the conditions for complete positivity reduce to a single inequality, which becomes (9) in the case $\lambda_1 = \lambda_2 = 0$. Moreover, it is straightforward to verify that any such channel can be realized as a CQ channel. Indeed, it suffices to choose $W$ as in (8).

Planar channels: The image of a map with exactly one $\lambda_k = 0$ lies in a plane. When this is $\lambda_3$, the condition $I - R_\Phi^\dagger R_\Phi \ge 0$ becomes

$$\begin{pmatrix} 1 - |t|^2 - (\lambda_1 - \lambda_2)^2 & 2(t_1 \lambda_1 + i t_2 \lambda_2) \\ 2(t_1 \lambda_1 - i t_2 \lambda_2) & 1 - |t|^2 - (\lambda_1 + \lambda_2)^2 \end{pmatrix} \ge 0$$

where $|t|^2 = t_1^2 + t_2^2 + t_3^2$, and the condition on the diagonal becomes

$$(|\lambda_1| + |\lambda_2|)^2 + |t|^2 \le 1. \tag{23}$$

Now, if either diagonal element is identically zero, then one must have $t_1 \lambda_1 = t_2 \lambda_2 = 0$. Thus, if both $\lambda_1, \lambda_2 \ne 0$ and equality holds in the necessary condition (23), one must have $t_1 = t_2 = 0$, in which case it reduces to $(|\lambda_1| + |\lambda_2|)^2 + t_3^2 = 1$. This implies that a truly planar channel cannot touch the Bloch sphere, unless it reduces to a point or a line.

5.2. Geometry of $\lambda_k$ space

We now consider, instead of the geometry of the images of entanglement-breaking maps, the geometry of the allowed set of maps in $\lambda_k$ space. After reduction to the canonical form (3) it is often useful to look at the subset of $[\lambda_1, \lambda_2, \lambda_3]$ which corresponds to a particular class of maps. We first consider maps for which $t = 0$.

Theorem 8. In a fixed (diagonal) basis, the set of unital entanglement breaking maps on qubits corresponds to the octahedron whose extreme points correspond to the channels for which $[\lambda_1, \lambda_2, \lambda_3]$ is a permutation of $[\pm 1, 0, 0]$.

Since this octahedron is precisely the subset with $\sum_j |\lambda_j| \le 1$ the result follows immediately from Theorem 4. Alternatively, one could use Theorem 10 and the fact that the unital CQ maps must have the form above.

Remarks.
(1) The channels corresponding to a permutation of $[\pm 1, 0, 0]$ belong to the subclass known as CQ channels. Hence, the set of unital entanglement breaking maps is the convex hull of unital CQ maps.
Fig. 3. The tetrahedron of bistochastic maps and its inversion through the origin (left); their intersection gives the octahedron of unital entanglement breaking maps (right). (Figures by K. Durstberger appeared in [3].)
(2) This octahedron in Theorem 8 is precisely the intersection of the tetrahedron with corners $[1, 1, 1]$, $[1, -1, -1]$, $[-1, 1, -1]$, $[-1, -1, 1]$ with its inversion through the origin, as shown in Fig. 3. (A similar picture arises in studies of entanglement and Bell inequalities. See, e.g. Fig. 3 in [18] or Fig. 2 in [3].)
(3) The tetrahedron of unital maps is precisely the intersection of four half-spaces bounded by planes of the form $n \cdot [\lambda_1, \lambda_2, \lambda_3] = 1$ with $n = [\pm 1, \pm 1, \pm 1]$ and an odd number of negative signs, i.e. $n_1 n_2 n_3 = -1$. The octahedron of unital EBT maps is precisely the intersection of all eight half-spaces of this form.
(4) If the octahedron of unital entanglement breaking maps is removed from the tetrahedron of unital maps, one is left with four disjoint tetrahedrons whose sides are half the length of the original. Each of these defines a region of “entanglement-preserving” unital channels with fixed sign. For example, the tetrahedron with corners $[1, 1, 1]$, $[1, 0, 0]$, $[0, 1, 0]$, $[0, 0, 1]$, whose boundary consists of four equilateral triangles, one in the plane $[-1, -1, -1] \cdot [\lambda_1, \lambda_2, \lambda_3] = -1$ and three in the planes $n \cdot [\lambda_1, \lambda_2, \lambda_3] = 1$ with $n = [1, 1, -1], [1, -1, 1], [-1, 1, 1]$. For many purposes, e.g. consideration of additivity questions, it suffices to confine attention to one of these four corner tetrahedrons. Indeed, conjugation with one of the Pauli matrices transforms the corner above into one of the other three.

We next consider non-unital maps, for which one finds the following analogue of Theorem 8.

Theorem 9. Let $t = (t_1, t_2, t_3)$ be a fixed vector in $\mathbb{R}^3$ and let $\Lambda_t$ denote the convex subset of $\mathbb{R}^3$ corresponding to the vectors $[\lambda_1, \lambda_2, \lambda_3]$ for which the canonical map with these parameters is completely positive. Then the intersection of $\Lambda_t$ with its inversion through the origin (i.e. $\lambda_j \to -\lambda_j$) is the subset of EBT maps with translation $t$.

Remark.
The effect of changing the sign of λ2 is λ+ ↔ λ− and of changing the sign of λ1 is λ+ ↔ −λ− . In either case, the effect on the determinant condition (21)
is simply to switch $\lambda_+ \leftrightarrow \lambda_-$, i.e. to reflect the boundary across the $\lambda_+ = \lambda_-$ line. Thus, the intersection of these two regions will correspond to entanglement breaking channels. The remainder will, typically, consist of 4 disjoint (non-convex) regions, corresponding to the four corners remaining after the “rounded octahedron” of Theorem 9 is removed from the “rounded tetrahedron”.

6. Convex Hull of Qubit CQ Maps

In [16] we found it useful to generalize the extreme points of the set of CPT maps $\mathcal{S}$ to include all maps for which $R_\Phi$ is unitary, which is equivalent to the statement that both singular values of $R_\Phi$ are 1. In addition to true extreme points, this includes “quasi-extreme” points which correspond to the edges of the tetrahedron of unital maps. Some of these quasi-extreme points are true extreme points for the set of entanglement-breaking maps. However, there are no extreme points of the latter which are not generalized extreme points of $\mathcal{S}$. This will allow us to conclude the following.

Theorem 10. Every extreme point of the set of entanglement-breaking qubit maps is a CQ map. Hence, the set of entanglement-breaking qubit maps is the convex hull of qubit CQ maps.

The goal of the section is to prove this result. Because our argument is somewhat subtle, we also include, at the end of this section, a direct proof of some special cases.

First we note that the following was shown in [16]. After reduction to canonical form (3), for any map which is a generalized extreme point, the parameters $\lambda_k$ must satisfy (up to permutation) $\lambda_3 = \lambda_1 \lambda_2$. This is compatible with the sign change condition if and only if at least two of the $\lambda_k = 0$, which implies that $\Phi$ is a CQ map.

We now wish to examine in more detail those maps for which $R_\Phi$ is not unitary. We can assume, without loss of generality, that the singular values of $R_\Phi$ can be written as $\cos\theta_1$ and $\cos\theta_2$, that $\cos\theta_1 \ge \cos\theta_2$, and that $0 \le \cos\theta_2 < 1$.
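Recall that we showed in Lemma 15 of [16] that one can use the singular value decomposition of $R_\Phi$ to write
$$R_\Phi = V \begin{pmatrix} \cos\theta_1 & 0 \\ 0 & \cos\theta_2 \end{pmatrix} W^\dagger = \tfrac{1}{2} U_+ + \tfrac{1}{2} U_- \tag{24}$$
where
$$U_\pm = V \begin{pmatrix} e^{\pm i\theta_1} & 0 \\ 0 & e^{\pm i\theta_2} \end{pmatrix} W^\dagger$$
and $V$, $W$ are unitary. Thus, $\Phi$ is the midpoint of a line segment in $S$ and can be written as
$$\Phi = \tfrac{1}{2} \Phi_{U_+} + \tfrac{1}{2} \Phi_{U_-} \tag{25}$$
with $\Phi_{U_\pm}$ defined as in (12). Although $\Phi_{U_\pm}$ need not have the canonical form (3), they are related so that their sum does.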
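The matrix identity behind (24) can be checked numerically. The following is a minimal sketch rather than part of the original argument: the angles, the choice of real rotations as the unitaries $V$ and $W$, and all helper names are illustrative assumptions. It confirms that $R_\Phi$ is the midpoint of $U_+$ and $U_-$ and that $U_\pm$ are unitary.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def diag(x, y):
    """Diagonal 2x2 matrix with complex entries."""
    return [[complex(x), 0j], [0j, complex(y)]]

def rotation(phi):
    """Real rotation matrix, used only as a convenient example of a unitary."""
    c, s = math.cos(phi), math.sin(phi)
    return [[complex(c), complex(-s)], [complex(s), complex(c)]]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

theta1, theta2 = 0.4, 1.1              # illustrative angles (assumed values)
V, W = rotation(0.3), rotation(-0.7)   # illustrative unitaries (assumed values)

def sandwich(d):
    """Return V d W^dagger."""
    return matmul(matmul(V, d), dagger(W))

# R_Phi from its singular value decomposition, and the two unitaries U_+ and U_-.
R = sandwich(diag(math.cos(theta1), math.cos(theta2)))
U_plus = sandwich(diag(cmath.exp(1j * theta1), cmath.exp(1j * theta2)))
U_minus = sandwich(diag(cmath.exp(-1j * theta1), cmath.exp(-1j * theta2)))
midpoint = [[0.5 * (U_plus[i][j] + U_minus[i][j]) for j in range(2)]
            for i in range(2)]

print(close(R, midpoint))                                   # midpoint identity
print(close(matmul(U_plus, dagger(U_plus)), diag(1, 1)))    # U_+ is unitary
print(close(matmul(U_minus, dagger(U_minus)), diag(1, 1)))  # U_- is unitary
```

The check relies only on $\cos\theta = \tfrac12(e^{i\theta} + e^{-i\theta})$ applied to each diagonal entry, so any other unitary pair could be substituted for the rotations.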
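We now use the singular value decomposition of $R_\Phi$ to decompose it into unitary maps in another way.
$$R_\Phi = V \begin{pmatrix} \cos\theta_1 & 0 \\ 0 & \cos\theta_2 \end{pmatrix} W^\dagger \tag{26}$$
$$= V \left[ \frac{\cos\theta_1 + \cos\theta_2}{2}\, I + \frac{\cos\theta_1 - \cos\theta_2}{2}\, \sigma_z \right] W^\dagger$$
$$= \frac{\cos\theta_1 + \cos\theta_2}{2}\, V W^\dagger + \frac{\cos\theta_1 - \cos\theta_2}{2}\, V \sigma_z W^\dagger . \tag{27}$$
Moreover, it follows from (27) that
$$\Phi = \frac{\cos\theta_1 + \cos\theta_2}{2}\, \Phi_{V W^\dagger} + \frac{\cos\theta_1 - \cos\theta_2}{2}\, \Phi_{V \sigma_z W^\dagger} + (1 - \cos\theta_1)\, \Phi_0 \tag{28}$$
where $\Phi_0$ is the QC map corresponding to
$$M = \begin{pmatrix} \widehat{\Phi}(E_{11}) & 0 \\ 0 & \widehat{\Phi}(E_{22}) \end{pmatrix} .$$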
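The coefficients appearing in (27) and (28) can be checked with a few lines of arithmetic. This is a minimal sketch under stated assumptions: the function names and the sample angles are illustrative, and the code verifies only the diagonal identity behind (27) and the fact that the three weights in (28) are non-negative and sum to one when $1 \ge \cos\theta_1 \ge \cos\theta_2 \ge 0$. It does not construct the map $\Phi_0$.

```python
import math

def convex_weights(theta1, theta2):
    """Weights of the three maps in (28), in the order they appear there."""
    c1, c2 = math.cos(theta1), math.cos(theta2)
    return ((c1 + c2) / 2, (c1 - c2) / 2, 1 - c1)

def diagonal_from_pauli(theta1, theta2):
    """Diagonal of w_id * I + w_z * sigma_z, with I = diag(1, 1), sigma_z = diag(1, -1)."""
    w_id, w_z, _ = convex_weights(theta1, theta2)
    return (w_id + w_z, w_id - w_z)

theta1, theta2 = 0.4, 1.1   # illustrative angles with cos(theta1) >= cos(theta2) >= 0
weights = convex_weights(theta1, theta2)
print(weights, sum(weights))
print(diagonal_from_pauli(theta1, theta2), (math.cos(theta1), math.cos(theta2)))
```

When $\cos\theta_1 = 1$ the last weight vanishes, and when $\cos\theta_1 = \cos\theta_2$ the middle weight vanishes, which matches the exceptional cases the text mentions after (28).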
Since we have assumed that we do not have cos θ1 = cos θ2 = 1, Eq. (28) represents Φ as a non-trivial convex combination of at least two distinct CPT maps, the first two of which are generalized extreme points. (Unless cos θ1 = 1 or cos θ1 = cos θ2 , we will have three distinct points, and can already conclude that Φ lies in the interior of a segment of a plane within S.) Now, the assumption that cos θ2 6= 1 suffices to show that the decompositions (28) and (25) involve different sets of extreme points and, hence, that Φ can be written as a point on two distinct line segments in S. Therefore, there is a segment of a plane in S which contains Φ and for which Φ does not lie on the boundary of the plane (although the plane might be on the boundary of S). Thus we have proved the following. Lemma 11. Every map Φ in S lies in one of two disjoint sets which allows it to be characterized as follows. Either (I) Φ is a generalized extreme point of S, or (II) Φ is in the interior of a segment of a plane in S. Now let T denote the set of maps for which Φ ◦ T or, equivalently (−I) ◦ Φ, is in S. Since T is a convex set isomorphic to S, its elements can also be broken into two classes as above. The set of entanglement breaking maps is precisely S ∩ T . We can now prove Theorem 6 by showing that the convex hull of CQ maps is S ∩ T . Proof. Let Φ be in S ∩ T which is also a convex set. If Φ is a generalized extreme point of either S or T , then the only possibility consistent with Φ being entanglement-breaking is that it is CQ. Thus we suppose that Φ belongs to class II for both S and T . Then Φ lies within a plane in S and within a plane in T . The
intersection of these two planes is non-empty (since it contains Φ) and its intersection must contain a line segment in S ∩ T which contains Φ and for which Φ is not an endpoint. Therefore, Φ is not an extreme point of S ∩ T . Thus all possible extreme points of S ∩ T must be generalized extreme points of S or T , in which case they are CQ. Remark. Although this shows that all extreme points of S ∩ T are CQ maps, this need not hold for the various convex subsets, corresponding to allowed values of λk , tk in a fixed basis, discussed at the start of Sec. 4. The following remark shows that “most” points in the convex subset Λt,λ3 of the λ1 –λ2 plane can, in fact, be written as a convex combination of CQ maps in canonical form in the same basis. It also shows why it is necessary to go outside this region for those points close to the boundary. (a) First consider the set of entanglement-breaking maps with λ3 = 0, which is the convex set ∪t3 Ξt3 ,0 . Every extreme point must be an extreme point of the convex set Ξt 3 ,0 for some t3 . By Theorem 6, these are the maps for which √ 1 2 λτ λτ+ is unitary, which implies that either − 1−t3
(i) $t_1 = t_2 = 0$ and $(\lambda_1 \pm \lambda_2)^2 = 1 - t_3^2$, which implies that either $\lambda_1 = 0$ or $\lambda_2 = 0$ with $t_3^2 + \lambda_j^2 = 1$ for $j = 1$ or 2, or
(ii) $\lambda_1 = \lambda_2 = 0$ and $|t|^2 = 1$.
The first type of extreme point is obviously a CQ map; the second is a “point” channel which, as noted before, is a special case of a CQ map. Thus any map in $\Xi_{t_3,0}$ can be written as a convex combination of CQ maps in $\Xi_{t_3,0}$. Similar results hold if $\lambda_1 = 0$ or $\lambda_2 = 0$. Therefore, any entanglement breaking channel with some $\lambda_k = 0$ can be written as a convex combination of CQ channels with at most one nonzero $\lambda_k$ in the same basis. Thus any planar channel can be written as a convex combination of CQ channels in the same plane.
(b) Next consider entanglement-breaking maps with at most one nonzero $t_k$. We can assume, without loss of generality, that $t_1 = t_2 = 0$, in which case the conditions for complete positivity reduce to (20). Combining this with the sign change condition yields
$$(|\lambda_1| + |\lambda_2|)^2 \le (1 - |\lambda_3|)^2 - t_3^2 . \tag{29}$$
It follows that for each fixed value of $\lambda_3$ the set of allowable $(\lambda_1, \lambda_2)$ forms a square with corners $(0, \pm A_3)$, $(\pm A_3, 0)$ where $A_3 = \sqrt{(1 - |\lambda_3|)^2 - t_3^2}$. Thus, the extreme points of $\Lambda_{(0,0,t_3),\lambda_3}$ are planar channels which, by part (a), are in the convex hull of CQ channels. In particular, a map with $\lambda_1 = 0$, $\lambda_2 = \pm A_3$ can be written as a convex combination of CQ maps with either $\lambda_2 = 0$ or $\lambda_3 = 0$. However, these maps need not necessarily lie in $\Lambda_{(0,0,t_3),\lambda_3}$; we can only be sure that $\lambda_1 = 0$ and $t_1 = 0$, but not that $t_2 = 0$. Thus we can only state
that $\Lambda_{(0,0,t_3),\lambda_3}$ is in the convex hull of those CQ maps with $\lambda_j = 0$ and $t_j = 0$ for either $j = 1$ or 2. Although it may be necessary to enlarge the set $\Lambda_{(0,0,t_3),\lambda_3}$ in order to ensure that it is in the convex hull of some subset of CQ maps, these CQ maps will have the canonical form in the same basis, and the same value for $\lambda_3$ in that basis.
(c) Now consider the convex subset $\Lambda_{t,\lambda_3} \cap \Lambda_{t,-\lambda_3}$ of the $\lambda_1$–$\lambda_2$ plane corresponding to entanglement breaking maps with $t$, $|\lambda_3|$ fixed. These two regions intersect when either $\lambda_1 = 0$ or $\lambda_2 = 0$ (or, equivalently, $|\lambda_+| = |\lambda_-|$ where $\lambda_\pm = \lambda_1 \pm \lambda_2$). One can again use part (a) to see that these intersection points can be written as convex combinations of CQ maps in canonical form in the same basis. Since their convex hull has the same property, the resulting parallelogram, as shown in Fig. 4, is also a convex combination of CQ maps of the same type. Only for those points in the strip between the parallelogram and the boundary might one need to make a change of basis in order to write the maps as a convex combination of CQ maps.
(d) Now suppose $|\lambda_1| = |\lambda_2| = |\lambda_3| = \lambda > 0$. Since any two signs can be changed by conjugation with a Pauli matrix, $\Phi$ is unitarily equivalent to a map with $\lambda_1 = \lambda_2 = \lambda_3 = \pm\lambda$. One can then conjugate with another unitary matrix (corresponding to a rotation on the Bloch sphere) to conclude that $\Phi$ is unitarily equivalent to a channel $\Phi'$ with $t_1 = |t| = \sqrt{\sum_k t_k^2}$, and $t_2 = t_3 = 0$. It then follows from part (b) that $\Phi'$, and thus also $\Phi$, can be written as a convex combination of CQ channels which have the form described above in the rotated basis. However, these maps need not necessarily have the canonical form in the original basis.

Fig. 4. The region of the $\lambda_+$–$\lambda_-$ plane corresponding to entanglement breaking maps with $t = (0.4, 0.3, 0.0)$ and $\lambda_3 = 0.15$. The dotted lines show the convex hull of the intersection points, which are planar maps.

Fig. 5. The region of the $\lambda_+$–$\lambda_-$ plane corresponding to entanglement breaking maps with $t = (0.4, 0.3, 0.3742)$ and $\lambda_3 = 0.20$. Because the intersections of the axes with the boundary (at $\lambda_\pm = \pm 0.4$, for which all $|\lambda_k| = 0.2$) correspond to maps known to be in the convex hull of CQ maps, one can enlarge the convex hull of such maps from the dotted line to the octagon shown by the dashed line.
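The bound (29) from part (b), valid for $t_1 = t_2 = 0$, can be turned into a small membership test. The sketch below is illustrative only: the function names and the sample values of $\lambda_3$ and $t_3$ are assumptions chosen for demonstration and are not taken from the figures, which use $t_1, t_2 \neq 0$.

```python
import math

def square_half_diagonal(lam3, t3):
    """A_3 = sqrt((1 - |lam3|)^2 - t3^2), or None when the radicand is negative."""
    radicand = (1 - abs(lam3)) ** 2 - t3 ** 2
    if radicand < 0:
        return None   # no (lam1, lam2) can satisfy (29) in this case
    return math.sqrt(radicand)

def satisfies_bound_29(lam1, lam2, lam3, t3):
    """Test (|lam1| + |lam2|)^2 <= (1 - |lam3|)^2 - t3^2, assuming t1 = t2 = 0."""
    a3 = square_half_diagonal(lam3, t3)
    return a3 is not None and abs(lam1) + abs(lam2) <= a3

lam3, t3 = 0.15, 0.3                       # illustrative values (assumed)
a3 = square_half_diagonal(lam3, t3)
print(a3)
print(satisfies_bound_29(0.3, 0.3, lam3, t3))   # a point inside the square
print(satisfies_bound_29(0.5, 0.4, lam3, t3))   # a point outside the square
print(satisfies_bound_29(0.0, a3, lam3, t3))    # a corner (0, A_3) of the square
```

Writing the condition as $|\lambda_1| + |\lambda_2| \le A_3$ makes it evident that the allowed region is a square whose corners lie on the coordinate axes.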
Consider the region $\Lambda_{t,\lambda_3}$ with 0

$n - i$ and all $v_1, \ldots, v_j \in T_X L$, whereas for $j = n - i$ either $\omega^{(i,n-i)} = 0$ or there are $v_1, \ldots, v_j \in T_X^\perp L$ such that $v_j \lrcorner \cdots \lrcorner v_1 \lrcorner \omega_X^{(i,n-i)} \neq 0$. The uniqueness of this decomposition (for a fixed Riemannian metric) can also be seen in local coordinates adapted to the foliation of $M$, in which it amounts to collecting terms with a fixed number of differentials along normal coordinates.

Definition 10. An $n$-form $\omega$ is called pure of degree $(i, n - i)$ if it has the decomposition $\omega = \omega^{(i,n-i)}$. The space $\Lambda^{(i,n-i)}(T^*M) \subset \Lambda^n(T^*M)$ is the space of pure forms of degree $(i, n - i)$.

Note that these and the following definitions depend on the Riemannian metric chosen on $M$. We can similarly decompose the exterior derivative operator $d$ into two parts
$$d_\parallel : \Lambda^{(i,n-i)}(T^*M) \to \Lambda^{(i+1,n-i)}(T^*M) \quad\text{and}\quad d_\perp : \Lambda^{(i,n-i)}(T^*M) \to \Lambda^{(i,n+1-i)}(T^*M) .$$
Given a pure form $\omega \in \Lambda^{(i,n-i)}(T^*M)$, we define
$$d\omega = (d\omega)^{(i+1,n-i)} + (d\omega)^{(i,n+1-i)} =: d_\parallel \omega + d_\perp \omega .$$
Both derivative operators can be extended linearly so as to be defined on arbitrary forms. Choosing local coordinates $(X^\alpha, X^I)$ such that $X^\alpha$ parameterize the leaves in a neighborhood and $X^I$ the normal directions, the new derivative operators can be written as $d_\parallel = \partial_\alpha\, dX^\alpha \wedge$ and $d_\perp = \partial_I\, dX^I \wedge$. At this point the integrability of the normal distribution has been used: otherwise there would be an additional term in the decomposition of $d$ (see also [14]). This can be seen using the Cartan formula which for a 1-form $\omega$ reads
$$d\omega(v_1, v_2) = v_1\, \omega(v_2) - v_2\, \omega(v_1) - \omega([v_1, v_2]) .$$
Choosing $v_1, v_2 \in T^\perp L$ such that $[v_1, v_2] \notin T^\perp L$ (which by definition is possible only if the normal distribution is not integrable), there is always a 1-form $\omega \in \Lambda^{(1,0)}(T^*M)$ with $d\omega(v_1, v_2) \neq 0$. Thus, $d\omega$ has a nonvanishing contribution in $\Lambda^{(0,2)}(T^*M)$ and $d \neq d_\parallel + d_\perp$ for a nonintegrable normal distribution. One can see that there is only one additional term in the general case mapping $\Lambda^{(i,j)}(T^*M)$ to
$\Lambda^{(i-1,j+2)}$; but this would already complicate the descent equations derived below. Therefore, we will only deal with the integrable case from now on, for which we have

Lemma 5. $d_\parallel^2 = 0$, $\{d_\parallel, d_\perp\} := d_\parallel d_\perp + d_\perp d_\parallel = 0$, $d_\perp^2 = 0$.

Proof. It suffices to prove the assertion for actions on a pure form $\omega$ of arbitrary degree. We then have
$$0 = d^2\omega = d(d_\parallel \omega + d_\perp \omega) = d_\parallel^2 \omega + \{d_\parallel, d_\perp\}\omega + d_\perp^2 \omega .$$
Because the three terms in the sum are all pure of different degrees, they have to vanish separately.

Now we are in the position to proceed with the derivation of conditions for the existence of a compatible presymplectic form. The 2-form $\tilde\Omega$ introduced above is pure of degree $(2, 0)$ by construction, $\tilde\Omega = \tilde\Omega^{(2,0)}$. As already discussed in the paragraph preceding Proposition 9, $\tilde\Omega$ is not necessarily closed since $d_\perp \tilde\Omega \neq 0$ in general. Adding a form $\lambda \in \Lambda^2_0(M, F)$ leads to a new form $\Omega = \tilde\Omega^{(2,0)} + \Omega^{(1,1)} + \Omega^{(0,2)}$ which is closed if and only if
$$d\Omega = d_\perp \tilde\Omega^{(2,0)} + d_\parallel \Omega^{(1,1)} + d_\perp \Omega^{(1,1)} + d_\parallel \Omega^{(0,2)} + d_\perp \Omega^{(0,2)} = 0 .$$
Collecting forms of equal degree immediately leads to

Proposition 10. Let $(M, \Pi)$ be a regular Poisson manifold equipped with a Riemannian metric and an associated integrable decomposition of the tangential bundle. There is a presymplectic 2-form $\Omega = \Omega^{(2,0)} + \Omega^{(1,1)} + \Omega^{(0,2)}$ compatible with $\Pi$ on $M$ if and only if the descent equations
$$d_\parallel \Omega^{(1,1)} = -d_\perp \Omega^{(2,0)} , \qquad d_\parallel \Omega^{(0,2)} = -d_\perp \Omega^{(1,1)} , \qquad d_\perp \Omega^{(0,2)} = 0 \tag{20}$$
have a solution on $M$ subject to the condition that $\Omega^{(2,0)}$ restricted to any leaf in $M$ coincides with the symplectic form of that leaf.
In special cases of a foliation we can reformulate the conditions of the proposition.

Corollary 3. If $M$ is foliated trivially, i.e. it is of the form $M \cong L \times \mathbb{R}^k$, then the first equation in (20) implies
$$\partial_I \oint_\sigma \Omega^{(2,0)} = 0$$
where $\partial_I$ denotes any differentiation transversal to $L$ and $\sigma$ is a closed two-cycle in $L$. This means that the symplectic volume of any closed two-cycle in a leaf has to be constant in $M$.

This condition is violated, for example, for any family of homeomorphic coadjoint orbits of a compact, semisimple Lie algebra because the symplectic form of leaves in the dual Lie algebra with the Kirillov–Kostant structure depends nontrivially on the Casimir functions (the radial coordinate, e.g. for su(2)). In particular, according to the work of Kirillov, the (always discrete) set of irreducible unitary representations of a compact, semisimple Lie algebra corresponds to the set of all integral symplectic leaves in the corresponding Lie Poisson manifold, that is to leaves satisfying that $\oint_\sigma \Omega^{(2,0)}$ is an integer multiple of a fixed constant for any two-cycle $\sigma$ in the leaf. Correspondingly, $\oint_\sigma \Omega^{(2,0)}$ depends nontrivially on the leaf and Poisson manifolds of this kind do not allow compatible presymplectic forms. (In this argument we also used analyticity, which implies that $\partial_I \oint_\sigma \Omega^{(2,0)}$ cannot vanish on an interval if $\oint_\sigma \Omega^{(2,0)}$ is not constant. This means that any open subset of the dual Lie algebra contains (part of) a leaf violating the condition of the Corollary.)

We have another interesting situation if all leaves in $M$ have trivial second cohomology: $H^2(L) = 0$. If $M$ is foliated trivially, i.e. of the form $M \cong L \times \mathbb{R}^k$, one can easily verify that there is always a compatible presymplectic form:

Lemma 6. If $\tilde\Omega^{(2,0)}$ has a symplectic potential $\theta^{(1,0)}$ on any leaf $L$ in $M \cong L \times \mathbb{R}^k$, i.e. $\tilde\Omega^{(2,0)} = d_\parallel \theta^{(1,0)}$, and $\theta^{(1,0)}$ varies smoothly from leaf to leaf, then $\Omega := d\theta^{(1,0)}$ is a compatible presymplectic form on $M$. In particular, if all leaves $L$ in a trivially foliated $M \cong L \times \mathbb{R}^k$ have trivial second cohomology, then there exists a compatible presymplectic form on $M$.

This can also be derived using the descent equations which acquire the form (using Lemma 5) $d_\parallel \Omega^{(1,1)} = d_\parallel d_\perp \theta^{(1,0)}$, solved by $\Omega^{(1,1)} := d_\perp \theta^{(1,0)}$, leading to $d_\parallel \Omega^{(0,2)} = -d_\perp^2 \theta^{(1,0)} = 0$ in addition to $d_\perp \Omega^{(0,2)} = 0$. The latter two equations have an obvious solution $\Omega^{(0,2)} = 0$ implying
$$\Omega = \tilde\Omega^{(2,0)} + \Omega^{(1,1)} = d_\parallel \theta^{(1,0)} + d_\perp \theta^{(1,0)} = d\theta^{(1,0)} .$$
As a more explicit example$^{f}$ we look at the manifold $M = T^2 \times \mathbb{R}$ with Poisson bivector $\Pi = F(x_1, x_2, x_3)(\partial_1 + \omega \partial_2) \wedge \partial_3$ with an arbitrary function $F$ on $M$. Leaves are submanifolds subject to the condition $\omega x_1 - x_2 = 0$ and we can use $x_1$ and $x_3$ as local coordinates of a leaf. If we choose the normal distribution to be spanned by $\partial_2$, any 2-form $\Omega$ is split into $\Omega = \tilde\Omega^{(2,0)} + \Omega^{(1,1)} + \Omega^{(0,2)}$ with $\tilde\Omega^{(2,0)} = F^{-1}(x_1, x_2, x_3)\, dx_1 \wedge dx_3$, $\Omega^{(1,1)} = \mu_1\, dx_1 \wedge dx_2 + \mu_2\, dx_3 \wedge dx_2$, $\Omega^{(0,2)} = 0$, and only the first descent equation is nontrivial and takes the form
$$d_\parallel \Omega^{(1,1)} = (\partial_3 \mu_1 - \partial_1 \mu_2)\, dx_1 \wedge dx_2 \wedge dx_3 = -d_\perp \Omega^{(2,0)} = \partial_2 F^{-1}\, dx_1 \wedge dx_2 \wedge dx_3 .$$

$^{f}$ We thank A. Weinstein for suggesting to look at such an example.
This implies $\partial_3 \mu_1 - \partial_1 \mu_2 = \partial_2 F^{-1}$ which is solved by, e.g. $\mu_1 = \int \partial_2 F^{-1}\, dx_3$, $\mu_2 = 0$, yielding
$$\Omega = F^{-1}(x_1, x_2, x_3)\, dx_1 \wedge dx_3 + \left( \int \partial_2 F^{-1}\, dx_3 \right) dx_1 \wedge dx_2$$
as compatible presymplectic form. Note that $\Omega$ is well-defined globally even if $\omega$ is irrational, in which case the leaves are dense in the torus factor of $M$. However, if we change $M$ to be $T^2 \times S^1$, the $x_3$-integration in the second term of $\Omega$ is not periodic if there is a nonvanishing zero-mode in the Fourier decomposition of $\partial_2 F^{-1}$ with respect to $x_3$. This means that in such a case the characteristic form class of $\Pi$ on $T^2 \times S^1$ does not vanish. If $\omega$ is rational the leaves in $M$ have nontrivial second cohomology, whereas for irrational $\omega$ the leaves are of topology $\mathbb{R} \times S^1$ and so have trivial second cohomology, but $M$ is not foliated trivially. So in both cases, there is no contradiction to Lemma 6.

4.4. Leafwise symplectic embeddings of Poisson manifolds

Theorem 1 asserts that any Poisson manifold $(P, \Pi_P)$ has a symplectic realization. This provides an appropriate Poisson map from a symplectic manifold to the given Poisson manifold. As demonstrated in Sec. 3 above (cf. in particular Proposition 2 and diagram (11)), physically this is, for example, of interest in constrained systems with a closed algebra: any Poisson manifold (or at least any region of it contained in $\mathbb{R}^d$, $d = \dim P$) can be understood as arising from a constrained Hamiltonian system with appropriate constraint map $\phi$. The Dirac bracket and the considerations of the present section motivate the question for another map, going in a reverse direction. Clearly, the identity map from $(S, \omega)$ to $(S, \Pi_D)$ is not a Poisson map (here $\Pi_D$ is the Dirac bivector (16) defined for some second-class constraints; $S$ is the neighborhood of $C$ for which $\Pi_D$ exists). Still, $\Pi_D$ and $\omega$ are by no means unrelated; they are what we called compatible to one another.
In the language of maps, this may be rephrased as follows: the embedding map from (S, ΠD ) into (M, ω) is leafwise symplectic. Definition 11. A map from a Poisson manifold (P, ΠP ) to a symplectic manifold (M, ω) is leafwise symplectic (or leaf-symplectic), if its restriction to any symplectic leaf of (P, ΠP ) is a symplectic map. Remark. Recall that according to Definition 3 a map f between two symplectic manifolds (M1 , ω1 ) and (M2 , ω2 ) is symplectic, iff f ∗ ω2 = ω1 . Therefore f : (P, ΠP ) → (M, ω) is leaf-symplectic, iff (f ◦ ιL )∗ ω = ΩL for any leaf L of (P, ΠP ), ΩL being the induced symplectic 2-form on L. Proposition 11. A regular Poisson manifold (P, ΠP ) permits a leaf-symplectic embedding into some symplectic manifold, iff its characteristic form-class vanishes. Proof. According to Proposition 9, there exists a compatible presymplectic form on P iff the characteristic form-class of ΠP vanishes. Assuming that there exists a
leaf-symplectic embedding $f : P \to M$ from $P$ to $(M, \omega)$, $f^*\omega$ provides a compatible presymplectic form on $(P, \Pi_P)$. So the vanishing of the characteristic form-class is a necessary condition. On the other hand, if it vanishes, there exists a compatible presymplectic form $\Omega$ on $P$. According to Theorem 2, $(P, \Omega)$ can be embedded coisotropically into a symplectic manifold $(M, \omega)$. Denoting this embedding by $\iota$ and the embedding of a symplectic leaf $(L, \Omega_L)$ into $P$ by $\iota_L$, we have $\Omega_L = \iota_L^* \Omega = \iota_L^* \iota^* \omega = (\iota|_L)^* \omega$. Thus the condition is also sufficient.

Leaf-symplectic embeddings can be regarded as a concept related to isotropic symplectic realizations,$^{g}$ which are Poisson maps $r : (Y, \omega) \to (M, \Pi)$ from a symplectic manifold $(Y, \omega)$ to a Poisson manifold $(M, \Pi)$ such that the fibers of $r$ are isotropic (i.e. $TF \subset (TF)^\perp$ for any fiber $F = r^{-1}(m)$, $m \in M$). They are of interest for symplectic groupoids [18, 41], which also appeared recently in the context of Poisson Sigma Models in Ref. [42]. Obstructions for the existence of an isotropic symplectic realization have been derived [14, 43] which are of cohomological nature and similar to those derived here for the existence of a leaf-symplectic embedding.

$^{g}$ We are grateful to A. Weinstein for making us aware of this relation.

While the Dirac bracket, which provided the motivation for defining leaf-symplectic embeddings, is usually used only on symplectic manifolds $(M, \omega)$, there is a straightforward generalization of Definition 11 to the case of a Poisson target manifold:

Definition 12. A map from a Poisson manifold $(P_1, \Pi_1)$ to a Poisson manifold $(P_2, \Pi_2)$ is leaf-to-leaf symplectic if the image of any leaf $L_1$ of $P_1$ is symplectically embedded in a leaf $L_2$ of $P_2$.

Remark. (i) If $(P_2, \Pi_2)$ is symplectic, this clearly reduces to Definition 11.
(ii) Unlike leaf-symplectic embeddings, there always exists a leaf-to-leaf symplectic embedding of a Poisson manifold $(P, \Pi)$, namely the identity map.
(iii) Non-trivial examples of leaf-to-leaf symplectic embeddings are given by second-class submanifolds of a Poisson manifold as defined in part (iii), (b) of Definition 2 (see also the discussion following this definition). Other examples are cosymplectic submanifolds [13] and Dirac submanifolds [44] of a Poisson manifold; this follows from Corollary 2.11 and Theorem 2.3, (vi) of [44].

The situation in Definition 12 corresponds to a (generalized) Dirac bracket constructed for a family of second-class submanifolds $C$ in a (nonsymplectic) Poisson manifold $(P, \Pi)$ in the following way: we use the notation of Remark (iii) at the end of Sec. 4.1. However, for a degenerate Poisson structure $\Pi$ there is no symplectically orthogonal complement of $T_x C$. Instead, we use the fact that $C$ is, as a consequence of Definition 2, contained in a leaf $L$ of $P$ with symplectic structure $\Omega_L$ which defines the complement of $T_x C$ in $T_x L = T_x C \oplus T_x C^\perp$. Similarly, we have $T_x^* L = \mathrm{Ann}_L(T_x C^\perp) \oplus \mathrm{Ann}_L(T_x C)$. The projection $\bar\pi_1$ is now defined in two
steps: we first project $T_x^* P$ to $T_x^* P / \mathrm{Ann}_P(T_x L) \cong T_x^* L$, followed by a projection to the first factor in the decomposition of $T_x^* L$. Since $\mathrm{Ann}_P(T_x L) = \ker \Pi_x^\sharp$ the Poisson bivector factors through the first projection and $\Pi_D = \Pi \circ (\bar\pi_1 \otimes \bar\pi_1)$ defines a bivector which generalizes the Dirac bracket.

Acknowledgments

We thank A. Alekseev, M. Bordemann, P. Bressler, S. Lyakhovich, D. Sternheimer and in particular A. Weinstein for interesting discussions and suggestions, and L. Dittmann for help with drawing the diagrams. M. B. is grateful for support from NSF grant PHY00-90091 and the Eberly research funds of Penn State, and to A. Wipf and the TPI in Jena for hospitality during an essential part of the completion of this work. T. S. thanks the Erwin Schrödinger Institute in Vienna for hospitality during an inspiring workshop on Poisson geometry.

References

[1] P. A. M. Dirac, Lectures on Quantum Mechanics, Yeshiva University, Academic Press, New York, 1967.
[2] M. Kontsevich, Deformation quantization of Poisson manifolds, I, q-alg/9709040.
[3] A. Cannas da Silva and A. Weinstein, Geometric models for noncommutative algebras, Providence, RI, Amer. Math. Soc. (AMS), 1999.
[4] I. Vaisman, On the geometric quantization of Poisson manifolds, J. Math. Phys. 32 (1991), 3339–3345.
[5] V. Schomerus, D-branes and deformation quantization, JHEP 9906 (1999), 030.
[6] B. Jurco, P. Schupp and J. Wess, Noncommutative gauge theory for Poisson manifolds, Nucl. Phys. B584 (2000), 784–794.
[7] M. Flato, A. Lichnerowicz and D. Sternheimer, Deformations of Poisson brackets, Dirac brackets and applications, J. Math. Phys. 17 (1976), 1754–1762.
[8] A. Y. Alekseev, V. Schomerus and T. Strobl, Closed constraint algebras and path integrals for loop group actions, J. Math. Phys. 42 (2001), 2144–2155.
[9] I. A. Batalin and E. S. Fradkin, Operatorial quantization of dynamical systems subject to second class constraints, Nucl. Phys. B279 (1987), 514–528.
[10] I. A. Batalin and I. V. Tyutin, Existence theorem for the effective gauge algebra in the generalized canonical formalism with Abelian conversion of second class constraints, Int. J. Mod. Phys. A6 (1991), 3255.
[11] I. A. Batalin, M. A. Grigoriev and S. L. Lyakhovich, Star product for second class constraint systems from a BRST Theory, Theor. Math. Phys. 128 (2001), 1109.
[12] M. Grigoriev and S. Lyakhovich, Fedosov deformation quantization as a BRST theory, Comm. Math. Phys. 218 (2001), 437–457.
[13] A. Weinstein, The local structure of Poisson manifolds, J. Differential Geom. 18 (1983), 523–557.
[14] I. Vaisman, Lectures on the Geometry of Poisson Manifolds, Birkhäuser, Basel, 1994.
[15] R. A. Bertlmann, Anomalies in Quantum Field Theory, Clarendon Press, Oxford, 1996.
[16] G. Barnich, F. Brandt and M. Henneaux, Local BRST cohomology in gauge theories, Phys. Rep. 338 (2000), 439–569.
[17] M. Karasev, Analogues of objects of the theory of Lie groups for nonlinear Poisson brackets, Math. USSR Izvestiya 28 (1987), 497–527.
[18] A. Weinstein, Symplectic groupoids and Poisson manifolds, Bull. Amer. Math. Soc. 16 (1987), 101–104.
[19] M. J. Gotay, On coisotropic embeddings of presymplectic manifolds, Proc. Am. Math. Soc. 84 (1982), 111–114.
[20] V. Guillemin and S. Sternberg, Symplectic Techniques in Physics, Cambridge University Press, Cambridge, 1984.
[21] L. D. Faddeev and S. L. Shatashvili, Realization of the Schwinger term in the Gauss law and the possibility of correct quantization of a theory with anomalies, Phys. Lett. B167 (1986), 225–228.
[22] M. Bojowald and T. Strobl, Classical Solutions for Poisson Sigma Models on a Riemann Surface, JHEP 0307, 002 (2003).
[23] T. Strobl, Gravity in Two Spacetime Dimensions, Habilitation thesis, RWTH Aachen, June 1999, hep-th/0011240.
[24] P. Schaller and T. Strobl, Poisson structure induced (topological) field theories, Mod. Phys. Lett. A9 (1994), 3129–3136.
[25] N. Ikeda, Two-dimensional gravity and nonlinear gauge theory, Ann. Phys. 235 (1994), 435–464.
[26] M. Bojowald and A. Perez, Spin Foam Quantization and Anomalies, gr-qc/0303026.
[27] I. Vaisman, Symplectic Geometry and Secondary Characteristic Classes, Birkhäuser, Basel, 1987.
[28] T. J. Courant, Dirac manifolds, Trans. Amer. Math. Soc. 319 (1990), 631–661.
[29] Z. J. Liu, A. Weinstein and P. Xu, Manin triples for Lie bialgebroids, J. Differential Geom. 45 (1997), 547–574.
[30] R. U. Sexl and H. K. Urbantke, Relativity, Groups, Particles: Special Relativity and Relativistic Symmetry in Field and Particle Physics, Springer-Verlag, New York, 2001.
[31] A. Weinstein, Coisotropic calculus and Poisson groupoids, J. Math. Soc. Japan 40 (1988), 705–727.
[32] P. Libermann and Ch.-M. Marle, Symplectic Geometry and Analytical Mechanics, D. Reidel Publ. Comp., Dordrecht-Boston, 1987.
[33] R. Bott and L. M. Tu, Differential Forms in Algebraic Topology, Graduate Texts in Mathematics 82, Springer-Verlag, New York, 1991.
[34] V. N. Gribov, Quantization of nonabelian gauge theories, Nucl. Phys. B139, 1 (1978).
[35] M. Henneaux and C. Teitelboim, Quantization of Gauge Systems, Princeton University Press, Princeton, 1992.
[36] S. Lyakhovich and R. Marnelius, Extended observables in theories with constraints, Int. J. Mod. Phys. A 16 (2001), 4271–4296.
[37] C. Klimčík and T. Strobl, WZW-Poisson manifolds, J. Geom. Phys. 43 (2002), 341–344.
[38] J.-S. Park, Topological Open P-Branes, hep-th/0012141.
[39] P. Severa and A. Weinstein, Poisson geometry with a 3-form background, Prog. Theor. Phys. Suppl. 144 (2001), 145–154.
[40] J. E. Marsden and T. S. Ratiu, Introduction to Mechanics and Symmetry, Springer, New York, 1999.
[41] M. V. Karasev and V. P. Maslov, Nonlinear Poisson Brackets: Geometry and Quantization, Translations of Mathematical Monographs 119, AMS, Providence, 1993.
[42] A. S. Cattaneo and G. Felder, Poisson Sigma Models and Symplectic Groupoids, math.SG/0003023.
[43] D. Dazord and T. Delzant, Le problème général des variables actions-angles, J. Differential Geom. 26 (1987), 223–251.
[44] P. Xu, Dirac Submanifolds and Poisson Involutions, math.SG/0110326.
November 5, 2003 9:42 WSPC/148-RMP
00173
Reviews in Mathematical Physics Vol. 15, No. 7 (2003) 705–743 © World Scientific Publishing Company

THE POISSON BRACKET FOR POISSON FORMS IN MULTISYMPLECTIC FIELD THEORY

MICHAEL FORGER∗
Departamento de Matemática Aplicada, Instituto de Matemática e Estatística, Universidade de São Paulo, Caixa Postal 66281, BR–05311-970 São Paulo, S.P., Brazil
[email protected]

CORNELIUS PAUFLER† and HARTMANN RÖMER‡
Fakultät für Physik, Albert-Ludwigs-Universität Freiburg im Breisgau, Hermann-Herder-Straße 3, D–79104 Freiburg i.Br., Germany
† [email protected]
‡ [email protected]

Received 4 November 2002
Revised 17 April 2003

We present a general definition of the Poisson bracket between differential forms on the extended multiphase space appearing in the geometric formulation of first order classical field theories and, more generally, on exact multisymplectic manifolds. It is well defined for a certain class of differential forms that we propose to call Poisson forms and turns the space of Poisson forms into a Lie superalgebra.

Keywords: Geometric field theory; multisymplectic geometry; Poisson brackets.
∗ Partially supported by CNPq, Brazil.
‡ Partially supported by FAPESP, Brazil.

1. Introduction

The multiphase space approach to classical field theory, whose origins can be traced back to the early work of Hermann Weyl on the calculus of variations, has recently undergone a rapid development, but a number of conceptual questions is still open. The basic idea behind all attempts to extend the covariant formulation of classical field theory from the Lagrangian to the Hamiltonian domain is to treat spatial derivatives on the same footing as time derivatives. This requires associating to each field component $\varphi^i$ not just its standard canonically conjugate momentum $\pi_i$ but rather $n$ conjugate momenta $\pi_i^\mu$, where $n$ is the dimension of space-time. If one starts out from a Lagrangian $L$ depending on the field and its first partial derivatives, these are obtained by the covariant Legendre transformation
$$\pi_i^\mu = \frac{\partial L}{\partial\, \partial_\mu \varphi^i} .$$
This allows one to rewrite the standard Euler–Lagrange equations of field theory,
$$\partial_\mu \frac{\partial L}{\partial\, \partial_\mu \varphi^i} - \frac{\partial L}{\partial \varphi^i} = 0 ,$$
as a covariant first order system, the covariant Hamiltonian equations or De Donder–Weyl equations
$$\frac{\partial H}{\partial \pi_i^\mu} = \partial_\mu \varphi^i , \qquad \frac{\partial H}{\partial \varphi^i} = -\partial_\mu \pi_i^\mu$$
where H = πiµ ∂µ ϕi − L is the covariant Hamiltonian density or De Donder-Weyl Hamiltonian. Multiphase space (ordinary as well as extended) is the geometric environment built by appropriately patching together local coordinate systems of the form (q i , pµi ) — instead of the canonically conjugate variables (q i , pi ) of mechanics — together with space-time coordinates xµ and, in the extended version, a further energy type variable that we shall denote by p (without any index). In the recent literature on the subject, special attention has been devoted to the so-called multisymplectic form ω which is, except for a sign, the exterior derivative of another form θ that we propose to call the multicanonical form: both are naturally defined on extended multiphase space and are the geometric objects replacing, respectively, the symplectic form ω = dq i ∧ dpi and the canonical form θ = pi dq i of Hamiltonian mechanics (on cotangent bundles), or more precisely, the symplectic form ω = dq i ∧ dpi + dt ∧ dE and the canonical form θ = pi dq i + E dt of Hamiltonian mechanics (on cotangent bundles) for non-autonomous systems. Additional motivation and precise definitions will be given in the next section, and a table confronting the most relevant concepts of the field theoretical formalism with their counterparts in Hamiltonian mechanics can be found at the end of the paper. The advantage of such an approach as compared to the orthodox strategy of treating field theoretical models as infinite-dimensional dynamical systems is threefold. First, general covariance (and in particular, Lorentz covariance) is trivially achieved. Second, by working on multiphase space which is a finite-dimensional manifold, one automatically avoids all the functional analytic complications that plague the orthodox method. 
Third, space-time locality is also automatically guaranteed, since one works with the field variables and their first derivatives or conjugates of these at single points of space-time, rather than with fields defined over entire hypersurfaces: integration is deferred to the very last step of every procedure. Of course, there is also a price to be paid for all these benefits, namely that the obvious duality of classical mechanics between coordinates and momenta is lost.
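To make the passage from the Euler-Lagrange equations to the De Donder-Weyl equations stated above concrete, the following sketch works it out for a single real scalar field. The example itself, the potential $V$ and the use of a flat metric to raise and lower indices are assumptions made for illustration and are not taken from the text.

```latex
% Sketch: De Donder-Weyl equations for one real scalar field (assumed example).
\begin{align*}
L &= \tfrac12\,\partial_\mu\varphi\,\partial^\mu\varphi - V(\varphi), &
\pi^\mu &= \frac{\partial L}{\partial\,\partial_\mu\varphi} = \partial^\mu\varphi, \\
H &= \pi^\mu\,\partial_\mu\varphi - L = \tfrac12\,\pi_\mu\pi^\mu + V(\varphi), \\
\frac{\partial H}{\partial\pi^\mu} &= \pi_\mu = \partial_\mu\varphi, &
\frac{\partial H}{\partial\varphi} &= V'(\varphi) = -\,\partial_\mu\pi^\mu .
\end{align*}
% Eliminating \pi^\mu gives \partial_\mu\partial^\mu\varphi + V'(\varphi) = 0,
% which coincides with the Euler-Lagrange equation of L.
```

Each field component thus acquires one momentum per space-time direction, which is why the conjugate variables carry an extra index $\mu$.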
November 5, 2003 9:42 WSPC/148-RMP
00173
The Poisson Bracket for Poisson Forms in Multisymplectic Field Theory
As a result, there is no evident multiphase space quantization procedure. What seems to be needed is a new and more sophisticated concept of “multi-duality” to replace the standard duality underlying the canonical commutation relations. Certainly, an important step towards a better understanding of what might be the nature of this “multi-duality” and that of a multiphase space quantization procedure is the construction of Poisson brackets within this formalism. After all, the Poisson bracket should be the classical limit of the commutator of quantum theory. Surprisingly, this is to a large extent still an open problem. Our approach to the question has been motivated by the work of Kanatchikov [1, 2], who seems to have been the first to propose a Poisson bracket between differential forms of arbitrary degree in multimomentum variables and to analyze the restrictions that must be imposed on these forms in order to make this bracket well-defined: he uses the term “Hamiltonian form” in this context, although the concept as such is of course much older. It must be pointed out, however, that Kanatchikov’s approach is essentially local and makes extensive use of features that have no invariant geometric meaning, such as a systematic splitting into horizontal and vertical parts; moreover, his definition of Hamiltonian forms is too restrictive. We avoid all these problems by working exclusively within the multisymplectic framework and on the extended multiphase space, instead of the ordinary one: this leads naturally to a definition of the concept of a Poisson form which is more general than Kanatchikov’s notion of a Hamiltonian form, as well as to a coordinate-independent definition of the Poisson bracket between any two such forms. 
In fact, most of the concepts involved do not even depend on the explicit construction of extended multiphase space but only on its structure as an exact multisymplectic manifold, and we shall make use of this fact in order to simplify the treatment whenever possible.

The paper is organized as follows. In Sec. 2, we give a brief review of some salient features of the multiphase space approach to the geometric formulation of first order classical field theories, following Ref. [3] and, in particular, Ref. [4], to which the reader is referred for more details and for the discussion of many relevant examples; this material is included here mainly in order to fix notation and make our presentation reasonably self-contained. The main point is to show that the extended multiphase space of field theory does carry the structure of an exact multisymplectic manifold (in fact it seems to be the only known example of a multisymplectic manifold). In Sec. 3, we introduce the concept of a Poisson form on a general multisymplectic manifold, specify the notion of an exact multisymplectic manifold, define the Poisson bracket between Poisson forms on exact multisymplectic manifolds and prove our main theorem, which states that this bracket satisfies the usual axioms of a Lie superalgebra. The construction generalizes the corresponding one for Hamiltonian $(n-1)$-forms on the extended multiphase space of field theory given by two of the present authors in a previous paper [5]: the idea is to modify the standard formula that had been adopted for decades [6–11], even though it fails to satisfy the Jacobi identity, by adding a judiciously chosen exact form that turns out to cure the defect. Here, we show that the same trick works
M. Forger, C. Paufler & H. Römer
for forms of arbitrary degree, provided one introduces appropriate sign factors. In both cases, it is the structure of the correction term that requires the underlying manifold to be exact multisymplectic and not just multisymplectic. In Sec. 4, we define the notion of an exact Hamiltonian multivector field on an exact multisymplectic manifold and show that by contraction with the multicanonical form $\theta$, any such multivector field gives rise to a Poisson form; moreover, this simple prescription yields an antihomomorphism of Lie superalgebras (with respect to the Schouten bracket of multivector fields and the Poisson bracket of Poisson forms introduced here). It can be viewed as an extension, from vector fields to multivector fields, of the universal part of the covariant momentum map [4], which is the geometric version of the construction of Noether currents and the energy-momentum tensor in field theory, and we shall therefore refer to it as the universal multimomentum map. In Sec. 5, we return to the case of extended multiphase space and discuss other examples for the construction of Poisson forms. More specifically, we show that arbitrary functions are Poisson forms (of degree 0) and find that Kanatchikov’s Hamiltonian forms, when pulled back from ordinary to extended multiphase space by means of the appropriate projection, constitute a special class of Poisson forms. The complete determination of the space of Poisson forms of arbitrary degree $> 0$ on extended multiphase space, together with that of exact Hamiltonian and locally Hamiltonian multivector fields of arbitrary degree $< n$, is a technically demanding problem whose solution will be presented elsewhere [12]. 
The paper concludes with two appendices: the first presents a number of important formulas from the multivector calculus on manifolds, related to the definition and main properties of the Schouten bracket and the Lie derivative of differential forms along multivector fields, while the second shows how, given a connection in a fiber bundle, one can construct induced connections in various other fiber bundles derived from it, including the multiphase spaces of geometric field theory; this possibility is important for the comparison of the multisymplectic formalism with other approaches that have been proposed in the literature and to a certain extent depend on the a priori choice of a connection.

Recently, the problem of constructing Poisson brackets has also been addressed in the context of other formalisms such as the one based on n-symplectic manifolds [13] (see [14] for a recent overview) or that of Lepage-Dedecker which is more general than that of De Donder-Weyl [15].

Finally, we would like to point out that there exists another construction of a covariant Poisson bracket in classical field theory, based on the same functional approach that underlies the construction of “covariant phase space” of Crnkovic-Witten [16, 17] and Zuckerman [18]. This bracket, originally due to Peierls [19] and further elaborated by de Witt [20, 21] (see also [22] for a recent exposition), has been adapted to the multiphase space approach by Romero [23] and shown to be precisely the Poisson bracket associated with the symplectic form on covariant phase space introduced in Refs. [16, 17] and [18]; these results will be presented elsewhere [24]. It would be interesting to identify the relation between that bracket and the one introduced here; this question is presently under investigation.
2. Multiphase Spaces in Geometric Field Theory

The starting point for the geometric formulation of classical field theory is the choice of a configuration bundle, which in general will be a fiber bundle over space-time whose sections are the fields of the theory under consideration. In what follows, we shall denote its total space by $E$, its base space by $M$, its typical fiber by $Q$ and the projection from $E$ to $M$ by $\pi$; the dimensions are
$$ \dim M = n , \qquad \dim Q = N , \qquad \dim E = n + N . \tag{2.1} $$
In field theoretical models, $M$ is interpreted as space-time whereas $Q$ is the configuration space of the theory, that is, a manifold whose (local) coordinates describe internal degrees of freedom.$^a$ The total space $E$ is locally but not necessarily globally isomorphic to the Cartesian product $M \times Q$, but it must be stressed that even when the configuration bundle is globally trivial, there will in general not exist any preferred trivialization, and it is precisely the freedom to change trivialization that allows one to incorporate gauge theories into the picture. Another point that deserves to be emphasized is that the configuration bundle does not in general carry any additional structures: these only appear when one focuses on special classes of field theories.
• Vector bundles arise naturally in theories with linear matter fields and also in general relativity: the metric tensor is an example.
• Affine bundles can be employed to incorporate gauge fields, since connections in a principal $G$-bundle $P$ over space-time $M$ can be viewed as sections of the connection bundle of $P$, an affine bundle $CP$ over $M$ constructed from $P$.
• General fiber bundles are used to handle nonlinear matter fields, in particular those corresponding to maps from space-time $M$ to some target manifold $Q$: standard examples are the nonlinear sigma models.
In order to cover this variety of situations, the general constructions on which the geometric formulation of classical field theory is based must not depend on the choice of any additional structure on the configuration bundle. This requirement is naturally satisfied in the multiphase space formalism, in contrast to the majority of similar approaches that have over the last few decades found their way into the literature: most of these depend on the a priori choice of a connection in the configuration bundle, thus excluding gauge theories in which connections must be treated as dynamical variables and not as fixed background fields. 
$^a$ This interpretation is turned around in the theory of strings and membranes.
$^b$ The term “first order” refers to the fact that the Lagrangian is supposed to be a pointwise defined function of the coordinates or fields and of their derivatives or partial derivatives of no more than first order; higher order derivatives should be eliminated, e.g., by introducing appropriate auxiliary variables.

The multiphase space approach to first order classical field theory follows the same general pattern as the standard formalism of classical mechanics on the tangent and cotangent bundle of a configuration space $Q$ [25, 26].$^b$ However, the
correspondence between the objects and concepts underlying the geometric formulation of mechanics and that of field theory becomes fully apparent only when one reformulates mechanics so as to incorporate the time dimension. (This is standard practice, e.g., in the study of non-autonomous systems, that is, mechanical systems whose Lagrangian/Hamiltonian depends explicitly on time, such as systems of particles in time-dependent external fields. Additional motivation is provided by relativistic mechanics where Newton’s concept of absolute time is abandoned and hence there is no place for an extraneous, absolute time variable that can be kept entirely separate from the arena where the dynamical phenomena take place.) In its simplest version, this reformulation amounts to replacing the configuration space Q by the extended configuration space R × Q and the velocity phase space T Q (the tangent bundle of Q) by the extended velocity phase space R × T Q, where R stands for the time axis. The usual momentum phase space T ∗ Q (the cotangent bundle of Q) admits two different extensions: the simply extended phase space R×T ∗ Q, where R represents the time variable, and the doubly extended phase space R × T ∗ Q × R, where the first copy of R represents the time variable whereas the second copy of R represents an energy variable. This second extension is required if one wants to maintain a symplectic structure, rather than just a contact structure, for extended phase space, since energy is the physical quantity canonically conjugate to time. A further generalization appears when one considers mechanical systems in external gauge fields, since time-dependent gauge transformations do not respect the direct product structure of the extended configuration and phase spaces mentioned above. 
What does remain invariant under such transformations are certain projections, namely the projection from the extended configuration space onto the time axis, the projections from the various extended phase spaces onto extended configuration space and, finally, the projection from the doubly extended to the simply extended phase space which amounts to “forgetting the additional energy variable”.

In passing to field theory, we must replace the time axis $\mathbb{R}$ by the space-time manifold $M$, the extended configuration space $\mathbb{R} \times Q$ by the configuration bundle $E$ over $M$ introduced above and the extended velocity phase space $\mathbb{R} \times TQ$ by the jet bundle $JE$ of $E$.$^c$ It is well known that $JE$ is, unlike the tangent bundle of a manifold, in general only an affine bundle over $E$ (of fiber dimension $Nn$) and not a vector bundle; the corresponding difference vector bundle over $E$ (also of fiber dimension $Nn$) will be called the linearized jet bundle of $E$ and be denoted by $\vec{J}E$. This leads to the possibility of forming two kinds of dual: the linear dual of $\vec{J}E$, denoted here by $\vec{J}^{*}E$, and the affine dual of $JE$, denoted here by $J^{\star}E$; both of them are vector bundles over $E$ (of fiber dimension $Nn$ and $Nn + 1$, respectively). Even more important are their twisted versions, obtained by taking the tensor product with the line bundle of volume forms on $M$, pulled back to $E$ via $\pi$: this gives rise to the twisted linear dual of $\vec{J}E$, called ordinary multiphase space and denoted here by $\vec{J}^{\circledast}E$, and the twisted affine dual of $JE$, called extended multiphase space and denoted here by $J^{\circledast}E$; both of them, once again, are vector bundles over $E$ (of fiber dimension $Nn$ and $Nn + 1$, respectively). The former replaces the simply extended phase space $\mathbb{R} \times T^{*}Q$ of mechanics whereas the latter replaces the doubly extended phase space $\mathbb{R} \times T^{*}Q \times \mathbb{R}$ of mechanics. Moreover, in both cases (twisted or untwisted), there is a natural projection $\eta$ that, as in mechanics, can be interpreted as “forgetting the additional energy variable”: it turns $J^{\circledast}E$ into an affine line bundle over $\vec{J}^{\circledast}E$ and, similarly, $J^{\star}E$ into an affine line bundle over $\vec{J}^{*}E$. The most remarkable property of extended multiphase space is that it is an exact multisymplectic manifold: it carries a naturally defined multicanonical form $\theta$, of degree $n$, whose exterior derivative is, up to a sign, the multisymplectic form $\omega$, of degree $n + 1$, replacing the canonical form $\theta$ and the symplectic form $\omega$, respectively, on the doubly extended phase space $\mathbb{R} \times T^{*}Q \times \mathbb{R}$ of mechanics.

$^c$ We consider only first order jet bundles and therefore omit the index “1” used by many authors.

The global construction of the first order jet bundle $JE$ and the linearized first order jet bundle $\vec{J}E$ associated with a given fiber bundle $E$ over a manifold $M$, as well as that of the various duals mentioned above, is quite easy to understand. (Higher order jet bundles are somewhat harder to deal with, but we will not need them in this paper.) Given a point $e$ in $E$ with base point $x = \pi(e)$ in $M$, the fiber $J_e E$ of $JE$ at $e$ consists of all linear maps from the tangent space $T_x M$ of the base space $M$ at $x$ to the tangent space $T_e E$ of the total space $E$ at $e$ whose composition with the tangent map $T_e \pi : T_e E \to T_x M$ to the projection $\pi : E \to M$ gives the identity on $T_x M$:
$$ J_e E = \{ u_e \in L(T_x M, T_e E) \mid T_e \pi \circ u_e = \mathrm{id}_{T_x M} \} . \tag{2.2} $$
Thus the elements of $J_e E$ are precisely the candidates for the tangent maps at $x$ to (local) sections $\varphi$ of the bundle $E$ satisfying $\varphi(x) = e$. Obviously, $J_e E$ is an affine subspace of the vector space $L(T_x M, T_e E)$ of all linear maps from $T_x M$ to the tangent space $T_e E$, the corresponding difference vector space being the vector space of all linear maps from $T_x M$ to the vertical subspace $V_e E$:
$$ \vec{J}_e E = L(T_x M, V_e E) . \tag{2.3} $$
The jet bundle $JE$ thus defined admits two different projections, namely the target projection $\tau_{JE} : JE \to E$ and the source projection $\sigma_{JE} : JE \to M$ which is simply its composition with the original projection $\pi$, that is, $\sigma_{JE} = \pi \circ \tau_{JE}$. It is easily shown that $JE$ is a fiber bundle over $M$ with respect to $\sigma_{JE}$, in general without any additional structure, but it is an affine bundle over $E$ with respect to $\tau_{JE}$, the corresponding difference vector bundle being the vector bundle over $E$ of linear maps from the pull-back of the tangent bundle of the base space by the projection $\pi$ to the vertical bundle of $E$:
$$ \vec{J}E = L(\pi^{*} TM, VE) . \tag{2.4} $$
The affine structure of the jet bundle $JE$ over $E$, as well as the linear structure of the linearized jet bundle $\vec{J}E$ over $E$, can also be read off directly from local coordinate expressions. Namely, choosing local coordinates $x^\mu$ for $M$, local coordinates $q^i$ for $Q$ and a local trivialization of $E$ induces naturally a local coordinate system $(x^\mu, q^i, q^i_\mu)$ for $JE$, as well as a local coordinate system $(x^\mu, q^i, \vec{q}^{\,i}_\mu)$ for $\vec{J}E$: such coordinates will simply be referred to as adapted local coordinates. Moreover, a transformation to new local coordinates $x'^\kappa$ for $M$, new local coordinates $q'^k$ for $Q$ and a new local trivialization of $E$, according to
$$ x'^\kappa = x'^\kappa(x^\mu) , \qquad q'^k = q'^k(x^\mu, q^i) \tag{2.5} $$
induces naturally a transformation to new adapted local coordinates $(x'^\kappa, q'^k, q'^k_\kappa)$ for $JE$ and $(x'^\kappa, q'^k, \vec{q}^{\,\prime k}_\kappa)$ for $\vec{J}E$ given by Eq. (2.5) and
$$ q'^k_\kappa = q'^k_\kappa(x^\mu, q^i, q^i_\mu) , \qquad \vec{q}^{\,\prime k}_\kappa = \vec{q}^{\,\prime k}_\kappa(x^\mu, q^i, \vec{q}^{\,i}_\mu) , \tag{2.6} $$
where
$$ q'^k_\kappa = \frac{\partial x^\mu}{\partial x'^\kappa} \frac{\partial q'^k}{\partial q^i} \, q^i_\mu + \frac{\partial x^\mu}{\partial x'^\kappa} \frac{\partial q'^k}{\partial x^\mu} , \qquad \vec{q}^{\,\prime k}_\kappa = \frac{\partial x^\mu}{\partial x'^\kappa} \frac{\partial q'^k}{\partial q^i} \, \vec{q}^{\,i}_\mu . \tag{2.7} $$
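The transformation law (2.7) can be recovered from the chain rule. The sketch below assumes a local section $\varphi$ of $E$ whose first derivatives represent the jet, so that $q^i_\mu = \partial_\mu \varphi^i$ at the point in question; this identification is used for illustration only.

```latex
% Sketch: Eq. (2.7) from the chain rule, with q^i_\mu = \partial_\mu\varphi^i
% along a local section \varphi (assumed representative of the jet).
\begin{align*}
q'^k_\kappa
 &= \frac{\partial}{\partial x'^\kappa}\, q'^k\bigl(x^\mu, \varphi^i(x)\bigr)
  = \frac{\partial x^\mu}{\partial x'^\kappa}
    \left( \frac{\partial q'^k}{\partial x^\mu}
         + \frac{\partial q'^k}{\partial q^i}\,\partial_\mu\varphi^i \right) \\
 &= \frac{\partial x^\mu}{\partial x'^\kappa}\frac{\partial q'^k}{\partial q^i}\, q^i_\mu
  + \frac{\partial x^\mu}{\partial x'^\kappa}\frac{\partial q'^k}{\partial x^\mu} .
\end{align*}
% The difference of two jets over the same point of E cancels the
% inhomogeneous term, which yields the linear law for \vec{q}^{\,i}_\mu.
```

The inhomogeneous second term is what makes $JE$ an affine rather than a vector bundle over $E$.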
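Before going on, we pause to fix some notation concerning differential forms, for which we shall in terms of local coordinates $x^\mu$ use the following conventions:
$$ d^n x = dx^1 \wedge \cdots \wedge dx^n , \tag{2.8} $$
$$ d^n x_\mu = i_{\partial_\mu} d^n x = (-1)^{\mu - 1} \, dx^1 \wedge \cdots \wedge dx^{\mu - 1} \wedge dx^{\mu + 1} \wedge \cdots \wedge dx^n , \tag{2.9} $$
$$ d^n x_{\mu\nu} = i_{\partial_\nu} i_{\partial_\mu} d^n x , \quad \ldots , \quad d^n x_{\mu_1 \ldots \mu_r} = i_{\partial_{\mu_r}} \cdots i_{\partial_{\mu_1}} d^n x . \tag{2.10} $$
Then
$$ i_{\partial_\mu} d^n x_{\mu_1 \ldots \mu_r} = d^n x_{\mu_1 \ldots \mu_r \mu} , \tag{2.11} $$
$$ dx^\kappa \wedge d^n x_\mu = \delta^\kappa_\mu \, d^n x , \tag{2.12} $$
$$ dx^\kappa \wedge d^n x_{\mu\nu} = \delta^\kappa_\nu \, d^n x_\mu - \delta^\kappa_\mu \, d^n x_\nu , \tag{2.13} $$
whereas
$$ dx^\kappa \wedge d^n x_{\mu_1 \ldots \mu_r} = \sum_{p=1}^{r} (-1)^{r-p} \, \delta^\kappa_{\mu_p} \, d^n x_{\mu_1 \ldots \mu_{p-1} \mu_{p+1} \ldots \mu_r} . \tag{2.14} $$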
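As a quick consistency check of these conventions, the relations (2.9), (2.12) and (2.13) can be written out by hand in the lowest nontrivial dimension. The choice $n = 2$ below is an assumption made purely for illustration.

```latex
% Sketch: conventions (2.8)-(2.13) written out for n = 2 (assumed special case).
\begin{align*}
d^2x &= dx^1 \wedge dx^2 , &
d^2x_1 &= i_{\partial_1} d^2x = dx^2 , &
d^2x_2 &= i_{\partial_2} d^2x = -\,dx^1 , \\
dx^1 \wedge d^2x_1 &= d^2x , &
dx^2 \wedge d^2x_2 &= d^2x , &
dx^1 \wedge d^2x_2 &= 0 = dx^2 \wedge d^2x_1 , \\
d^2x_{12} &= i_{\partial_2} i_{\partial_1} d^2x = 1 , &
dx^1 \wedge d^2x_{12} &= dx^1 = -\,d^2x_2 , &
dx^2 \wedge d^2x_{12} &= dx^2 = d^2x_1 .
\end{align*}
% The second row reproduces (2.12); the last two entries of the third row
% reproduce (2.13) for (\mu,\nu) = (1,2) with \kappa = 1 and \kappa = 2.
```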
Moreover, these (local) forms on $M$ are lifted to (local) forms on $E$ by pull-back with the projection $\pi_E$, and later (local) forms on $E$ will be lifted to (local) forms on total spaces of bundles over $E$ by pull-back with the respective projection, without change of notation.

The dual $J^{\star}E$ of the jet bundle $JE$ and the dual $\vec{J}^{*}E$ of the linearized jet bundle $\vec{J}E$ are obtained according to the standard rules for defining the dual of an affine space and of a vector space, respectively. In particular, these rules state that if $A$ is an affine space of dimension $k$ over $\mathbb{R}$, its dual $A^{\star}$ is the space $A(A, \mathbb{R})$ of affine maps from $A$ to $\mathbb{R}$, which is a vector space of dimension $k + 1$. Thus the dual or, more precisely, affine dual $J^{\star}E$ of the jet bundle $JE$ and the dual or, more precisely, linear dual $\vec{J}^{*}E$ of the linearized jet bundle $\vec{J}E$ are obtained by defining their fiber over any point $e$ in $E$ to be the vector space
$$ J_e^{\star} E = \{ z_e : J_e E \to \mathbb{R} \ \text{affine} \} , \tag{2.15} $$
and the vector space
$$ \vec{J}_e^{*} E = \{ \vec{z}_e : \vec{J}_e E \to \mathbb{R} \ \text{linear} \} , \tag{2.16} $$
respectively. However, as mentioned before, the multiphase spaces of field theory are defined with an additional twist, replacing the real line by the one-dimensional space of volume forms on the base manifold $M$ at the appropriate point. Thus the twisted (affine) dual $J^{\circledast}E$ of the jet bundle $JE$ and the twisted (linear) dual $\vec{J}^{\circledast}E$ of the linearized jet bundle $\vec{J}E$ are obtained from the corresponding ordinary (untwisted) duals by taking the tensor product with the line bundle of volume forms on the base manifold $M$, pulled back to the total space $E$ via the projection $\pi$, i.e. we put
$$ J^{\circledast}E = J^{\star}E \otimes \pi^{*}\bigl( \textstyle\bigwedge^{n} T^{*}M \bigr) , \tag{2.17} $$
and
$$ \vec{J}^{\circledast}E = \vec{J}^{*}E \otimes \pi^{*}\bigl( \textstyle\bigwedge^{n} T^{*}M \bigr) , \tag{2.18} $$
respectively, which means that if $x = \pi(e)$, we set
$$ J_e^{\circledast}E = \{ z_e : J_e E \to \textstyle\bigwedge^{n} T_x^{*}M \ \text{affine} \} , \tag{2.19} $$
and
$$ \vec{J}_e^{\circledast}E = \{ \vec{z}_e : \vec{J}_e E \to \textstyle\bigwedge^{n} T_x^{*}M \ \text{linear} \} , \tag{2.20} $$
respectively. As is the case for the jet bundle itself, the linearized jet bundle and the various types of dual bundles introduced here all admit two different projections, namely the target projection $\tau_{\ldots}$ onto $E$ and the source projection $\sigma_{\ldots}$ onto $M$ which is simply its composition with the original projection $\pi$, that is, $\sigma_{\ldots} = \pi \circ \tau_{\ldots}$. It is easily shown that all of them are fiber bundles over $M$ with respect to $\sigma_{\ldots}$, in general without any additional structure, but, as stated before, they are vector bundles over $E$ with respect to $\tau_{\ldots}$. The global linear structure of these bundles over $E$ also becomes clear in local coordinates. Namely, choosing local coordinates $x^\mu$ for $M$, local coordinates $q^i$ for $Q$ and a local trivialization of $E$ induces naturally not only local coordinate systems $(x^\mu, q^i, q^i_\mu)$ for $JE$ and $(x^\mu, q^i, \vec{q}^{\,i}_\mu)$ for $\vec{J}E$ but also local coordinate systems $(x^\mu, q^i, p^\mu_i, p)$ both for $J^{\star}E$ and for $J^{\circledast}E$, as well as local coordinate systems $(x^\mu, q^i, p^\mu_i)$ both for $\vec{J}^{*}E$ and for $\vec{J}^{\circledast}E$, respectively: all these will again be referred to as adapted local coordinates. They are defined by requiring the dual pairing between a point in $J^{\star}E$ or in $J^{\circledast}E$ with coordinates $(x^\mu, q^i, p^\mu_i, p)$ and a point in $JE$ with coordinates $(x^\mu, q^i, q^i_\mu)$ to be given by
$$ p^\mu_i \, q^i_\mu + p \tag{2.21} $$
in the ordinary (untwisted) case and by
$$ ( p^\mu_i \, q^i_\mu + p ) \, d^n x \tag{2.22} $$
in the twisted case, whereas the dual pairing between a point in $\vec{J}^{*}E$ or in $\vec{J}^{\circledast}E$ with coordinates $(x^\mu, q^i, p^\mu_i)$ and a point in $\vec{J}E$ with coordinates $(x^\mu, q^i, \vec{q}^{\,i}_\mu)$ should be given by
$$ p^\mu_i \, \vec{q}^{\,i}_\mu \tag{2.23} $$
in the ordinary (untwisted) case and by
$$ p^\mu_i \, \vec{q}^{\,i}_\mu \, d^n x \tag{2.24} $$
in the twisted case. Moreover, a transformation to new local coordinates $x'^\kappa$ for $M$, new local coordinates $q'^k$ for $Q$ and a new local trivialization of $E$, according to Eq. (2.5), induces naturally not only a transformation to new adapted local coordinates $(x'^\kappa, q'^k, q'^k_\kappa)$ for $JE$ and $(x'^\kappa, q'^k, \vec{q}^{\,\prime k}_\kappa)$ for $\vec{J}E$, as given by Eqs. (2.6) and (2.7), but also a transformation to new adapted local coordinates $(x'^\kappa, q'^k, p'^\kappa_k, p')$ both for $J^{\star}E$ and for $J^{\circledast}E$, as well as a transformation to new adapted local coordinates $(x'^\kappa, q'^k, p'^\kappa_k)$ both for $\vec{J}^{*}E$ and for $\vec{J}^{\circledast}E$, respectively: they are given by
$$ p'^\kappa_k = p'^\kappa_k(x^\mu, q^i, p^\mu_i, p) , \qquad p' = p'(x^\mu, q^i, p^\mu_i, p) , \tag{2.25} $$
where
$$ p'^\kappa_k = \frac{\partial x'^\kappa}{\partial x^\mu} \frac{\partial q^i}{\partial q'^k} \, p^\mu_i , \qquad p' = p - \frac{\partial q'^k}{\partial x^\mu} \frac{\partial q^i}{\partial q'^k} \, p^\mu_i \tag{2.26} $$
in the ordinary (untwisted) case and
$$ p'^\kappa_k = \det\!\left( \frac{\partial x}{\partial x'} \right) \frac{\partial x'^\kappa}{\partial x^\mu} \frac{\partial q^i}{\partial q'^k} \, p^\mu_i , \qquad p' = \det\!\left( \frac{\partial x}{\partial x'} \right) \left( p - \frac{\partial q'^k}{\partial x^\mu} \frac{\partial q^i}{\partial q'^k} \, p^\mu_i \right) \tag{2.27} $$
in the twisted case. Finally, it is worth noting that the affine duals $J^{\star}E$ and $J^{\circledast}E$ of $JE$ contain line subbundles $J_0^{\star}E$ and $J_0^{\circledast}E$ whose fiber over any point $e$ in $E$ consists of the constant (rather than affine) maps from $J_e E$ to $\mathbb{R}$ and to $\bigwedge^{n} T_x^{*}M$ respectively, and the corresponding quotient vector bundles over $E$ can be naturally identified with the respective duals $\vec{J}^{*}E$ and $\vec{J}^{\circledast}E$ of $\vec{J}E$, i.e. we have
$$ J^{\star}E / J_0^{\star}E \cong \vec{J}^{*}E \cong L(VE, \pi^{*} TM) , \tag{2.28} $$
and
$$ J^{\circledast}E / J_0^{\circledast}E \cong \vec{J}^{\circledast}E \cong L\bigl( VE, \pi^{*}( \textstyle\bigwedge^{n-1} T^{*}M ) \bigr) , \tag{2.29} $$
respectively. This shows that, in both cases, the corresponding projection onto the quotient amounts to “forgetting the additional energy variable” since it takes a
point with coordinates $(x^\mu, q^i, p^\mu_i, p)$ to the point with coordinates $(x^\mu, q^i, p^\mu_i)$; it will be denoted by $\eta$ (as a reminder for the fact that it projects the extended multiphase space to the ordinary one) and is easily seen to turn $J^{\star}E$ and $J^{\circledast}E$ into affine line bundles over $\vec{J}^{*}E$ and over $\vec{J}^{\circledast}E$, respectively.

An alternative but equivalent description of the extended multiphase space of field theory is as a certain bundle of differential forms on the total space $E$ of the configuration bundle, namely the bundle $\bigwedge^{n}_{n-1} T^{*}E$ of $(n-1)$-horizontal $n$-forms on $E$, that is, of $n$-forms on $E$ that vanish whenever one inserts at least two vertical vectors. In fact, there is a canonical isomorphism
$$ \Phi : \textstyle\bigwedge^{n}_{n-1} T^{*}E \xrightarrow{\ \cong\ } J^{\circledast}E \tag{2.30} $$
of vector bundles over $E$ that can be defined explicitly as follows: given any point $e$ in $E$ with base point $x = \pi(e)$ in $M$ and any $(n-1)$-horizontal $n$-form $\alpha_e \in \bigwedge^{n}_{n-1} T_e^{*}E$, together with a jet $u_e \in J_e E$, we can use $u_e$, which is a linear map from $T_x M$ to $T_e E$, to pull back the $n$-form $\alpha_e$ on $T_e E$ to an $n$-form $u_e^{*}\alpha_e$ on $T_x M$. Obviously, $u_e^{*}\alpha_e$ is an affine function of $u_e$ as $u_e$ varies over the affine space $J_e E$ because it is actually a linear function of $u_e$ when $u_e$ is allowed to vary over the entire vector space $L(T_x M, T_e E)$ (the restriction of a linear map between two vector spaces to an affine subspace of its domain is an affine map). Thus putting
$$ \Phi_e(\alpha_e) \cdot u_e = u_e^{*}\alpha_e \tag{2.31} $$
defines a map $\Phi_e : \bigwedge^{n}_{n-1} T_e^{*}E \to J_e^{\circledast}E$ which is evidently linear and, as $e$ varies over $E$, provides the desired isomorphism (2.30). Further details can be found in Ref. [4].

The importance of this canonical isomorphism is due to the fact that it provides a natural way to introduce a multicanonical form $\theta$ and a multisymplectic form $\omega$ on extended multiphase space which play a similar role in field theory as the canonical form $\theta$ and the symplectic form $\omega$ on cotangent bundles in mechanics. Namely, $\theta$ is an $n$-form that can be defined intrinsically by using the tangent map $T\tau_{J^{\circledast}E} : T(J^{\circledast}E) \to TE$ to the bundle projection $\tau_{J^{\circledast}E} : J^{\circledast}E \to E$, as follows. Given a point $z \in J^{\circledast}E$ with base point $e = \tau_{J^{\circledast}E}(z)$ in $E$ and $n$ tangent vectors $w_1, \ldots, w_n$ to $J^{\circledast}E$ at $z$, put
$$ \theta_z(w_1, \ldots, w_n) = \bigl( \Phi_e^{-1}(z) \bigr) \bigl( T_z \tau_{J^{\circledast}E} \cdot w_1 , \ldots , T_z \tau_{J^{\circledast}E} \cdot w_n \bigr) . \tag{2.32} $$
Moreover, $\omega$ is an $(n+1)$-form which, as in mechanics, is defined to be the negative of the exterior derivative of $\theta$:
$$ \omega = -\,d\theta . \tag{2.33} $$
Another important object that can be defined globally both on extended and ordinary multiphase space is the scaling or Euler vector field which we shall denote here by $\Sigma$. Its definition is based exclusively on the fact that $J^{\circledast}E$ and $\vec{J}^{\circledast}E$ are total spaces of vector bundles over $E$. In fact, given any vector bundle $V$ over $E$, $\Sigma_V$ (which we shall simply denote by $\Sigma$ when there is no danger of confusion) is defined
to be the fundamental vector field associated with the action of $\mathbb{R}$, considered as a commutative group under addition, by scaling transformations on the fibers:
$$ \mathbb{R} \times V \to V , \qquad (\lambda, v) \mapsto \exp(\lambda) \, v . $$
Thus $\Sigma$ is simply that vertical vector field on $V$ which, under identification of the vertical tangent spaces to $V$ with the fibers of $V$ itself typical for vector bundles, becomes the identity on $V$:
$$ \Sigma(v) = \frac{d}{d\lambda} \exp(\lambda) \, v \, \Big|_{\lambda = 0} = v . $$
In adapted local coordinates, the isomorphism $\Phi$ can be defined by the requirement that the $(n-1)$-horizontal $n$-form on $E$ corresponding to the point in $J^{\circledast}E$ with coordinates $(x^\mu, q^i, p^\mu_i, p)$ is explicitly given by
$$ p^\mu_i \, dq^i \wedge d^n x_\mu + p \, d^n x . \tag{2.34} $$
The tautological nature of the definition of $\theta$ then becomes apparent by realizing that exactly the same expression represents the multicanonical form $\theta$:
$$ \theta = p^\mu_i \, dq^i \wedge d^n x_\mu + p \, d^n x . \tag{2.35} $$
Taking the exterior derivative and using Eq. (2.33) yields
$$ \omega = dq^i \wedge dp^\mu_i \wedge d^n x_\mu - dp \wedge d^n x . \tag{2.36} $$
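The step from (2.35) to (2.36) can be spelled out as follows. The sketch uses only the coordinate expressions above together with the sign convention $\omega = -d\theta$ of Eq. (2.33); the specialization to $n = 1$ with $x^1 = t$ is an added illustration, not part of the text.

```latex
% Sketch: (2.36) from (2.35) and (2.33), followed by the assumed special case n = 1.
\begin{align*}
d\theta &= dp^\mu_i \wedge dq^i \wedge d^n x_\mu + dp \wedge d^n x
   \qquad \bigl(\text{since } d(d^n x_\mu) = 0 = d(d^n x)\bigr), \\
\omega &= -\,d\theta = dq^i \wedge dp^\mu_i \wedge d^n x_\mu - dp \wedge d^n x . \\
\intertext{For $n = 1$, with $x^1 = t$, $d^1 x = dt$ and $d^1 x_1 = i_{\partial_t}\, dt = 1$:}
\theta &= p_i \, dq^i + p \, dt , \qquad
\omega = dq^i \wedge dp_i + dt \wedge dp .
\end{align*}
```

In this special case the variable $p$ plays the role of the energy variable $E$ in the forms $\theta = p_i \, dq^i + E \, dt$ and $\omega = dq^i \wedge dp_i + dt \wedge dE$ quoted in the introduction for non-autonomous mechanics.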
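Moreover, the scaling vector fields on $J^{\circledast}E$ and on $\vec{J}^{\circledast}E$ are given by
$$ \Sigma = p^\mu_i \frac{\partial}{\partial p^\mu_i} + p \frac{\partial}{\partial p} \tag{2.37} $$
and by
$$ \Sigma = p^\mu_i \frac{\partial}{\partial p^\mu_i} \tag{2.38} $$
respectively. Finally, we note the following relations, which will be used later.

Proposition 2.1. The multicanonical form $\theta$, the multisymplectic form $\omega$ and the scaling or Euler vector field $\Sigma$ on extended multiphase space $J^{\circledast}E$ satisfy the following relations:
$$ L_\Sigma \theta = \theta . \tag{2.39} $$
$$ L_\Sigma \omega = \omega . \tag{2.40} $$
$$ i_\Sigma \theta = 0 . \tag{2.41} $$
$$ i_\Sigma \omega = -\,\theta . \tag{2.42} $$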
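Independently of the intrinsic argument, the relations (2.41) and (2.42) can be checked directly in adapted local coordinates. The computation below is a sketch that uses only Eqs. (2.35), (2.36) and (2.37) and the fact that contraction is an antiderivation.

```latex
% Sketch: coordinate check of (2.41) and (2.42), using (2.35)-(2.37).
\begin{align*}
i_\Sigma \theta &= 0
   \qquad (\theta \text{ contains no } dp^\mu_i \text{ or } dp \text{ factor}), \\
i_\Sigma \bigl( dq^i \wedge dp^\mu_i \wedge d^n x_\mu \bigr)
   &= -\,dq^i \wedge \bigl( i_\Sigma\, dp^\mu_i \bigr)\, d^n x_\mu
    = -\,p^\mu_i \, dq^i \wedge d^n x_\mu , \\
i_\Sigma \bigl( -\,dp \wedge d^n x \bigr) &= -\,p \, d^n x , \\
i_\Sigma \omega &= -\,\bigl( p^\mu_i \, dq^i \wedge d^n x_\mu + p \, d^n x \bigr) = -\,\theta .
\end{align*}
% Cartan's formula then gives L_\Sigma\theta = d(i_\Sigma\theta) + i_\Sigma d\theta
% = -\,i_\Sigma\omega = \theta, in agreement with (2.39).
```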
Proof. Let $(\varphi_\lambda)_{\lambda \in \mathbb{R}}$ denote the one-parameter group of scaling transformations on $J^{\circledast}E$ given by $\varphi_\lambda(z) = e^\lambda z$. Then by the formula relating the Lie derivative of
a differential form along a vector field to the derivative of its pull-back under the flow of that vector field (see, e.g., [25, p. 91]) and the definition of $\theta$, we have
$$ \begin{aligned}
(L_\Sigma \theta)_z(w_1, \ldots, w_n)
 &= \frac{\partial}{\partial\lambda} (\varphi_\lambda^{*}\theta)_z(w_1, \ldots, w_n) \Big|_{\lambda=0} \\
 &= \frac{\partial}{\partial\lambda} \theta_{\varphi_\lambda(z)}(T_z\varphi_\lambda \cdot w_1, \ldots, T_z\varphi_\lambda \cdot w_n) \Big|_{\lambda=0} \\
 &= \frac{\partial}{\partial\lambda} \Phi_e^{-1}(\varphi_\lambda(z)) \bigl( T_{\varphi_\lambda(z)}\tau_{J^{\circledast}E} \cdot (T_z\varphi_\lambda \cdot w_1), \ldots, T_{\varphi_\lambda(z)}\tau_{J^{\circledast}E} \cdot (T_z\varphi_\lambda \cdot w_n) \bigr) \Big|_{\lambda=0} \\
 &= \frac{\partial}{\partial\lambda} \Phi_e^{-1}(e^\lambda z) \bigl( T_z(\tau_{J^{\circledast}E} \circ \varphi_\lambda) \cdot w_1, \ldots, T_z(\tau_{J^{\circledast}E} \circ \varphi_\lambda) \cdot w_n \bigr) \Big|_{\lambda=0} \\
 &= \frac{\partial}{\partial\lambda} e^\lambda \, \Phi_e^{-1}(z) \bigl( T_z\tau_{J^{\circledast}E} \cdot w_1, \ldots, T_z\tau_{J^{\circledast}E} \cdot w_n \bigr) \Big|_{\lambda=0} \\
 &= \frac{\partial}{\partial\lambda} e^\lambda \, \theta_z(w_1, \ldots, w_n) \Big|_{\lambda=0} \\
 &= \theta_z(w_1, \ldots, w_n) ,
\end{aligned} $$
which proves Eq. (2.39) and also Eq. (2.40) since $L_\Sigma$ commutes with the exterior derivative. Next, observe that with respect to the target projection of $J^{\circledast}E$ onto $E$, $\Sigma$ is vertical whereas $\theta$ is horizontal, which implies Eq. (2.41). Combining these two equations, we finally get
$$ \theta = L_\Sigma \theta = d(i_\Sigma \theta) + i_\Sigma \, d\theta = -\,i_\Sigma \omega , $$
proving Eq. (2.42).

We note here that the existence of the canonically-defined forms $\theta$ and $\omega$ is what distinguishes the twisted affine dual $J^{\circledast}E$ from the ordinary affine dual $J^{\star}E$ of $JE$.

Using the jet bundle $JE$ and the multiphase spaces $\vec{J}^{\circledast}E$ and $J^{\circledast}E$ associated with a given fiber bundle $E$ over space-time $M$, one can develop a general covariant Lagrangian and Hamiltonian formalism for field theories whose configurations are sections of $E$. For example, the Lagrangian function of mechanics is replaced by a Lagrangian density $L$, which is a function on $JE$ with values in the volume forms on space-time, so that one can integrate it to compute the action functional and formulate a variational principle. It gives rise to a covariant Legendre transformation which replaces that of mechanics and comes in two variants, both defined by an appropriate notion of vertical derivative or fiber derivative: one of them is a fiber preserving smooth map $\vec{\mathbb{F}}L : JE \to \vec{J}^{\circledast}E$ and the other a fiber preserving smooth map $\mathbb{F}L : JE \to J^{\circledast}E$; of course, the former is obtained from the latter by composition with the natural projection $\eta$ from $J^{\circledast}E$ onto $\vec{J}^{\circledast}E$ mentioned above. When $\vec{\mathbb{F}}L$ is a local/global diffeomorphism, the Lagrangian $L$ is called regular/hyperregular.
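In adapted local coordinates, the two variants of the covariant Legendre transformation take a simple form. The sketch below is not spelled out in the text: it assumes that the Lagrangian density is written as $L = \ell \, d^n x$ with a hypothetical coefficient function $\ell(x^\mu, q^i, q^i_\mu)$, and that the fiber derivative at a jet $u$ is the first order (affine) approximation of $L$ along the fiber through $u$, as is standard in the literature (see, e.g., Ref. [4]).

```latex
% Sketch: coordinate form of the covariant Legendre transformation.
% Assumption: L = \ell\, d^n x, with \ell a hypothetical name for the coefficient function.
\begin{align*}
\mathbb{F}L(u) \cdot u'
  &= \Bigl[\, \ell(u) + \frac{\partial \ell}{\partial q^i_\mu}(u)\,
       \bigl( q'^{\,i}_\mu - q^i_\mu \bigr) \Bigr]\, d^n x
   \qquad (u, u' \text{ jets over the same point of } E), \\
\intertext{so that comparison with the twisted pairing (2.22) gives}
p^\mu_i &= \frac{\partial \ell}{\partial q^i_\mu} , \qquad
p = \ell - \frac{\partial \ell}{\partial q^i_\mu}\, q^i_\mu .
\end{align*}
% \vec{\mathbb{F}}L = \eta \circ \mathbb{F}L keeps only the p^\mu_i.
% Along the image of \mathbb{F}L one has p = -(p^\mu_i q^i_\mu - \ell),
% i.e. minus the De Donder-Weyl Hamiltonian of the introduction.
```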
On the other hand, the Hamiltonian function of mechanics is replaced by a Hamiltonian density $H$, which is a section of extended multiphase space $J^{\circledstar}E$ as an affine line bundle over ordinary multiphase space $\vec{J}^{\circledast}E$. Once again, any such section gives rise to a covariant Legendre transformation, defined by an appropriate notion of vertical derivative or fiber derivative: it is a fiber preserving smooth map $\mathbb{F}H : \vec{J}^{\circledast}E \to JE$. When $\mathbb{F}H$ is a local/global diffeomorphism, the Hamiltonian $H$ is called regular/hyperregular. In any case, pulling back $\theta$ and $\omega$ from $J^{\circledstar}E$ to $JE$ via $\mathbb{F}L$ generates the Poincaré-Cartan forms $\theta_L$ and $\omega_L$ on $JE$, and similarly, pulling them back from $J^{\circledstar}E$ to $\vec{J}^{\circledast}E$ via $H$ generates the forms $\theta_H$ and $\omega_H$ on $\vec{J}^{\circledast}E$. As in mechanics, the Lagrangian and Hamiltonian formulations turn out to be completely equivalent in the hyperregular case, with $\vec{\mathbb{F}}L$ and $\mathbb{F}H$ being each other's inverse. For more details on these and related matters, the reader may consult Ref. [3] and, in particular, Ref. [4], except for the direct construction of the Legendre transformation $\vec{\mathbb{F}}H$ associated with a Hamiltonian $H$, which was first derived in Ref. [23]; see also Ref. [24]. There is also a generalization of the Hamilton-Jacobi equation to the field theoretical situation; the reader may consult the extensive review by Kastrup [27] as a starting point for this direction.

3. Poisson Forms and Their Poisson Brackets

The constructions exposed in the previous section have identified the extended multiphase space of field theory as an example of a multisymplectic manifold.

Definition 3.1. A multisymplectic manifold is a manifold $P$ equipped with a non-degenerate closed $(n+1)$-form $\omega$, called the multisymplectic form.

Remark. This definition is deliberately vague as to the meaning of the term “non-degenerate”, at least when $n > 1$. The standard interpretation is that the kernel of $\omega$ on vectors should vanish, that is,
\[ i_X\omega = 0 \;\Rightarrow\; X = 0 \quad \text{for vector fields } X. \tag{3.1} \]
Note that, of course, no such conclusion holds for multivector fields, that is, the kernel of $\omega$ on multivectors is non-trivial. (This is true even for symplectic forms which vanish on certain bivectors, for example on those that represent two-dimensional isotropic subspaces.) However, the condition (3.1) alone is too weak and it is not clear what additional algebraic constraints should be imposed on $\omega$. A first attempt in this direction has been made by Martin [28, 29], but his conditions are too restrictive and do not seem to agree with what is needed in applications to field theory. More recently, a promising proposal has been made by Cantrijn, Ibort and de León [30] which seems to come close to a convincing definition of the concept of a multisymplectic manifold. Fortunately, there is no need to enter this discussion here since the “minimal” requirement of non-degeneracy formulated in Eq. (3.1) is sufficient for our purposes and will be used here to provide a working definition.
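As a concrete illustration of the parenthetical remark on symplectic forms, the following sketch exhibits a non-zero bivector in the kernel of a 2-form that is non-degenerate on vectors. The choice of $\mathbb{R}^4$ with coordinates $(q^1, q^2, p_1, p_2)$ is an assumption made purely for illustration and is not taken from the text; the overall sign of the contraction depends on the conventions of Appendix A and is left open.

```latex
% Illustrative example (not from the paper): canonical symplectic form on R^4.
\[
  \omega = dq^1 \wedge dp_1 + dq^2 \wedge dp_2 .
\]
% Non-degeneracy on vectors, for X = a^i \partial/\partial q^i + b_i \partial/\partial p_i:
\[
  i_X \omega = a^i \, dp_i - b_i \, dq^i = 0
  \quad\Longrightarrow\quad a^i = 0 , \; b_i = 0 .
\]
% A non-zero bivector representing the isotropic plane spanned by
% \partial/\partial q^1 and \partial/\partial q^2 lies in the kernel of \omega:
\[
  \xi = \frac{\partial}{\partial q^1} \wedge \frac{\partial}{\partial q^2} \neq 0 ,
  \qquad
  i_\xi \omega
  = \pm\, \omega\Big( \frac{\partial}{\partial q^1} , \frac{\partial}{\partial q^2} \Big)
  = 0 .
\]
```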
In what follows, we shall make extensive use of the basic operations of calculus on manifolds involving multivector fields and differential forms, namely the Schouten bracket between multivector fields, the contraction of differential forms with multivector fields and the Lie derivative of differential forms along multivector fields. For the convenience of the reader, the relevant formulae are summarized in Appendix A; in particular, Eqs. (A.9) and (A.11) will be used constantly and often without further mention.

On multisymplectic manifolds, there are special classes of multivector fields and of differential forms:

Definition 3.2. An $r$-multivector field $X$ on a multisymplectic manifold $P$ is called locally Hamiltonian if $i_X\omega$ is closed, or equivalently, if
\[ L_X\omega = 0, \tag{3.2} \]
and it is called globally Hamiltonian or simply Hamiltonian if $i_X\omega$ is exact, i.e. if there exists an $(n-r)$-form $f$ on $P$ such that
\[ i_X\omega = df. \tag{3.3} \]
In this case, we say that $f$ is associated with $X$ or corresponds to $X$. Conversely, an $(n-r)$-form $f$ on a multisymplectic manifold $P$ is called Hamiltonian if there exists an $r$-multivector field $X$ on $P$ such that
\[ i_X\omega = df. \tag{3.4} \]
In this case, we say that $X$ is associated with $f$ or corresponds to $f$.

Remark. As mentioned before, the kernel of $\omega$ on multivectors is non-trivial, so the correspondence between Hamiltonian multivector fields and Hamiltonian forms is not unique (in either direction). Moreover, far from every form is Hamiltonian. In particular, as first shown in special examples by Kijowski [8] and then more systematically by Kanatchikov [1], although in a somewhat different context, there are restrictions on the allowed multimomentum dependence of the coefficient functions. Of course, every closed form is Hamiltonian (the corresponding Hamiltonian multivector field vanishes identically). Below we will give more interesting examples to show that the definition is not empty.

Proposition 3.3. The Schouten bracket of any two locally Hamiltonian multivector fields $X$ and $Y$ on a multisymplectic manifold $P$ is a globally Hamiltonian multivector field $[X,Y]$ on $P$ whose associated Hamiltonian form can, up to sign, be chosen to be the double contraction $i_X i_Y\omega$. More precisely, assuming $X$ to be of degree $r$ and $Y$ to be of degree $s$, we have
\[ i_{[X,Y]}\,\omega = (-1)^{(r-1)s}\, d(i_X i_Y\omega). \tag{3.5} \]
In particular, this implies that under the Schouten bracket, the space $\mathfrak{X}^{\wedge}_{LH}(P)$ of locally Hamiltonian multivector fields on $P$ is a subalgebra of the Lie superalgebra
$\mathfrak{X}^{\wedge}(P)$ of all multivector fields on $P$, containing the space $\mathfrak{X}^{\wedge}_{H}(P)$ of globally Hamiltonian multivector fields, as well as the (smaller) space $\mathfrak{X}^{\wedge}_{0}(P)$ of multivector fields taking values in the kernel of $\omega$, as ideals: if $X$ is locally Hamiltonian, then
\[ i_\xi\omega = 0 \;\Rightarrow\; i_{[\xi,X]}\,\omega = 0. \tag{3.6} \]
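To make the sign in Eq. (3.5) concrete, the lowest degrees can be worked out by hand. The sketch below assumes the usual facts that the Schouten bracket of two vector fields is their Lie bracket and that Eq. (A.9) reduces to Cartan's formula $L_X = d\,i_X + i_X\,d$ for vector fields; these are standard but are not quoted from Appendix A.

```latex
% Sign (-1)^{(r-1)s} appearing in Eq. (3.5), for low degrees:
%   (r,s) = (1,1): +1     (1,2): +1     (2,1): -1     (2,2): +1
%
% Vector fields, r = s = 1:
\[
  i_{[X,Y]}\,\omega = d( i_X i_Y \omega ) .
\]
% If Y is globally Hamiltonian with i_Y \omega = dg, then
\[
  i_X i_Y \omega = i_X \, dg = L_X g - d( i_X g ) ,
  \qquad\text{hence}\qquad
  i_{[X,Y]}\,\omega = d( L_X g ) ,
\]
% so [X,Y] is globally Hamiltonian, with L_X g as one possible associated form.
```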
Proof. According to Eqs. (A.11) and (A.9), we have for any two multivector fields $X$ of degree $r$ and $Y$ of degree $s$,
\[
\begin{aligned}
i_{[X,Y]}\,\omega &= (-1)^{(r-1)s}\, L_X i_Y\omega - i_Y L_X\omega \\
&= (-1)^{(r-1)s}\, d(i_X i_Y\omega) + (-1)^{(r-1)(s-1)}\, i_X d(i_Y\omega) - i_Y L_X\omega \\
&= (-1)^{(r-1)s}\, d(i_X i_Y\omega) + (-1)^{(r-1)(s-1)}\, i_X L_Y\omega - i_Y L_X\omega,
\end{aligned}
\]
since $d\omega = 0$, showing that if $X$ and $Y$ are both locally Hamiltonian, then $[X,Y]$ is globally Hamiltonian and Eq. (3.5) holds.

Definition 3.4. A Hamiltonian form $f$ on a multisymplectic manifold $P$ is called a Poisson form if its contraction with any multivector field $\xi$ on $P$ taking values in the kernel of $\omega$ vanishes:
\[ i_\xi\omega = 0 \;\Rightarrow\; i_\xi f = 0. \tag{3.7} \]
Remark. For the Poisson bracket introduced below to be well-defined, it would be sufficient to impose the apparently weaker condition that the contraction of $f$ with any multivector field $\xi$ on $P$ taking values in the kernel of $\omega$ should be a closed form:
\[ i_\xi\omega = 0 \;\Rightarrow\; d(i_\xi f) = 0. \tag{3.8} \]
However, it turns out that this condition is already sufficient to imply the previous one. To see this, observe that if $f$ is a differential form on $P$ satisfying Eq. (3.8) and $\xi$ is any multivector field on $P$ taking values in the kernel of $\omega$, then for any function $\varphi$ on $P$, $\varphi\xi$ will be a multivector field on $P$ taking values in the kernel of $\omega$ as well and hence
\[ 0 = d(i_{\varphi\xi} f) = d(\varphi\, i_\xi f) = d\varphi \wedge i_\xi f + \varphi\, d(i_\xi f) = d\varphi \wedge i_\xi f. \]
But this means that the exterior product of $i_\xi f$ with any one-form on $P$ must vanish, which is only possible if $i_\xi f$ itself vanishes.

Definition 3.5. An exact multisymplectic manifold is a multisymplectic manifold whose multisymplectic form $\omega$ is the exterior derivative of a Poisson form:
\[ \omega = -d\theta, \tag{3.9} \]
\[ i_\xi\omega = 0 \;\Rightarrow\; i_\xi\theta = 0. \tag{3.10} \]
We shall call $\theta$ the multicanonical form.
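The simplest instance of Definition 3.5 is the cotangent bundle of mechanics, viewed as a multisymplectic manifold with $n = 1$. The sketch below is an illustration added here, not part of the original argument; it assumes the usual Darboux coordinates $(q^i, p_i)$ on $T^*Q$ and checks the two conditions (3.9) and (3.10).

```latex
% Illustrative example: P = T^*Q with n = 1 (Darboux coordinates assumed).
\[
  \theta = p_i \, dq^i ,
  \qquad
  \omega = - d\theta = dq^i \wedge dp_i .
\]
% Condition (3.10), degree by degree in the multivector field \xi:
%   degree 1:   i_\xi \omega = 0 forces \xi = 0, since \omega is non-degenerate
%               on vectors, so i_\xi \theta = 0 trivially;
%   degree >= 2: \theta is a 1-form, so i_\xi \theta = 0 for degree reasons.
\[
  i_\xi \omega = 0 \;\Longrightarrow\; i_\xi \theta = 0
  \qquad\text{for multivector fields } \xi \text{ of every degree} .
\]
```

In this case the kernel condition is automatic; the Hopf bundle counterexample in the text shows that this is not so in general.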
Remark. It is an immediate consequence of Proposition 2.1, in particular of Eq. (2.42), that the extended multiphase space of field theory is an exact multisymplectic manifold. However, the condition that the kernel of $\theta$ should contain that of $\omega$ is non-trivial in the sense that it is not always possible to modify a potential of an exact form by adding an appropriate closed form so as to achieve the desired inclusion of the kernels, as the following counterexample will show.$^{d}$ Consider the three-sphere $S^3$ as the total space of the Hopf bundle, a principal $U(1)$-bundle over the two-sphere $S^2$, and let $\xi$ be the fundamental vector field of the $U(1)$ group action on $S^3$ and $\alpha$ be the canonical connection 1-form on $S^3$. Then $i_\xi\alpha = 1$ and $i_\xi d\alpha = 0$. We want to modify $\alpha$ by some closed form $\beta$ so that $i_\xi(\alpha + \beta) = 0$. But $S^3$ is simply connected, so $d\beta = 0$ implies that there is a function $f$ with $df = \beta$. Hence we are looking for a function $f$ on $S^3$ that satisfies $i_\xi df = -1$. But $S^3$ is compact, so $f$ must have at least two critical points (a maximum and a minimum), and we arrive at a contradiction. In other words, we cannot modify the potential $\alpha$ of $d\alpha$ in such a way that the kernel of $d\alpha$ is contained in the kernel of the modified potential.

Definition 3.6. Let $P$ be an exact multisymplectic manifold. Given any two Poisson forms $f$ of degree $n-r$ and $g$ of degree $n-s$ on $P$, their Poisson bracket is defined to be the $(n+1-r-s)$-form on $P$ given by
\[ \{f,g\} = -L_X g + (-1)^{(r-1)(s-1)}\, L_Y f - (-1)^{(r-1)s}\, L_{X\wedge Y}\theta, \tag{3.11} \]
or equivalently,
\[ \{f,g\} = (-1)^{r(s-1)}\, i_Y i_X\omega + d\Big( (-1)^{(r-1)(s-1)}\, i_Y f - i_X g - (-1)^{(r-1)s}\, i_Y i_X\theta \Big), \tag{3.12} \]
where $X$ and $Y$ are Hamiltonian multivector fields associated with $f$ and with $g$, respectively.

Remark. This Poisson bracket is an extension of the one between Hamiltonian $(n-1)$-forms introduced by two of the present authors in an earlier article [5], except for the fact that when $f$ and $g$ are $(n-1)$-forms, $X$ and $Y$ are vector fields and are uniquely determined by $f$ and $g$, so there is no need to impose restrictions on the contraction of $f$ and $g$ with multivector fields taking values in the kernel of $\omega$: the definition given in Ref. [5] works for all Hamiltonian $(n-1)$-forms and not just for Poisson $(n-1)$-forms.

$^{d}$ This example is due to M. Bordemann.

Proposition 3.7. The Poisson bracket introduced above closes and is well-defined, i.e. when $f$ and $g$ are Poisson forms, $\{f,g\}$ is again a Poisson form which does not depend on the choice of the Hamiltonian multivector fields $X$ and $Y$ used in its definition. Moreover, we have
\[ i_{[Y,X]}\,\omega = d\{f,g\}, \tag{3.13} \]
i.e. if $X$ is a Hamiltonian multivector field associated with $f$ and $Y$ is a Hamiltonian multivector field associated with $g$, then $[Y,X]$ is a Hamiltonian multivector field associated with $\{f,g\}$.

Proof. We begin by using Eq. (A.9) to show that, for any two Hamiltonian forms $f$ of degree $n-r$ and $g$ of degree $n-s$ with associated Hamiltonian multivector fields $X$ and $Y$, respectively, the expressions on the right-hand side of Eqs. (3.11) and (3.12) coincide:
\[
\begin{aligned}
& -L_X g + (-1)^{(r-1)(s-1)}\, L_Y f - (-1)^{(r-1)s}\, L_{X\wedge Y}\theta \\
&\quad = -d(i_X g) + (-1)^{r}\, i_X dg + (-1)^{(r-1)(s-1)}\, d(i_Y f) - (-1)^{(r-1)(s-1)+s}\, i_Y df \\
&\qquad - (-1)^{(r-1)s}\, d(i_{X\wedge Y}\theta) - (-1)^{r(s-1)}\, i_{X\wedge Y}\omega \\
&\quad = -d(i_X g) + (-1)^{rs+r}\, i_Y i_X\omega + (-1)^{(r-1)(s-1)}\, d(i_Y f) + (-1)^{rs-r}\, i_Y i_X\omega \\
&\qquad - (-1)^{(r-1)s}\, d(i_Y i_X\theta) - (-1)^{rs-r}\, i_Y i_X\omega \\
&\quad = (-1)^{r(s-1)}\, i_Y i_X\omega + d\Big( (-1)^{(r-1)(s-1)}\, i_Y f - i_X g - (-1)^{(r-1)s}\, i_Y i_X\theta \Big).
\end{aligned}
\]
In order for the bracket to be well-defined, it is necessary and sufficient that this expression vanishes whenever $X$ or $Y$ takes its values in the kernel of $\omega$: this is guaranteed by the requirement that $f$, $g$ and $\theta$ should be Poisson forms. Moreover, in view of Eq. (3.5), Eq. (3.13) follows immediately from Eq. (3.12), proving that the Poisson bracket $\{f,g\}$ of two Poisson forms is a Hamiltonian form. To check that it is in fact a Poisson form, assume $\xi$ to be a multivector field taking values in the kernel of $\omega$, say of degree $k$, and consider the expressions obtained by contracting each of the four terms in Eq. (3.12) with $\xi$. The first obviously vanishes, whereas the fourth can be seen to vanish due to Eqs. (3.6) and (3.10):
\[
\begin{aligned}
i_\xi\, d(i_Y i_X\theta) &= (-1)^{s}\, i_\xi i_Y d(i_X\theta) + i_\xi L_Y i_X\theta \\
&= (-1)^{s(k-1)}\, i_Y i_\xi L_X\theta + (-1)^{r+s(k-1)}\, i_Y i_\xi i_X d\theta \\
&\quad - i_{[Y,\xi]}\, i_X\theta + (-1)^{(s-1)k}\, L_Y i_\xi i_X\theta \\
&= -(-1)^{s(k-1)}\, i_Y i_{[X,\xi]}\theta + (-1)^{(r-1)k+s(k-1)}\, i_Y L_X i_\xi\theta \\
&\quad - (-1)^{r+s(k-1)}\, i_Y i_\xi i_X\omega - i_{[Y,\xi]}\, i_X\theta + (-1)^{(s-1)k}\, L_Y i_\xi i_X\theta \\
&= 0.
\end{aligned}
\]
Similarly, the second and third can be handled by using Eqs. (3.6) and (3.7), which imply that the expressions
\[ i_\xi\, d(i_Y f) = (-1)^{s}\, i_\xi i_Y df + i_\xi L_Y f = (-1)^{s}\, i_\xi i_Y i_X\omega - i_{[Y,\xi]} f + (-1)^{(s-1)k}\, L_Y i_\xi f \]
and
\[ i_\xi\, d(i_X g) = (-1)^{r}\, i_\xi i_X dg + i_\xi L_X g = (-1)^{r}\, i_\xi i_X i_Y\omega - i_{[X,\xi]} g + (-1)^{(r-1)k}\, L_X i_\xi g \]
vanish since $f$ and $g$ are Poisson forms.

Now we can formulate the main theorem of this paper:

Theorem 3.8. Let $P$ be an exact multisymplectic manifold. The Poisson bracket introduced above is bilinear over $\mathbb{R}$, is graded antisymmetric, which means that for any two Poisson forms $f$ of degree $n-r$ and $g$ of degree $n-s$ on $P$, we have
\[ \{g,f\} = -(-1)^{(r-1)(s-1)}\, \{f,g\}, \tag{3.14} \]
and satisfies the graded Jacobi identity, which means that for any three Poisson forms $f$ of degree $n-r$, $g$ of degree $n-s$ and $h$ of degree $n-t$ on $P$, we have
\[ (-1)^{(r-1)(t-1)}\, \{f,\{g,h\}\} + \text{cyclic perm.} = 0, \tag{3.15} \]
thus turning the space of Poisson forms on $P$ into a Lie superalgebra.

Remark. Bilinearity over $\mathbb{R}$ and the graded antisymmetry (3.14) being obvious, the main statement of the theorem is of course the validity of the graded Jacobi identity (3.15), which depends crucially on the exact correction terms, that is, the last three terms in the defining equation (3.12). To prove this, we need the following two lemmas:

Lemma 3.9. Let $P$ be a multisymplectic manifold. For any three locally Hamiltonian multivector fields $X$ of degree $r$, $Y$ of degree $s$ and $Z$ of degree $t$ on $P$, we have the cyclic identity
\[ (-1)^{r(t-1)}\, i_X d(i_Y i_Z\omega) + \text{cyclic perm.} = (-1)^{rt}\, d(i_X i_Y i_Z\omega). \tag{3.16} \]

Proof. This is obtained by calculating
\[
\begin{aligned}
i_X d(i_Y i_Z\omega) &= (-1)^{(s-1)t}\, i_X i_{[Y,Z]}\,\omega \\
&= (-1)^{(s-1)t+r(s+t-1)}\, i_{[Y,Z]}\, i_X\omega \\
&= (-1)^{r(s+t-1)}\, \big( L_Y i_Z - (-1)^{(s-1)t}\, i_Z L_Y \big)\, i_X\omega \\
&= (-1)^{r(s+t-1)}\, d(i_Y i_Z i_X\omega) + (-1)^{r(s+t-1)+s-1}\, i_Y d(i_Z i_X\omega) \\
&\quad - (-1)^{r(s+t-1)+(s-1)t}\, i_Z d(i_Y i_X\omega),
\end{aligned}
\]
and multiplying by $(-1)^{rt-r}$.
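As a sanity check on Lemma 3.9, the case of vector fields on a symplectic manifold can be written out explicitly. This specialization is added here for illustration only; it assumes $n = 1$, $r = s = t = 1$, and globally Hamiltonian vector fields with $i_X\omega = df$, $i_Y\omega = dg$, $i_Z\omega = dh$ for functions $f$, $g$, $h$.

```latex
% Lemma 3.9 for r = s = t = 1: all signs (-1)^{r(t-1)} equal +1, and (-1)^{rt} = -1.
\[
  i_X d( i_Y i_Z \omega ) + i_Y d( i_Z i_X \omega ) + i_Z d( i_X i_Y \omega )
  = - d( i_X i_Y i_Z \omega ) .
\]
% For n = 1 the form \omega is a 2-form, so the triple contraction on the right vanishes.
% With i_Z \omega = dh etc., one has i_Y i_Z \omega = i_Y dh = Y(h), and the identity becomes
\[
  X\big( Y(h) \big) + Y\big( Z(f) \big) + Z\big( X(g) \big) = 0 ,
\]
% which is the Jacobi identity for the bracket \{a,b\} := X_a(b), where X_a denotes the
% vector field with i_{X_a}\omega = da:
\[
  \{ f , \{ g , h \} \} + \{ g , \{ h , f \} \} + \{ h , \{ f , g \} \} = 0 .
\]
```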
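Lemma 3.10. Let $P$ be an exact multisymplectic manifold. For any three locally Hamiltonian multivector fields $X$ of degree $r$, $Y$ of degree $s$ and $Z$ of degree $t$ on $P$, we have the cyclic identity
\[
\begin{aligned}
& (-1)^{r(t-1)}\, i_X d(i_Y i_Z\theta) - (-1)^{r(t-1)+s}\, i_X i_Y d(i_Z\theta) + \text{cyclic perm.} \\
&\quad = (-1)^{rt+r+s+t}\, i_X i_Y i_Z\omega + (-1)^{rt}\, d(i_X i_Y i_Z\theta).
\end{aligned}
\tag{3.17}
\]

Proof. This is obtained by calculating
\[
\begin{aligned}
& i_X d(i_Y i_Z\theta) + (-1)^{s-1}\, i_X i_Y d(i_Z\theta) - (-1)^{(s-1)t}\, i_X i_Z d(i_Y\theta) + (-1)^{(s-1)(t-1)}\, i_X i_Z i_Y\omega \\
&\quad = i_X \big( L_Y i_Z - (-1)^{(s-1)t}\, i_Z L_Y \big)\theta \\
&\quad = (-1)^{(s-1)t}\, i_X i_{[Y,Z]}\,\theta \\
&\quad = (-1)^{(s-1)t+r(s+t-1)}\, i_{[Y,Z]}\, i_X\theta \\
&\quad = (-1)^{r(s+t-1)}\, \big( L_Y i_Z - (-1)^{(s-1)t}\, i_Z L_Y \big)\, i_X\theta \\
&\quad = (-1)^{r(s+t-1)}\, d(i_Y i_Z i_X\theta) + (-1)^{r(s+t-1)+s-1}\, i_Y d(i_Z i_X\theta) \\
&\qquad - (-1)^{r(s+t-1)+(s-1)t}\, i_Z d(i_Y i_X\theta) - (-1)^{r(s+t-1)+(s-1)(t-1)}\, i_Z i_Y d(i_X\theta),
\end{aligned}
\]
and multiplying by $(-1)^{rt-r}$.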
Proof of Theorem 3.8. Given any three Poisson forms $f$ of degree $n-r$, $g$ of degree $n-s$ and $h$ of degree $n-t$ and fixing three Hamiltonian multivector fields $X$ of degree $r$, $Y$ of degree $s$ and $Z$ of degree $t$ associated with $f$, with $g$ and with $h$, respectively, we compute the double Poisson bracket
\[
\begin{aligned}
& (-1)^{(r-1)(t-1)}\, \{f,\{g,h\}\} \\
&\quad = (-1)^{(r-1)(t-1)+r(s+t)}\, i_{[Z,Y]}\, i_X\omega + (-1)^{(r-1)(t-1)+(r-1)(s+t)}\, d(i_{[Z,Y]} f) \\
&\qquad - (-1)^{(r-1)(t-1)}\, d(i_X\{g,h\}) - (-1)^{(r-1)(t-1)+(r-1)(s+t-1)}\, d(i_{[Z,Y]}\, i_X\theta) \\
&\quad = -(-1)^{(rs+r+t)+r(s+t-1)+(st+s+t)}\, i_X i_{[Y,Z]}\,\omega \\
&\qquad + (-1)^{(r-1)(s-1)+(t-1)s}\, d(L_Z i_Y f) - (-1)^{(r-1)(s-1)}\, d(i_Y L_Z f) \\
&\qquad - (-1)^{(r-1)(t-1)+s(t-1)}\, d(i_X i_Z i_Y\omega) \\
&\qquad - (-1)^{(r-1)(t-1)+(s-1)(t-1)}\, d(i_X d(i_Z g)) + (-1)^{(r-1)(t-1)}\, d(i_X d(i_Y h)) \\
&\qquad + (-1)^{(r-1)(t-1)+(s-1)t}\, d(i_X d(i_Z i_Y\theta)) \\
&\qquad - (-1)^{(r-1)s+(t-1)s}\, d(L_Z i_Y i_X\theta) + (-1)^{(r-1)s}\, d(i_Y L_Z i_X\theta)
\end{aligned}
\]
\[
\begin{aligned}
={}& -(-1)^{rt+s+t}\, i_X d(i_Y i_Z\omega) \\
& + (-1)^{rs+st+r+t}\, d(i_Z d(i_Y f)) + (-1)^{rs+r+s}\, d(i_Y d(i_Z f)) \\
& \underline{{}- (-1)^{rs+r+s+t}\, d(i_Y i_Z i_X\omega)} \; \underline{{}+ (-1)^{rt+st+r+s+t}\, d(i_X i_Z i_Y\omega)} \\
& - (-1)^{rt+st+r+s}\, d(i_X d(i_Z g)) - (-1)^{rt+r+t}\, d(i_X d(i_Y h)) \\
& - (-1)^{rt+r}\, d(i_X d(i_Y i_Z\theta)) \qquad \leftarrow \\
& + (-1)^{st+t}\, d(i_Z d(i_X i_Y\theta)) \qquad \leftarrow \\
& + (-1)^{rs+s}\, d(i_Y d(i_Z i_X\theta)) - (-1)^{rs+s+t}\, d(i_Y i_Z d(i_X\theta)).
\end{aligned}
\]
In the last expression, the underlined terms cancel each other. Moreover, under the cyclic sum, the terms marked by an arrow cancel each other and the terms containing derivatives of contractions of $f$, $g$, $h$ cancel pairwise, i.e. the expression
\[
\begin{aligned}
& + (-1)^{rs+st+r+t}\, d(i_Z d(i_Y f)) + (-1)^{rs+r+s}\, d(i_Y d(i_Z f)) \\
& - (-1)^{rt+st+r+s}\, d(i_X d(i_Z g)) - (-1)^{rt+r+t}\, d(i_X d(i_Y h)) \\
& + (-1)^{st+tr+s+r}\, d(i_X d(i_Z g)) + (-1)^{st+s+t}\, d(i_Z d(i_X g)) \\
& - (-1)^{sr+tr+s+t}\, d(i_Y d(i_X h)) - (-1)^{sr+s+r}\, d(i_Y d(i_Z f)) \\
& + (-1)^{tr+rs+t+s}\, d(i_Y d(i_X h)) + (-1)^{tr+t+r}\, d(i_X d(i_Y h)) \\
& - (-1)^{ts+rs+t+r}\, d(i_Z d(i_Y f)) - (-1)^{ts+t+s}\, d(i_Z d(i_X g))
\end{aligned}
\]
vanishes. Finally, using the cyclic identities (3.16) and (3.17), we see that the remaining terms sum up as follows:
\[
\begin{aligned}
& (-1)^{(r-1)(t-1)}\, \{f,\{g,h\}\} + \text{cyclic perm.} \\
&\quad = -(-1)^{r+s+t}\, \Big( (-1)^{r(t-1)}\, i_X d(i_Y i_Z\omega) + \text{cyclic perm.} \Big) \\
&\qquad + d\Big( (-1)^{r(t-1)}\, i_X d(i_Y i_Z\theta) - (-1)^{r(t-1)+s}\, i_X i_Y d(i_Z\theta) + \text{cyclic perm.} \Big) \\
&\quad = -(-1)^{r+s+t}\, (-1)^{rt}\, d(i_X i_Y i_Z\omega) \\
&\qquad + d\Big( (-1)^{r+s+t}\, (-1)^{rt}\, i_X i_Y i_Z\omega + (-1)^{rt}\, d(i_X i_Y i_Z\theta) \Big) \\
&\quad = 0.
\end{aligned}
\]
This completes the proof of the main theorem.

Remark. From the definition given in Eq. (3.12), it is obvious that the Poisson bracket between an arbitrary Poisson form $f$ and a closed Poisson form $g$ is exact, since in this case the Hamiltonian multivector field $Y$ associated with $g$ may be
chosen to vanish identically, so that one gets $\{f,g\} = -d(i_X g)$. Therefore, the space of closed Poisson forms is an ideal in the Lie superalgebra of all Poisson forms.

Concluding, it must not go unnoticed that the Poisson bracket between Poisson forms introduced in this paper should be looked upon with a certain amount of caution, for a variety of reasons. One of these is that the space of Poisson forms is a Lie superalgebra but apparently not a Poisson superalgebra, since the Poisson bracket does not act as a superderivation in its second argument with respect to the exterior product of forms, nor does there seem to exist any other naturally defined associative supercommutative product between Poisson forms with that property: this is in contrast to the situation for multivector fields which do form a Poisson superalgebra with respect to the exterior product and the Schouten bracket. There is also a degree problem, since for example, the Poisson bracket between functions would be a form of negative degree, which is always zero: this is, at least at first sight, rather odd. Finally, the question about the relation to the covariant Poisson bracket of Peierls and de Witt mentioned at the end of the introduction remains open.

4. The Universal Multimomentum Map

On exact multisymplectic manifolds, Definition 3.2 can be complemented as follows.

Definition 4.1. A multivector field $X$ on an exact multisymplectic manifold $P$ is called exact Hamiltonian if
\[ L_X\theta = 0. \tag{4.1} \]
The terminology is consistent with that introduced before because exact Hamiltonian multivector fields are Hamiltonian: this is an immediate consequence of Proposition 4.3 below. Thus Proposition 3.3 can be complemented as follows.

Proposition 4.2. The Schouten bracket of any two exact Hamiltonian multivector fields $X$ and $Y$ on an exact multisymplectic manifold $P$ is an exact Hamiltonian multivector field $[X,Y]$ on $P$. This means that the space $\mathfrak{X}^{\wedge}_{EH}(P)$ of exact Hamiltonian multivector fields on $P$ is a subalgebra of the Lie superalgebra $\mathfrak{X}^{\wedge}(P)$ of all multivector fields on $P$ which, according to Eq. (3.6), contains the space $\mathfrak{X}^{\wedge}_{0}(P)$ of multivector fields taking values in the kernel of $\omega$ as an ideal.

Proof. The proposition follows directly from Eq. (A.12).

Exact Hamiltonian multivector fields generate Poisson forms, by contraction with the multicanonical form.

Proposition 4.3. Let $P$ be an exact multisymplectic manifold. For every exact Hamiltonian $r$-multivector field $X$ on $P$, the formula
\[ J(X) = (-1)^{r-1}\, i_X\theta \tag{4.2} \]
defines a Poisson $(n-r)$-form $J(X)$ on $P$ whose associated Hamiltonian multivector field is $X$ itself. In particular, $X$ is Hamiltonian.

Proof. Using Eq. (A.9), we see that the condition (4.1) implies
\[ d(J(X)) = (-1)^{r-1}\, d(i_X\theta) = (-1)^{r-1}\, L_X\theta - i_X d\theta = i_X\omega, \tag{4.3} \]
so $J(X)$ is a Hamiltonian form whose associated Hamiltonian multivector field is $X$ itself. Moreover, the kernel of $J(X)$ on multivectors contains that of $\theta$ which in turn contains that of $\omega$, so $J(X)$ is a Poisson form.

Proposition 4.4. Let $P$ be an exact multisymplectic manifold. The linear map $J$ from the space $\mathfrak{X}^{\wedge}_{EH}(P)$ of exact Hamiltonian multivector fields on $P$ to the space of Poisson forms on $P$ defined by Eq. (4.2) is an antihomomorphism of Lie superalgebras, i.e. we have
\[ \{J(X), J(Y)\} = J([Y,X]). \tag{4.4} \]
Proof. For any two exact Hamiltonian multivector fields $X$ of degree $r$ and $Y$ of degree $s$, we have, according to the defining Eqs. (3.12) and (4.2),
\[
\begin{aligned}
\{J(X), J(Y)\} &= (-1)^{r(s-1)}\, i_Y i_X\omega + (-1)^{(r-1)(s-1)+r-1}\, d(i_Y i_X\theta) \\
&\quad - (-1)^{s-1}\, d(i_X i_Y\theta) - (-1)^{(r-1)s}\, d(i_Y i_X\theta) \\
&= (-1)^{r(s-1)}\, i_Y i_X\omega + (-1)^{(r-1)s}\, d(i_Y i_X\theta),
\end{aligned}
\]
whereas combining Eqs. (A.11), (A.9) and (4.3) gives
\[
\begin{aligned}
J([Y,X]) &= (-1)^{r+s}\, i_{[Y,X]}\,\theta \\
&= (-1)^{r+s+r(s-1)}\, L_Y i_X\theta \qquad \text{since } L_Y\theta = 0 \\
&= (-1)^{(r-1)s}\, d(i_Y i_X\theta) - (-1)^{rs}\, i_Y d(i_X\theta) \\
&= (-1)^{(r-1)s}\, d(i_Y i_X\theta) + (-1)^{r(s-1)}\, i_Y i_X\omega.
\end{aligned}
\]
Obviously, these two expressions coincide.

Remark. This proposition, even when restricted to vector fields and $(n-1)$-forms, constitutes a remarkable improvement over the corresponding Proposition 4.5 of Ref. [4] where, due to an inadequate definition of the Poisson bracket (omitting the exact correction terms, that is, the last three terms in Eq. (3.12)), Eq. (4.4) must be modified by an exact correction term.

Definition 4.5. Let $P$ be an exact multisymplectic manifold. The linear map $J$ from the space $\mathfrak{X}^{\wedge}_{EH}(P)$ of exact Hamiltonian multivector fields on $P$ to the space of Poisson forms on $P$ defined by Eq. (4.2) will be called the universal multimomentum map and its restriction to the space $\mathfrak{X}_{EH}(P)$ of exact Hamiltonian vector fields on $P$ the universal momentum map.
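The antihomomorphism property (4.4) can be checked by hand in the simplest mechanical situation. The following sketch is an illustration added here, not part of the original proof: it assumes $n = 1$, $P = T^*\mathbb{R}$ with coordinates $(q, p)$, $\theta = p\,dq$, the Lie bracket as the Schouten bracket on vector fields, and Cartan's formula for the Lie derivative along vector fields. The two vector fields $X$ and $Y$ are chosen for the example.

```latex
% Setting (assumed): n = 1, \theta = p\,dq, \omega = -d\theta = dq \wedge dp.
\[
  X = \frac{\partial}{\partial q} ,
  \qquad
  Y = q \, \frac{\partial}{\partial q} - p \, \frac{\partial}{\partial p} .
\]
% Both are exact Hamiltonian, using L = d\,i + i\,d on vector fields:
\[
  L_X \theta = dp + i_X ( dp \wedge dq ) = dp - dp = 0 ,
  \qquad
  L_Y \theta = d(pq) + i_Y ( dp \wedge dq ) = ( q\,dp + p\,dq ) - ( p\,dq + q\,dp ) = 0 .
\]
% Universal momentum map, Eq. (4.2) with r = 1:
\[
  J(X) = i_X \theta = p , \qquad J(Y) = i_Y \theta = pq ,
  \qquad
  i_X \omega = dp = dJ(X) , \quad i_Y \omega = q\,dp + p\,dq = dJ(Y) .
\]
% Bracket, Eq. (3.12) with r = s = 1: the exact terms vanish for degree reasons,
% since f, g are 0-forms and \theta is a 1-form.
\[
  \{ J(X) , J(Y) \} = i_Y i_X \omega = i_Y \, dp = -p .
\]
% Lie bracket and comparison:
\[
  [Y,X] = - \frac{\partial}{\partial q} ,
  \qquad
  J([Y,X]) = -p = \{ J(X) , J(Y) \} ,
  \qquad
  J([X,Y]) = +p .
\]
```

The last line shows that the order reversal in Eq. (4.4) matters: $J([X,Y])$ has the opposite sign.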
Remark. The term “universal momentum map” can be justified in the context of Noether’s theorem, dealing with the derivation of conservation laws from symmetries. In classical field theory, conserved quantities are described by Noether currents which depend on the fields of the theory and are $(n-1)$-forms on $n$-dimensional space-time, so that they can be integrated over compact regions in spacelike hypersurfaces in order to provide Noether charges associated with each such region: Noether’s theorem then asserts that when the fields satisfy the equations of motion of the theory, these Noether currents are closed forms. In the multiphase space approach, the Noether currents on space-time are obtained from corresponding Noether current forms defined on (extended) multiphase space via pull-back of differential forms, their entire field dependence being induced by this pull-back. Moreover, there is an explicit procedure to construct these Noether current forms on (extended) multiphase space: it is the field theoretical analogue of the momentum map of Hamiltonian mechanics on cotangent bundles and, in Ref. [4], is called the “special covariant momentum map”. Briefly, given a Lie group $G$, with Lie algebra $\mathfrak{g}$, the statement that $G$ is a symmetry group of a specific theory supposes that we are given an action of $G$ on the configuration bundle $E$ over $M$ by bundle automorphisms, which of course induces actions of $G$ on $JE$ and on $\vec{J}E$, as well as on all of their duals, including $J^{\circledstar}E$ and $\vec{J}^{\circledast}E$, by bundle automorphisms. (In order to speak of a symmetry, we must also assume the Lagrangian or Hamiltonian density to be invariant, or rather equivariant, under the action of $G$, but this aspect is not relevant for the present discussion.) As usual, each of these actions induces an antihomomorphism from $\mathfrak{g}$ to the Lie algebra of vector fields on the corresponding manifold, taking each generator $X$ in $\mathfrak{g}$ to the corresponding fundamental vector field $X_M$, $X_E$, $X_{JE}$, $X_{\vec{J}E}$, $\ldots$, $X_{J^{\circledstar}E}$, $X_{\vec{J}^{\circledast}E}$, all of which (except $X_M$) are projectable: for example, $X_E$ projects to $X_M$ under the tangent map $T\pi : TE \to TM$ to the projection $\pi : E \to M$. Moreover, the vector fields $X_{JE}$, $X_{\vec{J}E}$, $\ldots$, $X_{J^{\circledstar}E}$, $X_{\vec{J}^{\circledast}E}$ can all be obtained from the vector field $X_E$ by a canonical lifting process. In particular, the projectable vector fields $X_{J^{\circledstar}E}$ on $J^{\circledstar}E$ obtained from projectable vector fields $X_E$ on $E$ by lifting are exact Hamiltonian, and conversely, it turns out that all exact Hamiltonian vector fields on $J^{\circledstar}E$ are obtained in this way. (The last statement, analogous to a corresponding statement for cotangent bundles, is not proved in Ref. [4]; it will be derived in Ref. [12].) Now the “special covariant momentum map” of Ref. [4] associated with the symmetry under $G$ is simply given by composing the antihomomorphism that takes generators $X$ in $\mathfrak{g}$ to exact Hamiltonian fundamental vector fields $X_{J^{\circledstar}E}$ on $J^{\circledstar}E$ with the universal momentum map introduced above. Therefore, the universal momentum map comprises that part of the construction of the momentum map in field theory which does not depend on the a priori choice of a symmetry group or its action on the dynamical variables of the theory, and the universal multimomentum map extends that from vector fields to multivector fields.
5. Poisson Forms on Multiphase Space

Our aim in this final section is to give a series of examples for Poisson forms on the extended multiphase space $J^{\circledstar}E$ of field theory. A full, systematic treatment of the subject will be given in a forthcoming separate paper [12].

As a preliminary step, we observe that there is a natural, globally defined notion of vertical vectors and of horizontal covectors on $J^{\circledstar}E$. In fact, there are two such notions, one referring to the “source” projection onto space-time $M$ and the other to the “target” projection onto the total space $E$ of the configuration bundle. In either case, the vertical vectors are those that vanish under the tangent to the projection, while the horizontal covectors are those that vanish on all vertical vectors. In adapted local coordinates,
\[ \frac{\partial}{\partial q^i}\,,\ \frac{\partial}{\partial p^\mu_i} \ \text{ and } \ \frac{\partial}{\partial p} \quad \text{are vertical with respect to the source projection,} \tag{5.1} \]
\[ \frac{\partial}{\partial p^\mu_i} \ \text{ and } \ \frac{\partial}{\partial p} \quad \text{are vertical with respect to the target projection,} \tag{5.2} \]
while
\[ dx^\mu \quad \text{are horizontal with respect to the source projection,} \tag{5.3} \]
\[ dx^\mu \ \text{ and } \ dq^i \quad \text{are horizontal with respect to the target projection.} \tag{5.4} \]
This can be extended to multivectors and exterior forms, as follows. Given positive integers $r$ and $s$ with $s \leq r$, an exterior $r$-form is said to be $s$-horizontal if it vanishes whenever one inserts at least $r-s+1$ vertical vectors (this includes the standard notion of horizontal forms by taking $s = r$), and an $r$-multivector is said to be $s$-vertical if it is annihilated by all $(r-s+1)$-horizontal exterior forms. Using the standard expansion of multivectors and of exterior forms in adapted local coordinates, it is not difficult to see that an $r$-form is $s$-horizontal if and only if it is a linear combination of terms each of which is an exterior product containing at least $s$ horizontal covectors and that an $r$-multivector is $s$-vertical if and only if it is a linear combination of terms each of which is an exterior product containing at least $s$ vertical vectors. Thus for example, Eqs. (2.35)–(2.37) show that $\theta$ and $\omega$ are both $(n-1)$-horizontal with respect to the source projection and even $n$-horizontal with respect to the target projection, while $\Sigma$ is vertical with respect to both projections. In what follows, the terms “vertical” and “horizontal” will always refer to the source projection, except when explicitly stated otherwise.

For later use, we first write down the expansion of a general multivector field $X$ of degree $r$ in terms of adapted local coordinates, as follows:
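\[
\begin{aligned}
X ={}& \frac{1}{r!}\, X^{\mu_1\ldots\mu_r}\, \frac{\partial}{\partial x^{\mu_1}} \wedge \cdots \wedge \frac{\partial}{\partial x^{\mu_r}}
+ \frac{1}{(r-1)!}\, X^{i,\mu_2\ldots\mu_r}\, \frac{\partial}{\partial q^i} \wedge \frac{\partial}{\partial x^{\mu_2}} \wedge \cdots \wedge \frac{\partial}{\partial x^{\mu_r}} \\
& + \frac{1}{r!}\, X_i^{\mu_1\ldots\mu_r}\, \frac{\partial}{\partial p_i^{\mu_1}} \wedge \frac{\partial}{\partial x^{\mu_2}} \wedge \cdots \wedge \frac{\partial}{\partial x^{\mu_r}}
+ \frac{1}{(r-1)!}\, X_0^{\mu_2\ldots\mu_r}\, \frac{\partial}{\partial p} \wedge \frac{\partial}{\partial x^{\mu_2}} \wedge \cdots \wedge \frac{\partial}{\partial x^{\mu_r}} + \xi.
\end{aligned}
\tag{5.5}
\]
Here, all coefficients are assumed to be totally antisymmetric in their space-time indices, whereas $\xi$ is assumed to take values in the kernel of $\omega$. (This can always be achieved without loss of generality, because if we begin by supposing instead that $\xi$ should contain all other terms of the standard expansion, that is, all 2-vertical terms, then $\xi$ would contain just one group of terms that are not obviously annihilated under contraction with $\omega$, namely the terms of the form
\[ \frac{\partial}{\partial q^i} \wedge \frac{\partial}{\partial p_k^\kappa} \wedge \frac{\partial}{\partial x^{\mu_3}} \wedge \cdots \wedge \frac{\partial}{\partial x^{\mu_r}}. \]
However, this part of $\xi$ can be decomposed into the sum of a term which is annihilated under contraction with $\omega$ and a linear combination of the 1-vertical terms
\[ \frac{\partial}{\partial p} \wedge \frac{\partial}{\partial x^{\mu_2}} \wedge \frac{\partial}{\partial x^{\mu_3}} \wedge \cdots \wedge \frac{\partial}{\partial x^{\mu_r}}, \]
so that by a redefinition of the coefficients $X_0^{\mu_2\ldots\mu_r}$ and of $\xi$, we arrive at the expression for $X$ given in Eq. (5.5), with $\xi$ now taking values in the kernel of $\omega$. For a more detailed discussion, see Ref. [12].) Explicitly, the contraction of $\omega$ with $X$ then reads
\[
\begin{aligned}
i_X\omega ={}& \frac{1}{r!}\, X^{\mu_1\ldots\mu_r}\, dq^i \wedge dp_i^\mu \wedge d^n x_{\mu\mu_1\ldots\mu_r}
- \frac{(-1)^r}{r!}\, X^{\mu_1\ldots\mu_r}\, dp \wedge d^n x_{\mu_1\ldots\mu_r} \\
& + \frac{(-1)^{r-1}}{(r-1)!}\, X^{i,\mu_2\ldots\mu_r}\, dp_i^\mu \wedge d^n x_{\mu\mu_2\ldots\mu_r} \\
& + \frac{(-1)^r}{r!}\, X_i^{\mu_1\ldots\mu_r}\, dq^i \wedge d^n x_{\mu_1\ldots\mu_r} \\
& - \frac{1}{(r-1)!}\, X_0^{\mu_2\ldots\mu_r}\, d^n x_{\mu_2\ldots\mu_r},
\end{aligned}
\tag{5.6}
\]
while that of $\theta$ with $X$ reads
\[
\begin{aligned}
i_X\theta ={}& \frac{(-1)^r}{r!}\, X^{\mu_1\ldots\mu_r}\, p_i^\mu\, dq^i \wedge d^n x_{\mu\mu_1\ldots\mu_r}
+ \frac{1}{r!}\, X^{\mu_1\ldots\mu_r}\, p\; d^n x_{\mu_1\ldots\mu_r} \\
& + \frac{1}{(r-1)!}\, X^{i,\mu_2\ldots\mu_r}\, p_i^\mu\, d^n x_{\mu\mu_2\ldots\mu_r},
\end{aligned}
\tag{5.7}
\]
where, in each of the last two equations, the first term is to be omitted if $r = n$, whereas only the last term in the first equation remains and $i_X\theta$ vanishes identically if $r = n+1$.

With these preliminaries out of the way, we can easily deal with the simplest case, which is that of functions.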
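To see what Eqs. (5.6) and (5.7) say in the most familiar case, one can set $r = 1$, so that $X$ is an ordinary vector field and the kernel term $\xi$ vanishes. The specialization below is added for illustration. As a cross-check it assumes the usual coordinate expressions $\theta = p_i^\mu\, dq^i \wedge d^n x_\mu + p\; d^n x$ and $\omega = -d\theta$, which are presumed to agree with Eqs. (2.35)–(2.37) but are not quoted from them; a symbol $d^n x$ with no indices is read as the volume form.

```latex
% Vector field (r = 1):
\[
  X = X^\nu \frac{\partial}{\partial x^\nu}
    + X^i \frac{\partial}{\partial q^i}
    + X_i^\mu \frac{\partial}{\partial p_i^\mu}
    + X_0 \frac{\partial}{\partial p} .
\]
% Eq. (5.6) with r = 1:
\[
  i_X \omega
  = X^\nu \, dq^i \wedge dp_i^\mu \wedge d^n x_{\mu\nu}
  + X^\nu \, dp \wedge d^n x_\nu
  + X^i \, dp_i^\mu \wedge d^n x_\mu
  - X_i^\mu \, dq^i \wedge d^n x_\mu
  - X_0 \, d^n x .
\]
% Eq. (5.7) with r = 1:
\[
  i_X \theta
  = - X^\nu \, p_i^\mu \, dq^i \wedge d^n x_{\mu\nu}
  + X^\nu \, p \; d^n x_\nu
  + X^i \, p_i^\mu \, d^n x_\mu .
\]
% Cross-check (assumed coordinate form of \omega):
%   \omega = dq^i \wedge dp_i^\mu \wedge d^n x_\mu - dp \wedge d^n x ,
% contracted term by term with X, reproduces the first display.
```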
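Proposition 5.1. A function $f$ on $J^{\circledstar}E$ is always a Poisson 0-form. Moreover, in adapted local coordinates, the corresponding Hamiltonian $n$-multivector field $X$ is, modulo terms taking values in the kernel of $\omega$, given by
\[
\begin{aligned}
X ={}& -\frac{1}{(n-1)!}\, \epsilon^{\mu_2\ldots\mu_n\mu} \left( \frac{\partial f}{\partial x^\mu}\, \frac{\partial}{\partial p} - \frac{1}{n}\, \frac{\partial f}{\partial p}\, \frac{\partial}{\partial x^\mu} \right) \wedge \frac{\partial}{\partial x^{\mu_2}} \wedge \cdots \wedge \frac{\partial}{\partial x^{\mu_n}} \\
& + \frac{1}{(n-1)!}\, \epsilon^{\mu_2\ldots\mu_n\mu} \left( \frac{\partial f}{\partial p_i^\mu}\, \frac{\partial}{\partial q^i} - \frac{1}{n}\, \frac{\partial f}{\partial q^i}\, \frac{\partial}{\partial p_i^\mu} \right) \wedge \frac{\partial}{\partial x^{\mu_2}} \wedge \cdots \wedge \frac{\partial}{\partial x^{\mu_n}}.
\end{aligned}
\tag{5.8}
\]

Proof. First of all, observe that for functions $f$, the kernel condition (3.7) is void. Next, we simplify the expression (5.6), with $r = n$, by noting that due to our conventions (2.8), (2.9) and (2.10), we have
\[ d^n x_{\mu_1\ldots\mu_n} = \epsilon_{\mu_1\ldots\mu_n}, \qquad d^n x_{\mu_2\ldots\mu_n} = \epsilon_{\mu_2\ldots\mu_n\mu}\, dx^\mu. \tag{5.9} \]
Thus
\[
\begin{aligned}
i_X\omega ={}& -\frac{(-1)^n}{n!}\, \epsilon_{\mu_1\ldots\mu_n}\, X^{\mu_1\ldots\mu_n}\, dp
+ \frac{1}{(n-1)!}\, \epsilon_{\mu_2\ldots\mu_n\mu}\, X^{i,\mu_2\ldots\mu_n}\, dp_i^\mu \\
& - \frac{1}{(n-1)!}\, \epsilon_{\mu_2\ldots\mu_n\mu}\, X_i^{\mu,\mu_2\ldots\mu_n}\, dq^i
- \frac{1}{(n-1)!}\, \epsilon_{\mu_2\ldots\mu_n\mu}\, X_0^{\mu_2\ldots\mu_n}\, dx^\mu.
\end{aligned}
\tag{5.10}
\]
Equating this expression with the exterior derivative of $f$, we obtain the following system of equations
\[ X^{\mu_1\ldots\mu_n} = (-1)^{n-1}\, \epsilon^{\mu_1\ldots\mu_n}\, \frac{\partial f}{\partial p}, \tag{5.11} \]
\[ X^{i,\mu_2\ldots\mu_n} = \epsilon^{\mu_2\ldots\mu_n\mu}\, \frac{\partial f}{\partial p_i^\mu}, \tag{5.12} \]
\[ X_i^{\mu,\mu_2\ldots\mu_n} = -\epsilon^{\mu_2\ldots\mu_n\mu}\, \frac{1}{n}\, \frac{\partial f}{\partial q^i}, \tag{5.13} \]
\[ X_0^{\mu_2\ldots\mu_n} = -\epsilon^{\mu_2\ldots\mu_n\mu}\, \frac{\partial f}{\partial x^\mu}. \tag{5.14} \]
Inserting this back into Eq. (5.5), with $r = n$, and rearranging the terms, we arrive at Eq. (5.8).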
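For orientation, Eq. (5.8) can be checked directly in the case $n = 1$, where it reduces to a statement about vector fields on the extended phase space of time-dependent mechanics. The sketch below is an added illustration; it assumes coordinates $(t, q^i, p_i, p)$, the convention that $\epsilon$ with a single index equals 1, and the coordinate form $\omega = dq^i \wedge dp_i - dp \wedge dt$, none of which is quoted from the text.

```latex
% Eq. (5.8) for n = 1: no wedge factors, 1/(n-1)! = 1, 1/n = 1.
\[
  X = \frac{\partial f}{\partial p} \, \frac{\partial}{\partial t}
    - \frac{\partial f}{\partial t} \, \frac{\partial}{\partial p}
    + \frac{\partial f}{\partial p_i} \, \frac{\partial}{\partial q^i}
    - \frac{\partial f}{\partial q^i} \, \frac{\partial}{\partial p_i} .
\]
% Contraction with \omega = dq^i \wedge dp_i - dp \wedge dt (assumed):
\[
  i_X ( dq^i \wedge dp_i )
  = \frac{\partial f}{\partial p_i} \, dp_i + \frac{\partial f}{\partial q^i} \, dq^i ,
  \qquad
  i_X ( - dp \wedge dt )
  = \frac{\partial f}{\partial t} \, dt + \frac{\partial f}{\partial p} \, dp ,
\]
% so that, as required for a Hamiltonian multivector field associated with f,
\[
  i_X \omega = df .
\]
```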
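Remark. It has been shown in Ref. [31] that for functions $h$ on $J^{\circledstar}E$ of the special form
\[ h(x^\mu, q^i, p_i^\mu, p) = -H(x^\mu, q^i, p_i^\mu) - p, \tag{5.15} \]
the associated Hamiltonian multivector field $X$ can be chosen so that it defines an $n$-dimensional distribution in $J^{\circledstar}E$ because it is locally decomposable, that is, locally there exist vector fields $X_1, \ldots, X_n$ such that $X = X_1 \wedge \cdots \wedge X_n$ satisfies the equation $i_X\omega = dh$. Indeed, setting
\[ X_\mu = -\frac{\partial}{\partial x^\mu} + \frac{\partial h}{\partial p_i^\mu}\, \frac{\partial}{\partial q^i} - \frac{1}{n}\, \frac{\partial h}{\partial q^i}\, \frac{\partial}{\partial p_i^\mu} - \left( \frac{\partial h}{\partial x^\mu} - \frac{1}{n}\, \frac{\partial h}{\partial q^i}\, \frac{\partial h}{\partial p_i^\mu} \right) \frac{\partial}{\partial p}, \tag{5.16} \]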
we can convince ourselves that this choice of $X$ and the choice of $X$ made in Eq. (5.8) differ by a term taking values in the kernel of $\omega$. Under additional assumptions, this distribution will be integrable and its integral manifolds will be the images of sections of $J^{\circledstar}E$ over $M$ satisfying the covariant Hamiltonian equations of motion, or De Donder-Weyl equations.

Another method for constructing Poisson forms on the extended multiphase space $J^{\circledstar}E$ is from Hamiltonian forms on the ordinary multiphase space $\vec{J}^{\circledast}E$, as introduced by Kanatchikov [1, 2], pulling these back to $J^{\circledstar}E$ via the appropriate projection.

To describe the salient features of Kanatchikov’s construction, one must first of all introduce a structure on $\vec{J}^{\circledast}E$ similar to the multisymplectic form $\omega$ that exists naturally on $J^{\circledstar}E$. This requires the choice of a connection in $E$ and of a linear connection in $TM$ which, for the sake of convenience, will be assumed to be torsion free. Together, they induce connections in all the other bundles that are important in the multiphase space approach to field theory, including the multiphase spaces $\vec{J}^{\circledast}E$ and $J^{\circledstar}E$; for the convenience of the reader, the relevant formulas in adapted local coordinates are collected in Appendix B. In the case of $\vec{J}^{\circledast}E$, this induced connection can be used to define a “vertical multisymplectic form” $\omega^V$ which is however not closed; instead, it is annihilated under the action of a “vertical exterior derivative” $d^V$ for differential forms. In adapted local coordinates, these objects can be written in the form
\[ \omega^V = e^i \wedge e_i^\mu \wedge d^n x_\mu + \cdots \tag{5.17} \]
and
\[ d^V = e^i \wedge \frac{\partial}{\partial q^i} + e_i^\mu \wedge \frac{\partial}{\partial p_i^\mu} \tag{5.18} \]
respectively, where $e^i = dq^i + \Gamma^i_\nu\, dx^\nu$ and $e_i^\mu = dp_i^\mu - (\partial_i \Gamma^j_\kappa\, p_j^\mu - \Gamma^\mu_{\kappa\lambda}\, p_i^\lambda + \Gamma^\rho_{\kappa\rho}\, p_i^\mu)\, dx^\kappa$ are vertical 1-forms (with respect to the aforementioned induced connection): the dots in the definition of $\omega^V$ indicate $n$-horizontal terms that are not important here, while the partial derivatives in the definition of $d^V$ are meant to act on the coefficient functions. As shown by one of the present authors [32], $d^V$ is still a cohomology operator, i.e. it has square zero. Then the Hamiltonian forms as defined by Kanatchikov can be shown to be precisely the horizontal forms $\tilde{f}$ on $\vec{J}^{\circledast}E$ satisfying the equation
\[ i_{\tilde{X}}\, \omega^V = d^V \tilde{f}, \tag{5.19} \]
where $\tilde{X}$ is a multivector field on $\vec{J}^{\circledast}E$; this relation is of course completely analogous to our equation (3.3)/(3.4). Moreover, Kanatchikov introduces a Poisson bracket between Hamiltonian forms $\tilde{f}$ of degree $n-r$ and $\tilde{g}$ of degree $n-s$, with multivector fields $\tilde{X}$ of degree $r$ and $\tilde{Y}$ of degree $s$ corresponding to $\tilde{f}$ and to $\tilde{g}$ according to Eq. (5.19), by setting
\[ \{\tilde{f}, \tilde{g}\}^V = (-1)^{r(s-1)}\, i_{\tilde{Y}}\, i_{\tilde{X}}\, \omega^V. \tag{5.20} \]
This Poisson bracket satisfies the analogue of the graded Jacobi identity (3.15).
November 5, 2003 9:42 WSPC/148-RMP
00173
The Poisson Bracket for Poisson Forms in Multisymplectic Field Theory
733
We will now show how this approach can be naturally incorporated into the multisymplectic framework used in the present paper.

Proposition 5.2. Under the canonical projection from extended multiphase space $J^{\circledast}E$ to ordinary multiphase space $\vec{J}^{\circledast}E$, every Hamiltonian form $\tilde{f}$ on $\vec{J}^{\circledast}E$ as defined by Kanatchikov pulls back to a horizontal Poisson form $f$ on $J^{\circledast}E$. Conversely, every horizontal Poisson form $f$ of degree $> 0$ on $J^{\circledast}E$ is obtained in this way. Moreover, the Hamiltonian multivector field $X$ on $J^{\circledast}E$ corresponding to $f$ can be chosen so as to project to a Hamiltonian multivector field $\tilde{X}$ on $\vec{J}^{\circledast}E$ corresponding to $\tilde{f}$.

Proof. We begin by analyzing the properties of Poisson forms $f$ of degree $n - r$ ($0 < r < n$) on $J^{\circledast}E$ which are horizontal. Being horizontal, such a form trivially satisfies the kernel condition (3.7) and its expansion in adapted local coordinates is

$$f = \frac{1}{r!} \, f^{\mu_1 \ldots \mu_r} \, d^n x_{\mu_1 \ldots \mu_r} ,$$

implying

$$df = \frac{1}{(r-1)!} \frac{\partial f^{\mu_2 \ldots \mu_r \nu}}{\partial x^\nu} \, d^n x_{\mu_2 \ldots \mu_r} + \frac{1}{r!} \frac{\partial f^{\mu_1 \ldots \mu_r}}{\partial q^i} \, dq^i \wedge d^n x_{\mu_1 \ldots \mu_r} + \frac{1}{r!} \frac{\partial f^{\mu_1 \ldots \mu_r}}{\partial p^\kappa_k} \, dp^\kappa_k \wedge d^n x_{\mu_1 \ldots \mu_r} + \frac{1}{r!} \frac{\partial f^{\mu_1 \ldots \mu_r}}{\partial p} \, dp \wedge d^n x_{\mu_1 \ldots \mu_r} .$$

Comparing this formula with Eq. (5.6), we see that $f$ being a Hamiltonian form implies first of all that $X$ must be 1-vertical, since the coefficients $X^{\mu_1 \ldots \mu_r}$ give a contribution to $i_X \omega$ proportional to $dq^i \wedge dp^\mu_i \wedge d^n x_{\mu \mu_1 \ldots \mu_r}$ which is absent from $df$. But this implies that $i_X \omega$ contains no terms proportional to $dp \wedge d^n x_{\mu_1 \ldots \mu_r}$ either, and hence the coefficients $f^{\mu_1 \ldots \mu_r}$ cannot depend on the energy variable $p$; the same then goes for all the coefficients of $X$. Therefore, $f$ is the pull-back of a horizontal form $\tilde{f}$ on $\vec{J}^{\circledast}E$, whereas $X$ projects onto a 1-vertical multivector field $\tilde{X}$ on $\vec{J}^{\circledast}E$ whose expansion in terms of adapted local coordinates is given by the second and third term in Eq. (5.5). Finally, we see that with these relations between the various objects involved, Eq. (3.3, 3.4) becomes equivalent to Eq. (5.19) plus the relation

$$X_0^{\mu_2 \ldots \mu_r} = - \frac{\partial f^{\mu_2 \ldots \mu_r \nu}}{\partial x^\nu} ,$$

which has no counterpart in $\vec{J}^{\circledast}E$ but also does not convey any additional information.

Finally, the fact that the Poisson bracket (5.20) introduced by Kanatchikov, when pulled back from $\vec{J}^{\circledast}E$ to $J^{\circledast}E$, coincides with the Poisson bracket defined by Eq. (3.12) follows from the following simple observation.

Proposition 5.3. Let $f$ and $g$ be two horizontal Poisson forms on $J^{\circledast}E$ of respective degrees $n - r$ and $n - s$, with corresponding 1-vertical Hamiltonian multivector
November 5, 2003 9:42 WSPC/148-RMP
734
00173
M. Forger, C. Paufler & H. R¨ omer
fields $X$ and $Y$ of respective degrees $r$ and $s$. Then the definition (3.12) of their Poisson bracket reduces to the pull-back of Eq. (5.20):

$$\{f, g\} = (-1)^{r(s-1)} \, i_Y i_X \omega . \tag{5.21}$$

Proof. As we have seen in the proof of the preceding proposition, $f$ and $g$ being horizontal forces $X$ and $Y$ to be 1-vertical, so $i_Y f$ and $i_X g$ vanish. Similarly, Eq. (5.7) shows that $i_X \theta$ and $i_Y \theta$ are horizontal, so $i_Y i_X \theta$ and $i_X i_Y \theta$ vanish. Therefore, the exact correction term of Eq. (3.12) does not contribute in this case. Finally, $X \wedge Y$ will be 2-vertical, so contraction of the pull-back of $\omega^V$ or of $\omega$ with $X$ and $Y$ gives the same result, implying that Eq. (5.21) is really the pull-back of Eq. (5.20).

Remark. In the case of horizontal Poisson forms, one can also introduce an associative product, which has been found by Kanatchikov [2]:

$$f \bullet g = *^{-1} (*f \wedge *g) , \tag{5.22}$$

where $*$ is the Hodge star operator on $M$ associated to some metric, which can be transported to horizontal forms on $J^{\circledast}E$ in an obvious manner. With respect to this product, the Poisson bracket (5.21) satisfies a graded Leibniz rule

$$\{f, g \bullet h\} = \{f, g\} \bullet h + (-1)^{(r-1)s} \, g \bullet \{f, h\} . \tag{5.23}$$

However, this product cannot be extended in any natural way to arbitrary Poisson forms. To see this, suppose we had such an extension at hand. Then we could define a space of vertical covectors at every point of $J^{\circledast}E$ by requiring it to consist of all covectors that vanish when multiplied by a horizontal $(n-1)$-form, which would be equivalent to the choice of a connection.

Appendix A. Multivector calculus on manifolds

The extension of the usual calculus on manifolds from vector fields to multivector fields is by now well known, although it does not seem to be treated in any of the standard textbooks on the subject. Moreover, there is a certain amount of ambiguity concerning sign conventions. Our sign conventions follow those of Tulczyjew [33], but for the sake of completeness we shall briefly expose the structural properties that naturally motivate these choices.

Multivector fields of degree $r$ on a manifold are sections of the $r$th exterior power of its tangent bundle: they are the dual objects to differential forms of degree $r$, which are sections of the $r$th exterior power of its cotangent bundle. Every known natural operation involving vector fields, such as the contraction on differential forms, the Lie bracket and the Lie derivative, has a natural extension to multivector fields: this is the subject of an area of differential geometry that we simply refer to as "multivector calculus". The most important and the ones that we need in this
paper are (a) the Schouten bracket between multivector fields, (b) the contraction of a differential form with a multivector field and (c) the Lie derivative of a differential form along a multivector field.

Throughout this appendix, let $M$ be an $n$-dimensional manifold, $\mathfrak{F}(M)$ the commutative algebra of functions on $M$ (with respect to pointwise multiplication), $\mathfrak{X}(M)$ the space of vector fields on $M$ and

$$\mathfrak{X}^{\wedge}(M) = \bigoplus_{r=0}^{n} \bigwedge\nolimits^{r} \mathfrak{X}(M)$$

the supercommutative superalgebra of multivector fields on $M$ (with respect to pointwise exterior multiplication).

A.1. The Schouten bracket

The Schouten bracket between multivector fields constitutes the natural, canonical extension both of the Lie bracket between vector fields and of the Lie derivative of multivector fields (as special tensor fields) along vector fields. Starting from the Lie derivative of multivector fields along vector fields, it can be defined by imposing a Leibniz rule with respect to the exterior product of multivector fields, as in Eq. (A.4) below.

Proposition A.1. There exists a unique $\mathbb{R}$-bilinear map

$$[\cdot\,, \cdot] : \mathfrak{X}^{\wedge}(M) \times \mathfrak{X}^{\wedge}(M) \to \mathfrak{X}^{\wedge}(M) \tag{A.1}$$

called the Schouten bracket, with the following properties.

1. It is homogeneous of degree $-1$ with respect to the standard tensor degree, i.e.

$$\deg X = r , \ \deg Y = s \ \Rightarrow \ \deg [X, Y] = r + s - 1 . \tag{A.2}$$

2. It is graded antisymmetric: if $X$ has tensor degree $r$ and $Y$ has tensor degree $s$, then

$$[Y, X] = -(-1)^{(r-1)(s-1)} \, [X, Y] . \tag{A.3}$$

3. It coincides with the standard Lie bracket on vector fields.

4. It satisfies the graded Leibniz rule: if $X$ has tensor degree $r$, $Y$ has tensor degree $s$ and $Z$ has tensor degree $t$, then

$$[X, Y \wedge Z] = [X, Y] \wedge Z + (-1)^{(r-1)s} \, Y \wedge [X, Z] . \tag{A.4}$$

5. It satisfies the graded Jacobi identity: if $X$ has tensor degree $r$, $Y$ has tensor degree $s$ and $Z$ has tensor degree $t$, then

$$(-1)^{(r-1)(t-1)} \, [X, [Y, Z]] + \text{cyclic perm.} = 0 . \tag{A.5}$$
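As a quick illustration of the sign rule (A.3), the following worked special cases (added here as a sketch, not part of the original argument) simply evaluate the exponent $(r-1)(s-1)$ in low degrees:

```latex
% Graded antisymmetry (A.3), [Y,X] = -(-1)^{(r-1)(s-1)} [X,Y], in low degrees
\begin{align*}
r = s = 1 &: \quad [Y, X] = -[X, Y]
  && \text{(two vector fields: the ordinary Lie bracket)} \\
r = 1,\ s = 2 &: \quad [Y, X] = -[X, Y]
  && \text{(vector field and bivector field)} \\
r = s = 2 &: \quad [Y, X] = +[X, Y]
  && \text{(two bivector fields)}
\end{align*}
```

In particular, the bracket is symmetric on bivector fields, so $[\Pi, \Pi]$ need not vanish for a bivector field $\Pi$; its vanishing is the familiar integrability condition for a Poisson bivector.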
We shall not prove this proposition here but just point out that uniqueness of an operation with the properties stipulated above follows from the required $\mathbb{R}$-bilinearity (not $\mathfrak{F}(M)$-bilinearity, of course), the homogeneity (A.2), the graded antisymmetry (A.3) and the graded Leibniz rule (A.4) alone; existence can then be proved, for example, by showing that the resulting local coordinate formula satisfies all these requirements. Moreover, the validity of the graded Jacobi identity (A.5) can be derived from the standard Jacobi identity for the Lie bracket of vector fields by means of the graded Leibniz rule (A.4), using induction on the degree.

An explicit formula which is slightly more general than the local coordinate formula just mentioned and often useful in practical applications is that for the Schouten bracket between decomposable multivector fields; it follows directly from the same kind of argument and states that for any $r + s$ vector fields $X_1, \ldots, X_r$ and $Y_1, \ldots, Y_s$, we have

$$[X_1 \wedge \cdots \wedge X_r , Y_1 \wedge \cdots \wedge Y_s] = \sum_{i=1}^{r} \sum_{j=1}^{s} (-1)^{i+j} \, [X_i, Y_j] \wedge X_1 \wedge \cdots \wedge X_{i-1} \wedge X_{i+1} \wedge \cdots \wedge X_r \wedge Y_1 \wedge \cdots \wedge Y_{j-1} \wedge Y_{j+1} \wedge \cdots \wedge Y_s . \tag{A.6}$$

Note also that there is a graded Leibniz rule in the other factor as well: it follows from the one written down above by using graded antisymmetry and, in the form used in the proof of Proposition A.3 below, reads

$$[X \wedge Y, Z] = (-1)^{(t-1)s} \, [X, Z] \wedge Y + X \wedge [Y, Z] . \tag{A.7}$$
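As a consistency check (a sketch added for illustration), one can specialize the decomposable formula (A.6) to a single vector field $X$ ($r = 1$) and a decomposable bivector field $Y_1 \wedge Y_2$ ($s = 2$) and compare with the Leibniz rule (A.4):

```latex
% (A.6) with r = 1, s = 2: the double sum has the two terms (i,j) = (1,1), (1,2)
\begin{align*}
[X, Y_1 \wedge Y_2]
  &= (-1)^{1+1} \, [X, Y_1] \wedge Y_2 + (-1)^{1+2} \, [X, Y_2] \wedge Y_1 \\
  &= [X, Y_1] \wedge Y_2 - [X, Y_2] \wedge Y_1 \\
  &= [X, Y_1] \wedge Y_2 + Y_1 \wedge [X, Y_2] .
\end{align*}
```

The last step uses only the anticommutativity of the exterior product of two vector fields. The result is exactly (A.4) with $r = s = t = 1$, where the sign $(-1)^{(r-1)s}$ equals $+1$.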
Finally, a word seems in order on the adequate choice of signs and degrees. Indeed, one recognizes Eqs. (A.2), (A.3) and (A.5) as the graded homogeneity, the graded antisymmetry and the graded Jacobi identity familiar from the definition of a Lie superalgebra, provided one assigns to every multivector field $X$ of tensor degree $r$ the parity $(-1)^{r-1}$: this means that $X$ is even with respect to the Schouten bracket if it has odd tensor degree and is odd with respect to the Schouten bracket if it has even tensor degree! This switch can be better understood by realizing that, by (A.2), the operator $\mathrm{ad}(X) = [X, \,\cdot\,]$ changes the tensor degree of any multivector field that it operates on by $r - 1$. The same argument explains the sign that appears in the graded Leibniz identity (A.4), which can be thought of as stating that the operator $\mathrm{ad}(X) = [X, \,\cdot\,]$ should be a superderivation with respect to the exterior product and, more precisely, an even or odd superderivation according to whether $X$ is even or odd with respect to the Schouten bracket. We can also think of this operator as defining the Lie derivative $L_X$ of multivector fields along $X$ (possibly up to signs, which are a matter of convention), but this will not be needed here. Algebraically, the situation can be summarized by stating that $\mathfrak{X}^{\wedge}(M)$ is a Poisson superalgebra, the supersymmetric analogue of a Poisson algebra, which is the structure encountered, for example, on the space of functions on a symplectic manifold or, more generally, a Poisson manifold. The surprising aspect is that this
intricate structure requires no additional structure whatsoever on the underlying manifold.

A.2. Lie derivative of differential forms along multivector fields

We now come to the other two operations of multivector calculus mentioned at the beginning of this appendix, namely the contraction of differential forms with multivector fields and the Lie derivative of differential forms along multivector fields.

The case of contraction is easy. First, the contraction of a differential form $\alpha$ with a decomposable multivector field $X_1 \wedge \cdots \wedge X_r$ is simply defined as repeated contraction with its constituents (which by convention should be performed in the opposite order):

$$i_{X_1 \wedge \cdots \wedge X_r} \alpha = i_{X_r} \cdots i_{X_1} \alpha . \tag{A.8}$$

This is then extended to arbitrary (non-decomposable) multivector fields $X$ by $\mathfrak{F}(M)$-linearity. (Here, of course, one uses that contraction is a purely algebraic operation; it would not work so naively if we were dealing with a differential operator.)

The Lie derivative $L_X \alpha$ of a differential form $\alpha$ along a multivector field $X$ is most conveniently defined by a generalization of a well known formula for vector fields.

Definition A.2. On differential forms, the Lie derivative $L_X$ along a multivector field $X$ of tensor degree $r$ is defined as the supercommutator of the exterior derivative $d$ and the contraction operator $i_X$:

$$L_X \alpha = d \, i_X \alpha - (-1)^{r} \, i_X \, d\alpha . \tag{A.9}$$
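To see what the definition (A.9) amounts to in the lowest degrees, the following sketch (added for illustration) writes it out for $r = 0, 1, 2$. It assumes that contraction with a function $f$, viewed as a multivector field of degree 0, is just multiplication by $f$, and for $r = 2$ it takes $X = X_1 \wedge X_2$ decomposable so that (A.8) applies:

```latex
\begin{align*}
r = 0,\ X = f &: \quad L_f \alpha = d(f\alpha) - f \, d\alpha = df \wedge \alpha , \\
r = 1 &: \quad L_X \alpha = d \, i_X \alpha + i_X \, d\alpha
  \quad \text{(Cartan's formula)} , \\
r = 2,\ X = X_1 \wedge X_2 &: \quad
  L_X \alpha = d \, i_{X_2} i_{X_1} \alpha - i_{X_2} i_{X_1} \, d\alpha .
\end{align*}
```

The alternating sign in front of the second term is exactly the supercommutator sign: $d$ is odd, and $i_X$ is odd for $r = 1$ but even for $r = 0$ and $r = 2$.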
According to the rules of supersymmetry, the sign of the second term is fixed by observing that $d$ is an odd operator (it is of degree 1 since it raises the tensor degree of forms by 1) while $i_X$ is an even/odd operator if $r$ is even/odd (it is of degree $-r$ since it lowers the tensor degree of forms by $r$).

Proposition A.3. Given any two multivector fields $X$ of tensor degree $r$ and $Y$ of tensor degree $s$, we have for any differential form $\alpha$

$$d L_X \alpha = (-1)^{r-1} \, L_X \, d\alpha , \tag{A.10}$$

$$i_{[X,Y]} \alpha = (-1)^{(r-1)s} \, L_X i_Y \alpha - i_Y L_X \alpha , \tag{A.11}$$

$$L_{[X,Y]} \alpha = (-1)^{(r-1)(s-1)} \, L_X L_Y \alpha - L_Y L_X \alpha , \tag{A.12}$$

$$L_{X \wedge Y} \alpha = (-1)^{s} \, i_Y L_X \alpha + L_Y i_X \alpha . \tag{A.13}$$
Proof. The first formula (A.10) is an immediate consequence of the definition (A.9), since $d^2 = 0$. Next, the last formula (A.13) can be proved by direct calculation:

$$\begin{aligned}
L_{X \wedge Y} \alpha &= d(i_{X \wedge Y} \alpha) - (-1)^{r+s} \, i_{X \wedge Y} \, d\alpha \\
&= d(i_Y i_X \alpha) - (-1)^{r+s} \, i_Y i_X \, d\alpha \\
&= d(i_Y i_X \alpha) - (-1)^{s} \, i_Y \, d(i_X \alpha) + (-1)^{s} \, i_Y \, d(i_X \alpha) - (-1)^{r+s} \, i_Y i_X \, d\alpha \\
&= L_Y i_X \alpha + (-1)^{s} \, i_Y L_X \alpha .
\end{aligned}$$

Next, observe that the second formula (A.11) is well known to be true when $X$ and $Y$ are vector fields. The general case follows by induction on the tensor degree of both factors. Indeed, if $X$, $Y$ and $Z$ are multivector fields of tensor degree $r$, $s$ and $t$, respectively, such that Eq. (A.11) holds for $[X, Y]$ and for $[X, Z]$, one can use the graded Leibniz rule (A.4) to derive that it also holds for $[X, Y \wedge Z]$:

$$\begin{aligned}
i_{[X, Y \wedge Z]} \alpha &= i_{[X,Y] \wedge Z} \, \alpha + (-1)^{(r-1)s} \, i_{Y \wedge [X,Z]} \, \alpha \\
&= i_Z i_{[X,Y]} \alpha + (-1)^{(r-1)s} \, i_{[X,Z]} i_Y \alpha \\
&= (-1)^{(r-1)s} \, i_Z L_X i_Y \alpha - i_Z i_Y L_X \alpha + (-1)^{(r-1)s + (r-1)t} \, L_X i_Z i_Y \alpha - (-1)^{(r-1)s} \, i_Z L_X i_Y \alpha \\
&= (-1)^{(r-1)(s+t)} \, L_X i_{Y \wedge Z} \alpha - i_{Y \wedge Z} L_X \alpha .
\end{aligned}$$

Similarly, if $X$, $Y$ and $Z$ are multivector fields of tensor degree $r$, $s$ and $t$, respectively, such that Eq. (A.11) holds for $[X, Z]$ and for $[Y, Z]$, one can use the graded Leibniz rule (A.7) together with Eq. (A.13) to derive that it also holds for $[X \wedge Y, Z]$:

$$\begin{aligned}
i_{[X \wedge Y, Z]} \alpha &= (-1)^{(t-1)s} \, i_{[X,Z] \wedge Y} \, \alpha + i_{X \wedge [Y,Z]} \, \alpha \\
&= (-1)^{(t-1)s} \, i_Y i_{[X,Z]} \alpha + i_{[Y,Z]} i_X \alpha \\
&= (-1)^{(t-1)s + (r-1)t} \, i_Y L_X i_Z \alpha - (-1)^{(t-1)s} \, i_Y i_Z L_X \alpha + (-1)^{(s-1)t} \, L_Y i_Z i_X \alpha - i_Z L_Y i_X \alpha \\
&= (-1)^{(r+s-1)t + s} \, i_Y L_X i_Z \alpha - (-1)^{s} \, i_Z i_Y L_X \alpha + (-1)^{(r+s-1)t} \, L_Y i_X i_Z \alpha - i_Z L_Y i_X \alpha \\
&= (-1)^{(r+s-1)t} \, L_{X \wedge Y} i_Z \alpha - i_Z L_{X \wedge Y} \alpha .
\end{aligned}$$

Finally, the third formula (A.12) can now again be proved by direct calculation:

$$\begin{aligned}
L_{[X,Y]} \alpha &= d \, i_{[X,Y]} \alpha + (-1)^{r+s} \, i_{[X,Y]} \, d\alpha \\
&= (-1)^{(r-1)s} \, d L_X i_Y \alpha - d \, i_Y L_X \alpha + (-1)^{r(s-1)} \, L_X i_Y \, d\alpha - (-1)^{r+s} \, i_Y L_X \, d\alpha
\end{aligned}$$
$$\begin{aligned}
&= (-1)^{(r-1)(s-1)} \, L_X \, d \, i_Y \alpha - d \, i_Y L_X \alpha + (-1)^{r(s-1)} \, L_X i_Y \, d\alpha + (-1)^{s} \, i_Y \, d L_X \alpha \\
&= (-1)^{(r-1)(s-1)} \, L_X L_Y \alpha - L_Y L_X \alpha .
\end{aligned}$$

Appendix B. Induced connections

In this appendix we want to describe briefly the construction of various induced connections in jet bundle language. First of all, if $E$ is a fiber bundle over $M$, we shall view a connection in $E$ as a section $\Gamma_E$ of the first order jet bundle $JE$ of $E$, considered as an affine bundle over $E$; see [34, Ch. IV.17]. In adapted local coordinates $(x^\mu, q^i)$ for $E$ and $(x^\mu, q^i, q^i_\mu)$ for $JE$, this section is given by

$$\Gamma_E : (x^\mu, q^i) \mapsto (x^\mu, q^i, \Gamma^i_\mu(x, q)) .$$

Next, if $V$ is a vector bundle over $M$, a linear connection in $V$ is given by a section $\Gamma_V$ of $JV$ over $V$ that depends linearly on the fiber coordinates. In adapted local coordinates $(x^\mu, v^i)$ for $V$ and $(x^\mu, v^i, v^i_\mu)$ for $JV$, this section is given by

$$\Gamma_V : (x^\mu, v^i) \mapsto (x^\mu, v^i, \Gamma^i_{\mu j}(x) \, v^j) ,$$

where the $\Gamma^i_{\mu j}$ are of course the connection coefficients (gauge potentials) associated with the corresponding covariant derivative. In particular, a linear connection in the tangent bundle $TM$ of the base manifold $M$ corresponds to a section $\Gamma_{TM}$ of $J(TM)$ over $TM$ which, in adapted local coordinates $(x^\mu, \dot{x}^\kappa)$ for $TM$ and $(x^\mu, \dot{x}^\kappa, \dot{x}^\kappa_\mu)$ for $J(TM)$, is given by

$$\Gamma_{TM} : (x^\mu, \dot{x}^\kappa) \mapsto (x^\mu, \dot{x}^\kappa, \Gamma^\kappa_{\mu\lambda}(x) \, \dot{x}^\lambda) ,$$

where the $\Gamma^\kappa_{\mu\lambda}$ are of course the corresponding Christoffel symbols.

Now given a fiber bundle $E$ over $M$ together with a connection in $E$ and a linear connection in $TM$, we can introduce induced connections in all the various induced bundles that appear in this paper, regarded as fiber bundles over $M$, not over $E$. (This means that jets of sections will contain just one additional lower space-time index for counting partial derivatives with respect to the space-time variables.) The simplest way to describe them is by introducing adapted local coordinates $(x^\mu, q^i)$ for $E$ as before; then the local coefficient functions of the induced connections with respect to the induced adapted local coordinates can be expressed directly in terms of the local coefficient functions $\Gamma^i_\mu$ and $\Gamma^\kappa_{\mu\lambda}$ of the original two connections with respect to the original adapted local coordinates, as follows.

• The vertical bundle $VE$ of $E$: in adapted local coordinates $(x^\mu, q^i, \dot{q}^k)$ for $VE$ and $(x^\mu, q^i, \dot{q}^k, q^i_\mu, \dot{q}^k_\mu)$ for $J(VE)$, the induced connection maps $(x^\mu, q^i, \dot{q}^k)$ to
$$(x^\mu, q^i, \dot{q}^k, \Gamma^i_\mu(x, q), \partial_l \Gamma^k_\mu(x, q) \, \dot{q}^l) .$$
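• The dual vertical bundle $V^*E$ of $E$: in adapted local coordinates $(x^\mu, q^i, p_k)$ for $V^*E$ and $(x^\mu, q^i, p_k, q^i_\mu, p_{\mu,k})$ for $J(V^*E)$, the induced connection maps $(x^\mu, q^i, p_k)$ to
$$(x^\mu, q^i, p_k, \Gamma^i_\mu(x, q), -\partial_k \Gamma^l_\mu(x, q) \, p_l) .$$

• The pull-back $\pi^*(TM)$ of the tangent bundle $TM$ of $M$ to $E$: in adapted local coordinates $(x^\mu, q^i, \dot{x}^\kappa)$ for $\pi^*(TM)$ and $(x^\mu, q^i, \dot{x}^\kappa, q^i_\mu, \dot{x}^\kappa_\mu)$ for $J(\pi^*(TM))$, the induced connection maps $(x^\mu, q^i, \dot{x}^\kappa)$ to
$$(x^\mu, q^i, \dot{x}^\kappa, \Gamma^i_\mu(x, q), \Gamma^\kappa_{\mu\lambda}(x) \, \dot{x}^\lambda) .$$

• The pull-back $\pi^*(T^*M)$ of the cotangent bundle $T^*M$ of $M$ to $E$: in adapted local coordinates $(x^\mu, q^i, \alpha_\kappa)$ for $\pi^*(T^*M)$ and $(x^\mu, q^i, \alpha_\kappa, q^i_\mu, \alpha_{\mu,\kappa})$ for $J(\pi^*(T^*M))$, the induced connection maps $(x^\mu, q^i, \alpha_\kappa)$ to
$$(x^\mu, q^i, \alpha_\kappa, \Gamma^i_\mu(x, q), -\Gamma^\lambda_{\mu\kappa}(x) \, \alpha_\lambda) .$$

• The pull-back $\pi^*(\bigwedge^n T^*M)$ of the bundle $\bigwedge^n T^*M$ of volume forms on $M$ to $E$: in adapted local coordinates $(x^\mu, q^i, \epsilon)$ for $\pi^*(\bigwedge^n T^*M)$ and $(x^\mu, q^i, \epsilon, q^i_\mu, \epsilon_\mu)$ for $J(\pi^*(\bigwedge^n T^*M))$, the induced connection maps $(x^\mu, q^i, \epsilon)$ to
$$(x^\mu, q^i, \epsilon, \Gamma^i_\mu(x, q), -\Gamma^\rho_{\mu\rho}(x) \, \epsilon) .$$

• The linearized jet bundle $\vec{J}E$ of $E$: in adapted local coordinates $(x^\mu, q^i, \vec{q}^{\,k}_\kappa)$ for $\vec{J}E$ and $(x^\mu, q^i, \vec{q}^{\,k}_\kappa, q^i_\mu, \vec{q}^{\,k}_{\mu,\kappa})$ for $J(\vec{J}E)$, the induced connection maps $(x^\mu, q^i, \vec{q}^{\,k}_\kappa)$ to
$$(x^\mu, q^i, \vec{q}^{\,k}_\kappa, \Gamma^i_\mu(x, q), \partial_l \Gamma^k_\mu(x, q) \, \vec{q}^{\,l}_\kappa - \Gamma^\lambda_{\mu\kappa}(x) \, \vec{q}^{\,k}_\lambda) .$$

• The jet bundle $JE$ of $E$: in adapted local coordinates $(x^\mu, q^i, q^k_\kappa)$ for $JE$ and $(x^\mu, q^i, q^k_\kappa, q^i_\mu, q^k_{\mu,\kappa})$ for $J(JE)$, the induced connection maps $(x^\mu, q^i, q^k_\kappa)$ to
$$(x^\mu, q^i, q^k_\kappa, \Gamma^i_\mu(x, q), \partial_l \Gamma^k_\mu(x, q) \, (q^l_\kappa - \Gamma^l_\kappa(x, q)) - \Gamma^\lambda_{\mu\kappa}(x) \, (q^k_\lambda - \Gamma^k_\lambda(x, q))) .$$

• Ordinary multiphase space $\vec{J}^{\circledast}E$: in adapted local coordinates $(x^\mu, q^i, p^\kappa_k)$ for $\vec{J}^{\circledast}E$ and $(x^\mu, q^i, p^\kappa_k, q^i_\mu, p^\kappa_{\mu,k})$ for $J(\vec{J}^{\circledast}E)$, the induced connection maps $(x^\mu, q^i, p^\kappa_k)$ to
$$(x^\mu, q^i, p^\kappa_k, \Gamma^i_\mu(x, q), -\partial_k \Gamma^l_\mu(x, q) \, p^\kappa_l + \Gamma^\kappa_{\mu\lambda}(x) \, p^\lambda_k - \Gamma^\rho_{\mu\rho}(x) \, p^\kappa_k) .$$

• Extended multiphase space $J^{\circledast}E$: in adapted local coordinates $(x^\mu, q^i, p^\kappa_k, p)$ for $J^{\circledast}E$ and $(x^\mu, q^i, p^\kappa_k, p, q^i_\mu, p^\kappa_{\mu,k}, p_\mu)$ for $J(J^{\circledast}E)$, the induced connection maps $(x^\mu, q^i, p^\kappa_k, p)$ to
$$(x^\mu, q^i, p^\kappa_k, p, \Gamma^i_\mu(x, q), -\partial_k \Gamma^l_\mu(x, q) \, p^\kappa_l + \Gamma^\kappa_{\mu\lambda}(x) \, p^\lambda_k - \Gamma^\rho_{\mu\rho}(x) \, p^\kappa_k , \ -\Gamma^\rho_{\mu\rho}(x) \, p - (\partial_\mu \Gamma^j_\nu(x, q) - \Gamma^k_\nu(x, q) \, \partial_k \Gamma^j_\mu(x, q) - \Gamma^\kappa_{\mu\nu}(x) \, \Gamma^j_\kappa(x, q)) \, p^\nu_j) .$$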
Table 1. Correspondence of important concepts in the multiphase space approach: time-dependent mechanics versus field theory.

| Mechanics | Field Theory |
| --- | --- |
| Extended configuration space $\mathbb{R} \times Q$, where $\mathbb{R}$ is the time axis | Configuration bundle $E$ over $M$ with typical fibre $Q$, where $M$ is the space-time manifold |
| Extended velocity space $\mathbb{R} \times TQ$ | Velocity bundle: jet bundle $JE$ |
| Doubly extended phase space $P = T^*(\mathbb{R} \times Q) = \mathbb{R} \times T^*Q \times \mathbb{R}$ | Extended multiphase space: twisted affine dual of $JE$, $P = J^{\circledast}E = J^{\star}E \otimes \bigwedge^n T^*M$ |
| Simply extended phase space $P_0 = \mathbb{R} \times T^*Q$ | Ordinary multiphase space: twisted linear dual of $\vec{J}E$, $P_0 = \vec{J}^{\circledast}E = \vec{J}^{*}E \otimes \bigwedge^n T^*M$ |
| Local coordinates for $\mathbb{R} \times Q$: $t, q^i$ | Local coordinates for $E$: $x^\mu, q^i$ |
| Local coordinates for $\mathbb{R} \times TQ$: $t, q^i, \dot{q}^i$ | Local coordinates for $JE$: $x^\mu, q^i, q^i_\mu$ |
| Local coordinates for $P$: $t, q^i, p_i, E$ | Local coordinates for $P$: $x^\mu, q^i, p^\mu_i, p$ |
| Local coordinates for $P_0$: $t, q^i, p_i$ | Local coordinates for $P_0$: $x^\mu, q^i, p^\mu_i$ |
| Projection from $P$ to $P_0$: $(t, q^i, p_i, E) \mapsto (t, q^i, p_i)$ | Projection from $P$ to $P_0$: $(x^\mu, q^i, p^\mu_i, p) \mapsto (x^\mu, q^i, p^\mu_i)$ |
| Canonical 1-form on $P$: $\theta = p_i \, dq^i + E \, dt$ | Multicanonical $n$-form on $P$: $\theta = p^\mu_i \, dq^i \wedge d^n x_\mu + p \, d^n x$ |
| Symplectic 2-form $\omega = -d\theta$ on $P$, non-degenerate: $\omega = dq^i \wedge dp_i - dE \wedge dt$ | Multisymplectic $(n+1)$-form $\omega = -d\theta$ on $P$, non-degenerate (on vector fields): $\omega = dq^i \wedge dp^\mu_i \wedge d^n x_\mu - dp \wedge d^n x$ |
| Hamiltonian is a function on $P_0$ | Hamiltonian is a section of $P$ (as an affine line bundle over $P_0$) |
| Hamiltonian vector fields $X$ ↔ functions $f$, related by $i_X \omega = df$ | Hamiltonian $r$-multivector fields $X$ ↔ Hamiltonian or Poisson $(n-r)$-forms $f$, related by $i_X \omega = df$ |
| Poisson bracket for functions $f, g \in C^\infty(P)$: $\{f, g\} = L_Y f - L_X g$ | Poisson bracket for Poisson forms $f \in \Omega^{n-r}_P(P)$, $g \in \Omega^{n-s}_P(P)$: $\{f, g\} = (-1)^{(r-1)(s-1)} L_Y f - L_X g - (-1)^{(r-1)s} L_{X \wedge Y} \theta$ |
| Hamiltonian equations: $\dfrac{\partial H}{\partial p_i} = \dot{q}^i$, $\dfrac{\partial H}{\partial q^i} = -\dot{p}_i$ | De Donder-Weyl equations: $\dfrac{\partial H}{\partial p^\mu_i} = \dfrac{\partial q^i}{\partial x^\mu}$, $\dfrac{\partial H}{\partial q^i} = -\dfrac{\partial p^\mu_i}{\partial x^\mu}$ |
Acknowledgments

Two of the authors (M.F. and H.R.) wish to gratefully acknowledge the financial support of FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo, Brazil), which made this collaboration possible.
References

[1] I. V. Kanatchikov, On Field Theoretic Generalizations of a Poisson Algebra, Rep. Math. Phys. 40 (1997), 225–234, hep-th/9710069.
[2] I. V. Kanatchikov, Canonical Structure of Classical Field Theory in the Polymomentum Phase Space, Rep. Math. Phys. 41 (1998), 49–90, hep-th/9709229.
[3] J. F. Cariñena, M. Crampin and L. A. Ibort, On the Multisymplectic Formalism for First Order Field Theories, Diff. Geom. Appl. 1 (1991), 345–374.
[4] M. J. Gotay, J. Isenberg and J. E. Marsden, Momentum Maps and Classical Relativistic Fields I: Covariant Field Theory, physics/9801019.
[5] M. Forger and H. Römer, A Poisson Bracket on Multisymplectic Phase Space, Rep. Math. Phys. 48 (2001), 211–218, math-ph/0009037.
[6] H. Goldschmidt and S. Sternberg, The Hamilton-Cartan Formalism in the Calculus of Variations, Ann. Inst. Four. 23 (1973), 203–267.
[7] V. Guillemin and S. Sternberg, Geometric Asymptotics, Mathematical Surveys, Vol. 14, American Mathematical Society, Providence 1977.
[8] J. Kijowski, A Finite-dimensional Canonical Formalism in the Classical Field Theory, Commun. Math. Phys. 30 (1973), 99–128; Multiphase Spaces and Gauge in Calculus of Variations, Bull. Acad. Pol. Sci. SMAP 22 (1974), 1219–1225.
[9] J. Kijowski and W. Szczyrba, "Multisymplectic Manifolds and the Geometrical Construction of the Poisson Brackets in the Classical Field Theory", in Géométrie Symplectique et Physique Mathématique, ed. J.-M. Souriau, C.N.R.S., Paris 1975, pp. 347–379.
[10] J. Kijowski and W. Szczyrba, Canonical Structure for Classical Field Theories, Commun. Math. Phys. 46 (1976), 183–206.
[11] J. Kijowski and W. Tulczyjew, "A Symplectic Framework for Field Theories", Lecture Notes in Physics, Vol. 107, Springer-Verlag, Berlin 1979.
[12] M. Forger, C. Paufler and H. Römer, "More about Poisson Brackets and Poisson Forms", in Multisymplectic Field Theory, in preparation.
[13] L. K. Norris, N-Symplectic Algebra of Observables in Covariant Lagrangian Field Theory, J. Math. Phys. 42 (2001), 4827–4845.
[14] M. de León, M. McLean, L. K. Norris, A. R. Roca and M. Salgado, Geometric Structures in Field Theory, preprint, math-ph/0208036.
[15] F. Hélein and J. Kouneiher, "Finite-Dimensional Hamiltonian Formalism for Gauge and Field Theories", J. Math. Phys. 43 (2002), 2306–2347, math-ph/0010036; Covariant Hamiltonian Formalism for the Calculus of Variations with Several Variables, preprint, math-ph/0211046v2.
[16] C. Crnković and E. Witten, "Covariant Description of Canonical Formalism in Geometrical Theories", in Three Hundred Years of Gravitation, eds. W. Israel and S. Hawking, Cambridge University Press, Cambridge 1987, pp. 676–684.
[17] C. Crnković, Symplectic Geometry of Covariant Phase Space, Class. Quant. Grav. 5 (1988), 1557–1575.
[18] G. Zuckerman, "Action Principles and Global Geometry", in Mathematical Aspects of String Theory, ed. S.-T. Yau, World Scientific, Singapore 1987, pp. 259–288.
[19] R. E. Peierls, The Commutation Laws of Relativistic Field Theory, Proc. Roy. Soc. Lond. A214 (1952), 143–157.
[20] B. de Witt, "Dynamical Theory of Groups and Fields", in Relativity, Groups and Topology, 1963 Les Houches Lectures, eds. B. de Witt and C. de Witt, Gordon and Breach, New York 1964, pp. 585–820.
[21] B. de Witt, "The Spacetime Approach to Quantum Field Theory", in Relativity, Groups and Topology II, 1983 Les Houches Lectures, eds. B. de Witt and R. Stora, Elsevier, Amsterdam 1984, pp. 382–738.
[22] G. Bimonte, G. Esposito, G. Marmo and C. Stornaiolo, Peierls Brackets in Field Theory, preprint, hep-th/0301113.
[23] S. V. Romero, Colchete de Poisson Covariante na Teoria Geométrica dos Campos, PhD thesis, Institute for Mathematics and Statistics, University of São Paulo, June 2001.
[24] M. Forger and S. V. Romero, Covariant Poisson Brackets in Geometric Field Theory, in preparation.
[25] R. Abraham and J. E. Marsden, Foundations of Mechanics, 2nd edition, Benjamin/Cummings, Reading 1978.
[26] V. Arnold, Mathematical Methods of Classical Mechanics, 2nd edition, Springer, Berlin 1989.
[27] H. A. Kastrup, Canonical Theories of Lagrangian Dynamical Systems in Physics, Phys. Rep. 101 (1983), 3–167.
[28] G. Martin, A Darboux Theorem for Multisymplectic Manifolds, Lett. Math. Phys. 16 (1988), 133–138.
[29] G. Martin, Dynamical Structures for k-Vector Fields, Int. J. Theor. Phys. 41 (1988), 571–585.
[30] F. Cantrijn, A. Ibort and M. de León, On the Geometry of Multisymplectic Manifolds, J. Austral. Math. Soc. (Ser. A) 66 (1999), 303–330.
[31] C. Paufler and H. Römer, Geometry of Hamiltonian n-vectors in Multisymplectic Field Theory, J. Geom. Phys. 44 (2002), 52–69, math-ph/0102008.
[32] C. Paufler, A Vertical Exterior Derivative in Multisymplectic Geometry and a Graded Poisson Bracket for Nontrivial Geometries, Rep. Math. Phys. 47 (2001), 101–119, math-ph/0002032.
[33] W. M. Tulczyjew, The Graded Lie Algebra of Multivector Fields and the Generalized Lie Derivative of Forms, Bull. Acad. Pol. Sci. SMAP 22 (1974), 937–942.
[34] I. Kolář, P. W. Michor and J. Slovák, Natural Operations in Differential Geometry, Springer, Berlin 1993.
November 1, 2003 12:2 WSPC/148-RMP
00175
Reviews in Mathematical Physics Vol. 15, No. 7 (2003) 745–763 © World Scientific Publishing Company

COMBINATORIAL PROPERTIES OF ARNOUX–RAUZY SUBSHIFTS AND APPLICATIONS TO SCHRÖDINGER OPERATORS

D. DAMANIK
Department of Mathematics 253–37, California Institute of Technology, Pasadena, CA 91125, USA
[email protected]

LUCA Q. ZAMBONI
Department of Mathematics, University of North Texas, Denton, TX 76203, USA
[email protected]

Received 6 February 2003
Revised 16 July 2003

We consider Arnoux–Rauzy subshifts $X$ and study various combinatorial questions: When is $X$ linearly recurrent? What is the maximal power occurring in $X$? What is the number of palindromes of a given length occurring in $X$? We present applications of our combinatorial results to the spectral theory of discrete one-dimensional Schrödinger operators with potentials given by Arnoux–Rauzy sequences.

Keywords: Arnoux–Rauzy subshifts; linear recurrence; powers; palindromes; Schrödinger operators.

2000 AMS Subject Classification: 81Q10, 68R15, 37B10
1. Introduction

Mainly motivated by the discovery of quasicrystals by Shechtman et al. in 1984 [1], there has been a lot of research done on the spectral properties of Schrödinger operators with potentials displaying long-range order. The first rigorous mathematical results were obtained in the late eighties. By now, many key issues are well understood, at least in one dimension. The two survey articles [2] and [3] recount the history of this effort up to 1994 and 1999, respectively.

The primary example is given by a discrete one-dimensional Schrödinger operator whose potential is given by the Fibonacci sequence. More generally, one considers Sturmian potentials or potentials generated by (primitive) substitutions. It turned out that all these potentials lead to the same qualitative behavior: the
corresponding Schrödinger operator has purely singular continuous zero-measure Cantor spectrum. This has been established for all Sturmian potentials and most substitution potentials. On the other hand, no counterexample is known. This led to the conjecture that these properties are shared by a large class of potentials displaying long-range order in a certain sense. One possible way to measure long-range order is given by the combinatorial complexity function $f : \mathbb{N} \to \mathbb{N}$ associated with a potential taking finitely many values, where $f(n)$ is the number of distinct subwords of length $n$ occurring in the potential. Since periodic potentials are well understood, one is interested in the case of an aperiodic potential and, in this case, it is well known that the complexity function grows at least linearly. One possible point of view could be the stipulation that long-range order manifests itself in a slowly (e.g. linearly) growing complexity function, possibly along with further conditions.

This combinatorial approach is further motivated by the fact that the properties above, singular continuous zero-measure Cantor spectrum, can be shown by purely combinatorial methods. The key combinatorial properties that allow one to deduce these spectral properties are linear recurrence and the occurrence of local symmetries such as powers and palindromes. Here, a sequence is linearly recurrent if its subwords occur infinitely often, with gap lengths bounded linearly in the length of the subword. Powers are repetitions of subwords and palindromes are subwords that are the same when read backwards.
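To make the complexity function and the notion of a palindrome concrete, here is a minimal sketch (not taken from the paper) that counts subwords and palindromes in a long prefix of the Fibonacci word. The function names are illustrative, the word is generated by the substitution a → ab, b → a, and the prefix is assumed to be long enough to contain every subword of the small lengths examined.

```python
def fibonacci_word(iterations):
    """Prefix of the Fibonacci word obtained by iterating a -> ab, b -> a."""
    word = "a"
    for _ in range(iterations):
        word = "".join("ab" if letter == "a" else "a" for letter in word)
    return word


def subwords(word, n):
    """Set of distinct subwords (factors) of length n occurring in `word`."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def complexity(word, n):
    """Number of distinct subwords of length n, i.e. f(n) restricted to `word`."""
    return len(subwords(word, n))


def palindromes(word, n):
    """Subwords of length n that read the same forwards and backwards."""
    return {u for u in subwords(word, n) if u == u[::-1]}


if __name__ == "__main__":
    prefix = fibonacci_word(20)
    for n in range(1, 9):
        print(n, complexity(prefix, n), len(palindromes(prefix, n)))
```

For a Sturmian sequence such as the Fibonacci word the complexity is $f(n) = n + 1$, the slowest growth compatible with aperiodicity, and the counts printed by the sketch can be compared against this value.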
Thus, the interplay between the spectral theory of Schrödinger operators and combinatorics of infinite words has enjoyed quite some popularity recently, due to its success in answering long-standing questions; for example, the completion of the analysis of the Sturmian case [4] or the proof of zero-measure spectrum for all primitive substitution potentials [5] (see also [6] for a different proof of the latter result using trace maps). This interplay and its applications will be discussed in detail in [7].

As was mentioned above, the spectral theory is well understood for the primary example, the Fibonacci case, and more generally, for all Sturmian potentials. Thus it is natural to consider generalizations of Sturmian potentials and to explore whether the combinatorial approach continues to be applicable. There are a number of natural candidates:

• quasi-Sturmian sequences,
• sequences obtained by codings of rotations,
• Arnoux–Rauzy sequences.

Quasi-Sturmian sequences are essentially given by morphic images of Sturmian sequences and the corresponding Schrödinger operators were studied in [8], confirming all of the above points. Sturmian sequences have a geometric realization as a coding of an irrational rotation on the unit circle with respect to a decomposition of the circle into two half-open intervals, where the rotation number is equal to the length of one of the intervals. By dropping the latter condition, one obtains the more
November 1, 2003 12:2 WSPC/148-RMP
00175
Combinatorial Properties of Arnoux–Rauzy Subshifts
general class of sequences associated to codings of rotations. The corresponding operators display purely singular continuous zero-measure Cantor spectrum in many cases [9–11]. Finally, Sturmian potentials can be characterized by a scarceness of so-called special factors, that is, aside from being defined over two symbols, there is, for each length, exactly one subword with multiple extensions to the right and one subword with multiple extensions to the left. When considering more than two symbols, this definition leads to the class of Arnoux–Rauzy sequences, originally defined and studied in [12] (the paper [12] considers the three-symbol case; for the case of $k \ge 3$ symbols, see [13] for initial definition and study). For the corresponding operators, no results have been shown yet. Thus, the spectral analysis of these operators via the combinatorial approach mentioned above is the objective of the present paper. Let us mention that some authors (e.g. [14–16]) study the class of episturmian sequences which contains all Arnoux–Rauzy sequences.

To this end, we shall recall the formal definition and some basic combinatorial properties of Arnoux–Rauzy sequences in Sec. 2 and then study the relevant combinatorial issues, namely, linear recurrence, powers, and palindromes, in Secs. 3–5, respectively. Applications of the combinatorial results obtained in these sections to the corresponding Schrödinger operators are then presented in Sec. 6.

2. Basic Properties of Arnoux–Rauzy Sequences

In this section we recall some known properties of Arnoux–Rauzy sequences and subshifts. In particular, we explain the two combinatorial descriptions of such subshifts from [13] since they will be used extensively in later sections. We begin with some definitions. Let $A_k = \{1, 2, \ldots, k\}$ with $k \ge 2$. We denote by $A_k^*$, $A_k^{\mathbb{N}}$, $A_k^{\mathbb{Z}}$ the set of finite, one-sided infinite, and two-sided infinite words over $A_k$. Given $x \in A_k^{\mathbb{N}}$ or $A_k^{\mathbb{Z}}$, we denote by $F_x(n)$ the set of all subwords of $x$ of length $n \in \mathbb{N}$, that is, $F_x(n) = \{x_j \cdots x_{j+n-1} : j \in \mathbb{N} \text{ (or } \mathbb{Z})\}$. We write $F_x = \bigcup_{n \in \mathbb{N}} F_x(n)$. The complexity function $f : \mathbb{N} \to \mathbb{N}$ of $x$ is defined by $f(n) = |F_x(n)|$, where $|\cdot|$ denotes cardinality. A factor (= subword) $u \in F_x(n)$ of $x$ is called right-special if it has at least two extensions to the right, that is, there are $a, b \in A_k$, $a \neq b$ such that $ua, ub \in F_x(n+1)$. A left-special factor is defined analogously. If a factor is both right-special and left-special, it is called bispecial. The sequence $x$ is called an Arnoux–Rauzy sequence if
• $x$ is uniformly recurrent (i.e. each factor of $x$ occurs with bounded gaps),
• $f(n) = (k-1)n + 1$,
• each $F_x(n)$ contains exactly one right-special factor $r_n$ and one left-special factor $l_n$.
It can be shown that $r_n = l_n^R$ [13], where the reversal $u^R$ of a word $u = u_1 \cdots u_m$ is defined by $u^R = u_m \cdots u_1$. In particular, $r_1 = l_1$ and this factor is bispecial. Observe that there is a unique symbol $a \in A_k$ such that $aa \in F_x(2)$ (which is given by $a = r_1 = l_1$). We shall say that $x$ is of type $a$.
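The definitions above can be checked on a finite prefix of a concrete sequence. The following is a minimal sketch, assuming as sample input a prefix of the Tribonacci word (generated here by iterating the substitution 1 → 12, 2 → 13, 3 → 1, which is not introduced in the text at this point); the function names are illustrative only. Because only a finite prefix is inspected, the computed factor sets are reliable only for lengths that are small compared with the prefix length.

```python
def tribonacci_prefix(iterations=12):
    """Prefix of the Tribonacci word, iterating 1 -> 12, 2 -> 13, 3 -> 1."""
    rules = {"1": "12", "2": "13", "3": "1"}
    word = "1"
    for _ in range(iterations):
        word = "".join(rules[c] for c in word)
    return word


def factors(x, n):
    """F_x(n): the subwords of length n occurring in the finite word x."""
    return {x[j:j + n] for j in range(len(x) - n + 1)}


def right_special(x, n):
    """Factors u of length n such that ua and ub occur for some a != b."""
    extensions = {}
    for v in factors(x, n + 1):
        extensions.setdefault(v[:-1], set()).add(v[-1])
    return sorted(u for u, letters in extensions.items() if len(letters) > 1)


def left_special(x, n):
    """Factors u of length n such that au and bu occur for some a != b."""
    extensions = {}
    for v in factors(x, n + 1):
        extensions.setdefault(v[1:], set()).add(v[0])
    return sorted(u for u, letters in extensions.items() if len(letters) > 1)


x = tribonacci_prefix()
k = 3
for n in range(1, 9):
    # Columns: n, f(n), right-special factors, left-special factors.
    print(n, len(factors(x, n)), right_special(x, n), left_special(x, n))
```

For each small $n$, the printed row can be compared with the three defining conditions: the complexity $(k-1)n + 1$, a single right-special and a single left-special factor, and the relation $r_n = l_n^R$.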
D. Damanik & L. Q. Zamboni
If $k = 2$, this recovers the definition of a Sturmian sequence. Hence, Arnoux–Rauzy (AR for short) sequences are a natural generalization of Sturmian sequences to larger alphabets. Given an AR sequence $x$, we define the associated AR subshift $X$ by
$$X = \{y \in A_k^{\mathbb{N}} \ (\text{or } A_k^{\mathbb{Z}}) : F_y(n) = F_x(n) \text{ for every } n \in \mathbb{N}\}.$$
By definition, we have $F_y = F_x \equiv F_X$ for every $y \in X$. When we want to be more specific (about the choice of $\mathbb{N}$ or $\mathbb{Z}$), we shall refer to $X$ as a one-sided (resp., two-sided) subshift. AR sequences and subshifts were originally defined and studied by Arnoux and Rauzy in [12]. Since $x$ is assumed to be uniformly recurrent, $X$ is minimal. Moreover, it was shown in [12] that $X$ is uniquely ergodic, that is, $X$ admits a unique shift-invariant probability measure $\nu$. Equivalently, for every factor $u \in F_X$ and every $m \in \mathbb{N}$ (or $m \in \mathbb{Z}$), the limit
$$d(u) = \lim_{n \to \infty} \frac{1}{n} \#_u(x_m \cdots x_{m+n-1})$$
exists, uniformly in $m$. Here, $\#_u(v)$ denotes the number of occurrences of $u$ in $v$. The number $d(u)$ is called the frequency of $u$. We have, for every $m$,
$$\nu(\{y \in X : y_m \cdots y_{m+|u|-1} = u\}) = d(u). \tag{2.1}$$
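The window averages whose limit defines $d(u)$ can be computed directly on a finite word. The sketch below assumes, purely as an example, a prefix of the Tribonacci word (generated by iterating 1 → 12, 2 → 13, 3 → 1) and a fixed window length of 1000; the helper names are illustrative. It evaluates the average for several starting positions $m$, which gives a rough impression of the uniformity in $m$, but a finite computation of this kind is of course not a proof of convergence.

```python
def tribonacci_prefix(iterations=12):
    """Prefix of the Tribonacci word, iterating 1 -> 12, 2 -> 13, 3 -> 1."""
    rules = {"1": "12", "2": "13", "3": "1"}
    word = "1"
    for _ in range(iterations):
        word = "".join(rules[c] for c in word)
    return word


def occurrences(u, v):
    """#_u(v): number of (possibly overlapping) occurrences of u in v."""
    return sum(1 for j in range(len(v) - len(u) + 1) if v.startswith(u, j))


def window_frequency(x, u, m, n):
    """(1/n) #_u(x_m ... x_{m+n-1}), with m a 0-based starting position."""
    return occurrences(u, x[m:m + n]) / n


x = tribonacci_prefix()
# Same factor, same window length, different starting positions m.
estimates = [window_frequency(x, "1", m, 1000) for m in (0, 250, 500, 705)]
print(estimates)
```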
Two important objects associated with such a subshift are the index sequence $(i_n) \in A_k^{\mathbb{N}}$ and the characteristic sequence $(c_n) \in A_k^{\mathbb{N}}$ which are defined as follows: Let $\{\varepsilon = w_1, w_2, w_3, \ldots\}$ be the set of bispecial factors ordered so that $0 = |w_1| < |w_2| < |w_3| < \cdots$. For $n \in \mathbb{N}$, let $i_n \in A_k$ be the unique symbol so that $i_n w_n$ is right-special. The characteristic sequence $(c_n)$, on the other hand, is defined to be the unique accumulation point of the set $\{l_1, l_2, l_3, \ldots\}$ of left-special factors. Note that $(c_n)$ is an element of the one-sided subshift $X$ and hence has the same factors as $x$. Consequently, it carries all the necessary information and once we find a way of constructing $(c_n)$ from $(i_n)$, we see that the index sequence completely determines $X$.

One such construction is given by the hat algorithm from [13]. Define a function
$$H : A_k^{\mathbb{N}} \to A_k^{\mathbb{N}}$$
as follows. Set $A_k' = \{1, \ldots, k, \hat{1}, \ldots, \hat{k}\}$ and let $\Phi$ denote the morphism
$$\Phi : A_k' \to A_k, \qquad \Phi(\hat{a}) = \Phi(a) = a \text{ for every } a \in A_k.$$
Clearly, $\Phi$ extends to both $(A_k')^*$ and $(A_k')^{\mathbb{N}}$. With each sequence $S = (s_n) \in A_k^{\mathbb{N}}$, we associate a sequence $(B_n)$ of words over the alphabet $A_k'$ as follows: $B_1 = \hat{s}_1$ and, for $n > 1$, $B_n$ is obtained from $B_{n-1}$ according to the following rule. If $\hat{s}_n$ does not occur in $B_{n-1}$, then
$$B_n = B_{n-1}\, \hat{s}_n\, \Phi(B_{n-1}).$$
Otherwise, if $\hat{s}_n$ occurs in $B_{n-1}$, then we can write $B_{n-1} = x' \hat{s}_n y'$, where $x', y'$ are words over $A_k'$ (possibly empty) and $\hat{s}_n$ does not occur in $y'$. In this case we set
$$B_n = B_{n-1}\, \hat{s}_n\, \Phi(y').$$
The sequence $(B_n)$ converges to a unique sequence $B \in (A_k')^{\mathbb{N}}$. We set
$$H(S) = \Phi(B).$$
Then, the following holds:

Theorem 2.1 (Risley–Zamboni [13]). Let $X$ be an AR subshift over $A_k$. Let $I = (i_n)$ be its index sequence and $C = (c_n)$ its characteristic sequence. Then every $a \in A_k$ occurs in $I$ an infinite number of times and
$$C = H(I).$$
Conversely, if $I = (i_n)$ is a sequence over $A_k$ such that every $a \in A_k$ occurs infinitely often in $I$, then $H(I)$ is the characteristic sequence of an AR subshift.

The key observation is that $\{\Phi(B_n) : n \in \mathbb{N}\}$ is precisely the set of all bispecial factors (see [13]). Thus, we can re-interpret the construction above on this level: Let $w$ be one of the bispecial factors. Suppose that each symbol $a \in A_k$ occurs in $w$ (this holds by minimality if $w$ is long enough). Then by the hat algorithm, for each symbol $a \in A_k$, there is a positive integer $m < |w|$ (depending on $a$) such that the next bispecial factor is obtained from $w$ by adjoining to the end of $w$ a suffix of $w$ of length $m$. The quantity $m$ is one of the $k$ periods of $w$. Let $p_1, p_2, \ldots, p_k$ denote the $k$ periods of $w$. Then we have the following formula [17]:
$$\sum_{i=1}^{k} (p_i - 1) = (k-1)|w|.$$
We can suppose that $p_1 \ge p_2 \ge \cdots \ge p_k$. In this case it follows that
$$p_i > \frac{|w|}{k}, \quad 1 \le i \le k-1. \tag{2.2}$$
In fact, if $p_{k-1} \le |w|/k$, then $p_{k-1} + p_k \le |w|$, implying that $p_1 + p_2 + \cdots + p_{k-2} \ge (k-2)|w|$, which is a contradiction since each $p_i < |w|$.

Another way of constructing the characteristic sequence is given by the following result. For each $a \in A_k$, define the morphism $\tau_a$ by $\tau_a(a) = a$ and $\tau_a(b) = ab$ for $b \in A_k \setminus \{a\}$.
Theorem 2.2 (Risley–Zamboni [13]). Let $X$ be an AR subshift and let $(i_n)$ be its index sequence. For each $a \in A_k$, the characteristic sequence $(c_n)$ of $X$ is given by
$$\lim_{n \to \infty} \tau_{i_1} \circ \cdots \circ \tau_{i_n}(a).$$
That is, the characteristic sequence admits an S-adic representation where the underlying morphisms are given by $\{\tau_a : a \in A_k\}$ and they are iterated in an order dictated by the index sequence.
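Both constructions of the characteristic sequence can be tried out on a finite prefix of an index sequence. The sketch below is an illustration under stated assumptions, not the implementation of [13]: hatted letters are modelled as pairs `(letter, hatted)`, the index prefix `"123123"` (the beginning of the periodic sequence 1, 2, 3, 1, 2, 3, ...) is an arbitrary sample, and all function names are hypothetical. It computes the words $\Phi(B_n)$ of the hat algorithm, the lengths of the suffixes that would be adjoined for each possible next symbol, and the finite compositions $\tau_{i_1} \circ \cdots \circ \tau_{i_n}(a)$.

```python
def hat_words(S):
    """Return [B_1, ..., B_n] for a finite prefix S of the input sequence.

    A word over the doubled alphabet is a list of (letter, hatted) pairs.
    """
    B = [(S[0], True)]
    words = [B]
    for s in S[1:]:
        hats = [j for j, (c, hatted) in enumerate(B) if hatted and c == s]
        # If the hatted s is absent, append Phi(B); otherwise append Phi(y'),
        # where y' is the part of B after the last hatted s.
        tail = B if not hats else B[hats[-1] + 1:]
        B = B + [(s, True)] + [(c, False) for c, _ in tail]
        words.append(B)
    return words


def phi(B):
    """The morphism Phi: forget the hats."""
    return "".join(c for c, _ in B)


def adjoined_lengths(B, alphabet):
    """For each letter a, the length of the word adjoined if the next symbol is a."""
    lengths = {}
    for a in alphabet:
        hats = [j for j, (c, hatted) in enumerate(B) if hatted and c == a]
        lengths[a] = len(B) - hats[-1] if hats else len(B) + 1
    return lengths


def tau(a, word):
    """tau_a(a) = a and tau_a(b) = ab for b != a, extended to words."""
    return "".join(c if c == a else a + c for c in word)


def s_adic(I, a):
    """tau_{i_1} o ... o tau_{i_n}(a) for the finite index prefix I."""
    word = a
    for i in reversed(I):
        word = tau(i, word)
    return word


I = "123123"  # sample index prefix (assumption)
hat = hat_words(I)
bispecial = [phi(B) for B in hat]
print(bispecial[:4])
print(s_adic(I, "1"))
for B in hat[2:]:
    p = adjoined_lengths(B, "123")
    # Compare sum(p_i - 1) with (k - 1)|w| for k = 3.
    print(len(B), p, sum(v - 1 for v in p.values()), 2 * len(B))
```

On this sample, the word produced by the $\tau$-composition is a prefix of the last hat-algorithm word, and the last two printed columns agree, in line with the formula $\sum_i (p_i - 1) = (k-1)|w|$.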
3. Linearly Recurrent Arnoux–Rauzy Sequences

In this section we characterize the set of AR subshifts that are linearly recurrent. Recall that a subshift $X$ is called $K$-linearly recurrent (or $K$-LR) if there is a constant $K > 0$ such that every $w \in F_X$ is contained in every $v \in F_X$ of length $K|w|$. $X$ is called linearly recurrent (or LR) if it is $K$-LR for some $K$. Linear recurrence is a concept that has been quite popular since the late nineties and it is known to have a number of nice consequences; compare [5, 18–22]. For example, every linearly recurrent $X$ is uniquely ergodic and $N$-power free for some $N$ (i.e. $F_X$ does not contain an element of the form $u^N$).

Theorem 3.1. An AR subshift $X$ over $A_k$ is linearly recurrent if and only if every letter $a \in A_k$ occurs in $(i_n)$ with bounded gaps.

Remark 3.1. This is Corollary III.9 in [13]. One direction was stated without proof and the proof of the other direction was based on [19, Proposition 5] which turned out to be incorrect [23].

Proof. It clearly suffices to prove the assertion for the characteristic sequence $(c_n)$ since the LR property only depends on the set of factors of a sequence and hence is an invariant of a minimal subshift.

We first prove that if some letter $\tilde{a} \in A_k$ occurs in $(i_n)$ with unbounded gaps, then $(c_n)$ is not linearly recurrent. A special case of this scenario is easy to handle: If for each $n \in \mathbb{N}$, there is $m \in \mathbb{N}$ such that $i_m = \cdots = i_{m+n-1}$, then $(c_n)$ is not LR since it is not $N$-power free for any $N$ (see [13, Corollary III.6] or the next section). Let us therefore assume, in addition, that there is some $N \in \mathbb{N}$ such that for every $a \in A_k$, $a^N$ does not occur in the index sequence. Fix some $K > 0$. Let $L > kNK$ (recall that $k$ is the size of the alphabet). Then there exists $n \in \mathbb{N}$ such that $i_{n+j} \neq \tilde{a}$, $0 \le j \le L$. We shall show that $w_{n+L}$ is a word of length $> (K+1)|w_n|$ which does not contain an occurrence of $w_n \tilde{a}$. (Recall that $w_m$ denotes the $m$th bispecial factor.) This implies that $(c_n)$ is not $K$-LR. Since $K$ was arbitrary, $(c_n)$ is not LR. That $w_{n+L}$ does not contain $w_n \tilde{a}$ follows from the hat algorithm and the fact that $\tilde{a}$ does not occur in $i_n, \ldots, i_{n+L}$. That $w_{n+L}$ is of length $> (K+1)|w_n|$ follows from the fact that for each $j$, in passing from $w_{n+j}$ to $w_{n+j+1}$, one adds on a suffix; all but one of these suffixes are, by (2.2), of length $> |w_n|/k$. Since $(c_n)$ is $N$-power free, each window of length $N$ in the index sequence must contain at least two distinct symbols. Thus
$$|w_{n+jN}| > |w_n| + j\,\frac{|w_n|}{k},$$
and hence
$$|w_{n+kNK}| > |w_n| + Kk\,\frac{|w_n|}{k} = (K+1)|w_n|.$$
Consider now the case where each symbol $a \in A_k$ occurs in the index sequence with bounded gaps. That is, there is a number $g \in \mathbb{N}$ such that for every $a \in A_k$
and every $m \in \mathbb{N}$, at least one of $i_m, \ldots, i_{m+g-1}$ equals $a$. We have to show that $C = (c_n)$ is linearly recurrent. Recall from Theorem 2.2 that
$$C = \lim_{n \to \infty} (\tau_{i_1} \circ \cdots \circ \tau_{i_n})(a) \quad \text{for every } a \in A_k. \tag{3.1}$$
Fix some $\tilde{a} \in A_k$ and define, for $m \in \mathbb{N}$,
$$C^{(m)} = \lim_{n \to \infty} (\tau_{i_m} \circ \cdots \circ \tau_{i_n})(\tilde{a}). \tag{3.2}$$
In order to show that $C$ is linearly recurrent, we shall employ [23, Lemma 4] which provides a sufficient condition for LR: For each $m \in \mathbb{N}$, let $d_m$ be the largest gap between consecutive occurrences of a word of length 2 in $C^{(m)}$. If the set $\{d_m : m \in \mathbb{N}\}$ is bounded, then $C$ is linearly recurrent.

Fix some $m \in \mathbb{N}$ and consider the sequence $C^{(m)}$. It is an AR sequence of type $i_m$ and its factors of length 2 are given by
$$F_{C^{(m)}}(2) = \{a i_m : a \in A_k\} \cup \{i_m a : a \in A_k\}.$$
The gaps between occurrences of $a i_m$ are bounded by twice the maximal length of the gaps between occurrences of $a$ in $C^{(m+1)}$, which in turn occurs with gaps bounded by $2^g$ since at least one of $i_{m+1}, \ldots, i_{m+g}$ is equal to $a$ (the corresponding substitution produces a sequence where the gaps between successive $a$'s are bounded by 2, and then the gaps increase under subsequent substitutions at most by a factor 2). The same argument works for words of the form $i_m a$ and hence $d_m \le 2^g$.

4. Powers in Arnoux–Rauzy Sequences

In this section we study the occurrences of powers in a given AR subshift $X$. As was noted in [13], if there are arbitrarily long runs in the index sequence $(i_n)$, then there are arbitrarily high powers. Here, we shall prove the converse and provide an explicit expression for the index of $X$ (the highest power occurring in $X$) in terms of the run lengths in $(i_n)$. To this end, we shall distinguish between two types of runs in $(i_n)$, namely, open runs and closed runs. An $r$-run in $(i_n)$ is a pair $(a, l) \in A_k \times \mathbb{N}$ such that $i_m = a$ for $l \le m \le l + r - 1$. If the value of $r$ is understood, such an $r$-run will sometimes be simply referred to as a run. An $r$-run $(a, l)$ is called open if $i_m \neq a$ for $1 \le m \le l - 1$; otherwise, it is called closed. Recall that the (integer) index of $X$, $\mathrm{ind}(X) \in \mathbb{N} \cup \{\infty\}$, is defined by
$$\mathrm{ind}(X) = \sup\{p \in \mathbb{N} : \text{there exists } u \in A_k^* \text{ such that } u^p \in F_X\}.$$
The following result provides an explicit formula for the index in terms of the runs in the index sequence and generalizes the corresponding result in the Sturmian case (cf. e.g. [24–27]). See [15, Secs. 4.1 and 5.5] for related results on powers in AR subshifts.
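Theorem 4.1. Let $X$ be an AR subshift over $A_k$ and $(i_n)$ its index sequence. If $\mathrm{ind}(X)$ is defined as above, then
$$\mathrm{ind}(X) = \max\{N_1, N_2\},$$
where
$$N_1 = 1 + \sup\{r \in \mathbb{N} : (i_n) \text{ contains an open } r\text{-run}\},$$
$$N_2 = 2 + \sup\{r \in \mathbb{N} : (i_n) \text{ contains a closed } r\text{-run}\}.$$

In particular, we obtain the following corollary which is a generalization of the corresponding result in the Sturmian case, which was proved by Mignosi in [28].

Corollary 4.1. An AR subshift $X$ has finite index if and only if the runs in its index sequence are uniformly bounded.

We begin by proving $\mathrm{ind}(X) \ge \max\{N_1, N_2\}$. The key observation is given in the following lemma:

Lemma 4.1. Let $a, a_1, \ldots, a_n, \alpha, \beta \in A_k$ with $a_i \neq a$, $1 \le i \le n$ and $\alpha \neq \beta$. Then $\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a)$ is a prefix of $\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(\alpha\beta)$.

Proof. For $n = 1$, we have $\tau_a \circ \tau_{a_1}(a) = a a_1 a$ and
$$\tau_a \circ \tau_{a_1}(\alpha\beta) = \tau_a(a_1 \alpha' \tau_{a_1}(\beta)) = a a_1 a \cdots,$$
where
$$\alpha' = \begin{cases} \varepsilon & \text{if } a_1 = \alpha, \\ \alpha & \text{if } a_1 \neq \alpha. \end{cases}$$
Thus the statement is true for $n = 1$. Let us now assume that the statement holds for $n$. We have
$$\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n} \circ \tau_{a_{n+1}}(a) = \tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a_{n+1} a) = \tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a_{n+1})\, \tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a).$$
On the other hand, we have (with $\alpha', \beta'$ defined as above)
$$\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n} \circ \tau_{a_{n+1}}(\alpha\beta) = \tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a_{n+1} \alpha' a_{n+1} \beta') = \tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a_{n+1})\, \tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(\alpha' a_{n+1} \beta').$$
Since $\alpha \neq \beta$, at least one of them is $\neq a_{n+1}$, say $\alpha$, and then $\alpha' = \alpha \neq a_{n+1}$. Now apply the induction hypothesis.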
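The formula of Theorem 4.1 can be compared with a direct search for powers on a small example. The following is a rough sketch under stated assumptions: the index sequence is taken to be the periodic sequence 1, 2, 3, 1, 2, 3, ..., of which only the prefix `"123" * 3` is used; the word is built as $\tau_{i_1} \circ \cdots \circ \tau_{i_n}(1)$; and the helper names are illustrative. Since the suprema in the theorem range over the whole index sequence and the index over all factors of $X$, a computation on finite prefixes can only illustrate the statement, not verify it.

```python
def tau(a, word):
    """tau_a(a) = a and tau_a(b) = ab for b != a, extended to words."""
    return "".join(c if c == a else a + c for c in word)


def s_adic(I, a):
    """tau_{i_1} o ... o tau_{i_n}(a) for the finite index prefix I."""
    word = a
    for i in reversed(I):
        word = tau(i, word)
    return word


def run_bounds(I):
    """Longest open and closed runs seen in the finite index prefix I."""
    longest_open = longest_closed = 0
    l = 0
    while l < len(I):
        a, r = I[l], 1
        while l + r < len(I) and I[l + r] == a:
            r += 1
        if a in I[:l]:
            longest_closed = max(longest_closed, r)
        else:
            # The run starting at the first occurrence of a is open; the
            # runs starting one position later inside it are closed.
            longest_open = max(longest_open, r)
            longest_closed = max(longest_closed, r - 1)
        l += r
    return longest_open, longest_closed


def max_power(word, max_period):
    """Largest p such that u^p occurs in word for some u with |u| <= max_period."""
    best = 1
    for period in range(1, max_period + 1):
        matches = 0  # consecutive positions j with word[j] == word[j + period]
        for j in range(len(word) - period):
            if word[j] == word[j + period]:
                matches += 1
                best = max(best, (matches + period) // period)
            else:
                matches = 0
    return best


I = "123" * 3  # sample index prefix (assumption)
open_run, closed_run = run_bounds(I)
predicted = max(1 + open_run, 2 + closed_run)
word = s_adic(I, "1")
observed = max_power(word, len(word) // 3)
print(predicted, observed)
```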
Proposition 4.1. Suppose $a, a_1, \ldots, a_n, a_{n+1} \in A_k$ with $a_i \neq a$, $1 \le i \le n+1$ and $x \in A_k^{\mathbb{N}}$ has an occurrence of $a$. Then, for every $r \in \mathbb{N}$, $\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n} \circ \tau_a^r \circ \tau_{a_{n+1}}(x)$ contains $(\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a))^{r+2}$.

Proof. Since $x$ contains $a$ and $a_{n+1} \neq a$, $\tau_{a_{n+1}}(x)$ contains $a_{n+1} a a_{n+1}$. Thus $\tau_a^r \circ \tau_{a_{n+1}}(x)$ contains $a^r a_{n+1} a^{r+1} a_{n+1} a$. Therefore $\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n} \circ \tau_a^r \circ \tau_{a_{n+1}}(x)$ contains
$$(\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a))^{r+1}\, \tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a_{n+1} a).$$
By Lemma 4.1, $\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a_{n+1} a)$ has $\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a)$ as a prefix.

Proposition 4.2. Let $X$ be an AR subshift over $A_k$ and $(i_n)$ its index sequence. If $(i_n)$ contains a run $a^r$, then $X$ has a factor $u^{r+1}$. If the run $a^r$ is preceded by $a$ somewhere in $(i_n)$, then $X$ has a factor $u^{r+2}$. In particular,
$$\mathrm{ind}(X) \ge \max\{N_1, N_2\}.$$

Proof. Both claims follow immediately from Proposition 4.1 and its proof.

We now aim at proving $\mathrm{ind}(X) \le \max\{N_1, N_2\}$. This will be done by starting from a factor $u^p$, $p \ge 3$ and then performing an iterated desubstitution process which will produce an $r$-run in the index sequence. In general, we have $r = p - 2$, but under certain circumstances, we have $r = p - 1$. To illustrate this procedure, let us start with an example.

Suppose $X$ is an AR subshift over three symbols such that $F_X$ contains
$$(21232121232122123212123212212321)^p. \tag{4.1}$$
Clearly, $C = C^{(1)} = (c_n)$ is of type 2 and hence $i_1 = 2$. Thus, $C^{(1)} = \tau_2(C^{(2)})$ and $C^{(2)}$ must contain the factor
$$(13113121311312131)^p. \tag{4.2}$$
Now, $C^{(2)}$ is of type 1 and hence $i_2 = 1$. Thus, $C^{(2)} = \tau_1(C^{(3)})$ and $C^{(3)}$ must contain the factor
$$(3132313231)^{p-1}\, 313231323?\,. \tag{4.3}$$
Observe that the last symbol in the last block cannot be desubstituted uniquely. We indicate this ambiguity by "?" and note that the last block is one symbol shorter than the other blocks. Next, $i_3 = 3$, $C^{(3)} = \tau_3(C^{(4)})$ and $C^{(4)}$ must contain the factor
$$(12121)^{p-1}\, 1212?\,. \tag{4.4}$$
The ambiguity on this level comes from the ambiguity on the previous level, that is, the "?" in (4.3). However, we clearly have $i_4 = 1$ and hence the last 2 in (4.4) must be followed by a 1, so in fact $C^{(4)}$ must contain the factor
$$(12121)^{p-1}\, 12121. \tag{4.5}$$
This allows us to go up one level and replace the "?" in (4.3) by a 1. This deciphers all the ambiguities up to this point. Next, $C^{(4)} = \tau_1(C^{(5)})$ and $C^{(5)}$ must contain the factor
$$(221)^{p-1}\, 22? \tag{4.6}$$
and $i_5 = 2$, $C^{(5)} = \tau_2(C^{(6)})$ and $C^{(6)}$ must contain the factor
$$(21)^{p-1}\, 2?\,. \tag{4.7}$$
Now, $i_6$ is either 1 or 2, but in either case, further desubstitution yields a run of length $p - 2$ in the index sequence. Namely, if $i_6 = 1$, then $i_7 = \cdots = i_{7+(p-2)-1} = 2$, and $i_6 = 2$ gives $i_7 = \cdots = i_{7+(p-2)-1} = 1$. Note that, contrary to the situation above, the ambiguities in (4.6) and (4.7) cannot be removed.

We observe:
(1) The desubstitution process takes $u^p$ to $a^{p-1}?$ for some $a \in A_k$ and "?" is either known or not. This yields at least a $(p-2)$-run in the index sequence.
(2) If at no step there is an ambiguity in the desubstitution process, then $w^p$ reduces to $a^p$ and hence produces a $(p-1)$-run in the index sequence.
These observations lead to the following lemma:

Lemma 4.2. Let $X$ be an AR subshift over $A_k$ and $(i_n)$ its index sequence. Suppose there is $p \ge 3$ and a primitive $u \in A_k^*$ such that $u^p \in F_X$. Then we have one of the following scenarios:
(i) There are $a \in A_k$ and $m \in \mathbb{N}$ such that $i_{m+j} = a$, $1 \le j \le p - 1$.
(ii) There are $a \in A_k$ and $m \in \mathbb{N}$ such that $i_{m+j} = a$, $1 \le j \le p - 2$ and $i_j = a$ for some $j$ with $1 \le j \le m$.

Proof. Start with the word $u^p$ and perform a continued desubstitution process, as above, using $\tau_{i_1}^{-1}$, $\tau_{i_2}^{-1} \circ \tau_{i_1}^{-1}$, $\tau_{i_3}^{-1} \circ \tau_{i_2}^{-1} \circ \tau_{i_1}^{-1}, \ldots$, where the last symbol of the desubstituted word may be unknown and hence denoted by "?". Clearly, this process leads, after, say, $m$ steps, to a desubstituted word which has either the form $a^p$ or $a^{p-1}?$, where $a$ is some symbol from $A_k$. In particular, we must have $i_{m+j} = a$ for $1 \le j \le p - 1$ (in the first case) or $1 \le j \le p - 2$ (in the second case). It only remains to be shown that in the second case, we must have applied $\tau_a$ somewhere along the way. Notice that each desubstituted word results from the previous word by a deletion of a number of symbols. In particular, the word $u$ we started with must contain $a$. Consider first the case where $u$ contains at least two occurrences of $a$. Then, in order to reduce $u^2$ (the first two of the $p \ge 3$ blocks) to $a^2$, we necessarily have to apply $\tau_a$ along the way. Let us now consider the case where $u$ contains exactly one $a$. Then we do not apply $\tau_a$ until we are left with a desubstituted word of length $\le 2$. That is, either the word is $a$, in which case we are done (since the $a$ in the last block never gets deleted and hence $u^p$ reduces to $a^p$), or the word is $ab$ (or $ba$) for some $b$. In the next step, either $\tau_a$ or $\tau_b$ is applied. Remember that the
last word still contains $a$ (so that it is one of $ab$, $ba$, $a?$) so that desubstitution by $\tau_b$ leads to $a^p$. That is, only desubstitution by $\tau_a$ leads to $a^{p-1}?$.

Proposition 4.3. Let $X$ be an AR subshift over $A_k$ and $(i_n)$ its index sequence. Then
$$\mathrm{ind}(X) \le \max\{N_1, N_2\}.$$

Proof. This is an immediate consequence of Lemma 4.2. Namely, a given power $u^p \in F_X$, for some $p \ge 3$, corresponds to either an open or closed $(p-1)$-run or a closed $(p-2)$-run in the index sequence. Note that every AR subshift contains squares (e.g. $i_1 i_1$) so that powers $p < 3$ are irrelevant for the computation of the index.

Proof of Theorem 4.1. The assertion follows from Propositions 4.2 and 4.3.

One might also be interested in powers that occur for arbitrarily long factors. That is, define $\mathrm{i\text{-}ind}(X) \in \mathbb{N} \cup \{\infty\}$ by
$$\mathrm{i\text{-}ind}(X) = \sup\{p \in \mathbb{N} : \text{there exist } u_n \text{ with } |u_n| \to \infty \text{ such that } u_n^p \in F_X\}.$$
Then, the above analysis has the following immediate consequence:

Corollary 4.2. Let $X$ be an AR subshift over $A_k$ and $(i_n)$ its index sequence. If $\mathrm{i\text{-}ind}(X)$ is defined as above, then
$$\mathrm{i\text{-}ind}(X) = 2 + \limsup_{n \to \infty} e_n,$$
where, for $n \in \mathbb{N}$,
$$e_n = \max\{l \in \mathbb{N} : i_{n+l-1} = i_n\}.$$

Proof. Since every symbol from $A_k$ occurs in $(i_n)$, the index sequence has exactly $k$ open runs. Moreover, there is $N \in \mathbb{N}$ such that beyond $i_N$, there are no more open runs. In particular, for the computation of $\mathrm{i\text{-}ind}(X)$, only closed runs are relevant. Thus, the assertion follows in a straightforward way from Proposition 4.2, Lemma 4.2, and their proofs.

5. Palindromes in Arnoux–Rauzy Sequences

In this section we study the number of palindromes of a given length that occur in a given AR subshift $X$. Recall that a word is called a palindrome if it is the same when read backwards. Given a minimal subshift $X$, define its palindrome complexity function $p : \mathbb{N} \to \mathbb{N}_0$ by
$$p(n) = |\{p \in F_X : p = p^R, |p| = n\}|.$$
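The palindrome complexity function can be evaluated on a finite word by collecting the distinct palindromic windows of each length. The sketch below assumes, as a sample, a prefix of the Tribonacci word (generated by iterating 1 → 12, 2 → 13, 3 → 1), which belongs to an AR subshift over three symbols; the function names are illustrative. As with any computation on a prefix, the values are meaningful only for lengths that are small compared with the prefix length.

```python
def tribonacci_prefix(iterations=12):
    """Prefix of the Tribonacci word, iterating 1 -> 12, 2 -> 13, 3 -> 1."""
    rules = {"1": "12", "2": "13", "3": "1"}
    word = "1"
    for _ in range(iterations):
        word = "".join(rules[c] for c in word)
    return word


def palindrome_complexity(x, n):
    """Number of distinct palindromes of length n occurring in the word x."""
    windows = (x[j:j + n] for j in range(len(x) - n + 1))
    return len({v for v in windows if v == v[::-1]})


x = tribonacci_prefix()
# Values p(1), ..., p(10) as seen in the prefix.
print([palindrome_complexity(x, n) for n in range(1, 11)])
```

The printed values can be compared with the palindrome complexity of AR subshifts over $A_k$ for $k = 3$ discussed in this section.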
It was shown by Droubay and Pirillo that all Sturmian subshifts have the same palindrome complexity function, namely,
$$p(n) = \begin{cases} 2 & \text{if } n \text{ is odd}, \\ 1 & \text{if } n \text{ is even}, \end{cases} \tag{5.1}$$
and that Sturmian subshifts are in fact characterized by this property [29]. The first part, namely, that all AR subshifts over $A_k$ have the same palindrome complexity function was shown by Justin and Pirillo in [15]. The second part, however, does not extend. That is, AR subshifts are not characterized by their palindrome complexity. We give a simple alternate proof of the result of Justin and Pirillo which reads as follows (cf. [15, Theorem 4.4]):

Theorem 5.1. The palindrome complexity function $p : \mathbb{N} \to \mathbb{N}_0$ of an AR subshift $X$ over $A_k$ is given by
$$p(n) = \begin{cases} k & \text{if } n \text{ is odd}, \\ 1 & \text{if } n \text{ is even}. \end{cases} \tag{5.2}$$

Proof. We shall prove the statement
$$\forall\, n \in \mathbb{N} : \quad p(2n-1) = k, \quad p(2n) = 1, \tag{5.3}$$
which is equivalent to the assertion, by induction on $n$. The case $n = 1$ is readily checked. In fact, $p(1) = k$ is obvious, and if $X$ is of type $a \in A_k$, then $aa$ is the unique palindrome of length 2 which occurs in $X$. Now assume that (5.3) holds for $n$. Let us show (5.3) for $n + 1$ by proving that if $p$ is a palindrome occurring in $X$, then $p$ admits a unique extension $apa$ to a palindrome of length $|p| + 2$. Fix a palindrome $p \in F_X$. We show below that
$$p \in F_X \text{ bispecial} \ \Rightarrow\ \text{there exists a unique } a \in A_k \text{ such that } apa \in F_X. \tag{5.4}$$
Now, either there exists a unique $a \in A_k$ such that $apa \in F_X$, or else $p$ is bispecial (and so by (5.4) there exists a unique $a \in A_k$ such that $apa \in F_X$); hence in either case there exists a unique $a \in A_k$ such that $apa \in F_X$.

Let us show (5.4). Let $a \in A_k$ be the unique letter for which $ap$ is right-special (and, equivalently, $pa$ is left-special). Then $apa \in F_X$. Consider any letter $b \neq a$. Then, we have that $bp$ is not right-special, and $bpa \in F_X$ (since $pa$ is left-special), so $bpb \notin F_X$.

As we mentioned above, every minimal subshift with palindrome complexity given by (5.1) is necessarily Sturmian. We are now going to show that this does not extend to the AR case, that is, there are non-AR subshifts with palindrome complexity given by (5.2). To this end, we consider subshifts $X_{3\mathrm{iet}}$, defined over $A_3$, associated with three-interval exchange transformations. These dynamical systems have the following combinatorial description, as shown by Ferenczi et al. [30]:
• $F_{X_{3\mathrm{iet}}}(2) = \{12, 13, 21, 22, 31\}$.
• If $u \in F_{X_{3\mathrm{iet}}}$, then $u^R \in F_{X_{3\mathrm{iet}}}$.
• For every $n \in \mathbb{N}$, there are exactly two left-special words in $F_{X_{3\mathrm{iet}}}(n)$, one beginning in 1 and one beginning in 2.
• If $w$ is a bispecial word ending in 1 and $w \neq w^R$, then $w2$ is left-special if and only if $w^R 1$ is left-special.
Clearly, no such subshift is an AR subshift. We have the following result:

Proposition 5.1. The palindrome complexity function $p$ of $X_{3\mathrm{iet}}$ is given by
$$p(n) = \begin{cases} 3 & \text{if } n \text{ is odd}, \\ 1 & \text{if } n \text{ is even}. \end{cases} \tag{5.5}$$

Proof. The proof is similar to the proof of Theorem 5.1. It follows from the proof of Proposition 2.6 in [30] that if $u$ is a bispecial palindrome factor, then there exists a unique symbol $a \in A_3$ such that $aua$ is a factor. In fact, if $u$ begins in 1, then $a \in \{2, 3\}$, while if $u$ begins in 2, then $a \in \{1, 2\}$. So now suppose $u$ is a palindrome factor of length $n$. We claim there exists a unique symbol $a \in A_3$ such that $aua$ is a factor. If no such $a$ exists, then there exist distinct symbols $b, c \in A_3$ such that $buc$ is a factor. This implies that $u$ is bispecial, so from the above there must exist $a \in A_3$ such that $aua$ is a factor. Next, suppose there exist distinct symbols $b, c \in A_3$ such that $bub$ and $cuc$ are factors. Then again $u$ is bispecial, and hence this cannot happen. Thus $p(n) = p(n+2)$. Since $p(1) = 3$ and $p(2) = 1$, the assertion follows.

On the other hand, $X_{3\mathrm{iet}}$ has the same factor complexity function as an AR subshift over three symbols, so one may ask whether (5.5) implies that
$$f(n) = 2n + 1. \tag{5.6}$$
This is not true, at least on the level of individual sequences, as demonstrated by the following result (the example is due to J. Cassaigne [31]).

Proposition 5.2. There exists a sequence over $A_3$ whose palindrome complexity function is given by (5.5), but whose factor complexity function is not given by (5.6).

Proof. Let $w = 121312141213121 \cdots$ be the fixed point of the infinite substitution
$$1 \mapsto 12, \quad 2 \mapsto 13, \quad 3 \mapsto 14, \ \ldots.$$
Let $w'$ be the morphic image of $w$ under the map $\Theta$ where $\Theta(i) = 1^i 2 1^i 3 1^i$. So
$$w' = 1213111211311121311112111311112131 \cdots.$$
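The construction of $w$ and $w'$ in this proof is easy to reproduce on finite prefixes. The sketch below makes the following assumptions: the infinite substitution is read as $i \mapsto 1\,(i+1)$ for every positive integer $i$, symbols of $w$ are stored as Python integers, and five iterations are used (giving 32 symbols of $w$); the function names are illustrative. It builds a prefix of $w' = \Theta(w)$ and counts factors and palindromes of small length in it.

```python
def fixed_point_prefix(iterations):
    """Prefix of the fixed point of i -> 1 (i+1), as a list of integers."""
    w = [1]
    for _ in range(iterations):
        w = [c for i in w for c in (1, i + 1)]
    return w


def theta(i):
    """Theta(i) = 1^i 2 1^i 3 1^i, written as a string over {1, 2, 3}."""
    return "1" * i + "2" + "1" * i + "3" + "1" * i


def factor_complexity(x, n):
    """Number of distinct factors of length n occurring in the word x."""
    return len({x[j:j + n] for j in range(len(x) - n + 1)})


def palindrome_complexity(x, n):
    """Number of distinct palindromes of length n occurring in the word x."""
    windows = (x[j:j + n] for j in range(len(x) - n + 1))
    return len({v for v in windows if v == v[::-1]})


w = fixed_point_prefix(5)  # 32 symbols, the largest being 6
w_prime = "".join(theta(i) for i in w)
print("".join(map(str, w[:15])))
print(w_prime[:34])
for n in range(1, 7):
    # Columns: n, palindromes of length n, factors of length n, 2n + 1.
    print(n, palindrome_complexity(w_prime, n), factor_complexity(w_prime, n), 2 * n + 1)
```

Only small lengths are inspected here, because a palindrome $1^n$ occurs in the prefix only once a sufficiently large symbol has appeared in $w$.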
Note that
$$22,\ 33,\ 2 1^n 2,\ 3 1^n 3,\ 3 1^n 2 1^n 3,\ 2 1^n 3 1^n 2 \ \text{ are not factors of } w'. \tag{5.7}$$
We therefore have for $w'$,
$$p(n) = 1 \text{ for } n \text{ even}, \tag{5.8}$$
since by (5.7) the only even length palindromes are $1^n$,
$$p(n) = 3 \text{ for } n \text{ odd}, \tag{5.9}$$
since by (5.7) the only odd length palindromes are $1^n$, $1^{\frac{n-1}{2}} 2\, 1^{\frac{n-1}{2}}$, $1^{\frac{n-1}{2}} 3\, 1^{\frac{n-1}{2}}$, and
$$f \text{ is not given by (5.6)}. \tag{5.10}$$
For example, $f(3) = 9$. By (5.8)–(5.10), we have palindrome complexity as in (5.5), but factor complexity different from (5.6).

Note, however, that the sequence $w'$ above is not uniformly recurrent and hence does not induce a minimal subshift. We consider it an interesting open problem to determine all minimal subshifts $X$ over $A_k$, with $k \ge 3$ arbitrary, whose palindrome complexity function is given by (5.2).

6. Applications to Schrödinger Operators

In this section, we discuss applications of our combinatorial results, Theorems 3.1 and 5.1 and Corollary 4.2, to the spectral theory of Schrödinger operators. A discrete one-dimensional Schrödinger operator acts in the Hilbert space $\mathcal{H} = \ell^2(\mathbb{Z})$. If $\phi \in \mathcal{H}$, then $H\phi$ is given by
$$(H\phi)(n) = \phi(n+1) + \phi(n-1) + V(n)\phi(n),$$
where $V : \mathbb{Z} \to \mathbb{R}$. The map $V$ is called the potential. For our purposes, we can assume $V$ bounded. Then $H$ is a bounded, self-adjoint operator. Denote the spectrum of $H$ by $\sigma(H)$. Given an initial state $\phi \in \mathcal{H}$, the Schrödinger time evolution is given by $\phi(t) = \exp(-itH)\phi$, where $\exp(-itH)$ is given by the spectral theorem. One is interested in the question whether $\phi(t)$ will spread out in space, and if so, how fast. One possible way to tackle this issue is to study the spectral measure $\mu_\phi$ associated with $\phi$, which is defined by
$$\langle \phi, (H - z)^{-1} \phi \rangle = \int_{\mathbb{R}} \frac{d\mu_\phi(x)}{x - z} \quad \text{for every } z \text{ with } \operatorname{Im} z > 0.$$
Roughly speaking, the more continuous $\mu_\phi$, the faster the spreading of $\phi(t)$; compare, for example, [32–34]. Denote
$$\mathcal{H}_{\mathrm{ac}} = \{\phi \in \mathcal{H} : \mu_\phi \text{ is absolutely continuous}\},$$
$$\mathcal{H}_{\mathrm{sc}} = \{\phi \in \mathcal{H} : \mu_\phi \text{ is singular continuous}\},$$
$$\mathcal{H}_{\mathrm{pp}} = \{\phi \in \mathcal{H} : \mu_\phi \text{ is pure point}\}$$
and $\sigma_\varepsilon(H) = \sigma(H|_{\mathcal{H}_\varepsilon})$ for $\varepsilon \in \{\mathrm{ac}, \mathrm{sc}, \mathrm{pp}\}$. We say that $H$ has purely absolutely continuous spectrum if both $\sigma_{\mathrm{sc}}(H)$ and $\sigma_{\mathrm{pp}}(H)$ are empty, etc.

As was mentioned in the introduction, there has been a considerable amount of research dealing with the spectral properties of $H$ if $V$ displays long-range order. The totally ordered case (i.e. $V$ periodic) is well-understood [35]. In this case, $H$ has purely absolutely continuous spectrum. If $V$ takes on only finitely many values, one popular measure for long-range order is given by the complexity function. It has then been the goal to determine the spectral properties for aperiodic potentials of low combinatorial complexity. A complete understanding has been obtained for Sturmian potentials [4, 36] and quasi-Sturmian potentials [8]. It turned out that in all these cases, one has purely singular continuous spectrum, supported on a Cantor set of Lebesgue measure zero. Here, a Cantor set is a closed, perfect, nowhere dense set.

It is natural to conjecture that these properties are shared by other low-complexity potentials. In fact, these questions can be studied from a purely combinatorial perspective. That is, there are results that deduce singular continuous, zero-measure spectrum from purely combinatorial properties of the potential. Here we study the case of Arnoux–Rauzy potentials which provide a natural class of low-complexity potentials.

Fix a two-sided AR subshift $X$ over $A_k$ with index sequence $(i_n)$ and a nonconstant function $f : A_k \to \mathbb{R}$. Denote the unique ergodic measure on $X$ by $\nu$. Each element $x$ of $X$ induces a potential via $V_x(n) = f(x_n)$. The Schrödinger operator with potential $V_x$ will be denoted by $H_x$. Since $X$ is minimal, we have that the spectrum and the absolutely continuous spectrum are invariants of $X$, that is, there are sets $\Sigma, \Sigma_{\mathrm{ac}} \subseteq \mathbb{R}$ such that $\sigma(H_x) = \Sigma$ and $\sigma_{\mathrm{ac}}(H_x) = \Sigma_{\mathrm{ac}}$ for every $x \in X$. The result for the spectrum follows from strong convergence and is folklore. The result on the absolutely continuous spectrum is much deeper and more recent [37]. In fact, aperiodicity implies that $\Sigma_{\mathrm{ac}}$ is empty [38]. Thus, to establish the desired picture, we have to show that $\Sigma$ has Lebesgue measure zero and $\sigma_{\mathrm{pp}}(H_x)$ is often/always empty.

We first turn to the zero-measure property. It is a result of Lenz that linear recurrence provides a sufficient condition:

Theorem 6.1 (Lenz [5]). If $X$ is a linearly recurrent subshift and $X$ and $f$ are such that the resulting potentials $V_x$ are aperiodic, then $\Sigma$ has Lebesgue measure zero.

Combining this with our Theorem 3.1, we immediately obtain the following (the Cantor set properties follow from the zero-measure property by general principles):

Corollary 6.1. If every letter $a \in A_k$ occurs in $(i_n)$ with bounded gaps, then $\sigma(H_x)$ is a Cantor set of zero Lebesgue measure for every $x \in X$.

Let us now discuss the absence of point spectrum. Both palindromes and powers allow one to prove this property. The palindrome criterion is easy to verify, but it has the slight disadvantage that it only gives generic absence of eigenvalues:
Theorem 6.2 (Hof et al. [11]). If X is a minimal subshift and its palindrome complexity function obeys lim sup_{n→∞} p(n) > 0, then for a dense G_δ-set of x ∈ X, we have σ_pp(H_x) = ∅.

We immediately deduce from this and Theorem 5.1:

Corollary 6.2. For a dense G_δ-set of x ∈ X, we have σ_pp(H_x) = ∅.

We remark that an analog of Theorem 6.2 for half-line Schrödinger operators was found in [39].

On the other hand, the criterion for empty point spectrum which is based on powers is slightly more complicated to state and requires more effort to be verified, but yields a stronger conclusion. Define the set X_n of elements of X which have cubes of length 3n, suitably centered around the origin, by
\[ X_n = \{ x \in X : x_{-n+j} = x_j = x_{n+j},\ 1 \le j \le n \}. \]
Then we have the following result (the proof is based on a Gordon-type argument [40]; see e.g. [3, 10]):

Theorem 6.3. Suppose lim sup_{n→∞} ν(X_n) > 0. Then, for ν-almost every x ∈ X, we have σ_pp(H_x) = ∅.

We can use this theorem and our Corollary 4.2 to show:

Corollary 6.3. If the index sequence (i_n) contains infinitely many 2-runs, we have σ_pp(H_x) = ∅ for ν-almost every x ∈ X.

Proof. Corollary 4.2 shows that if the index sequence (i_n) contains infinitely many 2-runs, then F_X contains arbitrarily long fourth powers. That is, there are u_n ∈ A_k^* with |u_n| → ∞ and u_n^4 ∈ F_X. Since X is aperiodic, either u_n^4 is right-special or one of its conjugates is right-special. Thus, we can assume without loss of generality that u_n^4 is right-special. It was shown in [17, Lemma 2.2] that among all factors of length 4|u_n|, the right-special factor has the largest frequency. Since there are 4(k − 1)|u_n| + 1 words of length 4|u_n| whose frequencies add up to one, we infer that
\[ d(u_n^4) \ge \frac{1}{4(k-1)|u_n| + 1}. \]
This yields, using (2.1),
\[ \limsup_{n\to\infty} \nu(X_n) \ge \limsup_{n\to\infty} \nu(X_{|u_n|}) \ge \limsup_{n\to\infty} \frac{|u_n|}{4(k-1)|u_n| + 1} = \frac{1}{4(k-1)} > 0. \]
Thus, the assertion follows from Theorem 6.3.

Corollary 6.3 does not cover the prominent case of the Tribonacci subshift X_Trib, which is defined over three symbols and corresponds to the index sequence (i_n) = 1, 2, 3, 1, 2, 3, 1, 2, 3, . . . .
We shall nevertheless show that the conclusion of Corollary 6.3 holds for this case. By Theorem 2.2, the characteristic sequence C = (c_n) is given by
\[ C = \lim_{n\to\infty} (\tau_1 \circ \tau_2 \circ \tau_3)^n (1). \]
The substitution S = τ_1 ◦ τ_2 ◦ τ_3 on A_3 is given by
\[ S(1) = 1213121, \qquad S(2) = 121312, \qquad S(3) = 1213. \]
Note that S is primitive (i.e. there is l ∈ N, namely l = 1, such that for every a ∈ A_3, S^l(a) contains all symbols from A_3). Recall that a fractional power w^q is a word w^p w′ with p ∈ N, w′ a prefix of w, and q = p + |w′|/|w|. We have the following result for subshifts generated by primitive substitutions:

Theorem 6.4 (Damanik [41]). Suppose the subshift X is generated by a primitive substitution S and F_X contains a fractional power w^q with q > 3. Then we have σ_pp(H_x) = ∅ for ν-almost every x ∈ X.

This allows us to prove the following:

Corollary 6.4. For the Tribonacci subshift X_Trib, we have σ_pp(H_x) = ∅ for ν-almost every x ∈ X_Trib.

Proof. As we have seen above, C is the unique fixed point of S in A_3^N and we have
\[ F_{X_{\mathrm{Trib}}} = \bigcup_{n \in \mathbb{N}} F_{S^n(1)}. \]
Thus it suffices to find some S^n(1) which contains w^q with q > 3. The claim then follows from Theorem 6.4. First, S^2(1) contains the word 1121. Thus, S^3(1) contains the word
1213121 1213121 121312 1213121 = (1213121)^3 2 · · · ,
and hence S^4(1) contains
(1213121 121312 1213121 1213 1213121 121312 1213121)^3 121312 · · · ,
which yields a fractional power w^q with q = 3 + 3/22 > 3.

Acknowledgments

We thank J. Cassaigne for useful discussions. D. D. would like to express his gratitude to the Department of Mathematics at the University of North Texas at Denton for its warm hospitality and financial support through the Texas Advanced Research Program. D. D. was supported in part by NSF Grant No. DMS–0227289.
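As a sanity check on the word computations in the proof of Corollary 6.4, the following Python sketch applies the substitution S literally and tests the three containment claims. The function names are illustrative only and are not taken from the paper; the check covers just the finite word identities, not the spectral conclusion.

```python
from fractions import Fraction

# The substitution S on the alphabet {1, 2, 3}, exactly as displayed in the text.
S = {"1": "1213121", "2": "121312", "3": "1213"}

def apply_substitution(word: str) -> str:
    """Apply S letter by letter."""
    return "".join(S[letter] for letter in word)

def iterate(word: str, n: int) -> str:
    """Return S^n(word)."""
    for _ in range(n):
        word = apply_substitution(word)
    return word

w = S["1"]                  # w = 1213121
v = apply_substitution(w)   # v = S(w), the long word cubed in the proof

# Primitivity with l = 1: every S(a) contains all three symbols.
assert all(set(image) == {"1", "2", "3"} for image in S.values())

# The three containment claims made in the proof.
assert "1121" in iterate("1", 2)             # S^2(1) contains 1121
assert w * 3 + "2" in iterate("1", 3)        # S^3(1) contains w^3 2
assert v * 3 + "121312" in iterate("1", 4)   # S^4(1) contains v^3 121312

# 121312 is a prefix of v, so v^3 121312 is the fractional power v^q with:
assert v.startswith("121312")
q = 3 + Fraction(len("121312"), len(v))
print(len(v), q)
```

The exponent is computed with exact rational arithmetic so that it can be compared directly with the value 3 + 3/22 stated in the proof.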
References

[1] D. Shechtman, I. Blech, D. Gratias and J. V. Cahn, Metallic phase with long-range orientational order and no translational symmetry, Phys. Rev. Lett. 53 (1984), 1951–1953.
[2] A. Sütő, Schrödinger difference equation with deterministic ergodic potentials, in Beyond Quasicrystals (Les Houches, 1994), eds. F. Axel and D. Gratias, Springer, Berlin (1995), 481–549.
[3] D. Damanik, Gordon-type arguments in the spectral theory of one-dimensional quasicrystals, in Directions in Mathematical Quasicrystals, eds. M. Baake and R. V. Moody, CRM Monograph Series 13, AMS, Providence, RI (2000), 277–305.
[4] D. Damanik, R. Killip and D. Lenz, Uniform spectral properties of one-dimensional quasicrystals, III. α-continuity, Commun. Math. Phys. 212 (2000), 191–204.
[5] D. Lenz, Singular spectrum of Lebesgue measure zero for quasicrystals, Commun. Math. Phys. 227 (2002), 119–130.
[6] Q.-H. Liu, B. Tan, Z.-X. Wen and J. Wu, Measure zero spectrum of a class of Schrödinger operators, J. Statist. Phys. 106 (2002), 681–691.
[7] J.-P. Allouche and D. Damanik, Applications of combinatorics on words to physics, in preparation.
[8] D. Damanik and D. Lenz, Uniform spectral properties of one-dimensional quasicrystals, IV. Quasi-Sturmian potentials, to appear in J. d'Analyse Math.
[9] B. Adamczewski and D. Damanik, Linearly recurrent circle map subshifts and an application to Schrödinger operators, Ann. Henri Poincaré 3 (2002), 1019–1047.
[10] F. Delyon and D. Petritis, Absence of localization in a class of Schrödinger operators with quasiperiodic potential, Commun. Math. Phys. 103 (1986), 441–444.
[11] A. Hof, O. Knill and B. Simon, Singular continuous spectrum for palindromic Schrödinger operators, Commun. Math. Phys. 174 (1995), 149–159.
[12] P. Arnoux and G. Rauzy, Représentation géométrique de suites de complexité 2n + 1, Bull. Soc. Math. France 119 (1991), 199–215.
[13] R. N. Risley and L. Q. Zamboni, A generalization of Sturmian sequences: combinatorial structure and transcendence, Acta Arith. 95 (2000), 167–184.
[14] X. Droubay, J. Justin and G. Pirillo, Epi-Sturmian words and some constructions of de Luca and Rauzy, Theoret. Comput. Sci. 255 (2001), 539–553.
[15] J. Justin and G. Pirillo, Episturmian words and episturmian morphisms, Theoret. Comput. Sci. 276 (2002), 281–313.
[16] J. Justin and L. Vuillon, Return words in Sturmian and episturmian words, Theor. Inform. Appl. 34 (2000), 343–356.
[17] N. Wozny and L. Q. Zamboni, Frequencies of factors in Arnoux–Rauzy sequences, Acta Arith. 96 (2001), 261–278.
[18] D. Damanik and D. Lenz, Linear repetitivity. I. Uniform subadditive ergodic theorems and applications, Discrete Comput. Geom. 26 (2001), 411–428.
[19] F. Durand, Linearly recurrent subshifts have a finite number of non-periodic subshift factors, Ergodic Theory Dynam. Systems 20 (2000), 1061–1078.
[20] F. Durand, B. Host and C. Skau, Substitutional dynamical systems, Bratteli diagrams and dimension groups, Ergodic Theory Dynam. Systems 19 (1999), 953–993.
[21] J. C. Lagarias and P. A. B. Pleasants, Repetitive Delone sets and quasicrystals, to appear in Ergodic Theory Dynam. Systems.
[22] D. Lenz, Uniform ergodic theorems on subshifts over a finite alphabet, Ergodic Theory Dynam. Systems 22 (2002), 245–255.
[23] F. Durand, Corrigendum and appendum to: Linearly recurrent subshifts have a finite number of non-periodic subshift factors, to appear in Ergodic Theory Dynam. Systems.
[24] J. Berstel, On the index of Sturmian words, in Jewels are Forever, Springer, Berlin (1999), 287–294.
[25] D. Damanik and D. Lenz, The index of Sturmian sequences, European J. Combin. 23 (2002), 23–29.
[26] J. Justin and G. Pirillo, Fractional powers in Sturmian words, Theoret. Comput. Sci. 255 (2001), 363–376.
[27] D. Vandeth, Sturmian words and words with a critical exponent, Theoret. Comput. Sci. 242 (2000), 283–300.
[28] F. Mignosi, On the number of factors of Sturmian words, Theoret. Comput. Sci. 82 (1991), 71–84.
[29] X. Droubay and G. Pirillo, Palindromes and Sturmian words, Theoret. Comput. Sci. 223 (1999), 73–85.
[30] S. Ferenczi, C. Holton and L. Q. Zamboni, Structure of three-interval exchange transformations II: A combinatorial description of the trajectories, to appear in J. d'Analyse Math.
[31] J. Cassaigne, private communication.
[32] J. M. Barbaroux, F. Germinet and S. Tcheremchantsev, Fractal dimensions and the phenomenon of intermittency in quantum dynamics, Duke Math. J. 110 (2001), 161–193.
[33] I. Guarneri, Spectral properties of quantum diffusion on discrete lattices, Europhys. Lett. 10 (1989), 95–100.
[34] Y. Last, Quantum dynamics and decompositions of singular continuous spectra, J. Funct. Anal. 142 (1996), 406–445.
[35] G. Teschl, Jacobi Operators and Completely Integrable Nonlinear Lattices, Mathematical Surveys and Monographs 72, AMS, Providence, RI, 2000.
[36] J. Bellissard, B. Iochum, E. Scoppola and D. Testard, Spectral properties of one-dimensional quasi-crystals, Commun. Math. Phys. 125 (1989), 527–543.
[37] Y. Last and B. Simon, Eigenfunctions, transfer matrices, and absolutely continuous spectrum of one-dimensional Schrödinger operators, Invent. Math. 135 (1999), 329–367.
[38] S. Kotani, Jacobi matrices with random potentials taking finitely many values, Rev. Math. Phys. 1 (1989), 129–133.
[39] D. Damanik, J.-M. Ghez and L. Raymond, A palindromic half-line criterion for absence of eigenvalues and applications to substitution Hamiltonians, Ann. Henri Poincaré 2 (2001), 927–939.
[40] A. Gordon, On the point spectrum of the one-dimensional Schrödinger operator, Usp. Math. Nauk 31 (1976), 257–258.
[41] D. Damanik, Singular continuous spectrum for a class of substitution Hamiltonians II, Lett. Math. Phys. 54 (2000), 25–31.
November 5, 2003 9:23 WSPC/148-RMP
00174
Reviews in Mathematical Physics Vol. 15, No. 7 (2003) 765–788 © World Scientific Publishing Company
PHASE TRANSITION FROM THE VIEWPOINT OF RELAXATION PHENOMENA
NOBUO YOSHIDA Division of Mathematics, Graduate School of Science Kyoto University, Kyoto 606-8502, Japan
Received 3 February 2003
Revised 15 August 2003
Some results on the relaxation processes (Glauber dynamics) obtained in the last decade are presented. This article is intended to be a short guided tour through these results for readers without prior knowledge of rigorous statistical mechanics or stochastic processes. Keywords: Gibbs measure; phase transition; relaxation time; log-Sobolev inequality.
Contents
0. Introduction
   0.1 What is relaxation phenomenon?
   0.2 Some notations in probability
1. Gibbs Measure
   1.1 Gibbs measure
   1.2 The phase transition
2. Glauber Dynamics
   2.1 The case of the Ising model
   2.2 The case of the lattice φ^4-field
   2.3 Relaxation time
3. Relaxation in the Unique Phase Region
   3.1 The equivalence of the decay of correlation and the fast relaxation
   3.2 Sufficient conditions for (DC), (SG) or (LS)
   3.3 Temperatures slightly above 1/βc, or small non-zero magnetic fields
4. Relaxation in the Phase-Coexistence Region
   4.1 Some general observations
   4.2 The free boundary condition
   4.3 The plus boundary condition
A. Appendix
   A.1 Proof of Eq. (2.7)
   A.2 Brownian motion and Eq. (2.10)
   A.3 Proof of Eq. (2.14)
   A.4 Proof of Proposition 2.3
0. Introduction

0.1. What is relaxation phenomenon?

Imagine a piece of iron from the microscopic point of view. The piece of iron is then a collection of a large number of atoms randomly interacting with each other through the spinning of their electrons. According to equilibrium statistical mechanics, the statistics of the spins is governed by a probability distribution called the Gibbs measure, which is supposed to describe the distribution of a large number of random objects in equilibrium. As we discuss later in detail, the Gibbs measure works as a mathematical tool to formulate the notion of phase transition. On the other hand, there is a rather different way of looking at the Gibbs measure. The Gibbs measure is understood as the stationary measure for relaxation processes. Here, a relaxation process means the way a physical system goes back to its equilibrium after being disturbed by some external factor (e.g. temporary exposure of the piece of iron to an external magnetic field). Relaxation processes are a familiar aspect of nature which we experience through diffusion of particles, heat conduction, etc. An interesting thing from the physical/mathematical point of view is that the phase transition and the relaxation are roughly related as follows:

unique phase ↔ fast relaxation (0.1)

phase coexistence ↔ slow relaxation (0.2)

The purpose of this article is to present some theorems which make (0.1) and (0.2) clearer in the mathematical framework. The readers are not required to be experienced in statistical mechanics or stochastic processes, although knowledge in these fields would certainly make the reading easier. We have tried to make the exposition as clear as possible without going into technicalities. More complete lecture notes can be found in [1, 2].

0.2. Some notations in probability

Let (X, A) be a measurable space and µ be a probability measure on (X, A).
• The expectation of a function f ∈ L^1(µ) is denoted by µ(f) = ∫_X f dµ.
• The correlation (or covariance) of f, g ∈ L^2(µ) is denoted by µ(f; g) = µ((f − µ(f))(g − µ(g))).

1. Gibbs Measure

1.1. Gibbs measure

We would like to describe a physical system in which a large number of particles interact with each other. In the models we will consider here, the state of a particle
at a position x is denoted by σ_x. We will then introduce the Gibbs measure as a probability measure which governs the random variable (σ_x)_{x∈Λ}, where Λ is a large set. We will discuss two different models (Ising model and lattice φ^4-field) at the same time.

Position of a particle: The position x of a particle is supposed to be a point in the d-dimensional cubic lattice Z^d:
\[ \mathbb{Z}^d = \{ x = (x_i)_{i=1}^d \,;\ x_i \in \mathbb{Z} \}. \]
Z^d is endowed with the ℓ^1-distance: ‖x‖_1 = Σ_{i=1}^d |x_i|. The notation Λ ⊂⊂ Z^d means that Λ is a non-empty, finite set in Z^d. For Λ ⊂ Z^d, the interior boundary ∂_in Λ and the exterior boundary ∂_ex Λ are defined by
\[ \partial_{\mathrm{in}} \Lambda = \{ x \in \Lambda \,;\ \|x - y\|_1 = 1 \text{ for some } y \in \mathbb{Z}^d \setminus \Lambda \}, \tag{1.1} \]
\[ \partial_{\mathrm{ex}} \Lambda = \{ y \in \mathbb{Z}^d \setminus \Lambda \,;\ \|x - y\|_1 = 1 \text{ for some } x \in \Lambda \}. \tag{1.2} \]
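To make definitions (1.1) and (1.2) concrete, here is a minimal Python sketch that computes both boundaries of a finite set of lattice points. The function names and the 3 × 3 example square are hypothetical and chosen only for illustration.

```python
from itertools import product

def neighbors(x):
    """Yield the lattice points at l1-distance 1 from x in Z^d."""
    for i in range(len(x)):
        for step in (-1, 1):
            yield x[:i] + (x[i] + step,) + x[i + 1:]

def interior_boundary(region):
    """Points of the region having a nearest neighbor outside it, as in (1.1)."""
    return {x for x in region if any(y not in region for y in neighbors(x))}

def exterior_boundary(region):
    """Points outside the region having a nearest neighbor inside it, as in (1.2)."""
    return {y for x in region for y in neighbors(x) if y not in region}

# Hypothetical example: a 3 x 3 square in Z^2.
square = set(product(range(3), repeat=2))
inner = interior_boundary(square)
outer = exterior_boundary(square)
print(len(inner), len(outer))
```

For the square, the interior boundary is everything except the center, while the exterior boundary consists of the four sides without the diagonal corner points, since those are at ℓ^1-distance 2.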
State σ_x of a particle: The state of a particle is expressed as a value in a set S. In the sequel, the set S is either a two-point set {−1, +1} (the case of the Ising model) or the real line R (the case of the lattice φ^4-field). We introduce the configuration spaces as follows:
\[ S^{\Lambda} = \{ \sigma = (\sigma_x)_{x \in \Lambda} \,;\ \sigma_x \in S \}, \quad \Lambda \subset\subset \mathbb{Z}^d, \]
\[ S^{\mathbb{Z}^d} = \{ \eta = (\eta_y)_{y \in \mathbb{Z}^d} \,;\ \eta_y \in S \}. \]
The way these configurations are used is as follows. We consider some large Λ ⊂⊂ Z^d and specify the spin configuration outside Λ by η ∈ S^{Z^d}, so that only (η_y)_{y∉Λ} (more exactly, only (η_y)_{y∈∂_ex Λ}) is used in doing so. We then discuss the statistical property of σ = (σ_x)_{x∈Λ}. The configuration η ∈ S^{Z^d} which appears in the way described above is called the boundary condition.

Interaction among particles: Suppose that Λ ⊂⊂ Z^d and that the spin configuration of the particles on Z^d\Λ is specified by η ∈ S^{Z^d}. We define the Hamiltonian H^{Λ,η} : S^Λ → R as follows:
\[ H^{\Lambda,\eta}(\sigma) = \frac{1}{2} \sum_{\substack{\{x,y\} \subset \Lambda \\ \|x-y\|_1 = 1}} \beta (\sigma_x - \sigma_y)^2 + \frac{1}{2} \sum_{\substack{x \in \Lambda,\ y \notin \Lambda \\ \|x-y\|_1 = 1}} \beta (\sigma_x - \eta_y)^2 + \sum_{x \in \Lambda} \beta h \sigma_x. \tag{1.3} \]
Here β > 0 and h ≥ 0 are parameters, which are usually called the inverse temperature and the external magnetic field. Note that only (η_y)_{y∈∂_ex Λ} is relevant in (1.3).
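The following Python sketch evaluates the Hamiltonian (1.3) term by term for a finite Λ, as a hedged illustration of how the three sums are organized. The data layout (a dictionary for σ, a function for η) and all names are assumptions made for this example, and the field term is implemented with the sign exactly as it is printed in (1.3).

```python
def neighbors(x):
    """Yield the lattice points at l1-distance 1 from x in Z^d."""
    for i in range(len(x)):
        for step in (-1, 1):
            yield x[:i] + (x[i] + step,) + x[i + 1:]

def hamiltonian(sigma, eta, beta, h):
    """Evaluate (1.3); sigma maps sites of Lambda to spins, eta maps outside sites to spins."""
    energy = 0.0
    for x, spin in sigma.items():
        for y in neighbors(x):
            if y in sigma:
                # Each unordered pair {x, y} inside Lambda is visited twice,
                # so the prefactor 1/2 in (1.3) becomes 1/4 here.
                energy += 0.25 * beta * (spin - sigma[y]) ** 2
            else:
                # Pairs (x, y) with x inside and y outside are visited once.
                energy += 0.5 * beta * (spin - eta(y)) ** 2
        energy += beta * h * spin  # field term, sign as printed in (1.3)
    return energy

# Hypothetical example: two sites in Z^1 with the boundary condition eta = +1.
def plus_boundary(y):
    return 1

aligned = {(0,): 1, (1,): 1}
mixed = {(0,): 1, (1,): -1}
print(hamiltonian(aligned, plus_boundary, beta=1.0, h=0.0))
print(hamiltonian(mixed, plus_boundary, beta=1.0, h=0.0))
```

In the example only the exterior boundary values η_{-1} and η_2 are ever read, which matches the remark that only (η_y)_{y∈∂_ex Λ} is relevant.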
Gibbs measure: To define two models (Ising model and lattice φ^4-field) at the same time, we endow S with a probability measure ν in two different ways depending on whether S = {−1, +1} (Ising model) or S = R (lattice φ^4-field).

• For the Ising model, we take
\[ S = \{-1, +1\}, \qquad \nu(\{+1\}) = \nu(\{-1\}) = \frac{1}{2}. \tag{1.4} \]
• For the lattice φ^4-field, we take
\[ S = \mathbb{R}, \qquad \nu(ds) = \exp(-U(s))\,ds \Big/ \int_{\mathbb{R}} \exp(-U(s))\,ds, \tag{1.5} \]
where U(s) = (s^2 − 1)^2 and ds is the Lebesgue measure; see Remark 1.2 below.

Suppose that Λ ⊂⊂ Z^d and that the spin configuration of the particles on Z^d\Λ is specified by η ∈ S^{Z^d}. We define a probability measure µ^{Λ,η} on the configuration space S^Λ by
\[ \mu^{\Lambda,\eta}(d\sigma) = \frac{\exp(-H^{\Lambda,\eta}(\sigma))}{Z^{\Lambda,\eta}} \prod_{x \in \Lambda} \nu(d\sigma_x), \tag{1.6} \]
where Z^{Λ,η} is the normalizing constant. We will henceforth refer to µ^{Λ,η} as the finite volume Gibbs measure on Λ with the boundary condition η.
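For the Ising model on a very small Λ, the measure (1.6) can be computed by brute-force enumeration. The sketch below does this in the special case d = 1, Λ = {0, ..., n − 1}, where the boundary condition reduces to two spins (called `left` and `right` here); this restriction and all names are assumptions made to keep the example short.

```python
from itertools import product
from math import exp

def hamiltonian(spins, left, right, beta, h):
    """(1.3) for Lambda = {0, ..., n-1} in Z^1 with boundary spins left and right."""
    bonds = sum((a - b) ** 2 for a, b in zip(spins, spins[1:]))
    edges = (spins[0] - left) ** 2 + (spins[-1] - right) ** 2
    return 0.5 * beta * (bonds + edges) + beta * h * sum(spins)

def ising_gibbs_measure(n, left, right, beta, h):
    """Return {configuration: probability} for (1.6) with S = {-1, +1}."""
    # nu is uniform on {-1, +1}, so the product of the nu(d sigma_x) is the
    # constant 2^(-n) and cancels against the normalizing constant.
    weights = {spins: exp(-hamiltonian(spins, left, right, beta, h))
               for spins in product((-1, 1), repeat=n)}
    total = sum(weights.values())
    return {spins: weight / total for spins, weight in weights.items()}

def mean_spin(measure, x):
    """Expectation of sigma_x under the given measure."""
    return sum(p * spins[x] for spins, p in measure.items())

plus = ising_gibbs_measure(3, +1, +1, beta=0.5, h=0.0)
minus = ising_gibbs_measure(3, -1, -1, beta=0.5, h=0.0)
print(mean_spin(plus, 1), mean_spin(minus, 1))
```

With h = 0 the two boundary conditions are related by a global spin flip, so the two printed means differ only in sign. Enumeration costs 2^n evaluations and is only meant for checking small cases.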
Remark 1.1. Though we discuss only finite volume Gibbs measures in this article, it is possible to define infinite volume Gibbs measures (Λ = Z^d) as is done in mathematical textbooks in statistical mechanics, e.g. [3, 4]. Roughly speaking, an infinite volume Gibbs measure µ is obtained as a limit of finite volume ones (1.6) as the set Λ expands to Z^d along a suitable sequence {Λ_n}_{n≥1}. The situation here is not simple because the limit may depend on the choice of the boundary condition η and even on the way {Λ_n}_{n≥1} is chosen. This uniqueness problem is fundamental in discussing infinite volume Gibbs measures. In this respect, see also Remark 1.4.

Remark 1.2. We can take a much more general function as U(s) in (1.5). For example, it is enough to assume the following condition: for any m > 0, there exist V, W ∈ C^∞(R → R) such that
\[ U(s) = V(s) + W(s) \quad \text{for all } s \in \mathbb{R}, \tag{1.7} \]
\[ \inf_s V''(s) \ge m, \tag{1.8} \]
\[ \|W\|_\infty + \|W'\|_\infty < \infty, \tag{1.9} \]
where ‖W‖_∞ = sup_s |W(s)|. A typical example of U with these requirements is given by the following polynomial:
\[ U(s) = \sum_{\nu=1}^{N} a_{2\nu} s^{2\nu} + a_1 s \tag{1.10} \]
where N ≥ 2, a_1, a_2 ∈ R, a_4 ≥ 0, . . . , a_{2(N−1)} ≥ 0 and a_{2N} > 0. Since a_2 can be a large negative value, the polynomial U in (1.10) may have arbitrarily deep double wells. Here, we have decided to choose one of the simplest non-trivial examples, U(s) = (s^2 − 1)^2, only for the sake of simplicity.

1.2. The phase transition

One of the greatest advantages of introducing the Gibbs measure is that it works as a mathematical tool to capture the notion of phase transition. In fact, if the situation of a piece of iron described at the beginning of this article is properly formulated, one can show that the system exhibits phase transition. We begin by explaining it conceptually. Let us first simplify the picture and regard each atom as a small magnetic dipole which represents the spinning of the electron. Each dipole (or spin) can point up (value +1) or down (value −1) with a certain randomness due to thermal fluctuation. On the other hand, two spins at neighboring atoms tend to point in the same direction to decrease the interaction energy. Then, the phase transition can be explained as follows:

• If the temperature is sufficiently high, the system is entirely dominated by thermal fluctuation and only one equilibrium, called the disordered phase, can be realized, in which spins point up and down almost independently of each other.
• On the other hand, if the temperature is sufficiently low, the thermal fluctuation is suppressed by the interaction energy, which tries to align the spins. As a result, there are (at least) two equilibrium states called pure phases; a great majority of spins point upwards in one of these pure phases, while the opposite situation can be seen in the other pure phase.

We now formulate the notion of phase transition more precisely using the Gibbs measure. We discuss the Ising model and the lattice φ^4-field at the same time. We start with the following proposition. The value m_±(β, h) introduced there can be understood as the average value of the spin with respect to the two "pure phases" alluded to above.

Proposition 1.1. There exist functions (β, h) ↦ m_±(β, h) and boundary conditions {η^+, η^−} ⊂ S^{Z^d} such that
\[ \lim_{\Lambda \nearrow \mathbb{Z}^d} \mu^{\Lambda,\eta}(\sigma_0) = \begin{cases} m_+(\beta, h) & \text{if } \eta \ge \eta^+, \\ m_-(\beta, h) & \text{if } \eta \le \eta^-. \end{cases} \tag{1.11} \]
For the Ising model, it is enough to take η_x^± ≡ ±1 (pure boundary conditions). For the lattice φ^4-field, it is shown in [5] that (1.11) holds for example if η_x^+ ≥ (ln(2 + |x|))^{(1+ε)/2} and η_x^− ≤ −(ln(2 + |x|))^{(1+ε)/2} for all x ∈ Z^d.

Definition 1.1. We say that the phase is unique when m_+(β, h) = m_−(β, h). We say that the phases coexist when m_+(β, h) ≠ m_−(β, h).
The phase transition is then described as follows:

Theorem 1.1.
• If h > 0, then m_+(β, h) = m_−(β, h) > 0.
• If h = 0, then there is β_c ∈ (0, ∞] (β_c = ∞ ⇔ d = 1) such that
\[ \beta < \beta_c \ \Rightarrow\ m_+(\beta, 0) = m_-(\beta, 0) = 0, \]
\[ \beta_c < \beta \ \Rightarrow\ m_+(\beta, 0) = -m_-(\beta, 0) > 0. \]
The number β_c is called the critical inverse temperature.

Remark 1.3. It is believed that m_+(β_c, 0) = m_−(β_c, 0) = 0. For the Ising model this is proven for d = 2 (see (1.12) below) and for d ≥ 4 [6, (1.9), (1.13)]. The critical inverse temperature and the magnetization for the two-dimensional Ising model are explicitly known:
\[ \sinh 2\beta_c = 1 \quad \text{and} \quad m_+(\beta, 0) = \left( 1 - (\sinh 2\beta)^{-4} \right)^{1/8} \ \text{for } \beta \ge \beta_c. \tag{1.12} \]
See [7] and the references therein.
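Formula (1.12) is easy to evaluate numerically. The short Python sketch below solves sinh 2β_c = 1 and tabulates m_+(β, 0) for the two-dimensional Ising model; the function name is illustrative, and the value 0 below β_c is taken from Theorem 1.1.

```python
from math import asinh, sinh

# sinh(2 * beta_c) = 1, i.e. beta_c = asinh(1) / 2.
BETA_C = 0.5 * asinh(1.0)

def magnetization(beta):
    """m_+(beta, 0) for d = 2: formula (1.12) above beta_c, zero otherwise."""
    if beta <= BETA_C:
        return 0.0
    # max(...) guards against a tiny negative value caused by rounding.
    return max(0.0, 1.0 - sinh(2.0 * beta) ** -4) ** 0.125

for beta in (0.3, BETA_C, 0.5, 1.0):
    print(f"beta = {beta:.4f}  m_+ = {magnetization(beta):.6f}")
```

The exponent 1/8 makes the magnetization rise very steeply just above β_c, which the table makes visible.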
Remark 1.4. For the boundary conditions {η^+, η^−} ⊂ S^{Z^d} in Proposition 1.1, not only the limit (1.11), but also infinite volume Gibbs measures µ^± called "pure phases" exist as the limits (in the weak topology): µ^+ = lim_{Λ↗Z^d} µ^{Λ,η} if η ≥ η^+, and µ^− = lim_{Λ↗Z^d} µ^{Λ,η} if η ≤ η^−. Then, the phase coexistence discussed in Theorem 1.1 can be equivalently stated as µ^+ ≠ µ^−. In fact, it might be more common to capture the notion of the phase transition in terms of the infinite volume Gibbs measures. However, we have chosen to formulate the phase transition referring only to finite volume Gibbs measures to make the exposition simpler.

2. Glauber Dynamics

We now formulate a notion of relaxation as the random time evolution called Glauber dynamics. There are at least two approaches to describing the time evolution.

Probabilistic definition: In this approach, time evolution is described as an S^Λ-valued stochastic process (σ_t^{Λ,η})_{t≥0} indexed by time t. Here, we consider a continuous time parameter t ∈ (0, ∞) rather than a discrete one t ∈ {0, 1, ...}. Each σ_t^{Λ,η} here is a random configuration observed at time t. As will be explained below, the time evolution is a continuous-time Markov chain with values in S^Λ. The relaxation to the equilibrium (= Gibbs measure) is then described as:
\[ \lim_{t \nearrow \infty} P(\sigma_t^{\Lambda,\eta} = \sigma) = \mu^{\Lambda,\eta}(\{\sigma\}), \quad \sigma \in S^{\Lambda}, \tag{2.1} \]
where P denotes a probability on a measurable space on which the process (σ_t^{Λ,η})_{t≥0} is defined.
Analytic definition: In this approach, we look at the averaged quantity over the stochastic process alluded to above, rather than the stochastic process in itself. More precisely, we look at
\[ (T_t^{\Lambda,\eta} f)(\sigma) = P[f(\sigma_t^{\Lambda,\eta}) \,|\, \sigma_0^{\Lambda,\eta} = \sigma] \tag{2.2} \]
for f : S^Λ → R, where the expectation on the right-hand side is the conditional expectation given the time-zero configuration σ. Thus, what we observe in this approach is more like "heat conduction" caused by the motion of the particles. As is explained below, it is possible to describe the quantity on the left-hand side of (2.2) in terms of the Hamiltonian, without referring to the stochastic process (σ_t^{Λ,η})_{t≥0}. In this approach, (2.1) is rephrased as:
\[ \lim_{t \to \infty} (T_t^{\Lambda,\eta} f)(\sigma) = \mu^{\Lambda,\eta}(f), \quad \sigma \in S^{\Lambda}. \tag{2.3} \]
In what follows, we will present both probabilistic and analytic definitions of the Glauber dynamics.

2.1. The case of the Ising model

Probabilistic definition of the Glauber dynamics: We need some notations. Let C_Λ be the set of all real functions on S^Λ. For f ∈ C_Λ,
\[ \nabla_x f(\sigma) = f(\sigma^x) - f(\sigma), \quad x \in \Lambda, \]
where
\[ \sigma^x_y = \begin{cases} -\sigma_y & \text{if } y = x, \\ \sigma_y & \text{if } y \ne x. \end{cases} \]
The Glauber dynamics evolves in time by a series of such spin flips σ ↦ σ^x. The flip rate c_x^{Λ,η}(σ) is defined by
\[ c_x^{\Lambda,\eta}(\sigma) = \exp\left( -\frac{1}{2} \nabla_x H^{\Lambda,\eta}(\sigma) \right). \tag{2.4} \]
The evolution of the dynamics (σ_t^{Λ,η})_{t≥0} which starts from a configuration σ_0^{Λ,η} = σ can be described as follows:

• The first spin flip occurs at a random time T_σ, where T_σ is an exponentially distributed random variable with the expectation given by the inverse of C^{Λ,η}(σ) = Σ_{x∈Λ} c_x^{Λ,η}(σ), that is,
\[ P(T_\sigma \in dt) = C^{\Lambda,\eta}(\sigma)\, e^{-t C^{\Lambda,\eta}(\sigma)}\, dt. \]
• A spin flip σ ↦ σ^x is implemented at time T_σ, where site x is chosen with the probability c_x^{Λ,η}(σ)/C^{Λ,η}(σ).
• After the first flip, continue the same procedure independently of the past, with σ^x as the new starting configuration.

The time evolution of (σ_t^{Λ,η})_{t≥0} described above is nothing but the continuous-time Markov chain with the flip rate c_x^{Λ,η}(σ). The time evolution (σ_t^{Λ,η})_{t≥0} and the Gibbs measure µ^{Λ,η} are then related as follows:
\[ \int \mu^{\Lambda,\eta}(d\sigma')\, P[\sigma_t^{\Lambda,\eta} = \sigma \,|\, \sigma_0^{\Lambda,\eta} = \sigma'] = \mu^{\Lambda,\eta}(\{\sigma\}). \tag{2.5} \]
This amounts to saying that the Gibbs measure is a stationary measure of the Markov chain (σ_t^{Λ,η})_{t≥0}. Since the Markov chain is irreducible in the present case, µ^{Λ,η} is in fact the unique stationary measure and the relaxation to the equilibrium (2.1) follows from a well-known convergence theorem for Markov chains (see e.g. [8, p. 65]).

Analytic definition of the Glauber dynamics: The generator of the Glauber dynamics is defined by
\[ A^{\Lambda,\eta} f(\sigma) = \sum_{x \in \Lambda} c_x^{\Lambda,\eta}(\sigma) \nabla_x f(\sigma), \quad f \in C_\Lambda. \tag{2.6} \]
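The jump description of the dynamics (an exponential holding time with rate C^{Λ,η}(σ), followed by a flip at a site x chosen with probability c_x^{Λ,η}(σ)/C^{Λ,η}(σ)) uses the same rates that enter the generator (2.6). A minimal simulation sketch in Python follows; it is restricted to d = 1 with Λ = {0, ..., n − 1}, so that the boundary condition reduces to two spins, and all names are hypothetical.

```python
import random
from math import exp

def hamiltonian(spins, left, right, beta, h):
    """(1.3) for Lambda = {0, ..., n-1} in Z^1 with boundary spins left and right."""
    bonds = sum((a - b) ** 2 for a, b in zip(spins, spins[1:]))
    edges = (spins[0] - left) ** 2 + (spins[-1] - right) ** 2
    return 0.5 * beta * (bonds + edges) + beta * h * sum(spins)

def flip(spins, x):
    """Return the configuration sigma^x obtained by flipping the spin at x."""
    return spins[:x] + (-spins[x],) + spins[x + 1:]

def flip_rates(spins, left, right, beta, h):
    """The rates c_x(sigma) = exp(-grad_x H(sigma) / 2) of (2.4), for every x."""
    energy = hamiltonian(spins, left, right, beta, h)
    return [exp(-0.5 * (hamiltonian(flip(spins, x), left, right, beta, h) - energy))
            for x in range(len(spins))]

def simulate(spins, t_max, left, right, beta, h, rng):
    """Run the continuous-time chain up to time t_max; return the final configuration."""
    t = 0.0
    while True:
        rates = flip_rates(spins, left, right, beta, h)
        total = sum(rates)
        t += rng.expovariate(total)          # holding time with mean 1 / C(sigma)
        if t > t_max:
            return spins
        x = rng.choices(range(len(spins)), weights=rates)[0]
        spins = flip(spins, x)               # restart from sigma^x

rng = random.Random(0)
final = simulate((-1, -1, -1, -1), 5.0, +1, +1, beta=1.0, h=0.0, rng=rng)
print(final)
```

Because the holding times are exponential, restarting the clock after each flip is exactly the "independently of the past" rule of the third step.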
Since A^{Λ,η} is a linear map on a finite dimensional vector space C_Λ, A^{Λ,η} may be regarded as a matrix. It is not difficult (cf. Sec. A.1) to see that
\[ \mathcal{E}^{\Lambda,\eta}(f, g) \stackrel{\text{def.}}{=} -\mu^{\Lambda,\eta}(f A^{\Lambda,\eta} g) = \frac{1}{2} \sum_{x \in \Lambda} \mu^{\Lambda,\eta}(c_x^{\Lambda,\eta} \nabla_x f \nabla_x g). \tag{2.7} \]
We therefore see the following:

Proposition 2.1. The operator A^{Λ,η} : L^2(µ^{Λ,η}) → L^2(µ^{Λ,η}) is symmetric and is negative semi-definite. Moreover, the eigenspace for eigenvalue zero consists only of constant functions.

The quadratic form defined by (2.7) is called the Dirichlet form of the Glauber dynamics. The semi-group generated by A^{Λ,η} is denoted by (T_t^{Λ,η})_{t≥0},
\[ T_t^{\Lambda,\eta} = \exp(t A^{\Lambda,\eta}), \quad t > 0. \tag{2.8} \]
This is, again, a special case of the exponential of a matrix: exp(A) = Σ_{p≥0} A^p/p!. For f ∈ C_Λ, u(σ, t) = T_t^{Λ,η} f(σ) can be characterized by the following initial value problem:
\[ \frac{\partial}{\partial t} u(\sigma, t) = A^{\Lambda,\eta} u(\sigma, t), \quad t > 0, \qquad u(\sigma, 0) = f(\sigma), \quad t = 0. \tag{2.9} \]
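Since A^{Λ,η} is just a matrix for the Ising model, the statements above can be checked numerically on a tiny example. The Python sketch below builds the generator (2.6) for d = 1 and Λ = {0, 1}, verifies the symmetry in L^2(µ^{Λ,η}), and evaluates the semi-group (2.8) through the power series. The restriction to d = 1, the scaling-and-squaring shortcut, and all names are assumptions made for this illustration.

```python
from itertools import product
from math import exp

def hamiltonian(spins, left, right, beta, h):
    """(1.3) for Lambda = {0, ..., n-1} in Z^1 with boundary spins left and right."""
    bonds = sum((a - b) ** 2 for a, b in zip(spins, spins[1:]))
    edges = (spins[0] - left) ** 2 + (spins[-1] - right) ** 2
    return 0.5 * beta * (bonds + edges) + beta * h * sum(spins)

def build_generator(n, left, right, beta, h):
    """Return the states, the matrix of A from (2.6), and the Gibbs weights."""
    states = list(product((-1, 1), repeat=n))
    index = {s: i for i, s in enumerate(states)}
    energy = [hamiltonian(s, left, right, beta, h) for s in states]
    A = [[0.0] * len(states) for _ in states]
    for i, s in enumerate(states):
        for x in range(n):
            j = index[s[:x] + (-s[x],) + s[x + 1:]]
            rate = exp(-0.5 * (energy[j] - energy[i]))   # c_x(sigma), see (2.4)
            A[i][j] += rate
            A[i][i] -= rate
    total = sum(exp(-e) for e in energy)
    mu = [exp(-e) / total for e in energy]
    return states, A, mu

def matmul(P, Q):
    """Product of two square matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def semigroup(A, t, terms=20, squarings=10):
    """exp(tA): power series for exp(tA / 2^k), followed by k squarings."""
    size = len(A)
    scale = t / 2 ** squarings
    identity = [[float(i == j) for j in range(size)] for i in range(size)]
    result, term = identity, identity
    for p in range(1, terms):
        term = [[scale * entry / p for entry in row] for row in matmul(term, A)]
        result = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

states, A, mu = build_generator(2, +1, +1, beta=0.5, h=0.0)
f = [s[0] for s in states]                        # f(sigma) = sigma_0
T = semigroup(A, 20.0)
Tf = [sum(T[i][j] * f[j] for j in range(len(f))) for i in range(len(f))]
mean_f = sum(m * v for m, v in zip(mu, f))
print(Tf, mean_f)
```

Squaring a short-time approximation keeps the series argument small; evaluating Σ (tA)^p/p! directly at t = 20 would involve large alternating terms.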
2.2. The case of the lattice φ^4-field

Probabilistic definition of the Glauber dynamics: For f ∈ C^2(R^Λ), we introduce the following notations:
\[ \nabla_x f(\sigma) = \frac{\partial}{\partial \sigma_x} f(\sigma), \qquad \Delta_x f(\sigma) = \left( \frac{\partial}{\partial \sigma_x} \right)^2 f(\sigma), \quad x \in \Lambda. \]
We then define
\[ C_\Lambda = \{ f \in C^\infty(\mathbb{R}^\Lambda) \,;\ \nabla_x f \ (x \in \Lambda) \text{ are bounded} \}. \]
For the lattice φ^4-field, the Glauber dynamics σ_t^{Λ,η} = (σ_{t,x}^{Λ,η})_{x∈Λ} ∈ R^Λ is introduced as the solution to the following stochastic differential equation:
\[ \sigma_{t,x}^{\Lambda,\eta} = \sigma_{0,x}^{\Lambda,\eta} + B_{t,x} - \frac{1}{2} \int_0^t \nabla_x \widetilde{H}^{\Lambda,\eta}(\sigma_s^{\Lambda,\eta})\, ds, \tag{2.10} \]
where
\[ \widetilde{H}^{\Lambda,\eta}(\sigma) = H^{\Lambda,\eta}(\sigma) + \sum_{x \in \Lambda} U(\sigma_x) \tag{2.11} \]
and B_t = (B_{t,x})_{x∈Λ} is a |Λ|-dimensional standard Brownian motion (see Sec. A.2). Knowledge of Brownian motion would be helpful for a firm grasp of the mathematics behind (2.10). However, almost no knowledge of Brownian motion is needed here to roughly capture the meaning of (2.10). In fact, (2.10) is just a random perturbation of the following deterministic integral equation:
\[ \sigma_{t,x}^{\Lambda,\eta} = \sigma_{0,x}^{\Lambda,\eta} - \frac{1}{2} \int_0^t \nabla_x \widetilde{H}^{\Lambda,\eta}(\sigma_s^{\Lambda,\eta})\, ds. \tag{2.12} \]
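One way to get a feeling for (2.10) and its deterministic counterpart (2.12) is to discretize them in time. The Python sketch below uses the Euler–Maruyama scheme, which is a standard discretization and not something prescribed by the text, again for d = 1 with Λ = {0, ..., n − 1} and U(s) = (s^2 − 1)^2. The step size and all names are illustrative assumptions.

```python
import random
from math import sqrt

def grad_h_tilde(sigma, left, right, beta, h):
    """Partial derivatives of H-tilde from (2.11) for Lambda = {0, ..., n-1} in Z^1."""
    n = len(sigma)
    grad = []
    for x, s in enumerate(sigma):
        lower = sigma[x - 1] if x > 0 else left       # neighbor or boundary value
        upper = sigma[x + 1] if x < n - 1 else right
        interaction = beta * ((s - lower) + (s - upper))
        # U'(s) = 4 s (s^2 - 1) for U(s) = (s^2 - 1)^2.
        grad.append(interaction + beta * h + 4.0 * s * (s * s - 1.0))
    return grad

def euler_maruyama(sigma, t_max, dt, left, right, beta, h, rng, noise=True):
    """Approximate (2.10); with noise=False this is an Euler scheme for (2.12)."""
    sigma = list(sigma)
    for _ in range(int(round(t_max / dt))):
        grad = grad_h_tilde(sigma, left, right, beta, h)
        for x in range(len(sigma)):
            increment = rng.gauss(0.0, sqrt(dt)) if noise else 0.0
            sigma[x] += increment - 0.5 * grad[x] * dt
    return sigma

rng = random.Random(1)
start = [0.5, 0.5, 0.5]
deterministic = euler_maruyama(start, 10.0, 0.01, 1.0, 1.0, 1.0, 0.0, rng, noise=False)
noisy = euler_maruyama(start, 10.0, 0.01, 1.0, 1.0, 1.0, 0.0, rng)
print(deterministic, noisy)
```

Without noise, and with h = 0 and boundary values +1, the trajectory slides into the configuration σ ≡ 1, which minimizes H-tilde in this example; with noise it keeps fluctuating, which is the random perturbation referred to in the text.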
Analytic definition of the Glauber dynamics: The generator of the Glauber dynamics is defined by
\[ A^{\Lambda,\eta} f(\sigma) = \frac{1}{2} \sum_{x \in \Lambda} \Delta_x f(\sigma) - \frac{1}{2} \sum_{x \in \Lambda} \nabla_x \widetilde{H}^{\Lambda,\eta}(\sigma) \nabla_x f(\sigma), \quad f \in C_\Lambda. \tag{2.13} \]
It is not difficult (cf. Sec. A.3) to see that
\[ \mathcal{E}^{\Lambda,\eta}(f, g) \stackrel{\text{def.}}{=} -\mu^{\Lambda,\eta}(f A^{\Lambda,\eta} g) = \frac{1}{2} \sum_{x \in \Lambda} \mu^{\Lambda,\eta}(\nabla_x f \nabla_x g). \tag{2.14} \]
As in the Ising model case, we see from (2.14) that a statement similar to Proposition 2.1 is true for A^{Λ,η} : L^2(µ^{Λ,η}) → L^2(µ^{Λ,η}) defined on C_Λ. However, unlike the Ising model case, the operator A^{Λ,η} here is not bounded on L^2(µ^{Λ,η}). For the associated semi-group to be defined rigorously, A^{Λ,η} therefore needs a property called self-adjointness. (Readers who do not care about the rigorous construction of the semi-group can skip this point.) For this reason, we will extend the operator A^{Λ,η} to a larger domain Dom(A^{Λ,η}) on which it is self-adjoint, that is, the following hold:

(i) f ∈ Dom(A^{Λ,η}) if and only if
\[ \sup\{ |\mu^{\Lambda,\eta}(f A^{\Lambda,\eta} g)| \,;\ g \in \mathrm{Dom}(A^{\Lambda,\eta}),\ \|g\|_{L^2(\mu^{\Lambda,\eta})} \le 1 \} < \infty, \]
(ii) µ^{Λ,η}(f A^{Λ,η} g) = µ^{Λ,η}(g A^{Λ,η} f) for f, g ∈ Dom(A^{Λ,η}).

We state this procedure as:

Proposition 2.2. Let Dom(A^{Λ,η}) be the set of f ∈ L^2(µ^{Λ,η}) for which there is a sequence {f_n}_{n≥1} ⊂ C_Λ such that
\[ \lim_{n \nearrow \infty} \|f - f_n\|_{L^2(\mu^{\Lambda,\eta})} = 0 \quad \text{and} \quad \lim_{m,n \nearrow \infty} \|A^{\Lambda,\eta}(f_m - f_n)\|_{L^2(\mu^{\Lambda,\eta})} = 0. \tag{2.15} \]
(a) For f ∈ Dom(A^{Λ,η}) and a sequence {f_n}_{n≥1} ⊂ C_Λ with the property (2.15), the following L^2(µ^{Λ,η})-limit
\[ A^{\Lambda,\eta} f \stackrel{\text{def.}}{=} \lim_{n \nearrow \infty} A^{\Lambda,\eta} f_n \tag{2.16} \]
is independent of the choice of the sequence {f_n}_{n≥1} and hence defines a linear operator on L^2(µ^{Λ,η}) with Dom(A^{Λ,η}) as its domain of definition.
(b) The operator A^{Λ,η} : L^2(µ^{Λ,η}) → L^2(µ^{Λ,η}) defined by (2.16) is self-adjoint and negative semi-definite on Dom(A^{Λ,η}). Moreover, the eigenspace for eigenvalue zero consists only of constant functions.

The quadratic form defined by (2.14) is called the Dirichlet form of the Glauber dynamics. The semi-group generated by A^{Λ,η} is denoted by (T_t^{Λ,η})_{t≥0},
\[ T_t^{\Lambda,\eta} = \exp(t A^{\Lambda,\eta}), \quad t > 0. \tag{2.17} \]
Thanks to the self-adjointness (and the negative semi-definiteness) of A^{Λ,η}, the semi-group can be constructed by the spectral decomposition [9, p. 235]. For f ∈ L^2(µ^{Λ,η}), u(σ, t) = T_t^{Λ,η} f(σ) can be characterized by the following initial value problem:
\[ \frac{\partial}{\partial t} u(\sigma, t) = A^{\Lambda,\eta} u(\sigma, t), \quad t > 0, \qquad u(\sigma, 0) = f(\sigma), \quad t = 0. \tag{2.18} \]
2.3. Relaxation time

Proposition 2.3. Fix Λ ⊂⊂ Z^d and a boundary condition η ∈ S^{Z^d}. Then, the constants γ ∈ (0, ∞) described in the following two statements are the same.
(a) The smallest γ ∈ (0, ∞) such that the following inequality (Poincaré inequality) holds for all f ∈ C_Λ:
\[ \mu^{\Lambda,\eta}(f; f) \le \gamma\, \mathcal{E}^{\Lambda,\eta}(f, f). \tag{2.19} \]
(b) The smallest γ ∈ (0, ∞) such that
\[ \|T_t^{\Lambda,\eta} f - \mu^{\Lambda,\eta} f\|_{L^2(\mu^{\Lambda,\eta})} \le \|f - \mu^{\Lambda,\eta} f\|_{L^2(\mu^{\Lambda,\eta})} \exp(-t/\gamma) \tag{2.20} \]
for all f ∈ C_Λ and t > 0.
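In the simplest possible case, a single Ising spin (Λ = {0} ⊂ Z^1), both characterizations in Proposition 2.3 can be computed by hand and compared numerically. The Python sketch below does so; with only two states every non-constant f gives the same ratio in (2.19), which is what makes this shortcut legitimate here but not for larger Λ. The Runge–Kutta integration of (2.9) and all names are assumptions of this illustration.

```python
from math import exp, sqrt

def two_state_chain(left, right, beta, h):
    """Gibbs weights and flip rates for a single Ising spin with two boundary spins."""
    energy = {s: 0.5 * beta * ((s - left) ** 2 + (s - right) ** 2) + beta * h * s
              for s in (-1, 1)}
    total = sum(exp(-e) for e in energy.values())
    mu = {s: exp(-energy[s]) / total for s in energy}
    rate = {s: exp(-0.5 * (energy[-s] - energy[s])) for s in energy}   # (2.4)
    return mu, rate

def poincare_constant(mu, rate):
    """Variance divided by Dirichlet form, i.e. the smallest gamma in (2.19)."""
    f = {-1: 0.0, 1: 1.0}
    mean = sum(mu[s] * f[s] for s in f)
    variance = sum(mu[s] * (f[s] - mean) ** 2 for s in f)
    dirichlet = 0.5 * sum(mu[s] * rate[s] * (f[-s] - f[s]) ** 2 for s in f)   # (2.7)
    return variance / dirichlet

def generator(u, rate):
    """(A u)(s) = c(s) (u(-s) - u(s)), the single-site case of (2.6)."""
    return {s: rate[s] * (u[-s] - u[s]) for s in u}

def evolve(f, rate, t, steps=2000):
    """Integrate du/dt = A u from (2.9) with the classical Runge-Kutta scheme."""
    dt = t / steps
    u = dict(f)
    for _ in range(steps):
        k1 = generator(u, rate)
        k2 = generator({s: u[s] + 0.5 * dt * k1[s] for s in u}, rate)
        k3 = generator({s: u[s] + 0.5 * dt * k2[s] for s in u}, rate)
        k4 = generator({s: u[s] + dt * k3[s] for s in u}, rate)
        u = {s: u[s] + dt * (k1[s] + 2 * k2[s] + 2 * k3[s] + k4[s]) / 6 for s in u}
    return u

def centered_norm(u, mu):
    """L^2(mu)-norm of u minus its mean."""
    mean = sum(mu[s] * u[s] for s in u)
    return sqrt(sum(mu[s] * (u[s] - mean) ** 2 for s in u))

mu, rate = two_state_chain(+1, +1, beta=0.5, h=0.0)
gamma = poincare_constant(mu, rate)
f = {-1: 0.0, 1: 1.0}
decayed = centered_norm(evolve(f, rate, 1.0), mu)
predicted = centered_norm(f, mu) * exp(-1.0 / gamma)
print(gamma, decayed, predicted)
```

Here the constant from (a) equals 1/(c(+1) + c(−1)), and the decay in (b) holds with equality, which is special to the two-state chain.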
The constant γ is called the relaxation time or the inverse spectral gap and is denoted by γ_SG(Λ, η). The proof is easy and is presented in Sec. A.4 for the interested readers. Since we have put the notion of "relaxation time" into a mathematical framework, we can now formulate the meaning of "fast/slow relaxation" as follows:
\[ \text{fast relaxation} \ \leftrightarrow\ \sup_{\ell \ge 1} \gamma_{SG}(\Lambda(\ell), \eta) < \infty, \tag{2.21} \]
\[ \text{slow relaxation} \ \leftrightarrow\ \lim_{\ell \nearrow \infty} \gamma_{SG}(\Lambda(\ell), \eta) = \infty, \tag{2.22} \]
where
\[ \Lambda(\ell) = \mathbb{Z}^d \cap (-\ell/2, \ell/2]^d. \tag{2.23} \]
The log-Sobolev inequality introduced in the following proposition plays an important role in the analysis of Glauber dynamics.

Proposition 2.4. Fix Λ ⊂⊂ Z^d and a boundary condition η ∈ S^{Z^d}. We let γ_LS(Λ, η) denote the smallest γ ∈ (0, ∞) such that the following inequality (log-Sobolev inequality) holds for all f ∈ C_Λ:
\[ \mu^{\Lambda,\eta}\left( f^2 \ln \frac{f^2}{\mu^{\Lambda,\eta}(f^2)} \right) \le 2\gamma\, \mathcal{E}^{\Lambda,\eta}(f, f). \tag{2.24} \]
Then,
\[ 0 < \gamma_{SG}(\Lambda, \eta) \le \gamma_{LS}(\Lambda, \eta) < \infty. \tag{2.25} \]
The log-Sobolev inequality was introduced by Gross [10] as a condition equivalent to the hypercontractivity of the associated semi-group. Since then, it has been applied to many aspects of probability theory. We refer the reader to [11, 12] or [13, Sec. 6.1] for expositions of the log-Sobolev inequality in more general settings. A proof of (2.25) can be found, for example, in [12, Theorem 2.5], [13, (6.1.17)].

It might be interesting to relate the log-Sobolev inequality with the relative entropy. For probability measures µ and ν on some measurable space, the relative entropy of ν with respect to µ is defined by:
\[ H(\nu|\mu) = \begin{cases} \nu(\log \rho) = \mu(\rho \log \rho), & \text{if } d\nu = \rho\, d\mu \text{ with } \rho \in L^1(\mu), \\ +\infty, & \text{otherwise}. \end{cases} \tag{2.26} \]
Note that the left-hand side of (2.24) equals µ^{Λ,η}(f^2) H(ν|µ^{Λ,η}), where
\[ d\nu = \frac{f^2}{\mu^{\Lambda,\eta}(f^2)}\, d\mu^{\Lambda,\eta}. \]
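The identity between the left-hand side of (2.24) and the relative entropy (2.26) can be checked on any finite probability space. The Python sketch below does this for a hypothetical three-point measure and a hypothetical function f; neither is taken from the text.

```python
from math import log

def log_sobolev_lhs(mu, f):
    """mu(f^2 ln(f^2 / mu(f^2))), the left-hand side of (2.24), on a finite space."""
    second_moment = sum(m * v * v for m, v in zip(mu, f))
    return sum(m * v * v * log(v * v / second_moment) for m, v in zip(mu, f))

def relative_entropy(nu, mu):
    """H(nu | mu) = nu(log rho) with rho = d nu / d mu; assumes mu has full support."""
    return sum(n * log(n / m) for n, m in zip(nu, mu) if n > 0)

mu = [0.5, 0.3, 0.2]        # hypothetical reference measure
f = [1.0, 2.0, 0.5]         # hypothetical function with no zeros
second_moment = sum(m * v * v for m, v in zip(mu, f))
nu = [m * v * v / second_moment for m, v in zip(mu, f)]   # d nu = f^2 / mu(f^2) d mu

lhs = log_sobolev_lhs(mu, f)
rhs = second_moment * relative_entropy(nu, mu)
print(lhs, rhs)
```

The two numbers agree because the density of ν with respect to µ is exactly f^2/µ(f^2), so ν(log ρ) is the left-hand side of (2.24) divided by µ(f^2).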
The relative entropy can be used to measure the deviation of ν with respect to µ. In fact, it is well known [13, (3.2.25)] that
\[ \frac{1}{2} |\nu(f) - \mu(f)|^2 \le H(\nu|\mu)\, \|f\|^2 \tag{2.27} \]
for any bounded measurable function $f$, where $\|f\|$ denotes the sup-norm. On the other hand, (2.24) implies that, for any probability measure $\nu$ on $S^\Lambda$,
$$H(\nu T_t^{\Lambda,\eta}\,|\,\mu^{\Lambda,\eta}) \le H(\nu|\mu^{\Lambda,\eta})\exp(-2t/\gamma), \quad t \ge 0, \tag{2.28}$$
where the probability measure $\nu T_t^{\Lambda,\eta}$ is defined by $(\nu T_t^{\Lambda,\eta})(f) = \nu(T_t^{\Lambda,\eta} f)$; see [13, (6.1.37)]. This shows that the log-Sobolev inequality (2.24) implies exponentially fast relaxation to equilibrium in the sense of entropy, while the Poincaré inequality (2.19) is equivalent to exponentially fast $L^2$-relaxation.

3. Relaxation in the Unique Phase Region

3.1. The equivalence of the decay of correlation and the fast relaxation

The following result originates in a celebrated work of Stroock and Zegarlinski [14], and is a typical example of the correspondence (0.1).

Theorem 3.1 ([14–17]). For the Ising model and lattice $\phi^4$-field, the following conditions are equivalent:

(DC) There exist constants $B_{3.1}, C_{3.1} \in (0,\infty)$ such that for all $\Lambda \subset\subset \mathbb{Z}^d$, $\eta \in S^{\mathbb{Z}^d}$ and $x, y \in \Lambda$,
$$|\mu^{\Lambda,\eta}(\sigma_x ; \sigma_y)| \le B_{3.1}\exp(-\|x-y\|_1/C_{3.1}). \tag{3.1}$$
(SG) $\sup\{\gamma_{SG}(\Lambda,\eta)\,;\ \Lambda \subset\subset \mathbb{Z}^d,\ \eta \in S^{\mathbb{Z}^d}\} < \infty$.

(LS) $\sup\{\gamma_{LS}(\Lambda,\eta)\,;\ \Lambda \subset\subset \mathbb{Z}^d,\ \eta \in S^{\mathbb{Z}^d}\} < \infty$.

The condition (DC), which we call decay of correlation, refers to a certain asymptotic independence of spins located far away from each other [55]. It is not difficult to see that (DC) implies the uniqueness of the phase. On the other hand, (SG) and (LS) are conditions which ensure fast enough relaxation of the Glauber dynamics (cf. Proposition 2.3, Proposition 2.4). Here are some rough explanations of why the equivalence in Theorem 3.1 is true.

(DC) ⇒ (LS): An important observation, which is true even without condition (DC), is that, for all $n \ge 1$, there exists $C(n) = C(n,\beta,d) \in (0,\infty)$ such that
$$\sup\{\gamma_{SG}(\Lambda,\eta)\,;\ |\Lambda| \le n,\ \eta \in S^{\mathbb{Z}^d}\} \le C(n). \tag{3.2}$$
On the other hand, (DC) implies that for large enough $\Lambda$, the measure $\mu^{\Lambda,\eta}$ is almost independent of the choice of $\Lambda$ and $\eta$, and hence $\sup_{\Lambda,\eta}\gamma_{SG}(\Lambda,\eta)$ is essentially the same as the left-hand side of (3.2) for finite $n$, provided $n$ is large enough.

(LS) ⇒ (SG): This follows from (2.25).

(SG) ⇒ (DC): We will use the following basic property of the Glauber dynamics, which is often referred to as the “finite speed propagation property”. For any $\varepsilon \in (0,\infty)$, there exists $C \in (0,\infty)$ such that
$$P(\sigma_{t,x}^{\Lambda,\eta} ; \sigma_{t,y}^{\Lambda,\eta}) \le C\exp(Ct - \varepsilon|x-y|) \tag{3.3}$$
for all $\Lambda \subset\subset \mathbb{Z}^d$, $\eta \in S^{\mathbb{Z}^d}$ and $x, y \in \mathbb{Z}^d$. This implies that, for fixed time $t$ and $|x-y|$ large enough, $\sigma_{t,x}^{\Lambda,\eta}$ and $\sigma_{t,y}^{\Lambda,\eta}$ are almost independent in the sense that their correlation decays exponentially in $|x-y|$. On the other hand, (SG) implies that the distribution of $\sigma_t^{\Lambda,\eta}$ with large enough finite $t$ is close to $\mu^{\Lambda,\eta}$ uniformly in $\Lambda$ and $\eta$; recall Proposition 2.3. Therefore, with (SG), the left-hand side of (3.3) can be replaced by $\mu^{\Lambda,\eta}(\sigma_x ; \sigma_y)$ if $t$ is a large enough finite number.

Remark 3.1. The relation between the decay of correlation and fast relaxation was already being studied in the 1970s in the series of works by Holley and Stroock [18, 19, 54]. Stroock and Zegarlinski [14, 20] proved the equivalence stated in Theorem 3.1 for a certain class of spin systems where the single spin space $S$ is either a finite set or a compact Riemannian manifold. This implies Theorem 3.1 for the case of the Ising model. We will refer to the papers [15, 16] in Remark 3.2 below. See also a work of Cesi [21] for an elegant proof of (LS) in this context. Theorem 3.1 for a class of unbounded spin systems, including the case of the lattice $\phi^4$-field, can be found in [17].

Remark 3.2. The conditions in Theorem 3.1 require uniformity over all $\Lambda \subset\subset \mathbb{Z}^d$. In some cases, it is more reasonable to restrict one’s attention to nicely-shaped $\Lambda$ (e.g. cubes or fat enough boxes) to avoid pathological phenomena caused by $\Lambda$’s whose shapes are too irregular. This idea was implemented in the series of works by Martinelli, Olivieri, Schonmann and Shlosman [15, 16, 22, 23], which led to improvements of Theorems 3.1 and 3.3, especially in the case of the two-dimensional Ising model (see Theorem 3.4).

A more practical role played by (LS) can be seen in the following result. Recall that (SG) is equivalent to $L^2$-convergence of the Glauber dynamics with uniform speed in $\Lambda$ and $\eta$. With (LS), one gets a much stronger result ($L^\infty$-convergence).

Theorem 3.2 ([14–16]).
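Consider the Ising model and suppose that
$$\gamma(\eta) \stackrel{\text{def.}}{=} \sup\{\gamma_{LS}(\Lambda,\eta)\,;\ \Lambda \subset\subset \mathbb{Z}^d\} < \infty \tag{3.4}$$
for some $\eta \in S^{\mathbb{Z}^d}$. Then, there exist constants $B_{3.5}, C_{3.5} \in (0,\infty)$ such that for all $\Lambda \subset\subset \mathbb{Z}^d$, $\eta \in S^{\mathbb{Z}^d}$,
$$\|T_t^{\Lambda,\eta} f - \mu^{\Lambda,\eta} f\| \le B_{3.5}\,|||f|||\,\exp(-t/C_{3.5}), \quad \text{for all } f \in C_\Lambda \text{ and } t > 0, \tag{3.5}$$
where $\|f\|$ is the sup-norm and $|||f||| = \sum_{x\in\Lambda}\|\nabla_x f\|$.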
Here is a rough explanation of how the log-Sobolev inequality is used to derive (3.5). Let $\Delta \subset \Lambda$ be the “support” of $f$: $\Delta = \{x \in \Lambda\,;\ \nabla_x f \not\equiv 0\}$. We then define
$$\Lambda(t) = \{x \in \Lambda\,;\ \mathrm{dist.}(x,\Delta) \le C + Ct\},$$
where the constant $C$ is large enough. By the finite propagation speed property (3.3), the coordinates $\sigma_{s,x}^{\Lambda,\eta}$, $x \in \Delta$, $s \le t$, are almost independent of what happens in $\Lambda\setminus\Lambda(t)$ up to time $t$. For this reason, the proof of (3.5) boils down to the estimation
of the left-hand side of (3.5) with $\Lambda$ replaced by $\Lambda(t)$. Now, by applying (2.27) and (2.28) to $\mu = \mu^{\Lambda(t),\eta}$ and $\nu = \delta_\sigma T_t^{\Lambda(t),\eta}$, we have
$$\frac{1}{2}\,|T_t^{\Lambda(t),\eta} f(\sigma) - \mu^{\Lambda(t),\eta} f|^2 \le \|f\|^2\, H(\delta_\sigma T_t^{\Lambda(t),\eta}\,|\,\mu^{\Lambda(t),\eta}) \le \|f\|^2\, H(\delta_\sigma\,|\,\mu^{\Lambda(t),\eta})\exp(-2t/\gamma(\eta)).$$
By the definition of the relative entropy, it is easy to see that
$$H(\delta_\sigma\,|\,\mu^{\Lambda(t),\eta}) \le \ln(1/\mu^{\Lambda(t),\eta}(\sigma)) \le C_1 t,$$
where $C_1 = C_1(d,\beta,h) \in (0,\infty)$. These estimates prove the exponential decay of the left-hand side of (3.5) with $\Lambda$ replaced by $\Lambda(t)$.

Remark 3.3. For the lattice $\phi^4$-field, a result similar to Theorem 3.2 is known [24, 25], where, however, the exponential decay holds not in the sup-norm but in a certain pointwise sense for each “tempered” initial configuration.

3.2. Sufficient conditions for (DC), (SG) or (LS)

The following result concerns sufficient conditions which ensure the validity of (DC), (SG) and (LS) in Theorem 3.1.

Theorem 3.3. Consider the Ising model and lattice $\phi^4$-field.
(a) (LS) holds if $d = 1$.
(b) For all $d \ge 1$, there is an inverse temperature $\beta_0 = \beta_0(d) \in (0,\infty)$ such that for $\beta \le \beta_0$, $\gamma_{LS}(\Lambda,\eta)$ is bounded in $\Lambda$, $\eta$ and $h$. In particular, (LS) holds for $\beta \le \beta_0$.
(c) For the Ising model, (LS) holds if $|h| > 2d$.

Remark 3.4.
• Theorem 3.3(a) was obtained by Zegarlinski [25, 26].
• A result of Zegarlinski [27, Theorem 4.3] implies Theorem 3.3(b) for the Ising model. Theorem 3.3(b), (c) for the Ising model can also be found in [28], where (DC) (rather than (LS)) is discussed. For the Ising model, it is known [29] that (DC) holds for $\beta < \beta_c/2$.
• As for Theorem 3.3(b) for the lattice $\phi^4$-field, Zegarlinski [27, Propositions 3.4, 3.5] also proved some results in this direction. However, the conditions imposed there on the potential function were not mild enough to cover the case of the lattice $\phi^4$-field. Theorem 3.3(b) for the lattice $\phi^4$-field was then obtained in [30] by the use of the “surgery construction” of Dobrushin and Shlosman [31] (to show a strong enough decay of correlation), together with the “martingale method” of Lu and Yau [32] (to conclude the spectral gap and the log-Sobolev inequality from the decay of correlation). Soon afterwards, Bodineau and Helffer [33, 56] used the Witten Laplacian to simplify a part of the proof (the decay of correlation and the spectral gap) and thereby generalized the assumptions on the interaction potentials needed to prove the log-Sobolev inequality; see also a paper by Gentil and Roberto [34] for
research on (SG) in this direction. More recently, Procacci and Scoppola [35] investigated (DC) for the lattice $\phi^4$-field by a different method based on the cluster expansion. An article by Ledoux [36] is now available on the subject of Theorem 3.3(b) for unbounded lattice spin systems.

3.3. Temperatures slightly above $1/\beta_c$, or small non-zero magnetic fields

The results in Sec. 3.1 apply if the temperature is sufficiently high, or if a large enough magnetic field is present (Theorem 3.3). To make the correspondence (0.1) more precise, one would like to study the model at a temperature only slightly above the critical one, or under a small magnetic field. As alluded to in Remark 3.2, research in this direction for the two-dimensional Ising model was carried out very successfully in a series of works by Martinelli, Olivieri, Schonmann and Shlosman [15, 22, 23]. Let us quote one of the final forms:

Theorem 3.4 ([23]). Consider the Ising model in $d = 2$. If $\beta < \beta_c$ or $h \ne 0$, then
$$\sup_{\ell \ge 1,\ \eta \in S^{\mathbb{Z}^d}} \gamma_{LS}(\Lambda(\ell),\eta) < \infty,$$
or equivalently,
$$\sup_{\ell \ge 1,\ \eta \in S^{\mathbb{Z}^d}} \gamma_{SG}(\Lambda(\ell),\eta) < \infty.$$
Remark 3.5. In contrast with the situation in two dimensions described in Theorem 3.4, it is conjectured in dimension $d \ge 3$ that the fast relaxation property (2.21) is violated by an appropriate choice of low temperatures, small but non-zero magnetic fields, and boundary conditions $\eta$ which are mixtures of $+1$ and $-1$. The part of the $(\beta,h)$-plane referred to above is often called the “Basuev region”. Though the conjecture mentioned above is not yet proven, there are some results on related models which suggest the existence of such a “dangerous” boundary condition $\eta \in S^{\mathbb{Z}^d}$ [37, 38, 39].

The conjecture in Remark 3.5 motivated an attempt to find a class of “safe” boundary conditions even when $(\beta,h)$ is near the phase transition line. To describe a sufficient condition for a boundary condition to be safe, we introduce the effective magnetic field:
$$h_x^{\Lambda,\eta} \stackrel{\text{def.}}{=} h + \sum_{\substack{y \notin \Lambda \\ \|x-y\|_1 = 1}} \eta_y, \quad x \in \Lambda. \tag{3.6}$$
Note that the Hamiltonian (1.3) can be written, up to a term independent of $\sigma$, as
$$\frac{1}{2}\sum_{\substack{\{x,y\} \subset \Lambda \\ \|x-y\|_1 = 1}} \beta(\sigma_x - \sigma_y)^2 + \sum_{x \in \Lambda} \beta h_x^{\Lambda,\eta}\sigma_x. \tag{3.7}$$
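The definition (3.6) is easy to evaluate by hand on small boxes, and the following sketch does the same bookkeeping in Python. It is only an illustration: the function name `effective_field`, the 4 × 4 box and the plus boundary condition are assumptions chosen for the example, not objects taken from the text.

```python
from itertools import product


def effective_field(box, eta, h):
    """Effective magnetic field of (3.6) on a finite box.

    box : iterable of lattice sites (tuples of ints), the set Lambda
    eta : function giving the boundary spin eta_y at a site y outside the box
    h   : external magnetic field
    Returns a dict mapping each x in the box to
    h + (sum of eta_y over y outside the box with ||x - y||_1 = 1).
    """
    box = set(box)
    dim = len(next(iter(box)))
    field = {}
    for x in box:
        total = h
        for i in range(dim):
            for step in (-1, 1):
                y = x[:i] + (x[i] + step,) + x[i + 1:]
                if y not in box:  # only neighbours outside Lambda contribute
                    total += eta(y)
        field[x] = total
    return field


# Hypothetical example: a 4 x 4 box in Z^2 with the plus boundary condition.
box = list(product(range(4), repeat=2))
field = effective_field(box, eta=lambda y: 1.0, h=0.1)
```

In this example the field equals h at interior sites, h + 1 at non-corner sites of the inner boundary, and h + 2 at the four corners, which matches the remark that the effective field differs from h only on the inner boundary.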
Note also that $h_x^{\Lambda,\eta} = h$ for $x \in \Lambda\setminus\partial_{in}\Lambda$; recall (1.1). The effective magnetic field is thus a “magnetic field” with the effect of the boundary condition taken into account. A sufficient condition for a safe boundary condition is then, roughly speaking, that the signs of $\{h_x^{\Lambda,\eta}\}_{x\in\partial_{in}\Lambda}$ are kept identical to that of the external magnetic field $h$. More precisely, we have:

Theorem 3.5 ([15, 40, 41]). Consider the Ising model, and allow the boundary condition $\eta$ to take values in the extended configuration space $\mathbb{R}^{\mathbb{Z}^d}$.
(a) If $\beta < \beta_c$ and $h \ge 0$, then there exist constants $B_{3.5}, C_{3.5} \in (0,\infty)$ such that (3.5) holds whenever $\Lambda \subset\subset \mathbb{Z}^d$ and $\eta \in \mathbb{R}^{\mathbb{Z}^d}$ satisfy $\min_{x\in\partial_{in}\Lambda} h_x^{\Lambda,\eta} \ge 0$, cf. (3.6).
(b) There exists $\beta_0 = \beta_0(d) \in (0,\infty)$ with the following property. If $\beta \ge \beta_0$ and $h > 0$, then for any $h_0 > 0$, there exist constants $B_{3.5}, C_{3.5} \in (0,\infty)$ such that (3.5) holds whenever $\Lambda$ is a cube and $\eta \in \mathbb{R}^{\mathbb{Z}^d}$ satisfies $\min_{x\in\partial_{in}\Lambda} h_x^{\Lambda,\eta} \ge h_0$, cf. (3.6).

Remark 3.6. Theorem 3.5 originates in a result of Martinelli and Olivieri [15, Theorem 5.1], where the infinite volume dynamics are discussed. The present form of Theorem 3.5, which is stronger than the above mentioned result (because of the uniformity over $\Lambda$), is due to Schonmann and Yoshida [40, 41].

4. Relaxation in the Phase-Coexistence Region

We now turn our attention to the correspondence (0.2), namely slow relaxation in the phase-coexistence region. In this section, we investigate the relaxation time $\gamma_{SG}(\Lambda(\ell),\eta)$ for the Ising model with $d \ge 2$, $\beta > \beta_c$ and $h = 0$.

4.1. Some general observations

• Heuristics: Here are heuristics to predict how long the relaxation time $\gamma_{SG}(\Lambda(\ell),\eta)$ is. Although the argument is quite rough, it leads to “correct” answers in many cases. The relaxation time is supposed to be proportional to the expected amount of time needed to cross the energy barrier (if the barrier is present). Let us assume for simplicity that the boundary condition is non-negative.
Then, the time needed to cross the energy barrier is roughly the time needed to make the transition from $\sigma \equiv -1$ to $\sigma \equiv +1$. There are many different ways to make this transition. However, the main contribution comes from the most “efficient” way to do it, namely the one that minimizes the increment of the energy along the transition. With this in mind, let us now approximate the original dynamics by a birth-death chain which moves back and forth along a fixed series of configurations $\sigma^{(0)}, \sigma^{(1)}, \ldots, \sigma^{(N)}$ with $\sigma^{(0)} \equiv -1$ and $\sigma^{(N)} \equiv +1$. Then, the computation of the expected hitting time of $\sigma^{(N)}$ under this approximation tells us that the relaxation time is the exponential of the maximum increment of the energy along the transition, up to some polynomial correction.
Another important observation is that the energy barrier along the transition from $\sigma \equiv -1$ to $\sigma \equiv +1$ is the creation of a layer of plus spins which separates a pair of opposite faces of $\Lambda(\ell)$ (and hence contains $\ell^{d-1}$ spins). In fact, once such a layer is created, the rest of the transition can be made without increasing the energy.

• A general upper bound of the relaxation time: It is known that for all $d \ge 2$, $\beta > 0$ and $h \ge 0$,
$$\gamma_{SG}(\Lambda(\{\ell_i\}),\eta) \le \exp(\beta C \ell_2 \cdots \ell_d), \tag{4.1}$$
where $\Lambda(\{\ell_i\}) = \mathbb{Z}^d \cap \prod_{i=1}^d (-\ell_i/2, \ell_i/2]$ with $\ell_1 \ge \ell_2 \ge \cdots \ge \ell_d$ and the constant $C$ depends only on $d$; see [42, Theorem 5], [2, Theorem 6]. Note that the length $\ell_1$ of the longest side of the rectangle $\Lambda(\{\ell_i\})$ does not appear on the right-hand side of (4.1). In line with the heuristics explained above, the right-hand side of (4.1) refers to the cost of filling a layer, say the bottom one $\{-\lfloor \ell_1/2 \rfloor\} \times \prod_{i=2}^d (-\ell_i/2, \ell_i/2]$, with plus spins. We have in particular
$$\gamma_{SG}(\Lambda(\ell),\eta) \le \exp(\beta C \ell^{d-1}). \tag{4.2}$$
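The birth-death heuristic of Sec. 4.1 can be checked numerically on a toy energy profile. The sketch below is not the Glauber dynamics itself: it assumes a discrete-time Metropolis chain on the path $0, 1, \ldots, N$, with energies measured in units of the temperature, and the function names and the tent-shaped profile are hypothetical choices made for the illustration. The expected time to go from state $i$ to $i+1$ is obtained from the one-step recursion $\tau_i = 1/p_i + (q_i/p_i)\,\tau_{i-1}$, where $p_i$ and $q_i$ are the probabilities of moving up and down.

```python
import math


def expected_crossing_time(energies):
    """Expected number of steps for a Metropolis birth-death chain on
    {0, ..., N} to travel from state 0 to state N.

    From state i the chain proposes i - 1 or i + 1 with probability 1/2 each
    and accepts with probability min(1, exp(-(energy increase))).
    A proposal to move below 0 is rejected.
    """
    n = len(energies) - 1

    def up(i):
        return 0.5 * min(1.0, math.exp(-(energies[i + 1] - energies[i])))

    def down(i):
        return 0.5 * min(1.0, math.exp(-(energies[i - 1] - energies[i])))

    total = 0.0
    tau = 0.0  # expected time to go from i - 1 to i (unused when i = 0)
    for i in range(n):
        q = down(i) if i > 0 else 0.0
        # One-step analysis: tau_i = 1 / p_i + (q_i / p_i) * tau_{i-1}.
        tau = (1.0 + q * tau) / up(i)
        total += tau
    return total


def max_energy_increment(energies):
    """Largest value of energies[j] - energies[i] over i <= j."""
    best, lowest = 0.0, energies[0]
    for e in energies:
        lowest = min(lowest, e)
        best = max(best, e - lowest)
    return best


# Hypothetical tent-shaped profile: climb a barrier of height 8, then descend.
profile = [float(k) for k in range(9)] + [float(k) for k in range(7, -1, -1)]
time_to_cross = expected_crossing_time(profile)
barrier = max_energy_increment(profile)
```

For such a chain one can verify by hand that the crossing time lies between $2e^{B}$ and $N(N+1)e^{B}$, where $B$ is the maximum energy increment, so its logarithm equals $B$ up to a correction of order $\ln N$. This is the sense in which the relaxation time is the exponential of the barrier up to a polynomial factor.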
4.2. The free boundary condition

We first consider the free boundary condition. The following result says that the “slowest relaxation time” (the right-hand side of (4.2)) is indeed realized here:

Theorem 4.1 ([43, 44]). Consider the Ising model with $h = 0$. For $d = 2$, $\beta > \beta_c$, and $\eta \equiv 0$,
$$\lim_{\ell\to\infty} \frac{1}{\beta\ell}\ln\gamma_{SG}(\Lambda(\ell),\eta) = 2 - \frac{1}{\beta}\ln(\tanh\beta). \tag{4.3}$$
For $d \ge 2$, sufficiently large $\beta$ and $\eta \equiv 0$, there exist $B_i = B_i(\beta,d) > 0$, $C_i = C_i(\beta,d) > 0$ ($i = 1, 2$) such that
$$B_1\exp(C_1\ell^{d-1}) \le \gamma_{SG}(\Lambda(\ell),\eta) \le B_2\exp(C_2\ell^{d-1}), \quad \ell = 1, 2, \ldots. \tag{4.4}$$
Theorem 4.1 is quite reasonable from the viewpoint of the heuristics explained in Sec. 4.1. In fact, the exponent $\ell^{d-1}$ can be explained by the cardinality of the layer of plus spins alluded to there. Moreover, for $d = 2$, the right-hand side of (4.3) is the surface tension in a coordinate direction [45, p. 264, (81)]. In the proofs of (4.3) and (4.4), the lower bound of the relaxation time is based on the following immediate consequence of the definition of $\gamma_{SG}(\Lambda,\eta)$, cf. (2.19):
$$\gamma_{SG}(\Lambda,\eta)^{-1} \le \mathcal{E}^{\Lambda,\eta}(f,f)/\mu^{\Lambda,\eta}(f;f), \tag{4.5}$$
for any $f \in C_\Lambda$. To obtain the lower bound in (4.4), for example, one chooses $f$ to be the indicator function of an event in which a large closed surface separating $+1$ and $-1$ (a so-called “contour”) is present. One then uses Peierls’ argument to show that the right-hand side of (4.5) is exponentially small in $\ell^{d-1}$.
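The variational bound (4.5) can be tried out by brute force on a very small box. The following sketch rests on several assumptions that are not confirmed by this part of the text: the Hamiltonian is taken to be $-\beta\sum\sigma_x\sigma_y$ over nearest-neighbour pairs (free boundary condition, $h = 0$), the flip rates are $c_x(\sigma) = \exp(-\frac{1}{2}(H(\sigma^x) - H(\sigma)))$ as in Appendix A.1, and the Dirichlet form is $\frac{1}{2}\sum_x \mu(c_x(\nabla_x f)^2)$. The test function, the indicator that the total magnetization is positive, and the function name are hypothetical choices for the illustration.

```python
import math
from itertools import product


def relaxation_time_lower_bound(side, beta):
    """Var(f) / E(f, f) for f = indicator{sum of spins > 0} on a side x side
    box of the 2D Ising model with free boundary condition and h = 0.
    By (4.5) this quotient is a lower bound for the relaxation time.
    """
    sites = list(product(range(side), repeat=2))
    index = {x: i for i, x in enumerate(sites)}
    bonds = [(index[(a, b)], index[(a + 1, b)])
             for a in range(side - 1) for b in range(side)]
    bonds += [(index[(a, b)], index[(a, b + 1)])
              for a in range(side) for b in range(side - 1)]

    def energy(sigma):
        return -beta * sum(sigma[i] * sigma[j] for i, j in bonds)

    def f(sigma):
        return 1.0 if sum(sigma) > 0 else 0.0

    z = mean = dirichlet = 0.0
    for sigma in product((-1, 1), repeat=len(sites)):
        e = energy(sigma)
        weight = math.exp(-e)  # unnormalized Gibbs weight
        z += weight
        mean += weight * f(sigma)
        for i in range(len(sites)):
            flipped = sigma[:i] + (-sigma[i],) + sigma[i + 1:]
            rate = math.exp(-0.5 * (energy(flipped) - e))
            dirichlet += 0.5 * weight * rate * (f(flipped) - f(sigma)) ** 2
    mean /= z
    dirichlet /= z
    variance = mean * (1.0 - mean)  # f takes values in {0, 1}
    return variance / dirichlet


low_temperature = relaxation_time_lower_bound(3, beta=1.0)
high_temperature = relaxation_time_lower_bound(3, beta=0.2)
```

On a 3 × 3 box the enumeration runs over 512 configurations. The quotient is markedly larger at the lower temperature, in line with the Peierls-type picture: changing the sign of the magnetization requires passing through configurations containing a contour, and these carry little weight when $\beta$ is large.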
Remark 4.1. The asymptotics (4.3) was obtained by Cesi, Guadagni, Martinelli and Schonmann [43], while the bound (4.4) is due to Thomas [44].

We now consider extensions of Theorem 4.1 to more general boundary conditions $\eta$. We say a set $A \subset \mathbb{Z}^d$ is (∗)-connected if for any $\{x,y\} \subset A$, there is $\{x_0, x_1, \ldots, x_n\} \subset A$ such that $x_0 = x$, $x_n = y$ and $\|x_i - x_{i-1}\|_\infty = 1$, $i = 1, \ldots, n$, where $\|y\|_\infty = \max_{1\le i\le d}|y_i|$ for $y = (y_i)_{i=1}^d \in \mathbb{Z}^d$.

Theorem 4.2 ([46–48]). Consider the Ising model with $h = 0$.
(a) For $d = 2$, suppose that $0 < \delta < 1$ and that the boundary condition $\eta_y \in [-1,+1]$, $y \in \partial_{ex}\Lambda(\ell)$, satisfies
$$\sum_{y\in I}\eta_y \le \delta|I| \quad \text{for every (∗)-connected } I \subset \partial_{ex}\Lambda(\ell) \text{ with } |I| = \ell. \tag{4.6}$$
Then, there exist $\beta_0 = \beta_0(\delta) > 0$, $B_i = B_i(\beta,\delta) > 0$, $C_i = C_i(\beta,\delta) > 0$ ($i = 1, 2$) such that (4.4) holds for $\beta \ge \beta_0$ and $\ell = 1, 2, \ldots$.
(b) For $d \ge 3$, suppose that $\eta \in \{0,1\}^{\mathbb{Z}^d}$ and that
$$\lim_{\ell\nearrow\infty} \ell^{-(d-1)}\sum_{y\in\partial_{ex}\Lambda(\ell)}\eta_y < \delta_d, \tag{4.7}$$
9 27 , δ5 = 12 where δ3 = 43 , δ4 = 16 32 , δd ≤ (16d) for d ≥ 6. Then, there exist β0 = β0 (d) > 0, Bi = Bi (β, d) > 0, Ci = Ci (β, d) > 0 (i = 1, 2) such that (4.4) holds for β ≥ β0 and ` = 1, 2, . . . .
A typical boundary condition for which (4.6) holds is a chessboard-like one, i.e., a mixture of alternating $+1$ and $-1$. The condition (4.6) also allows an “almost plus” boundary condition, e.g. $+1$ for 99% of the boundary with 1% zero on each side; see [49] for an even stronger result in this direction. The condition (4.6) can be explained by the heuristics in Sec. 4.1. Under the condition, the creation of a layer of plus spins with length $\ell$ necessarily increases the energy by at least $(1-\delta)\ell$. The condition (4.6) is optimal in some sense, as can be seen from the following example. For $\delta > 0$, consider a boundary condition $\eta$ defined by
$$\eta_x = \begin{cases} +1 & \text{if } x_1 = \frac{\ell}{2} + 1 \text{ and } -\frac{\delta\ell}{2} < x_2 \le \frac{\delta\ell}{2}, \\ 0 & \text{otherwise.} \end{cases} \tag{4.8}$$
By Theorem 4.2(a), one sees that (4.4) is true for all $\delta < 1$. On the other hand, it follows from [50, Corollary 4.1] that (4.4) is no longer valid for $\delta = 1$.

The proof of Theorem 4.2 is again based on (4.5) and Peierls’ argument, taking the boundary condition into account. For $d \ge 3$, Peierls’ argument with boundary condition is more difficult to implement, due to the geometrical complexity. In fact, Theorem 4.2(b) is the only result in this direction for $d \ge 3$. For example, (4.4) for
a chessboard-like boundary condition, which appears to be clearly true, is not yet proven as far as we know.

Remark 4.2. Theorem 4.2(a) was obtained in a weaker form by Higuchi and Yoshida in [46] and then in the present form by Alexander and Yoshida [47]. Theorem 4.2(b) is due to Sugimine [48].

4.3. The plus boundary condition

In contrast to Theorems 4.1 and 4.2, the relaxation time with pure (+) boundary condition is shorter than the exponential of $\ell^{d-1}$. This can again be explained from the viewpoint of the heuristics explained in Sec. 4.1. For the (+) boundary condition, a plus layer along the boundary can be created without increasing the energy, i.e. there is no energy barrier to cross. This suggests that the relaxation time should be at most polynomial in $\ell$. In fact, a recent work of Bodineau and Martinelli [51], where the cube $\Lambda(\ell)$ is replaced by the Wulff shape with linear size $\ell$, suggests the following: for d = 2, h = 0 and β > βc ,

γSG (Λ(ℓ), η ≡ +1)   ℓ ,   ℓ ↗ ∞,

and for d ≥ 2, h = 0 and β > βc ,

γLS (Λ(ℓ), η ≡ +1)   ℓ² ,   ℓ ↗ ∞.

So far, existing rigorous upper bounds of the relaxation time have been obtained only in the following weaker form:
$$\gamma_{SG}(\Lambda(\ell),\eta\equiv+1) \le \exp(C(\beta)\ell^{d-2}\varphi(\ell)), \quad \ell = 1, 2, \ldots, \tag{4.9}$$
for some $C(\beta) > 0$, $\lim_{\ell\nearrow\infty}\varphi(\ell) = \infty$ and $\lim_{\ell\nearrow\infty}\varphi(\ell)/\ell = 0$.
Theorem 4.3 ([2, 50, 52, 53]). Consider the Ising model with $h = 0$.
(a) For $d = 2$ and $\beta > \beta_c$, (4.9) holds with $\varphi(\ell) = (\ell\ln\ell)^{1/2}$.
(b) For $d \ge 3$ and sufficiently large $\beta$, (4.9) holds with $\varphi(\ell) = (\ln\ell)^2$.

Roughly speaking, the right-hand side of (4.9) is a bound for the relaxation time for a slice (say $S(\ell)$) of $\Lambda(\ell)$ with height $\varphi(\ell)$ from the bottom:
$$S(\ell) = \{x \in \Lambda(\ell)\,;\ x_d \le -\ell/2 + \varphi(\ell)\},$$
recall (4.1). In fact, in line with the heuristics in Sec. 4.1, the relaxation time $\gamma_{SG}(\Lambda(\ell),\eta\equiv+1)$ should be controlled by the time needed to fill $S(\ell)$ with (+) spins. Technically, to be able to control the thermal fluctuation, the slice $S(\ell)$ should not be too thin, i.e. $\varphi(\ell) \nearrow \infty$ should grow fast enough.

Remark 4.3. The bound (4.9) for $d = 2$ was first proven by Martinelli for large enough $\beta$ [50], where $\varphi(\ell) = \ell^{\frac{1}{2}+\varepsilon}$, and then for $\beta > \beta_c$ [2]. The present form of Theorem 4.3(a) was obtained by Higuchi and Wang [52]. Theorem 4.3(b) is due to Sugimine [53].
Acknowledgments

This article originates in a course given by the author at the summer school “Mathematical Physics 2001” in Tokyo. The author would like to thank the organizers Huzihiro Araki and Hiroshi Ezawa for the opportunity to give the course and to write these notes. The author would also like to thank Thierry Bodineau, Fabio Martinelli, Nobuaki Sugimine, Boguslaw Zegarlinski and the anonymous referees for their useful comments.

A. Appendix

A.1. Proof of Eq. (2.7)

We will show that
$$-\mu^{\Lambda,\eta}(f\,c_x^{\Lambda,\eta}\,\nabla_x g) = \frac{1}{2}\,\mu^{\Lambda,\eta}(c_x^{\Lambda,\eta}\,\nabla_x f\,\nabla_x g), \tag{A.1}$$
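which, by summation over $x \in \Lambda$, proves (2.7). It is clear that
$$\sum_{\sigma\in S^\Lambda} f(\sigma^x) = \sum_{\sigma\in S^\Lambda} f(\sigma), \quad f \in C_\Lambda. \tag{A.2}$$
On the other hand,
$$\mu^{\Lambda,\eta}(c_x^{\Lambda,\eta}(\sigma)f(\sigma^x)) = \frac{1}{Z^{\Lambda,\eta}}\sum_{\sigma\in S^\Lambda}\exp(-H^{\Lambda,\eta}(\sigma))\,c_x^{\Lambda,\eta}(\sigma)f(\sigma^x), \tag{A.3}$$
$$\mu^{\Lambda,\eta}(c_x^{\Lambda,\eta}(\sigma)f(\sigma)) = \frac{1}{Z^{\Lambda,\eta}}\sum_{\sigma\in S^\Lambda}\exp(-H^{\Lambda,\eta}(\sigma))\,c_x^{\Lambda,\eta}(\sigma)f(\sigma). \tag{A.4}$$
Note now that
$$\exp(-H^{\Lambda,\eta}(\sigma))\,c_x^{\Lambda,\eta}(\sigma) = \exp\Big(-\frac{1}{2}H^{\Lambda,\eta}(\sigma^x) - \frac{1}{2}H^{\Lambda,\eta}(\sigma)\Big)$$
is invariant under the spin flip $\sigma \mapsto \sigma^x$. Therefore we see from (A.2), (A.3) and (A.4) that
$$\mu^{\Lambda,\eta}(c_x^{\Lambda,\eta}(\sigma)f(\sigma^x)) = \mu^{\Lambda,\eta}(c_x^{\Lambda,\eta}(\sigma)f(\sigma)). \tag{A.5}$$
It is now easy to conclude (A.1) from (A.5):
$$-\mu^{\Lambda,\eta}(f\,c_x^{\Lambda,\eta}\,\nabla_x g) = -\mu^{\Lambda,\eta}\big(c_x^{\Lambda,\eta}(\sigma)f(\sigma)(g(\sigma^x) - g(\sigma))\big) \tag{A.6}$$
$$= -\mu^{\Lambda,\eta}\big(c_x^{\Lambda,\eta}(\sigma)f(\sigma^x)(g(\sigma) - g(\sigma^x))\big) \tag{A.7}$$
$$= \frac{1}{2}\big((A.6) + (A.7)\big) = \frac{1}{2}\,\mu^{\Lambda,\eta}(c_x^{\Lambda,\eta}\,\nabla_x f\,\nabla_x g).$$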
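The identity (A.1) can be confirmed numerically by exhaustive summation on a small system. The sketch below assumes a one-dimensional Ising chain with free boundary condition, Hamiltonian $-\beta\sum_i\sigma_i\sigma_{i+1}$, flip rates $c_x(\sigma) = \exp(-\frac{1}{2}(H(\sigma^x) - H(\sigma)))$ as in the display preceding (A.5), and $\nabla_x g(\sigma) = g(\sigma^x) - g(\sigma)$. The chain length, the value of $\beta$ and the random test functions are hypothetical choices for the illustration.

```python
import math
import random
from itertools import product


def check_a1(n_sites=4, beta=0.7, seed=0):
    """Largest discrepancy between the two sides of (A.1), over all sites x
    of a 1D Ising chain with free boundary condition."""
    rng = random.Random(seed)
    configs = list(product((-1, 1), repeat=n_sites))

    def energy(s):
        return -beta * sum(s[i] * s[i + 1] for i in range(n_sites - 1))

    def flip(s, x):
        return s[:x] + (-s[x],) + s[x + 1:]

    # Arbitrary test functions f and g, stored as tables over configurations.
    f = {s: rng.uniform(-1.0, 1.0) for s in configs}
    g = {s: rng.uniform(-1.0, 1.0) for s in configs}
    z = sum(math.exp(-energy(s)) for s in configs)

    worst = 0.0
    for x in range(n_sites):
        lhs = rhs = 0.0
        for s in configs:
            sx = flip(s, x)
            mu = math.exp(-energy(s)) / z  # Gibbs weight of s
            c = math.exp(-0.5 * (energy(sx) - energy(s)))  # flip rate at x
            lhs -= mu * f[s] * c * (g[sx] - g[s])
            rhs += 0.5 * mu * c * (f[sx] - f[s]) * (g[sx] - g[s])
        worst = max(worst, abs(lhs - rhs))
    return worst


discrepancy = check_a1()
```

The two sides agree up to floating-point rounding for every site, as the symmetrization argument (A.6), (A.7) predicts. The check only uses the flip invariance of $\exp(-H(\sigma))c_x(\sigma)$, so any other rate with that property would pass as well.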
A.2. Brownian motion and Eq. (2.10)

A family $\{B_t\}_{t\ge0}$ of $\mathbb{R}^d$-valued random variables on a probability space $(\Omega, \mathcal{F}, P)$ is called a d-dimensional Brownian motion if the following properties are satisfied:
(B0) $P(B_0 = 0) = 1$.
(B1) There is an $\Omega_0 \in \mathcal{F}$ such that $P(\Omega_0) = 1$ and $t \mapsto B_t(\omega)$ is continuous for all $\omega \in \Omega_0$.
(B2) $\{B_{t_j} - B_{t_{j-1}}\}_{j=1}^n$ are independent if $n \ge 2$ and $0 = t_0 < t_1 < \cdots < t_n$.
(B3) For any $0 \le s < t$, $B_t - B_s$ is a mean-zero Gaussian random variable with the covariance matrix $(t-s)(\delta_{ij})_{i,j=1}^d$:
$$P(B_t - B_s \in A) = \int_A p_{t-s}(x)\,dx, \quad \text{for every Borel set } A \subset \mathbb{R}^d, \tag{A.8}$$
where $p_t(x) = (2\pi t)^{-d/2}\exp\big(-\frac{|x|^2}{2t}\big)$.

Let us now think of the “time derivative” $\dot B_t = (\dot B_{t,i})_{i=1}^d$ of the Brownian motion, which however does not exist in the classical sense (it exists only in the distributional sense). Then, the intuition behind (B2) and (B3) above is that $\{\dot B_{t,i}\,;\ 1 \le i \le d,\ t \ge 0\}$ is an independent (both in $i$ and $t$) Gaussian random field. With this non-rigorous formulation of the Brownian motion, the stochastic differential equation (2.10) may be interpreted in the following intuitive form:
$$\frac{d\sigma_{t,x}^{\Lambda,\eta}}{dt} = \dot B_{t,x} - \frac{1}{2}\nabla_x\widetilde H^{\Lambda,\eta}(\sigma_t^{\Lambda,\eta}).$$
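The intuitive form of (2.10) suggests a simple time discretization: over a short time step $dt$, the white noise $\dot B_{t,x}$ contributes an independent Gaussian increment of variance $dt$, and the drift contributes $-\frac{1}{2}\nabla_x\widetilde H\,dt$. The following Euler–Maruyama sketch illustrates this. The energy used here, a double-well potential at each site of a chain plus a quadratic nearest-neighbour coupling, is a hypothetical stand-in and is not the Hamiltonian $\widetilde H^{\Lambda,\eta}$ defined in the text. The function names, step size and coupling constant are likewise assumptions.

```python
import math
import random


def grad_energy(sigma, x, coupling=0.1):
    """Partial derivative in sigma_x of the hypothetical energy
        sum_x (sigma_x**4 / 4 - sigma_x**2 / 2)
        + (coupling / 2) * sum_x (sigma_x - sigma_{x+1})**2
    on a chain with free ends."""
    g = sigma[x] ** 3 - sigma[x]
    if x > 0:
        g += coupling * (sigma[x] - sigma[x - 1])
    if x < len(sigma) - 1:
        g += coupling * (sigma[x] - sigma[x + 1])
    return g


def euler_maruyama(sigma0, t_end, dt, seed=0):
    """Approximate the solution at time t_end of
        d sigma_x = dB_x - (1/2) * grad_x(energy)(sigma) dt
    started from sigma0, using steps of size dt."""
    rng = random.Random(seed)
    sigma = list(sigma0)
    for _ in range(int(round(t_end / dt))):
        drift = [-0.5 * grad_energy(sigma, x) for x in range(len(sigma))]
        # Brownian increments over a step are independent N(0, dt) variables.
        sigma = [s + b * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
                 for s, b in zip(sigma, drift)]
    return sigma


final_state = euler_maruyama([0.0] * 5, t_end=1.0, dt=0.01)
```

The independence of the increments across sites and across time steps is exactly the content of (B2) and (B3). The scheme is only a first-order approximation, so the step size should be small compared with the time scale set by the drift.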
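A.3. Proof of Eq. (2.14)

Set $A_x^{\Lambda,\eta} g = \frac{1}{2}\Delta_x g - \frac{1}{2}\nabla_x\widetilde H^{\Lambda,\eta}\,\nabla_x g$. We will show that
$$-\int_{\mathbb{R}^\Lambda} e^{-\widetilde H^{\Lambda,\eta}} f\,A_x^{\Lambda,\eta} g \prod_{y\in\Lambda} d\sigma_y = \frac{1}{2}\int_{\mathbb{R}^\Lambda} e^{-\widetilde H^{\Lambda,\eta}}\,\nabla_x f\,\nabla_x g \prod_{y\in\Lambda} d\sigma_y, \tag{A.9}$$
which, by summation over $x \in \Lambda$, proves (2.14). The proof of (A.9) boils down to that of
$$-\int_{\mathbb{R}} e^{-\widetilde H^{\Lambda,\eta}} f\,A_x^{\Lambda,\eta} g\,d\sigma_x = \frac{1}{2}\int_{\mathbb{R}} e^{-\widetilde H^{\Lambda,\eta}}\,\nabla_x f\,\nabla_x g\,d\sigma_x. \tag{A.10}$$
Note that
$$A_x^{\Lambda,\eta} g = \frac{1}{2}\,e^{\widetilde H^{\Lambda,\eta}}\,\nabla_x\big(e^{-\widetilde H^{\Lambda,\eta}}\,\nabla_x g\big)$$
and that therefore (A.10) is equivalent to
$$-\int_{\mathbb{R}} f\,\nabla_x\big(e^{-\widetilde H^{\Lambda,\eta}}\,\nabla_x g\big)\,d\sigma_x = \int_{\mathbb{R}} e^{-\widetilde H^{\Lambda,\eta}}\,\nabla_x f\,\nabla_x g\,d\sigma_x.$$
This can easily be seen by integration by parts.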
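A.4. Proof of Proposition 2.3

We will prove the equivalence of (2.19) and (2.20). We let $\|\cdot\|$ and $\langle\cdot,\cdot\rangle$ stand for the norm and the inner product of $L^2(\mu^{\Lambda,\eta})$. We may assume that $\mu^{\Lambda,\eta} f = 0$. Then,
$$(2.19) \iff \|f\|^2 \le -\gamma\,\langle f, A^{\Lambda,\eta} f\rangle, \quad \text{for all } f \in L^2(\mu),$$
$$(2.20) \iff e^{2t/\gamma}\,\|T_t^{\Lambda,\eta} f\|^2 \le \|f\|^2, \quad \text{for all } f \in L^2(\mu) \text{ and } t > 0.$$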
The equivalence of (2.19) and (2.20) can therefore be seen from the following computation:
$$\frac{d}{dt}\Big(e^{2t/\gamma}\,\|T_t^{\Lambda,\eta} f\|^2\Big) = e^{2t/\gamma}\Big(\frac{2}{\gamma}\,\|T_t^{\Lambda,\eta} f\|^2 + 2\,\langle T_t^{\Lambda,\eta} f, A^{\Lambda,\eta} T_t^{\Lambda,\eta} f\rangle\Big).$$

References

[1] A. Guionnet and B. Zegarlinski, Lectures on Logarithmic Sobolev Inequalities, Séminaire de Probabilités, Lecture Notes in Math. 1801, Springer.
[2] F. Martinelli, Lectures on Glauber dynamics for discrete spin models, Ecole de probabilités de St Flour 1997, Lecture Notes in Math. 1717, Springer, Berlin (1999).
[3] R. S. Ellis, Entropy, Large Deviations, and Statistical Mechanics, Springer Verlag, New York (1985).
[4] H. O. Georgii, Gibbs Measures and Phase Transitions, Walter de Gruyter, Berlin, New York (1988).
[5] J. Bellissard and R. Høegh-Krohn, Compactness and maximal Gibbs state for random Gibbs fields on the lattice, Commun. Math. Phys. 84 (1982), 297–327.
[6] M. Aizenman and R. Fernández, On the critical behavior of the magnetization in Ising models, J. Stat. Phys. 44 (1986), 393–454.
[7] G. Benettin, G. Gallavotti, G. Jona-Lasinio and A. L. Stella, On the Onsager–Yang-Value of the spontaneous magnetization, Commun. Math. Phys. 30 (1973), 45–54.
[8] T. M. Liggett, Interacting Particle Systems, Springer Verlag, Berlin–Heidelberg–Tokyo (1985).
[9] M. Reed and B. Simon, Methods of Modern Mathematical Physics II, Academic Press, 1980.
[10] L. Gross, Logarithmic Sobolev inequalities, Amer. J. Math. 97 (1975), 1061–1083.
[11] C. Ané, S. Blachère, D. Chafaï, P. Fougères, I. Gentil, F. Malrieu, C. Roberto and G. Scheffer, Sur les inégalités de Sobolev logarithmiques. Panoramas et Synthèses 10. Société Mathématique de France, Paris, 2000, pp. xvi+217.
[12] L. Gross, Logarithmic Sobolev inequalities and contractivity properties of semigroups, Springer Lecture Notes in Math. 1563 (1994), 54–88.
[13] J. D. Deuschel and D. W. Stroock, “Large Deviations”, AMS Chelsea publishing (2001).
[14] D. W. Stroock and B.
Zegarlinski, The logarithmic Sobolev inequality for discrete spin systems on a lattice, Commun. Math. Phys. 149 (1992), 175–193.
[15] F. Martinelli and E. Olivieri, Approach to equilibrium of Glauber dynamics in the one phase region I. The attractive case, Commun. Math. Phys. 161 (1994), 447–486.
[16] F. Martinelli and E. Olivieri, Approach to equilibrium of Glauber dynamics in the one phase region II. General case, Commun. Math. Phys. 161 (1994), 487–514.
[17] N. Yoshida, The equivalence of the log-Sobolev inequality and a mixing conditions for unbounded spin systems on the lattice, Ann. Inst. Henri Poincaré. Probabilités et Statistiques 37(2) (2001), 223–243.
[18] R. Holley and D. W. Stroock, L2 theory for the stochastic Ising model, Z. Wahrsch. verw. Gebiete 35 (1976), 87–101.
[19] R. Holley and D. W. Stroock, Applications of the stochastic Ising model to the Gibbs states, Commun. Math. Phys. 48 (1976), 249–265.
[20] D. W. Stroock and B. Zegarlinski, The equivalence of the logarithmic Sobolev inequality and the Dobrushin–Shlosman mixing condition, Commun. Math. Phys. 144 (1992), 303–323.
[21] F. Cesi, Quasi-factorization of the entropy and logarithmic Sobolev inequalities for Gibbs random fields, Prob. Th. Rel. Fields 120 (2001), 569–584.
[22] F. Martinelli, E. Olivieri and R. H. Schonmann, For 2-D lattice spin systems weak mixing implies strong mixing, Commun. Math. Phys. 165 (1994), 33–47.
[23] R. H. Schonmann and S. B. Shlosman, Complete analyticity for 2D Ising completed, Commun. Math. Phys. 170 (1995), 453–482.
[24] N. Yoshida, Application of log-Sobolev inequality to the stochastic dynamics of unbounded spin systems on the lattice, J. Funct. Anal. 173 (2000), 74–102.
[25] B. Zegarlinski, The strong decay to equilibrium for the stochastic dynamics of unbounded spin systems on a lattice, Commun. Math. Phys. 175 (1996), 401–432.
[26] B. Zegarlinski, Log-Sobolev inequalities for infinite one dimensional lattice systems, Commun. Math. Phys. 133 (1990), 147–162.
[27] B. Zegarlinski, Dobrushin Uniqueness Theorem and Logarithmic Sobolev Inequalities, J. Funct. Anal. 105 (1992), 77–111.
[28] R. L. Dobrushin and S. Shlosman, Completely analytical Gibbs fields, in Statistical Physics and Dynamical Systems, eds. J. Fritz, A. Jaffe and D. Szasz, Birkhäuser (1985).
[29] Y. Higuchi, unpublished result.
[30] N. Yoshida, The log-Sobolev inequality for weakly coupled lattice fields, Prob. Th. Rel. Fields 115 (1999), 1–40.
[31] R. L. Dobrushin and S. Shlosman, Constructive criterion for the uniqueness of Gibbs field, in Statistical Physics and Dynamical Systems, eds. J. Fritz, A. Jaffe and D. Szasz, Birkhäuser (1985).
[32] S. Lu and H. T. Yau, Spectral gap and logarithmic Sobolev inequality for Kawasaki and Glauber dynamics, Commun. Math. Phys. 156 (1993), 399–433.
[33] T. Bodineau and B. Helffer, Log-Sobolev inequality for unbounded spin systems, J. Funct. Anal. 166 (1999), 168–178.
[34] I. Gentil and C. Roberto, Spectral gaps for spin systems: some non-convex phase examples, J. Funct. Anal. 180 (2001), 66–84.
[35] A. Procacci and B. Scoppola, On decay of correlations for unbounded spin systems with arbitrary boundary conditions, J. Stat. Phys. 105 (2001), 453–482.
[36] M. Ledoux, Log-Sobolev inequality for unbounded spin systems revisited, Lecture Notes in Math. 1755, Springer, Berlin (2001).
[37] F. Cesi and F. Martinelli, On the layering transition of an SOS surface interacting with a wall. I. Equilibrium results, J. Stat. Phys. 82 (1996), 823–913.
[38] F. Cesi and F. Martinelli, On the layering transition of an SOS surface interacting with a wall. II. The Glauber dynamics, Commun. Math. Phys. 177 (1996), 173–201.
[39] E. I. Dinaburg and A. E. Mazel, Layering transition in SOS model with external magnetic field, J. Stat. Phys. 74 (1994), 533–563.
[40] R. H. Schonmann and N. Yoshida, Exponential relaxation of Glauber dynamics with some special boundary conditions, Commun. Math. Phys. 189 (1997), 299–310.
[41] N. Yoshida, Finite volume Glauber dynamics in a small magnetic field, J. Stat. Phys. 90 (1998), 1015–1035.
[42] R. H. Schonmann, Slow droplet-driven relaxation of stochastic Ising models in the vicinity of the phase coexistence region, Commun. Math. Phys. 161 (1994), 1–49.
[43] F. Cesi, G. Guadagni, F. Martinelli and R. H. Schonmann, On the 2D dynamical Ising model in the phase coexistence region near the critical point, J. Stat. Phys. 85 (1996), 55–102.
[44] L. E. Thomas, Bound on the mass gap for finite volume stochastic Ising models at low temperature, Commun. Math. Phys. 126 (1989), 1–11.
[45] D. B. Abraham and A. Martin-Löf, The transfer matrix for a pure phase in the two dimensional Ising model, Commun. Math. Phys. 32 (1973), 245–268.
[46] Y. Higuchi and N. Yoshida, Slow relaxation of stochastic Ising models with random and non-random boundary conditions, in New Trends in Stochastic Analysis, eds. K. D. Elworthy, S. Kusuoka and I. Shigekawa, World Scientific Publishing (1997).
[47] K. S. Alexander and N. Yoshida, The spectral gap of the 2-D stochastic Ising model with mixed boundary conditions, J. Stat. Phys. 104 (2001), 89–109.
[48] N. Sugimine, Extension of Thomas’ result and upper bound on the spectral gap of d(≥ 3)-dimensional stochastic Ising models, J. Math. Kyoto Univ. 42 (2002), 141–160.
[49] K. S. Alexander, The spectral gap of the 2-D stochastic Ising model with nearly single-spin boundary conditions, J. Stat. Phys. 104 (2001), 59–87.
[50] F. Martinelli, On the two dimensional dynamical Ising model in the phase coexistence region, J. Stat. Phys. 76 (1994), 1179–1246.
[51] T. Bodineau and F. Martinelli, Some new results on the kinetic Ising model in a pure phase, J. Stat. Phys. 109 (2002), 207–235.
[52] Y. Higuchi and J.
Wang, Spectral gap of Ising model for Dobrushin’s boundary condition in two dimensions, preprint, 1999.
[53] N. Sugimine, A lower bound on the spectral gap of the 3-dimensional stochastic Ising models (preprint, 2002).
[54] R. Holley and D. W. Stroock, Logarithmic Sobolev inequality and stochastic Ising models, J. Stat. Phys. 46 (1987), 1159–1194.
[55] R. Dobrushin and S. Shlosman, Completely analytical interactions: Constructive description, J. Stat. Phys. 46 (1987), 983–1014.
[56] T. Bodineau and B. Helffer, Correlations, Spectral gap and Log-Sobolev inequality for unbounded spin systems, in Differential Equations and Mathematical Physics, International Press, Birmingham, 1999, pp. 27–42.
December 8, 2003 11:39 WSPC/148-RMP
00181
Reviews in Mathematical Physics Vol. 15, No. 8 (2003) 789–822 © World Scientific Publishing Company
COIDEAL SUBALGEBRAS IN QUANTUM AFFINE ALGEBRAS
A. I. MOLEV
School of Mathematics and Statistics, University of Sydney NSW 2006, Australia
[email protected]

E. RAGOUCY∗ and P. SORBA†
LAPTH, Chemin de Bellevue, BP 110, F-74941 Annecy-le-Vieux cedex, France
∗[email protected]
†[email protected]

Received 15 February 2003
Revised 21 August 2003

We introduce two subalgebras in the type A quantum affine algebra which are coideals with respect to the Hopf algebra structure. In the classical limit $q \to 1$ each subalgebra specializes to the enveloping algebra $U(\mathfrak{k})$, where $\mathfrak{k}$ is a fixed point subalgebra of the loop algebra $\mathfrak{gl}_N[\lambda,\lambda^{-1}]$ with respect to a natural involution corresponding to the embedding of the orthogonal or symplectic Lie algebra into $\mathfrak{gl}_N$. We also give an equivalent presentation of these coideal subalgebras in terms of generators and defining relations which have the form of reflection-type equations. We provide evaluation homomorphisms from these algebras to the twisted quantized enveloping algebras introduced earlier by Gavrilik and Klimyk and by Noumi. We also construct an analog of the quantum determinant for each of the algebras and show that its coefficients belong to the center of the algebra. Their images under the evaluation homomorphism provide a family of central elements of the corresponding twisted quantized enveloping algebra.

Keywords: Quantized enveloping algebra; quantum determinant; evaluation homomorphism.
1. Introduction

For a simple Lie algebra $\mathfrak g$ over $\mathbb C$ consider the corresponding quantized enveloping algebra $U_q(\mathfrak g)$; see Drinfeld [9], Jimbo [18]. If $\mathfrak k$ is a subalgebra of $\mathfrak g$ then $U(\mathfrak k)$ is a Hopf subalgebra of $U(\mathfrak g)$. However, $U_q(\mathfrak k)$, even when it is defined, need not be isomorphic to a Hopf subalgebra of $U_q(\mathfrak g)$. In the case where $(\mathfrak g, \mathfrak k)$ is a classical symmetric pair the twisted quantized enveloping algebra $U^{\rm tw}_q(\mathfrak k)$ was introduced by Noumi [31] (type A pairs) and by Noumi and Sugitani [32] (remaining classical types). This is a subalgebra and a left coideal of the Hopf algebra $U_q(\mathfrak g)$ which specializes to $U(\mathfrak k)$ as $q \to 1$. The algebras $U^{\rm tw}_q(\mathfrak k)$ play an important role in the theory of quantum
symmetric spaces developed in [31] and [32]. In particular, in type A, which is the only case we are concerned with in this paper, there are two twisted quantized enveloping algebras $U^{\rm tw}_q(\mathfrak o_N)$ and $U^{\rm tw}_q(\mathfrak{sp}_{2n})$ corresponding to the symmetric pairs
$$\text{AI}: (\mathfrak{gl}_N, \mathfrak o_N), \qquad \text{AII}: (\mathfrak{gl}_{2n}, \mathfrak{sp}_{2n}), \tag{1.1}$$
respectively. It was also shown by Noumi [31] that the algebra $U^{\rm tw}_q(\mathfrak o_N)$ coincides with the one introduced earlier by Gavrilik and Klimyk [12]. The algebra $U^{\rm tw}_q(\mathfrak o_N)$ also appears as the symmetry algebra for the q-oscillator representation of the quantized enveloping algebra $U_q(\mathfrak{sp}_{2n})$; see Noumi, Umeda and Wakayama [33, 34]. In Noumi's approach, the defining relations for the quantized algebras can be written in the form of a reflection-type equation. A constant solution of the reflection equation provides an embedding of the twisted quantized enveloping algebra into $U_q(\mathfrak{gl}_N)$. The quantum homogeneous spaces corresponding to the remaining series of the classical symmetric pairs of type A,
$$\text{AIII}: (\mathfrak{gl}_N, \mathfrak{gl}_{N-l} \oplus \mathfrak{gl}_l), \tag{1.2}$$
were studied by Dijkhuizen, Noumi and Sugitani [6, 7]. A one-parameter family of the constant solutions of the appropriate reflection equation was produced in [7], although no reflection-type presentations of the subalgebras of type $U^{\rm tw}_q(\mathfrak k)$ were formally introduced. A different description of the coideal subalgebras of $U_q(\mathfrak g)$ associated with an arbitrary irreducible symmetric pair $(\mathfrak g, \mathfrak k)$ was given by Letzter [22, 23]. The subalgebras are presented by generators and explicit relations depending on the Cartan matrix of $\mathfrak g$; see [23]. In particular, this work demonstrates the importance of the coideal property: it makes the construction of the twisted quantized algebras essentially unique.

Natural infinite-dimensional analogs of the symmetric pairs are provided by involutive subalgebras in the polynomial current Lie algebras $\mathfrak g[x] = \mathfrak g \otimes \mathbb C[x]$ or loop algebras $\mathfrak g[\lambda,\lambda^{-1}] = \mathfrak g \otimes \mathbb C[\lambda,\lambda^{-1}]$. Let $(\mathfrak g, \mathfrak k)$ be a symmetric pair and $\mathfrak g = \mathfrak k \oplus \mathfrak p$ be the decomposition determined by the involution $\theta$ of $\mathfrak g$. So, $\mathfrak k$ and $\mathfrak p$ are the eigenspaces of $\theta$ with the eigenvalues $1$ and $-1$, respectively. Then the twisted polynomial current Lie algebra $\mathfrak g[x]^\theta$ can be defined by
$$\mathfrak g[x]^\theta = \mathfrak k \oplus \mathfrak p\,x \oplus \mathfrak k\,x^2 \oplus \mathfrak p\,x^3 \oplus \cdots, \tag{1.3}$$
or, equivalently, it is the fixed point subalgebra of $\mathfrak g[x]$ with respect to the extension of $\theta$ given by
$$\theta : A x^p \mapsto (-1)^p\, \theta(A)\, x^p, \qquad A \in \mathfrak g. \tag{1.4}$$
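The alternating pattern in (1.3) can be checked directly against the fixed-point condition (1.4) in a small case. The following is only a sketch under an assumption not spelled out in the text: it takes $\mathfrak g = \mathfrak{gl}_2$ with the involution $\theta(A) = -A^t$ (the usual choice for the orthogonal pair), so that $\mathfrak k$ consists of antisymmetric and $\mathfrak p$ of symmetric matrices. The function names are illustrative.

```python
def transpose(A):
    return [list(row) for row in zip(*A)]

def theta(A):
    # Assumed involution of gl_N for the orthogonal pair: theta(A) = -A^t.
    return [[-x for x in row] for row in transpose(A)]

def theta_ext(A, p):
    # Extension (1.4): A x^p -> (-1)^p theta(A) x^p.
    # Returns the coefficient matrix of x^p in the image.
    sign = -1 if p % 2 else 1
    return [[sign * x for x in row] for row in theta(A)]

def is_fixed(A, p):
    # True when the monomial A x^p lies in the fixed point subalgebra.
    return theta_ext(A, p) == A

K = [[0, 1], [-1, 0]]  # antisymmetric, so theta(K) = K and K lies in k
P = [[1, 2], [2, 3]]   # symmetric, so theta(P) = -P and P lies in p

for p in range(4):
    print(p, is_fixed(K, p), is_fixed(P, p))
```

Elements of $\mathfrak k$ survive only in even degrees and elements of $\mathfrak p$ only in odd degrees, which is the decomposition (1.3).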
As demonstrated by Drinfeld [9], the enveloping algebra U(g[x]) admits a canonical deformation in the class of Hopf algebras. The corresponding “quantum” algebra is called the Yangian and denoted by Y(g). For the case where θ is an involution of glN corresponding to the pair of type AI or AII, quantum analogs of the symmetric
pairs $(\mathfrak{gl}_N[x], \mathfrak{gl}_N[x]^\theta)$ are provided by the Olshanski twisted Yangians [35]. These are coideal subalgebras in the Yangian $Y(\mathfrak{gl}_N)$ and each of them is a deformation of the enveloping algebra $U(\mathfrak{gl}_N[x]^\theta)$. In the AIII case, the quantum analogs of the pairs $(\mathfrak{gl}_N[x], \mathfrak{gl}_N[x]^\theta)$ are provided by the reflection algebras $B(N, l)$ which are coideal subalgebras of $Y(\mathfrak{gl}_N)$ originally introduced by Sklyanin [37]. Recently, these algebras and their representations were studied in connection with the NLS model; see Liguori, Mintchev and Zhao [24], Mintchev, Ragoucy and Sorba [25], Molev and Ragoucy [29]. For the symmetric pairs $(\mathfrak g[x], \mathfrak g[x]^\theta)$ of general types the corresponding coideal subalgebras in the Yangian $Y(\mathfrak g)$ were recently introduced by Delius, MacKay and Short [4] in relation to the principal chiral models with boundaries. These subalgebras are given in terms of the Q-presentation of the Yangian. A different R-matrix presentation of coideal subalgebras in the (super) Yangian $Y(\mathfrak g)$ is given by Arnaudon, Avan, Crampé, Frappat and Ragoucy [1]. Field theoretical applications of the coideal subalgebras in the quantum affine algebras have been studied in a recent paper by Delius and MacKay [5].

In the case of the loop algebra $\widehat{\mathfrak g} = \mathfrak g[\lambda,\lambda^{-1}]$ there is another natural way [cf. (1.4)] to extend the involution $\theta$ of $\mathfrak g$,
$$\theta : A\lambda^p \mapsto \theta(A)\,\lambda^{-p}, \qquad A \in \mathfrak g, \tag{1.5}$$
and thus to define the fixed point subalgebra $\widehat{\mathfrak g}^{\,\theta}$. In this paper we introduce certain quantizations of the symmetric pairs $(\widehat{\mathfrak{gl}}_N, \widehat{\mathfrak{gl}}_N^{\,\theta})$ associated with the involution $\theta$ corresponding to the pairs AI and AII. We define the twisted q-Yangians $Y^{\rm tw}_q(\mathfrak o_N)$ and $Y^{\rm tw}_q(\mathfrak{sp}_{2n})$ as subalgebras of the quantum affine algebra $U_q(\widehat{\mathfrak{gl}}_N)$. They are left coideals with respect to the coproduct on $U_q(\widehat{\mathfrak{gl}}_N)$ and specialize to $U(\widehat{\mathfrak{gl}}_N^{\,\theta})$ as $q \to 1$.

At this point we consider it necessary to comment on the terminology. Although, as we have mentioned above, the Lie algebra $\widehat{\mathfrak{gl}}_N^{\,\theta}$ is not a "twisted" quantum affine algebra in the usual meaning, we believe the names we use for the coideal subalgebras can be justified having in mind their analogy with both the twisted Yangians and the q-Yangian; cf. [30]. The latter is a subalgebra of the quantum affine algebra $U_q(\widehat{\mathfrak{gl}}_N)$ which can be regarded as a q-analog of the usual Yangian $Y(\mathfrak{gl}_N)$; see also Sec. 3 below.

Our first main result is a construction of the evaluation homomorphisms
$$Y^{\rm tw}_q(\mathfrak o_N) \to U^{\rm tw}_q(\mathfrak o_N), \qquad Y^{\rm tw}_q(\mathfrak{sp}_{2n}) \to U^{\rm tw}_q(\mathfrak{sp}_{2n}) \tag{1.6}$$
to the corresponding twisted quantized enveloping algebras of [12] and [31]. Note that an evaluation homomorphism $U_q(\widehat{\mathfrak g}) \to U_q(\mathfrak g)$ from the quantum affine algebra to the corresponding quantized enveloping algebra only exists if $\mathfrak g$ is of A type, and the same holds for the case of the Yangians; see Jimbo [19], Drinfeld [9]. In both cases, the evaluation homomorphisms play an important role in the representation theory of the quantum algebras; see Chari and Pressley [2]. An evaluation
homomorphism from the twisted Yangian to the corresponding enveloping algebra $U(\mathfrak o_N)$ or $U(\mathfrak{sp}_{2n})$ does exist (see [35, 28]) and has many applications in the classical representation theory; see e.g. [27] for an overview. Note also that the existence of the homomorphisms (1.6) is not directly related to the corresponding fact for the A type algebras but is quite a nontrivial property of the reflection equations satisfied by the generators of the twisted q-Yangians.

Next we construct an analog of the quantum determinant for each twisted q-Yangian and show that its coefficients belong to the center of this algebra. The application of the evaluation homomorphism (1.6) yields a family of central elements in $U^{\rm tw}_q(\mathfrak o_N)$ and $U^{\rm tw}_q(\mathfrak{sp}_{2n})$. In the orthogonal case we also produce a "short" determinant-like formula for this analog which employs a certain map from the symmetric group into itself. This same map was used in the short formulas for the Sklyanin determinants for the twisted Yangians; see [27]. Some other families of Casimir elements were constructed by Noumi, Umeda and Wakayama [34] and by Gavrilik and Iorgov [13]. It would be interesting to understand the relationship between the families, as well as to investigate possible applications to the study of the quantum Howe dual pairs; cf. [33, 34].

Another intriguing problem is to construct coideal subalgebras of $U_q(\widehat{\mathfrak{gl}}_N)$ of type AIII, i.e. to find q-analogs of the Sklyanin reflection algebras $B(N, l)$ mentioned above.

2. Coideal Subalgebras of $U_q(\mathfrak{gl}_N)$

We shall use an R-matrix presentation of the algebra $U_q(\mathfrak{gl}_N)$. Our main references are Jimbo [19] and Reshetikhin, Takhtajan and Faddeev [36]. We fix a complex parameter $q$ which is nonzero and not a root of unity. Consider the R-matrix
$$R = q \sum_{i} E_{ii} \otimes E_{ii} + \sum_{i \ne j} E_{ii} \otimes E_{jj} + (q - q^{-1}) \sum_{i<j} E_{ij} \otimes E_{ji}, \tag{2.1}$$
which is an element of $\mathrm{End}\,\mathbb C^N \otimes \mathrm{End}\,\mathbb C^N$, where the $E_{ij}$ denote the standard matrix units and the indices run over the set $\{1, \dots, N\}$. The R-matrix satisfies the Yang–Baxter equation
$$R_{12} R_{13} R_{23} = R_{23} R_{13} R_{12}, \tag{2.2}$$
where both sides take values in $\mathrm{End}\,\mathbb C^N \otimes \mathrm{End}\,\mathbb C^N \otimes \mathrm{End}\,\mathbb C^N$ and the subindices indicate the copies of $\mathrm{End}\,\mathbb C^N$, e.g. $R_{12} = R \otimes 1$ etc. The quantized enveloping algebra $U_q(\mathfrak{gl}_N)$ is generated by elements $t_{ij}$ and $\bar t_{ij}$ with $1 \le i, j \le N$ subject to the relations^a
$$\begin{gathered} t_{ij} = \bar t_{ji} = 0, \qquad 1 \le i < j \le N, \\ t_{ii}\,\bar t_{ii} = \bar t_{ii}\,t_{ii} = 1, \qquad 1 \le i \le N, \\ R\,T_1 T_2 = T_2 T_1 R, \qquad R\,\bar T_1 \bar T_2 = \bar T_2 \bar T_1 R, \qquad R\,\bar T_1 T_2 = T_2 \bar T_1 R. \end{gathered} \tag{2.3}$$

^a Our $T$ and $\bar T$ correspond to the L-operators $L^-$ and $L^+$, respectively, in the notation of [36].
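For small $N$ the Yang–Baxter equation (2.2) for the R-matrix (2.1) can be confirmed by direct matrix multiplication. The sketch below is an illustration only, not part of the original argument: it builds $R$ from matrix units with exact rational arithmetic, embeds it into the three-fold tensor product, and compares both sides. The helper names (`r_terms`, `embed`, `yang_baxter_holds`) and the sample values of $q$ are hypothetical choices.

```python
from fractions import Fraction

def unit(N, i, j):
    """Matrix unit E_ij of size N x N (0-based indices)."""
    return [[Fraction(int((r, c) == (i, j))) for c in range(N)] for r in range(N)]

def identity(N):
    return [[Fraction(int(r == c)) for c in range(N)] for r in range(N)]

def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]

def r_terms(N, q):
    """Terms (coef, A, B) with R = sum of coef * A (x) B, as in (2.1)."""
    terms = []
    for i in range(N):
        for j in range(N):
            terms.append((q if i == j else Fraction(1), unit(N, i, i), unit(N, j, j)))
            if i < j:
                terms.append((q - 1 / q, unit(N, i, j), unit(N, j, i)))
    return terms

def embed(terms, N, slots):
    """Place each term A (x) B into two of the three tensor slots."""
    size = N ** 3
    total = [[Fraction(0)] * size for _ in range(size)]
    for coef, A, B in terms:
        factors = [identity(N), identity(N), identity(N)]
        factors[slots[0]], factors[slots[1]] = A, B
        M = kron(kron(factors[0], factors[1]), factors[2])
        for r in range(size):
            for c in range(size):
                total[r][c] += coef * M[r][c]
    return total

def yang_baxter_holds(N, q):
    """Check R12 R13 R23 == R23 R13 R12 exactly."""
    t = r_terms(N, q)
    R12, R13, R23 = embed(t, N, (0, 1)), embed(t, N, (0, 2)), embed(t, N, (1, 2))
    return matmul(matmul(R12, R13), R23) == matmul(matmul(R23, R13), R12)

print(yang_baxter_holds(2, Fraction(3, 2)), yang_baxter_holds(3, Fraction(2, 5)))
```

Exact fractions are used so that equality of the two sides is tested without rounding tolerance; a check at a few rational values of $q$ is of course evidence rather than a proof.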
Here $T$ and $\bar T$ are the matrices
$$T = \sum_{i,j} t_{ij} \otimes E_{ij}, \qquad \bar T = \sum_{i,j} \bar t_{ij} \otimes E_{ij}, \tag{2.4}$$
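which are regarded as elements of the algebra $U_q(\mathfrak{gl}_N) \otimes \mathrm{End}\,\mathbb C^N$. Both sides of each of the R-matrix relations in (2.3) are elements of $U_q(\mathfrak{gl}_N) \otimes \mathrm{End}\,\mathbb C^N \otimes \mathrm{End}\,\mathbb C^N$ and the subindices of $T$ and $\bar T$ indicate the copies of $\mathrm{End}\,\mathbb C^N$ where $T$ or $\bar T$ acts; e.g. $T_1 = T \otimes 1$. In terms of the generators the defining relations between the $t_{ij}$ can be written as $q^{\delta_{ij}}\, t_{ia} t_{jb} - q^{\delta_{ab}}\, t_{jb} t_{ia} = (q - q^{-1})\,(\delta_{b}\cdots$ [...]
$$\tau_{ij} = \frac{t_{ij}}{q - q^{-1}} \ \text{ for } i > j, \qquad \bar\tau_{ij} = \frac{\bar t_{ij}}{q - q^{-1}} \ \text{ for } i < j, \tag{2.12}$$
and
$$\tau_{ii} = \frac{t_{ii} - 1}{q - 1}, \qquad \bar\tau_{ii} = \frac{\bar t_{ii} - 1}{q - 1}, \tag{2.13}$$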
for $i = 1, \dots, N$. Then we have an isomorphism
$$U_{\mathcal A} \otimes_{\mathcal A} \mathbb C \cong U(\mathfrak{gl}_N) \tag{2.14}$$
with the action of $\mathcal A$ on $\mathbb C$ defined via the evaluation $q = 1$; see e.g. [2, Sec. 9.2]. Note that $\tau_{ij}$ and $\bar\tau_{ij}$ respectively specialize to the elements $E_{ij}$ and $-E_{ij}$ of $U(\mathfrak{gl}_N)$. More generally, given a subalgebra $V$ of $U_q(\mathfrak{gl}_N)$, set $V_{\mathcal A} = V \cap U_{\mathcal A}$. Following Letzter [22, Sec. 1], we shall say that $V$ specializes to the subalgebra $V^\circ$ of $U(\mathfrak{gl}_N)$ (as $q$ goes to 1) if the image of $V_{\mathcal A}$ in $U_{\mathcal A} \otimes_{\mathcal A} \mathbb C$ is $V^\circ$.

2.1. Orthogonal case

Following Noumi [31], we introduce the twisted quantized enveloping algebra $U^{\rm tw}_q(\mathfrak o_N)$ as the subalgebra of $U_q(\mathfrak{gl}_N)$ generated by the matrix elements of the matrix $S = T\,\bar T^{\,t}$. It can be easily derived from (2.3) (see [31]) that the matrix $S$ satisfies the relations
$$s_{ij} = 0, \qquad 1 \le i < j \le N, \tag{2.15}$$
$$s_{ii} = 1, \qquad 1 \le i \le N, \tag{2.16}$$
$$R\,S_1 R^t S_2 = S_2 R^t S_1 R, \tag{2.17}$$
where $R^t := R^{t_1}$ denotes the element obtained from $R$ by the transposition in the first tensor factor:
$$R^t = q \sum_{i} E_{ii} \otimes E_{ii} + \sum_{i \ne j} E_{ii} \otimes E_{jj} + (q - q^{-1}) \sum_{i<j} E_{ji} \otimes E_{ji}. \tag{2.18}$$
Indeed, the only nontrivial part of this derivation is to verify that
$$R\,T_1 \bar T_1^{\,t} R^t\, T_2 \bar T_2^{\,t} = T_2 \bar T_2^{\,t} R^t\, T_1 \bar T_1^{\,t} R. \tag{2.19}$$
However, this is implied by the relation $R\,R^t = R^t R$ and the following consequences of (2.3):
$$\bar T_1^{\,t} R^t\, T_2 = T_2 R^t\, \bar T_1^{\,t}, \qquad R\,\bar T_1^{\,t} \bar T_2^{\,t} = \bar T_2^{\,t} \bar T_1^{\,t} R. \tag{2.20}$$
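The commutation relation $R\,R^t = R^t R$ used in this derivation can be confirmed numerically for small $N$. The following sketch is illustrative only: it builds $R$ from (2.1) and $R^t$ from (2.18) with exact rationals and compares the two products. The function name `build` and the chosen values of $N$ and $q$ are assumptions made for the example.

```python
from fractions import Fraction

def unit(N, i, j):
    """Matrix unit E_ij of size N x N (0-based indices)."""
    return [[Fraction(int((r, c) == (i, j))) for c in range(N)] for r in range(N)]

def kron(A, B):
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]

def build(N, q, transposed):
    """R of (2.1) if transposed is False, R^t of (2.18) if True."""
    size = N * N
    M = [[Fraction(0)] * size for _ in range(size)]

    def add(coef, A, B):
        K = kron(A, B)
        for r in range(size):
            for c in range(size):
                M[r][c] += coef * K[r][c]

    for i in range(N):
        for j in range(N):
            add(q if i == j else Fraction(1), unit(N, i, i), unit(N, j, j))
            if i < j:
                # Transposing the first factor turns E_ij (x) E_ji into E_ji (x) E_ji.
                first = unit(N, j, i) if transposed else unit(N, i, j)
                add(q - 1 / q, first, unit(N, j, i))
    return M

q = Fraction(5, 3)
R, Rt = build(3, q, False), build(3, q, True)
print(matmul(R, Rt) == matmul(Rt, R))
```

The two matrices differ only in their off-diagonal parts, and the products of those parts vanish, which is why the commutator reduces to terms involving the diagonal part alone.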
We now prove an auxiliary lemma which establishes a weak form of the Poincaré–Birkhoff–Witt theorem for abstract algebras defined by the relation (2.17). It will be used in both the orthogonal and symplectic cases.

Lemma 2.1. Consider the associative algebra with $N^2$ generators $s_{ij}$, $i, j = 1, \dots, N$ and the defining relations written in terms of the matrix $S = (s_{ij})$ by the relation (2.17). Then the ordered monomials of the form
$$s_{11}^{k_{11}} s_{12}^{k_{12}} \cdots s_{1N}^{k_{1N}} \cdots s_{N1}^{k_{N1}} s_{N2}^{k_{N2}} \cdots s_{NN}^{k_{NN}} \tag{2.21}$$
with nonnegative powers $k_{ij}$ linearly span the algebra.
Proof. Rewriting (2.17) in terms of the generators we get $q^{\delta_{aj} + \delta_{ij}}\, s_{ia} s_{jb} - q^{\delta_{ab} + \delta_{ib}}\, s_{jb} s_{ia} = (q - q^{-1})\, q^{\delta_{ai}}\, (\delta_{b}\cdots$ [...] $\sum_{i<j} E_{ij} \otimes E_{ji}$. (3.4)

It satisfies the Yang–Baxter equation
$$R_{12}(u, v)\, R_{13}(u, w)\, R_{23}(v, w) = R_{23}(v, w)\, R_{13}(u, w)\, R_{12}(u, v), \tag{3.5}$$
where both sides take values in $\mathrm{End}\,\mathbb C^N \otimes \mathrm{End}\,\mathbb C^N \otimes \mathrm{End}\,\mathbb C^N$ and the subindices indicate the copies of $\mathrm{End}\,\mathbb C^N$, e.g. $R_{12}(u, v) = R(u, v) \otimes 1$ etc. Note that $R(u, v)$ is related to the constant R-matrices (2.1) and (2.7) by the formula
$$R(u, v) = u\,\widetilde R - v\,R. \tag{3.6}$$
There is a Hopf algebra structure on $U_q(\widehat{\mathfrak{gl}}_N)$ with the coproduct defined by
$$\Delta(t_{ij}(u)) = \sum_{k=1}^{N} t_{ik}(u) \otimes t_{kj}(u), \qquad \Delta(\bar t_{ij}(u)) = \sum_{k=1}^{N} \bar t_{ik}(u) \otimes \bar t_{kj}(u). \tag{3.7}$$
The algebra $U_q(\widehat{\mathfrak{gl}}_N)$ specializes to $U(\widehat{\mathfrak{gl}}_N)$ as $q \to 1$. More precisely, as with the case of $U_q(\mathfrak{gl}_N)$ (see Sec. 2), regard $q$ as a formal variable and introduce the $\mathcal A$-subalgebra $U_{\mathcal A}$ of $U_q(\widehat{\mathfrak{gl}}_N)$ generated by the elements $\tau_{ij}^{(r)}$ and $\bar\tau_{ij}^{(r)}$ defined by
$$\tau_{ij}^{(r)} = \frac{t_{ij}^{(r)}}{q - q^{-1}}, \qquad \bar\tau_{ij}^{(r)} = \frac{\bar t_{ij}^{(r)}}{q - q^{-1}} \tag{3.8}$$
for $r \ge 0$ and all $i, j$, except for the case $r = 0$ and $i = j$ where we set
$$\tau_{ii}^{(0)} = \frac{t_{ii}^{(0)} - 1}{q - 1}, \qquad \bar\tau_{ii}^{(0)} = \frac{\bar t_{ii}^{(0)} - 1}{q - 1}. \tag{3.9}$$
Then we have an isomorphism
$$U_{\mathcal A} \otimes_{\mathcal A} \mathbb C \cong U(\widehat{\mathfrak{gl}}_N); \tag{3.10}$$
see [2, Sec. 12.2] and [11, Sec. 2]. The images of the generators of $U_{\mathcal A}$ in (3.10) are given by
$$\tau_{ij}^{(r)} \to E_{ij}\lambda^{r}, \qquad \bar\tau_{ij}^{(r)} \to -E_{ij}\lambda^{-r} \tag{3.11}$$
for all $r \ge 0$ with the exception $\tau_{ij}^{(0)} = \bar\tau_{ji}^{(0)} = 0$ if $i < j$. Given a subalgebra $V$ of $U_q(\widehat{\mathfrak{gl}}_N)$ we set $V_{\mathcal A} = V \cap U_{\mathcal A}$. We shall say that $V$ specializes to a subalgebra $V^\circ$ of $U(\widehat{\mathfrak{gl}}_N)$ if $V_{\mathcal A} \otimes_{\mathcal A} \mathbb C \cong V^\circ$.

The quantized enveloping algebra $U_q(\mathfrak{gl}_N)$ is a natural (Hopf) subalgebra of $U_q(\widehat{\mathfrak{gl}}_N)$ defined by the embedding
$$t_{ij} \mapsto t_{ij}^{(0)}, \qquad \bar t_{ij} \mapsto \bar t_{ij}^{(0)}. \tag{3.12}$$
Moreover, there is an algebra homomorphism $U_q(\widehat{\mathfrak{gl}}_N) \to U_q(\mathfrak{gl}_N)$ called the evaluation homomorphism defined by
$$T(u) \mapsto T - \bar T\,u^{-1}, \qquad \bar T(u) \mapsto \bar T - T\,u. \tag{3.13}$$
The A type quantum affine algebras are exceptional in the sense that only in this case does such an evaluation homomorphism exist; see Chari–Pressley [2, Chapter 12].

The subalgebra of $U_q(\widehat{\mathfrak{gl}}_N)$ generated by the elements $t_{ij}^{(r)}$ was studied, e.g. in [3, 30] and [36]. We call it the q-Yangian. In what follows we construct quantum affine algebras associated with the orthogonal and symplectic Lie algebras for which analogs of the evaluation homomorphism (3.13) do exist; cf. the B and C type twisted Yangians [35, 28]. These algebras can be viewed as twisted analogs of the q-Yangian as well as q-analogs of the twisted Yangians. Note, however, that contrary to the case of the twisted Yangians, our algebras are not subalgebras of the q-Yangian; they are generated by certain combinations of both types of elements $t_{ij}^{(r)}$ and $\bar t_{ij}^{(r)}$.
3.1. Orthogonal twisted q-Yangians

Definition 3.1. The twisted q-Yangian $Y^{\rm tw}_q(\mathfrak o_N)$ is the subalgebra of $U_q(\widehat{\mathfrak{gl}}_N)$ generated by the coefficients $s_{ij}^{(r)}$ of the matrix elements of the matrix $S(u) = T(u)\,\bar T(u^{-1})^t$. More precisely, we have
$$s_{ij}(u) = \sum_{a=1}^{N} t_{ia}(u)\, \bar t_{ja}(u^{-1}) \tag{3.14}$$
so that
$$s_{ij}(u) = \sum_{r=0}^{\infty} s_{ij}^{(r)}\, u^{-r}. \tag{3.15}$$
The subalgebra $Y^{\rm tw}_q(\mathfrak o_N)$ is generated by the elements $s_{ij}^{(r)}$ with $1 \le i, j \le N$ and $r$ running over the set of nonnegative integers.

Next we give a presentation of the algebra $Y^{\rm tw}_q(\mathfrak o_N)$ in terms of generators and defining relations by analogy with the finite-dimensional case; see Sec. 2.1. Consider the element $R^t(u, v) := R^{t_1}(u, v)$ obtained from $R(u, v)$ by the transposition in the first factor:
$$R^t(u, v) = (u - v) \sum_{i \ne j} E_{ii} \otimes E_{jj} + (q^{-1}u - q\,v) \sum_{i} E_{ii} \otimes E_{ii} + (q^{-1} - q)\,u \sum_{i>j} E_{ji} \otimes E_{ji} + (q^{-1} - q)\,v \sum_{i<j} E_{ji} \otimes E_{ji}. \tag{3.16}$$
The following relations are implied by (3.3):
$$s_{ij}^{(0)} = 0, \qquad 1 \le i < j \le N, \tag{3.17}$$
$$s_{ii}^{(0)} = 1, \qquad 1 \le i \le N, \tag{3.18}$$
$$R(u, v)\, S_1(u)\, R^t(u^{-1}, v)\, S_2(v) = S_2(v)\, R^t(u^{-1}, v)\, S_1(u)\, R(u, v). \tag{3.19}$$
We shall be proving that these are precisely the defining relations for the algebra $Y^{\rm tw}_q(\mathfrak o_N)$.
Lemma 3.2. Consider the (abstract) associative algebra with generators $s_{ij}^{(r)}$ where $i, j = 1, \dots, N$ and $r = 0, 1, \dots$. The defining relations are written in terms of the matrix $S(u) = (s_{ij}(u))$ by the relation (3.19) with $s_{ij}(u)$ defined by (3.15). Introduce the ordering on the generators in such a way that $s_{ij}^{(r)} \prec s_{kl}^{(p)}$ if and only if $(i, j, r) \prec (k, l, p)$ in the lexicographical order. Then the ordered monomials in the generators span the algebra.

Proof. Write the defining relations in terms of the generating series $s_{ij}(u)$: $(q^{-\delta_{ij}} u - q^{\delta_{ij}} v)\, \alpha_{ijab}(u, v) + (q^{-1} - q)(u\,\delta_{j}\cdots$ [...] 1 such that the bound $|f(t + is)| \le M(1 + |t|)^{-p}$ holds uniformly in $s \in [-1/4, 1/4]$.

We remark that there exists an admissible function satisfying Definition 2.1. See [1, Lemma 3.1].

We are ready to give a construction of Dirichlet forms on the standard form $(\mathcal M, \mathcal H, \mathcal P, J)$ associated with the pair $(\mathcal M, \xi_0)$, where $\mathcal M$ is a von Neumann algebra $\mathbb Z_2$-graded by $\gamma$. Denote by $\mathcal M_{an}$ the dense subset of $\mathcal M$ consisting of $\sigma_t$-analytic elements on a domain containing the strip $I_{1/2} := \{z : |\mathrm{Im}\, z| \le 1/2\}$ [7]. By [7, Proposition 2.5.21], any element $A \in \mathcal M_{an}$ is strongly analytic. For a given admissible function $f$ and odd element $x \in \mathcal M_o \cap \mathcal M_{an}$, define a sesquilinear form $\mathcal E : \mathcal H \times \mathcal H \to \mathbb C$ by
$$\begin{aligned} \mathcal E(\eta, \xi) ={}& \int \big\langle (\sigma_{t-i/4}(x) - j(\sigma_{t-i/4}(x^*))U_\gamma)\eta,\ (\sigma_{t-i/4}(x) - j(\sigma_{t-i/4}(x^*))U_\gamma)\xi \big\rangle f(t)\,dt \\ &+ \int \big\langle (\sigma_{t-i/4}(x^*) - j(\sigma_{t-i/4}(x))U_\gamma)\eta,\ (\sigma_{t-i/4}(x^*) - j(\sigma_{t-i/4}(x))U_\gamma)\xi \big\rangle f(t)\,dt \\ \equiv{}& \mathcal E^{(1)}(\eta, \xi) + \mathcal E^{(2)}(\eta, \xi). \end{aligned} \tag{2.9}$$
Here $U_\gamma$ is the unitary operator satisfying $\gamma(A) = U_\gamma A U_\gamma^{-1}$, $A \in \mathcal M$. The associated quadratic form $\mathcal E[\cdot]$ is given by
$$\mathcal E[\xi] = \int \big\|(\sigma_{t-i/4}(x) - j(\sigma_{t-i/4}(x^*))U_\gamma)\xi\big\|^2 f(t)\,dt + \int \big\|(\sigma_{t-i/4}(x^*) - j(\sigma_{t-i/4}(x))U_\gamma)\xi\big\|^2 f(t)\,dt. \tag{2.10}$$
December 3, 2003 17:12 WSPC/148-RMP
00182
Dirichlet Forms and Symmetric Markovian Semigroups
The form defined above is bounded since $\|\sigma_{t-i/4}(x)\| = \|\sigma_{-i/4}(x)\|$ for any $x \in \mathcal M_{an}$. Recall that $\mathcal M\xi_0$ is a dense subset of $\mathcal H$ and that $x$ is an odd element. Using the fact that $JB\xi_0 = \Delta^{1/2}B^*\xi_0$, $B \in \mathcal M$, one obtains that for any $A \in \mathcal M$,
$$\begin{aligned} (\sigma_{t-i/4}(x^\#) - j(\sigma_{t-i/4}((x^\#)^*))U_\gamma)A\xi_0 &= \sigma_{t-i/4}(x^\#)A\xi_0 - \gamma(A)\,j(\sigma_{t-i/4}((x^\#)^*))\xi_0 \\ &= (\sigma_{t-i/4}(x^\#)A - \gamma(A)\,\sigma_{t-i/4}(x^\#))\xi_0, \end{aligned} \tag{2.11}$$
where $x^\#$ stands for $x$ or $x^*$. Thus we have used inner superderivations in (2.5) to define the sesquilinear form in (2.9). The following is the main result, which corresponds to [1, Theorem 3.1].

Theorem 2.1. For a given admissible function $f$ and an odd element $x \in \mathcal M_o \cap \mathcal M_{an}$, let $(\mathcal E, \mathcal H)$ be defined as in (2.9). Let $H$ be the self-adjoint operator associated with $(\mathcal E, \mathcal H)$, i.e., $\mathcal E[\xi] = \langle \xi, H\xi \rangle$, $\forall\, \xi \in \mathcal H$. Assume that there exists a constant $M > 0$ such that the bound
$$\sup_{s \in [-1/4, 1/4]} \|\sigma_{t+is}(x)\| \le M \tag{2.12}$$
holds uniformly in $t \in \mathbb R$. Then the following properties hold:
(a) $H\xi_0 = 0$,
(b) $\mathcal E[J\xi] = \mathcal E[\xi]$, $\forall\, \xi \in \mathcal H$ (J-real),
(c) $\mathcal E(\xi_+, \xi_-) \le 0$ for all $\xi \in \mathcal H^J$.
Furthermore $(\mathcal E, \mathcal H)$ is a Dirichlet form.

We will give the proof of Theorem 2.1 at the end of this section. The following is a consequence of Theorem 2.1:

Theorem 2.2. Let $H$ be the self-adjoint operator associated with $(\mathcal E, \mathcal H)$ defined as in (2.9) and let $T_t = e^{-tH}$, $t \ge 0$. Then $\{T_t\}_{t \ge 0}$ is a J-real, strongly continuous, symmetric Markovian semigroup.

Proof. Clearly $\mathcal E[\cdot] \ge 0$. Thus Theorem 2.1 and (2.3) imply that $\{T_t\}_{t \ge 0}$ is J-real, strongly continuous and sub-Markovian. It follows from Theorem 2.1(a) that $T_t\xi_0 = \xi_0$ for any $t \ge 0$. Thus $\{T_t\}_{t \ge 0}$ is Markovian.

Remark 2.1. Consider the symmetric embedding $i_0 : \mathcal M \to \mathcal H$, $i_0(A) = \Delta^{1/4}A\xi_0$. Define the maps $S_t$ on $\mathcal M$ by
$$S_t : \mathcal M \to \mathcal M, \qquad i_0 \circ S_t \equiv T_t \circ i_0.$$
C. Bahn, C. K. Ko & Y. M. Park
It follows from [2, Theorem 2.12] that $\{S_t\}_{t \ge 0}$ is a weak* continuous, Markovian semigroup on $\mathcal M$. Notice that $S_t$ preserves parity: $S_t(\gamma(A)) = \gamma(S_t(A))$, where $A \in \mathcal M$. This can be checked from the fact that $U_\gamma H U_\gamma^{-1} = H$, where $H$ is defined in Theorem 2.2.

Remark 2.2. A notion of admissible function similar to that used in the formula (2.9) (also [1, (3.4)]) has already appeared in the diffusion type generator [15, formula (2)]. However, in order to make the generator in [15] well-defined, one needs some kind of asymptotic abelianness [15–17]. On the other hand, the Dirichlet forms we introduced in (2.9) do not need any asymptotic abelianness and reduce to the usual (canonical) forms [23, 5] in the case of tracial states.

We now give the proof of Theorem 2.1. Let us mention that we modify [1, proof of Theorem 3.1].

Proof of Theorem 2.1. (a) Notice that $U_\gamma\xi_0 = \xi_0$ and $JA\xi_0 = \Delta^{1/2}A^*\xi_0$ for any $A \in \mathcal M$. We obtain
$$\begin{aligned} (\sigma_{t-i/4}(x) - j(\sigma_{t-i/4}(x^*))U_\gamma)\xi_0 &= \Delta^{1/4}\sigma_t(x)\xi_0 - J\Delta^{1/4}\sigma_t(x^*)\Delta^{-1/4}\xi_0 \\ &= \Delta^{1/4}\sigma_t(x)\xi_0 - \Delta^{1/4}\sigma_t(x)\xi_0 = 0, \end{aligned}$$
and also $(\sigma_{t-i/4}(x^*) - j(\sigma_{t-i/4}(x))U_\gamma)\xi_0 = 0$. Thus (a) follows from (2.9) and the above facts.

(b) Notice that $U_\gamma x U_\gamma^{-1} = -x$ for any $x \in \mathcal M_o$. The first relation in (2.8) implies that
$$\sigma_{t-i/4}(x) = -U_\gamma\,\sigma_{t-i/4}(x)\,U_\gamma^{-1}. \tag{2.13}$$
Thus a direct calculation shows that for any $\xi \in \mathcal H$,
$$\begin{aligned} \big\|(\sigma_{t-i/4}(x) - j(\sigma_{t-i/4}(x^*))U_\gamma)J\xi\big\|^2 &= \big\|U_\gamma J(-j(\sigma_{t-i/4}(x))U_\gamma + \sigma_{t-i/4}(x^*))\xi\big\|^2 \\ &= \big\|(\sigma_{t-i/4}(x^*) - j(\sigma_{t-i/4}(x))U_\gamma)\xi\big\|^2. \end{aligned}$$
Here we have used that $U_\gamma^{-1} = U_\gamma$ and that $U_\gamma$ commutes with $J$. Thus we have $\mathcal E^{(1)}[J\xi] = \mathcal E^{(2)}[\xi]$. The same method also implies that $\mathcal E^{(2)}[J\xi] = \mathcal E^{(1)}[\xi]$.

(c) By the expression of $\mathcal E(\xi, \eta)$ in (2.9), $\mathcal E(\xi_+, \xi_-)$ can be written as
$$\mathcal E(\xi_+, \xi_-) = \mathcal E^{(1)}(\xi_+, \xi_-) + \mathcal E^{(2)}(\xi_+, \xi_-) = (I^{(1)} + II^{(1)}) + (I^{(2)} + II^{(2)}) \tag{2.14}$$
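where
$$\begin{aligned} I^{(1)} &= \int \big( \langle \sigma_{t-i/4}(x)\xi_+,\ \sigma_{t-i/4}(x)\xi_- \rangle + \langle \sigma_{t-i/4}(x^*)\xi_-,\ \sigma_{t-i/4}(x^*)\xi_+ \rangle \big) f(t)\,dt, \\ II^{(1)} &= -\int \big( \langle \sigma_{t-i/4}(x)\xi_+,\ j(\sigma_{t-i/4}(x^*))U_\gamma\xi_- \rangle + \langle j(\sigma_{t-i/4}(x^*))U_\gamma\xi_+,\ \sigma_{t-i/4}(x)\xi_- \rangle \big) f(t)\,dt, \end{aligned} \tag{2.15}$$
and $I^{(2)}$ and $II^{(2)}$ are obtained from $I^{(1)}$ and $II^{(1)}$ respectively by replacing $x$ by $x^*$ in the above. Here we have used the facts that $U_\gamma = U_\gamma^{-1}$, the antilinearity of $J$ and (2.13) to obtain the second term of $I^{(1)}$ in (2.15). As a consequence of [22, Theorem 4(7)], $\mathcal M\xi_+ \perp \mathcal M\xi_-$, which implies $I^{(1)} = 0$. Similarly $I^{(2)} = 0$. See also [2, proof of Proposition 5.3(ii)].

Next, we consider $II^{(1)}$. Since $\sigma_{t-i/4}(x)^* = \sigma_{t+i/4}(x^*)$, it follows from (2.7) and (2.13) that
$$\langle \sigma_{t-i/4}(x)\xi_+,\ j(\sigma_{t-i/4}(x^*))U_\gamma\xi_- \rangle = \langle \xi_+,\ \sigma_{t+i/4}(x^*)\, j(\sigma_{t+i/4}(x)^*)\, U_\gamma\xi_- \rangle \tag{2.16}$$
and
$$\begin{aligned} \langle j(\sigma_{t-i/4}(x^*))U_\gamma\xi_+,\ \sigma_{t-i/4}(x)\xi_- \rangle &= \langle -U_\gamma\, j(\sigma_{t-i/4}(x^*))\xi_+,\ -U_\gamma\,\sigma_{t-i/4}(x)\,U_\gamma\xi_- \rangle \\ &= \langle \xi_+,\ \sigma_{t-i/4}(x)\, j(\sigma_{t-i/4}(x^*)^*)\, U_\gamma\xi_- \rangle. \end{aligned} \tag{2.17}$$
Substituting (2.16) and (2.17) into the second expression of (2.15), we get that
$$II^{(1)} = -\int \langle \xi_+,\ \sigma_{t+i/4}(x^*)\, j(\sigma_{t+i/4}(x)^*)\, U_\gamma\xi_- \rangle f(t)\,dt - \int \langle \xi_+,\ \sigma_{t-i/4}(x)\, j(\sigma_{t-i/4}(x^*)^*)\, U_\gamma\xi_- \rangle f(t)\,dt.$$
Notice that for any $x \in \mathcal M_{an}$ and $\xi \in \mathcal H$, the map
$$z \mapsto j(\sigma_z(x)^*)\,U_\gamma\xi$$
is analytic on a domain containing the strip $I_{1/2}$. In fact, the analyticity follows from the fact that $\langle \eta, j(\sigma_z(x)^*)U_\gamma\xi \rangle = \langle \sigma_z(x)^* J U_\gamma\xi, J\eta \rangle = \langle J U_\gamma\xi, \sigma_z(x) J\eta \rangle$ for any $\eta, \xi \in \mathcal H$, and that weak analyticity implies strong analyticity (see [24, Theorem VI.4]). Using the Cauchy integral theorem, the assumption in the theorem, the property (c) in Definition 2.1 and $\sigma_t(x)^* = \sigma_t(x^*)$, we obtain
$$II^{(1)} = -\int \langle \xi_+,\ \sigma_t(x^*)\, j(\sigma_t(x^*))\, U_\gamma\xi_- \rangle\, f\Big(t - \frac{i}{4}\Big)\,dt - \int \langle \xi_+,\ \sigma_t(x)\, j(\sigma_t(x))\, U_\gamma\xi_- \rangle\, f\Big(t + \frac{i}{4}\Big)\,dt.$$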
Replacing $x$ by $x^*$ in the above, we obtain the expression of $II^{(2)}$. Thus we get
$$II = -\int \big\langle \xi_+,\ [\sigma_t(x)\, j(\sigma_t(x)) + \sigma_t(x^*)\, j(\sigma_t(x^*))]\, U_\gamma\xi_- \big\rangle \cdot \Big( f\Big(t - \frac{i}{4}\Big) + f\Big(t + \frac{i}{4}\Big) \Big)\,dt.$$
Recall that $U_\gamma\mathcal P \subset \mathcal P$ in (2.7). Since
$$\sigma_t(x^\#)\, j(\sigma_t(x^\#))\, U_\gamma\xi_- \in \mathcal P, \qquad \langle \xi_+,\ \sigma_t(x^\#)\, j(\sigma_t(x^\#))\, U_\gamma\xi_- \rangle \ge 0 \quad \text{for all } t \in \mathbb R,$$
where $x^\#$ is either $x$ or $x^*$. By the property (b) in Definition 2.1, we conclude that $II \le 0$. This proves part (c) of the theorem.

Clearly $\mathcal E[\cdot] \ge 0$. Note that $\mathcal E(\xi, \xi_0) = 0$ for all $\xi \in \mathcal H$. Theorem 2.1(b) and (c) imply that the form $(\mathcal E, \mathcal H)$ satisfies the conditions in (2.2). Thus $(\mathcal E, \mathcal H)$ is a Dirichlet form.

3. Markovian Semigroups on CAR Algebras with Quasi-Free States

In this section, we apply the results in Sec. 2 to construct Dirichlet forms and associated Markovian semigroups on CAR algebras with respect to quasi-free states. We first review the notion of CAR algebras. For the details we refer to [7, Sec. 5.2.2].

Let $\mathfrak h_0$ be a separable pre-Hilbert space with an inner product $(\cdot, \cdot)$ and $\mathfrak h$ the completion of $\mathfrak h_0$. Let $\mathcal A(\mathfrak h)$ be the $C^*$-algebra generated by the identity $1$ and elements $a(f)$, $f \in \mathfrak h$, satisfying

(a) $f \mapsto a(f)$ is antilinear,
(b) $\{a(f), a(g)\} = 0$,
(c) $\{a(f), a(g)^*\} = (f, g)1$ (3.1)

for all $f, g \in \mathfrak h$, where $\{A, B\} := AB + BA$. Notice that
$$\|a(f)\| = \|f\|, \qquad \forall\, f \in \mathfrak h, \tag{3.2}$$
and so $\mathcal A(\mathfrak h_0) = \mathcal A(\mathfrak h)$. Let $\gamma : \mathcal A(\mathfrak h) \to \mathcal A(\mathfrak h)$ be the $*$-automorphism defined by
$$\gamma(a(f)) = -a(f), \tag{3.3}$$
for all $f \in \mathfrak h$. Then $(\mathcal A(\mathfrak h), \gamma)$ is a $\mathbb Z_2$-graded $C^*$-algebra.

Next, we describe quasi-free states on $\mathcal A(\mathfrak h)$. Let $A$ be a bounded and nonnegative operator on $\mathfrak h$. Recall that a vector $g \in \mathfrak h$ is called an analytic vector for an operator $B$ on $\mathfrak h$ if for each $n \in \mathbb N$, $g \in D(B^n)$ and for some $t > 0$,
$$\sum_{n=0}^{\infty} \frac{\|B^n g\|}{n!}\, t^n < \infty.$$
In the rest of this paper, we assume that $A$ satisfies the following properties:

Assumption 3.1.
(a) There exists $\alpha > 0$ such that $0 < A \le \alpha 1$.
(b) The inverse $A^{-1}$ of $A$ exists as an (unbounded) self-adjoint and positive operator on $\mathfrak h$.
(c) For any $z \in \mathbb C$, $A^z$ leaves $\mathfrak h_0$ invariant, i.e. $A^z\mathfrak h_0 \subset \mathfrak h_0$. Moreover, $z \mapsto A^z f$ is entire analytic for any $f \in \mathfrak h_0$.
(d) Any $g \in \mathfrak h_0$ is an analytic vector for $A^{-1/2}$.

We remark that a dense subspace $\mathfrak h_0$ of $\mathfrak h$ satisfying Assumption 3.1 exists by the spectral theorem.

Example 3.1 (Ideal Fermi Gases). Let $\mathfrak h$ be the space $L^2(\mathbb R^d, dx)$ and $\Delta$ the Laplacian operator on $\mathfrak h$. Let $A$ be given by $A = e^{-\beta(-\Delta - \mu 1)}$, where $\beta > 0$, $\mu \in \mathbb R$. The subspace $\mathfrak h_0$ is given by $\mathfrak h_0 = \{f \in \mathfrak h : \hat f \in C_c(\mathbb R^d)\}$, where $\hat f$ denotes the Fourier transform of $f$. Then Assumption 3.1 is satisfied with $\alpha = e^{\beta\mu}$.

For a given bounded operator $A$ on $\mathfrak h$ satisfying Assumption 3.1, the gauge invariant quasi-free state $\omega$ on $\mathcal A(\mathfrak h)$ is defined by
$$\omega(a(f_m)^* \cdots a(f_1)^*\, a(g_1) \cdots a(g_n)) = \delta_{nm} \det\big((g_i, A(1 + A)^{-1} f_j)\big) \tag{3.4}$$
for any $f_1, \dots, f_m, g_1, \dots, g_n \in \mathfrak h$. Let $\sigma_t : \mathcal A(\mathfrak h) \to \mathcal A(\mathfrak h)$ be the $*$-automorphisms defined by
$$\sigma_t(a(f)) = a(A^{it} f) \tag{3.5}$$
for any $f \in \mathfrak h$, $t \in \mathbb R$. Then one can check the KMS condition $\omega(B\sigma_{-i}(C)) = \omega(CB)$ for any analytic elements $B, C \in \mathcal A(\mathfrak h)$ [7].

Let $(\mathcal H_\omega, \pi_\omega, \Omega_\omega)$ be the GNS representation of $(\mathcal A(\mathfrak h), \omega)$. Define $\mathcal M_\omega := \pi_\omega(\mathcal A(\mathfrak h))''$, $a_\omega(f) := \pi_\omega(a(f))$, $a_\omega(f)^* := \pi_\omega(a(f)^*)$, $f \in \mathfrak h$, $\sigma_t^\omega := \pi_\omega\sigma_t\pi_\omega^{-1}$ and $\gamma_\omega := \pi_\omega\gamma\pi_\omega^{-1}$. From now on we suppress $\omega$ from the notations, e.g. $\mathcal M := \mathcal M_\omega$, $\mathcal H := \mathcal H_\omega$, $a(f) := a_\omega(f)$, $a(f)^* := a_\omega(f)^*$, $\sigma_t := \sigma_t^\omega$ and $\gamma := \gamma_\omega$ etc. By continuity, $\sigma_t$ and $\gamma$ extend to $\mathcal M$. We also write $\xi_0 = \Omega_\omega$. Notice that $\omega$ satisfies the $\sigma_t$-KMS conditions [7]. Thus, by the uniqueness of the modular automorphism, $\sigma_t$ is the modular automorphism on $\mathcal M$. See [7, Theorem 5.3.10]. Let $\Delta$ and $J$ denote the modular operator and the modular conjugation associated with the pair $(\mathcal M, \xi_0)$ respectively. Thus $\sigma_t(B) = \Delta^{it}B\Delta^{-it}$, $\forall\, B \in \mathcal M$. It follows from (3.3) and (3.4) that
$$\omega(\gamma(B)) = \omega(B), \qquad \forall\, B \in \mathcal M.$$
834
00182
C. Bahn, C. K. Ko & Y. M. Park
Since $\omega$ is $\gamma$-invariant, that is, $\langle \xi_0, \gamma(B)\xi_0 \rangle = \langle \xi_0, B\xi_0 \rangle$, $\forall\, B \in \mathcal M$, there exists a unitary operator $U_\gamma$ on $\mathcal H$ satisfying the properties in (2.7). For any $g \in \mathfrak h$, define an odd element $B(g) \in \mathcal M_o$ by
$$B(g) := \frac{1}{\sqrt 2}\,(a(g) + a(g)^*). \tag{3.6}$$
Using the antilinearity of $a(g)$ and (3.6), we get that for any $g \in \mathfrak h$,
$$a(g) = \frac{1}{\sqrt 2}\,(B(g) + iB(ig)), \qquad a(g)^* = \frac{1}{\sqrt 2}\,(B(g) - iB(ig)). \tag{3.7}$$
From the CAR relations (3.1), we obtain that for any $f, g \in \mathfrak h$,
$$\{a(f), B(g)\} = \frac{1}{\sqrt 2}\,(f, g)1, \qquad \{a(f)^*, B(g)\} = \frac{1}{\sqrt 2}\,(g, f)1. \tag{3.8}$$
(3.9)
In fact, using (3.4) and Assumption 3.1(c), one can check that for any f ∈ h0 and ξ ∈ Mf in ξ0 , the map z 7→ σz (a(f )# )ξ is the analytic extension on C of the map t 7→ σt (a(f )# )ξ (where a(f )# is either a(f ) or a(f )∗ ). Thus for any g ∈ h0 , a(g), a(g)∗ and B(g) are σt -entire analytic odd elements of M. Let us turn to construction of a Dirichlet form which generates the symmetric Markovian semigroup on CAR algebra A(h) with respectR to the quasi-free state ω. An admissible function f is said to be normalized if f (t) dt = 1. For given normalized function f and a complete orthonormal system (CONS) {gj }∞ j=1 ⊂ h0 for h, define a sesquilinear form E : D(E) × D(E) → C as follows: ( ) ∞ X D(E) = ξ ∈ H : Ej (ξ, ξ) < ∞ , j=1
E(η, ξ) =
∞ X j=1
Ej (η, ξ) ,
(3.10)
η, ξ ∈ D(E)
where for each $j \in \mathbb N$, $D(\mathcal E_j) = \mathcal H$,
$$\begin{aligned} \mathcal E_j(\eta, \xi) ={}& \int \big\langle (\sigma_{t-i/4}(a(g_j)) - j(\sigma_{t-i/4}(a(g_j)^*))U_\gamma)\eta,\ (\sigma_{t-i/4}(a(g_j)) - j(\sigma_{t-i/4}(a(g_j)^*))U_\gamma)\xi \big\rangle f(t)\,dt \\ &+ \int \big\langle (\sigma_{t-i/4}(a(g_j)^*) - j(\sigma_{t-i/4}(a(g_j)))U_\gamma)\eta,\ (\sigma_{t-i/4}(a(g_j)^*) - j(\sigma_{t-i/4}(a(g_j)))U_\gamma)\xi \big\rangle f(t)\,dt. \end{aligned} \tag{3.11}$$
We also define the associated quadratic forms by
$$\mathcal E_j[\xi] = \mathcal E_j(\xi, \xi), \quad \xi \in \mathcal H, \ j \in \mathbb N, \qquad \mathcal E[\xi] = \sum_{j=1}^{\infty} \mathcal E_j[\xi], \quad \xi \in D(\mathcal E). \tag{3.12}$$
We remark that the expression $\mathcal E_j(\eta, \xi)$ in (3.11) can be obtained from $\mathcal E(\eta, \xi)$ in (2.9) by replacing $x$ by $a(g_j)$.

The following are the main results in this section. It turns out that the form defined as in (3.10) and (3.11) is independent of the CONS $\{g_j\} \subset \mathfrak h_0$ for $\mathfrak h$ and of the normalized admissible function $f$ we have chosen:

Theorem 3.1. Let $(\mathcal E, D(\mathcal E))$ be defined as in (3.10) and (3.11). Then the form $(\mathcal E, D(\mathcal E))$ is a densely defined Dirichlet form, and is independent of the CONS $\{g_j\} \subset \mathfrak h_0$ for $\mathfrak h$ and of the normalized admissible function $f$ we have chosen. Moreover, let $\{T_t\}_{t \ge 0}$ be the semigroup associated to the form $(\mathcal E, D(\mathcal E))$. Then the semigroup $\{T_t\}_{t \ge 0}$ is J-real, strongly continuous, symmetric and Markovian.

Theorem 3.2. Let $\{T_t\}_{t \ge 0}$ be the symmetric Markovian semigroup associated to the form $(\mathcal E, D(\mathcal E))$ in Theorem 3.1 and $H$ the Dirichlet operator, i.e., $T_t = e^{-tH}$, $t \ge 0$. Then the following results hold:
(a) $H$ is essentially self-adjoint on $\mathcal M_{fin}\xi_0$.
(b) Zero is a simple eigenvalue of $H$ with eigenvector $\xi_0$. Moreover $(0, 2) \cap \sigma(H) = \emptyset$.

By the spectral theorem, Theorem 3.2(b) implies that for any $\xi \in \mathcal H$ and $t \ge 0$,
$$\|T_t\xi - \langle \xi_0, \xi \rangle\xi_0\|_{\mathcal H} \le e^{-2t}\, \|\xi - \langle \xi_0, \xi \rangle\xi_0\|_{\mathcal H}.$$
Thus $\{T_t\}_{t \ge 0}$ converges to the equilibrium exponentially fast. Note that Theorem 3.2(b) implies that the vector $\xi_0$ is a simple, strictly positive ground state for the generator $H$. In view of [20], $T_t$ satisfies the indecomposability and the ergodicity (for each positive $\xi, \eta$, there exists $t > 0$ such that $\langle \xi, T_t\eta \rangle > 0$). See [20, Theorem 4.3].
Remark 3.1. The main results in this section can be generalized in several ways. For instance, if one replaces gj by B λ gj (for the definition of B see the Eq. (4.1)), j ∈ N, for some λ ∈ R in the definition of E(η, ξ) in (3.10) and (3.11), and modifies Assumption 3.1(d) appropriately, then all of the results in this section still hold with a modified spectral gap in Theorem 3.2(b), i.e. (0, 21+2λ ) ∩ σ(H) = ∅. Remark 3.2. The Markovian semigroup in Theorem 3.2 commutes with the modular operator (group). In fact, let CONS{gj } be used to construct the Dirichlet operator H. See (4.11). It is easy to check that ∆it H∆−it is the Dirichlet operator corresponding to CONS{Ait gj }. Since H is independent of CONS{gj } chosen, H = ∆it H∆−it for any t ∈ R. Remark 3.3. It may be worthwhile comparing the results in Theorems 3.1 and 3.2 with those in [15, Lemma III.1 and Theorem III.2]. Since the free Fermion system satisfies the condition of L1 -asymptotic abelianness [7], the authors of [15] used the formula (2) of [15] to construct a completely positive unit preserving semigroup Qf,φ on (A(h), ω) for any function f ∈ Fβ and any vector φ ∈ H0 ⊂ h, where Fβ t and H0 are explicitly defined in [15]. They also showed that if any state ω e is left invariant under Qf,φ for all f ∈ F and φ ∈ H , then ω e is the equilibrium state ω. β 0 t On the other hand, Theorem 3.2(b) implies that the cyclic vector ξ0 corresponding to the quasi-free state ω is a unique invariant vector under Tt and for any ξ ∈ H, Tt ξ converges to hξ0 , ξiξ0 exponentially fast. In the rest of this section, we produce the proof of Theorem 3.1. The proof of Theorem 3.2 will be postponed to the next section. Proof of Theorem 3.1. Notice that a(gj ) ∈ Mo ∩ Man , ∀ j ∈ N (see below Eq. (3.9)). Since kσt+is (a(gj ))k = kAs gj k (see (3.9) and (3.2)), the assumption (2.12) is satisfied. Thus Theorem 2.1 impies that each Ej , j ∈ N, is a Dirichlet form. 
Thus if one can show that $D(\mathcal{E})$ is dense in $\mathcal{H}$, then it follows from [2, Theorem 5.2(ii)] that $(\mathcal{E}, D(\mathcal{E}))$ is a Dirichlet form and it generates a $J$-real, strongly continuous, symmetric and sub-Markovian semigroup $T_t = e^{-tH}$, $t \ge 0$. Using the calculations used in the proof of Theorem 2.1(a), we get that for each $j \in \mathbb{N}$,
\[
\bigl(\sigma_{t-i/4}(a(g_j)^{\#}) - j(\sigma_{t-i/4}((a(g_j)^{\#})^{*}))U_\gamma\bigr)\xi_0 = 0 .
\]
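Thus $H\xi_0 = 0$ and so $\{T_t\}_{t \ge 0}$ is Markovian. Let us show that $D(\mathcal{E})$ is dense. Since $\mathcal{M}_{fin}\xi_0$ is dense in $\mathcal{H}$, it is sufficient to show that $\mathcal{M}_{fin}\xi_0$ is contained in $D(\mathcal{E})$. For any $B \in \mathcal{M}$, it follows from (2.11) and (3.9) that
\begin{align}
\bigl(\sigma_{t-i/4}(a(g_j)) - j(\sigma_{t-i/4}(a(g_j)^{*}))U_\gamma\bigr)B\xi_0
&= a(A^{it-1/4} g_j)B\xi_0 - \gamma(B)\,a(A^{it-1/4} g_j)\xi_0 \notag\\
&= \begin{cases}
[a(A^{it-1/4} g_j), B]\,\xi_0 & \text{if } B \in \mathcal{M}_e \\
\{a(A^{it-1/4} g_j), B\}\,\xi_0 & \text{if } B \in \mathcal{M}_o
\end{cases}
\tag{3.13}
\end{align}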
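and
\begin{equation}
\bigl(\sigma_{t-i/4}(a(g_j)^{*}) - j(\sigma_{t-i/4}(a(g_j)))U_\gamma\bigr)B\xi_0
= \begin{cases}
[a(A^{it+1/4} g_j)^{*}, B]\,\xi_0 & \text{if } B \in \mathcal{M}_e \\
\{a(A^{it+1/4} g_j)^{*}, B\}\,\xi_0 & \text{if } B \in \mathcal{M}_o .
\end{cases}
\tag{3.14}
\end{equation}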
Let $B \in \mathcal{M}_{fin}$ be any element of the form $B = B(f_1) \cdots B(f_n)$, $f_1, \dots, f_n \in \mathfrak{h}_0$, $n \in \mathbb{N}$ (see (3.6) for the definition of $B(f_i)$). Consider the case $n = 2m$, that is, $B \in \mathcal{M}_e$. Using the CARs (3.1) and (3.8), we obtain that
\begin{align}
[a(A^{it-1/4} g_j), B] &= \sum_{k=1}^{n} \frac{(-1)^{k+1}}{\sqrt{2}}\,(A^{it-1/4} g_j, f_k)\, B(f_1) \cdots \widehat{B(f_k)} \cdots B(f_n) , \notag\\
[a^{*}(A^{it+1/4} g_j), B] &= \sum_{k=1}^{n} \frac{(-1)^{k+1}}{\sqrt{2}}\,(f_k, A^{it+1/4} g_j)\, B(f_1) \cdots \widehat{B(f_k)} \cdots B(f_n) ,
\tag{3.15}
\end{align}
where $\widehat{B(f)}$, $f \in \mathfrak{h}_0$, denotes that $B(f)$ is omitted. By the Schwarz inequality and the Bessel inequality it is easy to see that for any $h_1, h_2 \in \mathfrak{h}_0$, $t \in \mathbb{R}$,
\[
\sum_{j=1}^{\infty} |(A^{it} h_1, g_j)(g_j, A^{it} h_2)| \le \|h_1\|\,\|h_2\| .
\]
Using the above inequality and the bound $\|B(h)\| \le \|h\|$ for any $h \in \mathfrak{h}$, we get from (3.13)–(3.15) that
\[
\sum_{j=1}^{\infty} \mathcal{E}_j[B\xi_0] < \infty .
\]
Thus $B\xi_0 \in D(\mathcal{E})$. The same result holds for odd $n$. Hence $\mathcal{M}_{fin}\xi_0$ is contained in $D(\mathcal{E})$ and so $D(\mathcal{E})$ is dense.

Next we will show that $(\mathcal{E}, D(\mathcal{E}))$ is independent of the choice of the CONS$\{g_j\} \subset \mathfrak{h}_0$ for $\mathfrak{h}$ and of the normalized admissible function $f$. Using the Parseval relations, we get that for all $t \in \mathbb{R}$, $k, k' \in \mathbb{N}$,
\[
\sum_{j=1}^{\infty} (f_k, A^{it-1/4} g_j)(A^{it-1/4} g_j, f_{k'}) = (f_k, A^{-1/2} f_{k'})
\]
and
\[
\sum_{j=1}^{\infty} (A^{it+1/4} g_j, f_k)(f_{k'}, A^{it+1/4} g_j) = (f_{k'}, A^{1/2} f_k) .
\]
It follows from (3.12)–(3.15), the above relations and the dominated convergence theorem that
\begin{align}
\mathcal{E}[B\xi_0] = \sum_{k=1}^{n} \sum_{k'=1}^{n} \frac{(-1)^{k+k'}}{2}
&\bigl\{(f_k, A^{-1/2} f_{k'}) + (f_{k'}, A^{1/2} f_k)\bigr\} \notag\\
&\cdot \bigl\langle B(f_1) \cdots \widehat{B(f_k)} \cdots B(f_n)\xi_0 ,\; B(f_1) \cdots \widehat{B(f_{k'})} \cdots B(f_n)\xi_0 \bigr\rangle ,
\tag{3.16}
\end{align}
where $B\xi_0 = B(f_1) \cdots B(f_n)\xi_0 \in \mathcal{M}_{fin}\xi_0$. It turns out that $\mathcal{M}_{fin}\xi_0$ is a core for the Dirichlet operator $H$ associated to the form $(\mathcal{E}, D(\mathcal{E}))$ (Theorem 3.2(a)). Thus $H$ is independent of the normalized admissible function $f$ and of the CONS$\{g_j\} \subset \mathfrak{h}_0$, and so is $(\mathcal{E}, D(\mathcal{E}))$.

4. Decomposition of Quasi-Free Hilbert Space: Proof of Theorem 3.2

As in [3], we will decompose the Hilbert space $\mathcal{H} = \mathcal{H}_\omega$ into the direct sum of $\mathcal{H}^{(m,n)}$, $m, n \in \mathbb{N} \cup \{0\}$, where $\mathcal{H}^{(m,n)}$ is the Hilbert space of $m$ quasi-particles and $n$ anti quasi-particles. We then use the results to prove Theorem 3.2. The decomposition method we use is essentially the same as that in [21]. See also [7, Example 5.2.20]. Denote by $B$ the operator given by
\begin{equation}
B := A^{-1/2} + A^{1/2} .
\tag{4.1}
\end{equation}
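In this section, $a(g)^{\#}$, $g \in \mathfrak{h}_0$, stands for $a(g)$ or $a(g)^{*}$, and we write
\begin{equation}
\delta(a(g)^{\#}) := a(g)^{\#} - j(\sigma_{-i/2}((a(g)^{\#})^{*}))U_\gamma = a(g)^{\#} + U_\gamma\, j(\sigma_{-i/2}((a(g)^{\#})^{*})) ,
\tag{4.2}
\end{equation}
as a bounded operator on $\mathcal{H}$. For $g \in \mathfrak{h}_0$, $C \in \mathcal{M}$, we have
\begin{equation}
\delta(a(g)^{\#})C\xi_0 = a(g)^{\#}C\xi_0 - \gamma(C)\,a(g)^{\#}\xi_0
\tag{4.3}
\end{equation}
(see the argument of (2.11)). Recall (2.13), $U_\gamma^{-1} = U_\gamma$ and the fact that $U_\gamma$ commutes with $J$ and $\Delta$. It follows from (3.9) and (4.2) that for any $g \in \mathfrak{h}_0$
\begin{align}
\delta(a(B^{-1/2}A^{-1/4}g)) &= a(B^{-1/2}A^{-1/4}g) - j(\sigma_{-i/2}(a(B^{-1/2}A^{-1/4}g)^{*}))U_\gamma \notag\\
&= a(B^{-1/2}A^{-1/4}g) + U_\gamma\, j(a(B^{-1/2}A^{1/4}g)^{*}) , \notag\\
\delta(a(B^{-1/2}A^{1/4}g)^{*}) &= a(B^{-1/2}A^{1/4}g)^{*} - j(\sigma_{-i/2}(a(B^{-1/2}A^{1/4}g)))U_\gamma \notag\\
&= a(B^{-1/2}A^{1/4}g)^{*} + U_\gamma\, j(a(B^{-1/2}A^{-1/4}g)) .
\tag{4.4}
\end{align}
Since $(j(a(g)))^{*} = j(a(g)^{*})$ and $U_\gamma = U_\gamma^{-1} = U_\gamma^{*}$, a computation shows that
\begin{align}
(\delta(a(B^{-1/2}A^{-1/4}g)))^{*} &= a(B^{-1/2}A^{-1/4}g)^{*} + j(a(B^{-1/2}A^{1/4}g))U_\gamma \notag\\
&= a(B^{-1/2}A^{-1/4}g)^{*} - U_\gamma\, j(a(B^{-1/2}A^{1/4}g)) \notag\\
&= a(B^{1/2}A^{1/4}g)^{*} - \delta(a(B^{-1/2}A^{3/4}g)^{*}) .
\tag{4.5}
\end{align}
Here we have used the fact that $B^{-1/2}(A^{-1/4} + A^{3/4}) = B^{1/2}A^{1/4}$. By a similar computation, we get
\begin{equation}
(\delta(a(B^{-1/2}A^{1/4}g)^{*}))^{*} = a(B^{1/2}A^{-1/4}g) - \delta(a(B^{-1/2}A^{-3/4}g)) .
\tag{4.6}
\end{equation}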
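For completeness, the operator identity invoked after (4.5) can be checked in one line from the definition (4.1). The sketch below is only formal: it assumes that all operators are applied to vectors in $\mathfrak{h}_0$ and uses that $A$, $B$ and their powers, being functions of the same positive operator $A$, commute.
\[
B^{-1/2}\bigl(A^{-1/4} + A^{3/4}\bigr)
= B^{-1/2} A^{1/4}\bigl(A^{-1/2} + A^{1/2}\bigr)
= B^{-1/2} A^{1/4} B
= B^{1/2} A^{1/4} .
\]
The identity behind (4.6) follows in the same way from $A^{1/4} + A^{-3/4} = A^{-1/4}B$.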
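For notational brevity, we write for any $g \in \mathfrak{h}_0$
\begin{equation}
D_1(g) := \delta(a(B^{-1/2}A^{-1/4}g)) , \qquad D_2(g) := \delta(a(B^{-1/2}A^{1/4}g)^{*}) .
\tag{4.7}
\end{equation}
Then it follows from (4.5)–(4.7) that
\begin{align}
D_1(g)^{*} &= a(B^{1/2}A^{1/4}g)^{*} - D_2(A^{1/2}g) , \notag\\
D_2(g)^{*} &= a(B^{1/2}A^{-1/4}g) - D_1(A^{-1/2}g) .
\tag{4.8}
\end{align}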
We first collect some properties of $D_i(g)$ for $g \in \mathfrak{h}_0$ and $i = 1, 2$.

Lemma 4.1. $D_i(g)\xi_0 = 0$ for any $g \in \mathfrak{h}_0$ and $i = 1, 2$.

Proof. This follows from (4.3) and (4.7).

Lemma 4.2. The following relations hold for any $g, h \in \mathfrak{h}_0$:
(a) $\{D_i(g), D_j(h)\} = 0$, $i = 1, 2$, $j = 1, 2$,
(b) $\{D_1(g), a(h)\} = 0$,
(c) $\{D_2(g), a(h)^{*}\} = 0$,
(d) $\{D_1(g), a(B^{1/2}A^{1/4}h)^{*}\} = (g, h)1$,
(e) $\{D_2(g), a(B^{1/2}A^{-1/4}h)\} = (h, g)1$.

Proof. Notice that $a(h)^{\#}U_\gamma = -U_\gamma a(h)^{\#}$ for $h \in \mathfrak{h}_0$. The proofs follow from direct computations using the definitions in (4.7) and the CARs in (3.1).

Proposition 4.1. The following canonical anti-commutation relations (CARs) hold for any $g, h \in \mathfrak{h}_0$:
(a) $\{D_1(g), D_1(h)^{*}\} = (g, h)1$, $\{D_1(g), D_1(h)\} = 0$, $\{D_1(g)^{*}, D_1(h)^{*}\} = 0$,
(b) $\{D_2(g), D_2(h)^{*}\} = (h, g)1$, $\{D_2(g), D_2(h)\} = 0$, $\{D_2(g)^{*}, D_2(h)^{*}\} = 0$,
(c) $\{D_1(g), D_2(h)\} = 0$, $\{D_1(g), D_2(h)^{*}\} = 0$, $\{D_1(g)^{*}, D_2(h)\} = 0$, $\{D_1(g)^{*}, D_2(h)^{*}\} = 0$.

Proof. The anti-commutation relations follow from (4.7), (4.8), (3.1) and Lemma 4.2.

Now we are ready to decompose the Hilbert space $\mathcal{H} = \mathcal{H}_\omega$, called the quasi-free Hilbert space. According to Lemma 4.1 and the CARs in Proposition 4.1, $D_i(g)$ and $D_i(h)^{*}$, $i = 1, 2$, $g, h \in \mathfrak{h}_0$, can be thought of as annihilation and creation operators respectively. We remark that $h \mapsto D_1(h)^{*}$ is linear, but $g \mapsto D_2(g)^{*}$ is conjugate linear. With an abuse of terminology, we call $D_1(h)^{*}$ and $D_2(h)^{*}$ the creation operators for quasi-particles and anti quasi-particles respectively for $h \in \mathfrak{h}_0$. The following is the decomposition of $\mathcal{H}$:
Theorem 4.1. The following decomposition holds:
\[
\mathcal{H} = \bigoplus_{m,n=0}^{\infty} \mathcal{H}^{(m,n)} ,
\]
where for each $m, n \in \mathbb{N} \cup \{0\}$, $\mathcal{H}^{(m,n)}$ is the closure of the subspace spanned by the vectors of the form
\[
\Bigl(\prod_{j=1}^{m} D_1(g_j)^{*}\Bigr)\Bigl(\prod_{l=1}^{n} D_2(h_l)^{*}\Bigr)\xi_0 , \qquad g_j, h_l \in \mathfrak{h}_0 .
\]
In the case in which $m = 0$ ($n = 0$), we replace the operator in the first (second) parenthesis in the above by the identity.

Proof. Recall that $\mathcal{M}_{fin}$, the algebra generated by $1$ and $B(f)$, $f \in \mathfrak{h}_0$, is dense in $\mathcal{M}$. It follows from (4.8) and (3.6) that any $B(g)$, $g \in \mathfrak{h}_0$, can be written as a linear combination of four operators $D_i(h)^{\#}$, $h \in \mathfrak{h}_0$, $i = 1, 2$. Thus any $(\prod_{l=1}^{m} B(g_l))\xi_0$, $g_l \in \mathfrak{h}_0$, $l = 1, \dots, m$, can be expressed as a finite linear combination of the vectors of the form
\[
\Bigl(\prod_{j=1}^{p} D_1(g_j)^{\#}\Bigr)\Bigl(\prod_{l=1}^{q} D_2(h_l)^{\#}\Bigr)\xi_0 , \qquad g_j, h_l \in \mathfrak{h}_0 ,
\]
where $D(g)^{\#}$ is either $D(g)$ or $D(g)^{*}$. Using Lemma 4.1 and the CARs in Proposition 4.1, the above vector can be expressed as a finite linear combination of the vectors of the form
\[
\Bigl(\prod_{j=1}^{p'} D_1(g_j)^{*}\Bigr)\Bigl(\prod_{l=1}^{q'} D_2(h_l)^{*}\Bigr)\xi_0 , \qquad g_j, h_l \in \mathfrak{h}_0 , \quad p', q' \in \mathbb{N} \cup \{0\} .
\]
The set of finite linear combinations of the vectors of the above form is dense in $\mathcal{H}$. Thus the decomposition follows from Lemma 4.1 and the CARs in Proposition 4.1.
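Recall that $\mathfrak{h}_0$ is a dense subspace of a complex Hilbert space $\mathfrak{h}$. Let $\mathcal{F} = \mathcal{F}(\mathfrak{h})$ be the anti-symmetric Fock space over $\mathfrak{h}$, and $a(g)$ and $a(g)^{*}$, $g \in \mathfrak{h}$, the annihilation and creation operators on $\mathcal{F}$ respectively. Denote by $\Omega$ the vacuum vector in $\mathcal{F}$. Let $C : \mathfrak{h} \to \mathfrak{h}$ be an anti-unitary operator. If $\mathfrak{h}$ is an $L^2$-space, one may take $C$ to be the complex conjugation. Denote by $\Gamma(C)$ the second quantization of $C$. See [7, Sec. 5.2.1]. Let $\mathcal{F}_1$, $\Omega_1$, $a_1(g)$ and $a_1(g)^{*}$, $g \in \mathfrak{h}$, be identical copies of $\mathcal{F}$, $\Omega$, $a(g)$ and $a(g)^{*}$, $g \in \mathfrak{h}$, respectively. Notice that $\Gamma(C)a(g)^{\#}\Gamma(C)^{-1} = a(Cg)^{\#}$. We write $\mathcal{F}_2 = \Gamma(C)\mathcal{F}\,(= \mathcal{F})$, $\Omega_2 = \Omega$, $a_2(g) = a(Cg)$, and $a_2(g)^{*} = a(Cg)^{*}$, $g \in \mathfrak{h}$. Then the following anti-commutation relations hold: for any $g, h \in \mathfrak{h}$
\begin{equation}
\{a_2(g), a_2(h)^{*}\} = (h, g)1 , \qquad \{a_2(g), a_2(h)\} = 0 .
\tag{4.9}
\end{equation}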
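Proposition 4.2. Let $U$ be the operator defined by
\[
U : \mathcal{H} \to \mathcal{F}_1 \otimes \mathcal{F}_2 , \qquad
\Bigl(\prod_{j=1}^{m} D_1(g_j)^{*}\Bigr)\Bigl(\prod_{l=1}^{n} D_2(h_l)^{*}\Bigr)\xi_0
\mapsto
\Bigl(\prod_{j=1}^{m} (a_1(g_j)^{*} \otimes 1)\Bigr)\Bigl(\prod_{l=1}^{n} (\theta \otimes a_2(h_l)^{*})\Bigr)\Omega_1 \otimes \Omega_2
\]
for $g_j, h_l \in \mathfrak{h}_0$, $j = 1, \dots, m$, $l = 1, \dots, n$, $m, n \in \mathbb{N} \cup \{0\}$, where $\theta$ is an operator which anti-commutes with $a(g)$, $a(h)^{*}$ for $g, h \in \mathfrak{h}_0$ and satisfies $\theta\Omega_1 = \Omega_1$. Then $U$ is unitary.

Proof. Notice that $\theta^2 = 1$ and for $g, h$ in $\mathfrak{h}_0$
\begin{equation}
(a_1(g)^{*} \otimes 1)(\theta \otimes a_2(h)^{*})\,\Omega_1 \otimes \Omega_2 = -(\theta \otimes a_2(h)^{*})(a_1(g)^{*} \otimes 1)\,\Omega_1 \otimes \Omega_2 .
\tag{4.10}
\end{equation}
Since $D_1(g)^{\#}$ and $a_1(g)^{\#}$, and $D_2(g)^{\#}$ and $a_2(g)^{\#}$, for $g \in \mathfrak{h}_0$ satisfy the same anti-commutation relations respectively by Proposition 4.1 and (4.9), the unitarity of $U$ follows from Lemma 4.1, (4.10) and $a_i(g)\Omega_i = 0$, $i = 1, 2$, for any $g \in \mathfrak{h}_0$.

We next turn to the spectral analysis of $H$, where $H$ is the generator of the symmetric Markovian semigroup $\{T_t\}_{t \ge 0}$ associated to the Dirichlet form $(\mathcal{E}, D(\mathcal{E}))$. Let us first describe the basic idea of the proof of Theorem 3.2. Recall that the vectors in $\mathcal{M}_{fin}\xi_0$ can be expressed as finite linear combinations of the vectors of the form
\[
\Bigl(\prod_{j=1}^{p} D_1(f_j)^{*}\Bigr)\Bigl(\prod_{l=1}^{q} D_2(h_l)^{*}\Bigr)\xi_0 , \qquad f_j, h_l \in \mathfrak{h}_0 , \quad p, q \in \mathbb{N} \cup \{0\} .
\]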
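Let $\{g_j\}_{j=1}^{\infty} \subset \mathfrak{h}_0$ be a CONS for $\mathfrak{h}$. In the proof of Theorem 3.1, we showed that the form $(\mathcal{E}, \mathcal{M}_{fin}\xi_0)$ is independent of the CONS$\{g_j\} \subset \mathfrak{h}_0$ for $\mathfrak{h}$. Thus one can show that for any $\xi \in \mathcal{M}_{fin}\xi_0$
\begin{align}
\mathcal{E}[\xi] &= \sum_{j=1}^{\infty} \int \bigl(\|D_1(A^{it}B^{1/2}g_j)\xi\|^2 + \|D_2(A^{it}B^{1/2}g_j)\xi\|^2\bigr) f(t)\,dt \notag\\
&= \sum_{j=1}^{\infty} \bigl(\|D_1(B^{1/2}g_j)\xi\|^2 + \|D_2(B^{1/2}g_j)\xi\|^2\bigr) \notag\\
&= \langle \xi, \hat{H}\xi \rangle ,
\tag{4.11}
\end{align}
where
\[
\hat{H} = \sum_{j=1}^{\infty} \bigl\{D_1(B^{1/2}g_j)^{*}D_1(B^{1/2}g_j) + D_2(B^{1/2}g_j)^{*}D_2(B^{1/2}g_j)\bigr\}
\]
as a bilinear form on $\mathcal{M}_{fin}\xi_0 \times \mathcal{M}_{fin}\xi_0$. If one can show that $\mathcal{M}_{fin}\xi_0 \subset D(H)$, $H = \hat{H}$ on $\mathcal{M}_{fin}\xi_0$, and that $\mathcal{M}_{fin}\xi_0$ is a core for $H$, then one expects that the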
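spectrum of $H$ can be analyzed completely. The following is one of the main results in this section:

Theorem 4.2. (a) Let $\tilde{H} : \mathcal{M}_{fin}\xi_0 \to \mathcal{H}$ be the operator defined by
\begin{align}
\tilde{H}\Bigl(\prod_{p=1}^{m} D_1(f_p)^{*}\Bigr)\Bigl(\prod_{q=1}^{n} D_2(h_q)^{*}\Bigr)\xi_0
&= \sum_{k=1}^{m} \Bigl(\prod_{p=1}^{k-1} D_1(f_p)^{*}\Bigr) D_1(Bf_k)^{*} \Bigl(\prod_{p=k+1}^{m} D_1(f_p)^{*}\Bigr)\Bigl(\prod_{q=1}^{n} D_2(h_q)^{*}\Bigr)\xi_0 \notag\\
&\quad + \sum_{k=1}^{n} \Bigl(\prod_{p=1}^{m} D_1(f_p)^{*}\Bigr)\Bigl(\prod_{q=1}^{k-1} D_2(h_q)^{*}\Bigr) D_2(Bh_k)^{*} \Bigl(\prod_{q=k+1}^{n} D_2(h_q)^{*}\Bigr)\xi_0
\tag{4.12}
\end{align}
for any $m, n \in \mathbb{N} \cup \{0\}$ and $f_p, h_q \in \mathfrak{h}_0$, $p = 1, \dots, m$, $q = 1, \dots, n$. Then the relation
\[
\mathcal{E}(\eta, \xi) = \langle \eta, \tilde{H}\xi \rangle
\]
holds for any $\eta, \xi \in \mathcal{M}_{fin}\xi_0$, where $\mathcal{E}$ is defined as in (3.10) and (3.11).
(b) $\tilde{H}$ is essentially self-adjoint and its self-adjoint extension, denoted by $\tilde{H}$ again, is equal to the Dirichlet operator $H$.

Proof. (a) Let $(\mathcal{E}_1, \mathcal{M}_{fin}\xi_0)$ be the form given by
\[
\mathcal{E}_1[\xi] = \sum_{j=1}^{\infty} \|D_1(B^{1/2}g_j)\xi\|^2 ,
\]
where $\{g_j\}$, $g_j \in \mathfrak{h}_0$, is a CONS for $\mathfrak{h}$. We write $\tilde{H} = \tilde{H}_1 + \tilde{H}_2$, where the image under $\tilde{H}_1$ (resp. $\tilde{H}_2$) is defined by the first (resp. second) vector on the right-hand side of (4.12). The CARs in Proposition 4.1(a) and Lemma 4.1 imply that
\[
\mathcal{E}_1\Bigl[\Bigl(\prod_{p=1}^{m} D_1(f_p)^{*}\Bigr)\xi_0\Bigr]
= \sum_{j=1}^{\infty} \sum_{p=1}^{m} \sum_{q=1}^{m} (-1)^{p+q} (B^{1/2}f_p, g_j)(g_j, B^{1/2}f_q)\, G(f_1, \dots, f_m; p, q) ,
\]
where
\[
G(f_1, \dots, f_m; p, q) := \Bigl\langle \Bigl(\prod_{\tau=1}^{p-1} D_1(f_\tau)^{*}\Bigr)\Bigl(\prod_{\tau=p+1}^{m} D_1(f_\tau)^{*}\Bigr)\xi_0 ,\; \Bigl(\prod_{\tau=1}^{q-1} D_1(f_\tau)^{*}\Bigr)\Bigl(\prod_{\tau=q+1}^{m} D_1(f_\tau)^{*}\Bigr)\xi_0 \Bigr\rangle .
\]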
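Using the Parseval relations and the fact that
\begin{align*}
\sum_{p=1}^{m} (-1)^{p+q} (f_p, Bf_q)\, G(f_1, \dots, f_m; p, q)
&= (-1)^{q-1} \Bigl\langle \Bigl(\prod_{p=1}^{m} D_1(f_p)^{*}\Bigr)\xi_0 ,\; D_1(Bf_q)^{*} \Bigl(\prod_{\tau=1}^{q-1} D_1(f_\tau)^{*}\Bigr)\Bigl(\prod_{\tau=q+1}^{m} D_1(f_\tau)^{*}\Bigr)\xi_0 \Bigr\rangle \\
&= \Bigl\langle \Bigl(\prod_{p=1}^{m} D_1(f_p)^{*}\Bigr)\xi_0 ,\; \Bigl(\prod_{\tau=1}^{q-1} D_1(f_\tau)^{*}\Bigr) D_1(Bf_q)^{*} \Bigl(\prod_{\tau=q+1}^{m} D_1(f_\tau)^{*}\Bigr)\xi_0 \Bigr\rangle ,
\end{align*}
we have
\begin{align*}
\mathcal{E}_1\Bigl[\Bigl(\prod_{p=1}^{m} D_1(f_p)^{*}\Bigr)\xi_0\Bigr]
&= \sum_{q=1}^{m} \Bigl\langle \Bigl(\prod_{p=1}^{m} D_1(f_p)^{*}\Bigr)\xi_0 ,\; \Bigl(\prod_{\tau=1}^{q-1} D_1(f_\tau)^{*}\Bigr) D_1(Bf_q)^{*} \Bigl(\prod_{\tau=q+1}^{m} D_1(f_\tau)^{*}\Bigr)\xi_0 \Bigr\rangle \\
&= \Bigl\langle \Bigl(\prod_{p=1}^{m} D_1(f_p)^{*}\Bigr)\xi_0 ,\; \tilde{H}_1 \Bigl(\prod_{p=1}^{m} D_1(f_p)^{*}\Bigr)\xi_0 \Bigr\rangle .
\end{align*}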
Notice that $\tilde{H}_2 = \tilde{H} - \tilde{H}_1$ commutes with $D_1(f)^{*}$ for any $f \in \mathfrak{h}_0$ by (4.12). Thus by the polarization identity, we have proved that
\[
\mathcal{E}_1(\eta, \xi) = \langle \eta, \tilde{H}_1\xi \rangle
\]
for any $\eta, \xi \in \mathcal{M}_{fin}\xi_0$. A similar argument shows that
\[
\mathcal{E}_2(\eta, \xi) = \langle \eta, \tilde{H}_2\xi \rangle
\]
for any $\eta, \xi \in \mathcal{M}_{fin}\xi_0$. This proves part (a) of the theorem.

(b) By Proposition 4.2, we have
\begin{equation}
U\tilde{H}U^{-1} = d\Gamma_1(B) \otimes 1 + 1 \otimes d\Gamma_2(B) ,
\tag{4.13}
\end{equation}
where, for each $i = 1, 2$, $d\Gamma_i(B)$ is the second quantization of $B$ on $\mathcal{F}_i$. We remark that $d\Gamma_2(B)$ is anti-unitarily equivalent to $d\Gamma_1(B)$. By Assumption 3.1(d), any $f \in \mathfrak{h}_0$ is an analytic vector for $B$, and so it is easy to check that $(\prod_{p=1}^{m} a(f_p)^{*})\Omega$ is an analytic vector for $d\Gamma(B)$ for any $f_p \in \mathfrak{h}_0$, $p = 1, \dots, m$. Thus it follows that any $\xi \in \mathcal{M}_{fin}\xi_0$ is an analytic vector for $\tilde{H}$. Since $\tilde{H}\mathcal{M}_{fin}\xi_0 \subset \mathcal{M}_{fin}\xi_0$ by (4.12) and $\tilde{H} = H$ on $\mathcal{M}_{fin}\xi_0$ by part (a) of the theorem, it follows from [24, Corollary 2 of Theorem X.39] that $\tilde{H}$ and $H$ are essentially self-adjoint on $\mathcal{M}_{fin}\xi_0$, and so $\tilde{H} = H$.

Finally we can give the proof of Theorem 3.2.

Proof of Theorem 3.2. (a) follows from Theorem 4.2.
(b) Recall 0 < A ≤ α1. Let m=
(
α−1/2 + α1/2
α 0 [33]. Actually, this is true only for regular classical limits, while for chaotic ones the semi-classical regime typically scales as $-\log \hbar$ [19, 11, 33]. Both time scales diverge when $\hbar \to 0$, but the shortness of the latter means that classical mechanics has to be replaced by quantum mechanics much sooner for quantum systems with chaotic classical behavior. The logarithmic breaking time $-\log \hbar$ has been considered by some as a violation of the correspondence principle [17, 18], by others (see [11] and Chirikov in [19]), as evidence that time and classical limits do not commute. ∗ Also
Research Assistant of the Fund for Scientific Research — Flanders (Belgium) (F.W.O. — Vlaanderen). 847
December 3, 2003 18:33 WSPC/148-RMP
848
00183
F. Benatti et al.
The analytic studies of logarithmic time scales have been mainly performed by means of semi-classical tools, essentially by focusing, via coherent state techniques, on the phase space localization of specific time evolving quantum observables. In the following, we shall show how they emerge in the context of quantum dynamical entropies. As a particular example, we shall concentrate on finite dimensional quantizations of hyperbolic automorphisms of the 2-torus, which are prototypes of chaotic behavior; indeed, their trajectories separate exponentially fast with a Lyapounov exponent $\log \lambda > 0$ [7, 31]. Standard quantization, à la Berry, of hyperbolic automorphisms [10, 14] yields Hilbert spaces of a finite dimension $N$. This dimension plays the role of semi-classical parameter and sets the minimal size $1/N$ of quantum phase space cells.

By the theorems of Ruelle and Pesin [21], the positive Lyapounov exponents of smooth, classical dynamical systems are related to the dynamical entropy of Kolmogorov [20], which measures the information per time step provided by the dynamics. There are several candidates for non-commutative extensions of the latter [12, 3, 30, 1, 28]: in this paper we shall use two of them [12, 3] and study their semi-classical limit. We show that, from both of them, one recovers the Kolmogorov–Sinai entropy by computing the average quantum entropy produced over a logarithmic time scale and then taking the classical limit. This confirms the numerical results in [5], where the dynamical entropy [3] is applied to the study of the quantum kicked top. In this approach, the presence of logarithmic time scales indicates the typical scaling for a joint time-classical limit suited to preserve positive entropy production in quantized classically chaotic quantum systems.

The paper is organized as follows: Sec. 2 contains a brief review of the algebraic approach to classical and dynamical systems, while Sec. 3 introduces some basic semi-classical tools. Sections 4 and 5 deal with the quantization of hyperbolic maps on finite dimensional Hilbert spaces and the relation between classical and time limits. Section 6 gives an overview of the quantum dynamical entropy of Connes, Narnhofer and Thirring [12] (CNT-entropy) and of Alicki and Fannes [2, 3] (ALF-entropy, where L stands for Lindblad); finally, in Sec. 7 their semi-classical behavior is studied and the emergence of a typical logarithmic time scale is shown.
2. Dynamical Systems: Algebraic Setting

We consider reversible, discrete time, compact classical dynamical systems that can be represented by a triple $(\mathcal{X}, T, \mu)$, where:

• $\mathcal{X}$ is a compact metric space: the phase space of the system.
• $T$ is a measurable transformation of $\mathcal{X}$ that is invertible and such that $T^{-1}$ is also measurable. The group $\{T^k \mid k \in \mathbb{Z}\}$ implements the conservative dynamics in discrete time.
• $\mu$ is a $T$-invariant probability measure on $\mathcal{X}$, i.e. $\mu \circ T = \mu$.
Classical Limit of Quantum Dynamical Entropies
In this paper, we consider a general scheme for quantizing and dequantizing, i.e. for taking the classical limit (see [32]). Within this framework, we focus on the semi-classical limit of quantum dynamical entropies of finite dimensional quantizations of the Arnold cat map and of generic hyperbolic automorphisms of the 2-torus, cat maps for short. In order to make the quantization procedure more explicit, it proves useful to follow an algebraic approach and replace $(\mathcal{X}, T, \mu)$ with $(\mathcal{M}_\mu, \Theta, \omega_\mu)$, where

• $\mathcal{M}_\mu$ is the von Neumann algebra $L^{\infty}_{\mu}(\mathcal{X})$ of (equivalence classes of) essentially bounded $\mu$-measurable functions on $\mathcal{X}$, equipped with the so-called essential supremum norm $\|\cdot\|_\infty$ [26].
• $\omega_\mu$ is the state on $\mathcal{M}_\mu$ defined by the reference measure $\mu$:
\[
\omega_\mu(f) := \int_{\mathcal{X}} \mu(dx)\, f(x) .
\]
• $\{\Theta^k \mid k \in \mathbb{Z}\}$ is the discrete group of automorphisms of $\mathcal{M}_\mu$ which implements the dynamics: $\Theta(f) := f \circ T^{-1}$.

The invariance of the reference measure now reads $\omega_\mu \circ \Theta = \omega_\mu$.

Quantum dynamical systems are described in a completely similar way by a triple $(\mathcal{M}, \Theta, \omega)$, the critical difference being that the algebra of observables $\mathcal{M}$ is no longer Abelian:

• $\mathcal{M}$ is a von Neumann algebra of operators, the observables, acting on a Hilbert space $\mathcal{H}$.
• $\Theta$ is an automorphism of $\mathcal{M}$.
• $\omega$ is an invariant normal state on $\mathcal{M}$: $\omega \circ \Theta = \omega$.

Quantizing essentially corresponds to suitably mapping the commutative, classical triple $(\mathcal{M}_\mu, \Theta, \omega_\mu)$ to a non-commutative, quantum triple $(\mathcal{M}, \Theta, \omega)$.

3. Classical Limit: Coherent States

Performing the classical limit or a semi-classical analysis consists in studying how a family of algebraic triples $(\mathcal{M}, \Theta, \omega)$ depending on a quantization $\hbar$-like parameter is mapped onto $(\mathcal{M}_\mu, \Theta, \omega_\mu)$ when the parameter goes to zero. The most successful semi-classical tools are based on the use of coherent states. For our purposes, we shall use a large integer $N$ as a quantization parameter, i.e. we use $1/N$ as the $\hbar$-like parameter. In fact, we shall consider cases where $\mathcal{M}$ is the algebra $\mathcal{M}_N$ of $N$-dimensional square matrices acting on $\mathbb{C}^N$, the quantum reference state is the normalized trace $\frac{1}{N}\mathrm{Tr}$ on $\mathcal{M}_N$, denoted by $\tau_N$, and the dynamics is given in terms of a unitary operator $U_T$ on $\mathbb{C}^N$ in the standard way: $\Theta_N(X) := U_T^{*} X U_T$. In full generality, coherent states will be identified as follows.
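Definition 3.1. A family $\{|C_N(x)\rangle \mid x \in \mathcal{X}\} \subset \mathcal{H}$ of vectors, indexed by points $x \in \mathcal{X}$, constitutes a set of coherent states if it satisfies the following requirements:
(1) Measurability: $x \mapsto |C_N(x)\rangle$ is measurable on $\mathcal{X}$;
(2) Normalization: $\|C_N(x)\|^2 = 1$, $x \in \mathcal{X}$;
(3) Overcompleteness: $N \int_{\mathcal{X}} \mu(dx)\, |C_N(x)\rangle\langle C_N(x)| = 1$;
(4) Localization: given $\varepsilon > 0$ and $d_0 > 0$, there exists $N_0(\varepsilon, d_0)$ such that for $N \ge N_0$ and $d(x, y) \ge d_0$, one has $N |\langle C_N(x), C_N(y)\rangle|^2 \le \varepsilon$.

The overcompleteness condition may be written in dual form as
\[
N \int_{\mathcal{X}} \mu(dx)\, \langle C_N(x), X C_N(x)\rangle = \mathrm{Tr}\, X , \qquad X \in \mathcal{M}_N .
\]
Indeed,
\[
N \int_{\mathcal{X}} \mu(dx)\, \langle C_N(x), X C_N(x)\rangle = N\, \mathrm{Tr} \int_{\mathcal{X}} \mu(dx)\, |C_N(x)\rangle\langle C_N(x)|\, X = \mathrm{Tr}\, X .
\]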
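3.1. Anti-Wick quantization

In order to study the classical limit and, more generally, the semi-classical behavior of $(\mathcal{M}_N, \Theta_N, \tau_N)$ when $N \to \infty$, we introduce two linear maps. The first, $\gamma_{N\infty}$ (anti-Wick quantization), associates $N \times N$ matrices to functions in $\mathcal{M}_\mu = L^{\infty}_{\mu}(\mathcal{X})$; the second one, $\gamma_{\infty N}$, maps $N \times N$ matrices to functions in $L^{\infty}_{\mu}(\mathcal{X})$.

Definition 3.2. Given a family $\{|C_N(x)\rangle \mid x \in \mathcal{X}\}$ of coherent states in $\mathbb{C}^N$, the anti-Wick quantization scheme will be described by a (completely) positive unital map $\gamma_{N\infty} : \mathcal{M}_\mu \to \mathcal{M}_N$,
\[
\mathcal{M}_\mu \ni f \mapsto N \int_{\mathcal{X}} \mu(dx)\, f(x)\, |C_N(x)\rangle\langle C_N(x)| =: \gamma_{N\infty}(f) \in \mathcal{M}_N .
\]
The corresponding dequantizing map $\gamma_{\infty N} : \mathcal{M}_N \to \mathcal{M}_\mu$ will correspond to the (completely) positive unital map
\[
\mathcal{M}_N \ni X \mapsto \langle C_N(x), X C_N(x)\rangle =: \gamma_{\infty N}(X)(x) \in \mathcal{M}_\mu .
\]
Both maps are identity preserving because of the conditions imposed on the family of coherent states and are also completely positive, since the domain of $\gamma_{N\infty}$ is a commutative algebra, as is the range of $\gamma_{\infty N}$. Moreover,
\begin{equation}
\|\gamma_{\infty N} \circ \gamma_{N\infty}(g)\|_\infty \le \|g\|_\infty , \qquad g \in \mathcal{M}_\mu ,
\tag{1}
\end{equation}
where $\|\cdot\|_\infty$ denotes the essential norm on $\mathcal{M}_\mu = L^{\infty}_{\mu}(\mathcal{X})$. The following two equivalent properties are less trivial:

Proposition 3.1. For all $f \in \mathcal{M}_\mu$
\[
\lim_{N \to \infty} \gamma_{\infty N} \circ \gamma_{N\infty}(f) = f \qquad \mu\text{-a.e.}
\]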
Proposition 3.2. For all $f, g \in \mathcal{M}_\mu$
\[
\lim_{N \to \infty} \tau_N\bigl(\gamma_{N\infty}(f)^{*}\gamma_{N\infty}(g)\bigr) = \omega_\mu(\bar{f} g) = \int_{\mathcal{X}} \mu(dx)\, \overline{f(x)}\, g(x) .
\]

The previous two propositions can be taken as requirements on any well-defined quantization–dequantization scheme for observables. In the sequel, we shall need the notion of quantum dynamical systems $(\mathcal{M}_N, \Theta_N, \tau_N)$ tending to the classical limit $(\mathcal{X}, T, \mu)$. We then need convergence not only of the observables but also of the dynamics. This aspect will be considered in Sec. 5.

Proof of Proposition 3.1. We first prove the assertion when $f$ is continuous on $\mathcal{X}$ and then remove this condition. We show that the quantity
\begin{align*}
F_N(x) &:= |f(x) - \gamma_{\infty N} \circ \gamma_{N\infty}(f)(x)| \\
&= \Bigl| f(x) - N \int_{\mathcal{X}} \mu(dy)\, f(y)\, |\langle C_N(x), C_N(y)\rangle|^2 \Bigr| \\
&= \Bigl| N \int_{\mathcal{X}} \mu(dy)\, (f(y) - f(x))\, |\langle C_N(x), C_N(y)\rangle|^2 \Bigr|
\end{align*}
becomes arbitrarily small for $N$ large enough, uniformly in $x$. Selecting a ball $B(x, d_0)$ of radius $d_0$, using the mean-value theorem and property (3.1.3), we derive the upper bound
\begin{align}
F_N(x) &\le \Bigl| N \int_{B(x, d_0)} \mu(dy)\, (f(y) - f(x))\, |\langle C_N(x), C_N(y)\rangle|^2 \Bigr|
+ \Bigl| N \int_{\mathcal{X} \setminus B(x, d_0)} \mu(dy)\, (f(y) - f(x))\, |\langle C_N(x), C_N(y)\rangle|^2 \Bigr| \tag{2}\\
&\le |f(c) - f(x)| + \int_{\mathcal{X} \setminus B(x, d_0)} \mu(dy)\, |f(y) - f(x)|\, N |\langle C_N(x), C_N(y)\rangle|^2 , \tag{3}
\end{align}
where $c \in B(x, d_0)$. Because $\mathcal{X}$ is compact, $f$ is uniformly continuous. Therefore, we can choose $d_0$ in such a way that $|f(c) - f(x)| < \varepsilon$ uniformly in $x \in \mathcal{X}$. On the other hand, from the localization property (3.1.4), given $\varepsilon' > 0$, there exists an integer $N_0(\varepsilon', d_0)$ such that $N |\langle C_N(x), C_N(y)\rangle|^2 < \varepsilon'$ whenever $N > N_0(\varepsilon', d_0)$ and $d(x, y) \ge d_0$. This choice leads to the upper bound
\begin{align}
F_N(x) &\le \varepsilon + \varepsilon' \int_{\mathcal{X} \setminus B(x, d_0)} \mu(dy)\, |f(y) - f(x)| \notag\\
&\le \varepsilon + \varepsilon' \int_{\mathcal{X}} \mu(dy)\, |f(y) - f(x)| \le \varepsilon + 2\varepsilon' \|f\|_\infty . \tag{4}
\end{align}
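To get rid of the continuity of $f$, we use Lusin's theorem [26]. It states that, given $f \in L^{\infty}_{\mu}(\mathcal{X})$, with $\mathcal{X}$ compact, there exists a sequence $\{f_n\}$ of continuous functions on $\mathcal{X}$ such that $|f_n| \le \|f\|_\infty$ and converging to $f$ $\mu$-almost everywhere. Thus, for $f \in L^{\infty}_{\mu}(\mathcal{X})$, we pick such a sequence and estimate
\[
F_N(x) \le |f(x) - f_n(x)| + |f_n(x) - \gamma_{\infty N} \circ \gamma_{N\infty}(f_n)(x)| + |\gamma_{\infty N} \circ \gamma_{N\infty}(f_n - f)(x)| .
\]
The first term can be made arbitrarily small ($\mu$-a.e.) by choosing $n$ large enough because of Lusin's theorem, while the second one goes to $0$ when $N \to \infty$ since $f_n$ is continuous. Finally, the third term also becomes vanishingly small as $n \to \infty$, as one can deduce from
\begin{align*}
\int_{\mathcal{X}} \mu(dx)\, |\gamma_{\infty N} \circ \gamma_{N\infty}(f - f_n)(x)|
&= \int_{\mathcal{X}} \mu(dx)\, \Bigl| \int_{\mathcal{X}} \mu(dy)\, (f(y) - f_n(y))\, N |\langle C_N(x), C_N(y)\rangle|^2 \Bigr| \\
&\le \int_{\mathcal{X}} \mu(dy)\, |f(y) - f_n(y)| \int_{\mathcal{X}} \mu(dx)\, N |\langle C_N(x), C_N(y)\rangle|^2 \\
&= \int_{\mathcal{X}} \mu(dy)\, |f(y) - f_n(y)| ,
\end{align*}
where the exchange of integration order is harmless because of the existence of the integral (1). The last integral goes to zero with $n$ by dominated convergence and thus the result follows.

Proof of Proposition 3.2. Consider
\begin{align*}
\Omega_N &:= \bigl|\tau_N\bigl(\gamma_{N\infty}(f)^{*}\gamma_{N\infty}(g)\bigr) - \omega_\mu(\bar{f} g)\bigr| \\
&= \Bigl| N \int_{\mathcal{X}} \mu(dx)\, \overline{f(x)} \int_{\mathcal{X}} \mu(dy)\, (g(y) - g(x))\, |\langle C_N(x), C_N(y)\rangle|^2 \Bigr| \\
&\le \int_{\mathcal{X}} \mu(dx)\, |f(x)|\, \Bigl| \int_{\mathcal{X}} \mu(dy)\, (g(y) - g(x))\, N |\langle C_N(x), C_N(y)\rangle|^2 \Bigr| .
\end{align*}
By choosing a sequence of continuous $g_n$ approximating $g \in L^{\infty}_{\mu}(\mathcal{X})$, and arguing as in the previous proof, we get the following upper bound:
\begin{align*}
\Omega_N &\le N \int_{\mathcal{X}} \mu(dx)\, |f(x)|\, \Bigl| \int_{\mathcal{X}} \mu(dy)\, (g(y) - g_n(y))\, |\langle C_N(x), C_N(y)\rangle|^2 \Bigr| \\
&\quad + N \int_{\mathcal{X}} \mu(dx)\, |f(x)|\, \Bigl| \int_{\mathcal{X}} \mu(dy)\, (g_n(y) - g_n(x))\, |\langle C_N(x), C_N(y)\rangle|^2 \Bigr| \\
&\quad + N \int_{\mathcal{X}} \mu(dx)\, |f(x)|\, \Bigl| \int_{\mathcal{X}} \mu(dy)\, (g(x) - g_n(x))\, |\langle C_N(x), C_N(y)\rangle|^2 \Bigr| .
\end{align*}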
The integrals in the first and third lines go to zero by dominated convergence and Lusin's theorem. As regards the middle line, one can apply the argument used for the quantity $F_N(x)$ in the proof of Proposition 3.1.

4. Classical and Quantum Cat Maps

In this section, we collect the basic material needed to describe both classical and quantum cat maps and we introduce a specific set of coherent states that will enable us to perform the semi-classical analysis of the dynamical entropy.

4.1. Finite dimensional quantizations

We first introduce cat maps in the spirit of the algebraic formulation introduced in the previous sections.

Definition 4.1. Hyperbolic automorphisms of the torus, i.e. cat maps, are generically represented by triples $(\mathcal{M}_\mu, \Theta, \omega_\mu)$, where

• $\mathcal{M}_\mu$ is the algebra of essentially bounded functions on the two dimensional torus $\mathbb{T} := \{x = (x_1, x_2) \in \mathbb{R}^2 \pmod 1\}$, equipped with the Lebesgue measure $\mu(dx) := dx$.
• $\{\Theta^k \mid k \in \mathbb{Z}\}$ is the family of automorphisms (discrete time evolution) given by $\mathcal{M}_\mu \ni f \mapsto (\Theta^k f)(x) := f(A^{-k}x \pmod 1)$, where $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ has integer entries such that $ad - cb = 1$, $|a + d| > 2$, and maps $\mathbb{T}$ onto itself.
• $\omega_\mu$ is the expectation obtained by integration with respect to the Lebesgue measure: $\mathcal{M}_\mu \ni f \mapsto \omega_\mu(f) := \int_{\mathbb{T}} dx\, f(x)$, which is left invariant by $\Theta$.

The matrix $A$ has irrational eigenvalues $\lambda > 1$ and $\lambda^{-1}$; therefore distances stretch along the eigendirection $u$ of $\lambda$, while they shrink along $v$, the eigendirection of $\lambda^{-1}$. Once the folding condition is added, the hyperbolic automorphisms of the torus become prototypes of classical chaos, with positive Lyapounov exponent $\log \lambda$. One can quantize the associated algebraic triple $(\mathcal{M}_\mu, \Theta, \omega_\mu)$ on either infinite [8] or finite dimensional Hilbert spaces [10, 14, 13]. In the following, we shall focus on the latter. Given an integer $N$, we consider an orthonormal basis $|j\rangle$ of $\mathbb{C}^N$, where the index $j$ runs through $\mathbb{Z}_N$, namely $|j + N\rangle \equiv |j\rangle$, $j \in \mathbb{Z}$. By using this basis we define two unitary matrices $U_N$ and $V_N$ as follows:
\begin{equation}
U_N |j\rangle := \exp\Bigl(\frac{2\pi i}{N} u\Bigr) |j + 1\rangle , \qquad \text{and} \qquad V_N |j\rangle := \exp\Bigl(\frac{2\pi i}{N} (v - j)\Bigr) |j\rangle .
\tag{5}
\end{equation}
Here $u, v \in [0, 1)$ are parameters labelling the representations and
\begin{equation}
U_N^N = e^{2i\pi u}\, 1_N , \qquad V_N^N = e^{2i\pi v}\, 1_N .
\tag{6}
\end{equation}
It turns out that
\begin{equation}
U_N V_N = \exp\Bigl(\frac{2i\pi}{N}\Bigr) V_N U_N .
\tag{7}
\end{equation}
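Introducing Weyl operators labeled by $n = (n_1, n_2) \in \mathbb{Z}^2$,
\begin{equation}
W_N(n) := \exp\Bigl(\frac{i\pi}{N} n_1 n_2\Bigr) V_N^{n_2} U_N^{n_1} = W_N(-n)^{*} ,
\tag{8}
\end{equation}
it follows that
\begin{align}
W_N(Nn) &= e^{i\pi(N n_1 n_2 + 2 n_1 u + 2 n_2 v)} , \tag{9}\\
W_N(n) W_N(m) &= \exp\Bigl(\frac{i\pi}{N} \sigma(n, m)\Bigr) W_N(n + m) , \tag{10}
\end{align}
where $\sigma(n, m) := n_1 m_2 - n_2 m_1$.

Definition 4.2. Quantized cat maps will be identified with algebraic triples $(\mathcal{M}_N, \Theta_N, \tau_N)$ where

• $\mathcal{M}_N$ is the full $N \times N$ matrix algebra linearly spanned by the Weyl operators $W_N(n)$.
• $\Theta_N : \mathcal{M}_N \to \mathcal{M}_N$ is the automorphism such that
\begin{equation}
W_N(p) \mapsto \Theta_N(W_N(p)) := W_N(Ap) , \qquad p \in \mathbb{Z}^2 .
\tag{11}
\end{equation}

In the definition above, we have omitted reference to the parameters $u, v$ in (5): they must be chosen such that
\begin{equation}
\begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} u \\ v \end{pmatrix} + \frac{N}{2} \begin{pmatrix} ac \\ bd \end{pmatrix} \pmod 1 .
\tag{12}
\end{equation}
Then, the folding condition (9) is compatible with the time evolution [14]. Further, the algebraic relations (10) are also preserved since the symplectic form remains invariant, i.e. $\sigma(A^t n, A^t m) = \sigma(n, m)$. Useful relations can be obtained by using
\begin{equation}
W_N(n) |j\rangle = \exp\Bigl(\frac{i\pi}{N} (-n_1 n_2 + 2 n_1 u + 2 n_2 v)\Bigr) \exp\Bigl(-\frac{2i\pi}{N} j n_2\Bigr) |j + n_1\rangle .
\tag{13}
\end{equation}
From (13) one readily derives
\begin{align}
\tau_N(W_N(n)) &= e^{\frac{i\pi}{N}(-n_1 n_2 + 2 n_1 u + 2 n_2 v)}\, \delta^{(N)}_{n,0} , \tag{14}\\
\tau_N(W_N(An)) &= \tau_N(W_N(n)) , \tag{15}\\
\frac{1}{N} \sum_{p_1, p_2 = 0}^{N-1} W_N(-p) W_N(n) W_N(p) &= \mathrm{Tr}(W_N(n))\, 1_N , \tag{16}\\
\mathcal{M}_N \ni X &= \sum_{p_1, p_2 = 0}^{N-1} \tau_N(X W_N(-p))\, W_N(p) . \tag{17}
\end{align}
In (14), we have introduced the periodic Kronecker delta, that is $\delta^{(N)}_{n,0} = 1$ if and only if $n = 0 \bmod N$.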
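The relations (5)–(10), (14) and (17) are easy to check numerically for small $N$. The following is a minimal sketch in Python with NumPy, not part of the original text: the function names (`shift_and_clock`, `weyl`, `sigma`, `tau`, `weyl_reconstruct`) are chosen here purely for illustration, and the matrices are written in the basis $|0\rangle, \dots, |N-1\rangle$ with the identification $|j + N\rangle \equiv |j\rangle$.

```python
import numpy as np

def shift_and_clock(N, u=0.0, v=0.0):
    """Matrices U_N and V_N of Eq. (5) in the basis |0>, ..., |N-1>."""
    idx = np.arange(N)
    U = np.zeros((N, N), dtype=complex)
    # U|j> = exp(2 pi i u / N) |j+1>, with |N> identified with |0>
    U[(idx + 1) % N, idx] = np.exp(2j * np.pi * u / N)
    # V|j> = exp(2 pi i (v - j) / N) |j>
    V = np.diag(np.exp(2j * np.pi * (v - idx) / N))
    return U, V

def weyl(N, n, u=0.0, v=0.0):
    """Weyl operator W_N(n) of Eq. (8); n = (n1, n2) is a pair of integers."""
    n1, n2 = n
    U, V = shift_and_clock(N, u, v)
    power = np.linalg.matrix_power  # negative exponents invert the matrix
    return np.exp(1j * np.pi * n1 * n2 / N) * (power(V, n2) @ power(U, n1))

def sigma(n, m):
    """Symplectic form sigma(n, m) = n1 m2 - n2 m1."""
    return n[0] * m[1] - n[1] * m[0]

def tau(X):
    """Normalized trace tau_N."""
    return np.trace(X) / X.shape[0]

def weyl_reconstruct(X, u=0.0, v=0.0):
    """Right-hand side of the expansion (17) for an N x N matrix X."""
    N = X.shape[0]
    out = np.zeros((N, N), dtype=complex)
    for p1 in range(N):
        for p2 in range(N):
            coeff = tau(X @ weyl(N, (-p1, -p2), u, v))
            out += coeff * weyl(N, (p1, p2), u, v)
    return out
```

For instance, with `N, u, v = 5, 0.3, 0.7` one can compare `weyl(N, n, u, v) @ weyl(N, m, u, v)` against `np.exp(1j * np.pi * sigma(n, m) / N) * weyl(N, (n[0] + m[0], n[1] + m[1]), u, v)` using `np.allclose`, which is a direct test of (10).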
From Eq. (10) one derives
\[
[W_N(n), W_N(m)] = 2i \sin\Bigl(\frac{\pi}{N} \sigma(n, m)\Bigr) W_N(n + m) ,
\]
which suggests that the $\hbar$-like parameter is $1/N$ and that the classical limit corresponds to $N \to \infty$. In the following section, we set up a coherent state technique suited to study classical cat maps as limits of quantized cats.

4.2. Coherent states for cat maps

We shall construct a family $\{|C_N(x)\rangle \mid x \in \mathbb{T}\}$ of coherent states on the 2-torus by means of the discrete Weyl group. We define
\begin{equation}
|C_N(x)\rangle := W_N([Nx]) |C_N\rangle ,
\tag{18}
\end{equation}
where $[Nx] = ([Nx_1], [Nx_2])$, $0 \le [Nx_i] \le N - 1$, is the largest integer smaller than $Nx_i$, and the fundamental vector $|C_N\rangle$ is chosen to be
\begin{equation}
|C_N\rangle = \sum_{j=0}^{N-1} C_N(j) |j\rangle , \qquad C_N(j) := \frac{1}{2^{(N-1)/2}} \sqrt{\binom{N-1}{j}} .
\tag{19}
\end{equation}
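The definitions (18) and (19) can be explored numerically. The sketch below, in Python with NumPy, is an illustration only and not part of the original text: the helper names `fundamental_vector`, `apply_weyl` and `overcompleteness_operator` are hypothetical, the Weyl operators act through the explicit formula (13), and the integral over $\mathbb{T}$ in the overcompleteness condition is replaced by the sum over the $N^2$ cells on which $[Nx]$ is constant, each of Lebesgue measure $1/N^2$.

```python
import numpy as np
from math import comb, sqrt

def fundamental_vector(N):
    """Amplitudes C_N(j) of Eq. (19): 2^{-(N-1)/2} sqrt(binom(N-1, j))."""
    amps = [sqrt(comb(N - 1, j)) for j in range(N)]
    return np.array(amps) / 2 ** ((N - 1) / 2)

def apply_weyl(N, n, psi, u=0.0, v=0.0):
    """Apply W_N(n) to the vector psi using the action (13) on basis states."""
    n1, n2 = n
    idx = np.arange(N)
    phase = (np.exp(1j * np.pi * (-n1 * n2 + 2 * n1 * u + 2 * n2 * v) / N)
             * np.exp(-2j * np.pi * idx * n2 / N))
    out = np.zeros(N, dtype=complex)
    out[(idx + n1) % N] = phase * psi  # |j> is sent to |j + n1 mod N>
    return out

def overcompleteness_operator(N, u=0.0, v=0.0):
    """N * integral over the torus of |C_N(x)><C_N(x)|, as a sum over grid cells."""
    c = fundamental_vector(N)
    Y = np.zeros((N, N), dtype=complex)
    for p1 in range(N):
        for p2 in range(N):
            w = apply_weyl(N, (p1, p2), c, u, v)
            Y += np.outer(w, w.conj())
    # N * (1/N^2) * sum over the N^2 cells
    return Y / N
```

Under these assumptions, `np.vdot(c, c)` with `c = fundamental_vector(N)` tests the normalization, and comparing `overcompleteness_operator(N)` with `np.eye(N)` tests the overcompleteness property for a given small `N`. The modulus `abs(np.vdot(c, apply_weyl(N, n, c)))` gives the overlap $|\langle C_N, W_N(n) C_N\rangle|$, whose decay in $n$ underlies the localization property.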
Measurability and normalization are immediate; overcompleteness comes as follows. Let $Y$ be the operator on the left-hand side of property (3.1.3). If $\tau_N(Y W_N(n)) = \tau_N(W_N(n))$ for all $n = (n_1, n_2)$ with $0 \le n_i \le N - 1$, then according to (17) applied to $Y$ it follows that $Y = 1$. This is indeed the case as, using (9) and $N$-periodicity,
\begin{align}
\tau_N(Y W_N(n)) &= \int_{\mathbb{T}} dx\, \langle C_N(x), W_N(n) C_N(x)\rangle \notag\\
&= \int_{\mathbb{T}} dx\, \exp\Bigl(\frac{2\pi i}{N} \sigma(n, [Nx])\Bigr) \langle C_N, W_N(n) C_N\rangle \notag\\
&= \frac{1}{N^2} \sum_{p_1, p_2 = 0}^{N-1} \exp\Bigl(\frac{2\pi i}{N} \sigma(n, p)\Bigr) \langle C_N, W_N(n) C_N\rangle \notag\\
&= \tau_N(W_N(n)) .
\tag{20}
\end{align}
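In the last line, when $x$ runs over $\mathbb{T}$, $[Nx_i]$, $i = 1, 2$, runs over the set of integers $0, 1, \dots, N - 1$. The proof of the localization property (3.1.4) requires several steps. First, we observe that, due to (6),
\begin{align}
E(n) := |\langle C_N, W_N(n) C_N\rangle|
&= \frac{1}{2^{N-1}} \Biggl| \sum_{\ell=0}^{N-n_1-1} \exp\Bigl(-\frac{2\pi i}{N} \ell n_2\Bigr) \sqrt{\binom{N-1}{\ell} \binom{N-1}{\ell + n_1}} \notag\\
&\qquad\qquad + \sum_{\ell=N-n_1}^{N-1} \exp\Bigl(-\frac{2\pi i}{N} \ell n_2\Bigr) \sqrt{\binom{N-1}{\ell} \binom{N-1}{\ell + n_1 - N}} \Biggr| \tag{21}\\
&\le \frac{1}{2^{N-1}} \Biggl[ \sum_{\ell=0}^{N-n_1-1} \sqrt{\binom{N-1}{\ell} \binom{N-1}{\ell + n_1}} + \sum_{\ell=N-n_1}^{N-1} \sqrt{\binom{N-1}{\ell} \binom{N-1}{\ell + n_1 - N}} \Biggr] . \tag{22}
\end{align}
Second, using the entropic bound of the binomial coefficients
\begin{equation}
\binom{N-1}{\ell} \le 2^{(N-1)\eta\left(\frac{\ell}{N-1}\right)} ,
\tag{23}
\end{equation}
where
\begin{equation}
\eta(t) := \begin{cases}
-t \log_2 t - (1 - t) \log_2 (1 - t) & \text{if } 0 < t \le 1 \\
0 & \text{if } t = 0
\end{cases} ,
\tag{24}
\end{equation}
we estimate
\begin{equation}
E(n) \le \frac{1}{2^{N-1}} \Biggl[ \sum_{\ell=0}^{N-1-n_1} 2^{\frac{N-1}{2}\left[\eta\left(\frac{\ell}{N-1}\right) + \eta\left(\frac{\ell + n_1}{N-1}\right)\right]} + \sum_{\ell=N-n_1}^{N-1} 2^{\frac{N-1}{2}\left[\eta\left(\frac{\ell}{N-1}\right) + \eta\left(\frac{\ell + n_1 - N}{N-1}\right)\right]} \Biggr] .
\tag{25}
\end{equation}
The exponents in the two sums are bounded by their maxima
\begin{align}
\eta\Bigl(\frac{\ell}{N-1}\Bigr) + \eta\Bigl(\frac{\ell + n_1}{N-1}\Bigr) &\le 2\eta_1(n_1) , \qquad (0 \le \ell \le N - n_1 - 1) \tag{26}\\
\eta\Bigl(\frac{\ell}{N-1}\Bigr) + \eta\Bigl(\frac{\ell + n_1 - N}{N-1}\Bigr) &\le 2\eta_2(n_1) , \qquad (N - n_1 \le \ell \le N - 1) \tag{27}
\end{align}
where
\begin{align}
\eta_1(n_1) &:= \eta\Bigl(\frac{1}{2} - \frac{n_1}{2(N-1)}\Bigr) \le 1 , \tag{28}\\
\eta_2(n_1) &:= \eta\Bigl(\frac{1}{2} + \frac{N - n_1}{2(N-1)}\Bigr) \le \eta_2 < 1 . \tag{29}
\end{align}
Notice that $\eta_2$ is automatically $< 1$, while $\eta_1(n_1) < 1$ if $\lim_N n_1/N \ne 0$. If so, the upper bound
\begin{equation}
E(n) \le N \bigl(2^{-(N-1)(1 - \eta_1(n_1))} + 2^{-(N-1)(1 - \eta_2)}\bigr)
\tag{30}
\end{equation}
implies that $N |\langle C_N, W_N(n) C_N\rangle|^2 \to 0$ exponentially as $N \to \infty$. The condition for which $\eta_1(n_1) < 1$ is fulfilled when $|x_1 - y_1| > \delta$; in fact, $n = [Ny] - [Nx]$ and $\lim_N ([Nx_1] - [Ny_1])/N = x_1 - y_1$. On the other hand, if $x_1 = y_1$ and $n_2 = [Nx_2] - [Ny_2] \ne 0$, one explicitly computes
\begin{equation}
N |\langle C_N, W_N((0, n_2)) C_N\rangle|^2 = N \Bigl(\cos^2 \frac{\pi n_2}{N}\Bigr)^{N-1} .
\tag{31}
\end{equation}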
Again, the above expression goes exponentially fast to zero if $\lim_N n_2/N \ne 0$, which is the case if $x_2 \ne y_2$.

5. Quantum and Classical Time Evolutions

One of the main issues in the semi-classical analysis is to compare if and how the quantum and classical time evolutions mimic each other when a quantization parameter goes to zero. In the case of classically chaotic quantum systems, the situation is strikingly different from the case of classically integrable quantum systems. In the former case, classical and quantum mechanics agree on the level of coherent states only over times which scale as $-\log \hbar$. As before, let $T$ denote the evolution on the classical phase space $\mathcal{X}$ and $U_T$ the unitary single step evolution on $\mathbb{C}^N$. We formally impose the relation between the classical and quantum evolution on the level of coherent states through:

Condition 5.1 (Dynamical Localization). There exists an $\alpha > 0$ such that for all choices of $\varepsilon > 0$ and $d_0 > 0$ there exists an $N_0 \in \mathbb{N}$ with the following property: if $N > N_0$ and $k \le \alpha \log N$, then $N |\langle U_T^k C_N(x), C_N(y)\rangle|^2 \le \varepsilon$ whenever $d(T^k x, y) \ge d_0$.

Remark. The condition of dynamical localization is what is expected of a good choice of coherent states, namely, on a time scale logarithmic in the inverse of the semi-classical parameter, evolving coherent states should stay localized around the classical trajectories. Informally, when $N \to \infty$, the quantities
\begin{equation}
K_k(x, y) := \langle U_T^k C_N(x), C_N(y)\rangle
\tag{32}
\end{equation}
should behave as if $N |K_k(x, y)|^2 \simeq \delta(T^k x - y)$. The constraint $k \le \alpha \log N$ is typical of hyperbolic classical behavior and comes heuristically as follows. The maximal localization of coherent states cannot exceed the minimal coarse-graining dictated by $1/N$; if, while evolving, coherent states stayed localized forever around the classical trajectories, they would get more and more localized along the contracting direction. Since for hyperbolic systems the increase of localization is exponential with Lyapounov exponent $\lambda_{\mathrm{Lyap}} > 0$, this sets the upper bound and indicates that $\alpha \simeq 1/\lambda_{\mathrm{Lyap}}$.

Proposition 5.1. Let $(\mathcal{M}_N, \Theta_N, \tau_N)$ be a general quantum dynamical system as defined in Sec. 3 and suppose that it satisfies Condition 5.1. Let $\|X\|_2 := \sqrt{\tau_N(X^{*}X)}$, $X \in \mathcal{M}_N$, denote the normalized Hilbert–Schmidt norm. In the ensuing topology lim
k,N →∞ k