Commun. Math. Phys. 224, 1 – 2 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Preface

The present issue of CMP is dedicated to Joel L. Lebowitz, the William Hill Professor of Mathematics and Physics of Rutgers University, in recognition of his outstanding contributions and scientific leadership in statistical physics and related areas of mathematical physics.

In his research, Joel Lebowitz has addressed topics in statistical physics ranging over equilibrium and non-equilibrium phenomena. His works reflect the rare combination of a deep understanding of the relevant physics and the ability to see through the mathematical formalism in which physics is being expressed. He has reveled both in learning new physical phenomena and in shedding light on basic questions of physics through mathematically rigorous results.

The subjects on which Joel has worked, with numerous collaborators, include: the theory of equilibrium fluids, rigorous approach to the liquid-vapor transition, critical phenomena in Ising type models, the statistical mechanics of Coulomb systems, phase segregation studied in conjunction with pioneering implementations of Monte-Carlo simulations, ergodic theory in relation to fundamental issues of statistical mechanics, kinetic theory, and the structure of non-equilibrium steady states.

Joel’s contributions have been widely recognized; in 1980 he was elected to the National Academy of Sciences and in 1992 he was awarded the Boltzmann Medal of the Union of Pure and Applied Physics.
An inseparable aspect of the Lebowitz experience for the many whom he has touched has been the sense of his personal engagement and care. Having witnessed the Holocaust as a young teenager, Joel emerged from the devastation and the inhumanely twisted reality that he was thrown into with a tenacious commitment to stand up for human rights and dignity. He has inspired many with the message that scientists should use the unique opportunities accorded to them to be at the forefront of that struggle. In 1999 he was awarded the Scientific Freedom and Responsibility Award of the American Association for the Advancement of Science for “... his tireless devotion to the rights of scientists in oppressive regimes throughout the world and his extraordinary creativity in finding ways to help these scientists survive their ordeal”.

In the late fifties Joel instituted a unique series of biannual meetings in Statistical Mechanics, which he has been running uninterrupted since then. These conferences have provided an invaluable forum for the presentation of recent results and for stimulating exchanges on both new emerging vistas and long outstanding fundamental issues. In a fitting reflection of the spirit of these meetings, Joel’s 70th birthday is being marked with two special issues: Physica A Vol. 279, Nos. 1–4, where the reader may also find a more complete biographical sketch, and this issue of CMP which presents rigorous results in fields related to Joel’s interests.

Joel L. Lebowitz stands out in his never satisfied curiosity, the clarity of his thought, and his exceptional ability to reach out and stimulate others. He has inspired and guided numerous colleagues and students. This issue is dedicated with deep gratitude, with a sense of joy at having had the privilege to interact with him, and with best wishes for Joel’s continuing quest.

Michael Aizenman
Herbert Spohn
Commun. Math. Phys. 224, 3 – 16 (2001)
Entropy Production in Quantum Spin Systems

David Ruelle

IHES, route de Chartres, 91440 Bures sur Yvette, France. E-mail: [email protected]

Received: 7 June 2000 / Accepted: 5 November 2000
Abstract: We consider a quantum spin system consisting of a finite subsystem connected to infinite reservoirs at different temperatures. In this setup we define nonequilibrium steady states and prove that the rate of entropy production in such states is nonnegative.

For several decades, Joel Lebowitz has been the soul of research in statistical mechanics. He now plays a central role in the development of new ideas which reshape our understanding of nonequilibrium. The present paper, dedicated to Joel on his 70th birthday, extends some of the new ideas to quantum systems.

1. Introduction

Consider a physical situation where a “small” system $S$ is connected to different “large” heat reservoirs $R_a$ ($a = 1, 2, \ldots$) at different inverse temperatures $\beta_a$. We want to define nonequilibrium steady states for the total system $L = S + R_1 + R_2 + \ldots$, and verify that the rate of entropy production in such states is $\ge 0$. The model which we discuss in this paper is that of a fairly realistic quantum spin system.

In what follows we first describe the model and state our assumptions (A1), (A2), (A3). In this setup we introduce nonequilibrium steady states $\rho$ as states which, in the distant past, described noninteracting reservoirs at different temperatures. Under suitable conditions we check that our definition does not depend on where we place the boundary between the small system and the reservoirs. Our definition of the entropy production $e_\rho$ also does not depend on where the boundary between the small system and the reservoirs is placed. With this definition we prove $e_\rho \ge 0$.

By contrast with an earlier paper [4], we omit here assumptions of asymptotic abelianness in time which are difficult to verify, the definition of nonequilibrium steady states is more general, but we obtain less specific results.
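2. Description of the Model (See [3, 1])

Let $L$ be a countably infinite set. For each $x \in L$, let $\mathcal{H}_x$ be a finite dimensional complex Hilbert space, and write $\mathcal{H}_X = \otimes_{x\in X} \mathcal{H}_x$ if $X$ is a finite subset of $L$. We let $\mathcal{A}_X$ be the C$^*$-algebra of bounded operators on $\mathcal{H}_X$, and if $Y \subset X$ we identify $\mathcal{A}_Y$ with a subalgebra of $\mathcal{A}_X$ by the map $\mathcal{A}_Y \to \mathcal{A}_Y \otimes 1_{\mathcal{H}_{X\setminus Y}} \subset \mathcal{A}_X$. We write $L$ as a finite union $L = \cup_{a\ge 0} R_a$, where $R_0 = S$ is finite (small system) and the $R_a$ with $a > 0$ are infinite (reservoirs). We can then define the quasilocal C$^*$-algebras $\mathcal{A}_a$, $\mathcal{A}$ as the norm closures of

$$\bigcup_{X\subset R_a} \mathcal{A}_X, \qquad \bigcup_{X\subset L} \mathcal{A}_X$$

respectively. Note that all these algebras have a common unit element 1. In this setup we assume that an interaction $\Phi : X \mapsto \Phi(X)$ is given such that $\Phi(X)$ is a selfadjoint element of $\mathcal{A}_X$ for every finite $X \subset L$. Also, for each reservoir, we prescribe an inverse temperature $\beta_a > 0$ and a state $\sigma_a$ on $\mathcal{A}_a$.

The assumptions (A1), (A2), (A3).

Assumption A1. The interaction $\Phi$ satisfies

$$\|\Phi\|_\lambda = \sum_{n\ge 0} e^{n\lambda} \sup_{x\in L} \sum_{X\ni x:\,\mathrm{card}\,X = n+1} \|\Phi(X)\| < \infty$$

for some $\lambda > 0$.

The importance of this assumption is that it allows us to equip $\mathcal{A}$ with a one-parameter group $(\alpha^t)$ of automorphisms$^1$ defining a time evolution. Introduce a linear operator $\delta : \cup_{X\subset L} \mathcal{A}_X \to \mathcal{A}$ such that

$$\delta A = i \sum_{Y:\,Y\cap X\ne\emptyset} [\Phi(Y), A] \quad \text{if } A \in \mathcal{A}_X.$$

If $A \in \mathcal{A}_X$, one checks that

$$\|\delta^m A\| \le \|A\|\, e^{\lambda\,\mathrm{card}\,X}\, m!\, (2\lambda^{-1}\|\Phi\|_\lambda)^m.$$

The strongly continuous one-parameter group $(\alpha^t)$ of $*$-automorphisms of $\mathcal{A}$ is given by

$$\alpha^t A = \sum_{m=0}^{\infty} \frac{t^m}{m!}\, \delta^m A$$

if $A \in \cup_{X\subset L} \mathcal{A}_X$ and $|t| < \lambda/2\|\Phi\|_\lambda$. (More generally one could take $A \in \mathcal{A}_\lambda$, where $\mathcal{A}_\lambda$ is defined in the Appendix.) Let

$$H_\Lambda = \sum_{X\subset\Lambda} \Phi(X)$$

for finite $\Lambda \subset L$. Writing $\Lambda \to L$ if $\Lambda$ eventually contains each finite $X \subset L$ we have, assuming $A \in \mathcal{A}$,

$$\lim_{\Lambda\to L} \|e^{itH_\Lambda} A e^{-itH_\Lambda} - \alpha^t A\| = 0$$

uniformly for $t$ in compact intervals of $\mathbb{R}$.

$^1$ See [1] Theorem 6.2.4 (or [3] Sect. 7.6).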
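As a finite-dimensional illustration of the series defining the time evolution, the sketch below compares the partial sums of $\sum_m (t^m/m!)\,\delta^m A$, with $\delta A = i[H, A]$, against the exact Heisenberg evolution $e^{itH} A e^{-itH}$. It is only a toy check under stated assumptions: a single random Hermitian matrix `H` stands in for the local Hamiltonian $H_\Lambda$, and the function names, matrix size, seed, and time `t` are hypothetical choices that do not come from the paper.

```python
import numpy as np

def random_hermitian(n, rng):
    """Return a random n x n Hermitian matrix (toy stand-in for H_Lambda or A)."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

def evolve_exact(h, a, t):
    """Heisenberg evolution e^{itH} A e^{-itH}, via the spectral decomposition of H."""
    w, v = np.linalg.eigh(h)
    u = (v * np.exp(1j * t * w)) @ v.conj().T  # u = e^{itH}
    return u @ a @ u.conj().T

def evolve_series(h, a, t, terms=60):
    """Partial sum of sum_m (t^m / m!) delta^m A with delta A = i[H, A]."""
    total = np.zeros_like(a, dtype=complex)
    term = a.astype(complex)  # holds (t^m / m!) delta^m A for the current m
    for m in range(terms):
        total += term
        term = 1j * (h @ term - term @ h) * t / (m + 1)
    return total

rng = np.random.default_rng(0)
H = random_hermitian(4, rng)
A = random_hermitian(4, rng)
t = 0.3
series_error = np.linalg.norm(evolve_series(H, A, t) - evolve_exact(H, A, t))
print(series_error)
```

In finite dimension the series converges for every $t$; the restriction $|t| < \lambda/2\|\Phi\|_\lambda$ in the text is what replaces this in the infinite system, where only the bound on $\|\delta^m A\|$ is available.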
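Assumption A2. $\Phi(X) = 0$ if $X \cap S = \emptyset$, $X \cap R_a \ne \emptyset$, $X \cap R_b \ne \emptyset$ for different $a, b > 0$.

Note that the description of the interaction is somewhat ambiguous because anything ascribed to $\Phi(X)$ might also be ascribed to $\Phi(Y)$ for $Y \supset X$. Condition (A2) means that in our accounting, if a part of the interaction connects two different reservoirs, it must also involve the small system $S$.

Assumption A3. If $a > 0$, let $\Phi_a$ be the restriction of the interaction $\Phi$ to subsets of $R_a$ and write

$$H_{a\Lambda} = \sum_{X\subset R_a\cap\Lambda} \Phi_a(X) = H_{R_a\cap\Lambda}.$$

Let also the interactions $\Phi_{(\Lambda)}$ be given such that

$$\|\Phi_{(\Lambda)}\|_\lambda \le K < \infty \tag{1}$$

and write

$$B_{a\Lambda} = \sum_{X\subset R_a\cap\Lambda} \Phi_{(\Lambda)}(X).$$

We assume that, for a suitable sequence $\Lambda \to L$,

$$\lim_{\Lambda\to L} \frac{\mathrm{Tr}_{\mathcal{H}_{R_a\cap\Lambda}}\big(e^{-\beta_a(H_{a\Lambda}+B_{a\Lambda})} A\big)}{\mathrm{Tr}_{\mathcal{H}_{R_a\cap\Lambda}}\, e^{-\beta_a(H_{a\Lambda}+B_{a\Lambda})}} = \sigma_a(A)$$

if $A \in \mathcal{A}_a$: this defines a state $\sigma_a$ on $\mathcal{A}_a$, depending on the choice of $(\Phi_{(\Lambda)})$ and the sequence $\Lambda \to L$. Furthermore we assume that for each finite $X$ there is $\Lambda_X$ such that $\Phi_{(\Lambda)}(Y) = 0$ if $\Lambda \supset \Lambda_X$ and $Y \subset X$; therefore

$$[B_{a\Lambda}, A] = 0 \tag{2}$$

if $\Lambda \supset \Lambda_X$ and $A \in \mathcal{A}_X$. In particular we can take all $\Phi_{(\Lambda)} = 0$. Using (3) below, it is readily verified that $\sigma_a$ is a $\beta_a$-KMS state (see [2]) for the one-parameter group $(\breve\alpha_a^t)$ of automorphisms of $\mathcal{A}_a$ corresponding to the interaction $\Phi_a$. [I do not know which of the $\beta_a$-KMS states can be obtained in this manner].

Note that the assumptions (A1), (A2), (A3) can be explicitly verified in specific cases. From (A3) we obtain the following result.

Lemma.

$$\lim_{\Lambda\to L} \|e^{it(H_{a\Lambda}+B_{a\Lambda})} A e^{-it(H_{a\Lambda}+B_{a\Lambda})} - \breve\alpha_a^t A\| = 0 \tag{3}$$

for $a > 0$, and

$$\lim_{\Lambda\to L} \|e^{it(H_\Lambda+\sum_{a>0} B_{a\Lambda})} A e^{-it(H_\Lambda+\sum_{a>0} B_{a\Lambda})} - \alpha^t A\| = 0 \tag{4}$$

uniformly for $t$ in compact intervals of $\mathbb{R}$.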
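Proof. We prove (4). Write $\alpha_\Lambda^t A = e^{it(H_\Lambda+\sum_{a>0} B_{a\Lambda})} A e^{-it(H_\Lambda+\sum_{a>0} B_{a\Lambda})}$ and $\delta_\Lambda A = i[H_\Lambda + \sum_{a>0} B_{a\Lambda}, A]$. If $A \in \cup_X \mathcal{A}_X$ we see using (1) that

$$\alpha_\Lambda^t A = \sum_{m=0}^{\infty} \frac{t^m}{m!}\, \delta_\Lambda^m A$$

converges uniformly in $\Lambda$ for $|t| < \lambda/2(\|\Phi\|_\lambda + K)$. Using also (2), it is shown in the Appendix that $\delta_\Lambda^m A \to \delta^m A$ in $\mathcal{A}$ when $\Lambda \to L$. Therefore

$$\lim_{\Lambda\to L} \|\alpha_\Lambda^t A - \alpha^t A\| = 0$$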
when $A \in \cup_X \mathcal{A}_X$, uniformly for $|t| \le T < \lambda/2(\|\Phi\|_\lambda + K)$. But the condition $A \in \cup_X \mathcal{A}_X$ is removed by density, and the condition $|t| \le T < \lambda/2(\|\Phi\|_\lambda + K)$ by use of the group property. The proof of (3) is similar.

The KMS state $\sigma$. The interaction $\sum_{a>0} \beta_a \Phi_a$, evaluated at $X$ is $\beta_a \Phi_a(X)$ if $X \subset R_a$ and 0 if $X$ is not contained in one of the $R_a$. The corresponding one-parameter group $(\beta^t)$ of automorphisms of $\mathcal{A}$ has, according to (A3), the KMS state$^2$ $\sigma = \otimes_{a\ge 0}\, \sigma_a$ where $\sigma_0$ is the normalized trace on $\mathcal{A}_0 = \mathcal{A}_S$. In fact

$$\sigma(A) = \lim_{\Lambda\to L} \frac{\mathrm{Tr}_{\mathcal{H}_\Lambda}\big(\exp(-\sum_a \beta_a(H_{a\Lambda}+B_{a\Lambda}))\, A\big)}{\mathrm{Tr}_{\mathcal{H}_\Lambda} \exp(-\sum_a \beta_a(H_{a\Lambda}+B_{a\Lambda}))}. \tag{5}$$

$^2$ The state $\sigma$ corresponds to the inverse temperature $+1$ rather than the inverse temperature $-1$ favored in the mathematical literature.

Nonequilibrium steady states. We call nonequilibrium steady states (NESS) associated with $\sigma$ the limits when $T \to \infty$ of

$$\frac{1}{T} \int_0^T dt\, (\alpha^t)^* \sigma$$

using the w$^*$-topology on the dual $\mathcal{A}^*$ of $\mathcal{A}$. With respect to this topology, the set $\Sigma$ of NESS is compact, nonempty, and the elements of $\Sigma$ are $(\alpha^t)^*$-invariant states on $\mathcal{A}$. This definition generalizes that given in [4] where, under stringent asymptotic abelianness conditions, the existence of a single NESS was obtained.

Dependence on the decomposition $L = S + R_1 + R_2 + \ldots$ $^3$. Our definition of $\sigma$, and therefore of $\Sigma$, depends on the choice of a decomposition of $L$ into small system and reservoirs. If $S$ is replaced by a finite set $S' \supset S$ and the $R_a$ by correspondingly smaller sets $R'_a \subset R_a$ one checks that (A1), (A2), (A3) remain valid. If $\Phi'_a$ is the restriction of $\Phi$ to subsets of $R'_a$, the replacement of $\beta_a \Phi_a$ by $\beta_a \Phi'_a$ changes $(\beta^t)$ to a one-parameter group $(\beta'^t)$ and $\sigma$ to a state $\sigma'$. These changes are in fact bounded perturbations covered by Theorem 5.4.4 and Corollary 5.4.5 of [1]. The map $\sigma \to \sigma'$ (of KMS states for $(\beta^t)$ to KMS states for $(\beta'^t)$) is nonlinear (as can be guessed from (5)) and therefore we cannot expect that $\frac{1}{T}\int_0^T dt\, (\alpha^t)^* \sigma$ has the same limit as $\frac{1}{T}\int_0^T dt\, (\alpha^t)^* \sigma'$ in general, but the deviation is not really bad. The (central) decomposition of KMS states into extremal KMS states gives factor states. If $\sigma$ is assumed to be a factor state, and $(\alpha^t)$ is asymptotically abelian, one finds that $\lim \frac{1}{T}\int_0^T dt\, (\alpha^t)^* \sigma$ does not depend on the decomposition $L = S + R_1 + R_2 + \ldots$, as the following result indicates.

$^3$ This section and the following Proposition are in the nature of a technical digression, and may be omitted by the reader essentially interested in the positivity of the entropy production.

Proposition. Using the above notation, assume that $\sigma$ is a factor state, and that

$$\lim_{t\to\infty} \|[\alpha^t A, B]\| = 0$$

when $A, B \in \mathcal{A}$. Then, when $T \to \infty$,

$$\lim \frac{1}{T}\int_0^T dt\, (\alpha^t)^* \sigma' = \lim \frac{1}{T}\int_0^T dt\, (\alpha^t)^* \sigma.$$

Proof. Let us introduce the GNS representation $(\mathcal{H}, \pi, \Omega)$ associated with $\sigma$ so that if

$$\rho = \lim \frac{1}{T}\int_0^T dt\, (\alpha^t)^* \sigma,$$

we have

$$\rho(A) = \lim \frac{1}{T}\int_0^T dt\, (\Omega, \pi(\alpha^t A)\Omega).$$

By restricting $T$ to a subsequence we may assume that in the weak operator topology

$$\lim \frac{1}{T}\int_0^T dt\, \pi(\alpha^t A) = \bar A \in \pi(\mathcal{A})''$$

and by assumption we also have $\bar A \in \pi(\mathcal{A})'$, hence $\bar A \in \pi(\mathcal{A})' \cap \pi(\mathcal{A})'' = \{\lambda 1\}$ since $\sigma$ is a factor state. But we may write $\sigma'(\cdot) = (\Omega', \pi(\cdot)\Omega')$: this follows from the perturbation theory of [1] (see proof of Theorem 5.4.4). We have thus

$$\lim \frac{1}{T}\int_0^T dt\, \sigma'(\alpha^t A) = \lim \frac{1}{T}\int_0^T dt\, (\Omega', \pi(\alpha^t A)\Omega') = \lim \frac{1}{T}\int_0^T dt\, (\Omega, \pi(\alpha^t A)\Omega) = \lim \frac{1}{T}\int_0^T dt\, \sigma(\alpha^t A)$$

as announced.

Entropy production. For finite $\Lambda \subset L$ we have defined

$$H_\Lambda = \sum_{X\subset\Lambda} \Phi(X),$$
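but $H_L$, $H_{R_a}$ do not make sense. We can however define

$$[H_L, H_{R_a}] = \lim_{\Lambda\to L} [H_\Lambda, H_{R_a\cap\Lambda}] = \lim_{\Lambda\to L} [H_\Lambda, H_{a\Lambda}].$$

We have indeed

$$[H_\Lambda, H_{a\Lambda}] = [H_\Lambda - H_{a\Lambda}, H_{a\Lambda}] = \Big[H_\Lambda - \sum_{b>0} H_{b\Lambda},\, H_{a\Lambda}\Big]$$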
and (A2) gives

$$H_\Lambda - \sum_{b>0} H_{b\Lambda} = \sum_{x\in S}\ \sum_{X:\,x\in X\subset\Lambda} \frac{1}{\mathrm{card}(X\cap S)}\, \Phi(X)$$

[implying the existence of the limit $\lim_{\Lambda\to L} (H_\Lambda - \sum_{b>0} H_{b\Lambda}) = H_L - \sum_{b>0} H_{R_b} \in \mathcal{A}$]. Using (A1) we obtain

$$\|[\Phi(X), H_{a\Lambda}]\| \le 2\lambda^{-1} \|\Phi\|_\lambda\, \|\Phi(X)\|\, e^{\lambda\,\mathrm{card}\,X},$$

hence

$$\sum_{X\ni x} \|[\Phi(X), H_{a\Lambda}]\| \le 2\lambda^{-1} \|\Phi\|_\lambda\, e^{\lambda}\, \|\Phi\|_\lambda,$$

and $[H_\Lambda, H_{a\Lambda}]$ has a limit $[H_L, H_{R_a}] \in \mathcal{A}$ when $\Lambda \to L$ with

$$\|[H_L, H_{R_a}]\| \le 2\,\mathrm{card}\,S\; \lambda^{-1} e^{\lambda}\, \|\Phi\|_\lambda^2.$$

The operator $i[H_L, H_{R_a}]$ may be interpreted as the rate of increase of the energy of the reservoir $R_a$ or (since this energy is infinite) rather the rate of transfer of energy to $R_a$ from the rest of the system. According to conventional wisdom we define the rate of entropy production in an $(\alpha^t)^*$-invariant state $\rho$ as

$$e_\rho = \sum_{a>0} \beta_a\, \rho(i[H_L, H_{R_a}])$$

(this definition does not require that $\rho \in \Sigma$).

Remark. If we replace $S$ by a finite set $S' \supset S$ and the $R_a$ by the correspondingly smaller sets $R'_a \subset R_a$, we have noted earlier that (A1), (A2), (A3) remain satisfied. As a consequence of (A1) we have

$$i[H_L, H_{R_a} - H_{R'_a}] = \lim_{\Lambda\to L} i[H_\Lambda, H_{a\Lambda} - H'_{a\Lambda}] = \lim_{\Lambda\to L} \delta(H_{a\Lambda} - H'_{a\Lambda})$$

(where the operator $\delta$ has been defined just after (A3)), hence

$$\rho(i[H_L, H_{R_a} - H_{R'_a}]) = \lim_{\Lambda\to L} \rho(\delta(H_{a\Lambda} - H'_{a\Lambda})) = 0$$

i.e., the rate of entropy production is unchanged when $S$ and the $R_a$ are replaced by $S'$ and the $R'_a$. The reason why we do not have $\rho(i[H_L, H_{R_a}]) = 0$ is mathematically because $H_{R_a}$ is “infinite” ($H_{R_a} \notin \mathcal{A}$), and physically because our definition of $\rho(i[H_L, H_{R_a}])$ takes into account the flux of energy into $R_a$ from $S$, but not the flux at infinity.

Theorem. The entropy production in a NESS is nonnegative, i.e., $e_\rho \ge 0$ if $\rho \in \Sigma$.
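We have seen that

$$[H_L, H_{R_a}] = \lim_{\Lambda\to L} [H_\Lambda, H_{a\Lambda}] = \lim_{\Lambda\to L} \Big[H_\Lambda - \sum_{b>0} H_{b\Lambda},\, H_{a\Lambda}\Big].$$

Therefore, using (A3) and $[H_{b\Lambda} + B_{b\Lambda}, \sum_{a>0} \beta_a(H_{a\Lambda} + B_{a\Lambda})] = 0$, we find

$$\begin{aligned}
\sum_{a>0} \beta_a [H_L, H_{R_a}] &= \lim_{\Lambda\to L} \Big[H_\Lambda - \sum_{b>0} H_{b\Lambda},\, \sum_{a>0} \beta_a H_{a\Lambda}\Big] \\
&= \lim_{\Lambda\to L} \Big[H_\Lambda - \sum_{b>0} H_{b\Lambda},\, \sum_{a>0} \beta_a (H_{a\Lambda} + B_{a\Lambda})\Big] \\
&= \lim_{\Lambda\to L} \Big[H_\Lambda + \sum_{b>0} B_{b\Lambda},\, \sum_{a>0} \beta_a (H_{a\Lambda} + B_{a\Lambda})\Big]
\end{aligned}$$

in the sense of norm convergence. We also have, for some sequence of values of $T$ tending to infinity and all $A \in \mathcal{A}$,

$$\rho(A) = \lim_{T\to\infty} \frac{1}{T}\int_0^T dt\, \sigma(\alpha^t A) = \lim_{T\to\infty} \lim_{\Lambda\to L} \frac{1}{T}\int_0^T dt\, \sigma(\alpha_\Lambda^t A),$$

where, by (4),

$$\alpha_\Lambda^t A = e^{it(H_\Lambda+\sum_{a>0} B_{a\Lambda})} A e^{-it(H_\Lambda+\sum_{a>0} B_{a\Lambda})} \to \alpha^t A \quad \text{in norm}$$

when $\Lambda \to L$, uniformly for $t \in [0, T]$. Write

$$H_{B\Lambda} = H_\Lambda + \sum_{a>0} B_{a\Lambda}, \qquad G_\Lambda = \sum_{a>0} \beta_a (H_{a\Lambda} + B_{a\Lambda}) + \log \mathrm{Tr}_{\mathcal{H}_\Lambda} \exp\Big(-\sum_{a>0} \beta_a (H_{a\Lambda} + B_{a\Lambda})\Big).$$

Then the entropy production is

$$e_\rho = \rho\Big(i \sum_{a>0} \beta_a [H_L, H_{R_a}]\Big) = \lim_{T\to\infty} \lim_{\Lambda\to L} \frac{i}{T}\int_0^T dt\, \sigma\big(e^{itH_{B\Lambda}} [H_{B\Lambda}, G_\Lambda]\, e^{-itH_{B\Lambda}}\big)$$

and the convergence when $\Lambda \to L$ of the operator $(e^{itH_{B\Lambda}} [H_{B\Lambda}, G_\Lambda] e^{-itH_{B\Lambda}})$ is uniform for $t \in [0, T]$. According to (A3) we may choose the $\Lambda$ tending to $L$ such that $\mathrm{Tr}_{\mathcal{H}_\Lambda}(e^{-G_\Lambda}\, \cdot)$ tends to $\sigma(\cdot)$ in the w$^*$-topology, hence

$$\begin{aligned}
e_\rho &= \lim_{T\to\infty} \lim_{\Lambda\to L} \frac{i}{T}\int_0^T dt\, \mathrm{Tr}_{\mathcal{H}_\Lambda}\big(e^{-G_\Lambda} e^{itH_{B\Lambda}} [H_{B\Lambda}, G_\Lambda]\, e^{-itH_{B\Lambda}}\big) \\
&= \lim_{T\to\infty} \lim_{\Lambda\to L} \frac{1}{T}\int_0^T dt\, \mathrm{Tr}_{\mathcal{H}_\Lambda}\Big(e^{-G_\Lambda} \frac{d}{dt}\big(e^{itH_{B\Lambda}} G_\Lambda e^{-itH_{B\Lambda}}\big)\Big) \\
&= \lim_{T\to\infty} \lim_{\Lambda\to L} \frac{1}{T}\Big[\mathrm{Tr}_{\mathcal{H}_\Lambda}\big(e^{-G_\Lambda} e^{iTH_{B\Lambda}} G_\Lambda e^{-iTH_{B\Lambda}}\big) - \mathrm{Tr}_{\mathcal{H}_\Lambda}\big(e^{-G_\Lambda} G_\Lambda\big)\Big]
\end{aligned}$$

and the theorem follows from the lemma below, applied with $A = G_\Lambda$, $U = e^{iTH_{B\Lambda}}$ and $\phi(s) = -e^{-s}$.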
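The matrix inequality invoked at the end of this proof, $\mathrm{tr}(\phi(A) U A U^{-1}) \le \mathrm{tr}(\phi(A) A)$ for hermitian $A$, unitary $U$ and increasing $\phi$, can be probed numerically. The following is a minimal sketch under stated assumptions: random matrices replace $G_\Lambda$ and $e^{iTH_{B\Lambda}}$, and the helper names, matrix size, seed, and number of trials are hypothetical. A numerical check of this kind only illustrates the statement; it is no substitute for the proof via Klein's inequality.

```python
import numpy as np

def random_hermitian(n, rng):
    """Return a random n x n Hermitian matrix."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

def random_unitary(n, rng):
    """Return a unitary matrix from the QR decomposition of a complex Gaussian matrix."""
    q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
    return q

def apply_function(a, phi):
    """Compute phi(A) for Hermitian A through its spectral decomposition."""
    w, v = np.linalg.eigh(a)
    return (v * phi(w)) @ v.conj().T

def lemma_gap(a, u, phi):
    """tr(phi(A) A) - tr(phi(A) U A U^{-1}); the lemma asserts this is >= 0."""
    pa = apply_function(a, phi)
    return np.trace(pa @ a).real - np.trace(pa @ u @ a @ u.conj().T).real

def phi(s):
    """The increasing function used in the proof of the theorem."""
    return -np.exp(-s)

rng = np.random.default_rng(1)
gaps = [lemma_gap(random_hermitian(5, rng), random_unitary(5, rng), phi)
        for _ in range(200)]
min_gap = min(gaps)
print(min_gap)
```

With $A = G_\Lambda$ and $U = e^{iTH_{B\Lambda}}$ the quantity `lemma_gap` is, up to the factor $1/T$, the bracket in the last line of the computation of $e_\rho$ above, which is how the nonnegativity of the entropy production follows.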
Lemma. Let $A$, $U$ be a hermitian and a unitary $n \times n$ matrix respectively, and $\phi : \mathbb{R} \to \mathbb{R}$ be an increasing function. Then

$$\mathrm{tr}(\phi(A)\, U A U^{-1}) \le \mathrm{tr}(\phi(A)\, A).$$

Proof. As R. Seiler kindly pointed out to me, this lemma can be obtained readily from O. Klein’s inequality

$$\mathrm{tr}\big(f(B) - f(A) - (B - A) f'(A)\big) \ge 0,$$

where $A$, $B$ are hermitian and $f$ convex: take $B = U A U^{-1}$ and $\phi = f'$.

Remark. We have

$$\sum_{a>0} \rho(i[H_L, H_{R_a}]) = 0$$

because

$$-\sum_{a>0} \rho(i[H_L, H_{R_a}]) = \lim_{\Lambda\to L} \rho\Big(i\Big[H_\Lambda,\, H_\Lambda - \sum_{a>0} H_{a\Lambda}\Big]\Big) = \frac{d}{dt}\, \rho\Big(\alpha^t \sum_{X:\,X\cap S\ne\emptyset} \Phi(X)\Big)\Big|_{t=0} = 0,$$

where we have used the fact that $\rho$ is $(\alpha^t)^*$-invariant. In particular, in the case of two reservoirs

$$0 \le e_\rho = (\beta_1 - \beta_2)\, \rho(i[H_L, H_{R_1}])$$

so that if the temperature $\beta_1^{-1}$ is less than $\beta_2^{-1}$, i.e., $\beta_1 - \beta_2 > 0$, the flux of energy into $R_1$ is $\ge 0$: heat flows from the hot reservoir to the cold reservoir.

3. Proving Strict Positivity of $e_\rho$

It is an obvious challenge to prove that $e_\rho \ne 0$. A natural situation to discuss would correspond to $R_a = \mathbb{Z}^\nu$ and $\Phi_a$ translationally invariant. But we need then $\nu \ge 3$ as discussed in [4]. Indeed, for $\nu < 3$ one expects a nonequilibrium steady state to be in fact an equilibrium state at a temperature intermediate between the original temperatures of the reservoirs. Instead of a quantum spin system as described above, a gas of noninteracting fermions would probably be easier to treat first.

4. Complements and Relation with Recent Work of Jakšić and Pillet

After this paper was submitted for publication, two interesting contributions were posted to the mp_arc archive: one by Jakšić and Pillet$^4$ and one by Maes et al.$^5$ In this section and the next two, I am complying with the editor’s request to take into account remarks by the referees, and in particular to discuss the relations of my work with the two references mentioned above.

$^4$ V. Jakšić and C.-A. Pillet. “On entropy production in quantum statistical mechanics.” mp_arc 00-309.

$^5$ Chr. Maes, F. Redig, and M. Verschuere. “Entropy production for interacting particle systems.” mp_arc 00-357.
Note that the definition of entropy production used above is based on the thermodynamic relation $dQ = kT\,dS$ or, in the present case $dS = \sum_a (kT_a)^{-1} dQ_a$. It can be considered a drawback that this definition does not relate directly to a microscopically defined entropy-like quantity, as is done in the papers of Jakšić and Pillet, and Maes et al. We now discuss in detail the approach of Jakšić and Pillet, and its relation with the present paper.$^6$

We are given a C$^*$-algebra $\mathcal{A}$ with identity, an element $V = V^* \in \mathcal{A}$, time evolutions $(\breve\alpha^t)$, $(\alpha^t)$ (i.e., strongly continuous one-parameter groups of $*$-automorphisms of $\mathcal{A}$) such that

$$\alpha^t(A) = \breve\alpha^t(A) + \sum_{n\ge 1} i^n \int_0^t dt_1 \int_0^{t_1} dt_2 \ldots \int_0^{t_{n-1}} dt_n\, \big[\breve\alpha^{t_n}(V), \big[\ldots [\breve\alpha^{t_1}(V), A] \ldots\big]\big]$$

and an $(\breve\alpha^t)$-invariant state $\sigma$ on $\mathcal{A}$. Therefore $(\alpha^t)$ is a local perturbation by $V$ of the “free” evolution given by $(\breve\alpha^t)$ and $\sigma$ is an invariant state for the “free” evolution. We furthermore assume that

(C1) There exists a time evolution $(\beta^t)$ for which $\sigma$ is a KMS state at inverse temperature $+1$.

(C2) $V$ is in the domain of the infinitesimal generator $\delta_\beta$ of $(\beta^t)$.

[In fact Jakšić and Pillet assume a temperature $-1$ in (C1); our choice of temperature $+1$ will bring a change of sign below in the definition of the entropy production. In the situation discussed earlier we have

$$V = \sum_{X\cap S\ne\emptyset} \Phi(X),$$

hence $\|V\|_\lambda \le \|\Phi\|_\lambda\, \mathrm{card}\,S$, and $V \in \mathcal{A}_\lambda$. Note that $\mathcal{A}_\lambda$ is in the domain of the infinitesimal generator $\delta_\beta$ of $(\beta^t)$ (see the Appendix), hence (C2) holds. The advantage of the approach of Jakšić and Pillet is that $\sigma$ can be an arbitrary KMS state: the existence of “boundary terms” $B_{a\Lambda}$ as in (A3) is not required.]

In this setup one introduces the observable $-\delta_\beta(V)$ and the entropy production in the state $\rho$ is defined as $\rho(-\delta_\beta(V))$. [In our situation we have

$$-\delta_\beta(V) = -\sum_{a>0} \beta_a \sum_{X\subset R_a}\ \sum_{Y:\,Y\cap S\ne\emptyset} i[\Phi(X), \Phi(Y)] = \sum_{a>0} \beta_a\, i[H_L, H_{R_a}]$$

so that $\rho(-\delta_\beta(V)) = e_\rho$ is indeed the rate of entropy production in the state $\rho$.]

$^6$ We have changed the notation of [2] to align it with the one used above.
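Finite dimensional digression. For the purpose of motivation we discuss now the case where $\mathcal{A}$ would be the algebra of $n \times n$ matrices, and consider two states on $\mathcal{A}$ given by density matrices $\mu$, $\nu$. A relative entropy is then defined by

$$\mathrm{Ent}(\mu|\nu) = -\mathrm{tr}(\mu \log \mu - \mu \log \nu) \le 0.$$

If $(\alpha^t)$ is a one parameter group of $*$-automorphisms of $\mathcal{A}$ we have thus

$$\frac{d}{dt}\, \mathrm{Ent}(\mu \circ \alpha^t|\nu) = \mathrm{tr}\Big(\mu\, \frac{d}{dt}\, \alpha^t(\log \nu)\Big).$$

Suppose now that $\nu$ is preserved by the “free” evolution $(\breve\alpha^t)$, and that $(\alpha^t)$ is a perturbation of $(\breve\alpha^t)$, so that

$$\alpha^t(A) = e^{i(H+V)t} A e^{-i(H+V)t}, \qquad \breve\alpha^t(A) = e^{iHt} A e^{-iHt},$$

then

$$\frac{d}{dt}\, \alpha^t(\log \nu) = \alpha^t(i[V, \log \nu]).$$

Define now $(\beta^t)$ by

$$\beta^t(A) = e^{-it\log\nu} A e^{it\log\nu}$$

so that $\nu$ is the corresponding KMS state (at inverse temperature $+1$). Then if $\delta_\beta$ is the infinitesimal generator of $(\beta^t)$ we have $i[V, \log \nu] = \delta_\beta(V)$, hence

$$\frac{d}{dt}\, \alpha^t(\log \nu) = \alpha^t(\delta_\beta(V)), \qquad \frac{d}{dt}\, \mathrm{Ent}(\mu \circ \alpha^t|\nu) = \mu(\alpha^t(\delta_\beta(V))).$$

We obtain thus

$$\mathrm{Ent}(\mu \circ \alpha^T|\nu) - \mathrm{Ent}(\mu|\nu) = \int_0^T (\mu \circ \alpha^t)(\delta_\beta(V))\, dt$$

or, taking $\mu = \nu = \sigma$,

$$0 \le -\mathrm{Ent}(\sigma \circ \alpha^T|\sigma) = \int_0^T (\sigma \circ \alpha^t)(-\delta_\beta(V))\, dt.$$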
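The finite dimensional entropy balance just derived can be checked on small matrices. The sketch below is an illustration under assumptions of our own: `H` and `V` are random Hermitian matrices, $\sigma = e^{-H}/\mathrm{tr}\, e^{-H}$ is chosen so that it is invariant under the free evolution, and the time integral is approximated by a trapezoid rule. Sizes, seed, scale, and step count are hypothetical.

```python
import numpy as np

def random_hermitian(n, rng, scale=0.5):
    """Return a random n x n Hermitian matrix with entries of size ~scale."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return scale * (m + m.conj().T) / 2

def herm_fun(a, f):
    """Compute f(A) for Hermitian A through its spectral decomposition."""
    w, v = np.linalg.eigh(a)
    return (v * f(w)) @ v.conj().T

def relative_entropy(mu, nu):
    """Ent(mu|nu) = -tr(mu log mu - mu log nu), with the sign convention of the text."""
    return -np.trace(mu @ (herm_fun(mu, np.log) - herm_fun(nu, np.log))).real

rng = np.random.default_rng(2)
n = 4
H = random_hermitian(n, rng)          # generator of the "free" evolution
V = random_hermitian(n, rng)          # perturbation
sigma = herm_fun(H, lambda w: np.exp(-w))
sigma /= np.trace(sigma).real         # density matrix commuting with H
log_sigma = herm_fun(sigma, np.log)

# -delta_beta(V) = -i[V, log sigma], from beta^t(A) = e^{-it log sigma} A e^{it log sigma}
minus_delta_beta_V = -1j * (V @ log_sigma - log_sigma @ V)

def state_at(t):
    """Density matrix of the state sigma o alpha^t: e^{-i(H+V)t} sigma e^{i(H+V)t}."""
    u = herm_fun(H + V, lambda w: np.exp(-1j * t * w))
    return u @ sigma @ u.conj().T

T = 2.0
ts = np.linspace(0.0, T, 4001)
rates = np.array([np.trace(state_at(t) @ minus_delta_beta_V).real for t in ts])
step = ts[1] - ts[0]
integral = step * (rates.sum() - 0.5 * (rates[0] + rates[-1]))  # trapezoid rule
entropy_term = -relative_entropy(state_at(T), sigma)
print(integral, entropy_term)
```

The two printed numbers should agree up to quadrature error, and both should be nonnegative. The integrand vanishes at $t = 0$ because $\sigma$ commutes with $\log \sigma$, so any entropy production comes from the perturbed dynamics moving the state away from $\sigma$.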
The infinite dimensional situation. If $\mu$, $\nu$ are two faithful normal states on a von Neumann algebra $\mathfrak{M}$ [in our case $\pi_\sigma(\mathcal{A})''$], Araki has introduced a relative entropy $\mathrm{Ent}(\mu|\nu)$ in terms of a relative modular operator associated with $\mu$, $\nu$. We must refer the reader to [1] Definition 6.2.29 for details. Using this definition, Jakšić and Pillet have worked out an infinite dimensional version of the finite dimensional calculation given above. They are able to prove the formula

$$\int_0^T (\sigma \circ \alpha^t)(-\delta_\beta(V))\, dt = -\mathrm{Ent}(\sigma \circ \alpha^T|\sigma) \ge 0$$

which can be interpreted as an entropy balance, and gives in the limit $\rho(-\delta_\beta(V)) \ge 0$ if $\rho$ is a NESS. The proof is fairly technical.

The approach of Jakšić and Pillet has the interest of great generality. In particular $\sigma$ can be an arbitrary KMS state. Also, instead of a spin lattice system one can consider fermions on a lattice. For a noninteracting fermion model, Jakšić and Pillet have announced a proof of strict positivity of the entropy production, as had been suggested above.

Appendix: The Algebras $\mathcal{A}_\lambda$

The purpose of this Appendix is to complete the proof of (4) by establishing (10) below. On the way to this result we introduce “partial traces” $\pi_\Lambda$, and algebras $\mathcal{A}_\lambda$ which are of interest in their own right.

For finite $\Lambda \subset L$, a map $\pi_\Lambda : \cup_X \mathcal{A}_X \to \mathcal{A}_\Lambda$ is defined by

$$\pi_\Lambda A = \lim_{Y\to L\setminus\Lambda} \frac{\mathrm{tr}_{\mathcal{H}_Y} A}{\dim \mathcal{H}_Y}.$$

If the $\phi_i$ form an orthonormal basis of $\mathcal{H}_Y$, and $\psi', \psi'' \in \mathcal{H}_\Lambda$ we have

$$\Big(\psi',\, \frac{\mathrm{tr}_{\mathcal{H}_Y} A}{\dim \mathcal{H}_Y}\, \psi''\Big) = \frac{1}{\dim \mathcal{H}_Y} \sum_i (\phi_i \otimes \psi',\, A\, \phi_i \otimes \psi''),$$

hence $\|\pi_\Lambda A\| \le \|A\|$. The properties of the following lemma are then readily checked.

Lemma. The map $\pi_\Lambda$ extends to a unique linear norm-reducing map $\mathcal{A} \to \mathcal{A}_\Lambda$. Furthermore

$$\pi_\Lambda A = A \ \text{ if } A \in \mathcal{A}_\Lambda, \qquad \pi_\Lambda A^* = (\pi_\Lambda A)^*, \qquad \pi_\Lambda \pi_{\Lambda'} = \pi_{\Lambda'} \pi_\Lambda.$$

Choose now some $\lambda > 0$. For $A \in \mathcal{A}_\Lambda$, define

$$\|A\|_\lambda = \inf\Big\{\sum_{X\subset\Lambda} \|A_X\|\, e^{\lambda\,\mathrm{card}\,X} : \sum_X A_X = A\Big\}.$$
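The properties of the partial trace $\pi_\Lambda$ listed in the lemma of this Appendix can be illustrated on a finite tensor product. The following sketch assumes a two-factor space $\mathcal{H}_\Lambda \otimes \mathcal{H}_Y$ of dimensions 3 and 4 (hypothetical values) with the $\Lambda$ factor placed first, which is only a bookkeeping convention; the function name `partial_trace_normalized` is ours.

```python
import numpy as np

def partial_trace_normalized(a, d_lam, d_y):
    """Normalized partial trace over H_Y of an operator on H_Lambda (x) H_Y."""
    a4 = a.reshape(d_lam, d_y, d_lam, d_y)    # indices (i, a, j, b)
    return np.einsum("iaja->ij", a4) / d_y    # sum over a = b, divide by dim H_Y

def op_norm(m):
    """Operator norm (largest singular value)."""
    return np.linalg.norm(m, 2)

rng = np.random.default_rng(3)
d_lam, d_y = 3, 4
dim = d_lam * d_y
A = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
B = rng.normal(size=(d_lam, d_lam)) + 1j * rng.normal(size=(d_lam, d_lam))

pi_A = partial_trace_normalized(A, d_lam, d_y)
norm_ratio = op_norm(pi_A) / op_norm(A)           # norm-reducing: should be <= 1
embedded_B = np.kron(B, np.eye(d_y))              # B viewed as an element of the larger algebra
pi_embedded_B = partial_trace_normalized(embedded_B, d_lam, d_y)
pi_A_adjoint = partial_trace_normalized(A.conj().T, d_lam, d_y)
print(norm_ratio)
```

Here `pi_embedded_B` reproduces `B`, matching $\pi_\Lambda A = A$ for $A \in \mathcal{A}_\Lambda$, and `pi_A_adjoint` equals the adjoint of `pi_A`, matching $\pi_\Lambda A^* = (\pi_\Lambda A)^*$.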
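By compactness we may replace the inf by min. If $\Lambda$ is replaced by a larger set $\Lambda'$, and $\sum_Y A_Y = A$ with $Y \subset \Lambda'$, we have

$$\sum_{Y\subset\Lambda'} \|A_Y\|\, e^{\lambda\,\mathrm{card}\,Y} \ge \sum_Y \|\pi_\Lambda A_Y\|\, e^{\lambda\,\mathrm{card}(Y\cap\Lambda)}$$

with $\sum_Y \pi_\Lambda A_Y = \pi_\Lambda A = A$. Therefore $\|A\|_\lambda$ does not depend on the choice of $\Lambda$ provided $A \in \mathcal{A}_\Lambda$. We have thus a norm $\|\cdot\|_\lambda$ on $\cup_X \mathcal{A}_X$, and we may define the Banach space $\mathcal{A}_\lambda$ by completion.

Proposition. The inclusion map $\cup_X \mathcal{A}_X \to \mathcal{A}$ extends to a norm-reducing map $\omega : \mathcal{A}_\lambda \to \mathcal{A}$ and $\omega$ is injective.

Proof. $\omega$ is norm-reducing because $\|A\| \le \|A\|_\lambda$ for $A \in \cup_X \mathcal{A}_X$. Note now that $\pi_\Lambda : \cup_X \mathcal{A}_X \to \mathcal{A}_\Lambda$ reduces the $\|\cdot\|_\lambda$-norm and extends thus to a linear norm-reducing map $\mathcal{A}_\lambda \to \mathcal{A}_{\Lambda\lambda}$, where $\mathcal{A}_{\Lambda\lambda}$ is $\mathcal{A}_\Lambda$ equipped with the $\|\cdot\|_\lambda$-norm. Assume that $A \in \mathcal{A}_\lambda$ with $\|A\|_\lambda = a > 0$. We may choose $\Lambda$ and $B \in \mathcal{A}_\Lambda$ such that $\|A - B\|_\lambda < a/3$, hence $\|B\|_\lambda > 2a/3$. Now $\omega A = 0$ would imply $\pi_\Lambda A = 0$, hence

$$\frac{2a}{3} < \|B\|_\lambda = \|\pi_\Lambda(B - A)\|_\lambda \le \|A - B\|_\lambda < \frac{a}{3}.$$

Therefore $\omega$ must be injective.

Corollary. $\mathcal{A}_\lambda$ is identified by $\omega$ to a dense $*$-subalgebra of $\mathcal{A}$; $\mathcal{A}_\lambda$ is then a Banach algebra with respect to the norm $\|\cdot\|_\lambda$. Taking $\lambda = 0$ we may define $\mathcal{A}_0 = \mathcal{A}$. With this definition, if $\lambda < \mu$ we have $\mathcal{A}_\lambda \supset \mathcal{A}_\mu$, and the map $\mathcal{A}_\mu \to \mathcal{A}_\lambda$ is norm-reducing.

Proof. If $A, B \in \mathcal{A}_\Lambda$ we may choose $A_X, B_X \in \mathcal{A}_X$ such that $A = \sum_{X\subset\Lambda} A_X$, $B = \sum_{X\subset\Lambda} B_X$, and

$$\|A\|_\lambda = \sum_{X\subset\Lambda} \|A_X\|\, e^{\lambda\,\mathrm{card}\,X}, \qquad \|B\|_\lambda = \sum_{X\subset\Lambda} \|B_X\|\, e^{\lambda\,\mathrm{card}\,X}.$$

Thus

$$\|AB\|_\lambda \le \sum_X \sum_Y \|A_X B_Y\|\, e^{\lambda\,\mathrm{card}(X\cup Y)} \le \sum_X \sum_Y \|A_X\| \cdot \|B_Y\|\, e^{\lambda(\mathrm{card}\,X + \mathrm{card}\,Y)} = \|A\|_\lambda \|B\|_\lambda.$$

Therefore if $A$, $B$ tend to limits $A_\infty$, $B_\infty$ in $\mathcal{A}_\lambda$, $AB$ tends in $\mathcal{A}_\lambda$ to $A_\infty B_\infty$ and $\|A_\infty B_\infty\|_\lambda \le \|A_\infty\|_\lambda \|B_\infty\|_\lambda$. The rest is clear.

If $\|\Phi\|_\lambda < \infty$ and $A_X \in \mathcal{A}_X$ the formula

$$\delta A_X = i \sum_{Y:\,Y\cap X\ne\emptyset} [\Phi(Y), A_X]$$

defines an element of $\mathcal{A}_\lambda$. If $\lambda > \mu \ge 0$, and $\|\Phi\|_\lambda < \infty$, one also checks that $\delta^m$ defines a map $\mathcal{A}_\lambda \to \mathcal{A}_\mu$ such that

$$\|\delta A\|_\mu \le 2(\lambda - \mu)^{-1} \|A\|_\lambda \|\Phi\|_\lambda, \qquad \|\delta^m A\|_\mu \le \|A\|_\lambda\, m!\, (2(\lambda - \mu)^{-1} \|\Phi\|_\lambda)^m. \tag{6}$$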
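(The proof of (6) is basically the same as that of the standard case $\mu = 0$).

We turn now to the proof of (10) below. We have $\delta_\Lambda = \delta'_\Lambda + \delta''_\Lambda$, where

$$\delta'_\Lambda A = i[H_\Lambda, A], \qquad \delta''_\Lambda A = i\Big[\sum_{a>0} B_{a\Lambda}, A\Big]$$

and (1) and (6) (for $m = 1$) yield

$$\|\delta A\|_\mu \le \|A\|_\lambda \cdot 2(\lambda - \mu)^{-1} \|\Phi\|_\lambda,$$

$$\|\delta'_\Lambda A\|_\mu \le \|A\|_\lambda \cdot 2(\lambda - \mu)^{-1} \|\Phi\|_\lambda,$$

$$\|\delta''_\Lambda A\|_\mu \le \|A\|_\lambda \cdot 2(\lambda - \mu)^{-1} K.$$

Given $\epsilon > 0$ and $A \in \mathcal{A}_\lambda$ we can find $X$ such that $A = A_1 + A_2$ with $A_1 \in \mathcal{A}_X$ and $\|A_2\|_\lambda < \epsilon$. Therefore

$$\begin{aligned}
\|(\delta - \delta_\Lambda) A\|_\mu &\le \|(\delta - \delta_\Lambda) A_1\|_\mu + \|\delta A_2\|_\mu + \|\delta'_\Lambda A_2\|_\mu + \|\delta''_\Lambda A_2\|_\mu \\
&\le \|(\delta - \delta_\Lambda) A_1\|_\mu + \epsilon \cdot 2(\lambda - \mu)^{-1} (2\|\Phi\|_\lambda + K).
\end{aligned} \tag{7}$$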
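Taking $\Lambda \supset \Lambda_X$ we also have

$$\delta''_\Lambda A_1 = 0$$

by (2), and

$$(\delta - \delta'_\Lambda) A_1 = i \sum_{Y:\,Y\not\subset\Lambda,\ Y\cap X\ne\emptyset} [\Phi(Y), A_1]$$

so that

$$\|(\delta - \delta_\Lambda) A_1\|_\mu \le \|A_1\|_\lambda \cdot 2(\lambda - \mu)^{-1} \|\Phi\|_{\Lambda X\lambda}, \tag{8}$$

where $\|\Phi\|_{\Lambda X\lambda} = \sup_{x\in X} \sum_{Y\ni x,\ Y\not\subset\Lambda} e^{(\mathrm{card}\,Y - 1)\lambda} \|\Phi(Y)\|$. When $\Lambda \to L$ we have $\|\Phi\|_{\Lambda X\lambda} \to 0$ and (7), (8) yield

$$\lim_{\Lambda\to L} \|(\delta - \delta_\Lambda) A\|_\mu = 0. \tag{9}$$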
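We can now prove that, if $\|\Phi\|_\lambda < \infty$ and $A \in \mathcal{A}_\lambda$,

$$\lim_{\Lambda\to L} \|\delta^m A - \delta_\Lambda^m A\| = 0. \tag{10}$$

We have indeed

$$\delta^m A - \delta_\Lambda^m A = \sum_{k=0}^{m-1} \delta_\Lambda^{m-k-1} (\delta - \delta_\Lambda)\, \delta^k A$$

and, using (6),

$$\|\delta^k A\|_{2\lambda/3} \le \|A\|_\lambda \cdot k!\, \Big(\frac{6}{\lambda} \|\Phi\|_\lambda\Big)^k,$$

hence, by (9),

$$\lim_{\Lambda\to L} \|(\delta - \delta_\Lambda)\, \delta^k A\|_{\lambda/3} = 0$$

so that, using (6),

$$\|\delta_\Lambda^{m-k-1} (\delta - \delta_\Lambda)\, \delta^k A\| \le \|\delta_\Lambda^{m-k-1} (\delta - \delta_\Lambda)\, \delta^k A\|_0 \le \|(\delta - \delta_\Lambda)\, \delta^k A\|_{\lambda/3}\, (m - k - 1)!\, \Big(\frac{6}{\lambda} \|\Phi\|_\lambda\Big)^{m-k-1},$$

which tends to zero when $\Lambda \to L$. This concludes the proof of (10).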
References

1. Bratteli, O. and Robinson, D.W.: Operator algebras and quantum statistical mechanics I, II. New York: Springer, 1979–1981 (2nd ed. 1987–1997)
2. Haag, R., Hugenholtz, N.M. and Winnink, M.: On the equilibrium states in quantum statistical mechanics. Commun. Math. Phys. 5, 215–236 (1967)
3. Ruelle, D.: Statistical mechanics. Rigorous results. New York: Benjamin, 1969
4. Ruelle, D.: Natural nonequilibrium states in quantum statistical mechanics. J. Statist. Phys. 98, 57–75 (2000)

Communicated by H. Spohn
Commun. Math. Phys. 224, 17 – 31 (2001)
A Rigorous Derivation of the Gross–Pitaevskii Energy Functional for a Two-dimensional Bose Gas

Elliott H. Lieb¹, Robert Seiringer², Jakob Yngvason²

¹ Departments of Physics and Mathematics, Jadwin Hall, Princeton University, P. O. Box 708, Princeton, NJ 08544, USA
² Institut für Theoretische Physik, Universität Wien, Boltzmanngasse 5, 1090 Vienna, Austria
Received: 3 May 2000 / Accepted: 23 October 2000
Dedicated to Joel Lebowitz on the occasion of his 70th birthday

Abstract: We consider the ground state properties of an inhomogeneous two-dimensional Bose gas with a repulsive, short range pair interaction and an external confining potential. In the limit when the particle number $N$ is large but $\bar\rho a^2$ is small, where $\bar\rho$ is the average particle density and $a$ the scattering length, the ground state energy and density are rigorously shown to be given to leading order by a Gross–Pitaevskii (GP) energy functional with a coupling constant $g \sim 1/|\ln(\bar\rho a^2)|$. In contrast to the 3D case the coupling constant depends on $N$ through the mean density. The GP energy per particle depends only on $Ng$. In 2D this parameter is typically so large that the gradient term in the GP energy functional is negligible and the simpler description by a Thomas–Fermi type functional is adequate.

1. Introduction

Motivated by recent experimental realizations of Bose–Einstein condensation the theory of dilute, inhomogeneous Bose gases is currently a subject of intensive studies. Most of this work is based on the assumption that the ground state properties are well described by the Gross–Pitaevskii (GP) energy functional (see the review article [1]). A rigorous derivation of this functional from the basic many-body Hamiltonian in an appropriate limit is not a simple matter, however, and has only been achieved recently for bosons with a short range, repulsive interaction in three spatial dimensions [2]. The present paper is concerned with the justification of the GP functional in two spatial dimensions.

Several new issues arise. One is the form of the nonlinear interaction term in the energy functional for the GP wave function $\Phi$. In three dimensions this term is $4\pi a \int |\Phi|^4$, where $a$ is the scattering length of the interaction potential. The rationale is the well known formula for the energy density of a homogeneous Bose gas, which,
© 2000 by the authors. Reproduction of this work, in its entirety, by any means, is permitted for noncommercial purposes.
for dilute gases with particle density $\rho$, is $4\pi a \rho^2$. This fact has been ‘known’ since the early 50’s but a rigorous proof is fairly recent [3]. In two dimensions the corresponding formula is $4\pi \rho^2 |\ln(\rho a^2)|^{-1}$ as proved in [4] by extension of the method of [3]. The formula was first stated by Schick [5]; other early references to this formula are [6–10]. It would seem natural to consider $4\pi \int |\Phi|^4 |\ln(|\Phi|^2 a^2)|^{-1}$ as the interaction term in the GP functional, and this has indeed been suggested in [11, 12]. Such a term, however, is unnecessarily complicated for the purpose of leading order calculations. In fact, since the logarithm varies only slowly it turns out that one can use the same form as in the three dimensional case, but with an appropriate dimensionless coupling constant $g$ replacing the scattering length, and still retain an exact theory (to leading order in $\rho$).

It is often assumed that a justification of the GP functional depends on the existence of Bose–Einstein condensation. Several remarks can be made about this:

1. We neither assume nor prove the existence of BE condensation, but we do demonstrate a kind of condensation over a distance that is fixed (i.e., non-thermodynamic) but whose length goes to infinity as the density goes to zero;
2. BE condensation does not exist in two dimensions when the temperature is positive, but it can, and most likely does, exist in the ground state;
3. In any event, when the density is low and the temperature is zero it appears to be likely that the system can be described for many purposes in terms of only a few macroscopic order parameters such as the density and phase – at least this is true for the dependence of the ground state energy and density upon an external potential.

The functional we shall consider is

$$\mathcal{E}^{\rm GP}[\Phi] = \int \big(|\nabla\Phi(x)|^2 + V(x)|\Phi(x)|^2 + 4\pi g |\Phi(x)|^4\big)\, d^2x, \tag{1.1}$$

where $V$ is the external confining potential and all integrals are over $\mathbb{R}^2$. The choice of $g$ is an issue on which there has not been unanimous opinion in the recent papers [12–18] on this subject. We shall prove that a right choice is $g = |\ln(\bar\rho a^2)|^{-1}$, where $\bar\rho$ is a mean density that will be defined more precisely below. This mean density depends on the particle number $N$, which implies that the scaling properties of the GP functional are quite different in two and three dimensions. In the three-dimensional case the natural parameter is $Na/a_{\rm osc}$, with $a_{\rm osc}$ being the length scale defined by the external confining potential. If $a/a_{\rm osc}$ is scaled like $1/N$ as $N \to \infty$ this parameter is fixed and the gradient term $\int |\nabla\Phi|^2$ in the GP functional is of the same order as the other terms. In two dimensions the corresponding parameter is $N |\ln(\bar\rho a^2)|^{-1}$. For a quadratic external potential $\bar\rho$ behaves like $N^{1/2}/a_{\rm osc}^2$ and hence the parameter can only be kept fixed if $a/a_{\rm osc}$ decreases exponentially with $N$. A slower decrease means that the parameter tends to infinity. This corresponds to the so-called Thomas–Fermi (TF) limit where the gradient term has been dropped altogether and the functional is

$$\mathcal{E}^{\rm TF}[\rho] = \int \big(V(x)\rho(x) + 4\pi g \rho(x)^2\big)\, d^2x, \tag{1.2}$$

defined for nonnegative functions $\rho$. Our main result, stated in Theorems 1.3 and 1.4 below, is that minimization of (1.2) reproduces correctly the ground state energy and density of the many-body Hamiltonian in the limit when $N \to \infty$, $\bar\rho a^2 \to 0$, but $N |\ln(\bar\rho a^2)|^{-1} \to \infty$. Only in the exceptional situation that $N |\ln(\bar\rho a^2)|^{-1}$ stays bounded is there need for the full GP functional (1.1), cf. Theorems 1.1 and 1.2.
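To make the Thomas–Fermi functional (1.2) concrete, the sketch below minimizes it in closed form for one assumed potential and checks the result by numerical quadrature. Everything specific here is an assumption for illustration and is not taken from the text above: the potential is $V(x) = |x|^2$, the minimizer under the constraint $\int \rho = N$ is obtained from a Lagrange multiplier $\mu$ as $\rho(x) = (\mu - V(x))_+ / (8\pi g)$, and the values of `N` and `g` are arbitrary. For this potential the normalization gives $\mu = 4\sqrt{gN}$ and the energy is $\tfrac{8}{3}\, g^{1/2} N^{3/2}$.

```python
import numpy as np

def tf_chemical_potential(n_particles, g):
    """mu fixed by int rho = N for rho = (mu - r^2)_+ / (8 pi g), assuming V(x) = |x|^2."""
    return 4.0 * np.sqrt(g * n_particles)

def tf_density(r, n_particles, g):
    """Thomas-Fermi minimizer for the assumed quadratic potential V = r^2."""
    mu = tf_chemical_potential(n_particles, g)
    return np.maximum(mu - r**2, 0.0) / (8.0 * np.pi * g)

def radial_integral(f, r):
    """Trapezoid rule for the 2D integral of a radial function: int f(r) 2 pi r dr."""
    y = f * 2.0 * np.pi * r
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(r)))

N, g = 1000.0, 0.05                      # hypothetical particle number and coupling
mu = tf_chemical_potential(N, g)
r = np.linspace(0.0, np.sqrt(mu), 200001)  # the density vanishes beyond r = sqrt(mu)
rho = tf_density(r, N, g)

particle_number = radial_integral(rho, r)
energy = radial_integral(r**2 * rho + 4.0 * np.pi * g * rho**2, r)
energy_closed_form = (8.0 / 3.0) * np.sqrt(g) * N**1.5
print(particle_number, energy, energy_closed_form)
```

In this example the energy per particle is $\tfrac{8}{3}\sqrt{Ng}$, a function of the product $Ng$ alone, in line with the statement in the abstract that the GP energy per particle depends only on $Ng$.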
Derivation of Gross–Pitaevskii Energy Functional for 2D Bose Gas
We shall now describe the setting more precisely. The starting point is the Hamiltonian for $N$ identical bosons in an external potential $V$ and with pair interaction $v$,
\[ H^{(N)} = \sum_{i=1}^{N}\left(-\nabla_i^2 + V(x_i)\right) + \sum_{i<j} v(x_i - x_j), \tag{1.3} \]
acting on the totally symmetric wave functions in $\otimes^N L^2(\mathbb{R}^2)$. Units have been chosen so that $\hbar = 2m = 1$, where $m$ is the particle mass. We assume that $v$ is nonnegative and spherically symmetric with a finite scattering length $a$. (For the definition of scattering length in two dimensions see the appendix.) The external potential should be continuous and tend to $\infty$ as $|x|\to\infty$. It is then possible and convenient to shift the energy scale so that $\min_x V(x) = 0$. For the TF limit theorem we shall require some additional properties of $V$ to be specified later.
The ground state energy $\omega$ of the one-particle operator $-\nabla^2 + V$ is a natural energy unit and gives rise to the length unit $a_{\rm osc}\equiv\omega^{-1/2}$. In the sequel we shall be considering a limit where $a/a_{\rm osc}$ tends to zero while $N\to\infty$. Experimentally $a/a_{\rm osc}$ can be changed in two ways: One can either vary $a_{\rm osc}$ or $a$. The first alternative is usually simpler in practice but very recently a direct tuning of the scattering length itself has also been shown to be feasible [19]. Mathematically, both alternatives are equivalent, of course. The first corresponds to writing $V(x) = a_{\rm osc}^{-2}\hat V(x/a_{\rm osc})$ and keeping $\hat V$ and $v$ fixed. The second corresponds to writing the interaction potential as $v(x) = a^{-2}\hat v(x/a)$, where $\hat v$ has unit scattering length, and keeping $V$ and $\hat v$ fixed. This is equivalent to the first, since for given $\hat V$ and $\hat v$ the ground state energy of (1.3), measured in units of $\omega$, depends only on $N$ and $a/a_{\rm osc}$. In the dilute limit when $a$ is much smaller than the mean particle distance, the energy becomes independent of $\hat v$.
We shall measure all energies in terms of $\omega$ and lengths in terms of $a_{\rm osc}$ and regard $\hat V$ and $\hat v$ as fixed. The notation $E^{\rm QM}(N,a)$ for the ground state energy of (1.3) is then justified. The quantum mechanical particle density is defined by
\[ \rho^{\rm QM}_{N,a}(x) = N\int|\Psi^{(N)}(x,x_2,\dots,x_N)|^2\,d^2x_2\cdots d^2x_N, \tag{1.4} \]
where $\Psi^{(N)}$ is a ground state for (1.3).
The GP functional (1.1) has an obvious domain of definition (cf. Eq. (2.1) in [2]). The infimum of $E^{\rm GP}[\Phi]$ under the condition $\int|\Phi|^2 = N$ will be denoted by $E^{\rm GP}(N,g)$. The infimum is obtained for a unique, positive function, denoted $\Phi^{\rm GP}_{N,g}$, and the GP density is defined as $\rho^{\rm GP}_{N,g}(x) = \Phi^{\rm GP}_{N,g}(x)^2$.
The ground state energy of the TF functional (1.2) with the subsidiary condition $\int\rho = N$ is denoted $E^{\rm TF}(N,g)$. The corresponding minimizer can be written explicitly; it is
\[ \rho^{\rm TF}_{N,g}(x) = \frac{1}{8\pi g}\left[\mu^{\rm TF} - V(x)\right]_+, \tag{1.5} \]
where $[t]_+\equiv\max\{t,0\}$ and $\mu^{\rm TF}$ is chosen so that the normalization condition $\int\rho^{\rm TF}_{N,g} = N$ holds.
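As an illustration of (1.5), the following sketch evaluates the TF minimizer for the radial trap $V(x) = |x|^s$, which is an assumed example potential and not one singled out in the text. For this choice the normalization condition can be solved in closed form, $\mu^{\rm TF} = (8gN(s+2)/s)^{s/(s+2)}$, and the code checks $\int\rho^{\rm TF}_{N,g} = N$ by a midpoint rule. All function names are hypothetical.

```python
import math

def mu_tf(N, g, s):
    """Closed-form TF chemical potential for V(x) = |x|**s in two dimensions.

    From int rho = N with rho = [mu - V]_+ / (8 pi g) one gets
    mu**((s + 2)/s) = 8 g N (s + 2)/s.
    """
    return (8.0 * g * N * (s + 2.0) / s) ** (s / (s + 2.0))

def rho_tf(r, N, g, s):
    """TF density (1.5) at radius r for V(x) = |x|**s."""
    return max(mu_tf(N, g, s) - r ** s, 0.0) / (8.0 * math.pi * g)

def total_mass(N, g, s, steps=20000):
    """Midpoint-rule value of int rho d^2x over the support of the TF density."""
    R = mu_tf(N, g, s) ** (1.0 / s)          # edge of the support, V(R) = mu
    h = R / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        total += rho_tf(r, N, g, s) * 2.0 * math.pi * r * h   # d^2x = 2 pi r dr
    return total

print(mu_tf(1.0, 1.0, 2.0))         # chemical potential for V = |x|^2, N = g = 1
print(total_mass(50.0, 0.3, 2.0))   # should reproduce N = 50 up to quadrature error
```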
E. H. Lieb, R. Seiringer, J. Yngvason
We now define the mean density $\bar\rho$ as the average of the TF density $\rho^{\rm TF}_{N,1}$ at coupling constant $g = 1$, weighted with $N^{-1}\rho^{\rm TF}_{N,1}$, i.e.,
\[ \bar\rho = \frac{1}{N}\int\rho^{\rm TF}_{N,1}(x)^2\,d^2x. \tag{1.6} \]
It is clear that $\bar\rho$ depends on $N$ and when we wish to emphasize this we write $\bar\rho_N$. The definition (1.6) has the advantage that $\bar\rho$ is easily computed; for instance, if $V(x)\sim|x|^s$ for some $s > 0$, then $\bar\rho_N\sim N^{s/(s+2)}$. It may appear more natural to define $\bar\rho$ self-consistently as $\bar\rho = \frac{1}{N}\int\rho^{\rm TF}_{N,g}(x)^2\,d^2x$ with $g = |\ln(\bar\rho a^2)|^{-1}$, which amounts to solving a nonlinear equation for $\bar\rho$. Also, the TF density could be replaced by the GP density. However, since $\bar\rho$ will only appear under a logarithm such sophisticated definitions are not needed for the leading order result we are after. The simple formula (1.6) is adequate for our purpose, but it should be kept in mind that the self-consistent definition may be relevant in computations beyond the leading order.
With this notation we can now state the two dimensional analogue of Theorem I.1 in [2].
Theorem 1.1 (GP limit for the energy). If, for $N\to\infty$, $a^2\bar\rho_N\to 0$ with $N/|\ln(a^2\bar\rho_N)|$ fixed, then
\[ \lim_{N\to\infty}\frac{E^{\rm QM}(N,a)}{E^{\rm GP}(N,1/|\ln(a^2\bar\rho_N)|)} = 1. \tag{1.7} \]
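The scaling $\bar\rho_N\sim N^{s/(s+2)}$ quoted after (1.6) can be checked numerically. The sketch below assumes the pure power-law trap $V(x) = |x|^s$ (an illustrative choice), evaluates (1.6) by a midpoint rule, and fits the exponent between two particle numbers; the function names and the chosen values of $N$ are hypothetical.

```python
import math

def mean_density(N, s, steps=20000):
    """rho_bar from (1.6) for V(x) = |x|**s at coupling g = 1 (midpoint rule)."""
    mu = (8.0 * N * (s + 2.0) / s) ** (s / (s + 2.0))   # from int rho = N
    R = mu ** (1.0 / s)                                  # edge of the support
    h = R / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        rho = (mu - r ** s) / (8.0 * math.pi)
        total += rho * rho * 2.0 * math.pi * r * h       # int rho^2 d^2x
    return total / N

def fitted_exponent(s, N1=1.0e3, N2=1.0e6):
    """Slope of log(rho_bar) against log(N) between two particle numbers."""
    return math.log(mean_density(N2, s) / mean_density(N1, s)) / math.log(N2 / N1)

for s in (1.0, 2.0, 4.0):
    # second and third columns should agree: fitted exponent vs. s/(s+2)
    print(s, fitted_exponent(s), s / (s + 2.0))
```

For $s = 2$ the integral can also be done by hand and gives $\bar\rho_N = N^{1/2}/(3\pi)$ for this particular potential.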
The corresponding theorem for the density, cf. Theorem I.2 in [2], is
Theorem 1.2 (GP limit for the density). If, for $N\to\infty$, $a^2\bar\rho_N\to 0$ with $\gamma\equiv N/|\ln(a^2\bar\rho_N)|$ fixed, then
\[ \lim_{N\to\infty}\frac{1}{N}\rho^{\rm QM}_{N,a}(x) = \rho^{\rm GP}_{1,\gamma}(x) \tag{1.8} \]
in the sense of weak convergence in $L^1(\mathbb{R}^2)$.
These theorems, however, are not particularly useful in the two dimensional case, because the hypothesis that $N/|\ln(a^2\bar\rho_N)|$ stays bounded requires an exponential decrease of $a$ with $N$. As remarked above, the TF limit, where $N/|\ln(a^2\bar\rho_N)|\to\infty$, is much more relevant. Our treatment of this limit requires that $V$ is asymptotically homogeneous and sufficiently regular in a sense made precise below. This condition can be relaxed, but it seems adequate for most practical applications and simplifies things considerably.
Definition 1.1. We say that $V$ is asymptotically homogeneous of order $s > 0$ if there is a function $W$ with $W(x)\neq 0$ for $x\neq 0$ such that
\[ \frac{\lambda^{-s}V(\lambda x) - W(x)}{1 + |W(x)|}\to 0\quad\text{as }\lambda\to\infty \tag{1.9} \]
and the convergence is uniform in $x$.
The function $W$ is clearly uniquely determined and homogeneous of order $s$, i.e., $W(\lambda x) = \lambda^s W(x)$ for all $\lambda\geq 0$.
Theorem 1.3 (TF limit for the energy). Suppose $V$ is asymptotically homogeneous of order $s > 0$ and its scaling limit $W$ is locally Hölder continuous, i.e., $|W(x) - W(y)|\leq({\rm const.})|x - y|^\alpha$ for $|x|, |y| = 1$ for some fixed $\alpha > 0$. If, for $N\to\infty$, $a^2\bar\rho_N\to 0$ but $N/|\ln(a^2\bar\rho_N)|\to\infty$, then
\[ \lim_{N\to\infty}\frac{E^{\rm QM}(N,a)}{E^{\rm TF}(N,1/|\ln(a^2\bar\rho_N)|)} = 1. \tag{1.10} \]
To state the corresponding theorem for the density we need the minimizer of (1.2) with $g = 1$, $V$ replaced by $W$, and normalization $\int\rho = 1$. We shall denote this minimizer by $\tilde\rho^{\rm TF}_{1,1}$; an explicit formula is
\[ \tilde\rho^{\rm TF}_{1,1}(x) = \frac{1}{8\pi}\left[\tilde\mu^{\rm TF} - W(x)\right]_+, \tag{1.11} \]
where $\tilde\mu^{\rm TF}$ is determined by the normalization condition.
Theorem 1.4 (TF limit for the density). Let $V$ satisfy the same hypothesis as in Theorem 1.3. If, for $N\to\infty$, $a^2\bar\rho_N\to 0$ but $\gamma = N/|\ln(a^2\bar\rho_N)|\to\infty$, then
\[ \lim_{N\to\infty}\frac{\gamma^{2/(s+2)}}{N}\rho^{\rm QM}_{N,a}(\gamma^{1/(s+2)}x) = \tilde\rho^{\rm TF}_{1,1}(x) \tag{1.12} \]
in the sense of weak convergence in $L^1(\mathbb{R}^2)$.
Remark 1.1. For large $N$, $\bar\rho_N$ behaves like $({\rm const.})N^{s/(s+2)}$. Moreover, prefactors are unimportant in the limit $N\to\infty$, because $\bar\rho_N$ stands under a logarithm. Hence Theorems 1.3 and 1.4 could also be stated with $N^{s/(s+2)}$ in place of $\bar\rho_N$.
The proofs of these theorems follow from upper and lower bounds on the ground state energy $E^{\rm QM}(N,a)$ that are derived in Sects. 3 and 4. For these bounds some properties of the minimizers of the functionals (1.1) and (1.2), discussed in the following section, are needed.

2. GP and TF Theory
In this section we consider the functionals (1.1) and (1.2) with an arbitrary positive coupling constant $g$. Existence and uniqueness of minimizers is shown in the same way as in Theorem II.1 in [2]. The GP energy $E^{\rm GP}(N,g)$ has the simple scaling property $E^{\rm GP}(N,g) = N E^{\rm GP}(1,Ng)$. Likewise, $N^{-1/2}\Phi^{\rm GP}_{N,g}\equiv\phi^{\rm GP}_\gamma$ depends only on
\[ \gamma\equiv Ng \tag{2.1} \]
and satisfies the normalization condition $\int|\phi^{\rm GP}_\gamma|^2 = 1$. The variational equation (GP equation) for the GP minimization problem, written in terms of $\phi^{\rm GP}_\gamma$, is
\[ -\Delta\phi^{\rm GP}_\gamma + V\phi^{\rm GP}_\gamma + 8\pi\gamma(\phi^{\rm GP}_\gamma)^3 = \mu^{\rm GP}(\gamma)\phi^{\rm GP}_\gamma, \tag{2.2} \]
where the Lagrange multiplier (chemical potential) $\mu^{\rm GP}(\gamma)$ is determined by the subsidiary normalization condition. Multiplying (2.2) with $\phi^{\rm GP}_\gamma$ and integrating we obtain
\[ \mu^{\rm GP}(\gamma) = E^{\rm GP}(1,\gamma) + 4\pi\gamma\int\phi^{\rm GP}_\gamma(x)^4\,d^2x. \tag{2.3} \]
For the upper bound on the quantum mechanical energy in the next section we shall need a bound on the absolute value of the minimizer $\phi^{\rm GP}_\gamma$.
Lemma 2.1 (Upper bound for the GP minimizer).
\[ \|\phi^{\rm GP}_\gamma\|_\infty^2\leq\frac{\mu^{\rm GP}(\gamma)}{8\pi\gamma}. \tag{2.4} \]
Proof. $\phi^{\rm GP}_\gamma$ is a continuous and positive function that satisfies the variational equation
\[ -\Delta\phi^{\rm GP}_\gamma + U\phi^{\rm GP}_\gamma = \mu^{\rm GP}\phi^{\rm GP}_\gamma \tag{2.5} \]
with $U = V + 8\pi\gamma(\phi^{\rm GP}_\gamma)^2$. Let $B = \{x\,|\,\phi^{\rm GP}_\gamma(x)^2 > \mu^{\rm GP}/(8\pi\gamma)\}$. Since $V\geq 0$ we see that $-\Delta\phi^{\rm GP}_\gamma\leq 0$ on $B$, i.e., $\phi^{\rm GP}_\gamma$ is subharmonic on $B$. Hence $\phi^{\rm GP}_\gamma$ achieves its maximum on the boundary of $B$, where $\phi^{\rm GP}_\gamma(x)^2 = \mu^{\rm GP}/(8\pi\gamma)$, so $B$ is empty.
The ground state energy $E^{\rm TF}(N,g)$ of the TF functional (1.2) scales in the same way as $E^{\rm GP}(N,g)$, i.e., $E^{\rm TF}(N,g) = N E^{\rm TF}(1,Ng)$, and the corresponding minimizer $\rho^{\rm TF}_{N,g}$ is equal to $N\rho^{\rm TF}_{1,Ng}$. For short, we shall denote $\rho^{\rm TF}_{1,\gamma}$ by $\rho^{\rm TF}_\gamma$. By (1.5) we have
\[ \rho^{\rm TF}_\gamma(x) = \frac{1}{8\pi\gamma}\left[\mu^{\rm TF}(\gamma) - V(x)\right]_+, \tag{2.6} \]
with the chemical potential $\mu^{\rm TF}(\gamma)$ determined by the normalization condition $\int\rho^{\rm TF}_\gamma = 1$. In the same way as in (2.3) we have
\[ \mu^{\rm TF}(\gamma) = E^{\rm TF}(1,\gamma) + 4\pi\gamma\int\rho^{\rm TF}_\gamma(x)^2\,d^2x. \tag{2.7} \]
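The chemical potential can also be computed from a variational principle:
Lemma 2.2 (Variational principle for $\mu^{\rm TF}$).
\[ \mu^{\rm TF}(\gamma) = \inf_{\rho\geq 0,\ \int\rho = 1}\left(\int V\rho + 8\pi\gamma\|\rho\|_\infty\right). \tag{2.8} \]
Proof. Obviously, the infimum is achieved for a multiple of a characteristic function for some measurable set $R\subset\mathbb{R}^2$. If $|R|$ denotes the Lebesgue measure of $R$, then
\[ \inf_{\int\rho = 1}\left(\int V\rho + 8\pi\gamma\|\rho\|_\infty\right) = \inf_R\frac{1}{|R|}\left(\int_R V + 8\pi\gamma\right) \tag{2.9} \]
\[ = \inf_R\frac{1}{|R|}\left(\int_R\left(V - \mu^{\rm TF}(\gamma)\right) + 8\pi\gamma + \mu^{\rm TF}(\gamma)|R|\right). \tag{2.10} \]
Now $\int_R(V - \mu^{\rm TF}(\gamma))\geq -8\pi\gamma$, with equality for
\[ \{x\,|\,V(x) < \mu^{\rm TF}(\gamma)\}\subseteq R\subseteq\{x\,|\,V(x)\leq\mu^{\rm TF}(\gamma)\}. \tag{2.11} \]
Corollary 2.1 (Properties of $\mu^{\rm TF}(\gamma)$). $\mu^{\rm TF}(\gamma)$ is a concave and monotonically increasing function of $\gamma$ with $\mu^{\rm TF}(0) = 0$. Hence $\mu^{\rm TF}(\gamma)/\gamma$ is decreasing in $\gamma$. Moreover, $\mu^{\rm TF}(\gamma)\to\infty$ and $\mu^{\rm TF}(\gamma)/\gamma\to 0$ as $\gamma\to\infty$.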
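The variational principle of Lemma 2.2 can be tried out on a simple case. The sketch below assumes $V(x) = |x|^2$ and restricts the sets $R$ in (2.9) to disks centered at the origin, which is where the infimum is attained for a radially increasing potential. For a disk of radius $t$ the expression in (2.9) equals $t^2/2 + 8\gamma/t^2$, whose minimum $4\sqrt{\gamma}$ coincides with the chemical potential obtained from the normalization condition in (2.6). Function names and search bounds are hypothetical.

```python
import math

def disk_value(t, gamma):
    """(1/|R|) (int_R V + 8 pi gamma) for V(x) = |x|**2 and R a disk of radius t."""
    area = math.pi * t * t
    integral_V = math.pi * t ** 4 / 2.0      # int_0^t r^2 * 2 pi r dr
    return (integral_V + 8.0 * math.pi * gamma) / area

def mu_from_variational_principle(gamma, lo=1e-3, hi=50.0, iters=200):
    """Minimize disk_value over the radius by ternary search (unimodal in t)."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if disk_value(m1, gamma) < disk_value(m2, gamma):
            hi = m2
        else:
            lo = m1
    return disk_value(0.5 * (lo + hi), gamma)

for gamma in (1.0, 4.0, 9.0):
    # compare with 4*sqrt(gamma), the value fixed by int rho = 1 for this trap
    print(gamma, mu_from_variational_principle(gamma), 4.0 * math.sqrt(gamma))
```

The output also illustrates Corollary 2.1 for this potential: $\mu^{\rm TF}(\gamma) = 4\sqrt{\gamma}$ is concave and increasing, and $\mu^{\rm TF}(\gamma)/\gamma\to 0$.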
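Proof. Immediate consequences of Lemma 2.2, using that $\min_x V(x) = 0$ and $\lim_{|x|\to\infty}V(x) = \infty$.
Note that since $E^{\rm TF}(1,\gamma)\geq\frac{1}{2}\mu^{\rm TF}(\gamma)$ we also see that $E^{\rm TF}(1,\gamma)\to\infty$ with $\gamma$. In this limit the GP energy converges to the TF energy, provided the external potential satisfies a mild regularity and growth condition:
Lemma 2.3 (TF limit of the GP energy). Suppose for some constants $\alpha > 0$, $L_1$ and $L_2$,
\[ |V(x) - V(y)|\leq L_1|x - y|^\alpha e^{L_2|x-y|}(1 + V(x)). \tag{2.12} \]
Then
\[ \lim_{\gamma\to\infty}\frac{E^{\rm GP}(1,\gamma)}{E^{\rm TF}(1,\gamma)} = 1. \tag{2.13} \]
Proof. It is clear that $E^{\rm TF}(1,\gamma)\leq E^{\rm GP}(1,\gamma)$. For the other direction, we use $(j_\varepsilon*\rho^{\rm TF}_\gamma)^{1/2}$ as a test function for $E^{\rm GP}$, where
\[ j_\varepsilon(x) = \frac{1}{2\pi\varepsilon^2}\exp\left(-\frac{1}{\varepsilon}|x|\right). \tag{2.14} \]
Note that $\int j_\varepsilon = 1$ and $|\nabla j_\varepsilon| = \varepsilon^{-1}j_\varepsilon$. Therefore
\[ E^{\rm GP}(1,\gamma)\leq\int\left(\frac{|\nabla j_\varepsilon*\rho^{\rm TF}_\gamma|^2}{4\,j_\varepsilon*\rho^{\rm TF}_\gamma} + V\,(j_\varepsilon*\rho^{\rm TF}_\gamma) + 4\pi\gamma(j_\varepsilon*\rho^{\rm TF}_\gamma)^2\right)\leq\frac{1}{4\varepsilon^2} + \int(j_\varepsilon*V)\rho^{\rm TF}_\gamma + 4\pi\gamma\int(\rho^{\rm TF}_\gamma)^2, \tag{2.15} \]
where we have used convexity for the last term. Moreover,
\[ \int(j_\varepsilon*V - V)\rho^{\rm TF}_\gamma = \int d^2x\,d^2y\,j_\varepsilon(x-y)\left(V(x) - V(y)\right)\rho^{\rm TF}_\gamma(x)\leq\frac{L_1}{2\pi\varepsilon^2}\int d^2x\,d^2y\,|x-y|^\alpha e^{(-\varepsilon^{-1}+L_2)|x-y|}(1 + V(x))\rho^{\rm TF}_\gamma(x)\leq({\rm const.})\,\varepsilon^\alpha\left(1 + E^{\rm TF}(1,\gamma)\right), \tag{2.16} \]
as long as $\varepsilon < L_2^{-1}$. So we have
\[ E^{\rm GP}(1,\gamma)\leq\left(1 + ({\rm const.})\,\varepsilon^\alpha\right)E^{\rm TF}(1,\gamma) + \frac{1}{4\varepsilon^2} + ({\rm const.})\,\varepsilon^\alpha. \tag{2.17} \]
Optimizing over $\varepsilon$ gives as a final result
\[ E^{\rm GP}(1,\gamma)\leq E^{\rm TF}(1,\gamma)\left(1 + ({\rm const.})E^{\rm TF}(1,\gamma)^{-\alpha/(\alpha+2)}\right). \tag{2.18} \]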
Condition (2.12) is in particular fulfilled if $V$ is homogeneous of some order $s > 0$ and locally Hölder continuous. In this case,
\[ E^{\rm TF}(1,\gamma) = \gamma^{s/(s+2)}E^{\rm TF}(1,1) \tag{2.19} \]
and
\[ \gamma^{2/(s+2)}\rho^{\rm TF}_\gamma(\gamma^{1/(s+2)}x) = \rho^{\rm TF}_{1,1}(x). \tag{2.20} \]
By (2.7) we also have
\[ \mu^{\rm TF}(\gamma) = \gamma^{s/(s+2)}\mu^{\rm TF}(1). \tag{2.21} \]
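The scaling relations (2.19) and (2.21) can be verified numerically without using any closed formula. The sketch below treats a radial potential as a black box, finds $\mu^{\rm TF}(\gamma)$ by bisection on the normalization $\int\rho^{\rm TF}_\gamma = 1$, and evaluates the TF energy by a midpoint rule. The quadratic trap, the cutoff `r_max` and the grid size are illustrative assumptions, and `tf_solve` is a hypothetical helper name.

```python
import math

def tf_solve(gamma, V, r_max=50.0, steps=20000):
    """Return (mu, E) for the TF problem (1.2) with N = 1, g = gamma, radial V(r).

    Assumes the support of the minimizer lies inside the disk of radius r_max.
    """
    h = r_max / steps
    radii = [(k + 0.5) * h for k in range(steps)]
    weights = [2.0 * math.pi * r * h for r in radii]      # d^2x = 2 pi r dr
    values = [V(r) for r in radii]

    def mass(mu):
        total = sum(max(mu - v, 0.0) * w for v, w in zip(values, weights))
        return total / (8.0 * math.pi * gamma)

    lo, hi = 0.0, 1.0
    while mass(hi) < 1.0:                 # bracket the chemical potential
        hi *= 2.0
    for _ in range(60):                   # bisection on int rho = 1
        mid = 0.5 * (lo + hi)
        if mass(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    mu = 0.5 * (lo + hi)

    energy = 0.0
    for v, w in zip(values, weights):
        rho = max(mu - v, 0.0) / (8.0 * math.pi * gamma)
        energy += (v * rho + 4.0 * math.pi * gamma * rho * rho) * w
    return mu, energy

s = 2.0
mu1, e1 = tf_solve(1.0, lambda r: r ** s)
mu16, e16 = tf_solve(16.0, lambda r: r ** s)
# both ratios should be close to 16**(s/(s+2))
print(e16 / e1, mu16 / mu1, 16.0 ** (s / (s + 2.0)))
```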
If $V$ is asymptotically homogeneous with a locally Hölder continuous limiting function $W$, we can prove corresponding formulas for the limit $\gamma\to\infty$. This is the content of the next theorem, where we have included results on the GP → TF limit as well:
Theorem 2.1 (Scaling limits). Suppose $V$ satisfies the condition of Theorem 1.3. Let $\tilde E^{\rm TF}(1,1)$ be the minimum of the TF functional (1.2) with $g = 1$ and $N = 1$ and $V$ replaced by $W$, and let $\tilde\rho^{\rm TF}_{1,1}$ be the corresponding minimizer. Then
(i) $\lim_{\gamma\to\infty}E^{\rm GP}(1,\gamma)/\gamma^{s/(s+2)} = \lim_{\gamma\to\infty}E^{\rm TF}(1,\gamma)/\gamma^{s/(s+2)} = \tilde E^{\rm TF}(1,1)$.
(ii) $\lim_{\gamma\to\infty}\gamma^{2/(s+2)}\rho^{\rm GP}_{1,\gamma}(\gamma^{1/(s+2)}x) = \tilde\rho^{\rm TF}_{1,1}(x)$, strongly in $L^2(\mathbb{R}^2)$.
(iii) $\lim_{\gamma\to\infty}\gamma^{2/(s+2)}\rho^{\rm TF}_\gamma(\gamma^{1/(s+2)}x) = \tilde\rho^{\rm TF}_{1,1}(x)$, uniformly in $x$.
Proof. With the demanded properties of $V$, (2.13) holds. Using this and (1.9) one easily verifies (i). Moreover, $\gamma^{2/(s+2)}\rho^{\rm GP}_{1,\gamma}(\gamma^{1/(s+2)}x)$ is a minimizing sequence for the functional in question, so we can conclude as in Theorem II.2 in [2] that it converges to $\tilde\rho^{\rm TF}_{1,1}(x)$ strongly in $L^2$, proving (ii). (Remark: In Eq. (2.10) in [2] there is a misprint, one should have $\tilde\rho^{\rm GP}_{1,Na}$ instead of $\rho^{\rm GP}_{1,Na}$ on the left side.) To see (iii) let us define
\[ \widehat\rho_\gamma(x) = \gamma^{2/(s+2)}\rho^{\rm TF}_\gamma(\gamma^{1/(s+2)}x). \tag{2.22} \]
We can write
\[ \widehat\rho_\gamma(x) = \frac{1}{8\pi}\left[\gamma^{-s/(s+2)}\mu^{\rm TF}(\gamma) - W(x) - \varepsilon(\gamma,x)\right]_+ \tag{2.23} \]
with
\[ \varepsilon(\gamma,x) = \gamma^{-s/(s+2)}V(\gamma^{1/(s+2)}x) - W(x). \tag{2.24} \]
By assumption, $|\varepsilon(\gamma,x)| < \delta(\gamma)(1 + W(x))$ for some $\delta(\gamma)$ with $\lim_{\gamma\to\infty}\delta(\gamma) = 0$. Because $\int\widehat\rho_\gamma = 1$ for all $\gamma$, we see from Eq. (2.23) that $\mu^{\rm TF}(\gamma)\gamma^{-s/(s+2)}$ converges to some $c$ as $\gamma\to\infty$. Moreover, we can conclude that the support of $\widehat\rho_\gamma$ is for large $\gamma$ contained in some bounded set $B$ independent of $\gamma$. Therefore
\[ 1 = \lim_{\gamma\to\infty}\int\widehat\rho_\gamma = (8\pi)^{-1}\int\left[c - W(x)\right]_+ \tag{2.25} \]
by dominated convergence, so $c$ is equal to the $\tilde\mu^{\rm TF}$ of Eq. (1.11). Now
\[ \widehat\rho_\gamma(x) = \frac{1}{8\pi}\left[\tilde\mu^{\rm TF} - W(x) - \bar\varepsilon(\gamma,x)\right]_+ \tag{2.26} \]
with
\[ \bar\varepsilon(\gamma,x) = \varepsilon(\gamma,x) + \tilde\mu^{\rm TF} - \gamma^{-s/(s+2)}\mu^{\rm TF}(\gamma). \tag{2.27} \]
Again $|\bar\varepsilon(\gamma,x)| < \bar\delta(\gamma)(1 + W(x))$ for some $\bar\delta(\gamma)$ with $\lim_{\gamma\to\infty}\bar\delta(\gamma) = 0$. By Eqs. (1.11) and (2.26) we thus have
\[ \|\widehat\rho_\gamma - \tilde\rho^{\rm TF}_{1,1}\|_\infty < C\bar\delta(\gamma) \tag{2.28} \]
with $C = (8\pi)^{-1}\sup_{x\in B}(1 + W(x)) < \infty$.
The mean density for the TF theory is defined by
\[ \bar\rho_\gamma\equiv N\int\rho^{\rm TF}_\gamma(x)^2\,d^2x. \tag{2.29} \]
For $\gamma = N$, i.e., $g = 1$ this is the same as (1.6). It satisfies
Lemma 2.4 (Bounds on $\bar\rho_\gamma$). For some constant $C > 0$,
\[ N\frac{\mu^{\rm TF}(\gamma)}{8\pi\gamma}\geq\bar\rho_\gamma\geq CN\frac{\mu^{\rm TF}(\gamma)}{\gamma}. \tag{2.30} \]
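Proof. The upper bound is trivial. Because $\widehat\rho_\gamma$, defined in (2.22), converges uniformly to $\tilde\rho^{\rm TF}_{1,1}$ and $\mu^{\rm TF}(\gamma)\gamma^{-s/(s+2)}\to\tilde\mu^{\rm TF}$ as $\gamma\to\infty$, we have the lower bound
\[ \frac{\gamma}{N\mu^{\rm TF}(\gamma)}\bar\rho_\gamma\geq\gamma^{s/(s+2)}\mu^{\rm TF}(\gamma)^{-1}\left(\int(\tilde\rho^{\rm TF}_{1,1})^2 - 2\|\tilde\rho^{\rm TF}_{1,1} - \widehat\rho_\gamma\|_\infty\right) > C \tag{2.31} \]
for some $C > 0$.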
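Remark 2.1. With $V$ asymptotically homogeneous of order $s$, $\mu^{\rm TF}(\gamma)\gamma^{-s/(s+2)}$ converges as $\gamma\to\infty$, i.e. $\mu^{\rm TF}(\gamma)\sim\gamma^{s/(s+2)}$ for large $\gamma$. So the mean TF density for coupling constant $g = 1$, defined in (1.6), has the asymptotic behavior $\bar\rho\sim N^{s/(s+2)}$.

3. Upper Bound to the QM Energy
As in the three dimensional case, cf. Eqs. (3.29) and (3.27) in [2], one has the upper bound
\[ \frac{E^{\rm QM}(N,a)}{N}\leq\frac{\int|\nabla\phi^{\rm GP}_\gamma|^2 + V(\phi^{\rm GP}_\gamma)^2}{1 - N\|\phi^{\rm GP}_\gamma\|_\infty^2 I} + \frac{NJ\int(\phi^{\rm GP}_\gamma)^4 + \frac{2}{3}N^2(\|\phi^{\rm GP}_\gamma\|_\infty^2 K)^2}{(1 - N\|\phi^{\rm GP}_\gamma\|_\infty^2 I)^2}, \tag{3.1} \]
where we have implicitly used that $-\Delta\phi^{\rm GP}_\gamma + V\phi^{\rm GP}_\gamma\geq 0$, which is justified by Lemma 2.1. The coefficients $I$, $J$ and $K$ are given by Eqs. (2.4)–(2.10) in [4]. They depend on the scattering length and a parameter $b$. We choose $\gamma = N/|\ln(a^2\bar\rho)|$ and $b = \bar\rho^{-1/2}$. (Recall that $\bar\rho$ is short for $\bar\rho_N$.) With this choice we have (as long as $a^2\bar\rho < 1$)
\[ J = \frac{4\pi}{|\ln(a^2\bar\rho)|}, \tag{3.2} \]
and the error terms
\[ N\|\phi^{\rm GP}_\gamma\|_\infty^2 I\leq({\rm const.})\frac{\mu^{\rm GP}(\gamma)}{\bar\rho}\left(1 + O(|\ln(a^2\bar\rho)|^{-1})\right) \tag{3.3} \]
and
\[ K^2N^2\|\phi^{\rm GP}_\gamma\|_\infty^4\leq({\rm const.})E^{\rm GP}(1,\gamma)\frac{\mu^{\rm GP}(\gamma)}{\bar\rho}\left(1 + O(|\ln(a^2\bar\rho)|^{-1})\right), \tag{3.4} \]
where we have used Lemma 2.1. So we have the upper bound
\[ \frac{E^{\rm QM}(N,a)}{E^{\rm GP}(N,1/|\ln(a^2\bar\rho)|)}\leq 1 + O\left(\mu^{\rm GP}(\gamma)/\bar\rho\right) + O\left(|\ln(a^2\bar\rho)|^{-1}\right). \tag{3.5} \]
Now if $\gamma$ is fixed as $N\to\infty$,
\[ \frac{\mu^{\rm GP}(\gamma)}{\bar\rho}\sim\frac{1}{|\ln(a^2\bar\rho)|}\sim\frac{1}{N}. \tag{3.6} \]
If $\gamma\to\infty$ with $N$ we have instead, assuming that the external potential is asymptotically homogeneous of order $s$,
\[ \frac{\mu^{\rm GP}(\gamma)}{\bar\rho}\sim\frac{\mu^{\rm TF}(\gamma)}{\mu^{\rm TF}(N)}\sim\left(\frac{\gamma}{N}\right)^{s/(s+2)}, \tag{3.7} \]
so in any case
\[ \frac{E^{\rm QM}(N,a)}{E^{\rm GP}(N,1/|\ln(a^2\bar\rho)|)}\leq 1 + O\left(|\ln(a^2\bar\rho)|^{-s/(s+2)}\right) \tag{3.8} \]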
holds as $N\to\infty$ and $a^2\bar\rho\to 0$.

4. Lower Bound to the QM Energy
Compared to the treatment of the 3D problem in [2] the new issue here is the TF case, i.e., $\gamma = N/|\ln(a^2\bar\rho)|\to\infty$, and we discuss this case first. The GP limit with $\gamma$ fixed can be treated in complete analogy with the 3D case, cf. Remark 4.1 below. We introduce again the rescaled $\widehat\rho_\gamma$ as in (2.22) and also
\[ \widehat v(x) = \gamma^{2/(s+2)}v(\gamma^{1/(s+2)}x). \tag{4.1} \]
Note that the scattering length of $\widehat v$ is $\widehat a = a\gamma^{-1/(s+2)}$. Using $V\geq\mu^{\rm TF}(\gamma) - 8\pi\gamma\rho^{\rm TF}_\gamma$ and (2.7) we see that
\[ E^{\rm QM}(N,a)\geq E^{\rm TF}(N,\gamma/N) + 4\pi N\gamma^{s/(s+2)}\int\widehat\rho_\gamma^{\,2} + \gamma^{-2/(s+2)}Q - 8\pi N\gamma^{s/(s+2)}\|\widehat\rho_\gamma - \tilde\rho^{\rm TF}_{1,1}\|_\infty, \tag{4.2} \]
with
\[ Q = \inf_{\int|\Psi|^2 = 1}\sum_i\int\left(|\nabla_i\Psi|^2 + \sum_{j<i}\widehat v(x_i - x_j)|\Psi|^2 - 8\pi\gamma\tilde\rho^{\rm TF}_{1,1}(x_i)|\Psi|^2\right). \tag{4.3} \]
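Dividing space into boxes $\alpha$ of side length $L$ with Neumann boundary conditions we get
\[ Q\geq\sum_\alpha\inf_{n_\alpha}\left(E^{\rm hom}(n_\alpha,L) - 8\pi\gamma\rho_{\alpha,\max}n_\alpha\right), \tag{4.4} \]
where $\rho_{\alpha,\max}$ denotes the maximal value of $\tilde\rho^{\rm TF}_{1,1}$ in the box $\alpha$, and $E^{\rm hom}(n,L)$ is the energy of a homogeneous gas of $n$ bosons in a box of side length $L$ and Neumann boundary conditions. We can forget about the boxes where $\rho_{\alpha,\max} = 0$, because the energy of particles in these boxes is positive. We now want to use the lower bound on $E^{\rm hom}$ given in [4], namely
\[ E^{\rm hom}(n,L)\geq 4\pi\frac{n^2}{L^2}\frac{1}{|\ln(\widehat a^2n/L^2)|}\left(1 - C|\ln(\widehat a^2n/L^2)|^{-1/5}\right). \tag{4.5} \]
This bound holds for $n > ({\rm const.})|\ln(\widehat a^2n/L^2)|^{1/5}$ and small enough $\widehat a^2n/L^2$. Now if the minimum in (4.4) is taken in some box $\alpha$ for some value $n_\alpha$, we have
\[ E^{\rm hom}(n_\alpha + 1,L) - E^{\rm hom}(n_\alpha,L)\geq 8\pi\gamma\rho_{\alpha,\max}. \tag{4.6} \]
By a computation analogous to the upper bound (see [2]) one shows that
\[ E^{\rm hom}(n+1,L) - E^{\rm hom}(n,L)\leq 8\pi\frac{n}{L^2}\frac{1}{|\ln(\widehat a^2n/L^2)|}\left(1 + O\left(|\ln(\widehat a^2n/L^2)|^{-1}\right)\right). \tag{4.7} \]
Using Lemma 2.4 and the asymptotics of $\mu^{\rm TF}$ (Remark 2.1) we see that
\[ \frac{\widehat a^2n}{L^2}\leq\frac{\widehat a^2N}{L^2} = N^{s/(s+2)}\frac{a^2}{L^2}\left(\frac{N}{\gamma}\right)^{2/(s+2)}\leq a^2\bar\rho\,\frac{C}{L^2}\left(\frac{N}{\gamma}\right)^{2/(s+2)}, \tag{4.8} \]
for some constant $C$, so (4.7) reads
\[ E^{\rm hom}(n+1,L) - E^{\rm hom}(n,L)\leq 8\pi\frac{n}{L^2}\frac{1}{|\ln(a^2\bar\rho)|}\left(1 + O\left(\frac{1 + |\ln((\gamma/N)^{2/(s+2)}L^2/C)|}{|\ln(a^2\bar\rho)|}\right)\right). \tag{4.9} \]
So if $L$ is fixed, our minimizing $n_\alpha$ is at least $\sim\rho_{\alpha,\max}L^2N$. If $N$ is large enough and $a^2\bar\rho$ is small enough, we can thus use (4.5) in (4.4) to get
\[ Q\geq\sum_\alpha 4\pi\left(\frac{n_\alpha^2}{L^2}\frac{1}{|\ln(\widehat a^2n_\alpha/L^2)|}\left(1 - \frac{C}{|\ln(\widehat a^2N/L^2)|^{1/5}}\right) - 2\frac{N\rho_{\alpha,\max}}{|\ln(a^2\bar\rho)|}n_\alpha\right). \tag{4.10} \]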
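Lemma 4.1. For $0 < x, b < 1$ we have
\[ \frac{x^2}{|\ln x|} - 2\frac{b}{|\ln b|}x\geq -\frac{b^2}{|\ln b|}\left(1 + \frac{1}{(2|\ln b|)^2}\right). \tag{4.11} \]
Proof. Since $\ln x\geq -\frac{1}{de}x^{-d}$ for all $d > 0$ we have
\[ \frac{x^2}{b^2}\frac{|\ln b|}{|\ln x|} - 2\frac{x}{b}\geq\frac{|\ln b|}{b^2}\,e\,d\,x^{2+d} - \frac{2x}{b}\geq c(d)\left(b^d\,e\,d\,|\ln b|\right)^{-1/(1+d)} \tag{4.12} \]
with
\[ c(d) = 2^{(2+d)/(1+d)}\left(\frac{1}{(2+d)^{(2+d)/(1+d)}} - \frac{1}{(2+d)^{1/(1+d)}}\right)\geq -1 - \frac{1}{4}d^2. \tag{4.13} \]
Choosing $d = 1/|\ln b|$ gives the desired result.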
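A quick numerical sanity check of the inequality (4.11) is sketched below. It evaluates both sides on a logarithmic grid in $(0,1)\times(0,1)$ and reports the smallest difference; this is an illustration of the statement, not a substitute for the proof, and the grid range and the function names are arbitrary choices.

```python
import math

def lhs(x, b):
    """Left side of (4.11)."""
    return x * x / abs(math.log(x)) - 2.0 * b / abs(math.log(b)) * x

def rhs(b):
    """Right side of (4.11)."""
    lb = abs(math.log(b))
    return -(b * b / lb) * (1.0 + 1.0 / (2.0 * lb) ** 2)

def worst_margin(points=400):
    """Smallest value of lhs - rhs on a logarithmic grid in (0, 1) x (0, 1)."""
    grid = [10.0 ** (-8.0 * (k + 0.5) / points) for k in range(points)]
    return min(lhs(x, b) - rhs(b) for b in grid for x in grid)

# the lemma asserts that this margin is nonnegative
print(worst_margin() >= 0.0)
```

The margin is smallest near $x\approx b$, where the two sides agree up to relative corrections that vanish as $b\to 0$, which is why the bound loses nothing to leading order when it is applied to (4.10).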
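Note that the lemma above implies for $k\geq 1$,
\[ \frac{x^2}{|\ln x|} - 2\frac{b}{|\ln b|}xk\geq -\frac{b^2}{|\ln b|}\left(1 + \frac{1}{(2|\ln b|)^2}\right)k^2. \tag{4.14} \]
Applying this with $x = \widehat a^2n_\alpha/L^2$ and $b = \widehat a^2N\rho_{\alpha,\max}$ we get the bound
\[ Q\geq -4\pi N\gamma\sum_\alpha\rho_{\alpha,\max}^2L^2\,\frac{|\ln(\widehat a^2N\rho_{\alpha,\max})|}{|\ln(a^2\bar\rho)|}\left(1 - \frac{C}{|\ln(\widehat a^2N/L^2)|^{1/5}}\right)^{-1}\left(1 + \frac{1}{4|\ln(\widehat a^2N\rho_{\alpha,\max})|^2}\right) \tag{4.15} \]
for (4.10). To estimate the error terms, note that as in (4.8),
\[ \widehat a^2N\sim a^2\bar\rho\left(\frac{N}{\gamma}\right)^{2/(s+2)}, \tag{4.16} \]
so $|\ln(\widehat a^2N)| = |\ln(a^2\bar\rho)| + O(\ln|\ln(a^2\bar\rho)|)$ for small $a^2\bar\rho$. Using $\|\widehat\rho_\gamma - \tilde\rho^{\rm TF}_{1,1}\|_\infty\to 0$ (Theorem 2.1 (iii)) and $\int\widehat\rho_\gamma^{\,2}\to\int(\tilde\rho^{\rm TF}_{1,1})^2$ as $\gamma\to\infty$ (which follows from the uniform convergence and boundedness of the supports) we get
\[ \liminf_{N\to\infty}\frac{E^{\rm QM}(N,a)}{E^{\rm TF}(N,1/|\ln(a^2\bar\rho)|)}\geq 1 - ({\rm const.})\left(\sum_\alpha\rho_{\alpha,\max}^2L^2 - \int(\tilde\rho^{\rm TF}_{1,1})^2\right). \tag{4.17} \]
Since this holds for all choices of the boxes $\alpha$ with arbitrary small side length $L$, and by the assumptions on $V$ $\tilde\rho^{\rm TF}_{1,1}$ is continuous and has compact support, we can conclude
\[ \liminf_{N\to\infty}\frac{E^{\rm QM}(N,a)}{E^{\rm TF}(N,1/|\ln(a^2\bar\rho)|)}\geq 1 \tag{4.18} \]
in the limit $N\to\infty$, $a^2\bar\rho\to 0$ and $N/|\ln(a^2\bar\rho)|\to\infty$.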
Remark 4.1 (The GP case). In the derivation of the lower bound we have assumed that $\gamma\to\infty$ with $N$, i.e. $N\gg|\ln(a^2\bar\rho)|$, which seems natural because otherwise the scattering length would have to decrease exponentially with $N$. However, for fixed $\gamma$ one can use the methods of [2] (with slight modifications: One uses the 2D bounds on the homogeneous gas and Lemma 4.1) to compute a lower bound in terms of the GP energy. The result is
\[ \liminf_{N\to\infty}\frac{E^{\rm QM}(N,a)}{E^{\rm GP}(N,1/|\ln(a^2\bar\rho)|)}\geq 1 \tag{4.19} \]
in the limit $N\to\infty$, $a^2\bar\rho\to 0$ with $\gamma = N/|\ln(a^2\bar\rho)|$ fixed.

5. The Limit Theorems
We have now all the estimates needed for Theorems 1.1–1.4. The upper bound (3.8) and the lower bound (4.19) prove Theorem 1.1. The energy limit Theorem 1.3 for the TF case follows from (3.8), Theorem 2.1 (i) and (4.18). The convergence of the energies implies the convergence of the densities in the usual way by variation of the external potential. Replacing $V(x)$ by $V(x) + \delta\gamma^{s/(s+2)}Y(\gamma^{-1/(s+2)}x)$ for some positive $Y\in C_0^\infty$ and redoing the upper and lower bounds we see that Theorem 1.3 and Theorem 2.1 (i) hold with $W$ replaced by $W + \delta Y$. Differentiating with respect to $\delta$ at $\delta = 0$ yields
\[ \lim_{N\to\infty}\frac{\gamma^{2/(s+2)}}{N}\rho^{\rm QM}_{N,a}(\gamma^{1/(s+2)}x) = \tilde\rho^{\rm TF}_{1,1}(x) \tag{5.1} \]
in the sense of distributions. Since the functions all have norm 1, we can conclude that there is even weak $L^1$-convergence.
Remark 5.1 (The 3D case). In [2] the analogues of Theorems 1.1 and 1.2 were shown for the three-dimensional Bose gas. Using the methods developed here one can extend these results to analogues of Theorems 1.3 and 1.4. In 3D the coupling constant is $g = a$, so $\gamma = Na$. Moreover, the relevant mean 3D density is $\bar\rho_\gamma\sim N(Na)^{-3/(s+3)}$.

A. Appendix: Scattering Length in Two Dimensions
Due to the logarithmic behavior of the Green function of the two dimensional Laplacian the definition of the scattering length is slightly more delicate in two dimensions than in three. For a nonnegative potential $v(x)$, depending only on $|x|$ and with finite range $R_0$, it is naturally defined by the following variational principle:
Theorem A.1. Let $R > R_0$ and consider the functional
\[ E_R[\phi] = \int_{|x|\leq R}\left(|\nabla\phi(x)|^2 + \frac{1}{2}v(x)|\phi(x)|^2\right)d^2x. \tag{A.1} \]
Then, in the subclass of functions such that $\int(|\phi|^2 + |\nabla\phi|^2) < \infty$ and $\phi(x) = 1$ for $|x| = R$, there is a unique function $\phi_0$ that minimizes $E_R[\phi]$. This function is nonnegative and rotationally symmetric, and satisfies the equation
\[ -\Delta\phi_0(x) + \frac{1}{2}v(x)\phi_0(x) = 0 \tag{A.2} \]
for $|x|\leq R$ in the sense of distributions, with boundary condition $\phi_0(x) = 1$ for $|x| = R$. For $R_0 < |x| < R$,
\[ \phi_0(x) = \ln(|x|/a)/\ln(R/a) \tag{A.3} \]
for a unique number $a$ called the scattering length.
For the proof see [4], where generalizations to other dimensions and potentials with a negative part are also discussed. Note that the factor $\frac{1}{2}$ in (A.1) and (A.2) is due to the reduced mass of the two body problem.
If $v$ has infinite range it is easy to extend the definition of the scattering length for nonnegative $v$ under the assumption that $\int_{|x|\geq R_1}v(x)\,d^2x < \infty$ for some $R_1$. In fact, one may then simply cut off the potential at some point $R_0 > R_1$ (i.e., set $v(x) = 0$ for $|x| > R_0$) and consider the limit of the scattering lengths of the cut off potentials as $R_0\to\infty$. See [4] for details.

References
1. Dalfovo, F., Giorgini, S., Pitaevskii, L.P. and Stringari, S.: Theory of Bose–Einstein condensation in trapped gases. Rev. Mod. Phys. 71, 463–512 (1999)
2. Lieb, E.H., Seiringer, R. and Yngvason, J.: Bosons in a Trap: A Rigorous Derivation of the Gross–Pitaevskii Energy Functional. Phys. Rev. A 61, 043602-1–043602-13 (2000); arXiv: math-ph/9908027, mp_arc 99-312. See also: Proceedings of ‘Quantum Theory and Symmetries’ (Goslar, 18–22 July 1999), edited by H.-D. Doebner, V.K. Dobrev, J.-D. Hennig and W. Luecke, Singapore: World Scientific, 2000; arXiv: math-ph/9911026, mp_arc 99-439
3. Lieb, E.H. and Yngvason, J.: Ground State Energy of the Low Density Bose Gas. Phys. Rev. Lett. 80, 2504–2507 (1998); arXiv: math-ph/9712138, mp_arc 97-631. A more leisurely presentation is in Differential Equations and Mathematical Physics, Proceedings of 1999 conference at the Univ. of Alabama, R. Weikard and G. Weinstein, eds., Cambridge, MA: International Press, 2000, pp. 295–306
4. Lieb, E.H. and Yngvason, J.: Ground State Energy of a Dilute Two-dimensional Bose Gas. J. Stat. Phys. 103, 509–526 (2001); arXiv: math-ph/0002014
5. Schick, M.: Two-Dimensional System of Hard Core Bosons. Phys. Rev. A 3, 1067–1073 (1971)
6. Hines, D.F., Frankel, N.E. and Mitchell, D.J.: Hard disc Bose gas. Physics Letters 68A, 12–14 (1978)
7. Popov, V.N.: On the theory of the superfluidity of two- and one-dimensional Bose systems. Theor. and Math. Phys. 11, 565–573 (1977)
8. Fisher, D.S. and Hohenberg, P.C.: Dilute Bose gas in two dimensions. Phys. Rev. B 37, 4936–4943 (1988)
9. Kolomeisky, E.B. and Straley, J.P.: Renormalization group analysis of the ground state properties of dilute Bose systems in d spatial dimensions. Phys. Rev. B 46, 11749–11756 (1992)
10. Ovchinnikov, A.A.: On the description of a two-dimensional Bose gas at low densities. J. Phys. Condens. Matter 5, 8665–8676 (1993). See also JETP Letters 57, 477 (1993); Mod. Phys. Lett. 7, 1029 (1993)
11. Shevchenko, S.I.: On the theory of a Bose gas in a nonuniform field. Sov. J. Low Temp. Phys. 18, 223–230 (1992)
12. Kolomeisky, E.B., Newman, T.J., Straley, J.P. and Qi, X.: Low-dimensional Bose liquids: Beyond the Gross–Pitaevskii approximation. Phys. Rev. Lett. 85, 1146–1149 (2000); arXiv: cond-mat/0002282
13. Kim, S., Won, C., Oh, S.D. and Jhe, W.: Bose–Einstein condensation in a two-dimensional trap. arXiv: cond-mat/0003342 (2000)
14. Kim, S., Won, C., Oh, S.D. and Jhe, W.: Two-dimensional condensation of dilute Bose atoms in harmonic trap. J. Korean Phys. Soc. 37, 665 (2000); arXiv: cond-mat/9904087
15. Garcia-Ripoll, J.J. and Perez-Garcia, V.M.: Anomalous rotational properties of Bose–Einstein condensates in asymmetric traps. Phys. Rev. A 64, 013602 (2001); arXiv: cond-mat/0003451 (2000)
16. Gonzalez, A. and Perez, A.: Ground-state properties of bosons in three- and two-dimensional traps. Int. J. Mod. Phys. B 12, 2129–38 (1998)
17. Heinrichs, S. and Mullin, W.J.: Quantum-Monte-Carlo Calculations for Bosons in a Two-Dimensional Harmonic Trap. J. Low Temp. Phys. 113, 231–6 (1998)
18. Bayindir, M. and Tanatar, B.: Bose–Einstein condensation in a two-dimensional, trapped, interacting gas. Phys. Rev. A 58, 3134–7 (1998)
19. Cornish, S.L., Claussen, N.R., Roberts, J.L., Cornell, E.A. and Wieman, C.E.: Stable $^{85}$Rb Bose–Einstein Condensates with Widely Tunable Interactions. Phys. Rev. Lett. 85, 1795–98 (2000)

Communicated by H. Spohn
Commun. Math. Phys. 224, 33 – 63 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Quantum Lattice Models at Intermediate Temperature
J. Fröhlich¹, L. Rey-Bellet², D. Ueltschi³
¹ Institut für Theoretische Physik, ETH Hönggerberg, 8093 Zürich, Switzerland. E-mail: [email protected]
² Department of Mathematics, University of Virginia, Charlottesville, VA 22903, USA. E-mail: [email protected]
³ Department of Physics, Princeton University, Jadwin Hall, Princeton, NJ 08544, USA. E-mail: [email protected]
Received: 6 December 2000 / Accepted: 18 July 2001
Dedicated to Joel Lebowitz on the occasion of his seventieth birthday
Abstract: We analyze the free energy and construct the Gibbs-KMS states for a class of quantum lattice systems, at low temperature and when the interactions are almost diagonal, in a suitable basis. The models we study may have continuous symmetries; our results, however, apply to intermediate temperatures where discrete symmetries are broken but continuous symmetries are not. Our results are based on quantum Pirogov–Sinai theory and a combination of high and low temperature expansions.
1. Introduction
In this paper we study the low temperature phase diagram for a class of quantum lattice systems. Starting with [PS, Sin], Pirogov–Sinai theory has evolved [KP, Zah, BKL, BS, BI, BK] into a very powerful tool to study the pure phases, their coexistence and the first-order phase transitions in classical spin systems at low temperature. In recent years a large part of the Pirogov–Sinai theory has been extended to quantum systems [Pir, BKU, DFF, DFFR, KU], quantum spin systems as well as fermionic and bosonic lattice gases, and applied to a variety of models [FR, DFF2, GKU] to describe insulating phases associated with discrete symmetry breaking.
Here we formulate the Pirogov–Sinai theory in terms of tangent functionals to the free energy. This allows us to discuss the completeness of the phase diagram avoiding the difficulties associated with boundary conditions. We reformulate results of [BKU, DFF, DFFR, KU] in this framework, and extend the theory to a class of models where discrete symmetries are broken at intermediate temperatures. This applies in particular to some systems with continuous symmetries. For this, we consider the restricted ensembles introduced in [BKL] that are very useful to analyze phases which are associated to a family of configurations rather than to a single configuration.
Supported by the US National Science Foundation, grant PHY 9820650
The models that we consider have Hamiltonians, for finite volumes $\Lambda$, of the form $H = V + T$, where $V$ is a classical Hamiltonian (i.e. diagonal in a suitable basis) and $T$ is a (usually small) quantum perturbation. In typical situations the suitable basis is the basis of occupation numbers of position operators. Electronic systems provide a large class of interesting models. The classical interaction $V$ describes the many-body short range and classical interaction between the spin-$\frac{1}{2}$ fermions as well as external fields and chemical potentials:
\[ V = \sum_{x\in\Lambda}\sum_{\sigma\in\{\uparrow,\downarrow\}}J_{x,\sigma}n_{x,\sigma} + \sum_{x,y\in\Lambda}\sum_{\sigma,\sigma'\in\{\uparrow,\downarrow\}}J_{xy,\sigma\sigma'}n_{x,\sigma}n_{y,\sigma'} + \cdots. \]
A typical quantum perturbation $T$ is the kinetic energy
\[ T = \sum_{\langle x,y\rangle\subset\Lambda}\sum_{\sigma\in\{\uparrow,\downarrow\}}t_{xy,\sigma}\left(c^\dagger_{x\sigma}c_{y\sigma} + {\rm h.c.}\right), \]
where $c^\dagger_{x\sigma}$ and $c_{x\sigma}$ are the creation and annihilation operators and $\langle x,y\rangle$ denotes pairs of nearest neighbors. Often, in such systems, the behavior at low temperatures arises from a subtle interplay between the (classical) potential energy and the kinetic energy. In this paper two such mechanisms are considered and combined, each of which we now illustrate with an example.
Example 1 (Hubbard Model). In this case the (classical) interaction is only on-site:
\[ V = \sum_{x\in\Lambda}\left[U n_{x\uparrow}n_{x\downarrow} - \mu(n_{x\uparrow} + n_{x\downarrow})\right]. \]
For suitable values of $U$ and $\mu$, the ground states of $V$ have an infinite degeneracy (in the thermodynamic limit): each site is occupied by a single particle of arbitrary spin. However the kinetic energy lifts this degeneracy and induces an effective antiferromagnetic interaction between nearest neighbors. The perturbative methods of [DFFR, DFF2] show that, in this parameter range, this system is equivalent, in the sense of statistical mechanics, to the Heisenberg antiferromagnet, up to controlled error terms. If the hopping coefficients are asymmetric (e.g. $t_{xy,\uparrow}\gg t_{xy,\downarrow}$) then quantum Pirogov–Sinai implies the coexistence of two antiferromagnetic phases at low enough temperatures [DFFR, KU, DFF2]. Rigorous results for the Hubbard model are reviewed in [Lieb].
Example 2 (Extended Hubbard Model). This variant of the Hubbard model includes a nearest neighbor interaction:
\[ V = \sum_{x\in\Lambda}\left[U n_{x\uparrow}n_{x\downarrow} - \mu(n_{x\uparrow} + n_{x\downarrow})\right] + W\sum_{\langle x,y\rangle\subset\Lambda}(n_{x\uparrow} + n_{x\downarrow})(n_{y\uparrow} + n_{y\downarrow}). \]
If the interaction between nearest neighbors is repulsive then for suitable values of U , W and µ the ground states of V are chessboard configurations where empty sites alternate with sites occupied with one particle of arbitrary spin. The degeneracy of the
Quantum Lattice Models at Intermediate Temperature
35
ground states is infinite in the thermodynamic limit but we have a spatial ordering of the particles. Using a restricted ensemble we associate a pure phase to this spatial ordering by neglecting the spin degrees of freedom. The methods of this paper imply the existence of only two pure phases in the intermediate temperature range βt 1
and
βW 1.
The temperature is so low that the spatial ordering of the particles survives but so high that the spins are in a disordered phase. The continuous symmetry (if txy,↑ = txy,↓ ) is not broken in this parameter regime. These two models illustrate some of the mechanisms arising from the competition between classical and quantum effects, where the system remains insulating and no continuous symmetry is broken. Our main result, Theorem 4.4, provides tools to describe the phase diagram of such models, in particular the coexistence of several phases and the associated first-order phase transitions. The main technical ingredient in this paper is a combined low-temperature and hightemperature expansion for suitable contour models obtained using the perturbation theory developed in [DFFR]. This paper is organized as follows. In Sect. 2 we describe the general formalism of quantum lattice systems and the perturbation theory of [DFFR]. Section 3 is devoted to the Pirogov–Sinai theory. In Sect. 4 we state the results of Pirogov–Sinai theory for quantum systems. The extended Hubbard model is discussed in Sect. 5 as an illustration. In Sect. 6 we prove our main result by studying a contour model and deriving the required bounds on the contours. 2. General Framework of Quantum Lattice Models 2.1. Basic set-up. We consider a quantum mechanical system on a ν-dimensional lattice Zν , as considered, e.g., in [Rue, Isr, BR, Sim]. We will need a slight modification of the usual formalism in order to treat fermionic lattice gases [DFFR] and to accommodate the fact that fermionic creation and annihilation operators do not commute but anticommute. A quantum lattice system is defined by the following data: (i) Hilbert space. For convenience we choose a total ordering (denoted by the symbol ) of the sites in Zν . We choose the spiral order, depicted in Fig.1 for ν = 2, and an analogous ordering for ν ≥ 3. 
This ordering has the property that, for any finite set $A$, the set $\overline{A} := \{z \in \mathbb{Z}^\nu : z \preceq A\}$ of lattice sites which are smaller than $A$, or belong to $A$, is finite. To each lattice site $a \in \mathbb{Z}^\nu$ is associated a finite-dimensional Hilbert space $\mathcal{H}_a$ and, for any finite subset $A = \{a_1 \prec \cdots \prec a_n\} \subset \mathbb{Z}^\nu$, the corresponding Hilbert space $\mathcal{H}_A$ is given by the ordered tensor product
$$\mathcal{H}_A = \mathcal{H}_{a_1} \otimes \cdots \otimes \mathcal{H}_{a_n}. \tag{2.1}$$
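As an illustration of the spiral order for ν = 2, the following minimal sketch enumerates the sites of $\mathbb{Z}^2$ along a square spiral and computes the finite set of sites preceding a given finite set. The orientation (first step to the east, then counter-clockwise) and the function names `spiral_sites` and `initial_segment` are assumptions made for this example; any spiral with the same finiteness property would serve equally well.

```python
from itertools import islice

def spiral_sites():
    """Yield the sites of Z^2 along a counter-clockwise square spiral from the origin."""
    x, y = 0, 0
    yield (x, y)
    dx, dy = 1, 0                      # first step points east (assumed orientation)
    run = 1                            # length of the current straight run
    while True:
        for _ in range(2):             # each run length is used twice
            for _ in range(run):
                x, y = x + dx, y + dy
                yield (x, y)
            dx, dy = -dy, dx           # turn left by 90 degrees
        run += 1

def initial_segment(sites):
    """Return every site up to the largest element of a nonempty finite set `sites`.

    The loop stops as soon as all given sites have been visited, which
    reflects the fact that the set of preceding sites is always finite.
    """
    remaining = set(sites)
    segment = []
    for z in spiral_sites():
        segment.append(z)
        remaining.discard(z)
        if not remaining:
            return segment
```

For instance, `list(islice(spiral_sites(), 7))` lists the first seven sites of the order, and `initial_segment({(1, 1), (-1, 0)})` returns the sites that are smaller than or belong to that two-point set.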
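We further require that there be a Hilbert space isomorphism $\phi_a : \mathcal{H}_a \to \mathcal{H}$, for all $a \in \mathbb{Z}^\nu$.

(ii) Field and observable algebras. For any finite subset $A \subset \mathbb{Z}^\nu$ an operator algebra $\mathcal{F}_A$, the field algebra, is given. The algebra $\mathcal{F}_A$ is isomorphic to the algebra $\mathcal{B}(\mathcal{H}_A)$ of bounded operators on $\mathcal{H}_A$, but in general $\mathcal{F}_A \neq \mathcal{B}(\mathcal{H}_A)$; rather $\mathcal{F}_A \subset \mathcal{B}(\mathcal{H}_{\overline{A}})$. The algebra $\mathcal{F}_A$ is a ∗-algebra equipped with a $C^*$-norm obtained from the operator norm on $\mathcal{B}(\mathcal{H}_{\overline{A}})$. If $A \subset B$ and $a \prec b$, for all $a \in A$ and all $b \in B \setminus A$, then there is a natural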
J. Fröhlich, L. Rey-Bellet, D. Ueltschi
[Fig. 1. Spiral order in $\mathbb{Z}^2$: the lattice sites around the origin (0, 0) are numbered 1, 2, ..., 7, ... along a spiral.]
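embedding of $\mathcal{F}_A$ into $\mathcal{F}_B$: An operator $K \in \mathcal{F}_A$ corresponds to the operator $K \otimes 1_{\mathcal{H}_{B \setminus A}}$ in $\mathcal{F}_B$. In the following we denote by $K$ both operators. For the infinite system the field algebra is the $C^*$-algebra given by
$$\mathcal{F} = \overline{\bigcup_{A \nearrow \mathbb{Z}^\nu} \mathcal{F}_A}^{\,\mathrm{norm}} \tag{2.2}$$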
(the limit being taken through a sequence of increasing subsets of $\mathbb{Z}^\nu$, where increasing refers to the (spiral) ordering defined above). The algebras $\mathcal{F}_A$ contain the observable algebras $\mathcal{O}_A$ which have the same embedding properties as the field algebras and, moreover, satisfy the following commutativity condition: If $A \cap B = \emptyset$, then for any $K \in \mathcal{F}_A$, $L \in \mathcal{O}_B$ we have
$$[K, L] = 0. \tag{2.3}$$
For the infinite system the observable algebra $\mathcal{O}$ is given by
$$\mathcal{O} = \overline{\bigcup_{A \nearrow \mathbb{Z}^\nu} \mathcal{O}_A}^{\,\mathrm{norm}}. \tag{2.4}$$
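The group of space translations $\mathbb{Z}^\nu$ acts as a ∗-automorphism group $\{\tau_a\}_{a \in \mathbb{Z}^\nu}$ on the algebras $\mathcal{F}$ and $\mathcal{O}$, with
$$\mathcal{F}_{X+a} = \tau_a(\mathcal{F}_X), \qquad \mathcal{O}_{X+a} = \tau_a(\mathcal{O}_X), \tag{2.5}$$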
for any $X \subset \mathbb{Z}^\nu$ and $a \in \mathbb{Z}^\nu$.

(iii) Interactions, dynamics and free energy. An interaction $H = \{H_A\}$ is given: This is a map from the finite sets $A \subset \mathbb{Z}^\nu$ to self-adjoint operators $H_A$ in the observable algebra $\mathcal{O}_A$. We assume the interaction to be translation invariant or periodic, i.e., there is a lattice $\mathcal{L} \subseteq \mathbb{Z}^\nu$, with $\dim \mathcal{L} = \nu$, such that $\tau_a H_A = H_{A+a}$, for all $a \in \mathcal{L}$ and all $A \subset \mathbb{Z}^\nu$. We will consider finite range or exponentially decaying interactions. The norm of an interaction is defined as
$$\|H\|_r = \sup_{a \in \mathbb{Z}^\nu} \sum_{A \ni a} \|H_A\| \, e^{r|A|}, \tag{2.6}$$
for some $r > 0$. Here $|A|$ denotes the cardinality of the smallest connected subset of $\mathbb{Z}^\nu$ which contains $A$. We shall denote by $\mathcal{B}_r = \{H : \|H\|_r < \infty\}$ the corresponding Banach space of interactions.
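To make the norm (2.6) concrete, here is a small sketch for a translation-invariant interaction in dimension ν = 1, where the smallest connected set containing $A$ is simply the interval spanned by $A$. The interaction is described by one representative set per translation class together with the operator norm of the corresponding term; this data format and the name `interaction_norm` are assumptions of the example. For a translation-invariant interaction the supremum over $a$ is attained at every site, and a set $A$ has exactly as many translates containing the origin as it has points.

```python
import math

def interaction_norm(shapes, r):
    """Norm (2.6) of a translation-invariant interaction on Z.

    `shapes` is a list of (sites, op_norm) pairs, one per translation class:
    `sites` is a tuple of integers (a representative set A) and `op_norm`
    is the operator norm of H_A.
    """
    total = 0.0
    for sites, op_norm in shapes:
        span = max(sites) - min(sites) + 1        # |A|: size of the interval containing A
        translates_through_origin = len(set(sites))
        total += translates_through_origin * op_norm * math.exp(r * span)
    return total

# Hypothetical example: on-site term, nearest and next-nearest neighbour terms.
example_shapes = [((0,), 4.0), ((0, 1), 1.0), ((0, 2), 0.5)]
```

With these hypothetical values, a larger decay rate $r$ penalizes the next-nearest neighbour term most strongly, since its spanning interval has three sites.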
Quantum Lattice Models at Intermediate Temperature
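For a finite box $\Lambda$, we denote by $H_\Lambda$ the finite-volume Hamiltonian given by $H_\Lambda = \sum_{A \subset \Lambda} H_A$. Here, we consider only periodic boundary conditions, i.e. $\Lambda$ is the ν-dimensional torus $(\mathbb{Z}/L\mathbb{Z})^\nu$, $L$ being the size of $\Lambda$. In the sequel we will consider infinite volume limits; the notation $\lim_{\Lambda \nearrow \mathbb{Z}^\nu}$ will stand for $\lim_{L \to \infty}$. If $H \in \mathcal{B}_r$, the interaction $H$ determines a one-parameter group of ∗-automorphisms $\{\alpha_t\}_{t \in \mathbb{R}}$ on $\mathcal{F}$. These automorphisms are constructed as the limit (in the strong topology) of the automorphisms $\alpha_t^\Lambda$ given, for $K \in \mathcal{F}_A$, $A \subset \Lambda$, by
$$\alpha_t^\Lambda(K) = e^{itH_\Lambda} K e^{-itH_\Lambda}. \tag{2.7}$$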
The proof is standard (see e.g. [BR]). Note that one makes crucial use of the commutativity condition (2.3). For an interaction $H$ and at inverse temperature β the partition function is defined as
$$Z_\Lambda^\beta = \mathrm{Tr}\, e^{-\beta H_\Lambda}; \tag{2.8}$$
the free energy $f(H)$ is then
$$f(H) = -\frac{1}{\beta} \lim_{\Lambda \nearrow \mathbb{Z}^\nu} \frac{1}{|\Lambda|} \log Z_\Lambda^\beta. \tag{2.9}$$
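The definitions (2.8) and (2.9) can be tried out on the simplest classical example. The sketch below is not taken from the paper: it uses the one-dimensional Ising chain $H_\Lambda = -J \sum_i s_i s_{i+1}$ with periodic boundary conditions as an assumed toy model, computes $Z_\Lambda^\beta$ by brute-force summation over configurations, and compares the finite-volume free energy with the known infinite-volume value $-\beta^{-1} \log(2\cosh \beta J)$.

```python
import math
from itertools import product

def partition_function(L, beta, J=1.0):
    """Z = Tr exp(-beta H) for the Ising ring H = -J sum_i s_i s_{i+1} with L >= 3 sites."""
    Z = 0.0
    for s in product((-1, 1), repeat=L):
        energy = -J * sum(s[i] * s[(i + 1) % L] for i in range(L))
        Z += math.exp(-beta * energy)
    return Z

def free_energy_density(L, beta, J=1.0):
    """Finite-volume version of (2.9): -(1 / (beta |Lambda|)) log Z."""
    return -math.log(partition_function(L, beta, J)) / (beta * L)

def free_energy_limit(beta, J=1.0):
    """Infinite-volume free energy of the chain, from the largest transfer-matrix eigenvalue."""
    return -math.log(2.0 * math.cosh(beta * J)) / beta
```

Already for a ring of twelve sites the finite-volume value is very close to the limit, because the correction is governed by $(\tanh \beta J)^L$.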
Existence of the limit is a well-known result, see [Isr, Sim]. Notice that $f(H)$ is a concave function of the interaction $H$.

(iv) KMS states and tangent functionals. A state $w$ on $\mathcal{O}$ is a positive normalized linear functional on $\mathcal{O}$. A state $w$ is periodic if $w \circ \tau_a = w$, for all $a$ in a lattice $\mathcal{L} \subset \mathbb{Z}^\nu$, and invariant if $\mathcal{L} = \mathbb{Z}^\nu$. A KMS state at inverse temperature β is a state $w^\beta$ which satisfies the KMS condition
$$w^\beta(K \alpha_t(L)) = w^\beta(\alpha_{t - i\beta}(L) K). \tag{2.10}$$
For finite systems with periodic boundary conditions it is easy to check that the Gibbs state given by
$$w_\Lambda^\beta(\,\cdot\,) = (\mathrm{Tr}\, e^{-\beta H_\Lambda})^{-1} \, \mathrm{Tr}(e^{-\beta H_\Lambda} \,\cdot\,) \tag{2.11}$$
satisfies the KMS condition. The set of KMS states is convex, and $w$ is called extremal if it cannot be written as a nontrivial convex combination of other KMS states. The state $w$ is clustering if
$$\lim_{|a| \to \infty} \big[ w(K \tau_a(L)) - w(K) \, w(\tau_a(L)) \big] = 0, \tag{2.12}$$
for all $K, L \in \mathcal{O}$. Note that a state $w$ is extremal if it is clustering. The state $w$ is exponentially clustering if, for any local observables $K \in \mathcal{O}_A$, $L \in \mathcal{O}_B$, we have the property
$$\big| w(K \tau_a(L)) - w(K) \, w(\tau_a(L)) \big| \le C_{K,L} \, e^{-|a|/\xi} \tag{2.13}$$
with $\xi > 0$; here $C_{K,L}$ depends on $K$ and $L$ only.

If we consider the free energy as a function of the interaction, KMS states at inverse temperature β are in one-to-one correspondence with tangent functionals to the free energy. The free energy $f$ is a concave function of the interaction $H$, and a linear functional α on $\mathcal{B}_r$ is said to be tangent to $f$ at $H$ if for all interactions $K \in \mathcal{B}_r$ we have
$$f(H + K) \le f(H) + \alpha(K). \tag{2.14}$$
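The statement that a finite-volume Gibbs state satisfies the KMS condition can be checked numerically. The sketch below works in the eigenbasis of a finite-dimensional Hamiltonian, where $\alpha_z(X)_{ij} = e^{iz(E_i - E_j)} X_{ij}$ also makes sense for complex $z$, and tests the identity in the form $w(K \alpha_t(L)) = w(\alpha_{t - i\beta}(L) K)$, which is what cyclicity of the trace yields. The energies and matrices are arbitrary assumed test data and the helper names are hypothetical.

```python
import cmath
import math

def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def gibbs(energies, beta, X):
    """w(X) = Tr(e^{-beta H} X) / Tr e^{-beta H} for H = diag(energies)."""
    weights = [math.exp(-beta * e) for e in energies]
    return sum(w * X[i][i] for i, w in enumerate(weights)) / sum(weights)

def evolve(energies, z, X):
    """alpha_z(X) = e^{izH} X e^{-izH} for H = diag(energies) and complex z."""
    n = len(energies)
    return [[cmath.exp(1j * z * (energies[i] - energies[j])) * X[i][j]
             for j in range(n)] for i in range(n)]

def kms_defect(energies, beta, t, K, L):
    """Absolute difference between the two sides of the KMS identity."""
    lhs = gibbs(energies, beta, matmul(K, evolve(energies, t, L)))
    rhs = gibbs(energies, beta, matmul(evolve(energies, t - 1j * beta, L), K))
    return abs(lhs - rhs)

# Arbitrary test data (assumed for illustration only).
ENERGIES = [0.0, 0.7, 1.9]
K_TEST = [[1.0, 2.0 - 1.0j, 0.5], [0.3j, -1.0, 1.5], [2.0, 0.1, 0.4 + 0.2j]]
L_TEST = [[0.2, 1.0, -0.7j], [1.1, 0.0, 0.9], [0.5 + 0.5j, -0.3, 1.0]]
```

Calling `kms_defect(ENERGIES, 1.3, 0.4, K_TEST, L_TEST)` should return a value at the level of rounding errors.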
To an invariant state $w$ we associate a tangent functional α defined by
$$\alpha(K) = w(A_K), \tag{2.15}$$
where $A_K = \sum_{X \ni 0} |X|^{-1} K_X$ (and similarly for periodic states). The results of Israel and Araki [Isr, Ara] show that if α is a tangent functional at $H$, then the invariant state $w$ defined in (2.15) is a KMS state at inverse temperature β and, conversely, for any KMS state at inverse temperature β there is a unique tangent functional α. The identification of KMS states with tangent functionals will be very useful to describe the phase diagrams arising from Pirogov–Sinai theory.

Example. As an illustration of the general formalism we consider spin 1/2 fermions, as in the examples treated in this paper. The Hilbert space $\mathcal{H}_a$ is isomorphic to $\mathbb{C}^4$. We denote by $c^\dagger_{a\sigma}$ and $c_{a\sigma}$ the creation and annihilation operators of a particle at site $a$ with spin $\sigma \in \{\uparrow, \downarrow\}$. One can construct an explicit representation of the creation and annihilation operators as operators in $\mathcal{B}(\mathcal{H}_{\overline{\{a\}}})$, see e.g. Sect. 4.2 in [DFFR], but $c_{a\sigma}, c^\dagger_{a\sigma} \notin \mathcal{B}(\mathcal{H}_a)$. The algebras $\mathcal{F}_A \subset \mathcal{B}(\mathcal{H}_{\overline{A}})$ are chosen to be the algebras generated by $c^\dagger_{a\sigma}$, $c_{a\sigma}$, $a \in A$, $\sigma \in \{\uparrow, \downarrow\}$. The observable algebras $\mathcal{O}_A$ are chosen as the algebras generated by pairs of creation or annihilation operators. It is easy to check that the elements of $\mathcal{F}_A$ and $\mathcal{O}_A$ satisfy the commutativity condition (2.3).

Classical interactions. A particular class of interactions consists of the classical interactions. Let $\{e_j\}_{j \in I}$ be an orthonormal basis of $\mathcal{H}$. Then, for $A \subset \mathbb{Z}^\nu$,
$$E_A = \{\otimes_{a \in A} e^a_{j_a}\}, \quad \text{with } e^a_{j_a} = \phi_a^{-1} e_{j_a}, \tag{2.16}$$
is an orthonormal basis of $\mathcal{H}_A$. We denote by $\mathcal{C}(E_A)$ the abelian subalgebra of $\mathcal{O}_A$ consisting of all operators which are diagonal in the basis $E_A$. An interaction $V$ is called classical, if there exists a basis $\{e_j\}_{j \in I}$ of $\mathcal{H}$ such that
$$V_A \in \mathcal{C}(E_A), \quad \text{for all } A \subset \mathbb{Z}^\nu. \tag{2.17}$$
The set $\Omega_A$ of configurations in $A$ is defined as the set of all assignments $\{j_a\}_{a \in A}$ of an element $j_a \in I$ to each $a$. A configuration $\omega_A$ is an element in $\Omega_A$. There is a one-to-one correspondence between basis vectors $\otimes_{a \in A} e^a_{j_a}$ of $\mathcal{H}_A$ and configurations on $A$:
$$\otimes_{a \in A} e^a_{j_a} \longleftrightarrow \omega_A \equiv \{j_a\}_{a \in A}. \tag{2.18}$$
In the sequel we shall use the notation $e_{\omega_A}$ to denote the basis vector defined by the configuration $\omega_A$ via the correspondence (2.18). Since a classical interaction $V$ only depends on the numbers
$$\Phi_A(\omega_A) = \langle e_{\omega_A} | V_A | e_{\omega_A} \rangle \tag{2.19}$$
we may view $\Phi_A$ as a (real-valued) function on the set of configurations. Similarly the algebra $\mathcal{C}(E_A)$ may be viewed as the ∗-algebra of complex-valued functions on the set of configurations $\Omega_A$.
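The correspondence between basis vectors and configurations, and the function $\Phi_A$ on configurations, can be made explicit for a pair of sites. The following sketch assumes the index set $I$ of spin 1/2 fermions (four states per site) and a hypothetical classical energy with an on-site repulsion `U` for doubly occupied sites and a nearest-neighbour repulsion `W`; these values and names are chosen for illustration and are not taken from the models of the paper.

```python
from itertools import product

LABELS = ("empty", "up", "down", "both")          # assumed index set I for one site
NUMBER = {"empty": 0, "up": 1, "down": 1, "both": 2}

def configurations(sites):
    """All assignments {j_a} of a label to each site: the set Omega_A."""
    return [dict(zip(sites, labels)) for labels in product(LABELS, repeat=len(sites))]

def phi_pair(omega, a, b, U=4.0, W=1.0):
    """Hypothetical classical energy Phi_A(omega_A) for A = {a, b}."""
    on_site = U * sum(1 for x in (a, b) if omega[x] == "both")
    return on_site + W * NUMBER[omega[a]] * NUMBER[omega[b]]

omegas = configurations(("a", "b"))
ground_energy = min(phi_pair(w, "a", "b") for w in omegas)
ground_configs = [w for w in omegas if phi_pair(w, "a", "b") == ground_energy]
```

The list `omegas` has one entry per basis vector of $\mathcal{H}_A$, and `ground_configs` contains several configurations that differ only by spin, a small-scale version of the spin degeneracy discussed in the text.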
2.2. Perturbation theory for interactions. The interactions we will study have the form $H = V + \lambda T$, where $V$ is a classical interaction, $T$ is a perturbation and λ a small parameter. A typical situation is the following: the classical part of the interaction has infinitely many ground states, i.e. the number of ground states of the finite-volume Hamiltonian $H_\Lambda$ diverges as $|\Lambda| \to \infty$, but the perturbation $T$ lifts this degeneracy (completely or partially). This is usually easy to check using standard perturbation theory for the finite-volume Hamiltonian $V_\Lambda + \lambda T_\Lambda$. Standard perturbation theory however does not work in the thermodynamic limit, the norm of the error growing with $|\Lambda|$, and other methods are required. Such methods have been developed in [DFFR] and applied in [FR, DFF2] (see also [KU] for an alternative approach).

The idea is to construct an interaction $\tilde{H}$ which is equivalent to $H$ and which can be cast in the form
$$\tilde{H} = \tilde{V}(\lambda) + \tilde{T}(\lambda), \tag{2.20}$$
where now the degeneracy of the ground states of $\tilde{V}$ is lifted and $\tilde{T}(\lambda)$ is suitably small with respect to $\tilde{V}(\lambda)$.

Recall that two interactions $H$ and $\tilde{H}$ are equivalent if there exists a ∗-automorphism γ of the algebra $\mathcal{O}$ of local observables such that
$$\tilde{H}_A = \gamma(H_A), \tag{2.21}$$
for all $A$. In particular, if $H \in \mathcal{B}_r$, there exists $\tilde{r}$ such that $\tilde{H} \in \mathcal{B}_{\tilde{r}}$. A convenient way of constructing equivalent interactions is with a family of unitary transformations $U_\Lambda$. Let $S_A$, $A \subset \mathbb{Z}^\nu$, be a family of antiselfadjoint operators, periodic or translation invariant, with $S_A \in \mathcal{O}_A$ and $\|S\|_r < \infty$ for some $r > 0$. We set $S_\Lambda = \sum_{A \subset \Lambda} S_A$ and then $U_\Lambda = \exp(S_\Lambda)$ is unitary. It is shown in [DFFR] that if $\|S\|_r$ is small enough then the unitary equivalent Hamiltonians $\tilde{H}_\Lambda = U_\Lambda H_\Lambda U_\Lambda^{-1}$ define an interaction $\tilde{H} \in \mathcal{B}_{\tilde{r}}$ for some $\tilde{r} > 0$ and $\tilde{H}$ is equivalent to $H$.

We consider now an interaction of the form $H = V + \lambda T$ which satisfies the following conditions:

(P1) The interaction $V$ is classical and of finite range. Moreover, we assume that $V$ is given by a translation-invariant m-potential. This last condition means that we can assume (if necessary by passing to a physically equivalent interaction) that there exists at least one configuration ω minimizing all $\Phi_{0X}$, i.e.,
$$\Phi_{0X}(\omega) = \min_{\omega'} \Phi_{0X}(\omega'), \tag{2.22}$$
for all $X$. For any m-potential, the set of all configurations for which Eq. (2.22) holds is the set of ground states of $\Phi_0$.

(P2) The perturbation interaction $T$ is in some Banach space $\mathcal{B}_r$ for some $r > 0$.

Since, by condition (P1), the ground states can be determined locally, there is a corresponding decomposition of the Hilbert space $\mathcal{H}_A$ for all $A$:
$$\mathcal{H}_A = \mathcal{H}_A^{\mathrm{low}} \oplus \mathcal{H}_A^{\mathrm{high}}, \tag{2.23}$$
where $\mathcal{H}_A^{\mathrm{low}}$ is the subspace spanned by the ground states of $V$. We can decompose any operator $K_A \in \mathcal{B}(\mathcal{H}_A)$ according to its action on $\mathcal{H}_A^{\mathrm{low}}$ and $\mathcal{H}_A^{\mathrm{high}}$:
$$K_A = K_A^{ll} + K_A^{hh} + K_A^{lh}, \tag{2.24}$$
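with
$$K_A^{ll} \mathcal{H}_A^{\mathrm{low}} \subset \mathcal{H}_A^{\mathrm{low}}, \qquad K_A^{ll} \mathcal{H}_A^{\mathrm{high}} = 0,$$
$$K_A^{hh} \mathcal{H}_A^{\mathrm{high}} \subset \mathcal{H}_A^{\mathrm{high}}, \qquad K_A^{hh} \mathcal{H}_A^{\mathrm{low}} = 0,$$
$$K_A^{lh} \mathcal{H}_A^{\mathrm{low}} \subset \mathcal{H}_A^{\mathrm{high}}, \qquad K_A^{lh} \mathcal{H}_A^{\mathrm{high}} \subset \mathcal{H}_A^{\mathrm{low}}.$$
Accordingly we decompose any interaction $T$:
$$T = T^{ll} + T^{hh} + T^{lh}. \tag{2.25}$$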
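The block decomposition of an operator is easy to carry out in a basis of classical configurations, where the low-energy subspace is spanned by a subset of the basis vectors. The sketch below assumes that setting (a matrix as nested lists plus a set of basis indices spanning the low subspace); the function name `decompose` is hypothetical.

```python
def decompose(K, low):
    """Split a square matrix K into (K_ll, K_hh, K_lh).

    `low` is the set of basis indices spanning the low-energy subspace;
    the remaining indices span the high-energy subspace.
    """
    n = len(K)
    is_low = [i in low for i in range(n)]

    def block(keep):
        # Keep the entry K[i][j] when keep(row is low, column is low) holds.
        return [[K[i][j] if keep(is_low[i], is_low[j]) else 0 for j in range(n)]
                for i in range(n)]

    K_ll = block(lambda row, col: row and col)              # low  -> low
    K_hh = block(lambda row, col: not row and not col)      # high -> high
    K_lh = block(lambda row, col: row != col)               # exchanges low and high
    return K_ll, K_hh, K_lh
```

By construction the three pieces add up to the original matrix, and the last piece is the only one that maps the low-energy subspace into the high-energy one and conversely.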
The following theorem shows that, for any integer $n \ge 1$, it is possible to construct an interaction $H^{(n)}$ equivalent to $H$ with the property that $H^{(n)}$ is block diagonal up to order $n$. Note that this is a constructive result and an algorithm is given in [DFFR] which allows one to construct the unitary transformations $U_\Lambda^{(n)}$ and the interactions $H^{(n)}$.

Theorem 2.1. Consider an interaction of the form
$$H = V + \lambda T, \tag{2.26}$$
where $V$ satisfies Condition (P1) and $T$ satisfies Condition (P2). For any integer $n \ge 1$ there is $r_n > 0$ and $\lambda_n > 0$ such that for $|\lambda| < \lambda_n$ there is an interaction $H^{(n)} = V + T^{(n)} \in \mathcal{B}_{r_n}$, equivalent to $H$, with
$$\| T^{(n)lh} \|_{r_n} = O(\lambda^{n+1}). \tag{2.27}$$
This theorem is useful to analyze the low temperature behavior of quantum spin systems when the ground states of $V$ have infinite degeneracy and $T$ lifts this degeneracy (totally or partially). Consider for example the typical case where the degeneracy is lifted in second order perturbation theory. In that case we may take $n = 1$ and we have $T^{(1)lh} = O(\lambda^2)$:
$$H^{(1)} = V + \sum_{j \ge 1} \lambda^j T_j^{(1)ll} + \sum_{j \ge 1} \lambda^j T_j^{(1)hh} + \sum_{j \ge 2} \lambda^j T_j^{(1)lh}. \tag{2.28}$$
We then decompose $H^{(1)} = \tilde{V} + \tilde{T}$ into a new “classical part” $\tilde{V}$ given by
$$\tilde{V} = V + \sum_{j=1}^{2} \lambda^j T_j^{(1)ll}, \tag{2.29}$$
and $\tilde{T}$ contains all remaining terms. The new perturbation satisfies the bounds $\tilde{T}^{ll} = O(\lambda^3)$, $\tilde{T}^{hh} = O(\lambda)$, and $\tilde{T}^{lh} = O(\lambda^2)$. If $\tilde{V}$ is a classical interaction with a sufficiently regular zero-temperature phase diagram, then Pirogov–Sinai techniques can be applied to study the phase diagrams of $\tilde{V} + \tilde{T}$ for sufficiently small λ (see below).

Note that this perturbation scheme is not only useful to analyze the low-temperature behavior of the model. The new “classical part” $\tilde{V}$ does not need to be classical at all. For example, see [DFFR, DFF2], if one applies this perturbation scheme to the Hubbard model at half-filling, $\tilde{V}$ is given by the Heisenberg model and this gives a rigorous proof of the equivalence of both models up to controlled error terms.
3. Phase Diagrams, Contour Models, and Pirogov–Sinai Theory

A phase diagram in Thermodynamics is a partition of a space of physical parameters into domains corresponding to phases; the free energy varies very smoothly inside a domain. However, first or higher-order derivatives may have discontinuities when crossing the boundary between two domains, and in this case one talks of phase transitions.

The first proof of a phase transition was proposed by Peierls for the Ising model [Pei]. It was extended by Pirogov and Sinai [PS, Sin] to situations where different phases are not related by a symmetry. Important extensions and simplifications of the Pirogov–Sinai theory include Kotecký and Preiss [KP], Zahradník [Zah], Bricmont et al. [BKL] and [BS], Borgs and Imbrie [BI], Borgs and Kotecký [BK, BK2]. An exposition of the Pirogov–Sinai theory can be found in [EFS]. Another extension of the Peierls argument was carried out by Fröhlich and Lieb [FL] using reflection positivity [FSS, DLS].

3.1. Phase diagrams. We consider the Banach space $\mathcal{B}_r$ of periodic interactions, with the norm defined in (2.6). Here $r$ is any positive number, but further assumptions (bounds for the weights of the contours, see below) can be verified in given models only if $r$ is large enough. To a given interaction $H \in \mathcal{B}_r$ and inverse temperature β we associate the set of all translation invariant (or periodic) KMS states or, equivalently [Ara, Isr], the set of all tangent functionals to the free energy $f(H)$. The set of periodic KMS states forms a simplex, so that it is enough to describe the extremal states, or the corresponding tangent functionals. We denote the set of extremal states by $\mathcal{E}^\beta(H)$.

In order to define a phase diagram we consider a smooth $(p-1)$-dimensional manifold in the Banach space $\mathcal{B}_r$ of periodic interactions; it is described by a map $u \mapsto H^u$ from a connected open set $\mathcal{U} \subset \mathbb{R}^{p-1}$ into $\mathcal{B}_r$. For $m = 1, 2, 3, \dots$, we introduce $\mathcal{E}^{(m)} = \{H \in \mathcal{B}_r : |\mathcal{E}^\beta(H)| = m\}$; accordingly, we partition the set $\mathcal{U}$ as
$$\mathcal{U} = \bigcup_{m=1}^{\infty} \mathcal{U}^{(m)}, \tag{3.1}$$
where $u \in \mathcal{U}^{(m)}$ iff $H^u \in \mathcal{E}^{(m)}$. The decomposition (3.1) is called the phase diagram of $H^u$.

The phase diagram of $H^u$, $u \in \mathcal{U} \subset \mathbb{R}^{p-1}$, is said to satisfy the Gibbs phase rule if the following conditions hold. Here, we call “boundary” of $\mathcal{U}^{(i)}$ the set $(\overline{\mathcal{U}^{(i)}} \setminus \mathcal{U}^{(i)}) \cap \mathcal{U}$, with $\overline{\mathcal{U}^{(i)}}$ the closure of $\mathcal{U}^{(i)}$.

(i) $\mathcal{U} = \mathcal{U}^{(1)} \cup \cdots \cup \mathcal{U}^{(p)}$.
(ii) (a) $\mathcal{U}^{(1)}$ consists of $p$ connected components, each of which is a $(p-1)$-dimensional manifold. The boundary of $\mathcal{U}^{(1)}$ is $\mathcal{U}^{(2)} \cup \cdots \cup \mathcal{U}^{(p)}$.
(b) $\mathcal{U}^{(2)}$ consists of $\binom{p}{2}$ connected components, each of which is a $(p-2)$-dimensional manifold. The boundary of $\mathcal{U}^{(2)}$ is $\mathcal{U}^{(3)} \cup \cdots \cup \mathcal{U}^{(p)}$.
(c) $\mathcal{U}^{(q)}$ consists of $\binom{p}{q}$ connected components, each of which is a $(p-q)$-dimensional manifold. The boundary of $\mathcal{U}^{(q)}$ is $\mathcal{U}^{(q+1)} \cup \cdots \cup \mathcal{U}^{(p)}$.
(d) $\mathcal{U}^{(p)}$ consists of a single point $u_0$.

In other words, the phase diagram of $H^u$ satisfies the Gibbs phase rule iff it is homeomorphic to a connected, open neighborhood $U$ of the boundary of the positive octant of $\mathbb{R}^p$, in such a way that $u_0$ is mapped onto the origin, $\mathcal{U}^{(p-1)}$ is mapped onto the union of the axes $\bigcup_i \{a_i > 0, \ a_j = 0, \ j \neq i\}$, and so on.
Connected components of $\mathcal{U}^{(1)}$ are the one-phase regions, or pure phase regions, $\mathcal{U}^{(2)}$ is the region of coexistence of two phases, ..., $\mathcal{U}^{(p)}$ is the point of coexistence of all $p$ phases. We will call a phase diagram which satisfies the Gibbs phase rule regular if the free energy is a real analytic function of $u$ in each one-phase region, and if all connected components of the manifolds $\mathcal{U}^{(j)}$ are smooth ($C^1$).

3.2. Contour models. A contour $\mathcal{A}$ is a pair $(A, \alpha)$, where $A \subset \mathbb{Z}^\nu$ is a finite connected set and is the support of $\mathcal{A}$; to describe α, let us introduce the closed unit cell $C(x) \subset \mathbb{R}^\nu$ centered at $x$, i.e. $C(x) = \{y \in \mathbb{R}^\nu : |y - x|_\infty \le \tfrac{1}{2}\}$. The boundary $B(A)$ of $A \subset \mathbb{Z}^\nu$ is the union of plaquettes
$$B(A) = \bigcup_{x \in A, \, y \notin A} C(x) \cap C(y). \tag{3.2}$$
The boundary $B(A)$ decomposes into connected components; each connected component $b$ is given a label $\alpha_b \in \{1, \dots, p\}$, and $\alpha = (\alpha_b)$.

Let $\Lambda \subset \mathbb{Z}^\nu$ be finite, with periodic boundary conditions. A set of contours $\{\mathcal{A}_1, \dots, \mathcal{A}_k\}$ is admissible iff
• $A_i \subset \Lambda$, and $\mathrm{dist}(A_i, A_j) \ge 1$ if $i \neq j$.
• Labels $\alpha_j$ are matching in the following sense. Let $W = \Lambda \setminus \bigcup_{j=1}^{k} A_j$; then each connected component of $W$ must have the same label on its boundaries.
For $j \in \{1, \dots, p\}$, let $W_j$ be the union of all connected components of $W$ with labels $j$ on their boundaries.

For each $j \in \{1, \dots, p\}$, we give ourselves a complex function $g_j^{\beta,u}$ (“free energy of a restricted ensemble”), that is real analytic in $u \in \mathcal{U}$. We suppose that the limit $\beta \to \infty$ of $g_j^{\beta,u}$ exists, and we write
$$e_i^u = \lim_{\beta \to \infty} \mathrm{Re}\, g_i^{\beta,u}, \qquad 1 \le i \le p, \tag{3.3}$$
$$e_0^u = \min_i e_i^u. \tag{3.4}$$
We consider the partition function (2.8) for an interaction $H^u = V^u + T$, where the periodic interaction $T$ is a perturbation of $V^u$. We assume that the partition function can be rewritten as
$$Z_\Lambda^{\beta,u,T} = \sum_{\{\mathcal{A}_1, \dots, \mathcal{A}_k\}} \prod_{j=1}^{k} w^{\beta,u,T}(\mathcal{A}_j) \prod_{i=1}^{p} e^{-\beta g_i^{\beta,u} |W_i|}, \tag{3.5}$$
where the sum is over admissible sets of contours in $\Lambda$.¹

¹ The sum includes the case $k = 0$, and the corresponding term is $\sum_{j=1}^{p} e^{-\beta g_j^{\beta,u} |\Lambda|}$. It is however irrelevant, since it does not contribute to the infinite-volume free energy (3.6).

The weight $w^{\beta,u,T}(\mathcal{A})$ of a contour $\mathcal{A}$ is a complex function of β, $u$, and $T$, that behaves nicely for β large and $T$ in a neighborhood of 0. Precisely, we assume that there exists a set $\mathcal{W} \subset \mathbb{R}_+ \times \mathcal{B}_r$, that is open and connected, and whose closure contains $(\infty, 0)$; furthermore, we suppose that for all $u \in \mathcal{U}$ and all $(\beta, T) \in \mathcal{W}$, and all contours $\mathcal{A}$,
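• $w^{\beta,u,T}$ is periodic with period ℓ, i.e. we have $w^{\beta,u,T}(\tau_a \mathcal{A}) = w^{\beta,u,T}(\mathcal{A})$ for all $a \in (\ell\mathbb{Z})^\nu$ and all $\mathcal{A}$. Here $\tau_a$ is the translation operator.
• $|w^{\beta,u,T}(\mathcal{A})| \le e^{-\beta e_0^u |A|} e^{-\tau |A|}$ for a large enough constant τ (depending on ν, $p$, and ℓ). Furthermore,
$$\Big| \frac{\partial}{\partial u_i} w^{\beta,u,T}(\mathcal{A}) \Big| \le \beta |A| \, C \, e^{-\beta e_0^u |A|} e^{-\tau |A|}$$
and
$$\Big| \frac{\partial}{\partial \eta} w^{\beta,u,T+\eta K}(\mathcal{A}) \Big| \le \beta |A| \, C \|K\|_r \, e^{-\beta e_0^u |A|} e^{-\tau |A|}$$
for a uniform constant $C$.
• $\lim_{\beta \to \infty} \lim_{T \to 0} w^{\beta,u,T}(\mathcal{A}) = 0$. This means that the weights represent the correction to the situation $(\beta = \infty, T = 0)$.
• $w^{\beta,u,T}(\mathcal{A})$ is real analytic in $u$; for all $K \in \mathcal{B}_r$, $w^{\beta,u,T+\eta K}(\mathcal{A})$ is real analytic in η in a neighborhood of 0 (the neighborhood depends on $K$).

Finally, the free energy is
$$f^{\beta,u,T} = -\frac{1}{\beta} \lim_{\Lambda \nearrow \mathbb{Z}^\nu} \frac{1}{|\Lambda|} \log Z_\Lambda^{\beta,u,T}. \tag{3.6}$$
We also assume the following properties for $f^{\beta,u,T}$:
• $f^{\beta,u,T}$ is real, and concave as a function of $T$;
• whenever $H^u + T = H^{u'} + T'$, we have
$$f^{\beta,u,T} = f^{\beta,u',T'}. \tag{3.7}$$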
Although these properties seem difficult to verify in the context of a contour model, they are usually clear in the original physical model.
3.3. The Pirogov–Sinai theory. The results of the Pirogov–Sinai theory are usually presented in terms of existence of many Gibbs states for a given interaction. However, it is more convenient to think of the Pirogov–Sinai theory as expressing the free energy in a form suitable for the description of first-order phase transitions: the free energy is given as the minimum of $C^1$ functions (“metastable free energies”) whose graphs intersect at nonzero angles, hence a first-order phase transition occurs when parameters are varied so as to cross an intersection. The free energy at zero temperature is given by (3.4); in typical situations this is the minimum over energies of some important configurations (the “potential ground states”). The Pirogov–Sinai theory shows that in contour models, this structure extends to low temperatures. In the quantum situation one is also interested in adding a perturbation to a “nice” model; the metastable free energies then depend not only on β, but also on the quantum perturbation. We claim that the Pirogov–Sinai theory allows one to construct metastable free energies that satisfy the following properties.
Properties of the metastable free energies. We consider a contour model that satisfies the structure described in Sect. 3.2. Then there exist $p$ real functions $f_i^{\beta,u,T}$ for $(\beta, T, u) \in \mathcal{W} \times \mathcal{U}$, such that
(a) $f^{\beta,u,T} = \min_i f_i^{\beta,u,T}$;
(b) $\lim_{\beta \to \infty} \lim_{T \to 0} f_i^{\beta,u,T} = e_i^u$, and $\lim_{\beta \to \infty} \lim_{T \to 0} \frac{\partial}{\partial u_j} f_i^{\beta,u,T} = \frac{\partial}{\partial u_j} e_i^u$;
(c) for all $K \in \mathcal{B}_r$, there exists a neighborhood $N_K$ of 0 such that $f_i^{\beta,u,T+\eta K}$ is $C^1$ as a function of $(u, \eta)$ in $\mathcal{U} \times N_K$, and $\big| \frac{\partial}{\partial \eta} f_i^{\beta,u,T+\eta K} \big| \le C \|K\|_r$ for a constant $C$ depending on ν, $p$, ℓ only;
(d) $f_i^{\beta,u,T}$ is a real analytic function of $u$ in $M_{\{i\}} = \{u : f_i^{\beta,u,T} < f_j^{\beta,u,T} \ \forall j \neq i\}$.

Notice that the point (d) implies that the free energy $f^{\beta,u,T}$ is a real analytic function of $u$ in $\bigcup_i M_{\{i\}}$ (which is the region of uniqueness, as will be seen below).

The proof of these properties involves the full artillery of the Pirogov–Sinai theory. The item (c) is not really standard and may appear as a superfluous technicality, but it plays a role when establishing the properties of the phase diagram, see Theorem 3.1 below. Since the present paper is only aimed at studying a special class of quantum models, we content ourselves with an outline of the proof, so as to make it plausible for readers who have knowledge of the details of the Pirogov–Sinai theory. A review of the Pirogov–Sinai theory is expected to appear shortly and will contain a detailed proof of these properties.

Sketch of the proof of these properties. We heavily rely on [BKU], which itself follows [PS, Sin, Zah, BI, BK, BK2]. Our metastable free energies are defined as the real part of the metastable free energies of [BKU], which are complex in general.

The first step consists in defining the metastable free energies. This can be done by introducing truncated contour activities and truncated partition functions following the inductive procedure of [BKU], Eqs. (5.6)–(5.12). One obtains metastable free energies $f_j^{(n)}$ (that depend on β, $u$, $T$). One can then prove the claims of Lemma A.1 i), iii), iv), v), vi) of [BKU]. We then set $f_j^{\beta,u,T} = \lim_{n \to \infty} f_j^{(n)}$.

At this point we have well-defined metastable free energies depending on β, $u$ and $T$ (that is, they are functionals on the Banach space of interactions), and the free energy of the system is given by the minimum of the metastable free energies, as stated in item (a). It is also clear that $\lim_{\beta \to \infty} \lim_{T \to 0} f_i^{\beta,u,T} = e_i^u$, and that $f_i^{\beta,u,T}$ is real analytic in $u$ on $M_{\{i\}}$. What remains to be done is to check differentiability properties.

For given $T$ and $K$, we consider $f_j^{\beta,u,T+\eta K}$ as a function of $(u, \eta)$. This is a mild complication of the situation in [BKU], since the metastable free energies here depend on $p$ parameters instead of $p - 1$. One then gets the items ii) and vii) of Lemma A.1, the partial derivatives with respect to η of the truncated contour activities and of the partition function with given external label satisfying the claims of the lemma with a constant $C_0 \|K\|_r$ instead of $C_0$. Finally, the metastable free energies are given as convergent series of clusters of contours, the weights of those obeying suitable bounds. This leads to item (c). □

We show now that these metastable free energies allow for a complete characterization of tangent functionals, under the extra assumption that the situation at zero temperature and without perturbation satisfies the Gibbs phase rule in a strong sense.
The stronger condition for the Gibbs phase rule is that, for some $u_0 \in \mathcal{U}$, all “potential ground state energies” are equal, $e_i^{u_0} = e_j^{u_0}$ for all $i, j$, and that the matrix of derivatives
$$\Big( \frac{\partial}{\partial u_j} \big( e_i^u - e_p^u \big) \Big)_{1 \le i,j \le p-1} \tag{3.8}$$
has an inverse that is uniformly bounded. Actually, the energies $e_i^u$ may not be differentiable; in this case, we consider the same matrix with $\mathrm{Re}\, g_i^{\beta,u}$ instead of $e_i^u$, and we suppose that it has an inverse for all β large enough, the inverse matrix being uniformly bounded with respect to $u \in \mathcal{U}$ and $\beta \ge \mathrm{const}$.

Theorem 3.1 (Stability of the phase diagram). Assume that there exist metastable free energies $f_i^{\beta,u,T}$, $1 \le i \le p$, that satisfy all points (a)–(d) of the properties above. We assume in addition that the strong version of the Gibbs phase rule, described above, is satisfied. Then for β large enough and $\|T\|_r$ small enough (depending on $p$ and on the bound of the inverse of the matrix of derivatives (3.8)), there exists $\mathcal{U}' \subset \mathcal{U}$ such that the phase diagram for $H^u + T$, $u \in \mathcal{U}'$, at inverse temperature β, satisfies the Gibbs phase rule and is regular.

Theorem 3.1 states that there exists $u_0' \in \mathcal{U}'$ such that the set of tangent functionals to the free energy at $H^{u_0'} + T$ is a simplex with $p$ extremal points. More generally, we have the decomposition $\mathcal{U}' = \mathcal{U}'^{(1)} \cup \cdots \cup \mathcal{U}'^{(p)}$ such that for $u \in \mathcal{U}'^{(q)}$, the set of tangent functionals at $H^u + T$ is a simplex with $q$ extremal points.

This “completeness” of the phase diagram was addressed in [Zah] and [BW]. The approach was however different and involved studying the Gibbs states, which is more intricate and does not easily extend to the quantum case. It is simpler to look at tangent functionals, and then to use existing results on their equivalence with DLR or KMS states.

Notice that the Pirogov–Sinai theory also provides various extra information, such as the fact that the limit of $\mathcal{U}'^{(q)}$, as $T \to 0$ and $\beta \to \infty$, is equal to $\mathcal{U}^{(q)}$. Also, the extremal equilibrium states can be shown to be exponentially clustering. We do not claim these properties here however, because doing so would require extra assumptions and technicalities in the description of the abstract contour model.

Proof of Theorem 3.1. Items (b) and (c) of the properties of metastable free energies (with $\eta = 0$) imply that there exists $u_0'$ such that $f_i^{\beta,u_0',T} = f_j^{\beta,u_0',T}$ for all $i, j$, and that the matrix of derivatives
$$\Big( \frac{\partial}{\partial u_j} \big( f_i^{\beta,u,T} - f_p^{\beta,u,T} \big) \Big)_{1 \le i,j \le p-1} \tag{3.9}$$
has a bounded inverse, uniformly in $u$ in a neighborhood $\mathcal{U}'$ of $u_0'$. Let us define
$$M_i = \big\{ u \in \mathcal{U}' : f_i^{\beta,u,T} = \min_j f_j^{\beta,u,T} \big\}, \tag{3.10}$$
and, for $Q \subset \{1, \dots, p\}$,
$$M_Q = \bigcap_{i \in Q} M_i \setminus \bigcup_{i \notin Q} M_i \tag{3.11}$$
(notice that $M_{\{i\}} \subsetneq M_i$ in general). By the implicit function theorem, each $M_Q$ is described by a $C^1$ function from an open subset of $\mathbb{R}^{p-|Q|}$ into $\mathcal{U}'$. If we set $\mathcal{U}'^{(q)} = \bigcup_{|Q|=q} M_Q$ the phase diagram satisfies the Gibbs phase rule, provided there are exactly $|Q|$ tangent functionals at $H^u + T$ for each $u \in M_Q$.

Each metastable free energy $f_j^{\beta,u,T}$, $j \in Q$, defines a tangent functional $\alpha_j$: for all $K \in \mathcal{B}_r$, we set $\alpha_j(K) = \frac{\partial}{\partial \eta} f_j^{\beta,u,T+\eta K} \big|_{\eta=0}$. Notice that item (c) ensures boundedness of the tangent functional.² We show now that these tangent functionals are linearly independent, and that any other tangent functional is a linear combination of these ones.

² One may wonder whether the functional $\alpha_j$ is linear. It is actually, because $\alpha_j$ can be obtained as the limit of linear functionals that are tangent to the free energy, uniquely defined for all points of $M_{\{j\}}$, a region of parameters where the concave free energy has a unique tangent functional.

We examine the manifold where $q$ phases coexist; without loss of generality, we can choose $\tilde{u} \in M_Q$ with $Q = \{1, \dots, q\}$. The determinant of (3.9) can be written as a linear combination of determinants of
$$\Big( \frac{\partial}{\partial u_{k_j}} \big( f_i^{\beta,\tilde{u},T} - f_q^{\beta,\tilde{u},T} \big) \Big)_{1 \le i,j \le q-1}, \tag{3.12}$$
with $k_1, \dots, k_{q-1}$ being $q - 1$ different indices. Since the determinant of (3.9) differs from 0, at least one of the determinants in the previous equation differs from 0. Without loss of generality we can assume that
$$\Big( \frac{\partial}{\partial u_j} \big( f_i^{\beta,\tilde{u},T} - f_q^{\beta,\tilde{u},T} \big) \Big)_{1 \le i,j \le q-1} \tag{3.13}$$
is not singular.

Our analysis is local, so we can take $\tilde{u} = 0$ and $H^u = H^0 + \sum_{j=1}^{p-1} u_j K^j$. Then (3.7) implies that $\alpha_j(K^i) = \frac{\partial}{\partial u_i} f_j^{\beta,u,T} \big|_{u=0}$, and non-singularity of (3.13) shows that $\alpha_j$, $1 \le j \le q$, are linearly independent. Furthermore, it also implies that for all tangent functionals $\alpha'$ the system of equations for $\xi = (\xi_1, \dots, \xi_q)$,
$$\alpha'(K^i) = \sum_{j=1}^{q} \xi_j \alpha_j(K^i), \qquad i = 1, \dots, q-1, \tag{3.14}$$
has a unique solution with $\sum_j \xi_j = 1$. Now we consider any $K \in \mathcal{B}_r$; we define $g_j(u, \eta) = f_j^{\beta,u,T+\eta K}$, $1 \le j \le q$, and
$$g(u, \eta) = \begin{pmatrix} g_1(u, \eta) - g_q(u, \eta) \\ \vdots \\ g_{q-1}(u, \eta) - g_q(u, \eta) \end{pmatrix}. \tag{3.15}$$
We have $g(0, 0) = 0$, $\frac{\partial}{\partial u} g(0, 0)$ is an isomorphism, and $g(u, \eta)$ is a map of class $C^1$ by item (c) of the properties of metastable free energies. By the implicit function theorem
there exists a map $u(\eta)$ such that $g(u(\eta), \eta) = 0$. We introduce the interactions
$$R(\eta) = K + \frac{1}{\eta} \sum_{j=1}^{q-1} u_j(\eta) K^j, \tag{3.16}$$
$$R = \lim_{\eta \to 0} R(\eta) = K + \sum_{j=1}^{q-1} u_j'(0) K^j. \tag{3.17}$$
Then using (3.7) we have
$$f^{\beta,0,T+\eta R(\eta)} = f_1^{\beta,0,T+\eta R(\eta)} = \cdots = f_q^{\beta,0,T+\eta R(\eta)}. \tag{3.18}$$
Differentiating with respect to η, we obtain (recall that $\alpha'$ is tangent to $f^{\beta,0,T+\eta R(\eta)}$ at $\eta = 0$)
$$\alpha'(R) = \alpha_1(R) = \cdots = \alpha_q(R). \tag{3.19}$$
Then obviously $\alpha'(R) = \sum_j \xi_j \alpha_j(R)$, and it follows by linearity of the tangent functionals that
$$\alpha'(K) = \sum_{j=1}^{q} \xi_j \alpha_j(K). \tag{3.20}$$
□
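The sets $M_Q$ of (3.10) and (3.11) can be visualized with a toy example. The sketch below assumes three hypothetical affine metastable free energies on a two-dimensional parameter space ($p = 3$), chosen so that the matrix of derivatives (3.9) is the identity; it returns, for a given parameter value, the set $Q$ of indices at which the minimum is attained. The tolerance parameter and both function names are assumptions of the example.

```python
def toy_free_energies(u):
    """Three hypothetical affine metastable free energies f_1, f_2, f_3 at u = (u1, u2)."""
    u1, u2 = u
    return [u1, u2, 0.0]

def coexisting_phases(free_energies, tol=1e-12):
    """Return Q = {i : f_i = min_j f_j}, with phases labelled 1, ..., p."""
    lowest = min(free_energies)
    return frozenset(i + 1 for i, f in enumerate(free_energies) if f - lowest <= tol)

def phase_set(u):
    """The index set Q such that u belongs to M_Q in the toy model."""
    return coexisting_phases(toy_free_energies(u))
```

In this toy model the origin is the point where all three phases coexist, the three half-lines $\{u_1 = u_2 < 0\}$, $\{u_1 = 0 < u_2\}$ and $\{u_2 = 0 < u_1\}$ are the two-phase lines, and the remaining open regions are the one-phase regions, which is the picture required by the Gibbs phase rule for $p = 3$.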
4. Results of the Quantum Pirogov–Sinai Theory

We summarize in this section the results obtained in [BKU, DFF, DFFR, KU], and in the present paper. All results concern the situation where the interaction has the form $H = V + T$, where $V$ is a classical interaction satisfying the standard Pirogov–Sinai framework, and $T$ is a small perturbation. The temperature will be assumed to be small. The results however split into four classes, according to whether we use the perturbation methods of [DFFR] (Sect. 2.2), and whether we include high temperature expansions to analyze phases at intermediate temperatures.

In this section, we implicitly assume all properties of the metastable free energies, see Subsect. 3.3, to be valid; without these properties the statements below would not include completeness, i.e. we could not be certain to have identified all the periodic Gibbs states of the systems.

4.1. Quantum perturbation of classical model with finitely many ground states. In this case the classical interaction $V$ has finitely many ground states and the phase diagram of $V + T$ is, at low temperatures and for sufficiently small $T$, a small deformation of the zero temperature phase diagram of $V$. The extension of the Pirogov–Sinai theory to this class of quantum systems goes back to [Pir] and was proved in [BKU, DFF].

(a) Structure. We denote by $\Omega = \{1, \dots, M\}^{\mathbb{Z}^\nu}$ the space of classical configurations; the dimension ν of the physical space is always supposed to be greater than or equal to 2. The interaction has the form $H = V + T$, where $V$ is a block interaction and is diagonal with
respect to the basis of classical configurations: if A = U (x) ≡ {y : |y − x|∞ R} for some x ∈ Zν , VA |e&ω = 0x (ωU (x) ) |e&ω ,
(4.1)
and VA = 0 if there is no x with U (x) = A. The function 0x depends on µ ∈ U ⊂ Rp−1 , and we assume that its derivatives ∂µ∂ j 0x (ωU (x) ) are bounded uniformly in x, µ, ω, j .
A finite set G = {g (1) , . . . , g (p) } ⊂ . of periodic configurations is given, that contains all ground states of V for all µ (see below the precise assumption). We write GA = {gA : g ∈ G}. We suppose that 0x (gU (x) ) is independent of x, for all g ∈ G, and µ this value is denoted by eg (this is the mean energy of the configuration g). (b) Assumptions. (A1) A gap separates the excitations: for all ωU (x) ∈ / GU (x) , 0x (ωU (x) ) − min 0x (gU (x) ) D g∈G
(uniformly in µ). (A2) The zero temperature phase diagram is (linearly) regular: there is µ0 ∈ U such µ µ µ that eg 0 = eg 0 for all g, g ∈ G, and the inverse of the matrix of derivatives MG , see (3.8), is uniformly bounded. (c) Properties of Gibbs states. Theorem 4.1. Assume (A1) and (A2) hold true. There exist β0 , c < ∞ (depending on ν, R, p, M and on the periods of {g (j ) } and H only) such that if βD β0 and T c /D 1, the phase diagram of the quantum model satisfies the Gibbs phase rule and is regular in a neighborhood U ⊂ U of µ0 . In the single phase region, i.e. if µ ∈ Mβ ({g}), the KMS state w β,µ,T (·) is close to the ground state g: for all K ∈ OA , limβ→∞,T r →0 w β,µ,T (K) = %eg |K|eg &. The condition T c /D 1 means that T is a perturbation with respect to V ; c plays the role of the perturbative parameter: from Definition (2.6) of the norm · c , TA must be very small if c is very large. The proof of this theorem follows from [BKU, DFF]. 4.2. Models with infinite degeneracy. Consider a model whose classical part has infinitely many ground states, and a perturbation which lifts this degeneracy completely. The pertubation methods of [DFFR] (see Sect. 2.2) permits one in certain cases to analyze this by constructing an equivalent interaction with a new classical part which has finitely many ground states. In this case the new perturbation has a slightly more complicated form than in Sect. 4.1 and the following theorem deals with this situation. This situation was considered in [DFFR] (for a different approach see [KU]). ν (a) Structure. The space of classical configurations is again . = {1, . . . , M}Z . We consider two sets G, D ⊂ ., with D ⊂ G finite, D = {d (1) , . . . , d (p) } is a finite set of periodic configurations; G may be infinite and will represent the configurations of low energy. 
For $A \subset \mathbb{Z}^\nu$, the Hilbert space $\mathcal{H}_A$ has the following decomposition: $\mathcal{H}_A = \mathcal{H}^{\rm low}_A \oplus \mathcal{H}^{\rm high}_A$, where $\mathcal{H}^{\rm low}_A$ is the subspace spanned by the low energy configurations $g_A \in G_A$. The interaction has the form $H = V + T$, where $V$ is a classical block
Quantum Lattice Models at Intermediate Temperature
interaction with uniformly bounded derivatives $\frac{\partial}{\partial\mu_j}\Phi_x(\omega_{U(x)})$, and $T$ is a perturbation that is subject to some restrictions, see the assumptions below.

(b) Assumptions.

(B1) A gap separates high and low energies: for all $\omega_{U(x)} \notin G_{U(x)}$,
\[ \Phi_x(\omega_{U(x)}) - \max_{g \in G} \Phi_x(g_{U(x)}) \geq \Delta_0. \]

(B2) Gap with the ground states: we assume that $\Phi_x(d_{U(x)})$ is independent of $x$ for $d \in D$, and for all $\omega_{U(x)} \notin D_{U(x)}$,
\[ \Phi_x(\omega_{U(x)}) - \min_{d \in D} \Phi_x(d_{U(x)}) \geq \Delta \]
(and we assume that $\Delta \leq \Delta_0$).

(B3) The perturbation may be decomposed as $T = K + K' + K''$; for all $A$,
\[ K_A \mathcal{H}^{\rm high}_A = 0, \qquad K_A \mathcal{H}^{\rm low}_A \subset \mathcal{H}^{\rm low}_A; \qquad K'_A \mathcal{H}^{\rm low}_A \subset \mathcal{H}^{\rm high}_A, \qquad K'_A \mathcal{H}^{\rm high}_A \subset \mathcal{H}^{\rm low}_A \]
(there is no assumption on $K''$).³

(B4) The zero temperature phase diagram is (linearly) regular, i.e. all energies $e^\mu_d$ are equal for some $\mu_0 \in \mathcal{U}$, and the matrix $M^\mu_D$ [see (3.8)] has a uniformly bounded inverse.

(c) Properties of Gibbs states.

Theorem 4.2. Assume (B1)–(B4) hold true. There exist $\beta_0, c < \infty$ (depending on $\nu$, $R$, $p$, $M$ and on the periods of $\{d^{(j)}\}$ and $H$ only) such that if $\beta\Delta \geq \beta_0$, $\|K\|_c/\Delta \leq 1$, $\|K'\|_c/\Delta_0 \leq 1$, $\|K''\|_c/\Delta_0 \leq 1$, the phase diagram of the quantum model satisfies the Gibbs phase rule and is regular in $\mathcal{U}' \subset \mathcal{U}$, $\mathcal{U}' \ni \mu_0$. In the single phase region, i.e. if $\mu \in \mathcal{M}^\beta(\{d\})$, the KMS state $w^{\beta,\mu,T}(\cdot)$ is close to the ground state $d$: for all $K \in \mathcal{O}_A$,
\[ \lim_{\beta\to\infty,\ \|T\|_r \to 0} w^{\beta,\mu,T}(K) = \langle e_d | K | e_d \rangle. \]

The proof of this theorem is given in [DFFR]. A somewhat different method yielding similar results was developed later in [KU].

4.3. Combined high and low temperature expansions. Here we consider models whose classical part $V$ has partially ordered ground states, typically described by periodic configurations of holes and particles but still with infinite degeneracy due to, e.g., degeneracy of the spin at each site. Together with the quantum perturbation the system may have a continuous symmetry. We will suppose that the temperature is low and, in addition, that $\beta\|T\|_c$ is actually small (i.e. the temperature is large compared to $T$), and we will prove that in this case one phase corresponds to each periodic configuration of holes and particles and that in this phase the spin degrees of freedom are in a disordered phase. This situation has many similarities with that of [BKL], and could be called “a theory of restricted ensembles in quantum lattice systems”.

³ Motivation comes from (2.25). It is however slightly more general, and it is just what is required in the proof of Theorem 4.2.
J. Fröhlich, L. Rey-Bellet, D. Ueltschi
(a) Structure. As before, let $\Omega = \{1, \dots, M\}^{\mathbb{Z}^\nu}$. Intermediate temperature phases will be characterized by “motives” giving partial information on the underlying configurations. In order to describe this, we consider a partition of $\{1, \dots, M\}$:
\[ \{1, \dots, M\} = \bigcup_{j=1}^{N} I_j \quad \text{with } I_i \cap I_j = \emptyset \text{ for } i \neq j. \tag{4.2} \]
We denote $\mathbf{N} = \{1, \dots, N\}$ (and $\mathcal{N} \equiv \mathbf{N}^{\mathbb{Z}^\nu}$). For $n \in \mathcal{N}$, we write $\Omega_n = \{\omega \in \Omega : \omega_x \in I_{n_x}\ \forall x\}$. Let $G = \{g^{(1)}, \dots, g^{(p)}\} \subset \mathcal{N}$ be a finite set of periodic configurations; this is the set of motives, and a pure phase will be associated with each of these configurations. We write $\Omega_G = \cup_{g \in G}\, \Omega_g$. The interaction has the form $H = V + T$, where $V$ is a classical block interaction with uniformly bounded derivatives w.r.t. $\mu$, and $T$ is a perturbation. We introduce restricted partition functions for each $g \in G$: let
\[ Z^g_\Lambda = \sum_{\omega_\Lambda \in \Omega_{g,\Lambda}} e^{-\beta \sum_{x,\, U(x) \subset \Lambda} \Phi_x(\omega_{U(x)})} \tag{4.3} \]
and
\[ h^{\beta,\mu}_g = -\frac{1}{\beta} \lim_{\Lambda \nearrow \mathbb{Z}^\nu} \frac{1}{|\Lambda|} \log Z^g_\Lambda. \tag{4.4} \]
The ground energies are $e^\mu_g = \lim_{\beta\to\infty} h^{\beta,\mu}_g$, $g \in G$.

(b) Assumptions.

(C1) For all configurations $\omega_{U(x)} \notin \Omega_{G,U(x)}$, we have
\[ \Phi_x(\omega_{U(x)}) - \min_{\omega' \in \Omega_G} \Phi_x(\omega'_{U(x)}) \geq \Delta. \]
Moreover, we assume that
\[ \min_{\omega_{U(x)} \in \Omega_{g,U(x)}} \Phi_x(\omega_{U(x)}) = e^\mu(g) \]
independently of $x$, for all $g \in G$.

(C2) We need a condition that ensures that no phase transition takes place in a restricted ensemble $\Omega_g$; in other words, spatial correlations should decay quickly enough. The following condition is stronger, and amounts to saying that there is no correlation between different sites. For all $g \in G$, we suppose that there exists an on-site interaction $\Phi^g$ such that for all $x$:
\[ \Phi_x(\omega_{U(x)}) = \Phi^g_x(\omega_x) \quad \text{for all } \omega \in \Omega_g. \]

(C3) The zero temperature phase diagram is regular with $e^{\mu_0}_g = e^{\mu_0}_{g'}$, $g, g' \in G$, for some $\mu_0 \in \mathcal{U}$, and the matrix $M^\mu_G$, see (3.8), has a uniformly bounded inverse.⁴

⁴ If $\{e^\mu_g\}$ are not $C^1$, we consider the matrix of derivatives of $h^{\beta,\mu}_g$ for $\beta$ large; it must have an inverse that is bounded uniformly w.r.t. $\mu$ and large $\beta$.
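As an illustration of (4.4), here is a minimal Python sketch under a simplifying assumption that is not part of the text: the motive is translation invariant and the on-site condition (C2) holds, so that the restricted partition function factorizes over sites (boundary terms neglected) and the restricted free energy reduces to a single-site log-sum. The function name and the two-level spectrum are hypothetical.

```python
import math

def restricted_free_energy(beta, site_energies):
    """Single-site restricted free energy h = -(1/beta) log sum_i exp(-beta e_i).

    Assumes an on-site interaction as in (C2) and a translation-invariant
    motive, so the infinite-volume limit in (4.4) reduces to one site.
    """
    e_min = min(site_energies)
    # Shift by e_min so the exponentials stay bounded at large beta.
    s = sum(math.exp(-beta * (e - e_min)) for e in site_energies)
    return e_min - math.log(s) / beta

# Hypothetical spectrum: two degenerate allowed states per site (e.g. spin up/down).
energies = [-1.0, -1.0]
for beta in (1.0, 10.0, 100.0):
    print(beta, restricted_free_energy(beta, energies))
```

For a two-fold degenerate level the result is $e - (\log 2)/\beta$: the entropy term vanishes as $\beta \to \infty$, consistent with $e^\mu_g = \lim_{\beta\to\infty} h^{\beta,\mu}_g$.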
(c) Gibbs states at intermediate temperature.

Theorem 4.3. Assume (C1)–(C3) hold true. There exist $\beta_0, c < \infty$ (depending on $\nu$, $R$, $p$, $M$ and on the periods of $\{g^{(j)}\}$ and $H$ only) such that if $\beta_0 \leq \beta\Delta < \infty$ and $\beta\|T\|_c \leq 1$, the phase diagram satisfies the Gibbs phase rule and is regular in $\mathcal{U}' \subset \mathcal{U}$, $\mathcal{U}' \ni \mu_0$. In the single phase region, i.e. if $\mu \in \mathcal{M}^\beta(\{g\})$, the KMS state $w^{\beta,\mu,T}(\cdot)$ is close to the motive $g$: for all $K \in \mathcal{O}_A$,
\[ \lim_{\beta\to\infty,\ \|T\|_r \to 0} w^{\beta,\mu,T}(K) = (\mathrm{Tr}(P_A))^{-1}\, \mathrm{Tr}(K P_A), \]
where $P_A$ is the projection given by $\sum_{\omega_A \in \Omega_{g,A}} |e_{\omega_A}\rangle\langle e_{\omega_A}|$.

Remark. It follows from our assumptions that $T$ is small compared to $V$; more precisely, $\|T\|_c/\Delta \leq 1/\beta_0$. This theorem is actually a consequence of Theorem 4.4 below, see the remark after Theorem 4.4.

4.4. Infinite degeneracy, high and low temperature expansions. Here we consider systems where phases result from a subtle interplay between potential and kinetic energy, combining the effects described in Sects. 4.2 and 4.3. The quantum perturbation partially lifts the degeneracy of the classical interaction, leading, at intermediate temperatures, to spatially ordered phases. Hereafter we describe the general framework in a rather abstract way; it will be illustrated in Sect. 5, and the reader may gain better understanding by working out a concrete application.

(a) Structure. The space of classical configurations is $\Omega = \{1, \dots, M\}^{\mathbb{Z}^\nu}$; we consider a partition like in (4.2) and define similarly $\mathcal{N}$ and $\Omega_n$. We consider a (possibly infinite) set $G \subset \mathcal{N}$ that represents low energy configurations; the Hilbert spaces decompose in the following way: $\mathcal{H}_A = \mathcal{H}^{\rm low}_A \oplus \mathcal{H}^{\rm high}_A$, where $\mathcal{H}^{\rm low}_A$ is the subspace spanned by the low-energy configurations $g_A \in G_A$. The interaction has the form $H = V + T$; $V$ is a block interaction with uniformly bounded derivatives $\frac{\partial}{\partial\mu_j}\Phi_x(\omega_{U(x)})$; the perturbation $T$ decomposes further as $T = K + K' + K''$; we shall require different assumptions on $K$, $K'$, $K''$, motivated by the perturbation theory of Sect. 2.2. We suppose that a finite set $D = \{d^{(1)}, \dots, d^{(p)}\} \subset G$ is given, that corresponds to possible ground states. For each $d \in D$, we define the corresponding restricted partition function
\[ Z^d_\Lambda = \sum_{\omega_\Lambda \in \Omega_{d,\Lambda}} e^{-\beta \sum_{x,\, U(x) \subset \Lambda} \Phi_x(\omega_{U(x)})} \tag{4.5} \]
and the corresponding restricted free energy
\[ h^{\beta,\mu}_d = -\frac{1}{\beta} \lim_{\Lambda \nearrow \mathbb{Z}^\nu} \frac{1}{|\Lambda|} \log Z^d_\Lambda, \]
and $e^\mu_d = \lim_{\beta\to\infty} h^{\beta,\mu}_d$.

(b) Assumptions.

(D1) A gap separates high and low energies: for all $\omega_{U(x)} \notin \Omega_{G,U(x)}$,
\[ \Phi_x(\omega_{U(x)}) - \max_{\omega' \in \Omega_G} \Phi_x(\omega'_{U(x)}) \geq \Delta_0. \tag{4.6} \]
(D2) Gap with the ground states: for all $\omega_{U(x)} \notin \Omega_{D,U(x)}$,
\[ \Phi_x(\omega_{U(x)}) - \min_{\omega' \in \Omega_D} \Phi_x(\omega'_{U(x)}) \geq \Delta. \]

(D3) For all $d \in D$, there exists an on-site interaction $\Phi^d$ such that for all $\omega \in \Omega_d$ and all $x$, $\Phi_x(\omega_{U(x)}) = \Phi^d_x(\omega_x)$. Moreover, we suppose that
\[ \min_{\omega_x \in I_{d_x}} \Phi^d_x(\omega_x) = e^\mu_d \]
independently of $x$.

(D4) The quantum perturbation $T = K + K' + K''$ has the same properties as in (B3), with respect to the decomposition into low and high energy states.

(D5) There is $\mu_0 \in \mathcal{U}$ such that $e^{\mu_0}_d = e^{\mu_0}_{d'}$, $d, d' \in D$, and the matrix of derivatives (3.8) has a uniformly bounded inverse (see the footnote of (C3) if $e^\mu_d$ is not differentiable).

(c) Properties of Gibbs states.

Theorem 4.4. Assume (D1)–(D5) hold true. There exist $\beta_0, c < \infty$ (depending on $\nu$, $R$, $p$, $M$ and on the periods of $\{d^{(j)}\}$ and $H$ only) such that if $\beta_0 \leq \beta\Delta < \infty$, $\beta\|K\|_c \leq 1$, $\|K'\|_c/\Delta_0 \leq 1$, $\|K''\|_c/\Delta_0 \leq 1$, and $\beta\|K'\|_c^2/\Delta_0 \leq 1$, the phase diagram satisfies the Gibbs phase rule and is regular in an open set $\mathcal{U}' \subset \mathcal{U}$ that contains $\mu_0$. In the single phase region, i.e. if $\mu \in \mathcal{M}^\beta(\{d\})$, the KMS state $w^{\beta,\mu,T}(\cdot)$ is close to the motive $d$: for all $K \in \mathcal{O}_A$,
\[ \lim_{\beta\to\infty,\ \|T\|_r \to 0} w^{\beta,\mu,T}(K) = (\mathrm{Tr}(P_A))^{-1}\, \mathrm{Tr}(K P_A), \]
where $P_A$ is the projection given by $\sum_{\omega_A \in \Omega_{d,A}} |e_{\omega_A}\rangle\langle e_{\omega_A}|$.

This theorem follows from the contour representation obtained in Sect. 6, together with the Pirogov–Sinai theory.

Remarks. 1. Theorem 4.3 is an immediate consequence of Theorem 4.4. Indeed, we clearly recover the setting of Sect. 4.3 by choosing $G = \Omega$ (i.e. all configurations have low energy), and $K' = K'' = 0$.

2. These two theorems also generalize results of [Uel]: they can be applied to the Hubbard model
\[ H = -t \sum_{\langle x,y\rangle} \sum_{\sigma=\uparrow,\downarrow} (c^\dagger_{x\sigma} c_{y\sigma} + \text{h.c.}) + U \sum_x n_{x\uparrow} n_{x\downarrow}, \tag{4.7} \]
to show that the high temperature phase extends to
\[ \{(\beta, t, U) : \beta t \text{ small}\} \quad \text{and} \quad \{(\beta, t, U) : \beta t^2/U \text{ small}\} \]
(standard high temperature expansions apply when both $\beta t$ and $\beta U$ are small).
5. Example: Extended Hubbard Model

This is a Hubbard model where particles interact with each other when their distance is smaller than or equal to 1. Explicitly,
\[ H = -t \sum_{\langle x,y\rangle \subset \Lambda} \sum_{\sigma=\uparrow,\downarrow} (c^\dagger_{x\sigma} c_{y\sigma} + \text{h.c.}) + U \sum_{x \in \Lambda} n_{x\uparrow} n_{x\downarrow} + W \sum_{\langle x,y\rangle \subset \Lambda} n_x n_y - \mu \sum_{x \in \Lambda} n_x. \tag{5.1} \]
Here, $c^\dagger_{x\sigma}$, $c_{x\sigma}$ are creation and annihilation operators of a fermion of spin $\sigma$ at site $x$; $\langle x,y\rangle$ stands for a pair of nearest neighbor sites; $n_{x\sigma} = c^\dagger_{x\sigma} c_{x\sigma}$ is the number of particles of spin $\sigma$ at $x$ (it has eigenvalues 0 and 1); $n_x = n_{x\uparrow} + n_{x\downarrow}$ is the total number of particles at $x$. The coefficient $t$ represents the hopping, and will be taken to be small compared to the nearest-neighbor repulsion $W$; $\mu$ is the chemical potential. The classical limit $t \to 0$ was studied in [Jęd, BJK]. The stability of the chessboard phase $M_{(0,2)}$ (see below) with small $t$ is a straightforward application of [DFF]; a later study devoted to it is [BK3].

We start by analyzing the classical interactions. The configuration space is $\Omega = \{0, \uparrow, \downarrow, 2\}^{\mathbb{Z}^\nu}$ and the corresponding classical interaction can be written as (taking $R = \frac12$)
\[ \Phi_x(\omega_{U(x)}) = \frac{U}{2^\nu} \sum_{y \in U(x)} \delta_{\omega_y,2} + \frac{W}{2^{\nu-1}} \sum_{\langle y,z\rangle \subset U(x)} q_y q_z - \frac{\mu}{2^\nu} \sum_{y \in U(x)} q_y. \tag{5.2} \]
Here we introduced $q_y \in \{0, 1, 2\}$:
\[ q_y = \begin{cases} 0 & \text{if } \omega_y = 0, \\ 1 & \text{if } \omega_y = \uparrow \text{ or } \omega_y = \downarrow, \\ 2 & \text{if } \omega_y = 2. \end{cases} \tag{5.3} \]
The interaction can also be written as a sum over pairs of n.n. sites; this simplifies the analysis of the zero temperature phase diagram, and the search for symmetries (see below). This pair interaction is given by
\[ \Phi_{\langle x,y\rangle}(q_x, q_y) = \frac{U}{2\nu} (\delta_{q_x,2} + \delta_{q_y,2}) + W q_x q_y - \frac{\mu}{2\nu} (q_x + q_y). \tag{5.4} \]
This model has a hole-particle symmetry. Introducing the unitary operator $\mathcal{U}$ such that $\mathcal{U} c^\dagger_{x\sigma} \mathcal{U}^{-1} = c_{x\sigma}$ and $\mathcal{U} c_{x\sigma} \mathcal{U}^{-1} = c^\dagger_{x\sigma}$, we see that $\mathcal{U} T \mathcal{U}^{-1} = T$. As for the potential, the effect of the symmetry can be exhibited by considering classical configurations; defining $q'_x = 2 - q_x$ and $\mu' = U + 4\nu W - \mu$, we easily check that
\[ \Phi^{\mu'}_{\langle x,y\rangle}(q'_x, q'_y) = \Phi^{\mu}_{\langle x,y\rangle}(q_x, q_y) + C, \tag{5.5} \]
where $C = -U/\nu - 4W + 2\mu/\nu$ does not depend on $(q_x, q_y)$. As a result, the phase diagrams $(U, \mu)$ are symmetric with respect to the line
\[ \mu = \frac{U}{2} + 2\nu W, \tag{5.6} \]
for any temperature.
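The identity (5.5) can be checked by direct enumeration of the nine pairs $(q_x, q_y)$. The following Python sketch does this for arbitrary numerical parameters; the function names and the sample values are illustrative assumptions, while the formulas for the pair interaction, $\mu'$ and $C$ are those of (5.4) and (5.5).

```python
from itertools import product

def pair_energy(qx, qy, U, W, mu, nu):
    """Pair interaction (5.4) for occupation numbers qx, qy in {0, 1, 2}."""
    on_site = U / (2 * nu) * ((qx == 2) + (qy == 2))
    return on_site + W * qx * qy - mu / (2 * nu) * (qx + qy)

def symmetry_defect(U, W, mu, nu):
    """Largest violation of (5.5) over all pairs (qx, qy)."""
    mu_prime = U + 4 * nu * W - mu            # transformed chemical potential
    C = -U / nu - 4 * W + 2 * mu / nu         # constant appearing in (5.5)
    return max(
        abs(pair_energy(2 - qx, 2 - qy, U, W, mu_prime, nu)
            - pair_energy(qx, qy, U, W, mu, nu) - C)
        for qx, qy in product(range(3), repeat=2)
    )

# Sample parameters (hypothetical); the defect should vanish up to rounding.
print(symmetry_defect(U=1.3, W=0.7, mu=2.1, nu=3))
```

The fixed points of $\mu \mapsto \mu'$ are exactly the points of the line (5.6).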
[Figure 2. Panel (a): axes $U/\nu|W|$ and $\mu/\nu|W|$, domains $M_0$, $M_1$, $M_2$. Panel (b): axes $U/\nu W$ and $\mu/\nu W$, domains $M_0$, $M_1$, $M_2$, $M_{(0,1)}$, $M_{(0,2)}$, $M_{(1,2)}$.]

Fig. 2. Zero temperature phase diagrams of the extended Hubbard model, (a) when $W < 0$ and (b) when $W > 0$. The dashed line represents the hole-particle symmetry, see (5.6)
The zero temperature phase diagrams with $t = 0$ are depicted in Fig. 2, in both cases $W < 0$ and $W > 0$. In the case $W < 0$, the diagram decomposes into three domains $M_0$, $M_1$, and $M_2$; $M_0$ and $M_2$ have a unique translation invariant ground state with respectively 0 and 2 particles at each site. In $M_1$, any configuration with one particle per site is a ground state; there is degeneracy $2^{|\Lambda|}$ since each particle has spin $\uparrow$ or $\downarrow$. The situation $W > 0$ presents a richer structure with six domains. Domains $M_0$, $M_1$ and $M_2$ have the same features as with attractive n.n. interactions. In between, domains $M_{(0,2)}$, $M_{(1,2)}$ and $M_{(0,1)}$ now appear. $M_{(0,2)}$ consists of two ground states, the two chessboard configurations with alternately 0 and 2 electrons per site. $M_{(0,1)}$ has $2 \cdot 2^{\frac12|\Lambda|}$ ground states of the chessboard type, one sublattice being empty, while the other has exactly one particle of spin $\uparrow$ or $\downarrow$ at each site; $M_{(1,2)}$ is similar, with 2 particles per site on one sublattice and one on the other. We are interested in the case where the temperature is small, but bigger than 0, and with small hopping. The phase diagrams for large $\beta$ and small $\beta t$ are presented in Fig. 3.
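The domains of Fig. 2 can be explored numerically from the pair interaction (5.4). The sketch below relies on an observation that is not spelled out in the text: since $\mathbb{Z}^\nu$ is bipartite and the classical energy is a sum over nearest-neighbor pairs, any pair $(a, b)$ minimizing (5.4) extends to a homogeneous or chessboard configuration that minimizes every bond simultaneously. The function names and parameter values are hypothetical.

```python
from itertools import combinations_with_replacement

def pair_energy(qa, qb, U, W, mu, nu):
    """Pair interaction (5.4) for occupation numbers qa, qb in {0, 1, 2}."""
    on_site = U / (2 * nu) * ((qa == 2) + (qb == 2))
    return on_site + W * qa * qb - mu / (2 * nu) * (qa + qb)

def ground_pairs(U, W, mu, nu, tol=1e-12):
    """Unordered occupation pairs (a, b), a <= b, minimizing the bond energy.

    (a, a) labels a homogeneous domain M_a; (a, b) with a < b labels the
    chessboard domain M_(a,b). Several pairs are returned on coexistence lines.
    """
    pairs = list(combinations_with_replacement(range(3), 2))
    energies = {p: pair_energy(p[0], p[1], U, W, mu, nu) for p in pairs}
    e_min = min(energies.values())
    return [p for p in pairs if energies[p] - e_min <= tol]

# Repulsive case on the symmetry line (5.6): mu = U/2 + 2*nu*W.
print(ground_pairs(U=0.0, W=1.0, mu=4.0, nu=2))
```

Scanning `ground_pairs` over a grid of $(U, \mu)$ values reproduces a picture of the zero temperature domains; it says nothing about the spin degeneracy within each domain.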
[Figure 3. Panel (a): axes $U/\nu|W|$ and $\mu/\nu|W|$, domains $M^{\beta,t}_0$, $M^{\beta,t}_1$, $M^{\beta,t}_2$. Panel (b): axes $U/\nu W$ and $\mu/\nu W$, domains $M^{\beta,t}_0$, $M^{\beta,t}_1$, $M^{\beta,t}_2$, $M^{\beta,t}_{\rm cb}$.]

Fig. 3. Phase diagrams of the extended Hubbard model at intermediate temperature and with small hopping, (a) when $W < 0$ and (b) when $W > 0$. Bold lines denote first-order phase transitions. White is the region $P_K$ that resists rigorous investigations, where second-order transitions are expected
In the case $W < 0$, all three domains survive at low temperature and with $t \neq 0$; a first-order phase transition occurs when crossing the border between any two domains. The point ($\frac{\mu}{\nu W} = 2$, $\frac{U}{\nu W} = 1$) belongs to $M^{\beta,t}_1$: this phase has residual entropy (it also has more quantum fluctuations, although this has much less effect). The Gibbs state corresponding to the domain $M^{\beta,t}_1$ is thermodynamically stable and exponentially clustering. The restriction to intermediate temperatures ($\beta t \leq \varepsilon$) is important, because, for $\nu \geq 3$, a phase transition is expected when the temperature decreases, leading to an antiferromagnetic phase that breaks both the translation symmetry and the spin rotation symmetry.

The phase diagram at finite $\beta$ and nonzero $t$ is especially interesting for $W > 0$. There are not six, but only four domains $M^{\beta,t}_0$, $M^{\beta,t}_1$, $M^{\beta,t}_2$ and $M^{\beta,t}_{\rm cb}$; see Fig. 3. Indeed, the three domains corresponding to chessboard phases have merged into a single domain (this was first understood and proven in [BJK] in the absence of hopping). The free energy is real analytic in the whole domain $M^{\beta,t}_{\rm cb}$. The transition between $M^{\beta,t}_2$ and $M^{\beta,t}_{\rm cb}$ is presumably second-order, but our results do not cover the intermediate region between these domains. The boundary between $M^{\beta,t}_{\rm cb}$ and $M^{\beta,t}_1$ contains a part where a first-order phase transition occurs that can be rigorously described. Crossing the boundary elsewhere presumably results in a second-order transition. Due to the thermal fluctuations, the segment from (2,2) to (2,4) belongs to $M^{\beta,t}_1$. Our results for this model are summarized in the next two theorems.
Theorem 5.1 (Hubbard model with attractive n.n. interactions). Let $\nu \geq 2$. There exist constants $\beta_0 < \infty$ and $\varepsilon_0 > 0$ (depending on $\nu$) such that the phase diagram $(U, \mu)$ for $\beta|W| \geq \beta_0$ and $\beta t \leq \varepsilon_0$ is regular; domains $M^{\beta,t}_a$, $a \in \{0, 1, 2\}$, satisfy $\lim_{\beta\to\infty} \lim_{t\to 0} M^{\beta,t}_a = M_a$. If $(U, \mu)$ belongs to a unique $M^{\beta,t}_a$, there is a unique Gibbs state. Furthermore, the density of the system is close to $a$: $|\langle n_x \rangle - a| \leq \varepsilon(\beta, t)$ for all $x$, where $\varepsilon(\beta, t)$ can be made arbitrarily small by taking $\beta$ large and $t$ small.

In order to describe the situation $W > 0$ we first introduce the region of the phase diagram $P_K$ where we have no results. Let
\[ L = \Big[ \big( M_{(0,2)} \cup M_{(1,2)} \cup M_{(0,1)} \big) \cap \big( M_0 \cup M_1 \cup M_2 \big) \Big] \setminus \big( M_{(0,2)} \cap M_1 \big), \tag{5.7} \]
and for $K > 0$,
\[ P_K = \bigcup_{(U,\mu) \in L} B_K(U, \mu), \tag{5.8} \]
where $B_K(U, \mu)$ is the open ball of radius $K$ centered on $(U, \mu)$. We restrict our considerations to the complement of $P_K$.

Theorem 5.2 (Hubbard model with n.n. repulsions). Let $\nu \geq 2$ and $K > 0$. There exist constants $\beta_0 < \infty$ and $\varepsilon_0 > 0$ (depending on $\nu$ and $K$) such that if $\beta_0 \leq \beta W < \infty$ and $\beta t \leq \varepsilon_0$, we have the decomposition
\[ P_K^c = M^{\beta,t}_0 \cup M^{\beta,t}_1 \cup M^{\beta,t}_2 \cup M^{\beta,t}_{\rm cb}, \]
and
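(i) $M^{\beta,t}_0 \subset M_0$, $M^{\beta,t}_2 \subset M_2$, $M^{\beta,t}_1$ ($\not\subset M_1$) are domains with a unique Gibbs state. Densities are close to 0, 2, 1 respectively in the sense
\[ \langle n_x \rangle \leq \varepsilon(\beta, t) \ \text{in } M^{\beta,t}_0, \qquad \langle n_x \rangle \geq 2 - \varepsilon(\beta, t) \ \text{in } M^{\beta,t}_2, \qquad |\langle n_x \rangle - 1| \leq \varepsilon(\beta, t) \ \text{in } M^{\beta,t}_1, \]
with $\varepsilon(\beta, t)$ arbitrarily close to 0 if $\beta$ is large and $t$ small.

(ii) $M^{\beta,t}_{\rm cb} \subset M_{(0,2)} \cup M_{(1,2)} \cup M_{(0,1)}$ is a domain with two extremal Gibbs states of the chessboard type. The free energy is a real analytic function of $\beta$ and $\mu$ in the domain
\[ \{ (\beta, \mu) : \beta_0/W \leq \beta \leq \varepsilon_0/t \text{ and } (U, \mu) \in M^{\beta,t}_{\rm cb} \}. \]

(iii) $M^{\beta,t}_{\rm cb} \cap M^{\beta,t}_1$ is a line of first-order phase transition, with exactly three extremal states.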
Remarks. The proofs of Theorems 5.1 and 5.2 use Theorem 4.3. But using Theorem 4.1, one could establish stability of domains $M_0$, $M_2$, $M_{(0,2)}$ for all $\beta|W| \geq \beta_0$, without the restriction that the temperature be not too small. Another possible improvement, for $U, W > 0$, would use Theorem 4.4 to replace the condition $\beta t \leq \varepsilon_0$ by $\beta t^2/U \leq \varepsilon_0$. The latter clearly allows lower temperatures.⁵

6. Combined High-Low Temperature Expansions

In this section we simultaneously perform a low and a high temperature expansion. The temperature is low, in such a way that excitations above the low energy states ($\mathcal{H}^{\rm low}$) are rare. At the same time, the temperature is high relative to the quantum perturbations $K$ and $K'$. These expansions allow us to write the partition function as that of a contour model, which can be treated by the Pirogov–Sinai theory, see Sect. 3.2. We rewrite the quantum model as a contour model by making a mixed low and high temperature expansion (Sect. 6.1); we define suitable weights, so that the partition function takes the form required in Sect. 3.2. Section 6.2 is devoted to proving that the weights are small compared to their size. Finally, we explain in Sect. 6.3 how the other requirements of Sect. 3.2 are fulfilled.

6.1. Expansion of the partition function. Our intention is to expand in $K + K' + K''$; in order to simplify the notation, we introduce $\mathbf{B} = (B, i)$, $B \subset \mathbb{Z}^\nu$, $i = 1, 2, 3$, and we write $K_B = T_{\mathbf{B}}$ with $\mathbf{B} = (B, 1)$, $K'_B = T_{\mathbf{B}}$ with $\mathbf{B} = (B, 2)$, and $K''_B = T_{\mathbf{B}}$ with $\mathbf{B} = (B, 3)$. We refer to $\mathbf{B}$ as a transition.

⁵ Furthermore, the restriction to intermediate temperatures arises because of possible antiferromagnetism due to “quantum fluctuations” of strength $t^2/U$; it should be stable for $\beta t^2/U > \text{const}$; therefore this new condition is qualitatively correct.
Using Duhamel’s formula, we obtain Tr e−βH = Tr e−β
B⊂ VB
+
m
e−τ1
1 x∈ 0x (ωU (x) )
0 1 + r and $\varphi$ and $a$ will be chosen small depending on some “geometric” constants that will appear in the course of the proof. Thus, if $\omega \in U_\kappa$, then for all $k$ we have
\[ |\omega_k| \leq \nu a \kappa |k|^{-r} e^{-\kappa^{-1}|k|} \tag{15} \]
and 2
" ≤ ν 2 ϕκ α .
(16)
Let now
\[ \kappa(t) = \frac{\kappa}{1 + \eta\nu t \min(1, \kappa)}, \tag{17} \]
where η will be chosen suitably small below, and denote also by ω(t) the solution of (6) without the forcing term gk (t). Proposition 1. (a) Let ω(0) ∈ Uκ , then for all 0 ≤ t ≤ 1, ω(t) ∈ Uκ(t) . (b) Suppose ω(0) ∈ $ with ω(0) ≤ Dν. Then ω(1) ∈ Uκ for κ = C(D α + ν1 ). The point of part (a) of this proposition is that the domain of analyticity of the solution of the unforced Navier–Stokes equation increases with time and its L2 and κ-norms decrease with time. Part (b) says that, even if ω(0) is not analytic, but belongs to $, the solution after time 1 is analytic and its L2 and κ-norms are bounded in terms of the norm of the initial data in $. Our proof of Proposition 1 is inspired by [5] (see also [1]). For the proof we rewrite (6) (without forcing) in integral form t 2 2 ds e(s−t)νk (k × l)|l|−2 ωk−l (s)ωl (s) (18) ωk (t) = e−tνk ωk (0) + 0
l∈Z2 \{0,k}
and solve this in a suitable Banach space. Let Yκ be the Banach space equipped with the norm || · ||κ and Xκ,τ = {ω ∈ C 0 ([0, τ ], Yκ ) | |||ω||| ≡ sup |||ω(t)||κ(t) < ∞}. t∈[0,τ ]
We have the following existence lemma.
(19)
Ergodicity of 2D Navier–Stokes Equations with Random Forcing
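Lemma 1. Let $\omega(0) \in U_\kappa$; then the solution $\omega$ of Eq. (18) exists in the set $X_{\kappa,\tau}$, for $\tau \leq (C\nu^{\frac12}\kappa)^{-2}$. Moreover, $|||\omega||| \leq 2\nu a\kappa$.

Proof. Let
\[ \omega^0_k(t) = e^{-t\nu k^2} \omega_k(0), \tag{20} \]
and write (18) as a fixed point equation
\[ \omega(t) = \omega^0(t) + N(\omega)(t) \equiv F(\omega)(t) \tag{21} \]
with
\[ N_k(\omega)(t) \equiv \int_0^t ds\, e^{(s-t)\nu k^2} \sum_{l \in \mathbb{Z}^2 \setminus \{0,k\}} (k \times l)\, |l|^{-2}\, \omega_{k-l}(s)\, \omega_l(s). \tag{22} \]
We show that the map $F$ is a contraction in the ball
\[ B = \{ \omega \in X_{\kappa,\tau} \mid |||\omega - \omega^0||| \leq \nu a\kappa \}. \tag{23} \]
Let us first show that $F$ maps $B$ into itself. Obviously, $|||\omega^0||| \leq \nu a\kappa$, and if $\omega \in B$, then $|||\omega||| \leq 2\nu a\kappa$, which means
\[ |\omega_k(t)| \leq 2\nu a\kappa\, |k|^{-r} e^{-\kappa(t)^{-1}|k|}. \tag{24} \]
We must prove that $|||N(\omega)||| \leq \nu a\kappa$, i.e.
\[ |N_k(\omega)(t)| \leq \nu a\kappa\, |k|^{-r} e^{-\kappa(t)^{-1}|k|}, \tag{25} \]
for all $k \in \mathbb{Z}^2 \setminus 0$ (recall that $N_k = 0$ for $k = 0$) and for all $t \in [0, \tau]$. Inserting (24) and $|k \times l|\, |l|^{-2} \leq |k|\, |l|^{-1}$ in (22), we get:
\[ |N_k(\omega)(t)| \leq (2\nu a\kappa)^2 |k| \int_0^t ds\, e^{(s-t)\nu k^2} \sum_{l \in \mathbb{Z}^2 \setminus \{0,k\}} e^{-\kappa(s)^{-1}|k-l|}\, e^{-\kappa(s)^{-1}|l|}\, |k-l|^{-r}\, |l|^{-r-1}. \tag{26} \]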
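Writing (17) as $\kappa(t)^{-1} = \kappa^{-1} + \eta\nu t \min(1, \kappa)\kappa^{-1}$, we obtain, since $k \neq 0$ means $|k| \geq 1$, that
\[ \tfrac12 (s-t)\nu k^2 \leq \eta (s-t)\nu |k| \leq (\kappa(s)^{-1} - \kappa(t)^{-1})|k| \tag{27} \]
holds for $0 \leq s \leq t \leq 1$ and $\eta \leq \frac12$. Since $-|k-l| - |l| \leq -|k|$ and
\[ \sum_{l \in \mathbb{Z}^2 \setminus \{0,k\}} |k-l|^{-r}\, |l|^{-r-1} \leq C|k|^{-r}, \tag{28} \]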
J. Bricmont, A. Kupiainen, R.Lefevere
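(since $r > 1$), we get
\[ |N_k(\omega)(t)| \leq C(2\nu a\kappa)^2 |k| \int_0^t ds\, e^{\frac12 (s-t)\nu k^2}\, |k|^{-r} e^{-\kappa(t)^{-1}|k|} = C\nu (2a\kappa)^2\, 2|k|^{-1} \big(1 - e^{-\frac12 t\nu k^2}\big)\, |k|^{-r} e^{-\kappa(t)^{-1}|k|}. \tag{29} \]
Since $|k|^{-1}(1 - e^{-\frac12 t\nu k^2}) \leq (\nu t)^{\frac12}$, (25) follows for $\tau \leq (C\nu^{\frac12}\kappa)^{-2}$. The contractive property is proven similarly. Thus we obtain a unique solution of (21) in $B$, which satisfies (24), hence $\|\omega\|_{\kappa(t)} \leq 2\nu a\kappa$, $\forall t \leq (C\nu^{\frac12}\kappa)^{-2}$.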
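The chain of inequalities (27) used in the proof of Lemma 1 is elementary: the first step needs $\frac12|k| \geq \eta$, and the second uses $\min(1,\kappa)/\kappa \leq 1$ together with $s - t \leq 0$. It can also be checked numerically. The Python sketch below samples random parameters in the stated ranges; the sampling ranges, tolerance and function names are choices made for this illustration only.

```python
import random

def kappa_t(kappa, t, nu, eta):
    """kappa(t) as defined in (17)."""
    return kappa / (1.0 + eta * nu * t * min(1.0, kappa))

def inequality_27_holds(kappa, nu, eta, k_abs, s, t, tol=1e-9):
    """Check both inequalities of (27) for 0 <= s <= t <= 1 and |k| >= 1."""
    left = 0.5 * (s - t) * nu * k_abs ** 2
    middle = eta * (s - t) * nu * k_abs
    right = (1.0 / kappa_t(kappa, s, nu, eta)
             - 1.0 / kappa_t(kappa, t, nu, eta)) * k_abs
    return left <= middle + tol and middle <= right + tol

def random_check(n_samples=10_000, seed=0):
    """Return True if (27) holds for all sampled parameter values."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        t = rng.uniform(0.0, 1.0)
        s = rng.uniform(0.0, t)
        ok = inequality_27_holds(
            kappa=rng.uniform(0.1, 10.0),
            nu=rng.uniform(0.01, 1.0),
            eta=rng.uniform(0.01, 0.5),   # (27) requires eta <= 1/2
            k_abs=rng.uniform(1.0, 50.0), # (27) requires |k| >= 1
            s=s,
            t=t,
        )
        if not ok:
            return False
    return True

print(random_check())
```

Such a check is no substitute for the two-line argument above, but it guards against sign errors when the exponents in (26) are combined.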
Proof of Proposition 1. (a) It suffices to show that the solution constructed in Lemma 1 on the interval [0, τ ] satisfies the two bounds of the proposition there: the one on the κ(t) norm of the solution ω and the one on its enstrophy. This implies trivially that the solution can be extended to the whole interval [0, 1] and satisfies also there the bounds of the proposition. The bound on the enstrophy is easy to prove; as is well known, the enstrophy satisfies d "(t) = − νk2 |ωk |2 ≤ −ν"(t), (30) dt k=0
leading to "(t) ≤ "(0)e−νt . Since e−νt ≤ (1 + ηνt min(1, κ))− α for η small, we get 2
2
"(t) ≤ ν 2 ϕκ(t) α ,
(31)
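i.e. the claim of the proposition concerning the enstrophy. To prove the bound $\|\omega\|_{\kappa(t)} \leq \nu a\kappa(t)$, we consider separately the cases $\kappa < 1$ and $\kappa \geq 1$. If $\kappa < 1$, it is enough to use the bound (29) which, since $|k| \geq 1$, gives
\[ |N_k(\omega)(t)| \leq \nu a\kappa\lambda \big(1 - e^{-\frac12 t\nu k^2}\big) |k|^{-r} e^{-\kappa(t)^{-1}|k|}, \tag{32} \]
where $\lambda$ can be chosen arbitrarily small by decreasing $a$. Now inserting this bound and
\[ e^{-\frac12 t\nu k^2} |k|^{-r} e^{-\kappa^{-1}|k|} \leq |k|^{-r} e^{-\kappa(t)^{-1}|k|}, \]
which follows from (27) with $s = 0$, in (21), we conclude that
\[ |\omega_k(t)| \leq \nu a\kappa \Big( e^{-\frac12 t\nu k^2} + \lambda \big(1 - e^{-\frac12 t\nu k^2}\big) \Big) |k|^{-r} e^{-\kappa(t)^{-1}|k|} \tag{33} \]
\[ \leq \nu a\kappa(t)\, |k|^{-r} e^{-\kappa(t)^{-1}|k|}, \tag{34} \]
since
\[ e^{-\frac12 t\nu k^2} + \lambda \big(1 - e^{-\frac12 t\nu k^2}\big) \leq (1 + \eta\nu t \min(1, \kappa))^{-1}, \tag{35} \]
which holds, since $|k| \geq 1$, for $\lambda$ and $\eta$ small enough and $0 \leq t \leq 1$. Inequality (34) is the claim of part (a) of the proposition concerning the $\kappa(t)$ norm of the solution, namely $\|\omega(t)\|_{\kappa(t)} \leq \nu a\kappa(t)$.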
Turning to κ ≥ 1, fix a number 1 < β < β α
α−1 r
(recall that α > 1 + r) and consider
first 1 ≤ |k| ≤ κ . Using 1
2
|ωk (t)| ≤ (2"(t)) 2 , "(t) ≤ ν 2 ϕκ(t) α , we get immediately, 1
−1 |k|
|ωk (t)| ≤ (2ν 2 ϕ) 2 κ(t) α ≤ νaκ(t)|k|−r e−κ(t) 1
rβ
(36)
for ϕ small enough and t ≤ 1 since |k|r ≤ κ α ≤ κ 1− α and κ(t)−1 |k| ≤ 1. To conclude the proof, it suffices to show |Nk (ω)(t)| ≤ νaκλ(1 − e
1
− 21 tνk2
−1 |k|
)|k|−r e−κ(t)
,
(37)
β
for |k| ≥ κ α since we may then proceed as in (33–35). β Consider first the case κ α < |k| ≤ κ. We bound |k × l||l|−2 ≤ |k||l|−1 and split the sum in (22) into |ωk−l (s)||ωl (s)||k||l|−1 ≡ 91 + 92 . + (38) 0=|l|≤ |k| 2
l=k,|l|> |k| 2
In the first sum, we bound, using Lemma 1, |ωk−l (s)| ≤ 2νaκ|k − l|−r ≤ Cνaκ|k|−r , since |k − l| ≥ 21 |k|. Then Schwartz’ inequality and (31) yield
1
|ωl (s)||l|−1 ≤ (2ν 2 ϕ) 2 κ α 1
0=|l|≤ |k| 2
|l|−2
21
1
1
1
≤ Cνϕ 2 κ α (log |k|) 2 .
(39)
0 =|l|≤ |k| 2
Combining these two bounds, we get 1
1
91 ≤ Cν 2 |k|ϕ 2 κ α (log |k|) 2 aκ|k|−r . 1
(40)
For the second sum, we use |ωl (s)| ≤ 2νaκ|l|−r (coming from Lemma 1 again), together with (31) and Schwartz’ inequality to bound it by 1 2
1 α
92 ≤ Cν |k|ϕ κ aκ 2
|l|
−2(r+1)
21
1
≤ Cν 2 |k|ϕ 2 κ α aκ|k|−r . 1
(41)
l=k,|l|> |k| 2
Inserting (38), (40) and (41) into Nk (ω)(t) and performing the integral over time, we get the bound 1 1 1 2 |Nk (ω)(t)| ≤ Cνϕ 2 κ α (log |k|) 2 |k|−1 1 − e−tνk aκ|k|−r (42) 1 1 β 1 −1 − 1 tνk2 aκ|k|−r e−κ(t) |k| ≤ Cνe1+ην ϕ 2 κ α κ − α (log κ) 2 1 − e 2
where we used $\kappa^{\frac{\beta}{\alpha}} < |k| \leq \kappa$ and
\[ 1 \leq e^{-\kappa(t)^{-1}|k|}\, e^{1 + \eta\nu}, \]
which holds since $|k| \leq \kappa$ and, see (17), $\kappa(t)^{-1} \leq \kappa^{-1}(1 + \eta\nu)$ if $0 \leq t \leq 1$. Thus we obtain (37) for $\varphi$ small enough, because $\beta > 1$ and $\log x \leq \frac{1}{\epsilon} x^{\epsilon}$ for $x > 1$ and $\epsilon > 0$. Finally, in the case $|k| > \kappa$ the bound (29) yields immediately (37) for $a$ small. This finishes the proof of part (a) of Proposition 1.

For part (b), we can proceed as in Lemma 1, but replace $a\kappa$ by $D$ and, in the definition (19), $\kappa(t)$ by $\frac{2}{\nu t}$. The inequality (27) is then replaced by $\frac12 (s-t)\nu k^2 \leq \frac12 (s-t)\nu |k|$ and the proof goes as before to the conclusion
\[ \|\omega(t)\|_{\frac{2}{\nu t}} \leq 2\nu D \tag{43} \]
1
for t ≤ (Cν 2 D)−2 ≡ τ . We want to rewrite this bound in the form ω(τ )κ ≤ νaκ, for −1 a suitable κ. If τ ≤ 1, i.e. D ≥ C −1 ν 2 we have ντ2 = CD 2 and we can write (43) as ω(τ )ρ ≤ 2νD where ρ = CD 2 (remember that C is allowed to vary). Choosing now κ = Cρ = C D 2 , we obtain (since D is bounded away from zero, hence D ≤ CD 2 ) that ω(τ )κ ≤ νaκ. Applying now part (a) yields the same claim for ω(1). For τ > 1, i.e. D < C −1 ν
− 21
we get ω(1) 2 ≤ 2νD ≤ C, given the bound on D; so, ω(1)κ ≤ νaκ, ν
C ν.
with κ = Finally, for the enstrophy, we have, by (31), 2
"(t) ≤ "(0) ≤ ν 2 CD 2 ≤ ν 2 ϕκ α if we take κ > CD α . Since α > 2, taking κ = C D α + ν1 gives an upper bound covering all cases, i.e. ω(1) ∈ Uκ .
3. Probabilistic Estimates

We define a region $U \equiv U_{\nu^{-p}}$, where $p > \frac72 \alpha$, in which the solution of (6) is confined with high probability. Let us divide the transition probability into a likely and an unlikely part:
\[ P(\omega, E) = Q(\omega, E) + R(\omega, E), \tag{44} \]
where
\[ Q(\omega, E) = \chi_U(\omega)\, P(\omega, U \cap E). \tag{45} \]
The following proposition about the dynamics in $U$ and the improbability of excursions outside $U$ will play a central role in the proof of our uniqueness result.¹

Proposition 2. (a) There exist constants $c, C < \infty$, $c' > 0$, such that for all $\omega \in U$, $E \in \mathcal{B}$,
\[ |Q^t(\omega, E) - Q^t(0, E)| \leq 4 e^{-mt}, \tag{46} \]
where $m \geq \exp\big(-C\nu^{-3} (\log \nu^{-1})^{c}\big)$ and $t \leq c' m^{-1} \nu^{-q}$, with $q \equiv \frac{2p}{\alpha} - 4 > 3$.

¹ Here and below, the kernel $AB(\omega, E)$ is defined in the obvious way by $\int A(\omega, d\omega')\, B(\omega', E)$.
(b) There exist $\zeta < 1$, $c > 0$, $C < \infty$, such that $\forall \kappa \geq 0$, for all $\omega \in U_\kappa$ and for $\kappa' \geq \zeta\kappa$,
\[ P(\omega, U^c_{\kappa'}) \leq C \exp\big(-c\nu^4 \kappa^{\frac{2}{\alpha}}\big). \tag{47} \]
The proof of (46) is based on a standard argument for exponential convergence of Markov chains (given in Doob [2]), and the idea is fairly simple. If $Q$ were a genuine transition probability, it would be enough, in order to prove the proposition, to show that $Q$ has good mixing properties. The precise properties are stated in the lemmas below. First, Lemma 2 says that, for any point in $U$, there is a nonzero probability to go in a finite time to a smaller region $\bar{U} \subset U$ determined by the covariance of the noise and thus by $\kappa_\gamma$:
\[ \bar{U} \equiv U_{2\kappa_\gamma + \rho\nu}, \tag{48} \]
where ρ > 0 will be chosen below (sufficiently small)2 . This is an easy consequence of Proposition 1. On each time interval, the solution increases its domain of analyticity (which is determined by κ, i.e. κ decreases); then, if the “kicks” of the noise are sufficiently small (but not too small, so that this event is not too unprobable), the solution reaches U¯ in a finite time (of order ν −1 log ν −1 ). Secondly, we show in Lemma 3 that, in the region U¯ , the stochastic dynamics is sufficiently mixing; this is again due to the fact that the deterministic Navier–Stokes evolution increases the domain of analyticity of the solution. Third, the fact that Q is not a bona fide transition probability is what limits the proposition to finite times. For longer times, we will need to have some estimate on the probability of escaping the region U , which follows from part (b) of the Proposition. Indeed, the latter implies, using (44, 45) and taking κ = κ = ν −p that, for all ω ∈ U , −q
P (ω, U c ) = R(ω, $) ≤ e−cν , with q =
2p α
(49)
− 4 > 3 (remember that p > 27 α and that ν is small).
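Lemma 2. There exist constants $c, C < \infty$, such that $\forall \omega \in U$,
\[ P^{T_1}(\omega, \bar{U}) \geq \exp\big(-C\nu^{-3} (\log \nu^{-1})^{c}\big), \tag{50} \]
with $T_1 = C\nu^{-1} \log \nu^{-1}$.

Lemma 3. There exist constants $c, C < \infty$, such that, $\forall \omega, \omega' \in \bar{U}$, $\forall B \subset \bar{U}$,
\[ P(\omega, B) + P(\omega', \bar{U} \setminus B) \geq \exp\big(-C\nu^{-2} (\log \nu^{-1})^{c}\big). \tag{51} \]

Lemmas 2 and 3 imply that there exist
\[ \delta(\nu) \equiv \exp\big(-C\nu^{-3} (\log \nu^{-1})^{c}\big) \quad \text{and} \quad T \equiv T(\nu) = C\nu^{-1} \log \nu^{-1} \]
with $C, c < \infty$, such that $\forall \omega, \omega' \in U$ and $\forall B \subset \bar{U}$,
\[ P^T(\omega, B) + P^T(\omega', \bar{U} \setminus B) \geq \delta(\nu), \tag{52} \]
which implies in turn, since $\bar{U} \subset U$, that $\forall \omega, \omega' \in U$ and $\forall B \subset U$,
\[ P^T(\omega, B) + P^T(\omega', U \setminus B) \geq \delta(\nu). \tag{53} \]
This is the main inequality that we shall use now.

² Similar ideas were used by Kuksin and Shirikyan in [4].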
3.1. Proof of Proposition 2. We start with the proof of part (a), where we shall use (49), which is a consequence of part (b), to be proven independently below. To get (46) we follow, with slight modifications, an argument in [2, pp. 197–198]. Let Q(t, E) = inf Qt (ω, E), Q(t, E) = sup Qt (ω, E). ω∈U
ω∈U
Fix ω, ω ∈ U and consider the function defined on subsets E ⊂ $: ψω,ω (E) = QT (ω, E) − QT (ω , E). Let S + be the set such that ψω,ω (E) ≥ 0 for E ⊂ S + and ψω,ω (E) ≤ 0 for E ⊂ U \S + ≡ S − (S ± depend on ω, ω , but we suppress this dependence). Observe that writing, see (44), P = Q + R, and using (49), we have, for any ω ∈ U , E ⊂ $, that |P T (ω, E) − QT (ω, E)| = |
T −1
Qt RP T −t−1 (ω, E)| ≤ T e−cν
t=0
−q
1
≡ 2 ;(ν).
(54)
Then, |ψω,ω (S + ) + ψω,ω (S − )| = |QT (ω, $) − QT (ω , $)| ≤ ;(ν),
(55)
since P T (ω, $) = P T (ω , $) = 1. Moreover, using (54, 53), ψω,ω (S + ) = QT (ω, S + ) − QT (ω , S + ) ≤ 1 − (P T (ω, S − ) + P T (ω , S + )) + ;(ν) ≤ 1 − δ(ν) + ;(ν).
(56)
Thus, Q(t + T , E) − Q(t + T , E) = sup
ω,ω
= sup
ω,ω
(QT (ω, dω ) − QT (ω , dω ))Qt (ω , E) ψω,ω (dω )Qt (ω , E)
≤ sup (ψω,ω (S + )Q(t, E) + ψω,ω (S − )Q(t, E)) ω,ω
≤ (1 − δ(ν) + ;(ν))(Q(t, E) − Q(t, E)) + ;(ν),
where, to get the last inequality, we write $\psi_{\omega,\omega'}(S^-) = -\psi_{\omega,\omega'}(S^+) + \psi_{\omega,\omega'}(S^+) + \psi_{\omega,\omega'}(S^-)$, bound $\psi_{\omega,\omega'}(S^+)$ by (56), $|\psi_{\omega,\omega'}(S^+) + \psi_{\omega,\omega'}(S^-)|$ by (55) and use $\underline{Q}(t, E) \leq 1$. We conclude that, for $\epsilon(\nu) < \delta(\nu)$,
\[ |Q^{nT}(\omega, E) - Q^{nT}(0, E)| \leq \overline{Q}(nT, E) - \underline{Q}(nT, E) \leq 2 (1 - \delta(\nu) + \epsilon(\nu))^{n-1} + \frac{\epsilon(\nu)}{\delta(\nu) - \epsilon(\nu)}. \]
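The last bound follows by iterating the one-step estimate: writing $a_n$ for the difference between the supremum and the infimum of $Q^{nT}(\cdot, E)$ over $U$, the estimate reads $a_{n+1} \leq (1 - \delta + \epsilon)\, a_n + \epsilon$, and summing the geometric series gives the stated right-hand side. The Python sketch below iterates the worst case of this recursion and compares it with the closed-form bound; the starting value $a_1 = 1$ and the numerical values of $\delta$ and $\epsilon$ are assumptions made for illustration (in the text both quantities are exponentially small in inverse powers of $\nu$).

```python
def iterate_gap(delta, eps, n_steps, a1=1.0):
    """Worst case of a_{n+1} = (1 - delta + eps) * a_n + eps, starting from a_1."""
    rho = 1.0 - delta + eps
    gaps = [a1]
    for _ in range(n_steps - 1):
        gaps.append(rho * gaps[-1] + eps)
    return gaps

def closed_form_bound(delta, eps, n):
    """Bound 2 (1 - delta + eps)^(n-1) + eps / (delta - eps), valid for eps < delta."""
    return 2.0 * (1.0 - delta + eps) ** (n - 1) + eps / (delta - eps)

# Hypothetical values with eps < delta.
delta, eps = 0.1, 0.01
gaps = iterate_gap(delta, eps, n_steps=200)
print(all(g <= closed_form_bound(delta, eps, n)
          for n, g in enumerate(gaps, start=1)))
```

The iteration converges to the fixed point $\epsilon/(\delta - \epsilon)$, which is why the estimate is only useful on the finite time scale stated in part (a) of the proposition.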
Recall that δ(ν) = exp(−Cν −3 (log ν −1 )c ) and that T = T (ν) = Cν −1 log ν −1 , hence, −q see (54), ;(ν) ≤ e−c ν ; so, since we assume ν to be small and q > 3, part (a) of the proposition follows. Let us now prove part (b). It suffices to assume κ ≥ Cν −2α since the LHS of (47) is bounded by one. Using (10) for n = 0, we have ω(1)κ ≤ F (ω(0))κ + g(1)κ
(57)
and 1
"(1) = 2 ω(1)2L2 ≤ F (ω(0))2L2 + g(1)2L2 ,
(58)
and, by Proposition 1, we know that, if ω(0) ∈ Uκ , F (ω(0)) ∈ Uκ(1) , with κ(1) = (recall that κ ≥ Cν −2α ≥ 1 here). Then, letting ζ = (1 + ην) κ ≥ ζ κ, that F (ω(0))κ ≤ (1 + ην)
− 21
− 21
κ 1+ην
< 1, we get, for
νaκ
and 2
F (ω(0))2L2 ≤ (1 + ην)− α ν 2 ϕκ α . 2
Now, assume that g(1) satisfies, ∀k, 1
|gk (1)| ≤ ;1 ν 2 κ α b
− 21
1
|k|(γk ) 2 ,
(59)
with ;1 small (depending on η but independent of ν and κ). Then, we get that ω(1) ∈ Uκ , using the upper bound in (7) and the fact that, for ν small, κ ≥ ζ κ ≥ Cζ ν −2α is much larger than 2κγ . Hence, the probability in (47) is bounded by the probability that at least one of the inequalities in (59) is violated. Since the gk ’s are Gaussian random variables with covariance γk , this event has a probability less than:
2 1 − C exp −C;12 ν 4 (κ ) α b−1 |k|2 . (60) 1− k 2
For ν 4 (κ ) α large, each is small. The factor |k|2 controls exponential2 in the product 2 2 4 −1 2 the sum over k of exp −C;1 ν (κ ) α b |k| , a sum which is small for ν 4 (κ ) α large 2 enough, and therefore (60) is bounded from above by 1 − 1 − exp −cν 4 κ α , i.e. by the RHS of (47).
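3.2. Proofs of Lemmas 2 and 3.
Proof of Lemma 2. Let $\omega(0) \in U_\kappa \subseteq U$ and consider (10) for $n = 0$. Choose $g(1)$ such that, $\forall k$,
\[ |g_k(1)| \le \epsilon_1\nu^2 e^{\epsilon_2\nu|k|}\, b^{-1/2} (\gamma_k)^{1/2}. \tag{61} \]
From Proposition 1, (61) and (7), one obtains
\[ |\omega_k(1)| \le \nu a\,\kappa(1)\,|k|^{-r} e^{-\kappa(1)^{-1}|k|} + \epsilon_1\nu^2 e^{\epsilon_2\nu|k|} e^{-\kappa_\gamma^{-1}|k|/2}. \tag{62} \]
Then, from (17), one gets, for any $\rho > 0$, by choosing $\epsilon_1, \epsilon_2$ small enough, that $\exists\,\bar\lambda < e^{-c\nu} < 1$ such that, $\forall k$,
\[ |\omega_k(1)| \le \nu a\,\kappa'\,|k|^{-r} e^{-(\kappa')^{-1}|k|} \tag{63} \]
with $\kappa' = \max(\bar\lambda\kappa,\, 2\kappa_\gamma + \rho\nu)$. From (31), (58), (59), one also easily obtains that $\mathcal{E}(1) \le \nu^2\varphi(\kappa')^{2/\alpha}$, and thus that $\omega(1) \in U_{\kappa'}$. Thus,
\[ P(\omega, U_{\kappa'}) \ge \prod_k P\big(|g_k(1)| \le \epsilon_1\nu^2 e^{\epsilon_2\nu|k|} b^{-1/2}(\gamma_k)^{1/2}\big). \]
Now, since the $g_k$'s are Gaussian random variables with covariance $\gamma_k$, we have that
\[ P\big(|g_k(1)| \le \epsilon_1\nu^2 e^{\epsilon_2\nu|k|} b^{-1/2}(\gamma_k)^{1/2}\big) \ge 1 - \exp\big(-c\,\epsilon_1^2\nu^4 e^{2\epsilon_2\nu|k|} b^{-1}\big) \tag{64} \]
for $|k| \ge C\nu^{-1}\log\nu^{-1}$, if $C$ is chosen so that $b\nu^{-4} \le e^{\epsilon_2\nu|k|}$ (note that the product over such $k$'s of the RHS of (64) is strictly positive uniformly in $\nu$), while for $|k| \le C\nu^{-1}\log\nu^{-1}$,
\[ P\big(|g_k(1)| \le \epsilon_1\nu^2 e^{\epsilon_2\nu|k|} b^{-1/2}(\gamma_k)^{1/2}\big) \ge P\big(|g_k(1)| \le \epsilon_1\nu^2 b^{-1/2}(\gamma_k)^{1/2}\big) \ge C\nu^4, \tag{65} \]
which follows from the fact that the $g_k$'s are (complex) Gaussian random variables with covariance $\gamma_k$ and therefore that
\[ P\big(|g_k(1)| \le \epsilon_1\nu^2 b^{-1/2}(\gamma_k)^{1/2}\big) \ge C\int_0^{\epsilon_k} \frac{2r\,dr}{\gamma_k} \ge C\nu^4 \]
with $\epsilon_k \equiv \epsilon_1\nu^2 b^{-1/2}(\gamma_k)^{1/2}$. The bound (65) readily implies that there are constants $C, c_1 < \infty$ such that $\forall\omega \in U_\kappa$,
\[ P(\omega, U_{\kappa'}) \ge \exp\big(-C\nu^{-2}(\log\nu^{-1})^{c_1}\big). \tag{66} \]
Since $U = U_{\nu^{-p}}$, and since $\kappa$ decreases by a factor $\bar\lambda < 1$ at each step, as long as $\bar\lambda\kappa \ge 2\kappa_\gamma + \rho\nu$, one may iterate the above argument and reach $\bar U = U_{2\kappa_\gamma+\rho\nu}$, see (48), in a time less than $T_1(\nu) = C\nu^{-1}\log\nu^{-1}$, $\forall\omega(0) \in U$. Therefore, the claim of the lemma follows (with a different $C$ than in (66), and with $c = c_1 + 1$).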
Proof of Lemma 3. Let $\omega_0 \in \bar U$ and $B \subset \bar U$. Since the $g_k$'s are Gaussian random variables with covariance $\gamma_k$, we have
\[ P(\omega_0, B) = \int_B \prod_k \exp\Big(-\frac{|\omega_k - F_k(\omega_0)|^2}{\gamma_k}\Big) \frac{d\bar\omega_k \wedge d\omega_k}{2\pi i\,\gamma_k}, \tag{67} \]
where we recall from (10) that $F(\omega_0)$ denotes the value at time 1 of the solution of (18) with initial condition $\omega_0$. In view of Proposition 1, and the definition of $\bar U = U_{2\kappa_\gamma+\rho\nu}$, we can bound, $\forall\omega_0 \in \bar U$,
\[ |F_k(\omega_0)| \le C\nu a\, e^{-\frac{|k|}{2\kappa_\gamma}} e^{-\rho\nu|k|} \equiv \epsilon_k, \tag{68} \]
provided we choose $\rho$ sufficiently small so that
\[ \frac{1 + \eta\nu\min(1,\, 2\kappa_\gamma+\rho\nu)}{2\kappa_\gamma+\rho\nu} \ge \frac{1}{2\kappa_\gamma} + \rho\nu. \]
Thus, we can bound $|\omega_k - F_k(\omega_0)|^2 \le (|\omega_k| + \epsilon_k)^2$; this gives a lower bound on (67) independent of $\omega_0$ and we may use this bound on each term of the LHS of (51), with $\omega_0 = \omega, \omega'$. We get that the LHS of (51) is bounded from below by
\[ \int_{\bar U} \prod_k \exp\Big(-\frac{(|\omega_k| + \epsilon_k)^2}{\gamma_k}\Big) \frac{d\bar\omega_k \wedge d\omega_k}{2\pi i\,\gamma_k}. \tag{69} \]
In order to estimate the latter integral, observe that, by (7), $\omega \in \bar U = U_{2\kappa_\gamma+\rho\nu}$ provided that, $\forall k$,
\[ |\omega_k| \le \epsilon_1\nu\, e^{\epsilon_2\nu|k|}\, b^{-1/2}(\gamma_k)^{1/2} \equiv \bar\epsilon_k, \tag{70} \]
if we take $\epsilon_1, \epsilon_2$ small enough. Thus, by restricting the domain of integration, we get a lower bound on (69):
\[ \prod_k \int_0^{\bar\epsilon_k} \frac{2r\,dr}{\gamma_k} \exp\Big(-\frac{(r+\epsilon_k)^2}{\gamma_k}\Big). \tag{71} \]
Each factor is bounded from below by
\[ 1 - C\frac{\epsilon_k^2}{\gamma_k} - \exp\big(-c\,\epsilon_1^2\nu^2 e^{2\epsilon_2\nu|k|} b^{-1}\big) \tag{72} \]
for $|k| \ge C\nu^{-1}\log\nu^{-1}$. To bound the product over those $k$'s of the factors given by (72) by a strictly positive constant, independent of $\nu$, observe first that the last term is summable over $k$, for $|k| \ge C\nu^{-1}\log\nu^{-1}$, and that the sum is small. Moreover, using (68) and the lower bound in (7), we get
\[ \frac{\epsilon_k^2}{\gamma_k} \le Cab\,\nu^2 \exp(-2\rho\nu|k|). \tag{73} \]
Then, (73) is also summable over $k$, for $|k| \ge C\nu^{-1}\log\nu^{-1}$, and the sum is also small. Finally, for $|k| \le C\nu^{-1}\log\nu^{-1}$, each factor in (71) is bounded from below, using (7) and (70), by $C\int_0^{\bar\epsilon_k} \frac{2r\,dr}{\gamma_k} \ge C\nu^2$, which yields the claim of the lemma.
4. Proof of the Theorem
We deduce the theorem from Proposition 2. Let us choose a number $\bar\epsilon$ small enough and a time $\tau$ large enough, i.e. $\tau = c\,m^{-1}\nu^{-q}$, so that (46) is less than $\bar\epsilon/2$. Then, for $T$ an integer multiple of $\tau$, write
\[ P^T(\omega,E) = (P^\tau)^{T/\tau}(\omega,E). \tag{74} \]
Next, let
\[ \pi(\omega,E) \equiv \pi(E) = Q^\tau(0,E) \tag{75} \]
and
\[ \bar R(\omega,E) = P^\tau(\omega,E) - Q^\tau(\omega,E), \qquad R'(\omega,E) = Q^\tau(\omega,E) - Q^\tau(0,E) = Q^\tau(\omega,E) - \pi(E), \tag{76} \]
\[ r(\omega,E) = \bar R(\omega,E) + R'(\omega,E). \tag{77} \]
One may then write
\[ P^T(\omega,E) = (\pi + r)^{T/\tau}(\omega,E). \tag{78} \]
We can expand $(\pi+r)^{T/\tau}$ in powers of $r$:
\[ (\pi+r)^{T/\tau} = \sum \pi^{k_1} r^{k_2}\cdots\pi^{k_l} + r^{T/\tau} \equiv \Sigma_1 + r^{T/\tau}, \tag{79} \]
where the sum $\Sigma_1$ runs over $k_i \ge 0$, $\sum_i k_i = T/\tau$, and collects all the terms with at least one factor $\pi$. Now observe that, using (78) with $T = \tau$, we have that $r(\omega,\Omega) = P^\tau(\omega,\Omega) - \pi(\Omega) = 1 - \pi(\Omega) = 1 - Q^\tau(0,U)$ is independent of $\omega$; hence, by (75),
\[ (r\pi)(\omega,d\omega_2) = \int r(\omega,d\omega_1)\,\pi(\omega_1,d\omega_2) = r(\omega,\Omega)\,\pi(d\omega_2) = (1 - Q^\tau(0,U))\,\pi(d\omega_2) \tag{80} \]
is also independent of $\omega$. From this, we conclude that, since there is at least one factor $\pi$ in each term of $\Sigma_1$, $\Sigma_1(\omega,E) = \Sigma_1(\omega',E)$, $\forall\omega,\omega'$, $\forall E$, and, using (78), that
\[ |P^T(\omega,E) - P^T(\omega',E)| \le |r^{T/\tau}(\omega,E)| + |r^{T/\tau}(\omega',E)|, \tag{81} \]
where the RHS is controlled by:
Lemma 4. For $T$ an integer multiple of $\tau$,
\[ |r^{T/\tau}(\omega,E)| \le C(\omega)\,e^{-mT}, \tag{82} \]
where $m = m(\nu) \ge \exp(-C\nu^{-3}(\log\nu^{-1})^{c})$ and where $C(\omega) \le C\big(\frac{\|\omega\|+1}{\nu}\big)^{c}$, for $C, c < \infty$.
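The algebra behind (79)–(81) can be illustrated on a finite state space. The sketch below is an assumption-laden toy model, not the construction of the paper: the transition matrix `P` is made up, and the $\omega$-independent kernel $\pi$ is taken to be the column-wise minimum of `P`, whereas the paper uses $Q^\tau(0,\cdot)$. What it shares with the argument above is that every row of `R = P - PI` has the same total mass, so every product containing at least one factor `PI` has identical rows, and two rows of a power of `P` differ only through the same power of `R`.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(a, n):
    """n-th power of a square matrix, for n >= 1."""
    result = a
    for _ in range(n - 1):
        result = matmul(result, a)
    return result


# Hypothetical 3-state transition matrix (each row sums to 1).
P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.3, 0.3, 0.4]]

# Toy stand-in for pi: the same sub-probability row q for every starting state.
q = [min(P[i][j] for i in range(3)) for j in range(3)]
PI = [q[:] for _ in range(3)]

# Remainder r = P - pi; every row of R carries the same mass 1 - sum(q).
R = [[P[i][j] - q[j] for j in range(3)] for i in range(3)]


def row_gap(m, i, j):
    """Difference between rows i and j of the matrix m."""
    return [m[i][c] - m[j][c] for c in range(len(m))]
```

In this toy model the rows of the n-th power of `R` sum to `(1 - sum(q)) ** n`, which is the finite-state counterpart of the exponential decay claimed in (82).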
To conclude the proof of (12), it is enough to show that $\lim_{T\to\infty} P^T(0,E) = \mu(E)$ exists. And, to prove that, we write, for $T' > T$,
\[ |P^{T'}(0,E) - P^{T}(0,E)| \le \int P^{T'-T}(0,d\omega)\,|P^T(\omega,E) - P^T(0,E)|. \tag{83} \]
We may write the integral as an integral over $U$ plus a sum over $\kappa \in \mathbb{N}$, $\kappa \ge \nu^{-p}$, of integrals over $U_{\kappa+1}\setminus U_\kappa$ and, combining (81), Lemma 4, $C(\omega) \le C\big(\frac{\|\omega\|+1}{\nu}\big)^{c} \le C\big(\frac{\|\omega\|_\kappa+1}{\nu}\big)^{c}$, $\forall\kappa$, and (47) (which implies a similar bound for $P^{T'-T}(0,U_\kappa^c)$), we bound (83) by
\[ Ce^{-mT}\Big(\nu^{-(p+1)c} + \sum_{\kappa\in\mathbb{N},\,\kappa\ge\nu^{-p}} \kappa^{c}\exp\big(-c\,\nu^4\kappa^{2/\alpha}\big)\Big) \le C(\nu)\,e^{-mT}, \tag{84} \]
which proves the existence of $\lim_{T\to\infty} P^T(0,E)$. Finally, the bound (11) follows from (47), for $\kappa$ large, and we bound the LHS of (11) by 1 for $\kappa$ small.
Proof of Lemma 4. Define, for $n \ge 0$, $U(n) \equiv U_{\zeta^{-n}\nu^{-p}}$ (so that $U = U(0)$) with $\zeta < 1$ as in Proposition 2, and define $V(n)$ by $V(n) = U(n)\setminus U(n-1)$, for $n \ge 1$, and $V(0) = U(0)$. Next, let
\[ \rho_{mn} \equiv \sup_{\omega\in V(m)} |r(\omega, V(n))|, \]
where $r$ is defined in (77). Observe that we have the following bounds on $\rho_{mn}$:
\[ \rho_{00} \le \bar\epsilon, \qquad \rho_{mn} \le \exp\big(-c\,\xi^{n}\nu^{-q}\big) \ \ (n \ge m), \qquad \rho_{mn} \le 4 \ \ (n < m), \tag{85} \]
where $\xi \equiv \zeta^{-2/\alpha} > 1$. To check this, use, for $m, n = 0$, (54) to bound $\bar R$ and (46) to bound $R'$. For the second inequality, $n \ne 0$, only $P$ contributes to $r$ and the bound follows immediately from (47), with $\kappa' = \zeta^{1-n}\nu^{-p}$ (remember that $q = \frac{2p}{\alpha} - 4$). Finally, for $n < m$, we use the fact that $r$ is the sum of four terms, each less than 1. Write now
\[ r^N(\omega,E) = \int \prod_{i=0}^{N-1} r(\omega_i, d\omega_{i+1})\,\chi(\omega_N \in E) \tag{86} \]
with $\omega_0 = \omega$, and insert a decomposition of the identity for each $i = 1,\dots,N$,
\[ 1 = \sum_{n_i\ge 0} \chi(\omega_i \in V(n_i)). \]
This leads to
\[ \sup_{\omega\in U(n_0)} \sup_E |r^N(\omega,E)| \le \sum_{(n_i)_{i=1}^N,\ n_i\ge 0}\ \prod_{i=0}^{N-1} \rho_{n_i n_{i+1}} \equiv \sum_n \rho^N_{n_0 n}. \tag{87} \]
Note that the RHS describes "random walks" on nonnegative integers, where only steps strictly down ($n_{i+1} < n_i$) are not suppressed. To estimate it, write $\rho = d + u$, where $d$ is the "down" part of $\rho$, i.e. the matrix whose elements are given by $\rho_{mn}$ with $n < m$ and zero otherwise, and $u$ is the rest ("up"). We shall first prove the simple estimates
\[ \sum_n (d^k)_{mn} \le C^m, \qquad k \le m, \tag{88} \]
and zero otherwise (where the restriction $k \le m$ comes from the fact that the indices of $d_{mn}$ must be positive and, whenever $d_{mn} \ne 0$, must satisfy $n < m$), and
\[ \sum_n (u^k d^l)_{mn} \le (C\bar\epsilon)^{k+l}. \tag{89} \]
Indeed, (88) is estimated by
\[ \sum_{n_1\dots n_k} d_{mn_1}\cdots d_{n_{k-1}n_k} \le 4^k \sum_{\sum_i p_i \le m} 1 = 4^k \sum_{l=k}^{m} \binom{l-1}{k-1} \le 4^k\,2^m, \]
with $p_i = n_{i-1} - n_i \ge 1$, $n_0 = m$, yielding the claim since $k \le m$. To prove (89), write
\[ \sum_n (u^k d^l)_{mn} = \sum_{m' \ge l}\ \sum_n (u^k)_{mm'}\,(d^l)_{m'n}, \]
where the constraint $l \le m'$ in the sum comes from the second inequality in (88), and note that $(u^k)_{mm'}$ is bounded by $\bar\epsilon^{\,k}$ if $m = m' = 0$ and by $\exp(-c\,\xi^{m'}\nu^{-q})(C\bar\epsilon)^{k-1}$ otherwise (both bounds following from (85) and the fact that, by definition, $u_{mn} = 0$ unless $n \ge m$). The bound (89) follows by combining these with (88): since $l \le m'$, we can use the factor $\exp(-c\,\xi^{m'}\nu^{-q})$ and $\nu$ small to obtain the factor $\bar\epsilon^{\,l+1}$ in (89) ($\bar\epsilon$ is a fixed small number). Inserting (88), (89) into
\[ \rho^N = (u+d)^N = \sum d^{l_0} u^{k_1} d^{l_1}\cdots u^{k_s} d^{l_s}, \]
where $l_i \ge 0$, $l_i > 0$ for $i \ne 0, s$, we obtain the bound
\[ \sum_n \rho^N_{n_0 n} \le C^N\,\bar\epsilon^{\,N-n_0}, \]
where $\bar\epsilon^{-n_0}$ comes from the fact that we have no $\bar\epsilon$ bound on $d^{l_0}$, but we can use $l_0 \le n_0$. This proves the lemma, if we choose in (82)
\[ m = -\frac{1}{\tau}\log C\bar\epsilon = -\big(c\,m^{-1}\nu^{-q}\big)^{-1}\log C\bar\epsilon \ge \exp\big(-C\nu^{-3}(\log\nu^{-1})^{c}\big) \]
(given our choice of $\tau$ at the beginning of this section, our bound on $m$ in Proposition 2, and changing the constants), and, for $\omega \in U(n_0)$, choose $C(\omega) = (C\bar\epsilon)^{-n_0}$; indeed, let, for $\omega \in U_\kappa$, $n_0$ be the smallest integer such that $\kappa \le \zeta^{-n_0}\nu^{-p}$; then, $n_0 \le C\log\kappa$ and $C(\omega) \le C\kappa^{c}$. Moreover, from part (b) of Proposition 1, we know that, $\forall\omega \in \Omega$, $F(\omega) \in U_\kappa$, with $\kappa \le C\big(\big(\frac{\|\omega\|}{\nu}\big)^{\alpha} + \frac{1}{\nu}\big)$; so, altogether, $C(\omega) \le C\big(\frac{\|\omega\|+1}{\nu}\big)^{c}$.
References
1. Bricmont, J., Kupiainen, A., Lefevere, R.: Probabilistic estimates for the two dimensional stochastic Navier–Stokes equations. J. Stat. Phys. 100, 743–756 (2000)
2. Doob, J.L.: Stochastic Processes. New York: John Wiley, 1953
3. Flandoli, F., Maslowski, B.: Ergodicity of the 2-D Navier–Stokes equation under random perturbations. Commun. Math. Phys. 171, 119–141 (1995)
4. Kuksin, S., Shirikyan, A.: Stochastic dissipative PDE's and Gibbs measures. Commun. Math. Phys. 213, 291–330 (2000)
5. Mattingly, J.C., Sinai, Y.: An elementary proof of the existence and uniqueness theorem for the Navier–Stokes equations. Preprint
6. Mattingly, J.C.: Ergodicity of 2D Navier–Stokes equations with random forcing and large viscosity. Commun. Math. Phys. 206, 273–288 (1999)
Communicated by G. Gallavotti
Commun. Math. Phys. 224, 83 – 106 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Gibbsian Dynamics and Ergodicity for the Stochastically Forced Navier–Stokes Equation
Weinan E (1), J. C. Mattingly (2), Ya. Sinai (3)
(1) Department of Mathematics and Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ 08544, USA and School of Mathematics, Peking University, Beijing, P.R. China
(2) Department of Mathematics, Stanford University, Stanford, CA 94305, USA
(3) Department of Mathematics, Princeton University, Princeton, NJ 08544, USA and Landau Institute of Theoretical Physics, Moscow, Russia
Received: 21 November 2000 / Accepted: 9 December 2000
Dedicated to Joel L. Lebowitz, on the occasion of his 70th birthday
Abstract: We study stationary measures for the two-dimensional Navier–Stokes equation with periodic boundary condition and random forcing. We prove uniqueness of the stationary measure under the condition that all "determining modes" are forced. The main idea behind the proof is to study the Gibbsian dynamics of the low modes obtained by representing the high modes as functionals of the time-history of the low modes.
1. Introduction and Main Results
We are interested in determining conditions sufficient to ensure that the stochastically forced Navier–Stokes equation (SNS) possesses a unique stationary measure, or equivalently, that the dynamics is ergodic in the phase space. Our main result is that this holds if all the "determining modes" are forced. To prove this, we show that the dynamics of the Navier–Stokes equation can be reduced to the dynamics of the low modes, the so-called determining modes, with memory. This is the stochastic analog of results proved for the deterministic case by Foias et al. [FMRT]. We will work with the periodic boundary condition, but in principle our techniques should also apply to the more physical no-slip boundary condition.
Consider the two-dimensional Navier–Stokes equation with stochastic forcing:
\[ \frac{\partial u}{\partial t} + (u\cdot\nabla)u + \nabla p - \nu\Delta u = \frac{\partial W(x,t)}{\partial t}, \qquad \nabla\cdot u = 0. \tag{1} \]
For simplicity of presentation we will take $W$ to be of the form
\[ W(x,t) = \sum_{|k|\le N} \sigma_k\,w_k(t,\omega)\,e_k(x), \tag{2} \]
where the $w_k$'s are standard i.i.d. complex-valued Wiener processes satisfying $w_{-k}(t) = \overline{w_k(t)}$, and $\sigma_k \in \mathbb{C}$, with $|\sigma_k| > 0$ and $\sigma_{-k} = \overline{\sigma_k}$, are the amplitudes of
the forcing, and $\big\{e_k(x) = \binom{-ik_2}{ik_1}\frac{e^{ik\cdot x}}{|k|},\ k \in \mathbb{Z}^2\big\}$ is the basis of the space of $L^2$ divergence-free, mean zero vector fields on $\mathbb{T}^2$, the two-dimensional torus. Our techniques apply to more general cases when the higher modes are also forced, as long as $|\sigma_k|$ decays sufficiently fast as $|k| \to \infty$, or to forcing which is not diagonal in Fourier space. But we will restrict ourselves to the form in (2) for clarity.
Define $B(u,v) = -P_{div}(u\cdot\nabla)v$ and $\Lambda^2 u = -P_{div}\Delta u$, where $P_{div}$ is the $L^2$ projection operator onto the space of divergence-free vector fields. Let $\sigma_{max}^2 = \max\{|\sigma_k|^2 : |k| \le N\}$, $\mathcal{E}_0 = \sum_{|k|\le N} |\sigma_k|^2$ and $\mathcal{E}_1 = \sum_{|k|\le N} |k|^2|\sigma_k|^2$. Writing $u(x) = \sum_k u_k e_k(x)$, we will define $H^\alpha = \big\{u = (u_k)_{k\in\mathbb{Z}^2},\ u_0 = 0,\ \sum_k |k|^{2\alpha}|u_k|^2 < \infty\big\}$ and $L^2 = H^0$.
We will work on a probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, \mathbb{P}, \theta_t)$. We associate $\Omega$ with the canonical space generated by all $d\omega_k(t)$. $\mathcal{F}$ and $\mathcal{F}_t$ are respectively the associated global $\sigma$-algebra and filtration generated by $W(t)$. Lastly, $\theta_t$ is the shift on $\Omega$ defined by $\theta_t\,d\omega_k(s) = d\omega_k(s+t)$. Notice that $\theta_t$ is an ergodic group of measure-preserving transformations with respect to $\mathbb{P}$. Expectations with respect to $\mathbb{P}$ will be denoted by $\mathbb{E}$.
Projecting (1) onto $L^2$, we obtain the following system of Itô stochastic equations
\[ du(x,t) + \nu\Lambda^2 u(x,t)\,dt = B(u,u)\,dt + dW(x,t). \tag{3} \]
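The quantities $\mathcal{E}_0$ and $\mathcal{E}_1$ attached to the forcing (2) are plain sums over the forced wave vectors. The following is a minimal sketch of that bookkeeping; it assumes that the mode $k = 0$ is excluded (the fields are mean zero) and that `sigma` is a user-supplied function returning the amplitude $\sigma_k$. The helper names are illustrative only.

```python
def forced_modes(N):
    """Nonzero integer wave vectors k = (k1, k2) with |k| <= N."""
    r = int(N)
    return [(k1, k2)
            for k1 in range(-r, r + 1)
            for k2 in range(-r, r + 1)
            if (k1, k2) != (0, 0) and k1 * k1 + k2 * k2 <= N * N]


def energy_input_rates(sigma, N):
    """Return (E0, E1) with E0 = sum |sigma_k|^2 and E1 = sum |k|^2 |sigma_k|^2.

    sigma is a hypothetical callable mapping a wave vector (k1, k2) to the
    complex amplitude sigma_k; both sums run over the forced modes |k| <= N.
    """
    modes = forced_modes(N)
    e0 = sum(abs(sigma(k)) ** 2 for k in modes)
    e1 = sum((k[0] ** 2 + k[1] ** 2) * abs(sigma(k)) ** 2 for k in modes)
    return e0, e1
```

With unit amplitudes and `N = 2` there are twelve forced modes, which gives `E0 = 12` and `E1 = 28`.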
It can be shown that (3) generates a continuous Markovian stochastic semi-flow on $L^2$ defined by
\[ \varphi^\omega_{s,t} u_0 = u(t,\omega; s, u_0). \tag{4} \]
When $s = 0$, we simply write $\varphi^\omega_t$ (see [Fla94, DPZ96]). We will take the state space of (3) to be $L^2$ equipped with the Borel $\sigma$-algebra. A measure $\mu(du)$ on $L^2$ is stationary for the stochastic flow (3) if for all bounded continuous functions $F$ on $L^2$ and $t > 0$,
\[ \int_{L^2} F(u)\,\mu(du) = \int_{L^2} \mathbb{E}F(\varphi^\omega_t u)\,\mu(du). \tag{5} \]
Our main result is:
Theorem 1. There exists some absolute constant $C$ such that if $N^2 \ge C\frac{\mathcal{E}_0}{\nu^3}$ then (3) has a unique stationary measure on $L^2$.
The existence of at least one stationary measure was proved in [Fla94] and [VF88]. The proof proceeds by establishing compactness for a family of empirical measures. The limiting points of these empirical measures are the stationary measures. Uniqueness has been proved under restrictive assumptions when ALL modes are forced. Flandoli and Maslowski [FM95] proved that if the $\sigma_k$'s decay algebraically, i.e. if the forcing is sufficiently rough spatially, then the system has a unique stationary measure. These results were extended and refined in [Fer97]. In [Mat99], it was proven that if the viscosity is large enough, the contraction induced by the Laplacian dominates and the system possesses a trivial random attractor and hence a unique stationary measure. We do not address convergence to the stationary measure. This and the coupling construction used to prove convergence are discussed in [Mat00]. Recently Kuksin and Shirikyan [KS] proved uniqueness of the stationary measure when the Navier–Stokes equation is perturbed by a bounded degenerate kicked noise. Results similar to ours have also been obtained independently by Bricmont et al. [BKL].
Our main strategy is to reduce the dynamics of the Navier–Stokes equation to the dynamics of a finite dimensional set of low modes with memory. The reduced dynamics is no longer Markovian, but rather Gibbsian (see §2, §4). The finite dimensional Gibbsian dynamics has a non-degenerate noise, and has a unique stationary measure if the memory is short ranged.
Before proceeding further, let us observe that any given stationary measure $\mu$ can be extended to a measure on the path space, denoted by $\mu_p$, where $p$ stands for path or past. Consider the example of the path space $C((-\infty,0], L^2)$. Let $A$ be a cylinder set of the type: for some $t_0, t_1, \dots, t_n$ with $t_0 < t_1 < t_2 < \cdots < t_n \le 0$,
\[ A = \big\{u(s) \in C((-\infty,0], L^2),\ u(t_i) \in A_i,\ i = 0,\dots,n\big\}, \tag{6} \]
where the $A_i$'s are Borel sets of $L^2$. Corresponding to $A$, let $B \subset \Omega\times L^2$,
\[ B = \big\{(u,\omega),\ u \in A_0,\ \varphi^\omega_{t_0,t_i}u \in A_i,\ i = 1,\dots,n\big\}. \tag{7} \]
We will define
\[ \mu_p(A) = (\mathbb{P}\times\mu)(B), \tag{8} \]
where $(\mathbb{P}\times\mu)$ is the product measure on $\Omega\times L^2$. Clearly $\mu_p$ is consistent on cylinder sets and can be extended to the natural $\sigma$-algebra using the Kolmogorov extension theorem. The natural $\sigma$-algebra is the one generated by the cylinder sets.
The dynamics of the stochastic semi-flow $\{\varphi^\omega_t\}$ can be trivially extended to return a function from $C((-\infty,t], L^2)$, given an initial function from $C((-\infty,0], L^2)$. One simply flows forward with $\varphi$ from the initial condition at time 0. To avoid confusion, we will call this map $\psi^\omega_t$. Symbolically, if $u(\cdot) \in C((-\infty,0], L^2)$, then $(\psi^\omega_t u)(s) = \varphi^\omega_s u(0)$ for $s \in [0,t]$ and $(\psi^\omega_t u)(s) = u(s)$ for $s \le 0$. If we define the shift on trajectories by $(\theta_t v)(s) = v(s+t)$, we can define a dynamics on $C((-\infty,0], L^2)$ by $\theta_t\psi^\omega_t$. In other words, $\theta_t\psi^\omega_t u$ takes a trajectory $u$ from $C((-\infty,0], L^2)$, extends it $t$ units of time by flowing forward and then shifts the entire resulting trajectory back $t$ units of time so it again lives on $C((-\infty,0], L^2)$. It is easy to check directly that if $\mu$ is invariant then $\mu_p$ is invariant in the sense that
\[ \int_{C((-\infty,0],L^2)} F(u)\,d\mu_p(u) = \mathbb{E}\int_{C((-\infty,0],L^2)} F(\theta_t\psi^\omega_t u)\,d\mu_p(u) \tag{9} \]
for all bounded functions $F$ on $C((-\infty,0], L^2)$, and $t \ge 0$.
Assume that $\mu$ and $\nu$ are two stationary measures for the stochastic flow (3), and $\mu_p$ and $\nu_p$ are respectively their induced measures on the path space $C((-\infty,0], L^2)$. It is obvious that $\mu_p = \nu_p$ implies $\mu = \nu$.
2. Reduction to the Gibbsian Dynamics
Define two subspaces of $L^2$:
\[ L^2_\ell = \mathrm{span}\{e_k,\ |k| \le N\}, \qquad L^2_h = \mathrm{span}\{e_k,\ |k| > N\}. \tag{10} \]
We will call $L^2_\ell$ the set of low modes and $L^2_h$ the set of high modes. Obviously $L^2 = L^2_\ell \oplus L^2_h$. Denote by $P_\ell$ and $P_h$ the projections onto the low and high mode spaces.
Since we are concerned with stationary measures of (3), we are interested in (statistically) stationary solutions of (3) that exist for time from $-\infty$ to $+\infty$. We will show in this section that for such solutions, the high modes are completely determined by the past history of the low modes. For this purpose, we write $u(t) = (\ell(t), h(t))$ and
\[ d\ell(t) = \big[-\nu\Lambda^2\ell + P_\ell B(\ell,\ell)\big]dt + \big[P_\ell B(\ell,h) + P_\ell B(h,\ell) + P_\ell B(h,h)\big]dt + dW(t), \tag{11} \]
\[ \frac{dh(t)}{dt} = -\nu\Lambda^2 h + P_h B(h,h) + P_h B(\ell,h) + P_h B(h,\ell) + P_h B(\ell,\ell). \tag{12} \]
Define the set of "nice pasts" $U \subset C((-\infty,0], L^2)$ to consist of all $v : (-\infty,0] \to L^2$ such that:
i) $v(t)$ is in $H^2$ for all $t \le 0$.
ii) The energy averages correctly. More precisely,
\[ \lim_{t\to-\infty} \frac{1}{|t|}\int_t^0 |\Lambda v(s)|^2_{L^2}\,ds = \frac{\mathcal{E}_0}{2\nu}. \]
iii) The energy fluctuations are typical. More precisely, there exists a $T = T(v)$ such that
\[ |v(t)|^2_{L^2} \le \mathcal{E}_0 + \max(|t|, T)^{2/3} \quad\text{for } t \le 0. \]
The following lemma shows that $U$ contains almost all of the trajectories defined on the whole time interval.
Lemma 2.1. Let $\mu_p$ be the measure on $C((-\infty,0], L^2)$ induced by a stationary measure $\mu$ for (3). Then $\mu_p(U) = 1$.
Proof of Lemma 2.1. It is proved in [Mat98] or [Fer97] that with probability one, a solution to (3) is in $H^2$ for all $t$. The fact that the last condition is satisfied by a set of full measure is proved in Lemma B.3. All that remains to show is ii). From Lemma B.2, $|\Lambda v|^2_{L^2}$ is in $L^1(\mu)$ for any stationary measure $\mu$ and $\int |\Lambda v|^2_{L^2}\,d\mu = \frac{\mathcal{E}_0}{2\nu}$. Since the measure is invariant under shifts back in time and each ergodic component has the same average enstrophy, the ergodic theorem implies that for $\mu_p$-almost every trajectory the time average converges to the average of $|\Lambda u|^2_{L^2}$ against $\mu$.
Given an arbitrary continuous function of time $\ell(t)$ on $L^2_\ell$, we can view (12) as a closed equation with some exogenous forcing $\ell(t)$. By $\Phi_{s,t}(\ell, h_0)$, we mean the solution to (12) at time $t$ given the initial condition $h_0$ at time $s$ and the "forcing" $\ell$. Denote by $\mathcal{P}$ the set of all $\ell \in C((-\infty,0], L^2_\ell)$ such that the following two conditions hold. First, $\ell = P_\ell u$ for some $u = (\ell,h) \in U$. Second, $h(t) = \Phi_{s,t}(\ell, h(s))$ for any $s < t \le 0$, where $h$ is the matching high mode, so that $(\ell,h) \in U$. That is to say, $h(t)$ solves (12) with low modes $\ell(t)$ and the total solution $(\ell,h)$ is in our space of "nice pasts". In light of Lemma 2.1 the set $\mathcal{P}$ is not empty. We now will show that this $h$ is uniquely determined by $\ell$.
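Lemma 2.2. There exists an absolute positive constant $C$ such that if $N^2 > C\frac{\mathcal{E}_0}{\nu^3}$ then the following holds: If there exist two solutions $u_1(t) = (\ell(t), h_1(t))$, $u_2(t) = (\ell(t), h_2(t))$ corresponding to some (possibly different) realizations of the forcing and such that $u_1, u_2 \in U$, then $u_1 = u_2$, i.e. $h_1 = h_2$. Furthermore, given a solution $u(t) = (\ell(t), h(t)) \in U$, any $h_0 \in L^2_h$, and $t \le 0$, the following limit exists:
\[ \lim_{t_0\to-\infty} \Phi_{t_0,t}(\ell, h_0) = h^* \]
and $h^* = h(t)$.
Proof of Lemma 2.2. We begin with the first claim. Denote $\rho(t) = h_1(t) - h_2(t)$. From (12) we have
\[ \frac{d\rho}{dt} = -\nu\Lambda^2\rho + P_h B(h_1,h_1) - P_h B(h_2,h_2) + P_h B(\ell,\rho) + P_h B(\rho,\ell) \]
\[ = -\nu\Lambda^2\rho + P_h B(\ell+h_1,\rho) + P_h B(\rho,\ell+h_2) \tag{13} \]
\[ = -\nu\Lambda^2\rho + P_h B(u_1,\rho) + P_h B(\rho,u_2). \]
Taking the inner product with $\rho$, using the fact that $\langle P_h B(u_1,\rho),\rho\rangle_{L^2} = 0$, gives
\[ \frac12\frac{d}{dt}|\rho|^2_{L^2} = -\nu|\Lambda\rho|^2_{L^2} + \langle P_h B(\rho,u_2),\rho\rangle_{L^2}. \]
Since
\[ |\langle P_h B(\rho,u_2),\rho\rangle_{L^2}| \le \hat C\,|\rho|_{L^2}|\Lambda\rho|_{L^2}|\Lambda u_2|_{L^2} \le \frac{\nu}{2}|\Lambda\rho|^2_{L^2} + \frac{\hat C^2}{2\nu}|\rho|^2_{L^2}|\Lambda u_2|^2_{L^2}, \]
we get
\[ \frac12\frac{d}{dt}|\rho|^2_{L^2} \le -\frac{\nu}{2}|\Lambda\rho|^2_{L^2} + \frac{\hat C^2}{2\nu}|\Lambda u_2|^2_{L^2}|\rho|^2_{L^2}. \]
Since $\rho$ only contains modes with $|k| > N$, the Poincaré inequality implies
\[ \frac{d}{dt}|\rho|^2_{L^2} \le \Big[-\nu N^2 + \frac{\hat C^2}{\nu}|\Lambda u_2|^2_{L^2}\Big]|\rho|^2_{L^2}. \]
Therefore we have, for $t_0 < t < 0$,
\[ |\rho(t)|^2_{L^2} \le |\rho(t_0)|^2_{L^2}\exp\Big(-\nu N^2(t-t_0) + \frac{\hat C^2}{\nu}\int_{t_0}^{t}|\Lambda u_2(s)|^2_{L^2}\,ds\Big). \tag{14} \]
From assumption ii) on functions in $U$, we know that $\lim_{t_0\to-\infty}\frac{1}{t-t_0}\int_{t_0}^{t}|\Lambda u_2(s)|^2_{L^2}\,ds = \frac{\mathcal{E}_0}{2\nu}$. Hence for $t_0 < T_1$, where $T_1$ depends on $t$ and $u_2$, we have
\[ -\nu N^2(t-t_0) + \frac{\hat C^2}{\nu}\int_{t_0}^{t}|\Lambda u_2(s)|^2_{L^2}\,ds \le -\frac{\gamma}{2}(t-t_0), \]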
where $\gamma = \nu N^2 - \frac{\hat C^2\mathcal{E}_0}{2\nu^2}$. If we set $C = \frac{\hat C^2}{2}$, then our assumption on $N$ implies $\gamma > 0$. Now using the last property of paths in $U$ we have for any $t_0 \le T_2$,
\[ |\rho(t)|^2_{L^2} \le |\rho(t_0)|^2_{L^2}\exp\Big(-\frac{\gamma}{2}(t-t_0)\Big) \le 2\big[\mathcal{E}_0 + |t_0|^{2/3}\big]\exp\Big(-\frac{\gamma}{2}(t-t_0)\Big) \to 0 \]
as $t_0 \to -\infty$, where $T_2$ is some finite constant depending on $u_1$ and $u_2$. This completes the proof of the first part of Lemma 2.2.
To see the second part, observe that (14) only required control of $\int_{t_0}^{t}|\Lambda u(s)|^2_{L^2}\,ds$ for one of the two solutions. If we proceed as before, letting the given solution $u(t)$ play the role of $u_2$ and the solution to (12) starting from $h_0$ play the role of $u_1$, then we obtain the estimate
\[ |\rho(t)|^2_{L^2} \le |h(t_0) - h_0|^2_{L^2}\exp\Big(-\nu N^2(t-t_0) + \frac{\hat C^2}{\nu}\int_{t_0}^{t}|\Lambda u(s)|^2_{L^2}\,ds\Big). \tag{15} \]
Since $u(t) = (\ell(t), h(t)) \in U$, the same reasoning as before shows that $\rho(t)$ goes to zero as $t_0 \to -\infty$. Hence the limit exists and equals $h(t)$.
In fact the splitting into high and low modes can be accomplished even when all of the modes are forced. One replaces (12) with an Itô stochastic differential equation. This causes little complication as (13) remains a standard PDE. See [Mat98]. The ideas in this section are related to the ideas of Lyapunov–Schmidt reduction and those around center and inertial manifolds. See [EFNT94] for a discussion and other references.
From now on we assume that $N$ satisfies
\[ N^2 > C\frac{\mathcal{E}_0}{\nu^3}, \tag{16} \]
where $C$ is the constant from Lemma 2.2. Because of Lemma 2.2, we can define a map $\Phi_0$ which reconstructs the high modes at time zero from a given low mode trajectory stretching from zero back to $-\infty$. Before making this more precise, let us fix some notation. In general, we will use $\ell(t)$ to refer to the value of the low modes at time $t$ and will use $L^t$ to mean the entire trajectory from $-\infty$ to $t$. Hence $\ell(t) \in L^2_\ell$ and $L^t \in C((-\infty,t], L^2_\ell)$ and $\ell(s) = L^t(s)$ for $s \le t$. In this notation $h(0) = \Phi_0(L^0)$, where $L^0$ is some "low mode past" in $\mathcal{P}$, which is the projection of $U$ to the low modes. By $\Phi_s(L^t, h(0))$ with $s \le t$, we mean the solution to (12) at time $s$ with initial condition $h(0)$ and low mode forcing $L^t$. Of course $\Phi_s(L^t, h(0))$ only depends on the information in $L^t$ between 0 and $s$. We can extend the definition of $\Phi$ beyond time zero by defining $\Phi_t(L^t) = \Phi_t(L^t, h(0))$, where $h(0) = \Phi_0(L^0)$. Given the initial low mode past $L^0 \in \mathcal{P}$, we can solve for the future of $\ell$ using
\[ d\ell(t) = \big[-\nu\Lambda^2\ell(t) + P_\ell B(\ell(t),\ell(t)) + G(\ell(t), \Phi_t(L^t))\big]dt + dW(t), \tag{17} \]
where
\[ G(\ell,h) = P_\ell B(\ell,h) + P_\ell B(h,\ell) + P_\ell B(h,h). \tag{18} \]
Thus we have a closed formulation of the dynamics on the low modes given an initial past $L^0 \in \mathcal{P}$. We write $L^t = S^\omega_t L^0$. We reiterate that $L^t$ is the entire trajectory from time $t$ back to $-\infty$, whereas $\ell(t)$ is simply the value of the low modes at time $t$.
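The role of the cutoff condition (16) in the proof of Lemma 2.2 is purely arithmetic and can be sketched as follows. The code assumes the rate $\gamma = \nu N^2 - \hat C^2\mathcal{E}_0/(2\nu^2)$ from that proof; the value of $\hat C$ is not specified in the text, so `c_hat` is a hypothetical input, and the function names are illustrative.

```python
import math


def decay_rate(nu, n_cut, e0, c_hat):
    """gamma = nu * N^2 - c_hat^2 * E0 / (2 * nu^2), as in the proof of Lemma 2.2."""
    return nu * n_cut * n_cut - c_hat ** 2 * e0 / (2.0 * nu ** 2)


def smallest_cutoff(nu, e0, c_hat):
    """Smallest integer N with N^2 > (c_hat^2 / 2) * E0 / nu^3, i.e. with gamma > 0."""
    threshold = c_hat ** 2 * e0 / (2.0 * nu ** 3)
    return math.floor(math.sqrt(threshold)) + 1


def high_mode_gap_bound(rho0_sq, gamma, elapsed):
    """Bound rho0_sq * exp(-gamma * elapsed / 2) on the squared high-mode gap.

    This mirrors the estimate after (14), which holds once the time average
    of the enstrophy is close enough to E0 / (2 * nu).
    """
    return rho0_sq * math.exp(-0.5 * gamma * elapsed)
```

For instance, with `nu = 0.5`, `e0 = 1.0` and `c_hat = 2.0` the threshold on $N^2$ is 16, so `N = 4` gives $\gamma = 0$ and `N = 5` is the first cutoff with a positive rate.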
Except for the fact that the $G$-term in (17) is history-dependent, (17) has the form of a standard finite dimensional stochastic ODE with non-degenerate forcing, which of course has a unique stationary measure. Our task is reduced to showing that the memory effect in (17) is not strong enough to spoil ergodicity. Existence of the solution for memory-dependent stochastic ODEs of the type (17) was considered in the work of Ito et al. [IN].
3. Uniqueness of the Invariant Measure
3.1. Proof of the Main Theorem. Given any "nice low mode past" $L \in \mathcal{P}$, we can reconstruct the "high modes" and hence define a closed dynamics on the paths of the low modes. However, this dynamics is no longer Markovian, which will produce difficulties.
Let $\mu$ be an ergodic stationary measure on $L^2$ and $\mu_p$ be its extension to the path space $C((-\infty,0], L^2)$. We will also consider the restriction of $\mu_p$ to $C((-\infty,0], L^2_\ell)$, still denoted by $\mu_p$. Lemma 2.1 says that $\mu_p(\mathcal{P}) = 1$. Given any $L^0 \in \mathcal{P}$, let $Q_t(L^0,\cdot)$ be the measure induced on $C([0,t], L^2_\ell)$ by the dynamics starting from $L^0$. In other words, $Q_t(L^0,\cdot)$ is the distribution of $S^\omega_t L^0$ viewed as a random variable taking values in $C([0,t], L^2_\ell)$. Similarly let $Q_\infty(L^0,\cdot)$ be the distribution induced on $C([0,\infty), L^2_\ell)$ starting from $L^0$.
Consider the stochastic process defined by $\theta_t S^\omega_t L^0$, where $L^0$ is a random variable on $\mathcal{P}$ distributed according to the invariant measure $\mu_p$. For $t \ge 0$ it is a random process with values in $\mathcal{P}$. This is clear as all of the defining properties of $U$ are asymptotic in $t$, and hence the addition of a segment of finite length does not destroy them. Since $\mu_p$ is invariant with respect to the dynamics, $\theta_t S^\omega_t L^0$ is a stationary random process. Hence with probability one there exist time averages along trajectories $\theta_t S^\omega_t L^0$.
Take any bounded measurable functional $F : C((-\infty,0], L^2_\ell) \to \mathbb{R}$ such that $F(L^0)$, $L^0 \in C((-\infty,0], L^2_\ell)$, depends only on a finite range of $L^0$. Let
\[ \bar F = \int F(L)\,d\mu_p(L). \tag{19} \]
Theorem 2. The SNS equation (1) has a unique stationary measure.
The proof of Theorem 2 is based on the following two lemmas whose proofs will be given later.
Lemma 3.1. Let $L^0_1$ and $L^0_2$ be two initial pasts in $\mathcal{P}$, such that $\ell_1(0) = \ell_2(0)$. Then $Q_\infty(L^0_1,\cdot)$ and $Q_\infty(L^0_2,\cdot)$ are equivalent.
Recall that $\ell(\tau)$ is the solution of (17) with initial condition $L$.
Lemma 3.2. For any past $L \in \mathcal{P}$ and any $t > 0$, the distribution of $\ell(t) \in L^2_\ell$ conditioned on starting from $L$ at time zero, denoted by $R_t(L,\cdot)$, satisfies the following: there exists a strictly positive function $f_{L,t} \in L^1(L^2_\ell)$ such that $dR_t(L,\cdot) \ge f_{L,t}(\cdot)\,dm(\cdot)$, where $m(\cdot)$ is the Lebesgue measure on $L^2_\ell$.
For any measure $\mu$ on $L^2$ let $P_\ell\mu$ denote its projection to a measure on the low modes $L^2_\ell$. Namely, $(P_\ell\mu)(B) = \mu(P_\ell^{-1}(B))$. Then we have the following direct consequence of Lemma 3.2.
Corollary 3.3. If $\mu$ is a stationary measure then $P_\ell\mu$ has a component which is equivalent to the Lebesgue measure.
Proof of Theorem 2. Assume that there are two different ergodic stationary measures on $L^2$ called $\mu_1$ and $\mu_2$. They must be mutually singular. One can find a functional $F$ such as above so that $\bar F_1 = \int F(L)\,d\mu_{p,1}(L) \ne \bar F_2 = \int F(L)\,d\mu_{p,2}(L)$. This assumption will lead to a contradiction. Let $\mu_{p,1}$ and $\mu_{p,2}$ be the extensions of these two measures onto the path space $\mathcal{P}$. Let $L^0_i$ be a random variable on $\mathcal{P}$ distributed as $\mu_{p,i}$. Since $\theta_t S^\omega_t L^0_i$ is stationary with respect to $\mu_{p,i}$, we can pick a set $\mathcal{P}_i$, of full $\mu_{p,i}$-measure, such that for all $L \in \mathcal{P}_i$ the limit
\[ \lim_{T\to\infty}\frac{1}{T}\int_0^T F(\theta_t S^\omega_t L)\,dt = \bar F_i \tag{20} \]
is well defined for $\mathbb{P}$-almost every $\omega$. For $\ell \in L^2_\ell$ define $\mathcal{P}_i(\ell) = \{L \in \mathcal{P}_i : L(0) = \ell\}$ and let $\mu_{p,i}(\,\cdot\,|\,\ell)$ be the conditional measure given that $L(0) = \ell$. By Fubini's theorem, we know that for $P_\ell\mu_i$-almost every $\ell \in L^2_\ell$ we have $\mu_{p,i}(\mathcal{P}_i(\ell)\,|\,\ell) = 1$. Hence we can find a set $A_i \subset L^2_\ell$ such that $\mu_{p,i}(\mathcal{P}_i(\ell)\,|\,\ell) = 1$ for all $\ell \in A_i$ and $P_\ell\mu_i(A_i) = 1$. Define $A = A_1 \cap A_2$. Corollary 3.3 implies that $P_\ell\mu_i(A) > 0$ for $i = 1, 2$. Hence there exists some $\ell_* \in A$. Since $\ell_* \in A_1 \cap A_2$, we know that $\mu_{p,i}(\mathcal{P}_i(\ell_*)\,|\,\ell_*) = 1$ for $i = 1, 2$. Thus there exist some $L_{*,1} \in \mathcal{P}_1(\ell_*)$ and $L_{*,2} \in \mathcal{P}_2(\ell_*)$. Notice that by construction $L_{*,1}(0) = \ell_* = L_{*,2}(0)$, and hence it follows from Lemma 3.1 that $Q_\infty(L_{*,1},\cdot)$ and $Q_\infty(L_{*,2},\cdot)$ are equivalent. Since $L_{*,i} \in \mathcal{P}_i(\ell_*)$, we know that we can pick $B_i \subset C([0,\infty), L^2_\ell)$ such that the time average of $F$ converges to $\bar F_i$ for all futures in $B_i$ and $Q_\infty(L_{*,i}, B_i) = 1$ for $i = 1, 2$. Since the $Q$'s are equivalent, $Q_\infty(L_{*,1}, B_1 \cap B_2) > 0$ and hence $B_1 \cap B_2$ is non-empty. This in turn implies that $\bar F_1 = \bar F_2$, which contradicts the assumption that they were not equal.
3.2. Proofs of the lemmas. We first prove Lemma 3.1. Fix $L^0_1$ and $L^0_2$. Most of our construction will depend explicitly on them. With probability one, we can extend each of the initial pasts into the infinite future by $L^s_i = S^\omega_s L^0_i$ and setting $\ell_i(s) = L^t_i(s)$ for $s \le t$. We can also reconstruct the entire solution by using $\Phi_t$ to obtain the high modes. Set $h_i(s) = \Phi_s(L^s_i)$ and $u_i(s) = (\ell_i(s), h_i(s))$. Fix a constant $C_0$ such that $|u_i(0)|^2_{L^2} \le C_0$. We begin by constructing a set of nice future paths which will contain most trajectories. For any positive $K$ we define
\[ A_i(K) = \Big\{f \in C([0,\infty), L^2_\ell) : |v(t)|^2_{L^2} + 2\nu\int_0^t |\Lambda v(s)|^2_{L^2}\,ds < C_0 + \mathcal{E}_0 t + K t^{4/5},\ \text{where } v(s) = f(s) + \Phi_s(f, h_i)\Big\} \]
and $A(K) = A_1(K) \cap A_2(K)$.
By Lemma A.5, we know that for any $a \in (0,1)$ there exists a $K$ such that
\[ \mathbb{P}\big\{\omega : S^\omega_t L^0_i \in A_i(K)\big\} > 1 - \frac{a}{2} \quad\text{for } i = 1, 2, \]
and hence
\[ \mathbb{P}\big\{\omega : S^\omega_t L^0_i \in A(K)\ \text{for } i = 1, 2\big\} > 1 - a > 0. \]
This is just another way of saying $Q_\infty(L^0_i, A(K)) > 1 - a$.
Lemma 3.4. Let $L^0_1$ and $L^0_2$ be two initial pasts in $\mathcal{P}$ such that $L^0_1(0) = L^0_2(0)$. Let $A(K) \subset C([0,\infty), L^2_\ell)$ be as defined above. For any choice of $K > 0$, $Q_\infty(L^0_1, \cdot\cap A(K))$ is equivalent to $Q_\infty(L^0_2, \cdot\cap A(K))$.
Proof of Lemma 3.1. Since we can choose $K$ so that $A(K)$ has measure arbitrarily close to 1, we have that $Q_\infty(L^0_1,\cdot)$ is equivalent to $Q_\infty(L^0_2,\cdot)$.
Proof of Lemma 3.4. We intend to use Girsanov's theorem to compare the two induced measures, $Q_\infty(L^0_1,\cdot)$ and $Q_\infty(L^0_2,\cdot)$. However, we do not do so directly. To aid in our analysis, we consider the following surrogate processes $y$ which will agree with $\ell$ on the set $A = A(K)$. As before, we will use $y(t)$ to denote the value of the process at time $t$ and $Y^t$ to be the entire trajectory up to time $t$.
\[ dy_i(t) = \big[-\nu\Lambda^2 y_i(t) + P_\ell B(y_i(t), y_i(t)) + \Theta_t(Y^t_i)\,G\big(y_i(t), \Phi_t(Y^t_i, h_i(0))\big)\big]dt + dW(t), \qquad y_i(0) = \ell_i(0), \tag{21} \]
where $h_i(0) = \Phi_0(L^0_i)$,
\[ \Theta_t(f) = \begin{cases} 1 & \text{if } f \in A|_{[0,t]}, \\ 0 & \text{if } f \notin A|_{[0,t]}, \end{cases} \]
and $A|_{[0,T]}$ is the set of low mode paths which agree with a path in $A$ up to time $T$. Recall that $\Phi_t(Y^t_i, h_i(0))$ is the solution to (12) with $\ell = Y$ and $h(0) = h_i(0)$. Equation (21) is the same as (17) except for the insertion of $\Theta_t(Y^t_i)$. As long as $\Theta_s(Y^t_i) = 1$ for $s \in [0,t]$, then $y_i(s) = \ell_i(s)$ for $s \in [0,t]$.
Let $Q^y_\infty(L^0_1,\cdot)$ and $Q^y_\infty(L^0_2,\cdot)$ be the measures induced by $Y_1$ and $Y_2$ respectively. If applicable, Girsanov's theorem would imply that these measures are equivalent, that is $Q^y_\infty(L^0_1,\cdot) \sim Q^y_\infty(L^0_2,\cdot)$. For Girsanov's theorem to apply, it is sufficient that the Novikov condition holds. Namely,
\[ \mathbb{E}\exp\Big\{\frac12\int_0^\infty \Big|\Sigma^{-1}\Theta_t(Y^t_1)\,D\big(y_1(t), \Phi_t(Y^t_1, h_1(0)), \Phi_t(Y^t_1, h_2(0))\big)\Big|^2 dt\Big\} < \infty, \tag{22} \]
where $D(g, f_1, f_2) = G(g, f_1) - G(g, f_2)$ and $\Sigma$ is a diagonal matrix with the $\sigma_k$'s on its diagonal. Here we have written the condition in terms of the $y_1$ process. One can also
write the condition in terms of the $y_2$ process; the finiteness of one implies the finiteness of the other. We will in fact show something much stronger than (22). Since $|\Sigma^{-1}| < \infty$, it would be enough to show that
\[ \sup_\omega \int_0^\infty \big| \Theta_t(Y_1^t)\, D\big(y_1(t), \Phi_t(Y_1^t, h_1(0)), \Phi_t(Y_1^t, h_2(0))\big) \big|^2\,dt < \infty. \tag{23} \]
Putting $h_i(s) = \Phi_s(Y_1^s, h_i(0))$, $u_i(s) = \ell_i(s) + h_i(s)$, $\rho(s) = h_1(s) - h_2(s)$ and using Lemma A.4, we have
\[ \big| D(\ell_1(s), h_1(s), h_2(s)) \big|_{L^2}^2 \le C\, |\rho(s)|_{L^2}^2 \big( |u_1(s)|_{L^2}^2 + |u_2(s)|_{L^2}^2 \big). \tag{24} \]
Notice that if $\ell_i \in A|_{[0,T]}$ then for all $t \in [0,T]$,
\[ |u_i(t)|_{L^2}^2 < C_0 + E_0 t + K t^{4/5}, \]
\[ \int_0^t |\Lambda u_i(s)|_{L^2}^2\,ds < \frac{1}{2\nu}\big( C_0 + E_0 t + K t^{4/5} \big), \]
\[ |\rho(0)|_{L^2}^2 = |u_1(0) - u_2(0)|_{L^2}^2 \le 2\big( |u_1(0)|_{L^2}^2 + |u_2(0)|_{L^2}^2 \big) \le 4 C_0. \]
In addition, we can apply the same analysis as in Sect. 2. Starting from (14) and using the above estimates produces
\[ |\rho(t)|_{L^2}^2 \le |\rho(0)|_{L^2}^2 \exp\Big( -\nu N^2 t + \frac{\hat{C}^2}{\nu} \int_0^t |\Lambda u_2(s)|_{L^2}^2\,ds \Big) \le 4 C_0 \exp\Big( -\nu N^2 t + \frac{\hat{C}^2}{2\nu^2}\big( C_0 + E_0 t + K t^{4/5} \big) \Big). \]
Since by assumption $\nu N^2 > C E_0 / \nu^2 = \hat{C}^2 E_0 / (2\nu^2)$, the second term goes to zero sufficiently fast and hence the estimate on the right-hand side of (24) decays exponentially fast. Thus,
\[ \sup_\omega \int_0^\infty \big| \Theta_t(Y_1^t)\, D\big(y_1(t), \Phi_t(Y_1^t, h_1(0)), \Phi_t(Y_1^t, h_2(0))\big) \big|^2\,dt \le \sup_{f \in A} \int_0^\infty \big| D\big(f(t), \Phi_t(f, h_1(0)), \Phi_t(f, h_2(0))\big) \big|^2\,dt < \mathrm{const}(C_0) < \infty, \]
which implies $Q^y_\infty(L^0_1, \cdot) \sim Q^y_\infty(L^0_2, \cdot)$. As long as $Y_i$ stays in $A$, $y_i = \ell_i$. Hence $Q^y_\infty(L^0_i, \cdot \cap A) = Q_\infty(L^0_i, \cdot \cap A)$ and finally $Q_\infty(L^0_1, \cdot \cap A) \sim Q_\infty(L^0_2, \cdot \cap A)$.

In fact our proof provided more information than stated in Lemma 3.4. It contains some estimates uniform over a class of initial pasts which will be useful in later investigations of the convergence rate. (See [Mat00].) We state the extra information in the following corollary.
Corollary 3.5. In the setting of the proof of Lemma 3.4, define $\widetilde{\mathcal{P}} = \{L \in \mathcal{P} : |L(0) + \Phi_0(L)|_{L^2} < C_0\}$. Then there exists a constant, depending on $C_0$ and $K$, so that
\[ \sup_{L_1, L_2 \in \widetilde{\mathcal{P}}} \int \Big| 1 - \frac{dQ^y_\infty(L_1, g)}{dQ^y_\infty(L_2, g)} \Big|^2 dQ^y_\infty(L_2, g) < \mathrm{const}(C_0, K_1) < \infty. \]

We now move to the proof of Lemma 3.2. Fix $L \in \mathcal{P}$. The proof proceeds by comparing the process $\ell(t)$ to the associated Galerkin approximation living on $L^2_\ell$ which we will denote by $x(t)$. The advantage is that $x(t)$ is a standard non-degenerate diffusion and hence it is Markovian and well understood. Take $x(t)$ as the solution defined by the following stochastic ODEs:
\[ dx(t) = \big[ -\nu\Lambda^2 x + P_\ell B(x, x) \big]\,dt + dW(t), \qquad x(0) = \ell(0). \]
As in the previous section, we do not compare $x(t)$ directly to $\ell(t)$ but instead to a modified version of $\ell(t)$ which we will denote by $z(t)$. In analogy to before, we will denote the path of this process up to time $t$ by $Z^t$. Before continuing let us assume without loss of generality that $|\ell(0)|_{L^2} \le C_0$ and $t \le T$ for some positive $C_0$ and $T$. This will give our estimates some uniformity over all initial conditions inside this ball and for times $t \le T$. The evolution of $z(t)$ is given by
\[ dz(t) = \big[ -\nu\Lambda^2 z + P_\ell B(z, z) + \Theta_t(Z^t)\, G\big(z, \Phi_t(Z^t, h_0)\big) \big]\,dt + dW, \qquad z(0) = \ell(0) = L(0), \]
where $h_0 = \Phi_0(L)$ and $G$ is defined in (18). As in the last section, $\Theta_t(Z^t)$ is a cut-off function. For any fixed $b_0 > 1$, we define
\[ \Theta_s(Z^s) = \begin{cases} 1 & \text{if } \int_0^s |Z^s(r)|_{L^2}^4\,dr < (b_0 C_0)^4\, T, \\ 0 & \text{otherwise.} \end{cases} \]
Here $b_0$ is a fixed constant to be chosen below. For any $B \subset L^2_\ell$, define $[B] = \{ v \in C([0,t], L^2_\ell) : v(t) \in B \}$. Then $R_t(L(0), B) = Q_t(L, [B])$. Let $Q^x_t(L, \cdot)$ and $Q^z_t(L, \cdot)$ be the two measures induced on $C([0,t], L^2_\ell)$ by the dynamics of $x$ and $z$ respectively. Lemma 3.2 will be a consequence of the following two lemmas.

Lemma 3.6. Fix any $b_0 > 1$ (the constant used in defining the $z$ process). Then the following holds: For any $L \in \mathcal{P}$ and $t \ge 0$, $Q^x_t(L(0), \cdot)$ is equivalent to $Q^z_t(L, \cdot)$.

Lemma 3.7. For any $b_0$ the following holds: For any $L \in \mathcal{P}$ and $t \ge 0$, there exists a positive function $g(\cdot)$ so that
\[ Q^x_t(L(0), [B] \cap A) \ge \int_B g(y)\,dm(y), \]
where $m(\cdot)$ is the Lebesgue measure.
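As a rough illustration of the cut-off used above to define $z$, the following sketch evaluates the indicator on a time grid. It assumes the path is known only through sampled norms $|Z(r_j)|_{L^2}$ at equally spaced times and approximates the integral by a left Riemann sum; the function name `cutoff_indicator` and its arguments are hypothetical and do not come from the paper.

```python
def cutoff_indicator(norms, dt, b0, C0, T):
    """Evaluate the cut-off on a time grid (illustrative sketch only).

    norms[j] is assumed to hold |Z(r_j)|_{L^2} at r_j = j * dt.
    Entry j of the result is 1 if the left Riemann sum approximating
    int_0^{r_j} |Z(r)|^4 dr is below (b0 * C0)**4 * T, and 0 otherwise.
    """
    threshold = (b0 * C0) ** 4 * T
    running = 0.0  # approximates the integral up to the current grid time
    values = []
    for n in norms:
        values.append(1 if running < threshold else 0)
        running += n ** 4 * dt
    return values


if __name__ == "__main__":
    # A small path never triggers the cut-off; a large one does after one step.
    print(cutoff_indicator([1.0] * 5, dt=0.1, b0=2.0, C0=1.0, T=1.0))
    print(cutoff_indicator([2.0] * 5, dt=1.0, b0=1.5, C0=1.0, T=1.0))
```

Because the running integral is non-decreasing, once the indicator drops to 0 it stays 0, which is the property that lets the drift of $z$ be bounded uniformly on the set where the cut-off is active.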
We now use these two lemmas to prove Lemma 3.2.

Proof of Lemma 3.2. Observe that by construction as long as the trajectories stay in $A$, $z(t) = \ell(t)$. Hence using Lemma 3.7, we have
\[ R_t(L, B) = Q_t(L, [B]) \ge Q_t(L, [B] \cap A) = Q^z_t(L, [B] \cap A), \qquad Q^x_t(L(0), [B] \cap A) \ge \int_B g(L(0), y)\,dm(y), \]
where $g(L(0), y)$ is a positive function in $y$. Since Lemma 3.6 says that $Q^z_t(\ell, \cdot \cap A)$ is equivalent to $Q^x_t(L(0), \cdot \cap A)$, we know that $R_t(L(0), B)$ is also bounded from below by a positive measure equivalent to the Lebesgue measure.

We now turn to Lemma 3.6. Our construction gives some measure of uniform control which is useful for estimating the rate at which the system converges to the stationary measure. (See [Mat00].) We state these more precise estimates in the following corollary.

Corollary 3.8. Fix a $C_0 > 0$ and define $\widetilde{\mathcal{P}} = \{L \in \mathcal{P} : |L(0) + \Phi_0(L)|_{L^2} < C_0\}$. Then for any $a \in (0,1)$ there exists a $b_0 > 0$ (the constant used to define $A$) so that:
\[ \inf_{t \in [0,T]} \inf_{L \in \widetilde{\mathcal{P}}} \mathbb{P}\{ S^\omega_t L \in A \} > 1 - a, \]
\[ \sup_{L \in \widetilde{\mathcal{P}}} \int \Big| 1 - \frac{dQ^z_t(L, g)}{dQ^x_t(L, g)} \Big|^2 dQ^x_t(L, g) < K(C_0, t) \]
for $t \in [0,T]$, where $K$ is a constant depending on $C_0$ and $t$ such that for each $C_0$, $K \to 0$ as $t \to 0$.

Proof of Lemma 3.6 and Corollary 3.8. Girsanov's theorem would imply the result if the Novikov condition
\[ \mathbb{E} \exp\Big\{ \frac{1}{2} \int_0^t |\Theta_s(Z^s)|^2\, \big| G\big(z(s), \Phi_s(Z^s, h_0)\big) \big|_{L^2}^2\,ds \Big\} < \infty \]
holds. As in the proof of Lemma 3.4, we will prove the stronger condition
\[ \sup_{z(\cdot) \in A} \int_0^t \big| G\big(z(s), \Phi_s(Z^s, h_0)\big) \big|_{L^2}^2\,ds < \infty. \]
Using Lemma A.4, we obtain the following estimate on $G$:
\[ \big| G\big(z(s), \Phi_s(Z^s, h_0)\big) \big|_{L^2}^2 \le C \big( |z(s)|_{L^2}^2 |h(s)|_{L^2}^2 + |h(s)|_{L^2}^4 \big), \]
where $h(s) = \Phi_s(Z^s, h_0)$. By Lemma C.1 we know that if $z$ is in $A$ then $\sup_{s \in [0,t]} |h(s)|_{L^2}$ is less than some $C_1$, where $C_1$ depends on $|h_0|_{L^2}$ and the $b_0$, $C_0$ and $T$ used to define $A$. Hence for any $z \in A$, we have
\[
\begin{aligned}
\int_0^t \big| G\big(z(s), \Phi_s(Z^s, h_0)\big) \big|_{L^2}^2\,ds
&\le C \int_0^t \big( |z(s)|_{L^2}^2 |h(s)|_{L^2}^2 + |h(s)|_{L^2}^4 \big)\,ds \\
&\le C \Big( \int_0^t |z(s)|_{L^2}^4\,ds \Big)^{1/2} \Big( \int_0^t |h(s)|_{L^2}^4\,ds \Big)^{1/2} + C\, C_1^4\, t \\
&\le C\, (b_0 C_0)^2\, T^{1/2}\, C_1^2\, t^{1/2} + C\, C_1^4\, t.
\end{aligned}
\]
Hence Novikov's condition holds and the lemma is proven.
Proof of Lemma 3.7. The basic idea is as follows. Some of the paths which satisfy the condition defining $A$ can be described by requiring that some norm of the paths be less than some fixed $f_k^*(t)$ at time $t$. Such a condition has the advantage that it corresponds to fixing a zero boundary condition along the boundary of some region for the associated Fokker-Planck equation. Since the diffusion is nondegenerate this process has a positive density on the interior of this region. By carefully picking $f_k^*$ we can have the region contain sets arbitrarily far away from the origin.

We now make this precise. Fix an $L \in \mathcal{P}$ and a $t > 0$. For $k = 0, 1, 2, \ldots$ define the disk $D_k$ by $D_k = \{ f \in L^2_\ell : |f|_{L^2}^4 \in [2^k, 2^{k+1}) \}$ and let $\bar{D}_k$ be the closure of $D_k$. We will construct $g(\cdot) = \sum_k g_k(\cdot) 1_{D_k}$, where $g_k$ is strictly positive on $\bar{D}_k$ and zero outside of $\bar{D}_k$. Let $f_k^*$ be a non-decreasing, positive, real-valued $C^\infty$ function such that $f_k^*(s) = (C_0^4 + \alpha_k)^{1/4}$ for $s \in [0, (1-\alpha_k)t - \varepsilon]$ and $f_k^*(s) = (100 \cdot 2^{k+1})^{1/4}$ for $s \in [(1-\alpha_k)t, t]$ and which linearly interpolates in $[(1-\alpha_k)t - \varepsilon, (1-\alpha_k)t]$. Here $\alpha_k$ is some number in $(0,1)$ chosen so that $\int_0^t (f_k^*(r))^4\,dr < (b_0 C_0)^4\, T$. This is possible as long as $b_0 > 1$ and $t \le T$. Now define the subset $H_k$ of $C([0,t], L^2_\ell)$ by
\[ H_k = \Big\{ f \in C\big([0,t], L^2_\ell\big) : |f(s)|_{L^2} \le f_k^*(s) \text{ for all } s \in [0,t] \Big\}. \]
By the choice of $f_k^*$ it is clear that $H_k \subset A$, where $A$ is the same set used in the definition of $z$.

Now consider the process $x_k(t)$ which follows the same equation as $x(t)$ except that it is killed whenever the trajectory leaves $H_k$. Another way of saying this is that $x_k(t)$ is the process $x(t)$ conditioned on staying in $H_k$. The transition density of this process $g_k(s, \ell(0), y)$ is the solution to the Kolmogorov equation with the same generator as $x$ but with zero boundary conditions along the boundary of $H_k$. Since the generator is elliptic, we know that $g_k(t, \ell(0), y)$ is strictly positive everywhere in the interior of $H_k$. Since the trace of $H_k$ at time $t$ strictly contains $D_k$, we know that $g_k(t, \ell(0), y)$ is strictly positive for $y \in \bar{D}_k$. Also by construction it is clear that $Q^x_t(\ell(0), H_k) > 0$ for all $k$. Let $a_k = Q^x_t(\ell(0), H_k)$ and set $g_k(\cdot) = a_k\, g_k(t, \ell(0), \cdot)\, 1_{D_k}(\cdot)$.

All that remains is to verify that this choice of $g_k$ constructs a $g$ with the desired minorization property since it is clearly everywhere positive. Without loss of generality it is enough to show it for a $B$ contained in some arbitrary $D_k$. Then
\[
\begin{aligned}
Q^x_t(\ell(0), [B] \cap A) &\ge Q^x_t(\ell(0), [B] \cap H_k) \ge \mathbb{P}_{\ell(0)}\{ x \in [B] \ \&\ x \in H_k \} \\
&\ge \mathbb{P}_{\ell(0)}\{ x \in [B] \mid x \in H_k \}\, \mathbb{P}_{\ell(0)}\{ x \in H_k \} \ge a_k \int_B g_k(t, \ell(0), y)\,dm(y) = \int_B g_k(y)\,dm(y).
\end{aligned}
\]
4. Stationary Measures and Thermodynamical Formalism

In this section we make a few general heuristic remarks about the methodology behind our approach. The starting point of our construction is rewriting the original Navier–Stokes equation with random forcing as a finite-dimensional system of ordinary stochastic differential equations whose drift coefficients depend on the whole past:
\[ d\ell = \big[ -\nu\Lambda^2 \ell + P_\ell B(\ell, \ell) + G(\ell, \Phi_t(L^t)) \big]\,dt + dW. \tag{25} \]
From (25)
\[ dW = d\ell - \big[ -\nu\Lambda^2 \ell + P_\ell B(\ell, \ell) + G(\ell, \Phi_t(L^t)) \big]\,dt. \tag{26} \]
The measure corresponding to all $dw_k(t)$, $k \in Z_\nu$, $-\infty < t < \infty$ can be symbolically written as
\[ \exp\Big\{ -\frac{1}{2} \sum_{k \in Z_\nu} \frac{1}{|\sigma_k|^2} \int_{-\infty}^{\infty} \Big| \frac{dw_k(t)}{dt} \Big|^2 dt \Big\} \prod_k dw_k(t). \]
Here $Z_\nu$ is the set of modes that are forced. The substitution of the expression for $dw_k$ from (26) gives
\[ \exp\Big\{ \int_{-\infty}^{\infty} \mathcal{L}_1(\ell(t))\,dt + \int_{-\infty}^{\infty} \mathcal{L}_2(\ell(t))\,dt - \frac{1}{2} \sum_{k \in Z_\nu} \frac{1}{|\sigma_k|^2} \int_{-\infty}^{\infty} \Big| \frac{d\ell_k(t)}{dt} \Big|^2 dt \Big\} \prod_k d\ell_k(t), \]
where
\[ \mathcal{L}_1(\ell(t)) = -\frac{1}{2} \big| -\nu\Lambda^2 \ell + P_\ell B(\ell, \ell) + G(\ell, \Phi_t(L^t)) \big|^2, \]
\[ \int_{-\infty}^{\infty} \mathcal{L}_2(\ell(t))\,dt = \sum_{k \in Z_\nu} \frac{1}{|\sigma_k|^2} \int_{-\infty}^{\infty} \big[ -\nu\Lambda^2 \ell + P_\ell B(\ell, \ell) + G(\ell, \Phi_t(L^t)) \big]_k\, d\ell_k(t). \]
The factor
\[ \exp\Big\{ -\frac{1}{2} \sum_{k \in Z_\nu} \frac{1}{|\sigma_k|^2} \int_{-\infty}^{\infty} \Big| \frac{d\ell_k(t)}{dt} \Big|^2 dt \Big\} \prod_k d\ell_k(t) \]
can be considered as the differential of a “free measure” which in our case is a finite-dimensional white noise. The “Lagrangians” $\mathcal{L}_1$, $\mathcal{L}_2$ describe the non-local interaction of $\ell(t)$ with the past. The whole expression shows that the stationary measure for the SNS system is actually a Gibbs state constructed with the help of Lagrangians $\mathcal{L}_1$, $\mathcal{L}_2$. The estimations of the growth of $\mathcal{L}_1$, $\mathcal{L}_2$ as a function of the growth of $|\ell_k(s)|_{L^2}$, $s \to -\infty$, show the class of realizations for which the conditional distributions can be defined. Therefore we have a weaker form of the Gibbs state. R. L. Dobrushin in his last papers and talks stressed the importance of this class of probability distributions. Since we are dealing all the time with probability distributions, the free energy of our Gibbs state is zero. It would be interesting to develop a general theory of existence and uniqueness of Gibbs states for general Lagrangians $\mathcal{L}_1$, $\mathcal{L}_2$ so that our result becomes a particular case of a more general statement.
5. Conclusion

When analyzing the ergodic properties of an infinite dimensional stochastic process, one of the most delicate aspects is often finding the correct topology in which to work. One of the principal advantages of the approach presented in this paper is that it evades this difficulty. We trade an infinite dimensional diffusion process for a finite dimensional Itô process with memory.

We have tried to present the simplest case of our theory, so that the exposition would be unencumbered. In fact the proofs contained in this work have proved a more general theorem than originally stated. Consider forcing defined by
\[ W(x, t) = \sum_{k \in \mathcal{Z}} \sigma_k w_k(t, \omega) e_k(x), \]
where $\mathcal{Z}$ is some finite subset of $\mathbb{Z}^2$ such that $(0,0) \notin \mathcal{Z}$ and $k \in \mathcal{Z}$ if and only if $\sigma_k > 0$. Define
\[ L^2_\ell = \mathrm{span}\{ e_k,\ k \in \mathcal{Z} \}, \qquad L^2_h = \mathrm{span}\{ e_k,\ k \notin \mathcal{Z} \}, \]
and
\[ N_- = \sup\big\{ N : k \in \mathcal{Z} \text{ for all } k \text{ with } 0 < |k| \le N \big\}. \]
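To make the definition of $N_-$ concrete, here is a minimal sketch that computes it for a finite set of forced modes. It assumes, as a simplification not stated in the text, that $N$ ranges over non-negative integers only; the function name `n_minus` and the representation of modes as integer pairs are hypothetical.

```python
def n_minus(forced_modes):
    """Largest integer N such that every k in Z^2 with 0 < |k| <= N is forced.

    forced_modes: finite iterable of integer pairs (k1, k2).
    Returns 0 if some mode with |k| = 1 is not forced.
    Assumption: N is restricted to integers for this illustration.
    """
    forced = set(forced_modes)
    N = 0
    while True:
        nxt = N + 1
        # All nonzero lattice points in the closed ball of radius nxt.
        ball = [
            (k1, k2)
            for k1 in range(-nxt, nxt + 1)
            for k2 in range(-nxt, nxt + 1)
            if 0 < k1 * k1 + k2 * k2 <= nxt * nxt
        ]
        if all(k in forced for k in ball):
            N = nxt
        else:
            return N


if __name__ == "__main__":
    full = [(a, b) for a in range(-2, 3) for b in range(-2, 3)
            if 0 < a * a + b * b <= 4]
    print(n_minus(full))                                # every mode up to |k| = 2
    print(n_minus([k for k in full if k != (1, 1)]))    # a gap at |k| = sqrt(2)
```

The loop terminates because the forced set is finite, so some ball eventually contains an unforced mode. A single missing low mode caps $N_-$ regardless of how many higher modes are forced, which is the sense in which the assumption asks for all low modes to be forced.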
With these definitions all of the previous lemmas and theorems hold with the role of $N$ replaced by $N_-$. In particular, if $N_-^2 > C E_0 / \nu^3$ the system has a unique invariant measure. This formulation emphasizes the nature of our principal assumption. By requiring that all of the low modes are forced, we are essentially requiring that the reduced Gibbsian dynamics are elliptic in nature. Some steps towards dealing with a hypo-elliptic setting have been made. In [EMatt], finite dimensional truncations of the two dimensional SNS equation were studied and shown to be ergodic under minimal assumptions. In [EM], a reaction diffusion equation was studied under degenerate forcing.

Our arguments can be easily extended to the case where the forcing of the $k$th mode has the form $f_k + \sigma_k\,dw_k(t)$, where $f_k$ is a constant, $f_k = 0$ and $\sigma_k = 0$ for $k \notin \mathcal{Z}$, or the case when the forcing is not diagonal in Fourier space.

Our approach can also be extended in several other directions. We can consider the case when the high modes are also forced. As long as the forcing of the high modes decays sufficiently fast, our argument still applies with almost no change. The Wiener process in the forcing can be replaced by other diffusion processes such as the Ornstein-Uhlenbeck process. Dissipative PDEs such as the Cahn-Hilliard equation and the Ginzburg-Landau equations can also be studied using the same method. Finally, exponential convergence of empirical distributions to the stationary distribution can be proved.

A. Energy Estimates

In this Appendix, we prove a number of estimates controlling the evolution of the energy and enstrophy. Estimates for higher Sobolev norms are also possible, see [Mat98] for examples. In all cases, they are analogous to the standard results in the deterministic setting. Here we do not limit ourselves to forcing with only finitely many active modes. We will characterize the forcing in terms of the $E_l$ defined by $E_l \overset{\mathrm{def}}{=} \sum_k |k|^{2l} |\sigma_k|^2$.
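We begin with the basic energy and enstrophy estimates in the stochastic setting.

Lemma A.1. For any $p > 1$, we have
\[
\begin{aligned}
\mathbb{E}|u(t)|_{L^2}^{2p} + 2p\nu\, \mathbb{E}\int_0^t |\Lambda u(s)|_{L^2}^2\, |u(s)|_{L^2}^{2(p-1)}\,ds &\le \mathbb{E}|u(0)|_{L^2}^{2p} + C_0\, \mathbb{E}\int_0^t |u(s)|_{L^2}^{2(p-1)}\,ds, \\
\mathbb{E}|\Lambda u(t)|_{L^2}^{2p} + 2p\nu\, \mathbb{E}\int_0^t |\Lambda^2 u(s)|_{L^2}^2\, |\Lambda u(s)|_{L^2}^{2(p-1)}\,ds &\le \mathbb{E}|\Lambda u(0)|_{L^2}^{2p} + C_1\, \mathbb{E}\int_0^t |\Lambda u(s)|_{L^2}^{2(p-1)}\,ds.
\end{aligned}
\]
Here $C_i = pE_i + 2p(p-1)\sigma_{\max}^2$ and $\sigma_{\max}^2 = \sup_k |\sigma_k|^2$. In the case $p = 1$, we have the equalities
\[ \mathbb{E}|u(t)|_{L^2}^2 + 2\nu\, \mathbb{E}\int_0^t |\Lambda u(s)|_{L^2}^2\,ds = \mathbb{E}|u(0)|_{L^2}^2 + E_0 t, \tag{27} \]
\[ \mathbb{E}|\Lambda u(t)|_{L^2}^2 + 2\nu\, \mathbb{E}\int_0^t |\Lambda^2 u(s)|_{L^2}^2\,ds = \mathbb{E}|\Lambda u(0)|_{L^2}^2 + E_1 t. \tag{28} \]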
Proof. We begin by fixing a positive integer $M$ and considering the Galerkin approximation defined by $u^{(M)}(t) = \sum_{|k| \le M} u_k^{(M)}(t) e_k$. $u^{(M)}(t)$ satisfies an equation of exactly the same form as the full solution except the nonlinearity has been projected to those terms of order less than or equal to $M$. We will also need $E_l^M \overset{\mathrm{def}}{=} \sum_{|k| \le M} |k|^{2l} |\sigma_k|^2$. Our estimates will be independent of the order of approximation $M$. For simplicity, we will sometimes neglect the superscript $M$.

Applying Itô's formula to the map $\{u_k\} \mapsto \big( \sum_k |u_k|^2 \big)^p$ produces
\[
\begin{aligned}
d|u(t)|_{L^2}^{2p} ={}& 2p\, |u(t)|_{L^2}^{2(p-1)} \big[ -\nu |\Lambda u(t)|_{L^2}^2\,dt + \langle u(t), dW \rangle_{L^2} \big] \\
&+ 2p(p-1)\, |u(t)|_{L^2}^{2(p-2)} \sum_k |u_k(t)|^2 |\sigma_k|^2\,dt + p\, |u(t)|_{L^2}^{2(p-1)} E_0^M\,dt
\end{aligned} \tag{29}
\]
for the energy moments and
\[
\begin{aligned}
d|\Lambda u(t)|_{L^2}^{2p} ={}& 2p\, |\Lambda u(t)|_{L^2}^{2(p-1)} \big[ -\nu |\Lambda^2 u(t)|_{L^2}^2\,dt + \langle \Lambda^2 u(t), dW \rangle_{L^2} \big] \\
&+ 2p(p-1)\, |\Lambda u(t)|_{L^2}^{2(p-2)} \sum_k |k|^2 |\sigma_k|^2 |u_k(t)|^2\,dt + p\, |\Lambda u(t)|_{L^2}^{2(p-1)} E_1^M\,dt
\end{aligned} \tag{30}
\]
for the enstrophy moments. Here $\langle \Lambda^\alpha u(t), dW(t) \rangle_{L^2}$ is shorthand for $\sum_k |k|^\alpha u_k(t) \sigma_k\,dw_k(t)$. In the first, we have used the fact that $\langle B(u,u), u \rangle_{L^2} = 0$ and in the second the fact that $\langle B(u,u), \Lambda^2 u \rangle_{L^2} = 0$. Since, on the torus, the structure of the energy and the enstrophy equations are the same we will continue giving all of the details for analysis of the enstrophy equation.
The analysis for the energy equation proceeds analogously, see [Mat99, Mat98]. For a fixed $H > 0$, we introduce the stopping time
\[ T = \inf\big\{ t \ge 0 : |\Lambda^2 u(t)|_{L^2}^2 \ge H^2 \big\}. \]
Denoting by $M_t$ the local martingale term in (30), we define the stopped martingale $M_t^T$ by
\[ M_t^T = 2p \int_0^t |\Lambda u(s \wedge T)|_{L^2}^{2(p-1)}\, \langle \Lambda^2 u(s \wedge T), dW(s) \rangle_{L^2}. \]
$M_t^T$ has the advantage that its quadratic variation, denoted by $[M^T, M^T]_t$, is clearly finite:
\[ [M^T, M^T]_t \le 2p\,\sigma_{\max}^2 \int_0^t |\Lambda^2 u(s \wedge T)|_{L^2}^{2p}\,ds \le 2p\,\sigma_{\max}^2\, H^{2p}\, t < \infty. \]
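Because $\mathbb{E}[M^T, M^T]_t < \infty$ we know that $\mathbb{E} M_t^T = 0$. And because $t \wedge T$ is a bounded stopping time the Optional Stopping Time Lemma says that $\mathbb{E} M^T_{t \wedge T} = 0$. Since $M_{t \wedge T} = M^T_{t \wedge T}$, we have
\[ \mathbb{E}|\Lambda u(t \wedge T)|_{L^2}^2 + 2\nu\, \mathbb{E}\int_0^{t \wedge T} |\Lambda^2 u(s)|_{L^2}^2\,ds = \mathbb{E}|\Lambda u(0)|_{L^2}^2 + E_1^M\, \mathbb{E}(t \wedge T), \]
and when $p > 1$,
\[
\begin{aligned}
\mathbb{E}|\Lambda u(t \wedge T)|_{L^2}^{2p} &+ 2p\nu\, \mathbb{E}\int_0^{t \wedge T} |\Lambda u(s)|_{L^2}^{2(p-1)}\, |\Lambda^2 u(s)|_{L^2}^2\,ds \\
&= \mathbb{E}|\Lambda u(0)|_{L^2}^{2p} + \mathbb{E}\int_0^{t \wedge T} \Big[ 2p(p-1)\, |\Lambda u(s)|_{L^2}^{2(p-2)} \sum_k |k|^2 |\sigma_k|^2 |u_k(s)|^2 + p\, |\Lambda u(s)|_{L^2}^{2(p-1)} E_1^M \Big]\,ds.
\end{aligned}
\]
Hence
\[ \mathbb{E}|\Lambda u(t \wedge T)|_{L^2}^{2p} + 2p\nu\, \mathbb{E}\int_0^{t \wedge T} |\Lambda u(s)|_{L^2}^{2(p-1)}\, |\Lambda^2 u(s)|_{L^2}^2\,ds \le \mathbb{E}|\Lambda u(0)|_{L^2}^{2p} + \big( 2p(p-1)\sigma_{\max}^2 + pE_1^M \big)\, \mathbb{E}\int_0^{t \wedge T} |\Lambda u(s)|_{L^2}^{2(p-1)}\,ds. \]
Since $u(t)$ is continuous in time, $T \to \infty$ as $H \to \infty$ and hence $T \wedge t \to t$. Thus we obtain
\[ \mathbb{E}|\Lambda u(t)|_{L^2}^2 + 2\nu\, \mathbb{E}\int_0^t |\Lambda^2 u(s)|_{L^2}^2\,ds = \mathbb{E}|\Lambda u(0)|_{L^2}^2 + E_1^M t, \]
\[ \mathbb{E}|\Lambda u(t)|_{L^2}^{2p} + 2p\nu\, \mathbb{E}\int_0^t |\Lambda u(s)|_{L^2}^{2(p-1)}\, |\Lambda^2 u(s)|_{L^2}^2\,ds \le \mathbb{E}|\Lambda u(0)|_{L^2}^{2p} + \big( 2p(p-1)\sigma_{\max}^2 + pE_1^M \big)\, \mathbb{E}\int_0^t |\Lambda u(s)|_{L^2}^{2(p-1)}\,ds. \]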
Recall that we have been calculating with an $M$th order Galerkin approximation. For the $p = 1$ equation, the right hand side converges to the desired right hand side. With this bound on $\mathbb{E}|\Lambda u(t)|_{L^2}^2$ in hand we can take the $M \to \infty$ limit of the $p = 2$ equation. Analogously, once we have taken the limit in the $p$th equation we have the dominating bound needed to take the limit in the $p + 1$ equation. □

In our setting, the Poincaré inequality reads $|\Lambda f|_{L^2}^2 \ge |f|_{L^2}^2$ and $|\Lambda^2 f|_{L^2}^2 \ge |\Lambda f|_{L^2}^2$. This allows us to close the above inequalities. After applying Gronwall's inequality, we obtain the following estimates which are uniform in time.

Corollary A.2.
\[ \mathbb{E}|u(t)|_{L^2}^2 \le e^{-2\nu t}\, \mathbb{E}|u(0)|_{L^2}^2 + \frac{E_0}{2\nu}\big( 1 - e^{-2\nu t} \big), \]
\[ \mathbb{E}|\Lambda u(t)|_{L^2}^2 \le e^{-2\nu t}\, \mathbb{E}|\Lambda u(0)|_{L^2}^2 + \frac{E_1}{2\nu}\big( 1 - e^{-2\nu t} \big). \]
For any $p > 1$,
\[ \mathbb{E}|u(t)|_{L^2}^{2p} \le e^{-2\nu t}\, \mathbb{E}|u(0)|_{L^2}^{2p} + C_0 \int_0^t e^{-2\nu(t-s)}\, \mathbb{E}|u(s)|_{L^2}^{2(p-1)}\,ds, \]
\[ \mathbb{E}|\Lambda u(t)|_{L^2}^{2p} \le e^{-2\nu t}\, \mathbb{E}|\Lambda u(0)|_{L^2}^{2p} + C_1 \int_0^t e^{-2\nu(t-s)}\, \mathbb{E}|\Lambda u(s)|_{L^2}^{2(p-1)}\,ds. \]
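The $p = 1$ energy bound of Corollary A.2 is explicit enough to evaluate directly. The sketch below does so; the function name `energy_bound` and the sample parameter values are illustrative assumptions only. It shows that the bound starts at the initial energy and relaxes to $E_0/(2\nu)$, the constant that reappears in Lemma B.1.

```python
import math


def energy_bound(t, nu, E0, initial_energy):
    """Right-hand side of the p = 1 energy estimate in Corollary A.2:
    exp(-2 nu t) * E|u(0)|^2 + E0 / (2 nu) * (1 - exp(-2 nu t)).
    """
    decay = math.exp(-2.0 * nu * t)
    return decay * initial_energy + E0 / (2.0 * nu) * (1.0 - decay)


if __name__ == "__main__":
    # Hypothetical values: nu = 1, E0 = 2, initial energy 5.
    for t in (0.0, 0.5, 1.0, 5.0):
        print(t, energy_bound(t, nu=1.0, E0=2.0, initial_energy=5.0))
```

The bound is a convex combination of the initial energy and $E_0/(2\nu)$, so it never exceeds the larger of the two, which is what makes the estimate uniform in time.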
We use standard estimates on the tri-linear term $\langle B(u, v), w \rangle_{L^2}$ specialized to our two dimensional setting. The proof can be found in [CF88] for example.

Lemma A.3. Let $\alpha, \beta, \gamma$ be positive real numbers such that $\alpha + \beta + \gamma \ge 1$ and $(\alpha, \beta, \gamma) \ne (0,0,1)$, $(0,1,0)$, or $(1,0,0)$. Then
\[ |\langle B(u, v), w \rangle_{L^2}| \le C\, |\Lambda^\alpha u|_{L^2}\, |\Lambda^{\beta+1} v|_{L^2}\, |\Lambda^\gamma w|_{L^2}. \]

Using this lemma we prove the following estimate specialized to the two dimensional setting with periodic boundary conditions.

Lemma A.4. Let $\{e_k, k \in \mathbb{Z}^2\}$ be a basis for $L^2$. Consider a splitting of $L^2 = L^2_\ell + L^2_h$. Let $N^+ = \sup\{ |k| : \exists\, e_k \text{ with } e_k \in L^2_\ell \}$ and let $P_\ell$ be the projector onto $L^2_\ell$. If $u, v \in L^2$ then
\[ |P_\ell B(u, v)| \le C (N^+)^3\, |u|_{L^2}\, |v|_{L^2}. \]
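Proof of Lemma A.4. In the periodic setting, $P_\ell$, $P_{div}$, and $(-\Delta)^s$ all are simply Fourier multipliers and hence commute with one another. Recall that $B(u, v) = P_{div}(u \cdot \nabla)v$ and hence,
\[
\begin{aligned}
|P_\ell B(u, v)| &= \sup_{w \in L^2,\, |w| = 1} |\langle P_\ell B(u, v), w \rangle_{L^2}| = \sup_{w \in L^2,\, |w| = 1} |\langle B(u, v), P_\ell w \rangle_{L^2}| \\
&= \sup_{w \in L^2,\, |w| = 1} |\langle B(u, P_\ell w), v \rangle_{L^2}| \le C\, |u|_{L^2} |v|_{L^2} \sup_{w \in L^2,\, |w| = 1} |\Lambda^3 P_\ell w|_{L^2} \\
&\le C (N^+)^3\, |u|_{L^2} |v|_{L^2} \sup_{w \in L^2,\, |w| = 1} |w|_{L^2} \le C (N^+)^3\, |u|_{L^2} |v|_{L^2}.
\end{aligned}
\]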
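Lemma A.5. Fix any $\delta > \frac{1}{2}$, $a \in (0,1)$ and $C_0 > 0$. Let $u(t) = \varphi_t^\omega u_0$. There exists a $K_1 > 0$ such that whenever $|u_0|_{L^2}^2 < C_0$,
\[ \mathbb{P}\Big\{ |u(t)|_{L^2}^2 + 2\nu \int_0^t |\Lambda u(s)|_{L^2}^2\,ds \le C_0 + E_0 t + K_1 (t+1)^\delta \text{ for all } t \ge 0 \Big\} \ge 1 - a. \]

Proof of Lemma A.5. The energy equation reads
\[ |u(t)|_{L^2}^2 + 2\nu \int_0^t |\Lambda u(s)|_{L^2}^2\,ds = |u_0|_{L^2}^2 + E_0 t + \int_0^t \langle u(s), dW(s) \rangle_{L^2}. \]
Since $|u_0|_{L^2}^2 < C_0$, all we need to show is that
\[ \mathbb{P}\big\{ M_t \le K_1 (t+1)^\delta \text{ for } t \ge 0 \big\} \ge 1 - a \]
for $K_1$ large enough, where $M_t = \int_0^t \langle u(s), dW(s) \rangle_{L^2}$. The quadratic variation $[M, M]_t$ can be calculated and one sees that
\[ [M, M]_t \le \sigma_{\max}^2 \int_0^t |u(s)|_{L^2}^2\,ds, \]
and hence
\[ ([M, M]_t)^p \le \sigma_{\max}^{2p} \Big( \int_0^t |u(s)|_{L^2}^2\,ds \Big)^p \le \sigma_{\max}^{2p}\, t^{p-1} \int_0^t |u(s)|_{L^2}^{2p}\,ds. \]
From Corollary A.2, we know that if $|u(0)|_{L^2}^2 < C_0$, then there exists a constant $C_p(C_0)$ so that $\mathbb{E}|u(t)|_{L^2}^{2p} \le C_p$ for all $t \ge 0$ and $p \ge 1$. Now define the events
\[ A_k = \Big\{ \sup_{s \in [0,k]} |M_s| > K_1 k^\delta \Big\}. \]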
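By the Doob–Kolmogorov martingale inequality we have
\[ \mathbb{P}\{A_k\} \le \frac{\mathbb{E}([M, M]_k)^p}{K_1^{2p} k^{2p\delta}} \le \frac{\sigma_{\max}^{2p}\, C_p\, k^p}{K_1^{2p} k^{2p\delta}}. \]
Lastly observe that
\[ \mathbb{P}\big\{ M_t \le K_1 (t+1)^\delta \big\} \ge 1 - \mathbb{P}\Big\{ \bigcup_k A_k \Big\} \ge 1 - \sum_k \mathbb{P}\{A_k\}. \]
By the previous estimate on $\mathbb{P}\{A_k\}$, for any $\delta > \frac{1}{2}$ we see that the sum is finite for $p$ sufficiently large. Specifically, we need $\delta > \frac{1}{2}\big(1 + \frac{1}{p}\big)$. Lastly, the sum can be made as small as we want by increasing $K_1$.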
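The two closing observations of the proof can be checked with a small sketch. The first function searches for the smallest moment order $p$ satisfying $\delta > \frac{1}{2}(1 + \frac{1}{p})$, which is the condition for $\sum_k k^{p - 2p\delta}$ to converge; the second evaluates a partial sum of the bound on $\sum_k \mathbb{P}\{A_k\}$. The function names, the search cap `p_max`, and all numerical inputs are hypothetical.

```python
def min_moment_order(delta, p_max=10**6):
    """Smallest integer p >= 1 with delta > (1 + 1/p) / 2.

    This is the condition under which sum_k k**(p - 2*p*delta) converges.
    """
    if delta <= 0.5:
        raise ValueError("need delta > 1/2")
    for p in range(1, p_max + 1):
        if delta > 0.5 * (1.0 + 1.0 / p):
            return p
    raise ValueError("no admissible p found below p_max")


def union_bound(K1, delta, p, sigma_max, Cp, n_terms):
    """Partial sum over k = 1..n_terms of
    sigma_max**(2p) * Cp * k**p / (K1**(2p) * k**(2*p*delta)).
    """
    prefactor = sigma_max ** (2 * p) * Cp / K1 ** (2 * p)
    return prefactor * sum(k ** (p * (1.0 - 2.0 * delta))
                           for k in range(1, n_terms + 1))


if __name__ == "__main__":
    p = min_moment_order(0.8)
    print(p)
    print(union_bound(1.0, 0.8, p, 1.0, 1.0, 1000))
    print(union_bound(2.0, 0.8, p, 1.0, 1.0, 1000))
```

Since $K_1$ enters only through the factor $K_1^{-2p}$, doubling $K_1$ shrinks the bound by $2^{2p}$, which is how the failure probability is pushed below any prescribed $a$.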
B. Properties of Stationary Measures

We now establish a number of properties, derived from the dynamics, which any stationary measure must possess.

Lemma B.1. For any stationary measure all energy moments are finite. In fact for any $p \ge 1$ there exists a constant $C_p < \infty$ such that
\[ \int_{L^2} |u|_{L^2}^{2p}\,d\mu(u) < C_p \]
for all stationary measures $\mu$. In particular $C_1 = \frac{E_0}{2\nu}$.

Proof. We will consider the case when $p = 1$. The other cases follow by the same method. For any $\varepsilon > 0$ there exists a $b_\varepsilon$ such that $\mu\{ u \in L^2 : |u|_{L^2}^2 \le b_\varepsilon \} > 1 - \varepsilon$. Let $B_\varepsilon$ denote $\{ u \in L^2 : |u|_{L^2}^2 \le b_\varepsilon \}$. For any $H > 0$ and $t > 0$, we have
\[
\begin{aligned}
\int_{L^2} \big( |u|_{L^2}^2 \wedge H \big)\,d\mu(u) &= \int_{L^2} \mathbb{E}\big( |\varphi^\omega_{0,t} u|_{L^2}^2 \wedge H \big)\,d\mu(u) \\
&\le H\varepsilon + \int_{B_\varepsilon} \mathbb{E}\big( |\varphi^\omega_{0,t} u|_{L^2}^2 \wedge H \big)\,d\mu(u) \\
&\le H\varepsilon + \int_{B_\varepsilon} \mathbb{E}|\varphi^\omega_{0,t} u|_{L^2}^2\,d\mu(u).
\end{aligned}
\]
Applying the first bound in Corollary A.2 gives
\[ \int_{L^2} \big( |u|_{L^2}^2 \wedge H \big)\,d\mu(u) \le H\varepsilon + \frac{E_0}{2\nu} + e^{-2\nu t}\Big( b_\varepsilon - \frac{E_0}{2\nu} \Big). \]
Taking the limit as $t \to \infty$ and then observing that $\varepsilon$ was arbitrary, we obtain
\[ \int_{L^2} \big( |u|_{L^2}^2 \wedge H \big)\,d\mu(u) = \int_U \big( |u|_{L^2}^2 \wedge H \big)\,d\mu(u) \le \frac{E_0}{2\nu}. \]
Taking $H \to \infty$ gives that the energy of any stationary measure is bounded by $\frac{E_0}{2\nu}$. The argument for higher moments of the energy is the same.
Lemma B.2. For any stationary measure $\mu$,
\[ \int_{L^2} |\Lambda u|_{L^2}^2\,d\mu(u) = \frac{E_0}{2\nu}. \]
In addition if the forcing is such that $E_1 < \infty$ then
\[ \int_{L^2} |\Lambda^2 u|_{L^2}^2\,d\mu(u) = \frac{E_1}{2\nu} \]
and
\[ \int_{L^2} |\Lambda u|_{L^2}^{2p}\,d\mu(u) < C_1(p) < \infty \]
for all $p \ge 1$.

Proof. Using Eq. (27), we have that for any initial condition $u_0 \in L^2$,
\[ \mathbb{E}|\varphi_{0,t} u_0|_{L^2}^2 + 2\nu \int_0^t \mathbb{E}|\Lambda \varphi_{0,s} u_0|_{L^2}^2\,ds = |u_0|_{L^2}^2 + E_0 t. \]
Here we have switched the time integral and the expectation by the Fubini–Tonelli theorem because the integrand is non-negative. We know from Lemma B.1 that any stationary measure has finite energy moments. Hence averaging with respect to the stationary measure gives
\[ \int_{L^2} \mathbb{E}|\varphi_{0,t} u_0|_{L^2}^2\,d\mu(u_0) + 2\nu \int_{L^2} \int_0^t \mathbb{E}|\Lambda \varphi_{0,s} u_0|_{L^2}^2\,ds\,d\mu(u_0) = \int_{L^2} |u_0|_{L^2}^2\,d\mu(u_0) + E_0 t. \]
Because $\mu$ was a stationary measure, we have that
\[ \int_{L^2} \mathbb{E}|\varphi_{0,t} u_0|_{L^2}^2\,d\mu(u_0) = \int_{L^2} |u_0|_{L^2}^2\,d\mu(u_0) \]
and
\[ \int_{L^2} \int_0^t \mathbb{E}|\Lambda \varphi_{0,s} u_0|_{L^2}^2\,ds\,d\mu(u_0) = t \int_{L^2} |\Lambda u_0|_{L^2}^2\,d\mu(u_0). \]
Hence $2\nu \int_{L^2} |\Lambda u_0|_{L^2}^2\,d\mu(u_0) = E_0$, concluding the proof of the first claim.

We now turn to the enstrophy moments. By the first part of this lemma, we know that there exists a $U \subset H^1$ such that $\mu(U) = 1$. We now can proceed just as in Lemma B.1 to prove that all of the enstrophy moments are finite. To find the expected value of the $H^2$ norm we use Eq. (28). Then we proceed exactly as we did to obtain the expected value of the enstrophy (the $H^1$ norm).

Lemma B.3. Let $\mu_p$ be the measure induced on $C((-\infty, 0], L^2_\ell)$ by any given stationary measure $\mu$. Fix any $K_0 > 0$ and $\delta > \frac{1}{2}$. Then for $\mu_p$-almost every trajectory $v(s)$ in $C((-\infty, 0], L^2_\ell)$, there exists a constant $T$ such that for $s \le 0$,
\[ |v(s)|_{L^2}^2 \le E_0 + K_0 \min(T, |s|)^\delta. \]
Proof. The basic energy estimate, derived from (29), reads:
\[ |v(t)|_{L^2}^2 = |v(t_0)|_{L^2}^2 + E_0 (t - t_0) - 2\nu \int_{t_0}^t |\Lambda v(s)|_{L^2}^2\,ds + \int_{t_0}^t \langle v(s), dW(s) \rangle_{L^2}, \]
for any $t_0 < t \le 0$. There is no problem writing the integration against the Wiener path in the above integral. Our stochastic PDE had pathwise defined solutions. Therefore if we know the initial condition $v(t_0)$ and the trajectory of $v(s)$ for $s \in [t_0, t]$ the increments of the Wiener process on the interval $[t_0, t]$ are uniquely defined. For any $k \ge 1$, the above estimate implies
\[ \sup_{s \in [-k, -k+1]} |v(s)|_{L^2}^2 \le |v(-k)|_{L^2}^2 + E_0 + \sup_{s \in [-k, -k+1]} F_k(s), \]
where $F_k(s) = -2\nu \int_{-k}^s |\Lambda v(r)|_{L^2}^2\,dr + M_k(s)$ and $M_k(s) = \int_{-k}^s \langle v(r), dW(r) \rangle_{L^2}$. Now define
\[ A_k = \Big\{ v(s) : \sup_{s \in [-k, -k+1]} |v(s)|_{L^2}^2 \le E_0 + K_0 |k-1|^\delta \Big\} \]
and $U_T = \bigcap_{k > T} A_k$. Since the $U_T$ are an increasing collection of sets it will be sufficient to prove that $\lim_{T \to \infty} \mu_p(U_T) = 1$. This is the same as showing that $\lim_{T \to \infty} \mu_p(U_T^c) = 0$. Now since $\mu_p(U_T^c) \le \sum_{k > T} \mu_p(A_k^c)$, we need only to show that $\sum_{k > 0} \mu_p(A_k^c) < \infty$:
\[ \mu_p(A_k^c) \le \mu_p\Big\{ v(s) : |v(-k)|_{L^2}^2 \ge \frac{K_0}{2} |k-1|^\delta \Big\} + \mu_p\Big\{ v(s) : \sup_{s \in [-k, -k+1]} F_k(s) \ge \frac{K_0}{2} |k-1|^\delta \Big\}. \]
The first term is the most straightforward. Lemma B.2 implies that the second moment of the energy is uniformly bounded by some constant $C_2$. Hence Chebyshev's inequality produces
\[ \mu_p\Big\{ v(s) : |v(-k)|_{L^2}^2 \ge \frac{K_0}{2} |k-1|^\delta \Big\} \le \frac{4\, \mathbb{E}|v(-k)|_{L^2}^4}{K_0^2 |k-1|^{2\delta}} \le \frac{4 C_2}{K_0^2 |k-1|^{2\delta}} \]
which is summable as long as $\delta > \frac{1}{2}$. The second term proceeds in the same way but with Chebyshev's inequality replaced by the exponential martingale estimate. The exponential martingale inequality controls the size of a martingale minus something proportional to its quadratic variation (see [RY94, Mao97] for example). The details are given in the following.

The key observation is that we can control $F_k(s)$ by controlling $M_k(s) - \alpha [M_k, M_k](s)$, where $[M_k, M_k](s)$ is the quadratic variation of the martingale $M_k(s)$ and $\alpha$ is a constant we will choose presently. First notice that with probability one,
\[ [M_k, M_k](s) = \sum_l |\sigma_l|^2 \int_{-k}^s |v_l(r)|^2\,dr \le \sigma_{\max}^2 \int_{-k}^s |v(r)|_{L^2}^2\,dr \le \sigma_{\max}^2 \int_{-k}^s |\Lambda v(r)|_{L^2}^2\,dr \]
and hence
\[ F_k(s) \le M_k(s) - \frac{2\nu}{\sigma_{\max}^2} [M_k, M_k](s) \]
almost surely. In this setting, the exponential martingale inequality states that for positive $\alpha$ and $\beta$,
\[ \mathbb{P}\Big\{ \sup_{s \in [-k, 0]} \Big( M_k(s) - \frac{\alpha}{2} [M_k, M_k](s) \Big) > \beta \Big\} \le e^{-\alpha\beta}. \]
Taking $\alpha = \frac{4\nu}{\sigma_{\max}^2}$ we find
\[ \mu_p\Big\{ v(s) : \sup_{s \in [-k, -k+1]} F_k(s) \ge \frac{K_0}{2} |k-1|^\delta \Big\} \le \exp\Big( -\frac{2\nu K_0}{\sigma_{\max}^2} |k-1|^\delta \Big). \]
Since this is summable for any $\delta > 0$, the proof is complete.
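The two tail estimates in the proof of Lemma B.3 can be compared numerically. The sketch below evaluates the Chebyshev term $4C_2 / (K_0^2 |k-1|^{2\delta})$ and the exponential martingale term $\exp(-2\nu K_0 |k-1|^\delta / \sigma_{\max}^2)$ for $k \ge 2$; all function names and the sample constants are hypothetical placeholders, not values from the paper.

```python
import math


def chebyshev_term(k, K0, C2, delta):
    """Chebyshev bound 4*C2 / (K0**2 * |k-1|**(2*delta)); requires k >= 2."""
    return 4.0 * C2 / (K0 ** 2 * abs(k - 1) ** (2.0 * delta))


def martingale_term(k, K0, nu, sigma_max, delta):
    """Exponential martingale bound exp(-2*nu*K0*|k-1|**delta / sigma_max**2)."""
    return math.exp(-2.0 * nu * K0 / sigma_max ** 2 * abs(k - 1) ** delta)


def tail_bound(K0, C2, nu, sigma_max, delta, k_min, k_max):
    """Partial sum of both bounds over k = k_min..k_max (k_min >= 2)."""
    return sum(
        chebyshev_term(k, K0, C2, delta)
        + martingale_term(k, K0, nu, sigma_max, delta)
        for k in range(k_min, k_max + 1)
    )


if __name__ == "__main__":
    # Hypothetical constants: K0 = 1, C2 = 1, nu = 1, sigma_max = 1, delta = 0.8.
    for k_min in (2, 10, 100):
        print(k_min, tail_bound(1.0, 1.0, 1.0, 1.0, 0.8, k_min, 10000))
```

The polynomially decaying Chebyshev term dominates the tail, which is why the restriction $\delta > \frac{1}{2}$ comes from the first term while the second is summable for every $\delta > 0$.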
C. Control of High Modes by Low Modes

Lemma C.1. If $h(t)$ is the solution to (12) with some low mode forcing $\ell \in C([0,t], L^2_\ell)$, then $\sup_{s \in [0,t]} |h(s)|_{L^2}$ is bounded by a constant depending on $|h(0)|_{L^2}$ and $\int_0^t |\ell|_{L^2}^4\,ds$.

Proof. Taking the inner product of (12) with $h$ produces
\[ \frac{1}{2} \frac{d}{dt} |h(t)|_{L^2}^2 = -\nu |\Lambda h|_{L^2}^2 + \langle P_h B(h, \ell), h \rangle_{L^2} + \langle P_h B(\ell, \ell), h \rangle_{L^2} \]
because $\langle P_h B(\ell, h), h \rangle_{L^2} = \langle P_h B(h, h), h \rangle_{L^2} = 0$. Next using Lemma A.3 produces,
\[
\begin{aligned}
\frac{1}{2} \frac{d}{dt} |h(t)|_{L^2}^2 &\le -\nu |\Lambda h|_{L^2}^2 + C\, |h|_{L^2} |\Lambda h|_{L^2} |\Lambda \ell|_{L^2} + C\, |\Lambda h|_{L^2} |\Lambda \ell|_{L^2}^2 \\
&\le \frac{C}{2\nu} |h|_{L^2}^2 |\Lambda \ell|_{L^2}^2 + \frac{C}{2\nu} |\Lambda \ell|_{L^2}^4.
\end{aligned}
\]
Since $\ell \in L^2_\ell$ we have $|\Lambda \ell|_{L^2} \le (N^+) |\ell|_{L^2}$, where $N^+ = \sup\{ |k| : \exists\, e_k \text{ with } e_k \in L^2_\ell \}$, and hence after applying Gronwall's Lemma we have
\[ |h(t)|_{L^2}^2 \le C_1 |h(0)|_{L^2}^2 \exp\Big( a_1 \int_0^t |\ell|_{L^2}^2\,ds \Big) + C_2 \int_0^t |\ell|_{L^2}^4\,ds\, \exp\Big( a_1 \int_0^t |\ell|_{L^2}^2\,ds \Big). \]
Since by the Hölder inequality,
\[ \int_0^t |\ell|_{L^2}^2\,ds \le \Big( t \int_0^t |\ell|_{L^2}^4\,ds \Big)^{1/2}, \]
the proof is complete.

Acknowledgements. The authors would like to thank Gérard Ben Arous, Amir Dembo, Persi Diaconis, Yitzhak Katznelson, Di Liu, George Papanicolaou and Andrew Stuart for useful discussions. The work of the first author is partially supported by a Presidential Faculty Fellowship from the NSF. The work of the second author is partially supported by NSF grant DMS-9971087. The work of the third author is partially supported by NSF grant DMS-9706794 and RFFI grant 99-01-00314.
References

[BKL] Bricmont, J., Kupiainen, A., and Lefevere, R.: Preprint
[CDF97] Crauel, H., Debussche, A., and Flandoli, F.: Random attractors. J. Dynam. Diff. Eqs. 9, no. 2, 307–341 (1997)
[CF88] Constantin, P. and Foiaş, C.: Navier–Stokes equations. Chicago: University of Chicago Press, 1988
[EMatt] E, W. and Mattingly, J.C.: Ergodicity for the Navier–Stokes Equation with Degenerate Random Forcing: Finite Dimensional Approximation. Submitted
[EM] Eckmann, J.P., and Hairer, M.: Uniqueness of the invariant measure for a stochastic PDE driven by degenerate noise. Preprint
[EFNT94] Eden, A., Foias, C., Nicolaenko, B., and Temam, R.: Exponential attractors for dissipative evolution equations. Research in Applied Mathematics, New York: John Wiley and Sons and Masson, 1994
[Fer97] Ferrario, B.: Ergodic results for stochastic Navier–Stokes equation. Stochastics and Stochastics Rep. 60, no. 3–4, 271–288 (1997)
[Fla94] Flandoli, F.: Dissipativity and invariant measures for stochastic Navier–Stokes equations. NoDEA 1, 403–426 (1994)
[FM95] Flandoli, F. and Maslowski, B.: Ergodicity of the 2-D Navier–Stokes equation under random perturbations. Commun. Math. Phys. 171, 119–141 (1995)
[FMRT] Foias, C., Manley, O., Rosa, R., Temam, R.: Navier–Stokes Equations and Turbulence. To be published
[IN] Ito, K., Nisio, M.: On stationary solutions of a stochastic differential equation. J. Math. Kyoto Univ. 4, 1–75 (1964)
[KS] Kuksin, S. and Shirikyan, A.: Stochastic Dissipative PDE’s and Gibbs Measures. Commun. Math. Phys. 213, 291–330 (2000)
[Mao97] Mao, X.: Stochastic differential equations and their applications. Horwood Series in Mathematics & Applications, Chichester: Horwood Publishing Limited, 1997
[Mat98] Mattingly, J.C.: The stochastically forced Navier–Stokes equations: Energy estimates and phase space contraction. Ph.D. thesis, Princeton University, 1998
[Mat99] Mattingly, J.C.: Ergodicity of 2D Navier–Stokes equations with random forcing and large viscosity. Commun. Math. Phys. 206, no. 2, 273–288 (1999)
[Mat00] Mattingly, J.C.: Exponential convergence for the stochastically forced Navier–Stokes equations and other partially dissipative dynamics. Submitted
[RY94] Revuz, D. and Yor, M.: Continuous martingales and Brownian motion. Second ed., Grundlehren der Mathematischen Wissenschaften, Vol. 293, Berlin: Springer-Verlag, 1994
[Str82] Stroock, D.W.: Lectures on topics in stochastic differential equations. Bombay: Tata Institute of Fundamental Research, 1982, with notes by Satyajit Karmakar
[SV79] Stroock, D.W. and Varadhan, S.R.S.: Multidimensional diffusion processes. Berlin: Springer-Verlag, 1979
[VF88] Vishik, M. and Fursikov, A.: Mathematical problems of statistical hydrodynamics. Dordrecht: Kluwer Academic Publishers, 1988

Communicated by G. Gallavotti
Commun. Math. Phys. 224, 107 – 112 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Counting Phase Space Cells in Statistical Mechanics Giovanni Gallavotti Fisica, I.N.F.N., Università di Roma “La Sapienza”, P. le Moro 2, 00185 Roma, Italy. E-mail:
[email protected] Received: 16 November 2000 / Accepted: 22 April 2001
To Joel L. Lebowitz on his 70th birthday Abstract: The problem of counting the number of phase space cells is analyzed with the purpose of interpreting the variational principle for the SRB statistics as an equidistribution property, in equilibrium as well as in nonequilibrium statistical mechanics. 1. Phase Space Cells When Volume is Not Conserved. Variational Properties of Stationary States Consider a transitive Anosov map S on a bounded surface M (“phase space”) modeling, for instance, a simple gas of identical particles subject to nonconservative external forces and to thermostating forces balancing them in the average: a simple but important example that seems well modeled in this way can be found in [CELS93]. Such models would be “typical” for non equilibrium systems if one accepted the chaotic hypothesis, see [GC95a], and Ch. 9 in [Ga99]; for a general discussion see [Ru99]. The general theory of Anosov systems, see [Si68], implies the existence of a “statistics” µSRB describing the asymptotic behavior of almost all initial data in phase space (in the sense of the Liouville measure). This means thatexcept for a volume zero set of ini −1 tial data x it will be limT →∞ T −1 Tj =0 F (S j x) = µSRB (dy)F (y) for all continuous functions (“observables”) F on M. The SRB distribution admits a rather simple representation which can be interpreted in terms of “coarse graining” of the phase space, and it is convenient to introduce it at this point for later use. Let P be a “Markov partition” of phase space P = (P1 , . . . , Pm ) with sets Pσ , see (for instance) Ch. 9 in [Ga99]. Let T be a time such that the size of the Eσ−T /2 ,... ,σT /2 ∈ def T /2 P T = −T /2 S j P, σj = 1, 2, . . . , m, is so small that the physically interesting observables can be viewed as a constant inside Eσ−T /2 ,... ,σT /2 = E( σ ). Then the SRB Work partially supported by IHES and Rutgers University
probability $\mu(\sigma)$ of $E(\sigma)$ and the Liouville distribution are described in terms of the functions
$$\lambda^1_u(x) = \log|\det(\partial S)_u(x)|, \qquad \lambda^1_s(x) = \log|\det(\partial S)_s(x)|, \tag{1}$$
where $(\partial S)_u(x)$ (resp. $(\partial S)_s(x)$) is the Jacobian of the evolution map $S$ restricted to the unstable (stable) manifold through $x$ and mapping it to the unstable (stable) manifold through $Sx$. Defining $U^{T/2}_{u,\pm}(x) = \sum_{j=0}^{\pm T/2}\lambda^1_u(S^j x)$ and $U^{T/2}_{s,\pm}(x) = \sum_{j=0}^{\pm T/2}\lambda^1_s(S^j x)$ and selecting a point $x(\sigma) \in E(\sigma)$ for each $\sigma$, the SRB distribution and the volume distribution $\mu_L$, on the phase space $M$, which we suppose to have volume $W = V(M)$, attribute to the nonempty sets $E(\sigma)$ the probabilities
$$\begin{aligned}
\mu(\sigma) &\stackrel{def}{=} \mu_{SRB}(E(\sigma)) = h^T_{u,u}(\sigma)\cdot\exp\big(-U^{T/2}_{u,-}(x(\sigma)) - U^{T/2}_{u,+}(x(\sigma))\big),\\
\mu_L(\sigma) &\stackrel{def}{=} V(E(\sigma))/W = h^T_{s,u}(\sigma)\cdot\exp\big(-U^{T/2}_{s,-}(x(\sigma)) - U^{T/2}_{u,+}(x(\sigma))\big),
\end{aligned} \tag{2}$$
where $V(E)$ is the Liouville volume of $E$, and $h^T_{u,u}(\sigma)$, $h^T_{s,u}(\sigma)$ are suitable functions of $\sigma$ uniformly bounded as $\sigma$, $T$ vary, cf. Ch. 9 in [Ga99]. One can read (2) by saying that the “difference” between the Liouville volume and the SRB volume is that the first weighs asymmetrically the past and the future while the second weighs them symmetrically.

As mentioned above we have in mind that the sets $E(\sigma)$ represent macroscopic states, being small enough so that the physically interesting observables have a constant value inside them; and we would like to think that they provide us with a model for a “coarse grained” description of the microscopic states. The dynamics will, in general, be nonconservative: hence the phase space volume will generally contract under time evolution. We want to describe the time evolution in terms of evolution of microscopic states, with the aim of counting the microscopic states relevant for a given stationary state of the system, i.e. for the SRB distribution. Therefore we divide phase space, supposed of dimension $d$, into parallelepipedal cells of size $\varepsilon^d \ll V(E(\sigma))$ and try to discuss time evolution in terms of them.

This is a situation that arises in computer simulations, where the cells are the computer points with coordinates given by a set of integers and the evolution $S$ is a program or code (simulating the solution of equations of motion suitable for the model under study) which operates exactly on the coordinates (i.e. imagining that the deterministic round offs are part of the program). It is clear, or at least it is a widely held belief, that the simulation will produce a chaotic evolution “for all practical purposes”, i.e. if we only look at “macroscopic observables” on the coarse graining scale $e^{-\lambda T}\ell_0$ of the partition $\mathcal P_T$, if $\ell_0$ is the phase space size (see Footnote 1), $\ell_0 = W^{1/d}$, and $-\lambda$ is the most contractive line element exponent, or even at finer observables corresponding to a finer coarse graining, which are constant on elements of the pavement $\mathcal P_{T'}$ for $T' > T$: provided the latter size is greater than the size $\varepsilon$ of the cells $\Delta$: $T' < \bar T$ with $e^{-\lambda\bar T}\ell_0 \ge \varepsilon$.

Footnote 1: Here the phase space size $\ell_0$ should be thought of as measured in dimensionless units, i.e. in terms of the sizes $\delta p$, $\delta q$ in momentum and position of the cells $\Delta$. Assuming that we consider $N$ mass-$m$ particles in a gas at temperature $T$ and density $\rho$, so that $d = 6N$, then $W = \ell_0^{6N}$ with $\ell_0$ proportional to $(\rho^{-1/3}\sqrt{2mk_BT}/\delta p\,\delta q)^{1/2}$.

The question we ask on general grounds is, see also [Ga95]:
Question: Can we count the number of ways in which the asymptotic state of the system can be realized microscopically?

In equilibrium the (often) accepted answer is simple: the number is $\mathcal N_0 = W/\varepsilon^d$, i.e. just the number of cells (“ergodic hypothesis”). This means that we think that our program will generate a one cycle permutation of the $\mathcal N_0$ cells $\Delta$, each of which is therefore representative of the equilibrium state. Average values of macroscopic observables will be obtained simply as:
$$\lim_{N\to\infty} N^{-1}\sum_{j=0}^{N-1} F(S^j x) = \mathcal N_0^{-1}\sum_{\Delta} F(\Delta) = \int_M F(y)\,\mu_L(dy). \tag{3}$$
According to Boltzmann the quantity:
$$S_B \stackrel{def}{=} \log(W/\varepsilon^d) \tag{4}$$
is then, see [Bo77] (where however w’s denote integers rather than phase space volumes), proportional to the physical entropy of our equilibrium system.

Can one extend the above view to systems out of equilibrium? In such systems the volume will no longer be preserved by time evolution and, in fact, its contraction rate
$$\eta(x) = -\log|\det\partial S(x)| \tag{5}$$
not only does not vanish but, in general, will have a positive time average $\eta_+$, $\eta_+ = \lim_{N\to\infty} N^{-1}\sum_{j=0}^{N-1}\eta(S^j x) = \int_M \eta(y)\,\mu_{SRB}(dy)$, see [Ru96]. If $\eta_+ > 0$ the volume will contract indefinitely (hence the system is called dissipative).

Out of equilibrium we may imagine that a similar kind of “ergodicity” holds: namely that the cells that represent the stationary state form a subset of all the cells, on which evolution acts as a one cycle permutation. If so the statistical properties of motions will be determined by the equidistribution among such cells, which thus attributes probabilities $\rho(\Delta)$ which maximize the quantity $-\sum_\Delta \rho(\Delta)\log\rho(\Delta)$. Hence the above counting question can be related to a problem “... which necessarily follows from Boltzmann’s train of thought, [and] has remained untouched. Consider an irreversible process which, with fixed outside constraints, is passing by itself from the nonstationary to the stationary state. Can we characterize in any sense the resulting distribution of state as the “relatively most probable distribution”, and can this be given in terms of the minimum of a function which can be regarded as the generalization...”, [EE11], footnote 239, p. 103.

Before proceeding it is convenient to note a nontrivial relation between $\eta$ and $\lambda^1_u$, $\lambda^1_s$ valid for all $T > 0$,
$$\sum_{j=-T/2}^{T/2}\eta(S^j x) = \sum_{\alpha=\pm}\big(U^{T/2}_{u,\alpha}(x) + U^{T/2}_{s,\alpha}(x)\big) + O(1), \tag{6}$$
see Eq. (1), with the error $O(1)$ being uniformly bounded in $T$ and $x$: this is a property which is obtained in proving (2), see [Si68] and [Ga99], Chap. 9.

Considering simulations of a dissipative system we must recognize that no code can be an invertible code: it must happen (many times) that $S\Delta = S\Delta'$ with $\Delta \ne \Delta'$. Clearly
if $S\Delta = S\Delta' = \tilde\Delta$ we can think that both $\Delta$ and $\Delta'$ are not really different and only one of the two can be taken as a representative of the microscopic state. We can imagine “pruning” one after the other the “unnecessary” cells until the map $S$ becomes invertible. More formally each cell will have a motion that is eventually periodic and we discard as “transients” all cells whose evolution is not strictly periodic. The remaining cells will form a discrete model of the attractor.

The above question now becomes a precise one: what is the number $\mathcal N$ of leftover cells? It will be only a fraction of the initial number $\mathcal N_0$ of cells: and we can attempt to estimate it assuming that the evolution is a one cycle permutation of them. The number $\mathcal N(\sigma)$ of cells left over in $E(\sigma)$ will have to be proportional to the SRB probability $\mu(\sigma)$ of $E(\sigma)$, otherwise the time average of the observables (i.e. the SRB average introduced above) will not be correctly given by the sum over the cells. The pruning process just described will have to leave $\mathcal N \le \mathcal N_0$ cells; and furthermore inside each “coarse grain” set $E(\sigma)$ a number of cells equal to
$$\mathcal N(\sigma) = \mathcal N\,\mu_{SRB}(E(\sigma)). \tag{7}$$
If $V(\sigma)$ is the volume of $E(\sigma)$, so that $\sum_\sigma V(\sigma) = W$, it must be:
$$V(\sigma)/\varepsilon^d \ge \mathcal N(\sigma) = \mathcal N\,\mu_{SRB}(E(\sigma)) \tag{8}$$
for all $\sigma$’s. This gives, using that $W = \varepsilon^d\mathcal N_0$:
$$\mathcal N \le \mathcal N_0\,\min_\sigma \frac{V(\sigma)/\varepsilon^d}{\mathcal N_0\,\mu_{SRB}(E(\sigma))}. \tag{9}$$
The quantity $\bar\eta = \max_\sigma \frac{2}{T}\big(U^{T/2}_{s,-}(x(\sigma)) - U^{T/2}_{u,-}(x(\sigma))\big)$ differs by $O(T^{-1})$ from the maximal average phase space contraction $\max_{x\in\text{attractor}} -\frac{2}{T}\log|\det\partial S^{T/2}(x)|$, and Eqs. (2), (6) give
$$\mathcal N \le \mathcal N_0\,e^{-\frac12 T\bar\eta + O(1)}, \tag{10}$$
where the $O(1)$ is uniform in $T$, and $\bar\eta$ can be identified with the infinite time average $\eta_+$ of the phase space contraction rate $-\log|\det\partial S(x)|$.

The picture must hold for all Markovian pavements $\mathcal P$ and for all $T$’s such that $e^{-\lambda T}\delta > \varepsilon$ if $\delta \sim \ell_0$ is the typical size of an element of the partition $\mathcal P$: this restricts $T$ to be of the order of $\bar T = \lambda^{-1}\log(\ell_0/\varepsilon)$. And, as in equilibrium, once this requirement is fulfilled we shall think that $\mathcal N$ has the maximal allowed value, i.e. that in (10) the inequality is saturated for $T = \bar T$. This is a kind of “ergodicity” assumption which is similar to the corresponding assumption that in equilibrium all cells are actually visited (while assuming that only a fraction of them is visited would give the same statistics as long as the fraction is taken to be the same in each coarse grain volume, but a different cell count hence a different entropy assignment).

We call $-\lambda_i^-$, $\lambda_i^+$ the Lyapunov exponents, $\lambda_i^\alpha > 0$, $i = 1, 2, \ldots, d/2$ (equal in number by the transitivity assumption), so that $\lambda \le \min_i\lambda_i^-$, $d\lambda \ge \bar\eta \ge \eta_+ = \sum_i(\lambda_i^- - \lambda_i^+)$, and define:
$$S_{cells} = \log\mathcal N = \log\mathcal N_0 - \frac12\frac{\bar\eta}{\lambda}\log\frac{\ell_0}{\varepsilon} = (\log\mathcal N_0)\Big(1 - \frac{1}{2d}\frac{\bar\eta}{\lambda}\Big). \tag{11}$$
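As a quick numerical check of Eq. (11), the following Python sketch evaluates both forms of the cell count, using $\mathcal N_0 = (\ell_0/\varepsilon)^d$. The function name `cell_entropy` and the sample parameter values are assumptions chosen for illustration only; they do not come from the paper.

```python
import math

def cell_entropy(d, ell0, eps, eta, lam):
    """Evaluate Eq. (11) in its two equivalent forms.

    Assumes N0 = (ell0 / eps) ** d, i.e. log N0 = d * log(ell0 / eps).
    Returns (log N0, first form, second form).
    """
    log_ratio = math.log(ell0 / eps)
    log_n0 = d * log_ratio
    form_a = log_n0 - 0.5 * (eta / lam) * log_ratio      # log N0 - (eta / 2 lam) log(ell0/eps)
    form_b = log_n0 * (1.0 - eta / (2.0 * d * lam))      # (log N0) (1 - eta / (2 d lam))
    return log_n0, form_a, form_b

# Illustrative (assumed) values: d = 4, ell0 = 1, eps = 1e-3, eta = 0.3, lam = 1.2.
log_n0, s_a, s_b = cell_entropy(d=4, ell0=1.0, eps=1e-3, eta=0.3, lam=1.2)
print(log_n0, s_a, s_b)
```

The two forms agree because $\log\mathcal N_0 = d\log(\ell_0/\varepsilon)$; for $\bar\eta = 0$ both reduce to the equilibrium count $\log\mathcal N_0$.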
This will depend on $\varepsilon$ and, unlike the equilibrium case when $\bar\eta = 0$, nontrivially so because $\bar\eta/\lambda$ is a dynamical quantity and changing $\varepsilon$ to $\varepsilon'$ (i.e. our representation of the microscopic motion) will change $S_{cells}$ as $\Delta S_{cells}/S_{cells} = |\log\varepsilon'/\varepsilon|$. Given a precision $\varepsilon$ of the observations, the quantity $S_{cells}$ measures how many “non transient” phase space cells must be used to obtain a faithful representation of the attractor and of its statistical properties on scale $\varepsilon$. Here by “faithful” on scale $\varepsilon$ we mean that all observables which are constant on such a scale will show the correct statistical properties, i.e. that cells of size larger than $\varepsilon$ will be visited with the correct SRB frequency.

Since the quantity $\bar\eta/\lambda$ is bounded by 1 we see that dissipation does not “simplify” much the motion. Note, however, that we also assume that the system is Anosov transitive, which implies that the attractor is dense; so that the small reduction due to the dissipation, estimated above, holds only as long as this is a correct assumption: at high forcing the attractor is likely (i.e. examples abound) to be no longer dense on phase space and the number $\mathcal N_0$ will have to be replaced by a smaller power of $\ell_0$, affecting correspondingly the analysis leading to (11).

One can ask how many phase space cells are required for a faithful representation of the dynamics by a permutation of cells if one just asks faithfulness to hold only for “most” observations on scale $\varepsilon$ or higher: depending on the meaning attributed to “most” we can expect that $\bar\eta/\lambda$ is replaced by other similar quantities (e.g. by some averages of $\eta_+$ and of Lyapunov exponents, respectively). Since we are not interested in all observables but only in very few ones, it might be interesting to attempt concrete estimates in this more general sense.

2. Remarks

(1) Although Eq. (11) gives the cell count it does not seem to deserve to be taken as a definition of entropy also for systems out of equilibrium, not even for systems simple enough to admit a transitive Anosov map as a model for their evolution. It seems a notion distinct from what has become known as the “Boltzmann entropy”, [Le93], see also [EE22]. The notion is also different from the Gibbs’ entropy, to which it is equivalent only in equilibrium systems: in nonequilibrium (dissipative) systems the latter can only be defined as $-\infty$ and perpetually decreasing; because in such systems one can define the rate at which (Gibbs’) entropy is “created” or “ceded to the thermostats” by the system to be $\eta_+$, i.e. to be the average phase space contraction, see [An82, Ru99].

(2) We also see, from the above analysis, that the variational principle that determines the SRB distribution can be identified with the one that leads to equal probability of the phase space cells. The SRB distribution appears to be the equal probability distribution among the $\mathcal N$ cells which are not transient. In equilibrium all cells are non transient and the SRB distribution coincides with the Liouville distribution.

(3) If we could take $T \to \infty$ (hence, correspondingly, $\varepsilon \to 0$) then the distribution $\mu$ which is uniform inside each $E(\sigma)$ but which attributes a total weight to $E(\sigma)$ equal to $\mathcal N(\sigma) = \mu_{SRB}(E(\sigma))\,\mathcal N$ would become the exact SRB distribution. However it seems conceptually more satisfactory, imitating Boltzmann, to suppose that $\varepsilon$ is very small but $> 0$ so that $T$ will be large but not infinite.

(4) A deeper understanding of the above analysis appears to be linked to an important question raised by Ruelle asking whether (and how) one could possibly relate an entropy notion to the logarithm of the Hausdorff measure of the attractor: and a pertinent possibility is that the Hausdorff measure on the attractor is absolutely continuous with
respect to the SRB measure. The above analysis in terms of cells is reminiscent, in fact, of the methods to study Hausdorff dimension, Hausdorff measure and Pesin’s formula in general hyperbolic systems, [Yo94]. Acknowledgements. I am grateful to E. Speer for pointing out an inconsistency in a preliminary version of this work and to F. Bonetto for very stimulating and clarifying discussions. The references to [EE11, EE33] and their relevance were pointed out to me by E.G.D. Cohen.
References
[An82] Andrej, L.: The rate of entropy change in non-Hamiltonian systems. Phys. Lett. A 111, 45–46 (1982)
[Bo77] Boltzmann, L.: Über die Beziehung zwischen dem zweiten Hauptsatze der mechanischen Wärmetheorie und der Wahrscheinlichkeitsrechnung, respektive den Sätzen über das Wärmegleichgewicht. In: Wissenschaftliche Abhandlungen, Vol. II, ed. F. Hasenöhrl. New York: Chelsea, 1968, reprint, pp. 164–223
[CELS93] Chernov, N.I., Eyink, G.L., Lebowitz, J.L., Sinai, Y.: Steady state electric conductivity in the periodic Lorentz gas. Commun. Math. Phys. 154, 569–601 (1993)
[EE11] Ehrenfest, P., Ehrenfest, T.: The conceptual foundations of the statistical approach in Mechanics. New York: Dover, 1990, reprint
[EE22] Einstein, E.: Zur Theorie des Radiometers. Annalen der Physik 69, 241–254 (1922). And: Epstein, P.S.: On the resistance experienced by spheres in their motion through gases. Physical Review 23, 710–733 (1924). See also Epstein, P.S.: Theory of the radiometer. Zeitschrift für Physik 54, 537–563 (1929)
[Ga95] Gallavotti, G.: Ergodicity, ensembles, irreversibility in Boltzmann and beyond. J. Stat. Phys. 78, 1571–1589 (1995)
[GC95a] Gallavotti, G., Cohen, E.G.D.: Dynamical ensembles in nonequilibrium statistical mechanics. Phys. Rev. Lett. 74, 2694–2697 (1995); Dynamical ensembles in stationary states. J. Stat. Phys. 80, 931–970 (1995)
[Ga99] Gallavotti, G.: Statistical mechanics. A short treatise. Berlin–Heidelberg–New York: Springer Verlag, 1999, pp. 1–345
[Le93] Lebowitz, J.L.: Boltzmann’s entropy and time’s arrow. Phys. Today, 32–38 (1993)
[Ru68] Ruelle, D.: Statistical mechanics of one-dimensional lattice gas. Commun. Math. Phys. 9, 267–278 (1968)
[Ru96] Ruelle, D.: Positivity of entropy production in non equilibrium statistical mechanics. J. Stat. Phys. 85, 1–25 (1996); Entropy production in nonequilibrium statistical mechanics. Commun. Math. Phys. 189, 365–371 (1997)
[Ru99] Ruelle, D.: Smooth dynamics and new theoretical ideas in non-equilibrium statistical mechanics. J. Stat. Phys. 95, 393–468 (1999)
[Si68] Sinai, Y.G.: Markov partitions and C-diffeomorphisms. Funct. Anal. and Appl. 2, no. 1, 64–89 (1968); Construction of Markov partitions. Funct. Anal. and Appl. 2, no. 2, 70–80 (1968). See also Gibbs measures in ergodic theory, Russ. Math. Surv. 27, 21–69 (1972) and Lectures in ergodic theory, Lecture notes in Mathematics, Princeton, NJ: Princeton University Press, 1977
[Yo94] Young, L.S.: Ergodic theory of differentiable dynamical systems. In: Real and complex dynamical systems, ed. B. Branner, P. Hjorth, Nato ASI series, Dordrecht: Kluwer, 1995; Ergodic theory of chaotic dynamical systems. In: Mathematical Physics XII (M ∩ Φ Conference Proceedings), editors D. de Witt, A.J.B. Bracken, M.D. Gould, P.A. Pearce, Cambridge, MA: International Press, 1999

Communicated by Ya. G. Sinai
Commun. Math. Phys. 224, 113 – 132 (2001)
Adiabatic Decoupling and Time-Dependent Born–Oppenheimer Theory
Herbert Spohn, Stefan Teufel
Zentrum Mathematik and Physik Department, Technische Universität München, 80290 München, Germany. E-mail: [email protected]; [email protected]
Received: 10 July 2000 / Accepted: 30 July 2001
Dedicated to Joel Lebowitz on the occasion of his 70th birthday

Abstract: We reconsider the time-dependent Born–Oppenheimer theory with the goal of carefully separating the adiabatic decoupling of a given group of energy bands from their orthogonal subspace and the semiclassics within the energy bands. Band crossings are allowed and our results are local in the sense that they hold up to the first time when a band crossing is encountered. The adiabatic decoupling leads to an effective Schrödinger equation for the nuclei, including contributions from the Berry connection.

1. Introduction

Molecules consist of light electrons, mass $m_e$, and heavy nuclei, mass $M$, which depends on the type of nucleus. Born and Oppenheimer [3] wanted to explain some general features of molecular spectra and realized that, since the ratio $m_e/M$ is small, it could be used as an expansion parameter for the energy levels of the molecular Hamiltonian. The time-independent Born–Oppenheimer theory has been put on firm mathematical grounds by Combes, Duclos, and Seiler [5], Hagedorn [8], and more recently in [16].

With the development of tailored state preparation and ultra precise time resolution there is a growing interest in understanding and controlling the dynamics of molecules, which requires an analysis of the solutions to the time-dependent Schrödinger equation, again exploiting that $m_e/M$ is small. The molecular Hamiltonian is of the form
$$H = \frac{\hbar^2}{2m_e}\big(-i\nabla_x - A_{ext}(x)\big)^2 + \frac{\hbar^2}{2M}\big(-i\nabla_X + A_{ext}(X)\big)^2 + V_e(x) + V_{en}(X,x) + V_n(X). \tag{1}$$
For notational simplicity we ignore spin degrees of freedom and assume that all nuclei have the same mass. We have k electrons with positions {x1 , . . . , xk } = x and l nuclei with positions {X1 , . . . , Xl } = X. The first and second term of H are the kinetic energies of the electrons and of the nuclei, respectively. An external magnetic field is
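Fig. 1. The schematic spectrum of $H_e(R)$ for a diatomic molecule as a function of the separation $R$ of the two nuclei. [Plot labels: vertical axis $\sigma(H_e(R))$; horizontal axis $R$ with $R_0$ marked; bands $E_1(R)$, $E_2(R)$, $E_3(R)$ below the continuum edge $\Sigma(R)$.]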
included through the vector potential $A_{ext}$. Electrons and nuclei interact via the static Coulomb potential. Therefore $V_e$ is the electronic, $V_n$ the nucleonic repulsion, and $V_{en}$ the attraction between electrons and nuclei. $V_e$ and $V_n$ may also contain an external electrostatic potential. In atomic units ($m_e = \hbar = 1$) the Hamiltonian (1) can be written more concisely as
$$H = \frac{m_e}{M}\,\frac12\big(-i\nabla_X + A_{ext}(X)\big)^2 + H_e(X), \tag{2}$$
emphasizing that the nuclear kinetic energy will be treated as a “small perturbation”. $H_e(X)$ is the electronic Hamiltonian for a given position $X$ of the nuclei,
$$H_e(X) = \frac12\big(-i\nabla_x - A_{ext}(x)\big)^2 + V_e(x) + V_{en}(X,x) + V_n(X). \tag{3}$$
$H_e(X)$ is a self-adjoint operator on the electronic Hilbert space $L^2(\mathbb R^{3k})$ restricted to its antisymmetric subspace. Later on we will need some smoothness of $H_e(X)$, which can be established easily if the electrons are treated as point-like and the nuclei have an extended, rigid charge distribution. Generically $H_e(X)$ has, possibly degenerate, eigenvalues $E_1(X) < E_2(X) < \ldots$ which terminate at the continuum edge $\Sigma(X)$. Thereby one obtains the band structure as plotted schematically in Fig. 1. The discrete bands $E_j(X)$ may cross and possibly merge into the continuous spectrum as indicated in Fig. 2.

Comparing kinetic energies, we find for the speeds $|v_n| \approx (m_e/M)^{1/2}|v_e|$, which means that on the atomic scale the nuclei move very slowly. If we regard $X(t)$ as a given nucleonic trajectory, then $H_e(X(t))$ is a Hamiltonian with slow time variation and the time-adiabatic theorem [15, 14, 1] can be applied [2]. For us $X$ are quantum mechanical degrees of freedom. The Hamiltonian $H$ of (2) is time-independent and we can only exploit that the nucleonic Laplacian carries a small prefactor. To distinguish, we refer to our situation as space-adiabatic. Since the nuclei move very slowly, their dynamics must be followed over sufficiently long times. From the speed ratio we conclude that
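these times are of order $(m_e/M)^{-1/2}$ in atomic units. To simplify notation we define
$$\varepsilon = \sqrt{\frac{m_e}{M}} \tag{4}$$
as the small dimensionless parameter. Then
$$H^\varepsilon = \frac{\varepsilon^2}{2}\big(-i\nabla_X + A_{ext}(X)\big)^2 + H_e(X), \tag{5}$$
and we want to study the solutions of the time-dependent Schrödinger equation
$$i\varepsilon\,\frac{\partial\psi}{\partial t} = H^\varepsilon\psi \tag{6}$$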
in the limit of small $\varepsilon$.

The crude physical picture underlying the analysis of (6) is that the nuclei behave semiclassically because of their large mass and that the electrons rapidly adjust to the slow nucleonic motion. Thus, in fact, the time-dependent Born–Oppenheimer approximation involves two limits. If the electrons are initially in the eigenstate $\chi_j(X_0)$ of the $j$th band with energy $E_j(X_0)$, where $X_0$ is the approximate initial configuration of the nuclei, then the $j$th band is adiabatically protected provided there is an energy gap separating it from the rest of the spectrum. Thus at later times, up to small error, the electronic wave function is still in the subspace corresponding to the $j$th band. But this implies that the nuclei are governed by the Born–Oppenheimer Hamiltonian
$$H^\varepsilon_{BO} = \frac{\varepsilon^2}{2}\big(-i\nabla_X + A_{ext}(X)\big)^2 + E_j(X). \tag{7}$$
Since $\varepsilon \ll 1$, $H^\varepsilon_{BO}$ can be analyzed through semiclassical methods where to leading order the contributions come from the classical flow $\Phi^t$ corresponding to the classical Hamiltonian $H^{cl}_{BO} = \frac12 p^2 + E_j(q)$ on nucleonic phase space.

In general, $E_j(X)$ may touch another band as $X$ varies. To allow for such band crossings we introduce the region $\Lambda \subset \mathbb R^n$, $n = 3l$, in nucleonic configuration space, such that $E_j$ restricted to $\Lambda$ does not cross or touch any other energy band. The classical flow $\Phi^t$ then has $\Lambda\times\mathbb R^n$ as phase space and is defined only up to the time when it first hits the boundary $\partial\Lambda\times\mathbb R^n$. Up to that time (7) still correctly describes the quantum evolution. To follow the tunneling through a band crossing other methods have to be used [11, 7]; in particular, the codimension of the crossing is of relevance.

The mathematical investigation of the time-dependent Born–Oppenheimer theory was initiated and carried out in great detail by Hagedorn. In his pioneering work [9] he constructs approximate solutions to (6) of the form $\phi_{q(t),p(t)}\otimes\chi_j(q(t))$, where $\phi_{q(t),p(t)}$ is a coherent state carried along the classical flow, $(q(t), p(t)) = \Phi^t(q_0, p_0)$. The difference to the true solution with the same initial condition is of order $\sqrt\varepsilon$ in the $L^2$-norm over times of order $\varepsilon^{-1}$ in atomic units and the approximation holds until the first hitting time of $\partial\Lambda\times\mathbb R^n$. In a recent work Hagedorn and Joye [10] construct solutions to (6) satisfying exponentially small error estimates. In Hagedorn’s approach the “adiabatic and semiclassical limits are being taken simultaneously, and they are coupled [10]”. In our paper we carefully separate the space-adiabatic and the semiclassical limit. One immediate benefit is the generalization of the first order analysis of Hagedorn from coherent states to arbitrary wave functions.
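To get a feeling for the size of the small parameter $\varepsilon = \sqrt{m_e/M}$ of (4) and of the time scale $\varepsilon^{-1}$ mentioned above, here is a minimal Python sketch. The function name `adiabatic_parameter` and the approximate proton-to-electron mass ratio used as input are assumptions introduced for illustration; the paper itself does not quote numerical values.

```python
import math

def adiabatic_parameter(mass_ratio):
    """Return epsilon = sqrt(m_e / M) for a nucleus with M = mass_ratio * m_e."""
    return math.sqrt(1.0 / mass_ratio)

# Assumed input: approximate proton-to-electron mass ratio (lightest nucleus).
PROTON_MASS_RATIO = 1836.15

eps = adiabatic_parameter(PROTON_MASS_RATIO)
print("epsilon        =", eps)        # speed ratio |v_n| / |v_e| is of this order
print("time scale 1/e =", 1.0 / eps)  # atomic time units over which nuclei move O(1)
```

Even for the lightest nucleus $\varepsilon$ is only of the order of a few percent, so an error of order $\varepsilon$ or $\sqrt\varepsilon$ is small, while the relevant times $\varepsilon^{-1}$ are long on the atomic scale.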
Fig. 2. The wave function can leave Ran $P_*$ in two different ways: either by transitions to other bands (a) or through the boundary of $\Lambda$ (b). [Plot labels: vertical axis $\sigma(H_e(X))$; horizontal axis $X$; the band $\sigma_*(X)$ over the region $\Lambda$, with arrows (a) and (b) leaving Ran $P_*$.]
Let us explain our result for the space-adiabatic part in more detail. We assume that there is some region $\Lambda \subset \mathbb R^n$ in the nucleonic configuration space, such that some subset $\sigma_*(X)$ of $\sigma(H_e(X))$ is separated from the remainder of the spectrum by a gap for all $X \in \Lambda$, i.e. $\mathrm{dist}\big(\sigma_*(X), \sigma(H_e(X))\setminus\sigma_*(X)\big) \ge d > 0$ for all $X \in \Lambda$. $\Lambda$ could be punctured by small balls (for $n = 2$) because of band crossings. $\Lambda$ could also terminate because the point spectrum merges in the continuum, which physically means that the molecule loses an electron through ionization. Let $P_*(X)$ be the spectral projection of $H_e(X)$ associated with $\sigma_*(X)$ and $P_* = \int^\oplus_\Lambda dX\,P_*(X)$. We will establish that the unitary time evolution $e^{-iH^\varepsilon t/\varepsilon}$ agrees on Ran $P_*$ with the diagonal evolution $e^{-iH^\varepsilon_{diag}t/\varepsilon}$ generated by $H^\varepsilon_{diag} := P_*H^\varepsilon P_*$ up to errors of order $\varepsilon$ as long as the leaking through the boundary of $\Lambda$ is sufficiently small.

To complete the analysis one has to control the flow of the wave function through $\partial\Lambda$. One possibility is to simply avoid the problem by assuming that $\Lambda = \mathbb R^n$, hence $\partial\Lambda = \emptyset$. We will refer to this case as a globally isolated band. Of course, the set $\{(X, y) \in \mathbb R^n\times\mathbb R : y \in \sigma_*(X)\}$ may contain arbitrary band crossings. As one of our main results, we prove that the subspace Ran $P_*$ is adiabatically protected. In particular for the purpose of studying band crossings the full molecular Hamiltonian may be replaced by a simplified model with two bands only.

In general one has $\partial\Lambda \ne \emptyset$, to which we refer as a locally isolated band. To estimate the flow out of $\Lambda$ the only technique available seems to be semiclassical analysis. But this requires a control over the semiclassical evolution, for which one needs, at present, that $\{(X, y) \in \Lambda\times\mathbb R : y \in \sigma_*(X)\}$ contains no band crossings. Then $\{(X, y) \in \Lambda\times\mathbb R : y \in \sigma_*(X)\} = \cup_j\{(X, y) \in \Lambda\times\mathbb R : y = E_j(X)\}$ is the disjoint union of possibly degenerate energy bands $E_j(X)$. We will prove that each band separately is adiabatically protected.

In the special case where $\sigma_*(X) = E_j(X)$ is a nondegenerate eigenvalue for $X \in \Lambda$, $e^{-iH^\varepsilon_{diag}t/\varepsilon}$ is well approximated through $e^{-iH^\varepsilon_{BO}t/\varepsilon}$ on $L^2(\mathbb R^n)$. Since $H^\varepsilon_{BO}$ is a standard
semiclassical operator, one can easily control the $X$-support of the wave function and therefore prove a result for rather general $\Lambda \subset \mathbb R^n$; for the details see Theorem 2. Roughly speaking, it says that if $\phi_t$ is a solution of the effective Schrödinger equation for the nuclei
$$i\varepsilon\,\frac{\partial\phi_t}{\partial t} = H^\varepsilon_{BO}\,\phi_t, \tag{8}$$
with supp $\phi_0 \subset \Lambda$, then, modulo an error of order $\varepsilon$, $\psi_t := \phi_t(X)\chi_j(X, x)$ is a solution of the full Schrödinger equation (6) with initial condition $\psi_0(X, x) = \phi_0(X)\chi_j(X, x)$ as long as $\phi_t$ is supported in $\Lambda$ up to $L^2$-mass of order $\varepsilon$. This maximal time span can be computed using the classical flow $\Phi^t$.

As first observed by Mead and Truhlar [19], in general $H^\varepsilon_{BO}$ acquires as a first order correction an additional vector potential $A_{geo}(X) = -i\langle\chi_j(X), \nabla_X\chi_j(X)\rangle$ and (7) has to be replaced by
$$H^\varepsilon_{BO} = \frac{\varepsilon^2}{2}\big(-i\nabla_X + A_{ext}(X) + A_{geo}(X)\big)^2 + E_j(X). \tag{9}$$
Multiplying $\chi_j(X)$ with a smooth $X$-dependent phase factor induces a gauge transformation for $A_{geo}$, which implies that the physical predictions based on (9) do not change, as it should be. As noticed in [19], in general $A_{geo}$ cannot be removed through a gauge transformation and (9) and (7) describe different physics. Berry realized that geometric phases appear whenever the Hamiltonian has slowly changing parameters. Therefore $A_{geo}(X)$ is referred to as Berry connection, cf. [22] for an instructive collection of reprints. In fact, the motion of nuclei as governed by the Born–Oppenheimer Hamiltonian (9) is one of the paradigmatic examples for geometric phases.

If $\sigma_*(X) = E(X)$ is $k$-fold degenerate, not much of the above analysis changes. $H^\varepsilon_{BO}$ becomes matrix-valued and acts on $L^2(\mathbb R^n)^{\oplus k}$, i.e.
$$H^\varepsilon_{BO} = \Big(\frac{\varepsilon^2}{2}\big(-i\nabla_X + A_{ext}(X)\big)^2 + E_j(X)\Big)\,1_{k\times k} + \frac{\varepsilon}{2}\Big((-i\varepsilon\nabla_X)\cdot A_{geo}(X) + A_{geo}(X)\cdot(-i\varepsilon\nabla_X)\Big).$$
The connection $A_{geo}(X)$ contains in general also off-diagonal terms and matrix-valued semiclassics must be applied. However, since the only nondiagonal term is in the subprincipal symbol, the leading order semiclassical analysis reduces to the scalar case and, in particular, agrees with the nondegenerate band case. We do not carry out the straightforward extension of Theorem 2 below to the degenerate band case, because the technicalities of matrix-valued semiclassics would obscure the simple ideas behind our analysis.
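The gauge behaviour of the Berry connection $A_{geo} = -i\langle\chi, \nabla_X\chi\rangle$ for a nondegenerate band can be checked numerically on a toy example. The Python sketch below is an illustration under stated assumptions, not part of the paper: it replaces the electronic eigenvector by the standard spin-1/2 state $\chi(\theta,\varphi) = (\cos(\theta/2), e^{i\varphi}\sin(\theta/2))$, and the helper names `chi`, `inner` and `holonomy` are hypothetical. It approximates the loop integral of $A_{geo}$ by the phase of a product of overlaps between neighbouring states, and verifies that a smooth rephasing of $\chi$ leaves this loop integral unchanged while its value is nonzero.

```python
import cmath
import math

def chi(theta, phi):
    """Assumed toy eigenvector: spin-1/2 state pointing along (theta, phi)."""
    return (math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2))

def inner(a, b):
    """Scalar product <a, b>, antilinear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def holonomy(states):
    """Discrete approximation of the loop integral of A_geo = -i<chi, grad chi>.

    Uses arg <chi_k, chi_{k+1}> ~ A_geo . dX along a closed chain of states.
    """
    prod = 1.0 + 0j
    n = len(states)
    for k in range(n):
        prod *= inner(states[k], states[(k + 1) % n])
    return cmath.phase(prod)

theta = math.pi / 3
n_steps = 2000
phis = [2 * math.pi * k / n_steps for k in range(n_steps)]

loop = [chi(theta, p) for p in phis]
# Smooth gauge transformation chi -> exp(i * sin(3 phi)) * chi (arbitrary choice).
regauged = [tuple(cmath.exp(1j * math.sin(3 * p)) * c for c in s)
            for s, p in zip(loop, phis)]

print("loop integral      :", holonomy(loop))
print("after regauging    :", holonomy(regauged))
print("pi * (1 - cos th)  :", math.pi * (1 - math.cos(theta)))
```

For this toy state the loop integral at fixed $\theta$ is $\pi(1-\cos\theta)$, which is not a multiple of $2\pi$ in general, so the connection cannot be gauged away along the loop; this mirrors the statement above that (9) and (7) may describe different physics.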
In their recent work [18] Martinez and Sordoni independently study the time-dependent Born–Oppenheimer approximation as based on techniques developed by Nenciu and Sordoni [20]. They consider the case of a globally isolated band for a Hamiltonian of the form (1) with smooth V and Aext = 0. They succeed in proving the adiabatic decoupling to any order in ε for subspaces P∗ε which are ε-close to the unperturbed subspaces P∗ considered by us. With this result, in principle, higher order corrections to the effective Hamiltonian (7) could be computed.
The paper is organized as follows. Section 2 contains the precise formulation of the results. Section 3 gives a short discussion of the semiclassical limit of $H^\varepsilon_{BO}$ and on how such results extend to the full molecular system. Proofs are provided in Sect. 4. In spirit they rely on techniques developed in [23] in the context of the semiclassical limit for dressed electron states. In practice the Born–Oppenheimer approximation requires several novel constructions, since the “perturbation” $-\frac{\varepsilon^2}{2}\Delta$ increases quadratically.

Our results can be formulated and proved in a more general framework dealing with, possibly time-dependent, perturbations of fibered operators. Also the gap condition can be removed by using arguments similar to those developed by Avron and Elgart in [1]. The general operator theoretical results will appear elsewhere [24].

2. Main Results

The specific form (3) of the electronic part of the Hamiltonian will be of no importance in the following. Thus we only assume that
$$H_e = \int^\oplus_{\mathbb R^n} dX\,H_e(X), \qquad H_e(X) = H_e^0 + H_e^1(X),$$
where $H_e^0$ is self-adjoint on some dense domain $D \subset \mathcal H_e$ and bounded from below and $H_e^1(X) \in \mathcal L(\mathcal H_e)$ is a continuous family of self-adjoint operators, bounded uniformly for $X \in \mathbb R^n$. Thus $H_e$ is self-adjoint on $D(H_e) = L^2(\mathbb R^n)\otimes D \subset \mathcal H := L^2(\mathbb R^n)\otimes\mathcal H_e$ and bounded from below. For the definition of $L^2(\mathbb R^n)\otimes D$ we equip $D$ with the graph-norm $\|\cdot\|_{H_e^0}$, i.e., for $\psi \in D$, $\|\psi\|_{H_e^0} = \|H_e^0\psi\| + \|\psi\|$.

Let $A_{ext} \in C_b^1(\mathbb R^n, \mathbb R^n)$, where for any open set $\Omega \subset \mathbb R^m$, $m \in \mathbb N$, $C_b^k(\Omega)$ denotes the set of functions $f \in C^k(\Omega)$ such that for each multi-index $\alpha$ with $|\alpha| \le k$ there exists a $C_\alpha < \infty$ with
$$\sup_{x\in\Omega}|\partial^\alpha f(x)| \le C_\alpha.$$
Then $\frac{\varepsilon^2}{2}\big(-i\nabla_X + A_{ext}(X)\big)^2$ is self-adjoint on $W^2(\mathbb R^n)$, the second Sobolev space, since $-i\nabla_X$ is infinitesimally operator bounded with respect to $-\Delta_X$. It follows that
$$H^\varepsilon = \frac{\varepsilon^2}{2}\big(-i\nabla_X + A_{ext}(X)\big)^2\otimes 1 + H_e \tag{10}$$
is self-adjoint on $D(H^\varepsilon) = \big(W^2(\mathbb R^n)\otimes\mathcal H_e\big)\cap D(H_e)$. For $X \in \Lambda$, $\Lambda \subset \mathbb R^n$ open, we require in addition some regularity for $H_e(X)$ as a function of $X$:

Condition H$_k$: $H_e^1(\cdot) \in C_b^k(\Lambda, \mathcal L(\mathcal H_e))$.

The exact value of $k$ will depend on whether $\Lambda = \mathbb R^n$ or $\Lambda \subset \mathbb R^n$. For the type of Hamiltonian considered in the introduction, cf. (1), all the above conditions including Condition H$_k$ are easily checked and put constraints only on the smoothness of the external potentials and on the smoothness and the decay of the charge distribution of the nuclei. For point nuclei H$_k$ fails and a suitable substitute would require a generalization of the Hunziker distortion method of [16].

We will be interested in subsets of $\{(X, s) \in \Lambda\times\mathbb R : s \in \sigma(H_e(X))\}$ which are isolated from the rest of the spectrum in the following sense.
Condition S: For $X \in \Lambda$, let $\sigma_*(X) \subset \sigma(H_e(X))$ be such that there are functions $f_\pm \in C_b(\Lambda, \mathbb R)$ and a constant $d > 0$ with
$$[f_-(X) + d,\ f_+(X) - d]\cap\sigma_*(X) = \sigma_*(X)$$
and
$$[f_-(X),\ f_+(X)]\cap\big(\sigma(H_e(X))\setminus\sigma_*(X)\big) = \emptyset.$$
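We set $P_* = \int^\oplus_\Lambda dX\,P_*(X)$, where $P_*(X) = 1_{\sigma_*(X)}(H_e(X))$ is the spectral projection of $H_e(X)$ with respect to $\sigma_*(X)$. As explained in the introduction we have to distinguish two cases.

(i) Globally isolated bands. We assume $\Lambda = \mathbb R^n$ and let
$$H^\varepsilon_{diag} := P_*H^\varepsilon P_* + P_*^\perp H^\varepsilon P_*^\perp. \tag{11}$$
Since we aim at a uniform result for the adiabatic theorem, we introduce the Sobolev spaces $W^{1,\varepsilon}(\mathbb R^n)$ and $W^{2,\varepsilon}(\mathbb R^n)$ with respect to the $\varepsilon$-scaled gradient, i.e.
$$W^{1,\varepsilon}(\mathbb R^n) := \big\{\phi \in L^2(\mathbb R^n) : \|\phi\|_{W^{1,\varepsilon}} := \varepsilon\,\||\nabla\phi|\| + \|\phi\| < \infty\big\}$$
and
$$W^{2,\varepsilon}(\mathbb R^n) := \big\{\phi \in L^2(\mathbb R^n) : \|\phi\|_{W^{2,\varepsilon}} := \varepsilon^2\,\|\Delta\phi\| + \|\phi\| < \infty\big\}.$$
Alternatively we will project on finite total energies smaller than $E$ and define $\mathcal E(H^\varepsilon) := 1_{(-\infty,E]}(H^\varepsilon)$.

Theorem 1. Assume H$_3$ and S for $\Lambda = \mathbb R^n$. Then $H^\varepsilon_{diag}$ is self-adjoint on the domain of $H^\varepsilon$. There are constants $C, \tilde C < \infty$ such that for all $t \in \mathbb R$,
$$\big\|e^{-iH^\varepsilon t/\varepsilon} - e^{-iH^\varepsilon_{diag}t/\varepsilon}\big\|_{\mathcal L(W^{2,\varepsilon}\otimes\mathcal H_e,\,\mathcal H)} \le \varepsilon\,C\,(1 + |t|)^3, \tag{12}$$
and for all $E \in \mathbb R$,
$$\big\|\big(e^{-iH^\varepsilon t/\varepsilon} - e^{-iH^\varepsilon_{diag}t/\varepsilon}\big)\,\mathcal E(H^\varepsilon)\big\|_{\mathcal L(\mathcal H)} \le \varepsilon\,\tilde C\,(1 + |E|)\,(1 + |t|). \tag{13}$$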
L(W 2,ε ⊗ He , H) denotes the space of bounded linear operators from W 2,ε ⊗ He to H equipped with the operator norm. This result should be understood as an adiabatic theorem for the subspaces RanP∗ and RanP∗⊥ , which are not spectral subspaces. Let us point out one immediate application of Theorem 1. The behavior near band crossings is usually investigated using simplified models involving only two energy bands and ignoring the rest of the spectrum, cf. [11, 7]. Theorem 1 shows that this strategy is indeed justified modulo errors of order ε. (ii) Locally isolated bands. σ∗ (X) = E(X) is a nondegenerate eigenvalue for all X ∈ !. ! may now be any open subset of Rn and for such a ! we assume H∞ and S. We also assume that ! is connected. Otherwise one could treat each connected component separately.
It is easy to see that, given $H_\infty$ and S, the family of projections $P_*(\cdot) \in C_b^\infty(\Lambda, \mathcal{L}(\mathcal{H}_e))$. However, in order to "map" the dynamics from $\mathrm{Ran}\,P_*$ to $L^2(\Lambda)$ we need in addition a smooth version $\chi(\cdot) \in C_b^\infty(\Lambda, \mathcal{H}_e)$ of the normalized eigenvector of $H_e(X)$ with eigenvalue $E(X)$. In other words we require the complex line bundle over $\Lambda$ defined by $P_*$ to be trivial. This always holds for contractible $\Lambda$, but, as discussed below, also for some relevant examples where $\Lambda$ is not contractible. Given a smooth version of $\chi(X)$ with $\|\chi(X)\| = 1$, one has $\mathrm{Re}\,\langle \chi(X), \nabla_X \chi(X)\rangle = 0$, but, in general, $\mathrm{Im}\,\langle \chi(X), \nabla_X \chi(X)\rangle \neq 0$. In the following we distinguish two cases: either it is possible to achieve $\mathrm{Im}\,\langle \tilde\chi(X), \nabla_X \tilde\chi(X)\rangle = 0$ by a smooth gauge transformation $\chi(X) \to \tilde\chi(X) = e^{i\theta(X)} \chi(X)$, or not. In the latter case $A_{\rm geo}(X) := -i\langle \chi(X), \nabla_X \chi(X)\rangle$ is the gauge potential of a connection on the trivial complex line bundle over $\Lambda$, the Berry connection, and has to be taken into account in the definition of the effective operator
$$H^\varepsilon_{\rm BO} := \frac{\varepsilon^2}{2}\big(-i\nabla_X + A_{\rm ext}(X) + A_{\rm geo}(X)\big)^2 + E(X) \tag{14}$$
with domain $W^2(\mathbb{R}^n)$. Thus $A_{\rm geo}$ acts as an additional external magnetic vector potential. Although $A_{\rm ext}$ and $A_{\rm geo}$ appear in $H^\varepsilon_{\rm BO}$ with an $\varepsilon$ in front only, and therefore are not retained in the semiclassical limit to leading order, they do contribute to the solution of the Schrödinger equation for times of order $\varepsilon^{-1}$. If the full Hamiltonian is real in position representation, as is the case for the Hamiltonians considered in the introduction whenever $A_{\rm ext} = 0$, then $\chi(X)$ can be chosen real-valued. If, in addition, $\Lambda$ is contractible, the existence of a smooth version of $\chi(X)$ with $\mathrm{Im}\,\langle \chi(X), \nabla_X \chi(X)\rangle = 0$ follows.

To define $H^\varepsilon_{\rm BO}$ on $L^2(\mathbb{R}^n)$ through (14), the functions $E(X)$ and $A_{\rm geo}(X)$, which are a priori defined on $\Lambda$ only, must be continued to functions on $\mathbb{R}^n$. Hence we arbitrarily extend $E(X)$ and $A_{\rm geo}(X)$ to functions in $C_b^\infty(\mathbb{R}^n)$ by modifying them, if necessary, on $\Lambda \setminus (\Lambda - \delta/5)$ (cf. (17)) for some $\delta > 0$. The parameter $\delta$ will be fixed in the formulation of Theorem 2 and will appear in several places. It controls how close the states are allowed to come to $\partial\Lambda$.

The generic example for the Berry phase is a band crossing of codimension 2 (cf. [22, 11, 7]). If $E(X)$ is an isolated energy band except for a codimension 2 crossing, then $\Lambda = \mathbb{R}^n \setminus \{\text{closed neighborhood of the crossing}\}$ is no longer contractible, but the line bundle is still trivial. Although the underlying Hamiltonian is real, the Berry connection cannot be gauged away. Within the time-independent Born–Oppenheimer approximation Herrin and Howland [12] study a model with a nontrivial eigenvector bundle.

With the fixed choice for $\chi(X)$ we have
$$\mathrm{Ran}\,P_* = \Big\{ \int_\Lambda^\oplus dX\, \phi(X)\chi(X) ;\ \phi \in L^2(\Lambda) \Big\} \subset \mathcal{H} . \tag{15}$$
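The statements about the Berry connection made above can be checked in a few lines. The following worked computation is an illustrative sketch, not part of the original text; it assumes that the inner product $\langle\cdot,\cdot\rangle$ on $\mathcal{H}_e$ is antilinear in the first and linear in the second argument, and $\theta$ stands for an arbitrary smooth real function on $\Lambda$.

```latex
% (a) Normalization forces the real part to vanish, so A_geo is real-valued:
0 = \nabla_X \langle \chi, \chi \rangle
  = \langle \nabla_X \chi, \chi \rangle + \langle \chi, \nabla_X \chi \rangle
  = 2\, \mathrm{Re}\, \langle \chi, \nabla_X \chi \rangle .

% (b) Gauge transformation \tilde\chi = e^{i\theta} \chi:
\langle \tilde\chi, \nabla_X \tilde\chi \rangle
  = \langle e^{i\theta}\chi,\; e^{i\theta} ( i (\nabla_X \theta) \chi + \nabla_X \chi ) \rangle
  = i \nabla_X \theta + \langle \chi, \nabla_X \chi \rangle ,
\qquad\text{hence}\qquad
\tilde A_{\rm geo} = -i \langle \tilde\chi, \nabla_X \tilde\chi \rangle
  = A_{\rm geo} + \nabla_X \theta .

% (c) If \chi is real-valued, \langle \chi, \nabla_X \chi \rangle is real and
%     therefore equals its real part, which vanishes by (a):
A_{\rm geo} = -i \langle \chi, \nabla_X \chi \rangle = 0 .
```

In particular, (b) shows that $A_{\rm geo}$ transforms like a magnetic vector potential, and that it can be removed whenever $A_{\rm geo} = -\nabla_X\theta$ for some smooth real $\theta$ on $\Lambda$.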
Thus there is a natural identification $U : \mathrm{Ran}\,P_* \to L^2(\mathbb{R}^n)$ connecting the relevant subspace on which the full quantum evolution takes place and the Hilbert space $L^2(\mathbb{R}^n)$ on which the effective Born–Oppenheimer evolution is defined. According to (15), we set $U(\phi\chi) = \phi$, i.e.
$$(U P_* \psi)(X) = \langle \chi(X), (P_*\psi)(X) \rangle_{\mathcal{H}_e} .$$
Its adjoint $U^* : L^2(\mathbb{R}^n) \to \mathrm{Ran}\,P_*$ is given by
$$U^*\phi = \int_\Lambda^\oplus dX\, \phi(X)\chi(X) .$$
Clearly $U$ is an isometry and $U^*U = 1$ on $\mathrm{Ran}\,P_*$. But $U$ is not surjective and thus not unitary.

By construction, $e^{-iH^\varepsilon_{\rm BO} t/\varepsilon}$ is a good approximation to the true dynamics only as long as the wave function of the nuclei is supported in $\Lambda$ modulo errors of order $\varepsilon$. Since $H^\varepsilon_{\rm BO}$ is a standard semiclassical operator, the $X$-support of solutions of (8) can be calculated approximately from the classical dynamics generated by its principal symbol $H_{\rm cl}(q, p) = \frac{1}{2}p^2 + E(q)$ on phase space $Z := \mathbb{R}^n \times \mathbb{R}^n$,
$$\frac{d}{dt}\, q = p, \qquad \frac{d}{dt}\, p = -\nabla E(q) . \tag{16}$$
The solution flow to (16) exists for all times and will be denoted by $\Phi^t$. In order to make these notions more precise, we need to introduce some notation. The Weyl quantization of $a \in C_b^\infty(Z)$ is the linear operator
$$\big(a^{W,\varepsilon}\phi\big)(X) = (2\pi)^{-n} \int_{\mathbb{R}^n} dY\, dk\; a\Big(\frac{X+Y}{2},\, \varepsilon k\Big)\, e^{-i(X-Y)\cdot k}\, \phi(Y) ,$$
as acting on Schwartz functions. $a^{W,\varepsilon}$ extends to $\mathcal{L}(L^2(\mathbb{R}^n))$ with operator norm bounded uniformly in $\varepsilon$ (cf., e.g., Theorem 7.11 in [6]). The wave functions with phase space support in a compact set $\Gamma \subset Z$ do not form a closed subspace of $L^2(\mathbb{R}^n)$. Hence we cannot project on this set. In order to define approximate projections, let for $\Gamma \subset \mathbb{R}^m$, $m \in \mathbb{N}$, and for $\alpha > 0$,
$$\Gamma - \alpha := \Big\{ z \in \Gamma : \inf_{w \in \mathbb{R}^m \setminus \Gamma} |w - z| \ge \alpha \Big\} . \tag{17}$$
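Definition 1. An approximate characteristic function $1_{(\Gamma,\alpha)} \in C_b^\infty(\mathbb{R}^m)$ of a set $\Gamma \subset \mathbb{R}^m$ with margin $\alpha$ is defined by the requirement that $1_{(\Gamma,\alpha)}|_{\Gamma-\alpha} = 1$ and $1_{(\Gamma,\alpha)}|_{\mathbb{R}^m \setminus \Gamma} = 0$. If $1_{(\Gamma,\alpha)}$ is an approximate characteristic function on phase space $Z$, then the corresponding approximate projection is defined as its Weyl quantization $1^{W,\varepsilon}_{(\Gamma,\alpha)}$. We will say that functions in $\mathrm{Ran}\,1^{W,\varepsilon}_{(\Gamma,\alpha)}$ have phase space support in $\Gamma$.

For $\Gamma \subset Z$ we will use the abbreviations
$$\Gamma_q := \{ q \in \mathbb{R}^n : (q,p) \in \Gamma \text{ for some } p \in \mathbb{R}^n \}, \qquad \Gamma_p := \{ p \in \mathbb{R}^n : (q,p) \in \Gamma \text{ for some } q \in \mathbb{R}^n \} .$$
Let the phase space support $\Gamma$ of the initial wave function be such that $\Gamma_q \subset \Lambda - \delta$. Then the maximal time interval for which the $X$-support of the wave function of the nuclei stays in $\Lambda$ up to errors of order $\varepsilon$ can be written as
$$I^\delta_{\max}(\Gamma, \Lambda) := \big[ T^\delta_-(\Gamma, \Lambda),\, T^\delta_+(\Gamma, \Lambda) \big] ,$$
where the "first hitting times" $T_\pm$ are defined by the classical dynamics through
$$T^\delta_+(\Gamma, \Lambda) := \sup\big\{ t \ge 0 : \big(\Phi^s(\Gamma)\big)_q \subseteq \Lambda - \delta \ \ \forall\, s \in [0, t] \big\}$$
and $T^\delta_-(\Gamma, \Lambda)$ analogously for negative times. These are just the first times for a particle starting in $\Gamma$ to hit the boundary of $\Lambda - \delta$ when dragged along the classical flow $\Phi^t$. The following proposition, which is an immediate consequence of Egorov's Theorem [4, 21], shows that for times in $I^\delta_{\max}(\Gamma, \Lambda)$ the support of the wave function of the nuclei stays indeed in $\Lambda - \delta$, up to errors of order $\varepsilon$ uniformly on $\mathrm{Ran}\,1^{W,\varepsilon}_{(\Gamma,\alpha)}$ for any approximate projection $1^{W,\varepsilon}_{(\Gamma,\alpha)}$.

Proposition 1. Let $\Gamma \subset Z$ be such that $\Gamma_q \subset \Lambda - \delta$ and let $1_{\Lambda-\delta}$ denote multiplication with the characteristic function of $\Lambda - \delta$ on $L^2(\mathbb{R}^n)$. For any approximate projection $1^{W,\varepsilon}_{(\Gamma,\alpha)}$ and any bounded interval $I \subseteq I^\delta_{\max}(\Gamma, \Lambda)$ there is a constant $C < \infty$ such that for all $t \in I$,
$$\Big\| \big(1 - 1_{\Lambda-\delta}\big)\, e^{-iH^\varepsilon_{\rm BO} t/\varepsilon}\, 1^{W,\varepsilon}_{(\Gamma,\alpha)} \Big\|_{\mathcal{L}(L^2(\mathbb{R}^n))} \le C\, \varepsilon .$$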
An approximate projection on $\Gamma$ in $\mathcal{H}$ is defined as
$$P^\alpha_\Gamma := U^*\, 1_{(\Lambda,\delta)}\, 1^{W,\varepsilon}_{(\Gamma,\alpha)}\, U P_* ,$$
where $1^{W,\varepsilon}_{(\Gamma,\alpha)}$ is an approximate projection on $\Gamma$ according to Definition 1 and $1_{(\Lambda,\delta)}$ is an approximate characteristic function for $\Lambda$. Using the latter instead of the sharp cutoff from $U^*$ makes $\mathrm{Ran}\,P^\alpha_\Gamma$ a bounded set in $W^{2,\varepsilon} \otimes \mathcal{H}_e$ whenever $\Gamma_p$ is a bounded set.

Theorem 2. Assume $H_\infty$ and S with $\dim(\mathrm{Ran}\,P_*(X)) = 1$ for some open $\Lambda \subseteq \mathbb{R}^n$. Let $\Gamma \subset Z$ be such that $\Gamma_q \subset \Lambda - \delta$ for some $\delta > 0$ and $\Gamma_p$ bounded. For any approximate projection $P^\alpha_\Gamma$ and any bounded interval $I \subseteq I^\delta_{\max}(\Gamma, \Lambda)$ there is a constant $C < \infty$ such that for all $t \in I$,
$$\Big\| \Big( e^{-iH^\varepsilon t/\varepsilon} - U^* e^{-iH^\varepsilon_{\rm BO} t/\varepsilon}\, U \Big) P^\alpha_\Gamma \Big\|_{\mathcal{L}(\mathcal{H})} \le C \varepsilon . \tag{18}$$

Theorem 2 establishes that the electrons adiabatically follow the motion of the nuclei up to errors of order $\varepsilon$ as long as the leaking through the boundary of $\Lambda$ is small. The semiclassics was used only to control such a leaking uniformly. However, for $H^\varepsilon_{\rm BO}$ the limit $\varepsilon \to 0$ is a semiclassical limit and, as discussed in the following section, beyond the mere support of the wave function more detailed information is available.
3. Semiclassics for a Single Band

The semiclassical limit of Eq. (8) with a Hamiltonian of the form (14) is well understood and there is a variety of different approaches. For example one can construct approximate solutions $\phi_{q(t)}$ of (8) which are localized along a classical trajectory $q(t)$, i.e. along a solution of (16). Then it follows from Theorem 2 that $\phi_{q(t)}\chi$ is a solution of the full Schrödinger equation, (6), up to an error of order $\varepsilon$ as long as $q(t) \in \Lambda - \delta$. Roughly speaking, this coincides with the result of Hagedorn [9]. In applications the assumption that the wave function of the nuclei is well described by a coherent state seems to be rather restrictive and a more general approach to the semiclassical analysis of a Schrödinger equation of the form (8) is to consider the distributions of semiclassical observables, i.e. of operators obtained as Weyl quantization $a^{W,\varepsilon}$ of classical phase space functions $a : Z \to \mathbb{R}$.

Consider a general initial wave function $\phi^\varepsilon \in L^2(\mathbb{R}^n)$, such that $\phi^\varepsilon$ corresponds to a probability measure $\rho_{\rm cl}(dq\, dp)$ on phase space in the sense that for all semiclassical observables with symbols $a \in C_b^\infty(Z)$,
$$\lim_{\varepsilon \to 0} \Big| \langle \phi^\varepsilon, a^{W,\varepsilon} \phi^\varepsilon \rangle - \int_Z a(q,p)\, \rho_{\rm cl}(dq\, dp) \Big| = 0 . \tag{19}$$
The definition is equivalent to saying that the Wigner transform of $\phi^\varepsilon$ converges to $\rho_{\rm cl}$ weakly on test functions in $C_b^\infty(Z)$ [17]. An immediate application of Egorov's theorem yields
$$\lim_{\varepsilon \to 0} \Big| \big\langle \phi^\varepsilon, e^{iH^\varepsilon_{\rm BO} t/\varepsilon}\, a^{W,\varepsilon}\, e^{-iH^\varepsilon_{\rm BO} t/\varepsilon} \phi^\varepsilon \big\rangle - \int_Z (a \circ \Phi^t)(q,p)\, \rho_{\rm cl}(dq\, dp) \Big| = 0 \tag{20}$$
uniformly on bounded intervals in time, where we recall that $\Phi^t$ is the flow generated by (16). In (20) one can of course shift the time evolution from the observables to the states on both sides and write instead
$$\lim_{\varepsilon \to 0} \Big| \langle \phi^\varepsilon_t, a^{W,\varepsilon} \phi^\varepsilon_t \rangle - \int_Z a(q,p)\, \rho_{\rm cl}(dq\, dp, t) \Big| = 0 . \tag{21}$$
Here $\phi^\varepsilon_t = e^{-iH^\varepsilon_{\rm BO} t/\varepsilon} \phi^\varepsilon$ and $\rho_{\rm cl}(dq\, dp, t) = (\rho_{\rm cl} \circ \Phi^{-t})(dq\, dp)$ is the initial distribution $\rho_{\rm cl}(dq\, dp)$ transported along the classical flow. Thus with respect to a certain type of experiments the system described by the wave function $\phi^\varepsilon_t$ behaves like a classical system.

For a molecular system the object of real interest is the left hand side of (21) with $\phi^\varepsilon_t$ replaced by the solution $\psi^\varepsilon_t$ of the full Schrödinger equation and $a^{W,\varepsilon} =: a^\varepsilon_{\rm BO}$ as acting on $L^2(\mathbb{R}^n)$ replaced by $a^{W,\varepsilon} \otimes 1$ as acting on $\mathcal{H}$. In order to compare the expectations of $a^\varepsilon_{\rm BO}$ with the expectations of $a^{W,\varepsilon} \otimes 1$, we need the following proposition.

Proposition 2. In addition to the assumptions of Theorem 2 let $a \in C_b^\infty(Z)$ with
$$\int d\xi\, \sup_{x \in \mathbb{R}^n} |\xi|\, |\hat a^{(2)}(x, \xi)| < \infty , \tag{22}$$
where $\hat{\ }^{(2)}$ denotes Fourier transformation in the second argument. Then there is a constant $C < \infty$ such that
$$\Big\| \big( a^{W,\varepsilon} \otimes 1 - U^* a^{W,\varepsilon}\, U \big)\, 1_{\Lambda-\delta}\, P_* \Big\| \le C\, \varepsilon .$$

For the proof of Proposition 2 see the end of Sect. 4.2. With its help we obtain the semiclassical limit for the nuclei as governed by the full Hamiltonian.

Corollary 1. Let $\Gamma$ and $I$ be as in Theorem 2. Let $\psi^\varepsilon \in \mathcal{H}$ be such that (19) is satisfied for $\phi^\varepsilon := U P_* \psi^\varepsilon$ for some $\rho_{\rm cl}$ with $\mathrm{supp}\,\rho_{\rm cl} \subset \Gamma - \alpha$. Let $\psi^\varepsilon_t = e^{-iH^\varepsilon t/\varepsilon} \psi^\varepsilon$, then for all $a \in C_b^\infty(Z)$ which satisfy (22)
$$\lim_{\varepsilon \to 0} \Big| \langle \psi^\varepsilon_t, (a^{W,\varepsilon} \otimes 1)\, \psi^\varepsilon_t \rangle - \int_Z a(q,p)\, \rho_{\rm cl}(dq\, dp, t) \Big| = 0 \tag{23}$$
uniformly for $t \in I$.
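Translated to the language of Wigner measures Corollary 1 states the following. Let us define the marginal Wigner transform for the nuclei as
$$W^\varepsilon_{\rm nuc}(\psi^\varepsilon_t)(q,p) := (2\pi)^{-n} \int_{\mathbb{R}^n} dX\, e^{iX\cdot p}\, \big\langle \psi^{\varepsilon *}_t(q + \varepsilon X/2),\, \psi^\varepsilon_t(q - \varepsilon X/2) \big\rangle_{\mathcal{H}_e} .$$
Then, whenever $W^\varepsilon_{\rm nuc}(P_*\psi^\varepsilon_0)(q,p)\, dq\, dp$ converges weakly to some probability measure $\rho_{\rm cl}(dq\, dp)$, $W^\varepsilon_{\rm nuc}(P_*\psi^\varepsilon_t)(q,p)\, dq\, dp$ converges weakly to $(\rho_{\rm cl} \circ \Phi^{-t})(dq\, dp)$.

Corollary 1 follows by applying first Proposition 2 and then Theorem 2 to the left-hand side in the difference (23), where we note that $\lim_{\varepsilon\to 0} (1 - P^\alpha_\Gamma)\psi^\varepsilon = 0$ and thus also $\lim_{\varepsilon\to 0} (1 - P_{\Lambda-\delta'})\psi^\varepsilon_t = 0$ for any $\delta' < \delta$. This yields the left hand side of (20) and thus (23).

We mention some standard examples of initial wave functions $\phi^\varepsilon$ of the nuclei which approximate certain classical distributions. The initial wave function for the full system is, as before, recovered as $\psi^\varepsilon = U^*\phi^\varepsilon = \phi^\varepsilon(X)\chi(X)$. In these examples one regains some control on the rate of convergence with respect to $\varepsilon$ which was lost in (19).

(i) Wave packets tracking a classical trajectory. For $\phi \in L^2(\mathbb{R}^n)$ let
$$\phi^\varepsilon_{q_0,p_0}(X) = \varepsilon^{-n/4}\, e^{-ip_0\cdot(X-q_0)/\varepsilon}\, \phi\Big(\frac{X - q_0}{\sqrt{\varepsilon}}\Big) .$$
Then $|\phi^\varepsilon_{q_0,p_0}(X)|^2$ is sharply peaked at $q_0$ for $\varepsilon$ small and its $\varepsilon$-scaled Fourier transform is sharply peaked at $p_0$. Thus one expects that the corresponding classical distribution is given by $\delta(q - q_0)\delta(p - p_0)\, dq\, dp$. As was shown, e.g. in [23], this is indeed true for $\phi \in L^2(\mathbb{R}^n)$ such that $\phi, |x|\phi, \hat\phi, |p|\hat\phi \in L^1(\mathbb{R}^n)$. Then Corollary 1 holds with (23) replaced by
$$\Big| \langle \psi^\varepsilon_t, (a^{W,\varepsilon} \otimes 1)\, \psi^\varepsilon_t \rangle - a(q(t), p(t)) \Big| = O(\sqrt{\varepsilon}) \Big( \|\phi\|^2_{L^2} + \|\phi\|_{L^1} \| |p|\hat\phi \|_{L^1} + \| |x|\phi \|_{L^1} \|\hat\phi\|_{L^1} \Big) , \tag{24}$$
where $(q(t), p(t))$ is the solution of the classical dynamics with initial condition $(q_0, p_0)$. Equation (24) generalizes Hagedorn's first order result in [9] to a larger class of localized wave functions.

(ii) Either sharp momentum or sharp position. For $\phi \in L^2(\mathbb{R}^n)$ let
$$\hat\phi^\varepsilon_{p_0}(p) = \hat\phi\Big(p - \frac{p_0}{\varepsilon}\Big) ,$$
where $\hat{\ }$ denotes the $\varepsilon$-scaled Fourier transformation, then the corresponding classical distribution is $\rho_{\rm cl}(dq\, dp) = \delta(p - p_0)|\phi(q)|^2\, dq\, dp$. Note that the absolute value of $\phi$ does not depend on $\varepsilon$ in that case. Equivalently one defines
$$\phi^\varepsilon_{q_0}(X) = \varepsilon^{-n/2}\, \phi\Big(\frac{X - q_0}{\varepsilon}\Big)$$
and obtains $\rho_{\rm cl}(dq\, dp) = \delta(q - q_0)|\hat\phi(p)|^2\, dq\, dp$. In both cases one finds that the difference in (23) is bounded by a constant times either $\varepsilon \big( \|\phi\|^2_{L^2} + \|\phi\|_{L^1} \| |p|\hat\phi \|_{L^1} \big)$ for $\phi^\varepsilon_{p_0}$ or $\varepsilon \big( \|\phi\|^2_{L^2} + \| |x|\phi \|_{L^1} \|\hat\phi\|_{L^1} \big)$ for $\phi^\varepsilon_{q_0}$.

(iii) WKB wave functions. For $f \in L^2(\mathbb{R}^n)$ and $S \in C^1(\mathbb{R}^n)$ both real valued let $\phi^\varepsilon(X) = f(X)\, e^{iS(X)/\varepsilon}$, then $\rho_{\rm cl}(dq\, dp) = f^2(q)\, \delta(p - \nabla S(q))\, dq\, dp$. In this case one expects that (23) is bounded as $\sqrt{\varepsilon}$, which has been shown in [23] for a smaller set of test functions.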
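The scaling prefactors in examples (i) and (ii) can be sanity-checked by a change of variables. The short computation below is an illustrative sketch rather than part of the original argument; it uses the substitution $y = (X - q_0)/\sqrt{\varepsilon}$ and, for the second line, additionally assumes that $|y|\,|\phi(y)|^2$ is integrable.

```latex
% Norm of the wave packet in example (i): the prefactor \varepsilon^{-n/4}
% exactly compensates the Jacobian dX = \varepsilon^{n/2} dy.
\| \phi^\varepsilon_{q_0,p_0} \|^2_{L^2}
  = \varepsilon^{-n/2} \int_{\mathbb{R}^n} dX\,
    \Big| \phi\Big( \frac{X - q_0}{\sqrt{\varepsilon}} \Big) \Big|^2
  = \int_{\mathbb{R}^n} dy\, |\phi(y)|^2
  = \| \phi \|^2_{L^2} .

% Mean position: it deviates from q_0 \|\phi\|^2 only at order \sqrt{\varepsilon}.
\int_{\mathbb{R}^n} dX\, X\, | \phi^\varepsilon_{q_0,p_0}(X) |^2
  = \int_{\mathbb{R}^n} dy\, ( q_0 + \sqrt{\varepsilon}\, y )\, |\phi(y)|^2
  = q_0\, \| \phi \|^2_{L^2}
    + \sqrt{\varepsilon} \int_{\mathbb{R}^n} dy\, y\, |\phi(y)|^2 .

% The same substitution with y = (X - q_0)/\varepsilon gives, for example (ii),
\| \phi^\varepsilon_{q_0} \|^2_{L^2}
  = \varepsilon^{-n} \int_{\mathbb{R}^n} dX\,
    \Big| \phi\Big( \frac{X - q_0}{\varepsilon} \Big) \Big|^2
  = \| \phi \|^2_{L^2} .
```

The second line illustrates, for the position observable only, why an error of order $\sqrt{\varepsilon}$ is the natural scale in (24).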
4. Proofs

4.1. Globally isolated bands. We collect some immediate consequences of $H_3$ and S. Using the Riesz formula
$$P_*(X) = -\frac{1}{2\pi i} \oint_{\gamma(X)} d\lambda\, R_\lambda(H_e(X)) , \tag{25}$$
with $\gamma(X)$ a smooth curve in the complex plane circling $\sigma_*(X)$ only and $R_\lambda(H_e(X)) = (H_e(X) - \lambda)^{-1}$, one easily shows that $P_*(\cdot) \in C_b^2(\mathbb{R}^n, \mathcal{L}(\mathcal{H}_e))$. Assumption S enters at this point, since it allows one to choose $\gamma(X)$ locally independent of $X$. Hence, when taking derivatives with respect to $X$ in (25), one only needs to differentiate the integrand. In particular one finds that
$$P_*^\perp(X)(\nabla_X P_*)(X)P_*(X) = \frac{1}{2\pi i} \oint_{\gamma(X)} d\lambda\, R_\lambda(H_e(X))\, P_*^\perp(X)\, (\nabla_X H_e)(X)\, R_\lambda(H_e(X))\, P_*(X) . \tag{26}$$
Since $P_*(X)(\nabla_X P_*)(X)P_*(X) = P_*^\perp(X)(\nabla_X P_*)(X)P_*^\perp(X) = 0$, which follows from $(\nabla_X P_*)(X) = (\nabla_X P_*^2)(X) = (\nabla_X P_*)(X)P_*(X) + P_*(X)(\nabla_X P_*)(X)$, we have that
$$(\nabla_X P_*)(X) = P_*^\perp(X)(\nabla_X P_*)(X)P_*(X) + \text{adjoint} . \tag{27}$$
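For readers who want the intermediate steps behind (27), the following sketch spells them out. It is an added illustration using only $P_*^2 = P_*$, $P_*^\perp = 1 - P_*$ and the self-adjointness of $\nabla_X P_*$; the argument $X$ is suppressed.

```latex
% Product rule applied to P_*^2 = P_*:
\nabla_X P_* = (\nabla_X P_*)\, P_* + P_*\, (\nabla_X P_*) .

% Sandwich between P_* and P_*, using P_*^2 = P_*:
P_* (\nabla_X P_*) P_* = 2\, P_* (\nabla_X P_*) P_*
  \quad\Longrightarrow\quad P_* (\nabla_X P_*) P_* = 0 .

% Sandwich between P_*^\perp and P_*^\perp, using P_* P_*^\perp = P_*^\perp P_* = 0:
P_*^\perp (\nabla_X P_*) P_*^\perp
  = P_*^\perp (\nabla_X P_*) P_* P_*^\perp + P_*^\perp P_* (\nabla_X P_*) P_*^\perp = 0 .

% Insert 1 = P_* + P_*^\perp on both sides; only the off-diagonal blocks survive:
\nabla_X P_*
  = P_*^\perp (\nabla_X P_*) P_* + P_* (\nabla_X P_*) P_*^\perp
  = P_*^\perp (\nabla_X P_*) P_* + \big( P_*^\perp (\nabla_X P_*) P_* \big)^* .
```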
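In (27) and in the following "+ adjoint" means that the adjoint operator of the first term in a sum is added. Starting with (12), we find, at the moment formally, that
$$e^{-iH^\varepsilon_{\rm diag} t/\varepsilon} - e^{-iH^\varepsilon t/\varepsilon} = e^{-iH^\varepsilon_{\rm diag} t/\varepsilon} \Big( 1 - e^{iH^\varepsilon_{\rm diag} t/\varepsilon}\, e^{-iH^\varepsilon t/\varepsilon} \Big) = i\, e^{-iH^\varepsilon_{\rm diag} t/\varepsilon} \int_0^{t/\varepsilon} ds\; e^{iH^\varepsilon_{\rm diag} s} \big( H^\varepsilon - H^\varepsilon_{\rm diag} \big)\, e^{-iH^\varepsilon s} , \tag{28}$$
where
$$H^\varepsilon - H^\varepsilon_{\rm diag} = P_*^\perp H^\varepsilon P_* + \text{adjoint} = P_*^\perp \Big[ \frac{\varepsilon^2}{2}\big(-i\nabla_X + A_{\rm ext}(X)\big)^2,\, P_* \Big] P_* + \text{adjoint} . \tag{29}$$
Let $D_A := -i\nabla_X + A_{\rm ext}(X)$. Then the commutator is easily calculated as
$$\Big[ \frac{\varepsilon^2}{2} (D_A \otimes 1)^2,\, P_* \Big] = -i\, \varepsilon\, (\nabla_X P_*) \cdot (\varepsilon D_A \otimes 1) + O(\varepsilon^2) \tag{30}$$
$$= -\varepsilon\, (\nabla_X P_*) \cdot (\varepsilon \nabla_X \otimes 1) + O(\varepsilon^2) , \tag{31}$$
where $O(\varepsilon^2)$ holds in the norm of $\mathcal{L}(\mathcal{H})$ as $\varepsilon \to 0$. For (30) and (31) it was used that $A_{\rm ext}(X)$ and $P_*(X)$ are both differentiable with bounded derivatives and that $A_{\rm ext}(X)$ commutes with $P_*$.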
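The "easily calculated" commutator can be expanded component by component. The computation below is an added sketch, not quoted from the source; it writes $D_j := -i\partial_{X_j} + A_{{\rm ext},j}(X)$, suppresses the factor $\otimes 1$, and uses only that $A_{\rm ext}$ is scalar-valued (so it commutes with $P_*$) and that $P_*$ is twice differentiable.

```latex
% First-order commutator:
[ D_j, P_* ] = -i\, (\partial_{X_j} P_*) .

% Second-order commutator via [D_j^2, P_*] = D_j [D_j, P_*] + [D_j, P_*] D_j:
[ D_j^2, P_* ]
  = -i \big( D_j (\partial_{X_j} P_*) + (\partial_{X_j} P_*) D_j \big)
  = -2i\, (\partial_{X_j} P_*)\, D_j - (\partial_{X_j}^2 P_*) .

% Summing over j and multiplying by \varepsilon^2 / 2:
\Big[ \frac{\varepsilon^2}{2} D_A^2,\, P_* \Big]
  = -i\, \varepsilon\, (\nabla_X P_*) \cdot (\varepsilon D_A)
    - \frac{\varepsilon^2}{2} (\Delta_X P_*) .

% Splitting D_A = -i \nabla_X + A_ext in the leading term:
-i\, \varepsilon\, (\nabla_X P_*) \cdot (\varepsilon D_A)
  = -\varepsilon\, (\nabla_X P_*) \cdot (\varepsilon \nabla_X)
    - i\, \varepsilon^2\, (\nabla_X P_*) \cdot A_{\rm ext} .
```

Under these assumptions the $O(\varepsilon^2)$ remainders in (30) and (31) are $-\frac{\varepsilon^2}{2}(\Delta_X P_*)$ and $-\frac{\varepsilon^2}{2}(\Delta_X P_*) - i\varepsilon^2 (\nabla_X P_*)\cdot A_{\rm ext}$, both bounded operators because $P_*(\cdot) \in C_b^2$ and $A_{\rm ext}$ is bounded.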
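Before we can continue, we need to justify (28) by showing that $H^\varepsilon_{\rm diag}$ is self-adjoint on $D(H^\varepsilon)$. To see this, note that $-i\varepsilon\nabla_X$ is bounded with respect to $\varepsilon^2\Delta_X$ with relative bound 0 and that for $\psi \in D(H^\varepsilon)$,
$$\big\| (\varepsilon^2 \Delta_X \otimes 1)\, \psi \big\| \le c_1 \Big( \big\| (\varepsilon^2 D_A^2 \otimes 1)\, \psi \big\| + \|\psi\| \Big) \le c_2 \Big( \big\| (\varepsilon^2 D_A^2 \otimes 1 + 1 \otimes H_e^0)\, \psi \big\| + \|\psi\| \Big) \le c_3 \big( \| H^\varepsilon \psi \| + \|\psi\| \big) , \tag{32}$$
where we used that $H_e^0$ is bounded from below and that $H_e^1$ is bounded. Hence $H^\varepsilon - H^\varepsilon_{\rm diag}$ is infinitesimally operator bounded with respect to $H^\varepsilon$, consequently $H^\varepsilon_{\rm diag}$ is self-adjoint on $D(H^\varepsilon)$ and thus (28) holds on $D(H^\varepsilon)$. Equations (29) and (31) in (28) give
$$P_*^\perp \Big( e^{-iH^\varepsilon_{\rm diag} t/\varepsilon} - e^{-iH^\varepsilon t/\varepsilon} \Big) = -i\varepsilon\, e^{-iH^\varepsilon_{\rm diag} t/\varepsilon} \int_0^{t/\varepsilon} ds\; e^{iH^\varepsilon_{\rm diag} s}\, P_*^\perp (\nabla_X P_*) P_* \cdot (\varepsilon\nabla_X \otimes 1)\, e^{-iH^\varepsilon s} + O(\varepsilon)|t| , \tag{33}$$
where we used that the term of order $O(\varepsilon^2)$ in (31) yields a term of order $O(\varepsilon)|t|$ after integration, since all other expressions in the integrand are bounded uniformly in time and the domain of integration grows like $t/\varepsilon$. In (33) and in the following we omit the adjoint term from (29) and thus consider the difference of the groups projected on $\mathrm{Ran}\,P_*^\perp$ only. The argument for the difference projected on $\mathrm{Ran}\,P_*$ goes through analogously by taking adjoints at the appropriate places.

Now $\varepsilon(\nabla_X P_*) \cdot (\varepsilon\nabla_X \otimes 1)$ is only $O(\varepsilon)$ in the norm of $\mathcal{L}(W^{1,\varepsilon} \otimes \mathcal{H}_e, \mathcal{H})$ and thus, according to the naive argument, only $O(1)|t|$ after integration. As in [13] and [23] we proceed by writing $(\nabla_X P_*) \cdot (\varepsilon\nabla_X \otimes 1)$ as the commutator of a bounded operator $B$ with $H^\varepsilon$ modulo terms of order $O(\varepsilon)$. This is in analogy to the proof of the time-adiabatic theorem [15] and allows one to write the first order part of the integrand in (33) as the time derivative of a bounded operator and, as a consequence, to do the integration without losing one order in $\varepsilon$. In view of (26) we define
$$\tilde B(X) := \frac{1}{2\pi i} \oint_{\gamma(X)} d\lambda\, R_\lambda(H_e(X))^2\, P_*^\perp(X)\, (\nabla_X H_e)(X)\, R_\lambda(H_e(X))\, P_*(X) . \tag{34}$$
An easy calculation shows that
$$\big[ H_e, \tilde B \big] = -P_*^\perp (\nabla_X P_*) P_* . \tag{35}$$
By assumption $\partial_{X_j} H_e(X) \in C^2(\mathbb{R}^n, \mathcal{L}(\mathcal{H}_e))$, $j = 1, \ldots, n$, hence $\tilde B_j(X) \in C^2(\mathbb{R}^n, \mathcal{L}(\mathcal{H}_e))$ and thus
$$\Big[ \frac{\varepsilon^2}{2} D_A^2 \otimes 1,\, \tilde B \Big] = -\varepsilon\, (\nabla_X \tilde B) \cdot (\varepsilon\nabla_X \otimes 1) + O(\varepsilon^2) = O(\varepsilon) \tag{36}$$
in the norm of $\mathcal{L}(W^{1,\varepsilon} \otimes \mathcal{H}_e, \mathcal{H})$. Equations (35) and (36) combined yield that
$$\big[ H^\varepsilon, \tilde B \big] = -P_*^\perp (\nabla_X P_*) P_* + O(\varepsilon)$$
with $O(\varepsilon)$ in the norm of $\mathcal{L}(W^{1,\varepsilon} \otimes \mathcal{H}_e, \mathcal{H})$. Since $\nabla_X H_e \in \mathcal{L}(\mathcal{H})$, a short calculation shows that $[H^\varepsilon, \varepsilon\nabla_X \otimes 1] = O(\varepsilon)$ in $\mathcal{L}(W^{1,\varepsilon} \otimes \mathcal{H}_e, \mathcal{H})$. Hence we define
$$B := \tilde B \cdot (\varepsilon\nabla_X \otimes 1)$$
and obtain
$$\big[ H^\varepsilon, B \big] = -P_*^\perp (\nabla_X P_*) P_* \cdot (\varepsilon\nabla_X \otimes 1) + O(\varepsilon)$$
with $O(\varepsilon)$ in the norm of $\mathcal{L}(W^{1,\varepsilon} \otimes \mathcal{H}_e, \mathcal{H})$. Let
$$B(s) = e^{iH^\varepsilon s}\, B\, e^{-iH^\varepsilon s} ,$$
then
$$-i\, \frac{d}{ds} B(s) = e^{iH^\varepsilon s}\, [H^\varepsilon, B]\, e^{-iH^\varepsilon s} .$$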
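The derivative formula for $B(s)$ and the way it turns a commutator into a total time derivative can be verified as follows. This is an added sketch in which all manipulations are taken formally, on a domain where $H^\varepsilon$ and $B$ can be composed.

```latex
% Product rule for B(s) = e^{iH^\varepsilon s} B e^{-iH^\varepsilon s}:
\frac{d}{ds} B(s)
  = e^{iH^\varepsilon s} \big( i H^\varepsilon B - i B H^\varepsilon \big) e^{-iH^\varepsilon s}
  = i\, e^{iH^\varepsilon s}\, [ H^\varepsilon, B ]\, e^{-iH^\varepsilon s} .

% Multiply from the left by e^{-iH^\varepsilon s}:
[ H^\varepsilon, B ]\, e^{-iH^\varepsilon s}
  = -i\, e^{-iH^\varepsilon s}\, \frac{d}{ds} B(s) .

% Consequently, a commutator inside a Duhamel-type integrand becomes a
% total derivative, which can be integrated by parts:
i\, \varepsilon \int_0^{t/\varepsilon} ds\;
  e^{iH^\varepsilon_{\rm diag} s}\, [ H^\varepsilon, B ]\, e^{-iH^\varepsilon s}
  = \varepsilon \int_0^{t/\varepsilon} ds\;
  e^{iH^\varepsilon_{\rm diag} s}\, e^{-iH^\varepsilon s}\, \frac{d}{ds} B(s) .
```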
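Continuing (33), we have
$$P_*^\perp \Big( e^{-iH^\varepsilon_{\rm diag} t/\varepsilon} - e^{-iH^\varepsilon t/\varepsilon} \Big) = i\varepsilon\, e^{-iH^\varepsilon_{\rm diag} t/\varepsilon} \int_0^{t/\varepsilon} ds\; e^{iH^\varepsilon_{\rm diag} s}\, \big[ H^\varepsilon, B \big]\, e^{-iH^\varepsilon s} + O(\varepsilon)(|t| + |t|^2)$$
$$= \varepsilon\, e^{-iH^\varepsilon_{\rm diag} t/\varepsilon} \int_0^{t/\varepsilon} ds\; e^{iH^\varepsilon_{\rm diag} s}\, e^{-iH^\varepsilon s}\, \Big( \frac{d}{ds} B(s) \Big) + O(\varepsilon)(|t| + |t|^2) , \tag{37}$$
where $O(\varepsilon)$ holds now in the norm of $\mathcal{L}(W^{1,\varepsilon} \otimes \mathcal{H}_e, \mathcal{H})$. The additional factor of $|t|$ in (37) comes from the fact that
$$\big\| e^{-iH^\varepsilon s} \big\|_{\mathcal{L}(W^{1,\varepsilon} \otimes \mathcal{H}_e)} \le c\, (1 + \varepsilon |s|) \tag{38}$$
for some constant $c < \infty$, i.e. the scaled momentum of the nuclei may grow in time. Using $\|A_{\rm ext}\|_\infty = C < \infty$ and
$$\big\| \big[ (\varepsilon D_A \otimes 1), H^\varepsilon \big] \big\|_{\mathcal{L}(\mathcal{H})} \le \tilde C\, \varepsilon ,$$
(38) follows from
$$\big\| (-i\varepsilon\nabla_X \otimes 1)\, e^{-iH^\varepsilon s} \psi \big\| \le \big\| (\varepsilon D_A \otimes 1)\, e^{-iH^\varepsilon s} \psi \big\| + \big\| (\varepsilon A_{\rm ext} \otimes 1)\, e^{-iH^\varepsilon s} \psi \big\|$$
$$\le \big\| (\varepsilon D_A \otimes 1)\, \psi \big\| + \big\| \big[ (\varepsilon D_A \otimes 1), e^{-iH^\varepsilon s} \big] \psi \big\| + C \|\psi\|$$
$$\le \big\| (-i\varepsilon\nabla_X \otimes 1)\, \psi \big\| + \tilde C\, \varepsilon |s|\, \|\psi\| + 2 C \|\psi\|$$
for $\psi \in W^1 \otimes \mathcal{H}_e$.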
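Finally, continuing (37), integration by parts yields
$$P_*^\perp \Big( e^{-iH^\varepsilon_{\rm diag} t/\varepsilon} - e^{-iH^\varepsilon t/\varepsilon} \Big) = \varepsilon\, e^{-iH^\varepsilon_{\rm diag} t/\varepsilon} \int_0^{t/\varepsilon} ds\; e^{iH^\varepsilon_{\rm diag} s}\, e^{-iH^\varepsilon s}\, \Big( \frac{d}{ds} B(s) \Big) + O(\varepsilon)(|t| + |t|^2)$$
$$= \varepsilon \Big( B\, e^{-iH^\varepsilon t/\varepsilon} - e^{-iH^\varepsilon_{\rm diag} t/\varepsilon} B \Big) + i\varepsilon\, e^{-iH^\varepsilon_{\rm diag} t/\varepsilon} \int_0^{t/\varepsilon} ds\; e^{iH^\varepsilon_{\rm diag} s} \big( H^\varepsilon - H^\varepsilon_{\rm diag} \big) B\, e^{-iH^\varepsilon s} + O(\varepsilon)(|t| + |t|^2)$$
$$= O(\varepsilon) \big( 1 + |t| \big)^3 , \tag{39}$$
where $O(\varepsilon)$ holds in the norm of $\mathcal{L}(W^{2,\varepsilon} \otimes \mathcal{H}_e, \mathcal{H})$. For the last equality we used that $B$ is bounded in $\mathcal{L}(W^{2,\varepsilon} \otimes \mathcal{H}_e, \mathcal{H})$ as well as in $\mathcal{L}(W^{2,\varepsilon} \otimes \mathcal{H}_e, W^{1,\varepsilon} \otimes \mathcal{H}_e)$ uniformly with respect to $\varepsilon$, that $H^\varepsilon - H^\varepsilon_{\rm diag}$ is $O(\varepsilon)$ in $\mathcal{L}(W^{1,\varepsilon} \otimes \mathcal{H}_e, \mathcal{H})$, as we saw in (29) and (31), and that
$$\big\| e^{-iH^\varepsilon s} \big\|_{\mathcal{L}(W^{2,\varepsilon} \otimes \mathcal{H}_e)} \le c\, (1 + \varepsilon |s|)^2 \tag{40}$$
for some constant $c < \infty$. Equation (40) follows from arguments similar to those used in the proof of (38).

We are left to prove (13). This follows from exactly the same proof using that $\mathcal{E}(H^\varepsilon)$ commutes with $e^{-iH^\varepsilon s}$ and that, according to (32),
$$\big\| (\varepsilon^2 \Delta_X \otimes 1)\, \mathcal{E}(H^\varepsilon)\, \psi \big\| \le c_3 \big( \| H^\varepsilon \mathcal{E}(H^\varepsilon)\, \psi \| + \|\psi\| \big) \le c_4\, (|E| + 1)\, \|\psi\| .$$

4.2. Locally isolated bands. To prove Theorem 2 we proceed along the same lines as in the previous section, with the one modification that we use Proposition 1 to control the flux out of $\partial\Lambda$. However, one cannot use $P_* = \int_\Lambda^\oplus dX\, P_*(X)$ to define $H^\varepsilon_{\rm diag}$ anymore, because the functions in its range would not be in the range of $H^\varepsilon$ and some smoothing in the cutoff is needed. For $i \in \{0, 1, 2, 3\}$ let $1_i = 1_{(\Lambda - \frac{4-i}{5}\delta,\, \frac{1}{5}\delta)}$ be approximate characteristic functions according to Definition 1. Then the smoothed projections are defined with $P_i(X) = 1_i(X) P_*(X)$ as $P_i = \int^\oplus dX\, P_i(X)$. In the following it will be used that for $i < j$ we have $P_i P_j = P_j P_i = P_i$, and hence $(1 - P_j) P_i = P_i (1 - P_j) = 0$. Proposition 1 yields
$$\Big( e^{-iH^\varepsilon t/\varepsilon} - U^* e^{-iH^\varepsilon_{\rm BO} t/\varepsilon}\, U \Big) P^\alpha_\Gamma = \Big( e^{-iH^\varepsilon t/\varepsilon} - P_1 U^* e^{-iH^\varepsilon_{\rm BO} t/\varepsilon}\, U \Big) P^\alpha_\Gamma + O(\varepsilon) . \tag{41}$$
We also make use of the fact that the phase space support of the initial wave function lies in $\Gamma$ and has thus bounded energy with respect to $H_{\rm cl}$. Let $E := \sup_{z \in \Gamma} H_{\rm cl}(z) < \infty$, let $1_{((-\infty,E+\alpha),\alpha)}$ be a smooth characteristic function on $\mathbb{R}$ and let
$$\mathcal{E} := \big( 1_{((-\infty,E+\alpha),\alpha)}(H_{\rm cl}(\cdot)) \big)^{W,\varepsilon} .$$
Then standard results from semiclassical analysis imply the following relations.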
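Proposition 3.
(a) $1^{W,\varepsilon}_{(\Gamma,\alpha)} = \mathcal{E}\, 1^{W,\varepsilon}_{(\Gamma,\alpha)} + O(\varepsilon)$;
(b) $e^{-iH^\varepsilon_{\rm BO} t/\varepsilon}\, \mathcal{E} = \mathcal{E}\, e^{-iH^\varepsilon_{\rm BO} t/\varepsilon} + O(\varepsilon)$ uniformly for $t \in I$;
(c) $[H^\varepsilon_{\rm BO}, \mathcal{E}] = O(\varepsilon^2)$;
(d) $\mathcal{E} \in \mathcal{L}(L^2(\mathbb{R}^n), W^{2,\varepsilon})$.
In (a)–(c) $O(\varepsilon)$ resp. $O(\varepsilon^2)$ hold in the norm of $\mathcal{L}(L^2(\mathbb{R}^n))$.

Proposition 3 (a), (c) and (d) are direct consequences of the product rule for pseudodifferential operators (see, e.g., [21, 6]) and (b) is again Egorov's Theorem. Using Proposition 3 (a) and (b) we continue (41) and obtain
$$\Big( e^{-iH^\varepsilon t/\varepsilon} - P_1 U^* e^{-iH^\varepsilon_{\rm BO} t/\varepsilon}\, U \Big) P^\alpha_\Gamma = \Big( e^{-iH^\varepsilon t/\varepsilon} - P_1 U^* \mathcal{E}\, e^{-iH^\varepsilon_{\rm BO} t/\varepsilon}\, U \Big) P^\alpha_\Gamma + O(\varepsilon) .$$
We proceed as in the globally isolated band case and write
$$\Big( e^{-iH^\varepsilon t/\varepsilon} - P_1 U^* \mathcal{E}\, e^{-iH^\varepsilon_{\rm BO} t/\varepsilon}\, U \Big) P^\alpha_\Gamma = -i\, e^{-iH^\varepsilon t/\varepsilon} \int_0^{t/\varepsilon} ds\; e^{iH^\varepsilon s} \Big( H^\varepsilon P_1 U^* \mathcal{E} - P_1 U^* \mathcal{E}\, H^\varepsilon_{\rm BO} \Big) e^{-iH^\varepsilon_{\rm BO} s}\, U P^\alpha_\Gamma$$
$$= -i\, e^{-iH^\varepsilon t/\varepsilon} \int_0^{t/\varepsilon} ds\; e^{iH^\varepsilon s} \big( H^\varepsilon - H^\varepsilon_{\rm diag} \big) P_1 U^* \mathcal{E}\, e^{-iH^\varepsilon_{\rm BO} s}\, U P^\alpha_\Gamma \tag{42}$$
$$\quad - i\, e^{-iH^\varepsilon t/\varepsilon} \int_0^{t/\varepsilon} ds\; e^{iH^\varepsilon s} \Big( H^\varepsilon_{\rm diag} P_1 U^* \mathcal{E} - P_1 U^* \mathcal{E}\, H^\varepsilon_{\rm BO} \Big) e^{-iH^\varepsilon_{\rm BO} s}\, U P^\alpha_\Gamma , \tag{43}$$
where
$$H^\varepsilon_{\rm diag} := P_3 H^\varepsilon P_3 .$$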
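One can now show that (42) is bounded in norm by a constant times $\varepsilon(1 + |t|)$ using exactly the same sequence of arguments as in the proof in the previous section. One must only keep track of the "hierarchy" of smoothed projections, e.g., instead of (29) one has
$$\big( H^\varepsilon - H^\varepsilon_{\rm diag} \big) P_1 = (1 - P_3) \Big[ -\frac{\varepsilon^2}{2} \Delta_X \otimes 1,\, P_2 \Big] P_1 + O(\varepsilon^2) .$$
The adjoint part drops out completely, because this time only the difference on the band, i.e. on $\mathrm{Ran}\,P_1$, is of interest. Note also that the smoothed projections $P_i$ are bounded operators on the respective scaled Sobolev spaces and thus, according to Proposition 3 (d), all estimates hold in the norm of $\mathcal{L}(\mathcal{H})$.

It remains to show that also (43) is $O(\varepsilon)$. First note that, according to Proposition 3 (c), commuting $\mathcal{E}$ and $H^\varepsilon_{\rm BO}$ yields an error of order $O(\varepsilon^2)$ in the integrand and thus an error of order $O(\varepsilon)$ after integration. For $\phi \in W^2$ we compute
$$\big( H^\varepsilon_{\rm diag} P_1 U^* \phi \big)(X) = 1_1(X)\, E(X)\, \phi(X)\chi(X) + 1_1(X) \Big( \frac{\varepsilon^2}{2} \big( -i\nabla_X + A_{\rm ext} \big)^2 \phi \Big)(X)\, \chi(X)$$
$$\quad + \varepsilon\, 1_1(X)\, (-i\varepsilon\nabla\phi)(X) \cdot \big( -i \langle \chi(X), \nabla_X \chi(X) \rangle_{\mathcal{H}_e} \big)\, \chi(X) - i\, \varepsilon\, (\nabla 1_1)(X) \cdot (-i\varepsilon\nabla\phi)(X)\, \chi(X) + O(\varepsilon^2) .$$
On the other hand, again for $\phi \in W^2$,
$$\big( P_1 U^* H^\varepsilon_{\rm BO}\, \phi \big)(X) = 1_1(X)\, E(X)\, \phi(X)\chi(X) + 1_1(X) \Big( \frac{\varepsilon^2}{2} \big( -i\nabla_X + A_{\rm ext} \big)^2 \phi \Big)(X)\, \chi(X) + \varepsilon\, 1_1(X)\, (-i\varepsilon\nabla\phi)(X) \cdot A_{\rm geo}(X)\, \chi(X) + O(\varepsilon^2) .$$
Hence
$$H^\varepsilon_{\rm diag} P_1 U^* \mathcal{E} - P_1 U^* H^\varepsilon_{\rm BO}\, \mathcal{E} = -\varepsilon\, U^* (\nabla 1_1) \cdot \varepsilon\nabla_X\, \mathcal{E} + O(\varepsilon^2) .$$
Thus the norm of (43) is, up to an error of order $O(\varepsilon)$, bounded by the norm of
$$\varepsilon\, U^* \int_0^{t/\varepsilon} ds\; (\nabla 1_1) \cdot \varepsilon\nabla_X\, \mathcal{E}\, e^{-iH^\varepsilon_{\rm BO} s}\, U P^\alpha_\Gamma . \tag{44}$$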
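$(\nabla 1_1) \cdot \varepsilon\nabla_X\, \mathcal{E}$ is a bounded operator and we can apply Proposition 1 in the integrand of (44) once more, this time however with the smoothed projection $P_0$, and obtain
$$(44) = \varepsilon\, U^* \int_0^{t/\varepsilon} ds\; (\nabla 1_1) \cdot \varepsilon\nabla_X\, \mathcal{E}\, 1_0\, e^{-iH^\varepsilon_{\rm BO} s}\, U P^\alpha_\Gamma + O(\varepsilon) = O(\varepsilon) . \tag{45}$$
The last equality in (45) follows from the fact that $[\varepsilon\nabla_X\, \mathcal{E}, 1_0] = O(\varepsilon)$ and that $(\nabla 1_1)$ and $1_0$ are disjointly supported.

Proof of Proposition 2. For the following calculations we continue $\chi(\cdot) \in C_b^\infty(\Lambda, \mathcal{H}_e)$ arbitrarily to a function $\chi(\cdot) \in C_b^\infty(\mathbb{R}^n, \mathcal{H}_e)$ by possibly modifying it on $\Lambda \setminus (\Lambda - \delta/2)$. For $\phi$ in a dense subset of $L^2(\Lambda - \delta)$ and $X \in \Lambda - \delta/2$, by making the substitutions $\tilde k = \varepsilon k$ and $\tilde Y = (Y - X)/\varepsilon$ and using the Taylor expansion with remainder, we have:
$$\big( (a^{W,\varepsilon} \otimes 1)\, \phi\chi \big)(X) = (2\pi)^{-n} \int dY\, dk\; a\Big( \frac{X+Y}{2},\, \varepsilon k \Big)\, e^{-i(X-Y)\cdot k}\, \phi(Y)\chi(Y)$$
$$= (2\pi)^{-n} \int d\tilde Y\; \hat a^{(2)}\Big( X + \frac{\varepsilon \tilde Y}{2},\, -\tilde Y \Big)\, \phi(X + \varepsilon \tilde Y)\, \chi(X)$$
$$\quad + \varepsilon\, (2\pi)^{-n} \int d\tilde Y\; \hat a^{(2)}\Big( X + \frac{\varepsilon \tilde Y}{2},\, -\tilde Y \Big)\, \phi(X + \varepsilon \tilde Y)\; \tilde Y \cdot \nabla_X \chi\big( f(X, \varepsilon \tilde Y) \big)$$
$$= \big( U^* a^{W,\varepsilon}\, U \phi\chi \big)(X) + R^\varepsilon . \tag{46}$$
From (46) we conclude that
$$\Big\| \big( 1_{\Lambda-\delta/2}(\cdot) \otimes 1 \big) \big( a^{W,\varepsilon} \otimes 1 - U^* a^{W,\varepsilon}\, U \big) P_{\Lambda-\delta} \Big\| \le \| R^\varepsilon \| .$$
Since
$$\big( 1 - 1_{\Lambda-\delta/2}(\cdot) \otimes 1 \big) \big( a^{W,\varepsilon} \otimes 1 - U^* a^{W,\varepsilon}\, U \big) P_{\Lambda-\delta} = \big( 1 - 1_{\Lambda-\delta/2}(\cdot) \otimes 1 \big) \big( a^{W,\varepsilon} \otimes 1 - U^* a^{W,\varepsilon}\, U \big) \big( 1_{\Lambda-\delta}(\cdot) \otimes 1 \big) P_{\Lambda-\delta} = O(\varepsilon^n)$$
for arbitrary $n$, Proposition 2 follows by showing that $R^\varepsilon$ is of order $\varepsilon$:
$$\| R^\varepsilon \| \le \varepsilon\, (2\pi)^{-n} \Big\| \int d\tilde Y\; \hat a^{(2)}\Big( \cdot + \frac{\varepsilon \tilde Y}{2},\, -\tilde Y \Big)\, \phi(\cdot + \varepsilon \tilde Y)\; \tilde Y \cdot \nabla_X \chi\big( f(\cdot, \varepsilon \tilde Y) \big) \Big\|_{\mathcal{H}}$$
$$\le \varepsilon\, (2\pi)^{-n} \sup_{X \in \mathbb{R}^n} \| (\nabla_X \chi)(X) \|_{\mathcal{H}_e} \times \Big\| \int d\tilde Y\; \Big| \hat a^{(2)}\Big( \cdot + \frac{\varepsilon \tilde Y}{2},\, -\tilde Y \Big) \Big|\, |\tilde Y|\, |\phi(\cdot + \varepsilon \tilde Y)| \Big\|_{L^2(\mathbb{R}^n)}$$
$$\le \varepsilon\, C\, \|\phi\|_{L^2(\mathbb{R}^n)} \int d\tilde Y\; \sup_{X \in \mathbb{R}^n} |\tilde Y|\, |\hat a^{(2)}(X, \tilde Y)| = \varepsilon\, \tilde C\, \|\phi\chi\|_{\mathcal{H}} .$$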
Acknowledgement. We are grateful to André Martinez and Gheorghe Nenciu for explaining to us their work in great detail. S. T. would like to thank George Hagedorn for stimulating discussions and, in particular, for helpful advice on questions concerning the Berry connection and Markus Klein and Ruedi Seiler for explaining their treatment of Coulomb singularities. We thank Caroline Lasser and Gianluca Panati for careful reading of the manuscript and the referee for pointing out Reference [12].
References
1. Avron, J.E. and Elgart, A.: Adiabatic theorems without a gap condition. Commun. Math. Phys. 203, 445–463 (1999)
2. Bornemann, F. and Schütte, C.: On the singular limit of the quantum-classical molecular dynamics model. SIAM J. Appl. Math. 59, 1208–1224 (1999)
3. Born, M. and Oppenheimer, R.: Zur Quantentheorie der Molekeln. Ann. Phys. (Leipzig) 84, 457–484 (1927)
4. Bouzouina, A. and Robert, D.: Uniform semi-classical estimates for the propagation of Heisenberg observables. Math. Phys. Preprint Archive mp_arc 99–409 (1999)
5. Combes, J.-M., Duclos, P., Seiler, R.: The Born–Oppenheimer approximation. In: Rigorous Atomic and Molecular Physics, eds. G. Velo, A. Wightman, Plenum: New York, 1981, pp. 185–212
6. Dimassi, M. and Sjöstrand, J.: Spectral Asymptotics in the Semi-Classical Limit. London Mathematical Society Lecture Note Series 268, Cambridge: Cambridge University Press, 1999
7. Fermanian Kammerer, C. and Gérard, P.: A Landau–Zener formula for two-scaled Wigner measures. Preprint (2001)
8. Hagedorn, G.A.: High order corrections to the time-independent Born–Oppenheimer approximation I: Smooth potentials. Ann. Inst. H. Poincaré Sect. A 47, 1–19 (1987)
9. Hagedorn, G.A.: A time dependent Born–Oppenheimer approximation. Commun. Math. Phys. 77, 1–19 (1980)
10. Hagedorn, G.A. and Joye, A.: A time-dependent Born–Oppenheimer approximation with exponentially small error estimates. Math. Phys. Preprint Archive mp_arc 00-209 (2000)
11. Hagedorn, G.A.: Molecular Propagation Through Electronic Eigenvalue Crossings. Memoirs Am. Math. Soc. 536 (1994)
12. Herrin, J. and Howland, J.S.: The Born–Oppenheimer approximation: Straight-up and with a twist. Rev. Math. Phys. 9, 467–488 (1997)
13. Hövermann, F., Spohn, H., Teufel, S.: Semiclassical limit for the Schrödinger equation with a short scale periodic potential. Commun. Math. Phys. 215, 609–629 (2001)
14. Joye, A. and Pfister, C.-E.: Quantum adiabatic evolution. In: On Three Levels, eds. M. Fannes, C. Maes, A. Verbeure, New York: Plenum, 1994, pp. 139–148
15. Kato, T.: On the adiabatic theorem of quantum mechanics. Phys. Soc. Jap. 5, 435–439 (1958)
16. Klein, M., Martinez, A., Seiler, R., Wang, X.P.: On the Born–Oppenheimer expansion for polyatomic molecules. Commun. Math. Phys. 143, 607–639 (1992)
17. Lions, P.L. and Paul, T.: Sur les mesures de Wigner. Revista Mathematica Iberoamericana 9, 553–618 (1993)
18. Martinez, A. and Sordoni, V.: On the time-dependent Born–Oppenheimer approximation with smooth potential. Math. Phys. Preprint Archive mp_arc 01-37 (2001)
19. Mead, V. and Truhlar, D.G.: On the determination of Born–Oppenheimer nuclear motion wave functions including complications due to conical intersections and identical nuclei. J. Chem. Phys. 70, 2284–2296 (1979)
20. Nenciu, G. and Sordoni, V.: Semiclassical limit for multistate Klein-Gordon systems: Almost invariant subspaces and scattering theory. Math. Phys. Preprint Archive mp_arc 01-36 (2001)
21. Robert, D.: Autour de l'Approximation Semi-Classique. Progress in Mathematics, Volume 68, Basel–Boston: Birkhäuser, 1987
22. Shapere, A. and Wilczek, F. (eds): Geometric Phases in Physics. Singapore: World Scientific, 1989
23. Teufel, S. and Spohn, H.: Semiclassical motion of dressed electrons. Preprint ArXiv.org math-ph/0010009, to appear in Rev. Math. Phys. (2001)
24. Teufel, S.: Adiabatic decoupling for perturbations of fibered Hamiltonians. In preparation

Communicated by B. Simon
Commun. Math. Phys. 224, 133 – 152 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Resonance Theory for Schrödinger Operators

O. Costin, A. Soffer
Department of Mathematics, Rutgers University, Piscataway, NJ 08854-8019, USA
Received: 10 November 2000 / Accepted: 5 September 2001

Dedicated to J. L. Lebowitz, on the occasion of his 70th birthday

Abstract: Resonances which result from perturbation of embedded eigenvalues are studied by time dependent methods. A general theory is developed, with new and weaker conditions, allowing for perturbations of threshold eigenvalues and relaxed Fermi Golden rule. The exponential decay rate of resonances is addressed; its uniqueness in the time dependent picture is shown in certain cases. The relation to the existence of meromorphic continuation of the properly weighted Green's function to time dependent resonance is further elucidated, by giving an equivalent time dependent asymptotic expansion of the solutions of the Schrödinger equation.

1. Introduction and Results

1.1. General remarks. Resonances may be defined in different ways, but usually refer to metastable behavior (in time) of the corresponding system. The standard physics definition would be as "bumps" in the scattering cross section, or exponentially decaying states in time, or poles of the analytically continued S matrix (when such an extension exists). Mathematically, in the last 25 years one uses a definition close to the above, by defining $\lambda$ to be a resonance (energy) if it is the pole of the meromorphic continuation of the weighted Green's function $\chi(H - z)^{-1}\chi$ with suitable weights $\chi$ (usually, in the Schrödinger Theory context, $\chi$ will be a $C_0^\infty$ function). Here $H$ is the Hamiltonian of the system.

In many cases the equivalence of some of the above definitions has been shown [1–3]. However, the exponential behavior in time, and the correct estimates on the remainder are difficult to produce in general [21]. It is also not clear how to relate the time behavior to a resonance, uniquely, and whether "analytic continuation" plays a fundamental role; see the review [5]. Important progress on such relations has recently been obtained; Orth [6] considered the time dependent behavior of states which can be related to resonances without the assumption of analytic continuation and established some preliminary estimates on the remainder terms. Then, Hunziker [7] was able to develop a quite general relation between resonances defined via poles of analytic continuations in the context of Balslev–Combes theory, to exponential decay in time, governed by the standard Fermi Golden rule. Here the resonances were small perturbations of embedded eigenvalues. In [1] a definition of resonance in a time dependent way is given and it is shown to agree with the one resulting from analytic continuation when it exists, in the Balslev–Combes theory. They also get exponential decay and estimates on the remainder terms. Exact solutions, including the case of large perturbations, for time dependent potentials have recently been obtained in [8]. Further notable results on the time dependent behavior of the wave equation were proved by Tang and Zworski [9]. The construction of states which resemble resonances, and thus decay approximately exponentially, was accomplished e.g. in [10]. For resonance theory based on the Balslev–Combes method the reader is referred to the book [21] and its comprehensive bibliography on the subject. Then, in a time-dependent approach to perturbation of embedded eigenvalues developed in [11] exponential decay and dispersive estimates on the remainder terms were proved in a general context, without the assumption of analytic continuation.

When an embedded eigenvalue is slightly perturbed, we generally get a "resonance". One then expects the solution at time $t$ to be a sum of an exponentially decaying term plus a small term (in the perturbation size) which, however, decays slowly. The lifetime of the resonance is given by $\Gamma^{-1}$, where $\Gamma$, the probability of decay per unit time, enters in the exponential decay rate $p(t) \sim e^{-\Gamma t/\hbar}$.
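The identification of $\Gamma^{-1}$ with a lifetime can be made explicit by a one-line computation. The sketch below is an added illustration and rests on an idealization that is not claimed in the text: it assumes that the survival probability is exactly $p(t) = e^{-\Gamma t/\hbar}$ for all $t \ge 0$, so that $-p'(t)$ is the probability density of the decay time.

```latex
% Decay-time density under the idealized exponential law:
-\frac{d}{dt}\, p(t) = \frac{\Gamma}{\hbar}\, e^{-\Gamma t/\hbar},
\qquad
\int_0^\infty \frac{\Gamma}{\hbar}\, e^{-\Gamma t/\hbar}\, dt = 1 .

% Mean decay time (integration by parts):
\tau = \int_0^\infty t\, \frac{\Gamma}{\hbar}\, e^{-\Gamma t/\hbar}\, dt
     = \int_0^\infty e^{-\Gamma t/\hbar}\, dt
     = \frac{\hbar}{\Gamma} .
```

In units with $\hbar = 1$ this gives $\tau = \Gamma^{-1}$, the lifetime quoted above.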
If an analytic continuation of χ (H0 − z)−1 χ exists in a neighborhood of an embedded eigenvalue, then = −2z0 , and a resonance z0 is defined as the pole of the analytic continuation of χ (H − z)−1 χ . In this case, has the following expansion in :
(λ0 , ) = 2 γ (λ0 , ) + o( 2 ). The expression for γ (λ0 , ) is called the Fermi Golden Rule (FGR). A remarkable fact is that this expansion is defined even when analytic continuation does not exist. Previous works on the existence of resonances required that γ (λ0 , ) > 0 as → 0. This condition is sometimes hard to verify, and in the present work we remove this assumption. 1.2. Outline of new results. In this work we improve the theory of perturbation of embedded eigenvalues and resonances in three main directions: First, the Fermi Golden Rule condition, which originally required as above (sometimes implicitly) that > C 2 as → 0 is removed. We show that under (relatively weak) conditions of regularity of the resolvent of the unperturbed Hamiltonian all that is needed is that > 0. The price one sometimes has to pay is that it may be needed to evaluate at a nearby point of the eigenvalue λ0 of the unperturbed Hamiltonian (see (3)). In cases of very low regularity of the unperturbed resolvent, we need in general
> C m , with m > 2; m becomes larger if more regularity of the resolvent is provided; cf. (1) and (2) below. The second main improvement relative to known results in resonance theory is that we only require H η regularity (see Sect. 2.1), with η > 0, of the unperturbed resolvent near
Resonance Theory
the relevant energy. Most works on resonances require analyticity; the recent works [6, 11, 21] require $H^\eta$ regularity with η > 1. This improvement is important for perturbations of embedded eigenvalues at thresholds (e.g., our condition is satisfied by $H_0 = -\Delta$ at $\lambda_0 = 0$ in three or more dimensions, while the previous results only apply to five or more dimensions). As a third contribution we indicate that under conditions of analytic continuation and with suitable cutoff, the term $e^{-\Gamma t}$ can be separated from the solution and the remainder term is given by an asymptotic series in $t^{-a}$, a > 0, times a stretched exponential $e^{-t^b}$, with b < 1; see Sect. 5. Our analyticity assumptions are weaker and thus apply in cases of threshold eigenvalues where standard complex deformation approaches could fail. Furthermore we replace analytic perturbation methods by more general complex theory arguments. As concrete examples of applications we outline the following two classes of problems:

(1) In many applications $H_0 = -\Delta \oplus H_1$, where $H_1$ has a discrete spectrum (see e.g. [21]). If $H_1\psi_0 = 0$ has a solution, then $H_0$ has an embedded eigenvalue at the threshold, since $\sigma(-\Delta) = [0,\infty)$. In this case the known analytic methods do not apply; the methods of [6] apply when η > 1, which is the case of the Laplacian on $L^2(\mathbb{R}^N)$ if N ≥ 5. The results of this paper apply down to N = 3.

(2) The Hamiltonians one gets by linearizing a nonlinear dispersive completely integrable equation around an exact solution have an embedded eigenvalue corresponding to the soliton/breather, etc. Small perturbations of such completely integrable equations then produce a perturbation problem of embedded eigenvalues with self-consistent potential W. In these cases the size of $\Gamma$ is typically of higher order in $\varepsilon^2$ and in certain cases it is even $O(e^{-1/\varepsilon})$. Hence the previous works are not applicable since they require a lower bound $O(\varepsilon^2)$ on $\Gamma$.
Our approach follows the setup of the time dependent theory of [11], combined with Laplace transform techniques. It is expected to generalize to the N-body case following [12]. We will follow, in part, the notation of [11]. The analysis in this work utilizes this framework in some ways, but generalizes the results considerably: the required time decay is $O(t^{-1-\eta})$ and we remove here the assumption of a lower bound on $\Gamma$; it is replaced by

$$\Gamma \ge C\varepsilon^{2/(1-\eta)} \tag{1}$$

when η < 1, and

$$\Gamma > 0,\ \text{arbitrary} \tag{2}$$

when η > 1. Whenever a meromorphic continuation of the S-matrix or Green's function exists, the poles give an unambiguous definition of "resonance". A time dependent approach or other definitions are less precise, not necessarily unique, as was observed in [6], but usually apply in more general situations, where analytic continuation is either hard to prove or not available. We provide some information about defining resonance by time dependent methods and its relation to the existence of "analytic continuation". In particular, we will show that in general one can find the exponential decay rate up to higher order corrections depending on η and $\varepsilon$.
In case it is known that analytic continuation exists, our approach provides a definition of a unique resonance corresponding to the perturbed eigenvalue. It is given by the solution of some transcendental equation in the complex plane and it also corresponds to a pole of the weighted Green's function.

2. Main Results

We begin with some definitions. Given $H_0$, a self-adjoint operator on $\mathcal{H} = L^2(\mathbb{R}^n)$, we assume that $H_0$ has a simple eigenvalue $\lambda_0$ with normalized eigenvector $\psi_0$:

$$H_0\psi_0 = \lambda_0\psi_0,\qquad \|\psi_0\| = 1. \tag{3}$$

Our interest is to describe the behavior of solutions of

$$i\frac{\partial\phi}{\partial t} = H\phi,\qquad H := H_0 + W(\varepsilon), \tag{4}$$

where $\varepsilon$ is a small parameter, taken to be the size of the perturbation in an appropriate norm (cf. e.g. (8)), $\phi(0) = E_\Delta\phi_0$, where $E_\Delta$ is the spectral projection of H on the interval $\Delta$ and $\Delta$ is a small interval around $\lambda_0$. (Note that $W(\varepsilon)$ depends on $\varepsilon$ in general, and may not even have a limit as $\varepsilon \to 0$.) Furthermore, we will describe, in some cases, the analytic structure of $(H-z)^{-1}$ in a neighborhood of $\lambda_0$. W is a symmetric perturbation of $H_0$, such that H is self-adjoint with the same domain as $H_0$.

For an operator A, $\|A\|$ denotes its norm as an operator from $L^2$ to itself. We interpret functions of a self-adjoint operator as being defined by the spectral theorem. In the special case where the operator is $H_0$, we omit the argument, i.e., $g(H_0) = g$. For an open interval $\Delta$, we denote an appropriate smoothed characteristic function of $\Delta$ by $g_\Delta(\lambda)$. In particular, we shall typically take $g_\Delta(\lambda)$ to be a nonnegative $C^\infty$ function, which is equal to one on $\Delta$ and zero outside a neighborhood of $\Delta$. The support of its derivative is furthermore chosen to be small compared to the size of $\Delta$. We further require that $|g_\Delta^{(n)}(\lambda)| \le c_n|\Delta|^{-n}$, n ≥ 1.

$P_0$ denotes the projection on $\psi_0$, i.e., $P_0 f = (\psi_0, f)\psi_0$. $P_{1b}$ denotes the spectral projection on $\mathcal{H}_{pp} \cap \{\psi_0\}^\perp$, the pure point spectral part of $H_0$ orthogonal to $\psi_0$. That is, $P_{1b}$ projects onto the subspace of $\mathcal{H}$ spanned by all the eigenstates other than $\psi_0$. In our treatment, a central role is played by the subset $T^\sharp$ of the spectrum of the operator $H_0$ on which a sufficiently rapid local decay estimate holds. For a decay estimate to hold for $e^{-iH_0t}$, one must certainly project out the bound states of $H_0$, but there may be other obstructions to rapid decay. In scattering theory these are called threshold energies. Examples of thresholds are: (i) points of stationary phase of a constant coefficient principal symbol for two body Hamiltonians and (ii) for N-body Hamiltonians, zero and eigenvalues of subsystems.
We will not give a precise definition of thresholds. For us it is sufficient to say that away from thresholds the favorable local decay estimates for $H_0$ hold. Let $\Delta_*$ be a union of intervals, disjoint from $\Delta$, containing a neighborhood of infinity and all thresholds of $H_0$ except possibly those in a small neighborhood of $\lambda_0$. We then let $$P_1 = P_{1b} + g_{\Delta_*},$$
where $g_{\Delta_*} = g_{\Delta_*}(H_0)$ is a smoothed characteristic function of the set $\Delta_*$. We also define for $x \in \mathbb{R}^n$

$$\langle x\rangle^2 = 1 + |x|^2,\qquad \overline{Q} = I - Q,\qquad\text{and}\qquad P_c^\sharp = I - P_0 - P_1. \tag{5}$$

Thus, $P_c^\sharp$ is a smoothed out spectral projection of the set $T^\sharp$ defined as

$$T^\sharp = \sigma(H_0) \setminus \{\text{eigenvalues, real neighborhoods of thresholds and infinity}\}. \tag{6}$$

We expect $e^{-iH_0t}$ to satisfy good local decay estimates on the range of $P_c^\sharp$ (see (H4) below).

2.1. Hypotheses on $H_0$. We assume $H^\eta$ regularity for $H_0$. By this we mean that $(\psi, (H_0-z)^{-1}\phi)$ is in the Hölder space of order η, $H^\eta$, in the z variable for z near the relevant energy. Here ψ, φ are in the dense set $\{\phi \in L^2 : \langle x\rangle^\sigma\phi \in L^2\}$.

(H1) $H_0$ is a self-adjoint operator with dense domain $\mathcal{D}$, in $L^2(\mathbb{R}^n)$.
(H2) $\lambda_0$ is a simple embedded eigenvalue of $H_0$ with (normalized) eigenfunction $\psi_0$.
(H3) There is an open interval $\Delta$ containing $\lambda_0$ and no other eigenvalue of $H_0$.
(H4) Local decay estimate: Let r > 1. There exists σ > 0 such that if $\langle x\rangle^\sigma f \in L^2$ then

$$\|\langle x\rangle^{-\sigma}e^{-iH_0t}P_c^\sharp f\|_2 \le C\langle t\rangle^{-r}\|\langle x\rangle^\sigma f\|_2. \tag{7}$$

(H5) By appropriate choice of a real number c, the $L^2$ operator norm of $\langle x\rangle^\sigma(H_0+c)^{-1}\langle x\rangle^{-\sigma}$ can be made sufficiently small.

Remarks. (i) We have assumed that $\lambda_0$ is a simple eigenvalue to simplify the presentation. Our methods can be easily adapted to the case of multiple eigenvalues.
(ii) Note that $\Delta$ does not have to be small and that $\Delta_*$ can be chosen as necessary, depending on $H_0$.
(iii) In certain cases, the above local decay conditions can be proved even when $\lambda_0$ is a threshold; see [13].
(iv) Regarding the verification of the local decay hypothesis, one approach is to use techniques based on the Mourre estimate [14–16]. If $\Delta$ contains no threshold values, then quite generally, the bound (7) holds with r arbitrary and positive.

We now specify the conditions we require of the perturbation, W.

Conditions on W.
(W1) W is symmetric and $H = H_0 + W$ is self-adjoint on $\mathcal{D}$ and there exists $c \in \mathbb{R}$ (which can be used in (H5)), such that c lies in the resolvent sets of $H_0$ and H.
(W2) For the same σ as in (H4) and (H5) we have:

$$|||W||| := \|\langle x\rangle^{2\sigma}Wg_\Delta(H_0)\| + \|\langle x\rangle^{\sigma}Wg_\Delta(H_0)\langle x\rangle^{\sigma}\| + \|\langle x\rangle^{\sigma}W(H_0+c)^{-1}\langle x\rangle^{-\sigma}\| < \infty$$

and

$$\|\langle x\rangle^{\sigma}W(H_0+c)^{-1}\langle x\rangle^{\sigma}\| < \infty. \tag{8}$$
(W3) Resonance condition (nonvanishing of the Fermi golden rule): For a suitable choice of λ (which will be made precise later)

$$\Gamma(\lambda,\varepsilon) := \Gamma(\lambda) := \pi\varepsilon^2\big(W(\varepsilon)\psi_0,\ \delta(H_0-\lambda)(I-P_0)W(\varepsilon)\psi_0\big) \ne 0. \tag{9}$$

In most cases $\Gamma = \Gamma(\lambda_0)$. But in the case $\Gamma$ is very small it turns out that the "correct" $\Gamma$ will be $\Gamma(\lambda_0+\delta)$ with δ given in the proof of Proposition 12. See also Sect. 4.

The main results of this paper are summarized in the following theorem.

Theorem 1. Let $H_0$ satisfy the conditions (H1)...(H5) and the perturbation satisfy the conditions (W1)...(W3). Assume moreover that $\varepsilon$ is sufficiently small and either:
(i) $H_0$ has regularity as in Sect. 2.1 with η > 1, or
(ii) we have lower regularity 0 < η < 1 supplemented by the conditions $\Gamma > C\varepsilon^n$, n ≥ 2, and $\eta > \frac{n-2}{n}$.
Then
a) $H = H_0 + W$ has no eigenvalues in $\Delta$.
b) The spectrum of H is purely absolutely continuous in $\Delta$, and

$$\|\langle x\rangle^{-\sigma}e^{-iHt}g_\Delta(H)\Phi_0\|_2 \le C\langle t\rangle^{-1-\eta}\|\langle x\rangle^{\sigma}\Phi_0\|_2. \tag{10}$$

c) For t ≥ 0 we have

$$e^{-iHt}g_\Delta(H)\Phi_0 = (I + A_W)\big(e^{-i\omega_*t}a(0)\psi_0 + e^{-iH_0t}\phi_d(0)\big) + R(t), \tag{11}$$

where $A_W := K(I-K)^{-1} - I$ and K is an integral operator defined in (35), and
1. if η < 1 and $\varepsilon \to 0$ with t fixed we have $R(t) = O(\varepsilon^2\Gamma^{\eta-1})$, while as $t \to \infty$ we have $R(t) = O(\Gamma^{-1}t^{-\eta-1})$,
2. for η > 1 we have $R(t) = O(\varepsilon^2t^{-\eta+1})$,
3. $$\|A_W\| \le C|||W|||, \tag{12}$$
a(0) and $\phi_d(0)$ are determined by the initial data. The complex frequency $\omega_*$ is given by $-i\omega_* = -is_0 - \Gamma$, where $s_0$ solves the equation

$$s_0 + \omega + \varepsilon^2\,\Im\{F(\varepsilon, is_0)\} = 0 \tag{13}$$

(see (47) and (49) below) and
4. $$\Gamma = \varepsilon^2\,\Re\{F(\varepsilon, is_0)\}. \tag{14}$$

Remark. $\omega_*$ can be found by solving the transcendental equation (13) by either expansion or iteration if sufficient regularity is present (see also Proposition 12 and the note following it, and Lemma 18).
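The iteration mentioned in the Remark can be illustrated numerically. The following is a minimal sketch, not the authors' procedure: the functions `toy_F1` and `toy_F2` are hypothetical stand-ins for the real and imaginary parts of $F(\varepsilon, is)$ on the imaginary axis (chosen Lipschitz, mimicking the case η ≥ 1), and `solve_s0` is an illustrative name. Writing $s = -\omega + \delta$, equation (13) becomes $\delta = -\varepsilon^2F_2(-\omega+\delta)$, which is iterated to a fixed point; $\Gamma$ is then read off from (14).

```python
import math

def solve_s0(omega, eps, F2, tol=1e-14, max_iter=200):
    """Fixed-point iteration for s0 + omega + eps^2 * F2(s0) = 0.

    With s = -omega + delta the equation reads delta = -eps^2 * F2(-omega + delta).
    The map is a contraction when F2 is Lipschitz and eps is small.
    """
    delta = 0.0
    for _ in range(max_iter):
        new_delta = -eps**2 * F2(-omega + delta)
        converged = abs(new_delta - delta) < tol
        delta = new_delta
        if converged:
            break
    return -omega + delta

# Hypothetical model functions (assumptions made for illustration only).
def toy_F1(s):
    return 1.0 / (1.0 + s * s)   # stands in for Re F(eps, i s), positive

def toy_F2(s):
    return math.sin(s)           # stands in for Im F(eps, i s), Lipschitz

omega = 1.0
eps = 0.1
s0 = solve_s0(omega, eps, toy_F2)
gamma = eps**2 * toy_F1(s0)                     # analogue of (14)
residual = s0 + omega + eps**2 * toy_F2(s0)     # analogue of (13)
print(s0, gamma, residual)
```

In this toy model the root stays within $\varepsilon^2$ of $-\omega$, in line with $s_0 = -\omega + O(\varepsilon^2)$ in Proposition 12. For merely Hölder continuous $F_2$ (η < 1) the iteration need not converge, which is consistent with the possible non-uniqueness of $s_0$ in that case.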
2.2. Sketch of the proof of Theorem 1. The proof of Theorem 1 is given in Sects. 3 and 4. Section 3 prepares the ground for the proof, Subsect. 4.1 provides key definitions while Subsects. 4.2 and 4.3 contain the proof of Theorem 1 (ii) and (i) respectively. As an intuitive guideline, the solution φ(t) of the time dependent problem is decomposed into the projection $a(t)\psi_0$ on the eigenfunction of $H_0$ and a remainder (see (18)). The remainder is estimated from the detailed knowledge of a(t) (see (34) and (39)). Thus it is essential to control a(t); once that is done, parts (a) and (b) follow from Proposition 4; this a(t) satisfies an integral equation, cf. (43). We chiefly use the Tauberian type duality between the large t behavior of a(t) and the regularity properties of its Laplace transform, cf. Proposition 9 and also Eq. (55). Then, an essential ingredient in the proof of the estimate (11) is Proposition 15. When enough regularity is present, no lower bound on $\Gamma > 0$ is imposed; Proposition 16 and Proposition 17 are key ingredients here.

2.3. Further results.

Lemma 2. Assuming the conditions of Theorem 1 with η > 1, then

$$\omega_* = \lambda_0 + (\psi_0, W\psi_0) + (\Lambda + i\Gamma) + o(\varepsilon^2), \tag{15}$$

where

$$\Lambda = \varepsilon^2\big(W\psi_0,\ \mathrm{P.V.}(H_0-\lambda_0)^{-1}W\psi_0\big), \tag{16}$$

$$\Gamma = \pi\varepsilon^2\big(W\psi_0,\ \delta(H_0-\lambda_0)(I-P_0)W\psi_0\big). \tag{17}$$
This follows from the proof of Proposition 12 and the Remarks below it.

3. Decomposition and Isolation of Resonant Terms

We begin with the following decomposition of the solution of (4):

$$e^{-iHt}\phi_0 = \phi(t) = a(t)\psi_0 + \tilde\phi(t), \tag{18}$$

$$(\psi_0, \tilde\phi(t)) = 0,\qquad -\infty < t < \infty. \tag{19}$$

Substitution into (4) yields

$$i\partial_t\tilde\phi = H_0\tilde\phi + W\tilde\phi - (i\partial_ta - \lambda_0a)\psi_0 + aW\psi_0. \tag{20}$$

Recall now that $I = P_0 + P_1 + P_c^\sharp$. Taking the inner product of (20) with $\psi_0$ gives the amplitude equation:

$$i\partial_ta = \big(\lambda_0 + (\psi_0, W\psi_0)\big)a + (\psi_0, WP_1\tilde\phi) + (\psi_0, W\phi_d), \tag{21}$$

where

$$\phi_d := P_c^\sharp\tilde\phi. \tag{22}$$

The following equation for $\phi_d$ is obtained by applying $P_c^\sharp$ to Eq. (20):

$$i\partial_t\phi_d = H_0\phi_d + P_c^\sharp W(P_1\tilde\phi + \phi_d) + aP_c^\sharp W\psi_0. \tag{23}$$
To derive a closed system for $\phi_d(t)$ and a(t) we now propose to obtain an expression for $P_1\tilde\phi$, to be used in Eqs. (21) and (23). Since $g_\Delta(H)\phi(\cdot,t) = \phi(\cdot,t)$ we find

$$(I - g_\Delta(H))\phi = (I - g_\Delta(H))\big(a\psi_0 + P_1\tilde\phi + P_c^\sharp\tilde\phi\big) = 0 \tag{24}$$

or

$$(I - g_\Delta(H)g_I(H_0))P_1\tilde\phi = -\overline{g}_\Delta(H)\big(a\psi_0 + \phi_d\big), \tag{25}$$

where $g_I(\lambda)$ is a smooth function which is identically equal to one on the support of $P_1(\lambda)$, and which has support disjoint from $\Delta$. Therefore

$$P_1\tilde\phi = -B\,\overline{g}_\Delta(H)(a\psi_0 + \phi_d), \tag{26}$$

where

$$B = (I - g_\Delta(H)g_I(H_0))^{-1}. \tag{27}$$

This computation is justified in Appendix B of [11]. The following was also shown there:

Proposition 3 ([11]). For small $\varepsilon$, the operator B in (27) is a bounded operator on $\mathcal{H}$.

From (26) we get

$$\phi(t) = a(t)\psi_0 + \phi_d + P_1\tilde\phi = \tilde g_\Delta(H)(a(t)\psi_0 + \phi_d(t)), \tag{28}$$

with

$$\tilde g_\Delta(H) := I - B\,\overline{g}_\Delta(H) = Bg_\Delta(H)(I - g_I(H_0)); \tag{29}$$

see (5). Although $\tilde g_\Delta(H)$ is not really defined as a function of H, we indulge in this mild abuse of notation to emphasize its dependence on H. In fact, in some sense, $\tilde g_\Delta(H) \sim g_\Delta(H)$ to higher order in $\varepsilon$ [11]. Substitution of (26) into (23) gives:

$$i\partial_t\phi_d = H_0\phi_d + aP_c^\sharp W\tilde g_\Delta(H)\psi_0 + P_c^\sharp W\tilde g_\Delta(H)\phi_d \tag{30}$$

and

$$i\partial_ta = \big[\lambda_0 + (\psi_0, W\tilde g_\Delta(H)\psi_0)\big]a + (\psi_0, W\tilde g_\Delta(H)\phi_d) = \omega a + (\omega_1-\omega)a + (\psi_0, W\tilde g_\Delta(H)\phi_d), \tag{31}$$

where

$$\omega = \lambda_0 + (\psi_0, W\psi_0), \tag{32}$$

$$\omega_1 = \lambda_0 + (\psi_0, W\tilde g_\Delta(H)\psi_0). \tag{33}$$

We write (30) as an equivalent integral equation. We will later need the integral representation of the solution of (30):

$$\phi_d(t) = e^{-iH_0t}\phi_d(0) - i\int_0^te^{-iH_0(t-s)}a(s)P_c^\sharp W\tilde g_\Delta(H)\psi_0\,ds - i\int_0^te^{-iH_0(t-s)}P_c^\sharp W\tilde g_\Delta(H)\phi_d\,ds. \tag{34}$$

This was also used to prove the following statement.
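Proposition 4 ([11]). Suppose $|a(t)| \le a_\infty\langle t\rangle^{-1-\alpha}$ and assume that η > 0 and α ≥ η. Then for some C > 0 we have

$$\|\langle x\rangle^{-\sigma}\phi_d(t)\|_{L^2} \le C\langle t\rangle^{-1-\eta}\big(\|\langle x\rangle^{\sigma}\phi_d(0)\|_{L^2} + a_\infty|||W|||\big).$$

Note. The proposition, as we mentioned, implies parts (a) and (b) of the main theorem, given the properties of a(t) which will be shown in the sequel. The absolute continuity stated in the theorem follows from (10) with η > 0.

We define K as an operator acting on $C(\mathbb{R}^+, \mathcal{H})$, the space of continuous functions on $\mathbb{R}^+$ with values in $\mathcal{H}$, by

$$(Kf)(t,x) = \int_0^te^{-iH_0(t-s)}P_c^\sharp W\tilde g_\Delta(H)f(s,x)\,ds. \tag{35}$$

We introduce on $C(\mathbb{R}^+, \mathcal{H})$ the norm

$$\|f\|_\beta = \sup_{t\ge0}\langle t\rangle^{\beta}\|f(\cdot,t)\|_{\mathcal{H}} \tag{36}$$

and define the operator norm

$$\|A\|_{\beta;\sigma} = \sup_{\|f\|_\beta\le1}\|\langle x\rangle^{-\sigma}A\langle x\rangle^{\sigma}f\|_\beta. \tag{37}$$

The above definitions directly imply the following.

Proposition 5. If $\varepsilon$ is small, 0 ≤ β ≤ r, r > 1 and for some $\beta_1 > 0$ we have $\|\langle x\rangle^{-\sigma}e^{-iH_0t}P_c^\sharp\langle x\rangle^{-\sigma}\| \le C\langle t\rangle^{-1-\beta_1}$, then for $0 \le \beta \le \beta_1$ we have

$$\|K\|_{\beta;\sigma} \le C_{\beta;\sigma;r}. \tag{38}$$

The proof uses the smallness of $\varepsilon$, which in turn entails the boundedness of $\langle x\rangle^{-\sigma}\tilde g_\Delta(H)\langle x\rangle^{\sigma}$. Using the definition of K given above we see that $K(1-K)^{-1} = \sum_{n=1}^{\infty}K^n$ is also bounded. We can now rewrite the equations for $\phi_d$ as

$$\phi_d(t) = e^{-iH_0t}\phi_d(0) + K\big(a(t)\psi_0\big) + K\phi_d = (I-K)^{-1}\big[K\big(a(t)\psi_0\big) + e^{-iH_0t}\phi_d(0)\big] \tag{39}$$

(recall that we defined $A_W = -I + (I-K)^{-1}K$) and therefore

$$i\partial_ta = \omega_1a + \big(\psi_0, W\tilde g_\Delta(H)(I-K)^{-1}K(a\psi_0)\big) + \big(\psi_0, W\tilde g_\Delta(H)(I-K)^{-1}e^{-iH_0t}\phi_d(0)\big). \tag{40}$$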
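The identity $K(1-K)^{-1} = \sum_{n\ge1}K^n$ used above is the Neumann series, valid whenever the norm of K is below one. A finite-dimensional sanity check is sketched here under the assumption that a small 2×2 matrix may stand in for the operator K; the matrix entries and helper names are illustrative and do not come from the paper.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def neumann_sum(k, terms):
    """Partial sum K + K^2 + ... + K^terms."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(terms):
        power = mat_mul(power, k)
        total = mat_add(total, power)
    return total

# Hypothetical small "operator" with norm well below one.
K = [[0.1, 0.2], [0.0, 0.3]]
I_minus_K = [[1.0 - K[0][0], -K[0][1]], [-K[1][0], 1.0 - K[1][1]]]

closed_form = mat_mul(K, inverse_2x2(I_minus_K))   # K (I - K)^{-1}
series = neumann_sum(K, 60)                         # sum_{n=1}^{60} K^n
max_diff = max(abs(closed_form[i][j] - series[i][j])
               for i in range(2) for j in range(2))
print(max_diff)
```

The same mechanism is what makes $(I-K)^{-1}$ in (39) well defined once $\|K\|_{\beta;\sigma}$ is small.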
To complete the proof of Theorem 1 we need to estimate the large time behavior of a(t) solving Eq. (40). Since the inhomogeneous term satisfies the required decay $O(t^{-1-\eta})$ by our assumptions on $H_0$, it is sufficient to study the associated homogeneous equation. Equivalently, we may choose the embedded eigenfunction as initial condition (that is, $\phi_d(0) = 0$). We now define two operators on $L^\infty$ by

$$\tilde j(a) = \big(v, \langle x\rangle^{-\sigma}K(a\psi_0)\big),\qquad\text{where } v = \langle x\rangle^{\sigma}W\tilde g_\Delta(H)\psi_0, \tag{41}$$

and

$$j(a) = \big(v, \langle x\rangle^{-\sigma}(I-K)^{-1}K(a\psi_0)\big). \tag{42}$$

Proposition 6. The operators $\tilde j$ and j are bounded from $L^\infty$ into itself.

The proposition follows from Proposition 5 with β = 0.

Remark. The equation for a can now be written in the equivalent integral form

$$a(t) = a(0)e^{-i\omega t} + e^{-i\omega t}\int_0^te^{i\omega s}j(a)(s)\,ds := a(0)e^{-i\omega t} + J(a). \tag{43}$$

Definition 1. Consider the spaces $L^\infty_{T;\nu}$ and $L^\infty_\nu$ to be the spaces of functions on [0, T] and $\mathbb{R}^+$ respectively, in the norm

$$\|a\|_\nu = \sup_s|e^{-\nu s}a(s)|. \tag{44}$$

Remark 7. We note that for $T \in \mathbb{R}^+$, the norm on $L^\infty_{T;\nu}$ is equivalent to the usual norm on $L^\infty[0,T]$.

Proposition 8. For some constants c, C and $\tilde c$ independent of T we have $\|ja\|_\nu \le c\nu^{-1}\varepsilon^2\|a\|_\nu$, $\|Ja\|_\nu \le C\nu^{-2}\varepsilon^2\|a\|_\nu$ and $\|\tilde ja\|_\nu \le \tilde c\nu^{-1}\varepsilon^2\|a\|_\nu$, and thus j, J, and $\tilde j$ are defined on $L^\infty_{T;\nu}$ and $L^\infty_\nu$ and their norms, in these spaces, are estimated by

$$\|j\|_\nu \le c\nu^{-1}\varepsilon^2;\qquad \|\tilde j\|_\nu \le \tilde c\nu^{-1}\varepsilon^2;\qquad \|J\|_\nu \le C\nu^{-2}\varepsilon^2. \tag{45}$$
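The additional factor $\nu^{-1}$ in the bound for J compared with the bound for j can be checked directly from the definition of J in (43). The short verification below is a sketch that assumes only the bound on j stated in Proposition 8 and that ω is real, so that $|e^{-i\omega(t-s)}| = 1$:

```latex
\begin{align*}
e^{-\nu t}\,|J(a)(t)|
  &= e^{-\nu t}\left|\int_0^t e^{-i\omega (t-s)}\, j(a)(s)\,ds\right|
   \le \int_0^t e^{-\nu (t-s)}\, e^{-\nu s}\,|j(a)(s)|\,ds \\
  &\le \|j a\|_\nu \int_0^t e^{-\nu (t-s)}\,ds
   \le \nu^{-1}\,\|j a\|_\nu
   \le c\,\nu^{-2}\varepsilon^{2}\,\|a\|_\nu .
\end{align*}
```

Taking the supremum over t gives $\|Ja\|_\nu \le C\nu^{-2}\varepsilon^2\|a\|_\nu$ with C = c, which is why (43) becomes contractive for large ν.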
Similar arguments as above lead to

Proposition 9. Equation (40) has a unique solution in $L^1_{loc}(\mathbb{R}^+)$, and this solution belongs to $L^\infty_\nu$ if $\nu > \nu_0$ with $\nu_0$ sufficiently large. Thus, in the half-plane $\Re(p) > \nu_0$ the Laplace transform of a,

$$\hat a := \int_0^\infty e^{-pt}a(t)\,dt, \tag{46}$$

exists and is analytic in p. Furthermore, for $\Re(p) > \nu_0$, the Laplace transform of a satisfies

$$ip\hat a = \omega\hat a + ia(0) - i\varepsilon^2F(\varepsilon,p)\hat a(p), \tag{47}$$
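where $F(\varepsilon,p)$ is defined by

$$F(\varepsilon,p) := \Big(\psi_0,\ W\tilde g_\Delta(H)\Big[I + \frac{-iI}{p+iH_0}P_c^\sharp W\tilde g_\Delta(H)\Big]^{-1}\frac{iI}{p+iH_0}P_c^\sharp W\tilde g_\Delta(H)\psi_0\Big) + i(\omega_1-\omega)\varepsilon^{-2} \tag{48}$$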
so

$$\big(ip - \omega + i\varepsilon^2F(\varepsilon,p)\big)\hat a(p) = ia(0). \tag{49}$$

Eq. (47) follows by taking the Laplace transform of (31).
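The passage from the amplitude equation to (47) and (49) can be spelled out as follows. This is a sketch under two assumptions: that a is regular enough for the standard rule $\mathcal{L}[\partial_ta](p) = p\hat a(p) - a(0)$ to hold for $\Re p > \nu_0$, and that we are in the homogeneous case $\phi_d(0) = 0$, so that the Laplace transform of every term on the right-hand side other than $\omega a$ is collected into $-i\varepsilon^2F(\varepsilon,p)\hat a(p)$.

```latex
\begin{align*}
i\big(p\,\hat a(p) - a(0)\big)
  &= \omega\,\hat a(p) - i\varepsilon^{2}F(\varepsilon,p)\,\hat a(p)
  && \text{(transform of the amplitude equation)}\\
i p\,\hat a(p)
  &= \omega\,\hat a(p) + i a(0) - i\varepsilon^{2}F(\varepsilon,p)\,\hat a(p)
  && \text{(this is (47))}\\
\big(ip - \omega + i\varepsilon^{2}F(\varepsilon,p)\big)\,\hat a(p)
  &= i a(0)
  && \text{(this is (49))}\\
\hat a(p)
  &= \frac{i a(0)}{ip - \omega + i\varepsilon^{2}F(\varepsilon,p)} .
\end{align*}
```

The last line shows that the large time behavior of a(t) is governed by the zeros of the denominator $ip - \omega + i\varepsilon^2F(\varepsilon,p)$ near the imaginary axis.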
Proof. By Proposition 8, and since $\|e^{-i\omega t}\|_\nu = 1$, for large ν Eq. (43) is contractive in $L^\infty_{T;\nu}$ and has a unique solution there. It thus has a unique solution in $L^1_{loc}$, by Remark 7. Since by the same argument Eq. (43) is contractive in $L^\infty_\nu$ and since $L^\infty_\nu \subset L^1_{loc}$, the unique $L^1_{loc}$ solution of (43) is in $L^\infty_\nu$ as well. The rest is straightforward. □

Remark 10. Note that by construction (47) and (48) define F as a Laplace transform of a function. Our assumptions easily imply that if $\varepsilon$ is small enough, then:
(a) $F(\varepsilon,p)$ is analytic except for a cut along $i\Delta$. $F(\varepsilon,p)$ is Hölder continuous of order η > 0 at the cut, i.e. $\lim_{\gamma\downarrow0}F(\varepsilon, i\tau\pm\gamma) \in H^\eta$, the space of Hölder continuous functions of order η.
(b) $|F(\varepsilon,p)| \le C|p|^{-1}$ for some C > 0 as $|p| \to \infty$.

To see it we write

$$B = B_1B_2\langle x\rangle^{-\sigma};\qquad B_1 := \frac{I}{p+iH_0}P_c^\sharp\langle x\rangle^{-\sigma};\qquad B_2 := \langle x\rangle^{\sigma}W\tilde g_\Delta(H)\langle x\rangle^{\sigma}. \tag{50}$$

Noting that $P_c^\sharp$ projects on the interval $\Delta$ it is clear by the spectral theorem that $\langle x\rangle^{-\sigma}B$ is analytic in p on $D := \mathbb{C}\setminus(i\Delta)$. By the assumption on the decay rate and the Laplace transform of Eq. (7) we have that

$$B_3(p) := \langle x\rangle^{-\sigma}\frac{I}{p+iH_0}P_c^\sharp\langle x\rangle^{-\sigma} \tag{51}$$

is uniformly Hölder continuous, of order η, as $p \to i\Delta$. For $p_0 \in i\Delta$, the two sided limits $\lim_{a\downarrow0}B_3(p_0\pm a) = B_3^{\pm}$ will of course differ, in general. A natural closed domain of definition of $B_3$ is D together with the two sides of the cut, $\overline{D} := D\cup\partial D_+\cup\partial D_-$. We then write

$$\|B_3\| \le C_1(p), \tag{52}$$

where we note that $C_1$ can be chosen so that:

Remark 11. $C_1(p) > 0$ is uniformly bounded for $p \in D$ and $C_1(p) = O(p^{-1})$ for large p.

Hence for some $C_2$ we have uniformly in p (choosing $\varepsilon$ small enough),

$$\|\langle x\rangle^{-\sigma}(B_1B_2)^n\| \le C_2^n\varepsilon^n, \tag{53}$$

and therefore the operator

$$W\tilde g_\Delta(H)\Big[I - \frac{I}{p+iH_0}P_c^\sharp W\tilde g_\Delta(H)\Big]^{-1}\frac{I}{p+iH_0}P_c^\sharp W\tilde g_\Delta(H) \tag{54}$$

is analytic in D and is in $H^\eta(D)$.
4. General Case

4.1. Definition of $\Gamma$. We have from Proposition 9, Eq. (47) that

$$\hat a(p) = \frac{ia(0)}{ip - \omega + i\varepsilon^2F(\varepsilon,p)}. \tag{55}$$

We are most interested in the behavior of $\hat a$ for p = is, $s \in \mathbb{R}$. $\Gamma$ will be defined in terms of the approximate zeros of the denominator in (55). Let $F =: F_1 + iF_2$.

Proposition 12. For $\varepsilon$ small enough, the equation $s + \omega + \varepsilon^2F_2(\varepsilon, is) = 0$ has at least one root $s_0$, and $s_0 = -\omega + O(\varepsilon^2)$. If η ≥ 1, then for small enough $\varepsilon$ the solution is unique. If η < 1 then two solutions $s_1$ and $s_2$ differ by at most $O(\varepsilon^{2/(1-\eta)})$.

Proof. We write $s = -\omega + \delta$ and get for δ an equation of the form $\delta = \varepsilon^2G(\delta)$ where $G(x) = -F_2(\varepsilon, ix - i\omega)$, and $G(x) \in H^\eta$. The existence of a solution for small $\varepsilon$ is an immediate consequence of continuity and the fact that $\delta - \varepsilon^2G(\delta)$ changes sign in an interval of size $\varepsilon^2\|G\|_\infty$. If η ≥ 1 we note that the equation $\delta = \varepsilon^2G(\delta)$ is contractive for small $\varepsilon$ and thus has a unique root. If instead 0 < η < 1 and $\delta_1$, $\delta_2$ are two roots, then for some K > 0 independent of $\varepsilon$, $|\delta_1 - \delta_2| = \varepsilon^2|G(\delta_1) - G(\delta_2)| \le \varepsilon^2K|\delta_1 - \delta_2|^\eta$, whence the conclusion. □

Remark. Note that $s_0$ are not, in general, poles of (55) since we only solve for the real part equal to zero.

Assumption 13. If η < 1 then we assume that $\varepsilon^2F_1(\varepsilon, -i\omega) \gg \varepsilon^{2/(1-\eta)}$ for small $\varepsilon$. When η > 1 this restriction will not be needed, cf. Sect. 4.3.

Definition. We choose one solution $s_0 = -\omega + \delta$ and let $\Gamma$ be defined by (14).

Note. In the case η < 1 the choice of $s_0$ yields, by the previous assumption, a (possible) arbitrariness in the definition of $\Gamma$ of order $O(\varepsilon^{2/(1-\eta)}) = o(\Gamma)$.

Remarks on the verifiability of condition $\Gamma > 0$. As it is generally difficult to check the positivity of $\Gamma$ itself but relatively easier to find $\Gamma_0$, we will look at various scenarios, which are motivated by concrete examples, in which the condition of positivity reduces to a condition on $F(\varepsilon, -i\omega)$. Let

$$\Gamma_0 = \varepsilon^2F_1(\varepsilon, -i\omega);\qquad \gamma_0 = \varepsilon^2F_2(\varepsilon, -i\omega),$$

where we see that $\Gamma_0$ and $\gamma_0$ are $O(\varepsilon^2)$. The equation for δ reads

$$\delta = -\varepsilon^2\big[F_2(\varepsilon, -i\omega + i\delta) - F_2(\varepsilon, -i\omega)\big] - \gamma_0 = \varepsilon^2H(\delta) - \gamma_0,$$

where H(0) = 0. We write $\delta = -\gamma_0 + \zeta$ and get $\zeta = \varepsilon^2H(-\gamma_0 + \zeta)$ and the definition of $\Gamma$ becomes

$$\Gamma = \varepsilon^2F_1(\varepsilon, -i\omega - i\gamma_0 + i\zeta).$$
145
Proposition 14. (i) If H0 satisfies the conditions of Theorem 1 with η > 1 and γ0 = o( −2 0 ), then as → 0,
= 0 + o( 0 ) and in particular is positive for 0 > 0.
(56) 2
(ii) Assume that η < 1, γ0 = o( −2 0 ) and 0 & 1−η as → 0. Then again (56) holds. 1/η
Proof. (i) Since ζ = O( 2 γ0 ) + O( 2 ζ ) we get ζ = O( 2 γ0 ), implying that
= 2 F1 , −iω − iγ0 (1 + o(1)) = 0 + O( 2 γ0 ) = 0 + o( 0 ). (ii) We have η
ζ = O( 2 γ0 ) + O( 2 ζ η ).
(57)
If ζ ≤ const.γ0 as → 0, then the proof is as in part (i). If on the contrary, for some large constant C we have ζ > Cγ0 then by (57) we have ζ < const. 2 ζ η so that ζ = O( 2/(1−η) ) and 2 ζ η = O( 2/(1−η) ) = o( 0 ). But then η
= 2 F1 (, −iω) + O( 2 γ0 ) + O( 2 ζ η ) = 0 + o( 0 ).
" !
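4.2. Exponential decay. We now let $p = is_0 + v$. The intermediate time and long time behavior of a(t) are given by the following proposition.

Proposition 15. (i) For t = O(1) (note that in general $\Gamma$ depends on $\varepsilon$), as $\varepsilon \to 0$ we have

$$a(t) = e^{-is_0t}e^{-\Gamma t} + O(\varepsilon^2\Gamma^{\eta-1}). \tag{58}$$

(ii) As $t \to \infty$ we have

$$a(t) = O(\Gamma^{-1}t^{-\eta-1}). \tag{59}$$

Proof. (i) Note first that, taking $\Re(v) > 0$ and writing F as a Laplace transform, cf. Remark 10,

$$F(\varepsilon, -is_0+v) = \int_0^\infty e^{-is_0t-vt}f(t)\,dt,$$

we have by our assumptions that

$$\begin{aligned}
F(\varepsilon, -is_0+v) &= \int_0^\infty e^{-vt}\Big(\int_0^te^{-is_0u}f(u)\,du\Big)'dt\\
&= v\int_0^\infty e^{-vt}\int_0^te^{-is_0u}f(u)\,du\,dt\\
&= v\int_0^\infty e^{-vt}\Big(\int_0^\infty - \int_t^\infty\Big)e^{-is_0u}f(u)\,du\,dt\\
&= \int_0^\infty e^{-is_0u}f(u)\,du - v\int_0^\infty e^{-vt}\int_t^\infty e^{-is_0u}f(u)\,du\,dt\\
&= F(\varepsilon, -is_0) - v\mathcal{L}[g](v),
\end{aligned} \tag{60}$$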
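where we denoted $g(t) = \int_t^\infty e^{-is_0u}f(u)\,du$ and $\mathcal{L}[g]$ is its Laplace transform. Now define

$$h(v) = v\mathcal{L}[g](v). \tag{61}$$

We have, by the formula for the inverse Laplace transform,

$$2\pi i\,a(t) = e^{-is_0t}\int_{-i\infty}^{i\infty}\frac{e^{vt}}{v + \Gamma + \varepsilon^2h(v)}\,dv, \tag{62}$$

where by construction we have $h \in H^\eta$, h is analytic in $\mathbb{C}\setminus i\Delta$ and h(0) = 0. We write

$$\int_{-i\infty}^{i\infty}\frac{e^{vt}}{v + \Gamma + \varepsilon^2h(v)}\,dv = \int_{-i\infty}^{i\infty}\frac{e^{vt}\,dv}{(v+\Gamma)\big(1 + \varepsilon^2h(v+\Gamma)^{-1}\big)} = \int_{-i\infty}^{i\infty}\frac{e^{vt}\,dv}{v+\Gamma} - \varepsilon^2\int_{-i\infty}^{i\infty}e^{vt}\,\frac{1}{v+\Gamma}\,\frac{h(v+\Gamma)^{-1}}{1 + \varepsilon^2h(v+\Gamma)^{-1}}\,dv. \tag{63}$$

We first need to estimate $\mathcal{L}^{-1}\big[h(v+\Gamma)^{-1}\big]$ (the transformation is well defined, since the function is just $(v+\Gamma)^{-1}(F(\varepsilon, -is_0+v) - F(\varepsilon, -is_0))$). We need to write

$$v\mathcal{L}[g](v) =: (v+\Gamma)\mathcal{L}[g_1](v)\qquad\text{or}\qquad \mathcal{L}[g_1] = \Big(1 - \frac{\Gamma}{v+\Gamma}\Big)\mathcal{L}[g], \tag{64}$$

which defines the function $g_1$:

$$g_1 = g - \Gamma e^{-\Gamma t}\int_0^te^{\Gamma s}g(s)\,ds. \tag{65}$$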
t 0
A similar inequality holds for
Q := L−1
Indeed, we have Q = −L−1
eu
u −η
h v+
2h 1 + v+
du ≤ Const.t −η .
(66)
.
h h + 2 L−1 ∗ Q. v+
v+
(67)
(68)
It is easy to check that for t ≤ r −1 and small enough this equation is contractive in the norm Q = sups≤t sη |Q(s)|. But now, for constants independent of , t 1 2 L−1 e s s −η ds ∗ Q ≤ Const.e− s v+
0 t −η u 2 − s −1 (69) = Const.e
eu du
0 2 ≤ Const. 1−η .
Resonance Theory
147
(ii) We now use (60) and (61) to write h F (, −is0 + v) F (, −is0 ) = − v+
v+
v+
and get H1 := L−1
t h = e− t e s f (s)ds + conste− t , v+
0
and thus, proceeding as in the proof of (i) we get for some C > 0 |H1 | ≤ C −1 t−η−1 . To evaluate a(t) for large t we resort again to Q as defined in (67) which satisfies (68). This time we note that the equation is contractive in the norm sups≥0 |s1+η · | when is small enough. ! " Using (59), Proposition 4 and (28) imply local decay and therefore χ cannot be an eigenfunction which implies (i). Since the local decay rate is integrable (ii) follows [24]. Part c) follows from (58), (39) and (28) while (12) follows from (39) and the smallness of K. 4.3. Proof of Theorem 1 in case (i) of regularity η > 1. In this case we obtain better estimates. We write G(v) = L−1 [g](v)
(70)
and (62) becomes a(t) = e
−is0 t
i∞ −i∞
evt dv. v + + 2 vG(v)
(71)
Now L−1 (v + + 2 vG(v))−1
v 1 1 v+ G(v) −1 2 −1 −1 =L − L ∗L . v v+
v+
1 + 2 v+
G(v)
Proposition 16. Let
−1
H2 (t) := L
v v+ G(v) v 1 + 2 v+
G(v)
(72)
.
We have |H2 | ≤ Const.t−η ;
0
∞
H2 (t)dt = 0.
(73)
148
O. Costin, A. Soffer
Proof. Consider first the function h3 := v(v + )−1 G(v) = G(v) − (v + )−1 G(v); we see that (cf. (70) and (60)) t ∞ H3 := L−1 h3 = e−is0 u f (u)du − e− t e s t
0
∞ s
e−is0 u f (u)duds,
(74)
and thus, for some positive constants Ci , |H3 | ≤ Const.t
−η
+ Const.e
− t
t
ev −η v−η dv,
(75)
0
and thus, since h3 (0) = 0 we have −η
|H3 | ≤ Const.t
;
∞
H3 (t)dt = 0.
0
Note now that the function −1 v 2 v G(v) 1 + G(v) v+
v+
vanishes for v = 0. Note furthermore that H2 = H3 − 2 H3 ∗ H2 . It is easy to check that this integral equation is contractive in the norm H = sups≤t |sη H (s)| for small enough ; the proof of the proposition is complete. ! " Proposition 17. L−1 (v + + 2 G(v))−1 = e− t + (t), where for some constant C independent of , t, we have || ≤ C 2 t−η+1 . Proof. We have, by (72) ∞ ' t (t) = 2 e− t e s H2 (u)du ds 0 s ∞ t ∞ = 2 H2 (s)ds − e− t e s H2 (u)du. t
The estimate of the last term is done as in (75). Theorem 1 part (c) in case (i) follows.
0
" !
s
(76)
5. Analytic Case

Suppose that the function $F(p,\varepsilon)$ has analytic continuation in a neighborhood of the relevant energy $-i\omega = 0$; in this case we can prove stronger results. In many cases one can show the analyticity of F if the resolvent, properly weighted, has analytic continuation.

Lemma 18. Assume that for some ω and some neighborhood D of ω, $E(\varepsilon,p)$ is a function with the following properties:
(i) $E \in H^\eta(\overline{D})$ and E is analytic in D (this allows for branch-points on the boundary of the domain, a more general setting than meromorphicity).
(ii) $|E(\varepsilon,p)| \le C\varepsilon^2$ for some C.
(iii) $\lim_{a\downarrow0}\Re E(\varepsilon, -i\omega - a) = -\Gamma_0 < 0$.
If (a) η > 1, $\Im E(\varepsilon, -i\omega) = o(\Gamma_0/\varepsilon^2)$ or (b) η < 1 and $\Im E(\varepsilon, -i\omega) = O(\Gamma_0)$, and $\varepsilon$ is small enough, then the function $G_1(\varepsilon,p) = p + i\omega + E(\varepsilon,p)$ has a unique zero $p = p_z$ in D and furthermore $\Re(p_z) < 0$. In fact,

$$\Re(p_z) + \Gamma_0 = o(\Gamma_0). \tag{77}$$

Remark. If the condition that for η > 1, $\Im E(\varepsilon, -i\omega) = o(\varepsilon^{-2}\Gamma_0)$ is not satisfied, then we can replace $-i\omega$ by $-i\omega - is_0$ and the uniqueness of the complex zero will still be true.

Proof. We have $G_1(\varepsilon,p_z) = 0 = p_z + i\omega + E(\varepsilon, -i\omega) + [E(\varepsilon,p_z) - E(\varepsilon, -i\omega)]$ or, letting $p = -i\omega + \zeta$, $\zeta_z := p_z + i\omega$, $\varepsilon^2\phi(\varepsilon,\zeta) := E(\varepsilon,p) - E(\varepsilon, -i\omega)$,

$$\zeta_z = -E(\varepsilon, -i\omega) - \varepsilon^2\phi(\varepsilon,\zeta_z).$$

Consider a square centered at $-E(\varepsilon, -i\omega)$ with side $2|\Re(E(\varepsilon, -i\omega))| = 2\Gamma_0$. For both cases (a) and (b) for η considered in part (iii) of the lemma, note that in our assumptions and by the choice of the square we have

$$\left|\frac{\varepsilon^2\phi(\zeta,\varepsilon)}{\zeta + E(\varepsilon, -i\omega)}\right| \to 0\quad(\text{as } \varepsilon \to 0) \tag{78}$$

(on all sides of the square). In case (a), on the boundary of the square we have by construction $|\zeta + E(\varepsilon, -i\omega)| \ge \Gamma_0$. Also by construction, on the sides of the square we have $|\zeta| \le \Gamma_0$. Still by assumption, $\phi(\varepsilon,\zeta) \le C\zeta = o(\varepsilon^{-2}\Gamma_0)$ and the ratio in (78) is o(1). In case (b), we have

$$\varepsilon^2\phi(\varepsilon,\zeta) = O(\varepsilon^2\zeta^{\eta}) = O(\varepsilon^2\Gamma_0^{\eta}) = o(\Gamma_0).$$

Thus, on the boundary of the square, the variation of the argument of the functions $\zeta + E(\varepsilon, -i\omega) + \varepsilon^2\phi(\zeta)$ and that of $\zeta + E(\varepsilon, -i\omega)$ differ by at most o(1) and thus have to agree exactly (being integer multiples of 2πi); thus $\zeta + E(\varepsilon, -i\omega) + \varepsilon^2\phi(\zeta)$ has exactly one root in the square. The same argument shows that $p + i\omega + E(\varepsilon,p)$ has no root in any other region in its analyticity domain except in the square constructed in the beginning of the proof. □
Theorem 19. Assume the conditions (H) and (W) as before, and furthermore that the function $F(\varepsilon,p)$ has analytic continuation in a neighborhood of $-i\omega$; with an appropriate choice of the cutoff function $E_\Delta(H_0)$, we have that $\chi(H-z)^{-1}\chi$ has a unique pole away from the real axis, near $-i\omega$, corresponding to a resonance with imaginary part near $\Gamma$, with appropriate choice of weights χ.

Proof. First we note that by taking the Laplace transform of (28) and (34) and solving for the resolvent of H we get that

$$\chi(H-z)^{-1}\chi = A(z)\hat a(z)\psi_0 + B(z)$$

with A(z) and B(z) analytic in D by our assumptions (H) and (W), and the assumed analyticity of $F(\varepsilon,p)$, $ip := z$. Hence the existence and uniqueness of the pole of $\chi(H-z)^{-1}\chi$ follows from Lemma 18, with $\varepsilon^2F(\varepsilon,p) = E(\varepsilon,p)$. □

As a consequence we obtain the following result.

Proposition 20. With an appropriate exponential cutoff function, the remainder term decays as a stretched exponential times an asymptotic series.

Sketch of proof. We need the large t behavior of a(t), which is the inverse Laplace transform of $G(p) := (p + i\omega + i\varepsilon^2F(\varepsilon,p))^{-1}$, and to this end we write

$$G(p) = (p + i\omega_*)^{-1} - i\varepsilon^2(p + i\omega_*)^{-1}F_*(\varepsilon,p)G(p), \tag{79}$$

where $F_*(\varepsilon,p) := F(\varepsilon,p) - (\omega_* - \omega)/\varepsilon^2$ and $\omega_*$ is the unique pole of G(p) found in the previous theorem. Taking the inverse Laplace transform of (79) we get an integral equation for G(t), and direct calculations show that $\tilde F \sim e^{-\sqrt{t}+i\theta t}\sum a_kt^{-k/4}$ implies $G(t) \sim e^{-i\omega_*t} + O(\varepsilon^2)e^{-\sqrt{t}+i\theta t}\sum b_kt^{-k/4}$. To find the asymptotic behavior of $\tilde F(t)$ we derive an integral equation by taking the inverse Laplace transform of (48), and the same integral equation arguments as above reduce the asymptotic study of $\tilde F$ to that of the following expression for any $u \in L^2$:

$$(u, Be^{-iH_0t}P_c^\sharp B\psi_0) = \int\overline{\widetilde{B^*u}}\,e^{-i\lambda t}g_\Delta\,\widetilde{B\psi_0}\,d\mu_{a.c.}(\lambda) := \int\xi(\lambda)e^{-i\lambda t}g_\Delta(\lambda)\,d\lambda,$$

where $B = W\tilde g_\Delta(H)$ and $\tilde\phi$ is the spectral representation of φ associated to $H_0$. By assumption $B(H_0-z)^{-1}B$ is analytic in $z \in D$, hence $\int\overline{\widetilde{B^*u}}(\lambda)(\lambda-z)^{-1}\widetilde{Bv}(\lambda)f(\lambda)\,d\lambda$ is analytic for any $v \in L^2$, where $f(\lambda) = d\mu_{a.c.}/d\lambda$; therefore so is its Hilbert transform $\overline{\widetilde{B^*u}}\,\widetilde{Bv}\,f$ and thus ξ is also analytic. Choosing $g_\Delta(\lambda) = \exp(-(\lambda-a)^{-1} + (\lambda-b)^{-1})$, the asymptotic expansion of $\tilde F$ follows from that of the integral $\int_a^be^{-\frac{1}{\lambda-a}+\frac{1}{\lambda-b}-it\lambda}\xi(\lambda)\,d\lambda$. □
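The exponential cutoff $g_\Delta(\lambda) = \exp(-(\lambda-a)^{-1} + (\lambda-b)^{-1})$ chosen in the sketch of proof can be evaluated directly. The snippet below is a minimal illustration, assuming the function is extended by zero outside the open interval (a, b); the name `g_cutoff` and the sample endpoints are hypothetical choices.

```python
import math

def g_cutoff(lam, a, b):
    """exp(-1/(lam - a) + 1/(lam - b)) on (a, b), extended by zero elsewhere.

    Inside (a, b) both exponent terms are negative, so the value lies in (0, 1)
    and tends to zero faster than any power at either endpoint.
    """
    if lam <= a or lam >= b:
        return 0.0
    return math.exp(-1.0 / (lam - a) + 1.0 / (lam - b))

a, b = 1.0, 2.0
mid_value = g_cutoff(1.5, a, b)        # exponent is -2 - 2 = -4
near_edge = g_cutoff(a + 1e-3, a, b)   # exponent is about -1001
print(mid_value, near_edge)
```

The flat vanishing at both endpoints is what produces the stretched exponential factor $e^{-\sqrt{t}}$ in the large time asymptotics of the integral over [a, b].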
5.1. Example. Suppose

$$H_0 = \begin{pmatrix}-\Delta & 0\\ 0 & -\Delta + x^2\end{pmatrix} := -\Delta\oplus(-\Delta + x^2)$$

on $L^2(\mathbb{R})\oplus L^2(\mathbb{R})$. Assume

$$W = \begin{pmatrix}0 & \tilde W\\ \tilde W & 0\end{pmatrix}$$

with $\tilde W = \tilde W(x)$ sufficiently regular and exponentially localized. Then, the spectrum of $H_0$ has embedded eigenvalues corresponding to the spectrum of $-\Delta + x^2$, with Gaussian localized and smooth eigenfunctions. Since the projection $I - P_0$ in the definition of $P_c^\sharp$ eliminates the $-\Delta + x^2$ part in any interval $\Delta$ containing an eigenvalue of $-\Delta + x^2$, it is left to verify the conditions of the theorem for $H_0$ replaced by $-\Delta$. Since

$$e^{-\alpha\langle x\rangle}(-\Delta - z)^{-1}e^{-\alpha\langle x\rangle} \tag{80}$$

has analytic continuation through the cut (0, ∞) and is an analytic function away from z = 0, we can now choose an interval $\Delta = [a, b]$ around each eigenvalue $E_n$ of $-\Delta + x^2$, avoiding zero, and let $E_\Delta(\lambda) = e^{-(\lambda-a)^{-1}}e^{(\lambda-b)^{-1}}$ be a function analytic in $\mathbb{C}$ except z = a and b.

5.2. Remarks on applications. The examples covered by the above approach include those discussed in [11] as well as the many cases where analytic continuation has been established, see e.g. [21]. Furthermore, following results of [21] it follows that under favorable assumptions on V(x), $-\Delta + V(x)$ has no zero energy bound states in three or more dimensions, extending the results of [11], where it was proved for 5 or more dimensions. It is worth mentioning that the possible presence of thresholds inside $\Delta$ makes it necessary to allow for η < ∞, and that in the case where there are finitely many thresholds inside $\Delta$ of known structure, sharper results may be obtained. Other applications of our methods involve numerical reconstruction of resonances from time dependent solutions data, in cases where Borel summability is ensured. This and other implications will be discussed elsewhere.

Acknowledgement. The authors acknowledge partial support from the NSF. One of us (A. S.) would like to thank I. M. Sigal for discussions.
References
1. Gérard, C. and Sigal, I.M.: Space-time picture of semiclassical resonances. Commun. Math. Phys. 145, 281–328 (1992)
2. Helffer, B. and Sjöstrand, J.: Résonances en limite semi-classique. Mem. Soc. Math. France (N. S) #24-25 (1986)
3. Balslev, E.: Resonances with a Background Potential. In: Lecture Notes in Physics 325, Berlin–Heidelberg–New York: Springer, 1989
4. Phillips, R. and Sarnak, P.: Perturbation theory for the Laplacian on Automorphic Functions. J. Am. Math. Soc. Vol. 5, No. 1, 1–32 (1992)
5. Simon, B.: Resonances and complex scaling: A rigorous overview. Int. J. Quantum Chem. 14, 529–542 (1978)
6. Orth, A.: Quantum mechanical resonance and limiting absorption: The many body problem. Commun. Math. Phys. 126, 559–573 (1990)
7. Hunziker, W.: Resonances, Metastable States and Exponential Decay Laws in Perturbation Theory. Commun. Math. Phys. 132, 177–188 (1990)
8. Costin, O., Lebowitz, J.L., Rokhlenko, A.: Exact results for the ionization of a model quantum system. J. Phys. A: Math. Gen. 33, 1–9 (2000)
9. Tang, S.H. and Zworski, M.: Resonance Expansions of Scattered waves. To appear in CPAM
10. Skibsted, E.: Truncated Gamov functions, α-decay and exponential law. Commun. Math. Phys. 104, 591–604 (1986)
11. Soffer, A. and Weinstein, M.I.: Time dependent resonance theory. GAFA, Geom. Funct. Anal. vol 8, 1086–1128 (1998)
12. Merkli, M., Sigal, I.M.: A Time Dependent Theory of Quantum Resonances. Commun. Math. Phys. 201, 549–576 (1999)
13. Journé, J.L., Soffer, A. and Sogge, C.: $L^p \to L^{p'}$ Estimates for time dependent Schrödinger Equations. Bull. AMS 23, 2 (1990)
14. Jensen, A., Mourre, E. and Perry, P.: Multiple commutator estimates and resolvent smoothness in quantum scattering theory. Ann. Inst. Poincaré – Phys. Théor. 41, 207–225 (1984)
15. Sigal, I.M. and Soffer, A.: Local decay and velocity bounds for quantum propagation. Preprint (1988); ftp://www.math.rutgers.edu/pub/soffer
16. Hunziker, W., Sigal, I.M., Soffer, A.: Minimal Escape Velocities. Comm. PDE 24, (11, 12) 2279–2295 (2000)
17. Agmon, S., Herbst, I. and Skibsted, E.: Perturbation of embedded eigenvalues in the generalized N-body problem. Commun. Math. Phys. 122, 411–438 (1989)
18. Aguilar, J. and Combes, J.M.: A class of analytic perturbations for one body Schrödinger Hamiltonians. Commun. Math. Phys. 22, 269–279 (1971)
19. Costin, O.: On Borel summation and Stokes phenomena for rank one nonlinear systems of ODE's. Duke Math. J. Vol. 93, No. 2, 289–344 (1998)
20. Costin, O., Tanveer, S.: Existence and uniqueness for a class of nonlinear higher-order partial differential equations in the complex plane. CPAM Vol. LIII, 1092–1117 (2000)
21. Hislop, P. and Sigal, I.M.: Introduction to Spectral Theory. Applied Math. Sci. 113, Berlin–Heidelberg–New York: Springer, 1996
22. Rauch, J.: Perturbation Theory for Eigenvalues and Resonances of Schrödinger Hamiltonians. J. Funct. Anal. 35, 304–315 (1980)
23. Lavine, R.: Exponential Decay. In: Diff. Eq. and Math. Phys, Proceedings, Alabama, Birmingham, 1995
24. Reed, M., Simon, B.: Methods of Modern Mathematical Physics IV, Analysis of Operators. New York: Academic Press, 1978

Communicated by M. Aizenman
Commun. Math. Phys. 224, 153 – 204 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
The Birth of the Infinite Cluster: Finite-Size Scaling in Percolation
C. Borgs¹, J. T. Chayes¹, H. Kesten², J. Spencer³
¹ Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA
² Department of Mathematics, Cornell University, Ithaca, NY 14853, USA
³ Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012, USA
Received: 6 December 2000 / Accepted: 25 May 2001
Abstract: We address the question of finite-size scaling in percolation by studying bond percolation in a finite box of side length $n$, both in two and in higher dimensions. In dimension $d = 2$, we obtain a complete characterization of finite-size scaling. In dimensions $d > 2$, we establish the same results under a set of hypotheses related to so-called scaling and hyperscaling postulates which are widely believed to hold up to $d = 6$. As a function of the size of the box, we determine the scaling window in which the system behaves critically. We characterize criticality in terms of the scaling of the sizes of the largest clusters in the box: incipient infinite clusters which give rise to the infinite cluster. Within the scaling window, we show that the size of the largest cluster behaves like $n^d \pi_n$, where $\pi_n$ is the probability at criticality that the origin is connected to the boundary of a box of radius $n$. We also show that, inside the window, there are typically many clusters of scale $n^d \pi_n$, and hence that “the” incipient infinite cluster is not unique. Below the window, we show that the size of the largest cluster scales like $\xi^d \pi_\xi \log(n/\xi)$, where $\xi$ is the correlation length, and again, there are many clusters of this scale. Above the window, we show that the size of the largest cluster scales like $n^d P_\infty$, where $P_\infty$ is the infinite cluster density, and that there is only one cluster of this scale. Our results are finite-dimensional analogues of results on the dominant component of the Erdős–Rényi mean-field random graph model.

1. Introduction: Background and Discussion of Results

We dedicate this paper to Joel Lebowitz on the occasion of his 70th birthday. He is an inspiration to us all.

We present here the complete version of results announced several years ago in [CPS96] and [Cha98]. Finite-size scaling is the study of corrections to the thermodynamic behavior of an infinite system due to finite-size effects.
In particular, this includes the broadening of the transition point into a transition region in a finite system. Here we present an analysis
of finite-size scaling for percolation on the hypercubic lattice, both in two and in higher dimensions. Our analysis is based on a number of postulates which are mathematical expressions of the purported scaling behavior in critical percolation in dimensions two through six. We explicitly verify these scaling postulates in two dimensions.

We consider bond percolation in a finite subset $\Lambda$ of the hypercubic lattice $\mathbb{Z}^d$. Nearest-neighbor bonds in $\Lambda$ are occupied with probability $p$ and vacant with probability $1 - p$, independently of each other. Let $p_c$ denote the bond percolation threshold in $\mathbb{Z}^d$, namely the value of $p$ above which there exists an infinite connected cluster of occupied bonds. As a function of the size of the box $\Lambda$, we determine the scaling window about $p_c$ in which the system behaves critically. For our purposes, criticality is characterized by the behavior of the distribution of sizes of the largest clusters in the box. We show how these clusters can be identified with the so-called incipient infinite cluster – the cluster of infinite expected size which appears at $p_c$.

The motivation for this work was threefold: first, to give a finite-dimensional analogue and interpretation of results on the Erdős–Rényi mean-field random graph model; second, to provide rigorous results on finite-size scaling at a continuous transition; and third, to establish detailed results on incipient infinite clusters which correspond closely to results observed by numerical physicists. In this introduction, we will discuss each aspect of the motivation in some detail.

The Random Graph Model. The original motivation for this work was to obtain an analogue of known results on the random graph model of Erdős and Rényi ([ER59, ER60]; see also [Bol85, AS92]). The random graph is simply the percolation model on the complete graph, i.e., it is a model on a graph of $N$ sites in which each site is connected to each other site, independently, with uniform probability $p(N)$.
It turns out that the model has particularly interesting behavior if $p(N)$ scales like $p(N) \approx c/N$ with $c \asymp 1$. Here, as usual, $f \asymp g$ means that there are finite, strictly positive constants $c_1$ and $c_2$ such that $c_1 g \le f \le c_2 g$. Let $W^{(i)}$ denote the random variable representing the size of the $i$-th largest cluster in the system. Erdős and Rényi ([ER59, ER60]) showed that the model has a phase transition at $c = 1$ characterized by the behavior of $W^{(1)}$. It turns out that, with probability one,
$$W^{(1)} \asymp \begin{cases} \log N & \text{if } c < 1 \\ N^{2/3} & \text{if } c = 1 \\ N & \text{if } c > 1. \end{cases} \tag{1.1}$$
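The three regimes in (1.1) are easy to observe numerically. The following is a minimal sketch, not part of the original analysis: it samples the random graph on $N$ sites with bond probability $c/N$ and returns the component sizes $W^{(1)} \ge W^{(2)} \ge \cdots$ using a union-find structure. The function name `component_sizes`, the seed and the value of $N$ are our own illustrative choices.

```python
import random

def component_sizes(n_sites, p, rng):
    """Sizes of the connected components of the random graph on n_sites
    sites with bond probability p, sorted from largest to smallest."""
    parent = list(range(n_sites))
    size = [1] * n_sites

    def find(x):
        # Path halving keeps the union-find trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n_sites):
        for j in range(i + 1, n_sites):
            if rng.random() < p:  # the bond {i, j} is occupied
                ri, rj = find(i), find(j)
                if ri != rj:
                    if size[ri] < size[rj]:
                        ri, rj = rj, ri
                    parent[rj] = ri
                    size[ri] += size[rj]

    roots = {find(x) for x in range(n_sites)}
    return sorted((size[r] for r in roots), reverse=True)

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, for reproducibility only
    N = 1000
    for c in (0.5, 1.0, 2.0):
        w = component_sizes(N, c / N, rng)
        print(f"c = {c}: W1 = {w[0]}, W2 = {w[1]}")
```

At a single moderate $N$ the output only suggests the scales $\log N$, $N^{2/3}$ and $N$; separating them convincingly would require repeating the experiment over a range of $N$.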
Moreover, for $c > 1$, $W^{(1)}/N$ tends to some constant $\theta(c) > 0$, with probability one, while for $c = 1$, $W^{(1)}$ has a nontrivial distribution (i.e., $W^{(1)}/N^{2/3} \not\to$ constant) ([ER59, ER60, JKLP93, Ald97]). For $c \le 1$, the sizes of the second, third, . . . , largest clusters are of the same scale as that of the largest cluster, while for $c > 1$ this is not the case: For any fixed $i > 1$, $W^{(i)} \asymp \log N$ for all $c \ne 1$ ([ER59, ER60]), while at $c = 1$, $W^{(i)} \asymp N^{2/3}$ [Bol84]. The cluster of order $N$ for $c > 1$ is clearly the analogue of the infinite cluster in percolation on finite-dimensional graphs; in the random graph, it is called the giant component. As we will see, the clusters of order $\log N$ or smaller are analogues of finite clusters in ordinary percolation. The clusters of order $N^{2/3}$ will turn out to be the analogue of the so-called incipient infinite cluster in percolation.

More interestingly, the critical point $c = 1$ is actually broadened into a critical regime by finite-$N$ corrections. It was shown by Bollobás [Bol84] and Łuczak [Luc90] that the
correct parameterization of the critical regime is
$$p(N) = \frac{1}{N}\left(1 + \frac{\lambda_N}{N^{1/3}}\right), \tag{1.2}$$
in the sense that if $\lim_{N\to\infty} |\lambda_N| < \infty$, then $W^{(i)} \asymp N^{2/3}$ for all $i$; see also the combinatoric tour de force of Janson, Knuth, Łuczak and Pittel [JKLP93] for more detailed properties, including some distributional results on the $W^{(i)}$'s. Finally, it was shown by Aldous that the $W^{(i)}$, rescaled by $N^{2/3}$, have a nontrivial limiting joint distribution which can be calculated from a one-dimensional Brownian motion with time-dependent drift [Ald97]. On the other hand, if $\lim_{N\to\infty} \lambda_N = -\infty$, then $W^{(2)}/W^{(1)} \to 1$ with probability one, whereas if $\lim_{N\to\infty} \lambda_N = +\infty$, then $W^{(2)}/W^{(1)} \to 0$ and $W^{(1)}/N^{2/3} \to \infty$ with probability one. The largest component in the regime with $\lambda_N \to +\infty$ is called the dominant component. As we will show, it has an analogue in ordinary percolation.

The initial motivation for our work was to find a finite-dimensional analogue of the above results. To this end, we consider $d$-dimensional percolation in a box of linear size $n$, and hence volume $N = n^d$. We ask how the size of the largest cluster in the box behaves as a function of $n$ for $p < p_c$, $p = p_c$ and $p > p_c$. It is straightforward from known results to describe these cluster sizes for fixed $p \ne p_c$. However, we are interested mainly in the situation where $p$ varies with $n$. In particular, we ask whether there is a window about $p_c$ such that the system has a nontrivial cluster size distribution within the window.

Finite-size scaling. The considerations of the previous paragraph lead us immediately to the question of finite-size scaling (FSS). Phase transitions cannot occur in finite volumes, since all relevant functions are polynomials and thus analytic; nonanalyticities only emerge in the infinite-volume limit. What quantities should we study to see the phase transition emerge as we go to larger and larger volumes?
Before our work, this question had been rigorously addressed in detail only in systems with first-order transitions – transitions at which the correlation length and order parameter are discontinuous ([BoK90, BI92-1, BI92-2]). Finite-size scaling at second-order transitions is more subtle due to the fact that the order parameter vanishes at the critical point. For example, in percolation it is believed that the infinite cluster density vanishes at $p_c$. However, physicists routinely talk about an incipient infinite cluster at $p_c$. This brings us to our third motivation.

The incipient infinite cluster. At $p_c$, it is believed that with probability one there is no infinite cluster. On the other hand, the expected size of the cluster of the origin is infinite at $p_c$, see [Ham57], [Kes82, Cor. 5.1] and [AN84]. This suggests that from the perspective of an observer at the origin, all clusters are finite, with larger and larger clusters appearing as one considers larger and larger length scales. Physicists have called the emerging object the incipient infinite cluster.

In the mid-1980's there were two attempts to construct rigorously an object that could be identified as an incipient infinite cluster. Kesten [Kes86] proposed to look at the conditional measure in which the origin is connected to the boundary of a box centered at the origin, by a path of occupied bonds:
$$P_p^n(\cdot) = P_p(\,\cdot \mid 0 \leftrightarrow \partial[-n, n]^d).$$
Here, as usual, $P_p(\cdot)$ is a product measure at bond density $p$. Observe that, at $p = p_c$, as
$n \to \infty$, $P_p^n(\cdot)$ becomes mutually singular with respect to the unconditioned measure $P_p(\cdot)$. Nevertheless, Kesten found that in $d = 2$,
$$\lim_{n\to\infty} P_{p_c}^n(\cdot) = \lim_{p \downarrow p_c} P_p(\,\cdot \mid 0 \leftrightarrow \infty). \tag{1.3}$$
Moreover, Kesten studied properties of the infinite object so constructed and found that it has a nontrivial fractal dimension which agrees with the fractal dimension of the physicists' incipient infinite cluster.

Another proposal was made by Chayes, Chayes and Durrett [CCD87]. They modified the standard measure in a different manner than Kesten, replacing the uniform $p$ by an inhomogeneous $p(b)$ which varies with the distance of the bond $b$ from the origin:
$$p(b) = p_c + \frac{\lambda}{1 + \operatorname{dist}(0, b)^{\zeta}}, \tag{1.4}$$
with λ constant. The idea was to enhance the density just enough to obtain a nontrivial infinite object. In d = 2, [CCD87] proved that for ζ = 1/ν, where ν is the so-called correlation length exponent, the measure Pp(b) has some properties reminiscent of the physicists’ incipient infinite cluster. In this work, we propose a third rigorous incipient cluster – namely the largest cluster in a box. This is, in fact, exactly the definition that numerical physicists use in simulations. Moreover, it will turn out to be closely related to the IICs constructed by Kesten and Chayes, Chayes and Durrett. Like the IIC of [Kes86], the largest cluster in a box will have a fractal dimension which agrees with that of the physicists’ IIC. Also, our proofs rely heavily on technical estimates from the IIC construction of [Kes86]. More interestingly, the form of the scaling window p(n) for our problem will turn out to be precisely the form of the enhanced density used to construct the IIC of [CCD87]. Yet a fourth candidate for an incipient infinite cluster is a spanning cluster in a large box, an object studied by Aizenman in [Aiz97]. Let us caution the reader that the terminology in [Aiz97] differs somewhat from ours. While Aizenman reserves the term IIC for an incipient infinite cluster viewed from a point inside this cluster (thus implying uniqueness almost by definition), we use the term incipient infinite clusters for the large clusters viewed from the scale of the box under consideration. From this point of view the IIC is not necessarily unique, see below. Recently, Járai has shown that, viewed from a random point in the IIC, all four notions of the IIC lead to the same distribution on local observables in dimension d = 2 [Jar00]. Informal statement and heuristic interpretation of results. Our results will be stated precisely in Sect. 3. Here we give an informal statement in terms of the critical exponents of percolation, assuming these exponents exist. 
Note that our results hold independently of the existence of critical exponents, but they are easier to state informally and to compare to the random graph results (1.1) and (1.2) in terms of these exponents. To this end, let P∞ (p) denote the infinite cluster density, χ fin (p) denote the expected size of finite clusters, ξ(p) denote the correlation length, i.e., the inverse exponential decay rate of the finite cluster connectivity function, and P≥s (p) denote the probability that the cluster of the origin is of size at least s. Also let πn (pc ) denote the probability at criticality that the origin is connected to the boundary of a hypercube of side 2n. See Sect. 2, in particular Eqs. (2.5), (2.15), (2.18), (2.4) and (2.10), for precise definitions.
It is believed, but not proved in low dimensions, that the behavior of these quantities as $p \to p_c$ or at $p = p_c$ is described by the following scaling laws:
$$P_\infty(p) \approx |p - p_c|^{\beta}, \qquad p > p_c, \tag{1.5}$$
$$\chi^{\mathrm{fin}}(p) \approx |p - p_c|^{-\gamma}, \tag{1.6}$$
$$\xi(p) \approx |p - p_c|^{-\nu}, \tag{1.7}$$
$$P_{\ge s}(p_c) \approx s^{-1/\delta}, \tag{1.8}$$
and
$$\pi_n(p_c) \approx n^{-1/\rho}. \tag{1.9}$$
In (1.5)–(1.7), $G(p) \approx |p - p_c|^{\alpha}$ means
$$\lim_{p \to p_c} \frac{\log G(p)}{\log |p - p_c|} = \alpha. \tag{1.10}$$
Unless otherwise noted we implicitly assume that the approach is identical from above and below threshold. Similarly, we use $G(n) \approx n^{\alpha}$ in (1.8)–(1.9) to mean
$$\lim_{n \to \infty} \frac{\log G(n)}{\log n} = \alpha. \tag{1.11}$$
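The convention (1.11) is deliberately coarse: it ignores constant prefactors and logarithmic corrections, and the ratio $\log G(n)/\log n$ converges only slowly. A minimal numerical sketch of this point follows; the two test functions and the value of `alpha` are hypothetical and chosen purely for illustration.

```python
import math

def log_ratio(G, n):
    """Finite-n value of log G(n) / log n, whose limit defines alpha in (1.11)."""
    return math.log(G(n)) / math.log(n)

alpha = 91 / 48  # illustrative exponent only

def pure(n):
    # Power law with a constant prefactor: G(n) = 5 n^alpha.
    return 5.0 * n ** alpha

def with_log(n):
    # Power law with a logarithmic correction: G(n) = n^alpha log n.
    return n ** alpha * math.log(n)

if __name__ == "__main__":
    for k in (2, 4, 8, 16, 32):
        n = 10.0 ** k
        print(k, round(log_ratio(pure, n), 4), round(log_ratio(with_log, n), 4))
```

Both functions satisfy $G(n) \approx n^{\alpha}$ in the sense of (1.11), with deviations of order $1/\log n$ and $\log\log n/\log n$ respectively, which is why statements made with $\approx$ are weaker than those made with $\asymp$.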
Let $\Lambda_n$ denote a hypercube of side $n$ and let $W^{(i)}_{\Lambda_n}$ denote the size of the $i$-th largest cluster in this hypercube. Then, under certain “scaling assumptions,” we find the asymptotic behavior of $W^{(1)}_{\Lambda_n}$, both for fixed $p$ and, more generally, for $p$ which vary with $n$. Combining our results at $p_c$ with known results for fixed $p \ne p_c$, we first establish the following analogue of (1.1):
$$W^{(1)}_{\Lambda_n} \asymp \begin{cases} \log n & \text{if } p < p_c \\ n^{d_f} & \text{if } p = p_c \\ n^{d} & \text{if } p > p_c, \end{cases} \tag{1.12}$$
where we use the suggestive notation
$$d_f = d - 1/\rho \tag{1.13}$$
to indicate that $d - 1/\rho$ is the fractal dimension of our candidate incipient infinite cluster. Moreover, we show that, under the scaling assumptions, the critical point $p_c$ is broadened into a scaling window of the form
$$p(n) = p_c \left(1 \pm \frac{\lambda}{n^{1/\nu}}\right), \tag{1.14}$$
in the sense that inside the window
$$W^{(1)} \approx n^{d_f}, \quad W^{(2)} \approx n^{d_f}, \quad \cdots, \tag{1.15}$$
while above the window
$$W^{(1)} \approx n^{d} P_\infty, \quad W^{(1)}/n^{d_f} \to \infty, \quad W^{(2)}/W^{(1)} \to 0, \tag{1.16}$$
and below the window
$$W^{(1)}/n^{d_f} \to 0, \tag{1.17}$$
where, in fact,
$$W^{(1)} \approx \xi^{d_f} \log(n/\xi). \tag{1.18}$$
The results in (1.14)–(1.18) are established both in expectation and in probability. Note the similarity between the form of the scaling window (1.14) and the bond density (1.4) of the [CCD87] incipient infinite cluster. Furthermore, within the scaling window, we get results on the distribution of cluster sizes which show that the distribution does not go to a point mass. This is to be contrasted with the behavior above the window, where the normalized cluster size approaches its expectation, with probability one. All of these additional results require some delicate second moment estimates.

Our scaling assumptions, which are described in detail in Sect. 3, are explicitly proved in dimension $d = 2$, and are believed – but not proved – to hold for $d$ less than the so-called upper critical dimension $d_c$. The upper critical dimension is the dimension above which the critical exponents assume their Cayley tree values; presumably $d_c = 6$ for percolation.

What would results (1.14) and (1.15) say if we attempted to apply them in the case of the random graph model (to which, of course, they do not rigorously apply)? Let us use the widely believed hyperscaling relation $d\nu = \gamma + 2\beta$ and the observation that the volume $N$ of our system is just $n^d$, to rewrite the window in the form
$$p_n = p_c \left(1 \pm \frac{\lambda}{n^{1/\nu}}\right) = p_c \left(1 \pm \frac{\lambda}{N^{1/d\nu}}\right) = p_c \left(1 \pm \frac{\lambda}{N^{1/(\gamma + 2\beta)}}\right). \tag{1.19}$$
Similarly, let us use the hyperscaling relation $d_f/d = \delta/(1 + \delta)$ to rewrite the size of the largest cluster as
$$W^{(1)} \approx n^{d_f} \approx N^{d_f/d} \approx N^{\delta/(1+\delta)}. \tag{1.20}$$
Noting that the random graph model is a mean-field model, we expect (and in fact it can be verified [BBCK98]) that $\gamma = 1$, $\beta = 1$ and $\delta = 2$. Using also $p_c = 1/N$, (1.19) suggests a window of the form
$$p(N) = \frac{1}{N}\left(1 \pm \frac{\lambda}{N^{1/3}}\right), \tag{1.21}$$
and within that window
$$W^{(1)} \approx N^{2/3}, \tag{1.22}$$
just the values obtained in the combinatoric calculations on the random graph model. We caution the reader that hyperscaling relations do not apply to the random graph, so that a proper version of the arguments above requires that we deal with a “correlation volume” rather than the correlation length, and that we establish (1.20) directly from the scaling of the cluster size distribution (1.8), rather than by recourse to our finite-dimensional results and a hyperscaling relation. Such arguments can be derived, but are beyond the scope of this paper.
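The exponent arithmetic behind (1.19)–(1.22) can be checked mechanically. The sketch below uses exact rational arithmetic; the function names are our own. As a second illustration it also evaluates the same formulas at the values $\beta = 5/36$, $\gamma = 43/18$, $\delta = 91/5$, $\nu = 4/3$ that are widely conjectured for two-dimensional percolation. These numbers are not taken from the text and are included only as an assumption-labelled example.

```python
from fractions import Fraction as Fr

def window_exponent(gamma, beta):
    """Exponent theta in p = p_c (1 +- lambda / N**theta), obtained from
    (1.19) with the hyperscaling relation d * nu = gamma + 2 * beta."""
    return 1 / (gamma + 2 * beta)

def largest_cluster_exponent(delta):
    """Exponent d_f / d = delta / (1 + delta) of N in (1.20)."""
    return delta / (1 + delta)

# Mean-field values quoted in the text.
mean_field = {"gamma": Fr(1), "beta": Fr(1), "delta": Fr(2)}

# Conjectured d = 2 values (an assumption, not from the text).
two_dim = {"gamma": Fr(43, 18), "beta": Fr(5, 36), "delta": Fr(91, 5), "nu": Fr(4, 3)}

if __name__ == "__main__":
    for name, e in (("mean field", mean_field), ("d = 2", two_dim)):
        theta = window_exponent(e["gamma"], e["beta"])
        kappa = largest_cluster_exponent(e["delta"])
        print(f"{name}: window N^-{theta}, largest cluster N^{kappa}")
```

For the mean-field values this gives the exponents $1/3$ and $2/3$ of (1.21) and (1.22). For the two-dimensional values it gives $1/(d\nu) = 3/8$ and $d_f/d = 91/96$, i.e. $d_f = 91/48$.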
Our results also have implications for finite-size scaling. Indeed, the form of the window tells us precisely how to locate the critical point, i.e., it tells us the correct region about $p_c$ in which to do critical calculations.

Finally, the results tell us that we may use the largest cluster in the box as a candidate for the incipient infinite cluster. Within the window, it is not unique, in the sense that there are many clusters of this scale. However, above the window (even including a region where $p$ is not uniformly greater than $p_c$ as $n \to \infty$), there is a unique cluster of largest scale. This is the analogue of what is called the dominant component in the random graph problem.

It is interesting to contrast our results with recent results in high dimensions. As already observed on a heuristic level in [Con85], the validity of hyperscaling is related to the fact that the critical crossing clusters in a box of side length $n$ have size of order $n^{d - 1/\rho}$, and that their number is bounded uniformly in $n$; see [BCKS98] for rigorous results concerning this relationship. Conversely, breakdown of hyperscaling above six dimensions requires, at least on a heuristic level, that at criticality, the number of crossing clusters in a box of side length $n$ grows like $n^{d-6}$, and that all of them have sizes of order $n^4$; see again [Con85]. In a similar way, one would expect that the largest cluster in a box of side length $n$ is of size $n^4$, and that there are roughly $n^{d-6}$ clusters of similar size. Indeed, it can be proven [Aiz97] that these results follow from a postulate on the decay of the connectivity function at criticality which is widely believed to hold above six dimensions. Very recently, T. Hara [Har01] used the so-called lace expansion, in the form developed in [HHS01], to rigorously establish this postulate in sufficiently high dimensions $d \gg 6$.

Methods and organization.
As mentioned above, our results are proved under certain scaling assumptions which we explicitly verify in dimension d = 2. Obviously, the results could have been proven directly – with no assumptions – in d = 2, but the resulting proof would have been quite complicated and would not have yielded much insight. Instead, we formulate postulates which we believe characterize critical behavior in all dimensions below the critical dimension dc , and then prove our results under these postulates. We believe that the postulates are of independent interest since they provide insight into the nature of critical behavior. Indeed, in previous announcements of this work [CPS96] and [Cha98], we used more postulates than we need now. In [BCKS98], we proved that one of these original postulates was implied by several others, in particular that a reasonable assumption on the behavior of crossing probabilities implies certain hyperscaling relations among critical exponents. The proofs in this paper will rely heavily on the results and methods of [BCKS98]. Indeed, [BCKS98] should really be viewed as “Part I” of this paper, since many of our results on the cluster size distribution were derived there. The verification of the postulates in d = 2 relies on the constructive two-dimensional methods of [Kes86] and [Kes87]. The organization of this paper is as follows. In Sect. 2, we give definitions, notations and previous percolation results we will need in our proofs. Our main results are formulated in Sect. 3. There we first state our postulates, and then state the finite-size scaling results under these postulates. In Sect. 4, we state many additional results which may be of independent interest, including the results of [BCKS98]. Finally, using these additional results, in Sect. 5 we prove our main finite-size scaling theorems under the scaling postulates. 
We believe, but cannot prove, that the scaling postulates should hold up to the upper critical dimension, which is believed to be dc = 6 for percolation. Finally, in Sect. 6, we prove that the scaling postulates are satisfied in two dimensions. Thus, we have a complete proof of finite-size scaling for percolation in dimension d = 2. In
Sect. 7, we give a proof of slightly stronger finite-size scaling results under an alternative set of postulates, and also show that the alternative postulates hold in $d = 2$.

2. Definitions, Notation and Preliminaries

Consider the hypercubic site lattice $\mathbb{Z}^d$, and the corresponding bond lattice $\mathbb{B}^d$ consisting of bonds between all nearest-neighbor pairs in $\mathbb{Z}^d$. Bond percolation on $\mathbb{B}^d$ is defined by choosing each bond of $\mathbb{B}^d$ to be occupied with probability $p$ and vacant with probability $1 - p$, independently of all other bonds. The corresponding product measure on configurations of occupied and vacant bonds is denoted by $\Pr_p$. $E_p$ denotes expectation with respect to the measure $\Pr_p$, and $\mathrm{Cov}_p(\cdot\,;\cdot)$ denotes the covariance of two indicator functions with respect to $\Pr_p$: $\mathrm{Cov}_p(A; B) = \Pr_p(A \cap B) - \Pr_p(A)\Pr_p(B)$. A generic configuration is denoted by $\omega$. If $S_1, S_2, S_3 \subset \mathbb{Z}^d$, we say that $S_1$ is connected to $S_2$ in $S_3$, denoted by $\{S_1 \leftrightarrow S_2 \text{ in } S_3\}$, if there exists an occupied path with vertices in $S_3$ from some site of $S_1$ to some site of $S_2$. Maximal connected subsets are called (occupied) clusters. The occupied cluster (in the configuration $\omega$) containing the site $x$ is denoted by $C(x) = C(x; \omega)$. The size of the cluster $C$, denoted by $|C|$, is the number of sites in $C$. $C_\infty$ denotes the (unique) infinite cluster, i.e., the occupied cluster with $|C| = \infty$.

We also consider clusters in a finite box $\Lambda \subset \mathbb{Z}^d$. The connected component of $x$ in $C(x) \cap \Lambda$ is denoted by $C_\Lambda(x) = C_\Lambda(x; \omega)$; this is therefore the collection of all points which are connected to $x$ by an occupied path in $\Lambda$. $C_\Lambda^{(1)}, C_\Lambda^{(2)}, \cdots, C_\Lambda^{(k)}$ denote the occupied clusters in $\Lambda$, ordered from largest to smallest size, with lexicographic order between clusters of the same size. $W_\Lambda^{(i)} = |C_\Lambda^{(i)}|$ denotes the size of the $i$-th largest cluster in $\Lambda$. Finally
$$N_\Lambda(s_1, s_2) = |\{i \mid s_1 \le W_\Lambda^{(i)} \le s_2\}| \tag{2.1}$$
denotes the number of clusters in $\Lambda$ with size between $s_1$ and $s_2$, and
$$\widetilde N_\Lambda(s_1, s_2) = |\{i \mid s_1 \le W_\Lambda^{(i)} \le s_2,\ C_\Lambda^{(i)} \not\leftrightarrow \partial\Lambda\}| \tag{2.2}$$
is the corresponding number of clusters which do not touch the boundary $\partial\Lambda$ of $\Lambda$. Here $\partial\Lambda$ is the set of points $x \in \Lambda$ that have distance less than 1 from the complement $\Lambda^c = \mathbb{Z}^d \setminus \Lambda$ of $\Lambda$.

Returning now to the model on the full lattice, the cluster size distribution is characterized by
$$P_s = P_s(p) = \Pr\nolimits_p(|C(0)| = s), \tag{2.3}$$
or alternatively
$$P_{\ge s} = P_{\ge s}(p) = \Pr\nolimits_p(|C(0)| \ge s). \tag{2.4}$$
The order parameter of the model is the percolation probability or infinite-cluster density
$$P_\infty(p) = \Pr\nolimits_p(|C(0)| = \infty). \tag{2.5}$$
The critical probability is
$$p_c = \inf\{p : P_\infty(p) > 0\}. \tag{2.6}$$
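The finite-box quantities $W_\Lambda^{(i)}$ and $N_\Lambda(s_1, s_2)$ are exactly what one measures in a simulation. A minimal sketch for $d = 2$ follows, assuming a box of $n \times n$ sites in which only bonds with both endpoints inside the box are sampled; the function names, the seed and the box size are our own illustrative choices. The demo uses the known value $p_c = 1/2$ for bond percolation on $\mathbb{Z}^2$.

```python
import random

def cluster_sizes(n, p, rng):
    """Sizes W^(1) >= W^(2) >= ... (in sites) of the occupied clusters of
    bond percolation with density p in an n x n box of Z^2."""
    parent = list(range(n * n))
    size = [1] * (n * n)

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra
        size[ra] += size[rb]

    for x in range(n):
        for y in range(n):
            site = x * n + y
            # Each nearest-neighbor bond inside the box is occupied
            # independently with probability p.
            if x + 1 < n and rng.random() < p:
                union(site, site + n)
            if y + 1 < n and rng.random() < p:
                union(site, site + 1)

    roots = {find(s) for s in range(n * n)}
    return sorted((size[r] for r in roots), reverse=True)

def count_clusters(sizes, s1, s2):
    """N(s1, s2) as in (2.1): number of clusters with s1 <= size <= s2."""
    return sum(1 for w in sizes if s1 <= w <= s2)

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, for reproducibility only
    n = 200
    for p in (0.4, 0.5, 0.6):
        w = cluster_sizes(n, p, rng)
        print(f"p = {p}: W1 = {w[0]}, W2 = {w[1]}, N(W1/2, W1) = {count_clusters(w, w[0] / 2, w[0])}")
```

Isolated sites are counted as clusters of size one, in line with the convention that cluster size is the number of sites. The variant $\widetilde N_\Lambda$ of (2.2) would additionally require tracking which clusters contain a boundary site, which this sketch does not do.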
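We consider several connectivity functions: the (point-to-point) connectivity function
$$\tau(v, w; p) = \Pr\nolimits_p(v \leftrightarrow w), \tag{2.7}$$
the finite-cluster (point-to-point) connectivity function
$$\tau^{\mathrm{fin}}(v, w; p) = \Pr\nolimits_p(v \leftrightarrow w,\ |C(v)| < \infty), \tag{2.8}$$
the point-to-hyperplane connectivity function
$$\tilde\pi_n(p) = \Pr\nolimits_p\{\exists\, v = (n, \cdot) \text{ such that } 0 \leftrightarrow v\} \tag{2.9}$$
($v = (n, \cdot)$ means that the first coordinate of $v$ equals $n$), and the point-to-box connectivity function
$$\pi_n(p) = \Pr\nolimits_p\{0 \leftrightarrow \partial B_n(0)\}, \tag{2.10}$$
where
$$B_n(v) = \{w \in \mathbb{Z}^d : |v - w|_\infty \le n\} = [-n, n]^d \cap \mathbb{Z}^d, \tag{2.11}$$
with $|\cdot|_\infty$ denoting the $\ell_\infty$-norm. Notice that $\tilde\pi_n(p)$ and $\pi_n(p)$ are equivalent, in the sense that
$$\tilde\pi_n(p) \le \pi_n(p) \le 2d\, \tilde\pi_n(p). \tag{2.12}$$
A quantity which for $p > p_c$ behaves much like $\tau^{\mathrm{fin}}(x, y; p)$ is the covariance:
$$\tau^{\mathrm{cov}}(v, w; p) = \mathrm{Cov}_p(v \leftrightarrow \infty;\ w \leftrightarrow \infty) \tag{2.13}$$
(see [CCGKS89], Sect. 6). We also consider several susceptibilities:
$$\chi(p) = E_p(|C(0)|) = \sum_v \tau(0, v; p), \tag{2.14}$$
$$\chi^{\mathrm{fin}}(p) = E_p(|C(0)|,\ |C(0)| < \infty) = \sum_v \tau^{\mathrm{fin}}(0, v; p) = \sum_{s < \infty} s P_s(p) \tag{2.15}$$
and $\chi^{\mathrm{cov}}(p) =$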
… $p > p_c$ this follows from Grimmett and Marstrand [GM90]). While it is also believed that $\xi(p) \to \infty$ as $p \downarrow p_c$, this is rigorously known only for $d = 2$.

Alternatively, lengths may be expressed in terms of the finite-size scaling correlation length $L_0(p, \varepsilon)$, introduced in [CCF85] and studied in [CCF85, CCFS86] and [Kes87]. For $p < p_c$, $L_0(p, \varepsilon)$ is defined in terms of the crossing probabilities of rectangles, the so-called sponge crossing probabilities:
$$R_{L,M}(p) = \Pr\nolimits_p\{\exists \text{ occupied bond crossing of } [0, L] \times [0, M] \cdots \times [0, M] \text{ in the 1-direction}\}. \tag{2.19}$$
Observing that, for $p < p_c$, the sponge crossing probability $R_{L,3L}(p) \to 0$ as $L \to \infty$, we define
$$L_0(p) = L_0(p, \varepsilon) = \min\{L \ge 1 \mid R_{L,3L}(p) \le \varepsilon\} \quad \text{if } p < p_c. \tag{2.20}$$
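Definitions (2.19) and (2.20) translate directly into a Monte Carlo procedure. The following rough sketch treats $d = 2$ and assumes that the rectangle $[0, L] \times [0, M]$ is read as the lattice sites $\{0, \dots, L\} \times \{0, \dots, M\}$ with all nearest-neighbor bonds between them. The helper names (`crosses`, `sponge_probability`, `estimate_L0`), the sample count and the cap `L_max` are our own choices, and because of sampling noise the returned value only approximates $L_0(p, \varepsilon)$.

```python
import random

def crosses(L, M, p, rng):
    """One sample of the crossing event of (2.19) in d = 2: is there an
    occupied path in [0, L] x [0, M] from the face x = 0 to the face x = L?"""
    cols, rows = L + 1, M + 1
    left, right = cols * rows, cols * rows + 1  # virtual sites for the two faces
    parent = list(range(cols * rows + 2))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    def union(a, b):
        parent[find(a)] = find(b)

    for x in range(cols):
        for y in range(rows):
            s = x * rows + y
            if x == 0:
                union(s, left)
            if x == L:
                union(s, right)
            if x < L and rng.random() < p:  # bond in the 1-direction
                union(s, s + rows)
            if y < M and rng.random() < p:  # bond in the 2-direction
                union(s, s + 1)
    return find(left) == find(right)

def sponge_probability(L, p, rng, samples=200):
    """Monte Carlo estimate of R_{L,3L}(p)."""
    return sum(crosses(L, 3 * L, p, rng) for _ in range(samples)) / samples

def estimate_L0(p, eps, rng, samples=200, L_max=64):
    """Smallest L <= L_max whose estimated R_{L,3L}(p) is at most eps,
    or None if no such L is found."""
    for L in range(1, L_max + 1):
        if sponge_probability(L, p, rng, samples) <= eps:
            return L
    return None

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, for reproducibility only
    for p in (0.2, 0.3, 0.4):
        print(p, estimate_L0(p, 0.1, rng))
```

A useful sanity check is the case $L = 1$, where a crossing exists exactly when at least one of the four bonds in the 1-direction is occupied, so $R_{1,3}(p) = 1 - (1 - p)^4$. Closer to $p_c$ the cap `L_max` and the sample count would have to grow considerably.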
Using the methods and results of [ACCFR83, CC86, CCF85] and [Kes87], it is straightforward to show that there exists $a(d) > 0$ such that for $\varepsilon < a(d)$, the scaling behavior of $L_0(p, \varepsilon)$ is independent of $\varepsilon$ for $p < p_c$, in the sense that $L_0(p, \varepsilon_1)/L_0(p, \varepsilon_2)$ is bounded away from 0 and infinity for two fixed values $\varepsilon_1, \varepsilon_2 < a(d)$. This scaling behavior is also essentially the same as that of the standard correlation length $\xi(p)$. More specifically, for $0 < \varepsilon < a(d)$, there exist constants $c_1 = c_1(d)$, $c_2 = c_2(d, \varepsilon) < \infty$ such that¹
$$\frac{1}{L_0(p, \varepsilon)} \le \frac{1}{\xi(p)} \le \frac{c_1 \log L_0(p, \varepsilon) + c_2}{L_0(p, \varepsilon) - 1}, \qquad p < p_c. \tag{2.21}$$
¹ K. Alexander [Ale96] has shown that one can take $c_1(d = 2) = 0$ in (2.21).

Hereafter we will assume that $\varepsilon < a(d)$; we usually suppress the $\varepsilon$-dependence in our notation.

For $p > p_c$, it is natural to define $L_0(p, \varepsilon)$ in terms of a suitable finite-cluster analogue of the sponge-crossing probability $R_{L,M}(p)$, see [CC87], Eq. (53). For technical reasons, it is convenient, however, to consider instead crossings in an annulus
$$H_{L,M} = \mathbb{Z}^d \cap [-L, L + M]^d \setminus (0, M)^d, \tag{2.22}$$
with inner and outer boundaries $\partial_I H_{L,M}$ and $\partial_E H_{L,M}$. We say that an occupied cluster $C_H$ in $H = H_{L,M}$ is $H$-finite if $H \setminus C_H$ contains a path – occupied or not – that connects $\partial_I H$ to $\partial_E H$. Let
$$S^{\mathrm{fin}}_{L,M}(p) = \Pr\nolimits_p\{\exists \text{ an occupied } H\text{-finite cluster } C_H \text{ in } H = H_{L,M} \text{ that connects } \partial_I H \text{ to } \partial_E H\}, \tag{2.23}$$
with the convention $S^{\mathrm{fin}}_{0,M}(p) = 1$. We define
$$L_0(p) = L_0(p, \varepsilon) = 1 + \max\{L \ge 0 : S^{\mathrm{fin}}_{L,L}(p) \ge \varepsilon\} \quad \text{if } p > p_c, \tag{2.24}$$
and more generally, for $x \ge 1$,
$$L_0(p, \varepsilon; x) = 1 + \max\{L \ge 0 : S^{\mathrm{fin}}_{L,xL}(p) \ge \varepsilon\} \quad \text{if } p > p_c. \tag{2.25}$$
Note that $L_0(p, \varepsilon; x)$ may be finite or infinite, depending on whether or not there exists an $L_0 < \infty$ such that $S^{\mathrm{fin}}_{L,xL}(p) < \varepsilon$ for all $L \ge L_0$. We expect that this definition
coincides, say in the sense of Eq. (2.21) (with an $x$-dependent constant $c_2$, and $c_1(d) = 0$), with the standard correlation length $\xi(p)$ above threshold. However, we are not able to prove this in $d \ge 3$, since the rescaling techniques of [ACCFR83] do not work for finite-cluster crossings. In $d = 2$, we can use a Harris ring construction [Har60] in conjunction with the Russo–Seymour–Welsh Lemma ([Rus78, SW78]) to show that this definition is equivalent to $\xi(p)$; see Sect. 6.

An important quantity in the high-density phase is the surface tension $\sigma(p)$; see [ACCFR83] for the precise definition. By analogy with the definition of a finite-size scaling correlation length below threshold, we define a finite-size scaling inverse surface tension as
$$A_0(p) = A_0(p, \varepsilon) = \min\{L^{d-1} \ge 1 \mid R_{L,3L}(p) \ge 1 - \varepsilon\} \quad \text{if } p > p_c. \tag{2.26}$$
It is easy to see that A0(p) is well-defined and finite for all p > pc. Indeed, p > pc implies P∞(p) > 0, which in turn implies that the probability of the event |C(x)| < ∞ for all $x \in \mathbb{Z}^d \cap [0, L]^d$ goes to zero as L → ∞. Since this probability is bounded from below by $(1 - R_{L,3L}(p))^{2d}$ (cf. the proof of Lemma 4.4), this implies that $R_{L,3L}(p) \to 1$ as L → ∞, and hence A0(p) is well-defined and finite. We expect that A0(p) is equivalent to the inverse surface tension 1/σ(p) (see footnote 2), which in turn should be equivalent to $\xi^{d-1}(p)$ below the critical dimension dc (presumably dc = 6). Again, we are only able to prove this equivalence in d = 2.
While the behavior of L0(p) below pc is well understood in general dimension, much less is known about L0(p) or A0(p) above pc. In particular, below pc, it is easy to see that L0(p) is monotone increasing, left continuous and piecewise constant. Moreover,
$$L_0(p) \uparrow \infty \quad \text{as } p \uparrow p_c, \tag{2.27}$$
because $R_{L,3L}(p_c)$ is bounded away from 0 (e.g., by Theorem 5.1 in [Kes82]). Furthermore, the jumps in L0(p) are uniformly bounded on a logarithmic scale. In particular, by the methods of [ACCFR83, CC86, CCF85] and [Kes87], we have
$$R_{2L,6L} \le \frac{1}{a(d)} R_{L,3L}^2, \tag{2.28}$$
which in turn implies
$$\lim_{\delta \to 0} \frac{L_0(p + \delta)}{L_0(p)} \le 2, \tag{2.29}$$
provided p < pc and ε < a(d). By contrast, none of these properties are known for L0(p) above pc. Next consider A0(p), which, almost by definition, is monotone decreasing and right continuous. However, in general dimension, we do not have a proof that A0(p) diverges as p ↓ pc, nor do we have a bound of the form (2.29). We will therefore require several postulates on the behavior of L0(p) and A0(p) above pc.
Footnote 2: Using Proposition 3 of [CC87], one can actually prove that $A_0(p) \le \mathrm{const}\, \sigma(p)^{-1}$ for all d ≥ 2. We do not expect that the opposite inequality holds for d > dc, the critical dimension, since such an inequality, together with the usual assumption that σ(p) → 0 as p ↓ pc, would imply that A0(p) → ∞ as p ↓ pc for d > dc, which is believed to be false, see Sect. 3.3.
3. Statement of Postulates and Theorems
3.1. The scaling postulates. Most of our theorems are established under a set of assumptions which we can verify explicitly in two dimensions, and which we expect to be true for all dimensions not exceeding the critical dimension dc (presumably dc = 6). We call these assumptions the Scaling Postulates, since they follow from the type of scaling typically assumed in the physics literature. Since L0(p) and A0(p) depend on ε, see Eqs. (2.20), (2.24) and (2.26), many of our postulates implicitly involve the constant ε. We assume that they are true for all nonzero ε < ε0, where ε0 = ε0(d) is a suitable constant. We write our postulates in terms of the equivalence symbol $\asymp$. Here
$$F(p) \asymp G(p) \tag{3.1}$$
means that there are lower and upper bounds of the form
$$C_1 F(p) \le G(p) \le C_2 F(p), \tag{3.2}$$
where C1 > 0 and C2 < ∞ are constants which do not depend on p, as long as p is uniformly bounded away from zero or one, but which may depend on the constants ε, $\tilde\varepsilon$ or x appearing explicitly or implicitly in the postulates. Occasionally, p is further restricted to lie on one side of pc. Similarly $F(n) \asymp G(n)$ means that C1 F(n) ≤ G(n) ≤ C2 F(n) for some constants 0 < C1 ≤ C2 < ∞. Our scaling postulates are
(I) L0(p) → ∞ as p ↓ pc;
(II) $A_0(p) \asymp L_0^{d-1}(p) \asymp L_0^{d-1}(p, \tilde\varepsilon; x)$, provided p > pc, x ≥ 1 and $0 < \tilde\varepsilon < \varepsilon_0$;
(III) There are constants D1 > 0 and D2 < ∞ such that
$$D_1 \le \frac{\pi_n(p)}{\pi_n(p_c)} \le D_2 \quad \text{if } n \le L_0(p);$$
(IV) There are constants D3 > 0 and ρ1 > 2/d such that
$$\frac{\pi_m(p_c)}{\pi_n(p_c)} \ge D_3 \Bigl(\frac{m}{n}\Bigr)^{-1/\rho_1} \quad \text{if } m \ge n \ge 1;$$
(V) There is a constant D4 such that
$$\chi^{\mathrm{cov}}(p) \le D_4 L_0^d(p)\, \pi^2_{L_0(p)}(p_c) \quad \text{and} \quad \chi^{\mathrm{fin}}(p) \le D_4 L_0^d(p)\, \pi^2_{L_0(p)}(p_c) \quad \text{if } p > p_c;$$
(VI) $\pi_{L_0(p)}(p_c) \asymp P_\infty(p)$ if p > pc;
(VII) There are constants D5, D6 < ∞ such that
$$P_{\ge k s(L_0(p))}(p) \ge D_5 e^{-D_6 k} P_{\ge s(L_0(p))}(p) \quad \text{if } p < p_c \text{ and } k \ge 1.$$
We shall have some comments on the interpretation of the postulates and other remarks after we state our theorems.
3.2. Statement of the main results. A central concept in our theorems is the notion of a scaling window in which the system behaves critically. This can best be described by the function
$$g(p, n) := \begin{cases} -\dfrac{n}{L_0(p)} & \text{if } p < p_c, \\ 0 & \text{if } p = p_c, \\ \dfrac{n}{L_0(p)} & \text{if } p > p_c. \end{cases} \tag{3.3}$$
It will be seen that a sequence of systems with density pn behaves critically (as far as size of large clusters is concerned) in the finite boxes
$$\Lambda_n := \{v \in \mathbb{Z}^d \mid -n \le v_i < n,\ i = 1, \dots, d\} \tag{3.4}$$
if
$$p_n \to p_c \quad \text{and} \quad \limsup_{n \to \infty} |g(p_n, n)| < \infty. \tag{3.5}$$
If this is the case we shall say that the (sequence of) systems are inside the scaling window. We shall say that the systems are below (respectively above) the scaling window if g(pn, n) → −∞ (respectively, g(pn, n) → ∞). These regimes correspond to subcritical, respectively supercritical behavior. In particular we must have pn < pc eventually if {pn} lies below the scaling window, and pn > pc eventually if {pn} lies above the scaling window.
Our theorems below give many details of the finite-size scaling behavior of the system inside, above, and below the scaling window. They confirm the folklore that within distances of the order of the correlation length the system behaves critically. Specifically, we make this statement precise for the behavior of the size of the large clusters. Unfortunately we cannot derive this from the definition of correlation length only. One of our basic assumptions is that within the correlation length the point-to-box connectivity behaves as it does at the critical point (see Postulate III). In order to state these theorems, we again use the symbol $\asymp$, this time for two sequences an and bn of real numbers. We write
$$a_n \asymp b_n \tag{3.6}$$
if
$$0 < \liminf_{n \to \infty} \frac{a_n}{b_n} \le \limsup_{n \to \infty} \frac{a_n}{b_n} < \infty. \tag{3.7}$$
$|\Lambda_n|$ denotes the number of sites in Λn; thus $|\Lambda_n| = (2n)^d$. We remind the reader that Postulates (I)–(VII) are verified for d = 2 in Sect. 6. Thus all the conclusions of our theorems hold in the two-dimensional case. Our first theorem characterizes the scaling window in terms of the expectation of the largest cluster sizes.
Theorem 3.1. i) Suppose that Postulates (I)–(IV) hold. If {pn} is inside the scaling window, i.e., if $\limsup_{n \to \infty} |g(p_n, n)| < \infty$, and i ∈ N, then
$$E_{p_n}\{W^{(i)}_{\Lambda_n}\} \asymp s(n). \tag{3.8}$$
ii) Suppose that Postulates (I)–(IV) and (VII) hold. If {pn} is below the scaling window, i.e., g(pn, n) → −∞, then
$$E_{p_n}\{W^{(1)}_{\Lambda_n}\} \asymp s(L_0(p_n)) \log \frac{n}{L_0(p_n)}. \tag{3.9}$$
iii) Suppose that Postulates (II), (V) and (VI) hold. If {pn} is above the scaling window, i.e., g(pn, n) → ∞, then
$$\frac{E_{p_n}\{W^{(1)}_{\Lambda_n}\}}{|\Lambda_n| P_\infty(p_n)} \to 1 \quad \text{as } n \to \infty, \tag{3.10}$$
and
$$\frac{E_{p_n}\{W^{(2)}_{\Lambda_n}\}}{|\Lambda_n| P_\infty(p_n)} \to 0 \quad \text{as } n \to \infty. \tag{3.11}$$
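The scaling-window coordinate g(p, n) of (3.3) is easy to tabulate once a correlation length is available. The following minimal sketch is only an illustration: the names `g`, `toy_L0`, `P_C` and `NU` are hypothetical, and `toy_L0` is a stand-in power law $|p - p_c|^{-\nu}$, not the crossing-probability definition of L0(p) used in the text. The values p_c = 1/2 and ν = 4/3 are assumed for the example.

```python
import math

P_C = 0.5      # assumed critical density (illustration only)
NU = 4.0 / 3   # assumed correlation-length exponent (illustration only)


def toy_L0(p):
    """Hypothetical stand-in for L_0(p): a pure power law |p - p_c|^(-nu)."""
    return abs(p - P_C) ** (-NU)


def g(p, n, p_c, L0):
    """Scaling-window coordinate of Eq. (3.3) for density p and box size n."""
    if p < p_c:
        return -n / L0(p)
    if p > p_c:
        return n / L0(p)
    return 0.0


if __name__ == "__main__":
    lam = 2.0
    for n in (10**2, 10**4, 10**6):
        p_n = P_C + lam * n ** (-1.0 / NU)   # shift of order n^(-1/nu)
        print(n, g(p_n, n, P_C, toy_L0))     # stays bounded in n
```

Under the toy power law, a shift of order $n^{-1/\nu}$ keeps g(p_n, n) equal to $\lambda^{\nu}$ for every n, so the sequence sits inside the window in the sense of (3.5), while a fixed shift p_n = p below p_c gives g(p, n) → −∞.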
The next theorem tells us about the distribution of the largest cluster sizes above the scaling window.
Theorem 3.2. Suppose that Postulates (II), (V) and (VI) hold. Let {pn} be above the scaling window. Then, as n → ∞,
$$\frac{W^{(1)}_{\Lambda_n}}{|\Lambda_n| P_\infty(p_n)} \to 1 \quad \text{in probability}. \tag{3.12}$$
The next theorem gives information about the distribution of the cluster sizes inside the scaling window. It shows that, in this regime, the tails of the distribution of $W^{(i)}_{\Lambda_n}/E\{W^{(i)}_{\Lambda_n}\}$ decay, but the distribution does not go to a delta function. This should be contrasted with the behavior (3.12), which shows that above the scaling window the distribution of $W^{(1)}_{\Lambda_n}/E\{W^{(1)}_{\Lambda_n}\}$ does tend to a delta function.
Theorem 3.3. Suppose that Postulates (I)–(IV) hold. Let {pn} lie inside the scaling window.
i) For all i < ∞,
$$\liminf_{n \to \infty} \mathrm{Pr}_{p_n}\Bigl\{K^{-1} \le \frac{W^{(i)}_{\Lambda_n}}{E_{p_n}\{W^{(i)}_{\Lambda_n}\}} \le K\Bigr\} \to 1 \quad \text{as } K \to \infty. \tag{3.13}$$
ii) For each K < ∞ and all i < ∞,
$$\limsup_{n \to \infty} \mathrm{Pr}_{p_n}\Bigl\{\frac{W^{(i)}_{\Lambda_n}}{E_{p_n}\{W^{(i)}_{\Lambda_n}\}} \ge K^{-1}\Bigr\} < 1. \tag{3.14}$$
We have one more theorem for p inside the scaling window. This concerns the number of clusters on scales m < n. Before stating the theorem, we point out that, due to (3.8), the “incipient infinite cluster” inside the scaling window is not unique, in the sense that $W^{(2)}_{\Lambda_n}$ is of the same scale as $W^{(1)}_{\Lambda_n}$. This should be contrasted with the behavior of $W^{(2)}_{\Lambda_n}/W^{(1)}_{\Lambda_n}$ above the scaling window (see (3.10) and (3.11)), a remnant of the uniqueness of the infinite cluster above pc. The next theorem relates the non-uniqueness of the “incipient infinite cluster” inside the scaling window to the property of scale invariance at pc. We remind the reader that the quantities $N_{\Lambda_n}$ and $\tilde N_{\Lambda_n}$ are defined in Eq. (2.1) and (2.2).
Theorem 3.4. Suppose that Postulates (I)–(IV) hold. Let {pn} lie inside the scaling window. Then there exist strictly positive, finite constants σ1, σ2, C1 and C2 (all depending on the sequence {pn}, but not on n, m or k) such that
$$C_1 \Bigl(\frac{n}{m}\Bigr)^d \le E_{p_n} \tilde N_{\Lambda_n}(s(m), s(km)) \le E_{p_n} N_{\Lambda_n}(s(m), s(km)) \le C_2 \Bigl(\frac{n}{m}\Bigr)^d, \tag{3.15}$$
provided m and k are strictly positive integers with k ≥ σ1 and σ2 m ≤ n.
Our next theorem gives the behavior of the $W^{(i)}_{\Lambda_n}$ when p is below the scaling window.
Theorem 3.5. Suppose that Postulates (I)–(IV) and (VII) hold. Let {pn} lie below the scaling window. Then, for each fixed i,
$$\liminf_{n \to \infty} \mathrm{Pr}_{p_n}\Bigl\{K^{-1} \le \frac{W^{(i)}_{\Lambda_n}}{s(L_0(p_n)) \log \frac{n}{L_0(p_n)}} \le K\Bigr\} \to 1 \quad \text{as } K \to \infty. \tag{3.16}$$
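The ranked cluster sizes $W^{(1)} \ge W^{(2)} \ge \dots$ that appear in Theorems 3.1–3.5 can be explored numerically. The sketch below is an illustration under stated assumptions, not the construction used in the proofs: it assumes independent nearest-neighbour bond percolation restricted to the box of (3.4), and the function name `ranked_cluster_sizes` and its parameters are hypothetical.

```python
import itertools
import random
from collections import Counter


def ranked_cluster_sizes(n, p, d=2, seed=None):
    """Sizes of the open clusters in the box {-n, ..., n-1}^d, largest first.

    Assumption: each nearest-neighbour bond inside the box is occupied
    independently with probability p (bond percolation in the box).
    """
    rng = random.Random(seed)
    sites = list(itertools.product(range(-n, n), repeat=d))
    index = {v: i for i, v in enumerate(sites)}
    parent = list(range(len(sites)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for v in sites:
        for axis in range(d):
            w = v[:axis] + (v[axis] + 1,) + v[axis + 1:]
            if w in index and rng.random() < p:
                ri, rj = find(index[v]), find(index[w])
                if ri != rj:
                    parent[ri] = rj

    sizes = Counter(find(i) for i in range(len(sites)))
    return sorted(sizes.values(), reverse=True)


if __name__ == "__main__":
    for p in (0.3, 0.5, 0.7):
        w = ranked_cluster_sizes(32, p, seed=1)
        print(p, w[0], w[1], w[1] / w[0])
```

Comparing the ratio of the second to the first entry at densities below, near and above the threshold gives a rough feel for the contrast between (3.8) and (3.10)–(3.11); a single sample at one box size is of course no substitute for the limit statements.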
As mentioned before, we expect the Scaling Postulates to hold for all d ≤ dc = 6. The next theorem states that they do hold if d = 2.
Theorem 3.6. The Postulates (I)–(VII) hold in d = 2.
Notice that in Theorem 3.3 ii) (in conjunction with (3.8)), we prove that inside the scaling window the support of $W^{(i)}_{\Lambda_n}/s(n)$ is not bounded away from 0. We would expect that this support is also unbounded above and that this should be easy to prove from Postulate (VII), which states in a way that the support of |C(0)|/s(L0(p)) is unbounded. However we have been unable to derive this from the Postulate (VII). Instead, in Sect. 7, we consider an alternative postulate, Postulate (VII alt), which says roughly that clusters of size of order s(L0(p)) and distance of order L0(p) have a reasonable chance of being connected to each other. In that section, we prove the following theorem.
Theorem 3.7. i) Suppose Postulates (I)–(IV) and (VII alt) hold. Let {pn} be inside the scaling window and let i ∈ N. Then
$$\limsup_{n \to \infty} \mathrm{Pr}_{p_n}\Bigl\{\frac{W^{(i)}_{\Lambda_n}}{E_{p_n}\{W^{(i)}_{\Lambda_n}\}} \le K\Bigr\} < 1 \quad \text{for all } K < \infty.$$
ii) Postulate (VII alt) holds in d = 2.
3.3. Comments on the postulates and further remarks. The interpretation of our postulates is as follows. The first tells us that the approach to pc is critical (i.e., continuous or second-order) from above pc. The second postulate is the assumption of equivalence of length scales above pc: namely, Widom scaling, dimensionally relating the surface tension to the correlation length, together with the equivalence of the finite-size scaling lengths at various values of x ≥ 1 and ε ∈ (0, ε0). This postulate is not expected to hold above the critical dimension. In fact, it is not even believed that A0(p) → ∞ as p ↓ pc, because this would imply that the crossing probability $R_{L,3L}(p_c)$ is bounded away from 1 uniformly in L. But uniform boundedness of crossing probabilities implies hyperscaling [BCKS98], which is not believed to hold above the upper critical dimension dc.
Postulate (III) tells us that the system within the correlation length behaves as it does at threshold, at least as characterized by the behavior of the point-to-box connectivity function. Postulate (IV) implies that the connectivity function has a lower bound of power law behavior at threshold. Especially Postulates (III) and (IV) turn out to imply more than is immediately apparent. Proposition 4.6 states that the cluster size distribution for clusters with diameters up to the correlation length behaves like the corresponding distribution at threshold. This proposition also gives us a hyperscaling relation between the exponents δ and ρ, assuming that these exponents exist. We also obtain a scaling relation for χ(p) in Proposition 4.8. Assuming power laws for χ and L0, and the relation (4.24), the assumed bound on ρ1 in Postulate (IV) is equivalent to the very weak bound γ > 0. But it is known ([AN84]) that $\chi(p) \ge C_1 (p_c - p)^{-1}$, p < pc, i.e., γ ≥ 1 if it exists. In the light of this, Postulate (IV) seems very reasonable. The fifth and sixth postulates give various exponent relations, again provided that these exponents exist. Finally, the last postulate states that (in the subcritical region) s(L0(p)) is the natural scale for the cluster size distribution and that on this scale the tail of the distribution does not decay faster than exponentially. Proposition 4.8 provides an inequality in the opposite direction, i.e., this decay is at least exponentially fast. See also Remark vi) below.
Remarks. i) Assuming the existence of the exponent ρ, see (1.9), Theorem 3.1 implies that inside the scaling window the largest, second largest, third largest, ..., clusters scale like $n^{d_f}$, with $d_f = d - 1/\rho$, while below the scaling window the size of the largest cluster (and hence of all clusters) goes to zero on the scale $n^{d_f}$.
ii) By Postulate (VI), and Lemma 4.5 below,
$$\frac{|\Lambda_n| P_\infty(p_n)}{s(n)} = \frac{P_\infty(p_n)}{\pi_n(p_c)} \asymp \frac{\pi_{L_0(p_n)}(p_c)}{\pi_n(p_c)} \to \infty \tag{3.17}$$
above the scaling window. Statement iii of Theorem 3.1 therefore implies that
$$\frac{E_{p_n}\{W^{(1)}_{\Lambda_n}\}}{s(n)} \to \infty \quad \text{as } n \to \infty \tag{3.18}$$
above the scaling window.
iii) Assume that the critical exponent ν, see Eq. (1.7), exists, and that an equivalence of the form (2.21) holds for p > pc as well. Choose $p_n^- = \sup\{p < p_c : L_0(p) \le n\}$. Then by (2.29), $L_0(p_n^-) \asymp n$. Moreover,
$$L_0(p_n^-) \approx \xi(p_n^-) \approx |p_n^- - p_c|^{-\nu} \tag{3.19}$$
so that $p_c - p_n^- \approx n^{-1/\nu}$. Finally, {pn} is below the scaling window if $\liminf_{n \to \infty} \log(p_c - p_n)/\log n > -1/\nu$. Similar statements hold to the right of pc with $p_n^+ := \inf\{p > p_c : L_0(p) \le n\}$, provided we make the further assumption that
$$\limsup_{p \downarrow p_c} \lim_{\delta \downarrow 0} \frac{L_0(p - \delta)}{L_0(p)} < \infty.$$
Thus under these various assumptions the scaling window has width $n^{-1/\nu}$. It should be pointed out, though, that at present we do not have enough rigorous knowledge of the behavior of L0(p) as a function of p to define the scaling window in terms of the behavior of $(p_n - p_c)/g_n^{\pm}$ for suitable sequences $\{g_n^{\pm}\}$. For instance, it is not
known that there exists a sequence $\{g_n^-\}$ of positive numbers such that n/L0(pn) → ∞ is equivalent to $(p_c - p_n)/g_n^- \to \infty$ for pn < pc.
iv) It follows from (3.11) and Markov’s inequality that
$$\frac{W^{(2)}_{\Lambda_n}}{|\Lambda_n| P_\infty(p_n)} \to 0 \quad \text{in probability} \tag{3.20}$$
above the scaling window. Combined with (3.12) this implies that, as n → ∞,
$$\frac{W^{(2)}_{\Lambda_n}}{W^{(1)}_{\Lambda_n}} \to 0 \quad \text{in probability}, \tag{3.21}$$
provided g(pn, n) → ∞.
v) In a similar way, it follows from (3.9) that, as n → ∞,
$$\frac{W^{(1)}_{\Lambda_n}}{s(n)} \to 0 \quad \text{in probability}, \tag{3.22}$$
provided g(pn, n) → −∞.
4. Auxiliary Results
In this section, which is split into two subsections, we state several useful auxiliary results which we will need for our proofs in Sect. 5; most of them have already been proved in [BCKS98]. The first subsection gives a fundamental moment estimate and an exponential tail estimate for cluster sizes. These estimates show a close relationship between the diameter and the size or volume of a large cluster. A cluster in Λn of diameter small with respect to n usually has a volume which is small with respect to s(n). We believe, but could not prove, that the converse also holds, namely that a cluster in Λn of diameter of order n has with high probability a volume bigger than a small multiple of s(n). The second subsection contains various important properties of the quantities πn, Ps, P≥s and χ which are akin to the postulates.
Throughout, the basic parameter p is bounded away from 0 and 1, that is we restrict p to ζ0 ≤ p ≤ 1 − ζ0 for some small strictly positive ζ0. No further mention of ζ0 will be made. Many constants Ci appear in this paper. These are always finite and strictly positive, even when this is not indicated. In different formulae the same symbol Ci may denote different constants. All these constants depend on ε, d, ζ0 and the constants which appear in the postulates. This dependence will not be indicated in the notation. I[A] denotes the indicator function of the event A.
All results in this section are proven under Postulates (I)–(IV) or a subset of these. In fact, none of the statements of this section rely directly on Postulates (I) and (II). Instead, they use the following two assumptions, which are much weaker than Postulates (I) and (II). The first is the assumption that the sponge crossing probabilities at pc are bounded away from one, that is,
$$1 - R_{n,3n}(p_c) > \varepsilon, \quad n \ge 1, \tag{4.1}$$
for some ε > 0, and the second is the assumption that (4.1) can be extended to p > pc, provided n ≤ L0(p). Actually, we only need the slightly weaker assumption that there are some constants ε > 0 and σ3 > 0 such that
$$1 - R_{n,3n}(p) > \varepsilon \quad \text{for all } p > p_c \text{ and all } n \le \sigma_3 L_0(p). \tag{4.2}$$
To see that (4.1) follows from Postulates (I) and (II), we note that these postulates imply that A0(p) → ∞ as p ↓ pc, which in turn implies the statement (4.1). The bound (4.2) follows directly from Postulate (II), since, by the definition of A0(p),
$$1 - R_{r,3r}(p) > \varepsilon \quad \text{for } r^{d-1} < A_0(p) \text{ and } p > p_c.$$
By the equivalence of A0(p) and $L_0(p)^{d-1}$ (see Postulate (II)) this means that there exists some σ3 > 0 such that (4.2) holds for p > pc and all n ≤ σ3 L0(p).
We caution the reader that above pc, the definition of the correlation length L0(p) in [BCKS98] is slightly different from the definition here (compare (2.17) in [BCKS98] to our Eq. (2.24)). However, as noted in Remark (vi) in [BCKS98], all results there remain valid for any definition of L0(p) above pc that obeys Postulates (3.15) and (3.16) in [BCKS98]. While Postulate (3.16) of [BCKS98] is identical to our Postulate (III), Postulate (3.15) in [BCKS98] is slightly stronger than our assumption (4.2): the former corresponds to (4.2) with σ3 = 1. Here, we need only one result which uses Postulate (3.15), namely Theorem 3.6 of [BCKS98], which we cite to establish the last statement in our Proposition 4.8 below. However, a careful reading of the proof of Theorem 3.6 in Eqs. (5.32)–(5.35) of [BCKS98] shows that actually only our weaker assumption (4.2) is needed.
4.1. General moment estimates. The first lemma is a direct consequence of Postulate (IV). It is identical to Lemma 4.4 in [BCKS98].
Lemma 4.1. If Postulate (IV) holds, then for β > 1/ρ1 − 1 (and a fortiori for β > d/2 − 1 = (d − 2)/2) there exist constants C1 = C1(β, d) and C2 = C2(d) such that
$$\sum_{m=0}^{L} (m + 1)^{\beta} \pi_m(p_c) \le C_1 L^{\beta+1} \pi_L(p_c) \quad \text{if } L \ge 1, \tag{4.3}$$
and
$$\sum_{m=0}^{L} (m + 1)^{d-1} \pi_m^2(p_c) \le C_2 L^{d} \pi_L^2(p_c) \quad \text{if } L \ge 1. \tag{4.4}$$
The next lemma, which is identical to Lemma 6.1 in [BCKS98], gives a basic moment estimate. For d = 2 such an estimate was already given in [Ngu88].
Lemma 4.2. Assume Postulate (IV) holds. Define
$$V(L) := \text{number of sites in } \Lambda_L \text{ connected to } \partial\Lambda_{2L}. \tag{4.5}$$
Then for some constants Ci, it holds that
$$E_p V^k(L) \le C_1\, k!\, \bigl[C_2 L^d \pi_L(p_c)\bigr]^k, \tag{4.6}$$
provided p ≤ pc, k ≥ 1 and L ≥ 1. Consequently
$$E_p \exp(tV(L)) \le C_1 \bigl[1 - tC_2 L^d \pi_L(p_c)\bigr]^{-1} \tag{4.7}$$
whenever p ≤ pc and $0 \le t < [C_2 L^d \pi_L(p_c)]^{-1}$. When Postulates (III) and (IV) hold, then (4.6) and (4.7) remain valid for p > pc and L ≤ L0(p).
The next proposition, which is one of the main technical results of [BCKS98] (Proposition 6.3 in [BCKS98]), follows from the above moment estimate Lemma 4.2. It is crucial for our proofs in Sects. 5.1 and 5.3.
Proposition 4.3. i) Assume that Postulate (IV) holds. Then there exist constants Ci such that
$$\mathrm{Pr}_p\bigl\{W^{(1)}_{\Lambda_n} \ge x s(L_0(p))\bigr\} \le C_1 \Bigl(\frac{n}{L_0(p)}\Bigr)^d e^{-C_2 x} \tag{4.8}$$
if x ≥ 0, n ≥ L0(p), and p < pc. In particular
$$\mathrm{Pr}_{p_n}\Bigl\{W^{(1)}_{\Lambda_n} \ge y s(L_0(p_n)) \log \frac{n}{L_0(p_n)}\Bigr\} \to 0 \tag{4.9}$$
if y > d/C2 and g(pn, n) → −∞.
ii) Assume that Postulate (IV) holds, and if p > pc, that also Postulate (III) holds. Then there exist constants Ci such that
$$\mathrm{Pr}_p\bigl\{W^{(1)}_{\Lambda_n} \ge x s(n)\bigr\} \le C_1 e^{-C_2 x} \tag{4.10}$$
if x ≥ 0 and n ≤ L0(p).
iii) Assume that Postulates (III) and (IV) hold. Then there exist constants Ci such that
$$\mathrm{Pr}_p\bigl\{W^{(1)}_{\Lambda_n} \ge x s(L_0(p))\bigr\} \le C_1 \Bigl(\frac{n}{L_0(p)}\Bigr)^d \exp\Bigl[-C_2 x + C_3 \Bigl(\frac{n}{L_0(p)}\Bigr)^d\Bigr] \tag{4.11}$$
if x ≥ 0, n ≥ L0(p) and p > pc.
The next lemma summarizes several additional results which follow from Postulate (IV). To state it, we introduce the diameter of a cluster C as
$$\mathrm{diam}(C) = \max_{v,w \in C} |v - w|_{\infty}. \tag{4.12}$$
Lemma 4.4. Assume that Postulate (IV) holds. Then there exist constants Ci such that
$$\mathrm{Pr}_p\{\mathrm{diam}(C(0)) \ge x L_0(p)\} \le C_1 \pi_{L_0(p)}(p)\, e^{-C_2 x} \quad \text{if } x \ge 2 \text{ and } p < p_c, \tag{4.13}$$
and
$$\mathrm{Pr}_p\{\exists \text{ cluster } C \text{ in } \Lambda_n \text{ with } \mathrm{diam}(C) \le yn \text{ and } |C| \ge x s(n)\} \le C_1 y^{-d} e^{-C_2 x / y^{d/2}} \tag{4.14}$$
if x ≥ 0, 0 < y ≤ 1, p ≤ pc and 4/y ≤ n ≤ L0(p).
Proof. The bound (4.14) was proved in [BCKS98], see Remark (xiii) at the end of Section 6 in [BCKS98]. To prove (4.13) we note that for x ≥ 2,
$$\mathrm{Pr}_p\{\mathrm{diam}(C(0)) \ge x L_0(p)\} \le \mathrm{Pr}_p\{0 \leftrightarrow \partial B_{L_0(p)} \text{ and } \partial B_{L_0(p)} \text{ is connected to at least } x/2 \text{ distinct boxes } B_{L_0(p)}(jL_0(p)),\ j \in 2\mathbb{Z}^d \setminus \{0\}\}$$
$$= \pi_{L_0(p)}(p)\, \mathrm{Pr}_p\{\partial B_{L_0(p)} \text{ is connected to at least } x/2 \text{ distinct boxes } B_{L_0(p)}(jL_0(p)),\ j \in 2\mathbb{Z}^d \setminus \{0\}\}$$
(see (2.11) for the definition of Bn(v)). As in the proof of Proposition 6.3 (ii) of [BCKS98] (more precisely, as in the proof of the bound (6.39) in [BCKS98]), the renormalized Peierls argument of Theorem 5.1 in [Kes82] shows that for suitable constants C1, C2 the probability $\mathrm{Pr}_p\{\partial B_{L_0(p)}$ is connected to at least x/2 distinct boxes $B_{L_0(p)}(jL_0(p))$, $j \in 2\mathbb{Z}^d \setminus \{0\}\}$ is bounded above by $C_1 e^{-C_2 x}$. □
4.2. Some important scaling properties. In this subsection we state a number of properties of the functions πn, Ps and χ(p), most of which have already been proved in [BCKS98]. The first lemma provides an upper bound for πm(pc)/πn(pc) which complements the lower bound of Postulate (IV).
Lemma 4.5. i) There are constants C1 < ∞ and C2 > 0 such that
$$\frac{\pi_n(p)}{\pi_{L_0(p)}(p)} \le C_1 e^{-C_2 n / L_0(p)} \quad \text{if } p < p_c \text{ and } n \ge L_0(p). \tag{4.15}$$
ii) Assume that (4.1) holds for some ε > 0. Then
$$\mathrm{Pr}_{p_c}\{\partial B_n(0) \leftrightarrow \partial B_{3n}(0)\} \le 1 - \varepsilon^{2d} \quad \text{if } n \ge 1. \tag{4.16}$$
iii) Assume that (4.1) holds for some ε > 0. Then there exist constants C1, ρ2 < ∞ such that
$$\frac{\pi_m(p_c)}{\pi_n(p_c)} \le C_1 \Bigl(\frac{m}{n}\Bigr)^{-1/\rho_2} \quad \text{if } m \ge n \ge 1. \tag{4.17}$$
Proof. Statements i) and iii) are the content of Theorem 3.8 of [BCKS98]. To prove ii), we show that for any p ∈ [0, 1] and any n ≥ 1, one has
$$\mathrm{Pr}_p\{\partial B_n(0) \not\leftrightarrow \partial B_{3n}(0)\} \ge [1 - R_{2n,6n}(p)]^{2d}. \tag{4.18}$$
Indeed, by the definition of $R_{n,m}$, the probability that there is no occupied crossing in the 1-direction of the block
$$[n, 3n] \times [-3n, 3n]^{d-1} \tag{4.19}$$
is equal to $1 - R_{2n,6n}$. The cube B3n(0) is the union of Bn(0) and the block in (4.19) plus 2d − 1 more blocks congruent to the block in (4.19). Let Fn be the event that none of these 2d blocks congruent to (4.19) has an occupied crossing in the short direction. Obviously, the event Fn implies that ∂Bn(0) is not connected to ∂B3n(0), so that the probability on the left hand side of (4.18) is bounded from below by the probability of Fn. Since $\mathrm{Pr}_p\{F_n\}$ is at least $[1 - R_{2n,6n}(p)]^{2d}$ by the Harris–FKG inequality, the bound (4.18) follows. □
The next proposition summarizes the results of Theorem 3.7 and the first statement of Theorem 3.4 in [BCKS98]. Assuming existence of the critical exponents ρ and δ, the first statement implies the hyperscaling relation dρ = δ + 1. The second statement is the analogue of Postulate (III) for P≥s(p).
Proposition 4.6. Assume that (4.1) holds for some ε > 0 and that Postulate (IV) holds. Then there exist constants C1 > 0 and C2 < ∞ such that
$$C_1 \pi_n(p_c) \le P_{\ge s(n)}(p_c) \le C_2 \pi_n(p_c). \tag{4.20}$$
If Postulate (III) holds as well, then there exist constants C3 > 0, C4 < ∞ and 0 < σ0 = σ0(ε, d) ≤ 1 such that
$$C_3 P_{\ge s(n)}(p_c) \le P_{\ge s(n)}(p) \le C_4 P_{\ge s(n)}(p_c) \quad \text{if } n \le \sigma_0 L_0(p). \tag{4.21}$$
Our last two results in this section summarize several theorems in [BCKS98], namely Theorem 3.5, Theorem 3.6 and Theorem 3.9. Proposition 4.8 in particular has two upper bounds complementing lower bounds in the postulates, and a hyperscaling relation. Assuming the existence of the corresponding exponents, this relation implies γ = (d − 2/ρ)ν.
Lemma 4.7. Assume Postulate (IV) holds. Then there exist constants 0 < Ci < ∞ such that
$$\frac{P_{\ge x s(L_0(p))}(p)}{\pi_{L_0(p)}(p_c)} \le C_1 e^{-C_2 x} \quad \text{if } p < p_c \text{ and } x \ge 1. \tag{4.22}$$
Proposition 4.8. Assume that (4.1) is valid for some ε > 0, and that Postulates (III) and (IV) hold. Then there exist constants 0 < Ci < ∞ such that, with σ0 as in Proposition 4.6, it holds that
$$\frac{P_{\ge x s(L_0(p))}(p)}{P_{\ge s(\sigma_0 L_0(p))}(p)} \le C_1 \exp[-C_2 x] \quad \text{if } x \ge 1 \text{ and } p < p_c, \tag{4.23}$$
and
$$C_3 L_0(p)^d [\pi_{L_0(p)}(p_c)]^2 \le \chi(p) \le C_4 L_0(p)^d [\pi_{L_0(p)}(p_c)]^2, \quad p < p_c. \tag{4.24}$$
If (4.1) and (4.2) are valid for some ε > 0 and some σ3 > 0, and if Postulate (IV) holds, then there exists a constant C5 > 0 such that
$$C_5 L_0(p)^d [\pi_{L_0(p)}(p_c)]^2 \le \chi^{\mathrm{fin}}(p), \quad p > p_c. \tag{4.25}$$
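The exponent relation γ = (d − 2/ρ)ν quoted before Lemma 4.7 can be read off from (4.24) by a short power-counting computation. The following is a heuristic sketch only: it assumes pure power laws for L0, πn(pc) and χ (with "≈" understood loosely), which is an assumption and not something proved in the text.

```latex
% Assumed power laws (heuristic):
%   L_0(p) \approx |p - p_c|^{-\nu}, \quad
%   \pi_n(p_c) \approx n^{-1/\rho}, \quad
%   \chi(p) \approx |p - p_c|^{-\gamma}.
\begin{align*}
\chi(p) &\asymp L_0(p)^d \, [\pi_{L_0(p)}(p_c)]^2
          && \text{by (4.24)} \\
        &\approx L_0(p)^{d} \, L_0(p)^{-2/\rho}
          = L_0(p)^{\,d - 2/\rho} \\
        &\approx |p - p_c|^{-\nu (d - 2/\rho)},
\end{align*}
% and comparing with \chi(p) \approx |p - p_c|^{-\gamma} gives
\[
  \gamma = \Bigl(d - \frac{2}{\rho}\Bigr)\nu .
\]
```

As a consistency check with the commonly quoted two-dimensional values ρ = 48/5 and ν = 4/3 (used here purely as an illustration), one gets (2 − 5/24)(4/3) = 43/18, which matches the commonly quoted value of γ in d = 2.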
5. Proof of the Theorems, Given the Postulates
In this section, we prove our principal results, Theorems 3.1–3.5. The section is divided into three subsections. These correspond to the proof of results within, above and below the scaling window: Theorem 3.1 i), Theorem 3.3 and Theorem 3.4 in Sect. 5.1, Theorem 3.1 iii) and Theorem 3.2 in Sect. 5.2, and finally, Theorem 3.1 ii) and Theorem 3.5 in Sect. 5.3.
5.1. Inside the scaling window. We start this subsection with several lemmas and propositions concerning the numbers $N_\Lambda(s_1, s_2)$ and $\tilde N_\Lambda(s_1, s_2)$ of clusters with size between s1 and s2, defined in (2.1) and (2.2). Although some of these results are very similar to the theorems we are finally going to prove, we give them as separate propositions, since this allows us to better keep track of which postulates are needed in which step.
At many points in this and the following subsections, we use the fact that, for an arbitrary configuration ω, and number α, it holds that
$$\sum_{i \ge 1} \bigl(W^{(i)}_\Lambda\bigr)^{\alpha} = \sum_{v \in \Lambda} \sum_{s \ge 1} s^{\alpha - 1} I[|C_\Lambda(v)| = s]. \tag{5.1}$$
This is obvious from the fact that in the right-hand side, the sum of $I[|C_\Lambda(w)| = s]$ over all points w in $C_\Lambda(v)$ equals $s I[|C_\Lambda(v)| = s]$. Taking expectations of (5.1) gives
$$E_p\Bigl\{\sum_{i \ge 1} \bigl(W^{(i)}_\Lambda\bigr)^{\alpha}\Bigr\} = \sum_{v \in \Lambda} \sum_{s \ge 1} s^{\alpha - 1} \mathrm{Pr}_p\{|C_\Lambda(v)| = s\}. \tag{5.2}$$
This argument for α = 1 will be used in the proof of Proposition 5.5, but even more often will we use the special case α = 0, which says that the number of clusters of size s can be rewritten as
$$\bigl|\{i \mid W^{(i)}_\Lambda = s\}\bigr| = \frac{1}{s} \sum_{v \in \Lambda} I[|C_\Lambda(v)| = s]. \tag{5.3}$$
These formulae and some variants form a basic relationship which allows us to relate estimates on the distributions of $W^{(i)}_\Lambda$ and |C(0)|. We use the following consequence of (5.3):
$$E_p N_\Lambda(s_1, s_2) = \sum_{s = s_1}^{s_2} \frac{1}{s} \sum_{v \in \Lambda} \mathrm{Pr}_p\{|C_\Lambda(v)| = s\}. \tag{5.4}$$
In a similar way, we have
$$E_p \tilde N_\Lambda(s_1, s_2) = \sum_{s = s_1}^{s_2} \frac{1}{s} \sum_{v \in \Lambda} \mathrm{Pr}_p\{|C_\Lambda(v)| = s,\ v \not\leftrightarrow \partial\Lambda\}. \tag{5.5}$$
We also need the corresponding representation for the expectation of $\tilde N_\Lambda^2(s_1, s_2)$:
$$E_p \tilde N_\Lambda^2(s_1, s_2) = \sum_{v, w \in \Lambda} \sum_{s_1 \le s \le s_2} \sum_{s_1 \le \tilde s \le s_2} \frac{1}{s \tilde s}\, \mathrm{Pr}_p\{|C_\Lambda(v)| = s,\ v \not\leftrightarrow \partial\Lambda,\ |C_\Lambda(w)| = \tilde s,\ w \not\leftrightarrow \partial\Lambda\}. \tag{5.6}$$
The next two lemmas will be useful in proving lower bounds for $W^{(i)}_\Lambda$.
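The counting identity (5.1) and its special cases α = 0 and α = 1 hold configuration by configuration, so they can be checked mechanically on any partition of a finite vertex set into clusters. The sketch below is an illustration with hypothetical helper names (`lhs_51`, `rhs_51`); it takes a list assigning a cluster label to each site and compares both sides of (5.1) in exact rational arithmetic.

```python
import random
from collections import Counter
from fractions import Fraction


def lhs_51(labels, alpha):
    """Left side of (5.1): sum over clusters i of (W^(i))^alpha."""
    sizes = Counter(labels).values()
    return sum(Fraction(s) ** alpha for s in sizes)


def rhs_51(labels, alpha):
    """Right side of (5.1): sum over sites v of |C(v)|^(alpha - 1).

    For each site only the term s = |C(v)| of the inner sum is nonzero.
    """
    size_of = Counter(labels)
    return sum(Fraction(size_of[c]) ** (alpha - 1) for c in labels)


if __name__ == "__main__":
    rng = random.Random(0)
    # Arbitrary assignment of 200 sites to at most 30 clusters.
    labels = [rng.randrange(30) for _ in range(200)]
    for alpha in (0, 1, 2, 3):
        print(alpha, lhs_51(labels, alpha), rhs_51(labels, alpha))
```

For α = 0 both sides equal the number of clusters, which is the content of (5.3) summed over s, and for α = 1 both sides equal the number of sites.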
Proposition 5.1. Assume that (4.1) holds for some ε > 0 and that Postulates (III) and (IV) hold. Then there exist constants 0 < Ci < ∞ and 1 ≤ σ1 < ∞ such that
$$C_1 \Bigl(\frac{n}{m}\Bigr)^d \le E_p \tilde N_{\Lambda_n}(s(m), s(km)) \le E_p N_{\Lambda_n}(s(m), s(km)) \le C_2 \Bigl(\frac{n}{m}\Bigr)^d, \tag{5.7}$$
provided σ1 m ≤ min{L0(p), n} and k ≥ σ1.
Proof. For brevity we write Λ instead of Λn. We start with the upper bound. Using the representation (5.4) and bounding the factor 1/s in (5.4) by 1/s(m), we get
$$E_p N_\Lambda(s(m), s(km)) \le \frac{1}{s(m)} \sum_{v \in \Lambda} \sum_{s \ge s(m)} \mathrm{Pr}_p\{|C_\Lambda(v)| = s\} = \frac{1}{s(m)} \sum_{v \in \Lambda} \mathrm{Pr}_p\{|C_\Lambda(v)| \ge s(m)\} \le \frac{(2n)^d}{s(m)} P_{\ge s(m)}(p), \tag{5.8}$$
where in the last step we used the definition (2.4) of P≥s(m)(p) and the fact that $|C_\Lambda(v)| \le |C(v)|$. Without loss of generality we shall take σ1 ≥ 1/σ0 ≥ 1, where σ0 is the constant of Proposition 4.6. Then σ1 m ≤ L0(p) implies m ≤ σ0 L0(p), and we may use Proposition 4.6 to bound the right-hand side of (5.8). We get for some finite constant C2,
$$\frac{(2n)^d}{s(m)} P_{\ge s(m)}(p) \le C_2 \frac{(2n)^d}{s(m)} \pi_m(p_c) = C_2 \Bigl(\frac{n}{m}\Bigr)^d. \tag{5.9}$$
The estimates (5.8) and (5.9) imply the upper bound. To prove the lower bound, we use that Postulate (IV) implies that
$$\frac{s(\ell)}{s(\ell')} \ge D_3 \Bigl(\frac{\ell}{\ell'}\Bigr)^{d/2} \quad \text{if } \ell \ge \ell' \ge 1, \tag{5.10}$$
so that in particular $s(\ell) \ge s(\ell')$ whenever $\ell/\ell' \ge D_3^{-2/d}$. We conclude that for any choice of $\tilde k \ge 1$ we can find a $\sigma_1 \ge \tilde k(1 + 1/\sigma_0)$ such that $s(km) \ge s(\tilde k m)$ for all k ≥ σ1. It then follows from (5.5) that for k ≥ σ1,
$$E_p \tilde N_\Lambda(s(m), s(km)) \ge E_p \tilde N_\Lambda(s(m), s(\tilde k m) - 1) \ge \sum_{s = s(m)}^{s(\tilde k m) - 1} \sum_{v \in \Lambda_{n/2}} \frac{1}{s} \mathrm{Pr}_p\{|C_\Lambda(v)| = s,\ v \not\leftrightarrow \partial\Lambda\} = \sum_{s = s(m)}^{s(\tilde k m) - 1} \sum_{v \in \Lambda_{n/2}} \frac{1}{s} \mathrm{Pr}_p\{|C(v)| = s,\ v \not\leftrightarrow \partial\Lambda\}, \tag{5.11}$$
where in the second step we bounded the sum over Λ = Λn from below by a sum over $\Lambda_{n/2}$. Bounding the factor 1/s in (5.11) from below by $1/s(\tilde k m)$, we get
$$\begin{aligned} E_p \tilde N_\Lambda(s(m), s(km)) &\ge \frac{1}{s(\tilde k m)} \sum_{v \in \Lambda_{n/2}} \mathrm{Pr}_p\{s(m) \le |C(v)| < s(\tilde k m),\ v \not\leftrightarrow \partial\Lambda\} \\ &= \frac{1}{s(\tilde k m)} \sum_{v \in \Lambda_{n/2}} \Bigl[\mathrm{Pr}_p\{s(m) \le |C(v)| < s(\tilde k m)\} - \mathrm{Pr}_p\{s(m) \le |C(v)| < s(\tilde k m),\ v \leftrightarrow \partial\Lambda\}\Bigr] \\ &\ge \frac{1}{s(\tilde k m)} \sum_{v \in \Lambda_{n/2}} \Bigl[\mathrm{Pr}_p\{s(m) \le |C(v)| < s(\tilde k m)\} - \pi_{n/2}(p)\Bigr] \\ &\ge \frac{(n - 2)^d}{s(\tilde k m)} \Bigl[P_{\ge s(m)}(p) - P_{\ge s(\tilde k m)}(p) - \pi_{n/2}(p)\Bigr]. \end{aligned} \tag{5.12}$$
Since $n \ge \sigma_1 m \ge \tilde k m$ by the assumption σ1 m ≤ min{L0(p), n}, we obtain
$$E_p \tilde N_\Lambda(s(m), s(km)) \ge \frac{(n - 2)^d}{s(\tilde k m)} \Bigl[P_{\ge s(m)}(p) - P_{\ge s(\tilde k m)}(p) - \pi_{\tilde k m/2}(p)\Bigr]. \tag{5.13}$$
Again by the assumption σ1 m ≤ min{L0(p), n}, we have $m \le \tilde k m \le (\tilde k/\sigma_1) L_0(p) \le \sigma_0 L_0(p)$. We therefore may use Proposition 4.6 in conjunction with Postulate (III) and the bound $\pi_{\tilde k m}(p_c) \le \pi_{\tilde k m/2}(p_c)$ to conclude that
$$E_p \tilde N_\Lambda(s(m), s(km)) \ge \frac{(n - 2)^d}{s(\tilde k m)} \Bigl[C_3 \pi_m(p_c) - C_4 \pi_{\tilde k m/2}(p_c)\Bigr], \tag{5.14}$$
for suitable constants C3, C4 ∈ (0, ∞) which depend only on the constants in Proposition 4.6, but not on the choice of $\tilde k$. Finally we appeal to Lemma 4.5 iii) to fix $\tilde k$ so large that $C_4 \pi_{\tilde k m/2}(p_c) \le \frac{1}{2} C_3 \pi_m(p_c)$. Here $\tilde k$ depends only on C4/C3 and the constants in Lemma 4.5 iii); also $\tilde k$ determines the value to take for σ1. We then get
$$E_p \tilde N_\Lambda(s(m), s(km)) \ge \frac{(n - 2)^d}{s(\tilde k m)} \frac{1}{2} C_3 \pi_m(p_c). \tag{5.15}$$
From $s(\tilde k m) \le \tilde k^d s(m)$ we then conclude that for n ≥ 4,
$$E_p \tilde N_\Lambda(s(m), s(km)) \ge C_1 \frac{(2n)^d}{s(m)} \pi_m(p_c) = C_1 \Bigl(\frac{n}{m}\Bigr)^d, \tag{5.16}$$
where $C_1 = 2^{-2d-1} \tilde k^{-d} C_3$. This proves the lower bound when n ≥ 4. If we choose σ1 large enough, then 1 ≤ n < 4 is ruled out by σ1 ≤ σ1 m ≤ n. □
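The value of the constant C1 in (5.16) can be traced step by step. The short computation below is a sketch of that bookkeeping; it uses the identity $s(m) = (2m)^d \pi_m(p_c)$, which is the form of s(m) implied by the last equality in (5.9), and the elementary bound n − 2 ≥ n/2 for n ≥ 4.

```latex
\begin{align*}
E_p \tilde N_\Lambda(s(m), s(km))
  &\ge \frac{(n-2)^d}{s(\tilde k m)} \cdot \frac{1}{2} C_3 \pi_m(p_c)
     && \text{by (5.15)} \\
  &\ge \frac{(n-2)^d}{\tilde k^{d}\, s(m)} \cdot \frac{1}{2} C_3 \pi_m(p_c)
     && \text{since } s(\tilde k m) \le \tilde k^{d} s(m) \\
  &\ge \frac{(n/2)^d}{\tilde k^{d}\, s(m)} \cdot \frac{1}{2} C_3 \pi_m(p_c)
     && \text{since } n - 2 \ge n/2 \text{ for } n \ge 4 \\
  &= 2^{-2d-1}\, \tilde k^{-d}\, C_3\, \frac{(2n)^d}{s(m)}\, \pi_m(p_c)
     && \text{since } (n/2)^d = 2^{-2d} (2n)^d \\
  &= 2^{-2d-1}\, \tilde k^{-d}\, C_3 \Bigl(\frac{n}{m}\Bigr)^{d}
     && \text{since } s(m) = (2m)^d \pi_m(p_c).
\end{align*}
```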
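Proposition 5.2. Assume that (4.1) holds for some $\varepsilon > 0$ and that Postulates (III) and (IV) hold. Then there is a constant $C_3 < \infty$ such that
$$\frac{\mathrm{Var}_p\{N_{\Lambda_n}(s(m), s(km))\}}{E_p\{N_{\Lambda_n}(s(m), s(km))\}^2} \le C_3, \tag{5.17}$$
provided $\sigma_1 m \le \min\{L_0(p), n\}$ and $k \ge \sigma_1$. Here $\sigma_1$ is the constant of Proposition 5.1.

Proof. Again we write $\Lambda$ for $\Lambda_n$. We first prove that for arbitrary $s_1, s_2 \in \mathbb{N}$, $s_1 \le s_2$, and $p \in (0, 1)$,
$$E_p\{N_\Lambda(s_1, s_2)^2\} \le E_p\{N_\Lambda(s_1, s_2)\}\Big(1 + \frac{(2n)^d}{s_1} P_{\ge s_1}(p)\Big). \tag{5.18}$$
We need some notation. We denote the set of bonds with both endpoints in $\Lambda$ by $B(\Lambda)$, and the set of bonds with both endpoints in $\Lambda \setminus \partial\Lambda$ by $B^{\circ}(\Lambda)$. Let $B$ be a subset of $B(\Lambda)$. With a slight abuse of notation, we say that $v$ is a point in $B$ if $v$ is an endpoint of one of the bonds in $B$. We write "$B$ is occupied (vacant)" for the event that all bonds in $B \subset B(\Lambda)$ are occupied (respectively, vacant). Given $v \in \Lambda$, we denote the set of all connected subsets $B \subset B(\Lambda)$ that contain the point $v$ by $B_v(\Lambda)$. Again with a slight abuse of notation, we denote the number of points in a cluster $B \in B_v(\Lambda)$ by $|B|$. Finally, we write $\partial_\Lambda B$ for the set of all bonds $b \in B(\Lambda) \setminus B$ which share an endpoint with a bond $\tilde b \in B$. Using Eq. (5.6), we rewrite the left-hand side of (5.18) as
$$E_p\{N_\Lambda(s_1, s_2)^2\} = \sum_{v,w\in\Lambda}\ \sum_{\substack{B\in B_v(\Lambda)\\ s_1\le|B|\le s_2}}\ \sum_{\substack{\tilde B\in B_w(\Lambda)\\ s_1\le|\tilde B|\le s_2}} \frac{\Pr_p\{B\cup\tilde B \text{ is occupied},\ \partial_\Lambda B\cup\partial_\Lambda\tilde B \text{ is vacant}\}}{|B|\,|\tilde B|}. \tag{5.19}$$
Next we observe that the event on the right-hand side cannot occur if $B \ne \tilde B$ and $B \leftrightarrow \tilde B$ in $\Lambda$, because in this case some occupied bond in $B \cup \tilde B \cup$ (a suitable path from $B$ to $\tilde B$) also lies in $\partial_\Lambda B \cup \partial_\Lambda \tilde B$. As a consequence, the right-hand side decomposes into two terms: the term
$$\sum_{v,w\in\Lambda}\ \sum_{\substack{B\in B_v(\Lambda)\cap B_w(\Lambda)\\ s_1\le|B|\le s_2}} \frac{\Pr_p\{B \text{ is occupied},\ \partial_\Lambda B \text{ is vacant}\}}{|B|^2} = \sum_{v\in\Lambda}\ \sum_{\substack{B\in B_v(\Lambda)\\ s_1\le|B|\le s_2}} \frac{\Pr_p\{B \text{ is occupied},\ \partial_\Lambda B \text{ is vacant}\}}{|B|} = E_p\{N_\Lambda(s_1, s_2)\} \tag{5.20}$$
and the term
$$\sum_{v,w\in\Lambda}\ \sum_{\substack{B\in B_v(\Lambda)\\ s_1\le|B|\le s_2}}\ \sum_{\substack{\tilde B\in B_w(\Lambda)\\ s_1\le|\tilde B|\le s_2,\ B\not\leftrightarrow\tilde B}} \frac{\Pr_p\{B\cup\tilde B \text{ is occupied},\ \partial_\Lambda B\cup\partial_\Lambda\tilde B \text{ is vacant}\}}{|B|\,|\tilde B|}. \tag{5.21}$$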
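By using the second decoupling inequality of [BC96], or, alternatively, the van den Berg-Kesten inequality [BK85], we see that the last sum equals
$$\begin{aligned}
&\sum_{v,w\in\Lambda}\ \sum_{\substack{B\in B_v(\Lambda)\setminus B_w(\Lambda)\\ s_1\le|B|\le s_2}}\ \sum_{\substack{\tilde B\in B_w(\Lambda)\\ s_1\le|\tilde B|\le s_2}} \frac{\Pr_p\{B\cup\tilde B \text{ is occupied},\ \partial_\Lambda B\cup\partial_\Lambda\tilde B \text{ is vacant}\}}{|B|\,|\tilde B|}\\
&\qquad\le \frac{1}{s_1}\sum_{v,w\in\Lambda}\ \sum_{\substack{B\in B_v(\Lambda)\setminus B_w(\Lambda)\\ s_1\le|B|\le s_2}} \frac{\Pr_p\{B \text{ is occupied},\ \partial_\Lambda B \text{ is vacant},\ |C_\Lambda(w)|\ge s_1\}}{|B|}\\
&\qquad\le \frac{1}{s_1}\sum_{v,w\in\Lambda}\ \sum_{\substack{B\in B_v(\Lambda)\setminus B_w(\Lambda)\\ s_1\le|B|\le s_2}} \frac{\Pr_p\{B \text{ is occupied},\ \partial_\Lambda B \text{ is vacant}\}\,\Pr_p\{|C_\Lambda(w)|\ge s_1\}}{|B|}\\
&\qquad\le E_p\{N_\Lambda(s_1, s_2)\}\sum_{w\in\Lambda}\frac{\Pr_p\{|C_\Lambda(w)|\ge s_1\}}{s_1}.
\end{aligned}\tag{5.22}$$
Combining the two terms (5.20) and (5.22), and observing that $\Pr_p\{|C_\Lambda(w)| \ge s_1\} \le \Pr_p\{|C(w)| \ge s_1\}$, we obtain (5.18). The bound (5.17) now follows from (5.18), (5.9) and the lower bound in (5.7). □

The next proposition is a consequence of Proposition 5.1, Proposition 5.2 and the fact that
$$N_\Lambda(s(m), s(km)) \ge N_{\Lambda_1}(s(m), s(km)) + N_{\Lambda_2}(s(m), s(km)),$$
provided $\Lambda_1 \subset \Lambda$ and $\Lambda_2 = \Lambda \setminus \Lambda_1$.

Proposition 5.3. Assume that (4.1) holds for some $\varepsilon > 0$ and that Postulates (III) and (IV) hold. Then there are constants $C_4, C_5 > 0$ such that
$$\Pr_p\Big\{N_{\Lambda_n}(s(m), s(km)) \ge C_4 \Big(\frac{n}{m}\Big)^d\Big\} \ge 1 - C_5 \Big(\frac{m}{n}\Big)^d, \tag{5.23}$$
provided $\sigma_1 m \le \min\{L_0(p), n\}$ and $k \ge \sigma_1$. Here $\sigma_1$ is the constant of Proposition 5.1.

Proof. Let $\tilde k = \lfloor n/\lceil \sigma_1 m\rceil \rfloor$ be the largest integer less than or equal to $n/\lceil\sigma_1 m\rceil$, and $\tilde n = \tilde k\lceil\sigma_1 m\rceil$. Note that then $\sigma_1 m \le \tilde n \le n$. Since $N_\Lambda(s(m), s(km))$ is increasing in $\Lambda$, i.e.,
$$N_{\Lambda}(s(m), s(km)) \ge N_{\tilde\Lambda}(s(m), s(km)) \quad\text{if}\quad \tilde\Lambda \subset \Lambda, \tag{5.24}$$
we get that for $\Lambda = \Lambda_n$, $\tilde\Lambda = \Lambda_{\tilde n}$,
$$\Pr_p\Big\{N_{\Lambda}(s(m), s(km)) \ge C_4 \Big(\frac{n}{m}\Big)^d\Big\} \ge \Pr_p\Big\{N_{\tilde\Lambda}(s(m), s(km)) \ge C_4 \Big(\frac{n}{m}\Big)^d\Big\}. \tag{5.25}$$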
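Next we note that $\tilde\Lambda$ contains $\tilde k^d$ disjoint subvolumes $\Lambda^{(i)}$ of size $(2\lceil\sigma_1 m\rceil)^d$, and introduce the random variable
$$X = \sum_{i=1}^{\tilde k^d} N_{\Lambda^{(i)}}(s(m), s(km)). \tag{5.26}$$
Since $X \le N_{\tilde\Lambda}(s(m), s(km)) \le N_{\Lambda}(s(m), s(km))$, we have
$$\Pr_p\Big\{N_{\Lambda}(s(m), s(km)) \ge C_4 \Big(\frac{n}{m}\Big)^d\Big\} \ge \Pr_p\Big\{X \ge C_4 \Big(\frac{n}{m}\Big)^d\Big\}. \tag{5.27}$$
Observing that the random variables $N_{\Lambda^{(i)}}(s(m), s(km))$ in (5.26) are i.i.d. and using Proposition 5.2, we have
$$\frac{\mathrm{Var}\,X}{(E_p X)^2} = \frac{1}{\tilde k^d}\,\frac{\mathrm{Var}\{N_{\Lambda^{(1)}}(s(m), s(km))\}}{E_p\{N_{\Lambda^{(1)}}(s(m), s(km))\}^2} \le \frac{C_6}{\tilde k^d}. \tag{5.28}$$
Noting that
$$\Pr_p\big\{X \le \tfrac12 E_p X\big\} \le \Pr_p\big\{|X - E_p X|^2 \ge \tfrac14 (E_p X)^2\big\} \le \frac{4\,\mathrm{Var}\,X}{(E_p X)^2}, \tag{5.29}$$
we find that
$$\Pr_p\big\{X \ge \tfrac12 E_p X\big\} \ge 1 - \frac{4C_6}{\tilde k^d} \ge 1 - C_5\Big(\frac{m}{n}\Big)^d, \tag{5.30}$$
where $C_5 = (4\sigma_1)^d\, 4C_6$ (note that $1/\tilde k = \lfloor n/\lceil\sigma_1 m\rceil\rfloor^{-1} \le 4\sigma_1 m/n$). Using finally the lower bound
$$E_p X = \tilde k^d E_p\{N_{\Lambda^{(1)}}(s(m), s(km))\} \ge C_1 (\tilde k\sigma_1)^d = C_1 \Big(\frac{\tilde n}{m}\Big)^d,$$
which comes from (5.7), we obtain the desired bound (5.23), provided $C_4 > 0$ is chosen small enough. □

Proposition 5.4. Suppose that Postulates (III) and (IV) hold, and that (4.1) and (4.2) are valid for some $\varepsilon > 0$ and some $\sigma_3 > 0$. Then there are strictly positive constants $C_1$ and $\sigma_4$ such that
$$\Pr_p\big\{W^{(1)}_{\Lambda_n} \le s(m)\big\} \ge C_1^{\,1+(n/m)^d}, \tag{5.31}$$
provided $m \le \sigma_4 L_0(p)$.

Proof. It follows from (4.10) and (5.10) that there exists a constant $\sigma_4 > 0$ such that
$$\Pr_p\big\{W^{(1)}_{\Lambda_{3r}} \le s(m)\big\} \ge \frac12 \quad\text{if}\quad r \le \sigma_4 m \ \text{ and }\ r \le \frac13 L_0(p). \tag{5.32}$$
In addition, it follows from (4.18) that
$$\Pr_p\{v \not\leftrightarrow \partial\Lambda_{3r} \text{ for all } v \in \Lambda_r\} \ge [1 - R_{r,3r}(p)]^{2d}. \tag{5.33}$$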
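For $p \le p_c$, $1 - R_{r,3r}(p) \ge 1 - R_{r,3r}(p_c) > \varepsilon$ (see (4.1)). We still have $1 - R_{r,3r}(p) > \varepsilon$ for $p > p_c$ and $r \le \sigma_3 L_0(p)$, by virtue of (4.2). Consequently, as in (4.16),
$$\Pr_p\{v \not\leftrightarrow \partial B_{3r} \text{ for all } v \in \Lambda_r\} \ge \varepsilon^{2d} \quad\text{if}\quad r \le \sigma_3 L_0(p). \tag{5.34}$$
Using the Harris-FKG inequality we obtain from (5.32) and (5.34) that
$$\begin{aligned}
\Pr_p\{|C(v)| \le s(m) \text{ for all } v \in \Lambda_r\} &\ge \Pr_p\big\{W^{(1)}_{\Lambda_{3r}} \le s(m) \text{ and } v \not\leftrightarrow \partial B_{3r} \text{ for all } v \in \Lambda_r\big\}\\
&\ge \frac12\,\varepsilon^{2d} \quad\text{if}\quad r \le (\sigma_3 \wedge 1/3)L_0(p) \wedge \sigma_4 m.
\end{aligned}\tag{5.35}$$
We are now ready to prove (5.31) for arbitrary $n$. We first estimate
$$\Pr_p\big\{W^{(1)}_{\Lambda} \le s(m)\big\} \ge \Pr_p\{|C(v)| \le s(m) \text{ for all } v \in \Lambda\} \tag{5.36}$$
and note that the right-hand side of (5.36) is decreasing in $\Lambda$. Let $m \le \sigma_4 L_0(p)$ and choose $0 < \sigma_5 \le \sigma_4$ such that $\sigma_4\sigma_5 \le (\sigma_3 \wedge 1/3)$. Then choose an integer $r \ge 1$ in $[\sigma_5 m/2, \sigma_5 m]$; if this is not possible, because $\sigma_5 m < 1$, then take $r = 1$. For this choice of $r$, $\Pr_p\{|C(v)| \le s(m) \text{ for all } v \in \Lambda_r\} \ge C_1 > 0$ for some constant $C_1$, by virtue of (5.35). If $n < r$, then this already implies (5.31). Otherwise, choose an integer $k$ such that $n \le \tilde n := kr \le 2n$. We then get
$$\Pr_p\big\{W^{(1)}_{\Lambda_n} \le s(m)\big\} \ge \Pr_p\Big\{\bigcap_{v\in\Lambda_{\tilde n}} \{|C(v)| \le s(m)\}\Big\}. \tag{5.37}$$
Decomposing $\Lambda_{\tilde n}$ into $k^d$ subvolumes $\Lambda^{(i)}$ of diameter $2r$, and using the Harris-FKG inequality for the intersection of the events $\bigcap_{v\in\Lambda^{(i)}}\{|C(v)| \le s(m)\}$, we obtain
$$\Pr_p\big\{W^{(1)}_{\Lambda_n} \le s(m)\big\} \ge \Pr_p\Big\{\bigcap_{v\in\Lambda_r}\{|C(v)| \le s(m)\}\Big\}^{k^d} \ge C_1^{\,k^d}. \tag{5.38}$$
The proof is concluded by observing that $k \le 2n/r \le 4n/(\sigma_5 m)$. □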
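The final observation can be spelled out as follows. This is only a sketch: it assumes $0 < C_1 \le 1$ for the constant in (5.38) (true for a probability bound), and the name $C_1'$ for the constant that ends up in (5.31) is introduced here purely for illustration.

$$k \le \frac{4n}{\sigma_5 m} \;\Longrightarrow\; k^d \le \Big(\frac{4}{\sigma_5}\Big)^d\Big(\frac{n}{m}\Big)^d \;\Longrightarrow\; C_1^{\,k^d} \ge \Big(C_1^{(4/\sigma_5)^d}\Big)^{(n/m)^d} \ge (C_1')^{1+(n/m)^d}, \qquad C_1' := \min\big\{C_1,\ C_1^{(4/\sigma_5)^d}\big\}.$$

With the same $C_1'$ the case $n < r$ is covered as well, since there the probability is at least $C_1 \ge C_1' \ge (C_1')^{1+(n/m)^d}$.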
Proof of Theorem 3.1 i). For this proof we only use (4.1) and Postulates (III) and (IV). As before, abbreviate $\Lambda_n$ to $\Lambda$. Since $\limsup_{n\to\infty} |g(p_n, n)| < \infty$, we have
$$n \le \lambda L_0(p_n) \quad\text{for all } n \ge n_1, \tag{5.39}$$
where $\lambda$ and $n_1$ are finite constants depending on the sequence $\{p_n\}$.

The fact that $E_p\{W^{(1)}_\Lambda\}/s(n)$ is bounded above is immediate from Proposition 4.3. If $n \le L_0(p_n)$ then (4.10) suffices. If $L_0(p_n) \le n \le \lambda L_0(p_n)$, then we use (4.8) or (4.11) plus the fact that $s(n) \ge D_3 s(L_0(p_n))$ (by (5.10)). Note that this proof only requires Postulates (III) and (IV), and does not rely on the assumption (4.1).
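In order to complete the proof, we need lower bounds on $E_p\{W^{(i)}_\Lambda\}$. To this end, we first note that Proposition 5.3 implies that for any $\delta > 0$ there are constants $1 \le \sigma^{(i)} = \sigma^{(i)}(\lambda, \delta) < \infty$ such that
$$\Pr_p\big\{W^{(i)}_{\Lambda_n} \ge s(m)\big\} \ge 1 - \delta, \tag{5.40}$$
provided $\sigma^{(i)} m \le n \le \lambda L_0(p)$. Indeed, choose $\sigma^{(i)}(\lambda, \delta) \ge \sigma_1$ (with the constant $\sigma_1$ as in Proposition 5.1) so large that i) $\sigma^{(i)} m \le \lambda L_0(p)$ implies $\sigma_1 m \le L_0(p)$, ii) $C_4 (\sigma^{(i)})^d \ge i$, and iii) $C_5 (\sigma^{(i)})^{-d} \le \delta$, where $C_4$, $C_5$ are as in Proposition 5.3. Then for $\sigma^{(i)} m \le n \le \lambda L_0(p)$, we get
$$\begin{aligned}
\Pr_p\big\{W^{(i)}_\Lambda \ge s(m)\big\} &= \Pr_p\{N_\Lambda(s(m), \infty) \ge i\} \ge \Pr_p\{N_\Lambda(s(m), s(\sigma_1 m)) \ge i\}\\
&\ge \Pr_p\Big\{N_\Lambda(s(m), s(\sigma_1 m)) \ge C_4\Big(\frac{n}{m}\Big)^d\Big\},
\end{aligned}\tag{5.41}$$
where we used that $\sigma^{(i)} m \le n$ implies $C_4 (n/m)^d \ge i$ in the last step. Combined with Proposition 5.3 and the fact that the assumption $\sigma^{(i)} m \le n$ implies $C_5 (m/n)^d \le \delta$ by our choice of $\sigma^{(i)}$, the bound (5.41) implies (5.40).

In order to prove a lower bound on $\liminf_{n\to\infty} E_{p_n}\{W^{(i)}_{\Lambda_n}\}$, we now assume that $n \ge n_1^{(i)} := \max\{n_1, \sigma^{(i)}\}$, where $n_1$ and $\lambda$ are the constants from (5.39), and $\sigma^{(i)} = \sigma^{(i)}(\lambda, \tfrac12)$. Choosing $m = \lfloor n/\sigma^{(i)} \rfloor$, we have $m \ge 1$ and $\sigma^{(i)} m \le n \le \lambda L_0(p_n)$. Thus, by (5.40),
$$E_{p_n}\big\{W^{(i)}_{\Lambda_n}\big\} \ge \tfrac12\, s(m). \tag{5.42}$$
Since $m \le n/\sigma^{(i)} \le m + 1 \le 2m$ by the definition of $m$, we have
$$s(n)/s(m) \le (n/m)^d \le (2\sigma^{(i)})^d, \tag{5.43}$$
and hence $s(m) \ge s(n)(2\sigma^{(i)})^{-d}$. Thus, with $C_1^{(i)}(\lambda) = \tfrac12 (2\sigma^{(i)})^{-d}$, we have
$$E_{p_n}\big\{W^{(i)}_{\Lambda_n}\big\} \ge C_1^{(i)}(\lambda)\, s(n). \tag{5.44}$$
This completes the proof of the lower bound. □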
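To make the objects in this proof concrete, the ordered cluster sizes $W^{(1)} \ge W^{(2)} \ge \dots$ of a single percolation sample can be computed with a union-find pass. The following is a minimal sketch, not part of the argument: it assumes $d = 2$, uses a square box of `n * n` sites rather than the paper's $\Lambda_n$, and the function name `largest_clusters` and its `seed` parameter are hypothetical.

```python
import random
from collections import Counter


def largest_clusters(n, p, seed=0):
    """Sample bond percolation on the box {0,...,n-1}^2 (an assumed stand-in
    for the paper's box) and return all cluster sizes, counted in sites,
    in decreasing order."""
    rng = random.Random(seed)
    parent = list(range(n * n))

    def find(a):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for x in range(n):
        for y in range(n):
            site = x * n + y
            # Each nearest-neighbour bond is occupied independently with probability p.
            if x + 1 < n and rng.random() < p:
                union(site, (x + 1) * n + y)
            if y + 1 < n and rng.random() < p:
                union(site, x * n + y + 1)

    sizes = Counter(find(s) for s in range(n * n))
    return sorted(sizes.values(), reverse=True)


if __name__ == "__main__":
    w = largest_clusters(n=64, p=0.5)
    print("three largest cluster sizes:", w[:3])
```

Running this for several box sizes at $p = p_c = 1/2$ gives an informal feel for how the first few entries grow together with the box, which is the behaviour the bounds above quantify.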
Proof of Theorem 3.3. For this proof we use (4.1), (4.2) and Postulates (III) and (IV). We start with a lower bound on $\Pr_{p_n}\big\{W^{(i)}_{\Lambda_n} \ge K^{-1} E_{p_n}(W^{(i)}_{\Lambda_n})\big\}$. We again have (5.39) for some $\lambda$ and $n_1$, and by Theorem 3.1 i) (whose proof only used (4.1) and Postulates (III) and (IV)) there exists some constant $C_2^{(i)}$, which depends on the sequence $\{p_n\}$, such that $E_{p_n}\{W^{(i)}_{\Lambda_n}\} \le C_2^{(i)} s(n)$. Thus if $m$ is such that
$$s(m) \ge K^{-1} C_2^{(i)} s(n), \tag{5.45}$$
then
$$\Pr_{p_n}\big\{W^{(i)}_{\Lambda_n} \ge K^{-1} E_{p_n}\{W^{(i)}_{\Lambda_n}\}\big\} \ge \Pr_{p_n}\big\{W^{(i)}_{\Lambda_n} \ge s(m)\big\}. \tag{5.46}$$
We now choose $m = \lfloor n/\sigma^{(i)}(\lambda, \delta) \rfloor$, where the $\sigma^{(i)}$ are the constants introduced above (5.40). Then (5.45) will be satisfied for large enough $K$ (by (5.43)). Since $n \ge n_1$ and $n \ge \sigma^{(i)}(\lambda, \delta)$ imply $m \ge 1$ and $m\sigma^{(i)}(\lambda, \delta) \le n \le \lambda L_0(p_n)$, we now can use (5.40) to conclude that
$$\liminf_{n\to\infty} \Pr_{p_n}\big\{W^{(i)}_{\Lambda_n} \ge K^{-1} E_{p_n}\{W^{(i)}_{\Lambda_n}\}\big\} \ge 1 - \delta, \tag{5.47}$$
provided $K$ is large enough. Together with Markov's inequality,
$$\Pr_{p_n}\big\{W^{(i)}_{\Lambda_n} \ge K E_{p_n}\{W^{(i)}_{\Lambda_n}\}\big\} \le K^{-1}, \tag{5.48}$$
(5.47) implies Theorem 3.3 i).

In order to complete the proof of Theorem 3.3, we choose $m(n)$ as the maximal $m \le (\sigma_4/\lambda \wedge 1)n$ such that $K^{-1} C_1^{(i)}(\lambda) s(n) > s(m)$, where $\sigma_4$ is as in Proposition 5.4, $\lambda$ as in (5.39) and $C_1^{(i)}$ as in (5.44). Then, by (5.44) and $W^{(i)}_\Lambda \le W^{(1)}_\Lambda$, we have
$$\begin{aligned}
\limsup_{n\to\infty} \Pr_{p_n}\big\{W^{(i)}_{\Lambda_n} \ge K^{-1} E_{p_n}\{W^{(i)}_{\Lambda_n}\}\big\} &\le \limsup_{n\to\infty} \Pr_{p_n}\big\{W^{(i)}_{\Lambda_n} \ge K^{-1} C_1^{(i)}(\lambda) s(n)\big\}\\
&\le \limsup_{n\to\infty} \Pr_{p_n}\big\{W^{(1)}_{\Lambda_n} \ge K^{-1} C_1^{(i)}(\lambda) s(n)\big\}\\
&\le \limsup_{n\to\infty} \Big[1 - \Pr_{p_n}\big\{W^{(1)}_{\Lambda_n} \le s(m(n))\big\}\Big].
\end{aligned}\tag{5.49}$$
Since $n/m(n)$ is bounded above by virtue of Postulate (IV) (see (5.10)), Proposition 5.4 shows that the right-hand side of (5.49) is bounded away from 1. This proves Theorem 3.3 ii). □

Proof of Theorem 3.4. For this proof we only use (4.1), and Postulates (III) and (IV). Theorem 3.4 follows immediately from Proposition 5.1. Indeed, let $\lambda$ and $n_1$ be the constants from (5.39), and $C_1$, $C_2$ and $\sigma_1$ be those from Proposition 5.1. Choose $\sigma_2 \ge \max\{\sigma_1, \lambda\sigma_1, n_1\}$. We note that then $m \ge 1$ and $\sigma_2 m \le n$ imply $n \ge n_1$, and hence $n \le \lambda L_0(p_n)$ and $\sigma_1 m \le L_0(p_n)$. The conditions of Theorem 3.4 therefore imply those of Proposition 5.1, proving Theorem 3.4 under the assumption that (4.1), as well as Postulates (III) and (IV), hold. □
5.2. Above the scaling window. In this subsection, we prove Theorem 3.1 iii) and Theorem 3.2. To this end, we consider separately those clusters $C^{(i)}_\Lambda$ which intersect the infinite cluster $C_\infty$ and those which do not. We denote the clusters intersecting $C_\infty$ by $C^{(1)}_{\Lambda,\infty}, C^{(2)}_{\Lambda,\infty}, \cdots, C^{(k)}_{\Lambda,\infty}$, ordering them again from largest to smallest size, with lexicographic order between clusters of the same size. In the same way, $C^{(1)}_{\Lambda,\mathrm{fin}}, C^{(2)}_{\Lambda,\mathrm{fin}}, \cdots, C^{(k)}_{\Lambda,\mathrm{fin}}$ denote the clusters in $\Lambda$ which do not intersect the infinite cluster $C_\infty$. Finally, $W^{(i)}_{\Lambda,\mathrm{fin}} = |C^{(i)}_{\Lambda,\mathrm{fin}}|$ and $W^{(i)}_{\Lambda,\infty} = |C^{(i)}_{\Lambda,\infty}|$ denote the sizes of the $i$th largest clusters in the corresponding classes.
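Proposition 5.5. Suppose that Postulates (V) and (VI) hold. Then there exists a constant $C_1 < \infty$ such that
$$\frac{E_p\{W^{(1)}_{\Lambda_n,\mathrm{fin}}\}}{|\Lambda_n| P_\infty(p)} \le C_1 \Big(\frac{L_0(p)}{n}\Big)^{d/2} \quad\text{if } p > p_c, \tag{5.50}$$
so that in particular
$$\frac{E_{p_n}\{W^{(1)}_{\Lambda_n,\mathrm{fin}}\}}{|\Lambda_n| P_\infty(p_n)} \to 0 \quad\text{as } n \to \infty \tag{5.51}$$
whenever $p_n > p_c$ is a sequence of densities such that $n/L_0(p_n) \to \infty$ as $n \to \infty$.

Proof. Let $t(n) = (2nL_0(p))^{d/2} \pi_{L_0(p)}(p_c)$. Analogously to (5.2) we have
$$\begin{aligned}
E_p\{W^{(1)}_{\Lambda_n,\mathrm{fin}}\} &\le t(n) + E_p\{W^{(1)}_{\Lambda_n,\mathrm{fin}};\ W^{(1)}_{\Lambda_n,\mathrm{fin}} \ge t(n)\}\\
&\le t(n) + \sum_{v\in\Lambda_n} \Pr_p\{|C_{\Lambda_n}(v)| = W^{(1)}_{\Lambda_n,\mathrm{fin}},\ |C_{\Lambda_n}(v)| \ge t(n),\ v \not\leftrightarrow \infty\}\\
&\le t(n) + |\Lambda_n| \Pr_p\{|C(0)| \ge t(n),\ 0 \not\leftrightarrow \infty\}.
\end{aligned}\tag{5.52}$$
Using Markov's inequality and Postulate (V) we obtain
$$E_p\{W^{(1)}_{\Lambda_n,\mathrm{fin}}\} \le t(n) + \frac{(2n)^d}{t(n)}\,\chi^{\mathrm{fin}}(p) \le t(n) + D_4 \frac{(2n)^d}{t(n)}\, L_0^d(p)\, \pi^2_{L_0(p)}(p_c) = t(n)\,[1 + D_4]. \tag{5.53}$$
Observing that $t(n)/(|\Lambda_n| P_\infty(p)) \asymp (L_0(p)/n)^{d/2}$ by Postulate (VI), we obtain (5.50) and hence (5.51). □

In order to estimate the size of the clusters $C^{(1)}_{\Lambda,\infty}, C^{(2)}_{\Lambda,\infty}, \cdots, C^{(k)}_{\Lambda,\infty}$, we make extensive use of the facts that
$$\sum_{i\ge1} W^{(i)}_{\Lambda_n,\infty} = |\Lambda_n \cap C_\infty| = \sum_{v\in\Lambda_n} I[v \leftrightarrow \infty] \tag{5.54}$$
and
$$E_{p_n}\{|\Lambda_n \cap C_\infty|\} = \sum_{v\in\Lambda_n} \Pr_{p_n}\{v \leftrightarrow \infty\} = |\Lambda_n| P_\infty(p_n). \tag{5.55}$$

Lemma 5.6. Suppose that Postulates (V) and (VI) hold. Let $p_n > p_c$ be a sequence of densities such that $n/L_0(p_n) \to \infty$ as $n \to \infty$. Then as $n \to \infty$,
$$\frac{|\Lambda_n \cap C_\infty|}{|\Lambda_n| P_\infty(p_n)} \to 1 \quad\text{in probability.} \tag{5.56}$$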
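Proof. We bound the variance of $|\Lambda_n \cap C_\infty|$ by
$$\mathrm{Var}_{p_n}\{|\Lambda_n \cap C_\infty|\} = \sum_{v,w\in\Lambda_n} \mathrm{Cov}_{p_n}(v \leftrightarrow \infty;\ w \leftrightarrow \infty) \le \sum_{v\in\Lambda_n}\sum_{w\in\mathbb{Z}^d} \mathrm{Cov}_{p_n}(v \leftrightarrow \infty;\ w \leftrightarrow \infty) = |\Lambda_n|\,\chi^{\mathrm{cov}}(p_n). \tag{5.57}$$
Note that we used here the positivity of $\mathrm{Cov}_{p_n}(v \leftrightarrow \infty;\ w \leftrightarrow \infty)$; this follows from the Harris–FKG inequality. Combined with (5.55) and Postulates (V) and (VI), we obtain that for a suitable constant $C_1 < \infty$,
$$\frac{\mathrm{Var}_{p_n}\{|\Lambda_n \cap C_\infty|\}}{E^2_{p_n}\{|\Lambda_n \cap C_\infty|\}} \le C_1 \frac{L_0(p_n)^d}{|\Lambda_n|} = C_1 \Big(\frac{L_0(p_n)}{2n}\Big)^d. \tag{5.58}$$
By our assumption on $p_n$, the right-hand side goes to zero as $n \to \infty$. This implies (5.56). □

Proposition 5.7. Suppose that Postulates (II), (V) and (VI) hold. Let $p_n > p_c$ be a sequence of densities such that $n/L_0(p_n) \to \infty$ as $n \to \infty$. Then, as $n \to \infty$,
$$\frac{W^{(1)}_{\Lambda_n,\infty}}{|\Lambda_n| P_\infty(p_n)} \to 1 \quad\text{in probability.} \tag{5.59}$$

Proof. We have to show that for all $\delta > 0$
$$\Pr_{p_n}\{W^{(1)}_{\Lambda_n,\infty} \ge (1 - \delta)|\Lambda_n| P_\infty(p_n)\} \to 1 \quad\text{as } n \to \infty \tag{5.60}$$
and
$$\Pr_{p_n}\{W^{(1)}_{\Lambda_n,\infty} \le (1 + \delta)|\Lambda_n| P_\infty(p_n)\} \to 1 \quad\text{as } n \to \infty. \tag{5.61}$$
Since $W^{(1)}_{\Lambda_n,\infty} \le |\Lambda_n \cap C_\infty|$ by (5.54), the result (5.61) follows from (5.56). We are therefore left with proving (5.60). Again by (5.56), this amounts to showing that with high probability, the main contribution to the left-hand side of (5.54) comes from $W^{(1)}_{\Lambda_n,\infty}$. We consider suitable volumes $\Lambda_m \subset \Lambda_n$ with
$$\lim_{n\to\infty} |\Lambda_m|/|\Lambda_n| > 1 - \delta. \tag{5.62}$$
Since
$$\frac{|C_\infty \cap \Lambda_m|}{|\Lambda_m| P_\infty(p_n)} \to 1 \quad\text{in } \Pr_{p_n}\text{-probability} \tag{5.63}$$
as $n \to \infty$ (the proof is identical to the proof of Lemma 5.6), we conclude that
$$\Pr_{p_n}\{|C_\infty \cap \Lambda_m| \ge (1 - \delta)|\Lambda_n| P_\infty(p_n)\} \to 1 \quad\text{as } n \to \infty. \tag{5.64}$$
We shall next show that for a suitable choice of $\Lambda_m$,
$$P_{p_n}\big\{\#\{i \mid C^{(i)}_{\Lambda_n,\infty} \cap \Lambda_m \ne \emptyset\} \ge 2\big\} \to 0 \quad\text{as } n \to \infty. \tag{5.65}$$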
If $\#\{i \mid C^{(i)}_{\Lambda_n,\infty} \cap \Lambda_m \ne \emptyset\} = 1$, then all pieces of $C_\infty \cap \Lambda_m$ are connected in $\Lambda_n$ and $|C^{(1)}_{\Lambda_n,\infty}| \ge |C_\infty \cap \Lambda_m|$, so that (5.65) together with (5.64) will prove the desired result (5.60). In order to show that $\Lambda_m$ can be chosen so that (5.62) and (5.65) hold, we define, for $0 < \alpha < 1/6$ and $n \ge 1/\alpha$,
$$x = \frac{2}{\alpha} - 3, \qquad L(n) = \alpha n, \qquad M(n) = xL(n) \qquad\text{and}\qquad m = \frac{M(n) + 1}{2}. \tag{5.66}$$
Note that with this choice $m < n$ for all $n \ge 1/\alpha$, and
$$\lim_{n\to\infty} \frac{|\Lambda_m|}{|\Lambda_n|} = \Big(1 - \frac{3\alpha}{2}\Big)^d. \tag{5.67}$$
A sufficiently small choice of $\alpha$ therefore ensures the condition (5.62). Note also that $\Lambda_m$ is isomorphic to $[0, M(n)]^d$, while $\Lambda_n$ is isomorphic to $[-\tilde L(n), M(n) + \tilde L(n)]^d$, where
$$\tilde L(n) := n - m \ge L(n). \tag{5.68}$$
Using these observations and recalling the definition (2.23) of $S^{\mathrm{fin}}_{L,M}(p_n)$, we then bound
$$P_{p_n}\big\{\#\{i \mid C^{(i)}_{\Lambda_n,\infty} \cap \Lambda_m \ne \emptyset\} \ge 2\big\} \le S^{\mathrm{fin}}_{\tilde L(n),M(n)}(p_n) \le S^{\mathrm{fin}}_{L(n),M(n)}(p_n), \tag{5.69}$$
where in the last step we have used that $S^{\mathrm{fin}}_{L,M}(p_n)$ is decreasing in $L$. In order to complete the proof, we use that for any $\tilde\varepsilon > 0$,
$$L_0(p, \tilde\varepsilon; x) \asymp L_0(p) = L_0(p, \varepsilon; 1) \tag{5.70}$$
by Postulate (II). Our assumption $n/L_0(p_n) \to \infty$ therefore implies that $L_0(p_n, \tilde\varepsilon; x)/n$, and hence $L_0(p_n, \tilde\varepsilon; x)/L(n)$, goes to zero as $n \to \infty$. Since this is true for all $\tilde\varepsilon > 0$, we can use the definition (2.25) of $L_0(p_n, \tilde\varepsilon; x)$ to conclude that
$$S^{\mathrm{fin}}_{L(n),M(n)}(p_n) = S^{\mathrm{fin}}_{L(n),xL(n)}(p_n) \to 0 \quad\text{as } n \to \infty. \tag{5.71}$$
Equations (5.71) and (5.69) imply (5.65), and hence the proposition. □
Proof of Theorem 3.1 iii). For this proof we use Postulates (II), (V) and (VI). Let $p_n > p_c$ be such that $n/L_0(p_n) \to \infty$. We may then use (5.59) to conclude that
$$\liminf_{n\to\infty} \frac{E_{p_n}\{W^{(1)}_{\Lambda_n,\infty}\}}{|\Lambda_n| P_\infty(p_n)} \ge 1. \tag{5.72}$$
Since
$$E_{p_n}\{W^{(1)}_{\Lambda_n,\infty}\} \le \sum_{i\ge1} E_{p_n}\{W^{(i)}_{\Lambda_n,\infty}\} = |\Lambda_n| P_\infty(p_n) \tag{5.73}$$
for all $n$, it follows that
$$\lim_{n\to\infty} \frac{E_{p_n}\{W^{(1)}_{\Lambda_n,\infty}\}}{|\Lambda_n| P_\infty(p_n)} = 1. \tag{5.74}$$
Combined with (5.51) and $W^{(1)}_{\Lambda_n,\infty} \le W^{(1)}_{\Lambda_n} \le W^{(1)}_{\Lambda_n,\infty} + W^{(1)}_{\Lambda_n,\mathrm{fin}}$, this proves (3.10). In order to prove (3.11), we note that (5.74) together with (5.54) and (5.72) imply that
$$\frac{E_{p_n}\{W^{(2)}_{\Lambda_n,\infty}\}}{|\Lambda_n| P_\infty(p_n)} \le 1 - \frac{E_{p_n}\{W^{(1)}_{\Lambda_n,\infty}\}}{|\Lambda_n| P_\infty(p_n)} \to 0 \tag{5.75}$$
as $n \to \infty$. Combined with (5.51), this implies (3.11). □
Proof of Theorem 3.2. We again use Postulates (II), (V) and (VI). As before, by assumption, $p_n > p_c$ for all sufficiently large $n$, and $n/L_0(p_n) \to \infty$. Using Markov's inequality and Proposition 5.5, we therefore get
$$\frac{W^{(1)}_{\Lambda_n,\mathrm{fin}}}{|\Lambda_n| P_\infty(p_n)} \to 0 \quad\text{in probability.} \tag{5.76}$$
Combined with Proposition 5.7, this implies Theorem 3.2. □
5.3. Below the scaling window. We start with a lemma which will play a similar role below the window to that played by the lower bound in Proposition 5.1 inside the window.

Lemma 5.8. Assume that (4.1) holds for some $\varepsilon > 0$ and that Postulates (III), (IV) and (VII) hold. Then there exist constants $0 < C_3 < \infty$ and $1 \le \sigma_6, \sigma_7, \sigma_8 < \infty$ such that
$$E_p\{N_{\Lambda_n}(k s(L_0(p)),\ k\sigma_6 s(L_0(p)))\} \ge C_3\, \frac{e^{-D_6 k}}{k} \Big(\frac{n}{L_0(p)}\Big)^d, \tag{5.77}$$
provided $k \ge \sigma_7$, $n \ge \sigma_8 k L_0(p)$ and $p < p_c$. Here $D_6$ is the constant from Postulate (VII).

Proof. Let $C_1$ and $C_2$ be the constants from Lemma 4.4. Combining the bound (4.13) with Postulate (VII) and Proposition 4.6 we see that for suitable constants $C_4$, $C_5$, with $C_2 C_4 > D_6$, and $k$ sufficiently large, say $k \ge C_7$, one gets
$$\begin{aligned}
\Pr_p\{|C(0)| \ge k s(L_0(p)),\ \text{but } \mathrm{diam}(C(0)) < C_4 k L_0(p)\} &\ge \Pr_p\{|C(0)| \ge k s(L_0(p))\} - \Pr_p\{\mathrm{diam}(C(0)) \ge C_4 k L_0(p)\}\\
&\ge C_5\, \pi_{L_0(p)}(p_c)\, e^{-D_6 k}.
\end{aligned}\tag{5.78}$$
We want to restrict $|C(0)|$ further. For this we use Lemma 4.7, which tells us that
$$\Pr_p\{|C(0)| \ge \sigma_6 k s(L_0(p))\} \le C_1\, \pi_{L_0(p)}(p_c)\, e^{-C_2\sigma_6 k}.$$
Therefore, if we take $\sigma_6 > D_6/C_2$, then for sufficiently large $k$, say $k \ge \tilde C_7$,
$$\Pr_p\{k s(L_0(p)) \le |C(0)| \le \sigma_6 k s(L_0(p)),\ \text{but } \mathrm{diam}(C(0)) < C_4 k L_0(p)\} \ge \frac12 C_5\, \pi_{L_0(p)}(p_c)\, e^{-D_6 k}. \tag{5.79}$$
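Now let $n \ge 2C_4 k L_0(p)$, $\tilde n = n - C_4 k L_0(p)$, $\Lambda = \Lambda_n$ and $\tilde\Lambda = \Lambda_{\tilde n}$. Observe that if $v \in \tilde\Lambda$ and $\mathrm{diam}(C(v)) \le C_4 k L_0(p)$, then $C(v) \subset \Lambda$ and $C_\Lambda(v) = C(v)$. Using this observation we now find
$$\begin{aligned}
E_p\{N_{\Lambda_n}(k s(L_0(p)),\ k\sigma_6 s(L_0(p)))\} &\ge \sum_{s = k s(L_0(p))}^{\sigma_6 k s(L_0(p))}\ \sum_{v\in\tilde\Lambda} \frac{1}{s}\, \Pr_p\{|C_\Lambda(v)| = s,\ \text{but } \mathrm{diam}(C(v)) < C_4 k L_0(p)\}\\
&\ge \frac{1}{\sigma_6 k s(L_0(p))} \sum_{v\in\tilde\Lambda} \Pr_p\{k s(L_0(p)) \le |C(0)| \le \sigma_6 k s(L_0(p)),\ \text{but } \mathrm{diam}(C(0)) < C_4 k L_0(p)\}\\
&\ge C_3\, \frac{(2n)^d}{k s(L_0(p))}\, \pi_{L_0(p)}(p_c)\, e^{-D_6 k} = C_3 \Big(\frac{n}{L_0(p)}\Big)^d k^{-1} e^{-D_6 k}.
\end{aligned}$$
Choosing $\sigma_7 = \max\{C_7, \tilde C_7\}$ and $\sigma_8 = 2C_4$, this proves the lemma. □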
Proof of Theorem 3.1 ii) and Theorem 3.5. For the proof, we will need (4.1), and Postulates (III), (IV) and (VII). Assume that $p_n < p_c$ for sufficiently large $n$, and $n/L_0(p_n) \to \infty$. It follows from (4.8) that for $z \ge 0$ and $n$ large,
$$\begin{aligned}
E_{p_n}\{W^{(1)}_{\Lambda_n}\} &\le s(L_0(p_n)) \log\frac{n}{L_0(p_n)} \times \Big[z + \int_z^\infty dy\, \Pr_{p_n}\Big\{W^{(1)}_{\Lambda_n} \ge y\, s(L_0(p_n)) \log\frac{n}{L_0(p_n)}\Big\}\Big]\\
&\le s(L_0(p_n)) \log\frac{n}{L_0(p_n)} \times \Big[z + C_1 \Big(\frac{n}{L_0(p_n)}\Big)^{d - C_2 z} \int_z^\infty \exp\Big[-C_2 (y - z) \log\frac{n}{L_0(p_n)}\Big] dy\Big].
\end{aligned}$$
By choosing $C_2 z = d$ we see that
$$E_{p_n}\{W^{(1)}_{\Lambda_n}\} \le C_3\, s(L_0(p_n)) \log\frac{n}{L_0(p_n)} \tag{5.80}$$
for a suitable constant $C_3 < \infty$. This proves the upper bound for $E_{p_n}\{W^{(1)}_{\Lambda_n}\}$, where we have so far only used Postulate (IV).

The lower bound for $E_{p_n}\{W^{(1)}_{\Lambda_n}\}$ follows immediately from Theorem 3.5 so that it suffices to prove (3.16). Also, we only have to prove that
$$\liminf_{n\to\infty} \Pr_{p_n}\Big\{K^{-1} \le \frac{W^{(i)}_{\Lambda_n}}{s(L_0(p_n)) \log\frac{n}{L_0(p_n)}}\Big\} \to 1, \quad\text{as } K \to \infty, \tag{5.81}$$
since the other part of (3.16) is obvious from Markov's inequality and the upper bound (5.80). For brevity we write $p$ instead of $p_n$ for the remainder of this proof. The lower bound (5.77) will play a similar role to that played by Proposition 5.1. However, instead
of using an analogue of Proposition 5.2 for a second moment, we now appeal to the BK-inequality [BK85]. This tells us that
$$\Pr_p\{\exists\ r \text{ disjoint clusters in } B_{\lceil\sigma_8 k L_0(p)\rceil} \text{ of size} \ge k s(L_0(p))\} \le \big[\Pr_p\{\exists \text{ at least one cluster in } B_{\lceil\sigma_8 k L_0(p)\rceil} \text{ of size} \ge k s(L_0(p))\}\big]^r.$$
Consequently if we set
$$\kappa = \Pr_p\{\exists \text{ at least one cluster in } B_{\lceil\sigma_8 k L_0(p)\rceil} \text{ of size} \ge k s(L_0(p))\}, \tag{5.82}$$
then
$$E_p\{\text{number of disjoint clusters in } B_{\lceil\sigma_8 k L_0(p)\rceil} \text{ of size} \ge k s(L_0(p))\} \le \frac{\kappa}{1 - \kappa}.$$
By (5.77) the left-hand side here is at least $C_8 k^{d-1} \exp[-D_6 k]$, $C_8 = C_3\sigma_8^d$, so that
$$\kappa \ge \min\Big\{\frac12,\ \frac{C_8}{2}\, k^{d-1} e^{-D_6 k}\Big\}. \tag{5.83}$$
Now choose
$$k = k(n) = C_9 \log\frac{n}{L_0(p)}$$
with the constant $C_9 > 0$ but so small that $D_6 C_9 < d/2$. Then we can find in $\Lambda_n$ approximately
$$\Big(\frac{n}{2\sigma_8 k L_0(p)}\Big)^d \ge \Big(\frac{n}{L_0(p)}\Big)^{d/2}$$
disjoint boxes $B_{\lceil\sigma_8 k L_0(p)\rceil}(v_i)$. Each of these boxes contains a cluster of size
$$\ge k(n)\, s(L_0(p)) \sim C_9\, s(L_0(p_n)) \log\frac{n}{L_0(p_n)} \tag{5.84}$$
with a probability at least
$$\min\Big\{\frac12,\ \frac{C_8}{2}\, k^{d-1} e^{-D_6 k}\Big\}. \tag{5.85}$$
Moreover, as in (4.16) we also have
$$\Pr_p\{\partial B_{\lceil\sigma_8 k L_0(p)\rceil}(v_i) \not\leftrightarrow \partial B_{3\lceil\sigma_8 k L_0(p)\rceil}(v_i)\} \ge \varepsilon^{2d}.$$
For large $n$ this gives
$$\Pr_p\Big\{B_{\lceil\sigma_8 k L_0(p)\rceil}(v_i) \text{ contains a cluster of size} \ge \frac{C_9}{2}\, s(L_0(p_n)) \log\frac{n}{L_0(p_n)} \text{ and this cluster is not connected to } \partial B_{3\lceil\sigma_8 k L_0(p)\rceil}(v_i)\Big\} \ge \tfrac12\, \varepsilon^{2d} C_8 k^{d-1} \exp[-D_6 k]. \tag{5.86}$$
Since the number of boxes times the right-hand side of (5.86) tends to infinity (by our choice of $k(n)$ or $C_9$), the probability that at least $i$ of these boxes contain a cluster of size (5.84), and that these clusters are not connected to each other, tends to 1. This establishes (3.16). □
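The step from the BK bound to (5.83) can be made explicit. In this sketch, $N$ (the maximal number of disjoint clusters of size at least $k s(L_0(p))$ in the box) and $a := C_8 k^{d-1} e^{-D_6 k}$ are shorthand introduced here for illustration only.

$$E_p\{N\} = \sum_{r\ge1} \Pr_p\{N \ge r\} \le \sum_{r\ge1} \kappa^r = \frac{\kappa}{1-\kappa}, \qquad\text{so by (5.77)}\qquad a \le \frac{\kappa}{1-\kappa}.$$

If $\kappa \le 1/2$, then $\kappa/(1-\kappa) \le 2\kappa$ and hence $\kappa \ge a/2$; otherwise $\kappa > 1/2$. In both cases $\kappa \ge \min\{1/2,\ a/2\}$, which is (5.83).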
6. Verification of the Postulates in Two Dimensions

In this section we prove Theorem 3.6, which states that the Scaling Postulates (I)–(VII) hold for $d = 2$. Before we start on the proof we discuss some general tools. The fundamental tool for two-dimensional bond percolation is duality.³ This rests on the following observations. Let $\mathbb{Z}^*$ denote the lattice $\mathbb{Z}^2 + (\tfrac12, \tfrac12)$, which is called the dual lattice of $\mathbb{Z}^2$. Each dual edge $e^*$ bisects exactly one edge $e$ of the original lattice and vice versa. We call such a pair $e^*$ and $e$ associated. For each configuration $\omega$ of occupied and vacant edges of $\mathbb{Z}^2$ we obtain a configuration on $\mathbb{Z}^*$ by declaring a dual edge $e^*$ occupied (respectively, vacant) if its associated edge is occupied (respectively, vacant). It is a well-known result that there exists an occupied horizontal crossing of the rectangle $[0, L] \times [0, M]$ if and only if there does not exist a vertical vacant dual crossing of $[\tfrac12, L - \tfrac12] \times [-\tfrac12, M + \tfrac12]$ (see [SmW78], Sect. 2.1 and [Kes82], Sects. 2.6, 2.4). This translates into
$$R_{L,M}(p) = 1 - R_{M+1,L-1}(1 - p). \tag{6.1}$$
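For very small rectangles, relation (6.1) can be checked exactly by enumerating all bond configurations. The sketch below is illustrative only: it assumes that $R_{L,M}(p)$ is the probability of an occupied left-right crossing of $[0, L] \times [0, M]$ using bonds with both endpoints in the rectangle, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def rectangle_edges(L, M):
    """Nearest-neighbour bonds of Z^2 with both endpoints in [0, L] x [0, M]."""
    edges = []
    for x in range(L + 1):
        for y in range(M + 1):
            if x < L:
                edges.append(((x, y), (x + 1, y)))
            if y < M:
                edges.append(((x, y), (x, y + 1)))
    return edges


def has_horizontal_crossing(L, M, occupied):
    """True if the occupied bonds join the side x = 0 to the side x = L."""
    adj = {}
    for u, v in occupied:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    stack = [(0, y) for y in range(M + 1)]  # start from every site on the left side
    seen = set(stack)
    while stack:
        u = stack.pop()
        if u[0] == L:
            return True
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return False


def crossing_probability(L, M, p):
    """Exact crossing probability by summing over all 2^(#bonds) configurations."""
    edges = rectangle_edges(L, M)
    total = Fraction(0)
    for state in product((0, 1), repeat=len(edges)):
        occupied = [e for e, s in zip(edges, state) if s]
        if has_horizontal_crossing(L, M, occupied):
            k = len(occupied)
            total += p**k * (1 - p) ** (len(edges) - k)
    return total


if __name__ == "__main__":
    p = Fraction(1, 3)
    for L, M in [(2, 1), (2, 2), (1, 2)]:
        lhs = crossing_probability(L, M, p)
        rhs = 1 - crossing_probability(M + 1, L - 1, 1 - p)
        print((L, M), lhs, rhs)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding; the enumeration is exponential in the number of bonds and is only meant for rectangles with a handful of sites.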
This relation can be used to relate quantities in the subcritical regime to similar quantities in the supercritical regime. For instance, define the two-dimensional finite-size scaling length as
$$\tilde L_0(p, \varepsilon) = \begin{cases} \min\{L \mid R_{L,L}(p) \le \varepsilon\} & \text{if } p < p_c,\\ \min\{L \mid R_{L,L}(p) \ge 1 - \varepsilon\} & \text{if } p > p_c. \end{cases} \tag{6.2}$$
(Note that this is in the spirit of definition (1.21) of [Kes87]. However, [Kes87] treats bond percolation on $\mathbb{Z}^2$ as site percolation on the covering graph of $\mathbb{Z}^2$, so that the formal definition there is somewhat different. For the purposes of the proofs here this difference in the definitions is without significance.) It follows easily from duality and monotonicity of $R_{L,M}$ in $L$ and $M$ that for bond percolation on $\mathbb{Z}^2$, $\tilde L_0(p, \varepsilon) \ge \tilde L_0(1 - p, \varepsilon)$ for $p < p_c$. From the rescaling lemma (Lemmas 3.4 and 4.12 in [ACCFR83]) and duality one obtains that for sufficiently small $\varepsilon > 0$ also $2\tilde L_0(1 - p, \varepsilon) - 1 \ge \tilde L_0(p, \varepsilon)$ for $p < p_c$. We therefore have that
$$\tilde L_0(p, \varepsilon) \asymp \tilde L_0(1 - p, \varepsilon), \qquad p < p_c = \frac12. \tag{6.3}$$
Similarly, using the rescaling lemma and the Russo–Seymour–Welsh lemma [Rus78, SW78, Sect. 3.4] it is straightforward to show that in $d = 2$, the definition (6.2) is equivalent to our finite-size scaling correlation length below threshold, see (2.20):
$$\tilde L_0(p) \asymp L_0(p) \quad\text{for } p < p_c, \tag{6.4}$$
and to our finite-size scaling inverse surface tension above threshold, see (2.26):
$$\tilde L_0(p) \asymp A_0(p) \quad\text{for } p > p_c. \tag{6.5}$$
As usual, the constants implicit in the equivalences (6.3)–(6.5) depend on $\varepsilon$.

³ Here we can use duality since we are dealing with bond percolation, which is self-dual. However, with a good deal more work, similar results can be proven for other two-dimensional models which are not self-dual – see [Kes87] (Eq. (1.23) and Sect. 4).
It also follows from the Russo-Seymour-Welsh lemma that for each $x > 0$ and integer $k \ge 1$ there exists a constant $h(x, k, \varepsilon) > 0$ such that for $p \le p_c$, $L \le k\tilde L_0(p)$, $M/L \ge x$, it holds that
$$R_{L,M}(p) \ge h(x, k, \varepsilon). \tag{6.6}$$
Thus, sponge crossing probabilities of rectangles with the ratio of the sides bounded away from 0 and $\infty$ and with a size comparable to $\tilde L_0(p)$ are bounded away from 0. By means of the Harris-FKG inequality it is then also easy to see that the probability of an occupied circuit surrounding the origin in the annulus $A = [-M, M]^2 \setminus (-L, L)^2$ is bounded away from 0, provided $L \le kL_0(p)$, $M/L \ge 1 + x > 1$. Indeed, by obvious monotonicity we may assume that $M \le 2L$. The annulus $A$ is the union of four $(M - L) \times M$ rectangles, and if each of these has an occupied crossing in the long direction (i.e., a crossing in the direction of the side of length $M$), then $A$ contains a circuit of the desired kind (compare [SmW78], Lemma 3.5). By the above, each of these crossings has a probability of $R_{M,M-L}(p) \ge h(x/(1 + x) \wedge 1/2, 2k, \varepsilon)$, and by the Harris-FKG inequality the desired occupied circuit exists with a probability at least $h^4(x/(1 + x) \wedge 1/2, 2k, \varepsilon)$.

Now consider two adjacent rectangles $[0, L] \times [0, M]$ and $[L, 2L] \times [0, M]$, and assume that each of these contains an occupied horizontal crossing, $r_1$ and $r_2$, say. If, in addition, there exist occupied vertical crossings of $[0, L] \times [-L, M + L]$ and $[L, 2L] \times [-L, M + L]$ as well as occupied horizontal crossings of $[0, 2L] \times [-L, 0]$ and $[0, 2L] \times [M, M + L]$, then these four crossings contain a circuit which necessarily intersects $r_1$ and $r_2$ and therefore connects $r_1$ and $r_2$ (see Fig. 1). Therefore, another application of the Harris-FKG inequality shows that
$$\begin{aligned}
&\Pr_p\{\text{all horizontal crossings of } [0, L] \times [0, M] \text{ and of } [L, 2L] \times [0, M] \text{ are connected} \mid\\
&\qquad \text{there exists at least one horizontal crossing in each of } [0, L] \times [0, M] \text{ and } [L, 2L] \times [0, M]\}\\
&\qquad\ge h^2\Big(\frac{L}{M + 2L}, \frac{M + 2L}{L_0(p)}, \varepsilon\Big)\, h^2\Big(\frac12, \frac{2L}{L_0(p)}, \varepsilon\Big).
\end{aligned}\tag{6.7}$$
If $M/L$ and $L/L_0(p)$ are bounded, then the right-hand side of (6.7) is bounded away from zero. By minor variations of this argument one sees that there is a lower bound for the probability that two occupied crossings $r_1$ and $r_2$ over length $L$ which are within distance of order $L$ from each other are connected (by a circuit of diameter also of order $L$), provided $L/L_0(p)$ is bounded. We shall say in such a situation that $r_1$ and $r_2$ can be connected by a Harris ring.

We now prove the postulates for $d = 2$ in several subsections. These proofs rely to a large extent on the results and methods of [Kes86] and [Kes87].

6.1. Proof of Postulates (I) and (II). Postulate (II) is the relation
$$A_0(p, \varepsilon) \asymp L_0(p, \varepsilon; 1) \asymp L_0(p, \tilde\varepsilon; x) \tag{6.8}$$
for all $p > p_c$, $x \ge 1$ and $\varepsilon, \tilde\varepsilon \in (0, \varepsilon_0)$. Once we prove this, Postulate (I) follows e.g. from the equivalence in Eq. (6.5) and Postulate (II):
$$L_0(p) \asymp A_0(p) \asymp \tilde L_0(p), \tag{6.9}$$
Fig. 1. Harris ring construction for the proof of (6.7)
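Eq. (6.3) and the known behavior (2.27). Hence it suffices to establish Postulate (II). We claim that in order to prove (6.8), it suffices to show that for all $x \ge 1$ and $\varepsilon \in (0, \varepsilon_0/2)$, there exists an $\tilde\varepsilon \in (0, \varepsilon_0)$ and a $\lambda = \lambda(\varepsilon, \tilde\varepsilon, x)$ such that
$$\tilde L_0(p, 2\tilde\varepsilon) \le L_0(p, \tilde\varepsilon, x) \le \lambda\tilde L_0(p, \varepsilon) + 1 \quad\text{as } p \downarrow p_c. \tag{6.10}$$
Indeed, given (6.10), we can deduce (6.8) for $\varepsilon, \tilde\varepsilon < \varepsilon_0/2$ from (6.5) and the known equivalence of $\tilde L_0(p, \varepsilon)$ at different values of $\varepsilon$, i.e.,
$$\tilde L_0(p, \varepsilon_1) \asymp \tilde L_0(p, \varepsilon_2) \quad\text{as } p \downarrow p_c \text{ for } 0 < \varepsilon_1, \varepsilon_2 < \varepsilon_0, \tag{6.11}$$
which follows from the rescaling lemma. Finally we must replace $\varepsilon_0$ by $\varepsilon_0/2$ to obtain Postulate (II).

We establish (6.10) via an upper and a lower bound. For the upper bound, we note that for all $L$, $M$,
$$S^{\mathrm{fin}}_{L,M}(p) \le 1 - P_p(\exists \text{ an occupied circuit in } H_{L,M} \text{ surrounding } \partial_I H_{L,M}). \tag{6.12}$$
Given $\varepsilon, \tilde\varepsilon \in (0, \varepsilon_0)$ and $x \ge 1$, it is not hard to show, by means of the rescaling lemma (compare the argument for (6.7)), that there exists a $\lambda = \lambda(\varepsilon, \tilde\varepsilon; x)$ such that if $M = xL$ and $L \ge \lambda\tilde L_0(p, \varepsilon)$, then the probability of the circuit described in (6.12) is strictly bounded below by $1 - \tilde\varepsilon$ for $p > p_c$. Hence $S^{\mathrm{fin}}_{L,xL}(p) < \tilde\varepsilon$ for all $L \ge \lambda\tilde L_0(p, \varepsilon)$. But it follows from the definition (2.25) that $S^{\mathrm{fin}}_{L,xL}(p) \ge \tilde\varepsilon$ if $L = L_0(p, \tilde\varepsilon; x) - 1$. Thus
$$L_0(p, \tilde\varepsilon; x) \le \lambda\tilde L_0(p, \varepsilon) + 1. \tag{6.13}$$
Next we establish a lower bound of the same form. To this end, note that the annulus $H_{L,xL}$ consists of four non-overlapping $L \times xL$ rectangles and four $L \times L$ corners. Let us call the rectangles the left, right, upper and lower rectangles. Clearly, for all $L$,
$$\begin{aligned}
S^{\mathrm{fin}}_{L,xL}(p) \ge P_p(&\exists \text{ an occupied left-right crossing in the left rectangle and}\\
&\text{a vacant dual left-right crossing in the right rectangle, each connecting } \partial_I H_{L,xL} \text{ to } \partial_E H_{L,xL}).
\end{aligned}\tag{6.14}$$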
Since $x \ge 1$, the lower bound in (6.14) is only strengthened by requiring that the occupied crossing occur in an $L \times L$ sub-box of the corresponding $L \times xL$ rectangle and that the vacant dual crossing occur in an $(L + 1) \times (L - 1)$ rectangle. By (6.1) this gives
$$S^{\mathrm{fin}}_{L,xL}(p) \ge R_{L,L}(1 - R_{L,L}) \ge \frac12 (1 - R_{L,L}), \tag{6.15}$$
since $p > p_c$. Now let $L = L_0(p, \tilde\varepsilon; x)$. It then follows from the definition (2.25) that $S^{\mathrm{fin}}_{L,xL}(p) \le \tilde\varepsilon$, so that (6.15) implies $R_{L,L} \ge 1 - (2\tilde\varepsilon)$ if $L = L_0(p, \tilde\varepsilon; x)$. Comparing this with the definition (6.2) for $p > p_c$, we conclude
$$L_0(p, \tilde\varepsilon; x) \ge \tilde L_0(p, 2\tilde\varepsilon), \tag{6.16}$$
a lower bound of the desired form. □
6.2. Proof of Postulate (III). Postulate (III) is almost identical to Theorem 1 of [Kes87], except that the latter uses the condition $n \le \tilde L_0(p, \varepsilon)$, whereas Postulate (III) assumes $n \le L_0(p, \varepsilon)$. Thus, to establish Postulate (III), it suffices to show that for all $\varepsilon \in (0, \varepsilon_0)$, there exists an $\tilde\varepsilon \in (0, \varepsilon_0)$ such that
$$L_0(p, \varepsilon) \le \tilde L_0(p, \tilde\varepsilon). \tag{6.17}$$
To prove (6.17) we note that by (6.9) we have $L_0(p, \varepsilon) \le \lambda(\varepsilon)\tilde L_0(p, \varepsilon)$ for $p > p_c$ and a suitable $\lambda < \infty$. This relation also holds for $p < p_c$, as observed in (6.4). Therefore, it suffices to show that for all $p \ne p_c$, $\lambda < \infty$, and $\varepsilon \in (0, \varepsilon_0)$, there exists an $\tilde\varepsilon \in (0, \varepsilon_0)$ such that
$$\lambda\tilde L_0(p, \varepsilon) \le \tilde L_0(p, \tilde\varepsilon). \tag{6.18}$$
Finally, by (6.3), it suffices to establish (6.18) only for $p < p_c$, and by iteration, to establish the latter only for $\lambda = 2$. To this end, we note that by the Russo-Seymour-Welsh lemma ([Rus78, SW78, Sect. 3.4]), rescaling and the obvious monotonicity of $R_{L,M}$, we have
$$R_{M,M}(p) \ge f(R_{L,L}(p)) \quad\text{if } L \le M \le 3L, \tag{6.19}$$
for some function $f$ on $[0, 1]$ which is strictly positive on $(0, 1]$. Without loss of generality we may take $f(\varepsilon) \le \varepsilon$. Using the definition (6.2) of $\tilde L_0(p, \varepsilon)$, we conclude that
$$R_{M,M}(p) > f(\varepsilon) \quad\text{if } M \le 3\tilde L_0(p, \varepsilon) - 3. \tag{6.20}$$
As a consequence,
$$\tilde L_0(p, f(\varepsilon)) \ge 3\tilde L_0(p, \varepsilon) - 2 \ge 2\tilde L_0(p, \varepsilon), \tag{6.21}$$
where we have used that $R_{1,1}(p) \ge p > \varepsilon$, and hence $\tilde L_0(p, \varepsilon) > 1$ in the last step. This establishes (6.18) and hence Postulate (III). □
6.3. Proof of Postulate (IV). We will establish Postulate (IV) for all $p$ such that $m \le L_0(p)$ (a somewhat stronger result than the stated postulate at $p_c$). This postulate with $\rho_1 = 2$ follows from the claim that for some $C_1 > 0$,
$$\pi_m(p) \ge C_1 \Big(\frac{m}{n}\Big)^{-1/2} \pi_n(p) \quad\text{if } n \le m \le L_0(p). \tag{6.22}$$
In order to establish (6.22), we assume that $kn \le m \le (k + 1)n$ for some integer $k \ge 1$. By (2.12) and monotonicity of $\pi_n$,
$$\pi_m \ge \pi_{(k+1)n} \ge \tilde\pi_{(k+1)n}. \tag{6.23}$$
Recall the definition (2.9) of $\tilde\pi_{(k+1)n}$ and observe that one mechanism to ensure that the origin is connected to the line at $x_1 = (k + 1)n$ is to have (1) the origin connected to some point in $\partial B_n(0)$, (2) some point on $\partial B_n(0)$ connected to the line at $x_1 = (k + 1)n$, and (3) Harris rings in the annuli $B_n \setminus B_{n/2}$ and $B_{2n} \setminus B_n$ and a rectangle crossing from (say) the right boundary of $B_{n/2}$ to the central quarter of the right boundary of $B_{2n}$ to "glue" the connections in (1) and (2) together. Since $n \le L_0(p)$, the probability of the third event is bounded away from zero, uniformly in $n$ (as in (6.7)). Denote the probability of the event described in (2) above by $G_{n,kn}$. Equation (6.23) and the Harris-FKG inequality then imply that for some constant $C_2 > 0$,
$$\pi_m \ge C_2\, \pi_n\, G_{n,kn}. \tag{6.24}$$
By an argument almost identical to the proof of Corollary (3.15) in [BK85], $kn \le L_0(p)$ implies $G_{n,kn} \ge C_3/\sqrt{k}$, where $C_3$ is a lower bound on the probability of an occupied crossing of a $2kn \times 2kn$ square. The constant $C_3 > 0$ by virtue of (6.17) and (6.20). (Essentially this same argument is used in [Kes87], Eq. (3.6) and its proof on p. 143.) Thus (6.24) implies the desired bound (6.22). □
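6.4. Proof of Postulate (V). Theorem 3 of [Kes87] gives the second inequality in Postulate (V). Thus it suffices to prove that for a suitable constant $D_4$ and all $p > p_c$,
$$\chi^{\mathrm{cov}}(p) \le D_4\, L_0^2(p)\, \pi^2_{L_0(p)}(p_c). \tag{6.25}$$
To this end, we decompose the sum defining $\chi^{\mathrm{cov}}(p)$ (with $|v|$ short for $|v|_\infty$ and $L_0$ for $L_0(p)$):
$$\chi^{\mathrm{cov}}(p) = \sum_{|v|\le 2L_0} \mathrm{Cov}_p(0 \leftrightarrow \infty;\ v \leftrightarrow \infty) + \sum_{|v| > 2L_0} \mathrm{Cov}_p(0 \leftrightarrow \infty;\ v \leftrightarrow \infty). \tag{6.26}$$
To control the first term, we use the bound (4.4) in Lemma 4.1 and Postulate III to estimate
$$\sum_{|v|\le 2L_0} \mathrm{Cov}_p(0 \leftrightarrow \infty;\ v \leftrightarrow \infty) \le \sum_{|v|\le 2L_0} P_p\{0 \leftrightarrow \infty,\ v \leftrightarrow \infty\} \le \sum_{|v|\le 2L_0} \tau(0, v) \le \sum_{|v|\le 2L_0} \pi^2_{[|v|/2]}(p) \asymp L_0^2\, \pi^2_{L_0}(p_c). \tag{6.27}$$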
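Next, we bound the second term in (6.26). To this end, let $B(w) = B_{L_0}(w)$ be the box of radius $L_0$ centered at $w$. For $|v| > 2L_0$, we have
\[
\begin{aligned}
\mathrm{Cov}_p(0 \leftrightarrow \infty;\, v \leftrightarrow \infty)
&= \mathrm{Cov}_p(0 \nleftrightarrow \infty;\, v \nleftrightarrow \infty) \\
&= \mathrm{Cov}_p(0 \nleftrightarrow \infty,\, 0 \leftrightarrow \partial B(0);\; v \nleftrightarrow \infty,\, v \leftrightarrow \partial B(v)) \\
&\quad + \mathrm{Cov}_p(0 \nleftrightarrow \infty,\, 0 \leftrightarrow \partial B(0);\; v \nleftrightarrow \partial B(v)) \\
&\quad + \mathrm{Cov}_p(0 \nleftrightarrow \partial B(0);\; v \nleftrightarrow \infty,\, v \leftrightarrow \partial B(v)) \\
&\quad + \mathrm{Cov}_p(0 \nleftrightarrow \partial B(0);\; v \nleftrightarrow \partial B(v)) \\
&= \mathrm{Cov}_p(0 \nleftrightarrow \infty,\, 0 \leftrightarrow \partial B(0);\; v \nleftrightarrow \infty,\, v \leftrightarrow \partial B(v)) \\
&\quad + 2\,\mathrm{Cov}_p(0 \nleftrightarrow \partial B(0);\; v \nleftrightarrow \infty,\, v \leftrightarrow \partial B(v)),
\end{aligned} \tag{6.28}
\]
where in the last step we have used that $\mathrm{Cov}_p(0 \nleftrightarrow \partial B(0);\, v \nleftrightarrow \partial B(v)) = 0$ by the independence of the events $\{0 \nleftrightarrow \partial B(0)\}$ and $\{v \nleftrightarrow \partial B(v)\}$ when $B(0)$ and $B(v)$ are disjoint, and also the symmetry of the roles played by $0$ and $v$. Now we bound the second term on the right-hand side of (6.28) according to
\[
\begin{aligned}
\mathrm{Cov}_p(0 \nleftrightarrow \partial B(0);\; v \nleftrightarrow \infty,\, v \leftrightarrow \partial B(v))
&= \mathrm{Cov}_p(0 \nleftrightarrow \partial B(0);\; v \nleftrightarrow \infty,\, v \leftrightarrow \partial B(v),\, v \leftrightarrow \partial B(0)) \\
&\quad + \mathrm{Cov}_p(0 \nleftrightarrow \partial B(0);\; v \nleftrightarrow \infty,\, v \leftrightarrow \partial B(v),\, v \nleftrightarrow \partial B(0)) \\
&= \mathrm{Cov}_p(0 \nleftrightarrow \partial B(0);\; v \nleftrightarrow \infty,\, v \leftrightarrow \partial B(0)) \\
&= -\,\mathrm{Cov}_p(0 \leftrightarrow \partial B(0);\; v \nleftrightarrow \infty,\, v \leftrightarrow \partial B(0)) \\
&\le P_p\{0 \leftrightarrow \partial B(0)\}\, P_p\{v \nleftrightarrow \infty,\, v \leftrightarrow \partial B(0)\},
\end{aligned} \tag{6.29}
\]
where we have used that the two events $\{v \nleftrightarrow \infty\} \cap \{v \leftrightarrow \partial B(v)\} \cap \{v \nleftrightarrow \partial B(0)\}$ and $\{0 \nleftrightarrow \partial B(0)\}$ are independent. Using the Harris-FKG inequality and obvious monotonicities, the second factor on the right-hand side of (6.29) is in turn bounded according to
\[
\mathrm{Pr}_p\{v \nleftrightarrow \infty,\, v \leftrightarrow \partial B(0)\} \le \mathrm{Pr}_p\{v \leftrightarrow \partial B(0)\}\; \mathrm{Pr}_p\{\exists\, w \in \partial B(0) \text{ such that } w \text{ and } v \text{ are surrounded by a vacant dual contour}\}. \tag{6.30}
\]
We now follow a coarse-graining argument along the lines of the proof of Theorem 3 in [Kes87] (see (3.12), (3.13) and (2.25) there). Let $v = (v_1, v_2)$ and for the sake of argument let $v_1 = |v| = |v|_\infty$. If there exists a vacant dual contour surrounding $w \in \partial B(0)$ and $v$, then there exists a vacant dual path from $B(0)$ to some $B(v_1 + j, v_2)$ with $j \ge 0$. By (2.25) in [Kes87] the probability that such a vacant path exists is at most $C_1 \exp[-C_2 |v|/L_0]$. Together with (6.29) and Postulate III this leads to a bound of
\[
C_3\, \pi_{L_0}^2(p_c) \exp[-C_2 |v|/L_0] \tag{6.31}
\]
for the second term in the right-hand side of (6.28).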
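The manipulations in (6.28) and (6.29) rely on a few elementary facts about covariances of events: passing to complements in both arguments leaves the covariance unchanged, passing to the complement in one argument flips its sign, events that depend on disjoint sets of bonds have covariance zero, and (Harris-FKG) two increasing events are positively correlated. The following is a minimal sketch that checks these facts exactly on a toy product space of three independent bonds; the events `A`, `B`, `D`, `E` and the value of `P` are illustrative assumptions, not objects from the paper.

```python
from fractions import Fraction
from itertools import product

N = 3               # number of independent bonds in the toy model (assumption)
P = Fraction(1, 3)  # illustrative occupation probability (assumption)


def prob(event):
    """Exact probability of an event under the product measure on {0,1}^N."""
    total = Fraction(0)
    for w in product((0, 1), repeat=N):
        if event(w):
            k = sum(w)
            total += P**k * (1 - P) ** (N - k)
    return total


def cov(a, b):
    """Covariance of the indicator functions of the events a and b."""
    return prob(lambda w: a(w) and b(w)) - prob(a) * prob(b)


def neg(a):
    """Complement of an event."""
    return lambda w: not a(w)


# Two increasing events sharing bond 1, and two events on disjoint bonds.
A = lambda w: w[0] == 1 or w[1] == 1
B = lambda w: w[1] == 1 and w[2] == 1
D = lambda w: w[2] == 1   # depends only on bond 2
E = lambda w: w[0] == 1   # depends only on bond 0

same_under_double_complement = cov(A, B) == cov(neg(A), neg(B))
sign_flip_under_single_complement = cov(neg(A), B) == -cov(A, B)
independent_events_uncorrelated = cov(E, D) == 0
harris_fkg = cov(A, B) >= 0 and cov(A, neg(B)) <= 0
```

Because the arithmetic uses `Fraction`, the identities are verified exactly rather than up to rounding.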
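Next we bound the first term in the right-hand side of (6.28) by means of the BK inequality as follows:
\[
\begin{aligned}
&\mathrm{Pr}_p\{0 \nleftrightarrow \infty,\, 0 \leftrightarrow \partial B(0);\; v \nleftrightarrow \infty,\, v \leftrightarrow \partial B(v)\} \\
&\quad \le \mathrm{Pr}_p\{0 \leftrightarrow \partial B(0),\, v \leftrightarrow \partial B(v) \text{ and there exist vacant dual contours } C_1, C_2 \text{ surrounding } 0 \text{ and } v, \text{ respectively}\} \\
&\quad \le \mathrm{Pr}_p\{0 \leftrightarrow \partial B(0),\, v \leftrightarrow \partial B(v) \text{ and there exist edge-disjoint vacant dual contours } C_1, C_2 \text{ surrounding } 0 \text{ and } v, \text{ respectively}\} \\
&\qquad + \mathrm{Pr}_p\{0 \leftrightarrow \partial B(0),\, v \leftrightarrow \partial B(v) \text{ and there exist vacant dual contours } C_1, C_2 \text{ surrounding } 0 \text{ and } v, \text{ respectively, and } C_1 \text{ and } C_2 \text{ have an edge in common}\}.
\end{aligned}
\]
By the BK inequality the first term in the right-hand side is no more than
\[
\begin{aligned}
&\mathrm{Pr}_p\{0 \leftrightarrow \partial B(0) \text{ and there exists a vacant dual contour } C_1 \text{ which surrounds } 0\} \\
&\quad \times \mathrm{Pr}_p\{v \leftrightarrow \partial B(v) \text{ and there exists a vacant dual contour } C_2 \text{ which surrounds } v\} \\
&= \mathrm{Pr}_p\{0 \nleftrightarrow \infty,\, 0 \leftrightarrow \partial B(0)\}\, \mathrm{Pr}_p\{v \nleftrightarrow \infty,\, v \leftrightarrow \partial B(v)\}.
\end{aligned}
\]
Therefore,
\[
\begin{aligned}
&\mathrm{Cov}_p(0 \nleftrightarrow \infty,\, 0 \leftrightarrow \partial B(0);\; v \nleftrightarrow \infty,\, v \leftrightarrow \partial B(v)) \\
&\quad \le \mathrm{Pr}_p\{0 \leftrightarrow \partial B(0),\, v \leftrightarrow \partial B(v) \text{ and there exist vacant dual contours } C_1, C_2 \text{ surrounding } 0 \text{ and } v, \text{ respectively, and } C_1 \text{ and } C_2 \text{ have an edge in common}\} \\
&\quad \le \mathrm{Pr}_p\{0 \leftrightarrow \partial B(0)\}\, \mathrm{Pr}_p\{v \leftrightarrow \partial B(v)\}\, \mathrm{Pr}_p\{\exists \text{ vacant dual contours } C_1, C_2 \text{ surrounding } 0 \text{ and } v \text{ respectively, and } C_1 \text{ and } C_2 \text{ have an edge in common}\},
\end{aligned} \tag{6.32}
\]
where we have used the Harris-FKG inequality and disjointness of $B(0)$ and $B(v)$ in the last step. If the two dual contours $C_1, C_2$ in (6.32) have an edge in common, and if again $v = (v_1, v_2)$ with $v_1 = |v|$, then $C_1 \cup C_2$ contains a vacant dual path from some $B(-j_1, 0)$ to some $B(v_1 + j_2, v_2)$ with $j_1, j_2 \ge 0$. The same argument as used for (6.31) now shows that also the first term in the right-hand side of (6.28) is bounded by (6.31). Finally, then
\[
\sum_{|v| > 2L_0} \mathrm{Cov}_p(0 \leftrightarrow \infty;\, v \leftrightarrow \infty)
\le \sum_{|v| > 2L_0} 2 C_3\, \pi_{L_0}^2(p_c) \exp[-C_2 |v|/L_0]
\le C(\varepsilon)\, L_0^2\, \pi_{L_0}^2(p_c). \tag{6.33}
\]
Together with (6.26), (6.27) this yields (6.25). □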
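The last inequality in (6.33) uses the fact that the lattice sum of $\exp[-C_2|v|/L_0]$ over $|v| > 2L_0$ is of order $L_0^2$. A minimal numerical sketch of this scaling follows, assuming $d = 2$, the sup-norm $|v|_\infty$, and an illustrative decay constant $C_2 = 1$; the function name `tail_sum_ratio` and the truncation `cutoff_factor` are hypothetical choices made for this illustration only. It uses the count $8n$ of lattice points with $|v|_\infty = n$.

```python
import math


def tail_sum_ratio(L0, c2=1.0, cutoff_factor=200):
    """Return (1 / L0^2) * sum over v in Z^2 with |v|_inf > 2*L0 of exp(-c2*|v|_inf/L0).

    There are (2n+1)^2 - (2n-1)^2 = 8n lattice points with sup-norm n, so the
    two-dimensional sum reduces to a sum over shells n. The sum is truncated
    at cutoff_factor * L0, beyond which the terms are negligibly small.
    """
    total = 0.0
    for n in range(2 * L0 + 1, cutoff_factor * L0 + 1):
        total += 8 * n * math.exp(-c2 * n / L0)
    return total / L0**2


# For c2 = 1 the ratio approaches 8 * integral_2^infinity t e^{-t} dt = 24 e^{-2}
# as L0 grows, so the sum itself grows like a constant times L0^2.
LIMIT_C2_EQUALS_1 = 24 * math.exp(-2)
```

Calling `tail_sum_ratio(L0)` for increasing `L0` gives values that stay bounded and settle near `LIMIT_C2_EQUALS_1`, which is the behaviour the bound $C(\varepsilon) L_0^2 \pi_{L_0}^2(p_c)$ needs.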
6.5. Proof of Postulate (VI). Postulate (VI) for $d = 2$ goes back to [Ngu85]. We can also immediately obtain this from Theorem 2 in [Kes87], which states that $P_\infty(p)$ is of the same order as $\pi_{\widetilde{L}_0(p,\varepsilon)}(p_c)$. But by (6.10), (6.11) there exists a $\lambda = \lambda(\varepsilon) \ge 1$ such that $\widetilde{L}_0(p,\varepsilon) \le \lambda L_0(p,\varepsilon)$. Therefore, by Postulate (IV),
\[
\pi_{\widetilde{L}_0(p,\varepsilon)}(p_c) \ge \pi_{\lambda L_0(p,\varepsilon)}(p_c) \ge D_3\, \lambda^{-1/\rho_1}\, \pi_{L_0(p,\varepsilon)}(p_c).
\]
Combined with Theorem 2 of [Kes87] this gives one inequality of Postulate (VI). In the other direction, it is trivial that $P_\infty(p) \le \pi_{L_0(p)}(p)$ and further $\pi_{L_0(p)}(p) \asymp \pi_{L_0(p)}(p_c)$, by Postulate (III). □

6.6. Proof of Postulate (VII). We shall build a cluster of size at least $k s(L_0(p))$ by connecting together $C_1 k$ clusters of size at least $C_2 s(L_0(p))$ (for suitable constants $C_1, C_2$) in adjacent squares of size $2L_0(p)$. These clusters will be connected by means of Harris rings. By Postulate (IV) and Proposition 4.6 (which relies only on Postulates (I)–(IV)) there exists a $\sigma_0 \in (0, 1]$ such that for $n_0 = \lfloor \sigma_0 L_0(p)/2 \rfloor$,
\[
P_{\ge s(L_0(p))}(p) \le P_{\ge s(n_0)}(p) \le C_3\, \pi_{n_0}(p_c);
\]
in the first inequality we used that Postulate (IV) implies (5.10), which in turn implies that $s(m) \ge s(n)$ if $m/n$ is large enough. In turn, by Postulate (III), the right-hand side here is at most $C_4\, \pi_{n_0}(p)$. It therefore suffices to show that for $p < p_c$ and suitable constants $C_5, D_6$,
\[
P_{\ge k s(L_0(p))}(p) \ge C_5\, e^{-D_6 k}\, \pi_{n_0}(p). \tag{6.34}
\]
First we use Theorem 3.3 and Lemma 4.4, (4.14). These results show that there exist constants K0 < ∞ and y0 > 0 such that P rp {∃ cluster C ⊂ L0 (p) with diam(C) ≥ y0 L0 (p) and |C| ≥ K0−1 s(L0 (p))} (1)
= P rp {WL0 (p) ≥ K0−1 s(L0 (p))} − P rp {∃ cluster C ⊂ L0 (p) with diam(C) < y0 L0 (p) and |C| ≥ K0−1 s(L0 (p))} 1 1 ≥ − C1 y0−2 exp[−C2 (K0 y0 )−1 ] ≥ , 2 4 (6.35) provided L0 (p) ≥ 4/y0 . The estimate (6.35) shows that with a probability of at least 1/4 there is a cluster with a “large” size and “large” diameter in L0 (p) . We wish to locate this large cluster more precisely. In fact we want to show that we may assume that it crosses a certain rectangle in the first coordinate direction. To this end we note that if diam(C) ≥ y0 L0 (p), then there are two points v, w ∈ C so that wi − vi ≥ y0 L0 (p) for i = 1 or i = 2. Assume that this holds for i = 1. Then for some −2/y0 ≤ j ≤ 2/y0 the event M(p, j ) := {∃ cluster C ∈ L0 (p) with |C| ≥ K0−1 s(L0 (p)) that contains (6.36) points v, w with v1 ≤ jy0 L0 (p)/2 < (j + 1)y0 L0 (p)/2 ≤ w1 } must occur. Therefore there exists a j0 ∈ [−2/y0 , 2/y0 ] for which y0 P rp {M(p, j0 )} ≥ . 8(y0 + 1)
(6.37)
From (6.37) and translation invariance it follows that each of the events {∃ cluster C0 ∈ [20L0 (p), (20 + 2)L0 (p)) × [−L0 (p), L0 (p)) with |C0 | ≥ K0−1 s(L0 (p))and which crosses [(20 + j0 )y0 L0 (p)/2, (20 + j0 + 1)y0 L0 (p)/2] × [−L0 (p), L0 (p)) in the horizontal direction} , 0 ≥ 0,
(6.38)
has probability at least y0 /(8y0 + 8). Let k ≥ 1 be given and take r = 'kK0 (. If the event in (6.38) occurs for 0 = 0, 1, . . . , r and 0 ↔ ∂Bn0 (0), and the paths from 0 to ∂Bn0 (0) and the horizontal crossings of [(20+j0 )y0 L0 (p)/2, (20+j0 +1)y0 L0 (p)/2]× [−L0 (p), L0 (p)) , 0 ≤ 0 ≤ r are all connected by Harris rings, then the cluster of the origin has size at least rK0−1 s(L0 (p)) ≥ ks(L0 (p)). The Harris-FKG inequality now shows that y0 P≥ks(L0 (p)) (p) ≥ πn0 (p)C6 [C6 ]r . 8(y0 + 1) This proves (6.34) with D6 = log
8(y0 + 1)(K0 + 1) , C6 y0
and Postulate (VII) follows for all p < pc with L0 (p) ≥ 4/y0 . If L0 (p) < 4/y0 , the −2 postulate follows from the trivial bound P≥ks(L0 (p)) ≥ pks(L0 (p)) ≥ p64y0 k . # " 7. Proof of Theorem 3.7 In this section, we introduce Postulate (VII alt), which is slightly stronger than Postulate (VII), and prove Theorem 3.7. To state the Postulate (VII alt), we need some notation. For k ≥ 1, let [k]d = {1, . . . , k}d . Given an integer k ≥ 1, and a choice of vertices v( j) ( j) := 2jL0 (p) + L0 (p)/4 , j ∈ [k]d , we define sets ( j) = 2jL0 (p) + L0 (p) in
and ! F=
( j), j∈[k]d
as well as events G( j) = G(j; x) = {|C ( j) (v( j))| ≥ xs(L0 (p))}, G(j; x), Gk = Gk (x) = j∈[k]d
H ( j) = {v( j) ↔ v(j ± ei ) in F, 1 ≤ i ≤ d}, where the i th component of j ± ei equals ji ± 1. We also define Hk = {all v( j) with j ∈ [k]d are connected in F} =
H ( j).
2≤ji ≤k−1 1≤i≤d
Postulate (VII alt). For all $0 < x \le 1$ there exists a constant $D_7 = D_7(x) > 0$ such that
\[
\mathrm{Pr}_p\{H_k \mid G_k(x)\} \ge D_7^{\,k^d} \tag{7.1}
\]
for all $\zeta_0 \le p < p_c$, $k \ge 1$ and all choices of $v(j)$, $j \in [k]^d$. We remind the reader that $\zeta_0$ is some arbitrary number in $(0, p_c)$.
Note that there are $k^d$ choices for $j \in [k]^d$. Condition (7.1) therefore roughly speaking says that the conditional probability of $H(j)$, given that $|C(v(j))| \ge x s(L_0(p))$ and each $|C(v(j \pm e_i))| \ge x s(L_0(p))$, is at least $D_7$. Or still more intuitively, “clusters of size of order $s(L_0(p))$ and a distance of order $L_0(p)$ apart have a reasonable conditional probability of being connected.” We also mention that (7.1) is actually not needed for all $x \in (0, 1]$, but only for one fixed value of $x$ for which $\mathrm{Pr}_p\{G(j)\} \ge C_1 \pi_{L_0(p)}(p_c)$ for some constant $C_1 > 0$, independent of $p < p_c$. Such $x$ and $C_1$ can be shown to exist by means of the bound (5.40) which follows from Proposition 5.3.
7.1. Proof of Theorem 3.7i. In this subsection we always assume Postulates (I)–(IV) and $\zeta_0 \le p < p_c$. For brevity we write in many places $L$ for $L_0(p)$ and $\Lambda$ for $\Lambda_{L_0(p)}$. In Steps i–v we also use Postulate (VII alt), but we shall only use that (7.1) is valid for $0 < x \le x_0$ for some $x_0 > 0$. The value of $x_0$ is irrelevant. All constants in this section are independent of $k$.

Step i. There exists an $x \in (0, 1]$ and a constant $C_2 > 0$ such that uniformly for $v(j) \in 2jL_0(p) + \Lambda_{\lfloor L_0(p)/4 \rfloor}$,
\[
\mathrm{Pr}_p\{G(j)\} \ge C_2\, \pi_L(p_c). \tag{7.2}
\]
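To prove (7.2) we use the relation (5.3) between the distribution of $W^{(1)}$ and $P_{\ge s}$. For $r \ge 1$ and any $0 < C_1 < \infty$, we get
\[
\mathrm{Pr}_p\{W^{(1)}_{\Lambda_r} \ge C_1 s(r)\}
\le \sum_{v \in \Lambda_r} \sum_{s \ge C_1 s(r)} \frac{1}{s}\, \mathrm{Pr}_p\{|C_{\Lambda_r}(v)| = s\}
\le \frac{|\Lambda_r|}{C_1 s(r)} \sup_{v \in \Lambda_r} \mathrm{Pr}_p\{|C_{\Lambda_r}(v)| \ge C_1 s(r)\}. \tag{7.3}
\]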
On the other hand, by (5.10), $s(m) \ge s(r)$ and hence
\[
\mathrm{Pr}_p\{W^{(1)}_{\Lambda_r} \ge C_1 s(r)\} \ge \mathrm{Pr}_p\{W^{(1)}_{\Lambda_r} \ge s(m)\} \tag{7.4}
\]
whenever $m \ge r (C_1/D_3)^{2/d}$. Setting $r = \lfloor L_0(p)/4 \rfloor$, $m = r \lceil (C_1/D_3)^{2/d} \rceil$, and choosing $C_1 > 0$ small enough to guarantee that $\lceil (C_1/D_3)^{2/d} \rceil\, \sigma^{(1)}(1/4, 1/2) \le 1$, where $\sigma^{(1)}(\lambda, \delta)$ is the constant introduced before (5.40), we can now use the bound (5.40). Combined with (7.4) we get $\mathrm{Pr}_p\{W^{(1)}_{\Lambda_r} \ge C_1 s(r)\} \ge 1/2$. Using (7.3), we therefore conclude that there exists a constant $C_3 > 0$ and a $w_0 \in \Lambda_r$ such that
\[
\mathrm{Pr}_p\{|C_{\Lambda_r}(w_0)| \ge C_1 s(r)\} \ge \tfrac{1}{2} C_2\, \pi_r(p) \ge C_3\, \pi_r(p_c), \tag{7.5}
\]
where we used Postulate (III) in the last step. Now for any $v \in \Lambda_r$, $\Lambda_r$ shifted by $v - w_0$ is contained in $\Lambda_{3r} \subset \Lambda$. Therefore for all $v \in \Lambda_r = \Lambda_{\lfloor L_0(p)/4 \rfloor}$ and sufficiently small $C_4$,
\[
\mathrm{Pr}_p\{|C_{\Lambda}(v)| \ge C_4 s(L)\} \ge \mathrm{Pr}_p\{|C_{\Lambda_r}(w_0)| \ge C_1 s(r)\} \ge C_3\, \pi_r(p_c) \ge C_2\, \pi_L(p_c). \tag{7.6}
\]
This proves (7.2) for $j = 0$ and $x = C_4 \wedge 1$. But then it clearly holds for all $j$ by translation and for all $0 < x \le C_4 \wedge 1$.
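The quantity $W^{(1)}$ used in Step i is the size of the largest cluster inside a finite box. As a minimal sketch of how this random variable can be sampled, the following code runs bond percolation on an $n \times n$ box with free boundary conditions and merges clusters with a union-find structure. The function name `largest_cluster`, the choice of bond (rather than site) percolation, and the free boundary are assumptions made for this illustration; the code is not taken from the paper.

```python
import random


def largest_cluster(n, p, seed=0):
    """Size of the largest open cluster for bond percolation on an n x n box.

    Each nearest-neighbour bond inside the box is open independently with
    probability p (free boundary conditions). Sites are indexed by x * n + y.
    """
    rng = random.Random(seed)
    parent = list(range(n * n))
    size = [1] * (n * n)

    def find(a):
        # Find the root of a, halving the path along the way.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    def union(a, b):
        # Merge the clusters of a and b, attaching the smaller to the larger.
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra
        size[ra] += size[rb]

    for x in range(n):
        for y in range(n):
            v = x * n + y
            # One uniform variable per bond, drawn in a fixed order, so that
            # runs with the same seed and different p are monotonically coupled.
            if x + 1 < n and rng.random() < p:
                union(v, v + n)
            if y + 1 < n and rng.random() < p:
                union(v, v + 1)
    return max(size[find(v)] for v in range(n * n))
```

Averaging `largest_cluster(n, p, seed)` over many seeds gives a Monte Carlo estimate of the expected largest cluster size in the box, which can be compared for values of $p$ below, near, and above the critical point.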
Step ii. Now fix $k$ and for brevity write $M = k^d$. Let $C_2$ and $C_4$ be such that (7.6) holds. Also fix $x = C_4 \wedge 1 \wedge x_0$ and take $D_7 = D_7(x)$. It is useful to indicate the choice of the $v(j)$ more explicitly in our notation. With some abuse of notation we denote the possible values of $j$ by $1, \ldots, M$, and we occasionally write $G_k(v(1), \ldots, v(M))$ instead of $G_k$, and similarly for $H_k(v(1), \ldots, v(M))$. We have defined the $\Lambda(j)$ such that they are disjoint. Consequently, for any choice of $v(j)$ in $2jL_0(p) + \Lambda_{\lfloor L_0(p)/4 \rfloor}$, we have by (7.2)
\[
\mathrm{Pr}_p\{G_k(v(1), \ldots, v(M))\} = \prod_j \mathrm{Pr}_p\{G(j)\} \ge [C_2\, \pi_L(p_c)]^M,
\]
and then by Postulate (VII alt) P rp {Gk (v(1), . . . , v(M)) ∩ Hk (v(1), . . . , v(M))} ≥ [D7 C2 πL (pc )]M . (7.7) # (k) ( j) for j = 1. We indicate this sum by We sum this over all v( j) ∈
. We therefore have for some constants C5 , C6 , (k) P rp {Gk (v(1), . . . , v(M)) ∩ Hk (v(1), . . . , v(M)) (7.8) ≥ [D7 C2 πL (pc )]M [2L0 (p)/4 ]M−1 ≥ C5 πL (pc )[C6 s(L)]M−1 . Step iii. We next work on an upper bound for the left-hand side of (7.8). To this end we note that on the event Gk ∩ Hk , v( j) is connected to v(1) and therefore to ∂ ( j) whenever j = 1. We therefore define ( j) = number of v ∈
( j) which are connected to ∂ ( j). V We further define Ik = Ik (v(1)) = I [|CF (v(1))| ≥ Mxs(L)].
(7.9)
Clearly, on the event Gk (v(1), . . . , v(M)) ∩ Hk (v(1), . . . , v(M)), it holds that |CF (v(1))| ≥ |C ( j) (v( j))| ≥ Mxs(L) and v( j) ↔ ∂ ( j), j
and therefore (k) P rp {Gk (v(1), . . . , v(M)) ∩ Hk (v(1), . . . , v(M))} ≤ Ep {Ik (v(1))I [v(1) ↔ ∂ (1)]
"
( j)}. V
(7.10)
j=1
We continue this inequality. For any γ ≥ 0 the right-hand side of (7.10) is at most eγ M [s(L)]M−1 Ep {Ik (v(1))} " " ( j); ( j) ≥ eγ M [s(L)]M−1 V V + Ep Ik (v(1))I [v(1) ↔ ∂ (1)] j=1
≤e
γM
[s(L)]
j=1
M−1
Ep {Ik (v(1))} " 2 ( j) . V + e−γ M [s(L)]−M+1 Ep I [v(1) ↔ ∂ (1)] j=1
( j) are independent among each other and of I [v(1) ↔ Finally we observe that the V ∂ (1)], because the ( j) are disjoint. Moreover, for each j, 2 ( j)} ≤ C7 s 2 (L), Ep {V by virtue of (4.6). Therefore " " 2 ( j)} 2 ( j)} = P rp {v(1) ↔ ∂ (1)} Ep {I [v(1) ↔ ∂ (1)] Ep {V V j=1
j=1
≤ πL/2 (pc )[C7 s(L)]
2M−2
.
Combining these estimates gives (k) P rp {Gk (v(1), . . . , v(M)) ∩ Hk (v(1), . . . , v(M))} ≤ eγ M [s(L)]M−1 Ep {Ik (v(1))} + e−γ M [s(L)]−M+1 πL/2 (pc )[C7 s(L)]2M−2 . (7.11) Step iv. In this step we complete the deduction of Postulate (VII) from Postulates (I)–(V) and Postulate (VII alt). From (7.8)–(7.11) we obtain (by means of Postulate (IV)) eγ M [s(L)]M−1 Ep {Ik (v(1))}
≥ C5 πL (pc ) [C6 s(L)]M−1 − e−γ M [s(L)]−M+1 (C5 D3 )−1 21+1/ρ1 [C7 s(L)]2M−2 . Choosing γ so large that e−γ < C6 C7−2 ∧
1 C5 D3 2−1/ρ1 , 4
we find that 1 Ep {Ik (v(1))} ≥ πL (pc )e−γ M C5 C6M−1 . 2
(7.12)
Since, by (7.9), the left-hand side is no more than P≥Mxs(L) (p), and, by (4.20), πL (pc ) ≥ C2−1 P≥s(L) (pc ) ≥ C2−1 P≥s(L) (p), we obtain Postulate (VII). Step v. Even though we finished the deduction of Postulate (VII), we point out here that had we summed over v(1) as well, then the derivation given above would have resulted in (1)
P rp {W kL ≥ C2 Ms(L)} = P rp {∃ a cluster in
!
( j) of size ≥ C2 Ms(L)}
(7.13)
j
≥ C9 e−γ M C6M . This is basically the estimate (5.83) and we can deduce the lower bound in Theorem 3.5 almost immediately from (7.13) without repeating most of its proof from Postulate (VII).
Also (7.13) can be used to derive the desired counterpart to (3.14), namely, for each fixed $K$ and $i$,
\[
\limsup_{n \to \infty} P_{p_n}\left\{ \frac{W^{(i)}_{\Lambda_n}}{E_{p_n}\{W^{(i)}_{\Lambda_n}\}} \le K \right\} < 1, \tag{7.14}
\]
when pn is inside the scaling window, i.e., when (3.5) holds. To see (7.14) for i = 1, fix some large K. Then choose k such that for large n, 1/ρ2 k (1) KEpn {W n } ≤ C2 C1−1 s(n) and kL0 (p) > 2n, (7.15) 2 (1)
with C1 as in (4.17). Such a k exists because Epn {W n } and s(n) are of the same order by Theorem 3.1 i) and pn is inside the scaling window. Finally choose pn% ≤ (pn ∧ pc ) such that n ≤ kL0 (pn% ) ≤ 2n. This can be done by virtue of (2.29). Lemma 4.5 then shows that 1/ρ2 (1) −1 k d % C2 k s(L0 (pn )) ≥ C2 C1 s(n) ≥ KEpn {W n } (see (7.15)). 2 Finally, then, by (7.13) for p = pn% , (1)
(1)
(1)
Ppn {W n ≥ KEpn {W n }} ≥ Ppn {W n ≥ C2 k d s(L0 (pn% ))} d
(1)
d
≥ Ppn% {W n ≥ C2 k d s(L0 (pn% ))} ≥ C9 e−γ k C6k > 0. This proves (7.14) for i = 1. For general i a little extra work is needed as in the last few lines of the proof of Theorem 3.5. " # 7.2. Proof of Theorem 3.7ii. We briefly indicate how to derive Postulate (VII alt) in dimension 2. We first show that (7.1) holds when x is sufficiently small. In fact, if K0 , y0 and j0 are the constants for which (6.35)–(6.37) hold, then this argument works for x ≤ [K0 ]−1 . With M(p, j ) as in (6.36), we have by a Harris ring construction that for ( j), suitable constants C1 , C2 > 0 and all v( j) ∈
P rp {∃ cluster C ∈ L0 (p) which contains v( j) and points v, w with v1 ≤ j0 y0 L0 (p)/2 < (j0 + 1)y0 L0 (p)/2 ≤ w1 and with |C| ≥ K0−1 s(L0 (p))} ≥ C1 P rp {M(p, j0 )}P rp {v( j) ↔ ∂BL0 (p) (v( j)} ≥
C1 y0 πL (p) (p) 8(y0 + 1) 0
≥ C2 πL0 (p) (pc ). On the other hand, by definition of G( j), P rp {G( j)} ≤ P≥xs(L0 (p)) (p).
(7.16)
Using the bound
\[
P_{\ge s}(p) \le \pi_n(p) + \frac{1}{s} \sum_{m=1}^{n} |\partial B_m|\, \pi^2_{\lfloor m/2 \rfloor}(p),
\]
proven in [BCKS98], Eq. (4.20), one easily shows that there is a constant $C_{10} = C_{10}(x) < \infty$ such that $P_{\ge x s(L_0(p))}(p) \le C_{10}\, \pi_{L_0(p)}(p_c)$. Since the $G(j)$ for different $j$ depend on disjoint regions, they are independent, and
\[
\mathrm{Pr}_p\{G_k\} \le [P_{\ge x s(L_0(p))}(p)]^{k^2} \le [C_{10}\, \pi_{L_0(p)}(p_c)]^{k^2}. \tag{7.17}
\]
Finally, denote the event in the left-hand side of (7.16) by K( j), that is K( j) = ∃ cluster C ∈ L0 (p) which contains v( j) and points v, w with v1 ≤ j0 y0 L0 (p)/2 < (j0 + 1)y0 L0 (p)/2 ≤ w1 and with |C| ≥ K0−1 s(L0 (p))}. Note that K( j) implies G( j) when x ≤ [K0 ]−1 . Therefore another Harris ring construction shows that for some constant C11 > 0, k P rp {K( j) for all 1 ≤ ji ≤ k, i = 1, 2} P rp {Gk ∩ Hk } ≥ C11 2
k [C2 πL0 (p) (pc )]k , ≥ C11 2
2
by virtue of (7.16) and the Harris–FKG inequality. Comparing this with (7.17), we see that P rp {Gk ∩ Hk } ≥ [C11 C2 /C10 ]k P rp {Gk }. 2
This completes the proof of (7.1) when x ≤ [K0 ]−1 . For our purposes (7.1) for 0 < x ≤ [K0 ]−1 is actually good enough, but it is not hard to obtain (7.1) for general 0 < x ≤ 1 now. In fact, we can apply the same argument as above, provided we first prove the following strengthening of (7.16) for some constant C12 > 0: P rp {∃ cluster C ∈ L0 (p) which contains v( j) and points v, w with v1 ≤ j0 y0 L0 (p)/2 < (j0 + 1)y0 L0 (p)/2 ≤ w1 and with |C| ≥ s(L0 (p))} ≥ C12 πL0 (p) (pc ).
(7.18)
But (7.18) can be derived exactly as (7.16) from an analogue of (6.37) if we start from P rp {∃ cluster C ∈ L0 (p) with diam(C) ≥ y0 L0 (p) and |C| ≥ s(L0 (p))} (1)
≥ P rp {WL0 (p) ≥ s(L0 (p))} − P rp {∃ cluster C ⊂ L0 (p) with diam(C) ≤ y0 L0 (p) but |C| ≥ s(L0 (p))} (1)
≥ P rp {WL0 (p) ≥ s(L0 (p))} − C1 y0−2 exp[−C3 y0−1 ] ≥ C13 > 0, (7.19)
which is valid for sufficiently small y0 > 0 and some constant C13 > 0. Equation (7.19) is the analogue of (6.35) with [K0 ]−1 replaced by 1. The reason why we can prove this now, but could not take [K0 ]−1 = 1 in (6.35) to begin with, is that we first needed to (1) show that P rp {WL0 (p) ≥ s(L0 (p))} is bounded away from 0. But this is now available to us from (7.14). As we pointed out before (7.14) only needs (7.1) for 0 < x ≤ x0 for some x0 > 0, and this we just derived. " # Acknowledgements. The authors wish to thank the Forschungsinstitut of the ETH in Zürich and the Institute for Advanced Study in Princeton for their hospitality and partial support of the research in this paper. The authors are also grateful for partial support from other sources: C.B. was supported by the Commission of the European Union under the grant CHRX-CT93-0411, J.T.C. by NSF grant DMS-9403842, and H.K. by an NSF grant to Cornell University.
References
[ACCFR83] Aizenman, M., Chayes, J.T., Chayes, L., Fröhlich, J. and Russo, L.: On a sharp transition from area law to perimeter law in a system of random surfaces. Commun. Math. Phys. 92, 19–69 (1983)
[Aiz97] Aizenman, M.: On the number of incipient spanning clusters. Nucl. Phys. B[FS] 485, 551–582 (1997)
[Ald97] Aldous, D.: Brownian excursions, critical random graphs and the multiplicative coalescent. Ann. Probab. 25, 812–854 (1997)
[Ale96] Alexander, K.: Private communication (1996)
[AN84] Aizenman, M. and Newman, C.M.: Tree graph inequalities and critical behavior in percolation models. J. Stat. Phys. 36, 107–143 (1984)
[AS92] Alon, N. and Spencer, J.: The Probabilistic Method. New York: Wiley Interscience, 1992
[BBCK98] Bollobás, B., Borgs, C., Chayes, J.T. and Kim, J.-H.: Unpublished (1998)
[BCKS98] Borgs, C., Chayes, J.T., Kesten, H. and Spencer, J.: Uniform boundedness of critical crossing probabilities implies hyperscaling. Rand. Struc. Alg. 15, 368–413 (1999)
[BC96] Borgs, C. and Chayes, J.T.: On the covariance matrix of the Potts model: A random cluster analysis. J. Stat. Phys. 82, 1235–1297 (1996)
[BI92-1] Borgs, C. and Imbrie, J.: Finite-size scaling and surface tension from effective one-dimensional systems. Commun. Math. Phys. 145, 235–280 (1992)
[BI92-2] Borgs, C. and Imbrie, J.: Crossover finite-size scaling at first order transitions. J. Stat. Phys. 69, 487–537 (1992)
[BK85] van den Berg, J. and Kesten, H.: Inequalities with applications to percolation and reliability. J. Appl. Probab. 22, 556–569 (1985)
[BoK90] Borgs, C. and Kotecký, R.: A rigorous theory of finite-size scaling at first-order phase transitions. J. Stat. Phys. 61, 79–110 (1990)
[Bol84] Bollobás, B.: The evolution of random graphs. Trans. Am. Math. Soc. 286, 257–274 (1984)
[Bol85] Bollobás, B.: Random Graphs. London: Academic Press, 1985
[CC86] Chayes, J.T. and Chayes, L.: Percolation and random media. In: Random Systems and Gauge Theories, Les Houches, Session XLIII, eds K. Osterwalder and R. Stora. Amsterdam: Elsevier, 1986, pp. 1001–1142
[CC87] Chayes, J.T. and Chayes, L.: On the upper critical dimension in Bernoulli Percolation. Commun. Math. Phys. 113, 27–48 (1987)
[CCD87] Chayes, J.T., Chayes, L. and Durrett, R.: Inhomogeneous percolation problems and incipient infinite clusters. J. Phys. A: Math. Gen. 20, 1521–1530 (1987)
[CCF85] Chayes, J.T., Chayes, L. and Fröhlich, J.: The low-temperature behavior of disordered magnets. Commun. Math. Phys. 100, 399–437 (1985)
[CCFS86] Chayes, J.T., Chayes, L., Fisher, D. and Spencer, T.: Finite-size scaling and correlation lengths for disordered systems. Phys. Rev. Lett. 57, 2999–3002 (1986)
[CCGKS89] Chayes, J.T., Chayes, L., Grimmett, G.R., Kesten, H. and Schonmann, R.H.: The correlation length for the high-density phase of Bernoulli percolation. Ann. Probab. 17, 1277–1302 (1989)
[Cha98] Chayes, J.T.: Finite-size scaling in percolation. Doc. Math. J. DMV, Extra Volume ICM III, 113–122 (1998)
[CPS96] Chayes, J.T., Puha, A.L. and Sweet, T.: Independent and dependent percolation. In: IAS-Park City Mathematics Series, Vol. 6, Probability Theory and Applications (Princeton, NJ, 1996), Providence, RI: AMS, 1999, pp. 49–166
[Con85] Coniglio, A.: Shapes, surfaces and interfaces in percolation clusters. In: Proc. Les Houches Conference on Physics of Finely Divided Matter, eds M. Daoud and N. Boccara. Berlin–Heidelberg–New York: Springer-Verlag, 1985, pp. 84–109
[ER59] Erdős, P. and Rényi, A.: On random graphs I. Publ. Math. (Debrecen) 6, 290–297 (1959)
[ER60] Erdős, P. and Rényi, A.: On the evolution of random graphs. Magy. Tud. Akad. Mat. Kut. Intéz. Közl. 5, 17–61 (1960)
[GM90] Grimmett, G. and Marstrand, J.M.: The supercritical phase of percolation is well behaved. Proc. Roy. Soc. London Ser. A 430, 439–457 (1990)
[Gri99] Grimmett, G.: Percolation. 2nd edition, New York: Springer-Verlag, 1999
[Ham57] Hammersley: Percolation processes. Lower bounds for the critical probability. Ann. Math. Statist. 28, 790–795 (1957)
[Har01] Hara, T.: Critical two-point functions for nearest-neighbour high-dimensional self-avoiding walk and percolation. Preprint in preparation
[HHS01] Hara, T., van der Hofstad, R. and Slade, G.: Critical two-point functions and the lace expansion for spread-out high-dimensional percolation and related models. Preprint (2001)
[Har60] Harris, T.E.: A lower bound for the critical probability in a certain percolation process. Proc. Cambridge Philos. Soc. 56, 13–20 (1960)
[Jar00] Járai, A.: Incipient infinite percolation clusters in 2D. Preprint (2000)
[JKLP93] Janson, S., Knuth, D.E., Łuczak, T. and Pittel, B.: The birth of the giant component. Random Struc. Alg. 4, 233–358 (1993)
[Kes82] Kesten, H.: Percolation Theory for Mathematicians. Boston: Birkhäuser, 1982
[Kes86] Kesten, H.: The incipient infinite cluster in two-dimensional percolation. Probab. Theory Rel. Fields 73, 369–394 (1986)
[Kes87] Kesten, H.: Scaling relations for 2D-percolation. Commun. Math. Phys. 109, 109–156 (1987)
[Luc90] Łuczak, T.: Component behavior near the critical point of the random graph process. Rand. Struc. Alg. 1, 287–310 (1990)
[Ngu85] Nguyen, B.G.: Correlation lengths for percolation processes. PhD thesis, UCLA (1985)
[Ngu88] Nguyen, B.G.: Typical cluster size for two-dimensional percolation processes. J. Stat. Phys. 50, 715–726 (1988)
[Rus78] Russo, L.: A note on percolation. Z. Wahrsch. verw. Geb. 43, 39–48 (1978)
[SmW78] Smythe, R.T. and Wierman, J.C.: First-Passage Percolation on the Square Lattice. Berlin-Heidelberg: Springer-Verlag, 1978
[SW78] Seymour, P.D. and Welsh, D.J.A.: Percolation probabilities on the square lattice. Ann. Discrete Math. 3, 227–245 (1978)
Communicated by M. Aizenman
Commun. Math. Phys. 224, 205 – 218 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Are There Incongruent Ground States in 2D Edwards–Anderson Spin Glasses?
C. M. Newman (1), D. L. Stein (2)
(1) Courant Institute of Mathematical Sciences, New York University, New York, NY 10012, USA. E-mail: [email protected]
(2) Departments of Physics and Mathematics, University of Arizona, Tucson, AZ 85721, USA. E-mail: dls@physics.arizona.edu
Received: 3 December 2000 / Accepted: 30 April 2001
Abstract: We present a detailed proof of a previously announced result [1] supporting the absence of multiple (incongruent) ground state pairs for 2D Edwards–Anderson spin glasses (with zero external field and, e.g., Gaussian couplings): if two ground state pairs (chosen from metastates with, e.g., periodic boundary conditions) on Z2 are distinct, then the dual bonds where they differ form a single doubly-infinite, positive-density domain wall. It is an open problem to prove that such a situation cannot occur (or else to show – much less likely in our opinion – that it indeed does happen) in these models. Our proof involves an analysis of how (infinite-volume) ground states change as (finitely many) couplings vary, which leads us to a notion of zero-temperature excitation metastates, that may be of independent interest.
1. Introduction
The decades-old challenge of understanding the physical nature of laboratory spin glasses and the mathematical nature of spin glass models at low temperature continues. It is a paradigm of the wider effort to analyze the many novel features that occur in disordered systems generally. One can only hope that this effort will achieve some fraction of the successes that have been reached in understanding homogeneous systems – in and out of equilibrium – and that are epitomized by the work of Joel Lebowitz and his many collaborators. It is indeed an honor to contribute to this celebration of Joel’s first 70 years; may he live to 120.
Research partially supported by the National Science Foundation under grants DMS-98-02310 and DMS-01-02587.
Research partially supported by the National Science Foundation under grants DMS-98-02153 and DMS-01-02541.
C. M. Newman, D. L. Stein
Our focus here is entirely on the Edwards–Anderson (EA) [2] model on $\mathbb{Z}^d$, simplest of the short-ranged Ising spin glasses, with Hamiltonian
\[
H_J(\sigma) = -\sum_{\langle x,y \rangle} J_{xy}\, \sigma_x \sigma_y. \tag{1}
\]
Here $J$ denotes a specific realization of the couplings $J_{xy} = J_{\langle x,y \rangle}$, the spins $\sigma_x = \pm 1$ and the sum is over nearest-neighbor pairs $\langle x,y \rangle$ only, with the sites $x, y$ on the square lattice $\mathbb{Z}^d$. The $J_{xy}$’s are independently chosen from a symmetric, continuous distribution with unbounded support, such as Gaussian with mean zero; we denote by $\nu$ the overall disorder distribution for $J$. In this paper, we restrict attention entirely to ground states, and further, to the lowest interesting dimension, $d = 2$. Of course, for $d = 1$, and assuming as we do that the $J_{xy}$’s are continuously distributed, it is easy to see that the multiplicity of infinite-volume ground states is exactly two – i.e., a single ground state pair (GSP) of spin configurations related to each other by a global spin flip – since, in the absence of frustration, every bond can be satisfied in a ground state. We are interested in the question of whether there are infinitely many observable GSP’s. By “observable” we mean that these states can be generated without using special $J$-dependent boundary conditions. This means that by using, say, periodic boundary conditions on the $L \times L$ squares $S_L$ centered at the origin, for a sequence of $L$’s tending to infinity, also chosen in a $J$-independent way, the corresponding sequence of finite-volume GSP’s for the finite-volume Hamiltonians $H_J^{(L)}$ (when restricted to a fixed, but arbitrarily large window about the origin) will generate an empirical distribution, i.e., a histogram, that in the limit is dispersed over many GSP’s.

2. Main Result

2.1. Preliminaries: Metastates. To state a precise theorem about the GSP’s that arise in this way, we need to explain the notion of a metastate [3–6] in this zero-temperature context. We will do this in the briefest possible way here, using empirical distributions, while delaying to later sections of the paper a discussion of the fact that there are alternative definitions giving rise to the same mathematical object.
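For very small systems, the finite-volume ground state pair of the Hamiltonian (1) can be found by exhaustive search. The following is a minimal sketch of this, assuming an $L \times L$ square with periodic boundary conditions and standard Gaussian couplings. The function names `make_couplings`, `energy`, `ground_state_pair` and `unsatisfied_bonds` are hypothetical helpers introduced here; exhaustive enumeration is used only for illustration and is not the method of the paper.

```python
import itertools
import random


def make_couplings(L, seed=0):
    """Independent standard Gaussian couplings on the bonds of an L x L torus.

    Bonds are keyed by ordered pairs of sites; use L >= 3 so that each pair of
    neighbouring sites is joined by exactly one bond.
    """
    rng = random.Random(seed)
    J = {}
    for x in range(L):
        for y in range(L):
            J[((x, y), ((x + 1) % L, y))] = rng.gauss(0.0, 1.0)
            J[((x, y), (x, (y + 1) % L))] = rng.gauss(0.0, 1.0)
    return J


def energy(J, sigma):
    """Edwards-Anderson energy H_J(sigma) = - sum over bonds of J_xy s_x s_y."""
    return -sum(j * sigma[a] * sigma[b] for (a, b), j in J.items())


def ground_state_pair(J, L):
    """Brute-force minimiser; one representative of the pair {sigma, -sigma}.

    The first spin is fixed to +1, which removes the global spin-flip symmetry.
    """
    sites = [(x, y) for x in range(L) for y in range(L)]
    best_energy, best_sigma = None, None
    for rest in itertools.product((1, -1), repeat=len(sites) - 1):
        sigma = dict(zip(sites, (1,) + rest))
        e = energy(J, sigma)
        if best_energy is None or e < best_energy:
            best_energy, best_sigma = e, sigma
    return best_energy, best_sigma


def unsatisfied_bonds(J, sigma):
    """Bonds with J_xy s_x s_y < 0; this set identifies the ground state pair."""
    return {b for b, j in J.items() if j * sigma[b[0]] * sigma[b[1]] < 0}
```

With `J = make_couplings(3)` and `e0, sigma = ground_state_pair(J, 3)`, the set `unsatisfied_bonds(J, sigma)` is unchanged under the global flip of `sigma`, which is why it labels the pair rather than a single configuration. The search visits $2^{L^2 - 1}$ configurations, so it is only practical for very small $L$.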
First, we note that for a given $J$, with all couplings nonzero, a GSP $\alpha$ may be identified with the collection of unsatisfied bonds, which we regard as edges in the dual lattice. Now suppose that $L_j \to \infty$ is a sequence of scale sizes, not depending on $J$, such that for $\nu$-almost every $J$, there is a probability measure (called a metastate) $\kappa_J$, defined on the configurations $\alpha$ of GSP’s on all of $\mathbb{Z}^2$, which is the limit of the empirical distributions of the finite-volume GSP’s $\alpha_J^{(L)}$ along the sequence $L_j$ as follows: Let $D_1$ and $D_2$ be disjoint finite sets of dual edges, let $A(D_1, D_2)$ denote the event that every edge in $D_1$ is unsatisfied and every edge in $D_2$ is satisfied; let $F_J^{(M)}(D_1, D_2)$ denote the fraction of the indices $j \in \{1, \ldots, M\}$ such that all the edges of $D_1$ and $D_2$ are within the square $S_{L_j}$ and such that the GSP $\alpha_J^{(L_j)}$ obeys all the requirements of $A(D_1, D_2)$; then for every such $D_1$ and $D_2$,
\[
\lim_{M \to \infty} F_J^{(M)}(D_1, D_2) = \kappa_J(A(D_1, D_2)). \tag{2}
\]
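The empirical fraction on the left-hand side of (2) is a simple counting statistic. A minimal sketch follows, assuming each finite-volume GSP is represented by a pair of sets: the dual edges lying in the square, and the subset of those that are unsatisfied. This representation and the function name `empirical_fraction` are hypothetical and chosen only for illustration.

```python
def empirical_fraction(gsps, D1, D2):
    """Fraction of indices j = 1..M for which the event A(D1, D2) is observed.

    gsps is a list of M pairs (edges_in_square, unsatisfied_edges), one for
    each scale L_j. An index counts only if every edge of D1 and D2 lies in
    the square, every edge of D1 is unsatisfied, and every edge of D2 is
    satisfied.
    """
    hits = 0
    for edges_in_square, unsatisfied in gsps:
        inside = D1 <= edges_in_square and D2 <= edges_in_square
        if inside and D1 <= unsatisfied and not (D2 & unsatisfied):
            hits += 1
    return hits / len(gsps)


# Toy data with abstract edge labels (purely illustrative).
toy_gsps = [
    ({"a", "b"}, {"a"}),            # edge "c" is outside this square
    ({"a", "b", "c"}, {"a"}),       # a unsatisfied, b and c satisfied
    ({"a", "b", "c"}, {"a", "c"}),  # c unsatisfied
    ({"a", "b", "c"}, {"b"}),       # a satisfied
]
```

For the toy data, `empirical_fraction(toy_gsps, {"a"}, {"b"})` counts the first three entries, while `empirical_fraction(toy_gsps, {"a"}, {"b", "c"})` counts only the second, because the first square does not contain edge `"c"`.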
Thus a metastate for $T = 0$ is an ensemble of infinite-volume GSP’s that describes the asymptotic fractions of squares, along a subsequence $L_j$, for which the various GSP’s are observed (when restricted to windows of fixed, but arbitrarily large, size) within the finite-volume systems. It can be shown by compactness arguments [5, 6] that such subsequences $L_j$ exist; in fact every subsequence has such an $L_j$ as a further subsubsequence. Although it is a reasonable conjecture that any two metastates are in fact the same for almost every $J$, no general result has been proved. However, this would be an immediate corollary of the following conjecture, at least for $d = 2$, which would also imply that the metastate is supported on a single GSP for almost every $J$. We note that recent numerical results are consistent with the existence of only a single GSP in two dimensions [7, 8].

Conjecture 1. Let $J$ be chosen from the disorder distribution $\nu$ and let $\alpha$ and $\beta$ be GSP’s chosen independently from $d = 2$ periodic boundary condition metastates, $\kappa_J$ and $\kappa'_J$ (coming from subsequences $L_j$ and $L'_k$). Then, with probability one, $\alpha = \beta$.

2.2. Theorem. The main result of this paper is the proof of the following theorem, which we regard as partial verification of the above Conjecture – see the remark below. Equality of two GSP’s, $\alpha$ and $\beta$, is of course equivalent to the vanishing of the symmetric difference $\alpha \Delta \beta$, the collection of bonds that are satisfied in one of the two GSP’s and unsatisfied in the other. It is not hard to show (see Proposition 1 below) that, at least for periodic boundary conditions, the symmetric difference must consist either of a single domain wall (i.e., a doubly-infinite self-avoiding path in the dual lattice) with strictly positive density or else multiple nonintersecting domain walls which have altogether strictly positive density, but may have zero density individually. A priori, we felt (and still feel) that on a heuristic level, the former scenario for GSP multiplicity is the less plausible of the two. The next theorem rigorously eliminates the latter scenario.

Theorem 1. Let $J$ be chosen from the disorder distribution $\nu$ and let $\alpha$ and $\beta$ be GSP’s chosen independently from $d = 2$ periodic boundary condition metastates, $\kappa_J$ and $\kappa'_J$ (coming from subsequences $L_j$ and $L'_k$). Then, with probability one, either $\alpha = \beta$ or else $\alpha \Delta \beta$ is a single domain wall with strictly positive density.

Proof. This theorem will be an immediate consequence of three propositions, given in Sect. 4 of the paper.

Remark. Although Theorem 1 does not eliminate the scenario of multiple GSP’s whose symmetric differences are single positive density domain walls, we suspect that such domain walls do not in fact occur. The proof of Theorem 1 is based on showing that the presence of two or more $\alpha \Delta \beta$ domain walls would create an instability for both $\alpha$ and $\beta$ with respect to the flip of a large droplet whose boundary consists of two long segments from adjacent domain walls, connected by two short “rungs” between the walls. The stability of $\alpha$ and $\beta$ to such flips is controlled by the infimum $E$ of the necessarily positive rung energies – see Eq. (11). Proposition 3 of Sect. 4 proves instability by showing that $E = 0$, while Proposition 2 there shows that such unstable GSP’s cannot actually occur with nonzero probability. If there were a single domain wall, it would be natural to expect that, like the rungs in Proposition 3, the “pseudo-rungs” that connect sections of the domain wall that are close in Euclidean distance, but greatly separated in distance along the domain wall, could also have arbitrarily low positive energies. If these pseudo-rungs connected long pieces of the domain wall containing some fixed bond (and we emphasize that these properties have not been proved), then single domain
walls would be ruled out by an analogue of Proposition 2. The consequence would be that the periodic boundary condition metastate in the 2D EA Ising spin glass would be unique and supported on a single GSP.
2.3. Extension to other boundary conditions. The restriction to periodic boundary conditions in Theorem 1 can in fact be relaxed to allow other boundary conditions that do not depend on J. For boundary conditions such as antiperiodic that are flip-related to periodic ones, nothing needs to be done, since they yield the same metastate – see Sect. IV of [4]. To explain how other boundary conditions can be handled, we begin by noting that the significance of periodic boundary conditions is that they yield translation-invariance of various infinite-volume objects, which in turn is a crucial ingredient in the propositions of the next section. With periodic boundary conditions, translation-invariance is already valid for finite volume. For example, from the random pair $(J, \alpha_J^{(L)})$, the finite dimensional distributions of finitely many coupling values and finitely many bond satisfaction variables are unchanged under translation by y, as long as y does not translate any of the finitely many bonds in question beyond $S_L$. On the other hand, in the spirit of the empirical distribution construction of the metastate described above, one could rather consider the random pair $(J, \alpha_J^{(L)})$, with L chosen, uniformly at random, from $L_1, \ldots, L_M$. In that case, there is in a certain sense only approximate translation invariance for finite M, since the bonds typically do get translated out of $S_{L_j}$ for small j. But full translation-invariance is restored in the limit M → ∞.

For non-periodic, but still J-independent, boundary conditions, one can somewhat similarly obtain infinite-volume translation-invariance, as follows. For each L and x, let $\alpha_J^{(L,x)}$ denote the GSP in the translated square $S_L + x$ with some J-independent boundary condition, such as free or plus. Next, let $X^{(L)}$ denote a uniformly random site in $S_{L'(L)}$, where the deterministic $L'(L) \to \infty$ with, say, $L - L'(L) \to \infty$ (e.g., $L'(L) = \sqrt{L}$). Then the random pair $(J, \alpha_J^{(L, X^{(L)})})$, with L either deterministic or, alternatively, chosen uniformly at random from $L_1, \ldots, L_M$, has approximate translation-invariance, which becomes exact as L → ∞, or, alternatively, M → ∞. Using such an “average over translates” construction, one can obtain metastates coming from, e.g., free or plus boundary conditions, for which the analogue of Theorem 1 will be valid. Such averaging over translates can also be used to obtain translation-invariance for the extended notions of metastates we describe next.

3. The Excitation Metastate

An important part of the proof of Theorem 1 is based on extending the notion of metastates so as to describe how a given GSP changes as the couplings in J vary. Of course, if Conjecture 1 were true, then, at least for d = 2, there would be, for almost every J, a GSP $\alpha_J$, uniquely determined as being the one on which the periodic boundary condition metastate is supported; thus one would know how $\alpha_J$ changes even when infinitely many of the couplings in J vary. But in general, since there might be many GSP’s and perhaps even many metastates, it is not so obvious how to formulate the dependence of a given GSP in the support of a metastate even on finitely many couplings. Neither the statement of Theorem 1 nor that of our three main propositions requires this extension of metastates, but it will be needed for the proofs of the latter two of the
main propositions. This extension will be presented in detail in Sect. 5 of the paper, but we present a short exposition here, since it seems to be of independent interest. Roughly speaking, the extension requires that we keep track of not only the GSP itself, but also of all its excitations in which finitely many spins are forced to take specified values, modulo a global flip. We note that recent numerical studies of spin glasses have analyzed excitations induced in this way [9] and in more novel ways [10]. There are two types of information about our excitations that one might wish to keep track of: (a) the minimum energy cost required to force the spins, and (b) the pair of spin configurations that does the minimizing – i.e., the excited state. It actually suffices to keep track only of (a), but it is perhaps conceptually simpler to keep track of (b) as well, and we will take that tack.

Suppose A is a finite subset of $\mathbb{Z}^2$ (in this discussion, we only take d = 2 for convenience), η is a spin configuration on A and L is sufficiently large so that $A \subset S_L$. We denote by $\alpha_J^{A,\eta,(L)}$ the pair of periodic boundary condition spin configurations on $S_L$ with minimum energy subject to the constraint that they equal ±η on A. If A is empty or a singleton site, this is just the ordinary finite-volume ground state $\alpha_J^{(L)}$. We also define the excitation energy $E_J^{A,\eta,(L)}$ to be the energy of $\alpha_J^{A,\eta,(L)}$ minus the ground state energy of $\alpha_J^{(L)}$. Let B be a finite set of bonds $b = \langle x, y \rangle$ and let $J^B$ denote a realization of the couplings $J_b$ for all b ∈ B. To see how $\alpha_J^{(L)}$ and eventually $\alpha_J$ varies with $J^B$ when all other couplings are fixed, we begin by letting A = A(B) denote the set of sites that are endpoints of bonds in B and considering the excitation energies $E_J^{A,\eta,(L)}$ and corresponding excited states $\alpha_J^{A,\eta,(L)}$, for all possible spin configurations η on A. We also define

\[
H_{J^B}(\eta) = - \sum_{\langle x,y \rangle \in B} J^B_{xy}\, \eta_x \eta_y, \qquad
H_J(\eta; B) = - \sum_{\langle x,y \rangle \in B} J_{xy}\, \eta_x \eta_y,
\tag{3}
\]

and denote by $J[J^B]$ the coupling configuration in which each coupling $J_b$ of J with b ∈ B is replaced by $J_b^B$ and all other couplings are left unchanged. Then, for fixed η, $\alpha_{J[J^B]}^{A,\eta,(L)}$ does not depend on $J^B$ and

\[
\begin{aligned}
H^{(L)}_{J[J^B]}\bigl(\alpha_{J[J^B]}^{A,\eta,(L)}\bigr) - H^{(L)}_{J[J^B]}\bigl(\alpha_{J[J^B]}^{A,\eta',(L)}\bigr)
&= H^{(L)}_{J[J^B]}\bigl(\alpha_{J}^{A,\eta,(L)}\bigr) - H^{(L)}_{J[J^B]}\bigl(\alpha_{J}^{A,\eta',(L)}\bigr) \\
&= \bigl[H_{J^B}(\eta) - H_J(\eta; B)\bigr] - \bigl[H_{J^B}(\eta') - H_J(\eta'; B)\bigr] + E_J^{A,\eta,(L)} - E_J^{A,\eta',(L)}.
\end{aligned}
\tag{4}
\]

Note that $E_J^{A,\eta,(L)}$ depends on J but not on $J^B$ while $H_{J^B}(\eta)$ depends on $J^B$ but not on J. Consider now the finitely many functions, as η varies on A,

\[
h_\eta^{(L)}(J^B) \equiv E_J^{A,\eta,(L)} + H_{J^B}(\eta) - H_J(\eta; B).
\tag{5}
\]

These are affine functions of $J^B$, and if we define $\eta_J^{*(L)}(J^B)$ to be the η that minimizes $h_\eta^{(L)}(J^B)$, it follows that

\[
\alpha^{(L)}_{J[J^B]} = \alpha_J^{A,\,\eta_J^{*(L)}(J^B),\,(L)}.
\tag{6}
\]
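Equations (3)–(6) can be checked by exhaustive enumeration on a very small torus. The following is only an illustrative sketch under our own assumptions, not part of the argument: the 3 × 3 lattice, the Gaussian couplings, the particular bond set B, and the names `torus_bonds`, `energy` and `check_eq6` are all hypothetical choices made for the example. It builds the excited states for every η on A(B), forms the affine functions of Eq. (5), and compares the minimizing excited state with the ground state recomputed directly for the modified couplings.

```python
import itertools
import random


def torus_bonds(L):
    """Nearest-neighbour bonds (pairs of site indices) of the L x L torus, L >= 3."""
    bonds = []
    for x in range(L):
        for y in range(L):
            s = x * L + y
            bonds.append((s, ((x + 1) % L) * L + y))
            bonds.append((s, x * L + (y + 1) % L))
    return bonds


def energy(J, sigma):
    """H_J(sigma) = -sum over bonds of J_xy * sigma_x * sigma_y."""
    return -sum(Jb * sigma[x] * sigma[y] for (x, y), Jb in J.items())


def check_eq6(L=3, seed=0):
    rng = random.Random(seed)
    bonds = torus_bonds(L)
    J = {b: rng.gauss(0.0, 1.0) for b in bonds}

    B = [bonds[0], bonds[7]]                      # hypothetical choice of the bond set B
    A = sorted({site for b in B for site in b})   # A(B): endpoints of the bonds in B
    JB = {b: rng.gauss(0.0, 1.0) for b in B}      # new coupling values J^B
    J_new = {**J, **JB}                           # the configuration J[J^B]

    # One representative per spin-flip pair: the spin at site 0 is fixed to +1.
    configs = [(1,) + rest
               for rest in itertools.product((-1, 1), repeat=L * L - 1)]

    def eta_of(sigma):
        # Restriction of sigma to A, modulo a global flip.
        return tuple(sigma[a] * sigma[A[0]] for a in A)

    def H_B(couplings, eta):
        # The restricted Hamiltonians of Eq. (3).
        spin = dict(zip(A, eta))
        return -sum(couplings[b] * spin[b[0]] * spin[b[1]] for b in B)

    E_J = {s: energy(J, s) for s in configs}
    E0 = min(E_J.values())

    # Excited states: minimisers of H_J subject to the constraint +-eta on A.
    excited = {}
    for s in configs:
        eta = eta_of(s)
        if eta not in excited or E_J[s] < E_J[excited[eta]]:
            excited[eta] = s

    # The affine functions h_eta(J^B) of Eq. (5).
    h = {eta: (E_J[s] - E0) + H_B(JB, eta) - H_B(J, eta)
         for eta, s in excited.items()}
    eta_star = min(h, key=h.get)

    # Ground state recomputed from scratch for the modified couplings.
    ground_new = min(configs, key=lambda s: energy(J_new, s))
    energy_shift = energy(J_new, ground_new) - E0
    return excited[eta_star], ground_new, h[eta_star], energy_shift
```

In this sketch the first two returned configurations should coincide, which is the finite-volume content of Eq. (6), and the minimum of the affine functions should equal the change in ground state energy caused by replacing J with $J[J^B]$.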
When letting L → ∞, we will do so for the ground state $\alpha_J$ and simultaneously for the excitation energies $E_J^{A,\eta}$ and excited states $\alpha_J^{A,\eta}$ for all choices of finite A and spin configurations η on A; a superscript $\sharp$ will denote that collection of choices. Of course, this needs to be done via a metastate construction that extends the “ground metastate” $\kappa_J$ described earlier, to what we will call the excitation metastate $\kappa_J^{\sharp}$. The excitation metastate is a probability measure on infinite-volume excitation energies and states for the given J, $(E^{\sharp}, \alpha^{\sharp})$, which includes the ground metastate since the ground state α can be obtained by restricting $\alpha^{\sharp}$ to A being the empty set (or a singleton, since we are dealing with periodic boundary conditions that do not break spin-flip symmetry). To see how the ground state α changes to $\alpha[J^B]$ when the couplings in a fixed finite B vary, we can then use the infinite-volume extensions of our last two displayed equations (where $H_{J^B}(\eta)$ and $H_J(\eta; B)$ are as before):

\[
h_\eta(J^B) \equiv E^{A(B),\eta} + H_{J^B}(\eta) - H_J(\eta; B),
\tag{7}
\]

and

\[
\alpha[J^B] = \alpha^{A(B),\,\eta^*(J^B)},
\tag{8}
\]

where $\eta^*(J^B)$ is the η on A(B) that minimizes $h_\eta(J^B)$.
4. The Main Propositions

In this section, we present the three central propositions leading immediately to Theorem 1. The proof of the first of these, a direct application to spin glasses of general 2D percolation results of Burton and Keane [11], will be given in this section. The proofs of the second and third propositions will be given in Sect. 6.

We begin with a somewhat more detailed discussion of ground metastates than given in the last section. For simplicity, we continue to restrict the discussion to periodic boundary condition metastates, as in Sect. 2. An (infinite-volume) ground state pair or GSP for a given coupling realization J is a pair of spin configurations ±σ on $\mathbb{Z}^d$, whose energy, governed by Eq. (1), cannot be lowered by flipping any finite subset of spins. That is, it must satisfy the constraint

\[
\sum_{\langle x,y \rangle \in C} J_{xy}\, \sigma_x \sigma_y \ge 0
\tag{9}
\]

along any closed loop C in the dual lattice. Infinite-volume ground states are always the limits of finite volume ground states, but, in general, the finite-volume boundary conditions may need to be carefully chosen, depending on J and/or the limiting ground state.

In a disordered system, if there are many distinct GSP’s for typical fixed J, then in general, as noted in [12], the limit $\lim_{L \to \infty} \alpha_J^{(L)}$ doesn’t exist, if the L’s are chosen in a coupling-independent way. This phenomenon was called chaotic size dependence [12]. The ground metastate, a probability measure $\kappa_J$ on the infinite-volume ground states $\alpha_J$, was proposed in [5] as a means of analyzing the way in which $\alpha_J^{(L)}$ samples from its various possible limits as L → ∞. (The metastate was introduced and defined for both zero and positive temperatures, but we confine the discussion here to zero temperature.) The same metastate can be constructed by at least two distinct approaches. The first,
introduced earlier by Aizenman and Wehr (AW) [13], directly employs the randomness of the J’s, while the “empirical distribution” approach of [5] and subsequent papers was motivated by, but doesn’t require, the potential presence of chaotic size dependence for fixed J. The empirical distribution point of view (and its natural extension to excitation metastates) will be the primary one used throughout this paper. However, we briefly describe the AW construction, since it is the one that directly gives, for, e.g., periodic boundary conditions, the translation invariance that will be crucial in our first proposition; for more details see [13]. Here one considers, for each L, the random pair $(J, \alpha_J^{(L)})$ (where $\alpha_J^{(L)}$ is the finite-volume periodic boundary condition GSP obtained using the restriction $J^{(L)}$ of J to $S_L$), and takes the limit of the finite-dimensional distributions along a J-independent subsequence of L’s, using compactness. This yields a probability distribution K on infinite-volume (J, α)’s which is translation invariant, under simultaneous lattice translations of J and α, because of the periodic boundary conditions, and is such that the conditional distribution $\tilde{\kappa}_J$ of α given J is supported entirely on GSP’s for that J. The conditional distribution $\tilde{\kappa}_J$ is the AW ground metastate. It is easy to show that there is sequential compactness leading to convergence for J-independent subsequences of L’s, as described above. We have conjectured [6] that all subsequence limits are the same; i.e., that existence of a limit does not require taking a subsequence. Proving this conjecture remains an open problem.

The empirical distribution approach of [3, 5, 6], as described in Sect. 2, takes a fixed J and, roughly speaking, replaces the “J-randomness” used in the AW construction of $\tilde{\kappa}_J$ with “L-randomness” – i.e., with chaotic size dependence. The empirical distributions along a subsequence $(L_1, L_2, \ldots)$ are the measures

\[
\kappa_J^M = \frac{1}{M} \sum_{k=1}^{M} \delta_{\alpha_J^{(L_k)}},
\tag{10}
\]

where $\delta_\alpha$ denotes the Dirac delta measure at the state α and where for convenience we regard the finite-volume GSP $\alpha_J^{(L)}$ as defined in infinite volume by, e.g., taking all bonds outside $S_L$ as satisfied. We say that $\kappa_J^M$ has a limit $\kappa_J$ if the probability of any event $A(D_1, D_2)$ (that every edge in $D_1$ is unsatisfied and every edge in $D_2$ is satisfied, where $D_1$ and $D_2$ are disjoint finite sets of dual edges) converges to the $\kappa_J$-probability of that event. It was shown in [6] that there exists a J-independent subsubsequence where the limits $\tilde{\kappa}_J$ and $\kappa_J$ are the same. For more details and proofs, see [3, 5, 6]. Also see [4] for additional properties of the metastate, particularly invariance with respect to gauge-related boundary conditions.

Before we state Proposition 1, some additional definitions are needed. Consider a periodic boundary condition metastate $\kappa_J$ (in some fixed dimension, not necessarily two) and two GSP’s α and β chosen from $\kappa_J$. Then their symmetric difference $\alpha \Delta \beta$, as introduced in Sect. 2, is the set of edges in the dual lattice $\mathbb{Z}^{d*}$ that are satisfied in α and not β or vice-versa. If B is the graph whose edge set is $\alpha \Delta \beta$ and whose vertices are all sites in $\mathbb{Z}^{d*}$ touching $\alpha \Delta \beta$, then a domain wall, defined relative to the two GSP’s, is a cluster (i.e., a maximal connected component) of B. (In two dimensions, according to Proposition 1, domain walls are generically doubly-infinite self-avoiding paths in the dual lattice.) The symmetric difference $\alpha \Delta \beta$ is the union of all $\alpha \Delta \beta$ domain walls and
may consist of a single domain wall or of multiple domain walls that are site-disjoint and hence also edge-disjoint.

Two distinct GSP’s α and β are said to be incongruent if $\alpha \Delta \beta$ has a well-defined nonvanishing density within the set of all edges in $\mathbb{Z}^{d*}$; if the density is zero, they are regionally congruent. We do not consider here the case where the density is not well-defined; we will see from Proposition 1 that in fact this cannot happen in two dimensions. In Proposition 1, we will also see that, if there are multiple GSP’s, the “observable” ones are incongruent. Our primary interest is therefore in the question of existence of these “physical” incongruent states, which should be observable by using coupling-independent boundary conditions. As mentioned in Sect. 2, incongruent states may consist of a single positive-density wall, or else of multiple domain walls, which individually may or may not have positive density, but collectively have strictly positive density.

In all our propositions, J is chosen from the disorder distribution ν and then α and β are GSP’s chosen independently from periodic boundary condition metastates $\kappa_J$ and $\kappa'_J$ (which may be the same), as described above.

Proposition 1 ([1, 11]). Distinct α and β in any dimension must, with probability one, be incongruent. In two dimensions, all domain walls comprising $\alpha \Delta \beta$ have the following properties with probability one:
(i) they are infinite and contain no loops or dangling ends;
(ii) they cannot branch and thus are doubly-infinite self-avoiding paths;
(iii) they together partition $\mathbb{Z}^2$ into at most two topological half-spaces and/or a finite or infinite number of doubly-infinite topological strips (that also cannot branch – i.e., each strip has two boundary domain walls and exactly one neighboring strip or half-space on each side).
(iv) Moreover, each domain wall has a well-defined density and there cannot simultaneously be positive-density and zero-density walls.

Proof of Proposition 1. Let us denote by $D_J$ the probability measure on configurations of $\alpha \Delta \beta$ corresponding to choosing α and β independently from $\kappa_J$ and $\kappa'_J$, and denote by D the measure then obtained by integrating out the couplings J with respect to the disorder distribution ν. We claim that D is translation-invariant. To see this, begin with the translation-invariant measures on joint configurations of couplings and GSP’s $K$ $(= \nu \kappa_J)$ and $K'$ $(= \nu \kappa'_J)$ and note that the natural coupling $\nu \kappa_J \kappa'_J$, a measure on (J, α, β) configurations, retains translation-invariance. D is then translation-invariant since it is just the distribution of $\alpha \Delta \beta$ with (α, β) distributed as the marginal of this coupled measure. The translation-invariance of D in turn implies by the ergodic theorem with respect to $\mathbb{Z}^{2*}$-translations that any “geometrically defined event”, such as a bond belonging to a domain wall, occurs either nowhere or else with strictly positive density. This proves the first claim.

To prove property (i), we note that a domain wall taken from $\alpha \Delta \beta$ separates regions in which the spins of α and β agree from regions where they disagree. A domain wall therefore cannot end at a point in any finite region. To rule out loops, note that the sum $\sum_{\langle x,y \rangle} J_{xy}\, \sigma_x \sigma_y$ along any such loop must have opposite signs in the two GSP’s, violating Eq. (9), unless the sum vanishes. But this occurs with zero probability because the couplings are chosen independently from a continuous distribution.

Claims (ii), (iii), and (iv) are proven in [11], using percolation-theoretic arguments first presented in [14]; we sketch the arguments. To prove (ii), suppose that a domain wall
branches at some site z in the dual lattice. (We note, although it’s not needed for the proof, that the number of branches emanating from z must be even, again because domain walls separate regions of spin configuration agreement from regions of disagreement. Hence the minimal branching at z is four.) None of these branches may intersect somewhere else, by property (i). By the translation-invariance of D, there must then be a positive density of branch points, so that the domain wall would have a treelike structure. That implies the existence of an ε > 0 such that the boundary of $S_L$ is intersected by a number of distinct branches that grows as $\varepsilon L^2$ as L → ∞, which is impossible. The proof of (iii) uses a similar argument to rule out branching of the strips – see Theorem 2 of [11] for details.

Property (iv) is not needed for subsequent arguments, but is included for completeness; it is proven in Theorem 4 of [11] and follows readily from the properties just proven. If zero-density and positive-density clusters coexist, then for some p > 0, there is positive D-probability that the origin of the dual lattice is contained in a zero-density domain wall with an adjacent wall of density at least p. Let $S_p$ be the set of all walls with density greater than or equal to p. Then there can be no more than 1/p walls in $S_p$. The maximum number of walls of density zero that are adjacent to walls belonging to $S_p$ (i.e., if every $S_p$-wall is surrounded by two zero-density walls whose other adjacent wall does not belong to $S_p$) is therefore 2/p. But then the union of such zero-density walls has density zero and so the probability of the event that the origin is contained in a zero-density wall adjacent to a wall in $S_p$ is zero, leading to a contradiction. This completes the proof of the proposition.

So the picture we now have of the symmetric difference $\alpha \Delta \beta$ is a union of one or more doubly infinite domain walls. These domain walls do not branch or have any internal loops, and they divide the plane into strips or (if there are positive-density domain walls) half-planes. In all cases where there is more than a single domain wall, translation-invariance of D implies that distinct domain walls mostly remain within an O(1) distance of one another. E.g., there can be no “hourglass”, “martini glass”, etc., domain wall configurations; these can be ruled out by arguments similar to those used in the proof of part (ii) of Proposition 1.

The essential idea behind the proof of Theorem 1 is contained in the next two propositions. Before we state these propositions, we need to introduce the notion of a “rung” between adjacent domain walls. A rung R, defined with respect to $\alpha \Delta \beta$, is a path of edges in $\mathbb{Z}^{2*}$ connecting two distinct domain walls, with only the first and last sites in R on any domain wall. So R can contain only edges that are not in $\alpha \Delta \beta$, and the corresponding couplings are therefore either both satisfied or both unsatisfied in α and β. The energy $E_R$ of R is defined to be

\[
E_R = \sum_{\langle xy \rangle \in R} J_{xy}\, \sigma_x \sigma_y,
\tag{11}
\]
with σx σy taken from α or equivalently β. It must be that ER > 0 with probability one for the following reasons, which we sketch here and make precise later in the proof of Proposition 2. Suppose that a rung could be found with negative energy (there is zero probability of a zero-energy rung); by translation-invariance there would need to be many such rungs between some fixed pair of adjacent domain walls. Consider the “rectangle” formed by two such negative-energy rungs and the connecting segments of the two adjacent domain walls. The sum of Jxy σx σy along the couplings in the domain wall segments would be positive in one GSP (say, α), and would therefore be negative in the other (say, β). Therefore, the loop formed by the boundary of this rectangle would violate Eq. (9) in GSP β.
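The ground state constraint of Eq. (9), on which the positivity argument above rests, can be illustrated numerically in finite volume. The sketch below is an assumption-laden toy check rather than anything used in the proofs: it takes a 3 × 3 torus with Gaussian couplings (our choices), finds the ground state by exhaustive search, and then evaluates the sum of $J_{xy}\sigma_x\sigma_y$ over the boundary bonds of every droplet (every nonempty proper subset of sites). On a torus such a boundary need not be a single contractible loop, so this is the finite-volume stability condition rather than a literal check of Eq. (9). The names `torus_bonds` and `min_droplet_boundary_sum` are hypothetical.

```python
import itertools
import random


def torus_bonds(L):
    """Nearest-neighbour bonds (pairs of site indices) of the L x L torus, L >= 3."""
    bonds = []
    for x in range(L):
        for y in range(L):
            s = x * L + y
            bonds.append((s, ((x + 1) % L) * L + y))
            bonds.append((s, x * L + (y + 1) % L))
    return bonds


def min_droplet_boundary_sum(L=3, seed=1):
    """Smallest boundary sum of J_xy * sigma_x * sigma_y over all droplets."""
    rng = random.Random(seed)
    J = {b: rng.gauss(0.0, 1.0) for b in torus_bonds(L)}
    n = L * L

    def energy(sigma):
        return -sum(Jb * sigma[x] * sigma[y] for (x, y), Jb in J.items())

    # Exhaustive ground state search, one representative per spin-flip pair.
    configs = ((1,) + rest for rest in itertools.product((-1, 1), repeat=n - 1))
    sigma = min(configs, key=energy)

    smallest = float("inf")
    # Every nonempty proper subset of sites is a candidate droplet.
    for mask in range(1, 2 ** n - 1):
        inside = [(mask >> s) & 1 for s in range(n)]
        # Flipping the droplet changes the energy by twice this boundary sum.
        boundary_sum = sum(Jb * sigma[x] * sigma[y]
                           for (x, y), Jb in J.items()
                           if inside[x] != inside[y])
        smallest = min(smallest, boundary_sum)
    return smallest
```

For couplings drawn from a continuous distribution the returned minimum should be strictly positive, mirroring the statement that a ground state cannot lower its energy by flipping any finite set of spins.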
It is then natural to ask the deeper question of whether rung energies along any strip are strictly bounded away from zero, or whether their infimum is exactly zero. Propositions 2 and 3 address this question.

Proposition 2. The rung energies $E_R$ between two fixed (adjacent) domain walls cannot be arbitrarily small; i.e., there is zero probability that $E = \inf_R E_R = 0$.

Proposition 3. There is zero probability that E > 0.

The contradiction between Propositions 2 and 3 leads directly to Theorem 1. These propositions will be proved in Sect. 6.

5. Transition Values and Flexibilities

In this section, we present two auxiliary propositions. They will be used in the next section to prove Propositions 2 and 3. These auxiliary propositions involve two notions, transition value and flexibility, that arise in the analysis of how a GSP changes when a single coupling, $J_b$, varies. Since this is a restricted case of the dependence of $\alpha[J^B]$ on a finite collection $J^B$ of couplings, we begin the section by providing a more detailed exposition of the excitation metastate than that given in Sect. 3 above.

Along with an empirical distribution construction of the excitation metastate $\kappa_J^{\sharp}$ as a probability measure, defined for ν-almost every J, on configurations $(E^{\sharp}, \alpha^{\sharp})$ of excitation energies and states for the given J, there is an alternative AW-type construction, as follows. For each L, consider $(J, E_J^{\sharp,(L)}, \alpha_J^{\sharp,(L)})$, where $E_J^{\sharp,(L)}$ and $\alpha_J^{\sharp,(L)}$ denote the excitation energies and states in $S_L$, with periodic boundary conditions, when the spin configuration on $A \subset S_L$ is constrained to be ±η (for all allowed A’s and η’s). As in the AW ground metastate construction, one has sequential compactness of the corresponding probability measures, $K^{\sharp,(L)}$, leading to convergence of the finite dimensional distributions (involving finitely many couplings, finitely many finite A’s and finitely many η’s) to those of a limiting translation-invariant measure $K^{\sharp}$ on infinite-volume configurations $(J, E^{\sharp}, \alpha^{\sharp})$ along deterministic subsequences of L’s. The marginal distribution of J from this $K^{\sharp}$ is of course just ν and the conditional distribution of $(E^{\sharp}, \alpha^{\sharp})$ given J is then an excitation metastate $\tilde{\kappa}_J^{\sharp}$, which, as in the ground metastate case, can be shown for ν-almost every J to equal the $\kappa_J^{\sharp}$ constructed via empirical distributions, as the limit along a subsubsequence of

\[
\frac{1}{M} \sum_{k=1}^{M} \delta_{\bigl(E_J^{\sharp,(L_k)},\, \alpha_J^{\sharp,(L_k)}\bigr)}.
\tag{12}
\]

The translation-invariance of $K^{\sharp}$ follows, as usual, from the periodic boundary conditions. The relative compactness (tightness) for $\alpha^{\sharp,(L)}$ follows from the two-valuedness of spin variables. Finally, the relative compactness (tightness) for $E^{\sharp,(L)}$ follows from the trivial bound,

\[
\bigl|E_J^{A,\eta,(L)}\bigr| \le \sum_{A} |J_{xy}|,
\tag{13}
\]

where $\sum_A$ denotes the sum over bonds $\langle x, y \rangle$ with either x or y or both in A, together with the fact that the distribution of the $J_{xy}$’s does not change with L.
As explained in Sect. 3, for a given J, we can extract from $(E^{\sharp}, \alpha^{\sharp})$ not only the GSP α, but also $\alpha[J^B]$, which describes how the GSP changes when the couplings in a fixed finite set B of bonds vary. When B consists of a single bond $b = \langle x, y \rangle$, we write α(K; b) for the ground state that results when $J_b$ is replaced by K with all other couplings of J left unchanged. It should be clear from Equations (7) and (8) that as K varies in (−∞, +∞), the GSP α(K; b) changes exactly once (this is particularly easy to see in finite volume and the property is preserved in the excitation metastate), from its original configuration α when $K = J_b$ to a new configuration

\[
\alpha^b = \alpha^{\{x,y\},\hat{\eta}},
\tag{14}
\]

where $\hat{\eta}$ is one of the two spin configurations on {x, y} of opposite parity to the original GSP α (so that $\sigma_x \sigma_y$ is +1 in one of α and $\alpha^b$ and −1 in the other, or equivalently $J_b$ is satisfied in one and unsatisfied in the other). We call the value of K where this change happens the transition value and denote it by $K_b$. For a given b, the transition value $K_b$ and the unordered set of two GSP’s $\{\alpha, \alpha^b\}$ do not depend on the value of $J_b$, with all other couplings held fixed (again, this is clear for finite volume, and is preserved in the limit). This means that with respect to the probability measure $K^{\sharp}$ on infinite-volume configurations $(J, E^{\sharp}, \alpha^{\sharp})$, the random variables $K_b$ and $J_b$ are independent. The next proposition is an immediate consequence of this independence.

Proposition 4. With probability one, no coupling $J_b$ is exactly at its transition value $K_b$.

Proof of Proposition 4. From the independence of $J_b$ and $K_b$, and the continuity of the distribution of $J_b$, it follows that there is probability zero that $J_b - K_b = 0$.

As in the proof of the last proposition, we continue to work on the probability space of $(J, E^{\sharp}, \alpha^{\sharp})$ configurations with probability measure $K^{\sharp}$. When the value of $J_b$ is moved from its original value past the transition value $K_b$, the change from the original ground state of α to the new ground state, and originally excited state, of $\alpha^b$ may involve the flipping of a finite droplet (region of $\mathbb{Z}^2$) or one or more infinite droplets. Thus the symmetric difference $\alpha \Delta \alpha^b$, representing the dual bonds which change from satisfied to unsatisfied or vice-versa, may consist of a single finite loop or else of one or more infinite disconnected paths, but in all cases some part must pass through b since its satisfaction status clearly changes. To help analyze what other bonds $\alpha \Delta \alpha^b$ may or may not pass through, we introduce the notion of flexibility. The flexibility of a bond $b = \langle x, y \rangle$ is defined as

\[
F_b \equiv |K_b - J_b| = \tfrac{1}{2}\, \bigl|E^{\{x,y\},\hat{\eta}}\bigr|
\tag{15}
\]

and thus is proportional to the excitation energy needed to flip the relative sign of the spins at x and y; it is a measure of the stability of the ground state α with respect to fluctuations of the single coupling $J_b$.

Proposition 5. For two bonds a and b, there is zero probability that $F_b > F_a$ and simultaneously $\alpha \Delta \alpha^a$ passes through b.

Proof of Proposition 5. For finite L, and a bond e in $S_L$, let us denote by $F_e^{(L)} \equiv |J_e - K_e^{(L)}|$ the finite-volume flexibility. Now $F_e^{(L)}$ is clearly the minimum, over all droplets in $S_L$, with periodic boundary conditions, whose boundary passes through e, of (half the) droplet flip energy cost in the GSP $\alpha^{(L)}$. Since this is the case for both e = a
and e = b, it is an immediate consequence that the finite-volume droplet boundary $\alpha^{(L)} \Delta \alpha^{a,(L)}$ cannot pass through b if $F_b^{(L)} > F_a^{(L)}$. After L → ∞, the characterization of $F_e$ as a minimum over finite droplets may be lost, but we claim that the conclusion of the proposition still holds. This is because, although the convergence of $K^{\sharp,(L)}$ along a subsequence to $K^{\sharp}$ is not sufficient to imply, e.g., that the probability of $F_b^{(L)} > F_a^{(L)}$ converges along the subsequence to the limiting probability of $F_b > F_a$, it is sufficient to imply that the probability of the event in the proposition is less than or equal to the lim inf of the (zero) probability of the corresponding finite-volume events. This completes the proof of the proposition.
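The finite-volume notions of transition value and flexibility can be made concrete by exhaustive search on a small torus. The sketch below is illustrative only and rests on our own assumptions: a 3 × 3 periodic lattice, Gaussian couplings in the usage shown, and the hypothetical helper names `torus_bonds`, `flexibilities` and `prop5_violations`. For each bond it finds the lowest-energy configuration with the opposite parity on that bond, takes half the excitation energy as the flexibility in the sense of Eq. (15), and places the transition value on the appropriate side of the coupling. It then counts pairs of bonds that would violate the finite-volume statement used in the proof of Proposition 5.

```python
import itertools
import random


def torus_bonds(L):
    """Nearest-neighbour bonds (pairs of site indices) of the L x L torus, L >= 3."""
    bonds = []
    for x in range(L):
        for y in range(L):
            s = x * L + y
            bonds.append((s, ((x + 1) % L) * L + y))
            bonds.append((s, x * L + (y + 1) % L))
    return bonds


def flexibilities(J, n):
    """Return the ground state and, per bond b, the triple (F_b, K_b, alpha^b)."""
    configs = [(1,) + rest for rest in itertools.product((-1, 1), repeat=n - 1)]
    E = {s: -sum(Jb * s[x] * s[y] for (x, y), Jb in J.items()) for s in configs}
    alpha = min(E, key=E.get)
    data = {}
    for (x, y), Jb in J.items():
        parity = alpha[x] * alpha[y]
        # Excited state: lowest energy with the opposite parity on the bond.
        alpha_b = min((s for s in configs if s[x] * s[y] != parity), key=E.get)
        F = 0.5 * (E[alpha_b] - E[alpha])   # half the excitation energy
        K = Jb - parity * F                 # transition value, so |K - J_b| = F
        data[(x, y)] = (F, K, alpha_b)
    return alpha, data


def prop5_violations(alpha, data):
    """Count pairs (a, b) with F_b > F_a whose bond b changes status in alpha^a."""
    count = 0
    for a, (Fa, _, alpha_a) in data.items():
        for (x, y), (Fb, _, _) in data.items():
            changed = alpha_a[x] * alpha_a[y] != alpha[x] * alpha[y]
            if (x, y) != a and Fb > Fa and changed:
                count += 1
    return count
```

A possible use: draw couplings with `rng = random.Random(2)` and `J = {b: rng.gauss(0.0, 1.0) for b in torus_bonds(3)}`, call `alpha, data = flexibilities(J, 9)`, and inspect `prop5_violations(alpha, data)`, which the finite-volume argument predicts to be zero. Replacing a single coupling by a value just beyond its computed transition value and recomputing the ground state should return the corresponding excited state.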
6. Proof of Propositions 2 and 3
Proof of Proposition 2. Suppose that there are two adjacent domain walls from the GSP’s α and β, $W_1$ and $W_2$, with $W_1$ passing through the origin of the dual lattice, and suppose further that the infimum E of rung energies $E_R$ for rungs R between $W_1$ and $W_2$ is zero. Our object is to prove that this event has zero probability. If the probability is nonzero, then for every ε > 0 there is some ℓ(ε) < ∞ so that, with nonzero probability, there is a rung R between $W_1$ and $W_2$, with the property P(ε), that its length, defined as the number of bonds, is below ℓ(ε) and its energy $E_R$ is below ε. But then, by translation-invariance and the lemma given right after this proof, there must, with nonzero probability, be infinitely many such rungs with property P(ε) with starting points on $W_1$ in both directions from the origin along $W_1$. Thus we can find two such rungs R and R′, one in each direction, and sufficiently far apart that they do not touch each other. Consider the “rectangular” region of $\mathbb{Z}^2$ whose boundary is the union of these two rungs and the connecting segments, $C_1$ and $C_2$, of $W_1$ and $W_2$. The energy cost of flipping the spins in this region in α (respectively, in β) is $+E(C_1, C_2) + E_R + E_{R'}$ (respectively, $-E(C_1, C_2) + E_R + E_{R'}$). Both these quantities must be positive since both α and β are GSP’s; hence $|E(C_1, C_2)|$ is bounded by $E_R + E_{R'} < 2\varepsilon$ and the energy costs in both ground states are bounded by 4ε. This implies that every bond b that $W_1$ (or $W_2$) passes through has flexibility less than 2ε. Since ε is arbitrary, the flexibilities must be zero, but that would contradict Proposition 4. This, together with the following lemma, completes the proof.

Lemma 1. Suppose P is a translation-invariant property of rungs, e.g., the property that the rung energy is below a certain value and/or the rung length is below a certain value.
There is zero probability that there exist two adjacent domain walls, $W_1$ and $W_2$, such that the set of starting points on $W_1$ of rungs between $W_1$ and $W_2$ that satisfy P is nonempty without being doubly infinite, i.e., along both directions of $W_1$.

Proof of Lemma 1. The proof is based entirely on the translation invariance of the measure $K^{\sharp}$. Suppose the claim of the lemma is false. Then for each site x in the dual lattice, there is nonzero probability for the event $A_x$ that there is a domain wall W passing through x and an adjacent wall W′ such that x is the last site in one of the two directions along W such that there is a rung from that site to W′ satisfying P. Since every domain wall has two directions and at most two adjacent domain walls, there can be at most four sites on any domain wall for which this event occurs. Every domain wall that intersects the square $S_L$, sitting inside the infinite lattice, must touch the boundary of the square and thus there are at most cL such domain walls for some constant c < ∞, and consequently at most 4cL sites x in $S_L$ for which $A_x$ occurs. But by the ergodic theorem for spatial translations, there is nonzero probability that the number of such sites exceeds $c'L^2$ for some constant c′ > 0. This contradiction completes the proof.

[Figure not reproduced; labelled bonds: b1, a, b2.]
Fig. 1. A rung R with $E_R = E + \delta$. The dots are sites in $\mathbb{Z}^2$, and bonds are drawn in the dual lattice. Two domain walls are solid lines and R is the dashed line. The bonds b1 and b2 have flexibility > δ. The ten dotted line bonds are super-satisfied

Proof of Proposition 3. For the proof, we need the notion of a “super-satisfied” bond $b = \langle x, y \rangle$. It is easy to see, for a given J, that b is satisfied in every ground state if $|J_{xy}| > \min\{M_x, M_y\}$, where $M_x$ is the sum of the three other coupling magnitudes $|J_{xz}|$ touching x, and $M_y$ is defined similarly. Such a bond or its dual, called super-satisfied, clearly cannot be part of a domain wall between any two GSP’s.

As in the proof of Proposition 1, but using the excitation metastates $\kappa_J^{\sharp}$ and $\kappa_J'^{\sharp}$ that extend the ground metastates from which α and β are chosen, we work in the probability space with the coupled measure $\nu \kappa_J^{\sharp} \kappa_J'^{\sharp}$. On this space, we can consider the modified ground states $\alpha[J^B]$ and $\beta[J^B]$ as any finitely many couplings are varied as well as the transition values and flexibilities for both α and β for all bonds b. Now suppose that the rung energy infimum E between some pair $W_1$, $W_2$ of domain walls satisfies E > 0 with positive probability; we show this leads to a contradiction. First we find, as in Fig. 1, a rung R and two dual bonds b1, b2 whose locations on $W_1$ are respectively in opposite directions from the starting site of R, and such that $E_R - E$, which we denote by δ, is strictly less than the flexibility values for both α and β of both b1, b2. The existence with positive probability of such an R, b1 and b2 follows from the non-vanishing of flexibilities given by Proposition 4 and translation-invariance (e.g., Lemma 1). But we also want a situation, as in Fig. 1, where all the dual lattice non-domain-wall bonds that touch $W_1$ between b1 and b2, other than the first bond a in R, are super-satisfied, and remain so regardless of changes of $J_a$ (by a bounded amount). We will call these bonds, numbering ten in Fig. 1, the “special” bonds. How do we know that
such a situation will occur with nonzero probability? If necessary, we can first adjust the signs and then increase the magnitudes (in an appropriate order) of the couplings of the special bonds, so that they first become satisfied and then super-satisfied. This can be done in an “allowed” way because of our assumption that the distribution of individual couplings has unbounded support. Also, this can be done so that α[J B ] and β[J B ] remain unchanged from α or β, and without changing ER , without decreasing any other ER (and thus without changing E or ER − E = δ) and without decreasing the flexibilities of b1 or b2 . Starting from a nonzero probability event, such an allowed change of finitely many couplings in J yields an event which still has nonzero probability. Next, suppose we move Ja toward its transition value Ka by an amount slightly greater than δ. The geometry – see, e.g., Fig. 1 – and Proposition 5 forbid the replacement of either α or β by α a or β a , because it is impossible, under the conditions given, for αα a or ββ a to connect to the end of bond a touching W1 . But this change of Ja reduces ER below ER for any R not containing a, yielding a nonzero probability event that contradicts translation-invariance (i.e., Lemma 1). This completes the proof.
References
1. Newman, C.M. and Stein, D.L.: Nature of ground state incongruence in two-dimensional spin glasses. Phys. Rev. Lett. 84, 3966–3969 (2000)
2. Edwards, S. and Anderson, P.W.: Theory of spin glasses. J. Phys. F 5, 965–974 (1975)
3. Newman, C.M. and Stein, D.L.: Metastate approach to thermodynamic chaos. Phys. Rev. E 55, 5194–5211 (1997)
4. Newman, C.M. and Stein, D.L.: Simplicity of state and overlap structure in finite volume realistic spin glasses. Phys. Rev. E 57, 1356–1366 (1998)
5. Newman, C.M. and Stein, D.L.: Spatial inhomogeneity and thermodynamic chaos. Phys. Rev. Lett. 76, 4821–4824 (1996)
6. Newman, C.M. and Stein, D.L.: Thermodynamic chaos and the structure of short-range spin glasses. In: Mathematics of Spin Glasses and Neural Networks, edited by A. Bovier and P. Picco. Boston: Birkhäuser, 1997, pp. 243–287
7. Middleton, A.A.: Numerical investigation of the thermodynamic limit for ground states in models with quenched disorder. Phys. Rev. Lett. 83, 1672–1675 (1999)
8. Palassini, M. and Young, A.P.: Evidence for a trivial ground-state structure in the two-dimensional Ising spin glass. Phys. Rev. B 60, R9919–R9922 (1999)
9. Krzakala, F. and Martin, O.C.: Spin and link overlaps in 3-dimensional spin glasses. Phys. Rev. Lett. 85, 3013–3016 (2000)
10. Palassini, M. and Young, A.P.: Nature of the spin glass state. Phys. Rev. Lett. 85, 3017–3020 (2000)
11. Burton, R.M. and Keane, M.: Topological and metric properties of infinite clusters in stationary two-dimensional site percolation. Isr. J. Math. 76, 299–316 (1991)
12. Newman, C.M. and Stein, D.L.: Multiple states and thermodynamic limits in short-ranged Ising spin glass models. Phys. Rev. B 46, 973–982 (1992)
13. Aizenman, M. and Wehr, J.: Rounding effects of quenched randomness on first-order phase transitions. Commun. Math. Phys. 130, 489–528 (1990)
14. Burton, R.M. and Keane, M.: Density and uniqueness in percolation. Commun. Math. Phys. 121, 501–505 (1989)

Communicated by M. Aizenman
Commun. Math. Phys. 224, 219 – 253 (2001)
Finite-Volume Fractional-Moment Criteria for Anderson Localization

Michael Aizenman (1,2), Jeffrey H. Schenker (2), Roland M. Friedrich (3), Dirk Hundertmark (1)

(1) Department of Physics, Princeton University, Princeton, NJ 08544, USA
(2) Department of Mathematics, Princeton University, Princeton, NJ 08544, USA
(3) Theoretische Physik, ETH-Zürich, 8093 Zürich, Switzerland
Received: 21 October 1999 / Accepted: 31 March 2000 / Revised: 30 August 2001
To Joel L. Lebowitz on the occasion of his seventieth birthday

Abstract: A technically convenient signature of localization, exhibited by discrete operators with random potentials, is exponential decay of the fractional moments of the Green function within the appropriate energy ranges. Known implications include: spectral localization, absence of level repulsion, strong form of dynamical localization, and a related condition which plays a significant role in the quantization of the Hall conductance in two-dimensional Fermi gases. We present a family of finite-volume criteria which, under some mild restrictions on the distribution of the potential, cover the regime where the fractional moment decay condition holds. The constructive criteria permit one to establish this condition at spectral band edges, provided there are sufficient “Lifshitz tail estimates” on the density of states. They are also used here to conclude that the fractional moment condition, and thus the other manifestations of localization, are valid throughout the regime covered by the “multiscale analysis”. In the converse direction, the analysis rules out fast power-law decay of the Green functions at mobility edges.

Contents
1. Introduction
   1.1 Overview
   1.2 The finite-volume criteria
2. Proofs of the Main Results
   2.1 Some useful notation
   2.2 Key lemmas
   2.3 Proofs of the main results
3. Generalizations
   3.1 Formulation of the general results
   3.2 Derivation of the general results
4. Some Implications
   4.1 Fast power decay ⇒ exponential decay
   4.2 Lower bounds for $G_\omega(x, y; E_{\mathrm{edge}} + i0)$ at mobility edges
   4.3 Extending off the real axis
   4.4 Relation with the multiscale analysis and density of states estimates
Appendix
A. Dynamical Localization
B. A Fractional Moment Bound
C. Decoupling Inequalities
   C.1 Decoupling inequalities for Green functions
   C.2 A condition for the validity of R2(s)

© 2001 Copyrights rest with the authors. Faithful reproduction of the article for non-commercial purpose is permitted.
1. Introduction

1.1. Overview. Operators with extensive disorder are known to have spectral regimes (energy ranges) where the spectrum consists of a dense collection of eigenvalues corresponding to exponentially localized eigenfunctions. This phenomenon is of relevance in different contexts; e.g., it plays a role in the conductive properties of metals [1–3], in the quantization of Hall conductance [4–8], and in the emerging subject of optical crystals [9].

Most of the mathematical results on localization for operators with random potential in dimensions d > 1 have been derived using the multiscale analysis introduced by Fröhlich and Spencer [10] (and later evolved through various other works). For discrete systems there is an alternative approach, based on the analysis of the Green function’s fractional moments [11]. This approach has so far been developed for only a subset of the localization regime, but where it applies it yields somewhat stronger conclusions (through elementary arguments).

In this work we present a further extension of that method. In particular, we derive a family of constructive finite-volume criteria for the exponential decay for the fractional moments of Green functions. This decay condition is a technically convenient characterization of localization, for it is known to imply spectral localization, absence of level repulsion, dynamical localization (in a strong exponential sense) and a related condition which plays a significant role in the quantization of the Hall conductance in two-dimensional Fermi gases. The constructive criteria are used to prove that for the discrete random operators described below all these properties hold throughout the regime of localization – if that is defined through either the criteria of the multiscale analysis or those presented here. The constructive criteria also preclude fast power-law decay of the Green functions at mobility edges.
A guiding example for the operators discussed here is the discrete Schrödinger operator, acting in $\ell^2(\mathbb{Z}^d)$:

$$ H_\omega = T + \lambda V_\omega , \tag{1.1} $$

with $T$ denoting the off-diagonal part, whose matrix elements are referred to as the hopping terms, and $V_\omega$ a random multiplication operator – referred to as the potential. The symbol $\omega$ represents a particular realization of the disorder, in this case the potential variables $\{V_\omega(x)\}$, and $\lambda$ serves as the disorder strength parameter.
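To make the objects in (1.1) concrete, the following is a minimal numerical sketch, not part of the paper's argument. It assumes that $T$ is the nearest-neighbour hopping matrix on a finite box, that the potential is i.i.d. uniform on [-1, 1], and that sites are addressed by flat indices; the function names `hopping` and `fractional_moment` are hypothetical. It estimates the fractional moment of a Green function matrix element by plain Monte Carlo averaging over the disorder.

```python
import numpy as np

def hopping(n, d):
    """Nearest-neighbour hopping matrix T on the box {0,...,n-1}^d.

    No bond leaves the box, so this is the operator restricted to a
    finite volume with the hopping terms to the outside turned off.
    """
    chain = np.eye(n, k=1) + np.eye(n, k=-1)   # T for a single chain of n sites
    T = np.zeros((1, 1))
    for _ in range(d):
        m = T.shape[0]
        # add one more coordinate direction via a Kronecker sum
        T = np.kron(T, np.eye(n)) + np.kron(np.eye(m), chain)
    return T

def fractional_moment(n, d, lam, z, s, x, y, samples, rng):
    """Monte Carlo estimate of E(|<x|(H - z)^{-1}|y>|^s) for H = T + lam*V.

    x and y are flat site indices; V is i.i.d. uniform on [-1, 1]
    (an illustrative assumption, not a choice made in the paper).
    """
    T = hopping(n, d)
    size = T.shape[0]
    rhs = np.zeros(size, dtype=complex)
    rhs[y] = 1.0
    total = 0.0
    for _ in range(samples):
        V = rng.uniform(-1.0, 1.0, size=size)
        H = T + lam * np.diag(V)
        column = np.linalg.solve(H - z * np.eye(size), rhs)  # (H - z)^{-1} e_y
        total += abs(column[x]) ** s
    return total / samples

rng = np.random.default_rng(0)
# Strong disorder on a chain of 15 sites: same site versus distance 7.
near = fractional_moment(n=15, d=1, lam=10.0, z=0.1j, s=0.5, x=7, y=7, samples=200, rng=rng)
far = fractional_moment(n=15, d=1, lam=10.0, z=0.1j, s=0.5, x=7, y=14, samples=200, rng=rng)
print(near, far)
```

At strong disorder the estimate at distance 7 should come out much smaller than the on-site value, which is the qualitative behaviour that the fractional-moment condition quantifies.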
For the discrete Schrödinger operator

$$ T_{u,v} = \begin{cases} 1 & \text{if } |u-v| = 1, \\ 0 & \text{if } |u-v| \neq 1, \end{cases} \tag{1.2} $$

and the random potential is given by a collection of independent identically distributed random variables, $\{V_\omega(x)\}_{x\in\mathbb{Z}^d}$. However, we shall also consider a more general class of operators, allowing the incorporation of magnetic fields, periodic terms, and off-diagonal disorder (see Sect. 3). We focus on the case of extensive disorder, where the distribution of the random operator $H_\omega$ is either translation invariant, or at least gauge equivalent to shifts by multiples of basic periods (i.e. invariant under periodic magnetic shifts).

Our main goal is to present a sequence of finite-volume criteria for localization, which permit one to conclude that the following fractional-moment condition is satisfied in some energy interval $[a, b] \subset \mathbb{R}$:

$$ \mathbb{E}\left( \left| \left\langle x \right| \frac{1}{H_\omega - E - i\eta} \left| y \right\rangle \right|^{s} \right) \le A(s)\, e^{-\mu(s)|x-y|} , \tag{1.3} $$

for all $E \in [a, b]$, $\eta \in \mathbb{R}$, and suitable $s \in (0, 1)$. $\mathbb{E}(\cdot)$ represents here the average over the disorder, i.e. the random potential. Needless to say, the bound (1.3) is of interest mainly in situations where the energy $E$ is within the spectrum, i.e. $[H_\omega - E]^{-1}$ is an unbounded operator and the exponential decay occurs only due to the localization of the eigenfunctions with energies within the interval $[a, b]$.

As in ref. [11], fractional powers are used in order to avoid infinity; however, the value of $0 < s < 1$ at which Eq. (1.3) is derived is of almost no importance (if Eq. (1.3) holds for a particular value of $s$, then it will hold for all $s < \tau$, where $\tau < 1$ is a number which depends only on the regularity of the probability distribution of $V_\omega(x)$, see Appendix – Lemma B.2).

For the systems considered here, Eq. (1.3) is known to imply various other properties, mentioned above, which are commonly associated with localization. More explicitly:

(i) Spectral localization ([11] – using [12]): The spectrum of $H_\omega$ within the interval $(a, b)$ is almost-surely of the pure-point type, and the corresponding eigenfunctions are exponentially localized.

(ii) Dynamical localization ([13], expanded here in Appendix A): wave packets with energies in the specified range do not spread –

$$ \mathbb{E}\left( \sup_{t\in\mathbb{R}} \left| \left\langle x \right| e^{-itH} P_{H\in[a,b]} \left| y \right\rangle \right| \right) \le \tilde{A}\, e^{-\tilde{\mu}|x-y|} . \tag{1.4} $$

(iii) Exponential decay of the projection kernel ([8]); the condition expressed in a bound similar to Eq. (1.4) for $\mathbb{E}\left( \left| \langle x | P_{H \le E} | y \rangle \right| \right)$ with $E \in [a, b]$. This condition plays an important role in the quantization of Hall conductance, in the ground state of the two dimensional electron gas with Fermi level $E_F \in [a, b]$ [7, 6, 8].

(iv) Absence of level repulsion ([14]). Minami has shown that Eq. (1.3) implies, for operators of the type considered here, that in the range $[a, b]$ the energy gaps have Poisson-type statistics.

The fractional moment condition has already been established for certain regimes: extreme energies, as well as all energies at high enough disorder [11], and also for weak disorder but far enough from the unperturbed spectrum [13]. The results presented below
permit one to extend it to band edges, provided there are sufficient “Lifshitz tail estimates” on the density of states (refs. [15–19]), and to other regimes mapped by a sequence of constructive criteria.

1.2. The finite-volume criteria. Our main results admit a number of variations. In this section we present a formulation which is natural for the prototypical example of the discrete random Schrödinger operators, i.e. Hamiltonians of the form (1.1) with $T$ the discrete Laplacian (given by (1.2)). In Sect. 3 we formulate various extensions of the results, including operators incorporating magnetic fields and operators with hopping terms of unbounded range.

The results are derived under some mild regularity assumptions on the probability distribution of the variables $\{V_\omega(x)\}_{x\in\mathbb{Z}^d}$ which form the random potential. For simplicity we address ourselves here to the IID case: the potential variables are independent with a common probability distribution $\rho(dV)$. The assumption is then that $\rho(dV)$ satisfies the regularity conditions listed below, R1(s) or R2(s). However, the independence is not essential. What matters is that the stated regularity condition be satisfied, with a uniform constant, by the conditional distribution of each of the potential variables, conditioned on arbitrary values of the other potentials. The two regularity conditions mentioned here are:

R1(s): A probability distribution $\rho(dV)$, on $\mathbb{R}$, is said to be s-regular, or to satisfy the condition R1(s) at some $0 < s \le 1$, if there exists $C < \infty$ such that

$$ \rho\big( (a - \epsilon, a + \epsilon) \big) \le C\, \epsilon^{s} . \tag{1.5} $$

R2(s): The probability distribution $\rho(dV)$ is said to have the decoupling property R2(s), with some $0 < s \le 1$, if there exists $C < \infty$ such that for any pair of functions $f$ and $g$ of the form

$$ f(V) = \frac{1}{V - a} , \qquad g(V) = \frac{V - b}{V - c} , \tag{1.6} $$

with $a, b, c \in \mathbb{C}$, the expectation of the product can be dominated as follows:

$$ \mathbb{E}\left( |f(V)|^{s}\, |g(V)|^{s} \right) \le C\, \mathbb{E}\left( |f(V)|^{s} \right) \mathbb{E}\left( |g(V)|^{s} \right) . \tag{1.7} $$

The smallest $C$ such that Eq. (1.7) holds for all $a, b, c \in \mathbb{C}$ is called here the decoupling constant for $\rho$, and is denoted by $D_s(\rho)$. A sufficient condition for R2(s) is that $\rho$ have bounded support and satisfy R1(τ) for some $\tau > 4s$ (see Appendix C; related discussion is found in refs. [11, 8]).

In Appendix B we show that given any τ-regular measure $\rho$ and any $s < \tau$, there is a finite constant $C$ such that for any $2 \times 2$ self adjoint matrix $A_{2\times 2}$,

$$ \int \rho(du)\, \rho(dv)\, \left| \left[ \left( A_{2\times 2} + \begin{pmatrix} u & 0 \\ 0 & v \end{pmatrix} \right)^{-1} \right]_{i,j} \right|^{s} \le C , \tag{1.8} $$

where $[\cdot]_{i,j}$ denotes the $i, j$ matrix element with $i, j = 1, 2$. Throughout this work, we denote by $C_s$ the smallest value of $C$ at which (1.8) holds. For $\rho(dV)$ which also satisfy R2(s) we let: $\tilde{C}_s = C_s \cdot D_s(\rho)^2$.
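As a simple illustration of condition R1(s) (a worked example added for orientation, not taken from the source), consider the uniform distribution on $[0,1]$, $\rho(dV) = \mathbf{1}_{[0,1]}(V)\, dV$. An interval of length $2\epsilon$ carries mass at most $\min\{2\epsilon, 1\}$, so for every $a \in \mathbb{R}$, every $\epsilon > 0$ and every $0 < s \le 1$,

$$ \rho\big( (a - \epsilon, a + \epsilon) \big) \le \min\{2\epsilon, 1\} \le 2\, \epsilon^{s} , $$

since $2\epsilon \le 2\epsilon^{s}$ when $\epsilon \le 1$, and $1 \le 2\epsilon^{s}$ when $\epsilon > 1$. Thus this $\rho$ satisfies R1(s) for all $0 < s \le 1$, with $C = 2$ in (1.5). By contrast, a distribution with an atom, e.g. $V = \pm 1$ with probability $1/2$ each, has $\rho\big((a-\epsilon, a+\epsilon)\big) \ge 1/2$ for $a = 1$ and arbitrarily small $\epsilon$, so it satisfies R1(s) for no $s > 0$.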
For $\Lambda \subset \mathbb{Z}^d$ we denote by $H_{\Lambda;\omega}$ the operator obtained from $H_\omega$ by “turning off” the hopping terms outside $\Lambda$. Thus, the restriction of $H_{\Lambda;\omega}$ to $\ell^2(\Lambda)$ (considered as a subspace of $\ell^2(\mathbb{Z}^d)$), is nothing but $H_\omega$ with the Dirichlet boundary conditions on the boundary of $\Lambda$. We also denote by $\Gamma(\Lambda)$ the set of the nearest-neighbor bonds reaching out of $\Lambda$ (i.e. pairs with one site in $\Lambda$ and the other outside), by $\Lambda^+$ the collection of sites within distance 1 from $\Lambda$, and by $|\Gamma(\Lambda^+)|$ the number of bonds reaching out of that set. These notions will be generalized in Sect. 2.1.

Following are our basic results for operators of the form (1.1).

Theorem 1.1. Let $H_\omega$ be a random Schrödinger operator with the probability distribution of the potential $V(x)$ satisfying the regularity condition R1(τ) and fix $s < \tau$. If for some $z \in \mathbb{C}$ (possibly real) and some finite region $\Lambda \subset \mathbb{Z}^d$ which contains the origin $0$:

$$ b(\Lambda, z) := \sup_{W \subset \Lambda} \; |\Gamma(\Lambda^+)|\, \frac{C_s}{\lambda^{s}} \sum_{\langle u,u' \rangle \in \Gamma(\Lambda)} \mathbb{E}\left( \left| \left\langle 0 \right| \frac{1}{H_{W;\omega} - z} \left| u \right\rangle \right|^{s} \right) < 1 , \tag{1.9} $$

then there are some $\mu(s) > 0$ and $A(s) < \infty$ – which depend on the energy $z$ only through the bound $b(\Lambda, z)$ – such that for any region $\Omega \subset \mathbb{Z}^d$,

$$ \mathbb{E}_{\pm i0}\left( \left| \left\langle x \right| \frac{1}{H_{\Omega;\omega} - z} \left| y \right\rangle \right|^{s} \right) \le A(s)\, e^{-\mu(s)|x-y|} . \tag{1.10} $$

The subscript of $\mathbb{E}_{\pm i0}$ in (1.10) is to be interpreted as saying that the bound is valid for either of the two limiting expressions:

$$ \lim_{\eta \searrow 0} \mathbb{E}\left( \left| \left\langle x \right| \frac{1}{H_{\Omega;\omega} - E \mp i\eta} \left| y \right\rangle \right|^{s} \right) . \tag{1.11} $$

The “cutoff” $\pm i\eta$ is needed for an unambiguous interpretation in case $z$ is a real energy ($E$) within the spectrum of $H$. For the random operators considered here it is well understood that: (i) the expectation may be exchanged with the limit $\eta \searrow 0$, (ii) it suffices to verify the uniform bounds (1.10) for finite regions, and (iii) the finite volume expectations are continuous in $\eta$. In the proofs we shall be dealing with finite systems; the subscript will, therefore, be omitted there.

Let us note that already the special case $\Lambda = \{0\}$ is of interest. It provides the following variant of the single-site criterion of ref. [11] (which is, in fact, a bit simpler since it does not invoke the decoupling lemma).

Corollary. For the random Schrödinger operator a sufficient condition for localization (1.3) is that for all $E \in [a, b]$,

$$ 2d(2d - 1)\, \frac{C_s}{\lambda^{s}} \int \frac{1}{|\lambda V - E|^{s}}\, \rho(dV) < 1 . \tag{1.12} $$

Just as the main result of ref. [11], the above criterion permits one to easily conclude localization for the cases of high disorder or extreme energies. However, we may now move beyond that. By testing the hypothesis of Theorem 1.1 in the increasing sequence of volumes $\Lambda = [-L, L]^d$, one may extend the conclusion to increasing regimes in the “energy × disorder plane”. In fact, it is easy to see that for each energy at which the strong localization condition (1.10) is satisfied, the hypothesis (1.9) will be met at all
sufficiently large $L$. (This may, however, be far from a practical test, as the necessary computation may be rather difficult for large $L$.)

Observant readers may note that the conclusion of Theorem 1.1 provides not only the localization condition Eq. (1.3), but it also rules out extended boundary states. The flip side of this observation is that if such states are present in some geometry, e.g. the half space, then the hypothesis of Theorem 1.1 will fail to be satisfied even if the operator exhibits localization in the bulk. Therefore, we present also the following result which permits one to establish bulk localization regardless of the possible presence of extended boundary states.

Theorem 1.2. Let $H_\omega$ be a random Schrödinger operator with the probability distribution of the potential $V(x)$ satisfying R1(τ) and R2(s), for some $s < \tau$. If for some $z \in \mathbb{C}$ and some finite region $0 \in \Lambda \subset \mathbb{Z}^d$,

$$ \left( 1 + \frac{\tilde{C}_s}{\lambda^{s}}\, |\Gamma(\Lambda)| \right)^{2} \sum_{\langle u,u' \rangle \in \Gamma(\Lambda)} \mathbb{E}\left( \left| \left\langle 0 \right| \frac{1}{H_{\Lambda;\omega} - z} \left| u \right\rangle \right|^{s} \right) < 1 , \tag{1.13} $$

then $H_\omega$ satisfies the fractional-moment condition (1.3), and there exist $\mu(s) > 0$, $A(s) < \infty$ so that for any region $\Omega \subset \mathbb{Z}^d$,

$$ \mathbb{E}_{\pm i0}\left( \left| \left\langle x \right| \frac{1}{H_{\Omega;\omega} - z} \left| y \right\rangle \right|^{s} \right) \le A(s)\, e^{-\mu(s)\, \mathrm{dist}_\Omega(x,y)} , \tag{1.14} $$

with

$$ \mathrm{dist}_\Omega(x, y) = \min\{ |x - y| ,\; [\mathrm{dist}(x, \partial\Omega) + \mathrm{dist}(y, \partial\Omega)] \} . \tag{1.15} $$
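The modified distance in (1.15) is easy to compute for a box. The sketch below is illustrative only and rests on assumptions not fixed by the text: the region is the box {0,...,n-1}^d, |x - y| is taken to be the l^1 distance, and the distance to the boundary is the l^1 distance to the nearest site outside the box. The function names are hypothetical.

```python
def dist_to_boundary(x, n):
    """l^1 distance from site x in the box {0,...,n-1}^d to the nearest site outside it."""
    return min(min(c + 1, n - c) for c in x)

def dist_omega(x, y, n):
    """Modified distance of Eq. (1.15): the whole boundary is treated as a single point."""
    direct = sum(abs(a - b) for a, b in zip(x, y))
    via_boundary = dist_to_boundary(x, n) + dist_to_boundary(y, n)
    return min(direct, via_boundary)

# Two sites on opposite faces of a 10 x 10 box: far apart in the bulk metric,
# but close once the boundary is collapsed to a point.
print(dist_omega((0, 5), (9, 5), 10))
# Two neighbouring sites deep inside the box: the direct distance wins.
print(dist_omega((4, 5), (5, 5), 10))
```

With this metric, a bound of the form (1.14) says nothing useful about pairs of sites that are both close to the boundary, which is what allows for possible extended boundary states.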
Let us add that, as in Theorem 1.1, $A(s)$ and $\mu(s)$ of (1.14) depend on $z$ only through the value of the LHS in Eq. (1.13). The modified metric, $\mathrm{dist}_\Omega(x, y)$, is a distance function relative to which the entire boundary of $\Omega$ is regarded as one point. It permits us to state that there is exponential decay in the bulk without ruling out non-exponential decay along the boundary. We supplement the last result by the following observation.

Theorem 1.3. Let $H_\omega$ be a random operator given by Eq. (1.1), with the probability distribution of the potential $V(x)$ satisfying R1(τ) and R2(s), for some $s < \tau$. If at some energy $E$ (or $z \in \mathbb{C}$) the localization condition (1.3) is satisfied, with some $A < \infty$ and $\mu > 0$, then for all large enough (but finite) $L$ the condition (1.13) is met for $\Lambda = [-L, L]^d$.

The statement is a bit less immediate than the analogous claim for Theorem 1.1. We shall therefore include the proof below.

It is natural to compare the above criteria for localization with those of the multiscale analysis. The two methods share the basic feature that the analysis requires an initial condition which one may expect to be met in a finite system provided its linear size is of the order of the localization length, or larger. However, for the method presented here if a suitable input is received on some scale, then the analysis can proceed using steps, or blocks, of only that size. An important difference in the results is that the fractional moment condition yields exponential decay for the expectation values, which are important for some of the conclusions listed above. Such bounds have not been derived by methods based on the multiscale analysis, since (at least without further
improvement) the bounds the latter yields on the “error terms”, i.e., the probabilities of “bad blocks”, decay not faster than $\exp[-(\log L/\log L_o)^{\alpha}]$. This rate is faster than any power of $L$, but in itself not fast enough to imply exponential bounds for the mean values. However, it should be noted that the extension of the present method to operators in the continuum, for which a number of basic localization results have been established using the multiscale analysis [20, 21, 17], is still unaccomplished. Also not covered are discrete operators with the potential assuming discrete values (e.g., $V_\omega(x) = \pm 1$ [22]).

In Sect. 4 we discuss various implications of the basic results. In particular it is shown that, for discrete random operators of the type considered here, the fractional moment condition (1.3) is satisfied throughout the regime in which the multiscale analysis applies (see Theorem 4.4). This carries the further implication that the properties listed above hold throughout the entire regime for which localization can be proven by any of the known methods. One of those properties is a strong form of dynamical localization, on which more is said in Appendix A.

2. Proofs of the Main Results

2.1. Some useful notation. The proofs of the above statements will be presented in terms which permit a direct extension to operators with more general hopping terms. We start by generalizing the notation; in particular, the sets $\Lambda^+$ and $\Gamma(\Lambda)$ will be made to depend implicitly on the operator $T$.

In the study of $H_{\Omega;\omega}$ we shall often consider “depleted” Hamiltonians, $H^{(\Gamma)}_{\Omega;\omega}$, obtained by setting to zero the operator’s non-diagonal matrix elements (hopping terms) along some collection of ordered pairs of sites (referred to here as bonds) $\Gamma \subset \mathbb{Z}^d \times \mathbb{Z}^d$. The difference is the operator $T^{(\Gamma)}$, with

$$ T^{(\Gamma)}_{x,y} = \begin{cases} T_{x,y} & \text{if } \langle x,y \rangle \in \Gamma \text{ or } \langle y,x \rangle \in \Gamma, \\ 0 & \text{if } \langle x,y \rangle \notin \Gamma \text{ and } \langle y,x \rangle \notin \Gamma, \end{cases} \tag{2.1} $$

so that

$$ H_{\Omega;\omega} = H^{(\Gamma)}_{\Omega;\omega} + T^{(\Gamma)} . \tag{2.2} $$

Typically, $\Gamma$ will be a collection of bonds which forms the “cut set” of some $W \subset \mathbb{Z}^d$, i.e., the set of bonds with $T_{x,y} \neq 0$ connecting sites in $W$ with sites in its complement. Thus we denote

$$ \Gamma(W) = \left\{ \langle u,u' \rangle \,\middle|\, u \in W,\ u' \in \mathbb{Z}^d \setminus W, \text{ and } T_{u,u'} \neq 0 \right\} , \tag{2.3} $$

and also

$$ W^+ = W \cup \left\{ u' \in \mathbb{Z}^d \,\middle|\, T_{u,u'} \neq 0 \text{ for some } u \in W \right\} . \tag{2.4} $$

The number of elements (i.e. bonds) in $\Gamma$ is denoted $|\Gamma|$. In addition, we use the “Green function” notation:

$$ G_{\Omega;\omega}(x, y; z) = \left\langle x \right| \frac{1}{H_{\Omega;\omega} - z} \left| y \right\rangle , \tag{2.5} $$
with $G^{(\Gamma)}_{\Omega;\omega}(x, y; z)$ defined correspondingly. Often, where it is obvious from context that an operator is a random variable, we shall suppress the subscript $\omega$.

In broad terms, the strategy for the proof is to derive a bound on the average Green function, of the form

$$ \mathbb{E}\left( |G_\Omega(x, y; z)|^{s} \right) \le \sum_{\langle u,u' \rangle \in \Gamma(\Lambda(x))} \gamma_{\Lambda(x)}(u, u')\, |T_{u,u'}|^{s}\; \mathbb{E}\left( \left| G^{(\Gamma(\Lambda(x)))}_\Omega(u', y; z) \right|^{s} \right) , \tag{2.6} $$

for all $y \in \mathbb{Z}^d \setminus \Lambda(x)$, where: $\Lambda(x) = \{x + y : y \in \Lambda\}$ is a finite neighborhood of $x$, translate of some fixed region $\Lambda \ni 0$, and $\gamma_{\Lambda(x)}$ is a quantity which is small when the typical values of the finite volume Green function between $x$ and the boundary of $\Lambda(x)$ are small (in a suitable sense). An inequality of the form (2.6) is particularly useful when

$$ \sum_{\langle u,u' \rangle \in \Gamma(\Lambda(x))} \gamma_{\Lambda(x)}(u, u')\, |T_{u,u'}|^{s} < 1 , \tag{2.7} $$

since in that case Eq. (2.6) is akin to the statement that $\mathbb{E}\left( |G_\Omega(x, y; z)|^{s} \right)$ is a strictly subharmonic function of $x$, as long as $|x - y| > \mathrm{diam}\, \Lambda$, and thus – if it is also uniformly bounded (which it is) – it decays exponentially.

The first step towards a bound of the form (2.6) is, naturally, the resolvent identity:

$$ G_{\Omega,\omega} = G^{(\Gamma)}_{\Omega,\omega} - G^{(\Gamma)}_{\Omega,\omega} \cdot T^{(\Gamma)} \cdot G_{\Omega,\omega} = G^{(\Gamma)}_{\Omega,\omega} - G_{\Omega,\omega} \cdot T^{(\Gamma)} \cdot G^{(\Gamma)}_{\Omega,\omega} \tag{2.8} $$

(written here in the operator form). However, one then reaches an obstacle, since the quantity whose mean needs to be estimated is a product of two Green functions which are not independent. For some time now this co-dependence has been the main obstacle on the road to an argument along the lines outlined above, since otherwise the general strategy applied here is well familiar from its various successful applications in the context of the statistical mechanics of homogeneous systems ([23–27]), and the other auxiliary tools specific to the present context have in essence been available since ref. [11].

The co-dependence problem is solved here through a second application of the resolvent identity (followed by a decoupling argument of a familiar type). In fact, a similar tactic was applied by von Dreifus to the mean correlation functions, in a study of the phase transitions in disordered ferromagnetic models [28] (as we learned from T. Spencer after the completion of the first draft of this work). The two applications of the resolvent identity, for which the depletion sets $\Gamma_1$ and $\Gamma_2$ need not coincide, may be combined by starting our argument from the identity:

$$ G_\Omega = G^{(\Gamma_1)}_\Omega - G^{(\Gamma_1)}_\Omega \cdot T^{(\Gamma_1)} \cdot G^{(\Gamma_2)}_\Omega + G^{(\Gamma_1)}_\Omega \cdot T^{(\Gamma_1)} \cdot G_\Omega \cdot T^{(\Gamma_2)} \cdot G^{(\Gamma_2)}_\Omega . \tag{2.9} $$

Readers familiar with the current techniques may note that once the middle term $G_\Omega$ is replaced by a uniform bound, the remaining expression can be made free from co-dependence by an appropriate choice of $\Gamma_1$ and $\Gamma_2$. The rest are technicalities, to which we turn next.
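The identities (2.8) and (2.9) are purely algebraic and can be checked numerically on a small example. The following sketch is an illustration added here, under assumptions not taken from the source: a chain of 8 sites with nearest-neighbour hopping, a uniform random potential, and two arbitrarily chosen single-bond cut sets. The helper names `cut` and `green` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
T = np.eye(n, k=1) + np.eye(n, k=-1)                  # hopping on a chain of n sites
H = T + 2.0 * np.diag(rng.uniform(-1.0, 1.0, n))      # H = T + lambda * V with lambda = 2
z = 0.3 + 0.5j                                        # energy off the real axis

def cut(T, bonds):
    """The operator T^(Gamma): the hopping terms along the listed bonds, in both directions."""
    Tc = np.zeros_like(T)
    for x, y in bonds:
        Tc[x, y] = T[x, y]
        Tc[y, x] = T[y, x]
    return Tc

def green(H, z):
    """Resolvent (H - z)^{-1} as a full matrix."""
    return np.linalg.inv(H - z * np.eye(len(H)))

T1 = cut(T, [(2, 3)])        # depletion set Gamma_1
T2 = cut(T, [(4, 5)])        # depletion set Gamma_2
G = green(H, z)
G1 = green(H - T1, z)        # Green function of the Hamiltonian depleted along Gamma_1
G2 = green(H - T2, z)        # Green function of the Hamiltonian depleted along Gamma_2

first_order = G1 - G1 @ T1 @ G                               # right-hand side of (2.8)
second_order = G1 - G1 @ T1 @ G2 + G1 @ T1 @ G @ T2 @ G2     # right-hand side of (2.9)

print(np.allclose(G, first_order), np.allclose(G, second_order))
# Cutting the bond (2, 3) disconnects site 0 from site 7 in the depleted operator.
print(abs(G1[0, 7]))
```

The last line illustrates why depleted Green functions are convenient: once the cut set separates two sites, the corresponding matrix element vanishes, and factors living on opposite sides of the cut depend on disjoint sets of potential variables.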
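2.2. Key lemmas. We shall now present three lemmas which will be used in the proofs of our main results. The first is a known estimate which provides the afore-mentioned uniform upper bound.

Lemma 2.1. Let $V(x)$ be a random potential satisfying the regularity condition R1(τ). Then for each $s < \tau$, any region $\Omega$, and any random operator of the form (1.1)

$$ \mathbb{E}\left( |G_\Omega(x, y; z)|^{s} \right) \le \frac{C_s}{\lambda^{s}} , \tag{2.10} $$

for all $z \in \mathbb{C}$.

The statement is an immediate consequence of a version of the Wegner estimate which we present in the appendix. (See Lemma B.1; also Eq. (2.18) below.) Next is our new bound.

Lemma 2.2. Let $H_\omega$ be a random operator given by Eq. (1.1) with the probability distribution of the potential $V(x)$ satisfying the regularity condition R1(τ), and let $W$ be a subset of $\Omega$. Then, denoting $\tilde{\Gamma} = \Gamma(W^+)$ and $\Gamma = \Gamma(W)$, for all $z \in \mathbb{C}$:

(1) The following “depleted-resolvent bound” holds for any pair of sites $x \in W$, $y \in \Omega \setminus W^+$,

$$ \mathbb{E}\left( |G_\Omega(x, y; z)|^{s} \right) \le \gamma(W) \sum_{\langle v,v' \rangle \in \tilde{\Gamma}} |T_{v,v'}|^{s}\; \mathbb{E}\left( |G_{\Omega \setminus W^+}(v', y; z)|^{s} \right) , \tag{2.11} $$

with

$$ \gamma(W) = \frac{C_s}{\lambda^{s}} \sum_{\langle u,u' \rangle \in \Gamma} |T_{u,u'}|^{s}\; \mathbb{E}\left( |G_W(x, u; z)|^{s} \right) . \tag{2.12} $$

(2) If, furthermore, the probability distribution of the potential satisfies also R2(s) then the following bound holds for any pair of sites $x \in W$, $y \in \Omega \setminus W$,

$$ \mathbb{E}\left( |G_\Omega(x, y; z)|^{s} \right) \le \sum_{\langle v,v' \rangle \in \Gamma} \gamma_x(v, v')\, |T_{v,v'}|^{s}\; \mathbb{E}\left( |G_{\Omega \setminus W}(v', y; z)|^{s} \right) , \tag{2.13} $$

with

$$ \gamma_x(v, v') = \mathbb{E}\left( |G_W(x, v; z)|^{s} \right) + \frac{\tilde{C}_s}{\lambda^{s}} \sum_{\langle u,u' \rangle \in \Gamma} |T_{u,u'}|^{s}\; \mathbb{E}\left( |G_W(x, u; z)|^{s} \right) . \tag{2.14} $$

Proof. Both results follow from the second-order resolvent identity Eq. (2.9), which yields:

$$ G_\Omega(x, y; z) = G^{(\Gamma_1)}_\Omega(x, y; z) - \left\langle x \right| G^{(\Gamma_1)}_\Omega\, T^{(\Gamma_1)}_\Omega\, G^{(\Gamma_2)}_\Omega \left| y \right\rangle + \left\langle x \right| G^{(\Gamma_1)}_\Omega\, T^{(\Gamma_1)}_\Omega\, G_\Omega\, T^{(\Gamma_2)}_\Omega\, G^{(\Gamma_2)}_\Omega \left| y \right\rangle . \tag{2.15} $$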
Fig. 2.1. Diagrammatic depiction of the bound (2.16) on $G(x, y; z)$, for $x, y \in \mathbb{Z}^d$ and $z \in \mathbb{C}$ (the figure shows the region $W$ containing $x$, the bonds $\langle u, u' \rangle$ and $\langle v, v' \rangle$, and the site $y$). The long solid lines are “depleted Green functions”, the two short segments correspond to the hopping terms ($T$) and the double line is a full Green function. Once the latter is replaced by a uniform upper bound, the expectation value of the product of the remaining terms factorizes
For the proof of the first claim, we take $\Gamma_1 = \Gamma = \Gamma(W)$ and $\Gamma_2 = \tilde{\Gamma} = \Gamma(W^+)$. Then, the first term of Eq. (2.15) is zero because $\Gamma(W)$ decouples $x$ and $y$ and the second term is zero because $\Gamma(W^+)$ decouples $W^+$ and $y$. Thus

$$ G_\Omega(x, y; z) = \sum_{\langle u,u' \rangle \in \Gamma} \; \sum_{\langle v,v' \rangle \in \tilde{\Gamma}} T_{u,u'}\, T_{v,v'}\; G^{(\Gamma)}_\Omega(x, u; z)\, G_\Omega(u', v; z)\, G^{(\tilde{\Gamma})}_\Omega(v', y; z) . \tag{2.16} $$

It follows that for any $s \in (0, 1)$,

$$ \mathbb{E}\left( |G_\Omega(x, y; z)|^{s} \right) \le \sum_{\langle u,u' \rangle \in \Gamma} \; \sum_{\langle v,v' \rangle \in \tilde{\Gamma}} |T_{u,u'}|^{s}\, |T_{v,v'}|^{s}\; \mathbb{E}\left( \left| G^{(\Gamma)}_\Omega(x, u; z)\, G_\Omega(u', v; z)\, G^{(\tilde{\Gamma})}_\Omega(v', y; z) \right|^{s} \right) . \tag{2.17} $$

(Note that for $0 < s < 1$: $|a + b|^{s} \le |a|^{s} + |b|^{s}$.)

In estimating the terms on the right-hand side of Eq. (2.17) let us consider first the conditional expectation of the central factors, $G_\Omega(u', v; z)$. Only these factors depend on the values of the potential at $u'$ and $v$, and therefore they can be replaced by their conditional expectation $\mathbb{E}\left( |G_\Omega(u', v; z)|^{s} \,\middle|\, \{V(q)\}_{q \in \Omega \setminus \{u', v\}} \right)$. As will be proven in the Appendix, under the regularity condition R1(τ) these are uniformly bounded (Lemma B.1):

$$ \mathbb{E}\left( |G_\Omega(u', v; z)|^{s} \,\middle|\, \{V(q)\}_{q \in \Omega \setminus \{u', v\}} \right) \le \frac{C_s}{\lambda^{s}} . \tag{2.18} $$

(The proof involves a reduction to a two-dimensional problem via the Krein formula, and a two-dimensional Wegner-type estimate.) Once the central factor in each expectation on the right-hand side of Eq. (2.17) is replaced by the above bound, what remains there are two independent random variables which are $|G^{(\Gamma)}_\Omega(x, u; z)|^{s} = |G_W(x, u; z)|^{s}$ and $|G^{(\tilde{\Gamma})}_\Omega(v', y; z)|^{s} = |G_{\Omega \setminus W^+}(v', y; z)|^{s}$. The expectation now factorizes, and the resulting expression yields the first claim of the lemma.
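The elementary inequality $|a + b|^{s} \le |a|^{s} + |b|^{s}$, used in passing from (2.16) to (2.17), can be verified in two lines. The following short derivation is added for completeness and is not part of the source. Let $a, b \in \mathbb{C}$, not both zero, let $0 < s < 1$, and set $t = |a|/(|a| + |b|) \in [0, 1]$. Since $t^{s} \ge t$ and $(1 - t)^{s} \ge 1 - t$ on $[0, 1]$,

$$ t^{s} + (1 - t)^{s} \ge t + (1 - t) = 1 . $$

Multiplying by $(|a| + |b|)^{s}$ and using the triangle inequality together with the monotonicity of $r \mapsto r^{s}$ gives

$$ |a|^{s} + |b|^{s} \ge (|a| + |b|)^{s} \ge |a + b|^{s} . $$

By induction the same bound holds for any finite sum, $\left| \sum_k a_k \right|^{s} \le \sum_k |a_k|^{s}$, which is the form needed for the double sum over bonds.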
For the second claim, we take +1 = +2 = + = +(W ). Once again the first term of Eq. (2.15) is zero because +(W ) decouples x and y. However, the second term is non-zero, and we obtain
E |G. (x, y; z)|s s (+) (+) ≤ |Tv ,v |s E G. (x, v; z)G. (v , y; z) v,v ∈+
+
u,u ∈+ v,v ∈+
s (2.19) (+) (+) |Tu,u |s |Tv,v |s E G. (x, u; z)G. (u , v; z)G. (v , y; z) .
At this point we may not use the previous argument, since in the last expectation V (v) affects each of the first two factors and V (u ) affects each of the last two factors. However, the dependence of each of these factors on the potentials is of a particularly simple form: they are ratios of two functions (determinants) which are separately linear in each potential variable. Using the decoupling hypotheses, i.e. the regularity conditions R1 (τ ) and R2 (s), the expectation may be bounded by the product of expectations. Specifically, we prove in Lemma C.1 that: s (+) (+) E G. (x, u; z)G. (u , v; z)G. (v , y; z) ≤
s (+) C G (x, u; z)G(+) (v , y; z)s . E . . λs
(2.20)
Once again, of two independent random variables, (+) we are left with a product G (x, u; z)s = GW (x, u; z)s and G(+) (v , y; z)s = G.\W (v , y; z)s . The fac. . torization of the remaining expectation yields the second claim of the lemma, Eq. (2.13). The above lemma provides a bound for the Green function in terms of its depleted versions. This suffices for the derivation of the first of our two main theorems (Thm 1.1). However, this does not suffice for the second theorem, Thm 1.2, for which we shall use an inequality that is linear in the original function. That “closure” will be attained with the help of the following bound on the depleted resolvent in terms of the full one. Lemma 2.3. Let H.,ω be a random operator in 2 (.), . ⊆ Z d , given by Eq. (1.1), with the probability distribution of the potential V (x) satisfying the regularity conditions R1 (τ ) and R2 (s) for some s < τ . Let W be a subset of .. Then, the following holds for any pair of sites u, y ∈ .\W , and every z ∈ C:
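\[
\mathbb{E}\left(|G_{\Omega\setminus W}(u,y;z)|^s\right) \le \mathbb{E}\left(|G_\Omega(u,y;z)|^s\right) + \frac{\widetilde{C}_s}{\lambda^s} \sum_{\langle v,v'\rangle\in\Gamma} |T_{v',v}|^s\, \mathbb{E}\left(|G_\Omega(v,y;z)|^s\right),
\tag{2.21}
\]
with $\Gamma = \Gamma(W)$ the "cut-set" of $W$.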
Proof. Starting from the first order resolvent identity, Eq. (2.8), and taking expectation values of its matrix elements, we find:
\[
\mathbb{E}\left(|G^{(\Gamma)}_\Omega(u,y;z)|^s\right) \le \mathbb{E}\left(|G_\Omega(u,y;z)|^s\right) + \sum_{\langle v,v'\rangle\in\Gamma(W)} |T_{v',v}|^s\, \mathbb{E}\left(|G^{(\Gamma)}_\Omega(u,v';z)|^s\,|G_\Omega(v,y;z)|^s\right),
\tag{2.22}
\]
where $\Gamma = \Gamma(W)$, and $G^{(\Gamma)} = G_{\Omega\setminus W}$. It suffices, therefore, to show that in the last term the factor $|G^{(\Gamma)}_\Omega(u,v';z)|^s$ may be replaced (for an upper bound) by the constant $\widetilde{C}_s/\lambda^s$.
This follows through a decoupling argument which we present in the Appendix; see Lemma C.1.

Remark. In the applications we shall use Lemmas 2.2 and 2.3 both in the stated form and in the conjugated form, with the arguments of the Green functions reversed. One form of course implies the other (at conjugate energy).

2.3. Proofs of the main results. We are now ready to derive the results stated in the Introduction. For simplicity these were stated in the context of the Schrödinger operators, for which $T$ is the discrete Laplacian. The proofs given in this section will be restricted to this case. A more generally applicable treatment is presented in the next section.

Proof of Theorem 1.1. Assume that for some $z \in \mathbb{C}$ and a finite region $\Lambda$ the smallness condition (1.9) holds. By Lemma 2.2 and translation invariance, we learn that for any region $\Omega$ and any $x, y \in \Omega$ with $y \in \mathbb{Z}^d\setminus\Lambda^+(x)$:
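\[
\mathbb{E}\left(|G_\Omega(x,y;z)|^s\right) \le b\cdot\frac{1}{|\Gamma(\Lambda^+)|} \sum_{\langle v,v'\rangle\in\Gamma(\Lambda^+(x))} \mathbb{E}\left(|G_{\Omega\setminus\Lambda^+(x)}(v',y;z)|^s\right),
\tag{2.23}
\]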
where $b = b(\Lambda, z)$ is the quantity of Eq. (1.9), and $\Lambda(x)$ is the translate of $\Lambda$ by $x$. By Lemma 2.1, each of the terms in the sum is bounded by $C_s/\lambda^s$. Since the sum is normalized by the prefactor $1/|\Gamma(\Lambda^+)|$, the inequality (2.23) permits us to improve that bound for $\mathbb{E}(|G_\Omega(x,y;z)|^s)$ by the factor $b$ $(<1)$. Furthermore, the inequality may be iterated a number of times, each iteration resulting in an additional factor of $b$. One should take note of the fact that the iterations bring in Green functions corresponding to modified domains. It is for this reason that the initial input assumption was required to hold for modified geometries, i.e. not just for $\Lambda$ but also for all its subsets. Inequality (2.23) can be iterated as long as the resulting sequences $(x, v', \ldots, v^{(n)})$ do not get closer to $y$ than the distance $L = \sup\{|u| : u \in \Lambda^+\}$. Thus:
\[
\mathbb{E}\left(|G_\Omega(x,y;z)|^s\right) \le \frac{C_s}{\lambda^s}\cdot b^{\lfloor |x-y|/L \rfloor} \le \frac{C_s}{\lambda^s b}\, e^{-\mu|x-y|},
\qquad \text{with } \mu = |\ln b|/L.
\tag{2.24}
\]
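As a rough numerical illustration of how the iteration produces the rate in Eq. (2.24), the sketch below evaluates both sides of that bound. The function names and any sample values of $b$, $L$, $C_s$ and $\lambda$ passed to them are assumptions made for illustration; they are not quantities taken from the text.

```python
import math

def decay_rate(b, L):
    """Rate mu = |ln b| / L from Eq. (2.24); requires 0 < b < 1 and L > 0."""
    if not (0.0 < b < 1.0) or L <= 0:
        raise ValueError("need 0 < b < 1 and L > 0")
    return abs(math.log(b)) / L

def iterated_bound(c_s, lam, s, b, L, r):
    """Bound after floor(r / L) iterations of (2.23), starting from C_s / lam**s."""
    return c_s / lam**s * b ** (r // L)

def exponential_bound(c_s, lam, s, b, L, r):
    """Right-hand side of (2.24): C_s / (lam**s * b) * exp(-mu * r)."""
    return c_s / (lam**s * b) * math.exp(-decay_rate(b, L) * r)
```

The extra factor $1/b$ on the right-hand side absorbs the incomplete last step of length less than $L$.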
Next, let us turn to the proof of the second theorem (Thm 1.2). The main change is that we now proceed under the assumption that the smallness condition holds for some region $\Lambda$ without requiring it to hold also in all subsets. As explained in the introduction, the difference may be meaningful if $H_\omega$ has extended boundary states in some geometry.
Proof of Theorem 1.2. Our first goal is to show that under the assumption (1.13) there is $b < 1$ such that for all pairs $\{x, y\}$ with $\Lambda(x) \subset \Omega$ and $y \in \Omega\setminus\Lambda(x)$,
\[
\mathbb{E}\left(|G_\Omega(x,y;z)|^s\right) \le b \sum_{u\in\Lambda^+(x)} P^l_x(u)\, \mathbb{E}\left(|G_\Omega(u,y;z)|^s\right),
\tag{2.25}
\]
with non-negative weights satisfying:
\[
\sum_{u\in\Lambda^+(x)} P^l_x(u) = 1.
\tag{2.26}
\]
We shall use this inequality along with its conjugate:
\[
\mathbb{E}\left(|G_\Omega(x,y;z)|^s\right) \le b \sum_{v\in\Lambda^+(y)} P^r_y(v)\, \mathbb{E}\left(|G_\Omega(x,v;z)|^s\right),
\tag{2.27}
\]
where $P^r_y(v)$ satisfy the suitable analog of the normalization condition (2.26). It is important that, unlike in the inequality (2.23), the functions which appear on the right-hand side of (2.25) and (2.27) are computed in the same domain as those on the left-hand side. The first step is by Lemma 2.2, which yields
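\[
\mathbb{E}\left(|G_\Omega(x,y;z)|^s\right) \le \sum_{\langle u,u'\rangle\in\Gamma(\Lambda(x))} \gamma_x(u,u')\, \mathbb{E}\left(|G_{\Omega\setminus\Lambda(x)}(u',y;z)|^s\right),
\tag{2.28}
\]
whenever $\Lambda(x) \subset \Omega$ and $y \in \mathbb{Z}^d\setminus\Lambda(x)$, with $\gamma_x(u,u')$ specified in Eq. (2.14).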
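Next, we apply Lemma 2.3, Eq. (2.21), to bound $\mathbb{E}(|G_{\Omega\setminus\Lambda(x)}(u',y;z)|^s)$ in terms of a sum of quantities of the form $\mathbb{E}(|G_\Omega(v,y;z)|^s)$ with $v \in \Lambda^+(x)$. The result is initially expressed as a sum over bonds:
\[
\mathbb{E}\left(|G_\Omega(x,y;z)|^s\right) \le \sum_{\langle u,u'\rangle\in\Gamma(\Lambda(x))} \gamma_x(u,u')\, \mathbb{E}\left(|G_\Omega(u',y;z)|^s\right) + \frac{\widetilde{C}_s}{\lambda^s}\, \Xi \sum_{\langle u,u'\rangle\in\Gamma(\Lambda(x))} \mathbb{E}\left(|G_\Omega(u,y;z)|^s\right),
\tag{2.29}
\]
where, using translation invariance,
\[
\Xi := \sum_{\langle u,u'\rangle\in\Gamma(\Lambda)} \gamma_0(u,u').
\]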
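Collecting terms, and pulling out normalizing factors, one may cast the inequality (2.29) in the form (2.25) with
\[
b := \sum_{\langle u,u'\rangle\in\Gamma(\Lambda(x))} \gamma_x(u,u') + \frac{\widetilde{C}_s}{\lambda^s}\,|\Gamma(\Lambda)|\,\Xi = \left(1 + \frac{\widetilde{C}_s}{\lambda^s}\,|\Gamma(\Lambda)|\right)\Xi
\tag{2.30}
\]
\[
\phantom{b :}= \left(1 + \frac{\widetilde{C}_s}{\lambda^s}\,|\Gamma(\Lambda)|\right)^{2} \sum_{\langle u,u'\rangle\in\Gamma(\Lambda)} \mathbb{E}\left(|G_\Lambda(0,u;z)|^s\right).
\tag{2.31}
\]
The smallness condition (1.13) is nothing other than the assumption that $b < 1$.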
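To make the finite-volume quantity $b$ concrete, here is a minimal Monte Carlo sketch for a one-dimensional toy case. Everything specific in it is an assumption made for illustration: the box $\Lambda = [-L, L] \subset \mathbb{Z}$ (so the cut-set has two bonds), hopping amplitude 1, a potential uniform on $[0,1)$, a non-real energy $z$ (which keeps the elimination well defined), and a caller-supplied placeholder `c_tilde` for the decoupling constant, which the code does not compute.

```python
import random

def green_function_entry(diag, z, row, col):
    """<row|(H - z)^{-1}|col> for tridiagonal H with hopping 1 and diagonal `diag`.

    Solves (H - z) g = e_col by the Thomas algorithm; requires Im z != 0.
    """
    n = len(diag)
    pivots = [complex(d) - z for d in diag]
    rhs = [0j] * n
    rhs[col] = 1 + 0j
    for i in range(1, n):                 # forward elimination
        w = 1.0 / pivots[i - 1]
        pivots[i] -= w
        rhs[i] -= w * rhs[i - 1]
    g = [0j] * n
    g[n - 1] = rhs[n - 1] / pivots[n - 1]
    for i in range(n - 2, -1, -1):        # back substitution
        g[i] = (rhs[i] - g[i + 1]) / pivots[i]
    return g[row]

def boundary_moment_sum(L, lam, z, s, samples, seed=0):
    """Monte Carlo estimate of the sum over boundary sites u = -L, +L of E|G_Lambda(0,u;z)|^s."""
    rng = random.Random(seed)
    n = 2 * L + 1                          # sites -L..L; the origin has index L
    total = 0.0
    for _ in range(samples):
        diag = [lam * rng.random() for _ in range(n)]
        total += abs(green_function_entry(diag, z, L, 0)) ** s
        total += abs(green_function_entry(diag, z, L, n - 1)) ** s
    return total / samples

def criterion_b(L, lam, z, s, c_tilde, samples=200, seed=0):
    """Estimate of b in (2.31) for d = 1, where |Gamma(Lambda)| = 2."""
    gamma_size = 2
    prefactor = (1.0 + c_tilde / lam**s * gamma_size) ** 2
    return prefactor * boundary_moment_sum(L, lam, z, s, samples, seed)
```

A value of `criterion_b` below 1 would correspond to the smallness condition, but only relative to the assumed `c_tilde`; the estimate is statistical and is no substitute for a rigorous bound.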
The above argument proves Eq. (2.25). By the transposition, or time-reflection, symmetry of $H$ ($H^T = H$) also Eq. (2.27) holds. (Such symmetry of $H$ is not essential for our analysis: it suffices to assume that the smallness condition Eq. (1.13) holds along with its transpose.)

We proceed in the proof by iterating the inequalities (2.25) and (2.27). However, an adaptation is needed in the argument which was used in the proof of Theorem 1.1, since the iteration can be carried out only as long as the two points (the arguments of the resolvent) stay at distance $L = \sup\{|u| : u \in \Lambda^+\}$ not only from each other but also from the boundary $\partial\Omega$. The relevant observation is that for every pair of sites $x, y \in \Omega$ there is a pair of integers $\{n, m\}$ such that:
1. $n + m = \mathrm{dist}_\Omega(x,y)$,
2. the ball of radius $n$ centered at $x$ and the ball of radius $m$ centered at $y$ form a pair of disjoint subsets of $\Omega$.
For the desired bound on $\mathbb{E}(|G_\Omega(x,y;z)|^s)$, we shall iterate Eq. (2.25) $n/L$ times from the left, and (2.27) $m/L$ times from the right. Similar to Eq. (2.24), we obtain:
\[
\mathbb{E}\left(|G_\Omega(x,y;z)|^s\right) \le \frac{C_s}{\lambda^s b^2}\, e^{-\mu\,\mathrm{dist}_\Omega(x,y)},
\qquad \text{with } \mu = |\ln b|/L.
\tag{2.32}
\]
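The arithmetic behind Eq. (2.32) can be spelled out in the same way as for Eq. (2.24). The following is a sketch which assumes that the iteration starts from the a priori bound $C_s/\lambda^s$ of Lemma 2.1 and that the integer parts of $n/L$ and $m/L$ are taken; it uses $b < 1$ and $n + m = \mathrm{dist}_\Omega(x,y)$.

```latex
\mathbb{E}\left(|G_\Omega(x,y;z)|^s\right)
  \le \frac{C_s}{\lambda^s}\, b^{\lfloor n/L\rfloor + \lfloor m/L\rfloor}
  \le \frac{C_s}{\lambda^s}\, b^{(n+m)/L - 2}
  = \frac{C_s}{\lambda^s b^2}\, e^{-\mu (n+m)}
  = \frac{C_s}{\lambda^s b^2}\, e^{-\mu\, \mathrm{dist}_\Omega(x,y)},
\qquad \mu = \frac{|\ln b|}{L}.
```

Each of the two incomplete last steps costs at most one factor of $1/b$, which accounts for the $b^2$ in the denominator.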
The third theorem stated in the introduction (Thm 1.3) is the claim that the condition which is shown above to be sufficient for exponential localization, in the sense of Eq. (1.3), is also a necessary one. We shall now prove this to be the case. Proof of Theorem 1.3. Suppose that Eq. (1.3) holds with some A < ∞ and µ > 0. We need to show that also in finite systems the Green function is sufficiently small between an interior point and the boundary. To bound the finite volume function in terms of the infinite volume one, we may use Lemma 2.3, by which
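\[
\sum_{\langle u,u'\rangle\in\Gamma(\Lambda)} \mathbb{E}\left(|G_\Lambda(0,u;z)|^s\right) \le \sum_{\langle u,u'\rangle\in\Gamma(\Lambda)} \mathbb{E}\left(|G(0,u;z)|^s\right) + \frac{\widetilde{C}_s}{\lambda^s}\,|\Gamma(\Lambda)| \sum_{\langle v,v'\rangle\in\Gamma(\Lambda)} |T_{v,v'}|^s\, \mathbb{E}\left(|G(0,v';z)|^s\right),
\tag{2.33}
\]
for any finite region $\Lambda$ containing the origin. We need to show that for $\Lambda = [-L, L]^d$ with $L$ large enough
\[
\left(1 + \frac{\widetilde{C}_s}{\lambda^s}\,|\Gamma(\Lambda)|\right)^{2} \sum_{\langle u,u'\rangle\in\Gamma(\Lambda)} \mathbb{E}\left(|G_\Lambda(0,u;z)|^s\right) < 1.
\tag{2.34}
\]
After applying Eq. (2.33) to the terms on the left side of Eq. (2.34) we find that the number of summands involved and their prefactors grow only polynomially in $L$, whereas under our assumption the relevant factors $\mathbb{E}(|G(0,u;z)|^s)$ are exponentially small in $L$. Hence the condition (2.34) is satisfied for $L$ large enough.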
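The competition between polynomial growth and exponential decay can be illustrated numerically. The sketch below is only an illustration under stated assumptions: nearest-neighbour hopping with $|T| = 1$, an assumed infinite-volume bound $\mathbb{E}(|G(0,u;z)|^s) \le A e^{-\mu|u|}$ in a norm with $|u| \ge \max_j |u_j|$, and a caller-supplied number `c` standing in for $\widetilde{C}_s/\lambda^s$. The function names are hypothetical.

```python
import math

def cut_set_size(L, d):
    """Number of nearest-neighbour bonds leaving the box [-L, L]^d in Z^d."""
    return 2 * d * (2 * L + 1) ** (d - 1)

def criterion_upper_bound(L, d, A, mu, c):
    """Upper bound on the left side of (2.34), obtained by inserting (2.33).

    Inner boundary sites u satisfy |u| >= L, outer sites v' satisfy |v'| >= L + 1.
    """
    gamma = cut_set_size(L, d)
    boundary_sum = (gamma * A * math.exp(-mu * L)
                    + c * gamma * gamma * A * math.exp(-mu * (L + 1)))
    return (1 + c * gamma) ** 2 * boundary_sum

def smallest_good_L(d, A, mu, c, L_max=10_000):
    """First L in 1..L_max for which the upper bound drops below 1, or None."""
    for L in range(1, L_max + 1):
        if criterion_upper_bound(L, d, A, mu, c) < 1:
            return L
    return None
```

Since the bound behaves like a polynomial in $L$ times $e^{-\mu L}$, it eventually decreases, so some finite $L$ is found whenever `L_max` is large enough.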
3. Generalizations

3.1. Formulation of the general results. We shall now turn to some generalizations of the theorems which were presented in Sect. 1.2 for the random Schrödinger operator. The setup may be extended in a number of ways.

1. Addition of magnetic fields. The hopping terms $\{T_{x,y}\}$ need not be real. In particular, the present analysis remains valid when one includes in $H_\omega$ a constant magnetic field, or a random one with a translation invariant distribution. A magnetic field is incorporated in $T_{x,y}$ through a factor $\exp(-iA_{x,y})$, with $A_{x,y}$ an anti-symmetric function of the bonds. (It represents the integral of the "vector potential" $\times(-e/\hbar)$ along the bond $\langle x, y\rangle$.) Except for the trivial case, with such a factor $T$ is no longer shift invariant. However, in the case of a constant magnetic field, $T$ will still be invariant under appropriate "magnetic shifts", which consist of ordinary shifts followed by gauge transformations. Translation-invariance plays a role in our discussion. However, since gauge transformations do not affect the absolute values of the resolvent, it suffices for us to assume that $H_\omega$ is stochastically invariant under magnetic shifts, in the sense of Definition 3.1.

2. Extended hopping terms. The discrete Laplacian may be replaced by an operator with hopping terms of unlimited range. For exponential localization we shall however require $\{T_{x,y}\}$ to decay exponentially in $|x - y|$.

3. Off-diagonal disorder. $\{T_{x,y}\}$ may also be made random. It is convenient however to assume exponentially decaying uniform bounds. The regularity conditions on the potential will now be assumed for the conditional distribution of $V(x)$ at specified off-diagonal disorder.

4. Periodicity. $H_\omega$ may also include a periodic potential, i.e., Eq. (1.1) may be modified to:
\[
H_\omega = T_{x,y;\omega} + U_{\mathrm{per}}(x) + \lambda V_\omega(x).
\tag{3.1}
\]
This may be further generalized by requiring periodicity only of the probability distribution of $H$.

5. More general lattices. In the previous discussion, the underlying sets $\mathbb{Z}^d$ may be replaced by other graphs, with suitable symmetry groups. The graph structure is relevant if the hopping terms are limited to graph edges. However, since we consider also operators with hopping terms of unlimited range, let us formulate the result for operators on $\ell^2(\mathbb{T})$ where the underlying set is of the form $\mathbb{T} = G \times S$, with $G$ a countable group and $S$ a finite set. We let $\mathrm{dist}(x, y)$ denote a metric on $\mathbb{T}$ which is invariant under the natural action of $G$ on that set. For example, this setup allows for $\mathbb{T}$ to be a Bethe lattice, or a more general Cayley lattice. (Instructive discussion of some statistical mechanical models in such settings may be found in refs. [29].)

The set $S$ is included here in order to leave room for periodic structures. We denote by $\mathcal{C}$ the "periodicity cell", which is $\{\imath\} \times S$ where $\imath$ is the identity in $G$, and by $g_x$ the "$G$-coordinate" of $x$. Thus, the lattice $\mathbb{T}$ is tiled by disjoint translates of $\mathcal{C}$, the tile containing $x$ being $g_x\mathcal{C}$. Some of the relevant concepts are summarized in the following definition.

Definition 3.1. With $\mathbb{T} = G \times S$ as above, let $H_\omega$ be a random operator on $\ell^2(\mathbb{T})$ (i.e., one with some specified probability distribution), whose off-diagonal part is denoted by $T_\omega$ and the diagonal part is referred to as the potential (for consistency, we denote it as $\lambda V_\omega$).
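1. We say that $H_\omega$ is stochastically invariant under magnetic shifts if for each $\kappa \in G$ and almost every $\omega$ there is a unitary map of the form
\[
\left(U_{\kappa,\omega}\psi\right)(x) = e^{i\phi_{\kappa,\omega}(x)}\,\psi(\kappa x),
\tag{3.2}
\]
(with some function $\phi_{\kappa,\omega}(\cdot)$) under which
\[
U^{*}_{\kappa,\omega}\, H_\omega\, U_{\kappa,\omega} \stackrel{D}{=} H_\omega,
\tag{3.3}
\]
where $\stackrel{D}{=}$ means equality of the probability distributions.

2. The operator is said to have tempered off-diagonal matrix elements, at a specified value of $s < 1$, if there is a kernel $\tau_{x,y}$, and some $m > 0$, such that $|T_{x,y;\omega}| \le \tau_{x,y}$, almost surely, and
\[
\sup_{x\in\mathbb{T}} \sum_{y\in\mathbb{T}} \tau^s_{x,y}\, e^{+m\,\mathrm{dist}(x,y)} < \infty.
\tag{3.4}
\]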
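3. We say that the potential has an $s$-regular distribution if for some $\tau > s$ the conditional distributions of $\{V_\omega(x)\}$, at specified values of the hopping term variables $\{T_{u,v;\omega}\}$, are independent and satisfy the regularity conditions $R_1(\tau)$ and $R_2(s)$ with uniform constants.

Before presenting our general theorems, Theorem 3.2 and Theorem 3.3, it is convenient to introduce notation for certain quantities which appear in their statements. For each $\Lambda \subset \mathbb{T}$ we define $\tau^s_{u,\partial\Lambda}$, "the hopping term from $u$ to the boundary", by
\[
\tau^s_{u,\partial\Lambda} = \sum_{v\in W} \tau^s_{u,v},
\tag{3.5}
\]
where $W$ is either $\Lambda$ or $\mathbb{T}\setminus\Lambda$, whichever does not contain $u$. The kernel $k_\Lambda(u,v)$ that appears in our basic bounds (see Lemma 3.4), which is a "dressed" version of $\tau^s_{u,v}$, is defined as follows:
\[
k_\Lambda(u,v) := \tau^s_{u,v}\, I[u\in\Lambda,\ v\in\mathbb{T}\setminus\Lambda] + \frac{\widetilde{C}_s}{\lambda^s}\, \tau^s_{u,\partial\Lambda}\,\tau^s_{v,\partial\Lambda}\, I[u\in\Lambda] + \left(\frac{\widetilde{C}_s}{\lambda^s}\right)^{2} \tau^s_{u,\partial\Lambda}\,\tau^s_{v,\partial\Lambda}\, E_s(\Lambda)\, I[u,v\in\Lambda],
\tag{3.6}
\]
where $E_s(\Lambda) = \sum_{u\in\Lambda} \tau^s_{u,\partial\Lambda}$. Notice that $k_\Lambda$ is concentrated on the boundary of $\Lambda$, i.e.,
\[
k_\Lambda(u,v) \le C_\Lambda\, e^{-m\,[\mathrm{dist}(u,\partial\Lambda)+\mathrm{dist}(v,\partial\Lambda)]}
\tag{3.7}
\]
where $m$ is independent of $\Lambda$ and $\mathrm{dist}(v,\partial\Lambda)$ is the distance from $v$ to whichever set, $\Lambda$ or $\mathbb{T}\setminus\Lambda$, does not contain $v$. Following is the generalization of Theorem 1.1.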
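A small numerical sketch may help in reading the three-term kernel of Eq. (3.6). It assumes a finite one-dimensional window as a stand-in for the lattice, a symmetric kernel $\tau^s_{x,y} = e^{-sa|x-y|}$ for $x \ne y$, and a caller-supplied placeholder `c` for the constant ratio multiplying the second term; all names are hypothetical. With these assumptions, summing the kernel over all $v$ for fixed $u \in \Lambda$ gives $\tau^s_{u,\partial\Lambda}\,(1 + c\,E_s(\Lambda))^2$, the kind of squared prefactor that appears in criteria such as (3.11).

```python
import math

def tau_s(x, y, a, s):
    """Assumed kernel tau_{x,y}^s = exp(-s*a*|x - y|) for x != y, zero on the diagonal."""
    return 0.0 if x == y else math.exp(-s * a * abs(x - y))

def boundary_hopping(u, region, lattice, a, s):
    """tau^s_{u,dLambda}: sum of tau^s_{u,v} over the side (region or complement) not containing u."""
    inside = u in region
    return sum(tau_s(u, v, a, s) for v in lattice if (v in region) != inside)

def dressed_kernel(u, v, region, lattice, a, s, c):
    """k_Lambda(u, v) in the three-term form of Eq. (3.6), with c an assumed constant."""
    tu = boundary_hopping(u, region, lattice, a, s)
    tv = boundary_hopping(v, region, lattice, a, s)
    e_s = sum(boundary_hopping(w, region, lattice, a, s) for w in region)
    k = 0.0
    if u in region and v not in region:
        k += tau_s(u, v, a, s)          # bare hopping across the boundary
    if u in region:
        k += c * tu * tv                # first-order dressing
    if u in region and v in region:
        k += c * c * tu * tv * e_s      # second-order dressing
    return k
```

The identity relies on the symmetry of the assumed kernel, which makes the sum of $\tau^s_{v,\partial\Lambda}$ over $v$ outside $\Lambda$ equal to $E_s(\Lambda)$.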
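Theorem 3.2. Let $H_\omega$ be a random operator on $\ell^2(\mathbb{T})$ ($\mathbb{T} = G \times S$, as above) with an $s$-regular distribution for the potential $V_\omega(\cdot)$, and with tempered off-diagonal matrix elements ($T_{x,y;\omega}$), which is stochastically invariant under magnetic shifts. Let $\mu > 0$, and assume that for some $z \in \mathbb{C}$ and a finite region $\Lambda \subset \mathbb{T}$, which contains the periodicity cell $\mathcal{C}$, the following is satisfied for all subsets $W \subset \Lambda$:
\[
\sup_{x\in\mathcal{C}} \sum_{(u,v)\in\Lambda\times(\mathbb{T}\setminus\Lambda)} \mathbb{E}\left(\left|\langle x|\frac{1}{H_{W;\omega}-z}|u\rangle\right|^s\right) k_\Lambda(u,v)\, e^{+\mu\,\mathrm{dist}(x,v)} < 1.
\tag{3.8}
\]
Then there exists $A < \infty$ such that for all $\Omega \subset \mathbb{T}$, and all $x \in \Omega$,
\[
\sum_{y\in\Omega} \mathbb{E}_{\pm i0}\left(\left|\langle x|\frac{1}{H_{\Omega;\omega}-z}|y\rangle\right|^s\right) e^{+\mu\,\mathrm{dist}(x,y)} \le A.
\tag{3.9}
\]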
Remarks. 1. Because the hopping terms are tempered as described in Definition 3.1, the bound (3.8) will be satisfied for some $\mu > 0$ provided
\[
\sup_{x\in\mathcal{C}}\,\sup_{W\subset\Lambda} \sum_{(u,v)\in\Lambda\times(\mathbb{T}\setminus\Lambda)} \mathbb{E}\left(\left|\langle x|\frac{1}{H_{W;\omega}-z}|u\rangle\right|^s\right) k_\Lambda(u,v) < 1.
\tag{3.10}
\]
We shall use this criterion in Sect. 4 in the slightly different form
\[
\left(1 + \frac{\widetilde{C}_s}{\lambda^s}\, E_s(\Lambda)\right)^{2} \sup_{x\in\mathcal{C}}\,\sup_{W\subset\Lambda} \sum_{(u,u')\in\Lambda\times(\mathbb{T}\setminus\Lambda)} \tau^s_{u,u'}\, \mathbb{E}\left(\left|\langle x|\frac{1}{H_{W;\omega}-z}|u\rangle\right|^s\right) < 1,
\tag{3.11}
\]
where we have summed various terms appearing in $k_\Lambda(u,v)$.

2. For graphs which grow at an exponential rate, such as the Bethe lattice, exponentially decaying functions need not be summable. The conclusion, Eq. (3.9), was therefore formulated in the stronger form, which implies both exponential decay, and almost sure summability. In particular, it is useful to recall that for $s/2 < 1$:
\[
\mathbb{E}\left(\left[\sum_y |G(x,y)|^2\right]^{s/2}\right) \le \mathbb{E}\left(\sum_y |G(x,y)|^s\right).
\tag{3.12}
\]

3. One may note that in the more general theorem we do make use of the "decoupling lemma", which was not used in Theorem 1.1.

4. Translation invariance played a limited role here: the analysis extends readily to random operators with non-translation invariant distributions, provided only that the required bounds are satisfied uniformly for all translates of $\Lambda$, and the distribution of the potential is uniformly $s$-regular. To demonstrate the required change we cast the next statement in that form.

As we discussed in the preceding sections, condition (3.8) may fail due to the existence of extended states at some surfaces. The following generalization of Theorem 1.2 provides criteria for localization in the bulk which are less affected by such surface states.
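Theorem 3.3. Let $H_\omega$ be a random operator on $\ell^2(\mathbb{T})$ ($\mathbb{T} = G \times S$, as above) with an $s$-regular distribution for the potential $V_\omega(\cdot)$, and with tempered off-diagonal matrix elements ($\{T_{x,y;\omega}\}$). Let $\mu > 0$ and assume that for some $z \in \mathbb{C}$ and a finite region $\Lambda$, $\mathcal{C} \subset \Lambda \subset \mathbb{T}$,
\[
\sup_{x\in\mathbb{T}} \sum_{u\in g_x\Lambda}\,\sum_{v\in\mathbb{T}} \mathbb{E}\left(\left|\langle x|\frac{1}{H_{g_x\Lambda;\omega}-z[\bar z]}|u\rangle\right|^s\right) k_{g_x\Lambda}(u,v)\, e^{+\mu\,\mathrm{dist}(x,v)} < 1,
\tag{3.13}
\]
where $z[\bar z]$ means that the bound is satisfied for both $z$ and $\bar z$. Then the condition (3.9) holds for the full operator $H_\omega$ (i.e., with $\Omega = \mathbb{T}$), and there exists $B < \infty$ with which for arbitrary $\Omega \subset \mathbb{T}$:
\[
\mathbb{E}_{\pm i0}\left(\left|\langle x|\frac{1}{H_{\Omega;\omega}-z}|y\rangle\right|^s\right) \le B\, e^{-\tilde\mu\,\mathrm{dist}_\Omega(x,y)}.
\tag{3.14}
\]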
The modified distance $\mathrm{dist}_\Omega(x,y)$ is defined by the natural extension of Eq. (1.15).

3.2. Derivation of the general results. The derivation of Theorems 3.2 and 3.3 follows very closely the proofs of Sect. 2. The main difference is in the second portion of the argument where we extract decay in a single step rather than by iteration. The first part of the proof rests on Lemmas 2.2 and 2.3 which are easily seen to extend to the setup described in Theorem 3.3. One readily obtains the following extension (the hopping terms $T_{x,y}$ appearing in Sect. 2.2 are replaced with the uniform upper bound $\tau_{x,y}$):

Lemma 3.4. Let $H_\omega$ be a random operator with the properties listed in Theorem 3.3, and let $\Lambda$ be a finite subset of $\mathbb{T}$, containing the periodicity cell $\mathcal{C}$, for which the condition (3.8) is satisfied. Then the following bound is valid for any $x \in \Lambda$, $y \in \mathbb{T}\setminus\Lambda$,
\[
\mathbb{E}\left(|G_\Omega(x,y;z)|^s\right) \le \sum_{(u,v)\in\Lambda\times(\mathbb{T}\setminus\Lambda)} \mathbb{E}\left(|G_{\Lambda\cap\Omega}(x,u;z)|^s\right) k_\Lambda(u,v)\, \mathbb{E}\left(|G_{\Omega\setminus\Lambda}(v,y;z)|^s\right),
\tag{3.15}
\]
and
\[
\mathbb{E}\left(|G_\Omega(x,y;z)|^s\right) \le \sum_{(u,v)\in\Lambda\times\mathbb{T}} \mathbb{E}\left(|G_{\Omega\cap\Lambda}(x,u;z)|^s\right) k_\Lambda(u,v)\, \mathbb{E}\left(|G_\Omega(v,y;z)|^s\right).
\tag{3.16}
\]
Notice that (3.16) differs from (3.15) in that the Green function in the region $\Omega$ (not $\Omega\setminus\Lambda$) appears on the right hand side and the summation over $v$ extends over the entire lattice. Theorems 3.2 and 3.3 follow easily from Lemma 3.4:
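Proof of Theorem 3.2. To establish the claimed bound (3.9) we will show that
\[
A_n := \sup_{\Omega:\,|\Omega|\le n}\ \sup_{x\in\Omega} \sum_{y\in\Omega} \mathbb{E}\left(|G_\Omega(x,y;z)|^s\right) e^{+\mu\,\mathrm{dist}(x,y)}
\tag{3.17}
\]
is bounded independent of $n$, thus establishing the result for finite regions. For infinite regions the result (3.9) follows by a limiting procedure, with the convergence implied by Fatou's lemma. For any given $\Omega$ with $|\Omega| \le n$ and any site $x \in \Omega$,
\[
\begin{aligned}
\sum_{y\in\Omega} \mathbb{E}\left(|G_\Omega(x,y;z)|^s\right) e^{+\mu\,\mathrm{dist}(x,y)} \le{} & |\Lambda|\, e^{\mu\,\mathrm{diam}(\Lambda)}\, \frac{C_s}{\lambda^s} \\
& + \sum_{y\in\Omega\setminus\Lambda_x}\ \sum_{u\in\Lambda_x,\ v\in\mathbb{T}\setminus\Lambda_x} \mathbb{E}\left(|G_{\Lambda_x\cap\Omega}(x,u;z)|^s\right) k_\Lambda(u,v)\, \mathbb{E}\left(|G_{\Omega\setminus\Lambda_x}(v,y;z)|^s\right) e^{+\mu\,\mathrm{dist}(x,y)},
\end{aligned}
\tag{3.18}
\]
where the first term on the right side bounds the contribution to the sum from sites $y$ in $\Lambda_x \equiv g_x\Lambda$, and the remaining terms were estimated by Lemma 3.4, Eq. (3.15). Performing the summation over $y$ first, and applying the triangle inequality to factor the exponential weight, we obtain:
\[
\sum_{y\in\Omega} \mathbb{E}\left(|G_\Omega(x,y;z)|^s\right) e^{+\mu\,\mathrm{dist}(x,y)} \le |\Lambda|\, \frac{C_s}{\lambda^s}\, e^{\mu\,\mathrm{diam}(\Lambda)} + b\, A_n,
\tag{3.19}
\]
where $b$ is the quantity on the left hand side of (3.8). When maximized over $\Omega$ and $x$ this leads to the bound $A_n \le \mathrm{Const.} + b A_n$ which, since $b < 1$, implies that
\[
A_n \le \frac{|\Lambda|\, C_s\, \lambda^{-s}\, e^{\mu\,\mathrm{diam}(\Lambda)}}{1-b},
\tag{3.20}
\]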
as claimed above.

Proof of Theorem 3.3. The claim made for the special case $\Omega = \mathbb{T}$ is covered by analysis similar to what was just described. However the second claim, i.e., Eq. (3.14), requires a somewhat different argument. We will first show that for a finite region $\Omega$ the function
\[
g(x,y) = \mathbb{E}\left(|G_\Omega(x,y;z)|^s\right) e^{+\mu\,\mathrm{dist}_\Omega(x,y)}
\tag{3.21}
\]
attains its maximum value for some $(x,y)$ with $\mathrm{dist}_\Omega(x,y) \le 2\,\mathrm{diam}(\Lambda)$. For any pair with a larger distance at least one of the sites, say $x$, can be separated from both the other and the boundary $\partial\Omega$ by an appropriate translate of $\Lambda$, i.e. $\Lambda_x$. We may then use Lemma 3.4, Eq. (3.16), to bound $g(x,y)$ by a sum of products of Green functions. If, in this sum, we replace each factor of $\mathbb{E}(|G_\Omega(v,y)|^s)\, e^{\mu\,\mathrm{dist}(x,y)}$ by the upper bound $g_{\max}\, e^{\mu\,\mathrm{dist}(x,v)}$, the resulting sum yields
\[
g(x,y) \le b\, g_{\max},
\tag{3.22}
\]
where $b$ is the quantity which sits on the left hand side of (3.13). As $b < 1$, we learn that $g(\cdot,\cdot)$ is not maximized at $(x,y)$.
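Since $g(x,y) \le \frac{C_s}{\lambda^s}\, e^{\mu\,\mathrm{dist}_\Omega(x,y)}$, the above implies that for any finite $\Omega$
\[
\mathbb{E}\left(|G_\Omega(x,y;z)|^s\right) \le \frac{C_s}{\lambda^s}\, e^{2\mu\,\mathrm{diam}(\Lambda)}\, e^{-\mu\,\mathrm{dist}_\Omega(x,y)}.
\tag{3.23}
\]
By strong resolvent convergence arguments, the bound extends to infinite regions.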
4. Some Implications

We shall now present a number of implications of the finite volume criteria for localization, focusing on the finite dimensional lattices $\mathbb{Z}^d$. The statements will bear some resemblance to results derived using the multiscale approach; however, the conclusions drawn here go beyond the latter by yielding results on the exponential decay of the mean values. The significance of that was described in the introduction.

4.1. Fast power decay ⇒ exponential decay. An interesting and useful implication (as is seen below) is that fast enough power law decay implies exponential decay. In this sense, random Schrödinger operators join other statistical mechanical models in which such principles have been previously recognized. The list includes the general Dobrushin–Shlosman results [24] and the more specific two-point function bounds in: percolation (Hammersley [23] and Aizenman–Newman [27]), Ising ferromagnets (Simon [25] and Lieb [26]), certain $O(N)$ models (Aizenman–Simon [30]), and time-evolution models (Aizenman–Holley [31], Maes–Shlosman [32]).

Theorem 4.1. Let $H_\omega$ be a random operator on $\ell^2(\mathbb{Z}^d)$ with an $s$-regular distribution for the potential ($V_\omega(x)$) and tempered off-diagonal matrix elements ($T_{x,y;\omega}$). There are $L_0, B_1, B_2 < \infty$, which depend only on the temperedness bound (3.4), such that if for some $E \in \mathbb{R}$ and some finite $L \ge L_0$, either
\[
L^{3(d-1)} \sup_{L/2\le\|x-y\|\le L} \mathbb{E}\left(\left|\langle x|\frac{1}{H_{\Lambda_L(x),\omega}-E}|y\rangle\right|^s\right) \le B_1,
\tag{4.1}
\]
or
\[
L^{4(d-1)} \sup_{L/2\le\|x-y\|\le L} \mathbb{E}\left(\left|\langle x|\frac{1}{H_\omega-E-i0}|y\rangle\right|^s\right) \le B_2,
\tag{4.2}
\]
where $\Lambda_L(x) = [-L, L]^d + x$ and $\|y\| \equiv \max_j |y_j|$, then the exponential localization (1.3) holds for all energies in some open interval $(a, b)$ containing $E$.

Proof. By Theorem 3.2, to establish exponential decay at the energy $E$ it suffices to show that for each $x \in \mathbb{Z}^d$,
\[
\left(1 + \frac{\widetilde{C}_s}{\lambda^s}\, E_s(\Lambda_L)\right)^{2} \sum_{u\in\Lambda_L(x),\ u'\in\mathbb{Z}^d\setminus\Lambda_L(x)} \tau^s_{u,u'}\, \mathbb{E}\left(|G_{\Lambda_L(x)}(x,u;E)|^s\right) < 1.
\tag{4.3}
\]
Because the off-diagonal elements are tempered we have the following bounds
\[
\tau^s_{u,u'} \le \mathrm{Const.}\, e^{-m|u-u'|}, \qquad E_s(\Lambda_L) \le \mathrm{Const.}\, L^{d-1},
\tag{4.4}
\]
for some $m > 0$, and all $L > 1$. Under the assumption Eq. (4.1):
\[
\sum_{u\in\Lambda_L(x),\ u'\in\mathbb{Z}^d\setminus\Lambda_L(x)} \tau^s_{u,u'}\, \mathbb{E}\left(|G_{\Lambda_L(x)}(x,u;E)|^s\right) \le \frac{\widetilde{C}_s}{\lambda^s}\, \mathrm{Const.}\, (L/2)^d\, e^{-mL/2} + \mathrm{Const.} \sup_{L/2\le\|x-y\|\le L} \mathbb{E}\left(\left|\langle x|\frac{1}{H_{\Lambda_L(x),\omega}-E}|y\rangle\right|^s\right) L^{d-1}.
\tag{4.5}
\]
For this bound the sum was split according to $\|u - u'\| <$ (or $\ge$) $L/2$, and in the case $\|u - u'\| \ge L/2$ we used the uniform upper bound $\mathbb{E}(|G(x,u;E)|^s) \le \widetilde{C}_s/\lambda^s$. It is now easy to see that with an appropriate choice of $L_0$ and $B_1$ condition (4.1) implies the claimed bound (4.3), for the given energy $E$. The extension to an interval of energies around $E$ then follows from the continuity of the fractional moments of finite volume Green functions.

To show the sufficiency of the second condition, we first use Lemma 2.3 to bound finite volume Green functions in terms of the corresponding infinite volume functions
\[
\mathbb{E}\left(|G_{\Lambda_L(x)}(x,y;E)|^s\right) \le \mathbb{E}\left(|G(x,y;E)|^s\right) + \frac{\widetilde{C}_s}{\lambda^s} \sum_{u\in\Lambda_L(x),\ u'\in\mathbb{Z}^d\setminus\Lambda_L(x)} \tau^s_{u',u}\, \mathbb{E}\left(|G(x,u';E)|^s\right).
\tag{4.6}
\]
Splitting the sum as in Eq. (4.5), we get
\[
\sup_{L/2\le\|x-y\|\le L} \mathbb{E}\left(|G_{\Lambda_L(x)}(x,y;E)|^s\right) \le \left(\frac{\widetilde{C}_s}{\lambda^s}\right)^{2} \mathrm{Const.}\, (L/2)^d\, e^{-mL/2} + \left(1 + \mathrm{Const.}\, L^{d-1}\right) \sup_{L/2\le\|x-y\|\le L} \mathbb{E}\left(|G(x,y;E)|^s\right).
\tag{4.7}
\]
The combination of Eq. (4.7) with (4.5) yields the claim, for the given energy. Again, the existence of an open interval of energies in which the condition is met is implied by the continuity of the finite-volume expectation values.
4.2. Lower bounds for $G_\omega(x, y; E_{\mathrm{edge}} + i0)$ at mobility edges. Boundary points of the continuous spectrum are often referred to as mobility edges. (In an ergodic setting the location of such points does not depend on the realization $\omega$ [33].) The proof of the occurrence of continuous spectrum for random stochastically shift-invariant operators on $\mathbb{Z}^d$ is still an open problem (one may add that we are here glossing over some fine distinctions in the dynamical behaviour [34]). However, it is interesting to note that Theorem 4.1 directly yields the following pair of lower bounds on the decay rate of the Green function at mobility edges, $E_{\mathrm{edge}}$, for stochastically shift invariant random operators with regular probability distribution of the potential:
\[
\sup_{L/2\le\|y\|\le L} \mathbb{E}\left(\left|\langle 0|\frac{1}{H_{[-L,L]^d,\omega}-E_{\mathrm{edge}}}|y\rangle\right|^s\right) \ge B_1\, L^{-3(d-1)},
\tag{4.8}
\]
\[
\sup_{L/2\le\|y\|\le L} \mathbb{E}\left(\left|\langle 0|\frac{1}{H_\omega-E_{\mathrm{edge}}-i0}|y\rangle\right|^s\right) \ge B_2\, L^{-4(d-1)},
\tag{4.9}
\]
with $\|y\| \equiv \max_j |y_j|$. We do not expect the power laws provided here to be optimal. As mentioned above, vaguely similar bounds are known for the critical two-point functions in certain statistical mechanical models (percolation, Ising spin systems, and some $O(N)$ spin models).
4.3. Extending off the real axis. For various applications, such as the decay of the projection kernel (see [8, Sect. 5]), it is useful to have bounds on the resolvent at $z = E + i\eta$ which are uniform in $\eta$. The following result shows that in order to establish such uniform bounds it is sufficient to verify our criteria for real energies in some neighborhood of $E$.

Theorem 4.2. Let $H_\omega$ be a random operator on $\ell^2(\mathbb{Z}^d)$ with an $s$-regular distribution for the potential ($V_\omega(x)$) and tempered off-diagonal matrix elements ($T_{x,y;\omega}$). Suppose that for some $E \in \mathbb{R}$, and $\Delta E > 0$, the following bound holds uniformly for $\xi \in [E - \Delta E, E + \Delta E]$:
\[
\mathbb{E}\left(\left|\langle x|\frac{1}{H_\omega-\xi-i0}|y\rangle\right|^s\right) \le A\, e^{-\mu|x-y|}.
\tag{4.10}
\]
Then for all $\eta \in \mathbb{R}$:
\[
\mathbb{E}\left(\left|\langle x|\frac{1}{H_\omega-E-i\eta}|y\rangle\right|^s\right) \le \widetilde{A}\, e^{-\tilde\mu|x-y|},
\tag{4.11}
\]
with some $\widetilde{A} < \infty$ and $\tilde\mu > 0$, which depend on $\Delta E$ and the bound (4.10).

Remarks. 1. This result is not needed in situations covered by the single site version of the criterion provided by Theorem 1.1, since if Eq. (1.12) is satisfied at some $E \in \mathbb{R}$ then it automatically holds uniformly along the entire line $E + i\mathbb{R}$. We do not see a monotonicity argument for such a deduction in case of other finite volumes.

2. One way to derive the statement is by using the fact that exponential decay may be tested in finite volumes: if a finite volume criterion holds for some $E$ then continuity allows one to extend it to all $E + i\eta$ with $\eta$ sufficiently small. The Combes–Thomas estimate [35] can then be used to cover the rest of the line $E + i\mathbb{R}$. However, by this approach one gets only a weaker decay rate for energies off the real axis. It is tempting to think that some contour integration argument could be found to significantly improve on that. The proof given below is a step in that direction (though it still leaves one with the feeling that a more efficient argument should be possible).
Proof. Assume that condition (4.10) is satisfied for all $\xi \in [E - \Delta E, E + \Delta E]$. We shall show that this implies that for any power $\alpha$,
\[
\mathbb{E}\left(\left|\langle x|\frac{1}{H_\omega-\xi-i\eta}|y\rangle\right|^s\right) \le \frac{A_\alpha}{|x-y|^\alpha},
\tag{4.12}
\]
with the constant $A_\alpha < \infty$ uniform in $\eta$. The stated conclusion then follows by an application of Theorem 4.1 (and the uniform bounds seen in its proof). We shall deal separately with large and small $|\eta|$, splitting the two regimes at $\Delta E \times \pi/\alpha$. The case $|\eta| \ge \Delta E \times \pi/\alpha$ is covered by the general bound of Combes–Thomas [35], which states that:
\[
|G(x,y;E+i\eta)| \le (2/\eta)\, e^{-m|x-y|}
\tag{4.13}
\]
for any $m \ge 0$ such that
\[
\sum_{x\in\mathbb{Z}^d} \tau(x)\left(e^{m|x|} - 1\right) \le \eta/2.
\tag{4.14}
\]
To estimate the resolvent for $|\eta| \le \Delta E \times \pi/\alpha$, we shall use the fact that the function
\[
f_L(\zeta) = \mathbb{E}\left(|G_{[-L,L]^d}(x,y;\zeta)|^s\right)
\tag{4.15}
\]
is subharmonic in the upper half plane, and continuous at the boundary. The subharmonicity is a general consequence of the analyticity of the resolvent in $\zeta$, and the continuity is implied through the continuity of the distribution of the potential. $L$ serves as a convenient cutoff, which may be removed after the bounds are derived (since $H_{[-L,L]^d,\omega} \to H_\omega$ as $L \to \infty$ in the strong resolvent sense). Let $D \subset \mathbb{C}$ be the triangular region in the upper half plane in the form of an isosceles triangle based on the real interval $[E - \Delta E, E + \Delta E]$ with the side angles equal to $\theta$, determined by the condition
\[
\alpha = \frac{2\pi}{\theta} - 1.
\tag{4.16}
\]
The Poisson-kernel representation of harmonic functions yields, for $E + i\eta \in D$,
\[
f_L(E+i\eta) \le \int_{\partial D} f_L(\zeta)\, P^D_{E+i\eta}(d\zeta),
\tag{4.17}
\]
where $P^D_{E+i\eta}(d\zeta)$ is a certain probability measure on $\partial D$. We now rely on the fact that this probability measure satisfies
\[
P^D_{E+i\eta}(d\zeta) \le \mathrm{Const.}\, d\!\left(\eta^{2\pi/\theta}\right) / \Delta E^{2\pi/\theta}.
\tag{4.18}
\]
(This is easily understood upon the unfolding of $D$ by the map $z \mapsto z^{2\pi/\theta}$ applied from either of the base corners of $D$, i.e., from $\zeta = E \pm \Delta E$, and a comparison with the Poisson kernel in the upper half plane.) For $\zeta \in \partial D \cap \mathbb{R}$ the integrand satisfies the exponential bound (4.10). Along the rest of the boundary of $D$ we use the Combes–Thomas bound (4.13). Putting it all together we get
\[
f_L(E+i\eta) \le A\, e^{-\mu|x-y|} + \mathrm{Const.} \int_0^{\Delta E\,\theta} \frac{2}{\eta}\, e^{-\mathrm{Const.}\,|x-y|\,\eta}\, d\!\left(\eta^{2\pi/\theta}\right) / \Delta E^{2\pi/\theta}.
\tag{4.19}
\]
The claimed Eq. (4.12) follows by simple integration, and the relation (4.16).
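The "simple integration" can be made explicit. The following sketch assumes $\alpha > 0$, writes $c$ for the unspecified constant in the exponent and $r = |x - y|$, and extends the upper limit of integration to infinity, which can only increase the integral. To avoid a clash with the cut-set notation, Euler's Gamma function is written $\Gamma_{\mathrm{Euler}}$.

```latex
\int_0^{\infty} \frac{2}{\eta}\, e^{-c r \eta}\,
   \frac{d\!\left(\eta^{2\pi/\theta}\right)}{\Delta E^{2\pi/\theta}}
 = \frac{2(\alpha+1)}{\Delta E^{\alpha+1}} \int_0^{\infty} \eta^{\alpha-1} e^{-c r \eta}\, d\eta
 = \frac{2(\alpha+1)\,\Gamma_{\mathrm{Euler}}(\alpha)}{\Delta E^{\alpha+1}\,(c\,r)^{\alpha}},
\qquad \frac{2\pi}{\theta} = \alpha + 1 .
```

The result decays as $|x-y|^{-\alpha}$, while the remaining term $A e^{-\mu|x-y|}$ decays faster than any power, which is the form of bound stated in Eq. (4.12).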
4.4. Relation with the multiscale analysis and density of states estimates. Using the above results we shall now show that the fractional moment localization condition is satisfied throughout the regime for which localization can be shown via the multiscale analysis, and also in regimes over which one has suitable bounds (e.g., via Lifshitz tail estimates) on the density of states of the operators restricted to finite regions $\Lambda_L = [-L, L]^d$. The following result is useful for the latter case.

Theorem 4.3. Let $H_\omega$ be a random operator on $\ell^2(\mathbb{Z}^d)$ with tempered off-diagonal matrix elements ($T_{x,y;\omega}$) and a distribution of the potential which is $s$-regular for all $s$ small enough, which is stochastically invariant under magnetic shifts. Then, given $\beta \in (0, 1)$, $C_1 > 0$, and $\xi > 3(d - 1)$, there exist $L_0 > 0$ and $C_2 > 0$ such that if for some $L \ge L_0$,
\[
\mathrm{Prob}\left(\mathrm{dist}\left(\sigma(H_{\Lambda_L;\omega}), E\right) \le C_1 L^{-\beta}\right) < C_2 L^{-\xi},
\tag{4.20}
\]
at some energy $E$, then the exponential localization condition (1.3) holds in some open interval containing $E$.

The condition (4.20) is similar to the one used in the multiscale analysis, although there one can also find a sufficient diagnostic with arbitrary $\xi > 0$. It may therefore not be initially clear that the methods of this paper may be used throughout the regime in which the multiscale analysis applies. However, the proof of Theorem 4.3 is easily adapted to prove the following result which implies fractional moment localization via the conclusions of the multiscale analysis.

Theorem 4.4. Let $H_\omega$ be a random operator with tempered off-diagonal matrix elements ($T_{x,y;\omega}$) and a distribution of the potential which is $s$-regular for all $s$ small enough, which is stochastically invariant under magnetic shifts. If for some $E \in \mathbb{R}$ there exist $A < \infty$, $\mu > 0$, and $\xi > 3(d - 1)$ such that
\[
\lim_{L\to\infty} L^{\xi}\, \mathrm{Prob}\left(|G_{\Lambda_L;\omega}(0,x)| > A\, e^{-\mu|x|} \ \text{for some } x \in \Lambda_L\right) = 0,
\tag{4.21}
\]
then the exponential localization condition (1.3) holds in some open interval containing $E$.

Remarks. 1. When the multiscale analysis applies, it allows one to conclude that there are $A < \infty$ and $\mu > 0$ such that the probabilities appearing on the left side of Eq. (4.21) decay faster than any power of $L$ as $L \to \infty$. Thus, the conclusions of the multiscale analysis imply that exponential localization in the stronger sense discussed in our work applies throughout the regime which may be reached by this prior method.

2. It is of interest to combine the criterion presented above with Lifshitz tail estimates on the density of states at the bottom of the spectrum, $E_0$, and at band edges. Using Lifshitz tail estimates, it is possible to show that [36]:
\[
\mathrm{Prob}\left(\inf \sigma(H_{\Lambda_L;\omega}) \le E_0 + \Delta E\right) \le \mathrm{Const.}\, L^d\, e^{-\Delta E^{-d/2}}.
\tag{4.22}
\]
Theorem 4.3 then implies fractional moment localization in a neighborhood of $E_0$; we need only choose $\Delta E \propto L^{-\beta}$ with $\beta \in (0, 1)$ for large enough $L$. Previous results in this vein may be found in [21, 16–18].
Proof of Theorems 4.3 and 4.4. We first prove Theorem 4.3 and then indicate how the proof can be modified to show Theorem 4.4. Fix an energy E ∈ R. For L > 0, define #
$ (4.23) pL (δ) := Prob dist σ (H*L ;ω ), E ≤ δ , and let δL := C1 L−β .
(4.24)
We will show that for suitable s ∈ (0, 1), L0 > 0 and C2 > 0, if pL (δL ) < C2 L−ξ ,
(4.25)
then the input condition (4.1) of Theorem 4.1: s 1 L3(d−1) sup E 0 y ≤ B1 , H − E L/2≤&y&≤L *L ,ω
(4.26)
∈ [E − 1 δL , E + 1 δL ]. Exponential localization in the is satisfied for all energies E 2 2 corresponding interval (and strip, with η = 0) follows then by Theorems 4.1 (and Theorem 4.2).
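First we must show how to estimate $\mathbb{E}\bigl(|G_{\Lambda_L;\omega}(0, u; \tilde E)|^s\bigr)$ in terms of $p_L(\delta)$. This is achieved by considering separately the contributions from the “good set”:
$$ \Omega_G = \{\omega \mid \mathrm{dist}(\sigma(H_{\Lambda_L;\omega}), E) > \delta\}, \tag{4.27} $$
and its complement, the “bad set”: $\Omega_B = \Omega_G^c$.
On the “good set”, $\omega \in \Omega_G$, the energy $\tilde E$ is at a small yet significant distance ($\Delta E \ge \tfrac12 \delta$) from the spectrum of $H_{\Lambda_L;\omega}$. In this situation, we use the Combes–Thomas [35] bound, by which:
$$ |G_{\Lambda_L;\omega}(0, u; \tilde E)| \le \frac{2}{\Delta E}\, e^{-\frac12 \Delta E |u|}. \tag{4.28} $$
The above estimate does not apply on the “bad set”. However, using the Hölder inequality, we find that the net contribution to the expectation is small because $\mathrm{Prob}(\Omega_B) = p_L(\delta)$ is small. The two estimates are combined in the following bound:
$$ \begin{aligned}
\mathbb{E}\bigl(|G_{\Lambda_L;\omega}(0, u; \tilde E)|^s\bigr)
&= \mathbb{E}\bigl(|G_{\Lambda_L;\omega}(0, u; \tilde E)|^s\, I[\omega \in \Omega_G]\bigr) + \mathbb{E}\bigl(|G_{\Lambda_L;\omega}(0, u; \tilde E)|^s\, I[\omega \in \Omega_B]\bigr) \\
&\le 4^s \delta^{-s} e^{-s |u| \delta/4} + \mathbb{E}\bigl(|G_{\Lambda_L;\omega}(0, u; \tilde E)|^t\bigr)^{s/t}\, \mathbb{E}\bigl(I[\omega \in \Omega_B]\bigr)^{1 - s/t} \\
&\le 4^s \delta^{-s} e^{-s |u| \delta/4} + \frac{C_t^{s/t}}{\lambda^s}\, p_L(\delta)^{1 - s/t},
\end{aligned} \tag{4.29} $$
where $t$ is any number greater than $s$ for which the distribution of the potential is still $t$-regular (i.e., $C_t < \infty$). The required bound, Eq. (4.26), is satisfied once one chooses $s$ small enough so that $\xi \ge \frac{t}{t-s}\, 3(d-1)$, and $L_0$ large enough so that for $L > L_0$,
$$ 4^s C_1^{-s} L^{3(d-1) - s\beta} e^{-s C_1 L^{1-\beta}/4} \le B_1/2. \tag{4.30} $$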
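For orientation, the role of the condition on $\xi$ can be read off from the second term of (4.29). The display below is only a sketch of that bookkeeping under the hypotheses (4.24) and (4.25), taking $L \ge 1$; it introduces no assumptions beyond those.

```latex
L^{3(d-1)}\,\frac{C_t^{s/t}}{\lambda^{s}}\;p_L(\delta_L)^{1-s/t}
\;\le\;
\frac{C_t^{s/t}}{\lambda^{s}}\;C_2^{\,1-s/t}\;
L^{\,3(d-1)-\xi\,(t-s)/t}
\;\le\;
\frac{C_t^{s/t}}{\lambda^{s}}\;C_2^{\,1-s/t}
\qquad\text{if}\quad \xi \ge \frac{t}{t-s}\,3(d-1).
```

The right-hand side is at most $B_1/2$ once $C_2$ is chosen small enough, which is the sense in which $C_2$ has to be "suitable" in (4.25).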
Finally let us remark on how this argument can be adapted to prove Theorem 4.4. We simply define the good and bad sets differently:
$$ \Omega_G = \{\omega \mid |G_{\Lambda_L;\omega}(0, x)| \le A e^{-\mu |x|} \text{ for all } x \in \Lambda_L\}, \tag{4.31} $$
and $\Omega_B = \Omega_G^c$, and then proceed as in the proof of Theorem 4.3 using Hölder’s inequality to estimate the contributions from $\Omega_B$. It is easy to see that for large $L$, the condition (4.21) implies that the input for Theorem 4.1 is satisfied.
Thus, we have seen here that the fractional moment localization condition holds throughout the regime for which localization can be established by any available methods. This is meaningful since that condition carries a number of physically significant implications.

Appendix A. Dynamical Localization

Among the implications of the fractional moment condition is dynamical localization, expressed through uniform exponential decay of the average time evolution kernels:
$$ \mathbb{E}\Bigl( \sup_{t \in \mathbb{R}} \bigl| \langle x | P_{H_\omega \in F}\, e^{itH_\omega} | y \rangle \bigr| \Bigr) \le A e^{-\mu |x - y|}, \tag{A.1} $$
where $P_{H_\omega \in F}$ indicates the spectral projection of $H_\omega$ onto a set $F \subset \mathbb{R}$ in which the fractional moment condition is known to hold. A derivation of this implication, under some auxiliary assumptions on the distribution of the potential, was given in ref. [13]. For completeness we offer here a streamlined version of that argument, which also extends the result in that we now allow $F$ to be an unbounded set (in particular the full real line).
The inequality expressed in Eq. (A.1) is not special to the time evolution operators $f_t(E) = e^{itE}$; it follows, rather, from a similar bound on the average total mass of the spectral measures, $\mu_\omega^{x,y}$, associated to pairs of sites $x, y$. The measures are defined by the spectral representation:
$$ \int f(E)\, \mu_\omega^{x,y}(dE) := \langle x | f(H_\omega) | y \rangle, \tag{A.2} $$
for bounded Borel functions $f$. In the following discussion we denote by $|\mu_\omega^{x,y}|$ the absolute value (sometimes called the total variation) of $\mu_\omega^{x,y}$.
Theorem A.1. Let $H_\omega$ be a random operator on $\ell^2(\mathbb{Z}^d)$ with tempered off-diagonal matrix elements and a potential $V_\omega$ which satisfies:
1. For some $\delta \in (0, 1)$, the $\delta$-moments of $V_\omega$, $\mathbb{E}(|V_\omega(x)|^\delta)$, are uniformly bounded.
2. For each $x \in \mathbb{Z}^d$ the conditional distribution of $v = V_\omega(x)$ at specified values of all other matrix elements has a density $\rho_\omega^x(v)$, and the functions $\rho_\omega^x$ are uniformly bounded.
Suppose there is an energy domain $F \subset \mathbb{R}$ on which $H_\omega$ satisfies a uniform fractional moment bound, i.e., there exist $A < \infty$ and $\mu > 0$ such that, for some $s \in (0, 1)$,
$$ \mathbb{E}\Bigl( \Bigl| \langle x | \frac{1}{H_{\Lambda;\omega} - E} | y \rangle \Bigr|^s \Bigr) \le A e^{-\mu |x - y|}, \tag{A.3} $$
for any finite region $\Lambda \subset \mathbb{Z}^d$, any pair of sites $x, y \in \Lambda$, and every $E \in F$. Then there exist $\tilde A < \infty$ and $\tilde\mu > 0$ such that for any pair of sites $x, y \in \mathbb{Z}^d$,
$$ \mathbb{E}\bigl( |\mu_\omega^{x,y}|(F) \bigr) \le \tilde A e^{-\tilde\mu |x - y|}, \tag{A.4} $$
where $\mu_\omega^{x,y}$ is the spectral measure associated to the pair $x, y$ and $H_\omega$.
Remarks. 1. Recall that for any regular Borel measure $\mu$, $|\mu|(F) = \sup \bigl| \int_F f(E)\, \mu(dE) \bigr|$, where the supremum ranges over Borel measurable (or even just continuous) functions $f$ which are point-wise bounded by 1. Thus Eq. (A.4) implies that
$$ \mathbb{E}\Bigl( \sup_t \bigl| \langle x | f_t(H_\omega) P_{H_\omega \in F} | y \rangle \bigr| \Bigr) \le C \tilde A e^{-\tilde\mu |x - y|}, \tag{A.5} $$
for any uniformly bounded family of Borel functions $\{f_t\}$. In particular, we may take $f_t(E) = e^{itE}$ for $t \in \mathbb{R}$ to obtain dynamical localization (A.1) as promised.
2. The requirement that the conditional densities, $\rho_\omega^x$, be uniformly bounded is overly strong. By the arguments presented in ref. [13], the result extends to potentials for which there is some $q > 0$ such that $\int (\rho_\omega^x(v))^{1+q}\, dv$ are uniformly bounded.
3. Since this work now extends exponential dynamical localization to the regime covered by the multiscale analysis, let us mention that prior results covering this regime include the proof of localization in terms of power-law bounds for the time evolution kernel [37, 38]. (The analysis there is more general since it applies also to models for which the fractional moment method has not been developed, e.g., continuum operators.)
Proof of Theorem A.1. It is convenient to derive the result through the analysis of the finite volume operators obtained by restricting $H_\omega$ to finite regions, $\Lambda_n \subset \mathbb{Z}^d$. It is generally understood that for each $x, y \in \mathbb{Z}^d$ and each increasing sequence of finite regions $\Lambda_n$ which contain $\{x, y\}$ and whose union is $\mathbb{Z}^d$, the associated spectral measures, $\mu_{\Lambda_n;\omega}^{x,y}$, converge in the vague topology to $\mu_\omega^{x,y}$. Thus, by the lemma of Fatou, for any $F \subset \mathbb{R}$: $\mathbb{E}(|\mu_\omega^{x,y}|(F)) \le \liminf_{n \to \infty} \mathbb{E}(|\mu_{\Lambda_n;\omega}^{x,y}|(F))$.
The upshot is that it suffices to prove the following statement regarding finite volume operators. Under the assumptions of Theorem A.1 there exist $C, r > 0$ (which depend only on the regularity assumptions for $H_\omega$) such that for any finite region $\Lambda \subset \mathbb{Z}^d$, any $x, y \in \Lambda$, any $F \subset \mathbb{R}$, and any $s \in (0, 1)$:
$$ \mathbb{E}\bigl( |\mu_{\Lambda;\omega}^{x,y}|(F) \bigr) \le C \Bigl[ \sup_{E \in F} \mathbb{E}\Bigl( \Bigl| \langle x | \frac{1}{H_{\Lambda,\omega} - E} | y \rangle \Bigr|^s \Bigr) \Bigr]^r. \tag{A.6} $$
Following is a summary of the proof of this assertion.
Let us fix a finite region $\Lambda \subset \mathbb{Z}^d$ and a pair of sites $x, y \in \Lambda$. For simplicity of notation, we will suppress the region $\Lambda$ and denote the restricted operator by $H_\omega$ and the associated spectral measure by $\mu_\omega^{x,y}$.
Since $\ell^2(\Lambda)$ is finite dimensional, $\mu_\omega^{x,y}$ is a weighted sum of Dirac measures supported on the eigenvalues of $H_\omega$. Integrals with respect to this measure are discrete sums. The
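argument of ref. [13] makes an essential use of the following representation of this measure. Let $v = V_\omega(x)$, and let $\hat v$ be any other value in $\mathbb{R}$. Denote by $\hat H_\omega$ the operator with the potential at $x$ changed to $\hat v$, and let $\hat\Sigma(E) := -1 / \langle x | \frac{1}{\hat H_\omega - E} | x \rangle$. Then
$$ \mu_\omega^{x,y}(dE) = -(v - \hat v)\, \langle x | \frac{1}{\hat H_\omega - E} | y \rangle\, \delta(v - \hat v - \hat\Sigma(E))\, dE. \tag{A.7} $$
In what follows, we will take $\hat v = \hat v_\omega$ to be a random variable independent of $v_\omega$ and identically distributed. In this case Eq. (A.7) holds almost surely.
A special case of Eq. (A.7) is the formula (which was the basis for the important “Kotani-argument” [39, 12]) for the spectral measure at $x$,
$$ \mu_\omega^{x,x}(dE) = \delta(v - \hat v - \hat\Sigma(E))\, dE. \tag{A.8} $$
The above is a probability measure. Another normalizing condition is:
$$ \int |v - \hat v|^2\, \Bigl| \langle x | \frac{1}{\hat H_\omega - E} | y \rangle \Bigr|^2\, \delta(v - \hat v - \hat\Sigma(E))\, dE \le 1, \tag{A.9} $$
(which typically holds as equality). The reason for Eq. (A.9) is that by the general structure of the spectral measures, $\mu_\omega^{x,y}(dE) = R_\omega(E)\, \mu_\omega^{x,x}(dE)$, with $R_\omega(E)$ satisfying
$$ \int |R_\omega(E)|^2\, \mu_\omega^{x,x}(dE) = \langle y | P_\omega | y \rangle \le 1, $$
where $P_\omega$ is the projection onto the cyclic subspace for $H_\omega$ which contains $|x\rangle$.
Let us first present the necessary estimates for the case that $F \subset \mathbb{R}$ is of finite Lebesgue measure. Using the bound Eq. (A.9), and the Hölder inequality,
$$ \mathbb{E}\bigl( |\mu_\omega^{x,y}|(F) \bigr) \le \Bigl[ \mathbb{E}\Bigl( \int_F |v - \hat v|^\alpha\, \Bigl| \langle x | \frac{1}{\hat H_\omega - E} | y \rangle \Bigr|^\alpha\, \delta(v - \hat v - \hat\Sigma(E))\, dE \Bigr) \Bigr]^{1/(2-\alpha)}, \tag{A.10} $$
where $\alpha\ (< 1)$ is a small number to be specified later. By a further application of the Hölder inequality, followed by the Jensen inequality, we obtain
$$ \Bigl[ \mathbb{E}\bigl( |\mu_{\Lambda;\omega}^{x,y}|(F) \bigr) \Bigr]^{2-\alpha} \le \bigl( 2\, \mathbb{E}(|v|^\delta) \bigr)^{\alpha/\delta} \times \Bigl[ \mathbb{E}\Bigl( \int_F \Bigl| \langle x | \frac{1}{\hat H_\omega - E} | y \rangle \Bigr|^s\, \delta(v - \hat v - \hat\Sigma(E))\, dE \Bigr) \Bigr]^{\alpha/s}, \tag{A.11} $$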
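where $\alpha$ is fixed by the equation $\alpha/s + \alpha/\delta = 1$. Finally we evaluate:
$$ \begin{aligned}
\mathbb{E}\Bigl( \int_F \Bigl| \langle x | \frac{1}{\hat H_\omega - E} | y \rangle \Bigr|^s\, \delta(v - \hat v - \hat\Sigma(E))\, dE \Bigr)
&= \mathbb{E}\Bigl( \int_F \Bigl| \langle x | \frac{1}{\hat H_\omega - E} | y \rangle \Bigr|^s\, \rho_\omega^x(\hat v + \hat\Sigma(E))\, dE \Bigr) \\
&\le \kappa \int_F \mathbb{E}\Bigl( \Bigl| \langle x | \frac{1}{\hat H_\omega - E} | y \rangle \Bigr|^s \Bigr)\, dE,
\end{aligned} \tag{A.12} $$
where $\kappa$ is a uniform upper bound for $\rho_\omega^x$. These estimates can be combined to provide a bound of the form Eq. (A.6) for $F$ a finite interval, which was the case considered in ref. [13].
We shall now improve the argument, to obtain a statement which covers the case that the localized spectral regime is unbounded. Since we do not wish our final estimate to depend on the Lebesgue measure of $F$, we seek a way of introducing an integrable weight $h(E)$, so that the final bound involves the integral of $h(E)\, dE$ in place of $dE$. This may be accomplished with the following inequality:
$$ |\mu_\omega^{x,y}|(F) \le \langle x |\, |g(H)|^{2p}\, | x \rangle^{\frac{1}{2p}} \Bigl( \int_F |g(E)|^{-p'}\, |\mu_\omega^{x,y}|(dE) \Bigr)^{\frac{1}{p'}}, \tag{A.13} $$
where $1/p + 1/p' = 1$ and $g$ is any continuous function which is bounded and bounded away from zero. To prove Eq. (A.13), write $|\mu_\omega^{x,y}|(F) = \int_F g(E)/g(E)\, |\mu_\omega^{x,y}|(dE)$, and apply the Hölder inequality followed by
$$ \int |g(E)|^p\, |\mu_\omega^{x,y}|(dE) \le \langle x |\, |g(H)|^{2p}\, | x \rangle^{1/2}. \tag{A.14} $$
It is convenient to choose $g(E)^{2p} = (1 + E^2)$, since $\langle x | (1 + H_\omega^2) | x \rangle = B_\omega + V_\omega(x)^2$, where $B_\omega$ is a bounded random variable which depends only on the off-diagonal part of $H_\omega$. Upon taking expectations followed by a further application of the Hölder inequality this leads to
$$ \mathbb{E}\bigl( |\mu_\omega^{x,y}|(F) \bigr) \le \Bigl[ \mathbb{E}\Bigl( \bigl( B_\omega + V_\omega(x)^2 \bigr)^{\frac{q}{2p}} \Bigr) \Bigr]^{1/q} \times \Bigl[ \mathbb{E}\Bigl( \Bigl( \int_F \frac{1}{(1 + E^2)^{\frac{p'}{2p}}}\, |\mu_\omega^{x,y}|(dE) \Bigr)^{\frac{q'}{p'}} \Bigr) \Bigr]^{1/q'}, \tag{A.15} $$
where $1/q + 1/q' = 1$. We estimate the two factors on the right-hand side of this inequality separately. The first factor can be controlled by choosing $q = p\delta$ so that
$$ \mathbb{E}\Bigl( \bigl( B_\omega + V_\omega(x)^2 \bigr)^{\frac{q}{2p}} \Bigr) \le \|B_\omega\|_\infty^{\delta/2} + \mathbb{E}\bigl( |V_\omega(x)|^\delta \bigr). \tag{A.16} $$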
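The exponents $p, p', q, q'$ are all specified once we choose $p > 1/\delta$. Specifically, $q = \delta p$, $q' = p(p - 1/\delta)^{-1}$, and $p' = p(p - 1)^{-1}$. Note that $p' < q'$.
To estimate the second factor, we note that $|\mu_\omega^{x,y}|$ is a sub-probability measure and $q'/p' > 1$, so by the Jensen inequality,
$$ \mathbb{E}\Bigl( \Bigl( \int_F \frac{1}{(1 + E^2)^{\frac{p'}{2p}}}\, |\mu_\omega^{x,y}|(dE) \Bigr)^{\frac{q'}{p'}} \Bigr) \le \mathbb{E}\Bigl( \int_F \frac{1}{(1 + E^2)^{\frac{q'}{2p}}}\, |\mu_\omega^{x,y}|(dE) \Bigr). \tag{A.17} $$
Estimating the right hand side with the argument outlined above for $F$ with finite Lebesgue measure, we find that
$$ \mathbb{E}\Bigl( \int_F \frac{1}{(1 + E^2)^{\frac{q'}{2p}}}\, |\mu_\omega^{x,y}|(dE) \Bigr) \le \bigl( 2\, \mathbb{E}(|v|^\delta) \bigr)^{\alpha/\delta} \times \Bigl[ \kappa \int_F \frac{dE}{(1 + E^2)^{q'/2p}}\, \mathbb{E}\Bigl( \Bigl| \langle x | \frac{1}{\hat H_\omega - E} | y \rangle \Bigr|^s \Bigr) \Bigr]^{\alpha/s}, \tag{A.18} $$
which is uniformly bounded provided we choose $p$ such that $q'/p > 1$. This is possible since $q'/p = (p - 1/\delta)^{-1}$, which can be made as large as we like.
Thus, for any finite volume, $\mathbb{E}\bigl( |\mu_{\Lambda;\omega}^{x,y}|(F) \bigr)$ can be bounded by a constant multiple of $\sup_{E \in F} \mathbb{E}\bigl( | \langle x | \frac{1}{\hat H_{\Lambda;\omega} - E} | y \rangle |^s \bigr)$ raised to a certain power. Which multiple and which power depend only on the $\delta$-moments of the potential and the uniform bound on the conditional distributions $\rho_\omega^x$. By the vague convergence argument outlined at the start of the proof, this proves the theorem.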
B. A Fractional Moment Bound

The regularity conditions $R_1(\tau)$ and $R_2(s)$ have been used to give a priori estimates of certain fractional moments. Such fractional moment bounds are properties of the general class of operators with diagonal disorder. Hence, throughout this appendix, we consider random operators $H_\omega$ on $\ell^2(T)$ of the form
$$ H_\omega = T_0 + \lambda V_\omega, \tag{B.1} $$
where $T_0$ is an arbitrary bounded self-adjoint operator and $V_\omega$ is a random potential for which $V_\omega(x)$ are independent random variables ($T$ is any countable set).
Lemma B.1. Let $H_\omega$ be a random operator given by Eq. (B.1) such that for each $x$ the probability distribution of the potential $V_\omega(x)$ satisfies $R_1(\tau)$ for some fixed $\tau > 0$ with constants uniform in $x$. Then there exists $\kappa_\tau < \infty$ such that for any finite subset $\Lambda$ of $T$, any $x, y \in \Lambda$, any $z \in \mathbb{C}$, and any $s \in (0, \tau)$,
$$ \mathbb{E}\Bigl( \Bigl| \langle x | \frac{1}{H_{\Lambda;\omega} - z} | y \rangle \Bigr|^s \,\Big|\, \{V(u)\}_{u \in \Lambda \setminus \{x, y\}} \Bigr) \le \frac{\tau}{\tau - s}\, \frac{(4\kappa_\tau)^{s/\tau}}{\lambda^s}. \tag{B.2} $$
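Proof. Let us first consider $z = E \in \mathbb{R}$. For such energies Eq. (B.2) is a consequence of a Wegner type estimate on the 2-dimensional subspace spanned by $|x\rangle$, $|y\rangle$. The key is to determine the correct expression for the dependence of $\langle x | \frac{1}{H_{\Lambda;\omega} - E} | y \rangle$ on $V_\omega(x)$ and $V_\omega(y)$. Such an expression is given by the “Krein formula”:
$$ \langle x | \frac{1}{H_{\Lambda;\omega} - E} | y \rangle = \Bigl\langle \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \Bigl( [A]^{-1} + \lambda \begin{pmatrix} V_\omega(x) & 0 \\ 0 & V_\omega(y) \end{pmatrix} \Bigr)^{-1} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \Bigr\rangle, \tag{B.3} $$
where $[A]$ is a $2 \times 2$ matrix whose entries do not depend on $V_\omega(x)$ or $V_\omega(y)$. In fact,
$$ [A] = \begin{pmatrix} \langle x | \frac{1}{\widehat H_{\Lambda;\omega} - E} | x \rangle & \langle x | \frac{1}{\widehat H_{\Lambda;\omega} - E} | y \rangle \\ \langle y | \frac{1}{\widehat H_{\Lambda;\omega} - E} | x \rangle & \langle y | \frac{1}{\widehat H_{\Lambda;\omega} - E} | y \rangle \end{pmatrix}, \tag{B.4} $$
where $\widehat H_{\Lambda;\omega}$ denotes the operator obtained from $H_{\Lambda;\omega}$ by setting $V_\omega(x)$ and $V_\omega(y)$ equal to zero. The regularity condition $R_1(\tau)$ implies a Wegner type estimate:
$$ \mathrm{Prob}\Bigl( \Bigl\| \Bigl( [A]^{-1} + \lambda \begin{pmatrix} V_\omega(x) & 0 \\ 0 & V_\omega(y) \end{pmatrix} \Bigr)^{-1} \Bigr\| > t \,\Big|\, \{V_\omega(u)\}_{u \ne x, y} \Bigr) \le \frac{4\kappa_\tau}{(\lambda t)^\tau}, \tag{B.5} $$
where $\kappa_\tau$ is any finite number such that for every $v \in T$, $a \in \mathbb{R}$, and $\epsilon > 0$,
$$ \mathrm{Prob}\bigl( V_\omega(v) \in (a - \epsilon, a + \epsilon) \bigr) \le \kappa_\tau \epsilon^\tau. \tag{B.6} $$
The desired bound (B.2) follows easily from Eq. (B.5). (The factor, 4, on the right hand side of (B.5) arises as the square of the “volume” of the region $\{x, y\}$. In the case $x = y$, we could replace this factor by 1.)
Although the Krein formula (B.3) is true when $E$ is replaced by any $z \in \mathbb{C}$, the resulting matrix $[A]$ may not be normal if $z \notin \mathbb{R}$. (The resolvent, $\frac{1}{H - z}$, is normal. However, given an orthogonal projection, $P$, the operator $P \frac{1}{H - z} P$ may not be normal!) Yet, the Wegner-like estimate (B.5) holds only when $[A]$ is a normal matrix. At first, this seems to be an obstacle to the extension of (B.2) to all values of $z$. However, once the inequality is known for real values of $z$, it follows for all $z \in \mathbb{C}$ from analytic properties of the resolvent. Specifically, the function
$$ \phi(z) = \Bigl| \langle x | \frac{1}{H_{\Lambda;\omega} - z} | y \rangle \Bigr|^s \tag{B.7} $$
is sub-harmonic in the upper and lower half planes and decays as $z \to \infty$. Hence, $\phi(z)$ is dominated by the convolution of its boundary values with a Poisson kernel:
$$ \phi(E + i\eta) \le \int \phi(\tilde E)\, \frac{|\eta|}{(\tilde E - E)^2 + \eta^2}\, \frac{d\tilde E}{\pi}. \tag{B.8} $$
By Fubini’s theorem and Eq. (B.2) for $\tilde E \in \mathbb{R}$, (B.2) is seen to hold for all $z \in \mathbb{C}$.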
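The step from the tail estimate (B.5) to the moment bound (B.2) is the standard layer-cake computation: if $\mathrm{Prob}(X > t) \le C t^{-\tau}$ then $\mathbb{E}(X^s) = \int_0^\infty \mathrm{Prob}(X^s > u)\, du \le \int_0^\infty \min(1, C u^{-\tau/s})\, du = \frac{\tau}{\tau - s} C^{s/\tau}$, and $C = 4\kappa_\tau/\lambda^\tau$ reproduces the right side of (B.2). The Python sketch below evaluates this bound and compares it with an exactly computable single-site analogue. That analogue is an illustrative assumption and not part of the proof: $V$ uniform on $[0, 1]$, $\lambda = 1$, $X = 1/|V - E|$, for which (B.6) holds with $\tau = 1$, $\kappa_1 = 2$ (and no factor 4, since only one site is involved). The function names are hypothetical.

```python
def moment_from_tail(C, tau, s):
    """Layer-cake bound: Prob(X > t) <= C * t**(-tau) implies
    E[X**s] <= tau / (tau - s) * C**(s / tau) for 0 < s < tau."""
    if not 0 < s < tau:
        raise ValueError("need 0 < s < tau")
    return tau / (tau - s) * C ** (s / tau)


def exact_uniform_moment(E, s):
    """E[|V - E|**(-s)] for V uniform on [0, 1], with 0 <= E <= 1, 0 < s < 1.

    Obtained by integrating |v - E|**(-s) over [0, E] and [E, 1] separately.
    """
    return (E ** (1 - s) + (1 - E) ** (1 - s)) / (1 - s)


def worst_case_ratio(s, grid=101):
    """Largest value of exact / bound over a grid of energies in [0, 1]."""
    bound = moment_from_tail(C=2.0, tau=1.0, s=s)  # kappa_1 = 2 for uniform V
    return max(exact_uniform_moment(i / (grid - 1), s) / bound
               for i in range(grid))
```

In this toy case the bound is attained at the midpoint $E = 1/2$, so the constant $\tau/(\tau - s)$ cannot be improved by this argument alone.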
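The “all for one” principle mentioned previously is actually a simple consequence of Lemma B.1.
Lemma B.2. Let $H_\omega$ be a random operator as described in Lemma B.1, and suppose that there is a distance function dist on $T$ such that for some $s < \tau$ and some $z \in \mathbb{C}$
$$ \mathbb{E}\Bigl( \Bigl| \langle x | \frac{1}{H_\omega - z} | y \rangle \Bigr|^s \Bigr) \le A(s)\, e^{-\mu(s)\, \mathrm{dist}(x, y)}, \tag{B.9} $$
for every $x, y \in T$. Then, in fact, (B.9) holds, with modified constants $A(r)$ and $\mu(r)$, when $s$ is replaced by any $r < \tau$.
Proof. Note that given $r, s > 0$ with $r < s < \tau$,
$$ \begin{aligned}
\mathbb{E}\Bigl( \Bigl| \langle x | \frac{1}{H_{\Lambda;\omega} - E} | y \rangle \Bigr|^r \Bigr)
&\le \mathbb{E}\Bigl( \Bigl| \langle x | \frac{1}{H_{\Lambda;\omega} - E} | y \rangle \Bigr|^s \Bigr)^{\frac{r}{s}}, \\
\mathbb{E}\Bigl( \Bigl| \langle x | \frac{1}{H_{\Lambda;\omega} - E} | y \rangle \Bigr|^s \Bigr)
&\le \mathbb{E}\Bigl( \Bigl| \langle x | \frac{1}{H_{\Lambda;\omega} - E} | y \rangle \Bigr|^r \Bigr)^{\frac{t-s}{t-r}}\, \mathbb{E}\Bigl( \Bigl| \langle x | \frac{1}{H_{\Lambda;\omega} - E} | y \rangle \Bigr|^t \Bigr)^{\frac{s-r}{t-r}} \\
&\le \Bigl( \frac{(4\kappa_\tau)^{t/\tau}}{\lambda^t} \Bigr)^{\frac{s-r}{t-r}}\, \mathbb{E}\Bigl( \Bigl| \langle x | \frac{1}{H_{\Lambda;\omega} - E} | y \rangle \Bigr|^r \Bigr)^{\frac{t-s}{t-r}},
\end{aligned} \tag{B.10} $$
where $t$ is any number with $s < t < \tau$.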
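Both inequalities used in the proof of Lemma B.2 are general facts about moments of a nonnegative random variable: the Jensen inequality $\mathbb{E}(X^r) \le \mathbb{E}(X^s)^{r/s}$ and the Hölder interpolation $\mathbb{E}(X^s) \le \mathbb{E}(X^r)^{(t-s)/(t-r)}\, \mathbb{E}(X^t)^{(s-r)/(t-r)}$ for $r < s < t$. As a small illustration, the following Python sketch checks them on an empirical distribution; the sample values are arbitrary stand-ins for $|\langle x | (H - E)^{-1} | y \rangle|$ and the function names are hypothetical.

```python
def moment(samples, p):
    """Empirical p-th moment E[X**p] of a list of positive numbers."""
    return sum(x ** p for x in samples) / len(samples)


def jensen_bound(samples, r, s):
    """Upper bound on E[X**r] in terms of E[X**s], valid for 0 < r < s."""
    return moment(samples, s) ** (r / s)


def interpolation_bound(samples, r, s, t):
    """Upper bound on E[X**s] from E[X**r] and E[X**t], valid for r < s < t."""
    theta = (t - s) / (t - r)  # weight on the r-moment
    return moment(samples, r) ** theta * moment(samples, t) ** (1 - theta)


# Arbitrary heavy-tailed-looking sample, chosen only for illustration.
samples = [0.01, 0.5, 3.0, 40.0, 1000.0]
r, s, t = 0.1, 0.3, 0.5
print(moment(samples, r), "<=", jensen_bound(samples, r, s))
print(moment(samples, s), "<=", interpolation_bound(samples, r, s, t))
```

In the lemma the $t$-moment is controlled uniformly by Lemma B.1, so exponential decay of the $r$-moment transfers to the $s$-moment with the decay rate reduced by the factor $(t-s)/(t-r)$.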
C. Decoupling Inequalities

C.1. Decoupling inequalities for Green functions. The condition $R_2(s)$ plays a crucial role in several of the arguments presented in this paper. It has been used to bound expectations of products of Green functions in terms of products of expectations. In this section we demonstrate the validity of the necessary bounds. The main result is the following:
Lemma C.1. Let $H_\omega$ be a random operator given by Eq. (B.1), with an $s$-regular distribution of the potential $V_\omega(x)$. Then
1. For any $\Omega_1, \Omega_2 \subset T$, any $x, y \in \Omega_1$, and any $u, v \in \Omega_2$,
$$ \mathbb{E}\bigl( |G_{\Omega_1}(x, y; z)|^s\, |G_{\Omega_2}(u, v; z)|^s \bigr) \le \frac{C_s}{\lambda^s}\, \mathbb{E}\bigl( |G_{\Omega_1}(x, y; z)|^s \bigr). \tag{C.1} $$
2. For any $\Omega_1 \cap \Omega_2 = \emptyset$, $x, u \in \Omega_1$, $v, y \in \Omega_2$, and $\Omega_3 \subset T$,
$$ \mathbb{E}\bigl( |G_{\Omega_1}(x, u; z)|^s\, |G_{\Omega_3}(u, v; z)|^s\, |G_{\Omega_2}(v, y; z)|^s \bigr) \le \frac{C_s}{\lambda^s}\, \mathbb{E}\bigl( |G_{\Omega_1}(x, u; z)|^s \bigr)\, \mathbb{E}\bigl( |G_{\Omega_2}(v, y; z)|^s \bigr). \tag{C.2} $$
Lemma C.1 is a consequence of the conditional expectation bound (B.2), the Krein formula (B.3), and the following:
Lemma C.2. Let $V_1, V_2$ be independent real valued random variables which satisfy $R_2(s)$ for some $s > 0$. Then there exists $D_s^{(2)} > 0$ such that
$$ \mathbb{E}\bigl( |F(V_1, V_2)|^s\, |G(V_1, V_2)|^s \bigr) \le D_s^{(2)}\, \mathbb{E}\bigl( |F(V_1, V_2)|^s \bigr)\, \mathbb{E}\bigl( |G(V_1, V_2)|^s \bigr), \tag{C.3} $$
where $F$ and $G$ are arbitrary functions of the form
$$ F(V_1, V_2) = \frac{1}{L_1(V_1, V_2)}, \tag{C.4} $$
$$ G(V_1, V_2) = \frac{L_2(V_1, V_2)}{L_3(V_1, V_2)}, \tag{C.5} $$
with $\{L_i\}$ functions which are linear in each variable separately. In fact, we may take $D_s^{(2)} = D_{s;1} D_{s;2}$, where, for $j = 1, 2$, $D_{s;j}$ is the decoupling constant for $V_j$.
Proof. Let $f(V)$ and $g(V)$ be two functions of the appropriate form for the decoupling lemma. Then, with $j = 1, 2$,
$$ \mathbb{E}\bigl( |f(V_j)|^s\, |g(V_j)|^s \bigr) \le D_{s;j}\, \mathbb{E}\bigl( |f(\tilde V_j)|^s\, |g(V_j)|^s \bigr), \tag{C.6} $$
where $\tilde V_j$ indicates an independent variable distributed identically to $V_j$.
Now, if $F$ and $G$ are functions of 2 variables of the given form, then at fixed values of $V_2$, they satisfy the 1 variable decoupling lemma, so
$$ \mathbb{E}\bigl( |F(V_1, V_2)|^s\, |G(V_1, V_2)|^s \bigr) \le D_{s;1}\, \mathbb{E}\bigl( |F(\tilde V_1, V_2)|^s\, |G(V_1, V_2)|^s \bigr). \tag{C.7} $$
For fixed values of $\tilde V_1$ and $V_1$, $F(\tilde V_1, V_2)$ and $G(V_1, V_2)$ (as functions of $V_2$) are again of the correct form to apply the 1 variable decoupling lemma. Thus,
$$ \begin{aligned}
\mathbb{E}\bigl( |F(V_1, V_2)|^s\, |G(V_1, V_2)|^s \bigr) &\le D_{s;1} D_{s;2}\, \mathbb{E}\bigl( |F(\tilde V_1, \tilde V_2)|^s\, |G(V_1, V_2)|^s \bigr) \\
&= D_{s;1} D_{s;2}\, \mathbb{E}\bigl( |F(V_1, V_2)|^s \bigr)\, \mathbb{E}\bigl( |G(V_1, V_2)|^s \bigr).
\end{aligned} \tag{C.8} $$

C.2. A condition for the validity of $R_2(s)$. Decoupling lemmas have been discussed already in references [11, 13, 8]. Though these contain results similar to those required here, they do not provide the exact condition used in this work. Hence, we briefly present an elementary condition under which $R_2(s)$ is satisfied. The following discussion is by no means exhaustive. Rather, we simply wish to show that the condition $R_2(s)$ is not devoid of meaningful examples.
Lemma C.3. Let $\rho$ be a measure with bounded support which satisfies $R_1(\tau)$. Then for any $s < \frac{\tau}{4}$, $\rho$ satisfies $R_2(s)$.
Proof. For each $s > 0$, we define
$$ \phi_s(z) = \int \frac{1}{|V - z|^s}\, \rho(dV), \tag{C.9} $$
$$ \psi_s(z, w) = \int \frac{|V - z|^s}{|V - w|^s}\, \rho(dV), \tag{C.10} $$
$$ \gamma_s(z, w, \zeta) = \int \frac{|V - z|^s}{|V - w|^s\, |V - \zeta|^s}\, \rho(dV). \tag{C.11} $$
Property $R_2(s)$ amounts to the statement that
$$ \sup_{z, w, \zeta \in \mathbb{C}} \frac{\gamma_s(z, w, \zeta)}{\phi_s(\zeta)\, \psi_s(z, w)} < \infty. \tag{C.12} $$
In fact, if we let
$$ F_s(z) = \frac{\sqrt{\phi_{2s}(z)}}{\phi_s(z)}, \tag{C.13} $$
$$ G_s(z, w) = \frac{\sqrt{\psi_{2s}(z, w)}}{\psi_s(z, w)}, \tag{C.14} $$
then by the Cauchy–Schwarz inequality, it suffices to show that $F_s$ and $G_s$ are uniformly bounded. However this is elementary since $F_s$ and $G_s$ are continuous functions which are easily shown to have finite limits at infinity.
Acknowledgements. Questions asked by Frédéric Klopp led us to streamline the original derivation of the results in Sect. 3. We thank him for this and other stimulating discussions. This work was supported in part by the NSF Grant PHY-9971149 (MA). Jeff Schenker thanks the NSF for financial support under a Graduate Research Fellowship, and Dirk Hundertmark thanks the Deutsche Forschungsgemeinschaft for financial support under grant Hu 773/1-1.
References
1. Anderson, P.W.: Absence of diffusion in certain random lattices. Phys. Rev. 109, 1492 (1958)
2. Mott, N. and Twose, W.: The theory of impurity conduction. Adv. Phys. 10, 107 (1961)
3. Martinelli, F. and Scoppola, E.: Introduction to the mathematical theory of Anderson localization. Rivista del Nuovo Cimento 10, no. 10 (1987)
4. Halperin, B.I.: Quantized Hall conductance, current-carrying edge states, and the existence of extended states in a two-dimensional disordered potential. Phys. Rev. B 25, 2185 (1982)
5. Niu, Q., Thouless, D.J. and Wu, Y.S.: Quantized Hall conductance as a topological invariant. Phys. Rev. B 31, 3372 (1985)
6. Avron, J.E., Seiler, R. and Simon, B.: Charge deficiency, charge transport and comparison of dimensions. Commun. Math. Phys. 159, 399 (1994)
7. Bellissard, J., van Elst, A. and Schulz-Baldes, H.: The noncommutative geometry of the quantum Hall effect. J. Math. Phys. 35, 5373 (1994)
8. Aizenman, M. and Graf, G.M.: Localization bounds for an electron gas. J. Phys. A: Math. Gen. 31, 6783 (1998)
9. Figotin, A. and Klein, A.: Midgap defect modes in dielectric and acoustic media. SIAM J. Appl. Math. 58, no. 6, 1748–1773 (1998) (electronic)
10. Fröhlich, J. and Spencer, T.: Absence of diffusion in the Anderson tight binding model for large disorder or low energy. Commun. Math. Phys. 88, 151 (1983)
11. Aizenman, M. and Molchanov, S.: Localization at large disorder and at extreme energies: An elementary derivation. Commun. Math. Phys. 157, 245 (1993)
12. Simon, B. and Wolff, T.: Singular continuous spectrum under rank one perturbations and localization for random Hamiltonians. Comm. Pure Appl. Math. 39, no. 1, 75 (1986)
13. Aizenman, M.: Localization at weak disorder: Some elementary bounds. Rev. Math. Phys. 6, 1163 (1994)
14. Minami, N.: Local fluctuation of the spectrum of a multidimensional Anderson tight binding model. Commun. Math. Phys. 177, 709 (1996)
15. Pastur, L. and Figotin, A.: Spectra of random and almost-periodic operators. Berlin: Springer-Verlag, 1992
16. Barbaroux, J.M., Combes, J.-M. and Hislop, P.D.: Localization near band edges for random Schrödinger operators. Helv. Phys. Acta 70, 16 (1997). Papers honouring the 60th birthday of Klaus Hepp and of Walter Hunziker, Part II (Zürich, 1995)
17. Kirsch, W., Stollmann, P. and Stolz, G.: Localization for random perturbations of periodic Schrödinger operators. Rand. Oper. Stoch. Eq. 6, 241 (1998)
18. Stollmann, P.: Lifshitz asymptotics via linear coupling of disorder. Preprint, 1999
19. Klopp, F.: Internal Lifshits tails for random perturbations of periodic Schrödinger operators. Duke Math. J. 98, 335 (1999)
20. Combes, J.-M. and Hislop, P.D.: Localization properties of continuous disordered systems in d-dimensions. In: Mathematical quantum theory. II. Schrödinger operators (Vancouver, BC, 1993). CRM Proc. Lecture Notes 8, Providence, RI: Am. Math. Soc., 1995, p. 213
21. Figotin, A. and Klein, A.: Localization of electromagnetic and acoustic waves in random media. Lattice models. J. Stat. Phys. 76, 985 (1994)
22. Carmona, R., Klein, A. and Martinelli, F.: Anderson localization for Bernoulli and other singular potentials. Commun. Math. Phys. 108, no. 1, 41 (1987)
23. Hammersley, J.M.: Percolation processes II. The connective constant. Proc. Camb. Phil. Soc. 53, 642 (1957)
24. Dobrushin, R.L. and Shlosman, S.B.: Completely analytical interactions: Constructive description. J. Stat. Phys. 46, no. 5–6, 983–1014 (1987)
25. Simon, B.: Correlation inequalities and the decay of correlations in ferromagnets. Commun. Math. Phys. 77, no. 2, 111 (1980)
26. Lieb, E.H.: A refinement of Simon’s correlation inequality. Commun. Math. Phys. 77, no. 2, 127 (1980)
27. Aizenman, M. and Newman, C.M.: Tree graph inequalities and critical behavior in percolation models. J. Stat. Phys. 36, 107 (1984)
28. von Dreifus, H.: Bounds of the critical exponents of disordered ferromagnetic models. Ann. Inst. Henri Poincaré 55, 657 (1991)
29. Benjamini, I., Lyons, R., Peres, Y. and Schramm, O.: Group-invariant percolation on graphs. Geom. Funct. Anal. 9, no. 1, 29 (1999)
30. Aizenman, M. and Simon, B.: Local Ward identities and the decay of correlations in ferromagnets. Commun. Math. Phys. 77, no. 2, 137 (1980)
31. Aizenman, M. and Holley, R.: Rapid convergence to equilibrium of stochastic Ising models in the Dobrushin Shlosman regime. In: Percolation theory and ergodic theory of infinite particle systems (Minneapolis, Minn., 1984–1985), 1, New York: Springer, 1987
32. Maes, C. and Shlosman, S.B.: Ergodicity of probabilistic cellular automata: a constructive criterion. Commun. Math. Phys. 135, no. 2, 233 (1991)
33. Kunz, H. and Souillard, B.: Sur le spectre des opérateurs aux différences finies aléatoires. Commun. Math. Phys. 78, 201 (1980)
34. Last, Y.: Quantum dynamics and decompositions of singular continuous spectra. J. Funct. Anal. 142, no. 2, 406 (1996)
35. Combes, J.-M. and Thomas, L.: Asymptotic behaviour of eigenfunctions for multiparticle Schrödinger operators. Commun. Math. Phys. 34, 251 (1973)
36. Simon, B.: Lifschitz tails for the Anderson model. J. Stat. Phys. 38, 65 (1985)
37. Germinet, F. and DeBièvre, S.: Dynamical localization for discrete and continuous random Schrödinger operators. Commun. Math. Phys. 194, no. 2, 323 (1998)
38. Damanik, D. and Stollmann, P.: Multi-scale analysis implies strong dynamical localization. Preprint, 1999; http://xxx.lanl.gov/abs/math-ph/9912002
39. Kotani, S.: Lyapunov exponents and spectra for one-dimensional random Schrödinger operators. In: Contemporary Mathematics (AMS), Vol. 50, Providence, RI: AMS, 1986

Communicated by B. Simon
Commun. Math. Phys. 224, 255 – 269 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Correlations Between Zeros and Supersymmetry
Pavel Bleher (1), Bernard Shiffman (2), Steve Zelditch (2)
(1) Department of Mathematical Sciences, IUPUI, Indianapolis, IN 46202, USA. E-mail: [email protected]
(2) Department of Mathematics, Johns Hopkins University, Baltimore, MD 21218, USA. E-mail: [email protected]; [email protected]
Received: 13 November 2000 / Accepted: 23 February 2001
To Joel Lebowitz on his 70th birthday

Abstract: In our previous work [BSZ2], we proved that the correlation functions for simultaneous zeros of random generalized polynomials have universal scaling limits and we gave explicit formulas for pair correlations in codimensions 1 and 2. The purpose of this paper is to compute these universal limits in all dimensions and codimensions. First, we use a supersymmetry method to express the n-point correlations as Berezin integrals. Then we use the Wick method to give a closed formula for the limit pair correlation function for the point case in all dimensions.

Research partially supported by NSF grants #DMS-9970625 (first author), #DMS-9800479 (second author), #DMS-0071358 (third author).

1. Introduction

This paper is a continuation of our articles [BSZ1, BSZ2, BSZ3] on the correlations between zeros of random holomorphic polynomials in $m$ complex variables and their generalization to holomorphic sections of positive line bundles $L \to M$ over general Kähler manifolds of dimension $m$ and their symplectic counterparts. These correlations are defined by the probability density $K^N_{nk}(z^1, \dots, z^n)$ of finding joint zeros of $k$ independent sections at the points $z^1, \dots, z^n \in M$ (see Sect. 2). To obtain universal quantities, we rescale the correlation functions in normal coordinates by a factor of $\sqrt{N}$. Our main result from [BSZ2, BSZ3] is that the (normalized) correlation functions have a universal scaling limit,
$$ \widetilde K^\infty_{nkm}(z^1, \dots, z^n) = \lim_{N \to \infty} K^N_{1k}(z^0)^{-n}\, K^N_{nk}\Bigl( z^0 + \frac{z^1}{\sqrt N}, \dots, z^0 + \frac{z^n}{\sqrt N} \Bigr), \tag{1} $$
which is independent of the manifold $M$, the line bundle $L$ and the point $z^0$; $\widetilde K^\infty_{nkm}$ depends only on the dimension $m$ of the manifold and the codimension $k$ of the zero set. The problem then arises of calculating these universal functions explicitly and analyzing their small distance and large distance behavior. In [BSZ1, BSZ2], we gave explicit formulas for the pair correlation functions $\widetilde K^\infty_{2km}(z^1, z^2)$ in codimensions $k = 1, 2$, respectively. The purpose of this paper is to complete these results by giving explicit formulas for $\widetilde K^\infty_{nkm}$ in all dimensions and codimensions.
Our first formula expresses the correlation as a supersymmetric (Berezin) integral involving the matrices $\Lambda(z)$, $A^\infty(z)$ used in our prior formulas, as well as a matrix $\Omega$ of fermionic variables described below.
Theorem 1.1. The limit n-point correlation functions are given by
$$ \widetilde K^\infty_{nkm}(z^1, \dots, z^n) = \frac{[(m-k)!]^n}{(m!)^n\, [\det A^\infty(z)]^k} \int \frac{1}{\det[I + \Lambda(z)\Omega]}\, d\eta. $$
Here, $\Omega$ is the $nkm \times nkm$ matrix
$$ \Omega = \bigl( \Omega^{pjq}_{p'j'q'} \bigr) = \bigl( \delta^p_{p'}\, \delta^q_{q'}\, \eta^p_j\, \bar\eta^p_{j'} \bigr) \qquad (1 \le p, p' \le n,\ 1 \le j, j' \le k,\ 1 \le q, q' \le m), \tag{2} $$
where the $\eta^p_j, \bar\eta^p_j$ are anti-commuting (fermionic) variables, and $d\eta = \prod_{j,p} d\eta^p_j\, d\bar\eta^p_j$.
The integral in Theorem 1.1 is a Berezin integral, which is evaluated by simply taking the coefficient of the top degree form of the integrand $\det[I + \Lambda(z)\Omega]^{-1}$ (see Sect. 3). Hence the formula in Theorem 1.1 is a purely algebraic expression in the coefficients of $\Lambda(z)$ and $A^\infty(z)$, which are given in terms of the Szegö kernel of the Heisenberg group and its derivatives (see Sect. 2). We remark that supersymmetric methods have also been applied to limit correlations in random matrix theory by Zirnbauer [Zi].
In the case $n = 2$, $\widetilde K^\infty_{2km}(z^1, z^2)$ depends only on the distance between the points $z^1, z^2$, since it is universal and hence invariant under rigid motions. Hence it may be written as:
$$ \widetilde K^\infty_{2km}(z^1, z^2) = \kappa_{km}(|z^1 - z^2|). \tag{3} $$
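As a sanity check on the recipe "take the coefficient of the top degree form", the one-point case $n = 1$ of Theorem 1.1 can be evaluated mechanically. For $n = 1$, formulas (8) and (9) of Sect. 2 give $A^\infty = 1$ and the identity for the second matrix in the determinant, while the normalization in (1) forces the correlation to equal 1, so the theorem predicts that the Berezin integral of $\det[I + \omega]^{-m}$ equals $m!/(m-k)!$, where $\omega$ is the $k \times k$ matrix with entries $\eta_j \bar\eta_{j'}$. The Python sketch below is a minimal, unoptimized Grassmann algebra written only for this check; the data layout and all function names are assumptions of this illustration, and the sign normalization $I(\prod_j \bar\eta_j \eta_j) = 1$ is the one stated in Sect. 3.

```python
from itertools import permutations, product


def inversions(seq):
    """Number of out-of-order pairs; gives the sign of the sorting permutation."""
    return sum(1 for i in range(len(seq)) for j in range(i + 1, len(seq))
               if seq[i] > seq[j])


def gmul(a, b):
    """Product of two Grassmann elements.

    An element is a dict: sorted tuple of generator indices -> coefficient.
    """
    out = {}
    for (ma, ca), (mb, cb) in product(a.items(), b.items()):
        if set(ma) & set(mb):
            continue  # a repeated generator gives zero
        merged = ma + mb
        key = tuple(sorted(merged))
        out[key] = out.get(key, 0) + (-1) ** inversions(merged) * ca * cb
    return out


def gadd(a, b, scale=1):
    """Return a + scale * b."""
    out = dict(a)
    for mono, c in b.items():
        out[mono] = out.get(mono, 0) + scale * c
    return out


def one_point_berezin(k, m):
    """Berezin integral of det(I + omega)**(-m), omega[j][j'] = eta_j * etabar_j'."""
    one = {(): 1}
    eta = [{(2 * j,): 1} for j in range(k)]         # eta_j    <-> index 2j
    etabar = [{(2 * j + 1,): 1} for j in range(k)]  # etabar_j <-> index 2j+1
    mat = [[gadd(one if j == jp else {}, gmul(eta[j], etabar[jp]))
            for jp in range(k)] for j in range(k)]
    # Leibniz formula; legitimate because all entries are even, hence commute.
    det = {}
    for perm in permutations(range(k)):
        term = {(): (-1) ** inversions(perm)}
        for j in range(k):
            term = gmul(term, mat[j][perm[j]])
        det = gadd(det, term)
    # 1/det via the finite geometric series in the nilpotent part x = det - 1.
    x = gadd(det, one, scale=-1)
    inverse, power = dict(one), dict(one)
    for n in range(1, k + 1):
        power = gmul(power, x)
        inverse = gadd(inverse, power, scale=(-1) ** n)
    integrand = dict(one)
    for _ in range(m):
        integrand = gmul(integrand, inverse)
    top = tuple(range(2 * k))  # eta_0 etabar_0 eta_1 etabar_1 ...
    # Reordering each pair to etabar_j eta_j costs one sign per pair.
    return (-1) ** k * integrand.get(top, 0)
```

For example, `one_point_berezin(2, 3)` should return $3!/1! = 6$, so that the prefactor $(m-k)!/m!$ in Theorem 1.1 brings the one-point value back to 1.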
We refer to [BSZ2] for details. In [BSZ1] we gave an explicit formula for $\kappa_{1m}$ (using the “Poincaré–Lelong formula”), and in [BSZ2] we evaluated $\kappa_{2m}$. (The pair correlation function $\kappa_{11}(r)$ was first determined by Hannay [Ha] in the case of zeros of SU(2) polynomials in one complex variable.) In Sect. 3.1, we use Theorem 1.1 to give the following new Berezin integral formula for $\kappa_{km}$:
Corollary 1.2. The pair correlation functions are given by
$$ \kappa_{km}(r) = \frac{(m-k)!^2}{m!^2\, (1 - e^{-r^2})^k} \int \frac{1}{\Phi\, \Psi^{m-1}}\, d\eta, $$
where
$$ \Phi = \det\bigl[ I + P(\Omega_1 + \Omega_2) + T\, \Omega_1 \Omega_2 \bigr], \qquad P = 1 - \frac{r^2 e^{-r^2}}{1 - e^{-r^2}}, \qquad T = 1 - e^{-r^2} - \frac{r^4 e^{-r^2}}{1 - e^{-r^2}}, $$
$$ \Psi = \det\bigl[ I + \Omega_1 + \Omega_2 + (1 - e^{-r^2})\, \Omega_1 \Omega_2 \bigr]. $$
Here, $\Omega_1, \Omega_2$ are the $k \times k$ matrices
$$ \Omega_p = \bigl( \eta^p_j\, \bar\eta^p_{j'} \bigr)_{1 \le j, j' \le k}, \qquad p = 1, 2. $$
We then expand the formula as a (finite) series (32), which we use to compute explicit formulas for $\kappa_{km}$. The most vivid case is when $k = m$, where the simultaneous zeros of $k$-tuples of sections almost surely form a set of discrete points. Our second result is an explicit formula for the point pair correlation functions $\kappa_{mm}$ in all dimensions:
Theorem 1.3. The point pair correlation functions are given by
$$ \kappa_{mm}(r) = \frac{ m(1 - v^{m+1})(1 - v) + r^2 (2m + 2)(v^{m+1} - v) + r^4 \bigl[ v^{m+1} + v^m + (\{m+1\}v + 1)(v^m - v)/(v - 1) \bigr] }{ m (1 - v)^{m+2} }, \qquad v = e^{-r^2}, \tag{4} $$
for $m \ge 1$. For small values of $r$, we have
$$ \kappa_{mm}(r) = \frac{m+1}{4}\, r^{4-2m} + O(r^{8-2m}), \qquad \text{as } r \to 0. \tag{5} $$
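Formula (4) is easy to evaluate numerically, and doing so gives a quick check of the asymptotics (5) and of the expected limit $\kappa_{mm}(r) \to 1$ at large distance. The following Python sketch is one direct transcription of (4); the function name is hypothetical, and very small $r$ should be avoided because the numerator of (4) suffers from cancellation in floating point.

```python
import math


def kappa_mm(m, r):
    """Point pair correlation kappa_mm(r) from formula (4), for integer m >= 1, r > 0."""
    v = math.exp(-r * r)
    ratio = (v ** m - v) / (v - 1.0)  # equals 0 when m = 1
    numerator = (
        m * (1 - v ** (m + 1)) * (1 - v)
        + r ** 2 * (2 * m + 2) * (v ** (m + 1) - v)
        + r ** 4 * (v ** (m + 1) + v ** m + ((m + 1) * v + 1) * ratio)
    )
    return numerator / (m * (1 - v) ** (m + 2))


def leading_term(m, r):
    """Leading small-r behavior (m + 1)/4 * r**(4 - 2m) from formula (5)."""
    return (m + 1) / 4 * r ** (4 - 2 * m)


for m in (1, 2, 3):
    print(m, kappa_mm(m, 0.1), leading_term(m, 0.1), kappa_mm(m, 6.0))
```

At $r = 0.1$ the computed values should sit close to $r^2/2$, $3/4$ and $r^{-2}$ for $m = 1, 2, 3$ respectively, in line with (5).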
We prove Theorem 1.3 in Sect. 4 without making use of supersymmetry. Our proof uses instead the Wick formula expansion of the Gaussian integral representation of the correlation.
It is interesting to observe the dimensional dependence of the short distance behavior of $\kappa_{mm}(r)$. When $m = 1$, $\kappa_{mm}(r) \to 0$ as $r \to 0$ and one has “zero repulsion”. When $m = 2$, $\kappa_{mm}(r) \to 3/4$ as $r \to 0$ and one has a kind of neutrality. With $m \ge 3$, $\kappa_{mm}(r) \to \infty$ as $r \to 0$ and there is some kind of attraction between zeros. More precisely, in dimensions greater than 2, one is more likely to find a zero at a small distance $r$ from another zero than at a small distance $r$ from a given point; i.e., zeros tend to clump together in high dimensions. Indeed, in all dimensions, the probability of finding another zero in a ball of small scaled radius $r$ about another zero is $\sim r^4$. We give in Fig. 1 a graph of $\kappa_{33}$; graphs of $\kappa_{11}$ and $\kappa_{22}$ can be found in [BSZ2].

[Fig. 1. The limit pair correlation function $\kappa_{33}$ (horizontal axis: $r$)]

Remark. Theorem 1.3 says that the expected
number of zeros in the punctured ball of scaled radius $r$ about a given zero is $\sim \int_0^r \kappa_{mm}(t)\, t^{2m-1}\, dt \sim r^4$. Also, one can show that for balls of small scaled radii $r$, the expected number of zeros approximates the probability of finding a zero.

2. Background

We begin by recalling the scaling limit zero correlation formula of [BSZ2]. Consider a random polynomial $s$ of degree $N$ in $m$ variables. More generally, $s$ can be a random section of the $N$th power $L^N$ of a positive line bundle $L$ on an $m$-dimensional compact complex manifold $M$ (or a symplectic $2m$-manifold; see [SZ3, BSZ3]). We give $M$ the Kähler metric induced by the curvature form $\omega$ of the line bundle $L$. The probability measure on the space of sections is the complex Gaussian measure induced by the Hermitian inner product
$$ \langle s_1, \bar s_2 \rangle = \int_M h^N(s_1, \bar s_2)\, dV_M, $$
where $h^N$ is the metric on $L^N$ and $dV_M$ is the volume measure induced by $\omega$. (For further discussion of the topics of this section, see [BSZ2].) In particular, if $L$ is the hyperplane section bundle over $\mathbb{CP}^m$, then random sections of $L^N$ are polynomials of degree $N$ in $m$ variables of the form
$$ P(z_1, \dots, z_m) = \sum_{|J| \le N} \frac{C_J}{\sqrt{(N - |J|)!\, j_1! \cdots j_m!}}\, z_1^{j_1} \cdots z_m^{j_m} \qquad (J = (j_1, \dots, j_m)), $$
where the $C_J$ are i.i.d. Gaussian random variables with mean 0; they are called “SU(m + 1)-polynomials”.
We consider $k$-tuples $s = (s_1, \dots, s_k)$ of i.i.d. random polynomials (or sections) $s_j$ ($1 \le k \le m$). The zero correlation density $K^N_{nk}(z^1, \dots, z^n)$ is defined as the expected joint volume density of zeros of sections of $L^N$ at the points $z^1, \dots, z^n$. In the case $k = m$, where the zero sets are discrete points, $K^N_{nk}(z^1, \dots, z^n)$ can be interpreted as the probability density of finding simultaneous zeros at these points. For instance, the zero density function $K^N_{1k}(z) \approx c_k N^k$ as $N \to \infty$, where $c_k$ is independent of the point $z$ (see [SZ1]).
In [BSZ2, BSZ3], we gave generalized forms of the Kac-Rice formula [Kac, Ri], which we used to express $K^N_{nk}(z^1, \dots, z^n)$ in terms of the joint probability distribution (JPD) of the random variables $s(z^1), \dots, s(z^n), \nabla s(z^1), \dots, \nabla s(z^n)$. We then showed that the scaling limit correlation function $\widetilde K^\infty_{nkm}$ given by (1) can be expressed in terms of the scaling limit of the JPD. The central result of [BSZ2] is that the limit JPD is universal and can be expressed in terms of the Szegö kernel $\Pi^H_1$ for the Heisenberg group:
$$ \Pi^H_1(z, \theta; w, \varphi) = \frac{1}{\pi^m}\, e^{i(\theta - \varphi + \Im z \cdot \bar w) - \frac12 |z - w|^2} = \frac{1}{\pi^m}\, e^{i(\theta - \varphi) + z \cdot \bar w - \frac12 (|z|^2 + |w|^2)}. \tag{6} $$
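To be precise, the limit JPD is a complex Gaussian measure with covariance matrix $\Delta^\infty$ given by:
$$\Delta^\infty(z)=\frac{m!}{\pi^m}\begin{pmatrix}A^\infty(z)&B^\infty(z)\\ B^\infty(z)^*&C^\infty(z)\end{pmatrix},\tag{7}$$
where
$$\begin{aligned}
\pi^{-m}A^\infty(z)^{p}_{p'}&=\Pi_1^H(z^p,0;z^{p'},0),\\
\pi^{-m}B^\infty(z)^{p}_{p'q'}&=\nabla_{\partial/\partial\bar z^{p'}_{q'}}\Pi_1^H(z^p,0;z^{p'},0)=(z^p_{q'}-z^{p'}_{q'})\,\Pi_1^H(z^p,0;z^{p'},0),\\
\pi^{-m}C^\infty(z)^{pq}_{p'q'}&=\nabla^2_{\partial/\partial z^p_q\,\partial/\partial\bar z^{p'}_{q'}}\Pi_1^H(z^p,0;z^{p'},0)=\bigl[\delta_{qq'}+(\bar z^{p'}_q-\bar z^{p}_q)(z^p_{q'}-z^{p'}_{q'})\bigr]\Pi_1^H(z^p,0;z^{p'},0).
\end{aligned}\tag{8}$$
(Here $A^\infty$, $B^\infty$, $C^\infty$ are $n\times n$, $n\times mn$, $mn\times mn$ matrices, respectively.) In the sequel, we shall use the matrix
$$\Lambda^\infty(z):=C^\infty(z)-B^\infty(z)^*A^\infty(z)^{-1}B^\infty(z).\tag{9}$$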
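We note that $A^\infty(z)$ and $\Lambda^\infty(z)$ are positive definite whenever $z^1,\dots,z^n$ are distinct points. In [BSZ2], we gave the following key formula for the limit correlation functions:
$$K^\infty_{nkm}(z^1,\dots,z^n)=\frac{[(m-k)!]^n}{(m!)^n[\det A^\infty(z)]^k}\int_{\mathbb{C}^{kmn}}\prod_{p=1}^n\det_{1\le j,j'\le k}\Bigl(\sum_{q=1}^m\xi^p_{jq}\bar\xi^p_{j'q}\Bigr)\,d\gamma_{\Lambda(z)}(\xi),\tag{10}$$
where $\gamma_{\Lambda(z)}$ is the Gaussian measure with ($nkm\times nkm$) covariance matrix
$$\Lambda(z):=\bigl(\Lambda(z)^{pjq}_{p'j'q'}\bigr)=\bigl(\delta_{jj'}\,\Lambda^\infty(z)^{pq}_{p'q'}\bigr).\tag{11}$$
(I.e., $\langle\xi^p_{jq}\bar\xi^{p'}_{j'q'}\rangle_{\gamma_{\Lambda(z)}}=\Lambda(z)^{pjq}_{p'j'q'}$.) For the pair correlation case ($n=2$), Eq. (10) becomes:
$$\kappa_{km}(r)=\frac{1}{\bigl(\frac{m!}{(m-k)!}\bigr)^2\det A(r)^k}\int_{\mathbb{C}^{2km}}\det_{1\le j,j'\le k}\Bigl(\sum_{q=1}^m\xi^1_{jq}\bar\xi^1_{j'q}\Bigr)\det_{1\le j,j'\le k}\Bigl(\sum_{q=1}^m\xi^2_{jq}\bar\xi^2_{j'q}\Bigr)\,d\gamma_{\Lambda(r)}(\xi),\tag{12}$$
where
$$A(r)=A^\infty(z^1,z^2),\qquad \Lambda(r)=\Lambda(z^1,z^2),\qquad |z^1-z^2|=r.$$
The computations in this paper are all based on formula (10).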
3. Supersymmetric Approach to n-Point Correlations

We now prove Theorem 1.1 using our formula (10) for the limit $n$-point correlation function, which we restate as follows:
$$K^\infty_{nkm}(z^1,\dots,z^n)=\frac{[(m-k)!]^n}{(m!)^n[\det A^\infty(z)]^k}\,G_{nkm},\tag{13}$$
where
$$G_{nkm}(z)=\int_{\mathbb{C}^{kmn}}\prod_{p=1}^n\det_{1\le j,j'\le k}\Bigl(\sum_{q=1}^m\xi^p_{jq}\bar\xi^p_{j'q}\Bigr)\,d\gamma_{\Lambda(z)}(\xi).\tag{14}$$
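Our approach is to represent the determinant in (14) as a Berezin integral and then to exchange the order of integration.

We introduce anti-commuting (or "fermionic") variables $\eta^p_j,\bar\eta^p_j$ ($1\le j\le k$, $1\le p\le n$), which can be regarded as generators of the Grassmann algebra $\bigwedge^\bullet\mathbb{C}^{2l}=\bigoplus_{t=0}^{2l}\bigwedge^t\mathbb{C}^{2l}$, $l=nk$. The Berezin integral on $\bigwedge^\bullet\mathbb{C}^{2l}$ is the linear functional $I:\bigwedge^\bullet\mathbb{C}^{2l}\to\mathbb{C}$ given by
$$I\big|_{\bigwedge^t\mathbb{C}^{2l}}=0\ \text{ for } t<2l,\qquad I\Bigl(\prod_{j,p}\bar\eta^p_j\eta^p_j\Bigr)=1.$$
Elements $f\in\bigwedge^\bullet\mathbb{C}^{2l}$ are considered as functions of anti-commuting variables, and we write
$$I(f)=\int f\,d\eta=\int f\,\prod_{j,p}d\eta^p_j\,d\bar\eta^p_j.$$
(See for example [Ef, Chapter 2], [ID, Sect. 2.1].) If $H=\bigl(H^{pj}_{p'j'}\bigr)$ is an $l\times l$ Hermitian matrix, we have the supersymmetric formula for the determinant:
$$\det H=\int e^{-\langle H\eta,\bar\eta\rangle}\,d\eta,\qquad \langle H\eta,\bar\eta\rangle=\sum_{j,p,j',p'}\eta^p_j\,H^{pj}_{p'j'}\,\bar\eta^{p'}_{j'}.\tag{15}$$
We now use (15) to compute $G_{nkm}$: let
$$\xi^p=\begin{pmatrix}\xi^p_{11}&\cdots&\xi^p_{1m}\\ \vdots&&\vdots\\ \xi^p_{k1}&\cdots&\xi^p_{km}\end{pmatrix}$$
(where $\{\xi^p_{jq}\}$ are ordinary "bosonic" variables). We also write $\xi=\xi^1\oplus\cdots\oplus\xi^n:\mathbb{C}^{mn}\to\mathbb{C}^{kn}$. Then
$$\prod_{p=1}^n\det_{1\le j,j'\le k}\Bigl(\sum_{q=1}^m\xi^p_{jq}\bar\xi^p_{j'q}\Bigr)=\det(\xi\xi^*)=\det\begin{pmatrix}\xi^1\xi^{1*}&\cdots&0\\ \vdots&\ddots&\vdots\\ 0&\cdots&\xi^n\xi^{n*}\end{pmatrix}.\tag{16}$$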
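The determinant formula (15) can be checked numerically for small matrices. The following is a minimal sketch, not part of the original argument: it stores Grassmann-algebra elements as dictionaries from bitmasks to coefficients, flattens the index pair $(p,j)$ into a single index, and uses the normalization $I(\prod\bar\eta\eta)=1$ stated above. The names `grassmann_mul` and `berezin_det` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def grassmann_mul(x, y):
    """Product of two Grassmann elements stored as {bitmask: coefficient}.
    A bitmask stands for the product of its generators in increasing index order."""
    out = {}
    for ma, ca in x.items():
        for mb, cb in y.items():
            if ma & mb:
                continue  # a repeated generator squares to zero
            swaps, t = 0, mb
            while t:
                low = t & -t
                # generators of x with a higher index must be passed by this generator of y
                swaps += bin(ma & ~(low | (low - 1))).count("1")
                t ^= low
            key = ma | mb
            out[key] = out.get(key, 0.0) + (-1) ** swaps * ca * cb
    return out

def berezin_det(H):
    """Top-degree coefficient of exp(-<H eta, eta_bar>), i.e. the Berezin integral in (15)."""
    H = np.asarray(H, dtype=complex)
    l = H.shape[0]
    eta_bar = lambda j: {1 << (2 * j): 1.0}      # generator index 2j   (assumed ordering)
    eta = lambda j: {1 << (2 * j + 1): 1.0}      # generator index 2j+1 (assumed ordering)
    Q = {}
    for j in range(l):
        for jp in range(l):
            for mask, sign in grassmann_mul(eta(j), eta_bar(jp)).items():
                Q[mask] = Q.get(mask, 0.0) - H[j, jp] * sign   # -eta_j H_{j j'} eta_bar_{j'}
    term, total = {0: 1.0}, {0: 1.0}
    for t in range(1, l + 1):                    # Q is nilpotent, so the series stops at t = l
        term = {m: c / t for m, c in grassmann_mul(term, Q).items()}
        for m, c in term.items():
            total[m] = total.get(m, 0.0) + c
    return total.get((1 << (2 * l)) - 1, 0.0)
```

For a Hermitian test matrix the value returned by `berezin_det` can be compared with `numpy.linalg.det`; the sketch is exponential in the matrix size and is only meant for matrices of size three or four.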
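Applying (15) with $H=\xi\xi^*$, we have
$$\begin{aligned}
G_{nkm}&=\frac{1}{\pi^{nkm}\det\Lambda}\int_{\mathbb{C}^{nkm}}\det(\xi\xi^*)\,e^{-\langle\Lambda^{-1}\xi,\bar\xi\rangle}\,d\xi&(17)\\
&=\frac{1}{\pi^{nkm}\det\Lambda}\int_{\mathbb{C}^{nkm}}\!\int e^{-\langle\Lambda^{-1}\xi,\bar\xi\rangle-\langle\xi\xi^*\eta,\bar\eta\rangle}\,d\eta\,d\xi,&(18)
\end{aligned}$$
$$\langle\xi\xi^*\eta,\bar\eta\rangle=\sum_{p,q,j,j'}\xi^p_{jq}\bar\xi^p_{j'q}\,\eta^p_j\bar\eta^p_{j'}=\langle\Omega\xi,\bar\xi\rangle,$$
where $\Omega$ is given by (2). Note that the entries of $\Omega$ commute, since they are of degree 2. Furthermore, adopting the supersymmetric definition of the conjugate [Ef],
$$(\eta^p_j)^{-}=\bar\eta^p_j,\qquad(\bar\eta^p_j)^{-}=-\eta^p_j,$$
we see that the matrix $\Omega$ is superhermitian; i.e., $\Omega^*=\Omega$, where $\Omega^*={}^t\bar\Omega$. Thus by (17)–(18), we have
$$G_{nkm}=\frac{1}{\pi^{nkm}\det\Lambda}\int_{\mathbb{C}^{nkm}}\!\int e^{-\langle(\Lambda^{-1}+\Omega)\xi,\bar\xi\rangle}\,d\eta\,d\xi.\tag{19}$$
We recall that
$$\frac{1}{\pi^{nkm}}\int_{\mathbb{C}^{nkm}}e^{-\langle P\xi,\bar\xi\rangle}\,d\xi=\det P^{-1},\tag{20}$$
for a positive definite, Hermitian ($nkm\times nkm$) matrix $P$. Furthermore, (20) holds when $P$ is the superhermitian matrix $\Lambda^{-1}+\Omega$; we give a short proof of this fact below. Reversing the order of integration in (19) and applying (20) with $P=\Lambda^{-1}+\Omega$, we have
$$G_{nkm}=\int\frac{1}{\det\Lambda\,\det(\Lambda^{-1}+\Omega)}\,d\eta=\int\frac{1}{\det(I+\Lambda\Omega)}\,d\eta.\tag{21}$$
We now verify by formal substitution that (20) holds when $P=\Lambda^{-1}+\Omega$: Suppose that

0 there exists $L_1>0$ such that
$$|V(\lambda)|\ge(2+\varepsilon)\log|\lambda|,\qquad|\lambda|\ge L_1,\tag{1.2}$$
(ii) for any $0<L_2<\infty$ there exists $\gamma>0$ such that
$$|V(\lambda_1)-V(\lambda_2)|\le C|\lambda_1-\lambda_2|^\gamma,\qquad|\lambda_{1,2}|\le L_2,\tag{1.3}$$
(iii) there exists $m>0$ such that
$$\int|V'(\lambda)|\,e^{-mV(\lambda)}\,d\lambda<\infty,\tag{1.4}$$
and
$$dM=\prod_{j=1}^n dM_{jj}\prod_{j<k}d\,\Re M_{jk}\,d\,\Im M_{jk}.\tag{1.5}$$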
0 is an arbitrary fixed number and const does not depend on $k$, $n$. In particular:
$$g_n^{(0)}(z)=g(z),\qquad g_n^{(1)}(z)=0;$$
in the case (i)
$$q^{(0)}_{k,n}=0,\qquad J^{(0)}_{k,n}=\frac12a,\tag{1.33}$$
$$q^{(1)}_{k,n}=0,\qquad J^{(1)}_{k,n}=\frac{(k-n)}{a}\left(\frac{1}{P(a)}+\frac{1}{P(-a)}\right),\tag{1.34}$$
i.e. the zero order coefficients for $k=n(1+o(1))$ are independent of $k$; in the case (ii)
$$q^{(j)}_{k,n}=0,\qquad j=0,1,\dots,$$
and
$$J^{(0)}_{k,n}=\frac12\bigl(b+(-1)^ka\bigr),\quad\text{or}\quad J^{(0)}_{k,n}=\frac12\bigl(b-(-1)^ka\bigr),\tag{1.35}$$
$$J^{(1)}_{k,n}=\frac{(k-n)}{a^2-b^2}\left(-\frac{1}{P(b)}\pm\frac{(-1)^k}{P(a)}\right),\tag{1.36}$$
where the sign corresponds to the chosen sign in (1.35). Besides, for $m\le1$ formulas (1.28)–(1.35) are valid for any $k:|k-n|\le n^{2/3}$ with
$$\begin{aligned}
|\tilde r^{(0,J)}_{k,n}|&\le\mathrm{const}\,\frac{|k-n|+1}{n},\qquad|\tilde r^{(0,q)}_{k,n}|\le\mathrm{const}\,\frac{|k-n|^2+1}{n^2},\\
|\tilde r^{(1,q)}_{k,n}|,\ |\tilde r^{(1,J)}_{k,n}|&\le\mathrm{const}\,\frac{|k-n|^2+1}{n}.
\end{aligned}\tag{1.37}$$
On the 1/n Expansion for Some Unitary Invariant Ensembles of Random Matrices
Remark 2. According to (1.35) the 2-periodic function $J^{(0)}_{k,n}$ is determined by our method up to the shift by 1. By using recent results of [13] on the form of the leading term of the asymptotics of the orthogonal polynomials (1.23) it can be shown that
$$J^{(0)}_{k,n}=\frac12\bigl(b-(-1)^ka\bigr),\qquad k=n(1+o(1)).\tag{1.38}$$
Moreover, all subsequent coefficients $J^{(j)}_{k,n}$ and $g^{(j)}_n$, $j=1,\dots,N(n)$ of the asymptotic expansions (1.28) and (1.31) are uniquely determined by the choice (1.35) and by the recurrence procedure described in the proof of the theorem.

Remark 3. The zero order coefficients $q^{(0)}_{k,n}$ and $J^{(0)}_{k,n}$ were found in [2, 17]. The first order coefficients $J^{(1)}_{k,n}$ (1.34) of the one-interval case were found in [14] in a somewhat different context.

Theorem 1 allows us, in particular, to find the $1/n$-expansion of the covariance
$$D_n(z_1,z_2)\equiv\mathbf{E}\Bigl\{\frac1n\mathrm{Tr}(z_1-M)^{-1}\,\frac1n\mathrm{Tr}(z_2-M)^{-1}\Bigr\}-\mathbf{E}\Bigl\{\frac1n\mathrm{Tr}(z_1-M)^{-1}\Bigr\}\mathbf{E}\Bigl\{\frac1n\mathrm{Tr}(z_2-M)^{-1}\Bigr\},\tag{1.39}$$
which is important in a number of questions of random matrix theory and of its applications. Here and below the symbol $\mathbf{E}\{\dots\}$ denotes the expectation with respect to measure (1.1)–(1.5). In the paper [24] it is proven that for any $V$ satisfying (1.2)–(1.3) we have the bound
$$|D_n(z_1,z_2)|\le\frac{\mathrm{const}}{n^2(\Im z_1)^2(\Im z_2)^2}.$$
Hence, the $1/n$-expansion of $D_n(z_1,z_2)$ has the form
$$D_n(z_1,z_2)=\sum_{j=2}^{\infty}d^{(j)}_n(z_1,z_2)\,n^{-j}+o(n^{-p}),\qquad n\to\infty,\tag{1.40}$$
in which the leading term is of the order $n^{-2}$. Theorem 1 implies

Corollary 1. Under the conditions of Theorem 1 we have: in the case (i) the $n$-independent
$$d^{(2)}(z_1,z_2)=-\frac{1}{2(z_1-z_2)^2}\left(1+\frac{a^2-z_1z_2}{X(z_1)X(z_2)}\right),\tag{1.41}$$
where $X(z)$ is defined in the first line of (1.14); in the case (ii) the 2-periodic in $n$
$$d^{(2)}_n(z_1,z_2)=-\frac{1}{2(z_1-z_2)^2}\left(1+\frac{(a^2-z_1z_2)(b^2-z_1z_2)}{X(z_1)X(z_2)}\right)-\frac{(-1)^n\,ab}{2X(z_1)X(z_2)},\tag{1.42}$$
where $X(z)$ is defined in the second line of (1.14).
S. Albeverio, L. Pastur, M. Shcherbina
Remark 4. The covariance $D_n(z_1,z_2)$ of the traces of the resolvent has been of considerable interest in random matrix theory since the beginning of the 1990s, when its study was motivated by matrix models of quantum field theory [1, 3–5, 9, 10] and later by solid state theory (see the review [6] and references therein). Initially only the one-interval case was studied, but later the many-interval case was also analyzed. In particular, in [3, 1] a version of the large-$n$ expansion procedure was proposed. In the case (ii) of the two-interval symmetric potential the procedure leads to an expression for the leading term amplitude $d^{(2)}_n(z_1,z_2)$ that does not depend on $n$ and contains elliptic integrals, while our expression (1.42) is 2-periodic in $n$ and contains only elementary functions. By using recent results of paper [13] on the asymptotic form of the leading term of orthogonal polynomials (1.22)–(1.23) and our formula (2.78) below for the covariance $D_n(z_1,z_2)$, it can be shown that in the general case of a two-interval non-symmetric potential the leading term amplitude $d^{(2)}_n(z_1,z_2)$ is quasi-periodic in $n$ and contains Jacobi elliptic functions that disappear when one passes to a two-interval symmetric potential. Moreover, by using the same results, it can be shown that in the case of a potential leading to a $p$-interval support of the density of states the amplitude $d^{(2)}_n(z_1,z_2)$ is a quasi-periodic function. Its frequency module contains generically $p-1$ incommensurable frequencies (but can reduce to a $p$-periodic function in some special cases [11]), and its form includes the Riemann $\theta$-function of $p-1$ variables. The frequencies are determined by the density of states, and the $\theta$-function is determined by the endpoints of the support of the density of states of the ensemble.

Remark 5. Formulas (1.41) and (1.42) for the leading term amplitude $d^{(2)}(z_1,z_2)$ of the covariance $D_n(z_1,z_2)$ depend on the ensemble only via the number of intervals of the IDS support and via the endpoints of the support. This is why this property of the covariance is often referred to as long-range universality [10], in contradistinction to the short-range (or microscopic) universality that manifests itself in $1/n$-neighborhoods of the interior points of $\sigma$ and is valid independently of the number of connected components of $\sigma$ (see e.g. papers [13, 24]). Thus under the conditions of these papers all the unitary invariant ensembles belong to the same short-range universality class. On the other hand, since according to (1.41) and (1.42) the leading terms of the covariance $D_n(z_1,z_2)$ are different in the one- and in the two-interval cases, the long-range universality classes depend on the number of intervals of the IDS support and on its endpoints.

Corollary 2. Under the conditions of Theorem 1 we have the following expressions for the weak limits of squares of the orthonormalized functions $\psi^{(n)}_k(\lambda)$ with $|k-n|\le N(n)$:
$$\text{w-}\lim_{n\to\infty}\bigl(\psi^{(n)}_k(\lambda)\bigr)^2=\frac{\chi_\sigma(\lambda)}{\pi X_+(\lambda)}\begin{cases}1,&\text{in case (i)},\\ \lambda,&\text{in case (ii)},\end{cases}\tag{1.43}$$
where $X_+(\lambda)$ is defined in (1.11).

The proofs of these assertions will be given in the next section.

2. Proofs of Main Results

Proof of Theorem 1. We introduce an eigenvalue distribution which is more general than (1.17), in which the number of variables differs from the large parameter in front of
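$V$ in the exponent of the r.h.s. of (1.17):
$$p_{k,n}(\lambda_1,\dots,\lambda_k)=Z_{k,n}^{-1}\prod_{1\le j<m\le k}(\lambda_j-\lambda_m)^2\exp\Bigl\{-n\sum_{j=1}^kV(\lambda_j)\Bigr\},\tag{2.1}$$
where $Z_{k,n}$ is the normalizing factor. For $k=n$ this probability distribution density coincides with (1.17). Let
$$\tilde\rho_{k,n}(\lambda_1)=\int d\lambda_2\dots d\lambda_k\,p_{k,n}(\lambda_1,\dots,\lambda_k),\tag{2.2}$$
$$\tilde\rho_{k,n}(\lambda_1,\lambda_2)=\int d\lambda_3\dots d\lambda_k\,p_{k,n}(\lambda_1,\dots,\lambda_k)\tag{2.3}$$
be the first and the second marginal densities of (2.1). By standard arguments [20, 7] we have
$$\tilde\rho_{k,n}(\lambda)=\tilde K_{k,n}(\lambda,\lambda),\qquad\tilde\rho_{k,n}(\lambda,\mu)=\frac{k}{k-1}\bigl[\tilde K_{k,n}(\lambda,\lambda)\tilde K_{k,n}(\mu,\mu)-\tilde K^2_{k,n}(\lambda,\mu)\bigr],\tag{2.4}$$
where
$$\tilde K_{k,n}(\lambda,\mu)=k^{-1}\sum_{l=1}^k\psi^{(n)}_l(\lambda)\psi^{(n)}_l(\mu),\tag{2.5}$$
and $\psi^{(n)}_l(\lambda)$ is defined by (1.21). We will use the notations
$$K_{k,n}(\lambda,\mu)\equiv n^{-1}\sum_{l=1}^k\psi^{(n)}_l(\lambda)\psi^{(n)}_l(\mu)=\frac kn\tilde K_{k,n}(\lambda,\mu),\qquad\rho_{k,n}(\lambda)\equiv K_{k,n}(\lambda,\lambda)=\frac kn\tilde\rho_{k,n}(\lambda).\tag{2.6}$$
Consider now the quantity $\mathbf{E}_k\bigl\{\frac{V'(\lambda_1)}{z-\lambda_1}\bigr\}$ for $z$, $\Im z\ne0$, where $\mathbf{E}_k\{\dots\}$ denotes the expectation with respect to the probability distribution (2.1). It is well defined in view of condition (1.4) above. It is easy to find that
$$\mathbf{E}_k\Bigl\{\frac{V'(\lambda_1)}{z-\lambda_1}\Bigr\}=\int\frac{V'(\lambda)\tilde\rho_{k,n}(\lambda)}{z-\lambda}\,d\lambda.\tag{2.7}$$
On the other hand, integrating by parts the r.h.s. in (2.7) and using (2.3), we obtain that
$$\mathbf{E}_k\Bigl\{\frac{V'(\lambda_1)}{z-\lambda_1}\Bigr\}=\frac1n\int\frac{\tilde\rho_{k,n}(\lambda)}{(z-\lambda)^2}\,d\lambda+2\,\frac{k-1}{n}\int\!\!\int\frac{\tilde\rho_{k,n}(\lambda,\mu)}{(z-\lambda)(\lambda-\mu)}\,d\lambda\,d\mu.$$
Combining these two expressions, we come to the identity
$$\int\frac{V'(\lambda)\tilde\rho_{k,n}(\lambda)}{z-\lambda}\,d\lambda=\frac1n\int\frac{\tilde\rho_{k,n}(\lambda)}{(z-\lambda)^2}\,d\lambda+2\,\frac{k-1}{n}\int\!\!\int\frac{\tilde\rho_{k,n}(\lambda,\mu)}{(z-\lambda)(\lambda-\mu)}\,d\lambda\,d\mu.\tag{2.8}$$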
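The symmetry property $\tilde\rho_{k,n}(\lambda,\mu)=\tilde\rho_{k,n}(\mu,\lambda)$ of (2.3) implies
$$\int\!\!\int\frac{\tilde\rho_{k,n}(\lambda,\mu)}{(z-\lambda)(\lambda-\mu)}\,d\lambda\,d\mu=-\int\!\!\int\frac{\tilde\rho_{k,n}(\lambda,\mu)}{(z-\mu)(\lambda-\mu)}\,d\lambda\,d\mu.$$
This allows us to rewrite (2.8) in the form
$$\int\frac{V'(\lambda)\tilde\rho_{k,n}(\lambda)}{z-\lambda}\,d\lambda=\frac1n\int\frac{\tilde\rho_{k,n}(\lambda)}{(z-\lambda)^2}\,d\lambda+\frac{k-1}{n}\int\!\!\int\frac{\tilde\rho_{k,n}(\lambda,\mu)}{(z-\lambda)(z-\mu)}\,d\lambda\,d\mu.\tag{2.9}$$
Now, by using (2.4)–(2.6), we can rewrite (2.9) as
$$\int\frac{V'(\lambda)\rho_{k,n}(\lambda)}{z-\lambda}\,d\lambda=n^{-1}\int\frac{\rho_{k,n}(\lambda)}{(z-\lambda)^2}\,d\lambda+\int\!\!\int\frac{\rho_{k,n}(\lambda)\rho_{k,n}(\mu)-(K_{k,n}(\lambda,\mu))^2}{(z-\lambda)(z-\mu)}\,d\lambda\,d\mu.\tag{2.10}$$
This relation is a version of the well known loop equation of the matrix models of Quantum Field Theory [15]. We will also use

Proposition 2. Consider any unitary invariant ensemble of the form (1.1)–(1.5) and assume that $V(\lambda)$ possesses two bounded derivatives in some neighborhood of the support $\sigma$ of the density of states $\rho$ and that $\rho(\lambda)$ satisfies Condition C2. Denote by $\sigma_\varepsilon$ the $\varepsilon$-neighborhood of $\sigma$ for some $\varepsilon>0$. Then there exist $n$-independent quantities $C,C_0,\varepsilon_0>0$ such that for any positive $n$-independent $\varepsilon<\varepsilon_0$ there exists $\varepsilon_1>0$ such that for any integer $k$ satisfying the inequality $\frac{|k-n|}{n}\le\varepsilon_1$ we have the bounds
$$\int_{\mathbb{R}\setminus\sigma_\varepsilon}\rho_{k,n}(\lambda)\,d\lambda\le e^{-nC\varepsilon},\qquad\int_{\mathbb{R}\setminus\sigma_\varepsilon}\bigl(\psi^{(n)}_k(\lambda)\bigr)^2d\lambda\le e^{-nC\varepsilon}.\tag{2.11}$$
Remark 6. The proof of Proposition 2, given in the next section, does not use the fact that ensemble (1.1)–(1.5) consists of Hermitian matrices. Therefore Proposition 2 is valid also for real symmetric and quaternion real matrices, i.e. for orthogonal and symplectic ensembles, satisfying (1.2), (1.3), and Condition C2.

Let us fix now a sufficiently small $\varepsilon$ such that $\sigma_\varepsilon\subset D$ and all the zeros of the function $P(z)$ are outside of $\sigma_\varepsilon$. Then (2.11) allows us to replace the integrals over the whole line by the integrals over $\sigma_\varepsilon$ in (2.10). Therefore, denoting
$$g_{k,n}(z)\equiv\int_{\sigma_\varepsilon}\frac{\rho_{k,n}(\lambda)\,d\lambda}{z-\lambda},\qquad R_{j,m}(z)\equiv\int_{\sigma_\varepsilon}\frac{\psi^{(n)}_j(\lambda)\psi^{(n)}_m(\lambda)\,d\lambda}{z-\lambda},$$
$$R'_{j,m}(z)\equiv-\int_{\sigma_\varepsilon}\frac{\psi^{(n)}_j(\lambda)\psi^{(n)}_m(\lambda)\,d\lambda}{(z-\lambda)^2},\qquad\tilde V(z,\zeta)\equiv\frac{V'(\zeta)}{z-\zeta},\tag{2.12}$$
we get from (2.10):
$$(g_{k,n}(z))^2-\int_{\sigma_\varepsilon}\tilde V(z,\lambda)\rho_{k,n}(\lambda)\,d\lambda-\frac{1}{n^2}\sum_{m=1}^kR'_{m,m}(z)-\frac{1}{n^2}\sum_{m,j=1}^kR^2_{m,j}(z)=e_n(z),\tag{2.13}$$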
where $e_n(z)$ is the remainder function which appears because of our replacement of the integrals over the whole line by the integrals over $\sigma_\varepsilon$. Note that since the l.h.s. of (2.13) is an analytic function in $\mathbb{C}\setminus\sigma_\varepsilon$, $e_n(z)$ is also analytic in $\mathbb{C}\setminus\sigma_\varepsilon$, and admits the bound:
$$|e_n(z)|\le\frac{C_0}{|\delta_\varepsilon(z)|^{l}},\tag{2.14}$$
where
$$\delta_\varepsilon(z)\equiv\mathrm{dist}\{z,\sigma_\varepsilon\}\tag{2.15}$$
and $l=2$. Besides, it follows from (2.11) that
$$|e_n(z)|\le\frac{C_1e^{-nC_2}}{|z|^2|\delta_\varepsilon(z)|^{l'}}\tag{2.16}$$
with $l'=0$. We will denote below by $\{e_n(z)\}_{n=1}^\infty$ sequences of functions (possibly different in different formulas) which are analytic everywhere in $\mathbb{C}\setminus\sigma_\varepsilon$ and satisfy the estimates (2.14) and (2.16) with some nonnegative $l$, $l'$ and some positive $n$-independent $C$'s.

According to our conditions $\tilde V(z,\zeta)$ in (2.12) is analytic with respect to $\zeta$ inside $D$, except for the point $\zeta=z$. Hence, we can write that
$$\int_{\sigma_\varepsilon}\tilde V(z,\lambda)\rho_{k,n}(\lambda)\,d\lambda=\frac{1}{2\pi i}\int_{\sigma_\varepsilon}\rho_{k,n}(\lambda)\,d\lambda\oint_L\frac{\tilde V(z,\zeta)}{\zeta-\lambda}\,d\zeta=\frac{1}{2\pi i}\oint_Ld\zeta\,\tilde V(z,\zeta)\,g_{k,n}(\zeta),\tag{2.17}$$
where $L\subset D$ is an arbitrary closed contour which contains $\sigma_\varepsilon$ and does not contain $z$. This allows us to rewrite (2.13) as
$$(g_{k,n}(z))^2-\frac{1}{2\pi i}\oint_L\tilde V(z,\zeta)g_{k,n}(\zeta)\,d\zeta-\frac{1}{n^2}\sum_{m=1}^kR'_{m,m}(z)-\frac{1}{n^2}\sum_{m,j=1}^kR^2_{m,j}(z)=e_n(z).\tag{2.18}$$
Now, subtracting from (2.18) the relation obtained from (2.18) by the replacement $k\to(k-1)$, we obtain:
$$2R_{k,k}(z)g_{k-1,n}(z)-\frac{1}{2\pi i}\oint_L\tilde V(z,\zeta)R_{k,k}(\zeta)\,d\zeta-\frac1nR'_{k,k}(z)-\frac2n\sum_{j=1}^{k-1}R^2_{k,j}(z)=e_n(z).\tag{2.19}$$
Relations (2.18) and (2.19) are our main technical tools in constructing the $1/n$ expansion given in the theorem. We will consider (2.18) and (2.19) as a system of equations with respect to the functions $g_{k,n}(z)$ and $R_{j,m}(z)$ and solve them by iterations in $1/n$.
We will need two more facts on ensembles (1.1)–(1.5).

(a) The function $g_{k,n}(z)$ from (2.12) and $g(z)$ from (1.12) are related as
$$|g_{k,n}(z)-g(z)|\le\mathrm{const}\left(\frac{\log^{1/2}n}{\sqrt n\,\delta^2_\varepsilon(z)}+\frac{|k-n|}{n\,\delta_\varepsilon(z)}\right).\tag{2.20}$$
This relation follows from (2.12), (2.6), (2.4), and from the bound valid for any function $\varphi(\mu)$, which grows not faster than $e^{bV(\mu)}$, $b>0$ as $|\mu|\to\infty$,
$$\Bigl|\int\varphi(\mu)\rho_n(\mu)\,d\mu-\int\varphi(\mu)\rho(\mu)\,d\mu\Bigr|\le\mathrm{const}\,\|\varphi'\|_2^{1/2}\|\varphi\|_2^{1/2}\,n^{-1/2}\log^{1/2}n,\tag{2.21}$$
where the symbol $\|\dots\|_2$ denotes the $L^2$-norm on a compact set of $\mathbb{R}$ containing $\sigma_\varepsilon$ (the bound was proved in [8], Lemma 4, see also [24]).

(b)
$$g^2(z)-V'(z)g(z)+Q(z)=0,\qquad z\in D,\ \Im z\ne0,\tag{2.22}$$
$$Q(z)=\frac{1}{2\pi i}\oint_LQ(z,\zeta)g(\zeta)\,d\zeta=\int_\sigma\frac{V'(z)-V'(\lambda)}{z-\lambda}\,\rho(\lambda)\,d\lambda,\tag{2.23}$$
and $Q(z,\zeta)$ is defined by (1.16). The relations follow from (2.20), and identity (2.10) for $n=k$. Indeed, in view of (2.4) the r.h.s. of (2.10) is
$$g_n^2+\mathbf{E}\Bigl\{\Bigl(n^{-1}\sum_{l=1}^n(z-\lambda_l)^{-1}\Bigr)^2\Bigr\}-\mathbf{E}^2\Bigl\{n^{-1}\sum_{l=1}^n(z-\lambda_l)^{-1}\Bigr\}.$$
The second term here is the variance of $n^{-1}\mathrm{Tr}(z-M)^{-1}$, and according to [24], Lemma 3, the variance is of the order $O(n^{-2})$. This and (2.20) imply (2.22).

It follows from the above that the zero order approximation for $g_{k,n}(z)$ coincides with $g(z)$. To find the zero order approximations for $R_{k,k}(z)$ for $|k-n|\le N(n)$, where $N(n)$ is defined in (1.27), let us note that (2.12) leads to the bounds
$$|R'_{k,k}(z)|,\quad\Bigl|\sum_{j=1}^{k-1}R^2_{k,j}(z)\Bigr|\le\frac{\mathrm{const}}{\delta^2_\varepsilon(z)}.$$
The first bound follows from the definition of $R'_{k,i}(z)$ in (2.12). To prove the second bound we view $R_{k,i}(z)$ of (2.12) as the generalized Fourier coefficients of the function $\chi_\varepsilon(\lambda)\psi^{(n)}_k(\lambda)(z-\lambda)^{-1}$ with respect to the orthonormal system $\{\psi^{(n)}_l(\lambda)\}_{l=1}^\infty$. Then the Bessel inequality gives us the second bound. These bounds imply that the last two terms in the l.h.s. of (2.19) have the order $n^{-1}$. Hence, the zero order equations for $R_{kk}(z)$ have the form
$$2g(z)R_{k,k}(z)=\frac{1}{2\pi i}\oint_Ld\zeta\,\tilde V(z,\zeta)R_{k,k}(\zeta)-r^{(0,R)}_{k,n}(z)+e_n(z),\tag{2.24}$$
where the remainder
$$r^{(0,R)}_{k,n}(z)\equiv-\frac1nR'_{k,k}(z)-\frac2n\sum_{j=1}^{k-1}R^2_{k,j}(z)+2R_{k,k}(z)\bigl(g_{k-1,n}(z)-g(z)\bigr)\to0,\qquad n\to\infty,\tag{2.25}$$
is analytic in $\mathbb{C}\setminus\sigma_\varepsilon$ and tends to zero uniformly on any compact set for which $\mathrm{dist}(z,\sigma_\varepsilon)\ge d>0$. Besides, since by definition (1.21) $\int(\psi^{(n)}_k)^2(\lambda)\,d\lambda=1$, we have from (2.11), that
$$R_{k,k}(z)=\frac1z\Bigl(1+O\Bigl(\frac1z\Bigr)\Bigr)+e_n(z),\qquad z\to\infty.\tag{2.26}$$
Equation (2.24) was already considered in [2]. However, we will analyze the equation here in a slightly different way, which is based on the following lemma:

Lemma 1. Consider the equation
$$2g(z)R(z)-\frac{1}{2\pi i}\oint_Ld\zeta\,\tilde V(z,\zeta)R(\zeta)=0,\qquad z\in D\setminus\sigma_\varepsilon,\tag{2.27}$$
where $\tilde V(z,\zeta)$ is defined in (2.12), and a closed contour $L\subset D$ contains $\sigma_\varepsilon$ and does not contain the point $z$. Set for $z\notin\sigma$,
$$\Theta(z)=\begin{cases}X^{-1}(z),&\text{in the case (i)},\\ zX^{-1}(z),&\text{in the case (ii)},\end{cases}\tag{2.28}$$
where $X(z)$ is defined by (1.14). Then the following statements are valid under the conditions of Theorem 1:

1. In the case (i) Eq. (2.27) has the unique solution $R(z)=\Theta(z)$ in the class of functions analytic in $\mathbb{C}\setminus\sigma_\varepsilon$ and behaving as
$$R(z)=z^{-1}(1+o(1)),\qquad z\to\infty.\tag{2.29}$$
In the case (ii) Eq. (2.27) has the unique solution $R(z)=\Theta(z)$ in the class (2.29), under the additional symmetry condition $R(-z)=-R(z)$.

2. In both cases Eq. (2.27) has no solutions in the class of functions $R(z)$ analytic in $\mathbb{C}\setminus\sigma_\varepsilon$ and satisfying the condition
$$\lim_{|z|\to\infty}|z^2R(z)|\le\mathrm{const}<\infty.\tag{2.30}$$
3. For any function $F(z)$ analytic in $\mathbb{C}\setminus\sigma_\varepsilon$, satisfying condition (2.30) and even in the case (ii), the inhomogeneous equation
$$2g(z)R(z)=\frac{1}{2\pi i}\oint_Ld\zeta\,\tilde V(z,\zeta)R(\zeta)-F(z)\tag{2.31}$$
has the unique solution of the form
$$R(z)=\frac{1}{2\pi iX(z)}\oint_Ld\zeta\,\frac{F(\zeta)}{P(\zeta)(z-\zeta)},\tag{2.32}$$
in the class of functions analytic in $\mathbb{C}\setminus\sigma_\varepsilon$, satisfying condition (2.30) and odd in the case (ii). Here $P(z)$ is defined by (1.15) and a closed contour $L$ should be taken sufficiently close to $\sigma$, to have $z$ and all zeros of $P(z)$ outside of $L$. In particular, in the case (ii) the contour consists of two components, encircling each interval of the support.

The proof of the lemma will be given in the next section. Omitting in (2.24) the error terms, we deduce from the obtained homogeneous equation and from (2.26) on the basis of Assertion 1 of Lemma 1 that the zero order approximation $R^{(0)}_{k,k}(z)$ of $R_{k,k}(z)$ is $\Theta(z)$ from (2.28). Moreover, the difference $R_{k,k}(z)-\Theta(z)$ decays at infinity at least as $z^{-2}$, and the error terms in the r.h.s. of (2.24) also decay as $z^{-2}$, as $z\to\infty$. Thus on the basis of Assertion 3 of the lemma we can write that
$$R_{k,k}(z)=\Theta(z)+\tilde r^{(0,R)}_{k,n}(z)+e_n(z).\tag{2.33}$$
Here $\tilde r^{(0,R)}_{k,n}(z)$ is obtained from formula (2.32) with $F(z)=r^{(0,R)}_{k,n}(z)$ given by (2.25). Using the fact that $|r^{(0,R)}_{k,n}(z)|\to0$ as $|z|\to\infty$ and that $P(z)$ has no zeros on $L$ we obtain the bound
$$\begin{aligned}
|\tilde r^{(0,R)}_{k,n}(z)|&\le\Bigl|\frac{1}{2\pi iP(z)X(z)}\oint_L\frac{r^{(0,R)}_{k,n}(\zeta)}{(z-\zeta)}\,d\zeta\Bigr|+\Bigl|\frac{1}{2\pi iX(z)}\oint_L\frac{P^{-1}(\zeta)-P^{-1}(z)}{(z-\zeta)}\,r^{(0,R)}_{k,n}(\zeta)\,d\zeta\Bigr|\\
&\le\frac{\mathrm{const}}{|X(z)|}\Bigl(\bigl|r^{(0,R)}_{k,n}(z)\bigr|+\max_{\zeta\in L}\bigl|r^{(0,R)}_{k,n}(\zeta)\bigr|\Bigr)\to0,\qquad n\to\infty.
\end{aligned}\tag{2.34}$$
Thus, for all $k$ such that $|k-n|\le N(n)$, where $N(n)$ is given in (1.27) for $m=0$, we have
$$R^{(0)}_{k,k}\equiv\lim_{n\to\infty}R_{k,k}(z)=\Theta(z).\tag{2.35}$$
We have also the relations following from (1.21), (1.24), (2.11) and (2.12):
$$\begin{aligned}
q_k&=\int\lambda\psi^2_k(\lambda)\,d\lambda=\frac{1}{2\pi i}\oint_L\zeta R_{k,k}(\zeta)\,d\zeta+O(e^{-nC\varepsilon}),\\
q_k^2+J_k^2+J_{k-1}^2&=\int\lambda^2\psi_k^2(\lambda)\,d\lambda=\frac{1}{2\pi i}\oint_L\zeta^2R_{k,k}(\zeta)\,d\zeta+O(e^{-nC\varepsilon}),\\
(q_k^2+J_k^2+J_{k-1}^2)^2&+(q_k+q_{k+1})^2J_k^2+(q_k+q_{k-1})^2J_{k-1}^2+J_k^2J_{k+1}^2+J_{k-1}^2J_{k-2}^2\\
&=\int\lambda^4\psi_k^2(\lambda)\,d\lambda=\frac{1}{2\pi i}\oint_L\zeta^4R_{k,k}(\zeta)\,d\zeta+O(e^{-nC\varepsilon}).
\end{aligned}\tag{2.36}$$
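The left-hand sides of (2.36) can be read as the diagonal entries of the first, second and fourth powers of a symmetric tridiagonal (Jacobi) matrix with diagonal $q_k$ and off-diagonal $J_k$. The sketch below checks this algebra numerically on a finite matrix with arbitrary test coefficients; it assumes the convention that $J_k$ couples the indices $k$ and $k+1$, and the helper names `jacobi` and `moment_formulas` are hypothetical.

```python
import numpy as np

def jacobi(q, J):
    """Symmetric tridiagonal matrix with diagonal q and off-diagonal J,
    where J[k] couples the indices k and k+1 (assumed convention)."""
    return np.diag(q) + np.diag(J, 1) + np.diag(J, -1)

def moment_formulas(q, J, k):
    """The three coefficient expressions on the left of (2.36) for an interior index k."""
    m1 = q[k]
    m2 = q[k] ** 2 + J[k] ** 2 + J[k - 1] ** 2
    m4 = (m2 ** 2
          + (q[k] + q[k + 1]) ** 2 * J[k] ** 2
          + (q[k] + q[k - 1]) ** 2 * J[k - 1] ** 2
          + J[k] ** 2 * J[k + 1] ** 2
          + J[k - 1] ** 2 * J[k - 2] ** 2)
    return m1, m2, m4
```

For an index $k$ at least two steps away from both ends of the finite matrix, `moment_formulas(q, J, k)` should agree with the `[k, k]` entries of the matrix, its square and its fourth power.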
In what follows we omit the subindex $n$ in the coefficients $q^{(j)}_{k,n}$ and $J^{(j)}_{k,n}$, introduced in (1.28). By using (2.35), and (2.28) for the case (i), we find from the first of the above relations that the zero order term $q^{(0)}_k$ is zero. Then, combining the second relation of (2.36) for $k$, $k-1$, and $k+1$ and the third relation of (2.36), we find that $J^{(0)}_k=a/2$. In the case (ii) the same scheme carried out for even and odd $k$ leads to the coefficients $J^{(0)}_k$ of (1.35). In other words we have proved that in the zero order in $1/n$ the coefficients of the Jacobi matrix $J^{(n)}$ defined in (1.25) do not depend on $k$, $|k-n|\le N(n)$ in the case (i) of a one-interval support of the density of states and are 2-periodic functions of $k$ in the case (ii) of a two-interval symmetric support.

To find the first order terms for these coefficients, we will study the first order versions of Eqs. (2.18). Note first that we have the bound
$$\Bigl|-\frac1n\sum_{j=1}^kR'_{j,j}(z)-\frac1n\sum_{j,m=1}^kR^2_{j,m}(z)\Bigr|\le\frac{\mathrm{const}}{n\delta^4_\varepsilon(z)}+|e_n(z)|,\tag{2.37}$$
where const does not depend on $n$, $z$. Indeed, by using the orthonormality of system (1.21) we can write the l.h.s. as
$$\frac n2\int_{\sigma_\varepsilon}d\lambda\int_{\sigma_\varepsilon}d\mu\,(\varphi(\lambda)-\varphi(\mu))^2K^2_{k,n}(\lambda,\mu)+n\int_{\sigma_\varepsilon}d\lambda\int_{\mathbb{R}\setminus\sigma_\varepsilon}d\mu\,\varphi^2(\lambda)K^2_{k,n}(\lambda,\mu),$$
where $\varphi(\lambda)=(z-\lambda)^{-1}$ and $K_{k,n}(\lambda,\mu)$ is defined in (2.6). According to Lemma 3 of [24] the first term here is bounded by $\mathrm{const}\cdot\sup|\varphi'(\lambda)|^2/n\le\mathrm{const}/n\delta^4_\varepsilon(z)$, and according to Proposition 2, the second term is $e_n(z)$. We conclude that the first order equation for the function
$$g^{(1)}_{k,n}(z)\equiv n\bigl(g_{k,n}(z)-g(z)\bigr)\tag{2.38}$$
has the form
$$2g(z)g^{(1)}_{k,n}(z)=\frac{1}{2\pi i}\oint_L\tilde V(z,\zeta)g^{(1)}_{k,n}(\zeta)\,d\zeta-r^{(1,g)}_{k,n}(z)+e_n(z),\tag{2.39}$$
with
$$r^{(1,g)}_{k,n}(z)\equiv\frac1n\bigl(g^{(1)}_{k,n}(z)\bigr)^2-\frac1n\sum_{j=1}^kR'_{j,j}(z)-\frac1n\sum_{j,m=1}^kR^2_{j,m}(z)\equiv\frac1n\bigl(g^{(1)}_{k,n}(z)\bigr)^2+\bar r^{(1,g)}_{k,n}(z),\qquad\bigl|\bar r^{(1,g)}_{k,n}(z)\bigr|\le\frac{\mathrm{const}}{n\delta^4_\varepsilon(z)}.\tag{2.40}$$
Besides, we have the normalization condition
$$g^{(1)}_{k,n}(z)=(k-n)z^{-1}\Bigl(1+O\Bigl(\frac1z\Bigr)\Bigr)+e_n(z),\qquad z\to\infty,\ |k-n|\le N(n),\tag{2.41}$$
which follows from Definition (2.12) of the function $g_{k,n}(z)$. Then, according to Lemma 1, we get
$$g^{(1)}_{k,n}(z)=(k-n)\Theta(z)+\tilde r^{(1,g)}_{k,n}(z)+e_n(z),\tag{2.42}$$
where the remainder $\tilde r^{(1,g)}_{k,n}(z)$ has the form
$$\tilde r^{(1,g)}_{k,n}(z)=\frac{1}{2\pi iX(z)}\oint_L\frac{n^{-1}\bigl(g^{(1)}_{k,n}(\zeta)\bigr)^2+\bar r^{(1,g)}_{k,n}(\zeta)}{P(\zeta)(z-\zeta)}\,d\zeta.\tag{2.43}$$
Thus, denoting
$$m^{(1)}_{k,n}(d)\equiv\max_{\{z:\delta_\varepsilon(z)\ge d\}}\bigl|g^{(1)}_{k,n}(z)\bigr|,$$
where $d$ is a positive constant, we obtain from relations (2.42) and (2.43) the inequality
$$m^{(1)}_{k,n}(d)\le\frac{|k-n|}{d^{1/2}}+C\left(\frac{\bigl(m^{(1)}_{k,n}(d)\bigr)^2}{nd^{3/2}}+\frac{1}{nd^{9/2}}\right),$$
where $C$ is independent of $n$, $k$, and $d$. This inequality implies that either
$$m^{(1)}_{k,n}(d)\le\frac{2|k-n|}{d^{1/2}},\qquad\text{or}\qquad m^{(1)}_{k,n}(d)\ge nd^{3/2}C^{-1}+O(1).$$
But the second inequality here cannot be true, because it was proved above that
$$n^{-1}m^{(1)}_{k,n}(d)=\max_{\{z:\delta_\varepsilon(z)\ge d\}}|g_{k,n}(z)-g(z)|\to0$$
for any $k$ such that $|k-n|\le N(n)$, where $N(n)$ is given in (1.27) for $m=0$. Hence in view of (2.43) we get that for $\{z:\delta_\varepsilon(z)\ge d\}$,
$$\bigl|\tilde r^{(1,g)}_{k,n}(z)\bigr|\le\mathrm{const}\left(\frac{|k-n|^2}{nd}+\frac{1}{nd^{9/4}}\right).\tag{2.44}$$
Substituting now representation (2.42) in the r.h.s. of (2.43), and using bound (2.44), we get finally
$$\tilde r^{(1,g)}_{k,n}(z)=\frac{(k-n)^2}{n}\,Y(z)+O\bigl(|k-n|^3n^{-2}d^{-5/2}\bigr)+O\bigl((nd^5)^{-1}\bigr),\tag{2.45}$$
where
$$Y(z)\equiv\frac{1}{2\pi iX(z)}\oint_L\frac{\Theta^2(\zeta)}{P(\zeta)(z-\zeta)}\,d\zeta=\frac{1}{X(z)}\begin{cases}\dfrac{1}{2a}\left(\dfrac{1}{P(a)(z-a)}-\dfrac{1}{P(-a)(z+a)}\right),&\text{(i)},\\[2ex]\dfrac{1}{a^2-b^2}\left(\dfrac{az}{P(a)(z^2-a^2)}-\dfrac{bz}{P(b)(z^2-b^2)}\right),&\text{(ii)}.\end{cases}\tag{2.46}$$
We have obtained the first order term in the $1/n$-expansion for $g_{k,n}(z)$.

Now we need a lemma that will allow us to replace $R_{k,j}(z)$ in (2.18), (2.19) by a certain simpler expression constructed from the coefficients $q^{(j)}_{k,n}$, $J^{(j)}_{k,n}$, $j=0,\dots,p$ found during the previous $p$ steps of our expansion process and to estimate the error of this replacement.
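The closed form in case (i) of (2.46) is a residue computation: with $\Theta^2(\zeta)=1/(\zeta^2-a^2)$ the integrand has simple poles at $\zeta=\pm a$ inside $L$. The following sketch verifies this numerically on a circular contour, leaving out the common factor $1/X(z)$. The polynomial `P`, the endpoint `a`, the point `z` and the contour radius are arbitrary test choices (assumptions for illustration only, not the $P$ of (1.15)); `contour_mean` and `check_Y_case_i` are hypothetical helper names.

```python
import numpy as np

def contour_mean(f, radius, n=4000):
    """(1/(2*pi*i)) times the integral of f over the circle |zeta| = radius,
    taken counterclockwise, computed with the trapezoid rule."""
    theta = 2.0 * np.pi * np.arange(n) / n
    zeta = radius * np.exp(1j * theta)
    # d(zeta) = i * zeta * d(theta), so the normalized integral is the mean of f(zeta) * zeta
    return np.mean(f(zeta) * zeta)

def check_Y_case_i(a=1.0, z=3.0, radius=1.5):
    """Compare the contour integral in (2.46), case (i), with its residue form."""
    P = lambda zeta: zeta ** 2 + zeta + 5.0   # test polynomial whose zeros lie outside the contour
    integrand = lambda zeta: 1.0 / ((zeta ** 2 - a ** 2) * P(zeta) * (z - zeta))
    lhs = contour_mean(integrand, radius)
    rhs = (1.0 / (2.0 * a)) * (1.0 / (P(a) * (z - a)) - 1.0 / (P(-a) * (z + a)))
    return lhs, rhs
```

The contour encloses $\pm a$ but neither $z$ nor the zeros of the test polynomial, matching the requirement on $L$ in Lemma 1.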
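Lemma 2. Take $\tilde N(n)=[\log^2n]$ and let $N_1(n)$ be such that
$$N_1(n)\,n^{-1/(p+1)}\to0,\qquad(N_1(n))^{-1}\tilde N(n)\to0,\qquad\text{as }n\to\infty.\tag{2.47}$$
Assume that for any $k:|k-n|\le N_1(n)$ we have found the coefficients $q^{(0)}_k,\dots,q^{(p)}_k$, $J^{(0)}_k,\dots,J^{(p)}_k$, satisfying bound (1.29), and such that (1.28) is fulfilled for $m=p$. Here and below we omit the subindex $n$ in the coefficients $q^{(j)}_{k,n}$, $J^{(j)}_{k,n}$ of the asymptotic formula (1.28) of Theorem 1. For any $s$ such that $|s|\le2/n$ consider the $(2N_1+1)$-periodic symmetric Jacobi matrix $\tilde J^{(p)}(s)$ defined by the entries
$$\tilde J^{(p)}_{k,k}\equiv\tilde q^{(p)}_k=\sum_{j=0}^ps^jq^{(j)}_k,\qquad\tilde J^{(p)}_{k,k+1}\equiv\tilde J^{(p)}_k=\sum_{j=0}^ps^jJ^{(j)}_k,\qquad|k-n|\le N_1(n).\tag{2.48}$$
Denote by $\tilde R^{(p)}(z,s)$ the resolvent of $\tilde J^{(p)}(s)$, and set
$$R^{(j)}(z)\equiv\frac{1}{j!}\frac{\partial^j}{\partial s^j}\tilde R^{(p)}(z,s)\Big|_{s=0},\qquad S^{(p)}(z)\equiv\sum_{j=0}^pn^{-j}R^{(j)}.\tag{2.49}$$
Then for any $L>0$ there exist positive $n$-independent quantities $C_1$ and $C_2$ such that for any $k$ satisfying the inequality:
$$|k-n|\le N_1-2\tilde N\equiv N_2(n),\tag{2.50}$$
and for any $z\notin\sigma_\varepsilon$, $|z|<L$,
$$\bigl|R_{k,k}(z)-S^{(p)}_{k,k}(z)\bigr|,\ \bigl|-R'_{k,k}(z)-(S^{(p)}\cdot S^{(p)})_{k,k}(z)\bigr|\le\frac{2\varepsilon^{(p)}_n}{\delta^2_\varepsilon(z)n^p}+\frac{C_1N_1^{p+1}}{\delta^{p+1}_\varepsilon(z)n^{p+1}}+\frac{e^{-C_2\delta_\varepsilon(z)\tilde N}}{\delta_\varepsilon(z)|z|^2},\tag{2.51}$$
$$\Bigl|\sum_{m=1}^kR^2_{k,m}(z)-\sum_{m=1}^k\bigl(S^{(p)}_{k,m}(z)\bigr)^2\Bigr|\le\frac{2\varepsilon^{(p)}_n}{\delta^2_\varepsilon(z)n^p}+\frac{C_1N_1^{p+1}}{\delta^{p+1}_\varepsilon(z)n^{p+1}}+\frac{e^{-C_2\delta_\varepsilon(z)\tilde N}}{\delta_\varepsilon(z)|z|^2},\tag{2.52}$$
$$\Bigl|\frac1n\sum_{j=1}^k\Bigl(-R'_{j,j}(z)-\sum_{m=1}^kR^2_{j,m}(z)\Bigr)-\frac1n\sum_{j=1}^k\Bigl[(S^{(p)}\cdot S^{(p)})_{j,j}(z)-\sum_{m=1}^k\bigl(S^{(p)}_{j,m}(z)\bigr)^2\Bigr]\Bigr|\le\frac{2\varepsilon^{(p)}_nN_1}{\delta^2_\varepsilon(z)n^{p+1}}+\frac{C_1N_1^{p+2}}{\delta^{p+1}_\varepsilon(z)n^{p+2}}+\frac{e^{-C_2\delta_\varepsilon(z)\tilde N/2}}{|z|^3},\tag{2.53}$$
where $\delta_\varepsilon(z)\equiv\mathrm{dist}\{z,\sigma_\varepsilon\}$ and $\varepsilon^{(p)}_n=o(1)$, $n\to\infty$ (see (1.30)).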
The proof of the lemma will be given in the next section. Consider the function
$$R^{(1,n)}_{k,k}(z)\equiv n\bigl(R_{k,k}(z)-R^{(0)}_{k,k}(z)\bigr),\tag{2.54}$$
with $R^{(0)}_{k,k}(z)$ defined in (2.35). From (2.19) and (2.42) we get the first order equation for $R_{kk}$:
$$2g(z)R^{(1,n)}_{k,k}(z)=\frac{1}{2\pi i}\oint_Ld\zeta\,\tilde V(z,\zeta)R^{(1,n)}_{kk}(\zeta)-F^{(1,R)}_k(z)-r^{(1,R)}_{k,n}(z)+e_n(z).\tag{2.55}$$
Here
$$F^{(1,R)}_k(z)\equiv2R^{(0)}_{k,k}(z)g^{(1)}_{k-1}(z)+(R^{(0)}\cdot R^{(0)})_{k,k}(z)-2\sum_{j=1}^{k-1}\bigl(R^{(0)}_{k,j}(z)\bigr)^2,$$
$R^{(0)}$ denotes the resolvent of the double infinite Jacobi matrix $J^{(0)}$ of the zero order coefficients $\{J^{(0)}_k\}_{k\in\mathbb{Z}}$, and
$$r^{(1,R)}_{k,n}(z)\equiv2R^{(0)}_{k,k}(z)\tilde r^{(1,g)}_{k,n}(z)+\frac2nR^{(1,n)}_{k,k}(z)g^{(1)}_{k,n}(z)+\Bigl[-R'_{k,k}(z)-(R^{(0)}\cdot R^{(0)})_{k,k}(z)\Bigr]-2\Bigl[\sum_{j=1}^{k-1}(R_{k,j}(z))^2-\sum_{j=1}^{k-1}\bigl(R^{(0)}_{k,j}(z)\bigr)^2\Bigr].\tag{2.56}$$
By using the translational symmetry of the resolvent $R^{(0)}$ and the exponential decay of its matrix elements $R^{(0)}_{jm}$ in $|j-m|$, as $|j-m|\to\infty$, it is easy to show that
$$(R^{(0)}\cdot R^{(0)})_{k,k}(z)-2\sum_{j=1}^{k-1}\bigl(R^{(0)}_{k,j}(z)\bigr)^2=\begin{cases}\bigl(R^{(0)}_{k,k}(z)\bigr)^2+e_n(z),&\text{(i)},\\[1ex]\bigl(R^{(0)}_{k,k}(z)\bigr)^2+\dfrac{(J^{(0)}_k)^2-(J^{(0)}_{k-1})^2}{X^2(z)}+e_n(z),&\text{(ii)}.\end{cases}$$
This relation, and formulas (2.42), and (2.54) imply that
$$F^{(1,R)}_k=\begin{cases}[2(k-n)-1]\,\Theta^2(z),&\text{(i)},\\[1ex][2(k-n)-1]\,\Theta^2(z)\pm(-1)^k\dfrac{ab}{X^2(z)},&\text{(ii)},\end{cases}$$
where the sign in the case (ii) corresponds to that in (1.35).

In addition, bound (2.45), and the fact that $n^{-1}R^{(1,n)}_{k,k}(z)\to0$, as $n\to\infty$ (see formulas (2.54) and (2.33)–(2.35)) imply that the first two terms in the r.h.s. of (2.56) tend to zero as $n\to\infty$. And on the basis of Lemma 2, one can conclude that the last two terms there also vanish as $n\to\infty$. Therefore $r^{(1,R)}_{k,n}(z)\to0$ as $n\to\infty$. Then on the
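basis of Lemma 1, and similarly to (2.38)–(2.46) we get for the first order term $R^{(1)}_{k,k}(z)$ for all $k$ such that $|k-n|\le N_1(n)$, where $N_1(n)$ is given in (2.47):
$$R^{(1)}_{k,k}(z)=\begin{cases}[2(k-n)-1]\,Y(z),&\text{(i)},\\ [2(k-n)-1]\,Y(z)\pm(-1)^kY^{(\pm)}(z),&\text{(ii)},\end{cases}\tag{2.57}$$
where $Y(z)$ is defined in (2.46),
$$Y^{(\pm)}(z)\equiv\frac{ab}{2\pi iX(z)}\oint_L\frac{d\zeta}{P(\zeta)X^2(\zeta)(z-\zeta)}=\frac{z}{X(z)(a^2-b^2)}\left(\frac{b}{P(a)(z^2-a^2)}-\frac{a}{P(b)(z^2-b^2)}\right),$$
and the remainder function $\tilde r^{(1,R)}_{k,n}(z)$ is
$$\tilde r^{(1,R)}_{k,n}(z)=\begin{cases}\dfrac{2(k-n)[3(k-n)-1]}{n}\,\tilde Y(z)+O\Bigl(\dfrac{|k-n|^4}{n^3}\Bigr)+O\Bigl(\dfrac1n\Bigr),&\text{(i)},\\[2ex]\dfrac{2(k-n)[3(k-n)-1]}{n}\,\tilde Y(z)\pm2(-1)^k(k-n)\tilde Y^{\pm}(z)+O\Bigl(\dfrac{|k-n|^4}{n^3}\Bigr)+O\Bigl(\dfrac1n\Bigr),&\text{(ii)},\end{cases}\tag{2.58}$$
where
$$\tilde Y(z)\equiv\frac{1}{2\pi iX(z)}\oint_Ld\zeta\,\frac{Y(\zeta)\Theta(\zeta)}{P(\zeta)(z-\zeta)},\qquad\tilde Y^{\pm}(z)\equiv\frac{1}{2\pi iX(z)}\oint_Ld\zeta\,\frac{Y^{\pm}(\zeta)\Theta(\zeta)}{P(\zeta)(z-\zeta)}.\tag{2.59}$$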
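The case (ii) identity for $(R^{(0)}\cdot R^{(0)})_{k,k}-2\sum_{j<k}(R^{(0)}_{k,j})^2$ used above, together with the zero order diagonal $R^{(0)}_{k,k}=zX^{-1}(z)$, can be illustrated on a large finite truncation of the 2-periodic Jacobi matrix with off-diagonal entries $\frac12(b-(-1)^ka)$ and zero diagonal. The sketch assumes $X^2(z)=(z^2-a^2)(z^2-b^2)$ (the definition (1.14) is not reproduced in this part of the text) and evaluates at a real $z>b$, where the positive branch of the square root applies; the function names are hypothetical.

```python
import numpy as np

def two_periodic_resolvent(a, b, z, half_width=200):
    """Resolvent (z - J)^{-1} of a finite 2-periodic Jacobi matrix with zero diagonal
    and off-diagonal entries J_{i,i+1} = (b - (-1)^i a)/2, as in (1.38)."""
    n_sites = 2 * half_width + 1
    i = np.arange(n_sites - 1)
    off = 0.5 * (b - (-1.0) ** i * a)
    J = np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.inv(z * np.eye(n_sites) - J), off

def check_identity(a=1.0, b=2.0, z=3.0, half_width=200):
    """Return both sides of the case (ii) identity at the central site,
    plus the diagonal resolvent entry and its expected value z / X(z)."""
    R, off = two_periodic_resolvent(a, b, z, half_width)
    k = half_width                                   # central site, far from both ends
    X2 = (z ** 2 - a ** 2) * (z ** 2 - b ** 2)       # assumed form of X^2(z) in case (ii)
    lhs = (R @ R)[k, k] - 2.0 * np.sum(R[k, :k] ** 2)
    rhs = R[k, k] ** 2 + (off[k] ** 2 - off[k - 1] ** 2) / X2
    return lhs, rhs, R[k, k], z / np.sqrt(X2)
```

Because the resolvent entries decay exponentially away from the diagonal for $z$ outside the spectrum, the truncation error at the central site is negligible for the default sizes.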
Now in the case (ii) we take the first order terms with respect to $n^{-1}$ in Eqs. (2.36) (recall that the diagonal coefficients $q^{(0)}_k$ are zero for all $k$). We obtain the relations
$$\begin{aligned}
2\bigl(J^{(0)}_{2q}J^{(1)}_{2q}+J^{(0)}_{2q-1}J^{(1)}_{2q-1}\bigr)&=\frac{1}{2\pi i}\oint_L\zeta^2R^{(1)}_{2q,2q}(\zeta)\,d\zeta+r^{(1,J,2)}_{2q},\\
4\bigl(J^{(0)}_{2q}J^{(1)}_{2q}+J^{(0)}_{2q-1}J^{(1)}_{2q-1}\bigr)\bigl((J^{(0)}_{2q})^2+(J^{(0)}_{2q-1})^2\bigr)&\\
{}+2J^{(0)}_{2q}J^{(0)}_{2q-1}\bigl(J^{(0)}_{2q-1}J^{(1)}_{2q}+J^{(0)}_{2q}J^{(1)}_{2q+1}+J^{(0)}_{2q}J^{(1)}_{2q-1}+J^{(0)}_{2q-1}J^{(1)}_{2q-2}\bigr)&=\frac{1}{2\pi i}\oint_L\zeta^4R^{(1)}_{2q,2q}(\zeta)\,d\zeta+r^{(1,J,4)}_{2q},
\end{aligned}\tag{2.60}$$
where $k=2q$, $|k-n|\le N_1(n)$, $N_1(n)$ is defined in (2.47) for $p=0$, and:
$$r^{(1,J,2)}_{k,n}\equiv\oint_L\zeta^2\tilde r^{(1,R)}_{k,n}(\zeta)\,d\zeta\to0,\qquad r^{(1,J,4)}_{k,n}\equiv\oint_L\zeta^4\tilde r^{(1,R)}_{k,n}(\zeta)\,d\zeta\to0,\qquad n\to\infty.\tag{2.61}$$
Consider also the two analogs of the first equation in (2.60) with $2q$ replaced by $2q-1$ and by $2q+1$. These relations and (2.60) comprise a linear system with the unknowns $J^{(1)}_{2q-2}$, $J^{(1)}_{2q-1}$, $J^{(1)}_{2q}$ and $J^{(1)}_{2q+1}$. The system is uniquely soluble for $J^{(0)}_{2q}\ne J^{(0)}_{2q-1}$, and its solution is specified by (1.36), and its remainder terms satisfy the bounds (1.37). However, for $J^{(0)}_{2q}=J^{(0)}_{2q-1}$ this system is degenerate. Thus, in the case (i) we cannot use the system to find the coefficients $J^{(1)}_{k,n}$. In this case we first use identity (2.36), which yields the following relation in the first order:
$$q^{(1)}_k=r^{(1,q,1)}_{k,n}\equiv\oint_L\zeta\,\tilde r^{(1,R)}_{k,n}(\zeta)\,d\zeta.$$
This and (2.57) yield that $q^{(1)}_k=0$. Furthermore, the first equation in (2.60) for $J^{(0)}_{2q}=J^{(0)}_{2q-1}=a/2$, in view of (2.57) and (2.58), has the form
$$a\bigl(J^{(1)}_k+J^{(1)}_{k-1}\bigr)=[2(k-n)-1]\,I^{(i)}+r^{(1,J,2)}_{k,n},\qquad I^{(i)}\equiv\frac12\left(\frac{1}{P(a)}+\frac{1}{P(-a)}\right).\tag{2.62}$$
Iterating this relation starting from $k=n$ it is easy to obtain the one-parameter family of solutions
$$aJ^{(1)}_k=(k-n)I^{(i)}-c(-1)^{k-n}+\tilde r^{(1,J)}_{k,n},\tag{2.63}$$
where
$$\tilde r^{(1,J)}_{k,n}=\sum_{j=0}^{k-n}(-1)^{k-n-j}\,r^{(1,J,2)}_{n+j,n}.$$
(1,J,2)
Substituting expression (2.58) for r˜k,n (z) in (2.61) and using the resulting rk,n (z) in the last relations, we obtain the bound
|k − n|2 + 1 |k − n|5 (1,J ) . (2.64) |˜rk,n | ≤ const + n n3 This leads to (1.37) for the case (i), if |k − n| ≤ n2/3 . To fix the parameter c in (2.63) we use the relation known in random matrix theory as the string equation (see e.g. [15]): k (n) (n) Jk V (λ)ψk (λ)ψk+1 (λ)dλ = . n The relation can be easily obtained from the identity (n) (n) e−nV (λ) pk−1 (λ)pk (λ) dλ = 0. We use this relation in the form Jn V (ζ )Rn,n+1 (ζ )dζ = 1 + O(e−nC ), 2πi L
(2.65)
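following from Proposition 2. The first order equation which follows from (2.65) has the form
$$\frac{J^{(0)}_n}{2\pi i}\oint_LV'(\zeta)R^{(1)}_{n,n+1}(\zeta)\,d\zeta+\frac{J^{(1)}_n}{2\pi i}\oint_LV'(\zeta)R^{(0)}_{n,n+1}(\zeta)\,d\zeta=0.$$
By using (1.34), (2.33), (2.57), and (2.63), we get a linear equation with respect to $c$:
$$D^{(i)}c-A^{(i)}=0,\tag{2.66}$$
with
$$\begin{aligned}
D^{(i)}&\equiv J^{\pm}_n\oint_LV'(\zeta)R^{(0)}_{n,n+1}(\zeta)\,d\zeta+\frac a2\oint_LV'(\zeta)\bigl(R^{(0)}J^{\pm}R^{(0)}\bigr)_{n,n+1}(\zeta)\,d\zeta,\\
A^{(i)}&\equiv J^{(1*)}_n\oint_LV'(\zeta)R^{(0)}_{n,n+1}(\zeta)\,d\zeta+\frac a2\oint_LV'(\zeta)\bigl(R^{(0)}\cdot J^{(1*)}\cdot R^{(0)}\bigr)_{n,n+1}(\zeta)\,d\zeta,
\end{aligned}\tag{2.67}$$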
where $J^{\pm}$ is the symmetric Jacobi matrix with coefficients $J_k^{\pm} = (-1)^{n-k}$ and $J^{(1*)}$ is the symmetric Jacobi matrix with coefficients defined by (1.34).

Lemma 3. Under the conditions of the theorem $A^{(i)} = 0$, $D^{(i)} \neq 0$, and Eq. (2.66) has the unique solution $c = 0$.

The proof of this lemma is given in the next section. By using the lemma we find the first-order terms of our expansion in the case (i) given in (1.34).

Now we will prove (1.31) and (1.28) by induction. The scheme of the induction procedure will be as follows. Assume that we have found the coefficients $q_k^{(0)}, \dots, q_k^{(p)}$ and $J_k^{(0)}, \dots, J_k^{(p)}$. Then we can find the $(p+1)$th correction $g_k^{(p+1)}(z)$ and estimate the respective remainder $r^{(p+1,g)}_{k,n}$ from the $(p+1)$ form of Eq. (2.18) (see Eq. (2.70) below), in which we use the functions $g_k^{(0)}(z), \dots, g_k^{(p)}(z)$ and $R_{kk}^{(0)}(z), \dots, R_{kk}^{(p)}(z)$ found previously. Then, by using the $(p+1)$ form of Eq. (2.19) (see Eq. (2.73) below), we determine $R_{kk}^{(p+1)}(z)$ and estimate the respective remainder $r^{(p+1,R)}_{k,n}$. Finally, we find the coefficients $q_k^{(p+1)}$ and $J_k^{(p+1)}$ and estimate the respective remainder by using the $(p+1)$ form of relations (2.36) and (2.65).

To realize this scheme we first write the asymptotic relation:
\[ g_{k,n}(z) = \sum_{j=0}^{p} n^{-j} g_k^{(j)}(z) + n^{-p}\, \tilde r^{(p,g)}_{k,n}(z), \qquad \tilde r^{(p,g)}_{k,n}(z) \to 0 \quad \text{as } n \to \infty, \tag{2.68} \]
valid for all $k$ such that $|k-n| \le N_1(n)$. Let the matrices $R^{(j)}(z)$, $j = 0, \dots, p$, be defined as in Lemma 2 (see formulas (2.48), (2.49)). Then, denoting
\[ g^{(p+1)}_{k,n}(z) \equiv n^{p+1} \left( g_{k,n}(z) - \sum_{j=0}^{p} n^{-j} g_k^{(j)}(z) \right), \tag{2.69} \]
we obtain from (2.18) the equation of the $(p+1)$th order for $g^{(p+1)}_{k,n}(z)$:
\[ 2 g(z)\, g^{(p+1)}_{k,n}(z) = \frac{1}{2\pi i} \int_L \tilde V(z,\zeta)\, g^{(p+1)}_{k,n}(\zeta)\, d\zeta - F_k^{(p+1,g)}(z) - r^{(p+1,g)}_{k,n}(z) + e_n(z), \tag{2.70} \]
where (p+1,g)
Fk
(z) =
p
(p+1,g)
(z) =
(l)
(z)gk (z) +
∞ p−1 k
(p−l−1)
Rm,j
m=1 j =k+1 l=0 p (p+1) (p+1) (l) n−p−1 (gk,n (z))2 + 2gk,n (z) n−l gk (z) l=1 p
(l) (l ) np+1−l−l gk (z)gk (z) + l,l =1,l+l >p+1 l=1
rk,n
(p+1−l)
gk
· np
(l)
(z)Rm,j (z),
(2.71)
k k 1 2 − R (z)j,j − Rj,m (z) n m=1
j =1
1 − n
k
(S
(p)
·S
(p)
)j,j (z) −
k m=1
j =1
(p) (Sj,m (z))2
,
with $S^{(p)}_{j,m}(z)$ defined by (2.49). On the basis of (2.68), (1.28), and Lemma 2 we conclude that the relations
\[ |F_k^{(p+1,g)}(z)| \le \mathrm{const}\, (|k-n|^{p+1} + 1) \]
and
\[ r^{(p+1,g)}_{k,n}(z) \to 0 \quad \text{as } n \to \infty \]
are valid uniformly in $\{z : \delta_\varepsilon(z) \ge d\}$, for any fixed $d > 0$, because by the induction assumption (2.68) we have that $n^{-1} g^{(p+1)}_{k,n}(z) \equiv \tilde g^{(p)}_{k,n}(z) \to 0$ as $n \to \infty$. Then Lemma 1 leads to the relations
\[ g^{(p+1)}_{k,n}(z) = g_k^{(p+1)}(z) + \tilde r^{(p+1,g)}_{k,n}(z), \tag{2.72} \]
where for $\delta_\varepsilon(z) \ge d > 0$,
\[ g_k^{(p+1)}(z) = \frac{1}{2\pi i} \int_L \frac{F_k^{(p+1,g)}(\zeta)\, d\zeta}{P(\zeta)(\zeta - z)}, \qquad |g_k^{(p+1)}(z)| \le \mathrm{const}\, (|k-n|^{p+1} + 1) \]
and
\[ |\tilde r^{(p+1,g)}_{k,n}(z)| \le \frac{\mathrm{const}}{|X(z)|} \left( |r^{(p+1,g)}_{k,n}(z)| + \max_{\zeta \in L} |r^{(p+1,g)}_{k,n}(\zeta)| \right). \]
Now, denoting (cf. (2.69))
\[ R^{(p+1,n)}_{k,k}(z) \equiv n^{p+1} \left( R_{k,k}(z) - \sum_{j=0}^{p} n^{-j} R^{(j)}_{k,k}(z) \right), \]
we get from (2.19) the equation of the form (cf. (2.55))
\[ 2 g(z)\, R^{(p+1,n)}_{k,k}(z) = \frac{1}{2\pi i} \int_L \tilde V(z,\zeta)\, R^{(p+1,n)}_{k,k}(\zeta)\, d\zeta - F_k^{(p+1,R)}(z) - r^{(p+1,R)}_{k,n}(z) + e_n(z), \tag{2.73} \]
where (p+1,R) Fk (z) (p+1,R)
rk,n
=
p l=0
(p+1−l) (l) gk−1 (z)Rk,k (z) +
(p+1)
(z) = 2Rk,k
(z)
p
p
+
l=1
j =1
−
p ∞ j =k+1
l=0
(p−l)
(l)
Rm,j (z)Rm,j (z),
(l)
n−l gk−1 (z)
l,l =1,l+l >p+1
+ np−1
k
(l )
(l)
np+1−l−l gk−1 (z)Rk,k (z)
− R (z)k,k − 2
k
(Rk,m (z))2
m=1
k (p) − (S (p) · S (p) )j,j (z) − 2 (Sj,m (z))2 . m=1
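By virtue of (2.68), (1.28) and Lemma 2, we conclude that the relations
\[ |F_k^{(p+1,R)}(z)| \le \mathrm{const}\, (|k-n|^{p+1} + 1) \]
and
\[ r^{(p+1,R)}_{k,n}(z) \to 0 \quad \text{as } n \to \infty \]
are valid uniformly in $\{z : \delta_\varepsilon(z) \ge d\}$, for any fixed $d > 0$. Using again Lemma 1, we get
\[ R^{(p+1,n)}_{k,k}(z) = R^{(p+1)}_{k,k}(z) + \tilde r^{(p+1,R)}_{k,n}(z), \tag{2.74} \]
where for $\delta_\varepsilon(z) > d$,
\[ R^{(p+1)}_{k,k}(z) = \frac{1}{2\pi i} \int_L \frac{F_k^{(p+1,R)}(\zeta)\, d\zeta}{P(\zeta)(\zeta - z)}, \qquad |R^{(p+1)}_{k,k}(z)| \le \mathrm{const}\, (|k-n|^{p+1} + 1) \tag{2.75} \]
and
\[ |\tilde r^{(p+1,R)}_{k,n}(z)| \le \frac{\mathrm{const}}{|X(z)|} \left( |r^{(p+1,R)}_{k,n}(z)| + \max_{\zeta \in L} |r^{(p+1,R)}_{k,n}(\zeta)| \right). \]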
Now, as for the first-order approximation case, in the case (ii) we take the $(p+1)$-order terms (with respect to $n^{-1}$) of Eqs. (2.36) for $k = 2q$:
\[ 2\,(J^{(0)}_{2q} J^{(p+1)}_{2q} + J^{(0)}_{2q-1} J^{(p+1)}_{2q-1}) = \frac{1}{2\pi i} \int_L \zeta^2 R^{(p+1)}_{2q,2q}(\zeta)\, d\zeta + r^{(p+1,J,2)}_{2q}, \]
\[ 4\,(J^{(0)}_{2q} J^{(p+1)}_{2q} + J^{(0)}_{2q-1} J^{(p+1)}_{2q-1})\,((J^{(0)}_{2q})^2 + (J^{(0)}_{2q-1})^2) + 2 J^{(0)}_{2q} J^{(0)}_{2q-1}\,(J^{(0)}_{2q-1} J^{(p+1)}_{2q} + J^{(0)}_{2q} J^{(p+1)}_{2q+1} + J^{(0)}_{2q} J^{(p+1)}_{2q-1} + J^{(0)}_{2q-1} J^{(p+1)}_{2q-2}) = \frac{1}{2\pi i} \int_L \zeta^4 R^{(p+1)}_{2q,2q}(\zeta)\, d\zeta + r^{(p+1,J,4)}_{2q}, \tag{2.76} \]
where $F_k^{(p+1,J,2)}$ and $F_k^{(p+1,J,4)}$ are the coefficients at $n^{-p-1}$ in the r.h.s. of the second and the third equations (2.36) which we get by substituting there $J_k = \sum_{j=0}^{p} n^{-j} J_k^{(j)}$, and
\[ r_k^{(p+1,J,2)} \equiv \int_L \zeta^2\, \tilde r^{(p+1,R)}_{k,n}(\zeta)\, d\zeta \to 0, \quad n \to \infty, \]
\[ r_k^{(p+1,J,4)} \equiv \int_L \zeta^4\, \tilde r^{(p+1,R)}_{k,n}(\zeta)\, d\zeta \to 0, \quad n \to \infty. \]
Consider also the two analogs of the first relation of (2.76), in which $2q$ is replaced by $2q-1$ and $2q+1$. These relations together with (2.76) comprise a linear system with respect to the variables $J^{(p+1)}_{2q-2}$, $J^{(p+1)}_{2q-1}$, $J^{(p+1)}_{2q}$ and $J^{(p+1)}_{2q+1}$. For $J^{(0)}_{2q} \neq J^{(0)}_{2q-1}$, i.e. in the case (ii), the system is uniquely soluble and the solution satisfies condition (1.29) in view of (2.75).

However, for $J^{(0)}_{2q} = J^{(0)}_{2q-1}$ this system is degenerate, and so in the case (i) we cannot find $J_k^{(p+1)}$ from the system. Therefore, similarly to (2.62)–(2.64), for the case (i) we obtain the one-parameter family of solutions
\[ J_k^{(p+1)} = b_k^{(p+1)} - c\,(-1)^{k-n} + \tilde r^{(p+1,J)}_{k,n}, \tag{2.77} \]
where
\[ b_k^{(p+1)} = \sum_{j=0}^{k-n} (-1)^{k-n-j}\, a^{(p+1)}_{n+j}, \qquad \tilde r^{(p+1,J)}_{k,n} = \sum_{j=0}^{k-n} (-1)^{k-n-j}\, r^{(p+1,J,2)}_{n+j,n}, \]
with
\[ a_k^{(p+1)} \equiv -F_k^{(p+1,J,2)} + \frac{1}{2\pi i} \int_L \zeta^2 R^{(p+1)}_{k,k}(\zeta)\, d\zeta. \]
To fix the parameter $c$ we use again identity (2.65) and Lemma 2. Then we get the equation for $c$ of the form
\[ D^{(i)} c - A^{(i)}_{p+1} = 0, \]
where, as usual in perturbation theory, the coefficient $D^{(i)}$ is the same in each order of the procedure. Thus, in view of Lemma 3, $D^{(i)}$ is nonzero and the parameter $c$ is
uniquely defined by this equation. By the same argument as in the case $p = 1$ it is easy to see that, in view of (2.75), $q_k^{(p+1)}$ and $J_k^{(p+1)}$ satisfy the bounds (1.30). Theorem 1 is proven.

Proof of Corollary 1. By using the general formulas (1.18)–(1.25), (2.12), (2.14)–(2.16) and the Christoffel–Darboux identity for orthogonal polynomials it can be shown that the covariance (1.39) can be written as
\[ D_n(z_1, z_2) = \frac{1}{2n^2} \int\!\!\int \frac{(\lambda - \mu)^2\, k_n^2(\lambda, \mu)\, d\lambda\, d\mu}{(z_1 - \lambda)(z_1 - \mu)(z_2 - \lambda)(z_2 - \mu)} = \frac{J_n^2}{n^2} \left[ \frac{\delta R_{n+1,n+1}}{\delta z} \frac{\delta R_{n,n}}{\delta z} - \left( \frac{\delta R_{n+1,n}}{\delta z} \right)^2 \right] + e_n(z_1) + e_n(z_2), \tag{2.78} \]
where $k_n(\lambda, \mu)$ is defined in (1.20) and we denote $\delta R_{k,j} \equiv R_{k,j}(z_1) - R_{k,j}(z_2)$ and $\delta z \equiv z_1 - z_2$. Then, on the basis of Lemma 2, we conclude that the amplitude $d_n^{(2)}(z_1, z_2)$ of the asymptotic formula (1.40) is:
\[ d_n^{(2)}(z_1, z_2) = (J_n^{(0)})^2 \left[ \frac{\delta R^{(0)}_{n+1,n+1}}{\delta z} \frac{\delta R^{(0)}_{n,n}}{\delta z} - \left( \frac{\delta R^{(0)}_{n+1,n}}{\delta z} \right)^2 \right]. \]
According to Theorem 1 and Remark 2 after the theorem, the zero-order coefficients $J_k^{(0)}$ of the Jacobi matrix $J^{(n)}$ do not depend on $k$ ($k = n(1 + o(1))$) in the case (i) and are 2-periodic functions of $k$ in the case (ii). Thus, we have only to compute the matrix elements of the resolvent of the constant Jacobi matrix and of the 2-periodic Jacobi matrices whose coefficients are given by (1.34) and (1.35) in the cases (i) and (ii) respectively. The computations are standard and lead to (1.41) and to (1.42). □

Proof of Corollary 2. The weak convergence of $(\psi_k^{(n)}(\lambda))^2$ is equivalent to the convergence of its Stieltjes transform
\[ \int \frac{(\psi_k^{(n)}(\lambda))^2\, d\lambda}{z - \lambda} \tag{2.79} \]
uniformly in $z$ on any compact set of $\mathbb{C} \setminus \mathbb{R}$. According to (2.12) and Proposition 2 the Stieltjes transform (2.79) is $R_{kk}(z) + e_n(z)$. Now the asymptotic formula (2.33) implies that the Stieltjes transform (2.79) converges to >(z) as $n \to \infty$ and $\mathrm{dist}\{z, \sigma_\varepsilon\} \ge d > 0$. This fact and the inversion formula (3.2) yield the result. □
3. Auxiliary Results

Proof of Proposition 1. For the proof of weak convergence of the measures $N_n$ and (1.10) see [8]. Furthermore, it follows from Eq. (2.22) that in $D$ the function $g(z)$ can be written as
\[ g(z) = \frac{V'(z)}{2} - \frac{1}{2} \sqrt{(V'(z))^2 - 4Q(z)}, \tag{3.1} \]
where $Q(z)$ is defined in (2.23). Since
\[ \rho(\lambda) = -\frac{1}{\pi} \lim_{\varepsilon \to +0} \Im\, g(\lambda + i\varepsilon), \tag{3.2} \]
we conclude that $\rho(\lambda)$ satisfies the Hölder condition. Thus we find from the real parts of (3.1) that
\[ \mathrm{v.p.} \int_\sigma \frac{\rho(\mu)\, d\mu}{\lambda - \mu} = \frac{V'(\lambda)}{2}, \qquad \lambda \in \sigma. \]
Regarding this relation as a singular integral equation and using standard facts (see [21]), we obtain (1.10) in which
\[ P(\lambda) = \frac{1}{\pi} \int_\sigma Q(\lambda, \mu)\, X_+^{-1}(\mu)\, d\mu, \]
and $Q$ and $X_+^{-1}(\mu)$ are defined in (1.16) and (1.11). It is clear that $P(\lambda)$ can be analytically continued into $D$ and can be written in the form (1.15). Since $g(z)$ is uniquely determined by its boundary values on $\sigma$ and its asymptotic behaviour $g(z) = z^{-1}(1 + o(1))$ as $z \to \infty$, we obtain the assertions of the proposition. □
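Formulas (3.1) and (3.2) can be illustrated numerically on the simplest example. The sketch below assumes the Gaussian potential $V(\lambda) = \lambda^2/2$, which is not singled out in the text, and assumes that for this potential $Q(z) \equiv 1$ (as is the case when $Q$ has the usual form $\int (V'(z) - V'(\mu))(z-\mu)^{-1}\rho(\mu)\,d\mu$). Under these assumptions (3.1) gives $g(z) = (z - \sqrt{z^2 - 4})/2$, and the inversion formula (3.2) should return the semicircle density $\sqrt{4 - \lambda^2}/(2\pi)$. The function names are illustrative only.

```python
import numpy as np

def g_gauss(z):
    """g(z) from (3.1) for V(x) = x^2/2, under the assumption Q(z) = 1."""
    z = np.asarray(z, dtype=complex)
    # Branch of sqrt(V'(z)^2 - 4Q) that behaves like z at infinity,
    # so that g(z) = 1/z + O(1/z^2).
    root = z * np.sqrt(1.0 - 4.0 / z**2)
    return 0.5 * (z - root)

def density(lam, eps=1e-8):
    """Inversion formula (3.2): rho = -(1/pi) Im g(lam + i*eps), eps small."""
    return -np.imag(g_gauss(lam + 1j * eps)) / np.pi

lam = np.linspace(-1.9, 1.9, 9)
semicircle = np.sqrt(4.0 - lam**2) / (2.0 * np.pi)
max_err = float(np.max(np.abs(density(lam) - semicircle)))
print(max_err)
```

The only delicate point is the choice of the branch of the square root: writing it as $z\sqrt{1 - 4/z^2}$ with the principal root keeps $g(z) \sim 1/z$ at infinity, in agreement with the normalization used at the end of the proof.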
Proof of Proposition 2. According to the result of [8] and our condition C2, if we consider the function $u(x)$ of the form (1.9), then $u(x) = C^*$ ($x \in \sigma$) and $u(x) < C^*$ ($x \notin \sigma$). It is easy to see that at all endpoints $a^*$ of $\sigma$ there exist one-sided derivatives $u'_\pm(a^*)$ (we take the right derivative for the right endpoints $a^*$ and the left derivative for the left endpoints), and these derivatives are nonzero. Set $C_1 = \frac{1}{2} \min |u'_\pm(a^*)|$ and consider the function
\[ V_1(x) = \begin{cases} 0, & x \in \sigma, \\ C_1 \varepsilon, & x \in \mathbb{R} \setminus \sigma_\varepsilon, \\ \pm C_1 (x - a^*), & x \in \sigma_\varepsilon \setminus \sigma. \end{cases} \tag{3.3} \]
In the last line here we take plus for the right endpoints and minus for the left endpoints of the spectrum. It is easy to see that we can always choose $\varepsilon_0$ so small that for any $\varepsilon \le \varepsilon_0$ the function $u_1(x) \equiv u(x) + V_1(x)$ also takes its maximum value $C^*$ on $\sigma$.

Consider now the following functions of $(x_1, \dots, x_n) \in \mathbb{R}^n$ that we will call Hamiltonians because their role below will be analogous to that of Hamiltonians of classical statistical mechanics (see [8] for this analogy):
\[ H_n(x_1, \dots, x_n) = n \sum_{i=1}^{n} V(x_i) - 2 \sum_{1 \le i < j \le n} \ln |x_i - x_j|, \]
\[ H_n^{(1)}(x_1, \dots, x_n) = n \tilde V(x_1) + n \sum_{i=2}^{n} V(x_i) - 2 \sum_{1 \le i < j \le n} \ln |x_i - x_j|, \]
\[ H_n^{(1a)}(x_1, \dots, x_n) = \tilde V(x_1) - (n-1)\, u_1(x_1) + n \sum_{i=2}^{n} V(x_i) - 2 \sum_{2 \le i < j \le n} \ln |x_i - x_j|, \]
\[ H_n^{(a)}(x_1, \dots, x_n) = -n V_1(x_1) - n \sum_{i=1}^{n} u(x_i) + n(n-1) \int\!\!\int \ln |x - y|\, \rho(x) \rho(y)\, dx\, dy, \tag{3.4} \]
where $\tilde V(x) \equiv V(x) - V_1(x)$, $u$ is defined in (1.9), and $u_1 = u + V_1$. Denote by $p_n = (Z)^{-1} \exp\{-H_n\}$ the probability density defined by one of these functions (cf. (1.17)). We will use the Bogolyubov inequality, valid for any two Hamiltonians $H_{1,2}$ with corresponding normalization constants (partition functions) $Z_{1,2}$,
\[ \langle H_2 - H_1 \rangle_{H_2} \le \log Z_1 - \log Z_2 \le \langle H_2 - H_1 \rangle_{H_1}, \tag{3.5} \]
where the symbol $\langle \dots \rangle_H$ denotes the mathematical expectation with respect to the probability density $p = Z^{-1} \exp\{-H\}$.

Using the r.h.s. inequality in (3.5) for $H_1 = H_n^{(1)}$ and $H_2 = H_n^{(1a)}$, we get
\[ \log Z_n^{(1)} - \log Z_n^{(1a)} \le 2(n-1) \int\!\!\int \log |x_1 - x_2| \left( \rho_n^{(1,1)}(x_1, x_2) - \rho_n^{(1,1)}(x_1)\, \rho_n^{(1,2)}(x_2) \right) dx_1\, dx_2 + 2(n-1) \int\!\!\int \log |x_1 - x_2|\, \rho_n^{(1,1)}(x_1) \left( \rho_n^{(1,2)}(x_2) - \rho(x_2) \right) dx_1\, dx_2, \tag{3.6} \]
where $\rho_n^{(1,1)}(x_1)$ and $\rho_n^{(1,2)}(x_2)$ are the first marginal densities corresponding to $x_1$ and $x_2$ for the Hamiltonian $H_n^{(1)}$ (note that $\rho_n^{(1,1)}(x_1) \neq \rho_n^{(1,2)}(x_1)$ since $H_n^{(1)}$ is not symmetric in $x_1$ and $x_2$), and $\rho_n^{(1,1)}(x_1, x_2)$ is the second marginal density corresponding to $x_1, x_2$ (note that $\rho_n^{(1,1)}(x_1, x_2)$ is not symmetric for the same reason). Lemma 4 of [8] (valid for not necessarily symmetric Hamiltonians) implies that the first term in the r.h.s. of (3.6) is $O(\log n)$. To estimate the second term we first take into account that the integral kernel $\log |x - y|^{-1}$ is positive definite, hence by the corresponding Schwarz inequality
\[ \left| \int\!\!\int \log |x - y|\, \rho_n^{(1,1)}(x) \left( \rho_n^{(1,2)}(y) - \rho(y) \right) dx\, dy \right| \le \left| \int\!\!\int \log |x - y|\, \rho_n^{(1,1)}(x)\, \rho_n^{(1,1)}(y)\, dx\, dy \right|^{1/2} \times \left| \int\!\!\int \log |x - y| \left( \rho_n^{(1,2)}(x) - \rho(x) \right) \left( \rho_n^{(1,2)}(y) - \rho(y) \right) dx\, dy \right|^{1/2}. \tag{3.7} \]
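The Bogolyubov inequality (3.5) is a consequence of Jensen's inequality and holds on any state space. As a sanity check, the following minimal sketch verifies it on a finite state space, where the integrals become sums; the two "Hamiltonians" are random vectors chosen purely for illustration and have nothing to do with the specific $H_n$, $H_n^{(1)}$ of (3.4).

```python
import numpy as np

rng = np.random.default_rng(0)

def log_partition(H):
    """log Z for Z = sum_x exp(-H(x)), computed stably."""
    m = np.min(H)
    return -m + np.log(np.sum(np.exp(-(H - m))))

def gibbs_mean(F, H):
    """Expectation of F with respect to p = exp(-H) / Z."""
    w = np.exp(-(H - np.min(H)))
    return float(np.sum(F * w) / np.sum(w))

def bogolyubov_bounds(H1, H2):
    """Return the three quantities compared in (3.5), in order."""
    lower = gibbs_mean(H2 - H1, H2)
    middle = float(log_partition(H1) - log_partition(H2))
    upper = gibbs_mean(H2 - H1, H1)
    return lower, middle, upper

H1 = rng.normal(size=50)
H2 = rng.normal(size=50)
lower, middle, upper = bogolyubov_bounds(H1, H2)
print(lower, middle, upper)
```

In the proof the right-hand inequality is the one used to obtain (3.6): the difference of the two Hamiltonians is averaged over the measure whose marginals can be controlled.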
By using the estimate (1,1) x˜ (1,1) ρ x + 3/γ − ρn (x) n n n
x˜ x˜ = (Zn(1) )−1 dx2 . . . dxn exp − nV˜ x + 3/γ − n V xi + 3/γ n n i=2 (3.8) n n n − exp − nV˜ (x) − n V (xi ) · |x − xi |2 |xi − xj |2 i=2
i=2
2≤i<j
const (1,1) ρ ≤ (x), n n
(1,1) valid for |x| ˜ < 1 in view of the condition (1.3), and the fact that ρn (x)dx = 1, (1,1) we obtain that ρn (x) ≤ const n3/γ . Hence we have the following bound for the first factor in the r.h.s. of (3.7): ln |x − y|ρ (1,1) (x)ρ (1,1) (y)dxdy ≤ const log n. n n To estimate the second factor in the r.h.s. of (3.7) we use the l.h.s inequality in (3.5) for (a) (1) (a) (1) the Hamiltonians H1 = Hn and H2 = Hn , where Hn and Hn are defined in (3.4). We obtain the inequality (n − 1)(n − 2) log |x − y|(ρn(1,2) (x, y) − ρn(1,2) (x)ρn(1,2) (y))dxdy − n2 (n − 1)(n − 2) − log |x − y|(ρn(1,2) (x) − ρ(x))(ρn(1,2) (y) − ρ(y))dxdy n2 2(n − 1) + log |x − y|(ρn(1,2) (x) − ρ(x))ρ(y)dxdy n2 2 + log |x − y|ρn(1,1) (x)ρ(y)dxdy (3.9) n 2(n − 1) − log |x − y|ρn(1,1) (x, y)dxdy n2
1 1 1 1 (a) ≤ 2 log Zn(a) − 2 log Zn(1) = log Z − log Z n n n n n2 n2
1 log n 1 2 + log Zn − 2 log Zn(1) ≤ O − V1 (x)ρn (x)dx. 2 n n n n (a)
In the r.h.s. here we have used the result of [8] to estimate $n^{-2} \log Z_n^{(a)} - n^{-2} \log Z_n$ and inequality (3.5) to estimate $n^{-2} \log Z_n - n^{-2} \log Z_n^{(1)}$. Using Lemma 4 of [8] (more precisely, repeating almost literally the arguments of that lemma in the case of the non-symmetric Hamiltonian), we obtain that the first and the last terms in the l.h.s. of (3.9) are of the order $O(\log n / n)$. The third, the fourth and the fifth terms here are evidently of the order $O(n^{-1})$. Therefore finally we get from (3.9),
\[ -\int\!\!\int \log |x - y| \left( \rho_n^{(1,2)}(x) - \rho(x) \right) \left( \rho_n^{(1,2)}(y) - \rho(y) \right) dx\, dy \le \mathrm{const}\, \frac{\log n}{n}. \tag{3.10} \]
Substituting this estimate in (3.6) we obtain
\[ \log Z_n^{(1)} - \log Z_n^{(1a)} \le \mathrm{const}\, \sqrt{n}\, \log n. \tag{3.11} \]
Now we use the r.h.s. inequality in (3.5) for $H_1 = H_n^{(1a)}$ and $H_2 = H_n$, where $H_n^{(1a)}$ and $H_n$ are defined in (3.4). We get
\[ \log Z_n^{(1a)} - \log Z_n \le n \int V_1(x_1)\, \rho_n^{(a1)}(x_1)\, dx_1 + (n-1) \int\!\!\int \rho_n^{(a1)}(x_1) \left( \rho_n^{(a2)}(y) - \rho(y) \right) dx_1\, dy, \tag{3.12} \]
where $\rho_n^{(a1)}$ and $\rho_n^{(a2)}$ are the first marginal densities of the Hamiltonian $H_n^{(1a)}$, corresponding to $x_1$ and $x_2$. On the other hand it is easy to see that
\[ \rho_n^{(a1)}(x) = \frac{\exp\{(n-1)\, u_1(x) - \tilde V(x)\}}{\int \exp\{(n-1)\, u_1(x) - \tilde V(x)\}\, dx}, \]
and due to the choice of the function $V_1$ the density $\rho_n^{(a1)}(x)$ decays exponentially outside of $\sigma$. Thus, since $V_1(x) = 0$ for $x \in \sigma$, the first term in the r.h.s. of (3.12) is of the order $O(1)$. The second term can be estimated by the Schwarz inequality similarly to (3.7), using the fact that $\rho_n^{(a2)}(x)$ coincides with the first marginal density of the Hamiltonian
\[ H_n(x_2, \dots, x_n) = n \sum_{i=2}^{n} V(x_i) - 2 \sum_{2 \le i < j} \ln |x_i - x_j|. \]
Therefore the analog of inequality (3.10) for $\rho_n^{(a2)}(x)$ follows directly from the results of [8]. Thus, from (3.12) we derive
\[ \log Z_n^{(1a)} - \log Z_n \le \mathrm{const}\, \sqrt{n}\, \log n. \tag{3.13} \]
Bounds (3.11) and (3.13) lead to the relation
\[ \int e^{n V_1(x_1)}\, \rho_n(x_1)\, dx_1 = \frac{Z_n^{(1)}}{Z_n} \le e^{C_2 \sqrt{n} \log n}. \]
Taking $C_0 = 2 C_2 / C_1$, we obtain from the last relation that for any positive $\varepsilon$ satisfying the inequality $C_0 n^{-1/2} \log n \le \varepsilon \le \varepsilon_0$ we have
\[ \int_{\mathbb{R} \setminus \sigma_\varepsilon} \rho_n(x_1)\, dx_1 \le \exp\{C_2 \sqrt{n} \log n - C_1 \varepsilon n\} \le e^{-n C_1 \varepsilon / 2}. \]
To obtain this statement for $\rho_{k,n}$ we have to prove now that for any $n$-independent $\varepsilon$ we can choose $\varepsilon_1$ such that for $|k - n| \le \varepsilon_1 n$ the spectrum of the ensemble with potential $\tilde V \equiv \frac{n}{k} V$ is inside of $\sigma_{\varepsilon/2}$. This fact follows from the main result of [8, 12] and also from [19]. Proposition 2 is proven. □
Proof of Lemma 1. Using Proposition 1 we rewrite Eq. (2.27) in $D$:
\[ P(z) X(z) R(z) = \frac{1}{2\pi i} \int_L d\zeta\, Q(z, \zeta) R(\zeta), \tag{3.14} \]
with $Q(z, \zeta)$ defined by (1.16). It follows from formula (1.15) for $P(z)$ that the function >(z) of (2.28) solves Eq. (3.14) in the class (2.29). Let us show that the solution is unique. Denoting by $\tilde Q(z)$ the r.h.s. of (3.14), we see that $\tilde Q(z)$ is an analytic function in $D$. From Eq. (3.14) we derive that the zeros of $P(z)$ in $D$ coincide with zeros of $\tilde Q(z)$ and have the same order. Thus, the function $R(z) X(z) = \tilde Q(z) / P(z)$ is analytic in $D$. In the rest of $\mathbb{C}$ it is analytic, because we are looking for a solution analytic outside $\sigma_\varepsilon$. Thus $R(z) X(z)$ is analytic in the whole of $\mathbb{C}$. Besides, if $R(z) = \frac{1}{z}(1 + o(1))$ as $|z| \to \infty$, then in the case (i) $R(z) X(z)$ is bounded as $|z| \to \infty$. Therefore by the Liouville theorem, $R(z) X(z)$ is a constant. In the case (ii) we get also from the Liouville theorem that $R(z) X(z) = az + b$. By the symmetry of the function $R(z)$ we get $R(z) = z X^{-1}(z)$. This proves the first statement of the lemma.

To prove the second statement, we notice that under condition (2.30) in the case (i) we have $R(z) X(z) \to 0$ as $|z| \to \infty$. Thus, according to the above conclusions, $R(z) = 0$ for all $z$. In the case (ii) condition (2.30) implies that $R(z) X(z) = \mathrm{const}$ and we get $R(z) X(z) = 0$ from the symmetry condition.

To prove that (2.32) is a solution of Eq. (2.31) we note first that for any closed contour $L$ that does not contain the zeros of $P(z)$ we can write the relation
\[ R(z) X(z) = \frac{1}{2\pi i} \int_L \frac{R(\zeta) X(\zeta)\, d\zeta}{\zeta - z} - \frac{1}{2\pi i} \int_L \frac{\tilde Q(\zeta)\, d\zeta}{P(\zeta)(\zeta - z)}, \tag{3.15} \]
where $\tilde Q(z)$ is defined as in the r.h.s. of (3.14). Indeed, under the condition of the lemma $R(z) X(z) = z^{-1}(1 + o(1))$ as $z \to \infty$, i.e. the function is analytic outside of the contour $L$. Then, by the Cauchy theorem, the first term in the r.h.s. is $R(z) X(z)$. The second term is zero, because the integrand is analytic inside the contour $L$ and $z$ is outside of $L$. By using this relation, we can rewrite formula (2.32) for the solution as
\[ \frac{1}{2\pi i} \int_{L_1} \left[ (V'(\zeta) - P(\zeta) X(\zeta)) R(\zeta) - \frac{1}{2\pi i} \int_L V(\zeta, \zeta_1) R(\zeta_1)\, d\zeta_1 + F(\zeta) \right] \frac{d\zeta}{P(\zeta)(\zeta - z)} = 0, \]
where the contour $L_1$ lies outside of $L$ and is close enough to $L$. According to the condition of the lemma the expression in the brackets is analytic outside of $L_1$. Thus by the Cauchy theorem, we have
\[ (V'(z) - P(z) X(z)) R(z) - \frac{1}{2\pi i} \int_L V(z, \zeta) R(\zeta)\, d\zeta + F = 0. \]
Since $2g(z) = V'(z) - P(z) X(z)$, the last relation proves that (2.32) is the solution of Eq. (2.31). Uniqueness follows from the absence of solutions of the homogeneous equation (2.27) in the class (2.30). This fact was proven above. □
Proof of Lemma 2. Consider the "block" symmetric Jacobi matrix $\dot J^{(n,N_1)}$ which can be obtained from $J$ if we set $J_{n-N_1-1} = 0$. Let $\dot R^{(n,N_1)}(z)$ be its resolvent. We will use the resolvent identity valid for any two selfadjoint operators $J_{1,2}$ with resolvents $R_{1,2}$ respectively,
\[ R_1(z) - R_2(z) = R_1(z)(J_2 - J_1) R_2(z). \tag{3.16} \]
Thus, taking as $R_1(z)$ the resolvent $R(z)$ of $J^{(n)}$, and as $R_2(z)$ the resolvent $\dot R^{(n,N_1)}(z)$ of $\dot J^{(n,N_1)}$, we obtain
\[ R_{k,j}(z) - \dot R^{(n,N_1)}_{k,j}(z) = \dot R^{(n,N_1)}_{k,n-N_1-1} J_{n-N_1-1} R_{n-N_1,j}(z) + \dot R^{(n,N_1)}_{k,n+N_1+1} J_{n+N_1+1} R_{n+N_1+2,j}(z). \tag{3.17} \]
Now we use a general fact of the theory of Jacobi matrices.

Proposition 3. Let $J$ be the Jacobi matrix with coefficients $J_{k,k+1} = J_{k+1,k} = a_k \in \mathbb{R}$, $|J_{k,k}| \le \varepsilon$, and $|a_k| \le A$. Then there exist positive constants $C_{1,2}$ such that for any $z \in \mathbb{C} \setminus [-2A - \varepsilon, 2A + \varepsilon]$ the matrix elements of the resolvent $G = (zI - J)^{-1}$ satisfy the inequalities:
\[ |G_{k,k'}(z)| \le \frac{C_1}{\delta_\varepsilon(z)}\, e^{-C_2 \delta_\varepsilon(z) |k - k'|}, \tag{3.18} \]
where $\delta_\varepsilon(z) \equiv \mathrm{dist}\{z, [-2A - \varepsilon, 2A + \varepsilon]\}$.

The proof of the proposition is similar to that of the well-known Combes-Thomas estimates for the Schroedinger operator (see e.g. [26]) and we omit the proof. On the basis of the proposition we obtain the bound
\[ |\dot R^{(n,N_1)}_{j,k}(z)| \le \frac{1}{\delta_\varepsilon(z)}\, e^{-C_2 \delta_\varepsilon(z) |j - k|}. \tag{3.19} \]
Thus, for $(N_1 - 2\tilde N) \le |k - n| \le (N_1 - \tilde N)$ we have
\[ |\dot R^{(n,N_1)}_{n-N_1-1,k}(z)|,\ |\dot R^{(n,N_1)}_{n+N_1+1,k}(z)| \le \frac{1}{\delta_\varepsilon(z)}\, e^{-C_2 \delta_\varepsilon(z) \tilde N}. \]
So, it follows from (3.17) that
\[ |R_{k,j}(z) - \dot R^{(n,N_1)}_{k,j}(z)| \le \frac{\mathrm{const}}{|z|\, \delta_\varepsilon(z)}\, e^{-C_1 \delta_\varepsilon(z) \tilde N}. \tag{3.20} \]
Similarly, if we consider the $(2N_1 + 1)$-periodic symmetric Jacobi matrix $\tilde J$ such that
\[ \tilde J_{k,k+1} = J_{k,k+1}, \qquad |k - n| \le N_1, \tag{3.21} \]
and denote by $\tilde R$ its resolvent, then
\[ |\tilde R_{k,k} - \dot R^{(n,N_1)}_{k,k}(z)| \le \frac{2}{|z|\, \delta_\varepsilon(z)}\, e^{-C_2 \delta_\varepsilon(z) \tilde N}. \tag{3.22} \]
Therefore,
\[ |R_{k,k}(z) - \tilde R_{k,k}(z)| \le \frac{\mathrm{const}}{|z|\, \delta_\varepsilon(z)}\, e^{-C_2 \delta_\varepsilon(z) \tilde N}. \tag{3.23} \]
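The two tools used in this part of the proof, the resolvent identity and the exponential off-diagonal decay of Proposition 3, are easy to observe numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it uses a finite random Jacobi matrix, a real spectral parameter $z > 2A + \varepsilon$, and the convention $G = (zI - J)^{-1}$ of Proposition 3, for which the identity takes the form $G_1 - G_2 = G_1 (J_1 - J_2) G_2$ (with the opposite convention $(J - zI)^{-1}$ one gets the sign written in (3.16); only absolute values enter the bounds). The decay is compared with the elementary Neumann-series bound $|G_{k,k'}| \le \delta^{-1} (\|J\|/|z|)^{|k-k'|}$, which is a crude stand-in for (3.18) when $|z| > \|J\|$.

```python
import numpy as np

rng = np.random.default_rng(1)

def jacobi(offdiag, diag):
    """Symmetric tridiagonal (Jacobi) matrix with the given entries."""
    return np.diag(diag) + np.diag(offdiag, 1) + np.diag(offdiag, -1)

def resolvent(J, z):
    """G(z) = (zI - J)^{-1}, the convention of Proposition 3."""
    return np.linalg.inv(z * np.eye(J.shape[0]) - J)

N, A, eps = 200, 1.0, 0.1
J1 = jacobi(rng.uniform(0.5 * A, A, N - 1), rng.uniform(-eps, eps, N))
J2 = J1.copy()
J2[N // 2, N // 2 + 1] = J2[N // 2 + 1, N // 2] = 0.0   # "block" matrix: one bond cut

delta = 0.5                      # distance from z to [-2A - eps, 2A + eps]
z = 2 * A + eps + delta
G1, G2 = resolvent(J1, z), resolvent(J2, z)

# Resolvent identity for G = (zI - J)^{-1}.
identity_error = float(np.max(np.abs(G1 - G2 - G1 @ (J1 - J2) @ G2)))

# Off-diagonal decay along one row and a fitted exponential rate.
k0 = N // 4
dist = np.arange(0, 16)
row = np.abs(G1[k0, k0 + dist])
rate = float(-np.polyfit(dist, np.log(row), 1)[0])
q = (2 * A + eps) / z            # ||J|| <= 2A + eps by Gershgorin's theorem
print(identity_error, rate)
```

The fitted rate is only a rough indicator; the point of the example is that cutting a single bond far from the indices of interest changes the resolvent entries by an exponentially small amount, which is what (3.20)-(3.23) quantify.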
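Applying the resolvent identity (3.16) to the matrices $\tilde J^{(p)}$ and $\tilde J$ we obtain in view of estimate (1.28):
\[ |\tilde R_{k,j}(z) - \tilde R^{(p)}_{k,j}(z, n^{-1})| \le \frac{2 \varepsilon_n^{(p)}}{n^p |z|^2}, \tag{3.24} \]
where $\tilde R^{(p)}_{k,j}(z, s)$ is the resolvent of the Jacobi matrix $\tilde J^{(p)}(z, s)$ defined in (2.48) and $\varepsilon_n^{(p)}$ is defined in (1.30). Now expanding $\tilde R^{(p)}_{k,k}(z, n^{-1})$ with respect to $n^{-1}$ it is easy to find that
\[ |\tilde R^{(p)}_{k,j}(z, n^{-1}) - S^{(p)}_{k,j}(z)| \le \frac{C_1 N_1^{p+1}}{\delta_\varepsilon^{p+1}(z)\, n^{p+1}}. \tag{3.25} \]
From (3.23)–(3.25) we derive that
\[ |R_{k,k}(z) - S^{(p)}_{k,k}(z)| \le \frac{C_1 N_1^{p+1}}{\delta_\varepsilon^{p+1}(z)\, n^{p+1}} + \frac{\varepsilon_n^{(p)}}{\delta_\varepsilon^2(z)\, n^p} + \frac{e^{-C_2 \delta_\varepsilon(z) \tilde N}}{|z|\, \delta_\varepsilon(z)}. \]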
This inequality and (2.11), lead to the first inequality in (2.51). To prove the second inequality in (2.51) we use again identity (3.17). Taking the second power of the identity, using the bounds (n,N1 )
|R˙ k,j
(z)|, |Rk,j (z)| ≤
1 , |z|
valid for resolvents of arbitrary selfadjoint operators, and bound (3.19), we obtain ∞ ∞ (n,N1 ) 2 2 ˙ (Rk,j (z)) − (Rk,j (z)) j =1
j =1
∞
∞ 4 −C2 δε (z)N˜ 2 2 (3.26) ≤ |Rn−N1 ,j (z)| + |Rn+N1 ,j (z)) | e δε (z) j =1
j =1
8 ˜ ≤ e−C2 δε (z)N . 2 |z| δε (z) 2 To estimate here the sums of the type ∞ j =1 |Rn−N1 ,j (z)| we have used the simple inequalities ∞
|Rn−N1 ,j (z)|2 =
j =1
∞
Rn−N1 ,j (z)Rj,n−N1 (z) ≤ (R(z) · R(z))n−N1 ,n−N1 ≤
j =1
Similarly, ∞ ∞ 1 ˜ (n,N1 ) 2 2 ˜ ˙ e−C2 δε (z)N . (Rk,m (z)) − (Rkm (z)) ≤ 2 2 |z| δε (z) m=k+1
1 . |z|2
(3.27)
m=k+1
And then, by the same way as in (3.23)-(3.25) we get the second inequality of (2.51). The proof of (2.52) is similar.
Note that in fact we have proved (2.51) and (2.52) for |k − n| ≤ (N1 − N˜ ). To prove (2.53) we need to make one more step. Let us prove that for |k − n| ≤ (N1 − 2N˜ ), ˜ n−(N1 −N) k 1 ˜
2 −C2 δε (z)N/2 ≤ − R (z) − R (z) . j,j j,m |z|2 δ (z) e ε
(3.28)
m=1
j =1
˜
To this end we consider one more ”block” symmetric Jacobi matrix J˙(n,(N1 −2N)) which can be obtained from J if we put Jn−(N1 −2N)−1,n−(N ˜ ˜ = 0. Using identity (3.16) 1 −2N) ˜ ˜ (n,(N −2 N)) (n,(N −2 N)) 1 1 and (3.19) for J˙ , we obtain similarly to (3.26), for J and J˙ ˜ ˜ n−(N1 −N) n−(N ∞ ∞ 1 −N) ˜ (n,(N1 −2N)) 2 2 ˙ (R (z)) − ( R (z)) j,m j,m j =1
m=k+1
j =1
≤
m=k+1
2n −C2 δε (z)N˜ 1 ˜ e ≤ e−C2 δε (z)N/2 . |z|3 |z|2 δε (z) ˜ (n,(N1 −2N))
Then, using the estimate (3.19) for R˙ j,m k + 1 > n − (N1 − 2N˜ ) we get
(3.29)
(z) with j ≤ n − (N1 − N˜ ) and m ≥
˜ n−(N1 −N) ∞ 2n 1 ˜ ˜ ˜ (n,(N1 −2N)) 2 ˙ (Rj,m (z)) ≤ 2 e−C2 δε (z)N ≤ 2 e−C2 δε (z)N/2 . δε (z) δε (z) j =1
m=k+1
This inequality combined with (3.29) proves that ˜ n−(N1 −N) k 2 (R · R) (z) − R (z) j,j j,m m=1
j =1
˜ n−(N1 −N) ∞ 1 −C2 δε (z)N/2 ˜ 2 = Rj,m (z) ≤ e . |z|3 j =1
m=k+1
(z)) and R Now, using (2.11), we can replace (R ∗ R)j,j (z) by (−Rj,j j,m (z) by Rj,m (z) to get (3.28). Applying the first and the second line of (2.51) for |k − n| ≤ (N1 − N˜ ) we get (2.53). !
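Proof of Lemma 3. To find $D^{(i)}$ we first compute the quantity
\[ (R^{(0)}(\zeta) J^{(\pm)} R^{(0)}(\zeta))_{n,n+1} = \sum_{j=-\infty}^{\infty} \left( R^{(0)}_{n,j}(\zeta) (-1)^{n-j} R^{(0)}_{j+1,n+1}(\zeta) + R^{(0)}_{n,j+1}(\zeta) (-1)^{n-j} R^{(0)}_{j,n+1}(\zeta) \right) \]
\[ = \sum_{j=-\infty}^{\infty} \frac{1}{(2\pi)^2} \int_0^{2\pi}\!\!\int_0^{2\pi} dx\, dy\, \frac{e^{i(n-j)(x-y-\pi)} (1 + e^{-i(x+y)})}{(\zeta - a \cos x)(\zeta - a \cos y)} \]
\[ = \frac{1}{2\pi} \int_0^{2\pi} \frac{(1 - \cos 2x)\, dx}{\zeta^2 - a^2 \cos^2 x} = 2 \left[ \left( 1 - \frac{\zeta^2}{a^2} \right) \frac{1}{2\pi} \int_0^{2\pi} \frac{dx}{\zeta^2 - a^2 \cos^2 x} + \frac{1}{a^2} \right] \]
\[ = \frac{2}{a^2} + \frac{2}{\zeta} \left( 1 - \frac{\zeta^2}{a^2} \right) X^{-1}(\zeta). \]
Then using the simple formula $R^{(0)}_{n,n+1}(\zeta) = a^{-1}(\zeta R^{(0)}_{n,n}(\zeta) - 1) = a^{-1}(\zeta X^{-1}(\zeta) - 1)$ we find from (2.65),
\[ D^{(i)} = \frac{1}{2\pi i} \int_L V'(\zeta) \left[ \frac{\zeta}{a} + \frac{a}{\zeta} \left( 1 - \frac{\zeta^2}{a^2} \right) \right] X^{-1}(\zeta)\, d\zeta = \frac{a}{2\pi i} \int_L \frac{V'(\zeta)}{X(\zeta)\, \zeta}\, d\zeta = a P(0) \neq 0. \]
Here we have used representation (1.15) and the fact that $\int_L d\zeta\, (X(\zeta) \zeta)^{-1} = 0$. Similar calculations show that $A^{(i)} = 0$, so it follows from Eq. (2.66) that $c = 0$ and we get (1.34). □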
References
1. Akemann, G.: Higher genus correlators for the Hermitian matrix model with multiple cuts. Nuclear Phys. B 482, 403–430 (1996)
2. Albeverio, S., Pastur, L., Shcherbina, M.: On asymptotic properties of certain orthogonal polynomials. Matem. Fizika, Analiz, Geometriya 4, 263–277 (1997)
3. Ambjörn, J., Akemann, G.: New universal spectral correlators. J. Phys. A 29, L555–L560 (1996)
4. Ambjörn, J., Chekhov, L., Makeenko, Yu.: Higher genus correlators from the Hermitian one-matrix model. Phys. Lett. B 282, 341–348 (1992)
5. Ambjörn, J., Jurkiewicz, J., Makeenko, Yu.M.: Multiloop correlators for two-dimensional quantum gravity. Phys. Lett. B 251, 517–524 (1990)
6. Beenakker, C.W.J.: Random-matrix theory of quantum transport. Rev. Mod. Phys. 69, 731–847 (1997)
7. Bessis, D., Itzykson, C., Zuber, J.-B.: Quantum Field Theory Techniques in Graphical Enumeration. Adv. Appl. Math. 1, 109–157 (1980)
8. Boutet de Monvel, A., Pastur, L., Shcherbina, M.: On the statistical mechanics approach in the random matrix theory. Integrated density of states. J. Stat. Phys. 79, 585–611 (1995)
9. Brézin, É., Deo, N.: Correlations and symmetry breaking in gapped matrix models. Phys. Rev. E 59, 3901–3910 (1999)
10. Brézin, É., Zee, A.: Universality of the correlations between eigenvalues of large random matrices. Nuclear Phys. B 402, 613–627 (1993)
11. Buslaev, V., Pastur, L.: On a class of multi-cut solutions of matrix models and related structures. In preparation
12. Deift, P., Kriecherbauer, T., McLaughlin, K.: New results on the equilibrium measure in the presence of external field. J. Approx. Theory 95, 388–475 (1998)
13. Deift, P., Kriecherbauer, T., McLaughlin, K., Venakides, S., Zhou, X.: Uniform asymptotics for polynomials orthogonal with respect to varying exponential weights and applications to universality questions in random matrix theory. Commun. Pure Appl. Math. 52, 1335–1425 (1999)
14. Deift, P., Kriecherbauer, T., McLaughlin, K., Venakides, S., Zhou, X.: Strong asymptotics of orthogonal polynomials with respect to exponential weights. Commun. Pure Appl. Math. 52, 1491–1552 (1999)
15. Di Francesco, P., Ginsparg, P., Zinn-Justin, J.: 2D gravity and random matrices. Phys. Rep. 254, 1–133 (1995)
16. Guhr, T., Mueller-Groeling, A., Weidenmueller, H.A.: Random matrix theories in quantum physics: Common concepts. Phys. Rept. 299, 189–425 (1998)
17. Johansson, K.: On fluctuations of eigenvalues of random Hermitian matrices. Duke Math. J. 91, 151–204 (1998)
18. Khorunzhy, A., Khoruzhenko, B., Pastur, L.: Random matrices with independent entries: Asymptotic properties of the Green function. J. Math. Phys. 37, 5033–5060 (1996)
19. Kuijlaars, A.B.J., McLaughlin, K.: Generic behavior of the density of states in random matrix theory and equilibrium problems in the presence of real analytic external fields. Commun. Pure Appl. Math. 53, 736–785 (2000)
20. Mehta, M.L.: Random Matrices. New York: Academic Press, 1991
21. Muskhelishvili, N.I.: Singular Integral Equations. Groningen: P. Noordhoff, 1953
22. Pastur, L.: Spectral and probabilistic aspects of matrix models. In: Boutet de Monvel, A., Marchenko, V. (eds.), Algebraic and Geometric Methods in Mathematical Physics. Dordrecht: Kluwer, 1996, pp. 207–247
23. Pastur, L.: Random matrices as paradigm. In: Fokas, A., Grigoryan, A., Kibble, T., Zegarlinski, B. (eds.), Mathematical Physics 2000. London: Imperial College Press, 2000, pp. 216–266
24. Pastur, L., Shcherbina, M.: Universality of the local eigenvalue statistics for a class of unitary invariant random matrix ensembles. J. Stat. Phys. 86, 109–147 (1997)
25. Pastur, L., Shcherbina, M.: Universality of the local eigenvalue statistics near generic endpoints of the spectrum for a class of unitary invariant random matrix ensembles. In preparation
26. Reed, M., Simon, B.: Methods of Modern Mathematical Physics, Vol. IV. New York: Academic Press, 1978
27. Saff, E.B., Totik, V.: Logarithmic Potentials with External Fields. Berlin: Springer, 1997

Communicated by H. Spohn
Commun. Math. Phys. 224, 307 – 321 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Symmetric Simple Exclusion Process: Regularity of the Self-Diffusion Coefficient

C. Landim (1,2), S. Olla (3), S. R. S. Varadhan (4)

1 IMPA, Estrada Dona Castorina 110, CEP 22460 Rio de Janeiro, Brasil. E-mail: [email protected]
2 CNRS UPRES-A 6085, Université de Rouen, 76128 Mont Saint Aignan, France
3 Université de Cergy Pontoise, Département de Mathématiques, CNRS UPRES-A 8088, 2 Av. Adolphe Chauvin, B.P. 222, Pontoise, 95302 Cergy-Pontoise-Cedex, France. E-mail: [email protected]
4 Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012, USA. E-mail: [email protected]

Received: 6 December 2000 / Accepted: 6 April 2001
Dedicated to Joel Lebowitz on his seventieth birthday

Abstract: We prove that the self-diffusion coefficient of a tagged particle in the symmetric exclusion process in $\mathbb{Z}^d$, which is in equilibrium at density $\alpha$, is of class $C^\infty$ as a function of $\alpha$ in the closed interval $[0, 1]$. The proof also provides a recursive method to compute the Taylor expansion at the boundaries.

1. Introduction

In the course of the study of the macroscopic behavior of large particle systems, effective diffusion coefficients often appear which are functions of the parameters (associated to the conserved quantities) that define the equilibrium measures of the system. These diffusion coefficients are usually expressed in terms of integrals of time correlation functions (Green–Kubo formulas), or through (infinite dimensional) variational formulas. They also appear as coefficients in the diffusive equations that govern the non-equilibrium evolution of the conserved quantities of the system. In order to study the existence and regularity of solutions to these equations it is important to establish first the regularity of these diffusion coefficients as functions of the parameters.

In this article we develop a method for proving smooth dependence, on the density, of the self-diffusion coefficient of a tagged or tracer particle in symmetric simple exclusion particle systems that are in equilibrium. It is based on the duality properties of the symmetric simple exclusion process. This method, with modifications, can also be applied to study the smooth dependence on the density of other diffusion coefficients that arise in the study of more general simple exclusion processes. But this will be taken up elsewhere.

The paper is organized along the following lines. In Sect. 2 we introduce the notation and state the main theorem. In Sect. 3 we describe the generalized duality and discuss several operators and norms that appear in the dual representation. Section 4 is devoted to some key estimates that are used in Sect. 5 to prove the main result on the smoothness of the self-diffusion coefficient of the tagged particle in the case of the symmetric simple exclusion process. At the end, in Remark 5.3, we present a recursive method to compute the Taylor expansion at the boundaries.

Very few results exist at the present time about the regularity of diffusion coefficients. Continuous dependence on the density has been established in different contexts (cf. [2]). Generally, proving continuity does not seem to be considerably harder than establishing the existence of a diffusion coefficient. In [6], Lipschitz continuity of the self-diffusion coefficient for the tagged particle in the symmetric simple exclusion process is proved in dimensions $d \ge 3$.

Partially supported by CNPq grant 300358/93-8, FAPERJ grant E26/150.940/99 and PRONEX grant 41.96.0923.00.
Supported by the National Science Foundation grant DMS-9803140.

2. Notation and Results

Let us fix a symmetric finite-range probability distribution $p(\cdot)$ on $\mathbb{Z}^d$. Consider the symmetric simple exclusion process associated with $p$. We assume, without loss of generality, that the subgroup generated by the support of $p$ is all of $\mathbb{Z}^d$. In addition we assume that we are not dealing with the trivial situation of $d = 1$ and $p(\pm 1) = \frac{1}{2}$, i.e. the one-dimensional nearest-neighbor case where the self-diffusion coefficient is identically zero.

The simple exclusion process is the Markov process on $\mathcal{X} = \{0, 1\}^{\mathbb{Z}^d}$ whose generator $L$ acts on cylinder functions $f$ as
\[ (Lf)(\eta) = \sum_{x, y \in \mathbb{Z}^d} p(y - x)\, \eta(x) [1 - \eta(y)] [f(\sigma^{x,y} \eta) - f(\eta)] = \frac{1}{2} \sum_{x, y \in \mathbb{Z}^d} p(y - x) [f(\sigma^{x,y} \eta) - f(\eta)]. \tag{2.1} \]
Here and below the configurations of X are denoted by Greek letters. In particular, for x in $\mathbb{Z}^d$, η(x) is equal to 1 if the site x is occupied in the configuration η and is equal to 0 if it is not. Moreover, for a configuration η and x, y in $\mathbb{Z}^d$, $\sigma^{x,y}\eta$ is the configuration obtained from η by exchanging the occupation variables η(x), η(y):
$$(\sigma^{x,y}\eta)(z) = \begin{cases} \eta(y) & \text{if } z = x,\\ \eta(x) & \text{if } z = y,\\ \eta(z) & \text{otherwise.}\end{cases} \tag{2.2}$$
Fix 0 ≤ α ≤ 1 and denote by $\mu_\alpha$ the Bernoulli product measure on X. This is the probability measure on X obtained by placing a particle with probability α at each site x, independently from the other sites. It is easy to check that each measure of the one-parameter family $\{\mu_\alpha, 0 \le \alpha \le 1\}$ is stationary, reversible and ergodic for the Markov process with generator L.

We examine in this article the evolution of a single tagged particle in the symmetric simple exclusion process. Let η be an initial configuration with a particle at the origin, i.e. with η(0) = 1. Tag this particle and denote by $\eta_t$ (resp. $X_t$) the state of the process (resp. the position of the tagged particle) at time t. We shall refer to $\eta_t$ as the environment. Let $\xi_t$ be the state of the environment as seen from the tagged particle: $\xi_t = \theta_{X_t}\eta_t$. Here, for x in $\mathbb{Z}^d$ and a configuration η, $\theta_x$ stands for the translation of η by x, i.e.
Regularity of Self-Diffusion
$(\theta_x\eta)(y) = \eta(x+y)$. Notice that the origin is always occupied (by the tagged particle) for the environment as seen from the tagged particle. For this reason, we shall consider the process $\xi_t$ as taking values in $\{0,1\}^{\mathbb{Z}^d_*}$, where $\mathbb{Z}^d_* = \mathbb{Z}^d\setminus\{0\}$. Whereas $X_t$ is not a Markov process due to the presence of the environment, $(X_t, \xi_t)$ and $\xi_t$ are. A simple computation shows that the generator L of the Markov process $\xi_t$ is given by $L = L_0 + L_\tau$, where
$$\begin{aligned}
(L_0 f)(\xi) &= \sum_{x,y\in\mathbb{Z}^d_*} p(y-x)\,\xi(x)[1-\xi(y)]\,[f(\sigma^{x,y}\xi) - f(\xi)] = \frac{1}{2}\sum_{x,y\in\mathbb{Z}^d_*} p(y-x)\,[f(\sigma^{x,y}\xi) - f(\xi)],\\
(L_\tau f)(\xi) &= \sum_{z\in\mathbb{Z}^d_*} p(z)\,[1-\xi(z)]\,[f(\tau_z\xi) - f(\xi)].
\end{aligned} \tag{2.3}$$
The first part of the generator takes into account the jumps in the environment, while the second one corresponds to jumps of the tagged particle. In the above formula, $\tau_z\xi$ stands for the configuration where the tagged particle, sitting at the origin, is first transferred to the (empty) site z and then the entire environment is translated by −z: for all y in $\mathbb{Z}^d_*$,
$$(\tau_z\xi)(y) = \begin{cases} \xi(z) & \text{if } y = -z,\\ \xi(y+z) & \text{for } y \neq -z.\end{cases}$$
For 0 ≤ α ≤ 1, denote by $\mu_\alpha$ the Bernoulli product measure on $X_* = \{0,1\}^{\mathbb{Z}^d_*}$. A simple computation shows that $\mu_\alpha$ is a reversible and ergodic stationary measure for the Markov process $\xi_t$. In this context Kipnis and Varadhan ([1]) proved a central limit theorem for the position of the tagged particle starting with an initial environment chosen randomly from the equilibrium $\mu_\alpha$. They showed that $\varepsilon X_{t\varepsilon^{-2}}$ converges, as ε ↓ 0, to a Brownian motion with diffusion coefficient D(α) which we will describe in more detail in the next section. This result has been generalized by Varadhan ([6]) to the asymmetric case with zero mean ($\sum_y y\,p(y) = 0$). More recently, for the general asymmetric case in dimension d ≥ 3, if $\sum_y y\,p(y) = m \neq 0$, it is proved in Sethuraman–Varadhan–Yau ([5]) that $\varepsilon[X_{t\varepsilon^{-2}} - mt(1-\alpha)\varepsilon^{-2}]$ converges, as ε ↓ 0, to a Brownian motion with another diffusion coefficient. In this article we limit ourselves to the symmetric case and study the regularity properties of D(α) as a function of α. The main result is

Theorem 2.1. The self-diffusion coefficient D(α), as a function of α, is of class $C^\infty$ in the closed interval [0, 1].

3. Generalized Duality

The proof of Theorem 2.1 relies on the duality properties of the symmetric exclusion process that we will now describe.
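The two configuration maps of Sect. 2, the exchange $\sigma^{x,y}$ of (2.2) and the shift $\tau_z$ that follows a jump of the tagged particle, can be made concrete with a small sketch. It is only an illustration under stated assumptions: a configuration is stored as the finite set of its occupied sites, and the names `exchange`, `tagged_jump` and `negate` are hypothetical helpers, not notation from the paper.

```python
from typing import FrozenSet, Tuple

Site = Tuple[int, ...]
Config = FrozenSet[Site]  # occupied sites; every other site is empty


def negate(z: Site) -> Site:
    """Return the site -z."""
    return tuple(-c for c in z)


def exchange(eta: Config, x: Site, y: Site) -> Config:
    """sigma^{x,y} eta of (2.2): swap the occupation variables at x and y."""
    if (x in eta) == (y in eta):
        return eta  # both occupied or both empty: nothing changes
    # exactly one of the two sites is occupied: the particle changes site
    return eta.symmetric_difference({x, y})


def tagged_jump(xi: Config, z: Site) -> Config:
    """tau_z xi on Z^d minus the origin.

    (tau_z xi)(y) = xi(z) if y = -z, and xi(y + z) otherwise.
    """
    shifted = {tuple(w_i - z_i for w_i, z_i in zip(w, z)) for w in xi if w != z}
    if z in xi:
        shifted.add(negate(z))  # the value xi(z) is recorded at the site -z
    return frozenset(shifted)
```

In the dynamics generated by $L_\tau$ the map `tagged_jump` is only used when the site z is empty, because of the factor $1 - \xi(z)$ in (2.3); the branch `z in xi` is kept so that the function matches the displayed definition for every configuration.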
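We have the Hilbert space $L^2(\mu_\alpha)$ with its natural inner product $\langle\cdot,\cdot\rangle_\alpha$. The operator L is self adjoint and the natural Dirichlet inner products will be denoted by
$$\langle f, g\rangle_{1,\alpha} = \langle -Lf, g\rangle_\alpha, \qquad \langle f, g\rangle_{1,env,\alpha} = \langle -L_0 f, g\rangle_\alpha.$$
The dual norms $\|f\|_{-1,\alpha}$ and $\|f\|_{-1,env,\alpha}$ are defined by
$$\|f\|^2_{-1,\alpha} = \sup_g\big\{2\langle f, g\rangle_\alpha - \langle g, g\rangle_{1,\alpha}\big\}, \qquad \|f\|^2_{-1,env,\alpha} = \sup_g\big\{2\langle f, g\rangle_\alpha - \langle g, g\rangle_{1,env,\alpha}\big\}.$$
For each n ≥ 0, denote by $\mathcal{E}_{*,n}$ the subsets of $\mathbb{Z}^d_*$ with n points and let $\mathcal{E}_* = \cup_{n\ge 0}\mathcal{E}_{*,n}$ be the class of all finite subsets of $\mathbb{Z}^d_*$. Let us consider an abstract Hilbert space $\mathcal{H}$ with a complete orthonormal basis consisting of $\{e_A : A \in \mathcal{E}_*\}$. The space $\mathcal{H}$ can be viewed as the space of square summable maps $\mathfrak{f}$ of $\mathcal{E}_* \to \mathbb{R}$. In a natural way $\mathcal{H} = \oplus_{n\ge 0}\mathcal{G}_n$, where $\mathcal{G}_n$ is spanned by $\{e_A : A \in \mathcal{E}_{*,n}\}$. For each A in $\mathcal{E}_*$, let the local function $\Psi_A$ in $L^2(\mu_\alpha)$ be defined by
$$\Psi_A = \Psi_A(\alpha, \xi) = \prod_{x\in A}\frac{\xi(x) - \alpha}{\sqrt{\chi(\alpha)}},$$
where χ(α) = α(1 − α). By convention, $\Psi_\emptyset = 1$. It is easy to check that $\{\Psi_A, A \in \mathcal{E}_*\}$ is an orthonormal basis of $L^2(\mu_\alpha)$. For each n ≥ 0, denote by $G_n$ the subspace of $L^2(\mu_\alpha)$ generated by $\{\Psi_A, A \in \mathcal{E}_{*,n}\}$, so that $L^2(\mu_\alpha) = \oplus_{n\ge 0} G_n$. Functions of $G_n$ are said to have degree n. The main property of the symmetric simple exclusion process that will be used here is that part of the generator, i.e. $L_0$, preserves the degree of the functions. Consider a local function f. Since $\{\Psi_A : A \in \mathcal{E}_*\}$ is a basis of $L^2(\mu_\alpha)$, we may write
$$f = \sum_{n\ge 0}\sum_{A\in\mathcal{E}_{*,n}} \mathfrak{f}(A)\,\Psi_A = \sum_{n\ge 0}\pi_n f.$$
Here we have denoted by $\pi_n$ the orthogonal projection onto $G_n$. Notice that the coefficients $\mathfrak{f}(A)$ depend not only on f but also on the density α: $\mathfrak{f}(A) = \mathfrak{f}(A, \alpha)$. Since f is a local function, $\mathfrak{f} : \mathcal{E}_* \to \mathbb{R}$ has finite support. In other words we have a unitary isomorphism, $f \sim \sum \mathfrak{f}(A)e_A$, between $L^2(\mu_\alpha)$ and $\mathcal{H}$ that takes local functions in $L^2(\mu_\alpha)$ onto finite linear combinations of the basis elements. Of course this also establishes an isomorphism between $G_n$ and $\mathcal{G}_n$.

We now conclude this section by expressing the operators L and $L_0$ as well as their Dirichlet forms, through this isomorphism, in the basis $\{e_A\}$ of $\mathcal{H}$. To begin with, because the isomorphism is unitary, we have
$$\langle f, g\rangle_\alpha = \langle \mathfrak{f}, \mathfrak{g}\rangle = \sum_{A\in\mathcal{E}_*}\mathfrak{f}(A)\,\mathfrak{g}(A),$$
where $f \sim \sum \mathfrak{f}(A)e_A$ and $g \sim \sum \mathfrak{g}(A)e_A$. The norm in $\mathcal{H}$ will be denoted by $\|\mathfrak{f}\|_0$.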
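The orthonormality of the family $\{\Psi_A\}$ can be checked by exact enumeration on a few sites. The sketch below is only illustrative: it assumes a finite window of `n_sites` sites labelled 0, 1, ..., which stand in for points of $\mathbb{Z}^d_*$, and the function names are hypothetical.

```python
from itertools import combinations, product
from math import isclose, sqrt


def psi(A, xi, alpha):
    """Psi_A(alpha, xi) = prod_{x in A} (xi(x) - alpha) / sqrt(chi(alpha))."""
    chi = alpha * (1 - alpha)
    value = 1.0  # empty product: Psi of the empty set equals 1
    for x in A:
        value *= (xi[x] - alpha) / sqrt(chi)
    return value


def inner_product(A, B, alpha, n_sites):
    """<Psi_A, Psi_B> under the Bernoulli(alpha) product measure on n_sites sites."""
    total = 0.0
    for xi in product((0, 1), repeat=n_sites):
        occupied = sum(xi)
        weight = alpha**occupied * (1 - alpha) ** (n_sites - occupied)
        total += weight * psi(A, xi, alpha) * psi(B, xi, alpha)
    return total


def gram_is_identity(alpha, n_sites=3):
    """True if the Gram matrix of all Psi_A on the window is the identity."""
    subsets = [c for k in range(n_sites + 1)
               for c in combinations(range(n_sites), k)]
    return all(
        isclose(inner_product(A, B, alpha, n_sites),
                1.0 if A == B else 0.0, abs_tol=1e-12)
        for A in subsets for B in subsets
    )
```

The check works because each factor $(\xi(x) - \alpha)/\sqrt{\chi(\alpha)}$ has mean zero and variance one under $\mu_\alpha$, and distinct sites are independent; it requires 0 < α < 1, since χ(α) vanishes at the endpoints.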
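For a subset A of $\mathbb{Z}^d_*$ and x, y, z in $\mathbb{Z}^d_*$, denote by $A_{x,y}$, $S_z A$ the sets defined by
$$A_{x,y} = \begin{cases} (A\setminus\{x\})\cup\{y\} & \text{if } x\in A,\ y\notin A,\\ (A\setminus\{y\})\cup\{x\} & \text{if } y\in A,\ x\notin A,\\ A & \text{otherwise;}\end{cases} \qquad S_z A = \begin{cases} A - z & \text{if } z\notin A,\\ ((A\setminus\{z\}) - z)\cup\{-z\} & \text{if } z\in A.\end{cases} \tag{3.1}$$
In this formula, B + z is the set {x + z; x ∈ B}. Therefore, to obtain $S_z A$ from A in the case where z belongs to A, we first remove z to get a set not containing z, then translate A\{z} by −z and finally add the site −z. Recall the definition of the generators $L_0$, $L_\tau$ given in (2.3). A simple computation shows that
$$(L_0 f) \sim \sum_{A\in\mathcal{E}_*}(L_0\mathfrak{f})(A)\,e_A, \qquad (L_\tau f) \sim \sum_{A\in\mathcal{E}_*}(L_{\tau,\alpha}\mathfrak{f})(A)\,e_A,$$
where
$$(L_0\mathfrak{f})(A) = \frac{1}{2}\sum_{x,y\in\mathbb{Z}^d_*} p(y-x)\,[\mathfrak{f}(A_{x,y}) - \mathfrak{f}(A)] \tag{3.2}$$
and $L_{\tau,\alpha}$ is an operator which can be decomposed as
$$L_{\tau,\alpha} = \alpha L_\tau^1 + (1-\alpha)L_\tau^2 + \sqrt{\chi(\alpha)}\,(L_\tau^+ + L_\tau^-),$$
where
$$\begin{aligned}
(L_\tau^1\mathfrak{f})(A) &= \sum_{y\in A} p(y)\,[\mathfrak{f}(S_y A) - \mathfrak{f}(A)],\\
(L_\tau^2\mathfrak{f})(A) &= \sum_{y\notin A} p(y)\,[\mathfrak{f}(S_y A) - \mathfrak{f}(A)],\\
(L_\tau^+\mathfrak{f})(A) &= \sum_{y\in A} p(y)\,[\mathfrak{f}(A\setminus\{y\}) - \mathfrak{f}(S_y A\setminus\{-y\})],\\
(L_\tau^-\mathfrak{f})(A) &= \sum_{y\notin A} p(y)\,[\mathfrak{f}(A\cup\{y\}) - \mathfrak{f}(S_y A\cup\{-y\})].
\end{aligned} \tag{3.3}$$
Notice that L on $L^2(\mu_\alpha)$ is represented on $\mathcal{H}$ by $L_\alpha = L_0 + L_{\tau,\alpha}$:
$$L_\alpha = L_0 + \alpha L_\tau^1 + (1-\alpha)L_\tau^2 + \sqrt{\chi(\alpha)}\,[L_\tau^+ + L_\tau^-]. \tag{3.4}$$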
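The set operations of (3.1) act on finite subsets of $\mathbb{Z}^d_*$ and are easy to experiment with. The following sketch assumes finite sets are stored as `frozenset`s of integer tuples; `swap_sites` and `shift_set` are hypothetical names for $A_{x,y}$ and $S_z A$.

```python
def negate(z):
    """Return the site -z."""
    return tuple(-c for c in z)


def translate(A, z):
    """Return A + z = {x + z : x in A}."""
    return frozenset(tuple(a + c for a, c in zip(x, z)) for x in A)


def swap_sites(A, x, y):
    """A_{x,y} of (3.1): move a point between x and y when exactly one is in A."""
    if x in A and y not in A:
        return (A - {x}) | {y}
    if y in A and x not in A:
        return (A - {y}) | {x}
    return A


def shift_set(A, z):
    """S_z A of (3.1)."""
    if z not in A:
        return translate(A, negate(z))
    # remove z, translate the remainder by -z, then add the site -z
    return translate(A - {z}, negate(z)) | {negate(z)}
```

Both maps preserve the cardinality of A, which is the combinatorial reason why $L_0$, $L_\tau^1$ and $L_\tau^2$ preserve the degree. One can also check directly from the definition that `shift_set(shift_set(A, z), negate(z))` returns A whenever A does not contain the origin.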
We mentioned earlier that the main property to be exploited here is that the generator of the symmetric exclusion process preserves the degree of local functions. It is easy to check that the operators $L_0$, $L_\tau^1$, $L_\tau^2$ preserve the degree of a function, i.e. they map $\mathcal{G}_n$ into itself. Moreover, $L_\tau^+$ increases the degree of a function by one while $L_\tau^-$ decreases it by one. For a function $\mathfrak{f} : \mathcal{E}_* \to \mathbb{R}$ and n ≥ 0, denote by $\pi_n\mathfrak{f}$ or by $\mathfrak{f}_n$ its restriction to $\mathcal{E}_{*,n}$: $(\pi_n\mathfrak{f})(A) = \mathfrak{f}(A)\,\mathbf{1}\{A \in \mathcal{E}_{*,n}\}$.
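For local functions f, g : $X_* \to \mathbb{R}$, a long but elementary computation shows that, if we define
$$\begin{aligned}
2\langle\mathfrak{f}, \mathfrak{g}\rangle_{1,\alpha} ={}& \sum_{x,y\in\mathbb{Z}^d_*}\sum_{A\in\mathcal{E}_*} p(y-x)\,[\mathfrak{f}(A_{x,y}) - \mathfrak{f}(A)]\,[\mathfrak{g}(A_{x,y}) - \mathfrak{g}(A)]\\
&+ \sum_{y\in\mathbb{Z}^d_*}\sum_{A\in\mathcal{E}_*} p(y)\,r_y(A)\,[\mathfrak{f}(S_y A) - \mathfrak{f}(A)]\,[\mathfrak{g}(S_y A) - \mathfrak{g}(A)]\\
&- \sqrt{\chi(\alpha)}\sum_{y\in\mathbb{Z}^d_*}\sum_{\substack{A\in\mathcal{E}_*\\ y\notin A}} p(y)\,[\mathfrak{f}(S_y A) - \mathfrak{f}(A)]\,[\mathfrak{g}(S_y[A\cup\{y\}]) - \mathfrak{g}(A\cup\{y\})]\\
&- \sqrt{\chi(\alpha)}\sum_{y\in\mathbb{Z}^d_*}\sum_{\substack{A\in\mathcal{E}_*\\ y\notin A}} p(y)\,[\mathfrak{f}(S_y[A\cup\{y\}]) - \mathfrak{f}(A\cup\{y\})]\,[\mathfrak{g}(S_y A) - \mathfrak{g}(A)],
\end{aligned} \tag{3.5}$$
where $r_y(A)$ is equal to α if y belongs to A and is equal to 1 − α if y does not belong to A, then
$$\langle f, g\rangle_{1,\alpha} = \langle\mathfrak{f}, \mathfrak{g}\rangle_{1,\alpha}.$$
Notice that the last three terms can be recombined to give a positive expression when f = g. The corresponding norm will be denoted by $\|\mathfrak{f}\|_{1,\alpha}$, which of course is equal to $\|f\|_{1,\alpha}$. By completing the space of finitely supported functions with this norm we obtain the Dirichlet space $\mathcal{H}_1$. Let $\mathcal{H}_{-1}$ be the dual of $\mathcal{H}_1$ with respect to the standard inner product on $\mathcal{H}$. This is the Hilbert space generated by finitely supported functions and the norm $\|\cdot\|_{-1,\alpha}$ defined by
$$\|\mathfrak{f}\|^2_{-1,\alpha} = \sup_{\mathfrak{g}}\big\{2\langle\mathfrak{f}, \mathfrak{g}\rangle - \langle\mathfrak{g}, \mathfrak{g}\rangle_{1,\alpha}\big\},$$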
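where the supremum is carried over all finitely supported functions $\mathfrak{g}$. It follows from the isomorphism that $\|f\|_{-1,\alpha} = \|\mathfrak{f}\|_{-1,\alpha}$. The Dirichlet form corresponding to $L_0$ is much simpler to calculate in the $\mathcal{H}$ representation. Denote by $\|\cdot\|_{1,env}$ and $\|\cdot\|_{-1,env}$ respectively the Dirichlet norm and its dual associated to the generator $L_0$:
$$\|\mathfrak{g}\|^2_{1,env} = \langle\mathfrak{g}, (-L_0)\mathfrak{g}\rangle = \frac{1}{2}\sum_{x,y\in\mathbb{Z}^d_*}\sum_{A\in\mathcal{E}_*} p(y-x)\,[\mathfrak{g}(A_{x,y}) - \mathfrak{g}(A)]^2 = \sum_{n\ge 0}\|\pi_n\mathfrak{g}\|^2_{1,env}$$
and
$$\|\mathfrak{g}\|^2_{-1,env} = \sup_{\mathfrak{f}}\big\{2\langle\mathfrak{f}, \mathfrak{g}\rangle - \langle\mathfrak{f}, (-L_0)\mathfrak{f}\rangle\big\} = \sum_{n\ge 0}\|\pi_n\mathfrak{g}\|^2_{-1,env}, \tag{3.6}$$
where the supremum is carried over all finitely supported functions. In contrast to the norms $\|\cdot\|_{1,\alpha}$, $\|\cdot\|_{-1,\alpha}$, the norms $\|\cdot\|_{1,env}$, $\|\cdot\|_{-1,env}$ do not depend explicitly on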
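the parameter α. Moreover, since $\langle\mathfrak{f}, (-L_0)\mathfrak{f}\rangle \le \langle\mathfrak{f}, (-L_\alpha)\mathfrak{f}\rangle$, it follows that $\|\mathfrak{g}\|_{1,env} \le \|\mathfrak{g}\|_{1,\alpha}$ and $\|\mathfrak{g}\|_{-1,\alpha} \le \|\mathfrak{g}\|_{-1,env}$. In Lemma 4.4, we estimate $\|\mathfrak{g}\|_{1,\alpha}$ and $\|\mathfrak{g}\|_{-1,env}$ in terms of $\|\mathfrak{g}\|_{1,env}$ and $\|\mathfrak{g}\|_{-1,\alpha}$, respectively. Finally, for any k ≥ 0, let us define
$$|||\mathfrak{f}|||^2_{0,k} = \sum_{n\ge 0} n^{2k}\,\|\pi_n\mathfrak{f}\|^2_0, \qquad |||\mathfrak{f}|||^2_{1,k} = \sum_{n\ge 0} n^{2k}\,\|\pi_n\mathfrak{f}\|^2_{1,env}, \qquad |||\mathfrak{f}|||^2_{-1,k} = \sum_{n\ge 0} n^{2k}\,\|\pi_n\mathfrak{f}\|^2_{-1,env}. \tag{3.7}$$
If T is the operator that acts as scalar multiplication by n on the space $\mathcal{G}_n$ of degree n, these are the quadratic forms $\|T^k\mathfrak{f}\|^2$, $\langle T^k\mathfrak{f}, (-L_0)T^k\mathfrak{f}\rangle$ and $\langle T^k\mathfrak{f}, (-L_0)^{-1}T^k\mathfrak{f}\rangle$ respectively. Note that $L_0$ commutes with T. The completion under these norms will be denoted by $\mathcal{H}_{0,k}$, $\mathcal{H}_{1,k}$ and $\mathcal{H}_{-1,k}$ respectively.

4. Some Estimates

Since $L_\alpha$ is self adjoint, for the solution $\mathfrak{u}_\lambda$ of the resolvent equation
$$\lambda\mathfrak{u}_\lambda - L_\alpha\mathfrak{u}_\lambda = \mathfrak{f}, \tag{4.1}$$
we have the basic estimate
$$\|\mathfrak{u}_\lambda\|_{1,\alpha} \le \|\mathfrak{f}\|_{-1,\alpha}$$
that implies
$$\|\mathfrak{u}_\lambda\|_{1,env} \le \|\mathfrak{f}\|_{-1,env} \qquad\text{or}\qquad |||\mathfrak{u}_\lambda|||_{1,0} \le |||\mathfrak{f}|||_{-1,0}.$$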
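The following regularity result follows from Eq. (5.5) of [4].

Lemma 4.1. Let k ≥ 1 be given. Let $\mathfrak{f}$ be a function such that $|||\mathfrak{f}|||_{-1,k} < \infty$. For λ > 0, let $\mathfrak{u}_\lambda$ be the solution of the resolvent equation (4.1). Then,
$$|||\mathfrak{u}_\lambda|||_{1,k} \le C(k)\,|||\mathfrak{f}|||_{-1,k} \tag{4.2}$$
for a finite constant C(k) independent of α and λ.

In fact the proof of (4.2) given in [4] extends immediately to non-local f. We now state some bounds on the restrictions of $L_\tau^1$, $L_\tau^2$, $L_\tau^+$ and $L_\tau^-$ to $\mathcal{G}_n$. These bounds will grow linearly with n. Notice that $L_\tau^j$, j = 1, 2, are symmetric operators, while $L_\tau^+$ is the adjoint of $L_\tau^-$:
$$\langle L_\tau^j\mathfrak{f}, \mathfrak{g}\rangle = \langle\mathfrak{f}, L_\tau^j\mathfrak{g}\rangle, \qquad \langle L_\tau^+\mathfrak{f}, \mathfrak{g}\rangle = \langle\mathfrak{f}, L_\tau^-\mathfrak{g}\rangle,$$
for j = 1, 2 and $\mathfrak{f}$, $\mathfrak{g}$ in $L^2(\mathcal{E}_*)$. Moreover,
$$\langle(-L_\tau^1)\mathfrak{f}, \mathfrak{f}\rangle = \frac{1}{2}\sum_{A\in\mathcal{E}_*}\sum_{y\in A} p(y)\,[\mathfrak{f}(S_y A) - \mathfrak{f}(A)]^2, \qquad \langle(-L_\tau^2)\mathfrak{f}, \mathfrak{f}\rangle = \frac{1}{2}\sum_{A\in\mathcal{E}_*}\sum_{y\notin A} p(y)\,[\mathfrak{f}(S_y A) - \mathfrak{f}(A)]^2.$$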
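Lemma 4.2. There exists a finite constant $C_0$ depending only on the transition probability p such that
$$\langle(-L_\tau^j)\mathfrak{f}, \mathfrak{f}\rangle \le C_0\,n\,\langle(-L_0)\mathfrak{f}, \mathfrak{f}\rangle \tag{4.3}$$
for j = 1, 2, all n ≥ 1 and all $\mathfrak{f}$ in $\mathcal{G}_n$. Moreover
$$\langle(-L_\tau^\pm)\mathfrak{f}, \mathfrak{g}\rangle^2 \le C_0^2\,n^2\,\langle(-L_0)\mathfrak{f}, \mathfrak{f}\rangle\,\langle(-L_0)\mathfrak{g}, \mathfrak{g}\rangle \tag{4.4}$$
for all n ≥ 1 and all $\mathfrak{f}$ in $\mathcal{G}_n$, $\mathfrak{g}$ in $\mathcal{G}_{n\pm 1}$. On the other hand, for j = 1, 2,
$$\|L_\tau^j\mathfrak{f}\|^2_0 \le 4\|\mathfrak{f}\|^2_0 \qquad\text{and}\qquad \|L_\tau^\pm\mathfrak{f}\|^2_0 \le 4\|\mathfrak{f}\|^2_0 \tag{4.5}$$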
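for all $\mathfrak{f}$ in $\mathcal{H}$.

Proof. The first estimate (4.3) follows immediately from Lemma 5.1 in [4]. We first prove that for all $\mathfrak{f}$, $\mathfrak{g}$ in $L^2(\mathcal{E}_*)$,
$$\langle L_\tau^\pm\mathfrak{f}, \mathfrak{g}\rangle^2 \le \langle(-L_\tau^1)\mathfrak{f}, \mathfrak{f}\rangle\,\langle(-L_\tau^2)\mathfrak{g}, \mathfrak{g}\rangle. \tag{4.6}$$
Fix $\mathfrak{f}$, $\mathfrak{g}$ in $L^2(\mathcal{E}_*)$. By the explicit formula for $L_\tau^+$, we have that
$$\langle(-L_\tau^+)\mathfrak{f}, \mathfrak{g}\rangle = \sum_y p(y)\sum_{A\ni y}\mathfrak{g}(A)\,\big[\mathfrak{f}(S_y A\setminus\{-y\}) - \mathfrak{f}(A\setminus\{y\})\big].$$
Rewrite this expression as twice one half of it. In one of the pieces, we perform the change of variables $B = S_y A$, z = −y to obtain that it is equal to
$$-\frac{1}{2}\sum_y p(y)\sum_{A\ni y}\mathfrak{g}(S_y A)\,\big[\mathfrak{f}(S_y A\setminus\{-y\}) - \mathfrak{f}(A\setminus\{y\})\big].$$
Here we used the fact that p(·) is symmetric. Adding the two expressions we get that $\langle(-L_\tau^+)\mathfrak{f}, \mathfrak{g}\rangle$ is equal to
$$-\frac{1}{2}\sum_y p(y)\sum_{A\ni y}\big[\mathfrak{g}(S_y A) - \mathfrak{g}(A)\big]\,\big[\mathfrak{f}(S_y(A\setminus\{y\})) - \mathfrak{f}(A\setminus\{y\})\big].$$
By Schwarz's inequality, this expression is bounded above by
$$\frac{1}{4\beta}\sum_y p(y)\sum_{A\ni y}\big[\mathfrak{g}(S_y A) - \mathfrak{g}(A)\big]^2 + \frac{\beta}{4}\sum_y p(y)\sum_{A\ni y}\big[\mathfrak{f}(S_y A\setminus\{-y\}) - \mathfrak{f}(A\setminus\{y\})\big]^2$$
for all β > 0. By the identities presented just before the statement of the lemma, the first term is $(1/2\beta)\langle(-L_\tau^1)\mathfrak{g}, \mathfrak{g}\rangle$. A change of variables B = A\{y} shows that the second is bounded by $(\beta/2)\langle(-L_\tau^2)\mathfrak{f}, \mathfrak{f}\rangle$. Minimizing over β, we conclude the proof of (4.6).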
We may now prove the second estimate of the lemma. Fix n ≥ 1, and functions $\mathfrak{f}$ and $\mathfrak{g}$ of degree n and n + 1, respectively. By (4.6), $\langle L_\tau^+\mathfrak{f}, \mathfrak{g}\rangle^2$ is bounded above by $\langle(-L_\tau^1)\mathfrak{f}, \mathfrak{f}\rangle\,\langle(-L_\tau^2)\mathfrak{g}, \mathfrak{g}\rangle$. By the first part of the lemma, this product is bounded by $C_0^2\,n^2\,\langle(-L_0)\mathfrak{f}, \mathfrak{f}\rangle\,\langle(-L_0)\mathfrak{g}, \mathfrak{g}\rangle$. This proves (4.4) for $L_\tau^+$. The proof for $L_\tau^-$ is similar. The last estimate (4.5) is elementary and follows from Schwarz's inequality and the explicit formulas for the operators $L_\tau^1$, $L_\tau^2$, $L_\tau^+$, and $L_\tau^-$.

Lemma 4.3. For every k ≥ 0, there exists a finite constant $C_k$ such that for j = 1, 2, +, −,
$$|||L_\tau^j\mathfrak{f}|||_{-1,k} \le C_k\,|||\mathfrak{f}|||_{1,k+1},$$
so that $L_\tau^j$ maps $\mathcal{H}_{1,k+1}$ boundedly into $\mathcal{H}_{-1,k}$.

Proof. Follows immediately from the preceding lemma.
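Lemma 4.4. There exists a finite constant $C_0$ such that for all n ≥ 1,
$$\|\mathfrak{f}\|_{1,\alpha} \le C_0\,n\,\|\mathfrak{f}\|_{1,env}, \qquad \|\mathfrak{f}\|_{-1,env} \le C_0\,n\,\|\mathfrak{f}\|_{-1,\alpha}$$
for all α in [0, 1], and all $\mathfrak{f}$ in $\mathcal{G}_n$.

Proof. Fix n ≥ 1 and $\mathfrak{f}$ in $\mathcal{G}_n$. By (3.5) and Schwarz's inequality, $\langle\mathfrak{f}, \mathfrak{f}\rangle_{1,\alpha}$ is bounded above by
$$\|\mathfrak{f}\|^2_{1,env} + 2\sum_{A\in\mathcal{E}_*}\sum_{y\in\mathbb{Z}^d_*} p(y)\,[\mathfrak{f}(S_y A) - \mathfrak{f}(A)]^2 + \sum_{A\in\mathcal{E}_*}\sum_{y\notin A} p(y)\,[\mathfrak{f}(S_y[A\cup\{y\}]) - \mathfrak{f}(A\cup\{y\})]^2$$
because $|r_y(A)| \le 1$ and χ(α) ≤ 1. Since $\mathfrak{f}$ belongs to $\mathcal{G}_n$, we may restrict the second sum to sets A in $\mathcal{E}_{*,n}$. A change of variables permits us to estimate the third sum by the second one. In conclusion,
$$\langle\mathfrak{f}, \mathfrak{f}\rangle_{1,\alpha} \le \|\mathfrak{f}\|^2_{1,env} + 3\sum_{A\in\mathcal{E}_{*,n}}\sum_{y\in\mathbb{Z}^d_*} p(y)\,[\mathfrak{f}(S_y A) - \mathfrak{f}(A)]^2.$$
By Lemma 4.2, the second term on the right-hand side is less than or equal to $C_0\,n\,\|\mathfrak{f}\|^2_{1,env}$ because $\mathfrak{f}$ belongs to $\mathcal{G}_n$. The second estimate of the lemma is obtained by duality.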
5. The Self-Diffusion Coefficient

By [1], the self-diffusion coefficient D(α) in the direction v is given by the variational formula:
$$v\cdot D(\alpha)v = \inf_f\Big\{\sum_{z\in\mathbb{Z}^d_*} p(z)\,E_{\mu_\alpha}\big[[1-\xi(z)]\{v\cdot z - [f(\tau_z\xi) - f(\xi)]\}^2\big] + \sum_{x,y\in\mathbb{Z}^d_*} p(x-y)\,E_{\mu_\alpha}\big[\xi(x)[1-\xi(y)]\{f(\sigma^{x,y}\xi) - f(\xi)\}^2\big]\Big\},$$
where the infimum is carried over all cylinder functions f. A simple computation shows that
$$v\cdot D(\alpha)v = (1-\alpha)\sum_{z\in\mathbb{Z}^d_*}(z\cdot v)^2\,p(z) - \alpha(1-\alpha)\,\|f_v\|^2_{-1,\alpha} \tag{5.1}$$
for each v in $\mathbb{R}^d$. Here $f_v$ is the cylinder function given by
$$f_v(\xi) = \frac{1}{\sqrt{\alpha(1-\alpha)}}\sum_{y\in\mathbb{Z}^d_*} p(y)\,(y\cdot v)\,[1-\xi(y)] = \frac{1}{\sqrt{\alpha(1-\alpha)}}\sum_{y\in\mathbb{Z}^d_*} p(y)\,(y\cdot v)\,[\alpha-\xi(y)]$$
because p has mean zero. With the notation introduced in the previous section, we may write $f_v$ as
$$f_v(\xi) = -\sum_{y\in\mathbb{Z}^d_*}(y\cdot v)\,p(y)\,\Psi_y,$$
where $\Psi_z = \Psi_{\{z\}}$ for z in $\mathbb{Z}^d_*$. We are now in a position to state the main result of this section. Theorem 2.1 follows from this result in view of formula (5.1).

Theorem 5.1. As a function of α, $\|f_v\|^2_{-1,\alpha}$ is of class $C^\infty$ on [0, 1].

The proof is based on the lemmas at the end of the previous section. To explain the strategy of the proof we introduce the resolvent equation associated to $f_v$: for λ > 0, denote by $u_\lambda$ the solution of the resolvent equation
$$\lambda u_\lambda - Lu_\lambda = f_v.$$
We will use the dual representation and carry out the estimates in $\mathcal{H}$. Let $u_\lambda \sim \mathfrak{u}_\lambda$ through the unitary isomorphism. Of course $\mathfrak{u}_\lambda = \mathfrak{u}_\lambda(\alpha)$ depends on α, while
$$f_v \sim \mathfrak{f}_v = -\sum_{z\in\mathbb{Z}^d_*}(z\cdot v)\,p(z)\,e_{\{z\}}$$
is independent of α and is actually in $\mathcal{H}_{-1}$. We have
$$\lambda\mathfrak{u}_\lambda(\alpha) - L_\alpha\mathfrak{u}_\lambda = \mathfrak{f}_v. \tag{5.2}$$
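It follows from [1] that
$$\|f_v\|^2_{-1,\alpha} = \lim_{\lambda\to 0}\langle f_v, u_\lambda\rangle_\alpha = -\lim_{\lambda\to 0}\sum_{z\in\mathbb{Z}^d_*}(z\cdot v)\,p(z)\,\mathfrak{u}_\lambda(\{z\}, \alpha) = \lim_{\lambda\to 0}\frac{1}{2}\sum_{z\in\mathbb{Z}^d_*}(z\cdot v)\,p(z)\,\big[\mathfrak{u}_\lambda(\{-z\}, \alpha) - \mathfrak{u}_\lambda(\{z\}, \alpha)\big] \tag{5.3}$$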
because p(·) is symmetric. In view of this identity, to prove Theorem 5.1 we just need to show that there exists a subsequence λk ↓ 0 such that, for each z with p(z) > 0, {uλk (α, {z}) − uλk (α, {−z}), k ≥ 1} converges uniformly in α to a smooth function. To prove the existence of such a subsequence, it is enough to show that the functions {uλ (α, {z})} are smooth for each λ > 0 and, for each z and j ≥ 0, to obtain the uniform bounds sup
(j )
(j )
sup |uλ (α, {−z}) − uλ (α, {z})| < ∞.
(5.4)
0 0 for 0 ≤ i < n. Rewriting the difference uλ (α, {−z}) − uλ (α, {z}) (j ) (j ) as 0≤i 0}, j ≥ 0. From this result and the relation between uλ and vλ , t and α, we deduce boundedness in ||| · |||1,0 norm of (j ) {uλ (A, α), λ > 0} in the interior of the domain. An extra argument, presented at the end of the proof, extends the smoothness up to the boundary. We start by observing that the function f has finite H−1,k norm for all k ≥ 0, i.e. there exists a finite constant C0 such that |||fv |||−1,k ≤ C0
(5.6)
for all k ≥ 0. The proof of this claim is elementary. Since fv has degree 1, |||fv |||−1,k = |||fv |||−1,0 = fv −1,env is finite as soon as f−1,env is finite. To prove that fv −1,env is finite, recall the variational formula (3.6) for the · −1,env norm and fix a finite supported function g. Since L0 does not change the degree of a function and since fv has degree one, we may assume that g has degree one. Since p is symmetric, f, g =
1 p(z)(z · v)[g({−z}) − g({z})]. 2 z
By Schwarz’s inequality, the square of this expression is bounded by 1 p(z)|(z · v)|2 p(z)[g({−z}) − g({z})]2 . 4 z z Now we proceed as for the bound (5.5): there exists a path z0 = −z, z1 , . . . , zn = z, avoiding 0, such thatp(zi+1 − zi ) > 0 for 0 ≤ i < n. Rewriting the difference g({−z}) − g({z}) as 0≤i, which proves the claim (5.6) in view of the variational formula (3.6) for the · −1,env norm. We now start our way through the proof that vλ is a sequence of smooth functions with bounded derivatives. Lemma 4.1 applied to f shows that sup
sup |||vλ (t)|||1,k
0 0} is a family of smooth functions whose derivatives satisfy for each k ≥ 0, sup
sup |||uλ (t)|||1,k < ∞.
(5.8)
0. (j ) (j ) Since f does not depend on α, for j ≥ 0, Uλ (0) = < f, uλ (0) >. Since f is a function of degree 1, to prove that the odd derivatives of Uλ (t) vanish at 0, it is enough to prove (2j +1) that uλ (0) is a function of even degree. We prove this statement by induction on j .
Observe that $L(0) = L_0 + L_\tau^2$, which are operators that preserve the degree of a function. On the other hand, since $\sin^2 t$, $\cos^2 t$ are even functions and since $\sin t\cos t$ is an odd function, there exist constants $a_j$, $b_j$, $c_j$ such that
$$L^{(2j)}(0) = a_j L_\tau^1 + b_j L_\tau^2, \qquad L^{(2j+1)}(0) = c_j\,[L_\tau^+ + L_\tau^-]$$
for j ≥ 0. In particular, while $L^{(2j)}(0)$ preserves the degree of a function, $L^{(2j+1)}(0)$ changes it by one.

To prove that $\mathfrak{u}_\lambda^{(2j+1)}(0)$ (resp. $\mathfrak{u}_\lambda^{(2j)}(0)$), j ≥ 0, are functions of even (resp. odd) degree, notice first that $\mathfrak{u}_\lambda(0)$ is the solution of
$$[\lambda - (L_0 + L_\tau^2)]\,\mathfrak{u}_\lambda(0) = \mathfrak{f}.$$
Since $\mathfrak{f}$ is a function of degree 1, $\mathfrak{u}_\lambda(0)$ is also of degree 1. This proves the claim for j = 0. It is easy to conclude the proof by induction using formula (5.9) and the fact that $L^{(2j)}(0)$ preserves the degree, while $L^{(2j+1)}(0)$ changes it by one.

Since $\mathfrak{u}_\lambda^{(2j+1)}(0)$ are functions of even degree, $U_\lambda^{(2j+1)}(0) = \langle\mathfrak{f}, \mathfrak{u}_\lambda^{(2j+1)}(0)\rangle$ vanishes because $\mathfrak{f}$ has degree one. Since we proved uniform convergence of a subsequence $\mathfrak{u}_{\lambda_k}(t)$ and its derivatives, the limit $U(t) = \|\mathfrak{f}\|^2_{-1,\alpha(t)}$ of $U_\lambda(\cdot)$ inherits these properties. In particular, $U^{(2j+1)}(0) = 0$. Elementary analytic considerations show that U(t) is in fact a smooth function of $t^2$ and hence of $\sin^2 t = \alpha$.

Remark 5.3. The proof of the smoothness at the boundary provides a recursive method to compute the Taylor expansion at the origin of the diffusion coefficient. Recall that $U(t) = \|\mathfrak{f}\|^2_{-1,\alpha(t)}$. By Theorem 5.1, $U(0) = \lim_{\lambda\to 0}\langle\mathfrak{f}, \mathfrak{u}_\lambda(0)\rangle = \langle\mathfrak{f}, \mathfrak{u}(0)\rangle$, where $\mathfrak{u}(0)$ is the solution of
$$-[L_0 + L_\tau^2]\,\mathfrak{u}(0) = \mathfrak{f}. \tag{5.10}$$
Since $\mathfrak{f}$ has degree one and since $L_0$, $L_\tau^2$ preserve the degree, this equation can be solved in $\mathcal{H}_1$. In this space both $L_0$, $L_\tau^2$ are essentially Laplace operators and this equation may be solved. Knowing $\mathfrak{u}(0)$, we may examine the equation
$$-[L_0 + L_\tau^2]\,\mathfrak{u}^{(1)}(0) = L^{(1)}(0)\,\mathfrak{u}(0).$$
As noticed earlier, the right hand side is a function of degree 0 and 2 so that $\mathfrak{u}^{(1)}(0)$ has this property. By induction we may obtain $\mathfrak{u}^{(j)}(0)$ for all j ≥ 1 by inverting an operator which is essentially a Laplacian. This permits us to compute the Taylor expansion of U around the origin because $U^{(j)}(0) = \langle\mathfrak{f}, \mathfrak{u}^{(j)}(0)\rangle$. In particular, from (5.1),
$$v\cdot D(\alpha)v = (1-\alpha)\sum_{z\in\mathbb{Z}^d_*}(z\cdot v)^2\,p(z) - \alpha(1-\alpha)\,\langle\mathfrak{u}(0), \mathfrak{f}_v\rangle + O(\alpha^2)$$
includes the first order correction, where $\mathfrak{u}(0)$ is the solution of (5.10).
References

1. Kipnis, C., Varadhan, S.R.S.: Central limit theorem for additive functionals of reversible Markov processes and applications to simple exclusion. Commun. Math. Phys. 106, 1–19 (1986)
2. Landim, C., Olla, S., Yau, H.T.: Some properties of the diffusion coefficient for asymmetric simple exclusion processes. Ann. Probab. 24, 1779–1807 (1996)
3. Landim, C., Yau, H.T.: Fluctuation–dissipation equation of asymmetric simple exclusion processes. Probab. Th. Rel. Fields 108, 321–356 (1997)
4. Landim, C., Olla, S., Varadhan, S.R.S.: Finite-dimensional approximation of the self-diffusion coefficient for the exclusion process. Preprint
5. Sethuraman, S., Varadhan, S.R.S., Yau, H.T.: Diffusive limit of a tagged particle in asymmetric exclusion process. Comm. Pure Appl. Math. 53, 972–1006 (2000)
6. Varadhan, S.R.S.: Regularity of the self-diffusion coefficient. In: The Dynkin Festschrift, Progr. Probab. 34, Boston, MA: Birkhäuser Boston 1994, pp. 387–397

Communicated by H.-T. Yau
Commun. Math. Phys. 224, 323 – 340 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
A Simple Proof of Stability of Fronts for the Cahn–Hilliard Equation E. A. Carlen1, , M. C. Carvalho1, , E. Orlandi 2, 1 School of Mathematics, Georgia Institute of Technology, Atlanta, GA 30332, USA.
E-mail:
[email protected] 2 Dipartimento di Matematica, Universitá degli Studi di Roma Tre, P. S. Murialdo 1, 00146 Roma, Italy.
E-mail:
[email protected] Received: 14 November 2000 / Accepted: 30 July 2001
Dedicated to Joel Lebowitz on the occasion of his 70th birthday

Abstract: We apply a method developed in our earlier work on a non-local phase kinetics equation to give a simple proof of the non-linear stability of fronts for the Cahn–Hilliard equation.

1. Introduction

In this paper we consider the one dimensional Cahn–Hilliard equation, which is a particularly interesting example of a class of equations for the transport of a conserved order parameter m(x) on $\mathbb{R}$. Such equations generally have the form
$$\frac{\partial}{\partial t}m = \frac{\partial}{\partial x}J, \tag{1.1}$$
where the current J is given in terms of the variation of a free energy functional $\mathcal{F}$ through
$$J = \frac{\partial}{\partial x}\frac{\delta\mathcal{F}}{\delta m}. \tag{1.2}$$
In this particular case, the free energy $\mathcal{F}$ is
$$\mathcal{F}(m) = \int_{\mathbb{R}}\left[\frac{1}{2}\left|\frac{\partial m}{\partial x}\right|^2 + \frac{1}{8}(1-m^2)^2\right]dx. \tag{1.3}$$

Work partially supported by U.S. National Science Foundation grant DMS 00–70589.
Work partially supported by E.U. grant ERB FMRX CT 97-0157 and FCT PRAXIS XXI.
On leave from Departamento de Matemática da Faculdade de Ciencias de Lisboa and GFM, 1700 Lisboa codex, Portugal. E-mail:
[email protected] Work partially supported by the CNR-GNFM, MURST COFIM 99–00.
E. A. Carlen, M. C. Carvalho, E. Orlandi
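The variation in (1.2) is to be computed with respect to the $L^2$ norm on $\mathbb{R}$, and hence
$$\frac{\delta\mathcal{F}}{\delta m} = -\frac{\partial^2}{\partial x^2}m - \frac{1}{2}m(1-m^2) \tag{1.4}$$
and the equation is
$$\frac{\partial}{\partial t}m = \frac{\partial^2}{\partial x^2}\left[-\frac{\partial^2}{\partial x^2}m - \frac{1}{2}m(1-m^2)\right]. \tag{1.5}$$
Clearly the free energy is a decreasing function under this evolution:
$$\frac{d}{dt}\mathcal{F}(m) = -\int_{\mathbb{R}}\left|\frac{\partial}{\partial x}\frac{\delta\mathcal{F}}{\delta m}(m)\right|^2 dx, \tag{1.6}$$
and thus our evolution has a Lyapunov functional. We will denote $-d\mathcal{F}(m)/dt$ by $\mathcal{I}(m(t))$. Moreover, the evolution has a conservation law: For all t > 0,
$$\int_{\mathbb{R}}\big(m(x,t) - m(x,0)\big)\,dx = 0. \tag{1.7}$$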
Replacing derivatives by gradients and divergences in the obvious places, one obtains a two or three dimensional version. In such cases, m(x) represents the order parameter in the model of a binary alloy with a phase transition. The two global equilibrium states correspond to the two minima of the potential $W(m) = (1-m^2)^2/8$. Clearly these are m = 1 and m = −1. At the boundary between two regions of different phases, there will be a transition from m = 1 to m = −1. Since the evolution decreases the free energy, we expect that after a short initial time period, these transitions should occur in a way that minimizes the cost in excess free energy. Therefore, in the one dimension across the boundary between two regions of different phase, we expect a "transition profile" that is very close to some translate of $\bar m_0$, where
$$\mathcal{F}(\bar m_0) = \inf\Big\{\mathcal{F}(m)\ \Big|\ \operatorname{sgn}(x)\,m(x) \ge 0,\ \lim_{x\to\pm\infty}\operatorname{sgn}(x)\,m(x) > 0\Big\}. \tag{1.8}$$
The minimizer is well known, and easily seen, to be $\bar m_0(x) = \tanh(x/2)$. The physical interest in the one dimensional problem is that stability of these minimal free energy transition profiles, which we simply call "fronts" in the rest of the paper, is important for understanding how the boundaries between regions of different phases evolve in higher dimension. Without further mention of the higher dimensional case, we now turn to this stability problem.

The subscript 0 on the minimizer in (1.8) is present because the constraint imposed in (1.8) breaks the translational invariance of the free energy. For any a in $\mathbb{R}$, define
$$\bar m_a(x) = \bar m_0(x-a). \tag{1.9}$$
These functions $\bar m_a$ are the fronts whose stability is to be investigated here. Clearly $\mathcal{F}(\bar m_a) = \mathcal{F}(\bar m_0)$, so that $\bar m_0$ belongs to a one parameter family of minimizers of the free energy. Another family is obtained by reflecting this one because the free energy is also reflection invariant. However, these two families of minimizers are separated in all of the relevant metrics, and it suffices to consider just one.
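That $\bar m_0(x) = \tanh(x/2)$ is a stationary point of the free energy can be checked numerically from (1.3) and (1.4). The sketch below is a rough finite-difference check, not part of the argument; the step sizes, the integration window and the function names are assumptions chosen for illustration. The value 2/3 used for comparison follows from the identity $\frac{1}{2}(\bar m_0')^2 = W(\bar m_0)$, which gives $\mathcal{F}(\bar m_0) = \int_{-1}^{1}\frac{1}{2}(1-m^2)\,dm = 2/3$.

```python
from math import tanh


def front(x):
    """The front of (1.8): tanh(x / 2)."""
    return tanh(x / 2)


def free_energy_variation(m, x, h=1e-3):
    """delta F / delta m of (1.4) at x, with m'' from a central difference."""
    second = (m(x + h) - 2 * m(x) + m(x - h)) / h**2
    return -second - 0.5 * m(x) * (1 - m(x) ** 2)


def free_energy(m, lo=-40.0, hi=40.0, n=80000):
    """Midpoint-rule approximation of the free energy (1.3) on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        dm = (m(x + h / 2) - m(x - h / 2)) / h  # central difference for m'
        total += (0.5 * dm**2 + (1 - m(x) ** 2) ** 2 / 8) * h
    return total
```

Evaluating `free_energy_variation(front, x)` at a few points returns values at the level of the discretization error, and `free_energy(front)` is close to 2/3. Translating the profile leaves both checks unchanged, in line with $\mathcal{F}(\bar m_a) = \mathcal{F}(\bar m_0)$.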
Simple Proof of Stability of Fronts for Cahn–Hilliard Equation
It is easy to guess the result of solving (1.5) for initial data $m_0$ that is a small perturbation of the front $\bar m_0$. The excess free energy should decrease in a way that forces the solution m(t) to tend to the family of fronts, and the conservation law should select $\bar m_a$ as the front it should be converging to, so the result should be that, in any reasonable sense, $\lim_{t\to\infty}(m(x,t) - \bar m_a(x)) = 0$ with a given in terms of the initial data $m_0$ through (1.7) in the form
$$\int\big(m(x,0) - \bar m_a(x)\big)\,dx = 0. \tag{1.10}$$
Our main result is a proof that this is the case. The result has recently been obtained in this case by Bricmont, Kupiainen and Taskinen [2] using renormalization group methods. Their result gives a tighter estimate on the decay rate, but in a weaker norm that does not control the excess free energy. We recently proved such a result for a related equation, the LOP equation, which first appeared in [10] and was later rigorously derived from an underlying microscopic model in [7]. The method that we used was developed to deal with the non-local nature of the LOP equation, and the fact that one has no explicit formula for $\bar m$ in that case, which precluded the explicit spectral analysis required in the renormalization group method. However, as we show here, the method developed for the LOP equation also applies to the local Cahn–Hilliard equation, and yields a fairly simple proof of the non-linear stability. Moreover, this method works directly in physical norms, and it provides an estimate on the rate of decrease of the excess free energy. The result is:

Theorem 1.1. Consider initial data $m_0(x)$ for the one dimensional Cahn–Hilliard equation (1.5) such that
$$\int x^2\,(m_0(x) - \bar m_0(x))^2\,dx \le c_0,$$
where $c_0$ is any positive constant. Then for any ε > 0 there is a strictly positive constant $\delta = \delta(\varepsilon, c_0)$ depending only on ε and $c_0$ such that for all initial data with
$$\int (m_0(x) - \bar m_0(x))^2\,dx \le \delta,$$
the excess free energy $\mathcal{F}(m(t)) - \mathcal{F}(\bar m_0)$ of the corresponding solution m(t) of (1.5) satisfies
$$\mathcal{F}(m(t)) - \mathcal{F}(\bar m) \le c_2\,(1 + c_1 t)^{-(9/13-\varepsilon)}$$
and
$$\|m(t) - \bar m_a\|_1 \le c_2\,(1 + c_1 t)^{-(5/52-\varepsilon)},$$
where $c_1$ and $c_2$ are finite constants depending only on ε and $c_0$, and a is given by (1.10).

Since the problem has both a Lyapunov functional and a conservation law, it may appear that it should be a simple matter to prove this result. One reason that it is not so simple is that the decrease of the excess free energy provides only $L^2$ control, and by itself, only partial control at that. To use the conservation law, one needs $L^1$ control. Our equation is not dissipative in $L^1$, a circumstance which is closely related to the lack of a maximum principle. Decrease of free energy can be used to show that the
solution m(x, t) approaches some moving front $\bar m_{a(t)}(x)$ in some norm other than $L^2$. For example, Asselah did this in [1] for the LOP equation studied in [4] and [5], with the approach controlled in the $L^\infty$ norm. But since the free energy is translation invariant, it cannot provide any control over a(t). Moreover, without control on a(t) that prevents it from "running away", it is not at all clear how one can even get $L^2$ control on the difference between m(x, t) and $\bar m_{a(t)}(x)$, or get a rate estimate. The difficulties in this sort of problem are discussed in more detail in [4]. Here we move directly on to the solution.

Despite what has been said above, understanding the free energy functional $\mathcal{F}$ is still central to understanding the stability. To begin, we introduce the operator A associated with its second variation at a front $\bar m$. First, throughout this paper, we make the following convention: whenever some solution m(x, t) of (1.5) is under discussion, then v(x, t) is defined by
$$v(x,t) = m(x,t) - \bar m_{a(t)}(x), \tag{1.11}$$
where a(t) is defined to be that value of c such that
$$\|m(t) - \bar m_{a(t)}\|_2 = \inf_{c\in\mathbb{R}}\{\|m(t) - \bar m_c\|_2\}. \tag{1.12}$$
It is shown in [4] that a(t) is a well-defined function as long as $\|m(t) - \bar m_{a(t)}\|_2$ stays sufficiently small since then the minimum is uniquely attained. Finally, it will be convenient to have the convention that $\bar m(x)$ denotes $\bar m_{a(t)}(x)$. In the same vein, we shall generally simply write A in place of $A_{a(t)}$ for the second variation of $\mathcal{F}$ at $\bar m_{a(t)}$, and leave the a(t) implicit. However, in the definition, we shall be explicit:
$$\langle v, A_a v\rangle_{L^2} = \frac{d^2}{ds^2}\mathcal{F}(\bar m_a + sv)\Big|_{s=0}. \tag{1.13}$$
One easily computes that
$$Av(x) = -v''(x) + V(x)\,v(x) + v(x), \tag{1.14}$$
where
$$V(x) = \frac{3}{2}\big(\bar m^2 - 1\big) = \frac{3}{2}\Big(\tanh^2\frac{x}{2} - 1\Big). \tag{1.15}$$
The operator A has a spectral gap:

Lemma 1.2. In the spectrum of A, 0 is an isolated eigenvalue of multiplicity one. In fact,
$$\int v(x)\,Av(x)\,dx \ge \frac{3}{4}\|v\|_2^2$$
for all v with $\int v(x)\,\bar m'(x)\,dx = 0$.
Proof. We consider the operator H given by H v(x) = −v (x) + V (x)v(x). We know that m ¯ is an eigenvector, and that the corresponding eigenvalue is −1. Let −1 = e0 , e1 , e2 , . . . be the negative eigenvalues of H , repeated according to their multiplicity. Then by a bound of Lieb and Thirring [9], one has 3 |ej |3/2 ≤ |V (x)|2 dx. 16 R j
The integral is easily evaluated and equals 6. Keeping only the first two terms in the sum on the left,
\[ 1 + |e_1|^{3/2} \le 18/16, \]
and this implies that $|e_1| \le 1/4$. Thus $e_1 \ge -1/4$, and this completes the proof.

As indicated in Theorem 1.1, we shall start out with $\|v\|_2$ small, and then, because of the smoothing properties of the equation [3, 5], it will be the case that at least a short time later, $\|v\|_2$ is still small, and then $\|v'\|_2$ is small as well. We shall obtain a number of a-priori estimates that hold when $\|v\|_2$ and $\|v'\|_2$ are both small, and shall use them in the final section of the paper to prove that this condition persists indefinitely. The first estimate that we obtain under these conditions shows that the excess free energy of $\bar m + v$ is comparable to $\langle v, Av\rangle$.

Lemma 1.3. For all $\epsilon > 0$, there are $\delta, \kappa > 0$ so that whenever $\|v\|_2 \le \delta$ and $\|v'\|_2 \le \kappa$, then
\[ \frac{1-\epsilon}{2}\langle v, Av\rangle \le F(\bar m + v) - F(\bar m) \le \frac{1+\epsilon}{2}\langle v, Av\rangle. \]

Proof. One easily computes that
\[ F(\bar m + v) - F(\bar m) = \frac{1}{2}\langle v, Av\rangle + \frac{1}{4}\int \left(2\bar m v^3 + \frac{v^4}{2}\right) dx. \]
Using the inequality $\|v\|_\infty^2 \le 2\|v\|_2\|v'\|_2$, one obtains
\[ \left|\int \left(2\bar m v^3 + \frac{v^4}{2}\right) dx\right| \le \left(2\sqrt{2\kappa\delta} + \kappa\delta\right)\|v\|_2^2. \]
By the previous lemma, for $\kappa$ and $\delta$ small enough, $\left(2\sqrt{2\kappa\delta} + \kappa\delta\right)\|v\|_2^2 \le (\epsilon/2)\langle v, Av\rangle$, and this completes the proof.

The first key result is a lower bound on the dissipation in terms of A:

Lemma 1.4. For any $\epsilon > 0$,
\[ I(m(t)) = -\frac{d}{dt}\left[F(m(t)) - F(\bar m)\right] \ge (1-\epsilon)\int \left|(Av)'(x)\right|^2 dx \tag{1.16} \]
whenever $\|v'\|_2 \le \kappa_1(\epsilon)$ and $\|v\|_2 \le \delta_1(\epsilon)$ for some strictly positive constants $\kappa_1(\epsilon)$ and $\delta_1(\epsilon)$. Moreover, there exists a constant $\gamma > 0$ so that
\[ \int \left|(Av)'\right|^2 dx \ge \gamma\|v'\|_2^2 \tag{1.17} \]
whenever
\[ \int v(x)\,\bar m'(x)\,dx = 0. \]
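The arithmetic behind the spectral gap of Lemma 1.2 can be checked numerically. The following is a minimal sketch, not part of the original argument: it evaluates the integral of $|V|^2$ for the potential (1.15) with a composite Simpson rule and then solves the two-term Lieb–Thirring inequality for $|e_1|$. The helper names (`V`, `simpson`, `int_V2`, `e1_bound`, `gap`) and the truncation of the real line to $[-40, 40]$ are choices made for this illustration only.

```python
import math

def V(x):
    """Potential from (1.15): V(x) = (3/2)(tanh^2(x/2) - 1)."""
    return 1.5 * (math.tanh(0.5 * x) ** 2 - 1.0)

def simpson(g, a, b, n):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

# Integral of |V|^2 over the line, truncated to [-40, 40] (assumption: the
# exponentially small tails outside this window are negligible).
int_V2 = simpson(lambda x: V(x) ** 2, -40.0, 40.0, 20000)

# Lieb-Thirring with e_0 = -1 kept explicitly:
#   1 + |e_1|^{3/2} <= (3/16) * int |V|^2.
e1_bound = ((3.0 / 16.0) * int_V2 - 1.0) ** (2.0 / 3.0)

# A = H + 1, so on the orthogonal complement of the zero mode the spectrum
# of A lies above 1 - |e_1|.
gap = 1.0 - e1_bound

print(int_V2, e1_bound, gap)
```

Under these assumptions the three printed numbers should be close to 6, 1/4 and 3/4, which are the values used in the proof.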
328
E. A. Carlen, M. C. Carvalho, E. Orlandi
This theorem is proved in Sect. 2. We use (1.16) only when I(m(t)) 0 so that one has
\[ \frac{d}{dt}\phi(t) \le 4(1+\epsilon)\left[F(\bar m + v) - F(\bar m)\right] \tag{1.22} \]
whenever (1.18) holds, and $\|v\|_2 \le \delta_1(\epsilon)$, $\|v'\|_2 \le \kappa_1(\epsilon)$, and $|a(t)| \le 1$ for some strictly positive constants $\kappa_1(\epsilon)$ and $\delta_1(\epsilon)$. Regardless of whether (1.18) holds or not, there is a constant $K < \infty$ such that
\[ \frac{d}{dt}\phi(t) \le K\left[F(\bar m + v) - F(\bar m)\right] \tag{1.23} \]
for as long as $\|v'\|_2 \le \kappa_1(\epsilon)$, $\|v\|_2 \le \delta_1(\epsilon)$ and $|a(t)| \le 1$.

Theorem 1.5 is proved in Sect. 3. Theorems 1.4 and 1.5 are the main ingredients of our argument specific to the Cahn–Hilliard equation. The other two ingredients are a constrained form of the uncertainty principle inequality and a decay estimate for a system of differential inequalities introduced in [5]. We will now explain what these are, and how they work together to provide the proof of Theorem 1.1.

The constrained form of the uncertainty principle inequality [5] is the following: Under either of the constraints $\int \psi(x)\,dx = 0$ or $\psi(0) = 0$, one has
\[ \left(\int x^2|\psi(x)|^2\,dx\right)\left(\int |\psi'(x)|^2\,dx\right) \ge \frac{9}{4}\left(\int |\psi(x)|^2\,dx\right)^2. \tag{1.24} \]
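To see the factor of 9 in (1.24) concretely, one can compare two test functions numerically. This is an illustrative sketch only: the Gaussian $e^{-x^2/2}$ satisfies neither constraint and gives the usual constant 1/4, while the odd function $x e^{-x^2/2}$ satisfies both constraints. The function names, the window $[-12, 12]$ and the choice of test functions are assumptions of this example, not taken from [5].

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def uncertainty_ratio(psi, dpsi, half_width=12.0, n=24000):
    """(int x^2 psi^2)(int psi'^2) / (int psi^2)^2 on a truncated line."""
    a, b = -half_width, half_width
    second_moment = simpson(lambda x: (x * psi(x)) ** 2, a, b, n)
    kinetic = simpson(lambda x: dpsi(x) ** 2, a, b, n)
    norm_sq = simpson(lambda x: psi(x) ** 2, a, b, n)
    return second_moment * kinetic / norm_sq ** 2

def gauss(x):
    return math.exp(-0.5 * x * x)

def dgauss(x):
    return -x * math.exp(-0.5 * x * x)

def odd(x):
    # psi(0) = 0 and the integral of psi vanishes by oddness
    return x * math.exp(-0.5 * x * x)

def dodd(x):
    return (1.0 - x * x) * math.exp(-0.5 * x * x)

ratio_unconstrained = uncertainty_ratio(gauss, dgauss)
ratio_constrained = uncertainty_ratio(odd, dodd)
print(ratio_unconstrained, ratio_constrained)
```

For these two functions the ratios come out close to 1/4 and 9/4 respectively, so the odd test function sits at the constant appearing in (1.24).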
The difference between (1.24) and the usual uncertainty principle is a factor of 9 in the constant, and, as we showed in [5], this is crucial for $L^1$ control. We wish to apply this to $\psi = Av$. It is clear that $Av$ will have a zero somewhere, but a technical argument is needed to control the location. To explain how all of the pieces of the argument fit together, assume for the moment that the initial data is antisymmetric. Then the solution will be antisymmetric for all time and so
\[ Av(0, t) = 0 \tag{1.25} \]
for all t. The technical argument needed to remove the antisymmetry assumption will be given in Sect. 2. However, assuming (1.25), we have from (1.16) and (1.24) that
\[ \frac{d}{dt}\left[F(m(t)) - F(\bar m)\right] \le -(1-\epsilon)\frac{9}{4}\frac{\|Av\|_2^4}{\|xAv\|_2^2}. \tag{1.26} \]
The problem with this inequality is that the right hand side does not directly involve the excess free energy $F(m(t)) - F(\bar m)$. If it did, we could hope to get a Gronwall inequality for the decay of the excess free energy. The problem is thus one of closure: we have to relate the quantity on the right-hand side to the excess free energy.

Now we are ready to put the pieces together. When (1.20) is valid, interpreting the approximation sign appropriately in terms of $\epsilon$, we can rewrite (1.26) as
\[ \frac{d}{dt}\left[F(m(t)) - F(\bar m)\right] \le -9(1-\epsilon)\frac{\left[F(m(t)) - F(\bar m)\right]^2}{\|xAv\|_2^2}. \tag{1.27} \]
Now define
\[ f(t) = F(\bar m + v(t)) - F(\bar m) \tag{1.28} \]
and define $\phi(t)$ as in Theorem 1.5. Then (1.27) becomes
\[ \frac{d}{dt}\left[F(m(t)) - F(\bar m)\right] \le -9(1-\epsilon)\frac{\left[F(m(t)) - F(\bar m)\right]^2}{\phi(t)}, \]
and from Theorem 1.5 we have that
\[ \frac{d}{dt}\phi(t) \le (1+\epsilon)4\left[F(\bar m + v) - F(\bar m)\right]. \]
Notice the condition that $|a(t)| \le 1$ in Theorem 1.5, to which we shall return. Thus, when (1.18) holds, we have
\[ \frac{d}{dt}f(t) \le -\tilde A\frac{f(t)^2}{\phi(t)} \quad\text{and}\quad \frac{d}{dt}\phi(t) \le \tilde B f(t) \tag{1.29} \]
with the difference between $\tilde A/(\tilde A + \tilde B)$ and 9/13 arbitrarily small for $\epsilon$ small enough, for all times t such that (1.18) holds, $\|v(t)\|_2$, $\|v'(t)\|_2$ are sufficiently small and $|a(t)| \le 1$.

On the other hand, when (1.19) holds, there is plenty of dissipation, and using (1.19) and the second half of Theorem 1.5, we get (1.29) with some different constants $\tilde A$ and $\tilde B$ (in fact, $\tilde B$ will be the constant K from Theorem 1.5), but such that the ratio $\tilde A/(\tilde A + \tilde B)$ is the same. The upshot is that we always have (1.29), but at two different time scales according to whether (1.19) or (1.18) holds. The heuristic idea that we will make precise in Sect. 4 is that by taking the slower of these two time scales, we bound the decay of our system. Therefore we consider the system of differential inequalities
\[ \frac{d}{dt}f(t) \le -A\frac{f(t)^2}{\phi(t)} \quad\text{and}\quad \frac{d}{dt}\phi(t) \le B f(t) \tag{1.30} \]
with A = 9 and B = 4. Theorem 5.1 of [4] says that for any solution of (1.30),
\[ f(t) \le f(0)^{1-q}\phi(0)^{q}\left(\frac{\phi(0)}{f(0)} + (A+B)t\right)^{-q}, \]
\[ \phi(t) \le f(0)^{1-q}\phi(0)^{q}\left(\frac{\phi(0)}{f(0)} + (A+B)t\right)^{1-q}, \]
where $q = A/(A+B)$. In the case at hand, this is $q = 9/13$. Since this value exceeds 1/2, we get $L^1$ decay in the following way: By the elementary Lemma 5.2 of [5], for any function w and any $0 < \delta < 1$,
\[ \|w\|_1 \le C(\delta)\,\|(1+x^2)^{1/2}w\|_2^{(1+\delta)/2}\,\|w\|_2^{(1-\delta)/2}, \tag{1.31} \]
where $C(\delta)$ is a finite constant. (This same method may be applied to solutions u of the heat equation $\partial u/\partial t = \Delta u$ with $\int_{\mathbb R} u(t)\,dx = 0$ to estimate the rate of $L^1$ decay, as shown in [5].) Here, we apply (1.31) with $w = Av(t)$, so that we obtain
\[ \|Av(t)\|_1^2 \le C(\delta)\,\phi(t)^{1+\delta}\,\|Av(t)\|_2^{1-\delta}. \]
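The closed-form bounds quoted from Theorem 5.1 of [4] can be sanity-checked against the equality case of (1.30). The sketch below is only an illustration: it integrates $f' = -A f^2/\phi$, $\phi' = B f$ with a classical Runge–Kutta step and compares the result with the two right-hand sides above. The initial data, the final time, the step count and all function names are arbitrary choices for this example. The last lines record the exponent bookkeeping $((1-q) - q)/2$, which reproduces the rate $t^{-5/26}$ mentioned in the text; this is a heuristic count as $\delta \to 0$, not a derivation taken from the source.

```python
from fractions import Fraction

A, B = 9.0, 4.0
q = A / (A + B)

def closed_form(t, f0, phi0):
    """Right-hand sides of the two bounds, with equality in (1.30)."""
    c = f0 ** (1.0 - q) * phi0 ** q
    tau = phi0 / f0 + (A + B) * t
    return c * tau ** (-q), c * tau ** (1.0 - q)

def rk4(t_end, f0, phi0, steps=20000):
    """Integrate f' = -A f^2 / phi, phi' = B f with the classical RK4 scheme."""
    def rhs(f, phi):
        return -A * f * f / phi, B * f

    h = t_end / steps
    f, phi = f0, phi0
    for _ in range(steps):
        k1f, k1p = rhs(f, phi)
        k2f, k2p = rhs(f + 0.5 * h * k1f, phi + 0.5 * h * k1p)
        k3f, k3p = rhs(f + 0.5 * h * k2f, phi + 0.5 * h * k2p)
        k4f, k4p = rhs(f + h * k3f, phi + h * k3p)
        f += h * (k1f + 2.0 * k2f + 2.0 * k3f + k4f) / 6.0
        phi += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return f, phi

# Hypothetical initial data, chosen only for the comparison.
f_num, phi_num = rk4(10.0, 0.5, 1.0)
f_exact, phi_exact = closed_form(10.0, 0.5, 1.0)

# Heuristic exponent count: phi grows like t^(1-q), f decays like t^(-q).
q_exact = Fraction(9, 13)
l1_rate = ((1 - q_exact) - q_exact) / 2

print(f_num, f_exact, phi_num, phi_exact, l1_rate)
```

In the equality case the numerical and closed-form values should agree to within the integration error, which indicates that the stated bounds are saturated by exact solutions of the system.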
Since 9/13 > 1/2, for $\delta$ sufficiently small we have that $\phi(t)^{1+\delta}$ increases more slowly than $\|Av(t)\|_2^{1-\delta}$ decreases, and so $\|Av(t)\|_1$ decreases to zero. In fact, the rate one gets is arbitrarily close to $t^{-5/26}$, for $\delta$ sufficiently small, as in Theorem 1.1. This leads to
\[ \lim_{t\to\infty}\int_{\mathbb R} Av(x,t)\,dx = \lim_{t\to\infty}\int_{\mathbb R} (V(x)+1)\,v(x,t)\,dx = 0. \]
But $\left|\int_{\mathbb R} V(x)v(x,t)\,dx\right| \le \|V\|_2\|v(t)\|_2$, and this tends to zero as t tends to infinity by the above, so that finally, $\lim_{t\to\infty}\int_{\mathbb R} v(x,t)\,dx = 0$. But (1.7) is equivalent to
\[ \int_{\mathbb R}\left(\bar m_{a(t)}(x) - m(x,0)\right)dx + \int_{\mathbb R} v(x,t)\,dx = 0, \]
and hence $\lim_{t\to\infty}\int_{\mathbb R}\left(\bar m_{a(t)}(x) - m(x,0)\right)dx = 0$, so that $\lim_{t\to\infty} a(t) = a$, where a is determined through (1.10). Indeed, the map $a \mapsto \int_{\mathbb R}\left(\bar m_a(x) - m(x,0)\right)dx$ is linear, and the slope is $-\int_{\mathbb R}\bar m_a'(x)\,dx = -2$, as one sees simply by differentiating. Thus,
\[ \left|\int_{\mathbb R}\left(\bar m_{a(t)}(x) - m(x,0)\right)dx\right| = 2|a(t) - a|. \]
2. Free Energy Estimates

It follows from (1.6) and the definition of A that
\[ \frac{d}{dt}F(m) = -\int_{\mathbb R}\left[\frac{d}{dx}\left(Av - \frac{1}{2}\left(3\bar m v^2 + v^3\right)\right)\right]^2 dx. \tag{2.1} \]
For convenience of notation, define
\[ U = -\frac{1}{2}\frac{d}{dx}\left(3\bar m v^2 + v^3\right) = -\frac{3}{2}\left(\bar m' v^2 + 2\bar m v v' + v^2 v'\right). \tag{2.2} \]
Now for any f and g in $L^2$ and for any $0 < \epsilon < 1$,
\[ \|f + g\|_2^2 \ge (1-\epsilon)\|f\|_2^2 - \frac{1}{\epsilon}\|g\|_2^2. \tag{2.3} \]
Combining (2.1), (2.2) and (2.3), we have
\[ -\frac{d}{dt}F(m) = \int_{\mathbb R}\left|(Av)' + U\right|^2 dx \ge (1-\epsilon)\int_{\mathbb R}\left|(Av)'\right|^2 dx - \frac{1}{\epsilon}\int_{\mathbb R}|U|^2\,dx. \tag{2.4} \]
The following result is closely based on lemmas and arguments in Sect. 3 of [4]. We have stated it so that it applies to a general class of potentials because the proof, although somewhat involved, depends only on fairly general properties of $\bar m$ and A.

Theorem 2.1. Let $v \in L^2(\mathbb R)$, $v' \in L^2(\mathbb R)$ and $\int v(x)\bar m'(x)\,dx = 0$. Then there exists a positive constant $\gamma$ such that
\[ \int \left|(Av)'\right|^2 dx \ge \gamma\|v'\|_2^2, \tag{2.5} \]
where A is the linear operator defined in (1.14).
Proof. First observe that $(Av)' = Av' + V'v$, where V is given in (1.15). Next, $v(x) = v(y) + \int_y^x v'(z)\,dz$. Multiply both sides by $\bar m'(y)$, and integrate in y. Since $\int v(y)\bar m'(y)\,dy = 0$, and since $\int \bar m'(y)\,dy = 2$,
\[ v(x) = \frac{1}{2}\int_{-\infty}^{\infty}\bar m'(y)\left(\int_y^x v'(z)\,dz\right)dy. \tag{2.6} \]
Hence
\[ (Av)' = Av' + Kv', \tag{2.7} \]
where
\[ K\phi(x) = \frac{1}{2}V'(x)\int_{-\infty}^{\infty}\bar m'(y)\left(\int_y^x \phi(z)\,dz\right)dy. \]
The operator K is compact on $L^2$. A detailed proof in a closely related case is given in [4]. Now consider the quadratic form $Q(\phi)$ given by
\[ Q(\phi) = \|(A + K)\phi\|_2^2 \]
for $\phi$ in the domain of A. We next show that $Q(\phi) > 0$ for all nonzero $\phi$ in its domain. Suppose on the contrary that $Q(\phi) = 0$ for some $\phi$ in the domain of Q, which is the operator domain of A. Define
\[ \eta(x) = \int_0^x \phi(y)\,dy = \langle 1_{[0,x]}, \phi\rangle. \]
It follows by the Schwarz inequality that
\[ |\eta(x)| \le \|\phi\|_2\sqrt{|x|} \quad\text{for all } x. \tag{2.8} \]
It then follows that $K\phi = V'\eta - \frac{1}{2}V'\langle \bar m', \eta\rangle$, where the inner product on the right is well defined because of the exponential decay of $\bar m'$ and (2.8). Hence
\[ (A + K)\phi = A\eta' + V'\eta - \frac{1}{2}V'\langle \bar m', \eta\rangle = (A\eta)' - \frac{1}{2}V'\langle \bar m', \eta\rangle. \]
Since the right side is a total derivative, we have
\[ A\eta - \frac{1}{2}V\langle \bar m', \eta\rangle = C, \tag{2.9} \]
where C is a constant. To determine C, multiply both sides by $\bar m'$, and integrate. Note that $\int \bar m'(A\eta)\,dx = 0$, because (2.8) permits the integration by parts. The computation then yields $C = (1/2)\langle \bar m', \eta\rangle$. Putting this in (2.9) yields $A\left(\eta - (1/2)\langle \bar m', \eta\rangle\right) = 0$. Now any solution $\psi$ of $A\psi = 0$ either decays exponentially or diverges exponentially at infinity, since, due to the rapid decay of $\bar m'$, and hence V, $\psi'' \approx \psi$. The only option consistent with (2.8) is exponential decay. Hence we must have that $\eta - (1/2)\langle \bar m', \eta\rangle$ is in the $L^2$ kernel of A. However, we know from Lemma 1.2 that this is spanned by $\bar m'$. So we must have $\eta - (1/2)\langle \bar m', \eta\rangle = \alpha\bar m'$. Integrating both sides against $\bar m'$ yields $\alpha = 0$. Hence $\eta$ is constant, and so $\phi = 0$, as was to be shown.
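The determination of the constant C in (2.9) rests on two elementary integrals, $\int \bar m'\,dx = 2$ and $\int V\bar m'\,dx = -2$. The following sketch checks them numerically under the assumption, consistent with (1.15), that $\bar m(x) = \tanh(x/2)$. The function names and the truncation to $[-40, 40]$ are choices of this illustration.

```python
import math

def mbar_prime(x):
    """Derivative of the front mbar(x) = tanh(x/2) (assumed profile)."""
    return 0.5 / math.cosh(0.5 * x) ** 2

def V(x):
    """Potential from (1.15)."""
    return 1.5 * (math.tanh(0.5 * x) ** 2 - 1.0)

def simpson(g, a, b, n):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

total_mass = simpson(mbar_prime, -40.0, 40.0, 20000)
v_weighted = simpson(lambda x: V(x) * mbar_prime(x), -40.0, 40.0, 20000)

# Integrating (2.9) against mbar' and using that the A-eta term drops out:
#   -(1/2) <mbar', eta> * v_weighted = C * total_mass,
# so C / <mbar', eta> equals the following ratio.
c_over_pairing = -0.5 * v_weighted / total_mass

print(total_mass, v_weighted, c_over_pairing)
```

With these inputs the ratio is close to 1/2, in agreement with $C = (1/2)\langle \bar m', \eta\rangle$.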
We will now show that there is a $\gamma > 0$ so that
\[ Q(\phi) \ge \gamma\|\phi\|_2^2 \tag{2.10} \]
for all $\phi$. The proof is similar to the proof of Weyl's lemma, though note that A + K is not self adjoint. If (2.10) were false, there would exist an infinite orthonormal sequence $\{\phi_n\}$ in $L^2$ such that $\lim_{n\to\infty} Q(\phi_n) = 0$. Since the sequence $\{\phi_n\}$ is orthonormal, it converges weakly to zero. Next, let $c_n = \langle \phi_n, \bar m'\rangle$ and note that $\lim_{n\to\infty} c_n = 0$. If the $c_n$ are not all zero, let $n_0$ be such that $|c_{n_0}| \ge |c_n|$ for all n, and define $\tilde\phi_n = \phi_n - (c_n/c_{n_0})\phi_{n_0}$. It is clear that the $\tilde\phi_n$ are all orthogonal to $\bar m'$, and moreover the modified sequence still converges weakly to zero, and still satisfies $\lim_{n\to\infty} Q(\tilde\phi_n) = 0$ and $\lim_{n\to\infty}\|\tilde\phi_n\|_2^2 = 1$. (If all of the $c_n$ vanish, we simply take $\tilde\phi_n = \phi_n$ for all n.) Moreover, by Lemma 1.2,
\[ \|A\tilde\phi_n\|_2^2 \ge \frac{9}{16}\|\tilde\phi_n\|_2^2. \tag{2.11} \]
Since the sequence $\{\tilde\phi_n\}$ converges weakly to zero,
\[ \lim_{n\to\infty} K\tilde\phi_n = 0 \tag{2.12} \]
strongly in $L^2$. Also, it is clear that the operator domain of A is the form domain of Q and that $\|A\phi\|_2^2 \le 2\left(Q(\phi) + \|K\phi\|_2^2\right)$ on this domain. Thus,
\[ \|A\tilde\phi_n\|_2^2 \le 2\left(Q(\tilde\phi_n) + \|K\|^2\|\tilde\phi_n\|_2^2\right), \tag{2.13} \]
where $\|K\|$ denotes the operator norm of K on $L^2$. In particular, the $\|A\tilde\phi_n\|_2$ are uniformly bounded by a finite constant. Now,
\[ Q(\tilde\phi_n) \le \|A\tilde\phi_n\|_2^2 + \|K\tilde\phi_n\|_2^2 + 2\|A\tilde\phi_n\|_2\|K\tilde\phi_n\|_2. \tag{2.14} \]
By (2.12) and (2.13), the last two terms on the right in (2.14) tend to zero with n. Hence for any $\epsilon > 0$, we obtain that $\|A\tilde\phi_n\|_2^2 \le \epsilon\|\tilde\phi_n\|_2^2$ for all sufficiently large n, which would contradict (2.11). This proves (2.10). Now by (2.7), when $\langle \bar m', v\rangle = 0$,
\[ \|(Av)'\|_2^2 = Q(v'), \]
and hence we have the result.

Combining this result with (2.4), we have
\[ -\frac{d}{dt}F(m) \ge (1-2\epsilon)\int_{\mathbb R}\left|(Av)'\right|^2 dx + \epsilon\gamma\|v'\|_2^2 - \frac{1}{\epsilon}\int_{\mathbb R}|U|^2\,dx. \tag{2.15} \]
We next show that the quantity on the last line is positive whenever $\delta$ and $\kappa$ are small enough. To accomplish this, we use the following lemma:

Lemma 2.2. Let $v \in L^2(\mathbb R)$, $v' \in L^2(\mathbb R)$. For any $\kappa > 0$ and $\epsilon_0 > 0$ small enough, there exists $\delta(\kappa, \epsilon_0) > 0$ such that the following estimate holds:
\[ \int_{\mathbb R}\left|U(v)\right|^2 dx \le \epsilon_0\int |v'|^2\,dx, \tag{2.16} \]
provided $\|v\|_2 \le \delta$, $\|v'\|_2 \le \kappa$.

Proof. This follows directly from (2.2) and the bound $\|v\|_\infty^2 \le 2\|v\|_2\|v'\|_2$.

Proof of Theorem 1.4. Now choose $\kappa$ and $\delta$ so that $\epsilon_0 \le \epsilon^2\gamma$, and then from (2.15), we have the inequality of Theorem 1.4.

We now prove a bound that will enable us to apply the dissipation–dichotomy argument described in the introduction.

Theorem 2.3. For all $\epsilon > 0$, there is an $\epsilon_0 > 0$ such that for all v orthogonal to $\bar m'$ with
\[ I(\bar m + v) = \|(Av)'\|_2^2 \le \epsilon_0^2\langle v, Av\rangle \tag{2.17} \]
one has
\[ (1-\epsilon)\|Av\|_2^2 \le \langle v, Av\rangle \le (1+\epsilon)\|Av\|_2^2. \tag{2.18} \]
Proof. First, by Lemma 1.2, inserting A1/2 v in place of v, v, Av ≤
4
Av 22 3
(2.19)
so we have that (Av) 22 ≤ (402 /3) v 22 . Then, using the notation of Lemma 1.2, Av 22 − v, Av = v , Av + V v, Av ≤ v , (Av) + |V v, Av| . Now |V v, Av| ≤ v 2 V 2 Av ∞ and by (2.17) and (2.19),
Av 2∞ ≤ 2 Av 2 (Av) 2 ≤
80
Av 22 . 3
Then, by Lemma 1.2 and Schwarz’s inequality, v 2 ≤ (4/3) Av 2 , so that, recalling from the proof of Lemma 1.2 that V 22 = 6, |V v, Av| ≤ 8
0
Av 22 . 3
(2.20)
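Next we bound $\langle v', (Av)'\rangle$. First, an easy application of (2.17) and (2.19) yields
\[ \langle v', (Av)'\rangle \le \|v'\|_2\|(Av)'\|_2 \le \epsilon_0\sqrt{\frac{4}{3}}\,\|v'\|_2\|Av\|_2. \tag{2.21} \]
By Theorem 2.1, $\|v'\|_2 \le (1/\sqrt{\gamma})\|(Av)'\|_2$; hence applying (2.17) and (2.19) again,
\[ \langle v', (Av)'\rangle \le \epsilon_0^2\frac{4}{3\sqrt{\gamma}}\|Av\|_2^2. \tag{2.22} \]
Combining (2.20) and (2.22), we have the result.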
3. Moment Estimates

In this section we prove Theorem 1.5 which bounds the growth of
\[ \phi(t) = 1 + \int_{\mathbb R}|x\,(Av)|^2\,dx + C\left[F(\bar m + v) - F(\bar m)\right], \tag{3.1} \]
where C is a positive constant to be specified. Actually,
\[ 1 + C\left[F(\bar m + v) - F(\bar m)\right] \tag{3.2} \]
is non-negative and monotone decreasing, so as far as growth is concerned, the quantity of real interest is
\[ \psi(t) = \int_{\mathbb R}|x\,(Av)|^2\,dx. \tag{3.3} \]
However, (3.2) contributes negative terms to the time derivative of $\phi(t)$ that serve to absorb certain terms that cannot be controlled in terms of the excess free energy, due to the unboundedness of the operator A. Recall that A means $A_{a(t)}$, where the solution m(x, t) has the form $m(x, t) = v(x, t) + \bar m_{a(t)}(x)$, and a(t) minimizes $\|m(t) - \bar m_a\|_2^2$. Therefore, it follows from (1.14) that
∂ ∂ ˙ (3.4) Aa(t) v(t) = Aa(t) v(t) − 3m ¯ a(t) a(t), ∂t ∂t where a(t) ˙ denotes the derivative of a(t). We can also rewrite the evolution equation (1.5) in terms of v(t) = m(t) − m ¯ a(t) , and doing so we obtain
1 ∂ . (3.5) v(t) = Aa(t) Aa(t) v(t) + 3m ¯ a(t) v 2 (t) + v 3 (t) Aa(t) ∂t 2 (This time there is no contribution involving a(t) ˙ since m ¯ a(t) is annihilated by Aa(t) .) Note that the first term on the right is linear in v, and the second term is higher order. The main contribution will come from the linear term, and it is this that we must work hardest to control. To control the term involving a(t), ˙ first note that (m(t) − m ¯ a(t) )m ¯ a(t) dx = 0 2 which holds for all t. Differentiating this equation in t, one obtains a(t) ˙
m ¯ a 2 −
v, m ¯ a = − (∂m/∂t)m ¯ a . Thus, we have δF m ¯ a dx ≤ 2 I(m(t)) m ¯ 2 , (3.6) |a(t)| ˙ ≤ 2 δm 2
as long as v 2 is sufficiently small that m ¯ a 2 − v, m ¯ a > 1/2. Since m ¯ has exponential decay, this gives us the bounds we will need to control the effects of the terms involving a(t), ˙ as we will see below. The non-linear terms are easily handled without any preparatory analysis.
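We now turn to the linear part, which will provide all of the most important terms. Consider the growth of $\psi(t)$ when v evolves according to the linearized equation
\[ \frac{\partial}{\partial t}v = (Av)''. \tag{3.7} \]
The computations that follow can be more clearly and compactly represented if we introduce the notation
\[ \xi = x\,(Av)' \quad\text{and}\quad \eta = Av. \tag{3.8} \]

Lemma 3.1. Let v(x, t) solve (3.7), and let $\psi(t)$ be defined in terms of v through (3.3). Then for any $\alpha > 0$,
\[ \frac{d}{dt}\psi(t) \le \left(12 + \frac{1 + 4\|V\|_1}{2\alpha}\right)\langle \eta', \eta'\rangle + \left(2 + \frac{\alpha}{2}\|(x^2V')\|_\infty^2 + 2\alpha\|V\|_1\right)\langle \eta, \eta\rangle, \tag{3.9} \]
where $\eta = Av$.

Proof. Let V be the potential defined in (1.15). Then one easily computes the commutators
\[ \left[\frac{\partial}{\partial x}, A\right] = V' \quad\text{and}\quad [x, A] = 2\frac{\partial}{\partial x}. \tag{3.10} \]
Clearly,
\[ \frac{d}{dt}\psi(t) = 2\int_{\mathbb R} x^2\,(Av)\,A(Av)''\,dx. \]
Now one commutes derivatives and multiples of x past A and integrates by parts to obtain a dissipative term of the form $-\int_{\mathbb R}\left(x(Av)'\right)A\left(x(Av)'\right)dx$ into which positive terms can be absorbed. The result, in the notation (3.8), is that
\[ \frac{d}{dt}\psi(t) = -2\langle \xi, A\xi\rangle - 4\langle \xi, \eta''\rangle - 4\langle x\eta, A\eta'\rangle - 2\langle \eta', x^2V'\eta\rangle. \]
The last three terms require further manipulation. First:
\[ \langle \xi, \eta''\rangle = -\langle \xi', \eta'\rangle = -\langle x\eta'', \eta'\rangle - \langle \eta', \eta'\rangle = -\frac{1}{2}\langle \eta', \eta'\rangle. \]
This term is controlled by the derivative of the excess free energy. Second, one has, using (3.10),
\[ \langle x\eta, A\eta'\rangle = \langle \eta, A\xi\rangle + 2\langle \eta, \eta''\rangle = \langle \eta, A\xi\rangle - 2\langle \eta', \eta'\rangle. \]
Finally, for any $\alpha > 0$,
\[ \left|\langle \eta', x^2V'\eta\rangle\right| \le \langle \eta', \eta'\rangle^{1/2}\langle (x^2V')\eta, (x^2V')\eta\rangle^{1/2} \le \frac{1}{2\alpha}\langle \eta', \eta'\rangle + \frac{\alpha}{2}\|(x^2V')\|_\infty^2\langle \eta, \eta\rangle. \]
Putting everything together, one obtains:
\[ \frac{d}{dt}\psi(t) \le -2\langle \xi, A\xi\rangle - 4\langle \xi, A\eta\rangle + \left(10 + \frac{1}{2\alpha}\right)\langle \eta', \eta'\rangle + \frac{\alpha}{2}\|(x^2V')\|_\infty^2\langle \eta, \eta\rangle. \tag{3.11} \]
Now one uses that
\[ -2\langle \xi, A\xi\rangle - 4\langle \xi, A\eta\rangle = -2\langle (\xi + \eta), A(\xi + \eta)\rangle + 2\langle \eta, A\eta\rangle. \]
But $\langle \eta, A\eta\rangle = \langle \eta', \eta'\rangle + \langle \eta, V\eta\rangle + \langle \eta, \eta\rangle \le \langle \eta', \eta'\rangle + \|V\|_1\|\eta\|_\infty^2 + \langle \eta, \eta\rangle$, and
\[ \|\eta\|_\infty^2 \le 2\|\eta\|_2\|\eta'\|_2 \le \frac{1}{\alpha}\langle \eta', \eta'\rangle + \alpha\langle \eta, \eta\rangle. \]
Altogether
\[ \langle \eta, A\eta\rangle \le \left(1 + \frac{\|V\|_1}{\alpha}\right)\langle \eta', \eta'\rangle + \left(1 + \alpha\|V\|_1\right)\langle \eta, \eta\rangle. \tag{3.12} \]
Putting (3.12) into (3.11) gives the result.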
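Lemma 3.2. $\langle \eta, \eta\rangle \le \langle \eta', \eta'\rangle + \langle v, Av\rangle$.

Proof. By Schwarz, for any $\alpha > 0$,
\[ \langle \eta, \eta\rangle = \langle A^{1/2}\eta, A^{1/2}v\rangle \le \langle \eta, A\eta\rangle^{1/2}\langle v, Av\rangle^{1/2} \le \frac{\alpha}{2}\langle \eta, A\eta\rangle + \frac{1}{2\alpha}\langle v, Av\rangle, \]
and $\langle \eta, A\eta\rangle \le \langle \eta', \eta'\rangle + \|V + 1\|_\infty\langle \eta, \eta\rangle$. Since $\|V + 1\|_\infty = 1$, we can choose $\alpha = 1$ and combine the above to obtain the result.

Proof of Theorem 1.5. First, we deal with the inhomogeneous terms involving $\dot a(t)$ on the right in (3.4), as they contribute to
\[ \int_{\mathbb R} x^2\,A_{a(t)}v\,\frac{\partial}{\partial t}\left(A_{a(t)}v\right)dx. \]
By symmetry and the Schwarz inequality, we have that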
2 ˙ 3 A v x m ¯ dx ≤ 3 Aa(t) x 2 m ¯ a(t) 2 v 2 |a(t)|. ˙ a(t) a(t) |a(t)| R
Now applying (3.6) , the contribution of the term involving a(t) ˙ is bounded above by ¯ a(t) 2 v 2 m ¯ a(t) 2 I(m(t)). 6 Aa(t) x 2 m It is here that we begin using the hypothesis that |a(t)| ≤ 1. The exponential decay of m ¯ a(t) would not give a bound on Aa(t) x 2 m ¯ a(t) 2 that is uniform in t if |a(t)| gets large. Since this is precluded by the hypotheses, for any α > 0, there is a universal constant Kα so that
Kα 3 x 2 Aa(t) v m ¯ a(t) dx |a(t)| ˙ ≤ (3.13) I(m(t)) + α v 22 . α R
Note that the first term on the right in (3.13) can be absorbed into the negative contribution from the inclusion of the multiple C of the excess free energy in $\phi$, at least if C is chosen appropriately large. Therefore, since we can take $\alpha$ arbitrarily small, and can bound $\|v\|_2^2$ in terms of the excess free energy by Lemma 1.3, this term is under control. One even more easily handles the contributions of the nonlinear terms in (3.5) using the bound $\|v\|_\infty^2 \le 2\|v\|_2\|v'\|_2$. We do not give the details here, but turn to the application of the lemmas from this section to control the contribution from the linear terms. To apply Lemma 3.1, choose $\alpha$ so that
\[ 2 + \frac{\alpha}{2}\|(x^2V')\|_\infty^2 + 2\alpha\|V\|_1 \le 2\left(1 + \frac{\epsilon}{4}\right). \]
Then, for this choice of $\alpha$, and using the notation from (3.8),
\[ \frac{d}{dt}\psi(t) \le \left(12 + \frac{1 + 4\|V\|_1}{2\alpha}\right)\langle \eta', \eta'\rangle + 2\left(1 + \frac{\epsilon}{4}\right)\langle \eta, \eta\rangle. \tag{3.14} \]
Next, by Theorem 1.4,
\[ \frac{d}{dt}C\left[F(\bar m + v) - F(\bar m)\right] \le -C(1-\epsilon)\langle \eta', \eta'\rangle. \]
Therefore, if we choose C so that $C(1-\epsilon) \ge 12 + (1 + 4\|V\|_1)/(2\alpha)$, we get
\[ \frac{d}{dt}\phi(t) \le 2\left(1 + \frac{\epsilon}{4}\right)\langle \eta, \eta\rangle. \]
It remains to bound $\|\eta\|_2^2$. There are two cases. First suppose that the dissipation is small compared to the excess free energy so that (1.18) holds. Then by Theorem 2.3,
η 22 ≤ (1 + )v, Av, and then by Lemma 1.3, η 22 ≤ (1 + )3 [F(m(t)) − F(m)], ¯ for δ and κ sufficiently small. Redefining , we have proved (1.22) under the hypothesis (1.18). If we don’t assume (1.18), we use
η 22 = v , η + (V + 1)v, η ≤ v 2 I(m(t)) + v 2 η 2 since v+1 ∞ = 1. This leads to η 22 ≤ (2/γ )I(m(t))+4 v 22 , where γ is the constant in Theorem 1.4. Again, the term involving I(m(t)) can be absorbed by an appropriate choice of C. The remaining term is easily handled by Lemma 1.2 and Lemma 1.3, and so (1.23) is established.
4. Proof of the Main Theorem

We will be brief in the presentation of this proof since from this point on, it is very close to the one we have given for the LOP equation in Sect. 4 of [5]. Let m(t) be a solution of (1.5) with initial data as specified in Theorem 1.1, where the size of $\delta$ is to be specified in the course of the proof. The first step is to wait a bit to acquire some smoothness. For any fixed $\kappa > 0$, if initially $\|v\|_2 \le \delta/4$, where $\delta$ is sufficiently small, we will have that $\|v(1)\|_2 \le \delta/2$ and $\|v'(1)\|_2 \le \kappa/2$, and moreover |a(1)| will be small. Regularity theory for m(t) can be found in [3]. Also, the production-of-smoothness estimates in Sect. 2 of [5] are easily adapted to this case to see the validity of the above assertion.

We now begin the analysis from this starting point. All of the lemmas and theorems that required $\|v(1)\|_2 \le \delta$, $\|v'(1)\|_2 \le \kappa$, and $|a(1)| < 1$ can be used until time T, which is the first time that any of them is violated. Of course, we have to show that such a time T never occurs. Let f(t) and $\phi(t)$ be given in terms of m(t) as in the introduction. We begin by assuming that at time t, (1.18) holds. Then by Theorem 1.4,
\[ \frac{d}{dt}f(t) \le -(1-\epsilon)\|(Av)'\|_2^2. \]
By convexity $\|(Av)'\|_2^2 \ge \|(A\rho * v)'\|_2^2$, where $\rho = (1/2)\bar m'$, which is a probability density. Because v is orthogonal to $\bar m'$, $\rho * v(a(t)) = 0$. Therefore, by the constrained uncertainty principle (1.24),
\[ \|(Av)'\|_2^2 \ge \|(A\rho * v)'\|_2^2 \ge \frac{9}{4}\frac{\|A\rho * v\|_2^4}{\|(x - a(t))(A\rho * v)\|_2^2}. \]
Now under the condition (1.18), v is so smooth and spread out that $\rho * v \approx v$, and we do not lose much in passing from v to $\rho * v$. The estimates are straightforward, making use of (3.10), and are exactly like those applied on pp. 868–869 of [5].
Without repeating the
details, the result is that (Av) 22 ≥ (9/4)(1 − )2 A ∗ v 42 / (x − a(t)) (Av) 22 and hence that, with redefined, and making use of Lemma 1.3, d f 2 (t) f (t) ≤ −9(1 − ) , dt φ(t) where we have used the fact that |a(t)| < 1 to absorb the effects of a(t) into the constant term. By Theorem 1.5, we have that d φ(t) ≤ 4(1 + )f (t). dt Hence for such t, we have (1.30) satisfied with A/(A + B) arbitrarily close to 9/13. Now suppose that (1.19) holds. Then we have d ˜ (t) φ(t) ≤ Bf dt from the second half of Theorem 1.5, where B˜ is the constant K given there. From (1.19) d f 2 (t) f (t) ≤ −1 f (t) ≤ A˜ , dt φ(t)
(4.1)
where $\tilde A$ can be chosen as large as we like provided f(t) is sufficiently small. Thus with $\delta$ chosen sufficiently small, as long as $f(t) < \delta$ holds, we have (??) and can arrange for it to hold with a value of $\tilde A$ so that $\tilde A/(\tilde A + \tilde B) = A/(A + B)$. Thus, by rescaling the time in those time intervals in which (1.19) holds, i.e., possibly using a slower clock there, we have a system holding for all t. The details of this argument are exactly as in Sect. 5 of [5]. One now concludes that as long as $|a(t)| < 1$, $\|v(t)\|_2 \le \delta$ and $\|v'(t)\|_2 \le \kappa$, f(t) decays at a rate close to $t^{-9/13}$ (using the slower of the two time scales). Therefore, as in [5], $|a(t)| < 1$, $\|v(t)\|_2 \le \delta$ and $\|v'(t)\|_2 \le \kappa$ hold for all t, and so f(t) decays all the way to zero at a rate close to $t^{-9/13}$, as in Theorem 1.1. As explained at the end of Sect. 1 of this paper, this means that $\|Av(t)\|_1$ decays to zero at an algebraic rate, and that this forces $\lim_{t\to\infty} a(t) = a$, where a is given by the conservation law.
References
1. Asselah, A.: Stability of a wave front for a nonlocal conservative evolution. Proc. Royal Soc. Edinburgh 128A, no. 2, 219–234 (1998)
2. Bricmont, J., Kupiainen, A., Taskinen, J.: Stability of Cahn–Hilliard Fronts. Comm. Pure Appl. Math. 52, no. 7, 839–871 (1999)
3. Caffarelli, L., Muler, N.E.: An L∞ bound for solutions of the Cahn–Hilliard equation. Arch. Rational Mech. Anal. 133, 129–144 (1995)
4. Carlen, E.A., Carvalho, M.C., Orlandi, E.: Algebraic rate of decay for the excess free energy and stability of fronts for a non-local phase kinetics equation with a conservation law I. J. Stat. Phys. 95, no. 5/6, 1069–1117 (1999)
5. Carlen, E.A., Carvalho, M.C., Orlandi, E.: Algebraic rate of decay for the excess free energy and stability of fronts for a non-local phase kinetics equation with a conservation law II. Comm. P.D.E. 25, no. 5/6, 847–886 (2000)
6. De Masi, A., Orlandi, E., Presutti, E., Triolo, E.: Stability of the interface in a model of phase separation. Proc. Royal Soc. Edinburgh 124A, 1013–1022 (1994)
7. Giacomin, G., Lebowitz, J.: Phase segregation dynamics in particle systems with long range interactions I: Macroscopic limits. J. Stat. Phys. 87, no. 1/2, 37–61 (1997)
8. Hardy, G., Littlewood, J., Polya, G.: Inequalities. Cambridge: Cambridge Univ. Press, 1932
9. Lieb, E.H., Thirring, W.: Inequalities of the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities. In: Studies in Mathematical Physics, Essays in honor of Valentine Bargmann, edited by Lieb, Simon and Wightman, Princeton, NJ: Princeton University Press, 1976, pp. 269–303
10. Lebowitz, J.L., Orlandi, E., Presutti, E.: A Particle model for spinodal decomposition. J. Stat. Phys. 63, 933–974 (1991)
11. Weyl, H.: Gruppentheorie und Quantenmechanik. Leipzig: Wissenschaftlicher Verlag, 1926

Communicated by A. Kupiainen
Commun. Math. Phys. 224, 341 – 372 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
On Quasi-Hopf Superalgebras
Mark D. Gould, Yao-Zhong Zhang, Phillip S. Isaac
Department of Mathematics, The University of Queensland, Brisbane, Qld 4072, Australia. E-mail: [email protected]
Received: 14 December 1998 / Accepted: 29 January 2000
Abstract: In this work we investigate several important aspects of the structure theory of the recently introduced quasi-Hopf superalgebras (QHSAs), which play a fundamental role in knot theory and integrable systems. In particular we introduce the opposite structure and prove in detail (for the graded case) Drinfeld's result that the coproduct $\Delta' \equiv (S \otimes S)\cdot T\cdot\Delta\cdot S^{-1}$ induced on a QHSA is obtained from the coproduct $\Delta$ by twisting. The corresponding "Drinfeld twist" $F_D$ is explicitly constructed, as well as its inverse, and we investigate the complete QHSA associated with $\Delta'$. We give a universal proof that the coassociator $\Phi' = (S \otimes S \otimes S)\Phi_{321}$ and canonical elements $\alpha' = S(\beta)$, $\beta' = S(\alpha)$ correspond to twisting the original coassociator $\Phi = \Phi_{123}$ and canonical elements $\alpha$, $\beta$ with the Drinfeld twist $F_D$. Moreover in the quasi-triangular case, it is shown algebraically that the R-matrix $R' = (S \otimes S)R$ corresponds to twisting the original R-matrix R with $F_D$. This has important consequences in knot theory, which will be investigated elsewhere.

Current address: Graduate School of Mathematical Sciences, The University of Tokyo, 3-8-1 Komaba, Meguro, Tokyo 153-8914, Japan. E-mail: [email protected]

1. Introduction

The main aim of this paper, in conjunction with [1], is to continue the work introduced in [2] which defines $\mathbb Z_2$ graded versions of Drinfeld's quasi-Hopf algebras [3], called quasi-Hopf superalgebras (QHSAs). In particular, we show that the special QHSA structure obtained by application of the antipode (see Proposition 4) actually coincides with the quasi-Hopf superalgebra structure induced by twisting with $F_D$, the "Drinfeld twist" (see Eq. (4.10)). In the quasi-triangular case, our results in this direction are new, even in the non-graded case.

The potential for application of these new structures is enormous. They give rise to new (non-standard) representations of the braid group and corresponding link polynomials which will be investigated elsewhere. Moreover, it has already been shown in [4–8] and [2] that QHSAs are directly relevant to elliptic quantum (super)groups [9, 10], which are useful in obtaining elliptic solutions [11–16] to the (graded) quantum Yang-Baxter equation. The importance of QHSAs in supersymmetric integrable models and the theory of knots and links [17] should become evident as the theory is developed further, which is the aim of this paper. In particular, the opposite structure is introduced and several aspects of their structure theory are investigated.

2. Quasi-Hopf Superalgebras and Twistings

This section is mostly a summary of the definitions and results given in [2]. They are important and worth restating here since they will be used frequently.

Definition 1. A $\mathbb Z_2$ graded quasi-bialgebra A over $\mathbb C$ is a unital associative algebra equipped with algebra homomorphisms $\epsilon : A \to \mathbb C$ (counit), $\Delta : A \to A \otimes A$ (coproduct) together with an invertible homogeneous $\Phi \in A \otimes A \otimes A$ (coassociator) satisfying
\[ (1 \otimes \Delta)\Delta(a) = \Phi^{-1}(\Delta \otimes 1)\Delta(a)\Phi, \quad \forall a \in A, \tag{2.1} \]
\[ (\Delta \otimes 1 \otimes 1)\Phi\cdot(1 \otimes 1 \otimes \Delta)\Phi = (\Phi \otimes 1)\cdot(1 \otimes \Delta \otimes 1)\Phi\cdot(1 \otimes \Phi), \tag{2.2} \]
\[ (\epsilon \otimes 1)\Delta = 1 = (1 \otimes \epsilon)\Delta, \tag{2.3} \]
\[ (1 \otimes \epsilon \otimes 1)\Phi = 1. \tag{2.4} \]
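Properties (2.2), (2.3) and (2.4) imply that
\[ (\epsilon \otimes 1 \otimes 1)\Phi = 1 = (1 \otimes 1 \otimes \epsilon)\Phi. \]
In this case, multiplication of tensor products is $\mathbb Z_2$ graded and defined as
\[ (a \otimes b)(c \otimes d) = (-1)^{[b][c]}\,ac \otimes bd \]
for homogeneous $a, b, c, d \in H$ and where $[a] \in \mathbb Z_2$ denotes the grading of a, so that we have the following important result which will be used frequently:
\[ [a] = 1 \Rightarrow \epsilon(a) = 0. \]
Also, the twist map $T : H \otimes H \to H \otimes H$ is defined by
\[ T(a \otimes b) = (-1)^{[a][b]}\,b \otimes a. \]
Since $\Phi$ is homogeneous, the counit properties imply that $\Phi$ is even ($[\Phi] = 0$).

Definition 2. A QHSA H is a $\mathbb Z_2$ graded quasi-bialgebra equipped with a $\mathbb Z_2$ graded antiautomorphism $S : H \to H$ (antipode) and homogeneous canonical elements $\alpha, \beta \in H$ such that for all $a \in H$,
\[ m\cdot(1 \otimes \alpha)(S \otimes 1)\Delta(a) = \epsilon(a)\alpha, \tag{2.5} \]
\[ m\cdot(1 \otimes \beta)(1 \otimes S)\Delta(a) = \epsilon(a)\beta, \tag{2.6} \]
\[ m(m \otimes 1)\cdot(S \otimes 1 \otimes 1)(1 \otimes \alpha \otimes \beta)(1 \otimes 1 \otimes S)\Phi = 1, \tag{2.7} \]
\[ m(m \otimes 1)\cdot(1 \otimes \beta \otimes \alpha)(1 \otimes S \otimes 1)\Phi^{-1} = 1. \tag{2.8} \]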
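Here $m : H \otimes H \to H$ is the multiplication map, $m(a \otimes b) = ab$, $\forall a, b \in H$, and S is defined by $S(ab) = (-1)^{[a][b]}S(b)S(a)$ for homogeneous a, b. This can be extended to inhomogeneous elements by linearity. Also, since H is associative, $m(m \otimes 1) = m(1 \otimes m)$. If we apply $\epsilon$ to (2.7) and (2.8) we obtain, in view of Eq. (2.4), $\epsilon(\alpha)\epsilon(\beta) = \epsilon(\alpha\beta) = 1$, so that $[\alpha] = [\beta] = 0$. It then follows by applying $\epsilon$ to (2.5) and (2.6) that $\epsilon(S(a)) = \epsilon(a)$, $\forall a \in H$. If we write
\[ \Phi = \sum_\nu X_\nu \otimes Y_\nu \otimes Z_\nu, \]
and using the standard coproduct notation of Sweedler [18],
\[ \Delta(a) = \sum_{(a)} a_{(1)} \otimes a_{(2)}, \]
(2.5), (2.6), (2.7) and (2.8) may be expressed
\[ \sum_{(a)} S(a_{(1)})\,\alpha\,a_{(2)} = \epsilon(a)\alpha, \]
\[ \sum_{(a)} a_{(1)}\,\beta\,S(a_{(2)}) = \epsilon(a)\beta, \]
\[ 1 = \sum_\nu S(X_\nu)\,\alpha\,Y_\nu\,\beta\,S(Z_\nu) = \sum_\nu \bar X_\nu\,\beta\,S(\bar Y_\nu)\,\alpha\,\bar Z_\nu. \]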
The definition of a QHSA is designed to ensure that its finite dimensional representations constitute a monoidal category. For example, a Hopf superalgebra is a QHSA with $\alpha = \beta = 1$ and $\Phi = 1^{\otimes 3}$. In fact, the relation between QHSAs and Hopf superalgebras is analogous to that between quasi-triangular Hopf superalgebras and cocommutative ones. In the latter case cocommutativity is weakened while in the former case coassociativity is weakened (in the same sense).

Before proceeding, it is important to establish some notation. For the coassociator $\Phi$ and its inverse, we set
\[ \Phi_{123} \equiv \Phi = \sum_\nu X_\nu \otimes Y_\nu \otimes Z_\nu, \]
\[ \Phi^{-1}_{123} \equiv \Phi^{-1} = \sum_\nu \bar X_\nu \otimes \bar Y_\nu \otimes \bar Z_\nu. \]
We may then define the elements $\Phi_{132}$ and $\Phi_{312}$ (for example) by applying appropriate twists to the positions so that
\[ \Phi_{132} = (1 \otimes T)\Phi_{123} = \sum_\nu X_\nu \otimes Z_\nu \otimes Y_\nu \times (-1)^{[Y_\nu][Z_\nu]}, \]
\[ \Phi_{312} = (T \otimes 1)\Phi_{132} = \sum_\nu Z_\nu \otimes X_\nu \otimes Y_\nu \times (-1)^{[Y_\nu][Z_\nu] + [X_\nu][Z_\nu]}, \]
and similarly for $\Phi^{-1}$, so that, for example,
\[ \Phi^{-1}_{231} = (1 \otimes T)\Phi^{-1}_{213} = (1 \otimes T)(T \otimes 1)\Phi^{-1}_{123} = \sum_\nu \bar Y_\nu \otimes \bar Z_\nu \otimes \bar X_\nu \times (-1)^{[\bar X_\nu][\bar Y_\nu] + [\bar X_\nu][\bar Z_\nu]}. \]
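The sign bookkeeping in these permuted coassociators can be mechanised. The following is a small illustrative sketch, not taken from [2]: it models one summand $X \otimes Y \otimes Z$ as a list of (label, grade) pairs, applies the graded twist T to adjacent slots, and checks the sign attached to $\Phi_{312}$ for every assignment of gradings. The representation and the function names `twist` and `phi_312` are hypothetical choices made for this example.

```python
from itertools import product

def twist(factors, sign, i):
    """Graded twist T acting on slots i and i+1 of a pure tensor.

    `factors` is a list of (label, grade) pairs with grade in {0, 1};
    swapping a and b contributes the sign (-1)^([a][b]).
    """
    a, b = factors[i], factors[i + 1]
    swapped = factors[:i] + [b, a] + factors[i + 2:]
    return swapped, sign * (-1) ** (a[1] * b[1])

def phi_312(gx, gy, gz):
    """One summand of Phi_312 = (T (x) 1)(1 (x) T) Phi_123, with its sign."""
    factors, sign = [("X", gx), ("Y", gy), ("Z", gz)], 1
    factors, sign = twist(factors, sign, 1)  # 1 (x) T : X (x) Z (x) Y
    factors, sign = twist(factors, sign, 0)  # T (x) 1 : Z (x) X (x) Y
    return [label for label, _ in factors], sign

# Compare with the closed-form sign (-1)^([Y][Z] + [X][Z]) for all gradings.
checks = []
for gx, gy, gz in product((0, 1), repeat=3):
    labels, sign = phi_312(gx, gy, gz)
    expected = (-1) ** (gy * gz + gx * gz)
    checks.append(labels == ["Z", "X", "Y"] and sign == expected)

print(all(checks))
```

The same two-step pattern, with the twists applied in the order stated in the text, gives the sign of $\Phi^{-1}_{231}$.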
Note that our convention differs from the usual one (see [3] for example) which employs the inverse permutations on the positions. However, this is simply notation and is not important below. We now have the following definition, which once again appears in [2], and which we include here for convenience.

Definition 3. A QHSA H is called quasi-triangular if there exists an invertible homogeneous $R \in H \otimes H$ such that
\[ \Delta^T(a)R = R\Delta(a), \quad \forall a \in H, \tag{2.9} \]
\[ (\Delta \otimes 1)R = \Phi^{-1}_{231}R_{13}\Phi_{132}R_{23}\Phi^{-1}_{123}, \tag{2.10} \]
\[ (1 \otimes \Delta)R = \Phi_{312}R_{13}\Phi^{-1}_{213}R_{12}\Phi_{123}, \tag{2.11} \]
where $\Delta^T \equiv T\cdot\Delta$. Moreover, if R satisfies $R^{-1} = T\cdot R \equiv R^T$, then H is called triangular.

Note that this definition of quasi-triangular QHSAs ensures that the family of finite dimensional H-modules constitutes a quasi-tensor category. Equations (2.10) and (2.11) immediately imply $(\epsilon \otimes 1)R = (1 \otimes \epsilon)R = 1$, and hence $[R] = 0$. It can be shown that R also satisfies the graded quasi-quantum Yang-Baxter equation (graded QQYBE)
\[ R_{12}\Phi^{-1}_{231}R_{13}\Phi_{132}R_{23}\Phi^{-1}_{123} = \Phi^{-1}_{321}R_{23}\Phi_{312}R_{13}\Phi^{-1}_{213}R_{12}. \tag{2.12} \]
Now we come to twistings. Here we point out that the category of quasi-triangular QHSAs is invariant under a kind of gauge-transformation. Let $F \in H \otimes H$ be an invertible homogeneous element satisfying the property
\[ (1 \otimes \epsilon)F = (\epsilon \otimes 1)F = 1, \tag{2.13} \]
(so that $[F] = 0$) with H a (quasi-triangular) QHSA. Set
\[ \Delta_F(a) = F\Delta(a)F^{-1}, \quad \forall a \in H, \qquad \Phi_F = (F \otimes 1)\cdot(\Delta \otimes 1)F\cdot\Phi\cdot(1 \otimes \Delta)F^{-1}\cdot(1 \otimes F^{-1}), \tag{2.14} \]
and
\[ \alpha_F = m\cdot(1 \otimes \alpha)(S \otimes 1)F^{-1}, \qquad \beta_F = m\cdot(1 \otimes \beta)(1 \otimes S)F. \tag{2.15} \]
Also put
\[ R_F = F^T R F^{-1}, \tag{2.16} \]
where $F^T \equiv T\cdot F \equiv F_{21}$. The following theorem summarises results proven in [2]. Let $(H, \Delta, \epsilon, \Phi, S, \alpha, \beta)$ denote the entire QHSA structure. Given this structure, we have

Theorem 1. $(H, \Delta_F, \epsilon, \Phi_F, S, \alpha_F, \beta_F)$ is also a QHSA. Moreover, if H is quasi-triangular with R-matrix R, then $(H, \Delta_F, \epsilon, \Phi_F, S, \alpha_F, \beta_F)$ is also quasi-triangular with R-matrix $R_F$.

We refer to F as a twistor. $(H, \Delta_F, \epsilon, \Phi_F, S, \alpha_F, \beta_F)$ is said to be the structure of H twisted under F. It is possible to impose on F the cocycle condition
\[ (F \otimes 1)(\Delta \otimes 1)F = (1 \otimes F)(1 \otimes \Delta)F. \tag{2.17} \]
It is worth pointing out that if we have a quasi-triangular Hopf superalgebra ($\Phi = 1^{\otimes 3}$, $\alpha = \beta = 1$) with structure $(H, \Delta, \varepsilon, S)$ and R-matrix $R$, and then apply a twist $F$ that satisfies (2.17), we obtain a Hopf superalgebra $(H, \Delta_F, \varepsilon, S)$ with new R-matrix $R_F$.

3. Opposite Structure

Let
\[ \Delta^T = T \cdot \Delta \]
be the opposite coproduct on a QHSA $H$. Also set
\[ \Phi^T = \Phi^{-1}_{321} = \sum_\nu \bar{Z}_\nu \otimes \bar{Y}_\nu \otimes \bar{X}_\nu \times (-1)^{[\bar{X}_\nu][\bar{Y}_\nu] + [\bar{X}_\nu][\bar{Z}_\nu] + [\bar{Y}_\nu][\bar{Z}_\nu]}, \]
\[ \alpha^T = S^{-1}(\alpha), \quad \text{and} \quad \beta^T = S^{-1}(\beta). \]
Our aim here is to prove the following.

Proposition 1. $(H, \Delta^T, \varepsilon, \Phi^T, S^{-1}, \alpha^T, \beta^T)$ is a QHSA. This is called the opposite structure on $H$.
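Before the proof, a brief sanity check may help fix the definitions. It is only a sketch: it assumes that the graded twist satisfies $T \cdot T = 1$, and that the "opposite of the opposite" is formed by applying the same recipe with $S^{-1}$ in the role of the antipode. Under these assumptions, taking the opposite structure twice returns the original data:

```latex
% Assumptions: T \cdot T = 1 on H \otimes H (graded twist), and the opposite of
% (H, \Delta^T, \varepsilon, \Phi^T, S^{-1}, \alpha^T, \beta^T) is built by the same recipe.
\begin{align*}
(\Delta^T)^T &= T \cdot T \cdot \Delta = \Delta, \\
(\Phi^T)^T   &= \bigl((\Phi^T)^{-1}\bigr)_{321} = \bigl(\Phi_{321}\bigr)_{321} = \Phi, \\
(\alpha^T)^T &= (S^{-1})^{-1}(\alpha^T) = S\bigl(S^{-1}(\alpha)\bigr) = \alpha,
\qquad (\beta^T)^T = S\bigl(S^{-1}(\beta)\bigr) = \beta .
\end{align*}
```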
M. D. Gould, Y.-Z. Zhang, P. S. Isaac
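Proof. Firstly we prove that we indeed have a $Z_2$ graded quasi-bialgebra structure. We note that (2.3) and (2.4) are obvious. For $a \in H$, (2.1) may be written (in Sweedler's notation [18])
\[ \sum a_{(1)} \otimes \Delta(a_{(2)}) = \Phi^{-1}_{123} \Bigl( \sum \Delta(a_{(1)}) \otimes a_{(2)} \Bigr) \Phi_{123}. \]
Below we set
\[ \Delta(a_{(1)}) = \sum_i a_{(1)i} \otimes a^i_{(1)}, \qquad \Delta(a_{(2)}) = \sum_i a_{(2)i} \otimes a^i_{(2)}, \]
so that (2.1) becomes
\[ \sum a_{(1)} \otimes a_{(2)i} \otimes a^i_{(2)} = \Phi^{-1}_{123} \Bigl( \sum a_{(1)i} \otimes a^i_{(1)} \otimes a_{(2)} \Bigr) \Phi_{123}. \tag{3.1} \]
If we then apply the algebra homomorphism $(1 \otimes T)(T \otimes 1)(1 \otimes T)$ to (3.1) we obtain
\[ (\Delta^T \otimes 1)\Delta^T(a) = \Phi^{-1}_{321} (1 \otimes \Delta^T)\Delta^T(a) \Phi_{321}, \]
which can be written
\[ (1 \otimes \Delta^T)\Delta^T(a) = (\Phi^T)^{-1} (\Delta^T \otimes 1)\Delta^T(a) \Phi^T \]
with $\Phi^T$ as stated. Taking the inverse of (2.2) and applying the algebra homomorphism $(T \otimes T)(1 \otimes T \otimes 1)(T \otimes T)(1 \otimes T \otimes 1)$ to both sides, we have
\[ (\Delta^T \otimes 1 \otimes 1)\Phi^T \cdot (1 \otimes 1 \otimes \Delta^T)\Phi^T = (\Phi^T \otimes 1) \cdot (1 \otimes \Delta^T \otimes 1)\Phi^T \cdot (1 \otimes \Phi^T), \]
which is (2.2) for the opposite structure. Hence we have proved the $Z_2$ graded quasi-bialgebra properties.

As to the remaining properties, we use (2.5) to obtain the following:
\[ m \cdot (1 \otimes \alpha^T)(S^{-1} \otimes 1)\Delta^T(a) = \sum S^{-1}(a_{(2)}) S^{-1}(\alpha) a_{(1)} \times (-1)^{[a_{(1)}][a_{(2)}]} = S^{-1}\Bigl( \sum S(a_{(1)}) \alpha a_{(2)} \Bigr) = S^{-1}(\varepsilon(a)\alpha) = \varepsilon(a)\alpha^T, \]
and similarly, we can use (2.6) to obtain
\[ m \cdot (1 \otimes \beta^T)(1 \otimes S^{-1})\Delta^T(a) = \varepsilon(a)\beta^T. \]
As to the opposite of (2.7), we have
\[ m(m \otimes 1) \cdot (S^{-1} \otimes 1 \otimes 1)(1 \otimes \alpha^T \otimes \beta^T)(1 \otimes 1 \otimes S^{-1})\Phi^T = \sum S^{-1}(\bar{Z}_\nu) S^{-1}(\alpha) \bar{Y}_\nu S^{-1}(\beta) S^{-1}(\bar{X}_\nu) \times (-1)^{[\bar{X}_\nu][\bar{Y}_\nu] + [\bar{X}_\nu][\bar{Z}_\nu] + [\bar{Y}_\nu][\bar{Z}_\nu]} = S^{-1}\Bigl( \sum \bar{X}_\nu \beta S(\bar{Y}_\nu) \alpha \bar{Z}_\nu \Bigr) = 1. \]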
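In a similar way, we can show that the opposite of (2.8) is
\[ m(m \otimes 1) \cdot (1 \otimes \beta^T \otimes \alpha^T)(1 \otimes S^{-1} \otimes 1)(\Phi^T)^{-1} = 1. \]
This completes the proof.

Now consider (2.9). This immediately shows that the opposite R-matrix $R^T \equiv T \cdot R$ satisfies the intertwining property under the opposite coproduct $\Delta^T$. We now investigate (2.10) and (2.11) for this opposite structure. Set
\[ R = \sum_i e_i \otimes e^i. \]
Applying the homomorphism $(1 \otimes T)(T \otimes 1)(1 \otimes T)$ to (2.10) gives
\[ (1 \otimes \Delta^T)R^T = \sum (\bar{X}_\nu \otimes \bar{Z}_\nu \otimes \bar{Y}_\nu)(e^j \otimes 1 \otimes e_j)(Y_\rho \otimes Z_\rho \otimes X_\rho)(e^k \otimes e_k \otimes 1) \cdot (\bar{Z}_\mu \otimes \bar{Y}_\mu \otimes \bar{X}_\mu) \]
\[ \times (-1)^{[\bar{Y}_\nu][\bar{Z}_\nu] + [Y_\rho][Z_\rho] + [X_\rho][Y_\rho] + [e_j][e^j] + [e_k][e^k]} \times (-1)^{[\bar{X}_\mu][\bar{Y}_\mu] + [\bar{X}_\mu][\bar{Z}_\mu] + [\bar{Y}_\mu][\bar{Z}_\mu]} = \Phi^{-1}_{132} R^T_{13} \Phi_{231} R^T_{12} \Phi^{-1}_{321}. \]
Since
\[ \Phi^{-1}_{321} = \Phi^T_{123}, \qquad \Phi_{231} = (\Phi^T)^{-1}_{213}, \qquad \Phi^{-1}_{132} = \Phi^T_{312}, \]
we have
\[ (1 \otimes \Delta^T)R^T = \Phi^T_{312} R^T_{13} (\Phi^T)^{-1}_{213} R^T_{12} \Phi^T_{123}, \]
which proves (2.11) for the opposite structure. Now applying the homomorphism $(T \otimes 1)(1 \otimes T)(T \otimes 1)$ to (2.11), we can obtain Eq. (2.10) for the opposite structure in a similar way:
\[ (\Delta^T \otimes 1)R^T = (\Phi^T)^{-1}_{231} R^T_{13} \Phi^T_{132} R^T_{23} (\Phi^T)^{-1}_{123}. \]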
Thus we have proved

Proposition 2. $(H, \Delta^T, \varepsilon, \Phi^T, S^{-1}, \alpha^T, \beta^T)$ is a quasi-triangular QHSA with R-matrix $R^T \equiv T \cdot R$.

It is worth noting that if $H$ is a quasi-triangular QHSA, then its R-matrix $R$ satisfies (2.13), so we may consider twisting $H$ with its own R-matrix. Obviously the coproduct now reduces to the opposite one: $\Delta_R(a) = R\Delta(a)R^{-1} = \Delta^T(a)$ for every $a \in H$. In this case, in view of the graded QQYBE (2.12), the coassociator induced by $R$ coincides with the opposite coassociator:
\[ \Phi_R \overset{(2.14)}{=} R_{12} \cdot (\Delta \otimes 1)R \cdot \Phi \cdot (1 \otimes \Delta)R^{-1} \cdot R^{-1}_{23} \]
\[ \overset{(2.10),(2.11)}{=} R_{12} \cdot \Phi^{-1}_{231} R_{13} \Phi_{132} R_{23} \Phi^{-1}_{123} \cdot R^{-1}_{12} \Phi_{213} R^{-1}_{13} \Phi^{-1}_{312} R^{-1}_{23} \]
\[ \overset{(2.12)}{=} \Phi^{-1}_{321} = \Phi^T. \]
The corresponding canonical elements are given, from (2.15), by
\[ \alpha_R = m \cdot (1 \otimes \alpha)(S \otimes 1)R^{-1}, \qquad \beta_R = m \cdot (1 \otimes \beta)(1 \otimes S)R, \]
while the R-matrix induced by twisting with $R$ is, from (2.16), $R^T \cdot R \cdot R^{-1} = R^T$, which is simply the opposite R-matrix. It thus appears that the structure induced by twisting with $R$ corresponds to the opposite quasi-triangular QHSA structure. Note however that $\alpha_R$ and $\beta_R$ are defined with respect to the antipode $S$ rather than the opposite antipode $S^{-1}$.

So now we come to consider the opposite structure of the twisted quasi-triangular QHSA $(H, \Delta_F, \varepsilon, \Phi_F, S, \alpha_F, \beta_F)$ with R-matrix $R_F$. The opposite coproduct is clearly given by
\[ (\Delta_F)^T(a) = F^T \Delta^T(a) (F^T)^{-1}, \]
which obviously corresponds to twisting the opposite coproduct on $H$ with $F^T$. That is, $(\Delta_F)^T(a) = (\Delta^T)_{F^T}(a)$. To see that this is in fact the case for the remaining structure, we note that the opposite coassociator to $\Phi_F$ is
\[ (\Phi_F)^T = (\Phi_F^{-1})_{321} = (T \otimes 1)(1 \otimes T)(T \otimes 1)(\Phi_F^{-1})_{123} \]
\[ \overset{(2.14)}{=} (T \otimes 1)(1 \otimes T)(T \otimes 1) \cdot \{ F_{23} \cdot (1 \otimes \Delta)F \cdot \Phi^{-1}_{123} \cdot (\Delta \otimes 1)F^{-1} \cdot F^{-1}_{12} \} \]
\[ = F^T_{12} \cdot (\Delta^T \otimes 1)F^T \cdot \Phi^{-1}_{321} \cdot (1 \otimes \Delta^T)(F^T)^{-1} \cdot (F^T_{23})^{-1} \]
\[ = F^T_{12} \cdot (\Delta^T \otimes 1)F^T \cdot \Phi^T_{123} \cdot (1 \otimes \Delta^T)(F^T)^{-1} \cdot (F^T)^{-1}_{23} \overset{(2.14)}{=} (\Phi^T)_{F^T}. \]
Similarly for the opposite R-matrix we have
\[ (R_F)^T = F R^T (F^T)^{-1} = (R^T)_{F^T}. \]
It remains to consider the canonical elements (2.15). To this end, writing $F^{-1} = \sum \bar{f}_i \otimes \bar{f}^i$,
\[ (\alpha_F)^T = S^{-1}(\alpha_F) = S^{-1}\Bigl( \sum S(\bar{f}_i) \alpha \bar{f}^i \Bigr) = \sum S^{-1}(\bar{f}^i) S^{-1}(\alpha) \bar{f}_i \times (-1)^{[\bar{f}_i][\bar{f}^i]} \]
\[ = m \cdot (1 \otimes S^{-1}(\alpha))(S^{-1} \otimes 1) \sum (\bar{f}^i \otimes \bar{f}_i) \times (-1)^{[\bar{f}_i][\bar{f}^i]} = m \cdot (1 \otimes \alpha^T)(S^{-1} \otimes 1)(F^T)^{-1} = (\alpha^T)_{F^T}, \]
and similarly $(\beta_F)^T = (\beta^T)_{F^T}$. Here we have used Proposition 1 and the fact that $S^{-1}$ is the antipode under the opposite structure. Thus we have proved
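Proposition 3. $(H, (\Delta_F)^T, \varepsilon, (\Phi_F)^T, S^{-1}, (\alpha_F)^T, (\beta_F)^T) = (H, (\Delta^T)_{F^T}, \varepsilon, (\Phi^T)_{F^T}, S^{-1}, (\alpha^T)_{F^T}, (\beta^T)_{F^T})$. Moreover, if $H$ is quasi-triangular with R-matrix $R$, then $(R_F)^T = (R^T)_{F^T}$.

Now take $H$ to be a normal quasi-triangular Hopf superalgebra and consider a twistor $F(\lambda) \in H \otimes H$ which depends on $\lambda \in H$, where we assume $\lambda$ depends on one or possibly several parameters. Here we assume that $F(\lambda)$ satisfies the shifted cocycle condition (cf. Eq. (2.17))
\[ F_{12}(\lambda) \cdot (\Delta \otimes 1)F(\lambda) = F_{23}(\lambda + h^{(1)}) \cdot (1 \otimes \Delta)F(\lambda), \tag{3.2} \]
where $h^{(1)} = h \otimes 1 \otimes 1$ and $h \in H$ is fixed. We then have the following QHSA structure induced by twisting with $F(\lambda)$:
\[ \Phi(\lambda) \equiv \Phi_{F(\lambda)} = F_{23}(\lambda + h^{(1)}) F_{23}(\lambda)^{-1}, \qquad \Delta_\lambda(a) = F(\lambda)\Delta(a)F(\lambda)^{-1}, \quad \forall a \in H, \]
\[ \alpha_\lambda = m \cdot (S \otimes 1)F(\lambda)^{-1}, \qquad \beta_\lambda = m \cdot (1 \otimes S)F(\lambda), \qquad R(\lambda) = F(\lambda)^T R F(\lambda)^{-1}. \tag{3.3} \]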
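The expression for the coassociator in (3.3) follows in two lines from (2.14) and (3.2). The steps below are a sketch, assuming the underlying Hopf superalgebra has $\Phi = 1^{\otimes 3}$ and writing $F_{12}(\lambda) = F(\lambda) \otimes 1$, $F_{23}(\lambda) = 1 \otimes F(\lambda)$:

```latex
% Assumptions: \Phi = 1^{\otimes 3}; F_{12} = F \otimes 1, F_{23} = 1 \otimes F;
% 1 \otimes \Delta is an algebra homomorphism, so (1 \otimes \Delta)F(\lambda)^{-1}
% inverts (1 \otimes \Delta)F(\lambda).
\begin{align*}
\Phi(\lambda)
 &= F_{12}(\lambda)\,(\Delta \otimes 1)F(\lambda)\cdot(1 \otimes \Delta)F(\lambda)^{-1}\,F_{23}(\lambda)^{-1}
    && \text{by (2.14)} \\
 &= F_{23}(\lambda + h^{(1)})\,(1 \otimes \Delta)F(\lambda)\cdot(1 \otimes \Delta)F(\lambda)^{-1}\,F_{23}(\lambda)^{-1}
    && \text{by (3.2)} \\
 &= F_{23}(\lambda + h^{(1)})\,F_{23}(\lambda)^{-1}.
\end{align*}
```

The same specialisation ($\alpha = \beta = 1$ in (2.15)) gives the stated forms of $\alpha_\lambda$ and $\beta_\lambda$.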
It is straightforward to show that Eqs. (2.10), (2.11) in this case reduce to
\[ (\Delta_\lambda \otimes 1)R(\lambda) = \Phi^{-1}_{231}(\lambda) R_{13}(\lambda) R_{23}(\lambda + h^{(1)}), \qquad (1 \otimes \Delta_\lambda)R(\lambda) = R_{13}(\lambda + h^{(2)}) R_{12}(\lambda) \Phi_{123}(\lambda), \tag{3.4} \]
while the QQYBE (2.12) becomes
\[ R_{12}(\lambda + h^{(3)}) R_{13}(\lambda) R_{23}(\lambda + h^{(1)}) = R_{23}(\lambda) R_{13}(\lambda + h^{(2)}) R_{12}(\lambda). \]
This is the graded dynamical QYBE, of interest in obtaining elliptic solutions to the QYBE.

We can also determine the opposite structure of the above. Recall that $H$ is also a QHSA with the opposite coproduct $\Delta^T_\lambda$ and with the opposite coassociator
\[ \Phi(\lambda)^T = \Phi(\lambda)^{-1}_{321} \overset{(3.3)}{=} F^T_{12}(\lambda) F^T_{12}(\lambda + h^{(3)})^{-1}. \]
It is worth noting, in view of Proposition 3, that this coincides with the QHSA structure induced on the opposite QHSA structure of $H$ by twisting with $F^T(\lambda)$. By applying $(1 \otimes T)(T \otimes 1)(1 \otimes T)$ to the shifted cocycle condition (3.2), it can be shown that $F^T(\lambda)$ satisfies the opposite shifted cocycle condition
\[ F^T_{23}(\lambda)(1 \otimes \Delta^T)F^T(\lambda) = F^T_{12}(\lambda + h^{(3)})(\Delta^T \otimes 1)F^T(\lambda). \]
To complete the opposite QHSA structure, the antipode is $S^{-1}$, while the canonical elements are now given by
\[ \alpha^T_\lambda = S^{-1}(\alpha_\lambda), \qquad \beta^T_\lambda = S^{-1}(\beta_\lambda). \]
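Applying $(T \otimes 1)(1 \otimes T)(T \otimes 1)$ to (3.4) gives the coproduct properties
\[ (\Delta^T_\lambda \otimes 1)R^T(\lambda) = R^T_{13}(\lambda + h^{(2)}) R^T_{23}(\lambda) \Phi_{321}(\lambda), \qquad (1 \otimes \Delta^T_\lambda)R^T(\lambda) = \Phi^{-1}_{132}(\lambda) R^T_{13}(\lambda) R^T_{12}(\lambda + h^{(3)}), \]
which are special cases of (2.10) and (2.11), for the coassociator concerned. Finally, the graded QQYBE satisfied by $R^T(\lambda)$ reduces to
\[ R^T_{12}(\lambda) R^T_{13}(\lambda + h^{(2)}) R^T_{23}(\lambda) = R^T_{23}(\lambda + h^{(1)}) R^T_{13}(\lambda) R^T_{12}(\lambda + h^{(3)}), \]
which we refer to as the opposite graded dynamical QYBE.

4. Drinfeld Twist

This section is concerned with the QHSA structure induced by the Drinfeld twist [3], and gives details of some remarkable results relating to this construction. First it is worth establishing some useful notation. Set
\[ (1 \otimes \Delta)\Delta(a) = \sum a_{(1)} \otimes \Delta(a_{(2)}) = \sum a^R_{(1)} \otimes a^R_{(2)} \otimes a^R_{(3)}, \]
\[ (\Delta \otimes 1)\Delta(a) = \sum \Delta(a_{(1)}) \otimes a_{(2)} = \sum a^L_{(1)} \otimes a^L_{(2)} \otimes a^L_{(3)}. \]
The following result will be used later.

Lemma 1. $\forall a \in H$, we have
\[ \sum X_\nu a \otimes Y_\nu \beta S(Z_\nu) (-1)^{[a][X_\nu]} = \sum a^L_{(1)} X_\nu \otimes a^L_{(2)} Y_\nu \beta S(Z_\nu) S(a^L_{(3)}) \times (-1)^{[X_\nu][a^L_{(2)}]}, \tag{4.1} \]
\[ \sum S(X_\nu) \alpha Y_\nu \otimes a Z_\nu (-1)^{[a][Z_\nu]} = \sum S(a^R_{(1)}) S(X_\nu) \alpha Y_\nu a^R_{(2)} \otimes Z_\nu a^R_{(3)} \times (-1)^{[Z_\nu][a^R_{(2)}]}, \tag{4.2} \]
\[ \sum a \bar{X}_\nu \otimes S(\bar{Y}_\nu) \alpha \bar{Z}_\nu = \sum \bar{X}_\nu a^L_{(1)} \otimes S(a^L_{(2)}) S(\bar{Y}_\nu) \alpha \bar{Z}_\nu a^L_{(3)} \times (-1)^{[\bar{X}_\nu]([a^L_{(1)}] + [a^L_{(2)}])}, \tag{4.3} \]
\[ \sum \bar{X}_\nu \beta S(\bar{Y}_\nu) \otimes \bar{Z}_\nu a = \sum a^R_{(1)} \bar{X}_\nu \beta S(\bar{Y}_\nu) S(a^R_{(2)}) \otimes a^R_{(3)} \bar{Z}_\nu \times (-1)^{[\bar{Z}_\nu]([a^R_{(2)}] + [a^R_{(3)}])}. \tag{4.4} \]
Proof. For (4.1), $\Phi (1 \otimes \Delta)\Delta(a) = (\Delta \otimes 1)\Delta(a) \Phi$ can be rewritten as
\[ \sum X_\nu a^R_{(1)} \otimes Y_\nu a^R_{(2)} \otimes Z_\nu a^R_{(3)} (-1)^{[Z_\nu]([a^R_{(1)}] + [a^R_{(2)}]) + [Y_\nu][a^R_{(1)}]} = \sum a^L_{(1)} X_\nu \otimes a^L_{(2)} Y_\nu \otimes a^L_{(3)} Z_\nu (-1)^{[X_\nu]([a^L_{(2)}] + [a^L_{(3)}]) + [Y_\nu][a^L_{(3)}]}. \]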
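Then applying $(1 \otimes m)(1 \otimes 1 \otimes \beta S)$ to both sides we obtain
\[ \text{l.h.s.} = \sum X_\nu a^R_{(1)} \otimes Y_\nu a^R_{(2)} \beta S(a^R_{(3)}) S(Z_\nu) (-1)^{[Z_\nu]([a^R_{(2)}] + [a^R_{(3)}]) + [a^R_{(1)}][X_\nu]} = \sum X_\nu a_{(1)} \otimes Y_\nu \varepsilon(a_{(2)}) \beta S(Z_\nu) (-1)^{[a_{(1)}][X_\nu]} = \sum X_\nu a \otimes Y_\nu \beta S(Z_\nu) (-1)^{[a][X_\nu]} \]
\[ = \text{r.h.s.} = \sum a^L_{(1)} X_\nu \otimes a^L_{(2)} Y_\nu \beta S(Z_\nu) S(a^L_{(3)}) (-1)^{[X_\nu][a^L_{(2)}]}. \]
This proves (4.1). Parts (4.2), (4.3) and (4.4) are proved similarly and we shall only outline how they are obtained. We can arrive at (4.2) by applying $(m \otimes 1)(S \otimes \alpha \otimes 1)$ to $(\Delta \otimes 1)\Delta(a)\Phi = \Phi(1 \otimes \Delta)\Delta(a)$. Equation (4.3) can be obtained by applying $(1 \otimes m)(1 \otimes S \otimes \alpha)$ to $(1 \otimes \Delta)\Delta(a)\Phi^{-1} = \Phi^{-1}(\Delta \otimes 1)\Delta(a)$. Finally, if we apply $(m \otimes 1)(1 \otimes \beta S \otimes 1)$ to $\Phi^{-1}(\Delta \otimes 1)\Delta(a) = (1 \otimes \Delta)\Delta(a)\Phi^{-1}$ we arrive at (4.4). This completes the proof.

Also, the following equations, which arise from Eq. (2.2), will prove useful throughout:
\[ \Phi \otimes 1 = (\Delta \otimes 1 \otimes 1)\Phi \cdot (1 \otimes 1 \otimes \Delta)\Phi \cdot (1 \otimes \Phi^{-1}) \cdot (1 \otimes \Delta \otimes 1)\Phi^{-1} = \sum \bigl( X^{(1)}_\nu X_\mu \bar{X}_\rho \otimes X^{(2)}_\nu Y_\mu \bar{X}_\sigma \bar{Y}^{(1)}_\rho \otimes Y_\nu Z^{(1)}_\mu \bar{Y}_\sigma \bar{Y}^{(2)}_\rho \otimes Z_\nu Z^{(2)}_\mu \bar{Z}_\sigma \bar{Z}_\rho \bigr) \]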
¯
×(−1)[Xρ ]([Xν
(1)
]+[Xµ ]+[Xν ])+([X¯ σ ]+[Y¯ρ ])([Xν ]+[Zµ ]) (2)
×(−1)[Zµ ][Xν ]+[Xµ ][Xν
(1)
¯ (2)
¯
(1)
(2)
]+[Zν ][Zµ ]+[Y¯ρ ][X¯ σ ]+[Y¯ρ ][Z¯ σ ] (2)
×(−1)([Yσ ]+[Yρ ])([Zν ]+[Zµ ]) , (4.5) 1 ⊗ = (1 ⊗ ⊗ 1) −1 · ( −1 ⊗ 1) · ( ⊗ 1 ⊗ 1) · (1 ⊗ 1 ⊗ ) = (X¯ ν X¯ µ Xσ(1) Xρ ⊗ Y¯ν(1) X¯ µ Xσ(2) Yρ ⊗ Y¯ν(2) Z¯ µ Yσ Zρ(1) ⊗ X¯ ν Zσ Zρ(2) ) (1)
]+[Xρ ])([X¯ µ ]+[X¯ ν ])+[Zρ ][Xσ ]
(2)
]+[Yρ ])([Z¯ ν ]+[Z¯ µ ]+[Y¯ν ])+[Z¯ ν ]([Yσ ]+[Zρ ])+[Xρ ][Xσ ]+[Zσ ][Zρ ]
×(−1)([Xσ ×(−1)([Xσ ¯
(2)
¯
¯
(1)
(2)
(1)
¯ (2)
¯
×(−1)[Xµ ][Xν ]+[Yµ ]([Zν ]+[Yν ])+[Zµ ][Zν ] , (4.6) −1 ⊗ 1 = (1 ⊗ ⊗ 1) · (1 ⊗ ) · (1 ⊗ 1 ⊗ ) −1 · ( ⊗ 1 ⊗ 1) −1 = (Xν X¯ σ X¯ ρ(1) ⊗ Yν(1) Xµ Y¯σ X¯ ρ(2) ⊗ Yν(2) Yµ Z¯ σ(1) Y¯ρ ⊗ Zν Zµ Z¯ σ(2) Z¯ ρ ) ¯
¯ (1) ])[Xν ]+([Y¯σ ]+[X¯ ρ(2) ])([Xµ ]+[Zν ]+[Yν(2) ])
¯
¯ (1) ])([Zν ]+[Zµ ])+[Zν ][Zµ ]+[Xµ ][Yµ(2) ]
×(−1)([Xσ ]+[Xρ ×(−1)([Yρ ]+[Zσ ¯ (1)
1 ⊗ −1
¯ (2) ¯
¯
¯
¯ (2)
×(−1)[Xρ ][Xσ ]+[Xρ ][Zσ ]+[Yρ ][Zσ ] , (4.7) = (1 ⊗ 1 ⊗ ) −1 · ( ⊗ 1 ⊗ 1) −1 · ( ⊗ 1) · (1 ⊗ ⊗ 1) = (X¯ ν X¯ µ(1) Xσ Xρ ⊗ Y¯ν X¯ µ(2) Yσ Yρ(1) ⊗ Z¯ ν(1) Y¯µ Zσ Yρ(2) ⊗ Z¯ ν(2) Z¯ µ Zρ ) ¯
¯
(2) ¯ (2) (1) ¯ (2) ])+[X¯ µ ][Zν ]+[Y¯µ ][Z¯ ν ]+[Zσ ][Yρ ]
×(−1)([Xσ ]+[Xρ ])([Xµ ]+[Xν ]+[Xµ (1)
×(−1)([Yσ ]+[Yρ
¯ (1) ][X¯ ν ]+[Xρ ][Xσ ]
×(−1)[Xµ
(2)
(2)
])([X¯ µ ]+[Z¯ ν ])+([Zσ ]+[Yρ ])([Z¯ µ ]+[Z¯ ν ])
.
(4.8)
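Given a QHSA $H$, we note that $(S \otimes S)\Delta^T$ and $\Delta^T \cdot S^{-1}$ both determine $Z_2$ graded algebra antihomomorphisms. It follows that $\Delta' \equiv (S \otimes S)\Delta^T \cdot S^{-1}$ determines an algebra homomorphism and thus a new coproduct on $H$. That is,
\[ \Delta'(a) = (S \otimes S)\Delta^T(S^{-1}(a)), \quad \forall a \in H. \]
Remark. In the case $H$ is a normal Hopf superalgebra, $\Delta' = \Delta$ (cf. Sweedler [18]).

In what follows, we work towards showing that $\Delta'$ is obtained from $\Delta$ by twisting. Apply $(S \otimes S)\Delta^T \otimes 1$ to Lemma 1, (4.1), to give
\[ \text{l.h.s.} = \sum (S \otimes S)\Delta^T(a) (S \otimes S)\Delta^T(X_\nu) \otimes Y_\nu \beta S(Z_\nu) \]
\[ = \text{r.h.s.} = \sum (S \otimes S)\Delta^T(X_\nu) (S \otimes S)\Delta^T(a^L_{(1)}) \otimes a^L_{(2)} Y_\nu \beta S(Z_\nu) S(a^L_{(3)}) \times (-1)^{[X_\nu]([a^L_{(1)}] + [a^L_{(2)}])}. \]
Now let $\gamma \in H \otimes H$ be an even element (i.e. $[\gamma] = 0$). If we apply $(1^{\otimes 2} \otimes \gamma)(1^{\otimes 2} \otimes \Delta)$ to the above equation, we obtain
\[ \sum (S \otimes S)\Delta^T(a) (S \otimes S)\Delta^T(X_\nu) \otimes \gamma \Delta(Y_\nu \beta S(Z_\nu)) = \sum (S \otimes S)\Delta^T(X_\nu) (S \otimes S)\Delta^T(a^L_{(1)}) \otimes \gamma \Delta(a^L_{(2)}) \Delta(Y_\nu \beta S(Z_\nu)) \Delta(S(a^L_{(3)})) \times (-1)^{[X_\nu]([a^L_{(1)}] + [a^L_{(2)}])}. \]
Then applying $(m \otimes m)(1 \otimes T \otimes 1)$ gives
\[ \sum (S \otimes S)\Delta^T(a) (S \otimes S)\Delta^T(X_\nu) \cdot \gamma \cdot \Delta(Y_\nu \beta S(Z_\nu)) = \sum (S \otimes S)\Delta^T(X_\nu) (S \otimes S)\Delta^T(a^L_{(1)}) \cdot \gamma \cdot \Delta(a^L_{(2)}) \Delta(Y_\nu \beta S(Z_\nu)) \Delta(S(a^L_{(3)})) \times (-1)^{[X_\nu]([a^L_{(1)}] + [a^L_{(2)}])}, \]
so that if $\gamma$ satisfies
\[ \sum (S \otimes S)\Delta^T(a_{(1)}) \cdot \gamma \cdot \Delta(a_{(2)}) = \varepsilon(a)\gamma, \tag{4.9} \]
then
\[ (S \otimes S)\Delta^T(a) \sum (S \otimes S)\Delta^T(X_\nu) \cdot \gamma \cdot \Delta(Y_\nu \beta S(Z_\nu)) = \sum (S \otimes S)\Delta^T(X_\nu) \cdot \varepsilon(a_{(1)})\gamma \cdot \Delta(Y_\nu \beta S(Z_\nu)) \Delta(S(a_{(2)})) (-1)^{[a_{(1)}][X_\nu]} = \sum (S \otimes S)\Delta^T(X_\nu) \cdot \gamma \cdot \Delta(Y_\nu \beta S(Z_\nu)) \Delta(S(a)). \]
This can be rewritten as $(S \otimes S)\Delta^T(a) F_D = F_D \Delta(S(a))$, $\forall a \in H$, where
\[ F_D = \sum (S \otimes S)\Delta^T(X_\nu) \cdot \gamma \cdot \Delta(Y_\nu \beta S(Z_\nu)). \tag{4.10} \]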
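A quick consistency check of (4.9) and (4.10) is available in the ordinary Hopf case. This is only a sketch: it assumes $\Phi = 1^{\otimes 3}$, $\alpha = \beta = 1$, takes $\gamma = 1 \otimes 1$ as a trial solution, and uses the standard fact that the antipode of a Hopf superalgebra is an anti-coalgebra map, $\Delta(S(a)) = (S \otimes S)\Delta^T(a)$. Then (4.9) holds and $F_D$ is trivial, in agreement with the Remark that $\Delta' = \Delta$ in this case:

```latex
% Assumptions: \Phi = 1^{\otimes 3}, \alpha = \beta = 1, trial choice \gamma = 1 \otimes 1,
% and \Delta(S(a)) = (S \otimes S)\Delta^T(a) (antipode of a Hopf superalgebra).
\begin{align*}
\sum (S \otimes S)\Delta^T(a_{(1)}) \cdot \gamma \cdot \Delta(a_{(2)})
  &= \sum \Delta\bigl(S(a_{(1)})\bigr)\,\Delta(a_{(2)})
   = \Delta\Bigl(\sum S(a_{(1)})\,a_{(2)}\Bigr)
   = \varepsilon(a)\,\gamma, \\
F_D &= (S \otimes S)\Delta^T(1)\cdot\gamma\cdot\Delta\bigl(1\cdot\beta\,S(1)\bigr) = 1 \otimes 1 .
\end{align*}
```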
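To find $\gamma \in H \otimes H$ satisfying (4.9), we first note, $\forall a \in H$,
\[ (\Delta \otimes \Delta)\Delta(a) = (\Delta \otimes 1 \otimes 1)(1 \otimes \Delta)\Delta(a) = (\Delta \otimes 1 \otimes 1)\bigl( \Phi^{-1} (\Delta \otimes 1)\Delta(a) \Phi \bigr) \]
\[ = (\Delta \otimes 1 \otimes 1)\Phi^{-1} \cdot ((\Delta \otimes 1)\Delta \otimes 1)\Delta(a) \cdot (\Delta \otimes 1 \otimes 1)\Phi \]
\[ = (\Delta \otimes 1 \otimes 1)\Phi^{-1} \cdot (\Phi \otimes 1) \cdot ((1 \otimes \Delta)\Delta \otimes 1)\Delta(a) \cdot (\Phi^{-1} \otimes 1) \cdot (\Delta \otimes 1 \otimes 1)\Phi \]
\[ = (\Delta \otimes 1 \otimes 1)\Phi^{-1} \cdot (\Phi \otimes 1) \cdot (1 \otimes \Delta \otimes 1)(\Delta \otimes 1)\Delta(a) \cdot (\Phi^{-1} \otimes 1) \cdot (\Delta \otimes 1 \otimes 1)\Phi. \]
We thus arrive at
\[ (\Phi^{-1} \otimes 1) \cdot (\Delta \otimes 1 \otimes 1)\Phi \cdot (\Delta \otimes \Delta)\Delta(a) = (1 \otimes \Delta \otimes 1)(\Delta \otimes 1)\Delta(a) \cdot (\Phi^{-1} \otimes 1) \cdot (\Delta \otimes 1 \otimes 1)\Phi. \tag{4.11} \]
Now write
\[ (\Delta \otimes \Delta)\Delta(a) = \sum \Delta(a_{(1)}) \otimes \Delta(a_{(2)}) = \sum a^L_{(1)} \otimes a^L_{(2)} \otimes a^R_{(1)} \otimes a^R_{(2)}, \]
\[ (1 \otimes \Delta \otimes 1)(\Delta \otimes 1)\Delta(a) = \sum (1 \otimes \Delta \otimes 1)(a^L_{(1)} \otimes a^L_{(2)} \otimes a^L_{(3)}) = \sum a^L_{(1)} \otimes a^L_{(2)(1)} \otimes a^L_{(2)(2)} \otimes a^L_{(3)}. \]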
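Lemma 2.
\[ \gamma = (m \otimes m) \cdot (1 \otimes \alpha \otimes 1 \otimes \alpha)(S \otimes 1 \otimes S \otimes 1) \cdot (1 \otimes T \otimes 1)(T \otimes 1 \otimes 1)(\Phi^{-1} \otimes 1)(\Delta \otimes 1 \otimes 1)\Phi \]
satisfies (4.9). Moreover
\[ \gamma = (m \otimes m) \cdot (1 \otimes \alpha \otimes 1 \otimes \alpha)(S \otimes 1 \otimes S \otimes 1) \cdot (1 \otimes T \otimes 1)(T \otimes 1 \otimes 1)(1 \otimes \Phi)(1 \otimes 1 \otimes \Delta)\Phi^{-1}. \]
Proof. First we set
\[ \sum_i A_i \otimes B_i \otimes C_i \otimes D_i \equiv \sum \bar{X}_\nu X^{(1)}_\mu \otimes \bar{Y}_\nu X^{(2)}_\mu \otimes \bar{Z}_\nu Y_\mu \otimes Z_\mu (-1)^{[X^{(1)}_\mu][\bar{X}_\nu] + [X^{(2)}_\mu][\bar{Z}_\nu]} = (\Phi^{-1} \otimes 1)(\Delta \otimes 1 \otimes 1)\Phi. \]
Note that $[A_i] + [B_i] + [C_i] + [D_i] = 0 \pmod 2$. Now we have, from (4.11),
\[ \sum A_i a^L_{(1)} \otimes B_i a^L_{(2)} \otimes C_i a^R_{(1)} \otimes D_i a^R_{(2)} (-1)^{[a^L_{(1)}][A_i] + [a^L_{(2)}]([C_i] + [D_i]) + [a^R_{(1)}][D_i]} \]
\[ = \sum a^L_{(1)} A_i \otimes a^L_{(2)(1)} B_i \otimes a^L_{(2)(2)} C_i \otimes a^L_{(3)} D_i \times (-1)^{[A_i]([a^L_{(2)}] + [a^L_{(3)}]) + [B_i]([a^L_{(3)}] + [a^L_{(2)(2)}]) + [C_i][a^L_{(3)}]}. \tag{4.12} \]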
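Applying $(m \otimes m)(S \otimes \alpha \otimes S \otimes \alpha)(1 \otimes T \otimes 1)(T \otimes 1 \otimes 1)$ to the above we obtain
\[ \text{l.h.s.} = \sum S(a^L_{(2)}) S(B_i) \alpha C_i a^R_{(1)} \otimes S(a^L_{(1)}) S(A_i) \alpha D_i a^R_{(2)} \times (-1)^{[a^R_{(1)}]([A_i] + [D_i]) + [A_i]([B_i] + [C_i]) + [a^L_{(1)}]([B_i] + [C_i] + [a^L_{(2)}] + [a^R_{(1)}])} \]
\[ = \sum (S \otimes S)(a^L_{(2)} \otimes a^L_{(1)}) (S(B_i) \alpha C_i \otimes S(A_i) \alpha D_i)(a^R_{(1)} \otimes a^R_{(2)}) \times (-1)^{[A_i]([B_i] + [C_i]) + [a^L_{(1)}][a^L_{(2)}]} \]
\[ = \sum (S \otimes S)\Delta^T(a_{(1)}) (S(B_i) \alpha C_i \otimes S(A_i) \alpha D_i) \Delta(a_{(2)}) (-1)^{[A_i]([B_i] + [C_i])} = \sum (S \otimes S)\Delta^T(a_{(1)}) \cdot \gamma \cdot \Delta(a_{(2)}) \]
\[ = \text{r.h.s.} = \sum S(B_i) \varepsilon(a^L_{(2)}) \alpha C_i \otimes S(A_i) S(a^L_{(1)}) \alpha a^L_{(3)} D_i \times (-1)^{[D_i]([a^L_{(1)}] + [a^L_{(3)}]) + [A_i]([B_i] + [C_i])} \]
\[ = \sum S(B_i) \alpha C_i \otimes S(A_i) S(a_{(1)}) \alpha a_{(2)} D_i (-1)^{[D_i]([a_{(1)}] + [a_{(2)}]) + [A_i]([B_i] + [C_i])} = \varepsilon(a) \sum S(B_i) \alpha C_i \otimes S(A_i) \alpha D_i (-1)^{[A_i]([B_i] + [C_i])} = \varepsilon(a)\gamma \]
with $\gamma$ given by (4.12). As to the second part, note that
\[ \gamma = \sum S(\bar{Y}_\nu X^{(2)}_\mu) \alpha \bar{Z}_\nu Y_\mu \otimes S(\bar{X}_\nu X^{(1)}_\mu) \alpha Z_\mu \times (-1)^{[X^{(1)}_\mu][\bar{X}_\nu] + [X^{(2)}_\mu][\bar{Z}_\nu] + ([\bar{X}_\nu] + [X^{(1)}_\mu])([\bar{X}_\nu] + [X^{(2)}_\mu] + [Y_\mu])} \]
\[ = \sum (S \otimes S)\Delta^T(X_\mu) (S(\bar{Y}_\nu) \alpha \bar{Z}_\nu Y_\mu \otimes S(\bar{X}_\nu) \alpha Z_\mu) (-1)^{[\bar{X}_\nu](1 + [Y_\mu])}. \]
From (2.2),
\[ (1 \otimes \Phi)(1 \otimes 1 \otimes \Delta)\Phi^{-1} = (1 \otimes \Delta \otimes 1)\Phi^{-1} \cdot (\Phi^{-1} \otimes 1)(\Delta \otimes 1 \otimes 1)\Phi = \sum \bar{X}_\sigma \bar{X}_\nu X^{(1)}_\mu \otimes \bar{Y}^{(1)}_\sigma \bar{Y}_\nu X^{(2)}_\mu \otimes \bar{Y}^{(2)}_\sigma \bar{Z}_\nu Y_\mu \otimes \bar{Z}_\sigma Z_\mu \]
\[ \times (-1)^{[\bar{Z}_\sigma][Z_\mu] + [\bar{Y}^{(2)}_\sigma]([\bar{Z}_\nu] + [X_\mu]) + [\bar{Y}^{(1)}_\sigma]([\bar{X}_\nu] + [X^{(1)}_\mu]) + [X^{(1)}_\mu][\bar{X}_\nu] + [X^{(2)}_\mu][\bar{Z}_\nu]}. \]
If we then apply $(m \otimes m)(S \otimes \alpha \otimes S \otimes \alpha)(1 \otimes T \otimes 1)(T \otimes 1 \otimes 1)$ to this equation, straightforward calculation reveals
\[ (m \otimes m)(S \otimes \alpha \otimes S \otimes \alpha)(1 \otimes T \otimes 1)(T \otimes 1 \otimes 1)(1 \otimes \Phi)(1 \otimes 1 \otimes \Delta)\Phi^{-1} = \sum (S \otimes S)\Delta^T(X_\mu)(S(\bar{Y}_\nu) \alpha \bar{Z}_\nu Y_\mu \otimes S(\bar{X}_\nu) \alpha Z_\mu)(-1)^{[\bar{X}_\nu](1 + [Y_\mu])} = \gamma. \]
Thus we have shown that $F_D$ defined by (4.10) satisfies
\[ \Delta'(a) F_D = F_D \Delta(a), \quad \forall a \in H. \tag{4.13} \]
It remains to show that $F_D$ is invertible and thus qualifies as a twist. We proceed by constructing $F_D^{-1}$ explicitly.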
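Note. From the definition of $\gamma$, it is easily seen that
\[ (1 \otimes \varepsilon)\gamma = \alpha \otimes \varepsilon(\alpha), \qquad (\varepsilon \otimes 1)\gamma = \varepsilon(\alpha) \otimes \alpha, \]
so that
\[ (1 \otimes \varepsilon)F_D = (\varepsilon \otimes 1)F_D = \varepsilon(\alpha) \sum S(X_\nu) \alpha Y_\nu \beta S(Z_\nu) = \varepsilon(\alpha). \]
It then becomes clear, since $\varepsilon(\alpha)\varepsilon(\beta) = 1$, that strictly speaking $\varepsilon(\beta)F_D$ qualifies as a twist. This corresponds to a non-zero scalar multiple of $F_D$, which is not important below.

Now let $\bar{\gamma} \in H \otimes H$ be an even element. Apply $(1^{\otimes 2} \otimes \bar{\gamma})(\Delta \otimes \Delta')$ to Lemma 1, (4.3), to give
\[ \text{l.h.s.} = \sum \Delta(a)\Delta(\bar{X}_\nu) \otimes \bar{\gamma} \Delta'(S(\bar{Y}_\nu) \alpha \bar{Z}_\nu) \]
\[ = \text{r.h.s.} = \sum \Delta(\bar{X}_\nu a^L_{(1)}) \otimes \bar{\gamma} \Delta'(S(a^L_{(2)}) S(\bar{Y}_\nu) \alpha \bar{Z}_\nu) \Delta'(a^L_{(3)}) (-1)^{[\bar{X}_\nu]([a^L_{(1)}] + [a^L_{(2)}])} \]
\[ = \sum \Delta(\bar{X}_\nu)\Delta(a^L_{(1)}) \otimes \bar{\gamma} (S \otimes S)\Delta^T(a^L_{(2)}) \Delta'(S(\bar{Y}_\nu) \alpha \bar{Z}_\nu) \Delta'(a^L_{(3)}) (-1)^{[\bar{X}_\nu]([a^L_{(1)}] + [a^L_{(2)}])}. \]
On applying $(m \otimes m)(1 \otimes T \otimes 1)$, we obtain
\[ \Delta(a) \sum \Delta(\bar{X}_\nu) \cdot \bar{\gamma} \cdot \Delta'(S(\bar{Y}_\nu) \alpha \bar{Z}_\nu) = \sum \Delta(\bar{X}_\nu) \Delta(a^L_{(1)}) \bar{\gamma} (S \otimes S)\Delta^T(a^L_{(2)}) \Delta'(S(\bar{Y}_\nu) \alpha \bar{Z}_\nu) \Delta'(a^L_{(3)}) (-1)^{[\bar{X}_\nu]([a^L_{(1)}] + [a^L_{(2)}])}. \]
If $\bar{\gamma}$ satisfies
\[ \sum \Delta(a_{(1)}) \cdot \bar{\gamma} \cdot (S \otimes S)\Delta^T(a_{(2)}) = \varepsilon(a)\bar{\gamma}, \quad \forall a \in H, \tag{4.14} \]
then
\[ F_D^{-1} \Delta'(a) = \Delta(a) F_D^{-1}, \quad \forall a \in H, \tag{4.15} \]
where
\[ F_D^{-1} = \sum \Delta(\bar{X}_\nu) \cdot \bar{\gamma} \cdot \Delta'(S(\bar{Y}_\nu) \alpha \bar{Z}_\nu). \tag{4.16} \]
To explicitly construct $\bar{\gamma} \in H \otimes H$ satisfying (4.14), we note
\[ (\Delta \otimes \Delta)\Delta(a) \cdot (\Delta \otimes 1 \otimes 1)\Phi^{-1} \cdot (\Phi \otimes 1) = (\Delta \otimes 1 \otimes 1)\Phi^{-1} \cdot (\Phi \otimes 1) \cdot (1 \otimes \Delta \otimes 1)(\Delta \otimes 1)\Delta(a). \tag{4.17} \]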
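Lemma 3.
\[ \bar{\gamma} = (m \otimes m) \cdot (1 \otimes \beta S \otimes 1 \otimes \beta S) \cdot (1 \otimes T \otimes 1)(1 \otimes 1 \otimes T)(\Delta \otimes 1 \otimes 1)\Phi^{-1} \cdot (\Phi \otimes 1) \]
satisfies (4.14). Moreover,
\[ \bar{\gamma} = (m \otimes m) \cdot (1 \otimes \beta S \otimes 1 \otimes \beta S) \cdot (1 \otimes T \otimes 1)(1 \otimes 1 \otimes T)(1 \otimes 1 \otimes \Delta)\Phi \cdot (1 \otimes \Phi^{-1}). \]
Proof. The proof is very similar to that of Lemma 2. We obtain the first part by applying $(m \otimes m)(1 \otimes \beta S \otimes 1 \otimes \beta S)(1 \otimes T \otimes 1)(1 \otimes 1 \otimes T)$ to (4.17). The second part is obtained by noting that $\bar{\gamma}$ can be written as
\[ \bar{\gamma} = \sum \Delta(\bar{X}_\nu) \cdot (X_\mu \beta S(\bar{Z}_\nu) \otimes Y_\mu \beta S(\bar{Y}_\nu Z_\mu)) (-1)^{[\bar{Z}_\nu]([Y_\mu] + [\bar{Y}_\nu]) + [\bar{X}_\nu][Z_\mu]}, \]
then applying $(m \otimes m)(1 \otimes \beta S \otimes 1 \otimes \beta S)(1 \otimes T \otimes 1)(1 \otimes 1 \otimes T)$ to
\[ (1 \otimes 1 \otimes \Delta)\Phi \cdot (1 \otimes \Phi^{-1}) = (\Delta \otimes 1 \otimes 1)\Phi^{-1} \cdot (\Phi \otimes 1) \cdot (1 \otimes \Delta \otimes 1)\Phi, \]
which is a restatement of (2.2). This proves the second part.

It remains to show that $F_D^{-1}$ is indeed the inverse of $F_D$. To this end, the following result is useful.

Lemma 4.
\[ F_D \Delta(\alpha) = \gamma, \qquad \Delta(\beta) F_D^{-1} = \bar{\gamma}. \]
Proof. Note that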
FD ⊗ 1 = (m(1 ⊗ m) ⊗ 1) · ((S ⊗ S)T ⊗ γ ⊗ ⊗ 1) · (1 ⊗ 1 ⊗ βS ⊗ 1) · ( ⊗ 1) (4.5) = (S ⊗ S){T (Xµ )T (X¯ ρ )}(Y¯ρ ) · γ · (Yµ X¯ σ βS(Zµ(1) Y¯σ )) · (S(Yν ))(Xν ) (2)
¯ ¯ ¯ ⊗ Zν Zµ(2) Z¯ σ Z¯ ρ (−1)[Xρ ][Xµ ]+[Xσ ][Zµ ]+[Yσ ][Zµ
(1) ]+[Xν ]([Z¯ σ ]+[X¯ ρ ]+[Zµ ])
Now applying 1 ⊗ 1 ⊗ S to both sides, this reduces to FD ⊗ 1 = (S ⊗ S)T (Xµ ) · γ · (Yµ X¯ σ βS(Zµ(1) Y¯σ )) ¯
(2)
¯
⊗ S(Zµ(2) Z¯ σ )(−1)[Xσ ][Zµ ]+[Yσ ][Zµ ] . Further, applying (1 ⊗ 1 ⊗ )(1 ⊗ 1 ⊗ S −1 ) to both sides gives FD ⊗ 1 ⊗ 1 = (S ⊗ S)T (Xµ ) · γ · (Yµ X¯ σ βS(Zµ(1) Y¯σ )) ¯
¯
(2)
⊗ (Zµ(2) Z¯ σ )(−1)[Xσ ][Zµ ]+[Yσ ][Zµ ] .
.
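Now multiply by $\Delta(\alpha) \otimes 1 \otimes 1$ from the right and apply $(m \otimes m)(1 \otimes T \otimes 1)$ so that
\[ F_D \Delta(\alpha) = \sum (S \otimes S)\Delta^T(X_\mu) \cdot \gamma \cdot \Delta\bigl( Y_\mu \bar{X}_\sigma \beta S(\bar{Y}_\sigma) S(Z^{(1)}_\mu) \alpha Z^{(2)}_\mu \bar{Z}_\sigma \bigr) (-1)^{[\bar{Y}_\sigma]([Z^{(1)}_\mu] + [Z^{(2)}_\mu]) + [\bar{X}_\sigma][Z_\mu]} \]
\[ = \sum (S \otimes S)\Delta^T(X_\mu) \cdot \gamma \cdot \Delta\bigl( Y_\mu \bar{X}_\sigma \beta S(\bar{Y}_\sigma) \varepsilon(Z_\mu) \alpha \bar{Z}_\sigma \bigr) \]
\[ = \sum (S \otimes S)\Delta^T(X_\mu) \cdot \gamma \cdot \Delta(Y_\mu \varepsilon(Z_\mu)) \Delta\Bigl( \sum \bar{X}_\sigma \beta S(\bar{Y}_\sigma) \alpha \bar{Z}_\sigma \Bigr) \]
\[ = \sum (S \otimes S)\Delta^T(X_\mu) \cdot \gamma \cdot \Delta(Y_\mu \varepsilon(Z_\mu)) = \gamma. \]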
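The second part $\Delta(\beta) F_D^{-1} = \bar{\gamma}$ is proved similarly with the help of (4.7) and (4.15).

Now set
\[ \sum \bar{A}_i \otimes \bar{B}_i \otimes \bar{C}_i \otimes \bar{D}_i \equiv \sum \bar{X}^{(1)}_\nu X_\mu \otimes \bar{X}^{(2)}_\nu Y_\mu \otimes \bar{Y}_\nu Z_\mu \otimes \bar{Z}_\nu (-1)^{[Z_\mu][\bar{Y}_\nu] + [\bar{X}^{(2)}_\nu][X_\mu]} = (\Delta \otimes 1 \otimes 1)\Phi^{-1} \cdot (\Phi \otimes 1). \]
We compute $F_D^{-1} \cdot F_D$:
\[ F_D^{-1} \cdot F_D = \sum \Delta(\bar{X}_\sigma \beta S(\bar{Y}_\sigma) \alpha \bar{Z}_\sigma) F_D^{-1} \cdot F_D \]
\[ \overset{(4.15)}{=} \sum \Delta(\bar{X}_\sigma \beta S(\bar{Y}_\sigma)) F_D^{-1} \Delta'(\alpha \bar{Z}_\sigma) F_D \]
\[ \overset{(4.13)}{=} \sum \Delta(\bar{X}_\sigma) \Delta(\beta) \Delta(S(\bar{Y}_\sigma)) F_D^{-1} \cdot F_D \Delta(\alpha) \Delta(\bar{Z}_\sigma) \]
\[ \overset{(4.15)}{=} \sum \Delta(\bar{X}_\sigma) \Delta(\beta) F_D^{-1} \Delta'(S(\bar{Y}_\sigma)) \cdot F_D \Delta(\alpha) \Delta(\bar{Z}_\sigma). \]
Using Lemma 4 this reduces to
\[ F_D^{-1} \cdot F_D = \sum (\bar{X}^{(1)}_\sigma \bar{A}_i \otimes \bar{X}^{(2)}_\sigma \bar{B}_i)(\beta \otimes \beta)(S \otimes S) \cdot T(A_j \bar{Y}^{(1)}_\sigma \bar{C}_i \otimes B_j \bar{Y}^{(2)}_\sigma \bar{D}_i) \cdot (\alpha \otimes \alpha)(C_j \bar{Z}^{(1)}_\sigma \otimes D_j \bar{Z}^{(2)}_\sigma) \cdot (-1)^\xi, \]
where
\[ \xi = [B_j]([\bar{D}_i] + [\bar{Y}_\sigma]) + [\bar{Y}_\sigma]([A_j] + [\bar{C}_i] + [\bar{D}_i]) + [A_j]([\bar{C}_i] + [\bar{D}_i]) + [\bar{A}_i][\bar{X}^{(2)}_\sigma] + [\bar{C}_i][\bar{Y}^{(2)}_\sigma] + [D_j][\bar{Z}^{(1)}_\sigma] + [B_j][\bar{Y}^{(1)}_\sigma]. \]
On the other hand, setting
\[ r \equiv \sum (1^{\otimes 2} \otimes A_j \otimes B_j \otimes C_j \otimes D_j) \cdot (\Delta \otimes \Delta \otimes \Delta)\Phi^{-1} \cdot (\bar{A}_i \otimes \bar{B}_i \otimes \bar{C}_i \otimes \bar{D}_i \otimes 1^{\otimes 2}) \]
\[ = \sum \bar{X}^{(1)}_\sigma \bar{A}_i \otimes \bar{X}^{(2)}_\sigma \bar{B}_i \otimes A_j \bar{Y}^{(1)}_\sigma \bar{C}_i \otimes B_j \bar{Y}^{(2)}_\sigma \bar{D}_i \otimes C_j \bar{Z}^{(1)}_\sigma \otimes D_j \bar{Z}^{(2)}_\sigma (-1)^\xi, \]
implies
\[ F_D^{-1} \cdot F_D = \varphi(r) \]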
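with $\varphi : H^{\otimes 6} \to H^{\otimes 2}$ defined by
\[ \varphi(a_1 \otimes a_2 \otimes a_3 \otimes a_4 \otimes a_5 \otimes a_6) = (a_1 \otimes a_2)(\beta \otimes \beta)(S \otimes S) \cdot T(a_3 \otimes a_4) \cdot (\alpha \otimes \alpha)(a_5 \otimes a_6). \]
Remark. The two equivalent expressions for $\bar{\gamma}$ ($\gamma$) imply that we can choose either
\[ \sum \bar{A}_i \otimes \bar{B}_i \otimes \bar{C}_i \otimes \bar{D}_i = (\Delta \otimes 1 \otimes 1)\Phi^{-1} \cdot (\Phi \otimes 1) \quad \text{or} \quad (1 \otimes 1 \otimes \Delta)\Phi \cdot (1 \otimes \Phi^{-1}), \]
\[ \sum A_j \otimes B_j \otimes C_j \otimes D_j = (1 \otimes \Phi) \cdot (1 \otimes 1 \otimes \Delta)\Phi^{-1} \quad \text{or} \quad (\Phi^{-1} \otimes 1) \cdot (\Delta \otimes 1 \otimes 1)\Phi. \]
Similarly, we can show $F_D \cdot F_D^{-1} = \bar{\varphi}(\bar{r})$, where
\[ \bar{r} = \sum (A_j \otimes B_j \otimes C_j \otimes D_j \otimes 1^{\otimes 2}) \cdot (\Delta \otimes \Delta \otimes \Delta)\Phi \cdot (1^{\otimes 2} \otimes \bar{A}_i \otimes \bar{B}_i \otimes \bar{C}_i \otimes \bar{D}_i) \]
with $\bar{\varphi} : H^{\otimes 6} \to H^{\otimes 2}$ defined by
\[ \bar{\varphi}(a_1 \otimes a_2 \otimes a_3 \otimes a_4 \otimes a_5 \otimes a_6) = (S \otimes S) \cdot T(a_1 \otimes a_2) \cdot (\alpha \otimes \alpha)(a_3 \otimes a_4) \cdot (\beta \otimes \beta)(S \otimes S) \cdot T(a_5 \otimes a_6). \]
Before proceeding, it is worth noting the following properties of $\varphi$ and $\bar{\varphi}$, which follow immediately from their definition:
\[ \varphi(h \Delta_{23}(a)) = \varepsilon(a)\varphi(h) = \varphi(\Delta_{45}(a) h), \tag{4.18} \]
\[ \varphi(h \Delta_{14}(a)) = \varepsilon(a)\varphi(h) = \varphi(\Delta_{36}(a) h), \tag{4.19} \]
\[ \bar{\varphi}(\Delta_{23}(a) h) = \varepsilon(a)\bar{\varphi}(h) = \bar{\varphi}(h \Delta_{45}(a)), \tag{4.20} \]
\[ \bar{\varphi}(\Delta_{14}(a) h) = \varepsilon(a)\bar{\varphi}(h) = \bar{\varphi}(h \Delta_{36}(a)), \tag{4.21} \]
$\forall a \in H$, $h \in H^{\otimes 6}$, where we have used the notation $\Delta_{14}(a) = \sum a_{(1)} \otimes 1 \otimes 1 \otimes a_{(2)} \otimes 1 \otimes 1$ (i.e. $\Delta(a)$ acting in the first and fourth components of the tensor product), etc. Now we choose the following expressions for $r$ and $\bar{r}$:
\[ r = (1^{\otimes 2} \otimes (1 \otimes \Phi)(1 \otimes 1 \otimes \Delta)\Phi^{-1}) \cdot (\Delta \otimes \Delta \otimes \Delta)\Phi^{-1} \cdot ((\Delta \otimes 1 \otimes 1)\Phi^{-1} \cdot (\Phi \otimes 1) \otimes 1^{\otimes 2}), \]
\[ \bar{r} = ((\Phi^{-1} \otimes 1)(\Delta \otimes 1 \otimes 1)\Phi \otimes 1^{\otimes 2}) \cdot (\Delta \otimes \Delta \otimes \Delta)\Phi \cdot (1^{\otimes 2} \otimes (1 \otimes 1 \otimes \Delta)\Phi \cdot (1 \otimes \Phi^{-1})), \]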
which implies r = (1⊗3 ⊗ ) · ( ⊗ 1⊗2 ⊗ ) × {(1 ⊗ −1 ) · (1 ⊗ ⊗ 1) −1 · ( −1 ⊗ 1)} · ( ⊗ 1⊗3 ) (2.2)
= (1⊗3 ⊗ )( ⊗ 1 ⊗ (1 ⊗ )) −1 · (( ⊗ 1) ⊗ 1 ⊗ ) −1 · ( ⊗ 1⊗3 )
(2.1)
= ( ⊗ 1 ⊗ ( ⊗ 1)) −1 · (1⊗3 ⊗ ) · ( ⊗ 1⊗3 )((1 ⊗ ) ⊗ 1 ⊗ ) −1 = 45 (Z¯ ν(1) )((X¯ ν ) ⊗ Y¯ν ⊗ 1⊗2 ⊗ Z¯ ν(2) )( ⊗ 1⊗3 ) (2) ¯ ¯ (1) ][Z¯ ν ]+[X¯ µ ][Xµ ]
· (1⊗3 ⊗ )(X¯ µ(1) ⊗ 1⊗2 ⊗ Y¯µ ⊗ (Z¯ µ ))23 (X¯ µ(2) )(−1)[Zν
.
Equation (4.18) implies ϕ(r) = ϕ(s), where s= ((X¯ ν ) ⊗ Y¯ν ⊗ 1⊗2 ⊗ Z¯ ν )( ⊗ 1⊗3 )(1⊗3 ⊗ )(X¯ µ ⊗ 1⊗2 ⊗ Y¯µ ⊗ (Z¯ µ )). Using (2.2), and noting that ⊗3 ⊗ (1 ⊗ T )(T ⊗ 1))(1 ⊗ −1 ⊗ 1⊗2 ), −1 236 = (1 ⊗3 ⊗2 −1 ⊗ −1 ⊗ 1), 145 = ((T ⊗ 1)(1 ⊗ T ) ⊗ 1 )(1
the expression for s reduces to s= 36 (Zµ )45 (Y¯σ ) · (Xµ ⊗ Yµ ⊗ 1⊗4 ) ⊗4 ¯ · −1 ⊗ Z¯ ν )(X¯ σ ⊗ 1⊗4 ⊗ Z¯ σ ) 236 · (Xν ⊗ 1 · −1 · (1⊗4 ⊗ Yρ ⊗ Zρ )23 (Y¯ν ) 145
¯
¯
¯
¯
· 14 (Xρ )(−1)[Yσ ]([Zµ ]+[Yν ]+[Xσ ])+[Yν ]([Xρ ]+[Zν ])+[Zµ ]+[Xρ ] . Equations (4.18) and (4.19) then imply ϕ(s) = ϕ(t), where −1 t = −1 236 · 145 ¯ ¯ = X¯ µ ⊗ X¯ ν ⊗ Y¯ν ⊗ Y¯µ ⊗ Z¯ µ ⊗ Z¯ ν (−1)[Zν ][Zµ ] ,
which then implies ϕ(r) = ϕ(t) = (X¯ µ ⊗ X¯ ν )(β ⊗ β)(S(Y¯µ ) ¯ ¯ ¯ ¯ ⊗ S(Y¯ν ))(α ⊗ α)(Z¯ µ ⊗ Z¯ ν )(−1)[Zν ][Zµ ]+[Yν ][Yµ ] = X¯ µ βS(Y¯µ )α Z¯ µ ⊗ X¯ ν βS(Y¯ν )α Z¯ ν = 1 ⊗ 1.
Similarly, with the following choice of r¯ , r¯ = ( −1 ⊗ 1⊗3 ) · ( ⊗ 1⊗2 ⊗ )(( ⊗ 1) · (1 ⊗ ⊗ 1) · (1 ⊗ )) · (1⊗3 ⊗ −1 ), and using (2.2) and (2.1), we obtain ϕ(¯ ¯ r ) = ϕ(¯ ¯ s ), with s¯ defined by s¯ =
(Xµ ⊗ 1⊗2 ⊗ Yµ ⊗ (Zµ ))(1⊗3 ⊗ −1 )
· ( −1 ⊗ 1⊗3 )((Xν ) ⊗ Yρ ⊗ 1⊗2 ⊗ Zν ) which reduces to s¯ =
14 (X¯ ν )23 (Yρ )(1⊗4 ⊗ Y¯ν ⊗ Z¯ ν )
· 145 · (Xµ ⊗ 1⊗4 ⊗ Zµ )(Xρ ⊗ 1⊗4 ⊗ Zρ ) · 236 · (X¯ σ ⊗ Y¯σ ⊗ 1⊗4 )45 (Yµ ) ¯ ¯ · 36 (Z¯ σ )(−1)[Xµ ][Zµ ]+[Yρ ]([Xρ ]+[Xν ]+[Yµ ])+[Yµ ][Zσ ] .
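This implies that
\[ \bar{\varphi}(\bar{r}) = \bar{\varphi}(\bar{s}) = \bar{\varphi}(\bar{t}), \]
where
\[ \bar{t} = \Phi_{145} \cdot \Phi_{236} = t^{-1}, \]
from which it follows that
\[ \bar{\varphi}(\bar{r}) = \bar{\varphi}(\bar{t}) = 1 \otimes 1, \]
so that $F_D^{-1}$ is indeed the inverse of $F_D$. Summarising the above results, we have proved

Theorem 2. $\Delta'$ is obtained from $\Delta$ by twisting with $F_D$. That is, $\Delta'(a) = F_D \Delta(a) F_D^{-1}$, $\forall a \in H$, with $F_D$ as in (4.10) and $\gamma$ as in Lemma 2. Moreover $F_D^{-1}$ is given explicitly by (4.16) with $\bar{\gamma}$ as in Lemma 3.

Remark. It is actually $\bar{F}_D = \varepsilon(\beta)F_D$ which qualifies as a twist. Thus we have $\Delta'(a) = \bar{F}_D \Delta(a) \bar{F}_D^{-1}$, $\forall a \in H$, with $\bar{F}_D^{-1} = \varepsilon(\alpha)F_D^{-1}$. Thus $H$ is a QHSA with coproduct $\Delta'$ under the twisted structure induced by $\bar{F}_D$.

The following gives alternative expressions for $F_D$ and $F_D^{-1}$ (the proof is straightforward).

Lemma 5.
\[ F_D = \sum \Delta'(\bar{X}_\nu \beta S(\bar{Y}_\nu)) \cdot \gamma \cdot \Delta(\bar{Z}_\nu), \qquad F_D^{-1} = \sum \Delta(S(X_\nu) \alpha Y_\nu) \cdot \bar{\gamma} \cdot (S \otimes S)\Delta^T(Z_\nu). \]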
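5. QHSA Structure Induced by $\Delta'$

In this section we give the full QHSA induced by $\Delta'$.

Proposition 4. $H$ is a QHSA with coproduct, coassociator and canonical elements given respectively by
\[ \Delta', \qquad \Phi' \equiv (S \otimes S \otimes S)\Phi_{321}, \qquad \alpha' = S(\beta), \qquad \beta' = S(\alpha). \]
Proof. First we note that $\Phi' = (S \otimes S \otimes S)(\Phi^T)^{-1}$, $\Phi^T = \Phi^{-1}_{321}$. $\Phi^T$ is the coassociator associated with the opposite QHSA structure, and obeys
\[ (1 \otimes \Delta^T)\Delta^T(a)(\Phi^T)^{-1} = (\Phi^T)^{-1}(\Delta^T \otimes 1)\Delta^T(a). \]
Applying $S \otimes S \otimes S$ to both sides of this expression yields
\[ \Phi' \cdot \sum S(a_{(2)}) \otimes (S \otimes S)\Delta^T(a_{(1)}) (-1)^{[a_{(1)}][a_{(2)}]} = \Bigl( \sum (S \otimes S)\Delta^T(a_{(2)}) \otimes S(a_{(1)}) (-1)^{[a_{(1)}][a_{(2)}]} \Bigr) \cdot \Phi', \]
which reduces to
\[ \Phi' \cdot (1 \otimes \Delta')(S \otimes S)\Delta^T(a) = (\Delta' \otimes 1)(S \otimes S)\Delta^T(a) \cdot \Phi' \]
or
\[ (1 \otimes \Delta')\Delta'(a) = (\Phi')^{-1}(\Delta' \otimes 1)\Delta'(a)\Phi', \quad \forall a \in H. \]
Next, from
\[ (\Delta^T \otimes 1 \otimes 1)\Phi^T \cdot (1 \otimes 1 \otimes \Delta^T)\Phi^T = (\Phi^T \otimes 1) \cdot (1 \otimes \Delta^T \otimes 1)\Phi^T \cdot (1 \otimes \Phi^T) \]
we take the inverse
\[ (1 \otimes 1 \otimes \Delta^T)(\Phi^T)^{-1} \cdot (\Delta^T \otimes 1 \otimes 1)(\Phi^T)^{-1} = (1 \otimes (\Phi^T)^{-1}) \cdot (1 \otimes \Delta^T \otimes 1)(\Phi^T)^{-1} \cdot ((\Phi^T)^{-1} \otimes 1) \]
and then apply $S \otimes S \otimes S \otimes S$ to both sides:
\[ \text{l.h.s.} = ((S \otimes S)\Delta^T \cdot S^{-1} \otimes 1 \otimes 1)(S \otimes S \otimes S)(\Phi^T)^{-1} \cdot (1 \otimes 1 \otimes (S \otimes S)\Delta^T \cdot S^{-1})(S \otimes S \otimes S)(\Phi^T)^{-1} = (\Delta' \otimes 1 \otimes 1)\Phi' \cdot (1 \otimes 1 \otimes \Delta')\Phi' \]
\[ = \text{r.h.s.} = (\Phi' \otimes 1)(1 \otimes (S \otimes S)\Delta^T \cdot S^{-1} \otimes 1)(S \otimes S \otimes S)(\Phi^T)^{-1} \cdot (1 \otimes \Phi') = (\Phi' \otimes 1) \cdot (1 \otimes \Delta' \otimes 1)\Phi' \cdot (1 \otimes \Phi'). \]
Thirdly, starting from $(1 \otimes \varepsilon \otimes 1)\Phi^T = 1$ and applying $S \otimes S \otimes S$ to both sides gives $(1 \otimes \varepsilon \otimes 1)\Phi' = 1$.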
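As to the canonical elements $\alpha'$ and $\beta'$, writing $\bar{a} = S^{-1}(a)$,
\[ m \cdot (1 \otimes \alpha')(S \otimes 1)\Delta'(a) = m \cdot (1 \otimes S(\beta))(S \otimes 1)(S \otimes S)\Delta^T(S^{-1}(a)) = m \cdot (1 \otimes S(\beta))(S \otimes 1)(S \otimes S) \sum \bar{a}_{(2)} \otimes \bar{a}_{(1)} (-1)^{[\bar{a}_{(2)}][\bar{a}_{(1)}]} \]
\[ = \sum S^2(\bar{a}_{(2)}) S(\beta) S(\bar{a}_{(1)}) (-1)^{[\bar{a}_{(2)}][\bar{a}_{(1)}]} = S\Bigl( \sum \bar{a}_{(1)} \beta S(\bar{a}_{(2)}) \Bigr) = \varepsilon(\bar{a}) S(\beta) = \varepsilon(S^{-1}(a)) S(\beta) = \varepsilon(a)\alpha' \]
and similarly
\[ m \cdot (1 \otimes \beta')(1 \otimes S)\Delta'(a) = \varepsilon(a)\beta'. \]
Finally,
\[ m(m \otimes 1) \cdot (1 \otimes \beta' \otimes \alpha')(1 \otimes S \otimes 1)(\Phi')^{-1} = m(m \otimes 1) \cdot (1 \otimes S(\alpha) \otimes S(\beta))(1 \otimes S \otimes 1)(S \otimes S \otimes S)\Phi^{-1}_{321} \]
\[ = \sum S(\bar{Z}_\nu) S(\alpha) S^2(\bar{Y}_\nu) S(\beta) S(\bar{X}_\nu) (-1)^{[\bar{Z}_\nu] + [\bar{X}_\nu][\bar{Y}_\nu]} = S\Bigl( \sum \bar{X}_\nu \beta S(\bar{Y}_\nu) \alpha \bar{Z}_\nu \Bigr) = S(1) = 1, \]
and similarly
\[ m(m \otimes 1) \cdot (S \otimes 1 \otimes 1)(1 \otimes \alpha' \otimes \beta')(1 \otimes 1 \otimes S)\Phi' = 1. \]
This proves that $H$ is a QHSA with the structure given.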
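5.1. Connection with the Drinfeld twist. Our aim is to show that the twisted structure induced by $F_D$ coincides precisely with the QHSA structure of Proposition 4. We have already shown in Theorem 2 that $\Delta' = \Delta_{F_D}$, so it remains to show that $\Phi' = \Phi_{F_D}$, while $\alpha'$ and $\beta'$ are equivalent to $\alpha_{F_D}$ and $\beta_{F_D}$ respectively. For the coassociator, it remains to prove
\[ \Phi' = (S \otimes S \otimes S)\Phi_{321} = \Phi_{F_D} = (F_D \otimes 1)(\Delta \otimes 1)F_D \cdot \Phi \cdot (1 \otimes \Delta)F_D^{-1} \cdot (1 \otimes F_D^{-1}), \]
or
\[ \Phi' \cdot (1 \otimes F_D)(1 \otimes \Delta)F_D = (F_D \otimes 1)(\Delta \otimes 1)F_D \cdot \Phi. \tag{5.1} \]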
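To this end,
\[ (1 \otimes F_D)(1 \otimes \Delta)F_D \overset{(4.13)}{=} (1 \otimes \Delta')F_D \cdot (1 \otimes F_D) \]
\[ \overset{(4.10)}{=} \sum (1 \otimes \Delta')\Delta'(S(X_\nu)) \cdot (1 \otimes F_D)(1 \otimes F_D^{-1}) \cdot (1 \otimes \Delta')\gamma \cdot (1 \otimes F_D)(1 \otimes F_D^{-1}) \cdot (1 \otimes \Delta')\Delta(Y_\nu \beta S(Z_\nu))(1 \otimes F_D) \]
\[ \overset{(2.1)}{=} \sum (1 \otimes \Delta')\Delta'(S(X_\nu)) \cdot (1 \otimes F_D) \cdot (1 \otimes \Delta)\gamma \cdot \Phi^{-1} (\Delta \otimes 1)\Delta(Y_\nu \beta S(Z_\nu)) \cdot \Phi. \]
Now multiplying both sides by $\Phi'$ on the left gives
\[ \Phi' \cdot (1 \otimes F_D)(1 \otimes \Delta)F_D = \sum (\Delta' \otimes 1)\Delta'(S(X_\nu)) \cdot \Phi' \cdot (1 \otimes F_D) \cdot (1 \otimes \Delta)\gamma \cdot \Phi^{-1} (\Delta \otimes 1)\Delta(Y_\nu \beta S(Z_\nu)) \cdot \Phi, \]
while we can likewise show
\[ (F_D \otimes 1)(\Delta \otimes 1)F_D \cdot \Phi = (\Delta' \otimes 1)F_D \cdot (F_D \otimes 1) \cdot \Phi = \sum (\Delta' \otimes 1)\Delta'(S(X_\nu)) \cdot \Phi' (\Phi')^{-1} \cdot (F_D \otimes 1)(\Delta \otimes 1)\gamma \cdot \Phi \cdot \Phi^{-1} \cdot (\Delta \otimes 1)\Delta(Y_\nu \beta S(Z_\nu)) \cdot \Phi. \]
So to prove (5.1), it suffices to prove
\[ \Phi' \cdot (1 \otimes F_D)(1 \otimes \Delta)\gamma \cdot \Phi^{-1} = (F_D \otimes 1)(\Delta \otimes 1)\gamma, \]
or

Lemma 6.
\[ (\Phi')^{-1} \cdot (F_D \otimes 1)(\Delta \otimes 1)\gamma = (1 \otimes F_D)(1 \otimes \Delta)\gamma \cdot \Phi^{-1}. \tag{5.2} \]
Proof. Since
\[ \gamma = \sum S(B_i) \alpha C_i \otimes S(A_i) \alpha D_i (-1)^{[A_i]([B_i] + [C_i])}, \]
we have
\[ (F_D \otimes 1)(\Delta \otimes 1)\gamma = \sum F_D \Delta(S(B_i)) \Delta(\alpha) \Delta(C_i) \otimes S(A_i) \alpha D_i (-1)^{[A_i]([B_i] + [C_i])} \]
\[ \overset{(4.13)}{=} \sum (S \otimes S)\Delta^T(B_i) F_D \Delta(\alpha) \Delta(C_i) \otimes S(A_i) \alpha D_i (-1)^{[A_i]([B_i] + [C_i])} \]
\[ = \sum (S \otimes S)\Delta^T(B_i) \cdot \gamma \cdot \Delta(C_i) \otimes S(A_i) \alpha D_i (-1)^{[A_i]([B_i] + [C_i])} \]
\[ = \sum (S \otimes S)\Delta^T(B_i) \cdot (S \otimes S)T(A_j \otimes B_j) \cdot (\alpha \otimes \alpha) \cdot (C_j \otimes D_j) \cdot \Delta(C_i) \otimes S(A_i) \alpha D_i (-1)^{[A_i]([B_i] + [C_i])}, \]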
where in the penultimate equation we have used Theorem 2. Set ¯ ¯ ¯ ( )−1 = (S ⊗ S ⊗ S)(Z¯ ν ⊗ Y¯ν ⊗ X¯ ν )(−1)[Zν ]+[Xν ][Yν ] which implies ( )−1 (FD ⊗ 1)( ⊗ 1)γ = (S ⊗ S)T (Y¯ν ⊗ Z¯ ν ) · (S ⊗ S)T · (Bi ) · (S ⊗ S)T (Aj ⊗ Bj ) · (α ⊗ α) · (Cj ⊗ Dj ) · (Ci ) ⊗ S(X¯ ν )S(Ai )αDi ¯
× (−1)[Ai ]([Bi ]+[Ci ])+[Xν ](1+[Bi ]+[Ci ]) = (S ⊗ S)T {(Aj ⊗ Bj )(Bi )(Y¯ν ⊗ Z¯ ν )} · (α ⊗ α) · (Cj ⊗ Dj ) · (Ci ) ⊗ S(Ai X¯ ν )αDi
¯
¯
¯
¯
× (−1)([Aj ]+[Bj ])([Bi ]+[Xν ])+[Bi ][Xν ]+([Ai ]+[Xν ])([Bi ]+[Ci ]+[Xν ]) = ζ (p), where p=
Ai X¯ ν ⊗ (Aj ⊗ Bj ) · (Bi ) · (Y¯ν ⊗ Z¯ ν ) ⊗ (Cj ⊗ Dj ) · (Ci ) ⊗ Di ¯
¯
×(−1)([Aj ]+[Bj ])([Bi ]+[Xν ])+[Bi ][Xν ] and with ζ : H ⊗6 → H ⊗3 defined by ζ (a1 ⊗ a2 ⊗ a3 ⊗ a4 ⊗ a5 ⊗ a6 ) = S(a3 )αa4 ⊗ S(a2 )αa5 ⊗ S(a1 )αa6 × (−1)[a1 ]([a2 ]+[a3 ]+[a4 ]+[a5 ])+[a2 ]([a3 ]+[a4 ]) . Also, p can be reduced to p= (1 ⊗ Aj ⊗ Bj ⊗ Cj ⊗ Dj ⊗ 1) · (1 ⊗ ⊗ ⊗ 1)(Ai ⊗ Bi ⊗ Ci ⊗ Di ) · ( −1 ⊗ 1⊗3 ) = {1 ⊗ (1 ⊗ )(1 ⊗ 1 ⊗ ) −1 ⊗ 1} · (1 ⊗ ⊗ ⊗ 1) · {(1 ⊗ )(1 ⊗ 1 ⊗ ) −1 } · ( −1 ⊗ 1⊗3 ). Now we compute the right-hand side of (5.2): (1 ⊗ FD )(1 ⊗ F )γ · −1 = S(Bi )αCi ⊗ FD (S(Ai )αDi ) · −1 (−1)[Ai ]([Bi ]+[Ci ]) = S(Bi )αCi ⊗ (S ⊗ S)T (Ai )FD (α)(Di ) · −1 (−1)[Ai ]([Bi ]+[Ci ]) = S(Bi )αCi ⊗ (S ⊗ S)T (Ai ) · γ · (Di ) · −1 (−1)[Ai ]([Bi ]+[Ci ]) = S(Bi )αCi X¯ ν ⊗ (S ⊗ S)T {(Aj ⊗ Bj )(Ai )} · (α ⊗ α) · (Cj ⊗ Dj ) ¯ · (Di ) · (Y¯ν ⊗ Z¯ ν )(−1)[Xν ]([Ai ]+[Di ])+[Ai ]([Aj ]+[Bj ]+[Bi ]+[Ci ]) = ζ (p), ˜
where, in the third equality we have used Theorem 2. Here p˜ = (Aj ⊗ Bj )(Ai ) ⊗ Bi ⊗ Ci X¯ ν ⊗ (Cj ⊗ Dj ) · (Di ) · (Y¯ν ⊗ Z¯ ν ) ¯
× (−1)[Xν ]([Di ]+[Cj ]+[Dj ])+[Di ]([Cj ]+[Dj ]) = (Aj ⊗ Bj ⊗ 1⊗2 ⊗ Cj ⊗ Dj ) · ( ⊗ 1⊗2 ⊗ )(Ai ⊗ Bi ⊗ Ci ⊗ Di ) · (1⊗3 ⊗ −1 ). Therefore, to prove (5.2), it suffices to show that ζ (p) = ζ (p). ˜
(5.3)
We first note that ∀h ∈ H ⊗6 and ∀a ∈ H (notation as in Eqs. (4.18)–(4.21)) ζ (34 (a)h) = (a)ζ (h) = ζ (25 (a)h) = ζ (16 (a)h). We can also write where
(5.4)
¯ p˜ = {(1 ⊗ )(1 ⊗ 1 ⊗ ) −1 }1256 · p,
p¯ = ( ⊗ 1⊗2 ⊗ ){(1 ⊗ )(1 ⊗ 1 ⊗ ) −1 } · (1⊗3 ⊗ −1 ).
In the following we use ∼ to denote equivalence under the map ζ : p˜
(5.4),(2.4)
∼
=
(1⊗2 ⊗ (Xν ) ⊗ Yν ⊗ Zν ) · p¯ {(1 ⊗ )(1 ⊗ 1 ⊗ ) −1 }1256 (1 ⊗ Xµ ⊗ 1⊗2 ⊗ Yµ ⊗ Zµ )(X¯ ν ⊗ Y¯ν ⊗ 1⊗2 ⊗ (Z¯ ν )
∼
· {1⊗2 ⊗ ( ⊗ 1 ⊗ 1) } · p¯ L L L (1 ⊗ )1256 · (X¯ ν ⊗ Y¯ν ⊗ (Z¯ ν(1) ) ⊗ Z¯ ν(2) ⊗ Z¯ ν(3) )
=
· {1⊗2 ⊗ ( ⊗ 1 ⊗ 1) } · p¯ (1 ⊗ Xµ ⊗ 1⊗2 ⊗ Yµ ⊗ Zµ ){1⊗2 ⊗ ( ⊗ 1 ⊗ 1)( ⊗ 1)} −1
(5.4),(2.4)
(5.4),(2.4)
∼
=
· {1⊗2 ⊗ ( ⊗ 1 ⊗ 1) } · p¯ (1 ⊗ Xµ ⊗ (Yµ(1) ) ⊗ Yµ(2) ⊗ Zµ ){1⊗2 ⊗ ( ⊗ 1 ⊗ 1)( ⊗ 1)} −1 · {1⊗2 ⊗ ( ⊗ 1 ⊗ 1) } · p¯ {1 ⊗ (1 ⊗ ( ⊗ 1) ⊗ 1) } · {1⊗2 ⊗ (( ⊗ 1) ⊗ 1)} −1 · {1⊗2 ⊗ ( ⊗ 1 ⊗ 1) } · p. ¯
That is, ζ (p) ˜ = ζ (u), where u = {1 ⊗ (1 ⊗ ( ⊗ 1) ⊗ 1) } ¯ · {1⊗2 ⊗ (( ⊗ 1) ⊗ 1)} −1 · {1⊗2 ⊗ ( ⊗ 1 ⊗ 1) } · p.
We now compute p. Using Eq. (2.2) we obtain p = (1⊗2 ⊗ ⊗ 1){1 ⊗ (1 ⊗ 1 ⊗ ) −1 ⊗ 1} · {1 ⊗ ( ⊗ ⊗ 1) } · (1 ⊗ ⊗ ⊗ 1)(1 ⊗ 1 ⊗ ) −1 · ( −1 ⊗ 1⊗3 ) = (1⊗2 ⊗ ⊗ 1){1 ⊗ (1 ⊗ 1 ⊗ ) −1 ⊗ 1} · {1 ⊗ ( ⊗ ⊗ 1) } · {1 ⊗ (1⊗2 ⊗ ( ⊗ 1)) } · {1⊗2 ⊗ (1 ⊗ ( ⊗ 1))} −1 · { ⊗ 1 ⊗ ( ⊗ 1)} −1 = (1⊗2 ⊗ ⊗ 1){1 ⊗ (1 ⊗ (1 ⊗ ) ⊗ 1) } · {1⊗2 ⊗ (1 ⊗ ⊗ 1) } · {1⊗2 ⊗ (1 ⊗ ( ⊗ 1))} −1 · { ⊗ 1 ⊗ ( ⊗ 1)} −1 = (1⊗2 ⊗ ⊗ 1){1 ⊗ (1 ⊗ (1 ⊗ ) ⊗ 1) } · {1⊗2 ⊗ (1 ⊗ ⊗ 1)( ⊗ 1)} −1 · {1⊗2 ⊗ (1 ⊗ ⊗ 1) } · { ⊗ 1 ⊗ ( ⊗ 1)} −1 = {1 ⊗ (1 ⊗ ( ⊗ 1) ⊗ 1) } · {1⊗2 ⊗ (( ⊗ 1) ⊗ 1)} −1 · (1⊗2 ⊗ ⊗ 1){1⊗2 ⊗ (1 ⊗ ⊗ 1) } · { ⊗ 1 ⊗ ( ⊗ 1)} −1 = {1 ⊗ (1 ⊗ ( ⊗ 1) ⊗ 1) } · {1⊗2 ⊗ (( ⊗ 1) ⊗ 1)} −1 · {1⊗2 ⊗ ( ⊗ 1 ⊗ 1) } · {1⊗2 ⊗ (1 ⊗ 1 ⊗ ) } · { ⊗ 1 ⊗ (1 ⊗ )} −1 · (1⊗3 ⊗ −1 ) = {1 ⊗ (1 ⊗ ( ⊗ 1) ⊗ 1) } · {1⊗2 ⊗ (( ⊗ 1) ⊗ 1)} −1 · {1⊗2 ⊗ ( ⊗ 1 ⊗ 1) } · p¯ = u. Thus we have proved (5.3), i.e. ζ (p) = ζ (u) = ζ (p). ˜ This proves Lemma 6, so that
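\[ \Phi' = \Phi_{F_D}, \]
as required. For the canonical elements, we begin with the following useful result:

Lemma 7. For any $\eta \in H \otimes H$,
\[ m \cdot (1 \otimes \alpha)(S \otimes 1)\{\Delta(a)\eta\} = \varepsilon(a)\, m \cdot (1 \otimes \alpha)(S \otimes 1)\eta, \tag{5.5} \]
\[ m \cdot (1 \otimes \beta)(1 \otimes S)\{\eta\Delta(a)\} = \varepsilon(a)\, m \cdot (1 \otimes \beta)(1 \otimes S)\eta. \tag{5.6} \]
Proof. For (5.5), writing $\eta = \sum \eta_i \otimes \eta^i$,
\[ \text{l.h.s.} = m \cdot (1 \otimes \alpha)(S \otimes 1)\Bigl\{ \sum (a_{(1)} \otimes a_{(2)})(\eta_i \otimes \eta^i) \Bigr\} = \sum S(\eta_i) S(a_{(1)}) \alpha a_{(2)} \eta^i (-1)^{[\eta_i]([a_{(1)}] + [a_{(2)}])} = \varepsilon(a) \sum S(\eta_i) \alpha \eta^i = \varepsilon(a)\, m \cdot (1 \otimes \alpha)(S \otimes 1)\eta = \text{r.h.s.} \]
The proof of (5.6) is similar.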
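For $\alpha_{F_D}$, we have
\[ \alpha_{F_D} = m \cdot (1 \otimes \alpha)(S \otimes 1)F_D^{-1} = m \cdot (1 \otimes \alpha)(S \otimes 1) \sum \Delta(\bar{X}_\nu) \cdot \bar{\gamma} \cdot \Delta'(S(\bar{Y}_\nu) \alpha \bar{Z}_\nu) \]
\[ \overset{(5.5)}{=} m \cdot (1 \otimes \alpha)(S \otimes 1) \sum \varepsilon(\bar{X}_\nu) \cdot \bar{\gamma} \cdot \Delta'(S(\bar{Y}_\nu) \alpha \bar{Z}_\nu) = m \cdot (1 \otimes \alpha)(S \otimes 1)\{\bar{\gamma} \cdot \Delta'(\alpha)\} \]
\[ = m \cdot (1 \otimes \alpha)(S \otimes 1) \sum \Delta(\bar{X}_\nu)(X_\mu \beta S(\bar{Z}_\nu) \otimes Y_\mu \beta S(\bar{Y}_\nu Z_\mu)) \cdot \Delta'(\alpha) \times (-1)^{[\bar{Z}_\nu]([Y_\mu] + [\bar{Y}_\nu]) + [\bar{X}_\nu][Z_\mu]} \]
\[ \overset{(5.5)}{=} m \cdot (1 \otimes \alpha)(S \otimes 1) \sum \varepsilon(\bar{X}_\nu)(X_\mu \beta S(\bar{Z}_\nu) \otimes Y_\mu \beta S(Z_\mu) S(\bar{Y}_\nu)) \cdot \Delta'(\alpha) \times (-1)^{[\bar{Z}_\nu]([Y_\mu] + [\bar{Y}_\nu]) + [\bar{Y}_\nu][Z_\mu]} \]
\[ = \sum S(\beta \tilde{\alpha}_{(1)}) S(X_\mu) \alpha Y_\mu \beta S(Z_\mu) \tilde{\alpha}_{(2)} = \sum S(\beta \tilde{\alpha}_{(1)}) \tilde{\alpha}_{(2)} = S\Bigl( \sum S^{-1}(\tilde{\alpha}_{(2)}) \beta \tilde{\alpha}_{(1)} (-1)^{[\tilde{\alpha}_{(1)}][\tilde{\alpha}_{(2)}]} \Bigr), \]
where we have used the notation
\[ \Delta'(\alpha) = \sum \tilde{\alpha}_{(1)} \otimes \tilde{\alpha}_{(2)}. \]
Now observe
\[ \sum S^{-1}(\tilde{\alpha}_{(2)}) \beta \tilde{\alpha}_{(1)} (-1)^{[\tilde{\alpha}_{(1)}][\tilde{\alpha}_{(2)}]} = m \cdot (1 \otimes \beta)(S^{-1} \otimes 1) T \Delta'(\alpha) = m \cdot (1 \otimes \beta)(S^{-1} \otimes 1)(S \otimes S)\Delta(S^{-1}(\alpha)) \]
\[ = m \cdot (1 \otimes \beta)(1 \otimes S)\Delta(S^{-1}(\alpha)) = \varepsilon(S^{-1}(\alpha))\beta = \varepsilon(\alpha)\beta, \]
which implies
\[ \alpha_{F_D} = m \cdot (1 \otimes \alpha)(S \otimes 1)F_D^{-1} = S(\varepsilon(\alpha)\beta) = \varepsilon(\alpha)S(\beta) = \varepsilon(\alpha)\alpha'. \]
The result for $\beta_{F_D}$, namely
\[ \beta_{F_D} = m \cdot (1 \otimes \beta)(1 \otimes S)F_D = \varepsilon(\beta)\beta', \]
is proved similarly. We have therefore proved the following:

Theorem 3. The QHSA structure defined on $H$ by Proposition 4 is precisely equivalent to that induced by the Drinfeld twist $F_D$.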
5.2. Drinfeld twisting on quasi-triangular QHSAs. Our aim here is to extend Theorem 3 to the important case of quasi-triangular QHSAs. We begin with Proposition 5. With the full QHSA structure of Proposition 4, H is quasi-triangular with R-matrix R = (S ⊗ S)R. Proof. Applying S ⊗ S to (2.9) gives, ∀a ∈ H R (S ⊗ S)T (a) = (S ⊗ S)(a)R , so that
R (a) = ( )T R .
Applying T ⊗ 1 to (2.10) gives −1 (T ⊗ 1)R = −1 321 R23 312 R13 213 .
Then applying S ⊗ S ⊗ S we obtain l.h.s. = ((S ⊗ S)T · S −1 ⊗ 1)(S ⊗ S)R = ( ⊗ 1)R = r.h.s. = (S ⊗ S ⊗ S) −1 213 · (S ⊗ S ⊗ S)R13 · (S ⊗ S ⊗ S) 312 ·(S ⊗ S ⊗ S)R23 · (S ⊗ S ⊗ S) −1 321 . Since 123 = (S ⊗ S ⊗ S) 321 , −1 ( )−1 123 = (S ⊗ S ⊗ S) 321 , −1 ( )−1 231 = (S ⊗ S ⊗ S) 213 ,
132 = (S ⊗ S ⊗ S) 312 ,
we have
−1 ( ⊗ 1)R = ( )−1 231 (R )13 132 (R )23 ( )123 .
Similarly, applying (S ⊗ S ⊗ S)(1 ⊗ T ) to (2.11) we arrive at (1 ⊗ )R = 312 (R )13 ( )−1 213 (R )12 123 .
This completes the proof. We now show that the R-matrix R coincides with the R-matrix RFD induced from R by the Drinfeld twist FD . Our main result is Theorem 4. The quasi-triangular QHSA structure on H , defined by Propositions 4, 5 is precisely equivalent to the quasi-triangular QHSA structure induced on H by the Drinfeld twist FD . Namely, R = FDT RFD−1 = RFD .
Proof. To prove this, it suffices to show R FD = FDT R, where FDT =
(S ⊗ S)(Xν ) · γ T · T (Yν βS(Zν ))
= T · FD , and γ T = T · γ . To this end,
R FD = R (S ⊗ S)T (Xν ) · γ · (Yν βS(Zν )) = ( )T (S(Xν ))R · γ · (Yν βS(Zν )) = (S ⊗ S)(Xν )R · γ · (Yν βS(Zν )),
and similarly FDT R =
(S ⊗ S)(Xν ) · γ T · R(Yν βS(Zν )).
It therefore suffices to show Lemma 8.
R γ = γ T R.
Proof. Write R = at ⊗ a t and note that R is even. We then have for the left hand side Rγ = (S(at ) ⊗ S(a t ))(S(Bi )αCi ⊗ S(Ai )αDi )(−1)[Ai ]([Bi ]+[Ci ]) = (S ⊗ S)T {(Ai ⊗ Bi )(a t ⊗ at )} · (α ⊗ α) · (Ci ⊗ Di ) t
t
t
× (−1)[Bi ][a ]+([Ai ]+[a ])([Bi ]+[at ])+[Ai ]([Bi ]+[a ]) = (S ⊗ S)T {(Ai ⊗ Bi )R T } · (α ⊗ α) · (Ci ⊗ Di ) = ψ(v), where
v=
(Ai ⊗ Bi ⊗ Ci ⊗ Di )(R T ⊗ 1⊗2 )
and ψ : H ⊗4 → H ⊗2 is defined by ψ(a1 ⊗ a2 ⊗ a3 ⊗ a4 ) = (S ⊗ S)T (a1 ⊗ a2 ) · (α ⊗ α) · (a3 ⊗ a4 ). For the right hand side (using obvious notation), we have γTR = T( S(Bi )αCi ⊗ S(Ai )αDi ) · (et ⊗ et )(−1)[Ai ]([Bi ]+[Ci ]) = (S ⊗ S)(Ai ⊗ Bi ) · (α ⊗ α)(Di ⊗ Ci )(et ⊗ et )(−1)[Di ][Ci ] = (S ⊗ S)T {T (Ai ⊗ Bi )} · (α ⊗ α) · T {(Ci ⊗ Di )R T } = ψ(v), ˜
where v˜ = (T ⊗ T )
(Ai ⊗ Bi ⊗ Ci ⊗ Di )(1⊗2 ⊗ R T ),
so it suffices to show ψ(v) = ψ(v). ˜ Above we have used Lemma 2, so that Ai ⊗ Bi ⊗ Ci ⊗ Di = ( −1 ⊗ 1)( ⊗ 1 ⊗ 1) , Ai ⊗ Bi ⊗ Ci ⊗ Di = (1 ⊗ )(1 ⊗ 1 ⊗ ) −1 . In view of Eq. (2.9), v immediately reduces to T T v = ( −1 123 (R )12 ⊗ 1)( ⊗ 1 ⊗ 1) .
With the help of the equation (1 ⊗ )R T = (T ⊗ 1)(1 ⊗ T )( ⊗ 1)R −1 T T = −1 123 (R )12 213 (R )13 312 , v can be written −1 T v = {(1 ⊗ )R T · 312 (R T )−1 13 213 ⊗ 1}( ⊗ 1 ⊗ 1) −1 T = 23 (at )(a t ⊗ 1⊗3 ){ 312 (R T )−1 13 213 ⊗ 1}( ⊗ 1 ⊗ 1) .
Now observe ψ(23 (a)h) = (a)ψ(h) = ψ(14 (a)h),
(5.7)
which holds ∀a ∈ H , h ∈ H ⊗4 . In what follows, we use ∼ to denote equivalence under ψ. We then have (5.7)
v ∼
−1 T (at )(a t ⊗ 1⊗3 ){ 312 (R T )−1 13 213 ⊗ 1} · ( ⊗ 1 ⊗ 1)
−1 = (T ⊗ 1 ⊗ 1){( 132 ⊗ 1)((R T )−1 ⊗ 1)( ⊗ 1 ⊗ 1) } 23 ⊗ 1)( (2.2)
= (T ⊗ 1 ⊗ 1){( 132 ⊗ 1)(1 ⊗ (R T )−1 ⊗ 1)(1 ⊗ ⊗ 1) · (1 ⊗ )(1 ⊗ 1 ⊗ ) −1 }
(2.9)
= (T ⊗ 1 ⊗ 1){( 132 ⊗ 1)(1 ⊗ T ⊗ 1) · (1 ⊗ (R T )−1 ⊗ 1)(1 ⊗ )(1 ⊗ 1 ⊗ ) −1 }
(2.2)
= (T ⊗ 1 ⊗ 1){(1 ⊗ T ⊗ 1)(( ⊗ 1 ⊗ 1) · (1 ⊗ 1 ⊗ ) · (1 ⊗ −1 )) · (1 ⊗ (R T )−1 ⊗ 1) · (1 ⊗ )(1 ⊗ 1 ⊗ ) −1 }.
By straightforward application of Eq. (5.7) we obtain v∼ (Xν )(Zµ )(Yν ⊗ 1⊗2 ⊗ Zν )(1 ⊗ Xµ ⊗ Yµ ⊗ 1) · (T ⊗ 1 ⊗ 1){(1 ⊗ T ⊗ 1)(1 ⊗ −1 ) · (1 ⊗ (R T )−1 ⊗ 1) · (1 ⊗ )(1 ⊗ 1 ⊗ ) −1 } T −1 −1 = (T ⊗ 1 ⊗ 1){(1 ⊗ −1 213 )((R )23 ⊗ 1)(1 ⊗ )(1 ⊗ 1 ⊗ ) }.
As to v˜ we note that ( ⊗ 1)R T = (1 ⊗ T )(T ⊗ 1)(1 ⊗ )R T = 123 (R T )23 −1 132 (R )13 231 . Paying particular attention to Eqs. (2.9) and (5.7), we have v˜ = (T ⊗ T ) · {(1 ⊗ )(1 ⊗ 1 ⊗ ) −1 · (1⊗2 ⊗ R T )} (2.9)
= (T ⊗ T ) · {(1 ⊗ )(1⊗2 ⊗ R T )(1 ⊗ 1 ⊗ T ) −1 } T = (T ⊗ T ){(1 ⊗ 123 R23 )(1 ⊗ 1 ⊗ T ) −1 } T −1 = 14 (a t )(1⊗2 ⊗ at ⊗ 1)(T ⊗ T ){1 ⊗ −1 231 (R )13 132 } t
· (T ⊗ T )(1⊗2 ⊗ T )(1 ⊗ 1 ⊗ ) −1 (−1)[at ][a ] T −1 ∼ (a t )(1⊗2 ⊗ at ⊗ 1)(T ⊗ T ){1 ⊗ ( −1 231 (R )13 132 )} · (T ⊗ 1⊗2 )(1 ⊗ 1 ⊗ ) −1 T −1 = (T ⊗ 1⊗2 ){(1⊗2 ⊗ T ){(1 ⊗ −1 231 )(1 ⊗ (R )13 )(1 ⊗ 132 )} · (1 ⊗ 1 ⊗ ) −1 }. We therefore have T −1 v˜ ∼ (T ⊗ 1 ⊗ 1){(1 ⊗ −1 ⊗ 1)(1 ⊗ )(1 ⊗ 1 ⊗ ) −1 }. 213 )(1 ⊗ (R )
Thus ψ(v) = ψ(v) ˜ from which the lemma follows.
This is sufficient to prove Theorem 4.
6. Concluding Remarks

As noted in the introduction, the potential for applications of QHSAs is enormous, particularly in knot theory and supersymmetric integrable models, and these applications will be investigated elsewhere. In applications such as these, it is important to have a well developed and accessible structure theory, which has been the main focus of this paper. It is worth noting, even in the non-graded case, that the structure induced by the Drinfeld twist (4.10) has only been investigated for quasi-bialgebras [3]. Thus our results on the complete (graded) quasi-Hopf algebra structure, and in particular the purely algebraic and universal proof of Theorem 4, are new even in the non-graded case.

Note. After this paper was posted to the math.QA bulletin board, we were informed by F. Hausser of their paper [19], in which the result of Theorem 4 was proved (in the non-graded case only) using graphical techniques on the category of finite dimensional modules of H. However, as we have mentioned above, our proof is purely algebraic and universal.

Acknowledgements. P.S.I is supported by a JSPS postdoctoral fellowship.
References

1. Gould, M.D., Zhang, Y.-Z., Isaac, P.S.: J. Math. Phys. 41, no. 1, 547 (2000)
2. Zhang, Y.-Z., Gould, M.D.: J. Math. Phys. 40, no. 10, 5264 (1999)
3. Drinfeld, V.G.: Quasi-Hopf algebras. Leningrad Math. J. 1, 1419 (1990)
4. Babelon, O., Bernard, D., Billey, E.: Phys. Lett. B375, 89 (1996)
5. Fronsdal, C.: Lett. Math. Phys. 40, 134 (1997)
6. Jimbo, M., Konno, H., Odake, S., Shiraishi, J.: Transform. Groups 4, no. 4, 303 (1999)
7. Arnaudon, D., Buffenoir, E., Ragoucy, E., Roche, Ph.: Lett. Math. Phys. 44, no. 3, 201 (1998)
8. Enriquez, B., Felder, G.: Commun. Math. Phys. 195, no. 3, 651 (1998)
9. Foda, O., Iohara, K., Jimbo, M., Kedem, R., Miwa, T., Yan, H.: Lett. Math. Phys. 32, 259 (1994)
10. Felder, G.: Elliptic quantum groups. Proc. ICMP Paris 1994. Cambridge, MA: International Press, 1995, pp. 211
11. Baxter, R.J.: Ann. Phys. 70, 193 (1972)
12. Andrews, G.E., Baxter, R.J., Forrester, P.J.: J. Stat. Phys. 35, 193 (1984)
13. Belavin, A.: Nucl. Phys. B180, 189 (1981)
14. Jimbo, M., Miwa, T., Odake, M.: Commun. Math. Phys. 116, 507 (1988)
15. Bazhanov, V.V., Stroganov, Yu.G.: Theor. Math. Phys. 62, 253 (1985)
16. Deguchi, T., Fujii, A.: Mod. Phys. Lett. A6, 3413 (1991)
17. Altschuler, D., Coste, A.: Commun. Math. Phys. 150, 83 (1992)
18. Sweedler, M.E.: Hopf Algebras. New York: Benjamin, 1969
19. Hausser, F., Nill, F.: Commun. Math. Phys. 199, no. 3, 547 (1999)
Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 224, 373 – 397 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Hairy Black Hole Solutions to SU (3) Einstein–Yang–Mills Equations Wei H. Ruan Department of Mathematics, Computer Science and Statistics, Purdue University Calumet, Hammond, IN 46323, USA. E-mail:
[email protected] Received: 23 October 2000 / Accepted: 30 January 2001
Abstract: We give a rigorous proof of existence of infinitely many black hole solutions to the Einstein–Yang–Mills equations with gauge group SU(3). In the case that the radius of event horizon is not too small, we show that there is a black hole solution for any possible numbers of zeros of the two field variables.

1. Introduction

The coupling of Einstein's general relativity with Yang–Mills' field theories has been receiving active study for over a decade, ever since the discovery by Bartnik and McKinnon of numerical solutions of hairy black holes when the gauge group is SU(2). The Einstein–Yang–Mills equations with the gauge group SU(N) have the form
\[
\begin{aligned}
& r^2\mu\,\omega_i'' + \Phi\,\omega_i' + \frac{\partial P}{\partial\omega_i} = 0, \qquad i = 1, \dots, N-1,\\
& r\mu' + 2G\mu = \frac{\Phi}{r}, \qquad \frac{S'}{S} = \frac{2}{r}\,G, \qquad r > \hat r,
\end{aligned}
\tag{1.1}
\]
where $\hat r > 0$ is the radius of event horizon,
\[
P = -\frac{1}{8}\sum_{i=1}^{N}\bigl(\omega_i^2 - \omega_{i-1}^2 - N - 1 + 2i\bigr)^2, \tag{1.2}
\]
in which $\omega_0 = \omega_N = 0$, and
\[
G = \sum_{i=1}^{N-1}\omega_i'^{\,2}, \qquad \Phi = r(1-\mu) + \frac{4}{r}\,P. \tag{1.3}
\]
This system is derived using the ansatz
\[
ds^2 = \mu^{-1}dr^2 + r^2\bigl(d\theta^2 + \sin^2\theta\,d\phi^2\bigr) - S^2\mu\,dt^2
\]
for the metric and
\[
A_j\,dx^j = \frac{1}{2}\bigl(C - C^H\bigr)d\theta - \frac{i}{2}\bigl[\bigl(C + C^H\bigr)\sin\theta + D\cos\theta\bigr]d\phi
\]
for the field potential, where
\[
C = \begin{pmatrix} 0 & \omega_1 & & \\ & 0 & \ddots & \\ & & \ddots & \omega_{N-1} \\ & & & 0 \end{pmatrix},
\qquad
D = \begin{pmatrix} N-1 & & & \\ & N-3 & & \\ & & \ddots & \\ & & & -N+1 \end{pmatrix}.
\]
(For a derivation of the equations, see [4, 6].) A regular black hole solution is one that satisfies the conditions
\[
\mu(\hat r) = 0, \qquad \mu(r) > 0 \ \text{for } r > \hat r, \qquad \text{and} \qquad \lim_{r\to\infty}\mu(r) = 1.
\]

For such a system, the so-called No Hair Conjecture has been the general belief for a long time. The conjecture states that a stationary black hole is uniquely determined by mass, angular momentum, and Yang–Mills charge at infinity. This was disproved by Bartnik and McKinnon [1] in 1988. They found in the SU(2) case numerical solutions corresponding to nonsingular and nonabelian black holes. (See also [2, 5, 11] and a recent review by Volkov and Gal'tsov [12].) A rigorous and thorough mathematical analysis in this case is given by Smoller, Wasserman, and Yau [9]. (See also [8, 10].) It is shown that for every value of the radius of the event horizon, and every nonnegative integer $n$, there are two black hole solutions such that the field function $\omega$ has exactly $n$ zeros.

A natural question is hence whether this result can be extended to the more general case where the gauge group is SU(N). Since in this case there are $N-1$ field functions $\omega_1, \dots, \omega_{N-1}$, the conjecture is that for every radius of the event horizon and every $(N-1)$-tuple $(n_1, \dots, n_{N-1})$ of nonnegative integers, there are $k$ black hole solutions such that $\omega_i$ has exactly $n_i$ zeros for $i = 1, \dots, N-1$, where $k = 2^{3(N-1)/2}$ if $N$ is odd and $k = 2^{3(N-2)/2+1}$ if $N$ is even. (The multiplicity is due to the symmetry of the system under the changes $\omega_i \to -\omega_i$, and $\omega_i \to \omega_{N-i}$ for any fixed $i$.) Proof of this conjecture has been attempted but not yet achieved. The SU(N) case appears to be difficult because the $N-1$ field equations are strongly coupled (i.e., coupled through derivatives). It is to be noted that in a recent paper [6], Mavromatos and Winstanley give an argument for a weaker version of the conjecture: given $N-1$ nonnegative integers $n_1, \dots, n_{N-1}$, there exist solutions such that each $\omega_i$ possesses at least (rather than exactly) $n_i$ zeros. Their argument, however, is heuristic. There are gaps in the proofs at the fundamental level.
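The multiplicity count in the conjecture can be checked by hand for small N. The following Python sketch is an illustration only: the function names `multiplicity` and `su3_orbit` are hypothetical, and the sample point (0.5, 1.0) is an arbitrary choice. It evaluates the stated formula for k and, for SU(3), enumerates the images of a generic initial point under the sign flips and the swap of the two field variables.

```python
from itertools import product

def multiplicity(N):
    """k from the conjecture: 2^(3(N-1)/2) for odd N, 2^(3(N-2)/2 + 1) for even N."""
    if N % 2 == 1:
        return 2 ** (3 * (N - 1) // 2)
    return 2 ** (3 * (N - 2) // 2 + 1)

def su3_orbit(w1, w2):
    """Images of (w1, w2) under sign flips of each component and the swap w1 <-> w2."""
    images = set()
    for s1, s2 in product((1, -1), repeat=2):
        images.add((s1 * w1, s2 * w2))   # sign flips only
        images.add((s2 * w2, s1 * w1))   # sign flips followed by the swap
    return images

# Illustrative generic point: both components nonzero and of different size.
orbit = su3_orbit(0.5, 1.0)
print(multiplicity(2), multiplicity(3), len(orbit))
```

For N = 2 the formula gives the two solutions of the SU(2) result quoted above, and for N = 3 the orbit of a generic point has as many elements as the formula predicts.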
In this paper, we give a rigorous proof of the original version of the conjecture in a special case where the gauge group is SU(3) and the radius $\hat r$ of the event horizon exceeds 2. In this particular case, the equations for the metric variable $\mu$ and the field functions $\omega_i$ have the form
\[
\begin{aligned}
r^2\mu\,\omega_1'' + \Phi\,\omega_1' + \omega_1\Bigl(1 - \omega_1^2 + \tfrac{1}{2}\omega_2^2\Bigr) &= 0,\\
r^2\mu\,\omega_2'' + \Phi\,\omega_2' + \omega_2\Bigl(1 - \omega_2^2 + \tfrac{1}{2}\omega_1^2\Bigr) &= 0, \qquad r > \hat r,\\
r\mu' + 2G\mu &= \frac{\Phi}{r},
\end{aligned}
\tag{1.4}
\]
together with the initial condition
\[
\mu(\hat r) = 0, \qquad \omega_i(\hat r) = \hat\omega_i, \qquad i = 1, 2, \tag{1.5}
\]
where $\hat\omega_i$ are constants. We set aside the equation for $S$ since $S$ is not involved in the above equations, and hence can be solved separately once $\mu$, $\omega_1$ and $\omega_2$ are found. In view of the symmetry of the system under the substitutions $\omega_i \to -\omega_i$ for $i = 1, 2$, and $\omega_1 \leftrightarrow \omega_2$, we may restrict ourselves to consider only the solutions that satisfy the initial condition $0 \le \hat\omega_1 \le \hat\omega_2$. Our main result is the following

Theorem 1.1. Suppose $\hat r > 2$. Then for any integers $n_1 \ge n_2 \ge 0$, there is a regular black hole solution $(\mu, \omega_1, \omega_2)$ of the system (1.4)–(1.5) such that $0 \le \hat\omega_1 \le \hat\omega_2 \le \sqrt{2}$, and $\omega_i$ has exactly $n_i$ zeros over the interval $(\hat r, \infty)$. Furthermore, each black hole solution, with $\hat\omega_1 \ne 0$ and $\hat\omega_2 \ne 0$, has the constant limit
\[
\lim_{r\to\infty}\mu = 1, \qquad \lim_{r\to\infty}\omega_i = \pm\sqrt{2}, \qquad i = 1, 2,
\]
and $(\omega_1, \omega_2)$ reaches the limit from within the square
\[
D = \bigl\{-\sqrt{2} \le \omega_1 \le \sqrt{2},\ -\sqrt{2} \le \omega_2 \le \sqrt{2}\bigr\}.
\]

It is implied by the theorem that at infinity the field is constant and the spacetime is flat. We will also show that each of these solutions has a finite ADM mass $m \equiv r(1-\mu)/2$, which is derived from the solution, not an arbitrary constant. The proof of Theorem 1.1 will be complete at the end of the paper, as a result of analyzing several aspects of the system.

Let us define a few terms before describing the structure of this paper. It will be seen that the square $D$ defined in Theorem 1.1 plays an important role. Throughout this paper, by trajectory we mean the curve in the $(\omega_1, \omega_2)$-plane generated by a solution $(\mu, \omega_1, \omega_2)$. We classify trajectories into three types. A crashing trajectory is generated by a solution of which $\mu$ becomes zero before the point $\omega \equiv (\omega_1, \omega_2)$ leaves $D$ (if ever). A connecting trajectory is one that does not crash, and stays in $D$ for all $r > \hat r$. An exiting trajectory is one that leaves $D$ before it crashes (if ever). A starting point $\hat\omega \equiv (\hat\omega_1, \hat\omega_2)$ of a trajectory is called a crashing, connecting, or exiting point if the corresponding trajectory is of the respective type. By zero-numbers of a trajectory or an initial point we mean the numbers $n_1$ and $n_2$ of zeros of $\omega_1$ and $\omega_2$, respectively, before the trajectory crashes or exits $D$ (if ever). We often write $n_i(\hat\omega)$ to indicate the dependence of $n_i$ on the starting point $\hat\omega \in D$.

This paper is organized as follows. In Sect. 2, we give preliminary properties of solutions. In particular, we show that any trajectory that ever reaches the boundary of $D$ at finite $r$ must exit $D$ immediately. We also show that the condition $\hat r > 2$ eliminates the existence of crashing trajectories. Properties of $\mu$ and $m$ are also given. In Sect. 3, we study the connecting trajectories, and show that these solutions converge to equilibria. This justifies calling them "connecting". In Sect. 4, we give properties of the zero-numbers $n_i(\hat\omega)$. We also show that each zero-number $n_i(\hat\omega)$, as an integer-valued function in $D$, is upper semicontinuous at exiting points and lower semicontinuous at connecting points. In the final section, we examine the distribution of the connecting and exiting points in the square $D$, and prove the existence of connecting points with all possible values of zero-numbers. Thus we complete the proof of Theorem 1.1.

2. Properties of Solutions

In this section, we give a preliminary study of the spacetime variable $\mu$, the field variables $(\omega_1, \omega_2)$, the ADM mass $m$, and the zero-numbers $n_i(\hat\omega)$. We first consider $\mu$. It is clear that the system (1.4) is singular whenever $\mu = 0$. In particular, it is singular at $r = \hat r$. Because of the singularity at $r = \hat r$, the existence of a local solution is a nontrivial problem. The result on the existence of a local solution for the general SU(N) case (1.1) is proved in Künzle [4], which also shows that the solutions depend on the initial values $\hat\omega = (\hat\omega_1, \dots, \hat\omega_{N-1})$ analytically. Furthermore, from Eq. (1.1), we see that
\[
\mu'(\hat r) = \frac{\Phi(\hat r)}{\hat r^2} = \frac{1}{\hat r}\Bigl(1 + \frac{4}{\hat r^2}\,P(\hat\omega)\Bigr).
\]
Since we are only interested in solutions of which $\mu(r) > 0$ for $r > \hat r$, we assume throughout this paper that $\Phi(\hat r) > 0$, or equivalently
\[
\hat r^2 > -4P(\hat\omega). \tag{2.1}
\]
In the SU(3) case, we will see that this condition holds if $\hat r > 2$. Furthermore, we show that $\mu(r) > 0$ as long as $\omega \in D$. This would eliminate existence of other singularity while the trajectory stays in $D$.

Theorem 2.1. Suppose $\hat r > 2$. Let $(\mu, \omega_1, \omega_2)$ be a solution of the initial value problem (1.4)–(1.5), with $(\hat\omega_1, \hat\omega_2) \in D$. Then $\mu(r) > 0$ before the trajectory $\omega = (\omega_1, \omega_2)$ exits $D$ (if ever). Furthermore, $\mu(r) < 1$ holds for all $r$ at which the solution is defined.

Proof. First observe that in the SU(3) case,
\[
P(\omega) = -\frac{1}{8}\Bigl[\bigl(\omega_1^2 - 2\bigr)^2 + \bigl(\omega_1^2 - \omega_2^2\bigr)^2 + \bigl(\omega_2^2 - 2\bigr)^2\Bigr].
\]
A simple analysis shows that
\[
-1 \le P(\omega) \le 0 \qquad \text{if } \omega \in D. \tag{2.2}
\]
Hence, the condition $\hat r > 2$ implies (2.1), which in turn implies $\mu > 0$ for $r > \hat r$ and near $\hat r$.
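The bound (2.2) and its consequence (2.1) are easy to probe numerically. The sketch below is only a sanity check, not part of the proof: it samples the SU(3) potential P on a grid over the square D, records its extreme values, confirms that every sampled value satisfies (2.1) for one illustrative radius, and compares a finite-difference derivative of P with the nonlinear term of the first field equation in (1.4). The grid size, the radius 2.5, and the test point (0.3, 1.1) are arbitrary choices made for the illustration.

```python
import math

SQRT2 = math.sqrt(2.0)

def P(w1, w2):
    """SU(3) potential: -(1/8)[(w1^2-2)^2 + (w1^2-w2^2)^2 + (w2^2-2)^2]."""
    return -((w1**2 - 2)**2 + (w1**2 - w2**2)**2 + (w2**2 - 2)**2) / 8.0

def grid(n):
    """n + 1 equally spaced points covering [-sqrt(2), sqrt(2)]."""
    return [-SQRT2 + 2 * SQRT2 * k / n for k in range(n + 1)]

# Sample P over the square D and record its extreme values.
n = 200
values = [P(a, b) for a in grid(n) for b in grid(n)]
p_min, p_max = min(values), max(values)

# Condition (2.1), r_hat^2 > -4 P, checked for an illustrative r_hat > 2.
r_hat = 2.5
horizon_ok = all(r_hat**2 > -4 * p for p in values)

# Central difference of P in w1 versus the term w1 (1 - w1^2 + w2^2 / 2) in (1.4).
w1, w2, h = 0.3, 1.1, 1e-6
dP_numeric = (P(w1 + h, w2) - P(w1 - h, w2)) / (2 * h)
dP_formula = w1 * (1 - w1**2 + 0.5 * w2**2)

print(p_min, p_max, horizon_ok, abs(dP_numeric - dP_formula))
```

The minimum of P is attained, for example, at the origin and the maximum at the four corners of D, which is why the threshold for the radius in (2.1) is 2.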
We next show that $\mu(r) > 0$ if $\omega \in D$ in $(\hat r, r]$. Suppose this is not true. Let $r_0$ be the first of such $r$ at which $\mu = 0$. Then $\mu'(r_0) \le 0$. However, from Eq. (1.4) and the relation (2.2),
\[
\mu'(r_0) = \frac{1}{r_0^2}\Bigl(r_0 + \frac{4}{r_0}\,P(\omega(r_0))\Bigr) \ge \frac{1}{r_0^2}\Bigl(r_0 - \frac{4}{r_0}\Bigr).
\]
Since $r_0 > \hat r > 2$, it follows that $\mu'(r_0) > 0$. This is impossible. The assertion is thus proved.

Next, we show that $\mu(r) < 1$ for all $r > \hat r$. Suppose this is not true. Let $r_1$ be the first number at which $\mu(r_1) = 1$. Then $\mu'(r_1) \ge 0$. On the other hand, from Eq. (1.1),
\[
\begin{aligned}
r_1\mu'(r_1) &= -2\mu(r_1)\,G + \frac{4}{r_1^2}\,P\\
&= -2\bigl(\omega_1'^{\,2} + \omega_2'^{\,2}\bigr) - \frac{1}{2r_1^2}\Bigl[\bigl(\omega_1^2 - 2\bigr)^2 + \bigl(\omega_1^2 - \omega_2^2\bigr)^2 + \bigl(\omega_2^2 - 2\bigr)^2\Bigr] \le 0.
\end{aligned}
\]
Hence $\mu'(r_1) = 0$. This implies that
\[
\omega_i'(r_1) = 0, \qquad \omega_i^2(r_1) = 2, \qquad i = 1, 2.
\]
However, by the uniqueness of solutions, these conditions imply that each $\omega_i$ is a constant equilibrium. Therefore, $G = 0$ and $P = 0$ for all $r$, and the equation of $\mu$ is reduced to $r\mu' + \mu = 1$. It follows that
\[
\mu(r_1) = 1 - \frac{\hat r}{r_1} < 1,
\]
contradicting the assumption. Hence no such $r_1$ exists.

The above theorem shows that if $\hat r > 2$, then all trajectories starting in $D$ do not crash in $D$. We assume $\hat r > 2$ throughout this paper without further notice. We next consider the field variables $\omega \equiv (\omega_1, \omega_2)$. Our next result shows that if a trajectory ever reaches the boundary $\partial D$ from inside of $D$, then the trajectory exits $D$.

Theorem 2.2. Let $(\mu, \omega_1, \omega_2)$ be a solution of the initial value problem (1.4)–(1.5) where $\omega(r) \equiv (\omega_1(r), \omega_2(r))$ is not constant. Suppose $\bar r > \hat r$ is such that $\omega(r) \in D$ for $\hat r \le r \le \bar r$ and $\omega(\bar r) \in \partial D$. Then there is an $\varepsilon > 0$ such that $\omega(r) \notin D$ for $\bar r < r < \bar r + \varepsilon$.

Proof. Suppose this is not true. Then either $\omega_1^2$ or $\omega_2^2$ has a local maximum 2 at $\bar r$. Without loss of generality, we may assume that $\omega_1(\bar r) = \sqrt{2}$. Hence, $\omega_1'(\bar r) = 0$ and $\omega_1''(\bar r) \le 0$. In view of Theorem 2.1, $\mu(\bar r) > 0$. Thus, from the equation for $\omega_1$ in (1.4),
\[
-1 + \frac{1}{2}\,\omega_2^2(\bar r) \ge 0,
\]
that is, $\omega_2^2(\bar r) \ge 2$. Since on $\partial D$, $\omega_i^2(\bar r) \le 2$, it follows that $\omega_2^2(\bar r) = 2$, which is also a local maximum. Hence $\omega_2'(\bar r) = 0$. However, this implies that $\omega(\bar r)$ is at an equilibrium with zero derivatives. It follows from the uniqueness of solution that $\omega(r)$ is constant for all $r > \hat r$. This contradicts the assumption of the theorem.
Throughout this paper, for any exiting trajectory $\omega \in D$ starting at a point $\hat\omega$, we use $\bar\omega \equiv (\bar\omega_1, \bar\omega_2)$ to denote the first point of the trajectory on $\partial D$. With slight abuse of language, we call $\bar\omega$ the end point of the trajectory. We also use $\bar\mu$ and $\bar r$ to denote the corresponding "end" values of the solution. As a consequence of the previous theorem, we show that the end values depend on $\hat\omega$ continuously.

Theorem 2.3. Suppose $\hat\omega^* \in D$ is an exiting point. Then in a neighborhood of $\hat\omega^*$ the end values $\bar r$, $\bar\mu$, $\bar\omega$ depend on the initial point $\hat\omega$ continuously.

Proof. Let $(\mu^*, \omega^*)$ be the solution whose trajectory starts at $\hat\omega^*$, and let $\bar r^*$ be the end value of $r$ for this solution. By Theorem 2.1, $\mu^*(\bar r^*) > 0$. Hence, by the continuity of solutions with respect to initial values, for small $\varepsilon$ there is a $\delta > 0$ such that if $\hat\omega \in N_\delta(\hat\omega^*)$ then $\mu > 0$ in $(\bar r^*, \bar r^* + \varepsilon/2]$ and $\omega \notin D$ in $[\bar r^* + \varepsilon/4, \bar r^* + \varepsilon/2]$.

We first show that $\bar r$ depends on $\hat\omega$ continuously in $N_\delta(\hat\omega^*)$. Let $\hat\omega^k$ be a sequence in $D$ such that $\hat\omega^k \to \hat\omega^*$ as $k \to \infty$, and let $\bar r^k$, $\bar\omega^k$ and $\bar\mu^k$ be the corresponding end values of the solutions. Let $\varepsilon_n \to 0^+$ as $n \to \infty$. For each $n$, repeating the argument of the previous paragraph, we can show that $\omega^k \notin D$ in $[\bar r^* + \varepsilon_n/4, \bar r^* + \varepsilon_n/2]$ if $k$ is large enough. Hence
\[
\limsup_{k\to\infty}\bar r^k \le \bar r^* + \varepsilon_n/4
\]
for each $n$. This shows that $\limsup_{k\to\infty}\bar r^k \le \bar r^*$. On the other hand, by Theorem 2.2, for each $\varepsilon_n$, the distance between $\omega^*(r)$ and $\partial D$ for $r \in [\hat r, \bar r^* - \varepsilon_n]$ has a positive lower bound, say, $d_n$. Hence, by the continuity of solutions with respect to initial values, if $k$ is large enough, the distance between $\omega^k(r)$ and $\partial D$ for $r \in [\hat r, \bar r^* - \varepsilon_n]$ is at least $d_n/2$. This implies that
\[
\liminf_{k\to\infty}\bar r^k \ge \bar r^* - \varepsilon_n
\]
for each $n$. Since $n$ is arbitrary, it follows that $\liminf_{k\to\infty}\bar r^k \ge \bar r^*$. This proves the assertion. Now, the convergence of $\bar\omega^k$ and $\bar\mu^k$ follows directly from $\bar r^k \to \bar r^*$ and the continuous dependence of solutions on initial values.

It follows from the above theorem that the set of exiting points is relatively open in $D$. Hence the set of connecting points is closed. Next, we present some properties of the ADM mass $m = \frac{r}{2}(1-\mu)$. We first show that $m$ is always nondecreasing.

Theorem 2.4. Suppose $\mu > 0$ in an interval $(\hat r, \tilde r)$, where $\tilde r > \hat r$ is a constant. Then $m$ is nondecreasing in $(\hat r, \tilde r)$. Furthermore, $m' > 0$ for all $r \in (\hat r, \tilde r)$ unless both $\omega_1$ and $\omega_2$ are constant.

Proof. By computation, using the equation for $\mu$ in (1.4) and the definition of $\Phi$ in (1.3),
\[
\begin{aligned}
m'(r) &= \mu G - \frac{2}{r^2}\,P\\
&= \mu\bigl(\omega_1'^{\,2} + \omega_2'^{\,2}\bigr) + \frac{1}{4r^2}\Bigl[\bigl(\omega_1^2 - 2\bigr)^2 + \bigl(\omega_1^2 - \omega_2^2\bigr)^2 + \bigl(\omega_2^2 - 2\bigr)^2\Bigr] \ge 0.
\end{aligned}
\]
Hence $m$ is always nondecreasing. If there is an $r_0 \in (\hat r, \tilde r)$ at which $m' = 0$, then
\[
\omega_i'(r_0) = 0, \qquad \omega_i^2(r_0) = 2
\]
for $i = 1, 2$. Hence each $\omega_i$ is constant by the uniqueness of solution.
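Theorems 2.1 and 2.4 can be illustrated with a rough numerical experiment. The sketch below is not the method of this paper: it is a plain fourth-order Runge–Kutta integration of (1.4), written under several assumptions that are ours. Because the system is singular at the horizon, the integration starts at a small offset `eps` from it with first-order Taylor data, where the derivative of each field variable at the horizon is taken to be minus the corresponding partial derivative of P divided by Φ, which is what boundedness of the second derivatives in (1.4) requires when μ = 0. The radius 3, the initial point (0.5, 1.0), the step size, and all function names are illustrative choices.

```python
import math

SQRT2 = math.sqrt(2.0)

def P(w1, w2):
    return -((w1**2 - 2)**2 + (w1**2 - w2**2)**2 + (w2**2 - 2)**2) / 8.0

def grad_P(w1, w2):
    return (w1 * (1 - w1**2 + 0.5 * w2**2),
            w2 * (1 - w2**2 + 0.5 * w1**2))

def rhs(r, y):
    """First-order form of (1.4); y = (mu, w1, w2, w1', w2')."""
    mu, w1, w2, v1, v2 = y
    G = v1 * v1 + v2 * v2
    Phi = r * (1 - mu) + 4 * P(w1, w2) / r
    p1, p2 = grad_P(w1, w2)
    return (
        (Phi / r - 2 * G * mu) / r,
        v1,
        v2,
        -(Phi * v1 + p1) / (r * r * mu),
        -(Phi * v2 + p2) / (r * r * mu),
    )

def rk4_step(r, y, h):
    def shifted(k, c):
        return tuple(yi + c * ki for yi, ki in zip(y, k))
    k1 = rhs(r, y)
    k2 = rhs(r + h / 2, shifted(k1, h / 2))
    k3 = rhs(r + h / 2, shifted(k2, h / 2))
    k4 = rhs(r + h, shifted(k3, h))
    return tuple(yi + h / 6 * (a + 2 * b + 2 * c + d)
                 for yi, a, b, c, d in zip(y, k1, k2, k3, k4))

def integrate(r_hat, w_hat, r_max, h=2e-4, eps=1e-3):
    """Integrate outward from r_hat + eps until r_max, a crash, or an exit from D."""
    w1, w2 = w_hat
    Phi_hat = r_hat + 4 * P(w1, w2) / r_hat
    p1, p2 = grad_P(w1, w2)
    v1, v2 = -p1 / Phi_hat, -p2 / Phi_hat   # assumed regularity at the horizon
    mu1 = Phi_hat / r_hat**2                # mu'(r_hat)
    r = r_hat + eps
    y = (mu1 * eps, w1 + v1 * eps, w2 + v2 * eps, v1, v2)
    samples = [(r, y)]
    while r < r_max:
        y = rk4_step(r, y, h)
        r += h
        in_D = abs(y[1]) <= SQRT2 and abs(y[2]) <= SQRT2
        if y[0] <= 0 or not in_D:
            break
        samples.append((r, y))
    return samples

samples = integrate(r_hat=3.0, w_hat=(0.5, 1.0), r_max=8.0)
mus = [y[0] for _, y in samples]
masses = [0.5 * r * (1 - y[0]) for r, y in samples]
print(len(samples), min(mus), max(mus), masses[0], masses[-1])
```

Along the computed trajectory one can inspect `mus` and `masses` to see the metric variable staying strictly between 0 and 1 and the mass function not decreasing, in line with the two theorems; the experiment says nothing about whether this particular initial point is connecting or exiting.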
We next show that $m$ has an upper bound which depends only on the number of sign changes of the field functions $\omega_1$ and $\omega_2$. We first prove the following lemmas.

Lemma 2.1. Suppose $\mu > 0$ and $\omega \in D$ in the interval $(\hat r, \tilde r)$. Then there is a constant $M > 0$, independent of the initial value $\hat\omega$, such that $\mu G \le M$.

Proof. Let $\xi(\hat\omega; r) = \mu G$. Then by computation,
\[
r^2\xi' + (2r\xi + \Phi)\,G + 2\Bigl(\omega_1'\frac{\partial P}{\partial\omega_1} + \omega_2'\frac{\partial P}{\partial\omega_2}\Bigr) = 0. \tag{2.3}
\]
Since $D$ is bounded, there exists a constant $M_0 > 1$ such that
\[
\Bigl|\frac{\partial P}{\partial\omega_i}\Bigr| \le M_0, \qquad i = 1, 2,
\]
whenever $\omega \in D$. Hence, by Theorem 2.5 and the boundedness $|P| \le 1$ in $D$,
= r (1 − µ) + P = 2m + P ≥ 2m rˆ − = 2ˆr − > 0, r r rˆ rˆ
(2.4)
and ω1
√ ∂P ∂P + ω2 ≥ −M0 ω1 + ω2 ≥ −M0 2G. ∂ω1 ∂ω2
Suppose by contradiction that there is no upper bound for ˆ Then ξ for all r and ω. for each n > 0, there is a ωˆ n and an rn > rˆ such that ξ ω ˆ ; r ≥ n. Notice that n n ξ ωˆ n , rˆ = 0, rn can be chosen such that ξ ωˆ n , rn ≥ 0. Hence, for large n √ 2rn ξ ωˆ n ; rn + ≥ 2ˆr n ≥ 2M0 and
ξ ωˆ n ; rn G (rn ) = ≥ n ≥ 1. µ (rn )
√ G (rn ) and at r = rn ,
√ 2 ∂P ∂P rn ξ + (2rn ξ + ) G + 2 ω1 ≥ 2M0 G − M0 G > 0. + ω2 ∂ω1 ∂ω2
Hence G (rn ) >
This contradicts (2.3). Lemma 2.2. Letr˜ > rˆ . Then, there is a constant B > 0, depending on r˜ nonincreasingly, such that ωi (˜r ) ≤ B, i = 1, 2 for any trajectory ω that stays in D for r ∈ rˆ , r˜ ,
Proof. We first observe that there is a constant σ > 0 such that if ω (r) ∈ D then
(r) ≥ σ . This can be seen from (2.4) and the assumption rˆ > 2. Write the equation for ωi in the form
1 ∂P ωi = − 2
ωi + . (2.5) r µ ∂ωi Since ω ∈ D in rˆ , r˜ , there is a constant, say M1 such that ∂P ∂ω ≤ M1 . i Let r ∈ rˆ , r˜ . Assume first that ωi does not change sign in rˆ , r . Without loss of generality, we may further assume that ωi ≥ 0 in this interval. If there is an r¯ such that rˆ < r¯ < r and ωi (¯r ) < 2M1 /σ , then it is necessary that ωi (r) ≤ 2M1 /σ . Because otherwise, there would be an r ∗ ∈ (¯r , r) such that 2M1 ωi r ∗ = , σ However, by (2.5) 1 ωi r ∗ ≤ − ∗2 r µ
and
ωi r ∗ ≥ 0.
2 (r ∗ ) M1 − M1 σ
≤−
M1 < 0, r ∗2 µ
which is impossible. Hence ωi (r) ≤ 2M1 /σ . If no such r¯ exists, then ωi ≥ 2M1 /σ in rˆ , r . Hence, by (2.5)
ωi (s) ≤ −
1 M1 (2M1 − M1 ) = − 2 s2µ s µ
for any s ∈ rˆ , r . Hence, by Lemma 2.1, −ωi 1 M1 M1 M1 M2 = , ≥ 2 ≥ 2 2 ≥ ωi s µG s2 ωi s 2 µ ωi where M2 is an upper bound of µG guaranteed by Lemma 2.1. Integrating from rˆ to r˜ with respect to s, we have
r − rˆ 1 1 1 1 − ≥ M1 M2 = + M 1 M2 . ωi (r) rˆ r rˆ r ωi rˆ Hence, ωi (r) ≤
rˆ r . M1 M2 (r−ˆr )
Finally, if ωi changes the sign in (r0 , r), then there is
an r¯ in this interval such that ωi (¯r ) = 0 < 2M1 /σ . Thus, by the above argument, ωi (r) ≤ 2M1 /σ . Therefore, in any case, we can choose 2M1 rˆ r˜ . B = max , σ M1 M2 r˜ − rˆ We now prove that m is bounded above by a constant that depends only on the number of times the field functions change signs.
Theorem 2.5. For each nonnegative integer n, there is a constant Mn > 0 such that if both ω1 and ω2 of a solution (µ, ω1 , ω2 ) do not change signs for more than n times while the trajectory ω is in D, then m ≤ Mn as long as ω ∈ D. Proof. Since µ > 0 in D, for r ≤ rˆ + 1, m (r) =
r rˆ + 1 . (1 − µ) ≤ 2 2
For r ≥ rˆ + 1, we integrate m = µG − with respect to r to obtain
1 P, 2r 2
r 1 ds ω12 + ω22 ds + 2 rˆ +1 rˆ +1 2s r 1 rˆ + 1 + ω12 + ω22 ds. + ≤ 2 2 rˆ + 1 rˆ +1 ωi changes its sign at s1 , . . . , sk ∈ rˆ + 1, r , (k ≤ n). Let B be the bound of Suppose ω (r) for r > rˆ + 1 guaranteed by Lemma 2.2. Then i r k ωi (sj +1 ) k √ 2 ωi sj +1 − ωi sj ≤ 2 2nB. ≤ B ωi ds ≤ dω ω (s) i i ωi (sj ) rˆ +1 m (r) ≤ m rˆ + 1 +
r
j =1
j =1
Hence m (r) ≤
√ rˆ + 1 1 + 4 2nB. + 2 2 rˆ + 1
Theorems 2.4, 2.5 show that for any solution if its field functions change signs only finitely many times, the ADM mass must be finite. Recall that the purpose of this paper is to show that the system has all kinds of black hole solutions, each has field functions changing signs finitely many times. Hence, each has a finite ADM mass. Using Lemma 2.2, we obtain for future use a positive lower bound of µ (r) which is independent of ω. Proposition 2.1. Let r˜ > rˆ . Then there is a function δ (˜r ) > 0, depending on r˜ increasingly, such that µ (˜r ) ≥ δ (˜r ) for any trajectory that stays in D for r ∈ rˆ , r˜ . Proof. By Lemma 2.2, there is a constant M such that
2 2 rˆ + r˜ G = ω1 + ω2 ≤ M in , r˜ . 2 Let a = rˆ + r˜ /2 and let y be the solution of the initial value problem σ , r y (a) = 0,
ry + 2My =
for r > a,
where σ > 0 is a positive lower bound of while ω ∈ D. Then, by the comparison principle,
rˆ + r˜ 2M σ 1− > 0. µ (˜r ) ≥ y (˜r ) = 2M 2˜r 3. The Convergence of Connecting Trajectories In this section, we consider solutions whose trajectory ω (r) stays in D for all r > rˆ . It is easy to see that the system (1.4) has nine equilibria for (ω1 , ω2 ): √ √ (0, 0) , (±1, 0) , (0, ±1) , ± 2, ± 2 . The purpose of this section is to show that unless ω (r) is itself one of these equilibria, any solution starting at a connecting point tends to a constant limit lim (µ (r) , ω1 (r) , ω2 (r)) = (1, ω¯ 1 , ω¯ 2 ) ,
r→∞
√ √ where ω¯ i is either 2 or − 2. The following theorem is actually more general. It only assumes that the trajectory is uniformly bounded, regardless whether it is in D. Theorem 3.1. Suppose (µ (r) , ω1 (r) , ω (r)) is a solution of problem (1.4)–(1.5) such that µ (r) > 0 and ω (r) ≡ (ω1 (r) , ω2 (r)) is uniformly bounded for all r > rˆ , then limr→∞ µ (r) = 1 and the limit of ω (r) exists and is an equilibrium. Furthermore, for any i = 1, 2, if ωˆ i ≡ ωi rˆ = 0 then limr→∞ ωi (r) = 0. The proof is long. We divide it into several steps. We first show that µ → 1 as r → ∞. Lemma 3.1. Suppose the condition of Theorem 3.1 holds. Then limr→∞ µ (r) = 1. Proof. Assume the opposite. Then the mass m, being nondecreasing and unbounded, necessarily tends to ∞. Hence
4 lim (r) = lim 2m (r) + P (ω) = ∞ r→∞ r→∞ r because P (ω) is bounded. We show that for each j = 1, 2, limr→∞ ωj = 0. First observe that since ω is bounded, lim inf r→∞ ωj = 0. This is obvious because ω is bounded. Also since ω is bounded, there is a upper bounded M such that ∂P /∂ωj ≤ M for all r and j . Let ε > 0 be fixed. Since → ∞, we can choose r0 > 0 so large that ε (r0 ) > 2M. Increasing r0 if necessary, we may assume also that ωj (r0 ) < ε. We show that ω (r) < ε for r > r0 . Suppose this is not true. Assume first that ω (r0 ) ≥ 0 j
j
and there is a r1 > r0 such that ωj (r1 ) = ε, and 0 ≤ ωj (r) < ε for r < r1 . Then
−1 ∂P −1 ≤ 2 (2M − M) < 0. ωj (r1 ) = 2
ωj + ∂ωj r=r1 r1 µ r1 µ
This is impossible. Hence no such r1 exists. If on the other hand, ωj (r0 ) ≤ 0 and there is a r1 > r0 such that ωj (r1 ) = −ε, and −ε < ωj (r) ≤ 0 for r < r1 . Then
∂P −1 −1 ωj (r1 ) = 2 ≥ 2 (−2M + M) > 0.
ωj + ∂ωj r=r1 r1 µ r1 µ This is again impossible. This proves that ωj (r) < ε for all r > r0 . To show that limr→∞ µ = 1, we use the variable τ = ln r and write the equation for µ in (1.4) in the form 2 2 4 µ˙ + 1 + 2 ω1 + 2 ω2 µ = 1 + 2 P (ω) , r where upper the dot “ · ” represents d/dτ . Let δ > 0. Since 0 < µ < 1, P is bounded, and ωj → 0, there is a T such that µ˙ + µ > 1 − δ/2
for τ > T .
Compare µ with the solution of the equation ξ˙ + ξ = 1 − δ/2,
ξ (T ) = µ(T ).
The comparison principle implies that µ > ξ for all τ > T . It is clear that ξ → 1 − δ/2. Hence, µ > 1 − δ for large τ . Since µ < 1 for all τ , it follows that lim µ = lim µ = 1.
r→∞
τ →∞
Throughout the remainder of this paper, we use $\dot f$ to denote $r\,\frac{df}{dr}$ for any differentiable function $f$. This is equivalent to introducing a new variable $\tau = \ln r$. It is often more convenient not to explicitly change variable from $r$ to $\tau$. In terms of this operator, the field equation in (1.1) can be written as
\[
\mu\ddot\omega_i + \Bigl(1 - 2\mu + \frac{4}{r^2}\,P(\omega)\Bigr)\dot\omega_i + \frac{\partial P}{\partial\omega_i} = 0, \qquad i = 1, 2. \tag{3.1}
\]
We next show that $\dot\omega_i \to 0$ as $r \to \infty$ for each $\omega_i$ of a uniformly bounded trajectory.

Lemma 3.2. Suppose the condition of Theorem 3.1 holds. Then $\lim_{r\to\infty}\dot\omega_i = 0$ for each $i = 1, 2$.

Proof. Define the energy function
\[
H = \frac{1}{2}\,\mu\dot\omega^2 + P(\omega),
\]
where $\dot\omega^2 \equiv \dot\omega_1^2 + \dot\omega_2^2$. By computation
\[
\dot H = \dot\omega^2\Bigl(\frac{3}{2}\,\mu - \frac{1}{2} - \frac{2}{r^2}\,P(\omega) - \frac{1}{r^2}\,\mu\dot\omega^2\Bigr). \tag{3.2}
\]
Since P (ω) is bounded and by Lemma 3.1, µ → 1, it is clear that if ω˙ is also bounded, then H˙ ≈ ω˙ 2 for large r. We prove the boundedness of ω˙ as follows. Let M be an upper bound of |P (ω)| and |∇P (ω)|, and let r0 > rˆ be so large that M/r02 < 1/24. Since
µ → 1, we may choose r0 larger if necessary so that µ (r) ≥ 5/6 for all r ≥ r0 . We show that ω˙ 2 ≤ 8M 2 for all r > r0 . Suppose this is not true. Without loss of generality, we may assume that there is r1 > r0 such that ω˙ 1 (r1 ) > 2M. By Eq. (3.1), at any point r > r0 at which ω˙ 1 > 2M, we have
4 ∂P 1 5 −1− M − M = 0. (3.3) >2 µω¨ 1 (r) = 2µ − 1 + 2 P ω˙ 1 (r) − r ∂ω1 3 6 Furthermore, ω¨ 1 cannot become zero again because if it did, say first at an r2 after r1 , then ω˙ 1 (r2 ) ≥ ω˙ 1 (r1 ) > 2M. However, ω¨ 1 (r2 ) = 0 by (3.3). This contradiction shows that ω˙ 1 ≤ 2M for all r > r0 . This proves the boundedness of ω˙ 1 . As a consequence of the boundedness of P and ω, ˙ it follows from (3.2) that there is a r ∗ > rˆ such that 2 H˙ ≥ ω˙ 2 3
for r > r ∗ .
(3.4)
We show that lim inf r→∞ ω˙ 2 = 0. If not, by (3.4), H → ∞ as r → ∞. From the definition of H , since P (ω) is bounded, it follows that ω˙ 2 → ∞. This contradicts the boundedness of ω. ˙ We now show that limr→∞ ω˙ 2 = 0. Suppose this is not true. Then there exists δ > 0 and sequences {sn }, {tn } such that r0 < . . . < sn < tn < sn+1 < . . . , sn → ∞, tn → ∞, 1 2 µω˙ (sn ) ≥ δ, 2
and
1 2 µω˙ (tn ) ↓ 0 2
as n → ∞. From the field equations and the boundedness of ω˙ and ∇P , ω¨ is also bounded. Hence there is an ε > 0 such that ω˙ 2 (r) ≥ δ/2 in (sn − ε, sn + ε). It is clear that (tn , tn+1 ) ⊃ (sn+1 − ε, sn+1 + ε) . Now, since H˙ ≥ 23 ω˙ 2 for large r, it follows that for large n,
1 2 1 2 2 τn+1 2 ω˙ dτ µω˙ (tn+1 ) + P (ω (tn+1 )) ≥ µω˙ (tn ) + P (ω (tn )) + 2 2 3 τn 1 2 ≥ µω˙ 2 (tn+1 ) + P (ω (tn )) + δε, 2 3
where τi = ln ti for i = 1, 2, . . . . This implies that 2 P (ω (tn+1 )) ≥ P (ω (tn )) + δε 3 for large n. This contradicts the boundedness of P . The lemma is thus proved.
In the next step, we show that $\omega$ has a limit as $r \to \infty$, and that the limit is an equilibrium.

Lemma 3.3. Suppose the condition of Theorem 3.1 holds. Then the field function $\omega = (\omega_1, \omega_2)$ has a limit as $r \to \infty$, and the limit is one of the equilibria of the system.
Proof. We first show that $\lim_{\tau\to\infty}\nabla P = 0$. Suppose this is not true. Then there is a constant $\varepsilon_0 > 0$, a component $\omega_j$ and a sequence $\{r_n\}$ such that $r_n \uparrow \infty$ and

$$\frac{\partial P}{\partial\omega_j}(r_n) \ge \varepsilon_0 \quad\text{for all } n.$$

Since $\dot\omega^2 \to 0$, there is $\tilde r > 0$ such that $\dot\omega_j < \delta \equiv \frac{\varepsilon_0}{2(M+3)}$ for $r \ge \tilde r$, where $M$ is an upper bound for $\frac{1}{\mu}\bigl(2\mu - 1 - 4P/r^2\bigr)$ for $r > \tilde r$. However, from the field equation (3.1) we see that

$$\dot\omega_j(e r_n) \ge -\dot\omega_j(r_n) + \int_{\ln r_n}^{\ln r_n + 1}\ddot\omega_j\,d\tau \ge -\dot\omega_j(r_n) + \int_{\ln r_n}^{\ln r_n + 1}\frac{\partial P}{\partial\omega_j}\,d\tau - \int_{\ln r_n}^{\ln r_n + 1}\frac{1}{\mu}\Bigl(2\mu - 1 - \frac{4}{r^2}P\Bigr)\dot\omega_j\,d\tau \ge -3\delta + \varepsilon_0 - \delta M = \frac{\varepsilon_0}{2} > \delta.$$

This is a contradiction. This proves the assertion.

Finally, let $\Omega$ denote the "$\omega$-limit set" defined by

$$\Omega = \bigl\{w = (w_1, w_2) : \text{there is a sequence } r_n \to \infty \text{ such that } \lim_{n\to\infty}\omega(r_n) = w\bigr\}.$$

Since $\omega$ stays in a bounded set, $\Omega$ is clearly nonempty. Also $\lim_{r\to\infty}\omega$ exists if and only if $\Omega$ is a singleton. Suppose the limit does not exist. We show that $\Omega$ has infinitely many points. Let $w^1$ and $w^2$ be two different points of $\Omega$. Then there are two sequences $\{s_n\}$ and $\{t_n\}$ such that $\lim_{n\to\infty}\omega(s_n) = w^1$ and $\lim_{n\to\infty}\omega(t_n) = w^2$. We may choose the sequences such that $s_n < t_n < s_{n+1} < t_{n+1} < \dots$ for all $n$. Choose $\varepsilon > 0$ such that the $\varepsilon$-neighborhoods $N(w^1, \varepsilon)$ and $N(w^2, \varepsilon)$ of $w^1$ and $w^2$ do not intersect. Then for each $n$, there is an $r_n$ between $s_n$ and $t_n$ such that $\omega(r_n)$ is outside both neighborhoods $N(w^1, \varepsilon)$ and $N(w^2, \varepsilon)$. Since $\{\omega(r_n)\}$ is bounded, it has a limit point $w^0$. Clearly $w^0 \in \Omega$ and is outside of $N(w^1, \varepsilon)$ and $N(w^2, \varepsilon)$. This shows that for any pair of distinct points of $\Omega$, there is another point. Hence $\Omega$ has infinitely many points.

To see that this is impossible, we notice that each point of $\Omega$ is a zero of $\nabla P$. Indeed, from the preceding paragraph, if $\lim_{n\to\infty}\omega(r_n) = w$, then

$$\nabla P(w) = \lim_{n\to\infty}\nabla P(\omega(r_n)) = \lim_{r\to\infty}\nabla P(\omega(r)) = 0.$$

Thus, since $\nabla P$ has only finitely many zeros, $\Omega$ is necessarily a singleton. This proves that the limit $\lim_{\tau\to\infty}\omega$ exists and is a zero of $\nabla P$. It is clear from the equations that any zero of $\nabla P$ is an equilibrium.

Our final step is to show that the limit of any field function $\omega_i$ is nonzero if its initial value $\hat\omega_i$ is nonzero. This would complete the proof of Theorem 3.1.

Lemma 3.4. Let the condition of Theorem 3.1 hold. Then for each $\omega_i$ with the initial value $\omega_i(\hat r) \equiv \hat\omega_i \ne 0$, $\lim_{r\to\infty}\omega_i(r) \ne 0$.
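Proof. Suppose the opposite holds. Without loss of generality, we may assume that there is a solution $(\mu, \omega_1, \omega_2)$ such that $\hat\omega_1 \ne 0$, but $\lim_{r\to\infty}\omega_1 = 0$. Let

$$\bar\omega \equiv (0, \bar\omega_2) = \lim_{r\to\infty}\omega(r).$$

Since it is an equilibrium, $\bar\omega_2$ is either $0$ or $\pm 1$. Let $\xi_j = \dot\omega_j$ for $j = 1, 2$. The equations for $\omega_j$ can be written as a system of first order equations

$$\dot\omega_1 = \xi_1, \qquad \dot\omega_2 = \xi_2,$$
$$\dot\xi_1 = -\frac{1}{\mu}\Bigl(1 - 2\mu + \frac{4}{r^2}P\Bigr)\xi_1 - \frac{\omega_1}{\mu}\Bigl(1 - \omega_1^2 + \frac{1}{2}\omega_2^2\Bigr),$$
$$\dot\xi_2 = -\frac{1}{\mu}\Bigl(1 - 2\mu + \frac{4}{r^2}P\Bigr)\xi_2 - \frac{\omega_2}{\mu}\Bigl(1 - \omega_2^2 + \frac{1}{2}\omega_1^2\Bigr).$$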
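In a neighborhood of the equilibrium $\bar\omega$, the system can be written as a perturbed system

$$\dot\eta_j = \xi_j, \qquad \dot\xi_j = \xi_j - \sum_{k=1,2}\eta_k\frac{\partial^2 P}{\partial\omega_j\partial\omega_k}(\bar\omega) + \varepsilon_j(r, \eta, \xi), \quad j = 1, 2, \qquad (3.5)$$

where $\eta_j = \omega_j - \bar\omega_j$, and $\varepsilon_j(\tau, \eta, \xi)$ is a smooth function in $\eta \equiv (\eta_1, \eta_2)$ and $\xi \equiv (\xi_1, \xi_2)$ such that $\varepsilon_j(\tau, 0, 0) = 0$. Notice that since $\bar\omega_1 = 0$, the linear part of the equations for $(\eta_1, \xi_1)$ is independent of $(\eta_2, \xi_2)$ and vice versa. Specifically, the linearized equations for $(\eta_1, \xi_1)$ are

$$\dot\eta_1 = \xi_1, \qquad \dot\xi_1 = \xi_1 - \eta_1\Bigl(1 + \frac{1}{2}\bar\omega_2^2\Bigr),$$

whose eigenvalues are complex with positive real parts, and the equations for $(\eta_2, \xi_2)$ are

$$\dot\eta_2 = \xi_2, \qquad \dot\xi_2 = \xi_2 - \eta_2\bigl(1 - 3\bar\omega_2^2\bigr). \qquad (3.6)$$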
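The eigenvalue claims for the two linearized planar systems can be checked directly. Each system has the form $\dot\eta = \xi$, $\dot\xi = \xi - c\,\eta$, whose characteristic polynomial is $\lambda^2 - \lambda + c$. The sketch below is only an illustration; the helper names `eigenvalues` and `stable_dimension` are hypothetical.

```python
import cmath

def eigenvalues(c):
    """Roots of lambda^2 - lambda + c = 0 for eta' = xi, xi' = xi - c * eta."""
    disc = cmath.sqrt(1 - 4 * c)
    return (1 + disc) / 2, (1 - disc) / 2

def stable_dimension(c):
    """Number of eigenvalues with negative real part."""
    return sum(1 for lam in eigenvalues(c) if lam.real < 0)

for w2_bar in (0.0, 1.0, -1.0):
    c1 = 1 + 0.5 * w2_bar**2   # coefficient in the (eta1, xi1) equations
    c2 = 1 - 3 * w2_bar**2     # coefficient in (3.6)
    print(w2_bar, eigenvalues(c1), stable_dimension(c1), stable_dimension(c2))
```

In every case the $(\eta_1, \xi_1)$ block has no stable direction. For $\bar\omega_2 = \pm 1$ the block (3.6) has eigenvalues $2$ and $-1$, so it contributes exactly one stable direction; for $\bar\omega_2 = 0$ it contributes none.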
This implies that the stable manifold for the linearized system is a subset of the subspace $S_2 = \{\eta_1 = 0, \xi_1 = 0\}$. Let $k$ be the dimension of this stable manifold. We show that the stable manifold of the perturbed system (3.5) also lies in $S_2$. Since $\varepsilon_i(\tau, 0, 0) = 0$, it is well known that the stable manifold of the perturbed system is homeomorphic to the stable manifold of the linearized system (see e.g. Chapter 4, Theorem 3.1 of [3]). Hence the stable manifold of the perturbed system again has dimension $k$. Observe next that $S_2$ is invariant under the system (3.5). To see this, consider the system (1.4), which is equivalent to (3.5). If $\omega_1(\hat r) = 0$, then

$$\omega_1'(\hat r) = -\omega_1(\hat r)\Bigl(1 - \omega_1^2(\hat r) + \frac{1}{2}\omega_2^2(\hat r)\Bigr)/\hat r = 0.$$
Hence by the uniqueness of (local) solutions, $\omega_1(r) = 0$ for all $r > \hat r$. This corresponds to $\eta_1 = \xi_1 = 0$. Hence $S_2$ is invariant. Restricting system (3.5) to $S_2$, we find that its linearized equations about $(0, \bar\omega_2)$ are again given by (3.6). Hence the dimension of the stable manifold of the restricted system is also $k$. The same dimensionality implies that the stable manifold of (3.5) is contained entirely in $S_2$. Hence any trajectory that approaches $\bar\omega$ as $r \to \infty$ must be in $S_2$ for all $r > \hat r$. This contradicts the assumption $\hat\omega_1 \ne 0$.

This completes the proof of Theorem 3.1 about the convergence of connecting trajectories. In the next section, we examine the number of zeros of $\omega_i$ of connecting and exiting trajectories.

4. Zero-Numbers of Trajectories

In this section, we discuss the properties of the zero-numbers $n_i(\hat\omega)$ at exiting and connecting points. These numbers can be viewed as integer-valued functions in $D$. It will be shown that the variation of $n_i(\hat\omega)$ at any point (i.e., the difference between the maximum and minimum in neighborhoods as the neighborhoods reduce to the point) is at most one. Also, each $n_i$ is upper semicontinuous at any exiting point, and lower semicontinuous at any connecting point. The following lemma is needed.

Lemma 4.1. Suppose $\omega^*$ is a trajectory that stays in $D$ for $r \in [\hat r, \tilde r]$, where $\tilde r > \hat r$ is a constant. Suppose also that the component $\omega_i^*$ has $n$ zeros in $[\hat r, \tilde r]$, and is nonzero at $\tilde r$. Then there is an $\bar r > \tilde r$ and a neighborhood $N(\hat\omega^*)$ of the initial point $\hat\omega^*$ of $\omega^*$ such that any trajectory starting in $N(\hat\omega^*)$ has exactly $n$ zeros in $[\hat r, \bar r]$.

Proof. Since by Theorem 2.1, solutions do not crash while the trajectory stays in $D$, the differential equations are not singular for $r \in (\hat r, \bar r]$. Hence the continuous dependence of solutions on initial values holds while solutions stay in $D$. This implies that there is an $\bar r > \tilde r$ and a neighborhood $N(\hat\omega^*)$ such that any trajectory $\omega$ starting in $N(\hat\omega^*)$ does not crash for $r \in [\hat r, \bar r]$ and $\omega_i(\bar r) \ne 0$. We may also assume that $\omega_i^*$ has no additional zero in $[\tilde r, \bar r]$. Suppose by contradiction that there is a sequence of initial points $\hat\omega^k \to \hat\omega^*$ as $k \to \infty$, such that the $i$th component $\omega_i^k$ has the number of zeros $n_k \ne n$. We show that this is impossible.

We first show that for large $k$, $n_k \ge n$. Let $r_1 < r_2 < \dots < r_n$ be the zeros of $\omega_i^*$ in $[\hat r, \bar r]$. Note that since $\omega_i^*$ is not constantly zero, it follows that $r_1 > \hat r$, and $(\omega_i^*)' \ne 0$ at any of these points. Hence, there is an $\varepsilon > 0$ such that $\omega_i^*(r_j - \varepsilon)$ and $\omega_i^*(r_j + \varepsilon)$ have opposite signs for $j = 1, \dots, n$. By the continuity of dependence on initial values, for each $j$ and large $k$, $\omega_i^k(r_j - \varepsilon)$ and $\omega_i^k(r_j + \varepsilon)$ also have opposite signs. Hence there is a zero of $\omega_i^k$ in $(r_j - \varepsilon, r_j + \varepsilon)$. The assertion follows if we choose $\varepsilon$ smaller than each of $r_{j+1} - r_j$.

We next show that for large $k$, $n_k \not> n$. Suppose $\omega_i^k$ has at least $n + 1$ zeros for infinitely many $k$. Let $r_1^k < r_2^k < \dots < r_{n+1}^k$ be $n + 1$ of them in $[\hat r, \bar r]$. Since each sequence $\{r_j^k\}_{k=1}^\infty$ is bounded, we may choose a subsequence so that it has a limit. Let the limit be denoted by $r_j$. Clearly, $r_j$ is a zero of $\omega_i^*$. Since $\omega_i^*$ has only $n$ zeros in $[\hat r, \bar r]$, there are two sequences $r_j^k$ and $r_{j+1}^k$ that converge to the same limit. Let us call this
limit $s$. Hence $\omega_i^*(s) = 0$. However, by the mean value theorem, there is $s^k \in (r_j^k, r_{j+1}^k)$ such that $(\omega_i^k)'(s^k) = 0$. Taking $k \to \infty$, we see that $(\omega_i^*)'(s) = 0$. By the uniqueness of solutions, $\omega_i^* \equiv 0$. This contradicts the assumption that $\omega_i^*$ has only finitely many zeros.

We first consider $n_i(\hat\omega)$ at an exiting point.

Theorem 4.1. Let $\omega^*$ be an exiting trajectory with the starting point $\hat\omega^*$ and the end point $\bar\omega^*$. Then there is a neighborhood $N(\hat\omega^*)$ of $\hat\omega^*$ such that (1) the zero-number $n_i$ is constant in $N(\hat\omega^*)$ if $\bar\omega_i^* \ne 0$, and (2)

$$n_i(\hat\omega^*) - 1 \le n_i(\hat\omega) \le n_i(\hat\omega^*) \quad\text{for any } \hat\omega \in N(\hat\omega^*)$$

if $\bar\omega_i^* = 0$ and $\omega_i^*$ is not a constant.

Proof. Suppose $\bar\omega_i^* \ne 0$. Then there is an $\varepsilon > 0$ such that $\omega_i^* \ne 0$ in $(\bar r^* - \varepsilon, \bar r^* + \varepsilon)$, where $\bar r^*$ is the end value of $r$ for $\omega^*$. By Theorem 2.3 and continuous dependence on initial values, there is a $\delta > 0$ such that any trajectory $\omega$ starting in the $\delta$-neighborhood $N_\delta(\hat\omega^*)$ has its end value $\bar r \in (\bar r^* - \varepsilon, \bar r^* + \varepsilon)$ and $\omega_i \ne 0$ in $(\bar r^* - \varepsilon, \bar r^* + \varepsilon)$. In view of Lemma 4.1, if $\delta$ is sufficiently small, $\omega_i$ and $\omega_i^*$ have the same number of zeros in $[\hat r, \bar r^*]$. Hence $n_i(\hat\omega) = n_i(\hat\omega^*)$. This proves part (1) of the theorem.

Suppose $\bar\omega_i^* = 0$. By the uniqueness of the solution, since $\omega_i^*$ is not constantly zero, $\bar r^*$ is an isolated zero of $\omega_i^*$. Hence there is an $\varepsilon > 0$ such that $\omega_i^* \ne 0$ in $(\bar r^* - \varepsilon, \bar r^*)$. Choose $\delta > 0$ so that any trajectory $\omega$ starting in $N_\delta(\hat\omega^*)$ has the end value $\bar r \in (\bar r^* - \varepsilon, \bar r^* + \varepsilon)$ and $\omega_i$ has the same number of zeros as $\omega_i^*$ in $[\hat r, \bar r^* - \varepsilon]$. We show that for small $\varepsilon$ and the corresponding $\delta$, $\omega_i$ can have no more than one zero in $(\bar r^* - \varepsilon, \bar r)$. Suppose the opposite is true. Then there are sequences $\varepsilon_k \to 0$ and $\hat\omega^k \to \hat\omega^*$ as $k \to \infty$, such that each $\omega_i^k$ has at least two zeros in $(\bar r^* - \varepsilon_k, \bar r^k)$. Let them be denoted as $s^k < t^k$. Then by the mean value theorem, there is an $r^k$ such that $s^k < r^k < t^k$ and $(\omega_i^k)'(r^k) = 0$. Since $s^k \to \bar r^*$, $t^k \to \bar r^*$ and $\omega_i^k \to \omega_i^*$ as $k \to \infty$, it follows that $\omega_i^*(\bar r^*) = (\omega_i^*)'(\bar r^*) = 0$. Hence by the uniqueness of solutions, $\omega_i^* \equiv 0$, contradicting the assumption of the theorem. Hence $n_i(\hat\omega) \le n_i(\hat\omega^*)$ and their difference is at most one. This proves part (2).

We now consider $n_i(\hat\omega)$ at a connecting point.

Theorem 4.2. Let $\hat\omega^* \in D$ be a connecting point such that both $\hat\omega_1^*$ and $\hat\omega_2^*$ are nonzero. Then there is a neighborhood $N(\hat\omega^*)$ in which

$$n_i(\hat\omega^*) \le n_i(\hat\omega) \le n_i(\hat\omega^*) + 1, \quad i = 1, 2 \qquad (4.1)$$

for any $\hat\omega \ne \hat\omega^*$ in $N(\hat\omega^*)$. Furthermore, if $n_i(\hat\omega) > n_i(\hat\omega^*)$ for some $\hat\omega \in N(\hat\omega^*)$, then $\hat\omega$ is exiting.

Proof. For simplicity in notation, we denote the zero-number $n_i(\hat\omega^*)$ by $n_i^*$. Let $(\mu^*, \omega_1^*, \omega_2^*)$ be the solution of problem (1.4), (1.5) that starts at $\hat\omega^*$. In view of Theorem 3.1, $\lim_{r\to\infty}\mu^* = 1$ and there is an equilibrium $p \in D$ such that $\lim_{r\to\infty}\omega^* = p$. Furthermore, by Theorem 2.2, if $\hat\omega^* \ne p$, then for each $r > \hat r$, $\omega^*(r)$ is in the interior
of $D$. Without loss of generality, we assume that $p = (-\sqrt2, -\sqrt2)$. Then, there is an $\tilde r > \hat r$ such that $\omega^*$ is in the square

$$C = \bigl\{-\sqrt2 \le \omega_i \le -1,\ i = 1, 2\bigr\}$$

for all $r > \tilde r$. By the continuous dependence of solutions on initial values, there is an $\varepsilon$-neighborhood $N_\varepsilon(\hat\omega^*)$ and an $r_0 > \tilde r$ such that any trajectory $\omega$ starting in $N_\varepsilon(\hat\omega^*)$ must stay in $D$ for $r \in [\hat r, r_0]$ and $\omega(r_0) \in C$. In view of Lemma 4.1, for small $\varepsilon$, $\omega_i$ has $n_i^*$ zeros in the interval $[\hat r, r_0]$. Hence, $n_i(\hat\omega) \ge n_i^*$. It remains to show that $\varepsilon$ can be chosen so small that if the trajectory $\omega$ leaves $C$ without leaving $D$, each $\omega_i$ can have at most one more zero before $\omega$ exits $D$. In the following, let $\bar r$ be the end value of $r$ for the trajectory (i.e., $\omega(\bar r) \in \partial D$), let $r_2 \le \bar r$ be the maximum of $r$ such that $\omega_i$ has no more than $n_i^* + 2$ zeros before exiting $D$, and let $\delta > 0$ be so small that $\delta < 1/3$ and

$$\frac{1}{2} + \frac{\sqrt2}{3}(2 - 5\delta) > \frac{4\sqrt2}{3\sqrt3\,(1 - 3\delta)}. \qquad (4.2)$$

By choosing $\varepsilon$ small enough, we may assume that $r_0$ is so large that

√ 4 4 2 1 1+ ≤ δ, ≤δ r2 r2 3 (4.3)

for $r > r_0$. Furthermore, by Theorem 2.5, $m$ has an upper bound $M_n$ for $r \in [\hat r, r_2]$ which is independent of $r$ and $\hat\omega$. Hence, choosing $\varepsilon$ smaller if necessary, we assume

$$\mu(r) = 1 - \frac{2m}{r} \ge 1 - \frac{2M_n}{r} \ge 1 - \delta \qquad (4.4)$$

for $r \in (r_0, r_2)$. We first show that if at any $r^* \in (r_0, r_2)$

$$\dot\omega_i > \frac{4\sqrt2}{3\sqrt3\,(1 - 3\delta)}, \qquad (4.5)$$

then $\dot\omega_i$ will be increasing in $(r^*, r_2)$. To see this, we use conditions (4.3) and (4.4) in Eq. (3.1) to obtain

$$\mu\ddot\omega_i(r^*) \ge (1 - 3\delta)\,\dot\omega_i(r^*) - \omega_i\Bigl(1 - \omega_i^2 + \frac{1}{2}\omega_j^2\Bigr) > \frac{4\sqrt2}{3\sqrt3} - \omega_i\Bigl(1 - \omega_i^2 + \frac{1}{2}\omega_j^2\Bigr), \quad (j \ne i).$$

By simple calculation, it can be shown that

$$\sup_{\omega\in D}\ \omega_i\Bigl(1 - \omega_i^2 + \frac{1}{2}\omega_j^2\Bigr) = \frac{4\sqrt2}{3\sqrt3}.$$
Hence $\ddot\omega_i(r^*) > 0$. If $\ddot\omega_i(r) = 0$ for some $r \in (r^*, r_2)$, then since at this point (4.5) also holds, the same argument would lead to $\ddot\omega_i(r) > 0$. This is impossible. Hence the assertion follows.

To show that $\omega_i$ cannot have more than two zeros before $\omega$ exits $D$, it suffices to show that by the time $\omega_i$ reaches its first zero after leaving $C$, $\dot\omega_i$ is higher than the threshold value that is on the right side of (4.5). The next lemma provides such a lower bound.

Lemma 4.2. Suppose $\varepsilon$ is chosen such that conditions (4.3) and (4.4) hold. Then each component $\omega_i$ of a trajectory starting in $N_\varepsilon(\hat\omega^c)$ can have at most one zero in $(r_0, \bar r)$. At such a zero, if it exists,

$$\dot\omega_i \ge \frac{1}{2} + \frac{\sqrt2}{3}(2 - 5\delta).$$

Proof. Let $r_1$ denote the first zero of $\omega_i$ in $(r_0, \bar r)$. Since $\omega_i(r_0) \le -1$ and $\omega_i(r_1) = 0$, there is an $r_1' \in (r_0, r_1)$ such that $\omega_i(r_1') = -1$ and $\dot\omega_i(r_1') \ge 0$. Furthermore, from Eq. (3.1), we see that

$$\mu\ddot\omega_i = -\omega_i\Bigl(1 - \omega_i^2 + \frac{1}{2}\omega_j^2\Bigr) > 0, \quad (j \ne i)$$

whenever $\dot\omega_i = 0$ and $-1 \le \omega_i \le 0$. Hence $\dot\omega_i \ge 0$ in $[r_1', r_1]$. Define an "individual energy function" $H_i$ by

$$H_i = \mu\dot\omega_i^2 + \omega_i^2 - \frac{\omega_i^4}{2}.$$

By computation,

$$\dot H_i = \Bigl(-1 + 3\mu - \frac{4P}{r^2} - \frac{2\mu\dot\omega^2}{r^2}\Bigr)\dot\omega_i^2 - \omega_i\dot\omega_i\omega_j^2, \quad (j \ne i).$$

Since $P \le 0$, $\omega_i \le 0$ and $\dot\omega_i \ge 0$ in $[r_1', r_1]$, and by (4.3)-(4.4)

√ 2µω˙ 2 4 2 2 µ ≥ 1 − δ, ≤ 2δ, ≤ 2 1+ r2 r 3

it follows that

$$\dot H_i \ge (2 - 5\delta)\,\dot\omega_i^2 > 0 \qquad (4.6)$$

in $[r_1', r_1]$. This implies that $H_i$ is increasing in this interval. Hence, for any $r \in [r_1', r_1]$,

$$\mu\dot\omega_i^2 + \omega_i^2 - \frac{\omega_i^4}{2} = H_i > H_i(r_1') \ge \frac{1}{2},$$

which leads to

$$\dot\omega_i > \sqrt{\frac{1}{2} - \omega_i^2 + \frac{\omega_i^4}{2}} = \frac{1 - \omega_i^2}{\sqrt2}.$$
Also, since $\omega_i(r_1) = 0$, it follows from (4.6) that

$$\mu\dot\omega_i^2(r_1) = H_i(r_1) = H_i(r_1') + \int_{\ln r_1'}^{\ln r_1}\dot H_i\,d\tau$$
$$\ge \mu\dot\omega_i^2(r_1') + \frac{1}{2} + (2 - 5\delta)\int_{\ln r_1'}^{\ln r_1}\dot\omega_i^2\,d\tau$$
$$\ge \frac{1}{2} + (2 - 5\delta)\int_{-1}^{0}\frac{1 - \omega_i^2}{\sqrt2}\,d\omega_i$$
$$= \frac{1}{2} + (2 - 5\delta)\frac{\sqrt2}{3},$$

where $\tau = \ln r$. This proves the inequality in the lemma. To see that there can be no other zero for $\omega_i$, we observe that by (4.2), $\dot\omega_i$ satisfies (4.5) at $r = r_1$. It follows that $\dot\omega_i$ is increasing in $(r_1, \bar r)$, and hence must be positive. Therefore, it is impossible for $\omega_i$ to have another zero in $(r_1, \bar r)$. This completes the proof of Theorem 4.2.
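The last integral in the chain above and the compatibility of the resulting lower bound with the threshold in (4.5) can be checked numerically. The following is a minimal sketch: the midpoint rule, the sample value $\delta = 0.01$ and the function names are all choices made here for illustration, and condition (4.2) is taken in the form in which it is used in this proof.

```python
import math

def midpoint_integral(g, a, b, n=100_000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

# Integral of (1 - w^2) / sqrt(2) over [-1, 0]; exact value is sqrt(2) / 3.
integral = midpoint_integral(lambda w: (1 - w**2) / math.sqrt(2), -1.0, 0.0)
exact = math.sqrt(2) / 3

def lower_bound(delta):
    """Lower bound obtained at the first zero in the proof of Lemma 4.2."""
    return 0.5 + (2 - 5 * delta) * exact

def threshold(delta):
    """Right-hand side of (4.5)."""
    return 4 * math.sqrt(2) / (3 * math.sqrt(3) * (1 - 3 * delta))

print(integral, exact)
print(lower_bound(0.01), threshold(0.01))
```

For small $\delta$ the lower bound stays above the threshold with a comfortable margin, which is what the final step of the proof requires.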
5. Distribution of Connecting Points in D

In this final section, we complete the proof of Theorem 1.1, which asserts that for each pair of nonnegative numbers $n_1$ and $n_2$, there is a connecting trajectory $\omega \in D$ such that $\omega_i$ has $n_i$ zeros in $[\hat r, \infty)$ for $i = 1, 2$. Because of the symmetry of the equations, we only consider connecting points in the subset

$$D' = \{(\omega_1, \omega_2) \in D : \omega_2 \ge \omega_1 \ge 0\}.$$

In view of Theorem 4.1, the zero-numbers $n_i$ are upper semi-continuous at each exiting point. In addition, by Theorem 4.2, if we define $N_i = n_i$ at each exiting point and $N_i = n_i + 1$ at each connecting point, then $N_i$ is upper semi-continuous over the entire $D'$. It is clear that the variation of $N_i$ at any point is again at most 1. Let $l_1 = \{(u, v) \in D' : u = 0\}$ and $l_2 = \{(u, v) \in D' : u = v\}$ be the left and right side boundary of $D'$. We first construct a sequence of dividing curves $\Gamma_k$ that separate regions where $N_2 < k$ and $N_2 \ge k$.

Lemma 5.1. For each $k = 0, 1, \dots$, there is a continuous curve $\Gamma_k$ that joins a point $a_k \in l_1$ and a point $b_k \in l_2$, such that $N_2 = k + 1$ on $\Gamma_k$ and in any neighborhood of any point on $\Gamma_k$ there are points with $N_2 < k$. Furthermore, any two different curves do not intersect, and $\Gamma_{k+1}$ is below $\Gamma_k$ for each $k$.

Proof. Let $D_k = \{(u, v) \in D' : N_2(u, v) < k + 1\}$ for $k = 0, 1, \dots$. Then since $N_2$ is upper semicontinuous, $D_k$ is open for each $k$. We define $\Gamma_k$ by induction as follows. First observe that there is a component of $D_0$ that contains the line segments $s_1 = \{v = \sqrt2,\ 0 \le u \le \sqrt2\}$ and $s_2 = \{u = 0,\ 1 \le v \le \sqrt2\}$ in $D'$. In fact, if $(\omega_1, \omega_2)$ is a trajectory that starts in $s_1$, then by computation

$$\omega_2'(\hat r) = -\frac{\sqrt2}{\hat r^2\mu'(\hat r)}\Bigl(-1 + \frac{\hat\omega_1^2}{2}\Bigr) > 0.$$
Hence the trajectory exits $D$ immediately. Therefore $N_2 = n_2 = 0$. Similarly, if $(\omega_1, \omega_2)$ starts in $s_2$, then by the uniqueness of solutions, $\omega_1 \equiv 0$. Hence

$$\omega_2'(\hat r) = -\frac{\hat\omega_2}{\hat r^2\mu'(\hat r)}\bigl(1 - \hat\omega_2^2\bigr) > 0.$$

Furthermore, if $\omega_2'(r) = 0$ at any $r > \hat r$, then by (1.4), $r^2\mu\omega_2'' = -\omega_2\bigl(1 - \omega_2^2\bigr) > 0$, which is impossible. Hence $\omega$ exits $D$ along the line $\{\omega_1 = 0\}$ with $N_2 = n_2 = 0$. This proves the assertion.

Let $D_0^*$ denote the component of $D_0$ that contains $s_1 \cup s_2$ and let $\Gamma_0 = \partial D_0^* \setminus (s_1 \cup s_2)$. Hence $\Gamma_0$ lies in the interior of $D$ except for a point on $l_2$. It is clear that $\Gamma_0$ joins a point $a_0 \in l_1$ and a point $b_0 \in l_2$. Clearly, at any point $p \in \Gamma_0$, $N_2(p) \le 1$ by upper semi-continuity. Furthermore, if $N_2(p) = 0$ for some $p \in \Gamma_0$, then $p$ is exiting (because $n_2$ cannot be negative), and the end value $\bar\omega_2 \ne 0$. However, by Theorem 4.1, $N_2 = n_2 = 0$ is constant in a neighborhood of $p$. Hence $p \notin \partial D_0$. This shows that $N_2 = 1$ on $\Gamma_0$.

Suppose $D_n^*$ and $\Gamma_n$ have been defined such that $D_n^*$ is the component of $D_n$ that contains $\Gamma_{n-1}$, $\Gamma_n = \partial D_n^* \setminus (s_1 \cup s_2)$, and $N_2 = n + 1$ on $\Gamma_n$. Suppose also that $\Gamma_n$ joins a point $a_n \in l_1$ and a point $b_n \in l_2$. Hence there is a component $D_{n+1}^*$ of $D_{n+1}$ that contains $\Gamma_n$. Define $\Gamma_{n+1} = \partial D_{n+1}^* \setminus (s_1 \cup s_2)$. It is clear that $\Gamma_{n+1}$ is below $\Gamma_n$ and it again joins a point $a_{n+1}$ on $l_1$ to a point $b_{n+1}$ on $l_2$. Suppose $p \in \Gamma_{n+1}$ is a point at which $N_2(p) < n + 2$. We first show that $p$ cannot be connecting. If it is, then $n_2(p) = N_2(p) - 1 \le n$. Hence by Theorem 4.2, in a neighborhood of $p$, $n_2 \le n$ at every connecting point and $n_2 \le n + 1$ at every exiting point. This leads to $N_2 \le n + 1$ in this neighborhood, contradicting $p \in \partial D_{n+1}^*$. On the other hand, $p$ cannot be exiting, because otherwise, by Theorem 4.1, there is a neighborhood that only contains exiting points such that $N_2 = n_2 \le n + 1$. Again this contradicts $p \in \partial D_{n+1}^*$. Hence $N_2(p) = n + 2$. The construction by induction is complete.
It is clear from the construction that in any neighborhood of any point of $\Gamma_k$ there are points with $N_2 < k$ and $N_2 \ge k$. This implies that $\Gamma_k \cap \Gamma_m = \emptyset$ if $m > k$. Indeed, if there is an intersection $p$, then in any small neighborhood of $p$, there are points with $N_2 < k$ and also points with $N_2 \ge m \ge k + 1 \ge N_2 + 2$. This contradicts the fact that the variation of $N_2$ is at most one. The proof is complete.

It follows from this lemma that at any connecting point on $\Gamma_k$, $n_2 = N_2 - 1 = k$. The proof of Theorem 1.1 would be complete if we show that on each dividing curve $\Gamma_k$ there is a connecting point $p_m \equiv (u_m, v_m)$ such that $n_1(p_m) = m$ for each $m = k, k + 1, \dots$. (See Fig. 5.1 below.) For this purpose, we consider how $n_i(\hat\omega)$ changes as the point $\hat\omega$ moves through the null-exiting sets $L_i \equiv \{\hat\omega \in D : \bar\omega_i = 0\}$. Clearly, any exiting point of $\Gamma_k$ lies in $L_2$ since $n_2 = N_2$ is not constant in a neighborhood. The following lemma implies that any point of intersection $\Gamma_k \cap \bar L_1$ is connecting.

Lemma 5.2. The intersection $\bar L_1 \cap \bar L_2$ contains only connecting points.

Proof. Let $\hat\omega^* \in \bar L_1 \cap \bar L_2$ and let $\bar\omega^*$ be the end point of the corresponding trajectory $\omega^*$. If it is not connecting, then it is exiting. Hence, either $(\bar\omega_1^*)^2 = 2$ or $(\bar\omega_2^*)^2 = 2$. Without loss of generality, we assume the former. Then, by Theorem 2.3, for any $\hat\omega$ near $\hat\omega^*$, the end value $\bar\omega_1$ of the trajectory is nonzero. Hence $\hat\omega \notin L_1$. This means that $\hat\omega^* \notin \bar L_1$, contradicting the assumption.
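Fig. 5.1. Connecting points on $\Gamma_k$ (axes $\omega_1$, $\omega_2$; curves $\Gamma_0, \dots, \Gamma_k$; marked points labeled $n_1 = 1$, $n_1 = 2$, ..., $n_1 = k$, $n_1 = k + 1$, $n_1 = k + 2$)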
The next lemma describes the change of the zero-number $n_i(\hat\omega)$ as $\hat\omega$ moves along a curve that passes through $L_i$.

Lemma 5.3. Let $t \in [0, 1] \to p(t) \in D$ be a continuous curve containing only exiting points. Suppose $n_i(p(0)) \ne n_i(p(1))$. Then the curve passes through $L_i$. If in addition, the curve passes through $L_i$ only once at a point not in the set $\{\omega_i = 0\}$, then $n_i(p(0))$ differs from $n_i(p(1))$ by one.

Proof. Let $\bar t = \inf\{t \in (0, 1) : n_i(p(t)) \ne n_i(p(0))\}$. Then $\bar t \ne 0$. Suppose the end value $\bar\omega_{i,p(\bar t)}$ corresponding to the starting point $p(\bar t)$ is nonzero. Then by Theorem 4.1, $n_i$ is constant in a neighborhood of $p(\bar t)$. Hence, there is a $t_1 < 1$ such that $n_i(p(t_1)) = n_i(p(1)) \ne n_i(p(0))$. This contradicts the definition of $\bar t$. Hence $\bar\omega_{i,p(\bar t)} = 0$, that is, $p(\bar t) \in L_i$.

Suppose $n_i(p(1)) \ne n_i(p(0))$ and the curve passes through $L_i$ only at $t' \in [0, 1]$. Hence $n_i(p(t))$ changes value only at $t = t'$. Furthermore, Theorem 4.1 ensures that the difference of values of $n_i$ can be at most one in a neighborhood of $t'$. This implies that $n_i(p(0))$ and $n_i(p(1))$ can differ at most by one.

In view of the above lemma, if one shows that points on $\Gamma_k$ near $a_k \in l_1$ can have arbitrarily large $n_1$ values, then the existence of connecting points $p_m \in \Gamma_k$ for $m \ge k$ would follow. To see this, let $c_n \in \Gamma_k$ be near $a_k$ such that $n_1(c_n) \ge n > k$. Note that $b_k$ is connecting and $n_1(b_k) = k$. Suppose there is no connecting point on $\Gamma_k$ between $b_k$ and $c_n$. Then by Lemma 5.3 there is a point of $L_1$ in this segment. And by Lemma 5.2 the intersection point is connecting. Since $n_1$ can only change by one at a time, and since $n$ can be arbitrarily large, the existence of all types of connecting points $p_m$ follows. Hence, the proof of Theorem 1.1 will be complete if we prove the following lemma.

Lemma 5.4. Let $\hat\omega^* = (0, v^*) \in D'$, where $v^* > 0$, be a connecting point. Then, for any $n > 0$, there is an $\varepsilon > 0$ such that $n_1 \ge n$ for any trajectory starting in the neighborhood $N_\varepsilon(\hat\omega^*)$.
∗ ωˆ in which Proof. Suppose the opposite holds. Then there is a neighborhood N ε ∗ n1 < n. Assume that ε is so small that Nε ωˆ ⊂ Dk for some k. Then by Theorem ¯ which is independent of ωˆ ∈ Nε ωˆ ∗ . Since 2.5, = 2m + 4r P has an upper bound
rˆ > 2 and P ≥ −1 in D, by the monotonicity of m given by Theorem 2.4,
= 2m +
4 4P 4 ≥ 2m rˆ − = rˆ − > 0. r rˆ rˆ
Hence also has a positive lower bound ≡ rˆ − 4/ˆr . In view of Proposition 2.1, µ has a positive lower bound µ for r > rˆ + 1. For any trajectory ω = (ω1 , ω2 ), define the “polar coordinates” ρ (r) and θ (r) in the (ω1 , ω˙ 1 )-space by ω˙ 1 ρ = ω12 + ω˙ 12 , θ = tan−1 , ω1 where the angle is defined so that −π/2 < θ rˆ < π/2. We show that for r and r˜ such that π ¯ r > max rˆ + 1, /µ , r˜ ≥ r + (n + 1) (5.1) 4 ˙ ≤ −1/4 in r , r˜ . there is an ε sufficiently small, such that the end value r ¯ > r ˜ and θ Once this is it is clear that θ (˜r ) − θ r ≥ (n + 1) π and hence ωi has at least proved, n zeros in r , r˜ . This is a contradiction. The conclusion of the lemma thus follows. We prove θ˙ ≤ −1/4 below. Since ωˆ ∗ is a connecting point, itis clear that ε can be chosen so small that r¯ > r˜ for any trajectory starting in Nε ωˆ ∗ . We first show that there is a constant M ≥ 1, depending on r˜ but independent of the initial point ω, ˆ such that |ω1 (r)| ≤ M ωˆ 1 in rˆ , r˜ . (5.2) Observe that by Eq. (1.4),
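$$\omega_1'(\hat r) = -\frac{\hat\omega_1}{\hat r^2\mu'(\hat r)}\Bigl(1 - \hat\omega_1^2 + \frac{1}{2}\hat\omega_2^2\Bigr) < 0. \qquad (5.3)$$

Hence, either $\omega_1(r) > 0$ for all $r \in (\hat r, \tilde r]$ or there is a zero of $\omega_1$ in this interval. In the first case, $0 \le \omega_1 \le \hat\omega_1$, which clearly implies (5.2). In the second case, let $r_0$ be the first zero of $\omega_1$ in $(\hat r, \tilde r]$. Then on $[\hat r, r_0]$, we again have $0 \le \omega_1 \le \hat\omega_1$. It remains to show (5.2) in $[r_0, \tilde r]$. We show that $M$ can be chosen such that in this interval

$$\rho(r) \le M|\hat\omega_1|. \qquad (5.4)$$

This will imply (5.2). We first compute

$$\rho\dot\rho = \omega_1\dot\omega_1 - \frac{1}{\mu}\omega_1\dot\omega_1\Bigl(1 - \omega_1^2 + \frac{1}{2}\omega_2^2\Bigr) - \frac{1}{\mu}\dot\omega_1^2\Bigl(1 - 2\mu + \frac{4P}{r^2}\Bigr) \quad\text{in } [\hat r, \tilde r].$$

Using the Schwarz inequality and the boundedness of $\omega_i$ and $P$, we can find a constant $c_1 > 0$ such that

$$\rho\dot\rho \le \frac{c_1}{\mu}\rho^2. \qquad (5.5)$$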
We next estimate µ in the interval (r0 , r˜ ). Observe that by the mean value theorem, −ωˆ 1 = ω1 (r0 ) − ω1 rˆ = ω1 (r1 ) r0 − rˆ for some r1 ∈ rˆ , r0 . If we can find a constant c2 > 0, independent of ω, such that ω ≤ c2 ωˆ 1 1
in rˆ , r0
(5.6)
then the above relation leads to r0 − rˆ > c2 . Hence, by Proposition 2.1, µ (r) ≥ δ rˆ + c2 ≡ c3
in r0 , r˜ .
(5.7)
To prove (5.6), we observe that since at r = r0 , r02 µω1 = − ω1 ≥ 0, the minimum of ω1 in rˆ , r0 occurs either at rˆ or at a point r2 ∈ (ˆr , r0 ] at which ω1 = 0. In the former case, (5.3) implies that ω (r) ≤ ω rˆ ≤ c2 ωˆ 1 in rˆ , r0 , 1 1 where c2 ≥
2 . rˆ 2 µ rˆ
In the latter case, (1.4) implies that
ω (r) ≤ −ω (r2 ) = ω1 1 − ω2 + 1 ω2 ≤ 2 ωˆ 1 in rˆ , r0 . 1 1 1 2
2
Recall that ≥ > 0 for all r ∈ rˆ , r˜ . Hence we again have (5.6) with c2 ≥ 2/ . This proves (5.6) and (5.7). Let us return to the discussion about ρ. By (5.5) and (5.7), rρ = ρ˙ ≤ c3 ρ
in (r0 , r˜ ) ,
where c3 = c1 /c2 . Furthermore, by (5.6), ρ (r0 ) = |ω˙ 1 (r0 )| ≤ c2 r˜ ωˆ 1 . Hence, by the comparison principle, ρ (r) ≤ ρ (r0 ) r c3 ≤ c2 r˜ c3 +1 ωˆ 1 . This proves (5.4) and also (5.2).
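We now consider $\dot\theta$. By computation, and using the definition of $\rho$ and $\theta$,

$$\dot\theta = \frac{\omega_1\ddot\omega_1 - \dot\omega_1^2}{\rho^2}$$
$$= \frac{1}{\rho^2}\Bigl\{\frac{\omega_1}{\mu}\Bigl[-\omega_1\Bigl(1 - \omega_1^2 + \frac{1}{2}\omega_2^2\Bigr) - \Bigl(1 - 2\mu + \frac{4P}{r^2}\Bigr)\dot\omega_1\Bigr] - \dot\omega_1^2\Bigr\}$$
$$= -\frac{1}{\mu}\Bigl(1 - \omega_1^2 + \frac{1}{2}\omega_2^2\Bigr)\cos^2\theta - \frac{1}{2\mu}\Bigl(1 - 2\mu + \frac{4P}{r^2}\Bigr)\sin 2\theta - \sin^2\theta$$
$$= -1 - \frac{1}{\mu}\Bigl(1 - \mu - \omega_1^2 + \frac{1}{2}\omega_2^2\Bigr)\cos^2\theta - \frac{1}{2\mu}\Bigl(1 - 2\mu + \frac{4P}{r^2}\Bigr)\sin 2\theta.$$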
Since by Theorem 2.1, µ < 1, it follows that
ω2 1 1 2 2 − 1 − µ − ω 1 + ω2 ≤ 1 . µ 2 µ ¯ ≥ , it follows that Also, by (5.1), rµ ≥ r µ >
1 1 4P 1 − 1 − 2µ + 2 sin 2θ ≤ (rµ − ) ≤ . 2µ r 2rµ 2 Hence θ˙ ≤ −1 +
ω12 1 + . µ 2
Hence, by (5.2), we can choose ωˆ 1 sufficiently small such that ω12 /µ ≤ 1/4. This ensures that θ˙ ≤ −1/4 in r , r˜ . The assertion is proven. The proof of Theorem 1.1 is complete.
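The polar-angle identity used in this proof can be sanity-checked numerically by comparing the direct quotient $(\omega_1\ddot\omega_1 - \dot\omega_1^2)/\rho^2$ with the final trigonometric form at random sample points. The sketch below treats $\mu$, $P$ and $r$ as free parameters and takes $\ddot\omega_1$ from the first order system of Section 3; the function names and sampling ranges are illustrative choices, not taken from the paper.

```python
import math
import random

def theta_dot_direct(w1, w1dot, w2, mu, P, r):
    """(w1 * w1'' - w1'^2) / rho^2, with w1'' taken from the field equation."""
    w1ddot = -(w1 * (1 - w1**2 + 0.5 * w2**2)
               + (1 - 2 * mu + 4 * P / r**2) * w1dot) / mu
    rho2 = w1**2 + w1dot**2
    return (w1 * w1ddot - w1dot**2) / rho2

def theta_dot_polar(w1, w1dot, w2, mu, P, r):
    """Final form of the identity, written in terms of the polar angle."""
    theta = math.atan2(w1dot, w1)
    a = 1 - mu - w1**2 + 0.5 * w2**2
    b = 1 - 2 * mu + 4 * P / r**2
    return -1 - a * math.cos(theta)**2 / mu - b * math.sin(2 * theta) / (2 * mu)

random.seed(0)
max_gap = 0.0
for _ in range(1000):
    args = (random.uniform(-1.4, 1.4), random.uniform(-2.0, 2.0),
            random.uniform(-1.4, 1.4), random.uniform(0.1, 1.0),
            random.uniform(-1.0, 0.0), random.uniform(2.0, 50.0))
    gap = abs(theta_dot_direct(*args) - theta_dot_polar(*args))
    max_gap = max(max_gap, gap)
print(max_gap)
```

The two expressions agree up to rounding error at every sample, which is consistent with the chain of equalities for $\dot\theta$.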
References

1. Bartnik, R. and McKinnon, J.: Particlelike solutions of the Einstein–Yang–Mills equations. Phys. Rev. Lett. 61, 141–144 (1988)
2. Bizon, P.: Colored black holes. Phys. Rev. Lett. 64, 2844–2847 (1990)
3. Hale, J.K.: Ordinary Differential Equations. New York: John Wiley & Sons, Inc., 1969
4. Künzle, H.: Analysis of the static spherically symmetric SU(n)-Einstein–Yang–Mills equations. Commun. Math. Phys. 162, 371–397 (1994)
5. Künzle, H. and Masood-ul-Alam, A.: Spherically symmetric static SU(2) Einstein–Yang–Mills fields. J. Math. Phys. 31, 928–935 (1990)
6. Mavromatos, N.E. and Winstanley, E.: Existence theorems for hairy black holes in su(N) Einstein–Yang–Mills theories. J. Math. Phys. 39, 4849–4873 (1998)
7. Ruan, W.H.: Existence of infinitely many black holes in su(3) Einstein–Yang–Mills theory. Nonlinear Analysis 47, 6109–6119 (2001)
8. Smoller, J. and Wasserman, A.: Existence of infinitely-many smooth, static, global solutions of the Einstein–Yang–Mills equations. Commun. Math. Phys. 151, 303–325 (1993)
9. Smoller, J., Wasserman, A. and Yau, S.-T.: Existence of black hole solutions for the Einstein–Yang/Mills equations. Commun. Math. Phys. 154, 377–401 (1993)
10. Smoller, J., Wasserman, A., Yau, S.-T. and McLeod, J.: Smooth static solutions of the Einstein/Yang–Mills equations. Commun. Math. Phys. 143, 115–147 (1991)
11. Volkov, M. and Gal'tsov, D.: Black-holes in Einstein–Yang–Mills theory. Sov. J. Nucl. Phys. 51, 747–753 (1990)
12. Volkov, M. and Gal'tsov, D.: Gravitating non-Abelian solitons and black holes with Yang–Mills fields. Phys. Rep. 319, 2–83 (1999)

Communicated by H. Nicolai
Commun. Math. Phys. 224, 399 – 426 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Quantum Invariant Measures

Nicolai Reshetikhin¹, Milen Yakimov¹,²

¹ Department of Mathematics, University of California at Berkeley, Berkeley, CA 94720, USA
² Department of Mathematics, Cornell University, Ithaca, NY 14853, USA

Received: 26 January 2001 / Accepted: 31 May 2001
Abstract: We derive an explicit expression for the Haar integral on the quantized algebra of regular functions $C_q[K]$ on the compact real form $K$ of an arbitrary simply connected complex simple algebraic group $G$. This is done in terms of the irreducible ∗-representations of the Hopf ∗-algebra $C_q[K]$. Quantum analogs of the measures on the symplectic leaves of the standard Poisson structure on $K$ which are (almost) invariant under the dressing action of the dual Poisson algebraic group $K^*$ are also obtained. They are related to the notion of quantum traces for representations of Hopf algebras. As an application we define and compute explicitly quantum analogs of Harish-Chandra c-functions associated to the elements of the Weyl group of $G$.

Partially supported by NSF grant DMS96-03239
Partially conducted for the Clay Mathematics Institute and also supported by NSF grants DMS94-00097 and DMS96-03239

1. Introduction

Let $G$ be a simply connected complex simple algebraic group. The cocommutative Hopf algebra $C[G]$ of regular functions on $G$ has a standard quantization, denoted by $C_q[G]$ and called the quantized algebra of regular functions on $G$. It is a Hopf subalgebra of the dual Hopf algebra of the standard quantized universal enveloping algebra $U_q\mathfrak{g}$. Let $K$ denote a compact real form of $G$. The complex conjugation in the algebra $C[K]$ $(= C[G])$ can be deformed to a conjugate linear antiisomorphism ∗ of $C_q[G]$. This gives rise to a Hopf ∗-algebra $(C_q[G], ∗)$ called the quantized algebra of regular functions on $K$, which will be denoted by $C_q[K]$. The Hopf algebra $C_q[K]$ is known [1] to have a unique Haar functional $H : C_q[K] \to \mathbb{C}$ normalized by $H(1) = 1$. It is known by a quantum analog of the Schur orthogonality relations. At the same time an analog of the classical expression for the bi-invariant functional on $C[K]$ as an integral over $K$ with respect to the Haar measure was found only
in the case of SU2 , [16]. The first result which we obtain in this paper is a representation for the Haar integral on Cq [K] of this type in the general case. Let us first note that the quantum analog of the set of points on K is the set of irreducible ∗-representations of the Hopf ∗-algebra Cq [K]. Its representations were classified by Soibelman [14] and can be nicely described by a version of the Kirillov– Kostant orbit method. Fix a maximal torus T of K. Let G = KAN be the related Iwasawa decomposition of G. The group K has a standard Poisson structure making it a real Poisson algebraic group which is the semiclassical structure of the deformation of C[K] to Cq [K]. The double and dual Poisson algebraic groups of K are isomorphic to G and AN as real algebraic groups, respectively. The dressing action of AN on K is global and is explicitly given by the rule [9, 14] δan (k) for a ∈ A, n ∈ N, k ∈ K
is such that
ank = (δan (k)) a1 n1
(1.1)
for some $a_1 \in A$, $n_1 \in N$ (see [13, 9] for general facts about the dressing action). Let us choose for each element $w$ of the Weyl group $W$ of $G$ a representative $\dot{w}$ in the normalizer of $A$ in $K$. The orbits of the dressing action of $AN$ on $K$ (the symplectic leaves of $K$) are $S_w.t$, where $w \in W$, $t \in T$, and $S_w$ denotes the orbit of $\dot{w}$. The disjoint union $\bigsqcup_{t \in T} S_w.t$ is the Bruhat cell $K \cap B\dot{w}B$, where $B = TAN$ is the Borel subgroup of $G$. Soibelman proved that the leaves $S_w.t$ are deformed to a set $\pi_{w,t}$ of (inequivalent) irreducible ∗-representations of the Hopf ∗-algebra $\mathbb{C}_q[K]$. Up to equivalence they exhaust all such representations of $\mathbb{C}_q[K]$.

Our result on the Haar integral on $\mathbb{C}_q[K]$ expresses it as an integral over the maximal torus $T$ of $K$ of the traces of the representations $\pi_{w_\circ,t}$ for the maximal length element $w_\circ$ of $W$. In other words, these are the irreducible ∗-representations of $\mathbb{C}_q[K]$ corresponding to the symplectic leaves in the maximal Bruhat cell of $K$. This result is derived in Sect. 5. It is particularly suited for obtaining integral expressions for quantum spherical functions. This will be discussed in a future publication.

For each $w \in W$ denote $N_w = N \cap w N_- w^{-1}$ and $N_w^+ = N \cap w N w^{-1}$, where $N_-$ is the unipotent subgroup of $G$ opposite to $N$. Our next result is a quantum analog of the Haar measures on the unipotent groups $N_w$. The symplectic leaf $S_w.t$, considered as an $AN$-homogeneous space via the dressing action, is isomorphic to
\[ S_w.t = AN/AN_w^+. \tag{1.2} \]
The quotient $AN/AN_w^+$ does not have a left invariant measure because the ratio of the corresponding modular functions is not equal to 1, see [3]. Using the factorization $AN = N_w A N_w^+$, we can identify $AN/AN_w^+ \cong N_w$, which induces a measure on the symplectic leaf (1.2) from the Haar measure on $N_w$. The resulting measure transforms under the action of $AN$ by the following multiplicative character of $AN$:
\[ \chi(an) = a^{2(\rho - w\rho)}, \qquad a \in A,\ n \in N. \tag{1.3} \]
The dressing action of $AN = K^*$ on the symplectic leaf $S_w.t$ of $K$ induces an action of $K^*$ on the space of functions on $S_w.t$. The latter transforms in the quantum situation to an action of $\mathbb{C}_q[G]$ on the space of linear operators in the Hilbert space completion $\overline{V}_{w,t}$ of the representation space of $\pi_{w,t}$. It coincides with the standard adjoint action
\[ c.L = \sum \pi_{w,t}(c_{(1)})\, L\, \pi_{w,t}(S(c_{(2)})). \tag{1.4} \]
(Here and later we use the standard notation for the comultiplication in a Hopf algebra, $\Delta(c) = \sum c_{(1)} \otimes c_{(2)}$.) Let us also note that $\mathbb{C}_q[K]$ acts by bounded operators in all of its ∗-representations and thus in particular in $\overline{V}_{w,t}$.

The standard trace in $\overline{V}_{w,t}$ is not a homomorphism from the space of trace class operators in $\overline{V}_{w,t}$ with the adjoint $\mathbb{C}_q[G]$-action (1.4) to the 1-dimensional representation of $\mathbb{C}_q[G]$ determined by its counit. After Reshetikhin and Turaev such a homomorphism, from possibly a "deformation" of the space of trace class operators, is called a quantum trace for the Hopf algebra module under consideration. We define a space $B_1^q(\overline{V}_{w,t})$ of "quantum" trace class operators in $\overline{V}_{w,t}$, stable under the adjoint $\mathbb{C}_q[G]$-action (1.4), and construct a homomorphism from it to the 1-dimensional representation of $\mathbb{C}_q[G]$ determined by a multiplicative character of it which is a deformation of the character (1.3). Such homomorphisms, to be called quantum quasi-traces, are treated in Sect. 6, where we also study some of their properties. They are quantum analogs of the invariant measures on the unipotent groups $N_w$ and the almost $AN$-invariant measures on the symplectic leaves $S_w.t$.

Section 7 contains an application to quantum analogs of Harish-Chandra c-functions related to the elements of the Weyl group of $G$. They are constructed with the help of the quantum quasi-traces from Sect. 6 and are explicitly computed by a q-analog of the original Harish-Chandra formula. In the quantum situation the role of the factorization formulas for the groups $N_w$ as products of 1-dimensional unipotent subgroups is played by tensor product formulas for the representations $\pi_{w,t}$ [14, 7]. In a forthcoming publication we will discuss the relation between the quantum c-functions and the asymptotics of quantum spherical functions at infinity, which is similar to the one in the classical case.
Sections 2 and 3 review some standard facts about quantized universal enveloping algebras, quantized function algebras, and their representations. Section 4 deals with a family of elements of Cq [K] which enter in all formulas for quantum invariant functionals derived in this paper.
2. Preliminaries on Quantized Enveloping Algebras

2.1. Root data. Let $\mathfrak{g}$ be a complex simple Lie algebra of rank $l$ with Cartan matrix $(a_{ij})$. Denote by $(\cdot\,,\cdot)$ the invariant inner product on $\mathfrak{g}$ for which the square length of a minimal root equals 2 in the resulting identification $\mathfrak{h}^* \cong \mathfrak{h}$ for a Cartan subalgebra $\mathfrak{h}$ of $\mathfrak{g}$. The sets of simple roots, simple coroots, and fundamental weights of $\mathfrak{g}$ will be denoted by $\{\alpha_i\}_{i=1}^l$, $\{\alpha_i^\vee\}_{i=1}^l$, and $\{\omega_i\}_{i=1}^l$, respectively. Let $P$, $Q$, and $Q^\vee$ denote the weight, root, and coroot lattices of $\mathfrak{g}$. Denote by $\Delta$, $\Delta_+$, $\Delta_-$, and $P_+$ the sets of roots, positive/negative roots, and dominant weights of $\mathfrak{g}$. Set $Q_+ = \{\sum m_i \alpha_i\}$ and $Q_+^\vee = \{\sum m_i \alpha_i^\vee\}$, $m_i \in \mathbb{N}$. Recall that there exists a unique set of relatively prime positive integers $\{d_i\}_{i=1}^l$ for which the matrix $(d_i a_{ij})$ is symmetric, and for it $(\alpha_i, \alpha_j) = d_i a_{ij}$. The Weyl group of $\mathfrak{g}$ will be denoted by $W$. The simple reflections in $W$ will be denoted by $s_i$ and the maximal length element in $W$ by $w_\circ$.
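2.2. Definition of $U_q\mathfrak{g}$. Throughout this paper we will assume that $q$ is a real number different from $\pm 1$ and 0. The adjoint rational form of the quantized universal enveloping algebra $U_q\mathfrak{g}$ of $\mathfrak{g}$ is generated by $K_i^{\pm 1}$ and $X_i^\pm$, $i = 1, \dots, l$, subject to the relations
\[ K_i^{-1} K_i = K_i K_i^{-1} = 1, \qquad K_i K_j = K_j K_i, \qquad K_i X_j^\pm K_i^{-1} = q_i^{\pm a_{ij}} X_j^\pm, \]
\[ X_i^+ X_j^- - X_j^- X_i^+ = \delta_{i,j}\, \frac{K_i - K_i^{-1}}{q_i - q_i^{-1}}, \]
\[ \sum_{r=0}^{1-a_{ij}} (-1)^r \begin{bmatrix} 1 - a_{ij} \\ r \end{bmatrix}_{q_i} (X_i^\pm)^r X_j^\pm (X_i^\pm)^{1-a_{ij}-r} = 0, \qquad i \neq j. \]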
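It is a Hopf algebra with comultiplication given by
\[ \Delta(K_i) = K_i \otimes K_i, \qquad \Delta(X_i^+) = X_i^+ \otimes K_i + 1 \otimes X_i^+, \qquad \Delta(X_i^-) = X_i^- \otimes 1 + K_i^{-1} \otimes X_i^-, \]
antipode and counit given by
\[ S(K_i) = K_i^{-1}, \qquad S(X_i^+) = -X_i^+ K_i^{-1}, \qquad S(X_i^-) = -K_i X_i^-, \]
\[ \epsilon(K_i) = 1, \qquad \epsilon(X_i^\pm) = 0, \]
where $q_i = q^{d_i}$. As usual, q-integers, q-factorials, and q-binomial coefficients are denoted by
\[ [n]_q = \frac{q^n - q^{-n}}{q - q^{-1}}, \qquad [n]_q! = [1]_q \cdots [n]_q, \qquad \begin{bmatrix} n \\ m \end{bmatrix}_q = \frac{[n]_q!}{[m]_q!\,[n-m]_q!} \]
for $n, m \in \mathbb{N}$ and $m \le n$. The conjugate linear antiisomorphism ∗ of $U_q\mathfrak{g}$ defined on its generators by
\[ K_i^* = K_i, \qquad (X_i^+)^* = X_i^- K_i, \qquad (X_i^-)^* = K_i^{-1} X_i^+ \tag{2.1} \]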
equips $U_q\mathfrak{g}$ with a structure of a Hopf ∗-algebra. In the limit $q \to 1$ the involution ∗ recovers the Cartan (anti)involution (conjugate linear antiisomorphism of order 2) of $\mathfrak{g}$ associated to its compact real form $\mathfrak{k}$. For the definition and properties of Hopf ∗-algebras we refer to [7, pp. 95–97] and [1, pp. 117–118].

For $i = 1, \dots, l$ the Hopf subalgebra of $U_q\mathfrak{g}$ generated by $K_i^{\pm 1}$ and $X_i^\pm$ will be denoted by $U_{q_i}\mathfrak{g}_i$. It is naturally isomorphic to $U_{q_i}\mathfrak{sl}_2$. The canonical embedding $U_{q_i}\mathfrak{sl}_2 \cong U_{q_i}\mathfrak{g}_i \hookrightarrow U_q\mathfrak{g}$ will be denoted by $\varphi_i$. Recall that a $U_q\mathfrak{g}$-module is called integrable if the subalgebras $U_{q_i}\mathfrak{g}_i$ act locally finitely.

The subalgebras of $U_q\mathfrak{g}$ generated by $\{K_i^{\pm 1}\}_{i=1}^l$, $\{X_i^+\}_{i=1}^l$, and $\{X_i^-\}_{i=1}^l$ will be denoted by $U_0$, $U^+$, and $U^-$, respectively. Clearly $U_0$ is a commutative Hopf subalgebra of $U_q\mathfrak{g}$ isomorphic to the group algebra of the lattice $Q$ equipped with the standard structure of a cocommutative Hopf algebra.
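The q-integers, q-factorials, and q-binomial coefficients of Sect. 2.2 are easy to evaluate exactly for rational $q$. The following is a minimal sketch, assuming a rational value of $q$ different from $0$ and $\pm 1$; the function names `q_int`, `q_factorial`, and `q_binomial` are illustrative and do not come from the paper.

```python
from fractions import Fraction


def q_int(n, q):
    """Symmetric q-integer [n]_q = (q^n - q^-n) / (q - q^-1)."""
    q = Fraction(q)
    return (q**n - q**(-n)) / (q - 1 / q)


def q_factorial(n, q):
    """[n]_q! = [1]_q [2]_q ... [n]_q, with the empty product [0]_q! = 1."""
    result = Fraction(1)
    for k in range(1, n + 1):
        result *= q_int(k, q)
    return result


def q_binomial(n, m, q):
    """[n choose m]_q = [n]_q! / ([m]_q! [n-m]_q!) for 0 <= m <= n."""
    if not 0 <= m <= n:
        raise ValueError("need 0 <= m <= n")
    return q_factorial(n, q) / (q_factorial(m, q) * q_factorial(n - m, q))
```

For instance, `q_int(2, 2)` evaluates $q + q^{-1}$ at $q = 2$, and the values are unchanged under $q \mapsto q^{-1}$, as expected for the symmetric normalization used here.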
2.3. Quantum Weyl group. Let $B_{\mathfrak{g}}$ denote the (generalized) braid group associated to the Coxeter group $W$, with generators $T_i$ corresponding to the simple reflections $s_i \in W$. For any integrable $U_q\mathfrak{g}$-module $V$ one can define an action of $B_{\mathfrak{g}}$ on $V$. It is given by [10]
\[ T_i = \sum_{a,b,c \in \mathbb{N}} (-1)^b q_i^{ac-b}\, (X_i^+)^{(a)} (X_i^-)^{(b)} (X_i^+)^{(c)}, \]
where
\[ (X_i^\pm)^{(n)} = \frac{(X_i^\pm)^n}{[n]_{q_i}!}. \]
In the case of the adjoint representation of $U_q\mathfrak{g}$ this gives an action of the braid group $B_{\mathfrak{g}}$ on $U_q\mathfrak{g}$. The explicit action of $T_i$ on the generators $K_j$, $X_j^\pm$ of $U_q\mathfrak{g}$ is
\[ T_i(X_i^+) = -X_i^- K_i, \qquad T_i(X_i^-) = -K_i^{-1} X_i^+, \qquad T_i(K_j) = K_j K_i^{-a_{ij}}, \]
\[ T_i(X_j^+) = \sum_{r=0}^{-a_{ij}} (-1)^r q_i^{-r}\, (X_i^+)^{(-a_{ij}-r)} X_j^+ (X_i^+)^{(r)} \quad \text{if } i \neq j, \]
\[ T_i(X_j^-) = \sum_{r=0}^{-a_{ij}} (-1)^r q_i^{r}\, (X_i^-)^{(r)} X_j^- (X_i^-)^{(-a_{ij}-r)} \quad \text{if } i \neq j. \]
The defined actions of $B_{\mathfrak{g}}$ are compatible in the sense that for any integrable $U_q\mathfrak{g}$-module $V$,
\[ T_i.(xv) = (T_i x).(T_i v), \qquad \forall\, x \in U_q\mathfrak{g},\ v \in V. \]
Recall that there exists a canonical section $T : W \to B_{\mathfrak{g}}$ of the natural projection $B_{\mathfrak{g}} \to W$ (where $T_i \mapsto s_i$). If $w = s_{i_1} \dots s_{i_n}$ is a reduced decomposition of $w \in W$ then the image $T_w$ of $w$ in $B_{\mathfrak{g}}$ is defined by $T_w = T_{i_1} \dots T_{i_n}$. It does not depend on the choice of a reduced decomposition.

The weight subspaces of a $U_0$-module (in particular of a $U_q\mathfrak{g}$-module) $V$ are defined by
\[ V_\lambda = \{ v \in V \mid K_i.v = q^{(\lambda,\alpha_i)} v \}, \qquad \lambda \in P. \]
The elements of $B_{\mathfrak{g}}$ preserve the weight space decomposition of an integrable $U_q\mathfrak{g}$-module, in particular $T_w V_\lambda = V_{w\lambda}$.
2.4. R-matrix. Put
\[ U_k^\pm = \bigoplus_{\lambda \in \pm Q_+,\ |(\lambda,\rho^\vee)| \ge k} U_\lambda^\pm, \qquad k \in \mathbb{N}, \tag{2.2} \]
where $\rho^\vee$ is the half-sum of positive coroots of $\mathfrak{g}$. Denote by $U^+ \widehat{\otimes} U^-$ the completion of the vector space $U^+ \otimes U^-$ according to the descending sequence of vector spaces
\[ U_k^+ \otimes U^- \oplus U^+ \otimes U_k^-. \]
Any element of the completion $U^+ \widehat{\otimes} U^-$ acts in the tensor product of two finite dimensional $U_q\mathfrak{g}$-modules.

Recall that a representation $V$ of $U_q\mathfrak{g}$ is called a type 1 representation if it is a direct sum of its weight subspaces. For a pair $(V_1, V_2)$ of type 1 $U_q\mathfrak{g}$-modules define the linear operator $\Gamma_{V_1,V_2} : V_1 \otimes V_2 \to V_1 \otimes V_2$ by
\[ \Gamma_{V_1,V_2}(v_1 \otimes v_2) = q^{(\lambda,\mu)}\, v_1 \otimes v_2 \quad \text{if } v_1 \in (V_1)_\lambda,\ v_2 \in (V_2)_\mu. \]
Denote also by $\sigma : V_1 \otimes V_2 \to V_2 \otimes V_1$ the flip operator $\sigma(v_1 \otimes v_2) = v_2 \otimes v_1$.

There exists [10, 7] a unique element $R \in U^+ \widehat{\otimes} U^-$, called a quasi R-matrix for $U_q\mathfrak{g}$, normalized by
\[ R - 1 \in U_1^+ \widehat{\otimes} U_1^- \]
such that for any pair $(V_1, V_2)$ of finite dimensional $U_q\mathfrak{g}$-modules of type 1 the composition
\[ \sigma \circ \Gamma_{V_1,V_2} \circ R : V_1 \otimes V_2 \to V_2 \otimes V_1 \tag{2.3} \]
defines an isomorphism of $U_q\mathfrak{g}$-modules.

For any pair $(V_1, V_2)$ of finite dimensional $U_q\mathfrak{g}$-modules and an element $w \in W$ the actions of $T_w \in B_{\mathfrak{g}}$ on $V_1$, $V_2$, and $V_1 \otimes V_2$, to be denoted by $T_{w,V_1}$, $T_{w,V_2}$, and $T_{w,V_1 \otimes V_2}$, are related as follows. There exists a unique element $R^w \in U^+ \widehat{\otimes} U^-$ which does not depend on $V_1$ and $V_2$ such that
\[ T_{w,V_1 \otimes V_2} = R^w\, (T_{w,V_1} \otimes T_{w,V_2}). \tag{2.4} \]
As the quasi R-matrix $R$, $R^w$ satisfies
\[ R^w - 1 \in U_1^+ \widehat{\otimes} U_1^-. \tag{2.5} \]
The element $R^{w_\circ}$ associated to the maximal element $w_\circ$ of $W$ is equal to the quasi R-matrix $R$.
3. Quantized Algebras of Functions

3.1. Quantized algebras of regular functions. Let $G$ be a connected, simply connected, complex simple algebraic group and $\mathfrak{g} = \mathrm{Lie}\, G$. The finite dimensional $U_q\mathfrak{g}$-modules of type 1 form a quasitensor category. Hence their matrix coefficients form a Hopf subalgebra of the Hopf dual $(U_q\mathfrak{g})^*$ of $U_q\mathfrak{g}$. It is called the quantized algebra of regular functions on $G$ and is denoted by $\mathbb{C}_q[G]$.

Every finite dimensional type 1 $U_q\mathfrak{g}$-module is a direct sum of irreducible type 1 $U_q\mathfrak{g}$-modules. The latter are highest weight modules with highest weights $\Lambda \in P_+$ (the corresponding module will be denoted by $L(\Lambda)$). The matrix coefficient of $L(\Lambda)$ associated to $v \in L(\Lambda)$ and $l \in L(\Lambda)^*$ will be denoted by $c^\Lambda_{l,v}$:
\[ c^\Lambda_{l,v} \in \mathbb{C}_q[G], \qquad c^\Lambda_{l,v}(x) = \langle l, x.v \rangle. \]
The above implies
\[ \mathbb{C}_q[G] = \mathrm{span}\{ c^\Lambda_{l,v} \mid \Lambda \in P_+,\ v \in L(\Lambda),\ l \in L(\Lambda)^* \}. \]
The ∗-involution in $U_q\mathfrak{g}$ induces a structure of Hopf ∗-algebra on $\mathbb{C}_q[G]$ by
\[ \langle \xi^*, x \rangle = \overline{\langle \xi, S(x)^* \rangle}, \qquad \xi \in \mathbb{C}_q[G],\ x \in U_q\mathfrak{g}. \tag{3.1} \]
The resulting Hopf ∗-algebra $(\mathbb{C}_q[G], *)$ is called the quantized algebra of regular functions on the compact real form $K$ of $G$ and is denoted by $\mathbb{C}_q[K]$. The inclusions $\varphi_i : U_{q_i}\mathfrak{g}_i \hookrightarrow U_q\mathfrak{g}$ induce surjective homomorphisms $\varphi_i^* : (\mathbb{C}_q[G], *) \to (\mathbb{C}_{q_i}[G_i], *)$, where $G_i$ is the subgroup of $G$ isomorphic to $SL_2$ with tangent Lie algebra $\mathfrak{g}_i$ generated by the root vectors of $\pm\alpha_i$.

We finish this subsection with a simple fact on the explicit structure of the Hopf ∗-algebra $\mathbb{C}_q[K]$ (see, for instance, [1, Proposition 13.1.3]). Recall that $L(\Lambda)^* \cong L(-w_\circ\Lambda)$ and if we fix these isomorphisms, we can consider any $v \in L(\Lambda)$, $l \in L(\Lambda)^*$ as elements of $L(-w_\circ\Lambda)^*$, $L(-w_\circ\Lambda)$, respectively. Recall that any module $L(\Lambda)$ can be equipped with a unique (up to a constant) inner product which turns it into a $(U_q\mathfrak{g}, *)$ ∗-representation.

Lemma 3.1. (i) The comultiplication, the counit, and the antipode of $\mathbb{C}_q[G]$ are given by
\[ \Delta(c^\Lambda_{l,v}) = \sum_j c^\Lambda_{l,v_j} \otimes c^\Lambda_{l_j,v}, \tag{3.2} \]
\[ \epsilon(c^\Lambda_{l,v}) = \langle l, v \rangle, \qquad S(c^\Lambda_{l,v}) = c^{-w_\circ\Lambda}_{v,l}, \tag{3.3} \]
where in (3.2) $(\{v_j\}, \{l_j\})$ is an arbitrary pair of dual bases of $L(\Lambda)$ and $L(\Lambda)^*$.
(ii) Fix an orthonormal basis $\{v_i\}$ of $L(\Lambda)$ equipped with an invariant inner product as above and a dual basis $\{l_j\}$ of $L(\Lambda)^*$. The action of the ∗-involution (3.1) on the corresponding elements of $\mathbb{C}_q[G]$ is given by
\[ (c^\Lambda_{l_i,v_j})^* = c^{-w_\circ\Lambda}_{v_i,l_j}. \tag{3.4} \]
3.2. Quantized algebra of continuous functions of K. Let $G$ be a complex simple algebraic group as in the previous subsection and $K$ be its compact real form. The quantized algebra of continuous functions $C_q(K)$ on $K$ is by definition the $C^*$-completion of the ∗-algebra $\mathbb{C}_q[K]$ with respect to the norm
\[ \|f\| = \sup_\eta \|\eta(f)\|, \qquad f \in \mathbb{C}_q[K], \tag{3.5} \]
where $\eta$ runs through all ∗-representations of $\mathbb{C}_q[K]$. The fact that for any ∗-representation $\eta$ of $\mathbb{C}_q[K]$ the operator $\eta(f)$ is bounded and that the supremum in (3.5) is finite for all $f \in \mathbb{C}_q[G]$ follows from the following identity in $\mathbb{C}_q[K]$:
\[ \sum_j c^\Lambda_{l_j,v_i} (c^\Lambda_{l_j,v_i})^* = 1, \]
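where $\{v_i\}$ and $\{l_j\}$ are dual bases of $L(\Lambda)$ and $L(\Lambda)^*$ as in part (ii) of Lemma 3.1, see [1, Eq. (13), p. 452]. The $C^*$-algebras $C_q(K)$ possess natural structures of compact matrix quantum groups in the sense of Woronowicz [18], see [1, Sect. 13.3].

3.3. $\mathbb{C}_q[SU_2]$. The $U_q\mathfrak{sl}_2$-module $L(\omega_1)$ has a basis in which the operators $K_1$, $X_1^\pm$ act by
\[ K_1 \mapsto \begin{pmatrix} q & 0 \\ 0 & q^{-1} \end{pmatrix}, \qquad X_1^+ \mapsto \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad X_1^- \mapsto \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \]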
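As a sanity check, one can confirm that these three matrices satisfy the defining relations of Sect. 2.2 in rank one (where $a_{11} = 2$ and $q_1 = q$). The sketch below does this with exact rational arithmetic for a sample value of $q$; the helper names are illustrative only.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def scale(c, m):
    """Scalar multiple c * m of a 2x2 matrix."""
    return [[c * x for x in row] for row in m]


def check_relations(q):
    """Check K X^± K^-1 = q^{±2} X^± and [X^+, X^-] = (K - K^-1)/(q - q^-1)."""
    q = Fraction(q)
    K = [[q, 0], [0, 1 / q]]
    K_inv = [[1 / q, 0], [0, q]]
    X_plus = [[0, 1], [0, 0]]
    X_minus = [[0, 0], [1, 0]]

    rel_plus = matmul(matmul(K, X_plus), K_inv) == scale(q**2, X_plus)
    rel_minus = matmul(matmul(K, X_minus), K_inv) == scale(q**-2, X_minus)

    pm, mp = matmul(X_plus, X_minus), matmul(X_minus, X_plus)
    commutator = [[pm[i][j] - mp[i][j] for j in range(2)] for i in range(2)]
    rhs = scale(1 / (q - 1 / q),
                [[K[i][j] - K_inv[i][j] for j in range(2)] for i in range(2)])
    return rel_plus and rel_minus and commutator == rhs
```

The quantum Serre relations are vacuous in rank one, so these three identities are all that needs checking here.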
The corresponding matrix coefficients $c_{ij} \in \mathbb{C}_q[SL_2]$, $i, j = 1, 2$, generate $\mathbb{C}_q[SL_2]$. More precisely:

Lemma 3.2. The Hopf algebra $\mathbb{C}_q[SL_2]$ is isomorphic to the algebra generated by $c_{ij}$, $i, j = 1, 2$, subject to the relations
\[ c_{11} c_{12} = q^{-1} c_{12} c_{11}, \quad c_{11} c_{21} = q^{-1} c_{21} c_{11}, \quad c_{12} c_{22} = q^{-1} c_{22} c_{12}, \quad c_{21} c_{22} = q^{-1} c_{22} c_{21}, \]
\[ c_{12} c_{21} = c_{21} c_{12}, \quad c_{11} c_{22} - c_{22} c_{11} = (q^{-1} - q)\, c_{12} c_{21}, \quad c_{11} c_{22} - q^{-1} c_{12} c_{21} = 1. \]
In these generators the comultiplication, the counit, the antipode, and the ∗-involution of $\mathbb{C}_q[SU_2]$ are given by
\[ \Delta(c_{ij}) = \sum_{k=1,2} c_{ik} \otimes c_{kj}, \qquad \epsilon(c_{ij}) = \delta_{ij}, \]
\[ S(c_{11}) = c_{22}, \quad S(c_{22}) = c_{11}, \quad S(c_{12}) = -q\, c_{12}, \quad S(c_{21}) = -q^{-1} c_{21}, \]
\[ c_{11}^* = c_{22}, \qquad c_{21}^* = -q\, c_{12}. \]
A proof of Lemma 3.2 can be found, for instance, in [7, Example 2.3.3 and Theorem 3.0.1].

Let $q \in \mathbb{R}$, $q > 1$. The Hopf ∗-algebra $\mathbb{C}_q[SU_2]$ has an infinite dimensional ∗-representation $\pi$ on $l^2(\mathbb{N})$ given by the following action of its generators $c_{ij}$, $i, j = 1, 2$ (see [14, 7]):
\[ \pi(c_{11}) e_k = \sqrt{1 - q^{-2k}}\; e_{k-1}, \qquad \pi(c_{12}) e_k = q^{-k-1} e_k, \tag{3.6} \]
\[ \pi(c_{21}) e_k = -q^{-k} e_k, \qquad \pi(c_{22}) e_k = \sqrt{1 - q^{-2k-2}}\; e_{k+1}, \tag{3.7} \]
where $e_{-1} := 0$.

3.4. Irreducible star representations of $\mathbb{C}_q[K]$. The group of multiplicative characters of the Hopf algebra $\mathbb{C}_q[G]$ is isomorphic to the complex torus $(\mathbb{C}^\times)^l$, see [4, Theorem 3.3] and [6, Sect. 10.3.8] in the case when $q$ is an indeterminate. The character corresponding to the $l$-tuple $t = (t_1, \dots, t_l) \in (\mathbb{C}^\times)^l$ is given by
\[ \chi_t(c^\Lambda_{l,v}) = \prod_{i=1}^l t_i^{(\lambda,\alpha_i^\vee)} \langle l, v \rangle = \prod_{i=1}^l t_i^{(\lambda,\alpha_i^\vee)}\, \epsilon(c^\Lambda_{l,v}), \qquad v \in L(\Lambda)_\lambda. \tag{3.8} \]
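Returning briefly to the representation $\pi$ of (3.6)–(3.7): its compatibility with the relations of Lemma 3.2 can be checked numerically. The sketch below truncates $l^2(\mathbb{N})$ to the span of $e_0, \dots, e_{N-1}$ (an assumption made only for illustration) and evaluates each relation on the basis vectors that the truncation does not distort. The square roots in the shift operators are essential for the determinant relation to hold.

```python
import math


def pi_matrices(q, size):
    """Matrices of pi(c_ij) on span{e_0..e_{size-1}}, indexed as [row][column]."""
    def zeros():
        return [[0.0] * size for _ in range(size)]

    c11, c12, c21, c22 = zeros(), zeros(), zeros(), zeros()
    for k in range(size):
        c12[k][k] = q ** (-k - 1)
        c21[k][k] = -(q ** (-k))
        if k >= 1:
            c11[k - 1][k] = math.sqrt(1 - q ** (-2 * k))      # e_k -> e_{k-1}
        if k + 1 < size:
            c22[k + 1][k] = math.sqrt(1 - q ** (-2 * k - 2))  # e_k -> e_{k+1}
    return c11, c12, c21, c22


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_defect(q, size=12):
    """Largest entry of any relation of Lemma 3.2 evaluated in the truncated pi."""
    c11, c12, c21, c22 = pi_matrices(q, size)
    ident = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    P = matmul
    # Each relation is a list of (coefficient, matrix) pairs that should sum to 0.
    relations = [
        [(1, P(c11, c12)), (-1 / q, P(c12, c11))],
        [(1, P(c11, c21)), (-1 / q, P(c21, c11))],
        [(1, P(c12, c22)), (-1 / q, P(c22, c12))],
        [(1, P(c21, c22)), (-1 / q, P(c22, c21))],
        [(1, P(c12, c21)), (-1, P(c21, c12))],
        [(1, P(c11, c22)), (-1, P(c22, c11)), (-(1 / q - q), P(c12, c21))],
        [(1, P(c11, c22)), (-1 / q, P(c12, c21)), (-1, ident)],
    ]
    worst = 0.0
    for terms in relations:
        for i in range(size):
            for j in range(size - 1):  # the last column is distorted by truncation
                worst = max(worst, abs(sum(c * m[i][j] for c, m in terms)))
    return worst
```

Calling `max_defect(2.0)` returns a value at the level of floating point round-off, which is consistent with $\pi$ being a representation of the relations above.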
The unitary ones among these are the ones corresponding to the real torus $(S^1)^l = \{(t_1, \dots, t_l) \in (\mathbb{C}^\times)^l \mid |t_i| = 1\}$.

From now on we will assume that $q \in \mathbb{R}$, $q > 1$. Denote by $\pi_i$ the ∗-representation of $(\mathbb{C}_{q_i}[G_i], *) \cong \mathbb{C}_{q_i}[SU_2]$ given by (3.6)–(3.7). The ∗-representation of $\mathbb{C}_q[K] \cong (\mathbb{C}_q[G], *)$ induced from it by the homomorphism $\varphi_i^* : (\mathbb{C}_q[G], *) \to (\mathbb{C}_{q_i}[G_i], *)$ will be denoted by $\pi_{s_i}$. (Recall that $s_i$ denotes the simple reflection in the Weyl group $W$ of $\mathfrak{g}$ corresponding to the root $\alpha_i$.)

The irreducible ∗-representations of the Hopf ∗-algebra $\mathbb{C}_q[K]$ were classified by Soibelman [14], see also the book [7] for an exposition.

Theorem 3.3. (i) For any reduced decomposition $w = s_{i_1} \dots s_{i_n}$ of an element $w$ of $W$ and any $t \in (S^1)^l$ the tensor product
\[ \pi_{w,t} = \pi_{s_{i_1}} \otimes \dots \otimes \pi_{s_{i_n}} \otimes \chi_t \tag{3.9} \]
is an irreducible ∗-representation of $\mathbb{C}_q[K]$.
(ii) Up to an equivalence the representation $\pi_{w,t}$ does not depend on the choice of reduced decomposition of $w$.
(iii) Every irreducible ∗-representation of $\mathbb{C}_q[G]$ is isomorphic to some $\pi_{w,t}$.

Denote by $V_{w,t}$ the representation space of $\pi_{w,t}$ equipped with the Hermitian inner product from Theorem 3.3. The Hilbert space completion of $V_{w,t}$ with respect to it will be denoted by $\overline{V}_{w,t}$. Then: The representations $\pi_{w,t}$ naturally induce irreducible representations of the $C^*$-algebra $C_q(K)$, $\pi_{w,t} : C_q(K) \to B(\overline{V}_{w,t})$. The latter exhaust all irreducible representations of $C_q(K)$ up to a unitary equivalence.

Each module $V_{w,t}$ has a natural orthonormal basis
\[ e_{k_1,\dots,k_n} = e_{k_1} \otimes \dots \otimes e_{k_n} \otimes 1, \qquad n = l(w),\ k_1, \dots, k_n \in \mathbb{N}, \tag{3.10} \]
induced from the orthonormal basis $\{e_k\}$ of the $\mathbb{C}_q[SU_2]$-module $V$ defined by (3.6)–(3.7). Here 1 denotes a (fixed) vector of the 1-dimensional representation of $\mathbb{C}_q[G]$ corresponding to $\chi_t$.

For an element $w$ of the Weyl group $W$ denote by $I_w$ the ∗-ideal of $\mathbb{C}_q[K]$ generated by
\[ c^\Lambda_{l,v_\Lambda} \quad \text{such that} \quad \Lambda \in P_+,\ \langle l, U^+ T_w.v_\Lambda \rangle = 0, \tag{3.11} \]
where $v_\Lambda$ denotes a highest weight vector of $L(\Lambda)$. The annihilation ideals of the representations $\pi_{w,t}$ contain $I_w$ [14, 7]:
\[ I_w \subset \ker \pi_{w,t}. \tag{3.12} \]
4. A Family of Elements $a_{\Lambda,w} \in \mathbb{C}_q[K]$

4.1. Definitions. For a dominant integral weight $\Lambda \in P_+$ and a highest weight vector $v_\Lambda$ of $L(\Lambda)$ denote by $l_{\Lambda,w}$ the unique element of $L(\Lambda)^*_{-w\Lambda}$ such that $\langle l_{\Lambda,w}, T_w v_\Lambda \rangle = 1$. (The uniqueness follows from the fact that $\dim L(\Lambda)_{w\Lambda} = 1$.) Define
\[ a_{\Lambda,w} = c^\Lambda_{l_{\Lambda,w},\, v_\Lambda}. \tag{4.1} \]
Note that $a_{\Lambda,w}$ does not depend on the choice of highest weight vector $v_\Lambda$ of $L(\Lambda)$.

The ∗-subalgebras of $\mathbb{C}_q[K]$ generated by $a_{\Lambda,w}$ played an important role in Soibelman's classification of the irreducible ∗-representations of $\mathbb{C}_q[K]$, see Theorem 3.3. Most of the results in this subsection are due to Soibelman [14]. We include their proofs since [14] does not assume the normalization made in the definition of $a_{\Lambda,w}$.

Properties (2.4) and (2.5) of the elements $R^w \in U^+ \widehat{\otimes} U^-$ allow us to write $l_{\Lambda,w}$ and thus $a_{\Lambda,w}$ slightly more explicitly. Let $l_\Lambda = l_{\Lambda,1}$, i.e. let $l_\Lambda \in L(\Lambda)^*_{-\Lambda}$ be the unique element such that $\langle l_\Lambda, v_\Lambda \rangle = 1$. Then (2.4), (2.5) imply $l_{\Lambda,w} = T_w l_\Lambda$ and thus
\[ a_{\Lambda,w} = c^\Lambda_{T_w l_\Lambda,\, v_\Lambda}. \tag{4.2} \]
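Proposition 4.1. (i) The elements $a_{\Lambda,w}, a^*_{\Lambda,w} \in \mathbb{C}_q[K]$, $\Lambda \in P_+$, are normal modulo $I_w$:
\[ a_{\Lambda,w}\, c^{\Lambda'}_{l,v} - q^{(\Lambda,\lambda') - (w\Lambda,\mu')}\, c^{\Lambda'}_{l,v}\, a_{\Lambda,w} \in I_w, \tag{4.3} \]
\[ a^*_{\Lambda,w}\, c^{\Lambda'}_{l,v} - q^{(\Lambda,\lambda') - (w\Lambda,\mu')}\, c^{\Lambda'}_{l,v}\, a^*_{\Lambda,w} \in I_w, \tag{4.4} \]
for $v \in L(\Lambda')_{\lambda'}$, $l \in L(\Lambda')^*_{-\mu'}$.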
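(ii) The images of $\{a_{\Lambda,w}, a^*_{\Lambda,w}\}_{\Lambda \in P_+}$ in $\mathbb{C}_q[K]/I_w$ generate a commutative subalgebra. More precisely the following identity holds in $\mathbb{C}_q[K]$:
\[ a_{\Lambda_1,w}\, a_{\Lambda_2,w} = a_{\Lambda_1+\Lambda_2,w}, \qquad \forall\, \Lambda_1, \Lambda_2 \in P_+. \tag{4.5} \]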
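Proofs of Proposition 4.1 can be found in [14, 7]. The property (4.3) follows from the existence of a quasi R-matrix for $U_q\mathfrak{g}$, see (2.3). Equation (4.4) follows from (4.3), Lemma 3.1, and the fact that the ideals $I_w$ are stable under the ∗-involution. The first statement in part (ii) is a direct consequence of part (i). The second statement in (ii) follows from the existence of the element $R^w \in U^+ \widehat{\otimes} U^-$ with the properties (2.4), (2.5) and the fact that $v_{\Lambda_1} \otimes v_{\Lambda_2} \in L(\Lambda_1) \otimes L(\Lambda_2)$ generates a submodule isomorphic to $L(\Lambda_1 + \Lambda_2)$.

4.2. The action of $a_{\Lambda,w}$ in $V_{w,t}$.

Lemma 4.2. Let $w, w' \in W$ be such that $w = s_i w'$ and $l(w) = l(w') + 1$ for some simple reflection $s_i \in W$. Then
\[ \Delta(a_{\Lambda,w}) - c^\Lambda_{l_{\Lambda,w},\, T_{w'} v_\Lambda} \otimes a_{\Lambda,w'} \in \ker \varphi_i^* \otimes \mathbb{C}_q[K] + \mathbb{C}_q[K] \otimes I_{w'}. \]

Proof. According to (3.2), $\Delta(a_{\Lambda,w})$ is given by
\[ \Delta(a_{\Lambda,w}) = \sum_j c^\Lambda_{l_{\Lambda,w},\, v_j} \otimes c^\Lambda_{l_j,\, v_\Lambda}, \]
where $(\{v_j\}, \{l_j\})$ is a pair of dual bases of $L(\Lambda)$ and $L(\Lambda)^*$ consisting of weight vectors ($v_j \in L(\Lambda)_{\lambda_j}$, $l_j \in L(\Lambda)^*_{-\lambda_j}$, $\lambda_j \in P$). The definition (3.11) of $I_{w'}$ implies
\[ c^\Lambda_{l_j,\, v_\Lambda} \in I_{w'} \quad \text{if} \quad \lambda_j \notin w'\Lambda + Q_+. \tag{4.6} \]
The map $\varphi_i^* : \mathbb{C}_q[G] \to \mathbb{C}_{q_i}[G_i]$ acts on the matrix coefficients of a $U_q\mathfrak{g}$-module by restricting the module to $U_{q_i}\mathfrak{g}_i$. Since $w = s_i w'$ and $l(w) = l(w') + 1$, we have $w^{-1}\alpha_i^\vee \in -Q_+^\vee$. Since $\Lambda$ is a dominant weight,
\[ \langle \Lambda, w^{-1}\alpha_i^\vee \rangle \le 0 \quad \text{and thus} \quad \langle w\Lambda, \alpha_i^\vee \rangle \le 0. \]
Hence $T_w v_\Lambda$ is a lowest weight vector for the $U_{q_i}\mathfrak{g}_i$-submodule of $L(\Lambda)$ generated by $T_w v_\Lambda$. The corresponding $U_{q_i}\mathfrak{g}_i$-highest weight vector is $T_{w'} v_\Lambda$ and
\[ c^\Lambda_{l_{\Lambda,w},\, v_j} \in \ker \varphi_i^* \quad \text{if} \quad \lambda_j \notin \{w\Lambda,\ w\Lambda + \alpha_i,\ \dots,\ w'\Lambda\}. \tag{4.7} \]
The lemma now follows from (4.6) and (4.7). □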
For an element $w \in W$ and a reduced decomposition $w = s_{i_1} \dots s_{i_n}$ of it denote
\[ w_j = s_{i_{j+1}} \dots s_{i_n}, \quad j = 0, \dots, n-1, \qquad w_n = 1. \tag{4.8} \]
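Proposition 4.3. In the notation (4.8) the action of the elements $a_{\Lambda,w}$ in the module $V_{w,t}$ is given by
\[ \pi_{w,t}(a_{\Lambda,w}) = \Big( \bigotimes_{j=1}^n \pi_{s_{i_j}}\big( a_{(w_j\Lambda,\alpha_{i_j}^\vee)\omega_{i_j},\, s_{i_j}} \big) \Big) \cdot \prod_{i=1}^l t_i^{(\Lambda,\alpha_i^\vee)}. \tag{4.9} \]
In the orthonormal basis $\{e_{k_1,\dots,k_n}\}_{k_j=0}^\infty$ of $V_{w,t}$, see (3.10), the elements $a_{\Lambda,w}$ act diagonally by
\[ \pi_{w,t}(a_{\Lambda,w}).e_{k_1,\dots,k_n} = \Big( \prod_{j=1}^n q^{-(k_j+1)(w_j\Lambda,\alpha_{i_j})} \Big) \Big( \prod_{i=1}^l t_i^{(\Lambda,\alpha_i^\vee)} \Big)\, e_{k_1,\dots,k_n}. \tag{4.10} \]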
Formula (4.9) follows by induction from Lemma 4.2 and Definition (3.8) of the multiplicative characters $\chi_t$ of $\mathbb{C}_q[G]$. To prove (4.10) we first compute that in $\mathbb{C}_q[SL_2]$
\[ a_{\omega_1,s_1} = -q^{-1} c_{21} \tag{4.11} \]
(cf. Sect. 3.3) and then use (4.5), which implies $a_{m\omega_1,s_1} = (a_{\omega_1,s_1})^m$. We also use the identity $d_i \alpha_i^\vee = (\alpha_i, \alpha_i)\alpha_i^\vee/2 = \alpha_i$, see Sect. 2.1.

5. The Haar Integral on $C_q(K)$

5.1. Definition and the Schur orthogonality relations. Recall that a left invariant integral on a Hopf algebra $A$ is a linear functional $H$ on $A$ satisfying
\[ (\mathrm{id} \otimes H)(\Delta(a)) = H(a).1, \qquad \forall\, a \in A. \tag{5.1} \]
A right invariant integral is analogously defined. In the analytic setting a left Haar integral for a $C^*$-Hopf algebra $A$ is a state $H$ on $A$ satisfying (5.1), see [18].

Proposition 5.1. There exists a unique left invariant integral $H$ on the Hopf algebra $\mathbb{C}_q[K]$ normalized by $H(1) = 1$. It is also right invariant and can be uniquely extended to a bi-invariant Haar integral on $C_q(K)$. It is given by a quantum version of the classical Schur orthogonality relations:
\[ H(c^\Lambda_{l,v}\, c^{\Lambda'}_{l',v'}) = \frac{\delta_{\Lambda,\Lambda'}\, \langle l, v' \rangle \langle l', v \rangle}{\sum_\lambda \dim L(\Lambda)_\lambda\, q^{2(\lambda,\rho)}} \]
or equivalently by
\[ H(c^\Lambda_{l,v}) = \delta_{\Lambda,0}\, \langle l, v \rangle. \tag{5.2} \]
5.2. Statement of the main result.

Theorem 5.2. The bi-invariant integral $H$ on $C_q(K)$ ($q \in \mathbb{R}$, $q > 1$) is given in terms of the irreducible representations $\pi_{w,t}$ of $C_q(K)$ by
\[ H(c) = \prod_{\beta \in \Delta_+} \big( q^{(2\rho,\beta)} - 1 \big) \int_{(S^1)^l} \mathrm{tr}_{\overline{V}_{w_\circ,t}} \big( \pi_{w_\circ,t}(a_{\rho,w_\circ} a^*_{\rho,w_\circ} c) \big)\, dt, \tag{5.3} \]
where $w_\circ$ is the maximal length element of the Weyl group $W$ of $\mathfrak{g}$, $\rho$ is the half sum of all positive roots of $\mathfrak{g}$, and $dt$ is the invariant measure on the torus $(S^1)^l$ normalized by $\int_{(S^1)^l} dt = 1$.

In the special case of $K = SU_2$ Theorem 5.2 was established by Soibelman and Vaksman [16]. A similar formula is also known for quantum spheres [15, 17]. Theorem 5.2 answers Question 3 in [15].

Note that formula (4.10) implies that $\pi_{w_\circ,t}(a_{\rho,w_\circ} a^*_{\rho,w_\circ})$ is a trace class operator in $\overline{V}_{w_\circ,t}$. Since $\pi_{w_\circ,t}(c)$ is a bounded operator for $c \in C_q(K)$, the product is also a trace class operator in $\overline{V}_{w_\circ,t}$ for all $c \in C_q(K)$. From Definition (3.9) of $\pi_{w,t}$ it is also clear that
\[ \mathrm{tr}_{\overline{V}_{w_\circ,t}} \big( \pi_{w_\circ,t}(a_{\rho,w_\circ} a^*_{\rho,w_\circ} c) \big) \]
is a continuous function in $t \in (S^1)^l$ for a fixed $c \in C_q(K)$ and that the r.h.s. of (5.3) defines a continuous linear functional on $C_q(K)$. By identifying $(S^1)^l \cong \{(t_1, \dots, t_l) \in \mathbb{C}^l : |t_i| = 1\}$, the normalized invariant measure on the torus $(S^1)^l$ is represented as
\[ dt = \frac{1}{(2\pi i)^l}\, \frac{dt_1}{t_1} \wedge \dots \wedge \frac{dt_l}{t_l}. \]
In Sects. 5.3 and 5.4 we show that the functional $\widetilde{H}$ on $\mathbb{C}_q[G]$ given by the right-hand side of (5.3) satisfies
\[ \widetilde{H}(c^\Lambda_{l,v}) = 0 \quad \text{if } \Lambda \neq 0. \tag{5.4} \]
In Sect. 5.5 we check that it satisfies the normalization condition $\widetilde{H}(1) = 1$. Combined with (5.2) this proves Theorem 5.2.
5.3. Proof of (5.4): Reduction to the rank 1 case. Recall first the following simple characterization of $w_\circ \in W$.

Lemma 5.3. The maximal length element $w_\circ \in W$ is the only element $w \in W$ that has a representation of the form $w = w' s_i$ with $l(w') = l(w) - 1$ for an arbitrary simple reflection $s_i$.

Lemma 5.3 follows from the so called "deletion condition", see [5], and the property of $w_\circ$ that it is the only element $w \in W$ such that $w^{-1}(\alpha_i)$ is a negative root of $\mathfrak{g}$ for all simple roots $\alpha_i$ of $\mathfrak{g}$.
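Lemma 5.3 can be checked by brute force in type $A_{n-1}$, where $W$ is the symmetric group $S_n$, the simple reflections are the adjacent transpositions, and the length is the number of inversions. The following sketch covers only this case, which is an assumption chosen for illustration; it lists all permutations $w$ for which every $w s_i$ is shorter than $w$.

```python
from itertools import permutations


def length(w):
    """Coxeter length of a permutation in one-line notation: its inversion count."""
    return sum(1 for i in range(len(w)) for j in range(i + 1, len(w)) if w[i] > w[j])


def right_mult(w, i):
    """One-line notation of w * s_i, where s_i swaps positions i and i+1 (0-based)."""
    w = list(w)
    w[i], w[i + 1] = w[i + 1], w[i]
    return tuple(w)


def elements_with_all_right_descents(n):
    """All w in S_n with l(w s_i) = l(w) - 1 for every simple reflection s_i."""
    found = []
    for w in permutations(range(n)):
        if all(length(right_mult(w, i)) == length(w) - 1 for i in range(n - 1)):
            found.append(w)
    return found
```

For $n = 3$ and $n = 4$ the search returns only the order-reversing permutation, which is the maximal length element of $S_n$.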
We show that (5.4) for $K = SU_2$ implies its validity in the general case. Let $\Lambda \in P_+$, $\Lambda \neq 0$. Equip $L(\Lambda)$ with a Hermitian inner product making it a $(U_q\mathfrak{g}, *)$ ∗-representation, recall (2.1). Denote
\[ L_i = \{ v \in L(\Lambda) \mid U_{q_i}\mathfrak{g}_i.v = 0 \}, \qquad i = 1, \dots, l. \]
Since $L(\Lambda)$ is an irreducible $U_q\mathfrak{g}$-module, $\cap_{i=1}^l L_i = 0$ and
\[ L_1^\perp + \dots + L_l^\perp = (\cap_{i=1}^l L_i)^\perp = L(\Lambda). \]
Hence to show (5.4) it is sufficient to show that
\[ \widetilde{H}(c^\Lambda_{l,v}) = 0 \quad \text{if} \quad v \in L_m^\perp \quad \text{for some} \quad m = 1, \dots, l. \tag{5.5} \]
Note that $L_m^\perp$ is the span of the nontrivial irreducible $U_{q_m}\mathfrak{g}_m$-submodules of $L(\Lambda)$. Choose a reduced decomposition of $w_\circ$ of the form $w_\circ = s_{i_1} \dots s_{i_{n_\circ - 1}} s_m$ and consider the corresponding model for the representation $\pi_{w_\circ,t}$,
\[ \pi_{w_\circ,t} \cong \pi_{s_{i_1}} \otimes \dots \otimes \pi_{s_{i_{n_\circ - 1}}} \otimes \pi_{s_m} \otimes \chi_t. \]
Taking trace over the component $\pi_{s_m} \otimes \chi_t$ of $\pi_{w_\circ,t}$ and using (3.2) and (4.9) we see that to prove (5.5) it is sufficient to prove that
\[ \int_{(S^1)^l} \mathrm{tr}_{\overline{V}_{s_m}} \big( \pi_{s_m}(a_{\omega_m,s_m} a^*_{\omega_m,s_m}\, \varphi_m^*(c^\Lambda_{l',v})) \big)\, dt = 0 \quad \text{for all} \quad l' \in L(\Lambda)^*. \tag{5.6} \]
(Recall that by definition $(w_\circ)_{n_\circ} = 1$, see (4.8).) Since $v \in L_m^\perp$,
\[ \varphi_m^*(c^\Lambda_{l,v}) = \sum_p c^{p\omega_m}_{l_p,v_p} \]
with all $p > 0$. By appropriately breaking the integral (5.6) into a product of a 1-dimensional and an $(l-1)$-dimensional integral one sees that (5.6) follows from (5.4) for $K = SU_2$.
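5.4. Proof of (5.4): The case of $\mathbb{C}_q[SU_2]$. Our proof in the rank 1 case is similar to the one from [16]. Lemma 3.2 implies that $\mathbb{C}_q[SU_2]$ is spanned by the elements
\[ c_{11}^m c_{12}^p c_{21}^r \quad \text{and} \quad c_{22}^m c_{12}^p c_{21}^r \quad \text{for} \quad m, p, r \in \mathbb{N}. \]
The Haar functional $H$ acts on them by [1, Example 13.3.9]
\[ H(c_{11}^m c_{12}^p c_{21}^r) = H(c_{22}^m c_{12}^p c_{21}^r) = \delta_{m,0}\, \delta_{p,r}\, \frac{(-q)^p (q^2 - 1)}{q^{2p+2} - 1}. \]
We check that the functional $\widetilde{H}$ has the same property. This implies (5.4) for $K = SU_2$. Recall from (4.11) that $a_{\omega_1,s_1} = -q^{-1} c_{21}$ and thus $a^*_{\omega_1,s_1} = c_{12}$, see Lemma 3.2. Using (3.6)–(3.7) we compute
\[ \mathrm{tr}_{\overline{V}} \big( \pi(a_{\omega_1,s_1} a^*_{\omega_1,s_1}\, c_{ii}^m c_{12}^p c_{21}^r) \big) = \delta_{m,0} \sum_{k=0}^\infty -q^{-1} \cdot q^{-(k+1)(p+1)} \cdot (-q^{-k})^{r+1} = \delta_{m,0}\, \frac{(-q)^r}{q^{p+r+2} - 1} \]
for $i = 1, 2$. This gives
\[ \frac{1}{2\pi i} \int_{S^1} \mathrm{tr}_{\overline{V}_{s_1,t}} \big( \pi_{s_1,t}(a_{\omega_1,s_1} a^*_{\omega_1,s_1}\, c_{ii}^m c_{12}^p c_{21}^r) \big)\, \frac{dt}{t} = \delta_{m,0}\, \frac{(-q)^r}{q^{p+r+2} - 1}\, \frac{1}{2\pi i} \int_{S^1} t^{r-p-1}\, dt = \delta_{m,0}\, \delta_{p,r}\, \frac{(-q)^p}{q^{2p+2} - 1} \]
($i = 1, 2$), which shows that $\widetilde{H} = H$ in the case $K = SU_2$.

5.5. Checking the normalization $\widetilde{H}(1) = 1$. Let $w_\circ = s_{i_1} \dots s_{i_{n_\circ}}$ be a reduced decomposition of the maximal element of $W$. Using (4.10) and the notation (4.8) we compute
\[ \int_{(S^1)^l} \mathrm{tr}_{\overline{V}_{w_\circ,t}} \big( \pi_{w_\circ,t}(a_{\rho,w_\circ} a^*_{\rho,w_\circ}) \big)\, dt = \prod_{j=1}^{n_\circ} \sum_{k_j=0}^\infty q_{i_j}^{-2(k_j+1)((w_\circ)_j \rho,\, \alpha_{i_j}^\vee)} = \prod_{j=1}^{n_\circ} \frac{1}{q_{i_j}^{(2\rho,\, (w_\circ)_j^{-1} \alpha_{i_j}^\vee)} - 1}. \]
Note that $q_i^{(\lambda,\alpha_i^\vee)} = q^{(\lambda,\alpha_i)}$ for all simple roots $\alpha_i$ of $\mathfrak{g}$. The set of elements $(w_\circ)_j^{-1} \alpha_{i_j} \in Q$, $j = 1, \dots, n_\circ$, coincides with the set of positive roots of $\mathfrak{g}$. This together with the definition of the functional $\widetilde{H}$ via the r.h.s. of (5.3) gives $\widetilde{H}(1) = 1$.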
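The rank 1 computations of Sects. 5.4 and 5.5 reduce to geometric series and are easy to confirm numerically. The sketch below, written under the assumption $q > 1$ and with illustrative function names, compares a partial sum of the trace series with its closed form and then multiplies by the prefactor $q^2 - 1$ that (5.3) contributes for $K = SU_2$.

```python
def truncated_trace(q, p, r, terms=80):
    """Partial sum over k of -q^-1 * q^(-(k+1)(p+1)) * (-q^-k)^(r+1)."""
    return sum(
        -(1 / q) * q ** (-(k + 1) * (p + 1)) * (-(q ** (-k))) ** (r + 1)
        for k in range(terms)
    )


def closed_form(q, p, r):
    """Closed form (-q)^r / (q^(p+r+2) - 1) of the series above."""
    return (-q) ** r / (q ** (p + r + 2) - 1)


def haar_su2(q, p):
    """(q^2 - 1) times the p = r trace, i.e. the value on c12^p c21^p."""
    return (q ** 2 - 1) * closed_form(q, p, p)
```

The torus integral only selects the terms with $p = r$, so `haar_su2(q, p)` reproduces $(-q)^p (q^2 - 1)/(q^{2p+2} - 1)$, and `haar_su2(q, 0)` returns 1 up to round-off, in agreement with the normalization of Sect. 5.5.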
5.6. Semiclassical limit. Here we explain the semiclassical analog of the integral formula from Theorem 5.2. As earlier, $G$ denotes a complex simple algebraic group and $K$ denotes a compact real form of $G$. For each element $w$ of the Weyl group $W$ of $K$ choose a representative $\dot{w}$ of it in the normalizer of a fixed maximal torus $T$ of $K$. Using the related Iwasawa decomposition of $G$, introduce the map $a_w : N \to A$ by
\[ \dot{w}^{-1} n \dot{w} = k_1\, a_w(n)\, n_1, \qquad k_1 \in K,\ n_1 \in N, \tag{5.7} \]
see for instance [8]. It can be pushed down to a well defined map from the symplectic leaf $S_w$ to $A$,
\[ a_w(\delta_n \dot{w}) := a_w(n), \qquad n \in N. \]
We refer to the introduction for details on the dressing action of $AN$ on $K$ related to the standard Poisson structure on $K$.

The semiclassical analog of formula (5.3) is the following formula for the normalized Haar integral on $K$:
\[ H(f) = \prod_{\beta \in \Delta_+} \frac{(\rho,\beta)}{\pi} \int_{S_{w_\circ} \times T} a_{w_\circ}(k)^{-2\rho} f(k.t)\, \mu_{w_\circ}\, dt, \qquad f \in C(K). \tag{5.8} \]
Here $\mu_{w_\circ}$ denotes the Liouville volume form on the symplectic leaf $S_{w_\circ}$ corresponding to the maximal element $w_\circ \in W$ and $dt$ denotes the invariant measure on the torus $T$ normalized by $\int_T dt = 1$. Recall that $S_{w_\circ} \times T$ coincides with the maximal Bruhat cell of $K$.

Formula (5.8) can be easily proved following the idea of Sects. 5.3–5.5 on the basis of the product formulas [14, 7] for the symplectic leaves $S_w$ of $K$, $w \in W$,
\[ S_w = S_{s_{i_1}} \dots S_{s_{i_n}}, \tag{5.9} \]
where $s_{i_1} \dots s_{i_n}$ is a reduced decomposition of $w$. The integral with respect to the symplectic measure on the leaf $S_w.t$ is (up to a factor) a semiclassical limit of the trace in the module $\overline{V}_{w,t}$.

At the end we explain the connection between the functions $a_w^{-2\rho}$ on the leaves $S_w$ and the operators $\pi_{w,t}(a_{\rho,w} a^*_{\rho,w})$ in $\overline{V}_{w,t}$. Let us consider the highest weight module $L(\Lambda)$ of $\mathfrak{g}$ with highest weight $\Lambda$ and the matrix coefficient
\[ a_{\Lambda,w} \in \mathbb{C}[G], \qquad a_{\Lambda,w}(g) = \langle l_{w,\Lambda}, g v_\Lambda \rangle, \quad g \in G, \]
where $v_\Lambda$ is a highest weight vector of $L(\Lambda)$ and $l_{w,\Lambda} \in L(\Lambda)^*_{-w\Lambda}$ is normalized by $\langle l_{w,\Lambda}, \dot{w} v_\Lambda \rangle = 1$, cf. (4.1). It is easy to show that the restriction of $a_{\Lambda,w}$ to the symplectic leaf $S_w$ coincides with $a_w^{-\Lambda}$,
\[ a_{\Lambda,w}|_{S_w} = a_w^{-\Lambda}. \]
For $t \in T$ the functions
\[ |a_{w,\rho}(k.t)|^2 = |a_{w,\rho}(k)|^2 = a_w(k)^{-2\rho}, \qquad k \in S_w, \]
are semiclassical analogs of the linear operators $\pi_{w,t}(a_{w,\rho} a^*_{w,\rho})$ in $\overline{V}_{w,t}$.
6. Quantum Quasi-Traces of $V_{w,t}$

6.1. Motivation. Let $A$ be a Hopf algebra and $A^*$ be its dual Hopf algebra. Denote by $A^\circ$ the dual Hopf algebra of $A$ equipped with the opposite comultiplication. Recall [2, 1] that the quantum double $D(A)$ of $A$ is isomorphic to $A \otimes A^\circ$ as a coalgebra and the following commutation relation holds in $D(A)$:
\[ \xi a = \sum \langle \xi_{(1)}, a_{(3)} \rangle\, a_{(2)}\, \xi_{(2)}\, \langle S^{-1}\xi_{(3)}, a_{(1)} \rangle, \qquad \xi \in A^*,\ a \in A. \tag{6.1} \]
Analogously to the classical situation one defines a quantum dressing action $\delta$ of $A^*$ on $A$. Using the identification $D(A) \cong A \otimes A^*$ as vector spaces, set
\[ \delta_\xi a = (\mathrm{id} \otimes \epsilon)(\xi a). \]
In view of the commutation relation (6.1) it is explicitly given by
\[ \delta_\xi a = \sum \langle \xi_{(1)}, a_{(3)} \rangle\, a_{(2)}\, \langle S^{-1}\xi_{(2)}, a_{(1)} \rangle. \]
It is dual to the standard adjoint action of $A^*$ on itself,
\[ \mathrm{ad}_\xi\, \xi' = \sum \xi_{(1)}\, \xi'\, S(\xi_{(2)}), \]
in the sense that
\[ \langle \mathrm{ad}_\xi\, \xi', a \rangle = \langle \xi', \delta_{S(\xi)} a \rangle. \tag{6.2} \]
For any representation $\pi$ of $A^*$ in the vector space $V$ the adjoint action of $A^*$ on itself lifts to an action of $A^*$ in the space of linear operators on $V$ by
\[ \mathrm{ad}_\xi L = \sum \pi(\xi_{(1)})\, L\, \pi(S\xi_{(2)}). \tag{6.3} \]
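To make (6.3) concrete, the sketch below evaluates the adjoint action for the 2-dimensional representation of $U_q\mathfrak{sl}_2$ from Sect. 3.3, using the coproduct and antipode of Sect. 2.2. Taking $U_q\mathfrak{sl}_2$ in the role of $A^*$ is an assumption made purely for illustration. The code checks on the matrix units that the operators $\mathrm{ad}_{X^\pm}$, $\mathrm{ad}_{K^{\pm 1}}$ satisfy the commutator relation of $U_q\mathfrak{sl}_2$, as they must if (6.3) defines an action.

```python
from fractions import Fraction


def mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def lin(*terms):
    """Linear combination of 2x2 matrices given as (coefficient, matrix) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(2)] for i in range(2)]


def make_adjoint_action(q):
    """ad_x L = sum pi(x_(1)) L pi(S x_(2)) for the generators of U_q sl_2."""
    q = Fraction(q)
    K = [[q, 0], [0, 1 / q]]
    Ki = [[1 / q, 0], [0, q]]
    Xp = [[0, 1], [0, 0]]
    Xm = [[0, 0], [1, 0]]
    return {
        # Delta(K) = K (x) K, S(K) = K^-1
        "K": lambda L: mul(mul(K, L), Ki),
        "K_inv": lambda L: mul(mul(Ki, L), K),
        # Delta(X+) = X+ (x) K + 1 (x) X+, S(X+) = -X+ K^-1
        "X_plus": lambda L: lin((1, mul(mul(Xp, L), Ki)), (-1, mul(mul(L, Xp), Ki))),
        # Delta(X-) = X- (x) 1 + K^-1 (x) X-, S(X-) = -K X-
        "X_minus": lambda L: lin((1, mul(Xm, L)), (-1, mul(mul(mul(Ki, L), K), Xm))),
    }


def is_representation(q):
    """Check [ad_X+, ad_X-] = (ad_K - ad_K^-1)/(q - q^-1) on all matrix units."""
    q = Fraction(q)
    ad = make_adjoint_action(q)
    units = [[[1 if (i, j) == (a, b) else 0 for j in range(2)] for i in range(2)]
             for a in range(2) for b in range(2)]
    for L in units:
        lhs = lin((1, ad["X_plus"](ad["X_minus"](L))),
                  (-1, ad["X_minus"](ad["X_plus"](L))))
        rhs = lin((1 / (q - 1 / q), ad["K"](L)), (-1 / (q - 1 / q), ad["K_inv"](L)))
        if lhs != rhs:
            return False
    return True
```

Exact rational arithmetic is used so that the comparison is an identity rather than a floating point approximation.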
Suppose that A∗ is a deformation of the Poisson Hopf algebra C[F ] of regular functions on a Poisson algebraic group F . According to Kirillov–Kostant orbit method philosophy an irreducible A∗ -module V can be viewed as a quantization of a symplectic leaf S in F . The left action of A∗ in the space of linear operators in V is a deformation of the Poisson C[F ]-module of functions on the leaf S. At the same time the dual Poisson algebraic group F ∗ of F acts in the space of functions on S by the dressing action. The quantum analog of this action is the adjoint action (6.3) of A∗ in the space of linear operators in the A∗ -module V . This leads to: The quantum analog of a measure on the symplectic leaf S in the Poisson algebraic group F which is invariant up to a multiplicative character of F ∗ is a homomorphism from a subspace of linear operators in the A∗ -module V , equipped with the A∗ -action (6.3), to a 1-dimensional representation of A∗ . In the next subsection we will develop this idea from a categorical point of view and relate it to the notion of quantum traces for A∗ -modules. In analogy, the defined more general morphisms will be called quantum quasi-traces. Subsections 6.3 and 6.4 construct such morphisms for the irreducible ∗-representations of the quantized algebras of functions (Cq [G], ∗).
6.2. Definitions. Let $\mathcal{C}$ be a $\mathbb{C}$-linear, rigid, monoidal category with identity object $\mathbf{1}$. Recall that $\mathcal{C}$ is called balanced if for each object $V \in \mathrm{Ob}(\mathcal{C})$ there exists an isomorphism $b_V : V \xrightarrow{\ \cong\ } V^{**}$ such that
\[ b_{V_1} \otimes b_{V_2} = b_{V_1 \otimes V_2}, \tag{6.4} \]
\[ b_{V^*} = (b_V^*)^{-1}, \tag{6.5} \]
\[ b_{\mathbf{1}} = \mathrm{id}_{\mathbf{1}}. \tag{6.6} \]
Given a Hopf algebra $C$ over the field $\mathbb{C}$ let $\mathrm{rep}_C$ denote the category of its finite dimensional modules equipped with the left dual object $V^*$ of $V \in \mathrm{Ob}(\mathcal{C})$ defined by
\[ \langle c.\xi, v \rangle = \langle \xi, S(c).v \rangle, \qquad \xi \in V^*,\ v \in V. \tag{6.7} \]
The spaces $\mathrm{Hom}_{\mathbb{C}}(V_1, V_2)$, $V_1, V_2 \in \mathrm{Ob}(\mathcal{C})$, can be equipped with the canonical $C$-action
\[ c.L = \sum \pi_{V_2}(c_{(1)})\, L\, \pi_{V_1}(S(c_{(2)})), \qquad L \in \mathrm{Hom}_{\mathbb{C}}(V_1, V_2). \tag{6.8} \]
Here the Hopf algebra $C$ plays the role of the Hopf algebra $A^*$ from the motivation in the previous subsection, cf. (6.3) and its derivation from the quantum dressing action. Clearly $\mathrm{Hom}_{\mathbb{C}}(V_1, V_2) \cong V_2 \otimes V_1^*$ as $C$-modules. In particular, for this action $\mathrm{Hom}_{\mathbb{C}}(V, \epsilon)$ is canonically isomorphic to $V^*$, where, by abuse of notation, $\epsilon$ denotes the 1-dimensional representation of $C$ defined by its counit.

Reshetikhin and Turaev [11] defined the following notion of quantum trace for a finite dimensional $C$-module $V$.

Definition 6.1. A quantum trace for a finite dimensional $C$-module $V$ is a homomorphism
\[ \mathrm{qtr}_V : \mathrm{End}_{\mathbb{C}}(V) \to \epsilon \]
of $C$-modules for the action of $C$ on $\mathrm{End}_{\mathbb{C}}(V)$ defined in (6.8).

The pairing $\mathrm{End}_{\mathbb{C}}(V) \cong V \otimes V^* \to \mathbb{C}$ is not a homomorphism of $C$-modules, where $\mathbb{C}$ is given the structure of the $C$-module corresponding to the counit $\epsilon$. At the same time the opposite pairing $V^* \otimes V \to \mathbb{C}$ has this property. If $\mathrm{rep}_C$ is balanced, each $V \in \mathrm{Ob}(\mathcal{C})$ has a quantum trace defined by the composition [11]
\[ \mathrm{End}_{\mathbb{C}}(V) \cong V \otimes V^* \xrightarrow{\ b_V \otimes \mathrm{id}\ } V^{**} \otimes V^* \to \epsilon \]
or explicitly
\[ \mathrm{qtr}_V(L) = \mathrm{tr}_V(b_V L), \qquad L \in \mathrm{End}(V). \]
Here $b_V$ is considered as a linear endomorphism of $V$ using the canonical identification of $V$ and $V^{**}$ as vector spaces. The properties (6.4)–(6.5) of the balancing morphisms $b_V$ imply the following properties of the quantum traces $\mathrm{qtr}_V$:
\[ \mathrm{qtr}_{V_1 \otimes V_2}(L_1 \otimes L_2) = \mathrm{qtr}_{V_1}(L_1)\, \mathrm{qtr}_{V_2}(L_2), \tag{6.9} \]
\[ \mathrm{qtr}_{V^*}(L^*) = \mathrm{qtr}_V(L) \tag{6.10} \]
for all $L_i \in \mathrm{End}_{\mathbb{C}}(V_i)$. In [11, 12] it was proved that the category of finite dimensional type 1 $U_q\mathfrak{g}$-modules is balanced and this was used for constructing invariants of links and 3-dimensional manifolds.

We would like to incorporate in Definition 6.1 the possibility for an invariant up to a character “quantum measure”, as explained in the previous section, and the general case of an infinite dimensional $C$-module $V$. We will restrict ourselves to representations of $C$, $\pi : C \to B(V)$, by bounded operators in a Hilbert space $V$ and will call them bounded representations of $C$. The Hermitian inner product in $V$ is not assumed to possess any invariance properties and the linear operators $\pi(c)$, $c \in C$, in $V$ are not assumed to be uniformly bounded. The dual $V^*$ of such a bounded representation $\pi : C \to B(V)$ is defined in the Hilbert space $V^*$ of bounded functionals on $V$ by formula (6.7). Obviously it is again a bounded representation.

Definition 6.2. Two bounded representations of a Hopf algebra $C$, $\pi_i : C \to B(V_i)$, in the Hilbert spaces $V_i$, $i = 1, 2$, will be called weakly equivalent if the $V_i$ contain dense $C$-stable subspaces $W_i \subset V_i$ which are equivalent as $C$-modules.

The point here is that the equivalence can be given by an unbounded operator $b : W_1 \xrightarrow{\cong} W_2$ which therefore does not extend to the full space $V_1$.

Definition 6.3. A bounded representation $\pi : C \to B(V)$ of a Hopf algebra $C$ in a Hilbert space $V$ will be called quasi-balanced if there exists a multiplicative character $\chi$ of $C$ for which $V$ and $\chi \otimes V^{**}$ are weakly equivalent.

By abuse of notation we denote by $\chi$ the 1-dimensional $C$-module corresponding to the multiplicative character $\chi$ of $C$. In other words the bounded $C$-module $V$ is quasi-balanced if there exists an invertible linear operator $b_V$ in $V$ with dense domain and range such that $\mathrm{Dom}\, b_V$ is $C$-stable and
$$b_V \pi(c) = \chi(c_{(1)})\, \pi(S^2(c_{(2)}))\, b_V, \quad \forall\, c \in C. \tag{6.11}$$
(Here we use the canonical identification of $V^{**}$ and $V$ as Hilbert spaces.)

Remark 6.4. Often $V$ is the Hilbert space completion of a $C$-module $W$, equipped with a Hermitian inner product, which is a direct sum of mutually orthogonal finite dimensional submodules $W_\mu$ for a Hopf subalgebra $B$ of $C$
$$W = \bigoplus_\mu W_\mu. \tag{6.12}$$
N. Reshetikhin, M. Yakimov
The restricted dual of such a module $W$ with respect to the decomposition (6.12) as a direct sum of finite dimensional subspaces is naturally a $C$-module of the same type. The double restricted dual $W^{**}$ of $W$ is canonically isomorphic to $W$ as a vector space. If $W \cong \chi \otimes W^{**}$ as $C$-modules then the modules $V$ and $\chi \otimes V^{**}$ are weakly equivalent and $V$ is a quasi-balanced $C$-module.

Let $\pi : C \to B(V)$ be a quasi-balanced representation as above. We call the subspace of the space of linear operators in $V$ with dense domains
$$B_1^q(V) := B_1(V)\, b_V^{-1}$$
a space of quantum trace class operators in the $C$-module $V$. Here $B_1(V)$ stands for the standard trace class in $V$. It is naturally a $C$-module by
$$c.L = \pi(c_{(1)})\, L\, \pi(S(c_{(2)}))$$
because $C$ acts in $V$ by bounded operators. The linear map $\mathrm{qtr}_V : B_1^q(V) \to \mathbb{C}$ given by $\mathrm{qtr}_V(L) := \mathrm{tr}_V(L b_V)$ is a well defined homomorphism of $C$-modules
$$\mathrm{qtr}_V : B_1^q(V) \to \chi.$$
It will be called a quantum quasi-trace for the module $V$.

Remark 6.5. One can as well use the space
$$\widetilde{B}_1^q(V) := b_V^{-1} B_1(V)$$
instead of $B_1^q(V)$. When $b_V^{-1}$ is not defined on the full space $V$ the composition $b_V^{-1} L_0$, $L_0 \in B_1(V)$, need not have a dense domain in $V$. Because of this, it is convenient to use the space $\widetilde{B}_1^q(V)$ only when $b_V^{-1}$ has full domain. In that case the space $\widetilde{B}_1^q(V)$ is also a $C$-module and the map
$$\mathrm{qtr}_V : \widetilde{B}_1^q(V) \to \chi, \quad \mathrm{qtr}_V(L) := \mathrm{tr}(b_V L)$$
is a homomorphism of $C$-modules.

Remark 6.6. It is natural to look for a quasi-balancing map $b_V \in \mathrm{End}_{\mathbb{C}}(V)$ for a bounded representation $\pi : C \to B(V)$ of the form $b_V = \pi(a_V)$ for some $a_V \in C$. The definition (6.11) implies that such a map $\pi(a_V)$ provides a quasi-balancing endomorphism if $\pi(a_V)$ is an invertible linear operator in $V$ with a dense range satisfying
$$a_V c - \chi(c_{(1)})\, S^2(c_{(2)})\, a_V \in \mathrm{Ker}\, \pi, \quad \forall\, c \in C \tag{6.13}$$
for some multiplicative character $\chi$ of $C$.
Thus quasi-balancing of the modules of a Hopf algebra $A$ is related to the properties of the square of the antipode $S$ of $A$. This is analogous to the usual case of balancing when $\chi = \varepsilon$ and (6.13) reduces to
$$a_V c - S^2(c)\, a_V \in \mathrm{Ker}\, \pi, \quad \forall\, c \in C,$$
see [11]. Similarly $b_V = \pi_V(a_V)^{-1}$ is a quasi-balancing map for the $C$-module $V$ if $\pi_V(a_V)$ is an invertible operator in $V$ with a dense range such that
$$c\, a_V - \chi(c_{(1)})\, a_V\, S^2(c_{(2)}) \in \mathrm{Ker}\, \pi, \quad \forall\, c \in C. \tag{6.14}$$
6.3. Main construction. In this subsection we construct quasi-balancing morphisms for the $C_q[G]$-modules $V_{w,t}$. As was pointed out in Sect. 3.4 they are bounded $C_q[G]$-modules in the terminology from the previous subsection. Set
$$2\rho = \sum_{\alpha \in \Delta_+} \alpha = \sum_{i=1}^{l} p_i \alpha_i$$
for some positive integers $p_i$ and denote
$$q^{2\rho} = \prod_{i=1}^{l} K_i^{p_i} \in U_q\mathfrak{g}. \tag{6.15}$$
Its commutation with the generators $X_i^\pm$ of $U_q\mathfrak{g}$ is given by
$$q^{2\rho} X_i^\pm q^{-2\rho} = q^{\pm(2\rho, \alpha_i)} X_i^\pm, \quad \forall\, i = 1, \ldots, l.$$
As is well known, the square of the antipode in $U_q\mathfrak{g}$ is given by the following lemma.

Lemma 6.7. For all $x \in U_q\mathfrak{g}$, $S^2(x) = q^{2\rho} x q^{-2\rho}$.

For an arbitrary element $\nu = \sum_i m_i \alpha_i^\vee$ of the coroot lattice $Q^\vee$ of $\mathfrak{g}$ we set $q^\nu := (q^{m_1}, \ldots, q^{m_l}) \in (\mathbb{C}^\times)^l$ and consider the multiplicative character $\chi_{q^\nu}$ of $C_q[G]$. It is explicitly given by
$$\chi_{q^\nu}(c^\Lambda_{l,v}) = q^{(\nu, \mu)} \langle l, v \rangle, \quad l \in L(\Lambda)^*_{-\mu}, \tag{6.16}$$
recall (3.8). From Lemma 6.7 we deduce the following properties of $S^2$ in $C_q[G]$.
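Lemma 6.8. (i) If $v \in L(\Lambda)_\lambda$ and $l \in L(\Lambda)^*_{-\mu}$, then
$$S^2(c^\Lambda_{l,v}) = q^{2(\rho, \lambda - \mu)}\, c^\Lambda_{l,v}.$$
(ii) For all elements $w \in W$
$$c\, a_{\rho,w} a^*_{\rho,w} - \chi_{q^{2(w\rho - \rho)}}(c_{(1)})\, a_{\rho,w} a^*_{\rho,w}\, S^2(c_{(2)}) \in I_w, \quad \forall\, c \in C_q[G], \tag{6.17}$$
recall (3.11).

Proof. (i) By a straightforward computation, for all $x \in U_q\mathfrak{g}$,
$$\langle S^2(c^\Lambda_{l,v}), x \rangle = \langle c^\Lambda_{l,v}, S^2(x) \rangle = \langle c^\Lambda_{l,v}, q^{2\rho} x q^{-2\rho} \rangle = q^{2(\rho, \lambda - \mu)} \langle c^\Lambda_{l,v}, x \rangle.$$
(ii) Combining part (i) with the identities (4.3) and (4.4) gives
$$c^\Lambda_{l,v}\, a_{\rho,w} a^*_{\rho,w} - q^{2(w\rho - \rho, \mu)}\, a_{\rho,w} a^*_{\rho,w}\, S^2(c^\Lambda_{l,v}) \in I_w, \quad \forall\, l \in L(\Lambda)^*_{-\mu}, \ v \in L(\Lambda),$$
which implies (6.17) in view of (3.2). □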
Let us fix an element $w \in W$, a reduced decomposition $w = s_{i_1} \ldots s_{i_n}$ of it, and an element $t \in (S^1)^l$. Consider the $(C_q[G], *)$-module $V_{w,t}$. We will make use of the notation (4.8)
$$w_j = s_{i_{j+1}} \ldots s_{i_n}, \quad j = 0, \ldots, n - 1, \qquad w_n = 1$$
and of the basis $e_{k_1, \ldots, k_n}$, $k_j \in \mathbb{N}$, of $V_{w,t}$ from (3.10). Formula (4.10) implies that the space $V_{w,t}$ decomposes as a sum of weight subspaces with respect to the action of the commutative subalgebra of $C_q[G]$ spanned by $a_{\Lambda,w}$, $\Lambda \in P_+$ (recall part (ii) of Proposition 4.1) as
$$V_{w,t} = \bigoplus_{\mu \in Q_+} \mathrm{span}\Big\{ e_{k_1, \ldots, k_n} \ \Big|\ \sum_{j=1}^{n} (k_j + 1)\, w_j^{-1} \alpha_{i_j} = \mu \Big\}. \tag{6.18}$$
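The roots $w_j^{-1}\alpha_{i_j}$ that index the weights in (6.18) are easy to enumerate by hand in small rank. The following is a minimal sketch, assuming type $A_2$ with 0-based simple-root labels; the helper names `reflect`, `inversion_roots` and `weight` are hypothetical and not taken from the paper. Roots are stored as integer coordinates in the basis of simple roots.

```python
# Sketch (assumption: type A_2, simple roots labelled 0 and 1).
# CARTAN_A2[i][j] stands for the pairing (alpha_j, alpha_i^vee); it is symmetric here.
CARTAN_A2 = [[2, -1], [-1, 2]]


def reflect(i, root, cartan=CARTAN_A2):
    """Apply the simple reflection s_i to a root in simple-root coordinates."""
    # s_i(beta) = beta - (beta, alpha_i^vee) * alpha_i
    pairing = sum(cartan[i][j] * root[j] for j in range(len(root)))
    new = list(root)
    new[i] -= pairing
    return tuple(new)


def inversion_roots(word, rank=2):
    """Return [w_j^{-1} alpha_{i_j}] for the reduced word (i_1, ..., i_n)."""
    roots = []
    for pos in range(len(word)):  # word[pos] is the letter i_j with j = pos + 1
        root = tuple(1 if k == word[pos] else 0 for k in range(rank))
        # w_j^{-1} = s_{i_n} ... s_{i_{j+1}}, so s_{i_{j+1}} acts first.
        for i in word[pos + 1:]:
            root = reflect(i, root)
        roots.append(root)
    return roots


def weight(word, ks, rank=2):
    """Weight sum_j (k_j + 1) w_j^{-1} alpha_{i_j} of the basis vector e_{k_1..k_n}."""
    total = [0] * rank
    for k, root in zip(ks, inversion_roots(word, rank)):
        for idx in range(rank):
            total[idx] += (k + 1) * root[idx]
    return tuple(total)


if __name__ == "__main__":
    print(inversion_roots((0, 1, 0)))    # roots attached to the longest element
    print(weight((0, 1, 0), (0, 0, 0)))  # weight of e_{0,0,0}
```

For the longest element $s_1 s_2 s_1$ the sketch returns all three positive roots, and the vector $e_{0,0,0}$ gets weight $2\alpha_1 + 2\alpha_2 = 2\rho$.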
All weight subspaces of $V_{w,t}$ are finite dimensional and we can identify the corresponding double restricted dual $V_{w,t}^{**}$ with $V_{w,t}$ as a vector space. Part (ii) of Lemma 6.8 and the fact that the ideal $I_w$ contains the annihilation ideal of $V_{w,t}$, see (3.12), imply that $\pi_{w,t}(a_{\rho,w} a^*_{\rho,w})^{-1} : V_{w,t} \to V_{w,t}$ induces an isomorphism of the $C_q[G]$-modules $V_{w,t}$ and $\chi \otimes V_{w,t}^{**}$. In view of Remark 6.4, $b_{w,t} = \pi_{w,t}(a_{\rho,w} a^*_{\rho,w})^{-1}$ defines a quasi-balancing map for the $C_q[G]$-module $\overline{V}_{w,t}$. Explicitly in the basis (3.10) of $V_{w,t}$, $\pi_{w,t}(a_{\rho,w} a^*_{\rho,w})^{-1}$ acts diagonally by
$$\pi_{w,t}(a_{\rho,w} a^*_{\rho,w})^{-1} . e_{k_1, \ldots, k_n} = \prod_{j=1}^{n} q^{2(k_j + 1)(w_j \rho, \alpha_{i_j})}\, e_{k_1, \ldots, k_n}, \tag{6.19}$$
recall (4.10). Define the set of quantum trace class operators in the $C_q[G]$-module $\overline{V}_{w,t}$ by
$$B_1^q(\overline{V}_{w,t}) = B_1(\overline{V}_{w,t})\, \pi_{w,t}(a_{\rho,w} a^*_{\rho,w}). \tag{6.20}$$
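It is clear from (6.19) that $\pi_{w,t}(a_{\rho,w} a^*_{\rho,w})$ is a compact operator and thus
$$B_1^q(\overline{V}_{w,t}) \subset B_1(\overline{V}_{w,t}).$$
Using Proposition 4.1, observe that
$$\pi_{w,t}(a_{2\rho,w} a^*_{2\rho,w}) = \pi_{w,t}(a_{\rho,w} a^*_{\rho,w})^2. \tag{6.21}$$
Finally define the quantum quasi-trace functional $\mathrm{qtr}_{\overline{V}_{w,t}} : B_1^q(\overline{V}_{w,t}) \to \mathbb{C}$ by
$$\mathrm{qtr}_{\overline{V}_{w,t}}(L) = \mathrm{const}_w\, \mathrm{tr}_{\overline{V}_{w,t}}\big(L\, \pi_{w,t}(a_{\rho,w} a^*_{\rho,w})^{-1}\big), \tag{6.22}$$
where
$$\mathrm{const}_w = \prod_{\beta \in \Delta_+ \cap w^{-1}\Delta_-} \big(q^{(2\rho, \beta)} - 1\big). \tag{6.23}$$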
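The role of $\mathrm{const}_w$ can be checked numerically from (6.19): the trace of $\pi_{w,t}(a_{\rho,w} a^*_{\rho,w})$ factorizes into one geometric series per root, and each series sums to $(q^{(2\rho,\beta)} - 1)^{-1}$. The sketch below assumes $q > 1$ (so that the series converge with the eigenvalues of (6.19)) and takes as input the list of pairings $(\rho, \beta)$; the function names are illustrative only.

```python
import math


def trace_a_astar(q, rho_pairings, kmax=400):
    """Truncated trace of pi(a a*), whose eigenvalues are prod_j q^(-2 (k_j+1) (rho, beta_j))."""
    # The operator is diagonal in a tensor-product basis, so the trace is a
    # product of one-variable sums over k_j = 0, 1, 2, ...
    total = 1.0
    for p in rho_pairings:
        total *= sum(q ** (-2.0 * (k + 1) * p) for k in range(kmax))
    return total


def const_w(q, rho_pairings):
    """The constant (6.23): product of q^(2 (rho, beta)) - 1 over the given roots."""
    return math.prod(q ** (2.0 * p) - 1.0 for p in rho_pairings)


if __name__ == "__main__":
    # Assumed example: type A_2, longest element, (rho, beta) = 1, 1, 2.
    pairings = [1, 1, 2]
    print(trace_a_astar(1.5, pairings) * const_w(1.5, pairings))  # close to 1
```

The product being close to 1 is the numerical counterpart of normalizing the quasi-trace by $\mathrm{const}_w$.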
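Proposition 6.9. The $C_q[G]$-modules $\overline{V}_{w,t}$ are quasi-balanced with multiplicative characters $\chi_{2(w\rho - \rho)}$ and quasi-balancing morphisms $b_{w,t} = \pi_{w,t}(a_{\rho,w} a^*_{\rho,w})^{-1}$. The space of quantum trace class operators in $\overline{V}_{w,t}$ and quantum quasi-trace morphisms
$$\mathrm{qtr}_{\overline{V}_{w,t}} : B_1^q(\overline{V}_{w,t}) \to \chi_{2(w\rho - \rho)}$$
are given by (6.20) and (6.22). The morphisms $\mathrm{qtr}_{\overline{V}_{w,t}}$ are normalized by
$$\mathrm{qtr}_{\overline{V}_{w,t}}\big(\pi_{w,t}(a_{2\rho,w} a^*_{2\rho,w})\big) = 1. \tag{6.24}$$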
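To check (6.24) it is sufficient to check that
$$\mathrm{tr}_{\overline{V}_{w,t}}\big(\pi_{w,t}(a_{\rho,w} a^*_{\rho,w})\big) = \prod_{\beta \in \Delta_+ \cap w^{-1}\Delta_-} \big(q^{(2\rho, \beta)} - 1\big)^{-1},$$
recall (6.21). This easily follows from (6.19) using the standard fact
$$\{ w_j^{-1} \alpha_{i_j} \}_{j=1}^{n} = \Delta_+ \cap w^{-1} \Delta_- \tag{6.25}$$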
in the notation of (4.8), see for instance [5].

Remark 6.10. Consider again the compact group $K$ equipped with the standard Poisson structure, see the introduction and Sect. 5.6. Recall the notation $N_w = N \cap w N_- w^{-1}$ and $N_w^+ = N \cap w N w^{-1}$, $w \in W$, where $N_-$ is the unipotent subgroup of $G$ which is dual to $N$ with respect to the fixed complex torus $TA$ of $G$. The symplectic leaf $S_w.t$ of $K$, considered as an $AN$ homogeneous space under the dressing action, is isomorphic to $AN/AN_w^+$. We choose as a base point of $S_w.t$ the point $\dot{w}.t$. Denote by $\mu_{w,t}$ the Liouville volume form on the leaf $S_w.t$. The diffeomorphisms
$$S_w.t \cong AN/AN_w^+ \cong N_w \tag{6.26}$$
induce a measure $dn_w$ on $S_w.t$ from the Haar measure on $N_w$. The second one comes from the factorization $AN = N_w A N_w^+$. The measure $dn_w$ will be normalized by
$$\int_{S_w.t} a_{w,2\rho}^{2}\big|_{S_w.t}\, dn_w = 1,$$
cf. Sect. 5.6. The relation between the volume forms $\mu_{w,t}$ and $dn_w$ on $S_w.t$ was found by Lu [8]. It is given by
$$dn_w = \prod_{\beta \in \Delta_+ \cap w^{-1}\Delta_-} \frac{(\rho, \beta)}{\pi}\ a_{w,\rho}^{-2}\big|_{S_w.t}\ \mu_{w,t}. \tag{6.27}$$
It is easy to compute that the measure $dn_w$ on $S_w.t$ transforms under the dressing action of $AN = K^*$ by $\delta_{an}(dn_w) = a^{2(\rho - w\rho)}\, dn_w$. The quantum quasi-trace morphisms
$$\mathrm{qtr}_{\overline{V}_{w,t}} : B_1^q(\overline{V}_{w,t}) \to \chi_{2(w\rho - \rho)}$$
are quantum analogs of the measures $dn_w$ on $S_w.t$ and thus also of the Haar measures on the unipotent subgroups $N_w$ of $G$. The traces in the modules $V_{w,t}$ can be considered as quantizations of the Liouville volume forms $\mu_{w,t}$ on the leaves $S_w.t$. The relation (6.22) is a quantum version of Lu’s relation (6.27).

6.4. Tensor product properties of the quasi-balancing morphisms $b_{w,t}$. When $w, w' \in W$ are such that $l(ww') = l(w) + l(w')$ the tensor product of $(C_q[G], *)$-modules $V_{w,t} \otimes V_{w',t'}$ is again an irreducible $(C_q[G], *)$-module, see Lemma 6.12 below. Here we discuss the relation between the corresponding quasi-balancing morphisms constructed in the previous subsection.

For an element $t = (t_1, \ldots, t_l) \in (\mathbb{C}^\times)^l$ denote its $j$th component by $(t)_j := t_j$. Define an action of the Weyl group $W$ of $\mathfrak{g}$ on the torus $(\mathbb{C}^\times)^l$ by
$$(w(t))_i := \prod_j t_j^{m_{ij}} \quad \text{where} \quad w^{-1} \alpha_j^\vee = \sum_i m_{ij}\, \alpha_i^\vee.$$
It can be easily identified with the conjugation action of $W$ on a complex torus of $G$. It is straightforward to check that
$$\chi_{w(t)}(c^\Lambda_{l,v}) = \prod_{i=1}^{l} t_i^{(\lambda, w^{-1} \alpha_i^\vee)}\, \langle l, v \rangle,$$
cf. (3.8). Fix $w \in W$ and a reduced decomposition $w = s_{i_1} \ldots s_{i_n}$ of it. The representation space $V_{w,t}$, recall Theorem 3.3, is canonically identified with the vector space
$$V_w = V_{s_{i_1}} \otimes \ldots \otimes V_{s_{i_n}}$$
for all $t \in (\mathbb{C}^\times)^l$. (As earlier we will not show explicitly the dependence on the choice of a reduced decomposition of $w$.) Under this identification the basis (3.10) of $V_{w,t}$ corresponds to the basis
$$e_{k_1, \ldots, k_n} = e_{k_1} \otimes \ldots \otimes e_{k_n}, \quad n = l(w), \ k_1, \ldots, k_n \in \mathbb{N} \tag{6.28}$$
of $V_w$. In the notation (4.8) define the linear operator $J_{w,t}$ in $V_w$ acting diagonally in the above basis of $V_w$ by
$$J_{w,t} . e_{k_1, \ldots, k_n} = \prod_{j=1}^{n} \big(w_{j-1}(t)\, w_j(t^{-1})\big)_{i_j}^{k_j + 1}\, e_{k_1, \ldots, k_n}. \tag{6.29}$$
Lemma 6.11. For all $w \in W$ and $t, t' \in (\mathbb{C}^\times)^l$ the operator $J_{w,t'}$ defines an isomorphism of the $C_q[G]$-representations $\chi_{w(t')} \otimes \pi_{w,t}$ and $\pi_{w,t} \otimes \chi_{t'} \cong \pi_{w,tt'}$ in the natural identification of their representation spaces with $V_w$.

Lemma 6.11 is checked directly in the case of $G = SL_2$ using the defining identities (3.6)–(3.7) for the $C_q[SL_2]$-module $\pi$, see Sect. 3.3. This implies the lemma when $w$ is a simple reflection and the general case is proved by induction on $l(w)$.

Lemma 6.12. Let $w, w' \in W$ be such that $l(ww') = l(w) + l(w')$ and $t, t' \in (S^1)^l$. The linear operator $J_{w',(w')^{-1}(t)}$ induces the unitary equivalence of $(C_q[G], *)$-modules
$$\mathcal{D}_{w,t;w',t'} : \overline{V}_{w,t} \otimes \overline{V}_{w',t'} \to \overline{V}_{ww',(w')^{-1}(t)t'} \tag{6.30}$$
by identifying the spaces $V_{w,t} \otimes V_{w',t'} \cong V_w \otimes V_{w'} \cong V_{ww'} \cong V_{ww',(w')^{-1}(t)t'}$. (The product of two reduced decompositions of $w$ and $w'$ is used as a reduced decomposition of $ww'$.)

In the setting of Lemma 6.12 the $C_q[G]$-module $\overline{V}_{ww',(w')^{-1}(t)t'}$ admits a quasi-balancing morphism constructed from the quasi-balancing morphisms $b_{w,t}$ and $b_{w',t'}$ for the modules $\overline{V}_{w,t}$ and $\overline{V}_{w',t'}$. It is given by the composition
$$\begin{aligned}
V_{ww',(w')^{-1}(t)t'} &\xrightarrow{\ \mathcal{D}^{-1}_{w,t;w',t'}\ } V_{w,t} \otimes V_{w',t'} \\
&\xrightarrow{\ \mathrm{id} \otimes b_{w',t'}\ } V_{w,t} \otimes \chi_{q^{2(w'\rho - \rho)}} \otimes V^{**}_{w',t'} \\
&\xrightarrow{\ J^{-1}_{w,q^{2(w'\rho - \rho)}} \otimes \mathrm{id}\ } \chi_{q^{2w(w'\rho - \rho)}} \otimes V_{w,t} \otimes V^{**}_{w',t'} \\
&\xrightarrow{\ b_{w,t} \otimes \mathrm{id}\ } \chi_{q^{2w(w'\rho - \rho)}} \otimes \chi_{q^{2(w\rho - \rho)}} \otimes V^{**}_{w,t} \otimes V^{**}_{w',t'} \\
&\xrightarrow{\ \mathcal{D}^{**}_{w,t;w',t'}\ } \chi_{q^{2(ww'\rho - \rho)}} \otimes V^{**}_{ww',(w')^{-1}(t)t'}.
\end{aligned} \tag{6.31}$$
The restricted duals of the modules $V_{w,t}$ and $V_{w',t'}$ are taken with respect to the weight space decomposition (6.18) for the commutative subalgebras of $C_q[G]$ spanned by $a_{\Lambda,w}$ and $a_{\Lambda,w'}$, $\Lambda \in P_+$, respectively. Recall also the notation (6.16).

Proposition 6.13. If $w, w' \in W$ are such that $l(ww') = l(w) + l(w')$ and $t, t' \in (S^1)^l$ then the quasi-balancing map for the $C_q[G]$-module $\overline{V}_{ww',(w')^{-1}(t)t'}$ given by the composition (6.31) coincides with the quasi-balancing map $b_{ww',(w')^{-1}(t)t'}$.
To prove Proposition 6.13 observe that in the natural identification of the representation spaces in (6.31) with $V_w \otimes V_{w'}$ the composition is simply
$$b_{w,t}\, J^{-1}_{w,q^{2(w'\rho - \rho)}} \otimes b_{w',t'}.$$
(We use again the product of two reduced decompositions of $w$ and $w'$ as a reduced decomposition of $ww'$.) Now the proposition easily follows from (6.19) and the following formula for the action of $J^{-1}_{w,q^{2(w'\rho - \rho)}}$ in the basis (6.28) of $V_{w,t}$, which is a direct consequence of (6.29),
$$J^{-1}_{w,q^{2(w'\rho - \rho)}} . e_{k_1, \ldots, k_n} = \prod_{j=1}^{l(w)} q^{2(k_j + 1)(w_j(w'\rho - \rho), \alpha_{i_j})}\, e_{k_1, \ldots, k_n}.$$
This computation implies also the following connection between the spaces $B_1^q(\overline{V}_{w,t})$, $B_1^q(\overline{V}_{w',t'})$ and $B_1^q(\overline{V}_{ww',(w')^{-1}(t)t'})$ when $l(ww') = l(w) + l(w')$.

Corollary 6.14. If $L \in B_1^q(\overline{V}_{w,t})$ and $L' \in B_1^q(\overline{V}_{w',t'})$ in the setting of Proposition 6.13, then
$$L\, J_{w,q^{2(w'\rho - \rho)}} \otimes L' \in B_1^q(\overline{V}_{ww',(w')^{-1}(t)t'})$$
and
$$\mathrm{qtr}_{\overline{V}_{ww',(w')^{-1}(t)t'}}\big(L\, J_{w,q^{2(w'\rho - \rho)}} \otimes L'\big) = \frac{\mathrm{const}_{ww'}}{\mathrm{const}_w\, \mathrm{const}_{w'}}\ \mathrm{qtr}_{\overline{V}_{w,t}}(L)\ \mathrm{qtr}_{\overline{V}_{w',t'}}(L'),$$
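where $\mathrm{const}_w$ is given by (6.23).

7. An Application: Quantum Harish-Chandra c-Functions

Denote by $1$ the identity element $(1, \ldots, 1)$ of the real torus $(S^1)^l$. According to (4.10) the linear operators $\pi_{w,1}(a_{\omega_i,w})$ in $\overline{V}_{w,1}$ are compact, selfadjoint with spectrum contained in $[0, \infty)$. For different values of $i$ they mutually commute. Hence for each $\lambda \in \mathfrak{h}$ we can define the linear operator in $\overline{V}_{w,1}$,
$$d_{\lambda,w} = \prod_{i=1}^{l} \pi_{w,1}(a_{\omega_i,w})^{\lambda_i},$$
where $\lambda_i = (\lambda, \alpha_i^\vee)$, i.e. $\lambda = \sum_i \lambda_i \omega_i$. It is obvious that
$$d_{\lambda,w} = \pi_{w,1}(a_{\lambda,w}) \quad \text{when} \quad \lambda \in P_+ \subset \mathfrak{h} \tag{7.1}$$
and
$$d_{\lambda_1,w}\, d_{\lambda_2,w} = d_{\lambda_1 + \lambda_2,w}, \quad \forall\, \lambda_1, \lambda_2 \in \mathfrak{h}. \tag{7.2}$$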
Lemma 7.1. The linear operator $d_{i\lambda + 2\rho,w}$ in $\overline{V}_{w,1}$ is quantum trace class (belongs to $B_1^q(\overline{V}_{w,1})$) if and only if
$$\mathrm{Im}(\lambda, \beta) < 0, \quad \forall\, \beta \in \Delta_+ \cap w^{-1} \Delta_-.$$

Proof. The operator $d_{i\lambda + 2\rho,w}$ in $\overline{V}_{w,1}$ belongs to $B_1^q(\overline{V}_{w,1})$ if and only if $d_{i\lambda,w} \in B_1(\overline{V}_{w,1})$ because of (7.1), (7.2), and the selfadjointness of $\pi_{w,1}(a_{\rho,w})$. The operator $d_{i\lambda,w}$ is diagonal in the orthonormal basis (3.10) of $\overline{V}_{w,1}$ and according to (4.10) acts by
$$d_{i\lambda,w} . e_{k_1, \ldots, k_n} = \prod_{j=1}^{n} q^{-i(k_j + 1)(w_j \lambda, \alpha_{i_j})}\, e_{k_1, \ldots, k_n}, \tag{7.3}$$
recall the notation (4.8). It is clear that the linear operator $d_{i\lambda,w}$ in $\overline{V}_{w,1}$ is trace class if and only if $\mathrm{Re}(i\lambda, w_j^{-1} \alpha_{i_j}) > 0$ for $j = 1, \ldots, n = l(w)$, which implies the statement because of (6.25). □

Definition 7.2. The function
$$c^q_{w^{-1}}(\lambda) = \mathrm{qtr}_{\overline{V}_{w,1}}(d_{i\lambda + 2\rho,w}) = \mathrm{const}_w\, \mathrm{tr}_{\overline{V}_{w,1}}(d_{i\lambda,w}) \tag{7.4}$$
in the domain $\{\lambda \in \mathfrak{h} \mid \mathrm{Im}(\lambda, \beta) < 0, \ \forall\, \beta \in \Delta_+ \cap w^{-1} \Delta_-\}$ will be called the quantum Harish-Chandra c-function associated to the element $w^{-1}$ of the Weyl group $W$ of $\mathfrak{g}$.

Proposition 7.3. For all $w \in W$ the quantum Harish-Chandra c-function $c^q_w(\lambda)$ is given by
$$c^q_w(\lambda) = \prod_{\beta \in \Delta_+ \cap w \Delta_-} \frac{q^{(2\rho, \beta)} - 1}{q^{(i\lambda, \beta)} - 1}.$$

This proposition follows from (7.3) and (6.25) similarly to the proof of the normalization (6.24).

Remark 7.4. Proposition 7.3 is a quantum analog of the Harish-Chandra formula for the c-function in the case of complex simple Lie groups, generalized later by Gindikin and Karpelevich to arbitrary real reductive groups. Recall the setting of Sect. 5.6 and Remark 6.10. Let $dn_w$ denote the Haar measure on the unipotent subgroup $N_w$ of $G$. The classical Harish-Chandra c-function associated to the element $w^{-1} \in W$ is given by the integral formula
$$c_{w^{-1}}(\lambda) = \int_{N_w} a_w(n)^{-(i\lambda + 2\rho)}\, dn_w, \quad \lambda \in \mathfrak{h}, \ \mathrm{Im}(\lambda, \beta) < 0, \ \forall\, \beta \in \Delta_+ \cap w^{-1} \Delta_-,$$
recall Definition (5.7) of the map $a_w : N \to A$. We refer to [3] for a detailed treatment of spherical functions and to [8] for an interpretation of the c-function in terms of the Poisson geometry of $K$, see in particular Example 2.8 in [8].

The linear operators $d_{i\lambda + 2\rho}$ in the modules $\overline{V}_{w,1}$ can be thought of as quantizations of the pushforwards of the functions $a_w(n)^{-(i\lambda + 2\rho)}$ on $N_w$ to the symplectic leaves $S_w$ by the dressing action, using the base points $\dot{w} \in S_w$ (i.e. using the diffeomorphisms (6.26)). As was explained in Remark 6.10 the quantum quasi-traces $\mathrm{qtr}_{\overline{V}_{w,1}}$ in the $C_q[G]$-modules $\overline{V}_{w,1}$ are quantizations of the pushforwards of the Haar measures on $N_w$ to the symplectic leaves $S_w$.
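The product formula of Proposition 7.3 can be compared numerically with the defining trace, and with its classical counterpart, the product of the ratios $(2\rho, \beta)/(i\lambda, \beta)$ over the same set of roots. The sketch below is only an illustration under stated assumptions: $q > 1$, and the inputs are the lists of pairings $(\lambda, \beta)$ and $(2\rho, \beta)$ over the relevant roots, with sample values chosen for type $A_2$. All function names are hypothetical.

```python
def quantum_c(q, lam_pairings, rho2_pairings):
    """Product of (q^(2rho,beta) - 1) / (q^(i (lambda,beta)) - 1) over the roots."""
    result = 1 + 0j
    for lam_b, rho_b in zip(lam_pairings, rho2_pairings):
        result *= (q ** rho_b - 1) / (q ** (1j * lam_b) - 1)
    return result


def trace_series(q, lam_pairings, rho2_pairings, kmax=300):
    """const_w times the truncated trace of d_{i lambda, w}, using the eigenvalues (7.3)."""
    result = 1 + 0j
    for lam_b, rho_b in zip(lam_pairings, rho2_pairings):
        series = sum(q ** (-1j * (k + 1) * lam_b) for k in range(kmax))
        result *= (q ** rho_b - 1) * series
    return result


def classical_c(lam_pairings, rho2_pairings):
    """Classical product of (2rho,beta) / (i (lambda,beta))."""
    result = 1 + 0j
    for lam_b, rho_b in zip(lam_pairings, rho2_pairings):
        result *= rho_b / (1j * lam_b)
    return result


if __name__ == "__main__":
    # Assumed sample data: three roots with Im(lambda, beta) < 0.
    lam = [1 - 2j, 0.5 - 1j, 1.5 - 3j]
    rho2 = [2, 2, 4]
    print(quantum_c(1.5, lam, rho2), trace_series(1.5, lam, rho2))
    print(quantum_c(1 + 1e-6, lam, rho2), classical_c(lam, rho2))
```

The first line of output compares the closed product with the geometric series; the second shows the quantum expression approaching the classical one as $q \to 1$.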
The classical Harish-Chandra formula
$$c_w(\lambda) = \prod_{\beta \in \Delta_+ \cap w \Delta_-} \frac{(2\rho, \beta)}{(i\lambda, \beta)}, \quad \lambda \in \mathfrak{h}, \ \mathrm{Im}(\lambda, \beta) < 0, \ \forall\, \beta \in \Delta_+ \cap w^{-1} \Delta_-$$
is proved by induction on the length of $w$, see [3, Chapter IV, §6]. Lu [8] found that this argument is essentially based on the product formula (5.9) for the leaves $S_w$. Our computation relies on its quantum counterpart, the tensor product formula (3.9) for the representations $\pi_{w,t}$, cf. also Sect. 6.4.

References
1. Chari, V. and Pressley, A.: A guide to quantum groups. Cambridge: Cambridge Univ. Press, 1994
2. Drinfeld, V. G.: Quantum groups. Proc. ICM, Berkeley, 1986, Providence, RI: AMS, 1987, pp. 798–820
3. Helgason, S.: Groups and geometric analysis. Pure Appl. Math. 113, London–New York: Acad. Press, 1984
4. Hodges, T. J. and Levasseur, T.: Primitive ideals of Cq [G]. Preprint 1992
5. Humphreys, J. E.: Reflection groups and Coxeter groups. Cambridge Stud. Adv. Math. 29, Cambridge: Cambridge Univ. Press, 1990
6. Joseph, A.: Quantum groups and their primitive ideals. Ergebnisse der Mathematik und ihrer Grenzgebiete (3), Berlin–Heidelberg–New York: Springer–Verlag, 1995
7. Korogodski, L.I. and Soibelman, Ya.S.: Algebras of functions on quantum groups: Part I. AMS Math. Surveys and Monographs 56, Providence, RI: AMS, 1998
8. Lu, J.-H.: Coordinates on Schubert cells, Kostant’s harmonic forms, and the Bruhat Poisson structure on G/B. Transform. Groups 4, no. 4, 355–374 (1999)
9. Lu, J.-H. and Weinstein, A.: Poisson Lie groups, dressing transformations, and Bruhat decompositions. J. Diff. Geom. 31, no. 2, 501–526 (1990)
10. Lusztig, G.: Introduction to quantum groups. Progr. Math. 110, Basel–Boston: Birkhäuser, 1993
11. Reshetikhin, N.Yu. and Turaev, V.G.: Ribbon graphs and their invariants derived from quantum groups. Commun. Math. Phys. 127, no. 1, 1–26 (1990)
12. Reshetikhin, N.Yu. and Turaev, V.G.: Invariants of 3-manifolds via link polynomials and quantum groups. Invent. Math. 103, no. 3, 547–597 (1991)
13. Semenov-Tian-Shansky, M.A.: Dressing transformations and Poisson group actions. Publ. Res. Inst. Math. Sci. 21, no. 6, 1237–1260 (1991)
14. Soibelman, Ya.S.: The algebra of functions on a compact quantum group and its representations. St. Petersburg Math. J. 2, 193–225 (1991)
15. Soibelman, Ya.S.: Selected topics in quantum groups. In: Infinite analysis, Part B (Kyoto, 1991), Adv. Ser. Math. Phys. 16, Singapore: World Sci. Publ., 1992, pp. 859–887
16. Soibelman, Ya.S. and Vaksman, L.L.: An algebra of functions on the quantum group SU (2). Funct. Anal. Appl. 22, no. 3, 170–181 (1988)
17. Soibelman, Ya.S. and Vaksman, L.L.: On some problems in the theory of quantum groups. In: Representation theory and dynamical systems, Adv. Soviet Math. 9, Providence, RI: AMS, 1992, pp. 3–55
18. Woronowicz, S.L.: Compact matrix pseudogroups. Commun. Math. Phys. 111, no. 4, 613–665 (1987)

Communicated by H. Araki
Commun. Math. Phys. 224, 427 – 442 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Quantum Morphing and the Jones Polynomial
Oliver T. Dasbach, Thang D. Le, Xiao-Song Lin
Department of Mathematics, University of California, Riverside, CA 92521, USA
Received: 15 February 2001 / Accepted: 8 June 2001
Abstract: We will explore the experimental observation that on the set of knots with bounded crossing number, algebraically independent Vassiliev invariants become correlated, as noticed first by S. Willerton. We will see this through the value distribution of the Jones polynomial at roots of unity. As the degree of the roots of unity gets larger, the higher order fluctuation diminishes and a more organized shape emerges from a rather random value distribution of the Jones polynomial. We call such a phenomenon “quantum morphing”. Evaluations of the Jones polynomial at roots of unity play a crucial role, for example in the volume conjecture.

When I questioned your pupil, under a pine-tree,
“My teacher”, he answered, “went for herbs,
But toward which corner of the mountain,
How can I tell, through all these clouds?”
Jia Dao (777–841), Chinese Poet of Tang Dynasty
1. Introduction

While the Alexander polynomial of a knot is considered as being well-understood, the Jones polynomial remains mysterious. The Alexander polynomial has a solid interpretation in terms of classical topology; such an interpretation of the Jones polynomial is not known. The Alexander polynomial is computable in polynomial time in the number of crossings of a knot while evaluations of the Jones polynomial at all but eight points are known to be #P-hard [JVW90].

Partially supported by the Overseas Youth Cooperation Research Fund of NSFC
The theory of Vassiliev knot invariants gave a common framework for the Alexander polynomial and the Jones polynomial and its generalizations, known as quantum polynomials. After suitable renormalizations, the coefficients of the polynomials are Vassiliev invariants. Since each of these invariants is computable in polynomial time, this gives a way of approximating the Jones polynomial in polynomial time.

Although the space of Vassiliev knot invariants is quite large [Das00], there is a lot of interest in understanding the simplest Vassiliev invariants. (See e.g. the recent 28-page preprint [PV99] devoted to the Vassiliev invariant of order 2. For the use of this Vassiliev invariant for proofs on the Property P conjecture see [MZ00].) The space of Vassiliev invariants of order three is two-dimensional and has a natural (both algebraically and linearly independent) basis $v_2$ and $v_3$ with integer values. The image of the pair $(v_2, v_3)$ under all knots is the integer lattice. However, this point of view might be misleading. As observed by Willerton [Wil01], if one restricts the crossing number of the knots, the image of $(v_2, v_3)$, plotted into the plane, has (at least for crossing number less than 16) a distinct shape, which he called “fish”. Therefore, there might be a possible correlation among the crossing number and the first two Vassiliev invariants.

We will put this observation in a more general setting. The first two Vassiliev invariants are the quadratic and cubic terms in the power series expansion of the Jones polynomial evaluated at $e^x$. Note that its linear term vanishes and that the constant term is always 1. Thus, if we evaluate the Jones polynomial at the primitive $n$th root of unity then, as $n$ grows, this complex number is more and more determined by $v_2$ and $v_3$. In this way we see $v_2$ and $v_3$ as a limit of the Jones polynomial evaluated at primitive roots of unity.
So, Willerton’s plotting, revealing the correlation of v2 , v3 and the crossing number, can be approached by renormalizing the value plotting of the Jones polynomial at a primitive root of unity of high order. Hence, within the set of knots with fixed crossing number, by varying the order of the primitive root of unity, we observe that the value distribution of the Jones polynomial reduces its randomness and becomes stabilized at a more organized shape. We call this phenomenon “quantum morphing”. A somewhat astonishing observation, seen in our plottings, is that the same kind of correlations seem to hold if we confine ourselves to alternating knots or, equally well, to non-alternating knots. The Jones polynomial for alternating knots is a specialization of the Tutte polynomial of the corresponding checkerboard graph of an alternating diagram of the knot. Thus, all observations hold for this specialization of the Tutte polynomial for planar graphs on a given number of edges as well. To explain some of the phenomena seen in the pictures, we rewrite the formulas for the second and third Vassiliev invariants given by Polyak and Viro (also known to Lannes and Fiedler). This will provide further hints to the correlation among the crossing number and these knot invariants. It is interesting to consider the special case of knots of low braid index. For knots of braid index 3, we will provide an explanation of the expected correlation. Finally, we indicate that this method could be explored further to reveal a possible correlation of the similar kind among the crossing number and the “Vassiliev coefficients” of the Jones and the Alexander polynomials. We would like to thank Jim Hoste and Morwen Thistlethwaite for their program knotscape. It provides a wonderful tool for the study of knots. Furthermore, the first author would like to thank Joan Birman for her encouragement.
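The claim that the value at a primitive $n$th root of unity is increasingly determined by the quadratic and cubic coefficients can be illustrated numerically. The sketch below is not taken from the paper: it assumes the standard Jones polynomial $t + t^3 - t^4$ of one chirality of the trefoil, for which the quadratic and cubic coefficients of $V_K(e^x)$ work out to $-3$ and $-6$, and it measures how far the cubic truncation is from the exact value at $e^{2\pi i/n}$.

```python
import cmath

# Assumed example: Jones polynomial of a trefoil, stored as {exponent: coefficient}.
TREFOIL = {1: 1, 3: 1, 4: -1}


def jones_at(jones, t):
    """Evaluate the Laurent polynomial sum_k a_k t^k at a complex number t."""
    return sum(a * t ** k for k, a in jones.items())


def truncation_error(jones, v2, v3, n):
    """Distance between V(e^x) and 1 + v2 x^2 + v3 x^3 at x = 2 pi i / n."""
    x = 2j * cmath.pi / n
    exact = jones_at(jones, cmath.exp(x))
    return abs(exact - (1 + v2 * x ** 2 + v3 * x ** 3))


if __name__ == "__main__":
    for n in (25, 50, 100, 200):
        print(n, truncation_error(TREFOIL, -3, -6, n))
```

The error shrinks roughly like $n^{-4}$, in line with a remainder that starts at the quartic term.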
2. Further Motivations and Discussion

2.1. Complexity theory. It is intriguing to think of the Jones polynomial and the Alexander polynomial from the point of view of computational complexity theory. The Alexander polynomial has its root in classical topology. As with most classical topological invariants, the Alexander polynomial is computable in polynomial time. To give a common framework with the Jones polynomial and other quantum polynomials it is convenient to see this fact in terms of representations of the braid group. Starting with a diagram of a knot with $c$ crossings, Vogel’s algorithm (see [Vog90] and compare with [Yam87]) transforms the knot into a closed braid on $s$ strands, where $s$ is the number of Seifert circles in the diagram and thus bounded by $c + 1$. The word length of the resulting (non-unique) braid is bounded by a polynomial in $c$. Now the Alexander polynomial can be computed as a determinant from an $s$-dimensional representation of the braid group $B_s$. Combining these steps, we see that the computation of the Alexander polynomial for a knot of crossing number $c$ is possible in polynomial time in $c$.

The Jones polynomial, on the other hand, is defined in this setting as a weighted trace of a $2^s$-dimensional representation of the braid group $B_s$. Since $s$ depends on $c$, we could only get an algorithm of exponential complexity for the computation of the Jones polynomial. Note, however, that here a subtlety is of some importance. If we confine our consideration only to knots given as diagrams with a bounded number of Seifert circles, then the computation of the Jones polynomial is polynomial in the crossing number. In particular, the computation of the Jones polynomial of closed $n$-braids is possible in polynomial time in the word length of $n$-braids. Without this restriction the computation of the Jones polynomial is harder than the computation of the Alexander polynomial (assuming $NP \neq P$). This was shown by Jaeger, Vertigan and Welsh [JVW90]. They proved that for any primitive root of unity $e^{2\pi i/n}$, $n > 4$ and $n \neq 6$, the evaluation of the Jones polynomial at this value is #P-hard. For a definition of #P see for example [GJ79].

This result makes it interesting to look at polynomial-time approximations of the Jones polynomial. Here, the theory of Vassiliev knot invariants comes into play. As shown in [BL93] the coefficient of $x^k$ in the power series expansion of $V_K(e^x)$ is a Vassiliev invariant of order $k$. Since in general Vassiliev invariants of order $k$ are computable in $O(c^k)$ time [BN95], this holds in particular for these coefficients as well. Truncations of the power series expansion now give a polynomial time approximation of the Jones polynomial. It is unknown whether one could get some a priori error estimate in terms of the crossing number. The possible correlation among the finite type coefficients of the Jones polynomial we observed here is certainly encouraging for the search of such an a priori estimate. For related discussion, see [Fre98].
2.2. Quantum computing and value distribution of the Jones polynomial. The braid group is intimately related with the physics of anyons or quantum Hall effects. Such a relationship is at the heart of the introduction in [FLW00] and [FKW00] of a universal computation model equivalent to quantum computation. Roughly speaking, this universal computation model uses the Jones representation at the fifth root of unity as basic logic gates. Density results in [FLW00] led to the following theorem in [FLW01] about the statistical value distribution of the Jones polynomial.

To state the theorem, we fix an $r$th primitive root of unity, $r \geq 3$, $r \neq 3, 4, 6$, and let $V : B_n \to \mathbb{C}$ be given by evaluating the Jones polynomial of the closure of a braid $\sigma \in B_n$ at the $r$th root of unity. For a braid $\sigma \in B_n$, its word length will be calculated in terms of the $n - 1$ standard generators of $B_n$. A density measure $\mu_n$ on $\mathbb{C}$ can be defined as follows. For a subset $S \subset \mathbb{C}$,
$$\mu_n(S) = \lim_{l \to \infty} \frac{\#\{\sigma \in B_n;\ \mathrm{length}(\sigma) = l,\ V(\sigma) \in S\}}{(2(n - 1))^l}.$$
Theorem 2.1 ([FLW01]). When n → ∞, µn approaches a Gaussian distribution on C whose deviation depends on r. If we understand this theorem as describing the statistical value distribution of the Jones polynomial on the set of isotopy classes of links, caution must be used. First of all, the braid index is used here to filtrate the set of isotopy classes. Moreover, since different braids may represent the same link, the limiting Gaussian distribution in the theorem above is for a “weighted” value distribution of the Jones polynomial on links. Factoring through braids leads to this theorem, which is the first of its kind, about the statistical value distribution of the Jones polynomial. It can be thought of as an indication of the randomness of the values of the Jones polynomial. Nevertheless, our plotting shows some more delicate features of the actual value distribution of the Jones polynomial. This seems to be the case in particular regarding the phenomena of “quantum morphing”, that the value distribution of the Jones polynomial exhibits some kind of regularity when r is getting larger. Such a tendency becomes precise in the case of B3 (see Prop. 6.1). We wonder whether some theorems about the phenomena of “quantum morphing” could be established for each braid group Bn . 2.3. The volume conjecture of Kashaev and Murakami–Murakami. The distinctive shape of the value distribution of the Jones polynomial at higher order roots of unit might be thought of as an evidence in support of the volume conjecture of Kashaev and Murakami– Murakami. Recall that the N -dimensional irreducible representation of sl2 gives rise to the colored Jones polynomial. The exponential growth rate of the norm of the colored Jones polynomial at e2πi/N is conjectured in [Kas97, MM99] to be equal to the simplicial volume of the knot complement. 
It is known that for a knot $K$, the colored Jones polynomial at the $N$-dimensional irreducible representation of $sl_2$ is determined by the usual Jones polynomial ($N = 2$) at the connected $r$-fold cabling of $K$, $r < N$. Notice further that for a fixed knot, the crossing number of its connected $N$-fold cabling grows like a quadratic function of $N$. If the value distribution of the Jones polynomial at $e^{2\pi i/N}$ for knots with crossing number, say $N$, were random when $N$ is very large, we certainly would have less chance to get a meaningful exponential growth rate for its norm. Fortunately, what we see from our plots points in the opposite direction. It is interesting to try to think about the volume conjecture along this line more quantitatively.
3. The Pictures
With the standard notation as in [Jon87], let $\Delta_K(t)$ be the Alexander polynomial and $V_K(t)$ be the Jones polynomial of a knot $K$. We expand the Jones polynomial into a power series by a change of variables $t = e^x$:
\[ V_K(e^x) = 1 + \sum_{n=2}^{\infty} V_n(K)\, x^n. \tag{1} \]
Quantum Morphing and the Jones Polynomial
The Alexander polynomial $\Delta_K(t)$, on the other hand, is a polynomial in $(t^{1/2} - t^{-1/2})^2$:
\[ \Delta_K(t) = 1 + \sum_{n=1}^{N} c_{2n}(K)\,(t^{1/2} - t^{-1/2})^{2n}. \tag{2} \]
The coefficients $V_n(K)$ and $c_n(K)$ are Vassiliev invariants of order $n$. We will call them Vassiliev coefficients of the Jones polynomial and the Alexander polynomial. From the general theory of Vassiliev invariants, we know that $V_2, V_3, \dots, V_n, \dots$ are algebraically independent knot invariants. In other words, there is no non-trivial polynomial $P = P(x_1, x_2, \dots, x_k)$ such that $P(V_{n_1}(K), V_{n_2}(K), \dots, V_{n_k}(K)) = 0$ for all knots $K$. Nevertheless, actual evaluation of the Jones polynomial reveals that on the set of knots with bounded crossing number, these knot invariants $V_2, V_3, \dots, V_n, \dots$ become correlated in a certain sense. Such a correlation between $V_2$ and $V_3$ was first observed by S. Willerton in his plot of the “fish”. To agree with the standard notation in the literature, we give the following definition.
Definition 3.1.
\[ v_2(K) := -\frac{1}{6} V_K''(1) = \frac{1}{2}\Delta_K''(1), \qquad v_3(K) := -\frac{1}{36}\bigl(V_K'''(1) + 3V_K''(1)\bigr). \]
Note that with this definition $V_K(e^x) = 1 - 3v_2 x^2 - 6v_3 x^3 + O(x^4)$. So $V_2 = -3c_2 = -3v_2$ and $V_3 = -6v_3$. The invariants $v_2$ of order 2 and $v_3$ of order 3 span the whole space of Vassiliev invariants of order less than or equal to 3. For $\bar K$ the mirror image of $K$ we have $v_2(\bar K) = v_2(K)$ and $v_3(\bar K) = -v_3(K)$. In particular, if $K$ is amphicheiral then $v_3(K) = 0$. As a Vassiliev invariant of order 2, $v_2$ is uniquely determined by $v_2(\text{unknot}) = 0$ and $v_2(\text{trefoil}) = 1$. Similarly, the Vassiliev invariant $v_3$ of order 3 is uniquely determined by $v_3(\text{unknot}) = 0$, $v_3(\text{right-trefoil}) = 1$, and $v_3(\text{figure-eight}) = 0$. Let $q_n$ be the $n$th root of unity $q_n := e^{2\pi i/n}$. Since $V_K(1) = 1$ for a knot $K$, we have the classical limit
\[ \lim_{n\to\infty} V_K(q_n) = 1. \]
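The normalization in Definition 3.1 can be checked on small examples. The following is a minimal sketch, not part of the original computations: the dictionary encoding of a Laurent polynomial and the helper names are assumptions made for illustration, and the two Jones polynomials are the standard ones for the right-handed trefoil and the figure-eight knot, written in the convention in which $v_3$ of the right-handed trefoil equals $+1$.

```python
from fractions import Fraction

# Laurent polynomials are stored as {exponent: coefficient}
# (an encoding chosen for this sketch only).
RIGHT_TREFOIL = {1: 1, 3: 1, 4: -1}                    # t + t^3 - t^4
FIGURE_EIGHT = {-2: 1, -1: -1, 0: 1, 1: -1, 2: 1}      # t^-2 - t^-1 + 1 - t + t^2


def derivative_at_one(poly, order):
    """Return the order-th derivative of the Laurent polynomial at t = 1."""
    total = 0
    for exponent, coeff in poly.items():
        falling = 1
        for j in range(order):
            falling *= exponent - j   # e (e-1) ... (e-order+1)
        total += coeff * falling
    return total


def v2_v3(jones):
    """Apply Definition 3.1: v2 = -V''(1)/6, v3 = -(V'''(1) + 3 V''(1))/36."""
    d2 = derivative_at_one(jones, 2)
    d3 = derivative_at_one(jones, 3)
    return Fraction(-d2, 6), Fraction(-(d3 + 3 * d2), 36)


if __name__ == "__main__":
    print("right-handed trefoil:", v2_v3(RIGHT_TREFOIL))
    print("figure-eight:", v2_v3(FIGURE_EIGHT))
```

For the trefoil this reproduces the normalization $v_2 = v_3 = 1$ stated above, and for the amphicheiral figure-eight knot it gives $v_3 = 0$.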
For our purposes other limits are more useful:
Proposition 3.2. We renormalize the real and imaginary part of the evaluation of the Jones polynomial in the following way:
\[ \tilde V_{2,n} := \mathrm{re}\,\frac{V_K(q_n) - 1}{(2\pi i/n)^2}, \qquad \tilde V_{3,n} := \mathrm{re}\,\frac{V_K(q_n)}{(2\pi i/n)^3}. \]
Here, $\mathrm{re}(z)$ denotes the real part of a complex number $z$.
O. T. Dasbach, T. D. Le, X.-S. Lin
We have
\[ \lim_{n\to\infty} \tilde V_{2,n} = V_2, \qquad \lim_{n\to\infty} \tilde V_{3,n} = V_3. \]
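The convergence in Proposition 3.2 can be observed numerically. The sketch below is only an illustration under stated assumptions: it uses the right-handed trefoil, $V(t) = t + t^3 - t^4$, encoded as an exponent-to-coefficient dictionary, and the function names are hypothetical rather than taken from the authors' plotting code.

```python
import cmath
import math

# Jones polynomial of the right-handed trefoil, V(t) = t + t^3 - t^4,
# stored as {exponent: coefficient} (an encoding chosen for this sketch).
RIGHT_TREFOIL = {1: 1, 3: 1, 4: -1}


def evaluate(poly, t):
    """Evaluate a Laurent polynomial at a complex number t."""
    return sum(coeff * t ** exponent for exponent, coeff in poly.items())


def renormalized(poly, n):
    """Return (V~_{2,n}, V~_{3,n}) as defined in Proposition 3.2."""
    x = 2j * math.pi / n               # so that q_n = exp(x)
    value = evaluate(poly, cmath.exp(x))
    v2_tilde = ((value - 1) / x ** 2).real
    v3_tilde = (value / x ** 3).real
    return v2_tilde, v3_tilde


def vassiliev_coefficient(poly, order):
    """Coefficient of x^order in V(e^x), i.e. the limit predicted by Eq. (1)."""
    return sum(c * e ** order for e, c in poly.items()) / math.factorial(order)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, renormalized(RIGHT_TREFOIL, n))
    print("limits:", vassiliev_coefficient(RIGHT_TREFOIL, 2),
          vassiliev_coefficient(RIGHT_TREFOIL, 3))
```

The real part removes the odd powers of $2\pi i/n$ in the first quotient and the even ones in the second, so both errors decay like $1/n^2$.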
The proof is immediate from the expansion in Eq. (1). Other coefficients of the Jones polynomial can be obtained similarly by considering the limit of some functions of derivatives of $V_K(q_n)$ when $n$ approaches infinity. The plots of the values of the (renormalized) Jones polynomial at various roots of unity will show decreasing randomness and the emergence of more organized shapes. More specifically, we plot the following data:
1. The (renormalized) evaluation at various roots of unity of the Jones polynomials of all (a) alternating prime knots (Fig. 4), (b) non-alternating prime knots (Fig. 5), (c) prime knots (Fig. 6) with crossing number 13;
2. The (renormalized) evaluation at various roots of unity of the Jones polynomials of all alternating prime knots with crossing number 14 (Fig. 7).
Finally, the plots for other pairs of Vassiliev coefficients of the Jones polynomial, namely $(V_2, V_4)$, $(V_4, V_3)$, and $(V_4, V_5)$, show similar phenomena as in the case of $(V_2, V_3)$ (Fig. 8).
4. Some Explanations
We can offer only very partial explanations of the observed correlations among Vassiliev coefficients of the Jones polynomial on the set of knots with bounded crossing number.
4.1. Upper bounds for $v_2$ and $v_3$. By results in [BN95] every Vassiliev invariant of order $k$ can be computed in $O(c^k)$ time and its value is in $O(c^k)$, where $c$ is the crossing number of a knot. An explicit quadratic upper bound for the invariant $v_2$ of order 2 in terms of $c$ is given in [PV99] as $|v_2| \le c^2/8$. Similarly, for $v_3$ one can get the cubic bound [Wil01]
\[ |v_3| \le \frac{1}{4}\, c(c-1)(c-2). \]
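To get a feeling for how much room these bounds leave, they can be evaluated on the two smallest non-trivial knots. This is a small sketch with hypothetical helper names; the triples (crossing number, $v_2$, $v_3$) are the standard values for the right-handed trefoil and the figure-eight knot in the normalization of Definition 3.1.

```python
from fractions import Fraction


def v2_bound(crossings):
    """Quadratic bound |v2| <= c^2 / 8 quoted from [PV99]."""
    return Fraction(crossings ** 2, 8)


def v3_bound(crossings):
    """Cubic bound |v3| <= c (c - 1) (c - 2) / 4 quoted from [Wil01]."""
    return Fraction(crossings * (crossings - 1) * (crossings - 2), 4)


# (crossing number, v2, v3); illustrative data, not taken from the paper's tables.
EXAMPLES = {
    "right-handed trefoil": (3, 1, 1),
    "figure-eight": (4, -1, 0),
}


def satisfies_bounds(crossings, v2, v3):
    """Check both bounds for one knot."""
    return abs(v2) <= v2_bound(crossings) and abs(v3) <= v3_bound(crossings)


if __name__ == "__main__":
    for name, (c, v2, v3) in EXAMPLES.items():
        print(name, v2_bound(c), v3_bound(c), satisfies_bounds(c, v2, v3))
```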
4.2. Formulas for $v_2$ and $v_3$. Combinatorial formulas for $v_2$ and $v_3$ were given by several authors. We will use the approach of Polyak and Viro [PV94], where we fix an oriented knot diagram and extract from it a “signed arrow diagram”. Briefly, the knot diagram gives us a generic immersion of $S^1$ in the plane, which in turn determines a chord diagram as before. Then an arrow and a sign $\pm 1$ are added to each chord to encode the information we get from the fact that this chord diagram comes from an oriented knot diagram. This is the signed arrow diagram of the knot diagram. Ignoring the signs from a signed arrow diagram, we get an arrow diagram. A (signed) arrow diagram can also be based, which means that we fix a base point on the circle away from the end points of arrows. Finally, a sub-diagram is obtained by deleting several arrows from a (based, signed) arrow diagram.
Fig. 1. Some (based) arrow diagrams X, H and Y
Let $G$ be a signed based arrow diagram coming from a knot projection of a knot $K$. For a given based arrow diagram $D$, an imbedding $\phi : D \to G$ identifies $D$ with a sub-diagram of $G$. Define $\mathrm{sign}(\phi)$ to be the product of all signs of the arrows in $\phi(D)$. Let $\dot X$ be the based arrow diagram in Fig. 1. Ignoring the base point of $\dot X$ we get the arrow diagram $X$. Two other arrow diagrams $H$ and $Y$ are also given in Fig. 1.
Proposition 4.1 (Polyak–Viro). We have
\[ v_2(K) = \sum_{\phi : \dot X \to G} \mathrm{sign}(\phi) \]
and
\[ v_3(K) = \frac{1}{2} \sum_{\phi : H \to G} \mathrm{sign}(\phi) + \sum_{\phi : Y \to G} \mathrm{sign}(\phi). \]
We now reformulate the Polyak–Viro formulas so that the summation is taken over the same set of elements in both cases of $v_2$ and $v_3$. This set consists of all imbeddings $\phi : X \to G$. Fixing such an imbedding $\phi$, we define three weights as follows:
1. $w_1(\phi)$ equals 1 plus the number of endpoints of arrows in $G$ which lie between the arrow-heads of $\phi(X)$.
2. $w_2(\phi)$ equals the sum of signs of arrows in $G$ such that together with arrows in $\phi(X)$, they form the arrow diagram $H$.
3. $w_3(\phi)$ equals the sum of signs of arrows in $G$ such that together with arrows in $\phi(X)$, they form the arrow diagram $Y$.
Finally, let $c$ be the number of arrows in $G$.
Proposition 4.2. We have
\[ v_2(K) = \sum_{\phi : X \to G} \mathrm{sign}(\phi)\, \frac{w_1(\phi)}{2c} \]
and
\[ v_3(K) = \sum_{\phi : X \to G} \mathrm{sign}(\phi) \left( \frac{1}{4} w_2(\phi) + \frac{1}{3} w_3(\phi) \right). \]
4.3. Positive knots. Let $K$ be a knot having a knot diagram with all $c$ crossings positive. We have the following bounds.
Proposition 4.3. We have
\[ \frac{1}{2}\, v_2(K) \le v_3(K) \le \frac{10c - 5}{6}\, v_2(K). \]
Proof. The lower bound was already noticed by Willerton [Wil01]. In order to prove the promised upper bound, we bound the weights $w_2$ and $w_3$ from above by $w_1$. The first upper bound is easy to obtain: $w_3(\phi) \le w_1(\phi) - 1$ for every imbedding $\phi : X \to G$. The comparison of $w_2$ and $w_1$ is slightly more complicated. We will consider two imbeddings $\phi, \phi' : X \to G$ as two vertices in a graph $\mathcal{G}$. These two vertices are connected by an edge iff $\phi(X)$ and $\phi'(X)$ together have three different arrows and these three arrows form the arrow diagram $H$. Then $w_2(\phi)$ is equal to the valence of $\phi$ in the graph $\mathcal{G}$. There are two types of neighboring vertices for a fixed imbedding $\phi$, as shown in Fig. 2.
Fig. 2. Two types (type 1 and type 2) of relative positions of $\phi$ (solid arrows) and $\phi'$
Let $w_2^+(\phi)$ be the number of neighboring vertices of $\phi$ of the first type and $w_2^-(\phi)$ be the number of neighboring vertices of the second type. Of course, $w_2(\phi) = w_2^+(\phi) + w_2^-(\phi)$. First, it is obvious that $w_2^+(\phi) \le w_1(\phi) - 1$. We observe next that if $\phi$ and $\phi'$ are connected by an edge in $\mathcal{G}$ and $\phi'$ is of type 2 for $\phi$, then $\phi$ is of type 1 for $\phi'$. Therefore,
\[ \sum_{\phi : X \to G} w_2(\phi) = 2 \sum_{\phi : X \to G} w_2^+(\phi). \]
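Thus, we have
\[
\begin{aligned}
v_3(K) &= \frac{1}{2} \sum_{\phi} w_2^+(\phi) + \frac{1}{3} \sum_{\phi} w_3(\phi) \\
&\le \frac{1}{2} \sum_{\phi} w_1(\phi) - \frac{1}{2} \sum_{\phi} 1 + \frac{1}{3} \sum_{\phi} w_1(\phi) - \frac{1}{3} \sum_{\phi} 1 \\
&= \frac{5}{6}\, 2c\, v_2(K) - \frac{5}{6}\, v_2(K) \\
&= \frac{10c - 5}{6}\, v_2(K)
\end{aligned}
\]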
as desired. We learned that a slightly better upper bound is given in [Sto98].
4.4. Knots given as closed braids. Of special interest is the class of knots given as closed 3-braids. Using representation theory of Hecke algebras, Jones gave the following relation between $V_K(t)$ and $\Delta_K(t)$ for $K = \hat\alpha$, $\alpha \in B_3$ (see [Jon87]).
Proposition 4.4 (Jones). If $\alpha \in B_3$ is such that the closure $\hat\alpha$ is a knot and the exponent sum of $\alpha$ is $e$, then
\[ V_{\hat\alpha}(t) = t^{e/2} \bigl( 1 + t^{e} + t + 1/t - t^{e/2 - 1} (1 + t + t^2)\, \Delta_{\hat\alpha}(t) \bigr). \]
As a corollary, we have the following relationship among $v_2$, $v_3$ and $e$ for knots $K = \hat\alpha$, $\alpha \in B_3$.
Corollary 4.5. Let $K$ be a knot given as a closed 3-braid $K = \hat\alpha$ and $e$ be the exponent sum of $\alpha \in B_3$. Then
\[ V_3(K) = e V_2(K) - \frac{e}{2} + \Bigl( \frac{e}{2} \Bigr)^{3}. \]
For $B_4$, again using representation theory of Hecke algebras, Jones established the following formula relating the symmetrized Jones polynomial with the Alexander polynomial (see [Jon87]).
Proposition 4.6 (Jones). If $\alpha \in B_4$ is such that $\hat\alpha$ is a knot and the exponent sum of $\alpha$ is $e$, then
\[ t^{-e} V_{\hat\alpha}(t) + t^{e} V_{\hat\alpha}(1/t) = (t^{-3/2} + t^{-1/2} + t^{1/2} + t^{3/2})(t^{e/2} + t^{-e/2}) - (t^{-2} + t^{-1} + 2 + t + t^{2})\, \Delta_{\hat\alpha}(t). \tag{3} \]
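Corollary 4.5 can be spot-checked on a few familiar closed 3-braids. The sketch below is illustrative only: the helper names and the dictionary encoding are assumptions, and the braid words are standard presentations (the unknot as the closure of $\sigma_1\sigma_2$, the right-handed trefoil as the closure of $\sigma_1^3\sigma_2$, the figure-eight knot as the closure of $\sigma_1\sigma_2^{-1}\sigma_1\sigma_2^{-1}$) used here only to read off the exponent sum $e$.

```python
from fractions import Fraction
from math import factorial


def vassiliev_coefficient(jones, order):
    """Coefficient of x^order in V(e^x), for V given as {exponent: coefficient}."""
    return Fraction(sum(c * e ** order for e, c in jones.items()), factorial(order))


def corollary_rhs(exponent_sum, v2_coeff):
    """Right-hand side of Corollary 4.5: e*V2 - e/2 + (e/2)^3."""
    half = Fraction(exponent_sum, 2)
    return exponent_sum * v2_coeff - half + half ** 3


# name -> (Jones polynomial, exponent sum e of the chosen 3-braid word)
EXAMPLES = {
    "unknot (s1 s2)": ({0: 1}, 2),
    "right-handed trefoil (s1^3 s2)": ({1: 1, 3: 1, 4: -1}, 4),
    "figure-eight (s1 s2^-1 s1 s2^-1)": ({-2: 1, -1: -1, 0: 1, 1: -1, 2: 1}, 0),
}


def check(jones, exponent_sum):
    """Return True when V3 agrees with the value predicted by Corollary 4.5."""
    v2_coeff = vassiliev_coefficient(jones, 2)
    v3_coeff = vassiliev_coefficient(jones, 3)
    return v3_coeff == corollary_rhs(exponent_sum, v2_coeff)


if __name__ == "__main__":
    for name, (jones, e) in EXAMPLES.items():
        print(name, check(jones, e))
```

For the trefoil, for instance, $V_2 = -3$ and $e = 4$ give $4\cdot(-3) - 2 + 8 = -6 = V_3$.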
Comparing the Vassiliev coefficients of both sides of Eq. (3), we can get an algebraic relation among $e$, $V_2$, $V_3$, $V_4$, and $c_4$ as in Corollary 4.5. We will offer an elementary proof of Proposition 4.6 below. The proof also generalizes Eq. (3) to every $\alpha \in B_4$, not just the braids which close to a knot. Furthermore, our method can be generalized to get a similar equation for every braid group $B_n$ using the HOMFLY polynomial. But since we have no conclusive results directly relevant to the topics of this paper from such a generalization, it will not be presented here. Suffice it to say that such equations tell us that the Vassiliev coefficients of the Jones and Alexander polynomial for closed braids, together with the exponent sum $e$, are algebraically related with each other. See also [DL00].
For $\alpha \in B_4$, let $e$ be the exponent sum and $k$ be the number of components of $\hat\alpha$. Consider the following three invariants of conjugacy classes of $B_4$:
\[ \bar V = t^{-e} V_{\hat\alpha}(t) + (-1)^{k-1} t^{e} V_{\hat\alpha}(1/t), \qquad Q = t^{e/2} + (-1)^{k-1} t^{-e/2}, \qquad \Delta = \Delta_{\hat\alpha}(t). \]
They all satisfy the following skein relation:
\[ [\alpha_+] - [\alpha_-] = (t^{1/2} - t^{-1/2})\, [\alpha_0], \]
where $\alpha_+ = \alpha_0 \sigma_i$ and $\alpha_- = \alpha_0 \sigma_i^{-1}$, with $\sigma_i$ the standard braid generator. Thus, these invariants of conjugacy classes of $B_4$ are determined by their initial values on $\sigma_1\sigma_2\sigma_3$, $\sigma_1\sigma_3$, $\sigma_1$, and $1$ in $B_4$. (The set of braids $\{\sigma_1\sigma_2\sigma_3, \sigma_1\sigma_2, \sigma_1\sigma_3, \sigma_1, 1\}$ identifies with the set of conjugacy classes of the symmetric group $S_4$. Their closures are all unlinks. So the values of $\bar V$, $Q$, $\Delta$ on these braids depend only on $e$ and $k$. Thus $\sigma_1\sigma_2$ is dropped from our list of braids determining the initial values of $\bar V$, $Q$, $\Delta$, since it has the same $e$ and $k$ as the braid $\sigma_1\sigma_3$.) It is easy to check that the following matrix, whose rows are the values of $\bar V$, $Q$, and $\Delta$ on $\sigma_1\sigma_2\sigma_3$, $\sigma_1\sigma_3$, $\sigma_1$, and $1$, respectively, is of rank 2:
\[
\begin{pmatrix}
t^{-3} + t^{3} & (t^{-2} - t^{2})(-t^{1/2} - t^{-1/2}) & (t^{-1} + t)(-t^{1/2} - t^{-1/2})^{2} & 0 \\
t^{3/2} + t^{-3/2} & t - t^{-1} & t^{1/2} + t^{-1/2} & 0 \\
1 & 0 & 0 & 0
\end{pmatrix}.
\]
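The rank claim can be verified by hand. The following short computation is a sketch added for the reader's convenience; the auxiliary symbol $\lambda$ is introduced here and does not appear in the original argument.

```latex
% Put \lambda := (t + t^{-1})(t^{1/2} + t^{-1/2})
%             = t^{-3/2} + t^{-1/2} + t^{1/2} + t^{3/2}.
\begin{align*}
(t^{-2} - t^{2})(-t^{1/2} - t^{-1/2})
  &= (t - t^{-1})(t + t^{-1})(t^{1/2} + t^{-1/2}) = \lambda\,(t - t^{-1}), \\
(t^{-1} + t)(-t^{1/2} - t^{-1/2})^{2}
  &= \lambda\,(t^{1/2} + t^{-1/2}), \\
\lambda\,(t^{3/2} + t^{-3/2})
  &= (t^{3} + t^{-3}) + (t^{-2} + t^{-1} + 2 + t + t^{2}).
\end{align*}
% Hence, in every column,
%   (first row) = \lambda (second row) - (t^{-2} + t^{-1} + 2 + t + t^{2}) (third row),
% so the three rows are linearly dependent while the second and third rows are
% independent: the rank is 2.
```

Read as a relation among $\bar V$, $Q$ and $\Delta$, this is $\bar V = \lambda Q - (t^{-2} + t^{-1} + 2 + t + t^{2})\Delta$, which for $k = 1$ is the same as Eq. (3).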
Therefore, $\bar V$, $Q$, and $\Delta$ are linearly related, and one can work out Eq. (3) directly from this conclusion.
4.5. The evaluation of the Jones polynomial at $e^{2\pi i/10}$. The evaluation of the Jones polynomial at the tenth root of unity $q_{10} = e^{2\pi i/10}$ is somewhat special. We will give at least a heuristic reason for the difference in the “density” of the values of the Jones polynomial at the tenth root of unity compared with the other roots of unity.
Fig. 3. The link $L_\alpha$
Fig. 8. Plots of (V2 , V3 ), (V2 , V4 ), (V4 , V3 ), (V4 , V5 ) for knots on 13 crossings
Proposition 4.7. For a 3-braid $\alpha \in B_3$ and fixed $L$ let $L_\alpha$ be the link as in Fig. 3. Here the strands of $\alpha$ are supposed to be oriented in the same direction. Then the Jones polynomial evaluated at the tenth root of unity $q_{10}$ takes only finitely many values on the set $\{L_\alpha \mid \alpha \in B_3\}$.
Proof. By a result of Przytycki [Prz88] and a generalization of Stoimenow [Sto99], the value of the Jones polynomial at $q_{4k+2}$ does not change its norm if we insert $\sigma_i^{2k+1}$ into, or delete it from, the braid $\alpha$. Here, the $\sigma_i$ denote the standard generators of the braid groups. More specifically, $V(L_\alpha; q_{10})$ only changes by a multiplication by $-i$ if we delete $\sigma_i^{5}$ in a braid. By a result of Coxeter [Cox59] (compare with [Che01]) the braid group $B_n$ modulo its normal subgroup generated by $\sigma_1^{p}, \dots, \sigma_{n-1}^{p}$ is finite if and only if $(n-2)(p-2) < 4$. In particular, the group $B_3$ modulo the normal subgroup generated by the fifth powers of the standard generators is a finite group. Thus, $V(L_\alpha; q_{10})$ can only take finitely many values.
As a corollary we get the following result of Jones [Jon87]:
Corollary 4.8. The Jones polynomial evaluated at $q_{10}$ takes only finitely many values on the set of closed 3-braids.
References
[BL93] Birman, J.S. and Lin, X.-S.: Knot polynomials and Vassiliev’s invariants. Invent. Math. 111, no. 2, 225–270 (1993)
[BN95] Bar-Natan, D.: Polynomial invariants are polynomial. Math. Research Letters 2, 239–246 (1995)
[Che01] Chen, Q.: The 3-move conjecture for 5-braids. In: Proceedings of the International Conference on Knot Theory and its Ramifications, Singapore: World Scientific, 2001, pp. 36–47
[Cox59] Coxeter, H.S.M.: Factor groups of the braid group. In: Proc. fourth Canad. Math. Congress Banff 1957, 1959, pp. 95–122
[Das00] Dasbach, O.T.: On the combinatorial structure of primitive Vassiliev invariants III – A lower bound. Comm. Contempor. Math. 2, no. 4, 579–590 (2000)
[DL00] Dasbach, O.T. and Lin, X.-S.: The Bennequin number of closed n-trivial n-braids is negative. To appear in: Math. Res. Let. 8, No. 5–6 (2001)
[FKW00] Freedman, M.H., Kitaev, A. and Wang, Z.: Simulation of topological field theories by quantum computers. Preprint (Microsoft), available as: quant-ph/0001071, 2000
[FLW00] Freedman, M.H., Larsen, M.J. and Wang, Z.: A modular functor which is universal for quantum computation. Preprint (Microsoft), available as: quant-ph/0001108, 2000
[FLW01] Freedman, M.H., Larsen, M.J. and Wang, Z.: The two-eigenvalue problem and density of Jones representation of braid groups. Preprint (Microsoft), available as: math.GT/0103200, 2001
[Fre98] Freedman, M.H.: Topological views on computational complexity. In: Proceedings of the International Congress of Mathematicians, Vol. II (Berlin, 1998), Doc. Math. 1998, Extra Vol. II, pp. 453–464 (electronic)
[GJ79] Garey, M.R. and Johnson, D.S.: Computers and intractability. A guide to the theory of NP-completeness. A Series of Books in the Mathematical Sciences, San Francisco, CA: W. H. Freeman and Co., 1979
[Jon87] Jones, V.F.R.: Hecke algebra representations of braid groups and link polynomials. Ann. Math. 126, 335–388 (1987)
[JVW90] Jaeger, F., Vertigan, D.L. and Welsh, D.J.A.: On the computational complexity of the Jones and Tutte polynomials. Math. Proc. Cambridge Philos. Soc. 108, no. 1, 35–53 (1990)
[Kas97] Kashaev, R.M.: The hyperbolic volume of knots from the quantum dilogarithm. Lett. Math. Phys. 39, no. 3, 269–275 (1997)
[MM99] Murakami, H. and Murakami, J.: The colored Jones polynomials and the simplicial volume of a knot. Acta Math. 186, No. 1, 85–104 (2001)
[MZ00] Menasco, W.W. and Zhang, X.: Positive knots and knots with braid index three have Property P. Available as: math.GT/0010154, 2000
[Prz88] Przytycki, J.H.: $t_k$ moves on links. Braids (Santa Cruz, CA, 1986), Providence, RI: Amer. Math. Soc., 1988, pp. 615–656
[PV94] Polyak, M. and Viro, O.: Gauss diagram formulas for Vassiliev invariants. Internat. Math. Res. Notices no. 11, 445ff. (1994), approx. 8 pp. (electronic)
[PV99] Polyak, M. and Viro, O.: On the Casson knot invariant. Knots in Hellas ’98, Vol. 3 (Delphi). J. Knot Theory Ramifications 10, no. 5, 711–738 (2001)
[Sto98] Stoimenow, A.: Positive knots, closed braids and the Jones polynomial. Preprint, available as: math.GT/9805078, 1998
[Sto99] Stoimenow, A.: The granny and the square tangle and the unknotting number. Preprint, MPI Bonn, October 1999
[Vog90] Vogel, P.: Representation of links by braids: A new algorithm. Comment. Math. Helv. 65, 104–113 (1990)
[Wil01] Willerton, S.: On the first two Vassiliev invariants. Preprint, available as: math.GT/0104061, 2001
[Yam87] Yamada, S.: The Minimum Number of Seifert Circles Equals the Braid Index of a Link. Invent. Math. 89, 346–356 (1987)
Communicated by P. Sarnak
Commun. Math. Phys. 224, 443 – 544 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Global Regularity of Wave Maps II. Small Energy in Two Dimensions
Terence Tao
Department of Mathematics, UCLA, Los Angeles, CA 90095-1555, USA
Received: 14 December 2000 / Accepted: 18 June 2001
Abstract: We show that wave maps from Minkowski space $\mathbb{R}^{1+n}$ to a sphere $S^{m-1}$ are globally smooth if the initial data is smooth and has small norm in the critical Sobolev space $\dot H^{n/2}$, in all dimensions $n \ge 2$. This generalizes the results in the prequel [40] of this paper, which addressed the high-dimensional case $n \ge 5$. In particular, in two dimensions we have global regularity whenever the energy is small, and global regularity for large data is thus reduced to demonstrating non-concentration of energy.
1. Introduction
Throughout this paper $d \ge 2$, $n \ge 1$ will be fixed integers, and all constants may depend on $d$ and $n$. Let $\mathbb{R}^{1+n}$ be $n+1$ dimensional Minkowski space with flat metric $\eta := \mathrm{diag}(-1, 1, \dots, 1)$, and let $S^{d-1} \subset \mathbb{R}^d$ denote the unit sphere in the Euclidean space $\mathbb{R}^d$. Elements $\phi$ of $\mathbb{R}^d$ will be viewed as column vectors, while their adjoints $\phi^\dagger$ are row vectors. We let $\partial_\alpha$ and $\partial^\alpha$ for $\alpha = 0, \dots, n$ be the usual derivatives with respect to the Minkowski metric $\eta$, subject to the usual summation conventions. We let $\Box := \partial_\alpha \partial^\alpha = \Delta - \partial_t^2$ denote the d'Alembertian. We shall write $\phi_{,\alpha}$ and $\phi^{,\alpha}$ for $\partial_\alpha \phi$ and $\partial^\alpha \phi$ respectively. Define a classical wave map to be any function $\phi$ defined on an open set in $\mathbb{R}^{1+n}$ taking values on the sphere $S^{d-1}$ which is smooth, equal to a constant outside of a finite union of light cones, and obeys the equation
\[ \Box \phi = -\phi\, \phi_{,\alpha}^\dagger \phi^{,\alpha}. \tag{1} \]
For any time $t$, we use $\phi[t] := (\phi(t), \partial_t \phi(t))$ to denote the position and velocity of $\phi$ at time $t$. We refer to $\phi[0]$ as the initial data of $\phi$. Note that in order for $\phi[0]$ to be the initial data for a classical wave map, $\phi[0]$ must be smooth, equal to a constant outside of a compact set, and satisfy the consistency conditions
\[ \phi^\dagger(0)\phi(0) = 1; \qquad \phi^\dagger(0)\,\partial_t\phi(0) = 0. \tag{2} \]
We shall refer to data $\phi[0]$ which satisfy these properties as classical initial data. The purpose of this paper is to prove the following regularity result for classical wave maps. Let $\dot H^s := (\sqrt{-\Delta})^{-s} L^2(\mathbb{R}^n)$ denote the usual homogeneous Sobolev spaces.
Theorem 1. Let $n \ge 2$, and suppose that $\phi[0]$ is classical initial data which has a sufficiently small $\dot H^{n/2} \times \dot H^{n/2-1}$ norm. Then $\phi$ can be extended to a classical wave map globally in time. Furthermore, if $s$ is sufficiently close to $n/2$, we have the global bounds
\[ \|\phi[t]\|_{L^\infty_t(\dot H^s_x \times \dot H^{s-1}_x)} \lesssim \|\phi[0]\|_{\dot H^s_x \times \dot H^{s-1}_x}. \tag{3} \]
In particular, in the energy-critical two-dimensional case one has global regularity for wave maps with small energy into a sphere. From this and standard arguments based on finite speed of propagation (see e.g. [4]), we see that the problem of global regularity for general smooth data is thus reduced to demonstrating the non-concentration of energy. This non-concentration is known if one assumes some symmetry on the data and some curvature assumptions on the target manifold ([4, 30, 34, 35]), but is not known in general. For further discussion on these problems see, e.g. [15, 21, 29, 33].
A similar result, but with the Sobolev norm $\dot H^{n/2}$ replaced by a slightly smaller Besov counterpart $\dot B^{n/2,1}_2$, was obtained in [42]. Indeed, our paper shall largely be a (self-contained) combination of [42] and the prequel [40] to this paper, although there are some additional technical issues arising here which do not occur in the two papers just mentioned.
Theorem 1 was proven in [40] in the high-dimensional case $n \ge 5$. In that paper the main techniques were Littlewood–Paley decomposition, Strichartz estimates¹ and an adapted co-ordinate frame constructed by parallel transport. To cover the low dimensional cases $n = 2, 3, 4$ we shall keep the Littlewood–Paley decomposition and adapted co-ordinate frame construction (with only minimal changes from [40]), but we shall abandon the use of Strichartz estimates as the range of these estimates becomes far too restrictive to be of much use, especially when $n = 2$. Instead, we shall adapt the more intricate spaces (including $\dot X^{s,b}$ spaces) and estimates developed in [42], as substitutes for the Strichartz estimates. This will make the argument much lengthier and more involved, although the overall strategy is little changed² from that in [40]. One major new difficulty is that multiplication by $L^\infty_t L^\infty_x$ functions is not well-behaved on $\dot X^{s,b}$ spaces, and so we will need to replace $L^\infty_t L^\infty_x$ with a more complicated Banach algebra.
In [40] the non-linearity was placed (after localizing in frequency and switching to the adapted co-ordinate frame) in the familiar space $L^1_t \dot H^{n/2-1}_x$. When $n \ge 5$ this was relatively easy to achieve, since one had access to $L^2_t L^4_x$ and $L^2_t L^\infty_x$ Strichartz estimates. For $n = 4$ one loses the $L^2_t L^4_x$ estimate, but one could probably use $\dot X^{s,b}$ spaces and null form estimates (which would place $\phi_{,\alpha}^\dagger \phi^{,\alpha}$ in spaces such as $L^2_t L^2_x$) as a substitute, so that one could continue to place the non-linearity in such good spaces as $L^1_t \dot H^{n/2-1}_x$. When $n = 3$ one also (barely) loses the $L^2_t L^\infty_x$ estimate, although in principle this could be compensated for by the $L^p$ null form estimates in [44, 39] for certain $p < 2$. However in the energy-critical case $n = 2$, the best Strichartz estimate available is only $L^4_t L^\infty_x$, and it appears that even the best possible $L^p$ null form estimates³ are not strong enough to place the non-linearity in a space such as $L^1_t L^2_x$, even after using the adapted co-ordinate frame and introducing $\dot X^{s,b}$ type spaces. Because of this, we can only place a small portion of the non-linearity in $L^1_t L^2_x$. Following [42], we shall place the other portions of the non-linearity either in an $\dot X^{s,b}$ type space, or in $L^1_t L^2_x$ spaces corresponding to null frames. To obtain these types of control on the non-linearity, we shall use null-form estimates, as well as the decomposition, introduced in [42], of free solutions as a superposition of travelling waves, each of which is in $L^2_t L^\infty_x$ with respect to a certain null frame. This decomposition, combined with the $L^2_t L^2_x$ control coming from $\dot X^{s,b}$ estimates, is crucial to recover the $L^1_t L^2_x$ type control of the non-linearity which we need to close the argument.
The high-dimensional argument in [40] did not need to exploit the null structure in (1). However, one does not have this luxury in the low-dimensional cases, and we shall need in particular to rely on the identity
\[ 2\phi_{,\alpha}\psi^{,\alpha} = \Box(\phi\psi) - \phi\,\Box\psi - \Box\phi\,\psi \tag{4} \]
heavily (cf. [41, 42], and elsewhere). This identity is useful when $\phi$, $\psi$, $\phi\psi$ are relatively close to the light cone in frequency space, although when one is far away from the light cone this identity can become counter-productive.
It is quite possible that Theorem 1 can be extended to other manifolds than the sphere⁴, and to scattering and well-posedness results. We refer the reader to [40] as we have nothing of interest to add to that discussion here (other than a large increase in complexity). Indeed, we would strongly recommend to anyone interested in these problems for small data that they first study the high-dimensional case before attempting the low-dimensional one. (For large data one of course has blowup in dimensions greater than two due to the supercritical nature of the energy conservation law; see [31].)
¹ Readers familiar with the literature may be surprised that Strichartz estimates are able to handle the critical problem for wave maps. The reason is that the renormalization almost reduces the strength of the non-linearity to the level of a pure power. A more precise statement is that the renormalization ensures that in the event of high-low frequency interactions, at least one of the two derivatives in the cubic non-linearity will land on the lowest frequency term. To compare this with the pure power problem, observe that if we could somehow ensure that both derivatives in the non-linearity landed on the two lowest frequency terms, then we could differentiate the equation to obtain something like $\Box \nabla_{x,t}\phi = -\nabla_{x,t}\phi\, \phi_{,\alpha}^\dagger \phi^{,\alpha}$, which is a cubic semi-linear equation in $\nabla_{x,t}\phi$ with an additional null structure.
² Indeed, the basic renormalization argument only covers about a third of the paper, from Sect. 2 to Sect. 5. The bulk of the paper is concerned instead with constructing rather complicated function spaces as substitutes for the Strichartz spaces, and proving the relevant estimates for those spaces.
³ An examination of the known counter-examples suggests that it may just be possible to place $\phi_{,\alpha}\psi^{,\alpha}$ in $L^{4/3}_t L^2_x$, which in principle is just barely enough to obtain $L^1_t L^2_x$ control on the non-linearity thanks to the $L^4_t L^\infty_x$ Strichartz estimate. However this would require (among other things) a reworking of the endpoint argument of [39] and would therefore not be a simplification to this paper.
⁴ In dimensions $n \ge 5$ this has recently been achieved [19] in the case when the target manifold is boundedly parallelizable.
2. Notation and Preliminary Reduction
We shall restrict our attention to the low dimensional case $n = 2, 3, 4$ since the high dimensional case was already treated in [40]. We shall need some small exponents $0 < \delta_0 \ll \delta_1 \ll \delta_2 \ll \delta_3 \ll \delta_4 \ll 1$.
The exact choice of these exponents is not important, but for concreteness we shall choose them as follows. We first choose $0 < \delta_4 \ll 1$ to be a small absolute constant depending on $n$ ($\delta_4 = 1/100n$ shall do), and then set $\delta_i := \delta_{i+1}^{10}$ for $i = 3, 2, 1, 0$. We shall implicitly be inserting the disclaimer “assuming $\delta_4$ is sufficiently small depending on $n$” in all the arguments which follow. Thus any exponential term involving $\delta_4$ shall dominate a corresponding term involving $\delta_3$, and so forth down to $\delta_0$, which is dominated by everything.
The exponents $\delta_i$ are only of technical importance and the reader should not take them too seriously. Broadly speaking, we shall use the smallest constant $\delta_0$ to control the flexibility of frequency envelopes, the next smallest constant $\delta_1$ to measure the exponential gains in our final iteration spaces $S_k$, $N_k$, and the largest constant $\delta_4$ to measure rather large exponential gains coming from the basic linear and bilinear estimates. The intermediate exponents $\delta_2$, $\delta_3$ are only used for the delicate trilinear estimate (31), and arise because the proof of (31) is essentially an interpolation between several different types of arguments.
Let $j, k$ be integers and $i = 0, 1, 2, 3, 4$. We use $\chi^{(i)}_{j \le k}$ or $\chi^{(i)}_{k \ge j}$ to denote a quantity of the form $\min(1, 2^{-\delta(j-k)})$, where $\delta > C^{-1}\delta_i^2$ for some absolute constant $C > 0$ depending only on $n$. We also use $\chi^{(i)}_{j = k}$ to denote a quantity of the form $2^{-\delta|j-k|}$ with the same assumptions on $\delta$. Thus $\chi^{(i)}_{j \le k}$ is small unless $j \le k + O(1)$, and $\chi^{(i)}_{j = k}$ is small unless $j = k + O(1)$. We suggest the reader ignore the $i$ index and think of the $\chi^{(i)}$ as characteristic functions, e.g. $\chi^{(i)}_{j \le k}$ is morally a cutoff to the region $j \le k$.
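The exponent hierarchy and the weights $\chi^{(i)}$ are easy to experiment with. The sketch below is a toy illustration, not part of the argument: the function names are hypothetical, $1/100n$ is read as $1/(100n)$, and the decay rate passed to the weights is an arbitrary sample value rather than one satisfying the constraints in the text.

```python
from fractions import Fraction


def delta_hierarchy(n):
    """Return [delta_0, ..., delta_4] with delta_4 = 1/(100 n), delta_i = delta_{i+1}^10."""
    deltas = [None] * 5
    deltas[4] = Fraction(1, 100 * n)   # assumption: "1/100n" means 1/(100 n)
    for i in (3, 2, 1, 0):
        deltas[i] = deltas[i + 1] ** 10
    return deltas


def chi_le(j, k, delta):
    """A weight of the form min(1, 2^{-delta (j - k)}): close to a cutoff to j <= k."""
    return min(1.0, 2.0 ** (-delta * (j - k)))


def chi_eq(j, k, delta):
    """A weight of the form 2^{-delta |j - k|}: small unless j = k + O(1)."""
    return 2.0 ** (-delta * abs(j - k))


if __name__ == "__main__":
    deltas = delta_hierarchy(2)
    # delta_0 is astronomically small, so only its number of digits is printed.
    print("digits in 1/delta_0:", len(str(deltas[0].denominator)))
    print(chi_le(0, 5, 0.5), chi_le(2, 0, 0.5), chi_eq(3, 5, 0.5))
```

For $n = 2$ this gives $\delta_0 = 200^{-10^4}$, which makes concrete how widely the scales are separated.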
Similarly, we use $\chi^{-(i)}_{j \le k} = \chi^{-(i)}_{k \ge j}$ to denote a quantity of the form $\max(1, 2^{\delta(j-k)})$, where $\delta < C\delta_i^{1/2}$ for some absolute constant $C > 0$ depending only on $n$, and also use $\chi^{-(i)}_{j = k}$ to denote a quantity of the form $2^{\delta|j-k|}$ with the same assumptions on $\delta$. The $\chi^{(i)}$ thus represent various exponential gains in our estimates, while $\chi^{-(i)}$ represent various exponential losses. Note that a $\chi^{(i)}$ gain will dominate a corresponding $\chi^{-(j)}$ loss whenever $i > j$.
As usual, we use $A \lesssim B$ or $A = O(B)$ to denote the estimate $A \le CB$, where $C$ is some quantity depending only on $n$, $d$, and the $\delta_i$. All sums will be over the integers $\mathbb{Z}$ unless otherwise specified.
We fix $0 < \varepsilon \ll 1$ to be a small constant depending only on $n$, $d$, and the $\delta_i$ ($\varepsilon := \delta_0^{nd}$ will suffice⁵). We shall implicitly insert the disclaimer “assuming $\varepsilon$ is sufficiently small depending on $n$, $d$ and the $\delta_i$” in all the arguments which follow. Eventually we shall assume that the initial data has a $\dot H^{n/2} \times \dot H^{n/2-1}$ norm of $\varepsilon^2$.
We shall parameterize spacetime $\mathbb{R}^{1+n}$ in the standard Euclidean frame $\{(t, x) : t \in \mathbb{R}, x \in \mathbb{R}^n\}$ with the Euclidean inner product $(t, x) \cdot (t', x') := tt' + x \cdot x'$; we will not use the Minkowski metric $\eta$ much (except in Case 4(e) of Sect. 18). In the proof of our estimates in the second half of the paper we shall also introduce null frames for $\mathbb{R}^{1+n}$, but we shall not need them for quite a while.
⁵ Of course, by our above construction this value of $\varepsilon$ is absurdly small, as we have wildly exaggerated the separation of scales in exponents that we shall actually need. One can improve the value of $\varepsilon$ substantially, but we shall not attempt to do so here.
We fix $T > 0$ to be a given time. It will be important that our implicit constants do not depend on $T$. For the first half of this paper, which is concerned with the iteration scheme and the renormalization, our functions shall only be supported on the slab $[-T, T] \times \mathbb{R}^n$, but in the second half, which is concerned with the function spaces and the estimates, we shall mainly work on all of $\mathbb{R}^{1+n}$ (as one then gains access to the spacetime Fourier transform) and then apply standard restriction arguments to return to $[-T, T] \times \mathbb{R}^n$.
We define the Lebesgue spaces $L^q_t L^r_x$ by the norm
\[ \|\phi\|_{L^q_t L^r_x} := \left( \int \left( \int |\phi(t, x)|^r \, dx \right)^{q/r} dt \right)^{1/q} \]
with the usual modifications when $r = \infty$ or $q = \infty$.
If $\phi(t, x)$ is a function on $[-T, T] \times \mathbb{R}^n$ or $\mathbb{R}^{1+n}$, we define the spatial Fourier transform $\hat\phi(t, \xi)$ by
\[ \hat\phi(t, \xi) := \int_{\mathbb{R}^n} e^{-2\pi i x \cdot \xi} \phi(t, x) \, dx \]
with the inverse transform given by
\[ \check F(t, x) := \int_{\mathbb{R}^n} e^{2\pi i x \cdot \xi} F(t, \xi) \, d\xi. \]
The spatial Fourier transform is distinct from the spacetime Fourier transform $\mathcal{F}\phi(\tau, \xi)$, which we shall need in the second half of the paper. We define the spatial Fourier support or $\xi$-Fourier support of $\phi$ to be the set $\{\xi : \hat\phi(t, \xi) \ne 0 \text{ for some } t\}$.
We shall write $D_0$ for $|\xi|$, so that $D_0$ measures the strength of the operator $\nabla_x$. Thus, for instance, the set $\{D_0 \sim 2^k\}$ denotes the frequency region $\{\xi : |\xi| \sim 2^k\}$.
We now set up some Littlewood–Paley operators, which shall play a central role in our arguments. Fix $m_0(\xi)$ to be a non-negative radial bump function supported on $D_0 \le 2$ which equals 1 on the ball $D_0 \le 1$. For each integer $k$, define the operators P≤k = P 0 we have
$$\|\phi\|_{S(Cc)} \sim \|\phi\|_{S(c)} \qquad (23)$$
with the implicit constants depending at most polynomially on $C$.
– ($S(c)$ is built up from $S_k$) Let $\phi$ be a smooth function on $[-T, T] \times \mathbf{R}^n$ which is rapidly decreasing in the spatial variable. Suppose we have a decomposition $\phi = \sum_k \phi^{(k)}$, where each $\phi^{(k)}$ is in $S_k$. Then we have
$$\|\phi\|_{S(c)} \lesssim \|\phi\|_{L^\infty_t L^\infty_x} + \sup_k c_k^{-1} \|\phi^{(k)}\|_{S_k}. \qquad (24)$$
11 In other words, the spaces $S(c)$, $S_k$ are dimensionless, while $N_k$ has the units length$^{-2}$.
12 We shall consistently use $\phi$, $\psi$ to denote generic functions in the $S$ family of spaces, and $F$ to denote generic functions in the $N$ family of spaces.
– ($N_k$ contains $L^1_t \dot H^{n/2-1}_x$) Let $F$ be an $L^1_t L^2_x$ function on $[-T, T] \times \mathbf{R}^n$ which has Fourier support on the region $D_0 \sim 2^k$ for some integer $k$. Then $F$ is in $N_k$ and
$$\|F\|_{N_k} \lesssim \|F\|_{L^1_t \dot H^{n/2-1}_x} \sim 2^{(\frac{n}{2}-1)k} \|F\|_{L^1_t L^2_x}. \qquad (25)$$
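– (Adjacent $N_k$ are equivalent) We have the compatibility property
$$\|F\|_{N_{k_1}} \sim \|F\|_{N_{k_2}} \qquad (26)$$
whenever $F \in N_{k_2}$ and $k_1 = k_2 + O(1)$.
– (Energy estimate) For any Schwartz function $\phi$ on $[-T, T] \times \mathbf{R}^n$ with Fourier support in $D_0 \sim 2^k$, we have
$$\|\phi\|_{S_k} \lesssim \|\Box\phi\|_{N_k} + \|\phi[0]\|_{\dot H^{n/2} \times \dot H^{n/2-1}} \sim \|\Box\phi\|_{N_k} + 2^{\frac{n}{2}k} \|\phi(0)\|_{L^2} + 2^{(\frac{n}{2}-1)k} \|\partial_t \phi(0)\|_{L^2}. \qquad (27)$$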
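– (Product estimates) We have
$$\|P_k L(\phi, F)\|_{N_k} \lesssim \chi^{(1)}_{k \ge k_2} \|\phi\|_{S(c)} \|F\|_{N_{k_2}} \qquad (28)$$
for all $\phi \in S(c)$ and $F \in N_{k_2}$. We also have the variant
$$\|P_k L(\phi, F)\|_{N_k} \lesssim \chi^{(1)}_{k \ge k_2} \|\phi\|_{S_{k_1}} \|F\|_{N_{k_2}} \qquad (29)$$
whenever $\phi \in S_{k_1}$ and $F \in N_{k_2}$.
– (Null form estimates) We have
$$\|P_k L(\phi_{,\alpha}, \psi^{,\alpha})\|_{N_k} \lesssim \chi^{(1)}_{k = \max(k_1, k_2)} \|\phi\|_{S_{k_1}} \|\psi\|_{S_{k_2}} \qquad (30)$$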
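for all $\phi \in S_{k_1}$, $\psi \in S_{k_2}$.
– (Trilinear estimate) We have
$$\|P_k L(\phi^{(1)}, \phi^{(2)}_{,\alpha}, \phi^{(3),\alpha})\|_{N_k} \lesssim \chi^{(1)}_{k = \max(k_1, k_2, k_3)} \, \chi^{(1)}_{k_1 \le \min(k_2, k_3)} \prod_{i=1}^{3} \|\phi^{(i)}\|_{S_{k_i}} \qquad (31)$$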
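whenever $\phi^{(i)} \in S_{k_i}$ for $i = 1, 2, 3$.
– (Epilogue) For any $\phi \in S_k$ with Fourier support in $D_0 \lesssim 2^k$ we have
$$\sup_t \|\phi[t]\|_{\dot H^{n/2}_x \times \dot H^{n/2-1}_x} \lesssim 2^{nk/2} \sup_t \|\phi[t]\|_{L^2_x \times L^2_x} \lesssim \|\phi\|_{S_k}. \qquad (32)$$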
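The equivalence on the right-hand side of (25) uses only the frequency localization: on the region $D_0 \sim 2^k$ the multiplier $|\xi|^s$ is comparable to $2^{sk}$, so a homogeneous Sobolev norm of order $s$ is comparable to $2^{sk}$ times the $L^2$ norm. The following Python sketch checks this on the Fourier side. It is an illustration only, under assumptions that are not in the paper: the region $D_0 \sim 2^k$ is modelled by the band $2^{k-1} \le |\xi| \le 2^{k+1}$, the Fourier coefficients are random, and the name `band_limited_ratio` is hypothetical.

```python
import math
import random

def band_limited_ratio(k, s, n_modes=500, seed=0):
    """Return ||F||_{H^s} / (2^{sk} ||F||_{L^2}) computed from Fourier coefficients.

    Assumption (illustrative): the coefficients live at frequencies xi with
    2^(k-1) <= |xi| <= 2^(k+1), standing in for the region D_0 ~ 2^k.
    """
    rng = random.Random(seed)
    lo, hi = 2.0 ** (k - 1), 2.0 ** (k + 1)
    freqs = [rng.uniform(lo, hi) for _ in range(n_modes)]
    coeffs = [rng.gauss(0.0, 1.0) for _ in range(n_modes)]
    # By Plancherel both norms are weighted l^2 sums of the coefficients.
    l2_norm = math.sqrt(sum(c * c for c in coeffs))
    hs_norm = math.sqrt(sum((xi ** s * c) ** 2 for xi, c in zip(freqs, coeffs)))
    return hs_norm / (2.0 ** (s * k) * l2_norm)
```

For $s \ge 0$ the returned ratio always lies between $2^{-s}$ and $2^{s}$, uniformly in $k$, which is the content of the symbol $\sim$ in (25) with $s = n/2 - 1$.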
We now discuss each of the above properties in turn.
– The estimate (17) is a technical fact needed in order to make the continuity argument work, and is proven in Sect. 12, mainly using (27) and (25). Since we are assuming $\phi$ to be smooth and constant outside of a compact set, one would certainly expect the function $a$ to actually be continuous rather than just quasi-continuous, but we do not know how to prove this and in any event it is not needed for our argument$^{13}$. In the high dimensional case this estimate was trivial as the spaces $S_k$ were just Lebesgue spaces, but more care is required here because $S_k$ will be defined by restriction from $\mathbf{R}^{1+n}$ and have a spacetime Fourier component in their norms. We remark that the quantity $a(T)$ is necessarily finite for classical wave maps $\phi$, thanks to (27) and (13).
13 Note added in proof: Daniel Tataru has observed that continuity can be obtained by using the fact that the scaling map $\lambda \mapsto \phi(\lambda t, \lambda x)$ is continuous in the $S_k$ topology for sufficiently nice $\phi$.
T. Tao
– The invariance properties are unsurprising given the translation and scaling symmetries of the equation, and will be automatic from our construction of $S(c)$, $S_k$, $N_k$ in Sect. 10. As a corollary of translation invariance we observe that the Littlewood–Paley operators $P_k$, $P_{\le k}$, etc. are bounded on the spaces $S(c)$, $S_k$, $N_k$.
– The algebra property (18) is essential for us to invert the gauge transformation, and will be proven in Sect. 16. The spaces described in [42] obey this algebra property if $c \in l^1$, but when $c \in l^2$ there is a logarithmic divergence in the estimates. Fortunately, this divergence can be rectified (with some non-trivial effort) by adding $L^\infty_t L^\infty_x$ control to $S(c)$. This is analogous to the well-known fact that $\dot H^{n/2}_x$ is not closed under multiplication, but $\dot H^{n/2}_x \cap L^\infty_x$ is. The estimate (19) thus will be an automatic consequence of our construction of $S(c)$ in Sect. 16. We shall be able to obtain (18) to some extent from (28) via a convenient duality argument.
– The estimates (21), (20) are minor variants of (18); indeed, all three estimates shall be treated in a unified manner in Sect. 16. The $L^\infty$ control (22) is unsurprising given (20), and shall be easy to prove.
– The insensitivity property (23) will be immediate from the construction of $S(c)$. This property is required because it will turn out for induction purposes that it is more convenient to initially measure $U$ in $S(Cc)$ instead of $S(c)$.
– The estimate (24) shall turn out to be automatic, because we shall essentially use (24) to define the space $S(c)$ in Sect. 10.
– The estimate (25) plays only a minor role in the main argument, ensuring that $N_k$ does indeed contain test functions and certain error terms. Note that the space $L^1_t \dot H^{n/2-1}_x$ is the classical space which one would use to hold the non-linearity, if one attempted to apply the energy method (although this method of course fails at the critical regularity).
This estimate shall be an automatic consequence of our construction of $N_k$ in Sect. 10.
– The compatibility property (26) allows us to ignore certain technical “frequency leakage” problems arising from the fact that $\phi\psi$ does not quite have the same frequency as $\phi$, even when $\phi$ has much higher frequency than $\psi$. It will be an automatic consequence of the construction of $N_k$ in Sect. 10. A similar property for $S_k$ holds but will not be needed in our argument. From (26) and Littlewood–Paley decomposition we observe the estimate
$$\|F\|_{N_k} \lesssim \sum_{k' = k + O(1)} \|P_{k'} F\|_{N_{k'}} \qquad (33)$$
whenever $F$ is supported on the region $D_0 \sim 2^k$.
– The energy estimate (27) is a bit lengthy, and is proven in Sect. 11. One could try to make (27) the definition of $S_k$, as is done in some other papers, but this makes the product estimate (28) difficult to prove.
– The estimates (28), (29) shall be proven in Sect. 15. The factor $\chi^{(1)}_{k \ge k_2}$ is an indication that the high-high interactions in this problem are quite weak. (A similar gain is implicit in [42]).
– We shall prove (30) in Sect. 17. The proof basically uses the estimates (29), (18) described above, combined with the identity (4). In practice we shall only apply (30) in the high-high interaction case (since we then obtain an exponential gain from $\chi^{(1)}_{k = \max(k_2, k_3)}$), or if a derivative has been transferred from the high-frequency term to the low-frequency term$^{14}$. From (33) we observe that the $P_k$ projection in the above lemma can be removed if the expression inside the $P_k$ already has frequency $\sim 2^k$.
– The trilinear estimate (31) is the most difficult estimate in this theorem to prove, and is handled in Sect. 18. The factor $\chi^{(1)}_{k = \max(k_1, k_2, k_3)}$ again reflects the fact that high-high interactions are weak. The difficulty lies primarily in obtaining the small but crucial factor$^{15}$ of $\chi^{(1)}_{k_1 \le \min(k_2, k_3)}$. Without this factor, (31) essentially follows from (29) and (30). The presence of this factor allows us to handle any non-linearity of cubic or higher degree in which at least one derivative falls on a low frequency term. In order to obtain this key exponential gain we have to go beyond the arguments in [42] and apply some other tools, notably some multiplier calculus to shift null forms from one function to another, and the use of Bernstein’s inequality when the null forms are too degenerate for the multiplier calculus to be effective. As with (30), we remark that the $P_k$ can be removed if the expression inside the $P_k$ already has frequency $\sim 2^k$.
– The estimate (32) is basically a dual to (25), and shall be automatic from our construction of $S_k$ in Sect. 16. When $n = 2$ it might be possible to use energy conservation to circumvent the need for this estimate, however this does not seem to achieve any substantial simplification in this paper.

To close this section we informally discuss how the bilinear and trilinear estimates (28), (30), (31) are to be used. They cannot quite treat the original non-linearity $L(\phi, \phi_{,\alpha}, \phi^{,\alpha})$ in (1), especially when the derivatives $\partial_\alpha$, $\partial^\alpha$ fall on high-frequency components of $\phi$. However these estimates can treat these types of expressions when the derivatives are in more favorable locations. Examples of such “good” non-linearities include
– (Derivative falls on a low frequency) An expression $L(\phi_{k_1}, \phi_{k_2,\alpha}, \phi_{k_3}^{,\alpha})$ with $k_1 \ge \min(k_2, k_3) + O(1)$. For these expressions we use (31).
– (High-high interactions) An expression $L(\phi_{k_1}, \phi_{k_2,\alpha}, \phi_{k_3}^{,\alpha})$ with $k_2 = k_3 + O(1)$. For these expressions we use (30) and (28).
– (Derivative shifted from high-frequency to low, Type I) An expression of the form $L(\nabla_x \phi_{k_1}, \phi_{k_2,\alpha}, \partial^\alpha \nabla_x^{-1} \phi_{k_3})$ with $k_1 \le k_3 + O(1)$. This generalizes the high-high interaction non-linearity, and arises from commutator expressions via Lemma 2. This non-linearity is treated by (31).
– (Derivative shifted from high-frequency to low, Type II) An expression of the form $L(\phi_{k_1}, \nabla_x \phi_{k_2,\alpha}, \nabla_x^{-1} \phi_{k_3}^{,\alpha})$ with $k_2 \le k_3 + O(1)$. This type of non-linearity also arises from commutator expressions via Lemma 2, and is estimated by (30) and (28).
14 Because of this, it is possible to lose a factor of up to (but not including) $2^{|k_1 - k_2|}$ in (30) without affecting the argument. This is for instance the case in the $n \ge 5$ theory in [40], where the high frequency term is estimated using the endpoint Strichartz space $L^2_t L^{2(n-1)/(n-3)}_x$ and the low frequency term in the companion space $L^2_t L^{n-1}_x$, thus losing a factor of $2^{|k_1 - k_2|(n+1)/2(n-1)}$.
15 For $n \ge 4$ this estimate can be obtained by estimating the two low frequencies in $L^2_t L^\infty_x$ and the high frequency in $L^1_t L^2_x$, and by moving these exponents around by an epsilon one can also cover the $n = 3$ case by Strichartz estimates. However in the $n = 2$ case the Strichartz estimates are far too weak to prove this estimate, and we shall need to work much harder.
– (Repeated derivatives avoiding the highest frequency) An expression of the form L(✷φk1 , φk2 , . . . , φks ) with k1 ≤ max(k2 , . . . , ks ) + O(1). These types of expressions will arise once we apply the gauge transformation U . In principle, one can use Eq. (1) to break this expression up into combinations of the previous types of good non-linearity, although the computations become somewhat tedious in practice. Note that in all cases we have retained the null structure of the non-linearity. In the low dimensional cases n = 2, 3, 4 this is vital to the above non-linearities being good. In all of the above cases we obtain various exponential gains which will allow us to sum in the ki indices. As a first approximation, one should treat these good non-linearities as being negligible errors. The objective is then to gauge transform (Littlewood–Paley localized versions of) (1), exploiting such geometric identities as φ † φ,α = 0 as well as Lemma 2, until all the non-linearities are negligible. In this strategy the Littlewood–Paley decomposition seems to play an indispensable role, as this decomposition allows us to easily separate the core component of the non-linearity (which for wave maps is a connection term where the connection Aα;≤k has small curvature) from the remaining error terms which are good non-linearities and therefore negligible.
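The remark that the exponential gains allow one to sum in the $k_i$ indices rests on an elementary fact: a gain of the form $2^{-\delta|j-k|}$ sums over all integers $j$ to $(1 + 2^{-\delta})/(1 - 2^{-\delta})$, a constant that does not depend on $k$. A minimal Python sketch of this follows; the value of $\delta$, the truncation of the sum and the function names are illustrative assumptions.

```python
def gain_sum(delta, k=0, cutoff=4000):
    """Truncated sum over j of 2^(-delta*|j-k|), restricted to |j-k| <= cutoff."""
    return sum(2.0 ** (-delta * abs(j - k))
               for j in range(k - cutoff, k + cutoff + 1))

def gain_sum_exact(delta):
    """Closed form of the full sum over all integers j: (1 + r)/(1 - r), r = 2^(-delta)."""
    r = 2.0 ** (-delta)
    return (1.0 + r) / (1.0 - r)
```

For small $\delta$ the closed form behaves like $2/(\delta \ln 2)$, so the price of summing a gain is a constant of size $O(1/\delta)$; this is why only the presence of a gain matters and not its precise exponent.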
4. The Main Proposition

Let $S(c)$, $S_k$, $N_k$ be as in Theorem 3. We now adapt the argument from [40]. In the next section we shall prove the following “bootstrap” property of the $S_k$ norms:

Proposition 1 (Main Proposition). Let $c$ be a frequency envelope, $0 < T < \infty$, and let $\phi$ be a classical wave map on $[-T, T] \times \mathbf{R}^n$, extended to $\mathbf{R}^{1+n}$ by the free wave equation, such that $\phi[0]$ lies underneath $\varepsilon c$, and that
$$\|\phi_k\|_{S_k} \lesssim c_k \qquad (34)$$
for all $k$. Then we have
$$\|\phi_k\|_{S_k} \le c_k \qquad (35)$$
for all k. We now give the continuity argument which deduces Theorem 2 from this proposition. Let T0 , c, φ be as in Theorem 2, and let a(T ) be the quantity in (16). From (27) and the hypothesis that φ lies underneath εc we see that a(0) = 1. From Proposition 1 we see that if 0 < T ≤ T0 obeys a(T ) 1, then we can automatically bootstrap this bound (using the monotonicity of a) to a(T ) = 1. From this and (17) we see that the set {T ∈ [0, T0 ] : a(T ) = 1} is both open and closed in [0, T ]. Since this set contains the origin, we thus have a(T0 ) = 1. From this and (32) we thus see that φ[t] lies underneath Cc for all 0 ≤ t ≤ T0 , as desired. It only remains to prove Proposition 1.
5. Renormalized Iteration: The Proof of Proposition 1

We shall divide this proof into several steps$^{16}$.

Step 0. Scaling. Fix $c$, $T$, $\phi$, and suppose that the hypotheses of Proposition 1 hold. In this section all our functions and equations shall be on the slab $[-T, T] \times \mathbf{R}^n$. Since $\phi$ is on the sphere, it is bounded in $L^\infty_t L^\infty_x$. From this and (24) we have the bound
$$\|\phi\|_{S(c)} \lesssim 1. \qquad (36)$$
Of course, the same bound then holds for all Littlewood–Paley projections of $\phi$, such as $\phi_k$, $P_{\le k}\phi$, $P_{\ge k}\phi$, etc. We need to show (35). By scale-invariance (scaling $T$, $c$, and $\phi$ appropriately) it suffices to show that
$$\|\phi_0\|_{S_0} \le c_0. \qquad (37)$$
By applying $P_0$ to (1) we obtain the evolution equation for $\phi_0$:
$$\Box \phi_0 = -P_0(\phi \, \phi^\dagger_{,\alpha} \phi^{,\alpha}). \qquad (38)$$
Step 1. Linearize the $\phi_0$ evolution.
Definition 2. For each integer k, define the connection A≤k;α by the formula17 † † A≤k;α := A C. Consider the contribution of $Q_{>j-C}\psi$. By Plancherel we may discard $P_0 Q_j$ and estimate the previous by
$$(1 + 2^j) \, \|\phi\|_{L^\infty_t L^\infty_x} \, \|Q_{>j-C}\psi\|_{L^2_t L^2_x}$$
which is acceptable by (19), (83). The contribution of $Q_{\le j-C}\phi$ and $Q_{\le j-C}\psi$ vanish, so we only need consider the contribution of $Q_{>j-C}\phi$ and $Q_{\le j-C}\psi$. By Plancherel we may discard $P_0 Q_j$ and estimate this contribution by
$$(1 + 2^j) \, \|Q_{>j-C}\phi\|_{L^2_t L^2_x} \, \|Q_{\le j-C}\psi\|_{L^\infty_t L^\infty_x}.$$
But this is acceptable by (83), (82), (88) and dyadic decomposition. Thus we may assume that $j < O(1)$. From Lemma 13, (88) and dyadic decomposition we see that
$$\|P_0 Q_j(\phi_{\ge j-C}\psi)\|_{L^2_t L^2_x} \lesssim 2^{-j/2} \|\phi\|_{S(1)(\mathbf{R}^{1+n})} \|\psi\|_{S[k_2]}.$$
Thus we need only show that
$$\|P_0 Q_j(\phi_{<j-C}\psi)\|_{L^2_t L^2_x} \lesssim 2^{-j/2} \|\phi\|_{S(1)(\mathbf{R}^{1+n})} \|\psi\|_{S[k_2]}.$$
Consider first the contribution of $Q_{\ge j-C/2}\phi_{<j-C}$. By (84) the $L^2_t L^\infty_x$ norm of this is $O(2^{-j/2} \|\phi\|_{S(1)(\mathbf{R}^{1+n})})$, while the $L^\infty_t L^2_x$ norm of $\psi$ is $O(\|\psi\|_{S[k_2]})$ by (81). The claim follows by discarding $P_0 Q_j$ and using Hölder. It remains to control $Q_{<j-C/2}\phi_{<j-C}$. For this contribution we may replace ψ by Qj −10−2l−C . By Lemma 7 we can bound this contribution by + j/2 P0 Q+ (φ>−2l−C ψ) n/2,1/2,1 2 Q (φ ψ) P 0 >−2l−C ˙ 0 and $x, x' \in \mathbf{R}$. The related function
$$\tilde G_k(x - x') := \mathrm{sgn}(x - x') \, \frac{i}{2k} \, e^{ik|x - x'|} \qquad (2.2)$$
allows us to express the resolvent for the $\delta'$-perturbation of $H_0$ centered at the point $y$ and having the “strength” $\beta$, denoted by $\Xi_{\beta,y}$, in the form [AGHH, Sect. I.4]:
$$(\Xi_{\beta,y} - k^2)^{-1}(x, x') = G_k(x - x') - \frac{2\beta k^2}{2 - i\beta k} \, \tilde G_k(x - y) \, \tilde G_k(x' - y). \qquad (2.3)$$
Recall that $\Xi_{\beta,y}$ acts as $H_0$ away from $y$ and its domain consists of those $\psi \in W^{2,2}(\mathbf{R} \setminus \{y\})$ which satisfy the boundary conditions
$$\psi'(y+) = \psi'(y-) =: \psi'(y), \qquad \psi(y+) - \psi(y-) = \beta \psi'(y). \qquad (2.4)$$
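The boundary conditions (2.4) can be checked numerically on the resolvent kernel (2.3): as a function of $x$ with $x'$ fixed, the kernel should have a continuous derivative at $y$ and a jump equal to $\beta$ times that derivative. The Python sketch below does this at $k = i\kappa$ with finite differences. It assumes the standard free kernel $G_{i\kappa}(z) = e^{-\kappa|z|}/(2\kappa)$, whose definition falls outside this excerpt; the function names, step size and parameter values are illustrative.

```python
import math

def delta_prime_kernel(x, xp, beta, kappa, y=0.0):
    """Kernel (2.3) at k = i*kappa: free kernel plus the rank-one delta' term."""
    def free(z):
        # Assumed free resolvent kernel G_{i kappa}(z) = exp(-kappa|z|)/(2 kappa).
        return math.exp(-kappa * abs(z)) / (2.0 * kappa)

    def tilde(z):
        # (2.2) at k = i*kappa: sgn(z) * exp(-kappa|z|)/(2 kappa).
        return math.copysign(1.0, z) * free(z)

    # -2 beta k^2/(2 - i beta k) evaluated at k = i*kappa.
    coeff = 2.0 * beta * kappa ** 2 / (2.0 + beta * kappa)
    return free(x - xp) + coeff * tilde(x - y) * tilde(xp - y)

def boundary_data(beta, kappa, xp, y=0.0, h=1e-6):
    """Finite-difference approximations of the jump and the one-sided derivatives at y."""
    psi = lambda x: delta_prime_kernel(x, xp, beta, kappa, y)
    jump = psi(y + h) - psi(y - h)
    d_right = (psi(y + 2 * h) - psi(y + h)) / h
    d_left = (psi(y - h) - psi(y - 2 * h)) / h
    return jump, d_left, d_right
```

With, say, `boundary_data(beta=0.7, kappa=1.3, xp=0.5)`, the two one-sided derivatives agree and the jump matches $\beta$ times the derivative, both up to the $O(h)$ error of the finite differences.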
Our first aim is to approximate the resolvent (2.3) of $\Xi_{\beta,y}$ by a family of those corresponding to the triple $\delta$-perturbation of $H_0$ with the couplings $A_a = \{\alpha_j\}_{j=-1,0,1} = \{2\beta^{-1} - a^{-1}, \beta a^{-2}, 2\beta^{-1} - a^{-1}\}$ localized at $Y_a = \{y_j\}_{j=-1,0,1} = \{y - a, y, y + a\}$ for $a \ge 0$ letting $a \to 0$. Denote this perturbed operator by $-\Delta_{A_a,Y_a}$. Then by [AGHH, Sect. II.2] the corresponding resolvent has the kernel
$$(-\Delta_{A_a,Y_a} - k^2)^{-1}(x, x') = G_k(x - x') - \sum_{j,j'=-1,0,1} [\Gamma_a(k)]^{-1}_{jj'} \, G_k(x - y_j) \, G_k(x' - y_{j'}), \qquad (2.5)$$
P. Exner, H. Neidhardt, V. A. Zagrebnov
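where
$$[\Gamma_a(k)]_{jj'} := \alpha_j^{-1} \delta_{jj'} + G_k(y_j - y_{j'})$$
and $j, j' = -1, 0, 1$. In particular, for a purely imaginary $k = i\kappa$, $\kappa > 0$, we get
$$\Gamma_a(i\kappa) = \frac{1}{2\kappa} \begin{pmatrix} 1+u & w & w^2 \\ w & 1+v & w \\ w^2 & w & 1+u \end{pmatrix}, \qquad (2.6)$$
where
$$u := \frac{2\beta\kappa a}{2a - \beta}, \qquad v := \frac{2\kappa a^2}{\beta}, \qquad w := e^{-\kappa a}. \qquad (2.7)$$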
Let us look at how the spectrum of the operators $\{-\Delta_{A_a,Y_a}\}_{a \ge 0}$ behaves as $a \to 0$ for a fixed $\beta$. Since the perturbation in (2.5) is a rank three operator, $\sigma_{ess}(H_0) = \sigma_{ac}(H_0) = [0, \infty)$ is not affected by the perturbation and the point spectrum consists of at most three negative eigenvalues, with the multiplicity taken into account [We, Sect. 8.3]. Here we have:

Proposition 2.1. For small enough spacing $a$ the operator $-\Delta_{A_a,Y_a}$ has at most one eigenvalue. This happens if and only if $\beta < 0$, and in that case
$$\inf \sigma(-\Delta_{A_a,Y_a}) = -\frac{4}{\beta^2} + O(a). \qquad (2.8)$$
Proof. Since the negative part of $\sigma(-\Delta_{A_a,Y_a})$ is the point spectrum determined by zeros of $\det \Gamma_a(i\kappa)$ by [AGHH, Sect. II.2] we arrive at the equation
$$(1 + u - w^2) \big[ (1+u)(1+v) - w^2(1-v) \big] = 0, \qquad (2.9)$$
or
$$e^{-2\kappa a} = 1 + \frac{2\beta\kappa a}{2a - \beta} \qquad (2.10)$$
and
$$e^{-2\kappa a} = \Big( 1 + \frac{2\beta\kappa a}{2a - \beta} \Big) \frac{1 + 2\kappa a^2 \beta^{-1}}{1 - 2\kappa a^2 \beta^{-1}}. \qquad (2.11)$$
Expanding the left- and right-hand sides of the last two equations around $a = 0$, one finds that only (2.10) has a solution for a sufficiently small $a > 0$ and that it equals
$$\kappa(a) = -\frac{2}{\beta} + O(a).$$
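The expansion can be checked numerically. The Python sketch below solves equation (2.10), read as $e^{-2\kappa a} = 1 + 2\beta\kappa a/(2a - \beta)$, by bisection for $\beta < 0$ and small $a$; the root should approach $-2/\beta$ as $a \to 0$. The bracket $[-1/\beta, -4/\beta]$, the iteration count and the function names are illustrative assumptions, not part of the paper.

```python
import math

def f_210(kappa, a, beta):
    """Left side minus right side of (2.10); expm1 avoids cancellation for small a."""
    return math.expm1(-2.0 * kappa * a) - 2.0 * beta * kappa * a / (2.0 * a - beta)

def kappa_root(a, beta, iters=200):
    """Bisection for the nontrivial root kappa(a) > 0 of (2.10), assuming beta < 0."""
    k0 = -2.0 / beta                 # predicted limit of the root as a -> 0
    lo, hi = 0.5 * k0, 2.0 * k0      # assumed bracket around the predicted limit
    flo, fhi = f_210(lo, a, beta), f_210(hi, a, beta)
    if flo * fhi > 0:
        raise ValueError("bracket does not enclose a sign change")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = f_210(mid, a, beta)
        if flo * fmid <= 0:
            hi, fhi = mid, fmid
        else:
            lo, flo = mid, fmid
    return 0.5 * (lo + hi)
```

For $\beta = -1$ the computed roots at $a = 10^{-2}$ and $a = 10^{-3}$ lie close to 2, with a deviation that shrinks roughly linearly in $a$, consistent with $\kappa(a) = -2/\beta + O(a)$ and hence with (2.8).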
Since $k = i\kappa$ corresponds to an isolated eigenvalue if and only if $\mathrm{Im}\, k > 0$, the assertion follows readily.

Proposition 2.1 also shows that if $\kappa > -2/\beta$, $\beta \neq 0$, is fixed, then there is $a_0(\kappa) > 0$ such that $-\Delta_{A_a,Y_a} + \kappa^2 > 0$ and the resolvent $(-\Delta_{A_a,Y_a} + \kappa^2)^{-1}$ exists for all $a \in (0, a_0(\kappa))$. We further note that the operator $-\Delta_{A_a,Y_a}$ admits a definition in the sense of quadratic forms. Denoting this quadratic form by $Q_{A_a,Y_a}[\cdot, \cdot]$ one has
$$Q_{A_a,Y_a}[u, v] = (u', v') + \frac{\beta}{a^2} \, u(y) \overline{v(y)} + \Big( \frac{2}{\beta} - \frac{1}{a} \Big) \Big[ u(y+a) \overline{v(y+a)} + u(y-a) \overline{v(y-a)} \Big] \qquad (2.12)$$
Potential Approximations to δ
for $u, v \in \mathrm{dom}(Q_{A_a,Y_a}) = W^{1,2}(\mathbf{R})$. When equipped with the scalar product
$$(u, v)_{Q_{A_a,Y_a}} := \Big( \sqrt{-\Delta_{A_a,Y_a} + \kappa^2} \, u, \sqrt{-\Delta_{A_a,Y_a} + \kappa^2} \, v \Big), \qquad (2.13)$$
where $\kappa > -2/\beta$ and $a \in (0, a_0(\kappa))$, the domain $\mathrm{dom}(Q_{A_a,Y_a})$ becomes a Hilbert space. It is important to note that the norm $\|\cdot\|_{Q_{A_a,Y_a}}$ arising from this scalar product is equivalent to the norm of the Hilbert space $W^{1,2}(\mathbf{R})$.

Proposition 2.1 shows that up to an $O(a)$ error the spectral properties of $-\Delta_{A_a,Y_a}$ coincide with those of $\Xi_{\beta,y}$. Next we compare the corresponding resolvents.

Theorem 2.2. Let $\kappa \neq -2/\beta$ and $\beta \neq 0$ be fixed. Then the relation
$$\lim_{a \to 0+} (-\Delta_{A_a,Y_a} + \kappa^2)^{-1}(x, x') = (\Xi_{\beta,y} + \kappa^2)^{-1}(x, x') \qquad (2.14)$$
holds for any $x, x' \in \mathbf{R}$. Consequently, $-\Delta_{A_a,Y_a} \to \Xi_{\beta,y}$ as $a \to 0+$ in the norm-resolvent sense.

Proof. By virtue of (2.3), to check (2.14) it is sufficient to compute the pointwise limit of the second term at the right-hand side of (2.5). Using the notations introduced in the preceding proof, we obtain an explicit expression for the inverse matrix in (2.5):
$$[\Gamma_a(i\kappa)]^{-1} = \frac{2\kappa}{(w^2 - 1 - u)\big[(1+u)(1+v) - w^2(1-v)\big]} \begin{pmatrix} w^2 - (1+u)(1+v) & -w(w^2 - 1 - u) & w^2 v \\ -w(w^2 - 1 - u) & (w^2 + 1 + u)(w^2 - 1 - u) & -w(w^2 - 1 - u) \\ w^2 v & -w(w^2 - 1 - u) & w^2 - (1+u)(1+v) \end{pmatrix}. \qquad (2.15)$$
Without loss of generality we may assume $y = 0$. Suppose, for instance, that $x, x' > a$, then the resolvent difference kernel is obtained by sandwiching the above matrix between the vectors $G(x)$, $G(x')$, where
$$G(x) := \begin{pmatrix} G_{i\kappa}(x + a) \\ G_{i\kappa}(x) \\ G_{i\kappa}(x - a) \end{pmatrix} = \frac{1}{2\kappa} \, e^{-\kappa x} \begin{pmatrix} w \\ 1 \\ w^{-1} \end{pmatrix}, \qquad (2.16)$$
which yields the expression
$$\sum_{j,j'=-1,0,+1} [\Gamma_a(i\kappa)]^{-1}_{jj'} \, G_{i\kappa}(x - y_j) \, G_{i\kappa}(x' - y_{j'}) = \frac{1}{4\kappa^2} \, e^{-\kappa x} e^{-\kappa x'} \, \frac{N}{D} \qquad (2.17)$$
with
$$D = \frac{(w^2 - 1 - u)\big[(1+u)(1+v) - w^2(1-v)\big]}{2\kappa} \qquad (2.18)$$
and
$$N = (w^2 + w^{-2})\big[w^2 - (1+u)(1+v)\big] + 2w^2 v + (w^2 - 1 - u)(u - 1 - w^2). \qquad (2.19)$$
It is straightforward, if tedious, to compute the Taylor expansions of the denominator and numerator: we get
$$D = -2\kappa^2 a^4 \big( \kappa + 2\beta^{-1} \big) + O(a^5), \qquad (2.20)$$
while in the other expression all the terms cancel up to the third order giving
$$N = 4\kappa^4 a^4 + O(a^5). \qquad (2.21)$$
The sought kernel is thus
$$\sum_{j,j'=-1,0,1} [\Gamma_a(k)]^{-1}_{jj'} \, G_k(x - y_j) \, G_k(x' - y_{j'}) = -\frac{\beta}{2(2 + \beta\kappa)} \, e^{-\kappa x} e^{-\kappa x'} (1 + O(a)) \qquad (2.22)$$
as expected. In the same way one can treat the other situations with $x, x'$ belonging to $(-\infty, -a)$, $(-a, 0)$, $(0, a)$, and $(a, \infty)$. In the coefficient this corresponds to different combinations of $(w, 1, w^{-1})$ and $(w^{-1}, 1, w)$ in (2.16). Due to the symmetry of $[\Gamma_a(i\kappa)]^{-1}$, however, there are just two different expressions, the other one having the numerator replaced by
$$N' = (w^4 + 1)v + 2\big[w^2 - (1+u)(1+v)\big] + (w^2 - 1 - u)(u - 1 - w^2) \qquad (2.23)$$
leading to
$$N' = -4\kappa^4 a^4 + O(a^5) \qquad (2.24)$$
and the correct kernel again; recall the sign factor in (2.2). This yields the relation (2.14). For a fixed $\kappa > 0$ we see from the relation (2.16) that its left-hand side can be majorized by a function from $L^2(\mathbf{R}^2)$ which is independent of $a$. The same is, of course, true for the last term in (2.3). Then by (2.3), (2.5), (2.17), and dominated convergence the resolvent converges in the Hilbert-Schmidt norm,
$$\lim_{a \to 0} \big\| (-\Delta_{A_a,Y_a} + \kappa^2)^{-1} - (\Xi_{\beta,y} + \kappa^2)^{-1} \big\|_2 = 0, \qquad (2.25)$$
and thus, a fortiori, $\{-\Delta_{A_a,Y_a}\}_{a \ge 0}$ approximates $\Xi_{\beta,y}$ in the norm-resolvent topology.

Remark 2.3. The result remains valid if the coupling constants $A_a$ are replaced by
$$\alpha_{\pm 1}(a) = \frac{2}{\beta} - \frac{1}{a} + \varphi_1(a), \qquad \alpha_0(a) = \frac{\beta}{a^2} \big( 1 + \varphi_0(a) \big), \qquad (2.26)$$
where $\varphi_j$ are smooth functions behaving as $O(a)$ for $a \to 0+$.
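The limit (2.22) can also be observed numerically without going through the Taylor expansions. The Python sketch below builds the $3 \times 3$ matrix of (2.6), solves the linear system directly, and compares the sum in (2.5) at $k = i\kappa$, $y = 0$, $x, x' > a$ with the predicted limit $-\beta e^{-\kappa(x + x')}/(2(2 + \beta\kappa))$. It assumes the free kernel $G_{i\kappa}(z) = e^{-\kappa|z|}/(2\kappa)$; the helper names and parameter values are illustrative.

```python
import math

def solve3(m, b):
    """Solve the 3x3 system m z = b by Gaussian elimination with partial pivoting."""
    aug = [row[:] + [rhs] for row, rhs in zip(m, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, 4):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        z[r] = (aug[r][3] - sum(aug[r][c] * z[c] for c in range(r + 1, 3))) / aug[r][r]
    return z

def triple_delta_term(x, xp, a, beta, kappa):
    """The sum over j, j' appearing in (2.5), at k = i*kappa and y = 0."""
    u = 2.0 * beta * kappa * a / (2.0 * a - beta)
    v = 2.0 * kappa * a * a / beta
    w = math.exp(-kappa * a)
    s = 1.0 / (2.0 * kappa)
    gamma = [[s * (1 + u), s * w, s * w * w],
             [s * w, s * (1 + v), s * w],
             [s * w * w, s * w, s * (1 + u)]]      # matrix (2.6)
    g = lambda z: math.exp(-kappa * abs(z)) / (2.0 * kappa)   # assumed free kernel
    ys = (-a, 0.0, a)
    gx = [g(x - yj) for yj in ys]
    gxp = [g(xp - yj) for yj in ys]
    z = solve3(gamma, gxp)                          # z = Gamma^{-1} G(x')
    return sum(gi * zi for gi, zi in zip(gx, z))

def delta_prime_term(x, xp, beta, kappa):
    """Limit predicted by (2.22) for x, x' > 0."""
    return -beta / (2.0 * (2.0 + beta * kappa)) * math.exp(-kappa * (x + xp))
```

Taking for instance $\beta = \kappa = 1$, $x = 1$, $x' = 2$, the relative deviation from the limit decreases when $a$ is reduced from $10^{-2}$ to $10^{-3}$, in line with the $1 + O(a)$ factor in (2.22). The matrix is nearly singular for small $a$ (its inverse is of order $a^{-2}$), so much smaller values of $a$ would call for more careful arithmetic than this sketch uses.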
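3. Approximation of $\delta'$ by Regular Potentials

It is easy to use the above result to prove the existence of an approximation of $\delta'$ by local potentials. After a suitable translation we can put $y = 0$ and we seek it in the form
$$W^a_{\varepsilon,0}(x) = \frac{\beta}{\varepsilon \, a(\varepsilon)^2} \, V_0\Big( \frac{x}{\varepsilon} \Big) + \Big( \frac{2}{\beta} - \frac{1}{a(\varepsilon)} \Big) \Big[ \frac{1}{\varepsilon} V_{-1}\Big( \frac{x + a(\varepsilon)}{\varepsilon} \Big) + \frac{1}{\varepsilon} V_1\Big( \frac{x - a(\varepsilon)}{\varepsilon} \Big) \Big]; \qquad (3.1)$$
the general potential approximation $W^a_{\varepsilon,y}(x)$ is obtained by replacing $x$ by $x - y$ at the right-hand side. In this expression $\beta \in \mathbf{R} \setminus \{0\}$, and the involved potentials are supposed to satisfy
$$V_j \in L^1(\mathbf{R}) \quad \text{and} \quad \int_{\mathbf{R}} V_j(x) \, dx = 1 \qquad (3.2)$$
for $j = -1, 0, 1$. The function $a : \mathbf{R}_+ \to \mathbf{R}_+$, to be specified later, is supposed to be continuous at $\varepsilon = 0$ with $a(0) = 0$. The family of one-dimensional Schrödinger operators used to approximate $\Xi_{\beta,y}$ will be of the form
$$H^a_{\varepsilon,y} := -\Delta + W^a_{\varepsilon,y}. \qquad (3.3)$$
If $V_j \in L^1(\mathbf{R})$ the r.h.s. is defined in the sense of the corresponding quadratic forms. If we add the requirement $V_j \in L^2(\mathbf{R})$, then $W^a_{\varepsilon,y}(x)$ is an infinitely small perturbation of the Laplacian and (3.3) as a self-adjoint operator is defined on $\mathrm{dom}(H^a_{\varepsilon,0}) = \mathrm{dom}(-\Delta) = W^{2,2}(\mathbf{R})$ as an operator sum. We will make this assumption everywhere in the following, except for Theorem 3.1 where we refer directly to a result in [AGHH]. To compare the resolvents, we choose $k = i\kappa$ which belongs to the resolvent sets of both $H^a_{\varepsilon,y}$ and the operator $\Xi_{\beta,y}$ introduced above; this can be achieved if $k^2$ is nonreal or with $\kappa > 0$ large enough. Then we may employ the elementary estimate
$$\big\| (H^a_{\varepsilon,y} + \kappa^2)^{-1} - (\Xi_{\beta,y} + \kappa^2)^{-1} \big\| \le \big\| (H^a_{\varepsilon,y} + \kappa^2)^{-1} - (-\Delta_{A_{a(\varepsilon)},Y_{a(\varepsilon)}} + \kappa^2)^{-1} \big\| + \big\| (-\Delta_{A_{a(\varepsilon)},Y_{a(\varepsilon)}} + \kappa^2)^{-1} - (\Xi_{\beta,y} + \kappa^2)^{-1} \big\| \qquad (3.4)$$
to prove the following claim:

Theorem 3.1. Let $V_j \in L^1(\mathbf{R})$, $j = -1, 0, 1$. For any sequence $\{a_n\} \subset (0, \infty)$ with $a_n \to 0$ there is a sequence $\{\varepsilon_n\}$ of positive numbers with $\varepsilon_n \to 0$ such that
$$\lim_{n \to \infty} \big\| (H^{a_n}_{\varepsilon_n,y} + \kappa^2)^{-1} - (\Xi_{\beta,y} + \kappa^2)^{-1} \big\| = 0 \qquad (3.5)$$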
holds for any $\kappa > 2|\beta|^{-1}$.

Proof. Without loss of generality we may put $y = 0$. In view of Theorem 2.2 it is sufficient to deal with the first term at the right-hand side of (3.4). By [AGHH, Thm. II.2.2.2] for each $a_n > 0$, $n = 1, 2, \ldots$, there exists a sequence $\{\varepsilon_{nm}\}_{m=1}^{\infty}$ with $\lim_{m \to \infty} \varepsilon_{nm} = 0$ such that
$$\lim_{m \to \infty} \big\| (H^{a_n}_{\varepsilon_{nm},0} + \kappa^2)^{-1} - (-\Delta_{A_{a_n},Y_{a_n}} + \kappa^2)^{-1} \big\| = 0, \qquad (3.6)$$
where $Y_{a_n} = \{y_j^{(n)}\}_{j=-1,0,1} = \{-a_n, 0, a_n\}$, $A_{a_n} = \{\alpha_j^{(n)}\}_{j=-1,0,1} = (2\beta^{-1} - a_n^{-1}, \beta a_n^{-2}, 2\beta^{-1} - a_n^{-1})$ and $\{H^{a_n}_{\varepsilon_{nm},0}\}_{n \ge 1}$ are defined by local potentials
$$W^{a_n}_{\varepsilon_{nm},0}(x) = \frac{\beta}{\varepsilon_{nm} a_n^2} \, V_0\Big( \frac{x}{\varepsilon_{nm}} \Big) + \Big( \frac{2}{\beta} - \frac{1}{a_n} \Big) \Big[ \frac{1}{\varepsilon_{nm}} V_{-1}\Big( \frac{x + a_n}{\varepsilon_{nm}} \Big) + \frac{1}{\varepsilon_{nm}} V_1\Big( \frac{x - a_n}{\varepsilon_{nm}} \Big) \Big]. \qquad (3.7)$$
Indeed, in view of (3.2), Theorem II.2.2.2 of [AGHH] applies if we choose the real analytic function $\lambda_j(\cdot)$, which enters into Theorem II.2.2.2, of the form $\lambda_j(\varepsilon_{nm}) := \varepsilon_{nm} \alpha_j^{(n)}$. If $\mathrm{Im}\, k^2 \neq 0$, the norms at the right-hand side of (3.6) are uniformly bounded and the claim is valid for the diagonal sequence, $\varepsilon_n := \varepsilon_{nn}$ – cf. [RS, Sect. I.3]. By the first resolvent identity its validity extends to any point outside the spectrum of $\Xi_{\beta,0}$.

The diagonal trick used in the above proof introduces a relation between the parameters $a$ and $\varepsilon$. Since to a given $a$ we choose $\varepsilon$ small enough to meet the requirements, the procedure works if $a(\varepsilon)$ tends to zero sufficiently slowly as $\varepsilon \to 0+$. Put like that the claim is, of course, very vague. Even without computing the resolvents, e.g., we can conjecture that the family (3.3) will not yield the sought approximation if $a(\varepsilon) \sim \varepsilon^\nu$ with $\nu > 1$ since then the three potentials will overlap substantially for small values of $\varepsilon$ and eventually the (divergent) overall mean value will prevail. The question about a rate between $a$ and $\varepsilon$ which is sufficient to yield a convergent approximation is subtle, and the rest of the section is devoted to it. As above we put $y = 0$ in the following argument restoring a general $y$ only in the final result. First we introduce the sesquilinear forms $t^{(0)}_{a,\varepsilon}[\cdot, \cdot]$,
$$t^{(0)}_{a,\varepsilon}[u, v] := \frac{\beta}{a^2} \Big( u(0) \overline{v(0)} - \frac{1}{\varepsilon} \int_{-\infty}^{+\infty} dx \, V_0(x/\varepsilon) \, u(x) \overline{v(x)} \Big),$$
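and $t^{(j)}_{a,\varepsilon}[\cdot, \cdot]$,
$$t^{(j)}_{a,\varepsilon}[u, v] := \Big( \frac{2}{\beta} - \frac{1}{a} \Big) \Big( u(ja) \overline{v(ja)} - \frac{1}{\varepsilon} \int_{-\infty}^{+\infty} dx \, V_j\big( (x - ja)/\varepsilon \big) \, u(x) \overline{v(x)} \Big),$$
where $j = \pm 1$ and $\mathrm{dom}(t^{(j)}_{a,\varepsilon}) = \mathrm{dom}(t^{(0)}_{a,\varepsilon}) = W^{1,2}(\mathbf{R})$. We set
$$t_{a,\varepsilon}[\cdot, \cdot] := t^{(0)}_{a,\varepsilon}[\cdot, \cdot] + t^{(-1)}_{a,\varepsilon}[\cdot, \cdot] + t^{(+1)}_{a,\varepsilon}[\cdot, \cdot]$$
with $\mathrm{dom}(t_{a,\varepsilon}) = W^{1,2}(\mathbf{R})$. To proceed further we need stronger hypotheses about the potentials, namely the conditions (3.8) and (3.11) below. It can be shown that in combination with $V_j \in L^2(\mathbf{R})$ they imply $V_j \in L^1(\mathbf{R})$.

Lemma 3.2. Let $V_0 \in L^2(\mathbf{R})$. If the conditions (3.2) and
$$\int_{-\infty}^{+\infty} dx \, |x|^{1/2} |V_0(x)| < +\infty \qquad (3.8)$$
are valid, then
$$|t^{(0)}_{a,\varepsilon}[u, v]| \le \sqrt{2} \sqrt{\varepsilon} \, |\beta| a^{-2} \int_{-\infty}^{+\infty} dx \, |x|^{1/2} |V_0(x)| \; \|u\|_{W^{1,2}} \|v\|_{W^{1,2}}$$
holds for $u, v \in W^{1,2}(\mathbf{R})$.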
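Proof. Changing the integration variable $x \to \varepsilon x$ in the definition of $t^{(0)}_{a,\varepsilon}[u, v]$ and using (3.2) we get
$$t^{(0)}_{a,\varepsilon}[u, v] = \frac{\beta}{a^2} \int_{-\infty}^{+\infty} dx \, V_0(x) \Big( u(0) \overline{v(0)} - u(\varepsilon x) \overline{v(\varepsilon x)} \Big),$$
which yields
$$t^{(0)}_{a,\varepsilon}[u, v] = \frac{\beta}{a^2} \int_{-\infty}^{+\infty} dx \, V_0(x) \Big\{ \big( u(0) - u(\varepsilon x) \big) \overline{v(0)} + u(\varepsilon x) \big( \overline{v(0)} - \overline{v(\varepsilon x)} \big) \Big\}.$$
Since
$$|f(x)| \le \frac{1}{\sqrt{2}} \|f\|_{W^{1,2}}, \qquad f \in W^{1,2}(\mathbf{R}), \qquad (3.9)$$
and
$$|f(x) - f(y)| \le \sqrt{|x - y|} \, \|f\|_{W^{1,2}}, \qquad f \in W^{1,2}(\mathbf{R}), \qquad (3.10)$$
as it follows from $f(x) - f(y) = -\int_x^y f'(t) \, dt$ and the Hölder inequality, we find
$$|t^{(0)}_{a,\varepsilon}[u, v]| \le 2 \, \frac{|\beta|}{a^2} \int_{-\infty}^{+\infty} dx \, \sqrt{\frac{\varepsilon |x|}{2}} \, |V_0(x)| \; \|u\|_{W^{1,2}} \|v\|_{W^{1,2}}$$
for $u, v \in W^{1,2}(\mathbf{R})$ which proves the lemma.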
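The two Sobolev inequalities used in this proof, read as $|f(x)| \le 2^{-1/2} \|f\|_{W^{1,2}}$ for (3.9) and $|f(x) - f(y)| \le |x - y|^{1/2} \|f\|_{W^{1,2}}$ for (3.10), can be sanity-checked on a concrete function. The Python sketch below uses the Gaussian $f(x) = e^{-x^2}$, for which $\|f\|_{L^2}^2 = \|f'\|_{L^2}^2 = \sqrt{\pi/2}$ in closed form; the choice of test function, the grid and the function names are illustrative assumptions.

```python
import math

def gaussian(x):
    """Test function f(x) = exp(-x^2)."""
    return math.exp(-x * x)

# ||f||_2^2 = sqrt(pi/2) and ||f'||_2^2 = sqrt(pi/2) for the Gaussian above,
# so ||f||_{W^{1,2}}^2 = 2*sqrt(pi/2).
W12_NORM = math.sqrt(2.0 * math.sqrt(math.pi / 2.0))

def check_39(points):
    """Pointwise bound (3.9) on the given sample points."""
    return all(abs(gaussian(x)) <= W12_NORM / math.sqrt(2.0) for x in points)

def check_310(points):
    """Hoelder-type bound (3.10) on all pairs of sample points."""
    return all(abs(gaussian(x) - gaussian(y)) <= math.sqrt(abs(x - y)) * W12_NORM
               for x in points for y in points)
```

A check on sample points is of course not a proof. It is worth noting that (3.9) is sharp: for $f(x) = e^{-|x|}$ one has $\|f\|_{W^{1,2}} = \sqrt{2}$ and $\sup |f| = 1$, so equality holds.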
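Lemma 3.3. Let $V_j \in L^2(\mathbf{R})$, $j = \pm 1$, and $\beta \neq 0$. If the conditions (3.2) and
$$\int_{-\infty}^{+\infty} dx \, |x|^{1/2} |V_j(x)| < +\infty, \qquad j = \pm 1, \qquad (3.11)$$
are satisfied, then
$$|t^{(j)}_{a,\varepsilon}[u, v]| \le \sqrt{2} \sqrt{\varepsilon} \, \Big| \frac{2}{\beta} - \frac{1}{a} \Big| \int_{-\infty}^{+\infty} dx \, |x|^{1/2} |V_j(x)| \; \|u\|_{W^{1,2}} \|v\|_{W^{1,2}} \qquad (3.12)$$
holds for any $u, v \in W^{1,2}(\mathbf{R})$ and $j = \pm 1$.

Proof. Let $j = -1$. Changing the integration variable to $\varepsilon x - a$ in the definition of $t^{(-1)}_{a,\varepsilon}[u, v]$ we get
$$t^{(-1)}_{a,\varepsilon}[u, v] = \Big( \frac{2}{\beta} - \frac{1}{a} \Big) \int_{-\infty}^{+\infty} dx \, V_{-1}(x) \Big( u(-a) \overline{v(-a)} - u(\varepsilon x - a) \overline{v(\varepsilon x - a)} \Big).$$
From here we infer
$$t^{(-1)}_{a,\varepsilon}[u, v] = \Big( \frac{2}{\beta} - \frac{1}{a} \Big) \int_{-\infty}^{+\infty} dx \, V_{-1}(x) \Big\{ \big( u(-a) - u(\varepsilon x - a) \big) \overline{v(-a)} + u(\varepsilon x - a) \big( \overline{v(-a)} - \overline{v(\varepsilon x - a)} \big) \Big\}. \qquad (3.13)$$
Using again (3.9) and (3.10) we complete the proof.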
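Corollary 3.4. Let $V_j \in L^2(\mathbf{R})$, $j = -1, 0, +1$, and $\beta \neq 0$. If the potentials $V_j$ satisfy the conditions (3.2), (3.8), (3.11), then the estimate
$$|t_{a,\varepsilon}[u, v]| \le \sqrt{\varepsilon} \, C(a) \, \|u\|_{W^{1,2}} \|v\|_{W^{1,2}}$$
is valid for $u, v \in \mathrm{dom}(t_{a,\varepsilon}) = W^{1,2}(\mathbf{R})$, where the constant $C(a)$ is given by
$$C(a) := \sqrt{2} \, \Big\{ \frac{|\beta|}{a^2} \int_{-\infty}^{+\infty} dx \, |x|^{1/2} |V_0(x)| + \Big| \frac{2}{\beta} - \frac{1}{a} \Big| \int_{-\infty}^{+\infty} dx \, |x|^{1/2} \big\{ |V_{-1}(x)| + |V_{+1}(x)| \big\} \Big\}. \qquad (3.14)$$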
Let us next introduce the operator G(a) : L2 (R) → C3 , +∞ −∞ dx Giκ (x + a)f (x) +∞ G(a)f := −∞ dx Giκ (x)f (x) +∞ −∞ dx Giκ (x − a)f (x) for f ∈ dom(G(a)) = L2 (R). Obviously, the action of the adjoint operator G(a)∗ : C3 → L2 (R) is given by G(a)∗ ξ (x) = Giκ (x + a)ξ−1 + Giκ (x)ξ0 + Giκ (x − a)ξ+1 , ξ−1 ξ := ξ0 ∈ C3 . ξ+1
where
With these definitions the r.h.s. of (2.5) can be rewritten as (− Aa ,Ya + κ 2 )−1 f = (H0 + κ 2 )−1 f + G(a)∗ a (iκ)−1 G(a)f ,
(3.15)
where Ya = {yj }j =−1,0,+1 with yj = j a and the matrix a (iκ) is given by (2.6). ˆ Furthermore, we introduce the operator G(a): ˆ G(a)f :=
(H0 + κ 2 )−1/2 f G(a)f
L2 (R) : L (R) −→ ⊕ C3 2
(3.16)
and the operator ˆ a (iκ): ˆ a (iκ) :=
I 0 0 a (iκ)
L2 (R) L2 (R) : ⊕ −→ ⊕ . C3 C3
Using the definitions (3.16) and (3.17) we can rewrite (3.15) as ˆ ∗ ˆ a (iκ)−1 G(a)f ˆ . (− Aa ,Ya + κ 2 )−1 f = G(a)
(3.17)
Since Giκ (x −j a) ∈ W 1,2 (R) for j = −1, 0, +1, one gets that ran(G(a)∗ ) ⊆ W 1,2 (R), ˆ ∗ ) ⊆ W 1,2 (R). Thus it makes sense to define the following and consequently, ran(G(a) sesquilinear form ˆ ˆ da, [ξˆ , η] ˆ := ta, [G(a) η], ˆ ξ , G(a) ∗ˆ
where ξˆ :=
∗
f ξ
L2 (R) ξˆ , ηˆ ∈ dom(da, ) = Hˆ := ⊕ , C3
and
g yˆ := η
with f, g ∈ L2 (R) and ξ, η ∈ C3 . By construction, the form da, [· , ·] defines a bounded ˆ operator Da, : Hˆ → H. Lemma 3.5. Let Vj ∈ L2 (R), j = −1, 0, +1, and β = 0. If the potentials Vj satisfy the conditions (3.2), (3.8) and (3.11) and κ ≥ 1, then one has √ Da, B(Hˆ ,Hˆ ) ≤ 4 C(a) (3.18) for a > 0. Proof. Using Corollary 3.4 we find ˆ ∗ ξˆ , G(a) ˆ ∗ η]| |da, [ξˆ , η]| ˆ = |ta, [G(a) ˆ ≤
√ ˆ ∗ ξˆ W 1,2 G(a) ˆ ∗ η] C(a)G(a) ˆ W 1,2 .
Since ˆ ∗ ξˆ )(x) = (H0 + κ 2 )−1/2 f + Giκ (x + a)ξ−1 + Giκ (x)ξ0 + Giκ (x − a)ξ+1 , (G(a) we have
ˆ ∗ ξˆ 2 1,2 ≤ 4 (H0 + κ 2 )−1/2 f 2 1,2 + Giκ (· + a)ξ−1 2 1,2 G(a) W W W 2 2 + Giκ (·)ξ0 W 1,2 + Giκ (· − a)ξ+1 W 1,2 .
The assumption κ ≥ 1 yields (H0 + κ 2 )−1/2 f 2W 1,2 ≤ f 2 ,
f ∈ L2 (R),
and Giκ (· − j a)ξj 2W 1,2 =
1 −1 κ + κ −3 |ξj |2 ≤ |ξj |2 , 4
j = −1, 0, +1,
for a ≥ 0. In this way we get the estimate
ˆ ∗ ξˆ 2 1,2 ≤ 4 f 2 + ξ 2 3 ≤ 4ξˆ 2 , G(a) ˆ W C H
for a ≥ 0. This leads to the estimate
√ ˆ ∗ ξˆ , G(a) ˆ ∗ η]| |da, [ξˆ , η]| ˆ = |ta, [G(a) ˆ ≤ 4 C(a)ξˆ Hˆ η ˆ Hˆ ,
from which (3.18) follows readily.
Let us further introduce the Neumann iterations $R^{(n)}_{a,\varepsilon}(i\kappa)$ defined by
$$R^{(n)}_{a,\varepsilon}(i\kappa) := \hat G(a)^* \, \hat\Gamma_a(i\kappa)^{-1} \big( D_{a,\varepsilon} \, \hat\Gamma_a(i\kappa)^{-1} \big)^n \hat G(a), \qquad n = 0, 1, 2, \ldots$$
for $\kappa > \max(-2/\beta, 1)$ and $a \in (0, a_0(\kappa))$. The meaning of these expressions will become clear below; we note that
$$R^{(0)}_{a,\varepsilon}(i\kappa) = (-\Delta_{A_a,Y_a} + \kappa^2)^{-1}. \qquad (3.19)$$
2βa −2 2 + βκ
2κβ −1 2κ(κ + β −1 ) −2κ(κ + 2β −1 ) × −2κ(κ + 2β −1 ) 4κ(κ + 2β −1 ) −2κ(κ + 2β −1 ) (1 + O(a)) . 2κβ −1 −2κ(κ + 2β −1 ) 2κ(κ + β −1 )
Consequently, for κ > max(−2/β, 1) there is a constant C (κ) > 0 such that a (iκ)−1 3 3 ≤ C (κ) a −2 B(C ,C )
(3.20)
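holds for any $a \in (0, a_0(\kappa))$.

Lemma 3.6. Let $V_j \in L^2(\mathbf{R})$, $j = -1, 0, +1$, and $\kappa > \max(-2/\beta, 1)$, $\beta \neq 0$. If the potentials $V_j$ satisfy the conditions (3.2), (3.8) and (3.11), then the Neumann iterations obey the estimate
$$\big\| R^{(n)}_{a,\varepsilon}(i\kappa) \big\| \le \frac{2C'(\kappa)}{a^2} \Big( \frac{4\sqrt{\varepsilon} \, C'(\kappa) C(a)}{a^2} \Big)^n \qquad (3.21)$$
for $a \in (0, a_0(\kappa))$ and $n = 1, 2, \ldots$, where $C(a)$ is given by (3.14).

Proof. Since $\kappa > 1$, we have
$$\| \hat G(a) \|_{B(H, \hat H)} = \| \hat G^*(a) \|_{B(\hat H, H)} \le \sqrt{2}.$$
An elementary estimate,
$$\big\| R^{(n)}_{a,\varepsilon}(i\kappa) \big\| \le 2 \, \big\| \Gamma_a(i\kappa)^{-1} \big\|^{n+1}_{B(\mathbf{C}^3, \mathbf{C}^3)} \, \big\| D_{a,\varepsilon} \big\|^{n}_{B(\hat H, \hat H)},$$
gives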
(n) Ra, (iκ) ≤ 2 · 4n n/2 C (κ)n+1 C(a)n a −(2n+2) so (3.21) follows readily. If κ > max{−2/β, 1} and the condition √ 4 C (κ)C(a) τ (, a, κ) := max(−2/β, 1), β = 0. If the potentials Vj satisfy the conditions (3.2), (3.8), (3.11), and τ (, a, κ) < 1 is valid for a given some a ∈ (0, a0 (κ)), then −κ 2 belongs to the resolvent set of the operator H,0 by (3.3), and moreover, one has a + κ 2 )−1 = Ra, (iκ). (H,0
(3.23)
Proof. Combining the above definitions of the quadratic forms, we get − Aa ,Ya + κ 2 u, − Aa ,Ya + κ 2 v = ha,0 [u, v] + κ 2 (u, v) + ta, [u, v] (3.24) −1 for u, v ∈ W 1,2 (R). We use this relation for u = Ra, (iκ)f and v = − Aa ,Ya + κ 2 g with f, g ∈ L2 (R). Since ˆ ∗ ˆ a (iκ)−1 Ra, (iκ) = G(a)
∞
Da, ˆ a (iκ)−1
n
ˆ G(a)
(3.25)
n=0
ˆ ∗ ) ⊆ W 1,2 (R) we get u ∈ W 1,2 (R). Since v = − A ,Y + κ 2 −1 g ∈ and ran(G(a) a a W 1,2 (R), we can insert u and v into (3.24). This yields (Ra, (iκ)f, g) = ha,0 [Ra, (iκ)f, (− Aa ,Ya + κ 2 )−1 g] + κ 2 (Ra, (iκ)f, (− Aa ,Ya + κ 2 )−1 g)
+ ta, [Ra, (iκ)f, (− Aa ,Ya + κ 2 )−1 g].
Using (3.19) and (3.25) we find ta, [Ra, (iκ)f, (− Aa ,Ya + κ 2 )−1 g] ∞ n −1 ˆ ˆ ˆ ∗ (iκ) ˆ ∗ ˆ a (iκ)−1 ˆ = ta, [G(a) Da, ˆ a (iκ)−1 G(a)f, G(a) G(a)g] n=0
= Da, ˆ a (iκ)−1 =
∞ n=1
∞
Da, ˆ a (iκ)−1
n=0
(n) Ra, (iκ)f, g
.
n
−1 ˆ ˆ ˆ G(a)f, (iκ) G(a)g
Furthermore, from (3.19) we infer that
(− Aa ,Ya + κ 2 )−1 f, g = ha,0 [Ra, (iκ)f, (− Aa ,Ya + κ 2 )−1 g] + κ 2 (Ra, (iκ)f, (− Aa ,Ya + κ 2 )−1 g). Setting now h := (− Aa ,Ya + κ 2 )−1 g we find (f, h) = ha,0 [Ra, (iκ)f, h] + κ 2 (Ra, (iκ)f, h)
(3.26)
for h ∈ dom(− Aa ,Ya ). Since dom(− Aa ,Ya ) is a core for the quadratic form ha,0 [· , ·] one concludes that the equality (3.26) extends to each h ∈ dom(ha,0 ). In particular, if a ) we have h ∈ dom(H,0 a + κ 2 )h). (f, h) = (Ra, (iκ)f, (H,0 a )) and In this way we find Ra, (iκ)f ∈ dom(H,0 a + κ 2 )Ra, (iκ)f = f, (H,0
and
a Ra, (iκ)(H,0 + κ 2 )h = h,
f ∈ H,
a h ∈ dom(H,0 ).
a + κ 2 ) = {0} and ran(H a + κ 2 ) = H, so the operator H a + κ 2 is Hence ker(H,0 ,0 ,0 a + κ 2 )−1 = R (iκ). boundedly invertible and (H,0 a,
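With the help of Lemma 3.7 one can prove the following estimate.

Lemma 3.8. Let $V_j \in L^2(\mathbb{R})$, $j = -1, 0, +1$, and $\kappa > \max(-2/\beta, 1)$, $\beta \neq 0$. If the potentials $V_j$ satisfy the conditions (3.2), (3.8) and (3.11) and $\tau(\epsilon,a,\kappa) < 1$ is valid for some $a \in (0, a_0(\kappa))$, then
$$\big\|(H^{a}_{\epsilon,0} + \kappa^2)^{-1} - (-\Delta_{A_a,Y_a} + \kappa^2)^{-1}\big\| \le \frac{2C(\kappa)}{a^2}\,\tau(\epsilon,a,\kappa)\,\big(1 - \tau(\epsilon,a,\kappa)\big)^{-1}.$$

Proof. Taking into account (3.23) and (3.19) we find
$$(H^{a}_{\epsilon,0} + \kappa^2)^{-1} - (-\Delta_{A_a,Y_a} + \kappa^2)^{-1} = \sum_{n=1}^{\infty} R^{(n)}_{a,\epsilon}(i\kappa).$$
Using the notation (3.22) and taking into account the estimate (3.21) one gets
$$\big\|(H^{a}_{\epsilon,0} + \kappa^2)^{-1} - (-\Delta_{A_a,Y_a} + \kappa^2)^{-1}\big\| \le \frac{2C(\kappa)}{a^2} \sum_{n=1}^{\infty} \tau(\epsilon,a,\kappa)^{n}.$$
If $\tau(\epsilon,a,\kappa) < 1$ is satisfied, summing the geometric series gives the estimate of Lemma 3.8.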
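The geometric-series step in the proof of Lemma 3.8 can be illustrated numerically. The following is only a sketch: the values chosen for $C(\kappa)$ and $C(a)$ are hypothetical placeholders (the paper fixes them through (3.14) and (3.20)), the function names are ours, and the model dependence $C(a) = C a^{-2}$ used in `bound_along_curve` is an assumption made for illustration.

```python
import math


def tau(eps, a, c_kappa, c_a):
    """tau(eps, a, kappa) = 4 sqrt(eps) C(kappa) C(a) / a^2, as in (3.22)."""
    return 4.0 * math.sqrt(eps) * c_kappa * c_a / a**2


def neumann_term_bound(n, eps, a, c_kappa, c_a):
    """Right-hand side of (3.21) for the n-th Neumann iteration."""
    return (2.0 * 4.0**n * eps**(n / 2) * c_kappa**(n + 1)
            * c_a**n / a**(2 * n + 2))


def partial_sum(n_max, eps, a, c_kappa, c_a):
    """Sum of the bounds (3.21) for n = 1, ..., n_max."""
    return sum(neumann_term_bound(n, eps, a, c_kappa, c_a)
               for n in range(1, n_max + 1))


def lemma_3_8_bound(eps, a, c_kappa, c_a):
    """Closed form (2 C(kappa) / a^2) * tau / (1 - tau), valid for tau < 1."""
    t = tau(eps, a, c_kappa, c_a)
    if t >= 1.0:
        raise ValueError("the geometric series diverges for tau >= 1")
    return 2.0 * c_kappa / a**2 * t / (1.0 - t)


def bound_along_curve(eps, exponent=1.0 / 13.0, c_kappa=1.0, c=1.0):
    """Bound along a(eps) = eps**exponent, ASSUMING C(a) = c / a^2.

    With exponent < 1/12 the ratio eps / a(eps)^12 tends to zero.
    """
    a = eps**exponent
    return lemma_3_8_bound(eps, a, c_kappa, c / a**2)


if __name__ == "__main__":
    # Hypothetical constants, chosen only so that tau < 1.
    print(partial_sum(60, 1e-6, 0.5, 1.0, 4.0))
    print(lemma_3_8_bound(1e-6, 0.5, 1.0, 4.0))
    for eps in (1e-6, 1e-9, 1e-12):
        print(eps, bound_along_curve(eps))
```

The partial sums of the termwise bounds agree with the closed form, and along the sample curve $a(\epsilon) = \epsilon^{1/13}$ the bound decreases as $\epsilon$ decreases, although only slowly.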
Now we are ready to say something about the rate of the potential approximation in terms of the relation between $a$ and $\epsilon$. Consider a function $a : (0, \infty) \to (0, \infty)$.
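Theorem 3.9. Let $V_j \in L^2(\mathbb{R})$, $j = -1, 0, +1$, and $\kappa > \max(-2/\beta, 1)$, $\beta \neq 0$. Moreover, suppose that $a(\epsilon) \to 0$ as $\epsilon \to 0+$. If the potentials $V_j$ satisfy the conditions (3.2), (3.8) and (3.11) for $j = -1, 0, +1$, and
$$\lim_{\epsilon \to 0} \frac{\epsilon}{a(\epsilon)^{12}} = 0, \tag{3.27}$$
then
$$\lim_{\epsilon \to 0} \big\|(H^{a}_{\epsilon,y} + \kappa^2)^{-1} - (-\Delta_{A_{a(\epsilon)},Y_{a(\epsilon)}} + \kappa^2)^{-1}\big\| = 0 \tag{3.28}$$
and
$$\lim_{\epsilon \to 0} \big\|(H^{a}_{\epsilon,y} + \kappa^2)^{-1} - (\Xi_{\beta,y} + \kappa^2)^{-1}\big\| = 0. \tag{3.29}$$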
Proof. Since $H^{a}_{\epsilon,y}$ is unitarily equivalent to $H^{a}_{\epsilon,0}$ by translation and the same is true for the other involved operators, we can again put $y = 0$ without loss of generality. By assumption, $a(\epsilon) \in (0, a_0(\kappa))$ for $\epsilon$ sufficiently small. Further, we note that there is a constant $C = C(V_j, \beta)$ such that $C(a) \le C a^{-2}$ for $a > 0$. Using that we can estimate
$$\tau(\epsilon, a(\epsilon), \kappa) \le \frac{4\sqrt{\epsilon}\,C(\kappa)\,C}{a(\epsilon)^{4}} = 4\,C(\kappa)\,C\,a(\epsilon)^{2}\,\frac{\sqrt{\epsilon}}{a(\epsilon)^{6}},$$
so $\lim_{\epsilon \to 0+} \tau(\epsilon, a(\epsilon), \kappa) = 0$ by (3.27) and $\lim_{\epsilon \to 0} a(\epsilon) = 0$. Hence, $\tau(\epsilon, a(\epsilon), \kappa) < 1$ holds for $\epsilon$ sufficiently small. Applying Lemma 3.8 we get
$$\big\|(H^{a}_{\epsilon,0} + \kappa^2)^{-1} - (-\Delta_{A_{a(\epsilon)},Y_{a(\epsilon)}} + \kappa^2)^{-1}\big\| \le 8\,C(\kappa)^{2}\,C\,\frac{\sqrt{\epsilon}}{a(\epsilon)^{6}}\,\big(1 - \tau(\epsilon, a(\epsilon), \kappa)\big)^{-1}.$$
Taking into account once again the assumption (3.27) we prove (3.28). Moreover, using Theorem 2.2 together with the estimate (3.4) we arrive at (3.29).

4. Exceptional Character of the CS Approximation

In conclusion we want to show that it is sufficient to disbalance the limiting procedure slightly, say by changing the normalization (3.2), and the result will be completely different from that in Theorem 3.9. For simplicity we will consider the case $y = 0$ only. Denote by $-\Delta_{D,0}$ the Laplace operator with Dirichlet boundary conditions at the origin, i.e.
$$\mathrm{dom}(-\Delta_{D,0}) = \{ f \in W^{2,2}(\mathbb{R}_-) \oplus W^{2,2}(\mathbb{R}_+) : f(0-) = f(0+) = 0 \}$$
and
$$(-\Delta_{D,0} f)(x) = -\frac{d^2}{dx^2} f(x), \quad f \in \mathrm{dom}(-\Delta_{D,0}).$$
With respect to $L^2(\mathbb{R}) = L^2(\mathbb{R}_-) \oplus L^2(\mathbb{R}_+)$ the operator $-\Delta_{D,0}$ decomposes into
$$-\Delta_{D,0} = -\Delta^{-}_{D,0} \oplus -\Delta^{+}_{D,0}$$
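with $\mathrm{dom}(-\Delta^{\pm}_{D,0}) = \{ f \in W^{2,2}(\mathbb{R}_\pm) : f(0\pm) = 0 \}$. We note that $\sigma(-\Delta^{\pm}_{D,0}) = [0, +\infty)$. The resolvents $(-\Delta^{\pm}_{D,0} + \kappa^2)^{-1}$ are integral operators with the kernels
$$D^{\pm}_{i\kappa}(x, x') := \begin{cases} \pm\frac{1}{\kappa}\, e^{\mp\kappa x} \sinh(\kappa x') & \ldots\ \pm x' \in [0, \pm x) \\ \pm\frac{1}{\kappa}\, \sinh(\kappa x)\, e^{\mp\kappa x'} & \ldots\ \pm x' \in [\pm x, +\infty). \end{cases}$$
Then a straightforward computation shows that the free resolvent (2.1) gets the form
$$G_{i\kappa}(x - x') = D^{-}_{i\kappa}(x, x') \oplus D^{+}_{i\kappa}(x, x') + \frac{1}{2\kappa}\, e^{-\kappa|x|}\, e^{-\kappa|x'|}. \tag{4.1}$$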
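The "straightforward computation" behind (4.1) can be spot-checked numerically. The sketch below assumes the standard free kernel $G_{i\kappa}(x - x') = e^{-\kappa|x - x'|}/(2\kappa)$ for (2.1), and writes the two half-line Dirichlet kernels in the equivalent symmetric form $\kappa^{-1}\sinh(\kappa \min(|x|,|x'|))\,e^{-\kappa \max(|x|,|x'|)}$; the function names are ours.

```python
import math


def free_kernel(x, y, kappa):
    """Free resolvent kernel exp(-kappa |x - y|) / (2 kappa), assumed for (2.1)."""
    return math.exp(-kappa * abs(x - y)) / (2.0 * kappa)


def dirichlet_kernel(x, y, kappa):
    """Kernel of the direct sum of the two half-line Dirichlet resolvents.

    It vanishes when x and y lie on opposite sides of the origin.
    """
    if x * y <= 0.0:
        return 0.0
    lo, hi = sorted((abs(x), abs(y)))
    return math.sinh(kappa * lo) * math.exp(-kappa * hi) / kappa


def rank_one_term(x, y, kappa):
    """The last term of (4.1): exp(-kappa |x|) exp(-kappa |y|) / (2 kappa)."""
    return math.exp(-kappa * (abs(x) + abs(y))) / (2.0 * kappa)


def decomposition_residual(x, y, kappa):
    """Left-hand side minus right-hand side of (4.1) at the point (x, y)."""
    return (free_kernel(x, y, kappa)
            - dirichlet_kernel(x, y, kappa)
            - rank_one_term(x, y, kappa))


if __name__ == "__main__":
    for x, y in [(0.4, 2.0), (-1.3, -0.2), (0.4, -1.3), (2.0, 2.0)]:
        print(x, y, decomposition_residual(x, y, kappa=0.7))
```

The residual is at rounding level for points on the same side of the origin as well as on opposite sides, where the Dirichlet part vanishes and the rank-one term reproduces the whole free kernel.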
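The indicated modification corresponds to the changed $-\Delta_{A_a,Y_a}$ with $A_a$ replaced by $\alpha A_a$,
$$\alpha A_a := \big(\alpha(2\beta^{-1} - a^{-1}),\ \alpha\beta a^{-2},\ \alpha(2\beta^{-1} - a^{-1})\big),$$
where $\alpha, \beta \in \mathbb{R} \setminus \{0\}$. The form $Q_{\alpha A_a,Y_a}[\cdot,\cdot]$ associated with the operator $-\Delta_{\alpha A_a,Y_a}$ is given by
$$Q_{\alpha A_a,Y_a}[u, v] = (u', v') + \alpha\frac{\beta}{a^2}\, u(0)\overline{v(0)} + \alpha\Big(\frac{2}{\beta} - \frac{1}{a}\Big)\big[u(+a)\overline{v(+a)} + u(-a)\overline{v(-a)}\big],$$
where $u, v \in \mathrm{dom}(Q_{\alpha A_a,Y_a}) = W^{1,2}(\mathbb{R})$, which means that $\alpha \neq 1$ amounts to a simultaneous change of all the $\delta$ coupling parameters. The resolvent $(-\Delta_{\alpha A_a,Y_a} + \kappa^2)^{-1}$ is again given by Krein's formula
$$(-\Delta_{\alpha A_a,Y_a} + \kappa^2)^{-1}(x, x') = G_{i\kappa}(x - x') - \sum_{j,j'=-1,0,+1} [\Gamma_{a,\alpha}(i\kappa)]^{-1}_{jj'}\, G_{i\kappa}(x - y_j)\, G_{i\kappa}(x' - y_{j'}), \tag{4.2}$$
where
$$\Gamma_{a,\alpha}(i\kappa) = \frac{1}{2\kappa} \begin{pmatrix} 1 + \alpha u & w & w^2 \\ w & 1 + \alpha v & w \\ w^2 & w & 1 + \alpha u \end{pmatrix},$$
i.e., in comparison with (2.6) we have $u \to \alpha u$, $v \to \alpha v$, while $w$ is preserved.

Lemma 4.1. Let $\kappa > 0$. The resolvent $(-\Delta_{\alpha A_a,Y_a} + \kappa^2)^{-1}$ exists for sufficiently small $a > 0$ if $\alpha \neq 1$.

Proof. It is sufficient that $-\kappa^2$ is not an eigenvalue. As in Proposition 2.1, $-\kappa^2$ would be an eigenvalue of $-\Delta_{\alpha A_a,Y_a}$ if $\kappa$ satisfied one of the equations analogous to (2.10) and (2.11), with $\kappa$ replaced by $\alpha\kappa$ at the r.h.s. The Taylor expansion around $a = 0$ shows that this cannot happen unless $\alpha = 1$.

In the following we fix $\kappa > 0$, $\alpha \notin \{0, 1\}$, and $\beta \neq 0$. Then there is $a_0(\kappa) > 0$ such that for all $a \in (0, a_0(\kappa))$ the resolvent $(-\Delta_{\alpha A_a,Y_a} + \kappa^2)^{-1}$ exists.

Theorem 4.2. Let $\kappa > 0$, $\alpha \neq 0, 1$, and $\beta \neq 0$ be fixed. Then the relation
$$\lim_{a \to 0+} \big(-\Delta_{\alpha A_a,Y_a} + \kappa^2\big)^{-1}(x, x') = D^{-}_{i\kappa}(x, x') \oplus D^{+}_{i\kappa}(x, x') \tag{4.3}$$
holds for any $x, x' \in \mathbb{R}$. Consequently, $-\Delta_{\alpha A_a,Y_a} \to -\Delta_{D,0}$ as $a \to 0+$ in the norm-resolvent sense.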
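Proof. Considering the case $x, x' \ge a$ and following the line of reasoning from (2.15) to (2.20) we obtain
$$\sum_{j,j'=-1,0,+1} [\Gamma_{a,\alpha}(i\kappa)]^{-1}_{jj'}\, G(x - y_j)\, G(x' - y_{j'}) = \frac{1}{4\kappa^2}\, e^{-\kappa x} e^{-\kappa x'}\, \frac{N_\alpha}{D_\alpha} \tag{4.4}$$
with
$$D_\alpha := \frac{(w^2 - 1 - \alpha u)\big[(1 + \alpha u)(1 + \alpha v) - w^2 (1 - \alpha v)\big]}{2\kappa}$$
and
$$N_\alpha := (w^2 + w^{-2})\big[w^2 - (1 + \alpha u)(1 + \alpha v)\big] + 2\alpha w^2 v + (w^2 - 1 - \alpha u)(\alpha u - 1 - w^2).$$
If $\alpha \neq 1$, one gets
$$D_\alpha = -2\kappa a^2 (1 - \alpha) + O(a^3) \quad \text{and} \quad N_\alpha = -4\kappa^2 a^2 (1 - \alpha) + O(a^3), \tag{4.5}$$
so the r.h.s. of (4.4) equals $\frac{1}{2\kappa}\, e^{-\kappa x} e^{-\kappa x'} (1 + O(a))$. Inserting (4.5) into (4.2) and using (4.1), we find
$$\lim_{a \to +0} (-\Delta_{\alpha A_a,Y_a} + \kappa^2)^{-1}(x, x') = D^{+}_{i\kappa}(x, x') \tag{4.6}$$
for $x, x' \in [a, +\infty)$. In the same way one can treat the other combinations with $x, x'$ belonging to $(-\infty, -a]$, $(-a, 0)$, $(0, a)$ and $[a, +\infty)$; doing so we check (4.3) for $x, x' \in \mathbb{R}$. Taking into account (4.1) and (4.2) one easily verifies that
$$\big(-\Delta_{\alpha A_a,Y_a} + \kappa^2\big)^{-1}(x, x') - D^{-}_{i\kappa}(x, x') \oplus D^{+}_{i\kappa}(x, x')$$
can be majorized by a function from $L^2(\mathbb{R}^2)$ which is independent of $a$. By (4.3) and the Lebesgue convergence theorem the difference $\big(-\Delta_{\alpha A_a,Y_a} + \kappa^2\big)^{-1} - \big(-\Delta_{D,0} + \kappa^2\big)^{-1}$ converges to zero in the Hilbert-Schmidt norm, so $-\Delta_{\alpha A_a,Y_a} \to -\Delta_{D,0}$ as $a \to 0+$ in the norm-resolvent sense.

Let us introduce the Schrödinger operator $H^{a}_{\epsilon,0,\alpha}$ defined by
$$H^{a}_{\epsilon,0,\alpha} := -\Delta + \alpha W^{a}_{\epsilon,0}$$
for $\alpha \in \mathbb{R} \setminus \{0\}$ as in the previous section. It corresponds to rescaling of the original approximation potential: we have $H^{a}_{\epsilon,0,\alpha} = H^{a}_{\epsilon,0}$ if $\alpha = 1$. The Neumann iterations are now defined by
$$R^{(n)}_{a,\epsilon,\alpha}(i\kappa) := \hat G^{*}(a)\,\hat\Gamma_{a,\alpha}(i\kappa)^{-1} \big(D_{a,\epsilon}\,\hat\Gamma_{a,\alpha}(i\kappa)^{-1}\big)^{n}\,\hat G(a), \quad n = 0, 1, 2, \ldots$$
for $\kappa > 1$ and $a \in (0, a_0(\kappa))$, where the definition of $\hat\Gamma_{a,\alpha}(i\kappa)$ is obvious, cf. (3.17). We note that for $\kappa > 1$ and $\alpha \neq 1$ there is a constant $C_\alpha(\kappa) > 0$ such that instead of (3.20) one has the estimate
$$\big\|\hat\Gamma_{a,\alpha}(i\kappa)^{-1}\big\|_{B(\mathbb{C}^3,\mathbb{C}^3)} \le C_\alpha(\kappa)\, a^{-1}$$
for $a \in (0, a_0(\kappa))$. Lemma 3.6 reads now as follows.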
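Lemma 4.3. Let $V_j \in L^2(\mathbb{R})$, $j = -1, 0, +1$, and $\kappa > 1$, $\beta \neq 0$. If the potentials $V_j$ satisfy the conditions (3.2), (3.8) and (3.11), then the Neumann iterations obey the estimate
$$\big\|R^{(n)}_{a,\epsilon,\alpha}(i\kappa)\big\| \le \frac{2C_\alpha(\kappa)}{a} \left(\frac{4\sqrt{\epsilon}\,C_\alpha(\kappa)\,C(a)}{a}\right)^{n}$$
for $a \in (0, a_0(\kappa))$ and $n = 1, 2, \ldots$, where $C(a)$ is given by (3.14).

The proof is similar to that of Lemma 3.6. In view of Lemma 4.3 one has to modify the parameter $\tau(\epsilon, a, \kappa)$ to
$$\tau_\alpha(\epsilon, a, \kappa) := \frac{4\sqrt{\epsilon}\,C_\alpha(\kappa)\,C(a)}{a}.$$
If $\alpha\tau_\alpha(\epsilon, a, \kappa) < 1$ is satisfied, then the operator $R_{a,\epsilon,\alpha}(i\kappa) := \sum_{n=0}^{\infty} \alpha^{n} R^{(n)}_{a,\epsilon,\alpha}(i\kappa)$ is well defined. With obvious modifications Lemma 3.7 takes the following form.

Lemma 4.4. Let $V_j \in L^2(\mathbb{R})$, $j = -1, 0, +1$, and let $\kappa > 1$, $\alpha \notin \{0, 1\}$, and $\beta \neq 0$. If the potentials $V_j$ satisfy the conditions (3.2), (3.8) and (3.11) and $\alpha\tau_\alpha(\epsilon, a, \kappa) < 1$ is valid for some $a \in (0, a_0(\kappa))$, then $-\kappa^2$ belongs to the resolvent set of the operator $H^{a}_{\epsilon,0,\alpha}$, and, moreover, one has
$$(H^{a}_{\epsilon,0,\alpha} + \kappa^2)^{-1} = R_{a,\epsilon,\alpha}(i\kappa).$$

Lemma 3.8 modifies similarly but we get a slightly stronger result because the matrix $\hat\Gamma_{a,\alpha}(i\kappa)^{-1}$ is now less singular for any $\kappa > 0$ as $a \to 0$.

Lemma 4.5. Under the assumptions of the preceding lemma,
$$\big\|(H^{a}_{\epsilon,0,\alpha} + \kappa^2)^{-1} - (-\Delta_{\alpha A_a,Y_a} + \kappa^2)^{-1}\big\| \le 2\alpha C_\alpha(\kappa)\, \frac{\tau_\alpha(\epsilon, a, \kappa)}{a}\, \big(1 - \alpha\tau_\alpha(\epsilon, a, \kappa)\big)^{-1}.$$

Taking into account Theorem 4.2 and Lemmata 4.4, 4.5 we thus prove the following theorem.

Theorem 4.6. Let $V_j \in L^2(\mathbb{R})$, $j = -1, 0, 1$, and let $\kappa > 1$, $\alpha \notin \{0, 1\}$, and $\beta \neq 0$. Furthermore, let $\lim_{\epsilon \to 0} a(\epsilon) = 0$. If the potentials $V_j$ satisfy the conditions (3.2), (3.8) and (3.11) and
$$\lim_{\epsilon \to 0} \frac{\epsilon}{a(\epsilon)^{8}} = 0,$$
then
$$\lim_{\epsilon \to 0} \Big\|\big(H^{a}_{\epsilon,0,\alpha} + \kappa^2\big)^{-1} - \big(-\Delta_{\alpha A_{a(\epsilon)},Y_{a(\epsilon)}} + \kappa^2\big)^{-1}\Big\| = 0$$
and
$$\lim_{\epsilon \to 0} \Big\|\big(H^{a}_{\epsilon,0,\alpha} + \kappa^2\big)^{-1} - \big(-\Delta_{D,0} + \kappa^2\big)^{-1}\Big\| = 0.$$

Using a translation, the analogous conclusion can be made for the family $\{H^{a}_{\epsilon,y,\alpha}\}$ with the potential center shifted to a point $y$, which naturally converges for $\alpha \notin \{0, 1\}$ to the Laplacian with the Dirichlet decoupling at $y$.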
Acknowledgements. The authors are grateful for the hospitality in the institutes where parts of this work were done: P.E. and H.N. in Centre de Physique Théorique, CNRS, Marseille-Luminy, and H.N. and V.Z. in Nuclear Physics Institute, AS, Řež near Prague. We also thank the referee for pointing out an error in the first version of the manuscript. The research was partially supported by the GAAS Grant A1048101 and the Exchange Agreement No. 7919 between CNRS and the Czech Academy of Sciences.
Note added in proof. We thank J. Brasche who drew our attention to the article “Singular Schrödinger Operators as Limits of Point Interaction Hamiltonian”, Potential Analysis 8, 163–178 (1998), by J. Brasche, R. Figari and A. Teta, related to the topic of the present paper.

References

[AGHH] Albeverio, S., Gesztesy, F., Høegh-Krohn, R., Holden, H.: Solvable Models in Quantum Mechanics. Heidelberg: Springer, 1988
[ADE] Asch, J., Duclos, P., Exner, P.: Stability of driven systems with growing gaps. Quantum rings and Wannier ladders. J. Stat. Phys. 92, 1053–1069 (1998)
[AEL] Avron, J.E., Exner, P., Last, Y.: Periodic Schrödinger operators with large gaps and Wannier–Stark ladders. Phys. Rev. Lett. 72, 896–899 (1994)
[BF] Berezin, F.A., Faddeev, L.D.: A remark on Schrödinger equation with a singular potential. Sov. Acad. Sci. Doklady 137, 1011–1014 (1961) (in Russian)
[Ca] Carreau, M.: Four-parameter point-interactions in 1D quantum systems. J. Phys. A26, 427–432 (1993)
[CS] Cheon, T., Shigehara, T.: Realizing discontinuous wave functions with renormalized short-range potentials. Phys. Lett. A243, 111–116 (1998)
[CH] Chernoff, P.R., Hughes, R.: A new class of point interactions in one dimension. J. Funct. Anal. 111, 92–117 (1993)
[Ex] Exner, P.: The absence of the absolutely continuous spectrum for δ′ Wannier–Stark ladders. J. Math. Phys. 36, 4561–4570 (1995)
[Fr] Friedman, C.N.: Perturbations of the Schrödinger equation by potentials with small support. J. Funct. Anal. 10, 346–360 (1972)
[GH] Gesztesy, F., Holden, H.: A new class of solvable models in quantum mechanics describing point interactions on the line. J. Phys. A20, 5157–5177 (1987)
[GK] Gesztesy, F., Kirsch, W.: One-dimensional Schrödinger operators with interactions singular on a discrete set. J. Reine Angew. Math. 362, 28–50 (1985)
[GHM] Grossmann, A., Høegh-Krohn, R., Mebkhout, M.: A class of explicitly soluble, local, many-center Hamiltonians for one-particle quantum mechanics in two and three dimensions. J. Math. Phys. 21, 2376–2385 (1980)
[Ki] Kiselev, A.: Some examples in one-dimensional “geometric” scattering on manifolds. J. Math. Anal. Appl. 212, 263–280 (1997)
[Kl] Klauder, J.: Field structure through model studies: Aspects of nonrenormalizable field theory. Acta Phys. Austriaca Suppl. 11, 341–387 (1973)
[KP] Kronig, R. de L., Penney, W.G.: Quantum mechanics of electrons in crystal lattices. Proc. Roy. Soc. (London) 130A, 499–513 (1931)
[MS] Maioli, M., Sacchetti, A.: Absence of absolutely continuous spectrum for Stark-Bloch operators with strongly singular periodic potentials. J. Phys. A28, 1101–1106 (1995); erratum A31, 1115–1119 (1998)
[NZ1] Neidhardt, H., Zagrebnov, V.A.: Towards the right Hamiltonian for singular perturbations via regularization and extension theory. Rev. Math. Phys. 8, 715–740 (1996)
[NZ2] Neidhardt, H., Zagrebnov, V.A.: On the right Hamiltonian for singular perturbations: General theory. Rev. Math. Phys. 9, 609–633 (1997)
[RS] Reed, M., Simon, B.: Methods of Modern Mathematical Physics, I. Functional Analysis. New York: Academic Press, 1972
[Še1] Šeba, P.: The generalized point interaction in one dimension. Czech. J. Phys. B36, 667–673 (1986)
[Še2] Šeba, P.: Some remarks on the δ′-interaction in one dimension. Rep. Math. Phys. 24, 111–120 (1986)
[SMMC] Shigehara, T., Mizoguchi, H., Mishima, T., Cheon, T.: Realization of a four parameter family of generalized one-dimensional contact interactions by three nearby delta potentials with renormalized strengths. IEICE Trans. Fund. Elec. Comm. Comp. Sci. E82-A, 1708–1713 (1999)
[Si] Simon, B.: Quadratic forms and Klauder’s phenomenon: a remark on very singular perturbations. J. Funct. Anal. 14, 295–298 (1973)
[We] Weidmann, J.: Linear Operators in Hilbert Spaces. New York: Springer, 1980

Communicated by H. Araki
Commun. Math. Phys. 224, 613 – 655 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Clebsch–Gordan and Racah–Wigner Coefficients for a Continuous Series of Representations of $U_q(sl(2,\mathbb{R}))$

B. Ponsot¹, J. Teschner²

¹ Laboratoire de Physique Mathématique, Université Montpellier II, Pl. E. Bataillon, 34095 Montpellier, France. E-mail: [email protected]
² Institut für Theoretische Physik, Freie Universität Berlin, Arnimallee 14, 14195 Berlin, Germany. E-mail: [email protected]

Received: 9 August 2000 / Accepted: 2 July 2001
Abstract: The decomposition of tensor products of representations into irreducibles is studied for a continuous family of integrable operator representations of $U_q(sl(2,\mathbb{R}))$. It is described by an explicit integral transformation involving a distributional kernel that can be seen as an analogue of the Clebsch–Gordan coefficients. Moreover, we also study the relation between two canonical decompositions of triple tensor products into irreducibles. It can be represented by an integral transformation with a kernel that generalizes the Racah–Wigner coefficients. This kernel is explicitly calculated.

1. Introduction

Noncompact quantum groups can be expected to lead to very interesting generalizations of the rich and beautiful subject of harmonic analysis on noncompact groups. Important progress has recently been made concerning an abstract ($C^*$-algebraic) theory of noncompact quantum groups, see [1] for a nice overview and further references. However, an important problem is still the rather limited supply of interesting examples. Results on the harmonic analysis are so far only known for the quantum deformation of the group of motions on the euclidean plane [2, 3], the quantum Lorentz group [5, 6] and $SU_q(1,1)$ [7, 8]. Moreover, there sometimes exist subtle analytical obstacles to construct quantum deformations of classical groups such as $SU(1,1)$ on the $C^*$-algebraic level, cf. [4].

Recently some evidence was presented in [9] that a certain noncompact quantum group with deformation parameter $q = e^{\pi i b^2}$ should describe a crucial internal structure of Liouville theory, a two-dimensional conformal field theory (CFT) that can be seen to be as much a prototype for a CFT with continuous spectrum of Virasoro representations as the harmonic analysis on $SL(2,\mathbb{C})$ is a prototype for noncompact groups.
The relation between Liouville theory and that quantum group which was proposed in [9] generalizes the known equivalences between fusion categories of chiral algebras in conformal field theories and braided tensor categories of quantum group representations, cf. e.g. [12, 13]. These equivalences concern the isomorphisms that represent the operation of commuting tensor factors as well as the associativity of tensor products, and can be boiled down to the comparison of certain numerical data, the most non-trivial being some generalization of the Racah–Wigner coefficients (or fusion coefficients in CFT terminology).

The quantum group in question is $U_q(sl(2,\mathbb{R}))$. A class of “well-behaved” representations of $U_q(sl(2,\mathbb{R}))$ on Hilbert-spaces was defined and classified in [10]. We will study a certain subclass of the representations listed there. Some of the representations found in [10] reproduce known representations of principal or discrete series of $sl(2,\mathbb{R})$ in the classical limit $b \to 0$, others do not have a classical limit at all. The representations we will consider are of the latter type. Let us remark that representations that are essentially equivalent to the class of representations discussed in our paper were recently also discussed in [14]. The main result of the latter paper is a very interesting proposal for a braiding operation on such representations.

In our present paper we will present explicit descriptions for the decomposition of tensor products of these representations into irreducibles, as well as the isomorphism relating two canonical bases for triple tensor products. What appears to be remarkable is the fact that the subseries we have picked out is actually closed under forming tensor products, which one would generally not expect if there exist other unitary representations. The maps describing the decomposition of tensor products lead to the definition and explicit calculation of the generalization of the Racah–Wigner coefficients which represent the central ingredient for the approach of [9] from the mathematics of quantum groups.
From the mathematical point of view one may view our results as providing a technical basis for further studies of a $C^*$-algebraic quantum group that may be generated¹ from $U_q(sl(2,\mathbb{R}))$ and its dual object, which is expected to be a $C^*$-algebraic quantum group generated from $SL_q(2,\mathbb{R})$. In [9] we presented the definition of $SL_q^+(2,\mathbb{R})$ as a quantum space, a $C^*$-algebra $A^+$ that is generated from $SL_q(2,\mathbb{R})$ and is acted on by analogues of left and right regular representation of $U_q(sl(2,\mathbb{R}))$. An $L^2$-space was introduced there, and the result describing its decomposition into irreducible representations of $U_q(sl(2,\mathbb{R}))$ (Plancherel decomposition) was announced. Two aspects of these constructions were unusual: $A^+$ was introduced such that the elements $a$, $b$, $c$, $d$ generating $SL_q(2,\mathbb{R})$ have positive spectrum and the $L^2$-space was introduced by a measure that has no classical $q \to 1$ limit. It turns out that it is precisely the subset of unitary $U_q(sl(2,\mathbb{R}))$ representations studied in the present paper which appears in the Plancherel decomposition of that $L^2$-space.

We view these results as hints towards existence of a rather interesting $C^*$-algebraic quantum group related to $SL_q(2,\mathbb{R})$ that has no classical counterpart, but other beautiful properties such as a self-duality under $b \to b^{-1}$ which are crucial for the application to Liouville theory [9]. A first hint towards this self-duality can be found in the observation made in [9, 14] (see also [15] for closely related earlier observations) that the representations that we consider may alternatively be seen as representations of $U_{\tilde q}(sl(2,\mathbb{R}))$, where $\tilde q = e^{\pi i/b^2}$. This led L. Faddeev to the proposal [14] to unify $U_q(sl(2,\mathbb{R}))$ and $U_{\tilde q}(sl(2,\mathbb{R}))$ into an object called “modular double”, which exhibits the self-duality under $b \to b^{-1}$ in a manifest way.

And indeed, it is found in the present paper that the Clebsch–Gordan intertwining maps, as well as the Racah–Wigner coefficients can be constructed in terms of a remarkable special function $S_b(x)$. This special function is closely related to the Barnes Double Gamma function [28], and was more recently independently introduced under the names of “Quantum Dilogarithm” in [16], and as “Quantum Exponential function” in [17]. The function $S_b(x)$ has the property to be self-dual in the sense that it satisfies $S_b(x) = S_{1/b}(x)$. It follows from this self-duality of the function $S_b$ that the Clebsch–Gordan maps constructed in the present paper can be seen as intertwining maps for the “modular double” of L. Faddeev.

¹ In a similar sense as the bounded operators on $L^2(\mathbb{R})$ are generated by the unbounded operators $p$ and $q$ that satisfy $[p, q] = -i$, cf. [11] for more details.

We would finally like to point out that our techniques for dealing with finite difference operators that involve shifts by imaginary amounts, in particular the method for determining the spectrum of such an operator, seem to be new and should have generalizations to a variety of other problems where such operators appear. Moreover, the investigation of the class of special functions that we use is fairly recent, so we will need to deduce several previously unknown properties.

The paper is organized as follows: In the following section we will introduce some technical preliminaries. Since we have to deal with finite difference operators that shift the arguments of functions by imaginary amounts, a lot of what follows will be based on the theory of functions analytic in certain strips around the real axis, and the description of their Fourier-transforms via results of Paley–Wiener type. The third section introduces the class of representations that will be studied in the present paper and discusses some of their properties. This is followed by a section describing the decomposition of tensor products of representations into irreducibles. We then define and calculate b-Racah–Wigner coefficients as the kernel that appears in the integral transformation that establishes the isomorphism between two canonical decompositions of triple tensor products. Appendix A is in some sense the technical heart of the paper: It contains the spectral analysis of a finite difference operator of second order that is related to the Casimir on tensor products of two representations.
Appendices B and C contain some information on the special functions that are used in the body of the paper.
2. Preliminaries

We collect some basic conventions, definitions and standard results that will be used throughout the paper.

2.1. Finite difference operators. The quantum group will be realized in terms of finite difference operators that shift the arguments by an imaginary amount. On functions $f(x)$, $x \in \mathbb{R}$ that have an analytic continuation to a strip containing $\{x \in \mathbb{C};\ \mathrm{Im}(x) \in [-a_-, a_+]\}$, $a_\pm \ge 0$ one may define the finite difference operators $T_x^{ia}$, $a \in [-a_-, a_+]$ by
$$T_x^{ia} f(x) = f(x + ia). \tag{1}$$
As convenient notation we will use
$$[x]_b \equiv \frac{\sin(\pi b x)}{\sin(\pi b^2)}, \qquad d_x \equiv \frac{1}{2\pi}\partial_x, \qquad [d_x + a]_b \equiv \frac{e^{\pi i b a}\, T_x^{ib/2} - e^{-\pi i b a}\, T_x^{-ib/2}}{e^{\pi i b^2} - e^{-\pi i b^2}}. \tag{2}$$
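The notation (2) is consistent in the following sense: on a plane wave $e^{2\pi i k x}$, on which $d_x$ acts as multiplication by $ik$, the operator $[d_x + a]_b$ acts as multiplication by the number $[ik + a]_b$. The sketch below checks this numerically; the value of $b$ and the function names are illustrative assumptions, not taken from the paper.

```python
import cmath
import math

B = 0.61  # sample value of the parameter b in (0, 1); an assumption


def q_number(x, b=B):
    """[x]_b = sin(pi b x) / sin(pi b^2), for real or complex x."""
    return cmath.sin(math.pi * b * x) / math.sin(math.pi * b * b)


def plane_wave(k):
    """Return the entire function z -> exp(2 pi i k z)."""
    return lambda z: cmath.exp(2j * math.pi * k * z)


def q_difference(f, a, x, b=B):
    """Apply [d_x + a]_b from (2) to an entire function f and evaluate at x."""
    shift = 0.5j * b  # T_x^{ib/2} shifts the argument by ib/2
    numerator = (cmath.exp(1j * math.pi * b * a) * f(x + shift)
                 - cmath.exp(-1j * math.pi * b * a) * f(x - shift))
    denominator = (cmath.exp(1j * math.pi * b * b)
                   - cmath.exp(-1j * math.pi * b * b))
    return numerator / denominator


if __name__ == "__main__":
    a, k, x = 0.3 + 0.2j, 0.7, 0.25
    wave = plane_wave(k)
    print(q_difference(wave, a, x))
    print(q_number(a + 1j * k) * wave(x))
    # [x]_b vanishes at x = n / b for integer n.
    print(abs(q_number(3 / B)))
```

The two printed complex numbers agree up to rounding, and the last line illustrates the zeros of $[x]_b$ at $x = n b^{-1}$, $n \in \mathbb{Z}$.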
2.2. Fourier-transformation. Our notation and conventions concerning the Fourier-transformations are as follows: Let $S(\mathbb{R})$ denote the usual Schwartz-space of functions on the real line. The Fourier-transformation of a function $f \in S(\mathbb{R})$ will be defined as
$$\tilde f(\omega) = \int_{-\infty}^{\infty} dx\; e^{-2\pi i \omega x} f(x). \tag{3}$$
The corresponding inversion formula is then
$$f(x) = \int_{-\infty}^{\infty} d\omega\; e^{2\pi i \omega x} \tilde f(\omega). \tag{4}$$
The Fourier-transformation maps the finite difference operator $T_x^{ia}$ to the operator of multiplication with $e^{-2\pi a \omega}$. It will therefore be a useful tool for dealing with these operators. Of fundamental importance will be the connection between analyticity of functions in a strip to exponential decay properties of its Fourier-transform and vice versa that is expressed by the classical Paley–Wiener theorem:

Theorem 1 (Paley–Wiener). Let $f$ be in $L^2(\mathbb{R})$. Then $(e^{2\pi x a_+} + e^{-2\pi x a_-}) f \in L^2(\mathbb{R})$, $a_\pm > 0$ if and only if $\tilde f$ has an analytic continuation to the strip $\{\omega \in \mathbb{C};\ \mathrm{Im}(\omega) \in (-a_-, a_+)\}$ such that for any $\omega_2 \in (-a_-, a_+)$, $\tilde f(\,\cdot + i\omega_2) \in L^2(\mathbb{R})$ and
$$\sup_{\omega_2 \le b} \int_{-\infty}^{\infty} d\omega_1\; |\tilde f(\omega_1 + i\omega_2)|^2 < \infty \quad \text{for any } b \in (-a_-, a_+). \tag{5}$$

Proof. Cf. e.g. [19].
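The statement preceding Theorem 1, that Fourier transformation turns the shift $T_x^{ia}$ into multiplication by $e^{-2\pi a \omega}$, can be checked numerically on a Gaussian. The sketch below uses the convention (3) and a plain equidistant quadrature rule; the test function $e^{-\pi x^2}$, the truncation of the integral and the function names are illustrative choices of ours.

```python
import cmath
import math


def gaussian(z):
    """Entire test function exp(-pi z^2); its transform under (3) is exp(-pi w^2)."""
    return cmath.exp(-math.pi * z * z)


def fourier_transform(g, omega, half_width=8.0, n=4001):
    """Approximate (3) by an equidistant sum on [-half_width, half_width].

    The integrand is negligible at the end points for the Gaussian used here,
    so the rule is accurate far beyond the tolerance needed below.
    """
    h = 2.0 * half_width / (n - 1)
    total = 0.0 + 0.0j
    for j in range(n):
        x = -half_width + j * h
        total += cmath.exp(-2j * math.pi * omega * x) * g(x)
    return total * h


def shifted(g, a):
    """Return T_x^{ia} g, i.e. the function x -> g(x + i a)."""
    return lambda x: g(x + 1j * a)


if __name__ == "__main__":
    a, omega = 0.4, 0.5
    lhs = fourier_transform(shifted(gaussian, a), omega)
    rhs = math.exp(-2.0 * math.pi * a * omega) * fourier_transform(gaussian, omega)
    print(lhs, rhs)
```

Both numbers agree to within quadrature error; the exponentially growing or decaying factor $e^{-2\pi a \omega}$ is what links analyticity in a strip to decay of the transform in Theorem 1.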
The following simple variant of this result will often be useful:

Lemma 1. For $f \in S(\mathbb{R})$, the following two conditions are equivalent:

1. $f$ is the restriction to $\mathbb{R}$ of a function $F$ that is meromorphic in the strip $\{z \in \mathbb{C};\ \mathrm{Im}(z) \in (-a_-, a_+)\}$, $a_+, a_- > 0$ with finitely many poles in the upper (lower) half plane at $P_\pm \equiv \{z_j;\ j \in I_\pm\}$, $|\mathrm{Im}(z_j)| > 0$, and all functions $F_y(x) \equiv F(x + iy)$, $y \in (-a_-, a_+)$ are of rapid decrease, and
2. one has the following asymptotic behavior of the Fourier-transform $\tilde f(\omega)$ for $\omega \to \pm\infty$:
$$\tilde f(\omega) = -2\pi i \sum_{j \in I_-} e^{-2\pi i z_j \omega} \operatorname*{Res}_{z = z_j} F(z) + \tilde f_{a_+}(\omega),$$
$$\tilde f(\omega) = +2\pi i \sum_{j \in I_+} e^{-2\pi i z_j \omega} \operatorname*{Res}_{z = z_j} F(z) + \tilde f_{a_-}(\omega),$$
where $\tilde f_{a_\pm}(\omega)$ decay as $\omega \to \pm\infty$ faster than $e^{-2\pi a |\omega|}$ for any $a \in (-a_-, a_+)$.
2.3. Distributions. Let $S'(\mathbb{R})$ be the space of tempered distributions on $S(\mathbb{R})$. The dual pairing between a distribution $\Phi \in S'(\mathbb{R})$ and a function $f \in S(\mathbb{R})$ will be denoted by $\langle \Phi, f \rangle$. The Fourier transformation on $S'(\mathbb{R})$ is defined by $\langle \tilde\Phi, \tilde f \rangle \equiv \langle \Phi, f \rangle$ for any $f \in S(\mathbb{R})$. It should be noted that if a distribution $\Phi \in S'(\mathbb{R})$ actually happens to be represented by a function $\Phi(x)$ via
$$\langle \Phi, f \rangle = \int_{-\infty}^{\infty} dx\; \Phi(x) f(x),$$
then our definition of the Fourier-transform of $\Phi$ implies that instead of (4) one has the following inversion formula for $\Phi(x)$:
$$\Phi(x) = \int_{-\infty}^{\infty} d\omega\; e^{-2\pi i \omega x}\, \tilde\Phi(\omega). \tag{6}$$
The distributions that appear below will all be defined in terms of meromorphic functions by means of the so-called $i\epsilon$-prescription: Assume given a family of functions $\Phi_\epsilon$, $\epsilon > 0$ that are meromorphic in some strip containing $\mathbb{R}$, rapidly decreasing at infinity and have finitely many poles with $\epsilon$-independent residues at a distance $\epsilon$ from the real axis. The limit $\Phi \equiv \lim_{\epsilon \to 0} \Phi_\epsilon$ then defines a distribution $\Phi \in S'(\mathbb{R})$. We will often use the symbolic notation $\Phi(x)$ for the resulting distribution, keeping in mind that $\Phi(x)$ will not be defined for all $x \in \mathbb{R}$.

There is a simple generalization of Lemma 1 to such distributions in $S'(\mathbb{R})$: Poles on the real axis correspond to asymptotic behavior of the form $e^{2\pi i \omega x}$ of the Fourier-transform:

Lemma 2. For $\Phi \in S'(\mathbb{R})$, the following two conditions are equivalent:

1. $\Phi = \lim_{\epsilon \to 0} \Phi_\epsilon$, where $\Phi_\epsilon(x)$ is for $\epsilon > 0$ represented as the restriction to $\mathbb{R}$ of a function that is meromorphic in the strip $\{z \in \mathbb{C};\ \mathrm{Im}(z) \in (-a_-, a_+)\}$, $a_+, a_- > 0$ with finitely many poles in the upper (lower) half plane at $P^{\epsilon}_\pm \equiv \{z_j \pm i\epsilon;\ j \in I_\pm\}$, $\pm\mathrm{Im}(z_j) \ge 0$, and all functions $\Phi_{\epsilon,y}(x) \equiv \Phi_\epsilon(x + iy)$, $x, y \in \mathbb{R}$, $y \in (-a_-, a_+)$ are of rapid decrease, and
2. $\tilde\Phi$ is represented by a function $\tilde\Phi(\omega) \in C^\infty(\mathbb{R})$ that has the following asymptotic behavior:
$$\tilde\Phi(\omega) = +2\pi i \sum_{j \in I_+} e^{2\pi i z_j \omega} \operatorname*{Res}_{z = z_j} \Phi(z) + \tilde\Phi_{a_+}(\omega),$$
$$\tilde\Phi(\omega) = -2\pi i \sum_{j \in I_-} e^{2\pi i z_j \omega} \operatorname*{Res}_{z = z_j} \Phi(z) + \tilde\Phi_{a_-}(\omega),$$
where $\tilde\Phi_{a_\pm}(\omega)$ decay faster than $e^{-2\pi a |\omega|}$ for any $a \in (-a_-, a_+)$.

Remark 1. The sign flips between Lemmas 1 and 2 are due to the different inversion formulae for functions and distributions.
2.4. A useful lemma from complex analysis. The following lemma is useful for determining the analytic properties of convolutions of meromorphic functions:

Lemma 3. Let $f(z_0; z_1, z_2)$ be meromorphic in its variables in some open strip $S$ around the real axis, with singular behavior near $z_0 = z_1 = z_2$ of the form $R_{12}(z_1)(z_0 - z_1)^{-1}(z_0 - z_2)^{-1}$. The function $I(z_1, z_2)$, defined by the integral
$$I(z_1, z_2) \equiv \int_{-\infty}^{\infty} dz_0\; f(z_0; z_1, z_2), \tag{7}$$
will then be a function that has a meromorphic continuation w.r.t. $z_i$, $i = 1, 2$ to the whole strip $S$. If $z_1$ and $z_2$ were initially separated by the real axis one will find a pole with residue $R_{12}(z_1)$ at $z_1 = z_2$. If not, $I(z_1, z_2)$ will be nonsingular at $z_1 = z_2$ as well.

Proof. To define the meromorphic continuation of $I(z_1, z_2)$ in cases where the poles $z_i$, $i = 1, 2$ cross the contour of integration of the integral (7) one just needs to deform the contour accordingly. This will obviously always be possible as long as $z_i$, $i = 1, 2$ were initially not separated by the real axis. We will therefore turn to the case that they were initially separated, and consider w.l.o.g. the case that $z_1$ was initially in the upper, $z_2$ in the lower half plane. In this case one may deform the contour into a contour that passes above $z_1$ plus a small circle around $z_1$. The residue contribution from the integral over that small circle is
$$2\pi i\, \frac{R_{12}(z_1)}{z_1 - z_2} + (\text{contributions regular as } z_1 - z_2 \to 0). \tag{8}$$
The Lemma is proven.

3. A Class of Representations of $U_q(sl(2,\mathbb{R}))$

Definition. $U_q(sl(2,\mathbb{R}))$ is a Hopf-algebra with

generators: $E$, $F$, $K$, $K^{-1}$;

relations: $KE = qEK$, $KF = q^{-1}FK$, $[E, F] = -\dfrac{K^2 - K^{-2}}{q - q^{-1}}$;

star-structure: $K^* = K$, $E^* = E$, $F^* = F$;

co-product: $\Delta(K) = K \otimes K$, $\Delta(E) = E \otimes K + K^{-1} \otimes E$, $\Delta(F) = F \otimes K + K^{-1} \otimes F$. (9)

The center of $U_q(sl(2,\mathbb{R}))$ is generated by the q-Casimir
$$C = FE - \frac{qK^2 + q^{-1}K^{-2} - 2}{(q - q^{-1})^2}. \tag{10}$$
We will consider the case that $q = e^{\pi i b^2}$, $b \in (0, 1) \cap (\mathbb{R} \setminus \mathbb{Q})$. Unitary representations of $U_q(sl(2,\mathbb{R}))$ by operators on a Hilbert-space have been studied in [10]. Since there are no unitary representations in terms of bounded operators some care is needed in order to single out an interesting class of “well-behaved”
representations. A natural notion of “well-behaved” was introduced in [10], where the corresponding unitary representations of $U_q(sl(2,\mathbb{R}))$ were classified. In the present paper we will study a one-parameter subclass $P_\alpha$, $\alpha \in Q/2 + i\mathbb{R}$, $Q = b + b^{-1}$ of the representations listed in [10] which are constructed as follows: The representation will be realized on the space $P_\alpha$ of entire analytic functions $f(x)$ that have a Fourier-transform $\tilde f(\omega)$ which is meromorphic in $\mathbb{C}$ with possible poles at
$$\omega = i(\alpha - Q - nb - mb^{-1}), \qquad \omega = i(Q - \alpha + nb + mb^{-1}), \qquad n, m \in \mathbb{Z}^{\ge 0}. \tag{11}$$

Remark 2. It can be shown that $P_\alpha$ is a Frechet-space.

One may then introduce the following finite difference operators:
$$\pi_\alpha(E) \equiv e^{+2\pi b x}\,[d_x + Q - \alpha]_b, \qquad \pi_\alpha(F) \equiv e^{-2\pi b x}\,[d_x + \alpha - Q]_b, \qquad \pi_\alpha(K) \equiv T_x^{ib/2}. \tag{12}$$
As shorthand notation we will also use $u_\alpha \equiv \pi_\alpha(u)$.

Lemma 4. (i) The operators $\pi_\alpha(u)$, $u = E, F, K$ map $P_\alpha$ into itself. (ii) $\pi_\alpha(u)$, $u = E, F, K$ generate a representation of $U_q(sl(2,\mathbb{R}))$ on $P_\alpha$.

Proof. To verify (i), note that Fourier-transformation maps $E_\alpha$, $F_\alpha$, $K_\alpha$ into the following operators:
$$\tilde E_\alpha = [-i\omega + \alpha]_b\, T_\omega^{ib}, \qquad \tilde F_\alpha = [-i\omega - \alpha]_b\, T_\omega^{-ib}, \qquad \tilde K_\alpha = e^{-\pi b \omega}. \tag{13}$$
The claim follows from the fact that $[x]_b = 0$ for $x = nb^{-1}$, $n \in \mathbb{Z}$. (ii) is checked by straightforward calculation.

Proposition 1. The operators (12) generate an integrable operator representation of $U_q(sl(2,\mathbb{R}))$ in the sense of [10], i.e.

1. $E_\alpha$, $F_\alpha$, $K_\alpha$ have self-adjoint extensions in $L^2(\mathbb{R})$,
2. the corresponding unitary operators $E_\alpha^{it}$, $F_\alpha^{it}$, $K_\alpha^{it}$ satisfy
$$K_\alpha^{is} E_\alpha^{it} = q^{-ts} E_\alpha^{it} K_\alpha^{is}, \qquad K_\alpha^{is} F_\alpha^{it} = q^{ts} F_\alpha^{it} K_\alpha^{is},$$
and
3. the q-Casimir strongly commutes with $E_\alpha$, $F_\alpha$ and $K_\alpha$.

Proof. It suffices to show that the representation $P_\alpha$ is unitarily equivalent to one of the representations listed in [10]. Consider the operator $J_\alpha$ defined as $(J_\alpha \tilde f)(\omega) = S_b(\alpha - i\omega)\tilde f(\omega)$ in terms of the special function $S_b(x)$ (cf. Appendix B). $J_\alpha$ is unitary since $|S_b^{-1}(\alpha - i\omega)|^2 = 1$ which follows from Eq. (134) in Appendix B. Moreover, it follows from the analytic and asymptotic properties of $S_b(x)$ given in the Appendix that $J_\alpha$ maps $P_\alpha$ to the space $R_\alpha$ of entire analytic functions which have a Fourier-transform that is meromorphic in $\mathbb{C}$ with possible poles at
$$\omega = i(\alpha - Q - nb - mb^{-1}), \qquad \omega = i(-\alpha - nb - mb^{-1}), \qquad n, m \in \mathbb{Z}^{\ge 0}. \tag{14}$$
One finally finds from the functional relations of the Sb -functions, Eq. (133) that Jα−1 E˜ α Jα = Tωib Jα−1 F˜α Jα = [α + iω]b Tω−ib [α − iω]b
Jα−1 Kα Jα = e−πbω .
(15)
Our representation is thereby easily recognized as the representation denoted by $(I)_{1,-1,c}$ in Corollary 5 of [10], where $c = [\alpha - \frac{Q}{2}]_b^2 + 2(q - q^{-1})^{-2}$. Note that our notation is different from that in [10] and $c \leq 2(q - q^{-1})^{-2}$.

Remark 3. The representations considered here form a subset of the representations of $U_q(sl(2,\mathbb{R}))$ that appear in the classification of [10]. This subset has the following remarkable property: If one introduces generators $\tilde E$, $\tilde F$, $\tilde K$ by replacing $b \to b^{-1}$ in the expressions for $E$, $F$, $K$ given above, one obtains a representation of $U_{\tilde q}(sl(2,\mathbb{R}))$, $\tilde q = \exp(\pi i b^{-2})$, on the same space $P_\alpha$. The generators $\tilde E$, $\tilde F$, $\tilde K^2$ commute with $E$, $F$, $K^2$ on the space $P_\alpha$. This does not mean, however, that these operators commute as self-adjoint operators on $L^2(\mathbb{R})$. This self-duality property of our representations $P_\alpha$ is related to the fact that the representations $(P_\alpha, \pi_\alpha)$ do not have a classical ($b \to 0$) limit.

Intertwining operators. The representations with labels $\alpha$ and $Q-\alpha$ are equivalent. The unitary operator establishing this equivalence can be most easily found by considering the Fourier-transform of the representation (12), as already done in the proof of Proposition 1, Eqs. (13): Define the operator $\tilde I_\alpha : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ as
\[
(\tilde I_\alpha \tilde f)(\omega) = \tilde B_\alpha(\omega)\tilde f(\omega), \qquad
\tilde B_\alpha(\omega) \equiv \frac{S_b(\alpha - i\omega)}{S_b(Q - \alpha - i\omega)}. \tag{16}
\]
The operator $\tilde I_\alpha$ is unitary since $|\tilde B_\alpha(\omega)| = 1$. It maps $P_\alpha$ to $P_{Q-\alpha}$, as follows from the analytic and asymptotic properties of the $S_b$-function summarized in Appendix B. The fact that
\[
\pi_{Q-\alpha}(u)\,\tilde I_\alpha = \tilde I_\alpha\, \pi_\alpha(u), \qquad u \in U_q(sl(2,\mathbb{R})) \tag{17}
\]
is a simple consequence of the functional relations (133), Appendix B, of the $S_b$-functions. By inverse Fourier-transformation one finds the representation of the intertwining operator on functions $f(x)$. It takes the form
\[
(I_\alpha f)(x) = \int_{\mathbb{R}} dx'\, B_\alpha(x - x') f(x'), \tag{18}
\]
where the inverse Fourier-transform defining the kernel $B_\alpha(x - x')$ may be found by means of Eq. (136), Appendix B, to be given by
\[
B_\alpha(x - x') = S_b(2\alpha)\,\frac{S_b\big(\frac{Q}{2} + i(x - x') - \alpha\big)}{S_b\big(\frac{Q}{2} + i(x - x') + \alpha\big)}. \tag{19}
\]
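4. The Clebsch–Gordan Decomposition of Tensor Products

The co-product allows us to define the tensor product of representations: For any $u \in U_q(sl(2,\mathbb{R}))$ let $\pi_{21}(u) \equiv (\pi_{\alpha_2} \otimes \pi_{\alpha_1})\Delta(u)$. The operators $\pi_{21}(u)$ generate a representation of $U_q(sl(2,\mathbb{R}))$ on $P_{\alpha_2} \otimes P_{\alpha_1}$. Our aim is to determine the decomposition of this representation into irreducible representations of $U_q(sl(2,\mathbb{R}))$.

Lemma 5. $P_{\alpha_2} \otimes P_{\alpha_1}$ is dense in $L^2(\mathbb{R}) \otimes L^2(\mathbb{R})$.

Proof. Any two-variable Hermite-function is contained in $P_{\alpha_2} \otimes P_{\alpha_1}$.

Definition 1. Define a distributional kernel $\begin{bmatrix} \alpha_3 & \alpha_2 & \alpha_1 \\ x_3 & x_2 & x_1 \end{bmatrix}$ (the “Clebsch–Gordan coefficients”) by an expression of the form
\[
\begin{bmatrix} \alpha_3 & \alpha_2 & \alpha_1 \\ x_3 & x_2 & x_1 \end{bmatrix}
\equiv \lim_{\epsilon \downarrow 0}
\begin{bmatrix} \alpha_3 & \alpha_2 & \alpha_1 \\ x_3 & x_2 & x_1 \end{bmatrix}_{\epsilon}, \tag{20}
\]
where the meromorphic function $\begin{bmatrix} \alpha_3 & \alpha_2 & \alpha_1 \\ x_3 & x_2 & x_1 \end{bmatrix}_{\epsilon}$ is defined as
\[
\begin{bmatrix} Q-\alpha_3 & \alpha_2 & \alpha_1 \\ x_3 & x_2 & x_1 \end{bmatrix}_{\epsilon}
= e^{-\frac{\pi i}{2}(\Delta_{\alpha_3} - \Delta_{\alpha_2} - \Delta_{\alpha_1})}\,
D_b(\beta_{32}; y_{32} + \epsilon)\, D_b(\beta_{31}; y_{31} + \epsilon)\, D_b(\beta_{21}; y_{21} + \epsilon), \tag{21}
\]
$\Delta_\alpha = \alpha(Q - \alpha)$, the distribution $D_b(\alpha; y)$ is defined in terms of the Double Sine function $S_b(y)$ (cf. Appendix) as
\[
D_b(\alpha; y) = \frac{S_b(y)}{S_b(y + \alpha)}, \tag{22}
\]
and the coefficients $y_{ji}$, $\beta_{ji}$, $j > i \in \{1, 2, 3\}$ are given by
\[
\begin{aligned}
y_{32} &= i(x_3 - x_2) - \tfrac{1}{2}(\alpha_3 + \alpha_2 - Q), & \beta_{32} &= \alpha_2 + \alpha_3 - \alpha_1, \\
y_{31} &= i(x_1 - x_3) - \tfrac{1}{2}(\alpha_3 + \alpha_1 - Q), & \beta_{31} &= \alpha_3 + \alpha_1 - \alpha_2, \\
y_{21} &= i(x_1 - x_2) - \tfrac{1}{2}(\alpha_2 + \alpha_1 - 2\alpha_3), & \beta_{21} &= \alpha_2 + \alpha_1 - \alpha_3.
\end{aligned}
\tag{23}
\]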
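The bookkeeping in Definition 1 can be illustrated by a minimal sketch that evaluates the coefficients of Eq. (23) and the weights $\Delta_\alpha = \alpha(Q-\alpha)$ for sample values. The function names `cg_parameters` and `delta` are hypothetical, and the relation $Q = b + b^{-1}$ used to pick a numerical value of $Q$ is an assumption not stated in this section. The double sine function itself is not evaluated here.

```python
def cg_parameters(alphas, xs, Q):
    """Return (y, beta) as in Eq. (23); alphas = (a1, a2, a3), xs = (x1, x2, x3)."""
    a1, a2, a3 = alphas
    x1, x2, x3 = xs
    y = {
        "32": 1j * (x3 - x2) - 0.5 * (a3 + a2 - Q),
        "31": 1j * (x1 - x3) - 0.5 * (a3 + a1 - Q),
        "21": 1j * (x1 - x2) - 0.5 * (a2 + a1 - 2 * a3),
    }
    beta = {
        "32": a2 + a3 - a1,
        "31": a3 + a1 - a2,
        "21": a2 + a1 - a3,
    }
    return y, beta


def delta(alpha, Q):
    """Weight Delta_alpha = alpha (Q - alpha) entering the prefactor of Eq. (21)."""
    return alpha * (Q - alpha)


b = 0.7
Q = b + 1 / b  # assumption: Q = b + 1/b (not stated in this section)
# Sample labels on the line Q/2 + i R^+ used in Theorem 2.
alphas = tuple(Q / 2 + 1j * p for p in (0.3, 0.5, 0.9))
y, beta = cg_parameters(alphas, (0.1, -0.4, 0.25), Q)
beta_sum = sum(beta.values())
```

Two simple consequences of the definitions are visible in this sketch: the three $\beta_{ji}$ add up to $\alpha_1 + \alpha_2 + \alpha_3$, and $\Delta_\alpha$ is invariant under $\alpha \to Q - \alpha$.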
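The aim of this section will be to prove

Theorem 2. The $U_q(sl(2,\mathbb{R}))$-representation $\pi_{21}$ defined on $\pi_{\alpha_2} \otimes \pi_{\alpha_1}$ decomposes as follows into irreducible representations $P_\alpha$:
\[
\pi_{\alpha_2} \otimes \pi_{\alpha_1} \simeq \int_{\mathbb{S}}^{\oplus} d\alpha\, \pi_\alpha, \qquad \mathbb{S} \equiv \frac{Q}{2} + i\mathbb{R}^+. \tag{24}
\]
The isomorphism can be described explicitly in terms of a unitary map $C_{21}$ of the form
\[
\begin{aligned}
C_{21} :\ & L^2(\mathbb{R} \times \mathbb{R}) \to L^2(\mathbb{S} \times \mathbb{R}, d\mu(\alpha_3)dx_3), \qquad d\mu(\alpha) \equiv |S_b(2\alpha)|^2, \\
& f(x_2, x_1) \to F_f(\alpha_3, x_3) \equiv \int_{\mathbb{R}} dx_2\, dx_1 \begin{bmatrix} \alpha_3 & \alpha_2 & \alpha_1 \\ x_3 & x_2 & x_1 \end{bmatrix} f(x_2, x_1)
\end{aligned}
\tag{25}
\]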
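such that the corresponding projections $\Pi_{21}(\alpha_3)$, $(\Pi_{21}(\alpha_3)f)(x_3) = F_f(\alpha_3, x_3)$, map $P_{\alpha_2} \otimes P_{\alpha_1}$ into $P_{\alpha_3}$ and intertwine the respective $U_q(sl(2,\mathbb{R}))$ actions according to
\[
\Pi_{21}(\alpha_3)\,\pi_{21}(u) = \pi_{\alpha_3}(u)\,\Pi_{21}(\alpha_3), \qquad u \in U_q(sl(2,\mathbb{R})). \tag{26}
\]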
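Remark 4. It follows from Theorem 2 that the representation $\pi_{21}$ is in fact integrable, which was not clear a priori.

Remark 5. It is remarkable and nontrivial that the subset of “self-dual” integrable representations of $U_q(sl(2,\mathbb{R}))$ is actually closed under tensor products.

Remark 6. The appearance of the measure $d\mu(\alpha)$ is natural since $d\mu(\alpha)$ is the Plancherel measure for the dual space of functions $L^2(SL_q^+(2,\mathbb{R}))$, cf. [18].

Corollary 1. The Clebsch–Gordan coefficients $\begin{bmatrix} \alpha_3 & \alpha_2 & \alpha_1 \\ x_3 & x_2 & x_1 \end{bmatrix}$ satisfy the following orthogonality and completeness relations:
\[
\begin{aligned}
\lim_{\epsilon \downarrow 0} \int_{\mathbb{R}} dx_1\, dx_2
\begin{bmatrix} \alpha_3 & \alpha_2 & \alpha_1 \\ x_3 & x_2 & x_1 \end{bmatrix}_{\epsilon}^{*}
\begin{bmatrix} \beta_3 & \alpha_2 & \alpha_1 \\ y_3 & x_2 & x_1 \end{bmatrix}_{\epsilon}
&= |S_b(2\alpha_3)|^{-2}\, \delta(\alpha_3 - \beta_3)\,\delta(x_3 - y_3), \\
\lim_{\epsilon \downarrow 0} \int_{\mathbb{S}} d\alpha_3\, |S_b(2\alpha_3)|^2 \int_{\mathbb{R}} dx_3
\begin{bmatrix} \alpha_3 & \alpha_2 & \alpha_1 \\ x_3 & x_2 & x_1 \end{bmatrix}_{\epsilon}^{*}
\begin{bmatrix} \alpha_3 & \alpha_2 & \alpha_1 \\ x_3 & y_2 & y_1 \end{bmatrix}_{\epsilon}
&= \delta(x_2 - y_2)\,\delta(x_1 - y_1).
\end{aligned}
\tag{27}
\]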
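The main step in the proof of Theorem 2 will be the construction of a common spectral decomposition for the operators $Q_{21} \equiv (\pi_{\alpha_2} \otimes \pi_{\alpha_1})\Delta(Q)$ and $K_{21}$. The decomposition of $L^2(\mathbb{R} \times \mathbb{R})$ into eigenspaces of $K_{21}$ is simply obtained by Fourier-transformation:
\[
\begin{aligned}
F :\ & L^2(\mathbb{R} \times \mathbb{R}) \to L^2(\mathbb{R} \times \mathbb{R}), \\
& f(x_2, x_1) \to F(\kappa_3, x_-) \equiv \int_{\mathbb{R}} dx_+\, e^{-\pi i \kappa_3 x_+}\, f\Big(\frac{x_+ + x_-}{2}, \frac{x_+ - x_-}{2}\Big).
\end{aligned}
\tag{28}
\]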
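The q-Casimir $Q_{21}$ is mapped under this Fourier-transformation $F$ into a second order finite difference operator $C_{21}(\kappa_3)$ that contains shifts w.r.t. the variable $x_-$ only and therefore leaves the eigenspaces of $K_{21}$ invariant:
\[
\begin{aligned}
C_{21}(\kappa_3) - \big[\alpha_3 - \tfrac{Q}{2}\big]_b^2
={}& \big[-ix - \tfrac{1}{2}(\alpha_1 + \alpha_2 - Q) + (\alpha_3 - \tfrac{Q}{2})\big]_b\,
\big[-ix - \tfrac{1}{2}(\alpha_1 + \alpha_2 - Q) - (\alpha_3 - \tfrac{Q}{2})\big]_b \\
&- \big[-ix + \tfrac{1}{2}(\alpha_1 + \alpha_2) - Q\big]_b
\Big( e^{i\pi b(-ix - \frac{1}{2}(\alpha_1 + \alpha_2))}\{\alpha_1 - \alpha_2 + i\kappa_3\}_b
- e^{-i\pi b(-ix - \frac{1}{2}(\alpha_1 + \alpha_2))}\{\alpha_1 - \alpha_2 - i\kappa_3\}_b \Big) T_{x_-}^{-ib} \\
&+ \big[-ix + \tfrac{1}{2}(\alpha_1 + \alpha_2) - Q\big]_b\,
\big[-ix + \tfrac{1}{2}(\alpha_1 + \alpha_2) - 2Q\big]_b\, T_{x_-}^{-2ib},
\end{aligned}
\tag{29}
\]
where the following notation has been used:
\[
[x]_b \equiv \frac{\sin(\pi b x)}{\sin(\pi b^2)}, \qquad \{x\}_b \equiv \frac{\cos(\pi b x)}{i \sin(\pi b^2)}. \tag{30}
\]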
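The notation of Eq. (30) can be cross-checked numerically. The sketch below assumes $q = e^{i\pi b^2}$ (inferred from $\tilde q = \exp(\pi i b^{-2})$ under $b \to b^{-1}$; the definition of $q$ lies outside this section) and writes $u = e^{i\pi b x}$, so that $[x]_b = (u - u^{-1})/(q - q^{-1})$ and $\{x\}_b = (u + u^{-1})/(q - q^{-1})$. The function names are hypothetical.

```python
import cmath
import math


def q_bracket(x, b):
    """[x]_b = sin(pi b x) / sin(pi b^2), as in Eq. (30)."""
    return cmath.sin(math.pi * b * x) / math.sin(math.pi * b * b)


def q_brace(x, b):
    """{x}_b = cos(pi b x) / (i sin(pi b^2)), as in Eq. (30)."""
    return cmath.cos(math.pi * b * x) / (1j * math.sin(math.pi * b * b))


def exponential_forms(x, b):
    """Same quantities written with q = exp(i pi b^2) (assumed) and u = exp(i pi b x)."""
    q = cmath.exp(1j * math.pi * b * b)
    u = cmath.exp(1j * math.pi * b * x)
    return (u - 1 / u) / (q - 1 / q), (u + 1 / u) / (q - 1 / q)


b = 0.7
x = 0.3 + 0.8j
bracket_exp, brace_exp = exponential_forms(x, b)

# For alpha = Q/2 + i p the Casimir eigenvalue [alpha - Q/2]_b^2 = [i p]_b^2 is real and non-positive.
casimir_value = q_bracket(1j * 0.9, b) ** 2
```

The last line illustrates why the eigenvalue $[\alpha_3 - \frac{Q}{2}]_b^2$ appearing on the left-hand side of Eq. (29) is real and non-positive for $\alpha_3 \in \frac{Q}{2} + i\mathbb{R}$.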
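The spectral analysis of the operator $C_{21}$ is performed in Appendix A. The result may be summarized as follows: Eigenfunctions $\Phi_{\alpha_3}(\alpha_2, \alpha_1|\kappa_3|x)$ of $C_{21}$ are given by an expression of the form
\[
\Phi_{Q-\alpha_3}(\alpha_2, \alpha_1|\kappa_3|x) = M^{\alpha_3;\kappa_3}_{\alpha_2,\alpha_1}\, e^{\pi x(2\alpha_3 - 2\alpha_2 + i\kappa_3)}\, \Theta_b(T, y_-)\, \Psi_b(U, V, W; y_+). \tag{31}
\]
The special functions $\Theta_b(T; y)$ and $\Psi_b(U, V, W; y)$ are defined in Appendix B, $y_\pm$ are introduced as $y_\pm = -ix - \frac{1}{2}(\alpha_2 + \alpha_1 - Q) \mp (\alpha_3 - \frac{Q}{2})$ and the coefficients $T$, $U$, $V$, $W$ are given as
\[
\begin{aligned}
T &= \alpha_2 + \alpha_1 - \alpha_3, & V &= -i\kappa_3 + \alpha_3, \\
U &= \alpha_3 + \alpha_1 - \alpha_2, & W &= -i\kappa_3 + \alpha_1 - \alpha_2 + Q.
\end{aligned}
\tag{32}
\]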
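Theorem 3. A complete set of generalized eigenfunctions for the operator $C_{21}(\kappa_3)$ is given by $\{(\Phi_{\alpha_3})^*;\ \alpha_3 \in \mathbb{S}\}$.

By combining Theorem 3 with the usual Plancherel formula for the Fourier-transformation $F$ one concludes that each function $f(x_2, x_1) \in L^2(\mathbb{R} \times \mathbb{R})$ can be decomposed as ($x_\pm \equiv x_2 \pm x_1$)
\[
f(x_2, x_1) = \int_{\mathbb{R}} d\kappa_3\, e^{\pi i \kappa_3 x_+} \int_{\mathbb{S}} d\mu(\alpha_3)\, \big(\Phi_{\alpha_3}(\alpha_2, \alpha_1|\kappa_3|x_-)\big)^{*}\, F_f(\alpha_3, \kappa_3), \tag{33}
\]
where the generalized Fourier-transformation $F_f$ of $f$ is defined as
\[
F_f(\alpha_3, \kappa_3) = \int_{\mathbb{R}} dx_2\, dx_1\, e^{-\pi i \kappa_3 x_+}\, \Phi_{\alpha_3}(\alpha_2, \alpha_1|\kappa_3|x_-)\, f(x_2, x_1). \tag{34}
\]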
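The measure $d\mu(\alpha_3)$ will be determined later. One may next observe that

Lemma 6. One has
\[
\begin{bmatrix} \alpha_3 & \alpha_2 & \alpha_1 \\ \kappa_3 & x_2 & x_1 \end{bmatrix}
\equiv \int_{\mathbb{R}} dx_3\, e^{2\pi i \kappa_3 x_3}
\begin{bmatrix} \alpha_3 & \alpha_2 & \alpha_1 \\ x_3 & x_2 & x_1 \end{bmatrix}
= e^{-\pi i \kappa_3 x_+}\, \Phi_{\alpha_3}(\alpha_2, \alpha_1|\kappa_3|x_-), \tag{35}
\]
if the normalization factor $M$ in (31) is chosen as
\[
M^{\alpha_3;\kappa_3}_{\alpha_2,\alpha_1} \equiv e^{\pi i \alpha_2(\alpha_2 - \alpha_3)}\, e^{-\pi i(\alpha_3 - i\kappa_3)(\alpha_3 + \alpha_2 - Q)}. \tag{36}
\]
Proof. The kernel $\begin{bmatrix} Q-\alpha_3 & \alpha_2 & \alpha_1 \\ x_3 & x_2 & x_1 \end{bmatrix}$ may be rewritten in terms of the function $\Theta_b(\beta; y)$ as follows:
\[
\begin{bmatrix} Q-\alpha_3 & \alpha_2 & \alpha_1 \\ x_3 & x_2 & x_1 \end{bmatrix}
= e^{\pi i \alpha_1 \alpha_2}\, e^{2\pi(x_3(\alpha_2 - \alpha_1) + \alpha_1 x_1 - \alpha_2 x_2)}\,
\Theta_b(\beta_{32}; y_{32})\, \Theta_b(\beta_{31}; y_{31})\, \Theta_b(\beta_{21}; y_{21}). \tag{37}
\]
The substitution $s = -i(x_3 - x_2) + \frac{1}{2}(\alpha_3 + \alpha_2 - Q)$ then leads to the Euler-type integral (146) for the b-hypergeometric function. The rest is straightforward.

It follows that the generalized Fourier-transformation defined in Theorem 3 represents a decomposition into eigenspaces of the q-Casimir $Q_{21}$. Two things remain to be done in order to finish the proof of Theorem 2: On the one hand it remains to calculate the spectral measure $d\mu(\alpha_3)$, and on the other hand one needs to verify the intertwining property (26).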
4.1. Spectral measure. We will show in this subsection that $d\mu(\alpha_3) = |S_b(2\alpha_3)|^2$. This follows from the combination of the following two results. We first of all determine the asymptotics of the distributional Fourier-transform of $\Phi_{\alpha_3}$:

Lemma 7. The function $\tilde\Phi_{\alpha_3}(\omega)$ (defined as in (6)) decays exponentially for $\omega \to \infty$ and has the following asymptotic behavior for $\omega \to -\infty$:
\[
\tilde\Phi_{\alpha_3}(\omega) = N_+(\alpha_3)\, e^{2\pi i \omega x_+} + N_-(\alpha_3)\, e^{2\pi i \omega x_-} + R_-(\omega), \tag{38}
\]
where $R_-(\omega)$ decays exponentially for $\omega \to -\infty$, $x_+$ and $x_-$ are defined by $x_\pm \equiv \frac{i}{2}(\alpha_1 + \alpha_2 - Q) \pm i(\alpha_3 - \frac{Q}{2})$ and $|N_\pm(\alpha_3)|^2 = |S_b(2\alpha_3)|^{-2}$.

Proof. According to Lemma 2 one just needs to calculate the residues of $\Phi_{\alpha_3}$ for the poles at $x = x_\pm$. We will only need the absolute values of these quantities. The pole at $x = x_-$ comes from the $G_b/G_b$ factor in the expression for $\Phi$. To calculate its residue one needs the following special value of the $\Psi$-function:
\[
\Psi_b(U, V; W; W - U - V) = \frac{G_b(V)\, G_b(W - U - V)}{G_b(W - U)}, \tag{39}
\]
which follows easily from the fact that the representation (146) simplifies to the b-beta integral (136) for $x = W - U - V$. We furthermore note that $|G_b(\frac{Q}{2} + ix)|^2 = 1$ from the reflection property of $S_b(x)$ stated in Appendix B. It thereby follows that
\[
|N_-(\alpha_3)|^2 = \big|M^{\alpha_3;\kappa_3}_{\alpha_2\alpha_1}\, G_b(Q - 2\alpha_3)\big|^2. \tag{40}
\]
One has |Mαα23α;κ13 |2 = eπiQ(Q−2α3 ) , and |Gb (Q − 2α3 )|2 = e−πiQ(Q−2α3 ) |Sb (2α3 )|−2 from the connection between Sb and Gb , as well as the reflection property of Sb (see Appendix B). Therefore |N− (α3 )|2 = |Sb (2α3 )|−2 . The pole at x = x+ corresponds to the pole at y = 0 of 7b (U, V ; W ; y). One may determine the singular term for y → 0 by applying Lemma 3 to the Euler integral representation (146) for the function 7b : 2πe−2πiyβ
1 Gb (γ − β) Gb (−y + γ − β) = + (contributions regular as y → 0). Gb (α)Gb (−y + Q) y Gb (α) (41)
The rest of the calculation proceeds as in the case of $N_-(\alpha_3)$ and yields $|N_+(\alpha_3)|^2 = |S_b(2\alpha_3)|^{-2}$.

Proposition 2. Assume that the generalized eigenfunctions $\tilde\Phi_{\alpha_3}$ decay exponentially for $\omega \to \infty$ and have asymptotic behavior of the form (38) with $|N_+(\alpha_3)|^2 = |N_-(\alpha_3)|^2$ for $\omega \to -\infty$. In that case one may define the “inner product” $(\Phi_{\alpha_3}, \Phi_{\alpha_3'})$ as a bidistribution which is explicitly given by
\[
(\Phi_{\alpha_3}, \Phi_{\alpha_3'}) = |N_+(\alpha_3)|^2\, \delta(\alpha_3 - \alpha_3'). \tag{42}
\]
Proof. Consider (C21 (κ3 ) = lim
α3 ,
W →∞
W
s=±−W
α3 ) − (
α3 , C21 (κ3 )
α3 )
∗ ∗ dω δ˜s (ω) ˜ α3 (ω + sib) ˜ α3 (ω)− ˜ α3 (ω) δ˜s (ω) ˜ α3 (ω + sib) , (43)
where the Fourier-transform of the explicit expression (105) for C21 (κ3 ) has been used. The contour of integration for the second term in (43) can be deformed into R − isb plus contours from −W to −W − isb and W − isb to W . The integral over R − isb cancels the first term on the right-hand side of (43). Only the contour from −W to −W −isb will give nonvanishing contributions in the limit W → ∞ due to the exponential decay of ˜ α3 (ω) for ω → ∞. In the remaining term one gets in the limit W → ∞ contributions only from the leading terms in the asymptotics of ˜ α3 (ω) for ω → −∞ as quoted in Lemma 38. Taking into account that δ˜s (ω) =
1 esπib(Q−α1 −α2 ) + O(e2πbω ) (q − q −1 )2 Q 2
for ω → −∞, it follows that (α3 = (C21 (κ3 )
α3 ,
α3 ) − (
+ ip3 , α3 =
α3 , C21 (κ3 )
1 = lim (q − q −1 )2 W →∞ s=±
α3 )
!1 ,!2 =±
Q 2
(44)
+ ip3 )
∗ N!1 (α3 ) N!2 (α3 ) 2πiW (!1 p3 −!2 p ) 3 · e (45) 2π i(!1 p3 − !2 p3 ) · e2πs!2 bp3 1 − e2πsb(!1 p3 −!2 p3 ) .
The expression on the right-hand side of (45) vanishes by the Riemann–Lebesgue Lemma for p3 " = p3 as well as !1 " = !2 . The remainder is found to be (C21 (κ3 )
α3 ,
α3 ) − (
α3 , C21 (κ3 )
α3 )
e2πiW (p3 −p3 ) − e−2πiW (p3 −p3 ) . = [ip3 ]2b − [ip3 ]2b |N+ (α3 )|2 lim W →∞ 2π i(p3 − p3 )
(46)
It follows that
\[
(\Phi_{\alpha_3}, \Phi_{\alpha_3'})
= |N_+(\alpha_3)|^2 \lim_{W \to \infty} \frac{e^{2\pi i W(p_3 - p_3')} - e^{-2\pi i W(p_3 - p_3')}}{2\pi i (p_3 - p_3')}
= |N_+(\alpha_3)|^2\, \delta(\alpha_3 - \alpha_3') \tag{47}
\]
by the corresponding well-known property of the kernel $\sin(Rx)/x$, cf. e.g. [21, Chapter IX, Exercise 14].
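The limit used in the last step can be illustrated numerically. The kernel in (47) equals $\sin(2\pi W p)/(\pi p)$ with $p = p_3 - p_3'$; the sketch below smears it against the Gaussian test function $g(p) = e^{-p^2}$ (an arbitrary choice made here for illustration) and compares with the closed form $\mathrm{erf}(\pi W)$, which tends to $g(0) = 1$ as $W \to \infty$. The function name and the quadrature parameters are hypothetical.

```python
import math


def dirichlet_smear(W, g, half_width=8.0, step=1e-3):
    """Trapezoid estimate of the integral of sin(2 pi W p)/(pi p) * g(p) over [-half_width, half_width]."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for k in range(n + 1):
        p = -half_width + k * step
        if abs(p) < 1e-12:
            kernel = 2.0 * W  # limiting value of sin(2 pi W p)/(pi p) at p = 0
        else:
            kernel = math.sin(2 * math.pi * W * p) / (math.pi * p)
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * kernel * g(p)
    return total * step


def gaussian(p):
    return math.exp(-p * p)


# The smeared kernel approaches g(0) = 1 as W grows.
values = {W: dirichlet_smear(W, gaussian) for W in (0.2, 1.0, 3.0)}
```

For the Gaussian the exact value follows from $\int_{\mathbb{R}} e^{-p^2}\sin(ap)/p\, dp = \pi\, \mathrm{erf}(a/2)$ with $a = 2\pi W$.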
4.2. Intertwining property.

Proposition 3. The projections $\Pi_{21}(\alpha_3)$, $\alpha_3 \in \mathbb{S}$, map $P_{\alpha_2} \otimes P_{\alpha_1}$ into $P_{\alpha_3}$ and satisfy the intertwining property (26).

Proof. $F_f(\alpha_3, x_3)$ will be entire analytic w.r.t. $x_3$ by straightforward application of Lemma 3, using that $f$ is entire analytic in $x_2$, $x_1$ and the analytic properties of the Clebsch–Gordan coefficients summarized in Lemma 19, Appendix C. One similarly finds by using Lemma 20, Appendix C that the Fourier-transform $F_f(\alpha_3, \kappa_3)$ will be meromorphic in $\kappa_3$ with poles at $\kappa = \pm(Q - \alpha + nb + mb^{-1})$, $n, m \in \mathbb{Z}^{\geq 0}$ for any $f \in P_{\alpha_2} \otimes P_{\alpha_1}$. This establishes the first claim in Proposition 3.

Note that the analytic continuation of the integral (25) that defines $F_f(\alpha_3, x_3)$ can be represented by integrating over a deformed contour $C^{(2)} \subset \mathbb{C}^2$. For later use we will present suitable contours for the cases of analytic continuation to $\{x_3 \in \mathbb{C};\ \mathrm{Im}(x_3) \in [0, \frac{b}{2}]\}$ and $\{x_3 \in \mathbb{C};\ \mathrm{Im}(x_3) \in [-\frac{b}{2}, 0]\}$ respectively: In the first case one may integrate $x_1$ over the real axis and instead of integrating over $x_2$ one may integrate $x_{32} \equiv -iy_{32}$, cf. (23), over a contour consisting of the union of the half axes $(-\infty, -\delta]$ and $[\delta, +\infty)$, $b > \delta > b/2$ with a half-circle in the upper half plane around $x_{32} = 0$ of radius $\delta$. In the second case one may integrate $x_2$ over $\mathbb{R}$, and $x_{31} \equiv -iy_{31}$ over the contour $C_1$ consisting of the union of the half axes $(-\infty, -\delta]$ and $[\delta, +\infty)$ with a half-circle of radius $\delta$ in the lower half plane around $x_{31} = 0$.

Now consider the right-hand side of (26). The expressions for $\pi_{21}(u)$, $u = E, F, K$ contain the shift operators
\[
T_{x_1}^{+\frac{ib}{2}} T_{x_2}^{+\frac{ib}{2}}, \qquad T_{x_1}^{-\frac{ib}{2}} T_{x_2}^{-\frac{ib}{2}} \qquad \text{and} \qquad T_{x_1}^{-\frac{ib}{2}} T_{x_2}^{+\frac{ib}{2}}. \tag{48}
\]
The shift operator $T_{x_i}^{\pm\frac{ib}{2}}$ is “partially integrated” by (i) shifting the contour of integration over $x_i$ to the axis $\mathbb{R} \mp \frac{ib}{2}$, where one will pick up a residue contribution from the pole of the Clebsch–Gordan coefficients that lies between these two contours, and (ii) introducing the new variables of integration $x_i' \equiv x_i \pm \frac{ib}{2}$. In this way one rewrites the expression for $C_{21}\pi_{21}(u)f$ in the form
\[
\int_{C_2} dx_2 \int_{C_1} dx_1 \left( \pi_{21}^{t}(u) \begin{bmatrix} \alpha_3 & \alpha_2 & \alpha_1 \\ x_3 & x_2 & x_1 \end{bmatrix} \right) f(x_2, x_1), \tag{49}
\]
where $\pi_{21}^{t}$ denotes the transpose of $\pi_{21}$, and the contours $C_i$, $i = 1, 2$ are just the contours introduced above to represent the analytic continuation w.r.t. $x_3$. It is important to notice that due to the fact that only the shift operators (48) appear in the expressions for $\pi_{21}(u)$, $u = E, F, K$ one does not need to introduce further deformations of the contours in order to treat the poles from the factor in the Clebsch–Gordan coefficients that depends on $x_2 - x_1$ only. It is verified by a straightforward calculation using (133) that the Clebsch–Gordan coefficients satisfy the finite difference equations
\[
\pi_{21}^{t}(u) \begin{bmatrix} \alpha_3 & \alpha_2 & \alpha_1 \\ x_3 & x_2 & x_1 \end{bmatrix}
= \pi_{\alpha_3}(u) \begin{bmatrix} \alpha_3 & \alpha_2 & \alpha_1 \\ x_3 & x_2 & x_1 \end{bmatrix}, \qquad u = E, F, K. \tag{50}
\]
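Inserting these relations into (49) yields an expression that is easily identified as $\pi_{\alpha_3}(u)C_{21}f$.

5. Racah–Wigner Coefficients for $U_q(sl(2,\mathbb{R}))$

5.1. Canonical decompositions for triple tensor products. Triple tensor products $P_{\alpha_3} \otimes P_{\alpha_2} \otimes P_{\alpha_1}$ carry a representation $\pi_{321}$ of $U_q(sl(2,\mathbb{R}))$ given by
\[
\pi_{321} \equiv (\pi_{\alpha_3} \otimes \pi_{\alpha_2} \otimes \pi_{\alpha_1}) \circ \Delta^{(3)}, \qquad
\Delta^{(3)} \equiv (\Delta \otimes \mathrm{id}) \circ \Delta \equiv (\mathrm{id} \otimes \Delta) \circ \Delta. \tag{51}
\]
The decomposition of this representation into irreducibles can be constructed by iterating Clebsch–Gordan maps: There are two canonical ways to do so, which will be referred to as “s-channel” and “t-channel” respectively. The first of these corresponds to first decomposing the factor $P_{\alpha_2} \otimes P_{\alpha_1}$ into a direct sum of irreducible representations $P_{\alpha_s}$ and then performing the Clebsch–Gordan decomposition of $P_{\alpha_3} \otimes P_{\alpha_s}$. This extends to a unitary map
\[
\begin{aligned}
C_{3(21)} :\ & L^2(\mathbb{R} \times \mathbb{R} \times \mathbb{R}) \to L^2(\mathbb{S}^2 \times \mathbb{R}, d\mu(\alpha_4)d\mu(\alpha_s)dx_4), \\
& f(x_3, x_2, x_1) \to F_f^s(\alpha_4, \alpha_s, x_4).
\end{aligned}
\tag{52}
\]
The generalized Fourier-transform $F_f^s$ of $f$ is defined as
\[
F_f^s(\alpha_4, \alpha_s; x_4) \equiv \lim_{\epsilon_2 \downarrow 0} \lim_{\epsilon_1 \downarrow 0} \int_{\mathbb{R}^2} dx_3\, dx_s
\begin{bmatrix} \alpha_4 & \alpha_3 & \alpha_s \\ x_4 & x_3 & x_s \end{bmatrix}_{\epsilon_2}
\times \int_{\mathbb{R}^2} dx_2\, dx_1
\begin{bmatrix} \alpha_s & \alpha_2 & \alpha_1 \\ x_s & x_2 & x_1 \end{bmatrix}_{\epsilon_1}
f(x_3, x_2, x_1), \tag{53}
\]
which in the notation $x \equiv (x_3, x_2, x_1)$, $dx \equiv dx_3\, dx_2\, dx_1$ can be rewritten as
\[
F_f^s(\alpha_4, \alpha_s; x_4) \equiv \lim_{\epsilon \downarrow 0} \int_{\mathbb{R}^3} dx\;
\Phi^s_{\alpha_s}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4 & \alpha_1 \end{bmatrix}_{\epsilon}(x_4; x)\, f(x),
\]
where
\[
\Phi^s_{\alpha_s}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4 & \alpha_1 \end{bmatrix}_{\epsilon}(x_4; x)
= \int_{\mathbb{R}} dx_s
\begin{bmatrix} \alpha_4 & \alpha_3 & \alpha_s \\ x_4 & x_3 & x_s \end{bmatrix}_{\epsilon}
\begin{bmatrix} \alpha_s & \alpha_2 & \alpha_1 \\ x_s & x_2 & x_1 \end{bmatrix}_{\epsilon},
\qquad \alpha_4, \alpha_s \in \mathbb{S},\ x_4 \in \mathbb{R}. \tag{54}
\]
Some useful properties of the functions $\Phi^s_{\alpha_s}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4 & \alpha_1 \end{bmatrix}_{\epsilon}(x_4; x)$ are collected in Appendix C.

The generalized Fourier-transformation $C_{3(21)}$ is such that the two-parameter family of projections $\Pi^s(\alpha_4, \alpha_s) : P_{\alpha_3} \otimes P_{\alpha_2} \otimes P_{\alpha_1} \to P_{\alpha_4}(\mathbb{R})$ defined by $f \to F_f^s(\alpha_4, \alpha_s; .)$ intertwine the representation $\pi_{321}$ with the irreducible representation $\pi_{\alpha_4}$. It therefore realizes the following isomorphism of $U_q(sl(2,\mathbb{R}))$ representations:
\[
P_{\alpha_3} \otimes P_{\alpha_2} \otimes P_{\alpha_1} \simeq \int_{\mathbb{S}}^{\oplus} d\mu(\alpha_4)\, P_{\alpha_4} \otimes S_\mu, \tag{55}
\]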
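where the multiplicity space $S_\mu \simeq L^2(\mathbb{S}, d\mu)$ is considered to be equipped with the trivial action of $U_q(sl(2,\mathbb{R}))$.

A second canonical decomposition of $P_{\alpha_3} \otimes P_{\alpha_2} \otimes P_{\alpha_1}$ is obtained by first decomposing the factor $P_{\alpha_3} \otimes P_{\alpha_2}$ into a direct sum of irreducible representations $P_{\alpha_t}$ and then performing the Clebsch–Gordan decomposition of $P_{\alpha_t} \otimes P_{\alpha_1}$. One obtains a map
\[
\begin{aligned}
C_{(32)1} :\ & L^2(\mathbb{R} \times \mathbb{R} \times \mathbb{R}) \to L^2(\mathbb{S}^2 \times \mathbb{R}, d\mu(\alpha_4)d\mu(\alpha_t)dx_4), \\
& f(x_3, x_2, x_1) \to F_f^t(\alpha_4, \alpha_t, x_4),
\end{aligned}
\tag{56}
\]
where $F_f^t$ is defined by a generalized Fourier-transform of the same form as (53) but with $\Phi^s_{\alpha_s}$ replaced by
\[
\Phi^t_{\alpha_t}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4 & \alpha_1 \end{bmatrix}_{\epsilon}(x_4; x)
= \int_{\mathbb{R}} dx_t
\begin{bmatrix} \alpha_4 & \alpha_t & \alpha_1 \\ x_4 & x_t & x_1 \end{bmatrix}_{\epsilon}
\begin{bmatrix} \alpha_t & \alpha_3 & \alpha_2 \\ x_t & x_3 & x_2 \end{bmatrix}_{\epsilon},
\qquad \alpha_4, \alpha_t \in \mathbb{S},\ x_4 \in \mathbb{R}. \tag{57}
\]
As in the case of the s-channel, one has a corresponding two-parameter family of projections $\Pi^t(\alpha_4, \alpha_t) : P_{\alpha_3} \otimes P_{\alpha_2} \otimes P_{\alpha_1} \to P_{\alpha_4}$ that intertwine the representation $\pi_{321}$ with the irreducible representation $\pi_{\alpha_4}$.

Remark 7. The unitarity of the maps $C_{3(21)}$ and $C_{(32)1}$ ensures existence of self-adjoint extensions for the operators $\pi_{3(21)}(u)$, $\pi_{(32)1}(u)$, $u = E, F, K, Q$: Simply take the image of the self-adjoint extensions on $L^2(\mathbb{S}^2 \times \mathbb{R})$ under $C_{3(21)}^{-1}$ or $C_{(32)1}^{-1}$. However, it is not a priori clear that such self-adjoint extensions are unique. In particular, it could be that the self-adjoint extensions that are defined in terms of the maps $C_{3(21)}$ and $C_{(32)1}$ are inequivalent. This disturbing possibility will be excluded shortly.

5.2. Relation between $C_{3(21)}$ and $C_{(32)1}$. It will be convenient to also consider the Fourier-transforms $\Phi^\flat_{\alpha_\flat}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4 & \alpha_1 \end{bmatrix}_{\epsilon}(k_4; x)$, $\flat = s, t$, that are defined as
\[
\Phi^\flat_{\alpha_\flat}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4 & \alpha_1 \end{bmatrix}_{\epsilon}(k_4; x)
= \int_{\mathbb{R}} dx_4\, e^{2\pi i k_4 x_4}\,
\Phi^\flat_{\alpha_\flat}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4 & \alpha_1 \end{bmatrix}_{\epsilon}(x_4; x). \tag{58}
\]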
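Unitarity of the maps $C_{3(21)}$ and $C_{(32)1}$ allows us to relate the transforms $F_f^s$ and $F_f^t$ by a transformation of the form
\[
F_f^s(\alpha_4, \alpha_s, k_4) = \int_{\mathbb{S}^2} d\alpha_4'\, d\alpha_t \int_{\mathbb{R}} dk_4'\;
K\!\begin{pmatrix} \alpha_4 & \alpha_s & k_4 \\ \alpha_4' & \alpha_t & k_4' \end{pmatrix}
F_f^t(\alpha_4', \alpha_t, k_4'). \tag{59}
\]
The distribution $K$ appearing in (59) can be represented as
\[
K\!\begin{pmatrix} \alpha_4 & \alpha_s & k_4 \\ \alpha_4' & \alpha_t & k_4' \end{pmatrix}
= \lim_{\rho \to \infty} \lim_{\epsilon \downarrow 0} \int_{-\infty}^{\infty} dx_2 \int_{-\rho}^{\rho} dx_3\, dx_1
\left( \Phi^t_{\alpha_t}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4' & \alpha_1 \end{bmatrix}_{\epsilon}(k_4'; x) \right)^{*}
\Phi^s_{\alpha_s}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4 & \alpha_1 \end{bmatrix}_{\epsilon}(k_4; x). \tag{60}
\]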
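We will first prove

Proposition 4. The distribution $K$ is of the form
\[
K\!\begin{pmatrix} \alpha_4 & \alpha_s & k_4 \\ \alpha_4' & \alpha_t & k_4' \end{pmatrix}
= \delta(\alpha_4 - \alpha_4')\,\delta(k_4 - k_4')\;
K\!\begin{pmatrix} \alpha_4 & \alpha_s \\ k_4 & \alpha_t \end{pmatrix}. \tag{61}
\]
Proof. This will be a consequence of the following result: $K$ satisfies
\[
\begin{aligned}
\Big( \big[\alpha_4 - \tfrac{Q}{2}\big]_b^2 - \big[\alpha_4' - \tfrac{Q}{2}\big]_b^2 \Big)\,
K\!\begin{pmatrix} \alpha_4 & \alpha_s & k_4 \\ \alpha_4' & \alpha_t & k_4' \end{pmatrix} &= 0, \\
(k_4 - k_4')\,
K\!\begin{pmatrix} \alpha_4 & \alpha_s & k_4 \\ \alpha_4' & \alpha_t & k_4' \end{pmatrix} &= 0.
\end{aligned}
\tag{62}
\]
To see that (62) implies the claim, consider the simplified case of a distribution $T \in \mathcal{S}'(\mathbb{R})$ that satisfies $Tf = 0$, where $f$ is a function that vanishes only at $x_0$ and such that $fg \in \mathcal{S}(\mathbb{R})$ if $g \in \mathcal{S}(\mathbb{R})$. This distribution has support only at $x_0$. By Theorem V.11 of [20] one has $T = \sum_{n=0}^{N} a_n(x_0)\,\partial_x^n \delta(x - x_0)$. It is then easy to see that $Tf = 0$ implies $a_n = 0$ for $n \neq 0$. The generalization to the case at hand is clear.

To verify (62) one may note that the functions $\Phi^\flat_{\alpha_\flat}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4 & \alpha_1 \end{bmatrix}_{\epsilon}(k_4; x)$, $\flat = s, t$ satisfy eigenvalue equations for the operators $Q_{321} \equiv \pi_{321}(Q)$ and $K_{321} \equiv \pi_{321}(K)$ up to an error of order $O(\epsilon)$. It follows that
\[
\begin{aligned}
&\Big( \big[\alpha_4 - \tfrac{Q}{2}\big]_b^2 - \big[\alpha_4' - \tfrac{Q}{2}\big]_b^2 \Big)\,
K\!\begin{pmatrix} \alpha_4 & \alpha_s & k_4 \\ \alpha_4' & \alpha_t & k_4' \end{pmatrix} \\
&\quad = \lim_{\epsilon_1, \epsilon_2 \downarrow 0} \lim_{\rho \to \infty} \int_{\mathbb{R}} dx_2 \int_{-\rho}^{\rho} dx_3\, dx_1
\Bigg\{ \left( \Phi^t_{\alpha_t}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4' & \alpha_1 \end{bmatrix}_{\epsilon_1}(k_4'; x) \right)^{*}
Q_{321}\, \Phi^s_{\alpha_s}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4 & \alpha_1 \end{bmatrix}_{\epsilon_2}(k_4; x) \\
&\qquad\qquad - \left( Q_{321}\, \Phi^t_{\alpha_t}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4' & \alpha_1 \end{bmatrix}_{\epsilon_1}(k_4'; x) \right)^{*}
\Phi^s_{\alpha_s}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4 & \alpha_1 \end{bmatrix}_{\epsilon_2}(k_4; x) \Bigg\}.
\end{aligned}
\tag{63}
\]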
The right-hand side of (63) will vanish if Q321 can be “partially integrated”. To show that this is the case, one needs some information on the form that Q321 takes when acting on functions f (x). By straightforward evaluation of its definition one obtains an expression in terms of shift operators T1is1 b T2is2 b T3is3 b ,
where Ti = Txi , si ∈ {+, −}, i = 1, 2, 3.
It is convenient to introduce an alternative set of shift operators T+3 = T1 T2 T3 ,
2 T21 = T2 T1−1
2 T32 = T3 T2−1 .
The crucial point now is that the expression for Q321 when rewritten in terms of T+ , T21 , T32 takes the following form Q321 =
3
3 3
n+ =−3 n21 =0 n32 =0
in b
2
ibn21
Pn+ n21 n32 (x) T+ + T213
2
ibn32
T323
,
(64)
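so it contains shifts of $x_{21}$, $x_{32}$, $x_{31}$ by positive imaginary amounts up to $2ib$ only. Furthermore note that in (63) one may replace $T_+$ by $e^{-2\pi i k_4}$. The analytic properties of the integrand in (63), as following from Lemma 22 in Appendix C, now allow one to partially integrate $Q_{321}$ by appropriate shifts of the contours of integration over $x_3$, $x_2$, $x_1$ (cf. proof of Proposition 3). The verification of the second equation in (62) is similar.

Remark 8. This result implies that the self-adjoint extensions of $\pi_{321}(u)$, $u = K, Q$ that are defined by the maps $C_{3(21)}$ and $C_{(32)1}$ indeed coincide. A similar argument as in the proof of the previous proposition will also cover the two other cases $u = E, F$.

5.3. Calculation of the Racah–Wigner coefficients I. It will be useful to also introduce
\[
X\!\begin{pmatrix} \alpha_4 & \alpha_s & x_4 \\ \alpha_4' & \alpha_t & x_4' \end{pmatrix}
= \lim_{\epsilon \to 0+} \int_{-\infty}^{\infty} dx_3\, dx_2\, dx_1
\left( \Phi^t_{\alpha_t}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4' & \alpha_1 \end{bmatrix}_{\epsilon}(x_4'; x) \right)^{*}
\Phi^s_{\alpha_s}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4 & \alpha_1 \end{bmatrix}_{\epsilon}(x_4; x). \tag{65}
\]
Proposition 4 has an obvious counterpart for $X$:

Proposition 5. The distribution $X$ is of the form
\[
X\!\begin{pmatrix} \alpha_4 & \alpha_s & x_4 \\ \alpha_4' & \alpha_t & x_4' \end{pmatrix}
= \delta(\alpha_4 - \alpha_4')\,\delta(x_4 - x_4')
\begin{Bmatrix} \alpha_1 & \alpha_2 & \alpha_s \\ \alpha_3 & \alpha_4 & \alpha_t \end{Bmatrix}_b. \tag{66}
\]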
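Proof. Introduce
\[
K_{\epsilon,\rho}\!\begin{pmatrix} \alpha_4 & \alpha_s & k_4 \\ \alpha_4' & \alpha_t & k_4' \end{pmatrix}
= \int_{-\infty}^{\infty} dx_2 \int_{-\rho}^{\rho} dx_3\, dx_1
\left( \Phi^t_{\alpha_t}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4' & \alpha_1 \end{bmatrix}_{\epsilon}(k_4'; x) \right)^{*}
\Phi^s_{\alpha_s}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4 & \alpha_1 \end{bmatrix}_{\epsilon}(k_4; x). \tag{67}
\]
The coefficient of $\delta(k_4 - k_4')$ in the expression for $K$ coincides with the sum of the coefficients with which $e^{-2\pi i(k_4 - k_4')x_1}$ and $e^{-2\pi i(k_4 - k_4')x_3}$ appear in the asymptotic expansion of the integrand in (67), cf. Lemma 22. Lemma 2 identifies the origin of these terms in the asymptotic expansion of $\Phi^\flat$, $\flat = s, t$, with the poles in the dependence of $\Phi^\flat[\ldots]_{\epsilon}(x_4; x)$, $\flat = s, t$ on their variable $x_4$. It follows that the coefficient of $\delta(k_4 - k_4')$ in the expression for $K$ is independent of $k_4$. The result now follows from standard properties of the Fourier transformation.

Proposition 6. We have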
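\[
\begin{Bmatrix} \alpha_1 & \alpha_2 & \alpha_s \\ \alpha_3 & \alpha_4 & \alpha_t \end{Bmatrix}_b
= N\, \frac{S_b(\alpha_2 + \alpha_s - \alpha_1)\, S_b(\alpha_t + \alpha_1 - \alpha_4)}{S_b(\alpha_2 + \alpha_t - \alpha_3)\, S_b(\alpha_s + \alpha_3 - \alpha_4)}
\cdot |S_b(2\alpha_t)|^2 \int_{-i\infty}^{i\infty} ds\,
\frac{S_b(U_1 + s)\, S_b(U_2 + s)\, S_b(U_3 + s)\, S_b(U_4 + s)}{S_b(V_1 + s)\, S_b(V_2 + s)\, S_b(V_3 + s)\, S_b(V_4 + s)}, \tag{68}
\]
where the coefficients $U_i$ and $V_i$, $i = 1, \ldots, 4$ are given by
\[
\begin{aligned}
U_1 &= \alpha_s + \alpha_1 - \alpha_2, & V_1 &= 2Q + \alpha_s - \alpha_t - \alpha_2 - \alpha_4, \\
U_2 &= Q + \alpha_s - \alpha_2 - \alpha_1, & V_2 &= Q + \alpha_s + \alpha_t - \alpha_4 - \alpha_2, \\
U_3 &= \alpha_s + \alpha_3 - \alpha_4, & V_3 &= 2\alpha_s, \\
U_4 &= Q + \alpha_s - \alpha_3 - \alpha_4, & V_4 &= Q,
\end{aligned}
\tag{69}
\]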
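and $N$ is a constant.

Proof. Let
\[
K_{\epsilon}\!\begin{pmatrix} \alpha_4 & \alpha_s & x_4 \\ \alpha_4' & \alpha_t & x_4' \end{pmatrix}
= \int_{-\infty}^{\infty} dx_3\, dx_2\, dx_1
\left( \Phi^t_{\alpha_t}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4' & \alpha_1 \end{bmatrix}_{\epsilon}(x_4'; x) \right)^{*}
\Phi^s_{\alpha_s}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4 & \alpha_1 \end{bmatrix}_{\epsilon}(x_4; x). \tag{70}
\]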
The analytic and asymptotic properties of the integrand follow from Lemma 21 in Appendix C. Let us observe that for ! > 0 one is dealing with absolutely convergent integrals, the integrand being meromorphic both w.r.t. the integration variables and the parameters. The integral (70) therefore does not depend on the order in which the integrations are performed, so we will assume that it is first integrated over x2 . Singular behavior will emerge in the limit ! → 0. We will call a pole relevant if it has distance of O(!) from the real axis, irrelevant otherwise2 . It then easily follows from Lemma 3 that the integration over x2 does not introduce any new relevant poles since all the relevant poles in the x2 dependence that have distance of O(!) are lying on the same side of the contour. Next one may integrate over x1 . We find from Lemma 21 in Appendix C that Rs14 Rs13 α 3 α2 + + (Regs ), (x4 , x) = α4 α1 ! x1 − x3 + α13 − 2i! x1 − x4 + α14 − 2i!
∗ Rt13 Rt14 α 3 α2 t (x , x) = + αt α α 4 + 2i! + i! + (Regt ), x1 − x3 + α13 x1 − x4 + α14 4 1 ! (71) s αs
where (Reg= ), = = s, t are terms that do not lead to relevant poles in the variable x1 after having integrated over x2 . The following abbreviations have been used: α13 = 2i (α1 + α3 − 2(Q − α4 )),
α13 = 2i (α1 + α3 − 2(Q − α4 )),
α14 = 2i (α1 − α4 ),
= 2i (α1 − α4 ). α14
(72)
It is then easily found by using Lemma 3 that the result of the integration over x1 will have poles at the following locations: i(α4 − α4 ) − 4i! = 0, x4 − x4 + 2i (α4 − α4 ) − 3i! = 0,
x3 − x4 − 2i (α3 + α4 − 2(Q − α4 )) − 4i! = 0, x4 − x3 + 2i (α3 + α4 − 2(Q − α4 )) − 3i! = 0. (73)
The relevant residues can easily be assembled from the expressions given in Appendix C. Moreover, it is straightforward to work out their poles. By again using Lemma 3 one 2 We of course assume that ! has been chosen to be much smaller than b.
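then finds that all four poles listed in (73) will, after doing the $x_3$ integration, produce terms that are singular for $x_4 = x_4'$, $\alpha_4 = \alpha_4'$ and $\epsilon \to 0$. The terms that lead to $\delta(x_4 - x_4')\,\delta(\alpha_4 - \alpha_4')$ are easily identified by means of
\[
\lim_{\epsilon \to 0+} \left( \frac{1}{x - i\epsilon} - \frac{1}{x + i\epsilon} \right) = 2\pi i\, \delta(x). \tag{74}
\]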
All these terms have as residue an expression proportional to Res Res
y31 =0 y21 =0
α4 α3 αs ∗ ∗ ∗
R
Res Res
y31 =0 y21 =0
dx2 Res
y31 =0
αs α2 α1 ∗ x 2 x1
α4 αt α1 ∗ ∗ ∗
Res
x1 =x3 −α13 y32 =0
αt α3 α2 xt ∗ x 2
xt =x3 − 2i (α3 −αt )
(75) .
One just needs to assemble the ingredients to check that the expression (75) coincides with what one finds on the right-hand side of (68). Remark 9. With more patience, one could of course also fix the constant N by the method used in the previous proof. We refrain from doing so since we will present a less tedious and more illuminating way of calculating it in the next subsection. What will be needed there, however, is the information on analyticity of the coefficients {. . . } w.r.t. αt that follows from Proposition 6.
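The parameters of Eq. (69) lend themselves to a mechanical consistency check. The sketch below, with the hypothetical helper `barnes_parameters`, evaluates $U_i$ and $V_i$ for sample labels and verifies the relation $\sum_i V_i - \sum_i U_i = 2Q$, which follows directly from (69) and does not depend on $\alpha_1$, $\alpha_3$ or $\alpha_t$. The numerical value of $Q$ assumes $Q = b + b^{-1}$, which is not stated in this section; the sketch makes no claim about convergence of the integral in (68).

```python
def barnes_parameters(a1, a2, a3, a4, a_s, a_t, Q):
    """Return the tuples (U_1..U_4) and (V_1..V_4) of Eq. (69)."""
    U = (
        a_s + a1 - a2,
        Q + a_s - a2 - a1,
        a_s + a3 - a4,
        Q + a_s - a3 - a4,
    )
    V = (
        2 * Q + a_s - a_t - a2 - a4,
        Q + a_s + a_t - a4 - a2,
        2 * a_s,
        Q,
    )
    return U, V


b = 0.7
Q = b + 1 / b  # assumption: Q = b + 1/b (not stated in this section)
# Sample labels on the line Q/2 + i R^+.
labels = [Q / 2 + 1j * p for p in (0.2, 0.4, 0.6, 0.8, 1.0, 1.2)]
U, V = barnes_parameters(*labels, Q)
balance = sum(V) - sum(U)
```

A balance relation of this kind is what one would typically inspect first when studying the behavior of the integrand of (68) for large imaginary $s$.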
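5.4. Relation between the distributions $\Phi^s$ and $\Phi^t$.

Proposition 7. $\Phi^s$ and $\Phi^t$ are related by a linear transformation of the form
\[
\Phi^s_{\alpha_s}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4 & \alpha_1 \end{bmatrix}(x_4; x)
= \int_{\mathbb{S}} d\alpha_t
\begin{Bmatrix} \alpha_1 & \alpha_2 & \alpha_s \\ \alpha_3 & \alpha_4 & \alpha_t \end{Bmatrix}_b
\Phi^t_{\alpha_t}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4 & \alpha_1 \end{bmatrix}(x_4; x). \tag{76}
\]
The relation (76) can be read either as (i) relation between functions analytic in $A^{(4)} \equiv \{x = (x_4, x_3, x_2, x_1) \in \mathbb{C}^4;\ \mathrm{Im}(x_1) < \mathrm{Im}(x_2) < \mathrm{Im}(x_3),\ \mathrm{Im}(x_1) < \mathrm{Im}(x_4) < \mathrm{Im}(x_3),\ \mathrm{Im}(x_3 - x_1) < Q\}$, or (ii) as relation between functions meromorphic w.r.t. $x \in \mathbb{C}^4$, or (iii) as relation between distributions defined as boundary values of $\Phi^\flat$, $\flat = s, t$ for $(x_4, x) \in \mathbb{R}^4$.

Proof. We will start from Eq. (59). By using Fourier-transformation w.r.t. the variable $k_4$ and Eq. (66) one may rewrite (59) as follows:
\[
F_f^s(\alpha_4, \alpha_s, x_4) = \int_{\mathbb{S}} d\alpha_t
\begin{Bmatrix} \alpha_1 & \alpha_2 & \alpha_s \\ \alpha_3 & \alpha_4 & \alpha_t \end{Bmatrix}_b
F_f^t(\alpha_4, \alpha_t, x_4). \tag{77}
\]
Let us introduce sequences of test-functions that tend towards delta-distributions:
\[
t_n(y; x) = \Big( \frac{n}{2\pi} \Big)^{\frac{3}{2}} e^{-\frac{n}{2}\|x - y\|^2}, \qquad y = (y_3, y_2, y_1). \tag{78}
\]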
Lemma 8. Let $\mathbf{y} \equiv (x_4, y) \in A^{(4)}$ with $\mathrm{Im}(y_1) < 0$. In this case one has
\[
\lim_{n \to \infty} F^\flat_{t_n(y;.)}(\alpha_4, \alpha_\flat, x_4)
= \Phi^\flat_{\alpha_\flat}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4 & \alpha_1 \end{bmatrix}(x_4; y). \tag{79}
\]
Proof. By writing out the definition of $F^\flat_{t_n}$ and shifting the contours of integration over $x_i$ to $\mathbb{R} + i\,\mathrm{Im}(y_i)$, $i = 1, 2, 3$, one reduces the claim to the standard result that
\[
\lim_{n \to \infty} t_n(y; x) = \delta^3(x - y)
\]
for $\mathrm{Im}(y_i) = 0$, $i = 1, 2, 3$. (Note that $\Phi^\flat$ is regular for these values of its arguments as follows from Lemma 21, Appendix C.)
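We will now consider the sequence with elements
\[
\int_{\mathbb{S}} d\alpha_t
\begin{Bmatrix} \alpha_1 & \alpha_2 & \alpha_s \\ \alpha_3 & \alpha_4 & \alpha_t \end{Bmatrix}_b
F^t_{t_n(y,.)}(\alpha_4, \alpha_t, x_4). \tag{80}
\]
It converges for $n \to \infty$ due to Lemma 8 and Eq. (77). We would like to show that one may exchange the limit $n \to \infty$ with the integration over $\alpha_t$ so that the limit of (80) is given by the integral
\[
\int_{\mathbb{S}} d\alpha_t
\begin{Bmatrix} \alpha_1 & \alpha_2 & \alpha_s \\ \alpha_3 & \alpha_4 & \alpha_t \end{Bmatrix}_b
\Phi^t_{\alpha_t}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4 & \alpha_1 \end{bmatrix}(x_4; y). \tag{81}
\]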
To this aim it is useful to note that

Lemma 9. Under the conditions on the variable $y$ introduced in Lemma 8 one finds that the integrand in (81) decays exponentially for $p_t \equiv -i(\alpha_t - \frac{Q}{2}) \to \pm\infty$. The integrand in (80) decays at least as fast as the integrand in (81).

Proof. By a straightforward calculation using the method in the proof of Lemma 17, Appendix B and Eq. (135) one finds that
\[
\Phi^t_{\alpha_t}\!\begin{bmatrix} \alpha_3 & \alpha_2 \\ \alpha_4 & \alpha_1 \end{bmatrix}(x_4; y) \ \text{decays stronger than}\ e^{\mp \pi Q p_t}
\quad \text{and} \quad
\begin{Bmatrix} \alpha_1 & \alpha_2 & \alpha_s \\ \alpha_3 & \alpha_4 & \alpha_t \end{Bmatrix}_b \ \text{grows as}\ e^{\pm \pi Q p_t} \tag{82}
\]
for $p_t \to \infty$. The first statement in Lemma 9 follows. The second statement follows from the first by shifting the contour of integration over $x_1$ in the definition of $F^t_{t_n(y,.)}$ to $\mathbb{R} + i\,\mathrm{Im}(y_1)$.

The integrals (80), (81) can therefore be transformed into integrals over a compact set, e.g. the interval $[0, 1]$. In order to justify the exchange of limit and integration it therefore suffices to prove the following

Lemma 10. The convergence of $F^t_{t_n(y,.)}(\alpha_4, \alpha_t, x_4)$ is uniform in $\alpha_t$.
Proof. To shorten the exposition, let us consider a slightly simplified situation. Assume that $f_p(x)$ is analytic w.r.t. both $p$ and $x$ in open strips that contain the real axis and decays exponentially for either $|p|$ or $|x|$ going to infinity. Let $t_n(x) = \sqrt{\frac{n}{2\pi}}\, e^{-nx^2/2}$ and study the convergence of $f_{p,n} \equiv \int_{\mathbb{R}} dx\, f_p(x)\, t_n(x)$ for $n \to \infty$. Upon writing $f_p(x) = f_p(0) + x g_p(x)$, the task reduces to the study of
\[
\int_{\mathbb{R}} dx\, g_p(x)\, x\, t_n(x) = \frac{1}{\sqrt{2\pi n}} \int_{\mathbb{R}} dx\, e^{-\frac{n}{2}x^2}\, \partial_x g_p(x). \tag{83}
\]
Convergence for $n \to \infty$ will be uniform in $p$ provided that $\partial_x g_p(x)$ is bounded as a function of both $p$ and $x$. But this is a consequence of our assumptions: The exponential decay allows us to transform $f_p(x)$ (resp. $\partial_x g_p(x)$) to a function that is analytic on a compact rectangle in $\mathbb{C}^2$, and therefore bounded. The regularity properties of $\Phi^t$ necessary to extend the argument to the present situation follow from Lemma 21, Appendix C.

We have proved (76) provided $(x_4, x)$ satisfies the same conditions as $(x_4, y)$ in Lemma 8. Proposition 7 follows by analytic continuation.
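The simplified situation considered in the proof of Lemma 10 can be illustrated with a concrete family. The sketch below uses the sample function $f_p(x) = e^{-p^2 - x^2}\cos(px)$, which is an assumption made purely for illustration and is not taken from the text, and estimates $\sup_p |f_{p,n} - f_p(0)|$ on a grid of $p$ values for increasing $n$. The helper names and quadrature parameters are hypothetical.

```python
import math


def smear(f, n, half_width=6.0, step=2e-3):
    """Trapezoid estimate of the integral of f(x) t_n(x), with t_n(x) = sqrt(n / 2 pi) exp(-n x^2 / 2)."""
    m = int(round(2 * half_width / step))
    norm = math.sqrt(n / (2 * math.pi))
    total = 0.0
    for k in range(m + 1):
        x = -half_width + k * step
        weight = 0.5 if k in (0, m) else 1.0
        total += weight * f(x) * norm * math.exp(-0.5 * n * x * x)
    return total * step


def f(p, x):
    """Illustrative sample family (an assumption, not from the text)."""
    return math.exp(-p * p - x * x) * math.cos(p * x)


def sup_error(n, p_grid):
    """Largest deviation |f_{p,n} - f_p(0)| over the grid of p values."""
    return max(abs(smear(lambda x, p=p: f(p, x), n) - f(p, 0.0)) for p in p_grid)


p_grid = [0.25 * k for k in range(-12, 13)]
errors = {n: sup_error(n, p_grid) for n in (10, 100, 1000)}
```

For this family the smeared value is known in closed form, $f_{p,n} = e^{-p^2}\sqrt{n/(n+2)}\; e^{-p^2/(2(n+2))}$, so the deviation from $f_p(0)$ is of order $1/n$ for all $p$ on the grid at once, in line with the factor $1/\sqrt{2\pi n}$ in (83).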
5.5. Calculation of Racah–Wigner coefficients II. We have shown that the meromorphic functions s and t are related by an integral transformation of the form (76). If one fixes the values of three of the four variables x4 , . . . , x1 in (76) one obtains an integral transformation for a function of a single variable. In fact, the analytic properties of sαs and tαt even allow one to choose complex values. It will be convenient to consider 7αs s
3 α 3 α2 2πα4 x4 (x) = lim e lim e−2παj xj α¯ 4 α1 x4 →∞ x2 →−∞ j =1
s αs
x1 =x α3 α 2
(x) i , α¯ 4 α1 x3 = 2 (Q+α2 −α4 ) (84)
where α¯ = Q − α, and the same for 7αt t . The integral that defines sαs and tαt , (54)(57) can be done explicitly in this limit by using (146). One finds expressions of the form α 3 α2 α α α α (x) = Nαs s 3 2 6sαs 3 2 (x), α¯ 4 α1 α¯ 4 α1 α¯ 4 α1 α α 6sαs 3 2 (x) = e+2πx(αs −α2 −α1 ) Fb (αs + α1 − α2 , αs + α3 − α4 ; 2αs ; −ix), α¯ 4 α1 α α α α α α 7αt t 3 2 (x) = Nαt t 3 2 6tαt 3 2 (x), α¯ 4 α1 α¯ 4 α1 α¯ 4 α1 α α 6tαt 3 2 (x) = e−2πx(αt +α1 −α4 ) Fb (αt + α3 − α2 , αt + α1 − α4 ; 2αt ; +ix), α¯ 4 α1 (85)
7αs s
where Fb is the b-hypergeometric function defined in the Appendix, and Nαs s , Nαt t are certain normalization factors.
The linear transformation following from (76) can now be calculated as follows: One observes that 7αs s (resp. 7αt t ) are eigenfunctions of the finite difference operators Qs and Qt defined respectively by 2 Qs = dx + α1 + α2 − Q − e+2πbx dx + α1 + α2 + α3 − α4 dx + 2α1 , 2 (86) 2 −2πbx − e + α + α − α − α d d . Qt = dx + α1 − α4 + Q x 1 2 3 4 x 2 It can be shown that Theorem 4. The operators Qs and Qt have unique self-adjoint extensions in L2 (R, dxe2πQx ). Bases of L2 (R, dxe2πQx ) in the sense of generalized eigenfunctions are given by the sets of functions {6sαs ; αs ∈ S} and {6tαt ; αt ∈ S}, where the normalization is given by ∗ α α α α = dx e2πQx 6α 3 2 (x) 6=α= 3 2 (x) = δ(α= − α= ), = = s, t. (87) ¯ 4 α1 α¯ 4 α1 = α R
The proof is omitted as it is very similar to the proof of Theorem 3. It follows that the Racah–Wigner coefficients can be evaluated in terms of the overlap between these two bases: α α Nαs s 3 2
∗ α¯ 4 α1 α1 α2 αs α 3 α2 α 3 α2 2πQx t s = dx e 6 6 (x) (x).
αt α αs α α3 α¯ 4 αt b ¯ 4 α1 ¯ 4 α1 α α Nαt t 3 2 R α¯ 4 α1 (88) The integral can be done by using the representation (143) for the b-hypergeometric function. The result is just Eq. (68) with N = 1. 5.6. Properties the Racah–Wigner coefficients. First of all let us note that orthogonality and completeness of the bases { sαs ; αs ∈ S} and { tαt ; αt ∈ S} imply the following orthogonality relations for the b-Racah–Wigner symbols
∗ α1 α 2 α s α1 α2 α s dαs |Sb (2αs )|2 = |Sb (2αt )|2 δ(αt − βt ). (89)
α α α α α βt t S
3
4
b
3
4
b
This may be verified e.g. by rewriting
α 3 α2 α 3 α2 t t (x ; .), (x ; .) 4 αt α α αt α α1 ! 4 4 1 ! 4
(90)
= |Sb (2αt )|−2 δ(αt − αt )δ(α4 − α4 )δ(x4 − x4 ) with the help of the inversion formula to (76)
Sb (2αs ) 2 α1 α2
α ∗ α 3 α2 t s
(x ; x) = dα 4 s αs α α α3 α4 αt b 4 1 Sb (2αt ) S
s αt
α3 α 2 (x4 ; x), α4 α 1 (91)
636
B. Ponsot, J. Teschner
and finally using (90) with subscripts t replaced by s. Second, by considering quadruple products of representations one finds the so-called pentagon equation in the usual way:
β 1 α3 β 2 α 1 α2 β 1 α 1 δ 1 β2 α 2 α3 δ1 α 1 α 2 β1 dδ1 = .
α3 β 2 δ 1 b α4 α 4 γ 2 b α 4 γ 2 γ 1 b α4 α5 γ1 b γ1 α5 γ2 b S
(92)
5.7. From intertwiners to coinvariants. Let us consider coinvariants on tensor products of representations. These will be maps B : Pαn ⊗ . . . ⊗ Pα1 . → C that satisfy the coinvariance property B ◦ (παn ⊗ . . . ⊗ πα1 )&(n) (u) = 0,
u ∈ Uq (sl(2, R)),
(93)
where &(n) is defined recursively by &(n) = (id ⊗ &)(&(n−1) ) = (& ⊗ id)(&(n−1) ), &(2) ≡ &. The basic case to consider is n = 2. Let Bα : PQ−α ⊗ Pα → C be defined by −i Q 2
Bα (f ⊗ g) ≡ f , T g ,
T ≡ Tx
.
(94)
Proposition 8. Bα satisfies the coinvariance property (93). Proof. Let us note that Txiα f , g = f , Tx−iα g
(95)
if f ∈ PQ−α and g ∈ Pα . A straightforward calculation then shows that πQ−α (u)f , g = f , πα (u)g ,
u ∈ Uq (sl(2, R)).
(96)
It is useful to also note the commutation relations T Eα = e−iπbQ Eα T ,
T Fα = e+iπbQ Fα T ,
T Kα = Kα T .
(97)
We may then calculate in the case u = E Bα ((πQ−α ⊗ πα ) ◦ &(E))f ⊗ g = EQ−α f , T Kα g + KQ−α f , T Eα g = EQ−α f , Kα T g + e−iπbQ KQ−α f , Eα T g = f , Eα Kα T g − q = 0.
−1
(98)
T f , Kα Eα T g
The calculation for the case u = F is identical and the case u = K is trivial.
Clebsch–Gordan and Racah–Wigner Coefficients for Representations of Uq (sl(2, R))
637
A coinvariant Bα : Pα ⊗ Pα is then obtained by combining Bα with the intertwining operator Iα : Bα ≡ Bα ◦ (Iα ⊗ id).
(99)
In order to construct coinvariants B (n) for n > 2 one may use intertwining maps C ∈ HomUq (sl(2,R)) (Pαn−1 ⊗ . . . ⊗ Pα1 , Pαn ). Such maps can be constructed by iterating Clebsch–Gordan maps, as has been discussed explicitly in the case n = 4 at the beginning of the present section. One may associate a coinvariant BC to any C ∈ HomUq (sl(2,R)) (Pαn−1 ⊗ . . . ⊗ Pα1 , Pαn ) via BC ≡ B ◦ (id ⊗ C).
(100)
The maps C can be represented explicitly with the help of meromorphic integral = kernels C (xn ; x), x ≡ (xn−1 , . . . , x1 ) that generalize α= and the Clebsch–Gordan coefficients. It follows that the corresponding coinvariant BC can be represented as iQ 2 BC (fn ⊗ . . . ⊗ f1 ) = dxn Txn fn (xn ) dx C (xn ; x)fn−1 (xn−1 ) . . . f1 (x1 ). R
Rn−1
(101) It is possible to rewrite (101) as a convolution of fn (xn ) . . . f1 (x1 ) against a kernel 7C (x), x ≡ (xn , . . . , x1 ): To this aim it is necessary to “partially” integrate the finite difference operator in (101) to let it act on C . One should note that the analytic continuation of the integral over x to complex values of xn may in general be represented by integrating the variable x over deformed contours, cf. e.g. the proof of Proposition 3. One arrives at a representation of the form (102) BC (fn ⊗ . . . ⊗ f1 ) = dxn . . . dx1 7C (xn , . . . , x1 )fn (xn ) . . . f1 (x1 ), Cn
where −i Q 2
7C (xn , . . . , x1 ) = Txn
C (xn ; xn−1 , . . .
, x1 ).
(103)
Remark 10. The kernels that represent the coinvariants are in some respects analogous to functional realizations of the conformal blocks in conformal field theory. We strongly suspect that we are touching upon the tip of an iceberg at this point: Quantization of Teichmüller space, as developed in [22, 23] conjecturally leads to a construction of spaces of conformal blocks in Liouville theory. One may expect this to be equivalent to a quantization of certain moduli spaces of flat SL(2, R) connections on Riemann surfaces with marked points. In analogy to results of [24] one would expect spaces of conformal blocks in the case of the punctured Riemann sphere to be represented by spaces of coinvariants in tensor products of Uq (sl(2, R)) representations. A class of these has been constructed in the present subsection. It would certainly be rather interesting
638
B. Ponsot, J. Teschner
and far-reaching if one could establish a direct relation between these spaces and the Hilbert spaces constructed via quantization of Teichmüller space. In this regard we find the following observation quite intriguing: Consider the case of n = 4. There is a canonical way to define a Hilbert space H(0,4) of coinvariants by = taking the sets { α ; α ∈ S} for either = = s or = = t as basis in the sense of generalized functions with the normalization given by (
= α,
= α )
= |Sb (2α)|−2 δ(α − α ).
(104)
The observation made in Subsect. 5.6. now implies that H(0,4) is in a canonical way 2 Q 2 isomorphic to L2 (R) such that multiplication with αs − Q 2 b (resp. αs − 2 b ) gets mapped into the self-adjoint finite difference operator Qs (resp. Qt ). Maybe there is a rather direct connection of these operators to the geodesic length operators appearing in the quantization of Teichmüller space. This would establish a direct relation between the latter and our quantum group results. 6. Appendix A: Spectral Analysis of C21 (κ3 ) This appendix is devoted to the proof of Theorem 3. 6.1. Preliminaries. The difference operator to be considered is of the form 2 πibQ 2πbx C21 (κ3 ) − α3 − Q e − δ0 + δ− e−πibQ e−2πbx , 2 b = δ+ e
(105)
where δs , s = −, 0, + are x-independent finite difference operators given by δ+ = Tx−ib [dx − α2 − ik3 ]b [dx − α1 + ik3 ]b , 2δ0 = {0}b {Q}b Tx−2ib − e−2πbk3 {2α2 − Q}b + e2πbk3 {2α1 − Q}b Tx−ib
+ {2α3 − Q}b ,
(106)
δ− = Tx−ib [dx + α2 − ik3 ]b [dx + α1 + ik3 ]b , and κ3 = −2k3 . It will initially be defined on the domain D ⊂ L2 (R) consisting of functions with the following property: There exists a function F (z) that is 1. holomorphic in the strip {z ∈ C|Im(z) ∈ [−2b, 0]} and 2. the functions Fy (x) ≡ F (x +iy) are in L2 (R, dx cosh(2π bx)) for any y ∈ [−2b, 0]. Proposition 9. The operator (C21 (k3 ), D) is a symmetric, densely defined operator in L2 (R). The domain D† of its adjoint is dense as well. Proof. First of all note that one has (f, Tx−ib g) = (Tx−ib f, g)
(107)
for any f, g ∈ D. This follows by shifting the contour of the integration that represents (f, T− g) to the line R + ib. The fact that C21 (κ3 ) is symmetric is then seen by a simple calculation remembering that αi∗ = Q − αi , i = 1, 2. The fact that D and D† are dense in L2 (R) is easily seen by noting that any Hermitefunction is contained in these sets.
Clebsch–Gordan and Racah–Wigner Coefficients for Representations of Uq (sl(2, R))
639
˜ The Paley–Wiener theorem provides a characterization of the Fourier-transform D of the domain D of C21 (κ3 ). The action of C21 (κ3 ) on functions in D then corresponds ˜ with the following operator: to acting on D 2 2πbω C21 (κ3 ) − α3 − Q &1 + e4πbω &2 , 2 b ≡ &0 − e &0 = [dω + α3 − Q − 21 (α1 + α2 )]b [dω − α3 − 21 (α1 + α2 )]b , 1 &1 = [dω + 21 (α1 + α2 )]b eiπb(dω − 2 (α1 +α2 )+Q) {α1 − α2 − 2ik}b
1 − e−iπb(dω − 2 (α1 +α2 )+Q) {α1 − α2 + 2ik}b ,
&2 = [dω + 21 (α1 + α2 )]b [dω + 21 (α1 + α2 ) + Q]b . (108)
6.2. Strategy. The key to the proof of Theorem 3 is the following result characterizing regularity and asymptotic properties of distributional solutions to the eigenvalue equation of the operator C21 (κ3 ): Theorem 5. Let
2 t ∈ S (R) be a distributional solution of (C21 (κ3 )−[α3 − Q 2] )
= 0.
1. ˜ is represented by a function ˜ (ω) that can be continued to a meromorphic function on C, with simple poles within SQ/2 only at ω = − k3 + i(α1 + nb + mb−1 ), ω = − k3 − i(α1 + nb + mb−1 ), n, m ∈ Z≥0 . ω = + k3 + i(α2 + nb + mb−1 ), ω = + k3 − i(α2 + nb + mb−1 ), 2.
can be represented as = lim!→0 ! where ! is for ! > 0 represented as the restriction to R of a function ! (x) that is meromorphic on C with poles only at −1 x = + 2i α1 + α2 − Q ± i α3 − Q 2 − i(! + nb + mb ), n, m ∈ Z≥0 . Q −1 i x = − 2 α1 + α2 − Q + i 2 + nb + mb ,
In fact, given these properties it is not very difficult to show that for any given eigen2 value [α3 − Q 2 ] there is at most one tempered distributional solution to the eigenvalue equation (Proposition 13). Moreover, no such solution exists for Re(2α3 − Q) " = 0. It follows [25] that the deficiency indices vanish and C21 (κ3 ) has a unique self-adjoint extension. The spectral decomposition can be written as an expansion into generalized eigenfunctions [26]. It can be shown on rather general grounds that only tempered distributions can appear in the spectral decomposition, as is nicely discussed in [27]. The combination of Theorem 5 and Proposition 13 therefore also yields a characterization of the support of the Plancherel measure. These remarks reduce the proof of Theorem 3 to that of Theorem 5 and Proposition 13. 6.3. Preparations. In view of the explicit expressions for C21 (κ3 ) (cf. (105)) resp. its Fourier-transform (108) one may anticipate that the analysis of the asymptotic behavior of and ˜ will require some information about properties of the operators δ+ , δ− resp. &0 , &2 . The information that will be needed is contained in the following lemmas:
640
B. Ponsot, J. Teschner
Lemma 11. δ± is invertible on Cc∞ (R). The image f (x) of a function g ∈ Cc∞ (R) under −1 has the following properties: δ± 1. f (x) is analytic in the strip {x ∈ C; Im(x) ∈ (−2b, 0)} and f (x) ∈ C ∞ (R), f (x − 2ib) ∈ C ∞ (R). 2. f˜(ω) is meromorphic in C with simple poles at ω = −k3 + i(∓α1 + nb−1 )
ω = +k3 + i(∓α2 + nb−1 )
n ∈ Z.
−1 Proof. The action of δ± is represented on the Fourier transform f˜ as multiplication with −1 (δ˜± )−1 (ω) ≡ e−2πbω [iω ∓ α2 − ik3 ]−1 b [iω ∓ α1 + ik3 ]b .
The statement on the analyticity properties of f˜ is then clear after recalling that the function g(ω) ˜ is entire analytic and of rapid decay being the Fourier transform of a Cc∞ function [21, Theorem IX.11]. −1 The statement that (δ+ g)(x) is analytic in the strip {x ∈ C; Im(x) ∈ (−2b, 0)} −1 )(ω) by means of the Paley–Wiener follows from the asymptotic decay properties of (δ˜± Theorem. In fact, the rapid decay of g(ω) ˜ ensures convergence of the inverse Fourier −1 transformation for any x-derivative of (δ+ g)(x) even in the extremal cases Im(x) = 0 and Im(x) = −2b. We will furthermore need similar statements about the inverses of &0 and &2 . Lemma 12. &2 is invertible on Cc∞ (R). The image f (ω) of a function g ∈ Cc∞ (R) under &−1 2 has the following properties: 1. f˜(x) is meromorphic in C with simple poles at x = − 2i (α1 + α2 ) − i(Q + nb−1 )
x = − 2i (α1 + α2 ) + inb−1
n ∈ Z.
2. f (ω) is analytic in the strip {ω ∈ C; Im(x) ∈ (−b, b)} and f (ω ± ib) ∈ C ∞ (R). Lemma 13. &0 is invertible on the space of functions D(&0 ) ≡ dω + α3 − Q − 21 (α1 + α2 ) dω − α3 − 21 (α1 + α2 h,
h ∈ Cc∞ (R).
The image f (ω) of a function g ∈ D(&0 ) under &−1 0 has the following properties: 1. f˜(x) is meromorphic in C with simple poles at x = + 2i (α1 + α2 − Q) ± i(α3 −
Q −1 2 ) − inb
n ∈ Z \ {0}.
2. f (ω) is analytic in the strip {ω ∈ C; Im(x) ∈ (−b, b)} and f (ω ± ib) ∈ C ∞ (R).
Clebsch–Gordan and Racah–Wigner Coefficients for Representations of Uq (sl(2, R))
641
6.4. Asymptotic estimates. We now want to show that the Fourier-transform ˜ of may actually be represented by integration against a function ˜ (ω). For technical reasons it will be necessary to start by considering the distribution R ∈ S (R) defined by ˜ R ≡ δ˜tr,R (ω) ˜ ≡ (ω − ω ) ˜ , ω ∈I+ ∪I− |Im(ω )| 0 such that cosh(2π bn) Proof. We will rewrite large n. One may write
R , τn
R , τn
R , τn
< N for all n ∈ Z.
(109)
in a form that allows us to estimate its asymptotics for
= , δtr,R τn , = , δ+ e2πbx σn,R , =
σn,R ≡ e−2πbx (δ+ )−1 δtr,R τn ;
where
c σn,R , , δ+
where
c δ+
≡ (δ0 − δ− e
−2πbx
(110)
).
In the last step we have used that weakly solves the eigenvalue equation, for which one needs to check that σn,R ∈ D: One point of having introduced δtr,R is that it improves the asymptotic behavior of (δ+ )−1 δtr,R τn for x → −∞ by cancelling the poles of its Fourier transform in {ω ∈ C; Im(ω) < R}. The regularity theorem for tempered distributions [20, Theorem V.10] allows us to furthermore write ∞
R , τn
=
dx 6(x) ρn,R (x)
where
c −2πbx ρn,R ≡ ∂xk δ+ e (δ+ )−1 δtr,R τn
−∞
(111) for some positive integer k and a polynomially bounded continuous function 6(x). The functions ρn,R (x) may be represented by expressions of the form ρn,R (x) =
k=1,2
Ck e
−2πbx
∞ dω e2πiωx −∞
Pk,R (ω)τ˜n (ω) , (1 − e2πb(ω−k+iα1 ) )(1 − e2πb(ω+iα2 ) ) (112)
where Pk,R (ω) k = 1, 2 are some polynomials in ω. The functions ρn,R (x) have main support around x = n, and by choosing R large enough one can achieve decay stronger than e−2πλ|x−n| for any λ > 0. It is then convenient to split the integral in (111) into an c integral Jn obtained by integrating over [ n2 , 3n 2 ] and the remainder Jn . c In order to estimate Jn one may use the polynomial boundedness of 6(x) to estimate its absolute value by some constant times cosh(!x), where ! can be as small as one likes. The absolute value of ρn,R (x) can in R \ [ n2 , 3n 2 ] be estimated by some inverse power of
642
B. Ponsot, J. Teschner
cosh(x), which is bounded by the chosen value of R. It follows that there exist D1 , N1 such that |Jnc | ≤ D1 e−2πµn
for any n > N1 ,
(113)
where µ can be made arbitrarily large by choosing R large enough. In the case of Jn one may estimate |ρn,R (x)| by some constant times e−2πbn e−2πb|x−n| and 6(x) simply by a constant, which easily gives existence of D2 , N2 such that |Jn | ≤ D2 e−2πbn
for any n > N1 .
(114)
This proves the claim about the asymptotics for n → ∞. In the case of n → −∞ one uses the operator δ− in a completely analogous fashion. 6.5. Representation of ˜ . Assume that the set {τn ; n ∈ Z} represents a Cc∞ (R)-partition of unity. It will be convenient to choose the τn as translates of τ0 : τn (x) = τ0 (x − n). This can always be achieved: Let
τ0 (x) =
χ (x) = N −1
x
dt exp
− 41
|x| >
3 4
0
if
1
if |x|
K24 and l = A,
√ P (| Int(0 )| ≥ A) ≥ exp − w1 A − K25 l 1/3 (log l)2/3 . The size of the error term K25 l 1/3 (log l)2/3 in Theorem 4.1 is important because it determines what “bad” behaviors can be ruled out as unlikely – in particular, those which have probability at most exp(−w1 l − cl 1/3 (log l)2/3 )) for some c > K25 . Though our error term is likely not optimal – according to [16] the optimal error term may be of order log l – it is enough of an improvement over the corresponding results in [12] and [17] to enable us to establish an apparently near-optimal bound on the local roughness. The proof of our Theorem 4.1 relies on the following result, the halfspace version of (2.10). Theorem 4.2 ([4]). Let P be a percolation model on B(Z2 ) satisfying (2.7), positivity of τ and the ratio weak mixing property. There exist 412 , K26 such that for all x = y ∈ R2 and all dual sites u, v ∈ Hxy , P (u ↔ v via an open dual path in Hxy ) ≥
412 e−τ (v−u) . |v − u|K26
Proof of Theorem 4.1. Let s = l 2/3 (log l)1/3 and δ = K27 s 2 / l, with K27 to be specified. Let (y0 , . . . , yn , y0 ) be the s-hull skeleton of ∂(l + δ)K1 . For each i let yi be a dual site √ with yi ∈ Hyi−1 yi ∩ Hyi yi+1 and |yi − yi | ≤ 2 2. By (3.5), provided K27 is large enough we have Co({y0 , . . . , yn }) ⊃ lK1
and hence
| Co({y0 , . . . , yn })| ≥ A.
Further, n j =0
τ (yj +1 − yj ) ≤ W(∂(l + δ)K1 ) + 4κτ n ≤ w1 l + K28 l 1/3 (log l)2/3
(4.1)
Droplets in Random Cluster Models
749
(with yn+1 = y0 .) Therefore using the FKG property, Theorem 4.2, (4.1) and (3.2), P (| Int(0 )| ≥ A) ≥ P (0 encloses Co({y0 , . . . , yn })) ≥ P (yj ↔ yj +1 via a path in Hyj yj +1 for all j ≤ n) ≥
n j =0
P (yj ↔ yj +1 via a path in Hyj yj +1 )
n 4 (n+1) 12 ≥ K exp − τ (yj +1 − yj ) l 26 j =0
1/3 ≥ exp − w1 l − K29 l (log l)2/3 . # "
(4.2)
5. Upper Bounds for Open Dual Circuit Probabilities We need to develop a method of cutting a dual circuit across a bottleneck, modifying the bond configuration to create two dual circuits. The cutting procedure is simplified if the bottleneck is clean, in the following sense. The canonical path from dual site u = (x1 , y1 ) to dual site v = (x2 , y2 ) is the path, denoted ζuv , which goes horizontally from (x1 , y1 ) to (x2 , y1 ), then vertically to (x2 , y2 ). We call a bottleneck (u, v) clean if ζuv ⊂ Int(γ ) (except for the endpoints u, v.) The next lemma will enable us to restrict our cutting to clean bottlenecks. Lemma 5.1. If a dual circuit γ contains a (q, r)-bottleneck for some r > 3q > 0, then γ contains a clean (q, r/3)-bottleneck. Proof. Suppose γ contains a (q, r)-bottleneck (u, v). We have two disjoint paths from u to v: γ [u,v] and γ [v,u] (traversed backwards.) Each of these may intersect ζuv a number of times. Accordingly, ζuv contains a finite sequence of sites u = x0 , x1 , . . . , xm = v such that the segment ζi of ζuv between xi−1 and xi satisfies ζi ⊂ Int(γ ) for all i ∈ I and ζi ⊂ Ext(γ ) ∪ γ for all i ∈ / I , where I consists either of all odd i or of all even i. For i ∈ I , we call the segment of ζuv with endpoints xi−1 and xi an interior gap. Let ψ be a dual path from u to v in Int(γ ) with |ψ| ≤ q. We can extend ψ to a doubly infinite path ψ + by adding on (possibly non-lattice) paths ψ1 from v to ∞ and ψ2 from ∞ to u, both in Ext(γ ). The path ψ + divides the plane into two regions, AL ⊃ γ [v,u] to the left of ψ + and AR ⊃ γ [u,v] to the right. Replacing ψ with ζuv in the definition of ψ + , we obtain another doubly infinite path ζ + . The path ζ + is not necessarily self-avoiding, but R2 \ζ + has exactly two unbounded components BL and BR , to the left and right of ζ + , respectively. Since diam(ζuv ) ≤ q, there exist sites z1 ∈ γ [u,v] and z2 ∈ γ [v,u] for which d(zj , ζuv ) ≥ (r − q)/2 > q (j = 1, 2) and hence z1 ∈ BR , z2 ∈ BL . 
Let θ be a (possibly non-lattice) path from z1 to z2 in Int(γ ). Then θ must intersect ζ + , and hence must intersect ζuv , necessarily in some interior gap. Thus every θ from z1 to z2 in Int(γ ) must cross at least one interior gap, so there exists an interior gap ζi which separates z1 and z2 , that is, exactly one of z1 , z2 is in γ [xi−1 ,xi ] . It follows that (xi−1 , xi ) is a clean (q, (r − q)/2)-bottleneck. Since (r − q)/2 > r/3, the proof is complete. " #
750
K. S. Alexander
Define Rx = x + [−1/2, 1/2]2 and Rx+ = x + [−1, 1]2 . Let Q1 (u, v) = ∪x∈ζuv Rx and Q2 (u, v) = ∪x∈ζuv Rx+ . Note that |∂Q2 (u, v)| ≤ 4q + 8. Let J1 (u, v), J2 (u, v), . . . . be an enumeration of the subsets of ∂Q2 (u, v). We say a clean (q, r)-bottleneck (u, v) in a dual circuit γ is of type η if the set of bonds in ∂Q2 (u, v) which are contained (except possibly for endpoints) in Int(γ ) is precisely Jη (u, v). Following [17] we call a dual circuit γ r-large if diamτ (γ ) > r and r-small if diamτ (γ ) ≤ r. We assume we have a fixed but arbitrary algorithm for choosing a particular (q, r)bottleneck, which we then call primary, from any circuit containing one or more (q, r)bottlenecks. When a configuration ω includes an exterior dual circuit γ for which (u, v) is a primary (q, r)-bottleneck of type η, we can apply a procedure, which we term bottleneck surgery (on γ , at (u, v)) to create a new configuration denoted Yuvη (ω). Bottleneck surgery consists of replacing the configuration ω with the configuration given by if e ∈ ∂Q1 (u, v), 1, Yuvη (ω)e = 0, (5.1) if e∗ ∈ Jη (u, v), ω , otherwise, e for each bond e. The configuration Yuvη (ω) then contains two or more disjoint open dual circuits αi , each consisting of some dual bonds of γ and some dual bonds of Jη (u, v), with no open dual path connecting αi to αj for i = j , and with ∪i Int(αi ) = Int(γ )\Q2 (u, v). Further, |Q2 (u, v)| +
| Int(αi )| ≤ K30 r 2 ,
(5.2)
(5.3)
i:αi (κτ r/3)-small
and, since γ is exterior, there is no open dual path connecting αi to αj for i = j . We call each αi a (q, r)-offspring or a (q, r)-descendant of γ . A (q, r)-offspring of a (q, r)descendant is also a (q, r)-descendant, iteratively. We may perform bottleneck surgery on each (q, r)-offspring of γ which contains a clean (q, r)-bottleneck, and iterate this process until no descendant of γ contains such a clean (q, r)-bottleneck (necessarily after a finite number of surgeries.) The bottleneck-free (q, r)-descendants are called final (q, r)-descendants. Among final (q, r)-descendants, the one enclosing maximal area is called the maximal (q, r)-descendant of γ and denoted αmax,γ . The set of all (κτ r/3)-large final (q, r)-descendants of γ is denoted F(q,r) (γ ); the non-maximal among these form the set F(q,r) (γ ) = F(q,r) (γ )\{αmax,γ }. Note that since γ is exterior, so is each offspring of γ . It is useful to note the following general fact about norms on R2 , which can be verified by a simple geometric argument. Let C be a convex set; then W(∂C) ≤ 6 diamτ (C). Define u(c, A) = max(w1 A1/2 − cA1/6 , 0)
(5.4)
Droplets in Random Cluster Models
and D(q,r) (γ ) =
751
diamτ (α),
α∈F(q,r) (γ )
D(q,r) (γ ) =
diamτ (α).
α∈F(q,r) (γ )
The following lemma is related to (5.4). Lemma 5.2. Let γ be a circuit, let A = | Int(γ )|, and let q ≥ 1, r ≥ 15q. Then D(q,r) (γ ) ≥
1 √ w1 A. 6
(5.5)
Proof. We may assume γ contains a clean (q, r)-bottleneck (u, v), for otherwise (5.5) is immediate from (5.4). We have √ 1 q + 2 2 ≤ r. 4 Let S denote the union of Q2 (u, v) and all (κτ r/3)-small offspring of γ , and let R = {z ∈ R2 : dτ (z, Q2 (u, v)) ≤ κτ r/3}. Then S ⊂ R,
√ √ 2 2r diam(R) ≤ q + 2 2 + 3
and √ √ q +2 2+ 2r 2 |R| ≤ z ∈ R : d(z, Q2 (u, v)) ≤ ≤π 3 2
√ 2 2r 3
2 ≤ r 2 . (5.6)
Note that the set {α1 , α2 , . . . .} of (κτ r/3)-large offspring of γ can be divided into two disjoint classes: right offspring, which intersect γ [u,v] , and left offspring, which intersect γ [v,u] . Also, every point of γ is either√ in a left offspring, in a right offspring, or in S. The diameter of Q2 (u, v) is at most q + 2 2 ≤ r/6, while the diameters of γ [u,v] √ and γ [v,u] are at least r, so the right and left classes each include at least one (5κτ r/6 2)-large offspring,. Further, if D(q,r) (γ ) ≥ diamτ (γ ),
(5.7)
then (5.5) follows from (5.4). Let w and x be sites of γ with dτ (w, x) = diamτ (γ ). At least one of these points is not in Q2 (u, v), so we may assume w is in some αi . There are now four possibilities. First, √ if also x ∈ αi , then (5.7) holds. Second, if instead x ∈ S, then there exists a (5κτ r/6 2)-large offspring αj = αi , and we have D(q,r) (γ ) ≥ diamτ (αi ) + diamτ (αj ) √ √ κτ r 5κτ r + √ ≥ diamτ (γ ) − 2κτ (q + 2 2) − 3 6 2 ≥ diamτ (γ ),
752
K. S. Alexander
and again (5.7) holds. Third, suppose x ∈ αk for some k = i and there exists a third (κτ r/3)-large offspring αl with l = i, k. Let dm = diamτ (αm ). Then √ √ di + dk ≥ diamτ (γ ) − 2κτ (q + 2 2) ≥ diamτ (γ ) − dl , so once more (5.7) holds. The fourth possibility is that x ∈ αk for some k =√i and αi , αk are the only (κτ r/3)large offspring; each is necessarily actually (5κτ r/6 2)-large. From (5.4) we have A ≤ |R| + | Int(αi )| + | Int(αk )| ≤ r 2 +
36 2 (di + dk2 ). w12
Using this and the fact that w1 ≤ 4κτ (since the unit square encloses unit area) we obtain w12 4 A ≤ κτ2 r 2 + di2 + dk2 ≤ 2di dk + di2 + dk2 . 36 9 Taking square roots yields (5.5). " # For k ≥ 0 define the events My (k, q, r, A, A , d , t) = |F(q,r) (0 )| = k ∩ [| Int(0 )| = A] ∩ | Int(αmax,0 )| = A ∩ D(q,r) (0 ) ∈ [d , d + 1)] ∩ [W(∂ Co(αmax,0 )) ≥ t] . We first consider k = 0, which means αmax,γ = γ and D(q,r) (γ ) = 0; larger values will be handled later by induction. Proposition 5.3. Assume (2.7) and either (i) the ratio weak mixing property or (ii) both the weak mixing property and the near-Markov property for open√circuits. Then there exist constants 4i , Ki as follows. Let A ≥ 1, t+ ≥ 0, t = w1 A + t+ ≥ 2, and 413 t > r > 15q > K31 log t. Then P (M0 (0, q, r, A, A, 0, t)) ≤ e−u(K32 r
2/3 ,A)− 1 t 2 +
.
(5.8)
Proof. From the definition of w1 we may assume t+ ≥ 0. It follows easily from (2.9) that for some K33 , K34 , P (diamτ (0 ) ≥ t) ≤ P (m ≤ diamτ (0 ) < m + 1) m>t−1
≤
e−τ (y−x)
m>t−1 x ∗ ,y ∗ ∈Bτ (0,m+1)∩(Z2 )∗ :τ (y−x)≥m
≤ K33 t 4 e−t ≤ e−u(K34 r
2/3 ,A)−t +
,
so by (5.4) it suffices to consider M0 (q, r, A, t) = M0 (0, q, r, A, A, 0, t) ∩ [t/6 ≤ diamτ (0 ) ≤ t].
(5.9)
Droplets in Random Cluster Models
753
Suppose ω ∈ M0 (q, r, A, t). Fix α > 0 to be specified, let s = αt 2/3 r 1/3 and suppose HSkels (0 ) = (y0 , . . . , ym+1 ). By (3.1), m ≤ K35 α −1 t 1/3 r −1/3 .
(5.10)
Let Bi = B(yi , 4r) ∩ (Z2 )∗ . Let I = {i ≤ m : |yi+1 − yi | > 8r}. For each i ∈ I [y ,y ] there is a segment 0[wi ,xi ] ⊂ 0 i i+1 entirely outside Bi ∪ Bi+1 , with wi ∈ ∂Bi and xi ∈ ∂Bi+1 . We next show that [wj ,xj ]
d(0[wi ,xi ] , 0
) > q/2
for all i = j in I.
(5.11)
[w ,x ]
If not, there exist u ∈ 0[wi ,xi ] , v ∈ 0 j j and a dual path ψ from u to v in Co(0 ) [y ,y ] with |ψ| ≤ q. Let a be the last site of ψ in 0 i i+1 , and b the first site of ψ after a [y ,y ] which is in some segment 0 k k+1 with k = i. Since all sites yl are extreme points, we must have ψ (a,b) ⊂ Int(0 ). We claim that (a, b) is a (q, 3r)-bottleneck. By Lemma 5.1 this is a contradiction, so our claim will establish (5.11). Suppose i < k; the proof if i > k is similar. We have ψ ⊂ B(u, q) and u ∈ / Bi+1 so ψ ∩ B(yi+1 , 3r + 1) = φ. Therefore 0[a,b] contains a segment in B(Bi+1 ) which includes yi+1 and has diameter at least 3r. Similarly since v ∈ / Bi , 0[b,a] contains a segment in B(Bi ) which includes yi and has diameter at least 3r. This proves the claim and thus (5.11). From (3.6) and (5.10) we have τ (xi − wi ) ≥ W(HPaths (0 )) − K36 mr i∈I
≥ W(∂ Co(0 )) − K37 α 2 t 1/3 r 2/3 − K36 mr
(5.12)
≥ t − K38 (α 2 + α −1 )t 1/3 r 2/3 . Equation (5.12) shows that it is optimal to take α of order 1 in our choice of s, so we now set α = 1. For wi ∈ ∂(Bi ∩Z2 ) and xi ∈ ∂(Bi+1 ∩Z2 ) for each i ≤ m, let A(w0 , x0 , . . . , wm , xm ) be the event that for each i ∈ I there is an open dual path αi from wi to xi in Bτ (wi , t), with d(αi , αj ) > q/2 for all i = j . Then we have shown P M0 (q, r, A, t)) ∩ [HSkels (0 ) = (y0 , . . . , ym+1 )] ≤ ··· P (A(w0 , x0 , . . . , wm , xm )) (5.13) w x w x 0
0
2 m+1
≤ (K39 r )
m
m
max
w0 ,x0 ,...,wm ,xm
P (A(w0 , x0 , . . . , wm , xm )).
Provided K31 is sufficiently large, Lemma 3.2, (5.12) and induction give P (A(w0 , x0 , . . . , wm , xm )) ≤ 2m P (wi ↔ xi ) i∈I
(5.14)
m −t+2K38 t 1/3 r 2/3
≤2 e
which with (5.10) and (5.13) yields 1/3 2/3 P M0 (q, r, A, t) ∩ [HSkels (0 ) = (y0 , . . . , ym+1 )] ≤ e−t+K40 t r .
(5.15)
754
K. S. Alexander
The number of possible (y0 , . . . , ym+1 ) in (5.15) is at most (K41 t 2 )m+1 , which with (5.15) yields P (M0 (q, r, A, t)) ≤ e−t+K42 t
1/3 r 2/3
,
(5.16)
provided K31 , and hence r, is large enough. It is easily verified that, provided 413 is small enough, 1 K42 (t+ + w1 A1/2 )1/3 r 2/3 ≤ 2K42 (w1 A1/2 )1/3 r 2/3 + t+ , 2 by considering two cases according to which of t+ and w1 A1/2 is larger. This and (5.16) establish (5.8) for M0 ; as we have noted, this and (5.9) establish (5.8) as given. " # Remark 5.4. Let I be any increasing event. Since the event on the left side of (5.14) is decreasing, its probability is not increased by conditioning on I . It follows easily that Proposition 5.3 remains true if the probability on the left side of (5.8) is conditioned on I , even though M0 (0, q, r, A, A, 0, t) is not itself a decreasing event. Under (2.7), open dual bonds do not percolate, so for every bounded set A there is a.s. an innermost open circuit surrounding A; we denote this circuit by O(A). An enclosure event is an event of form ∩i≤n (Open(αi ) ∩ [αi ↔ ∞]) , where α1 , . . . , αn are circuits (of regular bonds.) This includes the degenerate case of 2 the full space {0, 1}B(Z ) . Clearly any such event is increasing. Proposition 5.5. Assume (2.7), the weak mixing property and the near-Markov property for open circuits. There exist constants K√i , 4i as follows. Let A ≥ A ≥ 3, k ≥ 0, t+ ≥ √ 0, t = w1 A + t+ , d ≥ 0, and 414 (w1 A + d + t+ ) ≥ r ≥ 15q ≥ K43 log A. Then 1 1 2/3 P (M0 (k, q, r, A, A , d , t)) ≤ exp −u(K44 r , A) − t+ − d (5.17) 60 10 and
1 P (M0 (k, q, r, A, A , d , t)) ≤ exp − d P (M0 (0, q, r, A , A , 0, t)). 2
(5.18)
Remark 5.6. The proof of Proposition 5.5 actually shows slightly more than (5.18). For U ⊂ R2 let My,U (k, q, r, A, A , d , t) = My (k, q, r, A, A , d , t) ∩ [Int(y ) ⊂ U ]. Under the assumptions of the proposition, for every enclosure event E and every U ⊂ R2 , we have 1 1 P (M0,U (k, q, r, A, A , d , t) | E) ≤ exp −u(K44 r 2/3 , A) − t+ − d (5.19) 60 10 and
P M0,U (k, q, r, A, A , d , t) | E 1 ≤ exp − d max P My,U (0, q, r, A , A , 0, t) | E . 2 y∈U ∩Z2
(5.20)
We do not need this stronger result here, but it will be useful in a forthcoming paper.
Droplets in Random Cluster Models
755
√ Proof of Proposition 5.5. We will refer to the requirement 414 (w1 A + d + t+ ) ≥ r as the size condition, and to all other assumptions of the proposition collectively as the basic assumptions. We first prove (5.17). We proceed by induction on k, using Proposition 5.3 for k = 0. Fix q, r and define for U ⊂ R2 , Ly,U (k, A, A , d, d , t) = My,U (k, q, r, A, A , d , t) ∩ [d ≤ D(q,r) (y ) < d + 1], (5.21) where 1 √ w1 A − 1 ≤ d ≤ K45 A, 6
d≥
1 t+ − 1. 6
(5.22)
We omit the U in the notation when U = R2 . If K45 is large enough then, from Lemma 5.2 and the lattice nature of y , Ly,U (k, A, A , d, d , t) is empty if any of the inequalities in (5.22) fails. Note that for some K46 , My (k, q, r, A, A , d , t) ⊂ [diam(y ) ≤ K46 A].
(5.23)
Our induction hypothesis is that for some constants 4i , Ki for all j < k, all A, A , t, d satisfying the basic assumptions, all U ⊂ R2 and all d satisfying (5.22), for every enclosure event E, we have κτ 9 P (L0,U (j, A, A , d, d , t) | E) ≤ exp − d + j r ; (5.24) 10 40 if the size condition is also satisfied, then in addition P (L0,U (j, A, A , d, d , t) | E)
≤ exp −u(K32 r
2/3
1 1 , A) − t+ − d , 60 10
(5.25)
with K32 from (5.8). We wish to verify these hypotheses for j = k. For j = 0 it suffices to consider d = 0 and (5.25) is Proposition 5.3 (together with Remark 5.4), while (5.24) follows easily from the first inequality in (5.9), if K43 is large. Hence we may assume k ≥ 1 and fix A, A , d, d . Let Q(u, v, η) denote the event that L0 (k, A, A , d, d , t) occurs with (u, v) a primary (q, r)-bottleneck in 0 of type η, and let R(u, v, η) = {Yuvη (ω) : ω ∈ Q(u, v, η)}. Let E be an enclosure event; it is easy to see that bottleneck surgery cannot destroy E, that is, Yu,v,η (Q(u, v, η) ∩ E) ⊂ R(u, v, η) ∩ E. (This is the reason for considering only enclosure events, not general increasing events.) Since |∂Q1 (u, v)|+|Jη (u, v)| ≤ K47 q, we then have from the bounded energy property: P (Q(u, v, η) | E) ≤ eK48 q P (R(u, v, η) | E).
(5.26)
Fix u, v, η and for y1 , . . . , yl ∈ Z2 and l, Ai , ki , di , di ≥ 0 in Z, let Z = Z(l, y1 , . . . , yl , A1 , . . . , Al , d1 , . . . , dl , d1 , . . . , dl , k1 , . . . , kl ) denote the event that there exist disjoint exterior open dual circuits α1 , . . . , αl such that:
756
K. S. Alexander
αi surrounds yi , | Int(αi )| = Ai , diamτ (αi ) ≥ κτ r/3, |F(q,r) (αi )| = ki , di ≤ D(q,r) (αi ) < di + 1 and di ≤ D(q,r) (αi ) < di + 1 for all i ≤ l; (ii) letting αmain denote the open dual circuit enclosing maximal area among all descendants of all αi , we have√αmain a descendant of α1 satisfying | Int(αmain )| = A and W(∂ Co(αmain )) ≥ w1 A + t+ ; (iii) there is no open dual path connecting αi to αn for any two indices i = n.
(i)
We suppress the parameters in Z when confusion is unlikely. Then considering only (κτ r/3)-large offspring we see that R(u, v, η) ⊂ ∪ Z(l, y1 , . . . , yl , A1 , . . . , Al , d1 , . . . , dl , d1 , . . . , dl , k1 , . . . , kl ), (5.27) where the union is over all parameters satisfying 2 ≤ l ≤ min(4q, k + 1), A1 ≥ A ,
yi ∈ (Z2 )∗ with d(yi , ζuv ) ≤ 2,
A − K30 r 2 ≤
Ai ≤ A,
i≤l
Ai ≥
κτ r , 6
ki = k + 1 − l,
1 w1 Ai − 1 ≤ di ≤ K45 Ai , di ≤ di , 6 d − l ≤ d1 + di ≤ d , d − l ≤ di ≤ d, 2≤i≤l
d1 − d1 + 1 ≥
√ 1 (w1 A + t+ ). 6
(5.28)
i≤l
(5.29) (5.30)
i≤l
(5.31)
Here (5.31) and the first inequality in (5.29) follow from (ii) above and (5.4), and $K_{30}$ is from (5.3). Temporarily fix such a set of parameters and let $\nu_1, \dots, \nu_l$ be circuits with $\mathrm{diam}_\tau(\nu_i) \ge \kappa_\tau r/3$ for all $i$, and $\mathrm{Int}(\nu_i) \cap \mathrm{Int}(\nu_j) = \emptyset$ for $i \ne j$.
Define events
√ L˜ i = ∪3≤B≤Ai Lyi (ki , Ai , B, di , di , w1 B), i = 1, L˜ 1 = Ly1 (k1 , Ai , A , d1 , d1 , t), Ti = [O(yi ) = νi ], Yi = L˜ i ∩ Ti for i ≤ m, T = ∩i≤l Ti .
Then Z ∩ T ⊂ ∩i≤l Yi .
(5.32)
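Note that as in (3.14),
\[ Y_i = \mathrm{Open}(\nu_i) \cap [\nu_i \leftrightarrow \infty] \cap G_i \quad \text{for some } G_i \in \mathcal{G}_{\mathcal{B}(\mathrm{Int}(\nu_i))}, \text{ for every } i \le l. \tag{5.33} \]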
Define regions and events R = ∪i≤l−1 Int(νi ), F = ∩i≤l−1 Ext(νi ), ˜ = ∩i≤l−1 Gi , H = ∩i≤l−1 Open(νi ), G
N = ∩i≤l−1 [νi ↔ ∞],
and let Ll denote the event that L˜ l occurs on B(F ). Then N ∩ E ∩ H = E R ∩ EF ∩ H ˜ ∩i≤l−1 Yi = H ∩ N ∩ G,
for some ER ∈ GB(R) , EF ∈ GB(F ) ,
(5.34) (5.35)
and ∩i≤l Yi ⊂ Ll .
(5.36)
The relation between area $A_i$ and diameter $d_i$ tells us roughly whether the circuit $\alpha_i$ (or its collection of descendants) is regular or irregular; we thereby subdivide the circuits into “large regular”, “small regular” and “irregular” categories as follows:
\[ I_1 = \{i \le l : d_i < 4w_1\sqrt{A_i},\ A_i \ge c_1 r^2\}, \quad I_2 = \{i \le l : d_i < 4w_1\sqrt{A_i},\ A_i < c_1 r^2\}, \quad I_3 = \{i \le l : d_i \ge 4w_1\sqrt{A_i}\}, \]
where $c_1 = \max\big(1/\epsilon_{14}^2,\ (3K_{32}/w_1)^3\big)$ is chosen so that
\[ u(K_{32} r^{2/3}, A_i) \ge \tfrac{2}{3} w_1 \sqrt{A_i}, \quad i \in I_1. \tag{5.37} \]
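The role of the second entry in $c_1$ can be checked directly. The following is only a sketch: it assumes that $u$ has the form $u(c, A) = w_1\sqrt{A} - cA^{1/6}$, which is suggested by the exponents appearing in (5.66) but is not restated in this section. Under that assumption (5.37) reduces to a one-line computation.

```latex
% Assumption (not restated in this section): u(c, A) = w_1 \sqrt{A} - c A^{1/6}.
\begin{align*}
u(K_{32} r^{2/3}, A_i) \ge \tfrac{2}{3} w_1 \sqrt{A_i}
  &\iff K_{32}\, r^{2/3} A_i^{1/6} \le \tfrac{1}{3} w_1 A_i^{1/2} \\
  &\iff A_i^{1/3} \ge \frac{3K_{32}}{w_1}\, r^{2/3} \\
  &\iff A_i \ge \Big(\frac{3K_{32}}{w_1}\Big)^{3} r^{2}.
\end{align*}
% The last condition holds for i in I_1, since there
% A_i >= c_1 r^2 >= (3K_{32}/w_1)^3 r^2.
```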
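Let
\[ \mu_i = \max\Big( u(K_{32} r^{2/3}, A_i) + \tfrac{1}{10} d_i',\ \tfrac{9}{10} d_i - \tfrac{\kappa_\tau}{40} k_i r \Big), \quad i \in I_1 \setminus \{1\}, \]
\[ \mu_i = \tfrac{9}{10} d_i - \tfrac{\kappa_\tau}{40} k_i r, \quad i \in I_2 \cup I_3, \]
\[ \mu_1 = \max\Big( u(K_{32} r^{2/3}, A_1) + \tfrac{1}{60} t_+ + \tfrac{1}{10} d_1',\ \tfrac{9}{10} d_1 - \tfrac{\kappa_\tau}{40} k_1 r \Big) \quad \text{if } 1 \in I_1 \]
(cf. (5.24)). Now $H \cap E_F$ is an enclosure event so by the induction hypotheses (5.24) and (5.25), summing over $B \le A_l$ gives
\[ P(L_l \mid H \cap E_F) \le A_l e^{-\mu_l}. \tag{5.38} \]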
(Note that the size condition can be used here for $i \in I_1$.) Since $|\nu_i| \ge \epsilon_{15} r$ for all $i$, from (2.5) and (2.6), provided $K_{43}$ is sufficiently large we get
\[ P(L_l \cap E_F \mid H \cap \tilde{G} \cap E_R) \le (1 + e^{-\epsilon_{15} r/2})\, P(L_l \cap E_F \mid H) \tag{5.39} \]
and
\[ P(E_F \mid H) \le (1 + e^{-\epsilon_{15} r/2})\, P(E_F \mid H \cap \tilde{G} \cap E_R). \tag{5.40} \]
Combining (5.33)–(5.40) we obtain P (∩i≤l Yi ) ∩ E ≤ P (Ll ∩ (∩i≤l−1 Yi ) ∩ E)P (Tl | Ll ∩ (∩i≤l−1 Yi ) ∩ E) ˜ ∩ ER ∩ EF )P (Tl | Ll ∩ (∩i≤l−1 Yi ) ∩ E) = P (Ll ∩ H ∩ G ˜ ∩ ER ) ≤ (1 + e−415 r/2 )P (Ll ∩ EF | H )P (H ∩ G · P (Tl | Ll ∩ (∩i≤l−1 Yi ) ∩ E) ˜ ∩ ER ) = (1 + e−415 r/2 )P (Ll | EF ∩ H )P (EF | H )P (H ∩ G · P (Tl | Ll ∩ (∩i≤l−1 Yi ) ∩ E) ˜ ∩ ER )P (H ∩ G ˜ ∩ ER ) ≤ (1 + e−415 r/2 )2 Al e−µl P (EF | H ∩ G
(5.41)
· P (Tl | Ll ∩ (∩i≤l−1 Yi ) ∩ E) = (1 + e−415 r/2 )2 Al e−µl P ((∩i≤l−1 Yi ) ∩ E)P (Tl | Ll ∩ (∩i≤l−1 Yi ) ∩ E). Summing over νl (which appears via Tl ), dividing by P (E) and iterating this (taking H and N to be the full space and R = φ, F = R2 , at the last iteration step) we obtain using (5.32) −415 r/4 2l −µi l (5.42) ) Ai e ≤ 2A exp − µi . P (Z | E) ≤ (1 + e i≤l
i≤l
We now want to sum (5.42) over all parameters of Z allowed in (5.27). We first view l as fixed and allow the other parameters to vary. Note that the number of parameter choices is at most (K49 A)5l+1 , and the number of possible (u, v, η) is at most (K49 A)3 , for some K49 . Suppose we can show that, under the basic assumptions,
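\[ \sum_{i \le l} \mu_i \ge \tfrac{9}{10} d - \tfrac{\kappa_\tau}{40} k r + \tfrac{\kappa_\tau}{100} (l - 1) r \tag{5.43} \]
and, if the size condition is also satisfied,
\[ \sum_{i \le l} \mu_i \ge u(K_{32} r^{2/3}, A) + \tfrac{1}{10} d' + \tfrac{1}{60} t_+ + \tfrac{\kappa_\tau}{100} (l - 1) r. \tag{5.44} \]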
Then from (5.27) and (5.42), 9 κτ κτ P (R(u, v, η) | E) ≤ 2(K49 A)5l+2 exp − d + kr − (l − 1)r , 10 40 40 and if the size condition is satisfied, 1 1 P (R(u, v, η) | I ) ≤ (K49 A)5l+2 exp − max u(K32 r 2/3 , A) + d + t+ , 10 60
κτ 9 κτ (l − 1)r . d − kr − 40 100 10
Thus, summing over u, v, η, then over l, and using r > K43 log A and (5.26), we obtain (5.24) and (5.25) for j = k.
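Now (5.43) is a direct consequence of (5.28) and (5.30), so we turn to (5.44) and assume that the size condition holds. Let
\[ \delta_i = \begin{cases} \tfrac{1}{10} d_i', & i \ge 2, \\ \tfrac{1}{10} d_1' + \tfrac{1}{60} t_+, & i = 1, \end{cases} \]
and set $\beta_1 = \beta_3 = 1/2$, $\beta_2 = 1/10$. We claim that
\[ \mu_i \ge \beta_j w_1 \sqrt{A_i} + \delta_i + \tfrac{1}{10} d_i - 1 \quad \text{for every } i \in I_j,\ j = 1, 2, 3. \tag{5.45} \]
Observe that
\[ d_i \ge (k_i + 1) \frac{\kappa_\tau r}{3} - 1 \ge (k_i + 1) \frac{\kappa_\tau r}{4}, \qquad d_i' \ge k_i \frac{\kappa_\tau r}{3} - 1 \ge k_i \frac{\kappa_\tau r}{4} \tag{5.46} \]
and hence
\[ \mu_i \ge \tfrac{4}{5} d_i, \quad i \le l. \tag{5.47} \]
This yields
\[ \mu_i \ge \tfrac{4}{5} d_i \ge w_1 \sqrt{A_i} + \tfrac{11}{20} d_i, \quad i \in I_3, \tag{5.48} \]
which proves (5.45) for $i \in I_3 \setminus \{1\}$. For $i \in I_1$ we can use (5.37), (5.46) and a convex combination of the lower bounds (5.47) and $u(K_{32} r^{2/3}, A_i)$ for $\mu_i$ to obtain
\[ \mu_i \ge \tfrac{3}{4} u(K_{32} r^{2/3}, A_i) + \tfrac{1}{5} d_i \ge \tfrac{1}{2} w_1 \sqrt{A_i} + \tfrac{1}{5} d_i, \quad i \in I_1, \tag{5.49} \]
which proves (5.45) for $i \in I_1 \setminus \{1\}$. From (5.29),
\[ d_i \ge \tfrac{1}{8} w_1 \sqrt{A_i} + \tfrac{1}{4} d_i - \tfrac{3}{4}, \quad i \le l, \]
and hence by (5.47),
\[ \mu_i \ge \tfrac{4}{5} d_i \ge \tfrac{1}{10} w_1 \sqrt{A_i} + \tfrac{1}{5} d_i - \tfrac{3}{5}, \quad i \le l, \tag{5.50} \]
which proves (5.45) for $i \in I_2 \setminus \{1\}$. We need slightly different estimates for $i = 1$. If $1 \in I_3$ then using (5.48) and (5.31) we obtain
\[ \mu_1 \ge w_1 \sqrt{A_1} + \tfrac{1}{2} d_1 \ge w_1 \sqrt{A_1} + \tfrac{1}{4} d_1 + \tfrac{1}{4}(d_1' - 1) + \tfrac{1}{24} t_+, \tag{5.51} \]
which proves (5.45) for $i = 1$. If $1 \in I_1$ then by (5.49) and (5.31),
\[ \mu_1 \ge \tfrac{1}{2} w_1 \sqrt{A_1} + \tfrac{1}{5} d_1 \ge \tfrac{1}{2} w_1 \sqrt{A_1} + \tfrac{1}{10} d_1 + \tfrac{1}{10}(d_1' - 1) + \tfrac{1}{60} t_+, \tag{5.52} \]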
which again proves (5.45) for i = 1. Finally if 1 ∈ I2 , then by (5.50), (5.52) remains valid with 1/10 in place of 1/2, and 3/5 subtracted from the right side, once again proving (5.45) for i = 1. Thus (5.45) holds in all cases. The next step is to sum (5.45) over i. There are 2 cases. √ Case 1. d ≥ 20w1 A. Then using (5.45), (5.30) and (5.46), 1 µi ≥ δi + di + 1 10 i≤l
i≤l
≥
1 1 1 1 (d − l) + t+ + (d − l) + di + l 10 60 20 20
√ 1 1 κτ ≥ d + t+ + w1 A, + lr, 10 60 80
(5.53)
i≤l
which proves (5.44). √ Case 2. d < 20w1 A. By (5.30), (5.31) and (5.46),
√ 6d + t+ ≤ 6(d + l + 1) ≤ 7d ≤ 140w1 A.
This and the size condition imply r < 141414 w1
√
√ κτ A A≤ 80K30 w1
if 414 is small enough, with K30 as in (5.3), and then √ √ K30 r 2 κτ r A − K30 r 2 ≥ A 1 − . ≥ A− A 80w1 Let Sj =
Ai ,
j = 1, 2, 3,
and
(5.54)
S = S1 + S2 + S3 .
i∈Ij
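Provided $\epsilon_{14}$ is small enough, we have by (5.3) and (5.54),
\[ w_1 \sqrt{S} \ge w_1 \sqrt{A} - \frac{\kappa_\tau r}{80}. \tag{5.55} \]
It is easily checked that, for $\theta \le 1$,
\[ \sqrt{a + b} \le \sqrt{a} + \theta \sqrt{b} \quad \text{for } 0 \le b \le 4\theta^2 a. \tag{5.56} \]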
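For completeness, the elementary inequality (5.56) can be verified as follows. This is a sketch under the additional assumptions $a \ge 0$ and $0 < \theta \le 1$, which are implicit in the way the inequality is applied.

```latex
% Sketch of (5.56), assuming a >= 0 and 0 < theta <= 1.
\begin{align*}
\sqrt{a+b} \le \sqrt{a} + \theta\sqrt{b}
  &\iff a + b \le a + 2\theta\sqrt{ab} + \theta^{2} b
    && \text{(both sides nonnegative, so squaring is reversible)} \\
  &\iff (1-\theta^{2})\, b \le 2\theta\sqrt{ab} \\
  &\Longleftarrow \sqrt{b} \le 2\theta\sqrt{a}
    && \text{(since } 0 \le 1-\theta^{2} \le 1\text{)} \\
  &\iff b \le 4\theta^{2} a .
\end{align*}
```

With $\theta = 1/2$ the condition reads $b \le a$, so the inequality can be applied repeatedly, adding one area at a time to the largest one.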
Choose $i_{13}$, $i_2$ satisfying
\[ A_{i_{13}} = \max_{i \in I_1 \cup I_3} A_i, \qquad A_{i_2} = \max_{i \in I_2} A_i. \]
(If $I_1 \cup I_3$ or $I_2$ is empty then the corresponding $i_{13}$ or $i_2$ is undefined.) We now consider two subcases.
Case 2a. $I_2 = \emptyset$ or $A_{i_2} \le \frac{1}{25}(S_1 + S_3)$. From the definition of $\mu_i$ if $i_{13} \in I_1$, and from (5.45) if $i_{13} \in I_3$, we have
\[ \mu_{i_{13}} \ge w_1 \sqrt{A_{i_{13}}} - \lambda + \delta_{i_{13}}, \tag{5.57} \]
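where $\lambda = \min\big(K_{32} r^{2/3} A_{i_{13}}^{1/6},\ w_1 \sqrt{A_{i_{13}}}\big)$. Therefore by (5.56) with $\theta = 1/2$,
\[ \sum_{i \in I_1 \cup I_3} \mu_i \ge w_1 \sqrt{S_1 + S_3} - \lambda + \sum_{i \in I_1 \cup I_3} \delta_i + \sum_{i \in I_1 \cup I_3,\ i \ne i_{13}} \tfrac{1}{10} d_i - l. \tag{5.58} \]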
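Hence by (5.56) with $\theta = 1/25$, and (5.55), (5.46), (5.58) and (5.45),
\[ \sum_{i \le l} \mu_i \ge w_1 \sqrt{S} - \lambda + \sum_{i \le l} \delta_i + \frac{\kappa_\tau}{40} (l - 1) r - l \]
\[ \ge w_1 \sqrt{A} - \frac{\kappa_\tau r}{80} - \lambda + \tfrac{1}{10}(d' - l) + \tfrac{1}{60} t_+ + \frac{\kappa_\tau}{40} (l - 1) r - l \]
\[ \ge u(K_{32} r^{2/3}, A) + \tfrac{1}{10} d' + \tfrac{1}{60} t_+ + \frac{\kappa_\tau}{100} (l - 1) r, \tag{5.59} \]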
which gives (5.44).
Case 2b. $A_{i_2} > \frac{1}{25}(S_1 + S_3)$. Let us relabel $(A_i, i \in I_2)$ as $B_1 \ge \cdots \ge B_n$. We have $S_1 + S_3 \le 25 c_1 r^2$, while from (5.3), provided $\epsilon_{14}$ is small enough, $S \ge A - K_{30} r^2 \ge 32{,}525\, c_1 r^2$, so
\[ S_2 \ge 32{,}500\, c_1 r^2 \ge 325 \sum_{m=1}^{100} B_m, \]
which implies
\[ \frac{1}{20} \Big( \sum_{m=101}^{n} B_m \Big)^{1/2} \ge \frac{9}{10} \Big( \sum_{m=1}^{100} B_m \Big)^{1/2}. \]
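The constants $325$, $1/20$ and $9/10$ in the last implication fit together as in the following sketch, where the abbreviation $T$ for the sum of the hundred largest areas is introduced here only for the computation. Each $B_m$ is an area with index in $I_2$, hence less than $c_1 r^2$, which is what gives $325\,T \le 32{,}500\, c_1 r^2$.

```latex
% Notation introduced for this check only: T = \sum_{m=1}^{100} B_m.
\begin{align*}
\sum_{m=101}^{n} B_m = S_2 - T &\ge 325\,T - T = 324\,T, \\
\frac{1}{20}\Big(\sum_{m=101}^{n} B_m\Big)^{1/2}
  &\ge \frac{1}{20}\sqrt{324}\; T^{1/2}
   = \frac{18}{20}\, T^{1/2}
   = \frac{9}{10}\, T^{1/2}.
\end{align*}
```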
Using this, and using (5.56) twice (with θ = 1 and with θ = 1/20), we get n 1 Bi 10 m=1 1/2 100 n 1 1 ≥ Bm + Bi 10 10 m=1 m=101 1/2 1/2 1/2 100 n 100 n 1 1 9 ≥ Bm + Bi + Bi − Bm 20 20 10 m=1 m=101 m=101 m=1 ≥ S2 . (5.60)
If $I_1 \cup I_3 \ne \emptyset$ then (5.58) remains valid. This, with (5.45), (5.46) and (5.60), shows that, whether $I_1 \cup I_3 = \emptyset$ or not, (5.59) (with $\lambda = 0$ if $I_1 \cup I_3 = \emptyset$) still holds.
The proof of (5.44), and thus of (5.25), is now complete. Taking $y = 0$ and $E$ the full configuration space in (5.25), and summing over $d$ satisfying (5.22) shows that
\[ P(M_0(k, q, r, A, A', d', t)) \le K_{45} A \exp\Big( -u(K_{32} r^{2/3}, A) - \tfrac{1}{60} t_+ - \tfrac{1}{10} d' \Big). \tag{5.61} \]
This proves (5.17), with $K_{44} = 2K_{32}$. It remains to prove (5.18). This is similar to the proof of (5.25), so we will only describe the changes. Again fix $q, r$. We make the same induction hypothesis, except that (5.25) is replaced by
\[ P(L_{0,U}(j, A, A', d, d', t) \mid E) \le \exp\Big( -\tfrac{4}{5} d' \Big) \sup_{y \in U \cap \mathbb{Z}^2} P(M_{y,U}(0, q, r, A', A', 0, t) \mid E), \tag{5.62} \]
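and the requirement that the size condition be satisfied is removed. This hypothesis is true for $j = 0$, where only $d = 0$ is relevant; hence we fix $k \ge 1$ and $A, A', d, d'$. Let
\[ f(A', t) = -\log \sup_{y \in U \cap \mathbb{Z}^2} P(M_{y,U}(0, q, r, A', A', 0, t) \mid E). \]
In place of $\mu_i$ we use
\[ \hat{\mu}_i = \tfrac{9}{10} d_i - \tfrac{\kappa_\tau}{40} k_i r, \quad i \ge 2, \qquad \hat{\mu}_1 = \max\Big( \tfrac{4}{5} d_1' + f(A', t),\ \tfrac{9}{10} d_1 - \tfrac{\kappa_\tau}{40} k_1 r \Big). \]
In place of (5.43), (5.44) and their multi-case proofs, we have simply, using the first half of (5.46),
\[ \sum_{2 \le i \le l} \Big( \tfrac{1}{10} d_i - \tfrac{\kappa_\tau}{40} k_i r \Big) \ge \frac{\kappa_\tau r}{40} (l - 1) \ge l + \frac{\kappa_\tau r}{50} (l - 1), \]
and hence using (5.30),
\[ \sum_{i \le l} \hat{\mu}_i \ge f(A', t) + \tfrac{4}{5}(d' - l) + \sum_{2 \le i \le l} \Big( \tfrac{1}{10} d_i - \tfrac{\kappa_\tau}{40} k_i r \Big) \ge f(A', t) + \tfrac{4}{5} d' + \frac{\kappa_\tau r}{50} (l - 1). \]
This leads directly to (5.62) (with $L_{0,U}$, $M_{0,U}$ in place of $L_0$, $M_0$), as in the proof of (5.25). In place of (5.61) we have
\[ P(M_{0,U}(k, q, r, A, A', d', t)) \le K_{45} A \exp\Big( -\tfrac{4}{5} d' \Big) \sup_{y \in U \cap \mathbb{Z}^2} P(M_{y,U}(0, q, r, A', A', 0, t)). \tag{5.63} \]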
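The bound on the sum over $2 \le i \le l$ that replaces (5.43) and (5.44) in this argument comes from a termwise estimate. A minimal sketch, using only the first half of (5.46) in the form $d_i \ge (k_i + 1)\kappa_\tau r/4$:

```latex
% Uses only d_i >= (k_i + 1) \kappa_\tau r / 4, the first half of (5.46).
\begin{align*}
\frac{1}{10} d_i - \frac{\kappa_\tau}{40} k_i r
  &\ge \frac{\kappa_\tau r}{40}(k_i + 1) - \frac{\kappa_\tau r}{40} k_i
   = \frac{\kappa_\tau r}{40}, \\
\sum_{2 \le i \le l} \Big( \frac{1}{10} d_i - \frac{\kappa_\tau}{40} k_i r \Big)
  &\ge \frac{\kappa_\tau r}{40}(l-1).
\end{align*}
```

The further step from $\frac{\kappa_\tau r}{40}(l-1)$ to $l + \frac{\kappa_\tau r}{50}(l-1)$ needs $\frac{\kappa_\tau r}{200}(l-1) \ge l$, which holds for $l \ge 2$ once $r$ is large.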
By (5.23), taking $U = B(0, K_{46} A)$ gives $M_{0,U}(k, q, r, A, A', d', t) = M_0(k, q, r, A, A', d', t)$ so (5.63) becomes
\[ P(M_0(k, q, r, A, A', d', t)) \le K_{45} A \exp\Big( -\tfrac{4}{5} d' \Big) P(M_0(0, q, r, A', A', 0, t)). \]
For $k \ge 1$ we need only consider $d' \ge \kappa_\tau r/3$, so provided $K_{43}$ is large enough, this proves (5.18). $\Box$

Lemma 5.7. Let $q \ge 1$, $r \ge 15q$, let $\gamma$ be a dual circuit and let $\alpha_{\max,\gamma}$ be its maximal $(q, r)$-descendant. Then
\[ \mathcal{W}(\partial\,\mathrm{Co}(\gamma)) \le \mathcal{W}(\partial\,\mathrm{Co}(\alpha_{\max,\gamma})) + 19 D'_{(q,r)}(\gamma). \]
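Proof. We may assume $\gamma$ has at least one bottleneck. If $(u, v)$ is a primary bottleneck in $\gamma$, and the $(\kappa_\tau r/3)$-large offspring of $\gamma$ are $\alpha_1, \dots, \alpha_k$, then
\[ \mathrm{Int}(\gamma) \subset B_\tau\Big( u, \frac{\kappa_\tau r}{3} + 2\kappa_\tau q \Big) \cup \bigcup_{i \le k} \mathrm{Int}(\alpha_i), \]
and therefore from (5.4), $\gamma$ can be surrounded by a (non-lattice) loop of $\tau$-length at most
\[ 12 \Big( \frac{\kappa_\tau r}{3} + 2\kappa_\tau q \Big) + \sum_{i \le k} \mathcal{W}(\partial\,\mathrm{Co}(\alpha_i)). \]
Since $\partial\,\mathrm{Co}(\gamma)$ minimizes the $\tau$-length over all such loops, it follows that
\[ \mathcal{W}(\partial\,\mathrm{Co}(\gamma)) \le 6\kappa_\tau r + \sum_{i \le k} \mathcal{W}(\partial\,\mathrm{Co}(\alpha_i)). \]
Iterating this, and using (5.4) and $D'_{(q,r)}(\gamma) \ge \frac{\kappa_\tau r}{3} |\mathcal{F}_{(q,r)}(\gamma)|$, we obtain
\[ \mathcal{W}(\partial\,\mathrm{Co}(\gamma)) \le 6\kappa_\tau r\, |\mathcal{F}_{(q,r)}(\gamma)| + \sum_{\alpha \in \mathcal{F}_{(q,r)}(\gamma)} \mathcal{W}(\partial\,\mathrm{Co}(\alpha)) \le \mathcal{W}(\partial\,\mathrm{Co}(\alpha_{\max,\gamma})) + 19 D'_{(q,r)}(\gamma). \qquad \Box \]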
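The passage from the loop-length bound to the constant $6\kappa_\tau r$ in the proof of Lemma 5.7 uses only the hypothesis $r \ge 15q$. A short check, offered as a sketch of the arithmetic rather than as part of the original argument:

```latex
% Uses only r >= 15q, i.e. q <= r/15.
\begin{align*}
12\Big(\frac{\kappa_\tau r}{3} + 2\kappa_\tau q\Big)
  = 4\kappa_\tau r + 24\kappa_\tau q
  \le 4\kappa_\tau r + \frac{24}{15}\,\kappa_\tau r
  = 5.6\,\kappa_\tau r
  \le 6\kappa_\tau r .
\end{align*}
% Similarly, D' >= (\kappa_\tau r / 3) |F| gives 6 \kappa_\tau r |F| <= 18 D',
% which accounts for 18 of the 19 in the final bound.
```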
The next theorem, together with Theorem 4.1, shows roughly that for a droplet of size $A$, there is a cost for the convex hull boundary $\tau$-length exceeding the minimum $w_1\sqrt{A}$ by an amount $s_+$, this cost being exponential in $s_+$, and there is an exponential cost for positive $D'_{(q,r)}(\Gamma_0)$.

Theorem 5.8. Assume (2.7) and either (i) the ratio weak mixing property or (ii) both the weak mixing property and the near-Markov property for open circuits. There exist constants $K_i$ as follows. Let $A > K_{50}$, $s_+ \ge 0$, $s = w_1\sqrt{A} + s_+$ and $d' \ge 0$. Then
\[ P\big( |\mathrm{Int}(\Gamma_0)| \ge A,\ \mathcal{W}(\partial\,\mathrm{Co}(\Gamma_0)) \ge s,\ D'_{(q,r)}(\Gamma_0) \ge d' \big) \le \exp\Big( -u(K_{51} (\log A)^{2/3}, A) - \tfrac{1}{1520} s_+ - \tfrac{1}{20} d' \Big). \tag{5.64} \]
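Proof. Let $K_{52} \ge K_{43}$ (of Proposition 5.5) and
\[ q_B = \tfrac{1}{15} K_{52} \log B, \quad r_B = K_{52} \log B, \quad t_+(n, A') = \max\big( s - 19n - w_1\sqrt{A'},\ 0 \big), \]
\[ I_1(n) = \Big\{ A' \in \mathbb{Z}_+ : t_+(n, A') \ge \frac{s_+}{2} \Big\}, \]
\[ I_2(n) = \Big\{ A' \in \mathbb{Z}_+ : t_+(n, A') < \frac{s_+}{2},\ w_1\sqrt{A'} \le w_1\sqrt{A} + 19n \Big\}, \]
and
\[ I_3(n) = \Big\{ A' \in \mathbb{Z}_+ : t_+(n, A') < \frac{s_+}{2},\ w_1\sqrt{A'} > w_1\sqrt{A} + 19n \Big\}. \]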
Then using Lemma 5.7, (0 ) ≥ d ) P (| Int(0 )| ≥ A, W(∂ Co(0 )) ≥ s, D(q,r) ≤ P (| Int(0 )| = B, D(q,r) (0 ) ∈ [n, n + 1), B≥A n≥d
W ∂ Co(αmax,0 ) ≥ s − 19n) √ ≤ P M0 (k, qB , rB , B, A , n, w1 A + t+ (n, A )) B≥A n≥d
A ≤B,A ∈I1 (n) k≥0
+
√ P (M0 (k, qB , rB , B, A , n, w1 A ))
(5.65)
A ≤B,A ∈I2 (n) k≥0
+
√ P (M0 (k, qB , rB , B, A , n, w1 A )) .
A ≤B,A ∈I3 (n) k≥0
The events M0 (k, qB , rB , B, A , n, ·) are empty unless n+1 > (k +1)κτ r/4 (cf. (5.46)); if K52 , and hence r, is large enough, this implies k ≤ n, so we may restrict the sums in (5.65) to such k. Presuming A is large enough, u(K44 (log B)2/3 , B) is strictly positive for all B ≥ A, for K44 of Proposition 5.5. For A ∈ I1 (n) we apply Proposition 5.5 to get
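\[ \sum_{A' \le B,\ A' \in I_1(n)} \sum_{k \le n} P\big( M_0(k, q_B, r_B, B, A', n, w_1\sqrt{A'} + t_+(n, A')) \big) \le (n + 1) B \exp\Big( -w_1\sqrt{B} + K_{44} B^{1/6} (\log B)^{2/3} - \tfrac{1}{120} s_+ - \tfrac{1}{10} n \Big). \tag{5.66} \]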
Note that if $I_2(n)$ or $I_3(n)$ is nonempty we must have $s_+ = s - w_1\sqrt{A} > 0$. If $A' \in I_2(n)$ we have
\[ \tfrac{1}{2}(s - w_1\sqrt{A}) > s - 19n - w_1\sqrt{A'} \ge s - w_1\sqrt{A} - 38n, \]
and hence $n \ge s_+/76$. Therefore
\[ \sum_{A' \le B,\ A' \in I_2(n)} \sum_{k \le n} P\big( M_0(k, q_B, r_B, B, A', n, w_1\sqrt{A'}) \big) \le (n + 1) B \exp\Big( -w_1\sqrt{B} + K_{44} B^{1/6} (\log B)^{2/3} - \tfrac{1}{10} n \Big) \]
\[ \le (n + 1) B \exp\Big( -w_1\sqrt{B} + K_{44} B^{1/6} (\log B)^{2/3} - \tfrac{1}{20} n - \tfrac{1}{1520} s_+ \Big). \tag{5.67} \]
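The constant $1/1520$ in (5.67) can be traced back to the defining inequalities of $I_2(n)$. The following sketch uses only $s - w_1\sqrt{A} = s_+$ and the chain of inequalities displayed before (5.67).

```latex
% Uses s - w_1\sqrt{A} = s_+ and the two-sided bound preceding (5.67).
\begin{align*}
\tfrac{1}{2} s_+ > s_+ - 38 n
  &\implies n > \frac{s_+}{76}, \\
\frac{1}{10} n = \frac{1}{20} n + \frac{1}{20} n
  &\ge \frac{1}{20} n + \frac{1}{20}\cdot\frac{s_+}{76}
   = \frac{1}{20} n + \frac{1}{1520} s_+ .
\end{align*}
```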
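If $A' \in I_3(n)$ we have
\[ s_+ + w_1\sqrt{A} - w_1\sqrt{A'} - 19n \le t_+(n, A') \le \frac{s_+}{2} \]
so that
\[ 2\big( w_1\sqrt{A'} - w_1\sqrt{A} \big) > w_1\sqrt{A'} - w_1\sqrt{A} + 19n \ge \frac{s_+}{2} \]
which implies
\[ w_1\sqrt{B} \ge w_1\sqrt{A'} > w_1\sqrt{A} + \frac{s_+}{4}. \]
Therefore
\[ \sum_{A' \le B,\ A' \in I_3(n)} \sum_{k \le n} P\big( M_0(k, q_B, r_B, B, A', n, w_1\sqrt{A'}) \big) \le (n + 1) B \exp\Big( -w_1\sqrt{B} + K_{44} B^{1/6} (\log B)^{2/3} - \tfrac{1}{10} n \Big) \]
\[ \le (n + 1) B \exp\Big( -\tfrac{1}{2} w_1\sqrt{B} - \tfrac{1}{2} w_1\sqrt{A} - \tfrac{1}{8} s_+ + K_{44} B^{1/6} (\log B)^{2/3} - \tfrac{1}{10} n \Big). \tag{5.68} \]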
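The last step of (5.68) trades half of $w_1\sqrt{B}$ for $\tfrac12 w_1\sqrt{A} + \tfrac18 s_+$. A sketch of the computation, where the abbreviation $x$ is introduced here only for readability:

```latex
% Abbreviation for this check only: x = w_1\sqrt{A'} - w_1\sqrt{A}.
% For A' in I_3(n): x > 19n and t_+(n, A') < s_+/2.
\begin{align*}
\tfrac{1}{2} s_+ > t_+(n, A') \ge s - 19n - w_1\sqrt{A'} = s_+ - x - 19n
  &\implies x + 19n > \tfrac{1}{2} s_+, \\
2x > x + 19n > \tfrac{1}{2} s_+
  &\implies x > \tfrac{1}{4} s_+, \\
w_1\sqrt{B} = \tfrac{1}{2} w_1\sqrt{B} + \tfrac{1}{2} w_1\sqrt{B}
  &\ge \tfrac{1}{2} w_1\sqrt{B} + \tfrac{1}{2} w_1\sqrt{A} + \tfrac{1}{8} s_+ .
\end{align*}
% The last line uses B >= A', hence w_1\sqrt{B} >= w_1\sqrt{A'} > w_1\sqrt{A} + s_+/4.
```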
We can now use (5.66), (5.67) and (5.68) to sum over n and B in (5.65), obtaining (5.64). # " Part of our main result is an easy consequence of Theorem 5.8. Proof of Theorem 2.1, (2.13) and (2.14). From the definition of w1 and Theorem 5.8, for any c, if A is sufficiently large, P (| Int(0 )| ≥ A, ALR(0 ) > cl 1/3 (log l)2/3 ) (5.69) ≤ P (| Int(0 )| ≥ A, | Co(0 )| ≥ A + cw1 l 4/3 (log l)2/3 ) 2 1/3 2/3 √ cw1 l (log l) ≤ P | Int(0 )| ≥ A, W(∂ Co(0 )) ≥ w1 A + 3
≤ exp − u(K51 (log A)2/3 , A) − 416 cl 1/3 (log l)2/3 . If we take c sufficiently large, this and Theorem 4.1 prove that (2.13) holds with conditional probability approaching 1 as A → ∞.
Next, from the quadratic nature of the Wulff variational minimum (see [1, 12]), for any a, b, if A is sufficiently large, P (A ≤ | Int(0 )| ≤ A + aw1 l 4/3 (log l)2/3 , ;A (∂ Co(0 )) > bl 2/3 (log l)1/3 ) b 2/3 1/3 P | Int(0 )| = B, ;B (∂ Co(0 )) > l (log l) ≤ 2 B
√ P | Int(0 )| = B, W(∂ Co(0 )) ≥ w1 A + 417 bl 1/3 (log l)2/3 ≤ B
≤ exp − u(K51 (log A)2/3 , A) − 418 bl 1/3 (log l)2/3 ,
(5.70)
where the sums are over A ≤ B ≤ A + aw1 l 4/3 (log l)2/3 . Now P (| Int(0 )| > A + aw1 l 4/3 (log l)2/3 ) can be bounded as in (5.69), so if we take a, b sufficiently large, (5.70) and Theorem 4.1 prove that (2.14) holds with conditional probability approaching 1 as A → ∞. " #
6. Proof of (2.15) and (2.16) For x, y ∈ (Z2 )∗ , r > 0 and G ⊂ R2 , we say there is an r-near dual connection from x to y in G if for some u, v ∈ (Z2 )∗ with d(u, v) ≤ r, there are open dual paths from x to u and from y to v in G. Let N (x, y, r, G) denote the event that such a connection exists. The following result is from [4]. Lemma 6.1. Let P be a percolation model on B(Z2 ) satisfying (2.7) and the ratio weak mixing property. There exist Ki such that if |x| > 1 and r ≥ K53 log |x| then P (N(0, x, r, R2 )) ≤ e−τ (x)+K54 r . To prove (2.15) we also need the following. Lemma 6.2. Suppose τ is positive. There exists 419 such that if q ≥ 1, r ≥ 15q, A > A > 0 and γ is a dual circuit with | Int(γ )| = A, | Int(αmax,γ )| = A , then √ (γ ) ≥ 419 A − A . D(q,r) Proof. Let α1 , . . . , αk be the non-maximal final (q, r)-descendants of γ , and Ai = | Int(αi )|. Then # κτ r 1 1 D(q,r) (γ ) ≥ ≥ w1 max w1 Ai , Ai + kκτ r 3 2 6 i≤k
i≤k
while (cf. (5.6)) A − A ≤ kr 2 +
i≤k
The lemma follows easily. " #
Ai .
The proof of (2.15) is relatively straightforward, compared to (2.13) and (2.16), so in the interest of space we give only a sketch of the proof. Proof sketch for Theorem 2.1, (2.15). The basic idea is that a large inward deviation of [y ,y ] 0 k k+1 from ∂ Co(0 ) for some k reduces the factor P (wk ↔ xk ) in (5.14). Let Vk denote the line through yk and yk+1 . Suppose HSkel(0 ) = {y0 , . . . , ym+1 }, z is a site in 0 with d(z, ∂ Co({y0 , . . . , ym+1 })) > K55 l 2/3 (log l)1/3 , with K55 large, and 0 satisfies (2.13) and (2.14). Let J (yk , yk+1 ) denote the event that there is an open dual path from yk to yk+1 containing some such site z. Let Rk denote the region bounded [y ,y ] by 0 k k+1 and by the segment of ∂ Co(0 ) from yk to yk+1 . There exists a site z ∈ Vk with the following property: a line tangent to ∂Bτ (yk , τ (z − yk )) at z passes through z. This z is a projection of z onto Vk and satisfies τ (yk − z) ≥ τ (yk − z ),
τ (yk+1 − z) ≥ τ (yk+1 − z ). D
E
F
(6.1) z ,
⊂ ⊂ be balls centered at all with Let D ⊂ E be balls centered at z and let radii of order l 2/3 (log l)1/3 . (The statements to follow are valid for appropriate choices [y ,y ] of these radii.) Equation (2.13) limits the area of Rk and thereby requires that 0 k k+1 intersects D . Thus we have dual connections yk ↔ z and z ↔ yk+1 , at least one of which intersects D . We then want to use Lemma 3.2 and then the triangle inequality to say something like P (J (yk , yk+1 )) ≤ exp −τ (yk+1 − z) − τ (z − yk ) (6.2) ≤ exp −τ (yk+1 − yk ) − K56 l 2/3 (log l)1/3 , but we must deal with the fact that the two dual connections may go close to each other so that Lemma 3.2 does not apply. There are various cases to consider depending on the geometry of the two dual connections yk ↔ z and z ↔ yk+1 relative to the balls and relative to each other. Consider for example the possibility that yk ↔ z and z ↔ yk+1 with an r-near connection from yk to yk+1 outside E. In this case this r-near connection and a path from z to the boundary of D occur at a large separation, so Lemma 3.2 can be applied to these two events, and the path from z to the boundary of D provides the extra cost K56 l 2/3 (log l)1/3 on the right side of (6.2). If there is no r-near connection outside E, we must consider cases depending on the location of z relative to yk and yk+1 and on whether the path intersecting D includes an r-near connection outside F . In general for each case we add costs (such as τ (yk+1 − z) and τ (z − yk ) in (6.2)), and then use the triangle inequality to get a bound like the right side of (6.2), if paths outside some ball are separated enough that costs can be added. When paths are not sufficiently separated, there is an r-near connection outside one of the larger balls E or F , and we add the costs of the r-near connection and another connection inside a smaller ball D or E , as exemplified above, to get a bound. 
Combining all cases bounds the left side of (6.2) by the right side of (6.2), while analogously to (5.14), P (wi ↔ xi ). P (A(w0 , x0 , . . . , wm , xm ) ∩ J (yk , yk+1 )) ≤ 2m P (J (yk , yk+1 )) i∈I \{k}
Analogously to (5.14)–(5.16), for sufficiently large K57 this leads to P (| Int(0 )| ≥ A, MLR(0 ) ≥ K57 l 2/3 (log l)1/3 , 0 is (q, r) − bottleneck-free) √ 1 (6.3) κτ K57 l 2/3 (log l)1/3 ≤ exp −w1 A − 200
with q, r of order log A. Essentially in the manner of (6.7) – (6.10) below, we can reduce to the case of bottleneck-free 0 and conclude that if K58 is sufficiently large, P (| Int(0 )| ≥ A, MLR(0 ) ≥ K58 l 2/3 (log l)1/3 ) √ 1 κτ K58 l 2/3 (log l)1/3 . ≤ exp −w1 A − 4000
(6.4)
With Theorem 4.1 this completes the proof. " # The proof of (2.16) is based on the following idea. As we will see, by summing (5.18) as in the proof of Theorem 5.8 it is easy to obtain roughly 1 P (| Int(0 )| ≥ A, D(q,r) (0 ) ≥ d ) ≤ exp − d P (| Int(0 )| ≥ A − v) (6.5) 2 with v A. (Statement (6.5) is for motivation only – the actual statement we prove is contained in (6.19).) Note that for q, r as in Theorem 2.1, if 0 is not (q, r)-bottleneck free then D(q,r) (0 ) ≥ 21 K6 log A; hence to prove (2.16) we would like a result somewhat like (6.5) but with A on the right side in place of A − v. To replace A − v with A we need to know something of how the probability on the right side of (6.5) behaves as a function of v, which is obtainable from our next result. Let N = [−N, N ]2 . Proposition 6.3. Let P be a percolation model on B(Z2 ) satisfying (2.7), the nearMarkov property for open circuits, and the ratio weak mixing property. There exist Ki such that for A ≥ K59 and δ ≥ K60 log A we have √ (6.6) P (| Int(0 )| ≥ A + δ A) ≥ e−K61 δ P (| Int(0 )| ≥ A). √ Proof. Let l = A and r = 15q = K43 log A, where K43 is from Proposition 5.5. Let Midt (0 ) denote the Wulff shape of area t| Int(0 )| centered at the center of mass of Int(0 ). From translation invariance, Theorem 4.1, (2.17), (5.70) and (6.4) we have for sufficiently large K62 , 1 K62 κτ l 2/3 (log l)1/3 ) 3 + P (| Int(0 )| ≥ A, ;A (0 ) > K62 l 2/3 (log l)1/3 )
(0 ) ≥ P (| Int(0 )| ≥ A, D(q,r)
+ P (| Int(0 )| ≥ A, ;A (0 ) ≤ K62 l 2/3 (log l)1/3 , 0 ∈ / Mid3/4 (0 )) 1 ≤ P (| Int(0 )| ≥ A). 2 It is easy to see (cf. the proof of Lemma 5.7) that ;A (αmax,0 ) ≤ ;A (0 ) + 3κτ−1 D(q,r) (0 ). 2 κ 2 l 4/3 (log l)2/3 , we get Using this and Lemma 6.2, and letting g(A) = (3419 )−2 K62 τ
P (| Int(0 )| ≥ A) ≤
1 P (| Int(0 )| ≥ A) 2 + P | Int(0 )| ≥ A, | Int(0 )| − | Int(αmax,0 )| ≤ g(A), (6.7) ;A (αmax,0 ) ≤ 2K62 l 2/3 (log l)1/3 , 0 ∈ Mid3/4 (0 ) .
We need the following straightforward extension of (5.18), under the conditions of Proposition 5.5: P M0 (k, q, r, A, A , d , t) ∩ [;A (αmax,0 )
≤ 2K62 l 2/3 (log l)1/3 , 0 ∈ Mid7/8 (αmax,0 )] (6.8) 1 ≤ exp − d P M0 (0, q, r, A , A , 0, t) 2
∩ [;A (0 ) ≤ 2K62 l 2/3 (log l)1/3 , 0 ∈ Mid7/8 (0 )] . This yields that P | Int(0 )| ≥ A, | Int(0 )| − | Int(αmax,0 )| ≤ g(A),
;A (αmax,0 ) ≤ 2K62 l 2/3 (log l)1/3 , 0 ∈ Mid3/4 (0 ) √ ≤ P M0 (k, q, r, B, A , d , w1 A ) B≥A d ≥0 B−g(A) r,
the absence of bottlenecks implies
d 0[X2 ,U1 ] , 0[X1 ,U2 ] > q
and
d 0[V2 ,X1 ] , 0[V1 ,X2 ] > q.
It follows that Wi = Ui , Zi ∈ 0[Xi ,Vi ] (i = 1, 2) and
q d 0[X2 ,W1 ] , 0[X1 ,W2 ] > q − 1 > . 2 When ρ and σ are paths such that the endpoint of ρ is the initial point of σ , we let (ρ, σ ) denote the path obtained by concatenating σ and ρ. Then
[X ,X ] [X ,W ] µL 0 2 1 − µL 0 2 1 , ζW1 X1 ≤ |B1 | = 16π r 2 , since the paths differ only inside B1 . Again we may interchange 1, 2 and L, R. Using Lemma 3.2, P | Int(0 )| ≥ A − g(A), 0 is (q, r) − bottleneck-free, ;A (0 ) ≤ 2K62 l 2/3 (log l)1/3 , 0 ∈ Mid7/8 (0 ) ≤ A−u≤AL +AR ≤2A
√ x1 ,x2 ∈B(0, A) xi =(ai ,b√ i) b1 −b2 ≥ 23 A
w1 ,z1 ∈B(x1 ,4r)
w2 ,z2 ∈B(x2 ,4r)
P X1 = x1 , X2 = x2 , W1 = w1 , W2 = w2 , Z1 = z1 , Z2 = z2
and there exist paths ρR from x2 to w1 and ρL from w2 to x1 satisfying µL ((ρR , ζw1 x1 )) ≥ AR − 16π r 2 ,
q µR ((ζx2 w2 , ρL )) ≥ AL − 16π r 2 , d(ρL , ρR ) ≥ , z1 ∈ ρL , z2 ∈ ρR , 2 0 ∈ JR ((ζx2 w2 , ρL )) ∩ JL ((ρR , ζw1 x1 )) ≤ (6.12) A−u≤AL +AR ≤2A
√ x1 ,x2 ∈B(0, A) xi =(ai ,b√ i) b1 −b2 ≥ 23 A
w1 ,z1 ∈B(x1 ,4r)
w2 ,z2 ∈B(x2 ,4r)
2P x2 ↔ w1 via an open dual path ρR in Sx1 x2 with µL ((ρR , ζw1 x1 )) ≥ AR − 16π r 2 , z2 ∈ ρR , 0 ∈ JL ((ρR , ζw1 x1 )) · P w2 ↔ x1 via an open dual path ρL in Sx1 x2 with µR ((ζx2 w2 , ρL )) ≥ AL − 16π r 2 , z1 ∈ ρL , 0 ∈ JR ((ζx2 w2 , ρL )) . Let us assume for convenience that δ is an integer (if not, the necessary modifications are simple), and let x1 , w1 , z2 and x2 be the lattice sites which are 2δ units to the right of x1 , w1 , z2 and x2 , respectively. We now “pull apart” the two halves of 0 by replacing each of these four sites by its right-shifted counterpart in the first probability on the right side of (6.12). Specifically, by the FKG property, for each summand we have P x2 ↔ w1 via an open dual path ρR in Sx1 x2 with µL ((ρR , ζw1 x1 )) ≥ AR − 16π r 2 , z2 ∈ ρR , 0 ∈ JL ((ρR , ζw1 x1 )) · P w2 ↔ x1 via an open dual path ρL in Sx1 x2 with µR ((ζx2 w2 , ρL )) ≥ AL − 16π r 2 , z1 ∈ ρL , 0 ∈ JR ((ζx2 w2 , ρL ))
(6.13)
· P (Open(ζz1 w1 ))P (Open(ζw2 z2 )) 3 √ 2 ≤ P |0 | ≥ AL + AR + δ A − 32π r 2 √ ≤ P (|0 | ≥ A + δ A). Here the first inequality uses the fact that, even though the paths ρR , ρL , ζz1 w1 , ζw2 z2 are not necessarily disjoint, the 4 events on the left side of (6.13) imply the event on the right side of the first inequaltiy in (6.13), under our nonstandard definition of circuit. From (6.10), (6.12), (6.13) and the bounded energy property we obtain √ P (| Int(0 )| ≥ A) ≤ K63 A4 r 4 ueK64 δ P (| Int(0 )| ≥ A + δ A) (6.14) √ ≤ eK65 δ P (| Int(0 )| ≥ A + δ A), completing the proof. " #
Proof Let 420 , 421 > 0 to be specified, let n0 = min{n : 2n κτ r/3 > √ of Theorem 2.1, (2.16). −2 2n 421 A}, and let bn = 419 2 (κτ r/3)2 , where 419 is from Lemma 6.2. Then provided 421 is small enough (depending on 420 ), we have bn < 420 2n−1
κτ r √ A 3
for all n ≤ n0 .
(6.15)
(0 ): We have the following decomposition according to the size of D(q,r)
P (| Int(0 )| ≥ A, 0 is not (q, r) − bottleneck-free) κτ r ) (0 ) ≥ ≤ P (| Int(0 )| ≥ A, D(q,r) 3 κτ r ≤ P (| Int(0 )| ≥ 2A, D(q,r) (0 ) ≥ ) 3 √ (0 ) ≥ 421 A) + P (A ≤ | Int(0 )| < 2A, D(q,r) n0 κτ r κτ r
+ ≤ D(q,r) . P A ≤ | Int(0 )| < 2A, 2n−1 (0 ) < 2n 3 3
(6.16)
n=1
By Theorems 5.8 and 4.1, the first probability on the right side of (6.16) satisfies κτ r P (| Int(0 )| ≥ 2A, D(q,r) (0 ) ≥ ) 3 1 κτ r 2/3 ≤ exp − − u(K51 (log 2A) , 2A) 20 3 1 κτ r ≤ exp − P (| Int(0 )| ≥ A), 20 3
(6.17)
and, for large A, the second satisfies √ P (A ≤ | Int(0 )| < 2A, D(q,r) (0 ) ≥ 421 A) √ 1 2/3 ≤ exp − 421 A − u(K51 (log A) , A) 20 √ 1 ≤ exp − 421 A P (| Int(0 )| ≥ A). 40
(6.18)
Thus we must consider the terms of the sum on the right side of (6.16). By Lemma 6.2, D(q,r) (0 ) < 2n
κτ r 3
implies
| Int(0 )| − | Int(αmax,0 )| < bn .
Hence similarly to (6.9) we have κτ r κτ r
P A ≤ | Int(0 )| < 2A, 2n−1 ≤ D(q,r) (0 ) < 2n 3 3 n−1 κτ r ≤ P A ≤ | Int(0 )| < 2A, D(q,r) (0 ) > 2 and 3 | Int(y )| − | Int(αmax,y )| < bn √ ≤ P (M0 (k, q, r, B, A , d , w1 A )) A≤B K79 and l = A, √ P (| Int(0 )| = A) ≥ exp − w1 A − K80 l 1/3 (log l)2/3 .
√ Proof. Fix A large and let l = A. Let a1 denote the vertical coordinate of the point where ∂K1 meets the positive vertical axis. Let s = l 2/3 (log l)1/3 and δ = K27 s 2 / l, with K27 as in the proof of Theorem 4.1. Let α = ∂(l + δ)K1 and let (z0 , . . . , zn , z0 ) be the s-hull skeleton of α. It is an easy exercise in geometry to see that the natural ), i = 0, . . . , n, are disjoint. (Our labeling as usual is cyclical: half-slabs UR (zi , zi+1 ) from the skeleton zn+1 = z0 .) For some K81 to be specified, let us call a pair (zi , zi+1 √ √ very short if |zi+1 − zi | ≤ 2 2, short if 2 2 < |zi+1 − zi | ≤ 2K81 log l and long if |zi+1 − zi | > 2K81 log l. In what follows, very short pairs can be handled quite trivially but tediously, so for convenience we will assume there are no very short pairs. For long pairs we define xi and yi+1 to be the points on the line segment zi zi+1 at distance K81 log l ) within from zi and from zi+1 , respectively, and let xi , yi+1 be dual sites in UR (zi , zi+1 √ distance 2 of xi and yi+1 , respectively. For short pairs we let xi = yi+1 be a dual site √ ) within distance 2 of the midpoint of z z . With minor modification in UR (zi , zi+1 i i+1 of the definition of the s-hull skeleton, we may assume the set {z0 , . . . , zn } has lattice symmetry, that is, for each zi the reflection of zi across the horizontal or vertical axis is another zj , and analogously for the sites xi and yi . For each i we let φi denote a dual lattice path of minimal length from yi to xi outside Co({z , . . . , zn })∪ U˜ z z (xi−1 , yi )∪ 0
i−1 i
U˜ zi zi+1 (xi , yi+1 ). We call such a φi a short link. Let Ci denote the bond boundary of φi in (Co({z , . . . , zn }) ∪ U˜ z z (xi−1 , yi ) ∪ U˜ z z (xi , yi+1 ))c .
0
i−1 i
i i+1
Let λa be the vertical line through (a, 0). Let HL (x) and HR (x) denote the open half planes to the left and right, respectively, of the vertical line through x. Let HU (x) and HB (x) denote the open half planes above and below the horizontal line through x, respectively. (In general we use the convention that subscripts L, R, U, B refer to left, right, upper and lower halfspaces, respectively, with combinations, such as LU , referring to quadrants.) Let S(x, y) denote the open slab between the vertical lines through x and y. Let N be the integer part of a1 l/2, M the integer part of l 2/3 (log l)1/3 and D the integer part of l 1/3 (log l)2/3 . Let uRU = HR (0) ∩ HU (0) ∩ α ∩ λM+ 1 , vRU = HR (0) ∩ HU (0) ∩ α ∩ λM+D+ 1 , 2
= HR (0) ∩ HU (0) ∩ α ∩ λN+ 1 , wRU 2
2
xRU = HR (0) ∩ HU (0) ∩ α ∩ λN+D+ 1 . 2
(Note each of these intersections is a single point.) We call these 4 points determining points. Lattice symmetry yields corresponding determining points with appropriate subscripts in the other three quadrants. We may assume that uRU is one of the sites zi of the s-hull skeleton of α (if not, we add uRU to the skeleton), and analogously for the other sites just defined. Let uRU be the second closest dual site above uRU in λM+ 1 , and analogously for vRU , wRU , xRU . If uRU = zi , for some i, we de2 fine zi to be uRU , and again analogously for the other determining points. Loosely, the idea is to remove from 0 its intersection with each of the width-D vertical slabs S(xLU , wLU ), S(vLU , uLU ), S(uRU , vRU ), S(wRU , xRU ), then raise or lower the segments of 0 between these slabs to adjust the area as desired, then reconnect these segments to make a new circuit enclosing area A. To do this we must first ensure that 0 intersects each vertical line bounding any of these four slabs only twice. We refer to the 4 width-D vertical slabs above as removal slabs. We call the 5 regions HL (xLU ), S(wLU , vLU ), S(uLU , uRU ), S(vRU , wRU ), HR (xRU ) (whose closures together form the complement of the 4 removal slabs) retention regions. By a retained
segment we mean a connected component of the intersection of α with a retention re gion. Each retained segment has the form α (zj ,zk ) for some j, k; we call zj an initial determining point and zk a final determining point, and call (j, k) a retention pair. We let J ret denote the set of all 8 retention pairs. For each initial determining point zj , in the boundary of some retained region F , we let ψj be a dual lattice path from zj to xj in F \U˜ zj zj +1 (xj , yj +1 ), of minimal length, and let Dj be the bond boundary of ψj in F \U˜ z z (xj , yj +1 )}. For each final determining point z we let ψk be a dual lattice k
j j +1
path from yk to zk in F \U˜ zk−1 zk (xk−1 , yk ), of minimal length, and define Dk analogously to Dj . We refer to ψj and ψk as the endpaths of the retention pair (j, k).
For each retention pair (j, k) let Ij k = {i : zi , zi+1 ∈ α (zj ,zk ) } and let Qj k denote )-cylindrically the event that (i) for each i ∈ Ij k ∪ {j }, we have xi ↔ yi+1 (zi , zi+1 in U˜ zR z (xi , yi+1 ), (ii) for each i ∈ Ij k we have φi open and all bonds in Ci closed, i i+1
and (iii) both endpaths of (j, k) are open and all bonds in Dj ∪ Dk are closed. These 3 component events are denoted Q(i) (j, k), Q(ii) (j, k) and Q(iii) (j, k). For a configuration in Qj k , the paths xi ↔ yi+1 together with the short links φi and the two endpaths form an open dual path from zj to zk outside Co({z0 , . . . , zn }), and there is no open dual connection from this path to any point of the retention region boundary except zj and zk . By Lemma 3.1, (7.6) and Theorem 4.2, provided K81 is large we have P (Q(i) (j, k)) |I | ≥ 21 j k
i∈Ij k ∪{j }
≥
4 |Ij k |+1 27
l
≥ exp −
P (xi ↔ yi+1 (zi , zi+1 ) − cylindrically in U˜ zR z (xi , yi+1 )) i i+1
exp −
τ (yi+1 − xi )
i∈Ij k ∪{j }
(7.7)
τ (zi+1 − zi ) − K82 |Ij k | log l .
i∈Ij k ∪{j }
From the bounded energy property, P Q(ii) (j, k) ∩ Q(iii) (j, k) | Q(i) (j, k) ≥ exp(−K83 |Ij k | log l) which with (7.7) yields P (Qj k ) ≥ exp −
τ (zi+1 − zi ) − K84 |Ij k | log l .
(7.8)
i∈Ij k ∪{j }
For a configuration ω ∈ Qj k , and for F the retention region with zj , zk ∈ ∂F , we can associate an area Rj k (ω) as follows. There is a unique outermost open dual path Yj k (ω) from zj to zk in F . If F is a halfspace (HL (xLU ) or HR (xRU )), then Rj k (ω) is the area of the region between Yj k (ω) and zj zk . If F is a slab and zj , zk ∈ HU (0), then Rj k (ω) is the area of the region in F ∩ HU (0) below Yj k (ω). If F is a slab and zj , zk ∈ HB (0), then Rj k (ω) is the area of the region in F ∩ HB (0) above Yj k (ω).
We define corresponding nonrandom areas $R^0_{jk}$ similarly but using $\alpha^{(z_j,z_k)}$ in place of $Y_{jk}(\omega)$. Then $R_{jk}(\omega) \ge R^0_{jk}$. It is not hard to see that for some $K_{85}$ we have
\[ P\Bigl(R_{jk} \ge R^0_{jk} + K_{85}\, l^{4/3} (\log l)^{2/3} \text{ for some retention pair } (j,k) \Bigm| \bigcap_{(j,k) \in J^{\mathrm{ret}}} Q_{jk}\Bigr) \le \frac{1}{2}. \tag{7.9} \]
In fact, if this were false we could obtain extra area $K_{85}\, l^{4/3} (\log l)^{2/3}$ almost “for free” in Theorem 4.1; more precisely, one could replace $A$ with $A + K_{85}\, l^{4/3} (\log l)^{2/3}$ on the left side in the conclusion of that theorem. But, assuming $K_{85}$ is large, this would contradict Theorem 5.8. It follows from (7.9) that for each retention pair $(j,k)$ there exists $a_{jk} \in [R^0_{jk}, R^0_{jk} + K_{85}\, l^{4/3} (\log l)^{2/3})$ such that
\[ P\Bigl(\bigcap_{(j,k) \in J^{\mathrm{ret}}} [R_{jk} = a_{jk}] \Bigm| \bigcap_{(j,k) \in J^{\mathrm{ret}}} Q_{jk}\Bigr) \ge \frac{1}{2} \bigl(K_{85}\, l^{4/3} (\log l)^{2/3}\bigr)^{-8}. \tag{7.10} \]
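The step from (7.9) to (7.10) is a counting (pigeonhole) argument. A sketch follows, writing $L$ as shorthand (introduced here) for $K_{85}\, l^{4/3} (\log l)^{2/3}$ and assuming that each $R_{jk}$ can take at most $L$ distinct values in an interval of length $L$, for instance because admissible areas differ by at least 1. That discreteness is an assumption of this sketch, not something stated above.

```latex
% Shorthand (not in the original): L = K_{85} l^{4/3} (\log l)^{2/3},
% and Q = \bigcap_{(j,k) \in J^{ret}} Q_{jk}.
% Assumption: each R_{jk} takes at most L values in [R^0_{jk}, R^0_{jk} + L).
\begin{align*}
\frac{1}{2}
  &\le P\bigl(R_{jk} \in [R^0_{jk}, R^0_{jk} + L)
        \text{ for all 8 retention pairs} \bigm| Q\bigr)
     && \text{by (7.9) and } R_{jk} \ge R^0_{jk} \\
  &= \sum_{(a_{jk})} P\Bigl(\bigcap_{(j,k) \in J^{\mathrm{ret}}}
        [R_{jk} = a_{jk}] \Bigm| Q\Bigr)
     && \text{a sum over at most } L^8 \text{ tuples} \\
  &\le L^8 \max_{(a_{jk})} P\Bigl(\bigcap_{(j,k) \in J^{\mathrm{ret}}}
        [R_{jk} = a_{jk}] \Bigm| Q\Bigr).
\end{align*}
% Hence some tuple (a_{jk}) has conditional probability at least (1/2) L^{-8}.
```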
Let
\[ R_1 = \sum_{(j,k) \in J^{\mathrm{ret}}} a_{jk}, \]
where the sum is over the 8 retention pairs. We call $(k,j)$ a removal pair if $\alpha^{(z_k,z_j)}$ is a connected component of the intersection of $\alpha$ with some removal slab. Let $J^{\mathrm{rem}}$ denote the set of all 8 removal pairs. For each removal pair $(k,j)$ and corresponding removal slab $F$, let $\chi_{kj}$ be a dual path from $z_k$ to $z_j$ in $F \setminus \mathrm{Co}(\{z_0,\dots,z_n\})$, and let $G_{kj}$ denote the event that all bonds in $\chi_{kj}$ are open, while all bonds in the bond boundary of $\chi_{kj}$ in $(\psi_j \cup \psi_k)^c$ are closed. We call $\chi_{kj}$ a long link. There are 2 long links in each removal slab, one each in the upper and lower half planes. Let $R_2$ be the total area in the 4 removal slabs, between the upper and lower long link in each slab. Assuming the long links are chosen to have length of order $D$ (say, at most $4D$), we have from the bounded energy property that
\[ P\Bigl(\bigcap_{(k,j) \in J^{\mathrm{rem}}} G_{kj} \Bigm| \Bigl(\bigcap_{(j,k) \in J^{\mathrm{ret}}} Q_{jk}\Bigr) \cap \Bigl(\bigcap_{(j,k) \in J^{\mathrm{ret}}} [R_{jk} = a_{jk}]\Bigr)\Bigr) \ge e^{-K_{86} D} \ge e^{-K_{87}\, l^{1/3} (\log l)^{2/3}}. \tag{7.11} \]
Define the event
\[ E = \Bigl(\bigcap_{(j,k) \in J^{\mathrm{ret}}} Q_{jk}\Bigr) \cap \Bigl(\bigcap_{(j,k) \in J^{\mathrm{ret}}} [R_{jk} = a_{jk}]\Bigr) \cap \Bigl(\bigcap_{(k,j) \in J^{\mathrm{rem}}} G_{kj}\Bigr). \]
For a configuration $\omega \in E$, there is an open dual circuit surrounding $\mathrm{Co}(\{z_0,\dots,z_n\})$ satisfying the constraint that it include all of the short links $\phi_i$, long links $\chi_{kj}$ and endpaths $\psi_j$. There is a unique outermost such circuit subject to this constraint, obtained by taking the outermost $(z_i,z_{i+1})$-cylindrical connection from $x_i$ to $y_{i+1}$ for each $i$; we denote this circuit $\Gamma_1(\omega)$. Because of the cylindrical nature of these connections and the closed state of the bond boundaries of the short and long links, we have $\Gamma_1(\omega) = \Gamma_0(\omega)$, unless $\Gamma_0(\omega)$ and $\Gamma_1(\omega)$ are disjoint with $\Gamma_0(\omega)$ surrounding $\Gamma_1(\omega)$ and no open dual path connecting $\Gamma_0(\omega)$ to $\Gamma_1(\omega)$. It therefore follows from the near-Markov property that
\[ P(\Gamma_0 \ne \Gamma_1 \mid E) \le e^{-\epsilon_{28} l} \le \frac{1}{2}. \]
By (3.2),
\[ \sum_{(j,k) \in J^{\mathrm{ret}}} |I_{jk}| \le K_{88}\, \frac{l}{s} \le K_{89}\, l^{1/3} (\log l)^{-1/3}. \]
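The bound from (3.2) is what lets per-pair error terms of size $|I_{jk}| \log l$, as in (7.8), and polynomial factors such as the one in (7.10), be absorbed into a single error of order $l^{1/3} (\log l)^{2/3}$. The arithmetic can be sketched as follows; the constant $K$ is a placeholder introduced here, not one of the numbered constants of the text.

```latex
% Sketch: both error sources are O(l^{1/3} (\log l)^{2/3}).
% K is a hypothetical constant, not one of the K_i used in the text.
\begin{align*}
\sum_{(j,k) \in J^{\mathrm{ret}}} K_{84} |I_{jk}| \log l
  &\le K_{84} K_{89}\, l^{1/3} (\log l)^{-1/3} \cdot \log l
   = K_{84} K_{89}\, l^{1/3} (\log l)^{2/3}, \\
-\log\Bigl[\bigl(K_{85}\, l^{4/3} (\log l)^{2/3}\bigr)^{-8}\Bigr]
  &= 8 \log K_{85} + \tfrac{32}{3} \log l + \tfrac{16}{3} \log\log l
   \le K\, l^{1/3} (\log l)^{2/3}
   \quad \text{for } l \text{ large.}
\end{align*}
```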
Using these facts with (7.10), (7.11), Lemma 3.1, (7.8) and (4.1) (which is still valid here), we obtain
\begin{align*}
P(|\Gamma_0| = R_1 + R_2) &\ge P(E \cap [\Gamma_0 = \Gamma_1]) \ge \frac{1}{2} P(E) \\
&\ge \frac{1}{2} \bigl(K_{85}\, l^{4/3} (\log l)^{2/3}\bigr)^{-8} e^{-K_{87}\, l^{1/3} (\log l)^{2/3}}\, P\Bigl(\bigcap_{(j,k) \in J^{\mathrm{ret}}} Q_{jk}\Bigr) \\
&\ge e^{-K_{90}\, l^{1/3} (\log l)^{2/3}} \prod_{(j,k) \in J^{\mathrm{ret}}} P(Q_{jk}) \\
&\ge \exp\Bigl(-\sum_{i=0}^{n} \tau(z_{i+1} - z_i) - K_{91}\, l^{1/3} (\log l)^{2/3}\Bigr) \\
&\ge \exp\bigl(-w_1 \sqrt{A} - K_{92}\, l^{1/3} (\log l)^{2/3}\bigr). \tag{7.12}
\end{align*}
Let $\theta\omega$ be the upward shift of a configuration $\omega$ by 1 unit, and for an event $G$ let $\theta^m G = \{\omega : \theta^{-m}\omega \in G\}$. Let $J^{\mathrm{ret}}_B$ be the set consisting of the 3 retention pairs corresponding to segments of $\alpha$ in the lower half plane. Given a constant $K_{93}$, for $m \le K_{93}\, l^{1/3} (\log l)^{2/3}$ we can replace $Q_{jk}$ with $\theta^m Q_{jk}$ for all $(j,k) \in J^{\mathrm{ret}}_B$ throughout the argument leading to (7.12) at the expense of only a possible increase in $K_{91}$, provided we alter the 4 long links $\chi_{kj}$ in the outer 2 removal slabs to connect to the appropriate shifted sites $z_i + (0,m)$ instead of to $z_i$. (The possible increase in $K_{91}$ reflects a possible reduction in the probabilities of the events $G_{kj}$, resulting in an increase of $K_{87}$ in (7.11).) We can readily keep the area $R_2$ fixed when we so alter the long links. We thereby obtain
\[ P\bigl(|\Gamma_0| = R_1 + R_2 - (2N+1)m\bigr) \ge \exp\bigl(-w_1 \sqrt{A} - K_{94}\, l^{1/3} (\log l)^{2/3}\bigr). \]
Provided $K_{93}$ is large, we can choose $m \in \mathbb{Z}$ so that
\[ |R_1 + R_2 - (2N+1)m - A| \le N. \]
We can then repeat this entire argument, but shift upward (by some amount $q \le K_{93}\, l^{1/3} (\log l)^{2/3}$) only the event $Q_{jk}$ for the central of the 3 retention pairs in $J^{\mathrm{ret}}_B$. This gives
\[ P\bigl(|\Gamma_0| = R_1 + R_2 - (2N+1)m - (2M+1)q\bigr) \ge \exp\bigl(-w_1 \sqrt{A} - K_{95}\, l^{1/3} (\log l)^{2/3}\bigr). \tag{7.13} \]
We can choose $q$ so that
\[ |R_1 + R_2 - (2N+1)m - (2M+1)q - A| \le M \le l^{2/3} (\log l)^{1/3}. \]
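Both choices of shift come down to division with remainder by an odd modulus. The sketch below treats the target quantity as an integer $t$, an assumption made here for simplicity; for real $t$, rounding $t/(2N+1)$ to the nearest integer gives a remainder of absolute value at most $N + \tfrac{1}{2}$ instead. It also ignores the size constraint on the shift, which is presumably where the requirement that $K_{93}$ be large enters.

```latex
% Sketch: symmetric remainder modulo 2N+1, for an integer t and an integer N >= 0.
% Here t stands for R_1 + R_2 - A (assumed to be an integer in this sketch).
\begin{align*}
m = \Bigl\lfloor \frac{t + N}{2N + 1} \Bigr\rfloor
\quad \Longrightarrow \quad
0 \le t + N - (2N+1)m \le 2N
\quad \Longrightarrow \quad
|t - (2N+1)m| \le N.
\end{align*}
% The same step with modulus 2M+1 in place of 2N+1 yields a remainder
% of absolute value at most M.
```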
But it is easy to see that one can alter the long links to change $R_2$ by any amount up to $l^{2/3} (\log l)^{1/3}$, so that $R_1 + R_2 - (2N+1)m - (2M+1)q = A$, at the expense of only a possible increase to $K_{95}$ in (7.13). With (7.13) this completes the proof. □
Now that we can use Theorem 7.3 in place of Theorem 4.1, all proofs leading to (2.13)–(2.15) of Theorems 2.1 and 2.2 remain valid under conditioning on $|\mathrm{Int}(\Gamma_0)| = A$ in place of $|\mathrm{Int}(\Gamma_0)| \ge A$. This establishes the following result.
Theorem 7.4. Under the assumptions in Theorem 2.1 or Theorem 2.2, under the measure $P(\,\cdot \mid |\mathrm{Int}(\Gamma_0)| = A)$ the conclusions (2.13)–(2.15) hold with probability approaching 1 as $A \to \infty$.
We do not include (2.16) in Theorem 7.4 because of technical difficulties in replacing $P(|\mathrm{Int}(\Gamma_0)| \ge A + \delta\sqrt{A})$ with $P(|\mathrm{Int}(\Gamma_0)| = A + \delta\sqrt{A})$ in Proposition 6.3. However, we have no reason to expect (2.16) should not be valid here as well. We conclude with the following.
Proof of Theorem 2.3. The $O(\cdot)$ is a combined upper and lower estimate; the lower estimate follows from Theorem 7.3 and the upper estimate from Theorem 5.8. □
Communicated by A. Kupiainen