Communications in Mathematical Physics - Volume 194


Author: A. Jaffe (Chief Editor)


Commun. Math. Phys. 194, 1–45 (1997)

Communications in Mathematical Physics © Springer-Verlag 1998

Modified Prüfer and EFGP Transforms and the Spectral Analysis of One-Dimensional Schrödinger Operators

Alexander Kiselev¹,*, Yoram Last², Barry Simon²,**

¹ Department of Mathematics, University of Chicago, Chicago, IL 60637, USA
² Division of Physics, Mathematics, and Astronomy, California Institute of Technology 253-37, Pasadena, CA 91125, USA

Received: 8 April 1997 / Accepted: 19 June 1997

Abstract: Using control of the growth of the transfer matrices, we discuss the spectral analysis of continuum and discrete half-line Schrödinger operators with slowly decaying potentials. Among our results we show if $V(x) = \sum_{n=1}^\infty a_n W(x - x_n)$, where $W$ has compact support and $x_n/x_{n+1} \to 0$, then $H$ has purely a.c. (resp. purely s.c.) spectrum on $(0,\infty)$ if $\sum a_n^2 < \infty$ (resp. $\sum a_n^2 = \infty$). For $\lambda n^{-1/2} a_n$ potentials, where $a_n$ are independent, identically distributed random variables with $E(a_n) = 0$, $E(a_n^2) = 1$, and $\lambda < 2$, we find singular continuous spectrum with explicitly computable fractional Hausdorff dimension.

1. Introduction

In this paper, we will study continuum and discrete Schrödinger operators on the half-line (while we don't always make them explicit, given theory in [10, 26, 32], many of our results extend to suitable whole-line problems). Explicitly, we are interested in the spectral analysis of operators $H$ on $L^2(0,\infty; dx)$ and on $\ell^2([1,\infty))$ given by

$$(Hu)(x) = -\frac{d^2u}{dx^2}(x) + V(x)u(x) \tag{1.1}$$

in the continuum case and

$$(Hu)(n) = u(n+1) + u(n-1) + V(n)u(n) \tag{1.2}$$

in the discrete case.

* Research supported in part by NSF Grant No. DMS-9022140.
** This material is based upon work supported by the National Science Foundation under Grant No. DMS-9401491. The Government has certain rights in this material.

Suitable boundary conditions are set at $x$ (or $n$) $= 0$, so that $H$ is self-adjoint (since in all our examples $V$ is limit point at infinity, a boundary condition is not needed there). We are interested in the spectral properties of such operators in situations where $|V(n)| \to 0$ as $n \to \infty$, but so slowly that the usual scattering theory will not apply.

Our main theme in this paper is that there are perturbation techniques of remarkable power for such operators based on two ideas. The first is that one can use the transfer matrix to analyze spectral properties. The transfer or fundamental matrix is a $2 \times 2$ unimodular matrix defined in the continuum case for any $E$ by

$$\begin{pmatrix} u'(x) \\ u(x) \end{pmatrix} = T_E(x,0) \begin{pmatrix} a \\ b \end{pmatrix}, \tag{1.3}$$

where $-u'' + Vu = Eu$, $u'(0) = a$, $u(0) = b$. In the discrete case

$$\begin{pmatrix} u(n+1) \\ u(n) \end{pmatrix} = T_E(n,0) \begin{pmatrix} a \\ b \end{pmatrix}, \tag{1.4}$$

where $u(1) = a$, $u(0) = b$, and $u(n+1) + u(n-1) + V(n)u(n) = Eu(n)$.

The second idea is that one can control the transfer matrix by controlling the norms of two solutions of $-u'' + Vu = Eu$ and that the critical equations are ones that involve those norms.

Two of us [22] have recently found several new criteria for singular or absolutely continuous spectra in terms of transfer matrices, and these criteria will make some of our results here possible. The perturbation equations we will use have not been systematically used in this context except in the paper of Kotani-Ushiroya [21] whose techniques have some overlap with this paper. But they didn't control the discrete case and their method is so entwined with certain martingale inequalities that it is unclear how to use them in other contexts. While we were writing up the work for this paper, we received a preprint from Remling [29] that uses this two-pronged approach and has considerable overlap with our Sects. 5 and 6. We will discuss the connection shortly.

Here are some of the theorems that we will use that relate spectral properties to behavior of $T(n)$. The first is from [22]:

Theorem 1.1. Suppose that there is a fixed sequence $n_i \to \infty$ and $S$ is a subset of $\mathbb{R}$ so that for a.e. $E \in S$, $\lim_{i\to\infty} \|T_E(n_i,0)\| = \infty$. Then $\mu_{\mathrm{ac}}(S) = 0$, where $\mu_{\mathrm{ac}}$ is the absolutely continuous part of the spectral measure for $H$.

Remarks. 1. The interesting aspect of this theorem is that $n_i$ is arbitrary. The result actually allows a more general sequence $\|T_E(n_i,m_i)\|$.
2. In typical applications, $S$ is an interval in the essential spectrum.

In the other direction, one has the following pair of results:

Theorem 1.2. Suppose $S$ is a set so that for a.e. $E \in S$, $\limsup_{x\to\infty} \|T_E(x,0)\| < \infty$. Then $\mu_{\mathrm{ac}}(Q) > 0$ for any $Q \subset S$ with $|Q| > 0$ where $|\cdot|$ = Lebesgue measure.

Theorem 1.3. Suppose there is a sequence $n_i \to \infty$ so that $\int_a^b \|T_E(n_i,0)\|^4 \, dE < \infty$. Then $(a,b) \subset \mathrm{spec}(H)$ and the spectral measure is purely absolutely continuous on $(a,b)$ and $\mu_{\mathrm{ac}}(Q) > 0$ for any $Q$ with $|Q \cap (a,b)| > 0$.


Remarks. 1. That Theorem 1.2 is implied by the Gilbert-Pearson [11] theory was noted by Stolz [36]. A simple proof can be found in [33]. Last-Simon [22] prove a stronger variant in which $\|T_E(x,0)\|$ is replaced by $\int_{-1}^{1} \|T_E(x+y,0)\| \, dy$ and $\limsup$ by $\liminf$. (The discrete analog holds with $\liminf$ and without any local integration.)
2. Theorem 1.3 is from Last-Simon [22] although the method used there is not much more than what is in Carmona [1].

As to distinguishing dense pure point from singular continuous spectrum, in one direction we have the following elementary result from Simon-Stolz [35].

Theorem 1.4. If $\sum_n \|T_E(n,0)\|^{-2} = \infty$ in the discrete case or $\int_0^\infty \|T_E(x,0)\|^{-2} \, dx = \infty$ in the continuum case, then $Hu = Eu$ has no solution which is $L^2$ at infinity.

The paradigm of results that guarantees a solution $L^2$ at $\infty$ is Ruelle's proof [30] of Osceledec's theorem. His argument is abstracted in [22]. We will need the following in Sect. 8:

Theorem 1.5. If $\lim_{n\to\infty} [\log \|T_E(n,0)\| / n^\alpha]$ exists and lies in $(0,\infty)$ for some $\alpha > 0$, then $Hu = Eu$ has an $L^2$ solution.

[22] also has a general abstract result on power decay which, to get an $L^2$ solution, requires

$$\lim_{n\to\infty} \frac{\log \|T_E(n,0)\|}{\log n} > \frac{3}{2}.$$

[22] also has an example where the limit is $\frac{3}{2}$ and there is no $\ell^2$ solution. But there are stronger results that hold a.e. in certain probabilistic situations, so we won't discuss the power decay result here. In Sect. 8, we will discuss the probabilistic result.

As for the technique to control the growth of solutions, in the continuum case we will use modified Prüfer variables defined for $E > 0$ by

$$u'(x) = \sqrt{E}\, R(x) \cos(\theta(x)), \tag{1.5a}$$
$$u(x) = R(x) \sin(\theta(x)). \tag{1.5b}$$

One finds these obey the differential equations (with $k = \sqrt{E}$)

$$\frac{d\theta}{dx} = k - \frac{V(x)}{k} \sin^2(\theta(x)), \tag{1.6}$$

$$\frac{d \log R(x)}{dx} = \frac{1}{2k} V(x) \sin(2\theta(x)). \tag{1.7}$$

Two features of these equations are immediately noteworthy:
(a) They separate in the sense that (1.6) does not involve $R$ and after solving it, one obtains $R$ by integration. That $R$ drops out of (1.6) and the right side of (1.7) is an expression of the linearity of the initial equations.
(b) If $V = 0$ in some region $(a,b)$, then in that region $R$ is constant and $\theta(x) = \theta(a) + k(x - a)$. It is this fact that leads one to take the factor $\sqrt{E}$ in (1.5a). The addition of this $\sqrt{E}$ is what distinguishes this from ordinary Prüfer transformations.
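Features (a) and (b) can be checked numerically. The following is a minimal Python sketch, not taken from the paper: it integrates (1.6) and (1.7) with a classical fourth-order Runge-Kutta step. The function names, the bump potential `bump`, the step size, and the sample values of $k$ and $\theta_0$ are illustrative assumptions.

```python
import math

def prufer_rhs(x, theta, k, V):
    """Right-hand sides of (1.6) and (1.7) for the pair (theta, log R)."""
    v = V(x)
    dtheta = k - (v / k) * math.sin(theta) ** 2
    dlog_r = (v / (2.0 * k)) * math.sin(2.0 * theta)
    return dtheta, dlog_r

def integrate_prufer(k, V, theta0, x_end, h=1e-3):
    """RK4 integration from x = 0 to x_end; returns (theta, log R) at x_end.
    R(0) = 1 is assumed, so log R starts at 0. Feature (a): log R never
    feeds back into the right-hand sides."""
    steps = int(round(x_end / h))
    x, theta, log_r = 0.0, theta0, 0.0
    for _ in range(steps):
        a1, b1 = prufer_rhs(x, theta, k, V)
        a2, b2 = prufer_rhs(x + h / 2, theta + h / 2 * a1, k, V)
        a3, b3 = prufer_rhs(x + h / 2, theta + h / 2 * a2, k, V)
        a4, b4 = prufer_rhs(x + h, theta + h * a3, k, V)
        theta += h / 6 * (a1 + 2 * a2 + 2 * a3 + a4)
        log_r += h / 6 * (b1 + 2 * b2 + 2 * b3 + b4)
        x += h
    return theta, log_r

def bump(x):
    """Hypothetical non-negative bump supported in [1, 2]."""
    t = 1.0 - 4.0 * (x - 1.5) ** 2
    return 0.8 * t * t if t > 0.0 else 0.0

if __name__ == "__main__":
    k = 1.3
    th3, lr3 = integrate_prufer(k, bump, 0.4, 3.0)
    th5, lr5 = integrate_prufer(k, bump, 0.4, 5.0)
    # Feature (b): past the bump, log R stays put and theta advances by k per unit length.
    print("change in log R on [3, 5]:", lr5 - lr3)
    print("theta(5) - theta(3) - 2k:", (th5 - th3) - 2.0 * k)
```

Because $V$ vanishes on $[3,5]$ in this example, both printed quantities should be zero up to rounding, while $\log R(3)$ itself is nonzero.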



There is a third significant feature which we will turn to momentarily. Given how common these continuum equations are, we would have expected their discrete analogs would have been rediscovered and used many times, but even after some efforts at tracking various literature, we've found them in a single chain of four papers (and we are a fifth in this chain, since we learned of them from the fourth paper!). The original discoverer of the correct equation was Thomas Eggarter [9] in 1971. He was not looking at an explicit difference equation but rather a continuum equation with $V(x) = V_0 \sum_{i=1}^{n} \delta(x - x_i)$. By integrating modified Prüfer variables across the $\delta$-functions, he was led to the transforms ($E = 2\cos(k)$),

$$u(n) - \cos(k) u(n-1) = R(n) \cos(\theta(n)), \tag{1.8a}$$
$$\sin(k) u(n-1) = R(n) \sin(\theta(n)), \tag{1.8b}$$

in which case we have, after some calculation (see Sect. 2),

$$\cot(\theta(n+1)) = \cot(k + \theta(n)) - (\sin(k))^{-1} V(n), \tag{1.9}$$

$$\frac{R(n+1)^2}{R(n)^2} = 1 - \frac{V(n)}{\sin(k)} \sin(2\theta(n) + 2k) + \frac{V(n)^2}{\sin^2(k)} \sin^2(\theta(n) + k). \tag{1.10}$$

Actually, he had only an equation of the form (1.9). The definition of $\theta(n)$ and the precise form of (1.9) are in a 1975 paper of Gredeskul-Pastur [13] who followed up on Eggarter's work. [9, 13] focus on (1.9) because they use the transform to study the integrated density of states. Pastur-Figotin [26] defined $R$ and exploited (1.10) to study the Lyapunov exponent. In recognition of these seminal works, we call (1.8) the EFGP transform. Their approach was further exploited in Chulaevsky-Spencer [2]. It will often be useful to use an equivalent form of (1.9) that appears as (2.14).

Notice that (1.9), (1.10) have the two critical properties (a), (b) mentioned for (1.6), (1.7) in the continuum case. In particular, if $V(n) = 0$ for $n$ in some interval $[n_0, n_1]$, then in that region $R(n)$ is constant and $\theta(n) = \theta(n_0) + k(n - n_0)$. While the EFGP transform was obtained by integrating a continuum $\delta$-function model, it could also be found by looking for a transform with property (b). We will explain this in Sect. 2.

[9, 13, 26, 2] all consider $V$'s with no decay as $n \to \infty$ but with a small coupling so that any calculations are only asymptotic in coupling constant. It turns out that the methods are especially well suited when $V(n) \to 0$ at infinity and one obtains results that are exact for a fixed $V$. For example, in Sect. 8, we will find exact formulas for the local Hausdorff dimensions of certain singular continuous spectral measures.

The third critical factor of the modified Prüfer and EFGP transforms is a major theme of this paper, namely, that first-order terms in $V$ are oscillatory while the second-order term has a strong tendency to be strictly positive. This idea is already seen in [26, 2], where $\gamma(E)$ is $O(g^2)$ with $g$ a coupling constant because the first-order terms drop out.

Let us be explicit about this idea. In (1.6), one might think the positivity comes via the square in $\sin^2(\theta(x))$ but that is wrong! Indeed, in writing $\sin^2(\theta) = \frac{1}{2} - \frac{1}{2}\cos(2\theta)$, it is the $\cos(2\theta)$ that is critical! Formally, (1.6) says

$$\theta(x) = kx + \theta_0 - \frac{1}{k} \int_{x_0}^{x} V(y) \sin^2(ky + \theta_0) \, dy + O(V^2) \equiv kx + \theta_0 + \delta\theta + O(V^2),$$

and then using $\sin(2\theta) = \sin(2kx + 2\theta_0) + 2\cos(2kx + 2\theta_0)\,\delta\theta + O(V^2)$, we get

$$\frac{d \log R}{dx} = t_1 + t_2 + O(V^3),$$

where

$$t_1 = \frac{1}{2k} V(x) \sin(2(kx + \theta_0)) - \frac{1}{2k^2} V(x) \left( \int_{x_0}^{x} V(y) \, dy \right) \cos(2(kx + \theta_0))$$

is the oscillatory term that is often unimportant, while

$$t_2 = \frac{1}{4k^2} \frac{d}{dx} \left[ \int_{x_0}^{x} V(y) \cos(2ky + 2\theta_0) \, dy \right]^2$$

has a positive integral, second order in $V$. In explicit cases, it is more subtle to prove the second order is strictly positive and, indeed, for examples like $V(x) = x^{-\alpha}$, $\alpha < \frac{1}{2}$, where the spectrum is absolutely continuous (by Weidmann [37]), the second-order terms do not cause divergences. This means that results that depend on a finite second-order term should hold more generally than those that depend on an infinite second-order term. Indeed, we make the following

Conjecture. If $V$ is bounded and in $L^2(\mathbb{R}, dx)$ (or $\ell^2(\mathbb{Z}_+)$), then the essential support of the a.c. part of the spectrum is all of $(0,\infty)$ (or $(-2,2)$ in the discrete case).

Our idea is that for almost all (but not all; see, e.g., [24, 25, 34]) $k$, the oscillations should kill the first-order term, and so the $L^2$ condition should suffice to give a bounded transfer matrix for a.e. $k$ and so the stated conclusion about the a.c. spectrum by Theorem 1.2.

After discussing the modified Prüfer and EFGP transforms and their relation to the growth of the transfer matrix in Sect. 2, we turn to two warm-up problems in Sects. 3 and 4. In Sect. 3, we show these transforms can replace the Harris-Lutz [15] method in many cases where that method is applicable. In Sect. 4, we look at potentials $V$ with $\limsup x|V(x)|$ finite and show that for such potentials their positive eigenvalues can only coalesce at $E = 0$. Since examples are known with countably many eigenvalues embedded in $(0,\infty)$, this result is interesting.

In Sects. 5–7, we study sparse potentials.

Definition. A Pearson potential is one of the form

$$V(x) = \sum_{n=1}^{\infty} a_n W(x - x_n), \tag{1.11}$$

where $W$ is a bounded, non-negative function of compact support, $a_n \to 0$, and

$$1 \le x_1 < x_2 < x_3 < \cdots, \qquad \frac{x_n}{x_{n+1}} \to 0. \tag{1.12}$$



The name is in honor of David Pearson who considered potentials of the form (1.11) where $\sum_{n=1}^{\infty} a_n^2 = \infty$ and $x_n$ went to infinity sufficiently fast. To make things precise, think of the example $x_n = n!$. Our major goal in Sects. 5–6 is to prove the following:

Theorem 1.6. Let $V$ be a Pearson potential. Then
(1) If $\sum_{n=1}^{\infty} a_n^2 < \infty$, the spectrum of $-\frac{d^2}{dx^2} + V(x)$ is purely absolutely continuous on $(0,\infty)$ for any boundary condition at 0.
(2) If $\sum_{n=1}^{\infty} a_n^2 = \infty$, the spectrum of $-\frac{d^2}{dx^2} + V(x)$ is purely singular continuous on $(0,\infty)$ for any boundary condition at 0.

In Sect. 5, we will actually prove a stronger version of (1):

Theorem 1.6′. Let $V$ have the form (1.11) where

$$\limsup \frac{x_n}{x_{n+1}} < 1. \tag{1.13}$$

Then (1) holds.

Pearson [27, 28] proved a weak version of (2) in that if $\sum_{n=1}^{\infty} a_n^2 = \infty$, there exists some set of $x_n$'s so that the spectrum is purely singular continuous. In [27], there are hints that a result of type (1) (again with $x_n$ sufficiently large) should hold, but nothing explicit. As noted at the end of Sect. 5, for (1) the bumps $W(x - x_n)$ can be $n$-dependent. At the end of Sect. 6, for $[a,b] \equiv S \subset (0,\infty)$, we construct Pearson-like potentials (bumps whose width grows with $n$) so that there is purely a.c. spectrum on $S$ and purely s.c. spectrum on $(0,\infty) \setminus S$.

In a recent paper, coincident with our work, Remling [29] obtained results related to Theorem 1.6(1) using similar methods. He only obtains the existence of absolutely continuous spectrum (his results are consistent with simultaneous singular continuous spectrum while we prove there is none), and he needs at least exp( 23 n log n) growth on the $x_n$ (whereas, if $f(n)$ is a monotone function with $f(m) \to \infty$ no matter how slowly, then $x_n = \exp(n f(n))$ obeys (1.12) and $x_n = \exp(an)$ obeys (1.13)).

After this manuscript was completed, we obtained a preliminary version of a preprint of Molchanov [23] with considerable overlap with our results in Sects. 5 and 6.

In Sect. 7, we will prove

Theorem 1.7. Let $x_n \in \mathbb{Z}$ obey $x_n / x_{n+1} \to 0$. Let $V$ be the potential with

$$V(x_n) = a_n, \qquad V(x) = 0 \ \text{ if } x \neq x_n \text{ for any } n.$$

Then,
(1) If $\sum a_n^2 < \infty$, the discrete Schrödinger operator with potential $V$ has purely a.c. spectrum on $(-2,2)$.
(2) If $\sum a_n^2 = \infty$, the operator has purely singular continuous spectrum on $(-2,2)$.
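To make the setting of Theorem 1.7 concrete, here is a small Python sketch (not from the paper) that builds the discrete transfer matrix of (1.4) for a sparse potential and evaluates its operator norm. The choices $x_j = j!$, $a_j = 1/j$, and the helper names are hypothetical, picked only for illustration; the sketch illustrates the objects involved and proves nothing about the spectrum.

```python
import math

def sparse_potential(num_bumps):
    """Hypothetical sparse potential: V(x_j) = a_j with x_j = j! and a_j = 1/j."""
    return {math.factorial(j): 1.0 / j for j in range(2, num_bumps + 1)}

def transfer_matrix(E, V, n):
    """Product of one-step matrices [[E - V(m), -1], [1, 0]] for m = 1..n,
    mapping (u(1), u(0)) to (u(n+1), u(n)) as in (1.4)."""
    t = [[1.0, 0.0], [0.0, 1.0]]
    for m in range(1, n + 1):
        e = E - V.get(m, 0.0)
        # Left-multiply the running product by the one-step matrix.
        t = [[e * t[0][0] - t[1][0], e * t[0][1] - t[1][1]],
             [t[0][0], t[0][1]]]
    return t

def operator_norm(t):
    """Largest singular value of a real 2x2 matrix, from its
    Frobenius norm squared and its determinant."""
    s = sum(x * x for row in t for x in row)
    det = t[0][0] * t[1][1] - t[0][1] * t[1][0]
    return math.sqrt((s + math.sqrt(max(s * s - 4.0 * det * det, 0.0))) / 2.0)

if __name__ == "__main__":
    V = sparse_potential(6)          # bumps at 2, 6, 24, 120, 720
    for n in (1, 5, 23, 119, 719, 720):
        print(n, operator_norm(transfer_matrix(1.0, V, n)))
```

Since each one-step matrix has determinant 1, the product is unimodular and its norm is at least 1; between bumps the norm can only change through the free dynamics.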



In Sects. 8 and 9, we discuss models with randomness and decay, first studied by Simon [31] and then by Delyon et al. [7], Delyon [6], and Kotani-Ushiroya [21]. Typical of the models discussed in these sections is ($g$ is a positive constant) $V(n) = g n^{-\alpha} a_n$, where the $a_n$ are independent, identically distributed random variables, uniformly distributed in $[-1,1]$. We prove

(i) If $\alpha > \frac{1}{2}$, the spectrum is almost surely purely absolutely continuous in $(-2,2)$.
(ii) If $0 < \alpha < \frac{1}{2}$, the spectrum is almost surely dense pure point in $(-2,2)$.
(iii) If $\alpha = \frac{1}{2}$, the spectrum is almost surely purely singular continuous in the region $|E| < (4 - \frac{1}{3} g^2)^{1/2}$ and dense pure point in the region $(4 - \frac{1}{3} g^2)^{1/2} \le |E| < 2$ (if $g^2 > 12$, interpret $(4 - \frac{1}{3} g^2)^{1/2}$ as 0).
(iv) In case $\alpha = \frac{1}{2}$ and $g^2 < 12$, in the region $|E| < (4 - \frac{1}{3} g^2)^{1/2}$, the spectrum has fractional Hausdorff dimension with local dimension $(4 - E^2 - \frac{g^2}{3}) / (4 - E^2)$.
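For orientation, the threshold in (iii) and the local dimension in (iv) are easy to tabulate. The Python sketch below merely evaluates the two formulas stated above for the case $\alpha = \frac{1}{2}$; the function names are hypothetical and nothing here is derived independently.

```python
import math

def sc_threshold(g):
    """The value (4 - g^2/3)^(1/2) from item (iii), read as 0 when g^2 > 12."""
    return math.sqrt(max(4.0 - g * g / 3.0, 0.0))

def local_dimension(E, g):
    """Local Hausdorff dimension from item (iv) for alpha = 1/2.
    Returns None outside the singular continuous region |E| < sc_threshold(g)."""
    if abs(E) >= sc_threshold(g):
        return None
    return (4.0 - E * E - g * g / 3.0) / (4.0 - E * E)

if __name__ == "__main__":
    g = math.sqrt(6.0)
    print("threshold:", sc_threshold(g))
    for E in (0.0, 0.5, 1.0, 1.4):
        print(E, local_dimension(E, g))
```

The dimension decreases from its value at $E = 0$ to 0 as $|E|$ approaches the threshold, which is where the dense pure point region of (iii) begins.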

Section 8 handles the discrete case, and Sect. 9 the continuum case. For sparse potentials, we give the details in the continuum case and sketch the discrete case; while for random decaying potentials, we give details in the discrete case and sketch the continuum case. A.K. would like to thank the hospitality of I.H.E.S., and B.S. the hospitality of Hebrew University where some of this work was done.

2. Modified Prüfer and EFGP Transforms

We will be interested in solutions of

$$-u''(x) + V(x)u(x) = k^2 u(x). \tag{2.1}$$

Change variables to

$$u'(x) = k R(x) \cos(\theta(x)), \tag{2.2a}$$
$$u(x) = R(x) \sin(\theta(x)). \tag{2.2b}$$

These are called modified Prüfer variables. The $2\pi$ ambiguity in $\theta$ is fixed by choosing $\theta(0) \in [0, 2\pi)$ and demanding $\theta(x)$ be continuous in $x$. Then a straightforward calculation shows (2.1) is equivalent to the pair of equations

$$\frac{d\theta}{dx} = k - \frac{V(x)}{k} \sin^2(\theta(x)), \tag{2.3}$$

$$\frac{d(\log R)(x)}{dx} = \frac{1}{2k} V(x) \sin(2\theta(x)). \tag{2.4}$$

This change of variables is so very useful because if $V = 0$, then $\theta(x) = \theta_0 + kx$, $R(x) = R_0$. We will be able to study $V$ as a perturbation about this solution.

As explained in the introduction, one needs to study the asymptotic behavior of the norm of the transfer matrix $T(x,0)$. For any $\theta_0$ in $[0,\pi)$, let $\theta(x,\theta_0)$ solve (2.3) with initial condition $\theta(0) = \theta_0$. Then let $R(x,\theta_0)$ solve (2.4) with $R(0,\theta_0) = 1$. Then



Theorem 2.1. For any $\alpha, \beta \in (0,\infty)$ and $\theta_1 \neq \theta_2$, there exist non-zero, finite constants $C_1$ and $C_2$ (independent of $x$ and $V$) so that

$$C_1 \max(R(x,\theta_1), R(x,\theta_2)) \le \|T(x,0)\| \le C_2 \max(R(x,\theta_1), R(x,\theta_2)) \tag{2.5}$$

for all $k \in (\alpha, \beta)$.

Proof. Define $\|(a,b)\|_k^2 = (ka)^2 + b^2$. Then $\min(1,k)\|(a,b)\| \le \|(a,b)\|_k \le \max(1,k)\|(a,b)\|$. So defining operator norms in terms of $\|\cdot\|_k$, we see $\min(k, k^{-1})\|T(x,0)\|_k \le \|T(x,0)\| \le \max(k, k^{-1})\|T(x,0)\|_k$, so it suffices to prove (2.5) with $\|\cdot\|_k$ rather than $\|\cdot\|$. But $\|T(x,0)\|_k \ge \max(R(x,\theta_1), R(x,\theta_2))$ is trivial and

$$\|T(x,0)\|_k \le \left\{ \min\left[ \sin(\tfrac{1}{2}|\theta_1 - \theta_2|), \cos(\tfrac{1}{2}|\theta_1 - \theta_2|) \right] \right\}^{-1} \max(R(x,\theta_1), R(x,\theta_2))$$

by the lemma below.

If $|\theta_1 - \theta_2| \le \frac{\pi}{2}$ (which can be done by replacing $\theta_1$ by $\pi + \theta_1$, if need be), then this proof shows we can take $C_1 = \min(\alpha, \beta^{-1})$, $C_2 = \max(\beta, \alpha^{-1}) [\sin(\frac{1}{2}|\theta_1 - \theta_2|)]^{-1}$.

Lemma 2.2. Let $A$ be a unimodular matrix. Let $u_\theta = (\cos(\theta), \sin(\theta))$. Then if $|\theta_1 - \theta_2| \le \frac{\pi}{2}$,

$$\|A\| \le \sin(\tfrac{1}{2}|\theta_1 - \theta_2|)^{-1} \max(\|A u_{\theta_1}\|, \|A u_{\theta_2}\|).$$

Proof. There exists $\theta_0$ so that $\|A u_\theta\|^2 \ge \|A\|^2 \sin^2(\theta - \theta_0)$. If $|\theta_1 - \theta_2| < \frac{\pi}{2}$, for any $\theta_0$ at least one of $|\sin(\theta_0 - \theta_i)|$ is larger than or equal to $|\sin(\frac{1}{2}(\theta_1 - \theta_2))|$.

Remark. One might worry that the lemma involves $\|\cdot\|$ and not $\|\cdot\|_k$ but

$$\|A\|_k = \left\| \begin{pmatrix} k & 0 \\ 0 & 1 \end{pmatrix} A \begin{pmatrix} k & 0 \\ 0 & 1 \end{pmatrix}^{-1} \right\|$$

and this product is also unimodular.

For the discrete case, we are interested in solutions of ($0 \le k \le \pi$)

$$u(n+1) + u(n-1) + V(n)u(n) = 2\cos(k) u(n). \tag{2.6}$$

EFGP variables $R(n)$, $\theta(n)$ are defined by

$$R(n) \cos(\theta(n)) = u(n) - \cos(k) u(n-1), \tag{2.7a}$$
$$R(n) \sin(\theta(n)) = \sin(k) u(n-1). \tag{2.7b}$$

A priori $\theta(n)$ is only determined mod $2\pi$. We will fix this ambiguity later. Noticing that

$$R(n) \sin(k + \theta(n)) = \sin(k) u(n), \tag{2.8}$$

we see that

$$\frac{u(n)}{u(n-1)} = \frac{\sin(k + \theta(n))}{\sin(\theta(n))}. \tag{2.9}$$


Similarly,

$$R(n) \cos(k + \theta(n)) = \cos(k) u(n) - u(n-1). \tag{2.10}$$

Thus,

$$\cot(k + \theta(n)) = \frac{\cos(k) u(n) - u(n-1)}{\sin(k) u(n)},$$

where by definition,

$$-\cot(\theta(n+1)) = \frac{\cos(k) u(n) - u(n+1)}{\sin(k) u(n)}.$$

Thus, (2.6) is equivalent to

$$\cot(\theta(n+1)) = \cot(k + \theta(n)) - \frac{V(n)}{\sin(k)}. \tag{2.11}$$

Writing $\bar\theta(n) \equiv \theta(n) + k$, we see, using first (2.7) and then (2.8)/(2.10):

$$\begin{aligned}
R(n+1)^2 &= \sin^2(k) u(n)^2 + (u(n+1) - \cos(k) u(n))^2 \\
&= \sin^2(k) u(n)^2 + (u(n-1) - \cos(k) u(n) + V(n) u(n))^2 \\
&= R(n)^2 \sin^2(\bar\theta(n)) + R(n)^2 \left[ \cos(\bar\theta(n)) - \frac{V(n)}{\sin(k)} \sin(\bar\theta(n)) \right]^2 \\
&= R(n)^2 \left[ 1 - \frac{V(n)}{\sin(k)} \sin(2\bar\theta(n)) + \frac{V(n)^2}{\sin^2(k)} \sin^2(\bar\theta(n)) \right].
\end{aligned}$$

We can summarize with the EFGP equations:

$$\nu_k(n) \equiv -\frac{V(n)}{\sin(k)}; \qquad \bar\theta(n) = \theta(n) + k, \tag{2.12a}$$

$$\cot(\theta(n+1)) = \cot(\bar\theta(n)) + \nu_k(n), \tag{2.12b}$$

$$\frac{R(n+1)^2}{R(n)^2} = 1 + \nu_k(n) \sin(2\bar\theta(n)) + \nu_k(n)^2 \sin^2(\bar\theta(n)). \tag{2.12c}$$

We will fix the ambiguity in $\theta$ by demanding $\theta(n+1) - \bar\theta(n) \in [-\pi, \pi)$. Equation (2.12) can be regarded as analogs of the modified Prüfer equations in that if $V = 0$, $R(n)$ = constant, and $\theta(n) = \theta(0) + kn$.

As noted in the introduction, Eggarter arrived at the first version of the EFGP transform by looking at continuum models with $\delta$-function potential ((2.12b) is especially transparent in this mode). But one could have arrived at it by noting that when $V(n) \equiv 0$, the transfer matrix is powers of $\begin{pmatrix} 2\cos(k) & -1 \\ 1 & 0 \end{pmatrix}$. This matrix has eigenvalues $e^{\pm ik}$ and so it must be similar to $\begin{pmatrix} \cos(k) & \sin(k) \\ -\sin(k) & \cos(k) \end{pmatrix}$. That similarity transformation will make the powers simple. Indeed,

$$\begin{pmatrix} 0 & \sin(k) \\ 1 & -\cos(k) \end{pmatrix} \begin{pmatrix} 2\cos(k) & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} \cos(k) & \sin(k) \\ -\sin(k) & \cos(k) \end{pmatrix} \begin{pmatrix} 0 & \sin(k) \\ 1 & -\cos(k) \end{pmatrix}$$

so the transform (2.7) precisely realizes the similarity.

There is an analog of Theorem 2.1. Define $R(n,\theta)$ by requiring $R(1) = 1$, $\theta(1) = \theta$ in $[0,\pi)$. Then



Theorem 2.3. For any $\alpha \in (0, \frac{\pi}{2})$ and $\theta_1 \neq \theta_2$, there exist non-zero, finite constants $C_1$ and $C_2$ (independent of $n$ and $V$) so that for all $k \in (\alpha, \pi - \alpha)$,

$$C_1 \max(R(n,\theta_1), R(n,\theta_2)) \le \|T(n-1,0)\| \le C_2 \max(R(n,\theta_1), R(n,\theta_2)). \tag{2.13}$$
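The EFGP equations (2.12) can be sanity-checked against the difference equation (2.6) directly. The Python sketch below is an illustration only, not the authors' code: it advances $(R, \theta)$ with (2.12c) and compares $R(n)$ with the value recomputed from the solution $u$ via (2.7). The branch of $\theta$ is picked with `atan2`, an assumption of this sketch that is consistent with (2.7); the sample potential and initial data are hypothetical.

```python
import math

def efgp_from_solution(u_prev, u_cur, k):
    """(R(n), theta(n)) from u(n-1), u(n) via (2.7a), (2.7b)."""
    c = u_cur - math.cos(k) * u_prev
    s = math.sin(k) * u_prev
    return math.hypot(c, s), math.atan2(s, c)

def efgp_step(r, theta, k, v):
    """One step of (2.12): returns (R(n+1), theta(n+1)) given V(n) = v."""
    nu = -v / math.sin(k)                      # (2.12a)
    tb = theta + k                             # theta-bar(n)
    ratio = 1.0 + nu * math.sin(2.0 * tb) + (nu * math.sin(tb)) ** 2   # (2.12c)
    # Same angle as cot(theta(n+1)) = cot(tb) + nu in (2.12b), with the
    # quadrant fixed by the signs of R*cos and R*sin.
    new_theta = math.atan2(math.sin(tb), math.cos(tb) + nu * math.sin(tb))
    return r * math.sqrt(ratio), new_theta

def max_relative_gap(k, V, u0, u1, n_max):
    """Largest relative gap between R(n) from (2.12c) and R(n) from (2.6)/(2.7)."""
    r, theta = efgp_from_solution(u0, u1, k)   # site n = 1
    u_prev, u_cur = u0, u1
    worst = 0.0
    for n in range(1, n_max):
        u_next = (2.0 * math.cos(k) - V(n)) * u_cur - u_prev   # (2.6)
        r, theta = efgp_step(r, theta, k, V(n))
        u_prev, u_cur = u_cur, u_next
        r_direct, _ = efgp_from_solution(u_prev, u_cur, k)
        worst = max(worst, abs(r - r_direct) / r_direct)
    return worst

if __name__ == "__main__":
    gap = max_relative_gap(1.0, lambda n: 1.0 / math.sqrt(n), 0.3, 1.0, 200)
    print("max relative gap:", gap)
```

With $V \equiv 0$ the ratio in (2.12c) is exactly 1, so $R(n)$ stays constant, which is the discrete counterpart of property (b).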

Because of the arccot, (2.12b) is somewhat awkward to deal with. Pastur-Figotin [26] have noted an equivalent form of (2.12b) which is straightforward from $e^{2i\varphi} = 1 - \dfrac{2}{1 + i\cot(\varphi)}$, viz.,

$$e^{2i\theta(n+1)} = e^{2i\bar\theta(n)} + \frac{\frac{i\nu_k(n)}{2} \left( e^{2i\bar\theta(n)} - 1 \right)^2}{1 - \frac{i\nu_k(n)}{2} \left( e^{2i\bar\theta(n)} - 1 \right)}. \tag{2.14}$$

As an application of (2.14) we have

Proposition 2.4. If $|\nu_k(n)| < \frac{1}{2}$, then

$$|\theta(n+1) - \bar\theta(n)| \le \pi |\nu_k(n)|.$$

Proof. If |νk (n)|
0, we have $\rho^{\mathrm{ac}}_\theta(T) > 0$. (ii) For any boundary condition $\theta$, $\rho^{\mathrm{sing}}_\theta(S) = 0$.

Thus, bounded transfer matrices have important spectral consequences. By Theorems 2.1 and 2.3, if we can show $R(\,\cdot\,, \theta)$ remains bounded for two initial $\theta$'s, we have boundedness of $T$. From this and (2.4), (2.12c), one easily obtains the well-known result that if $\int |V(x)| \, dx < \infty$ (resp. $\sum |V(n)| < \infty$), then the spectrum is purely a.c. in $(0,\infty)$ (resp. $(-2,2)$). Here is a result allowing more general decay, first in the continuum case.



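Theorem 3.2. Fix $k \neq 0$. Suppose that $\lim_{\beta\to\infty} \int_x^\beta V(y) e^{2iky} \, dy$ exists and that

$$W_k(x) = \int_x^\infty V(y) e^{2iky} \, dy \tag{3.1}$$

obeys

$$\int |V(x) W_k(x)| \, dx < \infty. \tag{3.2}$$

Then

$$\limsup_{x\to\infty} \|T(x,0)\| < \infty. \tag{3.3}$$

Remarks. 1. This result is not new; it is essentially due to Harris-Lutz [15]. This is a new proof.
2. This result implies that if $V(x) = \sum_{m=1}^{N} a_m \sin(k_m x)/x^\beta$, $\beta > \frac{1}{2}$, and $k \neq \pm\frac{1}{2} k_m$ for any $m$, then (3.3) holds, and so by Proposition 3.1, the spectrum is purely a.c. except for possible positive eigenvalues in $\{\frac{1}{4} k_m^2\}$.
3. In [19], Kiselev proved that if $V(x) = O(x^{-\frac{3}{4} - \epsilon})$, then (3.2) holds off a set of Lebesgue measure zero.

Proof. We will show for any $\theta_0$, $R(x,\theta_0)$ is bounded, and then one can appeal to Theorem 2.1 to complete the proof of (3.3). Write $\theta(x) = kx + \varphi(x)$, so by (2.3), $\varphi$ obeys

$$\frac{d\varphi}{dx} = -\frac{V(x)}{k} \sin^2(kx + \varphi). \tag{3.4}$$

By (2.4) (and $R(0) = 1$), since $dW_k/dx = -V(x) e^{2ikx}$,

$$\begin{aligned}
\log R(x) &= -\frac{1}{2k} \operatorname{Im} \int_0^x \frac{dW_k}{dx} e^{2i\varphi} \, dx \\
&= -\frac{1}{2k} \operatorname{Im} \left\{ \left[ W_k(x) e^{2i\varphi(x)} - W_k(0) e^{2i\theta_0} \right] - 2i \int_0^x \frac{d\varphi}{dx} e^{2i\varphi} W_k \, dx \right\}
\end{aligned}$$

if we integrate by parts. By hypothesis, $W_k(x)$ is bounded so using (3.4),

$$|\log R(x)| \le \text{bdd} + \frac{1}{k^2} \int_0^x |W_k(y) V(y)| \, dy$$

is bounded by (3.2).

Remark. A similar argument proves that

$$\lim_{x\to\infty} \left[ \theta(x) - \left( kx - \frac{1}{2k} \int_0^x V(y) \, dy \right) \right]$$

exists. This in turn lets one prove there are complex solutions $\eta_\pm(k,x)$ with

$$\eta_\pm(k,x) \exp\left( \mp i \left( kx - \frac{1}{2k} \int_0^x V(y) \, dy \right) \right) \to 1,$$

$$\eta'_\pm(k,x) \exp\left( \mp i \left( kx - \frac{1}{2k} \int_0^x V(y) \, dy \right) \right) \to \pm ik.$$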



Notice that if $V \in L^2$,

$$kx - \frac{1}{2k} \int_0^x V(y) \, dy = \int_0^x \sqrt{k^2 - V(y)} \, dy + Q(x),$$

where $\lim_{x\to\infty} Q(x)$ exists. So if $V \in L^2$, this says that WKB-type solutions exist. This is also what the Harris-Lutz method gives [19].

We are heading toward a proof of

Theorem 3.3. Fix $k \neq 0, \pi$. Suppose $V(n)$ is a discrete potential with

$$\lim_{B\to\infty} \sum_{m=n}^{B} V(m) e^{2ikm} = W_k(n)$$

exists and that

$$\sum_{n=1}^{\infty} |V(n) W_k(n)| + |V(n) W_k(n+1)| < \infty. \tag{3.5}$$

Then

$$\limsup_{n\to\infty} \|T(n,0)\| < \infty.$$

Given a function $f$ on $\{1, 2, \ldots\}$, define $(\delta f)(n) = f(n+1) - f(n)$ and note that summation by parts takes the form

$$\sum_{m=a}^{b} g(m) (\delta f)(m) = -\sum_{m=a}^{b} f(m+1) (\delta g)(m) + (fg)(b+1) - (fg)(a).$$
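The summation-by-parts identity above follows from $\delta(fg)(m) = f(m+1)(\delta g)(m) + g(m)(\delta f)(m)$ and telescoping. As a quick check, the Python sketch below evaluates both sides for sample sequences; the helper names and test sequences are illustrative assumptions.

```python
def delta(f):
    """Forward difference: (delta f)(n) = f(n + 1) - f(n)."""
    return lambda n: f(n + 1) - f(n)

def lhs(f, g, a, b):
    """Left side: sum over m = a..b of g(m) * (delta f)(m)."""
    df = delta(f)
    return sum(g(m) * df(m) for m in range(a, b + 1))

def rhs(f, g, a, b):
    """Right side: -sum f(m+1) * (delta g)(m) + (fg)(b+1) - (fg)(a)."""
    dg = delta(g)
    boundary = f(b + 1) * g(b + 1) - f(a) * g(a)
    return -sum(f(m + 1) * dg(m) for m in range(a, b + 1)) + boundary

if __name__ == "__main__":
    f = lambda m: m * m
    g = lambda m: 3 * m + 1
    print(lhs(f, g, 2, 9), rhs(f, g, 2, 9))   # the two values agree
```

Integer-valued sequences keep the arithmetic exact, so the two sides can be compared with `==` rather than a tolerance.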

Lemma 3.4. If (3.5) holds for some $k$, then $\sum_{n=1}^{\infty} |V(n)|^2 < \infty$.

Proof. Since $W$ exists, $V \to 0$ at $\infty$ and so $V$ is bounded. Thus, writing $V(n) = -e^{-2ikn} (\delta W_k)(n)$, and summing by parts,

$$\sum_{n=1}^{B} V(n)^2 = \text{bdd} + \sum_{n=2}^{B+1} V(n) W_k(n) e^{-2ikn} - \sum_{n=1}^{B} V(n) W_k(n+1) e^{-2ikn}$$

is bounded by (3.5).

Lemma 3.5. Suppose that $\{a_n\}_{n=1}^{\infty}$ is a real sequence so that

$$a_n \to 0 \quad \text{as } n \to \infty \tag{3.6}$$

and

$$\sum_{n=1}^{N} a_n \quad \text{is bounded.} \tag{3.7}$$

Then $\prod_{n=1}^{N} (1 + a_n)$ is bounded.

Proof. By (3.6), $|a_n| \to 0$, so without loss we can suppose that $|a_n| < 1$. Then $|1 + a_n| \le 1 + a_n \le e^{a_n}$ and (3.7) implies the result.


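Proof of Theorem 3.3. By (2.12c), Lemma 3.4, and Lemma 3.5, it suffices to prove that

$$\sum_{n=1}^{N} \nu_k(n) e^{2i\bar\theta(n)} \equiv G(N) \tag{3.8}$$

is bounded. Define

$$\varphi(n) = \theta(n) - k(n-1) = \bar\theta(n) - kn.$$

Proposition 2.4 and Lemma 3.4 imply that for $n$ large

$$|(\delta\varphi)(n)| \le \pi |\nu_k(n)|. \tag{3.9}$$

By the definition (3.8), and since $\nu_k(n) e^{2ikn} = (\sin k)^{-1} (\delta W_k)(n)$,

$$\begin{aligned}
G(N) &= \sum_{n=1}^{N} \delta W_k(n) (\sin k)^{-1} e^{2i\varphi(n)} \\
&= \text{bdd} - (\sin k)^{-1} \sum_{n=1}^{N} W_k(n+1) \, \delta(e^{2i\varphi})(n).
\end{aligned}$$

But $|\delta(e^{2i\varphi})| \le 2|\delta\varphi|$, so by (3.9)

$$|G(N) - \text{bdd}| \le C_1 \sum_{n=1}^{N} |W_k(n+1) \nu_k(n)| \le C_1 \sum_{n=1}^{N} \left( |W_k(n) \nu_k(n)| + |\nu_k(n)|^2 \right) < \infty.$$

Sometimes it is better to use slightly different Prüfer variables. For example, if $R$, $\theta$ are defined by

$$u'(x) = \sqrt{E - V(x)} \, R(x) \cos(\theta(x)), \qquad u(x) = R(x) \sin(\theta(x)),$$

then

$$\frac{d \log(R)}{dx} = \frac{1}{2(E - V(x))} \frac{\partial V}{\partial x} \cos^2(\theta(x)),$$

from which we see if $V(x) \to 0$ at infinity and $\frac{\partial V}{\partial x} \in L^1$, then solutions are bounded. (This is essentially the proof of Weidmann's theorem [37] in [33].) If one tries out an integration by parts argument, one needs both $\frac{\partial V}{\partial x} \in L^1$ and $V \in L^2$.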



4. Bound States for $O(x^{-1})$ Potentials

If $|V(x)| = o(x^{-1})$, Eastham-Kalf [8] show that $-\frac{d^2}{dx^2} + V(x)$ has no positive eigenvalues; more generally, if $\limsup x|V(x)| = C < \infty$, they show any eigenvalue $\lambda$ must obey $\lambda \le C^2$. On the other hand, Naboko [24] and Simon [34] have constructed $V(x)$ decaying arbitrarily slower than $x^{-1}$ with eigenvalues dense in $[0,\infty)$. In fact, Simon [34] constructed $V(x)$ with $V(x) = O(x^{-1})$ so that there are infinitely many eigenvalues $\lambda_i \to 0$, as long as $\sum \sqrt{\lambda_i} < \infty$. In this section, we will handle the borderline case and improve Eastham-Kalf [8] by showing:

Theorem 4.1. Let $V(x)$ obey $C = \limsup_{x\to\infty} x|V(x)| < \infty$. Then there are at most countably many positive eigenvalues $\lambda_n$ for which there are solutions $u_n$ of $-u_n'' + V(x) u_n = \lambda_n u_n$ and $u_n \in L^2$. Moreover,

$$\sum_n \lambda_n \le \frac{C^2}{2}. \tag{4.1}$$

Remarks. 1. We do not specify boundary conditions, that is, (4.1) is a bound on all possible boundary conditions at once.
2. There are $\lambda_n$ so that (4.1) holds, but $\sum \sqrt{\lambda_n} = \infty$ (e.g., $\lambda_n = \frac{3C^2}{\pi^2 n^2}$) so there is a gap between Simon's examples and what our bounds allow. We believe the optimal result would be to prove that $\sum_n \sqrt{\lambda_n} \le C$.

Without loss of generality by slightly increasing $C$ and looking at $[x, \infty)$, we can suppose that

$$|V(x)| \le C(1 + |x|)^{-1} \tag{4.2}$$

which we henceforth do. The following is standard (see, e.g., Eastham-Kalf [8]):

Lemma 4.2. If $V$ is bounded and $u$ solves $-u'' + Vu = \lambda u$ and $u \in L^2$, then $u' \in L^2$. In particular, $R(x,\theta_0) \in L^2$ for that $\theta_0$ with $(u(0), u'(0)) = (R_0 \sin(\theta_0), k R_0 \cos(\theta_0))$.

Proof.

$$\int_0^N |u'|^2 \, dx = u'u \Big|_0^N - \int_0^N u''u \, dx = u'u \Big|_0^N + \int_0^N (\lambda - V) u^2 \, dx,$$

so if $\lim_{N\to\infty} \int_0^N |u'|^2 \, dx = \infty$, then $\lim_{N\to\infty} u'u = \infty$, but that implies $u^2(N) = u(0)^2 + 2\int_0^N u'u \, dx \to \infty$, contradicting the fact that $u \in L^2$.

Lemma 4.3. Let $f$ and $g$ be $C^1$ functions on $[1,\infty)$ so that $|g'f| + |f'| \in L^1$. Then $\int_1^N f(x) e^{i(kx + g(x))} \, dx$ is bounded as $N \to \infty$ for any $k \neq 0$.


Proof. Write $e^{ikx} = \frac{1}{ik} \frac{d}{dx} e^{ikx}$ and integrate by parts to see that

$$\left| \int_1^N f(x) e^{i(kx + g(x))} \, dx \right| \le \frac{|f(N)|}{|k|} + \frac{|f(1)|}{|k|} + \frac{1}{|k|} \int_1^N (|f'| + |f g'|) \, dx.$$

Noting that $|f(N)| \le |f(1)| + \int_1^N |f'(y)| \, dy$, we see that the integral is bounded.

Remark. If $f(x) \to 0$ at infinity, this argument shows that $\lim_{N\to\infty} \int_1^N f(x) e^{i(kx + g(x))} \, dx$ exists.

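Lemma 4.4. Let $\{e_i\}_{i=1}^{N}$ be a set of unit vectors in a Hilbert space $\mathcal{H}$ so that

$$\alpha \equiv N \sup_{i \neq k} |\langle e_i, e_k \rangle| < 1. \tag{4.3}$$

Then

$$\sum_{i=1}^{N} |\langle g, e_i \rangle|^2 \le (1 + \alpha) \|g\|^2 \tag{4.4}$$

for any $g \in \mathcal{H}$.

Proof. Let $A$ be the $N \times N$ matrix with $a_{ij} = \langle e_i, e_j \rangle$. Note that the Hilbert-Schmidt norm of $A - 1$ is bounded by $(\sum_{i \neq j} |\langle e_i, e_j \rangle|^2)^{1/2} \le \alpha$ so (4.3) says that $A$ is invertible and $\|A\| \le 1 + \alpha$. If $B$ is its inverse, then

$$f_i = \sum_j B_{ij} e_j \tag{4.5}$$

obeys $\langle f_i, e_j \rangle = \delta_{ij}$, and thus

$$\sum_i \langle g, e_i \rangle f_i \equiv \text{Proj of } g \text{ to the span of the } e\text{'s},$$

and so

$$\|g\|^2 \ge \Big\| \sum_i \langle g, e_i \rangle f_i \Big\|^2.$$

By (4.5), $\langle f_i, f_j \rangle = B_{ij}$ and since $\langle h, A^{-1} h \rangle_{\mathbb{C}^N} \ge \|A\|^{-1} \langle h, h \rangle_{\mathbb{C}^N}$, we see that

$$\sum_{i=1}^{N} |\langle g, e_i \rangle|^2 \le \|A\| \sum_{i,j} \overline{\langle g, e_i \rangle} \langle f_i, f_j \rangle \langle g, e_j \rangle \le \|A\| \, \|g\|^2,$$

which is (4.4).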
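Lemma 4.4 is a Bessel-type inequality for almost orthogonal unit vectors, and it is easy to test on a small real example. The Python sketch below is an illustration under assumed data (the family `nearly_orthonormal` and the test vectors are hypothetical); it computes $\alpha$ from (4.3) and both sides of (4.4).

```python
import math

def dot(u, v):
    """Real inner product."""
    return sum(x * y for x, y in zip(u, v))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def nearly_orthonormal(n, eps):
    """Hypothetical family: standard basis vectors of R^n, each tilted by
    eps * (1, ..., 1) and renormalized to unit length."""
    return [normalize([(1.0 if i == j else 0.0) + eps for j in range(n)])
            for i in range(n)]

def check_lemma_4_4(vectors, g):
    """Returns (alpha, lhs, rhs) with alpha as in (4.3) and lhs, rhs the two sides of (4.4)."""
    n = len(vectors)
    alpha = n * max(abs(dot(vectors[i], vectors[j]))
                    for i in range(n) for j in range(n) if i != j)
    lhs = sum(dot(g, e) ** 2 for e in vectors)
    rhs = (1.0 + alpha) * dot(g, g)
    return alpha, lhs, rhs

if __name__ == "__main__":
    es = nearly_orthonormal(4, 0.02)
    for g in ([1.0, 1.0, 1.0, 1.0], [1.0, -2.0, 3.0, 0.5]):
        print(check_lemma_4_4(es, g))
```

In the proof of Theorem 4.1 the role of these vectors is played by the functions $e_n^{(j)}$, whose mutual inner products tend to zero as $B_j \to \infty$.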

Proof of Theorem 4.1. It obviously suffices to show for each fixed $N < \infty$ that

$$\sum_{n=1}^{N} \lambda_n \le \frac{C^2}{2}.$$

Define $R_n(x)$ to be the $R$ corresponding to the $L^2$ solution $u(x, \lambda_n)$. Normalize $u$ so $R_n(0) = 1$. By Lemma 4.2,

$$\sum_{n=1}^{N} |R_n(x)|^2 \in L^1$$

so

$$\liminf \, x \sum_{n=1}^{N} |R_n(x)|^2 = 0$$

(for if not, eventually $\sum_{n=1}^{N} |R_n(x)|^2 \ge C x^{-1}$ is not $L^1$). Thus, we can find $B_j \to \infty$ so that for $n = 1, \ldots, N$,

$$R_n(B_j) \le B_j^{-1/2}$$

or

$$\int_0^{B_j} \frac{d}{dx} (\log R_n(y)) \, dy \le -\frac{1}{2} \ln B_j,$$

so by (2.4),

$$\int_0^{B_j} V(y) \sin(2\theta_n(y)) \, dy \le -\sqrt{\lambda_n} \log B_j. \tag{4.6}$$

Now consider the Hilbert spaces $\mathcal{H}_j = L^2((0, B_j), (1 + x) \, dx)$. In $\mathcal{H}_j$, we have

$$\|V\|_{\mathcal{H}_j}^2 \le \int_0^{B_j} C^2 (1 + |x|)^{-2} (1 + x) \, dx = C^2 \log(B_j) + O(1). \tag{4.7}$$

Let

$$e_n^{(j)}(y) = \frac{\sin(2\theta_n(y))}{(1 + |y|)} \frac{1}{\sqrt{N_n^{(j)}}} \chi_{[0,B_j]}(y),$$

where

$$N_n^{(j)} = \int_0^{B_j} \frac{\sin^2(2\theta_n(y))}{(1 + |y|)} \, dy.$$

Notice that $4\theta_n(y) - 4\sqrt{\lambda_n}\, y$ and $2(\theta_n \pm \theta_m)(y) - 2(\sqrt{\lambda_n} \pm \sqrt{\lambda_m})\, y$ have derivatives that are $O(x^{-1})$ by (2.3). Thus by Lemma 4.3,

$$\int_0^{B_j} \frac{\sin(2\theta_n(y)) \sin(2\theta_m(y)) - \frac{1}{2} \delta_{nm}}{(1 + |y|)} \, dy$$

are bounded. We conclude that

$$N_i^{(j)} = \tfrac{1}{2} \log B_j + O(1), \tag{4.8}$$

$$\langle e_i^{(j)}, e_k^{(j)} \rangle = O((\log B_j)^{-1}), \qquad i \neq k. \tag{4.9}$$

Equations (4.6) and (4.8) imply that

$$\langle V, e_n^{(j)} \rangle_{\mathcal{H}_j} \le -\sqrt{2\lambda_n} \, (\log B_j)^{1/2} + O(1). \tag{4.10}$$

Since the number $N$ of eigenfunctions is fixed, but $B_j \to \infty$, for $j$ large, Lemma 4.4 applies and

$$\sum_{n=1}^{N} |\langle V, e_n^{(j)} \rangle_{\mathcal{H}_j}|^2 \le \left( 1 + O((\log B_j)^{-1}) \right) \|V\|_{\mathcal{H}_j}^2. \tag{4.11}$$

But (4.10) and (4.7) then say that

$$2 \sum_{n=1}^{N} \lambda_n \log(B_j) \le C^2 \log(B_j) + O(1),$$

so

$$\sum_{n=1}^{N} \lambda_n \le \frac{C^2}{2}.$$

5. Sparse Potentials: The Continuum, Absolutely Continuous Case

Our goal in this section is to prove assertion (1) in Theorem 1.6 and Theorem 1.6′. The idea will be to control $\|T(x)\|^4$ and then use Theorem 1.3. As explained in Sect. 1, the key is oscillations in $\sin(2\theta(x))$ for $\theta(x) \sim k x_{n+1}$ for $x$ near $x_{n+1}$. We will realize this using an integration by parts so we need a priori control on objects like $\frac{d\|T(x_n)\|}{dk}$.

Fix a Pearson potential; $a_n$ is assumed to obey $a_n \to 0$ and $x_{n+1} > x_n + 2\Delta$. Fix $\theta_0$ and solve the modified Prüfer equations for each $k \in (0,\infty)$ to get functions $\theta(x,k)$ and $R(x,k)$ (with initial conditions $\theta(x = 0, k) = \theta_0$, $R(x = 0, k) = 1$). Fix $\Delta$ so $\mathrm{supp}(W) \subset [-\Delta, \Delta]$. We need two propositions to prepare for bounds in an integration by parts:

Proposition 5.1. Suppose that $\liminf x_{n+1}/x_n > 1$. For each $a, b > 0$, there exists a constant $C$ so that for each $k \in (a,b)$,

$$\left| \frac{\partial\theta}{\partial k} (x_n + \Delta) \right| \le C x_n \tag{5.1}$$

and

$$\left| \frac{\partial^2\theta}{\partial k^2} (x_n + \Delta) \right| \le C x_n^2. \tag{5.2}$$

Moreover, uniformly for $k \in (a,b)$,

$$\lim_{x\to\infty} \frac{1}{x} \frac{\partial\theta}{\partial k}(x) = 1, \tag{5.3}$$

$$\lim_{x\to\infty} \frac{1}{x^2} \frac{\partial^2\theta}{\partial k^2}(x) = 0. \tag{5.4}$$

Proof. Let

$$\beta = \inf_n \frac{x_{n+1}}{x_n} > 1 \tag{5.5}$$

by hypothesis. As a preliminary, note that if $h$, $g$, $f$ are functions on $[a,b]$, $h$ is $C^1$ and

$$h'(x) = f(x) + g(x) h(x). \tag{5.6}$$



Then

|h(b)| ≤ (|h(a)| + (b − a)kf k∞ )e(b−a)kgk∞

as follows from the exact solution of (5.6): Z Rx g(y) dy + h(x) = h(x)e a

Rx

x

f (y)e

y

g(z) dz

(5.7)

dy.

a

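Now let $h(x) = \frac{\partial\theta}{\partial k}(x)$. From (2.3),

\[ \frac{\partial h}{\partial x} = 1 + \frac{V(x)}{k^2}\sin^2(\theta(x)) - \frac{V(x)}{k}\sin(2\theta(x))\,h. \tag{5.8} \]

This means for $x \in (x_{n-1} + \Delta, x_n - \Delta)$, we have that

\[ \frac{\partial h}{\partial x} = 1. \tag{5.9} \]

By (5.7) and (5.8),

\[ |h(x_n + \Delta)| \le e^{2C|a_n|\Delta}\bigl[|h(x_n - \Delta)| + 2\Delta + 2C|a_n|\Delta\bigr] \tag{5.10} \]

\[ \le e^{2C|a_n|\Delta}\bigl[|h(x_{n-1} + \Delta)| + (x_n - x_{n-1}) + 2C|a_n|\Delta\bigr], \tag{5.11} \]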

where we used (5.9) to go from (5.10) to (5.11). In these equations, $C$ is a constant only depending on $(a, b)$. Throughout this proof, $C$ is such a constant whose value can vary from one equation to the next. Let $\beta > 1$ be given by (5.5). Pick $n_0$ so large that for $n \ge n_0$:

\[ \beta^{-1} e^{2|a_n|C\Delta} \le \tfrac12 (1 + \beta^{-1}) \tag{5.12} \]

and

\[ \Bigl(1 + \frac{2C|a_n|\Delta}{x_n}\Bigr) e^{2|a_n|C\Delta} \le 1 + \frac{1 - \beta^{-1}}{2}. \tag{5.13} \]

Since $\beta > 1$ and $a_n \to 0$, such an $n_0$ exists. Next, pick $D \ge 2$ so

\[ |h(x_{n_0-1} + \Delta)| \le D x_{n_0-1}. \tag{5.14} \]

We claim inductively that for $n \ge n_0 - 1$, we have that

\[ |h(x_n + \Delta)| \le D x_n \tag{5.15} \]

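for by (5.14), this holds for $n = n_0 - 1$, and if it holds for $n - 1$, then by (5.11) and $x_{n-1} \le \beta^{-1} x_n$,

\[ |h(x_n + \Delta)| \le \bigl[D x_{n-1} + x_n - x_{n-1} + 2C|a_n|\Delta\bigr] e^{2C|a_n|\Delta} \]
\[ \le x_n \Bigl[(D - 1)\beta^{-1} + 1 + \frac{2C|a_n|\Delta}{x_n}\Bigr] e^{2C|a_n|\Delta} \]
\[ \le x_n \Bigl[(D - 1)\tfrac12(1 + \beta^{-1}) + 1 + \frac{1 - \beta^{-1}}{2}\Bigr] \quad \text{(by (5.12)/(5.13))} \]
\[ = x_n \Bigl[D - (D - 2)\frac{1 - \beta^{-1}}{2}\Bigr] \le D x_n \]

since $D \ge 2$. Thus, we've proven (5.15). Next, let $H(x) = h(x) - x$, so (5.8) implies that

\[ \Bigl|\frac{\partial H}{\partial x}\Bigr| \le C|a_n|(1 + |H|) \tag{5.16} \]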

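on $(x_n - \Delta, x_n + \Delta)$. Using (5.7) and (5.15), we conclude (recall the constant $C$ changes from one equation to the next!)

\[ |H(x_n + \Delta) - H(x_n - \Delta)| \le C|a_n| x_n. \]

Since $H(x_{n-1} + \Delta) = H(x_n - \Delta)$, we have that for $n \ge n_0$,

\[ \frac{|H(x_n + \Delta)|}{x_n + \Delta} \le \frac{C}{x_n + \Delta} + \sum_{m=n_0}^{n} a_m \frac{x_m}{x_n + \Delta} \le \frac{C}{x_n + \Delta} + \sum_{m=n_0}^{n} a_m \beta^{-(n-m)} \to 0 \]

as $n \to \infty$ since $\beta > 1$ and $a_m \to 0$. From this and (5.16), we see that $|H(x)/x| \to 0$ as $x \to \infty$, which proves (5.3).

To prove (5.2), let $g = \frac{\partial h}{\partial k} = \frac{\partial^2\theta}{\partial k^2}$. Then differentiating (5.8) with respect to $k$, we see that

\[ \frac{\partial g}{\partial x} = 0 \quad \text{on } (x_{n-1} + \Delta, x_n - \Delta), \tag{5.17a} \]

\[ \frac{\partial g}{\partial x} = A(x) + B(x)h(x) + D(x)g(x) + E(x)h^2(x) \quad \text{on } (x_n - \Delta, x_n + \Delta), \tag{5.17b} \]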

where $A, B, D, E$ are uniformly bounded by $Ca_n$ on this interval with $C$ uniformly bounded as $k$ runs through $(a, b)$. Now use (5.7) and (5.1) to see that

\[ |g(x_n + \Delta)| \le e^{2Ca_n\Delta}\bigl[g(x_{n-1} + \Delta) + Ca_n x_n^2 \Delta\bigr]. \]

As above, if $n$ is so large that

\[ \beta^{-2} e^{2Ca_n\Delta} \le \tfrac12(1 + \beta^{-1}) \quad\text{and}\quad (Ca_n\Delta)\, e^{2Ca_n\Delta} \le \tfrac12(1 - \beta^{-1}), \]

then inductively, $g(x_n + \Delta) \le C x_n^2$ for $n$ large. This is (5.2). Plugging this into (5.17b), we see that

\[ g(x_n + \Delta) \le C\Bigl(1 + \sum_{m=1}^{n} a_m x_m^2\Bigr), \tag{5.17c} \]

which yields $\lim_{n\to\infty} g(x_n + \Delta)/x_n^2 = 0$ from which (5.4) is immediate.


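Proposition 5.2. For any $a, b > 0$, there is a $C$ so that for all $k \in (a, b)$,

\[ \log R(x_n + \Delta) \le C \sum_{m=1}^{n} |a_m|, \tag{5.18} \]

\[ \Bigl|\frac{\partial \log R}{\partial k}(x_n + \Delta)\Bigr| \le C \sum_{m=1}^{n} |a_m x_m|. \tag{5.19} \]

Proof. By (2.4), $\log R(x)$ is constant for $x \in (x_{n-1} + \Delta, x_n - \Delta)$ and

\[ |\log R(x_n + \Delta) - \log R(x_n - \Delta)| \le 2k^{-1}|a_n| \int |W(y)|\,dy, \]

so (5.18) holds with $C = 2\min(k)^{-1}\int |W(y)|\,dy$. From (2.4), we have

\[ \frac{\partial}{\partial x}\frac{\partial}{\partial k}(k \log R) = V(x)\cos(2\theta(x))\frac{\partial\theta}{\partial k}, \]

so that the bound (5.1) implies (5.19).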

As a final preliminary, we note that

Lemma 5.3. Suppose that $\lim x_{n+1}/x_n > 1$. Then for a constant $C$,

\[ \sum_{n=1}^{\infty} \sum_{m \le n} |a_n a_m| \frac{x_m}{x_n} \le C \sum_{n=1}^{\infty} a_n^2. \]

Proof. Let $\beta = \lim x_{n+1}/x_n$. Pick $1 < \gamma < \beta$. Then for $m \le n$, $x_m/x_n \le C\gamma^{-|m-n|}$. Thus, the lemma follows from Young's inequality that

\[ T(a)_n \equiv \sum_m \gamma^{-|m-n|} a_m \]

is bounded from $\ell^2$ to $\ell^2$ for any $\gamma > 1$.
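A small numerical illustration of Lemma 5.3 may help. The sketch below assumes the model case $x_n = \beta^n$ (so $x_m/x_n = \beta^{-(n-m)}$ exactly) and a hypothetical square-summable sequence $a_n = 1/n$; neither choice is from the paper. In this case Young's inequality gives the explicit constant $C = 1/(1 - \beta^{-1})$, the $\ell^1$ norm of the one-sided kernel.

```python
def lemma_5_3_sides(a, beta):
    """Return (lhs, rhs) with
    lhs = sum_n sum_{m<=n} |a_n a_m| x_m/x_n  for x_n = beta**n,
    rhs = sum_n a_n**2.
    """
    lhs = 0.0
    for n in range(len(a)):
        for m in range(n + 1):
            # x_m / x_n = beta**(-(n - m)) in the assumed geometric model
            lhs += abs(a[n] * a[m]) * beta ** (-(n - m))
    rhs = sum(t * t for t in a)
    return lhs, rhs

beta = 2.0                                  # assumed growth ratio x_{n+1}/x_n
a = [1.0 / (n + 1) for n in range(400)]     # hypothetical coefficients a_n = 1/n
lhs, rhs = lemma_5_3_sides(a, beta)
young_constant = 1.0 / (1.0 - 1.0 / beta)   # l^1 norm of the kernel beta**(-j), j >= 0
print("lhs =", lhs, " C * rhs =", young_constant * rhs)
```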

Proof of Theorem 1.6′. Let $g$ be a non-negative $C^\infty$-function compactly supported on $(0, \infty)$. We will prove that

\[ \sup_n \int g(k) R(k, x_n + \Delta)^4\,dk < \infty. \tag{5.20} \]

Proving this for two values of $\theta_0$ and appealing to Theorem 2.1 gets a uniform bound on $\int g(k)\|T(0, x_n + \Delta)\|^4\,dk$. Theorem 1.3 then proves pure absolute continuity of the spectrum on $(0, \infty)$. Let $B_n = \int g(k)R(x_n + \Delta)^4\,dk$. Notice that by (2.4), $R(x_{n-1} + \Delta) = R(x_n - \Delta)$ and

\[ R(x_n + \Delta)^4 = R(x_n - \Delta)^4 \exp(Q_n), \tag{5.21} \]

where

\[ Q_n = \frac{2}{k}\int_{-\Delta}^{\Delta} a_n W(y)\sin(2\theta(x_n + y))\,dy. \]

Since $k^{-1}$ and $a_n$ are bounded, $Q_n$ is uniformly bounded in $n$, and so

\[ \exp(Q_n) \le 1 + Q_n + CQ_n^2 \le 1 + Q_n + Ca_n^2 \tag{5.22} \]

(where again $C$ is a constant that varies from formula to formula). For $y \in (-\Delta, \Delta)$, we have by (2.3) $|\theta(x_n + y) - \tilde\theta_n(y)| \le Ca_n$, where $\tilde\theta_n(y) = \theta(x_{n-1} + \Delta) + k(x_n + y - x_{n-1} - \Delta)$, so

\[ \Bigl|Q_n - \frac{2}{k}\int_{-\Delta}^{\Delta} a_n W(y)\sin(2\tilde\theta_n(y))\,dy\Bigr| \le Ca_n^2. \tag{5.23} \]

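By (5.21)–(5.23),

\[ B_n \le B_{n-1}(1 + Ca_n^2) + E_n, \tag{5.24} \]

where

\[ E_n = a_n \int_{-\Delta}^{\Delta} dy \int \frac{2g(k)}{k} R(x_{n-1} + \Delta, k)^4\, W(y)\sin(\tilde\theta_n(y))\,dk. \]

Notice that we're implementing our basic strategy: We separate out the second-order terms (which will present no problem since $\prod_{n=1}^{\infty}(1 + Ca_n^2) < \infty$) and need to control the first-order terms where we have an explicit highly oscillatory factor since $\theta_n \sim kx_n$. Now

\[ \frac{\partial\tilde\theta_n(y)}{\partial k} = x_n + y - x_{n-1} - \Delta + \frac{\partial\theta(x_{n-1} + \Delta)}{\partial k} > \tfrac12 x_n \tag{5.25} \]

for $n$ large by the bound (5.3). Thus, we can write

\[ \sin(\tilde\theta_n(y)) = \frac{1}{\partial\tilde\theta_n/\partial k}\,\frac{\partial}{\partial k}\bigl(-\cos(\tilde\theta_n(y))\bigr) \]

and integrate by parts. After integration by parts, we have three terms:

$E_n^{(1)}$ coming from $\dfrac{\partial[k^{-1}g(k)]}{\partial k}$,

$E_n^{(2)}$ coming from $\dfrac{\partial R^4}{\partial k}$,

$E_n^{(3)}$ coming from $\dfrac{\partial}{\partial k}\Bigl[\dfrac{1}{\partial\tilde\theta_n/\partial k}\Bigr]$.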

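For the $E_n^{(1)}$ term, we can bound $R^4$ as follows using (5.18) and $x_n \ge C\beta^n$. By (5.18), for $n$ large,

\[ R^4 \le C\exp\Bigl(C\sum_{m=1}^{n} a_m\Bigr) \le C\exp\Bigl(\frac{n\ln(\beta)}{2}\Bigr), \tag{5.26} \]

since $a_n \to 0$. Thus, by (5.25) and (5.26),

\[ E_n^{(1)} \le C\beta^{n/2}\beta^{-n} = C\beta^{-n/2}. \tag{5.27} \]

For the $E_n^{(2)}$ term, we use $\frac{\partial R^4}{\partial k} = 4R^4\frac{\partial\log R}{\partial k}$, (5.19), and (5.25) to see that

\[ E_n^{(2)} \le CB_{n-1}b_n, \quad\text{where}\quad b_n = \sum_{m=1}^{n-1} a_n a_m \frac{x_m}{x_n}. \]

Note now that by $\sum a_n^2 < \infty$ and Lemma 5.3, we have

\[ \sum_{n=1}^{\infty} b_n < \infty. \tag{5.28} \]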

For the $E_n^{(3)}$ term, we use (5.25) and (5.17c) to see that

\[ E_n^{(3)} \le B_{n-1}c_n, \quad\text{where}\quad c_n = Ca_n\,\frac{1 + \sum_{m=1}^{n-1} a_m x_m^2}{x_n^2}. \]

As in the proof of Lemma 5.3,

\[ \sum c_n < \infty. \tag{5.29} \]

By (5.24) and the above estimates on $E_n^{(i)}$,

\[ \max(B_n, 1) \le (1 + Ca_n^2 + Cb_n + Cc_n + C\beta^{-n/2})\max(B_{n-1}, 1). \]

By hypothesis, $\sum a_n^2 < \infty$, and by (5.28–5.29), $\sum b_n + c_n < \infty$. Thus

\[ \prod_{n=1}^{N} (1 + Ca_n^2 + Cb_n + Cc_n + C\beta^{-n/2}) \]

is bounded and consequently, so is $B_n$.

It is easy to see that the methods of this section extend to prove: P Theorem 5.4. Suppose V (x) = Wn (x − xn ), where (i) lim xn /xn+1 < 1, (ii) supp Wn ⊂ [−1, 1] for some fixed 1, P R 2 (iii) n |Wn (y)| dy < ∞. 2

d Then − dx 2 + V (x) has purely a.c. spectrum on (0, ∞).

(5.30)



6. Sparse Potentials: The Continuum, Singular Continuous Case

In this section, we will prove assertion (2) in Theorem 1.6. The idea will be to force $\|T(k^2, x_n)\|$ to infinity for almost all $k$ and suitable $x_n$. To do this, we will need to isolate a strictly positive second-order term and show that these second-order terms then dominate the first-order terms because of oscillations.

Here is a warm-up problem to show this cancellation mechanism. Let $X_n$ be independent, identically distributed random variables taking the values $\pm 1$ with probability $\tfrac12$. Let $\varepsilon > 0$ and let $a_n$ be a sequence going to zero as $n \to \infty$. Finally, let

\[ Y_n = \sum_{m=1}^{n} (\varepsilon a_m^2 + a_m X_m). \]

Suppose that $\sum a_n^2 = \infty$. We claim there exists a subsequence $n(i) \to \infty$, so with probability 1,

\[ \lim_{i\to\infty} Y_{n(i)} = \infty. \tag{6.1} \]

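The reason (6.1) holds is that by the central limit theorem $\sum_{m=1}^{n} a_m X_m$ is typically not more negative than $O\bigl(-\sqrt{\sum a_n^2}\bigr)$ and, because of the square root, this is smaller than $\varepsilon\sum_{m=1}^{n} a_m^2$.

To make a proof, notice that since $\sum_{m=1}^{n} a_m^2 \to \infty$, we can choose $n(i)$ so that $\sum_{m=1}^{n(i)} a_m^2 \ge i^2$. By a Chebyshev inequality,

\[ \mathrm{Prob}\Bigl(\Bigl|\sum_{m=1}^{n(i)} a_m X_m\Bigr| \ge \frac{\varepsilon}{2}\sum_{m=1}^{n(i)} a_m^2\Bigr) \le \frac{\bigl\|\sum_{1}^{n(i)} a_m X_m\bigr\|_2^2}{\bigl(\frac{\varepsilon}{2}\sum_{1}^{n(i)} a_m^2\bigr)^2} = \frac{4}{\varepsilon^2\sum_{1}^{n(i)} a_m^2} \le \frac{4}{\varepsilon^2 i^2}. \]

$\sum_i \frac{1}{i^2} < \infty$, so by the Borel-Cantelli lemma, with probability 1, eventually

\[ \Bigl|\sum_{m=1}^{n(i)} a_m X_m\Bigr| \le \frac{\varepsilon}{2}\sum_{m=1}^{n(i)} a_m^2, \]

and thus eventually,

\[ Y_{n(i)} \ge \frac{\varepsilon}{2}\sum_{m=1}^{n(i)} a_m^2 \]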

diverges. The usual Kolmogorov stopping argument that lets one prove things without subsequences isn't obviously applicable here in a situation where we assume no regularity on the $a_m$'s (see Sect. 8 for the case $a_m = m^{-\alpha}$). Since a subsequence suffices for our application, we have not tried to push the argument through to get $\lim Y_n = \infty$, even in the toy problem. Notice that independence of the $X_n$'s was not needed; rather, it suffices to have enough control of $E(X_n X_m)$ to show that the first-order term is small compared to the second-order term. In the case at hand, we will use integration by parts in $k$ as we did in the last section to get this control. We summarize the key to the above argument with

Lemma 6.1. Let $P_n, Q_n$ be random variables so that

(i) $P_n(x) \ge \alpha_n > 0$ for a.e. $x$ and positive reals $\alpha_n$,
(ii) $\sum \alpha_n^{-1}\,\mathrm{Exp}(|Q_n|) < \infty$,
(iii) $\lim_{n\to\infty} \alpha_n = \infty$.

Then $P_n(x) + Q_n(x) \to \infty$ for a.e. $x$. If (ii) is replaced with

(ii′) $\lim_{n\to\infty} \alpha_n^{-1}\,\mathrm{Exp}(|Q_n|) = 0$,

then there exists a subsequence $n(i)$ so that $P_{n(i)}(x) + Q_{n(i)}(x) \to \infty$ for a.e. $x$.

Proof. If (ii′) holds, we can find a subsequence so that (ii) holds. Thus, it suffices to prove the result assuming (ii). By (ii), $\sum \alpha_n^{-1}|Q_n(x)| < \infty$ for a.e. $x$. In particular, $\alpha_n^{-1}Q_n(x) \to 0$ so $P_n + Q_n \ge \alpha_n[1 - \alpha_n^{-1}|Q_n(x)|] \to \infty$.

We will also need the following lemma:

Lemma 6.2. Suppose that $B_n, \alpha_n, \beta_n \ge 0$ are real numbers and that

\[ B_n \le B_{n-1} + 2\alpha_n\sqrt{B_{n-1}} + \beta_n \quad (n \ge 1). \tag{6.2} \]

Then,

\[ \sqrt{B_n} \le \sqrt{B_0} + \sum_{k=1}^{n} \alpha_k + \sqrt{\sum_{k=1}^{n} \beta_k}. \tag{6.3} \]

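Proof. We give a proof by induction. Equation (6.3) holds for $n = 0$. Let $a_n = \sum_{k=1}^{n} \alpha_k$, $b_n = \sum_{k=1}^{n} \beta_k$. By the induction hypothesis,

\[ \sqrt{B_{n-1}} \le \sqrt{B_0} + a_{n-1} + \sqrt{b_{n-1}}. \tag{6.4} \]

Equation (6.2) implies that

\[ B_n \le \bigl(\sqrt{B_{n-1}} + \alpha_n\bigr)^2 + \beta_n. \]

So by (6.4),

\[ B_n \le \bigl(\sqrt{B_0} + a_n + \sqrt{b_{n-1}}\bigr)^2 + \beta_n \]
\[ \le \bigl(\sqrt{B_0} + a_n\bigr)^2 + b_n + 2\sqrt{b_{n-1}}\bigl(\sqrt{B_0} + a_n\bigr) \]
\[ \le \bigl(\sqrt{B_0} + a_n + \sqrt{b_n}\bigr)^2, \]

proving (6.3) inductively.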
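Lemma 6.2 is easy to probe numerically. The sketch below iterates the recursion (6.2) with equality (the worst case) for randomly chosen non-negative $\alpha_n$, $\beta_n$ and records the largest value of $\sqrt{B_n}$ minus the right side of (6.3). The random inputs, seed, and function name are illustrative assumptions only.

```python
import math
import random

def worst_gap_lemma_6_2(b0, alphas, betas):
    """Iterate B_n = B_{n-1} + 2*alpha_n*sqrt(B_{n-1}) + beta_n and return
    max_n [ sqrt(B_n) - (sqrt(B_0) + sum alpha_k + sqrt(sum beta_k)) ].
    Lemma 6.2 asserts this quantity is never positive."""
    b = b0
    alpha_sum = 0.0
    beta_sum = 0.0
    worst = -math.inf
    for alpha, beta in zip(alphas, betas):
        b = b + 2.0 * alpha * math.sqrt(b) + beta
        alpha_sum += alpha
        beta_sum += beta
        claimed = math.sqrt(b0) + alpha_sum + math.sqrt(beta_sum)
        worst = max(worst, math.sqrt(b) - claimed)
    return worst

rng = random.Random(0)                       # fixed seed for reproducibility
alphas = [rng.random() for _ in range(200)]  # hypothetical alpha_n in [0, 1)
betas = [rng.random() for _ in range(200)]   # hypothetical beta_n in [0, 1)
worst_gap = worst_gap_lemma_6_2(2.0, alphas, betas)
print("largest violation of (6.3):", worst_gap)
```

A non-positive result (up to rounding) is what the lemma predicts; the inequality becomes an equality only when all $\beta_n$ vanish.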

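So fix a Pearson potential with $\sum a_n^2 = \infty$. Fix $\theta_0$ and let $R(x, k)$ be the solution of (2.3/2.4). Let $Y_n(k) = \log R(x_n + \Delta, k)$ and $\delta Y_n(k) = Y_n(k) - Y_{n-1}(k)$. By (2.4),

\[ \delta Y_n(k) = \frac{a_n}{2k}\int_{-\Delta}^{\Delta} W(y)\sin 2\theta(x_n + y)\,dy. \tag{6.5} \]

As in Sect. 5, we write $\tilde\theta_n(y) = \theta(x_{n-1} + \Delta) + k(x_n + y - x_{n-1} - \Delta)$. But we expand $\theta$ to the next order by letting

\[ \theta_n^{(1)}(y) = -\frac{a_n}{k}\int_{-\Delta}^{y} W(s)\sin^2(\tilde\theta_n(s))\,ds. \tag{6.6} \]

Then by (2.3),

\[ \theta(x_n + y) = \tilde\theta_n(y) + \theta_n^{(1)}(y) + O(a_n^2), \]

so by (6.5),

\[ \delta Y_n(k) = a_n X_n^{(1)} + a_n^2 S_n + O(a_n^3), \tag{6.7a} \]
\[ X_n^{(1)} = \frac{1}{2k}\int_{-\Delta}^{\Delta} W(y)\sin(2\tilde\theta_n(y))\,dy, \tag{6.7b} \]
\[ S_n = \frac{1}{k}\int_{-\Delta}^{\Delta} W(y)\cos(2\tilde\theta_n(y))\,\frac{\theta_n^{(1)}(y)}{a_n}\,dy. \tag{6.7c} \]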

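In the formula for $\theta_n^{(1)}$, use $\sin^2(\tilde\theta_n(y)) = \tfrac12(1 - \cos(2\tilde\theta_n(y)))$. The cos term from this formula when plugged into (6.7c) gives

\[ \frac{1}{2k^2}\int_{-\Delta}^{\Delta} W(y)\cos(2\tilde\theta_n(y))\Bigl(\int_{-\Delta}^{y} W(s)\cos(2\tilde\theta_n(s))\,ds\Bigr)dy = \frac{1}{4k^2}\Bigl(\int_{-\Delta}^{\Delta} W(y)\cos(2\tilde\theta_n(y))\,dy\Bigr)^2. \tag{6.8} \]

We lump the contribution of the $\tfrac12$ term with the first-order term. Defining $X(y) = \int_{-\Delta}^{y} W(s)\,ds$, we find

\[ \delta Y_n(k) = [a_n^2 Z_n(k) + a_n X_n(k)] + O(a_n^3), \tag{6.9} \]

where

\[ Z_n(k) = \frac{1}{4k^2}\Bigl(\int_{-\Delta}^{\Delta} W(y)\cos(2\tilde\theta_n(y))\,dy\Bigr)^2, \]
\[ X_n(k) = \frac{1}{2k}\int_{-\Delta}^{\Delta}\Bigl[W(y)\sin(2\tilde\theta_n(y)) - \frac{a_n W(y)X(y)}{k}\cos(2\tilde\theta_n(y))\Bigr]dy. \]

In (6.9), the $O(a_n^3)$ means an error bounded by $Ca_n^3$, where $C$ is a finite constant for $k \in [a, b]$ any compact subinterval of $(0, \infty)$. Define

\[ \widetilde W(k) = \int_{-\Delta}^{\Delta} W(y)e^{2iky}\,dy. \]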

Then,

\[ Z_n(k) = \frac{1}{8k^2}|\widetilde W(k)|^2 + \tilde X_n(k), \tag{6.10} \]

where

\[ \tilde X_n(k) = \frac{1}{8k^2}|\widetilde W(k)|^2\cos\bigl(4(\tilde\theta_n(0, k) + \varphi(k))\bigr), \tag{6.11} \]

where $\varphi(k) = \tfrac12\mathrm{Arg}(\widetilde W(k))$. For let $\tilde\theta_n(y) = \tilde\theta_n(0) + ky$. If $\widetilde W(k) = |\widetilde W(k)|e^{2i\varphi(k)}$, then

\[ Z_n(k) = \frac{1}{4k^2}\Bigl[\mathrm{Re}\int_{-\Delta}^{\Delta} W(y)e^{2i(\tilde\theta_n(0) + ky)}\,dy\Bigr]^2 = \frac{1}{4k^2}|\widetilde W(k)|^2\cos^2\bigl(2(\tilde\theta_n(0, k) + \varphi(k))\bigr). \]

Proof of Theorem 1.6, Part (2). Let

\[ P_n(k) = \frac{1}{8k^2}|\widetilde W(k)|^2\sum_{m=1}^{n} a_m^2, \qquad Q_n(k) = Y_n(k) - P_n(k), \qquad \delta Q_n(k) = Q_n(k) - Q_{n-1}(k), \]

so

\[ \delta Q_n(k) = a_n^2\tilde X_n(k) + a_n X_n(k) + O(a_n^3). \]

Let $g$ be a $C^\infty$-function compactly supported in $\{k \in (0, \infty) \mid \widetilde W(k) \ne 0\}$. Let

\[ B_n = \int g(k)\Bigl|\sum_{m=1}^{n} a_m X_m(k)\Bigr|^2 dk, \qquad \tilde B_n = \int g(k)\Bigl|\sum_{m=1}^{n} a_m^2\tilde X_m(k)\Bigr|^2 dk. \]

We will prove that

\[ \sqrt{B_n}\Big/\sum_{m=1}^{n} a_m^2 \to 0 \tag{6.12} \]

as $n \to \infty$, and similarly for $\tilde B_n$. Since $\sum_{m=1}^{n} a_m^3 / \sum_{m=1}^{n} a_m^2 \to 0$ (on account of $a_n \to 0$ and $\sum_{m=1}^{n} a_m^2 \to \infty$), (6.12) and the Schwarz inequality imply that

\[ \int g(k)|Q_n(k)|\,dk\Big/\sum_{m=1}^{n} a_m^2 \to 0, \]

so Lemma 6.1, together with $\inf_{k\in\operatorname{supp} g} |\widetilde W(k)|^2/8k^2 > 0$, implies that there is a subsequence $n(i)$ so that $Y_{n(i)}(k) \to \infty$ for a.e. $k$ in $\operatorname{supp} g$. By doing this for two values of $\theta_0$ and using Theorem 2.1 and Theorem 1.1, we conclude there is no a.c. spectrum on $\operatorname{supp} g$. Since $\widetilde W$ is an entire function, it has isolated zeros and thus, this argument shows $\sigma_{ac}$ is empty. By Theorem 1.4, $\sigma_{pp} \cap (0, \infty)$ is empty, and an elementary argument proves that $\sigma(H) \supset [0, \infty)$. So the spectrum on $(0, \infty)$ is purely singular continuous.

It thus suffices to prove (6.12) (the proof for $\tilde B_n$ is essentially identical).


Let $M_{n-1}(k) = \sum_{m=1}^{n-1} a_m X_m(k)$. Then

\[ B_n \le B_{n-1} + 2\int g(k)M_{n-1}(k)a_n X_n(k)\,dk + Ca_n^2 \]

for a suitable constant $C$. Now $X_n$ has $\cos(2\tilde\theta_n(y))$ and $\sin(2\tilde\theta_n(y))$ terms. As in the last section, we write those as a suitable $[d\tilde\theta_n(y, k)/dk]^{-1}\frac{d}{dk}[\ldots]$ and integrate by parts and get three terms:

One coming from $\frac{\partial[k^{-1}g(k)]}{\partial k}$. Noting that $|M_n(k)| \le Cn$, we have that this is bounded by $Cn/x_n$.

One coming from $\frac{\partial M_n(k)}{\partial k}$. Using (5.1), this term is bounded by

\[ C\sum_{m=1}^{n-1} a_n a_m\frac{x_m}{x_n}. \]

One coming from $L_n = \bigl[\frac{\partial^2\tilde\theta_n}{\partial k^2}\bigr]/\bigl[\frac{\partial\tilde\theta_n}{\partial k}\bigr]^2$. As in the last section, this $L_n$ is bounded by $C(\sum_{m=1}^{n-1} a_m x_m^2)/x_n^2$. We can use the Schwarz inequality to control $\int g(k)|M_n(k)|\,dk$, and so bound this term by $C\sqrt{B_{n-1}}\,a_n\sum_{m=1}^{n-1} a_m x_m^2/x_n^2$.

The net result is the bound

\[ B_n \le B_{n-1} + 2\alpha_n\sqrt{B_{n-1}} + \beta_n, \tag{6.13} \]

where

\[ \alpha_n = C\sum_{m=1}^{n-1} |a_n a_m|\frac{x_m^2}{x_n^2} \]

and

\[ \beta_n = C\Bigl(a_n^2 + \frac{n}{x_n} + \sum_{m=1}^{n-1} a_n a_m\frac{x_m}{x_n}\Bigr). \]

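By the argument in Lemma 5.3 with $x_{n-1}/x_n \to 0$ and that $\sum_{m=1}^{\infty} a_m^2 = \infty$, we see

\[ \sum_{m=1}^{n} \alpha_m\Big/\sum_{m=1}^{n} a_m^2 \to 0, \tag{6.14} \]

and that

\[ \sum_{m=1}^{n} \beta_m \le C\Bigl(1 + \sum_{m=1}^{n} a_m^2\Bigr), \]

so

\[ \sqrt{\sum_{m=1}^{n} \beta_m}\Big/\sum_{m=1}^{n} a_m^2 \to 0. \tag{6.15} \]

Lemma 6.2 and (6.13–6.15) imply (6.12).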

One can modify this construction to make examples of decaying potentials for which the associated Schrödinger operator has regions of a.c. spectrum and regions of s.c. spectrum. The idea is to arrange that $\widetilde W(k)$ vanishes in a whole interval so that even though $a_n \notin \ell^2$, we have a.c. spectrum for those $k$. Of course, $\widetilde W(k)$ cannot vanish if $W$ has compact support, so we will take the bump functions of increasing support converging toward a function whose Fourier transform vanishes in an interval.

So, let $S = [a, b] \subset (0, \infty)$. Let $f$ be an even Schwartz class function that vanishes if $k^2 \in S$ and is strictly positive on $[0, \infty)\setminus S$. Let $a_n = n^{-1/2}$, $x_n = (n!)^2$, $\Delta_n = n^{1/12}$. Notice that $\sum a_n^2 = \infty$. Define

\[ \tilde f(x) = \frac{1}{4\pi}\int \exp(-2ikx)f(k)\,dk, \qquad W_n(x) = \tilde f(x)\chi_{(-\Delta_n, \Delta_n)}(x) \]

and

\[ V(x) = \sum_n a_n W_n(x - x_n). \]

We are heading toward:

Theorem 6.3. The half-axis Schrödinger operator $-\frac{d^2}{dx^2} + V(x)$ has purely singular spectrum on $(0, \infty)\setminus S$ and purely a.c. spectrum on $S$.

Lemma 6.4. For any $m > 0$, there exists a constant $C_m$ with

\[ \Bigl|f(k) - \int e^{2ikx}W_n(x)\,dx\Bigr| \le C_m n^{-m}. \tag{6.16} \]

Proof. Let $f_n(k) = \int e^{-2ikx}W_n(x)\,dx$. Then

\[ f_n(k) = \frac{1}{2\pi}\int\frac{\sin(\Delta_n(k - k'))}{(k - k')}f(k')\,dk', \]

so the left side of (6.16) is

\[ \Bigl|\frac{1}{2\pi}\int\frac{\sin\Delta_n(k - k')}{k - k'}[f(k) - f(k')]\,dk'\Bigr|, \]

which has the form

\[ \Bigl|\int g(y, k)\sin\Delta_n y\,dy\Bigr|, \]

where $g(y, k)$ is Schwartz space in $y$ with bounds (including bounds on derivatives) uniform in $k$. If we integrate by parts $12m$ times, we will get (6.16).

Proposition 5.1 extends with no change. In the region where $f(k) \ne 0$, the analysis earlier in this section shows that $\log R(x_{n(i)} + \Delta_{n(i)}) \to \infty$ for a.e. $k$ and a suitable subsequence $x_{n(i)}$, so we know the spectrum in $(0, \infty)\setminus S$ is purely singular continuous. On the other hand, if $g$ is $C^\infty$ supported in $S$, we claim that

\[ \sup_n \int g(k)R(k, x_n + \Delta_n)^4\,dk < \infty. \tag{6.17} \]

The proof is similar to that in the last section. In place of (5.22), we need to use

\[ \exp(Q_n) \le 1 + Q_n + \tfrac12 Q_n^2 + O(a_n^3). \]

As in this section, $Q_n^2$ has a term $a_n^2|\widetilde W_n(k)|^2/8k^2$ and oscillatory terms that we can integrate by parts. Noting that

\[ \sum_{m=1}^{n-1} a_n a_m\frac{x_m}{x_n} \le n^{-2}\sum_{m=1}^{n-1} n^{-1/2}m^{-1/2} \le Cn^{-2} \]

is still summable and that $\sum a_n^2|\widetilde W_n(k)|^2$ is summable by Lemma 6.4, we obtain (6.17).

7. Sparse Potentials: The Discrete Case

In this section, we will sketch the proof of Theorem 1.7. The proof follows closely that in the last two sections with (2.12) replacing (2.3/2.4). We will make use of (2.14), the Pastur-Figotin form of (2.12b). Fix $\alpha > 0$ and pick $k \in (\alpha, \pi - \alpha)$ and then $N_0$ so large that for all such $k$, $|\nu_k(n)| < \tfrac12$ for $n \ge N_0$. Equation (2.14) can then be effectively used to prove the analogs of (5.1/5.2), that is,

\[ \Bigl|\frac{\partial\theta}{\partial k}(x_n)\Bigr| \le Cx_n, \qquad \Bigl|\frac{\partial^2\theta}{\partial k^2}(x_n)\Bigr| \le Cx_n^2. \tag{7.1} \]

Equation (2.12c) can be rewritten

\[ \log R(n + 1) - \log R(n) = \tfrac12\log\bigl(1 + \nu_k(n)\sin(2\bar\theta) + \nu_k(n)^2\sin^2(\bar\theta)\bigr). \tag{7.2} \]

This implies the bound

\[ \log R(x_n) \le C\sum_{m=1}^{n} |a_m|. \tag{7.3} \]

Next notice that

\[ 1 + \alpha\sin(2\theta) + \alpha^2\sin^2(\theta) = \bigl(1 + \tfrac12\alpha\sin(2\theta)\bigr)^2 + \alpha^2\sin^4(\theta). \]

This provides a uniform bound on the argument of the $\log(\cdot)$ in (7.2), and so allows one to prove

\[ \Bigl|\frac{\partial\log R(x_n)}{\partial k}\Bigr| \le C\sum_{m=1}^{n} |a_m| x_m. \tag{7.4} \]

With these tools, the proof of assertion (1) of Theorem 1.7 is similar to that in Sect. 5, only a little simpler since (2.12c) implies

\[ R(n + 1)^4 \le R(n)^4\bigl(1 + 2\nu_k(n)\sin(2\bar\theta(n)) + Ca_n^2\bigr). \]

The same integration by parts used in Sects. 5 and 6 shows that

\[ \Bigl|\int g(k)R(n, k)^4\nu_k(n)\sin(2\bar\theta(n))\,dk\Bigr| \le C(b_n + c_n + \beta^{-n/2})\Bigl(1 + \int g(k)R(n, k)^4\,dk\Bigr) \]

with $b_n = \sum_{m=1}^{n-1} a_n a_m x_m/x_n$ and $c_n$ is like $b_n$ with $x_m^2/x_n^2$ replacing $x_m/x_n$. As in Sect. 5, this proves assertion (1) in Theorem 1.7.

To prove assertion (2), we must identify a strictly positive second-order term. We write


log(1 + α sin(2θ) + α2 sin2 (θ)) = α sin(2θ) + α2 (sin2 (θ) −

1 2

sin2 (2θ)) + O(α3 )

(7.5)

= α sin(2θ) + α cos(4θ) − α cos(2θ) + α + O(α ). 1 4

This lets us write

2

1 2

2

log R(n + 1) − log R(n) =

1 4

1 2

2

3

(7.6)

a2n + an Xn ,

and, as in Sect. 6, use the integration by parts machine to prove

\[ \Bigl(\int\Bigl|\sum_{n=1}^{N} a_n X_n\Bigr|^2 g(k)\,dk\Bigr)^{1/2}\Big/\sum_{n=1}^{N} a_n^2 \to 0 \]

and complete the proof as there. In this case, we don't need to worry about zeros of $\widetilde W(k)$ since the analog of $W$ here is $\delta_{n0}$ and so $\widetilde W(k) = 1$.

8. Random Decaying Potentials: The Discrete Case

In this section, we consider discrete situations where the $V(n)$ are independent random variables of zero mean and decaying variance. The results that imply a.c. spectrum require no regularity in $E(V(n)^2)$, while those for singular spectrum require some kind of regular decay, as we will explain. The results for a.c. spectrum are so general yet so simple to prove that they are a paradigm of the usefulness of the EFGP transform.

Theorem 8.1. Suppose $V_\omega(n)$ are independent random variables with $E(V_\omega(n)) = 0$ and

\[ \sum_n E(V_\omega(n)^2) + E(V_\omega(n)^4) < \infty. \tag{8.1} \]

Then for a.e. $\omega$, $h_\omega$ has purely a.c. spectrum on $(-2, 2)$.

Remarks. 1. For $E(V_\omega^2)^{1/2} \le Cn^{-\alpha}$ with $V$ bounded and $\alpha > \tfrac12$, we get a.c. spectrum recovering results of Delyon, et al. [7].

2. If the $V_\omega(n)$ are uniformly bounded, then $E(V_\omega(n)^4) \le CE(V_\omega(n)^2)$ and so (8.1) becomes $\sum_n E(V_\omega(n)^2) < \infty$; we state the general bound because unbounded $V$'s are so easy to accommodate.

3. The case $E(V_\omega(n)^2)^{1/2} = n^{-1/2}\log(n)^{-1}$ is of some interest. This sequence is $\ell^2$ so if $V$ is bounded, the theorem proves a.c. spectrum. Kotani-Ushiroya [21] cannot handle such borderline cases.

Proof. Fix $\theta_0$. Then $R_\omega(n)$ and $\theta_\omega(n)$ become random variables which are measurable functions of $\{V_\omega(j)\}_{j\le n-1}$ and so independent of $\{V_\omega(j)\}_{j\ge n}$. By (2.12c),

\[ R(n + 1)^4 = R(n)^4\Bigl[1 - \frac{2V_\omega(n)}{\sin k}\sin(2\bar\theta_\omega(n)) + O(V_\omega^2 + V_\omega^4)\Bigr]. \]

Since $V_\omega(n)$ is independent of $\bar\theta(n)$ and $R(n)$, we have

\[ E\bigl(R_\omega(n)^4 V_\omega(n)\sin(2\bar\theta_\omega(n))\bigr) = E(V_\omega(n))\,E\bigl(R_\omega^4(n)\sin(2\bar\theta_\omega(n))\bigr) = 0. \]

Using independence to bound $E(R(n)^4 V_\omega^j)$ by $E(R(n)^4)E(V_\omega^j)$, we see that

\[ E(R_\omega(n + 1)^4) \le [1 + CE(V_\omega^2(n) + V_\omega^4(n))]\,E(R_\omega^4(n)), \]

where $C$ is uniformly bounded for $k$ in any $(\alpha, \pi - \alpha)$ with $\alpha > 0$. It follows that

\[ \sup_n E\Bigl(\int_\alpha^{\pi-\alpha} R_\omega(n, k)^4\,dk\Bigr) < \infty. \]

By Fatou's lemma, for a.e. $\omega$,

\[ \liminf_{n\to\infty}\int_\alpha^{\pi-\alpha} R_\omega(n, k)^4\,dk < \infty, \]

and by Theorem 1.3, the spectrum is purely a.c. on $(-2\cos(\alpha), 2\cos(\alpha))$.

For the case where $\sum_{n=1}^{\infty} E(V(n)^2) = \infty$, we need some regularity of the fall-off. Rather than try to find complicated general conditions, we consider the case where $E(V(n)^2) \sim n^{-2\alpha}$ with $\alpha \le \tfrac12$. The same method can handle a case like $E(V(n)^2) = [n\log(n + 1)]^{-1}$ (which always has singular continuous spectrum of Hausdorff dimension 1) by the kind of arguments we will discuss in the case $\alpha = \tfrac12$; in this case for typical energies $\|T(0, n)\|$ grows like $\log(n)$. Explicitly, we suppose

(i) $E(V_\omega(n)^2)^{1/2} = \lambda n^{-\alpha}$, $0 < \alpha \le \tfrac12$; $\lambda > 0$,
(ii) $E(V_\omega(n)) = 0$,
(iii) For some $\varepsilon > 0$, $\sup_\omega |V_\omega(n)| \le Cn^{-(2\alpha/3)-\varepsilon}$,
(iv) $V_\omega(n)$ is independent of $\{V_\omega(j)\}_{j=1}^{n-1}$.

Remarks. 1. Think of the case discussed in [26, 7], where $V_\omega(n) = n^{-\alpha}X_n(\omega)$ with $X_n$ identically distributed bounded, independent random variables. If $E(X) = 0$ and $X$ is bounded, then (i)–(iv) hold.

2. With some extra effort, we could allow unbounded distributions, and only require that $\lim_{n\to\infty} n^{\alpha}E(V_\omega(n)^2)^{1/2}$ exists and be non-zero.

Theorem 8.2. Suppose (i)–(iv) hold. Fix $k$ in $(0, \pi)$ with $k \ne \frac{\pi}{4}, \frac{2\pi}{4}, \frac{3\pi}{4}$. Then for a.e. $\omega$,

\[ \lim_{n\to\infty}\frac{\log\|T_{2\cos(k)}(n, 0)\|}{\sum_{j=1}^{n} j^{-2\alpha}} = \frac{\lambda^2}{8\sin^2(k)}. \]

Remark. In case $\alpha < \tfrac12$, this says $\|T(n, 0)\| \sim \exp(Cn^{1-2\alpha})$ with $C = \frac{\lambda^2}{8(1-2\alpha)\sin^2(k)}$. If $\alpha = \tfrac12$, this says $\|T\| \sim n^C$ with $C = \frac{\lambda^2}{8\sin^2(k)}$.
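The growth rate in Theorem 8.2 can be observed in a rough simulation. The sketch below is an illustration under stated assumptions, not the paper's construction: it takes $V_\omega(n) = \lambda n^{-\alpha}X_n$ with $X_n$ uniform on $[-\sqrt3, \sqrt3]$ (bounded, mean zero, variance one), iterates $u(n+1) = (2\cos k - V(n))u(n) - u(n-1)$ for one fixed initial vector, and compares the accumulated log-norm with $\frac{\lambda^2}{8\sin^2 k}\sum_j j^{-2\alpha}$. The parameter values, seed, and function names are hypothetical, and a single solution is used as a stand-in for $\|T\|$.

```python
import math
import random

def log_growth(lam, alpha, k, n_steps, seed=0):
    """Accumulated log of ||(u(n), u(n+1))|| for u(n+1) = (E - V(n)) u(n) - u(n-1),
    E = 2 cos k, with V(n) = lam * n**(-alpha) * X_n and X_n uniform, variance 1."""
    rng = random.Random(seed)
    energy = 2.0 * math.cos(k)
    half_width = math.sqrt(3.0)          # uniform on [-sqrt(3), sqrt(3)] has variance 1
    u_prev, u_curr = 0.0, 1.0            # assumed initial condition u(0) = 0, u(1) = 1
    log_norm = 0.0
    for n in range(1, n_steps + 1):
        v = lam * n ** (-alpha) * rng.uniform(-half_width, half_width)
        u_prev, u_curr = u_curr, (energy - v) * u_curr - u_prev
        norm = math.hypot(u_prev, u_curr)
        log_norm += math.log(norm)       # renormalize each step to avoid overflow
        u_prev /= norm
        u_curr /= norm
    return log_norm

lam, alpha, k, n_steps = 1.0, 0.25, 1.0, 200000
observed = log_growth(lam, alpha, k, n_steps)
predicted = lam ** 2 / (8.0 * math.sin(k) ** 2) * sum(j ** (-2 * alpha) for j in range(1, n_steps + 1))
ratio = observed / predicted
print("observed / predicted =", ratio)
```

Convergence is slow because the fluctuations are only of order $(\sum_j j^{-2\alpha})^{1/2}$ relative to a main term of order $\sum_j j^{-2\alpha}$, so the ratio should be expected to be near 1 rather than equal to it.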

Proof. By Theorem 2.3, we need only prove this result with $R(n)$ replacing $T$ for each $\theta_0$. So fix $k$ and $\theta_0$, and let $\theta_\omega(n)$, $R_\omega(n)$ solve (2.12). By (2.12c),

\[ \log R(n + 1) - \log R(n) = \tfrac12\log\bigl(1 + \nu_k(n)\sin(2\bar\theta(n)) + \nu_k(n)^2\sin^2(\bar\theta(n))\bigr). \tag{8.2} \]

Since $\sup_\omega \nu_k(n) \to 0$ as $n \to \infty$, we can use

\[ \log(1 + x) = x - \frac{x^2}{2} + O(x^3). \tag{8.3} \]

We also use

\[ \sin^2\theta - \tfrac12\sin^2(2\theta) = \tfrac14 - \tfrac12\cos(2\theta) + \tfrac14\cos(4\theta). \]
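Both ingredients used here, the trigonometric identity and the second-order expansion of $\log(1 + x\sin(2\theta) + x^2\sin^2\theta)$ that follows from (8.3), can be checked numerically. The sketch below is only a sanity check; the grid of angles and the value $x = 10^{-2}$ are arbitrary choices.

```python
import math

def identity_lhs(theta):
    return math.sin(theta) ** 2 - 0.5 * math.sin(2 * theta) ** 2

def identity_rhs(theta):
    return 0.25 - 0.5 * math.cos(2 * theta) + 0.25 * math.cos(4 * theta)

def expansion_error(x, theta):
    """|log(1 + x sin 2t + x^2 sin^2 t) - [x sin 2t + x^2 (sin^2 t - (1/2) sin^2 2t)]|."""
    exact = math.log(1.0 + x * math.sin(2 * theta) + x ** 2 * math.sin(theta) ** 2)
    second_order = x * math.sin(2 * theta) + x ** 2 * identity_lhs(theta)
    return abs(exact - second_order)

thetas = [2 * math.pi * i / 1000 for i in range(1000)]
max_identity_error = max(abs(identity_lhs(t) - identity_rhs(t)) for t in thetas)

x = 1e-2
max_expansion_error = max(expansion_error(x, t) for t in thetas)
print("identity error:", max_identity_error)
print("expansion error / x^3:", max_expansion_error / x ** 3)
```

The second printed number staying of order one is what the $O(x^3)$ remainder in (8.3) predicts.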

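The net result is

\[ \log R(n) = \frac{1}{8\sin^2(k)}\sum_{j=1}^{n} E(V_\omega(j)^2) + C_1 + C_2 + C_3 + C_4, \]

where the corrections have the form

\[ C_1 = -\frac{1}{2\sin(k)}\sum_{j=1}^{n} V_\omega(j)\sin(2\bar\theta_\omega(j)), \]
\[ C_2 = \frac{1}{2\sin^2(k)}\sum_{j=1}^{n} [V_\omega(j)^2 - E(V_\omega(j)^2)]\bigl[\sin^2(\bar\theta_\omega(j)) - \tfrac12\sin^2(2\bar\theta_\omega(j))\bigr], \]
\[ C_3 = -\frac{1}{2\sin^2(k)}\sum_{j=1}^{n} E(V_\omega(j)^2)\bigl[\tfrac12\cos(2\bar\theta_\omega(j)) - \tfrac14\cos(4\bar\theta_\omega(j))\bigr], \]
\[ C_4 = \sum_{j=1}^{n} O(V_\omega(j)^3 + V_\omega(j)^4). \]

The theorem follows if we prove that for each $q = 1, 2, 3, 4$ and a.e. $\omega$,

\[ \lim_{n\to\infty}\frac{|C_q(\omega)|}{\sum_{j=1}^{n} j^{-2\alpha}} = 0. \tag{8.4} \]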

Equation (8.4) for q = 4 is an immediate consequence of hypothesis (iii). C1 , C2 clearly have zero expectation values and variances that decay properly for us to hope (8.4) holds; the key to the proof will be a Martingale inequality. C3 will depend on the fact that cos(θ) has zero average and the slow variation of E(Vω (n)2 ). We break the proof to present some needed lemmas. For the first two of these lemmas, let X0 , X1 , . . . , XN be independent random variables, where X0 can be vector valued. Suppose that for j = 1, . . . , N , Zj = Xj fj (X1 , . . . , Xj−1 ; X0 )

(8.5)

with fj a measurable function, and that E(Xj ) = 0.

(8.6)

The following is a variant of a standard Martingale inequality; we provide a proof for the reader's convenience:

Lemma 8.3.

\[ E\Bigl(\sup_{n=1,2,\ldots,N} |Z_1 + \cdots + Z_n| \ge r\Bigr) \le \frac{1}{r^2}E\Bigl(\sum_{j=1}^{N} Z_j^2\Bigr). \tag{8.7} \]

Proof. Define

\[ Y_n = \sum_{j=1}^{n} Z_j, \qquad Q_n = \sum_{j=n+1}^{N} Z_j \]

and let $A_j = \{\omega \mid |Y_1| \le r, |Y_2| \le r, \ldots, |Y_j| > r\}$. Then $\chi_n$, the characteristic function of $A_n$, is a function only of $X_0, X_1, \ldots, X_n$ and thus, if $k > n$,

\[ E(Z_k Y_n\chi_{A_n}) = E(X_k)\,E(f_k(X_1, \ldots, X_{k-1}, X_0)Y_n\chi_{A_n}) = 0. \]

Thus,

\[ E(\chi_n Y_n^2) \le E(\chi_n(Y_n + Q_n)^2), \]

since the cross term has zero expectation when we expand the square. Thus,

\[ r^2\sum_{j=1}^{N} E(\chi_j) \le \sum_{j=1}^{N} E(\chi_j Y_j^2) \le \sum_{j=1}^{N} E(\chi_j Y_N^2) \le E(Y_N^2), \]

which is (8.7).
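A Monte Carlo experiment gives a feel for Lemma 8.3. In the sketch below, which is illustrative only, $X_j = \pm j^{-1/4}$ with equal probability and $f_j$ depends on the sign of the previous partial sum, so that $Z_j = X_j f_j(X_1, \ldots, X_{j-1})$ has the form (8.5). The specific $f_j$, the threshold $r$, the trial count, and the seed are all assumptions made for the example.

```python
import random

def one_trial(n_steps, rng):
    """Return (max_n |Z_1 + ... + Z_n|, sum_j Z_j**2) for one sample path."""
    partial_sum = 0.0
    max_abs = 0.0
    energy = 0.0
    for j in range(1, n_steps + 1):
        x_j = rng.choice((-1.0, 1.0)) * j ** -0.25      # mean-zero X_j
        f_j = 1.0 if partial_sum >= 0.0 else 0.5        # depends only on X_1..X_{j-1}
        z_j = x_j * f_j
        partial_sum += z_j
        energy += z_j * z_j
        max_abs = max(max_abs, abs(partial_sum))
    return max_abs, energy

rng = random.Random(12345)
n_steps, n_trials, r = 50, 5000, 4.0
hits = 0
total_energy = 0.0
for _ in range(n_trials):
    max_abs, energy = one_trial(n_steps, rng)
    hits += max_abs >= r
    total_energy += energy

empirical_prob = hits / n_trials                        # estimates the left side of (8.7)
empirical_bound = total_energy / n_trials / r ** 2      # estimates the right side of (8.7)
print("P(sup >= r) ~", empirical_prob, " bound ~", empirical_bound)
```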

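Lemma 8.4. Suppose $E(Z_n^2) \le Cn^{-2\alpha}$. Then for a.e. $\omega$:

(1) If $\alpha < \tfrac12$ and $\beta > \tfrac12(1 - 2\alpha)$, then

\[ \lim_{n\to\infty}\Bigl|\sum_{j=1}^{n} Z_j\Bigr| n^{-\beta} = 0. \]

(2) If $\alpha = \tfrac12$ and $\beta > \tfrac12$, then

\[ \lim_{n\to\infty}\Bigl|\sum_{j=1}^{n} Z_j\Bigr| (\log n)^{-\beta} = 0. \]

(3) If $\alpha > \tfrac12$,

\[ \lim_{n\to\infty}\sum_{j=1}^{n} Z_j = Y_\infty \]

exists, and for any $\beta < \alpha - \tfrac12$,

\[ \lim_{n\to\infty} n^{\beta}\Bigl|\sum_{j=n}^{\infty} Z_j\Bigr| = 0. \]

Remark. Naively, fluctuations should behave as $(\sum_{j=1}^{n} j^{-2\alpha})^{1/2}$. This lemma shows they are not much worse. Since we only need that they are small compared to $\sum_{j=1}^{n} j^{-2\alpha}$, the lemma suffices.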

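Proof. (1) Pick $\beta_1$ so $\beta > \beta_1 > \tfrac12(1 - 2\alpha)$. By Lemma 8.3,

\[ E\Bigl(\sup_{j=1,\ldots,2^{n-1}}\Bigl|\sum_{k=2^{n-1}+1}^{2^{n-1}+j} Z_k\Bigr| \ge 2^{n\beta_1}\Bigr) \le C\,2^{-2n\beta_1}2^{(n-1)}2^{-2(n-1)\alpha} \le C\,2^{-(n-1)(2\alpha+2\beta_1-1)} \tag{8.8} \]

is summable in $n$ by the choice of $\beta_1$. Therefore, by the Borel-Cantelli lemma, for a.e. $\omega$, there is an $n_0(\omega)$ so that the sup inside (8.8) is less than $2^{n\beta_1}$ if $n \ge n_0$. Let $j$ be larger than $2^{n_0-1}$ and pick $n$ so that $2^{n-1} + 1 \le j \le 2^n$. Then

\[ |Z_1 + \cdots + Z_j| \le |Z_1 + \cdots + Z_{2^{n_0-1}}| + \sum_{k=1}^{n} 2^{k\beta_1} \]
\[ \le |Z_1 + \cdots + Z_{2^{n_0-1}}| + \frac{2^{(n+1)\beta_1}}{2^{\beta_1} - 1} \]
\[ \le |Z_1 + \cdots + Z_{2^{n_0-1}}| + \frac{2^{2\beta_1}}{2^{\beta_1} - 1}\,j^{\beta_1}. \]

Thus, $\limsup j^{-\beta_1}|Z_1 + \cdots + Z_j| < \infty$. Since $\beta > \beta_1$, the limit for $\beta$ is 0.

(2) Pick $\beta_1$ with $\beta > \beta_1 > \tfrac12$ and define

\[ K_n = \Bigl\{\omega \Bigm| \sup_{j=1,\ldots,2^n}\Bigl|\sum_{m=1}^{j} Z_m\Bigr| \ge n^{\beta_1}\Bigr\}. \]

Then by Lemma 8.3,

\[ E(K_n) \le Cn^{-2\beta_1}\sum_{j=1}^{2^n}\frac{1}{j} \le Cn^{-2\beta_1}(1 + n\log 2) \le Cn^{1-2\beta_1}, \]

since $\sum_{j=1}^{k}\frac{1}{j} \le 1 + \log k$. Pick an integer $m$ so $m(2\beta_1 - 1) > 1$. Then

\[ \sum_{n=1}^{\infty} E(K_{n^m}) < \infty. \]

So by the Borel-Cantelli lemma, for a.e. $\omega$, there is $n_0(\omega)$, so if $n \ge n_0$, then $\omega \notin K_{n^m}$. If $j > 2^{n_0^m}$, pick $n$ so that $2^{(n-1)^m} < j \le 2^{n^m}$. Then

\[ |Z_1 + \cdots + Z_j| \le (n^m)^{\beta_1} \le 2^{m\beta_1}(n - 1)^{m\beta_1} \le 2^{m\beta_1}(\log 2)^{-\beta_1}(\log j)^{\beta_1}. \]

(3) Pick $\beta_1$ so $\beta < \beta_1 < \alpha - \tfrac12$. Then

\[ E\Bigl(\sup_{j=1,\ldots,2^{n-1}}\Bigl|\sum_{k=2^{n-1}+1}^{2^{n-1}+j} Z_k\Bigr| \ge 2^{-n\beta_1}\Bigr) \le C\,2^{2n\beta_1}2^{n-1}2^{-2(n-1)\alpha} \le C\,2^{2\beta_1}2^{-2(n-1)[\alpha-1/2-\beta_1]} \]

is summable. Thus, for a.e. $\omega$, there is an $n_0(\omega)$ so that for $n \ge n_0(\omega)$, the sup is bounded by $2^{-n\beta_1}$. Thus, if $j_1 \ge j_2 \ge 2^{n_2-1} \ge 2^{n_0-1}$,

\[ \Bigl|\sum_{k=j_2}^{j_1} Z_k\Bigr| \le \sum_{n=n_2}^{\infty} 2^{-n\beta_1} \to 0 \]

as $n_2 \to \infty$. So the sum is convergent (i.e., the partial sums are Cauchy). Moreover, if $j \ge 2^{n_0-1}$ and $n$ is picked so $2^{n-1} \le j \le 2^n$, then

\[ \Bigl|\sum_{k=j}^{\infty} Z_k\Bigr| \le \sum_{m=n}^{\infty} 2^{-m\beta_1} = \frac{2^{-n\beta_1}}{1 - 2^{-\beta_1}} \le \frac{j^{-\beta_1}}{1 - 2^{-\beta_1}}, \]

and thus, if we multiply by $j^{\beta}$, the limit is 0.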

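Lemma 8.5. Suppose that $k \in \mathbb{R}$ is not in $\mathbb{Z}\pi$. Then there exist integers $q_\ell \to \infty$ so that for any $\theta_0, \ldots, \theta_{q_\ell}$,

\[ \Bigl|\sum_{j=1}^{q_\ell}\cos(\theta_j)\Bigr| \le 1 + \sum_{j=1}^{q_\ell} |\theta_j - \theta_0 - kj|. \]

Remark. In essence, we show $|\sum_{j=1}^{q}\cos(\theta_0 + kj)| \le 1$, a stronger result than the ergodic theory result that $|\frac{1}{q}\sum_{j=1}^{q}\cos(\theta_0 + kj)| \to 0$. The weaker ergodic theory result suffices for our application, but the proof of this lemma is easy so we give it.

Proof. By general number theory considerations [14], we can find $p_\ell, q_\ell$ so that

\[ \Bigl|k - \frac{\pi p_\ell}{q_\ell}\Bigr| \le \frac{1}{q_\ell^2} \tag{8.8} \]

and $p_\ell/q_\ell \notin \mathbb{Z}$ if $k \notin \mathbb{Z}\pi$. For any $p/q \notin \mathbb{Z}$ and any $\theta_0$,

\[ \sum_{j=1}^{q}\cos\Bigl(\theta_0 + \frac{jp\pi}{q}\Bigr) = 0. \]

Thus

\[ \Bigl|\sum_{j=1}^{q_\ell}\cos(\theta_j)\Bigr| = \Bigl|\sum_{j=1}^{q_\ell}\Bigl[\cos(\theta_j) - \cos\Bigl(\theta_0 + \frac{jp_\ell\pi}{q_\ell}\Bigr)\Bigr]\Bigr| \]
\[ \le \sum_{j=1}^{q_\ell}\Bigl|\theta_j - \theta_0 - \frac{jp_\ell\pi}{q_\ell}\Bigr| \]
\[ \le \sum_{j=1}^{q_\ell} |\theta_j - \theta_0 - kj| + \sum_{j=1}^{q_\ell} j\Bigl|k - \frac{\pi p_\ell}{q_\ell}\Bigr| \]
\[ \le \frac{q_\ell(q_\ell + 1)}{2q_\ell^2} + \sum_{j=1}^{q_\ell} |\theta_j - \theta_0 - kj|. \tag{8.9} \]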


Conclusion of the Proof of Theorem 8.2. We need to verify (8.4) for $q = 1, 2, 3$. $V_\omega(n)\sin(2\bar\theta_\omega(n)) \equiv Z_n$ has the form (8.5) and $E(Z_n^2) \le Cn^{-2\alpha}$, so by Lemma 8.4, for a.e. $\omega$,

\[ |C_1(\omega)| = o\Bigl(\sum_{j=1}^{n} j^{-2\alpha}\Bigr). \]

$[V_\omega(n)^2 - E(V_\omega(n)^2)][\sin^2(\bar\theta) - \tfrac12\sin^2(2\bar\theta)]$ also has the form of (8.5) since $E(V_\omega^2 - E(V_\omega^2)) = 0$. Since $V$ is bounded,

\[ E((V^2 - E(V^2))^2) \le CE(V^2). \]

Thus, for a.e. $\omega$,

\[ |C_2(\omega)| = o\Bigl(\sum_{j=1}^{n} j^{-2\alpha}\Bigr) \]

also. Finally, we will show

\[ \sum_{j=1}^{n} j^{-2\alpha}\cos(4\bar\theta_\omega(j)) = o\Bigl(\sum_{j=1}^{n} j^{-2\alpha}\Bigr), \]

which proves (8.4) for $q = 3$. By hypothesis on $k$, $4k \notin \mathbb{Z}\pi$ so Lemma 8.5 applies. Let $q_\ell$ be as in that lemma. Note next that by hypothesis (iii) and Proposition 2.4 for $j$ large,

\[ |\theta_\omega(j + 1) - \theta_\omega(j) - k| \le C_0 j^{-2\alpha/3}. \tag{8.10} \]

Pick $n_0$ so

\[ n_0 \ge q_\ell^2 \tag{8.11} \]

and

\[ 4C_0 n_0^{-2\alpha/3} \le q_\ell^{-2}. \tag{8.12} \]

Suppose $N = n_0 + Kq_\ell$. Then

\[ \sum_{j=n_0+1}^{N} j^{-2\alpha}\cos(4\theta_\omega(j)) = \sum_{m=0}^{K-1}\sum_{j=1}^{q_\ell}(n_0 + mq_\ell + j)^{-2\alpha}\cos(4\theta_\omega(n_0 + mq_\ell + j)) = A_1 + A_2, \]

where $A_1$ is what we get by replacing $(n_0 + kq_\ell + j)^{-2\alpha}$ by $(n_0 + kq_\ell)^{-2\alpha}$ and $A_2$ is the difference. By Lemma 8.5, (8.10), and (8.12),

\[ A_1 \le \sum_{k=0}^{K}(n_0 + kq_\ell)^{-2\alpha}[1 + 1], \]

while using $|(n_0 + kq_\ell + j)^{-2\alpha} - (n_0 + kq_\ell)^{-2\alpha}| \le (n_0 + kq_\ell)^{-2\alpha}jn_0^{-1}$ and (8.11),


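\[ A_2 \le \sum_{k=0}^{K}(n_0 + kq_\ell)^{-2\alpha}q_\ell^2 n_0^{-1} \le \sum_{k=0}^{K}(n_0 + kq_\ell)^{-2\alpha}. \]

Thus for any $N$,

\[ \Bigl|\sum_{j=1}^{N} j^{-2\alpha}\cos(4\theta_\omega(j))\Bigr| \le C_\ell + 3q_\ell^{-\alpha}\sum_{j=1}^{N} j^{-2\alpha}, \]

and so

\[ \limsup_{N\to\infty}\Bigl(\sum_{j=1}^{N} j^{-2\alpha}\Bigr)^{-1}\Bigl|\sum_{j=1}^{N} j^{-2\alpha}\cos(4\theta_\omega(j))\Bigr| \le 3q_\ell^{-\alpha} \]

uniformly in $\omega$. Since we can take $q_\ell \to \infty$ by Lemma 8.5, the lim is 0.

Theorem 8.6. Suppose that (i)–(iv) hold with $\alpha < \tfrac12$ but we consider $V(1)$ as a continuous parameter. Then for a.e. $\omega$: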

(1) For a dense $G_\delta$ of values of $V(1)$, $H_\omega$ has purely singular continuous spectrum in $(-2, 2)$.

(2) For Lebesgue a.e. value of $V(1)$, $H_\omega$ has dense pure point spectrum in $(-2, 2)$ and the eigenfunctions obey $H_\omega u = 2\cos(k_m)u$ with

\[ \lim_{n\to\infty}\frac{\log(|u(n)^2 + u(n + 1)^2|^{1/2})}{|n|^{1-2\alpha}} = -\frac{\lambda^2}{8(1 - 2\alpha)\sin^2(k_m)}. \tag{8.13} \]

If we consider a whole-line problem with independent $V_\omega(n)$, where both $\{V_\omega(n)\}_{n=1}^{\infty}$ and $\tilde V_\omega(n) \equiv V_\omega(-n)$, $n = 1, 2, \ldots$ obey hypotheses (i)–(iv) and $V_\omega(0)$ has a purely a.c. density, then for a.e. $\omega$, $H_\omega$ has dense pure point spectrum in $(-2, 2)$ and (8.13) holds as $|n| \to \infty$.

Remark. This strengthens the result originally proven in [31] and improved in [7] in two ways. First, we get the explicit constant in (8.13). Second, we only require one $V_\omega(\cdot)$ to have an a.c. distribution.

Proof. By Theorem 8.2 and Fubini's theorem for a.e. $\omega$, we have for a.e. $k \in (0, \pi)$,

\[ \lim_{n\to\infty}\frac{\log\|T(n)\|}{n^{1-2\alpha}} = \frac{\lambda^2}{8(1 - 2\alpha)\sin^2(k)}. \]

Thus by Theorem 8.3 of [22], there is an $L^2$-solution obeying (8.13). The theorem follows from general principles on rank one perturbations [12, 4, 5, 28].

The case $\alpha = \tfrac12$ has an extra subtlety we will need to deal with, using an argument modeled on Kotani-Ushiroya [21]. The following replaces an explicit but complex formula they use for the projection onto a decaying solution (and fills in a gap in their argument):

Lemma 8.7. Let $u_\theta = (\cos\theta, \sin\theta)$ in $\mathbb{R}^2$. For any unimodular matrix $A$ with $\|A\| > 1$, let $\theta(A)$ be the unique $\theta \in (-\frac{\pi}{2}, \frac{\pi}{2}]$ with $\|Au_\theta\| = \|A\|^{-1}$. Define $\rho(A) = \|Au_0\|/\|Au_{\pi/2}\|$. Let $A_n$ be a sequence of unimodular matrices with $\|A_n\| \to \infty$ and $\|A_{n+1}A_n^{-1}\|/\|A_n\|\|A_{n+1}\| \to 0$ as $n \to \infty$. Let $\rho_n = \rho(A_n)$, $\theta_n = \theta(A_n)$. Then:

(i) $\theta_n$ has a limit $\theta_\infty$ if and only if $\lim_{n\to\infty}\rho_n \equiv \rho_\infty$ exists ($\rho_\infty = \infty$ is allowed, but then we only have $|\theta_n| \to \frac{\pi}{2}$).

(ii) Suppose $\theta_n$ has a limit $\theta_\infty \ne 0, \frac{\pi}{2}$ (equivalently, $\rho_\infty \ne 0, \infty$). Then

\[ \lim_{n\to\infty}\frac{\log\|A_n u_\infty\|}{\log\|A_n\|} = -1 \tag{8.14} \]

if and only if

\[ \limsup_{n\to\infty}\frac{\log|\rho_n - \rho_\infty|}{\log\|A_n\|} \le -2. \tag{8.15} \]
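Remark. Consider

\[ A_n = \begin{pmatrix} \cosh(n) & (-1)^n\sinh(n) \\ (-1)^n\sinh(n) & \cosh(n) \end{pmatrix}. \]

Then $\rho(A_n) \equiv 1$ and $\|A_n\| \to \infty$ but $\theta_n = (-1)^{n+1}(\frac{\pi}{4})$ does not have a limit. This shows that the condition $\|A_{n+1}A_n^{-1}\|/\|A_n\|\|A_{n+1}\| \to 0$ is required. Indeed, in this case that limit is 1. Kotani-Ushiroya miss this issue.

Proof. (i) Note first that

\[ \|A_n u_\theta\|^2 = \|A_n\|^2\sin^2(\theta - \theta_n) + \|A_n\|^{-2}\cos^2(\theta - \theta_n). \tag{8.16} \]

Thus,

\[ \rho_n = \frac{\tan^2(\theta_n) + \|A_n\|^{-4}}{1 + \|A_n\|^{-4}\tan^2(\theta_n)}. \tag{8.17} \]

It follows that $\rho_n$ has a finite limit $\rho_\infty$ if $\tan^2(\theta_n)$ has a finite limit. By writing

\[ \rho_n^{-1} = \frac{\cot^2(\theta_n) + \|A_n\|^{-4}}{1 + \|A_n\|^{-4}\cot^2(\theta_n)}, \]

this is true also for $\rho_n \to \infty$ and $\tan^2(\theta_n) \to \infty$. Pick $\eta \in [0, \frac{\pi}{2}]$ so $\tan^2(\theta_n) \to \tan^2(\eta)$. If $\eta = 0$, then $\theta_n \to 0$, and if $\eta = \frac{\pi}{2}$, then $|\theta_n| \to \frac{\pi}{2}$ because of the continuity of $\tan(\theta)$ on $[-\frac{\pi}{2}, \frac{\pi}{2}]$. If $0 < \eta < \frac{\pi}{2}$, we only have $|\theta_n| \to \eta$ and have to worry about the sign (see the remark above). In (8.16), take $\theta = \theta_{n+1}$ and see that

\[ \sin^2(\theta_{n+1} - \theta_n) \le \|A_n\|^{-2}\|A_n A_{n+1}^{-1}\|^2\|A_{n+1}u_{\theta_{n+1}}\|^2 = \|A_n\|^{-2}\|A_{n+1}\|^{-2}\|A_{n+1}A_n^{-1}\|^2 \]

since $A_n A_{n+1}^{-1}$ is unimodular, and thus $\|A_n A_{n+1}^{-1}\| = \|A_{n+1}A_n^{-1}\|$. Thus by hypothesis,

\[ \sin^2(\theta_{n+1} - \theta_n) \to 0. \]

This, together with $|\theta_n| \to \eta \in (0, \frac{\pi}{2})$, implies that $\theta_n$ has a limit.

(ii) By (8.16), we have that (8.14) holds if and only if

\[ \limsup_{n\to\infty}\frac{\log|\theta_n - \theta_\infty|}{\log\|A_n\|} \le -2. \tag{8.18} \]

Since $\theta_\infty \ne 0, \frac{\pi}{2}$, this is true if and only if

\[ \limsup_{n\to\infty}\frac{\log|\tan^2(\theta_n) - \tan^2(\theta_\infty)|}{\log\|A_n\|} \le -2. \]

By (8.17) and $\theta_\infty \ne \frac{\pi}{2}$,

\[ \bigl||\tan^2(\theta_n) - \tan^2(\theta_\infty)| - |\rho_n - \rho_\infty|\bigr| \le C\|A_n\|^{-4}. \]

Thus, (8.18) holds if and only if (8.15) holds.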

Lemma 8.8. Suppose the hypotheses of Theorem 8.2 hold with α = 21 and k 6= is fixed. Then for a.e. ω, there exists an initial condition uθ(ω) so that

π 2π 3π 4, 4 , 4

λ2 log kT2 cos(k) (n, 0)uθ(ω) k =− . n→∞ log(n) 8 sin2 (k) lim

Remark. As noted in [22] (and gotten incorrectly in [21]), Ruelle’s deterministic argument doesn’t ever suffice in this kT k ∼ nγ case. If An is a sequence of unimodular matrices with limn→∞ log kAn k/ log(n) = γ, then [22] has explicit examples (even coming from deterministic Schrödinger operators) for each γ > 21 where the decaying solution only obeys limn→∞ log kAn u∞ k/ log(n) = −γ + 1. It also appears one needs γ > 23 to be sure of the existence of decaying solutions. But following [21], the probabilistic argument here can replace Ruelle’s argument. 2

Proof. Let $\beta = \frac{\lambda^2}{8\sin^2(k)}$. Let $R_1(n)$ and $R_2(n)$ be the $R$'s associated to $\theta = 0$ and $\theta = \frac{\pi}{2}$. By the proof of Theorem 8.2, for a.e. $\omega$,

$$\lim_{n\to\infty} \frac{\log\|R_i(n)\|}{\log(n)} = \beta. \tag{8.19}$$

Let $\theta_i(n)$ be the corresponding EFGP angles. By (2.7),

$$R_1(n)R_2(n)\sin(\theta_1(n) - \theta_2(n)) = \sin(k)\,[u_1(n)u_2(n-1) - u_1(n-1)u_2(n)] = -1$$

(by the initial conditions $R_1(1) = R_2(1) = 1$, $\theta_1(1) = 0$, $\theta_2(1) = \frac{\pi}{2}$) and constancy of the Wronskian. Thus by (8.19), for a.e. $\omega$,

$$\lim_{n\to\infty} \frac{\log|\theta_1(n) - \theta_2(n)|}{\log(n)} = -2\beta.$$

Let $\rho_n = \frac{R_1(n)}{R_2(n)}$. Then by (2.12c),

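$$L_\omega(n) \equiv [\log\rho(n+1) - \log\rho(n)] = \log(1 + A_1(n)) - \log(1 + A_2(n)),$$

where

$$A_i(n) = -\frac{V_\omega(n)}{\sin(k)}\sin(2\theta_{i,\omega}(n)) + \frac{V_\omega(n)^2}{\sin^2(k)}\sin^2(\theta_{i,\omega}(n)).$$

Define

$$F(a, \theta) = \log(1 - a\sin(2\theta) + a^2\sin^2(\theta)).$$

By a finite Taylor expansion,

$$F(a, \theta) = \sum_{j=1}^{J-1} a^j P_j(\theta) + O(a^J) \tag{8.20}$$

with $P_1(\theta) = -\sin(2\theta)$ and the $P$'s, $C^\infty$ in $\theta$. Fix $\epsilon > 0$ so for $n$ large, use (8.20) to see that $|\theta_1 - \theta_2| = o(n^{-2\beta+\epsilon})$. Choosing $J$ so $n^{-J/3} = o(n^{-2\beta-1})$, we see that

$$L_\omega(n) = -\frac{V_\omega(n)}{\sin(k)}\,[\sin(2\theta_1(n)) - \sin(2\theta_2(n))] + O(n^{-2\beta-1+\epsilon}).$$

Since $\theta_j(n)$ depend only on $\{V_\omega(k)\}_{k\le n-1}$, we can apply part (3) of Lemma 8.4 (with $2\alpha = 1 + 2\beta - \epsilon$) to see that for a.e. $\omega$,

$$\lim_{N\to\infty} \sum_{n=1}^{N} L_\omega(n) \tag{8.21}$$

exists and

$$\Bigl|\sum_{n=N}^{\infty} L_\omega(n)\Bigr| \le C_\omega N^{-2\beta+\epsilon}. \tag{8.22}$$

By (8.21), $\lim \frac{R_1(n)}{R_2(n)} \equiv \rho_\infty$ exists and is different from $0$ and $\infty$. Moreover, by (8.22),

$$\lim \frac{\log|\rho(n) - \rho(\infty)|}{\log(n)} \le -2\beta.$$

Lemma 8.7 completes the proof.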

Theorem 8.9. Suppose (i)–(iv) hold with $\alpha = \frac{1}{2}$. Then,

(1) For a.e. $\omega$, the essential spectrum of $H_\omega$ is $[-2, 2]$ and the absolutely continuous spectrum of $H_\omega$ is empty.

(2) If $|\lambda| \ge 2$ and $V_\omega(1)$ has an absolutely continuous distribution, then for a.e. $\omega$, $H_\omega$ has dense point spectrum and only dense point spectrum in $(-2, 2)$.

(3) If $|\lambda| < 2$ and $V_\omega(1)$ has an absolutely continuous distribution, then for a.e. $\omega$, $H_\omega$ has purely singular continuous spectrum in $\{E \mid |E| < (4 - \lambda^2)^{1/2}\}$ and only dense pure point spectrum in $\{E \mid (4 - \lambda^2)^{1/2} < |E| < 2\}$.

In either case (2) or (3), in the region of point spectrum, there are almost surely eigenvectors of power decay $n^{-\beta}$ with

$$\beta = \frac{\lambda^2}{8 - 2E^2}. \tag{8.23}$$
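As a numerical illustration of how (8.23) separates the two regimes in Theorem 8.9, the following sketch evaluates $\beta(E, \lambda)$ and compares it with the threshold $\frac{1}{2}$, which is where the $n^{-\beta}$ solution stops being square summable. The function names are hypothetical and chosen only for this illustration; the code merely restates the formulas of the theorem and is not part of the proof.

```python
import math


def decay_exponent(energy, lam):
    """beta(E, lambda) = lambda^2 / (8 - 2 E^2), as in (8.23); used for |E| < 2."""
    if abs(energy) >= 2:
        raise ValueError("(8.23) is only used for |E| < 2")
    return lam ** 2 / (8.0 - 2.0 * energy ** 2)


def threshold_energy(lam):
    """|E| at which beta = 1/2, namely (4 - lambda^2)^(1/2); needs |lambda| < 2."""
    if abs(lam) >= 2:
        raise ValueError("the threshold only exists for |lambda| < 2")
    return math.sqrt(4.0 - lam ** 2)


def spectral_regime(energy, lam):
    """Classify E by comparing beta with 1/2 (n^-beta is in l^2 iff beta > 1/2)."""
    beta = decay_exponent(energy, lam)
    if beta > 0.5:
        return "point"
    if beta < 0.5:
        return "singular continuous"
    return "borderline"


if __name__ == "__main__":
    lam = 1.0
    print("threshold |E| =", threshold_energy(lam))
    for energy in (0.0, 1.0, 1.9):
        print(energy, decay_exponent(energy, lam), spectral_regime(energy, lam))
```

For $|\lambda| \ge 2$ the exponent exceeds $\frac{1}{2}$ at every $|E| < 2$, which matches case (2); for $|\lambda| < 2$ the switch happens at $|E| = (4 - \lambda^2)^{1/2}$, matching case (3).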

Remark. This theorem extends results of Delyon, et al. [7], Delyon [6], and Kotani-Ushiroya [21]. In particular, [7] conjectured that there is a region of point spectrum near $E = \pm 2$ no matter how small $\lambda$ is.

Proof. By Theorem 8.2, $\lim_{n\to\infty} \|T_\omega(0, n)\| = \infty$ for a.e. $E$ for a.e. $\omega$, so by Theorem 1.1, we conclude (1). By Lemma 8.8, for a.e. pairs $(\omega, E)$, there is a unique decaying solution with rate of decay $n^{-\beta}$ with $\beta = \frac{\lambda^2}{8 - 2E^2}$. If $\beta > \frac{1}{2}$, this is $\ell^2$ and we have potential point spectrum. If $\beta < \frac{1}{2}$, there is no $\ell^2$ solution. The general theory of rank one perturbations ([32, 5]) then yields (2) and (3).

We can compute the precise Hausdorff dimension of the singular continuous spectral measures in this case:



Theorem 8.10. Fix $\lambda < 2$ and a model obeying (i)–(iv) with $\alpha = \frac{1}{2}$. In the region $|E| \le (4 - \lambda^2)^{1/2}$, define

$$d(E, \lambda) = \frac{4 - E^2 - \lambda^2}{4 - E^2}.$$

Suppose $V_\omega(1)$ has an absolutely continuous density. Then for a.e. $\omega$, the spectral measure, $\mu$, has dimension $d(E, \lambda)$ at $E$ in the sense that for any $\epsilon$, there is a $\delta$ so that $\mu(A) = 0$ if $A$ is a subset of $(E - \delta, E + \delta)$ of Hausdorff dimension less than $(d - \epsilon)$, and there is a subset $B$ of Hausdorff dimension less than $(d + \epsilon)$, so $\mu((E - \delta, E + \delta) \setminus B) = 0$.

Proof. Let $\|u\|_L = \bigl(\sum_{j=1}^{L} u(j)^2\bigr)^{1/2}$. By the general theory of rank one perturbations, Theorem 8.2, Lemma 8.8, and the assumption on $V_\omega(1)$, for a.e. $\omega$, $\mu$ is supported on the set of energies where most solutions grow as $n^{\beta}$ and one decays as $n^{-\beta}$, where $\beta(E, \lambda)$ is given by (8.23). The hypothesis for singular spectrum is precisely $\beta < \frac{1}{2}$. Since $\beta < \frac{1}{2}$, $\|u_1\|_L \sim L^{-\beta}L^{1/2}$ while $\|u_2\|_L \sim L^{\beta}L^{1/2}$, where $a \sim b$ is shorthand for $\lim \frac{\log(a)}{\log(b)} = 1$. The Jitomirskaya-Last version [16, 17] of the Gilbert-Pearson [11] theory says that the Borel transform of the spectral measure is supported on the set of $E$'s, where

$$|m(E + i\epsilon)| \sim \frac{\|u_2\|_L}{\|u_1\|_L}, \tag{8.26}$$

and $L$ and $\epsilon$ are related by

$$\|u_1\|_L\,\|u_2\|_L = \frac{1}{2\epsilon} \tag{8.27}$$

(the $\sim$ in (8.26) holds in the strong sense that the ratio lies in the interval $(5 - \sqrt{24}, 5 + \sqrt{24})$). Thus, $\epsilon \sim L^{-1}$ and (8.26) says that $|m(E + i\epsilon)| \sim \epsilon^{-2\beta}$. Since $\beta$ is continuous, the theory in [3] then says that the local dimension is given by $1 - 2\beta$ as claimed.

9. Random Decaying Potentials: The Continuum Case

Having done the discrete random case, we will only sketch the continuum case. We will specialize to a situation where {V (x)}n≤x λk + λs−k ≥ b. We can now apply Lemma 1.3 and we obtain $V_{s-k} \subset V_k^{\angle}$. Since the dimensions of these subspaces are equal we must have $V_{s-k} = V_k^{\angle}$.

2. Conformally Symplectic Manifolds and Conformal Hamiltonian Flows

Let $M$ be a smooth manifold of even dimension. A conformally symplectic structure on $M$ is a differentiable 2-form $\Theta$ which is non-degenerate and has the following basic property:

$$d\Theta = \gamma \wedge \Theta, \tag{2.1}$$

for some closed 1-form $\gamma$. A manifold with such a form $\Theta$ is called conformally symplectic. The origin of this name becomes clear when one observes that locally $\gamma = dU$ for some smooth function $U$ and $d(e^{-U}\Theta) = 0$, i.e. $e^{-U}\Theta$ defines a bona fide symplectic structure. For a given function $H : M \to \mathbb{R}$, called a Hamiltonian, let us consider a vector field $\nabla_\Theta H$ defined by the usual relation

$$\Theta(\cdot, \nabla_\Theta H) = dH. \tag{2.2}$$

We will call it the conformally Hamiltonian vector field, or conformally symplectic, or simply a Hamiltonian vector field when the conformally symplectic structure is clearly


M. P. Wojtkowski, C. Liverani

chosen. Note that our definition does not coincide with the definition of a Hamiltonian vector field from [V]. Let $\Phi^t$ denote the flow defined by the vector field $F = \nabla_\Theta H$. The Hamiltonian function $H$ is a first integral of the system. Indeed we have

$$\frac{d}{dt}H = dH(\nabla_\Theta H) = \Theta(\nabla_\Theta H, \nabla_\Theta H) = 0.$$

Let us consider the Lie derivative of the form $\Theta$ in the direction of vector field $F$, i.e.,

$$(L_F\Theta)(\xi, \eta) := \frac{d}{du}\,\Theta(D\Phi^u\xi, D\Phi^u\eta)\Big|_{u=0}.$$

Theorem 2.1. For a Hamiltonian vector field $F = \nabla_\Theta H$ we have

$$L_F\Theta = \gamma(F)\Theta + \gamma \wedge dH. \tag{2.3}$$

Proof. We will use the Cartan formula ([A-M-R])

$$L_F = i_F d + d\,i_F,$$

where $i_F$ is the interior and $d$ the exterior derivative. (For a differential $m$-form $\zeta$ the interior derivative $i_F\zeta$ is the differential $(m-1)$-form obtained by substituting $F$ as the first vector argument of $\zeta$.) We have $i_F\Theta = -dH$ and we get immediately

$$L_F\Theta = i_F\,d\Theta - d^2H = i_F(\gamma \wedge \Theta) = \gamma(F)\Theta + \gamma \wedge dH.$$

Let us restrict the flow $\Phi^t$ to one smooth level set of the Hamiltonian, $M^c = \{z \in M \mid H(z) = c\}$. In particular we assume that on $M^c$ the differential $dH$ and the vector field $F$ do not vanish. For two vectors $\xi, \eta$ from the tangent space $T_zM^c$, we introduce

$$w(t) = \Theta(D_z\Phi^t\xi, D_z\Phi^t\eta).$$

By (2.3) we get

$$\frac{d}{dt}w(t) = \gamma(F(\Phi^t z))\,w(t),$$

since $dH$ vanishes on the tangent space $T_zM^c$. We conclude that

$$\Theta(D_z\Phi^t\xi, D_z\Phi^t\eta) = \beta(t)\,\Theta(\xi, \eta), \tag{2.4}$$

for every $\xi, \eta$ from $T_zM^c$ and

$$\beta(t) = \exp\Bigl(\int_0^t \gamma(F(\Phi^u z))\,du\Bigr). \tag{2.5}$$

Conformally Symplectic Dynamics


Remark 2.1. Let us note that under a non-degenerate time change a conformally Hamiltonian vector field is still conformally Hamiltonian with the same Hamiltonian function but with respect to a modified conformally symplectic form. More precisely if $F = \nabla_\Theta H$ is a Hamiltonian vector field, then if the new time $\tau$ is related to the original time $t$ by

$$\frac{d\tau}{dt} = f,$$

for some function $f$ of the phase point, we get that the vector field $\frac{1}{f}F$ is conformally symplectic with respect to the form $\widetilde{\Theta} = f\Theta$. Indeed

$$d\widetilde{\Theta} = (d\ln f + \gamma) \wedge \widetilde{\Theta} \quad\text{and}\quad \widetilde{\Theta}\bigl(\cdot, \tfrac{1}{f}F\bigr) = dH.$$

Alternatively we can keep the same conformally symplectic form and modify the Hamiltonian separately on each level set. Indeed we have

$$\Theta\bigl(\cdot, \tfrac{1}{f}F\bigr) = \frac{1}{f}\,dH = d\Bigl(\frac{1}{f}(H - c)\Bigr),$$

where the last equality is valid only on the level set $\{H = c\}$. Finally, let us consider the symplectic form $e^{-U}\Theta$. On the level set $\{H = c\}$ we have $d(e^{-U}(H - c)) = e^{-U}dH$. It follows that on this level set

$$e^{-U}\Theta(\cdot, F) = d\bigl(e^{-U}(H - c)\bigr),$$

and, as a result, the vector field $F$ coincides locally with the Hamiltonian vector field given by the Hamiltonian $e^{-U}(H - c)$ (with respect to the symplectic form $e^{-U}\Theta$). This observation provides an alternate way to derive (2.4) by using the preservation of the symplectic form $e^{-U}\Theta$ by any Hamiltonian flow (with respect to this symplectic form).

For a fixed level set $M^c$ we introduce the quotient of the tangent bundle $TM^c$ of $M^c$ by the vector field $F = \nabla_\Theta H$, i.e., by the one dimensional subspace spanned by $F$. Let us denote the quotient bundle by $\widehat{T}M^c$. The form $\Theta$ factors naturally from $TM^c$ to $\widehat{T}M^c$, in view of (2.2). The factor form defines in each of the quotient tangent spaces $\widehat{T}_zM^c$, $z \in M^c$, a linear symplectic form. The derivative of the flow preserves the vector field $F$, i.e.,

$$D_z\Phi^t(F(z)) = F(\Phi^t z).$$

As a result the derivative can be also factored on the quotient bundle and we call it the transversal derivative cocycle and denote it by

$$A^t(z) : \widehat{T}_zM^c \to \widehat{T}_{\Phi^t z}M^c.$$

It follows immediately from (2.4) that the transversal derivative cocycle is conformally symplectic with respect to $\Theta$ (or more precisely the linear symplectic form it defines in the quotient tangent spaces). We fix an invariant probability measure $\mu_c$ on $M^c$ and assume that

$$\int_{M^c} \|D_z\Phi^t\|\,d\mu_c(z) < +\infty.$$

Under this assumption the derivative cocycle has well defined Lyapunov exponents, cf. [O, R]. Then the transversal derivative cocycle has also well defined Lyapunov exponents which coincide with the former except that one zero Lyapunov exponent is skipped. We can immediately apply Theorem 1.4 to the transversal derivative cocycle and we get the following.

Theorem 2.2. For a Hamiltonian flow $\Phi^t$, defined by the vector field $F = \nabla_\Theta H$, restricted to one level set $M^c$ we have the following symmetry of the Lyapunov spectrum of the transversal derivative cocycle with respect to an invariant ergodic probability measure $\mu$. Let

$$\{0\} \subset V_0(z) \subset V_1(z) \subset \ldots \subset V_{s-1}(z) \subset V_s = \widehat{T}_zM^c$$

be the flag of subspaces at $z$ associated with the Lyapunov spectrum $\lambda_1 < \lambda_2 < \ldots < \lambda_{s-1} < \lambda_s$ of the transversal derivative cocycle $A^t(z)$, $z \in M^c$. Then the multiplicities of $\lambda_k$ and $\lambda_{s-k+1}$ are equal and

$$\lambda_k + \lambda_{s-k+1} = a, \quad\text{for } k = 1, 2, \ldots, s,$$

where $a = \int_{M^c} \gamma(F(z))\,d\mu_c(z)$. Moreover the subspace $V_{s-k}$ is the skew-orthogonal complement of the subspace $V_k$.

Note that the invariant measure $\mu_c$ can be supported on a single periodic orbit, so that Theorem 2.2 applies as well to the real parts of the Floquet exponents. To apply Theorem 1.4 it is enough to have linear symplectic forms in each of the quotient tangent spaces (to the level set), not necessarily coming from a conformally symplectic structure on the phase space. But then one needs to check directly how the transversal derivative cocycle acts on these forms, because we do not have the advantage of Theorem 2.1. This is essentially the line of argument in [D-M 1] and [D-M 3].

3. Conformally Symplectic Flows with Collisions

Let $M$ be a smooth manifold with piecewise smooth boundary $\partial M$. We assume that the manifold $M$ is equipped with a conformally symplectic structure $\Theta$, as defined in Sect. 2. Given a smooth function $H$ on $M$ with non vanishing differential we obtain the non vanishing conformally Hamiltonian vector field $F = \nabla_\Theta H$ on $M$. The vector field $F$ is tangent to the level sets of the Hamiltonian $M^c = \{z \in M \mid H(z) = c\}$. We distinguish in the boundary $\partial M$ the regular part, $\partial M_r$, consisting of the points which do not belong to more than one smooth piece of the boundary and where the vector field $F$ is transversal to the boundary. The regular part of the boundary is further split into the "outgoing" part, $\partial M_-$, where the vector field $F$ points outside the manifold $M$, and the "incoming" part, $\partial M_+$, where the vector field is directed inside the manifold. Suppose that additionally we have a piecewise smooth mapping $\Gamma : \partial M_- \to \partial M_+$, called the collision map. We assume that the mapping $\Gamma$ preserves the Hamiltonian, $H \circ \Gamma = H$, and so it can be restricted to each level set of the Hamiltonian. We assume that all the integral curves of the vector field $F$ that end (or begin) in the singular part of the boundary lie in a codimension 1 submanifold of $M$.
We can now define a flow $\Psi^t : M \to M$, called a flow with collisions, which is a concatenation of the continuous time dynamics $\Phi^t$ given by the vector field $F$, and the collision map $\Gamma$. More precisely a trajectory of the flow with collisions, $\Psi^t(x)$, $x \in M$, coincides with the trajectory of the flow $\Phi^t$ until it gets to the boundary of $M$ at time $t_c(x)$, the collision time. If the point on the boundary lies in the singular part then the flow is not defined for times $t > t_c(x)$ (the trajectory "dies" there). Otherwise the trajectory is continued at the point $\Gamma(\Psi^{t_c}x)$ until the next collision time, i.e., for $0 \le t \le t_c\bigl(\Gamma(\Psi^{t_c(x)}x)\bigr)$,

$$\Psi^{t_c + t}x = \Phi^t\,\Gamma\,\Psi^{t_c}x.$$

We define a flow with collisions to be conformally symplectic, if for the collision map $\Gamma$ restricted to any level set $M^c$ of the Hamiltonian we have

$$\Gamma^*\Theta = \beta\Theta, \tag{3.1}$$

for some non vanishing function $\beta$ defined on the boundary. More explicitly we assume that for every vector $\xi$ and $\eta$ from the tangent space $T_z\partial M^c$ to the boundary of the level set $M^c$ we have

$$\Theta(D_z\Gamma\xi, D_z\Gamma\eta) = \beta\,\Theta(\xi, \eta).$$

We restrict the flow with collisions to one level set $M^c$ of the Hamiltonian and we denote the resulting flow by $\Psi_c^t$. This flow is very likely to be badly discontinuous but we can expect that for a fixed time $t$ the mapping $\Psi_c^t$ is piecewise smooth, so that the derivative $D\Psi_c^t$ is well defined except for a finite union of codimension one submanifolds of $M^c$. We will consider only such cases.

We choose an invariant measure in our system which satisfies the condition that all the trajectories that begin (or end) in the singular part of the boundary have measure zero. Usually there are many natural invariant measures satisfying this property. For instance we get one by taking a Lebesgue measure $\nu$ in $M^c$ and averaging it over increasing time intervals ($\frac{1}{T}\int_0^T \Psi_{c*}^t\nu\,dt$ as $T \to +\infty$). Let us denote the chosen invariant measure by $\mu_c$. This measure $\mu_c$ defines the measure $\mu_{cb}$ on the boundary $\partial M^c$, which is an invariant measure for the section of the flow (Poincaré map of the flow). With respect to the measure $\mu_c$ the flow $\Psi_c^t$ is a measurable flow in the sense of the Ergodic Theory and we obtain a measurable derivative cocycle $D\Psi_c^t : T_xM^c \to T_{\Psi_c^t x}M^c$. We can define Lyapunov exponents of the flow $\Psi_c^t$ with respect to the measure $\mu_c$, if we assume that

$$\int_{M^c} \log^+\|D_x\Psi_c^t\|\,d\mu_c(x) < +\infty \quad\text{and}\quad \int_{\partial M_-^c} \log^+\|D_y\Gamma\|\,d\mu_{cb}(y) < +\infty$$

(cf. [O, R]). The derivative of the flow with collisions can be also naturally factored onto the quotient of the tangent bundle $TM^c$ of $M^c$ by the vector field $F$, which we denote by $\widehat{T}M^c$. Note that for a point $z \in \partial M^c$ the tangent to the boundary at $z$ can be naturally identified with the quotient space. We will again denote the factor of the derivative cocycle by

$$A^t(x) : \widehat{T}_xM^c \to \widehat{T}_{\Psi_c^t x}M^c.$$

We will call it the transversal derivative cocycle. If the derivative cocycle has well defined Lyapunov exponents then the transversal derivative cocycle has also well defined Lyapunov exponents which coincide with the former ones except that one zero Lyapunov exponent is skipped. For a conformally symplectic flow with collisions the factor $A^t(x)$ of the derivative cocycle on one level set changes the form $\Theta$ by a scalar, (2.3) and (3.1), so that we can immediately apply Theorem 1.4 and we get



Theorem 3.1. For a conformally symplectic flow with collisions $\Psi_c^t$ we have the following symmetry of the Lyapunov exponents for a given ergodic invariant probability measure $\mu_c$. Let

$$\{0\} \subset V_0(x) \subset V_1(x) \subset \ldots \subset V_{s-1}(x) \subset V_s = \widehat{T}_xM^c$$

be the flag of subspaces at $x$ associated with the Lyapunov spectrum $\lambda_1 < \lambda_2 < \ldots < \lambda_{s-1} < \lambda_s$ of the transversal derivative cocycle $A^t(x)$, $x \in M^c$. Then the multiplicities of $\lambda_k$ and $\lambda_{s-k+1}$ are equal and

$$\lambda_k + \lambda_{s-k+1} = a + b, \quad\text{for } k = 1, 2, \ldots, s,$$

where $a = \int_{M^c} \gamma(F)\,d\mu_c$ and $b = \frac{1}{\tau}\int_{\partial M_-^c} \log|\beta(y)|\,d\mu_{cb}(y)$. Here $\tau = \int_{\partial M_-^c} t_c(y)\,d\mu_{cb}(y)$ is the average collision time on the section of the flow. Moreover the subspace $V_{s-k}$ is the skew-orthogonal complement to $V_k$.

4. Applications

A. Gaussian isokinetic dynamics. The equations of the system are (cf. [D-M 1])

$$\dot{q} = p, \qquad \dot{p} = E - \alpha p, \qquad\text{where } \alpha = \frac{\langle E, p\rangle}{\langle p, p\rangle}. \tag{4.1}$$
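A quick way to see what the multiplier $\alpha$ in (4.1) does is to integrate the equations numerically and watch the kinetic energy $\frac{1}{2}\langle p, p\rangle$, which the exact flow keeps constant since $\langle p, \dot{p}\rangle = \langle E, p\rangle - \alpha\langle p, p\rangle = 0$. The sketch below is only an illustration under stated assumptions: a constant field $E = (1, 0)$ in the plane, a classical fourth-order Runge-Kutta step, and hypothetical helper names; none of these choices come from the paper.

```python
def dot(a, b):
    """Arithmetic scalar product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def rhs(state, field):
    """Right-hand side of (4.1): q' = p, p' = E - alpha p, alpha = <E,p>/<p,p>."""
    q, p = state
    e = field(q)
    alpha = dot(e, p) / dot(p, p)
    return list(p), [ei - alpha * pi for ei, pi in zip(e, p)]


def shifted(state, deriv, h):
    """Return state + h * deriv, componentwise on (q, p)."""
    q, p = state
    dq, dp = deriv
    return ([x + h * d for x, d in zip(q, dq)],
            [x + h * d for x, d in zip(p, dp)])


def rk4_step(state, field, h):
    """One classical Runge-Kutta step (an assumed integrator, not from the paper)."""
    k1 = rhs(state, field)
    k2 = rhs(shifted(state, k1, h / 2), field)
    k3 = rhs(shifted(state, k2, h / 2), field)
    k4 = rhs(shifted(state, k3, h), field)
    q, p = state
    new_q = [x + h / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(q, k1[0], k2[0], k3[0], k4[0])]
    new_p = [x + h / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1[1], k2[1], k3[1], k4[1])]
    return new_q, new_p


def integrate(q0, p0, field, h, steps):
    """Integrate (4.1) for the given number of steps of size h."""
    state = (list(q0), list(p0))
    for _ in range(steps):
        state = rk4_step(state, field, h)
    return state


def kinetic_energy(p):
    """H = <p,p>/2, the quantity conserved by the isokinetic thermostat."""
    return 0.5 * dot(p, p)


if __name__ == "__main__":
    constant_field = lambda q: [1.0, 0.0]  # assumed example field
    q, p = integrate([0.0, 0.0], [0.6, 0.8], constant_field, 1e-3, 1000)
    print("kinetic energy:", kinetic_energy(p), "momentum:", p)
```

In this run the momentum turns toward the direction of $E$ while its length stays fixed up to integration error, which is the behaviour the thermostat term $-\alpha p$ is designed to enforce.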

In these equations $q$ describes a point in the multidimensional configuration space $\mathbb{R}^N$, $p$ is the momentum (velocity) also in $\mathbb{R}^N$ and $\langle\cdot,\cdot\rangle$ is the arithmetic scalar product in $\mathbb{R}^N$. The field of force $E = E(q)$ is assumed to be irrotational, i.e., it has locally a potential function $U = U(q)$, $E = -\frac{\partial U}{\partial q}$.

Let us denote by $\kappa = \sum p\,dq$ the 1-form which defines the standard symplectic structure $\omega = d\kappa = \sum dp \wedge dq$. We introduce the following 2-form

$$\Theta = \omega + \frac{\langle E, dq\rangle}{\langle p, p\rangle} \wedge \kappa.$$

We choose the Hamiltonian to be $H = \frac{1}{2}\langle p, p\rangle$ and we denote by $F$ the vector field defined by (4.1). We have

$$\Theta(\cdot, F) = dH, \tag{4.2}$$

but the form $\Theta$ does not give us a conformally symplectic structure because the relation (2.1) fails. To correct this setback we fix one level set of the Hamiltonian $M^c = \{H = \frac{1}{2}\langle p, p\rangle = c\}$ and define another 2-form

$$\Theta_c = \omega + \frac{\langle E, dq\rangle}{2c} \wedge \kappa.$$

Now we get a conformally symplectic structure. Indeed

$$d\Theta_c = -\frac{\langle E, dq\rangle}{2c} \wedge \Theta_c, \quad\text{and locally}\quad \frac{\langle E, dq\rangle}{2c} = d\Bigl(\frac{-U}{2c}\Bigr).$$

Moreover on $M^c$ we still have $\Theta_c(\cdot, F) = dH$ so that the restriction of (4.1) to $M^c$ coincides with a conformally Hamiltonian system with respect to the 2-form $\Theta_c$ and with the Hamiltonian $H = \frac{1}{2}\langle p, p\rangle$. We can immediately apply Theorem 2.1 and we obtain that for any invariant ergodic probability measure $\mu_c$ on $M^c$ the Lyapunov exponents $\lambda_1 < \ldots < \lambda_s$ satisfy

$$\lambda_k + \lambda_{s-k+1} = -\frac{1}{2c}\int_{M^c} \langle E, p\rangle\,d\mu_c = -\int_{M^c} \alpha\,d\mu_c.$$

Note that if the vector field of force has a global potential, $E = -\frac{\partial U}{\partial q}$, then by the Birkhoff Ergodic Theorem the integral $-\frac{1}{2c}\int_{M^c} \langle E, p\rangle\,d\mu_c = \frac{1}{2c}\int_{M^c} dU(F)\,d\mu_c$ is equal to the time average of $\frac{1}{2c}\frac{dU}{dt}$ and so it must vanish. Another way to see it is that $e^{-U}\Theta_c$ defines a global symplectic structure and on $M^c$ our flow is Hamiltonian with respect to this symplectic structure and a modified Hamiltonian

$$\widetilde{H} = e^{-U}\bigl(\tfrac{1}{2}\langle p, p\rangle - c\bigr).$$

Indeed as discussed in Remark 2.1 on $M^c$ we have

$$e^{-U}\Theta_c(\cdot, F) = d\widetilde{H}.$$

For a Hamiltonian flow the symmetric Lyapunov exponents must add up to zero.

B. Gaussian isokinetic dynamics on a Riemannian manifold. For a given Riemannian manifold $N$ with the metric tensor $ds^2 = \sum g_{ij}\,dq_i\,dq_j$ we can naturally generalize the form $\Theta$ to the cotangent bundle $T^*N$. Indeed the 1-form $\kappa = \sum p\,dq$ is independent of the coordinate system, cf. [A], and for a given closed 1-form $\gamma$ we put

$$\Theta_c = d\kappa - \frac{1}{2c}\,\gamma \wedge \kappa.$$

We get $d\Theta_c = -\frac{1}{2c}\,\gamma \wedge \Theta_c$. Taking $\gamma = dU$ for some potential function (single or multi-valued) and the Hamiltonian $H = \frac{1}{2}\sum g^{ij}p_ip_j$ we obtain the Gaussian isokinetic dynamics, [Ch], on the level set $H = c$ by the relation (4.2). We can repeat the discussion in part A and we conclude again that the Lyapunov exponents must be symmetric and they add up to zero, if the potential $U$ is single-valued.

C. The Gaussian isokinetic dynamics with collisions. Let us consider $n$ spherical particles in a finite box $B$ contained in $\mathbb{R}^d$ or the torus $\mathbb{T}^d$. We assume that the particles interact with each other by the potential $V(q_1, q_2, \ldots, q_n)$ ($q_k \in B$, $k = 1, \ldots, n$ denote the positions of the particles) and that they are subjected to the external fields given by the potentials $V_k(q_k)$, $k = 1, \ldots, n$. Further we assume that the particles have the radii $r_1, \ldots, r_n$, the masses $m_1, \ldots, m_n$, and that they collide elastically with each other and the sides of the box, which can be flat or curved. The last element in the description of the system is the Gaussian isokinetic thermostat. As described in Parts A and B the Gaussian isokinetic thermostat gives rise to a conformally Hamiltonian flow with the Hamiltonian $H = \sum_{k=1}^{n} \frac{p_k^2}{2m_k}$ and an appropriate conformally symplectic structure. We will check below that the collisions in this system preserve the form $\Theta_c$ giving rise to a conformally symplectic flow with collisions. Theorem 3.1 can be thus applied to our system giving us the symmetry of the Lyapunov spectrum.

We introduce the canonical change of variables which brings the kinetic energy into the standard form,

$$x_k = \sqrt{m_k}\,q_k, \qquad v_k = \frac{p_k}{\sqrt{m_k}}.$$

The advantage of these coordinates is that although the collision manifolds in the configuration space become less natural, the collisions between particles (and the walls of the box) are given by the billiard rule in the configuration space. The equations of motion in the $(x, v)$ coordinates are

$$\dot{x} = v, \qquad \dot{v} = -\frac{\partial U}{\partial x} - \alpha v, \tag{4.3}$$

where $U = U(x) = V + \sum_{k=1}^{n} V_k$ is the total potential of the system and $\alpha = -\frac{dU(v)}{\langle v, v\rangle}$. We introduce the differential 2-form

$$\Theta_c = \sum dv \wedge dx - \frac{1}{2c}\,dU \wedge \langle v, dx\rangle.$$

As in part A we conclude that the form satisfies (2.1) and the system (4.3) restricted to $M^c$ coincides with the conformally Hamiltonian system defined by this form and the Hamiltonian $H = \frac{1}{2}\langle v, v\rangle$.

Proposition 4.1. The collision maps preserve the form $\Theta_c$.

Proof. A collision manifold is locally given by an equation of the form $g(x) = 0$, where $g$ is some differentiable function on $\mathbb{R}^{nd}$. Note that the general form of the collision map is the same for collisions of particles and the collisions with the sides of the box. Let $n(x)$, for $x \in \{x \in \mathbb{R}^{nd} \mid g(x) = 0\}$, denote the unit normal vector to the collision manifold in the configuration space. The collision map is defined as

$$x^+ = x^-, \qquad v^+ = v^- - 2\langle v^-, n(x^-)\rangle\,n(x^-), \tag{4.4}$$

where the index $+$ corresponds to the values of $x$ and $v$ after the collision and the index $-$ to the values before the collision. As a result of these formulas we get immediately that

$$\delta x^+ = \delta x^-. \tag{4.5}$$

It is well known, [W1, W2], that in an elastic collision the symplectic form $\omega$ is preserved. It remains to show the preservation of the second term in $\Theta_c$. It follows immediately from (4.4) and (4.5), because

$$\langle v^+, \delta x^+\rangle = \langle v^-, \delta x^-\rangle - 2\langle v^-, n(x^-)\rangle\langle n(x^-), \delta x^-\rangle,$$

and the last term is zero since we only take the variations $(\delta x^-, \delta v^-)$ tangent to the collision manifold, i.e., $\delta x^-$ is orthogonal to $n(x)$. The proposition is proven.

It follows from Proposition 4.1 that also the form $e^{-U}\Theta_c$ is preserved under collisions. Hence, as remarked in Parts A and B, if the potential $U$ is single-valued then the system restricted to one energy level coincides with a globally Hamiltonian system (with collisions) with respect to the symplectic form $e^{-U}\Theta_c$ with the Hamiltonian function equal to $\widetilde{H} = e^{-U}(\frac{1}{2}\langle p, p\rangle - c)$. We conclude that the occurrence of dissipation in such systems is related to the topology of the configuration space (the multivaluedness of the potential $U$).

D. Nosé-Hoover dynamics. The Nosé Hamiltonian is, cf. [D-M 3],

$$H(q, s; \pi, p_s) = \sum_{i=1}^{N} \frac{\pi_i^2}{2m_is^2} + \varphi(q) + \frac{p_s^2}{2} + C\ln s,$$

with a non-physical time denoted by $\lambda$ and some constant $C$. The symplectic form is $\omega = \sum d\pi \wedge dq + dp_s \wedge ds$. Changing the variables as $\pi = sp$ and $\sigma = \ln s$ the Hamiltonian becomes

$$H(q, \sigma; p, p_s) = \sum_{i=1}^{N} \frac{p_i^2}{2m_i} + \varphi(q) + \frac{p_s^2}{2} + C\sigma, \tag{4.6}$$

and the symplectic form is

$$\omega = e^{\sigma}\Bigl(\sum_i dp_i \wedge dq_i + dp_s \wedge d\sigma + d\sigma \wedge \bigl(\textstyle\sum_i p_i\,dq_i\bigr)\Bigr).$$

Note that now in the Hamiltonian the thermostat $(\sigma, p_s)$ is decoupled from the system but the coupling is shifted to the symplectic form. We make finally the time change $\frac{d\lambda}{dt} = e^{\sigma}$. We choose not to change the Hamiltonian but rather to modify the 2-form,

$$e^{-\sigma}\omega(\cdot, e^{\sigma}\nabla_\omega H) = dH.$$

We end up with the Hamiltonian (4.6) and the conformally symplectic structure

$$\Theta = e^{-\sigma}\omega = \sum_i dp_i \wedge dq_i + dp_s \wedge d\sigma + d\sigma \wedge \bigl(\textstyle\sum_i p_i\,dq_i + p_s\,d\sigma\bigr).$$

We have $d\Theta = d\sigma \wedge \Theta$. Note the similarity of $\Theta$ with the form used in the discussion of the isokinetic dynamics above. This form and the Hamiltonian give us the Hoover equations

$$\dot{q}_i = \frac{p_i}{m_i}, \qquad \dot{p}_i = -\frac{\partial\varphi}{\partial q_i} - p_sp_i, \qquad \dot{\sigma} = p_s, \qquad \dot{p}_s = \sum_i \frac{p_i^2}{m_i} - C.$$

On any level set we can drop the equation for $\sigma$ since $\sigma$ can be trivially obtained from other variables using the constancy of the Hamiltonian. By Theorem 2.1 we have the symmetry of the Lyapunov spectrum for this system reduced to one level of the Hamiltonian. Moreover the Lyapunov exponents add up to the time average of $\dot{\sigma}$. This average must be zero, unless $\sigma$ grows linearly, which is unlikely. Note that the Nosé–Hoover system is open in the sense that arbitrarily large fluctuations of $\sigma$ cannot be ruled out.

Acknowledgement. We thank Dmitri Alexeevski, Federico Bonetto and Philippe Choquard for many enlightening discussions during our stay at the ESI in December of 1996. In addition, we thank David Ruelle for valuable comments. We are also grateful for the opportunities provided by the hospitality of the Erwin Schrödinger Institute in Vienna where this work was done. M.P. Wojtkowski has been partially supported by NSF Grant DMS-9404420.



References

[A] Arnold, V.I.: Mathematical Methods in Classical Mechanics. Berlin–Heidelberg–New York: Springer Verlag, 1978
[A-M-R] Abraham, R., Marsden, J.E., Ratiu, T.: Manifolds, tensor analysis and applications. Berlin–Heidelberg–New York: Springer Verlag, 1988
[B-G-G-S] Benettin, G., Galgani, I., Giorgilli, A., Strelcyn, J.-M.: Lyapunov characteristic exponents for smooth dynamical systems and for Hamiltonian systems; a method for computing all of them. Meccanica 15, 9–20 (1980)
[B-G-G] Bonetto, F., Gallavotti, G., Garrido, P.L.: Chaotic principle: An experimental test. Preprint (1996)
[Ch] Choquard, Ph.: Lagrangian formulation of Nosé–Hoover and of isokinetic dynamics. ESI report (1996)
[Ch-E-L-S] Chernov, N.I., Eyink, G.L., Lebowitz, J.L., Sinai, Ya.G.: Steady-state electric conduction in the periodic Lorentz gas. Commun. Math. Phys. 154, 569–601 (1993)
[D-P-H] Dellago, C.P., Posch, H.A., Hoover, W.G.: Lyapunov instability in a system of hard disks in equilibrium and nonequilibrium steady states. Phys. Rev. E 53, n.2, 1485–1501 (1996)
[D-M 1] Dettmann, C.P., Morriss, G.P.: Proof of Lyapunov exponent pairing for systems at constant kinetic energy. Phys. Rev. E 53, 5541 (1996)
[D-M 2] Dettmann, C.P., Morriss, G.P.: Hamiltonian formulation of the Gaussian isokinetic thermostat. Phys. Rev. E 54, 2495 (1996)
[D-M 3] Dettmann, C.P., Morriss, G.P.: Hamiltonian reformulation and pairing of Lyapunov exponents for Nosé–Hoover dynamics. Phys. Rev. E 55, 3693 (1997)
[E-C-M] Evans, D.J., Cohen, E.G.D., Morriss, G.P.: Viscosity of a simple fluid from its maximal Lyapunov exponents. Physical Review 42A, 5990–5997 (1990)
[E-M] Evans, D.J., Morriss, G.P.: Statistical Mechanics of nonequilibrium liquids. New York: Academic Press, 1990
[G-G] Garrido, P.L., Gallavotti, G.: Billiards correlation functions. J. Stat. Phys. 76, 549–586 (1994)
[L-B-D] Latz, A., van Beijeren, H., Dorfman, J.R.: Lyapunov spectrum and the conjugate pairing rule for a thermostatted random Lorentz gas: Kinetic theory. Preprint (1996)
[O] Oseledets, V.I.: A multiplicative ergodic theorem: Characteristic Lyapunov exponents of dynamical systems. Trans. Moscow Math. Soc. 19, 197–231 (1968)
[R] Ruelle, D.: Ergodic theory of differentiable dynamical systems. Publ. Math. IHES 50, 27–58 (1979)
[V] Vaisman, I.: Locally conformal symplectic manifolds. Int. J. Math.–Math. Sci. 8, 521–536 (1985)
[W1] Wojtkowski, M.P.: Measure Theoretic Entropy of the system of hard spheres. Ergodic Theory and Dynamical Systems 8, 133–153 (1988)
[W2] Wojtkowski, M.P.: Systems of classical interacting particles with nonvanishing Lyapunov exponents. In: Lyapunov Exponents, Proceedings, Oberwolfach 1990, L. Arnold, H. Crauel, J.-P. Eckmann (Eds.), Lecture Notes in Math. Vol. 1486, 1991, pp. 243–262

Communicated by J. L. Lebowitz

Commun. Math. Phys. 194, 61 – 70 (1998)



The Pair Correlation Function of Fractional Parts of Polynomials*

Zeév Rudnick¹, Peter Sarnak²

¹ Raymond and Beverly Sackler School of Mathematical Sciences, Tel Aviv University, Tel Aviv 69978, Israel
² Department of Mathematics, Princeton University, Princeton NJ 08544, USA

Received: 22 July 1997 / Accepted: 24 September 1997

Abstract: We investigate the pair correlation function of the sequence of fractional parts of αnd , n = 1, 2, . . . , N , where d ≥ 2 is an integer and α an irrational. We conjecture that for badly approximable α, the normalized spacings between elements of this sequence have Poisson statistics as N → ∞. We show that for almost all α (in the sense of measure theory), the pair correlation of this sequence is Poissonian. In the quadratic case d = 2, this implies a similar result for the energy levels of the “boxed oscillator” in the high-energy limit. This is a simple integrable system in 2 degrees of freedom studied by Berry and Tabor as an example for their conjecture that the energy levels of generic completely integrable systems have Poisson spacing statistics.

1. Introduction

Hermann Weyl [11] proved that for an integer $d \ge 1$ and an irrational $\alpha$, the sequence of fractional parts $\alpha n^d \bmod 1$, $n = 1, 2, \ldots$ is equidistributed in the unit interval. A different aspect of the random behavior of the sequence has attracted attention recently: Are the spacings between members of the sequence distributed like those between members of a sequence of random numbers in the unit interval (or as some would say, do they have a "Poissonian" distribution)? This issue came up in the context of the distribution of spacings of the energy levels of integrable systems [1]. For the case $d = 1$ the spacings between the fractional parts of $\alpha n$ are essentially those of the energy levels of a two-dimensional harmonic oscillator [4, 2, 3]. For $d = 2$ the spacings are related to the spacings between the energy levels of the "boxed oscillator" [1], a particle in a 2-dimensional potential well with hard walls in one direction and harmonic binding in the other. The spacings of $\alpha n^2 \bmod 1$ were also investigated numerically in [5].

* Supported in part by grants from the U.S.-Israel Binational Science Foundation, the Israel Science Foundation, and the NSF.

If $d = 1$ it is elementary that the consecutive spacings have at most 3 values [9, 10]. Hence the sequence is not random in this case. For $d \ge 2$ the picture is very different. To explain it, we recall a basic classification of real numbers with regards to their Diophantine approximation properties: We say $\alpha$ is of type $\kappa$ if there is $c = c(\alpha) > 0$ so that $|\alpha - p/q| > c/q^{\kappa}$ for all integers $p, q$. For rational $\alpha$, $\kappa = 1$ and $\alpha$ is irrational if and only if $\kappa \ge 2$. It is well known that almost all $\alpha$ (in the sense of measure theory) are of type $\kappa = 2 + \epsilon$ for all $\epsilon > 0$. We will call such $\alpha$ "Diophantine". For instance, algebraic irrationals are of this type (Roth's theorem).

In [7] we establish some results towards the conjecture that $\alpha n^d \bmod 1$ is Poissonian for any $\alpha$ of Diophantine type. In this note we examine the behavior for almost all $\alpha$, which according to the above should be Poissonian. The statistic we examine is the pair correlation: The pair correlation density for a sequence of $N$ numbers $\theta_1, \ldots, \theta_N \in [0, 1]$ which are equidistributed as $N \to \infty$, measures the distribution of spacings between the $\theta_j$ at distances of order of the mean spacing $1/N$. Precisely, if $\|x\| = \mathrm{distance}(x, \mathbb{Z})$ then for any interval $[-s, s]$ set

$$R_2([-s, s], \{\theta_n\}, N) = \frac{1}{N}\,\#\Bigl\{1 \le j \ne k \le N : \|\theta_j - \theta_k\| \le \frac{s}{N}\Bigr\}. \tag{1.1}$$

For random numbers $\theta_j$ chosen uniformly and independently, $R_2([-s, s], \{\theta_n\}, N) \to 2s$ with probability tending to 1 as $N \to \infty$. Our main result is that this holds for the sequence of fractional parts $\{\alpha n^d \bmod 1\}$ for almost every $\alpha$: Denoting by $R_2([-s, s], \alpha, N)$ the pair correlation sum (1.1) for this sequence, we show

Theorem 1. For $d \ge 2$, there is a set $P \subset \mathbb{R}$ of full Lebesgue measure such that for any $\alpha \in P$, and any $s \ge 0$,

$$R_2([-s, s], \alpha, N) \to 2s, \qquad N \to \infty.$$
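The statistic (1.1) is straightforward to evaluate by brute force, which can help build intuition for Theorem 1. The sketch below is a minimal illustration with hypothetical function names; it works in floating point, so for large $n^d$ the fractional parts lose accuracy, and it says nothing about whether a particular $\alpha$ (here $\sqrt{2}$, chosen arbitrarily) belongs to the set $P$.

```python
import math


def dist_to_nearest_integer(x):
    """||x|| = distance from x to the integers."""
    return abs(x - round(x))


def fractional_parts(alpha, d, n_terms):
    """theta_n = alpha * n^d mod 1 for n = 1..n_terms (floating point)."""
    return [(alpha * n ** d) % 1.0 for n in range(1, n_terms + 1)]


def pair_correlation(thetas, s):
    """R_2([-s, s], {theta_n}, N) as in (1.1): ordered pairs j != k, divided by N."""
    n_terms = len(thetas)
    window = s / n_terms
    count = sum(
        1
        for j in range(n_terms)
        for k in range(n_terms)
        if j != k and dist_to_nearest_integer(thetas[j] - thetas[k]) <= window
    )
    return count / n_terms


if __name__ == "__main__":
    thetas = fractional_parts(math.sqrt(2.0), 2, 500)
    for s in (0.5, 1.0, 2.0):
        print(s, pair_correlation(thetas, s), "Poisson value:", 2 * s)
```

The double loop costs $O(N^2)$ operations, which is fine for a few thousand points; sorting the points first would reduce this, at the cost of a slightly longer example.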

Remark 1.1. The proof given below does not provide (and we do not know of) any specific $\alpha$ which is provably in $P$.

Remark 1.2. Already with the pair correlation we see the necessity of a condition on the type of $\alpha$. For if there are arbitrarily large integers $p, q$ so that

$$\Big|\alpha - \frac{p}{q}\Big| \le \frac{1}{10 q^{d+1}},$$

then $R_2([-s,s],\alpha,N) \not\to 2s$. Indeed if we choose $N = q$, then for $m \ne n \le N$,

$$n^d\alpha - m^d\alpha = \frac{(n^d - m^d)p}{q} + \frac{t(n^d - m^d)}{10 q^{d+1}}$$

with $|t| \le 1$. Hence either $\|n^d\alpha - m^d\alpha\| \le 1/10q = 1/10N$ if $q$ divides $n^d - m^d$, or $\|n^d\alpha - m^d\alpha\| \ge 9/10q = 9/10N$ otherwise. Thus there are no normalized differences $N\|n^d\alpha - m^d\alpha\|$ in the interval $(1/10, 9/10)$ for this sequence of $N = q$.
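The gap described in Remark 1.2 can be checked in exact rational arithmetic. The sketch below makes assumptions that are not in the text: it takes p and q coprime, fixes t = 1 (so that α = p/q + 1/(10 q^{d+1})), and uses the illustrative names `dist_to_nearest_integer` and `normalized_differences`.

```python
from fractions import Fraction


def dist_to_nearest_integer(x):
    """Distance from the rational number x to the nearest integer."""
    frac = x - (x.numerator // x.denominator)  # fractional part in [0, 1)
    return min(frac, 1 - frac)


def normalized_differences(p, q, d):
    """All values N * ||n^d alpha - m^d alpha|| with N = q and m != n <= N.

    Assumption: alpha = p/q + 1/(10 q^(d+1)), i.e. t = 1 in Remark 1.2.
    """
    alpha = Fraction(p, q) + Fraction(1, 10 * q ** (d + 1))
    n_terms = q
    values = []
    for n in range(1, n_terms + 1):
        for m in range(1, n_terms + 1):
            if m != n:
                values.append(n_terms * dist_to_nearest_integer((n**d - m**d) * alpha))
    return values
```

With, say, `normalized_differences(3, 7, 2)`, every returned value is either below 1/10 or above 9/10, which is the obstruction to Poissonian pair correlation that the remark describes.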

Pair Correlation of Fractional Parts


The proof of the theorem follows the steps in [8] (where a similar assertion is proven for the values of binary quadratic forms). We first establish that as a function of $\alpha \in [0, 1]$, $R_2([-s,s],\alpha,N) \to 2s$ in $L^2(0,1)$. This together with standard bounds on the Weyl sums $S(n,N) = \sum_{x \le N} e(n\alpha x^d)$ allows us to pass to almost everywhere convergence. In Sect. 4 we briefly discuss higher correlations and show that they do not converge in $L^2$ to the expected value. Thus our approach does not lend itself directly to establishing almost everywhere convergence of the higher correlations.

2. Bounding the Variance

Let $f \in C_c^\infty(\mathbb{R})$ be a test function and set

$$R_2(f,\{\theta_n\},N) := \frac{1}{N}\sum_{1 \le j \ne k \le N} F_N(\theta_j - \theta_k), \tag{2.1}$$

where

$$F_N(y) = \sum_{m \in \mathbb{Z}} f(N(y + m)). \tag{2.2}$$

The function $F_N(y)$ is periodic and has a Fourier expansion

$$F_N(y) = \frac{1}{N}\sum_{n \in \mathbb{Z}} \hat f\Big(\frac{n}{N}\Big) e(ny). \tag{2.3}$$

Hence

$$R_2(f,\{\theta_n\},N) = \frac{1}{N^2}\sum_{n \in \mathbb{Z}} \hat f\Big(\frac{n}{N}\Big) \sum_{1 \le j \ne k \le N} e(n(\theta_j - \theta_k)). \tag{2.4}$$

In particular, if $\theta_n = \alpha n^d \bmod 1$, then the pair correlation function is given by

$$R_2(f,\alpha,N) = \frac{1}{N^2}\sum_{n \in \mathbb{Z}} \hat f\Big(\frac{n}{N}\Big) s_{\mathrm{off}}(n,N), \tag{2.5}$$

where

$$s_{\mathrm{off}}(n,N) := \sum_{1 \le x \ne y \le N} e\big(n\alpha(x^d - y^d)\big). \tag{2.6}$$

As a function of $\alpha$, $R_2(f,\alpha,N)$ is periodic and from (2.5) its Fourier expansion is

$$R_2(f,\alpha,N) = \sum_{l \in \mathbb{Z}} b_l(N) e(l\alpha), \tag{2.7}$$

where for $l \ne 0$,

$$b_l(N) = \frac{1}{N^2}\sum_{n \ne 0}\ \sum_{\substack{1 \le x \ne y \le N \\ n(x^d - y^d) = l}} \hat f\Big(\frac{n}{N}\Big). \tag{2.8}$$

The mean of $R_2(f,\alpha,N)$ over $\alpha \in [0, 1]$ is

$$\langle R_2 \rangle = b_0(N) = \frac{1}{N^2}\sum_{1 \le x \ne y \le N} \hat f(0) = \Big(1 - \frac{1}{N}\Big)\hat f(0), \tag{2.9}$$

so that

$$\langle R_2 \rangle = \int_{-\infty}^{\infty} f(x)\,dx + O\Big(\frac{1}{N}\Big), \tag{2.10}$$

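which is the expected value for a random sequence. We next estimate the variance of $R_2(f,\alpha,N)$ as a function of $\alpha$:

Proposition 2. As a function of $\alpha \in [0, 1]$,

$$\big\| R_2(f,\alpha,N) - \hat f(0) \big\|_2 \ll N^{-1/2+\epsilon} \tag{2.11}$$

for any $\epsilon > 0$, the implied constants depending on $\epsilon$ and $f$.

Proof. It is easy to see from (2.8) that since $f \in C_c^\infty(\mathbb{R})$, the Fourier coefficients $b_l(N)$ are negligible for $|l| \ge N^{d+1+\delta}$ for any fixed $\delta > 0$. Also from (2.8) we have for $l \ne 0$,

$$b_l(N) \ll \frac{\tau(|l|)^2}{N^2}, \tag{2.12}$$

where $\tau(|l|)$ is the number of divisors of $|l|$. This is because the factors of $l$ determine $n, x, y$. We will use the well-known estimate $\tau(m) \ll m^{\epsilon}$, for any $\epsilon > 0$. Thus by Parseval

$$\begin{aligned}
\big\| R_2(f,\alpha,N) - \hat f(0) \big\|_2^2 &= \Big(\frac{\hat f(0)}{N}\Big)^2 + \sum_{l \ne 0} |b_l(N)|^2 \\
&\ll \frac{N^{\epsilon}}{N^2}\sum_{0 \ne |l| \le N^{d+1+\delta}} |b_l(N)| + \text{smaller order term} \\
&\ll \frac{N^{\epsilon}}{N^2}\sum_{l \ne 0} |b_l(N)| \\
&\ll \frac{N^{\epsilon}}{N^2}\sum_{1 \le x \ne y \le N}\ \sum_{n \in \mathbb{Z}} \frac{1}{N^2}\Big|\hat f\Big(\frac{n}{N}\Big)\Big| \ll N^{-1+\epsilon}.
\end{aligned} \tag{2.13}$$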
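The final inequality in (2.13) is stated without the intermediate count. One way to fill it in, assuming only the rapid decay of $\hat f$ (which holds because $f \in C_c^\infty(\mathbb{R})$), is sketched below; the constant $C_f$ is a label introduced for this sketch.

```latex
% Sketch of the counting behind the last step of (2.13).
% C_f is a constant depending only on f (hypothetical label).
\begin{align*}
\sum_{n\in\mathbb{Z}} \Big|\hat f\Big(\frac{n}{N}\Big)\Big|
  &\le C_f \sum_{n\in\mathbb{Z}} \Big(1+\frac{|n|}{N}\Big)^{-2} \ll_f N, \\
\sum_{1\le x\ne y\le N}\ \sum_{n\in\mathbb{Z}} \frac{1}{N^2}\Big|\hat f\Big(\frac{n}{N}\Big)\Big|
  &\ll_f N(N-1)\cdot\frac{1}{N^2}\cdot N \le N, \\
\frac{N^{\epsilon}}{N^{2}}\cdot N &= N^{-1+\epsilon}.
\end{align*}
```

The leading term $(\hat f(0)/N)^2 \ll N^{-2}$ is of smaller order, so the variance is $\ll N^{-1+\epsilon}$ and (2.11) follows on taking square roots.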



3. Almost-Everywhere Convergence

3.1. Overview of the argument for Theorem 1. In order to prove Theorem 1 from the decay of the variance of the pair correlation, we first show that for each $f \in C_c^\infty(\mathbb{R})$, there is a set of full measure $P(f) \subset \mathbb{R}$ so that for all $\alpha \in P(f)$,

$$R_2(f,\alpha,N_m) \to \hat f(0) \tag{3.1}$$

for a subsequence $N_m$ which grows faster than $m$. Indeed, fix $\delta > 0$, and let $\{N_m\}$ be a sequence of integers with $N_m \sim m^{1+\delta}$. Set $X_N(\alpha) = R_2(f,\alpha,N) - \hat f(0)$. By Proposition 2, $\|X_N\|_2^2 \ll N^{-1+\epsilon}$ for all $\epsilon > 0$ and so

$$\sum_{m=1}^{\infty} \int_0^1 |X_{N_m}(\alpha)|^2\,d\alpha < \infty.$$

Therefore (since $|X_{N_m}|^2 \ge 0$)

$$\int_0^1 \sum_m |X_{N_m}(\alpha)|^2\,d\alpha = \sum_m \int_0^1 |X_{N_m}(\alpha)|^2\,d\alpha < \infty,$$

and so $\sum_m |X_{N_m}|^2 \in L^1(0,1)$. Thus the sum is finite almost everywhere:

$$\sum_m |X_{N_m}(\alpha)|^2 < \infty, \qquad \text{for almost all } \alpha.$$

Therefore, $X_{N_m}(\alpha) \to 0$ as $m \to \infty$ for almost all $\alpha$, that is we have (3.1) on a set $P(f)$ of $\alpha$’s which we may assume consists only of Diophantine numbers.

To go from almost everywhere convergence along a subsequence to almost everywhere convergence, we will show that as a function of $N$, $R_2(f,\alpha,N)$ does not oscillate much for Diophantine $\alpha$. More precisely, there is some $\nu > 0$ so that if $N_m \le n < N_{m+1}$ then for Diophantine $\alpha$, there is $c(f,\alpha) > 0$ so that

$$|X_n(\alpha) - X_{N_m}(\alpha)| \le c(f,\alpha) N_m^{-\nu}.$$

Because $0 \le n - N_m \le N_{m+1} - N_m \ll N_m^{\delta}$, this estimate in turn will follow from:

Proposition 3. Let $0 < \delta < 1/2^{d-1}$. Then for all $f \in C_c^\infty(\mathbb{R})$ and all $\alpha$ of Diophantine type, there is some $c(f,\alpha) > 0$ so that for all $0 \le k \le N^{\delta}$,

$$|X_{N+k}(\alpha) - X_N(\alpha)| \le c(f,\alpha) N^{-\nu},$$

where $\nu < 1/2^{d-1} - \delta$.
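The choice $N_m \sim m^{1+\delta}$ in the overview above does two jobs at once, and the exponent arithmetic can be made explicit. The following sketch assumes the concrete choice $N_m = \lfloor m^{1+\delta} \rfloor$, which is one admissible sequence and is not specified in the text.

```latex
% Sketch, assuming N_m = \lfloor m^{1+\delta} \rfloor (an illustrative choice).
\begin{align*}
\|X_{N_m}\|_2^2 &\ll N_m^{-1+\epsilon} \asymp m^{-(1+\delta)(1-\epsilon)},
 \qquad\text{summable in } m \text{ once } (1+\delta)(1-\epsilon) > 1,
 \text{ i.e. } \epsilon < \frac{\delta}{1+\delta};\\
N_{m+1}-N_m &\le (m+1)^{1+\delta}-m^{1+\delta}+1 \le (1+\delta)(m+1)^{\delta}+1
 \ll m^{\delta} \ll N_m^{\delta/(1+\delta)} \le N_m^{\delta}.
\end{align*}
```

So the subsequence is sparse enough for the variances to be summable, yet dense enough that the gaps $N_{m+1} - N_m$ fall within the range $0 \le k \le N^{\delta}$ covered by Proposition 3.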



Since $X_{N_m}(\alpha) \to 0$ for all $\alpha \in P(f)$, which by throwing out a measure-zero subset we assumed consisted only of Diophantine $\alpha$’s, Proposition 3 implies $X_n(\alpha) \to 0$ for all $\alpha \in P(f)$. We will prove this proposition after finishing the proof of Theorem 1.

What remains to do is to find one subset $P \subset \mathbb{R}$ of full measure for which $R_2(f,\alpha,N) \to \int_{-\infty}^{\infty} f(x)\,dx$ for all $\alpha \in P$ and all $f$ which are characteristic functions of intervals $[-s, s]$ (or in $C_c^\infty(\mathbb{R})$). To do this, pick a (countable) sequence of positive $f_i \in C_c^\infty(\mathbb{R})$ so that for each $f \ge 0$ as above, there are subsequences $\{f_i^{\pm}\} \subset \{f_i\}$ which satisfy $f_i^- \le f \le f_i^+$ and $\int_{-\infty}^{\infty} (f_i^+ - f_i^-)(x)\,dx \to 0$. Take $P := \cap_i P(f_i)$ which is still of full measure. For every $\alpha$ we have

$$R_2(f_i^-,\alpha,N) \le R_2(f,\alpha,N) \le R_2(f_i^+,\alpha,N),$$

and in addition for $\alpha \in P$, we have $R_2(f_i^{\pm},\alpha,N) \to \int_{-\infty}^{\infty} f_i^{\pm}$. Since $\int_{-\infty}^{\infty} f_i^{\pm} \to \int_{-\infty}^{\infty} f$, this shows that $R_2(f,\alpha,N) \to \int_{-\infty}^{\infty} f$ for $\alpha \in P$ and gives Theorem 1. The proof of Proposition 3 will occupy the rest of this section.

3.2. Estimates for Weyl sums. We begin with some consequences of Weyl’s estimates for the “Weyl sums” $S(n,N) = \sum_{x \le N} e(n\alpha x^d)$ which we will need. Throughout the remainder of this section, we set $D = 2^{d-1}$.

Lemma 4. For $\alpha$ Diophantine, and $M \ge 1$, we have

$$\sum_{1 \le n \le M} |S(n,N)|^D \ll M^{1+\epsilon} N^{D-1+\epsilon}$$

for all $\epsilon > 0$ ($D = 2^{d-1}$).

Proof. This follows from the proof of Weyl’s inequality (see [6], Lemma 3). We will outline the steps. By repeated squaring, one finds that

$$|S(n,N)|^D \ll N^{D-1} + N^{D-d} \sum_{y_1,\ldots,y_{d-1}=1}^{N} \min\Big(N, \frac{1}{\|d!\,n\alpha y_1 \cdots y_{d-1}\|}\Big),$$

where $\|\cdot\|$ denotes the distance to the nearest integer. Now sum over $n \le M$, collecting together terms with the product $d!\,n y_1 \cdots y_{d-1}$ having a given value $m$. The number of such terms is at most the divisor function $\tau(m) \ll m^{\epsilon}$. Since the maximal value of $m$ is $d!\,M N^{d-1}$, we find

$$\sum_{1 \le n \le M} |S(n,N)|^D \ll M N^{D-1} + M^{\epsilon} N^{D-d+\epsilon} \sum_{m \le d!\,M N^{d-1}} \min\Big(N, \frac{1}{\|m\alpha\|}\Big). \tag{3.2}$$

Proceeding as in [6], we replace $\alpha$ by a rational approximation $a/q$ with $|\alpha - a/q| \le 1/q^2$, and divide the range of summation into consecutive blocks of length $q$. This will give

$$\sum_{m \le d!\,M N^{d-1}} \min\Big(N, \frac{1}{\|m\alpha\|}\Big) \ll \Big(\frac{M N^{d-1}}{q} + 1\Big)\cdot(N + q\log q).$$

Inserting into (3.2) we get

$$\sum_{1 \le n \le M} |S(n,N)|^D \ll M N^{D-1} + M^{\epsilon} N^{D-d+\epsilon}\Big(\frac{M N^{d-1}}{q} + 1\Big)\cdot(N + q\log q). \tag{3.3}$$

Now choose $q \le M N^{d-1}$ with $|\alpha - a/q| \le 1/(q M N^{d-1})$ (so certainly $|\alpha - a/q| \le 1/q^2$ so (3.3) holds). Since $\alpha$ is Diophantine, $|\alpha - a/q| \gg 1/q^{2+\epsilon}$ which gives $q \gg (M N^{d-1})^{1-\epsilon}$. Therefore

$$\Big(\frac{M N^{d-1}}{q} + 1\Big)\cdot(N + q\log q) \ll (M N^{d-1})^{1+\epsilon},$$

and consequently

$$\sum_{1 \le n \le M} |S(n,N)|^D \ll M^{1+\epsilon} N^{D-1+\epsilon}$$

as required.

As an immediate consequence of this lemma, we get on repeatedly using the Cauchy–Schwarz inequality that

Corollary 5. For $\alpha$ Diophantine, and $M \ge 1$,

$$\sum_{1 \le n \le M} |S(n,N)|^2 \ll M^{1+\epsilon} N^{2-2/D+\epsilon} \tag{3.4}$$

and

$$\sum_{1 \le n \le M} |S(n,N)| \ll M^{1+\epsilon} N^{1-1/D+\epsilon}. \tag{3.5}$$
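The passage from Lemma 4 to Corollary 5 can be written out in two lines. The sketch below replaces the repeated use of Cauchy–Schwarz by a single application of Hölder's inequality with exponents $D/2$ and $D/(D-2)$ (valid since $D = 2^{d-1} \ge 2$), which gives the same bound; (3.5) then follows from (3.4) by one Cauchy–Schwarz step.

```latex
% Sketch: Corollary 5 from Lemma 4, using Hoelder in place of iterated Cauchy-Schwarz.
\begin{align*}
\sum_{1\le n\le M} |S(n,N)|^{2}
 &\le M^{1-2/D}\Big(\sum_{1\le n\le M}|S(n,N)|^{D}\Big)^{2/D}
  \ll M^{1-2/D}\big(M^{1+\epsilon}N^{D-1+\epsilon}\big)^{2/D}
  = M^{1+2\epsilon/D}\,N^{2-2/D+2\epsilon/D},\\
\sum_{1\le n\le M} |S(n,N)|
 &\le M^{1/2}\Big(\sum_{1\le n\le M}|S(n,N)|^{2}\Big)^{1/2}
  \ll M^{1/2}\big(M^{1+\epsilon}N^{2-2/D+\epsilon}\big)^{1/2}
  = M^{1+\epsilon/2}\,N^{1-1/D+\epsilon/2}.
\end{align*}
```

Since $\epsilon > 0$ is arbitrary, these are the bounds (3.4) and (3.5).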

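3.3. Proof of Proposition 3. We first show

$$X_{N+k}(\alpha) - X_N(\alpha) = \frac{1}{N^2}\sum_{0 \ne |n| \le M} \hat f\Big(\frac{n}{N}\Big)\big(s_{\mathrm{off}}(n,N+k) - s_{\mathrm{off}}(n,N)\big) + O\big(M^{2+\epsilon} N^{-2+\delta-2/D}\big). \tag{3.6}$$

Indeed, for $b > 0$, $M = N^{1+b}$,

$$X_N(\alpha) = \frac{1}{N^2}\sum_{0 \ne |n| \le M} \hat f\Big(\frac{n}{N}\Big) s_{\mathrm{off}}(n,N) + \text{rapidly decaying term}.$$

Next we use $|s_{\mathrm{off}}(n,N)| \le N + |S(n,N)|^2$ and Corollary 5 to deduce that

$$\sum_{0 \ne |n| \le M} |s_{\mathrm{off}}(n,N+k)| \le MN + \sum_{0 \ne |n| \le M} |S(n,N+k)|^2 \ll M^{1+\epsilon} N^{2-2/D}. \tag{3.7}$$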



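Next we claim that

$$\frac{1}{(N+k)^2}\sum_{0 \ne |n| \le M} \hat f\Big(\frac{n}{N+k}\Big) s_{\mathrm{off}}(n,N+k) = \frac{1}{N^2}\sum_{0 \ne |n| \le M} \hat f\Big(\frac{n}{N}\Big) s_{\mathrm{off}}(n,N+k) + O\big(M^{2+\epsilon} N^{-2+\delta-2/D}\big).$$

Indeed, write

$$\frac{1}{(N+k)^2} = \frac{1}{N^2} + O\Big(\frac{k}{N^3}\Big) = \frac{1}{N^2} + O(N^{-3+\delta})$$

and

$$\frac{n}{N+k} = \frac{n}{N} + O\Big(\frac{nk}{N^2}\Big) = \frac{n}{N} + O\Big(\frac{M}{N^{2-\delta}}\Big),$$

so that for $|n| \le M$, $k < N^{\delta}$,

$$\hat f\Big(\frac{n}{N+k}\Big) = \hat f\Big(\frac{n}{N}\Big) + O\Big(\frac{M}{N^{2-\delta}}\Big).$$

Therefore

$$\begin{aligned}
&\frac{1}{(N+k)^2}\sum_{0 \ne |n| \le M} \hat f\Big(\frac{n}{N+k}\Big) s_{\mathrm{off}}(n,N+k) - \frac{1}{N^2}\sum_{0 \ne |n| \le M} \hat f\Big(\frac{n}{N}\Big) s_{\mathrm{off}}(n,N+k) \\
&\quad = \sum_{0 \ne |n| \le M} \Big(\frac{1}{N^2} + O\Big(\frac{1}{N^{3-\delta}}\Big)\Big)\Big(\hat f\Big(\frac{n}{N}\Big) + O\Big(\frac{M}{N^{2-\delta}}\Big)\Big) s_{\mathrm{off}}(n,N+k) - \frac{1}{N^2}\sum_{0 \ne |n| \le M} \hat f\Big(\frac{n}{N}\Big) s_{\mathrm{off}}(n,N+k) \\
&\quad \ll \Big(\frac{M}{N^{4-\delta}} + \frac{1}{N^{3-\delta}}\Big)\sum_{0 \ne |n| \le M} |s_{\mathrm{off}}(n,N+k)| \\
&\quad \ll \frac{M}{N^{4-\delta}}\, M^{1+\epsilon} N^{2-2/D} = M^{2+\epsilon} N^{-2+\delta-2/D} \qquad \text{by (3.7)}
\end{aligned}$$

as required. This proves (3.6).

Next we express the difference $s_{\mathrm{off}}(n,N+k) - s_{\mathrm{off}}(n,N)$ as

$$s_{\mathrm{off}}(n,N+k) - s_{\mathrm{off}}(n,N) = 2\,\mathrm{Re} \sum_{N+1 \le y \le N+k} e(-n\alpha y^d) \sum_{1 \le x \le N} e(n\alpha x^d) + \sum_{N+1 \le x \ne y \le N+k} e\big(n\alpha(x^d - y^d)\big).$$

We estimate the second term trivially by $k^2$:

$$|s_{\mathrm{off}}(n,N+k) - s_{\mathrm{off}}(n,N)| \le k|S(n,N+k)| + k^2. \tag{3.8}$$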

Then inserting this into (3.6) we get

$$X_{N+k}(\alpha) - X_N(\alpha) \ll \frac{1}{N^2}\sum_{0 \ne |n| \le M}\big(k|S(n,N+k)| + k^2\big) + M^{2+\epsilon} N^{-2+\delta-2/D} \ll \frac{k}{N^2}\cdots$$

3 In the four dimensional case one has similar equations, with the indices $i, j, k$ running from 1 to 3. Then the coefficients $c_{ijk}$ are the structure constants for quaternions. The holomorphic H case that we will shortly analyze is thus a theory with a complexified quaternionic structure.
4 It is also known that a solution exists in seven dimensions if one replaces Spin(7) by $G_2$ (see [15]).
5 An interesting problem is to find conditions on a curved compact Joyce manifold $M_8$ so that such instantons exist.


L. Baulieu, H. Kanno, I. M. Singer

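$$\begin{aligned}
S_1 &= \frac{1}{2}\int_{M_8} \Omega \wedge \mathrm{Tr}\,(F \wedge F) + s\,\frac{1}{2}\int_{M_8} d^8x\,\sqrt{g}\;\mathrm{Tr}\,\Big(\chi_i \Phi_i + \frac{1}{2}\chi_i H_i\Big) \\
&= \frac{1}{2}\int_{M_8} \Omega \wedge \mathrm{Tr}\,(F \wedge F) + \frac{1}{2}\int_{M_8} d^8x\,\sqrt{g}\;\mathrm{Tr}\,\Big(H_i \Phi_i + \frac{1}{2}H_i H_i - \chi_i (D\psi)_i + \frac{1}{2}\phi[\chi_i, \chi_i]\Big),
\end{aligned} \tag{2.18}$$

where $(D\psi)_i$ is the FP ghost independent part of $s\Phi_i$. Eliminating the auxiliary fields $H_i$ by Eq. (2.15), one recovers the standard Yang–Mills kinetic term

$$S_1 = \int_{M_8} d^8x\,\sqrt{g}\;\mathrm{Tr}\,\Big(-\frac{1}{4}F^{\mu\nu}F_{\mu\nu} - \chi_i (D\psi)_i + \frac{1}{2}\phi[\chi_i, \chi_i]\Big). \tag{2.19}$$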

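Notice that the fermion terms break the SO(8) global invariance down to $G_2$, for which the octonion structure coefficient in Eq. (2.12) is an invariant tensor. The gauge fixing and Faddeev–Popov ghost dependence have not been considered yet: the first stage action has still a gauge symmetry in the ordinary sense. To fix it completely we take two more conditions

$$D \cdot \psi = 0, \qquad \partial \cdot A = 0. \tag{2.20}$$

(The meaning of the scalar product is the usual one, e.g. $D \cdot \Psi = D_\mu \Psi_\mu$.) Introducing additional fields $(\bar\phi, \eta)$ and $(\bar c, B)$ with the BRST transformation law,

$$\begin{aligned}
s\bar\phi &= \eta - [c, \bar\phi], & s\eta &= [\phi, \bar\phi] - [c, \eta], \\
s\bar c &= B - [c, \bar c], & sB &= [\phi, \bar c] - [c, B],
\end{aligned} \tag{2.21}$$

we write the complete action as

$$\begin{aligned}
S_2 &= S_1 + s\int_{M_8} d^8x\,\sqrt{g}\;\mathrm{Tr}\,\Big(\bar\phi\, D \cdot \psi + \bar c\, \partial \cdot A + \frac{1}{2}\bar c B\Big) \\
&= \int_{M_8} d^8x\,\sqrt{g}\;\mathrm{Tr}\,\Big(-\frac{1}{4}F^{\mu\nu}F_{\mu\nu} - \chi_i (D\psi)_i + \frac{1}{2}\phi[\chi_i, \chi_i] \\
&\qquad + \eta\, D \cdot \psi + \bar\phi\, D \cdot D\phi - \psi \cdot [\bar\phi, \psi] + B\, \partial \cdot A + \frac{1}{2}B^2 + \bar c\, \partial \cdot Dc \\
&\qquad - \bar c\, \partial \cdot \psi + \partial \cdot A\,[c, \bar c] - \frac{1}{2}\phi[\bar c, \bar c]\Big).
\end{aligned} \tag{2.22}$$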

A natural set of topological observables is derived from the topological invariants

$$\frac{1}{2}\int_{M_8} \Omega \wedge \mathrm{Tr}\,(F \wedge F), \qquad \int_{M_8} \mathrm{Tr}\,(F \wedge F \wedge F \wedge F). \tag{2.23}$$

The method of the descent equation implies a ladder of topological invariants and, for example, gives the following descendants:

Special Quantum Field Theories in Eight and Other Dimensions

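$$\begin{aligned}
O^{(0)} &= \frac{1}{2}\int_{M_8} \Omega \wedge \mathrm{Tr}\,(F \wedge F), \\
O^{(1)} &= \int_{\gamma_7} \Omega \wedge \mathrm{Tr}\,(\psi \wedge F), \\
O^{(2)} &= \int_{\gamma_6} \Omega \wedge \mathrm{Tr}\,\Big(\frac{1}{2}\psi \wedge \psi - \phi \wedge F\Big), \\
O^{(3)} &= -\int_{\gamma_5} \Omega \wedge \mathrm{Tr}\,(\psi \wedge \phi), \\
O^{(4)} &= \frac{1}{2}\int_{\gamma_4} \Omega \wedge \mathrm{Tr}\,(\phi \wedge \phi).
\end{aligned} \tag{2.24}$$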

The descendant $O^{(k)}$ with ghost number $k$ is an integral over an $(8 - k)$ cycle $\gamma_{8-k}$.

2.1.3. Geometric interpretation. The virtual dimension of the moduli space $M_J$ of solutions to $P_+ F_A = 0$ is $-\mathrm{index}\ \partial\!\!\!/ \otimes I_G$, i.e., the index of $\partial\!\!\!/ \otimes I_G : S^- \otimes G \to S^+ \otimes G$. Its value is

$$-\int_{M_8} \hat A(M_8)\,\mathrm{ch}(G), \tag{2.25}$$

computable in terms of the relevant characteristic classes. We will discuss the vanishing theorem needed to make the virtual dimension equal to the actual dimension elsewhere.

We can interpret Sect. 2.1.1 geometrically analogous to Sect. 5 in [5]. The BRST equations in this section are the analogues of (7) in [5], and are the structure equations for the universal connection on $A/G \times M_8$ with structure group $G$. The curvature 2-form $\mathcal{F}$ for this universal connection equals $F^0_2 + F^1_1 + F^2_0$, where $F^i_{2-i}$ is an $i$-form in the $A/G$ direction (ghost number) and a $(2 - i)$-form in the $M_8$ direction. Note that $F^0_2$ at $(A, x)$ is $F_A(x)$ and $F^1_1$ assigns to $\tau \in T(A/G, A)$ and $v \in T(M_8, x)$ the value $\tau(v) \in G$, since $\tau$ is a 1-form on $M_8$. Further, $F^2_0$ on $\tau_1, \tau_2 \in T(M_J, A)$ is $G(b^*_{\tau_1}(\tau_2))$ where $G = (D_A^* D_A)^{-1}$ on $\Lambda^0 \otimes G$ and $b_{\tau_1}(f) = [\tau_1, f]$ for $f \in \Lambda^0 \otimes G$; $b^*_{\tau_1}$ is the adjoint of $b_{\tau_1}$. We restrict $\mathcal{F}$ to $M_J \times M_8$ and consider $c_2 = \frac{1}{8\pi^2}\mathrm{Tr}\,(\mathcal{F} \wedge \mathcal{F})$, a 4-form on $M_J \times M_8$. Its expansion contains $\frac{1}{8\pi^2}\mathrm{Tr}\,(F^1_1 \wedge F^1_1)$, which has ghost number 2. This 4-form assigns to $\tau_1, \tau_2 \in T(M_J, A)$ and $v_1, v_2 \in T(M_8, x)$ the value $\frac{1}{8\pi^2}\big(\mathrm{Tr}\,(\tau_1(v_1)\tau_2(v_2)) - \mathrm{Tr}\,(\tau_1(v_2)\tau_2(v_1))\big)$. Let $\tau_1 \tilde\wedge \tau_2$ denote this 2-form on $M_8$.

Let $c^k_{4-k}$ be the component of $c_2$ which is of degree $k$ in the $M_J$ direction and of degree $4 - k$ in the $M_8$ direction. Then $\int_{\gamma_k} \Omega \wedge c^k_{4-k}$ gives a $k$-form on $M_J$, when $\gamma_k$ is a $(8 - k)$-cycle on $M_8$, $k = 0, 1, 2, 3$ or 4. These are the observables $O^{(k)}$ in Eq. (2.24). Taking products of the forms $O$ and integrating them over $M_J$ gives the expectation values of the products of observables. We are not addressing the central problem of integrating a form over the non compact space $M_J$.

We can specialize to 6-cycles, or equivalently to 2-forms to get a closer analogy to Donaldson invariants: if $\sigma \in H^2(M_8)$, let $\Sigma_\sigma$ be the 2-form on $M_J$ given by $\Sigma_\sigma(\tau_1, \tau_2) = \int_{M_8} \Omega \wedge \tau_1 \tilde\wedge \tau_2 \wedge \sigma$. We get an $r$-symmetric multi-linear function on $H^2(M_8)$ given by $(\sigma_1, \ldots, \sigma_r) \to \int_{M_J} \Sigma_{\sigma_1} \wedge \ldots \wedge \Sigma_{\sigma_r}$, if $\dim M_J = 2r$. Of course the issue here is to make these invariants well-defined and to see how they depend on the space of Joyce manifolds modulo diffeomorphisms for a fixed $M_8$.



2.2. Type H: Calabi–Yau Complex 4-manifold. 2.2.1. Geometrical setup. Suppose now that the holonomy group for (M8 , g) with metric g is SU (4). So M8 is a complex manifold and we can assume that g is a Calabi–Yau metric with a Kähler 2-form ω. We choose a holomorphic covariantly constant (4,0)form which trivializes the canonical bundle K. We √ normalize so that ∧ is the volume element of M8 . We also choose the trivial K for the spin structure on M8 . even ± We know that complex spinors can be identified with forms: SM ⊗ C ' 30, odd and the Dirac operator with ∂¯ + ∂¯ ∗ . Real Majorana spinors SM ⊂ SM ⊗ C are the fixed points of a conjugation b on SM ⊗ C. We can identify b with a conjugate linear ∗ operator as follows. For any Calabi–Yau M2n , define ∗ : 30,p → 30,n−p by hα, βi = R ∧ α ∧ ∗β, where now ∈ 3n,0 . (If one denotes by ∗1 the usual map on M2n complex manifolds: 3p,q → 3n−q,n−p , then ∗1 − = ∧ ∗ on 30,q .) When n = 4, one can show that conjugation b equals (−1)q ∗ on 30,q . Consequently, the operator 0,2 − + ∂¯ ∗ + P+ ∂¯ : 30,1 → 30,0 + 30,2 + is the Dirac operator from SM → SM . Here 3± is the 0,2 0,2 ± eigenspace of ∗, P± is the projection of 3 on 3± ; we have identified 30,1 with ∂¯

P ∂¯

+ 0,0 0,2 + 30,3 ) and 30,0 with 1+∗ + 30,4 ). The sequence 30 −→30,1 −→3 + is 2 (3 elliptic and is the linearization of the equation P+ FA = 0, modulo gauge transformations. Suppose now (E, ρ) is a complex Hermitian vector bundle over M8 with metric ρ of dimC = N . If A is a connection for E, we have its covariant differential DA : C ∞ (E) → C ∞ (E ⊗31 ) so that DA = ∂A + ∂¯A with ∂¯A : C ∞ (E) → C ∞ (E ⊗30,1 ). By introducing local complex coordinates z µ , ∂¯A (ßI ) = (∂µ¯ + (AIJ )µ¯ ßJ )dz¯ µ , I, J = 1, . . . , N . So (AIJ )µ¯ dz¯ µ is a (0,1)-form on M8 with N × N matrix coefficients. The 1-form connection A with values in GL(N, C) does not split naturally into 30,1 + 31,0 unless E is holomorphic. A splitting can be obtained by a choice of almost complex structure on the principal bundle. See Bartolomeis and Tian [24]. In any case, 2 . the curvature FA can be decomposed as FA = FA2,0 + FA1,1 + FA0,2 with FA0,2 = ∂¯A 0,1 ∞ ∞ For each ∂¯ operator: C (E) → C (E ⊗ 3 ), there exists a unique connection A ¯ Hence, the such that (i) A preserves the hermitian metric ρ of E and (ii) (DA )0,1 = ∂. ¯ space AP of ∂ operators can be identified with the connections of the principal bundle P associated with E, which preserve the Hermitian metric. The group of complex gauge ¯ is also a ∂¯ transformations H acts on the space AP , because if h ∈ H, then h−1 ∂h operator.

0,1 1−∗ 2 (3

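Let $G$ be $gl(N, \mathbb{C})$. Then the sequence $\Lambda^0 \otimes G \xrightarrow{\ \bar\partial_A\ } \Lambda^{0,1} \otimes G \xrightarrow{\ P_+\bar\partial_A\ } \Lambda^{0,2}_+ \otimes G$ is still elliptic on the symbol level. We say $\bar\partial_A$ is holomorphic anti-self-dual if $P_+ F_A^{0,2} = 0$, in which case the sequence is elliptic. Its index is the index of $\partial\!\!\!/ \otimes I_G : S_M^- \otimes G \to S_M^+ \otimes G$.

Again, the BRSTQFT will be obtained by gauge fixing $S_0 = \int_{M_8} \Omega \wedge \mathrm{Tr}\,(F_A^{0,2} \wedge F_A^{0,2})$. $S_0$ is independent of $A$, because $S_0 = 8\pi^2\, \Omega \cup p_1(E)$, since $\Omega \in \Lambda^{4,0}$. When $S_0 \ne 0$, we can normalize $\Omega$ further by $e^{i\theta}$, so that $S_0$ is real and positive. To verify Eq. (2.4) in the H case, we reduce $G$ to $u(N)$, using the metric $\rho$. If $\omega \in \Lambda^{0,2}$ has components $\omega_\pm$ in $\Lambda^{0,2}_\pm$, then $\|\omega\|^2 = \|\omega_+\|^2 + \|\omega_-\|^2$. And

$$\begin{aligned}
0 \le S_0 &= \int_{M_8} \mathrm{Tr}\;\Omega \wedge (F_{A+}^{0,2} + F_{A-}^{0,2}) \wedge (F_{A+}^{0,2} + F_{A-}^{0,2}) \\
&= -\|F_{A+}^{0,2}\|^2 + \|F_{A-}^{0,2}\|^2 + i\,\mathrm{Im}\,\langle F_{A+}^{0,2}, F_{A-}^{0,2}\rangle \\
&= -\|F_{A+}^{0,2}\|^2 + \|F_{A-}^{0,2}\|^2.
\end{aligned} \tag{2.26}$$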

Hence

$$\|F_A^{0,2}\|^2 = 2\|F_{A+}^{0,2}\|^2 + S_0. \tag{2.27}$$
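The step from (2.26) to (2.27) is a two-line computation. It is spelled out below as a sketch, using only the orthogonal decomposition of $\Lambda^{0,2}$ into $\Lambda^{0,2}_\pm$ stated just before (2.26).

```latex
% Sketch: how (2.27) follows from (2.26).
\begin{align*}
\|F_A^{0,2}\|^2 &= \|F_{A+}^{0,2}\|^2 + \|F_{A-}^{0,2}\|^2
   && \text{(orthogonality of } \Lambda^{0,2}_{\pm}\text{)},\\
S_0 &= -\|F_{A+}^{0,2}\|^2 + \|F_{A-}^{0,2}\|^2
   && \text{(Eq. (2.26))},\\
\|F_A^{0,2}\|^2 - S_0 &= 2\,\|F_{A+}^{0,2}\|^2 \;\ge\; 0 .
\end{align*}
```

With $S_0$ fixed, the left-hand side is therefore smallest exactly when $F_{A+}^{0,2} = P_+ F_A^{0,2} = 0$.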

So the holomorphic anti-self-dual gauge condition minimizes the action $\|F_A^{0,2}\|^2$ in the topological sector with $S_0$ fixed. The (4, 0) form $\Omega$ can be simply expressed in local coordinates as

$$\Omega = dz^1 \wedge dz^2 \wedge dz^3 \wedge dz^4. \tag{2.28}$$

One has

$$F^{(0,2)} = d\bar z^{\bar\mu}\, d\bar z^{\bar\nu}\, F_{\bar\mu\bar\nu}, \tag{2.29}$$

where

$$F_{\bar\mu\bar\nu} = \partial_{\bar\mu} A_{\bar\nu} - \partial_{\bar\nu} A_{\bar\mu} + [A_{\bar\mu}, A_{\bar\nu}], \tag{2.30}$$

and also

$$D_{\bar\mu} = \partial_{\bar\mu} + [A_{\bar\mu},\ \cdot\ ]. \tag{2.31}$$

One has the part of the Bianchi identity

$$D_{[\bar\mu} F_{\bar\nu\bar\rho]} = 0. \tag{2.32}$$

The 3 complex gauge covariant gauge conditions, which count for 6 real conditions on the 8 independent real components contained in $A_{\bar\mu}$, are

$$F_{\bar\mu_1\bar\mu_2} + \epsilon_{\bar\mu_1\bar\mu_2\bar\mu_3\bar\mu_4}\, {}^{c}F_{\bar\mu_3\bar\mu_4} = 0. \tag{2.33}$$

The two other gauge conditions are given by the following complex equation

$$\partial_{\bar\mu}\, {}^{c}A_{\bar\mu} + \frac{1}{2}[A_{\bar\mu}, {}^{c}A_{\bar\mu}] = 0. \tag{2.34}$$

If we compute the real and imaginary parts of this condition, they give respectively the Landau gauge condition and the first of the seven conditions in (2.11). A similar decomposition of (2.33) gives the six other equations in (2.11). We have now the topological ghost $\Psi_{\bar\mu}$ with 4 independent complex components, and we have the ghost gauge condition

$$D_{\bar\mu}\, {}^{c}\Psi_{\bar\mu} = 0. \tag{2.35}$$

(Here and below, we use the left upper symbol $c$ for complex conjugation.) A consequence of the use of complex gauge transformations is that a complex Faddeev–Popov ghost $c$ must be introduced, with complex ghost of ghost $\phi$. Up to the complexification of all fields, we have thus exactly the same field content as the original 4 dimensional Yang–Mills TQFT. This leads us to the BRST algebra that we will shortly display.

2.2.2. Action and observables. From the previous arguments, we must write the BRST algebra in a notation where all fields are complex fields and replace the formula of the J case by

$$\begin{aligned}
sA_{\bar\mu} &= \psi_{\bar\mu} + D_{\bar\mu} c, & s\psi_{\bar\mu} &= -D_{\bar\mu}\phi - [c, \psi_{\bar\mu}], \\
sc &= \phi - \frac{1}{2}[c, c], & s\phi &= -[c, \phi].
\end{aligned} \tag{2.36}$$

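In the antighost sector, we have a complex self dual two-form with 3 independent complex components $\chi_{\bar\mu\bar\nu}$ and

$$s\chi_{\bar\mu\bar\nu} = H_{\bar\mu\bar\nu} - [c, \chi_{\bar\mu\bar\nu}], \qquad sH_{\bar\mu\bar\nu} = [\phi, \chi_{\bar\mu\bar\nu}] - [c, H_{\bar\mu\bar\nu}]. \tag{2.37}$$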

We have also the complexified analogues of the antighosts of the four dimensional Yang–Mills TQFT, with the same transformation laws as in (2.21). Because of complexification, there are in the H case more ghosts than in the J case. Thus, part of the gauge fixing will consist in setting equal to zero the imaginary parts of the scalar ghosts $c, \phi, \bar\phi, \eta$. To impose these conditions, and the 3+1 complex gauge conditions Eqs. (2.33) and (2.34), we define

$$\begin{aligned}
Z = \int &[DA_{\bar\mu}][D\,{}^{c}A_{\bar\mu}][D\Psi_{\bar\mu}][D\,{}^{c}\Psi_{\bar\mu}][D\kappa_{\bar\mu\bar\nu}][D\,{}^{c}\kappa_{\bar\mu\bar\nu}][DH_{\bar\mu\bar\nu}][D\,{}^{c}H_{\bar\mu\bar\nu}] \\
&[D\eta][D\,{}^{c}\eta][D\phi][D\,{}^{c}\phi][D\bar\phi][D\,{}^{c}\bar\phi][Dc][D\,{}^{c}c][D\bar c][D\,{}^{c}\bar c][DB][D\,{}^{c}B]\ \exp S_H,
\end{aligned} \tag{2.38}$$

Z SH = [ ∧ Tr F (0,2) ∧ F (0,2) ] Z 1c 1 4 4 d zd z¯ s Tr κµ¯ ν¯ (c Fµ¯ ν¯ + µ¯ ν¯ ρ¯ σ¯ Fρ¯ σ¯ + Hµ¯ ν¯ ) + c κµ¯ ν¯ (Fµ¯ ν¯ + µ¯ ν¯ ρ¯ σ¯ c Fρ¯ σ¯ + Hµ¯ ν¯ ) 2 2 ¯ µc¯ 9µ¯ +c φ¯ c Dµ¯ 9µ¯ + Im φ¯ Im c + φD 1 1c 1 c 1 c c c c + c¯(∂µ¯ Aµ¯ + [Aµ¯ , Aµ¯ ] + B) + c¯( ∂ µ¯ Aµ¯ + [ Aµ¯ , Aµ¯ ] + B) .(2.39) 2 2 2 2 If we develop the s-exact terms and eliminate the auxiliary fields H and B we get a supersymmetric action starting with Tr (F ∧ ∗F ), because 41 k FA k2 = k FA0,2 k2 + 1 2 2 4 k hF, ωi k + topological terms, and a Feynman–Landau gauge fixing term |∂ ·A| . The action of the H case is similar to that of the J case after eliminationof the imaginary parts ¯ η by mean of the equations of motion coming from s(Im φ¯ Im c). Moreover, of c, φ, φ, if one separate fields in their real and imaginary parts, one finds a mapping between the ghosts of the H and J case (for instance the six antighosts contained in the complex self dual two-form κµ¯ ν¯ and the imaginary part of the antighosts c¯ of the H cases can be identified as the seven ghosts κi of the J case). Actually, up to this mapping, the actions of the H and J cases are almost identical. The definition of observables follows from the cocycles obtained by the descent equations, as sketched in the previous section. Their meaning is now discussed. f denote [A ∈ AP ] with F+0,2 = 0. It is invariant 2.2.3. Geometric interpretation. Let M 2 f The 3 under H (which acts on G in 3+ ⊗ G, but not on 32+ .) Let MH = M/H. complex covariant gauge conditions, 30,2 = 0, probe the moduli space M H . We + ∂¯

P ∂¯

+ A A 0,1 remarked earlier that 0 → 30 ⊗ G −→3 ⊗ G −→ 30,2 + ⊗ G is an elliptic complex with 0,1 0 0,2 ∗ ¯ ¯ ∂A +P+ ∂A : 3 ⊗G → 3 ⊗G +3+ ⊗G; the elliptic operator 6 ∂A : S − ⊗G → S + ⊗G. The complex gauge condition is ∂¯ ∗ τ = 0 for τ ∈ 30,1 ⊗ G. e over MH × M8 with connection. As before, we get a hermitian vector bundle E e One can compute c2 of E in terms of its curvature F H . One has the map T of H 0,∗ (M8 ) T R into forms on MH by µ−→ M8 ∧ Tr (F H ∧ F H ) ∧ µ.


Formally this gives a multilinear map of $H^{0,*}(M_8) \to \mathbb{C}$ by $\mu_1, \ldots, \mu_r \to \int_{M_H} T(\mu_1) \wedge \ldots \wedge T(\mu_r)$. These would be the expectation values of the observables of the BRSTQFT. As in Sect. 2.1.3, part of $c_2$ is $\frac{1}{8\pi^2}\mathrm{Tr}\,((F^H)^1_1 \wedge (F^H)^1_1)$ with $(F^H)^1_1(\tau, v) = \tau(v) \in u(N)$. If $\sigma \in H^{0,2}(M_8)$ let $T_\sigma$ be the 2-form on $M_H$ given by $\int_{M_8} \Omega \wedge \mathrm{Tr}\,(\tau_1 \wedge \tau_2) \wedge \sigma$, where $\tau_i$, $i = 1, 2$, are (0,1) forms on $M_8$ with values in $u(N)$. The formal holomorphic Donaldson polynomial is the symmetric $r$-multilinear function on $H^{0,2}(M_8)$ given by $\sigma_1, \ldots, \sigma_r \to \int_{M_H} T_{\sigma_1} \wedge \ldots \wedge T_{\sigma_r}$, when $\dim M_H = 2r$. (Note that if $H^{0,2}(M_8) \ne 0$, then $M_8$ is hyperKähler because elements of $H^{0,*}$ are covariantly constant.)

It will be very interesting to see when formal integration over $M_H$ is justified, and when these invariants depend only on the complex structure of $M_8$, not on the Calabi–Yau metric $g$, nor the hermitian metric $\rho$. C. Lewis [12] is investigating the conditions under which $M_H$ is the set of stable holomorphic vector bundles. Since the elliptic operator here is $\partial\!\!\!/$ again, the virtual dimension of $M_H$ is

$$-\int_{M_8} \hat A(M_8)\,\mathrm{ch}\,(G). \tag{2.40}$$
2.3. Comparison of H and J cases. Under suitable conditions ((E, ρ) a stable6 vector bundle, for example), one expects that the orbit space of AP under the group of complex gauge transformations, will be the same as the sympletic quotient, AP k GU , where GU are the gauge transformations on P reduced to the compact group U (N ). Since [A; hFA1,1 , ωim = 0, m ∈ M8 ] is the zeros of the moment map, AP /GU is the orbit space of this set under GU . We replace the condition P+ (FA0,2 ) = 0 with FA0,2 ∈ 30,2 + ⊗gl(N, C) by the conditions 0,2 1,1 0,2 0,2 P+ (FA ) = 0 and hFA , ωi = 0, where now FA ∈ 3 ⊗ u(N ) and hFA1,1 , ωi ∈ u(N ). One should get the same moduli space of solutions. ∂¯

P ∂¯

+ A A 0,1 ⊗gl(N, C) −→ 30,2 In the linearization, the sequence gl(N, C)−→3 + ⊗gl(N, C) →

P ∂¯ ⊕i ∂

ω A 0 is replaced by u(N ) → 30,1 ⊗ u(N ) + −→ 30,2 + ⊗ u(N ) ⊕ u(N ) → 0, where 0,1 iω ∂ : τ ∈ 3 ⊗ u(N ) → h∂τ, ωim ∈ u(N ). The operator iω ∂ is the linearization of the 0-momentum condition hFA1,1 , ωim = 0; it is also the imaginary part of ∗ ∂¯A : 30,1 ⊗ gl(N, C) → gl(N, C). Thus, with the reduction of Spin(7) holonomy to Spin(6) = SU (4) holonomy, the 7 dimensional 32+ in the J-case decomposes into the 6 dimensional 32+ of the H-case plus R.

2.4. Link to twisted supersymmetry. We note that the field content of our Yang–Mills BRSTQFT action in 8 dimensions is similar to that of four dimensional topological Yang–Mills theory. Since four dimensional topological Yang–Mills theory is a twisted version of $D = 4$, $N = 2$ super Yang–Mills theory and is related by dimensional reduction to the minimal six dimensional supersymmetric Yang–Mills theory, it is natural to expect a similar connection in eight dimensions. This is indeed so; we explain the type J case, although the fields $(c, \bar c, B)$ which were employed to impose the Lorentz condition $\partial^\mu A_\mu = 0$, are neglected. The gauge supermultiplet in eight dimensions consists of one gauge field in $8_v$ (the vector representation), one chiral spinor in $8_s$, one anti-chiral spinor in $8_c$ and two scalars [25]. The reduction of the holonomy group to Spin(7) defines a decomposition of the chiral spinor; $8_s = 1 \oplus 7$. Now it is natural to identify $A_\mu$ and $\psi_\mu$ in our topological theory as $8_v$ and $8_c$, respectively. Furthermore $\chi_i$ and $\eta$ just correspond to the chiral spinor $8_s$ according to the above decomposition. Finally $\phi$ and $\bar\phi$ give the remaining two scalars. This exhausts all the dynamical fields in our action of eight dimensional topological Yang–Mills theory. Though we do not work out the transformation law explicitly, we believe this is a sufficiently convincing argument for the fact that the J case is the $D = 8$ SSYM dimensionally reduced from $D = 10$, $N = 1$ SSYM.

The connection between a general supersymmetry transformation and topological BRST transformations is the following: when $M_8$ is flat, the reduction from $D = 10$ is $N = 2$ real supersymmetry or $N = 1$ complex supersymmetry. For curved manifolds, the only surviving supersymmetries are those depending on covariant constant spinors. In the J case the nilpotent topological BRST symmetry generator is a combination of the real and imaginary parts of the one surviving complex generator of supersymmetry.

As said just above, this supersymmetric Yang–Mills theory in eight dimensions is obtained by dimensional reduction from the $D = 10$, $N = 1$ super Yang–Mills theory. This suggests a relationship with superstring theory. It has been argued that the effective world volume theory of the D-brane is the dimensional reduction of the ten dimensional super Yang–Mills theory [26]. Thus the BRSTQFT constructed in this section may arise as an effective action of 7-brane theory. In fact Joyce manifolds are discussed in connection with supersymmetric cycles in [27, 28]. Recently in [29], a six dimensional topological field theory of ADHM sigma model is obtained as a world volume theory of D-5 branes. The world volume theory of D-branes could provide a variety of higher dimensional BRSTQFT’s.

6 For physicists, one might define $(E, \rho)$ to be stable if it is holomorphic, Einstein-Hermitian, i.e., $F_\rho \cdot \omega$ is a constant multiple of the identity, where $F_\rho$ is the curvature of $(E, \rho)$ relative to its unique $\rho$-connection.

3. Coupling of the 8D Theory to a 3-Form

For the pure Yang–Mills theory, we have seen that the construction of a BRSTQFT implies a consistent breaking of the SO(D) invariance. This turns out to be quite natural, when closed but not exact forms exist, like the Kähler 2-form on Kähler manifolds or the holomorphic (n, 0)-form on Calabi–Yau manifolds. This idea extends to consider BRSTQFT’s involving sets of possibly interacting $p$-form gauge fields with $(p + 1)$-form curvatures $G_{p+1} = dB_p + \ldots$, satisfying relevant Bianchi identities. Our point of view is that one must define a system of equations, eventually interpreted in BRSTQFT as gauge conditions, which does not overconstrain the fields. If tensors $T^{\mu_1,\ldots,\mu_{2p+2}}$ of rank $2p+2$ ($2p+2 \le D$) exist which are invariant under maximal subgroups of SO(D), we can consider BRSTQFT based on gauge functions of the following type, where $\lambda$ is a parameter:

$$T^{\mu_1,\ldots,\mu_{2p+2}}\, G_{\mu_{p+2},\ldots,\mu_{2p+2}} = \lambda\, G^{\mu_1,\ldots,\mu_{p+1}}. \tag{3.1}$$

Such equations must be understood in a matricial form, since they generally involve several forms Bp , with different values of p. To ensure that the problem is well defined, a first requirement is that Eq. (3.1) has solutions in Gp+1 for λ different from zero. This algebraic question is in principle straightforward to solve by group theory arguments, although we expect that geometrical arguments should also justify them. Moreover, we must also consider that Gp+1 is the curvature of a p-form gauge field Bp . Thus, other gauge functions must be introduced, to gauge fix the ordinary gauge freedom of Bp which leave invariant its curvature Gp+1 . This gives a second requirement, since from the point of view of the quantization, the total number of gauge conditions, the topological ones


and the ordinary ones, must be exactly equal to the number of independent components in the gauge field $B_p$. To be more precise, the number of ordinary gauge freedom of a $p$-form gauge field in $D$ dimensions is $C^{p-1}_{D-1}$: (this amounts to the fact that $B_p$ is truly defined up to a $(p - 1)$-form, which is itself defined up to a $(p - 2)$-form, and so on.) We should therefore only retain invariant tensors $T$ such that the number of components of $B_p$ equates the rank of the system of linear equations in $G$ presented in Eq. (3.1) plus the number of ordinary gauge freedom in $B_p$. Obviously, when there are several fields in Eq. (3.1), the counting of independent conditions can become quite subtle, since one must generally combine several equations like Eq. (3.1). For instance, we will display in the next section BRSTQFT theories in dimensions $D < 8$. Their derivation will appear as rather simple, because they all descend by dimensional reduction from the pure Yang–Mills BRSTQFT based on the set of 6 or 7 independent self-duality gauge covariant equations in eight dimensions found in Sect. 2. Without this insight, their derivation would be less obvious.

We now turn to the introduction of a 3-form gauge field in 8 dimensions. In even $D = 2k$ dimensions, Eq. (3.1) has a generic solution for an uncharged $(k-1)$-form gauge field $B_{k-1}$: assuming the existence of a curvature $G_k$ for $B_{k-1}$, we can consider the obvious generalization of self-duality equations, $G_k = *G_k$. The number of these conditions is $C^{k}_{D-1}$. On the other hand, the number of ordinary gauge freedom of a $(k - 1)$-form gauge field is $C^{k-2}_{D-1} = C^{k-2}_{D} - C^{k-3}_{D} + C^{k-4}_{D} - \ldots \pm C^{0}_{D}$. Thus imposing the ordinary gauge fixing conditions for the $(k-1)$-form gauge field plus the gauge covariant ones, $G_k = *G_k$, gives a number of $C^{k-1}_{D} = C^{k}_{D-1} + C^{k-2}_{D-1}$ equations, which is equal to the number of arbitrary local deformations of the $C^{k-1}_{D}$ independent components of the $(k - 1)$-form gauge field.

We will see that it is possible to generalize the self duality equation satisfied by a $(k - 1)$-form gauge field. Moreover, the counting remains correct in the case it has a charge. As an example, in the 8-dimensional theory, a 3-form gauge field has 56 components, with 21 ordinary gauge freedom, while the number of self dual equations involving the 4-form curvature of the 3-form is 35, and one has 56 = 21 + 35. We thus propose as topological gauge conditions for the coupled system made of the Yang–Mills field $A$ and the 3-form gauge field $B_3$ the following coupled equations:

$$\lambda F_{\mu\nu} = T_{\mu\nu\rho\sigma} F_{\rho\sigma}, \qquad dB_3 + *(dB_3) + \alpha\,\mathrm{Tr}\,(F \wedge F)^+ = 0. \tag{3.2}$$
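The component counting that motivates (3.2) is pure binomial arithmetic and can be checked mechanically. The sketch below is an illustration only; the function names `components`, `gauge_freedom` and `self_dual_conditions` are introduced here and do not appear in the text.

```python
from math import comb


def components(D, p):
    """Number of independent components of a p-form in D dimensions."""
    return comb(D, p)


def gauge_freedom(D, p):
    """Ordinary gauge freedom of a p-form gauge field, counted as the
    alternating sum C(D, p-1) - C(D, p-2) + ... +/- C(D, 0)."""
    return sum((-1) ** j * comb(D, p - 1 - j) for j in range(p))


def self_dual_conditions(D):
    """Number of conditions in G_k = *G_k for D = 2k: half of C(D, k)."""
    k = D // 2
    return comb(D, k) // 2
```

For D = 8 and a 3-form, `components(8, 3)`, `gauge_freedom(8, 3)` and `self_dual_conditions(8)` reproduce the numbers 56, 21 and 35 quoted above, and the alternating sum agrees with the closed form `comb(D - 1, p - 1)`.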

$\alpha$ is a real number, possibly quantized, and $\mathrm{Tr}(F\wedge F)_+$ denotes the self dual part of $\mathrm{Tr}(F\wedge F)$ $^{7}$. Although $B_3$ is real valued, it interacts with the Yang–Mills connection $A$ when $\alpha \neq 0$. An octonionic instanton solves the first equation, as shown in [14] and by Eqs. (25), (30), (31) of [15] in the case of $M_8 = S^7 \times R$. For this solution, the 4-form $\mathrm{Tr}(F\wedge F)$ is not self dual. Given these facts, we are led to define a BRSTQFT in 8 dimensions based on the gauge conditions (3.2), in which a 3-form gauge field is coupled to a Yang–Mills field. The ghost spectrum for the ordinary gauge invariance of the field $B_3$ generalizes that of the Yang–Mills field, with the following unification between the ghost $B_2^1$ and the ghosts of ghosts $B_1^2$ and $B_0^3$:
\[ \hat B_3 = B_3 + B_2^1 + B_1^2 + B_0^3 . \tag{3.3} \]

(From now on upper indices mean ghost number and lower indices ordinary form degree.)

7 Equation (3.2) suggests that the 3-form could be involved in an anomaly compensating mechanism. See Sect. 3.1, where we show that Eqs. (3.1) imply $dB_3 = 0$ if $M$ is compact.



The BRST symmetry of the topological Yang–Mills symmetry considered in the previous section satisfies
\[ \hat A = A + c, \tag{3.4} \]
\[ \hat F = (s+d)\hat A + \tfrac{1}{2}[\hat A, \hat A] = F_2^0 + \Psi_1^1 + \phi_0^2 , \tag{3.5} \]

with the notation $\Psi_1^1 = \Psi_\mu dx^\mu$ and $\phi_0^2 = \phi$. The gauge symmetry of the 3-form $B_3$ involves a 2-form infinitesimal parameter associated to $B_2^1$. We can however distinguish different topological sectors for $B_3$, which cannot be connected only by these infinitesimal gauge transformations. As an example, $B_3$ and $B_3'$ can belong to such different sectors if
\[ B_3' = B_3 + \mathrm{Tr}\left(A\wedge dA + \tfrac{2}{3}\,A\wedge A\wedge A\right). \tag{3.6} \]

We thus define the curvature of $B_3$ as
\[ G_4^{(A)} = dB_3 + \mathrm{Tr}\left(F^{(A)}\wedge F^{(A)}\right), \tag{3.7} \]

where the index $(A)$ means the dependence upon the Yang–Mills field $A$. Notice that it is not globally possible to eliminate the $A$ dependence of $G_4^{(A)}$ by a field redefinition of $B_3$ involving the Chern–Simons 3-form. The topological BRST symmetry of the 3-form gauge field system is defined from
\[ \hat G_4 = (s+d)\hat B_3 + \mathrm{Tr}\left(\hat F^{(A)}\wedge \hat F^{(A)}\right) = G_4 + G_3^1 + G_2^2 + G_1^3 + G_0^4 , \tag{3.8} \]

that is
\[ (s+d)\left(B_3 + B_2^1 + B_1^2 + B_0^3\right) + \mathrm{Tr}\left[(F_2^0 + \Psi_1^1 + \phi_0^2)\wedge(F_2^0 + \Psi_1^1 + \phi_0^2)\right] = dB_3 + \mathrm{Tr}(F\wedge F) + G_3^1 + G_2^2 + G_1^3 + G_0^4 . \tag{3.9} \]

The fields $G_{4-g}^{g}$, $g = 1, 2, 3, 4$, are the topological ghosts of $B_3$. By expansion in ghost number, Eqs. (3.5) and (3.9) define a BRST operation $s$ which, eventually, determines the equivariant cohomology of arbitrary deformations of the Yang–Mills field modulo ordinary gauge transformations and of the 3-form gauge field, modulo the infinitesimal gauge transformations, $\delta B_3 = d\epsilon_2$, $\epsilon_2 \sim \epsilon_2 + d\epsilon_1$, $\epsilon_1 \sim \epsilon_1 + d\epsilon$. There is a natural topological invariant candidate for the classical part of a BRSTQFT action,
\[ I_{top} = \int \left( G_4^{(A)}\wedge G_4^{(A)} + \Omega\wedge \mathrm{Tr}\left(F^{(A)}\wedge F^{(A)}\right) \right). \tag{3.10} \]

Its gauge fixing is a generalization of what we do in the pure Yang–Mills case. The main point is to find the gauge function in the topological sector. The existence of the octonionic instanton, together with an associated moduli space (yet to be explored), indicates that Eq. (3.2) is a good choice $^{8}$. To enforce the gauge function Eq. (3.2), one must introduce a self-dual 4-form antighost $\kappa^{\mu\nu\rho\sigma}$, and consider the following BRST exact action:
\[ S_3 = \int d^8x\; s\Big( \kappa^{\mu\nu\rho\sigma}\big(\partial_{[\mu}B_{\nu\rho\sigma]} + \epsilon_{\mu\nu\rho\sigma\alpha\beta\gamma\delta}\,\partial_{[\alpha}B_{\beta\gamma\delta]} + \mathrm{Tr}\,F_{[\mu\nu}F_{\rho\sigma]}\big)\Big). \tag{3.11} \]

8 Notice that one could also consider a 7-dimensional theory, which is formally related to the BRSTQFT in 8 dimensions as the 3-dimensional Chern–Simons theory is related to the 4-dimensional Yang–Mills TQFT action.

The remaining conditions are for the usual gauge invariances of forms, whether they are classical or ghost fields. One can choose the following gauge fixing conditions for the longitudinal parts of all ghosts and ghosts of ghosts $G_{4-g}^{g}$:
\[ \partial^\mu G^1_{3\,\mu\nu\rho} = a\,\Delta^1_{\nu\rho}, \qquad \partial^\mu G^2_{2\,\mu\nu} = b\,\Delta^2_{\nu}, \qquad \partial^\mu G^3_{1\,\mu} = c\,\Delta^3 . \tag{3.12} \]

One must also conventionally gauge fix the longitudinal components of $B_{3\,\mu\nu\rho}$, of the ghosts $B^1_{2\,\mu\nu}$ and $B^2_{1\,\mu}$, and of the antighosts. The presence in the r.h.s. of Eq. (3.12) of the cocycles $\Delta^g_{3-g}$ stemming from the ghost decomposition of $\mathrm{Tr}\,\hat F\wedge\hat F = \mathrm{Tr}\,(F + \Psi + \phi)\wedge(F + \Psi + \phi)$ is an interesting possibility. It can lead to mass effects in TQFT, when the ghost of ghost $\phi$ takes a given mean value, depending on the choice of the vacuum in the moduli space, which can be adjusted by suitable choices of the parameters $a$, $b$, $c$. All these gauge conditions can be enforced in a BRST invariant way, as explained e.g. in [30]. The final result is an action of the following type (including the pure Yang–Mills part discussed in the previous sections):
\[ S = \int \left( \partial_\mu B_{\nu\rho\sigma}\,\partial^\mu B^{\nu\rho\sigma} + \mathrm{Tr}\,F^{\mu\nu}F_{\mu\nu} + \partial_\mu B_{\nu\rho\sigma}\,\mathrm{Tr}\,F^{\mu\nu}F^{\rho\sigma} + \text{supersymmetric terms} \right). \tag{3.13} \]

The observables are defined from all forms $\hat O^{g}_{8-g}$ occurring in the ghost expansion of the 8-form
\[ \hat O_8 = \hat G_4 \wedge \hat G_4 . \tag{3.14} \]
Whether these supersymmetric terms, made of ghost interactions, are linked to Poincaré supersymmetry is an interesting question.

3.1. Mathematical Interpretation. Fix an element of $H^4(M_8, \mathbb{Z})$ and let $h_4$ denote its harmonic representative. Let $\mathcal{B}$ denote the affine space of all closed 4-forms which represent this cohomology class. Then $\mathcal{B} = h_4 + d\Lambda^3$; strictly speaking $\mathcal{B} = h_4 + d(\Lambda^3/\text{closed 3-forms}) = h_4 + d\delta\Lambda^4$. In any case a tangent vector to $\mathcal{B}$ can be represented as $dB_3$ with $B_3$ a 3-form. There are other ways of describing $\mathcal{B}$. An element of $\mathcal{B}$ can be represented as a collection of 3-forms $\{B_u\}$, for a collection of coordinate neighborhoods $U$ covering $M_8$, satisfying $B_u - B_v = dw_{u,v}$ on $u \cap v$. Thus $\{dB_u\}$ gives a well-defined closed form on $M_8$; to be an element of $\mathcal{B}$, this 4-form must be cohomologous to $h_4$. In the earlier part of this section, $dB_3$ means this element of $\mathcal{B}$ when $B_3$ is defined locally as $B_{3,u}$; or if $B_3$ is an ordinary three form, $dB_3$ is really $h_4 + dB_3$.$^{9}$

Next consider the elliptic complex $0 \to \Lambda^0 \to \Lambda^1 \to \Lambda^2 \to \cdots \to \Lambda^4_+ \to 0$, where $\Lambda^4_\pm$ are the $\pm 1$ eigenspaces of the ordinary $\ast$ operator on $M_8$. Remember that in the J-case we also had $0 \to \Lambda^0 \to \Lambda^1 \to \Lambda^2_+ \to 0$ with $\Lambda^2 = \Lambda^2_- \oplus \Lambda^2_+$ of dimensions 21 and 7, respectively. Consider then $0 \to \Lambda^2_- \xrightarrow{d} \Lambda^3 \xrightarrow{d} \Lambda^4_+ \to 0$. We leave the reader to check that it is elliptic. (It does not suffice that the dimensions are 21, 56 and 35, respectively.) The linearization of the problem below involves $0 \to \Lambda^0\otimes\mathcal{G} \xrightarrow{D_A} \Lambda^1\otimes\mathcal{G} \xrightarrow{D_A} \Lambda^2_+\otimes\mathcal{G} \to 0$ for connections, and $0 \to \Lambda^2_- \xrightarrow{d} \Lambda^3 \xrightarrow{d} \Lambda^4_+ \to 0$ for 3-forms. An analogue of the anti-self-dual equations for the pair $(A, G)$ with a connection $A$ and $G \in \mathcal{B}$ is
\[ \text{(a)}\quad F_A = \ast\,\Omega\wedge F_A \quad (\text{i.e. } P_+F_A = 0), \qquad \text{(b)}\quad (1+\ast)G = -\alpha\,\mathrm{Tr}(F_A\wedge F_A)_+ . \tag{3.15} \]

9 The theory of gerbes [31] gives a sheaf theoretic description for exhibiting integral cohomology classes, extending the notion of curvature field as an integral 2-cocycle.
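The splitting of 2-forms in eight dimensions into pieces of dimensions 21 and 7, used above, can be illustrated numerically in flat space. The sketch below is only an illustration under stated assumptions: the explicit coefficients of the Cayley 4-form are one standard choice from the literature and are not taken from this paper, and the eigenvalues obtained (3 and −1) depend on that choice and on sign conventions. The operator diagonalized is $F_{\mu\nu} \mapsto \tfrac12\,\Omega_{\mu\nu\rho\sigma}F_{\rho\sigma}$; all function names are hypothetical.

```python
import itertools
from collections import Counter

import numpy as np

# Assumed coordinate expression of a Cayley 4-form on flat R^8 (indices 1..8).
# This particular sign pattern is a common convention, not taken from the paper.
CAYLEY_TERMS = {
    (1, 2, 3, 4): 1, (1, 2, 5, 6): 1, (1, 2, 7, 8): 1, (1, 3, 5, 7): 1,
    (1, 3, 6, 8): -1, (1, 4, 5, 8): -1, (1, 4, 6, 7): -1, (2, 3, 5, 8): -1,
    (2, 3, 6, 7): -1, (2, 4, 5, 7): -1, (2, 4, 6, 8): 1, (3, 4, 5, 6): 1,
    (3, 4, 7, 8): 1, (5, 6, 7, 8): 1,
}


def permutation_sign(seq):
    """Sign of a sequence of distinct integers, computed by counting inversions."""
    sign = 1
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            if seq[i] > seq[j]:
                sign = -sign
    return sign


def antisymmetric_tensor(terms, dim):
    """Build the fully antisymmetric rank-4 array from its independent components."""
    tensor = np.zeros((dim,) * 4)
    for indices, coeff in terms.items():
        for perm in itertools.permutations(range(4)):
            permuted = tuple(indices[p] - 1 for p in perm)
            tensor[permuted] = coeff * permutation_sign(perm)
    return tensor


def two_form_operator(tensor, dim):
    """Matrix of F -> (1/2) T_{mu nu rho sigma} F_{rho sigma} on the basis e_c ^ e_d, c < d."""
    pairs = list(itertools.combinations(range(dim), 2))
    return np.array([[tensor[m, n, c, d] for (c, d) in pairs] for (m, n) in pairs])


def eigenvalue_multiplicities(matrix):
    """Integer-rounded eigenvalues of a symmetric matrix with their multiplicities."""
    return Counter(int(round(float(x))) for x in np.linalg.eigvalsh(matrix))


if __name__ == "__main__":
    omega = antisymmetric_tensor(CAYLEY_TERMS, 8)
    print(eigenvalue_multiplicities(two_form_operator(omega, 8)))
```

With this convention the 28-dimensional space of 2-forms splits into a 7-dimensional eigenspace (eigenvalue 3) and a 21-dimensional eigenspace (eigenvalue −1), matching the dimensions 7 and 21 quoted for $\Lambda^2_+$ and $\Lambda^2_-$.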

This equation is a mathematical interpretation of (3.2). Note that if a solution $A$, $G = h_4 + dB_3$ exists for (3.15), then $\mathrm{Tr}(F_A\wedge F_A)$ is self-dual and hence harmonic. Hence $(1+\ast)(h_4 + dB_3)$ is harmonic. Since $(1+\ast)h_4$ is harmonic, so is $(1+\ast)dB_3$. Hence $dB_3 = 0$, and $G = h_4$. Note also that the sector $\mathcal{B}$, i.e. the element chosen in $H^4(M_8, \mathbb{R})$, must have its self-dual part a multiple of the self-dual element $p_1(P)$. If we linearize (3.15), we get for $\tau \in T(\mathcal{G})$ and $B_3 \in T(\mathcal{B})$ the equations $P_+^{(2)}(D_A\tau) = 0$ and $P_+^{(4)}dB_3 = 0$, where $P_+^{(j)}$ is the projection $\Lambda^j \to \Lambda^j_+$ ($j = 2, 4$). We then have a pair of elliptic systems above, with gauge fixing functions $D_A^\ast\tau = 0$ and $d^\ast B_3 = 0$, respectively. The covariant gauge functions are given by (3.15). The candidate for the topological action $S_0(A, G)$ is $\int_{M_8} G\wedge G + \Omega\wedge\mathrm{Tr}(F\wedge F)$. Since we now have the covariant gauge functions to probe the moduli space of solutions to (3.15) and we have the gauge fixing functions, we can apply the BRST formalism. We first express $S_0$ in terms of the norms. From (2.7),
\[ \| F_A \|^2 = \int_{M_8} \Omega\wedge\mathrm{Tr}(F_A\wedge F_A) + 4\,\|(F_A)_+\|^2 . \tag{3.16} \]

Also with $G = G_+ + G_-$, $G_\pm \in \Lambda^4_\pm$, we have $\int_{M_8} G\wedge G = \|G_+\|^2 - \|G_-\|^2$. Thus one obtains $\|F_A\|^2 + \|G\|^2 = S_0 + 4\,\|(F_A)_+\|^2 + 2\,\|G_+\|^2$. We know that $\|F_A\|^2$ is minimized when $F_+ = 0$, and that $\|G\|^2$ is minimized when $G = h$. So we get a minimum when (3.15) is satisfied and it equals $S_0 + 16\pi^4\alpha^2\int_{M_8} p_1^2 = S_0^1$.

In the pure YM case, the natural space was $A_P/\mathcal{G} \times M_8$ or its subspace $M_J \times M_8$. Rather than 3-forms on $M_8$, we need 3-forms on $M_J \times M_8$ which we write as $\hat B_3 = B_3^0 + B_2^1 + B_1^2 + B_0^3$ (Eq. (3.3) above) with the upper index as the degree in the $M_J$ direction (ghost number) and the lower index in the $M_8$ direction. As before $s$ denotes $d_{M_J}$, so that $(s+d)\hat B_3 = (d_{M_J} + d_{M_8})\hat B_3 = d_{M_J}(\hat B_3) + d_{M_8}(\hat B_3)$ is a 4-form with terms in the $\binom{a}{b}$ directions.

4. BRSTQFT’s for Other Dimensions Than 8

From many points of view the case $D = 8$ is exceptional. It is of interest, however, to also build BRSTQFT’s in other dimensions, by using the BRST quantization of $d$-closed Lagrangians with gauge functions as in Eq. (3.1). In this section, we first focus on theories with $D < 8$ that we directly obtain by various dimensional reductions in flat space of the J and H theories; we then comment on the cases $D = 12$ and $D = 10$. We will not address the question of observables; their determination is clear from the descent equations which can be derived in all possible cases from the knowledge of the BRST symmetry.



4.1. Dimensional reduction of the Yang–Mills 8D BRSTQFT. In $D = 8$, for the J-case, we have seen that there exists a set of seven self-duality equations, on which we have based our BRSTQFT. These equations were complemented with a Landau gauge condition to get a system of 8 independent equations for the 8 components of $A_\mu$. These seven equations (Eq. (2.14)) can be written as
\[ \Phi_i\big(F_{\mu\nu}(x^\mu)\big) = 0, \qquad 1 \le i \le 7, \quad 1 \le \mu, \nu \le 8. \tag{4.1} \]

Just as one obtains a BRSTQFT action based on Bogomolny equations in 3 dimensions [32], we can define a BRSTQFT in seven dimensions, by standard dimensional reduction on the eighth coordinate; that is, by putting in the above seven equations $x_8 = 0$, $\partial_8 = 0$ and replacing $A_8$ by a scalar field $\varphi(x_j)$ and $F_{i8}$ by $D_i\varphi(x_j)$. We can then gauge fix the longitudinal part of $A_i$, with an equation of the following type:
\[ \partial_i A_i = [v, \varphi], \tag{4.2} \]

which allows for the case of a massive gauge field $A$. (Here and in what follows, the constant $v$ defines a direction in the Lie algebra for the Yang–Mills symmetry.) The gauge fixed action will be
\[ \int_{M_7} d^7x \left( |F_{ij}|^2 + |D_i\varphi|^2 + |\partial_i A_i - [v, \varphi]|^2 + \text{supersymmetric terms} \right). \tag{4.3} \]

This process can be iterated. We can go down from dimension 8 to $8 - n$, by suppressing the dependence on $n$ of the coordinates $x^\mu$. In $D < 8$ dimensions we will have a gauge field with $D = 8 - n$ components and a set of $n$ scalar fields $\varphi_p$, $p = 1, \ldots, n$, which should be considered as Higgs fields. Obviously, the dimensional reduction applies as well to the various ghosts, and the fields $\varphi_p$ fall into topological BRST multiplets, which, depending on the case, can possibly be interpreted as twisted Poincaré supermultiplets. Moreover, as we will see when $D = n = 4$, there is an interesting option to assign the fields $\varphi_p$ as elements of other representations, e.g. spinorial ones, of $SO(D)$.

One can also consider the dimensional reduction in the H-case. One can break the symmetry between the coordinates $y, z, t, w$ and their complex conjugates by replacing some of the fields, e.g. $\mathrm{Im}\,A_{\bar w}$, by scalar fields. In all cases, the final theories rely on 8 independent gauge conditions for all fields: 7 for the topological gauge ones plus 1 for the ordinary gauge condition, if one starts from the J case; or 6 for the 3 complex topological gauge conditions plus 2 for the ordinary complex gauge condition, if one starts from the H case.

4.1.1. The case D=6. Since the case $D = 6$ is of great interest in superstring theory, let us display what we get, starting from the H case:
\[ \epsilon_{\bar i\bar j\bar k}\,F_{\bar j\bar k} = D_{\bar i}\varphi, \tag{4.4} \]
\[ \partial_{\bar i}A_{\bar i} = [v, \varphi]. \tag{4.5} \]

This set of gauge functions represents 4 complex equations, for eight degrees of freedom represented by the complex fields $A_{\bar i}$ and $\varphi$. If we start from the J case, we have
\[ \Phi_i\big(F_{\mu\nu}(x^\mu), D_\mu\varphi^a(x^\mu)\big) = 0, \qquad 1 \le a \le 2, \quad 1 \le \mu, \nu \le 6, \tag{4.6} \]
possibly complemented by
\[ \partial_\mu A_\mu = M_{a,b}\,\varphi^a\varphi^b + N_{a,b}\,v^a\varphi^b . \tag{4.7} \]

Notice that a 2-form gauge field, subjected to the topological invariance $sB_2 = \Psi_2 + \ldots$, can be introduced, still in 6 dimensions, with the topological self-dual gauge condition
\[ dB_2 + \ast(dB_2) + \alpha\,\mathrm{Tr}\left(A\,dA + \tfrac{2}{3}\,AAA\right)_+ = 0. \tag{4.8} \]
This possibility is similar to the introduction of a 3-form in $D = 8$.

We can directly build a BRSTQFT in 6 dimensions. First we consider a pure Yang–Mills case, taking the topological gauge fixing condition of the type
\[ \lambda F_{\mu\nu} = \tfrac{1}{2}\,T_{\mu\nu\rho\sigma}F^{\rho\sigma}. \tag{4.9} \]
The fourth rank tensor $T_{\mu\nu\rho\sigma}$ is assumed to be invariant under some maximal subgroup of $SO(6)$. According to Corrigan et al. [6], only $SO(4)\times SO(2)$ and $U(3)$ allow such an invariant tensor. The first choice corresponds to the case where the 6D manifold is a direct product of a 4D manifold and a 2D Riemann surface, $M_6 = M_4 \times \Sigma_2$. The second subgroup is the holonomy group of 6 dimensional Kähler manifolds. In this case we can write down the invariant tensor as the Hodge dual of a Kähler form $\omega$,
\[ T_{\mu\nu\rho\sigma} = (\ast\,\omega)_{\mu\nu\rho\sigma}. \tag{4.10} \]

The possible eigenvalues $\lambda$ of (4.9) with the tensor (4.10) are 1, −1 and −2. The eigenspaces of these eigenvalues give the decomposition of the 15 dimensional representation of $SO(6)$ under its subgroup $SU(3)\times U(1)$; $15 = 8 \oplus (3 \oplus \bar 3) \oplus 1$.$^{10}$ Taking $\lambda = 1$ defines the 8 dimensional subspace given by the following seven linear conditions on $F_{\mu\nu}$, where we use complex indices $a, b = 1, 2, 3$:
\[ F_{ab} = F_{\bar a\bar b} = 0, \tag{4.11} \]
\[ \omega^{a\bar b}F_{a\bar b} = 0. \tag{4.12} \]
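The eigenvalue structure of (4.9) with $T = \ast\omega$ can be illustrated in flat $R^6$. The sketch below assumes the standard Kähler form $\omega = e^{12} + e^{34} + e^{56}$ and the orientation $e^{123456}$; both are choices made here for illustration, not taken from the paper. With these choices the operator $F_{\mu\nu} \mapsto \tfrac12 (\ast\omega)_{\mu\nu\rho\sigma}F_{\rho\sigma}$ has eigenvalues 2, 1, −1 with multiplicities 1, 6, 8, i.e. the same $1 \oplus 6 \oplus 8$ splitting as in the text with the opposite overall sign, which reflects a sign or orientation convention.

```python
import itertools
from collections import Counter

import numpy as np

# Assumed flat Kaehler form omega = e12 + e34 + e56 on R^6 (0-based index pairs).
KAHLER_PAIRS = [(0, 1), (2, 3), (4, 5)]


def levi_civita(indices):
    """Levi-Civita symbol: 0 if an index repeats, otherwise the permutation sign."""
    if len(set(indices)) != len(indices):
        return 0
    sign = 1
    for i in range(len(indices)):
        for j in range(i + 1, len(indices)):
            if indices[i] > indices[j]:
                sign = -sign
    return sign


def star_omega_operator():
    """Matrix of F -> (1/2) (*omega)_{mu nu rho sigma} F_{rho sigma} on 2-forms of R^6.

    Uses (*omega)_{mu nu rho sigma} = (1/2) eps_{mu nu rho sigma alpha beta} omega_{alpha beta},
    which for the omega above reduces to a sum of three Levi-Civita symbols.
    """
    pairs = list(itertools.combinations(range(6), 2))
    matrix = np.zeros((len(pairs), len(pairs)))
    for row, (m, n) in enumerate(pairs):
        for col, (c, d) in enumerate(pairs):
            matrix[row, col] = sum(
                levi_civita((m, n, c, d, a, b)) for (a, b) in KAHLER_PAIRS
            )
    return matrix


def eigenvalue_multiplicities(matrix):
    """Integer-rounded eigenvalues of a symmetric matrix with their multiplicities."""
    return Counter(int(round(float(x))) for x in np.linalg.eigvalsh(matrix))


if __name__ == "__main__":
    print(eigenvalue_multiplicities(star_omega_operator()))
```

The 8-dimensional eigenspace consists of the primitive (1,1)-forms, which is the subspace singled out by conditions (4.11) and (4.12); the singlet is spanned by $\omega$ itself.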

(The last Eq. (4.12) is, e.g., $F_{1\bar 1} + F_{2\bar 2} + F_{3\bar 3} = 0$.) The first condition (4.11) means that the connection is holomorphic. These equations are known as the Donaldson–Uhlenbeck–Yau (DUY) equation for the moduli space of stable holomorphic vector bundles on a Kähler manifold. It also appears in the Calabi–Yau compactification of the heterotic strings. The DUY equation implies the standard second order equation of motion for the Yang–Mills field$^{11}$. In fact, this follows from the following identity at the action density level:
\[ -\tfrac{1}{4}\,\mathrm{Tr}\,F\wedge\ast F + \omega\wedge\mathrm{Tr}(F\wedge F) = \mathrm{Tr}\left( -\tfrac{3}{2}\,g^{a\bar a}g^{b\bar b}F_{ab}F_{\bar a\bar b} + \big(g^{a\bar b}F_{a\bar b}\big)^2 \right), \tag{4.13} \]

10 The usual splitting of $\Lambda^2\otimes C$ into $\Lambda^{1,1}\oplus\Lambda^{2,0}\oplus\Lambda^{0,2}$ with $\Lambda^{1,1}$ decomposed into $\lambda\omega\oplus\omega^\perp$, where $\omega$ is the Kähler form.
11 This is a general property of the system (4.9).

where we have introduced the metric $g_{a\bar b}$ for the Kähler form $\omega$. This identity [24] is crucial in constructing a BRST Yang–Mills theory whose classical action is the topological density $\omega\wedge\mathrm{Tr}(F\wedge F)$. From the BRST point of view, one must introduce scalar fields to get a correct balance between the gauge fixing conditions and the field degrees of freedom and to


recover Eq. (4.4). Given a hermitian connection A for the hermitian vector bundle (E, ρ), Eq. (4.4) says FA0,2 = (∂¯A )∗ ϕ e = ∗−1 e i.e., ∗1 FA0,2 = ∂A (∗1 , ϕ) e ∈ 31,3 ⊗ G. (See 1 ∂A ∗1 ϕ, Sect. 2.2.1 for the definition of the operation ∗1 .) When M is a Calabi–Yau 3 fold, let ϕ = ∗ϕ e ∈ 30 ⊗ G and we get ∗FA0,2 = ∂¯A ϕ. Linearization gives the usual elliptic curl grad operator, the holomorphic of : div 0 0,2 0,1 ∗ 3 ⊗G 3 ⊗G ∂¯A ∂¯A −→ . (4.14) : ∗ ∂¯A 0 30,3 ⊗ G 30,0 ⊗ G Of course what one wants is not (4.4) but FA0,2 = 0, Eq. (4.11), the condition that makes E a holomorphic bundle. However, as a consequence of the Bianchi identity, ∗ ∗ ϕ e = 0, which also implies ∂¯A ϕ e = 0, when ∂¯A FA0,2 = 0 and hence (4.4) implies ∂¯A ∂¯A M is compact without boundary. Thus (4.4) implies (4.11); moreover, when M is a ∗ ϕ e = 0, (equivalent, ∂¯A ϕ = 0) only happens when Calabi–Yau 3-fold and E is stable, ∂¯A ϕ is a constant multiple of I in u(N ). In that sense, the right-hand side of Eq. (4.5) is 0, giving the gauge fixing condition ∂¯ ∗ τ = 0, τ ∈ 30,1 ⊗ G. Equation (4.12) is the equation hF, ωim = 0 (see Sect. 2.3). As stated there, the orbit space under complex gauge transformations should be the same as the symplectic quotient, the orbit space under unitary gauge transformations of the 0-momentum set, i.e., the condition hF, ωim = 0. Equation (4.13) is a special case of Proposition 3.1 in [24], which we have used previously in Sect. 2.2.2. The DUY equation can also be obtained from the 6 dimensional supersymmetric Yang–Mills theory on a Calabi–Yau manifold. The supersymmetry transformation laws of the (N = 1) vector multiplet (AM , 9) in 6 dimensions are δAM = iΞ0M 9 − i90M Ξ, i δ9 = − 6M N ΞF M N , 2

(4.15)

where $\Gamma_M$ are the gamma matrices and $\Sigma_{MN} = \tfrac{1}{4}[\Gamma_M, \Gamma_N]$ is the spin representation. On the Calabi–Yau manifold the holonomy group is further reduced to $SU(3)$, which gives a covariantly constant (complex) spinor $\zeta$. In fact this is the very reason why the Calabi–Yau manifold is favorable in the compactification of superstrings to 4 dimensions. We will identify the supersymmetry transformation with $\Xi = \zeta$ as a topological BRST transformation. With this choice of parameter, SUSY transformations are decomposed according to the representations of $SU(3)$. The decomposition of the $SO(6)$ vector is $6 = 3 \oplus \bar 3$ and the chiral spinor decomposes as $4 = 3 \oplus 1$. Thus we obtain the following topological BRST transformation law:
\[ sA_\mu = \psi_\mu, \quad sA_{\bar\mu} = 0, \quad s\psi_\mu = 0, \quad s\psi_{[\bar\mu\bar\nu]} = F_{\bar\mu\bar\nu}, \quad s\chi = g^{\mu\bar\mu}F_{\mu\bar\mu}, \quad s\rho = 0. \tag{4.16} \]

We should explain how we have “twisted”spinors into ghosts and anti-ghosts. In terms ¯ µ = 0, we can make the twist as of the covariantly constant spinor ζ which satisfies ζ0 follows; ¯ µ¯ 9, ¯ χ = ζ9, ψ¯ µ¯ = ζ0 ¯ µ¯ 0ν¯ 0σ¯ 9, ¯ ρ¯ = µ¯ ν¯ σ¯ ζ0 ψ[µ¯ ν] ¯ = ζ0µ¯ 0ν¯ 9,

(4.17)



where $(\bar\psi_{\bar\mu}, \bar\rho)$ are complex conjugates of $(\psi_\mu, \rho)$. This is an example of the identification of spinors with forms, explained in Sect. 2.1.1. Looking at the BRST transformations of the anti-ghosts, we recover the DUY equations (4.11, 4.12).

4.1.2. Reduction to a 4-D BRSTQFT; Seiberg–Witten equations. We now turn to the reduction to $D = 4$, which is of special interest, particularly the theory obtained by dimensional reduction of the J theory from $D = 8$ to $D = 4$. We will get a BRSTQFT with gauge conditions identical to the non-Abelian Seiberg–Witten equations, which in turn is also related to the $N = 4$, $D = 4$ supersymmetric theory. The main observation is that, in the J case, the set of seven equations (2.14) can be separated into 3 plus 4 equations. If we group $A_5, A_6, A_7, A_8$ into the 4 component field $\varphi_\alpha$, $\alpha = 1, 2, 3, 4$, the latter can be interpreted in 4 dimensions as a commuting complex Weyl spinor and $A_\mu = A_1, A_2, A_3, A_4$ as a 4 dimensional vector. The set of the first 3 equations in Eq. (2.14) can now be interpreted as the condition that the self-dual part in 4-D of the curvature of $A_\mu$ is equal to a bilinear in $\varphi_\alpha$; then, the remaining four equations can be written as Dirac type equations. To be more precise, with the relevant definition of the $4\times 4$ matrices $\Gamma_\mu$ and $\Sigma_{\mu\nu}$, the dimensional reduction down to $D = 4$ of Eq. (2.14) gives
\[ F_{\mu\nu} + \epsilon_{\mu\nu\rho\sigma}F^{\rho\sigma} + {}^t\varphi\,\Sigma_{\mu\nu}\,\varphi = 0, \qquad D_\mu^{(A)}\Gamma^\mu\varphi = 0. \tag{4.18} \]

The consistency of the dimensional reduction from Eq. (2.14) to Eq. (4.18), and the correctness of the SO(4) tensorial properties of all fields, are ensured by the existence of relevant elliptic operators in 8 and 4 dimensions. The remarkable feature is that the above equations are the non-abelian version of Seiberg–Witten equations. In other words, we have observed that the spinors and vectors of the non-abelian S–W theory get unified in the Yang–Mills field of the J theory. The generation of a Higgs potential, to break down the symmetry, with a remaining U (1) is in principle possible, by the relevant modifications in the gauge functions, which provide a Higgs potential, function of ϕ. This is however a subtle issue that we will address elsewhere. The form of the action after dimensional reduction is just the sum of the bosonic part of the Seiberg–Witten action, plus ghost terms. Its derivation is standard from the knowledge of the gauge function, as a BRST exact term, which enforces the gauge functions. The link to supersymmetry in 4 dimensions is as follows. The BRSTQFT based on Spin(7) is a twisted version of the D = 8, N = 1 theory where the spinor is a complex field counting for 16 = 8 + 8 independent real components, and one has a complex scalar field in the supersymmetry multiplet. This theory is itself obtained as the dimensional reduction of the D = 10, N = 1 super Yang–Mills theory, where the spinor has 16 independent real components. Thus we predict that the theory we get by dimensional reduction to 4 dimensions of BRSTQFT in 8 dimensions is related to twisted versions of the D = 4, N = 4 super Yang–Mills theory. For instance, there are 6 scalar fields in the bosonic sector of the theory as presented in the work of Vafa and Witten [16], (see their Eq.(2.1)). In our derivation, these 6 scalar fields are combinations of 4 of the components of the 8-D Yang–Mills field and of the commuting ghost and antighost φ and φ¯ of the J theory.



There are actually three ways of twisting the $N = 4$ SSYM in four dimensions, defined by how $SO(4) \simeq SU(2)\times SU(2)$ is embedded in the R symmetry group$^{12}$ $SU(4)$ [16]. They are (i) $(2,1)\oplus(1,2)$, (ii) $(1,2)\oplus(1,2)$ and (iii) $(1,2)\oplus(1,1)\oplus(1,1)$, where we have indicated how the defining representation of $SU(4)$ decomposes under $SU(2)\times SU(2)$. Taking into account the argument in Sect. 6 of [27], we can see that the cases (i) and (iii) arise from the reduction of type H and J cases, respectively. The remaining case (ii), which is the twist employed by Vafa–Witten [16], is obtained from the 7 dimensional Joyce manifold with $G_2$ holonomy. On the other hand, we get the non-abelian Seiberg–Witten theory with an adjoint hypermultiplet in the case (iii), which gives the relationship between $N = 4$ SSYM and the non-abelian Seiberg–Witten equation.

We thus conclude that very interesting twists connect the fields of the pure Yang–Mills 8-D BRSTQFT (obtained by gauge fixing the invariant $\Omega\wedge\mathrm{Tr}(F\wedge F)$), the fields which are involved in the four dimensional Seiberg–Witten equations, and the fields of the $D = 4$, $N = 4$ super-Yang–Mills theory. We note that if one starts from the H case gauge functions, the result of compactifying down to 4 dimensions is just a complexified version of a two dimensional Yang–Mills TQFT, coupled to two scalar fields; it could also be deduced from the dimensional reduction of the 3-dimensional BRSTQFT based on the Bogomolny equations.

4.2. Dimensions larger than 8.

4.2.1. Discussion of the case D=12. A BRSTQFT in 12 dimensions might be a candidate for F-theory. 11-dimensional supergravity, defined on the boundary of a 12 dimensional manifold, emphasizes the relevance of a 3-form gauge field $C_3$, possibly coupled to a non abelian connection one form $A$. The most important term $\int_{M_{11}} C_3\wedge dC_3\wedge dC_3$ of the 11-dimensional supergravity suggests that one should build a TQFT based on the gauge-fixing of the following invariant$^{13}$:
\[ \int_{M_{12}} \Big( dC_3\wedge dC_3\wedge dC_3 + dC_3\wedge dC_3\wedge P_{\mathrm{inv}\,4}(F) + dC_3\wedge P_{\mathrm{inv}\,8}(F) + P_{\mathrm{inv}\,12}(F) \Big), \tag{4.19} \]
where $P_{\mathrm{inv}\,n}(F)$ are invariant polynomials of degree $n/2$ of the curvature of $A$, i.e., characteristic classes. Special geometries like hyper or quaternionic Kähler manifolds give natural four-forms. They, their duals (which are 8-forms, and are therefore good candidates to define gauge functions for the curvature of a 3-form in 12 dimensions), and their powers might be used as well here. It is natural to try and gauge fix these topological actions to get a BRSTQFT. However, we did not find gauge fixing functions for a single uncharged 3-form gauge field in 12 dimensions. Rather, we did find one for a single charged 3-form, and another one for a theory with two uncharged 3-forms. (See below.) We could introduce a 5-form gauge field (not relevant for pure 11-dimensional supergravity) and, similar to the 8-dimensional case, consider self-duality conditions for the 6-form curvature of $C_5$, with a gauge condition of the type
\[ dC_5 + \ast dC_5 + \mathrm{Tr}(F\wedge F\wedge F)_+ = 0. \tag{4.20} \]
In the present understanding of superstrings, 5-forms are not so natural; so we will not elaborate further on this case.

12 The R symmetry is the automorphism of the extended supersymmetry algebra.
13 Here again $dC_3$ means $h + dC_3$, where $h$ is the harmonic representative of an element in $H^4(M_{12})$.



When M12 is a Calabi–Yau 6-fold, we can do some things in two different theories. In the first theory, we couple a charged 3-form B to the Yang–Mills field. (B is valued in the same Lie Algebra as A.) We again use ∗ : 30,q → 30,6−q , so that 30,3 ⊗ G = 0,3 ¯ 0,2 − ∂¯A ∂¯A ∗ B = 0 implies for compact manifolds that 30,3 + ⊗ G + 3− ⊗ G. Again ∂A F ∂¯A ∗ B = 0. The covariant gauge condition is ∗F 0,2 = ∂¯A B, B ∈ 30,3 + ⊗ G; equivalently, ∗ B. So the covariant gauge conditions become the pair F 0,2 = 0 and ∂¯A B = 0, F 0,2 = ∂¯A ∗ similar to the Calabi–Yau 3-fold case in Sect. 4.1.1. There, F 0,2 = 0 and ∂¯A ϕ e = 0, with 0,3 0,3 ϕ e ∈ 3 ⊗ G. In the present case, B ∈ 3+ ⊗ G. The moduli space is a vector bundle over the set of holomorphic bundles for a fixed C ∞ (E, ρ). Each such holomorphic structure gives a unique A with FA0,2 = 0. The fiber ¯ over A consists of [B ∈ 30,3 + ⊗ G ; ∂A B = 0]. ∂¯

∂¯

∂¯

A A A The sequence 0 → 30,0 ⊗ G −→ 30,1 ⊗ G −→ 30,2 ⊗ G −→ 30,3 + ⊗ G is elliptic at the symbol level; linearization of the covariant gauge condition together with the usual gauge fixing is given by the elliptic operator: 0,2 0,1 ∗ ∂¯A ∂¯A 3 ⊗G 3 ⊗G −→ . (4.21) : ∗ ∂¯A 0 30,3 30,0 ⊗ G + ⊗G R We take as classical “topological” action S0 [A, B] = M12 6 ∧Tr (∂¯A B ∧FA ) where 6 is the (6, 0) covariant constant formRof M12 . Since the covariant gauge function is ∗ ∗ ∗ B and since hF 0,2 , ∂¯A Bi = M12 6 ∧ Tr (F 0,2 ∧ ∂¯A B), we have k F 0,2 − F 0,2 − ∂¯A ∗ 2 0,2 2 ∗ 2 0,2 ¯ ∗ ∗ 0,2 ∗ ¯ ¯ ¯ B k2 =k ∂A B k =k F k + k ∂A B k −hF , ∂A Bi − h∂A B, F i, that is, k F 0,2 − ∂¯A 0,2 2 ∗ 2 c ∗ ¯ ¯ ¯ F k + k ∂A B k −S0 [A, B] − S0 [A, B] . (Remember that ∂A = ∗∂A ∗.) We thus ∗ B k2 . obtain a BRSTQFT whose gauge fixed action will include the term k F 0,2 k2 + k ∂¯A 0,3 Moreover, the condition that B ∈ 3+ ⊗ G can be imposed in a BRST invariant by using the ordinary gauge freedom of B 14 . In the second theory, we introduce two uncharged 2-form gauge fields B2a and two (non abelian) Yang–Mills fields Aa , with a = 1 and 2. We consider the following topological classical action Z ab 6 ∧ dB2a ∧ dB2b . (4.22)

M12

We define the following “holomorphic” gauge conditions, where the complex indices run from 1 to 6, c

2 a a a a a ∂[µ¯ Bνa¯ ρ]¯ + ab µ¯ ν¯ ρ¯ α¯ β¯ γ¯ ∂[α¯ Bβb¯ γ] ¯ + A[µ¯ Aν¯ Aρ] ¯ ). ¯ = Tr (A[µ¯ ∂ν¯ Aρ] 3

(4.23)

The right-hand side of this equation is the Chern–Simons form of rank 3. The similarity to 8 dimensions is striking, up to the replacement of the even Chern class by the odd Chern–Simons class. Equation (4.23) implies ∂ ρ¯ ∂[µ¯ Bνa¯ ρ]¯ = ab µ¯ ν¯ ρ¯ α¯ β¯ γ¯ Tr Fρb¯ α¯ Fβb¯ γ¯ .

(4.24)

Its solution is the stationary point of the following action: 14 The (0,3)-form B is valued in the same Lie algebra as the Yang–Mills field. It is thus non abelian and its quantization involves the field anti-field formalism of Batalin and Vilkoviski. We intend to perform elsewhere this rather technical task, which generalizes that sketched at the end of Sect. 3.0.



Z d12 x ab (∂[µ¯ Bνa¯ ρ]¯ c ∂[µ¯ Bνb¯ ρ]¯ + µ¯ ν¯ ρ¯ α¯ β¯ γ¯ c Bνa¯ ρ¯ Tr Fρb¯ α¯ Fβbγ¯ + complex conjugate). M12

(4.25)

Gauge fixing the Lagrangian Eq. (4.22) by the gauge condition Eq. (4.24) provides a BRST invariant action. Its ghost independent and gauge independent part is identical to the action Eq. (4.25).

4.2.2. Other possibilities. In 10 dimensions one could build a BRSTQFT based on a four-form gauge field $B_4$ and a pair of 2-form gauge fields $B_2^a$, $a = 1, 2$, which naturally fit into the type IIB superstring. All these forms are uncharged, but they can develop non trivial interactions [30]. The curvatures are
\[ G_5 = dB_4 + \epsilon_{ab}\,B_2^a\,G_3^b , \tag{4.26} \]
\[ G_3^a = dB_2^a , \tag{4.27} \]
with Bianchi identities $dG_5 = \epsilon_{ab}\,G_3^a G_3^b$ and $dG_3^a = 0$. One can construct from these fields one closed 11-form and two 8-forms
\[ \Delta_{11} = \epsilon_{ab}\,G_3^a G_3^b G_5 , \tag{4.28} \]
\[ \Delta_8^a = G_5\,G_3^a . \tag{4.29} \]

The role of the invariant forms is obscure, but their existence could signal generalizations of the Green–Schwarz type anomaly cancellation mechanism. The covariant gauge function is
\[ dB_4 + \ast dB_4 + \epsilon_{ab}\,B_2^a\,dB_2^b = 0. \tag{4.30} \]
The mixing of forms of various degrees by the gauge functions generalizes that of the 3-form with the Yang–Mills field in the eight dimensional theory of Sect. 3.

5. Conclusion

We have described some new Yang–Mills quantum field theories in dimensions greater than four, using self duality. In eight dimensions we found two BRSTQFT’s depending on holonomy $Spin(7)$ (the J-case) or holonomy $SU(4)$ (the H-case). In the J-case, BRST symmetry is what is left of supersymmetry. The increase in dimension allows us to couple ordinary gauge fields to forms of higher degree. We have given several examples. Dimensional reduction generates new theories. One of them is a BRSTQFT whose gauge conditions are the non-abelian Seiberg–Witten equations.

In four dimensions, given the self duality condition, there are other ways of deriving the Lagrangian of Witten’s topological Yang–Mills theory besides Witten’s twist of $N = 2$ SSYM and besides BRST [1, 2, 33]. These methods should work equally well in deriving our BRSTQFT Lagrangians for the pure Yang–Mills case.

Finally, as we have indicated earlier, the geometries of the moduli spaces we have probed have not been worked out. Much remains to be done [13]. However, from the lessons learned in four dimensions, it is tempting to hurdle these obstacles and proceed to the corresponding Seiberg–Witten abelian theory. Preliminary investigations indicate that one can compute the Seiberg–Witten invariants when $M_8$ is hyperKähler, i.e., when the holonomy group is $Sp(2)$. This case is very similar to the Seiberg–Witten invariants for $M_4$ when it is Kähler [34].



Acknowledgement. We thank M. Duff for pointing out to us that the octonionic solution in [14] and [15] does not give a self dual $\mathrm{Tr}(F\wedge F)$. H.K. would like to thank T. Eguchi and T. Inami for helpful communications. The work of H.K. is supported in part by the Grant-in-Aid for Scientific Research from the Ministry of Education, Science and Culture, Japan. L.B. would like to thank the Yukawa Institute, where part of this work was done, and E. Corrigan and H. Nicolai for discussions. I.M.S. would like to thank G. Tian for bringing him up to date on complex geometry. He also benefited from discussions with S. Axelrod, S. Donaldson, R. Thomas, and E. Weinstein. His work is supported in part by a DOE Grant No. DE-FG02-88ER25066.

Note added on July 17, 1997. T.A. Ivanova has called our attention to [14], where instanton solutions are found. B.S. Acharya and M. Loughlin have called our attention to their paper [35] where they discuss self duality for Euclidean gravity when d ≤ 8. B.S. Acharya, M. Loughlin and B. Spence also discuss self duality in [36]. In their paper, a note added says that their proof of BRST invariance would “seem to conflict” with our theory not being topological. Indeed the theory is not topological. They made the corrections in a revised version. We expand on our assertion. Assume M is a compact oriented simply connected manifold with Aˆ = 1 and assume M admits a Joyce metric, i.e, a metric with Spin(7) holonomy. The space of Joyce metrics modulo diffeomorphisms isotopic to the identity is of dimension 1 + b4− (M ) (see Theorem D in [20]). It is conceivable that this manifold of Joyce metrics is not connected so that one cannot find a path from one Joyce metric to another with each point of the path a Joyce metric. The BRST argument for invariance requires a path of Joyce metrics, hence shows formally that the correlation functions are constant on components of the space of Joyce metrics. But the argument does not imply constancy of the correlation functions on all Joyce metrics. This is one reason we chose not to label our J-case QFT a topological quantum field theory. On the mathematical side the argument analogous to BRST invariance also works formally because the correlation functions come from the second Chern class (see 2.1.3). As we indicated there, to define the analogue of Donaldson invariants (the correlation function precisely), one needs to integrate over the moduli space MJ of self dual connections. To do so, a compactification of MJ is important (work in progress by D. Joyce and C. Lewis). The H-case (Sect. 2.2.3 in particular) is more complicated. Physicists allow a degeneration of the complex structure to connect one moduli space with another. 
We do not know how the “holomorphic Donaldson invariants” behave under this degeneration.

References 1. Birmingham, D., Blau, M., Rakowski, M., G. Thompson: Physics Reports 209, 129 (1991) 2. Cordes, S., Moore, G., Ramgoolam, S.: Lectures on 2D Yang–Mills Theory, Equivariant Cohomology and Topological Field Theories. In: Les Houches Session LXII, hep-th/9411210 3. Donaldson, S.K.: Topology. 29, 257 (1990) 4. Witten, E.: Commun. Math. Phys. 117, 353 (1988) 5. Baulieu, L., Singer, I.: Nucl. Phys. B (Proc. Supple.) 5B, 12 (1988) 6. E. Corrigan, Devchand, C., Fairlie, D.B., Nuyts, J.: Nucl. Phys. B214, 452 (1983) 7. Ward, R.S.: Nucl. Phys. B236, 381 (1984) 8. Fairlie, D.B., Nuys, J.: J. Phys.A17, 2867 (1984) 9. Fubini, S., Nicolai, H.: Phys. Lett. 155B, 369 (1985) 10. Salamon, S.: Riemannian Geometry, Holonomy Groups. Pitman Research Notes in Mathematics Series, 1989



11. Weinstein, E.: Extension of self-dual Yang–Mills equations across the 8th dimension. PHD dissertation, Harvard University Math. Dept, 1992 12. Joyce, D.D.: Invent. Math. 123, 507 (1996) 13. Donaldson, S.K., Thomas, R.P.: Gauge Theory in Higher Dimensions. Oxford preprint (1996) 14. Ivanova, T.A.: Phys. Lett. B315, 277 (1993); Ivanova, T.A., Popov, A.D.: Lett. Math. Phys. 24, 85 (1992); Theor. Math. Phys. 94, 225 (1993) 15. Günayden, M., Nicolai, H.: Phys. Lett. B351, 169 (1995) 16. Vafa, C., Witten, E.: Nucl. Phys. B431, 3 (1994) 17. Vafa, C.: Nucl. Phys. B469, 403 (1996), hep-th/9602022 18. Papadopoulos, G., Townsend, P.K.: Phys. Lett. B357, 300 (1995), hep-th/9506150 19. Acharya, B.S.: N=1 M-Theory-Heterotic Duality in Three-Dimensions and Joyce Manifolds. hepth/9604133; Dirichlet Joyce Manifolds, Discrete Torsion and Duality. hep-th/9611036 20. Joyce, D.D.: J. Diff. Geom. 43, 291, 329 (1996) 21. Shatashvili, S., Vafa, C.: Superstrings, Manifolds of Exceptional Holonomy. hep-th/9407025 22. Figueroa-O’Farrill, J.M.: A Note on the Extended Superconformal Algebras Associated with Manifolds of Exceptional Holonomy. hep-th/9609113 23. Günayden, M., Gürsey, F.: J. Math. Phys. 14, 1651 (1973) 24. DeBartolomeis, P., Tian, G.: J. Diff. Geom. 43, 231 (1973) 25. Salam, A., Sezgin, E. (Eds), Supergravities in Diverse Dimensions. Amsterdam–Singapore: NorthHolland/World Scientific, 1989 26. Witten, E.: Nucl. Phys. B460, 335 (1996), hep-th/9510135 27. Bershadsky, M., Sadov, V., Vafa, C.: Nucl. Phys. B463, 420 (1996), hep-th/9511222 28. Becker, K., Becker, M., Morrison, D.R., Ooguri, H., Oz, Y., Yin, Z.: Supersymmetric Cycles in Exceptional Holonomy Manifolds and Calabi–Yau 4-folds. hep-th/9608116 29. Furuuchi, K., Kunitomo, H., Nakatsu, T.: Topological Field Theory and Second-Quantized Five-Branes. hep-th/9610016 30. Baulieu, L.: Algebraic quantization of gauge theories. In: Perspectives in fields and particles, eds. 
Basdevant-Levy, Cargese Lectures 1983 London: Plenum Press, 1985; Baulieu, L.: Nucl. Phys. B478, 431 (1996) 31. Brylinski, J.-L.: Loop Spaces, Characteristic Classes, Geometric Quantization. Berlin: Birkhäuser, 1992 32. Baulieu, L., Grossman, B.: Phys. Lett. 214B, 223 (1988) 33. Atiyah, M., Jeffrey, L.: J. Geom. Phys. 7, 120 (1990) 34. Witten, E.: Math. Res. Letters 1, 764 (1994) 35. Acharya, B.S., O’Loughlin, M.: Phys. Rev. D55, R4521, (1997), hep-th/9612182 36. Acharya, B.S., O’Loughlin, M., Spence, B.: hep-th/9705138 Communicated by S.-T. Yau

Commun. Math. Phys. 194, 177 – 190 (1998)



Quenched Sub-Exponential Tail Estimates for One-Dimensional Random Walk in Random Environment

Nina Gantert*, Ofer Zeitouni**

Department of Electrical Engineering, Technion-Israel Institute of Technology, Haifa 32000, Israel

* On leave from the Department of Mathematics, TU Berlin. Research supported by the Swiss National Science foundation under grant 8220–046518.
** Partially supported by a US-Israel BSF grant.

Received: 13 December 1996 / Accepted: 3 October 1997

Abstract: Suppose that the integers are assigned i.i.d. random variables $\{\omega_x\}$ (taking values in the unit interval), which serve as an environment. This environment defines a random walk $\{X_n\}$ (called a RWRE) which, when at $x$, moves one step to the right with probability $\omega_x$, and one step to the left with probability $1 - \omega_x$. Solomon (1975) determined the almost-sure asymptotic speed $v_\alpha$ (= rate of escape) of a RWRE. Greven and den Hollander (1994) have proved a large deviation principle for $X_n/n$, conditional upon the environment, with deterministic rate function. For certain environment distributions where the drifts $2\omega_x - 1$ can take both positive and negative values, their rate function vanishes on an interval $(0, v_\alpha)$. We find the rate of decay on this interval and prove it is a stretched exponential of appropriate exponent, that is, the absolute value of the log of the probability that the empirical mean $X_n/n$ is smaller than $v$, $v \in (0, v_\alpha)$, behaves roughly like a fractional power of $n$. The annealed estimates of Dembo, Peres and Zeitouni (1996) play a crucial role in the proof. We also deal with the case of positive and zero drifts, and prove there a quenched decay of the form $\exp(-cn/(\log n)^2)$.

1. Introduction

In this paper, we continue the study, initiated in [4] and [2], of tail estimates for a nearest-neighbor random walk on $\mathbb{Z}$ with site-dependent transition probabilities. Let $\omega = (\omega_x)_{x \in \mathbb{Z}}$ be an i.i.d. collection of $(0,1)$-valued random variables, with marginal distribution $\alpha$. For every fixed $\omega$, let $X = (X_n)_{n \ge 0}$ be the Markov chain on $\mathbb{Z}$ starting at $X_0 = 0$ (unless explicitly stated otherwise), and with transition probabilities


\[
P_\omega(X_{n+1} = y \mid X_n = x) =
\begin{cases}
\omega_x & \text{if } y = x + 1, \\
1 - \omega_x & \text{if } y = x - 1, \\
0 & \text{otherwise.}
\end{cases}
\tag{1}
\]
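The transition rule (1) is easy to simulate once an environment has been fixed. The following is a minimal sketch, assuming a marginal distribution with finite support; the function names `sample_environment` and `quenched_walk` and their parameters are illustrative and do not come from the paper.

```python
import random


def sample_environment(n_sites, p_values, weights, rng):
    """Draw i.i.d. omega_x for x = -n_sites..n_sites.

    Assumption: the marginal law alpha has finite support `p_values`
    with (unnormalized) weights `weights`.
    """
    return {
        x: rng.choices(p_values, weights=weights)[0]
        for x in range(-n_sites, n_sites + 1)
    }


def quenched_walk(omega, n_steps, rng):
    """One path of X under the quenched law, started at 0.

    At site x the walk steps right with probability omega[x] and left
    otherwise, as in (1). The environment must cover [-n_steps, n_steps].
    """
    x = 0
    path = [x]
    for _ in range(n_steps):
        x += 1 if rng.random() < omega[x] else -1
        path.append(x)
    return path


if __name__ == "__main__":
    rng = random.Random(1)
    omega = sample_environment(1000, [0.8, 0.3], [0.7, 0.3], rng)
    path = quenched_walk(omega, 1000, rng)
    print("empirical mean X_n / n:", path[-1] / 1000)
```

Drawing the environment once and running many paths on it estimates quenched probabilities; redrawing the environment for every path estimates annealed ones.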

The symbol $P_\omega$ denotes the measure on path space given the environment $\omega$, and is referred to as the "quenched" setting. The process $(X, \omega)$ is an example of a random walk in random environment (RWRE), and $X$ has the law $P = \int \alpha^{\mathbb{Z}}(d\omega)\, P_\omega$, referred to as the "annealed" law. When no confusion arises, we use $P$ also to denote the law of $(X, \omega)$. We use in various places, when confusion does not occur, $P$ to denote the probability of events constructed from random variables unrelated to the RWRE. For a discussion of the different regimes that the RWRE $X_n$ exhibits, we refer to the introduction in [2].

Abbreviate $\rho = \rho(x, \omega) = (1 - \omega_x)/\omega_x$ and $\langle f \rangle = \int f(\omega)\, \alpha^{\mathbb{Z}}(d\omega)$ for any function $f$ of the environment. Let $\rho_{\max}$ denote the maximum of $\rho$ over the closed support of $\alpha$, and let $\rho_{\min}$ denote the corresponding minimum. We will be interested here in the case $\langle \rho \rangle < 1$ and $\rho_{\max} \ge 1$, in which case (cf. [7]) the RWRE is transient and, $P$-a.s.,
\[
\lim_{n \to \infty} n^{-1} X_n = v_\alpha := \frac{1 - \langle \rho \rangle}{1 + \langle \rho \rangle}.
\tag{2}
\]

Tail estimates for $X_n/n$ have been derived for the quenched setting in [4]. In particular, it was shown there that, $P$-a.s., the random variables $X_n/n$ satisfy with respect to $P_\omega$ a large deviation principle of speed $n$ and explicit, deterministic, rate function $I(v)$, defined as follows (see [4, Theorem 2 and Corollary 1]). Let $f(r, \omega)$, $r \ge 0$, denote the continued fraction function
\[
f(r, \omega) = \cfrac{1}{e^r (1 + \rho(0, \omega)) - \cfrac{\rho(0, \omega)}{e^r (1 + \rho(1, \omega)) - \cfrac{\rho(1, \omega)}{\cdots}}},
\]
and let $\lambda(r) = \exp \langle \log f(r, \omega) \rangle$. Let $r(v) = 0$ for $v \le v_\alpha$, and for $v \in (v_\alpha, 1]$, let $r(v)$ be the unique solution of the equation $v^{-1} = -\lambda'(r)/\lambda(r)$. Then,
\[
I(v) =
\begin{cases}
-r(v) - v \log \lambda(r(v)), & v \in [0, 1], \\
I(-v) + v \langle \log \rho \rangle, & v \in [-1, 0], \\
\infty, & v \notin [-1, 1].
\end{cases}
\]
Furthermore, $I(v) = 0$ for $v \in [0, v_\alpha]$ and $I$ is strictly positive elsewhere.

Our goal in this paper is to study in greater detail the regime $v \in (0, v_\alpha)$ under $P_\omega$. In the annealed setting, i.e., when one is interested in $P(X_n \le nv)$, $v \in (0, v_\alpha)$, sub-exponential rates of decay were derived in [2]. We summarize now the main results of [2] relevant to us. Recall (cf. [2]) that when $\langle \rho \rangle < 1$, there exists a unique $s > 1$ satisfying $\langle \rho^s \rangle = 1$.

Theorem 1 (see [2]). Let $v \in (0, v_\alpha)$.

(a) Positive and negative drifts. Suppose that $\langle \rho \rangle < 1$ and $\rho_{\max} > 1$. Then,
\[
\lim_{n \to \infty} \log P(X_n \le nv) / \log n = 1 - s.
\]

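(b) Positive and zero drifts. Suppose that $\langle \rho \rangle < 1$ but $\rho_{\max} = 1$ and $\alpha(1/2) > 0$. Then, with $C_1 = \frac{3}{2} \left| \frac{\pi \log \alpha(1/2)}{2} \right|^{2/3}$ and $C_2 = \left| \frac{\pi (\log \langle \rho \rangle)}{8} \right|^{2/3}$,
\[
-C_1 \Big(1 - \frac{v}{v_\alpha}\Big)^{1/3}
\le \liminf_{n \to \infty} \frac{1}{n^{1/3}} \log P(X_n \le nv)
\le \limsup_{n \to \infty} \frac{1}{n^{1/3}} \log P(X_n \le nv)
\le -C_2 \Big(1 - \frac{v}{v_\alpha}\Big)^{1/3}.
\tag{3}
\]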
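For a concrete environment law, the speed in (2) and the exponent $s$ solving $\langle \rho^s \rangle = 1$ can be computed numerically. The sketch below assumes a finitely supported marginal with $\langle \rho \rangle < 1$ and $\rho_{\max} > 1$; the helper names `rho_moment`, `speed` and `critical_exponent` are illustrative only. Bisection applies because $s \mapsto \langle \rho^s \rangle$ is convex, equals 1 at $s = 0$ and is below 1 at $s = 1$, so it crosses 1 exactly once on $(1, \infty)$.

```python
def rho_moment(p_values, weights, s):
    """<rho^s> for rho = (1 - omega) / omega, omega drawn from a finite law."""
    total = sum(weights)
    return sum(w * ((1 - p) / p) ** s for p, w in zip(p_values, weights)) / total


def speed(p_values, weights):
    """Asymptotic speed v_alpha = (1 - <rho>) / (1 + <rho>), as in (2)."""
    m = rho_moment(p_values, weights, 1.0)
    return (1 - m) / (1 + m)


def critical_exponent(p_values, weights, hi=2.0, tol=1e-12):
    """Solve <rho^s> = 1 for s > 1 by bisection.

    Assumption: <rho> < 1 and rho_max > 1, so a unique root exists.
    """
    def f(s):
        return rho_moment(p_values, weights, s) - 1.0

    lo = 1.0
    while f(hi) < 0:          # enlarge the bracket until the sign changes
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    # rho takes the value 1/2 with probability 0.8 and 2 with probability 0.2
    p_values, weights = [2 / 3, 1 / 3], [0.8, 0.2]
    print("v_alpha =", speed(p_values, weights))
    print("s       =", critical_exponent(p_values, weights))
```

In this two-point example $\langle \rho \rangle = 0.8$, and $\langle \rho^s \rangle = 1$ reduces to a quadratic in $2^s$ with root $2^s = 4$, which gives a hand check of the numerical output.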

Maybe surprisingly, it turns out that the annealed estimates are key to understanding the quenched asymptotics. The next theorems are our main results. They quantify the fact that the annealed probabilities of large deviations are of bigger order than their quenched counterparts, due to the possibility of rare fluctuations in the environment which may slow down the RWRE.

Theorem 2 (Positive and negative drifts). Suppose that $\langle \rho \rangle < 1$, $\rho_{\max} > 1$, and let $v \in (0, v_\alpha)$. Then, for $P$-a.a. $\omega$, the following statements hold:

1. For any $\delta > 0$,
\[
\limsup_{n \to \infty} \frac{1}{n^{1 - 1/s - \delta}} \log P_\omega(X_n < nv) = -\infty.
\tag{4}
\]

2. For any $\delta > 0$,
\[
\liminf_{n \to \infty} \frac{1}{n^{1 - 1/s + \delta}} \log P_\omega(X_n < nv) = 0.
\tag{5}
\]
Furthermore,
\[
\limsup_{n \to \infty} \frac{1}{n^{1 - 1/s}} \log P_\omega(X_n < nv) = 0.
\tag{6}
\]

One should compare the rate of decay obtained in Theorem 2 with the annealed polynomial rate of decay (see Theorem 1) $P(X_n < nv) \simeq n^{1-s}$. As in [2], tail estimates are different when the drift cannot be negative:

Theorem 3 (Positive and zero drifts). Suppose that $\langle \rho \rangle < 1$, $\rho_{\max} = 1$, and $\alpha(\{1/2\}) > 0$. Then, for $P$-a.a. $\omega$, and for $v \in (0, v_\alpha)$,
\[
-c_1 \Big(1 - \frac{v}{v_\alpha}\Big)
\le \liminf_{n \to \infty} \frac{(\log n)^2}{n} \log P_\omega(X_n < nv)
\le \limsup_{n \to \infty} \frac{(\log n)^2}{n} \log P_\omega(X_n < nv)
\le -c_2 \Big(1 - \frac{v}{v_\alpha}\Big)^2.
\tag{7}
\]
Here, $c_1 = |\pi \log \alpha(\{1/2\})|^2 / 8$ and $c_2 = |\pi \log \langle \rho \rangle|^2 / 24^3$.

Again, the rate in Theorem 3 should be compared with the annealed rate (cf. Theorem 1) $P(X_n < nv) \simeq \exp(-C_i n^{1/3})$.

Remarks.

1. As in [2], we have not covered the case of $\langle \rho \rangle < 1$, $\rho_{\max} = 1$, while $\alpha(\{1/2\}) = 0$. The tail estimates in the annealed case were conjectured in [2, p. 681] to be of the form $\exp(-D_i n^{\beta})$, $i = 1, 2$, for some $\beta \in (1/3, 1)$ determined by the tails of $\alpha(\cdot)$ near $1/2$. The same proof as in Theorem 3 then shows that the upper quenched estimates in Theorem 3 become $\exp(-dn/(\log n)^{\gamma})$, with $\gamma = 1/\beta - 1$.



2. In the setting of Theorem 2, we conjecture that actually
\[
\liminf_{n \to \infty} \frac{1}{n^{1 - 1/s}} \log P_\omega(X_n < nv) = -\infty.
\]
In fact, the derivation of the lower bound in (6) hints at such a limit. In the setting of Theorem 3, we conjecture, as in [2], that the lower bound is sharp, that is
\[
\lim_{n \to \infty} \frac{(\log n)^2}{n} \log P_\omega(X_n < nv) = -c_1 \Big(1 - \frac{v}{v_\alpha}\Big).
\]
In fact, it was shown recently (see [6]) that the lower bound is sharp in the annealed setting, that is, one may replace $C_2$ in the right hand side of (3) by $C_1$. This however does not suffice for closing the gap in our Theorem 3; see the comment following the proof of the theorem.

3. In the setting of Theorem 2, it is natural to attempt to improve on (4), (5) by allowing for $\delta_n \to 0$ as $n \to \infty$. Such improvement is possible if in Theorem 1.1 of [2] one refines the convergence, that is, one proves bounds of the form
\[
\limsup_{n \to \infty} g_n n^{s-1} P(X_n < nv) < \infty
\]
for appropriate $g_n \to 0$ sub-polynomially, which is possible albeit tedious. It seems however impossible in this way to completely close the gap between the upper and lower bounds exhibited in (4) and (5).

We conclude this introduction with two technical lemmas, borrowed from [2], whose proof follows readily from the explicit computations for inhomogeneous random walk of [1, pp. 66–71]. Let $X_n$ denote a RWRE and let $\bar{X}_n$ denote a RWRE with $\omega_0 = 1$. Let $\bar{\tau}_k = \min\{n : \bar{X}_n = k\}$, let $R_k = k^{-1} \sum_{i=1}^{k} \log \rho(i)$, and let $L^0 = \max_{n \ge 0} \{-X_n\}$.

Lemma 1 ([2], Lemma 2.1). For all $n, k$,
\[
P_\omega(\bar{\tau}_k \ge n) \ge \big(1 - e^{-(k-1) R_{k-1}}\big)^n.
\]

Lemma 2 ([2], Lemma 2.2). For any $k \ge 1$,
\[
P(L^0 \ge k) \le \frac{\langle \rho \rangle^k}{1 - \langle \rho \rangle}.
\]
2. Proofs

Proof of Theorem 2. Since the lower bound of Theorem 2 is relatively simple, and the key ideas are already explained in [2], we postpone the discussion of it and begin by providing a sketch of the proof of the upper bound leading to (4), that is, with $\tau_n = \inf\{t : X_t = n\}$, we will explain why
\[
\lim_{n \to \infty} \frac{1}{n^{1 - 1/s - \delta}} \log P_\omega(\tau_n > n/v) = -\infty.
\tag{8}
\]


The required upper bound follows readily. We will omit subsequences, etc. in this sketch, and thus the reader interested in a complete proof should take the next few paragraphs with somewhat of a grain of salt. The precise statement of the required estimate is contained in the statement of Proposition 1.

Divide the interval $[0, nv]$ into blocks of size roughly $k = k_n := n^{1/s + \delta}$. Let $X_n^x$ denote the RWRE started at $x$, and define
\[
T_k^{(i)} = \inf\{t > 0 : X_t^{ik} = (i+1)k\}, \qquad i = 0, \pm 1, \ldots.
\tag{9}
\]
By slight abuse of notation, we continue to use $P_\omega$ for the quenched law of the $\{X_n^x\}$. By using the annealed bounds of [2], see Theorem 1, one knows that $P(\tau_k > k/v) \sim k^{1-s}$. Hence, taking appropriate subsequences, one applies a Borel–Cantelli argument to control the probability, conditioned on the environment, of the time spent in each such block being large, i.e., one exhibits a uniform estimate on $P_\omega(T_k^{(i)} > k/v)$, cf. Lemma 5.

The next step involves a decoupling argument. Let
\[
\bar{T}_k^{(i)} = \inf\{t > 0 : X_t^{ik} = (i+1)k \text{ or } X_t^{ik} = (i-1)k\}.
\tag{10}
\]
Then, using Lemma 2 and the Borel–Cantelli lemma, one shows that for all relevant blocks, that is $i = \pm 1, \pm 2, \ldots, \pm n/k$, $P_\omega(\bar{T}_k^{(i)} \ne T_k^{(i)})$ is small enough. Therefore, we can consider the random variables $\bar{T}_k^{(i)}$ instead of $T_k^{(i)}$, which have the advantage that their dependence on the environment is well localized. This allows us to obtain (cf. Lemma 7) a uniform bound on the tails of $\bar{T}_k^{(i)}$, for all relevant $i$.

The final step involves estimating how many of the $k$-blocks will be traversed from right to left before the RWRE hits the point $nv$. This is done by constructing a simple random walk (SRW) $S_t$ whose probability of jump to the left dominates $P_\omega(T_k^{(i)} \ne \bar{T}_k^{(i)})$ for all relevant $i$. The analysis of this SRW will allow us to claim (cf. Lemma 9) that the number of visits to a $k$-block after entering its right neighbor is negligible. Thus, the original question on the tail of $\tau_n$ is replaced by a question on the sum of (dominated by i.i.d.) random variables $\bar{T}_k^{(i)}$, which is resolved by means of the tail estimates obtained in the second step.

A slight complication is presented by the need to work with subsequences in order to apply the Borel–Cantelli lemma at various places. Going from subsequences to the original $n$ sequence is achieved by means of monotonicity arguments.

Turning now to the complete proof, we first note that it is actually enough to prove a weaker statement. For $\delta \in (0, 1 - 1/s)$, let $C_n = n^{\delta}$ and let $n_j = [j^{2/\delta}]$. Recall that $\tau_n = \inf\{t : X_t = n\}$, and let $\mu := v^{-1} > v_\alpha^{-1}$. The key to the upper bound is the following proposition, whose proof is postponed.

Proposition 1.
\[
\lim_{j \to \infty} \frac{C_{n_j}}{n_j^{1 - 1/s}} \log P_\omega(\tau_{n_j} > n_j \mu) = -\infty.
\tag{11}
\]

Assuming the proposition holds true, let us show how to complete the proof of the upper bound (4). Note that, for $j$ large,
\[
\frac{n_{j+1}}{n_j} \le \frac{(j+1)^{2/\delta} + 1}{j^{2/\delta} - 1} \xrightarrow[j \to \infty]{} 1.
\]
Let $j_n$ be such that $n_{j_n} \le n < n_{j_n + 1}$. Then, for any $n$,
\[
P_\omega(\tau_n > n\mu) \le P_\omega(\tau_{n_{j_n+1}} > n_{j_n} \mu) = P_\omega(\tau_{n_{j_n+1}} > n_{j_n+1} \mu(n)),
\]
where $\mu(n) = \mu n_{j_n} / n_{j_n+1}$. Let $N$ be large such that $\inf_{n \ge N} \mu n_{j_n} / n_{j_n+1} > \mu_\alpha$, and consider only $n > N$. One concludes from Proposition 1 that for all $\delta > 0$, $P$-a.s.,
\[
\limsup_{n \to \infty} \frac{1}{n^{1 - \frac{1}{s} - \delta}} \log P_\omega(\tau_n > n\mu) = -\infty.
\tag{12}
\]
To prove (4), let $v < v' < v_\alpha$ and define $L^{[nv']} = \max\{[nv'] - X_k^{[nv']} ;\ k \ge 0\}$. Then,
\[
P_\omega(X_n < nv) \le P_\omega(\tau_{[nv']} > n) + P_\omega(L^{[nv']} \ge [nv'] - nv).
\tag{13}
\]
By Lemma 2,
\[
P(L^{[nv']} \ge [nv'] - nv) = E\big(P_\omega(L^{[nv']} \ge [nv'] - nv)\big) \le \frac{\langle \rho \rangle^{[nv'] - [nv] - 1}}{1 - \langle \rho \rangle}.
\]
Hence, one may find some $\varepsilon > 0$, $\theta > 0$ such that
\[
P\big(P_\omega(L^{[nv']} \ge [nv'] - nv) \ge e^{-\varepsilon n}\big) \le e^{-\theta n}.
\]
Applying now the Borel–Cantelli lemma, one concludes that $P$-a.s.,
\[
\limsup_{n \to \infty} \frac{1}{n} \log P_\omega(L^{[nv']} \ge [nv'] - nv) < -\varepsilon < 0.
\tag{14}
\]

(4) follows from (13), (14) and (12).

As mentioned before, the proof of the lower bounds (5) and (6) follows the ideas of [2] (see in particular Remark 4, p. 682). Indeed, it is already explained there why, for any $\delta > 0$,
\[
\liminf_{n \to \infty} \frac{1}{n^{1 - 1/s + \delta}} \log P_\omega\Big(\frac{X_n}{n} < v\Big) = 0.
\]
In order to see the refined estimate in (6), we recall the following notations from [2]. Let
\[
R_k(m) = \frac{1}{k} \sum_{i = m+1}^{m+k} \log \rho(i).
\]
Define $\tau_k^x = \inf\{t : X_t^x = k + x\}$ and $\bar{\tau}_k^x = \inf\{t : \bar{X}_t^x = k + x\}$, where $\bar{X}_t^x$ is the RWRE with $\omega(x) = 1$, initiated at $x$. It follows from Lemma 1 that
\[
P_\omega(\tau_{k+1}^x \ge n) \ge P_\omega(\bar{\tau}_{k+1}^x \ge n) \ge \big(1 - e^{-k R_k(x)}\big)^n.
\tag{15}
\]
For $n = 1, 2, \ldots$, define
\[
M_n(x) = \max_{\substack{x \le m \le x+n \\ k \le x+n-m}} k R_k(m).
\]
In particular, it follows from (15) that for any $c > 0$ and $l = [n/c]$,
\[
P_\omega(\tau_{l+1}^x \ge n) \ge P_\omega(\bar{\tau}_{l+1}^x \ge n) \ge \big(1 - e^{-M_l(x)}\big)^n.
\tag{16}
\]
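The quantity $M_l(x)$ is a maximal segmental score: the largest sum of $\log \rho(i)$ over a contiguous run of sites inside the window. A minimal sketch of how it, and the logarithm of the right-hand side of (16), could be evaluated for a given environment follows. It assumes the list `log_rho` holds $\log \rho(i)$ for the sites $i = x+1, \ldots, x+l$ and that only non-empty segments ($k \ge 1$) are considered; the function names are illustrative and not taken from the paper.

```python
import math


def max_segment_score(log_rho):
    """Largest sum of log_rho over a non-empty contiguous segment.

    Uses the standard running-maximum (Kadane) recursion, linear in the
    window length.
    """
    best = current = log_rho[0]
    for value in log_rho[1:]:
        current = max(value, current + value)
        best = max(best, current)
    return best


def log_quenched_bound(log_rho, n):
    """log of (1 - exp(-M))^n, the right-hand side of (16).

    Returns -inf when M <= 0, in which case the bound is trivial.
    """
    m = max_segment_score(log_rho)
    if m <= 0:
        return float("-inf")
    return n * math.log1p(-math.exp(-m))


if __name__ == "__main__":
    window = [-1.0, 2.0, 3.0, -4.0, 1.0]
    print(max_segment_score(window))        # best segment is [2.0, 3.0]
    print(log_quenched_bound(window, 10))
```

A long run of sites with $\rho > 1$ produces a large score, which is the "trap" responsible for the slowdown of the walk.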

We recall the following exceedence bounds, due to Iglehart. For this version, see [5], Theorem A.



Lemma 3. There exist constants $K_1$, $K_2$, such that for any $z \in \mathbb{R}$,
\[
\exp(-K_1 \exp(-sz))
\le \liminf_{l \to \infty} P\Big(M_l(x) - \frac{\log l}{s} \le z\Big)
\le \limsup_{l \to \infty} P\Big(M_l(x) - \frac{\log l}{s} \le z\Big)
\le \exp(-K_2 \exp(-sz)).
\]
A corollary of Lemma 3 and (16) (taking $y = e^z$) is the following:

Lemma 4. For any $y > 0$ there exists a $c_y > 0$ such that, for any $v' < v_\alpha$,
\[
\liminf_{n \to \infty} P\Big(P_\omega(\tau_{[nv']}^x \ge n) \ge e^{-\frac{n^{1 - 1/s}}{y (v')^{1/s}}}\Big) \ge c_y,
\]
and the convergence is uniform in $x$.

Equipped with Lemma 4, we have completed all the preliminaries required for proving (6). Indeed, fix $y > 0$, and let $n_k = 2^{2^k}$. Note that
\[
\begin{aligned}
\limsup_{n \to \infty} \frac{\log P_\omega(X_n \le nv)}{n^{1 - 1/s}}
&\ge \limsup_{k \to \infty} \frac{\log P_\omega(X_{n_k} \le n_k v)}{n_k^{1 - 1/s}} \\
&\ge \limsup_{k \to \infty} \frac{\log P_\omega(\tau_{[n_k v]}^{0} \ge n_k)}{n_k^{1 - 1/s}} \\
&\ge \limsup_{k \to \infty} \frac{\log P_\omega(\tau_{[n_k v] - n_{k-1}}^{n_{k-1}} \ge n_k)}{n_k^{1 - 1/s}} \\
&\ge \limsup_{k \to \infty} \frac{\log P_\omega(\tau_{[n_k v']}^{n_{k-1}} \ge n_k)}{n_k^{1 - 1/s}},
\end{aligned}
\]
where $v' = v - \varepsilon$ for arbitrary $\varepsilon$. By Lemma 4 and the Borel–Cantelli lemma, for any $z > 0$,
\[
P_\omega(\tau_{[n_k v']}^{n_{k-1}} \ge n_k) \ge e^{-n_k^{1 - 1/s}/z}
\]
infinitely often. The conclusion follows by taking $z \to \infty$. This completes the proof of Theorem 2, except that we still have to show Proposition 1.

Proof of Proposition 1. Let $k = k_j = \dfrac{C_{n_j} n_j^{1/s}}{1 - \varepsilon}$ for some $1 > \varepsilon > 0$. For $X_n^x$ the RWRE started at $x$, recall that
\[
T_k^{(i)} = \inf\{t > 0 : X_t^{ik} = (i+1)k\}, \qquad i = 0, \pm 1, \ldots.
\]
By slight abuse of notation, we continue to use $P_\omega$ for the quenched law of the $\{X_n^x\}$. Finally, let $b_n = C_n^{-\delta}$ and $I_j = \big\{-\big[\tfrac{n_j}{k_j}\big] - 1, \ldots, \big[\tfrac{n_j}{k_j}\big] + 1\big\}$. Fix $\mu' > \mu$.



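Lemma 5. For $P$-a.e. $\omega$, there exists a $J_0(\omega)$ such that for all $j > J_0(\omega)$, and all $i \in I_j$,
\[
P_\omega\Big(\frac{T_{k_j}^{(i)}}{k_j} > \mu'\Big) \le b_{n_j}.
\]

Proof of Lemma 5. By Chebycheff's bound,
\[
P\Big(P_\omega\Big(\frac{T_{k_j}^{(i)}}{k_j} > \mu'\Big) > b_{n_j}\Big)
\le \frac{1}{b_{n_j}} P\Big(\frac{T_{k_j}^{(i)}}{k_j} > \mu'\Big)
\le \frac{1}{b_{n_j}} k_j^{1 - s + o(1)},
\]
where the last inequality follows from Theorem 1(a), and $o(1) \to 0$ as $j \to \infty$. Hence,
\[
P\Big(P_\omega\Big(\frac{T_{k_j}^{(i)}}{k_j} > \mu'\Big) > b_{n_j} \text{ for some } i \in I_j\Big)
\le 3 \Big[\frac{n_j}{k_j}\Big] \cdot \frac{1}{b_{n_j}} k_j^{1 - s + o(1)}
\le \frac{3}{n_j^{\delta(s - o(1) - \delta)}},
\]
and the conclusion follows from the Borel–Cantelli lemma.

Let $0 < \theta$ [...] $\{t > 0 : X_t^{ik} = (i+1)k \text{ or } X_t^{ik} = (i-1)k\}$.

Lemma 6. For $P$-a.e. $\omega$, there is a $J_1(\omega)$ such that for all $j \ge J_1(\omega)$,
\[
P_\omega\big(\bar{T}_{k_j}^{(i)} \ne T_{k_j}^{(i)}, \text{ some } i \in I_j\big) \le d_{n_j}^{\theta}.
\]

Proof of Lemma 6. Again, we use the Chebycheff bound:
\[
\begin{aligned}
P\Big(P_\omega\big(\bar{T}_{k_j}^{(i)} \ne T_{k_j}^{(i)}, \text{ some } i \in I_j\big) > d_{n_j}^{\theta}\Big)
&\le \frac{1}{d_{n_j}^{\theta}} \cdot \frac{3 n_j}{k_j} P\big(\bar{T}_{k_j}^{(0)} \ne T_{k_j}^{(0)}\big) \\
&\le \frac{1}{d_{n_j}^{\theta}} \cdot \frac{3 n_j}{k_j} \cdot \frac{\langle \rho \rangle^{k_j}}{1 - \langle \rho \rangle} \\
&\le \frac{3}{1 - \langle \rho \rangle}\, n_j^{1 - \frac{1}{s} - \delta} \exp\Big(n_j^{\frac{1}{s} + \delta} \Big(\frac{\log \langle \rho \rangle}{1 - \varepsilon} + \theta\Big)\Big),
\end{aligned}
\]
where the second inequality follows from Lemma 2. The conclusion follows from the Borel–Cantelli lemma.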

where the second inequality follows from Lemma 2. The conclusion follows from the Borel–Cantelli lemma. We actually need to iterate the estimates of Lemma 5.



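Lemma 7. For $P$-a.e. $\omega$, for all $j > J_0(\omega)$, and each $i \in I_j$, and for $x \ge 1$,
\[
P_\omega\Big(\frac{\bar{T}_{k_j}^{(i)}}{k_j} > \mu' x\Big) \le (2 b_{n_j})^{[x/2] \vee 1}.
\]

Proof of Lemma 7. For $1 \le x < 4$, the claim follows from Lemma 5. Assume thus that $x \ge 4$. Then,
\[
\begin{aligned}
P_\omega\Big(\frac{\bar{T}_{k_j}^{(i)}}{k_j} > \mu' x\Big)
\le P_\omega\Big(&\frac{\bar{T}_{k_j}^{(i)}}{k_j} > \mu'(x - 2),\ (i-1)k_j < X_{[\mu' k_j (x-2)] + 1}^{ik_j} < (i+1)k_j, \\
&\min\{t : t \ge [\mu' k_j (x-2)] + 2,\ X_t^{ik_j} = (i+1)k_j\} \ge x \mu' k_j\Big).
\end{aligned}
\]
Hence, by the Markov property,
\[
\begin{aligned}
P_\omega\Big(\frac{\bar{T}_{k_j}^{(i)}}{k_j} > \mu' x\Big)
&\le P_\omega\Big(\frac{\bar{T}_{k_j}^{(i)}}{k_j} > \mu'(x - 2)\Big) \times \sup_{(i-1)k_j < y < (i+1)k_j} P_\omega\big(\min\{t : X_t^{y} = (i+1)k_j\} > 2 \mu' k_j\big) \\
&\le P_\omega\Big(\frac{\bar{T}_{k_j}^{(i)}}{k_j} > \mu'(x - 2)\Big) \Big[P_\omega\big(T_{k_j}^{(i)} > \mu' k_j\big) + P_\omega\big(T_{k_j}^{(i-1)} > \mu' k_j\big)\Big] \\
&\le 2 b_{n_j} P_\omega\Big(\frac{\bar{T}_{k_j}^{(i)}}{k_j} > \mu'(x - 2)\Big),
\end{aligned}
\]
where the last inequality is a consequence of Lemma 5. The lemma follows by induction.

We need one more preliminary computation related to the bounds in Lemma 7. Let $\{Z_{k_j}^{(i)}\}$, $i = 1, 2, \ldots$ denote a sequence of i.i.d. positive random variables, with
\[
P\Big(\frac{Z_{k_j}^{(i)}}{k_j} < \mu'\Big) = 0, \qquad
P\Big(\frac{Z_{k_j}^{(i)}}{k_j} > \mu' x\Big) = (2 b_{n_j})^{[x/2] \vee 1}, \qquad x \ge 1.
\]

Lemma 8. For any $\lambda > 0$, and any $\varepsilon > 0$,
\[
E \exp\Big(\lambda \frac{Z_{k_j}^{(i)}}{k_j}\Big) \le e^{\lambda \mu' (1 + \varepsilon)} + g_j,
\]
where $g_j \to 0$ as $j \to \infty$.

Proof of Lemma 8.
\[
\begin{aligned}
E \exp\Big(\lambda \frac{Z_{k_j}^{(i)}}{k_j}\Big)
&= \int_0^{\infty} P\Big(\frac{Z_{k_j}^{(i)}}{k_j} > \frac{\log u}{\lambda}\Big)\, du \\
&\le e^{\lambda \mu' (1 + \varepsilon)} + \int_{e^{\lambda \mu' (1 + \varepsilon)}}^{\infty} (2 b_{n_j})^{\frac{\log u}{2 \lambda \mu' (1 + \varepsilon)} \vee 1}\, du
= e^{\lambda \mu' (1 + \varepsilon)} + g_j,
\end{aligned}
\]
where $g_j \to 0$ as $j \to \infty$.

In order to control the number of repetitions of visits to $k_j$-blocks, we introduce an auxiliary random walk. Let $S_t$, $t = 0, 1, \ldots$, denote a simple random walk with $S_0 = 0$ and
\[
P(S_{t+1} = S_t + 1 \mid S_t) = 1 - P(S_{t+1} = S_t - 1 \mid S_t) = 1 - d_n^{\theta}.
\]
Set
\[
M_{n_j} = \frac{1}{C_{n_j}} n_j^{1 - \frac{1}{s}}.
\]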

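Lemma 9. For $\theta$ as in Lemma 6, and $n$ large enough,
\[
P\Big(\inf\Big\{t : S_t = \Big[\frac{n_j}{k_j}\Big]\Big\} > M_{n_j}\Big) \le \exp\Big(-\frac{\theta \varepsilon}{2} n_j\Big).
\]

Proof of Lemma 9.
\[
P\Big(\inf\Big\{t : S_t = \Big[\frac{n_j}{k_j}\Big]\Big\} > M_{n_j}\Big)
\le P\Big(\frac{S_{[M_{n_j}]}}{M_{n_j}} < \frac{n_j}{k_j M_{n_j}}\Big)
= P\Big(\frac{S_{[M_{n_j}]}}{M_{n_j}} < 1 - \varepsilon\Big)
\le 2 e^{-M_{n_j} h_{n_j}(1 - \varepsilon)},
\]
where the last inequality is a consequence of Cramér's theorem (cf. [3]), and the fact that $d_n^{\theta} < \varepsilon$. Here,
\[
h_n(1 - x) = (1 - x) \log \frac{1 - x}{1 - d_n^{\theta}} + x \log \frac{x}{d_n^{\theta}}.
\]
Using $h_n(1 - x) \ge -\frac{2}{e} - x \log d_n^{\theta}$, we get
\[
P\Big(\frac{S_{[M_{n_j}]}}{M_{n_j}} < 1 - \varepsilon\Big)
\le 2 e^{2 M_{n_j}/e}\, e^{\varepsilon M_{n_j} \log d_{n_j}^{\theta}}
\le e^{-\frac{\varepsilon}{2} \theta n_j}.
\]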

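We are now ready to prove (11). Note that, for all $j > J_0(\omega)$, and all $i \in I_j$, we may, due to Lemma 7, construct $\{Z_{k_j}^{(i)}\}$ and $\{\bar{T}_{k_j}^{(i)}\}$ on the same probability space such that
\[
P_\omega\big(Z_{k_j}^{(i)} \ge \bar{T}_{k_j}^{(i)} \ \ \forall\, i \in I_j\big) = 1.
\]
Fix $\mu_\alpha < \mu' < \mu$ and $\varepsilon > 0$ small enough. Recalling that, under $P_\omega$, the $\bar{T}_{k_j}^{(i)}$ are independent, we obtain, with $\{S_t\}$ defined before Lemma 9, and $j$ large enough,
\[
\begin{aligned}
P_\omega(\tau_{n_j} > n_j \mu)
&\le P\Big(\inf\Big\{t : S_t = \Big[\frac{n_j}{k_j}\Big]\Big\} > M_{n_j}\Big) + P\Big(\sum_{i=1}^{M_{n_j}} Z_{k_j}^{(i)} > n_j \mu\Big) \\
&\le e^{-\theta \varepsilon n_j / 2} + P\Big(\frac{1}{M_{n_j}} \sum_{i=1}^{M_{n_j}} \frac{Z_{k_j}^{(i)}}{k_j} > \mu (1 - \varepsilon)\Big) \\
&\le e^{-\theta \varepsilon n_j / 2} + \Big[E \exp\Big(\lambda \frac{Z_{k_j}^{(i)}}{k_j}\Big) \cdot e^{-\lambda \mu (1 - \varepsilon)}\Big]^{M_{n_j}} \\
&\le e^{-\theta \varepsilon n_j / 2} + \Big[e^{\lambda(\mu' + 2 \varepsilon \mu - \mu)} + g_j e^{-\lambda \mu (1 - \varepsilon)}\Big]^{M_{n_j}} \\
&\le e^{-\theta \varepsilon n_j / 2} + e^{-\lambda \varepsilon \mu M_{n_j}},
\end{aligned}
\]
where Lemma 9 was used in the second inequality and Lemma 8 in the fourth. Since $\lambda > 0$ is arbitrary, (11) follows.

Proof of Theorem 3. We begin by giving a quick sketch of the lower bound in (7), based on [2]. By the Erdős–Rényi strong law for the longest run of heads (or the asymptotics for long rare segments in random walks, see e.g., [3, p. 69]), there is a segment $I = (i_{\min}, i_{\max})$, with $i_{\min} \ge n(v - \varepsilon)$, $i_{\max} < nv$ and $i_{\max} - i_{\min} = \log n / (-\log \alpha(\{1/2\}))\,(1 + o(1))$, such that $\omega_i = 1/2$ for $i \in I$. Let $\tilde{X}_n$ denote the RWRE started at $(i_{\min} + i_{\max})/2$. Let $\tau = \min\{t : \tilde{X}_t = i_{\min} \text{ or } \tilde{X}_t = i_{\max}\}$. Then, $\tau$ possesses the same law as the exit time, denoted $\bar{\tau}$, of the simple symmetric random walk from the interval $[-(i_{\max} - i_{\min})/2, (i_{\max} - i_{\min})/2]$. As before, we let $\tau_k = \min\{t : X_t = k\}$. We have,
\[
\begin{aligned}
P_\omega(X_n < nv)
&\ge P_\omega\Big(\tau_{n(v - \varepsilon)} \ge n \frac{v - 2\varepsilon}{v_\alpha}\Big) P_\omega\Big(\tau > n \Big(1 - \frac{v}{v_\alpha} + \frac{2\varepsilon}{v_\alpha}\Big)\Big) \\
&= P_\omega\Big(\tau_{n(v - \varepsilon)} \ge n \frac{v - 2\varepsilon}{v_\alpha}\Big) P\Big(\bar{\tau} > n \Big(1 - \frac{v}{v_\alpha} + \frac{2\varepsilon}{v_\alpha}\Big)\Big).
\end{aligned}
\tag{17}
\]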

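By Solomon's law of large numbers, cf. (2),
\[
\lim_{n \to \infty} P_\omega\Big(\tau_{n(v - \varepsilon)} \ge n \frac{v - 2\varepsilon}{v_\alpha}\Big) = 1.
\tag{18}
\]
By standard eigenvalue estimates for the simple random walk (cf. [8, p. 243]),
\[
\lim_{n \to \infty} \frac{(\log n)^2}{n \big(1 - \frac{v}{v_\alpha} - \frac{2\varepsilon}{v_\alpha}\big) (\log \alpha(1/2))^2} \log P(\bar{\tau} > n) = -\pi^2 / 8.
\tag{19}
\]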

Combining (19), (17), and (18), the lower bound in (7) follows.

The proof of the upper bound in (7) follows the proof of part 1 of Theorem 2, except that there is no need for subsequences here. With $\mu = v^{-1} > v_\alpha^{-1} = \mu_\alpha$ and $t \in (0, 1)$, define $\bar{\mu} = t \mu_\alpha + (1 - t) \mu$. Fix $1/2 > \varepsilon > 0$, $\delta > 2$, $b_n = n^{-(\delta/2)}$ and
\[
k = k(n) := \frac{(\log n)^3 (1 + \delta)^3}{C_2^3 (\bar{\mu} - \mu_\alpha)(1 - \varepsilon)^3},
\]
where $C_2$ was defined in Theorem 1. We define $I_n = \big\{-\big[\tfrac{n}{k}\big] - 1, \ldots, \big[\tfrac{n}{k}\big] + 1\big\}$, and use $T_k^{(i)}$ as in (9). Then, following the outline of the proof of Lemma 5,
\[
P\big(P_\omega(T_k^{(i)} > \bar{\mu} k) > b_n\big) \le \frac{\exp\big(-C_2 (\bar{\mu} - \mu_\alpha)^{1/3} k^{1/3} (1 - \varepsilon)\big)}{b_n},
\tag{20}
\]
where we have used the bound
\[
P(T_k^{(i)} > \bar{\mu} k) \le \exp\big(-k^{1/3} C_2 (\bar{\mu} - \mu_\alpha)^{1/3}\big),
\]
which follows from Theorem 1 using the inequalities
\[
P(T_k^{(i)} > \bar{\mu} k) \le P(X_{[\bar{\mu} k]} < k) \le P\big(X_{[\bar{\mu} k]} < ([\bar{\mu} k] + 1)/\bar{\mu}\big).
\]
Thus, by the Borel–Cantelli lemma, for $P$-a.e. $\omega$, there exists an $N_0(\omega)$ such that for all $n > N_0(\omega)$,
\[
P_\omega(T_k^{(i)} > \bar{\mu} k, \text{ some } i \in I_n) \le b_n.
\tag{21}
\]

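Define $\bar{T}_k^{(i)}$ as in (10). Set $0 < \gamma < (1 + \delta)^3 |\log \langle \rho \rangle| / \big(C_2^3 (\bar{\mu} - \mu_\alpha)\big)$. With $d_n = \exp(-\gamma (\log n)^3)$, the Borel–Cantelli lemma yields, as in the proof of Lemma 6, that for $P$-a.e. $\omega$, there exists an $N_1(\omega)$ such that for $n \ge N_1(\omega)$,
\[
P_\omega(T_k^{(i)} \ne \bar{T}_k^{(i)}, \text{ some } i \in I_n) < d_n.
\tag{22}
\]
Using (21), one concludes as in Lemma 7 that for $P$-a.e. $\omega$, for $n > N_0(\omega)$, and each $i \in I_n$,
\[
P_\omega(\bar{T}_k^{(i)} > k \bar{\mu} x) \le (2 b_n)^{[x/2] \vee 1}.
\tag{23}
\]
Let $Z_k^{(i)}$, $i = 1, 2, \ldots$ denote a sequence of positive, i.i.d. random variables with
\[
P\Big(\frac{Z_k^{(i)}}{k} < \bar{\mu}\Big) = 0, \qquad
P\Big(\frac{Z_k^{(i)}}{k} > \bar{\mu} x\Big) = (2 b_n)^{[x/2] \vee 1}, \qquad x \ge 1.
\]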

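The following lemma takes the place of Lemma 8 in the proof of Theorem 2:

Lemma 10. For each $\varepsilon' > 0$, we have, for $\lambda_n = -\log(2 b_n) / \big(2 \bar{\mu} (1 + \varepsilon')\big)$,
\[
E \exp\big(\lambda_n Z_k^{(i)} / k\big) \le e^{\lambda_n \bar{\mu}} + g_n,
\]
where $g_n \to 0$ as $n \to \infty$.

Proof of Lemma 10. Exactly as in the course of the proof of Lemma 8, for $n$ large enough,
\[
E \exp\big(\lambda_n Z_k^{(i)} / k\big)
= \int_0^{\infty} P\Big(\frac{Z_k^{(i)}}{k} > \frac{\log u}{\lambda_n}\Big)\, du
\le e^{\lambda_n \bar{\mu}} + \int_{e^{\lambda_n \bar{\mu}}}^{\infty} (2 b_n)^{\frac{\log u}{2 \lambda_n \bar{\mu}}}\, du
= e^{\lambda_n \bar{\mu}} + g_n,
\]
where
\[
g_n = \int_{e^{\lambda_n \bar{\mu}}}^{\infty} u^{(\log 2 b_n)/(2 \lambda_n \bar{\mu})}\, du
= \int_{e^{\lambda_n \bar{\mu}}}^{\infty} u^{-(1 + \varepsilon')}\, du \xrightarrow[n \to \infty]{} 0.
\]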

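Let $S_t$, $t = 0, 1, \ldots$, denote the simple random walk with $S_0 = 0$ and $P(S_{t+1} = S_t + 1 \mid S_t) = 1 - P(S_{t+1} = S_t - 1 \mid S_t) = 1 - d_n$, and let
\[
M_n = \frac{n C_2^3 (\bar{\mu} - \mu_\alpha)(1 - \varepsilon)^2}{(\log n)^3 (1 + \delta)^3}.
\]
Mimicking the proof in Lemma 9, we obtain that
\[
P(\inf\{t : S_t = [n/k]\} > M_n) \le \exp(-n \theta \varepsilon),
\tag{24}
\]
where $\theta = \gamma C_2^3 (\bar{\mu} - \mu_\alpha)(1 - \varepsilon)^2 / (3 (1 + \delta)^3)$. Following the proof of Theorem 2, we have
\[
\begin{aligned}
P_\omega[\tau_n > n\mu]
&\le P\Big(\inf\Big\{t : S_t = \Big[\frac{n}{k}\Big]\Big\} > M_n\Big) + P\Big(\sum_{i=1}^{M_n} Z_k^{(i)} > n\mu\Big) \\
&\le e^{-n \theta \varepsilon} + P\Big(\frac{1}{M_n} \sum_{i=1}^{M_n} \frac{Z_k^{(i)}}{k} > \mu (1 - \varepsilon)\Big) \\
&\le e^{-n \theta \varepsilon} + \Big[E \exp\big(\lambda_n Z_k^{(i)} / k\big)\, e^{-\lambda_n \mu (1 - \varepsilon)}\Big]^{M_n} \\
&\le e^{-n \theta \varepsilon} + e^{-\lambda_n M_n (\mu (1 - \varepsilon) - \bar{\mu} - \varepsilon)},
\end{aligned}
\]
where the second inequality is due to (24) and the last due to Lemma 10. Plug in the definition of $M_n$ and $\lambda_n$ to get
\[
\limsup_{n \to \infty} \frac{(\log n)^2}{n} \log P_\omega(\tau_n > n\mu)
\le -\frac{C_2^3\, \frac{\delta}{2}\, (\bar{\mu} - \mu_\alpha)(1 - \varepsilon)^2 \big(\mu (1 - \varepsilon) - \bar{\mu} - \varepsilon\big)}{2 (1 + \delta)^3 \bar{\mu} (1 + \varepsilon')}.
\]
Letting $\varepsilon$ and $\varepsilon' \to 0$ and $\delta \to 2$, one gets
\[
\limsup_{n \to \infty} \frac{(\log n)^2}{n} \log P_\omega(\tau_n > n\mu)
\le -C_2^3 (\bar{\mu} - \mu_\alpha) \frac{\mu - \bar{\mu}}{2 \cdot 3^3\, \bar{\mu}}
= -C_2^3 \frac{1}{2 \cdot 3^3} \frac{t (1 - t)}{(1 - t)\mu + t \mu_\alpha} (\mu - \mu_\alpha)^2,
\tag{25}
\]
where we used the definition of $\bar{\mu}$ in the last equality. Optimizing over $t \in (0, 1)$ yields
\[
\limsup_{n \to \infty} \frac{(\log n)^2}{n} \log P_\omega(\tau_n > n\mu)
\le -C_2^3 \frac{1}{2 \cdot 3^3} (\mu - \mu_\alpha)^2 \frac{1}{(\sqrt{\mu} + \sqrt{\mu_\alpha})^2}.
\]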
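The optimization over $t$ can be checked by hand. The following short verification is a sketch; the abbreviations $a = \mu$, $b = \mu_\alpha$ and the function $f$ are introduced here only for illustration.

```latex
% Maximize f(t) = t(1-t) / ((1-t)a + tb) over t in (0,1), with a = \mu > b = \mu_\alpha.
\[
f(t) = \frac{t(1-t)}{a - t(a-b)}, \qquad
f'(t) = 0 \iff (a-b)\,t^2 - 2a\,t + a = 0 .
\]
% The root lying in (0,1) is
\[
t^* = \frac{a - \sqrt{ab}}{a - b} = \frac{\sqrt{a}}{\sqrt{a} + \sqrt{b}}, \qquad
1 - t^* = \frac{\sqrt{b}}{\sqrt{a} + \sqrt{b}} .
\]
% At t^* the denominator simplifies:
\[
(1 - t^*)\,a + t^*\,b = \frac{a\sqrt{b} + b\sqrt{a}}{\sqrt{a} + \sqrt{b}} = \sqrt{ab},
\qquad\text{so}\qquad
f(t^*) = \frac{\sqrt{ab}}{(\sqrt{a} + \sqrt{b})^2} \cdot \frac{1}{\sqrt{ab}}
       = \frac{1}{(\sqrt{a} + \sqrt{b})^2} .
\]
```

Substituting $f(t^*)$ for the factor $t(1-t)/((1-t)\mu + t\mu_\alpha)$ in (25) gives the stated bound with $1/(\sqrt{\mu} + \sqrt{\mu_\alpha})^2$.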

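To prove the upper bound in (7), observe that for $v < v' < v_\alpha$, by the same argument as in (14),
\[
\begin{aligned}
\limsup_{n \to \infty} \frac{(\log n)^2}{n} \log P_\omega\Big(\frac{X_n}{n} < v\Big)
&\le \limsup_{n \to \infty} \frac{(\log n)^2}{n} \log P_\omega\Big(\tau_{[nv']} > [nv'] \frac{1}{v'}\Big) \\
&= \limsup_{n \to \infty} v' \frac{(\log [nv'])^2}{[nv']} \log P_\omega\Big(\tau_{[nv']} > [nv'] \frac{1}{v'}\Big) \\
&\le -C_2^3 \frac{1}{2 \cdot 3^3}\, v' \Big(\frac{1}{v'} - \frac{1}{v_\alpha}\Big)^2 \frac{1}{\big(\frac{1}{\sqrt{v'}} + \frac{1}{\sqrt{v_\alpha}}\big)^2} \\
&\le -C_2^3 \frac{1}{2 \cdot 3^3} \Big(1 - \frac{v'}{v_\alpha}\Big)^2 \frac{v_\alpha}{(\sqrt{v'} + \sqrt{v_\alpha})^2}.
\end{aligned}
\]
Letting $v' \to v$, and using $v_\alpha / (\sqrt{v} + \sqrt{v_\alpha})^2 \ge 1/4$, we get
\[
\limsup_{n \to \infty} \frac{(\log n)^2}{n} \log P_\omega\Big(\frac{X_n}{n} < v\Big) \le -C_2^3 \frac{1}{8 \cdot 3^3} \Big(1 - \frac{v}{v_\alpha}\Big)^2,
\tag{26}
\]
completing the proof of the upper bound in (7).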

Remark. Even when one uses the results of [6] and replaces $C_2$ by $C_1$ in the right hand side of (26), the behaviour of the exponent in the upper bound is quadratic in $(v_\alpha - v)$, which is far from the linear behaviour exhibited by the exponent of the corresponding lower bound. While the constant in the upper bound can be slightly further improved (e.g., by using subsequences in the proof), it seems that a new approach is needed to completely close the gap.

Added in proof. A. Pisztora and T. Povel have recently succeeded in closing the gap mentioned above, and established that the lower bound in (7) captures the right asymptotic behaviour.

References

1. Chung, K.L.: Markov chains with stationary transition probabilities. Berlin: Springer, 1960
2. Dembo, A., Peres, Y., Zeitouni, O.: Tail estimates for one-dimensional random walk in random environment. Commun. Math. Phys. 181, 667–683 (1996)
3. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. Boston: Jones and Bartlett, 1993
4. Greven, A., den Hollander, F.: Large deviations for a random walk in random environment. Ann. Probab. 22, 1381–1428 (1994)
5. Karlin, S., Dembo, A.: Limit distributions of maximal segmental score among Markov dependent partial sums. Adv. in Appl. Prob. 24, 113–140 (1992)
6. Pisztora, A., Povel, T., Zeitouni, O.: Precise large deviations estimates for one-dimensional random walk in random environment. Submitted
7. Solomon, F.: Random walks in random environment. Ann. Probab. 3, 1–31 (1975)
8. Spitzer, F.: Principles of random walk. Berlin: Springer, 1976

Communicated by Ya. G. Sinai

Commun. Math. Phys. 194, 191 – 205 (1998)



The Riemann Problem for Pressureless Fluid Dynamics with Distribution Solutions in Colombeau's Sense

Jiaxin Hu

The Young Scientist Laboratory of Mathematical Physics, Wuhan Institute of Physics and Mathematics, Academia Sinica, Wuhan 430071, P.O. Box 71010, P.R. China

Received: 2 May 1997 / Accepted: 6 October 1997

Abstract: The Riemann problem for the equations of pressureless fluid dynamics was considered. Solutions of this problem were constructed by employing the viscosity vanishing approach. For some initial data, solutions showed high singularity around the shock waves. A new mathematical theory of generalized functions initiated by J.F. Colombeau was applied to dealing with the multiplication of singular distributions. As a byproduct, the entropy condition was obtained for singular distribution solutions.

1. Introduction

We are concerned with the one-dimensional equations of pressureless fluid dynamics of the form
\[
\begin{cases}
\rho_t + (\rho u)_x = 0, \\
(\rho u)_t + (\rho u^2)_x = 0,
\end{cases}
\qquad (x, t) \in \mathbb{R} \times \mathbb{R}_+,
\tag{1.1}
\]
with initial data
\[
(\rho, u)|_{t=0} = (\rho_0(x), u_0(x)) =
\begin{cases}
(\rho_2, u_2), & x > 0, \\
(\rho_1, u_1), & \text{otherwise},
\end{cases}
\qquad \rho_1 \ge 0,\ \rho_2 \ge 0.
\tag{1.2}
\]
We call (1.1), (1.2) a Riemann problem. System (1.1) is a special form of the one-dimensional fluid dynamics equations. We recall that the equations of fluid dynamics in Eulerian coordinates read
\[
\rho_t + (\rho u)_x = 0 \quad \text{(conservation of mass)},
\tag{1.3}
\]
\[
(\rho u)_t + (\rho u^2 + p)_x = 0 \quad \text{(conservation of momentum)},
\tag{1.4}
\]
where $\rho$ and $u$ stand for the density and the velocity of the fluid, respectively, while $p$ denotes the pressure [1]. The density $\rho$ is nonnegative; the regions in the physical space where


ρ = 0 are identified with vacuum regions of the flow. As we know, a flow is formed by two kinds of effects; the effect of inertia and the effect of pressure difference. If we neglect the effect of pressure difference in (1.4)(that is, the pressure p is constant), (1.3) and (1.4) are reduced to (1.1). The system (1.1) also describes other important physical phenomena, see [10, 26] and references therein. System (1.1) has duplicate eigenvalues λ = u with corresponding right eigenvectors r = (a, 0)T , a being an arbitrary real number, and so 5λ.r = 0. Thus (1.1) is nonstrictly hyperbolic and linearly degenerate. We recall that in general, classical Riemann solutions lie in L∞ loc (R × R+ ), the space of locally bounded functions (cf. [18]). In the present situation, however, one finds that no classical weak solutions exist for some initial data, and the introduction of linear functionals on C0∞ (R × R+ ) into Riemann solutions is found necessary. In other words, Riemann solutions can be viewed as Schwartz generalized functions. At this time, there arises a concurrent problem of how to define the product of two Schwartz generalized functions. According to L. Schwartz’s theory, it is impossible to define the multiplication of arbitrary Schwartz generalized functions since the space of all Schwartz generalized functions is not an algebraic one [16]. To overcome this difficulty, many people have done interesting works [12, 22, 23]. In particular, Colombeau initiated a new algebraic space G(Rn ) of generalized functions, which is the extention of Schwartz generalized function space and allows us to define the multiplication of arbitrary distributions [2, 3]. This new idea was independently introduced by E.E. Rosinger [15], and developed by Oberguggenberger [13,14], Colombeau, A.Y.Loux [4, 5, 6], Todorov [20] and Egorov [9] et al. 
In the present paper, we borrow this new theory to cope with the product of distributions appearing in the non-classical Riemann solutions to (1.1), (1.2). The program of this paper is as follows. In Sect. 2, for the reader's convenience, we give a glimpse of Colombeau's theory of generalized functions and then explain in what sense (1.1), (1.2) hold in the framework of Colombeau's theory (cf. (2.3), (2.4)). In Sect. 3 we use the vanishing viscosity approach, first introduced by Dafermos [7] and Tupciev [21], to show that the viscosity regularized problem
\[ \begin{cases} \rho_t + (\rho u)_x = \mu t \rho_{xx}, \\ (\rho u)_t + (\rho u^2)_x = \mu t (\rho u)_{xx}, \end{cases} \qquad (x,t) \in \mathbb{R} \times \mathbb{R}_+ \tag{1.5} \]
with initial data (1.2) has a smooth similarity solution for ρ1 ρ2 > 0. Equivalently, we consider the boundary value problem
\[ \mu \rho'' = -\xi \rho' + (\rho u)', \tag{1.6a} \]
\[ \mu (\rho u)'' = -\xi (\rho u)' + (\rho u^2)' \tag{1.6b} \]
with boundary conditions
\[ (\rho(-\infty), u(-\infty)) = (\rho_1, u_1), \qquad (\rho(\infty), u(\infty)) = (\rho_2, u_2). \tag{1.7} \]

It is shown that (1.6), (1.7) has a smooth solution (ρ_µ(ξ), u_µ(ξ)) on (−∞, ∞) for every µ > 0. In Sect. 4 we turn to the non-classical Riemann solutions and prove that the solutions (ρ_µ(ξ), u_µ(ξ)) to (1.6), (1.7) obtained in Sect. 3 generate a distribution solution (P, U) to (1.1), (1.2). The weak limit of the sequence {ρ_µ(ξ) : 0 < µ < 1} is just the macroscopic aspect of P in G(R), and the microscopic profiles of the nonclassical shock waves are analysed. Finally, in the last section we consider the case when ρ1 = 0, ρ2 > 0 or ρ1 > 0, ρ2 = 0. We mention in passing that E, Rykov and Sinai considered (1.1) with more

Riemann Problem for Pressureless Fluid Dynamics


general initial data (except vacuum data) and obtained the global existence of weak solutions by using generalized variational principles [26]. Also, Z. Wang, F. Huang and X. Ding investigated this problem by introducing potential functions and Lebesgue–Stieltjes integrals [24, 25]. All of them avoided the difficulty of the multiplication of distributions.

2. A Glimpse of Colombeau's Algebra G(R^n) of Generalized Functions

In this section we briefly describe the definition of the new generalized functions introduced by Colombeau ([2–4]). Let Ω be an open set in R^n, and denote by D(Ω) the set of all C^∞ functions on Ω with compact support. For q a nonnegative integer, we set
\[ A_q = \Big\{ \varphi \in D(\mathbb{R}^n) : \int_{\mathbb{R}^n} \varphi(\lambda)\,d\lambda = 1 \ \text{and} \ \int_{\mathbb{R}^n} \lambda^i \varphi(\lambda)\,d\lambda = 0 \ \text{if} \ 1 \le |i| \le q \Big\}. \]
As usual, λ^i = λ_1^{i_1} ··· λ_n^{i_n} and |i| = i_1 + ... + i_n. If 0 < ε < 1 we set
\[ \varphi_\varepsilon(\lambda) = \frac{1}{\varepsilon^n}\, \varphi\Big(\frac{\lambda}{\varepsilon}\Big). \]
We denote by E_M[Ω] the set of all mappings R(ϕ, x) : A_0 × Ω → R such that
(i) for any ϕ, the map R_ϕ : x → R(ϕ, x) is a C^∞ function of the variable x ∈ Ω;
(ii) if D = ∂^{|k|}/(∂x_1^{k_1} ··· ∂x_n^{k_n}) is any partial derivation operator and if K is any compact subset of Ω, then there exists an integer N such that if ϕ ∈ A_N there are constants C > 0 and η ∈ (0, 1] such that
\[ \sup_{x \in K} |D R_{\varphi_\varepsilon}(x)| \le \frac{C}{\varepsilon^N} \quad \text{if } 0 < \varepsilon < \eta. \]
We set N[Ω] to be the set of all mappings R ∈ E_M[Ω] such that for all D and K as above there exists an integer N such that if ϕ ∈ A_q, q ≥ N, there exist C > 0 and η ∈ (0, 1] such that
\[ \sup_{x \in K} |D R_{\varphi_\varepsilon}(x)| \le C \varepsilon^{q-N} \quad \text{if } 0 < \varepsilon < \eta. \]
Colombeau defined a generalized function G on Ω to be the equivalence class modulo N[Ω] of a representative (ϕ, x) → R(ϕ, x) of G, i.e., the space G(Ω) of generalized functions on Ω is the quotient set E_M[Ω]/N[Ω]. The operations in G(Ω) such as differentiation, addition and multiplication are those naturally defined on representatives. D'(Ω) is naturally embedded as a vector subspace of G(Ω): any distribution T on Ω is considered as the class of the mapping R(ϕ, x) = ⟨T(λ), ϕ(λ − x)⟩ if the function λ → ϕ(λ − x) has its support in Ω. Finally, following Colombeau's notation, two elements G1, G2 ∈ G(Ω) are said to be associated (notation G1 ≈ G2) if there exist representatives R1, R2 of G1, G2 such that for every ψ ∈ D(R^n) there exists N with
\[ \lim_{\varepsilon \to 0} \int_{\mathbb{R}^n} \big(R_1(\varphi_\varepsilon, x) - R_2(\varphi_\varepsilon, x)\big)\psi(x)\,dx = 0 \quad \text{for all } \varphi \in A_N(\mathbb{R}^n). \]


An element G is said to have a distribution T ∈ D'(Ω) as macroscopic aspect iff G ≈ T, i.e.,
\[ \forall \psi \in D(\Omega), \qquad \int R(\varphi_\varepsilon, x)\psi(x)\,dx \to \langle T, \psi \rangle \quad \text{as } \varepsilon \to 0, \]
where R(ϕ, x) is some representative of G.

Now we are in a position to give the definition of distribution solutions to (1.1), (1.2). We first note that (1.1), (1.2) are invariant under the transformation x → αx', t → αt' (α > 0). We therefore seek self-similar solutions of the form (ρ(x, t), u(x, t)) = (ρ(ξ), u(ξ)), ξ = x/t. Thus (1.1) changes into
\[ \begin{cases} -\xi \rho'(\xi) + (\rho(\xi)u(\xi))' = 0, \\ -\xi (\rho(\xi)u(\xi))' + (\rho(\xi)u^2(\xi))' = 0, \end{cases} \qquad {}' = \frac{d}{d\xi}, \ \xi \in \mathbb{R}, \tag{2.1} \]
and (1.2) into
\[ (\rho(\xi), u(\xi)) \to \begin{cases} (\rho_1, u_1), & \xi \to -\infty, \\ (\rho_2, u_2), & \xi \to +\infty. \end{cases} \tag{2.2} \]
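The passage from (1.1) to (2.1) is a chain-rule computation. A minimal sketch, assuming smooth ρ, u and t > 0, and using that (1.1) is the pressureless form of (1.3), (1.4):

```latex
% Assumption: \rho(x,t) = \rho(\xi), u(x,t) = u(\xi) with \xi = x/t and t > 0.
\[
\partial_t = -\frac{\xi}{t}\,\frac{d}{d\xi}, \qquad \partial_x = \frac{1}{t}\,\frac{d}{d\xi}.
\]
% Insert into \rho_t + (\rho u)_x = 0 and (\rho u)_t + (\rho u^2)_x = 0, then multiply by t:
\[
t\,\big[\rho_t + (\rho u)_x\big] = -\xi\rho' + (\rho u)' = 0, \qquad
t\,\big[(\rho u)_t + (\rho u^2)_x\big] = -\xi(\rho u)' + (\rho u^2)' = 0.
\]
```

The same substitution applied to the regularized system (1.5) produces the additional second-order terms in (1.6), because µ t ∂_x² acts as (µ/t) d²/dξ² on functions of ξ.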

Definition 2.1. The generalized functions P, U ∈ G(R) are said to be a distribution solution to (1.1), (1.2) if they satisfy
\[ \begin{cases} -\xi P' + (PU)' \approx 0, \\ -\xi (PU)' + (PU^2)' \approx 0 \end{cases} \tag{2.3} \]
and the initial condition
\[ (\rho(\xi), u(\xi)) \to \begin{cases} (\rho_1, u_1), & \xi \to -\infty, \\ (\rho_2, u_2), & \xi \to +\infty, \end{cases} \tag{2.4} \]
for some representatives (ρ(ξ), u(ξ)) of P, U.

3. Existence of Smooth Solutions to (1.6), (1.7)

In this section we shall show the existence of a smooth solution (ρ_µ, u_µ) to (1.6), (1.7) on (−∞, ∞) for every fixed µ > 0 with ρ_µ(ξ) > 0. To do this, we consider the two-parameter boundary value problem
\[ \mu \rho'' = -\xi \rho' + \nu m', \tag{3.1a} \]
\[ \mu m'' = -\xi m' + \nu \Big(\frac{m^2}{\rho}\Big)', \qquad -L < \xi < L, \tag{3.1b} \]
with
\[ \rho(-L) = \rho_* + \nu(\rho_1 - \rho_*), \qquad \rho(L) = \rho_* + \nu(\rho_2 - \rho_*), \tag{3.2a} \]
\[ m(-L) = \nu \rho_1 u_1 = \nu m_1, \qquad m(L) = \nu \rho_2 u_2 = \nu m_2, \tag{3.2b} \]
where the parameters satisfy ν ∈ [0, 1], L ≥ 1, and ρ_* = min(ρ1, ρ2), ρ1 > 0, ρ2 > 0, m(ξ) = ρ(ξ)u(ξ). The following theorem is a special case of Theorem 2.1 in [17] with p(ρ) = 0 (see pp. 1050–1052).



Theorem 3.1. Assume that there are positive constants M and δ depending only on u1, u2, ρ1, ρ2 and µ, but independent of ν and L, such that every solution (ρ(ξ), m(ξ)) of (3.1), (3.2) with ρ(ξ) > 0, corresponding to any 0 ≤ ν ≤ 1, L ≥ 1, satisfies
\[ \sup_{-L \le \xi \le L} \big(|m(\xi)| + \rho(\xi)\big) \le M, \qquad \inf_{-L \le \xi \le L} \rho(\xi) \ge \delta. \tag{3.3} \]
Then there exists a solution of (1.6), (1.7), denoted again by (ρ(ξ), m(ξ)), such that ρ(ξ) > 0 for −∞ < ξ < ∞.

Our next goal is to derive the a priori estimates (3.3) required to apply Theorem 3.1. We note that if (ρ(ξ), m(ξ)) is a solution of (3.1), (3.2) with ρ > 0, then ρ(ξ) and u(ξ) = m(ξ)/ρ(ξ) satisfy the following:
\[ \mu \rho'' = -\xi \rho' + \nu(u\rho' + \rho u'), \tag{3.4a} \]
\[ \mu u'' = \Big(\nu u - \xi - 2\mu \frac{\rho'}{\rho}\Big) u', \qquad -L < \xi < L, \tag{3.4b} \]
with boundary conditions
\[ \rho(-L) = \rho_* + \nu(\rho_1 - \rho_*), \qquad \rho(L) = \rho_* + \nu(\rho_2 - \rho_*), \tag{3.5a} \]
\[ u(-L) = \frac{m(-L)}{\rho(-L)} = \frac{\nu \rho_1 u_1}{\rho_* + \nu(\rho_1 - \rho_*)}, \qquad u(L) = \frac{m(L)}{\rho(L)} = \frac{\nu \rho_2 u_2}{\rho_* + \nu(\rho_2 - \rho_*)}. \tag{3.5b} \]
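Equation (3.4b) follows from (3.1) by eliminating ρ''. A short sketch of that step, assuming ρ > 0 on (−L, L) so that u = m/ρ is smooth:

```latex
% Assumption: \rho > 0 on (-L, L) and m = \rho u.
\[
\mu m'' = \mu(\rho'' u + 2\rho' u' + \rho u''), \qquad
\Big(\frac{m^2}{\rho}\Big)' = (\rho u^2)' = u(\rho u)' + \rho u u'.
\]
% Subtract u times (3.1a), i.e. \mu\rho'' u = -\xi\rho' u + \nu u(\rho u)', from (3.1b):
\[
\mu(2\rho' u' + \rho u'') = -\xi\rho u' + \nu\rho u u'
\;\Longrightarrow\;
\mu u'' = \Big(\nu u - \xi - 2\mu\frac{\rho'}{\rho}\Big)u'.
\]
```

The resulting equation is linear and homogeneous in u', which is what makes the monotonicity argument for u possible.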

The following lemma is crucial in establishing (3.3).

Lemma 3.2. Let (ρ(ξ), u(ξ)) with ρ(ξ) > 0 be a nonconstant solution of (3.4), (3.5) in (−L, L) for some 0 < ν ≤ 1 and L ≥ 1. Then u(ξ) is always a monotone function in (−L, L), while ρ(ξ) satisfies one of the following:
(i) ρ(ξ) is monotone in (−L, L);
(ii) ρ(ξ) has only one maximum point in (−L, L) if u(ξ) is strictly decreasing in (−L, L);
(iii) ρ(ξ) has only one minimum point in (−L, L) if u(ξ) is strictly increasing in (−L, L).

Proof. Let (ρ(ξ), u(ξ)) be a nonconstant solution of (3.4), (3.5) with ρ(ξ) > 0. By (3.4b), (3.5b), we know that
\[ u'(\xi) = \lambda \exp\Big( \int_{-L}^{\xi} \frac{1}{\mu} A(s)\,ds \Big), \qquad -L < \xi < L, \]
where A(ξ) = νu(ξ) − ξ − 2µρ'(ξ)/ρ(ξ) and λ is given by
\[ u(L) - u(-L) = \lambda \int_{-L}^{L} \exp\Big( \int_{-L}^{\tau} \frac{1}{\mu} A(s)\,ds \Big)\,d\tau. \]
Since the exponential is positive, u' has the sign of λ throughout (−L, L). Therefore, u(ξ) is always monotone in (−L, L). Next we suppose u(ξ) is strictly decreasing in (−L, L) and ρ(ξ) has a critical point σ in (−L, L). Then ρ'(σ) = 0. By (3.4a), we have µρ''(σ) = νρ(σ)u'(σ) < 0 since ρ(σ) > 0 and u'(σ) < 0. Thus σ is the maximum point of ρ(ξ). Case (ii) is proven. Case (iii) can be treated similarly.


By Lemma 3.2, u(ξ) is uniformly bounded in (−L, L) with respect to ν ∈ (0, 1], L ≥ 1 and µ > 0. It remains to estimate ρ(ξ) for Cases (ii), (iii) in Lemma 3.2.

Lemma 3.3. For Cases (ii), (iii) in Lemma 3.2, there exist positive constants M and δ, independent of ν ∈ (0, 1] and L ≥ 1, such that δ ≤ ρ(ξ) ≤ M.

Proof. Motivated by [8], we first prove
\[ \int_\alpha^\beta \rho(\xi)\,d\xi \le (\beta - \alpha)\bar\rho + N \tag{3.6} \]
for every interval (α, β) ⊂ (−L, L), where
\[ \bar\rho = \max_{0 \le \nu \le 1} \{\rho(-L), \rho(L)\} = \max\{\rho_1, \rho_2\} \quad \text{and} \quad N = \bar\rho \max_{0 \le \nu \le 1} |u(-L) - u(L)|. \]
In fact, we set θ1 = inf{ξ ∈ (α, β) : ρ(ξ) ≥ ρ̄} if ρ(α) < ρ̄ (if this set is empty, (3.6) is automatically satisfied); on the other hand, we set θ1 = sup{ξ ∈ (−L, β) : ρ(ξ) ≤ ρ̄} if ρ(α) ≥ ρ̄. Similarly, we set θ2 = inf{ξ ∈ (β, L] : ρ(ξ) ≤ ρ̄} if ρ(β) ≥ ρ̄, while we set θ2 = sup{ξ ∈ (α, β) : ρ(ξ) ≥ ρ̄} if ρ(β) < ρ̄. Since ρ(θ1) = ρ(θ2) = ρ̄, we have
\[ \int_\alpha^\beta (\rho(\xi) - \bar\rho)\,d\xi \le \int_{\theta_1}^{\theta_2} (\rho(\xi) - \bar\rho)\,d\xi = -\int_{\theta_1}^{\theta_2} \xi \rho'(\xi)\,d\xi. \tag{3.7} \]
Noting that ρ'(θ1) ≥ 0 and ρ'(θ2) ≤ 0, and integrating (3.4a) over (θ1, θ2), we have
\[ 0 \ge \mu\rho'(\theta_2) - \mu\rho'(\theta_1) = -\int_{\theta_1}^{\theta_2} \xi \rho'(\xi)\,d\xi + \nu\bar\rho\,(u(\theta_2) - u(\theta_1)). \tag{3.8} \]
Therefore, (3.7), (3.8) give
\[ \int_\alpha^\beta (\rho(\xi) - \bar\rho)\,d\xi \le \nu\bar\rho\,(u(\theta_1) - u(\theta_2)) \le \bar\rho\,|u(L) - u(-L)| \le N, \]
since u(ξ) is monotone in (−L, L). Thus (3.6) is obtained.

Now we apply (3.6) to estimate ρ(ξ) from above for Case (ii) in Lemma 3.2. By (3.6), it follows that
\[ \rho_* \le \rho(\xi) \le \bar\rho + \frac{N}{|\sigma - \xi|}, \qquad \xi \in [-L, L] \setminus \{\sigma\}. \tag{3.9} \]

Without loss of generality we assume that ρ(σ) > ρ̄. We fix ξ0 ∈ [−L, σ) such that ρ(ξ0) = ρ̄. For any ξ ∈ [ξ0, σ) we let ξ' be a point in (σ, L] with the property ρ(ξ') = ρ(ξ) (such a point exists since ρ(L) ≤ ρ̄). Integrating (3.4a) over (ξ, ξ') we obtain
\[ \mu(\rho'(\xi') - \rho'(\xi)) = -\int_{\xi}^{\xi'} s\rho'(s)\,ds + \nu\rho(\xi)\,(u(\xi') - u(\xi)). \tag{3.10} \]
We note that ρ'(ξ') ≤ 0 and
\[ -\int_{\xi}^{\xi'} s\rho'(s)\,ds = \int_{\xi}^{\xi'} (\rho(s) - \rho(\xi))\,ds \ge 0. \]
Therefore, (3.10) gives
\[ \mu\rho'(\xi) \le \nu\rho(\xi)\,(u(\xi) - u(\xi')) \le \rho(\xi)\,\frac{N}{\bar\rho}, \]
which yields
\[ \rho(\sigma) \le \rho(\alpha)\exp\Big(\frac{N}{\mu\bar\rho}(\sigma - \alpha)\Big), \qquad \xi_0 \le \alpha \le \sigma. \tag{3.11} \]
If σ − ξ0 ≤ 1, we choose α = ξ0 in (3.11) to get ρ(σ) ≤ ρ̄ exp(N/(µρ̄)). On the other hand, if σ − ξ0 > 1 we choose α = σ − 1. From (3.11), (3.9), it follows that
\[ \rho(\sigma) \le \rho(\sigma - 1)\exp\Big(\frac{N}{\mu\bar\rho}\Big) \le (\bar\rho + N)\exp\Big(\frac{N}{\mu\bar\rho}\Big). \]

Next we estimate ρ(ξ) from below for Case (iii) in Lemma 3.2. We set w(ξ) = −ρ(ξ). Then (w(ξ), u(ξ)) is the solution of (3.4) with the boundary data (3.5b) and
\[ w(-L) = -\rho_* - \nu(\rho_1 - \rho_*), \qquad w(L) = -\rho_* - \nu(\rho_2 - \rho_*). \tag{3.5a$'$} \]
Similar to (3.9), (3.11), we obtain
\[ w(\xi) \le \bar w + \frac{N}{|\tau - \xi|}, \qquad \xi \in [-L, L] \setminus \{\tau\}, \]
\[ w(\tau) \le w(\alpha)\exp\Big(-\frac{N}{\mu\rho_*}(\tau - \alpha)\Big), \qquad \xi_0 \le \alpha \le \tau, \]
where τ is the minimum point of ρ(ξ) in (−L, L), ξ0 is a point in (−L, L) with w(ξ0) = w̄, w̄ = max_{0≤ν≤1} {w(−L), w(L)} = −ρ_*, and N = ρ_* max_{0≤ν≤1} |u(−L) − u(L)|. Therefore,
\[ \rho(\xi) \ge \rho_* - \frac{N}{|\tau - \xi|}, \qquad \xi \in [-L, L] \setminus \{\tau\}, \tag{3.12} \]
\[ \rho(\tau) \ge \rho(\alpha)\exp\Big(-\frac{N}{\mu\rho_*}(\tau - \alpha)\Big), \qquad \xi_0 \le \alpha \le \tau. \tag{3.13} \]
If τ − ξ0 ≤ 2N/ρ_*, (3.13) gives ρ(τ) ≥ ρ_* exp(−2N²/(µρ_*²)) for α = ξ0. If τ − ξ0 > 2N/ρ_*, let α = τ − 2N/ρ_* and use (3.12), (3.13) to obtain ρ(τ) ≥ (1/2)ρ_* exp(−2N²/(µρ_*²)). The proof is complete.

Now we have obtained the existence of a solution (ρ_µ(ξ), u_µ(ξ)) of (1.6), (1.7) on (−∞, ∞) for every fixed µ > 0. Moreover, we have

Lemma 3.4. For every fixed µ > 0, the same results as in Lemma 3.2 are valid for the solution (ρ_µ(ξ), u_µ(ξ)) of (1.6), (1.7). Furthermore,
\[ \int_\alpha^\beta \rho_\mu(\xi)\,d\xi \le (\beta - \alpha)\bar\rho + N_1, \tag{3.14} \]
\[ 0 < \min\Big\{\frac{\delta}{2}, \rho_*\Big\} \le \rho_\mu(\xi) \le M + 1, \tag{3.15} \]
for every interval (α, β) ⊂ (−∞, ∞), where ρ̄ = max{ρ1, ρ2} and N1 = ρ̄|u1 − u2|.


Proof. The proof is similar to that of Lemma 3.2 and (3.6), and (3.15) can be easily obtained from the proof of Theorem 3.1 in [17] (pp. 1051–1052). We omit the details.

Similar to (3.9), we also have
\[ \rho_* \le \rho_\mu(\xi) \le \bar\rho + \frac{N}{|\sigma_\mu - \xi|}, \qquad \xi \in (-\infty, \infty) \setminus \{\sigma_\mu\}, \tag{3.16} \]
when ρ_µ(ξ) has a maximum point σ_µ on (−∞, ∞) (by Lemma 3.4 this happens if and only if u1 > u2), where N = ρ̄(u1 − u2).

4. Existence of Distribution Solutions to (1.1), (1.2)

In this section we shall prove the existence of distribution solutions to (1.1), (1.2) for ρ1 ρ2 > 0. We distinguish two cases: (1) u1 ≤ u2, (2) u1 > u2. Case (2) is much more complicated and is the matter of real interest in this paper.

We first consider Case (1). From Lemma 3.4, it is easily seen that {(ρ_µ(ξ), u_µ(ξ)) : 0 < µ < 1} is uniformly bounded in µ and of uniformly bounded variation. By Helly's theorem, {(ρ_µ(ξ), u_µ(ξ)) : 0 < µ < 1} possesses a subsequence which converges a.e. on (−∞, ∞) to some functions ρ(ξ), u(ξ) of bounded variation, and (ρ(ξ), u(ξ)) provides a classical weak solution to (1.1), (1.2) (see Theorem 3.2 in [7] or Theorem 4.1 in [17]). It is a routine matter to verify that the generalized functions P, U in G(R) with ρ(ξ), u(ξ) as macroscopic aspects, respectively, satisfy (2.3), (2.4). Thus, (1.1) with (1.2) admits a distribution solution in Colombeau's sense for u1 ≤ u2 and ρ1 ρ2 > 0.

Now we turn to Case (2). In this case, ρ_µ(ξ) has a maximum point σ_µ on (−∞, ∞), and ρ_µ(σ_µ) may tend to infinity as µ → 0. We call the condition u1 > u2 the entropy condition for the singular distribution solution of (1.1), (1.2). Let σ_µ → σ, |σ| ≤ ∞, as µ → 0 (pass to a further subsequence if necessary).

Lemma 4.1. If |σ| = ∞, then {(ρ_µ(ξ), u_µ(ξ)) : 0 < µ < 1} is uniformly bounded in µ.

Proof. We first assume that σ = ∞. By (3.16) we get
\[ \rho_* \le \rho_\mu(\xi) \le \bar\rho + N \tag{4.1} \]

for ξ ∈ (−∞, a] and µ small, where a is any fixed real number. We take a = 2 and ξ0 ∈ [1, 2] such that 0 ≤ ρ'_µ(ξ0) = ρ_µ(2) − ρ_µ(1) ≤ ρ̄ + N − ρ_* by (4.1). Noting u2 ≤ u_µ(ξ) ≤ u1 on (−∞, ∞), we take µ to be so small that
\[ \frac{1}{\sigma_\mu - u_\mu(\sigma_\mu)} \Big[ (\xi_0 - u_\mu(\xi_0))\rho_\mu(\xi_0) + \mu\rho'_\mu(\xi_0) \Big] \le 1, \tag{4.2} \]
\[ \frac{1}{\sigma_\mu - u_\mu(\sigma_\mu)} \Big[ (\sigma_\mu - \xi_0)\bar\rho + N_1 \Big] \le 2\bar\rho + 1 \tag{4.3} \]
(this can be done since σ_µ → ∞ as µ → 0). Integrating (1.6a) over (ξ0, σ_µ), one finds that
\[ \rho_\mu(\sigma_\mu) = \frac{1}{\sigma_\mu - u_\mu(\sigma_\mu)} \Big[ (\xi_0 - u_\mu(\xi_0))\rho_\mu(\xi_0) + \mu\rho'_\mu(\xi_0) + \int_{\xi_0}^{\sigma_\mu} \rho_\mu(\xi)\,d\xi \Big] \]
\[ \le \frac{1}{\sigma_\mu - u_\mu(\sigma_\mu)} \Big[ (\xi_0 - u_\mu(\xi_0))\rho_\mu(\xi_0) + \mu\rho'_\mu(\xi_0) + (\sigma_\mu - \xi_0)\bar\rho + N_1 \Big] \le 2 + 2\bar\rho \]
by virtue of (3.14), (4.2) and (4.3). The case σ = −∞ can be treated similarly.

Now we turn to the case |σ| < ∞. Let ξ_α^µ be the singularity point of (1.6), that is, ξ_α^µ = u_µ(ξ_α^µ), and set ξ_α = lim_{µ→0} u_µ(ξ_α^µ). It is easily seen that u2 ≤ ξ_α ≤ u1, since u2 ≤ u_µ(ξ) ≤ u1 on (−∞, ∞).

Lemma 4.2. Let ξ_α be defined as above. Then {ρ_µ(ξ) : 0 < µ < 1} is uniformly bounded in µ if σ ≠ ξ_α.

Proof. We suppose σ > ξ_α. Integrating (1.6a) over (σ_µ, σ + 1), we have
\[ \mu\rho'_\mu(\sigma + 1) = -\int_{\sigma_\mu}^{\sigma+1} \xi\rho'_\mu(\xi)\,d\xi + \rho_\mu(\sigma + 1)u_\mu(\sigma + 1) - \rho_\mu(\sigma_\mu)u_\mu(\sigma_\mu) \]
\[ = \big(\sigma_\mu - u_\mu(\sigma_\mu)\big)\rho_\mu(\sigma_\mu) + \big(u_\mu(\sigma + 1) - \sigma - 1\big)\rho_\mu(\sigma + 1) + \int_{\sigma_\mu}^{\sigma+1} \rho_\mu(\xi)\,d\xi. \tag{4.4} \]
Notice that ρ'_µ(σ + 1) ≤ 0 and
σ_µ − u_µ(σ_µ) = σ_µ − ξ_α^µ + (u_µ(ξ_α^µ) − u_µ(σ_µ)) = (σ_µ − ξ_α^µ)(1 − u'_µ(θ_µ)) ≥ (1/2)(σ − ξ_α) > 0
for µ small, since u'_µ(ξ) ≤ 0 on (−∞, ∞) and σ_µ − ξ_α^µ → σ − ξ_α as µ → 0. Here θ_µ is between ξ_α^µ and σ_µ. Therefore, (4.4) gives
\[ \rho_\mu(\sigma_\mu) \le \frac{1}{\sigma_\mu - u_\mu(\sigma_\mu)} \Big[ \big(\sigma + 1 - u_\mu(\sigma + 1)\big)\rho_\mu(\sigma + 1) - \int_{\sigma_\mu}^{\sigma+1} \rho_\mu(\xi)\,d\xi \Big] \le \frac{2}{\sigma - \xi_\alpha}(\sigma + 1 - u_2)(\bar\rho + 2N) \]
for µ small, by (3.16) and ρ_µ(ξ) > 0. Thus {ρ_µ(ξ) : 0 < µ < 1} is uniformly bounded in µ. The case σ < ξ_α can be treated similarly.

Given η > 0, from (3.16) we know that ρ_µ(ξ) ≤ ρ̄ + 2N/η for ξ ∈ (−∞, ξ_α − η) ∪ (ξ_α + η, ∞) and µ small if σ_µ → σ = ξ_α as µ → 0. Combining this with Lemmas 4.1 and 4.2, we easily get

Theorem 4.3. If u1 > u2, then {ρ_µ(ξ) : 0 < µ < 1} is uniformly bounded and of uniformly bounded variation over (−∞, ξ_α − η) ∪ (ξ_α + η, ∞) for any given η > 0.

Applying Helly's theorem and the diagonal principle, we deduce from Theorem 4.3 that there exist a subsequence of ρ_µ(ξ) (still denoted by ρ_µ(ξ)) and some function ρ(ξ) such that
\[ \rho_\mu(\xi) \to \rho(\xi), \qquad \xi \in (-\infty, \xi_\alpha) \cup (\xi_\alpha, \infty), \quad \mu \to 0. \tag{4.5} \]


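We set
\[ \lim_{\mu \to 0} u_\mu(\xi) = u(\xi), \qquad \xi \in (-\infty, \infty). \tag{4.6} \]

Theorem 4.4. Suppose u1 > u2 and let (ρ(ξ), u(ξ)) be given by (4.5), (4.6). Then for any η > 0
\[ (\rho(\xi), u(\xi)) = \begin{cases} (\rho_1, u_1), & \xi \le \xi_\alpha - \eta, \\ (\rho_2, u_2), & \xi \ge \xi_\alpha + \eta. \end{cases} \tag{4.7} \]

Proof. By Theorem 4.3, there exists a positive constant M1 (independent of µ) such that
\[ 0 < \rho_* \le \rho_\mu(\xi) \le M_1, \qquad \xi \in \Big(-\infty, \xi_\alpha - \frac{\eta}{2}\Big] \cup \Big[\xi_\alpha + \frac{\eta}{2}, \infty\Big), \tag{4.8} \]
where ρ_* = min(ρ1, ρ2) and µ is small. We set ξ0 to be a point in [ξ_α + η/2, ξ_α + 3η/4] such that
\[ u'_\mu(\xi_0) = \frac{4}{\eta} \Big[ u_\mu\Big(\xi_\alpha + \frac{3\eta}{4}\Big) - u_\mu\Big(\xi_\alpha + \frac{\eta}{2}\Big) \Big], \]
which says that
\[ -\frac{4}{\eta}(u_1 - u_2) \le u'_\mu(\xi_0) \le 0. \tag{4.9} \]
By (1.6a), (1.6b),
\[ u'_\mu(\xi) = \frac{u'_\mu(\xi_0)\rho_\mu^2(\xi_0)}{\rho_\mu^2(\xi)} \exp\Big( \int_{\xi_0}^{\xi} \frac{u_\mu(s) - s}{\mu}\,ds \Big), \qquad \xi \ge \xi_0. \tag{4.10} \]
Observe that, for s ≥ ξ0 and µ small,
\[ u_\mu(s) - s = u_\mu(s) - u_\mu(\xi_\alpha^\mu) + (\xi_\alpha^\mu - s) = (s - \xi_\alpha^\mu)(u'_\mu(\theta_\mu) - 1) \le -\frac{1}{4}\eta, \tag{4.11} \]
since s − ξ_α^µ → s − ξ_α ≥ ξ0 − ξ_α ≥ η/2 as µ → 0, and u'_µ(θ_µ) ≤ 0. Here θ_µ is between s and ξ_α^µ. Combining (4.10) with (4.8), (4.9), (4.11), one easily gets that
\[ |u'_\mu(\xi)| \le \frac{4}{\eta}(u_1 - u_2)\Big(\frac{M_1}{\rho_*}\Big)^2 \exp\Big(-\frac{1}{4\mu}\eta(\xi - \xi_0)\Big), \qquad \xi \ge \xi_0. \tag{4.12} \]
Therefore, by (4.12), for any ξ ≥ ξ_α + η, we deduce from u2 − u_µ(ξ) = ∫_ξ^∞ u'_µ(s) ds that lim_{µ→0} u_µ(ξ) = u2 uniformly. A similar argument leads to lim_{µ→0} u_µ(ξ) = u1 for ξ ≤ ξ_α − η.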

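Now we turn to ρ(ξ). Let ξ1 be a point in [ξ_α + η/2, ξ_α + 3η/4] such that
\[ \rho'_\mu(\xi_1) = \frac{4}{\eta} \Big[ \rho_\mu\Big(\xi_\alpha + \frac{3}{4}\eta\Big) - \rho_\mu\Big(\xi_\alpha + \frac{1}{2}\eta\Big) \Big], \]
which combines with (4.8) to yield
\[ |\rho'_\mu(\xi_1)| \le \frac{4}{\eta}(M_1 - \rho_*). \tag{4.13} \]
By (1.6a), (4.10),
\[ \rho'_\mu(\xi) = \rho'_\mu(\xi_1) \exp\Big( \int_{\xi_1}^{\xi} \frac{u_\mu(s) - s}{\mu}\,ds \Big) + \frac{1}{\mu} \int_{\xi_1}^{\xi} \rho_\mu(\tau)u'_\mu(\tau) \exp\Big( \int_{\tau}^{\xi} \frac{u_\mu(s) - s}{\mu}\,ds \Big)\,d\tau \]
\[ = \rho'_\mu(\xi_1) \exp\Big( \int_{\xi_1}^{\xi} \frac{u_\mu(s) - s}{\mu}\,ds \Big) + \frac{1}{\mu}\,u'_\mu(\xi_0)\rho_\mu^2(\xi_0) \exp\Big( \int_{\xi_0}^{\xi} \frac{u_\mu(s) - s}{\mu}\,ds \Big) \int_{\xi_1}^{\xi} \frac{1}{\rho_\mu(\tau)}\,d\tau. \tag{4.14} \]
From (4.13), (4.11), (4.9) and (4.8), it follows from (4.14) that
\[ |\rho'_\mu(\xi)| \le \text{const} \cdot \Big[ \exp\Big(-\frac{1}{4\mu}\eta(\xi - \xi_1)\Big) + \frac{1}{\mu}(\xi - \xi_1)\exp\Big(-\frac{1}{4\mu}\eta(\xi - \xi_0)\Big) \Big], \qquad \xi \ge \xi_\alpha + \eta. \tag{4.15} \]
Here the constant is independent of µ. Accordingly, we have from (4.15) that lim_{µ→0} ρ_µ(ξ) = ρ2 for ξ ≥ ξ_α + η. In a similar way, lim_{µ→0} ρ_µ(ξ) = ρ1 for ξ ≤ ξ_α − η. The proof is complete.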

Theorem 4.4 means that ρ(ξ), u(ξ) share the same discontinuity point ξ = ξ_α on (−∞, ∞). However, the Rankine–Hugoniot condition no longer holds at ξ = ξ_α since ρ1 ρ2 (u1 − u2) ≠ 0, and thus (ρ(ξ), u(ξ)) is not a classical weak solution to (1.1), (1.2). But we can get the existence of a distribution solution of (1.1), (1.2) in Colombeau's sense. The following lemma is vital to our analysis.

Lemma 4.5. Let u1 > u2 and let (ρ_µ(ξ), u_µ(ξ)) be the solution of (1.6), (1.7). Then ρ_µ(ξ) converges weak-star to the function ρ(ξ) given by (4.7) plus a weighted Dirac function concentrated at ξ = ξ_α, that is,
\[ \lim_{\mu \to 0} \int_{-\infty}^{\infty} \rho_\mu(\xi)\varphi(\xi)\,d\xi = \int_{-\infty}^{\infty} \rho(\xi)\varphi(\xi)\,d\xi + \lambda\langle \delta(\xi - \xi_\alpha), \varphi(\xi) \rangle \tag{4.16} \]
for each ϕ(ξ) ∈ C_0^∞(R), where λ = ρ1 u1 − ρ2 u2 + (ρ2 − ρ1)ξ_α and δ(ξ) is the Dirac function.

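Proof. Inspired by [19] (see also [11]), we let ξ1 < ξ_α < ξ2 and ϕ(ξ) ∈ C_0^∞(ξ1, ξ2) with the property that ϕ(ξ) ≡ ϕ(ξ_α) in a small neighbourhood of ξ_α. By (1.6a),
\[ \mu \int_{\xi_1}^{\xi_2} \rho_\mu(\xi)\varphi''(\xi)\,d\xi = \int_{\xi_1}^{\xi_2} \Big[ \rho_\mu(\xi)\big(\xi\varphi'(\xi) + \varphi(\xi)\big) - \rho_\mu(\xi)u_\mu(\xi)\varphi'(\xi) \Big]\,d\xi, \]
which combines with (3.14) to yield that
\[ \lim_{\mu \to 0} \int_{\xi_1}^{\xi_2} \rho_\mu(\xi)\varphi(\xi)\,d\xi = \lim_{\mu \to 0} \int_{\xi_1}^{\xi_2} \rho_\mu(\xi)\big(u_\mu(\xi) - \xi\big)\varphi'(\xi)\,d\xi. \tag{4.17} \]
For α1, α2 near ξ_α, α1 < ξ_α < α2, (4.7) gives
\[ \lim_{\mu \to 0} \int_{\xi_1}^{\xi_2} \rho_\mu(\xi)(u_\mu(\xi) - \xi)\varphi'(\xi)\,d\xi = \lim_{\mu \to 0} \int_{\xi_1}^{\alpha_1} \rho_\mu(\xi)(u_\mu(\xi) - \xi)\varphi'(\xi)\,d\xi + \lim_{\mu \to 0} \int_{\alpha_2}^{\xi_2} \rho_\mu(\xi)(u_\mu(\xi) - \xi)\varphi'(\xi)\,d\xi \]
\[ = \int_{\xi_1}^{\alpha_1} \rho_1(u_1 - \xi)\varphi'(\xi)\,d\xi + \int_{\alpha_2}^{\xi_2} \rho_2(u_2 - \xi)\varphi'(\xi)\,d\xi \]
\[ = (\rho_1 u_1 - \rho_2 u_2 + \rho_2\alpha_2 - \rho_1\alpha_1)\varphi(\xi_\alpha) + \rho_1 \int_{\xi_1}^{\alpha_1} \varphi(\xi)\,d\xi + \rho_2 \int_{\alpha_2}^{\xi_2} \varphi(\xi)\,d\xi \]
\[ \to \big(\rho_1 u_1 - \rho_2 u_2 + (\rho_2 - \rho_1)\xi_\alpha\big)\varphi(\xi_\alpha) + \rho_1 \int_{\xi_1}^{\xi_\alpha} \varphi(\xi)\,d\xi + \rho_2 \int_{\xi_\alpha}^{\xi_2} \varphi(\xi)\,d\xi \]
as α1 → ξ_α−, α2 → ξ_α+. Therefore, it follows from (4.17) that
\[ \lim_{\mu \to 0} \int_{\xi_1}^{\xi_2} \rho_\mu(\xi)\varphi(\xi)\,d\xi = \big(\rho_1 u_1 - \rho_2 u_2 + (\rho_2 - \rho_1)\xi_\alpha\big)\varphi(\xi_\alpha) + \rho_1 \int_{\xi_1}^{\xi_\alpha} \varphi(\xi)\,d\xi + \rho_2 \int_{\xi_\alpha}^{\xi_2} \varphi(\xi)\,d\xi. \tag{4.18} \]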

By an approximation process, (4.18) holds for every ϕ(ξ) ∈ C_0^∞(ξ1, ξ2). This completes the proof.

Now we define ρ̃(ξ) on (−∞, ∞) as follows:
\[ \tilde\rho(\xi) = \rho(\xi) + \lambda\delta(\xi - \xi_\alpha) = \rho_1 + (\rho_2 - \rho_1)H(\xi - \xi_\alpha) + \lambda\delta(\xi - \xi_\alpha), \tag{4.19} \]
where λ = ρ1 u1 − ρ2 u2 + (ρ2 − ρ1)ξ_α and H(ξ) is the Heaviside function given by H(ξ) = 1 for ξ > 0 and H(ξ) = 0 for ξ < 0. By (4.7), we write
\[ u(\xi) = u_1 + (u_2 - u_1)H(\xi - \xi_\alpha). \tag{4.20} \]

Now we have

Theorem 4.6. Suppose u1 > u2 and ρ1 ρ2 > 0. Let P and U belong to G(R) with the distributions given by (4.19) and (4.20) as their macroscopic aspects, respectively. Then (P, U) satisfies (2.3), (2.4), i.e., (P, U) is a distribution solution to (1.1), (1.2). Moreover,
\[ \xi_\alpha = \frac{\sqrt{\rho_1}\,u_1 + \sqrt{\rho_2}\,u_2}{\sqrt{\rho_1} + \sqrt{\rho_2}} \quad \text{and} \quad \lambda = \sqrt{\rho_1\rho_2}\,(u_1 - u_2). \]

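Proof. Let θ_i(ξ) ∈ A_0(R), i.e., θ_i(ξ) ∈ D(R) with ∫_R θ_i(ξ) dξ = 1, i = 1, 2. We set
\[ R_1(\theta_1, \xi) = \big((\rho + \lambda\delta(\cdot - \xi_\alpha)) * \theta_1\big)(\xi) = (\rho * \theta_1)(\xi) + \lambda\langle \delta(y - \xi_\alpha), \theta_1(y - \xi) \rangle = \rho_1 + (\rho_2 - \rho_1)\int_{\xi_\alpha - \xi}^{\infty} \theta_1(s)\,ds + \lambda\theta_1(\xi_\alpha - \xi), \]
\[ R_2(\theta_2, \xi) = (u * \theta_2)(\xi) = u_1 + (u_2 - u_1)\int_{\xi_\alpha - \xi}^{\infty} \theta_2(s)\,ds. \tag{4.21} \]
We want to show that for any ψ(ξ) ∈ C_0^∞(R),
\[ \lim_{\varepsilon \to 0} \int_{-\infty}^{\infty} \Big[ R_1(\theta_{1\varepsilon}, \xi)(\xi\psi(\xi))' - R_1(\theta_{1\varepsilon}, \xi)R_2(\theta_{2\varepsilon}, \xi)\psi'(\xi) \Big]\,d\xi = 0, \tag{4.22} \]
\[ \lim_{\varepsilon \to 0} \int_{-\infty}^{\infty} \Big[ R_1(\theta_{1\varepsilon}, \xi)R_2(\theta_{2\varepsilon}, \xi)(\xi\psi(\xi))' - R_1(\theta_{1\varepsilon}, \xi)R_2^2(\theta_{2\varepsilon}, \xi)\psi'(\xi) \Big]\,d\xi = 0, \tag{4.23} \]
where θ_{iε}(ξ) = (1/ε)θ_i(ξ/ε), i = 1, 2. Observing that
\[ \rho * \theta_{1\varepsilon}(\xi) \to \rho(\xi), \quad R_2(\theta_{2\varepsilon}, \xi) \to u(\xi), \quad R_2^2(\theta_{2\varepsilon}, \xi) \to u^2(\xi) \quad \text{a.e. on } \mathbb{R}, \]
we have
\[ \lim_{\varepsilon \to 0} \int_{-\infty}^{\infty} R_1(\theta_{1\varepsilon}, \xi)(\xi\psi(\xi))'\,d\xi = \int_{-\infty}^{\infty} \rho(\xi)(\xi\psi(\xi))'\,d\xi + \lambda \lim_{\varepsilon \to 0} \int_{-\infty}^{\infty} \frac{1}{\varepsilon}\theta_1\Big(\frac{\xi_\alpha - \xi}{\varepsilon}\Big)(\xi\psi(\xi))'\,d\xi \]
\[ = \big((\rho_1 - \rho_2)\xi_\alpha + \lambda\big)\psi(\xi_\alpha) + \lambda\xi_\alpha\psi'(\xi_\alpha) \tag{4.24} \]
and
\[ \lim_{\varepsilon \to 0} \int_{-\infty}^{\infty} R_1(\theta_{1\varepsilon}, \xi)R_2(\theta_{2\varepsilon}, \xi)\psi'(\xi)\,d\xi = \int_{-\infty}^{\infty} u(\xi)\rho(\xi)\psi'(\xi)\,d\xi + \lambda \lim_{\varepsilon \to 0} \int_{-\infty}^{\infty} \frac{1}{\varepsilon}\theta_1\Big(\frac{\xi_\alpha - \xi}{\varepsilon}\Big)\psi'(\xi)\Big(u_1 + (u_2 - u_1)\int_{\xi_\alpha - \xi}^{\infty} \frac{1}{\varepsilon}\theta_2\Big(\frac{s}{\varepsilon}\Big)\,ds\Big)\,d\xi \]
\[ = (\rho_1 u_1 - \rho_2 u_2)\psi(\xi_\alpha) + \lambda\big(u_1 + (u_2 - u_1)A\big)\psi'(\xi_\alpha), \tag{4.25} \]
where A = ∫_{−∞}^{∞} θ1(y) (∫_y^∞ θ2(s) ds) dy. Therefore, (4.22), (4.24), (4.25) imply that
\[ \big(\lambda + (\rho_1 - \rho_2)\xi_\alpha - \rho_1 u_1 + \rho_2 u_2\big)\psi(\xi_\alpha) + \lambda\big(\xi_\alpha - u_1 - (u_2 - u_1)A\big)\psi'(\xi_\alpha) = 0, \]
which means
\[ \lambda = (\rho_2 - \rho_1)\xi_\alpha + \rho_1 u_1 - \rho_2 u_2, \qquad \lambda\big(\xi_\alpha - u_1 - (u_2 - u_1)A\big) = 0, \tag{4.26} \]
since ψ(ξ) is arbitrary. Similarly, (4.23), (4.21) are reduced to
\[ \xi_\alpha(\rho_1 u_1 - \rho_2 u_2) + \lambda\big(u_1 + (u_2 - u_1)A\big) + \rho_2 u_2^2 - \rho_1 u_1^2 = 0, \]
\[ \lambda\Big[ \xi_\alpha\big(u_1 + (u_2 - u_1)A\big) - \big(u_1^2 + 2u_1(u_2 - u_1)A + (u_2 - u_1)^2 B\big) \Big] = 0, \tag{4.27} \]
where B = ∫_{−∞}^{∞} θ1(y) (∫_y^∞ θ2(s) ds)² dy. Remembering that u2 ≤ ξ_α ≤ u1, we solve (4.26), (4.27) to obtain
\[ \xi_\alpha = \frac{\sqrt{\rho_1}\,u_1 + \sqrt{\rho_2}\,u_2}{\sqrt{\rho_1} + \sqrt{\rho_2}}, \qquad \lambda = \sqrt{\rho_1\rho_2}\,(u_1 - u_2), \tag{4.28} \]
\[ A = \frac{\sqrt{\rho_2}}{\sqrt{\rho_1} + \sqrt{\rho_2}}, \qquad B = A^2. \tag{4.29} \]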
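The closed forms (4.28), (4.29) can be checked numerically against the four relations (4.26), (4.27). The sketch below is illustrative only: the function names `delta_shock_parameters` and `residuals` are hypothetical, and the relations are coded as they are read here, with the second relation of (4.27) taken as λ[ξ_α(u1 + (u2 − u1)A) − (u1² + 2u1(u2 − u1)A + (u2 − u1)²B)] = 0.

```python
from math import sqrt, isclose


def delta_shock_parameters(rho1, u1, rho2, u2):
    """Return (xi_alpha, lam, A, B) from the closed forms (4.28), (4.29).

    Assumes rho1 > 0, rho2 > 0 and u1 > u2 (the entropy condition).
    """
    a, b = sqrt(rho1), sqrt(rho2)
    xi_alpha = (a * u1 + b * u2) / (a + b)  # position of the discontinuity
    lam = a * b * (u1 - u2)                 # weight of the Dirac mass
    A = b / (a + b)
    B = A * A
    return xi_alpha, lam, A, B


def residuals(rho1, u1, rho2, u2):
    """Left-hand sides of the relations in (4.26), (4.27); each should vanish."""
    xi, lam, A, B = delta_shock_parameters(rho1, u1, rho2, u2)
    w = u1 + (u2 - u1) * A                                      # weighted value of u
    w2 = u1 ** 2 + 2 * u1 * (u2 - u1) * A + (u2 - u1) ** 2 * B  # weighted value of u^2
    return (
        lam - ((rho2 - rho1) * xi + rho1 * u1 - rho2 * u2),
        lam * (xi - w),
        xi * (rho1 * u1 - rho2 * u2) + lam * w + rho2 * u2 ** 2 - rho1 * u1 ** 2,
        lam * (xi * w - w2),
    )


if __name__ == "__main__":
    # Sample states (rho1, u1, rho2, u2) with u1 > u2; values are arbitrary.
    for state in [(1.0, 2.0, 4.0, -1.0), (0.3, 1.5, 2.7, 0.2)]:
        print(state, delta_shock_parameters(*state), residuals(*state))
```

For any admissible state the computed ξ_α lies between u2 and u1, in agreement with the constraint used to select the root.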

We remark here that (4.29) can always be achieved for any ρ1 ρ2 > 0 by appropriately choosing θ1 and θ2. Thus (2.3) holds for ξ_α, λ, A chosen above, and (2.4) is easily seen. This completes the proof.

5. Existence of Distribution Solutions to (1.1), (1.2) for ρ1 ρ2 = 0

In this section we only consider the case ρ1 = 0, ρ2 > 0. The case ρ2 = 0, ρ1 > 0 can be treated similarly. Let (ρ_η(x, t), u_η(x, t)) = (ρ_η(ξ), u_η(ξ)) be the macroscopic aspect of the distribution solution (P, U) to (1.1), (1.2) with initial data
\[ (\rho_\eta(x, 0), u_\eta(x, 0)) = \begin{cases} (\eta, u_1), & x \le 0, \\ (\rho_2, u_2), & x > 0. \end{cases} \tag{5.1} \]
We set (ρ(ξ), u(ξ)) = lim_{η→0} (ρ_η(ξ), u_η(ξ)) in D'(R). Then (ρ(ξ), u(ξ)) is a classical weak solution to (1.1), (1.2) (of course, (P, U) is also a distribution solution to (1.1), (1.2) in Colombeau's sense). As a matter of fact, when u1 ≤ u2, the smooth solution (ρ_{µη}(ξ), u_{µη}(ξ)) to (1.6), (5.1) is uniformly bounded with respect to both µ and η and of uniformly bounded variation, and (ρ(ξ), u(ξ)) is a classical weak solution in L^∞(R × R_+). On the other hand, when u1 > u2, the singular part λ_η δ(ξ − ξ_α^η) of ρ_η(ξ) vanishes as η → 0, since its strength satisfies
\[ \lambda_\eta = \sqrt{\rho_1\rho_2}\,(u_1 - u_2) = \sqrt{\eta\rho_2}\,(u_1 - u_2) \to 0 \quad \text{as } \eta \to 0 \]
(cf. (4.19), (4.28)), and the weak limit (ρ(ξ), u(ξ)) of (ρ_η(ξ), u_η(ξ)) as η → 0 lies in L^∞(R × R_+). In short, (1.1), (1.2) always has a classical weak solution when one initial datum is a vacuum state.

Acknowledgement. I am deeply grateful to Prof. Xiaqi Ding for his persistent encouragement. I would also like to express my thanks to Prof. M. Oberguggenberger for bringing his works to my attention.

References

1. Courant, R. and Friedrichs, K.O.: Supersonic flow and shock waves. Berlin–Heidelberg–New York: Springer-Verlag, 1976
2. Colombeau, J.F.: New generalized functions and multiplication of distributions. Amsterdam: North Holland, 1984
3. Colombeau, J.F.: Multiplication of distributions. Springer Lecture Notes in Mathematics, Vol. 1532, Heidelberg: Springer-Verlag, 1992
4. Colombeau, J.F. and Oberguggenberger, M.: On a hyperbolic system with a compatible quadratic term: Generalized solutions, delta waves, and multiplication of distributions. Comm. Part. Diff. Eqs. 14, 905–938 (1990)
5. Colombeau, J.F. and LeRoux, A.Y.: Multiplication of distributions in elasticity and hydrodynamics. J. Math. Phys. 29, 315–319 (1988)
6. Colombeau, J.F., LeRoux, A.Y., Noussair, A. and Perrot, B.: Microscopic profiles of shock waves and ambiguities in multiplications of distributions. SIAM J. Numer. Anal. 26, 871–883 (1989)
7. Dafermos, C.M.: Solution of the Riemann problem for a class of hyperbolic conservation laws by the viscosity method. Arch. Rational Mech. Anal. 52, 1–9 (1973)
8. Dafermos, C.M. and DiPerna, R.J.: The Riemann problem for certain classes of hyperbolic systems of conservation laws. J. Diff. Eqs. 20, 90–114 (1976)
9. Egorov, Yu.V.: On the theory of generalized functions. Russ. Math. Surveys 45, 3–40 (1990)
10. Greenberg, J.M. and LeRoux, A.Y.: A well-balanced scheme for the numerical processing of source terms in hyperbolic equations. SIAM J. Num. Anal. 33, 1–16 (1996)
11. Hu, J.X.: A limiting viscosity approach to Riemann solutions containing delta-shock waves for nonstrictly hyperbolic conservation laws. Quarterly Appl. Math. 2, 361–373 (1997)
12. Maso, G.D., LeFloch, P. and Murat, F.: Definition and weak stability of nonconservative products. J. Math. Pure Appliquees 6, 483–548 (1995)
13. Oberguggenberger, M.: Multiplications of distributions and applications to partial differential equations. Pitman Research Notes Math. Vol. 259, Harlow: Longman, 1992
14. Oberguggenberger, M.: Products of distributions: Nonstandard methods. Zeit. Anal. Anw. 7, 347–365 (1988); Corrections to this article: Zeit. Anal. Anw. 10, 263–264 (1991)
15. Rosinger, E.E.: Nonlinear Partial Differential Equations. An Algebraic view of generalized solutions. Amsterdam: North Holland, 1990
16. Schwartz, L.: Sur l'impossibilité de la multiplication des distributions. C.R. Acad. Sci. Paris 239, 847–848 (1954)
17. Slemrod, M. and Tzavaras, A.E.: A limiting viscosity approach for the Riemann problem in isentropic gas dynamics. Indiana Univ. Math. J. 38, 1047–1074 (1989)
18. Smoller, J.: Shock Waves and Reaction-Diffusion Equations. Berlin–Heidelberg–New York: Springer-Verlag, 1983
19. Tan, D.C., Zhang, T. and Zheng, Y.X.: Delta-Shock waves as limits of vanishing viscosity for hyperbolic systems of conservation laws. J. Diff. Eqs. 112, 1–32 (1994)
20. Todorov, T.: Colombeau's generalized functions and nonstandard analysis. In: Generalized Functions, Convergence Structures and Their Applications (B. Stankovic, E. Pap, S. Pilipovic, V.S. Vladimirov, eds.), New York: Plenum Press, 1988, pp. 327–339
21. Tupciev, V.A.: On the method of introducing viscosity in the study of problems involving decay of a discontinuity. Dokl. Akad. Nauk SSSR 211, 55–58 (1973)
22. Volpert, A.I.: The space BV and quasilinear equations. Math. USSR Sbornik 73(115), 225–267 (1967)
23. Volpert, A.I. and Hudjaev, S.I.: Analysis in classes of discontinuous functions and equations of mathematical physics. Nijhoff, 1985
24. Wang, Z., Huang, F.M. and Ding, X.Q.: On the Cauchy problem of transportation equations. Acta Math. Appl. Sinica, 2, (1997)
25. Wang, Z. and Ding, X.Q.: Uniqueness of generalized solutions for the Cauchy problem of transportation equations. To appear
26. E, Weinan, Rykov, Yu.G. and Sinai, Ya.G.: Generalized variational principles, global weak solutions and behavior with random initial data for systems of conservation laws arising in adhesion particle dynamics. Commun. Math. Phys. 177, 349–380 (1996)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 194, 207 – 230 (1998)



Poincaré–Cartan Class and Deformation Quantization of Kähler Manifolds

Hideki Omori^1, Yoshiaki Maeda^2, Naoya Miyazaki^1, Akira Yoshioka^3

1 Department of Mathematics, Faculty of Science and Technology, Science University of Tokyo, Noda, Chiba, 278, Japan
2 Department of Mathematics, Faculty of Science and Technology, Keio University, Hiyoshi, Yokohama, 223, Japan
3 Department of Mathematics, Faculty of Engineering, Science University of Tokyo, Kagurazaka, Tokyo, 162, Japan

Received: 4 April 1997 / Accepted: 12 October 1997

Abstract: We introduce a complete invariant for Weyl manifolds, called the Poincaré–Cartan class. Applying the constructions of the Weyl manifold to complex manifolds via the Poincaré–Cartan class, we propose the notion of a noncommutative Kähler manifold. For a given Kähler manifold, the necessary and sufficient condition for a Weyl manifold to be a noncommutative Kähler manifold is given. In particular, there exists a noncommutative Kähler manifold for any Kähler manifold. We also construct the noncommutative version of the S^1-principal bundle over a quantizable Weyl manifold.

Introduction

The construction of deformation quantizations of symplectic manifolds has been extensively studied in recent work. The purpose of this paper is to present a cohomological invariant of Weyl manifolds, which appeared in the construction of star-products on a symplectic manifold. As introduced by Bayen, Flato, Fronsdal, Lichnerowicz and Sternheimer in [BFL], a deformation quantization, or more precisely a star-product, on a symplectic manifold M is an associative product ∗ on C^∞(M)[[ν]], the space of formal power series in ν with coefficients in C^∞(M), such that:
(D1) f ∗ g = fg + (ν/2){f, g} + ···, for f, g ∈ C^∞(M), where { , } stands for the Poisson bracket on M.
(D2) 1 ∗ f = f = f ∗ 1, ν ∈ center.
(D3) Complex conjugation f → f̄ is an anti-automorphism of (C^∞(M)[[ν]], ∗), where ν̄ = −ν.
By the localization theorem (cf. [OMY2, O, p.312]), we may always assume that the star-product ∗ is local, i.e. supp f ∗ g ⊂ supp f ∩ supp g as C[[ν]]-valued functions.


One construction of star-products for symplectic manifolds was first given by Vey [V] and Lichnerowicz [L] via a torsion-free flat connection. Using different approaches, De Wilde-Lecomte [DL], Fedosov [F] and Omori-Maeda-Yoshioka [OMY1] have proved the existence of a star-product for an arbitrary symplectic manifold. De Wilde-Lecomte worked algebraically via careful cohomological arguments, while Fedosov and Omori-Maeda-Yoshioka used a geometric method on the Weyl bundle (cf. [W]). Fedosov's crucial idea is to construct a flat connection on the sections of the Weyl bundle. [OMY1] built a noncommutative version of manifolds, called Weyl manifolds, from a given symplectic manifold. Thus, it is natural to ask how the constructions of [DL], [F] and [OMY1] relate to each other. Deligne [D] studied the relationship between the construction of the star-product by De Wilde-Lecomte and by Fedosov and showed that these constructions are equivalent to each other. Recently, there has been interesting work on the equivalence of star-products by Xu [X] and Bertelson-Cahen-Gutt [BCG].

In this paper, we first remark on the equivalence of the star-products constructed in [OMY1]. The Weyl manifold, by definition (cf. Sect. 3.1), is constructed by patching “noncommutative coordinates”, and constructions of the star-product are built on that of Weyl manifolds. The quantum version of Darboux's theorem (cf. [O], p.317) combined with the inverse Moyal product formula (1.2) easily gives that all star-products are obtained as the algebra of Weyl functions on a Weyl manifold (cf. remark after Theorem 3.2). Fedosov's flat connection is the connection on a Weyl algebra bundle for which all Weyl functions are characterized as parallel sections.

We show in this paper that there is a bijective correspondence between the equivalence classes of Weyl manifolds and the second cohomology group H^2(M, ν^2 C[[ν^2]]) (Theorem 3.5). The correspondence is given by a characteristic Čech 2-cohomology class (cf. Definition 3.4) called the Poincaré–Cartan class, which comes from a patching of “quantized Darboux coordinates” to make a noncommutative manifold. The Poincaré–Cartan class was proposed previously by Karasev and Maslov in [KM] as an invariant for their asymptotic quantization theory. It is remarked that its integration on a circle coincides with the original Poincaré–Cartan invariant (cf. [O]). On the other hand, a characteristic class was defined by Nest-Tsygan [NT] in terms of the curvature of the connection for the Weyl bundle, which distinguishes Fedosov star-products up to equivalence (cf. [F]). We conjecture that their characteristic class might coincide with the Poincaré–Cartan class for the Weyl manifold.

The main purpose of this paper is to apply the Poincaré–Cartan class to complex manifolds and to propose the notion of a noncommutative Kähler manifold. A Kähler manifold is a special type of symplectic manifold with the property that its coordinate transformations are not only symplectic but also holomorphic. For a given Kähler manifold M, we give a necessary and sufficient condition for a Weyl manifold over M to be a noncommutative Kähler manifold in terms of its Poincaré–Cartan class (Theorem 4.6). We also show that there exists a noncommutative Kähler manifold for every Kähler manifold (Theorem 5.2).

The second subject of this paper, discussed in Sect. 6, is, as an application of the construction of star-products via Weyl manifolds, a construction of a quantum S^1-bundle over a symplectic manifold with the quantization condition. In patching up the (noncommutative) local coordinates to obtain the Weyl manifold, we use a derivation which generates a noncommutative version of the circle action on the S^1-bundle of a symplectic manifold satisfying the quantization condition.
Furthermore, if the base manifold $M$ has a Kähler structure, then the noncommutative version of the associated line bundle has a structure which one may call a “holomorphic line bundle”. This structure naturally gives the notion of holomorphic sections, and the space of all holomorphic sections is a maximal commutative subalgebra. It should be remarked that the construction of star-products in [OMY1] has the advantage of yielding such constructions naturally.

1. Weyl Functions

We first review briefly Weyl functions and Weyl diffeomorphisms on Weyl algebras. Here we start with a Weyl algebra over $\mathbb{R}$. A Weyl algebra $W$ is the algebra generated formally by $\nu, X_1, \dots, X_n, Y_1, \dots, Y_n$ over $\mathbb{R}$ with the fundamental relations
\[ [\nu, X_i] = [\nu, Y_i] = 0, \quad [X_i, X_j] = [Y_i, Y_j] = 0, \quad [X_i, Y_j] = -\nu\delta_{ij}. \]
The multiplication of the algebra is denoted by $*$. Then, the Weyl algebra $W$ can be identified with the algebra $\mathbb{R}[[X, Y, \nu]]$ of formal power series with the following product, called the Moyal product:
\[ a * b = a \exp\Bigl\{-\frac{\nu}{2}\,\overleftarrow{\partial_X}\,\dot{\wedge}\,\overrightarrow{\partial_Y}\Bigr\}\, b, \tag{1.1} \]
where $a\,\overleftarrow{\partial_X}\,\dot{\wedge}\,\overrightarrow{\partial_Y}\,b = \sum_{j=1}^{n} \{\partial_{X_j}a \cdot \partial_{Y_j}b - \partial_{Y_j}a \cdot \partial_{X_j}b\}$. We put the usual adic topology on $W$. The formula (1.1) can be inverted to recapture the commutative product as follows:
\[ a \cdot b = a \exp\Bigl\{\frac{\nu}{2}\,\overleftarrow{\partial_X}\,\overset{*}{\wedge}\,\overrightarrow{\partial_Y}\Bigr\}\, b, \tag{1.2} \]
where $a\,\overleftarrow{\partial_X}\,\overset{*}{\wedge}\,\overrightarrow{\partial_Y}\,b = \sum_{j=1}^{n} \{\partial_{X_j}a * \partial_{Y_j}b - \partial_{Y_j}a * \partial_{X_j}b\}$. This can be viewed as a method of constructing a commutative product from the $*$-product. This idea appears in Sect. 3 to make a model space of a Weyl manifold, and in Sect. 6 to solve an equation given by using the $*$-product. We define an involutive anti-automorphism $a \mapsto \bar a$ by setting $\bar X_i = X_i$, $\bar Y_j = Y_j$, $\bar\nu = -\nu$. Note that there are other systems of elements $(X'_1, \dots, X'_n, Y'_1, \dots, Y'_n)$ of $W$ with the same fundamental relations which topologically generate the same $W$. We call such $X'_1, \dots, X'_n, Y'_1, \dots, Y'_n$ quantum canonical generators (QC-generators).

1.1. Weyl function. Let $U$ be an open set of $\mathbb{R}^{2n}$ with linear coordinates $(x_1, \dots, x_n, y_1, \dots, y_n)$, and $W_U$ the trivial algebra bundle $U \times W$. Let $\Gamma(W_U)$ be the space of all continuous sections of $W_U$ with respect to the compact open topology. $\Gamma(W_U)$ is an associative algebra over $\mathbb{R}$ under the pointwise $*$-product. Define the sections $\xi_i, \eta_i$ of $W_U$ by
\[ \xi_i(p) = x_i(p) + X_i, \quad \eta_i(p) = y_i(p) + Y_i, \quad i = 1, \dots, n. \tag{1.3} \]
Then, we have $[\xi_i, \eta_j] = -\nu\delta_{ij}$, $[\xi_i, \xi_j] = [\eta_i, \eta_j] = 0$. For any $\mathbb{R}[[\nu]]$-valued $C^\infty$ function $f$, we define a section $f^\sharp(\xi, \eta)$, called a Weyl function, by the formula
\[ f^\sharp(\xi, \eta)(p) = \sum_{\lambda\mu} \frac{1}{\lambda!\,\mu!}\,\partial_x^\lambda \partial_y^\mu f(p)\, X^\lambda \cdot Y^\mu. \tag{1.4} \]

For f ∈ C ∞ (U )[[ν]] we call f ] the Weyl continuation of f . Obviously ξi = x]i , and ηi = yi] . We define F (W U ) to be the set of all Weyl functions. F(W U ) is a closed subalgebra of Γ (W U ) (cf.[OMY1]).
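As a quick check of these conventions, the following worked computation evaluates the Moyal product (1.1) on the generators and the Weyl continuation (1.4) of a simple function. It is only an illustrative sketch for $n = 1$ and $f = x^2$ (choices made here, not taken from [OMY1]), with $\sharp$ denoting the Weyl continuation.

```latex
% n = 1: on the generators only the term linear in \nu of (1.1) survives.
\begin{align*}
X * Y &= XY - \tfrac{\nu}{2}\bigl(\partial_X X\cdot\partial_Y Y - \partial_Y X\cdot\partial_X Y\bigr) = XY - \tfrac{\nu}{2},\\
Y * X &= YX - \tfrac{\nu}{2}\bigl(\partial_X Y\cdot\partial_Y X - \partial_Y Y\cdot\partial_X X\bigr) = XY + \tfrac{\nu}{2},\\
[X, Y] &= X * Y - Y * X = -\nu,\\
(x^2)^\sharp(p) &= x(p)^2 + 2x(p)\,X + \tfrac{1}{2!}\cdot 2\,X^2 = \bigl(x(p) + X\bigr)^2 = (\xi * \xi)(p).
\end{align*}
```

The third line recovers the fundamental relation $[X_i, Y_j] = -\nu\delta_{ij}$, and the last line uses $X * X = X^2$, which holds because the antisymmetric bidifferential term in (1.1) vanishes when both factors are equal.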

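It is easily seen that the $*$-product $f^\sharp * g^\sharp$ is given by the same formula (1.1), i.e.
\[ f^\sharp * g^\sharp = \Bigl(f \exp\Bigl\{-\frac{\nu}{2}\,\overleftarrow{\partial_x}\,\dot{\wedge}\,\overrightarrow{\partial_y}\Bigr\}\, g\Bigr)^\sharp \quad \text{(cf. [OMY1])}. \tag{1.5} \]
Moreover, the involutive anti-automorphism $a \mapsto \bar a$ extends naturally to $\Gamma(W_U)$, and $\overline{\mathcal{F}(W_U)} = \mathcal{F}(W_U)$. We also have $\overline{f^\sharp} = (\bar f)^\sharp$.

1.2. Integration on $W_U$. For a Weyl function $f^\sharp \in \mathcal{F}(W_U)$ with $f$ integrable on $U$, we define the integral of $f^\sharp$ by
\[ \int_U f^\sharp = \int_U f\, dV \in \mathbb{R}[[\nu]], \]
where $dV = dx_1 \cdots dx_n\, dy_1 \cdots dy_n$ is the usual volume element on $U$. Integration by parts shows that if one of $f, g$ has compact support, then $\int_U f\{\overleftarrow{\partial_y}\,\dot{\wedge}\,\overrightarrow{\partial_x}\}^k g\, dV = 0$. Hence, we have
\[ \int_U f^\sharp * g^\sharp = \int_U f \cdot g\, dV. \tag{1.6} \]
In particular, we have
\[ \int_U f^\sharp * g^\sharp = \int_U g^\sharp * f^\sharp, \qquad \overline{\int_U f^\sharp} = \int_U \overline{f^\sharp}. \tag{1.7} \]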
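To see why the higher-order terms drop out of (1.6), it may help to write out the first-order case. The following is a sketch for $n = 1$ and $k = 1$, assuming $g$ has compact support and reading the bidifferential operator by analogy with the definition given after (1.1).

```latex
% k = 1, n = 1; boundary terms vanish because g has compact support in U.
\begin{align*}
\int_U f\{\overleftarrow{\partial_y}\,\dot{\wedge}\,\overrightarrow{\partial_x}\}\,g\,dV
  &= \int_U \bigl(\partial_y f\,\partial_x g - \partial_x f\,\partial_y g\bigr)\,dx\,dy\\
  &= \int_U f\,\bigl(-\partial_y\partial_x g + \partial_x\partial_y g\bigr)\,dx\,dy = 0 .
\end{align*}
```

The same integration by parts applies to each power $k$, so only the zeroth-order term $f \cdot g$ of (1.5) contributes to the integral.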

1.3. The contact Weyl Lie algebra. We define a derivation $L_0$ as follows:
\[ L_0\nu = 2\nu^2, \quad L_0 X_i = \nu X_i, \quad L_0 Y_i = \nu Y_i. \]
Together with a formal symbol $\tau$, we define a Lie algebra, called a contact Weyl Lie algebra, $\mathfrak{g} = \mathbb{R}\tau \oplus W$ with the bracket
\[ [a\tau + f,\, b\tau + g] = a L_0 g - b L_0 f + [f, g]. \tag{1.8} \]

We easily see that $[\mathfrak{g}, \mathfrak{g}] \subset \nu * W$. We also set $\bar\tau = \tau$ to define an involutive anti-automorphism.

Definition 1.1. A linear mapping $A : \mathfrak{g} \to \mathfrak{g}$ is called a $\nu$-isomorphism, if $A$ is a Lie algebra isomorphism satisfying (i) $A(\nu) = \nu$, (ii) $AW = W$ and (iii) the restriction $A|W$ is an algebra isomorphism. $D : \mathfrak{g} \to \mathfrak{g}$ is called a $\nu$-derivation if $D$ is a Lie algebra derivation satisfying (i) $D(\nu) = 0$, (ii) $DW \subset W$, and (iii) the restriction $D|W$ is an algebra derivation. (Cf. [OMY1], Definition 4.2.)

Although $L_0$ and hence $\tau$ depend on the choice of QC-generators, it is easy to see that the $\nu$-isomorphism class of $\mathfrak{g}$ is determined only by $W$.

Lemma 1.2. For every $\nu$-derivation $D : \mathfrak{g} \to \mathfrak{g}$, there are $f \in W$, $c \in \mathbb{R}$ such that $D$ is written in the form $D = \mathrm{ad}(\nu^{-1} * f) + c\,\mathrm{ad}(\log\nu)$. If $D(\tau) \in \nu * W$, then $f = \nu * g$, $g \in W$, in the above expression and $c$ is determined uniquely by $D$; $g$ is determined only up to a constant. If $D(\tau) \in \nu^2 * W$, then $D = \mathrm{ad}(\nu * g)$, where $g$ is determined uniquely by $D$.



Here, we first remark that $\mathrm{ad}(\nu^{-1} * f)$ and $\mathrm{ad}(\log\nu)$ are defined by only symbolic use of $\nu^{-1} * f$ and $\log\nu$. Note that the above lemma is proved in [OMY1, Proposition 4.3] in the case of complex coefficients, but the proof also works for real coefficients. Though the second statement was not given there, it can be seen easily from the proof. Let $U$ be an open subset of $\mathbb{R}^{2n}$ with coordinates $x_1, \dots, x_n, y_1, \dots, y_n$, and $\Gamma(\mathfrak{g}_U)$ the space of all continuous sections of the trivial bundle $\mathfrak{g}_U = U \times \mathfrak{g}$ over $U$. We define a section by
\[ \tilde\tau(p) = \tau - \sum_{i=1}^{n} \bigl(y_i(p) X_i - x_i(p) Y_i\bigr). \tag{1.9} \]
The sections $\xi_i, \eta_i$ given by (1.3) are contained in $\Gamma(\mathfrak{g}_U)$, and we have
\[ [\tilde\tau, \xi_i] = \nu * \xi_i, \quad [\tilde\tau, \eta_i] = \nu * \eta_i, \quad [\xi_i, \eta_j] = -\nu\delta_{ij}, \quad [\tilde\tau, \nu] = 2\nu^2. \tag{1.10} \]
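The relations (1.10) can be checked directly from the bracket (1.8), the definition (1.9) and the fundamental relations. The computation below is a sketch of that verification; it uses only $[\tau, f] = L_0 f$ for $f \in W$ (the case $a = 1$, $b = 0$ of (1.8)), the fact that the scalars $x_j(p), y_j(p)$ are killed by $L_0$, and $[Y_j, X_i] = \nu\delta_{ij}$.

```latex
\begin{align*}
[\tilde\tau, \xi_i] &= [\tau, X_i] - \sum_j \bigl(y_j [X_j, X_i] - x_j [Y_j, X_i]\bigr)
   = \nu X_i + \sum_j x_j\,\nu\delta_{ij} = \nu\,(x_i + X_i) = \nu * \xi_i,\\
[\tilde\tau, \eta_i] &= [\tau, Y_i] - \sum_j \bigl(y_j [X_j, Y_i] - x_j [Y_j, Y_i]\bigr)
   = \nu Y_i + \sum_j y_j\,\nu\delta_{ij} = \nu\,(y_i + Y_i) = \nu * \eta_i,\\
[\tilde\tau, \nu] &= L_0 \nu = 2\nu^2 .
\end{align*}
```

The same bookkeeping shows that $L_0$ is compatible with the fundamental relation: $L_0[X_i, Y_i] = [\nu X_i, Y_i] + [X_i, \nu Y_i] = -2\nu^2 = L_0(-\nu)$.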

We give several remarks on the complexification. The notions of Weyl algebras and Weyl functions can be easily complexified by taking the tensor product with $\mathbb{C}$. We denote these by $W^{\mathbb{C}}$ and $\mathcal{F}(W_U)^{\mathbb{C}}$. $W$, $\mathcal{F}(W_U)$ are real subalgebras of $W^{\mathbb{C}}$, $\mathcal{F}(W_U)^{\mathbb{C}}$. The involutive anti-automorphism extends naturally to these complexified algebras by setting $\bar{i} = -i$. Here, one should take care that, for instance, $W$ is not the subspace $\{a \in W^{\mathbb{C}} ; \bar a = a\}$. To avoid the confusion that might occur, we define as follows: a linear mapping $\Phi : \mathcal{F}(W_U)^{\mathbb{C}} \to \mathcal{F}(W_U)^{\mathbb{C}}$ over $\mathbb{R}$ is said to have the hermitian property if $\overline{\Phi(f)} = \Phi(\bar f)$ holds for every $f$, and $\Phi$ is said to have the real-to-real property if $\Phi(\mathcal{F}(W_U)) \subset \mathcal{F}(W_U)$. The notions of $\nu$-isomorphisms and $\nu$-derivations of $\mathfrak{g}$ extend to the complexification $\mathfrak{g}^{\mathbb{C}} = \mathfrak{g} \otimes \mathbb{C}$. Lemma 1.2 and the remark following it hold for the complexified case.

2. Patching Diffeomorphisms

2.1. Weyl diffeomorphisms and contact Weyl diffeomorphisms. Let $U$ and $V$ be open subsets of $\mathbb{R}^{2n}$ with coordinates $x_1, \dots, x_n, y_1, \dots, y_n$. Consider the trivial algebra bundles $W_U = U \times W$ and $W_V = V \times W$ over $U$ and $V$ respectively. For a bundle isomorphism $\Phi : W_U \to W_V$ inducing a diffeomorphism $\varphi : U \to V$ on base spaces, we define the pullback $\Phi^* : \Gamma(W_V) \to \Gamma(W_U)$ by $(\Phi^* S)(p) = \Phi^{-1} S(\varphi(p))$. A continuous algebra isomorphism $\Psi : \mathcal{F}(W_V) \to \mathcal{F}(W_U)$ such that $\Psi(\nu) = \nu$ will be called a pre-Weyl diffeomorphism. The following lemma is shown in [OMY1, Lemma 3.2]:

Lemma 2.1. For any pre-Weyl diffeomorphism $\Psi : \mathcal{F}(W_V) \to \mathcal{F}(W_U)$, there exists a unique bundle isomorphism $\Phi$ such that $\Psi = \Phi^*$. In particular, the induced diffeomorphism $\varphi : U \to V$ is a symplectic diffeomorphism with respect to the natural symplectic 2-form $\Omega = \sum dx_i \wedge dy_i$.

A pre-Weyl diffeomorphism $\Psi : \mathcal{F}(W_V) \to \mathcal{F}(W_U)$ is called a Weyl diffeomorphism, if $\Psi$ has the hermitian property $\Psi(\bar f) = \overline{\Psi(f)}$. By Lemma 2.1 and the same proof as [OMY3, Proposition 2], we see easily that any pre-Weyl diffeomorphism $\Psi$ has the volume preserving property:
\[ \int_U \Psi(f) = \int_V f. \tag{2.1} \]



Remark that the definition of the Weyl diffeomorphism is slightly stronger than that given in [OMY1, Definition 3.4]. Though (2.1) is required in the definition of the Weyl diffeomorphism in [OMY3], it holds automatically by the above observation. Note that the notion of $\nu$-derivations in Definition 1.1 extends naturally to $\Gamma(\mathfrak{g}_U)$. Remember that a $\nu$-derivation induces, by definition, an algebra derivation on $\Gamma(W_U)$.

Definition 2.2. A $\nu$-derivation $\Xi : \Gamma(\mathfrak{g}_U) \to \Gamma(\mathfrak{g}_U)$ is called a contact Weyl vector field if $\Xi(\nu) = 0$, $\Xi\mathcal{F}(W_U) \subset \mathcal{F}(W_U)$ and $\Xi(\tilde\tau) \in \mathcal{F}(W_U)$.

2.2. Contact Weyl diffeomorphisms. We call an isomorphism $\Phi^{c*} : \Gamma(\mathfrak{g}_V) \to \Gamma(\mathfrak{g}_U)$ a pointless contact diffeomorphism if $\Phi^{c*}$ is a Lie algebra isomorphism such that $\Phi^{c*}(\nu) = \nu$, $\Phi^{c*}(\tilde\tau) \in \tilde\tau + \mathcal{F}(W_U)$, $\Phi^{c*}\mathcal{F}(W_V) = \mathcal{F}(W_U)$, and the restriction $\Phi^{c*}|\mathcal{F}(W_V)$ is an algebra isomorphism. $\Phi^{c*}$ is called a contact Weyl diffeomorphism, if the restriction to $\mathcal{F}(W_V)$ gives a Weyl diffeomorphism.

Proposition 2.3. Suppose $U, V$ are diffeomorphic to the open unit disk $D^{2n}$ of $\mathbb{R}^{2n}$. For every symplectic diffeomorphism $\varphi : U \to V$, there is a Weyl diffeomorphism $\Phi : W_U \to W_V$ inducing $\varphi$ between base spaces. Moreover, $\Phi^*$ extends to a contact Weyl diffeomorphism $\Phi^{c*} : \Gamma(\mathfrak{g}_V) \to \Gamma(\mathfrak{g}_U)$ such that $\overline{\Phi^{c*}(f)} = \Phi^{c*}(\bar f)$.

Proposition 2.3 is given in [OMY1, Theorems 3.7 and 4.7] in the case of complex coefficients, but it also holds for the real case by the same proof. In the proof of [OMY1, Theorem 3.7], $\varphi$ is requested to be a symplectic diffeomorphism of U onto V . However this condition is easily removed by considering an exhausting family of closed subsets of $U$ and $V$. The $\Phi$ given by Proposition 2.3 is called a lift of $\varphi$. Note that the lift $\Phi$ of $\varphi$ is not unique in general. Let $W_U^{\mathbb{C}}$ and $\mathcal{F}(W_U)^{\mathbb{C}}$ be the complexifications of $W_U$ and $\mathcal{F}(W_U)$ respectively. The notions of pre-Weyl diffeomorphisms and Weyl diffeomorphisms extend naturally to these complexified algebras. Let $\Gamma(\mathfrak{g}_U)^{\mathbb{C}}$ be the complexification of $\Gamma(\mathfrak{g}_U)$.
As in Lemma 1.2, the notions of contact Weyl vector fields, pointless contact diffeomorphisms, etc. extend naturally to $\Gamma(\mathfrak{g}_U)^{\mathbb{C}}$. By Lemma 1.2 and the remark in the last paragraph of Sect. 1, we have:

Lemma 2.4. For a contact Weyl vector field $\Xi : \Gamma(\mathfrak{g}_U)^{\mathbb{C}} \to \Gamma(\mathfrak{g}_U)^{\mathbb{C}}$ there exist $f \in \mathcal{F}(W_U)^{\mathbb{C}}$ and $c' \in \mathbb{C}$ such that $\Xi = \mathrm{ad}(\nu^{-1} * f) + c'\,\mathrm{ad}(\log\nu)$:
(1) If $\Xi(\tilde\tau) \in \nu * \mathcal{F}(W_U)^{\mathbb{C}}$, then $f \in \nu * \mathcal{F}(W_U)^{\mathbb{C}}$, $c' \in \mathbb{C}$, and $c'$ is uniquely determined. $\nu^{-1} * f$ is determined only up to a constant.
(2) If $\Xi(\tilde\tau) \in \nu^2 * \Gamma(\mathfrak{g}_U)^{\mathbb{C}}$, then $c' = 0$, and $f$ can be taken in $\nu^2 * \mathcal{F}(W_U)^{\mathbb{C}}$, and such $f$ is unique.
(3) If $\Xi$ has the real-to-real property $\Xi\Gamma(\mathfrak{g}_U) \subset \Gamma(\mathfrak{g}_U)$, then $c' \in \mathbb{R}$ and $f$ can be taken in $\mathcal{F}(W_U)$.
(4) If $\Xi$ has the real-to-real property, the hermitian property $\overline{\Xi(h)} = \Xi(\bar h)$, and $\Xi(\tilde\tau) \in \nu * \mathcal{F}(W_U)$, then $c' = 0$ and $f$ can be taken in $\nu^2 * \mathcal{F}(W_U)$, and hence such $f$ is unique.

Proof. (1) and (2) are easy to see by Lemma 1.2, and (3) is given by a similar proof. For (4), we see by (1)-(3) that there are $g \in \mathcal{F}(W_U)$ and $c' \in \mathbb{R}$ such that $\Xi = \mathrm{ad}(g) + c'\,\mathrm{ad}(\log\nu)$. By the hermitian property, we have $\overline{\Xi(\tilde\tau)} = \Xi(\tilde\tau)$. It follows that $[\tilde\tau, g + \bar g] = 4c'\nu$. Since $g + \bar g \in \sum_{k \ge 0} \nu^{2k} C^\infty(U)^\sharp$, we have $c' = 0$ and $g$ is written in the form $g = \sum \nu^{2k+1} g_{2k+1}$. It follows that $f = \nu * g \in \nu^2 * \mathcal{F}(W_U)$. This yields $\Xi(\tilde\tau) \in \nu^2 * \mathcal{F}(W_U)$, and hence $f$ is determined uniquely by $\Xi$.

Considering the formal expansion in $\nu^k$, we see that if a pointless contact diffeomorphism $\Phi^{c*} : \Gamma(\mathfrak{g}_U)^{\mathbb{C}} \to \Gamma(\mathfrak{g}_U)^{\mathbb{C}}$ induces the identity on the base space $U$, then $\Phi^{c*}$ is written in the form
\[ \Phi^{c*} = \prod_{\infty}^{0} e^{\mathrm{ad}(\nu^k h_k^\sharp)}\; e^{c\,\mathrm{ad}(\nu^{-1})}\; e^{c'\,\mathrm{ad}(\log\nu)}, \tag{2.2} \]
using $h_k \in C^\infty(U)^{\mathbb{C}}$ for every integer $k \ge 0$ and $c, c' \in \mathbb{C}$, where the notation $\prod_{\infty}^{0} I_k$ means $\cdots I_k \cdots I_2 I_1 I_0$. $c'$ is determined uniquely by $\Phi^{c*}$ by virtue of Lemma 2.4, (1). By Lemma 2.4 we easily obtain the following:

Corollary 2.5. If a pointless contact diffeomorphism $\Phi^{c*} : \Gamma(\mathfrak{g}_U)^{\mathbb{C}} \to \Gamma(\mathfrak{g}_U)^{\mathbb{C}}$ induces the identity on the base space $U$, then $h_k, c, c'$ in (2.2) satisfy the following:
(1) If $\Phi^{c*}(\tilde\tau) \in \tilde\tau + 2c + \nu^2 * \mathcal{F}(W_U)^{\mathbb{C}}$, then $c' = 0$, $h_0 = 0$, and $c$, $h_k$ ($k \ge 1$) are unique.
(2) If $\Phi^{c*}$ has the real-to-real property $\Phi^{c*}\Gamma(\mathfrak{g}_U) = \Gamma(\mathfrak{g}_U)$, then $c, c' \in \mathbb{R}$ and $h_k \in \mathcal{F}(W_U)$.
(3) If $\Phi^{c*}$ has the real-to-real property and the hermitian property $\overline{\Phi^{c*}(f)} = \Phi^{c*}(\bar f)$, then $c' = 0$, $h_{2k} = 0$ for $k \ge 0$, and $c$ and $h_{2k+1}$ ($k \ge 0$) are unique.
(4) If $\Phi^{c*}$ induces the identity on $\mathcal{F}(W_U)^{\mathbb{C}}$, then there are $\tilde c \in \mathbb{C}[[\nu]]$, $c' \in \mathbb{C}$ such that $\Phi^{c*} = e^{\tilde c\,\mathrm{ad}(\nu^{-1}) + c'\,\mathrm{ad}(\log\nu)}$. Furthermore, if $\Phi^{c*}$ has the real-to-real property and the hermitian property, then $c' = 0$ and $\tilde c \in \mathbb{R}[[\nu^2]]$.

Proof. Here we give the proof of (4). Set $\Phi^{c*}(\tilde\tau) = \tilde\tau + g$, $g \in \mathcal{F}(W_U)^{\mathbb{C}}$. As $[\tilde\tau, \xi_i] = \nu\xi_i$, $[\tilde\tau, \eta_j] = \nu\eta_j$, and $\Phi^{c*}$ is an isomorphism, we have $[g, \xi_i] = [g, \eta_j] = 0$, hence $g \in \mathbb{C}[[\nu]]$. The second statement of (4) follows easily.

The next lemma is given in [OMY1]:

Lemma 2.6. For every pre-Weyl diffeomorphism $\Phi^* : \mathcal{F}(W_U)^{\mathbb{C}} \to \mathcal{F}(W_U)^{\mathbb{C}}$ there is a pointless contact diffeomorphism $\Phi^{c*} : \Gamma(\mathfrak{g}_U)^{\mathbb{C}} \to \Gamma(\mathfrak{g}_U)^{\mathbb{C}}$ which extends $\Phi^*$.

Proof. By Lemma 2.1, $\Phi^*$ induces a symplectic diffeomorphism $\varphi$ on $U$. By Proposition 2.3, there is a Weyl diffeomorphism $\Psi^*$ which is a lift of $\varphi$. Let $\Psi^{c*}$ be a contact Weyl diffeomorphism which extends $\Psi^*$. Hence, $\Phi^* (\Psi^*)^{-1}$ induces the identity on the base space. It follows by Corollary 2.5, (3) that $\Phi^* = \Psi^* e^{\mathrm{ad}(h)}$, $h \in \mathcal{F}(W_U)^{\mathbb{C}}$. Note that the $\mathrm{ad}(\nu^{-1})$ components are not used, since these act trivially on $\mathcal{F}(W_U)^{\mathbb{C}}$. Hence we define a pointless contact diffeomorphism $\Phi^{c*}$ by $\Psi^{c*} e^{\mathrm{ad}(h)}$.
A contact Weyl diffeomorphism $\Phi^{c*}$ has by definition the real-to-real property, and it may be assumed by Proposition 2.3 that $\Phi^{c*}$ has the hermitian property. We now remark the following:

Lemma 2.7. If a pointless contact diffeomorphism $\Phi^{c*} : \Gamma(\mathfrak{g}_U)^{\mathbb{C}} \to \Gamma(\mathfrak{g}_U)^{\mathbb{C}}$ has the real-to-real property and the hermitian property, then $\Phi^{c*}(\tilde\tau)$ is written in the form
\[ \tilde\tau + g_0^\sharp + \nu^2 * g_2^\sharp + \cdots + \nu^{2k} * g_{2k}^\sharp + \cdots, \qquad g_{2k} \in C^\infty(U). \]



3. Poincaré–Cartan Classes

3.1. Weyl manifold. Let $W_M$ be a locally trivial algebra bundle with fiber isomorphic to $W$. Then for an open covering $\{V_\alpha\}$ of $M$, there are local trivializations $\Phi_\alpha : W_{V_\alpha} \to W_{U_\alpha}$ associated to $V_\alpha$, where $W_{V_\alpha}$ is the restriction of $W_M$ and $W_{U_\alpha}$ is the trivial algebra bundle over $U_\alpha$ ($\subset \mathbb{R}^{2n}$). Denote by $\varphi_\alpha : V_\alpha \to U_\alpha$ the induced homeomorphism.

Definition 3.1. $W_M$ is called a (real) Weyl manifold, if for each $V_\alpha, V_\beta$ such that $V_\alpha \cap V_\beta \neq \emptyset$, the patching transformation
\[ \Phi_{\alpha\beta} = \Phi_\beta \Phi_\alpha^{-1} : U_{\alpha\beta} \times W \to U_{\beta\alpha} \times W, \tag{3.1} \]
where $U_{\alpha\beta} = \varphi_\alpha(V_\alpha \cap V_\beta)$, induces a Weyl diffeomorphism $\Phi^*_{\alpha\beta}$. Each $\Phi_\alpha : W_{V_\alpha} \to W_{U_\alpha}$ is called a local Weyl chart on $W_M$, and $W_{U_\alpha}$ is called the model algebra over $V_\alpha$. If the $\Phi^*_{\alpha\beta}$ are merely pre-Weyl diffeomorphisms, then $W_M$ is called a pre-Weyl manifold.

By Lemma 2.1, the base manifold $M$ of a pre-Weyl manifold $W_M$ has a $C^\infty$ symplectic structure. The following was the main theorem of [OMY1]:

Theorem 3.2. On every $C^\infty$ symplectic manifold $M$, there exists a Weyl manifold $W_M$. In particular, the system of trivial Weyl algebra bundles $\{W_{U_\alpha}\}$ can be patched together via Weyl diffeomorphisms.

The notions of Weyl functions, the involutive anti-automorphism $f \mapsto \bar f$, and integration are naturally defined on a Weyl manifold $W_M$. Denote by $\mathcal{F}(W_M)$ the algebra of all Weyl functions on $W_M$. Two Weyl manifolds $W_M$, $W'_M$ are said to be isomorphic, if there is an algebra isomorphism $\Psi : \mathcal{F}(W_M) \to \mathcal{F}(W'_M)$ inducing the identity on the base manifold $M$. Using the fact that $\mathcal{F}(W_M)$ is linearly isomorphic to $C^\infty(M)[[\nu]]$, we translate the algebra structure of $\mathcal{F}(W_M)$ over to $C^\infty(M)[[\nu]]$. In particular, $C^\infty(M)[[\nu]]$ is a noncommutative associative algebra which can be viewed as a deformation quantization of $(C^\infty(M), \cdot)$. Through this observation, we also see that complex conjugation $f \mapsto \bar f$ is an involutive anti-automorphism of $(C^\infty(M)[[\nu]], *)$. Suppose conversely that we have a deformation quantization $(C^\infty(M)[[\nu]], *)$ with an involutive anti-automorphism $f \mapsto \bar f$ such that $\bar f = f$ for any $f \in C^\infty(M)$ and $\bar\nu = -\nu$. Let $\{V_\alpha\}$ be a locally finite simple open covering of $M$. Note that by the localization theorem [OMY2], the above $*$-product can be localized on $C^\infty(V_\alpha)[[\nu]]$. Here we need a definition:

Definition 3.3. For $f \in C^\infty(U)[[\nu]]$, the body part $b(f)$ of $f$ is an $\mathbb{R}$-valued $C^\infty$ function on $U$ such that $f - b(f) \in \nu C^\infty(U)[[\nu]]$.
A system of elements $\xi_1, \dots, \xi_{2n} \in C^\infty(U)[[\nu]]$ are called topological generators (T-generators), if the body parts $b(\xi_1), \dots, b(\xi_{2n})$ are local coordinates on $U$. By the same idea as the Weyl continuation, every $f \in C^\infty(U)[[\nu]]$ can be viewed as a “function” of $\xi_1, \dots, \xi_{2n}$, whenever $\xi_1, \dots, \xi_{2n}$ are T-generators. By the quantum version of Darboux’s theorem [O], there are elements $\xi_1, \dots, \xi_n, \eta_1, \dots, \eta_n$ of $C^\infty(U)[[\nu]]$ such that
\[ \bar\xi_i = \xi_i, \quad \bar\eta_i = \eta_i, \quad [\xi_i, \xi_j] = [\eta_i, \eta_j] = 0, \quad [\xi_i, \eta_j] = -\nu\delta_{ij}, \tag{3.2} \]
which are T-generators. We call $\xi_1, \dots, \xi_n, \eta_1, \dots, \eta_n$ quantum canonical generators (QC-generators). As in (1.2), we use the inverse Moyal product formula to make a commutative product $\circ$. We identify $V_\alpha$ with $U_\alpha \subset \mathbb{R}^{2n}$. It is not hard to see that the mapping $f \mapsto \bar f$ remains an involutive automorphism of $(C^\infty(V_\alpha)[[\nu]], \circ)$, and $f \circ g$ is decomposed for some $k \ge 1$ into
\[ f \circ g = f \cdot g + \sum_{l \ge k} \nu^{2l}\,\varpi_{2l}(f, g), \qquad \varpi_{2l}(f, g) = \varpi_{2l}(g, f) \in C^\infty(V_\alpha). \]

Since the first component $\varpi_{2k}$ is a Hochschild 2-cocycle, and hence a Hochschild 2-coboundary by [OMY2, Theorem 2.2], it is easy to see that $(C^\infty(V_\alpha)[[\nu]], \circ)$ is isomorphic to $(C^\infty(V_\alpha)[[\nu]], \cdot)$ with the usual commutative product $\cdot$. Hence, there is an open subset $U_\alpha$ of $\mathbb{R}^{2n}$ such that $(C^\infty(V_\alpha)[[\nu]], *)$ is isomorphic to $\mathcal{F}(W_{U_\alpha})$ through an isomorphism $\Psi^*_\alpha$ with the hermitian property. On each $V_\alpha \cap V_\beta$, the identity mapping of $(C^\infty(V_\alpha \cap V_\beta)[[\nu]], *)$ onto itself, regarded as $(C^\infty(V_\beta \cap V_\alpha)[[\nu]], *)$, induces a Weyl diffeomorphism $\Phi^*_{\alpha\beta} : \mathcal{F}(W_{U_{\beta\alpha}}) \to \mathcal{F}(W_{U_{\alpha\beta}})$. Hence, any deformation quantization $(C^\infty(M)[[\nu]], *)$ with an involutive anti-automorphism is obtained as an algebra of Weyl functions on a Weyl manifold.

3.2. Poincaré–Cartan classes. For a symplectic manifold $M$, there are Weyl manifolds over $M$ which are not isomorphic. We give the complete invariant for the isomorphism class of a Weyl manifold as an element of $H^2(M)[[\nu^2]]$. Let $\{V_\alpha\}$ be a covering of $M$. For each $\alpha$ let $\varphi_\alpha : V_\alpha \to U_\alpha \subset \mathbb{R}^{2n}$ be a symplectomorphic coordinate map. Consider the trivial Lie algebra bundle $\mathfrak{g}_{U_\alpha}$ on $U_\alpha$. Recall that Theorem 3.2 was proved in [OMY1] by constructing a contact Weyl diffeomorphism $\Phi^{c*}_{\alpha\beta} : \Gamma(\mathfrak{g}_{U_{\beta\alpha}}) \to \Gamma(\mathfrak{g}_{U_{\alpha\beta}})$ for $V_\alpha \cap V_\beta \neq \emptyset$, patching $\mathfrak{g}_{U_\alpha}$ and $\mathfrak{g}_{U_\beta}$ together. Let $\Phi^*_{\alpha\beta}$ be the restriction $\Phi^{c*}_{\alpha\beta}|\mathcal{F}(W_{U_{\beta\alpha}})$. It is clear that $\{\Phi^*_{\alpha\beta}\}$ gives a pre-Weyl manifold if and only if the $\Phi^{c*}_{\alpha\beta}$ satisfy
\[ \Phi^{c*}_{\alpha\alpha} = 1 \quad \text{and} \quad \Phi^{c*}_{\alpha\beta}\Phi^{c*}_{\beta\gamma}\Phi^{c*}_{\gamma\alpha} = e^{c_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1}) + c'_{\alpha\beta\gamma}\,\mathrm{ad}(\log\nu)} \]
on every $V_\alpha \cap V_\beta \cap V_\gamma \neq \emptyset$, where $c_{\alpha\beta\gamma} \in \mathbb{R}[[\nu]]$ and $c'_{\alpha\beta\gamma} \in \mathbb{R}$. The necessity is given by Corollary 2.5, (4), and the sufficiency holds since $e^{c_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1}) + c'_{\alpha\beta\gamma}\,\mathrm{ad}(\log\nu)}$ is the identity on each subalgebra $\mathcal{F}(W_{U_{\alpha\beta\gamma}})$, where $U_{\alpha\beta\gamma} = \varphi_\alpha(V_\alpha \cap V_\beta \cap V_\gamma)$. $\{\Phi^*_{\alpha\beta}\}$ gives a Weyl manifold if and only if, furthermore, $\Phi^{c*}_{\alpha\beta}$ has the hermitian property. If this is the case, we see that $c'_{\alpha\beta\gamma} = 0$ and $c_{\alpha\beta\gamma} \in \mathbb{R}[[\nu^2]]$. Under these situations the family $\{\mathcal{F}(W_{U_\alpha})\}$ of algebras is patched together to give an algebra sheaf on $M$. It is easily seen that $\{c_{\alpha\beta\gamma}\}$ and $\{c'_{\alpha\beta\gamma}\}$ are Čech 2-cocycles on $M$ (cf. [O, p. 353] and [OMY1, Lemma 5.6]). In what follows Weyl manifolds are our main concern, but pre-Weyl manifolds are occasionally used in a supplementary role.

Definition 3.4. For a family $\{\Gamma(\mathfrak{g}_{U_\alpha})\}$ constructed on a Weyl manifold $W_M$, $\{c_{\alpha\beta\gamma}\}$ is called the Poincaré–Cartan 2-cocycle of $\{\mathfrak{g}_{U_\alpha}\}$. If we set
\[ c_{\alpha\beta\gamma} = c^{(0)}_{\alpha\beta\gamma} + \nu^2 c^{(2)}_{\alpha\beta\gamma} + \cdots, \tag{3.3} \]
then $\{c^{(0)}_{\alpha\beta\gamma}\}$ is cohomologous to a Čech cocycle given by the symplectic 2-form on $M$ (cf. [O, p. 357] and [KM]). We call the cohomology class of $\{c_{\alpha\beta\gamma}\}$ the Poincaré–Cartan class of $\{W_{U_\alpha}\}$ and denote it by $c(W_M) = \sum_{k \ge 0} \nu^{2k} c^{(2k)}(W_M)$.

In [OMY1], we constructed a Weyl manifold on $M$ such that $c_{\alpha\beta\gamma} = c^{(0)}_{\alpha\beta\gamma} \in \mathbb{R}$. The following is a characterization of the equivalence of Weyl manifolds via the Poincaré–Cartan class. This corresponds to the work of [D], [NT] and [BCG], in which the equivalence of Fedosov star-products is characterized:

Theorem 3.5. The equivalence of Weyl manifolds $W_M$ up to isomorphism is determined by the Poincaré–Cartan class. Moreover, for every $c = \sum_{k \ge 0} \nu^{2k} c^{(2k)} \in H^2(M)[[\nu^2]]$ such that $c^{(0)}$ is the class of the symplectic 2-form, there exists a Weyl manifold $W_M$ whose Poincaré–Cartan class $c(W_M)$ is $c$.

Proof. Let $\{c_{\alpha\beta\gamma}\}$, $\{c'_{\alpha\beta\gamma}\}$ be Poincaré–Cartan cocycles of $\{\mathfrak{g}_{U_\alpha}\}$ and $\{\mathfrak{g}'_{U_\alpha}\}$ respectively. Suppose the Poincaré–Cartan classes coincide. Then, there exists $b_{\alpha\beta} \in \mathbb{R}[[\nu^2]]$ on every $V_\alpha \cap V_\beta \neq \emptyset$ such that $b_{\alpha\beta} = -b_{\beta\alpha}$ and
\[ c'_{\alpha\beta\gamma} - c_{\alpha\beta\gamma} = b_{\alpha\beta} + b_{\beta\gamma} + b_{\gamma\alpha}. \]

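Note that $b_{\alpha\beta}$ can be replaced by $b_{\alpha\beta} + c_{\alpha\beta}$ such that $c_{\alpha\beta} + c_{\beta\gamma} + c_{\gamma\alpha} = 0$. Since $e^{b_{\alpha\beta}\,\mathrm{ad}(\nu^{-1})}$ is an automorphism, we can replace $\Phi^{c*}_{\alpha\beta}$ by $\acute\Phi^{c*}_{\alpha\beta} = \Phi^{c*}_{\alpha\beta}\, e^{b_{\alpha\beta}\,\mathrm{ad}(\nu^{-1})}$. Since $e^{b_{\alpha\beta}\,\mathrm{ad}(\nu^{-1})}$ is the identity on $\mathcal{F}(W_{U_{\alpha\beta}})$, this replacement does not change the isomorphism class of $\mathcal{F}(W_M)$, but it changes the Poincaré–Cartan cocycle from $\{c_{\alpha\beta\gamma}\}$ to $\{c'_{\alpha\beta\gamma}\}$. Hence we can assume that we have two families $\{\Phi^{c*}_{\alpha\beta}\}$ and $\{\acute\Phi^{c*}_{\alpha\beta}\}$ of patching transformations such that
\[ \Phi^{c*}_{\alpha\beta}\Phi^{c*}_{\beta\gamma}\Phi^{c*}_{\gamma\alpha} = \acute\Phi^{c*}_{\alpha\beta}\acute\Phi^{c*}_{\beta\gamma}\acute\Phi^{c*}_{\gamma\alpha} = e^{c_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})}. \]
Since $\Phi^{c*}_{\alpha\beta}(\acute\Phi^{c*}_{\alpha\beta})^{-1}$ induces the identity on the base space, we see by Corollary 2.5, (3) that there is a unique $h_{\alpha\beta}$ such that $\acute\Phi^{c*}_{\alpha\beta} = \Phi^{c*}_{\alpha\beta}\, e^{\mathrm{ad}(\nu h_{\alpha\beta})}$. The $e^{c\,\mathrm{ad}(\nu^{-1})}$-terms can be removed by using the ambiguity of $b_{\alpha\beta}$ mentioned above. By a standard argument of Čech cohomology, we see that
\[ \acute\Phi^{c*}_{\alpha\beta} = e^{\mathrm{ad}(\nu h_\alpha)}\, \Phi^{c*}_{\alpha\beta}\, e^{-\mathrm{ad}(\nu h_\beta)}. \]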

(See also Lemmas 5.4 through 5.6.) This implies that the two families are isomorphic. Conversely suppose there is a Weyl diffeomorphism $\Psi : W'_M \to W_M$ which induces the identity on the base manifold. That is, $\Psi^*$ defines an algebra isomorphism of $\mathcal{F}(W_M)$ onto $\mathcal{F}(W'_M)$ with the hermitian property such that $\Psi^*(\nu) = \nu$. The isomorphism $\Psi^*$ is equivalently given by a family $\{\Psi^*_\alpha\}$ of isomorphisms
\[ \Psi^*_\alpha : \mathcal{F}(W_{U_\alpha}) \to \mathcal{F}(W'_{U_\alpha}), \tag{3.4} \]
each of which induces the identity map on the base space $U_\alpha$, such that
\[ \Psi^*_\alpha\, \Phi^*_{\alpha\beta}\, (\Psi^*_\beta)^{-1} = \acute\Phi^*_{\alpha\beta}. \tag{3.5} \]
If we extend $\Psi^*_\alpha$ to a contact Weyl diffeomorphism $\Psi^{c*}_\alpha$, then the replacement of $\Phi^{c*}_{\alpha\beta}$ by $\Psi^{c*}_\alpha \Phi^{c*}_{\alpha\beta} (\Psi^{c*}_\beta)^{-1}$ makes no change to the Poincaré–Cartan cocycle.


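By (3.5) and Corollary 2.5, (4), we have $\acute\Phi^{c*}_{\alpha\beta} = \Psi^{c*}_\alpha \Phi^{c*}_{\alpha\beta} (\Psi^{c*}_\beta)^{-1} e^{b_{\alpha\beta}\,\mathrm{ad}(\nu^{-1})}$. However this type of replacement changes the Poincaré–Cartan cocycle within the same cohomology class.

Suppose $c = \sum_{k \ge 0} \nu^{2k} c^{(2k)} \in H^2(M)[[\nu^2]]$ is given. Then, we start with a Weyl manifold $W^{(0)}_M$ with a Poincaré–Cartan cocycle $\{c^{(0)}_{\alpha\beta\gamma}\}$, and by changing the patching Weyl diffeomorphisms we construct a Weyl manifold with Poincaré–Cartan class $c$. Let $\Phi^*_{\alpha\beta} : \mathcal{F}(W_{U_{\beta\alpha}}) \to \mathcal{F}(W_{U_{\alpha\beta}})$ be the patching Weyl diffeomorphism of $W^{(0)}_M$ and let $\Phi^{c*}_{\alpha\beta}$ be its extension as a contact Weyl diffeomorphism. Let $\{c^{(2k)}_{\alpha\beta\gamma}\}$ be a Čech cocycle involved in $c^{(2k)}$. Since the sheaf cohomology of $C^\infty$ functions $H^2(M, \mathcal{E}) = \{0\}$, there is $h^{(2)}_{\alpha\beta} \in C^\infty(U_{\alpha\beta})$ on each $U_{\alpha\beta}$ such that
\[ -c^{(2)}_{\alpha\beta\gamma} = h^{(2)}_{\alpha\beta} + \varphi^*_{\alpha\beta} h^{(2)}_{\beta\gamma} + \varphi^*_{\alpha\gamma} h^{(2)}_{\gamma\alpha}. \tag{3.6} \]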

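For a function $r_{\alpha\beta} \in C^\infty(U_{\alpha\beta})[[\nu^2]]$, we set $\tilde h_{\alpha\beta} = (h^{(2)}_{\alpha\beta})^\sharp + \nu^2 r^\sharp_{\alpha\beta}$. If we use $\grave\Phi^{c*}_{\alpha\beta} = \Phi^{c*}_{\alpha\beta}\, e^{\mathrm{ad}(\nu \tilde h_{\beta\alpha})}$ as patching diffeomorphisms for every $V_\alpha \cap V_\beta \neq \emptyset$, then we see
\[ \grave\Phi^{c*}_{\alpha\beta}\grave\Phi^{c*}_{\beta\gamma}\grave\Phi^{c*}_{\gamma\alpha} = \Phi^{c*}_{\alpha\beta}\Phi^{c*}_{\beta\gamma}\Phi^{c*}_{\gamma\alpha}\; e^{\mathrm{ad}(\nu\Phi^*_{\alpha\beta}\tilde h_{\beta\alpha})}\, e^{\mathrm{ad}(\nu\Phi^*_{\alpha\gamma}\tilde h_{\gamma\beta})}\, e^{\mathrm{ad}(\nu\Phi^*_{\alpha\alpha}\tilde h_{\alpha\gamma})}. \]
Here we used the general formula
\[ \Phi^{c*}_{\alpha\beta}\, e^{\mathrm{ad}(h)} = e^{\mathrm{ad}(\Phi^*_{\alpha\beta} h)}\, \Phi^{c*}_{\alpha\beta} \tag{3.7} \]
for $h \in \mathcal{F}(W_{U_{\alpha\beta}})$, proved by the uniqueness of solutions of ordinary differential equations. By (3.6), we have
\[ e^{\mathrm{ad}(\nu\Phi^*_{\alpha\beta}\tilde h_{\beta\alpha})}\, e^{\mathrm{ad}(\nu\Phi^*_{\alpha\gamma}\tilde h_{\gamma\beta})}\, e^{\mathrm{ad}(\nu\Phi^*_{\alpha\alpha}\tilde h_{\alpha\gamma})} = e^{\nu^2 c^{(2)}_{\alpha\beta\gamma}\,\mathrm{ad}\,\nu^{-1}} \mod \nu^4. \tag{3.8} \]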
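The general formula (3.7) used above is justified in the text by uniqueness for ordinary differential equations. A sketch of that argument, under the assumptions that $\Phi = \Phi^{c*}_{\alpha\beta}$ is a Lie algebra isomorphism (so that $\Phi\,\mathrm{ad}(h)\,\Phi^{-1} = \mathrm{ad}(\Phi h)$) and that the exponentials are understood formally, is as follows; the one-parameter families $A(t)$, $B(t)$ are auxiliary notation introduced here.

```latex
% A(t), B(t) are auxiliary one-parameter families (not notation from the paper).
\begin{align*}
A(t) &= \Phi\, e^{t\,\mathrm{ad}(h)}, &
\tfrac{d}{dt}A(t) &= \Phi\,\mathrm{ad}(h)\,e^{t\,\mathrm{ad}(h)} = \mathrm{ad}(\Phi h)\,A(t), &
A(0) &= \Phi,\\
B(t) &= e^{t\,\mathrm{ad}(\Phi h)}\,\Phi, &
\tfrac{d}{dt}B(t) &= \mathrm{ad}(\Phi h)\,B(t), &
B(0) &= \Phi.
\end{align*}
% Both families solve the same linear equation with the same initial value,
% so A(1) = B(1), i.e. \Phi e^{ad(h)} = e^{ad(\Phi h)} \Phi.
```

Since $h$ is a Weyl function, $\Phi h$ is just the restriction $\Phi^*_{\alpha\beta} h$, which gives the form stated in (3.7).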

By working on the terms $\nu^4, \nu^6, \cdots$, we can tune up $r_{\alpha\beta}$ in the same manner as in [OMY1] so that
\[ e^{\mathrm{ad}(\nu\Phi^*_{\alpha\beta}\tilde h_{\beta\alpha})}\, e^{\mathrm{ad}(\nu\Phi^*_{\alpha\gamma}\tilde h_{\gamma\beta})}\, e^{\mathrm{ad}(\nu\Phi^*_{\alpha\alpha}\tilde h_{\alpha\gamma})} = e^{\nu^2 c^{(2)}_{\alpha\beta\gamma}\,\mathrm{ad}\,\nu^{-1}}. \]
It follows that $\{\grave\Phi^{c*}_{\alpha\beta}\}$ defines a Weyl manifold $\grave W_M$ with the Poincaré–Cartan class $c^{(0)} + \nu^2 c^{(2)}$. Replacing $\Phi^{c*}_{\alpha\beta}$ by $\grave\Phi^{c*}_{\alpha\beta}$ and repeating a similar argument as above, we can replace the condition mod $\nu^4$ in (3.8) by mod $\nu^6$. Repeating this procedure, we have a Weyl manifold $W_M$ such that $c(W_M) = c \in H^2(M)[[\nu^2]]$.

4. Noncommutative Kähler manifolds

In this section, we introduce a restricted notion of deformation quantization for Kähler manifolds, which we call a noncommutative Kähler manifold.

4.1. Paracoordinates. Let us first review the calculus of complex variables, which differs crucially from the real case. Let $U$ be an open subset of $\mathbb{R}^m$ with coordinate functions $x_1, \dots, x_m$, and $C^\infty(U)^{\mathbb{C}}$ the space of all $\mathbb{C}$-valued $C^\infty$ functions on $U$. Consider a set $\{z_1, \dots, z_m\}$ in $C^\infty(U)^{\mathbb{C}}$. Set
\[ \tilde U = \{\psi_z(p) ;\ p \in U\}, \qquad \psi_z(p) = (z_1(p), \dots, z_m(p)). \]
$z_1, \dots, z_m$ are called paracoordinates of $U$, if the following conditions are satisfied:

(1) $\tilde U$ is a real $m$-dimensional $C^\infty$ submanifold of $\mathbb{C}^m$ such that the complex span of the tangent space $T_p\tilde U$ equals $\mathbb{C}^m$, i.e. $T_p\tilde U + \sqrt{-1}\,T_p\tilde U = \mathbb{C}^m$.
(2) $\psi_z : U \to \tilde U$ is a diffeomorphism.

$\psi_z$ is called the coordinate map of the paracoordinates. The inverse matrix of $\bigl(\frac{\partial z_i}{\partial x_j}\bigr)$ is occasionally denoted by $\bigl(\frac{\partial x_i}{\partial z_j}\bigr)$, though we do not define the derivative $\frac{\partial x_i}{\partial z_j}$. Moreover, a $C^\infty$ mapping $f$ from $\tilde U$ into $\mathbb{C}$ is written in the form $f(z_1, \dots, z_m)$, even though $z_1, \dots, z_m$ are not necessarily independent complex variables on $\tilde U$. Since the $z_i$ are $C^\infty$ functions of $x_1, \dots, x_m$, $f(\psi_z(x))$ is a $C^\infty$ function of $x_1, \dots, x_m$. We define $\frac{\partial f}{\partial z_i}$ as follows:
\[ \frac{\partial f}{\partial z_i} = \sum_{k=1}^{m} \frac{\partial x_k}{\partial z_i}\,\frac{\partial f}{\partial x_k}. \tag{4.1} \]
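A small worked case may clarify the definition (4.1). The example below is an illustration chosen here, not one from the paper: take $m = 2$ and the paracoordinates $z_1 = x_1 + \sqrt{-1}\,x_2$, $z_2 = x_1 - \sqrt{-1}\,x_2$ on an open set $U \subset \mathbb{R}^2$.

```latex
% Illustrative choice: z_1 = x_1 + \sqrt{-1} x_2,  z_2 = x_1 - \sqrt{-1} x_2.
\begin{align*}
\Bigl(\frac{\partial z_i}{\partial x_j}\Bigr) &=
  \begin{pmatrix} 1 & \sqrt{-1}\\ 1 & -\sqrt{-1} \end{pmatrix},
&
\Bigl(\frac{\partial x_i}{\partial z_j}\Bigr) &=
  \frac{1}{2}\begin{pmatrix} 1 & 1\\ -\sqrt{-1} & \sqrt{-1} \end{pmatrix},\\
\frac{\partial f}{\partial z_1} &=
  \frac{1}{2}\Bigl(\frac{\partial f}{\partial x_1} - \sqrt{-1}\,\frac{\partial f}{\partial x_2}\Bigr),
&
\frac{\partial f}{\partial z_2} &=
  \frac{1}{2}\Bigl(\frac{\partial f}{\partial x_1} + \sqrt{-1}\,\frac{\partial f}{\partial x_2}\Bigr).
\end{align*}
```

Here $\tilde U = \{(z, \bar z)\}$ is a real 2-dimensional submanifold of $\mathbb{C}^2$ whose tangent vectors $(1, 1)$ and $(\sqrt{-1}, -\sqrt{-1})$ are linearly independent over $\mathbb{C}$, so condition (1) holds, and (4.1) reduces to the familiar Wirtinger derivatives; in particular $\partial z_1/\partial z_1 = 1$ and $\partial z_1/\partial z_2 = 0$.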

Note that the right-hand side of (4.1) is computed as elements of $C^\infty(U)^{\mathbb{C}}$ or of $C^\infty(\tilde U)^{\mathbb{C}}$. Higher order derivatives are defined similarly. At each point $\tilde p \in \tilde U$, $\{dz_1, \dots, dz_m\}$ forms a basis, with real coefficients, of the cotangent space $T^*_{\tilde p}\tilde U$. Let $u_1, \dots, u_m \in C^\infty(U)^{\mathbb{C}}$ be other paracoordinates with $\tilde U' = \{(u_1(p), \dots, u_m(p)) ;\ p \in U\}$. We set
\[ \frac{\partial z_i}{\partial u_j} = \sum_{k=1}^{m} \frac{\partial z_i}{\partial x_k}\,\frac{\partial x_k}{\partial u_j}, \tag{4.2} \]
although $z_i$ is not a genuine function of $u_1, \dots, u_m$. Note that $\psi_u\psi_z^{-1}$ is a $C^\infty$ diffeomorphism of $\tilde U$ onto $\tilde U'$. The chain rule also holds for these.

4.2. Kähler manifolds. Let $M$ be a smooth symplectic manifold with a symplectic form $\Omega$. For a function $f \in C^\infty(M)^{\mathbb{C}}$, we denote by $\bar f$ the complex conjugate of $f$. The Poisson bracket $\{\ ,\ \}$ defined by $\Omega$ satisfies $\overline{\{f, g\}} = \{\bar f, \bar g\}$ for any $f, g \in C^\infty(M)^{\mathbb{C}}$. Note that a Kähler manifold $M$ is characterized as a real symplectic manifold covered by open subsets $\{V_\alpha\}$ such that for each $V_\alpha$ there is a homeomorphism $\varphi_\alpha : V_\alpha \to U_\alpha \subset \mathbb{C}^n$ with the following properties:

(1) The coordinate functions $z_1^\alpha, \dots, z_n^\alpha$ on $\mathbb{C}^n$ satisfy $\{z_i^\alpha, z_j^\alpha\} = \{\bar z_i^\alpha, \bar z_j^\alpha\} = 0$.
(2) The matrix $(\{z_i^\alpha, \bar z_j^\alpha\})$ is nondegenerate.
(3) On each intersection $V_\alpha \cap V_\beta$, setting $U_{\alpha\beta} = \varphi_\alpha(V_\alpha \cap V_\beta)$, the coordinate transformation $\varphi_{\alpha\beta} = \varphi_\beta\varphi_\alpha^{-1} : U_{\alpha\beta} \to U_{\beta\alpha}$ is holomorphic.

$z_1^\alpha, \dots, z_n^\alpha$ are called Kähler coordinates (K-coordinates) on $V_\alpha$. We can assume that $\{V_\alpha\}$ is a locally finite, simple open Stein covering, i.e. a locally finite open covering such that $V_{\alpha_1} \cap \cdots \cap V_{\alpha_k}$ is a contractible Stein manifold. Let $V$ be one of the $V_\alpha$. As is known in [KN], there exist K-coordinates $z_1, \dots, z_n$ on $V$. We can assume that there is a Kähler potential $F$ on $V$, i.e. a real valued $C^\infty$ function $F(z, \bar z)$ such that the symplectic form $\Omega$ equals
\[ \Omega = \sum_{k,l} \Omega_{k,l}\, dz_k \wedge d\bar z_l, \qquad \Omega_{k,l} = \frac{\sqrt{-1}}{2}\,\frac{\partial^2 F}{\partial z_k\,\partial\bar z_l}. \tag{4.3} \]
Setting $z_i^* = \frac{\sqrt{-1}}{2}\frac{\partial F}{\partial z_i}$, we have $\Omega = \sum dz_i \wedge dz_i^*$, $\{z_i, z_j^*\} = \delta_{ij}$ and $\{z_i^*, z_j^*\} = 0$. $z_1, \dots, z_n, z_1^*, \dots, z_n^*$ are called complex canonical coordinates (CC-coordinates).
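For orientation, here is the simplest instance of (4.3) and of the CC-coordinates. It is an illustrative sketch only: the flat potential $F = \sum_k z_k\bar z_k$ on an open set of $\mathbb{C}^n$ is a choice made here, and $\Omega$ denotes the symplectic form of (4.3).

```latex
% Illustrative potential: F(z,\bar z) = \sum_k z_k \bar z_k.
\begin{align*}
\Omega_{k,l} &= \frac{\sqrt{-1}}{2}\,\frac{\partial^2 F}{\partial z_k\,\partial\bar z_l}
             = \frac{\sqrt{-1}}{2}\,\delta_{kl},
&
\Omega &= \frac{\sqrt{-1}}{2}\sum_k dz_k \wedge d\bar z_k,\\
z_i^* &= \frac{\sqrt{-1}}{2}\,\frac{\partial F}{\partial z_i} = \frac{\sqrt{-1}}{2}\,\bar z_i,
&
\sum_i dz_i \wedge dz_i^* &= \frac{\sqrt{-1}}{2}\sum_i dz_i \wedge d\bar z_i = \Omega .
\end{align*}
% With z_k = x_k + \sqrt{-1} y_k:  dz_k \wedge d\bar z_k = -2\sqrt{-1}\, dx_k \wedge dy_k,
% hence \Omega = \sum_k dx_k \wedge dy_k.
```

In this case the CC-coordinates are just a rescaling of $(z, \bar z)$, and $\Omega$ agrees with the natural symplectic 2-form $\sum dx_i \wedge dy_i$ of Lemma 2.1.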


Since $\frac{\partial^2 F}{\partial z_k\,\partial\bar z_l}$ is nondegenerate, the CC-coordinates are paracoordinates of $V$. Note that the canonical conjugate variables $z_1^*, \dots, z_n^*$ are not uniquely defined, as $z_i^*$ can be replaced by $\tilde z_i^* = z_i^* + \frac{\partial h}{\partial z_i}$ for any holomorphic function $h$. In CC-coordinates, the Poisson bracket becomes
\[ \{f, g\} = \sum \Bigl(\frac{\partial f}{\partial z_i}\frac{\partial g}{\partial z_i^*} - \frac{\partial g}{\partial z_i}\frac{\partial f}{\partial z_i^*}\Bigr). \tag{4.4} \]
We consider the relationship between
\[ z_1, \dots, z_n, \bar z_1, \dots, \bar z_n \quad \text{and} \quad z_1, \dots, z_n, z_1^*, \dots, z_n^* \tag{4.5} \]
on $V$. Let $\varphi_z$, $\varphi'_z$ be the coordinate maps of
\[ (z_1(p), \dots, z_n(p), \bar z_1(p), \dots, \bar z_n(p)), \qquad (z_1(p), \dots, z_n(p), z_1^*(p), \dots, z_n^*(p)) \]
respectively. We set $\tilde V = \{\varphi_z(p) \in \mathbb{C}^{2n} ;\ p \in V\}$, $\tilde V' = \{\varphi'_z(p) \in \mathbb{C}^{2n} ;\ p \in V\}$. Let $\{t_1, \dots, t_{2n}\}$ be a real coordinate system of $V$. Note that $\varphi'_z\varphi_z^{-1}$ is a $C^\infty$ diffeomorphism of $\tilde V$ onto $\tilde V'$. Then, $\varphi'_z\varphi_z^{-1}$ can be written as $\varphi'_z\varphi_z^{-1}(z, \bar z) = (z, z^*)$. If we consider the inverse mapping of $\varphi'_z\varphi_z^{-1}$, $\bar z$ can be viewed as a “function” of $z, z^*$, which can be understood as a sort of implicit function theorem. Note that $\{dz_i, d\bar z_i\}$, $\{dz_i, dz_i^*\}$ are real bases of the cotangent spaces $T^*\tilde V$ and $T^*\tilde V'$ respectively, and we have the relations
\[ dz_i^* = \frac{\sqrt{-1}}{2} \sum_k \Bigl(\frac{\partial^2 F}{\partial z_i\,\partial\bar z_k}\, d\bar z_k + \frac{\partial^2 F}{\partial z_i\,\partial z_k}\, dz_k\Bigr), \qquad 1 \le i \le n. \tag{4.6} \]

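Since $\frac{\partial^2 F}{\partial z_i\,\partial\bar z_k}$ is non-singular, the above equality can be inverted to solve for $d\bar z_k$. We consider the exterior algebra $\Lambda^*(V) = \sum \Lambda^{p,q}(V)$ consisting of elements of the form
\[ \omega = \sum \omega_{i_1 \cdots i_p, j_1 \cdots j_q}(z, z^*)\, dz^*_{i_1} \wedge \cdots \wedge dz^*_{i_p} \wedge dz_{j_1} \wedge \cdots \wedge dz_{j_q}. \]
Define the partial exterior derivatives $\partial\omega$, $\partial^*\omega$ by:
\begin{align*}
\partial\omega &= \sum -\{z^*_{j_0}, \omega_{i_1 \cdots i_p, j_1 \cdots j_q}\}\, dz_{j_0} \wedge dz^*_{i_1} \wedge \cdots \wedge dz^*_{i_p} \wedge dz_{j_1} \wedge \cdots \wedge dz_{j_q}, \tag{4.7}\\
\partial^*\omega &= \sum \{z_{i_0}, \omega_{i_1 \cdots i_p, j_1 \cdots j_q}\}\, dz^*_{i_0} \wedge dz^*_{i_1} \wedge \cdots \wedge dz^*_{i_p} \wedge dz_{j_1} \wedge \cdots \wedge dz_{j_q}.
\end{align*}
Thus,
\[ \partial^* z_i^* = \sum (\mathrm{ad}(z_k) z_i^*)\, dz_k^* = dz_i^*, \qquad \partial z_i = \sum -(\mathrm{ad}(z_k^*) z_i)\, dz_k = dz_i. \]
Hence we may set
\[ \{z_i,\ \cdot\ \} = \frac{\partial}{\partial z_i^*}, \qquad -\{z_i^*,\ \cdot\ \} = \frac{\partial}{\partial z_i}. \]
Using the Jacobi identity for the Poisson bracket, we have
\[ (\partial^*)^2 = \partial^2 = 0, \qquad \partial\partial^* + \partial^*\partial = 0. \]
We set $d = \partial + \partial^*$. The following is a slight modification of the Poincaré lemma:

Lemma 4.1. Let $V$ be an open contractible subset of $M$ with CC-coordinates $(z_1, \dots, z_n, z_1^*, \dots, z_n^*)$. If $d\omega = 0$ for $\omega \in \Lambda^{p,q}(V)$, then there exist $\theta_1 \in \Lambda^{p-1,q}(V)$ and $\theta_2 \in \Lambda^{p,q-1}(V)$ such that $\omega = \partial^*\theta_1 + \partial\theta_2$.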



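By Lemma 4.1, we have

Lemma 4.2. On a Kähler manifold $M$, every holomorphic coordinate transformation $\varphi_{\alpha\beta} : U_{\alpha\beta} \to U_{\beta\alpha}$ induces a Poisson algebra isomorphism of the form:
\[ \varphi^*_{\alpha\beta}(z_i^\beta) = \varphi^i_{\alpha\beta}(z^\alpha), \qquad \varphi^*_{\alpha\beta}(z_i^{*\beta}) = \sum_k ((d\varphi_{\alpha\beta})^{-1})_{ki} \cdot \Bigl(z_k^{*\alpha} + \frac{\partial g_{\alpha\beta}}{\partial z_k^\alpha}\Bigr), \tag{4.8} \]
where $g_{\alpha\beta}$ is a holomorphic function.

Proof. Let $z_1, \dots, z_n$ and $w_1, \dots, w_n$ be K-coordinates on $U_{\alpha\beta}$ and $U_{\beta\alpha}$ respectively. Let $z_1, \dots, z_n, z_1^*, \dots, z_n^*$ and $w_1, \dots, w_n, w_1^*, \dots, w_n^*$ be the associated CC-coordinates respectively. Since $w_i = w_i(z)$ is a holomorphic function of $z_1, \dots, z_n$, we also have $\bar w_i = \bar w_i(\bar z)$. Since $w_i^* = w_i^*(z, \bar z)$ and $\bar z_i = \bar z_i(z, z^*)$, we see that $w_i = w_i(z)$, $w_i^* = w_i^*(z, z^*)$. The Poisson isomorphism which is induced by the coordinate transformation is written as $\varphi^*_{\alpha\beta}(w_i) = w_i(z)$, $\varphi^*_{\alpha\beta}(w_i^*) = w_i^*(z, z^*)$. We define another Poisson isomorphism $\tilde\varphi^*_{\alpha\beta}$ by setting
\[ \tilde\varphi^*_{\alpha\beta}(w_i) = w_i(z), \qquad \tilde\varphi^*_{\alpha\beta}(\tilde w_i^*) = \sum_k \frac{\partial z_k}{\partial w_i}\, z_k^*, \]
using the correspondence similar to the transition functions of the cotangent bundle. Since both $(w_1, \dots, w_n, w_1^*, \dots, w_n^*)$ and $(w_1, \dots, w_n, \tilde w_1^*, \dots, \tilde w_n^*)$ are CC-coordinates, we have $\{w_i, w_j^* - \tilde w_j^*\} = 0$. It follows that $g_j = w_j^* - \tilde w_j^*$ is holomorphic. By $\{w_i^*, w_j^*\} = \{\tilde w_i^*, \tilde w_j^*\} = 0$, we have $\{\tilde w_i^*, g_j\} - \{\tilde w_j^*, g_i\} = 0$, which implies $d(\sum g_i(w)\, dw_i) = 0$. By Lemma 4.1, we have $g_i = \frac{\partial g}{\partial w_i}$. Put $g = g_{\alpha\beta}(z)$. Since $\varphi_{\alpha\beta}$ is a holomorphic diffeomorphism, we have
\[ w_i^* = \tilde w_i^* + \frac{\partial g}{\partial w_i} = \sum_k \frac{\partial z_k}{\partial w_i}\Bigl(z_k^{*\alpha} + \frac{\partial g_{\alpha\beta}}{\partial z_k^\alpha}\Bigr). \]
It is obvious that $\varphi^*_{\alpha\alpha} = 1$, $g_{\alpha\alpha} = \mathrm{const}$, and on every $V_\alpha \cap V_\beta \cap V_\gamma \neq \emptyset$, we see that
\[ \varphi^*_{\alpha\beta}\varphi^*_{\beta\gamma}\varphi^*_{\gamma\alpha} = 1, \qquad \varphi^*_{\alpha\beta} g_{\beta\alpha} + \varphi^*_{\alpha\gamma} g_{\gamma\beta} + g_{\alpha\gamma} = \mathrm{const}. \tag{4.9} \]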

4.3. Noncommutative Kähler manifold. Let $M$ be a Kähler $n$-manifold. In the following we give a noncommutative version of K-coordinates. Viewing $M$ as a real symplectic manifold, we construct a real Weyl manifold $W_M$ as a locally trivial Weyl algebra bundle over $M$ and a noncommutative algebra $\mathcal{F}(W_M)$ of the Weyl functions of $W_M$. We now consider the complexifications $W_M^{\mathbb{C}}$ and $\mathcal{F}(W_M)^{\mathbb{C}}$. The complexification $\mathcal{F}(W_M)^{\mathbb{C}}$ is viewed as a subalgebra of the sections of the complex Weyl algebra bundle $W_M^{\mathbb{C}}$. Let $U$ be a contractible open subset of $\mathbb{R}^{2n}$.

Definition 4.3. (cf. Definition 3.3) For $f \in \mathcal{F}(W_U)^{\mathbb{C}}$, the body part $b(f)$ of $f$ is a $C^\infty$-function on $U$ such that $f - b(f)^\sharp \in \nu\Gamma(W_U)^{\mathbb{C}}$. A system of elements $\xi_1, \dots, \xi_{2n} \in \mathcal{F}(W_U)^{\mathbb{C}}$ are called topological complex generators (TC-generators), if the body parts $b(\xi_1), \dots, b(\xi_{2n})$ are paracoordinates on $U$.



If $\xi_1,\dots,\xi_{2n}$ are TC-generators, then these elements together with $\nu$ generate a dense subalgebra of $\mathcal{F}(W_U)^{\mathbb{C}}$. On a local coordinate neighborhood $U$, TC-generators $\zeta_1,\dots,\zeta_n,\bar\zeta_1,\dots,\bar\zeta_n \in \mathcal{F}(W_U)^{\mathbb{C}}$ are called quantum Kähler coordinates (QK-coordinates), if $[\zeta_i,\zeta_j] = [\bar\zeta_i,\bar\zeta_j] = 0$, and the body part of the matrix $(-\frac{1}{\nu}[\zeta_i,\bar\zeta_j])$ is non-degenerate. The following is easy to see:

Proposition 4.4. Let $U \subset \mathbb{C}^n$ be a domain which is a Stein manifold. Suppose $\zeta_1,\dots,\zeta_n \in \mathcal{F}(W_U)^{\mathbb{C}}$ satisfy $[\zeta_i,\zeta_j] = 0$. Then, for any holomorphic function $f(t_1,\dots,t_n)$ on a domain $U$, $f(\zeta_1,\dots,\zeta_n)$ can be defined by using a polynomial approximation, to be an element of $\mathcal{F}(W_U)^{\mathbb{C}}$.

Definition 4.5. A complexified pre-Weyl manifold $W^{\mathbb{C}}_M$ is called a noncommutative Kähler manifold, if there is an open covering $\{V_\alpha\}$ with QK-coordinates $z_1^\alpha,\dots,z_n^\alpha,\bar z_1^\alpha,\dots,\bar z_n^\alpha$ of each $\mathcal{F}(W_{U_\alpha})^{\mathbb{C}}$, the model algebra over $V_\alpha$, satisfying the following: On every $U_{\alpha\beta} = \varphi_\alpha(V_\alpha\cap V_\beta)$, two systems of the generators are related through a pre-Weyl diffeomorphism $\Phi^*_{\alpha\beta} : \mathcal{F}(W_{U_{\beta\alpha}})^{\mathbb{C}} \to \mathcal{F}(W_{U_{\alpha\beta}})^{\mathbb{C}}$ such that there is a holomorphic mapping $\varphi_{\alpha\beta} = (\varphi^1_{\alpha\beta},\dots,\varphi^n_{\alpha\beta})$ of $U_{\alpha\beta}$ onto $U_{\beta\alpha}$ with

$$\Phi^*_{\alpha\beta}(z_i^\beta) = \varphi^i_{\alpha\beta}(z^\alpha). \tag{4.10}$$

$\varphi_{\alpha\beta}$ is called a holomorphic coordinate change. By the above definition, it is easily seen that the base manifold $M$ of a noncommutative Kähler manifold $W^{\mathbb{C}}_M$ is a Kähler manifold (cf. Sect. 4.2). A function of QK-coordinates $z^\alpha$ remains a function of QK-coordinates $z^\beta$ after any patching transformation $\Phi^*_{\beta\alpha}$. Hence on the noncommutative Kähler manifold $W^{\mathbb{C}}_M$, the notion of quantum holomorphic function is well-defined as a function of $z^\alpha$ on each $W^{\mathbb{C}}_{U_\alpha}$.

We now consider a Weyl manifold $W_M$ over $M$ and its complexification $W^{\mathbb{C}}_M$. Let $\{c_{\alpha\beta\gamma}\}$ be the Poincaré–Cartan cocycle of $\{\mathfrak{g}(U_\alpha)\}$. Since constant functions can be viewed as holomorphic functions, there is a natural homomorphism $\pi$ of $H^2(M)$ into $H^2(M,\mathcal{O})$, the sheaf cohomology group of holomorphic functions. The following is the main theorem of this paper:

Theorem 4.6. A Weyl manifold $W_M$ constructed on a Kähler manifold $M$ is a noncommutative Kähler manifold, if and only if $\pi(c^{(2k)}(W_M)) = 0$ for every $k \ge 1$. In particular, if $H^2(M,\mathcal{O}) = \{0\}$, then any Weyl manifold constructed on $M$ is a noncommutative Kähler manifold.

5. Proof of Theorem 4.6

5.1. Quantum complex coordinates. Let $M$ be a Kähler manifold. According to [OMY1], there exists a Weyl manifold $W^{\mathbb{C}}_M$.

Theorem 5.1. There is an open covering $\{V_\alpha\}$ of $M$ such that on every $V_\alpha$ there are QK-coordinates $\zeta_1^\alpha,\dots,\zeta_n^\alpha,\bar\zeta_1^\alpha,\dots,\bar\zeta_n^\alpha$ and the quantum canonical conjugate $\zeta_1^{*\alpha},\dots,\zeta_n^{*\alpha}$ with $[\zeta_i^\alpha,\zeta_j^{*\alpha}] = -\nu\delta_{ij}$, $[\zeta_i^{*\alpha},\zeta_j^{*\alpha}] = 0$.



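Proof. We first take K-coordinates $z_1^\alpha,\dots,z_n^\alpha$ and we make CC-coordinates on $U_\alpha$: $z_1^\alpha,\dots,z_n^\alpha,z_1^{*\alpha},\dots,z_n^{*\alpha}$. Set $\zeta_i^\alpha = (z_i^\alpha)^{\sharp}$, $\zeta_i^{*\alpha} = (z_i^{*\alpha})^{\sharp}$. Then, we get

$$[\zeta_i^\alpha,\zeta_j^\alpha] = [\zeta_i^{*\alpha},\zeta_j^{*\alpha}] = 0 \pmod{\nu^2}, \qquad [\zeta_i^\alpha,\zeta_j^{*\alpha}] = -\nu\delta_{ij} \pmod{\nu^2}. \tag{5.1}$$

Set as follows:

$$[\zeta_i^\alpha,\zeta_j^\alpha] = \nu^2 a_{ij} \pmod{\nu^3}, \qquad [\zeta_i^\alpha,\zeta_j^{*\alpha}] = -\nu\delta_{ij} + \nu^2 b_{ij} \pmod{\nu^3}, \qquad [\zeta_i^{*\alpha},\zeta_j^{*\alpha}] = \nu^2 c_{ij} \pmod{\nu^3}. \tag{5.2}$$

Define a 2-form $\omega$ as follows:

$$\omega = \sum \bigl(a_{ij}\,dz_i^{*\alpha}\wedge dz_j^{*\alpha} - b_{ij}\,dz_i^{*\alpha}\wedge dz_j^{\alpha} + c_{ij}\,dz_i^{\alpha}\wedge dz_j^{\alpha}\bigr). \tag{5.3}$$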

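By the Jacobi identity, we have $d\omega = 0$. Thus, by Lemma 4.1, there exists a 1-form

$$\theta = \sum \lambda_i\,dz_i^{*\alpha} + \sum \kappa_j\,dz_j^{\alpha} \tag{5.4}$$

such that $d\theta = \omega$. Replacing $\zeta_i^\alpha$, $\zeta_i^{*\alpha}$ by $\tilde\zeta_i = \zeta_i^\alpha - \nu\lambda_i$, $\tilde\zeta_i^* = \zeta_i^{*\alpha} + \nu\kappa_i$, we obtain

$$[\tilde\zeta_i,\tilde\zeta_j] = [\tilde\zeta_i^*,\tilde\zeta_j^*] = 0 \pmod{\nu^3}, \qquad [\tilde\zeta_i,\tilde\zeta_j^*] = -\nu\delta_{ij} \pmod{\nu^3}.$$

Repeating this procedure yields Theorem 5.1.

$\zeta_1^\alpha,\dots,\zeta_n^\alpha,\zeta_1^{*\alpha},\dots,\zeta_n^{*\alpha}$ will be called quantum complex canonical generators (QCC-generators).

5.2. Standard noncommutative Kähler manifold. On a Kähler manifold $M$, we take a coordinate covering $\{V_\alpha\}$ and K-coordinates $z_1^\alpha,\dots,z_n^\alpha$ on $U_\alpha$, where $U_\alpha = \varphi_\alpha(V_\alpha)$. By using the argument in Sect. 4.2 on each $U_\alpha$, there are CC-coordinates $z_1^\alpha,\dots,z_n^\alpha,z_1^{*\alpha},\dots,z_n^{*\alpha}$. Identifying $V_\alpha$ with $U_\alpha$, we use the above CC-coordinates on $V_\alpha$. Let $\psi_z^\alpha : V_\alpha \to \mathbb{C}^{2n}$ be the coordinate map of these paracoordinates and let $\tilde V_\alpha = \{\psi_z^\alpha(p);\ p \in V_\alpha\}$. We define a star-product $*_\alpha$ on $C^\infty(\tilde V_\alpha)^{\mathbb{C}}[[\nu]]$ by

$$f *_\alpha g = f \exp\Bigl\{-\frac{\nu}{2}\,\overleftarrow{\partial_{z^\alpha}}\,\dot\wedge\,\overrightarrow{\partial_{z^{*\alpha}}}\Bigr\}\, g. \tag{5.5}$$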

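$(C^\infty(\tilde V_\alpha)^{\mathbb{C}}[[\nu]], *_\alpha)$ can be viewed as the algebra of Weyl functions $\mathcal{F}(\hat W^{\mathbb{C}}_{\tilde V_\alpha})$ of the trivial complex Weyl algebra bundle $\hat W^{\mathbb{C}}_{\tilde V_\alpha}$ over $\tilde V_\alpha$. Since $\tilde V_\alpha$ is diffeomorphic to $V_\alpha$ through the coordinate map $\psi_z^\alpha$, $\mathcal{F}(\hat W^{\mathbb{C}}_{\tilde V_\alpha})$ may be written as $\mathcal{F}(\hat W^{\mathbb{C}}_{V_\alpha})$. We now identify $C^\infty(\tilde V_\alpha)^{\mathbb{C}}[[\nu]]$ with $\mathcal{F}(\hat W^{\mathbb{C}}_{V_\alpha})$.

Let $\varphi_{\alpha\beta} = \varphi_\beta\varphi_\alpha^{-1}$ be classical holomorphic coordinate transformations. By (4.8), (4.9), we see the following: Under the same notations as in Lemma 4.2, we set $\hat\Phi^*_{\alpha\beta}(\nu) = \nu$ and

$$\hat\Phi^*_{\alpha\beta}(z_i^\beta) = \varphi^i_{\alpha\beta}(z^\alpha), \qquad \hat\Phi^*_{\alpha\beta}(z_i^{*\beta}) = \sum_k \bigl((d\varphi_{\alpha\beta})^{-1}\bigr)^k_i \cdot \Bigl(z_k^{*\alpha} + \frac{\partial g_{\alpha\beta}}{\partial z_k^\alpha}\Bigr). \tag{5.6}$$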



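Theorem 5.2. (i) The mapping $\hat\Phi^*_{\alpha\beta}$ extends to a pre-Weyl diffeomorphism of $(\mathcal{F}(\hat W^{\mathbb{C}}_{V_{\beta\alpha}}), *_\beta)$ onto $(\mathcal{F}(\hat W^{\mathbb{C}}_{V_{\alpha\beta}}), *_\alpha)$ such that

$$\hat\Phi^*_{\alpha\alpha} = 1, \qquad \hat\Phi^*_{\alpha\beta}\hat\Phi^*_{\beta\gamma}\hat\Phi^*_{\gamma\alpha} = 1. \tag{5.7}$$

Thus, we obtain a noncommutative Kähler manifold $\hat W^{\mathbb{C}}_M$ in which $z_1^\alpha,\dots,z_n^\alpha,z_1^{*\alpha},\dots,z_n^{*\alpha}$ are local QCC-generators.

(ii) Moreover, $\hat\Phi^*_{\alpha\beta}$ extends to a pointless contact diffeomorphism $\hat\Phi^{c*}_{\alpha\beta}$ such that

$$\hat\Phi^{c*}_{\alpha\beta}\hat\Phi^{c*}_{\beta\gamma}\hat\Phi^{c*}_{\gamma\alpha} = e^{c^{(0)}_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})}, \qquad c^{(0)}_{\alpha\beta\gamma} \in \mathbb{C}, \tag{5.8}$$

and $c^{(0)}_{\alpha\beta\gamma}$ defines a cohomology class in the coefficients $\mathbb{C}$ of the symplectic 2-form $\Omega$ on $M$.

Proof. Omitting subscripts $\alpha$, $\beta$, we denote by $z'_i = \varphi^i(z)$. By (5.5), we have

$$\Bigl[\sum_k \frac{\partial z_k}{\partial z'_i}\cdot\Bigl(z_k^* + \frac{\partial g}{\partial z_k}\Bigr),\ z'_j\Bigr] = \nu\sum_k \frac{\partial z_k}{\partial z'_i}\frac{\partial z'_j}{\partial z_k} = \nu\delta_{ij},$$

$$[z'_i, z'_j] = 0, \qquad \Bigl[\sum_k \frac{\partial z_k}{\partial z'_i}\cdot\Bigl(z_k^* + \frac{\partial g}{\partial z_k}\Bigr),\ \sum_m \frac{\partial z_m}{\partial z'_j}\cdot\Bigl(z_m^* + \frac{\partial g}{\partial z_m}\Bigr)\Bigr] = 0. \tag{5.9}$$

Thus, setting $z'^{*}_i = \sum_k \frac{\partial z_k}{\partial z'_i}\cdot\bigl(z_k^* + \frac{\partial g}{\partial z_k}\bigr)$, we see $z'_1,\dots,z'_n,z'^{*}_1,\dots,z'^{*}_n$ are QCC-generators of $C^\infty(V_{\alpha\beta})^{\mathbb{C}}[[\nu]]$.

We show that $\hat\Phi^*_{\alpha\beta}$ extends to a pre-Weyl diffeomorphism. Since $\varphi_{\alpha\beta}$ is a symplectic diffeomorphism, Proposition 2.3 gives a lift $\Psi^*_{\alpha\beta}$ of $\varphi_{\alpha\beta}$. By Theorem 3.2, we may assume that $\Psi^*_{\alpha\beta}$ are patching Weyl diffeomorphisms of a Weyl manifold $W_M$. We consider $(\Psi^*_{\alpha\beta})^{-1}\hat\Phi^*_{\alpha\beta}$ on the above QCC-generators. Set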

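$$(\Psi^*_{\alpha\beta})^{-1}\hat\Phi^*_{\alpha\beta}(z'_i) = z'_i + h_i, \qquad (\Psi^*_{\alpha\beta})^{-1}\hat\Phi^*_{\alpha\beta}(z'^{*}_i) = z'^{*}_i + h^*_i.$$

By (5.9) together with Lemma 4.1, we easily see that there are elements $h_{\alpha\beta} \in C^\infty(V_{\alpha\beta})^{\mathbb{C}}[[\nu]]$ such that $\hat\Phi^*_{\alpha\beta} = \Psi^*_{\alpha\beta}\, e^{\mathrm{ad}(h_{\alpha\beta})}$. Since $\Psi^*_{\alpha\beta}$ and $e^{\mathrm{ad}(h_{\alpha\beta})}$ are pre-Weyl diffeomorphisms, $\hat\Phi^*_{\alpha\beta}$ extends to a pre-Weyl diffeomorphism. Equation (5.7) follows directly from (4.9). Thus, we get a noncommutative Kähler manifold $\hat W_M$.

Though we can make, by Lemma 2.6, a pointless contact diffeomorphism $\hat\Phi^{c*}_{\alpha\beta}$ which extends $\hat\Phi^*_{\alpha\beta}$, we construct $\hat\Phi^{c*}_{\alpha\beta}$ directly in two ways to obtain (5.8). We define a contact Weyl Lie algebra $\Gamma(\mathfrak{g}^{\mathbb{C}}_{U_\beta})$ by joining $\tau^\beta$ with the relations

$$[\tau^\beta,\nu] = 2\nu^2, \qquad [\tau^\beta, z_i^\beta] = \nu z_i^\beta, \qquad [\tau^\beta, z_i^{*\beta}] = \nu z_i^{*\beta}. \tag{5.10}$$

To obtain the extension $\hat\Phi^{c*}_{\alpha\beta}$ of $\hat\Phi^*_{\alpha\beta}$ we have only to know the function $f_{\alpha\beta}$ given by $\hat\Phi^{c*}_{\alpha\beta}(\tau^\beta) = \tau^\alpha + f_{\alpha\beta}$. By (5.6), we set $z'_i = \varphi^*_{\alpha\beta}(z_i^\beta)$, $z'^{*}_i = \varphi^*_{\alpha\beta}(z_i^{*\beta})$ on $U_{\alpha\beta}$. Then, by (5.9) and (5.10), $f_{\alpha\beta}$ must satisfy

$$[\tau^\alpha + f_{\alpha\beta}, z'_i] = \nu z'_i, \qquad [\tau^\alpha + f_{\alpha\beta}, z'^{*}_i] = \nu z'^{*}_i. \tag{5.11}$$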



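Note that $[\tau^\alpha, h] = \nu E^\alpha h$, where $E^\alpha$ is the Euler operator given by $E^\alpha = \sum_{i=1}^n (z_i^\alpha\cdot\partial_{z_i^\alpha} + z_i^{*\alpha}\cdot\partial_{z_i^{*\alpha}})$. Equation (5.11) can be rewritten by using the usual commutative product, as

$$\frac{\partial}{\partial z'^{*}_i} f_{\alpha\beta} = -(I - E^\alpha)\,z'_i, \qquad \frac{\partial}{\partial z'_i} f_{\alpha\beta} = (I - E^\alpha)\,z'^{*}_i. \tag{5.12}$$

Thus, by solving (5.12) via Lemma 4.1, we find $f_{\alpha\beta}$. Since the right-hand side of (5.12) does not involve $\nu$, $f_{\alpha\beta}$ does not involve $\nu$. Put

$$c^{(0)}_{\alpha\beta\gamma} = \tfrac12\bigl(f_{\alpha\beta} + \varphi^*_{\alpha\beta} f_{\beta\gamma} + \varphi^*_{\alpha\gamma} f_{\gamma\alpha}\bigr).$$

Then, we have $c^{(0)}_{\alpha\beta\gamma} \in \mathbb{C}$ and $\hat\Phi^{c*}_{\alpha\beta}\hat\Phi^{c*}_{\beta\gamma}\hat\Phi^{c*}_{\gamma\alpha} = e^{c^{(0)}_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})}$.

To find the de Rham cohomology class corresponding to $c^{(0)}_{\alpha\beta\gamma}$ through the isomorphism between Čech cohomology and de Rham cohomology, we recall another recipe of constructions of $f_{\alpha\beta}$. That is, we find a 1-form $\tilde\theta_\alpha$ on every $V_\alpha$ such that $\varphi^*_{\alpha\beta}\tilde\theta_\beta - \tilde\theta_\alpha = \frac12 df_{\alpha\beta}$ on $V_{\alpha\beta} = V_\alpha\cap V_\beta$, because $\{d\tilde\theta_\alpha\}$ defines a global closed 2-form.

To find $\tilde\theta_\alpha$, we remark that there is a one parameter family $\psi_t$ of symplectic diffeomorphisms $\psi_t : V_{\alpha\beta,0} \to V_{\alpha\beta,t}$ such that $V_{\alpha\beta,0} = V_{\alpha\beta}$, $\psi_0 = 1$ and $V_{\alpha\beta,1} = V_{\alpha\beta}$, $\psi_1 = (\varphi^1_{\alpha\beta},\dots,\varphi^{2n}_{\alpha\beta})$ (cf. [OMY3, Lemma A] and [KNz]). Define an infinitesimal symplectic transformation $H_t$ by $H_t = (\frac{d}{dt}\psi_t)\psi_t^{-1}$. Since $H_t$ is a Hamiltonian vector field, there is a $C^\infty$ function $h_t$ such that $\Omega\,\lrcorner\,H_t = -dh_t$.

Recall that a lift $\Psi^{c*}_t$ is given by solving the equation $\frac{d}{dt}\Psi^{c*}_t = \Psi^{c*}_t\,\mathrm{ad}(\frac{1}{\nu}h_t)$. If we set $\Psi^{c*}_t(\tau^\alpha) = \tau^\alpha + f_t$, then $f_t$ must satisfy the differential equation

$$\frac{d}{dt} f_t = \frac{1}{\nu}[h_t, f_t] + 2h_t - \frac{1}{\nu}[\tau^\alpha, h_t]. \tag{5.13}$$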

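By the first construction of $f_{\alpha\beta}$, we may set $f_1 = f_{\alpha\beta}$ mod $\nu$. Note that setting $\tilde\theta_\alpha = \frac12\sum (z_i^\alpha\,dz_i^{*\alpha} - z_i^{*\alpha}\,dz_i^\alpha)$, we have $\frac{1}{\nu}[\tau^\alpha, h_t] = E^\alpha h_t = 2\tilde\theta_\alpha\,\lrcorner\,H_t$. Solving the equation of the $\nu^0$-component of (5.13), we have

$$f_{\alpha\beta} = \psi_1^*\int_0^1 \psi_t^{*-1}\bigl(2h_t - 2\tilde\theta_\alpha\,\lrcorner\,H_t\bigr)\,dt. \tag{5.14}$$

Note that $\Omega = d\tilde\theta_\alpha$ on $V_\alpha$ (cf. 4.2). By Cartan’s formula of Lie derivatives, we see that

$$df_{\alpha\beta} = 2\psi_1^*\int_0^1 \psi_t^{*-1}\bigl(dh_t - d(\tilde\theta_\alpha\,\lrcorner\,H_t)\bigr)\,dt = -2\psi_1^*\int_0^1 \psi_t^{*-1} L_{H_t}\tilde\theta_\alpha\,dt.$$

Hence we have $df_{\alpha\beta} = 2(\varphi^*_{\alpha\beta}\tilde\theta_\beta - \tilde\theta_\alpha)$, by remarking $\psi_1^{*-1}\tilde\theta_\alpha = \varphi^*_{\alpha\beta}\tilde\theta_\beta$. Thus, we see that $d\tilde\theta_\alpha = 2\sum dz_i^\alpha\wedge dz_i^{*\alpha}$. The last assertion is proved.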

Probably, the noncommutative Kähler structure obtained by Theorem 5.2 is isomorphic to that given by Karabegov [Ka].

5.3. Proof of Theorem 4.6. Suppose we have a Weyl manifold $W_M$ with the Poincaré–Cartan class $\{c_{\alpha\beta\gamma}\}$. Let $\Phi^*_{\alpha\beta}$ be Weyl diffeomorphisms giving patching transformations, and let $\Phi^{c*}_{\alpha\beta}$ be the lifts of $\Phi^*_{\alpha\beta}$ given in Sect. 3.2. Let $\varphi_{\alpha\beta} : U_{\alpha\beta} \to U_{\beta\alpha}$ be the coordinate transformation induced by $\Phi^*_{\alpha\beta}$. $\varphi_{\alpha\beta}$ is a symplectomorphism and a holomorphic diffeomorphism at the same time.



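By the assumption of Theorem 4.6, we have that for every $k \ge 1$, $\{c^{(2k)}_{\alpha\beta\gamma}\}$ can be written in the form $c^{(2k)}_{\alpha\beta\gamma} = g^{(2k)}_{\alpha\beta} + \varphi^*_{\alpha\beta}\, g^{(2k)}_{\beta\gamma} + \varphi^*_{\alpha\gamma}\, g^{(2k)}_{\gamma\alpha}$, $(k \ge 1)$, where $g^{(2k)}_{\alpha\beta}$ is a holomorphic function on $U_{\alpha\beta} = \varphi_\alpha(V_\alpha\cap V_\beta)$. Beside $\Phi^{c*}_{\alpha\beta}$, we define another family of pre-Weyl diffeomorphisms

$$\breve\Phi^{c*}_{\alpha\beta} = \hat\Phi^{c*}_{\alpha\beta}\exp\sum_{k\ge1}\frac{1}{2k-1}\,\mathrm{ad}\bigl(\nu^{2k-1} g^{(2k)}_{\alpha\beta}\bigr) \tag{5.15}$$

by using $g^{(2k)}_{\alpha\beta}$ given above. By (3.7), we see that $\{\breve\Phi^{c*}_{\alpha\beta}\}$ satisfies

$$\breve\Phi^{c*}_{\alpha\beta}\breve\Phi^{c*}_{\beta\gamma}\breve\Phi^{c*}_{\gamma\alpha} = \hat\Phi^{c*}_{\alpha\beta}\hat\Phi^{c*}_{\beta\gamma}\hat\Phi^{c*}_{\gamma\alpha}\exp\sum_{k}\frac{1}{2k-1}\,\mathrm{ad}\Bigl(\nu^{2k-1}\bigl(g^{(2k)}_{\alpha\beta} + \hat\Phi^*_{\alpha\beta}\, g^{(2k)}_{\beta\gamma} + \hat\Phi^*_{\alpha\gamma}\, g^{(2k)}_{\gamma\alpha}\bigr)\Bigr).$$

Hence by (5.6), (5.8), we have

Lemma 5.3. $\Phi^{c*}_{\alpha\beta}\Phi^{c*}_{\beta\gamma}\Phi^{c*}_{\gamma\alpha} = \breve\Phi^{c*}_{\alpha\beta}\breve\Phi^{c*}_{\beta\gamma}\breve\Phi^{c*}_{\gamma\alpha} = e^{c_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})}$.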

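The above lemma also shows that the system $\{\breve\Phi^{c*}_{\alpha\beta}\}$ defines a pre-Weyl manifold $\breve W_M$. However, we see also the following:

Lemma 5.4. For each $\alpha$, $\beta$ such that $V_\alpha\cap V_\beta \neq \emptyset$, there exist a unique $h_{\alpha\beta} \in \mathcal{F}(W_{U_{\alpha\beta}})^{\mathbb{C}}$ and a unique $c_{\alpha\beta} \in \mathbb{C}$ such that $\Phi^{c*}_{\alpha\beta} = \breve\Phi^{c*}_{\alpha\beta}\, e^{\mathrm{ad}(\nu h_{\alpha\beta})}\, e^{c_{\alpha\beta}\,\mathrm{ad}(\nu^{-1})}$.

Proof. We already know that $f_{\alpha\beta}$ does not involve $\nu$. Hence by (5.15) we see that $\breve\Phi^{c*}_{\alpha\beta}(\tau^\beta)$ is written in the form $\tau^\alpha + \sum_{k\ge0}\nu^{2k} * h^{\sharp}_{2k}$, $h_{2k} \in C^\infty(U_{\alpha\beta})$. Apply now Lemma 2.7 to $\Phi^*_{\alpha\beta}$. Since both $\Phi^*_{\alpha\beta}$ and $\hat\Phi^*_{\alpha\beta}$ induce the same $\varphi_{\alpha\beta}$ on the base spaces, we see by Corollary 2.5, (1) that $\Phi^*_{\alpha\beta}(\hat\Phi^*_{\alpha\beta})^{-1}(\tau^\alpha)$ is written uniquely in the form $\tau^\alpha + 2c_{\alpha\beta} + \nu^2 * h'_{\alpha\beta}$, where $h'_{\alpha\beta} \in \mathcal{F}(W_{U_{\beta\alpha}})^{\mathbb{C}}$. Hence, we have $\Phi^{c*}_{\alpha\beta} = \breve\Phi^{c*}_{\alpha\beta}\, e^{\mathrm{ad}(\nu h_{\alpha\beta})}\, e^{c_{\alpha\beta}\,\mathrm{ad}(\nu^{-1})}$.

We remark also that $c_{\alpha\beta} + c_{\beta\gamma} + c_{\gamma\alpha} = 0$. The identities $\Phi^{c*}_{\alpha\beta}\Phi^{c*}_{\beta\alpha} = 1$, $\breve\Phi^{c*}_{\alpha\beta}\breve\Phi^{c*}_{\beta\alpha} = 1$ together with (3.7) yield $h_{\beta\alpha} = -\breve\Phi^*_{\alpha\beta} h_{\alpha\beta}$ and $c_{\alpha\beta} = -c_{\beta\alpha}$. In what follows we use

$$\breve\Psi^*_{\alpha\beta} = \breve\Phi^*_{\alpha\beta}\, e^{c_{\alpha\beta}\,\mathrm{ad}(\nu^{-1})} \tag{5.16}$$

instead of $\breve\Phi^*_{\alpha\beta}$, since this replacement (5.16) does not change the Poincaré–Cartan cocycle by the above remark. The identity in Lemma 5.3 gives the following cocycle property for $\{h_{\alpha\beta}\}$:

Lemma 5.5. On $\Gamma(\mathfrak{g}_{U_{\alpha\beta\gamma}})$, we have $e^{\mathrm{ad}(\nu h_{\alpha\beta})}\, e^{\mathrm{ad}(\nu\breve\Psi^*_{\alpha\beta} h_{\beta\gamma})}\, e^{\mathrm{ad}(\nu\breve\Psi^*_{\alpha\gamma} h_{\gamma\alpha})} = 1$.

The next lemma shows that this cocycle is a coboundary, and hence $W_M \cong \breve W_M$.

Lemma 5.6. For each $\alpha$, there exists $h_\alpha \in \mathcal{F}(W_{U_\alpha})^{\mathbb{C}}$ such that

$$\Phi^{c*}_{\alpha\beta} = e^{\mathrm{ad}(\nu h_\alpha)}\,\breve\Psi^{c*}_{\alpha\beta}\, e^{-\mathrm{ad}(\nu h_\beta)}. \tag{5.17}$$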



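Proof. Let $\varphi_{\alpha\beta}$ be the induced symplectic diffeomorphism by $\Phi^*_{\alpha\beta}$. Using Lemma 5.5, we have by identifying $h_{\alpha\beta}$ with an ordinary function that

$$h_{\alpha\beta} + \varphi^*_{\alpha\beta} h_{\beta\gamma} + \varphi^*_{\alpha\gamma} h_{\gamma\alpha} = 0 \mod \nu. \tag{5.18}$$

Taking a partition of unity $\{\phi_\alpha\}$ subordinate to the covering $\{V_\alpha\}$, we set

$$h_\alpha = \sum_\gamma \varphi^*_{\alpha\gamma}\phi_\gamma\, h_{\alpha\gamma} \in \mathcal{F}(W_{U_\alpha})^{\mathbb{C}}. \tag{5.19}$$

Using Lemma 5.5 again, we get $h_{\alpha\beta} = h_\alpha - \varphi^*_{\alpha\beta} h_\beta$. Setting $\acute\Psi^{c*}_{\alpha\beta} = e^{\mathrm{ad}(h_\alpha)}\,\breve\Psi^{c*}_{\alpha\beta}\, e^{-\mathrm{ad}(h_\beta)}$, we see that $\Phi^{c*}_{\alpha\beta} = \acute\Psi^{c*}_{\alpha\beta}$ mod $\nu^3$. By Corollary 2.5, (1), there exists a unique $\tilde h_{\alpha\beta}$ such that

$$\Phi^{c*}_{\alpha\beta} = \acute\Psi^{c*}_{\alpha\beta}\, e^{\mathrm{ad}(\nu^2\tilde h_{\alpha\beta})} \tag{5.20}$$

without the $e^{c\,\mathrm{ad}(\nu^{-1})}$-term. Repeating this procedure yields Lemma 5.6.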

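We now show Theorem 4.6. We first show the necessity; $\pi(c^{(2k)}(W_M)) = 0$ implies $W_M$ is a noncommutative Kähler manifold. We may assume by Lemma 5.6 that $W_M$ is a pre-Weyl manifold with a system of patching diffeomorphisms $\breve\Psi^*_{\alpha\beta}$. Since $[\nu^{-1}, z_i^\beta] = 0$ and $[g^{(2k)}_{\alpha\beta}(z^\beta), z_i^\beta] = 0$, we have by (5.15), (5.16) and Theorem 5.2 that

$$\breve\Psi^*_{\alpha\beta} z_i^\beta = \hat\Phi^*_{\alpha\beta} z_i^\beta = \varphi^i_{\alpha\beta}(z^\alpha). \tag{5.21}$$

This means the patching transformations are holomorphic. Note that $[z_i^\alpha,\bar z_j^\alpha] = -\nu\{b(z_i^\alpha), b(\bar z_j^\alpha)\}$ mod $\nu^2$. Hence we see the body part of the matrix $(-\frac{1}{\nu}[z_i^\alpha,\bar z_j^\alpha])$ is nondegenerate. Thus, $W_M$ is a noncommutative Kähler manifold.

To prove the sufficiency in Theorem 4.6, let $W_M$ be a Weyl manifold with the Poincaré–Cartan class $c = \sum_{k\ge0} c^{(2k)}$. By definition of Weyl manifold $W_M$ for a Kähler manifold $M$, there are a simple open Stein covering $\{V_\alpha\}$, a system of trivial Lie algebra bundle $\{\mathfrak{g}_{U_\alpha}\}$ and a system of patching transformations $\{\Phi^{c*}_{\alpha\beta}\}$. Suppose that $W_M$ is a noncommutative Kähler manifold over $M$. Then, we may assume that on each $\mathcal{F}(W_{U_\alpha})^{\mathbb{C}}$ there are QK-coordinates $z_1^\alpha,\dots,z_n^\alpha,\bar z_1^\alpha,\dots,\bar z_n^\alpha$ with the pre-Weyl diffeomorphisms $\Psi^*_{\alpha\beta} : \mathcal{F}(W_{U_{\beta\alpha}})^{\mathbb{C}} \to \mathcal{F}(W_{U_{\alpha\beta}})^{\mathbb{C}}$ satisfying the property (4.10). Since $\{\Phi^*_{\alpha\beta}\}$ and $\{\Psi^*_{\alpha\beta}\}$ are patching diffeomorphisms of the same Weyl manifold $W_M$, there is a pre-Weyl diffeomorphism $\Psi^*_\alpha$ for each $\alpha$ such that $\Phi^*_{\alpha\beta}\Psi^*_\beta = \Psi^*_\alpha\Psi^*_{\alpha\beta}$. Hence $W_M$ can be viewed as a pre-Weyl manifold with patching diffeomorphisms $\Psi^*_{\alpha\beta}$. Let $\Psi^{c*}_{\alpha\beta}$ be a pointless contact diffeomorphism which extends $\Psi^*_{\alpha\beta}$.

On the other hand, remark that the holomorphic coordinate change $\varphi_{\alpha\beta}$ in (4.10) can be viewed as the usual holomorphic coordinate transformations on the base manifold $M$. By Theorem 5.1, there are QCC-generators $z_1^\alpha,\dots,z_n^\alpha,z_1^{*\alpha},\dots,z_n^{*\alpha}$ on each $\mathcal{F}(W_{U_\alpha})^{\mathbb{C}}$. By Theorem 5.2, we have pre-Weyl diffeomorphisms $\hat\Phi^*_{\alpha\beta}$ and pointless contact diffeomorphisms $\hat\Phi^{c*}_{\alpha\beta}$ which extend $\hat\Phi^*_{\alpha\beta}$.

Since $\Psi^{c*}_{\alpha\beta}(\hat\Phi^{c*}_{\alpha\beta})^{-1}$ induces the identity on the base space $U_{\alpha\beta}$, there is by (2.2) $h_{\alpha\beta} \in \mathcal{F}(W_{U_{\alpha\beta}})^{\mathbb{C}}$ such that $\Psi^*_{\alpha\beta} = \hat\Phi^*_{\alpha\beta}\, e^{\mathrm{ad}(h_{\alpha\beta})}$. The terms $e^{c\,\mathrm{ad}(\nu^{-1})}$, $e^{c'\,\mathrm{ad}(\log\nu)}$ need not be used because these are identities on $\mathcal{F}(W_{U_{\alpha\beta}})^{\mathbb{C}}$.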

(5.21)

This means the patching transformations are holomorphic. Note that [ziα , z¯jα ] = −ν{b(ziα ), b(z¯jα )} mod ν. Hence we see the body part of the matrix (− ν1 [ziα , z¯jα ]) is nondegenerate. Thus, W M is a noncommutative Kähler manifold. To prove the sufficiencyP in Theorem 4.6, let W M be a Weyl manifold with the (2k) . By definition of Weyl manifold W M for a Poincaré–Cartan class c = k≥0 c Kähler manifold M , there are a simple open Stein covering {Vα }, a system of trivial Lie algebra bundle {gUα } and a system of patching transformations {8c∗ αβ }. Suppose that W M is a noncommutative Kähler manifold over M . Then, we may assume that on each F(W Uα )C there are QK-coordinates z1α , · · · , znα , z¯1α , · · · , z¯nα with the pre-Weyl diffeomorphisms 9∗αβ : F(W Uβα )C → F(W Uαβ )C satisfying the property (4.10). Since {8∗αβ } and {9∗αβ } are patching diffeomorphisms of the same Weyl manifold W M , there is a pre-Weyl diffeomorphism 9∗α for each α such that 8∗αβ 9∗β = 9∗α 9∗αβ . Hence W M can be viewed as a pre-Weyl manifold with patching diffeomorphisms 9∗αβ . ∗ Let 9c∗ αβ be a pointless contact diffeomorphism which extends 9αβ . On the other hand, remark that the holomorphic coordinate change ϕαβ in (4.10) can be viewed as the usual holomorphic coordinate transformations on the base manifold M . By Theorem 5.1, there are QCC-generators z1α , · · · , znα , z1∗α , · · · , zn∗α on each ˆ ∗αβ and pointless F (W Uα )C . By Theorem 5.2, we have pre-Weyl diffeomorphisms 8 ˆ∗ ˆ c∗ contact diffeomorphisms 8 αβ which extend 8αβ . c∗ −1 ˆ induces the identity on the base space Uαβ , there is by (2.2) Since 9c∗ αβ (8αβ ) C ˆ ∗αβ ead(hαβ ) . The terms ec ad(ν −1 ) , ec0 ad(log ν) need hαβ ∈ F (W Uαβ ) such that 9∗αβ = 8 not be used because these are identities on F(W Uαβ )C .



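Note that $\Psi^*_{\alpha\beta}(z_i) = \hat\Phi^*_{\alpha\beta}(z_i)$ for every $z_i$. We see that $[z_i, h_{\alpha\beta}] = 0$. It follows that $h_{\alpha\beta}$ does not involve $z_i^*$ variables, that is “holomorphic”. We define $\Psi^{c*}_{\alpha\beta}$ by $\hat\Phi^{c*}_{\alpha\beta}\, e^{\mathrm{ad}(h_{\alpha\beta})}$. Then, $\Psi^{c*}_{\alpha\beta}$ is an extension of $\Psi^*_{\alpha\beta}$. The Poincaré–Cartan cocycle of $W_M$ is given as

$$e^{c_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})} = \Psi^{c*}_{\alpha\beta}\Psi^{c*}_{\beta\gamma}\Psi^{c*}_{\gamma\alpha} = \hat\Phi^{c*}_{\alpha\beta}\hat\Phi^{c*}_{\beta\gamma}\hat\Phi^{c*}_{\gamma\alpha}\, e^{\mathrm{ad}(\hat\Phi^*_{\alpha\beta} h_{\alpha\beta})}\, e^{\mathrm{ad}(\hat\Phi^*_{\alpha\gamma} h_{\beta\gamma})}\, e^{\mathrm{ad}(h_{\gamma\alpha})}.$$

By (5.8), we have $e^{c_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})} = e^{c^{(0)}_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})}\, e^{\mathrm{ad}(\hat\Phi^*_{\alpha\beta} h_{\alpha\beta})}\, e^{\mathrm{ad}(\hat\Phi^*_{\alpha\gamma} h_{\beta\gamma})}\, e^{\mathrm{ad}(h_{\gamma\alpha})}$. Let $h_{\alpha\beta} = \sum_{k\ge0}\nu^k h^{(k)}_{\alpha\beta}$. Since $c_{\alpha\beta\gamma} = \sum_{k\ge0}\nu^{2k} c^{(2k)}_{\alpha\beta\gamma}$, we have

$$\hat\Phi^*_{\alpha\beta} h_{\alpha\beta} + \hat\Phi^*_{\alpha\gamma} h_{\beta\gamma} + h_{\gamma\alpha} = 0 \mod \nu.$$

By identifying $h^{(0)}_{\alpha\beta}$ with an ordinary function, we get $0 = \varphi^*_{\alpha\beta} h^{(0)}_{\alpha\beta} + \varphi^*_{\alpha\gamma} h^{(0)}_{\beta\gamma} + h^{(0)}_{\gamma\alpha}$.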

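Consider $\check\Phi^{c*}_{\alpha\beta} = \hat\Phi^{c*}_{\alpha\beta}\, e^{-\mathrm{ad}(h^{(0)}_{\alpha\beta})}$ instead of $\hat\Phi^{c*}_{\alpha\beta}$ in the above arguments. Since $h^{(0)}_{\alpha\beta}$ are holomorphic, $\check\Phi^{c*}_{\alpha\beta}$ can be used as patching diffeomorphisms to define a noncommutative Kähler manifold.

Now by the same reason as above, there are $h_{\alpha\beta}$ such that $\Psi^*_{\alpha\beta} = \check\Phi^*_{\alpha\beta}\, e^{\mathrm{ad}(\nu h_{\alpha\beta})}$ and $h_{\alpha\beta}$ are holomorphic. Hence, we have

$$e^{c_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})} = e^{c^{(0)}_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})}\, e^{\mathrm{ad}(\nu\check\Phi^*_{\alpha\beta} h_{\alpha\beta})}\, e^{\mathrm{ad}(\nu\check\Phi^*_{\alpha\gamma} h_{\beta\gamma})}\, e^{\mathrm{ad}(\nu h_{\gamma\alpha})}.$$

Setting $h_{\alpha\beta} = \sum_{k\ge0}\nu^k h^{(k)}_{\alpha\beta}$, we have $c^{(2)}_{\alpha\beta\gamma} = \varphi^*_{\alpha\beta} h^{(0)}_{\alpha\beta} + \varphi^*_{\alpha\gamma} h^{(0)}_{\beta\gamma} + h^{(0)}_{\gamma\alpha}$. This implies that $c^{(2)}_{\alpha\beta\gamma}$ is a coboundary in the cochain complex with coefficients $\mathcal{O}$. Repeating this procedure, we see $\{c^{(2k)}_{\alpha\beta\gamma}\} = 0$ in $H^2(M,\mathcal{O})$ for $k \ge 1$. Thus, we obtain Theorem 4.6.

6. Construction of Noncommutative Contact Algebras

In this section we construct a certain algebra, called a noncommutative contact algebra over a quantizable symplectic manifold. We use notations stated in Sect. 3. Let $M$ be a symplectic manifold with the symplectic form $\Omega$. We assume that $M$ is quantizable, i.e., $\frac{1}{\pi}\Omega \in H^2(M;\mathbb{Z})$. We consider a Weyl manifold $W_M$. On each coordinate $U_\alpha$, we use $\tau$ given by (1.9) and we denote it by $\tilde\tau^\alpha$. In this section we assume that the Poincaré–Cartan class $c(W_M)$ is $c^{(0)}(W_M)$. Since $M$ is quantizable, we can assume that $c^{(0)}_{\alpha\beta\gamma}$ is taken as $\pi n_{\alpha\beta\gamma}$, where $n_{\alpha\beta\gamma} \in \mathbb{Z}$. Since $[\tilde\tau^\alpha,\nu^{-1}] = -2$, we see that

$$\Phi^{c*}_{\alpha\beta}\Phi^{c*}_{\beta\gamma}\Phi^{c*}_{\gamma\alpha}\,\tilde\tau^\alpha = \tilde\tau^\alpha + 2\pi n_{\alpha\beta\gamma}. \tag{6.1}$$

This implies $\Phi^{c*}_{\alpha\beta}\Phi^{c*}_{\beta\gamma}\Phi^{c*}_{\gamma\alpha}\, e^{i\tilde\tau^\alpha} = e^{i\tilde\tau^\alpha}$, and hence the associative algebras $\mathcal{A}(U_\alpha)$ generated by $e^{i\tilde\tau^\alpha}$ and $\mathcal{F}(W_{U_\alpha})^{\mathbb{C}}$ can be patched together to form an algebra sheaf. We denote this patched algebra by $\mathcal{A}(M)$. Every element of $\mathcal{A}(U_\alpha)$ is written in the form $\sum f_m * e^{im\tilde\tau^\alpha}$, $f_m \in \mathcal{F}(W_{U_\alpha})^{\mathbb{C}}$ and $\Phi^{c*}_{\alpha\beta}$ are the patching transformations.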
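The passage from the integrality condition to (6.1) can be spelled out in a few lines. The following is only a sketch: it assumes the convention $\mathrm{ad}(a)b = [a,b]$, the relation $[\tilde\tau^\alpha,\nu^{-1}] = -2$ quoted above, and that the triple composition of patching maps equals $e^{c\,\mathrm{ad}(\nu^{-1})}$ with $c = c^{(0)}_{\alpha\beta\gamma} = \pi n_{\alpha\beta\gamma}$.

```latex
\begin{align*}
\mathrm{ad}(\nu^{-1})\,\tilde\tau^{\alpha}
  &= [\nu^{-1},\tilde\tau^{\alpha}] = -[\tilde\tau^{\alpha},\nu^{-1}] = 2,
  \qquad \mathrm{ad}(\nu^{-1})^{k}\,\tilde\tau^{\alpha} = 0 \quad (k \ge 2),\\
e^{c\,\mathrm{ad}(\nu^{-1})}\,\tilde\tau^{\alpha}
  &= \tilde\tau^{\alpha} + 2c
   = \tilde\tau^{\alpha} + 2\pi n_{\alpha\beta\gamma}
  \qquad \text{for } c = \pi n_{\alpha\beta\gamma},\\
e^{c\,\mathrm{ad}(\nu^{-1})}\, e^{i\tilde\tau^{\alpha}}
  &= e^{i(\tilde\tau^{\alpha} + 2\pi n_{\alpha\beta\gamma})}
   = e^{2\pi i n_{\alpha\beta\gamma}}\, e^{i\tilde\tau^{\alpha}}
   = e^{i\tilde\tau^{\alpha}} .
\end{align*}
```

The last line uses that $e^{c\,\mathrm{ad}(\nu^{-1})}$ is an automorphism and that the scalar $2\pi n_{\alpha\beta\gamma}$ is central, which is why integrality of $n_{\alpha\beta\gamma}$ is exactly what makes $e^{i\tilde\tau^\alpha}$ patch consistently.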


H. Omori, Y. Maeda, N. Miyazaki, A. Yoshioka

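Let $\mathcal{A}_m(M)$ be the subspace consisting of elements written in the form $f_\alpha * e^{im\tilde\tau^\alpha}$ on each $U_\alpha$, where $f_\alpha \in \mathcal{F}(W_{U_\alpha})^{\mathbb{C}}$. $\mathcal{A}_m(M)$ is characterized as the eigenspace of $\frac{1}{i}\mathrm{ad}(\nu^{-1})$ with the eigenvalue $2m$. Clearly, we have

$$\mathcal{A}(M) = \bigoplus_{m\in\mathbb{Z}} \mathcal{A}_m(M),$$

and $\mathcal{A}_0(M) = \mathcal{F}(W_M)^{\mathbb{C}}$ is a subalgebra of $\mathcal{A}(M)$. Since $\Phi^{c*}_{\alpha\beta}$ is a contact Weyl diffeomorphism, we see that there exists $f_{\alpha\beta} \in \mathcal{F}(W_{U_{\alpha\beta}})^{\mathbb{C}}$ on every $V_\alpha\cap V_\beta \neq \emptyset$ such that

$$\Phi^{c*}_{\alpha\beta}(e^{i\tilde\tau^\beta}) = e^{i\Phi^{c*}_{\alpha\beta}(\tilde\tau^\beta)} = e^{i(\tilde\tau^\alpha + f_{\alpha\beta})}. \tag{6.2}$$

Lemma 6.1. There is $F_{\alpha\beta} \in \mathcal{F}(W_{U_{\alpha\beta}})^{\mathbb{C}}$ such that $e^{i(\tilde\tau^\alpha + f_{\alpha\beta})} = F_{\alpha\beta} * e^{i\tilde\tau^\alpha}$. In particular, we have

$$e^{is(\tilde\tau^\alpha + t\nu)} = e^{is\tilde\tau^\alpha} * (1 + 2is\nu)^{\frac{t}{2}} = (1 - 2is\nu)^{-\frac{t}{2}} * e^{is\tilde\tau^\alpha}. \tag{6.3}$$

Proof. Consider $\psi(s) = e^{is(\tilde\tau^\alpha + f_{\alpha\beta})} * e^{-is\tilde\tau^\alpha}$. Since $[\nu,\psi(s)] = 0$, $\psi(s)$ does not involve $\tilde\tau^\alpha$. We have

$$\frac{d}{ds}\psi(s) = e^{is(\tilde\tau^\alpha + f_{\alpha\beta})} * (if_{\alpha\beta}) * e^{-is\tilde\tau^\alpha} = i\psi(s) * e^{is\tilde\tau^\alpha} * f_{\alpha\beta} * e^{-is\tilde\tau^\alpha}. \tag{6.4}$$

Put $g(s) = e^{is\tilde\tau^\alpha} * f_{\alpha\beta} * e^{-is\tilde\tau^\alpha}$. Since $g(s) = e^{is\,\mathrm{ad}(\tilde\tau^\alpha)} f_{\alpha\beta}$, we see $g(s) \in \mathcal{F}(W_{U_\alpha})^{\mathbb{C}}$. Thus, we have a differential equation $\frac{d}{ds}\psi(s) = \psi(s) * ig(s)$, where $g(s) \in \mathcal{F}(W_{U_\alpha})^{\mathbb{C}}$ is viewed as a known function. Note that $\mathcal{F}(W_{U_\alpha})^{\mathbb{C}} \cong C^\infty(U_\alpha)^{\mathbb{C}}[[\nu]]$. By the Moyal product formula, the above differential equation can be rewritten as a system of differential equations on $U_\alpha$. It is easy to see that (6.4) has a unique solution in $C^\infty(U_\alpha)^{\mathbb{C}}[[\nu]]$. Note that

$$e^{is\tilde\tau^\alpha} * t\nu * e^{-is\tilde\tau^\alpha} = \frac{t\nu}{1 - 2is\nu}.$$

Equation (6.3) is obtained by solving (6.4) inserted in the above quantity.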
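The conjugation formula used at the end of the proof, and the resulting closed form (6.3), can be checked by a short computation. This is a sketch under the assumption that $[\tilde\tau^\alpha,\nu] = 2\nu^2$ (equivalent to the relation $[\tilde\tau^\alpha,\nu^{-1}] = -2$ used in this section); the auxiliary name $\nu(s)$ is introduced here only for illustration.

```latex
\begin{align*}
\nu(s) &:= e^{is\tilde\tau^{\alpha}} * \nu * e^{-is\tilde\tau^{\alpha}}, \qquad \nu(0) = \nu,\\
\frac{d}{ds}\nu(s)
  &= i\, e^{is\tilde\tau^{\alpha}} * [\tilde\tau^{\alpha},\nu] * e^{-is\tilde\tau^{\alpha}}
   = 2i\,\nu(s)^{2},\\
\nu(s) &= \frac{\nu}{1 - 2is\nu}
  \qquad\Bigl(\text{indeed } \frac{d}{ds}\,\frac{\nu}{1-2is\nu} = \frac{2i\nu^{2}}{(1-2is\nu)^{2}}\Bigr),\\
\frac{d}{ds}\psi(s) &= \psi(s) * \frac{it\nu}{1-2is\nu}, \quad \psi(0) = 1
  \;\Longrightarrow\; \psi(s) = (1 - 2is\nu)^{-t/2}.
\end{align*}
```

Here $\psi(s) = e^{is(\tilde\tau^\alpha + t\nu)} * e^{-is\tilde\tau^\alpha}$ is the special case $f_{\alpha\beta} = t\nu$ of (6.4), and everything in the last line is a function of $\nu$ alone, so the equation can be integrated as an ordinary scalar ODE.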

Lemma 6.1 shows that $\Phi^{c*}_{\alpha\beta}(g * e^{i\tilde\tau^\beta}) = \Phi^*_{\alpha\beta}(g) * F_{\alpha\beta} * e^{i\tilde\tau^\alpha}$ for any $g \in \mathcal{F}(W_{U_\beta})^{\mathbb{C}}$. Since $e^{t\,\mathrm{ad}(\nu^{-1})} e^{\tilde\tau^\alpha} = e^{\tilde\tau^\alpha + 2t}$, we see that $e^{\frac{t}{2}\mathrm{ad}(\nu^{-1})}$ gives an $S^1 = \{e^{it}\}$ action on $\mathcal{A}(M)$. Hence the relation (6.2) together with Lemma 6.1 can be viewed as a transition rule (coordinate transformation) of a “quantum” $S^1$-principal bundle and the associated line bundle.

Remark that the principal $S^1$-bundle $P_M$ constructed on $M$ via the quantization condition is a contact manifold with a contact form as a connection form whose curvature form is the symplectic form. We denote by $L_M$ the line bundle associated to $P_M$. $\mathcal{A}(M)$ can be viewed as a noncommutative contact algebra $(C^\infty(P_M)^{\mathbb{C}}[[\nu]], *)$ defined on $P_M$. We denote by $\tilde P_M$, $\tilde L_M$ the quantum principal bundle and its associated line bundle respectively given by the patchwork mentioned above.

We suppose that $W_M$ is a noncommutative Kähler manifold over a Kähler manifold $M$. By Theorem 5.1, there exist QCC-generators $z_1^\alpha,\dots,z_n^\alpha,z_1^{*\alpha},\dots,z_n^{*\alpha}$. As we did in (5.10), joining a new element $\tilde\tau^\alpha$ such that

$$[\tilde\tau^\alpha, z_i^\alpha] = \nu z_i^\alpha, \qquad [\tilde\tau^\alpha, z_i^{*\alpha}] = \nu z_i^{*\alpha}, \qquad [\tilde\tau^\alpha,\nu] = 2\nu^2, \tag{6.5}$$

we construct a family of contact Lie algebras $\{\mathfrak{g}^{\mathbb{C}}_{U_\alpha}\}_\alpha$. Since the $\nu$-isomorphism class of $\mathfrak{g}^{\mathbb{C}}_{U_\alpha}$ depends only on $W^{\mathbb{C}}_{U_\alpha}$, we see that the Poincaré–Cartan cocycle of $\{\mathfrak{g}^{\mathbb{C}}_{U_\alpha}\}_\alpha$ gives the class $c^{(0)}(W_M)$ in the coefficient $\mathbb{C}$, which is assumed to be integral.


229

α

Proposition 6.2. For every Vα there is an element Hα ∗eiτ˜ ∈ A(Vα ) such that [zi , Hα ∗ α eiτ˜ ] = 0. Moreover, if f ∈ A(Vα ) satisfies [ziα , f ] = 0 for every i, 1 ≤ i ≤ n, then there α is a holomorphic function h(z) such that f can be written in the form h(z) ∗ Hα ∗ eiτ˜ . Proof. We need only to show the first assertion. The inverse Moyal product formula (1.2) for QCC generators z1α , · · · , znα , z1∗α , · · · , zn∗α gives a commutative product ◦. Using the product ◦, the ∗-product is given by the Moyal P product formula (1.1). Take a function H(t) and put Hα = H( ziα ◦zi∗α ). We consider the system of equaα tions [ziα , H ∗ eiτ˜ ] = 0. By the Moyal product formula for the above QCC generators, this equals α α (6.6) (H 0 ◦ ziα ) ∗ eiτ˜ + H ∗ [ziα , eiτ˜ ] = 0. α

α

Since eiτ˜ ∗ ziα = ziα ∗ ei(τ˜ +ν) , (6.6) is reduced by using Lemma 6.1 to a differential equation ν 1 1 d (1 + √ ) H(t) + (1 − √ )H(t) = 0. (6.7) 2 1 − 2iν dt 1 − 2iν Equation (6.7) can be solved in C ∞ (R)[[ν]] and we have H ∈ C ∞ (Vα )[[ν]].

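By Proposition 6.2, we see that for every $V_\alpha\cap V_\beta \neq \emptyset$ there are holomorphic functions $h_{\alpha\beta}$ such that

$$\Phi^{c*}_{\alpha\beta}(H_\beta * e^{i\tilde\tau^\beta}) = h_{\alpha\beta} * H_\alpha * e^{i\tilde\tau^\alpha}. \tag{6.8}$$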

Thus, the quantum line bundle $\tilde L_M$ of $L_M$ is a holomorphic line bundle over $M$ and $\mathcal{A}(M)$ can be viewed as the algebra of sections of $\bigoplus_{m\in\mathbb{Z}}\tilde L^m_M$.

Let $H(M)[[\nu]]$ be the commutative algebra consisting of all holomorphic sections of $\bigoplus_{m\ge0}\tilde L^m_M$. If $H(M)[[\nu]] \neq \{0\}$, $H(M)[[\nu]]$ can be viewed as a representation space of Weyl functions on $M$. As pointed out in [CGR], the multiplication operator combined with the projection to the space of all holomorphic sections is the essence of the Berezin representation [Be], which coincides with the representation produced by geometric quantization with respect to the Kähler polarization mentioned above.

References

[BCG] Bertelson, M., Cahen, M. and Gutt, S.: Equivalence of star-products. To appear in Commun. Math. Phys.
[Be] Berezin, F.A.: General concept of quantization. Commun. Math. Phys. 8, 153–174 (1975)
[BFL] Bayen, F., Flato, M., Fronsdal, C., Lichnerowicz, A. and Sternheimer, D.: Deformation theory and quantization I. Ann. of Physics 111, 61–110 (1978)
[CGR] Cahen, M., Gutt, S. and Rawnsley, J.: Quantization of Kähler manifolds, II. Trans. Amer. Math. Soc. 337, 73–98 (1993)
[D] Deligne, D.: Déformation de l’Algèbre des Fonctions d’une Variété Symplectique: Comparison entre Fedosov et De Wilde, Lecomte. Selecta Math. New Series 1, 667–697 (1995)
[DL] De Wilde, M. and Lecomte, P.B.: Existence of star-products and of formal deformations of the Poisson Lie algebra of arbitrary symplectic manifolds. Lett. Math. Phys. 7, 487–496 (1983)
[F] Fedosov, B.: Deformation quantization and index theory, Mathematical topics. 9, Basel–Boston: Birkhäuser, 1996
[Ka] Karabegov, A.V.: Deformation quantization with separation of variables on a Kähler manifold. Commun. Math. Phys. 180, 745–755 (1996)
[KM] Karasev, M. and Maslov, V.: Asymptotic and geometric quantization. Russian Math. Surveys 39, no. 6, 133–206 (1984)
[KN] Kobayashi, S. and Nomizu, K.: Foundations of differential geometry II. New York: Wiley, 1969
[KNz] Karasev, M. and Nazaikinskii, V.: On the quantization of rapidly oscillating symbols. Math. USSR Izv. 34, 737–764 (1978)
[NT] Nest, R. and Tsygan, B.: Algebraic index theorem for families. Adv. Math. 113, 151–205 (1995)
[L] Lichnerowicz, A.: Déformations d’algèbres associées à une variété symplectique (les $*_\nu$-produits). Ann. Inst. Fourier, Grenoble 32, 157–209 (1982)
[O] Omori, H.: Infinite dimensional Lie groups. AMS Translation Monograph 158, Providence, RI: Am. Math. Soc., 1997
[OMY1] Omori, H., Maeda, Y. and Yoshioka, A.: Weyl manifolds and deformation quantization. Adv. Math. 85, 224–255 (1991)
[OMY2] Omori, H., Maeda, Y. and Yoshioka, A.: Deformation quantization of Poisson algebras. Contemp. Math. 179, 213–240 (1994)
[OMY3] Omori, H., Maeda, Y. and Yoshioka, A.: Existence of a closed star product. Lett. Math. Phys. 26, 285–294 (1993)
[OMY4] Omori, H., Maeda, Y. and Yoshioka, A.: A Poincaré–Birkhoff–Witt theorem for infinite dimensional Lie algebras. J. Math. Soc. Japan 46, 25–50 (1994)
[OMMY] Omori, H., Maeda, Y., Miyazaki, N. and Yoshioka, A.: Noncommutative 3-sphere: A model of noncommutative contact algebras. To appear in J. Math. Soc. Japan
[V] Vey, J.: Déformations du crochet de Poisson d’une variété symplectique. Comment. Math. Helv. 50, 421–454 (1975)
[W] Weinstein, A.: Deformation quantization, Séminaire Bourbaki, 46ème année. Astérisque 227, 789, 389–409 (1995)
[X] Xu, P.: Fedosov $*$-products and quantum moment maps. To appear

Communicated by H. Araki

Commun. Math. Phys. 194, 231 – 248 (1998)



Stochastic Burgers’ Equations and Their Semi-Classical Expansions

A. Truman$^1$, H. Z. Zhao$^{1,2}$

$^1$ Department of Mathematics, University of Wales Swansea, Singleton Park, Swansea SA2 8PP, UK. E-mail: [email protected]
$^2$ Department of Mathematics, University of California, Irvine, CA 92697, USA. E-mail: [email protected]

Received: 12 May 1997 / Accepted: 23 February 1998

Abstract: In this paper we use the Hopf-Cole logarithmic transformation and the stochastic Hamilton Jacobi theory to study stochastic heat equations and Burgers’ equations. Before the caustics, the stochastic inviscid Burgers’ equation gives the first term of our semi-classical expansion, i.e. the inviscid limit of the viscous stochastic Burgers’ equation. In order to push our results beyond the inviscid limit, we construct solutions for iterated (stochastic) Hamilton Jacobi continuity equations. Then we give the semiclassical asymptotic expansions for stochastic heat equations and Burgers’ equations by using Nelson’s stochastic mechanical processes with drifts given by the solution of the iterated (stochastic) Hamilton Jacobi continuity equations. The explicit formula for the remainder term is given by a path integral.

1. Introduction and Results

It is well known that the Hopf-Cole logarithmic transformation gives a powerful technique in the study of Burgers’ equations (Hopf (1950)). Using this transformation, the semi-classical asymptotic analysis of Elworthy and Truman for the heat equation has led us to discover new inviscid (semi-classical) asymptotic expansions for the stochastic Burgers’ equations in Truman and Zhao (1996b). On the other hand, the Hopf-Cole logarithmic transformation has proved to be a powerful technique in the study of heat equations and reaction diffusion equations (Li and Zhao (1996), Zhao (1997)). The idea of the stochastic Hamilton Jacobi theory goes back to the well-known WKB method for the asymptotics of the eigenfunctions of the Schrödinger operator. In the 1960’s, Maslov applied the semi-classical Hamilton Jacobi theory and his canonical operator method to obtain the asymptotic expansions of the solutions to the Schrödinger equation and heat equations (Maslov and Fedoryuk (1981)). Truman (1977) gave a simple proof that quantum mechanics tends to classical mechanics as $\hbar \to 0$ and used the Girsanov-Cameron-Martin theorem to obtain analogous results for the diffusion


equation. A more detailed connection between classical mechanics and the diffusion equation was obtained by Elworthy and Truman (1982) using the Brownian bridge process. Recent developments include the applications to nonlinear problems such as approximate travelling wave solutions of the generalized KPP equations in Zhao and Elworthy (1992) and Elworthy, Truman, Zhao and Gaines (1994) and a new discovery of how to formulate and solve a stochastic Hamilton Jacobi equation and continuity equation with random vector and scalar potentials by Truman and Zhao. We have now generalized Hamilton Jacobi theory to include the stochastic heat equations, Burgers’ equations and stochastic Schrödinger equations in Truman and Zhao (1996a-c). These equations arise in e.g. nonlinear filtering problems such as the Zakai equation and the dynamics of an interface subject to a random external force such as the KPZ model and a quantum mechanical particle in a random electromagnetic field and quantum filtering problems. In recent years, stochastic Burgers’ equations have attracted the attention of mathematicians, e.g. Sinai (1991), Bertini, Cancrini and Jona-Lasinio (1994), Bertini and Cancrini (1995), Albeverio, Molchanov and Surgailis (1994), Holden, Lindstrom, Oksendal, Uboe and Zhang (1994,1995), Oksendal (1994), Truman and Zhao (1996b), to name but a few. Using a sample path of a stochastic mechanics, Truman and Zhao have constructed the solution of the stochastic Hamilton Jacobi equation following a classical sample path. For a continuous distribution of trajectories with associated density in configuration space, the leading term is given by the continuity equation. In this paper we generalize the results in Truman and Zhao (1996b) giving exact semiclassical expansions up to arbitrarily high order in powers of the viscosity µ. For this, iterated continuity equations and iterated Hamilton-Jacobi-continuity equations are introduced and their solutions are constructed. 
It is intrinsic to study the iterated Hamilton-Jacobi-continuity equations which are derived formally in the following way: Consider a stochastic viscous Burgers’ equation

$$dv^\mu + (\nabla v^\mu)v^\mu\,dt + (\nabla c)\,dt + (\nabla k)\,dw(t) = \frac12\mu^2\Delta v^\mu\,dt.$$

Here $d$ is the differential with respect to time $t$, $\nabla$ and $\Delta$ are the gradient and the Laplacian with respect to space variables $x \in \mathbb{R}^n$, $w(t)$ is a one dimensional Brownian motion on a probability space $(\Omega,\mathcal{F},P)$. Conditions on $c$ and $k$ and initial conditions are discussed later. Let $v^\mu = \nabla S^\mu$ with appropriate initial condition for $S^\mu$. Then $S^\mu$ satisfies the following stochastic Hamilton-Jacobi-Bellman equation:

$$dS^\mu + \frac12|\nabla S^\mu|^2\,dt + c\,dt + k\,dw(t) = \frac12\mu^2\Delta S^\mu\,dt.$$

But formally if $S^\mu \sim \sum_{j=0}^\infty \mu^{2j}S_j$, then it turns out that

$$\sum_{j=0}^\infty \mu^{2j}\,dS_j + \frac12\Bigl|\sum_{j=0}^\infty \mu^{2j}\nabla S_j\Bigr|^2 dt + c\,dt + k\,dw(t) \sim \frac12\mu^2\sum_{j=0}^\infty \mu^{2j}\Delta S_j\,dt.$$

Comparing coefficients of $\mu^{2j}$ for $j = 0, 1, 2, \dots$, it is not difficult to find

$$\frac{\partial}{\partial t}S_j + \frac12\sum_{i_1,i_2\ge0,\ i_1+i_2=j}\nabla S_{i_1}\nabla S_{i_2} = \frac12\Delta S_{j-1},$$


with the convention $\frac12\Delta S_{-1}(x,t) = -c(x,t) - k(x,t)\dot w_t$. We call these equations (stochastic) Hamilton Jacobi continuity equations. These equations are apparently related to the semiclassical limits of the quantum mechanics (Simon (1979) and Truman (1977)), to the Madelung fluid (see e.g. Guerra (1981)), and to the quantum tunneling problem (Jona-Lasinio, Martinelli and Scoppola (1981)). It seems that similar equations are also related to the quantum field theory (Glimm and Jaffe (1981)). More details will be discussed in our later publications.

The main result of this paper is to construct the solutions of the iterated Hamilton-Jacobi-continuity equations above. The solutions are given by the solutions of the iterated continuity equations. These first $m$ solutions give the first $m$ terms in the asymptotics of the solution for the stochastic heat equations and stochastic Burgers’ equations. Furthermore the remainder term is given by the expectation of an exponential involving a Nelson’s stochastic mechanical process (Nelson (1985)). Our results and methods for the remainder term although related to Watling (1992) are different in this important respect as well as being stochastic in that they are applicable to systems with noise. We expect this new approach can be used in the study of the small time asymptotics of a heat kernel on a Riemannian manifold which has been studied by Elworthy (1989). Readers who are interested in deterministic equations can assume $k \equiv 0$ while reading this paper.

Assume that $c \in C^2(\mathbb{R}^n\times\mathbb{R}^1)$, $k \in C^2(\mathbb{R}^n\times\mathbb{R}^1)$ and $S(-,0) \in C^2(\mathbb{R}^n)$. Consider a stochastic classical mechanics

$$\begin{cases} d\dot\Phi_s = -\nabla c(\Phi_s,s)\,ds - \nabla k(\Phi_s,s)\,dw_s,\\ \Phi_0(x) = x, \quad \dot\Phi_0(x) = \nabla S(x,0). \end{cases} \tag{1.1}$$

Here $w_s$ is a one dimensional Brownian motion on probability space $(\Omega,\mathcal{F},P)$. For each $x$, we have a solution $\Phi_s(x)$. Therefore we have a random map $\Phi_s : \Omega\times\mathbb{R}^n \to \mathbb{R}^n$ for each $s$. We assume a no caustic condition: there exists a $T(\omega) > 0$ a.s. such that for $0 \le t \le T(\omega)$, $\Phi_t(\omega) : \mathbb{R}^n \to \mathbb{R}^n$ is a diffeomorphism for a.e. $\omega \in \Omega$. Such $T(\omega) > 0$ exists if we have some control on $c$, $k$ and $S_0$. We have proved in Truman and Zhao (1996b) that if $\nabla c$, $\nabla^2 c$, $\nabla k$, $\nabla^2 k$, $\nabla S(-,0)$ and $\nabla^2 S(-,0)$ are all bounded, then there exists a $T(\omega) > 0$ a.s. such that if $0 \le s \le T$, $\Phi_s(\omega) : \mathbb{R}^n \to \mathbb{R}^n$ is a diffeomorphism for a.e. $\omega \in \Omega$. Define $\tilde S_0 : [0,+\infty)\times\mathbb{R}^n \to \mathbb{R}$ by the following non-anticipating Itô’s stochastic integral

$$\tilde S_0(y,t) = \frac12\int_0^t \bigl|\dot\Phi_s(y)\bigr|^2 ds + S(y,0) - \int_0^t c(\Phi_s y,s)\,ds - \int_0^t k(\Phi_s y,s)\,dw_s.$$

For $0 \le t \le T(\omega)$, define $S_0(x,t)$ for a.e. $\omega \in \Omega$ by

$$S_0(x,t) = \tilde S_0(\Phi_t^{-1}x,t). \tag{1.2}$$

Recall the following theorem in Truman and Zhao (1996a-b):

Theorem 1.1 (Stochastic Hamilton Jacobi equation and continuity equation). Assume that $c \in C^2(\mathbb{R}^n \times \mathbb{R}^1)$, $k \in C^2(\mathbb{R}^n \times \mathbb{R}^1)$ and $S(-,0) \in C^2(\mathbb{R}^n)$, $\Phi_s$ defined by (1.1) satisfies a no-caustic condition for $0 \le t \le T(\omega)$ and $S_0$ is defined by (1.2).

(i) For a.e. $\omega \in \Omega$ and $0 \le t \le T(\omega)$,
\[
\dot{\Phi}_t = \nabla S_0(\Phi_t, t), \tag{1.3}
\]
and $S_0(x,t)$ satisfies the following stochastic Hamilton Jacobi equation
\[
dS_0(x,t) + \frac{1}{2}\,|\nabla S_0(x,t)|^2\,dt + c(x,t)\,dt + k(x,t)\,dw_t = 0 . \tag{1.4}
\]
(ii) Define $\phi(x,t) = \det\!\left(\dfrac{\partial \Phi_t^{-1}x}{\partial x}\right)$. Then for a.e. $\omega \in \Omega$, any $x \in \mathbb{R}^n$ and $0 \le t \le T(\omega)$, $\phi(x,t)$ satisfies the following continuity equation
\[
\frac{\partial}{\partial t}\phi(x,t) + \operatorname{div}\{\phi(x,t)\nabla S_0(x,t)\} = 0 . \tag{1.5}
\]

Suppose $T_0 : \mathbb{R}^n \to \mathbb{R}$ is positive and $C^\infty$. The function $T_0(x)$ is associated with the initial condition of the stochastic diffusion Eq. (1.13). Define for the random map $\Phi_s(\omega) : \mathbb{R}^n \to \mathbb{R}^n$, $T_0(y,t) = T_0(y)$ and for $j = 1, 2, \dots$

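\[
T_j(y,t) = \int_0^t \Big[\phi^{-\frac{1}{2}}(\cdot,s)\,\Delta\Big(\phi^{\frac{1}{2}}(\cdot,s)\,T_{j-1}(\Phi_s^{-1}\cdot, s)\Big)\Big]\big(\Phi_s(y)\big)\,ds
\]
and for $j = 0, 1, 2, \dots$
\[
\psi_j(x,t) = T_j(\Phi_t^{-1}x, t)\,\sqrt{\phi_t(x)} . \tag{1.6}
\]
Then we have the following iterated continuity equations: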

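Lemma 1.2 (Iterated continuity equations). For a.e. $\omega \in \Omega$, $\psi_j(x,t)$ defined by (1.6) satisfy the following iterated continuity equations for $0 \le t \le T(\omega)$:
\[
\frac{\partial}{\partial t}\psi_j + \nabla\psi_j \cdot \nabla S_0 = -\frac{1}{2}\,\psi_j\,\Delta S_0 + \Delta(\psi_{j-1}), \qquad j = 0, 1, 2, \dots \tag{1.7}
\]
with the convention $\psi_{-1} \equiv 0$.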

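Therefore, if we consider the following linear combination of $\psi_j$, we have
\[
\begin{aligned}
&\frac{\partial}{\partial t}\Big(\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x,t)\Big) + \nabla\Big(\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x,t)\Big)\cdot\nabla S_0(x,t) \\
&\qquad = -\frac{1}{2}\Big(\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x,t)\Big)\Delta S_0(x,t) + \frac{1}{2}\mu^2\,\Delta\Big(\sum_{j=0}^{m-1}\frac{1}{2^j}\mu^{2j}\psi_j(x,t)\Big),
\end{aligned}
\]
which immediately implies
\[
\begin{aligned}
&\frac{\partial}{\partial t}\log\Big(\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x,t)\Big) + \nabla\log\Big(\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x,t)\Big)\cdot\nabla S_0(x,t) \\
&\qquad = -\frac{1}{2}\Delta S_0(x,t) + \frac{1}{2}\mu^2\,\frac{\sum_{j=0}^{m-1}\frac{1}{2^j}\mu^{2j}\Delta\psi_j(x,t)}{\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x,t)} .
\end{aligned}
\tag{1.8}
\]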
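The two steps leading to (1.8) can be spelled out as follows. This is only a sketch of the bookkeeping: the abbreviation $\Psi_m$ is introduced here for illustration and is not notation used in the paper.

```latex
% Abbreviation (an assumption of this sketch, not in the source):
%   \Psi_m := \sum_{j=0}^{m} 2^{-j}\mu^{2j}\psi_j .
\begin{align*}
\sum_{j=0}^{m} \frac{\mu^{2j}}{2^{j}}\,\Delta\psi_{j-1}
  &= \sum_{j=1}^{m} \frac{\mu^{2j}}{2^{j}}\,\Delta\psi_{j-1}
     && \text{(since } \psi_{-1}\equiv 0\text{)} \\
  &= \sum_{i=0}^{m-1} \frac{\mu^{2i+2}}{2^{i+1}}\,\Delta\psi_{i}
   = \frac{\mu^{2}}{2}\,\Delta\Psi_{m-1}
     && \text{(shift } i=j-1\text{)}, \\[4pt]
\Big(\frac{\partial}{\partial t} + \nabla S_0\cdot\nabla\Big)\log\Psi_m
  &= \frac{1}{\Psi_m}\Big(\frac{\partial}{\partial t} + \nabla S_0\cdot\nabla\Big)\Psi_m
   = -\frac{1}{2}\,\Delta S_0 + \frac{\mu^{2}}{2}\,\frac{\Delta\Psi_{m-1}}{\Psi_m}.
\end{align*}
```

The first display is the index shift that produces the factor $\frac{1}{2}\mu^2$ and lowers the upper summation limit to $m-1$; the second is division by $\Psi_m$, which is legitimate wherever $\Psi_m \neq 0$.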

In the next theorem we give the solutions of the iterated Hamilton-Jacobi-continuity equations. The classical Hamilton Jacobi equation is the first one among these equations.



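Theorem 1.3. Assume that $c \in C^2(\mathbb{R}^n \times \mathbb{R}^1)$, $k \in C^2(\mathbb{R}^n \times \mathbb{R}^1)$ and $S(-,0) \in C^2(\mathbb{R}^n)$, $\Phi_s$ defined by (1.1) satisfies a no-caustic condition for $0 \le t \le T(\omega)$. Then for a.e. $\omega \in \Omega$, the solutions of the following Hamilton Jacobi continuity equations:
\[
\frac{\partial}{\partial t}S_j + \frac{1}{2}\sum_{\substack{i_1,i_2 \ge 0 \\ i_1+i_2=j}} \nabla S_{i_1}\nabla S_{i_2} = \frac{1}{2}\Delta S_{j-1}, \qquad j \ge 0, \tag{1.9}
\]
with the convention $\frac{1}{2}\Delta S_{-1}(x,t) = -c(x,t) - k(x,t)\dot{w}_t$, for $0 \le t \le T(\omega)$ are given by $S_0$ defined by (1.2), $S_1(x,t) = -\log\psi_0(x,t)$ and for $j \ge 2$,
\[
S_j(x,t) = \frac{1}{2^{j-1}}\Bigg( -\frac{\psi_{j-1}}{\psi_0}
 + \frac{\sum_{i_1,i_2 \ge 1,\, i_1+i_2=j-1}\psi_{i_1}\psi_{i_2}}{2\psi_0^2}
 - \frac{\sum_{i_1,i_2,i_3 \ge 1,\, i_1+i_2+i_3=j-1}\psi_{i_1}\psi_{i_2}\psi_{i_3}}{3\psi_0^3}
 + \cdots + (-1)^{j-1}\frac{\psi_1^{\,j-1}}{(j-1)\psi_0^{\,j-1}} \Bigg)(x,t). \tag{1.10}
\]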
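One way to read the closed form (1.10) is as a power series coefficient: for $j \ge 2$, $S_j$ is the coefficient of $\mu^{2(j-1)}$ in $-\log\sum_{j}2^{-j}\mu^{2j}\psi_j$. The paper does not state this in these words; it is an observation consistent with $S_1 = -\log\psi_0$. The sketch below checks it with exact rational arithmetic at a single point $(x,t)$. The sample values of $\psi_0,\dots,\psi_4$ and the helper names `log_series` and `s_closed_form` are illustrative assumptions, not part of the source.

```python
from fractions import Fraction
from itertools import product


def log_series(f, order):
    """Return g with log(f(e) / f[0]) = sum_{k>=1} g[k] e^k, up to e^order.

    Uses the recurrence obtained from f * g' = f', where f(e) = sum_k f[k] e^k.
    """
    g = [Fraction(0)] * (order + 1)
    for k in range(1, order + 1):
        acc = k * f[k]
        for i in range(1, k):
            acc -= i * g[i] * f[k - i]
        g[k] = acc / (k * f[0])
    return g


def s_closed_form(psi, j):
    """S_j for j >= 2 from the closed form (1.10), at one fixed point (x, t)."""
    total = Fraction(0)
    for n in range(1, j):  # n = number of factors psi_{i_1} ... psi_{i_n}
        s = Fraction(0)
        for idx in product(range(1, j), repeat=n):
            if sum(idx) == j - 1:  # indices >= 1 summing to j - 1
                term = Fraction(1)
                for i in idx:
                    term *= psi[i]
                s += term
        total += Fraction((-1) ** n, n) * s / psi[0] ** n
    return total / 2 ** (j - 1)


# Arbitrary rational sample values standing in for psi_0..psi_4 (hypothetical data).
psi = [Fraction(3), Fraction(1, 2), Fraction(-2), Fraction(5, 7), Fraction(1)]
m = len(psi) - 1
f = [psi[k] / 2 ** k for k in range(m + 1)]  # coefficients in e = mu^2
g = log_series(f, m)
checks = [s_closed_form(psi, j) == -g[j - 1] for j in range(2, m + 2)]
```

Exact fractions are used so that the comparison is an identity check rather than a floating-point approximation.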

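It follows from (1.9) that
\[
\frac{\partial}{\partial t}\Big(\sum_{j=0}^{m}\mu^{2j}S_j\Big) + \frac{1}{2}\sum_{j=0}^{m}\mu^{2j}\Big(\sum_{\substack{i_1,i_2 \ge 0 \\ i_1+i_2=j}}\nabla S_{i_1}\nabla S_{i_2}\Big) + c + k\,\dot{w}(t)
 = \frac{1}{2}\mu^2\,\Delta\Big(\sum_{j=0}^{m-1}\mu^{2j}S_j\Big). \tag{1.11}
\]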

First we consider a stochastic process $x^\mu_s$ on a probability space $(\hat{\Omega}, \hat{\mathcal{F}}, \hat{P})$ defined by the following stochastic differential equation:
\[
\begin{cases}
dx^\mu_s = \mu\,dB_s - \nabla S_0(x^\mu_s, t-s)\,ds + \mu^2\,\nabla\log\displaystyle\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)\,ds, \\
x^\mu_0 = x .
\end{cases}
\tag{1.12}
\]
Here $B_s$ is a standard Brownian motion on $\mathbb{R}^n$ on the probability space $(\hat{\Omega}, \hat{\mathcal{F}}, \hat{P})$. Note that the process $x_s$ is also on the probability space $(\Omega, \mathcal{F}, P)$. Therefore, as in Truman and Zhao (1996a-b), in order to make the stochastic integral $\int_0^t k(t-s, x_s)\,dw_{t-s}$ well defined in the Itô sense, we denote $w^*_s = w_{t-s}$ and let $\mathcal{F}^*_s$ be the enlargement of the filtration $\{\mathcal{F}^0_s\}$, where $\mathcal{F}^0_s = \sigma(w^*_r : r \le s)$. Then $w_{t-s} = w^*_s$ is $\mathcal{F}^*_s$ measurable and $\mathcal{F}^*_{s_1} \subset \mathcal{F}^*_{s_2}$ if $s_1 \le s_2$. We are going to use the process $x_s$, if it is $\mathcal{F}^*_s$ measurable and non-explosive, to construct the solution to the following stochastic diffusion equation of the Stratonovich type:
\[
\begin{cases}
du^\mu_t(x) = \Big[\dfrac{1}{2}\mu^2\Delta u^\mu_t(x) + \dfrac{1}{\mu^2}c(x,t)\,u^\mu_t(x)\Big]dt + \dfrac{k(x,t)}{\mu^2}\,u^\mu_t(x)\circ dw_t, \\
u^\mu_0(x) = T_0(x)\,e^{-S(x,0)/\mu^2} .
\end{cases}
\tag{1.13}
\]


Theorem 1.4. Assume that $c \in C^2(\mathbb{R}^n \times \mathbb{R}^1)$, $k \in C^2(\mathbb{R}^n \times \mathbb{R}^1)$ and $S(-,0) \in C^2(\mathbb{R}^n)$, $\Phi_s$ defined by (1.1) satisfies a no-caustic condition for $0 \le t \le T(\omega)$, $\psi_j(x,t)$ are defined by (1.6) and the stochastic process defined by (1.12) is $\mathcal{F}^*_s$ measurable and non-explosive. Then for a.e. $\omega \in \Omega$, $0 \le t \le T(\omega)$, the solution of the heat Eq. (1.13) is given by
\[
u^\mu_t(x) = \exp\Big\{-\frac{S_0(x,t)}{\mu^2}\Big\}\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x,t)
 \times \hat{E}\exp\Bigg\{\frac{1}{2^{m+1}}\mu^{2(m+1)}\int_0^t \frac{\Delta\big(\psi_m(x^\mu_s, t-s)\big)}{\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)}\,ds\Bigg\}. \tag{1.14}
\]

Formula (1.14) looks neat for the stochastic heat Eq. (1.13). But for the stochastic Burgers’ equations the formula we can obtain by applying a logarithmic transformation to (1.14) is complicated and the remainder term is not explicit (it is given by a series whose convergence certainly needs a serious discussion). To avoid this we can use an alternative stochastic process which is slightly different from (1.12): namely the following Nelson’s stochastic process defined by
\[
\begin{cases}
dy^\mu_s = \mu\,dB_s - \displaystyle\sum_{j=0}^{m}\mu^{2j}\nabla S_j(y^\mu_s, t-s)\,ds, \\
y^\mu_0 = x .
\end{cases}
\tag{1.15}
\]
Here $B_s$ is a standard Brownian motion on $\mathbb{R}^n$ on the probability space $(\hat{\Omega}, \hat{\mathcal{F}}, \hat{P})$. Then we can prove the following theorem:

Theorem 1.5. Assume that $c \in C^2(\mathbb{R}^n \times \mathbb{R}^1)$, $k \in C^2(\mathbb{R}^n \times \mathbb{R}^1)$ and $S(-,0) \in C^2(\mathbb{R}^n)$, $\Phi_s$ defined by (1.1) satisfies a no-caustic condition for $0 \le t \le T(\omega)$, $S_j(x,t)$ are defined by (1.10). Suppose that the stochastic process $y_s$ defined by (1.15) is $\mathcal{F}^*_s$ measurable and non-explosive. Then the solution of the heat Eq. (1.13) is given by
\[
\begin{aligned}
u^\mu_t(x) = {}& \exp\Big\{-\frac{1}{\mu^2}\sum_{j=0}^{m}\mu^{2j}S_j(x,t)\Big\}\;\hat{E}\exp\Bigg\{-\frac{1}{2}\mu^{2m}\int_0^t \Delta S_m(y^\mu_s, t-s)\,ds \\
& + \frac{1}{2}\sum_{j=m+1}^{2m}\mu^{2(j-1)}\sum_{\substack{0 \le i_1,i_2 \le m \\ i_1+i_2=j}}\int_0^t (\nabla S_{i_1}\nabla S_{i_2})(y^\mu_s, t-s)\,ds\Bigg\}.
\end{aligned}
\tag{1.16}
\]

In Eq. (1.13), consider the special case $T_0 \equiv 1$. The Hopf–Cole logarithmic transformation $v^\mu(x,t) = -\mu^2\nabla\log u^\mu_t(x)$ then gives the solution of the following stochastic Burgers’ equation:
\[
\begin{cases}
dv^\mu_t(x) + \big(\nabla v^\mu_t(x)\cdot v^\mu_t(x) + \nabla c(x,t)\big)\,dt + \nabla k(x,t)\,dw_t = \dfrac{1}{2}\mu^2\Delta v^\mu_t(x)\,dt, \\
v^\mu_0(x) = \nabla S_0(x) = v_0(x) .
\end{cases}
\tag{1.17}
\]
Denote $v_j(x,t) = \nabla S_j(x,t)$. It is easy to see that $v_j(x,t)$ satisfy the following iterated Burgers’ equation for $j = 0, 1, 2, \dots$:
\[
\frac{\partial}{\partial t}v_j + \sum_{\substack{i_1,i_2 \ge 0 \\ i_1+i_2=j}}(\nabla v_{i_1})v_{i_2} = \frac{1}{2}\Delta v_{j-1} \tag{1.18}
\]
with the convention $\frac{1}{2}\Delta v_{-1} = -\nabla c - \nabla k\,\dot{w}(t)$ and initial condition $v_0(x,0) = \nabla S_0(x)$, $v_j(x,0) = 0$ for $j = 1, 2, \dots$. Then the following theorem is a corollary of Theorem 1.5 and the Hopf–Cole transformation.

Theorem 1.6. Assume that $c \in C^2(\mathbb{R}^n \times \mathbb{R}^1)$, $k \in C^2(\mathbb{R}^n \times \mathbb{R}^1)$ and $S(-,0) \in C^2(\mathbb{R}^n)$, $\Phi_s$ defined by (1.1) satisfies a no-caustic condition for $0 \le t \le T(\omega)$ and $S_j(x,t)$ are defined by (1.10). Suppose that the stochastic process $y_s$ defined by (1.15) is $\mathcal{F}^*_s$ measurable and non-explosive. The solution of the stochastic Burgers’ Eq. (1.17) is given by
\[
\begin{aligned}
v^\mu(x,t) = {}& \sum_{j=0}^{m}\mu^{2j}v_j(x,t) - \mu^2\,\nabla\log\hat{E}\exp\Bigg\{-\frac{1}{2}\mu^{2m}\int_0^t \operatorname{div} v_m(y^\mu_s, t-s)\,ds \\
& + \frac{1}{2}\sum_{j=m+1}^{2m}\mu^{2(j-1)}\sum_{\substack{0 \le i_1,i_2 \le m \\ i_1+i_2=j}}\int_0^t (v_{i_1}v_{i_2})(y^\mu_s, t-s)\,ds\Bigg\}.
\end{aligned}
\tag{1.19}
\]

Remark. We have solved the viscous stochastic Burgers’ equation up to arbitrarily high order in the viscosity $\mu^2$. That the first term in the asymptotic expansion is the solution of the inviscid Burgers’ equation is a well-known result of Hopf (1950). It is clear from the above formula that the iterated Burgers’ equations give the higher order terms of the asymptotics which arise beyond the inviscid limit. The explicit formula for the remainder term is given by the logarithmic derivative of a path integral.
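The Hopf–Cole transformation used above can be sanity-checked numerically in the simplest setting. The sketch below assumes the deterministic case $c \equiv 0$, $k \equiv 0$ in one space dimension and picks one explicit solution of the heat equation $u_t = \frac{1}{2}\mu^2 u_{xx}$; the parameter values `MU`, `A` and all function names are illustrative choices, not taken from the paper. It then evaluates the residual of the viscous Burgers’ equation $v_t + v\,v_x = \frac{1}{2}\mu^2 v_{xx}$ for $v = -\mu^2\,\partial_x \log u$ by central finite differences.

```python
import math

MU = 0.7  # viscosity parameter mu (illustrative value)
A = 1.3   # exponent of the sample heat solution (illustrative value)


def theta(x, t):
    """Phase of the exponential solution exp(A x + mu^2 A^2 t / 2)."""
    return A * x + 0.5 * MU ** 2 * A ** 2 * t


def u(x, t):
    """Exact solution of u_t = (mu^2 / 2) u_xx: u = 1 + exp(theta)."""
    return 1.0 + math.exp(theta(x, t))


def u_x(x, t):
    """Exact spatial derivative of u."""
    return A * math.exp(theta(x, t))


def v(x, t):
    """Hopf-Cole transform v = -mu^2 d/dx log u = -mu^2 u_x / u."""
    return -MU ** 2 * u_x(x, t) / u(x, t)


def burgers_residual(x, t, h=1e-3):
    """Central-difference residual of v_t + v v_x - (mu^2 / 2) v_xx."""
    v_t = (v(x, t + h) - v(x, t - h)) / (2 * h)
    v_x = (v(x + h, t) - v(x - h, t)) / (2 * h)
    v_xx = (v(x + h, t) - 2 * v(x, t) + v(x - h, t)) / h ** 2
    return v_t + v(x, t) * v_x - 0.5 * MU ** 2 * v_xx


residual = burgers_residual(0.4, 0.2)
```

The residual is not exactly zero because of the finite-difference truncation error, which is of order `h**2`; it should shrink as `h` is reduced until rounding error takes over.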

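2. Proof of Iterated Continuity Equations and Hamilton Jacobi Continuity Equations

In this section we give the proofs of Lemma 1.2 and Theorem 1.3.

Proof of Lemma 1.2. Differentiating the identity $\Phi_t(\Phi_t^{-1}x) = x$ with respect to $x$ and $t$ implies (denote $y = \Phi_t^{-1}x$)
\[
\nabla_y\Phi_t(\Phi_t^{-1}x)\,\nabla\Phi_t^{-1}x = I \tag{2.1}
\]
and
\[
\dot{\Phi}_t(\Phi_t^{-1}x) + \nabla_y\Phi_t(\Phi_t^{-1}x)\,(\dot{\Phi}_t^{-1}x) = 0 . \tag{2.2}
\]
Multiplying both sides of Eq. (2.2) by $\nabla\Phi_t^{-1}x$ and using the identities (2.1) and (1.3) we obtain
\[
\dot{\Phi}_t^{-1}x = -\nabla\Phi_t^{-1}x\,\nabla S_0(x,t) . \tag{2.3}
\]
It turns out from the definitions of $\psi_j$ and $T_j$ that
\[
\begin{aligned}
&\frac{\partial}{\partial t}\psi_j(x,t) + \nabla\psi_j(x,t)\cdot\nabla S_0(x,t) \\
&\quad = \nabla_y T_j(\Phi_t^{-1}x,t)\cdot(\dot{\Phi}_t^{-1}x)\,\phi_t^{\frac{1}{2}}(x) + \frac{\partial}{\partial t}T_j(\Phi_t^{-1}x,t)\,\phi_t^{\frac{1}{2}}(x) + T_j(\Phi_t^{-1}x,t)\,\frac{\partial}{\partial t}\phi_t^{\frac{1}{2}}(x) \\
&\qquad + \big(\nabla\Phi_t^{-1}x\big)^{*}\nabla_y T_j(\Phi_t^{-1}x,t)\,\phi_t^{\frac{1}{2}}(x)\cdot\nabla S_0(x,t) + T_j(\Phi_t^{-1}x,t)\,\nabla\phi_t^{\frac{1}{2}}(x)\cdot\nabla S_0(x,t) \\
&\quad = \phi^{\frac{1}{2}}(x,t)\,\nabla_y T_j(\Phi_t^{-1}x,t)\cdot\Big[\dot{\Phi}_t^{-1}x + (\nabla\Phi_t^{-1}x)\nabla S_0(x,t)\Big] \\
&\qquad + T_j(\Phi_t^{-1}x,t)\Big[\frac{\partial}{\partial t}\phi_t^{\frac{1}{2}}(x) + \nabla\phi_t^{\frac{1}{2}}(x)\cdot\nabla S_0(x,t)\Big] + \Delta\Big(\phi^{\frac{1}{2}}(x,t)\,T_{j-1}(\Phi_t^{-1}x,t)\Big).
\end{aligned}
\]
It follows from (2.3) and the continuity Eq. (1.5) that
\[
\frac{\partial}{\partial t}\psi_j(x,t) + \nabla\psi_j(x,t)\cdot\nabla S_0(x,t) = -\frac{1}{2}\psi_j(x,t)\,\Delta S_0(x,t) + \Delta\big(\psi_{j-1}(x,t)\big).
\]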
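The last step of this proof uses the continuity equation for $\phi$ in a square-root form that is not written out. A short worked version, offered as a sketch of the intermediate computation rather than as text from the paper:

```latex
% Continuity equation for \phi, expanded:
%   \partial_t \phi + \nabla\phi\cdot\nabla S_0 + \phi\,\Delta S_0 = 0 .
\begin{align*}
\frac{\partial}{\partial t}\phi^{\frac12} + \nabla\phi^{\frac12}\cdot\nabla S_0
  &= \frac{1}{2\phi^{\frac12}}\Big(\frac{\partial}{\partial t}\phi + \nabla\phi\cdot\nabla S_0\Big)
   = \frac{1}{2\phi^{\frac12}}\big(-\phi\,\Delta S_0\big)
   = -\frac12\,\phi^{\frac12}\,\Delta S_0 .
\end{align*}
% Multiplying by T_j(\Phi_t^{-1}x,t) and using \psi_j = T_j\,\phi^{1/2} gives the
% term -\tfrac12\,\psi_j\,\Delta S_0, while the bracket containing
% \dot\Phi_t^{-1}x + (\nabla\Phi_t^{-1}x)\nabla S_0 vanishes by (2.3).
```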

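Proof of Theorem 1.3. For the case $j = 0$, (1.9) is the stochastic Hamilton Jacobi Eq. (1.4). For $j = 1$, dividing both sides of the continuity equation for $\psi_0$ by $-\psi_0$ we have
\[
\frac{\partial}{\partial t}(-\log\psi_0) + \nabla(-\log\psi_0)\nabla S_0 = \frac{1}{2}\Delta S_0 . \tag{2.4}
\]
That is (1.9) for $j = 1$. For $j = 2$, multiplying the iterated continuity equation for $\psi_1$ by $\psi_0$ and the continuity equation for $\psi_0$ by $\psi_1$ we have
\[
\psi_0\frac{\partial}{\partial t}\psi_1 + \psi_0\nabla\psi_1\nabla S_0 = -\frac{1}{2}\psi_0\psi_1\Delta S_0 + \psi_0\Delta(\psi_0),
\]
and
\[
\psi_1\frac{\partial}{\partial t}\psi_0 + \psi_1\nabla\psi_0\nabla S_0 = -\frac{1}{2}\psi_1\psi_0\Delta S_0 .
\]
It turns out that
\[
\frac{\partial}{\partial t}\Big(-\frac{1}{2}\frac{\psi_1}{\psi_0}\Big) + \nabla\Big(-\frac{1}{2}\frac{\psi_1}{\psi_0}\Big)\nabla S_0 = -\frac{1}{2}\frac{\Delta(\psi_0)}{\psi_0} . \tag{2.5}
\]
But
\[
-\frac{1}{2}\frac{\Delta\psi_0}{\psi_0} = \frac{1}{2}\Delta(S_1) - \frac{1}{2}(\nabla S_1)^2 .
\]
It turns out from (2.5) that
\[
\frac{\partial}{\partial t}(S_2) + \nabla(S_2)\nabla S_0 = \frac{1}{2}\Delta(S_1) - \frac{1}{2}(\nabla S_1)^2 . \tag{2.6}
\]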
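The identity quoted after (2.5) follows directly from $S_1 = -\log\psi_0$; the computation is short enough to record as a worked check:

```latex
\begin{align*}
\nabla S_1 &= -\frac{\nabla\psi_0}{\psi_0}, \qquad
\Delta S_1 = -\frac{\Delta\psi_0}{\psi_0} + \frac{|\nabla\psi_0|^2}{\psi_0^{2}}, \\
\frac12\,\Delta S_1 - \frac12\,(\nabla S_1)^2
  &= -\frac12\,\frac{\Delta\psi_0}{\psi_0} + \frac12\,\frac{|\nabla\psi_0|^2}{\psi_0^{2}}
     - \frac12\,\frac{|\nabla\psi_0|^2}{\psi_0^{2}}
   = -\frac12\,\frac{\Delta\psi_0}{\psi_0}.
\end{align*}
```

With $S_2 = -\frac{1}{2}\psi_1/\psi_0$, equation (2.6) is then exactly (1.9) for $j = 2$, since the sum in (1.9) contributes $2\nabla S_0\nabla S_2 + (\nabla S_1)^2$ before the factor $\frac{1}{2}$.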

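For an integer $j \ge 3$, first, multiplying the continuity equation for $\psi_0$ by $\psi_{j-1}$ and the iterated continuity equation for $\psi_{j-1}$ by $\psi_0$ we find
\[
\psi_{j-1}\frac{\partial}{\partial t}\psi_0 + \psi_{j-1}\nabla\psi_0\nabla S_0 = -\frac{1}{2}\psi_{j-1}\psi_0\Delta S_0 ,
\]
and
\[
\psi_0\frac{\partial}{\partial t}\psi_{j-1} + \psi_0\nabla\psi_{j-1}\nabla S_0 = -\frac{1}{2}\psi_0\psi_{j-1}\Delta S_0 + \psi_0\Delta(\psi_{j-2}).
\]
It turns out that
\[
\frac{\partial}{\partial t}\Big(-\frac{1}{2^{j-1}}\frac{\psi_{j-1}}{\psi_0}\Big) + \nabla\Big(-\frac{1}{2^{j-1}}\frac{\psi_{j-1}}{\psi_0}\Big)\nabla S_0 = -\frac{1}{2^{j-1}}\frac{\Delta(\psi_{j-2})}{\psi_0} . \tag{2.7}
\]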

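Second, for any $i_1, i_2 \ge 1$ with $i_1 + i_2 = j-1$, multiplying the iterated continuity equation for $\psi_{i_1}$ by $\psi_0^2\psi_{i_2}$ and the iterated continuity equation for $\psi_{i_2}$ by $\psi_0^2\psi_{i_1}$ we find
\[
\psi_0^2\psi_{i_2}\frac{\partial}{\partial t}\psi_{i_1} + \psi_0^2\psi_{i_2}\nabla\psi_{i_1}\nabla S_0 = -\frac{1}{2}\psi_0^2\psi_{i_2}\psi_{i_1}\Delta S_0 + \psi_0^2\psi_{i_2}\Delta(\psi_{i_1-1}),
\]
and
\[
\psi_0^2\psi_{i_1}\frac{\partial}{\partial t}\psi_{i_2} + \psi_0^2\psi_{i_1}\nabla\psi_{i_2}\nabla S_0 = -\frac{1}{2}\psi_0^2\psi_{i_1}\psi_{i_2}\Delta S_0 + \psi_0^2\psi_{i_1}\Delta(\psi_{i_2-1}).
\]
It turns out that
\[
\psi_0^2\frac{\partial}{\partial t}(\psi_{i_1}\psi_{i_2}) + \psi_0^2\nabla(\psi_{i_1}\psi_{i_2})\nabla S_0
 = -\frac{1}{2}\psi_0^2 \times 2(\psi_{i_1}\psi_{i_2})\Delta S_0 + \psi_0^2\big(\psi_{i_1}\Delta(\psi_{i_2-1}) + \Delta(\psi_{i_1-1})\psi_{i_2}\big).
\]
Then the symmetry of the indices $i_1$ and $i_2$ implies that
\[
\begin{aligned}
&\psi_0^2\frac{\partial}{\partial t}\Big(\sum_{\substack{i_1+i_2=j-1 \\ i_1,i_2 \ge 1}}\psi_{i_1}\psi_{i_2}\Big) + \psi_0^2\nabla\Big(\sum_{\substack{i_1+i_2=j-1 \\ i_1,i_2 \ge 1}}\psi_{i_1}\psi_{i_2}\Big)\nabla S_0 \\
&\qquad = -\psi_0^2 \times \Big(\sum_{\substack{i_1+i_2=j-1 \\ i_1,i_2 \ge 1}}\psi_{i_1}\psi_{i_2}\Big)\Delta S_0 + 2\psi_0^2\Big(\sum_{\substack{i_1+i_2=j-1 \\ i_1,i_2 \ge 1}}\psi_{i_1}\Delta(\psi_{i_2-1})\Big).
\end{aligned}
\tag{2.8}
\]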

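On the other hand, multiplying the continuity equation for $\psi_0$ by
\[
2\psi_0\sum_{\substack{i_1+i_2=j-1 \\ i_1,i_2 \ge 1}}\psi_{i_1}\psi_{i_2}
\]
we have
\[
\Big(\sum_{\substack{i_1+i_2=j-1 \\ i_1,i_2 \ge 1}}\psi_{i_1}\psi_{i_2}\Big)\frac{\partial}{\partial t}\psi_0^2 + \Big(\sum_{\substack{i_1+i_2=j-1 \\ i_1,i_2 \ge 1}}\psi_{i_1}\psi_{i_2}\Big)\nabla\psi_0^2\,\nabla S_0
 = -\psi_0^2\Big(\sum_{\substack{i_1+i_2=j-1 \\ i_1,i_2 \ge 1}}\psi_{i_1}\psi_{i_2}\Big)\Delta S_0 . \tag{2.9}
\]
It follows from (2.8) and (2.9) that
\[
\frac{\partial}{\partial t}\Bigg(\frac{\sum_{i_1+i_2=j-1,\, i_1,i_2 \ge 1}\psi_{i_1}\psi_{i_2}}{2^{j-1} \times 2\psi_0^2}\Bigg) + \nabla\Bigg(\frac{\sum_{i_1+i_2=j-1,\, i_1,i_2 \ge 1}\psi_{i_1}\psi_{i_2}}{2^{j-1} \times 2\psi_0^2}\Bigg)\nabla S_0
 = \frac{\sum_{i_1+i_2=j-1,\, i_1,i_2 \ge 1}\psi_{i_1}\Delta(\psi_{i_2-1})}{2^{j-1}\psi_0^2} . \tag{2.10}
\]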

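By the same method we can prove that
\[
\begin{aligned}
&\frac{\partial}{\partial t}\Bigg(-\frac{\sum_{i_1+i_2+i_3=j-1,\, i_1,i_2,i_3 \ge 1}\psi_{i_1}\psi_{i_2}\psi_{i_3}}{2^{j-1} \times 3\psi_0^3}\Bigg) + \nabla\Bigg(-\frac{\sum_{i_1+i_2+i_3=j-1,\, i_1,i_2,i_3 \ge 1}\psi_{i_1}\psi_{i_2}\psi_{i_3}}{2^{j-1} \times 3\psi_0^3}\Bigg)\nabla S_0 \\
&\qquad = -\frac{\sum_{i_1+i_2+i_3=j-1,\, i_1,i_2,i_3 \ge 1}\psi_{i_1}\psi_{i_2}\Delta(\psi_{i_3-1})}{2^{j-1}\psi_0^3}, \\
&\qquad\cdots \\
&\frac{\partial}{\partial t}\Bigg((-1)^{j-1}\frac{\psi_1^{\,j-1}}{2^{j-1} \times (j-1)\psi_0^{\,j-1}}\Bigg) + \nabla\Bigg((-1)^{j-1}\frac{\psi_1^{\,j-1}}{2^{j-1} \times (j-1)\psi_0^{\,j-1}}\Bigg)\nabla S_0
 = (-1)^{j-1}\frac{\psi_1^{\,j-2}\Delta\psi_0}{2^{j-1}\psi_0^{\,j-1}} .
\end{aligned}
\tag{2.11}
\]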

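It turns out from (2.7), (2.10) and (2.11) that
\[
\begin{aligned}
\frac{\partial}{\partial t}S_j + \nabla S_j\nabla S_0
 = \frac{1}{2^{j-1}}\Bigg( & -\frac{\Delta(\psi_{j-2})}{\psi_0} + \frac{\sum_{i_1+i_2=j-1,\, i_1,i_2 \ge 1}\psi_{i_1}\Delta(\psi_{i_2-1})}{\psi_0^2} \\
& - \frac{\sum_{i_1+i_2+i_3=j-1,\, i_1,i_2,i_3 \ge 1}\psi_{i_1}\psi_{i_2}\Delta(\psi_{i_3-1})}{\psi_0^3} + \cdots + (-1)^{j-1}\frac{\psi_1^{\,j-2}\Delta\psi_0}{\psi_0^{\,j-1}}\Bigg).
\end{aligned}
\]
This leads to
\[
\begin{aligned}
\frac{\partial}{\partial t}S_j + \nabla S_j\nabla S_0
 = \frac{1}{2^{j-1}}\Bigg( & -\frac{\Delta(\psi_{j-2})}{\psi_0} + \frac{\psi_{j-2}\Delta\psi_0}{\psi_0^2} + \frac{\sum_{i_1+i_2=j-2,\, i_1,i_2 \ge 1}\psi_{i_1}\Delta(\psi_{i_2})}{\psi_0^2} \\
& - \frac{\big(\sum_{i_1+i_2=j-2,\, i_1,i_2 \ge 1}\psi_{i_1}\psi_{i_2}\big)\Delta\psi_0}{\psi_0^3} - \frac{\sum_{i_1+i_2+i_3=j-2,\, i_1,i_2,i_3 \ge 1}\psi_{i_1}\psi_{i_2}\Delta(\psi_{i_3})}{\psi_0^3} \\
& + \cdots + (-1)^{j-2}\frac{\psi_1^{\,j-3}\Delta\psi_1}{\psi_0^{\,j-2}} + (-1)^{j-1}\frac{\psi_1^{\,j-2}\Delta\psi_0}{\psi_0^{\,j-1}}\Bigg).
\end{aligned}
\tag{2.12}
\]
Differentiating $S_{i_1}$ for $i_1 \ge 2$ with respect to the space variables $x$ we have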


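\[
\begin{aligned}
\nabla S_{i_1}(x,t) = \frac{1}{2^{i_1-1}}\Bigg( & -\frac{\nabla\psi_{i_1-1}}{\psi_0} + \frac{\psi_{i_1-1}\nabla\psi_0}{\psi_0^2}
 + \frac{\sum_{j_1,j_2 \ge 1,\, j_1+j_2=i_1-1}\nabla(\psi_{j_1})\psi_{j_2}}{\psi_0^2}
 - \frac{\sum_{j_1,j_2 \ge 1,\, j_1+j_2=i_1-1}\psi_{j_1}\psi_{j_2}\nabla\psi_0}{\psi_0^3} \\
& - \frac{\sum_{j_1,j_2,j_3 \ge 1,\, j_1+j_2+j_3=i_1-1}\nabla(\psi_{j_1})\psi_{j_2}\psi_{j_3}}{\psi_0^3}
 + \frac{\sum_{j_1,j_2,j_3 \ge 1,\, j_1+j_2+j_3=i_1-1}\psi_{j_1}\psi_{j_2}\psi_{j_3}\nabla\psi_0}{\psi_0^4} \\
& + \cdots + (-1)^{i_1-1}\frac{\psi_1^{\,i_1-2}\nabla\psi_1}{\psi_0^{\,i_1-1}} - (-1)^{i_1-1}\frac{\psi_1^{\,i_1-1}\nabla\psi_0}{\psi_0^{\,i_1}}\Bigg).
\end{aligned}
\]
Similarly for $i_2 \ge 2$,
\[
\begin{aligned}
\nabla S_{i_2}(x,t) = \frac{1}{2^{i_2-1}}\Bigg( & -\frac{\nabla\psi_{i_2-1}}{\psi_0} + \frac{\psi_{i_2-1}\nabla\psi_0}{\psi_0^2}
 + \frac{\sum_{j_1,j_2 \ge 1,\, j_1+j_2=i_2-1}\nabla(\psi_{j_1})\psi_{j_2}}{\psi_0^2}
 - \frac{\sum_{j_1,j_2 \ge 1,\, j_1+j_2=i_2-1}\psi_{j_1}\psi_{j_2}\nabla\psi_0}{\psi_0^3} \\
& - \frac{\sum_{j_1,j_2,j_3 \ge 1,\, j_1+j_2+j_3=i_2-1}\nabla(\psi_{j_1})\psi_{j_2}\psi_{j_3}}{\psi_0^3}
 + \frac{\sum_{j_1,j_2,j_3 \ge 1,\, j_1+j_2+j_3=i_2-1}\psi_{j_1}\psi_{j_2}\psi_{j_3}\nabla\psi_0}{\psi_0^4} \\
& + \cdots + (-1)^{i_2-1}\frac{\psi_1^{\,i_2-2}\nabla\psi_1}{\psi_0^{\,i_2-1}} - (-1)^{i_2-1}\frac{\psi_1^{\,i_2-1}\nabla\psi_0}{\psi_0^{\,i_2}}\Bigg).
\end{aligned}
\]
It follows that
\[
\begin{aligned}
\sum_{\substack{i_1+i_2=j \\ i_1,i_2 \ge 2}}\nabla S_{i_1}\nabla S_{i_2}
 = \frac{1}{2^{j-2}}\sum_{\substack{i_1+i_2=j \\ i_1,i_2 \ge 2}}\Bigg( & \frac{\nabla\psi_{i_1-1}\nabla\psi_{i_2-1}}{\psi_0^2}
 - \frac{(\nabla\psi_{i_1-1}\,\psi_{i_2-1} + \psi_{i_1-1}\nabla\psi_{i_2-1})\nabla\psi_0}{\psi_0^3}
 + \frac{\psi_{i_1-1}\psi_{i_2-1}(\nabla\psi_0)^2}{\psi_0^4} \\
& - \frac{\nabla\psi_{i_1-1}\sum_{j_1,j_2 \ge 1,\, j_1+j_2=i_2-1}\nabla(\psi_{j_1})\psi_{j_2} + \nabla\psi_{i_2-1}\sum_{j_1,j_2 \ge 1,\, j_1+j_2=i_1-1}\nabla(\psi_{j_1})\psi_{j_2}}{\psi_0^3} \\
& + \cdots + (-1)^{i_1+i_2-2}\frac{\psi_1^{\,i_1+i_2-4}(\nabla\psi_1)^2}{\psi_0^{\,i_1+i_2-2}}
 - (-1)^{i_1+i_2-2}\frac{2\psi_1^{\,i_1+i_2-3}\nabla\psi_1\nabla\psi_0}{\psi_0^{\,i_1+i_2-1}} \\
& + (-1)^{i_1+i_2-2}\frac{\psi_1^{\,i_1+i_2-2}(\nabla\psi_0)^2}{\psi_0^{\,i_1+i_2}}\Bigg).
\end{aligned}
\tag{2.13}
\]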

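Note that $\nabla S_1 = -\dfrac{\nabla\psi_0}{\psi_0}$. Therefore we also have
\[
\begin{aligned}
\nabla S_1\nabla S_{j-1} = -\frac{1}{2^{j-2}}\Bigg( & -\frac{\nabla\psi_{j-2}\nabla\psi_0}{\psi_0^2} + \frac{\psi_{j-2}(\nabla\psi_0)^2}{\psi_0^3}
 + \frac{\sum_{j_1,j_2 \ge 1,\, j_1+j_2=j-2}\nabla(\psi_{j_1})\psi_{j_2}\nabla\psi_0}{\psi_0^3} \\
& - \frac{\sum_{j_1,j_2 \ge 1,\, j_1+j_2=j-2}\psi_{j_1}\psi_{j_2}(\nabla\psi_0)^2}{\psi_0^4}
 + \cdots + (-1)^{j-2}\frac{\psi_1^{\,j-3}\nabla\psi_1\nabla\psi_0}{\psi_0^{\,j-1}} - (-1)^{j-2}\frac{\psi_1^{\,j-2}(\nabla\psi_0)^2}{\psi_0^{\,j}}\Bigg).
\end{aligned}
\tag{2.14}
\]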

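Finally from (2.12), (2.13) and (2.14) we obtain that
\[
\begin{aligned}
&\frac{\partial}{\partial t}S_j + \frac{1}{2}\sum_{\substack{i_1+i_2=j \\ i_1,i_2 \ge 0}}\nabla S_{i_1}\nabla S_{i_2} \\
&\quad = \frac{1}{2^{j-1}}\Bigg( -\frac{\Delta(\psi_{j-2})}{\psi_0} + \frac{\psi_{j-2}\Delta\psi_0}{\psi_0^2} + 2\,\frac{\nabla\psi_{j-2}\nabla\psi_0}{\psi_0^2} - 2\,\frac{\psi_{j-2}(\nabla\psi_0)^2}{\psi_0^3} \\
&\qquad + \frac{\sum_{i_1+i_2=j-2,\, i_1,i_2 \ge 1}\psi_{i_1}\Delta(\psi_{i_2})}{\psi_0^2} - \frac{\sum_{i_1+i_2=j-2,\, i_1,i_2 \ge 1}\psi_{i_1}\psi_{i_2}\Delta\psi_0}{\psi_0^3}
 + \frac{\sum_{i_1+i_2=j-2,\, i_1,i_2 \ge 1}\nabla\psi_{i_1}\nabla\psi_{i_2}}{\psi_0^2} \\
&\qquad - \frac{4 \times \sum_{i_1+i_2=j-2,\, i_1,i_2 \ge 1}(\nabla\psi_{i_1})\psi_{i_2}\nabla\psi_0}{\psi_0^3}
 + \frac{3 \times \sum_{i_1+i_2=j-2,\, i_1,i_2 \ge 1}\psi_{i_1}\psi_{i_2}(\nabla\psi_0)^2}{\psi_0^4} \\
&\qquad + \cdots + (-1)^{j-2}\Bigg(\frac{\psi_1^{\,j-3}\Delta\psi_1}{\psi_0^{\,j-2}} - \frac{\psi_1^{\,j-2}\Delta\psi_0}{\psi_0^{\,j-1}} + \frac{(j-3)\psi_1^{\,j-4}(\nabla\psi_1)^2}{\psi_0^{\,j-2}} \\
&\qquad\qquad - \frac{2(j-2)\psi_1^{\,j-3}\nabla\psi_1\nabla\psi_0}{\psi_0^{\,j-1}} + \frac{(j-1)\psi_1^{\,j-2}(\nabla\psi_0)^2}{\psi_0^{\,j}}\Bigg)\Bigg).
\end{aligned}
\tag{2.15}
\]
Then a simple computation implies that the right hand side of the above equals $\frac{1}{2}\Delta S_{j-1}$. That is to say
\[
\frac{\partial}{\partial t}S_j + \frac{1}{2}\sum_{\substack{i_1+i_2=j \\ i_1,i_2 \ge 0}}\nabla S_{i_1}\nabla S_{i_2} = \frac{1}{2}\Delta S_{j-1} .
\]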

3. Proof of Theorem 1.4

Proof. Let $x^\mu_s$ be defined by (1.12). For each $\omega \in \Omega$, define a new probability measure $\hat{P}_1$ by
\[
\begin{aligned}
\frac{d\hat{P}_1}{d\hat{P}} = M^\mu_t = \exp\Bigg\{ & -\frac{1}{\mu}\int_0^t\Big(-\nabla S_0(x^\mu_s, t-s) + \mu^2\nabla\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)\Big)\,dB_s \\
& -\frac{1}{2\mu^2}\int_0^t\Big|-\nabla S_0(x^\mu_s, t-s) + \mu^2\nabla\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)\Big|^2 ds\Bigg\}.
\end{aligned}
\tag{3.1}
\]
Then for each $\omega \in \Omega$, $x^\mu_s$ is a Brownian motion on $(\hat{\Omega}, \hat{\mathcal{F}}, \hat{P}_1)$ with variance $\mu^2$. Therefore $x^\mu_s$ is isometric to $B^\mu_s = x + \mu B_s$ on the probability space $(\hat{\Omega}, \hat{\mathcal{F}}, \hat{P})$. Let $E_{\hat{P}}$ be the expectation with respect to the measure $\hat{P}$; we write it as $\hat{E}$ for simplicity. Denote by $E_{\hat{P}_1}$ the expectation with respect to the measure $\hat{P}_1$. By the Feynman–Kac formula, the solution of the heat Eq. (1.13) is given by
\[
\begin{aligned}
u^\mu_t(x) &= E_{\hat{P}}\Bigg[T_0(B^\mu_t)\exp\Big\{-\frac{S_0(B^\mu_t)}{\mu^2} + \frac{1}{\mu^2}\int_0^t c(B^\mu_s, t-s)\,ds + \frac{1}{\mu^2}\int_0^t k(B^\mu_s, t-s)\,dw_{t-s}\Big\}\Bigg] \\
&= E_{\hat{P}_1}\Bigg[T_0(x^\mu_t)\exp\Big\{-\frac{S_0(x^\mu_t)}{\mu^2} + \frac{1}{\mu^2}\int_0^t c(x^\mu_s, t-s)\,ds + \frac{1}{\mu^2}\int_0^t k(x^\mu_s, t-s)\,dw_{t-s}\Big\}\Bigg],
\end{aligned}
\]
provided $x^\mu_s$ is $\mathcal{F}^*_s$ measurable and non-explosive so that $\int_0^t k(x^\mu_s, t-s)\,dw_{t-s}$ is well defined in Itô’s sense. By the Maruyama–Girsanov–Cameron–Martin formula we have
\[
u^\mu_t(x) = E_{\hat{P}}\Bigg[T_0(x^\mu_t)\exp\Big\{-\frac{S_0(x^\mu_t)}{\mu^2} + \frac{1}{\mu^2}\int_0^t c(x^\mu_s, t-s)\,ds + \frac{1}{\mu^2}\int_0^t k(x^\mu_s, t-s)\,dw_{t-s}\Big\}\cdot M^\mu_t\Bigg]. \tag{3.2}
\]
Applying Itô’s formula to $S_0(x^\mu_s, t-s) - \mu^2\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)$ we have

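\[
\begin{aligned}
&S_0(x_t, 0) - \mu^2\log T_0(x_t) \\
&\quad = S_0(x,t) - \mu^2\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x,t) \\
&\qquad + \int_0^t\Big\{d_s S_0(x^\mu_s, t-s) - \mu^2\frac{\partial}{\partial s}\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)\,ds\Big\} \\
&\qquad + \int_0^t\Big(\nabla S_0(x^\mu_s, t-s) - \mu^2\nabla\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)\Big) \\
&\qquad\qquad \times\Big(\mu\,dB_s - \nabla S_0(x^\mu_s, t-s)\,ds + \mu^2\nabla\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)\,ds\Big) \\
&\qquad + \frac{\mu^2}{2}\int_0^t\Big(\Delta S_0(x^\mu_s, t-s) - \mu^2\Delta\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)\Big)\,ds .
\end{aligned}
\]
It turns out that
\[
\begin{aligned}
&\frac{1}{\mu}\int_0^t\Big(\nabla S_0(x^\mu_s, t-s) - \mu^2\nabla\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)\Big)\,dB_s \\
&\quad = \frac{1}{\mu^2}\Big[S_0(x_t, 0) - \mu^2\log T_0(x_t) - S_0(x,t) + \mu^2\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x,t)\Big] \\
&\qquad + \frac{1}{\mu^2}\int_0^t\Big\{-d_s S_0(x^\mu_s, t-s) + \Big|\nabla S_0(x^\mu_s, t-s) - \mu^2\nabla\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)\Big|^2 ds\Big\} \\
&\qquad + \int_0^t\Big(\frac{\partial}{\partial s}\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s) - \frac{1}{2}\Delta S_0(x^\mu_s, t-s) + \frac{1}{2}\mu^2\Delta\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)\Big)\,ds .
\end{aligned}
\]
Note that
\[
\Delta\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)
 = \frac{\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\Delta\psi_j(x^\mu_s, t-s)}{\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)}
 - \Big|\nabla\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)\Big|^2 .
\]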

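It follows that
\[
\begin{aligned}
M^\mu_t = \exp\Bigg\{ & \frac{1}{\mu^2}\Big[S_0(x_t, 0) - \mu^2\log T_0(x_t) - S_0(x,t) + \mu^2\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x,t)\Big] \\
& + \frac{1}{\mu^2}\int_0^t\Big(-d_s S_0(x^\mu_s, t-s) + \frac{1}{2}|\nabla S_0(x^\mu_s, t-s)|^2\,ds\Big) \\
& + \frac{1}{2}\int_0^t\mu^2\Big|\nabla\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)\Big|^2 ds \\
& + \int_0^t\Bigg(\frac{\partial}{\partial s}\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s) - \frac{1}{2}\Delta S_0(x^\mu_s, t-s) \\
& \qquad - \nabla\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)\cdot\nabla S_0(x^\mu_s, t-s)
 + \frac{1}{2}\mu^2\,\frac{\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\Delta\psi_j(x^\mu_s, t-s)}{\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)} \\
& \qquad - \frac{1}{2}\mu^2\Big|\nabla\log\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)\Big|^2\Bigg)\,ds\Bigg\}.
\end{aligned}
\tag{3.3}
\]
By using the stochastic Hamilton Jacobi Eq. (1.4) and the continuity Eq. (1.8) we have
\[
u^\mu_t(x) = \exp\Big\{-\frac{S_0(x,t)}{\mu^2}\Big\}\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x,t)
 \times \hat{E}\exp\Bigg\{\frac{1}{2^{m+1}}\mu^{2(m+1)}\int_0^t \frac{\Delta\big(\psi_m(x^\mu_s, t-s)\big)}{\sum_{j=0}^{m}\frac{1}{2^j}\mu^{2j}\psi_j(x^\mu_s, t-s)}\,ds\Bigg\}.
\]

4. Proof of Theorem 1.5 and Theorem 1.6

Proof of Theorem 1.5. Let $y^\mu_s$ be defined by (1.15). For each $\omega \in \Omega$, define a new probability measure $\hat{P}_1$ by
\[
\frac{d\hat{P}_1}{d\hat{P}} = M^\mu_t = \exp\Bigg\{-\frac{1}{\mu}\int_0^t\Big(-\sum_{j=0}^{m}\mu^{2j}\nabla S_j(y^\mu_s, t-s)\Big)\,dB_s - \frac{1}{2\mu^2}\int_0^t\Big|-\sum_{j=0}^{m}\mu^{2j}\nabla S_j(y^\mu_s, t-s)\Big|^2 ds\Bigg\}. \tag{4.1}
\]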

Then for each $\omega \in \Omega$, $y^\mu_s$ is a Brownian motion on $(\hat{\Omega}, \hat{\mathcal{F}}, \hat{P}_1)$ with variance $\mu^2$. Therefore $y^\mu_s$ is isometric to $B^\mu_s = x + \mu B_s$ on the probability space $(\hat{\Omega}, \hat{\mathcal{F}}, \hat{P})$. By the Maruyama–Girsanov–Cameron–Martin formula and the Feynman–Kac formula, similar to (3.2), we have
\[
u^\mu_t(x) = E_{\hat{P}}\Bigg[T_0(y^\mu_t)\exp\Big\{-\frac{S_0(y^\mu_t)}{\mu^2} + \frac{1}{\mu^2}\int_0^t c(y^\mu_s, t-s)\,ds + \frac{1}{\mu^2}\int_0^t k(y^\mu_s, t-s)\,dw_{t-s}\Big\}\cdot M^\mu_t\Bigg].
\]
Applying Itô’s formula to $\sum_{j=0}^{m}\mu^{2j}S_j(y^\mu_s, t-s)$ we have

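\[
\begin{aligned}
S_0(y_t, 0) - \mu^2\log T_0(y_t)
 = {}& \sum_{j=0}^{m}\mu^{2j}S_j(x,t) + \int_0^t\sum_{j=0}^{m}\mu^{2j}\,d_s S_j(y^\mu_s, t-s) \\
& + \int_0^t\Big(\sum_{j=0}^{m}\mu^{2j}\nabla S_j(y^\mu_s, t-s)\Big)\times\Big(\mu\,dB_s - \sum_{j=0}^{m}\mu^{2j}\nabla S_j(y^\mu_s, t-s)\,ds\Big) \\
& + \frac{\mu^2}{2}\int_0^t\Big(\sum_{j=0}^{m}\mu^{2j}\Delta S_j(y^\mu_s, t-s)\Big)\,ds .
\end{aligned}
\]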

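It turns out that
\[
\begin{aligned}
\frac{1}{\mu}\int_0^t\Big(\sum_{j=0}^{m}\mu^{2j}\nabla S_j(y^\mu_s, t-s)\Big)\,dB_s
 = {}& \frac{1}{\mu^2}\Big[S_0(y_t, 0) - \mu^2\log T_0(y_t) - \sum_{j=0}^{m}\mu^{2j}S_j(x,t)\Big] \\
& + \frac{1}{\mu^2}\int_0^t\Big\{-\sum_{j=0}^{m}\mu^{2j}\,d_s S_j(y^\mu_s, t-s) + \Big|\sum_{j=0}^{m}\mu^{2j}\nabla S_j(y^\mu_s, t-s)\Big|^2 ds\Big\} \\
& - \frac{1}{2}\int_0^t\sum_{j=0}^{m}\mu^{2j}\Delta S_j(y^\mu_s, t-s)\,ds .
\end{aligned}
\tag{4.2}
\]
This is followed by
\[
\begin{aligned}
M^\mu_t = \exp\Bigg\{ & \frac{1}{\mu^2}\Big[S_0(y_t, 0) - \mu^2\log T_0(y_t) - \sum_{j=0}^{m}\mu^{2j}S_j(x,t)\Big] \\
& + \frac{1}{\mu^2}\int_0^t\Big(-\sum_{j=0}^{m}\mu^{2j}\,d_s S_j(y^\mu_s, t-s) + \frac{1}{2}\Big|\sum_{j=0}^{m}\mu^{2j}\nabla S_j(y^\mu_s, t-s)\Big|^2 ds\Big) \\
& - \frac{1}{2}\int_0^t\sum_{j=0}^{m}\mu^{2j}\Delta S_j(y^\mu_s, t-s)\,ds\Bigg\}.
\end{aligned}
\]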

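But
\[
\Big|\sum_{j=0}^{m}\mu^{2j}\nabla S_j\Big|^2
 = \sum_{j=0}^{m}\mu^{2j}\sum_{\substack{i_1,i_2 \ge 0 \\ i_1+i_2=j}}\nabla S_{i_1}\nabla S_{i_2}
 + \sum_{j=m+1}^{2m}\mu^{2j}\sum_{\substack{0 \le i_1,i_2 \le m \\ i_1+i_2=j}}\nabla S_{i_1}\nabla S_{i_2} .
\]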
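The splitting of the square above is a plain Cauchy-product identity, and it is the source of the remainder sum over $m+1 \le j \le 2m$ in (1.16). A minimal sketch in exact arithmetic, assuming scalar stand-ins `a[j]` for $\nabla S_j$ at one point and a stand-in `e` for $\mu^2$ (both hypothetical sample data, as are the function names):

```python
from fractions import Fraction


def split_square(a):
    """Coefficients of (sum_j a[j] e^j)^2, split at degree m = len(a) - 1.

    low[j]  (0 <= j <= m)    sums a[i1] * a[i2] over i1, i2 >= 0 with i1 + i2 = j;
    high[k] (j = m + 1 + k)  sums over 0 <= i1, i2 <= m with i1 + i2 = j.
    """
    m = len(a) - 1
    low = [sum(a[i] * a[j - i] for i in range(0, j + 1)) for j in range(0, m + 1)]
    high = [
        sum(a[i] * a[j - i] for i in range(j - m, m + 1))
        for j in range(m + 1, 2 * m + 1)
    ]
    return low, high


def evaluate(coeffs, e, start=0):
    """Evaluate sum_k coeffs[k] * e^(start + k)."""
    return sum(c * e ** (start + k) for k, c in enumerate(coeffs))


# Hypothetical sample values: a[j] stands for grad S_j, e stands for mu^2.
a = [Fraction(2), Fraction(-1, 3), Fraction(5)]
e = Fraction(1, 2)
m = len(a) - 1
low, high = split_square(a)
lhs = evaluate(a, e) ** 2
rhs = evaluate(low, e) + evaluate(high, e, start=m + 1)
```

For $j \le m$ the constraint $i_1, i_2 \le m$ is automatic, which is why the first sum needs only $i_1, i_2 \ge 0$; the second sum collects exactly the products whose total order exceeds $m$.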

It turns out by using (1.11) that
\[
\begin{aligned}
u^\mu_t(x) = {}& \exp\Big\{-\frac{1}{\mu^2}\sum_{j=0}^{m}\mu^{2j}S_j(x,t)\Big\} \times \hat{E}\exp\Bigg\{-\frac{1}{2}\mu^{2m}\int_0^t\Delta S_m(y^\mu_s, t-s)\,ds \\
& + \frac{1}{2}\int_0^t\sum_{j=m+1}^{2m}\mu^{2(j-1)}\sum_{\substack{0 \le i_1,i_2 \le m \\ i_1+i_2=j}}(\nabla S_{i_1}\nabla S_{i_2})(y^\mu_s, t-s)\,ds\Bigg\}.
\end{aligned}
\]

Proof of Theorem 1.6. The result for the Burgers’ equation follows by applying the logarithmic transformation to (1.16).

Acknowledgement. One of us (HZZ) would like to thank Professor D. Williams FRS for his invitation to visit the University of Bath and Professors B. Oksendal and T. Lindstrom for inviting him to visit the University of Oslo. It is our great pleasure to thank Professor K.D. Elworthy, Professor W.A. Zheng and Dr. Z. Brzezniak for helpful conversations. We would like to acknowledge the support of EPSRC Grants GR/L37823 and GR/K70397.

References

1. Albeverio, S., Molchanov, S. and Surgailis, D.: Stratified structure of the universe and Burgers equation – A probability approach. Probab. Theory Relat. Fields 100, 457–484 (1994)
2. Bertini, L., Cancrini, N. and Jona-Lasinio, G.: The stochastic Burgers equation. Commun. Math. Phys. 165, 211–232 (1994)
3. Bertini, L. and Cancrini, N.: The stochastic heat equation: Feynman–Kac formula and intermittence. J. Stat. Phys. 78, 1377–1401 (1995)
4. Da Prato, G., Debussche, A. and Temam, R.: Stochastic Burgers equation. Preprint di mathematica n.27 ’Scuola Normale Superiore’ Pisa (1993)
5. Elworthy, K. D.: Geometric Aspects of Diffusions on Manifolds. In: Hennequin, P. L. (ed.) Ecole d’Eté Probabilité de Saint–Flour XV–XVII 1985, 1987, Lecture Notes in Mathematics 1362, Berlin–Heidelberg–New York: Springer-Verlag, 1989, pp. 276–425
6. Elworthy, K. D.: Stochastic differential equations on manifolds. London Mathematical Society Lecture Notes 70, Cambridge: Cambridge University Press, 1989
7. Elworthy, K. D. and Truman, A.: The diffusion equation and classical mechanics: An elementary formula. In: Albeverio, S. et al (eds.) Stochastic Processes in Quantum Physics, Lecture Notes in Physics 173, Berlin: Springer, 1982, pp. 136–146
8. Elworthy, K. D., Truman, A., Zhao, H. Z. and Gaines, J.: Approximate travelling waves for the generalized KPP equations and classical mechanics. Proc. R. Soc. Lond., Series A 446, 529–554 (1994)
9. Glimm, J. and Jaffe, A.: Quantum Physics. New York: Springer-Verlag, 1981
10. Guerra, F.: Structural aspects of stochastic mechanics and stochastic field theory. Phys. Rep. 77, 263–312 (1981)
11. Hopf, E.: The partial differential equation $u_t + uu_x = \mu u_{xx}$. Comm. Pure and Appl. Math. 3, 201–230 (1950)
12. Holden, H., Lindstrom, T., Oksendal, B., Uboe, J. and Zhang, T. S.: The Burgers’ equation with a noise force and the stochastic heat equations. Comm. PDE. 19, 119–141 (1994)
13. Holden, H., Lindstrom, T., Oksendal, B., Uboe, J. and Zhang, T. S.: The stochastic Wick-type Burgers equation. In: Etheridge, A.M. (ed.), Stochastic Partial Differential Equations (London Mathematical Society Lecture Note Series 216), Cambridge: Cambridge University Press, 1995, pp. 141–161
14. Jona-Lasinio, G., Martinelli, F. and Scoppola, E.: The semiclassical limit of quantum mechanics. Phys. Rep. 77, 313–327 (1981)
15. Li, X.M. and Zhao, H.Z.: Gradient estimates and the smooth convergence of approximate travelling waves for reaction diffusion equations. Nonlinearity 9, 459–477 (1996)
16. Maslov, V.P. and Fedoryuk, M.V.: The quasiclassical approximation for equations of quantum mechanics. Dordrecht: Reidel Publishing Comp., 1981
17. Nelson, E.: Quantum Fluctuations. Princeton, NJ: Princeton University Press, 1985
18. Oksendal, B.: Stochastic partial differential equations and applications to hydrodynamics. In: Cardaso, A.I., de Faria, M., Potthoff, J. and Streit, L. (eds.) Stochastic Analysis and Applications in Physics (NATO ASI Series 449), Dordrecht: Kluwer, 1994, pp. 283–305
19. Simon, B.: Functional Integration and Quantum Physics. New York: Academic Press, 1979
20. Sinai, Ya. G.: Two results concerning asymptotic behaviour of solutions of the Burgers equation with force. J. Stat. Phys. 64, 1–12 (1991)
21. Truman, A.: Classical mechanics, the diffusion (heat) equation and the Schrödinger equation. J. Math. Phys. 18, 2308–2315 (1977)
22. Truman, A. and Zhao, H. Z.: The stochastic Hamilton Jacobi equation, stochastic heat equation and Schrödinger equation. In: Davies, I. M., Truman, A. and Elworthy, K. D. (eds), Stochastic Analysis and Applications, Singapore: World Scientific, 1996a, pp. 441–464
23. Truman, A. and Zhao, H. Z.: On stochastic diffusion equations and stochastic Burgers’ equations. J. Math. Phys. 37, 283–307 (1996b)
24. Truman, A. and Zhao, H. Z.: Quantum mechanics of charged particles in random electromagnetic fields. J. Math. Phys. 37, 3180–3197 (1996c)
25. Watling, K. D.: Formulae for solutions to (possibly degenerate) diffusion equations exhibiting semiclassical asymptotics. In: Truman, A. and Davies, I. M. (eds.), Stochastics and Quantum Mechanics, Singapore: World Scientific, 1992, pp. 248–271
26. Zhao, H.Z.: On the gradients of the travelling waves for the generalized KPP equations. Proc. R. Soc. Edinb. 127, 423–439 (1997)
27. Zhao, H.Z. and Elworthy, K.D.: The travelling wave solutions of scalar generalized KPP equations via classical mechanics and stochastics approaches. In: Truman, A. and Davies, I. M. (eds.), Stochastics and Quantum Mechanics, Singapore: World Scientific, 1992, pp. 298–316

Communicated by D. Brydges

Commun. Math. Phys. 194, 249 – 295 (1998)



Continuous Renormalization for Fermions and Fermi Liquid Theory

Manfred Salmhofer

Mathematik, ETH-Zentrum, 8092 Zürich, Switzerland. E-mail: [email protected]

Received: 14 May 1997 / Accepted: 6 October 1997

Abstract: I derive a Wick ordered continuous renormalization group equation for fermion systems and show that a determinant bound applies directly to this equation. This removes factorials in the recursive equation for the Green functions, and thus improves the combinatorial behaviour. The form of the equation is also ideal for the investigation of many-fermion systems, where the propagator is singular on a surface. For these systems, I define a criterion for Fermi liquid behaviour which applies at positive temperatures. As a first step towards establishing such behaviour in d ≥ 2, I prove basic regularity properties of the interacting Fermi surface to all orders in a skeleton expansion. The proof is a considerable simplification of previous ones.

1. Introduction

In this paper, I begin a study of fermionic quantum field theory by a continuous Wick ordered renormalization group equation (RGE). As an example, I take the standard many-fermion system of solid state quantum field theory, but the method applies to general fermionic models with short-range interactions. I show that a determinant of propagators appears in the RGE and I use a determinant bound to prove that a factorial which would appear in bosonic theories is removed from the recursion for the fermionic Green functions. This may lead to convergence of perturbation theory in the absence of relevant couplings, but I do not address the convergence problem, which is related to the solution of a particular combinatorial recursion, in this paper. A short account of this work has appeared in [13].

Continuous RGEs were invented by Wegner [1] and Wilson [2]. Polchinski [3] found a beautiful way to use them for a proof of perturbative renormalizability of $\phi^4$ theory. His method was simplified in [5], and extended to composite operator renormalization and to gauge theories by Keller and Kopper [6]. Keller [7] also proved local Borel summability. While equivalent to the Gallavotti-Nicolò [8–10] method, the continuous


RGE is much simpler technically. An application of continuous RG methods to nonperturbative bosonic problems [11, 12] requires many new ideas and a combination with cluster expansion techniques, to control the combinatorics. It is one of the points of this paper that for fermions, the straightforward adaption of the method yields a determinant bound which improves the combinatorics of fermionic theories as compared to bosonic ones, and may lead to nonperturbative bounds. The determinant structure is not visible in the form of the flow equation used in [3, 5, 6], because the flow equation in that form is a one-loop equation which has too little structure. A key ingredient for the present analysis is Wick ordering, which was first used in the context of continuous RGEs for scalar field theories by Wieczerkowski [4]. I show that for fermions, the Wick ordered RGE contains a determinant of propagators to which a Gram inequality applies directly. A closer look at the way the Feynman graph expansion is generated by the RGE shows that the sign cancellations bring the combinatorial factors for fermions nearer to that of a planar field theory. This reduction does, however, not lead to a planar field theory in the strict sense because of certain binomial factors in the recursion. The direct application of the determinant bound shown here requires the interaction between the fermions to be short-range. This prevents a straightforward application to systems with abelian gauge fields by simply integrating over the gauge fields. One model to which the method applies directly is the Gross-Neveu model (which has been constructed rigorously [14, 15]). I show here only the most basic power counting bounds by leaving out all relevant and marginal couplings, but it is possible to take them into account by renormalization. A class of physically realistic models with a short-range interaction is that of nonrelativistic many-fermion models. 
In these models, there is a significant complication of the analysis because the singularity of the fermion propagator in momentum space is not at a point, but instead on the Fermi surface, which is a (d − 1)-dimensional subset of momentum space. Only in one dimension, the singularity is pointlike – the “surface” becomes a point. The interest in these models has resurged recently because of the discovery of high-temperature superconductivity. Before that it was taken for granted that Fermi liquid (FL) behaviour holds in all dimensions d ≥ 2, and Luttinger liquid behaviour in one dimension (the latter has been proven [16–18]). At certain doping values, however, strong deviations from FL behaviour are seen in the high-Tc materials. The discussion following these discoveries revealed that the former arguments for FL behaviour contained logical gaps. Like [19–25], the present work is not aimed at an understanding of these deviations, but at the more modest goal of determining first under which conditions FL behaviour occurs. Before doing so, it is necessary to give a definition of what would constitute FL behaviour. At zero temperature, the noninteracting Fermi gas has a discontinuity in the occupation number density; this is also a property one would require of a zero temperature FL. This discontinuity is absent for Luttinger liquids. It is, however, not sufficient for FL behaviour because there is a one-dimensional model which has both such a discontinuity and some Luttinger liquid features in the spectral density [27]. Moreover, in the standard models of many-fermion systems such a step never occurs because superconductivity sets in below a critical temperature, and it smooths out the step in the zero-temperature Fermi distribution. It is thus desirable to give a definition of FL behaviour at temperatures above the critical temperature for superconductivity. 
This is not at all straightforward because there is no clean characterization of FL behaviour at a fixed temperature. I propose to look at a whole range of temperatures and values of the coupling constant to bring out the characteristic features of a FL.


I define an equilibrium Fermi liquid as a system in which the perturbation expansion in the coupling constant λ converges for the skeleton Green functions in the region |λ| log β small enough (here β is the inverse temperature) and where the selfenergy fulfills certain regularity conditions. The skeleton Green functions are defined in detail in Sect. 6; in them, selfenergy insertions are left out, so that the Fermi surface stays fixed. I discuss in Sect. 7 how they are related to the exact Green functions (a suitable choice of the Wick ordering covariance can be used to take the selfenergy insertions into account). The logarithmic dependence of the radius of convergence on β comes from the Cooper instability; the difference between Fermi liquids and Luttinger liquids is in the regularity properties of the selfenergy. This is discussed in detail in Sect. 2.6. The goal is to show that the standard many-fermion systems are Fermi liquids in that sense. A proof of this requires a combination of the regularity techniques of [21–23] for renormalization with the sector technique of [20] in the determinant bound (see Sect. 2.6 for further discussion). I do not give a complete proof in this paper but only part of it, by showing a determinant bound and some of the required regularity properties of the selfenergy in perturbation theory. The hope is that the determinant bound will lead to convergence, so that the method developed here, which is somewhat simpler than e.g. the one in [20], will work nonperturbatively. A different representation for fermionic Green functions that provides a simplification and leads to nonperturbative bounds is given in [28]. In Sect. 2, I review the Grassmann integral for many-fermion systems briefly, to give a self-contained motivation for the study of such systems, and to fix notation. The Fermi liquid criterion is formulated in Sect. 2.6. Sect. 3 contains the general renormalization group equation and the determinant formula Eq. (53). Sect. 
4 contains the determinant bound and an application to systems where the propagator has point singularities. In Sect. 5, I show the existence of the thermodynamic limit for the many-fermion system in perturbation theory. In Sect. 6, I prove bounds on the skeleton selfenergy that are needed to renormalize the full theory in perturbation theory. Again, the RGE in the form of [3, 5, 6] would not be very convenient for this because it is a one-loop equation, whereas the crucial effects for regularity all start at two loops. They can be seen in a simple way in the Wick ordered RGE. Details about Wick ordering and the derivation of the determinant formula Eq. (53) are deferred to the Appendix.

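2. Many-Fermion Systems

The model is defined on a spatial lattice with spacing $\varepsilon$. Continuum models are obtained in the limit $\varepsilon \to 0$; lattice models, such as the Hubbard model, are obtained by fixing $\varepsilon$. Let $d \ge 2$ be the spatial dimension, $\varepsilon > 0$, $L \in \mathbb{R}$ be such that $\frac{L}{2\varepsilon} \in \mathbb{N}$, and let $G$ be any lattice of maximal rank in $\mathbb{R}^d$, e.g., $G = \mathbb{Z}^d$. Let $\Lambda$ be the torus $\Lambda = \varepsilon G / L G$. The number of points of this lattice is $|\Lambda| = (\frac{L}{\varepsilon})^d$. Let $\int_\Lambda d\mathbf{x}\, F(\mathbf{x}) = \varepsilon^d \sum_{\mathbf{x}\in\Lambda} F(\mathbf{x})$ and $\delta_\Lambda(\mathbf{x},\mathbf{x}') = \varepsilon^{-d}\delta_{\mathbf{x},\mathbf{x}'}$. Let $\mathcal{F}_\Lambda$ be the Fock space generated by the spin one half fermion operators satisfying the canonical anticommutation relations [29], i.e. for all $\mathbf{x}, \mathbf{x}' \in \Lambda$,

$$ c_\alpha(\mathbf{x})\, c^+_{\alpha'}(\mathbf{x}') + c^+_{\alpha'}(\mathbf{x}')\, c_\alpha(\mathbf{x}) = \delta_{\alpha\alpha'}\,\delta_\Lambda(\mathbf{x},\mathbf{x}'). \tag{1} $$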

Here $\alpha \in \{-1, 1\}$ is the spin of the fermion in units of $\frac{\hbar}{2}$. The free part of the Hamiltonian $H_\Lambda(c, c^+) = H_0 + \lambda V$ is


M. Salmhofer

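$$ H_0 = -\sum_{\alpha\in\{-1,1\}} \int_\Lambda d\mathbf{x} \int_\Lambda d\mathbf{y}\; T(\mathbf{x},\mathbf{y})\, c^+_\alpha(\mathbf{x})\, c_\alpha(\mathbf{y}). \tag{2} $$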

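For a one-band model on a lattice with fixed spacing $\varepsilon$, $T(\mathbf{x},\mathbf{y}) = t_{\mathbf{x}-\mathbf{y}} = t_{\mathbf{y}-\mathbf{x}}$ describes hopping from a site $\mathbf{y}$ to another site $\mathbf{x}$ with amplitude $t_{\mathbf{x}-\mathbf{y}}$. The interaction is multiplied by a small coupling constant $\lambda$; I assume it to be a normal ordered density-density interaction

$$ V(c, c^+) = -\int_\Lambda d\mathbf{x} \int_\Lambda d\mathbf{y} \sum_{\alpha,\sigma\in\{-1,1\}} v(\mathbf{x}-\mathbf{y})\, c^+_\alpha(\mathbf{x})\, c^+_\sigma(\mathbf{y})\, c_\alpha(\mathbf{x})\, c_\sigma(\mathbf{y}). \tag{3} $$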

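In other words, it is a special type of four-fermion interaction. For instance, the simplest Hubbard model is given by $\lambda = \frac{U}{2}$, where $U$ is the usual Hubbard-$U$, and by $v(\mathbf{x}-\mathbf{y}) = \delta_\Lambda(\mathbf{x},\mathbf{y})$, and the hopping term is $t_{\mathbf{x}-\mathbf{y}} = t$ if $|\mathbf{x}-\mathbf{y}| = 1$ and zero otherwise, where $t$ is the hopping parameter. At temperature $T$ and chemical potential $\mu$, the grand canonical partition function is given by $Z_\Lambda = \mathrm{tr}_{\mathcal{F}_\Lambda}\, e^{-\beta(H_\Lambda - \mu N_\Lambda)}$ with $N_\Lambda = \int_\Lambda d\mathbf{x}\, n(\mathbf{x})$, and $\beta = \frac{1}{k_B T}$. Observables are given by expectation values of functions, mainly polynomials, of the $c$ and $c^+$,

$$ \langle O \rangle_\Lambda = \frac{1}{Z_\Lambda}\, \mathrm{tr}\left( e^{-\beta(H_\Lambda - \mu N_\Lambda)}\, O(c, c^+) \right). \tag{4} $$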

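A basic question is whether the expected values of observables have a finite thermodynamic limit and whether an expansion in $\lambda$ can be used to get their behaviour at small or zero temperature $T$. For instance, one would like to expand the two-point function $\langle c^+(\mathbf{x}) c(\mathbf{y}) \rangle_\Lambda = \sum_{r=0}^\infty \lambda^r G^{L,\varepsilon}_{2,r}(\mathbf{x},\mathbf{y})$. It is by now well-known that the result of a naive expansion of $e^{-\lambda V}$ in powers of $\lambda$ is that at $T = 0$, $\lim_{L\to\infty} G^{L,\varepsilon}_{2,r} = \infty$ for all $r \ge 3$ (see, e.g., [19, 21]). At positive temperature $T$, this unrenormalized expansion converges for $|\lambda| \le \mathrm{const}\; T^d$; see Sect. 4. To get a better $T$-dependence of the radius of convergence, one has to renormalize. Because of the BCS instability, the best one can hope for in general is a bound $|\lambda| \log\frac{1}{T} < \mathrm{const}$ for the region of convergence. This is part of the Fermi liquid criterion formulated below.

2.1. Grassmann integral representation. The standard Grassmann integral representation is obtained by applying the Lie product formula

$$ e^{-\beta(H_\Lambda - \mu N_\Lambda)} = \lim_{n_\tau\to\infty} \left( e^{-\varepsilon_\tau (H_0 - \mu N_\Lambda)}\, e^{-\varepsilon_\tau \lambda V} \right)^{n_\tau} \tag{5} $$

to the trace for $Z_\Lambda$ and $\langle O \rangle$. The spacing in the imaginary-time direction is $\varepsilon_\tau = \frac{\beta}{n_\tau}$. The limit exists in operator norm because on the finite lattice $\Lambda$, all operators are just finite-dimensional matrices. Inserting the orthonormal basis of $\mathcal{F}_\Lambda$ between the factors in Eq. (5) and rearranging, I get $Z_\Lambda = \lim_{n_\tau\to\infty} Z_{\Lambda,n_\tau}$, where $Z_{\Lambda,n_\tau}$ is given by a finite-dimensional Grassmann integral, as follows. Let $n_\tau$ be even and $\mathbb{T} = \{\tau = n\varepsilon_\tau : n \in \mathbb{Z},\ -\frac{n_\tau}{2} \le n < \frac{n_\tau}{2}\}$, let $\boldsymbol{\Lambda} = \mathbb{T} \times \Lambda$, and $\mathcal{A}$ be the Grassmann algebra generated by $\psi_\sigma(x)$, $\bar\psi_\sigma(x)$, with $\sigma \in \{1, -1\}$ and $x = (\tau, \mathbf{x}) \in \boldsymbol{\Lambda}$. Fix some ordering on $\boldsymbol{\Lambda}$ and denote the usual Grassmann measure [30, 31] by $D_{\boldsymbol{\Lambda}}\psi\, D_{\boldsymbol{\Lambda}}\bar\psi = \prod_{x,\sigma} d\psi_\sigma(x)\, d\bar\psi_\sigma(x)$. Then Eq. (5) implies $Z_{\Lambda,n_\tau} = \mathcal{N}_{\boldsymbol{\Lambda}} \int D_{\boldsymbol{\Lambda}}\psi\, D_{\boldsymbol{\Lambda}}\bar\psi\; e^{-S_{\boldsymbol{\Lambda}}(\psi,\bar\psi)}$, where $\mathcal{N}_{\boldsymbol{\Lambda}}$ is a normalization factor that depends on $\varepsilon$, $L$, and $n_\tau$, and where

$$ S_{\boldsymbol{\Lambda}}(\psi,\bar\psi) = \int_{\mathbb{T}} d\tau \left( \sum_\sigma \int_\Lambda d\mathbf{x}\; \bar\psi_\sigma(\tau,\mathbf{x})\, \partial_\tau \psi_\sigma(\tau,\mathbf{x}) - H_\Lambda(\psi(\tau), \bar\psi(\tau)) \right). \tag{6} $$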

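Here I have used the notations $\psi(\tau)(\mathbf{x}) = \psi(\tau,\mathbf{x})$, $\int_{\mathbb{T}} d\tau\, F(\tau) = \varepsilon_\tau \sum_{\tau\in\mathbb{T}} F(\tau)$, and $\partial_\tau\psi_\sigma(\tau) = \varepsilon_\tau^{-1}\left(\psi_\sigma(\tau+\varepsilon_\tau) - \psi_\sigma(\tau)\right)$, and the sum over $\tau$ runs over $\mathbb{T}$, with antiperiodic boundary conditions [32]. For $n_\tau < \infty$ and $L < \infty$, this is a finite-dimensional Grassmann integral. The limit $n_\tau \to \infty$, and afterwards $L \to \infty$, will be taken only for the effective action. No infinite-dimensional Grassmann integration will be required.

To do the Fourier transformation, it will be convenient to deal with periodic functions defined on an interval of double length in $\tau$, and to impose the antiperiodicity as an antisymmetry condition: let $\mathbb{T}_2 = \varepsilon_\tau\mathbb{Z}/2\beta\mathbb{Z}$, in other words, $\mathbb{T}_2 = \{\tau \in \varepsilon_\tau\mathbb{Z} : -\beta \le \tau < \beta\}$ with periodic boundary conditions. Thus the fields $\psi$ and $\bar\psi$ are periodic with respect to translations of $\tau$ by $2\beta$, and antiperiodicity with respect to translations by $\beta$ is imposed by setting

$$ \psi(\tau+\beta,\mathbf{x}) = -\psi(\tau,\mathbf{x}), \qquad \bar\psi(\tau+\beta,\mathbf{x}) = -\bar\psi(\tau,\mathbf{x}) \tag{7} $$

for all $\mathbf{x} \in \Lambda$. With the further notation $\int_{\boldsymbol{\Lambda}} dx\, F(x) = \int_{\mathbb{T}} d\tau \int_\Lambda d\mathbf{x}\, F(\tau,\mathbf{x})$, and $\delta_{\boldsymbol{\Lambda}}(x,x') = \varepsilon_\tau^{-1}\varepsilon^{-d}\,\delta_{\tau,\tau'}\,\delta_{\mathbf{x},\mathbf{x}'}$, the action is $S(\psi,\bar\psi) = S_2(\psi,\bar\psi) + \lambda S_4(\psi,\bar\psi)$, where

$$ S_2(\psi,\bar\psi) = \sum_{\sigma,\sigma'} \int_{\boldsymbol{\Lambda}} dx \int_{\boldsymbol{\Lambda}} dx'\; \bar\psi_\sigma(x)\, a(x,\sigma,x',\sigma')\, \psi_{\sigma'}(x') \tag{8} $$

with

$$ a(x,\sigma,x',\sigma') = \delta_{\sigma\sigma'} \left( (\partial_\tau + \mu)\,\delta_{\boldsymbol{\Lambda}}(x,x') - T(\mathbf{x},\mathbf{x}')\,\delta_{\mathbb{T}_2}(\tau,\tau') \right) \tag{9} $$

and

$$ S_4(\psi,\bar\psi) = \sum_{\sigma,\sigma'} \int_{\boldsymbol{\Lambda}} dx \int_{\boldsymbol{\Lambda}} dx'\; \bar\psi_\sigma(x)\psi_\sigma(x)\, v(\tau,\mathbf{x},\tau',\mathbf{x}')\, \bar\psi_{\sigma'}(x')\psi_{\sigma'}(x') \tag{10} $$

with $v(\tau,\mathbf{x},\tau',\mathbf{x}') = \delta_{\mathbb{T}_2}(\tau,\tau')\, v(\mathbf{x}-\mathbf{x}')$. For the present work, the interaction does not have to be instantaneous. Retardation effects, like from phonons, are allowed. That is, $v(\tau,\mathbf{x},\tau',\mathbf{x}')$ may have a dependence on $\tau$ and need not be local in $\tau$. The operator $a$ appearing in $S_2$ is invertible because the antiperiodicity condition removes the zero modes of the discretized time derivative. In other words, the Matsubara frequencies for fermions are nonzero at positive temperature (this will become explicit in the next section).

2.2. The propagator in Fourier space. Fourier transformation with the antiperiodicity conditions Eq. (7) is described in Appendix A. The Fourier transforms of $\psi$ and $\bar\psi$ are

$$ \hat\psi_\sigma(p) = \int_{\boldsymbol{\Lambda}} dx\; e^{-ipx}\, \psi_\sigma(x), \qquad \hat{\bar\psi}_\sigma(p) = \int_{\boldsymbol{\Lambda}} dx\; e^{-ipx}\, \bar\psi_\sigma(x), \tag{11} $$

where, for $p = (\omega, \mathbf{p})$ and $x = (\tau, \mathbf{x})$, $px = \omega\tau + \mathbf{p}\mathbf{x}$. If $\Lambda^*$ is the dual lattice to $\Lambda$, the momentum $p$ is in $\boldsymbol{\Lambda}^* = \mathbb{M}_{n_\tau} \times \Lambda^*$, where

$$ \mathbb{M}_{n_\tau} = \left\{ \omega_n = \frac{\pi}{\beta}(2n+1) : n \in \mathbb{Z},\ -\frac{n_\tau}{2} \le n < \frac{n_\tau}{2} \right\} \tag{12} $$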


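is the set of Matsubara frequencies $\omega_n$. With the notation $\int_{\boldsymbol{\Lambda}^*} dp\, F(p) = \frac{1}{\beta} \sum_{\omega\in\mathbb{M}_{n_\tau}} \int_{\Lambda^*} d\mathbf{p}\, F(\omega,\mathbf{p})$, where $\int_{\Lambda^*} d\mathbf{p} = L^{-d} \sum_{\mathbf{p}\in\Lambda^*}$, the inverse Fourier transform is $\psi_\sigma(x) = \int_{\boldsymbol{\Lambda}^*} dp\; e^{ipx}\, \hat\psi_\sigma(p)$. The Fourier transform of the hopping term is $\hat T(\mathbf{p},\mathbf{q}) = \delta_{\Lambda^*}(\mathbf{p}+\mathbf{q}, 0)\, \tilde T(\mathbf{q})$, with $\tilde T(\mathbf{q}) = \int_\Lambda d\mathbf{z}\; e^{i\mathbf{q}\mathbf{z}}\, t_{\mathbf{z}}$. Denoting

$$ E(\mathbf{p}) = \tilde T(\mathbf{p}) - \mu, \tag{13} $$

where $\mu$ is the chemical potential,

$$ \hat\omega = \frac{1}{i\varepsilon_\tau} \left( e^{i\varepsilon_\tau\omega} - 1 \right), \tag{14} $$

and $\delta_{\boldsymbol{\Lambda}^*}(p+p', 0) = \delta_{\Lambda^*}(\mathbf{p}+\mathbf{p}', 0)\; \varepsilon_\tau^{-1}\, \delta_{-\omega,\omega'}$, the Fourier transform of the operator $a$ in the quadratic part of the action is

$$ \hat a(p,\sigma,p',\sigma') = \delta_{\boldsymbol{\Lambda}^*}(p+p', 0)\, \delta_{\sigma\sigma'} \left( i\hat\omega' - E(\mathbf{p}') \right). \tag{15} $$

In other words, the matrix with entries $\hat a(p,\sigma,-p',\sigma')$ is diagonal, and for temperature $T = \frac{1}{\beta} > 0$, all diagonal entries are nonzero because

$$ |\mathrm{Re}\, \hat\omega| = |\omega| \left| \frac{\sin(\varepsilon_\tau\omega)}{\varepsilon_\tau\omega} \right| \ge \frac{1}{2\beta}. \tag{16} $$

Thus the inverse of $a$, the propagator $c = a^{-1}$, exists; it has the Fourier transform

$$ \hat c(p,\sigma,p',\sigma') = \delta_{\boldsymbol{\Lambda}^*}(p+p', 0)\, \delta_{\sigma\sigma'}\, \frac{1}{i\hat\omega' - E(\mathbf{p}')}. \tag{17} $$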
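The lower bound in Eq. (16) can be checked numerically. The following Python sketch enumerates the Matsubara frequencies of Eq. (12), evaluates the discretized frequency of Eq. (14), and compares the smallest value of $|\mathrm{Re}\,\hat\omega|$ with $\frac{1}{2\beta}$. The function names and the sample values of $\beta$ and $n_\tau$ are illustrative assumptions, not part of the construction above.

```python
import cmath
import math

def matsubara_frequencies(beta, n_tau):
    """Fermionic frequencies omega_n = pi*(2n+1)/beta for -n_tau/2 <= n < n_tau/2 (n_tau even)."""
    return [math.pi * (2 * n + 1) / beta for n in range(-n_tau // 2, n_tau // 2)]

def omega_hat(omega, eps_tau):
    """Discretized frequency (exp(i*eps_tau*omega) - 1) / (i*eps_tau), as in Eq. (14)."""
    return (cmath.exp(1j * eps_tau * omega) - 1) / (1j * eps_tau)

def min_abs_re_omega_hat(beta, n_tau):
    """Smallest |Re omega_hat| over all Matsubara frequencies at time spacing beta/n_tau."""
    eps_tau = beta / n_tau
    return min(abs(omega_hat(w, eps_tau).real) for w in matsubara_frequencies(beta, n_tau))

if __name__ == "__main__":
    # Sample (beta, n_tau) pairs chosen only for illustration.
    for beta, n_tau in [(1.0, 4), (10.0, 64), (100.0, 1024)]:
        print(beta, n_tau, min_abs_re_omega_hat(beta, n_tau), 1.0 / (2.0 * beta))
```

Since $\mathrm{Re}\,\hat\omega = \varepsilon_\tau^{-1}\sin(\varepsilon_\tau\omega)$, the minimum is attained at the frequencies closest to $0$ and to $\pm\pi/\varepsilon_\tau$; no fermionic frequency sits at either of these zeros of the sine.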

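In the formal continuum limit $\varepsilon_\tau \to 0$, $\hat\omega \to \omega$, so one gets the usual formula $(i\omega - E(\mathbf{p}))^{-1}$. The partition function of the system of independent fermions ($\lambda = 0$) is

$$ \int D_{\boldsymbol{\Lambda}}\psi\, D_{\boldsymbol{\Lambda}}\bar\psi\; e^{(\bar\psi, A\psi)} = \det A = \prod_p \left( i\,\widehat{\omega(p)} - E(\mathbf{p}) \right), \tag{18} $$

which is nonzero by Eq. (16).

2.3. The class of models. Denote the dual to $G$ by $\mathcal{B}$, the first Brillouin zone of the infinite lattice. For instance, for $G = \varepsilon\mathbb{Z}^d$, $\mathcal{B} = \mathbb{R}^d / \frac{2\pi}{\varepsilon}\mathbb{Z}^d$. The assumptions for the class of models are: there is $k_0 \ge 2$ such that the dispersion relation $E \in C^{k_0}(\mathcal{B}, \mathbb{R})$, and for all $\mathbf{p} \in \mathcal{B}$, $E(-\mathbf{p}) = E(\mathbf{p})$ holds. The interaction $\hat v$ is a $C^{k_0}$ function from $\mathbb{R} \times \mathcal{B}$ to $\mathbb{R}$, all its derivatives up to order $k_0$ are bounded functions on $\mathcal{B} \times \mathbb{R}$, $\hat v(-p_0, \mathbf{p}) = \hat v(p_0, \mathbf{p})$, and the limit $p_0 \to \infty$ of $\hat v$ exists and is $C^{k_0}$ in $\mathbf{p}$. There is $g_0 > 0$ such that for all $\mathbf{p}$ on the Fermi surface $S = \{\mathbf{p} : E(\mathbf{p}) = 0\}$, $|\nabla E(\mathbf{p})| \ge g_0$ holds. The Fermi surface is a subset of an $\varepsilon$-independent bounded region of momentum space (hence compact), it is strictly convex and has positive curvature everywhere. In particular, there is $V_1 > 0$ such that for all $L$ and $\varepsilon$,

$$ \int_{\Lambda^*} d\mathbf{k}\; \mathbf{1}\left( |E(\mathbf{k})| \le 2 \right) \le V_1. \tag{19} $$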



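The constant

$$ E_{\max} = \sup_{\mathbf{p}\in\mathcal{B}} |E(\mathbf{p})| \tag{20} $$

is independent of $\varepsilon$. Under these hypotheses, there is $\epsilon_0 > 0$ and a $C^2$-diffeomorphism $\boldsymbol{\pi}$ from $(-2\epsilon_0, 2\epsilon_0) \times S^{d-1}$ to an open neighbourhood of the Fermi surface $S$ in $\mathcal{B}$, $(\rho,\theta) \mapsto \boldsymbol{\pi}(\rho,\theta)$, such that

$$ E(\boldsymbol{\pi}(\rho,\theta)) = \rho \quad\text{and}\quad |\partial_\rho \boldsymbol{\pi}(\rho,\theta)| \le \frac{2}{g_0} \tag{21} $$

(see [21], Lemma 2.1, and [22], Sect. 2.2; $\epsilon_0$ was called $r_0$ there). Let $J(\rho,\theta) = \det \boldsymbol{\pi}'(\rho,\theta)$ and denote

$$ J_0 = \sup_{|\rho|\le\epsilon_0,\ \theta\in S^{d-1}} |J(\rho,\theta)| \quad\text{and}\quad J_1 = \sup_{|\rho|\le\epsilon_0} \int_{S^{d-1}} d\theta\, |J(\rho,\theta)|. \tag{22} $$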

I assume that $\epsilon_0 \le 1$ and (for convenience in stating some bounds) that $\beta\epsilon_0 = \frac{\epsilon_0}{k_B T} \ge 6$. With the units chosen in a natural way, i.e., with typical bandwidths of electron volts, this corresponds to temperatures $T$ up to 1000 Kelvin if $\epsilon_0$ is of order one, which seems a sufficient temperature range to study conduction in crystals. Note, however, that $\epsilon_0$ depends on the Fermi surface and thus on the filling factor. A typical example is the discretized Laplacian

$$ E(\mathbf{k}) = B\, \frac{1}{\varepsilon^2} \sum_{\nu=1}^d \left( 1 - \cos(\varepsilon k_\nu) \right) - \mu. \tag{23} $$

For $\varepsilon \to 0$ and $B = \frac{1}{m}$, $E(\mathbf{k}) \to \mathbf{k}^2/2m - \mu$, the Jellium dispersion relation, which satisfies the above hypotheses if large $|\mathbf{k}|$ are cut off. For $\varepsilon = 1$ and $B = t$, one gets the tight-binding dispersion relation with hopping parameter $t/2$, which satisfies the above hypotheses if $\mu \ne td$ (half-filling). In the limit $\mu \to td$, $\epsilon_0 \to 0$ in the Hubbard model. This implies that to have bounds uniform in the filling, one has to stay away from half-filling. The energy $\epsilon_0$ sets the scale where the low-energy behaviour sets in. The effective four-point interaction at that scale can differ substantially from the original interaction. For a discussion, see Sect. 7 and [24].
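The two limiting cases of Eq. (23) can be illustrated with a short Python sketch. It evaluates the discretized Laplacian for small $\varepsilon$ and compares it with $\mathbf{k}^2/2m - \mu$, and it shows that at half-filling ($\varepsilon = 1$, $B = t$, $\mu = td$) the point $(\pi, 0)$ lies on the Fermi surface while the gradient of $E$ vanishes there, so a bound $|\nabla E| \ge g_0 > 0$ cannot hold. The function names and the sample parameters are assumptions made for this illustration only.

```python
import math

def dispersion(k, eps, B, mu):
    """Discretized Laplacian E(k) = (B/eps^2) * sum_nu (1 - cos(eps*k_nu)) - mu, as in Eq. (23)."""
    return B / eps**2 * sum(1.0 - math.cos(eps * k_nu) for k_nu in k) - mu

def gradient_norm(k, eps, B):
    """|grad E(k)|, using dE/dk_nu = (B/eps) * sin(eps*k_nu)."""
    return math.sqrt(sum((B / eps * math.sin(eps * k_nu)) ** 2 for k_nu in k))

if __name__ == "__main__":
    # Continuum limit: B = 1/m and small eps should approach k^2/(2m) - mu.
    m, mu, k = 2.0, 0.3, (0.7, -0.4)
    jellium = sum(c * c for c in k) / (2.0 * m) - mu
    print(dispersion(k, 1e-3, 1.0 / m, mu), jellium)

    # Tight-binding case eps = 1, B = t at half-filling mu = t*d in d = 2.
    t, d = 1.0, 2
    k_special = (math.pi, 0.0)
    print(dispersion(k_special, 1.0, t, t * d), gradient_norm(k_special, 1.0, t))
```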

2.4. Nambu formalism. It will be useful for deriving the component form of the RGE to rename the Grassmann variables such that the distinction between $\psi$ and $\bar\psi$ is in another index. This is a variant of the usual “Nambu formalism”; see, e.g., [33]. Let

$$ \Gamma = \boldsymbol{\Lambda} \times \{-1, 1\} \times \{1, 2\}, \tag{24} $$

and denote $X = (x, \sigma, i) \in \Gamma$. For $x \in \boldsymbol{\Lambda}$ and $\sigma \in \{-1, 1\}$, the fields are defined as $\psi(x,\sigma,1) = \bar\psi_\sigma(x)$ and $\psi(x,\sigma,2) = \psi_\sigma(x)$. The antiperiodicity condition reads $\psi(x + \beta e_\tau, \sigma, i) = -\psi(x,\sigma,i)$ with $e_\tau$ the unit vector in $\tau$-direction. The Grassmann algebra generated by the $(\psi(X))_{X\in\Gamma}$ is denoted by $\mathcal{A}_\Gamma[\psi]$. Given another set of Grassmann variables $(\eta(X))_{X\in\Gamma}$, the Grassmann algebra generated by the $\psi$ and $\eta$ is denoted by $\mathcal{A}_\Gamma[\psi,\eta]$. Furthermore, denote $\int_\Gamma dX\, F(X) = \sum_{i=1}^2 \sum_{\sigma\in\{-1,1\}} \int_{\boldsymbol{\Lambda}} dx\, F(x,\sigma,i)$ and $\delta_\Gamma((x,\sigma,i),(x',\sigma',i')) = \delta_{ii'}\,\delta_{\sigma\sigma'}\,\delta_{\boldsymbol{\Lambda}}(x,x')$, and define a bilinear form on $\mathcal{A}_\Gamma[\psi,\eta]$ by

$$ (\psi,\eta)_\Gamma = \int_\Gamma dX\; \psi(X)\,\eta(X) = -(\eta,\psi)_\Gamma. \tag{25} $$

Then $S_2 = \frac{1}{2}(\psi, A\psi)_\Gamma$, where, for $X = (x,\sigma,i)$ and $X' = (x',\sigma',i')$,

$$ A(X,X') = \begin{cases} 0 & \text{if } i = i' \\ a(x,\sigma,x',\sigma') & \text{if } i = 1 \text{ and } i' = 2 \\ -a(x',\sigma',x,\sigma) & \text{if } i = 2 \text{ and } i' = 1 \end{cases} \tag{26} $$

with $a$ given by Eq. (9). In other words, when written as a matrix in the index $i$, $A$ takes the form

$$ \begin{pmatrix} 0 & a \\ -a^T & 0 \end{pmatrix} \tag{27} $$

with $(a^T)(x,\sigma,x',\sigma') = a(x',\sigma',x,\sigma)$ denoting the transpose of $a$. Since $a$ is invertible, $A$ is invertible as well.

With this, $Z_{\Lambda,n_\tau} = \mathcal{N}_{\boldsymbol{\Lambda}} \det a\; \tilde Z_{\boldsymbol{\Lambda}}$, where $\tilde Z_{\boldsymbol{\Lambda}} = \int d\mu_C(\psi)\, e^{-\lambda S_4(\psi)}$, where $C = A^{-1}$, and $d\mu_C$ is the linear functional (“Grassmann Gaussian measure”) defined by $d\mu_C(\psi) = (\det a)^{-1}\, D_\Gamma\psi\; e^{\frac{1}{2}(\psi, A\psi)_\Gamma}$. The constant $\mathcal{N}_{\boldsymbol{\Lambda}} \det a$ drops out of all correlation functions and can therefore be omitted. The “measure” $d\mu_C$ is normalized, $\int d\mu_C(\psi) = 1$, and its characteristic function is

$$ \int d\mu_C(\psi)\; e^{(\eta,\psi)_\Gamma} = e^{\frac{1}{2}(\eta,\, C\eta)_\Gamma}. \tag{28} $$

All moments of $d\mu_C$ can be obtained by differentiating Eq. (28) with respect to $\eta$ and setting $\eta = 0$; see also the next subsection.

2.5. The connected Green functions. In the correspondence between the system, as defined by the Hamiltonian $H_\Lambda$ and $\mathcal{F}_\Lambda$, to the Grassmann integral, I have so far only discussed the partition function itself. In the path integral representation of Eq. (4), with a polynomial observable $O(c, c^+)$, one simply gets a factor $O(\psi(0), \bar\psi(0))$ in the Grassmann integral. The $m$-point Green functions of the system determined by $C$ and $V$ are

$$ \left\langle \prod_{k=1}^m \psi(X_k) \right\rangle = \frac{1}{Z_{\boldsymbol{\Lambda}}} \int d\mu_C(\psi)\; e^{-\lambda V(\psi)} \prod_{k=1}^m \psi(X_k). \tag{29} $$

They determine the expected values of all polynomials by linearity. On a finite lattice, the limit $n_\tau \to \infty$ of $\tilde Z_{\boldsymbol{\Lambda}}$ exists by the Lie product formula Eq. (5), and for $\lambda$ small enough (depending on $L$, $\varepsilon$, $\beta$, and $\mu$), it is nonzero since the trace of the matrix $e^{-\beta(H_\Lambda - \mu N_\Lambda)}$ over the finite-dimensional space $\mathcal{F}_\Lambda$ is a continuous function of $\lambda$, which is nonzero at $\lambda = 0$ by Eq. (16) and Eq. (18). A similar argument applies to the numerator of Eq. (4). Thus the limits of numerator and denominator in Eq. (4) as $n_\tau \to \infty$ exist separately. Therefore one can take this limit in numerator and denominator through the same sequence, i.e., take $n_\tau$ to be the same in numerator and denominator. It follows that with the special choice $O(\psi(0), \bar\psi(0))$, all expectation values in the Hamiltonian picture can be expressed as the limit $n_\tau \to \infty$ of Eq. (29), with a special choice of the polynomial in the fields. Thus the correlation functions given by Eq. (29) include as a special case the expectation values of polynomials in the creation and annihilation operators.

Let $(\eta(X))_{X\in\Gamma}$ be a family of Grassmann generators. The partition function with source terms is $Z_\Gamma(\eta) = \int d\mu_C(\psi)\; e^{-\lambda V(\psi) + (\eta,\psi)_\Gamma}$. Let $\frac{\delta}{\delta\eta(X)} = \varepsilon_\tau^{-1}\varepsilon^{-d}\, \frac{\partial}{\partial\eta(X)}$, then

$$ \left\langle \prod_{k=1}^n \psi(X_k) \right\rangle = \frac{1}{Z_\Gamma(0)} \left[ \prod_{k=1}^n \frac{\delta}{\delta\eta(X_k)}\; Z_\Gamma(\eta) \right]_{\eta=0}. \tag{30} $$

Thus, if one knows $Z_\Gamma(\eta)$ one can derive all correlation functions. It is convenient to study the connected correlation functions, defined as

$$ \left\langle \prod_{k=1}^n \psi(X_k) \right\rangle_c = \left[ \prod_{k=1}^n \frac{\delta}{\delta\eta(X_k)}\; \log Z_\Gamma(\eta) \right]_{\eta=0} \tag{31} $$

instead. Since $Z_\Gamma(\eta)$ is the exponential of $\log Z_\Gamma(\eta)$, one can reconstruct all correlation functions from the connected ones. It is even more convenient to transform the sources $\eta$, to get the amputated connected Green functions. They are generated by

$$ G_{\mathrm{eff}}(\chi) = \log \int d\mu_C(\psi)\; e^{-\lambda V(\psi+\chi)}. \tag{32} $$

A shift in the measure shows that $G_{\mathrm{eff}}(\chi) = \frac{1}{2}\left(\chi, C^{-1}\chi\right)_\Gamma + \log Z_\Gamma\left(C^{-1}\chi\right)$, so that the study of $G_{\mathrm{eff}}$ is equivalent to that of $\log Z_\Gamma$. The selfenergy $\Sigma(p)$ is defined as the one-particle irreducible part of the two-point function. In terms of the connected amputated two-point Green function $G_2$, which is the coefficient of the quadratic part (in $\psi$) of the effective action $G_{\mathrm{eff}}(\psi)$, it is $\Sigma(p) = G_2(p)\left(1 - C G_2(p)\right)^{-1}$.

2.6. Criteria for Fermi liquid behaviour. In the following I give a definition of Fermi liquid behaviour which is linked to the question of convergence of the expansion in the coupling constant $\lambda$, and I discuss in some detail the physical motivation for this definition, the results that have been proven in this direction, and its relation to other notions of FL behaviour. In most many-fermion models, one cannot expect the expansion in $\lambda$ to converge uniformly in the temperature, not even after renormalization. In particular, the Cooper instability produces a superconducting ground state, and thus a nonanalyticity in $\lambda$, if the temperature is low enough. This happens even if the initial interaction is repulsive [34, 35]. Nesting instabilities can produce other types of symmetry breaking, such as antiferromagnetic ordering, which may compete or coexist with superconductivity, but the conditions I posed, in particular the curvature of the Fermi surface, remove these instabilities at low temperatures (which temperatures are “low” depends on the scale $\epsilon_0$).

Let the skeleton Green functions be defined as the connected amputated $m$-point correlation functions where selfenergy insertions are left out. These functions are the solution of a natural truncation of the renormalization group equation; they are defined precisely in Sect. 6.

Definition 1. The $d$-dimensional many-fermion system with dispersion relation $E$ and interaction $V$ shows (equilibrium) Fermi liquid behaviour if the thermodynamic limit of the Green functions exists for $|\lambda| < \lambda_0(\beta)$, and if there are constants $M_0, M_1, M_2 > 0$ (independent of $\beta$ and $\lambda$), such that the following holds. The perturbation expansion for the skeleton Green functions converges for all $(\lambda, \beta)$ with $|\lambda| \log\beta < M_0$, and for all $(\lambda, \beta)$ with $|\lambda| \log\beta \le \frac{M_0}{2}$, the skeleton selfenergy $\Sigma_{\mathrm{sk}} : \mathbb{R} \times \mathcal{B} \to \mathbb{C}$ satisfies the regularity conditions:


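1. $\Sigma_{\mathrm{sk}}$ is twice differentiable in $p$ and

$$ \max_{|\alpha|=2} \left\| \partial^\alpha \Sigma_{\mathrm{sk}} \right\|_\infty \le M_1, \tag{33} $$

2. the restriction to the Fermi surface $\Sigma_{\mathrm{sk}}|_{\{0\}\times S} \in C^{k_0}(S, \mathbb{R})$, and

$$ \max_{|\alpha|=k_0} \left\| \partial^\alpha \Sigma_{\mathrm{sk}} \right\|_\infty \le M_2. \tag{34} $$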

Here $k_0 > d$ is the degree of differentiability of the dispersion relation $E$ (given in Sect. 2.3). Nothing is special about the factor $\frac{1}{2}$ in the condition $|\lambda| \log\beta \le \frac{M_0}{2}$. One could instead also have taken any fixed compact subset of $\{z : |z| < M_0\}$. The derivatives mean, when taken in $p_0$, a difference $\frac{\beta}{2\pi}\left( \Sigma_{\mathrm{sk}}(p_0 + \frac{2\pi}{\beta}, \mathbf{p}) - \Sigma_{\mathrm{sk}}(p_0, \mathbf{p}) \right)$. The maximum runs over all multiindices $\alpha \in \mathbb{N}_0^{d+1}$.

This definition only concerns equilibrium properties of Fermi liquid behaviour; it does not touch phenomena like zero sound, which require an analysis of the response to perturbations that depend on real time. It is natural in that it defines a Fermi liquid above the critical temperature for superconductance: at a given $\lambda$, the value of $T$ for which the convergence breaks down is $T_c \propto e^{-M_0/|\lambda|}$, which is the usual BCS formula. Convergence of perturbation theory above $T_c$ implies that the usual Fermi liquid formulas are valid there. Convergence is stated only for skeleton quantities because that is all one can show. This convergence and the regularity properties of the selfenergy imply that the exact Green functions (no restriction to skeletons) are continuous in $\lambda$, that the exact selfenergy $\Sigma$ is $C^2$ in $\lambda$ and $p$, and that $\Sigma$ obeys a bound similar to Eq. (33). The Green functions are not analytic in $\lambda$ because otherwise already the unrenormalized expansion, which diverges termwise, would converge.

The regularity properties (1) and (2) ensure that the exact Green functions can be reconstructed from the skeleton Green functions by renormalization. The usual skeleton expansion argument [36], where finiteness only of the skeleton selfenergy, but not of its derivatives, is shown, is insufficient to do that; one has to prove regularity properties (1) and (2). This was discussed in detail in [22], see also Sect. 7. The condition $k_0 > d$ is necessary to make the regularized propagator summable in position space. It is required in the proof of Lemma 5 (and in the proofs in [20], only that there the dispersion relation was taken $C^\infty$). In the absence of level crossing, the free dispersion relation $E(\mathbf{k})$ is usually even real analytic in $\mathbf{k}$. However, when reconstructing the exact Green functions from the skeleton Green functions, one needs regularity of the dispersion relation of the interacting system, and thus regularity property (2), which is rather hard to verify even in perturbation theory. Thus it is desirable to take the smallest possible $k_0 > d$.

Because $\Sigma$ obeys a bound similar to Eq. (33), one can do the usual first-order Taylor expansion in the momenta to get

$$ \Sigma(p) = p_0\, (\partial_0\Sigma)(0, P(\mathbf{p})) + (\mathbf{p} - P(\mathbf{p})) \cdot \nabla\Sigma(0, P(\mathbf{p})) + \tilde\Sigma(p), \tag{35} $$

from which one obtains a finite wave function renormalization

$$ Z(\mathbf{p}) = 1 + i\,(\partial_0\Sigma)(0, P(\mathbf{p})) \tag{36} $$

and a finite correction to the Fermi velocity, and the Taylor remainder $\tilde\Sigma(p)$ vanishes quadratically in the distance of the momentum $(p_0, \mathbf{p})$ to its projection $(0, P(\mathbf{p}))$ to the Fermi surface.



This property distinguishes Fermi liquids from other possible states of the many-fermion system, such as Luttinger liquids: In one dimension (where “Luttinger liquid behaviour” has been proven [16–18]), the second derivative of even the second order skeleton selfenergy grows like $\beta$ for large $\beta$ and thus violates the condition that the second derivative should be bounded independently of $\beta$ for $|\lambda| \log\beta \le \frac{M_0}{2}$. Note that this distinction can only be made if $\beta$ is allowed to vary; at fixed $\beta$, the requirement that something is bounded independently of $\beta$ is trivial. This is the reason why a whole range of values of $\beta$ and $\lambda$ is included in Definition 1.

A full proof that the models obeying the hypotheses stated in Sect. 2.3 are Fermi liquids in the sense of Definition 1 is not within the scope of this paper, but several ingredients for such a proof are already in place. I now discuss what is known and then briefly state the main results of the present paper.

The analyticity of the skeleton Green functions in $\lambda$ follows for $d = 2$ spatial dimensions from a modification of the method in [20]. The required modification, namely to put in the four-point functions, is not difficult because at positive temperature, these functions have no singularities (in momentum space), but are bounded by a constant times a power of $\log\beta$. Regularity property (1) was proven for all $d \ge 2$ in perturbation theory. The “overlapping loop” method of [21] developed for these proofs applies nonperturbatively as well, so that (1) holds. For $E(\mathbf{k}) = \mathbf{k}^2/2m - \mu$, rotational invariance implies that $\Sigma_{\mathrm{sk}}|_{\{0\}\times S}$ is independent of $\mathbf{p}$, so that (2) is trivially fulfilled. Thus the model with this dispersion relation and a rotation-invariant short-range interaction is the simplest example of a Fermi liquid in $d = 2$. In the case without rotational symmetry (e.g. the Hubbard model), (2) was proven in perturbation theory for $d = 2$ in [21–23], with $k_0 = 2 + h$, $h < \frac{1}{2}$, by use of a classification of graphs without double overlaps [23] and a detailed analysis of their contributions [22]. A nonperturbative implementation of the double overlap technique of [23] has not been given yet, but it should be possible.

For $d = 3$, analyticity has not yet been proven (for a partial result, see [26]). Regularity property (1) was proven in perturbation theory in [21–23]. Property (2) has not even been shown in perturbation theory up to now.

In this paper, I use the continuous RGE to give a largely simplified proof of the existence of the thermodynamic limit (Lemma 3) and of regularity property (1) in perturbation theory, for all $d \ge 2$ (Theorem 6). I show in perturbation theory that only the ladder four-point function (see Definition 3) can produce the logarithmic growth in $\beta$ that leads to the Cooper instability (and hence to the restriction $|\lambda| \log\beta < M_0$), and that all other contributions to the four-point function are bounded uniformly in $\beta$ (Theorem 4). In Sect. 4, I prove a basic power counting theorem (Theorem 1) that takes into account the effect of fermionic sign cancellations by a determinant bound. The determinant appearing in the RGE may lead to analyticity of the skeleton Green functions, but a proof of analyticity is not given here. A simplified proof of (2) can be given for $d = 2$ in perturbation theory by an extension of the methods developed here. For $d = 3$, regularity property (2) requires more than three derivatives of $\Sigma_{\mathrm{sk}}|_{\{0\}\times S}$ to exist. A proof that this is the case looks rather difficult, but the simple structure of the continuous RGE makes it seem within reach of that method.

A natural question is if there are criteria independent of temperature that can be applied also in the zero temperature limit for “Fermi liquid behaviour”. For the class of models specified in Sect. 2.3, the simplest criterion is that the non-ladder skeleton Green functions are analytic in the coupling constant $\lambda$, that the non-ladder skeleton selfenergy


$\Sigma^{(N)}$ is $C^1$, and that $\Sigma^{(N)}|_S$ is $C^{k_0}$, with bounds uniform in $\beta$. The non-ladder skeleton Green functions are obtained by removing all ladder contributions (defined in Sect. 6) to the Green functions. The regularity implies that the Fermi velocity and the wave function renormalization are finite uniformly in $\beta$, and even at zero temperature, for $d \ge 2$ (which is the usual criterion for Fermi liquids), whereas they still diverge in one dimension as $\beta \to \infty$ [16–18].

The Fermi liquid criterion given in Definition 1 is more natural than the one using the non-ladder Green functions: if the regularity conditions (1) and (2) of Definition 1 hold, the transition from the exact Green functions to the skeleton Green functions is a matter of convenience, but the replacement of the skeleton Green functions by the non-ladder skeleton Green functions changes the model drastically because it removes superconductivity. Evidently, a definition referring to a modified model is not as natural. Moreover, as mentioned above, Fermi liquid behaviour is observed only above the critical temperature for superconductance anyway. In Sect. 6, I define the non-ladder skeleton functions precisely and prove the above statements about $\Sigma^{(N)}$ uniformly in the temperature in perturbation theory.

In [25], an asymmetric model, in which the symmetry $E(\mathbf{k}) = E(-\mathbf{k})$ does not hold, was introduced, and a proof was outlined that such models are Fermi liquids down to zero temperature. The asymmetry of the Fermi surface removes the Cooper instability at zero relative momentum $q$ of the Cooper pair, i.e., it implies that the four-point function has no singularity at relative momentum $q = 0$ (which is where the usual Cooper pairing comes from). The regularity properties of the selfenergy, which are crucial for Fermi liquid behaviour, were not proven in [25]. Doing this [22, 23] is quite a bit harder than in the $(\mathbf{k} \to -\mathbf{k})$-symmetric case. At zero temperature, regularity property (1) is replaced by (1′): $\Sigma \in C^{2-}$ because the selfenergy is not $C^2$ at zero temperature (the second derivative grows as a power of $\log\beta$; for a detailed discussion of these problems see [22]). This modified regularity property (1′) and (2) were proven in perturbation theory for a general class of two-dimensional models with a strictly convex Fermi surface, which includes the $(\mathbf{k} \to -\mathbf{k})$-nonsymmetric Fermi surfaces, in [21–23]. More precise conditions on the dispersion relation that imply absence of the Cooper instability also at nonzero relative momentum $q$ were also formulated in [22]. It is not sufficient just to have a $\mathbf{k} \to -\mathbf{k}$ nonsymmetric surface to achieve that; one also needs that the curvature at a point on the Fermi surface and at its antipode differ except at finitely many points (for details, see Hypothesis (H4′) of [22] and the geometrical discussion in Appendix C of [22]).

3. The Renormalization Group Equation

In this section I derive the continuous RGE for fermionic models. I first derive it for the generating function, and then turn to the component form which is obtained by expanding the effective action in Wick ordered monomials of the fields. Let $\Gamma$ be a finite set, for a function $X \mapsto F(X)$ from $\Gamma$ to any linear space let $\int_\Gamma dX\, F(X) = \varepsilon_\Gamma \sum_{X\in\Gamma} F(X)$, where $\varepsilon_\Gamma > 0$ is a constant, let $\delta(X,X') = \varepsilon_\Gamma^{-1}\delta_{XX'}$, and define the bilinear form $(f,g)_\Gamma = \int_\Gamma dX\, f(X)\, g(X)$. Let $\mathcal{A}$ be the finite-dimensional Grassmann algebra generated by the generators $(\psi(X), \chi(X), \eta(X))_{X\in\Gamma}$ and let $\frac{\delta}{\delta\psi(X)}$ be the fermionic derivative normalized such that $\frac{\delta}{\delta\psi(X)}\psi(X') = \delta(X,X')$; recall that the fermionic derivatives anticommute.



3.1. The RGE for the generating function. For $t \ge 0$ let $C_t$ be an invertible, antisymmetric linear operator acting on functions defined on $\Gamma$, i.e. $(C_t f)(X) = \int_\Gamma dX'\, C_t(X,X')\, f(X')$ with

$$ C_t(X',X) = -C_t(X,X'). \tag{37} $$

Let $C_t$ be continuously differentiable in $t$; denote $\frac{\partial C_t}{\partial t} = \dot C_t$. Let $d\mu_{C_t}$ be the linear functional (Grassmann Gaussian measure) with characteristic function

$$ \int d\mu_{C_t}(\psi)\; e^{(\eta,\psi)_\Gamma} = e^{\frac{1}{2}(\eta,\, C_t\eta)_\Gamma}. \tag{38} $$

The integrals of arbitrary monomials are obtained from this formula by taking derivatives with respect to $\eta$. The measure is normalized: $\int d\mu_{C_t}(\psi) = 1$. Let $V(\psi) \in \mathcal{A}$ have no constant part, $\lambda \in \mathbb{C}$, and $G(0,\psi) = \lambda V(\psi)$. The effective action at $t > 0$ is

$$ G(t,\psi) = \log \int d\mu_{C_t}(\chi)\; e^{G(0,\chi+\psi)}. \tag{39} $$

Because the measure is normalized, $G(t,\psi)$ is a well-defined formal power series in $\lambda$. By the nilpotency of the Grassmann variables, $e^{G(t,\psi)}$ is a polynomial in $\lambda$ (the degree of which grows with $|\Gamma|$). Thus $G(t,\psi)$ is analytic in $\lambda$ for $|\lambda| < \lambda_0(\Gamma)$.

Proposition 1. Let

$$ \Delta_{C_t} = \frac{1}{2}\left( \frac{\delta}{\delta\psi},\, C_t \frac{\delta}{\delta\psi} \right)_\Gamma = \frac{1}{2} \int_\Gamma dX \int_\Gamma dX'\; \frac{\delta}{\delta\psi(X)}\, C_t(X,X')\, \frac{\delta}{\delta\psi(X')}. \tag{40} $$

Then

$$ \frac{\partial}{\partial t}\, e^{G(t,\psi)} = \Delta_{\dot C_t}\, e^{G(t,\psi)} \tag{41} $$

and

$$ e^{G(t,\psi)} = e^{\Delta_{C_t}}\, e^{G(0,\psi)}. \tag{42} $$

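If $G(0,\psi)$ is an element of the even subalgebra, then for all $t > 0$, $G(t,\psi)$ is an element of the even subalgebra, and it satisfies the renormalization group equation

$$ \frac{\partial}{\partial t}\, G(t,\psi) = \Delta_{\dot C_t}\, G(t,\psi) + \frac{1}{2}\left( \frac{\delta G(t,\psi)}{\delta\psi},\, \dot C_t\, \frac{\delta G(t,\psi)}{\delta\psi} \right)_\Gamma. \tag{43} $$

Proof. For any $F(\psi) \in \mathcal{A}$, define $F(\frac{\delta}{\delta\eta})$ by replacing every factor $\psi(X)$ by $\frac{\delta}{\delta\eta(X)}$ in the polynomial expression for $F$. Then $F(\psi) = \left[ F(\frac{\delta}{\delta\eta})\, e^{(\eta,\psi)_\Gamma} \right]_{\eta=0}$ (the derivatives $\frac{\delta}{\delta\eta}$ also generate a finite-dimensional Grassmann algebra, so the expansion for $F$ terminates at some power). Since Grassmann integration is a continuous operation, and by Eq. (38),

$$ e^{G(t,\psi)} = \left[ e^{G(0,\frac{\delta}{\delta\eta})} \int d\mu_{C_t}(\chi)\; e^{(\eta,\chi+\psi)_\Gamma} \right]_{\eta=0} = \left[ e^{G(0,\frac{\delta}{\delta\eta})}\; e^{\frac{1}{2}(\eta,\, C_t\eta)_\Gamma}\; e^{(\eta,\psi)_\Gamma} \right]_{\eta=0}. \tag{44} $$

For any formal power series $f(z) = \sum_k f_k z^k$,

$$ f(\Delta_{C_t})\; e^{(\eta,\psi)_\Gamma} = f\left( \tfrac{1}{2}(\eta,\, C_t\eta)_\Gamma \right) e^{(\eta,\psi)_\Gamma}, \tag{45} $$

so $e^{\frac{1}{2}(\eta,\, C_t\eta)_\Gamma + (\eta,\psi)_\Gamma} = e^{\Delta_{C_t}}\, e^{(\eta,\psi)_\Gamma}$. Since $\Delta_{C_t}$ is bilinear in the derivatives, it commutes with all factors that depend only on $\eta$ and can be taken out in front in Eq. (44). This implies Eq. (42). Since $\Delta_{C_t}$ also commutes with $\Delta_{\dot C_t}$, Eq. (41) follows. If $G(0,\psi)$ is an element of the even subalgebra, the same holds for $G(t,\psi)$ by Eq. (42), since every application of $\Delta_{C_t}$ removes two fields. Thus performing the derivatives with respect to $\psi$ gives Eq. (43).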
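The identity $e^{\Delta_{C}}\, e^{(\eta,\psi)_\Gamma} = e^{\frac{1}{2}(\eta, C\eta)_\Gamma}\, e^{(\eta,\psi)_\Gamma}$ used in the proof can be verified by brute force on a small Grassmann algebra. The following Python sketch does this for $|\Gamma| = 4$ with $\varepsilon_\Gamma = 1$. The dictionary representation of Grassmann elements, the helper names, and the sample antisymmetric matrix are all assumptions made for this illustration; the derivative is implemented as a left derivative.

```python
from itertools import product

# A Grassmann element is a dict {sorted tuple of generator indices: coefficient}.
# Generators 0..n-1 play the role of psi(X), generators n..2n-1 that of eta(X).

def mul(a, b):
    """Product of two elements; reordering the generators contributes a sign."""
    out = {}
    for (ma, ca), (mb, cb) in product(a.items(), b.items()):
        if set(ma) & set(mb):
            continue  # a repeated generator gives zero
        seq = ma + mb
        inv = sum(1 for i in range(len(seq)) for j in range(i + 1, len(seq)) if seq[i] > seq[j])
        key = tuple(sorted(seq))
        out[key] = out.get(key, 0.0) + (-1) ** inv * ca * cb
    return {k: v for k, v in out.items() if abs(v) > 1e-12}

def add(a, b, scale=1.0):
    """Return a + scale * b."""
    out = dict(a)
    for k, v in b.items():
        out[k] = out.get(k, 0.0) + scale * v
    return {k: v for k, v in out.items() if abs(v) > 1e-12}

def deriv(g, a):
    """Left derivative with respect to generator g (move g to the front, then drop it)."""
    out = {}
    for mono, c in a.items():
        if g in mono:
            pos = mono.index(g)
            out[mono[:pos] + mono[pos + 1:]] = (-1) ** pos * c
    return out

def laplacian(C, a, n):
    """Delta_C a = 1/2 * sum_{X,X'} d/dpsi(X) C[X][X'] d/dpsi(X') a."""
    out = {}
    for x in range(n):
        for y in range(n):
            if C[x][y] != 0.0:
                out = add(out, deriv(x, deriv(y, a)), 0.5 * C[x][y])
    return out

def series(step, start):
    """Sum start + step(start)/1! + step(step(start))/2! + ... until the terms vanish."""
    result, term, k = dict(start), dict(start), 0
    while term:
        k += 1
        term = {m: c / k for m, c in step(term).items()}
        result = add(result, term)
    return result

def close(a, b, tol=1e-9):
    return set(a) == set(b) and all(abs(a[k] - b[k]) < tol for k in a)

def check():
    n = 4
    C = [[0.0, 0.7, -1.3, 0.2],
         [-0.7, 0.0, 0.4, 0.9],
         [1.3, -0.4, 0.0, -0.5],
         [-0.2, -0.9, 0.5, 0.0]]  # sample antisymmetric covariance
    psi = [{(x,): 1.0} for x in range(n)]
    eta = [{(n + x,): 1.0} for x in range(n)]
    eta_psi, half_eta_C_eta = {}, {}
    for x in range(n):
        eta_psi = add(eta_psi, mul(eta[x], psi[x]))
        for y in range(n):
            half_eta_C_eta = add(half_eta_C_eta, mul(eta[x], eta[y]), 0.5 * C[x][y])
    source = series(lambda t: mul(t, eta_psi), {(): 1.0})        # e^{(eta, psi)}
    gauss = series(lambda t: mul(t, half_eta_C_eta), {(): 1.0})  # e^{(eta, C eta)/2}
    lhs = series(lambda t: laplacian(C, t, n), source)           # e^{Delta_C} e^{(eta, psi)}
    rhs = mul(gauss, source)
    return lhs, rhs

if __name__ == "__main__":
    lhs, rhs = check()
    print(close(lhs, rhs))
```

With four sites the series for $e^{\Delta_C}$ has a nonvanishing second-order term, so the check goes beyond the linear case $f(z) = z$ of Eq. (45).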

3.2. The component RGE in position space. Let $V(\psi)$ be an element of the even subalgebra. The effective action has the expansion

$$ G(t,\psi) = \sum_{r=1}^\infty \lambda^r\, G_r(t,\psi) \tag{46} $$

with $G_r(t,\psi)$ polynomials in the Grassmann algebra. As explained in Sect. 3, this expansion converges for $\Gamma$ finite, but the radius of convergence $\lambda_0$ depends on $\Gamma$ and $C_t$. For the models discussed in Sect. 2, this means that $\lambda_0$ goes to zero in the limit $L \to \infty$ and $\beta \to \infty$.

Assume that $C = \lim_{t\to\infty} C_t$ exists and let $D_t = C - C_t$ so that $\dot C_t = -\dot D_t$. The application will be that $C$ is the covariance of the model and $C_t$ is part of it, so that in $G(t,\psi)$, part of the fields have been integrated over. $D_t$ is then the covariance of the unintegrated fields. I expand the polynomial $G_r(t,\psi) \in \mathcal{A}$ in the basis for the Grassmann algebra given by the Wick ordered monomials $\Omega_{D_t}(\psi(X_1) \ldots \psi(X_p))$,

$$ G_r(t,\psi) = \sum_{m=0}^{\bar m(r)} \int_{\Gamma^m} d\underline{X}\;\; G_{mr}(t \mid \underline{X})\;\; \Omega_{D_t}\!\left( \prod_{k=1}^m \psi(X_k) \right), \tag{47} $$

where $G_{mr}(t \mid X_1, \ldots, X_m)$ is the connected, amputated $m$-point Green function and $\underline{X} = (X_1, \ldots, X_m)$. Details about Wick ordering are provided in Appendix B. A short formula is

$$ \Omega_{D_t}(\psi(X_1) \ldots \psi(X_p)) = e^{-\Delta_{D_t}}\, \psi(X_1) \ldots \psi(X_p). \tag{48} $$

I use the symbol $\Omega_{D_t}(\psi(X_1) \ldots \psi(X_p))$ rather than $:\psi(X_1) \ldots \psi(X_p):$ to indicate clearly with respect to which covariance Wick ordering is done, because this will be important. The $G_{mr}(t \mid X_1, \ldots, X_m)$ are assumed to be totally antisymmetric, that is, for all $\pi \in S_m$, $G_{mr}(t \mid X_{\pi(1)}, \ldots, X_{\pi(m)}) = \varepsilon(\pi)\, G_{mr}(t \mid X_1, \ldots, X_m)$, because any part of $G$ that is not antisymmetric would cancel in Eq. (47).

Application of $\frac{\partial}{\partial t}$ to Eq. (47) gives a sum of two terms since two factors depend on $t$. By Eq. (48),

$$ \frac{\partial}{\partial t}\, \Omega_{D_t}(\psi(X_1) \ldots \psi(X_p)) = -\Delta_{\dot D_t}\, \Omega_{D_t}(\psi(X_1) \ldots \psi(X_p)) = \Delta_{\dot C_t}\, \Omega_{D_t}(\psi(X_1) \ldots \psi(X_p)). \tag{49} $$

When multiplied by $G_{mr}(t \mid X_1,\dots,X_m)$ and integrated over $X_1,\dots,X_m$, this gives $\dot\Delta_{C_t} G(t,\psi)$. Thus the term linear in $G$ drops out of Eq. (43) by Wick ordering with respect to $D_t$, and Eq. (43) now reads

\[ \sum_{m=0}^{\bar m(r)} \int_{\Gamma^m} dX\; \Omega_{D_t}\!\left( \prod_{k=1}^{m} \psi(X_k) \right) \frac{\partial}{\partial t} G_{mr}(t \mid X) = \frac12\, Q_r(t,\psi), \tag{50} \]

where $Q_r(t,\psi)$ is defined by

\[ \left( \frac{\delta G(t,\psi)}{\delta\psi},\; \dot C_t\, \frac{\delta G(t,\psi)}{\delta\psi} \right) = \sum_{r=1}^{\infty} \lambda^r Q_r(t,\psi). \tag{51} \]

Being an element of the Grassmann algebra, $Q_r(t,\psi)$ has the representation

\[ Q_r(t,\psi) = \sum_{m=0}^{\bar m(r)} \int_{\Gamma^m} dX\; Q_{mr}(t \mid X)\; \Omega_{D_t}\!\left( \prod_{k=1}^{m} \psi(X_k) \right). \tag{52} \]

To obtain the $Q_{mr}(t \mid X)$, one has to rewrite the product of the two Wick monomials in Eq. (51). This is done in Appendix C. The result is

Proposition 2. $Q_{m1}(t \mid X) = 0$, and for $r \ge 2$,

\[ Q_{mr}(t \mid X) = \int d\kappa_{mr} \int_{\Gamma^i} dV \int_{\Gamma^i} dW \left( -\frac{\partial}{\partial t} \det D_t^{(i)}(V,W) \right) G_{m_1 r_1}(t \mid \underline{X}_1, V)\; G_{m_2 r_2}(t \mid \tilde W, \underline{X}_2), \tag{53} \]

where $\int d\kappa_{mr}$ stands for the sum

\[ \int d\kappa_{mr}(r_1,m_1,r_2,m_2,i)\, F(r_1,m_1,r_2,m_2,i) = \sum_{\substack{r_1,r_2 \ge 1 \\ r_1+r_2=r}} \;\sum_{(m_1,m_2,i) \in M_{r_1 r_2 m}} \kappa_{m_1 m_2 i}\, F(r_1,m_1,r_2,m_2,i) \tag{54} \]

with positive weights $\kappa_{m_1 m_2 i} = \binom{m_1}{i}\binom{m_2}{i}$, so that $d\kappa_{mr}$ is a positive measure. $M_{r_1 r_2 m}$ is the set of $(m_1,m_2,i)$ such that $i \ge 1$, $1 \le m_1 \le \bar m(r_1)$, $1 \le m_2 \le \bar m(r_2)$, $m_1 + m_2 = m + 2i$, and $m_1$ and $m_2$ are even. $\underline{X}_1 = (X_1,\dots,X_{m_1-i})$, $\underline{X}_2 = (X_{m_1-i+1},\dots,X_m)$, $V = (V_1,\dots,V_i)$, $W = (W_1,\dots,W_i)$, and $\tilde W = (W_i,\dots,W_1)$, and $D_t^{(i)}(V,W)$ is the $i \times i$ matrix $(D_t^{(i)}(V,W))_{kl} = D_t(V_k,W_l)$.

Comparison of the coefficients gives the component form of the RGE,

\[ \frac{\partial}{\partial t} G_{mr}(t \mid X_1,\dots,X_m) = \frac12\, \mathcal{A}_m Q_{mr}(t \mid X_1,\dots,X_m), \tag{55} \]

where $\mathcal{A}_m$ is the antisymmetrization operator

\[ (\mathcal{A}_m f)(X_1,\dots,X_m) = \frac{1}{m!} \sum_{\pi \in S_m} \varepsilon(\pi)\, f(X_{\pi(1)},\dots,X_{\pi(m)}). \tag{56} \]
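As a small illustration of Eq. (56), the following Python sketch applies the antisymmetrization operator to an ordinary function of $m$ arguments. It is not part of the original argument: the names `sign`, `antisymmetrize` and the test function `f` are hypothetical, and exact rational arithmetic is used only so that the checks are free of rounding.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based indices (via cycle lengths)."""
    s = 1
    seen = [False] * len(perm)
    for start in range(len(perm)):
        if seen[start]:
            continue
        j, cycle_len = start, 0
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            cycle_len += 1
        if cycle_len % 2 == 0:  # an even-length cycle is an odd permutation
            s = -s
    return s


def antisymmetrize(f, m):
    """Return A_m f with (A_m f)(x_1..x_m) = 1/m! * sum_pi sign(pi) f(x_pi(1)..x_pi(m))."""
    def Af(*xs):
        total = Fraction(0)
        for perm in permutations(range(m)):
            total += sign(perm) * Fraction(f(*(xs[p] for p in perm)))
        return total / factorial(m)
    return Af


# Hypothetical test function with no symmetry of its own.
f = lambda x, y, z: x * y**2 * z**3
Af = antisymmetrize(f, 3)
```

The result changes sign under exchange of two arguments, vanishes when two arguments coincide, and is unchanged by a second application of the operator, which is the projection property used implicitly when only the antisymmetric part of $Q_{mr}$ is kept in Eq. (55).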

The important feature of Eq. (53) is that the determinant of the propagators appears in this equation. The Gram bound for this determinant improves the combinatorics by a factorial. I now discuss the graphical interpretation of the equation and the determinant, to motivate why this improvement can be regarded as a "planarization" of the graphs.

3.3. The graphical interpretation. The component form of the RGE has a straightforward graphical interpretation. If one associates the vertex drawn in Fig. 1 to $G_{mr}(t \mid X_1,\dots,X_m)$, adopts the convention that the variables occurring on the internal lines of a graph are integrated, and writes out the determinant as

\[ \det D_t^{(i)}(V,W) = \sum_{\pi \in S_i} \varepsilon(\pi) \prod_{k=1}^{i} D_t(V_k, W_{\pi(k)}), \tag{57} \]

the right-hand side of Eq. (53) appears as the signed sum over graphs with two vertices $G_{m_1 r_1}(t)$ and $G_{m_2 r_2}(t)$, obtained by joining leg number $m_1 - i + k$ of vertex 1 with leg number $i - \pi(k) + 1$ of vertex 2, for all $k \in \{1,\dots,i\}$. The graph for $\pi = \mathrm{id}$ is drawn in Fig. 2. The expansion in terms of Feynman graphs is generated by iteration of the equivalent integral equation

\[ G_{mr}(t \mid X) = G_{mr}(0 \mid X) + \frac12\, \mathcal{A}_m \int_0^t ds\; Q_{mr}(s \mid X) \tag{58} \]

with the initial condition $G_{mr}(0 \mid X) = -\delta_{r1} V_m(X)$, where $V_m(X)$ is the coefficient of $\Omega_C(\psi(X_1)\dots\psi(X_m))$ in the original interaction. It is evident from Fig. 2 that only connected graphs contribute to this sum.
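The permutation expansion in Eq. (57) and the leg-joining rule stated after it can be made concrete with a short Python sketch. This is only an illustration under the conventions of the text; the function names `det_leibniz` and `joined_legs` are hypothetical, and the matrix `D` is an arbitrary numerical stand-in for the propagator matrix.

```python
from itertools import permutations


def sign(perm):
    """Sign of a 0-based permutation tuple, computed from its inversion count."""
    n = len(perm)
    inversions = sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
    return -1 if inversions % 2 else 1


def det_leibniz(D):
    """Signed sum over permutations: sum_pi sign(pi) * prod_k D[k][pi(k)]."""
    i = len(D)
    total = 0
    for perm in permutations(range(i)):
        term = sign(perm)
        for k in range(i):
            term *= D[k][perm[k]]
        total += term
    return total


def joined_legs(m1, i, perm):
    """1-based pairs (leg of vertex 1, leg of vertex 2) for the graph labelled by perm.

    Leg m1 - i + k of vertex 1 is joined with leg i - pi(k) + 1 of vertex 2;
    perm is 0-based, so pi(k) = perm[k - 1] + 1.
    """
    return [(m1 - i + k, i - (perm[k - 1] + 1) + 1) for k in range(1, i + 1)]


D = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]  # arbitrary stand-in for D_t(V_k, W_l)
```

Each of the $i!$ terms of `det_leibniz` corresponds to one graph; for $\pi = \mathrm{id}$, `joined_legs(4, 2, (0, 1))` pairs the last two legs of vertex 1 with the first two legs of vertex 2 in reversed order.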

Fig. 1. The vertex corresponding to $G_{mr}(t)$

Note that the only planar graphs appearing in this sum are from $\pi(j) = j + k \bmod i$ for $k \in \{0,\dots,i-1\}$, and that for $k \ne 0$, these permutations produce planar graphs only if $i = m_1$ or $i = m_2$. For all other permutations, the graphs arising are nonplanar. For bosons, the determinant is replaced by a permanent, and one can permute the integration variables so that the derivative of the permanent (and hence the sum over permutations) gets replaced by

\[ i\; i!\; \left( \frac{\partial}{\partial t} D_t(V_1,W_1) \right) \prod_{k=2}^{i} D_t(V_k,W_k), \tag{59} \]

so that the planar graph drawn in Fig. 2 is the only one contributing to the right-hand side. Thus this factor $i!$ distinguishes between the combinatorics of the exact theory and a "planarized" theory, in which $i!$ is replaced by $i^2$ (the second factor $i$ comes from doing the derivative in the determinant, see Sect. 4). The "planarized" theory does contain more than the sum over all planar graphs because of the binomial factors (the antisymmetrization operation $\mathcal{A}_m$ does not change the combinatorics because it contains an explicit factor $1/m!$). In the next section, I bound the determinant by $\mathrm{const}^i$ and thereby reduce the combinatorics of the fermionic theory to that of the planarized theory.

Fig. 2. A graph contributing to the right hand side of the RGE

4. Fermionic Sign Cancellations

4.1. The determinant bound. Equation (53) already suggests that a determinant bound similar to the one used in Lemma 1 of [20] can be applied to the RGE. Before applying this bound, the derivative with respect to $t$ has to be performed, and some factors need to be arranged to avoid the factor $|\Lambda|$ that appeared in [20]. The reason it does not appear here is that only connected graphs contribute to the effective action (whereas the partition function itself was bounded in Lemma 1 of [20]). In the RGE, this is very easily seen without a reference to graphs. Since the determinant is multilinear in the columns of the matrix, the derivative with respect to $t$ produces a sum of terms where every column gets differentiated. Expanding along each differentiated column gives

\[ \frac{\partial}{\partial t} \det D_t^{(i)}(V,W) = \sum_{l=1}^{i} \sum_{l'=1}^{i} (-1)^{l+l'}\, \dot D_t(V_l, W_{l'})\, \det D_t^{(i-1)}(V^{(l)}, W^{(l')}) \tag{60} \]

with $V^{(l)} = (V_1,\dots,V_{l-1},V_{l+1},\dots,V_i)$ and a similar expression for $W^{(l')}$. The sign is cancelled by rearranging

\[ \begin{aligned} G_{m_1 r_1}(t \mid \underline{X}_1, V) &= (-1)^{i-l}\, G_{m_1 r_1}(t \mid \underline{X}_1, V^{(l)}, V_l), \\ G_{m_2 r_2}(t \mid \tilde W, \underline{X}_2) &= (-1)^{i-l'}\, G_{m_2 r_2}(t \mid W_{l'}, \tilde W^{(l')}, \underline{X}_2). \end{aligned} \tag{61} \]

Upon renaming of the integration variables, the summand becomes independent of $l$ and $l'$, so the sum gives a factor $i^2$. Thus, with the overall minus sign carried over from Eq. (53),

\[ Q_{mr}(t \mid X) = \int d\kappa_{mr}\; i^2 \int dV \int dW\, \bigl(-\dot D_t(V,W)\bigr) \int dY \int dZ\; \det D_t^{(i-1)}(Y,Z)\; G_{m_1 r_1}(t \mid \underline{X}_1, Y, V)\; G_{m_2 r_2}(t \mid W, \tilde Z, \underline{X}_2). \tag{62} \]
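The double cofactor sum in Eq. (60) can be checked numerically on a small example. The Python sketch below is an illustration only and makes the following assumptions: the matrix is $3 \times 3$, its entries depend linearly on $t$ as $D(t) = A + tB$ (so that $\dot D = B$), and the matrices `A`, `B` are arbitrary. All names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def det(M):
    """Determinant by the permutation expansion (adequate for small matrices)."""
    n = len(M)
    total = Fraction(0)
    for perm in permutations(range(n)):
        inversions = sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
        term = Fraction(-1 if inversions % 2 else 1)
        for k in range(n):
            term *= M[k][perm[k]]
        total += term
    return total


def minor(M, l, lp):
    """Matrix M with row l and column lp removed (0-based)."""
    n = len(M)
    return [[M[r][c] for c in range(n) if c != lp] for r in range(n) if r != l]


A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]  # arbitrary choice
B = [[0, 1, 1], [1, 0, 2], [2, 1, 0]]  # arbitrary choice, plays the role of dD/dt


def D(t):
    return [[A[k][l] + t * B[k][l] for l in range(3)] for k in range(3)]


def ddet_cofactor(t):
    """Right-hand side of Eq. (60): sum over (l, l') of sign * dD[l][l'] * minor determinant."""
    Dt = D(t)
    return sum((-1) ** (l + lp) * B[l][lp] * det(minor(Dt, l, lp))
               for l in range(3) for lp in range(3))


def ddet_finite_difference(t, h=Fraction(1, 1000)):
    """Central difference approximation of d/dt det D(t)."""
    return (det(D(t + h)) - det(D(t - h))) / (2 * h)
```

For this linear family, $\det D(t)$ is a cubic polynomial in $t$, so the central difference differs from the cofactor sum only by a term of order $h^2$.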

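Let $|||\cdot|||$ be the norm [14]

\[ |||F_m||| = \max_{p \in \{1,\dots,m\}}\; \sup_{X_p} \int \prod_{\substack{q=1 \\ q \ne p}}^{m} dX_q\; |F_m(X_1,\dots,X_m)|. \tag{63} \]

Lemma 1. Assume that

\[ \sup_{Y,Z} \left| \det D_t^{(i-1)}(Y,Z) \right| \le A_{i-1}(t). \tag{64} \]

Then

\[ |||Q_{mr}(t)||| \le \int d\kappa_{mr}\; i^2 A_{i-1}(t)\; |||\dot D_t|||\; |||G_{m_1 r_1}(t)|||\; |||G_{m_2 r_2}(t)|||. \tag{65} \]

Proof. Let $p \in \{1,\dots,m\}$. Without loss of generality, let $p \le m_1 - i$, so that $X_p$ is a component of $\underline{X}_1$ (the other case is similar by the symmetry of the sum for $Q_{mr}(t)$ in $m_1$ and $m_2$). By Eq. (64),

\[ |||Q_{mr}(t)||| \le \int d\kappa_{mr}\; i^2 A_{i-1}(t)\; \sup_{X_p} \int \prod_{\substack{q=1 \\ q \ne p}}^{m_1-i} dX_q \int dV\; \varphi(V, \underline{X}_1)\, \chi(V) \tag{66} \]

with

\[ \varphi(V, \underline{X}_1) = \int dY\; |G_{m_1 r_1}(t \mid \underline{X}_1, Y, V)| \tag{67} \]

and

\[ \chi(V) = \int dW\, dZ\, d\underline{X}_2\; \left| \dot D_t(V,W)\; G_{m_2 r_2}(t \mid W, \tilde Z, \underline{X}_2) \right|. \tag{68} \]

The bound

\[ \chi(V) \le \sup_{V} \int dW\; |\dot D_t(V,W)|\; \sup_{W} \int dZ\, d\underline{X}_2\; \left| G_{m_2 r_2}(t \mid W, \tilde Z, \underline{X}_2) \right| \tag{69} \]

gives the result.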

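From now on I assume that $\Lambda$ is a discrete torus and that $\Lambda^*$ is its dual. Let $N$ be a set with $|N| = n$ elements, let $\Gamma = \Lambda \times N \times \{1,2\}$ and $\Gamma^* = \Lambda^* \times N \times \{1,2\}$. $N$ can be thought of as the index set containing spin and colour indices. The last factor $\{1,2\}$ distinguishes between the usual $\psi$ and $\bar\psi$, as discussed in Sect. 2.

Lemma 2. Let $X = (x,\sigma,j) \in \Gamma$, and for every $k \in \Lambda^*$, let $M(k)$ be a symmetric matrix in $M(n,\mathbb{R})$ with eigenvalues $m_\rho(k)$ satisfying $|m_\rho(k)| \le 1$. Let $f_t : \Lambda^* \to \mathbb{C}$, and let

\[ D_t(X,X') = \delta_{j',3-j} \int_{\Lambda^*} dk\; e^{ik(x-x')}\, (-1)^j f_t((-1)^j k)\, M_{\sigma,\sigma'}((-1)^j k). \tag{70} \]

Then $D_t(X',X) = -D_t(X,X')$, and

\[ \left| \det D_t^{(i-1)}(Y,Z) \right| \le \left( \int_{\Lambda^*} dk\, |f_t(k)| \right)^{i-1}. \tag{71} \]

Proof. Since $\Lambda^* = -\Lambda^*$ and $(-1)^{j'} = -(-1)^j$, a change of variables $k \to (-1)^j k$ implies

\[ D_t(X,X') = \delta_{j+j',3}\, (-1)^j \int_{\Lambda^*} dk\; f_t(k)\, M_{\sigma,\sigma'}(k)\; e^{ik((-1)^j x + (-1)^{j'} x')}. \tag{72} \]

The antisymmetry of $D_t$ now follows from the symmetry of $M$. Let $\Pi_\rho(k)$ be the spectral projection to the eigenspace of $m_\rho(k)$, so that $M(k) = \sum_\rho m_\rho(k)\Pi_\rho(k)$, and denote the scalar product on the spin space by $[\cdot,\cdot]$ so that $M(k)_{\sigma\sigma'} = [e_\sigma, M(k)\, e_{\sigma'}]$, where the $e_\sigma$ are orthonormal. Then

\[ D_t(X,X') = \int_{\Lambda^*_1} dk \int_0^{2\pi} \frac{d\varphi}{2\pi} \sum_\rho \bigl[ a_t(X)(k,\varphi,\rho),\; \Pi_\rho(k)\, b_t(X')(k,\varphi,\rho) \bigr] = \langle a_t(X), b_t(X') \rangle \tag{73} \]

with $\Lambda^*_1 = \{k \in \Lambda^* : f_t(k) \ne 0\}$, and

\[ \begin{aligned} a_t(X)(k,\varphi,\rho) &= (-1)^j\, e^{-ik(-1)^j x}\, e^{-i\varphi(j-3)}\, |f_t(k)|^{\frac12}\, e_\sigma, \\ b_t(X')(k,\varphi,\rho) &= e^{ik(-1)^{j'} x'}\, e^{i\varphi j'}\, f_t(k)\, m_\rho(k)\, |f_t(k)|^{-\frac12}\, e_{\sigma'}. \end{aligned} \tag{74} \]

Gram's bound [37],

\[ \left| \det D_t^{(i-1)}(Y,Z) \right| \le \prod_{k=1}^{i-1} \bigl( \langle a_t(Y_k), a_t(Y_k) \rangle\, \langle b_t(Z_k), b_t(Z_k) \rangle \bigr)^{\frac12}, \tag{75} \]

and

\[ \sum_\rho |m_\rho(k)|^2\, \langle e_\sigma, \Pi_\rho(k) e_\sigma \rangle \le \sum_\rho \langle e_\sigma, \Pi_\rho(k) e_\sigma \rangle \le 1 \tag{76} \]

imply Eq. (71).

Corollary 1. If $f_t(k) = \tilde D_t(k)$ and $M_{\sigma\sigma'}(k) = \delta_{\sigma\sigma'}$ (as in the above many-fermion systems) then

\[ \left| \det D_t^{(i-1)}(Y,Z) \right| \le \left( \int_{\Lambda^*} dk\, \bigl| \tilde D_t(k) \bigr| \right)^{i-1}. \tag{77} \]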
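The mechanism behind Eq. (75) is Gram's inequality: a matrix of scalar products $M_{kl} = \langle a_k, b_l \rangle$ has $|\det M| \le \prod_k \|a_k\|\,\|b_k\|$, with no factorial. The following Python sketch checks this on random real vectors. It is an illustration only; the finite-dimensional real vectors stand in for the functions $a_t(Y_k)$, $b_t(Z_k)$ of the proof, and all names are hypothetical.

```python
import math
import random


def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    n = len(M)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[p][c] == 0.0:
            return 0.0
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for cc in range(c, n):
                M[r][cc] -= factor * M[c][cc]
    return d


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def gram_sides(n, dim, rng):
    """Return (|det <a_k, b_l>|, prod_k |a_k| |b_k|) for random real vectors."""
    a = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    b = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    M = [[dot(a[k], b[l]) for l in range(n)] for k in range(n)]
    lhs = abs(det(M))
    rhs = 1.0
    for k in range(n):
        rhs *= math.sqrt(dot(a[k], a[k])) * math.sqrt(dot(b[k], b[k]))
    return lhs, rhs


rng = random.Random(0)  # fixed seed so that the run is reproducible
samples = [gram_sides(n, dim, rng) for n in (2, 3, 5) for dim in (5, 8) for _ in range(20)]
```

In contrast, bounding each of the $n!$ terms of the permutation expansion separately would give $n!\,\prod_k \|a_k\|\,\|b_k\|$, which is the factorial that the determinant structure removes.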

4.2. Power counting for point singularities. In this section, I show some basic power counting bounds for the Green functions obtained from the truncation that all marginal or relevant couplings are left out.

Theorem 1. Assume that $D_t$ is of the form Eq. (70), that

\[ \int_{\Lambda^*} dk\, \bigl| \tilde D_t(k) \bigr| \le \Delta_1 e^{-t}, \tag{78} \]

and that

\[ |||\dot D_t||| \le \Delta_2 e^{t}. \tag{79} \]

Let $\int d\tilde\kappa_{mr}$ denote the measure obtained by restricting the sum in $\int d\kappa_{mr}$ to $m_1 \ge 4$ and $m_2 \ge 4$ and replacing $G_{4,r_k}$ by $v\,\delta_{r_k,1}$ whenever it appears in the sum on the right-hand side of Eq. (53). Then the solution $\tilde G_{mr}(t)$ of Eq. (55) with initial condition $G_{mr}(0) = G^{(0)}_{mr}$ satisfies

\[ |||\tilde G_{mr}(t)||| \le \begin{cases} \gamma_{mr}\, e^{t(\frac{m}{2}-2)} & \text{if } m \ge 6 \\ \gamma_{4r}\,(1+t) & \text{if } m = 4 \\ \gamma_{2r} & \text{if } m = 2 \end{cases} \tag{80} \]

with $\gamma_{mr}$ defined recursively as

\[ \gamma_{mr} = |||G^{(0)}_{mr}||| + \frac{3\Delta_2}{m} \int d\tilde\kappa_{mr}\; i^2\, \Delta_1^{i-1}\, \gamma_{m_1 r_1} \gamma_{m_2 r_2}. \tag{81} \]

Proof. Induction in $r$, with Eq. (80) and Eq. (81) as the inductive hypothesis. The case $r = 1$ is trivial. Let $r \ge 2$, and the statement hold for all $r' < r$. By Lemma 2, Eq. (64) holds with $A_{i-1}(t) = \Delta_1^{i-1} e^{-t(i-1)}$. Thus

\[ |||Q_{mr}(t)||| \le \Delta_2 \int d\tilde\kappa_{mr}\; i^2\, \Delta_1^{i-1}\, e^{-t(i-2)}\; |||\tilde G_{m_1 r_1}(t)|||\; |||\tilde G_{m_2 r_2}(t)|||. \tag{82} \]

By the inductive hypothesis and $m_1 + m_2 - 2i = m$, $|||\cdot|||$ of the right-hand side of Eq. (58) is bounded by

\[ |||G_{mr}(0)||| + \int d\tilde\kappa_{mr}\; i^2\, \Delta_1^{i-1} \Delta_2\, \gamma_{m_1 r_1} \gamma_{m_2 r_2}\; \frac12 \int_0^t ds\; e^{s(\frac{m}{2}-2)}. \tag{83} \]

To complete the induction step, this has to be bounded by the right hand side of Eq. (80). If $m \ge 6$, $\frac{m}{2} - 2 \ge \frac{m}{6} \ge 1$, so

\[ \frac12 \int_0^t ds\; e^{s(\frac{m}{2}-2)} \le \frac{1}{m-4}\, e^{t(\frac{m}{2}-2)} \le \frac{3}{m}\, e^{t(\frac{m}{2}-2)}. \tag{84} \]

If $m = 4$, $\frac12 \int_0^t ds \le \frac{2}{m}(1+t)$. If $m = 2$, $\frac{m}{2} - 2 = -1$, so $\frac12 \int_0^t ds\; e^{s(\frac{m}{2}-2)} \le \frac12 = \frac{1}{m}$. This implies Eq. (80), with $\gamma_{mr}$ given by Eq. (81).
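The three elementary integral bounds used in the induction step (Eq. (84) for $m \ge 6$ and the two cases $m = 4$, $m = 2$) can be checked directly from the closed form of the integral. The Python sketch below is only a numerical sanity check; the function names are hypothetical.

```python
import math


def half_integral(m, t):
    """Closed form of (1/2) * integral_0^t exp(s * (m/2 - 2)) ds."""
    a = m / 2 - 2
    if a == 0:
        return t / 2
    return (math.exp(a * t) - 1) / (2 * a)


def claimed_bound(m, t):
    """Right-hand sides used in the proof: Eq. (84) for m >= 6, and the m = 4, m = 2 cases."""
    if m >= 6:
        return 3 / m * math.exp(t * (m / 2 - 2))
    if m == 4:
        return 2 / m * (1 + t)
    if m == 2:
        return 1 / m
    raise ValueError("m must be an even integer >= 2")


checks = [(m, t, half_integral(m, t), claimed_bound(m, t))
          for m in (2, 4, 6, 8, 10)
          for t in (0.0, 0.5, 1.0, 5.0, 20.0)]
```

In all three cases the prefactor is at most $3/m$, which is where the factor $3\Delta_2/m$ in Eq. (81) comes from.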

Remark 1. In graphical language, the truncation removes all two-legged insertions that require renormalization and all nontrivial four-legged insertions ('nontrivial' means that the four-legged vertices are still there).

Remark 2. The solution to Eq. (81) is bounded by the solution to the untruncated recursion $\gamma_{m1} = v\,\delta_{m4}$, and for $r \ge 2$,

\[ \gamma_{mr} = \gamma^{(0)}_{mr} + A\, \frac{1}{m} \int d\kappa_{mr}\; i^2 B^{i-1}\, \gamma_{m_1 r_1} \gamma_{m_2 r_2} \tag{85} \]

with $A = 3\Delta_2$ and $B = \Delta_1$. The constants $A$ and $B$ can be scaled out of the recursion. If the initial interaction is a four-fermion interaction, $g_{m,r} = m\gamma_{mr} A^{-1} B^{1-\frac{m}{2}}$ satisfies the recursion $g_{m,1} = w\,\delta_{m4}$, with $w = 4AB$, and for $r \ge 2$,

\[ g_{m,r} = \sum_{s=1}^{r-1}\; \sum_{\substack{\mu=2 \\ \mu \text{ even}}}^{2(s+1)}\; \sum_{k \ge 0} \binom{\mu-1}{k} \binom{m+2k+1-\mu}{k}\; g_{\mu,s}\; g_{m+2k+2-\mu,\,r-s}. \tag{86} \]

I do not provide bounds for the solution $g_{m,r}$ here; if $g_{m,r} \le \mathrm{const}^r$, then the above bounds imply that $\sum_r \lambda^r \tilde G_{mr}(t)$ is analytic in $\lambda$. This behaviour is suggested by the absence of the factor $i!$ that would appear for bosons in the recursion; see [13].
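The recursion of Remark 2 for $g_{m,r}$ is finite at each order and easy to iterate. The Python sketch below evaluates it for small $r$, reading Eq. (86) as a sum over $s$, even $\mu$ and $k \ge 0$ with $i = k+1$, $m_1 = \mu$, $m_2 = m + 2k + 2 - \mu$. This reading, the restriction to even $m_2 \ge 2$ (taken from the definition of $M_{r_1 r_2 m}$), the name `g` and the value `W = 1` are assumptions made for the illustration.

```python
from functools import lru_cache
from math import comb

W = 1  # hypothetical value of the coupling w in g_{m,1} = w * delta_{m,4}


@lru_cache(maxsize=None)
def g(m, r):
    """Coefficient g_{m,r} from the recursion, for even m >= 2 and r >= 1."""
    if m < 2 or m % 2:           # only even m >= 2 occur in the recursion
        return 0
    if r == 1:
        return W if m == 4 else 0
    total = 0
    for s in range(1, r):
        for mu in range(2, 2 * (s + 1) + 1, 2):
            left = g(mu, s)
            if left == 0:
                continue
            for k in range(mu):  # binom(mu - 1, k) vanishes for k > mu - 1
                m2 = m + 2 * k + 2 - mu
                if m2 < 2:
                    continue
                total += comb(mu - 1, k) * comb(m2 - 1, k) * left * g(m2, r - s)
    return total


table = {(m, r): g(m, r) for r in (1, 2, 3) for m in (2, 4, 6, 8, 10)}
```

At second order only the pairing of two four-legged vertices contributes, with $k+1$ contracted lines giving $\binom{3}{k}^2$; the values also vanish for $m > 2r + 2$, in line with the support property used later for the functions $I_{mr}$.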


The propagators $C_t$ and $D_t$ for the renormalization group equation are defined using a partition of unity $\chi_1 + \chi_2 = 1$, $\chi_i \in C^\infty(\mathbb{R}_0^+, [0,1])$ with

\[ \chi_1(x) = \begin{cases} 1 & \text{if } x \le \frac14 \\ 0 & \text{if } x \ge 1, \end{cases} \tag{87} \]

$\chi_1'(x) < 0$ for all $x \in (\frac14, 1)$, and $\|\chi_1'\|_\infty \le 2$.

Proposition 3. Let $\Lambda$ be $d$-dimensional, and $B$ be the infinite-volume limit of $\Lambda^*$. For instance, for $\Lambda = \varepsilon\mathbb{Z}^d / L\mathbb{Z}^d$, with $\frac{L}{2\varepsilon} \in 2\mathbb{N}$, $B = \mathbb{R}^d / \frac{2\pi}{\varepsilon}\mathbb{Z}^d$. Let $D_t$ be given by Eq. (70) with $\|M(k)\| \le 1$, $f_t(0) = 0$, and for $k \ne 0$

\[ f_t(k) = \tilde f(k)\, \chi_1(e^{2t} \tilde f(k)^2), \tag{88} \]

with $\tilde f \in C^{d+1}(B \setminus \{0\}, \mathbb{C})$, satisfying for all $|\alpha| \le d+1$ and all $|k| \le 1$, $|D^\alpha \tilde f(k)| \le F_d\, |k|^{1-d-|\alpha|}$. Then Eq. (78) and Eq. (79) hold.

Proof. Equation (78) holds because $\int_{\Lambda^* \setminus \{0\}} |k|^{1-d} \chi_1(k^2 e^{2t})\, dk$ is a Riemann sum approximation to the convergent integral $\int_B |k|^{1-d} \chi_1(k^2 e^{2t}) \frac{d^dk}{(2\pi)^d}$. The infinite-volume analogue of Eq. (79) is usually proven by integration by parts, using repeatedly

\[ (x_\nu - x'_\nu)\, \dot D_t(x - x') = -\int_B \frac{d^dk}{(2\pi)^d}\; e^{i(x-x')k}\, \frac{\partial}{\partial k_\nu} \hat{\dot D}_t(k), \tag{89} \]

which implies that $|\dot D_t(X,X')|$ falls off at least as $(1 + e^{-t}|x-x'|)^{-d-1}$ for large $|x-x'|$. On the torus at finite $L$, one iterates instead the summation by parts formula

\[ \dot D(x - x')\,(1 - e^{ia(x-x')}) = \int_{\Lambda^*} dk\; e^{ik(x-x')} \left( \hat{\dot D}(k) - \hat{\dot D}(k-a) \right), \tag{90} \]

which holds for all $a \in \Lambda^*$, decomposes $\Lambda$ into 2d parts and chooses $a$ appropriately to get an analogue of Eq. (89). This works uniformly in $L$ because $M(0) = 0$. A similar argument is given in more detail in the proof of Lemma 5.

Remark 3. Let $d \ge 2$, $\Lambda = \varepsilon\mathbb{Z}^d / L\mathbb{Z}^d$ and

\[ L_\varepsilon(k) = \frac{2}{\varepsilon^2} \sum_{\nu=1}^{d} (1 - \cos(\varepsilon k_\nu)). \tag{91} \]

The choices $\tilde f(k) = L_\varepsilon(k)^{\frac{1-d}{2}}$, $M = 1$ (the toy model of [20]), and

\[ \tilde f(k) = \left( \sum_{\nu=1}^{d} \frac{1}{\varepsilon^2} \sin^2(k_\nu \varepsilon) + (\varepsilon L_\varepsilon(k))^2 \right)^{-1/2} \tag{92} \]

and

\[ M(k) = \tilde f(k) \left( \sum_{\nu=1}^{d} \frac{1}{\varepsilon}\, i\gamma_\nu \sin(k_\nu \varepsilon) + \varepsilon L_\varepsilon(k) \right) \tag{93} \]

(Wilson fermions) satisfy the hypotheses of Proposition 3. For all $t > 0$, the infinite-volume limit $L \to \infty$ and the continuum limit $\varepsilon \to 0$ of the Green functions $\tilde G_{mr}(t)$ exist and satisfy Eq. (80). In the first case, $\tilde f(k) \to |k|^{1-d}$, in the second case, $M(k)\tilde f(k) \to \not{p}/p^2$ as $\varepsilon \to 0$. The Euclidean Dirac matrices satisfy the Clifford algebra $\gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu = \delta_{\mu\nu}$, and can be chosen hermitian, $\gamma_\mu^* = \gamma_\mu$. It is possible to adapt the matrix structure of Lemma 2 to satisfy the antisymmetry condition on $D_t$ also for this case.

Remark 4. In ultraviolet renormalizable theories, the signs in the exponents of Eq. (78) and Eq. (79) are reversed. A power counting theorem similar to the infrared power counting Theorem 1 can be proven provided that in the initial interaction, at most quartic polynomials appear. The statement is then that for $m \ge 6$, the Green function is bounded by $\mathrm{const}\; e^{-t(\frac{m}{2}-2)}$.

5. The Thermodynamic Limit of the Many-Fermion System

In this section, I apply the RGE to the many-fermion systems defined in Sect. 2. For $d = 1$, the singularity is pointlike, so the power counting bound Theorem 1 applies. For $d \ge 2$, the analogue of Theorem 1 gives only weaker bounds, which, e.g., in $d = 2$ would mean that the four-point function is still relevant and the six-point function is marginal. This is not the actual behaviour; showing better bounds in $d \ge 2$ requires a refinement using the sector technique of [20] and is deferred to another paper. I also show a simple bound for the full Green functions that takes into account the sign cancellations. If the coefficients $\gamma_{mr}$ given by Eq. (85) are exponentially bounded, this bound implies that the unrenormalized expansion converges in a region $|\lambda|\beta^d < \mathrm{const}$. I also give a simple proof that the thermodynamic limit exists in perturbation theory. One can also show bounds on the expansion coefficients for finite $n_\tau$ and $L$. Since this is mainly a tedious repetition of the infinite-volume proofs with integrals replaced by Riemann sums (it also requires that $\mu$ is chosen such that the Fermi surface contains no points of the finite-$L$ momentum space lattice), I will content myself with indicating where this is necessary.

For convenience, I call the limit $n_\tau \to \infty$ and $L \to \infty$ the thermodynamic limit, although the first limit would more aptly be called the time-continuum limit. The limit $n_\tau \to \infty$ has to be taken first, because I want to apply Eq. (5) for operators on a finite-dimensional space only. However, it will turn out that for $T > 0$, the order of the two limits does not matter. Since I want to take the limit $n_\tau \to \infty$, I can assume that $n_\tau \ge 2\beta(\epsilon_0 + E_{\max})$. Thus Lemma 9 applies, with $E_0 = \epsilon_t$, for all $t \ge 0$.

5.1. The component RGE in Fourier space. Let $\boldsymbol{\Lambda} = \mathbb{T} \times \Lambda$ and $\boldsymbol{\Lambda}^* = \mathbb{M}_{n_\tau} \times \Lambda^*$, as given in Sects. 2.1 and 2.2. Let $\Gamma = \boldsymbol{\Lambda} \times \{-1,1\} \times \{1,2\}$ and $\Gamma^* = \boldsymbol{\Lambda}^* \times \{-1,1\} \times \{1,2\}$.
The Fourier transforms of the $G_{mr}(t)$ are, with $P = (P_1,\dots,P_m)$,

\[ \hat G_{mr}(t \mid P) = \int_{\boldsymbol{\Lambda}^m} \prod_{k=1}^{m} dx_k\; e^{-i(p_1 x_1 + \dots + p_m x_m)}\; G_{mr}(t \mid X). \tag{94} \]

For $K = (k,\sigma,j) \in \Gamma^*$ let $\sim\! K = (-k,\sigma,3-j)$, and let $\delta_{\Gamma^*}(K,K') = \delta_{\boldsymbol{\Lambda}^*}(k,k')\,\delta_{\sigma\sigma'}\,\delta_{jj'}$. Assume that the Fourier transform of the propagator $D_t$ is of the form

\[ \hat D_t(K,K') = \delta_{\Gamma^*}(K, \sim\! K')\; \bar D_t(K'), \tag{95} \]

where

\[ \bar D_t(K) = (-1)^j\, \tilde D_t((-1)^j k). \tag{96} \]

This combination of signs implies that $D_t(X,Y) = -D_t(Y,X)$. The propagator $\tilde D_t$ for the many-fermion system is given in Eq. (103). The Fourier transform of $Q_{mr}(t)$ is

\[ \hat Q_{mr}(t \mid P) = \int d\kappa_{mr}\; i!\, i \int dK_1 \dots dK_i\; \bigl(-\dot{\bar D}_t(K_1)\bigr) \prod_{j=2}^{i} \bar D_t(K_j)\; \hat G_{m_1 r_1}(t \mid P^{(1)}, K)\; \hat G_{m_2 r_2}(t \mid \sim\! K, P^{(2)}) \tag{97} \]

with $P^{(1)} = (P_1,\dots,P_{m_1-i})$, $P^{(2)} = (P_{m_1-i+1},\dots,P_m)$, $K = (K_1,\dots,K_i)$, and $\sim\! K = (\sim\! K_i,\dots,\sim\! K_1)$. Here the arguments of $G_{m_2 r_2}(t)$ have been permuted and relabelled such that the determinant is transformed into a sum with the same sign for all terms (see Appendix C); this gives the extra factor $i!$. By translation invariance in space $x$ and time $\tau$,

\[ \hat G_{mr}(t \mid P_1,\dots,P_m) = \delta_{\boldsymbol{\Lambda}^*}(p_1 + \dots + p_m, 0)\; I_{mr}(t \mid P_1,\dots,P_m) \tag{98} \]

with a totally antisymmetric function $I_{mr}(t \mid P_1,\dots,P_m)$ of $(P_1,\dots,P_m) \in \Gamma^{*m}$ that satisfies

\[ \sum_{\mu=1}^{m} \nabla_{p_\mu} I_{mr}(t \mid P_1,\dots,P_m) = 0. \tag{99} \]

A priori, Eq. (98) implies only the existence of a function $\hat I_{mr}(t \mid P_1,\dots,P_m)$, defined only for those $P_1,\dots,P_m$ for which $p_1 + \dots + p_m = 0$. However, since $H = \{(p_1,\dots,p_m) : p_1 + \dots + p_m = 0\}$ is a linear subspace of $(\boldsymbol{\Lambda}^*)^m$, one can simply extend $\hat I$ to a function on all space by defining $I = \hat I \circ \Pi_H$ with $\Pi_H$ the projection to the subspace $H$. Since $\Pi_H$ is symmetric in all its arguments, $I$ is totally antisymmetric, and Eq. (99) holds because $(1,\dots,1) \perp H$ (in Eq. (99), $\nabla_{p_\mu}$ is a difference operator which becomes the gradient in the limit where the momenta become continuous).

The product of the two $\delta_{\boldsymbol{\Lambda}^*}$ in Eq. (97) can be combined to cancel the $\delta_{\boldsymbol{\Lambda}^*}$ in the relation between $\hat G$ and $I$, and to remove the integration over $k_1$. Thus the RGE in Fourier space is $\frac{\partial}{\partial t} I_{mr}(t \mid P) = \frac12 \mathcal{A}_m Q_{mr}(t \mid P)$, with

\[ Q_{mr}(t \mid P) = \int d\kappa_{mr}\; i!\, i \sum_{\sigma_1, j_1} \int dK_2 \dots dK_i\; \bigl(-\dot{\bar D}_t(K_1)\bigr) \prod_{s=2}^{i} \bar D_t(K_s)\; I_{m_1 r_1}(t \mid P^{(1)}, K)\; I_{m_2 r_2}(t \mid \sim\! K, P^{(2)}), \tag{100} \]

where $K_1 = (k_1,\sigma_1,j_1)$ and $k_1$ is fixed as $k_1 = -(k_2 + \dots + k_i + p_1 + \dots + p_{m_1-i})$. I use the same symbol for the $Q$ in position and in momentum space since it will be always clear from the context which one is meant.

Remark 5. The equivalent integral equation is

\[ I_{mr}(t \mid P) = I_{mr}(0 \mid P) + \frac12\, \mathcal{A}_m \int_0^t ds\; Q_{mr}(s \mid P). \tag{101} \]

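The sum $\int d\kappa_{mr}$ contains the sum over $r_1 \ge 1$ and $r_2 \ge 1$, with the restriction $r_1 + r_2 = r$. Thus only $r_1 < r$ and $r_2 < r$ occur in this sum. Therefore Eq. (101), together with an initial condition $I_{mr}(0 \mid P)$, uniquely determines the family of functions $I_{mr}(t \mid P)$. Iteration of Eq. (101) generates the usual perturbation expansion.

5.2. Bounds on the finite-volume propagator. The propagator $\hat c$ for the many-fermion system is given in Eq. (17). If $\mathbf{k}$ is such that $E(\mathbf{k}) = 0$, then $\hat c$ becomes of order $\beta$ for small $|\omega|$. In the temperature zero limit, $\beta \to \infty$, this becomes a singularity. This is the reason why renormalization is necessary. The renormalization group flow is parametrized by $t \ge 0$, where

\[ \epsilon_t = \epsilon_0\, e^{-t} \tag{102} \]

is a decreasing energy scale. The fixed energy scale $\epsilon_0$ was specified in Sect. 2.3. The limit of interest is $t \to \infty$. The uncutoff propagator for the many-fermion system is given in Eq. (17). Thus, for $k = (\omega, \mathbf{k}) \in \boldsymbol{\Lambda}^*$ let

\[ \tilde D_t(k) = \frac{1}{i\hat\omega - E(\mathbf{k})}\; \chi_1\!\left( \epsilon_t^{-2}\, |i\hat\omega - E(\mathbf{k})|^2 \right), \tag{103} \]

where $\hat\omega$ is defined in Eq. (14), $\chi_1$ is given in Eq. (87), and define $\tilde C_t(k)$ similarly, with $\chi_1$ replaced by $\chi_2 = 1 - \chi_1$. Then $\tilde C_t(k) + \tilde D_t(k) = (i\hat\omega - E(\mathbf{k}))^{-1}$ is independent of $t$. The functions $\tilde C_t$ and $\tilde D_t$ define operators $\hat D_t(K,K')$ and $\hat C_t(K,K')$ on the functions on $\Gamma^*$ by Eq. (96) and Eq. (95). Denote $\dot{\tilde D}_t = \frac{\partial}{\partial t} \tilde D_t$, and let $\mathbb{1}(A) = 1$ if the event $A$ is true and $0$ otherwise.

Proposition 4. $\tilde D_t$ is a $C^\infty$ function of $t$ that vanishes identically if $t > \log\frac{\beta\epsilon_0}{2}$. If $t \le \log\frac{\beta\epsilon_0}{2}$, then

\[ \begin{aligned} \operatorname{supp} \tilde D_t &\subset \{k \in \boldsymbol{\Lambda}^* : |i\hat\omega - E(\mathbf{k})| \le \epsilon_t\}, \\ \operatorname{supp} \dot{\tilde D}_t &\subset \{k \in \boldsymbol{\Lambda}^* : \tfrac12 \epsilon_t \le |i\hat\omega - E(\mathbf{k})| \le \epsilon_t\}. \end{aligned} \tag{104} \]

Moreover

\[ \begin{aligned} \bigl| \dot{\tilde D}_t(k) \bigr| &\le 4\epsilon_t^{-1}\; \mathbb{1}\bigl( |i\hat\omega - E(\mathbf{k})| \le \epsilon_t \bigr) \le 2\beta\; \mathbb{1}\bigl( |i\hat\omega - E(\mathbf{k})| \le \epsilon_t \bigr), \\ \bigl| \tilde D_t(k) \bigr| &\le \frac{\beta}{2}\; \mathbb{1}\bigl( |i\hat\omega - E(\mathbf{k})| \le \epsilon_t \bigr), \end{aligned} \tag{105} \]

\[ \int_{\boldsymbol{\Lambda}^*} dk\; \bigl| \dot{\tilde D}_t(k) \bigr| \le 4V_1 \quad\text{and}\quad \int_{\boldsymbol{\Lambda}^*} dk\; \bigl| \tilde D_t(k) \bigr| \le V_1 \log\frac{\beta\epsilon_0}{2}, \tag{106} \]

where $V_1$ is the constant in Eq. (19).

Proof. $\chi_1(x) = 0$ if $x \ge 1$, so $\tilde D_t(k) \ne 0$ implies $|i\hat\omega - E(\mathbf{k})| \le \epsilon_t \le 1$. By Lemma 9, this implies $|\omega| \le \frac{\pi}{2}\epsilon_t$. Since $|\omega| \ge \frac{\pi}{\beta}$, $\tilde D_t \ne 0$ only for $t \le \log\frac{\beta\epsilon_0}{2}$. Since $|i\hat\omega - E(\mathbf{k})| \ge |\mathrm{Re}\,\hat\omega| \ge \frac{2}{\beta}$, the stated properties of $\tilde D_t$ follow. The $t$-derivative gives $|\frac{\partial}{\partial t}\tilde D_t(k)| = 2\epsilon_t^{-2}\, |i\hat\omega - E(\mathbf{k})|\; |\chi_1'(\epsilon_t^{-2} |i\hat\omega - E(\mathbf{k})|^2)|$. Since $\chi_1'(x) = 0$ for $x \notin (\frac14, 1)$, $\dot{\tilde D}_t(k) \ne 0$ implies $\frac12\epsilon_t \le |i\hat\omega - E(\mathbf{k})| \le \epsilon_t$, which implies Eq. (105) and Eq. (104). Equation (106) follows from these inequalities by Lemma 9, by Eq. (19), and by $\tilde D_t = -\int_t^{\log(\beta\epsilon_0/2)} ds\; \dot{\tilde D}_s$.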


Remark 6. The bounds Eq. (106) are crude because the restriction $|E(\mathbf{k})| \le \epsilon_t$ was replaced by $|E(\mathbf{k})| \le 2$ when Eq. (19) was applied. To get a better bound, one has to require that no point of the finite-volume lattice in momentum space is on the Fermi surface $S$, because only then $\int_{\Lambda^*} d\mathbf{k}\; \mathbb{1}\bigl(|E(\mathbf{k})| \le \epsilon_t\bigr) \le \mathrm{const}\; \epsilon_t$ holds uniformly in $L$.

5.3. Existence of the thermodynamic limit in perturbation theory. The proof of existence of the thermodynamic limit will proceed inductively in $r$, because of the recursive structure of the RGE mentioned in Remark 5. It will be an application of the dominated convergence theorem to Eq. (101). To this end, it is necessary to make the integration region independent of $n_\tau$ and $L$. Although $I_{mr}(t)$, given by Eq. (101), appears evaluated at $P = (P_1,\dots,P_m) \in \Gamma^{*m}$ on the RHS of Eq. (101) only, the integral defines the $t$-derivative of a function defined on $\Gamma^{*\,m}_\infty$, where $\Gamma^*_\infty = \mathbb{M}(\beta) \times B \times \{1,-1\} \times \{1,2\}$ with $B = G^*$ the first Brillouin zone of the infinite lattice, and

\[ \mathbb{M}(\beta) = \left\{ \omega_n = \frac{\pi}{\beta}(2n+1) : n \in \mathbb{Z} \right\} \tag{107} \]

the set of Matsubara frequencies in the limit $n_\tau \to \infty$. For a bounded function $F_m : \Gamma^{*\,m}_\infty \to \mathbb{C}$, let

\[ |F_m|_0 = \sup_{P \in \Gamma^{*\,m}_\infty} |F_m(P)|. \tag{108} \]

Up to now, the dependence of $I_{mr}(t)$ on $(n_\tau, L)$ was not denoted explicitly. I now put it in a superscript and write $I^{(n_\tau,L)}_{mr}$. Let $\lim_{(n_\tau,L)\to\infty} = \lim_{L\to\infty} \lim_{n_\tau\to\infty}$.

Lemma 3. Let $(I^{(n_\tau,L)}_{mr}(0))_{m,r,n_\tau,L}$ be a family of bounded functions such that $I^{(n_\tau,L)}_{mr}(0) = 0$ if $m > 2r+2$, $\lim_{(n_\tau,L)\to\infty} I^{(n_\tau,L)}_{mr}(0) = I_{mr}(0)$ exists and is a bounded function on $\Gamma^{*\,m}_\infty$, and $|I^{(n_\tau,L)}_{mr}(0)|_0 \le K^{(0)}_{mr}$. Let $(I^{(n_\tau,L)}_{mr}(t))_{m,r,n_\tau,L}$ be the solution to Eq. (101). Then, for all $m$ and $r$, $I^{(n_\tau,L)}_{mr}(t) = 0$ if $m > 2r+2$, and

\[ \frac{\partial}{\partial t} I^{(n_\tau,L)}_{mr}(t) = 0 \quad\text{for all } t > \log\frac{\beta\epsilon_0}{2}, \tag{109} \]

there are bounded functions $I_{mr}(t) : \Gamma^{*\,m}_\infty \to \mathbb{C}$ such that

\[ \lim_{(n_\tau,L)\to\infty} I^{(n_\tau,L)}_{mr}(t) = I_{mr}(t). \tag{110} \]

Let $P_{mr}$ be the polynomials defined recursively as

\[ P_{mr}(x) = 6^{1-r}\, |I_{mr}(0)|_0 + \epsilon_0^{-1} \int d\kappa_{mr}\; i\; i!\; x^{i-1}\, P_{m_1 r_1}(x)\, P_{m_2 r_2}(x) \tag{111} \]

(in particular, the coefficients of $P_{mr}$ are independent of $\beta$), then for all $n_\tau$, $L$, $\beta$, $t$,

\[ \bigl| I^{(n_\tau,L)}_{mr}(t) \bigr|_0 \le (\epsilon_0\beta)^{r-1}\, P_{mr}\!\left( 4V_1 \log\frac{\beta\epsilon_0}{2} \right). \tag{112} \]

Proof. Induction in $r$, with the statement of the lemma as the inductive hypothesis. Let $r = 1$. Since $r_1 \ge 1$ and $r_2 \ge 1$, the right-hand side of the equation is zero, so $I^{(n_\tau,L)}_{m1}(t,P) = I^{(n_\tau,L)}_{m1}(0,P)$ for all $t$. Thus the statement follows from the hypotheses on $I^{(n_\tau,L)}_{mr}(0)$.

Let $r \ge 2$, and the statement hold for all $r' < r$. Let $m > 2r+2$. Then $I^{(n_\tau,L)}_{mr}(0) = 0$. The five-tuple $(m_1,r_1,m_2,r_2,i)$ contributes to the right-hand side only if $r_1 + r_2 = r$, $m_1 + m_2 = m + 2i$, $i \ge 1$, and by the inductive hypothesis, only if $m_1 \le 2r_1+2$ and $m_2 \le 2r_2+2$. Thus $I^{(n_\tau,L)}_{mr}(t)$ can be nonzero only if $m = m_1 + m_2 - 2i \le m_1 + m_2 - 2 \le 2r_1 + 2 + 2r_2 + 2 - 2 = 2r + 2$.

The integral appearing on the right-hand side of Eq. (101) is a Riemann sum approximation to an integral over the $(t,n_\tau,L)$-independent region $s \in [0,\infty)$, $k_j \in \mathbb{M}(\beta) \times B$. By the inductive hypothesis, the factors $I^{(n_\tau,L)}_{m_k r_k}$ have a limit Eq. (110) satisfying Eq. (112). Since for all $t$, $\bar D_t$ is a bounded $C^2$ function, the same holds for the propagators (boundedness holds because $\beta < \infty$). Thus the integrand converges pointwise, and it suffices to show that it is bounded by an integrable function to get Eq. (110) and Eq. (112) by an application of the dominated convergence theorem. Let $\alpha = 4V_1 \log\frac{\beta\epsilon_0}{2}$, and let $g$ be the function on $[0,\infty) \times (\mathbb{M}(\beta) \times B)^{i-1}$ given by

\[ g(s,k) = \int d\kappa_{mr}\; i\; i!\; P_{m_1 r_1}(\alpha)\beta^{r_1-1}\, P_{m_2 r_2}(\alpha)\beta^{r_2-1}\; \mathbb{1}\!\left( s \le \log\frac{\beta\epsilon_0}{2} \right) \frac{2}{\epsilon_0 e^{-s}} \prod_{j=2}^{i} \int_0^s ds_j\; \mathbb{1}\!\left( |\omega_j| \le \tfrac{\pi}{2}\epsilon_0 e^{-s_j} \right) \mathbb{1}\!\left( |E(\mathbf{k}_j)| \le 2\epsilon_0 e^{-s_j} \right). \tag{113} \]

The integrand on the right-hand side of Eq. (101) is bounded by $g$ by Proposition 4, Lemma 9, and the inductive hypothesis Eq. (112). Because $g$ vanishes identically for $s > \log\frac{\beta\epsilon_0}{2}$, it is integrable. By Eq. (105) and Eq. (106),

\[ \int_0^t ds \int g(s,k)\; dk_2 \dots dk_i \le \beta^{r-1} \int d\kappa_{mr}\; i\; i!\; \alpha^{i-1}\, P_{m_1 r_1}(\alpha)\, P_{m_2 r_2}(\alpha). \tag{114} \]

Thus Eq. (110) holds, and Eq. (112) holds with $P_{mr}$ given by Eq. (111) (the factor $6^{1-r}$ comes from the assumption that $\beta\epsilon_0 \ge 6$). Since the right-hand side of Eq. (101) vanishes for $t > \log\frac{\beta\epsilon_0}{2}$, Eq. (109) holds.

Remark 7. $I^{(n_\tau,L)}_{mr}(0)$ is not the initial interaction because at $t = 0$, the propagator $C_0 \ne 0$. The $I^{(n_\tau,L)}_{mr}(0)$ are obtained from the original interaction by Wick ordering with respect to $C$ and integrating over all fields with covariance $C_0$. The existence of the limit $(n_\tau, L) \to \infty$ of the $I^{(n_\tau,L)}_{mr}(0)$ is not obvious because in the limit $n_\tau \to \infty$, the absolute value of the propagator is not summable. In perturbation theory, this is no serious problem, and the $I^{(n_\tau,L)}_{mr}(0)$ are controlled by a similar inductive proof as the above. An alternative proof is in Appendix D of [22].

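5.4. The many-fermion system in the thermodynamic limit. In this section, let the many-fermion system satisfy the hypotheses stated in Sect. 2.3 with $k_0 > d$. In the thermodynamic limit, the spatial part $\mathbf{p}$ of momentum becomes continuous, $\mathbf{p} \in B$, and the set of Matsubara frequencies becomes $\mathbb{M}(\beta)$, given in Eq. (107). It is convenient to take $(p_0, \mathbf{p}) \in \mathbb{R} \times B$ and put all the $\beta$-dependence into the integrand. To this end, I define the step function $\omega_\beta : \mathbb{R} \to \mathbb{M}(\beta)$ by

\[ \omega_\beta(p_0) = \frac{\pi}{\beta}(2n+1) \quad\text{if } p_0 \in \left( \frac{2\pi}{\beta} n,\; \frac{2\pi}{\beta}(n+1) \right]. \tag{115} \]

For any continuous and integrable function $f$,

\[ \int_{\mathbb{R}} \frac{dp_0}{2\pi}\; f(\omega_\beta(p_0)) = \frac{1}{\beta} \sum_{\omega \in \mathbb{M}(\beta)} f(\omega). \tag{116} \]

Moreover,

\[ \sup_{p_0 \in \mathbb{R}} |\omega_\beta(p_0) - p_0| = \frac{\pi}{\beta} \quad\text{and}\quad \inf_{p_0 \in \mathbb{R}} |\omega_\beta(p_0)| = \frac{\pi}{\beta}, \tag{117} \]

$|\omega_\beta(p_0)| \ge \frac{|p_0|}{2}$, and $\omega_\beta(-p_0) = -\omega_\beta(p_0)$ holds Lebesgue-almost everywhere, so that in integrals like Eq. (116), $\omega_\beta$ can be treated as an antisymmetric function. With this, the propagator now reads $\tilde D_t(p) = C_t(\omega_\beta(p_0), E(\mathbf{p}))$ with

\[ C_t(x,y) = \frac{1}{ix - y}\; \chi_1\!\left( \epsilon_t^{-2}(x^2 + y^2) \right) \tag{118} \]

(and $\epsilon_t = \epsilon_0 e^{-t}$). In infinite volume, the restriction on the spatial part $\mathbf{p}$ of momentum improves the bounds on the propagator over the finite-volume ones given above.

Lemma 4. $\tilde D_t$ is bounded, $C^\infty$ in $t$ and $C^{k_0}$ in $\mathbf{p}$. If $t > \log\frac{\beta\epsilon_0}{\pi}$, then $\tilde D_t(p) = 0$ for all $p \in \mathbb{R} \times B$. For all multiindices $\alpha \in \mathbb{N}_0^d$ with $|\alpha| \le k_0$, there is a constant $B_\alpha > 0$ such that

\[ \bigl| D^\alpha \dot{\tilde D}_t(p) \bigr| \le B_\alpha\, \epsilon_t^{-1-|\alpha|}\; \mathbb{1}\bigl( |i\omega_\beta(p_0) - E(\mathbf{p})| \le \epsilon_t \bigr) \le B_\alpha\, \epsilon_t^{-1-|\alpha|}\; \mathbb{1}\bigl( |\omega_\beta(p_0)| \le \epsilon_t \bigr)\, \mathbb{1}\bigl( |E(\mathbf{p})| \le \epsilon_t \bigr), \tag{119} \]

$B_0 = 4$. For $t \le \log\frac{\beta\epsilon_0}{\pi}$,

\[ \int_{\mathbb{R}} \frac{dp_0}{2\pi}\; \bigl| D^\alpha \dot{\tilde D}_t(p) \bigr| \le B_\alpha\, \epsilon_t^{-|\alpha|}\; \mathbb{1}\bigl( |E(\mathbf{p})| \le \epsilon_t \bigr), \tag{120} \]

\[ \int_{\mathbb{R} \times B} \frac{d^{d+1}p}{(2\pi)^{d+1}}\; \bigl| D^\alpha \dot{\tilde D}_t(p) \bigr| \le 2J_1 B_\alpha\, \epsilon_t^{1-|\alpha|}, \tag{121} \]

and

\[ \int_{\mathbb{R} \times B} \frac{d^{d+1}p}{(2\pi)^{d+1}}\; \bigl| \tilde D_t(p) \bigr| \le 8J_1\, \epsilon_t. \tag{122} \]

Proof. For $\tilde D_t(p)$ to be nonzero, $|\omega_\beta(p_0)| \le \epsilon_t$ must hold. By Eq. (117), $\tilde D_t = 0$ for $t > \log\frac{\beta\epsilon_0}{\pi}$. Since $\dot{\tilde D}_t(p) = -2\epsilon_t^{-2}\,(i\omega_\beta(p_0) + E(\mathbf{p}))\; \chi_1'(\epsilon_t^{-2}(\omega_\beta(p_0)^2 + E(\mathbf{p})^2))$ and $\|\chi_1'\|_\infty \le 2$, $|\dot{\tilde D}_t(p)| \le \frac{4}{\epsilon_t}\, \mathbb{1}\bigl( |i\omega_\beta(p_0) - E(\mathbf{p})| \le \epsilon_t \bigr)$. Thus $B_0 = 4$. Derivatives with respect to $\mathbf{p}$ can act on $E(\mathbf{p})$ or on $\chi_1'$, in which case they produce factors bounded by $\epsilon_t^{-1} |\nabla E(\mathbf{p})|$, so there is $B_\alpha$ such that Eq. (119) holds. Inserting Eq. (119) into the integral in Eq. (120) gives

\[ \int_{\mathbb{R}} \frac{dp_0}{2\pi}\; \bigl| D^\alpha \dot{\tilde D}_t(p) \bigr| \le B_\alpha\, \epsilon_t^{-1-|\alpha|}\; \mathbb{1}\bigl( |E(\mathbf{p})| \le \epsilon_t \bigr)\, M_\beta \tag{123} \]

with

\[ M_\beta = \frac{1}{\beta} \left| \left\{ n \in \mathbb{Z} : |2n+1| \le \frac{\beta\epsilon_t}{\pi} \right\} \right| \le \frac{2}{\pi}\, \epsilon_t, \tag{124} \]

which proves Eq. (120). This gives for the integral in Eq. (121),

\[ \int_{\mathbb{R} \times B} \frac{d^{d+1}p}{(2\pi)^{d+1}}\; \bigl| D^\alpha \dot{\tilde D}_t(p) \bigr| \le B_\alpha\, \epsilon_t^{-|\alpha|} \int_B \frac{d^dp}{(2\pi)^d}\; \mathbb{1}\bigl( |E(\mathbf{p})| \le \epsilon_t \bigr). \tag{125} \]

For $t \ge 0$, $\epsilon_t \le \epsilon_0$, so by a change of coordinates in the integral,

\[ \int_B \frac{d^dp}{(2\pi)^d}\; \mathbb{1}\bigl( |E(\mathbf{p})| \le \epsilon_t \bigr) \le \int_{-\epsilon_t}^{\epsilon_t} d\rho \int d\theta\; |J(\rho,\theta)| = 2J_1\, \epsilon_t, \tag{126} \]

which implies Eq. (121). Equation (122) follows by integration over $t$.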
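The elementary properties of the step function $\omega_\beta$ of Eq. (115) that enter this proof, namely Eq. (117), the bound $|\omega_\beta(p_0)| \ge |p_0|/2$, and the counting bound Eq. (124), can be checked with a few lines of Python. The sketch below is an illustration only; the function names and the sample values of $\beta$, $\epsilon_t$ and $p_0$ are hypothetical.

```python
import math


def omega_beta(p0, beta):
    """Step function: pi*(2n+1)/beta for p0 in (2*pi*n/beta, 2*pi*(n+1)/beta]."""
    n = math.ceil(beta * p0 / (2 * math.pi)) - 1
    return math.pi * (2 * n + 1) / beta


def matsubara_count(beta, eps):
    """Number of integers n with |2n + 1| <= beta * eps / pi."""
    k = math.floor(beta * eps / math.pi)
    if k < 1:
        return 0
    if k % 2 == 0:   # largest odd integer not exceeding beta * eps / pi
        k -= 1
    return k + 1     # odd integers -k, ..., -1, 1, ..., k


def M_beta(beta, eps):
    """M_beta: the count above divided by beta."""
    return matsubara_count(beta, eps) / beta


beta = 7.0  # hypothetical inverse temperature
grid = [-12.3 + 0.0137 * j for j in range(1800)]  # sample values of p0
```

The count is zero as soon as $\beta\epsilon_t < \pi$, which is the statement that $\tilde D_t$ vanishes for $t > \log(\beta\epsilon_0/\pi)$.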

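5.5. A bound in position space. In this section, I prove a bound for the unrenormalized Green functions that motivates the statement that the unrenormalized expansion converges for $|\lambda|\beta^d < \mathrm{const}$ (if the solution to Eq. (86) is exponentially bounded, it implies this convergence).

Lemma 5. For many-fermion systems with $E \in C^{k_0}$, where $k_0 > d$, there is $\Delta_2 > 0$ such that for all $t \ge 0$ and all $\beta$,

\[ |||\dot D_t||| \le \Delta_2\, e^{td}. \tag{127} \]

Proof. By definition of $|||\cdot|||$, $|||\dot D_t||| \le 4 \int_{-\beta/2}^{\beta/2} dx_0 \int_{\Lambda} d\mathbf{x}\; |\dot D_t(x_0,\mathbf{x})|$ with

\[ \dot D_t(x_0,\mathbf{x}) = \int_{\mathbb{R}} \frac{dk_0}{2\pi} \int_B \frac{d^dk}{(2\pi)^d}\; e^{i x_0 \omega_\beta(k_0) + i\mathbf{k}\cdot\mathbf{x}}\; \dot C_t(\omega_\beta(k_0), E(\mathbf{k})). \tag{128} \]

By Eq. (118), $\dot C_t(\omega_\beta(k_0), E(\mathbf{k})) = -\frac{2}{\epsilon_t^2}\,(i\omega_\beta(k_0) + E(\mathbf{k}))\; \chi_1'\!\left( \frac{\omega_\beta(k_0)^2 + E(\mathbf{k})^2}{\epsilon_t^2} \right)$.

Claim. There is a constant $N$, depending on $k_0$ and $E$, such that for all $\beta$,

\[ |\dot D_t(x_0,\mathbf{x})| \le N \epsilon_t\; \frac{1}{(1 + \epsilon_t|\mathbf{x}|)^{k_0}\; (1 + \epsilon_t \min\{|x_0|, \frac{\beta}{2} - |x_0|\})^2}. \tag{129} \]

The lemma follows from Eq. (129) by

\[ \int d^dx\; \frac{1}{(1 + \epsilon_t|\mathbf{x}|)^{k_0}} = \epsilon_t^{-d} \int d^d\xi\; (1 + |\xi|)^{-k_0} \tag{130} \]

and

\[ \int_{-\beta/2}^{\beta/2} dx_0\; \frac{1}{(1 + \epsilon_t \min\{|x_0|, \frac{\beta}{2} - |x_0|\})^2} \le 4\epsilon_t^{-1} \int_0^\infty \frac{du}{(1+u)^2}. \tag{131} \]

Equation (129) is proven by the standard integration by parts method. Since the $k_0$-dependence is via the step function $\omega_\beta$, one has to use summation by parts in the form

\[ \dot D_t(x_0,\mathbf{x})\; \bigl(1 - e^{\frac{2\pi i x_0}{\beta}}\bigr)^q = \int_{\mathbb{R}} \frac{dk_0}{2\pi} \int_B \frac{d^dk}{(2\pi)^d}\; e^{i x_0 \omega_\beta(k_0) + i\mathbf{k}\cdot\mathbf{x}}\; \bigl(\Delta^q_{\frac{2\pi}{\beta}} \dot C_t\bigr)(\omega_\beta(k_0), E(\mathbf{k})), \tag{132} \]

where $(\Delta_a f)(k_0) = f(k_0) - f(k_0 - a)$. Taking $q \le 2$ and using a Taylor expansion of $\dot C_t(x,y)$ of order $q$ in $x$, one sees that $\Delta^q_{\frac{2\pi}{\beta}} \dot C_t$ is still $C^{k_0}$ in $\mathbf{k}$ and that for any multiindex $\alpha$ with $|\alpha| \le k_0$,

\[ \left| \Bigl(\frac{\partial}{\partial \mathbf{k}}\Bigr)^{\alpha} \Delta^q_{\frac{2\pi}{\beta}} \dot C_t(\omega_\beta(k_0), E(\mathbf{k})) \right| \le N_1\; \frac{1}{\beta^q\, \epsilon_t^{1+|\alpha|+q}}\; \mathbb{1}\bigl( |i\omega_\beta(k_0) - E(\mathbf{k})| \le \epsilon_t \bigr). \tag{133} \]

By Eq. (124) and Eq. (126), this implies

\[ \left| \mathbf{x}^\alpha \bigl(1 - e^{\frac{2\pi}{\beta} i x_0}\bigr)^q\, \dot D_t(x) \right| \le 4J_1 N_1\; \frac{1}{\beta^q}\; \epsilon_t^{2}\, \epsilon_t^{-1-|\alpha|-q}. \tag{134} \]

For $|\xi| \le \pi$, $|\sin\xi| \ge \frac{2}{\pi} \min\{|\xi|, \pi - |\xi|\}$, so

\[ \left| 1 - e^{\frac{2\pi}{\beta} i x_0} \right| \ge \left| \sin\Bigl(\frac{2\pi}{\beta} x_0\Bigr) \right| \ge \frac{4}{\beta} \min\{|x_0|, \tfrac{\beta}{2} - |x_0|\}, \tag{135} \]

and thus

\[ \epsilon_t^{|\alpha|+q}\; |\mathbf{x}^\alpha|\; \bigl(\min\{|x_0|, \tfrac{\beta}{2} - |x_0|\}\bigr)^q\; |\dot D_t(x)| \le N_2\, \epsilon_t. \tag{136} \]

This implies Eq. (129).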

Remark 8. This proof changes in finite volume: one needs the requirement that no point on the Fermi surface $S$ is a point of the momentum space lattice determined by the sidelength $L$ to get a bound that is uniform in $L$. This can be achieved by choosing the chemical potential $\mu$ appropriately.

Theorem 2. For the many-fermion model in $d \ge 1$, and with the initial condition $|||G_{mr}(0)||| = \gamma^{(0)}_{mr}$,

\[ |||G_{mr}(t)||| \le \gamma_{mr}\, (\epsilon_0\beta)^{(r-1)d} \tag{137} \]

with the $\gamma_{mr}$ given by Eq. (85), where $A = \frac{8}{\epsilon_0 d}\Delta_2$ and $B = 8J_1\epsilon_0$.

Proof. By Eq. (122), Corollary 1, and Lemma 2, $|\det D_t^{(i-1)}| \le (8J_1\epsilon_t)^{i-1}$, so $\Delta_1 \le 8J_1\epsilon_0$. By Lemma 5, $|||\dot D_t||| \le \Delta_2 e^{td}$. The proof is by induction in $r$, with the inductive hypothesis

\[ |||G_{mr}(t)||| \le \gamma_{mr}\, e^{td(r-1)}. \tag{138} \]

The statement is trivial for $r = 1$. Let $r \ge 2$, and Eq. (138) hold for all $r' < r$. By Lemma 1 and Eq. (106), and since $r_1 + r_2 = r$,

\[ |||Q_{mr}(t)||| \le \Delta_2\, e^{td(r-1)} \int d\kappa_{mr}\; i^2\, \Delta_1^{i-1}\, \gamma_{m_1 r_1} \gamma_{m_2 r_2}. \tag{139} \]

Since $r \ge 2$, $r - 1 \ge \frac{r}{2}$, so $\int_0^t ds\; e^{sd(r-1)} \le \frac{2}{dr}\, e^{td(r-1)}$, and Eq. (138) follows by integration and by $r \ge \frac{m}{4}$. Because $D_t = 0$ if $t > \log(\beta\epsilon_0)$, Eq. (137) holds.

Remark 9. The bounds here are rather crude because $\epsilon_t \le \epsilon_0$ was used. However, these are bounds for the full Green functions, not for truncations. Improving them requires renormalization and, for $d \ge 2$, the sector technique of [20].

Proposition 5. The hypotheses of Theorem 1 are satisfied for the many-fermion model in $d = 1$ in the thermodynamic limit.

278

M. Salmhofer

Proof. Equation (78) follows from Eq. (121). Equation (79) follows from Lemma 5.

Proposition 6. Let $d \ge 2$. If the sum $\int d\kappa_{mr}$ is truncated to $m_1 > 2(d+1)$ and $m_2 > 2(d+1)$, then the $G_{mr}$ defined by this truncation satisfy, for all $m > 2d+2$,
\[ |||G_{mr}(t)||| \le \gamma_{mr}\, e^{t\left(\frac{m}{2}-(d+1)\right)}. \tag{140} \]

If these bounds were sharp, all m-point functions up to m = 2d would be relevant in the RG sense, and the (2d + 2)-point function would be marginal. This is not really the case; see [20]. It is, however, remarkable that such a simple power counting ansatz goes through at all in d ≥ 2.

6. Regularity of the Selfenergy

In this section, I study the regularity question for the skeleton selfenergy, which is very nontrivial even in perturbation theory. I show that all notions introduced in [21–23] arise in a natural way in the RGE and prove that the skeleton selfenergy is twice differentiable. This verifies the regularity criterion (1) for Fermi liquids in perturbation theory. The proof given here is a considerable simplification of the ones in [21, 22].

The main technique in doing the regularity proofs, and in showing that the generalized ladder graphs give the only contribution to the four-point function that is not bounded uniformly in β, is the volume improvement technique invented in [21]. Here, I show that the overlapping loops, and all related concepts developed in [21–23], appear naturally in the Wick-ordered continuous RGE. In particular, overlapping loops always appear in the skeleton two-point function, and the only nonoverlapping part of the skeleton four-point function is $m_1 = m_2 = 4$, which is precisely the ladder part of the four-point function. Moreover, the double overlaps of [23] arise in a natural way when the integral equation Eq. (101) is iterated.

The main reason why all these effects are seen so easily is Wick ordering. The RGE without Wick ordering has too little structure to make these effects explicit in a convenient way. For instance, the volume improvement effect from overlapping loops is a two-loop effect, whereas the non-Wick-ordered RGE is a one-loop equation.

6.1. Volume-improved bounds. The indicator functions restrict the spatial support of the propagator to regions
\[ R(\varepsilon_t) = \{p \in B : |E(p)| \le \varepsilon_t\}. \tag{141} \]
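The volume of the intersection of R and its translates occurs in the bounds for the integrals in the RGE. Good bounds for these volumes are the key to the analysis of these systems.

Lemma 6. Let $k_0 \ge 2$, let $\pi(\rho,\theta)$ be as defined in Sect. 2, and let
\[ W(\varepsilon) = \sup_{q\in B}\ \max_{v_i\in\{\pm 1\}} \int_{S^{d-1}} d\theta_1 \int_{S^{d-1}} d\theta_2\; \mathbb{1}\bigl(|E(v_1\pi(0,\theta_1) + v_2\pi(0,\theta_2) + q)| \le \varepsilon\bigr). \tag{142} \]
There is a constant $Q_V \ge 1$ such that for all $0 < \varepsilon \le \varepsilon_0$,
\[ W(\varepsilon) \le Q_V\, \varepsilon \begin{cases} 1 + |\log \varepsilon| & \text{if } d = 2 \\ 1 & \text{if } d \ge 3. \end{cases} \tag{143} \]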


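Proof. This is Theorem 1.2 of [22]. The constant $Q_V$ depends on the curvature of the Fermi surface, hence on $\varepsilon_0$.

I now prepare for the regularity proofs by using the volume improvement of Lemma 6 to bound the right-hand side of the RGE. This is possible because the graph drawn in Fig. 2 is overlapping according to the graph classification of [21] if $i \ge 3$.
\[ D^\alpha Q_{mr}(t \mid P) = \int d\kappa_{mr}(m_1, r_1, m_2, r_2, i)\; i\, i!\; X_{\alpha,i}(t \mid P), \tag{144} \]
where, after a change of variables $k_j \to (-1)^{l_j} k_j$,
\[ X_{\alpha,i}(t \mid P) = \sum_{\sigma\in\{-1,1\}^i}\ \sum_{l\in\{1,2\}^i}\ \sum_{\alpha} \frac{\alpha!}{\alpha_0!\,\alpha_1!\,\alpha_2!} \int_{\mathbb{R}\times B} dk_2 \cdots \int_{\mathbb{R}\times B} dk_i\; D^{\alpha_0}\dot{\tilde D}_t(k_1) \prod_{j=2}^{i} \tilde D_t(k_j)\; D^{\alpha_1} I_{m_1 r_1}(t \mid P^{(1)}, K)\; D^{\alpha_2} I_{m_2 r_2}(t \mid {\sim}K, P^{(2)}) \tag{145} \]
with $K_j = (k_j, \sigma_j, l_j)$, the sum over $\alpha$ running over all triples $(\alpha_0, \alpha_1, \alpha_2)$ with $\alpha_0 + \alpha_1 + \alpha_2 = \alpha$, and $k_1 = -\sum_{j=2}^{i} (-1)^{l_j} k_j + p_1 + \ldots + p_{m_1-i+1}$. $X_{\alpha,i}$ depends also on $(m_1, r_1, m_2, r_2, m, r)$. Application of $|\cdot|_0$ gives
\[ |X_{\alpha,i}(t)|_0 \le 4^i \sum_{\alpha} \frac{\alpha!}{\alpha_0!\,\alpha_1!\,\alpha_2!}\, |D^{\alpha_1} I_{m_1 r_1}(t)|_0\, |D^{\alpha_2} I_{m_2 r_2}(t)|_0\; Y_{\alpha_0,i}(t) \tag{146} \]
with
\[ Y_{\alpha,i}(t) = \sup_{Q\in M(\beta)\times B}\ \max_{v_i\in\{-1,1\}} \int \prod_{j=2}^{i} |\tilde D_t(k_j)|\, dk_j\; \Bigl| D^\alpha \dot{\tilde D}_t\Bigl(\sum_{j=2}^{i} v_j k_j + Q\Bigr)\Bigr|. \tag{147} \]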

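Lemma 7. $Y_{\alpha,i}(t) = 0$ for $t > \log\frac{\beta\varepsilon_0}{\pi}$, and
\[ Y_{\alpha,i}(t) \le (8J_1)^{i-1} B_\alpha\, \varepsilon_t^{\,i-2-|\alpha|}. \tag{148} \]
If $i \ge 3$,
\[ Y_{\alpha,i}(t) \le (8J_1)^{i-1} B_\alpha K^{(1)}\, \tfrac{1}{2}(1+t)\, \varepsilon_t^{\,i-1-|\alpha|} \tag{149} \]
with $K^{(1)}$ given in Eq. (158).

Proof. By Eq. (119), $|D^\alpha \dot{\tilde D}_t(p)| \le B_\alpha\, \varepsilon_t^{-1-|\alpha|}$, so
\[ Y_{\alpha,i}(t) \le B_\alpha\, \varepsilon_t^{-1-|\alpha|} \Bigl( \int_{\mathbb{R}\times B} |\tilde D_t(k)|\, dk \Bigr)^{i-1}, \tag{150} \]
thus Eq. (148) holds by Eq. (121). For $i \ge 3$,
\[ Y_{\alpha,i}(t) \le \Bigl( \int_{\mathbb{R}\times B} |\tilde D_t(k)|\, dk \Bigr)^{i-3} \sup_{Q\in M(\beta)\times B}\ \max_{v_1,v_2\in\{-1,1\}} Y_\alpha(t, Q, v_1, v_2) \tag{151} \]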


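with
\[ Y_\alpha(t, Q, v_1, v_2) = \int dk_1 \int dk_2\; |\tilde D_t(k_1)|\, |\tilde D_t(k_2)|\, |D^\alpha \dot{\tilde D}_t(v_1 k_1 + v_2 k_2 + Q)|. \tag{152} \]
In $Y_\alpha(t)$, I use $\dot{\tilde D}_t = 0$ for $t > \log\frac{\beta\varepsilon_0}{\pi}$ to write $\tilde D_t(k_j) = -\int_t^{\log(\beta\varepsilon_0)} dt_j\, \dot{\tilde D}_{t_j}(k_j)$. By Eq. (119), $|D^\alpha \dot{\tilde D}_t(p)| \le B_\alpha\, \varepsilon_t^{-1-|\alpha|}\, \mathbb{1}\bigl(|E(p)| \le \varepsilon_t\bigr)$. Inserting this and doing the integrals over $(k_1)_0$ and $(k_2)_0$, I get by Eq. (120),
\[ Y_\alpha(t, Q, v_1, v_2) \le 8^2 B_\alpha\, \varepsilon_t^{-1-|\alpha|} \int_t^{\log(\beta\varepsilon_0)} dt_1 \int_t^{\log(\beta\varepsilon_0)} dt_2\; V(t, t_1, t_2), \tag{153} \]
with
\[ V(t, t_1, t_2) = \int_{R(\varepsilon_0 e^{-t_1})} dk_1 \int_{R(\varepsilon_0 e^{-t_2})} dk_2\; \mathbb{1}\bigl(|E(v_1 k_1 + v_2 k_2 + Q)| \le \varepsilon_0 e^{-t}\bigr). \tag{154} \]
With the coordinates $(\rho, \theta)$ defined in Eq. (21),
\[ V(t, t_1, t_2) = \int_{-\varepsilon_0 e^{-t_1}}^{\varepsilon_0 e^{-t_1}} d\rho_1 \int_{-\varepsilon_0 e^{-t_2}}^{\varepsilon_0 e^{-t_2}} d\rho_2 \int d\theta_1 \int d\theta_2\; J(\rho_1, \theta_1)\, J(\rho_2, \theta_2)\; \mathbb{1}\bigl(|E(v_1\pi(\rho_1,\theta_1) + v_2\pi(\rho_2,\theta_2) + Q)| \le \varepsilon_0 e^{-t}\bigr). \tag{155} \]
Since $|\rho_j| \le \varepsilon_0 e^{-t_j} \le \varepsilon_0 e^{-t}$, Eq. (21) implies
\[ |E(v_1\pi(\rho_1,\theta_1) + v_2\pi(\rho_2,\theta_2) + Q) - E(v_1\pi(0,\theta_1) + v_2\pi(0,\theta_2) + Q)| \le \frac{4}{g_0}\, |E|_1\, \varepsilon_0 e^{-t}, \]
with $|E|_1 = |\nabla E|_0$. Using $J(\rho_j, \theta_j) \le J_0$ and doing the $\rho$-integrals, I get
\[ V(t, t_1, t_2) \le (2\varepsilon_0 J_0)^2\, e^{-t_1-t_2}\; W\Bigl(\bigl(1 + \tfrac{4|E|_1}{g_0}\bigr)\varepsilon_0 e^{-t}\Bigr) \tag{156} \]
with $W$ given by Eq. (142). The integrals over $t_1$ and $t_2$ give
\[ Y_\alpha(t, Q, v_1, v_2) \le 4^4 B_\alpha J_0^2\, \varepsilon_t^{\,1-|\alpha|}\; W\Bigl(\bigl(1 + \tfrac{4|E|_1}{g_0}\bigr)\varepsilon_0 e^{-t}\Bigr). \tag{157} \]
This implies Eq. (149) with
\[ K^{(1)} = 2\Bigl(\frac{J_0}{J_1}\Bigr)^2 Q_V \Bigl(1 + \frac{4|E|_1}{g_0}\Bigr) \Bigl(1 + \log\bigl(1 + \tfrac{4|E|_1}{g_0}\bigr) + |\log \varepsilon_0|\Bigr). \tag{158} \]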
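Two arithmetic steps in the proof above can be checked mechanically: the scale integrals that turn Eq. (153) and Eq. (156) into the prefactor $4^4 J_0^2 \varepsilon_t^2$ of Eq. (157), and the bound $1 + |\log(c\,\varepsilon_0 e^{-t})| \le (1+t)(1 + \log c + |\log\varepsilon_0|)$ for $c \ge 1$ that produces the factor $(1+t)$ in Eq. (149) for $d = 2$. The sketch below assumes $\varepsilon_t = \varepsilon_0 e^{-t}$ as in the text; all function names and sample parameter values are hypothetical.

```python
import math

def scale_integral(t: float, t_max: float) -> float:
    """Exact integral of exp(-s) over [t, t_max]; zero if t >= t_max."""
    return max(0.0, math.exp(-t) - math.exp(-t_max))

def lhs_prefactor(t: float, beta: float, eps0: float, J0: float) -> float:
    """8^2 * (2*eps0*J0)^2 * (integral over t1) * (integral over t2), upper limit log(beta*eps0)."""
    t_max = math.log(beta * eps0)
    return 64.0 * (2.0 * eps0 * J0) ** 2 * scale_integral(t, t_max) ** 2

def rhs_prefactor(t: float, eps0: float, J0: float) -> float:
    """4^4 * J0^2 * eps_t^2 with eps_t = eps0 * exp(-t)."""
    return 4.0 ** 4 * J0 ** 2 * (eps0 * math.exp(-t)) ** 2

def log_factor_ok(t: float, eps0: float, c: float, tol: float = 1e-12) -> bool:
    """Check 1 + |log(c*eps0*e^{-t})| <= (1+t) * (1 + log c + |log eps0|) for c >= 1, t >= 0."""
    lhs = 1.0 + abs(math.log(c * eps0) - t)
    rhs = (1.0 + t) * (1.0 + math.log(c) + abs(math.log(eps0)))
    return lhs <= rhs + tol
```

The first comparison is an inequality rather than an identity because the finite upper limit $\log(\beta\varepsilon_0)$ only makes the integrals smaller than $e^{-t}$.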

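This gives the following

Lemma 8. For all $i$,
\[ |X_{\alpha,i}(t)|_0 \le 4^i (8J_1)^{i-1} \sum_{\alpha} \frac{\alpha!}{\alpha_0!\,\alpha_1!\,\alpha_2!}\, |D^{\alpha_1} I_{m_1 r_1}(t)|_0\, |D^{\alpha_2} I_{m_2 r_2}(t)|_0\; B_{\alpha_0}\, \varepsilon_t^{\,i-2-|\alpha_0|}. \tag{159} \]
For $i \ge 3$,
\[ |X_{\alpha,i}(t)|_0 \le 4^i (8J_1)^{i-1} \sum_{\alpha} \frac{\alpha!}{\alpha_0!\,\alpha_1!\,\alpha_2!}\, |D^{\alpha_1} I_{m_1 r_1}(t)|_0\, |D^{\alpha_2} I_{m_2 r_2}(t)|_0\; B_{\alpha_0} K^{(1)}\, \varepsilon_t^{\,i-1-|\alpha_0|} \begin{cases} \tfrac{1}{2}(1+t) & \text{if } d = 2 \\ 1 & \text{if } d \ge 3. \end{cases} \tag{160} \]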

Equation (160) is the implementation of the volume improvement bound from overlapping loops in the continuous RGE setting. In the next section I use it to prove regularity properties of the selfenergy.

6.2. The ladder four-point function and the skeleton selfenergy.

Definition 2. The RGE for the skeleton functions $\tilde I_{mr}(t)$ is given by Eq. (100), but with $\int d\kappa_{mr}$ replaced by $\int d\tilde\kappa_{mr}$, where in the latter the sums over $m_1$ and $m_2$ start at four instead of two. The function $\tilde I_{2r}(t)$ is the skeleton selfenergy of the model.

This truncation of the sum prevents any two-legged insertions that require renormalization from occurring in the graphical expansion. To give a precise meaning to the statement that the ladder resummation takes into account the most singular contributions to the four-point function, I split the skeleton four-point function into two pieces, as follows. In $Q_{4r}$, $m_1 + m_2 = 4 + 2i$, so $i = \frac{1}{2}(m_1 + m_2 - 4) \ge 2$ since the skeleton condition has removed $m_1 = 2$ and $m_2 = 2$. Let $Q_{4,r,2}$ be the $i = 2$ term in this sum; it corresponds to the “bubble” graph drawn in Fig. 3.

[Figure: bubble graph with external legs P1, P2, P3, P4 and internal lines K, -K]

Fig. 3. The graph corresponding to $Q_{4,r,2}$

More explicitly, $Q_{4,r,2}$ is given by
\[ Q_{4,r,2}(t \mid P_1, \ldots, P_4) = A_4 \Bigl[\, 144 \cdot \tfrac{1}{2} \sum_{r_1+r_2=r}\ \sum_{i_1,\sigma_1,i_2,\sigma_2} \int dk\; \bar D_t(K)\, \dot{\bar D}_t(K')\; \tilde I_{4,r_1}(t \mid P_1, P_2, K', K)\; \tilde I_{4,r_2}(t \mid {\sim}K, {\sim}K', P_3, P_4) \Bigr] \tag{161} \]
with $K = (k, i_1, \sigma_1)$ and $K' = (-p_1 - p_2 - k, i_2, \sigma_2)$. Let
\[ Q_{4,r,\ge 3}(t \mid P) = Q_{4,r}(t \mid P) - Q_{4,r,2}(t \mid P) \tag{162} \]

be the contribution from all terms where at least $i \ge 3$ lines connect the two vertices in $Q_{4,r}$. Correspondingly, let
\[ \tilde I_{4,r}(t) = B_r(t) + U_r(t), \tag{163} \]
where $B_r(t)$ and $U_r(t)$ are defined by
\[ \frac{\partial}{\partial t} B_r(t) = \frac{1}{2} A_4\, Q_{4,r,2}(t), \qquad \frac{\partial}{\partial t} U_r(t) = \frac{1}{2} A_4\, Q_{4,r,\ge 3}(t) \tag{164} \]

with the initial condition $B_r(0) + U_r(0) = \tilde I_{4,r}(0)$ (where it is understood that one of the two summands on the left-hand side is set to zero, see below). Note that because $\tilde I_{4,r}$ occurs on the right-hand side of both equations in Eq. (164), Eq. (164) is a coupled system of differential equations.

Definition 3. The function $B_r^{(L)}$ obtained from the skeleton RGE by the truncation
\[ \frac{\partial}{\partial t} U_r(t) = 0, \qquad U_r(0) = 0 \tag{165} \]
is the ladder skeleton four-point function. The function $U_r^{(N)}$ obtained by the truncation
\[ \frac{\partial}{\partial t} B_r(t) = 0, \qquad B_r(0) = 0 \tag{166} \]
is the non-ladder skeleton four-point function. The functions $\tilde I^{(N)}_{m,r}$, obtained with this truncation (i.e., where $m_1 = 4$ and $m_2 = 4$ are left out in the sum $\int d\tilde\kappa_{4,r}$), are the non-ladder skeleton Green functions.

The motivation for the split is that in $Q_{4,r,\ge 3}$, the number of internal lines of the graph in Figure 2 is $i \ge 3$, so the volume improvement of Lemmas 7 and 8 improves the power counting. The constants in the following theorems are independent of β.

Theorem 3. There is a constant $L_2$ such that
\[ \bigl| B_r^{(L)}(t) \bigr|_0 \le L_2^{\,r} \bigl( \tfrac{1}{2}(1+t) \bigr)^{r-1} \le L_2^{\,r}\, |\log(\beta\varepsilon_0)|^{r-1}. \tag{167} \]
The series $\sum_r B_r^{(L)}(t)$ converges uniformly in $t$ and $P$ if $|\lambda \log \beta\varepsilon_0| < L_2^{-1}$.

Proof. This follows immediately by induction from Eq. (161) by use of Eq. (159) with $i = 2$ and $\alpha = 0$.

Using one-loop volume bounds one can show that both the particle-particle and the particle-hole ladder are bounded uniformly in $t$ and β for $Q \ne 0$, so that the only singularity in the four-point function can arise at zero momentum (for a discussion of this, see, e.g., [24]). In the particle-particle ladder, this singularity is really there, and it implies that Fermi liquid behaviour occurs only above a critical temperature: the log β in Definition 1 is the logarithm occurring in Theorem 3.

The next theorem states that the non-ladder skeleton four-point function is bounded and that consequently, the non-ladder skeleton selfenergy is $C^1$ uniformly in β. This shows that indeed, only the ladder four-point function produces a nonuniformity in β (this motivates the alternative criterion for Fermi liquid behaviour in Sect. 2.6). The second derivative, however, is only bounded by a power of log β in d = 2; this motivates why at zero temperature, the selfenergy is only required to be $C^{2-\delta}$ for some δ > 0.



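Theorem 4. For all $r \ge 1$, the non-ladder skeleton functions $\tilde I^{(N)}_{2,r}(t)$ converge for $t \to \infty$ to a $C^2$ function. There are constants $L_{3,r}$ and $L_{4,r}$, independent of β, such that
\[ \bigl| D^\alpha \tilde I^{(N)}_{2,r}(t) \bigr|_0 \le L_{3,r} \begin{cases} 1 & \text{if } |\alpha| \le 1 \\ (\log \beta\varepsilon_0)^2 & \text{if } |\alpha| = 2 \text{ and } d = 2 \\ \log \beta\varepsilon_0 & \text{if } |\alpha| = 2 \text{ and } d \ge 3, \end{cases} \tag{168} \]
and
\[ \bigl| D^\alpha \tilde I^{(N)}_{4,r}(t) \bigr|_0 \le L_{4,r} \begin{cases} 1 & \text{if } |\alpha| = 0 \\ \log \beta\varepsilon_0 & \text{if } |\alpha| = 1 \\ \beta\varepsilon_0 & \text{if } |\alpha| = 2. \end{cases} \tag{169} \]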

If the ladder four-point function is left in the RGE, its logarithmic growth (Theorem 3) shows up in all other Green functions, and in the selfenergy:

Theorem 5. The skeleton functions $\tilde I_{2,r}$ and $\tilde I_{4,r}$ converge for $t \to \infty$ and satisfy
\[ \bigl| D^\alpha \tilde I_{m,r}(t) \bigr|_0 \le L_{5,m,r}\, (\log \beta\varepsilon_0)^r \tag{170} \]
for $m = 2$, $|\alpha| \le 2$, and $m = 4$, $|\alpha| = 0$.

Theorems 4 and 5 are proven in the next section.

Remark 10. To prove bounds with a good β behaviour for $m \ge 6$ requires the use of different norms, where part of the momenta are integrated [19]. It is easy to see that for $m \ge 6$, the connected m-point functions have singularities and thus are not uniformly bounded functions of momentum; for instance the second order six-point function is given by $C(P)$, which is $O(\beta)$ if $p$ is on the Fermi surface and if $\omega = \pi/\beta$.

Remark 11. A bound $\mathrm{const}^{\,r}$ for the constants in Theorem 5 would suffice to show the first requirement of the Fermi liquid behaviour defined in Definition 1, namely the convergence of the perturbation expansion for the skeleton functions for $|\lambda \log \beta| < \mathrm{const}$. The proofs given in this section do not imply this bound because in the momentum space equation, the factorial remains. It may, however, be possible to give such bounds in d = 2 by combining the sector technique of [20] with the determinant bound.

6.3. The regularity proofs. At positive temperature, the frequencies are still discrete, whereas the spatial part of momentum is a continuous variable. Whenever derivatives with respect to $p_0$ are written below, they are understood as a difference operation $\frac{\beta}{2\pi}\bigl(f(p_0 + \frac{2\pi}{\beta}) - f(p_0)\bigr)$. The RGE actually defines the Green functions for (almost all) real values of $p_0$, not just the discrete set of Matsubara frequencies, so the effect of such a difference can be bounded by Taylor expansion. This changes at most constants, so I shall not write this out explicitly in the proofs. The following theorem implies Theorem 4 about the non-ladder skeleton selfenergy and four-point function.
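Note that all bounds in this theorem are independent of β.

Theorem 6. Let $\alpha$ be a multiindex with $|\alpha| \le 2$. For all $m, r$, let $I_{mr}(0)$ be $C^2$ in $(p_1, \ldots, p_m)$, and assume that there are $K^{(0)}_{mr} \ge 0$ such that $|D^\alpha I_{mr}(0)| \le K^{(0)}_{mr}$, with $K^{(0)}_{mr} = 0$ for $m > 2r + 2$, and $K^{(0)}_{m1} = \delta_{m4}\, v$, where $v > 0$. Let $\tilde I^{(N)}_{mr}(t)$ be the Green functions generated by the non-ladder skeleton RGE, as given in Definition 3, with initial values $I_{mr}(0)$. Let $K_{m1} = K^{(0)}_{m1}$ and for $r \ge 2$, let
\[ K_{mr} = K^{(0)}_{mr} + \frac{1}{m}\, M_1 \int d\tilde\kappa_{mr}\; i\, i!\, (32 J_1)^{i-1} K_{m_1 r_1} K_{m_2 r_2}, \tag{171} \]
where $M_1 = 240\, B K^{(1)}$, with $B = \max_\alpha B_\alpha$. Then $K_{mr} = 0$ if $m > 2r + 2$, and for all $t \ge 0$ and all $m, r$,
\[ \bigl| D^\alpha \tilde Q^{(N)}_{mr}(t) \bigr|_0 \le 2 K_{mr} \begin{cases} \varepsilon_t^{\,2-\frac{m}{2}-|\alpha|} & \text{if } m \ge 6 \\ \varepsilon_t^{\,1-|\alpha|}\, \frac{1+t}{2} & \text{if } m = 4 \\ \varepsilon_t^{\,2-|\alpha|} \bigl(\frac{1+t}{2}\bigr)^{\delta_{d,2}} & \text{if } m = 2. \end{cases} \tag{172} \]
Moreover, for $m \ge 6$,
\[ \bigl| D^\alpha \tilde I^{(N)}_{mr}(t) \bigr|_0 \le K_{mr}\, \varepsilon_t^{\,2-\frac{m}{2}-|\alpha|}, \tag{173} \]
for $m = 4$,
\[ \bigl| D^\alpha \tilde I^{(N)}_{mr}(t) \bigr|_0 \le K_{4r} \begin{cases} 1 & \text{if } \alpha = 0 \\ \bigl(\frac{1+t}{2}\bigr)^2 & \text{if } |\alpha| = 1 \\ \varepsilon_t^{-1}\, \frac{1+t}{2} & \text{if } |\alpha| = 2, \end{cases} \tag{174} \]
and for $m = 2$,
\[ \bigl| D^\alpha \tilde I^{(N)}_{mr}(t) \bigr|_0 \le K_{2r} \begin{cases} 1 & \text{if } |\alpha| \le 1 \\ \bigl(\frac{1+t}{2}\bigr)^2 & \text{if } |\alpha| = 2 \text{ and } d = 2 \\ \frac{1+t}{2} & \text{if } |\alpha| = 2 \text{ and } d = 3. \end{cases} \tag{175} \]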

Proof. Induction in r, with the statement of the theorem as the inductive hypothesis. ) r = 1 is trivial because Q˜ (N m1 = 0 and because the statement holds for Im1 (0). Let r ≥ 2 and the statement hold for all r0 < r. The inductive hypothesis applies to both factors (N ) ) I˜m in Q˜ (N mr . For mk = 4, it implies that k rk (N ) mk −|α| I˜m r (t) ≤ Km r −|α| = Kmk rk 0 et(|α|+ 2 −2) k k t k k 0

(176)

for all t ≥ 0. Recall Eq. (159) and Eq. (160), and α α (N ) ) α ˜ (N ) D Am Q˜ (N ˜ mr (t) 0 = Am D Qmr (t) 0 ≤ D Qmr (t) 0 Z ) ≤ dκ˜ (N mr i i!|Xα,i (t)|0 .

(177)

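Let $m \ge 6$. By Eq. (159) and $m_1 + m_2 = m + 2i$, the $t$-dependent factors in $X_{\alpha,i}(t)$ are
\[ e^{t(\frac{m_1}{2}-2+|\alpha_1|)}\; e^{t(\frac{m_2}{2}-2+|\alpha_2|)}\; e^{t(2-i+|\alpha_0|)} = e^{t(\frac{m}{2}-2+|\alpha|)}. \tag{178} \]
Since $\frac{m}{2} - 2 + |\alpha| \ge \frac{m}{2} - 2 \ge 1$, integrating the RGE gives
\[ \bigl| D^\alpha \tilde I^{(N)}_{mr}(t) \bigr|_0 \le \bigl| D^\alpha \tilde I^{(N)}_{mr}(0) \bigr|_0 + \frac{1}{2} \int_0^t ds\, \bigl| D^\alpha \tilde Q^{(N)}_{mr}(s) \bigr|_0 \le K^{(0)}_{mr} + \frac{1}{m-4}\, \nu_{mr}\, \varepsilon_t^{\,2-\frac{m}{2}-|\alpha|}, \tag{179} \]
where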


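\[ \nu_{mr} = 4B \int d\tilde\kappa\; i\, i!\, (32 J_1)^{i-1} \sum_{\alpha} \frac{\alpha!}{\alpha_0!\,\alpha_1!\,\alpha_2!}\, K_{m_1 r_1} K_{m_2 r_2}. \tag{180} \]
For $m \ge 6$, $m - 4 \ge \frac{m}{3}$. Moreover, $\sum \frac{\alpha!}{\alpha_0!\,\alpha_1!\,\alpha_2!} \le 3^{|\alpha|}$, so Eq. (172) and Eq. (173) follow.

Let $m = 4$. One of $m_1$ and $m_2$ must be at least six because the ladder part is left out in $\tilde I^{(N)}_{mr}$. Thus $i = \frac{1}{2}(m_1 + m_2 - 4) \ge 3$. By Eq. (160), there is an extra small factor $\varepsilon_t = \varepsilon_0 e^{-t}$, so
\[ \bigl| D^\alpha \tilde Q^{(N)}_{4,r}(t) \bigr|_0 \le \frac{1+t}{2}\, \varepsilon_t^{\,1-|\alpha|}\, K^{(1)} \nu_{mr}, \tag{181} \]
which proves Eq. (172). Equation (174) follows by integration.

Let $m = 2$. The case $m_1 = m_2 = 2$ is excluded since this is the skeleton RG. Thus $i = \frac{1}{2}(m_1 + m_2 - 2) \ge 3$, and Eq. (160) implies
\[ \bigl| D^\alpha \tilde Q^{(N)}_{2,r}(t) \bigr|_0 \le \Bigl(\frac{1+t}{2}\Bigr)^{\delta_{d,2}} \varepsilon_t^{\,2-|\alpha|}\, K^{(1)} \nu_{2r}. \tag{182} \]
Thus Eq. (172) holds, and Eq. (175) follows by integration.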
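The two bookkeeping facts used in the case $m \ge 6$ are easy to verify by brute force: the exponents in Eq. (178) add up to $\frac{m}{2} - 2 + |\alpha|$ whenever $m_1 + m_2 = m + 2i$ and $|\alpha_0| + |\alpha_1| + |\alpha_2| = |\alpha|$, and the multinomial sum over splittings $\alpha = \alpha_0 + \alpha_1 + \alpha_2$ equals $3^{|\alpha|}$ (so the stated bound holds with equality). The helper names below are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction
from math import factorial

def exponent_sum(m1: int, m2: int, i: int, a0: int, a1: int, a2: int) -> Fraction:
    """Sum of the three exponents of e^t on the left-hand side of Eq. (178)."""
    return (Fraction(m1, 2) - 2 + a1) + (Fraction(m2, 2) - 2 + a2) + (2 - i + a0)

def expected_exponent(m1: int, m2: int, i: int, a0: int, a1: int, a2: int) -> Fraction:
    """Right-hand side exponent m/2 - 2 + |alpha| with m = m1 + m2 - 2i."""
    m = m1 + m2 - 2 * i
    return Fraction(m, 2) - 2 + (a0 + a1 + a2)

def multinomial_sum(alpha: tuple) -> int:
    """Sum of alpha!/(alpha0! alpha1! alpha2!) over all multiindex splittings of alpha.

    The sum factorizes over the components of alpha, so each component is handled separately.
    """
    total = 1
    for a in alpha:
        comp = 0
        for a0 in range(a + 1):
            for a1 in range(a - a0 + 1):
                a2 = a - a0 - a1
                comp += factorial(a) // (factorial(a0) * factorial(a1) * factorial(a2))
        total *= comp
    return total
```

For $|\alpha| \le 2$, as in Theorem 6, the multinomial sum is therefore at most 9.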

(N ) Proof of Theorem 4. Let |α| ≤ 1. By Eq. (175), |Dα I˜2r (t)| ≤ K2,r . By Eq. (172), (N ) α ∂ ˜(N ) −t 1+t → 0 as t → ∞. Thus the limit t → ∞ of I˜2,r (t) exists |D ∂t I2r (t)| ≤ K2,r 0 e 2 1 and is a C function of (p1 , p2 ). All constants are uniform in β. The second derivative ) ˜ (N ) of Q˜ (N 2,r is O(t) in d = 2 and O(1) in d = 3. Since Q2,r = 0 for t > log β0 , the integral (N ) for I˜2,r over t runs only up to log β0 , which gives the stated dependence on log β0 .

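Remark 12. Note that the bounds for $m = 2$ and $|\alpha| = 2$ in Theorem 6 do not imply convergence. This is the source of the logarithmic behaviour discussed in Sect. 2 and in [22].

The following theorem implies Theorem 5. Here the bounds have an explicit β-dependence. One could avoid this β-dependence by including polynomials in $t$, but this would also increase the combinatorial coefficients by factorials.

Theorem 7. Let $\alpha$ be a multiindex with $|\alpha| \le 2$. For all $m, r$, let $I_{mr}(0)$ be $C^2$ in $(p_1, \ldots, p_m)$, and assume that there are $K^{(0)}_{mr} \ge 0$ such that $|D^\alpha I_{mr}(0)| \le K^{(0)}_{mr}$, with $K^{(0)}_{mr} = 0$ for $m > 2r + 2$, and $K^{(0)}_{m1} = \delta_{m4}\, v$, where $v > 0$. Let $\tilde I_{mr}(t)$ be the Green functions generated by the skeleton RGE, as given in Definition 2, with initial values $I_{mr}(0)$. Let $K_{m1} = K^{(0)}_{m1}$ and for $r \ge 2$, let $K_{mr}$ be given by Eq. (171). Then for $m \ge 4$,
\[ \bigl| D^\alpha \tilde Q_{mr}(t) \bigr|_0 \le 2 K_{mr}\, (\log \beta\varepsilon_0)^{r-2}\, \varepsilon_t^{\,2-\frac{m}{2}-|\alpha|} \tag{183} \]
and
\[ \bigl| D^\alpha \tilde I_{mr}(t) \bigr|_0 \le K_{mr}\, (\log \beta\varepsilon_0)^{r-1}\, \varepsilon_t^{\,2-\frac{m}{2}-|\alpha|}. \tag{184} \]
For $m = 2$,
\[ \bigl| D^\alpha \tilde Q_{2r}(t) \bigr|_0 \le 2 K_{2r}\, (\log \beta\varepsilon_0)^{r-2+\delta_{d,2}}\, \varepsilon_t^{\,2-|\alpha|} \tag{185} \]
and
\[ \bigl| D^\alpha \tilde I_{2r}(t) \bigr|_0 \le K_{2r}\, (\log \beta\varepsilon_0)^{r-2} \begin{cases} 1 & \text{if } |\alpha| \le 1 \\ (\log \beta\varepsilon_0)^{1+\delta_{d,2}} & \text{if } |\alpha| = 2. \end{cases} \tag{186} \]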


Proof. The proof is by induction in $r$, with the statement of the theorem as the inductive hypothesis. It is similar to the proof of Theorem 6, with only a few changes. Note that $m_1 = 2$ and $m_2 = 2$ never appear on the right-hand side of the RGE because of the skeleton truncation in Definition 2. For $m \ge 4$, use Eq. (159); this gives
\[ \bigl| D^\alpha \tilde Q_{mr}(t) \bigr|_0 \le 9\, \nu_{mr}\, (\log \beta\varepsilon_0)^{r-2}\, \varepsilon_t^{\,2-\frac{m}{2}-|\alpha|}. \tag{187} \]
For $m \ge 6$, the scale integral is as in the proof of Theorem 6. For $m = 4$, the scale integral is now $\int_0^t ds = t \le \log \beta\varepsilon_0$. This produces the powers of $\log \beta\varepsilon_0$ upon iteration. For $m = 2$, use Eq. (160); this gives $K^{(1)}\, \varepsilon_t^{\,2-|\alpha|} \bigl(\tfrac{1}{2}(1+t)\bigr)^{\delta_{d,2}}$ instead of $\varepsilon_t^{\,2-\frac{m}{2}-|\alpha|}$ in Eq. (187). The theorem now follows by integration over $t$, recalling that the upper integration limit is at most $\log \beta\varepsilon_0$.

Proof of Theorem 5. Convergence of the selfenergy follows for $|\alpha| \le 1$ as in the proof of Theorem 4. For $m = 4$, the function is bounded uniformly in $t$. For $|\alpha| = 2$, convergence at $\beta > 0$ holds because $\frac{\partial}{\partial t} \tilde I_{4r}(t) = 0$ for $t > \log \beta\varepsilon_0$.

7. Conclusion The determinant bound for the continuous Wick-ordered RGE removes a factorial in the recursion for the Green functions. If the model has a propagator with a pointlike singularity, power counting bounds that include this combinatorial improvement can be proven rather easily. The improvement may lead to convergence, but I have not proven this here. If it does, then Theorem 1 implies analyticity of the Green functions where all relevant and marginal couplings are left out in a region independent of the energy scale, and Theorem 2 implies analyticity of the full Green functions for |λ|β d < const in many-fermion systems for all d ≥ 1. Natural models to which Theorem 1 applies are the Gross-Neveu model in two dimensions and the many-fermion system in one dimension. In both cases, I have only given bounds where the marginal and relevant terms were left out, to give a simple application of the determinant bound derived in Sect. 4. In both cases, analyses of the full models, including the coupling flows, have been done previously ([14, 15] and [16–18]). Many-fermion models in d ≥ 2 are the most realistic physical systems where the interaction is regular enough for the analysis done here to apply directly (i.e., without the introduction of boson fields). I have defined a criterion for Fermi liquid behaviour for these models. A proof that such behaviour occurs requires more detailed bounds and a combination of the method with the sector method of [20]. This may be feasible by an extension of the analysis done here. The Jellium dispersion relation E(k) = k2 /2 − µ is the case where the proofs are easiest because 6|S is constant. The proof that such a system is a Fermi liquid is possible by a combination of the techniques of [20] and [21]; it may also be within the reach of the continuous RGE method developed here. Perturbative bounds that implement the overlapping loop method of [21] in a simple way were given in Sect. 6. 
The verification of regularity property (2) for nonspherical Fermi surfaces is not as simple, but the perturbative analysis done here can be extended rather easily to include the double overlaps used in [23], because the graph classification of [23] arises in a natural way when the integral equation for the effective action is iterated. Multiple overlaps can also be exploited in the RGE; this may be necessary for the many-fermion systems in d ≥ 3. The split of the four-point function into the ladder and non-ladder part



done in Sect. 6.2 singles out the only singular contributions to the four-point function and the least regular contributions to the selfenergy in a simple way. The treatment of these ladder contributions was done in [22], where bounds uniform in the temperature were shown for the second derivative in perturbation theory. The results in Sect. 6.2 provide another proof that only the ladder flow corresponding to Fig. 3 leads to singularities and hence instabilities. It should be noted that this statement depends on the assumptions stated in Sect. 2.3, in particular on the choice of $\varepsilon_0$, in the following specific way. The curvature of the Fermi surface sets a natural scale which appears via the constants in the volume bounds. Above this scale, the geometry of the Fermi surface provides no justification of restricting to the ladder flow. This is of some relevance in the Hubbard model, where two scale regimes arise in a natural way: if $\varepsilon_t > \tilde\mu$, where $\tilde\mu = \mu - td$, $t$ the hopping parameter, is defined such that $\tilde\mu = 0$ is half-filling, then the curvature of the Fermi surface is effectively so small that one can replace the Fermi surface by a square. More technically speaking, the constant $Q_V$, which depends on the curvature of the Fermi surface, diverges for $\tilde\mu \to 0$. Thus, for small $\tilde\mu$, $\varepsilon_t$ has to be very small for $Q_V \varepsilon_t < 1$ to hold. If $Q_V \varepsilon_t > 1$, the improved volume estimate does not lead to a gain over ordinary power counting. Only below this energy scale, the curvature effects of the volume improvement bound, Lemma 6, imply that the ladder part of the four-point function dominates. To get to scale $\tilde\mu$, one has to calculate the effective action. Needless to say, the effective four-point interaction at scale $\tilde\mu$ may look very different from the original interaction, and RPA calculations suggest that the antiferromagnetic correlations produced by the almost square Fermi surface lead to an attractive nearest-neighbour-interaction [38].
However, one should keep in mind that for the reasons just mentioned, it is an ad hoc approximation to keep only the RPA part of the four-point function above scale µ, ˜ and that a correct treatment must either give a different justification of the ladder approximation, or replace it by a better controlled approximation. Above, I have not discussed how one gets from the skeleton Green functions to the full Green functions. The key to this is to take a Wick ordering covariance which already contains part of the selfenergy. That is, the Gaussian measure changes in a nontrivial way with t. This can be done such that all two-legged insertions appear with the proper renormalization subtractions, and it gives a simple procedure for a rigorous skeleton expansion. Moreover, it makes clear why it is so important to establish regularity of the selfenergy: once the selfenergy appears in the propagator, its regularity is needed to show Lemma 4 (or its sector analogue), on which in turn, all other bounds depend. The condition k0 > d also enters in this lemma and seems indispensable from the point of view of the method developed here. Details about this modified Wick ordering technique will appear later. There is a basic duality in the technique applied to these fermionic models. The determinant bound has to be done in position space, but the regularity bounds use geometric details that are most easily seen in momentum space. The continuous RGE shows very nicely that those terms that require the very detailed regularity analysis are very simple from the combinatorial point of view and vice versa.

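A. Fourier Transformation

Recall that ψ is defined on the doubled time direction $\mathbb{T}_2$ and obeys the antiperiodicity with respect to translations by β, Eq. (7), and that $n_\tau$ was chosen to be even. $\mathbb{T}_2$ is the set $\mathbb{T}_2 = \varepsilon_\tau \mathbb{Z}/2\beta\mathbb{Z} = \mathbb{T} \cup (\mathbb{T} + \beta)$, where
\[ \mathbb{T} = \Bigl\{\tau \in \mathbb{T}_2 : \tau = \varepsilon_\tau k,\ k \in \bigl\{-\tfrac{n_\tau}{2}, \ldots, \tfrac{n_\tau}{2} - 1\bigr\}\Bigr\}. \tag{188} \]
The dual to $\mathbb{T}_2$ is $\mathbb{T}_2^* = \frac{\pi}{\beta}\mathbb{Z}/2n_\tau\mathbb{Z} = \{\omega = \frac{\pi}{\beta}k : k \in \{-n_\tau, \ldots, n_\tau - 1\}\}$. The Fourier transform on $\mathbb{T}_2$ is
\[ \tilde f(\omega) = \varepsilon_\tau \sum_{\tau\in\mathbb{T}_2} e^{-i\omega\tau} f(\tau), \qquad f(\tau) = \frac{1}{2\beta} \sum_{\omega\in\mathbb{T}_2^*} e^{i\omega\tau} \tilde f(\omega). \tag{189} \]
If $f(\tau - \beta) = -f(\tau)$, then $\tilde f(\omega) = 0$ if $\frac{\omega\beta}{\pi}$ is even. In that case, with $\hat f(\omega) = \frac{1}{2}\tilde f(\omega)$,
\[ f(\tau) = \frac{1}{\beta} \sum_{\omega\in M_{n_\tau}} e^{i\omega\tau} \hat f(\omega) \tag{190} \]
with $M_{n_\tau}$ given by Eq. (12). The orthogonality relations are
\[ \int_{\mathbb{T}} d\tau\; e^{i(\omega_n \pm \omega_m)\tau} = \beta\,\delta_{mn}, \qquad \frac{1}{\beta} \sum_{\omega\in M_{n_\tau}} e^{i\omega\tau} = \frac{1}{\varepsilon_\tau}\,(\delta_{\tau 0} - \delta_{\tau\beta}). \tag{191} \]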

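Lemma 9. Let $\beta > 0$, $E_0 \ge 0$, and $n_\tau \ge 2\beta(E_0 + E_{\max})$, where $E_{\max}$ is defined in Eq. (20), and let $\hat\omega$ be defined as in Eq. (14). Then for all $k \in B$ and all $\omega \in M_{n_\tau}$, $|i\hat\omega - E(k)| \le E_0$ implies $|\omega| \le \frac{\pi}{2}E_0$ and $|E(k)| \le 2E_0$, and
\[ \frac{1}{\beta} \sum_{\omega\in M_{n_\tau}} \mathbb{1}\bigl(|i\hat\omega - E(k)| \le E_0\bigr) \le E_0\; \mathbb{1}\bigl(|E(k)| \le 2E_0\bigr). \tag{192} \]

Proof. By Eq. (14), $\operatorname{Im}\hat\omega = \frac{1}{\varepsilon_\tau}(1 - \cos(\omega\varepsilon_\tau))$ and $\operatorname{Re}\hat\omega = \frac{1}{\varepsilon_\tau}\sin(\omega\varepsilon_\tau)$. The condition $|i\hat\omega - E(k)| \le E_0$ implies $|\operatorname{Re}\hat\omega| \le E_0$ and $|\operatorname{Im}\hat\omega + E(k)| \le E_0$. Thus $|\operatorname{Im}\hat\omega| \le E_0 + E_{\max}$. Since $1 - \cos x \ge \frac{2}{\pi^2}x^2$, $(E_0 + E_{\max})\varepsilon_\tau \ge \varepsilon_\tau|\operatorname{Im}\hat\omega| = 1 - \cos(\omega\varepsilon_\tau) \ge \frac{2}{\pi^2}(\omega\varepsilon_\tau)^2$, so $|\omega\varepsilon_\tau| \le \pi\bigl(\beta(E_0 + E_{\max})(2n_\tau)^{-1}\bigr)^{1/2} \le \frac{\pi}{2}$. Since $\frac{\sin x}{x}$ is decreasing on $[0, \frac{\pi}{2}]$, $E_0 \ge |\operatorname{Re}\hat\omega| \ge |\omega|\,\bigl|\frac{\sin(\omega\varepsilon_\tau)}{\omega\varepsilon_\tau}\bigr| \ge \frac{2}{\pi}|\omega|$. So $|\omega| \le \frac{\pi}{2}E_0$. Since $1 - \cos x \le \frac{1}{2}x^2$,
\[ |\operatorname{Im}\hat\omega| \le \frac{1}{2\varepsilon_\tau}(\omega\varepsilon_\tau)^2 \le \frac{\pi^2}{8}\,\frac{1}{n_\tau}\,\beta E_0^2 \le E_0. \tag{193} \]
Thus $|\operatorname{Im}\hat\omega + E(k)| \le E_0$ implies $|E(k)| \le 2E_0$. In terms of indicator functions, this means that $\mathbb{1}\bigl(|i\hat\omega - E(k)| \le E_0\bigr) \le \mathbb{1}\bigl(|\omega| \le \frac{\pi}{2}E_0\bigr)\, \mathbb{1}\bigl(|E(k)| \le 2E_0\bigr)$. The summation over $M_{n_\tau}$ gives
\[ \frac{1}{\beta} \sum_{\omega\in M_{n_\tau}} \mathbb{1}\bigl(|\omega| \le \tfrac{\pi}{2}E_0\bigr) = \frac{1}{\beta} \sum_{n=-\frac{n_\tau}{2}}^{\frac{n_\tau}{2}-1} \mathbb{1}\bigl(|2n+1| \le \tfrac{\beta E_0}{2}\bigr) \le E_0. \tag{194} \]
Note that the last inequality holds in particular if $\frac{\beta E_0}{2} < 1$, because then the sum is empty, hence the result is zero, because $2n + 1$ is always odd.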
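The counting step behind Eq. (194) is that the number of odd integers $2n+1$ with $|2n+1| \le x$, where $x = \beta E_0/2$, is at most $2x = \beta E_0$: it is zero for $x < 1$ and at most $x + 1 \le 2x$ otherwise. The sketch below checks this, together with the two cosine bounds used in the proof, on sample values. The function names are hypothetical, and the count ignores the additional restriction $-n_\tau/2 \le n \le n_\tau/2 - 1$, which can only make the sum smaller.

```python
import math

def odd_count(beta: float, E0: float) -> int:
    """Number of odd integers k = 2n+1 with |k| <= beta*E0/2 (no cutoff in n)."""
    x = beta * E0 / 2.0
    if x < 1.0:
        return 0                    # no odd integer has modulus below 1
    k_max = math.floor(x)
    if k_max % 2 == 0:
        k_max -= 1                  # largest odd integer not exceeding x
    return k_max + 1                # the odd integers -k_max, ..., -1, 1, ..., k_max

def matsubara_sum_bound_ok(beta: float, E0: float) -> bool:
    """Check (1/beta) * odd_count <= E0."""
    return odd_count(beta, E0) / beta <= E0 + 1e-12

def cosine_bounds_ok(n: int = 2001, tol: float = 1e-12) -> bool:
    """Check (2/pi^2) x^2 <= 1 - cos x <= x^2/2 on a grid of [-pi, pi]."""
    ok = True
    for j in range(n):
        x = -math.pi + 2.0 * math.pi * j / (n - 1)
        one_minus_cos = 1.0 - math.cos(x)
        ok = ok and (2.0 / math.pi ** 2) * x * x <= one_minus_cos + tol
        ok = ok and one_minus_cos <= 0.5 * x * x + tol
    return ok
```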



B. Wick Ordering Recall the conventions fixed at the beginning of Sect. 3. Definition 4. Let W0 (η, ψ) = e(η,ψ)0 − 2 (η, C η)0 , and let A00 be the Grassmann algebra generated by (ψ(x))x∈0 . Wick ordering is the C–linear map C : A00 → A00 that takes the following values on the monomials: C (1) = 1, and for n ≥ 1 and X1 , . . . , Xn ∈ 0, "n # Y δ C ψ(X1 ) . . . ψ(Xn ) = W0 (η, ψ) . (195) δη(Xk ) 1

k=1

η=0

Theorem 8. W0 (η, ψ) = C e . Let α1 , . . . , αn be Grassmann variables, and Pn let α(X) = k=1 αk δ0 (X, Xk ). Then " n ! # Y ∂ (α,ψ)0 C ψ(X1 ) . . . ψ(Xn ) = . (196) C e ∂αk (η,ψ)0

k=1

α=0

Proof. By Taylor expansion in η, and by definition of C , # "n X Y X 1 ∂ W0 (η, ψ) = W0 (α, ψ) η(Xk ) n! ∂α(Xk ) X1 ,...,Xn ∈0 k=1 n≥0 α=0 # "n Y X 1 Z δ dX1 . . . dXn W0 (α, ψ) = η(Xk ) n! δα(Xk ) n≥0 k=1 α=0 ! n X 1 Z Y dX1 . . . dXn C = η(Xk )ψ(Xk ) n! n≥0

k=1

X 1 = C (η, ψ)0 n = C e(η,ψ)0 . n!

(197)

n≥0

Equation (196) follows directly from the definition of C .

The next theorem contains the alternative formula Eq. (48) for the Wick ordered monomials used in Sect. 2. Theorem 9. Let 1Ct be defined by Eq. (40). Then Ct ψ(X1 ) . . . ψ(Xn ) = e−1Ct ψ(X1 ) . . . ψ(Xn ). In particular, if Ct depends differentiably on t, then ∂1Ct ∂ Ct ψ(X1 ) . . . ψ(Xn ) = − C ψ(X1 ) . . . ψ(Xn ) . ∂t ∂t P Proof. For any formal power series f (z) = fk z k , 1 (η,ψ)0 f (1Ct ) e (η , Ct η)0 e(η,ψ)0 , =f 2

(198)

(199)

(200)

so e−1Ct e(η,ψ)0 = e− 2 (η,Ct η)0 +(η,ψ)0 , from which Eq. (198) follows by definition of Ct , because derivatives with respect to η commute with 1Ct . If Ct depends differentiably on t, Eq. (199) follows by taking a derivative of Eq. (198) because 1Ct commutes ∂1 with ∂tCt . 1


C. Wick Reordering By Eq. (47) and Eq. (51), Qr (t, ψ) =

1 2

Z

P

m(r ¯P1 ) m(r ¯P2 )

r1 ≥1,r2 ≥1 r1 +r2 =r

m1 =1 m2 =1

Yr1 ,m1 ,r2 ,m2 with

dX Gm1 r1 (t | X (1) ) Gm2 r2 (t | X (2) ) P (X, ψ),

Yr1 ,m1 ,r2 ,m2 =

(201)

where X (1) R= (X1 , . . . , Xm1 ), X (2) = (Xm1 +1 , . . . , Xm1 +m2 ), X = (X1 , . . . , Xm1 +m2 ), P (X, ψ) = dXdY C˙ t (X, Y ) ωX,Y (ψ), and !! ! m1 m2 Y Y δ δ Dt Dt ωX,Y (ψ) = ψ(Xk1 ) ψ(Xm1 +k2 ) . (202) δψ(X) δψ(Y ) k1 =1

k2 =1

Only even m1 and m2 contribute to the sum for Qr (t, ψ) because Gr (t, ψ) is an element of the even subalgebra for all t. Let η (1) =

m1 X

ηk δ0 (X, Xk ),

k=1

η (2) =

m2 X

ηm1 +k δ0 (X, Xm1 +k ),

(203)

k=1

and η = η (1) + η (2) . With this,

# ! " m1 m (1) Y δ Y1 ∂ δ (η ,ψ)0 Dt ψ(Xk1 ) = Dt e δψ(X) δψ(X) ∂ηk k1 =1 k=1 η=0 "m # Y1 ∂ (1) (1) (1) 1 δ e(η ,ψ)0 − 2 (η ,Dt η )0 = (−1)m1 ∂ηk δψ(X) k=1 η=0 "m # Y1 ∂ (1) (1) (1) 1 = (−1)η (1) (X)e(η ,ψ)0 − 2 (η ,Dt η )0 , ∂ηk k=1

since (−1)m1 = 1. Similarly, ! "m m2 Y Y2 ψ(Xk ) = Dt k=1

k=1

(204)

η=0

∂ ∂ηm1 +k

# (2)

(−1)η (Y )e

(η (2) ,ψ)0 − 21 (η (2) ,Dt η (2) )0

.

(205)

η=0

Since η (1) is independent of ηm1 +1 , . . . , ηm1 +m2 , the derivatives with respect to η (2) can be commuted through so that ∂ m1 +m2 Z(η, ψ) (206) P (X, ψ) = ∂η1 . . . ∂ηm1 +m2 η=0 with

(1) (1) (2) (2) 1 1 Z(η, ψ) = (η (1) , C˙ t η (2) ) e(η,ψ)0 − 2 (η , Dt η )0 − 2 (η , Dt η )0 . ˙ t , give Wick ordering of e(η,ψ)0 , the antisymmetry of Dt , and C˙ t = −D ∂ (η (1) , Dt η (2) )0 Dt e(η,ψ)0 . e Z(η, ψ) = − ∂t

(207)

(208)



Lemma 10. Let A and P B be elements of the P Grassmann algebra generated by (η(X))X∈0 , that is, A = I⊂0 aI η I andPB = I⊂0 bI η I , where aI , bI ∈ C, and let F (x) be the formal power series F (x) = r≥0 fr xr . Then ∂L ∂L ∂ = A(η) B( ) F ( , ψ) , (209) A( ) B(η) F (η, ψ) ∂η ∂η ∂η η=0 η=0 where

∂L ∂η

is the derivative with respect to η, acting to the left.

Proof. The proof is an easy exercise in Grassmann algebra and is left to the reader. Let m1 + m2 = µ. By Lemma 10 ∂L ∂µ Z(η, ψ) = η1 . . . ηµ Z( , ψ) . ∂η1 . . . ∂ηµ ∂η η=0 η=0 By definition of η (1) and η (2) , X µ m1 X ∂L ∂L , D = t ∂η (1) ∂η (2)

k1 =1 k2 =m1 +1

∂L ∂L Dt (Xk1 , Xk2 ) . ∂ηk1 ∂ηk2

(210)

(211)

Every derivative can act only once on η1 . . . ηµ without giving zero because the η’s are all different. So ∂L ∂L , D η1 . . . ηµ exp t ∂η (1) ∂η (2) X Y ∂L ∂L = η1 . . . η µ Dt (Xk1 , Xk2 ) , (212) ∂ηk1 ∂ηk2 L⊂M1 ×M2

(k1 ,k2 )∈L

where $M_1 = \{1, \ldots, m_1\}$ and $M_2 = \{m_1 + 1, \ldots, \mu\}$. $L = \emptyset$ contributes to the sum, but gives the $t$-independent result $\eta_1 \ldots \eta_\mu$, so the $t$-derivative removes this term in Eq. (208). Let $\pi_1(k_1, k_2) = k_1$ and $\pi_2(k_1, k_2) = k_2$. For a term given by $L \ne \emptyset$ to be nonzero, $\pi_1|_L$ and $\pi_2|_L$ must be injective, because otherwise a derivative would act twice. Thus the sum can be restricted to the set
\[ \mathcal{L} = \{L \subset M_1 \times M_2 : L \ne \emptyset, \text{ and } \pi_k|_L \text{ injective for } k = 1, 2\}. \tag{213} \]
If $L \in \mathcal{L}$, $|\pi_1(L)| = |\pi_2(L)|$. Thus
\[ \mathcal{L} = \bigcup_{i=1}^{\min\{m_1,m_2\}}\ \bigcup_{\substack{B_1\subset M_1,\, B_2\subset M_2 \\ |B_1|=|B_2|=i}} \mathcal{L}(B_1, B_2) \tag{214} \]
with
\[ \mathcal{L}(B_1, B_2) = \{L \in \mathcal{L} : \pi_1(L) = B_1 \text{ and } \pi_2(L) = B_2\}. \tag{215} \]
Let
\[ B_1 = \{b_1, \ldots, b_i\},\ 1 \le b_1 < \ldots < b_i \le m_1, \qquad B_2 = \{b_{i+1}, \ldots, b_{2i}\},\ m_1 + 1 \le b_{i+1} < \ldots < b_{2i} \le m_1 + m_2, \tag{216} \]
then for any $L \in \mathcal{L}(B_1, B_2)$, there is a unique permutation $\pi \in S_i$ such that
\[ L = \{(b_k, b_{i+\pi(k)}) : k \in \{1, \ldots, i\}\}. \tag{217} \]
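The decomposition in Eqs. (213) to (217) says that the admissible sets $L$ are exactly the nonempty partial matchings between $M_1$ and $M_2$, so their number is $\sum_{i\ge 1}\binom{m_1}{i}\binom{m_2}{i}\, i!$. For small $m_1, m_2$ this can be confirmed by enumeration. The code below is a brute-force sketch with illustrative function names; it only counts the sets and does not track the sign factors.

```python
from itertools import combinations
from math import comb, factorial

def count_line_sets(m1: int, m2: int) -> int:
    """Count nonempty L in M1 x M2 whose two coordinate projections are injective."""
    M1 = range(1, m1 + 1)
    M2 = range(m1 + 1, m1 + m2 + 1)
    pairs = [(a, b) for a in M1 for b in M2]
    count = 0
    # Sets with more than min(m1, m2) elements cannot be injective in both coordinates.
    for size in range(1, min(m1, m2) + 1):
        for L in combinations(pairs, size):
            firsts = {a for a, _ in L}
            seconds = {b for _, b in L}
            if len(firsts) == size and len(seconds) == size:
                count += 1
    return count

def count_by_formula(m1: int, m2: int) -> int:
    """Sum over i of (choices of B1) * (choices of B2) * (permutations pairing them)."""
    return sum(comb(m1, i) * comb(m2, i) * factorial(i)
               for i in range(1, min(m1, m2) + 1))
```

For $m_1 = m_2 = 4$ both counts give 208, of which the $i = 2$ term contributes $36 \cdot 2 = 72$.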

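Thus the sum over $L$ splits into a sum over $i \ge 1$, a sum over sequences $b = (b_1, \ldots, b_{2i})$ with
\[ 1 \le b_1 < b_2 < \ldots < b_i \le m_1 < b_{i+1} < \ldots < b_{2i} \le m_1 + m_2, \tag{218} \]
and a sum over permutations $\pi \in S_i$. Therefore
\[ \eta_1 \ldots \eta_\mu\, \exp\Bigl(\frac{\partial_L}{\partial\eta^{(1)}},\, D_t \frac{\partial_L}{\partial\eta^{(2)}}\Bigr) = \sum_{i=1}^{\min\{m_1,m_2\}} \sum_{b} \sum_{\pi\in S_i} \prod_{k=1}^{i} D_t(X_{b_k}, X_{b_{i+\pi(k)}})\; H(i, b, \pi) \tag{219} \]
with
\[ H(i, b, \pi) = \eta_1 \ldots \eta_\mu \prod_{k=1}^{i} \frac{\partial_L}{\partial\eta_{b_k}} \frac{\partial_L}{\partial\eta_{b_{i+\pi(k)}}}. \tag{220} \]
The derivatives give (since $m_1$ and $m_2$ are even and hence $(-1)^{m_k} = 1$)
\[ H(i, b, \pi) = (-1)^{\frac{i(i+1)}{2} - \sum_{k=1}^{2i} b_k}\; \varepsilon(\pi) \prod_{\substack{k=1 \\ k\notin\{b_1,\ldots,b_{2i}\}}}^{\mu} \eta_k. \tag{221} \]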
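The remaining derivatives acting in Eq. (210) come from the Wick ordered exponential in Eq. (208). They now act on $H$. By Lemma 10, they give
$$\left[\prod_{\substack{k=1 \\ k \notin \{b_1,\ldots,b_{2i}\}}}^{\mu} \eta_k \;\; \Omega_{D_t}\, e^{\left(\frac{\partial_L}{\partial\eta},\,\psi\right)}\right]_{\eta=0} = \left[\prod_{\substack{k=1 \\ k \notin \{b_1,\ldots,b_{2i}\}}}^{\mu} \frac{\partial}{\partial\eta_k} \;\; \Omega_{D_t}\, e^{(\eta,\psi)}\right]_{\eta=0} = \Omega_{D_t}\left(\prod_{\substack{k=1 \\ k \notin \{b_1,\ldots,b_{2i}\}}}^{m_1+m_2} \psi(X_k)\right). \tag{222}$$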

The final step is to rename the integration variables to rewrite the Wick ordered product in the form in which it appears in Eq. (52). Before this is done, it is necessary to permute the arguments of the $G_{mr}(t \mid X^{(k)})$ such that $X_{b_1}, \ldots, X_{b_i}$ appear as the first $i$ entries. Since $b_1 < \ldots < b_i$, one can first permute $(X_1, \ldots, X_{b_1}) \to (X_{b_1}, X_1, \ldots, X_{b_1 - 1})$. This takes $b_1 - 1$ transpositions and hence gives a factor $(-1)^{b_1 - 1}$. The next permutation
$$(X_{b_1}, X_1, \ldots, X_{b_1 - 1}, X_{b_1 + 1}, \ldots, X_{b_2}) \to (X_{b_1}, X_{b_2}, X_1, \ldots, X_{b_2 - 1}) \tag{223}$$
gives a factor $(-1)^{b_2 - 2}$ because $X_{b_1}$ has already been moved, etc. Thus
$$G_{m_1 r_1}(t \mid X_1, \ldots, X_{m_1}) = (-1)^{\sum_{k=1}^{i} (b_k - k)}\, G_{m_1 r_1}(t \mid X_{b_1}, \ldots, X_{b_i}, \tilde X) \tag{224}$$
with $\tilde X = (X_{\rho_1}, \ldots, X_{\rho_{m_1 - i}})$, where $\{1, \ldots, m_1\} \setminus \{b_1, \ldots, b_i\} = \{\rho_1, \ldots, \rho_{m_1 - i}\}$ and $\rho_1 < \ldots < \rho_{m_1 - i}$. Similarly,
$$G_{m_2 r_2}(t \mid X_2) = (-1)^{\sum_{k=1}^{i} (b_{i+k} - k - m_1)}\, G_{m_2 r_2}(t \mid X_{b_{i+1}}, \ldots, X_{b_{2i}}, X). \tag{225}$$
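The sign in Eq. (224) can be checked by brute force. The following is a minimal sketch, assuming only that $G_{m_1 r_1}$ is totally antisymmetric in its arguments; the helper names `permutation_sign`, `front_sign` and `predicted_sign` are illustrative and do not come from the paper. It computes the sign of the reordering that moves the entries labelled $b_1 < \ldots < b_i$ to the front, keeping the remaining labels in increasing order, and compares it with $(-1)^{\sum_k (b_k - k)}$.

```python
from itertools import combinations


def permutation_sign(perm):
    """Sign of an arrangement of distinct integers, from its inversion count."""
    inversions = sum(
        1
        for a in range(len(perm))
        for c in range(a + 1, len(perm))
        if perm[a] > perm[c]
    )
    return -1 if inversions % 2 else 1


def front_sign(m, b):
    """Sign of the reordering (1..m) -> (b_1..b_i, remaining labels increasing)."""
    rest = [x for x in range(1, m + 1) if x not in b]
    return permutation_sign(list(b) + rest)


def predicted_sign(b):
    """The sign (-1)^{sum_k (b_k - k)} that appears in Eq. (224)."""
    exponent = sum(bk - k for k, bk in enumerate(b, start=1))
    return -1 if exponent % 2 else 1


def check_all(m):
    """Compare both signs for every increasing sequence b inside {1..m}."""
    return all(
        front_sign(m, b) == predicted_sign(b)
        for i in range(1, m + 1)
        for b in combinations(range(1, m + 1), i)
    )
```

Each $b_k$ has to jump over the $b_k - k$ smaller labels that are not in $\{b_1, \ldots, b_i\}$, which is exactly the inversion count used in `front_sign`.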

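Thus the $b$-dependent sign factor cancels, and upon renaming of the integration variables, $V_k = X_{b_k}$, $W_k = X_{b_{i+k}}$, etc., and with $m = m_1 + m_2 - 2i$, $Y$ becomes
$$Y_{r_1, m_1, r_2, m_2} = \sum_{i=1}^{\min\{m_1,m_2\}} \int dX \;\Omega_{D_t}\left(\prod_{k=1}^{m} \psi(X_k)\right) Y(X) \tag{226}$$
with
$$Y(X) = (-1)^{\frac{i(i+1)}{2}} \int dV\, dW \sum_{b} G_{m_1 r_1}(t \mid V, X_1)\, G_{m_2 r_2}(t \mid W, X_2) \sum_{\pi \in S_i} \varepsilon(\pi) \left(-\frac{\partial}{\partial t} \prod_{k=1}^{i} D_t(V_k, W_{\pi(k)})\right) \tag{227}$$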

and $X$, $X_1$, $X_2$, $V$, and $W$ given in Proposition 2. The summand does not depend on $b$ any more, so the sum over $b$, with the constraint Eq. (218), gives
$$\sum_{\substack{B_1 \subset M_1,\ B_2 \subset M_2 \\ |B_1|=|B_2|=i}} 1 = \binom{m_1}{i} \binom{m_2}{i} = \kappa_{m_1 m_2 i}. \tag{228}$$
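The counting behind Eqs. (214)–(217) and (228) can be confirmed by direct enumeration for small $m_1$, $m_2$. The sketch below is only an illustration; the function names `count_pairings` and `kappa` are hypothetical. It lists all nonempty sets $L \subset M_1 \times M_2$ whose two projections are injective, groups them by $i = |L|$, and compares the result with $\kappa_{m_1 m_2 i}\, i!$, that is, one set $L$ for each choice of $(B_1, B_2)$ and each permutation $\pi \in S_i$.

```python
from itertools import combinations
from math import comb, factorial


def count_pairings(m1, m2):
    """Count nonempty L in M1 x M2 with both projections injective, keyed by |L|."""
    M1 = range(1, m1 + 1)
    M2 = range(m1 + 1, m1 + m2 + 1)
    pairs = [(k1, k2) for k1 in M1 for k2 in M2]
    counts = {}
    for i in range(1, min(m1, m2) + 1):
        n = 0
        for L in combinations(pairs, i):
            firsts = {k1 for k1, _ in L}
            seconds = {k2 for _, k2 in L}
            # Injective projections: no index is used by two different pairs.
            if len(firsts) == i and len(seconds) == i:
                n += 1
        counts[i] = n
    return counts


def kappa(m1, m2, i):
    """Number of pairs (B1, B2) with |B1| = |B2| = i, as in Eq. (228)."""
    return comb(m1, i) * comb(m2, i)
```

For example, `count_pairings(3, 4)` can be compared entry by entry with `kappa(3, 4, i) * factorial(i)`.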

The sum over permutations $\pi$ gives the determinant of $D_t^{(i)}$. Finally,
$$G_{m_1 r_1}(t \mid V_1, \ldots, V_i, X_1) = (-1)^{(m_1 - i) i}\, G_{m_1 r_1}(t \mid X_1, V_1, \ldots, V_i) \tag{229}$$
and
$$G_{m_2 r_2}(t \mid W_1, \ldots, W_i, X_2) = (-1)^{\frac{i(i-1)}{2}}\, G_{m_2 r_2}(t \mid W_i, \ldots, W_1, X_2), \tag{230}$$

which cancels the sign factor, and thus proves Eq. (53).

The graphical interpretation of the above derivation is that $G_{m_1 r_1}(t \mid X_1)$ and $G_{m_2 r_2}(t \mid X_2)$ are vertices of a graph, with the set of legs of vertex 1 given by $\eta_1 \ldots \eta_{m_1}$ and the set of legs of vertex 2 given by $\eta_{m_1+1} \ldots \eta_{m_1+m_2}$. The operator $\left(\frac{\partial_L}{\partial\eta^{(1)}}, D_t \frac{\partial_L}{\partial\eta^{(2)}}\right)$ generates lines between these two vertices by removing factors of $\eta$ in the monomial $\eta_1 \ldots \eta_{m_1+m_2}$. The number of these internal lines is $i$, and the remaining $m = m_1 + m_2 - 2i$ factors $\eta_k$ correspond to external legs of the graph. $i \ge 1$ must hold because a derivative with respect to $t$ is taken, so the graph is connected. The rearrangement using permutations is simply the counting of all those graphs that have the same value. There are $\binom{m_1}{i}$ ways of picking $i$ legs from the $m_1$ legs of vertex number 1 and $\binom{m_2}{i}$ ways of picking $i$ legs from the $m_2$ legs of vertex number 2, which gives $\kappa_{m_1 m_2 i}$. And there are $i!$ ways of pairing these legs to form internal lines. One could also have used the antisymmetry of $G_{m_2 r_2}(t)$ to get this factor explicitly, i.e. to permute to get $i! \prod_{k=1}^{i} D_t(V_k, W_k)$ instead of the determinant. This is the result one would have got for bosons since there are no sign factors in that case. The order of the $W_k$ in $G_{m_2 r_2}(t \mid W_i, \ldots, W_1, X_2)$ is chosen reversed to cancel the $i$-dependent sign factor for $\pi = \mathrm{id}$. The graph for $\pi = \mathrm{id}$ is the planar graph drawn in Fig. 2. The sum over permutations $\pi \neq \mathrm{id}$ corresponds to the sum over all graphs with the fixed two vertices and $i$ internal lines. In particular, it contains the nonplanar graphs. If one restricts the sum to planar graphs, only the shifts $\pi_k(j) = j + k \bmod i$, $k \in \{0, \ldots, i - 1\}$, remain, and this is only the case if $m_1 = i$ or $m_2 = i$. The factor $i!$ makes the combinatorial difference between the exact theory and the ‘planarized’ theory.

Acknowledgement. I thank Volker Bach, Walter Metzner, Erhard Seiler, and Christian Wieczerkowski for discussions. I also thank Christian Lang for his hospitality at a very pleasant visit to the University of Graz, where this work was started.

References

1. Wegner, F.: In: Phase Transitions and Critical Phenomena vol. 6, edited by C. Domb and M. Green, London–New York, Academic Press
2. Wilson, K., Kogut, J.: Phys. Reports 12, 75 (1974)
3. Polchinski, J.: Nucl. Phys. B 231, 269 (1984)
4. Wieczerkowski, C.: Commun. Math. Phys. 120, 149 (1988)
5. Keller, G., Kopper, C., Salmhofer, M.: Helv. Phys. Acta 65, 32 (1992)
6. Keller, G., Kopper, C.: Commun. Math. Phys. 148, 445 (1992); Phys. Lett. B273, 323 (1991)
7. Keller, G.: Commun. Math. Phys. 161, 311 (1994)
8. Gallavotti, G.: Rev. Mod. Phys. 57, 471 (1985)
9. Gallavotti, G. and Nicolò, F.: Commun. Math. Phys. 100, 545 (1985) and 101, 247 (1985)
10. Feldman, J., Hurd, T., Rosen, L., Wright, J.: QED: A Proof of Renormalizability. Springer Lecture Notes in Physics 312, Berlin–Heidelberg–New York: Springer-Verlag, 1988
11. Brydges, D.C., Yau, H.-T.: Commun. Math. Phys. 129, 351 (1990)
12. Brydges, D.C., Dimock, J., Hurd, T.: Commun. Math. Phys. 172, 143 (1995); mp-arc 96-538, mp-arc 96-681, and references therein
13. Salmhofer, M.: Fermionic sign cancellations in the continuous renormalization group equation. Phys. Lett. 408 B, 245 (1997)
14. Gawedzki, K. and Kupiainen, A.: Commun. Math. Phys. 102, 1 (1985)
15. Feldman, J., Magnen, J., Rivasseau, V. and Sénéor, R.: Commun. Math. Phys. 103, 67 (1986)
16. Benfatto, G., Gallavotti, G.: J. Stat. Phys. 59, 541 (1990)
17. Benfatto, G., Gallavotti, G., Procacci, A., Scoppola, B.: Commun. Math. Phys. 160, 93 (1994)
18. Bonetto, F., Mastropietro, V.: Commun. Math. Phys. 172, 57 (1995)
19. Feldman, J. and Trubowitz, E.: Helv. Phys. Acta 63, 157 (1990), ibid. 64, 213 (1991)
20. Feldman, J., Magnen, J., Rivasseau, V. and Trubowitz, E.: Helv. Phys. Acta 65, 679 (1992)
21. Feldman, J., Salmhofer, M. and Trubowitz, E.: J. Stat. Phys. 84, 1209 (1996)
22. Feldman, J., Salmhofer, M. and Trubowitz, E.: Regularity of the Moving Fermi Surface: RPA Contributions. To appear in Commun. Pure Appl. Math.
23. Feldman, J., Salmhofer, M. and Trubowitz, E.: Regularity of the Moving Fermi Surface: The Full Selfenergy. To appear in Commun. Pure Appl. Math.
24. Salmhofer, M.: Improved Power Counting and Fermi Surface Renormalization. Rev. Math. Phys. 10 (1998)
25. Feldman, J., Knörrer, H., Lehmann, D., Trubowitz, E.: In: Constructive Physics, V. Rivasseau (ed.), Springer Lecture Notes in Physics, 1995
26. Magnen, J., Rivasseau, V.: Mathematical Physics Electronic Journal 1, No. 3 (1995)
27. I thank Walter Metzner for pointing this out to me
28. Feldman, J., Knörrer, H. and Trubowitz, E.: To appear in Commun. Math. Phys.
29. Bratteli, O., Robinson, D.: Operator Algebras and Quantum Statistical Mechanics. Berlin–Heidelberg–New York: Springer, 1979
30. Berezin, F.A.: The Method of Second Quantization. New York: Academic Press, 1966
31. Brydges, D.C. and Munoz Maya, I.: Journal of Theoretical Probability 4, 371 (1991)
32. Lüscher, M.: Commun. Math. Phys. 54, 283 (1976)
33. Brydges, D.C. and Wright, J.: J. Stat. Phys. 51, 435 (1988)
34. Kohn, W. and Luttinger, J.M.: Phys. Rev. Lett. 15, 524 (1965)
35. Feldman, J., Knörrer, H., Sinclair, R. and Trubowitz, E.: Helv. Phys. Acta 70, 154 (1997)
36. Fetter, A.L. and Walecka, J.D.: Quantum Theory of Many-Particle Systems. New York: McGraw-Hill, 1971
37. Beckenbach, E.F., Bellman, R.: Inequalities. Berlin–Heidelberg–New York: Springer, 1961
38. Vignale, G. et al.: Phys. Rev. B 39, 2956 (1989); Langmann, E., Salmhofer, M., Wallin, M.: Unpublished

Communicated by D. C. Brydges

Commun. Math. Phys. 194, 297 – 321 (1998)

Communications in Mathematical Physics

Monopoles and the Gibbons–Manton Metric

Roger Bielawski

Max-Planck-Institut für Mathematik, Gottfried-Claren-Strasse 26, 53225 Bonn, Germany. E-mail: [email protected]

Received: 6 June 1997 / Accepted: 7 October 1997

Abstract: We show that, in the region where monopoles are well separated, the $L^2$-metric on the moduli space of $n$-monopoles is exponentially close to the $T^n$-invariant hyperkähler metric proposed by Gibbons and Manton. The proof is based on a description of the Gibbons–Manton metric as a metric on a certain moduli space of solutions to Nahm’s equations, and on twistor methods. In particular, we show how the twistor description of monopole metrics determines the asymptotic metric. The construction of the Gibbons–Manton metric in terms of Nahm’s equations yields a class of interesting (pseudo)-hyperkähler metrics. For example we show, for each semisimple Lie group $G$ and a maximal torus $T \le G$, the existence of a $G \times T$ invariant (pseudo)-hyperkähler manifold whose hyperkähler quotients by $T$ are precisely Kronheimer’s hyperkähler metrics on $G^{\mathbb{C}}/T^{\mathbb{C}}$. A similar result holds for Kronheimer’s ALE-spaces.

The moduli space $M_n$ of (framed) static $SU(2)$-monopoles of charge $n$, i.e. solutions to Bogomolny equations $d_A \Phi = *F$, carries a natural hyperkähler metric [1]. The geodesic motion in this metric is a good approximation to the dynamics of low energy monopoles [26, 33]. For the charge $n = 2$ the metric has been determined explicitly by Atiyah and Hitchin [1], and it follows from their explicit formula that when the two monopoles are well separated, the metric becomes (exponentially fast) the Euclidean Taub-NUT metric with a negative mass parameter. It was also shown by N. Manton [27] that this asymptotic metric can be determined by treating well-separated monopoles as dyons. The equations of motion for a pair of dyons in $\mathbb{R}^3$ are found to be equivalent to the equations for geodesic motion on Taub-NUT space. For an arbitrary charge $n$, it was shown in [3] that, when the individual monopoles are well-separated, the $L^2$-metric is close (as the inverse of the separation distance) to the flat Euclidean metric.
Gibbons and Manton [14] have then calculated the Lagrangian for the motion of $n$ dyons in $\mathbb{R}^3$ and shown that it is equivalent to the Lagrangian for geodesic motion in a hyperkähler metric on a torus bundle over the configuration space $\tilde C_n(\mathbb{R}^3)$. This metric is $T^n$-invariant and has a simple algebraic form. Gibbons and Manton have conjectured, by analogy with the $n = 2$ case, that the exact $n$-monopole metric differs from their metric by an exponentially small amount as the separation gets large. We shall prove this conjecture here.

Our strategy is as follows. We construct a certain moduli space $\tilde M_n$ of solutions to Nahm’s equations which carries a $T^n$-invariant hyperkähler metric. Using twistor methods we identify this metric as the Gibbons–Manton metric. Finally, we show that the metrics on $\tilde M_n$ and $M_n$ are exponentially close. This proof adapts equally well to the asymptotic behaviour of $SU(N)$-monopole metrics with maximal symmetry breaking, as will be shown elsewhere.

The asymptotic picture can be explained in the twistor setting. We recall that a monopole is determined (up to framing) by a curve $S$, the spectral curve, in $T\mathbb{CP}^1$, which satisfies certain conditions [16]. One of these is triviality of the line bundle $L^{-2}$ over $S$, and a nonzero section of this bundle is the other ingredient needed to determine the metric [19, 1]. Asymptotically we have now the following situation. When the individual monopoles become well separated the spectral curve of the $n$-monopole degenerates (exponentially fast) into the union of spectral curves $S_i$ of individual monopoles, while the section of $L^{-2}$ becomes (also exponentially fast) $n$ meromorphic sections of $L^{-2}$ over the individual $S_i$. The zeros and poles of these sections occur only at the intersection points of the curves $S_i$. This information (and the topology of the asymptotic region of $M_n$) is, as we show in the last section, sufficient to conclude that the asymptotic metric is the Gibbons–Manton metric.

The construction of the moduli space of solutions to Nahm’s equations which gives the Gibbons–Manton metric admits various generalizations. Some of them are described in Sect. 4. Let us recall that Kronheimer [23] has shown existence of hyperkähler structures $M(\tau_1, \tau_2, \tau_3)$ on $G^{\mathbb{C}}/T^{\mathbb{C}}$, where $G$ is a compact semisimple Lie group and $T \le G$ is a maximal torus. These structures are parameterized by the cohomology classes $\tau_1, \tau_2, \tau_3 \in \mathrm{Lie}(T)$ of the three Kähler forms. We show (in Sect. 4) that there is a (pseudo)-hyperkähler manifold $M_G$ with a tri-Hamiltonian action of $T$ such that, if $\mu : M_G \to \mathrm{Lie}(T) \otimes \mathbb{R}^3$ is the hyperkähler moment map, then the hyperkähler quotient $\mu^{-1}(\tau_1, \tau_2, \tau_3)/T$ of $M_G$ by $T$ is precisely Kronheimer’s $M(\tau_1, \tau_2, \tau_3)$. A similar construction can be done for Kronheimer’s ALE-spaces.

The article is organized as follows. In Sects. 1 and 2 we recall the definitions of the Gibbons–Manton and monopole metrics. In Sect. 3 we introduce the moduli space $\tilde M_n$ of solutions to Nahm’s equations and give heuristic arguments why the metric on $\tilde M_n$ should be exponentially close to the monopole metric. In Sect. 4, as a preliminary step to study $\tilde M_n$ we introduce yet another moduli space of solutions to Nahm’s equations, somewhat simpler than $\tilde M_n$. In that section we also discuss the relation with Kronheimer’s metrics mentioned above. In Sect. 5 we identify $\tilde M_n$ as a differential, complex, and finally complex-symplectic manifold. In Sect. 6 we calculate the twistor space of $\tilde M_n$ and identify its hyperkähler metric as the Gibbons–Manton metric. In Sect. 7 we finally show that the monopole metric and the metric on $\tilde M_n$ are exponentially close. The short Sect. 8 shows how one can read off the Gibbons–Manton metric, as the asymptotic form of the monopole metric, from the twistor description of the latter.

1. The Gibbons–Manton metric

The Gibbons–Manton metric [14] is an example of a $4n$-dimensional (pseudo)-hyperkähler metric admitting a tri-Hamiltonian (hence isometric) action of the $n$-dimensional torus $T^n$. Such metrics have particularly nice properties and were studied by several authors [25, 18, 32]. The Gibbons–Manton metric was described as a hyperkähler quotient of a flat quaternionic vector space by Gibbons and Rychenkova in [15]. We recall here this description, which we slightly modify to better suit our purposes.

We start with flat hyperkähler metrics $g_1$ and $g_2$ on $M_1 = (S^1 \times \mathbb{R}^3)^n$ and $M_2 = \mathbb{H}^{n(n-1)/2}$. We consider a pseudo-hyperkähler metric on the product manifold $M = M_1 \times M_2$ given by $g = g_1 - g_2$. The complex structures on $\mathbb{H}$ are given by the right multiplication by quaternions $i, j, k$. The metric $g_1$ is invariant under the obvious action (by translations) of $T^n = (S^1)^n$ and the metric $g_2$ is invariant under the left diagonal action of $T^{n(n-1)/2}$. We consider a homomorphism $\phi : T^{n(n-1)/2} \to T^n$ given by
$$(t_{ij})_{i<j} \mapsto \left(\prod_{j=i+1}^{n} t_{ij} \prod_{j=1}^{i-1} t_{ji}^{-1}\right)_{i=1,\ldots,n}.$$
This defines an action of $T^{n(n-1)/2}$ on $M = M_1 \times M_2$ by $t \cdot (m_1, m_2) = (\phi(t) \cdot m_1, t \cdot m_2)$. Gibbons and Rychenkova have shown that the hyperkähler quotient of $(M, g)$ by this action of $T^{n(n-1)/2}$ is the Gibbons–Manton metric. We remark that, if we choose coordinates $(t_i, x_i)$ on $M_1$, $t_i \in S^1$ and $x_i \in \mathbb{R}^3$, and quaternionic coordinates $q_{ij}$, $i < j$, on $\mathbb{H}^{n(n-1)/2}$, then the moment map equations are:
$$\tfrac{1}{2}\, q_{ij}\, i\, \bar q_{ij} = x_i - x_j. \tag{1.1}$$
As long as $x_i \neq x_j$ for $i \neq j$, the torus $T^{n(n-1)/2}$ acts freely on the zero-set of the moment map. The quotient of this set by $T^{n(n-1)/2}$ is a smooth hyperkähler manifold which we denote by $M_{GM}$. The action of $T^n$ on $M_1$ induces a free tri-Hamiltonian action on $M_{GM}$ for which the moment map is just $(x_1, \ldots, x_n)$. This makes $M_{GM}$ into a $T^n$-bundle over the configuration space $\tilde C_n(\mathbb{R}^3)$ of $n$ distinct points in $\mathbb{R}^3$. We shall now determine this bundle. We recall that a basis of $H_2(\tilde C_n(\mathbb{R}^3), \mathbb{Z})$ is given by the $n(n-1)/2$ 2-spheres
$$S^2_{ij} = \{x_k \in \mathbb{R}^3;\ |x_i - x_j| = \mathrm{const},\ x_k = \mathrm{const} \text{ if } k \neq i, j\}, \tag{1.2}$$
where $i < j$. We have

Proposition 1.1. The hyperkähler moment map for the action of $T^n$ makes $M_{GM}$ into a $T^n$-bundle over $\tilde C_n(\mathbb{R}^3)$ determined by the element $(s_1, \ldots, s_n)$ of $H^2(\tilde C_n(\mathbb{R}^3), \mathbb{Z}^n)$ given by
$$s_k(S^2_{ij}) = \begin{cases} -1 & \text{if } k = i \\ 1 & \text{if } k = j \\ 0 & \text{otherwise.} \end{cases}$$

Proof. From the formula (1.1) it follows that restricting the bundle to a fixed $S^2_{ij}$ is equivalent to considering the case $n = 2$. In other words $s_k(S^2_{ij}) = 0$ if $k \neq i, j$ and we have to consider only one quaternionic coordinate $q_{ij}$. The zero-set of the moment map is $\tfrac{1}{2} q_{ij} i \bar q_{ij} = x_i - x_j$ and the circle $S^1$ by which we quotient acts by $t \cdot (q_{ij}, (t_i, x_i), (t_j, x_j)) = (t q_{ij}, (t t_i, x_i), (t^{-1} t_j, x_j))$. The quotient can be obtained by setting $t_i = 1$ and the induced action of the $i$th generator $s_i$ of $T^n$ is then given by left multiplication by $s_i^{-1}$ on $q_{ij}$. Since the map $q_{ij} \to \tfrac{1}{2} q_{ij} i \bar q_{ij}$ with the left action of $S^1$ on $\{q_{ij} \in \mathbb{H};\ |q_{ij}| = 1\}$ is the Hopf bundle, it follows that $s_i(S^2_{ij}) = -1$. A similar argument shows that $s_j(S^2_{ij}) = 1$.
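The homomorphism $\phi$ used in the quotient construction above can be checked numerically. This is a minimal sketch with hypothetical names (`phi`, `sample_torus_point`), representing torus elements as unit complex numbers. It verifies that $\phi$ is multiplicative, that for $n = 2$ it sends $t$ to $(t, t^{-1})$ as used in the proof of Proposition 1.1, and that the product of the components of $\phi(t)$ is 1, since every $t_{ij}$ enters once directly and once inverted.

```python
import cmath
from itertools import combinations


def phi(t, n):
    """Image of t = {(i, j): t_ij, i < j} under phi.

    Component i is prod_{j > i} t_ij * prod_{j < i} t_ji^{-1}, indices 1..n.
    """
    image = []
    for i in range(1, n + 1):
        s = 1 + 0j
        for j in range(i + 1, n + 1):
            s *= t[(i, j)]
        for j in range(1, i):
            s /= t[(j, i)]
        image.append(s)
    return image


def sample_torus_point(n, angle):
    """An arbitrary test point of T^{n(n-1)/2}: unit complex numbers with distinct phases."""
    return {
        pair: cmath.exp(1j * angle * (k + 1))
        for k, pair in enumerate(combinations(range(1, n + 1), 2))
    }
```

The choice of test angles is arbitrary; any unit complex numbers would do.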

In particular, $(t, x) = (t_i, x_i)$ form local coordinates on $M_{GM}$. The metric tensor can be then written in the form [32]:
$$g = \Phi\, dx \cdot dx + \Phi^{-1} (dt + A)^2,$$
where the matrix $\Phi$ and the 1-form $A$ depend only on the $x_i$ and satisfy certain linear PDE’s. In particular, $\Phi$ determines the metric. For the Gibbons–Manton metric
$$4\Phi_{ij} = \begin{cases} 1 - \sum_{k \neq i} \dfrac{1}{\|x_i - x_k\|} & \text{if } i = j \\[2mm] \dfrac{1}{\|x_i - x_j\|} & \text{if } i \neq j. \end{cases}$$

2. Nahm’s Equations and Monopole Metrics

We shall recall in this section the description of the $L^2$-metric on the moduli space of charge $n$ $SU(2)$-monopoles in terms of Nahm’s equations. A proof that the Nahm transform [30, 16] between the two moduli spaces is an isometry was given by Nakajima in [31]. One starts with the space $\mathcal{A}$ of quadruples $(T_0, T_1, T_2, T_3)$ of smooth $\mathfrak{u}(n)$-valued functions on $(-1, 1)$ such that $T_1, T_2, T_3$ have simple poles at $\pm 1$ with residues $\tfrac{1}{2}\rho(\sigma_i)$, $i = 1, 2, 3$, where $\rho : \mathfrak{su}(2) \to \mathfrak{u}(n)$ is the standard irreducible $n$-dimensional representation of $\mathfrak{su}(2)$ and $\sigma_i$ are the Pauli matrices. Equipped with the $L^2$-norm (given by a biinvariant inner product on $\mathfrak{u}(n)$), $\mathcal{A}$ becomes a flat quaternionic affine space. There is an isometric and triholomorphic action of the gauge group $\mathcal{G}$ of $U(n)$-valued functions $g : [-1, 1] \to U(n)$ which are 1 at $\pm 1$:
$$T_0 \mapsto \mathrm{Ad}(g) T_0 - \dot g g^{-1}, \qquad T_i \mapsto \mathrm{Ad}(g) T_i, \quad i = 1, 2, 3. \tag{2.1}$$
The zero-set of the hyperkähler moment map for this action is then described by Nahm’s equations [30]:
$$\dot T_i + [T_0, T_i] + \frac{1}{2} \sum_{j,k=1,2,3} \varepsilon_{ijk} [T_j, T_k] = 0, \quad i = 1, 2, 3. \tag{2.2}$$
The quotient of the space of solutions by $\mathcal{G}$ is a smooth hyperkähler manifold $M_n$ of dimension $4n$. By the above mentioned result of Nakajima, $M_n$ is the moduli space of (framed) charge $n$ $SU(2)$-monopoles. With respect to any complex structure $M_n$ is biholomorphic to the space of based rational maps of degree $n$ on $\mathbb{CP}^1$ [13]. If we replace $U(n)$ by $SU(n)$ (resp. by $PSU(n)$) in the above description, we obtain the moduli space of strongly centered (resp. centered) $SU(2)$-monopoles of charge $n$.

Remark 2.1. A similar construction can be done for any compact Lie group $G$. We require $\rho : \mathfrak{su}(2) \to \mathfrak{g}$ to be a Lie algebra homomorphism whose image lies in the regular part of $\mathfrak{g}$. We obtain a smooth hyperkähler manifold of dimension $4\,\mathrm{rank}\,G$ which can be identified with a totally geodesic submanifold of a certain moduli space of $SU(N)$-monopoles (with a minimal symmetry breaking). Alternatively, as a complex manifold, it is a desingularization of $\mathfrak{h}^{\mathbb{C}} \times T^{\mathbb{C}}/W$, where $T^{\mathbb{C}}$ is a maximal torus in $G^{\mathbb{C}}$, $\mathfrak{h}^{\mathbb{C}}$ its Lie algebra, and $W$ the corresponding Weyl group [6].


The tangent space to $M_n$ can be described as the space of solutions to the linearized Nahm’s equations and satisfying the condition of being orthogonal (in the $L^2$-metric) to vectors arising from infinitesimal gauge transformations. In other words the tangent space to $M_n$ at a solution $(T_0, T_1, T_2, T_3)$ can be identified with the set of solutions $(t_0, t_1, t_2, t_3)$ to the following system of linear equations:
$$\begin{aligned} \dot t_0 + [T_0, t_0] + [T_1, t_1] + [T_2, t_2] + [T_3, t_3] &= 0, \\ \dot t_1 + [T_0, t_1] - [T_1, t_0] + [T_2, t_3] - [T_3, t_2] &= 0, \\ \dot t_2 + [T_0, t_2] - [T_1, t_3] - [T_2, t_0] + [T_3, t_1] &= 0, \\ \dot t_3 + [T_0, t_3] + [T_1, t_2] - [T_2, t_1] - [T_3, t_0] &= 0. \end{aligned} \tag{2.3}$$
The metric is defined by
$$\|(t_0, t_1, t_2, t_3)\|^2 = \frac{1}{2} \int_{-1}^{1} \sum_{i=0}^{3} \|t_i\|^2. \tag{2.4}$$
The three anti-commuting complex structures can be seen by writing a tangent vector as $t_0 + i t_1 + j t_2 + k t_3$.
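As a small sanity check of Nahm’s equations (2.2), one can verify numerically the standard model solution with a simple pole, $T_0 = 0$, $T_i(t) = e_i / t$, for $n = 2$. The sketch below is an illustration under stated assumptions, not part of the paper: it takes $e_i = -\tfrac{i}{2}\sigma_i$ with the Hermitian Pauli matrices $\sigma_i$, so that $[e_1, e_2] = e_3$ cyclically, which is one possible normalization of the residue; the paper’s own convention for $\tfrac12\rho(\sigma_i)$ may differ by such factors. The helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]


def comm(a, b):
    """Commutator [a, b] of two 2x2 matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(2)] for r in range(2)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def add(a, b):
    return [[a[r][c] + b[r][c] for c in range(2)] for r in range(2)]


# Hermitian Pauli matrices; E[i] = -(i/2) sigma_i is an anti-Hermitian basis
# of su(2) with [E[0], E[1]] = E[2] and cyclic permutations (assumed convention).
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
E = [scale(-0.5j, s) for s in PAULI]


def nahm_residual(t):
    """Largest entry of the left-hand side of (2.2) for T_0 = 0, T_i(t) = E[i] / t."""
    T = [scale(1.0 / t, e) for e in E]
    T_dot = [scale(-1.0 / t**2, e) for e in E]
    worst = 0.0
    for i in range(3):
        j, k = (i + 1) % 3, (i + 2) % 3
        # (1/2) sum_{j,k} eps_{ijk} [T_j, T_k] equals [T_j, T_k] for cyclic (i, j, k).
        lhs = add(T_dot[i], comm(T[j], T[k]))
        worst = max(worst, max(abs(x) for row in lhs for x in row))
    return worst
```

Evaluating `nahm_residual` at any $t > 0$ should return a value at the level of floating-point rounding.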

3. The Asymptotic Moduli Space

We shall now construct a one-parameter family of moduli spaces $\tilde M_n(c)$, $c \in \mathbb{R}$, of solutions to Nahm’s equations carrying (pseudo-)hyperkähler metrics. We shall see later on that these metrics are the Gibbons–Manton metric with different mass parameters. We consider the subspace $\Omega_1$ of exponentially fast decaying functions in $C^1[0, \infty]$, i.e.:
$$\Omega_1 = \left\{ f : [0, \infty] \to \mathfrak{u}(n);\ \exists_{\eta > 0}\ \sup_t \left( e^{\eta t} \|f(t)\| + e^{\eta t} \|df/dt\| \right) < +\infty \right\}. \tag{3.1}$$
As in the previous section, $\rho : \mathfrak{su}(2) \to \mathfrak{u}(n)$ is the standard irreducible $n$-dimensional representation of $\mathfrak{su}(2)$ (in particular, $\rho(\sigma_1)$ is a diagonal matrix). We denote by $\mathfrak{h}$ the (Cartan) subalgebra of $\mathfrak{u}(n)$ consisting of diagonal matrices. Let $\tilde{\mathcal{A}}_n$ be the space of $C^1$-functions $(T_0, T_1, T_2, T_3)$ defined on $(0, +\infty]$ and satisfying (cf. [23]):

(i) $T_1, T_2, T_3$ have simple poles at 0 with $\mathrm{res}\, T_i = \tfrac{1}{2}\rho(\sigma_i)$;
(ii) $T_i(+\infty) \in \mathfrak{h}$ for $i = 0, \ldots, 3$;
(iii) $(T_1(+\infty), T_2(+\infty), T_3(+\infty))$ is a regular triple, i.e. its centralizer is $\mathfrak{h}$;
(iv) $(T_i(t) - T_i(+\infty)) \in \Omega_1$ for $i = 0, 1, 2, 3$.

Next we shall define the relevant gauge group. The Lie algebra of our gauge group $\mathcal{G}(c)$ is the space of $C^2$-paths $\rho : [0, +\infty) \to \mathfrak{u}(n)$ such that

(i) $\rho(0) = 0$ and $\dot\rho$ has a limit in $\mathfrak{h}$ at $+\infty$;
(ii) $(\dot\rho - \dot\rho(+\infty)) \in \Omega_1$, and $[\tau, \rho] \in \Omega_1$ for any regular element $\tau \in \mathfrak{h}$;
(iii) $c\,\dot\rho(+\infty) + \lim_{t \to +\infty} (\rho(t) - t \dot\rho(+\infty)) = 0$.

It is the Lie algebra of the Lie group
$$\mathcal{G}(c) = \left\{ g : [0, +\infty) \to U(n);\ g(0) = 1,\ s(g) := \lim \dot g g^{-1} \in \mathfrak{h},\ (\tau - \mathrm{Ad}(g)\tau) \in \Omega_1,\ (\dot g g^{-1} - s(g)) \in \Omega_1,\ \exp(c\, s(g)) \lim \big(g(t) \exp(-t s(g))\big) = 1 \right\}.$$

Remark. The last condition in the definition of $\mathcal{G}(c)$ means that $g(t)$ is asymptotic to $\exp(ht - ch)$ for some diagonal $h$.

We introduce a family of metrics on $\tilde{\mathcal{A}}_n$. Let $(t_0, t_1, t_2, t_3)$ be a tangent vector to the space $\tilde{\mathcal{A}}_n$ at a point $(T_0, T_1, T_2, T_3)$. The functions $t_i$ are now regular at 0, $i = 0, \ldots, 3$. We put
$$\|(t_0, t_1, t_2, t_3)\|_c^2 = c \sum_{i=0}^{3} \|t_i(+\infty)\|^2 + \int_0^{+\infty} \sum_{i=0}^{3} \left( \|t_i(s)\|^2 - \|t_i(+\infty)\|^2 \right) ds. \tag{3.2}$$
We observe that the group $\mathcal{G}(c)$ acting by (2.1) preserves the metric $\|\cdot\|_c$ and the three complex structures of the flat hyperkähler manifold $\tilde{\mathcal{A}}_n$. We define $\tilde M_n(c)$ as the (formal) hyperkähler quotient of $\tilde{\mathcal{A}}_n$ by $\mathcal{G}(c)$ (with respect to the metric $\|\cdot\|_c$). The zero set of the moment map is given by (2.2) (here condition (iii) in the definition of $\mathrm{Lie}(\mathcal{G}(c))$ is essential) and so $\tilde M_n(c)$ is defined as the moduli space of solutions to Nahm’s equations:
$$\tilde M_n(c) = \left\{ \text{solutions to (2.2) in } \tilde{\mathcal{A}}_n \right\} / \mathcal{G}(c).$$

Remark. If $c > 0$, then the metric (3.2) on $\tilde M_n(c)$ will be seen to be positive definite if $(T_1(+\infty), T_2(+\infty), T_3(+\infty))$ is sufficiently far from the walls of Weyl chambers. On the other hand, if $c < 0$, then the metric will be shown to be everywhere negative definite. Therefore, for $c < 0$ we should really replace $\|\cdot\|_c$ with its negative; it is, however, more convenient to consider the metrics $\|\cdot\|_c$.

We observe that sending a solution $T_i$ to the solution $r T_i(rt)$ for any $r > 0$ induces a homothety of factor $r$ between $\tilde M_n(c)$ and $\tilde M_n(rc)$.

Before we begin the detailed study of $\tilde M_n(c)$, let us explain why we expect this metric to be exponentially close to the monopole metric. It is known [4] that the solutions to Nahm’s equations on $(0, 2)$ corresponding to a well-separated monopole are exponentially close to being constant away from the boundary points (i.e. on any $[\epsilon, 2 - \epsilon]$). The same is true for solutions on the half line $(0, +\infty)$: as long as the triple $(T_1(+\infty), T_2(+\infty), T_3(+\infty))$ is regular, the solutions are exponentially close to being constant away from 0 [23] (it is helpful to notice that the space of regular triples is the same as the space $\tilde C_n(\mathbb{R}^3)$ of distinct points in $\mathbb{R}^3$). Our strategy is to take two solutions, on half-lines $(0, \infty)$ and $(-\infty, 2)$ with the same values at $\pm\infty$, cut them off at $t = 1$ and use this non-smooth solution on $(0, 2)$ (with correct boundary behaviour) to obtain an exact solution to the monopole Nahm data. The exact solution will differ from the approximate one by an exponentially small amount. Furthermore the part of the half-line solutions which we have cut off is exponentially close to being constant and, for $c = 1$, contributes an exponentially small amount to the metric $\|\cdot\|_c$ (all estimates are uniform and can be differentiated). This can be seen from the fact that we can rewrite (3.2) as
$$\|(t_0, t_1, t_2, t_3)\|_c^2 = \int_0^{c} \sum_{i=0}^{3} \|t_i(s)\|^2\, ds + \int_c^{+\infty} \sum_{i=0}^{3} \left( \|t_i(s)\|^2 - \|t_i(+\infty)\|^2 \right) ds. \tag{3.3}$$
The first term, together with the corresponding term for the solution on $(-\infty, 2)$, is exponentially close to the monopole metric (for $c = 1$).

4. Moduli Space of Regular Semisimple Adjoint Orbits

In order to obtain information about $\tilde M_n(c)$ we need to consider first another moduli space of solutions to Nahm’s equations, defined analogously, except that we require the solutions to be smooth at $t = 0$. This space, which can be defined for an arbitrary compact Lie group $G$, is of some interest as all hyperkähler structures on $G^{\mathbb{C}}/T^{\mathbb{C}}$ (here $T^{\mathbb{C}}$ is a maximal torus) due to Kronheimer [23] can be obtained from it as hyperkähler quotients (see Theorem 4.3 below). A reader who is primarily interested in monopoles should think of $G$ as $U(n)$.

Let us first recall how Kronheimer constructs hyperkähler metrics on $G^{\mathbb{C}}/T^{\mathbb{C}}$. Let $\mathfrak{h}$ be the Lie algebra of $T^{\mathbb{C}}$ and let $(\tau_1, \tau_2, \tau_3) \in \mathfrak{h}^3$ be a regular triple, i.e. one whose centralizer is $\mathfrak{h}$. For a fixed $\eta > 0$, consider the Banach space
$$\Omega_1^\eta = \left\{ f : [0, \infty] \to \mathfrak{g};\ \sup_t \left( e^{\eta t} \|f(t)\| + e^{\eta t} \|df/dt\| \right) < +\infty \right\}$$
with the norm $\|f\| = \sup_t \left( e^{\eta t} \|f(t)\| + e^{\eta t} \|df/dt\| \right)$. Define $\mathcal{A}_\eta(\tau_1, \tau_2, \tau_3)$ as the space of $C^1$-functions $(T_0, T_1, T_2, T_3) : (0, +\infty] \to \mathfrak{g}$ which satisfy: $\{T_0(t), (T_i(t) - \tau_i);\ i = 1, 2, 3\} \subset \Omega_1^\eta$. Define also $\mathcal{G}^\eta$ by replacing $\Omega_1$ with $\Omega_1^\eta$ in the definition of $\mathcal{G}$ given in the previous section. Kronheimer shows then that for small enough $\eta$, $M(\tau_1, \tau_2, \tau_3) = \{\text{solutions to (2.2) in } \mathcal{A}_\eta(\tau_1, \tau_2, \tau_3)\} / \mathcal{G}^\eta$, equipped with the $L^2$ metric is a smooth hyperkähler manifold, diffeomorphic to $G^{\mathbb{C}}/T^{\mathbb{C}}$. Furthermore, if $(\tau_2, \tau_3)$ is regular, then $M(\tau_1, \tau_2, \tau_3)$ is biholomorphic, with respect to the complex structure $I$, to the complex adjoint orbit of $\tau_2 + i\tau_3$.

We observe that the union of all $M(\tau_1, \tau_2, \tau_3)$ has a natural topology and it is, in fact, a smooth manifold. We shall show now that there is a $T$-bundle over this union which carries a (pseudo)-hyperkähler metric. We define the space $\mathcal{A}_G$ by omitting the condition (i) in the definition of $\tilde{\mathcal{A}}_n$ in the previous section. Instead we require that the $T_i$ are smooth at $t = 0$ for $i = 0, 1, 2, 3$. We define $M_G(c)$, $c \in \mathbb{R}$, as the (formal) hyperkähler quotient of $\mathcal{A}_G$ by $\mathcal{G}(c)$ with respect to the metric (3.2). We have:

Proposition 4.1. $M_G(c)$ equipped with the metric (3.2) is a smooth hyperkähler manifold. The tangent space at a solution $(T_0, T_1, T_2, T_3)$ is described by Eqs. (2.3).

We remark that the metric (3.2) may be degenerate at some points. However the hypercomplex structure is defined everywhere.

Proof. Define $M_G^\eta(c)$ by replacing $\Omega_1$ with $\Omega_1^\eta$ in the definition of $M_G(c)$. By the exponential decay property of solutions to Nahm’s equations ([23], Lemma 3.4), a neighbourhood of a particular element in $M_G(c)$ is canonically identified with its neighbourhood in $M_G^\eta(c)$ for small enough $\eta$. Therefore we can use the transversality arguments of [23], Lemma 3.8 and Proposition 3.9 (with a slight modification due to condition (iii) in the definition of $\mathrm{Lie}(\mathcal{G}(c))$) to deduce the smoothness. The fact that the metric is hyperkähler is, formally, the consequence of the fact that $M_G(c)$ is a hyperkähler quotient. One can, in fact, check directly that the three Kähler forms are closed. We shall also, later on, identify the complex structures and the complex symplectic forms proving their closedness.

We observe now that the action on $\mathcal{A}_G$ of gauge transformations which are asymptotic to $\exp(-th + \lambda h)$, $h \in \mathfrak{h}$, $\lambda \in \mathbb{R}$, induces a free isometric action of $T = \exp(\mathfrak{h})$ on $M_G(c)$. In fact this action is tri-Hamiltonian and a simple calculation shows

Proposition 4.2. The hyperkähler moment map $\mu = (\mu_1, \mu_2, \mu_3)$ for the action of $T$ on $M_G(c)$ is given by $\mu_i(T_0, T_1, T_2, T_3) = T_i(+\infty)$ for $i = 1, 2, 3$.

As an immediate corollary we have:

Theorem 4.3. Let $(\tau_1, \tau_2, \tau_3)$ be a regular triple in $\mathfrak{h}^3$. The hyperkähler quotient $\mu^{-1}(\tau_1, \tau_2, \tau_3)/T$ of $M_G(c)$ by the torus $T$ is isometric to Kronheimer’s $M(\tau_1, \tau_2, \tau_3)$.

We have also a tri-Hamiltonian action of $G$ on $M_G(c)$ given by the gauge transformations with arbitrary values at $t = 0$. The hyperkähler moment map for this action is $(T_1(0), T_2(0), T_3(0))$. We have two other group actions on $M_G(c)$. There is a free isometric and triholomorphic action of the Weyl group $W = N(T)/T$ given by the gauge transformations which become constant (and in $W$) exponentially fast. Finally there is a free isometric $SU(2)$-action which rotates the complex structures. As a consequence it has a globally defined Kähler potential for each Kähler form (cf. [18]). The potential for $\omega_2$ (or $\omega_3$) is given by the moment map for the action of a circle in $SU(2)$ which preserves $I$. This is easily seen to be
$$K_J = c \sum_{i=2}^{3} \|T_i(+\infty)\|^2 + \int_0^{+\infty} \sum_{i=2}^{3} \left( \|T_i(s)\|^2 - \|T_i(+\infty)\|^2 \right) ds.$$

Remark 4.4. There is a similar (pseudo)-hyperkähler manifold with a torus action such that the hyperkähler quotients by this torus are isometric to Kronheimer’s ALE-metrics on the minimal resolution of a given Kleinian singularity C2 /0 [24]. This manifold is defined as MG except that the Ti have poles at t = 0 with the residues defined by a subregular homomorphism su(2) → g (cf. [6, 5]). Remark 4.5. One can observe that MG (0) is a cone metric (with the R>0 -action given by Ti (t) 7→ rTi (rt)) and in fact, it is an H∗ -bundle over a pseudo-quaternion-Kähler manifold (cf. [34]). Remark 4.6. It is instructive to consider the Kähler analogue of MG (c). The Kähler metrics on G/T (cf. [2]) can be described (cf. [7]) as the natural L2 -metrics on the moduli space of solutions to T˙1 = [T1 , T0 ] with T0 (t), (T1 (t) − τ ) ∈ 1 for a fixed regular element τ of h (this gives the Kähler form whose cohomology class is τ ). We ˆ G (c) with can now do a construction similar to that of MG (c) to obtain moduli spaces M a (G × T )-invariant pseudo-Kähler metric whose Kähler quotients by T are precisely the Kähler metrics on G/T . In this case it is easy to compute both the topology and the ˆ G (c) is diffeomorphic to G × {regular elements of h}, ˆ G (c): M complex structure of M and the complex structure at (1, h) is given by I(v + w, p) = (I0 v − p, w), where v ⊥ h, w, p ∈ h and I0 is the complex structure on G/T . ˜ n (c) as a Manifold 5. M ˜ n (c) defined in Sect. 3. Our first task is to show that this We now return to the space M ˜ n (c) is a smooth hyperkähler quotient of the space is smooth. We shall show that M


305

product of the space MU (n) (c − 1) considered in the previous section and of another moduli space of solutions to Nahm’s equations. This latter space, denoted by Nn , is given by u(n)-valued solutions to Nahm’s equations defined on (0, 1] smooth at t = ˜ n (c) at t = 0. The gauge group consists of gauge 1 and with the same poles as M transformations which are identity at t = 0, 1. Equipped with the metric (2.4) this is a smooth hyperkähler manifold [6, 11]. It admits a tri-Hamiltonian action of U (n) given by gauge transformations with arbitrary values at t = 1. In addition, we consider the space MU (n) (c−1) defined in the previous section. We identify it this time with the space of solutions on [1, +∞] via the map Ti (t) 7→ Ti (t + 1) (so that the gauge transformations behave now, near +∞, as elements of G(c)). ˜ n (c) is the hyperkähler quotient of Nn × It is easy to observe that the space M MU (n) (c − 1) by the diagonal action of U (n) (cf. [6]; the moment map equations simply match the functions T1 , T2 , T3 at t = 1; after that, quotienting by G means that the ˜ n (c) remaining gauge transformations are smooth at t = 1). Using this description of M we can finally show ˜ n (c) equipped with the metric (3.2) is a smooth hyperkähler manProposition 5.1. M ifold. The tangent space at a solution (T0 , T1 , T2 , T3 ) is described by the Eqs. (2.3). Proof. Since the metric (3.2) may be degenerate, we still have to show that the moment map equations on Nn × MU (n) (c − 1) are everywhere transversal. Consider a particular point in MU (n) (c − 1) which we represent by a solution m = (T0 , T1 , T2 , T3 ) with T0 (+∞) = 0 and Ti (+∞) = τi , i = 1, 2, 3. Let µ be the hyperkähler moment map for the action of G on Nn × MU (n) . We observe that the image of dµ|m contains the image of dµ0|m , µ0 being the hyperkähler moment map for the action of G on Nn × M (τ1 , τ2 , τ3 ) (Kronheimer’s definition of M (τ1 , τ2 , τ3 ) was recalled in the previous section). 
The metric on Nn × M (τ1 , τ2 , τ3 ) is non-degenerate and, as G acts freely, dµ0|m is surjective. ˜ n (c) is smooth. Thus dµ is surjective at each point in Nn × MU (n) (c − 1) and M ˜ n (c) has isometric actions of the We observe that, as in the case of MU (n) (c), M n torus T (defined as the diagonal subgroup of U (n)), of the symmetric group Sn , and of SU (2). In particular, the hyperkähler moment map for the action of T n is still given by the values of T1 , T2 , T3 at infinity (cf. Proposition 4.2). ˜ n (c): We can describe the topology of M ˜ n (c) is a principal T n -bundle over the configuration space C˜ n (R3 ) Proposition 5.2. M of n distinct points in R3 . We postpone identifying this bundle until the next section (Proposition 6.3). Proof. The space C˜ n (R3 ) is the space of regular triples in the subalgebra of diagonal ˜ n (c) → matrices and the moment map µ for the action of T n gives us a projection M 3 ˜ ˜ Cn (R ). Let us consider a fixed regular triple (τ1 , τ2 , τ3 ) and all elements of Mn (c) with Ti (+∞) = τi , i = 1, 2, 3, i.e. µ−1 (τ1 , τ2 , τ3 ). For each such solution we can make T0 identically 0 by some gauge transformation g with g(0) = 1. This is defined uniquely up to the action of G × T n and so the set of T n -orbits projecting via µ to (τ1 , τ2 , τ3 ) can be identified with the set of solutions to Nahm’s equations with T0 ≡ 0, T1 , T2 , T3 having the appropriate residues at t = 0 and being conjugate to τ1 , τ2 , τ3 at infinity. By the considerations at the beginning of this section this space is the hyperkähler quotient of Nn ×M (τ1 , τ2 , τ3 ) by U (n). The arguments of [6] show that the corresponding complexsymplectic quotient can be identified with the intersection of a regular semisimple adjoint


R. Bielawski

orbit of GL(n, C) with the slice to the regular nilpotent orbit. This intersection is a single point. Finally, in order to identify in this case the hyperkähler quotient with the complex-symplectic one we can adapt the argument in the proof of Proposition 2.20 in [20].

Our next task is to describe the complex structure of $\tilde M_n(c)$ (because of the action of SU(2) all complex structures are equivalent). As usual (cf. [13]), if we choose a complex structure, say I, we can introduce complex coordinates on the moduli space of solutions to Nahm's equations by writing α = T0 + iT1 and β = T2 + iT3. The Nahm equations can then be written as one complex and one real equation:
\[ \frac{d\beta}{dt} = [\beta, \alpha], \tag{5.1} \]
\[ \frac{d}{dt}(\alpha + \alpha^*) = [\alpha^*, \alpha] + [\beta^*, \beta]. \tag{5.2} \]

By the remark made at the beginning of this section, $\tilde M_n(c)$ is the hyperkähler quotient of the product manifold $N_n \times M_{U(n)}(c-1)$. We shall show that as a complex-symplectic manifold $\tilde M_n(c)$ is the complex-symplectic quotient of $N_n \times M_{U(n)}(c-1)$. Let us recall the complex structure of $N_n$ [13, 19, 6, 12]. Let $e_1, \dots, e_n$ denote the standard basis of $\mathbb{C}^n$. There is a unique solution $w_1$ of the equation
\[ \frac{dw}{dt} = -\alpha w \tag{5.3} \]
with
\[ \lim_{t\to 0}\bigl(t^{-(n-1)/2} w_1(t) - e_1\bigr) = 0. \tag{5.4} \]
Setting $w_i(t) = \beta^{i-1}(t) w_1(t)$, we obtain a solution to (5.3) with
\[ \lim_{t\to 0}\bigl(t^{i-(n+1)/2} w_i(t) - e_i\bigr) = 0. \]

The complex gauge transformation g(t) with $g^{-1} = (w_1, \dots, w_n)$ makes α identically zero and sends β(t) to the constant matrix
\[ B(\beta_1, \dots, \beta_n) = \begin{pmatrix} 0 & \cdots & 0 & (-1)^{n+1} S_n \\ 1 & \ddots & \vdots & (-1)^{n} S_{n-1} \\ \vdots & \ddots & 0 & \vdots \\ 0 & \cdots & 1 & S_1 \end{pmatrix}. \tag{5.5} \]
Here $\beta_i$ denote the (constant) eigenvalues of β(t) and $S_i$ is the i-th elementary symmetric polynomial in $\{\beta_1, \dots, \beta_n\}$. The mapping (α, β) → (g(1), B) gives a biholomorphism between $(N_n, I)$ and $Gl(n, \mathbb{C}) \times \mathbb{C}^n$ [6].

We describe the complex structure of $\tilde M_n(c)$ as follows:

Proposition 5.3. There exists a $T^n$-equivariant biholomorphism between $\tilde M_n(c)$ and an open subset of
\[ \Bigl( \coprod_{\mathfrak n} \bigl\{ [g, b] \in Gl(n, \mathbb{C}) \times_N (\mathfrak d + \mathfrak n);\ g b g^{-1} \text{ is of the form (5.5)} \bigr\} \Bigr) \Big/ \sim, \]



where d denotes diagonal matrices, the union is over unipotent algebras n (with respect to d) and N = exp n. Furthermore, the relation ∼ is given as follows: [g, d+n] ∼ [g 0 , d0 +n0 ] if and only if n ∈ n, n0 ∈ n0 , and either n0 ⊂ n and there exists an m ∈ N such that gm−1 = g 0 , Ad(m)(d + n) = d0 + n0 or vice versa (i.e. n ⊂ n0 etc.). Remark. It will follow from the description of the twistor space that this biholomorphism is actually onto. Proving this right now would require showing that the T n -action on ∗ n ˜ Mn (c) extends to the global action of C . This, in turn, requires showing existence of solutions to a mixed Dirichlet-Robin problem on the half-line - something that seems quite tricky. Proof. Fix a unipotent algebra n and consider the set of all solutions (α, β) = (T0 + iT1 , T2 + iT3 ) on [1, +∞) such that the intersection of the sum of positive eigenvalues of ad(iT1 (+∞)) with C(β(+∞)) is contained in n. Let M (n; c − 1) be the corresponding subset of MU (n) (c). We observe that, since (T1 (+∞), T2 (+∞), T3 (+∞)) is a regular triple, the projection of T1 (+∞) onto dC ∩ C(β(+∞)) is a regular element, and so n contains the unipotent radical of a Borel subalgebra of C(β(+∞)) for any element of M (n; c − 1). Using gauge freedom, we always make T0 (+∞) = 0 and, by Proposition 4.1 of Biquard [8], such a representative is of the form g α(+∞), β(+∞) + Ad(exp{−α(+∞)t})n , where n ∈ n and g is a bounded Gl(n, C)-valued gauge transformation. The transformation g is defined modulo exp{−α(+∞)t}g0 exp{α(+∞)t} with g0 ∈ P = exp(d + n). Since T0 (+∞) = 0 and T0 is decaying exponentially fast, g has a limit (in T C ) at +∞. If we replace g(t) by g 0 (t) = g(t)g(+∞)−1 exp{−α(+∞)t + cα(+∞)}, then (α, β) = g 0 (0, β(+∞) + n0 ) for an n0 ∈ n. The transformation g 0 , which satisfies (at infinity) the boundary condition of an element of G(c − 1)C , is now defined modulo constant gauge transformations in N . 
Moreover g 0 (1) is independent of G(c − 1) and we obtain a map φ : M (n) → Gl(n, C) ×N (d + n) by sending (α, β) to (g 0 (1), β(+∞) + n0 ). Considering the infinitesimal version of this construction shows that φ is holomorphic. Since φ is U (n)-equivariant, it is (locally) Gl(n, C)-equivariant. We can adapt the ˜ n (c) is the complex-symplectic argument of Proposition 2.20 in [19] to show that M quotient of Nn × MU (n) (c − 1) by (local action of) Gl(n, C). Let us restrict attention to Nn × M (n). The complex symplectic moment map at the point (g, B) of Nn is −g −1 Bg (here g ∈ Gl(n, C) and B is of the form (5.5)) and the complex symplectic moment map at the point corresponding to [g 0 , βd + n] is g 0 (βd + n)g 0−1 (here βd is diagonal and n ∈ n). The moment map equation for the diagonal action of Gl(n, C) is g −1 Bg = g 0 (βd + n)g 0−1 . If we now quotient by Gl(n, C), i.e. send g to identity, we shall end up with the set of [g 0 , b] ∈ Gl(n, C) ×N (d + n) such that g 0 bg 0−1 = B (B is determined by the diagonal part of b). This identifies the charts described in this proposition. By going through the procedure we can conclude that the charts for different n are matched as claimed. ˜ n (c) to the manifold So far we have shown that there is a holomorphic map φ from M M described in the statement. We still have to show that φ is 1-1. By construction our n map is T n -equivariant, and so C∗ -equivariant (where the action is defined). Since n ˜ n (c). Furthermore the C∗ n -action on the C∗ -action on M is free, it is free on M M leaves invariant sets of the form M ∩ Gl(n, C) ×N (d + n) , d ∈ d. Each such set n is a single orbit of C∗ and so φ is 1-1. ˜ n (c) is rather complicated. We remark that the open The above description of M dense subset where β(+∞) is regular corresponds to n = 0, i.e. to {(βd , g); βd = diag(β1 , . . . , βn ), βi 6= βj if i 6= j, gβd g −1 = B(β1 , . . . , βn )}.
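The regular chart above asks for a g with $g \beta_d g^{-1} = B(\beta_1, \dots, \beta_n)$. As a numerical sanity check (a sketch only; the function names `companion_B` and `vandermonde` and the sample values are illustrative assumptions, not part of the paper), the following builds the matrix (5.5) from the elementary symmetric polynomials and confirms that the inverse Vandermonde matrix times an arbitrary invertible diagonal matrix conjugates diag(β) to it.

```python
import numpy as np

def companion_B(beta):
    """Matrix of the form (5.5): ones on the subdiagonal, last column
    built from the elementary symmetric polynomials S_1, ..., S_n."""
    beta = np.asarray(beta, dtype=complex)
    n = beta.size
    # np.poly returns [1, -S_1, S_2, ..., (-1)^n S_n],
    # the coefficients of prod_i (x - beta_i).
    p = np.poly(beta)
    B = np.zeros((n, n), dtype=complex)
    B[np.arange(1, n), np.arange(n - 1)] = 1.0  # subdiagonal of ones
    B[:, -1] = -p[::-1][:n]  # rows 1..n: (-1)^(n+1) S_n, ..., S_1
    return B

def vandermonde(beta):
    """V[i, j] = beta_i ** j (0-based), i.e. V_ij = beta_i^(j-1) in 1-based indexing."""
    beta = np.asarray(beta, dtype=complex)
    return np.vander(beta, N=beta.size, increasing=True)

# Illustrative sample data: distinct beta_i and non-zero u_i.
beta = np.array([0.5, -1.0, 2.0 + 1.0j, 0.25j])
u = np.array([1.5, -0.7j, 2.0, 0.3 + 0.4j])

g = np.linalg.inv(vandermonde(beta)) @ np.diag(u)
residual = g @ np.diag(beta) @ np.linalg.inv(g) - companion_B(beta)
print(np.max(np.abs(residual)))  # should be at rounding-error level
```

The diagonal factor drops out of the conjugation because it commutes with diag(β), which is why the $u_i$ remain free coordinates on the regular set.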



We shall denote the corresponding subset of $\tilde M_n(c)$ by $\tilde M_n^{\mathrm{reg}}(c)$. We observe that an element g of Gl(n, C) which sends diag(β1, . . . , βn) to B(β1, . . . , βn) is of the form
\[ g = V(\beta_1, \dots, \beta_n)^{-1} \operatorname{diag}(u_1, \dots, u_n), \tag{5.6} \]
where $u_i \neq 0$ and $V(\beta_1, \dots, \beta_n)$ is the Vandermonde matrix, i.e. $V_{ij} = (\beta_i)^{j-1}$. We can calculate the complex symplectic form $\omega = \omega_2 + i\omega_3$ on $\tilde M_n^{\mathrm{reg}}(c)$:

Proposition 5.4. The complex symplectic form ω on $\tilde M_n^{\mathrm{reg}}(c)$ is given, in coordinates $\beta_i, u_i$, i = 1, . . . , n, by
\[ \sum_{i=1}^{n} \frac{du_i}{u_i} \wedge d\beta_i - \sum_{i<j} \frac{d\beta_i \wedge d\beta_j}{\beta_i - \beta_j}. \tag{5.7} \]

Proof. First, we calculate ω on the subset of $M_{U(n)}(c-1)$ where β(+∞) is regular. This subset is biholomorphic to $Gl(n, \mathbb{C}) \times \{\text{regular elements of } \mathfrak h^{\mathbb C}\}$ and, according to the proof of Proposition 5.3, an element (α, β) of this set corresponding to $(g, \beta_d) \in Gl(n, \mathbb{C}) \times \mathfrak h$ can be written as $(\alpha, \beta) = \bigl(-\dot g(t) g(t)^{-1},\ g(t) \beta_d g(t)^{-1}\bigr)$, where g(t) is a complex gauge transformation with g(0) = g. Therefore a tangent vector (a(t), b(t)) at (α, β) can be written as
\[ (a, b) = \bigl(-g \dot\rho g^{-1},\ g (b_d + [\rho, \beta_d]) g^{-1}\bigr), \tag{5.8} \]
where ρ is dual to $g^{-1} dg$ and $b_d$ is dual to $d\beta_d$. The complex symplectic form on $M_{U(n)}(c-1)$ is given by
\[ \omega = (c-1) \operatorname{tr}\bigl(d\alpha(+\infty) \wedge d\beta(+\infty)\bigr) + \int_0^{+\infty} \operatorname{tr}\bigl(d\alpha \wedge d\beta - d\alpha(+\infty) \wedge d\beta(+\infty)\bigr). \]

For two tangent vectors $(a, b)$ and $(\hat a, \hat b)$, corresponding, via (5.8), to $(\rho, b_d)$ and $(\hat\rho, \hat b_d)$, we obtain
\[ \omega = -\operatorname{tr}\bigl(b_d \hat\rho - \rho \hat b_d - [\rho, \hat\rho] \beta_d\bigr), \]
where $\rho = \rho(0)$, $\hat\rho = \hat\rho(0)$. To calculate the symplectic form on $\tilde M_n^{\mathrm{reg}}(c)$ it remains to substitute (5.6) for g. Let us write u for diag(u1, . . . , un). Then ρ becomes dual to $u^{-1} du - u^{-1}\, dV\, V^{-1} u$. Let us write ν for the tangent vector dual to $u^{-1} du$ and ϒ for the tangent vector dual to $dV\, V^{-1}$. Since ν is diagonal and the i-th row of ϒ is of the form $b_i s$ (here we write $b_d = \operatorname{diag}(b_1, \dots, b_n)$), for a covector s, we can write ω as
\[ \omega = -\operatorname{tr}\bigl(b_d \hat\nu - \nu \hat b_d - [\Upsilon, \hat\Upsilon] \beta_d\bigr). \]
It remains to calculate $\operatorname{tr} [\Upsilon, \hat\Upsilon] \beta_d$. Let us write $W_{ij}$ for the (i, j)-th entry of $V^{-1}$, i.e.
\[ W_{ij} = (-1)^{n-i} S_{n-i}(\beta_1, \dots, \hat\beta_j, \dots, \beta_n) \Big/ \prod_{k \neq j} (\beta_j - \beta_k), \tag{5.9} \]
$S_k$ being the k-th elementary symmetric polynomial ($S_0 = 1$). We calculate the (i, i)-th entry of $[\Upsilon, \hat\Upsilon]$ as
\[ \sum_{j} (b_i \hat b_j - \hat b_i b_j) \Bigl( \sum_k (k-1) \beta_i^{k-2} W_{kj} \Bigr) \Bigl( \sum_k (k-1) \beta_j^{k-2} W_{ki} \Bigr). \]



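This means that
\[ \operatorname{tr} [\Upsilon, \hat\Upsilon] \beta_d = \sum_{i<j} (b_i \hat b_j - \hat b_i b_j)(\beta_i - \beta_j) \Bigl( \sum_k (k-1) \beta_i^{k-2} W_{kj} \Bigr) \Bigl( \sum_k (k-1) \beta_j^{k-2} W_{ki} \Bigr). \]
Formula (5.7) will be proven if we can show (for i ≠ j) the following identity:
\[ \Bigl( \sum_k (k-1) \beta_i^{k-2} W_{kj} \Bigr) \Bigl( \sum_k (k-1) \beta_j^{k-2} W_{ki} \Bigr) = \frac{-1}{(\beta_i - \beta_j)^2}. \tag{5.10} \]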

According to (5.9) we have
\[ \sum_k (k-1) \beta_i^{k-2} W_{kj} = \frac{\sum_k (k-1) \beta_i^{k-2} (-1)^{n-k} S_{n-k}(\beta_1, \dots, \hat\beta_j, \dots, \beta_n)}{\prod_{s \neq j} (\beta_j - \beta_s)}. \tag{5.11} \]
We compute the numerator of this expression. We set p = n − 1 and $(a_1, \dots, a_p) = (\beta_1, \dots, \hat\beta_j, \dots, \beta_n)$. Then the numerator can be written as
\[ \sum_{s=0}^{p} (p-s)(-1)^s a_i^{p-1-s} S_s(a_1, \dots, a_p) = \frac{d}{dt} \Bigl( \sum_{s=0}^{p} (-1)^s S_s t^{p-s} \Bigr) \Big|_{t = a_i}. \]
Since $\sum_s S_s t^s = \prod_s (1 + a_s t)$, we can rewrite the expression under the derivative as
\[ \sum_{s=0}^{p} (-1)^s S_s t^{p-s} = \prod_{s=1}^{p} (t - a_s). \]
Taking the derivative and substituting $a_i$ for t finally gives
\[ \sum_{s=0}^{p} (p-s)(-1)^s a_i^{p-1-s} S_s = \prod_{s \neq i} (a_i - a_s). \]
Going back to (5.11), we have
\[ \sum_k (k-1) \beta_i^{k-2} W_{kj} = \frac{\prod_{s \neq i,j} (\beta_i - \beta_s)}{\prod_{s \neq j} (\beta_j - \beta_s)}, \]
from which (5.10) follows.
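The identity (5.10) is easy to test numerically. The sketch below (the helper names `derivative_sums` and `max_identity_defect` are hypothetical, chosen here for illustration) forms the Vandermonde matrix $V_{ij} = \beta_i^{j-1}$, inverts it to get W, evaluates the sums $\sum_k (k-1)\beta_i^{k-2} W_{kj}$, and compares the product on the left of (5.10) with $-1/(\beta_i-\beta_j)^2$ for all i ≠ j.

```python
import numpy as np

def derivative_sums(beta):
    """D[i, j] = sum_k (k-1) * beta_i**(k-2) * W[k, j] (1-based k), with W = V^{-1}."""
    beta = np.asarray(beta, dtype=complex)
    n = beta.size
    V = np.vander(beta, N=n, increasing=True)  # V[i, k] = beta_i ** k (0-based k)
    W = np.linalg.inv(V)
    # Row i of dV holds the derivatives of the monomials at beta_i:
    # dV[i, k] = k * beta_i ** (k - 1) for 0-based k >= 1.
    dV = np.zeros((n, n), dtype=complex)
    dV[:, 1:] = np.arange(1, n) * V[:, :-1]
    return dV @ W

def max_identity_defect(beta):
    """Largest deviation |D_ij * D_ji + 1/(beta_i - beta_j)^2| over i != j."""
    beta = np.asarray(beta, dtype=complex)
    D = derivative_sums(beta)
    n = beta.size
    worst = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                lhs = D[i, j] * D[j, i]
                rhs = -1.0 / (beta[i] - beta[j]) ** 2
                worst = max(worst, abs(lhs - rhs))
    return worst

# Illustrative distinct sample values.
print(max_identity_defect([0.3, -1.2, 2.0, 0.7 + 0.5j]))
```

In this form the sums are the derivatives at $\beta_i$ of the Lagrange basis polynomial attached to $\beta_j$, which is another way to read the computation leading to (5.10).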

Remark 5.5. Setting
\[ p_i = u_i \Big/ \prod_{j > i} (\beta_i - \beta_j), \]
the formula (5.7) can be rewritten as
\[ \omega = \sum_{i=1}^{n} \frac{dp_i}{p_i} \wedge d\beta_i. \]
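For the reader's convenience, the short computation behind Remark 5.5 can be spelled out as follows. This is a worked check that uses only the definition of $p_i$ and $d\beta_i \wedge d\beta_i = 0$; it is not taken from the original text.

```latex
\begin{align*}
\frac{dp_i}{p_i}
  &= \frac{du_i}{u_i} - \sum_{j>i} \frac{d\beta_i - d\beta_j}{\beta_i - \beta_j}, \\
\sum_{i=1}^{n} \frac{dp_i}{p_i} \wedge d\beta_i
  &= \sum_{i=1}^{n} \frac{du_i}{u_i} \wedge d\beta_i
     + \sum_{i<j} \frac{d\beta_j \wedge d\beta_i}{\beta_i - \beta_j} \\
  &= \sum_{i=1}^{n} \frac{du_i}{u_i} \wedge d\beta_i
     - \sum_{i<j} \frac{d\beta_i \wedge d\beta_j}{\beta_i - \beta_j},
\end{align*}
```

which is the expression (5.7).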



6. The Twistor Space and the Metric on $\tilde M_n(c)$

We shall now identify the twistor space Z(c) of $\tilde M_n(c)$. As a first step, we observe, after Hitchin et al. [18], that the hyperkähler moment map µ for the $T^n$-action defines a moment map, also denoted by µ, for the complex-symplectic form along the fibers of $Z(c) \to \mathbb{CP}^1$. This µ is a map from Z(c) to $O(2) \otimes \mathbb{C}^n$. We shall first identify the open subset $Z^{\mathrm{reg}}(c)$ of Z(c) defined as the set
\[ Z^{\mathrm{reg}}(c) = \mu^{-1}\bigl(O(2) \otimes \mathbb{C}^n - O(2) \otimes \Delta\bigr), \tag{6.1} \]
where Δ is the generalized diagonal in $\mathbb{C}^n$. In terms of the coordinates (β1, . . . , βn) and (u1, . . . , un) given by (5.6), $Z^{\mathrm{reg}}(c)$ has the following description:

Proposition 6.1. $Z^{\mathrm{reg}}(c)$ is obtained by taking two copies of $\mathbb{C} \times (\mathbb{C}^n - \Delta) \times (\mathbb{C}^*)^n$ with coordinates $(\zeta, \beta_i, u_i)$ and $(\tilde\zeta, \tilde\beta_i, \tilde u_i)$, i = 1, . . . , n, and identifying over ζ ≠ 0 by
\[ \tilde\zeta = \zeta^{-1}, \qquad \tilde\beta_i = \zeta^{-2} \beta_i, \qquad \tilde u_i = \zeta^{-(n-1)} \exp\{-c\beta_i/\zeta\}\, u_i. \]
The real structure is given by
\[ \zeta \mapsto -1/\bar\zeta, \qquad \beta_i \mapsto -\bar\beta_i/\bar\zeta^2, \qquad u_i \mapsto \bar u_i^{-1} \bigl(1/\bar\zeta\bigr)^{n-1} \prod_{j \neq i} (\bar\beta_i - \bar\beta_j)\, e^{c\bar\beta_i/\bar\zeta}. \]
Finally, the complex symplectic form along the fibers is given by (5.7).

Proof. For any hyperkähler moduli space of solutions to Nahm's equations one can trivialize the twistor space by choosing an affine coordinate ζ on $\mathbb{CP}^1$ and then putting $\eta = \beta + (\alpha + \alpha^*)\zeta - \beta^*\zeta^2$, $u = \alpha - \beta^*\zeta$ for ζ ≠ ∞, and $\tilde\eta = \beta/\zeta^2 + (\alpha + \alpha^*)/\zeta - \beta^*$, $\tilde u = -\alpha^* - \beta/\zeta$ for ζ ≠ 0. Then, over ζ ≠ 0, ∞, we have $\tilde\eta = \eta/\zeta^2$, $\tilde u = u - \eta/\zeta$. Moreover, the real structure is $\zeta \mapsto -1/\bar\zeta$, $\eta \mapsto -\eta^*/\bar\zeta^2$, $u \mapsto -u^* + \eta^*/\bar\zeta$ (cf. [12, 9]). We now have to go through the procedure in the proof of Proposition 5.3 to describe $Z^{\mathrm{reg}}$ in coordinates $(\zeta, \beta_i, u_i)$ and $(\tilde\zeta, \tilde\beta_i, \tilde u_i)$. First we describe the twistor space of $N_n$ in coordinates (g, B) and $(\tilde g, \tilde B)$ defined right after (5.5) (cf. [12]). Going through the procedure assigning (g, B) to (α, β), we see that $\tilde B = B(\beta_1/\zeta^2, \dots, \beta_n/\zeta^2)$. On the other hand g is given by g = g(1), where g(t) is a complex gauge transformation such that $\frac{d}{dt} g^{-1} = -u g^{-1}$. This means that g(t) makes u identically zero. We observe that exp{−Bt/ζ}g(t) makes $\tilde u$ identically zero and $\tilde\eta$ into $B/\zeta^2$. The initial value for the solution $g^{-1}$ depends on ζ and so we can write $\tilde g(t) = U \exp\{-Bt/\zeta\} g(t)$ for some constant matrix U. If we are to get the form (5.5), we must have $U = U' d(\zeta)$, where
\[ d(\zeta) = \operatorname{diag}\bigl(\zeta^{-(n-1)}, \zeta^{-(n-3)}, \dots, \zeta^{n-1}\bigr). \tag{6.2} \]
In addition $U'$ commutes with $B(\beta_1/\zeta^2, \dots, \beta_n/\zeta^2)$. Moreover, the initial value for the equation $\frac{d}{dt} g^{-1} = -\alpha g^{-1}$ depends only on the residues of $u, \eta, \tilde u, \tilde\eta$ and therefore $U'$ does not depend on B. Since the initial values belong to SU(n), we also have $U' \in SU(n)$. It follows that $U'$ belongs to the center of SU(n). This is only an ambiguity in the choice of trivialization and it does not affect the twistor space. Similar considerations show that the real structure sends B(β1, . . . , βn) to $B(-\bar\beta_1/\bar\zeta^2, \dots, -\bar\beta_n/\bar\zeta^2)$ and g to $r(\zeta) \exp\{B^*/\bar\zeta\} (g^*)^{-1}$, where
\[ r_{ij}(\zeta) = \begin{cases} 0 & \text{if } i + j \neq n, \\ (-1)^{j-1} \bar\zeta^{\,n+1-2j} & \text{if } i + j = n. \end{cases} \]

This time the remaining ambiguity is given by a real element in the center of SU (n), i.e. −1 if n is even. We now go through a similar procedure for the subset of MU (n) (c − 1), where β(+∞) is regular. We have assigned in the proof of Proposition 5.3 to each element of this set a pair (g, β(+∞). We already know how β(+∞) changes (as it is given by the complex moment map for a torus action). The proof of Proposition 5.3 shows that the other coordinates, g on {ζ 6= ∞} and g˜ on {ζ 6= 0}, are related by g˜ = g exp{−(c − ¯ 1)β(+∞)/ζ}. The real structure sends g to (g ∗ )−1 exp{(c − 1)β(+∞)∗ /ζ}. Finally we have to go to the complex-symplectic quotient as in the proof of Propo˜ β˜d ), where βd = diag(β1 , . . . , βn ) and sition 5.3. We end up with (g, βd ) and (g, gβd g −1 = B(β1 , . . . , βn ) (and similarily for (g, ˜ β˜d )). We see that βi and β˜i are related as stated and g˜ = d(ζ) exp{−B/ζ}g exp{−(c − 1)βd /ζ}. Since exp{−B/ζ}g = g exp{−βd /ζ}, g˜ = d(ζ)g exp{−cβd /ζ}. If we now go to the coordinates ui , u˜ i defined by (5.6), we see that they change as required, since the (i, j)th entry of V −1 is given by (5.9) and the βi change as prescribed (i.e. as sections of O(2)). A similar argument shows that the real structure is, up to a sign, the one described in the statement ∗−1 ¯ ¯ diag{ecβi /ζ } and in (it is enough to compare the last row in r(ζ) V −1 diag{ui } −1 −2 −2 0 ¯ ¯ ¯ ¯ V −β1 /ζ , . . . , −βn /ζ diag{ui }). We shall see shortly (Proposition 6.2) that the negative of the real structure described in the statement does not admit any sections (a section would be equivalent to a complex number with imaginary modulus). The formula for the complex symplectic structure is a direct consequence of Proposition 5.4. ˜ n (c) and this means We now wish to find the full twistor space and the metric on M finding a family of real sections. We know their projections to O(2) ⊗ Cn : they are given 3 by (β + (α + α∗ )ζ − β ∗ ) (+∞) (cf. 
[18]) and are parameterized √ by n distinct points in R with coordinates (xi , Re zi , Im zi ), i = 1, . . . , n, where xi = −1T1 (+∞), zi = β(+∞). In other words we have n curves Si = {(ζ, η); η = zi + 2xi ζ − z¯i ζ 2 } in T CP 1 (here η is the fiber coordinate). According to Proposition 6.1 the ui coordinate of a real section of Z(c) changes as a non-zero section of the bundle Lc (k − 1) (with the transition function ζ k−1 ecη/ζ from ∞ to 0) over Si . This is true only away from the intersection points of the curves Si and we have to understand what happens to the section at these points. Two curves Si = {(ζ, η); η = zi + 2xi ζ − z¯i ζ 2 } and Sj = {(ζ, η); η = zj + 2xj ζ − z¯j ζ 2 } intersect in a pair of distinct points aij and aji , where aij =

(xi − xj ) + rij , z¯i − z¯j

rij =

q (xi − xj )2 + |zi − zj |2 .

(6.3)
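A small numerical check of (6.3) may help fix conventions. The sketch below (function names and sample points are illustrative assumptions) verifies that $a_{ij}$ and $a_{ji}$ are the two values of ζ at which the curves $\eta = z + 2x\zeta - \bar z\zeta^2$ for the points i and j meet, and that they are exchanged by the antipodal map, $a_{ji} = -1/\bar a_{ij}$, a fact used later in the proof of Proposition 6.2.

```python
import math

def beta_section(x, z, zeta):
    """eta = z + 2 x zeta - conj(z) zeta^2, the curve attached to the point (x, z)."""
    return z + 2 * x * zeta - z.conjugate() * zeta ** 2

def intersection_point(xi, zi, xj, zj):
    """a_ij from (6.3); requires z_i != z_j so that the denominator is non-zero."""
    r = math.sqrt((xi - xj) ** 2 + abs(zi - zj) ** 2)
    return ((xi - xj) + r) / (zi - zj).conjugate()

# Two illustrative points of R x C.
xi, zi = 0.4, complex(1.0, -0.5)
xj, zj = -1.1, complex(0.2, 0.9)

a_ij = intersection_point(xi, zi, xj, zj)
a_ji = intersection_point(xj, zj, xi, zi)

# Both points lie on both curves, and they are antipodal to each other.
print(abs(beta_section(xi, zi, a_ij) - beta_section(xj, zj, a_ij)))
print(abs(beta_section(xi, zi, a_ji) - beta_section(xj, zj, a_ji)))
print(abs(a_ji + 1 / a_ij.conjugate()))
```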

We have:

Proposition 6.2. The real sections of the twistor space Z(c) of $\tilde M_n(c)$ are given, over ζ ≠ ∞, by $\bigl(\beta_1(\zeta), \dots, \beta_n(\zeta), u_1(\zeta), \dots, u_n(\zeta)\bigr)$, where
\[ \beta_i(\zeta) = z_i + 2x_i\zeta - \bar z_i\zeta^2, \]
\[ u_i(\zeta) = A_i \prod_{j \neq i} (\zeta - a_{ji})\, e^{c(x_i - \bar z_i\zeta)}, \]
where $(x_i, z_i)$, i = 1, . . . , n, are distinct points in $\mathbb{R} \times \mathbb{C}$ and $A_i$ are complex numbers satisfying
\[ A_i \bar A_i = \prod_{j \neq i} (x_i - x_j + r_{ij}). \]

Remark. Given Proposition 5.2, this finally shows that the biholomorphism of Proposition 5.3 is onto. Proof. Consider a real section s of Z(c) (corresponding to a solution (T0 , T1 , T2 , T3 )) which projects to a given real section (β1 (ζ), . . . , βn (ζ)) of O(2) ⊗ Cn . For a generic section the intersection points of the βs are all distinct. We consider the point aji at which √ βi intersects βj and let us assume that no√other βs intersect there. We recall that −1T1 (ζ) = 21 (α + α∗ ) − β ∗ ζ and, hence, −1T1 (ζ)(+∞)ss = xs − z¯s ζ. This √ √ means that −1T1 (aji )(+∞)jj < −1T1 (aji )(+∞)ii , and so, with respect to the complex structure corresponding to aji ∈ CP 1 , the solution (T0 , T1 , T2 , T3 ) belongs to the chart described in Proposition 5.3 with n generated by the matrix with the only nonzero entry having coordinates (i, j). Let us write s as (βi (ζ), ui (ζ)), i = 1, . . . , n, in a neighbourhood of aji , ζ 6= aji (notice that the procedure of Proposition 6.1 does assign well-defined complex numbers u1 (ζ), . . . , un (ζ) to each ζ 6= aji ). According to the proof of Proposition 5.3 there is an element m(ζ) ∈ N = exp n such that the following expression −1 diag u1 (ζ), . . . , un (ζ) m(ζ) V β1 (ζ), . . . , βn (ζ) has an invertible limit at ζ = aji . −1 and let p(ζ) denote Let Wkl (ζ) denote the (k, l)th entry of V β1 (ζ), . . . , βn (ζ) the only non-zero non-diagonal entry of diag u1 (ζ), . . . , un (ζ) m(ζ) (p(ζ) is the (i, j)th entry). We then have that Wkj uj + Wki p and Wki ui have a finite limit at ζ = aji , for all k = 1, . . . , n. From the formula (5.9) a finite limit for Wni ui implies that ui (aji ) = 0, while the nonvanishing of the last row of V −1 diag(us )m means that aji is a single zero of ui . If more than two sections βs (ζ) meet at aji the considerations are similar but involve larger n. 
We can conclude that the $a_{ji}$ contribute precisely n − 1 zeros of $u_i$ (counting multiplicities) and, given Proposition 6.1, this proves the formula for $u_i(\zeta)$ as soon as we show that $u_i$ has no other zeros, or, equivalently, no poles. To prove this latter statement it is enough to show that $u_j$ does not have a pole at $a_{ji}$. We go back to the situation when $\mathfrak n$ is one-dimensional, and where we concluded that $W_{kj} u_j + W_{ki} p$ has a finite limit at $\zeta = a_{ji}$ for all k = 1, . . . , n. We can write $W_{nj} u_j + W_{ni} p$ as $(f u_j + g p)/(\beta_i - \beta_j)$ where f and g have finite limits at $\zeta = a_{ji}$. We then have
\[ W_{n-1,j} u_j + W_{n-1,i} p = -\frac{1}{\beta_i - \beta_j} \Bigl( f u_j \sum_{s \neq j} \beta_s + g p \sum_{s \neq i} \beta_s \Bigr), \]
which can be rewritten as
\[ -f u_j - \sum_{s \neq i} \beta_s\, \frac{f u_j + g p}{\beta_i - \beta_j}. \]
Since the second term has a finite limit, so does $f u_j$ and hence $u_j$. Again, if more than two sections $\beta_s(\zeta)$ meet at $a_{ji}$ the considerations are similar but involve larger $\mathfrak n$. Thus



we have shown the second formula of the statement. The last formula follows from the reality condition and the fact that $a_{ji} = -1/\bar a_{ij}$ (this calculation also eliminates the ±1 ambiguity in the choice of the real structure in the proof of Proposition 6.1).

We can finally identify $\tilde M_n(c)$ as a $T^n$-bundle over the configuration space $\tilde C_n(\mathbb{R}^3)$ of n distinct points $x_i$ in $\mathbb{R}^3$.

Proposition 6.3. $\tilde M_n(c)$ is equivalent to the $T^n$-bundle described in Proposition 1.1.

Proof. From the last formula in Proposition 6.2 it follows that $A_i \neq 0$ if, for all j ≠ i, $z_i \neq z_j$ or $x_i > x_j$. On the other hand, if we put
\[ A_I = A_i \prod_{j \in I} a_{ji}, \]
for any subset I of {j; j ≠ i}, then we have
\[ A_I \bar A_I = \prod_{j \neq i,\ j \notin I} (x_i - x_j + r_{ij}) \prod_{j \in I} (x_j - x_i + r_{ij}). \]
Let us choose sets $I_1, \dots, I_n$ such that $I_i \subset \{j; j \neq i\}$ and $j \in I_i \Leftrightarrow i \notin I_j$. Define $U(I_1, \dots, I_n)$ as the complement of the subset $\{(x_i, z_i)_{i=1,\dots,n};\ I_i^c = \{j;\ z_i = z_j \text{ and } x_i < x_j\}\}$ ($I_i^c$ denotes the complement of $I_i$ in {j; j ≠ i}). The sets $U(I_1, \dots, I_n)$ cover $\tilde C_n(\mathbb{R}^3)$ and over each of them the bundle $\tilde M_n(c)$ is trivialized by coordinates $\bigl(x_i, z_i, A_{I_i}/|A_{I_i}|\bigr)$. To determine the bundle, choose i < j. The bundle restricted to $S^2_{ij}$ is given by the transition function from $U(I_1, \dots, I_n)$, where $j \notin I_i$, to $U(I_1', \dots, I_n')$, where $I_i' = I_i \cup \{j\}$, $I_j' = I_j - \{i\}$, $I_k' = I_k$ for k ≠ i, j. Let $\varphi_k$ be the transition function for the k-th generator of $T^n$, i.e. the transition function from $A_{I_k}/|A_{I_k}|$ to $A_{I_k'}/|A_{I_k'}|$. We see that $\varphi_k = 1$ if k ≠ i, j, and $\varphi_i = a_{ji}/|a_{ji}|$, $\varphi_j = |a_{ji}|/a_{ji}$. Therefore $\varphi_i = (z_j - z_i)/|z_j - z_i|$ and $\varphi_j = \varphi_i^{-1}$. It remains to identify the circle bundle over the sphere $x^2 + |z|^2 = \mathrm{const}$ given by the transition function z/|z| from the region $U_0 = \{z \neq 0 \text{ or } x > 0\}$ to the region $U_1 = \{z \neq 0 \text{ or } x < 0\}$. Let us write the unit 3-sphere as $\{(u, v) \in \mathbb{C}^2;\ |u|^2 + |v|^2 = 1\}$. The Hopf bundle is given by the $S^1$-action $t \cdot (u, v) = (tu, t^{-1}v)$ and the projection $S^3 \to S^2$ by the map $x = |u|^2 - |v|^2$, $z = 2uv$. Over $U_0$ this bundle is trivialized by (x, z, u/|u|) and over $U_1$ by (x, z, |v|/v). The transition function is |z|/z. Thus $[\varphi_i] = -1 \in H^1(S^2_{ij}, S^1)$.

We can now calculate the metric on $\tilde M_n(c)$. By the remark at the end of Sect. 3, it is enough to know the metric for c = −1, 0, 1, as the others are obtained by homothety. We shall calculate the metric for c = 1. The metric for c = −1 is the everywhere negative definite version of the Gibbons–Manton metric (this can be seen from the c = 1 calculation) and the one for c = 0 the and negative-definite cone over a 3-Sasakian manifold.

Theorem 6.4. $\tilde M_n(1)$ is isomorphic, as a hyperkähler manifold, to the Gibbons–Manton manifold $M_{GM}$ defined in Sect. 1.

Proof.
We know from the previous proposition that the two spaces are diffeomorphic. We shall show that the twistor description of $\tilde M_n(1)$ and of the Gibbons–Manton metric coincide. We recall from Sect. 1 that the latter is a hyperkähler quotient of $M = M_1 \times M_2$ by a torus, where $M_1 = (S^1 \times \mathbb{R}^3)^n$ and $M_2 = \mathbb{H}^{n(n-1)/2}$. With respect to any complex structure $M_1 = (\mathbb{C}^*)^n \times \mathbb{C}^n$ and $M_2 = \mathbb{C}^{n(n-1)/2} \times \mathbb{C}^{n(n-1)/2}$. Let us write the corresponding complex coordinates as $(p_i, \beta_i)$, i = 1, . . . , n, on $M_1$ and as $(v_{ij}, w_{ij})$, i < j, on $M_2$. The complex-symplectic forms corresponding to metrics $g_1$ and $g_2$ are given by
\[ \sum_{i=1}^{n} \frac{dp_i}{p_i} \wedge d\beta_i, \tag{6.4} \]
\[ \sum_{i<j} dv_{ij} \wedge dw_{ij}. \tag{6.5} \]
The real sections of the twistor space $Z_1$ of $M_1$ are written, over ζ ≠ ∞, as
\[ p_i(\zeta) = B_i e^{x_i - \bar z_i\zeta}, \qquad \beta_i(\zeta) = z_i + 2x_i\zeta - \bar z_i\zeta^2, \tag{6.6} \]
where $B_i \bar B_i = 1$. The real sections of the twistor space $Z_2$ of $M_2$ are (cf. [2], chapter 13.F):
\[ v_{ij}(\zeta) = C_{ij}(\zeta - a_{ij}), \qquad w_{ij}(\zeta) = D_{ij}(\zeta - a_{ji}), \tag{6.7} \]
where $a_{ij}, a_{ji}$ are roots of $v_{ij} w_{ij} = z_{ij} + 2x_{ij}\zeta - \bar z_{ij}\zeta^2$ for some $(x_{ij}, z_{ij}) \in \mathbb{R} \times \mathbb{C}$, i.e.
\[ a_{ij} = \frac{x_{ij} + \sqrt{x_{ij}^2 + |z_{ij}|^2}}{\bar z_{ij}}, \qquad a_{ji} = \frac{x_{ij} - \sqrt{x_{ij}^2 + |z_{ij}|^2}}{\bar z_{ij}}, \]
and
\[ C_{ij} \bar C_{ij} = -x_{ij} + \sqrt{x_{ij}^2 + |z_{ij}|^2}, \qquad D_{ij} \bar D_{ij} = x_{ij} + \sqrt{x_{ij}^2 + |z_{ij}|^2}. \]

Here the particular choice of sections is forced either by the fact the metric is positive definite or by requiring that the $S^1$-action $t \cdot (v_{ij}, w_{ij}) = (t v_{ij}, t^{-1} w_{ij})$ determines the Hopf bundle over the 2-sphere $x_{ij}^2 + |z_{ij}|^2 = 1$ (this calculation was done in the proof of Proposition 6.3). To obtain the twistor description of the Gibbons–Manton metric we have to perform the complex-symplectic quotient construction along the fibers of $Z_1 \oplus Z_2$ with respect to the difference of the forms (6.4) and (6.5). As in Sect. 1, the moment map equations are $v_{ij} w_{ij} = \beta_i - \beta_j$ and so the $a_{ij}, a_{ji}$ are given by (6.3). Since we already know that the manifolds are diffeomorphic, it is sufficient to determine the metric on an open dense subset, e.g. on the set where all $v_{ij}$ are non-zero. Quotienting this set by $(\mathbb{C}^*)^{n(n-1)/2}$ is equivalent to sending all $v_{ij}$ to 1. This is achieved by acting by the element $(v_{ij}^{-1})$ of $(\mathbb{C}^*)^{n(n-1)/2}$. By the description of the torus action given in Sect. 1, this sends $p_i(\zeta)$ to
\[ E_i\, e^{x_i - \bar z_i\zeta}\, \frac{\prod_{j<i} (\zeta - a_{ji})}{\prod_{j>i} (\zeta - a_{ij})}, \tag{6.8} \]
where
\[ E_i \bar E_i = \frac{\prod_{j<i} (x_i - x_j + r_{ij})}{\prod_{j>i} (x_j - x_i + r_{ij})}. \tag{6.9} \]
(6.9)

These and the βi give the real sections for the Gibbons–Manton metric and the symplectic form is (6.4). We now compare this with the description of Z(1) given in Proposition


315

6.2. According to Remark 5.5 we should set pi = ui the same symplectic form. We obtain pi (ζ) = Q

Q j6=i Q

(xi − xj + rij )

j>i

j>i (βi

Q

|zj − zi |2

=

Y

j>i (z¯j

− z¯i ) with the norm of Ei .

(xi − xj + rij )

j

Q ji

(xi − xj + rij )

j>i (xj

which proves the theorem.

− βj ) in order to have

Q (ζ − aji ) x −z¯ ζ Ai Qji (z¯j − z¯i ) j>i (ζ − aij )

All we have to do is to compare the norm of Ai We have, from Proposition 6.2 and Eq. (6.9), Ai A¯ i Q = 2 j>i |zj − zi |

Q

− xi + rij )

= Ei E¯ i ,
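The last step rests on the elementary identity $(x_i - x_j + r_{ij})/|z_j - z_i|^2 = 1/(x_j - x_i + r_{ij})$, which follows from $r_{ij}^2 - (x_i - x_j)^2 = |z_i - z_j|^2$. The sketch below (helper names and sample points are illustrative assumptions) compares the two norms numerically for a small configuration with pairwise distinct $z_i$.

```python
import math

def separation(p, q):
    """r_ij = sqrt((x_i - x_j)^2 + |z_i - z_j|^2) for p = (x_i, z_i), q = (x_j, z_j)."""
    return math.sqrt((p[0] - q[0]) ** 2 + abs(p[1] - q[1]) ** 2)

def norm_sq_from_A(points, i):
    """|A_i|^2 / prod_{j>i} |z_j - z_i|^2, with |A_i|^2 = prod_{j != i} (x_i - x_j + r_ij)."""
    xi, zi = points[i]
    value = 1.0
    for j, (xj, zj) in enumerate(points):
        if j == i:
            continue
        value *= xi - xj + separation(points[i], points[j])
        if j > i:
            value /= abs(zj - zi) ** 2
    return value

def norm_sq_E(points, i):
    """prod_{j<i} (x_i - x_j + r_ij) / prod_{j>i} (x_j - x_i + r_ij)."""
    xi, _ = points[i]
    value = 1.0
    for j, (xj, _) in enumerate(points):
        if j < i:
            value *= xi - xj + separation(points[i], points[j])
        elif j > i:
            value /= xj - xi + separation(points[i], points[j])
    return value

# Illustrative configuration of four points (x_i, z_i) in R x C.
points = [(0.3, 1.0 + 0.5j), (-1.2, -0.4j), (2.0, 0.7 - 1.1j), (0.1, -2.0 + 0.3j)]
for i in range(len(points)):
    print(i, norm_sq_from_A(points, i), norm_sq_E(points, i))
```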

We shall finish the section with a remark that Propositions 6.2 and 6.3 can be generalized to define hyperkähler metrics on a class of $T^n$-bundles over $\tilde C_n(\mathbb{R}^3)$. We have:

Theorem 6.5. Let P be a $T^n$-bundle over $\tilde C_n(\mathbb{R}^3)$ determined by an element $(s_1, \dots, s_n)$ of $H^2\bigl(\tilde C_n(\mathbb{R}^3), \mathbb{Z}^n\bigr)$ satisfying $s_k(S^2_{ij}) = 0$ if k ≠ i, j and $s_i(S^2_{ij}) = -s_j(S^2_{ij})$. Then P carries a family of (pseudo)-hyperkähler metrics such that the real sections of the twistor space are given, over ζ ≠ ∞, by $(\beta_1(\zeta), \dots, \beta_n(\zeta), u_1(\zeta), \dots, u_n(\zeta))$, where
\[ \beta_i(\zeta) = z_i + 2x_i\zeta - \bar z_i\zeta^2, \]
\[ u_i(\zeta) = A_i \prod_{j \neq i} (\zeta - a_{ji})^{s_{ij}}\, e^{c(x_i - \bar z_i\zeta)}, \]
where c is a real constant, $(x_i, z_i)$, i = 1, . . . , n, are distinct points in $\mathbb{R} \times \mathbb{C}$, $s_{ij} = |s_i(S^2_{ij})|$, and $A_i$ are complex numbers satisfying
\[ A_i \bar A_i = \prod_{j \neq i} (x_i - x_j + r_{ij})^{s_{ij}}. \]

This description determines a hypercomplex structure on P. A (pseudo)-hyperkähler metric can then be calculated using any complex-symplectic form along the fibers, given as a section of $\Lambda^2 T_F^* \otimes O(2)$, e.g. the form (5.7). These metrics will correspond to the motion of n dyons in $\mathbb{R}^3$ interacting in different ways (cf. [14]).

Remark. The calculation of the metric given above shows that the Taub-NUT metric (cf. [2]) has two very different descriptions in terms of Nahm's equations: 1) it is the metric on the totally geodesic submanifold $\tilde M_2^0(-1)$ of $\tilde M_2(-1)$ defined by considering su(2)-valued solutions to Nahm's equations and SU(2)-valued gauge transformations; 2) it is the metric on the moduli space of SU(3)-monopoles of charge (1, 1) [10, 29].



7. Asymptotic Comparison of the Metrics

We shall now show that the Gibbons–Manton metric and the monopole metric are asymptotically exponentially close. The asymptotic region, where the individual monopoles are separated, of the monopole space $M_n$ is diffeomorphic to $P/S_n$, where P is a torus bundle over the configuration space $\tilde C_n(\mathbb{R}^3)$ and $S_n$ the symmetric group. The bundle P is not, however, the bundle of Proposition 6.3. Rather, as we shall see shortly, it is the quotient of that bundle by a $\mathbb{Z}_2^n$-subgroup of $T^n$. In other words it is the bundle determined by an $s \in H^2\bigl(\tilde C_n(\mathbb{R}^3), \mathbb{Z}^n\bigr)$ with all $s_k$ being twice those in Proposition 6.3. We shall compare the metric on $M_n$ with the metric on the hyperkähler quotient of $\tilde M_n(1) \times \tilde M_n(1)$ by the diagonal $T^n$-action. We do this in order to have solutions to Nahm's equations with poles at both ends of the interval [−1, 1]. For any c, c′, let us write $\tilde M_n(c, c')$ for the hyperkähler quotient of $\tilde M_n(c) \times \tilde M_n(c')$ by the diagonal action of $T^n$. The action of $T^n$ given by $t \cdot (m, m') = (tm, m')$ induces a tri-Hamiltonian action of $T^n$ on $\tilde M_n(c, c')$ which makes $\tilde M_n(c, c')$ into a $T^n$-bundle over $\tilde C_n(\mathbb{R}^3)$. We have

Lemma 7.1. $\tilde M_n(c, c')$ is isomorphic, as a hyperkähler manifold, to $\tilde M_n(c + c')/\mathbb{Z}_2^n$, where $\mathbb{Z}_2^n = \{t \in T^n;\ t^2 = 1\}$.

Proof. Let µ, µ′ be the moment maps for the action of $T^n$ on $\tilde M_n(c)$, $\tilde M_n(c')$ respectively. The moment map for the diagonal $T^n$-action on the product is µ + µ′. If we go back to the proof of Proposition 6.3 and use the same notation, we can see that the zero-set of this moment map is a $(T^n \times T^n)$-bundle over $\tilde C_n(\mathbb{R}^3)$ which restricted to each $S^2_{ij}$ is given by transition functions $(\varphi_1, \dots, \varphi_n, \varphi_1^{-1}, \dots, \varphi_n^{-1})$ (the point being that $U(I_1', \dots, I_n') = -U(I_1, \dots, I_n)$). Hence, if we quotient by $T^n$, by sending the second $T^n$ to 1 over each $U(I_1, \dots, I_n)$, we end up with a $T^n$-bundle for which the transition functions are $\varphi_k^2$, k = 1, . . . , n. This proves the differential-geometric part of the statement. To obtain the isometry we repeat this argument for the twistor space of $\tilde M_n(c) \times \tilde M_n(c')$, performing the complex-symplectic quotient along the fibers as in the proof of Theorem 6.4.

From now on, we shall consider $\tilde M_n(1, 1)$ with half (compare formula (2.4)) of the metric given by the above lemma. In other words, locally the metric is still the Gibbons–Manton metric. We can identify $\tilde M_n(1, 1)$ with the moduli space of pairs $\bigl((T_0, T_1, T_2, T_3), (T_0', T_1', T_2', T_3')\bigr)$ of solutions to Nahm's equations, defined respectively on [−1, ∞] and on [−∞, 1], such that $T_i(+\infty) = T_i'(-\infty)$ for i = 0, 1, 2, 3, and the residues of $T_i$ at −1 and of $T_i'$ at +1, i = 1, 2, 3, define the standard n-dimensional irreducible representation of su(2). The group of gauge transformations G(1, 1) is now defined as pairs (g, g′) such that $g(t+1), g'(-t+1) \in G(c)$ for some c and $s = \lim_{t \to +\infty} \dot g g^{-1} = \lim_{t \to -\infty} \dot g' g'^{-1}$. The tangent space consists of pairs $\bigl((t_0, t_1, t_2, t_3), (t_0', t_1', t_2', t_3')\bigr)$ defined on [−1, ∞] and on [−∞, 1], respectively, with $t_i(+\infty) = t_i'(-\infty)$ and satisfying Eqs. (2.3). The metric on $\tilde M_n(1, 1)$ can be written as
\[ \frac{1}{2} \sum_{i=0}^{3} \|t_i(+\infty)\|^2 + \frac{1}{2} \int_{-1}^{+\infty} \sum_{i=0}^{3} \bigl( \|t_i(s)\|^2 - \|t_i(+\infty)\|^2 \bigr)\, ds + \frac{1}{2} \sum_{i=0}^{3} \|t_i'(-\infty)\|^2 + \frac{1}{2} \int_{-\infty}^{1} \sum_{i=0}^{3} \bigl( \|t_i'(s)\|^2 - \|t_i'(-\infty)\|^2 \bigr)\, ds. \]
We can rewrite this as
\[ \frac{1}{2} \int_{0}^{+\infty} \sum_{i=0}^{3} \bigl( \|t_i(s)\|^2 - \|t_i(+\infty)\|^2 \bigr)\, ds + \frac{1}{2} \int_{-\infty}^{0} \sum_{i=0}^{3} \bigl( \|t_i'(s)\|^2 - \|t_i(+\infty)\|^2 \bigr)\, ds + \frac{1}{2} \int_{-1}^{0} \sum_{i=0}^{3} \|t_i(s)\|^2\, ds + \frac{1}{2} \int_{0}^{1} \sum_{i=0}^{3} \|t_i'(s)\|^2\, ds. \tag{7.1} \]

Let us fix a complex structure, say I, and write, as in Sect. 5, α for $T_0 + iT_1$ and β for $T_2 + iT_3$. We write an element of $\tilde M_n(1, 1)$ as a pair $\bigl((\alpha_-, \beta_-), (\alpha_+, \beta_+)\bigr)$. We shall write $\beta_i$ for the (i, i)-th entry of $\beta_-(+\infty) = \beta_+(-\infty)$ and denote by $\tilde M_n^{\mathrm{reg}}(1, 1)$ the subset of $\tilde M_n(1, 1)$ where all $\beta_i$ are distinct. Similarly, we write $M_n^{\mathrm{reg}}$ for the subset of (α, β) in $(M_n, I)$ where the eigenvalues of β are distinct. We shall prove:

Theorem 7.2. There exists a biholomorphism φ from $\tilde M_n^{\mathrm{reg}}(1, 1)/S_n$ to $M_n^{\mathrm{reg}}$ such that
\[ |\varphi^* g - g'| = O(e^{-cR}), \tag{7.2} \]
where g, g′ denote the monopole and Gibbons–Manton metric respectively, c = c(n) is a constant, and R is the separation distance of particles in $C_n(\mathbb{R}^3)$, i.e.
\[ R = \min\{|x_i - x_j|;\ i \neq j\}. \tag{7.3} \]

The same estimate holds for the Riemannian curvature tensor. Since such a biholomorphism will be defined for any complex structure and the union of $\tilde M_n^{\mathrm{reg}}(1,1)$ for different complex structures is all of $\tilde M_n(1,1)$, we conclude that the monopole and the Gibbons–Manton metrics are exponentially close in the asymptotic region of the monopole moduli space. The remainder of the section is devoted to proving this theorem. We need the following lemma:

Lemma 7.3. Let $C > 0$. The space $\tilde M_n^{\mathrm{reg}}(1)$ is biholomorphic to the quotient of the space of solutions $(\alpha, \beta)$ to Eq. (5.1) which have the correct boundary behaviour at $t = 0$ and are constant (hence diagonal) for $t \ge C$ by the group of complex gauge transformations $g : [0, +\infty) \to Gl(n, \mathbb{C})$ with $g(0) = 1$ and $g(t) = \exp(ht - h)$ for some diagonal $h$ for $t \ge C$.

Proof. Let $(\alpha, \beta)$ be an element of $\tilde M_n^{\mathrm{reg}}(1)$ and let $\alpha_d = \alpha(+\infty)$, $\beta_d = \beta(+\infty)$. According to the proof of Proposition 5.3, there is a unique complex gauge transformation $g$ defined on $[C/2, +\infty)$ with $g(+\infty) = 1$ such that $(\alpha, \beta) = g(\alpha_d, \beta_d)$. Let $\hat g : [C/2, \infty) \to Gl(n, \mathbb{C})$ be a smooth path with the values and the first derivatives of $\hat g$ and $g$ coinciding at $t = C/2$ and with $\hat g(t) = 1$ for $t \ge C$. We obtain a solution $(\hat\alpha, \hat\beta)$ to the complex Nahm equation (5.1) by setting
$$(\hat\alpha, \hat\beta)(t) = \begin{cases} (\alpha, \beta)(t) & \text{if } t < C \\ \hat g(t)(\alpha_d, \beta_d) & \text{if } t \ge C. \end{cases} \tag{7.4}$$
This is a solution of the type described in the statement of this lemma. The proof of 5.3 shows further that it is only $g(C/2) \exp\{(1 - C/2)\alpha_d\}$ (and a solution to (5.1) on $[0, C/2]$) that determines the element of $\tilde M_n^{\mathrm{reg}}(1)$. Therefore we obtain a well defined holomorphic map from $\tilde M_n^{\mathrm{reg}}(1)$ to the moduli space described in the statement. Let us define the inverse map. Let $(\hat\alpha, \hat\beta)$ be an element of the moduli space described in the statement. As in [23] we can find a bounded complex gauge transformation $g_0$ such that $g_0(\hat\alpha, \hat\beta)$ is an element of $\tilde M_n^{\mathrm{reg}}(1)$. We can assume that $g_0$ has a limit $h$ at $+\infty$ (this follows from the convexity property of $g_0$ [13], since we can assume that $g_0(t)$ is hermitian for all $t$). According to Proposition 5.3 or 6.2 the action of $T^n$ on $\tilde M_n(1)$ extends to a global action of $(\mathbb{C}^*)^n$ with respect to the complex structure $I$ (or any other). Let $(\alpha, \beta)$ be the element of $\tilde M_n^{\mathrm{reg}}(1)$ obtained from $g_0(\hat\alpha, \hat\beta)$ by the action of $h^{-1} \in (\mathbb{C}^*)^n$. Then $(\alpha, \beta) = g(\hat\alpha, \hat\beta)$ and $g \in G^{\mathbb{C}}(1)$. This gives the inverse mapping.

We can now construct a biholomorphism between $\tilde M_n^{\mathrm{reg}}(1,1)/S_n$ and $M_n^{\mathrm{reg}}$. From the above lemma, $\tilde M_n^{\mathrm{reg}}(1,1)$ is biholomorphic to the quotient of the space of pairs $(\alpha_-, \beta_-)$, $(\alpha_+, \beta_+)$ defined on $[-1, +\infty)$ and on $(-\infty, 1]$ respectively such that $(\alpha_-, \beta_-)(t + 1)$ and $(\alpha_+, \beta_+)(1 - t)$ are as in the above lemma, $(\alpha_-, \beta_-)(+\infty) = (\alpha_+, \beta_+)(-\infty)$, by the group of pairs $(g_-, g_+)$ with $g_-(-1) = g_+(1) = 1$ and such that there are diagonal $h$, $p$ with $g_-(t) = \exp(th - p)$ for $t > -r$, $g_+(t) = \exp(th - p)$ for $t < r$ ($r \in (0, 1)$ is fixed but arbitrary). We define a solution $(\hat\alpha, \hat\beta)$ to the complex Nahm equation (5.1) on $(-1, 1)$ by
$$(\hat\alpha, \hat\beta)(t) = \begin{cases} (\alpha_-, \beta_-)(t) & \text{if } t < 0 \\ (\alpha_+, \beta_+)(t) & \text{if } t \ge 0. \end{cases} \tag{7.5}$$
The $G^{\mathbb{C}}$-orbit of this solution (see Sect. 2 for the definition of $G$) contains a unique element of $M_n$ [13, 20]. Furthermore, the action of a $(g_-, g_+)$ translates into the action of $g \in G^{\mathbb{C}}$, where $g(t) = g_-(t)$ for $t < 0$ and $g(t) = g_+(t)$ for $t \ge 0$. Therefore we have a well defined holomorphic map $\phi_r$ from $\tilde M_n^{\mathrm{reg}}(1,1)$ to $M_n$. If we now have an element $(\alpha, \beta)$ of $M_n^{\mathrm{reg}}$, we can diagonalize $\beta$ on $[-r, r]$ and make $\alpha$ diagonal and constant on $[-r, r]$. Let $(\tilde\alpha, \tilde\beta)$ be the resulting solution to the complex Nahm equation. We obtain an element of $\tilde M_n^{\mathrm{reg}}(1,1)$ by setting
$$(\alpha_-, \beta_-)(t) = \begin{cases} (\tilde\alpha, \tilde\beta)(t) & \text{for } t < 0 \\ (\tilde\alpha, \tilde\beta)(0) & \text{for } t \ge 0 \end{cases}$$
and similarly for $(\alpha_+, \beta_+)$. This defines the inverse to $\phi_r$ up to the ordering of eigenvalues of $\beta$. In other words $\phi_r$ induces a biholomorphism between $\tilde M_n^{\mathrm{reg}}(1,1)/S_n$ and $M_n^{\mathrm{reg}}$. Furthermore, for a fixed element $(\alpha_-, \beta_-)$, $(\alpha_+, \beta_+)$ of $\tilde M_n^{\mathrm{reg}}(1,1)$ and two parameters $r$, $r'$, the resulting $(\hat\alpha, \hat\beta)$ of (7.5) are $G^{\mathbb{C}}$-equivalent and therefore $\phi_r$, $\phi_{r'}$ induce the same biholomorphism $\phi$.

Let us now prove the estimate (7.3). Fortunately, much of the analysis has been already done in [3]. First of all, we recall ([23], Lemma 3.4) that solutions to Nahm's equations which have a regular triple as a limit at infinity approach this limit exponentially fast, of order $O(e^{-cR})$ (that is, $T_1, T_2, T_3$ do, and we can always make $T_0$ have such decay by using the gauge freedom). The proofs of Propositions 3.11–3.14 in [3] show that the same holds for tangent vectors $(t_0, t_1, t_2, t_3)$. Let us now see what happens to a tangent vector $v$ under the map $\phi$. The gauge transformations $(g, g')$ which make the element $(\alpha_-, \beta_-)$, $(\alpha_+, \beta_+)$ of $\tilde M_n^{\mathrm{reg}}(1,1)$ constant and equal to the common value at infinity on $[-1 + C/2, +\infty)$ and $(-\infty, 1 - C/2]$ are exponentially close to the identity. In the next stage of the construction of $\phi$ (formula (7.4)) we have smoothed out the solutions, which can again be done by gauge transformations exponentially close to 1.



Therefore the resulting tangent vector $\hat v$ is exponentially close to the original one in the metric (7.1). We have then restricted the solutions (formula (7.5)) to obtain a solution $(\hat\alpha, \hat\beta)$ to the complex Nahm equation on $[-1, 1]$. Let $p$ denote this operation of restriction. The first line of the formula (7.1) is exponentially small and therefore the norm of $\hat v$ in (7.1) and the norm of $dp(\hat v)$ in (2.4) are exponentially close. The solution $(\hat\alpha, \hat\beta)$ will not satisfy the real Nahm equation; however, we will have
$$F(\hat\alpha, \hat\beta) := \frac{d}{dt}\left(\hat\alpha + \hat\alpha^*\right) + [\hat\alpha, \hat\alpha^*] + [\hat\beta, \hat\beta^*] = O(e^{-cR}).$$
Lemma 2.10 in [13] implies now that we can solve the real equation by a complex gauge transformation bounded as $O(e^{-cR})$. We can now show that the vector $d\phi(v)$ tangent to $M_n$ (which is obtained from $dp(\hat v)$) is exponentially close to $dp(\hat v)$ by following the analysis of Sect. 3 in [3] step by step, replacing the $O(1/R)$ estimates by $O(e^{-cR})$. This proves the estimate (7.3). For the curvature estimates we do the same using the analysis of Sect. 4 in [3]. This proves Theorem 7.2.

8. Twistor Description of Monopoles and the Gibbons–Manton Metric

We shall show in this section how the twistor description of monopole metrics determines the asymptotic metric. We recall [13] that the moduli space of $n$-monopoles is biholomorphic to the space of based rational maps $p(z)/q(z)$ on $\mathbb{CP}^1$ of degree $n$ (based means that $\deg p < \deg q$). On the set where the roots $\beta_1, \dots, \beta_n$ of $q(z)$ are distinct, these roots and the values $p_i = p(\beta_i)$ of $p$ form local coordinates and the complex-symplectic form can be written as [1]:
$$\sum_{i=1}^{n} d\beta_i \wedge \frac{dp_i}{p_i}. \tag{8.1}$$

The metric is determined by the real sections $p(z, \zeta)/q(z, \zeta)$. Their description is provided in [19]. The denominator $q(z, \zeta)$ is given by a curve $S$ (the spectral curve of the monopole) in $T\mathbb{CP}^1$ [16]. This curve satisfies several conditions, one of which is the triviality of the line bundle $L^{-2}$ restricted to $S$, and Hurtubise [19] shows that the numerator $p(z, \zeta)$ is given by a nonzero section of this bundle. (The values $p_i(\zeta)$ are given by the values of this section at the intersection points $\beta_i(\zeta)$ of $S$ with $T_\zeta \mathbb{CP}^1$.) What happens when the individual monopoles separate? First of all, the spectral curve approaches the union of spectral curves of individual monopoles exponentially fast [4]. These curves $S_i$ are of the form $\eta_i = z_i + 2x_i\zeta - \bar z_i \zeta^2$, $i = 1, \dots, n$, where $(x_i, \operatorname{Re} z_i, \operatorname{Im} z_i)$ are locations of 1-monopoles (particles). What happens to the section of $L^{-2}$? We make a heuristic assumption (which we know to be true from Sect. 6) that the section acquires zeros and poles at the intersection points of the $S_i$ (more precisely, the only singularities of $p_i(\zeta)$ occur at the intersection points of $S_i$ with other $S_j$). As we shall see this is sufficient to determine the asymptotic metric. First of all the real structure on the bundle $L^{-2}$ is $u \mapsto \bar u^{-1} e^{-2\bar\eta/\bar\zeta}$ and therefore if $p_i$ has a zero at one of the points of $S_i \cap S_j$, then it has a pole of the same order at the other, and vice versa. Furthermore, since the metric and hence the real sections are invariant under the action of the symmetric group, we must have
$$p_i(\zeta) = A_i \prod_{j \neq i} \left(\frac{\zeta - a_{ij}}{\zeta - a_{ji}}\right)^{k} e^{-2(x_i - \bar z_i \zeta)}, \quad i = 1, \dots, n,$$
where $a_{ij}$, $a_{ji}$ are the two points in $S_i \cap S_j$ given by (6.3) and $k$ is an integer. The reality condition implies that
$$A_i \bar A_i = \prod_{j \neq i} a_{ji}^{k}\, \bar a_{ji}^{k}.$$

One can now calculate the asymptotic metric, using (8.1). The sign of $k$ will determine the signature, while $|k|$ is simply a constant multiple. The actual value of $k$ is determined by the topology of the asymptotic region of $M_n$, and comparing with Proposition 6.3 and the remarks at the beginning of Sect. 7 we conclude that $k = 1$ (in the coordinates of Proposition 6.2, $p_i = \prod_{j;\, j \neq i} (\beta_i - \beta_j)/u_i^2$). We remark that the above analysis can be easily done for other compact Lie groups $G$. The twistor description of metrics on moduli spaces of $G$-monopoles with maximal symmetry breaking is known from the work of Murray [28] and Hurtubise and Murray [21, 22] and from this the asymptotic metric can be calculated. We shall do the exact analysis in the case of $G = SU(N)$ in a subsequent paper.

Acknowledgement. This work was carried out at the Max-Planck-Institut für Mathematik in Bonn. I wish to thank the Institute's directors and staff for their hospitality and for creating a stimulating research atmosphere. I also thank Michael Murray for comments.

Note added in proof. The metrics of Section 4 (or, at least, their positive-definite counterparts) can also be obtained as finite-dimensional hyperkähler quotients. This follows from the construction of Kobak and Swann (Internat. J. Math. 7, 193–210 (1996)) of a nilpotent $G^{\mathbb{C}}$-orbit as a hyperkähler quotient of a vector space $V$ by a product $U$ of unitary groups. The hyperkähler quotient $P$ of $V$ by the semisimple part of $U$ is a positive-definite analogue of the spaces $M_G(c)$ (i.e. Theorem 4.3 holds for $P$). Presumably the spaces $M_G(c)$ can also be obtained this way by changing the signature of the metric on $V$.

References

1. Atiyah, M.F. and Hitchin, N.J.: The geometry and dynamics of magnetic monopoles. Princeton: Princeton University Press, 1988
2. Besse, A.: Einstein manifolds. Berlin–Heidelberg–New York: Springer, 1987
3. Bielawski, R.: Asymptotic behaviour of SU(2) monopole metrics. J. reine angew. Math. 468, 139–165 (1995)
4. Bielawski, R.: Monopoles, particles and rational functions. Ann. Glob. Anal. Geom. 14, 123–145 (1996)
5. Bielawski, R.: On the hyperkähler metrics associated to singularities of nilpotent varieties. Ann. Glob. Anal. Geom. 14, 177–191 (1996)
6. Bielawski, R.: Hyperkähler structures and group actions. J. London Math. Soc. 55, 400–414 (1997)
7. Bielawski, R.: Invariant hyperkähler metrics with a homogeneous complex structure. Math. Proc. Camb. Phil. Soc. 122, 473–482 (1997)
8. Biquard, O.: Sur les équations de Nahm et les orbites coadjointes des groupes de Lie semi-simples complexes. Math. Ann. 304, 253–276 (1996)
9. Biquard, O.: Twisteurs des orbites coadjointes. Ecole Polytechnique preprint (1997)
10. Connell, S.: The dynamics of the SU(3) (1, 1) magnetic monopoles. Ph.D. thesis, The Flinders University of South Australia (1991)
11. Dancer, A.S.: Nahm's equations and hyperkähler geometry. Commun. Math. Phys. 158, 545–568 (1993)
12. Dancer, A.S.: A family of hyperkähler manifolds. Quart. J. Math. Oxford 45, 463–478 (1994)
13. Donaldson, S.K.: Nahm's equations and the classification of monopoles. Commun. Math. Phys. 96, 387–407 (1984)
14. Gibbons, G.W. and Manton, N.S.: The moduli space metric for well-separated BPS monopoles. Phys. Lett. B 356, 32–38 (1995)
15. Gibbons, G.W. and Rychenkova, P.: HyperKähler quotient construction of BPS monopole moduli spaces. Commun. Math. Phys. 186, 581–599 (1997)
16. Hitchin, N.J.: On the construction of monopoles. Commun. Math. Phys. 89, 145–190 (1983)
17. Hitchin, N.J.: Polygons and gravitons. Math. Proc. Camb. Phil. Soc. 83, 465–476 (1979)
18. Hitchin, N.J., Karlhede, A., Lindström, U. and Roček, M.: Hyperkähler metrics and supersymmetry. Commun. Math. Phys. 108, 535–586 (1985)
19. Hurtubise, J.C.: Monopoles and rational maps: a note on a theorem of Donaldson. Commun. Math. Phys. 100, 191–196 (1985)
20. Hurtubise, J.C.: The classification of monopoles for the classical groups. Commun. Math. Phys. 120, 613–641 (1989)
21. Hurtubise, J.C. and Murray, M.K.: On the construction of monopoles for the classical groups. Commun. Math. Phys. 122, 35–89 (1989)
22. Hurtubise, J.C. and Murray, M.K.: Monopoles and their spectral data. Commun. Math. Phys. 133, 487–508 (1990)
23. Kronheimer, P.B.: A hyper-kählerian structure on coadjoint orbits of a semisimple complex group. J. London Math. Soc. 42, 193–208 (1990)
24. Kronheimer, P.B.: The construction of ALE spaces as hyper-Kähler quotients. J. Diff. Geom. 29, 665–683 (1989)
25. Lindström, U. and Roček, M.: Scalar tensor duality and N = 1, 2 nonlinear σ-models. Nucl. Phys. 222B, 285–308 (1983)
26. Manton, N.S.: A remark on the scattering of BPS monopoles. Phys. Lett. B 110, 54–56 (1982)
27. Manton, N.S.: Monopole interactions at long range. Phys. Lett. B 154, 397–400 (1985)
28. Murray, M.K.: Non-abelian magnetic monopoles. Commun. Math. Phys. 96, 539–565 (1984)
29. Murray, M.K.: A note on the (1, 1, . . . , 1) monopole metric. J. Geom. Phys. 23, 31–41 (1997)
30. Nahm, W.: The construction of all self-dual monopoles by the ADHM method. In Monopoles in quantum field theory. Singapore: World Scientific, 1982
31. Nakajima, H.: Monopoles and Nahm's equations. In Einstein metrics and Yang-Mills connections. New York: Marcel Dekker, 1993
32. Pedersen, H. and Poon, Y.S.: Hyper-Kähler metrics and a generalization of the Bogomolny equations. Commun. Math. Phys. 117, 569–580 (1988)
33. Stuart, D.: The geodesic approximation for the Yang-Mills-Higgs equations. Commun. Math. Phys. 166, 149–190 (1994)
34. Swann, A.: Hyperkähler and quaternionic Kähler geometry. Math. Ann. 289, 421–450 (1991)

Communicated by H. Nicolai

Commun. Math. Phys. 194, 323 – 341 (1998)



Dynamical Localization for Discrete and Continuous Random Schrödinger Operators

F. Germinet¹, S. De Bièvre²

¹ UFR de Mathématiques et LPTMC, Université Paris VII – Denis Diderot, 75251 Paris Cedex 05, France. E-mail: [email protected]
² UFR de Mathématiques et URA GAT, Université des Sciences et Technologies de Lille, 59655 Villeneuve d'Ascq Cedex, France. E-mail: [email protected]

Received: 23 July 1997 / Accepted: 22 October 1997

Abstract: We show for a large class of random Schrödinger operators $H_\omega$ on $\ell^2(\mathbb{Z}^\nu)$ and on $L^2(\mathbb{R}^\nu)$ that dynamical localization holds, i.e. that, with probability one, for a suitable energy interval $I$ and for $q$ a positive real,
$$\sup_t r_q(t) \equiv \sup_t \langle P_I(H_\omega)\psi_t, |X|^q P_I(H_\omega)\psi_t \rangle < \infty.$$
Here $\psi$ is a function of sufficiently rapid decrease, $\psi_t = e^{-iH_\omega t}\psi$ and $P_I(H_\omega)$ is the spectral projector of $H_\omega$ corresponding to the interval $I$. The result is obtained through the control of the decay of the eigenfunctions of $H_\omega$ and covers, in the discrete case, the Anderson tight-binding model with Bernoulli potential (dimension $\nu = 1$) or singular potential ($\nu > 1$), and in the continuous case Anderson as well as random Landau Hamiltonians.

1. Introduction

We show for a large class of random Schrödinger operators $H_\omega$ on $\ell^2(\mathbb{Z}^\nu)$ and on $L^2(\mathbb{R}^\nu)$ that dynamical localization holds, i.e. that, with probability one, for a suitable energy interval $I$ and $q > 0$,
$$\sup_t r_q(t) \equiv \sup_t \langle P_I(H_\omega)\psi_t, |X|^q P_I(H_\omega)\psi_t \rangle < \infty.$$
Here $\psi$ is a function of sufficiently rapid decrease, $\psi_t = e^{-iH_\omega t}\psi$, $P_I(H_\omega)$ is the spectral projector of $H_\omega$ corresponding to the interval $I$ and $X$ is the usual position operator. The result covers all random Schrödinger operators for which exponential localization has been proved, including operators with Bernoulli potentials in dimension 1 and random Landau Hamiltonians, for example.


The strategy of the proof is as follows. First recall that exponential localization, i.e. pure point spectrum and exponentially decaying eigenfunctions, is by now a well established property of random Schrödinger operators in many situations. On the other hand, it is also known that exponential localization does not systematically entail dynamical localization [11]. The authors of [11] point out that, to obtain dynamical localization, some control is needed on the location and the size of the boxes outside of which the eigenfunctions “effectively” decrease exponentially. This is precisely what is achieved for random Schrödinger operators in the present paper (Theorem 3.1 and Theorem 4.2). Our proof here uses the ideas of Von Dreifus and Klein [14], and in particular those of the proof of their Theorem 2.3. We proceed as follows: once exponential localization has been proved, and using the fact that the spectrum is now known to be discrete, one can exploit the result of the multi-scale analysis a second time to get better (and sufficient) control on the eigenfunction decay. We first deal with the discrete case (Sects. 2 and 3). In Sect. 2 we start by proving (along the lines of [11]) a sufficient condition (see (2.1)) on the eigenfunctions of a Hamiltonian H which implies dynamical localization. In Sect. 3 we give the proof of the announced result for the discrete Anderson model. The continuous case is dealt with in Sects. 4 and 5. Exponential localization for Schrödinger operators has recently been carried over to the continuum by Combes and Hislop [6] and by Klopp [21]. The case of random Landau Hamiltonians is dealt with by Combes, Hislop and Barbaroux [3, 7], by Wang [27] and by Dorlas, Macris and Pulé [13]. All those papers use an adaptation to the continuous case of the multi-scale analysis originally developed for discrete Schrödinger operators ([17, 18, 14] or see [5], [24]). 
This reduces the proof of exponential localization to the verification of two hypotheses: a Wegner estimate and an estimate allowing the “initialization” of the multi-scale analysis. Our central result here (Theorem 4.2) shows that those two hypotheses actually imply dynamical localization. We give some applications in Sect. 5. To put our results in perspective, we recall, first, the work of Holden and Martinelli [23] who prove, roughly speaking, that r2 (t) = o(t) for some particular continuous models. More recently Del Rio, Jitomirskaya, Last and Simon [11] used bounds of Aizenman [1] to give a simple proof (avoiding the multi-scale analysis) of dynamical localization for the discrete Anderson model with a potential with bounded density. But the bounds of [1] do not seem to carry over to the continuous case, nor to Bernoulli and other singular potentials in the discrete case. To deal with these cases, we were therefore obliged to return to the (rather painful) multi-scale analysis. A further application of our result to the random dimer model [16] is given in [10], and dynamical localization for the almost Mathieu model is proven in [19], using a related method.

2. Eigenfunction Decay and Dynamical Localization

In this section we give, for a class of self-adjoint operators $H$ with pure point spectrum, defined on either $\ell^2(\mathbb{Z}^\nu)$ or $L^2(\mathbb{R}^\nu)$, a sufficient condition on the eigenfunctions (see (2.1)) that guarantees dynamical localization. Our strategy for proving dynamical localization for random Schrödinger operators is then to prove that a property much stronger than this condition is indeed satisfied (Theorem 3.1, Theorem 4.2). Let $H_0$ be the following operator on $L^2(\mathbb{R}^\nu)$: $H_0 = H_1 \oplus H_2$, where $H_1 = p_1^2 + p_2^2$, $p_1 = \partial_{x_1} + Bx_2/2$, $p_2 = \partial_{x_2} - Bx_1/2$, $B \ge 0$, and $H_2 = \sum_{i=3}^{\nu} p_i^2$, $p_i = \partial_{x_i}$. One can also write $H_0 = (P - A)^2$, where $A$ is the vector potential $B/2\,(x_2, -x_1)$, written in the symmetric gauge, associated to the constant magnetic field $\vec{B} = B e_{x_3}$.

Theorem 2.1. Let $H$ be a self-adjoint operator on $\mathcal{H} = \ell^2(\mathbb{Z}^\nu)$ or $L^2(\mathbb{R}^\nu)$ with pure point spectrum on some interval $I \subset \mathbb{R}$. Let $\varphi_n$ be its eigenfunctions with corresponding eigenvalues $E_n \in I$. In the case $\mathcal{H} = L^2(\mathbb{R}^\nu)$, suppose that $I$ is compact and that $H$ has the form $H_0 + V$, $H_0$ as described above, $V \in L^\infty(\mathbb{R}^\nu)$. Suppose moreover that $\exists \gamma > 0$, $\gamma' \in\, ]0, \gamma/2[$ and sites $(x_n)$ s.t. $\forall n$,
$$|\varphi_n(x)| < C_\gamma\, e^{\gamma' |x_n|}\, e^{-\gamma |x - x_n|}. \tag{2.1}$$

Let $q > 0$ and $\psi \in \mathcal{H}$ decaying exponentially at a rate $\theta > 2\gamma'$. Then there exists a constant $C_\psi = C_\psi(I, \gamma, \gamma', \theta, q)$ such that:
$$\forall t \ge 0, \quad \| |X|^{q/2} e^{-iHt} P_I(H)\psi \|^2 \le C_\psi. \tag{2.2}$$

This simple result relies on ideas of Sect. 7 in [11]. Note that (2.1) says roughly that the eigenfunctions are localized inside boxes of size $|x_n|/2$ around "centers" $x_n$. This is stronger than exponential localization of $H$ on $I$, which only means that $\exists \gamma > 0$ such that
$$\forall n,\ \exists C_n > 0, \quad |\varphi_n(x)| \le C_n e^{-\gamma |x|}, \tag{2.3}$$

but weaker than what the authors of [11] called SULE (Semi-Uniformly Localized Eigenfunctions). We choose to present the proof of this theorem in the continuous case, since the first part of the proof (Lemma 2.2) is a little bit more technical in this situation. In order to prove Theorem 2.1 we need control on the growth of the $|x_n|$ in $n$, which is given by the following preliminary lemma:

Lemma 2.2. Let $H$ be as in Theorem 2.1 and $\delta > 0$. Then one can order the $|x_n|$ in increasing order, and there exists a constant $C = (4 \max(B, 1))^{-1}$ such that for $n$ large enough (depending on $\delta$): $|x_n| \ge C n^{1/(\nu+\delta)}$.

Proof of Lemma 2.2. Essentially we follow the ideas of the proof of Theorem 7.1 of [11]. We recall that the energies $E_n$ that we consider belong to $I$. Let $\delta > 0$ be given and let $\delta' > 0$ so that $\delta > \delta'(\nu - 1)$. Let $L > 0$ be given, define $J = [0, L^{\delta'}]$, and write $\chi_{2L}(x)$ for the characteristic function of the ball of radius $2L$ centered at 0. Suppose that $|x_n| < L$. Then, for such $n$ and for some constant $C_1$, one has:
$$\begin{aligned}
\langle \varphi_n, (1 - \chi_{2L}(X))\chi_J(H_0)\chi_{2L}(X)\varphi_n \rangle
&\le \|(1 - \chi_{2L}(X))\varphi_n\|_{L^2}\, \|\chi_{2L}(X)\varphi_n\|_{L^2} \\
&\le C_\gamma\, e^{(\gamma' + \gamma)|x_n|}\, \|(1 - \chi_{2L}(X))\, e^{-\gamma |x|}\|_{L^2} \\
&\le C_1 e^{-(\gamma - \gamma')L}.
\end{aligned} \tag{2.4}$$
Secondly, using $1 - \chi_J(y) \le y L^{-\delta'}$, $y \ge 0$, and $H\varphi_n = E_n \varphi_n$:
$$\langle \varphi_n, (1 - \chi_J(H_0))\varphi_n \rangle \le \left(\langle \varphi_n, H\varphi_n \rangle + \|V\|_\infty\right) L^{-\delta'} \le C(I, \|V\|_\infty)\, L^{-\delta'}. \tag{2.5}$$

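So, using (2.4) and (2.5) together with two similar inequalities,
$$\begin{aligned}
\operatorname{tr}(\chi_{2L}(X)\chi_J(H_0)\chi_{2L}(X))
&\ge \sum_{n \mid E_n \in I,\ |x_n| \le L} \langle \varphi_n, \chi_{2L}(X)\chi_J(H_0)\chi_{2L}(X)\varphi_n \rangle \qquad (2.6) \\
&\ge \#\{n \mid E_n \in I,\ |x_n| \le L\} \left(1 - 3C_1 e^{-(\gamma - \gamma')L} - C(I, \|V\|_\infty) L^{-\delta'}\right) \\
&\ge \tfrac{1}{2}\, \#\{n \mid E_n \in I,\ |x_n| \le L\} \quad \text{for } L \ge L_0 = L_0(\gamma, \gamma', I, \|V\|_\infty, \delta'). \qquad (2.7)
\end{aligned}$$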

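The next step is then to bound the trace class norm of the operator $Q = \chi_{2L}(X)\chi_J(H_0)\chi_{2L}(X)$. Let's study the case $B \neq 0$. Since $\{u_1^2 + u_2^2 \le L\} \subset \{u_1^2 \le L,\ u_2^2 \le L\}$, and denoting by $\chi^{(d)}_{2L}(x)$ the characteristic function of the $d$-dimensional ball of radius $2L$ centered at 0, remark that
$$\begin{aligned}
\operatorname{tr}(Q) &\le \operatorname{tr}\left(\chi_{2L}(X)(\chi_J(H_1) \otimes \chi_J(H_2))\chi_{2L}(X)\right) \\
&= \operatorname{tr}\left((\chi^{(2)}_{2L}(X)\chi_J(H_1)\chi^{(2)}_{2L}(X)) \otimes (\chi^{(\nu-2)}_{2L}(X)\chi_J(H_2)\chi^{(\nu-2)}_{2L}(X))\right) \\
&\equiv \operatorname{tr}(Q_1 \otimes Q_2).
\end{aligned} \tag{2.8}$$
The operator $Q_2$ is the product of two Hilbert–Schmidt operators with respective kernels $\chi^{(\nu-2)}_{2L}(x)\,(\mathcal{F}^{-1}g)(x - y)$ and $(\mathcal{F}^{-1}g)(x - y)\,\chi^{(\nu-2)}_{2L}(y)$ [25], where $\mathcal{F}^{-1}g$ denotes the inverse Fourier transform of $g(x) = \chi_J \circ s(x)$, with $s(x_1, \dots, x_{\nu-2}) = \sum_{i=1}^{\nu-2} x_i^2$. So, denoting by $\|C\|_1$ the trace class norm of an operator $C$ defined on $\mathcal{H}$, one has:
$$\|Q_2\|_1 \le \|\chi_{2L}(X)\chi_J(H_2)\|_{HS}\, \|\chi_J(H_2)\chi_{2L}(X)\|_{HS} \le \|\chi^{(\nu-2)}_{2L}(x)\|^2_{L^2}\, \|\chi_J \circ s(x)\|^2_{L^2} \le (4L)^{\nu-2} L^{\delta'(\nu-2)}. \tag{2.9}$$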

We turn now to the magnetic part $Q_1$. It is well known that the spectrum of our $H_1$ consists of eigenvalues $(2n + 1)B$, $n = 0, 1, 2, \dots$. The corresponding projectors are operators with kernel [22],
$$P_n(x, x') = e^{i\frac{B}{2} x \wedge x'}\, p_n\!\left(\sqrt{B}(x - x')\right), \tag{2.10}$$
where
$$p_n((x_1, x_2)) = \frac{1}{2\pi} \int_{\mathbb{R}} e^{ikx_1} h_n(k + x_2/2)\, h_n(k - x_2/2)\, dk,$$
and $h_n(k)$ are the normalised Hermite functions. Note that we gave the expression in the symmetric gauge, and not in the Landau gauge as in [22]. Write now $P_J = \sum_{n \mid E_n \in J} P_n$, the projector corresponding to the eigenvalues belonging to $J$. Then $Q_1 = (\chi^{(2)}_{2L}(X)P_J)(P_J\chi^{(2)}_{2L}(X))$, which are two Hilbert–Schmidt operators with the same norm (because of (2.10)). And one has, for each $n$,
$$\begin{aligned}
\|\chi^{(2)}_{2L}(X)P_n\|^2_{HS}
&= \int_x \int_{x'} \chi^{(2)}_{2L}(x) \left|p_n\!\left(\sqrt{B}(x - x')\right)\right|^2 dx\, dx' \\
&\le (4BL)^2 \int_{x_2} \left\| \mathcal{F}\!\left(h_n(\cdot + \tfrac{x_2}{2})\, h_n(\cdot - \tfrac{x_2}{2})\right)(x_1) \right\|^2_{L^2} dx_2 \\
&= (4BL)^2 \int_{x_2, k} \left|h_n(k + \tfrac{x_2}{2})\right|^2 \left|h_n(k - \tfrac{x_2}{2})\right|^2 dx_2\, dk \\
&= (4BL)^2\, \|h_n\|^4_{L^2}.
\end{aligned}$$
So, since $\|h_n\|_{L^2} = 1$, and using $\#\{n \ge 0,\ (2n + 1)B \le L^{\delta'}\} \le L^{\delta'}/2B$ if $L^{\delta'} \ge B$, one has
$$\|\chi^{(2)}_{2L}(X)P_J\|^2_{HS} \le 8B L^{2+\delta'}.$$
And then, using (2.8) and (2.9),
$$\operatorname{tr}(Q) \le C L^{\nu+\delta}, \tag{2.11}$$
taking $\delta > \delta'(\nu - 1)$, and $C = 4\max(B, 1)$. Note here that if $B = 0$ then the free Hamiltonian $H_0$ has the form $\sum_{i=1}^{\nu} \partial_i^2$. Hence the analysis made previously for $Q_2$ is valid for such a $Q$ and (2.11) holds. In order to finish the argument, note that, together with (2.7) and (2.8), (2.11) tells us that, for $L \ge L_0$, $N(L) \equiv \#\{n \mid E_n \in I,\ |x_n| \le L\}$ is finite. Order then the eigenfunctions in such a way that $|x_n|$ increases. So, $N(|x_n|) = n$, and if $n \ge N_0 \equiv N(L_0)$, one has, with $L = |x_n|$:
$$|x_n| \ge C n^{1/(\nu+\delta)}, \quad \text{for } n > N_0. \tag{2.12}$$
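The passage from the trace bound to (2.12) can be spelled out as follows. This is only a sketch: $C$ stands for the constant of (2.11), and the numerical factor in the last line is assumed to be absorbed into the constant written in (2.12).

```latex
\begin{align*}
\tfrac12\, N(L) &\le \operatorname{tr}(Q) \le C\,L^{\nu+\delta}
   && \text{by (2.7) and (2.11), for } L \ge L_0,\\
n = N(|x_n|) &\le 2C\,|x_n|^{\nu+\delta}
   && \text{taking } L = |x_n|,\ n \ge N_0,\\
|x_n| &\ge (2C)^{-1/(\nu+\delta)}\, n^{1/(\nu+\delta)} .
\end{align*}
```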

The main difference between the continuous and discrete cases comes from this lemma, in the sense that on $L^2(\mathbb{R}^\nu)$ one has to control the behaviour of the eigenfunctions in the momentum variables: that was achieved through the use of $\chi_J(H_0)$. In the discrete case, (2.11) is replaced by the trivial equality $\operatorname{tr}(\chi_{2L}) = (4L + 1)^\nu$. This also explains why no specific form for $H$ is needed on $\ell^2(\mathbb{Z}^\nu)$. It is easy, then, to rewrite the proof of Lemma 2.2 in this case (see also [11]).

Proof of Theorem 2.1. Let $\psi \in \mathcal{H}$ be such that, for some constant $C(\psi) > 0$ and $\theta > 2\gamma'$, $|\psi(x)| < C(\psi)e^{-\theta|x|}$. We have to bound $\|X^{q/2} P_I(H) e^{-iHt}\psi\|$, for $q > 0$, $t > 0$. We recall that $\theta > 2\gamma'$ and $\gamma > 2\gamma'$. Without loss of generality, let's suppose $\theta < \gamma$ and write $\gamma = \theta + \varepsilon$, $\varepsilon > 0$. Then
$$\begin{aligned}
\|X^{q/2} P_I(H) e^{-iHt}\psi\|^2
&\le \sum_{n \mid E_n \in I} |\langle \varphi_n, \psi \rangle|\, \|X^q \varphi_n\| \\
&\le \sum_{n \mid E_n \in I} C(\psi)\, C_{\gamma,q}\, |x_n|^q \int e^{-\theta|x|}\, e^{2\gamma'|x_n|}\, e^{-\gamma|x - x_n|}\, dx \\
&\le C(\psi)\, C_{\gamma,q} \left( \sum_{n \mid E_n \in I} |x_n|^q\, e^{-(\theta - 2\gamma')|x_n|} \right) \left( \int e^{-\varepsilon|x - x_n|}\, dx \right) \\
&\le C_\psi(\gamma, \gamma', I, \theta, \varepsilon),
\end{aligned}$$
where we used respectively for the second and last inequalities
$$\|X^q \varphi_n\|^2 \le C^2_{\gamma,q}\, |x_n|^{2q}\, e^{2\gamma'|x_n|},$$
and Lemma 2.2.
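The step from the second to the third inequality in the proof of Theorem 2.1 is not written out in the text; a plausible reading, using only $\gamma = \theta + \varepsilon$ and the triangle inequality, is sketched below.

```latex
\begin{align*}
e^{-\theta|x|}\,e^{-\gamma|x-x_n|}
  &= e^{-\theta(|x|+|x-x_n|)}\,e^{-\varepsilon|x-x_n|}
     && (\gamma=\theta+\varepsilon)\\
  &\le e^{-\theta|x_n|}\,e^{-\varepsilon|x-x_n|}
     && (|x_n|\le |x|+|x-x_n|),
\end{align*}
so that
\[
\int e^{-\theta|x|}\,e^{2\gamma'|x_n|}\,e^{-\gamma|x-x_n|}\,dx
  \;\le\; e^{-(\theta-2\gamma')|x_n|}\int e^{-\varepsilon|x-x_n|}\,dx .
\]
```

The remaining integral does not depend on $n$ (translation invariance), and the sum over $n$ converges because $\theta > 2\gamma'$ while, by Lemma 2.2, $|x_n|$ grows at least like a positive power of $n$.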



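3. The Discrete Anderson Model

We consider the self-adjoint operator $H_\omega = -\Delta + V_\omega$, where $\Delta$ is the discrete Laplacian on $\ell^2(\mathbb{Z}^\nu)$ and $V_\omega$ ($\omega \in \Omega$) is a random potential, the $(V_\omega(x))_{x \in \mathbb{Z}^\nu}$ being i.i.d. random variables. Their common probability measure $\mu$ is assumed to be non degenerate, i.e. not concentrated on a single point. The conditions that we impose on $\mu$ are:
$$\begin{cases}
\text{if } \nu = 1: & \exists\, \eta > 0,\ \int_{\mathbb{R}} |v|^\eta\, d\mu(v) < \infty; \\
\text{if } \nu \ge 2: & \mu \text{ is } \alpha\text{-Hölder continuous.}
\end{cases} \tag{3.1}$$
Let us recall how the disorder $\delta(\mu)$ of an $\alpha$-Hölder continuous measure $\mu$ is defined:
$$\delta(\mu)^{-1} = \inf_{\tau > 0}\ \sup_{|b-a| < \tau} |b - a|^{-\alpha} \mu([a, b]).$$
[…] 0.
• If $\nu = 1$, define $\Gamma \equiv \gamma(I) \equiv \inf\{\gamma(E),\ E \in I\}$, where $\gamma(E)$ is the Lyapunov exponent at energy $E$. Suppose $\Gamma = \gamma(I) > 0$.
• If $\nu > 1$, pick $\Gamma > 0$. Suppose the disorder $\delta(\mu)$ is taken sufficiently high.
Then, $\mathbb{P}$ almost surely, $H_\omega$ has pure point spectrum on $I$, and there exist centers $x_{n,\omega}$ associated to the eigenfunctions $\varphi_{n,\omega}$ with energy $E_{n,\omega} \in I$ such that: $\forall \gamma_0 \in\, ]0, \Gamma[$, there exists a constant $C(\omega, \varepsilon, \gamma_0)$ such that
$$\forall x \in \mathbb{Z}^\nu, \quad |\varphi_{n,\omega}(x)| \le C(\omega, \varepsilon, \gamma_0)\, e^{\gamma_0 |x_{n,\omega}|^\varepsilon}\, e^{-\gamma_0 |x - x_{n,\omega}|}. \tag{3.2}$$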

Evidently, one can also write a "low energy" version of this theorem. As an immediate consequence of Theorem 2.1 and Theorem 3.1 we have:

Corollary 3.2. Let $H_\omega$ be as in Theorem 3.1, and $P_I(H_\omega)$ the spectral projection on the compact interval $I$. Then for $q > 0$ and $\psi \in \ell^2(\mathbb{Z}^\nu)$ decaying exponentially with rate $\theta > 0$, $\| |X|^{q/2} P_I(H_\omega) e^{-iH_\omega t}\psi \|^2$ is bounded uniformly in $t$ almost surely.

Comparing (3.2) to (2.3) and to (2.1), one notices that now the size of the boxes in which the eigenfunctions "live" can grow at most as $|x_{n,\omega}|^\varepsilon$. One expects this can be improved to a polynomial bound. Supposing $\mu$ has a bounded density with compact support, the polynomial bound follows from the proof of Theorem 7.6 in [11]. The proof of Theorem 3.1 is based on the ideas of [14], and in particular on the proof of Theorem 2.3 in [14]. The strategy is the following: since the hypotheses of Theorem 3.1 imply exponential localization, we know that there exist "centers" $x_{n,\omega}$, where the eigenfunction $\varphi_{n,\omega}$ is maximal, and one can then exploit the result of the multiscale analysis a second time to improve the control of the decay of the eigenfunctions. As already pointed out, this proof has the advantage of yielding the result for singular potentials and in particular for Bernoulli potentials in dimension 1. In addition, the proof extends to continuous random Schrödinger operators, as shown in Sects. 4 and 5.

To make this paper self-contained, we start by recalling the elements from [14] that we need. First of all, $\Lambda_L(x)$ denotes the cube of side $L/2$ centered in $x$ and $\partial\Lambda_L(x)$ its boundary. $H_{\Lambda_L(x),\omega}$ is the restriction of the operator $H_\omega$ to the cube $\Lambda_L(x)$ with Dirichlet boundary conditions, and $G_{\Lambda_L(x)}(E, \cdot, \cdot)$ is its resolvent. Given $L_0 > 1$, $\alpha \in\, ]1, 2[$, we define $L_k$ ($k \in \mathbb{N}$) recursively via $L_{k+1} = L_k^\alpha$. Given in addition an integer $b \ge 2$, we define $A_{k+1}(x_o) = \Lambda_{2bL_{k+1}}(x_o) \setminus \Lambda_{2L_k}(x_o)$. Note that we do not indicate the dependence of $L_k$ and $A_k$ on $L_0$, $\alpha$ and $b$ since these quantities will at any rate be fixed later on. We further need the following definition:

Definition 3.3. Let $\gamma > 0$ and an energy $E \in \mathbb{R}$ be given. A cube $\Lambda_L(x)$ is said to be $(\gamma, E)$-regular if $E \notin \sigma(H_{\Lambda_L(x)})$ and if for all $y \in \partial\Lambda_L(x)$, $|G_{\Lambda_L(x)}(E, x, y)| \le e^{-\gamma L/2}$. Otherwise $\Lambda_L(x)$ will be called $(\gamma, E)$-singular.

Note that this definition is $\omega$-dependent, but we follow the usual practice by not indicating this. For $x_o \in \mathbb{Z}^\nu$, $E_k(x_o)$ is defined to be the following set:
$$\{\omega \mid \exists E \in I,\ \exists x \in A_{k+1}(x_o),\ \Lambda_{L_k}(x_o) \text{ and } \Lambda_{L_k}(x) \text{ are } (\gamma, E)\text{-singular}\}.$$
Finally, we recall a well known identity. Let $x \in \mathbb{Z}^\nu$, $E \notin \sigma(H_{\Lambda_L(x)})$, and $\varphi \in \ell^2(\mathbb{Z}^\nu)$ so that $H\varphi = E\varphi$ be given, then:
$$\varphi(x) = \sum_{(y, y') \in \partial\Lambda_L(x)} G_{\Lambda_L(x)}(E, x, y)\, \varphi(y'). \tag{3.3}$$
Here (with some abuse of notation) $(y, y') \in \partial\Lambda_L$ means $y$ and $y'$ are nearest neighbours with $y \in \Lambda_L(x)$ and $y' \notin \Lambda_L(x)$. $\partial\Lambda^+_{L_k}(x)$ will denote the points $y'$ just outside $\Lambda_{L_k}(x)$. In order to prove Theorem 3.1, we start with the following three lemmas:

Lemma 3.4. Let $p > \nu$ and $\alpha \in\, ]1, 2 - 2\nu/(p + 2\nu)[$ and $b > 1$ be given. Assume the hypotheses of Theorem 3.1 are satisfied. Then for any $\gamma \in\, ]0, \Gamma[$ there exists $L_0 = L_0(p, \nu, \gamma, b, \delta(\mu)) > 1$ such that:
$$\forall k \in \mathbb{N},\ \forall x \in \mathbb{Z}^\nu, \quad \mathbb{P}(E_k(x)) \le \frac{(2bL_{k+1} + 1)^\nu}{(L_k)^p}.$$

Lemma 3.5. Let $\gamma > 0$ be fixed. There exists a constant $L^*(\nu, \gamma)$ so that, if $H$ is a Schrödinger operator and $\varphi \in \ell^2(\mathbb{Z}^\nu)$ an eigenvector of $H$ with eigenvalue $E$, and if $x^* \in \mathbb{Z}^\nu$ satisfies $|\varphi(x^*)| = \sup\{|\varphi(x)|,\ x \in \mathbb{Z}^\nu\}$, then $\Lambda_L(x^*)$ is $(\gamma, E)$-singular, for all $L \ge L^*(\nu, \gamma)$.

Lemma 3.6. If $\nu$, $s_\nu$, and $\gamma$ are some positive constants, then $\forall \eta \in\, ]0, 1[$ there exists $L(\eta, \gamma, \nu)$ such that, if $L \ge L(\eta, \gamma, \nu)$:
$$\forall x, x_o \in \mathbb{Z}^\nu, \quad \left(s_\nu L^{\nu-1} e^{-\gamma L/2}\right)^{\frac{|x - x_o|}{L/2 + 1}} \le e^{-\gamma\eta|x - x_o|}. \tag{3.4}$$
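Inequality (3.4) is elementary, and the threshold $L(\eta, \gamma, \nu)$ can be explored numerically. The sketch below is only an illustration: the function names, the restriction to integer $L$, the finite search range `L_max`, and the sample value chosen for $s_\nu$ are assumptions, not part of the paper. Taking logarithms and dividing by $|x - x_o|$, (3.4) is equivalent to $\bigl(\log(s_\nu L^{\nu-1}) - \gamma L/2\bigr)/(L/2 + 1) \le -\gamma\eta$.

```python
import math


def decay_exponent(L, gamma, nu, s_nu):
    """Return log(s_nu * L**(nu-1) * exp(-gamma*L/2)) / (L/2 + 1).

    (3.4) holds for a given L exactly when this value is <= -gamma*eta.
    """
    return (math.log(s_nu) + (nu - 1) * math.log(L) - gamma * L / 2) / (L / 2 + 1)


def min_scale(eta, gamma, nu, s_nu, L_max=100_000):
    """Smallest integer L0 such that (3.4) holds for every integer L in [L0, L_max].

    Hypothetical helper: the search is restricted to integers and to a finite
    range, so it only approximates the threshold L(eta, gamma, nu) of Lemma 3.6.
    """
    L0 = None
    for L in range(1, L_max + 1):
        if decay_exponent(L, gamma, nu, s_nu) <= -gamma * eta:
            if L0 is None:
                L0 = L
        else:
            L0 = None  # the inequality failed again; restart the search
    return L0


if __name__ == "__main__":
    # Illustrative parameters only; s_nu = 2 is an assumed value.
    print(min_scale(eta=0.5, gamma=1.0, nu=1, s_nu=2.0, L_max=1000))
```

Running the search with smaller values of `gamma` gives a feel for how the threshold grows as the decay rate decreases.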



The first lemma follows immediately from the Appendix and from Theorem 2.2 of Von Dreifus and Klein [14] and constitutes the core of the proof of exponential localization in [14]. We will prove an analog of it for continuous random Schrödinger operators in the Appendix. The second lemma says roughly that if $\varphi$ is an eigenvector of $H$ with eigenvalue $E$, then $E$ must be close to the spectrum of $H_{\Lambda_L(x)}$ provided $L$ is big enough and $\Lambda_L(x)$ is centered on a maximum of $|\varphi_{n,\omega}|$. It is quite simple, but central for what follows. Obviously the third lemma doesn't need a proof. We have stated it separately in order to make clear later on that $L(\eta, \gamma, \nu)$ only depends on the model parameters and not on the particular eigenfunction we consider. Note that $L(\eta, \gamma, \nu)$ behaves like $(1/\gamma)$ at a positive power.

Proof of Lemma 3.5. Let $\varphi$ be as in the lemma: $\varphi \in \ell^2(\mathbb{Z}^\nu)$, so $x^*$ exists. Suppose that $\Lambda_L(x^*)$ is $(\gamma, E)$-regular, and apply the identity (3.3) at the point $x^*$. Then for some $y' \in \partial\Lambda^+_L(x^*)$:
$$|\varphi_{n,\omega}(x^*)| \le s_\nu L^{\nu-1} e^{-\gamma L/2}\, |\varphi_{n,\omega}(y')| \le s_\nu L^{\nu-1} e^{-\gamma L/2}\, |\varphi_{n,\omega}(x^*)|, \tag{3.5}$$
where $s_\nu$ is a constant depending only on the dimension. Now let $L^*(\gamma, \nu)$ be a positive real such that $s_\nu L^{\nu-1} e^{-\gamma L/2} < 1$ for $L \ge L^*$. Then, for such $L$, (3.5) is impossible, and $\Lambda_L(x^*)$ cannot be $(\gamma, E)$-regular any more, that is: $\Lambda_L(x^*)$ is $(\gamma, E)$-singular for $L \ge L^*(\gamma, \nu)$.

Proof of Theorem 3.1. Under the hypotheses of the theorem, $H_\omega$ has $\mathbb{P}$-a.s. exponential localization on $I$ (see [4 and 3]). This means that there exists $\Omega_0 \subset \Omega$, $\mu(\Omega_0) = 1$ so that for all $\omega \in \Omega_0$, $\sigma_c(H_\omega) \cap I = \emptyset$ and for every eigenvalue $E_{n,\omega} \in I$, the corresponding eigenfunction $\varphi_{n,\omega}$ is $\ell^2$ and satisfies (2.3). The aim is therefore to control the constant $C_{n,\omega}$ of (2.3) and more precisely to show that $x_{n,\omega}$ can be chosen so that this $C_{n,\omega}$ grows slower than an exponential in $|x_{n,\omega}|^\varepsilon$. In order to prove Theorem 3.1, we wish to use Eq. (3.3) repeatedly, and on a scale $L_k$ for suitably large $k$, to estimate the value of $\varphi_{n,\omega}(x)$ when $x$ belongs to $A_{k+1}(x_{n,\omega})$ for suitably chosen $x_{n,\omega}$. To do this, one has to work "outward" from $x \in A_{k+1}(x_{n,\omega})$ to the boundary of $A_{k+1}(x_{n,\omega})$, making sure that the boxes of size $L_k$ to which one applies (3.3) are regular. After these preliminaries, let's start the proof properly speaking, which will consist of three steps. Firstly, let $I$, $\Gamma > 0$, $\gamma_0$ and $\varepsilon$ be as in the theorem. Pick $\gamma \in\, ]\gamma_0, \Gamma[$.

First step. Let $p > \nu$, $\alpha \in\, ]1, 2 - 2\nu/(p + 2\nu)[$ and $b > 1$ be given. With the $L_k$, $k \ge 0$, defined in Lemma 3.4, consider
$$F_k = \bigcup_{|x_o| \le (L_{k+1})^{1/\varepsilon}} E_k(x_o).$$

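Lemma 3.4 then implies that for some constant $C(\varepsilon, \nu, b)$,
$$\mathbb{P}(F_k) \le C(\varepsilon, \nu, b)\, L_k^{-p + \nu\alpha(1 + 1/\varepsilon)}.$$
Hence, since $p$ can be chosen larger than $2\nu(1 + 1/\varepsilon)$, one has $\sum_{k=0}^{\infty} \mathbb{P}(F_k) < \infty$. The Borel-Cantelli lemma then implies that:
$$\mathbb{P}\left( \lim_{m \to \infty} \bigcup_{k \ge m} F_k \right) = 0,$$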

so that the set Ω_1 = {ω ∈ Ω | ∃ k̃_1 = k̃_1(ω, ε, p, γ) such that ∀k ≥ k̃_1, ω ∉ F_k} has full measure. This ends the probabilistic part of the proof. Note that the choice of ε puts a lower bound on p. This in turn forces the disorder to be high via Lemma 3.4.

Second step. Now pick an ω in Ω_0 ∩ Ω_1, which will be kept fixed throughout the rest of the proof. Let E_{n,ω} ∈ I, let ϕ_{n,ω} be its eigenfunction, and let x_{n,ω} be a point where |ϕ_{n,ω}(x)| is maximal. Note that such a point exists since ω ∈ Ω_0 and therefore ϕ_{n,ω} ∈ ℓ²(Z^ν). Let

\[ k_1 = k_1(\omega, \varepsilon, p, \gamma, x_{n,\omega}) = \max\bigl(\tilde{k}_1, k_0(\varepsilon, x_{n,\omega})\bigr), \tag{3.6} \]

where for all y ∈ Z^ν, the integer k_0(ε, y) is defined as follows:

\[ k_0(\varepsilon, y) = \min\{k \ge 0 \text{ such that } |y|^{\varepsilon} < L_{k+1}\}. \tag{3.7} \]

Hence ∀k ≥ k_1, ω ∉ E_k(x_{n,ω}). Indeed, if k ≥ k_1, then ω ∉ F_k and |x_{n,ω}|^ε < L_{k+1}. This implies that ω ∉ E_k(x_{n,ω}), and consequently that ∀k ≥ k_1, ∀E ∈ I and ∀y ∈ A_{k+1}(x_{n,ω}), either Λ_{L_k}(x_{n,ω}) or Λ_{L_k}(y) is (γ, E)-regular. We can then apply Lemma 3.5 to conclude that there exists an integer k_2 = max(k̃_2, k_0(ε, x_{n,ω})), where k̃_2 depends on the same parameters as k̃_1 and not on n, so that ∀k ≥ k_2, ∀E_{n,ω} ∈ I, ∀y ∈ A_{k+1}(x_{n,ω}), Λ_{L_k}(y) is (γ, E_{n,ω})-regular.

Last step. We now finish the argument along the lines of the proof of Theorem 2.3 in [14]; ω is still fixed in Ω_0 ∩ Ω_1. For k ≥ k_2 and for x ∈ A_{k+1}(x_{n,ω}), Λ_{L_k}(x) being (γ, E_{n,ω})-regular, we can apply relation (3.3):

\[ |\varphi_{n,\omega}(x)| \le s_\nu L_k^{\nu-1} e^{-\gamma L_k/2}\, |\varphi_{n,\omega}(x')|, \]

where x′ ∈ ∂Λ_{L_k}^+(x) is chosen so that |ϕ_{n,ω}(x′)| = sup{|ϕ_{n,ω}(y)|, y ∈ ∂Λ_{L_k}^+(x)}. As long as x′ is still in A_{k+1}(x_{n,ω}) we can use (3.3) again, so that we can repeat this step at least d(x, ∂A_{k+1}(x_{n,ω}))/(L_k/2 + 1) times. Hence for k ≥ k_2 and x ∈ A_{k+1}(x_{n,ω}),

\[ |\varphi_{n,\omega}(x)| \le \Bigl( s_\nu L_k^{\nu-1} e^{-\gamma L_k/2} \Bigr)^{\frac{d(x,\, \partial A_{k+1}(x_{n,\omega}))}{L_k/2 + 1}}. \]

Comparing this to (3.4), we see that in order to get a useful estimate we need a lower bound on d(x, ∂A_{k+1}(x_{n,ω})). For that purpose we cover Z^ν with new annular regions Ã_{k+1}(x_{n,ω}) defined as follows. Pick ρ ∈ ]0, 1[ so that ργ > γ_0 and choose the integer b introduced at the beginning of the proof so that b > (1 + ρ)/(1 − ρ). Then set

\[ \tilde{A}_{k+1}(x_{n,\omega}) = \Lambda_{[2bL_{k+1}/(1+\rho)]}(x_{n,\omega}) \setminus \Lambda_{2L_k/(1-\rho)}(x_{n,\omega}) \subset A_{k+1}(x_{n,\omega}). \]

Note that if x ∈ Ã_{k+1}(x_{n,ω}) then d(x, ∂A_{k+1}(x_{n,ω})) ≥ ρ|x − x_{n,ω}|. Hence, repeating (3.3) ρ|x − x_{n,ω}|/(L_k/2 + 1) times, one has that for all k ≥ k_2 and for all x ∈ Ã_{k+1}(x_{n,ω}),

\[ |\varphi_{n,\omega}(x)| \le \Bigl( s_\nu L_k^{\nu-1} e^{-\gamma L_k/2} \Bigr)^{\frac{\rho |x - x_{n,\omega}|}{L_k/2 + 1}}, \]

or, applying Lemma 3.6, and choosing η ∈ ]0, 1[ such that γ_0 = ρηγ, we conclude that there exists an integer k_3 = max(k̃_3, k_0(ε, x_{n,ω})), where k̃_3 again does not depend on n, such that:

\[ \forall k \ge k_3 \text{ and } \forall x \in \tilde{A}_{k+1}(x_{n,\omega}), \quad |\varphi_{n,\omega}(x)| \le e^{-\gamma_0 |x - x_{n,\omega}|}. \tag{3.8} \]

But now note that for all x ∈ Z^ν, and provided |x − x_{n,ω}| > L_0/(1 − ρ), there exists a k so that x ∈ Ã_{k+1}(x_{n,ω}). This means that there exists an integer k = max(k̃_4, k_0(ε, x_{n,ω})), k̃_4 once again not depending on n, such that (3.8) holds for all x ∈ Z^ν satisfying |x − x_{n,ω}| > L_k. Hence, using that |ϕ_{n,ω}(x)| ≤ 1 for all x ∈ Z^ν:

\[ \forall x \in \mathbb{Z}^\nu, \quad |\varphi_{n,\omega}(x)| \le C(\omega, \varepsilon, \gamma_0)\, e^{\gamma_0 L_k} e^{-\gamma_0 |x - x_{n,\omega}|}. \tag{3.9} \]

So far, we have only proved that the eigenfunctions decay exponentially, but we are now in a position to control the n-dependence of the constant e^{γ_0 L_k} as follows. Note that the only n-dependence of k comes from k_0(ε, x_{n,ω}). Suppose sup{|x_{n,ω}|, E_{n,ω} ∈ I} < ∞; then k can be chosen independently of n, so that we actually obtain a uniform localization (called ULE in [11]), and a fortiori condition (2.1) of Theorem 2.1. But Lemma 2.2 contradicts this first possibility. So, in fact, sup{|x_{n,ω}|, E_{n,ω} ∈ I} = ∞, and, for n sufficiently large, one has k = k_0(ε, x_{n,ω}), i.e., with (3.7), L_k ≤ |x_{n,ω}|^ε. Inserting this in (3.9) yields the announced result.
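The scale bookkeeping behind (3.7) can be made concrete with a small numerical sketch. The Python code below is an illustration only and not part of the original argument: the function names and the sample values of L_0, α and ε are assumptions, and logarithms are used because L_k grows too fast for floating-point arithmetic.

```python
import math


def log_scales(L0, alpha, n):
    """Return [log L_0, ..., log L_n] for the scales L_{k+1} = L_k ** alpha."""
    return [alpha ** k * math.log(L0) for k in range(n + 1)]


def k0(eps, y_norm, L0, alpha):
    """Smallest k >= 0 with y_norm ** eps < L_{k+1}, cf. (3.7).

    Assumes L0 > 1 and alpha > 1, so that the scales increase without bound.
    """
    if y_norm <= 1.0:
        return 0  # |y|^eps <= 1 < L_1 because L_0 > 1
    target = eps * math.log(y_norm)   # log of |y|^eps
    log_L = alpha * math.log(L0)      # log L_1
    k = 0
    while target >= log_L:            # |y|^eps >= L_{k+1}: move to the next scale
        k += 1
        log_L *= alpha
    return k
```

With the illustrative values L_0 = 10, α = 1.5 and ε = 1/2, a center with |y| = 10^4 gives k_0 = 1, so that L_1 ≤ |y|^ε < L_2. This is the inequality L_k ≤ |x_{n,ω}|^ε used at the end of the proof.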

4. The Continuous Case

In this section, our goal is to obtain the analog of Theorem 3.1 and Corollary 3.2 for continuous random Schrödinger operators. The result is stated in Theorem 4.2 below. In Sect. 5, we will present some models where the hypotheses of this theorem are satisfied. We consider random Schrödinger operators on L²(R^ν) of the following type (ν ≥ 1):

\[ H_\omega = H_0 + \sum_{i \in \mathbb{Z}^\nu} \lambda_i(\omega)\, u(x - i). \tag{4.1} \]

i) Here H_0 = (i∇ − A)² + V_per, where A is a vector potential of a constant magnetic field B = rot(A), and V_per a periodic potential.
ii) The variables λ_i(ω), i ∈ Z^ν, are independent and identically distributed, with common distribution µ.
iii) The function u(x) belongs to C_0²(R^ν), with supp u ⊂ [−R, R]^ν.



To state the hypotheses, we need to recall some notations and simple facts. We introduce |x| = max{|x_i|, i = 1, ..., ν}, and denote by Λ_L(x) the cube Λ_L(x) = {y ∈ R^ν | |y − x| < L/2}. Moreover, δ > 0 being fixed (independently of L), Λ̃_L(x) is the subset

\[ \tilde{\Lambda}_L(x) = \{ y \in \Lambda_L(x) \text{ such that } L/2 - \delta < |x - y| < L/2 \}; \]

χ̃_{L,x} will denote its characteristic function. We denote furthermore by χ_{L,x} a function in C²(R^ν) with support in Λ_L(x) and satisfying 0 ≤ χ_{L,x} ≤ 1, χ_{L,x} ≡ 1 on the subcube Λ_L(x)\Λ̃_L(x), so that ∇χ_{L,x} lives in Λ̃_L(x). Note that we will often drop the ω- or x-dependence of the objects introduced in order to lighten the notation. We furthermore define local Hamiltonians H_{Λ_L(x),ω} as follows. When A = 0 and V_per = 0, H_{Λ_L(x),ω} is the restriction of the operator H_ω to the cube Λ_L(x) with Dirichlet boundary conditions. When A ≠ 0 or V_per ≠ 0,

\[ H_{\Lambda_L(x),\omega} = H_0 + \sum_{i \in \Lambda_L \cap \mathbb{Z}^\nu} \lambda_i(\omega)\, u(x - i). \]

We denote by W_{L,x} the first order differential operator W_{L,x} ≡ [H_0, χ_{L,x}] and write R_{Λ_L(x)}(E) for the resolvent of H_{Λ_L(x),ω}. We then always have the geometric resolvent equation: if Λ_l ⊂ Λ_L ⊂ R^ν and if E ∉ σ(H_{Λ_L}) and E ∉ σ(H_{Λ_l}) then

\[ \chi_l R_{\Lambda_L}(E) = R_{\Lambda_l}(E)\chi_l + R_{\Lambda_l}(E) W_l R_{\Lambda_L}(E). \tag{4.2} \]

Definition 4.1. Let γ > 0 and an energy E ∈ R be given. A cube Λ_L(x) is said to be (γ, E)-regular if E ∉ σ(H_{Λ_L(x)}) and if:

\[ \| \chi_{4\delta,x} R_{\Lambda_L(x)}(E) W_{L,x} \| \le e^{-\gamma L/2}. \]

Otherwise Λ_L(x) will be called (γ, E)-singular.

We now state the result. Given a compact interval I, and reals γ_0 > 0, p > ν, L_0, L̃ > 1, we introduce

Hypothesis (H1)(γ_0, I, p, L_0):

\[ \mathbb{P}\bigl( \forall E \in I,\ \Lambda_{L_0} \text{ is } (\gamma_0, E)\text{-regular} \bigr) > 1 - \frac{1}{L_0^{p}}. \]

Hypothesis (H2)(I, L̃) (Wegner): There exists C_W so that for all Ĩ ⊂ I_1 ≡ {E | d(E, I) < 1} and L > L̃,

\[ \mathbb{E}\bigl( \operatorname{tr}( E_{\Lambda_L(0)}(\tilde{I}) ) \bigr) < C_W\, |\tilde{I}|\, |\Lambda_L|. \]

Note that, using Chebyshev's inequality and [H2](I, L̃), one also has, for L > L̃, 0 < η < 1 and E ∈ I,

\[ \mathbb{P}\bigl( d(E, \sigma(H_{\Lambda_L(0),\omega})) < \eta \bigr) < C_W\, |\Lambda_L|\, \eta. \tag{4.3} \]

This is the so-called "Wegner Estimate". We will need both [H2] and (4.3) for the proof of Proposition 4.3.


Theorem 4.2. Let ε > 0. Suppose that for some interval I and reals γ_0 > 0, p > 2ν(1 + 1/ε), the hypotheses [H1](γ_0, I, p, L_0) and [H2](I, L̃) hold for L_0 > L̃ large enough. Then with probability one there exist points x_{n,ω}, associated to the eigenfunctions ϕ_{n,ω} with energies E_{n,ω} ∈ I, so that: ∀γ ∈ ]0, γ_0[ and for some constant C_ω = C(ω, ε, γ, γ_0, I, L_0), one has, for all x ∈ R^ν,

\[ |\varphi_{n,\omega}(x)| \le C_\omega\, e^{\gamma |x_{n,\omega}|^{\varepsilon}}\, e^{-\gamma |x - x_{n,\omega}|}. \tag{4.4} \]

Moreover, if q > 0 and ψ ∈ L²(R^ν) decays exponentially with mass θ > 0, then, with probability 1, there exists a constant C_{ψ,ω} such that

\[ \bigl\| \, |X|^{q/2} P_I(H_\omega) e^{-iH_\omega t} \psi \bigr\|^2 \le C_{\psi,\omega}. \]

Analysing the proofs of the previous two sections, one sees that the only missing ingredient for the proof of Theorem 4.2 is an analog of Lemma 3.4, stated as Proposition 4.3 below. Indeed, the arguments of Sect. 3 are readily transcribed to the continuous case, provided one makes the following adaptations. First, one replaces Eq. (3.3) by the following equality: if ϕ ∈ L²(R^ν) satisfies Hϕ = Eϕ for some E, then for Λ_L(x) ⊂ R^ν so that E ∉ σ(H_{Λ_L(x)}):

\[ \chi_{4\delta,x}\, \varphi = \chi_{4\delta,x} R_{\Lambda_L(x)}(E) W_{L,x}\, \varphi. \]

Secondly, one defines x∗(ϕ), the analog of x∗ in Lemma 3.5, as follows. If ϕ belongs to L²(R^ν) and Hϕ = Eϕ, then

\[ \sup_{y \in 4\delta\mathbb{Z}^\nu} \int_{\Lambda_{4\delta}(y)} |\varphi(x)|^2\, dx = \int_{\Lambda_{4\delta}(x^*(\varphi))} |\varphi(x)|^2\, dx. \]

So, writing x_{n,ω} ≡ x∗(ϕ_{n,ω}), and redefining the annular region A_{k+1}(x_{n,ω}) as Λ_{2bL_{k+1}}(x_{n,ω})\Λ_{2L_k+2R}(x_{n,ω}), one obtains the bound written in (4.4), but for ‖χ_{4δ,x} ϕ_{n,ω}‖, x ∈ 4δZ^ν, rather than for |ϕ_{n,ω}(x)|. Then, to get the pointwise estimate (4.4), apply Theorem 2.4 of [9], or decompose ‖e^{−γ|x_{n,ω}|^ε} e^{γ′|x−x_{n,ω}|} ϕ_{n,ω}‖²_{L²}, with γ′ sufficiently close to γ, on boxes of size 4δ centered on 4δZ^ν, and apply Theorem IX.26 of [25]. It therefore remains to state and prove the analog of Lemma 3.4. Following [14], let us denote by R(γ, L, x, y) the set

\[ R(\gamma, L, x, y) \equiv \{ \omega \in \Omega \mid \forall E \in I,\ \Lambda_L(x) \text{ or } \Lambda_L(y) \text{ is } (\gamma, E)\text{-regular} \}. \]

As in the discrete case, for all α ∈ ]1, 2[ and L_0 > 1 we define the sequence (L_k)_{k∈N} by L_{k+1} = L_k^α.

Proposition 4.3. For any γ ∈ ]0, γ_0[, p > ν and α ∈ ]1, 2 − 2ν/(p + 2ν)[ there exists L∗ = L∗(γ, I, α) such that if [H1](γ_0, I, p, L_0) and [H2](I, L̃) hold for L_0 > L∗, L_0 > L̃, then for all k ≥ 0:

\[ |x - y| > L_k + 2R \implies \mathbb{P}\bigl( R(\gamma, L_k, x, y) \bigr) > 1 - \frac{1}{L_k^{2p}}. \]

The proof of Proposition 4.3 follows upon adapting the arguments of the proof of Theorem 2.2 in [14] to the continuum. Various authors [6, 21] have written up versions of the multi-scale analysis for continuous Schrödinger operators, but, to our knowledge, the version we need is not available in the literature. Nevertheless, nobody seems to doubt that any such argument can be carried over from the discrete to the continuous case. Since multi-scale arguments are, in addition, painful, we have chosen to put the proof of Proposition 4.3 in the appendix, while making an effort to give a clear, complete and relatively simple argument.
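The parameter constraints in Theorem 4.2 and Proposition 4.3 can be checked numerically. The following Python sketch is only an illustration under assumed sample values (ν = 2, ε = 1, p = 9, L_0 = 10); the function names are hypothetical. It evaluates the upper end 2 − 2ν/(p + 2ν) of the admissible range for α and the logarithm of the failure bound 1/L_k^{2p} along the scales L_{k+1} = L_k^α.

```python
import math


def alpha_upper_bound(p, nu):
    """Upper end of the admissible interval ]1, 2 - 2*nu/(p + 2*nu)[ for alpha."""
    return 2.0 - 2.0 * nu / (p + 2.0 * nu)


def log_failure_bounds(L0, alpha, p, n):
    """Return log(1 / L_k**(2p)) for k = 0..n, where L_{k+1} = L_k ** alpha.

    Logarithms avoid floating-point underflow, since L_k grows very quickly.
    """
    return [-2.0 * p * alpha ** k * math.log(L0) for k in range(n + 1)]


# Sample values (assumptions for illustration): nu = 2, eps = 1 requires p > 8.
nu, eps, p = 2, 1.0, 9.0
assert p > 2 * nu * (1 + 1 / eps)
bound = alpha_upper_bound(p, nu)        # equals 2 - 4/13
logs = log_failure_bounds(10.0, 1.5, p, 3)
```

Since p > 0, the bound always lies strictly between 1 and 2, so the admissible interval for α is never empty, and the failure bounds decrease super-exponentially in k.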



5. Applications

We briefly indicate two applications of Theorem 4.2: to an Anderson model on R^ν [6] and to random Schrödinger operators with a magnetic field.

The Anderson tight-binding model. Here the free Hamiltonian H_0 of Eq. (4.1) is −Δ. We suppose, following [6], that u(x) > χ_{3/2}(x), where χ_{3/2} is the characteristic function of the cube Λ_{3/2}(0). Putting together Proposition 4.5 and Theorem 5.1 of [6], one has immediately from Theorem 4.2:

Theorem 5.1. Suppose that µ has an L^∞ density g(λ) with support [0, λ_max] and disorder δ_0 = ‖g‖_∞^{−1} > 0. Then, for energy E_A > 0 fixed and disorder δ_0 high enough, or for disorder δ_0 > 0 fixed and energy E_A low enough, the conclusions of Theorem 4.2 hold on [0, E_A].

We notice that the result holds as well for the "breather" model, where the potential is given by the closely related formula V_ω(x) = Σ_{i∈Z^ν} u(λ_i(ω)(x − i)) [8].

The Landau Hamiltonian. Here the Hamiltonian has the general form described in Eq. (4.1), with A ≠ 0 and V_per = 0. Although the result is still valid in arbitrary dimension under further assumptions (see [2]), we prefer to state the application in the well-known two-dimensional version [3, 7, 27]. In that case, the vector potential A is given by

\[ A = \frac{B}{2}\,(x_2, -x_1), \]

with B > 0. Recall that the spectrum of the free Landau Hamiltonian H_0 consists of a sequence of eigenvalues E_n(B) = (2n + 1)B, n ∈ N. We suppose that u > 0, supp u ⊂ B(0, 1/√2), and that there exist C_0 and r_0 > 0 such that u|_{B(0,r_0)} > C_0. We suppose that the common measure µ (of the λ_i(ω)) has a bounded density function g ∈ C_0²(R), g being even and positive for almost every λ ∈ supp g. Under those assumptions it is well-known that M_0 = sup{|V_ω(x)|, x, ω} < +∞. Let us define the following bands: I_0(B) = [−M_0, B − ε_0(B)], I_{n+1}(B) = [E_n(B) + ε_n(B), E_{n+1}(B) − ε_n(B)] with some ε_n(B) > 0. It follows from [7] and Theorem 4.2:

Theorem 5.2. Let V be as described above. Then for B high enough, there exist some ε_n(B) = O(B^{−1}) such that the conclusions of Theorem 4.2 hold on each interval I_n(B), n ∈ N.

A similar result holds for the model treated by Wang [27], where u can be negative and its support is included in B(0, r), 0 < r < 1.



A. Appendix

We turn to the proof of Proposition 4.3. Let us point out that our definition of a (γ, E)-regular box (Definition 4.1) differs slightly from the one in [6]. This difference will allow us to free ourselves from most of the difficulties due to the use of an auxiliary lattice in [6] and [21]. We need the following two concepts:

Definition A.1. Let r > 0. Two boxes Λ_1 and Λ_2 will be called r non-overlapping iff d(Λ_1, Λ_2) > 2r.

Note that, if Λ_1 and Λ_2 are R non-overlapping, then, since supp u ⊂ [−R, R]^ν, two events depending respectively on the λ_i with i in Λ_1 and in Λ_2 are necessarily independent.

Definition A.2. Let β ∈ ]0, 1[ be given. A box Λ_L will be called non resonant at energy E (we'll write E-NR) if

\[ \| R_{\Lambda_L}(E) \| \le 2 e^{L^{\beta}}. \]

This means, in other words, that d(E, σ(H_{Λ_L})) ≥ (1/2)e^{−L^β}. Remark that the commutator W_L does not appear in Definition A.2 as it did in Definition 4.1. In fact one can replace W_L with the characteristic function χ̃_{L,x} defined above (see [6] for the Anderson case and [7], Lemma 5.1 and lines (5.27)-(5.29), if A ≠ 0) as follows. There exists a constant C(δ, I) with:

\[ \| \chi_{4\delta,x} R_{\Lambda_L(x)}(E) W_{L,x} \| \le C(\delta, I)\, \| \chi_{4\delta,x} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x} \|. \tag{A.1} \]

Remark then that this last bound tells us that

\[ \Lambda_L \text{ is } E\text{-NR} \implies \| \chi_{4\delta} R_{\Lambda_L}(E) W_L \| \le 2 C(\delta, I)\, e^{L^{\beta}}. \tag{A.2} \]

An essential ingredient of the proof of Proposition 4.3 is the following deterministic lemma.

Lemma A.3. Let L = l^α with α ∈ ]1, 2[ and x ∈ R^ν, l > 12R. Denote by s_ν the number of faces of a cube in dimension ν. Assume that for some

\[ \gamma > \frac{27\, l^{\beta} + 4(\nu - 1)\ln(l/\delta) + 4\ln(2 s_\nu)}{l}, \]

with δ and β defined previously, and for some energy E,
i) Λ_L(x) is E-NR;
ii) each box of size 4j(l + R), j = 1, 2, 3, centered in x + lZ^ν and contained in Λ_L(x) is E-NR;
iii) among all the (γ, E)-singular boxes of size l contained in Λ_L(x), there are no more than three that are two by two R non-overlapping.
Then Λ_L(x) is (γ′, E)-regular with

\[ \gamma' = \gamma \Bigl( 1 - \frac{27}{l^{\alpha - 1}} - \frac{2}{l^{\alpha(1-\beta)}} \Bigr) - \frac{\ln\bigl( 2 s_\nu C (l/\delta)^{\nu-1} \bigr)}{l}, \tag{A.3} \]

with C = C(δ, I) defined in (A.1).
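To get a feeling for the size of the lower bound on γ required in Lemma A.3, one can evaluate it numerically. The Python sketch below is an illustration only. It assumes that the threshold reads (27 l^β + 4(ν − 1) ln(l/δ) + 4 ln(2 s_ν))/l and that s_ν = 2ν (the number of faces of a cube in dimension ν); the function name and the sample parameters are hypothetical.

```python
import math


def gamma_threshold(l, beta, nu, delta):
    """Assumed lower bound on gamma in Lemma A.3:
    (27 * l**beta + 4*(nu - 1)*ln(l/delta) + 4*ln(2*s_nu)) / l, with s_nu = 2*nu.
    """
    s_nu = 2 * nu  # a cube in dimension nu has 2*nu faces
    numerator = (27.0 * l ** beta
                 + 4.0 * (nu - 1) * math.log(l / delta)
                 + 4.0 * math.log(2 * s_nu))
    return numerator / l


# Sample evaluation (illustrative values): nu = 2, beta = 1/2, delta = 0.1.
small_scale = gamma_threshold(1e3, 0.5, 2, 0.1)
large_scale = gamma_threshold(1e6, 0.5, 2, 0.1)
```

Because β < 1, the threshold behaves like 27 l^{β−1} for large l and tends to zero. This is why a fixed γ > 0 eventually satisfies the condition once the initial scale is taken large enough.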



In [6], Combes and Hislop have proved a simpler version of this result. In fact, they have adapted to the continuous case a simplified version of [14] which is contained in Chapter IX of [5] (see also [15]). But, as in the discrete case, this simplified version does not seem to suffice to obtain the results of this paper, since we need to obtain regular boxes at any size with good probability, uniformly in a compact interval of energy, and no longer at some fixed energy E. So we turn to [14] and adapt it to the continuous case.

Proof of Lemma A.3. The aim is to bound ‖χ_{4δ,x} R_{Λ_L(x)}(E) W_{L,x}‖. Using first inequality (A.1), we are reduced to controlling ‖χ_{4δ,x} R_{Λ_L(x)}(E) χ̃_{L,x}‖. This is achieved in (A.12) below. We recall that δ > 0 has been chosen small, so, without loss of generality, one can suppose l > 3δ. In order to achieve our goal, we will recursively construct inside Λ_L(x) a chain of n boxes Λ_l(v_k), k = 0, ..., n − 1, being most of the time (γ, E)-regular, and starting at v_0 ≡ x. At each step of this process, we will use the geometric resolvent equation as follows. Let l′ > 3δ and consider any box Λ_{l′}(z) ⊂ Λ_L(x) with d(Λ_{l′}(z), Λ̃_L(x)) > 0. For E ∉ σ(H_{Λ_{l′}(z)}) ∪ σ(H_{Λ_L(x)}), the resolvent identity (4.2) gives:

\[ \chi_{4\delta,z} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x} = \chi_{4\delta,z} R_{\Lambda_{l'}(z)}(E) W_{l',z} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x}. \]

The support of W_{l′,z} can be covered by a family of boxes Λ_{4δ}(v) ⊂ Λ_L(x), indexed by points v that satisfy |v − z| = l′/2 and so that the sum over all the corresponding characteristic functions χ_{4δ,v} is equal to 1 on supp W_{l′,z}. (Note that s_ν(1 + l′/3δ)^{ν−1} < s_ν(l′/δ)^{ν−1} such boxes suffice.) Clearly, there exists one of those v for which

\[ \| \chi_{4\delta,z} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x} \| \le s_\nu (l'/\delta)^{\nu-1}\, \| \chi_{4\delta,z} R_{\Lambda_{l'}(z)}(E) W_{l',z} \|\, \| \chi_{4\delta,v} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x} \|. \tag{A.4} \]

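Suppose now in addition that Λ_{l′}(z) is (γ, E)-regular. Then we immediately have:

\[ \| \chi_{4\delta,z} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x} \| \le s_\nu (l'/\delta)^{\nu-1} e^{-\gamma l'/2}\, \| \chi_{4\delta,v} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x} \|. \tag{A.5} \]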

Apply now this argument to Λ_l(x), and set v_0 = x, v_1 = v. Repeat the process as long as Λ_l(v_k) is (γ, E)-regular and stays away from the boundary of Λ_L(x). Clearly, there exists some k∗ ≥ 0 so that for all 0 ≤ k < k∗, Λ_l(v_k) is (γ, E)-regular and Λ_l(v_k) ⊂ Λ_L(x), d(Λ_l(v_k), Λ̃_L(x)) > 0, whereas one of these conditions fails for Λ_l(v_{k∗}). As a result, if k∗ > 0, we have

\[ \| \chi_{4\delta,x} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x} \| \le \bigl( s_\nu (l/\delta)^{\nu-1} \bigr)^{k^*} e^{-\gamma k^* l/2}\, \| \chi_{4\delta,v_{k^*}} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x} \|. \tag{A.6} \]

If k∗ = 0, this equation holds trivially. The important point here is that we gained a factor e^{−γk∗l/2}: if there were no (γ, E)-singular boxes Λ_l inside Λ_L(x), this would end the proof. Indeed, in that case, the process could only end when Λ_l(v_{k∗}) gets too close to the boundary of Λ_L(x), implying k∗ ≥ (L/l), so that (A.6) immediately yields the result upon using hypothesis (i). Of course, there may be (γ, E)-singular boxes in Λ_L(x) and we now use hypothesis (iii) of the lemma to control the case in which the above process stops because Λ_l(v_{k∗}) is (γ, E)-singular and v_{k∗} is at a distance greater than 12(l + R) from the boundary of Λ_L(x). Using hypothesis (iii) and drawing a few pictures one easily convinces oneself that one can pack all the singular boxes of size l in t ≤ 3 slightly bigger and disjoint boxes Λ_{l_i} ⊂ Λ_L(x), centered in x + lZ^ν, and so that each box Λ_l(z), where z belongs to the edge of one of those Λ_{l_i}, is (γ, E)-regular. More precisely, the l_i take one of the values 4j(l + R), 1 ≤ j ≤ 3, they satisfy Σ_{i=1}^t l_i ≤ 12(l + R) ≡ l′, and the two following facts are simultaneously true:

\[ (1)\quad \Bigl( z \in \Lambda_L(x) \setminus \bigcup_{i=1}^{t} \Lambda_{l_i} \ \text{ and } \ d(z, \partial\Lambda_L(x)) \ge l/2 \Bigr) \implies \bigl( \Lambda_l(z) \text{ is } (\gamma, E)\text{-regular} \bigr). \tag{A.7} \]

\[ (2)\quad \text{If } z \text{ belongs to the edge of one of the } \Lambda_{l_i}, \text{ then } z \notin \bigcup_{i=1}^{t} \Lambda_{l_i}. \tag{A.8} \]

So, if Λ_l(v_{k∗}) is (γ, E)-singular, there exists i ∈ {1, .., t} so that Λ_l(v_{k∗}) ⊂ Λ_{l_i}. Define a new family of points v on the boundary of Λ_{l_i}, such that the boxes Λ_{4δ}(v) cover supp W_{l_i}. Then use the equivalent of (A.4) with Λ_{l′}(z) = Λ_{l_i} and z = v_{k∗}. This produces some v_{k∗+1} that belongs to the edge of Λ_{l_i}, and consequently (A.7)-(A.8) imply that Λ_l(v_{k∗+1}) is (γ, E)-regular, provided d(v_{k∗+1}, ∂Λ_L(x)) > l/2. By checking how v_{k∗+1} is positioned with respect to v_{k∗} one sees that the latter condition is satisfied because v_{k∗} is at least at a distance 12(l + R) from ∂Λ_L(x). Use now Hypothesis (ii) and apply once again (A.4) to Λ_l(v_{k∗+1}), to obtain that for some v_{k∗+2} on the edge of Λ_l(v_{k∗+1}), with |v_{k∗} − v_{k∗+2}| ≤ l_i:

\[ \| \chi_{4\delta,v_{k^*}} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x} \| \le 2 s_\nu^2 \Bigl( \frac{l'\, l}{\delta^2} \Bigr)^{\nu-1} e^{-\gamma l/2 + l_i^{\beta}}\, \| \chi_{4\delta,v_{k^*+2}} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x} \|. \tag{A.9} \]

Using the preliminary condition on γ, this leads to

\[ \| \chi_{4\delta,v_{k^*}} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x} \| \le \| \chi_{4\delta,v_{k^*+2}} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x} \|. \tag{A.10} \]

This is the way in which we get past a singular box Λ_l(v_{k∗}) far from the edge of Λ_L(x). We have now completely described the recursive construction of the v_k and it is clear that the process grinds to a standstill only when, for some n, Λ_l(v_n) is too close to ∂Λ_L(x). From (A.10) one sees that, when meeting a singular box, we do not gain a factor exp(−γl/2), so that we have to assure ourselves this does not happen too often before the process ends. We therefore need to count how many of the boxes Λ_l(v_k), 0 ≤ k ≤ n, are regular. Since |v_{k+1} − v_k| = l/2 in that case and |v_{k+2} − v_k| ≤ l_i if not, it is not hard to see that the process cannot stop before n = n∗ = n_1∗ + 2t, where

\[ n_1^* = \Bigl[ \frac{L/2 - \sum_{i=1}^{t} l_i}{l/2} \Bigr], \]

or

\[ [L/l] - 27 \le n_1^* \le L/l. \tag{A.11} \]

Hence, using (i) of the lemma, one has

\[ \| \chi_{4\delta,x} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x} \| \le \Bigl( s_\nu (l/\delta)^{\nu-1} e^{-\gamma l/2} \Bigr)^{n_1^*}\, 2 e^{L^{\beta}}. \tag{A.12} \]

Putting together relations (A.1) and (A.12) and using (A.11) as well as the definition of γ′ stated in the lemma leads to the desired result.

Proof of Proposition 4.3. Take α ∈ ]1, 2 − 2ν/(p + 2ν)[ and γ ∈ ]0, γ_0[. Let L ≡ L_{k+1} = L_k^α, and use Eq. (A.3) to produce a sequence of exponents γ_k ∈ ]0, γ_0]. It will be enough to show that



a) ∀k ≥ 0, γ ≤ γk ≤ γ0 ; b) ∀k ≥ 0 : |x − y| > Lk+1 + 2R =⇒ P (R(γk , Lk+1 , x, y)) > 1 − 1/L2p k+1 . To prove (a), choose L0 > 0: the sequence (γk )k≥0 produced by repeatedly using (A.3) decreases, so for all k ≥ 0, γk+1 ≤ γk ≤ γ0 . Then, using Eq. (A.3), it is clear there exists L∗ = L∗ (γ, γ0 , β, α, ν) so that, if L0 > L∗ , the sequence (γk )k≥0 satisfies 0
Lk+1 + 2R =⇒ ∀ E ∈ I, the hypotheses of Lemma A.3 with γ = γk P > 1 − 1/L2p k+1 . (A.13) and L = Lk+1 are satisfied for either point xn,ω or y Firstly, since for all k ≥ 0, γk ≥ γ, provided L0 is large enough, one has: ∀k ≥ 0, γk >

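(27 L_k^β + 4(ν − 1) ln(L_k/δ) + 4 ln(2s_ν))/L_k.

Let us now define I_l = {E ∈ R, d(E, I) ≤ e^{−l^β}/2} and σ′(H_{Λ_l}) = σ(H_{Λ_l}) ∩ I_l. It is easy to estimate the probability that the distance between the respective spectra of two R non-overlapping boxes Λ_{l_1} and Λ_{l_2}, l_1, l_2 > L̃, is less than η, 0 < η < 1. Using first (4.3) and then Hypothesis [H2], one has, with some abuse of notation:

\[ \begin{aligned} \mathbb{P}\bigl( d(\sigma'(H_{\Lambda_{l_1}}), \sigma'(H_{\Lambda_{l_2}})) < \eta \bigr) &\le \int \sum_{E \in \sigma'(H_{\Lambda_{l_2}})} \mathbb{P}_{\Lambda_{l_1}}\bigl( d(\sigma'(H_{\Lambda_{l_1}}), E) < \eta \bigr)\, d\omega_2 \\ &\le C_W\, |\Lambda_{l_1}|\, \eta\ \mathbb{E}\bigl( \operatorname{tr}( E_{\Lambda_{l_2}}(I_{l_2}) ) \bigr) \\ &\le C_W^2\, (|I| + 1)\, |\Lambda_{l_1}|\, |\Lambda_{l_2}|\, \eta = C_{W,I}\, |\Lambda_{l_1}|\, |\Lambda_{l_2}|\, \eta. \end{aligned} \tag{A.14} \]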

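Hence, for all k, and writing for convenience L ≡ L_{k+1} and l = L_k: if |x − y| > L + 2R, it follows from (A.14) that

\[ \begin{aligned} \mathbb{P}\bigl( &\exists\, u \in (x + l\mathbb{Z}^\nu) \cap \Lambda_L(x),\ v \in (y + l\mathbb{Z}^\nu) \cap \Lambda_L(y) \text{ and } l_1, l_2 = L \text{ or } 4j(l + R),\ j = 1, .., 3, \\ &\text{with } \Lambda_{l_1}(u) \subset \Lambda_L(x) \text{ and } \Lambda_{l_2}(v) \subset \Lambda_L(y), \text{ with } d(\sigma'(H_{\Lambda_{l_1}}), \sigma'(H_{\Lambda_{l_2}})) < \eta \bigr) \le C_{W,I}\, (L/l)^{2\nu}\, |\Lambda_L|^2\, \eta. \end{aligned} \tag{A.15} \]

But consider this elementary exercise in logic: let A_i and B_j, i and j = 1, .., J, be 2J intervals; then

\[ \begin{aligned} \forall\, i, j = 1, ..., J,\ d(A_i, B_j) > \eta &\iff \forall E \in \mathbb{R},\ \forall\, i, j = 1, ..., J,\ \bigl( d(E, A_i) > \eta/2 \text{ or } d(E, B_j) > \eta/2 \bigr) \\ &\iff \forall E \in \mathbb{R},\ \text{either } \bigl( \forall\, i = 1, ..., J,\ d(E, A_i) > \eta/2 \bigr) \text{ or } \bigl( \forall\, j = 1, ..., J,\ d(E, B_j) > \eta/2 \bigr). \end{aligned} \]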
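The exercise in logic can be checked by brute force on a concrete family of intervals. The Python sketch below is an illustration only: the helper names and the sample intervals are assumptions, and intervals are taken closed and represented as pairs (a, b) with a ≤ b.

```python
def dist_point_interval(E, I):
    """Distance from the point E to the closed interval I = (a, b)."""
    a, b = I
    return max(a - E, E - b, 0.0)


def dist_intervals(I, J):
    """Distance between the closed intervals I = (a, b) and J = (c, d)."""
    (a, b), (c, d) = I, J
    return max(c - b, a - d, 0.0)


def separated(As, Bs, eta):
    """Left-hand side: every A_i is at distance > eta from every B_j."""
    return all(dist_intervals(A, B) > eta for A in As for B in Bs)


def one_family_far(E, As, Bs, eta):
    """Right-hand side at a fixed E: E is eta/2-far from all A_i or from all B_j."""
    far_from_A = all(dist_point_interval(E, A) > eta / 2 for A in As)
    far_from_B = all(dist_point_interval(E, B) > eta / 2 for B in Bs)
    return far_from_A or far_from_B


# Sample families (illustrative): the closest pair is (2, 3) and (4, 5), at distance 1.
As = [(0.0, 1.0), (2.0, 3.0)]
Bs = [(4.0, 5.0), (6.5, 7.0)]
holds_everywhere = all(one_family_far(k * 0.01, As, Bs, 0.9) for k in range(-100, 801))
```

For η = 0.9 the families are separated and the right-hand side holds at every sampled energy. For η = 1.1 the separation fails, and the midpoint E = 3.5 of the gap lies within η/2 of both families, in line with the equivalence.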

This, combined with inequality (A.15) and η = e^{−L_k^β}, gives for all k ≥ 0, and if |x − y| > L_{k+1} + 2R, that

\[ \mathbb{P}\Bigl( \forall E \in I,\ \text{(i) and (ii) of Lemma A.3 with } \gamma = \gamma_k \text{ and } L = L_{k+1} \text{ are satisfied for either point } x \text{ or } y \Bigr) > 1 - \frac{1}{L_{k+1}^{2p+1}}. \tag{A.16} \]

Let us finish the proof: for L_0 large enough, Hypothesis [H1](γ, I, p, L_0) gives (A.13) at rank 0. Suppose it is true at rank k: points (i) and (ii) of Lemma A.3, with γ = γ_k and L = L_{k+1}, are satisfied for either point x or y, with the probability estimated in (A.16). Now,

\[ \begin{aligned} \mathbb{P}(\text{for any } E \in I,\ \text{(iii) of Lemma A.3 holds}) &= 1 - \mathbb{P}\bigl( \exists E \in I \text{ s.t. there are at least 4 } R \text{ non-overlapping } (\gamma_k, E)\text{-singular boxes } \Lambda_{L_k} \text{ contained in } \Lambda_{L_{k+1}}(x) \bigr) \\ &\ge 1 - \mathbb{P}\bigl( \exists E \in I \text{ s.t. there are at least 2 } R \text{ non-overlapping } (\gamma_k, E)\text{-singular boxes } \Lambda_{L_k} \text{ contained in } \Lambda_{L_{k+1}}(x) \bigr)^2 \\ &\ge 1 - \Bigl( \frac{(L_{k+1}/L_k + 1)^{2\nu}}{L_k^{2p}} \Bigr)^2, \end{aligned} \tag{A.17} \]

where we obtained the last inequality using the recurrence hypothesis. Hence, since α < 2 − 2ν/(p + 2ν), combining (A.16), (A.17), there exists a constant L∗ = L∗(p, γ, γ_0, ν) such that if L_0 > L∗ and |x − y| > L_{k+1} + 2R:

\[ \mathbb{P}\bigl( R(\gamma_k, L_{k+1}, x, y) \bigr) > 1 - \frac{1}{L_{k+1}^{2p}}. \]

Use now that γ_k > γ, and Proposition 4.3 is proved.

References 1. Aizenman, M.: Localization at weak disorder: Some elementary bounds. Rev. Math. Phys. 6, 1163–1182 (1994) 2. Barbaroux, J.M.: Dynamique quantique des milieux désordonnés. Thèse de doctorat, Toulon (1996) 3. Barbaroux, J.M., Combes, J.M., Hislop, P.D.: Landau Hamiltonian with unbounded random potentials. Preprint (1997) 4. Carmona, R., Klein, A., Martinelli, F.: Anderson localization for Bernoulli and other singular potentials. Commun. Math. Phys. 108, 41–66 (1987) 5. Carmona, R., Lacroix, J.: Spectral theory of random Schrödinger operator. Basel–Boston: Birkhaüser, 1990 6. Combes, J.M., Hislop, P.D.: Localization for some continuous, random Hamiltonian in d-dimension. J. Funct. Anal. 124, 149–180 (1994) 7. Combes, J.M., Hislop, P.D.: Landau Hamiltonians with random potentials: Localization and the density of states. Commun. Math. Phys. 177, 603–629 (1996) 8. Combes, J.M., Hislop, P.D., Mourre, E.: Spectral Averaging, Perturbation of Singular Spectra, and Localization. Trans. Amer. Math. Soc. 348, 4883–4894 (1996) 9. Cycon, H.L., Froese, R.G., Kirsch, W., Simon, B.: Schrödinger Operators. Berlin–Heidelberg–New York: Springer-Verlag, 1987 10. De Bièvre, S., Germinet, F.: Dynamical Localization for the random dimer model. Preprint (1998) 11. Del Rio, R., Jitomirskaya, S., Last, Y., Simon, B.: Operators with singular continuous spectrum IV: Hausdorff dimensions, rank one pertubations and localization. J. d’Analyse Math. 69, 153–200 (1996)



12. Delyon, F., Levy, Y., Souillard, Y.: Anderson localization for multi-dimensional systems at large disorder or large energy. Commun. Math. Phys. 100, 463–470 (1985) 13. Dorlas, T.C., Macris, N., Pulé, J.V.: The Nature of the Spectrum for the Landau Hamiltonian with δ impurities. To appear in J. Stat. Phys. 14. von Dreifus, A., Klein, A.: A new proof of localization in the Anderson tight binding model. Commun. Math. Phys. 124, 285–299 (1989) 15. von Dreifus, H., Klein, A.: Localization for random Schrödinger operators with correlated potentials. Commun. Math. Phys. 140, 133–147 (1991) 16. Dunlap, D.H., Phillips, P.: Absence of localization in a random Dimer model. Phys. Rew. Lett. 61, 88 (1990) 17. Fröhlich, J., Spencer, T.: Absence of diffusion with Anderson tight binding model for large disorder or low energy. Commun. Math. Phys. 88, 151–184 (1983) 18. Fröhlich, J., Martinelli, F., Scoppola, E., Spencer,T.: Constructive proof of localization in the Anderson tight binding model. Commun. Math. Phys. 101, 21–46 (1985) 19. Germinet, F.: Dynamical Localization II with an Application to the Almost Mathieu Operator. Preprint (1997) 20. Jona-Lasinio, G., Martinelli, F., Scoppola, E.: Mutiple tunnelings in d-dimensions: a quantum particle in a hierarchical potential. Ann. Inst. Henri Poincaré, vol. 42, 73–108 (1985) 21. Klopp, F.: Localization for continuous Random Schrödinger Operators. Commun. Math. Phys. 167, 553–569 (1995) 22. Kunz, H.: Quantum Hall effect for electrons in random potential. Commun. Math. Phys. 112, 121–145 (1987) 23. Martinelli, F., Holden, H.: On absence of diffusion near the bottom of the spectrum for a random Schrödinger operator. Commun. Math. Phys. 93, 197–217 (1984) 24. Pastur, L., Figotin, A.: Spectra of Random and Almost-Periodic Operators. Berlin: Springer-Verlag, 1992 25. Reed, M., Simon, B.: Methods of modern mathematical physics Vol. I-IV, London: Academic Press, 1975 26. 
Simon, B., Wolff, T.: Singular continuous perturbation under rank one perturbation and localization for random Hamiltonians. Commun. Pure Appl. Math. 39, 75–90 (1986) 27. Wang, W.M.: Microlocalization, percolation and Anderson localization for the Schrödinger operator with a random potential. J. Funct. Anal. 146, 1–26 (1997) Communicated by B. Simon

Commun. Math. Phys. 194, 343 – 358 (1998)



On Maxwell's Equations with a Temperature Effect, II

Robert Glassey¹, Hong-Ming Yin²

¹ Department of Mathematics, Indiana University, Bloomington, IN 47405, USA
² Department of Mathematics, University of Notre Dame, Notre Dame, IN 46556, USA

Received: 24 July 1997 / Accepted: 22 October 1997

Abstract: In this paper we study Maxwell's equations with a thermal effect. This system models an induction heating process where the electric conductivity σ strongly depends on the temperature u. We focus on a special one–dimensional case where the electromagnetic wave is assumed to be parallel to the y-axis. It is shown that the resulting hyperbolic–parabolic system has a global smooth solution if the electrical conductivity σ(u) grows like u^q with 0 ≤ q < 8 + 4√3. A fundamental element in this paper is the establishment of a maximum principle for wave equations with damping. This maximum principle provides an a priori bound for the first derivative with respect to both x and t of the solution without the imposition of any differentiability assumptions nor bounds on the coefficient of the damping term. The use of a nonlinear multiplier then permits (via a bootstrap procedure) the estimation of successively higher L^p-norms of the temperature function u.

1. Introduction

In this paper we study the following coupled hyperbolic–parabolic system:

\[ w_{tt} - w_{xx} + \sigma(u) w_t = 0, \quad (x, t) \in Q_T, \tag{1.1} \]
\[ u_t - u_{xx} = \sigma(u) w_t^2, \quad (x, t) \in Q_T, \tag{1.2} \]

subject to appropriate initial–boundary conditions, say, for example,

\[ w(i, t) = f_i(t), \quad u(i, t) = g_i(t), \quad 0 \le t \le T, \ i = 0, 1, \tag{1.3} \]
\[ w(x, 0) = w_0(x), \quad w_t(x, 0) = w_1(x), \quad u(x, 0) = u_0(x), \quad 0 \le x \le 1, \tag{1.4} \]

where Q_T = (0, 1) × (0, T) and T > 0. Equation (1.1) derives from Maxwell's equations [4] where the electric field is assumed to be {0, g(x, t), 0} and w(x, t) = ∫_0^t g(x, τ) dτ. Equation (1.2) describes standard


heat conduction by taking into account the local Joule heat produced by eddy currents. One of the important features of the system is that the electric conductivity σ strongly depends on the temperature. The reader may consult [6] for further physical background and [3, 9] (and the references therein) for the mathematical model of induction heating. In [9] one of the authors studied the full Maxwell system coupled to a nonlinear heat equation in a bounded domain in R³. Existence of a global weak solution was established under a boundedness assumption on σ(u). Moreover, for the case of one space dimension it is shown that the weak solution is also classical. When σ(u) is not a priori assumed to be bounded, the problem becomes much more complicated. For instance, consider a special case where σ(u) = u^q with q > 1. It is well–known that the solution of the heat equation u_t − u_xx = u^q will blow up in finite time, provided that the initial datum is suitably large. In particular, the L¹-norm of u(x, t) blows up in finite time. However, with the coupling factor w_t² as the coefficient of u^q, it is not clear whether the temperature u(x, t) will blow up in finite time. Indeed, with suitable boundary conditions, it is easy to derive the following energy estimate for any T > 0:

\[ \int_0^1 [w_t^2 + w_x^2]\, dx + \int_0^T \!\! \int_0^1 \sigma(u) w_t^2\, dx\, dt \le C. \]

This estimate shows that the right–hand side of Eq. (1.2) belongs to L¹(Q_T). From the theory of parabolic equations [5], it follows that at least the L¹(0, 1)-norm of u(x, t) is bounded for all t ≥ 0. On the other hand, the boundedness of u(x, t) in L¹(0, 1) does not ensure the L^p-boundedness of u(x, t) for large p. Thus, the existence of a global solution to the system (1.1)–(1.4) seems challenging. In this paper we show that under certain conditions on the boundary values, the problem (1.1)–(1.4) has a global smooth solution if σ_0[1 + |u|^q] ≤ σ(u) ≤ σ_1[1 + |u|^q] for any q ∈ [0, 8 + 4√3), no matter how large the initial data are. We do not know if this value for q is sharp. The crucial step is that we prove a maximum principle for the following wave equation with damping: w_tt − w_xx + σ(x, t)w_t = 0, where σ(x, t) ≥ 0. Unlike the case for elliptic and parabolic equations [8], this maximum principle provides an upper bound for |w_x| and |w_t| in terms of the bounds for the initial and boundary quantities. It does not require any smoothness nor bounds on σ(x, t). In particular, a priori bounds for w_x and w_t are derived when the boundary data at x = 0 and x = 1 are homogeneous, or given periodically in the x-variable. To the best of the authors' knowledge, this maximum principle is new and has independent interest itself. The proof of the maximum principle is based on deriving a sequence of "energy identities" and an iteration technique. This technique has some similarity to Moser's iteration [7] in deriving a priori bounds of solutions to elliptic and parabolic equations. Moreover, the first two elements of this sequence may be reminiscent of an observation for the solution of the "nonlinear σ model", cf. [2]. This paper is organized as follows. In Sect. 2 we derive the maximum principle for damped wave equations. In particular, we derive an a priori bound for w_x and w_t for certain special boundary data. In Sect. 3, we derive various a priori energy estimates for w(x, t) and u(x, t). The existence of a global solution to (1.1)–(1.4) is established via a fixed point argument.


For the reader's convenience, we recall some standard notation below. Let $B$ be a Banach space and $1 \le p < \infty$, and let
\[ L^p(0,T;B) = \{ f \mid f:[0,T] \to B \text{ with the norm } \|f\|_{L^p(0,T;B)} < \infty \}, \]
where
\[ \|f\|_{L^p(0,T;B)} = \Big[ \int_0^T \|f\|_B^p \, dt \Big]^{1/p}. \]

The spaces $W^{m,p}(\Omega)$, $H_0^1(\Omega)$, $W_p^{2,1}(Q_T)$ and $C^{1+\alpha,\frac{1+\alpha}{2}}(\bar Q_T)$, $C^{2+\alpha,1+\frac{\alpha}{2}}(\bar Q_T)$, etc. are the usual Sobolev and classical spaces. The reader may consult [5] for the definition of these spaces.

2. A Maximum Principle for Damped Wave Equations

Let $\sigma(x,t)$ be measurable and nonnegative in $Q = \{(x,t) : 0 < x < 1,\ t > 0\}$. Consider the following wave equation:
\[ w_{tt} - w_{xx} + \sigma(x,t) w_t = 0, \qquad (x,t) \in Q. \tag{2.1} \]
With appropriate initial and boundary conditions, one can derive an a priori bound for the energy
\[ \int_0^1 [w_t^2 + w_x^2]\,dx + \int_0^t\!\!\int_0^1 \sigma w_t^2 \,dx\,dt. \]
Further a priori estimates often require smoothness and bounds on $\sigma(x,t)$ as well as on its derivatives. In this section we derive $L^\infty$-bounds for $w_t$ and $w_x$ over $Q$ in terms of a constant which depends only on the values of $w_x$ and $w_t$ on the boundary $x=0$ and $x=1$ as well as at the initial time. In particular, if the boundary conditions are given by $w(0,t) = w(1,t) = 0$, or $w_x(0,t) = w_x(1,t) = 0$, or a periodic form $w(0,t) = w(1,t)$, $w_x(0,t) = w_x(1,t)$, then the $L^\infty$-norm of $w_x$ and $w_t$ in $Q$ can be estimated by a known constant depending only on initial values. This a priori estimate does not depend on the smoothness of $\sigma(x,t)$ nor on bounds for $\sigma(x,t)$, as long as it is nonnegative.

Theorem 2.1. Let $\sigma(x,t)$ be nonnegative and integrable in $Q$. Let $w(x,t)$ be a solution of the wave equation (2.1). Then
\[ \sup_{(x,t)\in Q} |w_x| + \sup_{(x,t)\in Q} |w_t| \le C, \tag{2.2} \]
where $C = 4\max\{C_0, C_1\}$ and
\[ C_0 = \sup_{0<x<1} |w_x(x,0)| + \sup_{0<x<1} |w_t(x,0)|, \]
\[ C_1 = \max\Big\{ \sup_{t>0}|w_x(0,t)| + \sup_{t>0}|w_t(0,t)|,\ \sup_{t>0}|w_x(1,t)| + \sup_{t>0}|w_t(1,t)| \Big\}. \]


Proof. We shall first show the result in a fixed time interval, say $[0,T]$, for any $T > 0$. Let $Q_T = (0,1)\times(0,T]$. Multiplying (2.1) by $w_t$, we see that
\[ \frac{\partial}{\partial t}\Big[\frac12 (w_t^2 + w_x^2)\Big] + \sigma w_t^2 = \frac{\partial}{\partial x}(w_x w_t). \]
Similarly, we multiply (2.1) by $w_x$ to obtain
\[ \frac{\partial}{\partial t}(w_x w_t) + \sigma w_t w_x = \frac{\partial}{\partial x}\Big[\frac12 (w_t^2 + w_x^2)\Big]. \]
On $\bar Q_T$, we define
\[ e = \frac12 (w_t^2 + w_x^2), \qquad p = w_t w_x. \]
Then $e$ and $p$ satisfy the following system:
\[ e_t + \sigma w_t^2 = p_x \quad \text{in } Q_T, \tag{2.3} \]
\[ p_t + \sigma p = e_x \quad \text{in } Q_T. \tag{2.4} \]
Define
\[ e_{n+1}(x,t) = \frac{e_n^2 + p_n^2}{2}, \qquad p_{n+1} = e_n p_n, \qquad n = 1, 2, \ldots, \]
where $e_1 = e = \frac12(w_t^2 + w_x^2)$ and $p_1 = p = w_t w_x$. Then $e_{n+1}$ and $p_{n+1}$ satisfy the following system:
\[ (e_{n+1})_t + \sigma a_{n+1} = (p_{n+1})_x \quad \text{in } Q_T, \tag{2.5} \]
\[ (p_{n+1})_t + \sigma b_{n+1} = (e_{n+1})_x \quad \text{in } Q_T, \tag{2.6} \]
where $a_1 = w_t^2$, $a_{n+1} = e_n a_n + p_n b_n$, $b_1 = p$, and $b_{n+1} = a_n p_n + b_n e_n$, $n = 1, 2, \ldots$.

We claim that $a_{n+1} \ge 0$ and $p_{n+1} b_{n+1} \ge 0$ for all $n \ge 1$. The claim can be proved by induction. Indeed, for $n = 1$, it is clear that
\[ a_2 = e_1 a_1 + p_1 b_1 = e w_t^2 + p^2 \ge 0 \]
and
\[ b_2 p_2 = (a_1 p_1 + b_1 e_1) e_1 p_1 \ge \tfrac12 w_t^4 p^2 + e^2 p^2 \ge 0. \]
Assume that the claim holds for $n$, i.e. $a_n \ge 0$, $p_n b_n \ge 0$. Then $a_{n+1} = e_n a_n + p_n b_n \ge 0$, since $e_n \ge 0$ for all $n \ge 1$. From the definition of $p_n$ and $b_n$, we see
\[ p_{n+1} b_{n+1} = (e_n p_n)[a_n p_n + b_n e_n] = e_n a_n p_n^2 + e_n^2 p_n b_n \ge 0, \]
which concludes the induction proof. Now we integrate (2.5) over $(0,1)$ to obtain


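\[ \frac{d}{dt}\int_0^1 e_{n+1}\,dx + \int_0^1 \sigma a_{n+1}\,dx = \int_0^1 (p_{n+1})_x\,dx. \tag{2.7} \]
As $\sigma \ge 0$ and $a_{n+1} \ge 0$, we see that
\[ \int_0^1 e_{n+1}\,dx \le \int_0^1 e_{n+1}\big|_{t=0}\,dx + \int_0^t [p_{n+1}(1,\tau) - p_{n+1}(0,\tau)]\,d\tau. \tag{2.8} \]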

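From the definition of $e_n$, we see that
\[ e_{n+1} = \frac{e_n^2 + p_n^2}{2} \ge \frac12 e_n^2 = \frac12\Big(\frac{e_{n-1}^2 + p_{n-1}^2}{2}\Big)^2 \ge \frac12\cdot\Big(\frac12\Big)^2 e_{n-1}^4 \ge \cdots \ge \frac12\cdot\Big(\frac12\Big)^2\cdots\Big(\frac12\Big)^{2^{n-1}} e_1^{2^n}. \]
Now
\[ \frac12\cdot\Big(\frac12\Big)^2\cdots\Big(\frac12\Big)^{2^{n-1}} = \frac{1}{2^{2^n-1}} \]
and
\[ e_1 \equiv e \ge \frac{w_t^2}{2}, \qquad e_1 \ge \frac{w_x^2}{2}. \]
It follows that
\[ \int_0^1 e_{n+1}\,dx \ge \frac{1}{2^{2^{n+1}-1}} \int_0^1 w_t^{2^{n+1}}\,dx, \tag{2.9} \]
\[ \int_0^1 e_{n+1}\,dx \ge \frac{1}{2^{2^{n+1}-1}} \int_0^1 w_x^{2^{n+1}}\,dx. \tag{2.10} \]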

We claim that the following estimates hold for all $n \ge 0$:
\[ e_{n+1}(x,0) \le C_0^{2^{n+1}}, \tag{2.11} \]
\[ \int_0^t [p_{n+1}(1,\tau) - p_{n+1}(0,\tau)]\,d\tau \le 2T C_1^{2^{n+1}}, \tag{2.12} \]
where
\[ C_0 = \sqrt{\frac{\|w_x(x,0)\|^2_{L^\infty(0,1)} + \|w_t(x,0)\|^2_{L^\infty(0,1)}}{2}}, \]
\[ C_1 = \max\Big\{ \sup_{t>0}|w_x(0,t)| + \sup_{t>0}|w_t(0,t)|,\ \sup_{t>0}|w_x(1,t)| + \sup_{t>0}|w_t(1,t)| \Big\}. \]
We use induction to prove (2.11)–(2.12). From the definition,
\[ e_1(x,0) = \frac{w_x(x,0)^2 + w_t(x,0)^2}{2} \le C_0^2. \]

Assume that the estimate (2.11) holds for all $n \le k$, i.e.
\[ e_n(x,0) \le C_0^{2^n}, \qquad n = 1, 2, \ldots, k. \]
Now, from the definition of $e_k$ and $p_k$, we see that
\[ e_{k+1}(x,0) = \frac{e_k^2 + p_k^2}{2} = \frac{e_k^2 + e_{k-1}^2 p_{k-1}^2}{2} = \frac{e_k^2 + e_{k-1}^2 e_{k-2}^2 \cdots e_1^2 p_1^2}{2} \]
\[ \le \frac{C_0^{2^{k+1}} + C_0^{2^k + 2^{k-1} + \cdots + 2^2}\, p_1^2}{2} \le \frac{C_0^{2^{k+1}} + C_0^{2^k + 2^{k-1} + \cdots + 2^2 + 4}}{2}, \]
since $p_1^2(x,0) = |w_x(x,0) w_t(x,0)|^2 \le C_0^4$. Now
\[ 2 + 2^2 + \cdots + 2^k = 2(2^k - 1) = 2^{k+1} - 2. \]
It follows that
\[ e_{k+1}(x,0) \le \frac{C_0^{2^{k+1}} + C_0^{2^{k+1}}}{2} = C_0^{2^{k+1}}. \]
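The recursion behind (2.9)–(2.11) can be checked pointwise with exact rational arithmetic. The following Python sketch is an illustration only and not part of the original proof: the function names and sample values are assumptions. At a single point one may take $C_0^2 = e_1$, so the upper bound reads $e_{n+1} \le e_1^{2^n}$, while the lower bound is $e_{n+1} \ge e_1^{2^n}/2^{2^n-1}$.

```python
from fractions import Fraction


def energy_sequence(w_t, w_x, steps):
    """Return [(e_1, p_1), ..., (e_{steps+1}, p_{steps+1})] at one point."""
    w_t, w_x = Fraction(w_t), Fraction(w_x)
    e = (w_t**2 + w_x**2) / 2   # e_1 = (w_t^2 + w_x^2) / 2
    p = w_t * w_x               # p_1 = w_t * w_x
    seq = [(e, p)]
    for _ in range(steps):
        # e_{n+1} = (e_n^2 + p_n^2) / 2,  p_{n+1} = e_n * p_n
        e, p = (e * e + p * p) / 2, e * p
        seq.append((e, p))
    return seq


def bounds_hold(w_t, w_x, steps):
    """Check the pointwise lower and upper bounds on e_{n+1}."""
    seq = energy_sequence(w_t, w_x, steps)
    e1 = seq[0][0]
    for n in range(1, steps + 1):
        e_next = seq[n][0]                        # e_{n+1}
        lower = e1 ** (2**n) / 2 ** (2**n - 1)   # lower bound used for (2.9)
        upper = e1 ** (2**n)                      # pointwise form of (2.11)
        if not (lower <= e_next <= upper):
            return False
    return True
```

For instance, `bounds_hold(Fraction(3, 2), Fraction(-1, 2), 6)` checks six steps of the iteration at a point where $w_t = 3/2$ and $w_x = -1/2$.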

To prove the second estimate (2.12) of the claim, we recall from the definition of $p_{n+1}$,
\[ p_{n+1} = e_n p_n = e_n e_{n-1} \cdots e_1 p_1. \]
By using the same induction argument, we can easily derive
\[ e_n(0,t) \le C_1^{2^n}, \qquad e_n(1,t) \le C_1^{2^n}, \qquad n = 1, 2, \ldots \]
as follows. From
\[ |p_1(0,t)| = |w_x(0,t) w_t(0,t)| \le C_1^2, \qquad |p_1(1,t)| = |w_x(1,t) w_t(1,t)| \le C_1^2 \]
we get, assuming the validity of the claim at all steps $k \le n$,
\[ |p_{n+1}|_{x=0} \le C_1^{2^n} C_1^{2^{n-1}} \cdots C_1^{2} \cdot C_1^{2} = C_1^{2^{n+1}}. \tag{2.13} \]
Similarly,
\[ |p_{n+1}|_{x=1} \le C_1^{2^n} C_1^{2^{n-1}} \cdots C_1^{2} \cdot C_1^{2} = C_1^{2^{n+1}}. \]

Finally, we combine the above estimates to obtain from (2.8) (after an integration over $t$) the following two inequalities:
\[ \frac{1}{2^{2^{n+1}-1}} \int_0^T\!\!\int_0^1 w_t^{2^{n+1}}\,dx\,dt \le C_0^{2^{n+1}} T + 2T^2 C_1^{2^{n+1}}, \tag{2.14} \]
\[ \frac{1}{2^{2^{n+1}-1}} \int_0^T\!\!\int_0^1 w_x^{2^{n+1}}\,dx\,dt \le C_0^{2^{n+1}} T + 2T^2 C_1^{2^{n+1}}. \tag{2.15} \]
We take the $2^{n+1}$-th root and then take the limit as $n$ tends to infinity to conclude that
\[ \sup_{[0,1]\times[0,T]} |w_t| \le 2\max\{C_0, C_1\}, \tag{2.16} \]
\[ \sup_{[0,1]\times[0,T]} |w_x| \le 2\max\{C_0, C_1\}. \tag{2.17} \]
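The limit behind (2.16)–(2.17) can be illustrated numerically. The sketch below is only an illustration with hypothetical names and sample constants: it moves the factor $2^{2^{n+1}-1}$ to the right-hand side of (2.14), takes the $2^{n+1}$-th root, and works in logarithms to avoid overflow. It assumes $C_0, C_1, T > 0$.

```python
import math


def root_bound(c0, c1, t_final, n):
    """(2^(N-1) * (c0^N * T + 2 T^2 * c1^N))^(1/N) with N = 2^(n+1)."""
    big_n = 2 ** (n + 1)
    # Logarithms of the two terms on the right-hand side of (2.14).
    log_a = big_n * math.log(c0) + math.log(t_final)
    log_b = big_n * math.log(c1) + math.log(2 * t_final**2)
    # Stable log-sum-exp of the two terms.
    m = max(log_a, log_b)
    log_sum = m + math.log(math.exp(log_a - m) + math.exp(log_b - m))
    return math.exp(((big_n - 1) * math.log(2) + log_sum) / big_n)
```

As `n` grows, the dependence on `t_final` disappears and `root_bound(c0, c1, t_final, n)` approaches `2 * max(c0, c1)`, which is why the constants in (2.16)–(2.17) are independent of $T$.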

As the constants in (2.16)–(2.17) do not depend on $T$, the desired estimates follow immediately.

Corollary 2.2. If the boundary conditions associated with Eq. (2.1) are given by $w(0,t) = w(1,t) = 0$, or $w_x(0,t) = w_x(1,t) = 0$, or the boundary conditions are periodic, $w(0,t) = w(1,t)$, $w_x(0,t) = w_x(1,t)$, $t \ge 0$, then
\[ \sup_{(x,t)\in Q} |w_x| + \sup_{(x,t)\in Q} |w_t| \le C_0, \tag{2.18} \]
where
\[ C_0 = 4\Big[ \sup_{0<x<1} |w_x(x,0)| + \sup_{0<x<1} |w_t(x,0)| \Big]. \]

... τ, x, y ∈ (0, 1)

for all (x, t) ∈ QT .

An application of the generalized Gronwall inequality gives us
\[ \sup_{Q_T} |u(x,t)| \le C, \]
where $C$ depends only on the initial and boundary data as well as on the upper bound $T$. With this a priori $L^\infty$-bound of $u(x,t)$ in hand, we can easily follow the argument of [9] to conclude the desired existence result.

Next we study the case where $\sigma(u)$ grows like $u^q$ with $q > 1$. We assume

H(3.3): Let $\sigma(u) \in C^2(\mathbb{R})$ satisfy $\sigma_0[1+|u|^q] \le \sigma(u) \le \sigma_1[1+|u|^q]$, where $\sigma_0$ and $\sigma_1$ are positive constants.

Theorem 3.2. Let the assumptions H(3.1)–H(3.3) hold and let $f_1(t) = f_2(t) = 0$ for $t \in [0,T]$. Assume $g_1(t)$, $g_2(t)$ and $u_0(x)$ are nonnegative on $[0,1]$. Then any solution $(w,u)$ of (1.1)–(1.4) satisfies the following a priori estimates:
\[ \sup_{Q_T} |w_x| + \sup_{Q_T} |w_t| \le C, \qquad \|u\|_{W_p^{2,1}(Q_T)} \le C, \]
where
\[ p = \frac{1 + \varepsilon^* + q}{q}, \qquad \varepsilon^* = \frac{(q-8) - \sqrt{(q-8)^2 - 48}}{4} \quad \text{if } q \ge 8 + 4\sqrt{3}. \]
The number $p$ can be arbitrarily large if $q < 8 + 4\sqrt{3}$, while the constant $C$ depends only on known data and $p$.

Proof. For simplicity, we assume $g_0(t) = g_1(t) = 0$. Otherwise, similar to the argument used in Theorem 3.1, we introduce $U(x,t) = u(x,t) - H(x,t)$. Without loss of generality we may further assume that $\sigma(u) = u^q$. It will be seen that the general case can be handled similarly. The proof will be divided into three steps. First of all, we note that $u(x,t) \ge 0$ by the maximum principle since $u_0(x) \ge 0$.

Step 1. Multiplying Eq. (1.1) by $u w_t$ and then integrating over $(0,1)\times(0,T)$, we have, after some routine calculations,
\[ \frac12\int_0^1 u[w_x^2 + w_t^2]\,dx + \int_0^t\!\!\int_0^1 u^{q+1} w_t^2\,dx\,dt = \frac12\int_0^1 u_0[(w_0')^2 + (w_1)^2]\,dx + \frac12\int_0^t\!\!\int_0^1 u_t[w_x^2 + w_t^2]\,dx\,dt - \int_0^t\!\!\int_0^1 w_x w_t u_x\,dx\,dt. \]
Since $w_t$ and $w_x$ are uniformly bounded by Corollary 2.2, it follows that
\[ \int_0^1 u[w_x^2 + w_t^2]\,dx + \int_0^t\!\!\int_0^1 u^{q+1} w_t^2\,dx\,dt \le C_1 + C_2\int_0^T\!\!\int_0^1 [|u_x| + |u_t|]\,dx\,dt. \tag{3.6} \]
On the other hand, by applying the $W_p^{2,1}$-estimate for the parabolic equation (1.2), we obtain
\[ \|u\|^p_{W_p^{2,1}(Q_T)} \le C\Big[ \|u_0\|^p_{W_p^2} + \int_0^T\!\!\int_0^1 u^{pq} w_t^{2p}\,dx\,dt \Big]. \tag{3.7} \]
In particular, for $p = \frac{q+1}{q}$, we see that
\[ \int_0^T\!\!\int_0^1 u^{pq} w_t^{2p}\,dx\,dt \le C\int_0^T\!\!\int_0^1 u^{q+1} w_t^2\,dx\,dt \quad \text{(since $w_t$ is bounded)} \]
\[ \le C + C C_2 \int_0^T\!\!\int_0^1 [|u_x| + |u_t|]\,dx\,dt, \]
where at the final step the estimate (3.6) was used. It follows that
\[ \|u\|^p_{W_p^{2,1}(Q_T)} \le C + C\int_0^T\!\!\int_0^1 [|u_x| + |u_t|]\,dx\,dt. \]
After applying Hölder's and Young's inequalities we get
\[ \|u\|^p_{W_p^{2,1}(Q_T)} \le C, \tag{3.8} \]
where $p = \frac{q+1}{q}$. Now by multiplying (1.2) by $u$ and using (3.8), we immediately obtain

\[ \sup_{0\le t\le T}\int_0^1 u^2\,dx + \int_0^T\!\!\int_0^1 u_x^2\,dx\,dt \le C. \tag{3.9} \]
Sobolev's embedding shows that for any $v(x) \in H_0^1(0,1)$,
\[ \int_0^1 v^6\,dx \le C\int_0^1 v_x^2\,dx \cdot \Big[\int_0^1 v^2\,dx\Big]^2. \tag{3.10} \]
It follows from (3.9)–(3.10) that
\[ \int_0^T\!\!\int_0^1 u^6\,dx\,dt \le C\int_0^T\!\!\int_0^1 u_x^2\,dx\,dt \le C. \tag{3.11} \]
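Inequality (3.10) can be sanity-checked numerically on sample functions. The sketch below is an illustration under stated assumptions, not a proof: the test functions $v(x) = \sin(k\pi x)$, the midpoint quadrature and all names are choices made here. It evaluates the ratio of the left-hand side of (3.10) to the right-hand side with $C = 1$; for these sine modes the exact value is $5/(2k^2\pi^2)$.

```python
import math


def midpoint(f, n=2000):
    """Midpoint-rule approximation of the integral of f over (0, 1)."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))


def gn_ratio(v, v_x):
    """Ratio int v^6 / (int v_x^2 * (int v^2)^2) for v in H_0^1(0, 1)."""
    lhs = midpoint(lambda x: v(x) ** 6)
    rhs = midpoint(lambda x: v_x(x) ** 2) * midpoint(lambda x: v(x) ** 2) ** 2
    return lhs / rhs


def sine_mode(k):
    """Return v(x) = sin(k pi x) and its derivative; v vanishes at x = 0, 1."""
    return (
        lambda x: math.sin(k * math.pi * x),
        lambda x: k * math.pi * math.cos(k * math.pi * x),
    )
```

Calling `gn_ratio(*sine_mode(k))` for increasing `k` gives ratios that decrease like $1/k^2$, consistent with a uniform constant $C$ in (3.10) on this family.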

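Step 2. Let $\varepsilon > 0$. Multiplying (1.1) by $u^{1+\varepsilon} w_t$ and then integrating over $(0,1)\times(0,T)$, we obtain
\[ \int_0^1 u^{1+\varepsilon}[w_x^2 + w_t^2]\,dx + \int_0^T\!\!\int_0^1 u^{q+1+\varepsilon} w_t^2\,dx\,dt \le C + C\int_0^T\!\!\int_0^1 u^{\varepsilon}[|u_t| + |u_x|]\,dx\,dt. \tag{3.12} \]
Multiplying Eq. (1.2) by $u^{1+\varepsilon}$ and integrating over $(0,1)\times(0,T)$, we have
\[ \sup_{0\le t\le T}\int_0^1 u^{2+\varepsilon}\,dx + \int_0^T\!\!\int_0^1 u^{\varepsilon} u_x^2\,dx\,dt \le C\int_0^T\!\!\int_0^1 u^{q+1+\varepsilon} w_t^2\,dx\,dt \]
\[ \le C + C\int_0^T\!\!\int_0^1 u^{\varepsilon}[|u_t| + |u_x|]\,dx\,dt, \tag{3.13} \]
where at the final step the estimate (3.12) was used. An application of the $W_p^{2,1}$-estimate with $p = \frac{1+\varepsilon+q}{q}$ to (1.2) yields
\[ \int_0^T\!\!\int_0^1 [u^p + u_{xx}^p + u_t^p]\,dx\,dt \le C + C\int_0^T\!\!\int_0^1 u^{q+1+\varepsilon} w_t^{2p}\,dx\,dt \]
\[ \le C + C\int_0^T\!\!\int_0^1 u^{\varepsilon}[|u_t| + |u_x|]\,dx\,dt, \tag{3.14} \]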

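where the estimate (3.13) was used again. Now we use Hölder's inequality with $r = \frac{1+\varepsilon+q}{q}$ and $s = \frac{1+\varepsilon+q}{\varepsilon+1}$ and then Young's inequality to obtain
\[ \int_0^T\!\!\int_0^1 u^{\varepsilon}|u_t|\,dx\,dt \le \Big(\int_0^T\!\!\int_0^1 |u_t|^r\,dx\,dt\Big)^{1/r}\Big(\int_0^T\!\!\int_0^1 u^{\varepsilon s}\,dx\,dt\Big)^{1/s} \]
\[ \le \delta\int_0^T\!\!\int_0^1 |u_t|^r\,dx\,dt + C(\delta)\int_0^T\!\!\int_0^1 u^{\varepsilon s}\,dx\,dt \]
\[ = \delta\int_0^T\!\!\int_0^1 |u_t|^{(1+\varepsilon+q)/q}\,dx\,dt + C(\delta)\int_0^T\!\!\int_0^1 u^{[\varepsilon(1+\varepsilon+q)]/(1+\varepsilon)}\,dx\,dt. \tag{3.15} \]
Similarly,
\[ \int_0^T\!\!\int_0^1 u^{\varepsilon}|u_x|\,dx\,dt \le \delta\int_0^T\!\!\int_0^1 |u_x|^{(1+\varepsilon+q)/q}\,dx\,dt + C(\delta)\int_0^T\!\!\int_0^1 u^{[\varepsilon(1+\varepsilon+q)]/(1+\varepsilon)}\,dx\,dt. \tag{3.16} \]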

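If we can choose $\varepsilon$ such that $[\varepsilon(1+\varepsilon+q)]/(1+\varepsilon) \le 6$, then the final term in the right-hand side of (3.15) and (3.16) can be estimated by (3.11). Hence after choosing $\delta$ small in (3.15)–(3.16), we see from (3.14)–(3.16) that
\[ \int_0^T\!\!\int_0^1 [u^p + u_{xx}^p + u_t^p]\,dx\,dt \le C, \]
where
\[ p = \frac{1+\varepsilon+q}{q}. \]
Now the inequality
\[ \frac{\varepsilon(1+\varepsilon+q)}{1+\varepsilon} \le 6 \]
is equivalent to
\[ \varepsilon^2 + (q-5)\varepsilon - 6 \le 0. \]
The maximum value of $\varepsilon$ which we can choose is
\[ \varepsilon = \frac{-(q-5) + \sqrt{(q-5)^2 + 24}}{2}. \]
Before we derive further estimates, we note that from the embedding theorem [5]:
\[ \|u\|_{L^\infty(Q_T)} \le C\|u\|_{W_p^{2,1}(Q_T)} \quad \text{if } p > \tfrac32. \]
Now
\[ p = \frac{1+\varepsilon+q}{q} = 1 + \frac{1+\varepsilon}{q}. \]
Hence $p > \frac32$ is equivalent to
\[ \frac{1+\varepsilon}{q} > \frac12. \]
It follows that $q < 6$. Consequently, when $q < 6$ there exists a constant $C$ such that $\|u\|_{L^\infty(Q_T)} \le C$.

Step 3. Let
\[ \varepsilon_1 = \frac{-(q-5) + \sqrt{(q-5)^2 + 24}}{2}. \]
Now from Step 2 we know that
\[ \sup_{0\le t\le T}\int_0^1 u^{2+\varepsilon_1}\,dx + \int_0^T\!\!\int_0^1 u^{\varepsilon_1} u_x^2\,dx\,dt \le C. \]
Let $v = u^{\frac{2+\varepsilon_1}{2}}$. Then Sobolev's embedding implies
\[ \int_0^T\!\!\int_0^1 v^6\,dx\,dt \le C\int_0^T\!\!\int_0^1 v_x^2\,dx\,dt \cdot \sup_{0\le t\le T}\Big(\int_0^1 v^2\,dx\Big)^2. \]
It follows that
\[ \int_0^T\!\!\int_0^1 u^{6+3\varepsilon_1}\,dx\,dt \le C. \]
Now as in Step 2 we can choose $\varepsilon$ as large as possible such that
\[ \frac{\varepsilon(1+\varepsilon+q)}{1+\varepsilon} \le 6 + 3\varepsilon_1. \tag{3.17} \]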

The largest value of $\varepsilon$ satisfying the inequality (3.17) is
\[ \varepsilon = \frac{-(q-5-3\varepsilon_1) + \sqrt{(q-5-3\varepsilon_1)^2 + 4(6+3\varepsilon_1)}}{2}. \]
We define $\varepsilon_2$ to be the right-hand side of the above equation. By continuing this process, we obtain a sequence $\varepsilon_n$ which is defined as follows:
\[ \varepsilon_{n+1} = \frac{-(q-5-3\varepsilon_n) + \sqrt{(q-5-3\varepsilon_n)^2 + 4(6+3\varepsilon_n)}}{2}, \qquad n = 1, 2, \ldots. \tag{3.18} \]
We now show by induction that the sequence $\varepsilon_n$ is monotonically increasing. Note that $\varepsilon_2 - \varepsilon_1 > 0$ is equivalent to
\[ 3\varepsilon_1 + \sqrt{(q-5-3\varepsilon_1)^2 + 4(6+3\varepsilon_1)} - \sqrt{(q-5)^2 + 24} > 0, \tag{3.19} \]
which is equivalent to
\[ \sqrt{(q-5-3\varepsilon_1)^2 + 4(6+3\varepsilon_1)} - (q-7-3\varepsilon_1) > 0. \tag{3.20} \]


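If $q-7-3\varepsilon_1 < 0$, then the inequality (3.20) holds automatically. If $q-7-3\varepsilon_1 \ge 0$, then a simple calculation shows that the inequality (3.20) holds as long as $q > 0$. This concludes the proof that $\varepsilon_2 > \varepsilon_1$. Now we assume that $\varepsilon_n > \varepsilon_{n-1}$. From the definition,
\[ \varepsilon_{n+1} - \varepsilon_n = \frac{3(\varepsilon_n - \varepsilon_{n-1}) + \sqrt{(q-5-3\varepsilon_n)^2 + 4(6+3\varepsilon_n)} - \sqrt{(q-5-3\varepsilon_{n-1})^2 + 4(6+3\varepsilon_{n-1})}}{2}. \]
It follows that $\varepsilon_{n+1} - \varepsilon_n > 0$ is equivalent to
\[ 9(\varepsilon_n - \varepsilon_{n-1})^2 + 6(\varepsilon_n - \varepsilon_{n-1})\sqrt{(q-5-3\varepsilon_n)^2 + 4(6+3\varepsilon_n)} + (q-5-3\varepsilon_n)^2 + 4(6+3\varepsilon_n) > (q-5-3\varepsilon_{n-1})^2 + 4(6+3\varepsilon_{n-1}). \]
That is,
\[ \sqrt{(q-5-3\varepsilon_n)^2 + 4(6+3\varepsilon_n)} - [q-7-3\varepsilon_n] > 0. \tag{3.21} \]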

If $q-7-3\varepsilon_n < 0$, the above inequality holds automatically. If $q-7-3\varepsilon_n \ge 0$, then the inequality (3.21) holds as long as $q > 0$. Let
\[ \varepsilon^* = \lim_{n\to+\infty} \varepsilon_n. \]
From the definition of $\varepsilon_n$, we see that $\varepsilon^*$, if it is finite, satisfies
\[ \varepsilon^* = \frac{-(q-5-3\varepsilon^*) + \sqrt{(q-5-3\varepsilon^*)^2 + 4(6+3\varepsilon^*)}}{2}. \tag{3.22} \]
Equation (3.22) is equivalent to
\[ 2\varepsilon^{*2} + (8-q)\varepsilon^* + 6 = 0. \]
Solving this equation, we find that
\[ \varepsilon^* = \frac{(q-8) \pm \sqrt{(q-8)^2 - 48}}{4}. \]
When $(q-8)^2 - 48 < 0$, there is no real root of (3.22). This implies that $\varepsilon^* = \infty$. In this case, we can choose $n$ large enough such that $p = \frac{1+\varepsilon_n+q}{q} > \frac32$. With this choice of $\varepsilon_n$, the embedding theorem shows that the $L^\infty$-norm of $u$ is bounded by a constant which depends only on known data. When $(q-8)^2 - 48 \ge 0$, the largest possible $\varepsilon^*$ is
\[ \varepsilon^* = \frac{(q-8) - \sqrt{(q-8)^2 - 48}}{4}. \]
This implies that if

0 1: $\|u\|_{W_p^{2,1}(Q_T)} \le C$.

Sobolev's embedding yields
\[ \|u\|_{C^{1+\alpha,\frac{1+\alpha}{2}}(\bar Q_T)} \le C, \]
where $C$ depends only on known data. The rest of the proof exactly follows from [9].
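The behaviour of the sequence (3.18) on either side of $q = 8 + 4\sqrt{3}$ can be explored numerically. The following Python sketch is an illustration only; the function names are hypothetical, and it uses the observation that $\varepsilon_1$ equals the right-hand side of (3.18) evaluated at $\varepsilon = 0$.

```python
import math

Q_CRIT = 8 + 4 * math.sqrt(3)  # threshold appearing in Theorem 3.2


def eps_next(eps, q):
    """One step of the recursion (3.18)."""
    b = q - 5 - 3 * eps
    return (-b + math.sqrt(b * b + 4 * (6 + 3 * eps))) / 2


def eps_sequence(q, n):
    """Return [eps_1, ..., eps_n]; eps_1 is (3.18) evaluated at eps = 0."""
    seq = [eps_next(0.0, q)]
    while len(seq) < n:
        seq.append(eps_next(seq[-1], q))
    return seq


def eps_star(q):
    """Limit of eps_n as stated in Theorem 3.2 (infinite below Q_CRIT)."""
    if q < Q_CRIT:
        return math.inf
    disc = max((q - 8) ** 2 - 48, 0.0)
    return ((q - 8) - math.sqrt(disc)) / 4


def first_index_with_embedding(q, max_n=200):
    """Smallest n with p = (1 + eps_n + q)/q > 3/2, or None if not found."""
    for n, eps in enumerate(eps_sequence(q, max_n), start=1):
        if (1 + eps + q) / q > 1.5:
            return n
    return None
```

For example, with `q = 10` the sequence grows without bound and the embedding threshold $p > 3/2$ is reached after a few steps, whereas with `q = 20` the sequence increases to the finite limit `eps_star(20)` and the threshold is never reached.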

Remark 3.1. Again we point out that we do not know if the number $8+4\sqrt{3}$ in Corollary 3.3 is optimal for the existence of a global solution to (1.1)–(1.4). When $\sigma(u) = u^q$ with any $q > 0$, the a priori estimates in Theorem 3.2 hold. However, these are not strong enough to allow us to show the existence of a global weak solution to (1.1)–(1.4). Numerical experiments suggest that the temperature will blow up in finite time if $q$ is large.

References
1. Boccardo, L. and Gallouet, T.: Nonlinear Elliptic and Parabolic Equations involving measure data. J. Funct. Anal. 87, 149–169 (1989)
2. Ginibre, J. and Velo, G.: The Cauchy problem for the O(N), CP(N−1), and G_C(N, p) models. Ann. Phys. 142, no. 2, 393–415 (1982)
3. Kriegsmann, G.A.: Microwave heating of dispersive media. SIAM J. Appl. Math. 53, 655–669 (1993)
4. Landau, L.D. and Lifshitz, E.M.: Electrodynamics of Continuous Media. New York: Pergamon Press, 1960
5. Ladyzenskaja, O.A., Solonnikov, V.A. and Ural'ceva, N.N.: Linear and Quasi-linear Equations of Parabolic Type. AMS Trans. 23, Providence, R.I.: American Math. Soc., 1968
6. Metaxas, A.C. and Meredith, R.J.: Industrial Microwave Heating. I.E.E. Power Engineering Series, Vol. 4, London: Peter Peregrinus Ltd., 1983
7. Moser, J.: A Harnack inequality for parabolic differential equations. Comm. on Pure and Appl. Math. 17, 101–134 (1964)
8. Protter, M.H. and Weinberger, H.F.: Maximum principles in Differential Equations. New York: Springer-Verlag, 1984
9. Yin, H.M.: On Maxwell's equations in an electromagnetic field with the temperature effect. Notre Dame preprint series # 253, 1996, to appear in SIAM Journal of Mathematical Analysis

Communicated by H. Araki

Commun. Math. Phys. 194, 359 – 388 (1998)

Communications in


Renormalization Group Pathologies and the Definition of Gibbs States

J. Bricmont1,*, A. Kupiainen2,**, R. Lefevere1

1 UCL, Physique Théorique, B-1348, Louvain-la-Neuve, Belgium. E-mail: [email protected], [email protected]
2 Helsinki University, Department of Mathematics, Helsinki 00014, Finland. E-mail: [email protected]

Received: 30 April 1997 / Accepted: 27 October 1997

Dedicated to the memory of Roland L. Dobrushin

* Supported by EC grant CHRX-CT93-0411
** Supported by NSF grant DMS-9205296 and EC grant CHRX-CT93-0411

Abstract: We show that the so-called Renormalization Group pathologies in low temperature Ising models are due to the fact that the renormalized Hamiltonian is defined only almost everywhere (with respect to the renormalized Gibbs measures). We construct this renormalized Hamiltonian using a Renormalization Group method developed for random systems and we show that the pathologies are analogous to Griffiths' singularities.

1. Introduction

The Renormalization Group (RG) has been one of the most useful tools of theoretical physics during the past decades. It has led to an understanding of universality in the theory of critical phenomena and of the divergences in quantum field theories. It has also provided a nonperturbative calculational framework as well as the basis of a rigorous mathematical understanding of these theories.

Even though the RG was primarily devised for the study of (approximately) scale invariant situations such as statistical mechanical models at the critical point, it was found useful in the mathematical analysis of problems that were not critical but that nevertheless were "multiscale": for example, first order phase transitions in regular [12] and disordered [2] spin systems. The spin variables give an appropriate representation of these systems at and above the critical point; however, at low temperatures, these models are most naturally expressed in terms of contours (domain walls) that separate the different ground states. To apply the RG method, one inductively sums over the small scale contours, producing an effective theory for the larger scale contours.

However, the real power of the RG both theoretically and in most applications has been to realize it as a map between Hamiltonians, and the latter are usually expressed in terms of the spin variables. So, one would like to define rigorously such a map, but this program has met some difficulties. It was observed in simulations [19] that the RG transformation seems, in some sense, "discontinuous" as a map between spin Hamiltonians. These observations led subsequently to a rather extensive discussion of the so-called "pathologies" of Renormalization Group Transformations (RGT): van Enter, Fernandez and Sokal have shown [34, 35] that, first of all, the RG transformation is not really discontinuous. But they also show, using results of Griffiths and Pearce [16, 17] and of Israel [20], that, roughly speaking, there does not exist a renormalized Hamiltonian for many RGT applied to Ising-like models at low temperatures and in some cases even at high temperatures (in particular in a large external field, see [32, 33]).

More precisely, van Enter, Fernandez and Sokal consider various real-space RGT (block spin, majority vote, decimation) that can be easily and rigorously defined as maps acting on measures (i.e. on probability distributions of the infinite volume spin system). The problem occurs when one tries to rewrite this map in terms of Hamiltonians. Hamiltonians are usually expressed as sums of (n-body) interactions of the spins that have sufficient decay properties so as to define infinite volume Gibbs measures. If we start with a Gibbs measure $\mu$ corresponding to a given Hamiltonian $H$, one can easily define the renormalized measure $\mu'$. The problem then is to reconstruct a renormalized Hamiltonian $H'$ (i.e. a set of interactions) for which $\mu'$ is a Gibbs measure. Although this is trivial in finite volume, it is not so in the thermodynamic limit, and it is shown in [35] that, in many cases at low temperatures, even if $H$ contains only nearest-neighbour interactions, there is no absolutely summable interaction (defined in (2.7) below) giving rise to a Hamiltonian $H'$ for which $\mu'$ is a Gibbs measure.
What is one supposed to think about these pathologies? From a practical point of view, we understand the reason why there might be problems: one is using the wrong variables, i.e. the spin variables rather than the contours variables. The fact that the usefulness of the RG method depends crucially on choosing the right variables has been known for a long time. The “good” variables should be such that a single RG transformation, which can be interpreted as solving the statistical mechanics of the small scale variables with the large ones kept fixed, should be “noncritical”, i.e. should be away from the parameter regions where phase transitions occur. This is true in particular in the low temperature region if one uses the contour variables. In all the cases where pathologies were found, they were due to the fact that a single RG transformation involves a system that has a phase transition for some fixed values of the large scale variables. However, from the theoretical point of view, we believe that it is interesting to see just how pathological the renormalized Hamiltonians are. This question is related to another one, of independent interest: when is a measure Gibbsian for some Hamiltonian? For example, Schonmann showed [31] that, when one projects a Gibbs measure (at low temperatures) to the spins attached to a lattice of lower dimension, the resulting measure is not, in general, Gibbsian. There has been an extensive investigation of this problem of pathologies and Gibbsianness. Martinelli and Olivieri [29, 30] have shown that, in a non-zero external field, the pathologies disappear after sufficiently many decimations. Fernandez and Pfister [9] study the set of configurations that are responsible for those pathologies. They give criteria which hold in particular in a non-zero external field, and which imply that this set is of zero measure with respect to the renormalized measures. 
The renormalized Hamiltonian has been studied at and above the critical temperature by several authors [1, 3, 4, 5, 10, 11, 18, 21]. Our goal in this paper is to further clarify the situation: following an idea of Dobrushin [7], we prove that, for several examples considered in [35], the renormalized Hamilto-


nian actually exists, but the corresponding interaction satisfies a weaker summability condition than the one used in [35]. Our condition ((2.11) below) is however sufficient to define Gibbs measures, in a way that is similar to the one used before for “unbounded spins”. Thus in a sense, the pathologies are not there in the end, and the renormalized measures are Gibbsian. However, it turns out that the renormalized Hamiltonian is only defined almost everywhere with respect to the renormalized measure: it becomes “pathological” on a set of measure zero that in particular includes the configurations used in [35] to exhibit the pathologies. Our result is similar to the one of Maes and Vande Velde [28] on the Schonmann example of the measures projected on a lattice of lower dimension, except that, in our case, we show that the two renormalized states are Gibbsian with respect to the same Hamiltonian (while this question is left open in [28]). However, as pointed out to us by A. Sokal, with our definition of Gibbs states, the interaction defining the Hamiltonian is not unique, while it is essentially unique (up to physical equivalence) with the usual definition. Of course, we do not claim to justify the RG method in general, and, besides, our results do not hold near the critical point. We do not even prove that, upon iteration, the RGT drives the Hamiltonians to a trivial fixed point, although this can probably be done, in some of the examples discussed below. But we do clarify the nature of the “pathologies”. Our proof is based on the following idea: we consider the spins distributed by the renormalized measures as random external fields acting on the original system, and we apply the methods of [2] to construct the renormalized Hamiltonian. So, we consider a single transformation of the Renormalization Group but in order to construct the Hamiltonian on the image spin system, we iterate some RG transformations acting on the system with random fields. 
As in all random systems, there is a set of measure zero of “bad” configurations of the random fields, for which “typical” results (e.g. decay of correlations) do not hold, and which are responsible for the Griffiths’ singularities ([15]). We shall see below that the pathologies used in [35] are actually due to those bad configurations. But, once one excludes this set of measure zero, the renormalized system has a nice Hamiltonian, with rapidly decaying interactions.

2. Results

We consider the nearest-neighbour Ising model on $\mathbb{Z}^d$, $d \ge 2$, at $\beta$ large, for simplicity. To each $i \in \mathbb{Z}^d$, we associate a variable $\sigma_i \in \{-1,+1\}$, and the (formal) Hamiltonian is
\[ -\beta H = \beta\sum_{\langle ij\rangle} (\sigma_i\sigma_j - 1), \tag{2.1} \]
where $\langle ij\rangle$ denotes a nearest-neighbour pair and $\beta$ is the inverse temperature. At low temperatures, there are two extremal translation invariant Gibbs measures corresponding to (2.1), $\mu_+$ and $\mu_-$.

To define our RGT, let $L = (b\mathbb{Z})^d$, $b \in \mathbb{N}$, $b \ge 2$, and cover $\mathbb{Z}^d$ with disjoint $b$-boxes $B_x = B_0 + x$, $x \in L$, where $B_0$ is a box of size $b$ centered around 0. Associate to each $x \in L$ a variable $s_x \in \{-1,+1\}$, denote by $\sigma_A$ an element of $\{-1,+1\}^A$, for $A \subset \mathbb{Z}^d$, $|A| < \infty$, and introduce the probability kernels $T_x = T(\sigma_{B_x}, s_x)$ for $x \in L$, i.e. $T_x$ satisfies
\[ 1)\ T(\sigma_{B_x}, s_x) \ge 0, \qquad 2)\ \sum_{s_x} T(\sigma_{B_x}, s_x) = 1. \tag{2.2} \]
For any measure $\mu$ on $\{-1,+1\}^{\mathbb{Z}^d}$, we denote by $\mu(\sigma_A)$ the probability of the configuration $\sigma_A$.

Definition. Given a measure $\mu$ on $\{-1,+1\}^{\mathbb{Z}^d}$, the renormalized measure $\mu'$ on $\Omega = \{-1,+1\}^{L}$ is defined by:
\[ \mu'(s_A) = \sum_{\sigma_{B_A}} \mu(\sigma_{B_A}) \prod_{x\in A} T(\sigma_{B_x}, s_x), \tag{2.3} \]
where $B_A = \cup_{x\in A} B_x$, $A \subset L$, $|A| < \infty$, and $s_A \in \Omega_A = \{-1,+1\}^A$.

It is easy to check, using conditions 1) and 2) of (2.2), that $\mu'$ is a measure. We shall call the spins $\sigma_i$ the internal spins and the spins $s_x$ the external ones (they are also sometimes called the block spins). We shall need two other conditions on $T$: we assume that $T$ is symmetric:

\[ T(\sigma_{B_x}, s_x) = T(-\sigma_{B_x}, -s_x) \tag{2.4} \]
and that
\[ 0 \le T(\sigma_{B_x}, s_x) \le e^{-\beta} \tag{2.5} \]
if $\sigma_i \ne s_x$ $\forall i \in B_x$. The condition (2.5) means that there is a coupling which tends to align $s_x$ and the spins in the block $B_x$. It would be more natural to have, instead of (2.5), $0 \le T \le \varepsilon$ (with $\varepsilon$ independent of $\beta$ but small enough). However, assuming (2.5) simplifies the proofs. Note that (2.2, 2.4, 2.5) imply that
\[ \bar T \equiv T(\{\sigma_i = +1\}_{i\in B_x}, +1) = T(\{\sigma_i = -1\}_{i\in B_x}, -1) \ge 1 - e^{-\beta} \ge \frac12 \tag{2.6} \]
for $\beta$ large. The usual transformations, discussed in [35], Sect. 3.1.2, like decimation or majority rule, obviously satisfy (2.4, 2.5). The Kadanoff transformation satisfies (2.5) for $p$ large. Our results could be extended to the block spin transformations (where $s_x$ does not belong to $\{-1,+1\}$).

Let us now summarize the main result of [35]. Consider interactions $\Phi = (\Phi_X)$, which are families of functions $\Phi_X : \Omega_X \to \mathbb{R}$, indexed by $X \subset L$, $|X| < \infty$. Assume that $\Phi$ is

a) translation invariant: $\Phi_X = \Phi_{X+x}$ $\forall x \in L$,

b) uniformly absolutely summable:
\[ \sum_{X \ni 0} \|\Phi_X\|_\infty < \infty. \tag{2.7} \]
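As a concrete instance of the sum in (2.7), the sketch below evaluates it for a nearest-neighbour pair interaction $\Phi_{\{i,j\}}(s) = \beta(s_i s_j - 1)$, modelled on (2.1). This choice of interaction, the use of a $d$-dimensional cubic lattice with unit spacing, and the function names are assumptions made for illustration. Each of the $2d$ bonds through the origin contributes sup norm $2\beta$, so the sum is $4d\beta$.

```python
from itertools import product


def nn_bonds_through_origin(d):
    """List the 2d nearest-neighbour bonds {0, j} containing the origin."""
    origin = (0,) * d
    bonds = []
    for axis in range(d):
        for step in (-1, 1):
            neighbour = tuple(step if k == axis else 0 for k in range(d))
            bonds.append((origin, neighbour))
    return bonds


def sup_norm_pair(beta):
    """Sup over the four spin pairs of |beta * (s_i * s_j - 1)|."""
    return max(abs(beta * (si * sj - 1))
               for si, sj in product((-1, 1), repeat=2))


def summability_norm(d, beta):
    """Sum over bonds X containing 0 of the sup norm of Phi_X, as in (2.7)."""
    return sum(sup_norm_pair(beta) for _ in nn_bonds_through_origin(d))
```

With `d = 2` and `beta = 1.5` the sum has four terms of size 3, so any finite-range interaction of this kind satisfies condition b).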


This set of interactions obviously forms a Banach space, with the norm (2.7) (note that our terminology differs slightly from the one of [35]: we add the word "uniformly" to underline the difference with respect to condition (2.11) below). Then, one defines, $\forall V \subset L$, $|V| < \infty$, the Hamiltonian
\[ H(s_V|\bar s_{V^c}) = -\sum_{X\cap V \ne \emptyset} \Phi_X(s_{X\cap V} \vee \bar s_{X\cap V^c}), \tag{2.8} \]
where $s_V \in \Omega_V$, $\bar s_{V^c}$ is the restriction to $V^c$ of $\bar s \in \Omega$ and, for $X \cap Y = \emptyset$, $s_X \vee s_Y$ denotes the obvious configuration in $\Omega_{X\cup Y}$. The condition (2.7) implies that $H$ is a continuous function of $s_V$ and $\bar s_{V^c}$ (in the product topology), and that
\[ \pi_V^{\Phi}(s_V|\bar s_{V^c}) = Z^{-1}(\bar s_{V^c}) \exp(-H(s_V|\bar s_{V^c})) \tag{2.9} \]
(where $Z^{-1}(\bar s_{V^c})$ is the obvious normalization factor) defines a quasilocal specification in the sense of [35].

Definition. $\mu$ is a Gibbs measure for $\Phi$ if the conditional probabilities satisfy, $\forall V \subset L$, $|V|$ finite, $\forall s_V \in \Omega_V$,
\[ \mu(s_V|\bar s_{V^c}) = \pi_V^{\Phi}(s_V|\bar s_{V^c}) \quad \mu \text{ a.e.} \tag{2.10} \]
Note that we left out the inverse temperature $\beta$. When we refer to $\beta$ below, we mean the inverse temperature of the original Ising model (2.1), before acting with the RGT.

The main result of [35] is that, for a variety of RGT, there is no interaction satisfying a) and b) above for which $\mu'_+$ or $\mu'_-$ are Gibbs measures. However, we observe that, in order to define $H(s_V|\bar s_{V^c})$ and $\pi_V^{\Phi}(s_V|\bar s_{V^c})$, it is not necessary to assume (2.7); it is enough to assume the existence of a tail set $\bar\Omega \subset \Omega$ on which the following pointwise bounds hold:

b') $\bar\Omega$-pointwise absolutely summable:
\[ \sum_{X \ni x} |\Phi_X(s_X)| < \infty \quad \forall x \in L,\ \forall s \in \bar\Omega. \tag{2.11} \]
We shall therefore enlarge the class of "allowed" interactions by dropping the condition (2.7) and assuming (2.11) instead. Actually, a similar setup was already used in the theory of "unbounded spins" with infinite range interactions, see [13]. With this condition, we can define the specification $\pi^{\Phi,\bar\Omega}$ by
\[ \pi_V^{\Phi,\bar\Omega}(s_V|\bar s_{V^c}) = \begin{cases} Z^{-1}(\bar s_{V^c}) \exp(-H(s_V|\bar s_{V^c})) & \text{for } \bar s \in \bar\Omega, \\ 0 & \text{for } \bar s \notin \bar\Omega, \end{cases} \tag{2.12} \]
and then define Gibbs measures for the pair $(\Phi, \bar\Omega)$ as follows:

Definition. Given a tail set $\bar\Omega \subset \Omega$, $\mu$ is a Gibbs measure for the pair $(\Phi, \bar\Omega)$ if $\mu(\bar\Omega) = 1$, and there exists a version of the conditional probabilities that satisfy, $\forall V \subset L$, $|V|$ finite, $\forall s_V \in \Omega_V$,
\[ \mu(s_V|\bar s_{V^c}) = \pi_V^{\Phi,\bar\Omega}(s_V|\bar s_{V^c}) \quad \forall \bar s \in \bar\Omega. \tag{2.13} \]

Since conditional probabilities are defined almost everywhere, this definition is very similar to the usual one. However, when condition (2.7) holds, the conditional probabilities can be extended everywhere, and are continuous, which is not the case here. We can now state our main result:

Theorem 1. Under assumptions (2.4), (2.5) on $T$, and for $\beta$ large enough, there exist disjoint translation-invariant (hence, tail) sets $\bar\Omega_+, \bar\Omega_- \subset \Omega$ such that $\mu'_+(\bar\Omega_+) = \mu'_-(\bar\Omega_-) = 1$ and an interaction $\Phi$ satisfying a) and b') with $\bar\Omega = \bar\Omega_+ \cup \bar\Omega_-$ such that $\mu'_+$ and $\mu'_-$ are Gibbs measures for the pair $(\Phi, \bar\Omega)$.

Remarks.

1. It should be emphasized that we are able to prove that there exists an interaction for which both $\mu'_+$ and $\mu'_-$ are Gibbsian (in the sense defined above). Moreover, this interaction has stronger decay properties than (2.11), see (3.5) below. Thus, the result is different from the one of Maes and Vande Velde [28] on the projected Gibbs measure: they show that both the "plus" and the "minus" projected measures are Gibbsian (in the same sense as here) for some interaction, but not necessarily for the same one.

2. One can distinguish different types of models where "pathologies" occur. Our framework is the one where the pathologies are the weakest. In the case of [28], it is an open question whether one can take the same interaction for $\mu'_+$ and $\mu'_-$. But if one combines projection with enough decimation, as in [24], then one knows that each of the resulting states is Gibbsian (in the strongest sense, i.e. with interactions satisfying (2.7)), but for different interactions. This in turn implies that non-trivial convex combinations of these states are not quasilocal everywhere, see [36], where other examples of "robust" non-Gibbsianness can be found.

3. Note that in the theory of "unbounded spins" with long range interactions, a set of "allowed" configurations has to be introduced, where a bound like (2.11) holds [13].
But here, of course, contrary to the unbounded spins models, each $\|\Phi_X\|_\infty$ is finite. We even have the bound $\|\Phi_X\|_\infty \le C\beta|X|$, see (3.4) below.

4. The set $\bar\Omega = \bar\Omega_+ \cup \bar\Omega_-$ is not “nice” topologically: e.g. it has an empty interior (in the usual product topology). Besides, our effective potentials do not belong to a natural Banach space like the one defined by (2.7). However, this underlines the fact that the concept of Gibbs measure is a measure-theoretic notion and the latter often does not match with topological notions.

5. We can regard $\{s_x\}$ as a set of quenched external fields coupled to the spins $\sigma$ by the probability kernels $T$. The distribution of $\{s_x\}$ is given by $\mu'_+$ or $\mu'_-$. Then, as in most disordered systems, there is a set of “good” configurations of the random fields ($\bar\Omega$ here) for which the system with the spins $\sigma$ has good clustering properties. And, implicitly, we shall use the latter to construct our Hamiltonian. This is why part of the proof below uses the techniques of [2, 12]. Of course, there are also “bad” configurations of the random fields for which the $\sigma$ spins do not have good clustering properties, and those are essentially the ones used in [35] to prove that there is no absolutely summable potential for which $\mu'_+$, $\mu'_-$ are Gibbsian.

6. To illustrate the role of the set $\bar\Omega$, consider the (trivial) case where $b = 1$ and $T = \delta(\sigma_i - s_x)$ with $i = x$, i.e. the “renormalized” system is identical to the original one (this example was suggested to us by A. Sokal). Then $\bar\Omega$, as constructed in our proof, will be the set of configurations such that all the (usual) Ising contours are finite and each site is surrounded by at most a finite number of contours. When $X$ = a contour $\gamma$ (considered as a suitable set of sites), we let

$$\Phi_X(s_X) = 2\beta|\gamma| \tag{2.14}$$

for $s_X$ = a configuration making $\gamma$ a contour, and $\Phi_X(s_X) = 0$ otherwise. Obviously, this $\Phi$ satisfies (2.11) but not (2.7). One can write $\bar\Omega = \bar\Omega_+ \cup \bar\Omega_-$, according to the values of the spins in the infinite connected component of the complement of the contours. It is easy to see that $\mu^+$, $\mu^-$ are indeed, at low temperatures, Gibbs measures (in the sense considered here) for this new interaction: a Peierls argument shows that $\mu^+(\bar\Omega_+) = \mu^-(\bar\Omega_-) = 1$, and for $s \in \bar\Omega$ the (formal) Hamiltonian (2.1) is $\beta H = 2\beta \sum_\gamma |\gamma|$. Actually, we shall prove the theorem by using a kind of perturbative analysis around this example. Of course, in this example one could alternatively take $\bar\Omega = \Omega$ and $\Phi$ = the original nearest-neighbor interaction; this shows the nonuniqueness of the pair $(\Phi, \bar\Omega)$ in our generalized Gibbs-measure framework.

3. Outline of the Proof

3.1. The main propositions. We shall construct the interaction $\Phi$ inductively. We shall now give the strategy and indicate the different steps of the proof. Consider $\mu'_+$; using (2.3), one sees that the conditional probabilities $\mu'_+(s_V | \bar s_{V^c})$ can be obtained through the following limit, if it exists:

$$\mu'_+(s_V | \bar s_{V^c}) = \lim_{\Lambda_2 \uparrow \mathcal{L}} \lim_{\Lambda_1 \uparrow \mathbb{Z}^d} \frac{Z^+_{\Lambda_1}(s_V \vee \bar s_{V^c \cap \Lambda_2})}{\sum_{\tilde s_V} Z^+_{\Lambda_1}(\tilde s_V \vee \bar s_{V^c \cap \Lambda_2})}, \tag{3.1}$$

where

$$Z^+_{\Lambda_1}(s_{\Lambda_2}) = \sum_{\sigma_{\Lambda_1}} \prod_{x \in \Lambda_2} T_x \, e^{-\beta H(\sigma_{\Lambda_1}|+)}, \tag{3.2}$$

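where $H(\sigma_{\Lambda_1}|+)$ is defined as in (2.1), but with the sum restricted to $i \in \Lambda_1$, with $\sigma_j = +1$, $\forall j \in \Lambda_1^c$. The conditional probabilities $\mu'_-(s_V | \bar s_{V^c})$ can be obtained by similar formulas, with $+$ replaced by $-$. We shall prove

Proposition 1. Under assumptions (2.4), (2.5), there exists, for all $\eta > 0$, a $\bar\beta < \infty$, such that, $\forall \beta \ge \bar\beta$ in (2.1), (2.5), there exists a set $\bar\Omega \subset \Omega$, $\bar\Omega = \bar\Omega_+ \cup \bar\Omega_-$, and an interaction $\Phi$, such that, $\forall s^1, s^2 \in \bar\Omega_+$, $\forall V \subset \mathcal{L}$, $|V| < \infty$,

$$\lim_{\Lambda_2 \uparrow \mathcal{L}} \lim_{\Lambda_1 \uparrow \mathbb{Z}^d} \frac{Z^+_{\Lambda_1}(s^1_{\Lambda_2})}{Z^+_{\Lambda_1}(s^2_{\Lambda_2})} = \exp\Bigl( \sum_{X \cap V \ne \emptyset} \bigl( \Phi_X(s^1_X) - \Phi_X(s^2_X) \bigr) \Bigr) \tag{3.3}$$

if $s^1_x = s^2_x$ $\forall x \notin V$. The functions $\Phi_X$ satisfy:

$$\|\Phi_X\|_\infty \le c\beta|X| \tag{3.4}$$

for some $c < \infty$, and, $\forall x \in \mathcal{L}$, $\forall s \in \bar\Omega$,

$$\sum_{X \ni x} |\Phi_X(s_X)| \exp\bigl(d(X)^{1-\eta}\bigr) = C(x, s) < \infty, \tag{3.5}$$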

where $d(X) = \mathrm{diam}(X)$. A formula similar to (3.3) holds with $+$ replaced by $-$ for all $s^1, s^2 \in \bar\Omega_-$.

Remark. The factor $\exp(d(X)^{1-\eta})$ is not optimal; we could replace it by $\exp(d(X)^{1-\eta} + |X|^{1-\eta})$; but we expect $|\Phi_X(s_X)|$ to decay as $\exp(-d(X) - |X|)$.



Proposition 2. $\mu'_+(\bar\Omega_+) = \mu'_-(\bar\Omega_-) = 1$.

Clearly, (1.1) and these two Propositions imply Theorem 1.

Remark. The set $\bar\Omega$ will be of measure zero for Gibbs measures which are not convex combinations of $\mu'_+$ and $\mu'_-$, such as the non-translation-invariant Gibbs measures with interfaces that exist for $d \ge 3$ [6]. An open question is to find another set $\tilde\Omega$, of $\mu'$ measure one, for the renormalized measure $\mu'$ corresponding to a non-translation invariant Gibbs measure $\mu$ (for $d \ge 3$), and an interaction with respect to which $\mu'$ is Gibbsian.

3.2. The contour representation. To prove these propositions, we shall use a “contour”, or “polymer” representation of $Z^+_{\Lambda_1}(s_{\Lambda_2})$. But, since we regard the external spins as random fields acting on the internal ones, we shall first define the sets where the external spins are “bad”, namely where they change sign and exert opposite influences on the internal spins. Let

$$D(s) = \bigcup \{ B_x \mid x \in \mathcal{L},\ \exists y \in \mathcal{L},\ |x - y| = b,\ s_x \ne s_y \}, \tag{3.6}$$

where we use throughout the paper:

$$|x| = \max_{i=1,\dots,d} |x_i|. \tag{3.7}$$
$D(s)$ is determined by the set of (ordinary) contours of the configuration $s$. Define also

$$D_{\Lambda_2} \equiv D^+(s_{\Lambda_2}) \tag{3.8}$$

with

$$D^+(s_{\Lambda_2}) = \bigcup \{ B_x \mid x \in \Lambda_2,\ \exists y \in \Lambda_2,\ |x - y| = b,\ s_x \ne s_y \} \ \cup\ \bigcup \{ B_x \mid x \in \partial\Lambda_2,\ s_x = -1 \}, \tag{3.9}$$

where

$$\partial\Lambda_2 = \{ x \in \Lambda_2,\ d(x, \Lambda_2^c) = b \}. \tag{3.10}$$

($d$ is the distance corresponding to (3.7).) $D^-(s_{\Lambda_2})$ is defined similarly, with $s_x = +1$ instead of $s_x = -1$. Now, we introduce the “contours” of the internal spins: let, for each term in (3.2),

$$\underline\Gamma(\sigma_{\Lambda_1}|+) = \bigcup \{ B_x \mid x \in \Lambda_2,\ \forall i \in B_x,\ \sigma_i \ne s_x \} \ \cup\ \bigcup \{ B_x \subset \Lambda_1 \mid \exists \langle ij \rangle,\ i \in B_x,\ \sigma_i \ne \sigma_j \} \ \cup\ D_{\Lambda_2}, \tag{3.11}$$

where $\sigma_i = +1$ for $i \notin \Lambda_1$. So $\underline\Gamma$ includes the boxes $B_x$ where all the internal spins differ from $s_x$, and the boxes intersected by the usual contours of the configuration $\sigma_{\Lambda_1}$, plus all the sites in $D_{\Lambda_2}$. Of course, these sets are not disjoint: in a box $B_x$ belonging to $D_{\Lambda_2}$, $x \notin \partial\Lambda_2$, either all internal spins in $B_x$ differ from $s_x$ or all internal spins differ from $s_y \ne s_x$ in some box $B_y$ with $|x - y| = b$, or there is a pair $\langle ij \rangle$ with $\sigma_i \ne \sigma_j$ in $B_x \cup B_y$. We include $D_{\Lambda_2}$ in the contours, and we coarse-grain them into $b$-boxes for convenience. Note that in all boxes $B_x$ not contained in $\underline\Gamma$, the internal spins are constant (and, in $\Lambda_2$, are equal to the external ones). So, one may decompose $\underline\Gamma = \bigcup \underline\gamma$ into connected components (a subset $Y$ of $\mathbb{Z}^d$ is connected if any two points of $Y$ can be joined by a path $(i_\alpha)$, with $|i_\alpha - i_{\alpha+1}| = 1$), and one may define contours as pairs $\gamma = (\underline\gamma, \sigma(\gamma))$, where $\underline\gamma$ is the support of the contour, and $\sigma(\gamma)$ is a configuration $\{\sigma_i(\gamma)\}_{i \in \underline\gamma^c}$, $\sigma_i(\gamma) = +1$ or $-1$, defined on the complement of $\underline\gamma$, which is constant on the connected components of $\underline\gamma^c$ (this notion of contour will be slightly generalized below).

Definition. A family of contours $\Gamma$ is compatible if the supports of the contours are mutually disjoint: $\underline\gamma_1 \cap \underline\gamma_2 = \emptyset$, and if their signs match and agree with the boundary conditions on $\Lambda_1$.

So, the notion of compatibility is as for the usual Ising contours, and if $\Gamma$ is compatible, $\sigma_i(\Gamma)$ is unambiguously defined, $\forall i \in \underline\Gamma^c$.

Definition. A family of contours $\Gamma$ is $s$-compatible if $\Gamma$ is compatible and, moreover, $\sigma_x(\Gamma) = s_x$ $\forall x \in (\Lambda_2 \cap \mathcal{L}) \setminus \underline\Gamma$.

The notion of $s$-compatibility imposes a constraint due to the external spins. For example, if all the external spins have value $+1$, a single (Ising) contour surrounding $\Lambda_2$, with $+$ spins outside and $-$ spins inside, is compatible but is not $s$-compatible. One may write:

$$Z^+_{\Lambda_1}(s_{\Lambda_2}) = (T)^{|\Lambda_2|} \sum_{\underline\Gamma \supset D_{\Lambda_2}} \rho(\Gamma), \tag{3.12}$$

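where the sum runs over $s$-compatible families of contours with $\underline\Gamma \subset \Lambda_1$, $\rho(\Gamma) = \prod_{\gamma \in \Gamma} \rho(\gamma)$ with $\rho(\gamma) = 0$ if $\underline\gamma$ does not contain the connected components of $D_{\Lambda_2}$ that it intersects or if $\sigma_x(\Gamma) \ne s_x$, for some $x$ with $B_x$ adjacent to $\underline\gamma$; it equals

$$\rho(\gamma) = \sum^{*}_{\sigma_\gamma} \prod_{x \in \underline\gamma \cap \mathcal{L}} \frac{T_x}{T} \exp\bigl(-\beta H(\sigma_\gamma | \sigma(\gamma))\bigr) \tag{3.13}$$

otherwise. Here $H(\sigma_\gamma | \sigma(\gamma))$ is defined in the same way as (2.8) but with the Hamiltonian (2.1); $\sigma_i(\gamma)$, $i \notin \underline\gamma$, is fixed by the signs associated to the complement of $\underline\gamma$, and the sum $\sum^{*}$ runs over spin configurations $\sigma_\gamma$ such that $\gamma$ is a contour of the configuration $\sigma_\gamma \vee \sigma(\gamma)$ (and $\rho(\gamma) = 0$ if the sum is empty); $\rho(\gamma)$ is a function of the external spins $\{s_x \mid x \in \Lambda_2,\ d(x, \underline\gamma \cap \mathcal{L}) \le 2b\}$, since it vanishes unless $\underline\gamma$ contains the connected components of $D_{\Lambda_2}$ that it intersects (observe that the property, for a set $D_i$, to be a connected component of $D_{\Lambda_2}$ depends on $\{s_x \mid x \in \Lambda_2,\ d(x, D_i \cap \mathcal{L}) \le 2b\}$). The fact that we sum in (3.12) over $s$-compatible families introduces a global constraint on the set of contours which will be characterized explicitly in Lemma 1 below. It is easy to see that

$$0 \le \rho(\gamma) \le \exp\bigl(-\beta_0 |\underline\gamma \setminus (D_{\Lambda_2} \cap \partial\Lambda_2)|\bigr), \tag{3.14}$$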

where $\beta_0 = \beta_0(b, \beta)$ depends on the choice of $b$ in the definition of $\mathcal{L}$ and goes to infinity as $\beta$ in (2.1) goes to infinity. To prove (3.14), observe that one gets a factor $e^{-2\beta}$ from the Hamiltonian (2.1) for each pair $\langle ij \rangle$ with $\sigma_i \ne \sigma_j$, and a factor $e^{-\beta}$ for each box $B_x$ such that $\forall i \in B_x$, $\sigma_i \ne s_x$, from our assumption (2.5) on the probability kernels $T_x$; for the boxes $B_x$ in $D_{\Lambda_2}$, but with $x \notin \partial\Lambda_2$, we use the observation made above that in or near each such box, either there is a pair $\langle ij \rangle$ with $\sigma_i \ne \sigma_j$ or all the internal spins differ from the external one. Finally, using the lower bound (2.6) on $T$, we get (3.14) for

$$\beta_0 = c\beta, \tag{3.15}$$

with some $c > 0$ (which has to be taken small enough because, in the above argument, we implicitly assigned the same factor $e^{-2\beta}$ or $e^{-\beta}$ to different sites of $\underline\gamma$).

3.3. Renormalization. Let us now introduce the coarse-grained description of the system on which our inductive scheme is based. Let $L > b$ be some odd integer (which will be taken large enough below). Divide $\mathbb{Z}^d$ into disjoint $L$-boxes $\{ i \mid |i - Lx| < \frac{L}{2} \}$, where $x \in \mathbb{Z}^d$, i.e. each $i \in \mathbb{Z}^d$ can be written as $i = Lx + j$ with $x \in \mathbb{Z}^d$ and $|j_\mu| < \frac{L}{2}$, $\mu = 1, \dots, d$ (here and below, we use the letters $x, y$ to denote sites in the new lattices $\mathbb{Z}^d$). We define $[L^{-1} i] = x$ and the $L$-box of sites $i$ such that $[L^{-1} i] = x$ is denoted by $Lx$. Also for a set $Y \subset \mathbb{Z}^d$, $[\frac{Y}{L}] = \{ [L^{-1} i] \mid i \in Y \}$, while $LY = \cup \{ Lx \mid x \in Y \}$. We use a similar notation for all scales $L^n$, $n = 1, 2, \dots$. We shall now describe $D_{\Lambda_2}$ and $D(s)$ on these different coarse-grained scales. Let us introduce the random variables $N^n_x = N^n_x(s)$, $n = 0, 1, 2, \dots$, defined inductively as follows:

$$N^0_x = 2d \ \text{ if } x \in D(s), \qquad N^0_x = 0 \ \text{ otherwise}, \tag{3.16}$$

$$N^{n+1}_{x'} = L^{-1+\eta} \sum_{y \in Lx' \setminus \mathcal{D}^n(s)} N^n_y, \tag{3.17}$$

where

$$\mathcal{D}^n(s) = \{ D_i \mid D_i \text{ is a connected component of } D^n(s),\ |D_i| \le L^{\alpha},\ N^n(D_i) \le L^{-3\alpha} \}, \tag{3.18}$$

with $\eta$ as in Proposition 1, $\alpha = \frac{\eta}{4}$, $D^0(s) = D(s)$;

$$N^n(Y) = \sum_{x \in Y} N^n_x, \tag{3.19}$$

for $Y \subset \mathbb{Z}^d$, and

$$D^{n+1}(s) = \bigl[ L^{-1}\, \overline{(D^n(s) \setminus \mathcal{D}^n(s))} \bigr], \tag{3.20}$$

where, for a set $Y \subset \mathbb{Z}^d$, we write:

$$\overline{Y} = \{ i \mid d(i, Y) \le 1 \}. \tag{3.21}$$

It is easy to see inductively that

$$N^n_x = 0 \ \text{ if } x \notin D^n(s) \tag{3.22}$$

and

$$D^n(s) \subset \{ x \mid N^n_x \ne 0 \}. \tag{3.23}$$

We define also variables $N^n_{x,\Lambda_2}$ and sets $D^n_{\Lambda_2}$, $\mathcal{D}^n_{\Lambda_2}$, $n = 1, 2, \dots$, by the same formulas (3.17, 3.20), but starting with $D_{\Lambda_2}$ instead of $D(s)$ in (3.16).

Then, we define, $\forall x \in \mathbb{Z}^d$,

$$\bar\Omega_x = \{ s \in \Omega \mid \exists n(x)\ \forall n \ge n(x),\ N^n_x = 0 \}$$

and

$$\bar\Omega = \bigcap_{x \in \mathbb{Z}^d} \bar\Omega_x. \tag{3.24}$$
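The blocking operation (3.16)–(3.21) can be made concrete with a short Python sketch of one step $n \to n+1$ on a finite set of sites. This is only an illustration that follows the formulas as they are written above: the helper names (`neighbours`, `components`, `coarse`, `blocking_step`) are hypothetical, sets of sites are Python sets of integer tuples, $N^n$ is a dictionary holding the non-zero values, and connectivity is taken with respect to the sup norm (3.7).

```python
from itertools import product


def neighbours(site):
    """Sites at sup-norm distance exactly 1 from `site`."""
    return [tuple(c + o for c, o in zip(site, off))
            for off in product((-1, 0, 1), repeat=len(site)) if any(off)]


def components(sites):
    """Connected components of a finite set of sites (sup-norm adjacency)."""
    remaining, comps = set(sites), []
    while remaining:
        stack = [remaining.pop()]
        comp = set(stack)
        while stack:
            for nb in neighbours(stack.pop()):
                if nb in remaining:
                    remaining.remove(nb)
                    comp.add(nb)
                    stack.append(nb)
        comps.append(comp)
    return comps


def coarse(site, L):
    """[L^{-1} i]: label x of the L-box containing i = Lx + j, |j_mu| < L/2 (L odd)."""
    return tuple((c + (L - 1) // 2) // L for c in site)


def blocking_step(D, N, L, eta):
    """One step of (3.17)-(3.20): returns (D^{n+1}, N^{n+1})."""
    alpha = eta / 4
    small = set()  # union of the "small" components, cf. (3.18)
    for comp in components(D):
        weight = sum(N.get(y, 0.0) for y in comp)
        if len(comp) <= L ** alpha and weight <= L ** (-3 * alpha):
            small |= comp
    large = set(D) - small
    # The bar in (3.20): add all sites at distance <= 1 from the large part.
    thick = large | {nb for y in large for nb in neighbours(y)}
    D_next = {coarse(i, L) for i in thick}
    N_next = {}
    for y, value in N.items():  # (3.17): sum N over the box, small parts removed
        if y not in small and value:
            x = coarse(y, L)
            N_next[x] = N_next.get(x, 0.0) + L ** (-1 + eta) * value
    return D_next, N_next
```

Iterating `blocking_step` until the returned set is empty mimics, on a finite sample, the condition defining $\bar\Omega_x$ in (3.24): a configuration is “good” at $x$ when the variables $N^n_x$ vanish from some scale on.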

To understand intuitively the meaning of $\bar\Omega$, observe that iterating the operation (3.20) removes, at each step, the “small” connected components of $D(s)$, and “glues” or “blocks” together the “large” ones that are not too far from each other. Then, the configurations in $\bar\Omega$ are those for which this operation ends, after finitely many steps, for each sequence of $L^n$-boxes labelled by a given site of $\mathbb{Z}^d$. Note that in the configurations used in [35] to construct “counterexamples”, $D(s)$ covers an infinite connected subset of the lattice (and obviously $N^n_0 \ne 0$, for all large enough $n$'s).

Now we can formulate the inductive representation for the partition function which will be used to prove Proposition 1. We shall need a somewhat more general notion of contour: here and below, a contour on scale $n$ will be a triple $\gamma = (\underline\gamma, \hat\gamma, \sigma(\gamma))$, where $\underline\gamma$ is a connected subset of $\mathbb{Z}^d$, $\hat\gamma \subset \underline\gamma$ ($\hat\gamma$ is not necessarily connected) and $\sigma(\gamma)$ is a collection of signs $\sigma_x(\gamma)$, $x \in \mathbb{Z}^d$, on the complement of $\underline\gamma$ which are constant on the connected components of the complement of $\underline\gamma$. However, when $\underline\gamma = D_i$, for $D_i$ a connected component of $D^n_{\Lambda_2}$, we shall have $\underline\gamma = \hat\gamma$, and we shall simply denote $\gamma$ by $D_i$. For those contours, on scale $n = 0$, the signs $\sigma(D_i)$ coincide with the values of the external spins in $D_i^c$. We shall define below (at the end of the proof of Proposition 3) “renormalized” values $s^n_x$ of the external spins, for each $x \in [L^{-n}\Lambda_2] \setminus D^n_{\Lambda_2}$. On scale $n$, the signs $\sigma(D_i)$ will also coincide with the values $s^n$ of the external spins in $D_i^c$. As before, a set of contours $\Gamma$ is compatible if $\underline\gamma_1 \cap \underline\gamma_2 = \emptyset$, $\forall \gamma_1, \gamma_2 \in \Gamma$, and if the signs match (among the contours and with the boundary conditions on $\Lambda_1$). It is $s^n$-compatible, on scale $n$, if, moreover, $\sigma_x(\Gamma) = s^n_x$, $\forall x \in \mathbb{Z}^d \setminus \underline\Gamma$.
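We shall derive inductively the following representation for the partition function (3.12):

$$Z^+_{\Lambda_1}(s_{\Lambda_2}) = e^{f^n_{+,\Lambda_1}(s_{\Lambda_2})} \exp\Bigl( \sum_{X \subset \Lambda_2} \Phi^n_X(s_X) \Bigr) \tilde Z^+_{\Lambda_1}(s_{\Lambda_2}), \tag{3.25}$$

where $f^n_{+,\Lambda_1}(s_{\Lambda_2})$ corresponds to a “bulk” free energy. Since in Proposition 1 we study a ratio of partition functions, we need only to bound the difference between two $f^n_{+,\Lambda_1}(s_{\Lambda_2})$, for different $s_{\Lambda_2}$'s, and this is done in (3.32) below. $\Phi^n_X$ will converge, as $n \to \infty$, to the interactions $\Phi_X$, while

$$\tilde Z^+_{\Lambda_1}(s_{\Lambda_2}) = \sum_{\Gamma} \rho^n(\Gamma) \exp(W^n(\Gamma)), \tag{3.26}$$

where the sum runs over $s^n$-compatible sets of contours, with $\underline\Gamma \equiv \cup_{\gamma \in \Gamma} \underline\gamma \subset [L^{-n}\Lambda_1]$ and the constraint $\hat\Gamma \equiv \cup_{\gamma \in \Gamma} \hat\gamma \supset D^n_{\Lambda_2}$,

$$\rho^n(\Gamma) = \prod_{\gamma \in \Gamma} \rho^n(\gamma). \tag{3.27}$$

$\rho^n(\gamma)$ is, for $n \ge 1$, a function of $\{ s_x \mid x \in L^n \underline\gamma \cap \Lambda_2 \}$ while

$$W^n(\Gamma) = \sum_{Y \subset [L^{-n}\Lambda_1]} \Psi^n(Y, \Gamma), \tag{3.28}$$

where $\Psi^n(Y, \Gamma)$ is, for $n \ge 1$, a function of $\{ s_x \mid x \in L^n Y \cap \Lambda_2 \}$. We shall prove that, eventually, $\tilde Z^+_{\Lambda_1}(s_{\Lambda_2}) \to 1$ as $\Lambda_2 \uparrow \mathcal{L}$. Proposition 1 will then follow easily from such a representation. In the proposition below, we collect the bounds satisfied by $\rho^n(\gamma)$, $\Psi^n(Y, \Gamma)$ and $f^n_{+,\Lambda_1}(s_{\Lambda_2})$: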



Proposition 3. Under the hypotheses of Proposition 1, $\forall s_{\Lambda_2} \in \bar\Omega_{\Lambda_2}$, and for $n$ such that $|[L^{-n}\Lambda_2]| \ge L^d$, (3.25), (3.26) hold, where $\Phi^n_X$ satisfies (3.4), (3.5) uniformly in $n$,

$$0 \le \rho^n(\gamma) \le \exp\bigl( \beta_n k_n N^n_{\Lambda_2}(\gamma) - \beta_n |\underline\gamma \setminus D^n_{\Lambda_2}| \bigr) \tag{3.29}$$

and, for each connected component $D_i$ of $D^n_{\Lambda_2}$,

$$\rho^n(D_i) \ge \exp\bigl( -\beta_n k_n N^n_{\Lambda_2}(D_i) \bigr). \tag{3.30}$$

Moreover,

$$|\Psi^n(Y, \Gamma)| \le e^{-\beta_n |Y|}; \tag{3.31}$$

$\Psi^n(Y, \Gamma)$ depends on $\Gamma$ only through $\Gamma \cap Y \equiv \{ \gamma \mid \underline\gamma \cap Y \ne \emptyset \}$ and $\Psi^n(Y, \Gamma) = 0$ unless $Y$ is connected and $\underline\Gamma \cap Y \ne \emptyset$. Finally, $\forall s^1_{\Lambda_2}, s^2_{\Lambda_2} \in \bar\Omega_{\Lambda_2}$, with $s^1_x = s^2_x$, $\forall x \notin V$, and for $n$ such that $d([L^{-n}V], [L^{-n}\Lambda_2]^c) \ge L$,

$$|f^n_{+,\Lambda_1}(s^1_{\Lambda_2}) - f^n_{+,\Lambda_1}(s^2_{\Lambda_2})| \le C|V| \exp\bigl( -(d(V, \Lambda_2^c))^{1-2\eta} \bigr) \tag{3.32}$$

for some $C < \infty$, and

$$\beta_n = L^{(1-\eta)n} \beta_0, \qquad k_n = k_0 - L^{-n}, \tag{3.33}$$

where $k_0 < \infty$, $\beta_0 = c\beta$ and $\eta = 4\alpha$. Similar formulas and bounds hold for $Z^-_{\Lambda_1}$, with a function $f^n_{-,\Lambda_1}$ satisfying (3.32), and $D_{\Lambda_2}$ ($= D^+(s_{\Lambda_2})$) replaced by $D^-(s_{\Lambda_2})$.

Remarks. 1. We shall see in the proof (Eq. (4.18) below) that, at each scale, there are two contributions to $\Phi^n_X$. One is given by $\ln \rho^n(D_i)$, and is similar to the contour energy in (2.14). The other contribution comes from the sum over the internal spins that introduces “interactions” between the contours of the external spins, and gives rise to the last term in (4.18).

2. The restriction on $n$ in the proposition is technical: for larger $n$'s, all the statements would remain true, except that $\beta_n$ would no longer increase as in (3.33). However, in the limit $\Lambda_2 \uparrow \mathcal{L}$, the largest value of $n$ to which the proposition applies increases to infinity.

3. In the proofs, we shall denote by $c$ or $C$ a generic constant that depends only on the lattice dimension or on $b$, but not on the choice of $L$. This constant may vary from place to place. We shall use $C(L)$ to denote a generic constant that may depend also on $L$. We shall assume that $L$ is chosen large enough (given $\eta = 4\alpha$ in Proposition 1) so that inequalities like $C \le L^{\alpha}$ can be used. Besides, we shall assume that $\beta_0 = c\beta$ (see (3.15)) is large enough, so that inequalities like $C(L) \le \beta_0$ can be used.

4. Proofs

Proof of Proposition 3. The proof will be made for the $+$ boundary conditions. The proof for the $-$ boundary condition is similar, but we shall indicate which quantities may depend on the boundary conditions. Although the proof is rather technical, the main idea is quite simple: we cannot take directly the logarithm of $Z^+_{\Lambda_1}(s_{\Lambda_2})$, wherever $s_{\Lambda_2}$ is not constant, because the change of signs of $s_{\Lambda_2}$ in $D_{\Lambda_2}$ introduces constraints in that partition function, i.e. it forces the presence of contours. We define a sort of local partition function, $\rho(\mathcal{D}_{\Lambda_2})$ (see (4.1) below), containing the contours that are constrained only by the “small” connected components of $D_{\Lambda_2}$, and factor it out of the sum. Then the sum over the contours that do not intersect $(D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2})$ can be exponentiated via the usual polymer formalism. The exponent is divided into three parts: the terms that are not inside $\Lambda_2$ contribute to $f^n_{+,\Lambda_1}(s_{\Lambda_2})$ (see (4.20)), those that are inside $\Lambda_2$ but do not depend on the remaining contours (i.e. those that intersect $(D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2})$) contribute to $\Phi^n(X)$ (see (4.18)), and finally those that depend on the remaining contours contribute to $\Psi^n(Y, \Gamma)$ (see (4.19, 4.14)). Then, we “block” the remaining contours and iterate the operation. By definition of $\bar\Omega$, eventually $D_{\Lambda_2}$ becomes empty, there are no constraints left and $\tilde Z^+_{\Lambda_1}(s_{\Lambda_2})$ converges to 1 (see (4.34)).

Turning to the proof, we see that, for $n = 0$, (3.25) and (3.26) follow from (3.12) with $\hat\Gamma = \underline\Gamma$, $\Phi^0_X = 0$, $\Psi^0(Y, \Gamma) = 0$, $f^0_{+,\Lambda_1}(s_{\Lambda_2}) = |\Lambda_2| \ln T$ (which is independent of $s_{\Lambda_2}$, so that (3.32) holds trivially) and $\rho^0(\gamma) = \rho(\gamma)$. The bounds on $\rho(\gamma)$ will be discussed later (see proof of Lemma 5). Now assume that the proposition holds for $n$, and let us prove it for $n + 1$. We shall delete the indices $n$, $n + 1$ and denote by a prime the scale $n + 1$. Let

$$\rho(\mathcal{D}_{\Lambda_2}) = \prod_{D_i \in \mathcal{D}_{\Lambda_2}} \rho(D_i), \tag{4.1}$$

and write (3.26) as:

$$\tilde Z = \rho(\mathcal{D}_{\Lambda_2}) \sum_{\Gamma} \prod_{\gamma \in \Gamma} \bar\rho(\gamma) \exp(W(\Gamma)). \tag{4.2}$$

$W(\Gamma)$ was defined in (3.28), and

$$\bar\rho(\gamma) = \frac{\rho(\gamma)}{\prod_i \rho(D_i)}, \tag{4.3}$$

where the product runs over $D_i \subset \underline\gamma$, $D_i \in \mathcal{D}_{\Lambda_2}$. In (4.2), we use the fact that $\hat\Gamma \supset D_{\Lambda_2} \supset \mathcal{D}_{\Lambda_2}$. A contour $\gamma$ is small if

$$V(\gamma) \not\supset [L^{-n}\Lambda_2] \tag{4.4}$$

and

$$\underline\gamma \cap (D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2}) = \emptyset, \tag{4.5}$$

where $V(\gamma)$ is the complement of the infinite connected component of $\mathbb{Z}^d \setminus \underline\gamma$. A contour is large otherwise. Note that, unlike in [2, 12], the notion of small contour does not refer to the size of $\gamma$, but, basically, to the subset of $D_{\Lambda_2}$ intersected by $\gamma$. It is convenient to include in the large contours those for which $V(\gamma) \supset [L^{-n}\Lambda_2]$. Indeed, as we shall see in Lemma 1 below, the global constraints on families of contours due to the fact that they have to be $s$-compatible can be expressed entirely in terms of those contours. This inclusion is, however, what limits the values of $n$ in Proposition 3: when $|[L^{-n}\Lambda_2]|$ becomes too small, all the contours are “large” and the iteration stops. As we shall see, the bounds (3.29, 3.30) are sufficient to control the sum over the small contours (see Lemma 3 below). So, we rewrite the sum in (4.2) as

$$\sum^{\ell}_{\hat\Gamma_1 \supset D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2}} \bar\rho(\Gamma_1) \exp(W(\Gamma_1)) \sum^{s}_{\Gamma_2} \bar\rho(\Gamma_2) \exp(W(\Gamma_1, \Gamma_2)). \tag{4.6}$$



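The sum $\sum^{\ell}_{\Gamma_1}$ runs over all families of large contours such that $\hat\Gamma_1 \supset (D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2})$ and $\sum^{s}_{\Gamma_2}$ runs over the set $\mathcal{C}_s(\Gamma_1)$ of families of small contours $\Gamma_2$ such that $\Gamma_1 \cup \Gamma_2$ is $s$-compatible and such that $\hat\Gamma_1 \cup \hat\Gamma_2 \supset D_{\Lambda_2}$. If $\mathcal{C}_s(\Gamma_1) = \emptyset$, then $\sum^{s}_{\Gamma_2} = 0$. Finally,

$$W(\Gamma_1, \Gamma_2) = W(\Gamma_1 \cup \Gamma_2) - W(\Gamma_1) = \sum_Y \Psi(Y, \Gamma_1 \cup \Gamma_2) - \sum_Y \Psi(Y, \Gamma_1) \equiv \sum_Y \Psi(Y, \Gamma_1, \Gamma_2), \tag{4.7}$$

where the sums run over $Y \subset [L^{-n}\Lambda_1]$, and the last sum runs over $Y \cap \underline\Gamma_2 \ne \emptyset$ because $\Psi(Y, \Gamma)$ depends on $\Gamma$ only through $\Gamma \cap Y$. Let us first characterize explicitly the constraint that the families of contours have to be $s$-compatible. For that, we define $\mathrm{Out}(\Gamma) = \{ \gamma \in \Gamma \mid V(\gamma) \supset [L^{-n}\Lambda_2] \}$.

Lemma 1. A compatible family of contours $\Gamma$ such that $\hat\Gamma \supset D_{\Lambda_2}$ is $s$-compatible if and only if $\mathrm{Out}(\Gamma) \cup D_{\Lambda_2}$ is $s$-compatible.

The proof of this lemma and of the other ones is given in the Appendix. Using this lemma, one may characterize the families of contours that enter the sum $\sum^{s}_{\Gamma_2}$:

Lemma 2. If $\Gamma_1$ is a family of contours such that

$$\hat\Gamma_1 \supset D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2} \quad\text{and}\quad \mathcal{C}_s(\Gamma_1) \ne \emptyset, \tag{4.8}$$

then $\mathcal{C}_s(\Gamma_1)$ is the set of families of small contours $\Gamma_2$ such that:

1) $\underline\Gamma_2 \cap \underline\Gamma_1 = \emptyset$, (4.9)
2) the signs of the contours in $\Gamma_2 \cup \Gamma_1$ match among themselves and with the boundary conditions on $\Lambda_1$, (4.10)
3) $\hat\Gamma_2 \supset (D_{\Lambda_2} \setminus \hat\Gamma_1)$. (4.11)

We shall show that the sum $\sum^{s}_{\Gamma_2}$ can be exponentiated using this lemma, the bounds (3.29, 3.30) and the standard polymer formalism:

Lemma 3.

$$\sum^{s}_{\Gamma_2} \bar\rho(\Gamma_2) \exp(W(\Gamma_1, \Gamma_2)) = \exp\Bigl( \sum_{Y \subset [L^{-n}\Lambda_1]} \varphi_+(Y, \Gamma_1) \Bigr), \tag{4.12}$$

where $\varphi_+(Y, \Gamma_1)$ is a function of $\{ s_x,\ x \in L^n Y \cap \Lambda_2 \}$, for $n \ge 1$, and of $\{ s_x \mid x \in \Lambda_2,\ d(x, Y) \le 2b \}$ for $n = 0$; $\varphi_+(Y, \Gamma_1)$ depends on $\Gamma_1$ only through $\Gamma_1 \cap Y$. In particular, $\varphi_+(Y, \Gamma_1) = \varphi_+(Y, \emptyset)$ if $\underline\Gamma_1 \cap Y = \emptyset$, and we denote it by $\varphi_+(Y)$ in that case. Moreover, $\varphi_+(Y, \Gamma_1)$ satisfies the bound:

$$|\varphi_+(Y, \Gamma_1)| \le \exp(-\beta L^{-2\alpha} |Y|), \tag{4.13}$$

and $\varphi_+(Y, \Gamma_1) = 0$ unless $Y$ is connected. Finally, one may define $\varphi_-(Y, \Gamma_1)$ with $-$ boundary conditions, and we have $\varphi_+(Y, \Gamma_1) = \varphi_-(Y, \Gamma_1)$, if $L^n Y \subset \Lambda_2$.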



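Now, insert (4.12) in (4.6). We write

$$\sum_{Y \subset [L^{-n}\Lambda_1]} \varphi_+(Y, \Gamma_1) = \sum_{Y \subset [L^{-n}\Lambda_1]} \varphi_+(Y) + \sum_{Y \subset [L^{-n}\Lambda_1]} \bigl( \varphi_+(Y, \Gamma_1) - \varphi_+(Y) \bigr) \chi(Y \cap \underline\Gamma_1 \ne \emptyset) = \sum_{Y \subset [L^{-n}\Lambda_1]} \varphi_+(Y) + \sum_{Y \subset [L^{-n}\Lambda_1]} \tilde\varphi_+(Y, \Gamma_1),$$

where

$$\tilde\varphi_+(Y, \Gamma_1) = \bigl( \varphi_+(Y, \Gamma_1) - \varphi_+(Y) \bigr) \chi(Y \cap \underline\Gamma_1 \ne \emptyset). \tag{4.14}$$

We get, using (4.2, 4.6) and writing $\Gamma$ for $\Gamma_1$,

$$\tilde Z = \rho(\mathcal{D}_{\Lambda_2}) \exp\Bigl( \sum_{Y \subset [L^{-n}\Lambda_1]} \varphi_+(Y) \Bigr) \sum^{\ell}_{\hat\Gamma \supset D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2}} \bar\rho(\Gamma) \exp\Bigl( W(\Gamma) + \sum_{Y \subset [L^{-n}\Lambda_1]} \tilde\varphi_+(Y, \Gamma) \Bigr). \tag{4.15}$$

Write:

$$\sum_{Y \subset [L^{-n}\Lambda_1]} \varphi_+(Y) = \sum_{X \subset \Lambda_2} \sum_{Y} \varphi_+(Y) \chi(L^n Y = X) + \sum_{Y \subset [L^{-n}\Lambda_1]} \varphi_+(Y) \chi(L^n Y \cap \Lambda_2^c \ne \emptyset), \tag{4.16}$$

and a similar formula for $\ln \rho(\mathcal{D}_{\Lambda_2})$. Then, using (3.25), we get:

$$Z^+_{\Lambda_1}(s_{\Lambda_2}) = e^{f'_{+,\Lambda_1}(s_{\Lambda_2})} \exp\Bigl( \sum_{X \subset \Lambda_2} \Phi'_X(s_X) \Bigr) \sum^{\ell}_{\hat\Gamma \supset D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2}} \bar\rho(\Gamma) \exp(\tilde W(\Gamma)), \tag{4.17}$$

where, for $n \ge 1$, and $X \subset \Lambda_2$,

$$\Phi'_X = \Phi_X + \sum_{D_i \in \mathcal{D}_{\Lambda_2}} \ln \rho(D_i) \chi(L^n D_i = X) + \sum_{Y} \varphi_+(Y) \chi(L^n Y = X), \tag{4.18}$$

and, using (3.28),

$$\tilde W(\Gamma) = \sum_{Y \subset [L^{-n}\Lambda_1]} \bigl( \Psi(Y, \Gamma) + \tilde\varphi_+(Y, \Gamma) \bigr), \tag{4.19}$$

while

$$f'_{+,\Lambda_1}(s_{\Lambda_2}) = f_{+,\Lambda_1}(s_{\Lambda_2}) + \sum_{Y \subset [L^{-n}\Lambda_1]} \varphi_+(Y) \chi(L^n Y \cap \Lambda_2^c \ne \emptyset) + \sum_{D_i \in \mathcal{D}_{\Lambda_2}} \ln \rho(D_i) \chi(L^n D_i \cap \Lambda_2^c \ne \emptyset). \tag{4.20}$$

For $n = 0$, one modifies (4.18, 4.20) by replacing $L^n D_i$ by $\{ x \mid d(x, D_i) \le 2b \}$, and $L^n Y$ by $\{ x \mid d(x, Y) \le 2b \}$. Note that all the terms in (4.19) vanish unless $Y \cap \underline\Gamma \ne \emptyset$ (using Proposition 3 and (4.14)).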



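Lemma 4. The functions $\Phi^n_X$, defined inductively by (4.18), are functions of $\{s_x\}_{x \in X}$, are independent of the boundary conditions on $\Lambda_1$, and satisfy the bounds in Proposition 1 uniformly in $n$. The limit $\lim_{n \to \infty} \Phi^n_X = \Phi_X$ exists. Moreover, $f^n_{+,\Lambda_1}(s_{\Lambda_2})$, defined inductively by (4.20), and $f^n_{-,\Lambda_1}(s_{\Lambda_2})$ defined similarly, satisfy (3.32).

Now we shall “block” the terms in

$$\sum^{\ell}_{\hat\Gamma \supset D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2}} \bar\rho(\Gamma) \exp(\tilde W(\Gamma)) \tag{4.21}$$

in order to obtain the representation (3.26) for $\tilde Z$ on the next scale. Let, for each term in (4.21), $\hat\Gamma' = [L^{-1}\underline\Gamma] \cup D'_{\Lambda_2}$ (note that $D'_{\Lambda_2}$ is not necessarily included in $[L^{-1}\underline\Gamma]$, because of the bar in (3.20)) and decompose $\hat\Gamma'$ into connected components: $\hat\Gamma' = \bigcup_i \hat\gamma'_i$. Write also

$$\tilde W(\Gamma) = \sum^{\ell}_{Y \subset [L^{-n}\Lambda_1]} \bigl( U(Y, \Gamma) + \tilde\Psi(Y, \hat\Gamma') \bigr) + \sum_i E(\hat\gamma'_i, \Gamma \cap \hat\gamma'_i), \tag{4.22}$$

where, in $\sum^{\ell}$, we sum only over $Y$ with $d(Y) \ge \frac{L}{4}$, and we define

$$U(Y, \Gamma) = \Psi(Y, \Gamma) + \tilde\varphi_+(Y, \Gamma) - {\min}' \bigl( \Psi(Y, \Gamma) + \tilde\varphi_+(Y, \Gamma) \bigr), \tag{4.23}$$

and

$$\tilde\Psi(Y, \hat\Gamma') = {\min}' \bigl( \Psi(Y, \Gamma) + \tilde\varphi_+(Y, \Gamma) \bigr), \tag{4.24}$$

where $\min'$ is taken over all $\Gamma$ such that $[L^{-1}\underline\Gamma] \cup D'_{\Lambda_2} = \hat\Gamma'$. Finally,

$$E(\hat\gamma', \Gamma \cap \hat\gamma') = \sum_{Y \cap L\hat\gamma' \ne \emptyset} \bigl( \Psi(Y, \Gamma) + \tilde\varphi_+(Y, \Gamma) \bigr), \tag{4.25}$$

where all the terms satisfy $d(Y) < \frac{L}{4}$. Note that those $Y$'s can intersect $L\hat\gamma'$ for at most one $\hat\gamma'$, since, for disconnected sets $Y_1, Y_2$ in $\mathbb{Z}^d$, $d(LY_1, LY_2) \ge L$. Let, for $Y' \subset [L^{-(n+1)}\Lambda_1]$,

$$\Psi'(Y', \hat\Gamma') = \sum^{\ell}_{[L^{-1}Y] = Y'} \tilde\Psi(Y, \hat\Gamma'). \tag{4.26}$$

We get

$$(4.21) = \sum_{\hat\Gamma' \supset D'_{\Lambda_2}} \exp\Bigl( \sum_{Y' \subset [L^{-(n+1)}\Lambda_1]} \Psi'(Y', \hat\Gamma') \Bigr) \sum^{\ell}_{[L^{-1}\underline\Gamma] \cup D'_{\Lambda_2} = \hat\Gamma'} \bar\rho(\Gamma) \exp\Bigl( \sum_i E(\hat\gamma'_i, \Gamma \cap \hat\gamma'_i) + \sum^{\ell}_{Y} U(Y, \Gamma) \Bigr). \tag{4.27}$$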

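We need to do a Mayer expansion on $\exp(\sum^{\ell}_{Y} U(Y, \Gamma))$ in order to factorize the sum over $\Gamma$ in (4.27). We note, for further use, that, by (4.23), $U(Y, \Gamma) \ge 0$. We write

$$\exp\Bigl( \sum^{\ell}_{Y} U(Y, \Gamma) \Bigr) = \prod_{Y} \bigl( e^{U(Y, \Gamma)} - 1 + 1 \bigr) = \sum_{\mathcal{Y}} \prod_{Y \in \mathcal{Y}} \bigl( e^{U(Y, \Gamma)} - 1 \bigr) \equiv \sum_{\mathcal{Y}} V(\mathcal{Y}, \Gamma). \tag{4.28}$$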
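The algebraic identity behind (4.28) can be checked numerically on a small example. The following Python sketch is only an illustration under simplifying assumptions: the sets $Y$ are replaced by indices of a finite list `U` of non-negative numbers, the families $\mathcal{Y}$ by subsets of those indices, and the function name `mayer_expand` is hypothetical.

```python
from itertools import combinations
from math import exp, isclose, prod


def mayer_expand(U):
    """Sum over all sub-families of prod_Y (e^{U_Y} - 1); the empty family contributes 1."""
    f = [exp(u) - 1.0 for u in U]
    total = 0.0
    for k in range(len(f) + 1):
        for family in combinations(f, k):
            total += prod(family)  # prod(()) == 1 for the empty family
    return total


# Example: the expansion reproduces exp(sum U), and every term is >= 0 when U >= 0.
U = [0.3, 0.0, 1.2]
assert isclose(mayer_expand(U), exp(sum(U)))
```

Since each factor $e^{U} - 1$ is non-negative when $U \ge 0$, every term of the expansion is non-negative, which is the positivity property used for $V(\mathcal{Y}, \Gamma)$.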

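Here $V(\mathcal{Y}, \Gamma) \ge 0$. Insert (4.28) in (4.27). We can write the first $\sum^{\ell}$ in (4.27) as:

$$\sum^{\ell}_{[L^{-1}\underline\Gamma] \cup D'_{\Lambda_2} = \hat\Gamma'} \bar\rho(\Gamma) \exp\Bigl( \sum_i E(\hat\gamma'_i, \Gamma \cap \hat\gamma'_i) \Bigr) \sum_{\mathcal{Y}} V(\mathcal{Y}, \Gamma) = \sum_{\underline\Gamma' \supset \hat\Gamma'} \prod_{\gamma'} \rho'(\gamma'), \tag{4.29}$$

where $\underline\Gamma' = \hat\Gamma' \cup [L^{-1}\mathcal{Y}] = \cup \underline\gamma'$ is decomposed into connected components and

$$\rho'(\gamma') = \sum^{\ell}_{(\Gamma, \mathcal{Y})} \bar\rho(\Gamma) V(\mathcal{Y}, \Gamma) \exp\Bigl( \sum_i E(\hat\gamma'_i, \Gamma \cap \hat\gamma'_i) \Bigr), \tag{4.30}$$

where the sum runs over $(\Gamma, \mathcal{Y})$ such that

$$[L^{-1}\underline\Gamma] \cup (D'_{\Lambda_2} \cap \underline\gamma') = \hat\Gamma' \cap \underline\gamma', \tag{4.31}$$

$$[L^{-1}(\underline\Gamma \cup \mathcal{Y})] \cup (D'_{\Lambda_2} \cap \underline\gamma') = \underline\gamma', \tag{4.32}$$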

and such that the signs $\{\sigma(\gamma)\}_{\gamma \in \Gamma}$ are the same for all the terms in the sum. Note that all the terms in (4.30) are positive by (4.23). The factorization of the sum in (4.29) holds because $U(Y, \Gamma)$ depends on $\Gamma$ only through $\Gamma \cap Y$, and all the terms in (4.25) intersect only one $\hat\gamma'$. We define $\gamma' = (\underline\gamma', \hat\Gamma' \cap \underline\gamma', \sigma(\gamma'))$, where $\sigma(\gamma')$ is determined by the common signs of $\{\sigma(\gamma)\}_{\gamma \in \Gamma}$, and

$$\Psi'(Y', \Gamma') = \Psi'(Y', \hat\Gamma'). \tag{4.33}$$

Finally, we define $s'_{x'}$, for each $x' \in [L^{-n-1}\Lambda_2] \setminus D'_{\Lambda_2}$, as the (constant) value of $s_y$, for $y \in Lx' \setminus (\cup_{D \subset \mathcal{D}_{\Lambda_2}} V(D))$. To see that $s_y$ is constant, observe that $Lx' \cap (D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2}) = \emptyset$, since $x' \notin D'_{\Lambda_2}$, and to see that the set $Lx' \setminus (\cup_{D \subset \mathcal{D}_{\Lambda_2}} V(D))$ is not empty, notice that $|V(D)| \le cL^{d\alpha} < |Lx| = L^d$, for $D \subset \mathcal{D}^n_{\Lambda_2}$ and $\alpha$ small. So, $s'_{x'}$ is well-defined. With this definition of $s'$ and of $\sigma(\gamma')$, we see that the sum (4.29) runs over $s'$-compatible families of contours. Inserting (4.29) in (4.27), and combining (4.17, 4.27) we get (3.25, 3.26) on the next scale and the proof of Proposition 3 is finished with:

Lemma 5. $\Psi'(Y', \Gamma')$ defined by (4.26, 4.33) and $\rho'(\gamma')$ defined by (4.30), satisfy the claims of Proposition 3, for $\beta' = L^{1-4\alpha}\beta$.

Remark. Before proving Proposition 1, let us characterize $\bar\Omega_+$ and $\bar\Omega_-$. It is easy to see that, if $s \in \bar\Omega$, each connected component $D_i$ of $D(s)$ is finite. Moreover, for each $x \in \mathcal{L}$, there are at most a finite number of $D_i$'s with $x \in V(D_i)$ (where $V(D_i)$ = the complement of the infinite connected component of $\mathbb{Z}^d \setminus D_i$). Indeed, otherwise, $\forall n$, there exists $m \ge n$ and a connected component $D_i$ of $D^m(s)$ such that $x \in V(D_i)$, $|D_i| \ge L^{\alpha}$, $D_i \cap L^2\{x\} \ne \emptyset$, where $L^2\{x\}$ is the cube of size $L^2$ centered at $x$ (indeed, if this last condition is not satisfied for some $m$, it will hold for a larger $m$, because of the “blocking” in (3.20)). This would imply (by (3.23)) that, $\forall n$, there exists $m \ge n$ and $y \in \mathbb{Z}^d$, with $|x - y| \le L^2 + 1$, and $N^m_y \ne 0$; but since there is a finite number of such $y$'s, this in turn means that $s \notin \bar\Omega$. So, combining these two facts, we see that, for each $s \in \bar\Omega$, there exists a unique infinite $b$-connected set (a subset $Y$ of $\mathcal{L}$ is $b$-connected if any two points of $Y$ can be joined by a path $(x_i)$, $x_i \in \mathcal{L}$ with $|x_i - x_{i+1}| = b$), where $s_x$ is of a given sign and this sign defines $\bar\Omega_+$ and $\bar\Omega_-$.

Now we can give the

Proof of Proposition 1. Let us apply Proposition 3 up to the largest $n$ such that $|[L^{-n}\Lambda_2]| \ge L^d$. For that $n$, $[L^{-n}\Lambda_2] \subset L^2\{0\}$, where $L^2\{0\}$ is the cube of size $L^2$ centered at 0. Since $\Lambda_2 \uparrow \mathcal{L}$, $n \to \infty$. So, we have $[L^{-n}V] = \{0\}$, for $n$ (i.e. $\Lambda_2$) large, since $V$ is fixed. Therefore, we have, for $n$ as above, $d([L^{-n}V], [L^{-n}\Lambda_2]^c) \ge L$ and we can use (3.32). Then, given the bounds on $\Phi^n_X$ and (3.32), it is enough to show that

$$\tilde Z^+_{\Lambda_1}(s_{\Lambda_2}) = 1 + O(e^{-c\beta_n}) \tag{4.34}$$

for $s \in \bar\Omega_+$. We claim that, in that case, for $s \in \bar\Omega_+$ and $n$ large enough,

$$D^n_{\Lambda_2} = \emptyset \tag{4.35}$$

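(which implies, by (3.22), $N^n_{\Lambda_2,x} = 0$, $\forall x$). Postponing the proof of (4.35), and using the representation (3.26), where now $\Gamma = \emptyset$, $\rho(\emptyset) = 1$, $W(\emptyset) = 0$, enters the sum since $D^n_{\Lambda_2} = \emptyset$, we get:

$$\tilde Z^+_{\Lambda_1}(s_{\Lambda_2}) = 1 + \sum_{\Gamma \ne \emptyset} \rho^n(\Gamma) \exp(W^n(\Gamma)), \tag{4.36}$$

where, for each $\gamma$ in the sum, $V(\gamma) \cap [L^{-n}\Lambda_2] \ne \emptyset$ (all other $\gamma$'s were small on the first scale, see (4.4, 4.5)). Now, we use the bound (3.29), with $D^n_{\Lambda_2} = \emptyset$, and $N^n_{\Lambda_2,x} = 0$, $\forall x$. We use also (3.28) and $|W^n(\Gamma)| \le c e^{-\beta_n} |\underline\Gamma|$, which follows from (3.31), and the fact that $\Psi^n(Y, \Gamma) = 0$ unless $Y$ is connected and $Y \cap \underline\Gamma \ne \emptyset$, to get

$$\sum_{\Gamma \ne \emptyset} \rho^n(\Gamma) \exp(W^n(\Gamma)) \le \sum_{\Gamma \ne \emptyset} \exp(-\beta_n |\underline\Gamma| / 2). \tag{4.37}$$

Since for each $\gamma$ in the previous sum, $V(\gamma) \cap [L^{-n}\Lambda_2] \ne \emptyset$, and since $|[L^{-n}\Lambda_2]| \le L^{2d}$, for $n$ as above ($[L^{-n}\Lambda_2] \subset L^2\{0\}$), we have,

$$(4.37) \le \sum_{k \ge 1} \Bigl( \sum_{V(\gamma) \cap [L^{-n}\Lambda_2] \ne \emptyset} \exp(-\beta_n |\underline\gamma| / 2) \Bigr)^k \le \sum_{k \ge 1} \bigl( e^{-c\beta_n} L^{2d} \bigr)^k, \tag{4.38}$$

which proves (4.34). We are left with the proof of (4.35). Now observe that (on scale $n = 0$), if $D_i$ is a connected component of $D_{\Lambda_2} = D^+(s_{\Lambda_2})$ (or of $D^-(s_{\Lambda_2})$) such that $D_i \cap \partial\Lambda_2 = \emptyset$, then $D_i \subset D(s)$. Besides, if $s \in \bar\Omega_+$ and $s_x = -1$, $x \in \partial\Lambda_2$, then $x \in V(D_i)$ for some connected $D_i \subset D(s)$.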



Hence, if $s \in \bar\Omega_+$,

$$D^+(s_{\Lambda_2}) \subset \bigcup \{ V(D_i) \mid D_i \subset D(s),\ V(D_i) \cap \Lambda_2 \ne \emptyset \}.$$

This implies inductively that

$$D^n_{\Lambda_2} \subset \bigcup \{ V(D_i) \mid D_i \subset D^n(s),\ V(D_i) \cap [L^{-n}\Lambda_2] \ne \emptyset \}. \tag{4.39}$$

On the other hand, if $s \in \bar\Omega_+$, there exists $n$ such that, $\forall m \ge n$,

$$D^m(s) \cap L^2\{0\} = \emptyset. \tag{4.40}$$

This implies that

$$V(D^n(s)) \cap L^2\{0\} = \cup_{D_i \subset D^n(s)} V(D_i) \cap L^2\{0\} = \emptyset. \tag{4.41}$$

Indeed, if (4.41) does not hold, and $D^n(s) \cap L^2\{0\} = \emptyset$, then, because of the blocking in (3.20), $D^m(s) \cap L^2\{0\} \ne \emptyset$, for some $m \ge n$. By taking $\Lambda_2$ large, we may assume that this $n$ is the one chosen at the beginning of the proof, in particular that $[L^{-n}\Lambda_2] \subset L^2\{0\}$. Combining this fact and (4.39, 4.41), we get $D^n_{\Lambda_2} = \emptyset$, i.e. (4.35).

To prove Proposition 2 and Lemma 5, we need the following lemma.

Lemma 6. Let $D_i$ be a connected component of $D^n$. Then

$$\sqrt{\beta_0}\, L^{\alpha n / 2} |D_i| \le \beta_n N^n(D_i) \le 2d \beta_0 L^{nd} |D_i|. \tag{4.42}$$

A similar bound holds for $D^n_{\Lambda_2}$, $N^n_{\Lambda_2}(D_i)$.

Now we shall prove that the configurations in $\bar\Omega_+$, $\bar\Omega_-$ are typical for $\mu'_+$, $\mu'_-$:

Proof of Proposition 2. It is enough to show, see (3.24), that, for any $x \in \mathbb{Z}^d$,

$$\mu'_+(\bar\Omega_x^c) = \mu'_-(\bar\Omega_x^c) = 0, \tag{4.43}$$

because, by (3.14) and a simple Peierls argument, one sees that the probability with respect to $\mu'_+$ (resp. $\mu'_-$) of an infinite $b$-connected set of $-$ (resp. $+$) spins is zero, hence $\mu'_+(\bar\Omega_-) = \mu'_-(\bar\Omega_+) = 0$. We shall prove that, $\forall A \subset \mathbb{Z}^d$, and $\forall \{ N^n_x \in L^{-(1-\eta)n}\mathbb{N} \}_{x \in A}$,

$$\mu'_+\Bigl( \prod_{x \in A} \chi_{N^n_x}\, \chi_A \Bigr) \le \exp(-\tilde\beta_n N^n(A)), \tag{4.44}$$

where $\chi_{N^n_x}$ means that the random variable $N^n_x(s)$ defined by (3.16, 3.17) takes the value $N^n_x$ (by (3.17), $N^n_x \in L^{-(1-\eta)n}\mathbb{N}$), $\chi_A$ is the indicator function of the event: “$A$ is a union of connected components of $D^n(s)$”, and

$$\tilde\beta_n = c L^{-\alpha n / 4} \beta_n. \tag{4.45}$$

Then (4.43) follows from

$$\mu'_+(\bar\Omega_x^c) \le \lim_{N \to \infty} \sum_{n=N}^{\infty} \mu'_+(N^n_x \ne 0) \le \lim_{N \to \infty} \sum_{n=N}^{\infty} \sum^{c}_{A \ni x} \exp\bigl( -c \sqrt{\beta_0}\, L^{\alpha n / 4} |A| \bigr) \le \lim_{N \to \infty} C \sum_{n=N}^{\infty} \exp\bigl( -c \sqrt{\beta_0}\, L^{\alpha n / 4} \bigr) = 0, \tag{4.46}$$

where $\sum^{c}$ runs over connected sets, and, in the second inequality, we use (4.44) and the lower bound in (4.42). For $\mu'_-$, we can use the symmetry $\mu'_+(\bar\Omega_x^c) = \mu'_-(\bar\Omega_x^c)$, which follows from (2.4). To prove (4.44), consider first $n = 0$. Using (3.16), it is enough to show:

$$\mu'_+(A \subset D(s)) \le \exp(-c\beta_0 |A|). \tag{4.47}$$

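We have, by definition (3.6) of $D(s)$,

$$\mu'_+(A \subset D(s)) = \lim_{\Lambda_2 \uparrow \mathcal{L}} \lim_{\Lambda_1 \uparrow \mathbb{Z}^d} \frac{\sum^{A} Z^+_{\Lambda_1}(\{ s_x \mid d(x, A) \le b \})}{Z^+_{\Lambda_1}}, \tag{4.48}$$

where the sum runs over $\{ s_x \mid d(x, A) \le b \}$ such that, $\forall x \in A$, $\exists y \in \mathcal{L}$, $|x - y| = b$ and $s_x \ne s_y$. Using the representation (3.12) in the numerator of (4.48), we get:

$$(T)^{-|\Lambda_2|} \sum^{A} Z^+_{\Lambda_1}(\{ s_x \mid d(x, A) \le b \}) \le \sum_{\underline\Gamma \supset B_A} \rho(\Gamma) = \sum^{1}_{\underline\Gamma_1 \supset B_A} \rho(\Gamma_1) \sum^{\Gamma_1}_{\Gamma_2} \rho(\Gamma_2), \tag{4.49}$$

where the first sum runs over $\Gamma_1$ such that $\underline\gamma \cap B_A \ne \emptyset$, $\forall \gamma \in \Gamma_1$, with $B_A = \cup_{x \in A} B_x$, and the second sum runs over $\underline\Gamma_2 \cap B_A = \emptyset$, $\underline\Gamma_2 \cap \underline\Gamma_1 = \emptyset$. Now, for any $\Gamma_1$ in (4.49), $(T)^{-|\Lambda_2|} Z^+_{\Lambda_1} \ge \sum^{\Gamma_1}_{\Gamma_2} \rho(\Gamma_2)$. Inserting this and (4.49) in (4.48), and using the bound (3.14), where we can put $\exp(-\beta_0 |\underline\gamma|)$ in the RHS (by taking $\Lambda_2$ large enough), we get

$$\mu'_+(A \subset D(s)) \le \sum^{1}_{\underline\Gamma_1 \supset B_A} \exp(-\beta_0 |\underline\Gamma_1|) \le \exp(-\beta_0 |A| / 2) \sum^{1}_{\underline\Gamma_1 \supset B_A} \exp(-\beta_0 |\underline\Gamma_1| / 2) \le \exp(-\beta_0 |A| / 2) \exp(c e^{-\beta_0 / 2} |A|) \le \exp(-c\beta_0 |A|) \tag{4.50}$$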

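for $\beta_0$ large enough, using $\underline\Gamma_1 \supset B_A$ in the first inequality, and the fact that each (connected) $\underline\gamma$ in $\Gamma_1$ contains a box in $B_A$ in the second inequality, which implies

$$\sum^{1}_{\underline\Gamma_1 \supset B_A} \exp(-\beta_0 |\underline\Gamma_1| / 2) \le \prod_{i \in B_A} \Bigl( 1 + \sum_{\underline\gamma \ni i} e^{-\beta_0 |\underline\gamma| / 2} \Bigr) \le \exp(c' e^{-\beta_0 / 2} |B_A|) \le \exp(c e^{-\beta_0 / 2} |A|) \tag{4.51}$$

with $c = c' b^d$. This proves (4.47), i.e. (4.44), for $n = 0$. Then we proceed inductively: we have, by (3.17, 3.20),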


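$$\mu'_+\Bigl( \prod_{x' \in A'} \chi_{N'_{x'}}\, \chi_{A'} \Bigr) \le \sum_{[L^{-1}A] = A',\ \{N_y\}} \mu'_+\Bigl( \prod_{y \in A} \chi_{N_y}\, \chi_A \Bigr) \prod_{x' \in A'} \chi\Bigl( L^{-1+\eta} \sum_{y \in Lx' \cap A} N_y = N'_{x'} \Bigr). \tag{4.52}$$

Since $A$ is a union of connected components of $D^n(s)$, using (4.44) on scale $n$ and Lemma 6, we have:

$$(4.52) \le \sum^{*}_{A,\ \{N_y\}} e^{-\tilde\beta N(A)} \prod_{x' \in A'} \chi\Bigl( L^{-1+\eta} \sum_{y \in Lx' \cap A} N_y = N'_{x'} \Bigr), \tag{4.53}$$

where $\sum^{*}$ runs over $A$, $\{N_y\}$ such that $\tilde\beta N(A) \ge c \sqrt{\beta_0}\, L^{\alpha n / 4} |A|$ (see (4.42, 4.45)). Since $N_y \in L^{-(1-\eta)n}\mathbb{N}$, the sum over $N_y$, $y \in Lx' \cap A$, has at most $L^{(1-\eta)(n+1)} N'_{x'}$ terms. Note also that, by (3.17), (3.33), $\beta N(A) = \beta' N'(A')$, i.e., again by (4.45), $\tilde\beta N(A) \ge \tilde\beta' L^{\alpha / 4} N'(A')$. So, using $\tilde\beta N(A) \ge \frac{\tilde\beta'}{2} L^{\alpha / 4} N'(A') + \sqrt{\beta_0}\, L^{\alpha n / 4} \frac{|A|}{2}$, we have:

$$(4.53) \le \exp\Bigl( -\frac{\tilde\beta'}{2} L^{\alpha / 4} N'(A') \Bigr) \Bigl( \prod_{x' \in A'} N'_{x'} \Bigr)^{L^d} \sum_{[L^{-1}A] = A'} L^{(1-\eta)(n+1)|A|} \exp\Bigl( -\sqrt{\beta_0}\, L^{\alpha n / 4} \frac{|A|}{2} \Bigr); \tag{4.54}$$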

(4.54)

for $\beta_0$ and $L$ large, the sum over $A$ is bounded by $O(1)$, and we get:

$$(4.54) \le \exp(-\tilde\beta' N'(A')), \tag{4.55}$$

which proves (4.44) on scale $n + 1$.

Appendix: Proof of the Lemmas

Proof of Lemma 1. First, note that a compatible family $\Gamma$ that contains an $s$-compatible family $\tilde\Gamma$ is trivially $s$-compatible: each $x \in \underline\Gamma^c \cap \mathcal{L}$ belongs also to $\underline{\tilde\Gamma}^c$ and we have $\sigma_x(\tilde\Gamma) = s_x$ (because $\tilde\Gamma$ is $s$-compatible) and $\sigma_x(\tilde\Gamma) = \sigma_x(\Gamma)$ (because $\Gamma$ is compatible and $\tilde\Gamma \subset \Gamma$). Now, letting $\tilde\Gamma \equiv \mathrm{Out}(\Gamma) \cup D_{\Lambda_2}$ this shows that, if $\mathrm{Out}(\Gamma) \cup D_{\Lambda_2}$ is $s$-compatible, then $\Gamma$ is $s$-compatible.

To prove the converse, we proceed inductively. Let us consider first $n = 0$. Assume that $\Gamma$ is an $s$-compatible family of contours containing $D_{\Lambda_2}$. We shall show that $\mathrm{Out}(\Gamma) \cup D_{\Lambda_2}$ is $s$-compatible. First of all, note that $\mathrm{Out}(\Gamma)$ can be written as $(\gamma_1, \dots, \gamma_n)$ with $\underline\gamma_{i+1} \subset \mathrm{Int}(\gamma_i)$ and $\Lambda_2 \subset V(\gamma_n)$. Obviously, since $\Gamma$ is compatible, and since $\mathrm{Out}(\Gamma)$ contains all the contours in $\Gamma$ such that $\Lambda_2 \subset V(\gamma)$, the sign in $(V(\gamma_{i+1}))^c$ must match the one in the component of $\underline\gamma_i^c$ containing $\underline\gamma_{i+1}$; therefore, $\mathrm{Out}(\Gamma)$ is compatible. To show that $\mathrm{Out}(\Gamma) \cup D_{\Lambda_2}$ is $s$-compatible, we consider the following cases:

a) $\underline\gamma_n \cap \Lambda_2 \ne \emptyset$. In that case the signs of $\sigma(\gamma_n)$ must coincide with the signs of $s_x$ for all $B_x$ adjacent to $\underline\gamma_n$ (because $\gamma_n$ belongs to $\Gamma$, which is $s$-compatible). But since the signs of $s_x$ are constant outside $D_{\Lambda_2}$, this implies that $\mathrm{Out}(\Gamma) \cup D_{\Lambda_2}$ is $s$-compatible.



b) γ n ∩32 = ∅, which means that γ ∩32 = ∅, for all γ ∈ Out(0) (γn is the innermost contour in Out(0)). It also means that sx = 1 for some x ∈ ∂32 : indeed, otherwise, ∂32 would entirely belong to D32 , and ∂32 would be a part of a contour γ ∈ 0 such that 32 ⊂ V (γ); hence this γ would belong to Out(0), but obviously γ ∩ 32 6= ∅, contradicting our assumption on γn . We shall show that the (constant) sign given by σ(γn ) to 32 must be +1. Assuming this result, we see that this sign is compatible with those sx = 1 for x ∈ ∂32 , and again, since the signs of sx are constant outside D32 , this implies that Out(0) ∪ D32 is s-compatible. To show that this sign must be +1, assume that it is −1. This means that all x ∈ ∂32 with sx = 1 must be in V (γ) for some contour γ ∈ 0\Out(0), since 0 is s-compatible. But since all x ∈ ∂32 with sx = −1 belong to 0 ⊃ D32 , by definition, and since ∂32 is connected, ∂32 must be in V (γ). But then 32 ⊂ V (γ), which contradicts the fact that γ ∈ 0\Out(0). Let us now proceed inductively: if 00 is an s0 -compatible family of contours on scale n + 1, then there is an s-compatible family 0 ⊂ L00 . Moreover, Out(0) ∪ (D32 \D32 ) ⊂ 0 L(Out(00 ) ∪ D3 ). But since, by assumption, Out(0) ∪ D32 is s-compatible, Out(00 ) ∪ 2 0 0 D32 is s -compatible because of the way s0 and σ(00 ) were inductively defined, at the end of the proof of Proposition 3. Proof of Lemma 2. Since the subset 01 of large contours of an s-compatible family 0 always satisfies Out(01 ) = Out(0), Cs (01 ) 6= ∅ means, by Lemma 1, that 01 ∪ D32 is scompatible. Thus, any compatible family 0 ⊃ 01 ∪D32 is s-compatible, by the argument given at the beginning of the previous proof. So, Cs (01 ) consists of all the families of contours 02 so that 01 ∪ 02 is compatible and contains D32 , which is equivalent to the statements in the Lemma. Proof of Lemma 3. We write exp(W (01 , 02 )) =

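$$\prod_{Y} \left(e^{\Psi(Y,\Gamma_1,\Gamma_2)} - 1 + 1\right) = \sum_{\mathcal{Y}} \tau(\mathcal{Y},\Gamma_1,\Gamma_2), \tag{A.1}$$

where the product over $Y$ runs over $Y \cap \Gamma_2 \neq \emptyset$, see (4.7), and

$$\tau(\mathcal{Y},\Gamma_1,\Gamma_2) = \prod_{Y \in \mathcal{Y}} \left(e^{\Psi(Y,\Gamma_1,\Gamma_2)} - 1\right). \tag{A.2}$$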
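The expansion behind (A.1) and (A.2) is the elementary identity $\prod_i (1 + x_i) = \sum_{S} \prod_{i \in S} x_i$ with $x_i = e^{\Psi_i} - 1$, where the sum runs over all subfamilies $S$ (the empty one contributing 1). The following minimal sketch checks it numerically; the list `psi` is a hypothetical stand-in for the values $\Psi(Y,\Gamma_1,\Gamma_2)$, and it is assumed, as (A.1) suggests, that $W$ is the sum of these values.

```python
import math
from itertools import combinations


def product_form(psi):
    """exp(W), assuming W is the sum of the (hypothetical) values psi."""
    return math.exp(sum(psi))


def expanded_form(psi):
    """Sum over all subfamilies S of prod_{i in S} (exp(psi_i) - 1)."""
    x = [math.exp(p) - 1.0 for p in psi]
    total = 0.0
    for k in range(len(x) + 1):
        for subset in combinations(x, k):
            # math.prod of the empty tuple is 1, the empty-family term
            total += math.prod(subset)
    return total


if __name__ == "__main__":
    psi = [0.1, -0.2, 0.05]  # illustrative numbers only
    print(product_form(psi), expanded_form(psi))
```

Both printed values should agree up to floating-point rounding. The number of terms grows as $2^n$, which is why the text controls the sum through the decay bound on each factor rather than by enumeration.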

From (3.31), we have

$$|e^{\Psi(Y,\Gamma_1,\Gamma_2)} - 1| \le C e^{-\beta |Y|}, \tag{A.3}$$

and, using (4.3), (3.29), (3.30),(3.18) and (3.22) we have: ρ(γ) ≤ exp(2βkL−3α |γ ∩ D32 | − β|γ\D32 |)

(A.4)

because γ is small, which implies γ ∩ (D32 \D32 ) = ∅. On the other hand, |γ\D32 | ≥

$\frac{1}{2d} L^{-\alpha} |\gamma|$, if $\gamma \not\subset D_{\Lambda_2}$, (A.5)

which follows from the fact that γ is connected while the connected components of D32 contain at most Lα points, so that, for each connected component of D32 , there exists at



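least one site in $\gamma \setminus D_{\Lambda_2}$ which is adjacent to that connected component, and the same site can be adjacent to a fixed number of such components (the worst case is when $\gamma \setminus D_{\Lambda_2}$ contains only one site separating different components of $D_{\Lambda_2}$ of size $L^{\alpha}$). Using the fact that we also have $|\gamma \cap D_{\Lambda_2}| \le |\gamma| \le 2dL^{\alpha} |\gamma \setminus D_{\Lambda_2}|$, we get, for $\gamma \notin D_{\Lambda_2}$,

$$\rho(\gamma) \le \exp(-c\beta L^{-\alpha} |\gamma|) \tag{A.6}$$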

and ρ(γ) = 1 for γ ∈ D32 . Now inserting (A.1) in (4.12), we get Xs

ρ(02 ) exp(W (01 , 02 )) =

X Y

02

(A.7)

A(Z, 01 ),

(A.8)

Z Z∈Z

where Z = 02 ∪ Y, and Z = (02 , Y ) with 02 ∪ Y connected; A(Z, 01 ) = ρ(02 )τ (Y, 01 , 02 )

(A.9)

with 02 ∪ Y = Z. A(Z, 01 ) depends on 01 only through 01 ∩ Z and is a function of {sx |x ∈ Ln Z ∩ 32 } for n ≥ 1 or of {sx |x ∈ 32 , d(x, Z) ≤ 2b} for n = 0, because it is a product of factors with these properties. Combining (A.3) and (A.6), (A.7), we get A(Z, 01 ) = 1

(A.10)

|A(Z, 01 )| ≤ exp(−cβL−α |Z|)

(A.11)

if Z = (Di , ∅), with Di ∈ D32 , and otherwise. We shall now see how to apply the polymer formalism (see e.g. [22]). There are constraints on 02 coming from (1,2,3) in Lemma 2: to deal with constraint (3), define ˜ 01 ) = A(Z, 01 ) for the corresponding Z, and a polymer as Z˜ = ((02 \D32 )∪Y with A(Z, define Z˜ to be “connected” if 02 ∪ Y is connected. Since A(Z, 01 ) = 1, if Z = (Di , ∅), ˜ 01 ) = 1 if Z˜ = ∅ and the bound (A.11) allows us to with Di ∈ D32 , we have A(Z, control the sum over “connected” polymers. The remaining constraints on Z˜ come from (1,2) in Lemma 2. The constraint (1) gives rise to the usual hard-core constraint between polymers. To deal with constraint (2), observe that, since the contours in 02 are small, if γ ∈ 02 , and if a finite connected component Vi of γ c intersects [L−n 32 ], it must be adjacent to γ ∩ 32 , and the constraint (2) in Lemma 2 is automatically satisfied for that component of γ c , since the signs σ(γ) must agree (by definition of a contour) with those of the internal spins in the blocks adjacent to γ. On the other hand, if a connected component Vi does not intersect 32 , then the s-compatibility reduces there to compatibility, as for the usual Ising contours and the constraint can be dealt with as in that case. The claims of the lemma follow then from the polymer formalism ([22]) applied to (A.8), for β0 large; ϕ+ (Y, 01 ) is a sum of products of A(Z, 01 ), with Z ⊂ Y so its dependence on {sx } and on 01 follows from one of the A(Z, 01 )’s mentioned above. We put L−2α in (4.13), instead of L−α , in order to control constants. To prove that ϕ+ (Y, 01 ) = ϕ− (Y, 01 ), if Ln Y ⊂ 32 , observe that, if γ ⊂ 32 , and γ ∩ ∂32 = ∅, the values of σi (γ), i ∈ / γ are determined by the external spins outside γ, since, by definition, the internal spins coincide with the external ones on the boxes adjacent to γ. Hence, the value of ρ(γ), see (3.13), is independent of the boundary conditions on 31 . 
It is then easy to see inductively that, on scale n, if Ln Y ⊂ 32 , ϕ+ (Y, 01 ) is independent of the boundary conditions.



Proof of Lemma 4. First, observe that 8nX is indeed a function of {sx }x∈X , since, by Proposition 3 and Lemma 3, ρ(Di ), for Ln Di = X, and ϕ+ (Y ), for Ln Y = X, are functions of {sx }x∈X . Besides, 8nX is independent of the boundary conditions on 31 , by the argument given at the end of the proof of Lemma 3. Let us prove the bound (3.5). We consider separately each contribution to (4.18), and sum them over n. From (3.29, 3.30, 3.33), we have: | ln ρn (Di )| ≤ k0 βn N3n2 (Di ) ≤ k0 cβ0 Lnd |Di | ≤ C(L)β0 Lnd ,

(A.12)

where, in the second inequality, we use Lemma 6, and in the third, we use |Di | ≤ Lα , since Di belongs to D32 . Besides, d(X) ≤ C(L)Ln , if X = Ln Di and Di belongs to D32 . Let, for x ∈ L, n (s)}. n(x, s) = max{n|[L−n x] ∈ D3 2

(A.13)

(A.14)

n (s) means, by (3.23), that there exists a If s ∈ , n(x, s) < ∞, because [L−n x] ∈ D3 2 −n n y, with d(y, [L x]) ≤ 2, and Ny 6= 0. However, for any fixed x, Nyn = 0 for all such y 0 s, for n large enough, and s ∈ . So we get the bound XX X | ln ρn (D)|χ(Ln D = X) exp(d(X)1−η ) n X3x D∈D n 3

≤

X X

2

| ln ρn (D)|χ([L−n x] ∈ D) exp(C(L)Ln(1−η) )

n D∈D n 32

≤ C(L)β0

n(x,s) X

Lnd exp(C(L)Ln(1−η) ) ≡ C1 (x, s) < ∞,

(A.15)

n=0 n . since, for each n, [L−n x] ∈ D for at most one D ∈ D3 2 On the other hand,

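$$d(X) \le C(L) L^{n} d(Y) \le C(L) L^{n} |Y| \tag{A.16}$$

if $X = L^{n} Y$ and $Y$ is connected. So, with $\eta = 4\alpha$,

$$\sum_{n} \sum_{X \ni x} \sum_{Y} |\varphi^{+}(Y)|\, \chi(L^{n} Y = X) \exp\big(d(X)^{1-4\alpha}\big) \le \sum_{n} \sum_{Y \ni [L^{-n}x]} |\varphi^{+}(Y)| \exp\big(C(L) L^{n(1-4\alpha)} |Y|\big) \le \sum_{n} \sum_{Y \ni [L^{-n}x]} |\varphi^{+}(Y)| \exp\Big(\frac{\beta_n L^{-2\alpha} |Y|}{2}\Big), \tag{A.17}$$

where we used

$$C(L) L^{n(1-4\alpha)} \le \frac{\beta_n L^{-2\alpha}}{2}, \tag{A.18}$$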



which holds by (3.33) for all $n \ge 0$, provided $\beta_0$ is large enough. Then, the previous sum is bounded by

$$\sum_{n} \sum^{c}_{Y \ni [L^{-n}x]} \exp\Big(-\frac{\beta_n L^{-2\alpha} |Y|}{2}\Big) \le C \sum_{n} \exp\Big(-\beta_n \frac{L^{-2\alpha}}{2}\Big) \equiv C_2, \tag{A.19}$$

using the bound (4.13) and the fact that the sum $\sum^{c}_{Y}$ runs over connected sets, since $\varphi^{+}(Y) = 0$ unless $Y$ is connected. Now take $C(x,s) = C_1(x,s) + C_2$ to get (3.5).

Turning to the proof of (3.4), $\|\Phi^{n}_{X}\|_{\infty} \le c\beta_0 |X|$ follows, for the second term in (4.18), from (A.12) (where $L^{nd} |D_i| \le |L^{n} D_i| = |X|$), since a given $X$ can equal $L^{n} D_i$, for $D_i \in D$, for at most one $n$. The last term in (4.18) is controlled by (4.13), (A.19). The existence of the limit $n \to \infty$ follows from the bounds (A.15), (A.17), (A.19).

Turning to the proof of (3.32), observe that the last term in (4.20) is independent of $s_V$, as long as $d([L^{-n}V], [L^{-n}\Lambda_2]^{c}) \ge L$, since each $\rho(D_i)$ is a function of $\{s_x \,|\, x \in L^{n} D_i\}$, $L^{n} D_i \cap \Lambda_2^{c} \neq \emptyset$, and $|D_i| \le L^{\alpha}$. The contribution to (3.32) of the previous term in (4.20) is bounded (on scale $n$), using (4.13) and the fact that $\varphi(Y)$ is a function of $s_V$ only for $Y \cap [L^{-n}V] \neq \emptyset$, by:

$$|[L^{-n}V]| \exp\big(-c\beta_n L^{-2\alpha} d([L^{-n}V], [L^{-n}\Lambda_2]^{c})\big). \tag{A.20}$$
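As a side check on the inequality (A.18) used above, here is a short worked step. It assumes that (3.33), which is not reproduced in this section, gives $\beta_n = L^{(1-4\alpha)n} \beta_0$; this form is suggested by the relation $\beta' = L^{1-4\alpha}\beta$ used later in this appendix, but it is an assumption here.

```latex
\begin{align*}
\frac{\beta_n L^{-2\alpha}}{2}
  &= \frac{\beta_0}{2}\, L^{-2\alpha}\, L^{n(1-4\alpha)}
  && \text{(assumed form of } \beta_n\text{)} \\
  &\ge C(L)\, L^{n(1-4\alpha)}
  && \Longleftrightarrow \quad \beta_0 \ge 2\, C(L)\, L^{2\alpha}.
\end{align*}
```

Under this assumption the factor $L^{n(1-4\alpha)}$ cancels on both sides, so the condition on $\beta_0$ does not depend on $n$, which is consistent with (A.18) holding for all $n \ge 0$ once $\beta_0$ is large enough.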

Now, use (3.33), which implies $c\beta_n L^{-2\alpha} d([L^{-n}V], [L^{-n}\Lambda_2]^{c}) \ge L^{\eta n} (d(V, \Lambda_2^{c}))^{1-2\eta}$ (for $\beta_0$ large) and $|[L^{-n}V]| \le cL^{-nd}|V|$. The factor $L^{\eta n}$ controls the sum over $n$, and we get (3.32).

Proof of Lemma 5. Consider first $\Psi'(Y', \Gamma')$. From (4.14), (3.31) and (4.13) we have

$$|\tilde{\Psi}(Y, \hat{\Gamma}')| \le 3 \exp(-\beta L^{-2\alpha} |Y|). \tag{A.21}$$

Since $Y$ is connected and all the terms in (4.26) have $d(Y) \ge \frac{L}{4}$, we have

$$|Y| \ge cL|Y'|, \tag{A.22}$$

since $[L^{-1}Y] = Y'$. The number of terms in the sum in (4.26) is at most $2^{L^{d}|Y'|}$, since $Y \subset LY'$, so, using $cL\beta L^{-2\alpha} - L^{d} \ln 2 \ge \beta L^{1-4\alpha}$, for $\beta_0$ large, we get (see (4.33))

$$|\Psi'(Y', \Gamma')| \le \exp(-\beta L^{1-4\alpha} |Y'|) \tag{A.23}$$

for β0 large. 0 ˜ Then observe that, by induction and Lemma 3, 9(Y, 0ˆ ) = 0 unless Y is connected 0 and Y ∩ 0 6= ∅ (see (4.14)). Hence, since [L−1 Y ] = Y 0 , 90 (Y 0 , 0ˆ ) = 90 (Y 0 , 00 ) = 0 unless Y 0 is connected and Y 0 ∩ 00 6= ∅. Also by induction and Lemma 3, 9(Y, 0), 0 0 ˜ 0ˆ ) and 90 (Y 0 , 00 ) ϕ˜ + (Y, 0) depend on 0 only through 0∩Y ; hence, since 0 ⊂ L0ˆ , 9(Y, depend on 00 only through 00 ∩ Y 0 . Likewise, 9(Y, 0), ϕ˜ + (Y, 0) are functions (for 0 n ≥ 1) of {sx |x ∈ Ln Y ∩ 32 }, but since Y ⊂ LY when [L−1 Y ] = Y 0 , 90 (Y 0 , 00 ) is 0 a function of {sx |x ∈ Ln+1 Y ∩ 32 }. The same conclusion holds for n = 0, because 0 {x|d(x, Y ) ≤ 2b} ⊆ LY when [L−1 Y ] = Y 0 , for L large enough.



Turning to ρ0 (γ 0 ), a similar argument shows that it is a function of {sx |x ∈ Ln+1 γ 0 ∩ 32 }. Let us prove (3.29) inductively: observe that it holds trivially for n = 0 because of (3.14). To proceed inductively, we use (4.30): ρ0 (γ 0 ) =

X`

ρ(0)V ¯ (Y, 0) exp(

X

E(γˆ i0 , 0 ∩ γˆ i0 ))

(A.24)

i

(0,Y)

with the constraints (4.31) and (4.32). Let us bound the different parts of this expression. Using (4.25) and the bounds (3.31), (4.13), we have: X −2α X −3α |E(γˆ i0 , 0 ∩ γˆ i0 )| ≤ C(L)e−βL |γˆ i0 | ≤ e−βL |γ 0 |, (A.25) i

i

since the sum here runs over disjoint subsets γˆ i0 of γ 0 , and the sum in (4.25) runs over connected sets Y that intersect Lγˆ 0 . Next using 4.3), the bounds (3.29) and (3.30), we have: ρ(0) ¯ ≤ exp βkN32 ((D32 \D32 ) ∩ 0) + 2βkN32 (D32 ∩ 0) − β|0\D32 | . (A.26) Since, by (3.22), N32 (0) = N32 (D32 ∩ 0) = N32 (D32 \D32 ) ∩ 0 + N32 (D32 ∩ 0). Now, we can write: |0\D32 | ≥ cL−α |0\(D32 \D32 )|,

(A.27)

because 0\D32 can be written as 0\(D32 \D32 )\D32 and, for each connected component of D32 , there is always at least one site in 0\(D32 \D32 ) which is adjacent to that component. By definition of D32 , N32 (D32 ∩ 0) ≤ L−3α |D32 ∩ 0| ≤ L−3α |0\(D32 \D32 )|

(A.28)

(since D32 ∩ 0 ⊂ 0\(D32 \D32 ) ). So we get, ρ(0) ≤ exp βkN32 ((D32 \D32 ) ∩ 0) − cβL−α |0\(D32 \D32 )| .

(A.29)

We have, by (4.28, 4.23, 3.31) and Lemma 3, Y −2α Ce−βL |Y | , ≤ exp −βL−3α |Y| V (Y, 0) ≤

(A.30)

Y ∈Y

P where |Y| = Y ∈Y |Y |, and we put L−3α in order to eliminate the constant C. Inserting (A.25, A.29, A.30) in (A.24), we get X −3α ρ0 (γ 0 ) ≤ exp e−βL |γ 0 | + β 0 kN30 2 (γ 0 )

exp −βL

0,Y −3α

|Y| − cβL

−α

|0\(D32 \D32 )|

(A.31)

with the same constraints as before and we used, by definition (3.33) of β 0 , and (3.17) of N30 2 ,



βN32 (D32 \D32 ) ∩ 0 = β 0 N30 2 (γ 0 ).

(A.32) S

In order to control the sum over 0, Y, we decompose 0\(D32 \D32 ) = i Ai into connected components and call a component long if, either Ai = γ, with V (γ) ⊃ 0 ) 6= ∅ and we call Ai short otherwise. [L−n 32 ], or Ai ∩ L(γ 0 \D3 2 S S S S ` Write A = ` Ai and As = s Ai , where ` ( s ) is the union over the long (short) components. We claim that a long component satisfies d(Ai ) ≥ L; to show that, consider two cases: either Ai = γ, with V (γ) ⊃ [L−n 32 ], and d(Ai ) ≥ L because of 0 our restriction on n in Proposition 3; or Ai intersects L(γ 0 \D3 ), and, since 0 contains 2 only long contours, Ai must intersect D32 \D32 (see (4.5), in which case it must cross 0 \L([L−1 (D32 \D32 ]), which implies d(Ai ) ≥ L; the corridor of width L given by LD3 2 0 0 Moreover, each box in L(γ \D32 ) is intersected either by a long Ai or by a Y ∈ Y, all of which have a diameter at least L4 ; thus, we have X 0 |Ai | + |Y| > cL|γ 0 \D3 |. (A.33) 2 Ai ∈A`

We write then the sum over 0, Y as follows: X X X X exp −β(cL−α |Ai | + L−3α |Y|) exp(−cβL−α |Ai |). (A.34) A` ,Y

As

Ai ∈A`

Ai ∈As

Using (A.33) and β 0 = L1−4α β, this sum can be bounded by 0 exp(−2β 0 |γ 0 \D3 |) exp(e−βL 2

−3α

C(L)|γ 0 |),

for L large enough, where the last factor bounds the sums in (A.34). So, we get, going back to (A.31), −3α 0 0 | + β 0 kN30 2 (γ 0 ) + (C(L) + 1)e−βL |D3 ∩ γ 0 | , (A.35) ρ0 (γ 0 ) ≤ exp −β 0 |γ 0 \D3 2 2 −3α

0 0 using |γ 0 | = |γ 0 \D3 | + |D3 ∩ γ 0 |P and (C(L) + 1)e−βL ≤ β0. 2 2 0 0 0 0 0 By (3.22), we have N32 (γ ) = Di ⊂γ 0 N (Di ) = N (D32 ∩ γ 0 ), and, using Lemma 6, we can bound

β 0 kN30 2 (γ 0 ) + (C(L) + 1)e−βL

−3α

0 |D3 ∩ γ 0 | ≤ β 0 k 0 N30 2 (γ 0 ) 2

(A.36)

−3α

Lα and in , |Dj | ≤ Lα . In this latter case we must have N (Dj ) ≥ L−3α |Dj |, since Dj 6∈ D (see (3.18)). For P1 , we use Lemma 6 inductively where

j

X1

βN (Dj ) ≥

X1 p β0 Lαn/2 |Dj |,

j

(A.41)

j

and, since we have |Dj | ≥ Lα , we get Lαn/2

X1

0 |Dj | ≥ Lα(n+1)/2 |Di,1 |

(A.42)

j 0 with Di,1 = [L−1 (

S1 j

Dj )]. For

0

fact that each box Lx with x

0

box intersected by some Dj in X2

P2

, we have, using the definition (3.33) of βn and the S2 = [L−1 ( j Dj )] must contain or be adjacent to a

0 ∈ Di,2 P2

:

βN (Dj ) ≥ L−3α β

j

X2

|Dj |

j 0 | ≥ β0 L(1−4α)n L−3α c|Di,2 p α(n+1)/2 0 |Di,2 | ≥ β0 L

(A.43)

for $\alpha$ small enough and $\beta_0$ large enough. Obviously (A.40)–(A.43) imply the lower bound in (4.42) on scale $n + 1$ since

$$|D'_{i,1}| + |D'_{i,2}| \ge |D'_i|. \tag{A.44}$$

The same arguments hold for $D^{n}_{\Lambda_2}$, $N^{n}_{\Lambda_2}(D_i)$.

Acknowledgement. We would like to thank A. van Enter, R. Fernandez, C. Maes, C.-E. Pfister, A. Sokal, and K. Vande Velde for interesting discussions. This work was supported by NSF grant DMS-9205296 and by EC grant CHRX-CT93-0411.



References

1. Benfatto, G., Marinari, E. and Olivieri, E.: Some numerical results on the block spin transformation for the 2d Ising model at the critical point. J. Stat. Phys. 78, 731–757 (1995)
2. Bricmont, J. and Kupiainen, A.: Phase transition in the 3d random field Ising model. Commun. Math. Phys. 116, 539–572 (1988)
3. Cammarota, C.: The large block spin interaction. Il Nuovo Cimento 96B, 1–16 (1986)
4. Cassandro, M. and Gallavotti, G.: The Lavoisier law and the critical point. Il Nuovo Cimento 25B, 695–705 (1975)
5. Cirillo, E.N.M. and Olivieri, E.: Renormalization-group at criticality and complete analyticity of constrained models: a numerical study. J. Stat. Phys. 86, 1117–1151 (1997)
6. Dobrushin, R.L.: Gibbs states describing a coexistence of phases for the three-dimensional Ising model. Th. Prob. and its Appl. 17, 582–600 (1972)
7. Dobrushin, R.L.: Lecture given at the workshop “Probability and Physics”, Renkum (Holland), 28 August – 1 September, 1995
8. Domb, C. and Green, M.S. (Eds.): Phase transitions and critical phenomena, Vol. 6. New York: Academic Press, 1976
9. Fernández, R. and Pfister, C.-Ed.: Global specifications and non-quasilocality of projections of Gibbs measures. EPFL preprint, 1996
10. Gallavotti, G. and Knops, H.: Block spins interactions in the Ising model. Commun. Math. Phys. 36, 171–184 (1974)
11. Gallavotti, G. and Martin-Löf, A.: Block spins distributions for short range attractive Ising models. Il Nuovo Cimento 36, 1–16 (1974)
12. Gawędzki, K., Kotecký, R. and Kupiainen, A.: Coarse graining approach to first order phase transitions. J. Stat. Phys. 47, 701–724 (1987)
13. Georgii, H.-O.: Gibbs Measures and Phase Transitions (de Gruyter Studies in Mathematics, Vol. 9). Berlin–New York: Walter de Gruyter, 1988
14. Goldenfeld, N.: Lectures on phase transitions and the renormalization group. Frontiers in Physics 85, Reading, MA: Addison-Wesley, 1992
15. Griffiths, R.B.: Nonanalytic behavior above the critical point in a random Ising ferromagnet. Phys. Rev. Lett. 23, 17 (1969)
16. Griffiths, R.B. and Pearce, P.A.: Position-space renormalization-group transformations: Some proofs and some problems. Phys. Rev. Lett. 41, 917–920 (1978)
17. Griffiths, R.B. and Pearce, P.A.: Mathematical properties of position-space renormalization-group transformations. J. Stat. Phys. 20, 499–545 (1979)
18. Haller, K. and Kennedy, T.: Absence of renormalization group pathologies near the critical temperature. Two examples. J. Stat. Phys. 85, 607–637 (1996)
19. Hasenfratz, A. and Hasenfratz, P.: Singular renormalization group transformations and first order phase transitions (I). Nucl. Phys. B 295 [FS21], 1–20 (1988)
20. Israel, R.B.: Banach algebras and Kadanoff transformations. In: J. Fritz, J. L. Lebowitz, and D. Szász, eds.: Random Fields (Esztergom, 1979), Vol. II. Amsterdam: North-Holland, 1981, pp. 593–608
21. Kennedy, T.: Some rigorous results on majority rule renormalization group transformations near the critical point. J. Stat. Phys. 72, 15–37 (1993)
22. Kotecký, R. and Preiss, D.: Cluster expansion for abstract polymer models. Commun. Math. Phys. 103, 491–498 (1986)
23. Lörinczi, J.: Some results on the projected two-dimensional Ising model. In: M. Fannes, C. Maes, and A. Verbeure, eds.: Proceedings NATO ASI Leuven Workshop “On Three Levels”. New York: Plenum Press, 1994, pp. 373–380
24. Lörinczi, J. and Vande Velde, K.: A note on the projection of Gibbs measures. J. Stat. Phys. 77, 881–887 (1994)
25. Lörinczi, J. and Winnink, M.: Some remarks on almost Gibbs states. In: N. Boccara, E. Goles, S. Martinez, and P. Picco, eds.: Cellular Automata and Cooperative Systems. Dordrecht: Kluwer, 1993, pp. 423–432
26. Maes, C. and Vande Velde, K.: Defining relative energies for the projected Ising measure. Helv. Phys. Acta 65, 1055–1068 (1992)
27. Maes, C. and Vande Velde, K.: The (non-)Gibbsian nature of states invariant under stochastic transformations. Physica A 206, 587–603 (1994)
28. Maes, C. and Vande Velde, K.: Relative energies for non-Gibbsian states. Leuven preprint, 1996
29. Martinelli, F. and Olivieri, E.: Some remarks on pathologies of renormalization-group transformations. J. Stat. Phys. 72, 1169–1177 (1993)
30. Martinelli, F. and Olivieri, E.: Instability of renormalization-group pathologies under decimation. J. Stat. Phys. 79, 25–42 (1995)
31. Schonmann, R.H.: Projections of Gibbs measures may be non-Gibbsian. Commun. Math. Phys. 124, 1–7 (1989)
32. van Enter, A.C.D.: Ill-defined block-spin transformations at arbitrarily high temperatures. J. Stat. Phys. 83, 761–765 (1996)
33. van Enter, A.C.D., Fernández, R. and Kotecký, R.: Pathological behavior of renormalization group maps at high fields and above the transition temperature. J. Stat. Phys. 79, 969–992 (1995)
34. van Enter, A.C.D., Fernández, R. and Sokal, A.D.: Renormalization transformations in the vicinity of first-order phase transitions: What can and cannot go wrong. Phys. Rev. Lett. 66, 3253–3256 (1991)
35. van Enter, A.C.D., Fernández, R. and Sokal, A.D.: Regularity properties and pathologies of position-space renormalization-group transformations: Scope and limitations of Gibbsian theory. J. Stat. Phys. 72, 879–1167 (1993)
36. van Enter, A.C.D. and Lörinczi, J.: Robustness of the non-Gibbsian property: Some examples. University of Groningen preprint, 1995; J. Phys. A, Math. and Gen. 29, 2465–2473 (1996)

Communicated by D. C. Brydges

Commun. Math. Phys. 194, 389 – 462 (1998)

Communications in


Wulff Droplets and the Metastable Relaxation of Kinetic Ising Models

Roberto H. Schonmann1,*, Senya B. Shlosman2,3,**

1 Mathematics Department, University of California at Los Angeles, Los Angeles, CA 90095, USA
2 Mathematics Department, University of California at Irvine, Irvine, CA 92697, USA
3 Institute for the Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia

Received: 7 May 1997 / Accepted: 29 October 1997

Abstract: We consider the kinetic Ising models (Glauber dynamics) corresponding to the infinite volume Ising model in dimension 2 with nearest neighbor ferromagnetic interaction and under a positive external magnetic field $h$. Minimal conditions on the flip rates are assumed, so that all the common choices are being considered. We study the relaxation towards equilibrium when the system is at an arbitrary subcritical temperature $T$ and the evolution is started from a distribution which is stochastically lower than the (−)-phase. We show that as $h \searrow 0$ the relaxation time blows up as $\exp(\lambda_c(T)/h)$, with $\lambda_c(T) = w(T)^2/(12\,T\,m^*(T))$. Here $m^*(T)$ is the spontaneous magnetization and $w(T)$ is the integrated surface tension of the Wulff body of unit volume. Moreover, for $0 < \lambda < \lambda_c$, the state of the process at time $\exp(\lambda/h)$ is shown to be close, when $h$ is small, to the (−)-phase. The difference between this state and the (−)-phase can be described in terms of an asymptotic expansion in powers of the external field. This expansion can be interpreted as describing a set of $C^\infty$ continuations in $h$ of the family of Gibbs distributions with the negative magnetic fields into the region of positive fields.

Contents
1 Introduction
1.1 Preliminaries
1.2 Notation and terminology
1.3 Some tools and further definitions
1.4 Main results
1.5 Heuristics
2 Metastable Regime
2.1 Preliminaries
2.2 Bottlenecks for the dynamics
2.3 The restricted ensemble
2.4 Asymptotic expansion
2.5 More general initial distributions
3 Relaxation Regime
3.1 Preliminaries
3.2 Inverted pyramids and droplet growth
3.3 Rescaling and droplet creation
3.4 Double well structure of equilibrium distributions
3.5 Spectral gap estimates

* The work of R.H.S. was partially supported by the N.S.F. through grants DMS 9100725, DMS 9400644 and DMS 9703814.
** The work of S.B.S. was partially supported through grant DMS 9208029 and by the Russian Fund for Fundamental Research through grant 930101470.

1. Introduction

1.1. Preliminaries. This paper is a continuation of the paper [Sch 1], and contains substantial strengthening of the results of that paper in the case of dimension 2. We refer the reader to [Sch 1] for a discussion of the motivation and background of the problem. For introductions to metastability see, e.g., [GD] and [PL]. The precise results in the current paper can only be stated after enough notation is introduced and are therefore postponed to Sect. 1.4. We provide next an informal summary of our results.

Our concern in this paper is with the metastable behavior of the 2 dimensional Ising model, evolving with a reversible spin-flip dynamics, in the proximity of the phase-coexistence line. We study the system at an arbitrary subcritical temperature $T$ and under a small positive external magnetic field $h$. The results proven all refer to limits in which $h \searrow 0$. These results fully confirm, in particular, a conjecture raised by Aizenman and Lebowitz in [AL]. This conjecture was that if started from a typical configuration of the (−)-phase, for times of order $\exp(\lambda/h)$ with $\lambda$ below a critical value $\lambda_c$ the system would be in a sort of metastable state, close to the (−)-phase (in spite of the presence of the positive external field). On the other hand, for a time of order $\exp(\lambda/h)$ with $\lambda > \lambda_c$ the system would have already relaxed and so would be close to the (+)-phase.

In [Sch 1] a weaker version of this conjecture was proven, with the first scenario occurring for $\lambda < \lambda_1$ and the second for $\lambda > \lambda_2$, with these two constants $\lambda_1$ and $\lambda_2$ having been explicitly estimated, but both being non-optimal. Moreover the temperature was supposed to be substantially lower than the critical one and the initial distribution had to be concentrated on the configuration with all spins down. On the good side, the results in [Sch 1] are valid in arbitrary dimension. Here we will only consider dimension 2, but strengthen the result in the following ways:

1) The constants $\lambda_1$ and $\lambda_2$ are shown to be identical, the common value being given by

$$\lambda_c = \lambda_c(T) = \frac{w(T)^2}{12\, T\, m^*(T)}. \tag{1.1}$$

Here m∗ (T ) is the spontaneous magnetization and w(T ) is the integrated surface tension of the Wulff body of unit volume. Note that all quantities on the right-hand-side of (1.1) pertain to equilibrium statistical mechanics. 2) The (subcritical) temperature T can be arbitrarily close to the critical one. 3) The initial distribution is only required to be stochastically lower than the (−)-phase. In particular it can be a Gibbs distribution under any negative value of the external field (as it would if the system were allowed to first relax to equilibrium under a negative external field and then, suddenly, the field was switched to a small positive value).


4) We also show that in a certain technical fashion it makes sense to say that at times of order exp(λ/h), with λ < λc rather than being in the (−)-phase the system is better described as being in a metastable state which is infinitesimally (in h) higher than the (−)-phase. The rigorous result is presented in the form of an asymptotic expansion in powers of h. The metastable states can then be seen as a family of C ∞ continuations into the region of positive external fields of the curve of the equilibrium states with negative external fields. (See also the discussion after the statement of the main result of this paper in the Sect. 1.4.) The result about the C ∞ continuations is not in conflict with the known fact, proven in [Isa], that at least at low enough temperature, there is no analytic continuation of the equilibrium states beyond the transition point. Various comments regarding (1) above are in order. To our knowledge, this is the first rigorous relation established between the equilibrium Wulff shape and the time evolution of kinetic Ising models. (The first rigorous relation between the equilibrium surface tension and the time evolution of kinetic Ising models was, as far as we know, established in the fundamental paper [Mar], in the situation in which there is no external field.) It is important to point out that when we started the investigation which led to the current paper we could not see any evident reason for (1.1) to hold. This doubt was expressed, and to some extent discussed, in Sect. 1-iii of [Sch 1]. Since the doubt stemmed in part from the study of the metastable behavior of anisotropic Ising models in [KO], it is important to stress that regarding the problems treated in the current paper, our results and methods apply also to these models. In Sect. 
1.5 we will present a heuristic picture which predicts (1.1) and is based on considering the free-energy of individual droplets of the stable phase in the midst of the metastable phase and taking into account droplet growth at a fast enough speed. The aspect of the heuristics which originally seemed weak to us was the idea that the evolution of these droplets is governed by their equilibrium free-energy. A recent detailed study of computer simulations of the metastable relaxation of two-dimensional kinetic Ising models in [RTMS], which was done independently of our work in this paper, also indicated the validity of (1.1).

It has become evident over the years that the metastable behavior of kinetic Ising models is very rich and that precise mathematical statements can be conjectured and sometimes proven in various different asymptotic regimes (see, e.g., Sect. 4 of [Sch 1] or a more complete discussion in [Sch2]). In a recent companion paper, [DS], results were obtained which are counterparts to some of those presented here, but in the case in which the external field is held fixed (positive and small) and the temperature is scaled to zero.

This paper is divided into three parts. In the remainder of the first one, to which this section belongs, we will be introducing notation and terminology, stating results, motivating these results heuristically and presenting some basic tools. In the second part we prove the results concerning the metastable regime, i.e., the behavior at times of order of $\exp(\lambda/h)$ with $0 < \lambda < \lambda_c$. In the third part we prove the results concerning the relaxation regime, i.e., the behavior at times of the order of $\exp(\lambda/h)$ with $\lambda > \lambda_c$.

1.2. Notation and terminology. In this section we introduce a long sequence of definitions, notation and techniques. We tried to make everything as standard as possible, so that most readers will browse quickly through this section, finding few things which they are not familiar with. Most statements are made without proof, and we refer readers to the books [Geo] and [Lig], and other references therein, for explanation. Almost all the notation introduced below is identical to that in the papers [Sch 1] and [SS1].



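The lattice. The cardinality of a set $\Lambda \subset \mathbb{Z}^2$ will be denoted by $|\Lambda|$. The expression $\Lambda \subset\subset \mathbb{Z}^2$ will mean that $\Lambda$ is a finite subset of $\mathbb{Z}^2$. For each $x \in \mathbb{Z}^2$, we define the usual norms $\|x\|_p = (|x_1|^p + |x_2|^p)^{1/p}$, $p > 0$ finite, and $\|x\|_\infty = \max\{|x_1|, |x_2|\}$. The distance between two sets $\Lambda_1, \Lambda_2 \subset \mathbb{Z}^2$ in each one of these norms will be denoted by

$$\mathrm{dist}_p(\Lambda_1, \Lambda_2) = \inf\{\|x - y\|_p : x \in \Lambda_1,\ y \in \Lambda_2\}.$$

In case $\Lambda_1 = \{x\}$, we also write $\mathrm{dist}_p(\Lambda_1, \Lambda_2) = \mathrm{dist}_p(x, \Lambda_2)$. The interior and exterior boundaries of a set $\Lambda \subset \mathbb{Z}^2$ will be denoted, respectively, by

$$\partial_{\mathrm{int}} \Lambda = \{x \in \Lambda : \|x - y\|_1 = 1 \text{ for some } y \notin \Lambda\},$$

and

$$\partial_{\mathrm{ext}} \Lambda = \{x \notin \Lambda : \|x - y\|_1 = 1 \text{ for some } y \in \Lambda\}.$$

The $p$-norm diameter of a set $\Lambda \subset\subset \mathbb{Z}^2$ is defined by

$$\mathrm{diam}_p(\Lambda) = \max\{\mathrm{dist}_p(x, y) : x, y \in \Lambda\}.$$

Given a set $A \subset \mathbb{R}^2$, we will write $\Lambda(A) = A \cap \mathbb{Z}^2$. In case $A = [-l/2, l/2]^2$ is an $l \times l$ square centered at the origin, we simplify the notation to $\Lambda(l) = \Lambda(A) = \mathbb{Z}^2 \cap [-l/2, l/2]^2$. Given $A, B \subset \mathbb{R}^2$, $z \in \mathbb{R}^2$ and $c \in \mathbb{R}$, we define $A + B = \{x + y : x \in A,\ y \in B\}$, $A + z = A + \{z\}$, and $cA = \{cx : x \in A\}$.

The set of bonds, i.e., (unordered) pairs of nearest neighbors, is defined as

$$B = \{\{x, y\} : x, y \in \mathbb{Z}^2 \text{ and } \|x - y\|_1 = 1\}.$$

Given a set $\Lambda \subset\subset \mathbb{Z}^2$ we define also

$$B_\Lambda = \{\{x, y\} : x, y \in \Lambda \text{ and } \|x - y\|_1 = 1\},$$

$$\partial B_\Lambda = \{\{x, y\} : x \in \Lambda,\ y \notin \Lambda \text{ and } \|x - y\|_1 = 1\}.$$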
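The lattice notions above translate directly into small set operations on finite subsets of $\mathbb{Z}^2$. The following is a minimal sketch, not part of the paper; the function names (`norm_p`, `dist_p`, `interior_boundary`, `exterior_boundary`, `diam_p`, `box`) are illustrative choices, and sites are represented as integer pairs.

```python
import math

# The four nearest-neighbor displacements, i.e. the vectors of 1-norm equal to 1.
NEIGHBORS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def norm_p(x, p):
    """p-norm of x = (x1, x2); p = float('inf') gives the max norm."""
    if p == float("inf"):
        return max(abs(c) for c in x)
    return sum(abs(c) ** p for c in x) ** (1.0 / p)


def dist_p(A, B, p):
    """Distance between two finite non-empty sets of sites in the p-norm."""
    return min(norm_p((a[0] - b[0], a[1] - b[1]), p) for a in A for b in B)


def diam_p(S, p):
    """p-norm diameter of a finite non-empty set of sites."""
    return max(norm_p((a[0] - b[0], a[1] - b[1]), p) for a in S for b in S)


def interior_boundary(S):
    """Sites of S having at least one nearest neighbor outside S."""
    S = set(S)
    return {x for x in S
            if any((x[0] + dx, x[1] + dy) not in S for dx, dy in NEIGHBORS)}


def exterior_boundary(S):
    """Sites outside S having at least one nearest neighbor in S."""
    S = set(S)
    return {(x[0] + dx, x[1] + dy) for x in S for dx, dy in NEIGHBORS} - S


def box(l):
    """Lattice sites of the square [-l/2, l/2]^2 centered at the origin."""
    lo, hi = math.ceil(-l / 2), math.floor(l / 2)
    return {(i, j) for i in range(lo, hi + 1) for j in range(lo, hi + 1)}


if __name__ == "__main__":
    S = box(2)
    print(len(S), len(interior_boundary(S)), len(exterior_boundary(S)))
```

For the $2 \times 2$ square the box contains the nine sites with coordinates in $\{-1, 0, 1\}$; every site except the origin lies on the interior boundary, and the exterior boundary consists of the sites adjacent to the four sides.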

Notions from percolation. A chain is a sequence of distinct sites $x_1, \ldots, x_n$, with the property that for $i = 1, \ldots, n - 1$, $\|x_i - x_{i+1}\|_1 = 1$. The sites $x_1$ and $x_n$ are called the end-points of the chain $x_1, \ldots, x_n$, and $n$ is its length. A (*)-chain, its end-points and its length are defined in the same way, but with $\|\cdot\|_1$ replaced by $\|\cdot\|_\infty$. Informally this means that while chains can only move along bonds of $\mathbb{Z}^2$, (*)-chains can also move along diagonals. A set of sites with the property that each two of them can be connected by a chain contained in the set is said to be a connected set. A chain or (*)-chain is said to connect two sets if it has one end-point in each set. A set of sites is said to be simply-connected in case it is connected and its complement is also a connected set. A circuit is a chain such that $\|x_1 - x_n\|_1 = 1$. Similarly a (*)-circuit is a (*)-chain such that $\|x_1 - x_n\|_\infty = 1$.
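The percolation notions above can be checked mechanically on finite sets. The sketch below is illustrative only (the function names are not from the paper). For simple connectedness of a finite set it relies on one observation, stated here as an assumption of the sketch: the complement of a finite set is connected in $\mathbb{Z}^2$ exactly when its restriction to the bounding box enlarged by one site in each direction is connected, because the outer ring of that enlarged box is itself connected and lies in the complement.

```python
from collections import deque

NEIGHBORS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def _step(a, b, star):
    """1-norm distance between a and b, or max-norm distance if star is True."""
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    return max(dx, dy) if star else dx + dy


def is_chain(sites, star=False):
    """Distinct sites whose consecutive members are at distance exactly 1."""
    if len(set(sites)) != len(sites):
        return False
    return all(_step(a, b, star) == 1 for a, b in zip(sites, sites[1:]))


def is_circuit(sites, star=False):
    """A chain whose first and last sites are at distance exactly 1."""
    return (len(sites) >= 2 and is_chain(sites, star)
            and _step(sites[0], sites[-1], star) == 1)


def is_connected(S):
    """True if any two sites of the finite set S are joined by a chain in S."""
    S = set(S)
    if not S:
        return True
    start = next(iter(S))
    seen, queue = {start}, deque([start])
    while queue:
        x = queue.popleft()
        for dx, dy in NEIGHBORS:
            y = (x[0] + dx, x[1] + dy)
            if y in S and y not in seen:
                seen.add(y)
                queue.append(y)
    return seen == S


def is_simply_connected(S):
    """S connected and its complement connected (checked in a padded box)."""
    S = set(S)
    if not S or not is_connected(S):
        return False
    xs = [x for x, _ in S]
    ys = [y for _, y in S]
    padded = {(i, j)
              for i in range(min(xs) - 1, max(xs) + 2)
              for j in range(min(ys) - 1, max(ys) + 2)}
    return is_connected(padded - S)


if __name__ == "__main__":
    ring = {(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)} - {(0, 0)}
    print(is_connected(ring), is_simply_connected(ring))
```

The ring of eight sites around the origin is connected but not simply-connected, since the origin is cut off from the rest of the complement; filling in the origin makes it simply-connected.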



The configurations and observables. At each site in $\mathbb{Z}^2$ there is a spin which can take values $-1$ and $+1$. The configurations will therefore be elements of the set $\{-1, +1\}^{\mathbb{Z}^d} = \Omega$. Given $\sigma \in \Omega$, we write $\sigma(x)$ for the spin at the site $x \in \mathbb{Z}^2$. Two configurations are especially relevant, the one with all spins $-1$ and the one with all spins $+1$. We will use the simple notation $-$ and $+$ to denote them. The single spin space, $\{-1, +1\}$, is endowed with the discrete topology and $\Omega$ is endowed with the corresponding product topology. The following definition will be important when we introduce finite systems with boundary conditions later on: given $\Lambda \subset\subset \mathbb{Z}^2$ and a configuration $\eta \in \Omega$, we introduce $\Omega_{\Lambda,\eta} = \{\sigma \in \Omega : \sigma(x) = \eta(x) \text{ for all } x \notin \Lambda\}$.

Real-valued functions with domain $\Omega$ are called observables. For each observable $f$ we use the notation $\|f\|_\infty = \sup_{\eta \in \Omega} |f(\eta)|$. Local observables are those which depend only on the values of finitely many spins; more precisely, $f : \Omega \to \mathbb{R}$ is a local observable if there exists a set $S \subset\subset \mathbb{Z}^2$ such that $f(\sigma) = f(\eta)$ whenever $\sigma(x) = \eta(x)$ for all $x \in S$. The smallest $S$ with this property is called the support of $f$, denoted $\mathrm{Supp}(f)$. The topology introduced above on $\Omega$ has the nice feature that it makes the set of local observables dense in the set of all continuous observables.

In $\Omega$ the following partial order is introduced: $\eta \le \zeta$ if $\eta(x) \le \zeta(x)$ for all $x \in \mathbb{Z}^2$. A particularly important role will be played in this paper by the non-decreasing local observables. Each local observable can be written as the difference between two non-decreasing ones.

A (+)-chain in a configuration $\sigma$ is a chain of sites, $x_1, \ldots, x_n$, as defined above, with the property that for each $i = 1, \ldots, n$, $\sigma(x_i) = +1$. Given $\Lambda_1, \Lambda_2 \subset \mathbb{Z}^2$, we will use the notation $\{\Lambda_1 \stackrel{+}{\longleftrightarrow} \Lambda_2\}$ to denote the set of configurations in which there is a (+)-chain with one end-point in $\Lambda_1$ and one end-point in $\Lambda_2$. In case $\Lambda_1 = \{x\}$ we simplify this notation to $\{x \stackrel{+}{\longleftrightarrow} \Lambda_2\}$, and similarly for $\Lambda_2$. Given a configuration $\sigma$, we say that a site $x$ is (+)-connected to a set $\Lambda \subset \mathbb{Z}^2$ in $\sigma$ if $\sigma \in \{x \stackrel{+}{\longleftrightarrow} \Lambda\}$. The (+)-cluster of a set $\Lambda \subset \mathbb{Z}^2$ in the configuration $\sigma$ is the set of sites which are (+)-connected to $\Lambda$ in this configuration. Similar notions can be defined for (−)-connectedness, (+*)-connectedness and (−*)-connectedness. In particular the notations $\{\Lambda_1 \stackrel{-}{\longleftrightarrow} \Lambda_2\}$, $\{\Lambda_1 \stackrel{+*}{\longleftrightarrow} \Lambda_2\}$, and $\{\Lambda_1 \stackrel{-*}{\longleftrightarrow} \Lambda_2\}$ should now have self-explanatory meanings.

The probability measures. We endow $\Omega$ also with the Borel σ-algebra corresponding to the topology introduced above. In this fashion, each probability measure $\mu$ in this space can be identified by the corresponding expected values $\int f\,d\mu$ of all the local observables $f$. A sequence of probability measures, $(\mu_n)_{n=1,2,\ldots}$, is said to converge weakly to the probability measure $\nu$ in case
\[ \lim_{n \to \infty} \int f\,d\mu_n = \int f\,d\nu \quad \text{for every continuous observable } f. \tag{1.2} \]

The family of probability measures on $\Omega$ will be partially ordered by the following relation: $\mu \le \nu$ if
\[ \int f\,d\mu \le \int f\,d\nu \quad \text{for every continuous non-decreasing observable } f. \tag{1.3} \]



Because the local observables are dense in the set of continuous observables, we can restrict ourselves to the local ones in (1.2) and (1.3). Moreover, because every local observable is the difference between two non-decreasing ones, we can also restrict ourselves to those in (1.2).

The Gibbs measures. We will always consider the formal Hamiltonian
\[ H_h(\sigma) = -\frac{1}{2} \sum_{x,y\ \mathrm{n.n.}} \sigma(x)\sigma(y) - \frac{h}{2} \sum_{x} \sigma(x), \tag{1.4} \]

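where $h \in \mathbb{R}$ is the external field and $\sigma \in \Omega$ is a generic configuration. In order to give precise definitions, we define, for each set $\Lambda \subset\subset \mathbb{Z}^2$ and each boundary condition $\eta \in \Omega$,
\[ H_{\Lambda,\eta,h}(\sigma) = -\frac{1}{2} \sum_{\{x,y\} \in \mathbb{B}_\Lambda} \sigma(x)\sigma(y) - \frac{1}{2} \sum_{\substack{\{x,y\} \in \partial\mathbb{B}_\Lambda \\ y \notin \Lambda}} \sigma(x)\eta(y) - \frac{h}{2} \sum_{x \in \Lambda} \sigma(x). \tag{1.5} \]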
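As a quick illustration of (1.5), the following Python sketch evaluates the finite-volume Hamiltonian for a small set of sites. The representation is an assumption made here for the example: `sigma` is a dictionary of spins on the sites of Λ, and `eta` is a function returning the boundary spin at a site outside Λ.

```python
def neighbors(x):
    """Nearest neighbors of x in Z^2."""
    return [(x[0] + 1, x[1]), (x[0] - 1, x[1]), (x[0], x[1] + 1), (x[0], x[1] - 1)]


def hamiltonian(sigma, eta, h):
    """H_{Lambda,eta,h}(sigma) as in (1.5).

    sigma: dict mapping each site of Lambda to a spin in {-1, +1}
    eta:   function giving the boundary spin at sites outside Lambda
    h:     external field
    """
    inside = 0.0    # sum over ordered pairs, so each bond of B_Lambda counts twice
    boundary = 0.0  # sum over bonds with x in Lambda and y outside Lambda
    for x in sigma:
        for y in neighbors(x):
            if y in sigma:
                inside += sigma[x] * sigma[y]
            else:
                boundary += sigma[x] * eta(y)
    field = sum(sigma.values())
    return -0.5 * (inside / 2.0) - 0.5 * boundary - 0.5 * h * field


def plus(_site):
    """The all-plus boundary condition."""
    return +1
```

For a single plus spin surrounded by plus boundary spins and h = 0, the four boundary bonds each contribute −1/2, so the energy is −2.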

In what follows the temperature $T$ will often appear explicitly in the notation, for clarity. Later on in this paper we will usually be considering a situation in which the temperature is fixed, while we scale the external field $h$, and then the temperature will be omitted from the notation. Given $\Lambda \subset\subset \mathbb{Z}^2$, $\eta \in \Omega$, $E \subset \Omega$, $T > 0$ and $h \in \mathbb{R}$, we write
\[ Z_{\Lambda,\eta,T,h}(E) = \sum_{\sigma \in \Omega_{\Lambda,\eta} \cap E} \exp(-\beta H_{\Lambda,\eta,h}(\sigma)), \]
where $\beta = 1/T$. We abbreviate $Z_{\Lambda,\eta,T,h} = Z_{\Lambda,\eta,T,h}(\Omega)$. The Gibbs (probability) measure in $\Lambda$ with boundary condition $\eta$ under external field $h$ and at temperature $T$ is now defined on $\Omega$ as
\[ \mu_{\Lambda,\eta,T,h}(\sigma) = \begin{cases} \dfrac{\exp(-\beta H_{\Lambda,\eta,h}(\sigma))}{Z_{\Lambda,\eta,T,h}}, & \text{if } \sigma \in \Omega_{\Lambda,\eta}, \\ 0, & \text{otherwise.} \end{cases} \]
The Gibbs measures satisfy the following monotonicity relations, to which we will refer as the FKG-Holley inequalities. If $\eta \le \zeta$ and $h_1 \le h_2$, then, for each $\Lambda \subset\subset \mathbb{Z}^2$ and $T > 0$,
\[ \mu_{\Lambda,\eta,T,h_1} \le \mu_{\Lambda,\zeta,T,h_2}. \]
A Gibbs measure for the infinite system on $\mathbb{Z}^2$ is defined as any probability measure $\mu$ which satisfies the DLR equations in the sense that for every $\Lambda \subset\subset \mathbb{Z}^2$ and $\mu$-almost all $\eta \in \Omega$,
\[ \mu(\,\cdot\,|\,\Omega_{\Lambda,\eta}) = \mu_{\Lambda,\eta,T,h}(\,\cdot\,). \tag{1.6} \]
Alternatively and equivalently, Gibbs measures can be defined as elements of the closed convex hull of the set of weak limit points of sequences of the form $(\mu_{\Lambda_i,\eta_i,h})_{i=1,2,\ldots}$, where each $\Lambda_i$ is finite and $\Lambda_i \to \mathbb{Z}^2$, as $i \to \infty$, in the sense that $\cup_{i=1}^{\infty} \cap_{j=i}^{\infty} \Lambda_j = \mathbb{Z}^2$. For each value of $T$ and $h$, $\mu_{\Lambda(l),-,T,h}$ (resp. $\mu_{\Lambda(l),+,T,h}$) converges weakly, as $l \to \infty$, to a probability measure that we will denote by $\mu_{-,T,h}$ (resp. $\mu_{+,T,h}$). If $h \ne 0$ it is known that $\mu_{-,T,h} = \mu_{+,T,h}$, which will then be denoted simply by $\mu_{T,h}$; it is also



known that this is the only Gibbs measure for the infinite system in this case. If $h = 0$ the same is true if the temperature is larger than or equal to a critical value $T_c > 0$, and is false for $T < T_c$, in which case one says that there is phase coexistence. We use the following abbreviations and names: $\mu_{-,T,0} = \mu_{-,T}$ = the (−)-phase, $\mu_{+,T,0} = \mu_{+,T}$ = the (+)-phase. Another known fact is that for each fixed $T$,
\[ \mu_{T,h} \to \mu_{+,T} \quad \text{weakly, as } h \searrow 0, \tag{1.7} \]
and
\[ \mu_{T,h} \to \mu_{-,T} \quad \text{weakly, as } h \nearrow 0. \tag{1.8} \]

For the expected value corresponding to a Gibbs measure $\mu_{\ldots}$, in finite or infinite volume, we will use the notation
\[ \langle f \rangle_{\ldots} = \int f\,d\mu_{\ldots}, \]
where $\ldots$ stands for arbitrary subscripts. The corresponding conditional expectation, given the event $E$, will be denoted by $\langle f\,|\,E \rangle_{\ldots}$. The spontaneous magnetization at temperature $T$ is defined as $m^*(T) = \langle \sigma(0) \rangle_{+,T}$. (Here we are using a common and convenient form of abuse of notation: $\sigma(x)$ is being used to denote the observable which associates to each configuration the value of the spin at the site $x$ in that configuration. This notation will also be used in other places.) It is known that $m^*(T) > 0$ if and only if $\mu_{-,T} \ne \mu_{+,T}$ and also that $\lim_{T \searrow 0} m^*(T) = 1$.

Surface tension and Wulff shape. The direction dependent 0-field surface tension is defined in the following way. First consider on $\mathbb{R}^2 \times \mathbb{R}^2$ the usual inner product $(x, y) = x_1 y_1 + x_2 y_2$. Let $S^1 = \{x \in \mathbb{R}^2 : \|x\|_2 = 1\}$, and for each vector $n \in S^1$, consider the following configuration, to be used as a boundary condition:
\[ \eta(n)(x) = \begin{cases} +1, & \text{if } (x, n) \ge 0, \\ -1, & \text{if } (x, n) < 0. \end{cases} \]
The surface tension in the direction perpendicular to $n$ is given by
\[ \tau_T(n) = \lim_{l \to \infty} -\frac{1}{\beta \|y(l) - z(l)\|_2} \log \frac{Z_{\Lambda(l),\eta(n),T,0}}{Z_{\Lambda(l),+,T,0}}, \]
where $y(l)$ and $z(l) = -y(l)$ are the points where the straight line $\{x \in \mathbb{R}^2 : (x, n) = 0\}$ intersects the boundary of the square $\Lambda(l)$. It is known that for each $T < T_c$ the surface tension $\tau_T(\cdot)$ is a continuous, strictly positive and finite function. We shall use $D$ to denote the set of all closed self-avoiding rectifiable curves $\gamma \subset \mathbb{R}^2$ that are the boundary of a bounded region, $\gamma = \partial V$, $V \subset \mathbb{R}^2$. Let us recall that a curve is called rectifiable if the supremum of the lengths of polygons, with edges connecting arbitrary collections of points chosen on the curve, in the order inherited from the curve, is finite (and then equals the length, $|\gamma|$, of the curve $\gamma$), and that a rectifiable curve has



a tangent at almost every point. It is easy to verify that a curve $\gamma$ that is the boundary of a convex bounded region belongs to $D$. We can assign to each curve $\gamma \in D$ the quantity
\[ \mathcal{W}(\gamma) = \mathcal{W}_T(\gamma) = \int_\gamma \tau_T(n_s)\,ds, \]

where $s$ parametrizes the curve $\gamma$ according to Euclidean length measured along this curve, and $n_s$ is the unit outward normal vector to the curve at the point $s \in \gamma$ (i.e., the vector orthogonal to the tangent at the considered point and oriented outward from the region bounded by $\gamma$). The functional $\mathcal{W}_T$ will be called the Wulff functional associated to the zero-field direction-dependent surface tension $\tau_T(\cdot)$. Sometimes we will refer to it also as the integrated surface tension. To every vector $n \in S^1$ and $\lambda > 0$ we assign the half-plane
\[ L_{T,n,\lambda} = \{x \in \mathbb{R}^2 : (x, n) \le \lambda \tau_T(n)\}. \]
Let us consider the intersection
\[ W_{T,\lambda} = \bigcap_{n \in S^1} L_{T,n,\lambda}. \tag{1.9} \]
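The construction (1.9) can be explored numerically by testing a point against finitely many of the half-planes. The Python sketch below does this for a stand-in surface tension, `tau_example(n) = |n_1| + |n_2|`, which is purely a hypothetical choice made here for illustration and is not the function $\tau_T$ of the paper; for this choice the intersection with $\lambda = 1$ is the square $[-1, 1]^2$. Discretizing the directions is also an approximation.

```python
import math


def tau_example(n):
    """Hypothetical stand-in for a surface tension (NOT tau_T from the text)."""
    return abs(n[0]) + abs(n[1])


def in_wulff_set(x, lam, tau=tau_example, directions=3600):
    """Approximate membership test for the intersection of the half-planes
    {x : (x, n) <= lam * tau(n)}, using equally spaced unit vectors n."""
    for k in range(directions):
        theta = 2.0 * math.pi * k / directions
        n = (math.cos(theta), math.sin(theta))
        # Small tolerance guards against rounding on the boundary.
        if x[0] * n[0] + x[1] * n[1] > lam * tau(n) + 1e-12:
            return False
    return True
```

With this stand-in, `in_wulff_set((0.99, 0.99), 1.0)` holds while `in_wulff_set((1.01, 0.0), 1.0)` fails, and doubling both the point and `lam` gives the same answers, in line with the homogeneity of the half-plane conditions in λ.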

These sets clearly satisfy the scaling relation $W_{T,\lambda} = \lambda W_{T,1}$. In particular they keep the same shape as $\lambda$ varies; this shape is called the Wulff shape. The Wulff body of volume 1 is defined as $W_T = W_{T,\lambda_0}$, where $\lambda_0$ is chosen so that its volume is indeed 1. $W_T$ is clearly convex and thus its boundary $\partial W_T \in D$. The following is therefore well defined: $w = w(T) = \mathcal{W}_T(\partial W_T)$. For each $T < T_c$, the boundary of the Wulff body satisfies the following variational principle. For all $\gamma \in D$ which are boundaries of regions of volume 1,
\[ w(T) \le \mathcal{W}_T(\gamma), \tag{1.10} \]

with equality only in case $\gamma$ is a translation of $\partial W_T$.

The dynamics. We introduce now for the Ising model above the type of time evolution which makes it into what is known as the kinetic Ising model, or stochastic Ising model, or dynamic Ising model, or Glauber dynamics. First we recall that a spin flip system is defined as a Markov process on the state space $\Omega$, whose generator, $L$, acts on a generic local observable $f$ as
\[ (Lf)(\sigma) = \sum_{x \in \mathbb{Z}^d} c(x, \sigma)\,(f(\sigma^x) - f(\sigma)), \tag{1.11} \]

where $\sigma^x$ is the configuration obtained from $\sigma$ by flipping the spin at the site $x$, and $c(x, \sigma)$ is called the rate of flip of the spin at the site $x$ when the system is in the state $\sigma$. In order for this generator to be well defined and indeed generate a unique Markov process, one has to assume that the rates $c(x, \sigma)$ satisfy certain regularity conditions. For our purposes here, we will actually restrict ourselves to the following conditions, which are more than enough to assure the existence and uniqueness of the process.

(H1) (Translation invariance) For every $x, y \in \mathbb{Z}^d$,



\[ c(x, \sigma) = c(x + y, \theta_y \sigma), \]
where $\theta_y \sigma$ is the configuration obtained by shifting $\sigma$ by $y$, i.e., $(\theta_y \sigma)(z) = \sigma(z - y)$.

(H2) (Finite range) There exists $R$ such that $c(0, \eta) = c(0, \zeta)$ if $\eta(x) = \zeta(x)$ whenever $\|x\|_\infty \le R$. The minimal such $R$ is called the range of the interaction.

The connection between the rates of flip and the Hamiltonian (1.4) and the temperature $T = 1/\beta$ is established by imposing conditions which assure us that the Gibbs measures are not only invariant, but also reversible with respect to the dynamics. These conditions, called detailed balance, state that for each $x \in \mathbb{Z}^d$ and $\sigma \in \Omega$,
\[ c(x, \sigma) = c(x, \sigma^x) \exp(-\beta \Delta_x H_h(\sigma)), \tag{1.12} \]
where
\[ \Delta_x H_h(\sigma) = \sigma(x) \Bigl( \sum_{y : \{x,y\} \in \mathbb{B}} \sigma(y) + h \Bigr), \]

which formally equals $H_h(\sigma^x) - H_h(\sigma)$. We will usually make the dependence on $h$ explicit, by writing $c_h(x, \sigma)$ for the rates. There are many examples of rates which satisfy the conditions of detailed balance (1.12) and also the other hypotheses, (H1) and (H2). The most common examples found in the literature are:

Example 1 (Metropolis Dynamics).
\[ c_h(x, \sigma) = \exp\bigl(-\beta (\Delta_x H_h(\sigma))^+\bigr), \]
where $(a)^+ = \max\{a, 0\}$ is the positive part of $a$.

Example 2 (Heat Bath Dynamics).
\[ c_h(x, \sigma) = \frac{1}{1 + \exp(\beta \Delta_x H_h(\sigma))}. \]

Example 3.
\[ c_h(x, \sigma) = \exp\Bigl(-\frac{\beta}{2} \Delta_x H_h(\sigma)\Bigr). \]
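The detailed balance condition (1.12) can be checked mechanically for the three example rates. Since each rate depends on the configuration only through $\Delta_x H_h(\sigma)$, and flipping the spin at $x$ changes the sign of this quantity, it suffices to compare the rate at $\Delta$ with the rate at $-\Delta$. The Python sketch below does this over all values of the spin and of the sum of the four neighboring spins; the function names are illustrative.

```python
import math


def metropolis(delta, beta):
    """Example 1: exp(-beta * (delta)^+)."""
    return math.exp(-beta * max(delta, 0.0))


def heat_bath(delta, beta):
    """Example 2: 1 / (1 + exp(beta * delta))."""
    return 1.0 / (1.0 + math.exp(beta * delta))


def exp_half(delta, beta):
    """Example 3: exp(-(beta / 2) * delta)."""
    return math.exp(-0.5 * beta * delta)


def local_energy_change(spin, neighbor_sum, h):
    """Delta_x H_h(sigma) = sigma(x) * (sum of neighboring spins + h)."""
    return spin * (neighbor_sum + h)


def satisfies_detailed_balance(rate, beta, h):
    """Check c(x, sigma) = c(x, sigma^x) * exp(-beta * Delta_x H_h(sigma))
    for every spin value and every possible sum of four neighboring spins."""
    for spin in (-1, 1):
        for neighbor_sum in (-4, -2, 0, 2, 4):
            delta = local_energy_change(spin, neighbor_sum, h)
            delta_flipped = local_energy_change(-spin, neighbor_sum, h)
            lhs = rate(delta, beta)
            rhs = rate(delta_flipped, beta) * math.exp(-beta * delta)
            if not math.isclose(lhs, rhs, rel_tol=1e-9):
                return False
    return True
```

A rate that ignores the energy change, such as a constant, fails the check, which shows that the test is not vacuous.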

Each one of these rates satisfies also the further conditions below, which will be needed for the analysis in this paper to be possible.

(H3) (Attractiveness and monotonicity in h) If $\eta(x) \le \zeta(x)$ and $h_1 \le h_2$, then
\[ c_{h_1}(x, \eta) \le c_{h_2}(x, \zeta) \quad \text{if } \eta(x) = \zeta(x) = -1, \]
\[ c_{h_1}(x, \eta) \ge c_{h_2}(x, \zeta) \quad \text{if } \eta(x) = \zeta(x) = +1. \]

(H4) (Uniform boundedness of rates) For each temperature $T$ there is $h(T) > 0$ and $0 < c_{\min}(T) \le c_{\max}(T) < \infty$ such that for all $h \in (-h(T), h(T))$ and $\sigma \in \Omega$,
\[ c_{\min}(T) \le c_h(0, \sigma) \le c_{\max}(T). \]



Throughout this paper we will suppose that we have chosen and kept fixed a set of rates $c_h(x, \sigma)$ which satisfy the detailed balance conditions (1.12) and all the hypotheses (H1)-(H4). This spin flip system will be denoted by $(\sigma^\eta_{h;t})_{t \ge 0}$, where $\eta$ is the initial configuration. If this initial configuration is selected at random according to a probability measure $\nu$, then the resulting process is denoted by $(\sigma^\nu_{h;t})_{t \ge 0}$. The probability measure on the space of trajectories of the process will be denoted by $\mathbb{P}$, and the corresponding expectation by $\mathbb{E}$. (Later, when we couple various related processes, we will also use the symbols $\mathbb{P}$ and $\mathbb{E}$ to denote probabilities and expectations in some larger probability spaces, but no confusion should arise from this.) The assumption of detailed balance, (1.12), assures that the Gibbs measures are invariant with respect to the stochastic Ising models. Moreover, from the assumption of attractiveness, (H3), one obtains the following convergence results:
\[ \sigma^-_{h;t} \to \mu_{-,T,h} \]
and
\[ \sigma^+_{h;t} \to \mu_{+,T,h}, \]

weakly, as $t \to \infty$. We will want to consider, sometimes as a tool, and sometimes for its own sake, the counterpart of the stochastic Ising model that we are considering, on an arbitrary finite set $\Lambda \subset\subset \mathbb{Z}^2$, with some boundary condition $\xi \in \Omega$. This process, which will be denoted by $(\sigma^\eta_{\Lambda,\xi,h;t})_{t \ge 0}$, where $\eta \in \Omega_{\Lambda,\xi}$ is the initial configuration, is defined as the spin flip system with rates of flip given by
\[ c_{\Lambda,\xi,h}(x, \sigma) = \begin{cases} c_h(x, \sigma), & \text{if } \sigma, \sigma^x \in \Omega_{\Lambda,\xi}, \\ 0, & \text{otherwise.} \end{cases} \]
When $\sigma, \sigma^x \in \Omega_{\Lambda,\xi}$, (1.12) yields, for all $x \in \mathbb{Z}^d$,
\[ \mu_{\Lambda,\xi,h}(\sigma)\,c_{\Lambda,\xi,h}(x, \sigma) = \mu_{\Lambda,\xi,h}(\sigma^x)\,c_{\Lambda,\xi,h}(x, \sigma^x), \tag{1.13} \]
which is the usual reversibility condition for finite state-space Markov processes. (Conversely, if one requires (1.13) to be satisfied for arbitrary $\Lambda \in \mathcal{F}$ and $\xi \in \Omega$, then one can deduce that (1.12) must hold.) It is clear from (H4) that $(\sigma^\eta_{\Lambda,\xi,h;t})$ is irreducible and hence from (1.13) it follows that, for any $\eta$,
\[ \sigma^\eta_{\Lambda,\xi,h;t} \to \mu_{\Lambda,\xi,T,h}, \]
weakly, as $t \to \infty$.
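The reversibility condition (1.13) can be verified by brute force on a very small box. The Python sketch below is a self-contained illustration under stated assumptions: Λ is a 2 × 2 block, the boundary condition is constant (all spins outside Λ equal to `xi`), and the rates are the Metropolis rates of Example 1. All function names are hypothetical.

```python
import itertools
import math

LAM = [(0, 0), (1, 0), (0, 1), (1, 1)]  # a 2 x 2 box, chosen for illustration


def neighbors(x):
    return [(x[0] + 1, x[1]), (x[0] - 1, x[1]), (x[0], x[1] + 1), (x[0], x[1] - 1)]


def energy(sigma, xi, h):
    """Finite-volume Hamiltonian (1.5) with constant boundary spin xi."""
    inside = sum(sigma[x] * sigma[y]
                 for x in LAM for y in neighbors(x) if y in sigma) / 2.0
    boundary = sum(sigma[x] * xi
                   for x in LAM for y in neighbors(x) if y not in sigma)
    return -0.5 * inside - 0.5 * boundary - 0.5 * h * sum(sigma.values())


def gibbs(xi, h, beta):
    """Gibbs probabilities, keyed by the tuple of spins in the order of LAM."""
    weights = {}
    for spins in itertools.product((-1, 1), repeat=len(LAM)):
        weights[spins] = math.exp(-beta * energy(dict(zip(LAM, spins)), xi, h))
    z = sum(weights.values())  # partition function
    return {spins: w / z for spins, w in weights.items()}


def metropolis_rate(sigma, x, xi, h, beta):
    """Example 1 rate with Delta_x H = sigma(x) * (sum of neighbors + h)."""
    s = sum(sigma.get(y, xi) for y in neighbors(x))
    return math.exp(-beta * max(sigma[x] * (s + h), 0.0))


def max_reversibility_violation(xi, h, beta):
    """Largest |mu(s) c(x, s) - mu(s^x) c(x, s^x)| over all s and x in LAM."""
    mu = gibbs(xi, h, beta)
    worst = 0.0
    for spins, p in mu.items():
        sigma = dict(zip(LAM, spins))
        for x in LAM:
            flipped = dict(sigma)
            flipped[x] = -sigma[x]
            p_flipped = mu[tuple(flipped[y] for y in LAM)]
            lhs = p * metropolis_rate(sigma, x, xi, h, beta)
            rhs = p_flipped * metropolis_rate(flipped, x, xi, h, beta)
            worst = max(worst, abs(lhs - rhs))
    return worst
```

Up to rounding, the returned violation is zero for any choice of `xi`, `h` and `beta`, which is the finite-volume reversibility statement in this toy case.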

Graphical construction. In order to prove our claims in this paper, we will use a standard graphical construction which provides versions of the whole family of processes at a given temperature T , with arbitrary value of h ∈ (−h(T ), h(T )), either on the infinite lattice Z2 or on any of its finite subsets, with arbitrary boundary conditions and starting from any initial configuration, all on the same probability space. This construction is the same one used in [Sch 1]. But the relevance of this construction will be even greater here than it was in that paper, since in part 3 of our paper we will use it to define the process on regions of space-time which are fairly general, and we will set up a rescaling procedure based on such objects. The graphical construction that we use is a specific version of what is called basic coupling between spin flip processes: a coupling in which the spins flip together as



much as possible, considering the constraint that they have to flip with certain rates. The construction is carried out by first associating to each site $x \in \mathbb{Z}^2$ two independent Poisson processes, each one with rate $c_{\max}(T)$. We will denote the successive arrival times (after time 0) of these Poisson processes by $(\tau^+_{x,n})_{n=1,2,\ldots}$ and $(\tau^-_{x,n})_{n=1,2,\ldots}$. Assume that the Poisson processes associated to different sites are also mutually independent. We say that at each point in space-time of the form $(x, \tau^+_{x,n})$ there is an upward mark and that at each point of the form $(x, \tau^-_{x,n})$ there is a downward mark. Next we associate to each arrival time $\tau^*_{x,n}$, where $*$ stands for $+$ or $-$, a random variable $U^*_{x,n}$ with uniform distribution between 0 and 1. All these random variables are supposed to be independent among themselves and independent from the previously introduced Poisson processes. This finishes the construction of the probability space. The corresponding probability and expectation will be denoted, respectively, by $\mathbb{P}$ and $\mathbb{E}$.

We have to say now how the various processes are constructed on this probability space. For finite $\Lambda$ and arbitrary $\xi$, the process $(\sigma^\eta_{\Lambda,\xi,h;t})$ is constructed as follows. We know that almost surely the random times $\tau^*_{x,n}$, $x \in \Lambda$, $n = 1, 2, \ldots$, $* = +, -$, are all distinct, and we update the state of the process at each time when there is a mark at some $x \in \Lambda$ according to the following rules. If the mark that we are considering is at the point $(x, \tau^*_{x,n})$, and the configuration immediately before time $\tau^*_{x,n}$ was $\sigma$, then
i) The spins not at $x$ do not change.
ii) If $\sigma(x) = -1$ (resp. $\sigma(x) = +1$), then the spin at $x$ can only flip if the mark is of upward type (resp. downward type).
iii) If the mark is upward and $\sigma(x) = -1$, or if the mark is downward and $\sigma(x) = +1$, then we flip the spin at $x$ if and only if $c_{\Lambda,\xi,h}(x, \sigma) > U^*_{x,n}\,c_{\max}$.
One can readily see that the process constructed in this fashion has the correct rates of flip.
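The rules i)-iii) can be turned into a short simulation on a finite box. The Python sketch below is an illustration under assumptions made here, not the construction used in the proofs: it uses the heat bath rates of Example 2 (which are bounded above by 1, so `c_max = 1.0` is a valid uniform bound), a constant boundary spin `xi`, and hypothetical function names. Running two initial configurations against the same marks realizes the coupling in which spins flip together as much as possible.

```python
import math
import random


def neighbors(x):
    return [(x[0] + 1, x[1]), (x[0] - 1, x[1]), (x[0], x[1] + 1), (x[0], x[1] - 1)]


def heat_bath_rate(sigma, x, xi, beta, h):
    """Example 2 rate; spins outside the box are all equal to xi."""
    s = sum(sigma.get(y, xi) for y in neighbors(x))
    return 1.0 / (1.0 + math.exp(beta * sigma[x] * (s + h)))


def sample_marks(sites, t_max, c_max, rng):
    """Upward (+1) and downward (-1) marks up to time t_max: independent
    Poisson processes of rate c_max per site and type, each mark carrying
    an independent uniform variable."""
    marks = []
    for x in sites:
        for kind in (+1, -1):
            t = rng.expovariate(c_max)
            while t <= t_max:
                marks.append((t, x, kind, rng.random()))
                t += rng.expovariate(c_max)
    marks.sort(key=lambda mark: mark[0])  # process marks in time order
    return marks


def run(initial, marks, xi, beta, h, c_max=1.0):
    """Apply rules i)-iii): a mark of a given type can only flip a spin of
    the opposite sign, and does so iff rate > U * c_max."""
    sigma = dict(initial)
    for _, x, kind, u in marks:
        if sigma[x] == -kind and heat_bath_rate(sigma, x, xi, beta, h) > u * c_max:
            sigma[x] = kind
    return sigma
```

As a usage example, one can draw the marks once with `random.Random(0)` on a 4 × 4 box and run the dynamics from the all-minus configuration with `xi = -1` and from the all-plus configuration with `xi = +1`; because the heat bath rates are attractive, the first final configuration lies below the second at every site.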
In principle, one would like to construct the processes on the infinite lattice $\mathbb{Z}^2$ in a similar fashion, with $c_h(x, \sigma)$ replacing $c_{\Lambda,\xi,h}(x, \sigma)$ in (iii). Some extra care has to be taken, because during any non-degenerate interval of time infinitely many marks occur. This is not a real problem, because of the assumption that the range of the interaction, $R$, is finite. Starting from a configuration $\eta$ at time 0, we have to say how the spin at a generic site $x$ at a time $t$ is obtained. Using percolation arguments one can argue that on a set of probability 1 in the probability space where the marks were defined, for any fixed $x$ and $t$, if we take any boundary condition $\xi$, then the sequence $(\sigma^\eta_{\Lambda(l),\xi,h;t}(x))_{l=1,2,\ldots}$ will converge as $l \to \infty$ (i.e., will become constant for large $l$), to a limit which does not depend on $\xi$. This limit can then be taken to be the value of $\sigma^\eta_{h;t}(x)$, and it is clear that the process thus constructed has the correct flip rates and is therefore a version of the process $(\sigma^\eta_{h;t})$. The expected value of a function of that process will be denoted by $\langle \cdot \rangle^\eta_{h;t}$.

A standard proof of the claim above about insensitivity to receding boundary conditions can be found in [Sch 1]. The estimates presented in [Sch 1] show also that even if we let $t$ grow with $l$, but keeping $l/t$ large enough, then the spin at a fixed site $x$ is almost insensitive up to time $t$ to what happens outside of the box $\Lambda(l)$. We state this result in the form of a lemma for future reference. This lemma is a rigorous counterpart to the informal statement that because of the finite range of the interaction and of the uniform upper bound on the rates of flip, “the effects travel with a bounded speed”.

Lemma 1.2.1. For each temperature $T$, there exists a finite positive constant $C(T)$ such that if $l \ge C(T)t$, then there exist a positive function $C_1(x)$, $x \in \mathbb{Z}^2$, and a positive constant $C_2$, such that for every site $x \in \mathbb{Z}^2$,



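\[ \sup_{h \in (-h(T), h(T))}\ \sup_{\xi \in \Omega}\ \sup_{\eta \in \Omega}\ \mathbb{P}\bigl(\sigma^\eta_{h;t}(x) \ne \sigma^\eta_{\Lambda(l),\xi,h;t}(x)\bigr) \le C_1(x) \exp\{-C_2 l\}. \]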

Proof. See [Sch 1].

Because of the hypothesis (H3), of attractiveness and monotonicity in $h$, the coupling provided by the construction above preserves the order between the coupled marginal processes, in various cases. In this paper we will need the following facts. If $\eta \le \zeta$, $\xi \le \xi'$, $-h(T) < h_1 \le h_2 < h(T)$ and $\Lambda \subset\subset \mathbb{Z}^2$ is arbitrary, then for all $t \ge 0$,
\[ \sigma^\eta_{\Lambda,\xi,h_1;t} \le \sigma^\zeta_{\Lambda,\xi',h_2;t}, \tag{1.14} \]
\[ \sigma^\eta_{h_1;t} \le \sigma^\zeta_{h_2;t}, \tag{1.15} \]
and
\[ \sigma^\eta_{\Lambda,-,h_1;t} \le \sigma^\zeta_{h_2;t}. \tag{1.16} \]
We will refer to these inequalities as basic-coupling inequalities. (Observe that the FKG-Holley inequalities for the models we are considering can be derived from (1.14).)

We will sometimes have to enlarge the probability space defined by the graphical construction to accommodate a random choice of the initial configuration, according to some distribution $\nu$, performed independently of the subsequent time evolution of the process. In this case we will replace the initial configuration with $\nu$ in the notation for the process, but, in a slight abuse of notation, we will still use $\mathbb{P}$ and $\mathbb{E}$ for probabilities and expectations in this enlarged probability space.

A few more remarks on notation and conventions. We will use $C$, $C_1$, $C_2$, $C'$, $C(T)$, etc., to denote positive finite constants, whose precise values are not relevant and may even change from appearance to appearance. We will omit the temperature $T$ in most of the notation, since it is fixed.

1.3. Some tools and further definitions. Before we can state our main results, we need to introduce a few more concepts. This will be done in this section, intermingled with the presentation of some basic techniques. A fundamental fact that we will use often is that if $T < T_c$ then there is a finite positive constant $C(T)$ such that for all $h \ge 0$, all $\Lambda \subset \mathbb{Z}^2$ and all $x, y \in \mathbb{Z}^2$,
\[ \mu_{\Lambda,+,h}\bigl(x \stackrel{-*}{\longleftrightarrow} y\bigr) \le \mu_{+,h}\bigl(x \stackrel{-*}{\longleftrightarrow} y\bigr) \le \mu_{+}\bigl(x \stackrel{-*}{\longleftrightarrow} y\bigr) \le \exp(-C(T) \|x - y\|_\infty). \tag{1.17} \]

The first two inequalities above are instances of the FKG-Holley inequalities (and hold, of course, for all $T$), and the third one is Theorem 1 in [CCS]. We review next some well known ways to exploit these inequalities in combination with the FKG-Holley inequalities and the Markov property of the Gibbs measures. Until we say otherwise, we will be supposing that $f$ is an increasing local observable. We start with the observation that if $\nu_1$ and $\nu_2$ are two probability distributions on $\Omega$, and the event $E \subset \Omega$ is such that $\nu_1(\,\cdot\,) \le \nu_2(\,\cdot\,|\,E^c)$, then
\begin{align*}
\int f\,d\nu_1 - \int f\,d\nu_2 &= \int f\,d\nu_1 - \nu_2(E) \int f\,d\nu_2(\,\cdot\,|\,E) - \nu_2(E^c) \int f\,d\nu_2(\,\cdot\,|\,E^c) \\
&= \Bigl[ \int f\,d\nu_1 - \int f\,d\nu_2(\,\cdot\,|\,E^c) \Bigr] + \nu_2(E) \Bigl[ -\int f\,d\nu_2(\,\cdot\,|\,E) + \int f\,d\nu_2(\,\cdot\,|\,E^c) \Bigr] \\
&\le 2\,\|f\|_\infty\,\nu_2(E). \tag{1.18}
\end{align*}

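Suppose that $\Lambda \subset \Lambda' \subset\subset \mathbb{Z}^2$ and $\eta \in \Omega$. By partitioning the complement of the event $E = \{\mathrm{Supp}(f) \stackrel{-}{\longleftrightarrow} \Lambda^c\}$ according to what the (−)-cluster of the set $\Lambda^c$ is, and using the FKG-Holley inequalities and the Markov property of the Gibbs distributions, one obtains
\[ \mu_{\Lambda',\eta,h}(\,\cdot\,|\,E^c) \ge \mu_{\Lambda,+,h}(\,\cdot\,). \tag{1.19} \]
Therefore, from the FKG-Holley inequalities and (1.18) we obtain for any $\eta \in \Omega$
\[ 0 \le \langle f \rangle_{\Lambda,+,h} - \langle f \rangle_{\Lambda',\eta,h} \le 2\,\|f\|_\infty\,\mu_{\Lambda',\eta,h}\bigl(\mathrm{Supp}(f) \stackrel{-}{\longleftrightarrow} \Lambda^c\bigr). \tag{1.20} \]
If we let now $\Lambda' \to \mathbb{Z}^2$ and suppose that $\mu_{\Lambda',\eta,h} \to \mu$ weakly for some distribution $\mu$, we obtain
\[ 0 \le \langle f \rangle_{\Lambda,+,h} - \int f\,d\mu \le 2\,\|f\|_\infty\,\mu\bigl(\mathrm{Supp}(f) \stackrel{-}{\longleftrightarrow} \Lambda^c\bigr). \tag{1.21} \]
By taking $\eta = +$ and combining (1.21) with (1.17) we obtain, when $T < T_c$ and $h \ge 0$,
\[ 0 \le \langle f \rangle_{\Lambda,+,h} - \langle f \rangle_{+,h} \le C(f) \exp\bigl(-C(T)\,\mathrm{dist}_\infty(\mathrm{Supp}(f), \Lambda^c)\bigr). \tag{1.22} \]
Next we consider the correlation functions
\[ \langle f; g \rangle_{\Lambda,+,h} = \langle f g \rangle_{\Lambda,+,h} - \langle f \rangle_{\Lambda,+,h} \langle g \rangle_{\Lambda,+,h}, \]
where $f$ and $g$ are two local observables, not necessarily increasing. We claim that there is a finite positive numerical constant $C$ such that for each $T$ and $h$,
\[ |\langle f; g \rangle_{\Lambda,+,h}| \le C\,\|f\|_\infty\,\|g\|_\infty\,\mu_{\Lambda,+,h}\bigl(\mathrm{Supp}(f) \stackrel{-}{\longleftrightarrow} \mathrm{Supp}(g)\bigr). \tag{1.23} \]
There is no loss in generality in supposing that $f$ and $g$ are increasing and have $\|f\|_\infty \le 1$, $\|g\|_\infty \le 1$. Given $\eta \in \{-1, +1\}^{\mathrm{Supp}(g)}$, let $E_\eta$ be the event that the configuration restricted to $\mathrm{Supp}(g)$ is identical to $\eta$, and let $g_\eta$ be the value that $g$ assumes on $E_\eta$. Note now that using first the FKG-Holley inequalities and then (1.20), we obtain
\begin{align*}
\langle f g \rangle_{\Lambda,+,h} &= \sum_{\eta \in \{-1,+1\}^{\mathrm{Supp}(g)}} \langle f\,|\,E_\eta \rangle_{\Lambda,+,h}\,g_\eta\,\mu_{\Lambda,+,h}(E_\eta) \\
&\le \langle f \rangle_{\Lambda \setminus \mathrm{Supp}(g),+,h}\,\langle g \rangle_{\Lambda,+,h} \\
&\le \Bigl[ \langle f \rangle_{\Lambda,+,h} + 2\,\mu_{\Lambda,+,h}\bigl(\mathrm{Supp}(f) \stackrel{-}{\longleftrightarrow} \mathrm{Supp}(g)\bigr) \Bigr] \langle g \rangle_{\Lambda,+,h}.
\end{align*}
This immediately implies (1.23).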



In case $T < T_c$ and $h \ge 0$, we can combine (1.23) with (1.17) to obtain
\[ |\langle f; g \rangle_{\Lambda,+,h}| \le C(f, g) \exp\bigl(-C(T)\,\mathrm{dist}_\infty(\mathrm{Supp}(f), \mathrm{Supp}(g))\bigr), \tag{1.24} \]
an estimate that in particular is uniform in $h \ge 0$ and in $\Lambda$. A consequence of the exponential decay of correlations in (1.24) is that when $T < T_c$ the function $\langle f \rangle_{+,h}$ of $h \ge 0$, which to $h = 0$ associates $\langle f \rangle_+$ and to $h > 0$ associates $\langle f \rangle_h$, is infinitely differentiable at $h = 0$. Moreover, for $j = 1, 2, \ldots$, the following identity holds:
\[ \frac{d^j \langle f \rangle_{+,h}}{dh^j} \bigg|_{h = 0+} = \Bigl( \frac{\beta}{2} \Bigr)^j \sum_{x_1, \ldots, x_j \in \mathbb{Z}^2} \langle f; \sigma(x_1); \ldots; \sigma(x_j) \rangle_+. \tag{1.25} \]

In this expression, which will be justified below, the quantities that appear inside of the summation are called generalized Ursell functions, and are defined next. We start by defining the generalized Ursell functions for a Gibbs measure $\mu_{\Lambda,\eta,h}(\,\cdot\,)$, where $\Lambda$ is a finite set. For this purpose we consider a generalization of the Hamiltonian, in which at each site $x$ the external applied field may be different, and takes the value $h_x$. If $\underline{h}$ is the function that to each $x \in \mathbb{Z}^2$ associates $h_x$, then we will denote by $\mu_{\Lambda,\eta,\underline{h}}$ the corresponding Gibbs distribution in $\Lambda$ with boundary condition $\eta$. With a local observable $f$ given we define
\[ \langle f; \sigma(x_1); \ldots; \sigma(x_j) \rangle_{\Lambda,\eta,h} = \Bigl( \frac{2}{\beta} \Bigr)^j \frac{\partial^j \langle f \rangle_{\Lambda,\eta,\underline{h}}}{\partial h_{x_1} \cdots \partial h_{x_j}} \bigg|_{\underline{h} \equiv h}, \]
where $\underline{h} \equiv h$ means that the function $\underline{h}$ is identically $h$. It is easy to see, by induction on $j$, that $\langle f; \sigma(x_1); \ldots; \sigma(x_j) \rangle_{\Lambda,\eta,h}$ is a linear combination of products of $\mu_{\Lambda,\eta,h}$-expected values of local observables, all of them with support contained in $\mathrm{Supp}(f) \cup \{x_1, \ldots, x_j\}$. Convergence of the generalized Ursell functions as $\Lambda \to \mathbb{Z}^2$ follows from this, for arbitrary $T$, $h$ and $\eta$, as long as $\mu_{\Lambda,\eta,h}$ converges weakly to some distribution. In case $T < T_c$, $h \ge 0$ and $\eta = +$, we can do better and obtain from (1.22) the bound
\[ |\langle f; \sigma(x_1); \ldots; \sigma(x_j) \rangle_{\Lambda,+,h} - \langle f; \sigma(x_1); \ldots; \sigma(x_j) \rangle_{+,h}| \le C_j(f) \exp\bigl(-C(T)\,\mathrm{dist}_\infty(\mathrm{Supp}(f) \cup \{x_1, \ldots, x_j\}, \Lambda^c)\bigr). \tag{1.26} \]
Observe that in particular this estimate is uniform in $h \ge 0$. The proof of (1.25) can be found in Sect. 2 of [M-L]. In that paper the result was proven at low enough temperature, but this was so simply because the estimate (1.24), which is a consequence of (1.17), was not available then. Replacing Theorem 1 in [M-L] with (1.24), the proof in that paper applies up to $T_c$. The basic estimate used in [M-L], and to which we will return in Sect. 2.4, states that the exponential decay of correlations (1.24) implies a similar exponential decay for the generalized Ursell functions, when the diameter of the set $\mathrm{Supp}(f) \cup \{x_1, \ldots, x_j\}$ becomes large:
\[ |\langle f; \sigma(x_1); \ldots; \sigma(x_j) \rangle_{\Lambda,+,h}| \le C_j(T, f) \exp\Bigl(-C(T)\,\frac{\mathrm{diam}_\infty(\mathrm{Supp}(f) \cup \{x_1, \ldots, x_j\})}{j}\Bigr). \tag{1.27} \]



Observe that in particular this estimate is uniform in $h \ge 0$ and in $\Lambda$. The proof that (1.27) follows from (1.24) in the special case in which $f$ is of the form $f(\sigma) = \sigma(y_1) \cdots \sigma(y_m)$ for some set of sites $\{y_1, \ldots, y_m\}$ can be found in Appendix B of [M-L]. The general case follows immediately from this one, since any local function $f$ can be written as a linear combination of functions of this special form, with $\{y_1, \ldots, y_m\}$ running over all the subsets of $\mathrm{Supp}(f)$. As explained in the proof of Theorem 4 in [M-L], (1.25) follows in a standard fashion from (1.27).

In connection to (1.25) it is worth mentioning that for $h > 0$ the function $\langle f \rangle_h$ is analytic, as follows from the Lee-Yang theorem. Moreover, in this case the Gibbs distributions are completely analytic in an appropriate sense (see [SS1] and references therein). The estimate (1.27) can be replaced for $h > 0$ by a stronger one:
\[ |\langle f; \sigma(x_1); \ldots; \sigma(x_j) \rangle_{\Lambda,+,h}| \le C_j'(T, h, f) \exp\bigl(-C(T, h)\,\mathrm{dist}_{\mathrm{tree}}(\mathrm{Supp}(f), x_1, \ldots, x_j)\bigr), \]
where $\mathrm{dist}_{\mathrm{tree}}(\Lambda_1, \ldots, \Lambda_j)$ is the length of the shortest tree in $\mathbb{R}^2$ connecting all the sets $\Lambda_1, \ldots, \Lambda_j \subset \mathbb{Z}^2$. However, the constants $C_j'(T, h, f)$ explode as $h \to 0$. It is also important to recall that, on the other hand, it has been proven in [Isa] that at low enough temperatures there is nevertheless an essential singularity at $h = 0$; this is expected to be so up to $T_c$, but no proof of that claim is available, as far as we know.

The identity (1.25) and the various related statements that we made above have, of course, analogues for $h \le 0$. Those are the ones that will be relevant for us when we study metastability under small $h > 0$, since the “metastable state” should then be a “continuation” of the equilibrium states with $h \le 0$.

1.4. Main results. Recall that we are considering a kinetic Ising model for the formal Hamiltonian (1.4) in dimension 2, which is supposed to satisfy conditions (H1), (H2), (H3) and (H4) of Sect. 1.2.
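Recall also that for $T < T_c$ we define $\lambda_c = \lambda_c(T) = w(T)^2 / (12\,T\,m^*(T))$. The following theorem is our main result.

Theorem 1. Suppose $T < T_c$. For every probability distribution $\nu \le \mu_-$ the following happens:
i) If $0 < \lambda < \lambda_c$, then for each $n \in \{1, 2, \ldots\}$ and for each local observable $f$,
\[ \mathbb{E}\bigl(f(\sigma^\nu_{h;\exp(\lambda/h)})\bigr) = \sum_{j=0}^{n-1} b_j(f)\,h^j + O(h^n) \]
for $h > 0$, where
\[ b_j(f) = \frac{1}{j!} \frac{d^j \langle f \rangle_{-,h}}{dh^j} \bigg|_{h = 0-} = \frac{1}{j!} \Bigl( \frac{\beta}{2} \Bigr)^j \sum_{x_1, \ldots, x_j \in \mathbb{Z}^2} \langle f; \sigma(x_1); \ldots; \sigma(x_j) \rangle_-, \]
and $O(h^n)$ is a function of $f$ and $h$ which satisfies $\limsup_{h \searrow 0} |O(h^n)| / h^n < \infty$.
ii) If $\lambda > \lambda_c$, then for any finite positive $C$ there is a finite positive $C_1$ such that for every local observable $f$,
\[ \bigl| \mathbb{E}\bigl(f(\sigma^\nu_{h;\exp(\lambda/h)})\bigr) - \langle f \rangle_h \bigr| \le C_1\,\|f\|_\infty \exp\Bigl(-\frac{C}{h}\Bigr), \]
for all $h > 0$.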



From this theorem, the simple Proposition 1 in [Sch 1], and the fact that $\langle f \rangle_h \to \langle f \rangle_+$ as $h \searrow 0$ (see (1.7), or the paragraph which precedes (1.25)), the following corollary is obtained.

Corollary 1. Suppose $T < T_c$. For every probability distribution $\nu \le \mu_-$ the following happens. If we let $h \searrow 0$ and $t \to \infty$ together, then for every local observable $f$,
i) $\mathbb{E}\bigl(f(\sigma^\nu_{h;t})\bigr) \to \langle f \rangle_-$ if $\limsup h \log t < \lambda_c(T)$.
ii) $\mathbb{E}\bigl(f(\sigma^\nu_{h;t})\bigr) \to \langle f \rangle_+$ if $\liminf h \log t > \lambda_c(T)$.
In other words, we are stating that the law of the random configuration $\sigma^\nu_{h;t}$ converges weakly to $\mu_-$ in case (i) and to $\mu_+$ in case (ii).

This corollary is already an important strengthening of Theorem 1 in [Sch 1] in the 2 dimensional case. The following aspects of that theorem are improved here: 1) There is a single constant $\lambda_c$ separating the regimes (i) and (ii). 2) The temperature is now only required to be below $T_c$. 3) The initial distribution is much more general than in [Sch 1], where it was supposed to be concentrated on the configuration with all spins down. Note that, by the FKG-Holley inequalities, for each $h < 0$ the distribution $\mu_h$ satisfies the condition above on the initial distribution $\nu$.

To illustrate and clarify the way in which Theorem 1(i) improves even further the statement in Corollary 1(i), let us take the local observable given by $f(\sigma) = \sigma(0)$ and $n = 2$. We have then, when $0 < \lambda < \lambda_c$,
\[ \mathbb{E}\bigl(\sigma^\nu_{h;\exp(\lambda/h)}(0)\bigr) = -m^* + \chi h + O(h^2), \]
when $h > 0$. Here
\[ \chi = b_1(f) = \frac{d \langle \sigma(0) \rangle_{-,h}}{dh} \bigg|_{h = 0-} = \frac{\beta}{2} \sum_{x \in \mathbb{Z}^2} \langle \sigma(0); \sigma(x) \rangle_- \]
is the susceptibility at $h = 0-$. This means that when $h > 0$ is small the function $-m^* + \chi h$ is a better approximation to $\mathbb{E}\bigl(\sigma^\nu_{h;\exp(\lambda/h)}(0)\bigr)$ than the constant function identical to $-m^* = \langle f \rangle_-$. This function $-m^* + \chi h$ is the smooth linear continuation into the region $h \ge 0$ of the function which to $h < 0$ associates the equilibrium expectation $\langle f \rangle_h$. Similar interpretations can be given for larger values of $n$ and arbitrary $f$.

Another way to express part of the content of Theorem 1 is to observe that it claims that for any $\lambda$, $0 < \lambda < \lambda_c$, and any probability distribution $\nu \le \mu_-$, the branch of states $\langle \cdot \rangle^\nu_{h;\exp(\lambda/h)}$ for $h > 0$ is a $C^\infty$ continuation of the family $\langle \cdot \rangle_h$ for $h < 0$. That interpretation suggests that the phenomenon of metastability should be understood dynamically, in which case the physically meaningful smooth continuations through the critical point $h = 0$ become possible.

In the physics literature (see, e.g., [BM]), one sometimes relates the metastable relaxation of a system to the presence of a “plateau” in the graph corresponding to the time evolution of a quantity of the type of $\mathbb{E}\bigl(f(\sigma^\nu_{h;t})\bigr)$. Of course, strictly speaking there is no “plateau”, and generically the slope of such a function is never 0. Still, from the experimental point of view a rough “plateau” can be seen and described as follows. In a relatively short time $\mathbb{E}\bigl(f(\sigma^\nu_{h;t})\bigr)$ seems to converge to a value close to $\langle f \rangle_-$; after this, one sees an apparent flatness in the relaxation curve over a period of time which may be quite long compared with the time needed to first approach this value. But eventually the relaxation curve starts to deviate from this almost constant value and move towards



the true asymptotic limit, close to $\langle f \rangle_+$. The experimentally almost flat portion of the relaxation curve is referred to as a “plateau”. Theorem 1 can be seen to some extent as giving some precise meaning to such a “plateau”, and we discuss now two ways in which this can be done. First note that if $0 < \lambda' < \lambda'' < \lambda_c$, then from Part (i) of the theorem we have
\[ \mathbb{E}\bigl(f(\sigma^\nu_{h;\exp(\lambda'/h)})\bigr) - \mathbb{E}\bigl(f(\sigma^\nu_{h;\exp(\lambda''/h)})\bigr) \to 0, \]
faster than any power of $h$. Observe that we are considering times which are of different orders of magnitude, when $h$ is small, and still we are observing a nearly constant $\mathbb{E}\bigl(f(\sigma^\nu_{h;t})\bigr)$. For a second way in which Theorem 1 can be seen as expressing the presence of a “plateau”, we can think of plotting $\mathbb{E}\bigl(f(\sigma^\nu_{h;t})\bigr)$ versus $\log(t)$, rather than versus $t$. This is somewhat the natural graph to consider, if one is interested in the order of magnitude of the relaxation time. If the $\log(t)$-axis is drawn in the proper scale, amounting to replacing it with $h \log(t)$, then, when $h$ is small, Theorem 1 tells us that the graph should be close to that of a step function which jumps at the point $\lambda_c$, from the value $\langle f \rangle_-$ to the value $\langle f \rangle_+$.

Readers who are familiar with [Sch1] and [Sch2] can expect that also Theorem 4 and Corollary 1 in [Sch 1], which refer to finite systems with (−) boundary conditions and sizes which are scaled as $h \searrow 0$, have stronger versions along the lines of the current paper. This is indeed the case, but for brevity we will omit the statements of these theorems, which can easily be obtained and can be proved with the techniques introduced in this paper.

1.5. Heuristics. One of the appealing features of the results proven in this paper is that some of them can be correctly predicted based on a very simple and naive-looking heuristics.
It is probably a challenge to a historian of science to trace back the origin and evolution of this non-rigorous approach to the problem, to give the proper credit to the people involved and to elucidate how the interplay among empirical observation of metastable systems, theoretical analysis and computer simulation led to the reasoning described in a simple fashion below. Here we will make no attempt to clarify the history of the subject. The interested reader will find a great deal of references to the earlier literature on the subject in the paper [RTMS]. It is worth stressing that certain parts of the heuristics were rediscovered more than once in different forms and contexts, so that giving proper credit is a very difficult task. The reader may want to compare the heuristics presented below with the one presented in [Sch1] and [Sch2]. The main difference is that here we are being more ambitious by basing the heuristics on the computation of the free-energy of a droplet with an arbitrary shape, rather than on the energy of a square droplet. At the time that those two former papers were written, it seemed to their author that there was no compelling reason to believe that the equilibrium free-energy of droplets would predict correctly the value of $\lambda_c$, i.e., the behavior of the relaxation time to the level of the correct rate of exponential growth with 1/h.

The first ingredient of the heuristics is the idea of looking at an individual droplet of the stable phase (roughly the (+)-phase, since h is small) in a background given by the metastable phase (roughly the (−)-phase). Let S be the shape of that droplet, which a priori can be arbitrary. Say that $l^2$ is the volume (i.e., the number of sites) of the droplet, and let us find an expression for the free-energy of such a droplet. This free-energy may be seen as coming from two main contributions. There should be a bulk term, proportional to $l^2$. This term should be obtained by multiplying $l^2$ by the difference in



free-energy per site between the (+)-phase and the (−)-phase in the presence of a small magnetic field $h > 0$. This difference in the free-energy per site of the two phases should come only from the term in the Hamiltonian which couples the spins to the external field and should therefore be given by $2m^* h/2 = m^* h$. The other relevant contribution to the free-energy of the droplet should come from its surface, where there is an interface between the (+)-phase and the (−)-phase. This contribution is proportional to the length of the interface, which is of the order of l. It should be multiplied by a constant $w_S$ which depends on the shape of the droplet. This constant $w_S$ represents the excess free-energy per unit of length integrated over the surface of the droplet when its scale is changed so that its volume becomes 1. Therefore, since the external field h is small, we can take for $w_S$ the value $W(\gamma)$, where γ is the boundary of the droplet, rescaled in this fashion. In particular, $w_S$ is minimized when the droplet has the Wulff shape, and in this case $w_S = w$. Adding the pieces, we obtain for the free-energy of the droplet the expression
$$\Phi_S(l) = -m^* h\, l^2 + w_S\, l.$$
The two terms in this expression become of the same order of magnitude, in case l is of the order of 1/h. Therefore, for later convenience we write $l = b/h$, with a new variable $b \ge 0$. This yields
$$\Phi_S(b/h) = \frac{\phi_S(b)}{h},$$
where
$$\phi_S(b) = -m^* b^2 + w_S\, b.$$
This very simple function takes the value 0 at $b = 0$, grows with b on the interval $[0, B_c^S]$, where $B_c^S = B_c^S(T) = \frac{w_S}{2m^*}$, reaching its absolute maximum $\phi_S(B_c^S) = \frac{(w_S)^2}{4m^*} = A^S(T) = A^S$ at the end of this interval, and decreases with b on the semi-infinite interval $[B_c^S, \infty)$. It crosses the value 0 at the point $B_0^S(T) = B_0^S = \frac{w_S}{m^*} = 2B_c^S$.
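The elementary calculus behind these three quantities can be checked numerically. The following is only a minimal sketch: the values chosen for $m^*$ and $w_S$ are hypothetical placeholders (the text gives no numbers), and the function names are introduced here purely for illustration.

```python
# Minimal sketch of the droplet free-energy profile phi_S(b) = -m* b^2 + w_S b.
# The numerical values of m_star and w_s below are assumptions for illustration only.

def phi(b, m_star, w_s):
    """Rescaled droplet free-energy phi_S(b)."""
    return -m_star * b ** 2 + w_s * b

def droplet_profile(m_star, w_s):
    """Return (B_c, A, B_0): critical size, barrier height, second zero of phi_S."""
    b_c = w_s / (2.0 * m_star)        # maximizer of phi_S
    a = w_s ** 2 / (4.0 * m_star)     # barrier height phi_S(B_c)
    b_0 = w_s / m_star                # point where phi_S returns to 0
    return b_c, a, b_0

m_star, w_s = 0.9, 3.0                # hypothetical values
b_c, a, b_0 = droplet_profile(m_star, w_s)
```

With these placeholder values, `phi(b_c, m_star, w_s)` reproduces the barrier `a`, and `b_0` equals `2 * b_c`, in line with the formulas in the text.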
Metastability is then “understood” from the fact that systems in contact with a heat bath move towards lowering their free-energy, so that the presence of a free-energy barrier which needs to be overcome in order to create a large droplet of the stable phase with any shape keeps the system close to the metastable phase. Subcritical droplets are constantly being created by thermal fluctuations, in the metastable phase, but they tend to shrink, as dictated by the free-energy landscape. On the other hand, once a supercritical droplet is created due to a larger fluctuation, it will grow and drive the system to the stable phase, possibly colliding and coalescing in its growth with other supercritical droplets created elsewhere. As a function of h, the linear size of a critical droplet, $B_c^S/h$, blows up as $h \searrow 0$. One can then, in a somewhat circular, but heuristically meaningful way, say that the macroscopic free-energy of droplets is indeed a relevant object of consideration. One can also hope then that sharp theorems could be conjectured and possibly proven regarding the asymptotic behavior of quantities of interest in the limit $h \searrow 0$.

Regarding the shape of the droplet, the height of this barrier is minimized by minimizing the value of the constant $w_S$, i.e., by considering Wulff-shaped droplets. This singles out the Wulff shape as the most relevant one in the heuristics above. We will simplify the notation by omitting the subscript S when talking about the Wulff shape. In particular,
$$B_c = \frac{w}{2m^*}, \qquad A = \frac{w^2}{4m^*}. \tag{1.28}$$
Based on the expression above for the free-energy barrier, one predicts the rate of creation of supercritical droplets with center at a given place to be
$$\exp\Bigl(-\frac{\beta A}{h}\Bigr).$$



In what follows now we write d instead of 2, to make the role of the dimension clear in the geometric argument which comes next. We are concerned with an infinite system, and we are observing it through a local function f, which depends, say, on the spins in a finite set Supp(f). For us the system will have relaxed to equilibrium when Supp(f) is covered by a big droplet of the plus-phase, which appeared spontaneously somewhere and then grew, as discussed above. We want to estimate how long we have to wait for the probability of such an event to be large. If we suppose that the radius of supercritical droplets grows with a speed v, then we can see that the region in space-time where a droplet which covers Supp(f) at time t could have appeared is, roughly speaking, a cone with vertex in Supp(f) and which has as base the set of points which have time-coordinate 0 and are at most at distance tv from Supp(f). The volume of such a cone is of the order of $(vt)^d\, t$. The order of magnitude of the relaxation time, $t_{rel}$, before which the region Supp(f) is unlikely to have been covered by a large droplet and after which the region Supp(f) is likely to have been covered by such an object can now be obtained by solving the equation
$$(v\, t_{rel})^d\; t_{rel}\, \exp\Bigl(-\frac{\beta A}{h}\Bigr) = 1. \tag{1.29}$$
This gives us
$$t_{rel} = v^{-d/(d+1)} \exp\Bigl(\frac{\beta A}{(d+1)\,h}\Bigr). \tag{1.30}$$
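For completeness, the algebra leading from (1.29) to (1.30) can be spelled out as follows; this is only a restatement of the step in the text, with no additional assumptions.

```latex
\begin{align*}
(v\,t_{\mathrm{rel}})^{d}\; t_{\mathrm{rel}}\, \exp\Bigl(-\frac{\beta A}{h}\Bigr) = 1
  &\iff t_{\mathrm{rel}}^{\,d+1} = v^{-d}\, \exp\Bigl(\frac{\beta A}{h}\Bigr) \\
  &\iff t_{\mathrm{rel}} = v^{-d/(d+1)}\, \exp\Bigl(\frac{\beta A}{(d+1)\,h}\Bigr).
\end{align*}
```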

In order to use this relation to predict the way in which the relaxation time scales with h, one needs to figure out the way in which v scales with h. If we suppose, for instance, that v does not scale with h, or at least that if it goes to 0, as $h \searrow 0$, it does it so slowly that
$$\lim_{h \searrow 0} h^{d-1} \log v = 0, \tag{1.31}$$
then we can predict that
$$t_{rel} = \exp\Bigl(\frac{\beta A}{(d+1)\,h}\Bigr) = \exp\Bigl(\frac{\lambda_c}{h}\Bigr),$$
where
$$\lambda_c = \frac{\beta A}{d+1} = \frac{\beta w^2}{(d+1)\, 4m^*} = \frac{\beta w^2}{12\, m^*},$$
in agreement with our result (1.1). In Sect. 1-iii of [Sch1] and more explicitly in Sect. 4 of [Sch2] an argument was given in support of the conjecture that $v \sim Ch$ as $h \searrow 0$, a much stronger conjecture than (1.31). In the paper [RTMS] (see display (9) there) a different non-rigorous argument is described, in which the same conclusion is derived from an “Allen-Cahn approximation”. In part 3 of this paper we will introduce a rescaling procedure and obtain results which can be seen as rigorous counterparts to (1.31). It is interesting to compare this feature of the regime of fixed T and $h \searrow 0$ with the case of fixed $h > 0$ and $T \searrow 0$, studied in [DS]. In that case the analogue of (1.31) is false, and consequently the v-term in (1.30) is of greater relevance than it is here.
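To see why the v-term drops out in the present regime, one can take logarithms in (1.30). The short computation below is a sketch for the case $d = 2$, in which (1.31) reads $h \log v \to 0$.

```latex
\begin{align*}
h \log t_{\mathrm{rel}}
  &= -\frac{d}{d+1}\, h \log v + \frac{\beta A}{d+1}
   \;\xrightarrow[h \searrow 0]{}\; \frac{\beta A}{d+1}
   \qquad \text{(using (1.31) with } d = 2\text{)}, \\
\frac{\beta A}{d+1}\bigg|_{d=2}
  &= \frac{\beta}{3}\cdot\frac{w^{2}}{4m^{*}} = \frac{\beta w^{2}}{12\,m^{*}} = \lambda_c .
\end{align*}
```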



2. Metastable Regime

2.1. Preliminaries. In this part of the paper we will prove part (i) of Theorem 1. The first step will be to prove, in Sections 2.2 and 2.3, the following proposition.

Proposition 2.1.1. Suppose that $T < T_c$ and $0 < \lambda < \lambda_c$. Then for each constant $a \in (0, 1/4)$, there is a positive finite constant C such that for each local observable f there is a positive finite constant C(f) such that for $h > 0$,
$$\Bigl| E\bigl(f(\sigma^-_{h;\exp(\lambda/h)})\bigr) - \langle f\rangle_{\Lambda(1/h^a),-,h} \Bigr| \le C(f) \exp(-C/h^a).$$

We will not try to optimize the constants C and C(f) in this proposition. But we observe that from our proof, if the inequality displayed above is only required to hold for $h \le h_0$ for some $h_0 > 0$ depending on f, then we can take $C(f) = C' \|f\|_\infty\, |\mathrm{Supp}(f)|$, where $C'$ does not depend on f. Proposition 2.1.1 transforms our dynamical problem into an equilibrium one, in case the initial distribution is concentrated on the configuration with all spins down. In Sect. 2.4 we will study the behavior of $\langle f\rangle_{\Lambda(1/h^a),-,h}$ for small $h > 0$ and show that it gives rise to the asymptotic expansion claimed in part (i) of Theorem 1.

Let us note here that if our goal were only to prove Corollary 1(i), with the initial distribution having all spins down, then Proposition 2.1.1 would have reduced our task to a very simple one. First, from the heuristic viewpoint, with $a > 0$ small, the box $\Lambda(1/h^a)$ is too small for any supercritical droplet to fit inside, so that one should expect to see the (−)-phase inside it. A rigorous argument to the effect that for $0 < a < 1$
$$\langle f\rangle_{\Lambda(1/h^a),-,h} \to \langle f\rangle_- \quad \text{as } h \searrow 0 \tag{2.1}$$

can be obtained from the FKG-Holley inequalities in combination with part (a2) of Corollary 1 in [SS1]. But for $0 < a < 1/2$, which is our case, a simple and direct argument can also be given. It is clear that the following estimate, which can be seen as a uniform bound on a Radon-Nikodym derivative, holds. For all $\eta \in \Omega_{\Lambda(1/h^a),-}$,
$$\exp\bigl(-\beta |\Lambda(1/h^a)|\, h\bigr) \le \frac{\mu_{\Lambda(1/h^a),-,h}(\eta)}{\mu_{\Lambda(1/h^a),-,0}(\eta)} \le \exp\bigl(\beta |\Lambda(1/h^a)|\, h\bigr).$$
Since $|\Lambda(1/h^a)|\, h \le h^{1-2a} \to 0$, as $h \searrow 0$, (2.1) follows from the weak convergence of $\mu_{\Lambda(1/h^a),-,0}$ to $\mu_-$ as $h \searrow 0$. In this argument we see directly that, with $a < 1/2$, the box $\Lambda(1/h^a)$ is too small for the external field $h > 0$, acting on each spin, to be able to win over the effect of the negative spins at the boundary.

The extension of Theorem 1(i) to arbitrary initial distributions $\nu \le \mu_-$ will be obtained in Sect. 2.5. Interestingly enough a basic tool there will be a result obtained in [Mar], and extended to arbitrary subcritical temperatures in [CGMS], concerning the 2-dimensional Ising model evolving in the absence of an external field.

2.2. Bottlenecks for the dynamics. We start now the proof of Proposition 2.1.1. Several times in the proof of this proposition we will use arguments which are only true for small enough $h > 0$, but the constant C(f) can be adjusted so that we do not have to require h to be small in the statement of the proposition. In order to prove Proposition 2.1.1 there is no loss in generality in supposing that f is increasing and that it has $\|f\|_\infty \le 1$. For the remainder of the proof of this proposition we will make these assumptions.



To simplify the notation we set
$$t_h = \exp(\lambda/h). \tag{2.2}$$

We turn first to the proof of the easy half of Proposition 2.1.1. We will show that for small $h > 0$,
$$E\bigl(f(\sigma^-_{h;t_h})\bigr) - \langle f\rangle_{\Lambda(1/h^a),-,h} \ge -\exp\Bigl(-\exp\Bigl(\frac{\lambda}{2h}\Bigr)\Bigr). \tag{2.3}$$
For this, observe first that from the basic-coupling inequalities we have
$$E\bigl(f(\sigma^-_{h;t_h})\bigr) \ge E\bigl(f(\sigma^-_{\Lambda(1/h^a),-,h;t_h})\bigr). \tag{2.4}$$
Let the process $(\sigma^-_{\Lambda(1/h^a),-,h;t})$ and the stationary process $(\sigma^{\mu_{\Lambda(1/h^a),-,h}}_{\Lambda(1/h^a),-,h;t})$ evolve on the probability space defined by the graphical construction, so that in particular once these processes hit each other they remain together forever. Note that, for some positive $\epsilon$, the probability that these two processes will hit each other during any unit time interval is at least $\epsilon^{1/h^{2a}}$, regardless of their states at the beginning of this time interval. Also, $f(\sigma^-_{\Lambda(1/h^a),-,h;t}) \le f(\sigma^{\mu_{\Lambda(1/h^a),-,h}}_{\Lambda(1/h^a),-,h;t})$ with probability one. From these remarks it is clear that
$$0 \le \langle f\rangle_{\Lambda(1/h^a),-,h} - E\bigl(f(\sigma^-_{\Lambda(1/h^a),-,h;t_h})\bigr) \le \bigl(1 - \epsilon^{1/h^{2a}}\bigr)^{t_h - 1} \le \exp\Bigl(-\exp\Bigl(\frac{\lambda}{2h}\Bigr)\Bigr), \tag{2.5}$$

for small $h > 0$. The inequality (2.3) follows from (2.4) and (2.5). The main task in this and the next sections is to prove the other half of Proposition 2.1.1, i.e., the inequality
$$E\bigl(f(\sigma^-_{h;t_h})\bigr) - \langle f\rangle_{\Lambda(1/h^a),-,h} \le C(f) \exp(-C/h^a). \tag{2.6}$$
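The last inequality in (2.5) is a statement about small h only, and it can be helpful to see this numerically. The sketch below compares, on a logarithmic scale, the quantity $(1 - \epsilon^{1/h^{2a}})^{t_h - 1}$ with $\exp(-\exp(\lambda/(2h)))$. The values of $\epsilon$, $a$ and $\lambda$ are hypothetical choices made here for illustration; the text only asserts that some positive $\epsilon$ exists.

```python
import math

def log_miss_probability(h, lam, a, eps):
    """log of (1 - eps**(1/h**(2a)))**(t_h - 1), with t_h = exp(lam/h)."""
    p_hit = eps ** (h ** (-2.0 * a))      # lower bound on coupling in one unit of time
    t_h = math.exp(lam / h)
    return (t_h - 1.0) * math.log1p(-p_hit)

def bound_holds(h, lam=1.0, a=0.2, eps=0.5):
    """Check (1 - eps**(1/h**(2a)))**(t_h - 1) <= exp(-exp(lam/(2h))) via logarithms.

    The default parameter values are assumptions, not taken from the paper.
    """
    return log_miss_probability(h, lam, a, eps) <= -math.exp(lam / (2.0 * h))

small_h_ok = bound_holds(0.05) and bound_holds(0.1)
large_h_ok = bound_holds(1.0)
```

With these placeholder parameters the comparison holds for h = 0.05 and h = 0.1 but fails at h = 1, which is consistent with the restriction to small h in the text.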

We approach this problem borrowing some ideas from [Sch1]. As in that paper, set
$$\Lambda_h = \Lambda(\exp(\lambda_c/h)), \tag{2.7}$$
and observe that Lemma 1.2.1 (which is the same as Lemma 1 in [Sch1]) gives us the following stronger version of Lemma 2 in [Sch1]:
$$\Bigl| E\bigl(f(\sigma^-_{h;t_h})\bigr) - E\bigl(f(\sigma^-_{\Lambda_h,-,h;t_h})\bigr) \Bigr| \le C_1(f) \exp\bigl(-C_2 \exp(\lambda/h)\bigr), \tag{2.8}$$

where $C_1(f)$ and $C_2$ are positive and finite. Next we will introduce a restricted set of configurations, in a way similar to [Sch1], and inspired there by [CCO] and by the heuristic idea of critical droplets. To make this idea precise one uses the standard notion of contours, on the dual lattice $\mathbb{Z}^2 + (1/2, 1/2)$, which separate spins −1 from +1. In the definition of these contours, we adopt here the splitting rules used in, e.g., [DKS] (see Sect. 3.1 there), which allow one to take the contours as self-avoiding curves, which are closed, when the boundary conditions are, e.g., (−), as is our case. We will denote by $|\Gamma|$ the length of the contour Γ, by Int Γ the set of sites it surrounds, and by $V(\Gamma)$ the number of spins that it surrounds, which we call the volume of Γ. As usual, a contour is called an external contour if it is not enclosed by any other contour. If Γ is such a contour of the configuration σ, and the boundary



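conditions are (−), then at certain sites x, attached to Γ, the values σ(x) are uniquely defined by the presence of Γ. The set of such sites will be denoted by $\partial\Gamma$. We have
$$\partial\Gamma = \partial_-\Gamma \cup \partial_+\Gamma, \tag{2.9}$$
where $\sigma|_{\partial_\pm\Gamma} = \pm 1$. We will use the notation
$$\Omega_- = \bigcup_{l=1}^{\infty} \Omega_{\Lambda(l),-}.$$
Our restricted set of configurations is defined as
$$R = \Bigl\{ \sigma \in \Omega_- : \text{each contour } \Gamma \text{ in } \sigma \text{ has } V(\Gamma) \le \Bigl(\frac{B_c}{h}\Bigr)^2 \Bigr\}, \tag{2.10}$$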

where $B_c$ is defined in (1.28). We want to argue that up to a time as large as $t_h$ the system evolving in the box $\Lambda_h$ with (−) boundary conditions and starting with all spins −1, will be unlikely to escape from R, in which case the system indeed would look very much like the (−) phase. In order to do this we introduce a modified dynamics evolving in $\Omega_{\Lambda_h,-}$, in which large droplets cannot, by definition, be formed and then we couple the unrestricted dynamics to this modified one, in a natural way.

The modified dynamics is simply defined as the Markov process on $\Omega_{\Lambda_h,-}$ which evolves as the original stochastic Ising model in $\Lambda_h$, with (−) boundary conditions, but for which all jumps out of R are suppressed. In other words, the rates, $\tilde c_{\Lambda_h,-,h}(x, \sigma)$, of the new process are identical to $c_{\Lambda_h,-,h}(x, \sigma)$ in case $\sigma, \sigma^x \in \Omega_{\Lambda_h,-} \cap R$ and are 0 otherwise. We will denote this modified process, which is restricted to the state space R, by $(\tilde\sigma^{\eta}_{\Lambda_h,-,h;t})_{t \ge 0}$, where $\eta \in R$ is the initial configuration. It is well known, and very easy to prove, that such a modified process is also reversible and that since it is, in our case, irreducible, its unique invariant probability measure is $\tilde\mu_{\Lambda_h,-,h}$ given by
$$\tilde\mu_{\Lambda_h,-,h}(\,\cdot\,) = \mu_{\Lambda_h,-,h}(\,\cdot\,|R).$$
This distribution is sometimes called a “restricted ensemble”, and, informally speaking, represents the “metastable state”.

For each initial configuration $\eta \in R$, the process $(\tilde\sigma^{\eta}_{\Lambda_h,-,h;t})$ can be constructed on the same probability space corresponding to the graphical construction introduced in Sect. 1.2. For this purpose it is enough to suppress all jumps out of R. In other words, Poisson marks which should cause such a jump are just ignored. The important fact about this construction is that if we introduce
$$\tau = \inf\bigl\{t \ge 0 : \text{the process } (\tilde\sigma^{\tilde\mu_{\Lambda_h,-,h}}_{\Lambda_h,-,h;t}) \text{ has a suppressed jump at time } t\bigr\},$$
then
$$\sigma^-_{\Lambda_h,-,h;t} \le \tilde\sigma^{\tilde\mu_{\Lambda_h,-,h}}_{\Lambda_h,-,h;t} \quad \text{for all } t < \tau. \tag{2.11}$$

(Readers who are familiar with the argumentation in [Sch1], have noted that while most of what we introduced above in connection to the restricted ensemble is similar to its counterparts in that paper, the notions are not strictly parallel to those there. The reason is that, while we are still pursuing the idea that the boundary of R is a bottleneck, the arguments used in [Sch1] to prove Lemma 5(i) there would give us an estimate that, while good enough to obtain Corollary 1 (i), would not be good enough to obtain an estimate as sharp as Proposition 2.1.1, which is needed for the proof of Theorem 1(i).)



We introduce now a family of sets whose union is the inner boundary of R. For each site $x \in \mathbb{Z}^2$ define
$$F_x^- = \bigl\{ \sigma \in R : \sigma^x \notin R \bigr\}.$$
Set also
$$\varphi = \sup_{x \in \Lambda_h} \tilde\mu_{\Lambda_h,-,h}(F_x^-).$$

We will now consider a discrete time Markov chain embedded into the stationary process $(\tilde\sigma^{\tilde\mu_{\Lambda_h,-,h}}_{\Lambda_h,-,h;t})$. It is formed by the configurations of our process between the successive jumps. For this purpose order all the Poisson marks in the graphical construction which occur on $\Lambda_h$, according to the time they occur. Let N(t) be the number of such marks from time 0 up to time t. Let $M_{x,k}$ be the event that the k-th such mark occurs at the site x, and $F_{x,k}$ be the event that immediately before this k-th mark the process $(\tilde\sigma^{\tilde\mu_{\Lambda_h,-,h}}_{\Lambda_h,-,h;t})$ is in $F_x^-$. Note that $M_{x,k}$ and $F_{x,k}$ are independent events and that $P(M_{x,k}) = 1/|\Lambda_h|$, while by stationarity $P(F_{x,k}) = \tilde\mu_{\Lambda_h,-,h}(F_x^-) \le \varphi$ for all x and k. Set also $K = \lfloor 2\, |\Lambda_h|\, c_{\max}(T)\, t_h \rfloor$ to obtain the estimate
$$P(\tau \le t_h) \le P(N(t_h) > K) + \sum_{x \in \Lambda_h,\; k = 1, \dots, K} P(M_{x,k} \cap F_{x,k}) \le C_1 \exp\bigl(-C_2 \exp(\lambda/h)\bigr) + C_3\, |\Lambda_h|\, t_h\, \varphi, \tag{2.12}$$

where in the second inequality a standard large deviation estimate for Poisson random variables was used. From (2.8), (2.11) and (2.12) it follows that
$$E\bigl(f(\sigma^-_{h;t_h})\bigr) \le \int f \, d\tilde\mu_{\Lambda_h,-,h} + C_3\, |\Lambda_h|\, t_h\, \varphi + C_4(f) \exp\bigl(-C_5 \exp(\lambda/h)\bigr). \tag{2.13}$$
All the quantities in the right-hand-side of (2.13) pertain to equilibrium statistical mechanics, so that we have reduced our dynamical problem of proving (2.6) to the equilibrium problems of proving the following two claims concerning the measure $\tilde\mu_{\Lambda_h,-,h}$:

i) for all $\epsilon > 0$ and $h > 0$ small enough
$$\varphi \le C_1 \exp\Bigl(-\beta(1 - \epsilon)\frac{A}{h}\Bigr), \tag{2.14}$$
where A is defined by (1.28);

ii) for $h > 0$
$$\int f \, d\tilde\mu_{\Lambda_h,-,h} \le \langle f\rangle_{\Lambda(1/h^a),-,h} + C(f) \exp(-C_2/h^a). \tag{2.15}$$

It is interesting to note that the term $C_3\, |\Lambda_h|\, t_h\, \varphi$ in (2.13) has a direct connection with the heuristics in Sect. 1.5. The quantity $\varphi$ plays the role of the rate of nucleation, while $|\Lambda_h|\, t_h$ is the space-time volume of a cylinder which plays a role similar to the space-time cone in the heuristics. The absence here of the velocity factor which appears in (1.29) is a consequence of our using an upper bound of order 1 for the velocity of propagation of effects, through (2.8). Once (2.14) is proven, it follows from the definitions (1.1), (1.28), (2.2) and (2.7) that $|\Lambda_h|\, t_h\, \varphi$ vanishes exponentially fast in 1/h as $h \searrow 0$.
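The exponent count behind this last claim can be sketched as follows. The computation assumes that $\Lambda(l)$ is a square box in $\mathbb{Z}^2$ with side of order $l$, so that $|\Lambda_h| \le C \exp(2\lambda_c/h)$ (an assumption about notation defined earlier in the paper), and uses $\beta A = 3\lambda_c$, which follows from (1.28) and the formula for $\lambda_c$ with $d = 2$.

```latex
\begin{align*}
|\Lambda_h|\; t_h\; \varphi
  &\le C \exp\Bigl(\frac{2\lambda_c}{h}\Bigr)\,
        \exp\Bigl(\frac{\lambda}{h}\Bigr)\,
        C_1 \exp\Bigl(-\frac{3(1-\epsilon)\lambda_c}{h}\Bigr) \\
  &= C\,C_1 \exp\Bigl(-\frac{(\lambda_c - \lambda) - 3\epsilon\lambda_c}{h}\Bigr),
\end{align*}
% which decays exponentially in 1/h as soon as
% 0 < \epsilon < (\lambda_c - \lambda)/(3\lambda_c), possible because \lambda < \lambda_c.
```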



2.3. The restricted ensemble. We start our study of the measure $\tilde\mu_{\Lambda_h,-,h}$ by observing that it is sufficient to study measures of the type $\tilde\mu_{\Lambda,-,h}$ on subsets Λ of $\mathbb{Z}^2$ which are much smaller than $\Lambda_h$. The definition of $\tilde\mu_{\Lambda,-,h}$ is analogous to that of $\tilde\mu_{\Lambda_h,-,h}$, with Λ replacing $\Lambda_h$. Suppose that for each sufficiently small value of $h > 0$ we have an event $E_h$ which only depends on the values of the spins inside the box $x + \Lambda(1/h^3)$. Consider the larger box $x + \Lambda(2/h^3)$ and condition on what the set of exterior contours which surround at least one site in its complement, $(x + \Lambda(2/h^3))^c$, is. Let that set consist of contours $\{\Gamma_j\}$. Denote by $\Lambda(\{\Gamma_j\})$ the connected component of the complement, $(x + \Lambda(2/h^3)) \setminus \bigcup_j (\mathrm{Int}\, \Gamma_j \cup \partial(\Gamma_j))$, which contains the set $x + \Lambda(1/h^3)$. Then one obtains, when h is small, that
$$\tilde\mu_{\Lambda_h,-,h}(E_h) = \sum_i \alpha_i \cdot \tilde\mu_{\Lambda^i,-,h}(E_h), \tag{2.16}$$
where the $\Lambda^i$-s denote different $\Lambda(\{\Gamma_j\})$-s, the index i runs over a finite set, $\alpha_i > 0$ and $\sum_i \alpha_i = 1$. The choices of the scales above and the need that h be small for (2.16) to hold, are clearly related to the fact that we are conditioning the Gibbs measure on the absence of any contour with volume larger than $(B_c/h)^2$. Therefore, with the choices above, we are sure that for small h the support of $E_h$ will be disjoint from the set of sites surrounded by any contour which also surrounds any site in $(x + \Lambda(2/h^3))^c$. The sets $\Lambda^i$ which appear in (2.16) have an additional property, which will be important later: they are simply-connected.

For each value of $h > 0$ and each $x \in \Lambda_h$, the event $F_x^-$ satisfies the condition above on $E_h$, so that to derive (2.14) it is enough to obtain a corresponding upper bound:
$$\sup_{x \in \Lambda_h}\; \sup_{\Lambda \in L_{x,h}} \tilde\mu_{\Lambda,-,h}(F_x^-) \le C_1 \exp\Bigl(-\beta(1 - \epsilon)\frac{A}{h}\Bigr), \tag{2.17}$$
for small $h > 0$. Here $L_{x,h}$ is the family of simply-connected sets Λ which satisfy $(x + \Lambda(1/h^3)) \cap \Lambda_h \subset \Lambda \subset x + \Lambda(2/h^3)$. Similarly, the derivation of (2.15) is reduced to that of the following,
$$\sup_{\Lambda \in L_{0,h}} \int f \, d\tilde\mu_{\Lambda,-,h} \le \langle f\rangle_{\Lambda(1/h^a),-,h} + C(f) \exp(-C_2/h^a), \tag{2.18}$$

for $h > 0$. In order to derive (2.17) and (2.18) we will use the notion of the skeleton of a contour, as introduced in [DKS]. In what follows b is a fixed but arbitrary number in (a, 1/4) and r is a fixed but also arbitrary number in $(0, b/2) \cap (0, a)$. A contour Γ will be said to be h-vertebrate if $V(\Gamma) > (1/h)^{2b}$, otherwise Γ will be said to be h-invertebrate. (Usually one says that Γ is large in the former case and small in the latter one, but this terminology would be confusing in the present paper, since “large” and “small” contours may also be used in connection with being supercritical or subcritical, with the threshold volume being the quantity $(B_c/h)^2$, which is much larger than $(1/h)^{2b}$.) Often we will omit mention of h when referring to a vertebrate contour. Given now a vertebrate contour Γ one can associate to it, in an algorithmic way, a sequence of sites, $(x_1, \dots, x_J)$ of the dual lattice $\mathbb{Z}^2 + (1/2, 1/2)$. We think of the sites $x_1, \dots, x_J$ as the ordered vertices of a closed polygonal curve, with possible self-intersections (see Fig. 5.3 on p. 166 in [DKS]); we will denote this curve by γ in what follows and call it the skeleton of Γ. For the construction of γ, given Γ, the reader is referred to Chapter 5 of [DKS]; here we will limit ourselves to reviewing some of the basic properties that we can guarantee the skeleton to have:



(S.1) $x_i \in \Gamma$ for each i; moreover, the points $x_i$ are consecutive on Γ (for one of the orientations of it).
(S.2) The length of each edge of γ is bounded between $C_1 (1/h)^r$ and $C_2 (1/h)^r$, where $0 < C_1 < C_2 < \infty$ are fixed appropriate constants.
(S.3) The Hausdorff distance between Γ and γ satisfies
$$\rho_H(\Gamma, \gamma) \le (1/h)^r. \tag{2.19}$$

The length, $|\gamma|$, of a skeleton γ is defined as the sum of the Euclidean lengths of its edges. To each skeleton γ we associate its Wulff functional, $W(\gamma)$, defined by summing over the edges of γ the product of the Euclidean length of each edge by the surface tension in the direction defined by the edge, i.e., $\tau_T(n)$, with n perpendicular to the edge. Observe that from the fact that the surface tension $\tau_T(n)$ is bounded away from 0 and ∞ uniformly in n,
$$C_3\, W(\gamma) \le |\gamma| \le C_4\, W(\gamma). \tag{2.20}$$
From (S.2) and (2.20) it follows that the number $J(\gamma)$ of vertices in γ satisfies
$$C_5\, W(\gamma)\, h^r \le J(\gamma) \le C_6\, W(\gamma)\, h^r. \tag{2.21}$$
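The definition of the Wulff functional of a polygonal skeleton, and the comparison (2.20) with its Euclidean length, can be illustrated by a small computation. This is a sketch only: the surface tension `tau_example` below is a hypothetical function bounded between 0.9 and 1.1, chosen for illustration, and is not the surface tension $\tau_T$ of the Ising model.

```python
import math

def wulff_functional(vertices, tau):
    """Sum over edges of (Euclidean length) * tau(angle of the edge normal)."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]      # closed polygon: last vertex joins the first
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue                         # skip degenerate edges
        normal_angle = math.atan2(dx, -dy)   # direction of (-dy, dx), perpendicular to the edge
        total += length * tau(normal_angle)
    return total

def perimeter(vertices):
    """Euclidean length |gamma|: the Wulff functional with tau identically 1."""
    return wulff_functional(vertices, lambda theta: 1.0)

def tau_example(theta):
    """Hypothetical surface tension with values in [0.9, 1.1] (an assumption)."""
    return 1.0 + 0.1 * math.cos(4.0 * theta)

triangle = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
w_val = wulff_functional(triangle, tau_example)
length = perimeter(triangle)
```

For this placeholder surface tension, (2.20) holds with $C_3 = 1/1.1$ and $C_4 = 1/0.9$, i.e. `w_val / 1.1 <= length <= w_val / 0.9`.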

To each configuration $\sigma \in \Omega_-$ we can associate the collection $G = \{\Gamma_1, \dots, \Gamma_n\}$ of its external vertebrate contours. To this collection we can associate the collection $S = \{\gamma_1, \dots, \gamma_n\}$ of their skeletons. The Wulff functional associated to the configuration σ is then defined as
$$W(S) = \sum_{i=1}^{n} W(\gamma_i),$$
with the convention that $W(\emptyset) = 0$. Next we want to consider the volume surrounded by the external vertebrate contours $\Gamma_1, \dots, \Gamma_n$ and say that it has to be close to the volume surrounded by the collection of skeletons $\gamma_1, \dots, \gamma_n$. A difficulty lies in the fact that while the volume surrounded by the contours is easily defined as
$$V(G) = \sum_{i=1}^{n} V(\Gamma_i),$$

the fact that the skeletons can self-intersect and also intersect with each other makes the notion of the volume that they surround more delicate. Fortunately the notion of “phase volume”, as defined in Sect. 2.10 of [DKS], solves this difficulty. This definition is as follows (a look at Fig. 2.5 on p. 37 of [DKS] will probably lead the reader to guess correctly the definition). The set $\mathbb{R}^2 \setminus \bigcup_i \gamma_i$ splits up into a collection of connected components $Q_\alpha$ with exactly one unbounded component among them. A component $Q_\alpha$ will be called a minus-component if any path that connects its interior points with points of the unbounded component and intersects the curves from S in a finite number of points, intersects them in an odd number of points (counted with multiplicities). The phase volume of S, denoted by $\hat V(S)$, is defined as the joint volume of all the minus-components. Motivated by (2.19), we want to show that $V(G)$ and $\hat V(S)$ have to be also relatively close to each other. If we remove from $\mathbb{R}^2$ all the points which are at a distance not larger than $(1/h)^r$ from $\bigcup_i \Gamma_i$, then the remaining set also splits up into connected components with exactly one unbounded component among them. It is easy to see that all the bounded components are subsets of minus-components in the splitting produced by S, while the



unbounded component is a subset of the unbounded component in the splitting produced by S. It is also clear that the bounded components in this splitting are inside contours of G, while the unbounded component in this splitting is completely outside the contours of G. Hence
$$|V(G) - \hat V(S)| \le C_7 \Bigl(\sum_{i=1}^{n} |\Gamma_i|\Bigr) (1/h)^{2r}. \tag{2.22}$$
Similarly, by removing from $\mathbb{R}^2$ all the points which are at a distance not larger than $(1/h)^r$ from $\bigcup_i \gamma_i$, one can also derive
$$|V(G) - \hat V(S)| \le C_8 \Bigl(\sum_{i=1}^{n} |\gamma_i|\Bigr) (1/h)^{2r} \le C_9\, W(S)\, (1/h)^{2r}, \tag{2.23}$$
where in the second inequality use of (2.20) was made. For convenience we introduce also another measure of the “volume” of a collection of skeletons, which is motivated by the procedure described in the explanation of why (2.22) and (2.23) hold. We define $\check V(S)$ as the number of sites inside the minus-components $Q_\alpha$, which are at a distance larger than $(1/h)^r$ from $\bigcup_i \gamma_i$. Clearly we have
$$\check V(S) \le \min\{V(G), \hat V(S)\}, \tag{2.24}$$
and
$$\check V(S) \ge \max\{V(G), \hat V(S)\} - C_{10}\, W(S)\, (1/h)^{2r}. \tag{2.25}$$

Given a finite set G of h-vertebrate contours we denote by $S_G^{h,co}$ the set of configurations which belong to $\Omega_-$ and which have as their collection of external h-vertebrate contours the set G. We say that G is a compatible set of external h-vertebrate contours in case $S_G^{h,co}$ is not empty.

Similarly, given a finite set of skeletons S, we define $C_S^h$ as the class of all G which are compatible sets of external h-vertebrate contours and which have as their set of skeletons the set S. We define also
$$S_S^{h,sk} = \bigcup_{G \in C_S^h} S_G^{h,co}.$$
$S_S^{h,sk}$ is the set of configurations which belong to $\Omega_-$ and which have as their collection of skeletons corresponding to their external h-vertebrate contours the set S. We say that S is a compatible set of skeletons in case $S_S^{h,sk}$ is not empty.

Again similarly, given an interval of real numbers, I, we define $S_I^{h,W}$ as the set of configurations which belong to $\Omega_-$ and for which the collection S of skeletons corresponding to their external h-vertebrate contours satisfies $W(S) \in I$.

One would like to say that the volume of a collection of skeletons is the sum of the volumes of the individual skeletons. One proper version of this is the following relation, which follows from the argumentation used to show (2.22) and (2.23). In this relation, and also later on, we use the simplified notation $\check V(\gamma_i)$ in place of the more cumbersome $\check V(\{\gamma_i\})$. If $S = \{\gamma_1, \dots, \gamma_n\}$ is a compatible set of skeletons, then
$$\check V(S) = \sum_{i=1}^{n} \check V(\gamma_i). \tag{2.26}$$



A fundamental fact about the Wulff shape and the associated quantity w is the variational characterization (1.10). By scaling lengths we obtain easily from this and (2.24) that for any skeleton γ
$$w \sqrt{\check V(\gamma)} \le W(\gamma). \tag{2.27}$$
This inequality will now be used to derive another one, which is of central relevance in this paper. This is the content of the next lemma.

Lemma 2.3.1. For each configuration in R, the associated set of skeletons S, corresponding to its set of vertebrate external contours satisfies
$$W(S) \ge 2 m^* h\, \check V(S).$$

Proof. Say that the collection of external vertebrate contours for the configuration with which we are concerned is $G = \{\Gamma_1, \dots, \Gamma_n\}$ and that $\Gamma_i$ has skeleton $\gamma_i$, for $i = 1, \dots, n$. From (2.24) and the definition (2.10) of R we have for each i,
$$\sqrt{\check V(\gamma_i)} \le \frac{B_c}{h}. \tag{2.28}$$
Multiplying the inequalities (2.27) (with $\gamma = \gamma_i$) and (2.28) by each other, and using the fact that $B_c = w/(2m^*)$ we obtain
$$W(\gamma_i) \ge 2 m^* h\, \check V(\gamma_i).$$
The thesis now follows by adding over i, using (2.26).
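The multiplication step in this proof is short, and it may help to see it written out. The lines below only restate the argument, using (2.27), (2.28), (2.26) and $B_c = w/(2m^*)$; all quantities involved are nonnegative, so the inequalities may be multiplied.

```latex
\begin{align*}
w\sqrt{\check V(\gamma_i)} \le W(\gamma_i), \quad
\sqrt{\check V(\gamma_i)} \le \frac{B_c}{h}
  &\;\Longrightarrow\; w\,\check V(\gamma_i) \le W(\gamma_i)\,\frac{B_c}{h} \\
  &\;\Longrightarrow\; W(\gamma_i) \ge \frac{w\,h}{B_c}\,\check V(\gamma_i)
     = 2m^{*}h\,\check V(\gamma_i), \\
W(S) = \sum_{i=1}^{n} W(\gamma_i)
  &\;\ge\; 2m^{*}h \sum_{i=1}^{n} \check V(\gamma_i) = 2m^{*}h\,\check V(S).
\end{align*}
```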

In order to show (2.17), we need to know that the skeletons of configurations in $F_x^-$ are associated with sufficiently large values of $W(\cdot)$. This is the content of the next lemma.

Lemma 2.3.2. Given $\epsilon > 0$ there is $h_0 > 0$ such that for all $0 < h \le h_0$ and all $x \in \mathbb{Z}^2$ the following holds. For each configuration in $F_x^-$, the associated set of skeletons S, corresponding to its set of vertebrate external contours satisfies
$$W(S) \ge \frac{2A(1 - \epsilon)}{h}.$$

Proof. Let $\sigma \in F_x^-$ be the configuration with which we are concerned. By definition, $\sigma^x \in R^c$, so the configuration $\sigma^x$ has an external contour Γ with $V(\Gamma) > (B_c/h)^2$. On the other hand, $\sigma \in R$, so the contour Γ of the configuration $\sigma^x$ is attached to the site x. Let sq(x) be the 3 × 3 square of the dual lattice, centered at x. Consider the set $\tilde G(x, \sigma)$ of all dual bonds that are either in sq(x) (there are 24 of them) or belong to a contour of σ, which contains some bonds of sq(x). Let $G(x, \sigma) \subset \tilde G(x, \sigma)$ be these bonds, which are “visible from infinity”, i.e., are not screened away from it by other bonds in $\tilde G(x, \sigma)$. Evidently, the set of bonds $G(x, \sigma)$ serves as a set of bonds of exterior contours of some new configuration $\eta = \eta(x, \sigma)$, which in fact has no interior contours. Let these contours be $G_1, G_2, \dots, G_k$. Clearly, $k \le 6$, since each contour $G_i$ passes through at least two boundary sites of sq(x). Note that $\tilde G(x, \sigma) = \tilde G(x, \sigma^x)$, so the same is true also for $G(x, \sigma^x)$, $\eta(x, \sigma^x)$ and $G_1, G_2, \dots, G_k$.



The idea of the proof is to observe that $\mathrm{Int}\, \Gamma \subset \bigcup_i \mathrm{Int}\, G_i$, so in particular $\sum_i V(G_i) \ge V(\Gamma)$. That implies that the contours $G_i$ should be quite long. On the other hand, the set of dual bonds $G(x, \sigma) \setminus \mathrm{sq}(x)$ belongs to exterior contours of σ, so they should be long as well.

The implementation of the above argument is as follows. Let $\tilde S$ be the set of skeletons of the subset of all vertebrate contours among $G_1, G_2, \dots, G_k$. Denote the corresponding subset of indices by $\mathrm{ver} \subset \{1, \dots, k\}$. We have then that $\tilde S = \{\gamma_i,\ i \in \mathrm{ver}\}$. Note that the volume of any invertebrate contour is $o(\frac{1}{h})$, so we have:
$$\sum_{i \in \mathrm{ver}} V(G_i) \ge \Bigl(\frac{B_c}{h}\Bigr)^2 - o\Bigl(\frac{1}{h}\Bigr).$$
Hence the estimate (2.25) implies that also
$$\sum_{i \in \mathrm{ver}} \check V(\gamma_i) \ge \Bigl(\frac{B_c}{h}\Bigr)^2 - o\Bigl(\frac{1}{h}\Bigr).$$
Using now (2.27) we have
$$W(\tilde S) \ge w \sum_{i \in \mathrm{ver}} \sqrt{\check V(\gamma_i)} \ge w \sqrt{\Bigl(\frac{B_c}{h}\Bigr)^2 - o\Bigl(\frac{1}{h}\Bigr)} = \frac{2A}{h} - o\Bigl(\frac{1}{h}\Bigr), \tag{2.29}$$
where we are using an evident property of the square root and also the definitions (1.28).

The only remaining problem now is the relation between the skeletons S and $\tilde S$. One would like to argue that in some sense $\tilde S \subset S$, which would imply our claim. However, the last inclusion is almost always violated. The way out of this unlucky circumstance is the following. Note that in fact the skeleton of a contour is not uniquely defined; it should just be a closed polygon satisfying the properties of the type (S.1)-(S.3) above. Once this is the case, any such polygon satisfies any statement above made about any skeleton. We are going to use this nonuniqueness in order to prepare a special family $\tilde{\tilde S}$ of the skeletons of the family $\{G_i,\ i \in \mathrm{ver}\}$. This family is going to be constructed in such a way as to use as big pieces of the family S as possible. Namely, note that every maximal connected arc $k_j$ of $G(x, \sigma) \setminus \mathrm{sq}(x)$ is also an arc of some external contour of the configuration σ, and so it inherits a portion $\kappa_j$ of the skeleton S. (Note that there are at most 6 such arcs.) Namely, define $\kappa_j$ to be the maximal subpolygon of S with both endpoints in $k_j$. (It might happen that some $\kappa_j$ are empty; for example, it will be the case when the external contour in question would be invertebrate.) It is immediate to see that the family of the (open) polygons $\bigcup_j \kappa_j$ can be made into a skeleton family of the contour family $\{G_i,\ i \in \mathrm{ver}\}$ by adding at most six extra edges to it, which addition might require the prior removal of some ending edges from $\bigcup_j \kappa_j$ (also in the amount of at most six). This is the skeleton family $\tilde{\tilde S}$ sought. Now (2.29), being valid for every possible skeleton family of the contour family $\{G_i,\ i \in \mathrm{ver}\}$, implies that
$$W(\tilde{\tilde S}) \ge \frac{2A}{h} - o\Bigl(\frac{1}{h}\Bigr).$$
On the other hand,
$$W(S) \ge W(\tilde{\tilde S}) - o\Bigl(\frac{1}{h}\Bigr),$$


417

e since every edge of Se except a finite number belongs to S. (Note that in fact the family ee S might be much longer than S, though it is immaterial for our argument). The next lemma shows that vertebrate contours have a minimum cost. Lemma 2.3.3. There is a constant h0 > 0 such that for 0 < h ≤ h0 the following holds. For each configuration in − , the associated set of skeletons S, corresponding to its set of vertebrate external contours is either empty or satisfies w 1 b . W(S) ≥ √ 2 h Proof. Suppose that S is not empty, so that there exists γ ∈ S. This means that γ is the skeleton of a vertebrate contour 0. Suppose that the inequality that we should prove is false and use (2.25) to prove that then 1 2r+b 1 2r 1 2b 1 1 2b ≥ −C ≥ . Vˇ (γ) ≥ V (0) − CW(S) h h h 2 h In the last inequality we used the fact that r < b/2, and for this last inequality to be true h0 has to be taken properly. Using (2.27) now, we obtain w 1 b . W(S) ≥ W(γ) ≥ √ 2 h

The next two lemmas are of a somewhat technical nature. The first one of them will only be used in the proof of the second one, while that second one is a step towards proving (2.18) and it will also be used in the proof of the fundamental Lemma 2.3.6 below. Some readers may prefer to first read the proof of Lemma 2.3.6, which has an immediate heuristic appeal, and later return to Lemmas 2.3.4 and 2.3.5.
We will use the notation $P(x, l) = \{x \stackrel{+*}{\longleftrightarrow} (x + \Lambda(l))^c\}$ for the event that $x$ is (+,*)-connected to a site outside the box $x + \Lambda(l)$. Given a set $G$ of compatible external $h$-vertebrate contours (possibly $G = \emptyset$), we define the following conditional Gibbs measure and corresponding expectation:
\[ \mu^{G,h}_{\Lambda,-,h'} = \mu_{\Lambda,-,h'}(\,\cdot\,|\,\mathcal{S}^{h,co}_G) \quad\text{and}\quad \langle f \rangle^{G,h}_{\Lambda,-,h'} = \int f \, d\mu^{G,h}_{\Lambda,-,h'}. \]
Lemma 2.3.4. There are positive finite constants $C_1$ and $C_2$ such that for all $0 < h' \le h \le 1$, all $x \in \mathbb{Z}^2$ and all finite $\Lambda \subset \mathbb{Z}^2$,
\[ \mu^{\emptyset,h}_{\Lambda,-,h'}\!\left(P\!\left(x, \frac{1}{2h^a}\right)\right) \le C_1 \exp(-C_2/h^a). \]
Proof. We start with an observation akin to the one that originated (2.16), but with the box $x + \Lambda(3/h^{2b})$ replacing the box $x + \Lambda(2/h^3)$ used there. Conditioning on what the set of contours which surround at least one site in $(x + \Lambda(3/h^{2b}))^c$ is we obtain, when $h$ is small,

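\[ \mu^{\emptyset,h}_{\Lambda,-,h'}\!\left(P\!\left(x, \frac{1}{2h^a}\right)\right) = \sum_i \alpha_i \cdot \mu^{\emptyset,h}_{\Lambda_i,-,h'}\!\left(P\!\left(x, \frac{1}{2h^a}\right)\right), \]
where the index $i$ runs over a finite set, for each of its values $\Lambda_i$ is a subset of $x + \Lambda(3/h^{2b})$ which contains $(x + \Lambda(1/h^{2b})) \cap \Lambda$ and $\alpha_i > 0$; moreover $\sum_i \alpha_i = 1$. The event that the box $\Lambda_i$ is free of $h$-vertebrate contours is a decreasing event, while $P(x, \frac{1}{2h^a})$ is an increasing event, so by the FKG-Holley inequalities we obtain
\[ \mu^{\emptyset,h}_{\Lambda_i,-,h'}\!\left(P\!\left(x, \frac{1}{2h^a}\right)\right) \le \mu_{\Lambda_i,-,h'}\!\left(P\!\left(x, \frac{1}{2h^a}\right)\right). \]
Using now the fact that for each $i$, $|\Lambda_i| \le 9/h^{4b}$, that $b < 1/4$ and that $0 < h' \le h \le 1$, we have
\[ \mu_{\Lambda_i,-,h'}\!\left(P\!\left(x, \frac{1}{2h^a}\right)\right) = \frac{Z_{\Lambda_i,-,h'}\!\left(P(x, \frac{1}{2h^a})\right)}{Z_{\Lambda_i,-,h'}} \le \exp(\beta |\Lambda_i| h)\, \frac{Z_{\Lambda_i,-,0}\!\left(P(x, \frac{1}{2h^a})\right)}{Z_{\Lambda_i,-,0}} \le \mu_{\Lambda_i,-,0}\!\left(P\!\left(x, \frac{1}{2h^a}\right)\right) \exp(C_1 h^{1-4b}) \le C_2 \exp(-C_3/h^a), \]
where in the last inequality we are also using (1.17), with the role of +s and −s switched. This completes the proof of the lemma.
Lemma 2.3.5. There are positive finite constants $C_1$ and $C_2$ such that for every local observable $g$ there is $h_0 = h_0(g) > 0$ such that for all $0 < h' \le h \le h_0$ and all finite $\Lambda \subset \mathbb{Z}^2$,
\[ \left| \langle g \rangle^{\emptyset,h}_{\Lambda,-,h'} - \langle g \rangle_{\Lambda \cap (Supp(g) + \Lambda(1/h^a)),-,h'} \right| \le C_1 \|g\|_\infty\, |Supp(g)| \exp(-C_2/h^a). \]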

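Proof. Without loss in generality we suppose that $g$ is increasing and has $\|g\|_\infty \le 1$. Set
\[ E_1 = \bigcup_{x \in Supp(g)} P\!\left(x, \frac{1}{2h^a}\right). \]
Once more we use an argument similar to the one used to prove (2.16); this time we condition on what the (+,*)-cluster of the set $(Supp(g) + \Lambda(1/h^a))^c$ is. In this fashion we obtain the following equality:
\[ \langle g \rangle^{\emptyset,h}_{\Lambda,-,h'} = \left( \sum_i \alpha_i \cdot \langle g \rangle^{\emptyset,h}_{\Lambda_i,-,h'} \right) \mu^{\emptyset,h}_{\Lambda,-,h'}((E_1)^c) + \langle g | E_1 \rangle^{\emptyset,h}_{\Lambda,-,h'}\, \mu^{\emptyset,h}_{\Lambda,-,h'}(E_1), \]
where the index $i$ runs over a finite set, for each of its values $\Lambda_i$ is a subset of $\Lambda \cap (Supp(g) + \Lambda(1/h^a))$ which contains $\Lambda \cap Supp(g)$ and $\alpha_i > 0$; moreover $\sum_i \alpha_i = 1$. Note that for small enough $h_0$ (depending on $Supp(g)$) the fact that $a < b$ implies that no $h$-vertebrate contour can fit inside $\Lambda \cap (Supp(g) + \Lambda(1/h^a))$, so that for each value of $i$ we have $\langle\,\cdot\,\rangle^{\emptyset,h}_{\Lambda_i,-,h'} = \langle\,\cdot\,\rangle_{\Lambda_i,-,h'}$. The equality displayed above and Lemma 2.3.4 now lead to
\[ \left| \langle g \rangle^{\emptyset,h}_{\Lambda,-,h'} - \sum_i \alpha_i \cdot \langle g \rangle_{\Lambda_i,-,h'} \right| \le 2 \mu^{\emptyset,h}_{\Lambda,-,h'}(E_1) \le C_1 |Supp(g)| \exp(-C_2/h^a), \tag{2.30} \]
for a proper choice of $h_0$. Similarly, consider now the event
\[ E_2 = \bigcup_{x \in Supp(g) + \Lambda(1/h^a)} P\!\left(x, \frac{1}{2h^a}\right), \]
and condition on what the (+,*)-cluster of $(Supp(g) + \Lambda(2/h^a))^c$ is. The same reasoning as above gives us, for small enough $h_0$,
\[ \left| \langle g \rangle^{\emptyset,h}_{\Lambda,-,h'} - \sum_j \alpha_j \cdot \langle g \rangle_{\Lambda_j,-,h'} \right| \le 2 \mu^{\emptyset,h}_{\Lambda,-,h'}(E_2) \le C_1 |Supp(g)| \exp(-C_2/h^a), \tag{2.31} \]
where the index $j$ runs over a finite set, for each of its values $\Lambda_j$ is a subset of $\Lambda \cap (Supp(g) + \Lambda(2/h^a))$ which contains $\Lambda \cap (Supp(g) + \Lambda(1/h^a))$ and $\alpha_j > 0$; moreover $\sum_j \alpha_j = 1$. Thanks to the fact that $g$ is being supposed to be increasing, we have for all $i$ and all $j$ as above,
\[ \langle g \rangle_{\Lambda_i,-,h'} \le \langle g \rangle_{\Lambda \cap (Supp(g) + \Lambda(1/h^a)),-,h'} \le \langle g \rangle_{\Lambda_j,-,h'}. \tag{2.32} \]
The lemma follows from (2.30), (2.31) and (2.32).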

The next lemma is a fundamental step, in which an aspect of the heuristics about droplets and their surface and bulk free-energies is made into a rigorous result.
Lemma 2.3.6. For any $p > 0$, given $\varepsilon > 0$ there is a finite positive constant $h_0$ such that for any $0 < h \le h_0$, any collection of skeletons $S$, and any simply-connected set $\Lambda \subset \mathbb{Z}^2$ which satisfies $|\Lambda| \le 1/h^p$,
\[ \tilde\mu_{\Lambda,-,h}(\mathcal{S}^{h,sk}_S) \le \exp\!\left( -\beta \left[ (1 - \varepsilon) W(S) - (1 + \varepsilon)\, h\, m^* \check V(S) \right] \right) \le \exp\!\left( -\beta (1 - 3\varepsilon) \frac{W(S)}{2} \right). \]
Note. In this and some further statements the restriction $|\Lambda| \le 1/h^p$ is not really necessary. Still, we are using it since it simplifies some arguments, and is enough for our purposes.

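Proof. The second inequality is immediate from Lemma 2.3.1, so that we only have to prove the first one. It is enough to consider the case of the nonempty skeleton $S$. We start with
\[ \tilde\mu_{\Lambda,-,h}(\mathcal{S}^{h,sk}_S) \le \sum_{G \in \mathcal{C}^h_S} \frac{Z_{\Lambda,-,h}(\mathcal{S}^{h,co}_G)}{Z_{\Lambda,-,h}(\mathcal{S}^{h,co}_\emptyset)} = \sum_{G \in \mathcal{C}^h_S} \frac{Z_{\Lambda,-,0}(\mathcal{S}^{h,co}_G)}{Z_{\Lambda,-,0}(\mathcal{S}^{h,co}_\emptyset)} \exp\!\left( \frac{\beta}{2} \int_0^h \sum_{x \in \Lambda} \left[ \langle \sigma(x) \rangle^{G,h}_{\Lambda,-,h'} - \langle \sigma(x) \rangle^{\emptyset,h}_{\Lambda,-,h'} \right] dh' \right), \tag{2.33} \]
where in the first step we used the fact that $R \supset \mathcal{S}^{h,co}_\emptyset$ when $h$ is small, while in the second step we used the fact that for an arbitrary $E \subset \Omega_{\Lambda,-}$,
\[ \frac{d}{dh} \log Z_{\Lambda,-,h}(E) = \frac{\beta}{2} \sum_{x \in \Lambda} \langle \sigma(x) | E \rangle_{\Lambda,-,h}. \]
Next we will show that given $\varepsilon > 0$ it is possible to take $h_0 > 0$ small enough so that if $G \in \mathcal{C}^h_S$, $0 < h' \le h \le h_0$, and $\Lambda$ is simply-connected and satisfies $|\Lambda| \le 1/h^p$, then
\[ \sum_{x \in \Lambda} \left[ \langle \sigma(x) \rangle^{G,h}_{\Lambda,-,h'} - \langle \sigma(x) \rangle^{\emptyset,h}_{\Lambda,-,h'} \right] \le 2(1 + \varepsilon)\, m^* \check V(S) + C\, W(S)/h^{2a} + 1. \tag{2.34} \]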

(The reader should not be confused by the fact that $a$ seemingly does not enter the l.h.s. In fact, it enters, because the restriction $G \in \mathcal{C}^h_S$ depends on the related parameter $r$.) For this purpose let $\partial_- G$ (resp. $\partial_+ G$) be the set of sites where each configuration in $\Omega_-$ with the set of external contours equal to $G$ is doomed to be −1 (resp. +1). Let $\Lambda^G_{ext}$ and $\Lambda^G_{int}$ be the components of $\Lambda \setminus (\partial_- G \cup \partial_+ G)$ which are, respectively, external and internal to the contours in $G$. Let also $\check\Lambda^G_{ext}$ and $\check\Lambda^G_{int}$ be, respectively, the subsets of $\Lambda^G_{ext}$ and $\Lambda^G_{int}$ obtained by removing from these sets all sites which are at a distance not larger than $2/h^a$ from any point in any contour of $G$.
First we consider the sites $x$ which are neither in $\check\Lambda^G_{int}$ nor in $\check\Lambda^G_{ext}$. For these we have, using (2.19), (2.20) and the fact that $r < a$, that when $h_0$ is small,
\[ \sum_{x \in \Lambda \setminus (\check\Lambda^G_{int} \cup \check\Lambda^G_{ext})} \left[ \langle \sigma(x) \rangle^{G,h}_{\Lambda,-,h'} - \langle \sigma(x) \rangle^{\emptyset,h}_{\Lambda,-,h'} \right] \le 2 |\Lambda \setminus (\check\Lambda^G_{int} \cup \check\Lambda^G_{ext})| \le C\, W(S)/h^{2a}. \tag{2.35} \]
Regarding now the sites $x \in \check\Lambda^G_{ext}$, we observe that for these sites
\[ \langle \sigma(x) \rangle^{G,h}_{\Lambda,-,h'} = \langle \sigma(x) \rangle^{\emptyset,h}_{\Lambda^G_{ext},-,h'}. \]
But for each such $x$ we have $\Lambda^G_{ext} \cap (x + \Lambda(1/h^a)) = \Lambda \cap (x + \Lambda(1/h^a))$, so that a double application of Lemma 2.3.5 gives us, for small $h_0$, that
\[ \left| \langle \sigma(x) \rangle^{\emptyset,h}_{\Lambda^G_{ext},-,h'} - \langle \sigma(x) \rangle^{\emptyset,h}_{\Lambda,-,h'} \right| \le C_1 \exp(-C_2/h^a). \]

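Combining the last two displays, and using the fact that $|\check\Lambda^G_{ext}| \le |\Lambda| \le 1/h^p$, we obtain for small enough $h_0$,
\[ \sum_{x \in \check\Lambda^G_{ext}} \left[ \langle \sigma(x) \rangle^{G,h}_{\Lambda,-,h'} - \langle \sigma(x) \rangle^{\emptyset,h}_{\Lambda,-,h'} \right] \le 1. \tag{2.36} \]
Finally, regarding now the sites $x \in \check\Lambda^G_{int}$, we observe that for these sites
\[ \langle \sigma(x) \rangle^{G,h}_{\Lambda,-,h'} = \langle \sigma(x) \rangle_{\Lambda^G_{int},+,h'}. \]
But since each such $x$ is separated from the boundary of the set $\Lambda^G_{int}$ by a minimal distance $1/h^a$, when $h$ is small, we obtain from the FKG-Holley inequalities, (1.22) and (1.7),
\[ \langle \sigma(x) \rangle_{\Lambda^G_{int},+,h'} \le \langle \sigma(x) \rangle_{\Lambda^G_{int},+,h} \le m(h) + C_1 \exp(-C_2/h^a) \le m^*(1 + \varepsilon), \]
provided $h_0$ is chosen small enough. On the other hand, since $\Lambda$ is simply-connected, for each $x \in \check\Lambda^G_{int}$ we have $\Lambda \cap (x + \Lambda(1/h^a)) = x + \Lambda(1/h^a)$, so that Lemma 2.3.5 gives us, for small $h_0$,
\[ \left| \langle \sigma(x) \rangle^{\emptyset,h}_{\Lambda,-,h'} - \langle \sigma(x) \rangle_{x + \Lambda(1/h^a),-,h'} \right| \le C_1 \exp(-C_2/h^a). \]
By another application of the FKG-Holley inequalities and (1.22) (with +1 and −1 switched) we have
\[ \langle \sigma(x) \rangle_{x + \Lambda(1/h^a),-,h'} \ge \langle \sigma(x) \rangle_{x + \Lambda(1/h^a),-,0} \ge -m^* - C_1 \exp(-C_2/h^a) \ge -m^*(1 + \varepsilon), \]
provided $h_0$ is chosen small enough. Combining the last four displays, and using the fact that $|\check\Lambda^G_{int}| \le \check V(S)$, for small enough $h_0$ we obtain
\[ \sum_{x \in \check\Lambda^G_{int}} \left[ \langle \sigma(x) \rangle^{G,h}_{\Lambda,-,h'} - \langle \sigma(x) \rangle^{\emptyset,h}_{\Lambda,-,h'} \right] \le 2(1 + \varepsilon)\, m^* \check V(S). \tag{2.37} \]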

By adding (2.35), (2.36) and (2.37), we obtain (2.34). Keeping in mind that our concern is with the r.h.s. of (2.33), we observe now that
\[ \sum_{G \in \mathcal{C}^h_S} \frac{Z_{\Lambda,-,0}(\mathcal{S}^{h,co}_G)}{Z_{\Lambda,-,0}(\mathcal{S}^{h,co}_\emptyset)} = \frac{\mu_{\Lambda,-,0}(\mathcal{S}^{h,sk}_S)}{\mu_{\Lambda,-,0}(\mathcal{S}^{h,co}_\emptyset)}. \tag{2.38} \]
The numerator of this fraction is controlled in the fundamental Lemma 10.1 in [Pfi], where it is shown to satisfy
\[ \mu_{\Lambda,-,0}(\mathcal{S}^{h,sk}_S) \le \exp(-\beta W(S) + C J(S)), \]
where $J(S) = \sum_i J(\gamma_i)$, in case $S = \{\gamma_1, \ldots, \gamma_n\}$, with $J(\gamma)$ being the number of vertices of the skeleton $\gamma$. In reality, in the statement of Lemma 10.1 in [Pfi] the assumption that the temperature is low enough is made. But, as pointed out in [Iof2], this assumption is actually not needed for Pfister's elegant proof, based on a clever use of duality and Griffiths' inequalities, to work. Using (2.21) we can now write that given $\varepsilon > 0$ there is $h_0 > 0$ such that for $0 < h \le h_0$,
\[ \mu_{\Lambda,-,0}(\mathcal{S}^{h,sk}_S) \le \exp(-\beta (1 - \varepsilon/2) W(S)). \tag{2.39} \]

Turning now to the denominator in the r.h.s. of (2.38), observe that at the boundary of any external vertebrate contour of a configuration in $\Omega_-$ there is a (+,*)-chain with $l_\infty$-diameter at least $1/h^b$. If $\sigma \in (\mathcal{S}^{h,co}_\emptyset)^c$, then in $\sigma$ there is such a (+,*)-chain, and (1.17) (with +1 and −1 switched) can be used in combination with $|\Lambda| \le 1/h^p$ to conclude that
\[ \mu_{\Lambda,-,0}(\mathcal{S}^{h,co}_\emptyset) \ge \frac{1}{2}, \tag{2.40} \]
for small enough $h$. The first inequality claimed in the statement of the lemma follows from (2.33), (2.34), (2.38), (2.39) and (2.40), since $C h^{1-2a} < \varepsilon/2$ if $h$ is small.
The next lemma takes care of some remaining entropy.
Lemma 2.3.7. For any $p > 0$, given $\varepsilon > 0$ there exists $h_0 > 0$ such that for all $u > 0$ and $D > 0$, there is a finite positive constant $C$ such that for any $0 < h \le h_0$, and any simply-connected set $\Lambda \subset \mathbb{Z}^2$ which satisfies $|\Lambda| \le 1/h^p$,
\[ \tilde\mu_{\Lambda,-,h}\!\left( \mathcal{S}^{h,W}_{[\frac{D}{h^u}, \infty)} \right) \le C \exp\!\left( -\beta (1 - \varepsilon) \frac{D}{2h^u} \right). \]
Proof. There is no loss in generality in supposing that $0 < \varepsilon < 1$ and that $u > r$; the second of these claims being justified by Lemma 2.3.3 and the fact that $r < b$. We start by estimating $\tilde\mu_{\Lambda,-,h}\big( \mathcal{S}^{h,W}_{[\frac{D}{h^u} k, \frac{D}{h^u}(k+1))} \big)$, for $k = 1, 2, \ldots$. To bound these quantities from above, using Lemma 2.3.6, all that we need is an upper bound on the number of choices of skeletons $S$ which correspond to configurations in $\Omega_{\Lambda,-}$ and have $W(S) \in [\frac{D}{h^u} k, \frac{D}{h^u}(k+1))$. Using (2.21), the number $J(S)$ of vertices that $S$ can have is bounded above by $N_k = C_1 D (k+1)/h^{u-r} \le C_2 D k/h^{u-r}$. Let $V$ be the set of distinct points which are possible vertices of $S$. The cardinality of $V$ is bounded above by $4|\Lambda| \le 4/h^p$. We consider now ordered $N_k$-tuples of points in $V$ and associate to each point in such an $N_k$-tuple one of the words “continue”, “close”, or “quit”. To every such object we associate a collection of closed polygons in the following way. We start from the first point in the $N_k$-tuple and while we keep seeing the word “continue”, we do the following.
We join the first point to the second point, this one to the third point, and so on. When we first reach a point where we read “close”, we connect it to the first point of the $N_k$-tuple, closing a first polygonal line, and then we jump to the point which in the $N_k$-tuple follows the point where we just read “close”. While we do not see the word “quit”, we proceed in this fashion, understanding that the word “close” means that we close the polygonal line which we are currently constructing, and that then we jump to the next point in the $N_k$-tuple and start the next polygonal line from there. The word “quit” is self-explanatory: we stop the procedure by closing the last polygon and disregard the remaining points of the $N_k$-tuple. The procedure that we described generates all the collections of skeletons with which we are concerned and plenty of additional garbage. At any rate, counting the number of options here gives us an upper bound on the quantity we are interested in. This upper bound on the number of choices of $S$ is bounded above by
\[ |V|^{N_k}\, 3^{N_k} \le \left( \frac{12}{h^p} \right)^{C_2 D k / h^{u-r}} \le \exp\!\left( \beta \varepsilon \frac{D}{4h^u} k \right), \]
for small enough $h$. Combining this estimate with Lemma 2.3.6 gives us that for small $h_0$,
\[ \tilde\mu_{\Lambda,-,h}\!\left( \mathcal{S}^{h,W}_{[\frac{D}{h^u}, \infty)} \right) = \sum_{k=1}^{\infty} \tilde\mu_{\Lambda,-,h}\!\left( \mathcal{S}^{h,W}_{[\frac{D}{h^u} k, \frac{D}{h^u}(k+1))} \right) \le \sum_{k=1}^{\infty} C' \exp\!\left( -\beta \left( 1 - \frac{\varepsilon}{2} \right) \frac{D}{2h^u} k \right) \exp\!\left( \beta \varepsilon \frac{D}{4h^u} k \right) \le C \exp\!\left( -\beta (1 - \varepsilon) \frac{D}{2h^u} \right), \]
provided $\varepsilon < 1$.
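The counting step in the proof above can be made concrete with a short sketch. It decodes a tuple of points labelled with the words “continue”, “close” and “quit” into closed polygons, following the procedure described in the proof, and evaluates the crude count $(3|V|)^{N_k}$ of labelled tuples. The function names, the representation of a polygon as a vertex list with an implicit closing edge, and the choice to discard a polygon left unfinished at the end of the tuple are assumptions made for illustration only.

```python
from typing import List, Tuple

Point = Tuple[int, int]
# Each entry pairs a candidate vertex with one of "continue", "close", "quit".
Labelled = Tuple[Point, str]


def decode_polygons(labelled: List[Labelled]) -> List[List[Point]]:
    """Turn a labelled tuple of points into a list of closed polygons.

    A polygon is returned as its list of vertices; the edge from the last
    vertex back to the first one is implicit.
    """
    polygons: List[List[Point]] = []
    current: List[Point] = []
    for point, word in labelled:
        current.append(point)
        if word == "continue":
            continue
        if word not in ("close", "quit"):
            raise ValueError(f"unknown word: {word!r}")
        # Both "close" and "quit" close the polygon under construction.
        polygons.append(current)
        current = []
        if word == "quit":
            break  # the remaining points of the tuple are disregarded
    # Assumption: a polygon that was never closed is treated as garbage.
    return polygons


def encoding_count(num_candidate_vertices: int, tuple_length: int) -> int:
    """Number of labelled tuples: |V| choices of point and 3 words per slot."""
    return (3 * num_candidate_vertices) ** tuple_length


example = [
    ((0, 0), "continue"), ((1, 0), "continue"), ((1, 1), "close"),
    ((5, 5), "continue"), ((6, 5), "continue"), ((6, 6), "quit"),
    ((9, 9), "continue"),  # ignored, since it comes after "quit"
]
triangles = decode_polygons(example)
```

Since every admissible collection of skeletons arises from at least one labelled tuple, `encoding_count` overcounts, which is all the upper bound requires.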

We are now close to completing the proof of Proposition 2.1.1. The inequality (2.17) is a direct consequence of Lemmas 2.3 .2 and 2.3 .7. For use in Sect. 3.2, we observe that the same argument gives for each p > 0 and > 0 the existence of some finite C1 so that A − , (2.41) sup µ˜ 3,−,h (Fx ) ≤ C1 exp −β(1 − ) sup h 3 simply-connected x∈3 |3|≤(1/h)p

for small h. Turning to (2.18), Lemma 2.3 .5, with h = h0 , gives us that when h is small, for all 3 ∈ L0,h , a |hf i∅,h (2.42) 3,−,h − hf i3(1/ha )),−,h | ≤ C(f ) exp(−C1 /h ). We will be done once we replace the conditional Gibbs distribution µ∅,h 3,−,h , implicit in this expression, with the conditional Gibbs distribution µ˜ 3,−,h . To this end we combine Lemmas 2.3 .3 and 2.3 .7 to write µ˜ 3,−,h (S∅h,co )c ≤ C2 exp(−C3 /hb ). Combining this inequality with the equalities Z Z Z f dµ˜ 3,−,h = f dµ˜ 3,−,h + S∅h,co

R

and hf i∅,h 3,−,h =

S∅h,co

S∅h,co

f dµ˜ 3,−,h

µ˜ 3,−,h (S∅h,co )

c f dµ˜ 3,−,h ,

,

gives us Z f dµ˜ 3,−,h − hf i∅,h ≤ C4 (f ) exp(−C3 /hb ) ≤ C4 (f ) exp(−C3 /ha ). 3,−,h

(2.43)

Together (2.42) and (2.43) give us (2.18), and Proposition 2.1.1 is proved.

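2.4. Asymptotic expansion. Proposition 2.4.1. Suppose that $T < T_c$. Then for each constant $a \in (0, 1/2)$ and local observable $f$, for $n = 1, 2, 3, \ldots$, the following expansion holds when $h > 0$:
\[ \langle f \rangle_{\Lambda(1/h^a),-,h} = \sum_{j=0}^{n-1} b_j(f)\, h^j + O(h^n), \]
where
\[ b_j(f) = \frac{1}{j!} \left. \frac{d^j \langle f \rangle_{-,h}}{dh^j} \right|_{h = 0^-} = \frac{1}{j!} \left( \frac{\beta}{2} \right)^{j} \sum_{x_1, \ldots, x_j \in \mathbb{Z}^2} \langle f; \sigma(x_1); \ldots; \sigma(x_j) \rangle_-, \]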

and $O(h^n)$ is a function of $f$ and $h$ which satisfies $\limsup_{h \searrow 0} |O(h^n)|/h^n < \infty$. The existence of the derivatives $b_j(f)$ and their relations with the summations over generalized Ursell functions are contained in (1.25), modulo the interchange of the role of +s and −s. Below we will also be using various other relations from Sect. 1.3 modulo this symmetry; we will do it without further warning. From (1.23) and (1.17), in combination with an idea already explained in Sect. 2.1 in connection with the derivation of (2.1) and also exploited at the end of the proof of Lemma 2.3.4 (and which amounts to an estimate on a Radon-Nikodym derivative), we have for each $a \in (0, 1/2)$, $T < T_c$, $0 \le h' \le h \le 1$ and local observables $f$ and $g$,
\begin{align*}
|\langle f; g \rangle_{\Lambda(1/h^a),-,h'}| &\le C(f, g)\, \mu_{\Lambda(1/h^a),-,h'}\!\left( Supp(f) \stackrel{+}{\longleftrightarrow} Supp(g) \right) \\
&= C(f, g)\, \frac{Z_{\Lambda(1/h^a),-,h'}\!\left( Supp(f) \stackrel{+}{\longleftrightarrow} Supp(g) \right)}{Z_{\Lambda(1/h^a),-,h'}} \\
&\le C(f, g) \exp(\beta |\Lambda(1/h^a)| h')\, \frac{Z_{\Lambda(1/h^a),-,0}\!\left( Supp(f) \stackrel{+}{\longleftrightarrow} Supp(g) \right)}{Z_{\Lambda(1/h^a),-,0}} \\
&\le C(f, g)\, e^{\beta}\, \mu_{\Lambda(1/h^a),-,0}\!\left( Supp(f) \stackrel{+}{\longleftrightarrow} Supp(g) \right) \\
&\le C(f, g, T) \exp\!\left( -C(T)\, \mathrm{dist}_\infty(Supp(f), Supp(g)) \right). \tag{2.44}
\end{align*}
As with (1.27) in Sect. 1.3, the argument in Appendix B of [M-L] shows that from the exponential decay of correlations in (2.44) a similar exponential decay follows for the generalized Ursell functions. With $a$, $T$, $h$, $h'$ and $f$ as above,
\[ |\langle f; \sigma(x_1); \ldots; \sigma(x_j) \rangle_{\Lambda(1/h^a),-,h'}| \le C_j(T, f) \exp\!\left( -C(T)\, \frac{\mathrm{diam}_\infty(Supp(f) \cup \{x_1, \ldots, x_j\})}{j} \right). \tag{2.45} \]

If we keep $0 < h \le 1$ fixed and look at $\langle f \rangle_{\Lambda(1/h^a),-,h'}$ as a function of $h' \ge 0$, we can write the following Taylor expansion:
\begin{align*}
\langle f \rangle_{\Lambda(1/h^a),-,h} &= \sum_{j=0}^{n-1} \frac{1}{j!} \left. \frac{d^j \langle f \rangle_{\Lambda(1/h^a),-,h'}}{d(h')^j} \right|_{h'=0} h^j + \frac{1}{n!} \left. \frac{d^n \langle f \rangle_{\Lambda(1/h^a),-,h'}}{d(h')^n} \right|_{h'=h''(h)} h^n \\
&= \sum_{j=0}^{n-1} \left[ \frac{1}{j!} \left( \frac{\beta}{2} \right)^{j} \sum_{x_1, \ldots, x_j \in \Lambda(1/h^a)} \langle f; \sigma(x_1); \ldots; \sigma(x_j) \rangle_{\Lambda(1/h^a),-,0} \right] h^j \\
&\quad + \left[ \frac{1}{n!} \left( \frac{\beta}{2} \right)^{n} \sum_{x_1, \ldots, x_n \in \Lambda(1/h^a)} \langle f; \sigma(x_1); \ldots; \sigma(x_n) \rangle_{\Lambda(1/h^a),-,h''(h)} \right] h^n, \tag{2.46}
\end{align*}
where all that we know about $h''(h)$ is that $h''(h) \in [0, h]$. Thanks to the uniformity on $h$ and $h'$ of the bound (2.45), this is enough to conclude that the coefficient which multiplies $h^n$ in the last term of (2.46) is bounded in absolute value by a constant which does not depend on $h$. In other words, this last term in (2.46) is indeed a $O(h^n)$. To conclude the proof of Proposition 2.4.1 we must show that for $j = 0, 1, \ldots, n - 1$, the coefficient which multiplies $h^j$ converges fast enough to $b_j(f)$. We will show that this convergence occurs at a rate which is more than enough for this purpose. Indeed, using first (1.27) and then (1.26), we have for some finite positive $C_1$, $C_2$, $C_3$ and $C_4$,
\begin{align*}
&\left| \sum_{x_1, \ldots, x_j \in \Lambda(1/h^a)} \langle f; \sigma(x_1); \ldots; \sigma(x_j) \rangle_{\Lambda(1/h^a),-,0} - \sum_{x_1, \ldots, x_j \in \mathbb{Z}^2} \langle f; \sigma(x_1); \ldots; \sigma(x_j) \rangle_- \right| \\
&\quad \le \left| \sum_{x_1, \ldots, x_j \in \Lambda(1/(2h^a))} \langle f; \sigma(x_1); \ldots; \sigma(x_j) \rangle_{\Lambda(1/h^a),-,0} - \sum_{x_1, \ldots, x_j \in \Lambda(1/(2h^a))} \langle f; \sigma(x_1); \ldots; \sigma(x_j) \rangle_- \right| + C_1 \exp\!\left( -\frac{C_2}{h^a} \right) \\
&\quad \le \sum_{x_1, \ldots, x_j \in \Lambda(1/(2h^a))} \left| \langle f; \sigma(x_1); \ldots; \sigma(x_j) \rangle_{\Lambda(1/h^a),-,0} - \langle f; \sigma(x_1); \ldots; \sigma(x_j) \rangle_- \right| + C_1 \exp\!\left( -\frac{C_2}{h^a} \right) \\
&\quad \le C_3 \exp\!\left( -\frac{C_4}{h^a} \right).
\end{align*}
This completes the proof of Proposition 2.4.1. We have now finished the proof of Theorem 1(i) in the special case in which the initial distribution $\nu$ is concentrated on the configuration with all spins down.
2.5. More general initial distributions. In this section we will show that part (i) of Theorem 1 for an initial distribution $\nu \le \mu_-$ can be derived from the same result for the particular case in which $\nu$ is concentrated on the configuration with all spins down. With no loss in generality we will again suppose that $f$ is increasing and has $\|f\|_\infty \le 1$. In this case for all $t > 0$,
\[ E(f(\sigma^{\nu}_{h;t})) \ge E(f(\sigma^{-}_{h;t})). \]

The claim therefore will be proven once we show that for all $\lambda$ and $\lambda'$ which satisfy $0 < \lambda < \lambda'$, there are finite positive constants $C_1$ and $C_2$ such that for $h > 0$,
\[ E(f(\sigma^{\nu}_{h;\exp(\lambda/h)})) \le E(f(\sigma^{-}_{h;\exp(\lambda'/h)})) + C_1 \exp(-C_2/h). \tag{2.47} \]
To prove this inequality we first note that for arbitrary $s, t \ge 0$,
\[ E(f(\sigma^{\nu}_{h;t})) \le E(f(\sigma^{\mu_-}_{h;t})) = \int P(\sigma^{\mu_-}_{0;s} \in d\zeta)\, E(f(\sigma^{\zeta}_{h;t})). \]
From Lemma 1.2.1 there exist finite positive $C_3$, $C_4$ and $C_5$ so that we have
\[ \int P(\sigma^{\mu_-}_{0;s} \in d\zeta)\, E(f(\sigma^{\zeta}_{h;t})) \le \int P(\sigma^{-}_{0;s} \in d\zeta)\, E(f(\sigma^{\zeta}_{h;t})) + P\!\left( \sigma^{-}_{0;s}(x) \ne \sigma^{\mu_-}_{0;s}(x) \text{ for some } x \in \Lambda(C_3 t) \right) + C_4 \exp(-C_5 t). \]
From the basic-coupling inequalities and the Markov property,
\[ \int P(\sigma^{-}_{0;s} \in d\zeta)\, E(f(\sigma^{\zeta}_{h;t})) \le \int P(\sigma^{-}_{h;s} \in d\zeta)\, E(f(\sigma^{\zeta}_{h;t})) = E(f(\sigma^{-}_{h;s+t})). \]
Combining the last three displays, and taking $t = s = \exp(\lambda/h)$ we obtain
\begin{align*}
E(f(\sigma^{\nu}_{h;\exp(\lambda/h)})) &\le E(f(\sigma^{-}_{h;2\exp(\lambda/h)})) + |\Lambda(C_3 \exp(\lambda/h))|\, P\!\left( \sigma^{-}_{0;\exp(\lambda/h)}(0) \ne \sigma^{\mu_-}_{0;\exp(\lambda/h)}(0) \right) + C_4 \exp(-C_5 \exp(\lambda/h)) \\
&\le E(f(\sigma^{-}_{h;\exp(\lambda'/h)})) + C_6 \exp(2\lambda/h) \left( E(\sigma^{+}_{0;\exp(\lambda/h)}(0)) - m^* \right) + C_4 \exp(-C_5 \exp(\lambda/h)),
\end{align*}
when $h$ is small. In the second inequality above we used the basic-coupling inequalities twice (which imply in particular the monotonicity of $E(f(\sigma^{-}_{h;t}))$ in $t$), and also the spin-reversal symmetry in case $h = 0$.
To complete the proof of (2.47) we have to show that $E(\sigma^{+}_{0;u}(0)) \to m^*$ fast enough as $u \to \infty$. The following lemma, which states that this happens faster than any power of $1/u$, is clearly sufficient for our purpose.
Lemma 2.5.1. Suppose $T < T_c$. Then for each $p > 0$ there is a positive finite constant $C$ such that
\[ 0 \le E(\sigma^{+}_{0;u}(0)) - m^* \le C u^{-p}. \]
Proof. The lower bound is a standard application of a basic-coupling inequality. To prove the upper bound, we will first also use basic-coupling inequalities, in order to compare the infinite system with finite ones with (+) boundary conditions. For these finite systems we will then use a result in [Mar], which was extended up to $T_c$ in [CGMS]. The result of [Mar] and [CGMS] that we will use refers to the spectral gap of the generator of a kinetic Ising model in a finite box. For each finite $\Lambda \subset \mathbb{Z}^2$, $\eta \in \Omega$ and $h$, the process $(\sigma^{\cdot}_{\Lambda,\eta,h;t})_{t \ge 0}$ is a finite-state-space reversible irreducible Markov process and its generator has its (discrete) spectrum contained in the interval $(-\infty, 0]$, with 0 being in the spectrum. The spectral gap, denoted by $\mathrm{gap}(\Lambda, \eta, h)$, is then simply the absolute value of the largest non-zero number in the spectrum. It is shown in [Mar] (Theorem 3.1) that for low enough $T$, given $\varepsilon \in (0, 1/2)$, there exists a finite positive $C$ so that
\[ \mathrm{gap}(\Lambda(l), +, 0) \ge \exp(-C l^{1/2 + \varepsilon}). \]
The common belief is that even up to $T_c$ the lower bound on $\mathrm{gap}(\Lambda(l), +, 0)$ in this inequality is far from optimal. No rigorous result in this direction is available, and for temperatures close to $T_c$ only a weaker bound has been proven. That bound, which is implicitly derived in [CGMS] (see the introduction of that paper), states that for any $\varepsilon > 0$,
\[ \mathrm{gap}(\Lambda(l), +, 0) \ge \exp(-\varepsilon l), \tag{2.48} \]
for large $l$. Fortunately for us, this estimate suffices for our purpose here. For any $l > 0$, we can write, using basic-coupling inequalities, a standard estimate for the relaxation to equilibrium of expected values of observables in terms of the spectral gap (see, e.g., inequality (59) in [Sch1]), and (1.22),
\begin{align*}
E(\sigma^{+}_{0;u}(0)) - m^* &\le \left| E(\sigma^{+}_{\Lambda(l),+,0;u}(0)) - \langle \sigma(0) \rangle_{\Lambda(l),+,0} \right| + \left| \langle \sigma(0) \rangle_{\Lambda(l),+,0} - m^* \right| \\
&\le \frac{e^{-\mathrm{gap}(\Lambda(l),+,0)\, u}}{\mu_{\Lambda(l),+,0}(+)} + C_1 \exp(-C_2 l). \tag{2.49}
\end{align*}
We use now (2.48) with $\varepsilon = C_2/(2p)$, and we choose $l = (\log u)/(2\varepsilon)$. Since $\mu_{\Lambda(l),+,0}(+) \ge \exp(-C_3 l^2)$, the first term in the right hand side of (2.49) is bounded above by $\exp(-\sqrt{u}/2)$, when $u$ is large. This finishes the proof, since the second term in the right hand side of (2.49) is of the claimed form and this upper bound on the first term goes to 0 even faster.
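The parameter choice at the end of the proof of Lemma 2.5.1 can be checked numerically. The sketch below takes hypothetical values for the constant $C_2$ of (2.49), the exponent $p$ and the time $u$, sets $\varepsilon = C_2/(2p)$ and $l = (\log u)/(2\varepsilon)$, and evaluates the lower bound $\exp(-\varepsilon l)$ of (2.48) together with the boundary term $\exp(-C_2 l)$. The function name and the numerical inputs are illustrative assumptions, not values taken from the paper.

```python
import math
from typing import Tuple


def lemma_251_parameters(c2: float, p: float, u: float) -> Tuple[float, float, float, float]:
    """Return (eps, l, gap_lower_bound, boundary_term) for the choices in the proof."""
    eps = c2 / (2.0 * p)                 # epsilon = C_2 / (2p)
    l = math.log(u) / (2.0 * eps)        # l = (log u) / (2 epsilon)
    gap_lower_bound = math.exp(-eps * l)  # equals u ** (-1/2)
    boundary_term = math.exp(-c2 * l)     # equals u ** (-p)
    return eps, l, gap_lower_bound, boundary_term


# Hypothetical constants, chosen only to exercise the formulas.
eps, l, gap_lb, boundary = lemma_251_parameters(c2=0.3, p=2.0, u=1.0e6)
```

With these choices the gap bound gives $\mathrm{gap} \cdot u \ge \sqrt{u}$, which dominates the factor $\exp(C_3 l^2)$ coming from $1/\mu_{\Lambda(l),+,0}(+)$ because $l$ is only logarithmic in $u$, while the boundary term is exactly of order $u^{-p}$.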

3. Relaxation Regime
3.1. Preliminaries. In this part of the paper we will prove part (ii) of Theorem 1. For this we suppose that $\lambda > \lambda_c$ is fixed and that $\nu \le \mu_-$. Once more, there is no loss in generality in supposing that $f$ is increasing and that it has $\|f\|_\infty \le 1$. For the remainder of the proof we will make these assumptions. Half of our goal is trivial, since the basic-coupling inequalities give for each $t \ge 0$,
\[ E(f(\sigma^{\nu}_{h;t})) \le E(f(\sigma^{\mu_h}_{h;t})) = \langle f \rangle_h. \]
Our goal is therefore reduced to proving that for all $C > 0$ there exists a finite $C_1$ such that
\[ E(f(\sigma^{\nu}_{h;\exp(\lambda/h)})) \ge \langle f \rangle_h - C_1 \exp(-C/h). \tag{3.1} \]
At this point there is also no loss in supposing that $\nu$ is concentrated on the configuration with all spins down, and therefore also that $\lambda_c < \lambda < 2\lambda_c$. In our argumentation this second assumption will simplify things. In Sect. 3.2 we will introduce certain space-time structures, which we call inverted pyramids, and which will be used to obtain statements concerning droplet growth. To be



able to use these results in order to prove (3.1), we will need to use such inverted pyramids as building blocks of a rescaling procedure, and will also need to obtain a mathematical counterpart to the notion of droplet creation at the correct rate; these two topics will be covered in Sect. 3.3. In both Sects. 3.2 and 3.3, we will need to use some lemmas which can be seen as rigorous counterparts to the notion that the function φ(b) = wb − m∗ b2 gives the free-energy of optimally-shaped droplets, and that equilibrium distributions can be studied based on this heuristics. To avoid distracting the reader with the technicalities behind these lemmas, they are only presented later in the paper, in Sect. 3.4. Some of the results in that section were already contained in the paper [SS1], but here we will need substantial strengthenings of them; moreover the techniques used here will be different from those in [SS1] and so provide an alternative to parts of that paper. Finally some estimates on the spectral gap of the generator of the dynamics of some kinetic Ising models on some finite sets will also be needed in Sections 3.2 and 3.3. Again, those will be postponed to the final section, 3.5, in this part of the paper, in order to avoid distracting the reader’s attention from the main ideas in Sects. 3.2 and 3.3. The remainder of the paper is written having in mind a reader who will be following it in the order in which the sections are presented. With this in mind we tried to motivate and explain heuristically in Sections 3.2 and 3.3 the results which are used there but will only be proven later, in Sects. 3.4 and 3.5. Readers who prefer following the lemmas and propositions in a strictly logical order, and who do not worry about the motivation behind each lemma can read the sections in the following order: 3.4, 3.5, 3.2, 3.3. Considering this possibility and also the length of the paper, we repeat some definitions from Sects. 3.2 and 3.3 in Sects. 3.4 and 3.5. 3.2. 
Inverted pyramids and droplet growth. In this section we will introduce two propositions which are counterparts to the statement: “If we start with a large enough Wulff-shaped droplet of the (+)-phase in the midst of the (−)-phase, then it is likely to grow with a linear speed larger than any negative exponential of 1/h”. In the first of these propositions “large enough” will be substantially larger than just “supercritical” (it will mean that the droplet has a negative free-energy); in the second of these propositions this aspect will be improved, at the cost of extra technicalities and extra work in the proof. We will need to generalize the notion of a process $(\sigma^{\cdot}_{\Lambda,\eta,h;t})_{t \ge 0}$ evolving in a box $\Lambda$, with boundary condition $\eta$. We will have to consider the time evolution to occur in boxes which may change with time. Since we will construct all of our generalized processes using the graphical construction, their definition is very elementary. The space-time regions with which we will be concerned will all be of the following type. Let $t_0 < t_1 < t_2 < \ldots < t_{N+1}$ be an increasing finite sequence of times, and $\Lambda_0, \Lambda_1, \ldots, \Lambda_N$ be finite subsets of $\mathbb{Z}^2$. Our space-time region is
\[ ♦ = \bigcup_{i=0}^{N} [t_i, t_{i+1}] \times \Lambda_i. \]

We will refer to $[t_0, t_1] \times \Lambda_0$ as the bottom cylinder of ♦ and to $[t_N, t_{N+1}] \times \Lambda_N$ as the top cylinder of ♦. The base of ♦ is $t_0 \times \Lambda_0$, while the top of ♦ is $t_{N+1} \times \Lambda_N$. While the boundary condition could be very general, in this paper we will only need to consider (−)-boundary conditions in space-time. We will need to start the evolution from an arbitrary time $s \in [t_0, t_{N+1})$, from a space configuration $\eta \in \Omega_{\Lambda_j,-}$, where $j$ is defined by $s \in [t_j, t_{j+1})$. The following notation will be used to denote the process:
\[ (\sigma^{s,\eta}_{♦,-,h;t})_{t \ge s}. \]
Its definition is as follows. We freeze all spins outside of ♦ as −1 and at time $s$ set the configuration to $\eta$. We use then the graphical construction with its standard rules to update spins inside ♦, after time $s$. The basic-coupling inequalities generalize in obvious ways. For instance, if $\eta \le \zeta$, and $-h(T) < h_1 \le h_2 < h(T)$, then for all $t_0 \le s \le t$,
\[ \sigma^{s,\eta}_{♦,-,h_1;t} \le \sigma^{s,\zeta}_{♦,-,h_2;t}, \]
and, in case $t_0 = 0$,
\[ \sigma^{0,\eta}_{♦,-,h_1;t} \le \sigma^{\zeta}_{h_2;t}. \]
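As a purely illustrative aid, the space-time regions defined above can be represented by a small data structure. The sketch below stores the times $t_0 < \ldots < t_{N+1}$ and the boxes $\Lambda_0, \ldots, \Lambda_N$, tests whether a space-time point lies in the union of the closed cylinders, and finds the index $j$ with $s \in [t_j, t_{j+1})$ that determines the box in which an evolution started at time $s$ begins. The class and method names are assumptions introduced here; they do not come from the paper, and the dynamics itself (the graphical construction) is not modelled.

```python
from bisect import bisect_right
from typing import FrozenSet, List, Tuple

Site = Tuple[int, int]


class SpaceTimeRegion:
    """Union over i of [t_i, t_{i+1}] x Lambda_i, as in the definition above."""

    def __init__(self, times: List[float], boxes: List[FrozenSet[Site]]):
        if len(times) != len(boxes) + 1:
            raise ValueError("need N+2 times for N+1 boxes")
        if any(a >= b for a, b in zip(times, times[1:])):
            raise ValueError("times must be strictly increasing")
        self.times = list(times)
        self.boxes = list(boxes)

    def active_index(self, s: float) -> int:
        """Index j such that s lies in [t_j, t_{j+1})."""
        if not (self.times[0] <= s < self.times[-1]):
            raise ValueError("start time must lie in [t_0, t_{N+1})")
        return bisect_right(self.times, s) - 1

    def contains(self, t: float, site: Site) -> bool:
        """True if (t, site) belongs to at least one closed cylinder."""
        return any(
            self.times[i] <= t <= self.times[i + 1] and site in box
            for i, box in enumerate(self.boxes)
        )

    def base(self) -> Tuple[float, FrozenSet[Site]]:
        return self.times[0], self.boxes[0]

    def top(self) -> Tuple[float, FrozenSet[Site]]:
        return self.times[-1], self.boxes[-1]


# Two cylinders: the box grows by one site at time 1.
region = SpaceTimeRegion(
    times=[0.0, 1.0, 2.0],
    boxes=[frozenset({(0, 0)}), frozenset({(0, 0), (1, 0)})],
)
```

Because the cylinders are closed in time, a point at an intermediate time $t_i$ belongs to the region as soon as its site lies in either of the two adjacent boxes.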

We explain next the basic type (up to space-time translations) of space-time regions ♦ which we will deal with. For reasons which will become clear, we will refer to such regions as “inverted pyramids”. For $i = 0, \ldots, N - 1$, the set $\Lambda_{i+1}$ will be obtained from $\Lambda_i$ by adding one site to it. In particular we will have $\Lambda_0 \subset \Lambda_1 \subset \ldots \subset \Lambda_N$. These sets $\Lambda_i$ will all be simply-connected subsets of $\mathbb{Z}^2$ and have “approximate Wulff shapes”, in the following technical sense. Given a positive number $l_0$, we say that a subset $\Lambda$ of $\mathbb{Z}^2$ is $l_0$-quasi-Wulff-shaped in case it is connected, simply connected and
\[ \Lambda((l - l_0) W) \subset \Lambda \subset \Lambda((l + l_0) W), \]
for some number $l$. An $l$ with this property will be said to be a linear-size-parameter for the quasi-Wulff-shaped set $\Lambda$ (of course, more than one such value of $l$ can exist). In our case there will be an $l_0$ for which each $\Lambda_i$, $i = 0, \ldots, N$, will be $l_0$-quasi-Wulff-shaped. The absolute constant $l_0$ will be fixed throughout our work, and will be omitted from the notation. We will identify our inverted pyramids by four non-negative parameters: $b_1 < b_2$, $h$ and $\delta$. The first three of these parameters are related to the space dimensions of the inverted pyramid, by setting $\Lambda_0 = \Lambda(\frac{b_1}{h} W)$ and $\Lambda_N = \Lambda(\frac{b_2}{h} W)$. The parameters $\delta$ and again $h$ are related to the time dimensions, by setting $t_{i+1} - t_i = \exp(\delta/h)$, for $i = 0, \ldots, N$. It is not hard to see that if $l_0 > 2$ then for all $b_1$, $b_2$ and $h$ as above there is a sequence, $\mathrm{Seq}(b_1, b_2; h) = (\Lambda_0, \ldots, \Lambda_N)$, of boxes with the required properties, and with
\[ N = N(b_1, b_2; h) \le C \left( \frac{b_2}{h} \right)^2 \tag{3.2} \]
for some finite constant $C$. (Of course, more than one such sequence typically exists, and we suppose that some rule to choose one from among them is being used.) Given also $\delta$, the notation $O = O(b_1, b_2; h; \delta)$ will be used for the inverted pyramid which we described above and which also has $t_0 = 0$. The technical counterpart to the idea of the growth of a droplet will be contained in an event that we define next. With $b_1$, $b_2$, $h$ and $\delta$ fixed (and omitted from the notation in several places, for simplicity), and also an initial configuration $\eta \in \Omega_{\Lambda(\frac{b_1}{h} W),-}$ given, we let $G_\eta$ be the following event:
\[ G_\eta = \left\{ \sigma^{0,\eta}_{O,-,h;t_{N+1}} = \sigma^{t_N,+}_{O,-,h;t_{N+1}} \right\}. \]
Observe that it is clear from the basic-coupling inequalities that, regardless of what $\eta$ is, $\sigma^{0,\eta}_{O,-,h;t_{N+1}} \le \sigma^{t_N,+}_{O,-,h;t_{N+1}}$, so that $G_\eta$ only really requires that the complementary inequality holds. Informally, $G_+$ can be seen as the event of a droplet of +1 spins of linear size proportional to $b_1/h$ at time 0 growing to become a droplet of linear size proportional to $b_2/h$ at time $t_N = N \exp(\delta/h) \le C (b_2/h)^2 \exp(\delta/h)$. Recall from Sect. 1.5 that heuristically the free-energy of a Wulff-shaped droplet $(b/h)W$ of the (+)-phase is given by $\phi(b)/h$, where $\phi(b) = wb - m^* b^2$, and that $B_c$ is the value of $b$ which maximizes this function, while $B_0 = 2B_c$ is the value of $b$ above which this function becomes negative. In Sect. 1.5 this was used to predict heuristically the behavior of the relaxation time as $h \searrow 0$. Similarly, the free-energy of such droplets can also be used to predict the typical aspect of a Gibbs distribution $\mu_{\Lambda(\frac{b}{h} W),-,h}(\,\cdot\,)$, when $h$ is small, and this will be of great relevance in this paper. When $b < B_0$, one should expect this Gibbs distribution to resemble the (−)-phase, since droplets of the (+)-phase would all have positive free-energies. On the other hand, when $b > B_0$, one should expect this Gibbs distribution to resemble the (+)-phase, separated from the (−) spins at the boundary by a large contour, since a single droplet of the (+)-phase of the size of the system itself would have the lowest possible free-energy. Rigorous results of this type were obtained in [SS1]. Unfortunately, for our purposes in the current paper we will need technically stronger results than those in [SS1]. As mentioned in Sect. 3.1, these technical results will only be presented and proven in Sect. 3.4.
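The heuristic free-energy picture recalled above reduces to elementary calculus on $\phi(b) = wb - m^* b^2$, and a few lines of code make the two thresholds explicit. In the sketch below the numerical values of $w$ and $m^*$ are hypothetical placeholders chosen only so that the arithmetic is easy to follow; the function names are likewise illustrative.

```python
from typing import Tuple


def phi(b: float, w: float, m_star: float) -> float:
    """Heuristic droplet free-energy profile phi(b) = w*b - m_star*b**2."""
    return w * b - m_star * b * b


def critical_sizes(w: float, m_star: float) -> Tuple[float, float]:
    """Return (B_c, B_0): the maximizer of phi and the point where phi turns negative."""
    b_c = w / (2.0 * m_star)  # solves phi'(b) = w - 2*m_star*b = 0
    b_0 = 2.0 * b_c           # nonzero root of phi(b) = 0, i.e. w / m_star
    return b_c, b_0


# Hypothetical values of the surface constant w and the magnetization m*.
w, m_star = 4.0, 0.5
b_c, b_0 = critical_sizes(w, m_star)
barrier = phi(b_c, w, m_star)  # height of the barrier, w**2 / (4*m_star)
```

For $b$ between $B_c$ and $B_0$ the droplet is supercritical but still has positive free-energy, which is the regime treated by the second proposition of this section; for $b > B_0$ the free-energy is negative, the regime of the first.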
If the reader accepts the picture which we just presented and justified heuristically as reasonable, then, he or she should have no difficulty believing in the specific statements which appear in this and in the next section and are proven only in Sect. 3.4. Next we state the first of the two main claims of this section. Proposition 3.2.1. Given B0 < b1 < b2 and δ > 0 there are positive finite constants C1 and C2 such that for h > 0, Z dµ3( b1 W ),−,h (η)P(Gη ) ≥ 1 − C1 exp(−C2 /h). (3.3) h

In particular P(G+ ) ≥ 1 − C1 exp(−C2 /h).

(3.4)

Moreover the choice of C2 does not depend on δ and b2 and it can be taken arbitrarily large, provided b1 is large enough. Observe that from the heuristic picture described before the last proposition, we can see that in (3.3) we are starting from a droplet of the (+)-phase, and the statement is that it is likely to grow at a speed which is controlled in a useful way. In comparison, for Bc < b1 < B0 , (3.3) should be false, since then in the Gibbs distribution µ3( b1 W ),−,h ( · ) h no droplet of the (+)-phase should be present. On the other hand, we should expect (3.4) to be true also in this case, since there we are starting from a droplet not just of the (+)-phase, but actually a solid droplet of (+) spins, with a supercritical size. This claim is contained in the next proposition. In this proposition we will use the following object. Recall the definition (2.10) of R and set µ b3,−,h ( · ) = µ3,−,h ( · |Rc ).



This is the Gibbs measure conditioned on the presence of a supercritical droplet. The following should be expected to happen, based on the heuristics. When $\Lambda = \Lambda(\frac{b}{h}W)$ and $b > B_0$ the conditioning has no major effect; but if $B_c < b < B_0$, then the conditioning produces a droplet of the (+)-phase, of roughly the size of the whole system, separated from the (−) spins at the boundary by a large contour. This is so since a single droplet of the (+)-phase of the size of the system itself would have the lowest possible free-energy compatible with the conditioning. It is important to note that in the next proposition, in addition to having to modify (3.3) by introducing the conditioning on $R^c$, also the way in which $\delta$ can be chosen is different. The reason for this difference will be clarified in the proof of the proposition.

Proposition 3.2.2. Given $B_c < b_1 < b_2$ there are positive finite constants $\delta_0$, $C_1$ and $C_2$ such that if $0 < \delta < \delta_0$, for $h > 0$,
$$\int d\widehat{\mu}_{\Lambda(\frac{b_1}{h}W),-,h}(\eta)\, P(G_\eta) \ge 1 - C_1 \exp(-C_2/h). \tag{3.5}$$
In particular
$$P(G_+) \ge 1 - C_1 \exp(-C_2/h). \tag{3.6}$$

In the remainder of this section we will prove Propositions 3.2.1 and 3.2.2. Besides leaving some technical lemmas which concern equilibrium distributions to Sect. 3.4, also some technical results which concern the kinetic Ising models run in certain finite boxes, including those of the type of $\Lambda(\frac{b}{h}W)$, will have their proofs postponed to Sect. 3.5. These results, when used in the current section, will be heuristically motivated, though.

Proof of Proposition 3.2.1 (modulo results in Sects. 3.4 and 3.5). The second claim, (3.4), follows from the first one, (3.3), and the basic-coupling inequalities. Our task is to prove (3.3) and the claims about the value of the constant $C_2$. There is no loss in supposing that $h$ is small, and we will assume that $0 < h \le 1$. For $i = 1, \dots, N$, and arbitrary $\zeta \in \Omega_{\Lambda_{i-1},-}$ set
$$G^i_\zeta = \left\{ \sigma^{t_{i-1},\zeta}_{O,-,h;t_{i+1}} = \sigma^{t_i,+}_{O,-,h;t_{i+1}} \right\}.$$
Note that
$$G_\eta \supset G^1_\eta \cap \left( \bigcap_{i=2}^{N} G^i_+ \right). \tag{3.7}$$

Our goal now is to prove that for some positive finite $C_1$ and $C_2$, as in the statement of the proposition, for $i = 1, \dots, N$,
$$\int d\mu_{\Lambda_{i-1},-,h}(\zeta)\, P\big((G^i_\zeta)^c\big) \le C_1 \exp(-C_2/h). \tag{3.8}$$
In particular this implies, as in the first paragraph in this proof, that
$$P\big((G^i_+)^c\big) \le C_1 \exp(-C_2/h). \tag{3.9}$$

By putting together (3.8) (used for $i = 1$), (3.9) (used for $i = 2, \dots, N$), (3.7) and (3.2) we obtain (3.3). The sets $\Lambda_{i-1}$ and $\Lambda_i$ differ only in that the latter has one extra site, say $x$, that the former does not have. For an arbitrary $\zeta \in \Omega_{\Lambda_{i-1},-}$, we will need to compare $\mu_{\Lambda_{i-1},-,h}(\zeta)$ with $\mu_{\Lambda_i,-,h}(\zeta)$. For this purpose we introduce the notation $S_x^- = \{\sigma : \sigma(x) = -1\}$ for the event that the spin at the site $x$ is negative, and let



$$\alpha = \inf_{|h| \le 1}\ \inf_{\xi \in \Omega} \mu_{\{0\},\xi,h}(S_0^-) \tag{3.10}$$
be the largest lower bound on the probability of having in equilibrium a spin $-1$ at the origin, given any information about the other spins, for values of $h$ in the arbitrarily chosen neighborhood $[-1, +1]$ of the origin. Clearly $\alpha > 0$ for each temperature $T > 0$. With this notation,
$$\mu_{\Lambda_i,-,h}(\zeta) = \mu_{\Lambda_i,-,h}(S_x^-)\, \mu_{\Lambda_i,-,h}(\zeta \mid S_x^-) \ge \alpha\, \mu_{\Lambda_{i-1},-,h}(\zeta),$$
which can be seen as a uniform estimate on a Radon-Nikodym derivative:
$$\sup_{\zeta \in \Omega_{\Lambda_{i-1},-}} \frac{d\mu_{\Lambda_{i-1},-,h}}{d\mu_{\Lambda_i,-,h}}(\zeta) \le \frac{1}{\alpha}. \tag{3.11}$$

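Using the stationarity of the processes $(\sigma^{\mu_{\Lambda,-,h}}_{\Lambda,-,h;t})_{t \ge 0}$, and (3.11) we obtain
$$\begin{aligned}
\int d\mu_{\Lambda_{i-1},-,h}(\zeta)\, P\big((G^i_\zeta)^c\big)
&= \int d\mu_{\Lambda_{i-1},-,h}(\zeta)\, P\Big(\sigma^{t_{i-1},\zeta}_{O,-,h;t_{i+1}} \ne \sigma^{t_i,+}_{O,-,h;t_{i+1}}\Big) \\
&= \int d\mu_{\Lambda_{i-1},-,h}(\zeta)\, P\Big(\sigma^{t_i,\zeta}_{O,-,h;t_{i+1}} \ne \sigma^{t_i,+}_{O,-,h;t_{i+1}}\Big) \\
&\le \frac{1}{\alpha} \int d\mu_{\Lambda_i,-,h}(\zeta)\, P\Big(\sigma^{t_i,\zeta}_{O,-,h;t_{i+1}} \ne \sigma^{t_i,+}_{O,-,h;t_{i+1}}\Big) \\
&= \frac{1}{\alpha} \int d\mu_{\Lambda_i,-,h}(\zeta)\, P\Big(\sigma^{\zeta}_{\Lambda_i,-,h;\exp(\delta/h)} \ne \sigma^{+}_{\Lambda_i,-,h;\exp(\delta/h)}\Big).
\end{aligned} \tag{3.12}$$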

This may have been seen at first sight as a minor and trivial maneuver, but it is actually a central step in our approach towards controlling droplet growth. We have just transformed our problem pertaining to “growth” into a problem pertaining to “rapid loss of memory” or, in other words, “rapid convergence to equilibrium”, since (3.12) will provide us with the aimed (3.8), once we show that
$$\int d\mu_{\Lambda_i,-,h}(\zeta)\, P\Big(\sigma^{\zeta}_{\Lambda_i,-,h;\exp(\delta/h)} \ne \sigma^{+}_{\Lambda_i,-,h;\exp(\delta/h)}\Big) \le C_1 \exp(-C_2/h). \tag{3.13}$$

Proving (3.13) seems like a standard problem, due to the vast current literature on this type of issue: we have a reversible Markov process $(\sigma^{+}_{\Lambda_i,-,h;t})_{t \ge 0}$, and we want to show that it reaches equilibrium in a time of the order of $\exp(\delta/h)$. There are nevertheless still major hurdles to overcome. The standard approach to such a problem starts with the derivation of a lower bound on the spectral gap of the generator of the process. The result that we are after would then follow if the time with which we are concerned were much larger than the inverse of the lower bound on the spectral gap. Such an approach is nevertheless unfeasible in our case, due to the fact that here the spectral gap is of the order of $\exp(-\beta A/h)$, so that its inverse is much larger than $\exp(\delta/h)$, when $\delta$ is small. We will not give the full proof of this claim on the value of the spectral gap, since it will be of no use for us (some readers may want to take it as an exercise, with the hint that it can be solved using techniques in this paper), and will limit ourselves to explaining the nature of the difficulty at the heuristic level. This difficulty lies precisely in the sort of metastability studied in this paper. We are considering a Glauber dynamics in the box $\Lambda_i$,



which is almost the same as a set $\Lambda(\frac{b}{h}W)$ with some $b \in [b_1, b_2]$. Since $B_0 < b_1 \le b$, in equilibrium we should have a large droplet of the (+)-phase, covering the box $\Lambda_i$ almost entirely. If we look at the process started from all spins down, $(\sigma^{-}_{\Lambda_i,-,h;t})_{t \ge 0}$, then to reach equilibrium this big droplet has to be formed, and the system has to go through the bottleneck presented by the situation with a critical droplet. Hence the free-energy barrier to be overcome has height $A/h$. The system should then reach equilibrium in a time of order $\exp(\beta A/h)$. (Because the linear size of the box is of the order of $1/h$, droplet growth is of no relevance for the estimate of the order of magnitude of the relaxation time inside the box.) Starting with all spins down should maximize the relaxation time, since equilibrium is basically the (+)-phase, and the inverse of this relaxation time should hence give the order of magnitude of the spectral gap.

The same heuristics above which pointed out the problem with using the spectral gap for our purpose of proving (3.13) indicates also why we should believe that this inequality holds nevertheless. The difficulty pointed out concerns the long time needed to relax towards equilibrium if the process is started with all spins down. We are, on the other hand, concerned with the case in which we start with all spins up, much closer to equilibrium. One can talk of a heuristic picture with a double well structure. The configuration with all spins down is in the higher (metastable) well, separated from the other one by the free-energy barrier, but the configuration with all spins up is inside the deeper (stable) well, and our problem concerns only relaxation inside this well. The problem still remains at this point how to exploit this heuristic picture and prove (3.13).
The solution will be to use the basic-coupling inequalities in order to compare our process with some modified ones, for which the spectral gaps can be proven to be large enough for our purposes. In doing so we were inspired by arguments in Sect. 5 of [Mar]. One of the two comparison processes that we will use is $(\sigma^{\,\cdot}_{\Lambda_i,+,h;t})_{t \ge 0}$, in which (+)-boundary conditions are used. The other one will be denoted by $(\sigma^{\,\cdot}_{\Lambda_i \setminus \Lambda^{core},(+,-),h;t})_{t \ge 0}$, and has the following meaning. The box in which it is run is the annulus $\Lambda_i \setminus \Lambda^{core}$, where $\Lambda^{core} = \Lambda\big(\frac{B_c + B_0}{2h} W\big)$, and the boundary condition denoted by $(+,-)$ refers to freezing the spins up inside the core $\Lambda^{core}$ and down outside $\Lambda_i$. In Sect. 3.5, Propositions 3.5.1 and 3.5.2, we will show that for any $\delta > 0$ the generators of these two processes satisfy
$$\inf_{i=0,\dots,N} \mathrm{gap}(\Lambda_i, +, h) \ge \exp\left(-\frac{\delta}{2h}\right), \tag{3.14}$$
$$\inf_{i=0,\dots,N} \mathrm{gap}\big(\Lambda_i \setminus \Lambda^{core}, (+,-), h\big) \ge \exp\left(-\frac{\delta}{2h}\right), \tag{3.15}$$

for small enough $h$. The intuitive reason behind these relatively large spectral gaps is that the extra (+)'s introduced as boundary conditions eliminate metastability in the time evolution of these two processes. In equilibrium these systems are again basically in the (+)-phase, and if we start them with all spins down, then there is no need to nucleate a critical droplet in order to relax to equilibrium. In the first case the (+)-phase drifts inwards from the (+)-boundary towards the center. In the second case a supercritical (+)-droplet is frozen by hand in the center of the box, so that the relaxation is a “down-hill” movement on the free-energy landscape; the (+)-phase should drift outwards from the center towards the outer boundary of the box. Our task now is to show how (3.14) and (3.15) can be used to derive (3.13). We partition $\Lambda_i$ into two sets:
$$\Lambda^{in} = \Lambda\left(\frac{b_1 + B_0}{2h} W\right), \qquad \Lambda^{out}_i = \Lambda_i \setminus \Lambda^{in}.$$

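Using the basic-coupling inequalities, we have
$$\begin{aligned}
&\int d\mu_{\Lambda_i,-,h}(\zeta)\, P\Big(\sigma^{\zeta}_{\Lambda_i,-,h;\exp(\delta/h)} \ne \sigma^{+}_{\Lambda_i,-,h;\exp(\delta/h)}\Big) \\
&\quad\le \int d\mu_{\Lambda_i,-,h}(\zeta) \sum_{y \in \Lambda_i} \Big\{ P\big(\sigma^{+}_{\Lambda_i,-,h;\exp(\delta/h)}(y) = +1\big) - P\big(\sigma^{\zeta}_{\Lambda_i,-,h;\exp(\delta/h)}(y) = +1\big) \Big\} \\
&\quad= \sum_{y \in \Lambda_i} \Big\{ P\big(\sigma^{+}_{\Lambda_i,-,h;\exp(\delta/h)}(y) = +1\big) - \mu_{\Lambda_i,-,h}\big(\{\sigma : \sigma(y) = +1\}\big) \Big\} \\
&\quad\le \sum_{y \in \Lambda^{in}} \Big\{ P\big(\sigma^{+}_{\Lambda_i,+,h;\exp(\delta/h)}(y) = +1\big) - \mu_{\Lambda_i,-,h}\big(\{\sigma : \sigma(y) = +1\}\big) \Big\} \\
&\qquad+ \sum_{y \in \Lambda^{out}_i} \Big\{ P\big(\sigma^{+}_{\Lambda_i \setminus \Lambda^{core},(+,-),h;\exp(\delta/h)}(y) = +1\big) - \mu_{\Lambda_i,-,h}\big(\{\sigma : \sigma(y) = +1\}\big) \Big\}.
\end{aligned}$$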

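At this point we can use (3.14) and (3.15) in a standard way. For instance, from inequality (59) in [Sch1],
$$P\big(\sigma^{+}_{\Lambda_i,+,h;\exp(\delta/h)}(y) = +1\big) - \mu_{\Lambda_i,+,h}\big(\{\sigma : \sigma(y) = +1\}\big) \le \frac{e^{-\exp(\delta/h)\,\mathrm{gap}(\Lambda_i,+,h)}}{\mu_{\Lambda_i,+,h}(+)} \le \exp\left(C \left(\frac{b_2}{h}\right)^2\right) \exp\{-e^{\frac{\delta}{2h}}\}$$
and
$$P\big(\sigma^{+}_{\Lambda_i \setminus \Lambda^{core},(+,-),h;\exp(\delta/h)}(y) = +1\big) - \mu_{\Lambda_i \setminus \Lambda^{core},(+,-),h}\big(\{\sigma : \sigma(y) = +1\}\big) \le \frac{e^{-\exp(\delta/h)\,\mathrm{gap}(\Lambda_i \setminus \Lambda^{core},(+,-),h)}}{\mu_{\Lambda_i \setminus \Lambda^{core},(+,-),h}(+)} \le \exp\left(C \left(\frac{b_2}{h}\right)^2\right) \exp\{-e^{\frac{\delta}{2h}}\}.$$
Combining the last three displayed inequalities, we obtain
$$\begin{aligned}
&\int d\mu_{\Lambda_i,-,h}(\zeta)\, P\Big(\sigma^{\zeta}_{\Lambda_i,-,h;\exp(\delta/h)} \ne \sigma^{+}_{\Lambda_i,-,h;\exp(\delta/h)}\Big) \\
&\quad\le \sum_{y \in \Lambda^{in}} \Big\{ \mu_{\Lambda_i,+,h}\big(\{\sigma : \sigma(y) = +1\}\big) - \mu_{\Lambda_i,-,h}\big(\{\sigma : \sigma(y) = +1\}\big) \Big\} \\
&\qquad+ \sum_{y \in \Lambda^{out}_i} \Big\{ \mu_{\Lambda_i \setminus \Lambda^{core},(+,-),h}\big(\{\sigma : \sigma(y) = +1\}\big) - \mu_{\Lambda_i,-,h}\big(\{\sigma : \sigma(y) = +1\}\big) \Big\} \\
&\qquad+ C' \left(\frac{b_2}{h}\right)^2 \exp\left(C \left(\frac{b_2}{h}\right)^2\right) \exp\{-e^{\frac{\delta}{2h}}\}.
\end{aligned} \tag{3.16}$$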
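To see why the last term in (3.16) is negligible for small $h$ even though it carries the large prefactor $\exp(C(b_2/h)^2)$, one can compare exponents numerically. The sketch below works with logarithms to avoid overflow; the constants `C`, `C_prime` and the sample values of `b2` and `delta` are hypothetical placeholders, since the paper does not specify them.

```python
import math


def log_error_term(h, b2, delta, C=1.0, C_prime=1.0):
    """Natural log of C' (b2/h)^2 * exp(C (b2/h)^2) * exp(-exp(delta/(2h)))."""
    ratio_sq = (b2 / h) ** 2
    # The polynomial-in-1/h exponent C*(b2/h)^2 competes with the
    # doubly exponential decay exp(delta/(2h)).
    return math.log(C_prime * ratio_sq) + C * ratio_sq - math.exp(delta / (2.0 * h))


# Hypothetical sample values: the double exponential wins only once h is small.
for h in (0.1, 0.01):
    print(h, log_error_term(h, b2=3.0, delta=0.5))
```

The term $\exp\{-e^{\delta/(2h)}\}$ decays doubly exponentially in $1/h$, so it eventually dominates any factor of the form $\exp(C/h^2)$; for moderate $h$ the bound is not yet useful, which is consistent with the statements being made for small enough $h$ only.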

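Our remaining problem concerns only equilibrium. We will use a result from Sect. 3.4, but observe that this result is intuitively natural, based on the heuristics presented immediately before the statement of Proposition 3.2.1. Set
$$B_1 = \left\{ \Lambda\left(\frac{b_1}{h} W\right)^{c} \overset{-,*}{\longleftrightarrow} \Lambda^{in} \right\}, \tag{3.17}$$
$$B_2 = \left\{ \Lambda^{core} \overset{-,*}{\longleftrightarrow} \big(\Lambda^{in}\big)^{c} \right\}. \tag{3.18}$$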

From Lemma 3.4.8 we know that for every $b_1 > B_0$ there exists a $C_2 = C_2(b_1) > 0$, $C_2 \to \infty$ as $b_1 \to \infty$, and a finite $C_1$ such that for $j = 1, 2$,
$$\sup_{i=0,\dots,N} \mu_{\Lambda_i,-,h}(B_j) \le C_1 \exp(-C_2/h). \tag{3.19}$$
From (1.20), for each $y \in \Lambda^{in}$,
$$\mu_{\Lambda_i,+,h}\big(\{\sigma : \sigma(y) = +1\}\big) - \mu_{\Lambda_i,-,h}\big(\{\sigma : \sigma(y) = +1\}\big) \le 2\mu_{\Lambda_i,-,h}(B_1). \tag{3.20}$$
Similarly, for each $y \in \Lambda^{out}_i$,

$$\mu_{\Lambda_i \setminus \Lambda^{core},(+,-),h}\big(\{\sigma : \sigma(y) = +1\}\big) - \mu_{\Lambda_i,-,h}\big(\{\sigma : \sigma(y) = +1\}\big) \le 2\mu_{\Lambda_i,-,h}(B_2). \tag{3.21}$$
The desired inequality (3.3) and the claims about the choice of the constant $C_2$ which appears there follow from combining (3.16), (3.20), (3.21) and (3.19).

Proof of Proposition 3.2.2 (modulo results in Sects. 3.4 and 3.5). We will explain how the proof of Proposition 3.2.1 can be adapted to prove this proposition. There are several extra complications, since $\widehat{\mu}_{\Lambda_i,-,h}$ is not an invariant distribution for the process $(\sigma^{\,\cdot}_{\Lambda_i,-,h;t})_{t \ge 0}$, and in particular this is the reason for which we will have to choose $\delta_0$ small enough. The idea is to look at this distribution instead as a “metastable state” for this process, and to use techniques from Part 2 of the present paper in this connection.

Using the graphical construction we can define the following processes restricted to $R^c$. For arbitrary $s \in [t_0, t_{N+1})$, and $\eta \in \Omega_{\Lambda_j,-} \cap R^c$, where $j$ is defined by $s \in [t_j, t_{j+1})$, the process $(\widehat{\sigma}^{s,\eta}_{O,-,h;t})_{t \ge s}$ is obtained in the following simple way. We freeze all spins outside $O$ as $-1$ and at time $s$ set the configuration to $\eta$. We then use the graphical construction, with its standard rules modified by suppressing jumps which would bring the system to $R$, to update spins inside $O$ after time $s$. Bottlenecks for this dynamics are the sets $F_x^+ = \{\sigma : \sigma^x \in F_x^-\}$. Define also

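$$F_x = F_x^- \cup F_x^+.$$
For each $x$ the event $F_x$ depends only on the spins at sites other than $x$. If $|h| \le 1$, we have from the definition (3.10) of $\alpha$, that for $i = 0, \dots, N$ and $x \in \Lambda_i$,
$$\mu_{\Lambda_i,-,h}(F_x^-) \ge \alpha\, \mu_{\Lambda_i,-,h}(F_x). \tag{3.22}$$
Using this inequality in combination with the bottleneck estimate (2.41) we have that for arbitrary $\varepsilon > 0$,
$$\begin{aligned}
\widehat{\mu}_{\Lambda_i,-,h}(F_x^+) &= \frac{\mu_{\Lambda_i,-,h}(F_x^+)}{\mu_{\Lambda_i,-,h}(R^c)} \le \frac{\mu_{\Lambda_i,-,h}(F_x)}{\mu_{\Lambda_i,-,h}(R^c)} \le \frac{1}{\alpha}\, \frac{\mu_{\Lambda_i,-,h}(F_x^-)}{\mu_{\Lambda_i,-,h}(R^c)} \\
&= \frac{1}{\alpha}\, \frac{\widetilde{\mu}_{\Lambda_i,-,h}(F_x^-)\, \mu_{\Lambda_i,-,h}(R)}{\mu_{\Lambda_i,-,h}(R^c)} \le \frac{1}{\alpha}\, \frac{C \exp\left(-\beta(1-\varepsilon)\frac{A}{h}\right)}{\mu_{\Lambda_i,-,h}(R^c)},
\end{aligned}$$
for small $h$, independent of $i$. Since $\Lambda(\frac{b_1}{h}W) \subset \Lambda_i$ and $b_1 > B_c$, Lemma 3.4.4 gives us for some $\delta_0 > 0$,
$$\mu_{\Lambda_i,-,h}(R^c) \ge \exp\left(-\beta \frac{A}{h} + \frac{2\delta_0}{h}\right),$$
for small $h$. So, if our choice of $\varepsilon$ above is made properly, we obtain
$$\widehat{\varphi} = \sup_{i=0,\dots,N}\ \sup_{x \in \Lambda_i} \widehat{\mu}_{\Lambda_i,-,h}(F_x^+) \le C_1 \exp\left(-\frac{\delta_0}{h}\right), \tag{3.23}$$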

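for some finite $C_1$. Define now
$$\tau_i^\zeta = \inf\{t \ge t_i : \text{the process } (\widehat{\sigma}^{t_i,\zeta}_{O,-,h;t}) \text{ has a suppressed jump at time } t\}.$$
Clearly
$$\widehat{\sigma}^{t_i,\zeta}_{O,-,h;t} = \sigma^{t_i,\zeta}_{O,-,h;t} \quad \text{for} \quad t_i \le t < \tau_i^\zeta. \tag{3.24}$$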

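From the same argument used to prove (2.12), we obtain now, using (3.23), for $i = 0, \dots, N$, and $\delta < \delta_0$,
$$\int d\widehat{\mu}_{\Lambda_i,-,h}(\zeta)\, P(\tau_i^\zeta \le t_{i+1}) \le C_2 \exp\big(-C_3 e^{\delta/h}\big) + C_4 \left|\Lambda\left(\frac{b_2}{h} W\right)\right| e^{\delta/h}\, \widehat{\varphi} \le C_5 \exp(-C_6/h), \tag{3.25}$$
for small $h$. (We remind the reader that $t_{i+1} = t_i + e^{\delta/h}$.)

Before we can proceed with the adaptation of the proof of Proposition 3.2.1 to prove Proposition 3.2.2, we need to derive an analogue to (3.11). We will show that for small enough $h$, for $i = 1, \dots, N$,
$$\sup_{\zeta \in \Omega_{\Lambda_{i-1},-}} \frac{d\widehat{\mu}_{\Lambda_{i-1},-,h}}{d\widehat{\mu}_{\Lambda_i,-,h}}(\zeta) \le \frac{2}{\alpha}. \tag{3.26}$$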

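To this end, as in the argumentation for (3.11), we will use the notation $x = \Lambda_i \setminus \Lambda_{i-1}$, and $S_x^- = \{\sigma : \sigma(x) = -1\}$. First note now that by partitioning $(F_x)^c \cap R^c$ according to what the configuration in $\Lambda_{i-1}$ is and denoting by $\{E_j\}$ the resulting parts, we have
$$\begin{aligned}
\widehat{\mu}_{\Lambda_i,-,h}\big(S_x^- \mid (F_x)^c\big) &= \sum_j \widehat{\mu}_{\Lambda_i,-,h}\big(E_j \mid (F_x)^c\big)\, \widehat{\mu}_{\Lambda_i,-,h}(S_x^- \mid E_j) \\
&= \sum_j \widehat{\mu}_{\Lambda_i,-,h}\big(E_j \mid (F_x)^c\big)\, \mu_{\Lambda_i,-,h}(S_x^- \mid E_j) \\
&\ge \sum_j \widehat{\mu}_{\Lambda_i,-,h}\big(E_j \mid (F_x)^c\big)\, \alpha = \alpha.
\end{aligned}$$
Therefore, using (3.23),
$$\widehat{\mu}_{\Lambda_i,-,h}(S_x^-) \ge \widehat{\mu}_{\Lambda_i,-,h}\big((F_x)^c\big)\, \widehat{\mu}_{\Lambda_i,-,h}\big(S_x^- \mid (F_x)^c\big) \ge \big(1 - \widehat{\mu}_{\Lambda_i,-,h}(F_x^+)\big)\, \widehat{\mu}_{\Lambda_i,-,h}\big(S_x^- \mid (F_x)^c\big) \ge \frac{\alpha}{2},$$
for small enough $h$, uniformly in $i$. Since $\zeta(x) = -1$, we have
$$\widehat{\mu}_{\Lambda_i,-,h}(\zeta) = \widehat{\mu}_{\Lambda_i,-,h}(S_x^-)\, \widehat{\mu}_{\Lambda_i,-,h}(\zeta \mid S_x^-) = \widehat{\mu}_{\Lambda_i,-,h}(S_x^-)\, \widehat{\mu}_{\Lambda_{i-1},-,h}(\zeta) \ge \frac{\alpha}{2}\, \widehat{\mu}_{\Lambda_{i-1},-,h}(\zeta),$$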

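completing the proof of (3.26). We are now ready to explain how the proof of Proposition 3.2.1 can be modified to prove Proposition 3.2.2. In place of (3.8), we have to prove the analogous statement:
$$\int d\widehat{\mu}_{\Lambda_{i-1},-,h}(\zeta)\, P\big((G^i_\zeta)^c\big) \le C_1 \exp(-C_2/h). \tag{3.27}$$
For this we use (3.24), (3.25), the stationarity of $(\widehat{\sigma}^{\,t_{i-1},\widehat{\mu}_{\Lambda_{i-1},-,h}}_{O,-,h;t})_{t_{i-1} \le t \le t_i}$, and (3.26) to obtain the following replacement of (3.12):
$$\begin{aligned}
\int d\widehat{\mu}_{\Lambda_{i-1},-,h}(\zeta)\, P\big((G^i_\zeta)^c\big)
&= \int d\widehat{\mu}_{\Lambda_{i-1},-,h}(\zeta)\, P\Big(\sigma^{t_{i-1},\zeta}_{O,-,h;t_{i+1}} \ne \sigma^{t_i,+}_{O,-,h;t_{i+1}}\Big) \\
&\le \int d\widehat{\mu}_{\Lambda_{i-1},-,h}(\zeta)\, P\Big(\sigma^{t_i,\zeta}_{O,-,h;t_{i+1}} \ne \sigma^{t_i,+}_{O,-,h;t_{i+1}}\Big) + \int d\widehat{\mu}_{\Lambda_{i-1},-,h}(\zeta)\, P(\tau^\zeta_{i-1} \le t_i) \\
&\le \frac{2}{\alpha} \int d\widehat{\mu}_{\Lambda_i,-,h}(\zeta)\, P\Big(\sigma^{t_i,\zeta}_{O,-,h;t_{i+1}} \ne \sigma^{t_i,+}_{O,-,h;t_{i+1}}\Big) + C_5 \exp(-C_6/h).
\end{aligned} \tag{3.28}$$
We can show that (3.28) leads to (3.27) by adapting the steps used to prove that (3.12) leads to (3.8). The following are the changes in the argument. This time we take
$$\Lambda^{core} = \Lambda\left(\frac{2B_c + b_1}{3h} W\right), \qquad \Lambda^{in} = \Lambda\left(\frac{B_c + 2b_1}{3h} W\right),$$
and $\Lambda^{out}_i = \Lambda_i \setminus \Lambda^{in}$.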

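Similarly to the derivations of (3.16) and (3.28), we can use the basic-coupling inequalities, the spectral gap estimates in Propositions 3.5.1 and 3.5.2, (3.24), (3.25), and the stationarity of $(\widehat{\sigma}^{\,t_i,\widehat{\mu}_{\Lambda_i,-,h}}_{O,-,h;t})_{t_i \le t \le t_{i+1}}$ to derive
$$\begin{aligned}
&\int d\widehat{\mu}_{\Lambda_i,-,h}(\zeta)\, P\Big(\sigma^{t_i,\zeta}_{O,-,h;t_{i+1}} \ne \sigma^{t_i,+}_{O,-,h;t_{i+1}}\Big) \\
&\quad\le \int d\widehat{\mu}_{\Lambda_i,-,h}(\zeta) \sum_{y \in \Lambda_i} \Big\{ P\big(\sigma^{t_i,+}_{O,-,h;t_{i+1}}(y) = +1\big) - P\big(\sigma^{t_i,\zeta}_{O,-,h;t_{i+1}}(y) = +1\big) \Big\} \\
&\quad\le \sum_{y \in \Lambda^{in}} \Big\{ P\big(\sigma^{+}_{\Lambda_i,+,h;\exp(\delta/h)}(y) = +1\big) - \widehat{\mu}_{\Lambda_i,-,h}\big(\{\sigma : \sigma(y) = +1\}\big) \Big\} \\
&\qquad+ \sum_{y \in \Lambda^{out}_i} \Big\{ P\big(\sigma^{+}_{\Lambda_i \setminus \Lambda^{core},(+,-),h;\exp(\delta/h)}(y) = +1\big) - \widehat{\mu}_{\Lambda_i,-,h}\big(\{\sigma : \sigma(y) = +1\}\big) \Big\} \\
&\qquad+ |\Lambda_i| \int d\widehat{\mu}_{\Lambda_i,-,h}(\zeta)\, P(\tau_i^\zeta \le t_{i+1}) \\
&\quad\le \sum_{y \in \Lambda^{in}} \Big\{ \mu_{\Lambda_i,+,h}\big(\{\sigma : \sigma(y) = +1\}\big) - \widehat{\mu}_{\Lambda_i,-,h}\big(\{\sigma : \sigma(y) = +1\}\big) \Big\} \\
&\qquad+ \sum_{y \in \Lambda^{out}_i} \Big\{ \mu_{\Lambda_i \setminus \Lambda^{core},(+,-),h}\big(\{\sigma : \sigma(y) = +1\}\big) - \widehat{\mu}_{\Lambda_i,-,h}\big(\{\sigma : \sigma(y) = +1\}\big) \Big\} \\
&\qquad+ C_7 \left(\frac{b_2}{h}\right)^2 \left\{ C_5 \exp(-C_6/h) + \exp\left(C_8 \left(\frac{b_2}{h}\right)^2\right) e^{-\exp(\frac{\delta}{2h})} \right\}.
\end{aligned} \tag{3.29}$$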

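The definitions of $B_1$ and $B_2$ are the same as before (see (3.17) and (3.18)), but with the modified choices above of $\Lambda^{core}$ and $\Lambda^{in}$. From Lemma 3.4.8 we know that for $j = 1, 2$,
$$\sup_{i=0,\dots,N} \widehat{\mu}_{\Lambda_i,-,h}(B_j) \le C_9 \exp(-C_{10}/h). \tag{3.30}$$
Due to the conditioning in the definition of $\widehat{\mu}_{\Lambda_i,-,h}$, the derivation of the analogues of (3.20) and (3.21) is somewhat more delicate. For $y \in \Lambda^{in}$, we let $\{E_j\}$ denote the partition of $(B_1)^c$ according to what the $(-,*)$-cluster of $\Lambda(\frac{b_1}{h}W)^c$ is. We obtain the following:
$$\begin{aligned}
\widehat{\mu}_{\Lambda_i,-,h}\big(\{\sigma : \sigma(y) = +1\} \mid (B_1)^c\big) &= \sum_j \alpha_j\, \widehat{\mu}_{\Lambda_i,-,h}\big(\{\sigma : \sigma(y) = +1\} \mid E_j\big) \\
&= \sum_j \alpha_j\, \mu_{\Lambda_i,-,h}\big(\{\sigma : \sigma(y) = +1\} \mid E_j\big) \\
&\ge \mu_{\Lambda_i,+,h}\big(\{\sigma : \sigma(y) = +1\}\big),
\end{aligned}$$
where in the second equality we used the fact that for each $j$, $E_j \subset R^c$, and in the final inequality we used the same standard argument which gives rise to (1.19) and the fact that $\sum_j \alpha_j = 1$. From (1.18) it follows then that
$$\mu_{\Lambda_i,+,h}\big(\{\sigma : \sigma(y) = +1\}\big) - \widehat{\mu}_{\Lambda_i,-,h}\big(\{\sigma : \sigma(y) = +1\}\big) \le 2\widehat{\mu}_{\Lambda_i,-,h}(B_1). \tag{3.31}$$
Similarly we can derive, for each $y \in \Lambda^{out}_i$,
$$\mu_{\Lambda_i \setminus \Lambda^{core},(+,-),h}\big(\{\sigma : \sigma(y) = +1\}\big) - \widehat{\mu}_{\Lambda_i,-,h}\big(\{\sigma : \sigma(y) = +1\}\big) \le 2\widehat{\mu}_{\Lambda_i,-,h}(B_2). \tag{3.32}$$
Our goal, (3.27), follows from combining (3.28), (3.29), (3.30), (3.31) and (3.32).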



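3.3. Rescaling and droplet creation. The inverted pyramid $O = O(b_1, b_2; h; \delta)$ and the event $G_+$ were conceived having in mind their use in a rescaling procedure. To each point $k = (k_1, k_2, k_3)$ of the rescaled space-time $\mathbb{Z}^2 \times \mathbb{Z}_+$ we associate the following translate of $O$:
$$O_k = \left(\frac{b'}{h} k_1,\ \frac{b'}{h} k_2,\ t_N k_3\right) + O,$$
where as before $N$ is 1 less than the number of elements in the sequence $\mathrm{Seq}(b_1, b_2; h)$, $t_N = N \exp(\delta/h)$ and $b' > 0$ is a new parameter. We suppose that $b_1$, $b_2$ and $b'$ are such that
$$\left( \Lambda\left(\frac{b_1}{h} W\right) + \left(\frac{b'}{h}, 0\right) \right) \cap \Lambda\left(\frac{b_1}{h} W\right) = \emptyset, \tag{3.33}$$
and
$$\Lambda\left(\frac{b_1}{h} W\right) + \left(\frac{b'}{h}, 0\right) \subset \Lambda\left(\frac{b_2}{h} W\right). \tag{3.34}$$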

It will be important that this can be done for arbitrarily large values of $b_1$. Indeed, with $b_1$ given we can choose $b'$ for (3.33) to hold, and then choose $b_2$ for (3.34) to hold. Of course, we will sometimes confuse inverted pyramids with their indices in the terminology being introduced below, and no inconvenience should arise from this.

We say that two inverted pyramids $O_k$ and $O_{k'}$ are neighbors in case $|k_3 - k_3'| = 1$ and $(k_1, k_2) - (k_1', k_2') \in \{(0,0), (0,1), (1,0), (0,-1), (-1,0)\}$. Note that $O = O_{(0,0,0)}$ has exactly 5 neighbors, all at rescaled time 1. The bottom cylinders of these 5 inverted pyramids will be pairwise disjoint and each will be contained in the top cylinder of the inverted pyramid $O_{(0,0,0)}$. A rescaled-space-time oriented chain will be a sequence $(k^{(1)}, \dots, k^{(n)})$ of elements of $\mathbb{Z}^2 \times \mathbb{Z}_+$ such that $k^{(i)}$ and $k^{(i+1)}$ are neighbors and $k_3^{(i+1)} = k_3^{(i)} + 1$, for $i = 1, \dots, n-1$. The start of the chain is $k^{(1)}$, and its end is $k^{(n)}$.

We will say that the inverted pyramid $O_k$ is open if the corresponding event $G_+$ happens for it. This event will be denoted by $G_{+,k}$. More formally, $G_{+,k}$ is the set of realizations of the graphical construction which would be in $G_+$ after space and time were translated by the amount $-\big(\frac{b'}{h} k_1, \frac{b'}{h} k_2, t_N k_3\big)$. The events $G_{+,k}$ have the very nice property that they are well suited for being concatenated. The simplest version of this idea is the following. Suppose that $(k^{(1)}, \dots, k^{(n)})$ is a rescaled-space-time oriented chain. We will be concerned with the process $(\sigma^{\,\cdot}_{\cup_i O_{k^{(i)}},-,h;t})$. To fix some notation, say that the bottom cylinder of the bottom inverted pyramid, $O_{k^{(1)}}$, is the set $\Lambda^{bot} \times [s_0, s_1]$ and that the top cylinder of the top inverted pyramid, $O_{k^{(n)}}$, is the set $\Lambda^{top} \times [s_3, s_4]$. Standard applications of the basic-coupling inequalities yield
$$\bigcap_{i=1,\dots,n} G_{+,k^{(i)}} \subset \left\{ \sigma^{s_0,+}_{\cup_i O_{k^{(i)}},-,h;s_4} = \sigma^{s_3,+}_{\cup_i O_{k^{(i)}},-,h;s_4} \right\}.$$
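The neighbor relation and the notion of an oriented chain introduced above are purely combinatorial, so they can be sketched directly on the index set $\mathbb{Z}^2 \times \mathbb{Z}_+$. The function names below are introduced here for illustration only and do not come from the paper.

```python
# Allowed spatial displacements between neighboring inverted pyramids.
OFFSETS = {(0, 0), (0, 1), (1, 0), (0, -1), (-1, 0)}


def are_neighbors(k, kp):
    """True if indices k, kp in Z^2 x Z_+ label neighboring inverted pyramids."""
    return abs(k[2] - kp[2]) == 1 and (k[0] - kp[0], k[1] - kp[1]) in OFFSETS


def neighbors(k):
    """All neighbors of k, keeping only nonnegative rescaled times."""
    result = []
    for dt in (-1, 1):
        if k[2] + dt < 0:
            continue  # rescaled time lives in Z_+
        for (d1, d2) in sorted(OFFSETS):
            result.append((k[0] + d1, k[1] + d2, k[2] + dt))
    return result


def is_oriented_chain(chain):
    """True if consecutive indices are neighbors and rescaled time increases by 1."""
    return all(
        are_neighbors(a, b) and b[2] == a[2] + 1
        for a, b in zip(chain, chain[1:])
    )


print(len(neighbors((0, 0, 0))))
print(is_oriented_chain([(0, 0, 0), (1, 0, 1), (1, 1, 2)]))
```

The origin has only the 5 neighbors at rescaled time 1 because there is no rescaled time $-1$; an index at a positive rescaled time has 10.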

Pictorially this amounts to a droplet of the (+)-phase flowing through the open tube $\cup_i O_{k^{(i)}}$. This relation is not yet strong enough for our purposes, because it requires a solid blob of (+) spins at the bottom, and we will not have such a solid droplet of (+)'s. Nevertheless the following stronger version of this relation can be derived via the same standard use of the basic-coupling inequalities. For an arbitrary $\eta \in \Omega_{\Lambda^{bot},-}$,
$$\left\{ \sigma^{s_0,\eta}_{O_{k^{(1)}},-,h;s_1} = \sigma^{s_0,+}_{O_{k^{(1)}},-,h;s_1} \right\} \cap \left( \bigcap_{i=1,\dots,n} G_{+,k^{(i)}} \right) \subset \left\{ \sigma^{s_0,\eta}_{\cup_i O_{k^{(i)}},-,h;s_4} = \sigma^{s_3,+}_{\cup_i O_{k^{(i)}},-,h;s_4} \right\}. \tag{3.35}$$

We will now construct a space-time structure motivated by the cone which appeared in the heuristics in Sect. 1.5. Our goal is to look at the system at time $\exp(\lambda/h)$, with $\lambda_c \le \lambda \le 2\lambda_c$, and prove (3.1). Set $M = \lfloor \exp(\lambda/h)/t_N \rfloor$. Moving backwards in time we choose now a set $C_M$ of inverted pyramids $O_k$. From the inverted pyramids with rescaled time coordinate $M - 1$ we take only $O_{(0,0,M-1)}$. Inductively, once we have selected the inverted pyramids at a certain rescaled time $m > 0$, we take at rescaled time $m - 1$ the inverted pyramids which are neighbors to at least one inverted pyramid already included in our set. The procedure stops at rescaled time 0. Note that if we consider the indices of the selected inverted pyramids, what we have done is precisely to construct the discrete rescaled analog of the space-time cone, as in the heuristics. An alternative definition of $C_M$ is that it is the set of inverted pyramids from which we can start a rescaled space-time oriented chain which ends at $O_{(0,0,M-1)}$.

The cardinality of the set $C_M$ clearly satisfies $C_1 M^3 \le |C_M| \le C_2 M^3$. On the other hand, the bounds on $\lambda$ and the fact that, due to (3.2),
$$t_N \le C \left(\frac{b_2}{h}\right)^2 \exp\left(\frac{\delta}{h}\right),$$
give us, for small $h$,
$$C' \left(\frac{h}{b_2}\right)^2 \exp\left(\frac{\lambda - \delta}{h}\right) \le M \le \exp\left(\frac{2\lambda_c}{h}\right).$$
Therefore
$$C_3 \left(\frac{h}{b_2}\right)^6 \exp\left(\frac{3(\lambda - \delta)}{h}\right) \le |C_M| \le C_4 \exp\left(\frac{6\lambda_c}{h}\right).$$
Let us choose $\delta$ as
$$\delta = \frac{\lambda - \lambda_c}{2}, \tag{3.36}$$
which means that $\lambda - \delta = \lambda_c + \delta$. Then, since $\lambda_c = \beta A/3$, we have
$$C_3 \left(\frac{h}{b_2}\right)^6 \exp\left(\frac{3\delta}{h}\right) \exp\left(\frac{\beta A}{h}\right) \le |C_M| \le C_4 \exp\left(\frac{6\lambda_c}{h}\right). \tag{3.37}$$
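The cubic growth $C_1 M^3 \le |C_M| \le C_2 M^3$ can be checked by building the index set of the cone exactly as described: start from $(0,0,M-1)$ and add, one rescaled time step at a time, every index that is a neighbor of one already selected. The sketch below does this by brute force; the function name `build_cone` is introduced here for illustration.

```python
# Allowed spatial displacements between neighboring inverted pyramids.
NEIGHBOR_OFFSETS = {(0, 0), (0, 1), (1, 0), (0, -1), (-1, 0)}


def build_cone(M):
    """Indices (k1, k2, k3) of the discrete space-time cone with apex (0, 0, M-1)."""
    layer = {(0, 0)}                 # spatial indices selected at the current time
    cone = {(0, 0, M - 1)}
    for m in range(M - 2, -1, -1):   # move backwards in rescaled time
        layer = {
            (k1 + d1, k2 + d2)
            for (k1, k2) in layer
            for (d1, d2) in NEIGHBOR_OFFSETS
        }
        cone |= {(k1, k2, m) for (k1, k2) in layer}
    return cone


for M in (1, 2, 3, 10):
    print(M, len(build_cone(M)))
```

Each layer at distance $j$ below the apex is the set of lattice points with $|k_1| + |k_2| \le j$, which has $2j^2 + 2j + 1$ elements, so the total is $M(2M^2+1)/3$, between $\tfrac{2}{3}M^3$ and $M^3$.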

The lower bound in (3.37) is central to our analysis, and we will return to it later, when we discuss droplet creation. The upper bound in (3.37) is of technical relevance in connection to droplet growth, because we will want to have all the events $G_{+,k}$, $k \in C_M$, happening. At this point recall that our goal is to prove (3.1), in which an arbitrarily large constant $C$ is involved. It is clear from Proposition 3.2.1 and the upper bound in



(3.37) that given a value for the constant $C$ in (3.1) we can choose $b_1$ so large that for some finite $C_1$,
$$P\Big(\bigcup_{k \in C_M} (G_{+,k})^c\Big) \le C_1 \exp(-C/h), \tag{3.38}$$
for all $h > 0$. Define the space-time region

1=

! Ok

[

,

k∈CM

where =3

b2 W h

λ × M tN , exp . h

We can think of this space-time region $\Delta$ as playing the role of the cone in the heuristics. Informally speaking, (3.38) assures that once a supercritical droplet of size $(b_1/h)W$ is born at the bottom of some $O_k$, $k \in C_M$, then with large probability it will reach the top of the uppermost inverted pyramid, $O_{(0,0,M-1)}$, of $C_M$. If it survives also through the top cylinder of $\Delta$, it will reach the time $\exp(\lambda/h)$. A further condition on $b_2$ is required to assure us that the cylinder at the top of $\Delta$ is wide enough so that the aimed conclusion (3.1) will be proven with an arbitrarily large constant $C$. Technically this condition is the following. Given the constant $C$ in (3.1), we will need to take $b_2$ large enough for there to exist a finite $C_1$ so that
$$\langle f \rangle_{\Lambda(\frac{b_2}{h}W),-,h} \ge \langle f \rangle_h - C_1 \exp(-C/h), \tag{3.39}$$

for all $h > 0$ small enough. That such a choice is possible is heuristically reasonable, since in the double-well picture of the Gibbs distribution $\mu_{\Lambda(\frac{b_2}{h}W),-,h}$ with $b_2$ large, the mass is concentrated in the well corresponding to the (+)-phase, which can be made very deep by choosing $b_2$ appropriately large. From the rigorous view-point, aside from the claim that $C$ can be arbitrarily large, a proof of (3.39) can be found in [SS1] (see the proof of Theorem 1.b.3 there). A complete proof of (3.39) is obtained by combining Lemma 3.4.8 with (1.20) and the FKG-Holley inequalities.

Observe that our choices are made in the following order. Suppose $\lambda$ and $C$ are given. First (3.36) gives us the value of $\delta$. Then (3.38) gives us the value of $b_1$. Afterwards, (3.33) gives us $b'$, and finally (3.34) and (3.39) give us $b_2$.

So far we have developed mathematically rigorous counterparts to the notion that if a supercritical droplet is created close to the bottom of any of the inverted pyramids $O_k$, $k \in C_M$, it is likely to grow and bring the (+)-phase to the neighborhood of the origin at time $\exp(\lambda/h)$. What we still need is to make mathematical sense of the creation of such a supercritical droplet as occurring at a rate predicted by the heuristics. The lower bound in (3.37) tells us that if we could say that close to the bottom of each one of these inverted pyramids, and independently of what happens close to the bottom of the other inverted pyramids, there is probability of the order of $\exp(-\beta A/h)$ of creating such a droplet, then we would be done. This would be akin to saying that the rate of creation of supercritical droplets is $\exp(-\beta A/h)$, as we expect. Motivated by (3.35) we will say that supercritical droplet creation occurs in the bottom of the inverted pyramid $O_k$, which has $\Lambda^{bot} \times [s_0, s_1]$ as its bottom cylinder, in case the following event happens:



$$F_k = \left\{ \sigma^{s_0,-}_{O_k,-,h;s_1} = \sigma^{s_0,+}_{O_k,-,h;s_1} \right\}.$$
The events $F_k$, $k \in C_M$, are clearly mutually independent, since they are determined by the graphical construction marks in disjoint regions of space-time. The final part of this section will be concerned with proving that for each $k$, for small $h > 0$,
$$P(F_k) = P(F_0) \ge \frac{1}{2} \exp\left(-\frac{\beta A}{h}\right). \tag{3.40}$$
If for the moment we suppose that this is known, we can complete the proof of (3.1) as follows. Consider the event
$$E = \left( \bigcap_{k \in C_M} G_{+,k} \right) \cap \left( \bigcup_{k \in C_M} F_k \right).$$
From (3.38), the lower bound in (3.37), and (3.40) we have for small $h$,
$$P(E^c) \le C_1 \exp(-C/h) + \left(1 - \frac{1}{2} \exp\left(-\frac{\beta A}{h}\right)\right)^{|C_M|} \le 2C_1 \exp(-C/h).$$
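The second term in the bound on $P(E^c)$ is the probability that none of the $|C_M|$ independent creation events $F_k$ occurs. Combining the per-pyramid lower bound (3.40) with the lower bound on $|C_M|$ in (3.37) shows that this probability is extremely small once $h$ is small. The sketch below evaluates its logarithm; `beta_A` stands for the product $\beta A$, and the constant `C3` and all sample values are hypothetical placeholders not taken from the paper.

```python
import math


def log_prob_no_creation(h, beta_A, delta, b2, C3=1.0):
    """Log of (1 - exp(-beta_A/h)/2)**n, with n the lower bound on |C_M| in (3.37)."""
    # log of n = C3 * (h/b2)^6 * exp(3*delta/h) * exp(beta_A/h)
    log_n = math.log(C3) + 6.0 * math.log(h / b2) + 3.0 * delta / h + beta_A / h
    p = 0.5 * math.exp(-beta_A / h)          # lower bound (3.40) on P(F_k)
    # log1p keeps precision when p is tiny.
    return math.exp(log_n) * math.log1p(-p)


# Hypothetical sample values: the bound becomes effective only for small h.
for h in (0.05, 0.01):
    print(h, log_prob_no_creation(h, beta_A=1.0, delta=0.2, b2=3.0))
```

Since $\log(1-p) \le -p$, the logarithm is at most $-\tfrac{1}{2} C_3 (h/b_2)^6 \exp(3\delta/h)$: the factor $\exp(\beta A/h)$ in $|C_M|$ exactly compensates the small creation probability, and the surplus $\exp(3\delta/h)$ makes the failure probability smaller than any $\exp(-C/h)$.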

Now, using basic-coupling inequalities, (3.35) and (3.39) we obtain 0,− ν E f σh;exp(λ/h) ≥ E f σ1,−,h;exp(λ/h) 0,− ≥ E f σ1,−,h;exp(λ/h) ; E − P(E c ) M tN ,+ = E f σ,−,h;exp(λ/h) ; E − P(E c ) M tN ,+ ≥ E f σ,−,h;exp(λ/h) − 2P(E c ) ≥ hf i3( b2 W ),−,h − 4C1 exp(−C/h) h

≥ hf ih − C10 exp(−C/h). All that remains to be done in this section is to show (3.40). For this purpose we will insert another inverted pyramid inside the bottom cylinder of each one of our inverted pyramids Ok , k ∈ CM . To distinguish the new inverted pyramids from the ones that we have been discussing so far (parametrized by b1 , b2 , δ and h), we will call the old ones “growth inverted pyramids” and the ones which we are introducing now “creation inverted pyramids”. The creation inverted pyramid which is inserted inside the bottom cylinder of O = O0 is described next. It will be of the form Ocr =

$$\bigcup_{i=0}^{N_{cr}} [u_i, u_{i+1}] \times \Lambda^{cr}_i,$$

with the following features. As with the growth inverted pyramids, for $i = 0, \dots, N_{cr} - 1$, the set $\Lambda^{cr}_{i+1}$ will be obtained from $\Lambda^{cr}_i$ by adding one site to it. In particular we will have $\Lambda^{cr}_0 \subset \Lambda^{cr}_1 \subset \dots \subset \Lambda^{cr}_{N_{cr}}$. These sets $\Lambda^{cr}_i$ will all be $l_0$-quasi-Wulff-shaped. We will take $\Lambda^{cr}_0 = \Lambda(\frac{b_0}{h}W)$, where $b_0 \in (B_c, b_1)$ is a new parameter. At the other end, $\Lambda^{cr}_{N_{cr}} = \Lambda(\frac{b_1}{h}W)$, where $b_1$ is the same one used for the growth inverted pyramids. The bottom cylinder of $O_{cr}$ will have height $u_1 - u_0 = \exp(\frac{\delta}{2h})$, while all its other cylinders will have height $u_{i+1} - u_i = \exp(\frac{\delta'}{h})$, for $i = 1, \dots, N_{cr}$, where $\delta' < \delta$ is also a new parameter. With the parameters $b_1$, $\delta$, $b_0$ and $\delta'$ fixed and satisfying the conditions above, it is clear that for small enough $h > 0$, there is an inverted pyramid as described above which fits inside the bottom cylinder of $O$, and has its top $\Lambda^{cr}_{N_{cr}} \times u_{N_{cr}+1}$ coinciding with the top of the bottom cylinder of $O$, i.e., such that $u_{N_{cr}+1} = \exp(\delta/h)$. This is the inverted pyramid that we will denote by $O_{cr}$. The basic-coupling inequalities imply that
$$\left\{ \sigma^{u_0,-}_{O_{cr},-,h;u_{N_{cr}+1}} = \sigma^{u_{N_{cr}},+}_{O_{cr},-,h;u_{N_{cr}+1}} \right\} \subset \left\{ \sigma^{0,-}_{O,-,h;\exp(\delta/h)} = \sigma^{0,+}_{O,-,h;\exp(\delta/h)} \right\}.$$

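Therefore the next lemma implies (3.40).

Lemma 3.2. Given $b_1 > B_c$ and $\delta > 0$ there are $b_0 \in (B_c, b_1)$ and $\delta' > 0$ such that for small $h > 0$,
$$P\Big( \sigma^{u_0,-}_{O_{cr},-,h;u_{N_{cr}+1}} = \sigma^{u_{N_{cr}},+}_{O_{cr},-,h;u_{N_{cr}+1}} \Big) \ge \frac{1}{2} \exp\left(-\frac{\beta A}{h}\right).$$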

Proof (modulo results in Sects. 3.4 and 3.5). Before starting the rigorous proof we will motivate it heuristically. Consider first the bottom cylinder of $O_{cr}$. Intuitively, when $b_0$ is close to $B_c$, we can see the system in the box $\Lambda(\frac{b_0}{h}W)$ as having a double well structure with the deeper well corresponding to the (−)-phase, and the higher well corresponding to the presence of a supercritical droplet of the (+)-phase. The barrier between these wells is given by the configurations with a critical droplet. Note that if we were in equilibrium at time $u_1$ inside the box $\Lambda(\frac{b_0}{h}W)$ with (−) boundary conditions, then the quantity $\frac{1}{2}\exp(-\frac{\beta A}{h})$ would indeed be a lower bound on the probability of being in the higher well and hence having a supercritical droplet. We would therefore like to say that inside this bottom cylinder the system started at time $u_0$ with all spins down should reach equilibrium at the top of this cylinder, i.e., at time $u_1 = u_0 + \exp(\frac{\delta}{2h})$. In other words, we would like to say that the relaxation time for the process $(\sigma^{\,\cdot}_{\Lambda(\frac{b_0}{h}W),-,h;t})$ is shorter than $\exp(\frac{\delta}{2h})$. For this to be true it should be enough to take $b_0$ close enough to $B_c$, so that from the higher well there is a very small barrier to overcome to reach equilibrium. This free-energy barrier can be made smaller than $\frac{\delta}{2h}$, so the available time should be enough to equilibrate the system.

The heuristics in the last paragraph can be made rigorous by considering the spectral gap of the generator of the process. This will be done in Sect. 3.5. From Proposition 3.5.3 we have that if $b_0$ is chosen close enough to $B_c$, then
$$\mathrm{gap}\left(\Lambda\left(\frac{b_0}{h} W\right), -, h\right) \ge \exp\left(-\frac{\delta}{4h}\right), \tag{3.41}$$

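for small enough $h$. We start now the rigorous proof of the lemma. First we break things down according to what happens at time $u_1 = u_0 + \exp(\frac{\delta}{2h})$,
$$P\Big( \sigma^{u_0,-}_{O_{cr},-,h;u_{N_{cr}+1}} = \sigma^{u_{N_{cr}},+}_{O_{cr},-,h;u_{N_{cr}+1}} \Big) = \sum_\zeta P\Big( \sigma^{u_0,-}_{O_{cr},-,h;u_1} = \zeta \Big)\, P\Big( \sigma^{u_1,\zeta}_{O_{cr},-,h;u_{N_{cr}+1}} = \sigma^{u_{N_{cr}},+}_{O_{cr},-,h;u_{N_{cr}+1}} \Big).$$
Combining (3.41) with the standard inequality (59) in [Sch1], we have
$$\begin{aligned}
&\Bigg| \sum_\zeta P\Big( \sigma^{u_0,-}_{O_{cr},-,h;u_1} = \zeta \Big)\, P\Big( \sigma^{u_1,\zeta}_{O_{cr},-,h;u_{N_{cr}+1}} = \sigma^{u_{N_{cr}},+}_{O_{cr},-,h;u_{N_{cr}+1}} \Big) \\
&\qquad - \sum_\zeta \mu_{\Lambda(\frac{b_0}{h}W),-,h}(\zeta)\, P\Big( \sigma^{u_1,\zeta}_{O_{cr},-,h;u_{N_{cr}+1}} = \sigma^{u_{N_{cr}},+}_{O_{cr},-,h;u_{N_{cr}+1}} \Big) \Bigg| \\
&\quad\le \frac{e^{-\exp(\frac{\delta}{2h})\,\mathrm{gap}\left(\Lambda(\frac{b_0}{h}W),-,h\right)}}{\mu_{\Lambda(\frac{b_0}{h}W),-,h}(-)} \le \exp\left(C \left(\frac{b_0}{h}\right)^2\right) e^{-\exp(\frac{\delta}{4h})}.
\end{aligned}$$
But using Lemma 3.4.4 and Proposition 3.2.2, we can take $\delta'$ small enough so that
$$\begin{aligned}
&\sum_\zeta \mu_{\Lambda(\frac{b_0}{h}W),-,h}(\zeta)\, P\Big( \sigma^{u_1,\zeta}_{O_{cr},-,h;u_{N_{cr}+1}} = \sigma^{u_{N_{cr}},+}_{O_{cr},-,h;u_{N_{cr}+1}} \Big) \\
&\quad\ge \mu_{\Lambda(\frac{b_0}{h}W),-,h}(R^c) \sum_{\zeta \in R^c} \widehat{\mu}_{\Lambda(\frac{b_0}{h}W),-,h}(\zeta)\, P\Big( \sigma^{u_1,\zeta}_{O_{cr},-,h;u_{N_{cr}+1}} = \sigma^{u_{N_{cr}},+}_{O_{cr},-,h;u_{N_{cr}+1}} \Big) \\
&\quad\ge \exp\left(-\frac{\beta A}{h}\right) \cdot \frac{3}{4},
\end{aligned}$$
for small $h$. The three displays above combined give us the lemma.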

At this point the proof of part (ii) of Theorem 1 has been reduced to proving the claims in Sects. 3.4 and 3.5.

3.4. Double well structure of equilibrium distributions. In this section we will study some of the features of the Gibbs distributions on finite simply-connected sets with (−)-boundary conditions. We will extend and strengthen results contained in Theorem 1 of [SS1]; the purpose being the use of these stronger results in several other sections of the current paper. While the main results that we will derive in this section and use in other ones could be derived using the same approach as in [SS1], we will nevertheless introduce an alternative method for proving them, based on results and techniques from Sect. 2.3 of this paper. Basically, in [SS1] we started from results on large deviations of the average spin (i.e., the magnetization) inside of a box, under no external field. We could see the external field then as tilting the distribution. Such an approach is appealing and even natural, but since in the current paper our interest lies primarily on contours and not on the magnetization, it seemed even more natural to search for a direct approach to the problems, in which the magnetization need not be mentioned. Having in mind that we are interested in the contours of configurations in $\Omega_-$, it is natural to regard $W$ and $\widehat{V}$, defined in Sect. 2.3, as random variables. Given a configuration $\eta \in \Omega_-$, the associated values of these random variables are, respectively,


445

W(S) and Vˆ (S), where S is the collection of skeletons corresponding to the external contours of η. Basically we will show that in a sense the function φ(b) = wb − m∗ b2

b ≥ 0,

plays the role of a large-deviation rate function for the random variable Vˆ . The title of this section derives from the shape of this function. The reader will realize that in the lemmas below the results are stated and proven with a certain amount of uniformity over the allowed sets 3; this uniformity is needed in some of our applications of the lemmas, since for instance in Sect. 3.2 we need uniformity over all the sets which are bases of cylinders of inverted pyramids. Recall that Bc is the value of b which maximizes φ, with φ(Bc ) = A, while B0 = 2Bc is the value of b above which this function becomes negative. Recall also that given a h,co finite set of vertebrate contours G we denote by SG the set of configurations which belong to − and which have as their collection of external vertebrate contours the set G. In some of the lemmas below, we will use as a “reference” the set of configurations with no external vertebrate contours, S∅h,co . Our next lemma shows that this is essentially the same as using the set R, defined by (2.10), as a “reference”. We will use the notation oh (1) to represent some function of h > 0, satisfying the property limh&0 oh (1) = 0. Lemma 3.4.1. For any p > 0 there is a function oh (1) such that for any simply-connected set 3 ⊂ Z2 which satisfies |3| ≤ 1/hp , 1≤

Z3,−,h (R) ≤ 1 + oh (1). Z3,−,h S∅h,co

Proof. From Lemmas 2.3.3 and 2.3.7 we have, for small enough $h$ depending on $p$ and some finite $C$ also depending on $p$,
\[
\tilde\mu_{\Lambda,-,h}\Big(\big(S_\emptyset^{h,co}\big)^c\Big) \le C \exp\Big(-\beta\,\frac{w}{4\sqrt{2}\,h\,\tilde b}\Big).
\]
(Here we use $\tilde b$ to denote the parameter entering the definition of the vertebrate contour.) The thesis follows immediately from the last estimate.

Lemma 3.4.2. For any $p > 0$ there is $h_0 > 0$ and a function $o_h(1)$ such that given $D_0 > 0$ there is a finite constant $C$ so that for any $D > D_0$, any $b > 0$, any $0 < h \le h_0$, and any simply-connected set $\Lambda \subset Z^2$ which satisfies $|\Lambda| \le 1/h^p$,
\[
\frac{Z_{\Lambda,-,h}\Big(W \ge \frac{D}{h},\ \hat V \le \big(\frac{b}{h}\big)^2\Big)}{Z_{\Lambda,-,h}\big(S_\emptyset^{h,co}\big)}
\le C \exp\Big(-\frac{\beta}{h}\,(D - m^* b^2)(1 + o_h(1))\Big).
\]

Proof. The proof of Lemma 2.3.6 and (2.24) show that we can choose $h_0$ and $C$ so as to have for each collection of skeletons $S$ such that $\hat V(S) \le (b/h)^2$,
\[
\sum_{G \in C_S^h} \frac{Z_{\Lambda,-,h}\big(S_G^{h,co}\big)}{Z_{\Lambda,-,h}\big(S_\emptyset^{h,co}\big)}
\le \exp\Big(-\frac{\beta}{h}\,(W(S) - m^* b^2)(1 + o_h(1))\Big).
\]
Summing then over $S$ with $W(S) \ge D/h$, using the entropy estimate in the proof of Lemma 2.3.7, we obtain the desired conclusion.

In what follows, with $b > 0$ and $0 < \rho < 1$ given, $E^h_{b,\rho}$ will denote the event that there is an external contour which surrounds $\frac{b(1-\rho)}{h} W$ and is contained in $\frac{b(1+\rho)}{h} W$, and that moreover this is the only external vertebrate contour.

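Lemma 3.4.3. For any $p > 0$ and $\gamma < 1$ there are finite positive constants $h_0$ and $C_1$ and a function $o_h(1)$ such that for any $0 < h \le h_0$, any $b > 0$, any $\rho = \rho(h) > h^\gamma$ and any simply-connected set $\Lambda \subset Z^2$ which satisfies $|\Lambda| \le 1/h^p$ and contains $\Lambda\big(\frac{b(1+\rho)}{h} W\big)$,
\[
\frac{Z_{\Lambda,-,h}\big(E^h_{b,\rho}\big)}{Z_{\Lambda,-,h}\big(S_\emptyset^{h,co}\big)} \ge C_1 \exp\Big(-\frac{\beta}{h}\,\phi(b)(1 + o_h(1))\Big).
\]
In particular, $\rho$ can be a positive constant.

Proof. Without loss of generality we can suppose that $\rho = h^\gamma$ with $\gamma < 1$. Partition $E^h_{b,\rho}$ according to what the vertebrate external contours are, as
\[
E^h_{b,\rho} = \bigcup_{G \in G^h_{b,\rho}} S_G^{h,co}.
\]
By the definition of $E^h_{b,\rho}$, each $G \in G^h_{b,\rho}$ is a singleton. Exactly as in (2.33), we have
\[
\frac{Z_{\Lambda,-,h}\big(E^h_{b,\rho}\big)}{Z_{\Lambda,-,h}\big(S_\emptyset^{h,co}\big)}
= \sum_{G \in G^h_{b,\rho}} \frac{Z_{\Lambda,-,h}\big(S_G^{h,co}\big)}{Z_{\Lambda,-,h}\big(S_\emptyset^{h,co}\big)}
= \sum_{G \in G^h_{b,\rho}} \frac{Z_{\Lambda,-,0}\big(S_G^{h,co}\big)}{Z_{\Lambda,-,0}\big(S_\emptyset^{h,co}\big)}
\exp\Big(\frac{\beta}{2} \int_0^h \sum_{x \in \Lambda} \Big[\langle\sigma(x)\rangle^{G,h}_{\Lambda,-,h'} - \langle\sigma(x)\rangle^{\emptyset,h}_{\Lambda,-,h'}\Big]\, dh'\Big).
\]
Arguments analogous to the ones which led to (2.34) provide also the following bound, in the opposite direction. For small enough $h_0$, and some finite positive $C_2$, if $G \in G^h_{b,\rho}$ and $0 < h' \le h \le h_0$, then
\[
\sum_{x \in \Lambda} \Big[\langle\sigma(x)\rangle^{G,h}_{\Lambda,-,h'} - \langle\sigma(x)\rangle^{\emptyset,h}_{\Lambda,-,h'}\Big] \ge 2 m^* \Big(\frac{b}{h}\Big)^2 (1 + o_h(1)).
\]
Therefore we have
\[
\frac{Z_{\Lambda,-,h}\big(E^h_{b,\rho}\big)}{Z_{\Lambda,-,h}\big(S_\emptyset^{h,co}\big)}
\ge C_1 \exp\Big(\frac{\beta}{h}\, m^* b^2 (1 + o_h(1))\Big) \sum_{G \in G^h_{b,\rho}} \frac{Z_{\Lambda,-,0}\big(S_G^{h,co}\big)}{Z_{\Lambda,-,0}\big(S_\emptyset^{h,co}\big)}
\ge C_1 \exp\Big(\frac{\beta}{h}\, m^* b^2 (1 + o_h(1))\Big)\, \mu_{\Lambda,-,0}\big(E^h_{b,\rho}\big).
\]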


The problem is now reduced to a 0-field problem. We will use the notation introduced in the proof of Lemma 2.3.6. For each $G \in G^h_{b,\rho}$, let $F_G$ be the event that the spins in $\partial_- G$ are all negative and those in $\partial_+ G$ are all positive. With this notation we can write
\[
\mu_{\Lambda,-,0}\big(E^h_{b,\rho}\big)
= \sum_{G \in G^h_{b,\rho}} \mu_{\Lambda,-,0}(F_G)\, \mu_{\Lambda^G_{ext},-,0}\big(S_\emptyset^{h,co}\big)
\ge \mu_{\Lambda,-,0}\big(S_\emptyset^{h,co}\big) \sum_{G \in G^h_{b,\rho}} \mu_{\Lambda,-,0}(F_G)
\ge \frac{1}{2} \sum_{G \in G^h_{b,\rho}} \mu_{\Lambda,-,0}(F_G),
\]
where in the second step we used the FKG-Holley inequalities, and in the last step we used (2.40). To complete the proof, we note that from Lemma 5.2 in [Iof1], or from the techniques in Sect. 7 of [SS2], we have
\[
\sum_{G \in G^h_{b,\rho}} \mu_{\Lambda,-,0}(F_G) \ge \exp\Big(-\frac{\beta}{h}\, wb\,(1 + o_h(1))\Big).
\]
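As a bookkeeping check, the two exponential bounds obtained in this proof can be combined as follows. The display is only a sketch: the labels $o^{(1)}_h(1)$ and $o^{(2)}_h(1)$ for the two error terms are introduced here for illustration and do not appear in the original argument.
\[
\frac{Z_{\Lambda,-,h}\big(E^h_{b,\rho}\big)}{Z_{\Lambda,-,h}\big(S_\emptyset^{h,co}\big)}
\ge C_1 \exp\Big(\frac{\beta}{h}\, m^* b^2 \big(1 + o^{(1)}_h(1)\big)\Big) \cdot \frac{1}{2} \exp\Big(-\frac{\beta}{h}\, wb \big(1 + o^{(2)}_h(1)\big)\Big)
= \frac{C_1}{2} \exp\Big(-\frac{\beta}{h} \Big[\phi(b) + wb\, o^{(2)}_h(1) - m^* b^2\, o^{(1)}_h(1)\Big]\Big).
\]
The leading term in the exponent is $-\beta\phi(b)/h$, which is the form of the bound stated in Lemma 3.4.3 once the factor $1/2$ is absorbed into the constant and, for $b$ in a range where $\phi(b)$ stays away from zero, the error terms are absorbed into a single factor $(1 + o_h(1))$.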

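Lemma 3.4.4. For any $p > 0$ and any $b > B_c$ there are finite positive constants $h_0$ and $\varepsilon$ such that for any $0 < h \le h_0$ and any simply-connected set $\Lambda \subset Z^2$ which satisfies $|\Lambda| \le 1/h^p$ and contains $\Lambda\big(\frac{b}{h} W\big)$,
\[
\mu_{\Lambda,-,h}(R^c) \ge \exp\Big(-\frac{\beta}{h}\, A (1 - \varepsilon)\Big).
\]

Proof. From Lemmas 3.4.1 and 3.4.3 we have, for small enough $h_0$, $\varepsilon$, small constant $\rho$ and some positive finite $C_1$ and $C_2$,
\[
\frac{Z_{\Lambda,-,h}(R^c)}{Z_{\Lambda,-,h}(R)}
\ge \frac{Z_{\Lambda,-,h}\big(E^h_{b(1-\rho),\rho}\big)}{2\, Z_{\Lambda,-,h}\big(S_\emptyset^{h,co}\big)}
\ge C_1 \exp\Big(-\frac{\beta}{h}\, \phi(b(1-\rho))(1 + o_h(1))\Big)
\ge 2 \exp\Big(-\frac{\beta}{h}\, A (1 - \varepsilon)\Big).
\]
The conclusion is now immediate.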
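For completeness, here is one way to spell out the last step; it is a sketch under the assumptions that $0 < \varepsilon < 1$ and that $h$ is small, and the abbreviations $x$ and $x_0$ are introduced only for this remark. Write $x = Z_{\Lambda,-,h}(R^c)/Z_{\Lambda,-,h}(R)$ and $x_0 = 2\exp\big(-\frac{\beta}{h} A(1-\varepsilon)\big)$, so that the proof gives $x \ge x_0$. Since $t \mapsto t/(1+t)$ is increasing on $[0,\infty)$,
\[
\mu_{\Lambda,-,h}(R^c) = \frac{Z_{\Lambda,-,h}(R^c)}{Z_{\Lambda,-,h}(R) + Z_{\Lambda,-,h}(R^c)} = \frac{x}{1+x} \ge \frac{x_0}{1+x_0} \ge \frac{x_0}{2} = \exp\Big(-\frac{\beta}{h}\, A(1-\varepsilon)\Big),
\]
where the last inequality uses $x_0 \le 1$, which holds for small $h$ because $A(1-\varepsilon) > 0$.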

Lemma 3.4.5. For any p > 0, there are finite positive constants h0 , C1 and C2 and a function oh (1) such that given also 0 < b1 < b2 there is a finite constant C10 so that for any 0 < h ≤ h0 , any b ∈ [b1 , b2 ], any κ ∈ (0, 1), any integer k and any simply-connected set 3 ⊂ Z2 which satisfies |3| ≤ 1/hp and contains 3 hb (1 − κ2 /4)W ,



! β min C1 exp − φ(eb)(1 + oh (1)) h b(1−κ)<eb 0,  !2 !2   eb eb  (1 − ) ≤ Vˆ ≤ ⊂ Eeh , b(1−ρ),ρ  h h  W ⊂ 3. Hence the claim follows from Lemma 3.4.3. and also 3 b(1−ρ)(1+ρ) h For the second inequality, note that from the variational result for families of 2 curves presented in Sect. 2.9 of [DKS] we know that if Vˆ ≥ hb then W ≥ w hb . 2 We can therefore apply Lemma 3.4.2 to each of the k events hb (1 − i κk ) ≤ Vˆ ≤ b κ 2 h (1 − (i − 1) k ) , i = 1, 2, ..., k, to conclude that 2 2 Z3,−,h hb (1 − i κk ) ≤ Vˆ ≤ hb (1 − (i − 1) κk ) Z3,−,h S∅h,co 2 Z3,−,h W ≥ w hb (1 − i κk ) , Vˆ ≤ hb (1 − (i − 1) κk ) ≤ Z3,−,h S∅h,co κ β κ 0 ≤ C1 exp − φ(b(1 − (i − 1) )(1 + oh (1)) − C2 b . h k k The previous lemma basically contains the promised characterization of φ(·) as a large-deviation rate function for the random variable Vˆ . Informally it tells us that under appropriate conditions on 3, 2 Z3,−,h Vˆ ≈ hb β ∼ exp − φ(b) . h S h,co Z 3,−,h

∅

There is a technical difficulty in applying Lemma 3.4.5, nevertheless, and this is the motivation for the next lemma. The issue is that, as stated above, the lemma cannot be used to estimate $Z_{\Lambda,-,h}\big(\hat V \in (I/h)^2\big) / Z_{\Lambda,-,h}\big(S_\emptyset^{h,co}\big)$ if $I$ is, e.g., an interval of the form $[0, b]$, for some $b > 0$, since $C_1' = C_1'(b_1, b_2)$ can explode as $b_1 \to 0$.
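To make the shape of the rate function concrete, the following minimal sketch evaluates $\phi(b) = wb - m^* b^2$ and the quantities $B_c$, $B_0 = 2B_c$ and $A = \phi(B_c)$ recalled at the beginning of this section. The closed form $B_c = w/(2m^*)$ is simply the vertex of the parabola; the numerical values of `w` and `m_star` are placeholders chosen for illustration and are not taken from the paper.

```python
def phi(b, w, m_star):
    """Rate function phi(b) = w*b - m_star*b**2, defined for b >= 0."""
    return w * b - m_star * b * b


def critical_values(w, m_star):
    """Return (B_c, B_0, A): maximizer of phi, its positive zero, and its maximum."""
    b_c = w / (2.0 * m_star)   # vertex of the parabola
    b_0 = 2.0 * b_c            # phi is negative for b > B_0
    return b_c, b_0, phi(b_c, w, m_star)


# Placeholder parameters (assumed values, for illustration only).
w, m_star = 2.0, 0.5
b_c, b_0, a = critical_values(w, m_star)
print(b_c, b_0, a)
```

With these placeholder values, $\phi$ increases on $[0, B_c]$, decreases afterwards, and changes sign at $B_0$, which is the qualitative picture used in Lemmas 3.4.4 to 3.4.8.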



Lemma 3.4.6. For any $p > 0$, there is $h_0 > 0$ and a function $o_h(1)$ such that for any $0 < h \le h_0$, any $0 < b < B_0$, and any simply-connected set $\Lambda \subset Z^2$ which satisfies $|\Lambda| \le 1/h^p$,
\[
1 \le \frac{Z_{\Lambda,-,h}\Big(\hat V \le \big(\frac{b}{h}\big)^2\Big)}{Z_{\Lambda,-,h}\big(S_\emptyset^{h,co}\big)} \le 1 + o_h(1).
\]

Proof. Only the upper bound has to be explained. If $b < B_c$ the claim follows from Lemma 3.4.1. The general case is reduced to this one by taking $0 < b' \le \min\{B_c, b\}$ and using Lemma 3.4.5 to control the contribution from the interval $[b', b]$.

Let $\bar E^h_{b,\rho}$ be the event that there is an external contour which up to a translation surrounds $\frac{b(1-\rho)}{h} W$ and is contained in $\frac{b(1+\rho)}{h} W$.
Lemma 3.4.7. For any $p > 0$ and $\rho > 0$ there are finite positive constants $\varepsilon_0$ and $C_2$ such that given also $0 < b_1 < b_2$ there are finite positive constants $h_0$ and $C_1$ such that for any $0 < h \le h_0$, any $b \in [b_1, b_2]$, any $\varepsilon \le \varepsilon_0$ and any simply-connected set $\Lambda \subset Z^2$ which satisfies $|\Lambda| \le 1/h^p$ and contains $\Lambda\big(\frac{b}{h}(1 - \varepsilon^2/4) W\big)$,
\[
\mu_{\Lambda,-,h}\Big(\big(\bar E^h_{b,\rho}\big)^c \ \Big|\ \Big(\frac{b}{h}(1-\varepsilon)\Big)^2 \le \hat V \le \Big(\frac{b}{h}\Big)^2\Big) \le C_1 \exp\Big(-\frac{C_2 b}{h}\Big).
\]

Proof. We will use the stability result for the Wulff functional (in the case of families of curves) contained in Theorem 2.9 of [DKS]. Consider a configuration in
\[
\big(\bar E^h_{b,\rho}\big)^c \cap \Big\{\Big(\frac{b}{h}(1-\varepsilon)\Big)^2 \le \hat V \le \Big(\frac{b}{h}\Big)^2\Big\}.
\]
Clearly there is a positive finite $C$ (depending only on the temperature) such that, if $h$ is small, for any such configuration there is no external contour whose skeleton, even after being translated by any amount, is at a Hausdorff distance less than $C\frac{b\rho}{h}$ from the boundary of $\frac{b}{h} W$. To use Theorem 2.9 in [DKS], we scale lengths down by dividing them by $\frac{b(1-\varepsilon)}{h}$. The rescaled collection of skeletons corresponding to the external contours has a phase volume bounded below by 1. On the other hand, no translate of any of the rescaled skeletons is at a Hausdorff distance less than $C\frac{\rho}{1-\varepsilon}$ from the boundary of $\frac{1}{1-\varepsilon} W$. The quoted theorem in [DKS] tells us then that the Wulff functional associated to the collection of scaled skeletons is bounded below by an expression of the type $w(1 + G(\rho, \varepsilon))$, where $\lim_{\varepsilon \searrow 0} G(\rho, \varepsilon) = G(\rho, 0) > 0$. Scaling lengths back to their original value, we have obtained the following lower bound for the Wulff functional of any configuration in the event with which we are concerned:
\[
W \ge w\, \frac{b}{h}\, (1-\varepsilon)(1 + G(\rho, \varepsilon)).
\]
If $\varepsilon_0$ is chosen small enough so that $(1 - \varepsilon_0)(1 + G(\rho, \varepsilon_0)) > 1$, and then $h_0$ is chosen small enough, based on Lemma 3.4.2, we obtain for all $\varepsilon \le \varepsilon_0$,
\[
\frac{Z_{\Lambda,-,h}\Big(\big(\bar E^h_{b,\rho}\big)^c \cap \Big\{\big(\frac{b}{h}(1-\varepsilon)\big)^2 \le \hat V \le \big(\frac{b}{h}\big)^2\Big\}\Big)}{Z_{\Lambda,-,h}\big(S_\emptyset^{h,co}\big)}
\le \frac{Z_{\Lambda,-,h}\Big(W \ge w \frac{b}{h}(1-\varepsilon)(1 + G(\rho,\varepsilon)),\ \hat V \le \big(\frac{b}{h}\big)^2\Big)}{Z_{\Lambda,-,h}\big(S_\emptyset^{h,co}\big)}
\]
\[
\le C_1 \exp\Big(-\frac{\beta}{h}\Big[wb(1-\varepsilon)(1 + G(\rho,\varepsilon))(1 + o_h(1)) - m^* b^2 (1 + o_h(1))\Big]\Big)
\le C_1 \exp\Big(-\frac{\beta}{h}\, \phi(b)(1 + o_h(1)) - \frac{C_2' b}{h}\Big),
\]
where $C_2' = \beta w\big((1-\varepsilon) G(\rho, \varepsilon) - \varepsilon\big)/2$. The result now follows from the comparison between this estimate and the first inequality in Lemma 3.4.5. In this fashion, with both $h_0$ and $\varepsilon_0$ small enough, we can take $C_2 = \beta w G(\rho, 0)/3$.

As motivation for the next lemma, we recall that by controlling how deeply the (−,*) cluster of the boundary penetrates in a set $\Lambda$, we can obtain estimates similar to (1.20) for the expected value of observables. Define
\[
B^{-*}_{h,b}(\varepsilon) = \bigcup_{b' \in (\varepsilon b,\, b]} \Big\{ \Lambda\Big(\frac{b' - \varepsilon b}{h} W\Big) \stackrel{-*}{\longleftrightarrow} \Big(\Lambda\Big(\frac{b'}{h} W\Big)\Big)^c \Big\}.
\]

Recall that a set $\Lambda \subset Z^2$ is said to be $l_0$-quasi-Wulff-shaped with linear parameter $l$ in case it is simply-connected and $\Lambda((l - l_0) W) \subset \Lambda \subset \Lambda((l + l_0) W)$. Recall also that $\hat\mu_{\Lambda,-,h}(\,\cdot\,) = \mu_{\Lambda,-,h}(\,\cdot \mid R^c)$.

Lemma 3.4.8. For any $l_0 > 0$ and any $\varepsilon > 0$ there is $C_2 > 0$ such that given also $B_c < b_1 \le b_2$ there are positive finite constants $h_0$ and $C_1$ such that if $0 < h \le h_0$, then for any $l_0$-quasi-Wulff-shaped set $\Lambda$ which has linear size parameter $b/h$ with $b_1 \le b \le b_2$,
\[
\hat\mu_{\Lambda,-,h}\big(B^{-*}_{h,b}(\varepsilon)\big) \le C_1 \exp\Big(-\frac{C_2 (b_1 - B_c)}{h}\Big). \tag{3.42}
\]
In case $b_1 > B_0$ the same holds also for the measure $\mu_{\Lambda,-,h}$:
\[
\mu_{\Lambda,-,h}\big(B^{-*}_{h,b}(\varepsilon)\big) \le C_1 \exp\Big(-\frac{C_2 (b_1 - B_0)}{h}\Big). \tag{3.43}
\]

Proof. We start by introducing more terminology. The event that a certain contour $\Gamma$ is present is equivalent to the statement that the spins in a certain set of sites $S_1(\Gamma)$ all have the same sign and the spins in a certain other set of sites $S_2(\Gamma)$ all have the opposite sign. Exactly one of the two sets, $S_1(\Gamma)$ or $S_2(\Gamma)$, is surrounded by the contour, while the other is completely outside of the contour. The one which is surrounded by $\Gamma$ will be denoted by $\partial_{int}\Gamma$, while the one which is outside of $\Gamma$ will be denoted by $\partial_{ext}\Gamma$.

Let $D^+_{h,b}(\varepsilon)$ be the event that for some contour $\Gamma$ which surrounds $\frac{b(1-\varepsilon/2)}{h} W$ all spins in $\partial_{int}\Gamma$ are $+1$. Our first goal is to prove that
\[
\hat\mu_{\Lambda,-,h}\Big(\big(D^+_{h,b}(\varepsilon)\big)^c\Big) \le C_1 \exp\Big(-\frac{C_2 (b_1 - B_c)}{h}\Big). \tag{3.44}
\]

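In order to do it let us consider the event $\big\{\hat V \ge (b'/h)^2\big\}$, which for $b' < B_c$ and $h$ small enough is bigger than $R^c$, that is, $R^c \subset \big\{\hat V \ge (b'/h)^2\big\}$. If $b'$ is just slightly smaller than $B_c$, then the bigger event is a good approximation to the smaller one. So instead of considering the conditional distribution $\hat\mu_{\Lambda,-,h}$, we begin with the distribution $\bar\mu_{\Lambda,-,h}(\,\cdot\,) = \mu_{\Lambda,-,h}\big(\,\cdot \mid \hat V \ge (b'/h)^2\big)$. The choice of $b'$ is immaterial. The only thing we need is that $b' < B_c$ and that it is close enough to $B_c$, so that $\phi(b') > \phi(b_1)$. Under such a choice this value $b'$ would not even appear in our estimates.

Elementary geometric considerations show that for small enough $\rho$, dependent on $\varepsilon$ but not on $b$ and $h$,
\[
\Omega_{\Lambda,-} \cap \big(D^+_{h,b}(\varepsilon)\big)^c \subset \big(\bar E^h_{b,\rho}\big)^c.
\]
And obviously every configuration in $\Omega_{\Lambda,-}$ has $\hat V \le \big(\frac{b}{h}\big)^2 + \frac{C}{h}$, for some finite constant $C$, which depends on $b$. So we can use Lemma 3.4.7 to conclude that we can choose an appropriate $b''$ (larger than but close enough to $b$), and small $\varepsilon' > 0$, $\varepsilon'' > 0$ and $h_0 > 0$ so as to have
\[
\bar\mu_{\Lambda,-,h}\Big(\big(D^+_{h,b}(\varepsilon)\big)^c \ \Big|\ \hat V \ge \Big(\frac{b}{h}(1-\varepsilon')\Big)^2\Big)
= \mu_{\Lambda,-,h}\Big(\big(D^+_{h,b}(\varepsilon)\big)^c \ \Big|\ \hat V \ge \Big(\frac{b}{h}(1-\varepsilon')\Big)^2\Big)
= \mu_{\Lambda,-,h}\Big(\big(D^+_{h,b}(\varepsilon)\big)^c \ \Big|\ \hat V \ge \Big(\frac{b''}{h}(1-\varepsilon'')\Big)^2\Big)
\]
\[
= \mu_{\Lambda,-,h}\Big(\big(D^+_{h,b}(\varepsilon)\big)^c \ \Big|\ \Big(\frac{b''}{h}(1-\varepsilon'')\Big)^2 \le \hat V \le \Big(\frac{b''}{h}\Big)^2\Big)
\le \mu_{\Lambda,-,h}\Big(\big(\bar E^h_{b,\rho}\big)^c \ \Big|\ \Big(\frac{b''}{h}(1-\varepsilon'')\Big)^2 \le \hat V \le \Big(\frac{b''}{h}\Big)^2\Big)
\]
\[
\le \mu_{\Lambda,-,h}\Big(\big(\bar E^h_{b'',\rho/2}\big)^c \ \Big|\ \Big(\frac{b''}{h}(1-\varepsilon'')\Big)^2 \le \hat V \le \Big(\frac{b''}{h}\Big)^2\Big)
\le C_3 \exp\Big(-\frac{C_4 b}{h}\Big), \tag{3.45}
\]
for some finite positive $C_3$ and $C_4$, which depend on $\rho$, and hence on $\varepsilon$, but with $C_4$ not depending on $b$, $b_1$ and $b_2$.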



Lemma 3.4.5 implies that given 000 > 0, after possibly readjusting the value of h0 , we will also have 2 ! β φ(b(1 − 000 )) − φ(b) b 000 ˆ (1 − ) ≤ C5 exp − µ¯ 3,−,h V ≤ h h 2 000 C (b1 − Bc ) ≤ C5 exp − . (3.46 ) h The last step above is based on elementary calculus, and the resulting C > 0 is a constant which depends only on the temperature. From (3.45) and (3.46) (with 0 = 000 ) we have c C7 (b1 − Bc ) + . (3.47) () ≤ C6 exp − µ¯ 3,−,h Dh,b h To go back from µ¯ 3,−,h to µ b3,−,h , observe that if we choose an appropriate 000 , from (3.46) and an application of Lemma 3.4.7 similar to the one above we obtain C 9 b1 . µ¯ 3,−,h (R) ≤ C8 exp − h 0 2 c ˆ , we have Because R ⊂ V ≥ bh µ b3,−,h

+ () Dh,b

c

=µ b3,−,h

0 2 ! b c + () Vˆ ≥ Dh,b h

= µ¯ 3,−,h

c + µ ¯ D () 3,−,h h,b c + . () Rc ≤ Dh,b µ¯ 3,−,h (Rc )

From the last three displays we obtain (3.44) + () according to what the outermost contour 0 in its Let now {Ej } partition Dh,b definition is. In this fashion we obtain the following: X + −∗ −∗ () Dh,b αj µ b3,−,h Bh,b µ b3,−,h Bh,b () = () Ej j

=

X

−∗ αj µ3,−,h Bh,b () Ej

j

−∗ ≤ µ3,+,h Bh,b (/2) 4 C11 b b2 exp − ≤ C10 h h C13 b1 ≤ C12 exp − , h

(3.48 )

where in the second equality we used the fact that for each j, Ej ⊂ Rc , in the first inequality we used the Markov P property of Gibbs distributions, the FKG-Holley inequalities and the fact that j αj = 1, and in the second inequality we used (1.17).



The first claim that we wanted to prove, (3.42), follows from (3.44) and (3.48). Regarding the second claim, (3.43), which refers to the case in which $b_1 > B_0$, note that then Lemmas 3.4.5 (lower bound), 3.4.6 and 3.4.7 imply
\[
\mu_{\Lambda,-,h}(R) \le C_{14} \exp\Big(\frac{\beta}{h}\,\phi(b)\Big) + C_1 \exp\Big(-\frac{C_2 b}{2h}\Big)
\le C_{14} \exp\Big(\frac{\beta}{h}\,\phi(b_1)\Big) + C_1 \exp\Big(-\frac{C_2 b_1}{2h}\Big)
\le C_{15} \exp\Big(-\frac{C_{16}(b_1 - B_0)}{h}\Big).
\]
This shows that our second claim follows from the first one, already proven, since $\mu_{\Lambda,-,h}(\,\cdot\,) \le \hat\mu_{\Lambda,-,h}(\,\cdot\,) + \mu_{\Lambda,-,h}(R)$.

3.5. Spectral gap estimates. In this section we will prove three propositions which provide lower bounds on the spectral gap of the generator of the kinetic Ising models on some finite sets. A basic technique to be used comes from the fundamental paper [Mar], where for the first time (to our knowledge) a rigorous mathematical relation was established between relaxation times of kinetic Ising models and the equilibrium surface tension. This was done in a setting in which there is no external applied field, and the system was taken in a square box of size $l \times l$ with free boundary conditions and at low temperature (in that paper only very low temperatures were considered, but the main results were later extended up to $T_c$ in [CGMS]). The "time to jump between the (+)-phase and the (−)-phase" was shown in that paper to behave as $\exp(\beta \bar\tau l)$, as $l \to \infty$, where $\bar\tau = \tau((1, 0))$ is the surface tension in a coordinate direction.

For each finite $\Lambda \subset Z^2$, $\eta \in \Omega$ and $h$, the process $(\sigma^{\cdot}_{\Lambda,\eta,h;t})_{t \ge 0}$ is a finite-state-space reversible irreducible Markov process and its generator has its (discrete) spectrum contained in the interval $(-\infty, 0]$, with 0 being in the spectrum. The spectral gap, denoted by $\mathrm{gap}(\Lambda, \eta, h)$, is then simply the absolute value of the largest non-null number in the spectrum. We will prove the three propositions below, the first two of them being needed in Sect. 3.2 and the third one being needed in Sect. 3.3. The heuristics behind these three propositions was explained in those sections.
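As a toy illustration of the definition of the spectral gap just given (and not of the kinetic Ising model itself), the sketch below computes the gap of the generator of a two-state continuous-time Markov chain. The function names and the jump rates are hypothetical choices made for this example only. For rates $a$ and $b$ the generator has eigenvalues $0$ and $-(a+b)$, so the gap is $a + b$.

```python
import math


def two_state_generator(rate_up, rate_down):
    """Generator matrix of a two-state Markov chain (each row sums to zero)."""
    return [[-rate_up, rate_up], [rate_down, -rate_down]]


def eigenvalues_2x2(m):
    """Real eigenvalues of a 2x2 matrix with real spectrum, in increasing order."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(trace * trace - 4.0 * det)
    return [(trace - disc) / 2.0, (trace + disc) / 2.0]


def spectral_gap(generator, tol=1e-12):
    """Absolute value of the largest non-null eigenvalue of the generator."""
    nonzero = [ev for ev in eigenvalues_2x2(generator) if abs(ev) > tol]
    return abs(max(nonzero))


# Hypothetical rates, chosen only for illustration.
gap = spectral_gap(two_state_generator(0.25, 0.75))
print(gap)
```

The inverse of the gap sets the time scale of relaxation to equilibrium, which is why lower bounds on the gap are what the propositions below provide.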
For a recap of the terminology used in these lemmas, see the beginning of Sect. 3.4 and the paragraph in that section which precedes Lemma 3.4.8.

Proposition 3.5.1. For any $l_0 > 0$, any $\varepsilon > 0$ and any $\bar b > 0$ there is $h_0 > 0$ such that if $0 < h \le h_0$ then for any $l_0$-quasi-Wulff-shaped set $\Lambda$ which has linear size parameter $b/h$ with $b \le \bar b$,
\[
\mathrm{gap}(\Lambda, +, h) \ge \exp\Big(-\frac{\varepsilon}{h}\Big).
\]

Proposition 3.5.2. For any $l_0 > 0$, any $\varepsilon > 0$ and any $B_c < b_{core} < \bar b$ there is $h_0 > 0$ such that if $0 < h \le h_0$ then for any $l_0$-quasi-Wulff-shaped set $\Lambda$ which has linear size parameter $b/h$ with $b_{core} < b \le \bar b$,
\[
\mathrm{gap}\big(\Lambda \setminus \Lambda_{core}, (+,-), h\big) \ge \exp\Big(-\frac{\varepsilon}{h}\Big),
\]
where $\Lambda_{core} = \Lambda\big(\frac{b_{core}}{h} W\big)$, and the boundary condition $(+,-)$ refers to freezing the spins up inside the core $\Lambda_{core}$ and down outside $\Lambda$.

Proposition 3.5.3. For any $\varepsilon > 0$ there are $b > B_c$ and $h_0 > 0$ such that if $0 < h \le h_0$ then
\[
\mathrm{gap}\Big(\Lambda\Big(\frac{b}{h} W\Big), -, h\Big) \ge \exp\Big(-\frac{\varepsilon}{h}\Big).
\]

We will explain how certain results and techniques in [Mar] can be used to prove the three propositions above. The specific problem in [Mar] which is close to ours is the subject of Sect. 3 in that paper. This problem concerns the spectral gap for the process with no external field, in a square box of size $l \times l$ with (+)-boundary conditions. It is shown that given $\varepsilon \in (0, 1/2)$, at low enough temperature there is $C > 0$ so that
\[
\mathrm{gap}(\Lambda(L), +, 0) \ge \exp\big(-C L^{1/2+\varepsilon}\big). \tag{3.49}
\]
Heuristically speaking, in this problem (contrary to the case of free-boundary conditions, which is the main concern in [Mar]), there is no free-energy barrier to overcome for the system either starting with all spins down or all spins up to reach equilibrium. One simply expects the (+)-phase to drift inwards from the boundary. (This indicates that the result (3.49) is far from optimal, and that the corresponding gap should be much larger than this lower bound.) This situation is similar to our problems, as stated in the propositions above. In the first two propositions we are dealing with situations without free-energy barriers, and in the third one the free-energy barrier can be made as small as needed, by adjusting the value of $b$.

The technique used in [Mar] to prove (3.49) consists in comparing the spectral gap for the kinetic Ising model with the one for a block-dynamics. The estimate on the spectral gap for the block-dynamics is reduced to equilibrium-statistical-mechanics problems, which are then solved. By a block-dynamics the following is meant. Suppose that $\{\Lambda^0, \dots, \Lambda^J\}$ is a finite collection of finite subsets of $Z^2$ and that $\Lambda \subset \cup_{j=0,\dots,J} \Lambda^j$ is another finite subset of $Z^2$. The block-dynamics in $\Lambda$ with blocks $\{\Lambda^0, \dots, \Lambda^J\}$ and with boundary condition $\eta \in \Omega_{\Lambda,-}$ will be denoted by
\[
\big(\sigma^{\cdot}_{\Lambda,\{\Lambda^0,\dots,\Lambda^J\},\eta,h;t}\big)_{t \ge 0}.
\]

It is defined by updating each block $\Lambda^j \cap \Lambda$ at rate 1, independently of the other blocks, and at each update of $\Lambda^j \cap \Lambda$ replacing the configuration inside this block with a configuration chosen at random according to the Gibbs distribution $\mu_{\Lambda^j \cap \Lambda,\sigma,h}$, where $\sigma$ is the current configuration. The corresponding generator is given by
\[
(Lf)(\sigma) = \sum_{j=0,\dots,J}\ \sum_{\sigma'} \mu_{\Lambda^j \cap \Lambda,\sigma,h}(\sigma')\,\big(f(\sigma') - f(\sigma)\big).
\]
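The definition of the block-dynamics can be sketched on a very small system. The example below is an illustration under stated assumptions, not the setting of the paper: it uses a chain of three spins with free ends instead of a subset of $Z^2$ with boundary conditions, a hypothetical Hamiltonian `energy` with unit coupling, and two overlapping blocks. It builds the generator matrix by adding, for each block, the conditional Gibbs kernel of that block minus the identity, so that each block is updated at rate 1.

```python
import itertools
import math


def energy(sigma, h):
    """Illustrative Hamiltonian: nearest-neighbour chain with free ends, field h."""
    pair = sum(sigma[i] * sigma[i + 1] for i in range(len(sigma) - 1))
    return -0.5 * pair - 0.5 * h * sum(sigma)


def block_kernel(sigma, block, beta, h):
    """Conditional Gibbs law on `block`, given the spins of sigma outside it."""
    weights = {}
    for values in itertools.product((-1, 1), repeat=len(block)):
        new = list(sigma)
        for site, value in zip(block, values):
            new[site] = value
        weights[tuple(new)] = math.exp(-beta * energy(new, h))
    total = sum(weights.values())
    return {cfg: wt / total for cfg, wt in weights.items()}


def block_generator(n, blocks, beta, h):
    """Generator of the block-dynamics as a nested dict gen[sigma][sigma']."""
    states = list(itertools.product((-1, 1), repeat=n))
    gen = {s: {t: 0.0 for t in states} for s in states}
    for s in states:
        for block in blocks:
            for t, prob in block_kernel(s, block, beta, h).items():
                gen[s][t] += prob   # resample the block: jump to t at rate prob
            gen[s][s] -= 1.0        # each block is updated at total rate 1
    return states, gen


states, gen = block_generator(3, [(0, 1), (1, 2)], beta=1.0, h=0.1)
```

Two properties can be checked directly on the resulting matrix: every row sums to zero, and the dynamics is reversible with respect to the Gibbs weights $\exp(-\beta\,\mathrm{energy})$, which is the structural feature that makes spectral-gap arguments available.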

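To state an inequality which compares the spectral gap $\mathrm{gap}(\Lambda, \eta, h)$ with the spectral gap $\mathrm{gap}(\Lambda, \{\Lambda^0, \dots, \Lambda^J\}, \eta, h)$ of the block dynamics, we need to introduce some notation. Set
\[
L_j = \max_{k \in Z} \big|\{(x_1, x_2) \in \Lambda^j : x_2 = k\}\big|, \qquad
L = \max_{j=0,\dots,J} L_j, \qquad \text{and} \qquad
V = \max_{j=0,\dots,J} |\Lambda^j|.
\]
Theorem 2.1 in [Mar] provides us with the following bound:
\[
\mathrm{gap}(\Lambda, \eta, h) \ge \frac{C_1 \exp(-C_2 L)}{V}\ \mathrm{gap}(\Lambda, \{\Lambda^0, \dots, \Lambda^J\}, \eta, h), \tag{3.50}
\]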

where $C_1$ and $C_2$ are finite positive constants which depend only on the temperature. The proof of this result in [Mar] is restricted to the case in which the blocks are of a certain type, adapted to the needs in that paper, but this restriction is clearly not relevant in the proof. Somewhat more importantly, the rates of the kinetic Ising models in [Mar] are not as general as ours, with only a special case of rates, which satisfy detailed-balance and our assumptions (H1) – (H4), being considered. This is not a problem, and indeed, once (3.50) is established for one choice of rates satisfying detailed-balance and (H4), it holds for all such rates, thanks to the fact that the spectral gap is bounded below and above by, respectively, the same equilibrium quantity multiplied by $c_{min}(T)$ and $c_{max}(T)$ (for this see, e.g., Eq. (61) in [Sch1]).

In order to prove Propositions 3.5.1 – 3.5.3, our blocks will be Wulff-annuli, defined as follows. Given $\rho > 0$, set
\[
A^0_\rho = 2\rho W, \qquad A^j_\rho = (j+2)\rho W \setminus j\rho W \quad \text{for } j = 0, 1, \dots,
\]
and for $h > 0$, and $j = 0, 1, \dots$,
\[
\Lambda^j = \Lambda\Big(\frac{1}{h} A^j_\rho\Big),
\]
where the value of $\rho$ is chosen in a fashion that we describe next, and which depends on the value of $\varepsilon$ and $\bar b$ in the propositions that we are proving (in the case of Proposition 3.5.3 we can choose some arbitrary $\bar b > B_c$). Note that for each $\rho > 0$, $j = 0, 1, \dots$, and $r \in R$, the set $A^j_\rho \cap \{(x_1, x_2) \in R^2 : x_2 = r\}$ if not empty consists of either an interval or the union of two intervals, and that its Lebesgue measure satisfies
\[
\max_{r \in R}\ \max_{j=0,\dots,\lfloor \bar b/\rho \rfloor} \big|A^j_\rho \cap \{(x_1, x_2) \in R^2 : x_2 = r\}\big| \to 0 \quad \text{as } \rho \to 0.
\]
Therefore we can choose $\rho$ small enough and take $J = \lfloor \bar b/\rho \rfloor$, so that the corresponding collection of blocks, $\{\Lambda^0, \dots, \Lambda^J\}$, satisfies for each $h > 0$,
\[
L \le \frac{\varepsilon}{2 C_2 h}, \tag{3.51}
\]
where $C_2$ comes from (3.50). Since $V$ grows only as a power of $1/h$, from (3.50) and (3.51) we see that all that remains is to show that with the choices above and the pertinent $\Lambda$ and $\eta$, $\mathrm{gap}(\Lambda, \{\Lambda^0, \dots, \Lambda^J\}, \eta, h)$ can be assured to be large enough. In the case of Propositions 3.5.1 and 3.5.2 this amounts to showing that this quantity can be bounded below by a positive constant which does not depend on $h$. In the case of Proposition 3.5.3, we need to show that by taking $b$ sufficiently close to $B_c$ this will also be the case. (The way we set things up above, some blocks $\Lambda^j$ may not intersect the set $\Lambda$ in some situations. This, of course, is not important, since such blocks have no effect on the dynamics. The setup above was chosen for notational convenience.)

To study $\mathrm{gap}(\Lambda, \{\Lambda^0, \dots, \Lambda^J\}, \eta, h)$ one can couple the processes $\big(\sigma^{-}_{\Lambda,\{\Lambda^0,\dots,\Lambda^J\},\eta,h;t}\big)_{t \ge 0}$ and $\big(\sigma^{+}_{\Lambda,\{\Lambda^0,\dots,\Lambda^J\},\eta,h;t}\big)_{t \ge 0}$ in such a way that the first marginal never lies above the second one, and after they hit each other they coalesce and remain together. This can be done, for instance, via a graphical construction, in which to each block $\Lambda^j$ we associate a rate 1 Poisson process. The occurrence times of this Poisson process then determine the moments of update of $\Lambda^j \cap \Lambda$ in both marginal processes, and the updates are coupled in a way that preserves the order, which is possible due to the FKG-Holley inequalities. In what follows we will use $P$ to denote the probabilities associated to this coupling. The goal is now to show that there are positive finite constants $t_0$ and $h_0$ so that for all $0 < h \le h_0$ and all $\Lambda$ with which we are concerned, and corresponding $\eta$,
\[
P\Big(\sigma^{-}_{\Lambda,\{\Lambda^0,\dots,\Lambda^J\},\eta,h;t_0} \ne \sigma^{+}_{\Lambda,\{\Lambda^0,\dots,\Lambda^J\},\eta,h;t_0}\Big) \le \frac{1}{2}. \tag{3.52}
\]

From this inequality it then follows that for all $t > 0$,
\[
P\Big(\sigma^{-}_{\Lambda,\{\Lambda^0,\dots,\Lambda^J\},\eta,h;t} \ne \sigma^{+}_{\Lambda,\{\Lambda^0,\dots,\Lambda^J\},\eta,h;t}\Big)
\le \Big(\frac{1}{2}\Big)^{\lfloor t/t_0 \rfloor}
\le C \exp\Big(-\frac{\log 2}{t_0}\, t\Big).
\]
In particular then we have
\[
\mathrm{gap}(\Lambda, \{\Lambda^0, \dots, \Lambda^J\}, \eta, h) \ge \frac{\log 2}{t_0},
\]
as needed to complete the proofs.

Concerning the proof of (3.52), we start by observing that we have only a finite number of blocks. Therefore, if $t_0$ is chosen large enough it will be likely that before time $t_0$ a sequence of updates will have occurred in the coupled processes in which the blocks were updated in a particular, predetermined order. In the case of Propositions 3.5.1 and 3.5.3 the good order is the decreasing one, from $J$ down to 0, while in the case of Proposition 3.5.2 it is the increasing order, from 0 up to $J$. The point is that at the end of a sequence of updates produced in the good order it is very likely, when $h$ is small, that the two marginal processes will have hit each other. To show this, we will use an equilibrium estimate on how much the boundary conditions can influence the Gibbs distributions inside the annular blocks. This is the content of the following lemma. After stating it and before going into its proof we will explain the idea of how it has to be used in order to prove (3.52). In the lemma we will use the notation
\[
P^{-}_{h,b}(\varepsilon) = \Big\{ \Lambda\Big(\frac{b(1-\varepsilon)}{h} W\Big) \stackrel{-}{\longleftrightarrow} \Big(\Lambda\Big(\frac{b(1+\varepsilon)}{h} W\Big)\Big)^c \Big\}
\]
and
\[
P^{+}_{h,b}(\varepsilon) = \Big\{ \Lambda\Big(\frac{b(1-\varepsilon)}{h} W\Big) \stackrel{+}{\longleftrightarrow} \Big(\Lambda\Big(\frac{b(1+\varepsilon)}{h} W\Big)\Big)^c \Big\}.
\]
Lemma 3.5.1. For any l0 > 0, any > 0 and any 0 < b1 < b2 there are finite positive constants C1 and C2 such that


457

(a) If φ(b1 ) < φ(b2 ) then for all 31 and 32 which are l0 -quasi-Wulff shaped with respective linear-size-parameters bh1 and bh2 , + µ32 \31 ,(+,−),h (Ph,b ()) ≤ C1 exp(−C2 /h), 1 c then for all h > 0. And if also x ∈ 3 b1h+ W

|hσ(x)i32 \31 ,(+,−),h − hσ(x)i32 \31 ,(−,−),h | ≤ C1 exp(−C2 /h), for all h > 0. (b) If φ(b1 ) > φ(b2 ) then for all 31 and 32 which are l0 -quasi-Wulff shaped with respective linear-size-parameters bh1 and bh2 , − ()) ≤ C1 exp(−C2 /h), µ32 \31 ,(+,−),h (Ph,b 2 for all h > 0. And if also x ∈ 3 b2h− W then

|hσ(x)i32 \31 ,(+,−),h − hσ(x)i32 \31 ,(+,+),h | ≤ C1 exp(−C2 /h), for all h > 0. (c) For all 31 and 32 which are l0 -quasi-Wulff shaped with respective linear-sizeparameters bh1 and bh2 , − ()) ≤ C1 exp(−C2 /h), µ32 \31 ,(−,+),h (Ph,b 1

for all h > 0. And if also x ∈ 3

b1 + h W

c

then

|hσ(x)i32 \31 ,(−,+),h − hσ(x)i32 \31 ,(+,+),h | ≤ C1 exp(−C2 /h), for all h > 0. Proof of (3.52). Here we explain the idea of deriving the estimate (3.52) from the above lemma. We will do it for the case of Proposition 3.5.1, for which the statement (c) of the lemma is used. The main observation is very simple. As the reader remembers, we are concerned with the event that the updates of the blocks happen in decreasing order, that is first the block 3J is updated, then the block 3J−1 is, and so on. Note that the boundary condition on the outer boundary of 3J is (+). As we learn from the lemma, after the update of the block 3J , with overwhelming probability it becomes almost completely filled with (+)-phase, no matter what the boundary condition on the inner boundary of 3J is. In particular, its middle line — which is the outer boundary of the block 3J−1 — is in the (+)-phase. So the argument can be repeated. For the complete argument the reader is referred to [Mar], Theorem 3.1. In the final remark before going into the proof of the Lemma 3.5.1 let us explain the relations between different parts of the lemma and Propositions 3.5.1 – 3.5.3: part (a) has to be used for Proposition 3.5.3, part (b) – for Proposition 3.5.2, and part (c) – for the Proposition 3.5.1. The choice of b in Proposition 3.5.3 has to be done in such a way that φ(b) > φ(b − ρ) with ρ chosen before (3.51). Proof of Lemma 3.5.1. In each one of the three parts of the lemma, the second claim, about the expected value of the spin at a site x, follows from the first claim in the same part of the lemma, by arguments analogous to (1.20).



Regarding the first claim in each part of the lemma, we could in principle develop a machinery similar to the one developed in Sect. 3.4, to deal with the Gibbs distribution inside Wulff annuli. Some technical complications would arise from the fact that such sets are not simply-connected. It turns out, nevertheless, that for our purposes we can avoid this lengthy approach, and rather use the results for simply-connected sets (more specifically, for quasi-Wulff-shaped sets) of Sect. 3.4, combined with some tricks involving conditioning. To stress that our approach is to some extent natural, we observe that the hypothesis in our lemma refers to the values of $\phi(b_1)$ and $\phi(b_2)$, and hence somehow we should use our knowledge about quasi-Wulff-shaped boxes to prove these results on the annuli. And to make the conditioning below appear less of a trick, observe that, say in part (a), we are interested in (+)-boundary conditions in the center, something that is akin to having a droplet of the (+)-phase placed there; the conditioning places such a droplet in the center.

We turn now to the proof of the first claim in part (a) of the lemma. Recall the definitions in the first paragraph of the proof of Lemma 3.4.8. Using that notation, we will say that $\Gamma$ is a negative-outside (same as positive-inside) contour of a configuration $\eta$ in case $\eta$ is identically $-1$ on $\partial_{ext}\Gamma$ and identically $+1$ on $\partial_{int}\Gamma$. External contours of any configuration in $\Omega_-$ are negative-outside.

Let $E$ be the event that there is a contour which surrounds $\Lambda_1$ and $E' \subset E$ be the event that such a contour is contained in $\frac{b_1(1+\varepsilon/2)}{h} W$ and is negative-outside. We are going to argue that if $E$ happens, then with very high probability $E'$ happens as well. So we need an estimate on $\mu_{\Lambda_2,-,h}(E)$ from below, and an estimate on $\mu_{\Lambda_2,-,h}\big(E \cap (E')^c\big)$ from above. From Lemma 3.4.3 we know that the first probability is at least of the order of $\exp\big(-\frac{\beta}{h}\phi(b_1)\big)$.
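To obtain the second estimate, let us introduce the number $b' > b_1$, which is close enough to $b_1$, so that $\delta = 1 - \frac{b_1}{b'}$ is small enough. The choice of $\delta$ will be made later. One sees immediately that
\[
\mu_{\Lambda_2,-,h}\big((E')^c \cap E\big)
= \mu_{\Lambda_2,-,h}\Big((E')^c \cap E \cap \Big[\Big(\frac{b_1}{h}\Big)^2 \le \hat V \le \Big(\frac{b_2}{h}\Big)^2\Big]\Big)
\]
\[
\le \mu_{\Lambda_2,-,h}\Big((E')^c \cap E \ \Big|\ \Big(\frac{b'}{h}(1-\delta)\Big)^2 \le \hat V \le \Big(\frac{b'}{h}\Big)^2\Big)
\times \mu_{\Lambda_2,-,h}\Big(\Big(\frac{b'}{h}(1-\delta)\Big)^2 \le \hat V \le \Big(\frac{b'}{h}\Big)^2\Big)
+ \mu_{\Lambda_2,-,h}\Big(\Big(\frac{b'}{h}\Big)^2 \le \hat V \le \Big(\frac{b_2}{h}\Big)^2\Big).
\]
Note now that if both events $(E')^c \cap E$ and $\hat V \le \big(\frac{b'}{h}\big)^2$ happen, and $\delta$ is small enough, then we can claim that the following three properties hold: there is an exterior contour surrounding $\Lambda_1$; this contour cannot be shifted so as to fit inside $\frac{b'(1+\varepsilon/4)}{h} W$; there are no other exterior contours which can surround $\Lambda_1$ even after being shifted. In other words, we have the inclusion
\[
(E')^c \cap E \cap \Big\{\hat V \le \Big(\frac{b'}{h}\Big)^2\Big\} \subset \big(\bar E^h_{b',\varepsilon/4}\big)^c.
\]
So from the hypothesis that $\phi(b_1) < \phi(b_2)$ and Lemmas 3.4.3, 3.4.5 and 3.4.7 we have
\[
\mu_{\Lambda_2,-,h}\big((E')^c \mid E\big) \le C_1 \exp(-C_2/h), \tag{3.53}
\]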

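for some positive finite constants $C_1$ and $C_2$. Set
\[
F = \Big\{ \Lambda\Big(\frac{b_1(1+\varepsilon/2)}{h} W\Big) \stackrel{+}{\longleftrightarrow} \Big(\Lambda\Big(\frac{b_1(1+\varepsilon)}{h} W\Big)\Big)^c \Big\}.
\]
Partitioning $E'$ according to what the innermost negative-outside contour around $\Lambda_1$ is and using the FKG-Holley inequalities we obtain
\[
\mu_{\Lambda_2,-,h}(F \mid E') \le \mu_{\Lambda_2,-,h}(F).
\]
In order to have $\phi(b_1) < \phi(b_2)$ we must have $b_2 < B_0$. Clearly also, for some $\varepsilon' > 0$ which depends on $\varepsilon$ and $b_1$, $F \cap \Omega_{\Lambda_2,-} \subset \big\{W \ge \frac{\varepsilon'}{h}\big\}$. Therefore, if we choose a conveniently small $\varepsilon'' > 0$ and use Lemmas 3.4.2 and 3.4.5 we obtain
\[
\mu_{\Lambda_2,-,h}(F) \le \mu_{\Lambda_2,-,h}\Big(W \ge \frac{\varepsilon'}{h},\ \hat V \le \Big(\frac{\varepsilon''}{h}\Big)^2\Big) + \mu_{\Lambda_2,-,h}\Big(\hat V \ge \Big(\frac{\varepsilon''}{h}\Big)^2\Big) \le C_1 \exp(-C_2/h).
\]
Combining the various inequalities displayed above, we have
\[
\mu_{\Lambda_2,-,h}(F \mid E) \le \mu_{\Lambda_2,-,h}\big((E')^c \mid E\big) + \mu_{\Lambda_2,-,h}(F \mid E') \le C_1 \exp(-C_2/h). \tag{3.54}
\]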

To proceed, we have to introduce a new notion. We will call a (*)-circuit $\gamma = x_1, \dots, x_n$ a c-circuit iff there exists a configuration $\sigma \in \Omega_-$ with one contour, $\Gamma$, such that $\partial_+\Gamma = \{x_1, \dots, x_n\}$. (Note that our definition depends on the choice of the splitting rules, used in connection with transforming contours into closed curves.) The interior $\operatorname{Int}\gamma$ of a c-circuit $\gamma$ is by definition the interior of the corresponding contour $\Gamma$. Let $\gamma_1, \gamma_2$ be two c-circuits; we define the intersection c-circuits $\delta_k$ in the following way. Let $I = \operatorname{Int}\gamma_1 \cap \operatorname{Int}\gamma_2$ and let $I_k$ be the connected components of $I$ in the sense of the above mentioned splitting rules. Since $I$ is simply-connected, so are the $I_k$. Let $\delta_k$ be c-circuits such that $\operatorname{Int}\delta_k = I_k$.

For immediate use we need the following property of c-circuits: let $\gamma_1$ and $\gamma_2$ be two c-circuits, let $\delta$ be one of the intersection c-circuits, and let a configuration $\sigma$ be given such that both $\gamma_1$ and $\gamma_2$ are (+∗)-circuits of it. Then so is the circuit $\delta$. To see this, let us introduce the corresponding contours $\Gamma_1$, $\Gamma_2$ and $\Delta$, and let first a site $x$ of the c-circuit $\delta$ be at distance $\frac12$ from some bond $b$ of $\Delta$. This bond $b$ evidently belongs to at least one of the contours $\Gamma_1$, $\Gamma_2$, while $x$ is inside this contour, hence $\sigma(x) = +1$. In the remaining case $\operatorname{dist}(x, \Delta) = \frac{\sqrt2}{2}$, and in such a situation there are two adjacent bonds $b_1, b_2$ of $\Delta$ with $\operatorname{dist}(x, b_1) = \operatorname{dist}(x, b_2) = \frac{\sqrt2}{2}$, and two nearest neighbors $x_1, x_2$ of $x$, belonging to $\delta$, such that $\operatorname{dist}(x_i, b_i) = \frac12$, $i = 1, 2$. Consider another nearest neighbor $y$ of the sites $x_1, x_2$. It stays outside $\Delta$, hence it also stays outside at least one of the contours $\Gamma_1$, $\Gamma_2$, while all three sites $x, x_1, x_2$ are inside both of them. Hence both bonds $b_1, b_2$ belong to that contour, which again implies that $\sigma(x) = +1$.

Let $E''$ be the event that some c-circuit $\gamma$ which surrounds $\Lambda_1$ and is contained in $\frac{b_1(1+\varepsilon/2)}{h} W$ is a (+∗)-circuit. Let $\{E''_j\}$ be the partition of $E''$ according to what the innermost such circuit $\gamma$ is. (The preceding paragraph ensures the existence of such an



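innermost (+)-c-circuit.) Using the fact that for each $j$ we have $\Omega_{\Lambda_2,-} \cap E''_j \subset E$, the FKG-Holley inequalities and (3.53) we obtain

\[
\begin{aligned}
\mu_{\Lambda_2,-,h}(F \mid E) &\ge \sum_j \mu_{\Lambda_2,-,h}(F \mid E''_j \cap E)\, \mu_{\Lambda_2,-,h}(E''_j \mid E) \\
&= \sum_j \mu_{\Lambda_2,-,h}(F \mid E''_j)\, \mu_{\Lambda_2,-,h}(E''_j \mid E) \\
&\ge \mu_{\Lambda_2\setminus\Lambda_1,(+,-),h}(F) \sum_j \mu_{\Lambda_2,-,h}(E''_j \mid E) \\
&= \mu_{\Lambda_2\setminus\Lambda_1,(+,-),h}(F)\, \mu_{\Lambda_2,-,h}(E'' \mid E) \\
&\ge \mu_{\Lambda_2\setminus\Lambda_1,(+,-),h}(F)\, \mu_{\Lambda_2,-,h}(E' \mid E) \\
&\ge \tfrac12\, \mu_{\Lambda_2\setminus\Lambda_1,(+,-),h}(F),
\end{aligned} \tag{3.55}
\]

for small $h$. Comparing (3.54) with (3.55) we obtain

\[ \mu_{\Lambda_2\setminus\Lambda_1,(+,-),h}\big(P^+_{h,b_1}(\varepsilon)\big) \le \mu_{\Lambda_2\setminus\Lambda_1,(+,-),h}(F) \le 2C_1 \exp(-C_2/h). \]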

This completes the proof of part (a) of the lemma, and we turn to the proof of the first claim in part (b) of the lemma. Let this time $E$ be the event that there is a contour which surrounds $\frac{b_1(1-\varepsilon)}{h} W$ and $E'$ be the event that there is a contour which surrounds $\frac{b_2(1-\varepsilon/2)}{h} W$. With no loss of generality, we will suppose that $\varepsilon$ is small enough so that $E' \subset E$. From the hypothesis that $\phi(b_1) > \phi(b_2)$ and Lemmas 3.4.5 and 3.4.7 we have

\[ \mu_{\Lambda_2,-,h}\big((E')^c \mid E\big) \le C_1 \exp(-C_2/h), \tag{3.56} \]

for some positive finite constants $C_1$ and $C_2$. Therefore also

\[ \mu_{\Lambda_2,-,h}\big(P^-_{h,b_2}(\varepsilon) \mid E\big) \le C_1 \exp(-C_2/h). \tag{3.57} \]

W. Let this time E 00 be the event that there is a (+)-circuit which surrounds b1 (1−) h b1 (1−/2) 00 00 W . Let {Ej } partition E according to what the innermost and is contained in h such (+)-circuit is. − ())c according to what the (−)-cluster of (32 )c is and using the By partitioning (Ph,b 2 FKG-Holley inequalities combined with (1.17) and the graph-theoretic duality between connectivity and (*)-connectivity, in addition to (3.57), we obtain c − − 00 c () E + µ ) () P (E µ32 ,−,h (E 00 )c | E ≤ µ32 ,−,h Ph,b 3 ,−,h 2 h,b 2 2 ≤ C1 exp(−C2 /h).

(3.58 )

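Using the fact that for each $j$ we have $\Omega_{\Lambda_2,-} \cap E''_j \subset E$, the FKG-Holley inequalities and (3.58) we obtain

\[
\begin{aligned}
\mu_{\Lambda_2,-,h}\big(P^-_{h,b_2}(\varepsilon) \mid E\big) &\ge \sum_j \mu_{\Lambda_2,-,h}\big(P^-_{h,b_2}(\varepsilon) \mid E''_j \cap E\big)\, \mu_{\Lambda_2,-,h}(E''_j \mid E) \\
&= \sum_j \mu_{\Lambda_2,-,h}\big(P^-_{h,b_2}(\varepsilon) \mid E''_j\big)\, \mu_{\Lambda_2,-,h}(E''_j \mid E) \\
&\ge \mu_{\Lambda_2\setminus\Lambda_1,(+,-),h}\big(P^-_{h,b_2}(\varepsilon)\big) \sum_j \mu_{\Lambda_2,-,h}(E''_j \mid E) \\
&= \mu_{\Lambda_2\setminus\Lambda_1,(+,-),h}\big(P^-_{h,b_2}(\varepsilon)\big)\, \mu_{\Lambda_2,-,h}(E'' \mid E) \\
&\ge \tfrac12\, \mu_{\Lambda_2\setminus\Lambda_1,(+,-),h}\big(P^-_{h,b_2}(\varepsilon)\big),
\end{aligned} \tag{3.59}
\]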

for small $h$. Comparing (3.57) with (3.59) we obtain

\[ \mu_{\Lambda_2\setminus\Lambda_1,(+,-),h}\big(P^-_{h,b_2}(\varepsilon)\big) \le 2C_1 \exp(-C_2/h). \]

This completes the proof of part (b) of the lemma, and we turn to the proof of the first claim in part (c) of the lemma. Because in Sect. 3.4 we did not study systems with (+) boundary conditions, we will use a somewhat artificial approach to part (c), by reducing it to part (a), studied above. (In doing so we will proceed as someone who first heats up cold water in order to freeze it later, because he knows how to freeze hot water, but has never frozen cold water.) First note that by reversing all signs and then using the FKG-Holley inequalities, we obtain, for any $h' \ge -h$,

\[ \mu_{\Lambda_2\setminus\Lambda_1,(-,+),h}\big(P^-_{h,b_1}(\varepsilon)\big) = \mu_{\Lambda_2\setminus\Lambda_1,(+,-),-h}\big(P^+_{h,b_1}(\varepsilon)\big) \le \mu_{\Lambda_2\setminus\Lambda_1,(+,-),h'}\big(P^+_{h,b_1}(\varepsilon)\big). \]

Suppose that $h' > 0$ and set

\[ b'_1 = \frac{h'}{h}\, b_1 \quad \text{and} \quad b'_2 = \frac{h'}{h}\, b_2. \]

If $h'$ is small enough we have $0 < b'_1 < b'_2 < B_c$ and hence $\phi(b'_1) < \phi(b'_2)$. But $\Lambda_1$ and $\Lambda_2$ are l0-quasi-Wulff shaped with respective linear-size-parameters $\frac{b_1}{h} = \frac{b'_1}{h'}$ and $\frac{b_2}{h} = \frac{b'_2}{h'}$. So we can quote part (a) of the lemma to conclude the proof of part (c).

Acknowledgement. Over the years during which we have worked on this project, we have enjoyed the benefit of several stimulating conversations with various colleagues. We are especially thankful to A. van Enter, R. Kotecký, F. Martinelli, E. Olivieri and E. Scoppola. Part of this work was done while R.H.S. was visiting Rome in the Fall of 1994, and he thankfully acknowledges the warm hospitality of the Physics Department of the University of Rome I and of the Mathematics Departments of the Universities of Rome II and III.



Communicated by Ya. G. Sinai

Commun. Math. Phys. 194, 463 – 479 (1998)

A Steady-State Quantum Euler–Poisson System for Potential Flows

Ansgar Jüngel 1,2

1 Fachbereich Mathematik, Universität Rostock, Universitätsplatz 1, D-18055 Rostock, Germany.
2 Fachbereich Mathematik, Technische Universität Berlin, Straße des 17. Juni 136, D-10623 Berlin, Germany.
E-mail: [email protected]

Received: 25 February 1997 / Accepted: 29 October 1997

Abstract: A potential flow formulation of the hydrodynamic equations with the quantum Bohm potential for the particle density and the current density is given. The equations are selfconsistently coupled to Poisson’s equation for the electric potential. The stationary model consists of nonlinear elliptic equations of degenerate type with a quadratic growth of the gradient. Physically motivated Dirichlet boundary conditions are prescribed. The existence of solutions is proved under the assumption that the electric energy is small compared to the thermal energy. The proof is based on Leray-Schauder’s fixed point theorem and a truncation method. The main difficulty is to find a uniform lower bound for the density. For sufficiently large electric energy, there exists a generalized solution (of a simplified system), where the density vanishes at some point. Finally, uniqueness of the solution is shown for a sufficiently large scaled Planck constant.

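1. Introduction

The evolution of a fluid or gas is governed by the hydrodynamic equations [20]

\[ \frac{\partial n}{\partial t} + \operatorname{div} J = 0, \tag{1.1} \]
\[ \frac{\partial J}{\partial t} + \operatorname{div}\Big(\frac{J \otimes J}{n} + P\Big) - nF = W. \tag{1.2} \]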

The first equation expresses the conservation of mass, where $n$ is the particle density and $J$ the particle current density. The second equation expresses the conservation of momentum, where $P = (P_{ij})$ denotes the pressure tensor, $F$ the sum of the external forces, and $W$ the momentum relaxation term. The $i$th component of $\operatorname{div}(J \otimes J/n + P)$ is given by

\[ \sum_{j=1}^{d} \frac{\partial}{\partial x_j}\Big(\frac{J_i J_j}{n} + P_{ij}\Big), \]

where $d \ge 1$ is the space dimension. We consider an isothermal or isentropic quantum fluid of charged particles. In particular, the pressure tensor is assumed to be of the form $P = (\delta_{ij}\, r(n))$, where $\delta_{ij}$ is the Kronecker symbol. The pressure function $r$ is given by the particle density, i.e. $r(n) = T_o n$ in the isothermal case and $r(n) = T_o n^{\beta}$ in the isentropic case, where $\beta > 1$ and $T_o$ is a (scaled) temperature constant. In the isothermal case, the fluid temperature $T$ is equal to $T_o$; in the isentropic case we get $T = T_o n^{\beta-1}$. We assume that the external force is the gradient of the sum of the electric potential $V$, the external potential $V_{ext}$, and the quantum Bohm potential

\[ Q = \delta^2\, \frac{1}{\sqrt{n}}\, \Delta\sqrt{n}, \]

$\delta > 0$ being the scaled Planck constant. The external potential models (interior) quantum wells. Equations (1.1)–(1.2) are coupled to Poisson’s equation for the electric potential,

\[ \lambda^2 \Delta V = n - C(x). \tag{1.3} \]

Here, $\lambda$ denotes the scaled Debye length, and $C(x)$ models fixed background ions. Finally, the relaxation term is given by $W = -\alpha J$, where $\alpha > 0$ is the inverse of the scaled relaxation time. With these assumptions the quantum hydrodynamic equations can be formulated as

\[ \frac{\partial n}{\partial t} + \operatorname{div} J = 0, \tag{1.4} \]
\[ \frac{\partial J}{\partial t} + \operatorname{div}\Big(\frac{J \otimes J}{n}\Big) + \nabla r(n) - n\nabla(V + V_{ext}) - \delta^2 n \nabla\Big(\frac{\Delta\sqrt{n}}{\sqrt{n}}\Big) = -\alpha J. \tag{1.5} \]

The primary application of the quantum hydrodynamic equations to date has been in analyzing the flow of electrons in quantum semiconductor devices, like resonant tunneling diodes [10]. Very similar model equations have been used in other areas of physics, e.g. in superfluidity [22] and in superconductivity [6]. The quantum Euler–Poisson system (1.3)–(1.5) has been justified in [1, 10, 12, 13, 14]. It can be derived from a moment expansion of the Wigner–Boltzmann equations [10] or from a mixed state Schrödinger–Poisson system [12]. In particular, the single state Schrödinger–Poisson system

\[ i\varepsilon \frac{\partial\psi}{\partial t} = -\frac{\varepsilon^2}{2} \Delta\psi + (V + V_{ext})\psi, \qquad \lambda^2 \Delta V = |\psi|^2 - C(x) \]

is equivalent (for appropriate “smooth” solutions) to the irrotational zero temperature flow equations

\[ \frac{\partial n}{\partial t} + \operatorname{div} J = 0, \qquad \frac{\partial J}{\partial t} + \operatorname{div}\Big(\frac{J \otimes J}{n}\Big) - n\nabla(V + V_{ext}) - \frac{\varepsilon^2}{2}\, n \nabla\Big(\frac{\Delta\sqrt{n}}{\sqrt{n}}\Big) = 0 \]


and Poisson’s equation (1.3) (see [21, 14]). These equations are known as Madelung’s fluid equations [22]. The expression “irrotational” means that the current density can be written as $J = n\nabla S$, where $S$ is called a phase or quantum Fermi potential. The equivalence of the two models follows from the definitions $n = |\psi|^2$, $\psi = \sqrt{n}\exp(iS/\varepsilon)$ and $J = n\nabla S$. We note that for finite relaxation times, i.e. $\alpha > 0$, there is no equivalence to a Schrödinger–Poisson system, not even in the mixed state. In this paper we study the steady-state equations

\[ \operatorname{div} J = 0, \tag{1.6} \]
\[ \operatorname{div}\Big(\frac{J \otimes J}{n}\Big) + \nabla r(n) - n\nabla(V + V_{ext}) - \delta^2 n \nabla\Big(\frac{\Delta\sqrt{n}}{\sqrt{n}}\Big) = -\alpha J, \tag{1.7} \]
\[ \lambda^2 \Delta V = n - C \tag{1.8} \]

in a bounded domain $\Omega \subset \mathbb{R}^d$ ($d \ge 1$) occupied by the fluid. The main assumption is that we consider a potential flow, i.e. we assume that the particle current can be written as $J = n\nabla S$ with the quantum Fermi potential $S$ (see above). This means that the velocity $J/n = \nabla S$ is assumed to be irrotational. It is physically reasonable to assume that $n > 0$ holds in the device. Since $\operatorname{div}(J \otimes J/n) = \frac12 n \nabla|\nabla S|^2$ (using $\operatorname{div} J = 0$), we can rewrite (1.7) as

\[ n\nabla\Big(\frac12 |\nabla S|^2 + T_o h(n) - V - V_{ext} - \delta^2 \frac{\Delta\sqrt{n}}{\sqrt{n}}\Big) = -\alpha n \nabla S, \tag{1.9} \]

where

\[ h(n) = \frac{1}{T_o} \int_1^n \frac{r'(s)}{s}\, ds \tag{1.10} \]

is the enthalpy function. In the isothermal case, $h(n) = \log(n)$ holds; for isentropic states, we have $h(n) = (\beta/(\beta-1))(n^{\beta-1} - 1)$ for $\beta > 1$. Notice that the electric potential and the quantum Fermi potential are fixed only up to additive constants. Since $n > 0$, Eq. (1.9) implies

\[ \frac12 |\nabla S|^2 + T_o h(n) - V - V_{ext} - \delta^2 \frac{\Delta\sqrt{n}}{\sqrt{n}} + \alpha S = 0. \]

The integration constant can be assumed to be zero by choosing a reference point for the electric potential. For the analysis it is convenient to use $w = \sqrt{n}$ as a variable. Then (1.6), (1.8), and (1.9) can be written as

\[ \delta^2 \Delta w = w\big(\tfrac12 |\nabla S|^2 + T_o h(w^2) - V - V_{ext} + \alpha S\big), \tag{1.11} \]
\[ \operatorname{div}(w^2 \nabla S) = 0, \tag{1.12} \]
\[ \lambda^2 \Delta V = w^2 - C \quad \text{in } \Omega. \tag{1.13} \]
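As a quick numerical sanity check of the two closed forms of the enthalpy quoted after (1.10), the following sketch evaluates $h(n) = T_o^{-1}\int_1^n r'(s)/s\, ds$ by Simpson's rule and compares it with $\log n$ (isothermal) and $(\beta/(\beta-1))(n^{\beta-1}-1)$ (isentropic). The function names, the quadrature rule and the sample values of $T_o$, $\beta$ and $n$ are illustrative assumptions and are not taken from the paper.

```python
import math


def enthalpy_numeric(r_prime, n, T0, steps=2000):
    """Approximate h(n) = (1/T0) * int_1^n r'(s)/s ds by composite Simpson.

    r_prime: derivative of the pressure function r (assumed smooth on the
    interval between 1 and n); steps must be even.
    """
    a, b = 1.0, float(n)
    width = (b - a) / steps  # negative if n < 1, which gives the right sign

    def f(s):
        return r_prime(s) / s

    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(a + k * width)
    return total * width / 3.0 / T0


def h_isothermal(n):
    """Closed form for r(n) = T0 * n."""
    return math.log(n)


def h_isentropic(n, beta):
    """Closed form for r(n) = T0 * n**beta with beta > 1."""
    return beta / (beta - 1.0) * (n ** (beta - 1.0) - 1.0)


if __name__ == "__main__":
    T0, beta = 2.0, 1.5  # hypothetical sample parameters
    for n in (0.5, 3.0):
        iso = enthalpy_numeric(lambda s: T0, n, T0)
        isen = enthalpy_numeric(lambda s: T0 * beta * s ** (beta - 1.0), n, T0)
        print(n, iso - h_isothermal(n), isen - h_isentropic(n, beta))
```

Both closed forms vanish at $n = 1$, in line with the lower limit of integration in (1.10).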

Physically relevant boundary conditions for $w$, $S$, and $V$ will be specified later. The fluid models (1.6)–(1.8) or (1.11)–(1.13) have been studied in some special situations. For vanishing convective and quantum terms the problem (1.6)–(1.8) is known as the isentropic drift-diffusion model used for semiconductor devices [17, 18, 24]. The quantum drift-diffusion model (zero convective term, $\delta > 0$) has been investigated in [2]. The classical potential flow hydrodynamic equations ($\delta = 0$) are analyzed in, e.g. [5, 7, 9]. In the paper [29] the existence for the one-dimensional stationary quantum hydrodynamic equations (1.6)–(1.8) with non-standard boundary conditions is investigated. The steady-state system (1.11)–(1.13) in several space dimensions is studied here mathematically for the first time. In the analysis of (1.11)–(1.13), two main difficulties arise. The elliptic equation (1.12) is, a priori, of degenerate type with a non-standard (since non-local) degeneracy. We will show, however, that the solution $w$ is strictly positive and therefore (1.12) becomes strictly elliptic. Every solution $(w, S, V)$ of (1.11)–(1.13) with positive $w$ is a solution of the problem (1.6)–(1.8) with $n = w^2$, $J = n\nabla S$. Another difficulty arises due to the term $|\nabla S|^2$ on the right hand side of (1.11), stemming from the convective term in (1.7). This difficulty also appears in the thermistor problem (see [4, 27]). However, we have to apply techniques different from those used in the thermistor problem. To derive the boundary conditions we make physically relevant hypotheses. The boundary data are assumed to be the superposition of the thermal equilibrium functions $(n_{eq}, S_{eq}, V_{eq})$ and the applied potential $U(x)$:

\[ n = n_{eq}, \qquad S = S_{eq} + U, \qquad V = V_{eq} + U \qquad \text{on } \partial\Omega. \]

The thermal equilibrium state is defined by $J = 0$ or, equivalently, $S = \text{const.}$ (as $n > 0$). By fixing the reference point for $S$ (and $S_{eq}$) we can suppose that $S_{eq} = 0$. We assume further that the total space charge $C - n_{eq}$ vanishes at the boundary and that no quantum effects occur on $\partial\Omega$, i.e. $\Delta\sqrt{n_{eq}}/\sqrt{n_{eq}} = 0$. Finally, $V_{ext} = 0$ on $\partial\Omega$, since $V_{ext}$ is introduced to model interior quantum wells. We get from (1.11)

\[ 0 = \frac12 |\nabla S_{eq}|^2 + T_o h(n_{eq}) - V_{eq} + \alpha S_{eq} \]

or, since $S_{eq} = 0$,

\[ V_{eq} = T_o h(n_{eq}) \qquad \text{on } \partial\Omega. \]

Therefore we get the Dirichlet boundary conditions

\[ w = w_o, \qquad S = S_o, \qquad V = V_o \qquad \text{on } \partial\Omega \tag{1.14} \]

with

\[ w_o = \sqrt{C}, \qquad S_o = U, \qquad V_o = T_o h(C) + U. \tag{1.15} \]

It is the aim of this paper to show the existence and uniqueness of solutions to (1.11)–(1.14). More precisely, we prove in Sect. 2 that there exists a solution $(w, S, V)$ to (1.11)–(1.14) with $\nabla S \in L^\infty(\Omega)$ under the assumption that the temperature constant $T_o$ is large enough (isothermal and isentropic case) or that the boundary Fermi potential $S_o$ is small enough in some norm (isothermal case). This means that the electric energy which is connected with the applied potential $U$ (and hence with $S_o$) has to be much smaller than the thermal energy, in some sense. For the proof we first replace Eq. (1.12) by $\operatorname{div}(\max(m, w)^2 \nabla S) = 0$ ($m > 0$), which is uniformly elliptic. By means of Leray–Schauder’s fixed point theorem, the existence of a solution to the truncated problem will be shown. For this solution the density $w$ turns out to be strictly positive. So we get a solution to the original problem (1.11)–(1.14) by choosing the truncation parameter $m > 0$ smaller than the lower bound of $w$.



We need the smallness assumption on the data in the proof of the positivity of $w$. We do not know if the existence of solutions can be proved without this assumption. In the stationary thermistor problem, which is formally related to the quantum hydrodynamic model, it is well known that there exist solutions only if the applied potential is “small” enough (for the precise conditions see [27]). Furthermore, in the one-dimensional case it is possible to show the non-existence of solutions for “large” applied voltages [3, 4]. We recall that the thermistor problem reads

\[ \operatorname{div}(k(w)\nabla w) = -\sigma(w)|\nabla S|^2, \qquad \operatorname{div}(\sigma(w)\nabla S) = 0, \]

where $w$ and $S$ have here the meaning of the temperature and electric potential, respectively. In the simulation of semiconductor tunneling devices where a variant of the presented quantum fluid model has been used, numerical results indicate that the density can be extremely small compared to, e.g., the boundary density, for values of the applied voltage $U$ far from the thermal equilibrium (e.g. $n_{\min} = 10^{-4}$, $n|_{\partial\Omega} = 1$; see [10]). It is not clear if there is a lower bound for the density for all $U$ and, if yes, how it can be controlled. The positivity property of $w$ is connected to the regularity of $S$. Indeed, we show that $w$ is strictly positive if and only if the gradient of $S$ is bounded (Sect. 3). For ultra-small devices, Eqs. (1.11)–(1.14) can be replaced asymptotically by a simplified system [19]. We show that there exists a solution of this (one-dimensional) system, where the density vanishes at some point. However, the solution is discontinuous and therefore it is only defined in a generalized sense (see Sect. 3). There exists at most one solution to (1.11)–(1.14) if the scaled Planck constant $\delta$ is sufficiently large (Sect. 4). For $\delta = 0$, there exist situations where the problem has more than one solution [11].

2. Existence of Solutions

In this section we prove the existence of solutions to (1.11)–(1.14) with general Dirichlet boundary data. The following assumptions are needed:

(A1) $\Omega \subset \mathbb{R}^d$ ($d \ge 1$) is a bounded domain with boundary $\partial\Omega \in C^{1,1}$.
(A2) $h \in C^0(0, \infty)$ is a non-decreasing function satisfying $\lim_{x\to\infty} h(x) = +\infty$ and $\lim_{x\to 0+} x h(x^2) < +\infty$.
(A3) $w_o \in W^{2,p}(\Omega)$ for $p > d/2$, $\inf_{\partial\Omega} w_o > 0$; $S_o \in C^{1,\gamma}(\overline{\Omega})$ with $\gamma = 2 - d/p$; $V_o \in H^1(\Omega) \cap L^\infty(\Omega)$; $C, V_{ext} \in L^\infty(\Omega)$.

The constants $\alpha$, $\delta$, $\lambda$, and $T_o$ are assumed to be positive. We call a function $h \in C^0(0, \infty)$ satisfying (A2) isothermal if $h(0+) = -\infty$ and isentropic if $h(0+) < 0$. The enthalpy function $h(s) = \log(s)$ is isothermal. Furthermore, the enthalpy $h(s) = (\beta/(\beta-1))(s^{\beta-1} - 1)$ is isentropic. The main results of this section are the following theorems:


Theorem 2.1. Let (A1)–(A3) hold and let $h$ be isothermal. Then there exists $\varepsilon > 0$ such that if $\|S_o\|_{C^{1,\gamma}(\overline{\Omega})} \le \varepsilon$ or $T_o \ge 1/\varepsilon$, then there exists a solution $(w, S, V)$ of (1.11)–(1.14) satisfying, for some $\underline{w} > 0$,

\[ w \in W^{2,p}(\Omega), \qquad S \in C^{1,\gamma}(\overline{\Omega}), \qquad V \in H^1(\Omega) \cap L^\infty(\Omega), \tag{2.1} \]
\[ w(x) \ge \underline{w} > 0 \quad \text{in } \Omega. \tag{2.2} \]

Theorem 2.2. Let (A1)–(A3) hold and let $h$ be isentropic. Then there exists $\varepsilon > 0$ such that if $T_o \ge 1/\varepsilon$ then there exists a solution $(w, S, V)$ of (1.11)–(1.14) satisfying (2.1)–(2.2).

Notice that we are assuming boundary data which are independent of the parameter $T_o$. The case of the boundary functions (1.15) can also be treated, see Remark 2.5. First we prove that there exists a solution of a truncated system. For this, define $s_K = \max(0, \min(s, K))$ and $t_m(s) = \max(m, s)$ for $s \in \mathbb{R}$ and $0 < m \le K$. Throughout this section (A1)–(A3) are assumed to hold. Consider

\[ \delta^2 \Delta w = w_K\big(\tfrac12 |\nabla S|^2 + T_o h(w_K^2) - V - V_{ext} + \alpha S\big), \tag{2.3} \]
\[ \operatorname{div}(t_m(w_K)^2 \nabla S) = 0, \tag{2.4} \]
\[ \lambda^2 \Delta V = w_K w - C \quad \text{in } \Omega, \tag{2.5} \]
\[ w = w_o, \qquad S = S_o, \qquad V = V_o \quad \text{on } \partial\Omega. \tag{2.6} \]
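The two truncations can be sketched in a few lines. The example below is only an illustration (the helper names `clip_K`, `t_m` and `diffusion_coefficient` are hypothetical); it shows that the coefficient $t_m(w_K)^2$ in (2.4) always lies between $m^2$ and $K^2$, which is what makes the truncated equation uniformly elliptic, and that it coincides with $w^2$ whenever $m \le w \le K$.

```python
def clip_K(s, K):
    """s_K = max(0, min(s, K)): cut s off to the interval [0, K]."""
    return max(0.0, min(s, K))


def t_m(s, m):
    """t_m(s) = max(m, s): truncation from below at the level m > 0."""
    return max(m, s)


def diffusion_coefficient(w, m, K):
    """Coefficient t_m(w_K)^2 appearing in the truncated equation (2.4)."""
    return t_m(clip_K(w, K), m) ** 2


if __name__ == "__main__":
    m, K = 0.1, 2.0  # hypothetical truncation levels with 0 < m <= K
    for w in (-1.0, 0.0, 0.05, 1.0, 5.0):
        print(w, clip_K(w, K), diffusion_coefficient(w, m, K))
```

Once the a priori bounds give $\underline{w} \le w \le \overline{w}$, choosing $m \le \underline{w}$ and $K > \overline{w}$ makes both truncations inactive, so a solution of the truncated system also solves the original one.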

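The proof of existence of solutions to this truncated system is based on the following a priori estimates.

Lemma 2.3. Let $(w, S, V)$ be a weak solution to (2.3)–(2.6). Then there exist constants $\overline{w}$, $\underline{S}$, $\overline{S}$, $\underline{V}$, $\overline{V}$, and $c_1(m)$ such that

\[ 0 \le w(x) \le \overline{w}, \qquad -\underline{S} \le S(x) \le \overline{S}, \qquad -\underline{V} \le V(x) \le \overline{V} \quad \text{in } \Omega, \tag{2.7} \]
\[ \|w\|_{2,p,\Omega} \le c_1(m). \tag{2.8} \]

Here, $\|\cdot\|_{2,p,\Omega}$ denotes the norm of the Sobolev space $W^{2,p}(\Omega)$. The precise dependence of the above bounds on the data is needed in the uniqueness proof in Sect. 4 and is stated here for future reference:

\[ \underline{S} = -\inf_{\partial\Omega} S_o, \qquad \overline{S} = \sup_{\partial\Omega} S_o, \tag{2.9} \]
\[ \overline{V} = \sup_{\partial\Omega} V_o + c(\Omega, d, \lambda)\|C\|_{0,\infty,\Omega}, \tag{2.10} \]
\[ \overline{w} = \max\big(\|w_o\|_{0,\infty,\partial\Omega},\ w_1(\overline{V}, \underline{S}, T_o, h)\big), \tag{2.11} \]
\[ \underline{V} = -\inf_{\partial\Omega} V_o + c(\Omega, d, \lambda)\big(\|C\|_{0,\infty,\Omega} + \overline{w}^2\big), \tag{2.12} \]

where $c(\Omega, d, \lambda) > 0$ and $w_1 = w_1(\overline{V}, \underline{S}, T_o, h) > 0$ is such that $h(w_1^2) \ge (\overline{V} + \|V_{ext}\|_{0,\infty,\Omega} + \alpha\underline{S})/T_o$.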



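Proof. First step. $L^\infty$ estimates for $w$, $S$, and $V$. First observe that, using $w^- = \min(0, w) \in H_0^1(\Omega)$ as test function in (2.3), it follows

\[ w(x) \ge 0 \quad \text{a.e. in } \Omega. \tag{2.13} \]

The maximum principle gives the bounds

\[ -\underline{S} = \inf_{\partial\Omega} S_o \le S(x) \le \sup_{\partial\Omega} S_o = \overline{S} \quad \text{in } \Omega. \tag{2.14} \]

Next we show that $V$ is uniformly bounded in $L^\infty(\Omega)$. Let $U_o = \sup_{\partial\Omega} V_o$, $U \ge U_o$, and take $(V - U)^+ = \max(0, V - U)$ as a test function in (2.5). Then

\[
\begin{aligned}
\lambda^2 \int_\Omega |\nabla(V - U)^+|^2 &= -\int_\Omega w_K w (V - U)^+ + \int_\Omega C (V - U)^+ \\
&\le \int_\Omega C (V - U)^+ \le c\, \|(V - U)^+\|_{1,2,\Omega}\, (\operatorname{meas}(V > U))^{1/2}.
\end{aligned} \tag{2.15}
\]

Here and in the following, $c$, $c_i$ denote positive constants only depending on the given data. Let $r > 2$ be such that the embedding $H^1(\Omega) \hookrightarrow L^r(\Omega)$ is continuous. It is well known that for $W > U$,

\[ (\operatorname{meas}(V > W))^{1/r} (W - U) \le c(\Omega)\, \|(V - U)^+\|_{1,2,\Omega} \]

holds [25, Ch. 4]. Therefore we get from (2.15), for $W > U \ge U_o$,

\[ \operatorname{meas}(V > W) \le \frac{c}{(W - U)^r}\, (\operatorname{meas}(V > U))^{r/2}. \]

Since $r/2 > 1$, we can apply Stampacchia’s Lemma (see [26, Ch. 2.3] or [25, Ch. 4]) to get

\[ V(x) \le \overline{V} \overset{\text{def}}{=} U_o + c(\Omega, d, \lambda)\|C\|_{0,\infty,\Omega}, \tag{2.16} \]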

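where $c(\Omega, d, \lambda) > 0$. Before we can find a lower bound for $V$, we prove that $w$ is bounded from above (independently of $K$). For this set $\overline{V}_{ext} = \|V_{ext}\|_{0,\infty,\Omega}$, let $\overline{w} \ge \|w_o\|_{0,\infty,\partial\Omega}$ and $K > \overline{w}$ and use $(w - \overline{w})^+$ as a test function in (2.3):

\[
\begin{aligned}
\delta^2 \int_\Omega |\nabla(w - \overline{w})^+|^2 &= -\frac12 \int_\Omega w_K (w - \overline{w})^+ |\nabla S|^2 - \int_\Omega w_K (w - \overline{w})^+\, T_o\big(h(w^2) - h(\overline{w}^2)\big) \\
&\quad + \int_\Omega w_K (w - \overline{w})^+ \big(V + V_{ext} - T_o h(\overline{w}^2) - \alpha S\big) \\
&\le \int_\Omega w_K (w - \overline{w})^+ \big(\overline{V} + \overline{V}_{ext} - T_o h(\overline{w}^2) + \alpha\underline{S}\big),
\end{aligned}
\]

using (A2), (2.14) and (2.16). Since $h(s) \to \infty$ as $s \to \infty$, there exists $\overline{w} \ge \|w_o\|_{0,\infty,\partial\Omega}$ such that $h(\overline{w}^2) \ge (\overline{V} + \overline{V}_{ext} + \alpha\underline{S})/T_o$. This implies

\[ w(x) \le \overline{w} \quad \text{a.e. in } \Omega. \tag{2.17} \]

Now use $(-V - U)^+$ with $U \ge U_o = -\inf_{\partial\Omega} V_o$ as test function in (2.5) to get

\[ \lambda^2 \int_\Omega |\nabla(-V - U)^+|^2 \le \int_\Omega (w_K w - C)(-V - U)^+ \le c \int_\Omega (-V - U)^+, \]

where $c > 0$ depends on $C$ and $\overline{w}$. Using Stampacchia’s method as above allows us to conclude that

\[ V(x) \ge -\underline{V} \overset{\text{def}}{=} -U_o - c(\Omega, d, \lambda)\big(\|C\|_{0,\infty,\Omega} + \overline{w}^2\big). \]

Second step. $H^1$ estimate for $w$. Use $w - w_o$ as test function in (2.3) to obtain

\[
\begin{aligned}
\delta^2 \int_\Omega |\nabla w|^2 &= \delta^2 \int_\Omega \nabla w \cdot \nabla w_o - \frac12 \int_\Omega w_K w |\nabla S|^2 + \frac12 \int_\Omega w_K w_o |\nabla S|^2 \\
&\quad - T_o \int_\Omega w_K (w - w_o) h(w_K^2) + \int_\Omega w_K w V - \int_\Omega w_K w_o V \\
&\quad + \int_\Omega w_K (w - w_o) V_{ext} - \alpha \int_\Omega w_K (w - w_o) S.
\end{aligned} \tag{2.18}
\]

With the test functions $V - V_o$ and $S - S_o$ in (2.5) and (2.4), respectively, we get on the one hand

\[
\begin{aligned}
\int_\Omega w_K w V &= -\lambda^2 \int_\Omega |\nabla V|^2 + \lambda^2 \int_\Omega \nabla V \cdot \nabla V_o + \int_\Omega V_o w_K w + \int_\Omega C (V - V_o) \\
&\le -\frac{\lambda^2}{2} \int_\Omega |\nabla V|^2 + \lambda^2 \int_\Omega |\nabla V_o|^2 + c \int_\Omega w_K w + c,
\end{aligned}
\]

using Young’s and Poincaré’s inequalities; on the other hand, we have for $K > \overline{w}$,

\[ m \int_\Omega w_K |\nabla S|^2 \le \int_\Omega t_m(w_K)^2 |\nabla S|^2 \le \int_\Omega t_m(w_K)^2 |\nabla S_o|^2 \le \overline{w}^2 \int_\Omega |\nabla S_o|^2. \]

Therefore we can estimate (2.18) as follows:

\[
\begin{aligned}
\frac{\delta^2}{2} \int_\Omega |\nabla w|^2 &\le \frac{\delta^2}{2} \int_\Omega |\nabla w_o|^2 + c(w_o, S_o)\frac{\overline{w}^2}{m} - T_o \int_\Omega w_K w\, h(w_K^2) + c \int_\Omega |w_K h(w_K^2)| \\
&\quad - \frac{\lambda^2}{4} \int_\Omega |\nabla V|^2 + c(\lambda) \int_\Omega w_K^2 + c \int_\Omega w_K w + c \\
&\le c(m, \overline{w}).
\end{aligned}
\]

Third step. $W^{2,p}$ estimate for $w$. The following elliptic estimate holds [15, Thm. 8.33 and 8.34]: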



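\[ \|S\|_{C^{1,\varepsilon}(\overline{\Omega})} \le c_2 \|S_o\|_{C^{1,\varepsilon}(\overline{\Omega})} \quad \text{for all } 0 < \varepsilon \le \gamma, \tag{2.19} \]

where $c_2 > 0$ depends on $\Omega$, $d$, $m$, and the $C^{0,\varepsilon}(\overline{\Omega})$ norm of $t_m(w_K)^2$. It can be seen from the proof of this estimate that $c_2 = c_3(\Omega, d)\, c_4(m)\, \|t_m(w_K)^2\|_{C^{0,\varepsilon}(\overline{\Omega})}$. Furthermore, we have the elliptic estimate

\[ \|w\|_{2,p,\Omega} \le c_5 \Big( \|w_o\|_{2,p,\Omega} + \big\|w\big(\tfrac12 |\nabla S|^2 + h(w_K^2) - V - V_{ext} + \alpha S\big)\big\|_{0,p,\Omega} \Big), \]

where $c_5 > 0$ depends on $\Omega$, $d$ and $\delta$ [15, 9.15 and 9.17]. Hence, using (2.19) for $\varepsilon = \gamma/2$,

\[
\begin{aligned}
\delta^2 \|w\|_{2,p,\Omega} &\le c\big(1 + \|S\|_{1,2p,\Omega}^2\big) \le c\big(1 + \|S\|_{C^{1,\gamma/2}(\overline{\Omega})}^2\big) \le c\big(1 + \|w^2\|_{C^{0,\gamma/2}(\overline{\Omega})}^2\big) \\
&\le c\big(1 + \overline{w}^2 \|w\|_{C^{0,\gamma/2}(\overline{\Omega})}^2\big) \le c\big(1 + \|w\|_{C^{0,\gamma}(\overline{\Omega})}\big) \le \frac{\delta^2}{2} \|w\|_{2,p,\Omega} + c(\delta, m).
\end{aligned}
\]

In the last step we have used the interpolation inequality

\[ \|w\|_{C^{0,\gamma}(\overline{\Omega})} \le \varepsilon \|w\|_{2,p,\Omega} + c(\varepsilon) \|w\|_{0,\infty,\Omega}, \]

which follows from the facts that the embedding $W^{2,p}(\Omega) \hookrightarrow C^{0,\gamma}(\overline{\Omega})$ is compact (since $p > d/2$) and the embedding $C^{0,\gamma}(\overline{\Omega}) \hookrightarrow L^\infty(\Omega)$ is continuous [28, p. 365]. The constant $c(\delta, m)$ in the above estimate depends on $\Omega$, $d$, $\delta$, $m$, $\overline{w}$, and $\overline{V}$. We obtain finally $\|w\|_{2,p,\Omega} \le 2c(\delta, m)/\delta^2 = c_1(m)$.

Lemma 2.4. There exists a solution $(w, S, V)$ of

\[ \delta^2 \Delta w = w\big(\tfrac12 |\nabla S|^2 + T_o h(w^2) - V - V_{ext} + \alpha S\big), \tag{2.20} \]
\[ \operatorname{div}(t_m(w)^2 \nabla S) = 0, \tag{2.21} \]
\[ \lambda^2 \Delta V = w^2 - C \quad \text{in } \Omega, \tag{2.22} \]
\[ w = w_o, \qquad S = S_o, \qquad V = V_o \quad \text{on } \partial\Omega, \tag{2.23} \]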

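such that $w \in W^{2,p}(\Omega)$, $S \in C^{1,\gamma}(\overline{\Omega})$, $V \in H^1(\Omega) \cap L^\infty(\Omega)$, and $w(x) \ge 0$ in $\Omega$.

Proof. We use a fixed point argument. Let $u \in C^{0,\gamma}(\overline{\Omega})$. Let $V \in H^1(\Omega)$ be the unique solution of

\[ \lambda^2 \Delta V = u_K u - C \quad \text{in } \Omega, \qquad V = V_o \quad \text{on } \partial\Omega, \]

and let $S \in H^1(\Omega)$ be the unique solution of

\[ \operatorname{div}(t_m(u_K)^2 \nabla S) = 0 \quad \text{in } \Omega, \qquad S = S_o \quad \text{on } \partial\Omega. \]

As in the proof of Lemma 2.3, we see that $V \in L^\infty(\Omega)$. Since $t_m(u_K)^2$ is Hölder continuous of order $\gamma$, we get $S \in C^{1,\gamma}(\overline{\Omega})$ [15, Thm. 8.34]. Finally, let $w \in H^1(\Omega)$ be the unique solution of

\[ \delta^2 \Delta w = \sigma u_K\big(\tfrac12 |\nabla S|^2 + T_o h(u_K^2) - V - V_{ext} + \alpha S\big) \quad \text{in } \Omega, \qquad w = \sigma w_o \quad \text{on } \partial\Omega, \]

with $\sigma \in [0, 1]$. Since the right-hand side of this elliptic problem lies in $L^\infty(\Omega)$, we conclude $w \in W^{2,p}(\Omega)$ and, since $p > d/2$, $w \in C^{0,\gamma}(\overline{\Omega})$. Thus the fixed point operator $T : C^{0,\gamma}(\overline{\Omega}) \times [0, 1] \to C^{0,\gamma}(\overline{\Omega})$, $(u, \sigma) \mapsto w$, is well defined. It holds $T(u, 0) = 0$ for $u \in C^{0,\gamma}(\overline{\Omega})$. Estimates similar to those in the proof of Lemma 2.3 give the bound $\|w\|_{2,p,\Omega} \le c$ for all $w \in C^{0,\gamma}(\overline{\Omega})$ satisfying $T(w, \sigma) = w$, where $c > 0$ is independent of $w$ and $\sigma$. Standard arguments show that $T$ is continuous and compact, noting the compactness of the embedding $W^{2,p}(\Omega) \hookrightarrow C^{0,\gamma}(\overline{\Omega})$. We can apply Leray–Schauder’s fixed point theorem to get a solution $(w, S, V)$ of (2.3)–(2.6). Choosing $K > \overline{w}$ (see (2.7)), this triple is also a solution of (2.20)–(2.23).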

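Proof of Theorems 2.1 and 2.2. We rewrite the elliptic estimate (2.19) for $\varepsilon = \gamma$:

\[ \|S\|_{C^{1,\gamma}(\overline{\Omega})} \le c_3(\Omega, d)\, c_4(m)\, \|t_m(w)^2\|_{C^{0,\gamma}(\overline{\Omega})}\, \|S_o\|_{C^{1,\gamma}(\overline{\Omega})}. \]

It holds $c_4(m) \to \infty$ as $m \to 0+$. Now,

\[ \|t_m(w)^2\|_{C^{0,\gamma}(\overline{\Omega})} \le c(\overline{w}) \|w\|_{C^{0,\gamma}(\overline{\Omega})} \le c(\overline{w}) \|w\|_{2,p,\Omega} \le c_5. \]

From the proof of Lemma 2.3 it can be seen that $c_5 = c_6(\overline{w})\, c_7(m)$ with $c_6(\overline{w}) \to \infty$ as $\overline{w} \to \infty$ and $c_7(m) \to \infty$ as $m \to 0+$. The bound $\overline{w}$ depends on $T_o$ such that $\overline{w} \to \infty$ as $T_o \to 0+$ (see (2.11)). Thus we can write

\[ \|S\|_{C^{1,\gamma}(\overline{\Omega})}^2 \le \frac{c_0}{f(T_o)\, g(m)}\, \|S_o\|_{C^{1,\gamma}(\overline{\Omega})}^2, \tag{2.24} \]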

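where $f$ and $g$ are positive continuous non-decreasing functions in $[0, \infty)$ such that $f(T_o) \to 0$ as $T_o \to 0+$, $f(T_o) > 0$ as $T_o \to \infty$, and $g(m) \to 0$ as $m \to 0+$. The constant $c_0 > 0$ does not depend on $S_o$, $T_o$, or $m$. Let $0 < m < \inf_{\partial\Omega} w_o$ and take $(w - m)^- = \min(0, w - m)$ as test function in (2.20). Then, using (A2), (2.24), and (2.7),

\[
\begin{aligned}
\delta^2 \int_\Omega |\nabla(w - m)^-|^2 &= -\int_\Omega w (w - m)^-\, T_o\big(h(w^2) - h(m^2)\big) \\
&\quad - \int_\Omega w (w - m)^- \big(\tfrac12 |\nabla S|^2 + T_o h(m^2) - V - V_{ext} + \alpha S\big) \\
&\le \int_\Omega w\big(-(w - m)^-\big) \Big( \frac{c_0 \|S_o\|_{C^{1,\gamma}(\overline{\Omega})}^2}{f(T_o)\, g(m)} + T_o h(m^2) + \underline{V} + \overline{V}_{ext} + \alpha\overline{S} \Big),
\end{aligned}
\]

where $\overline{V}_{ext} = \|V_{ext}\|_{0,\infty,\Omega}$. The constant $c_8(T_o) \overset{\text{def}}{=} \underline{V} + \overline{V}_{ext} + \alpha\overline{S}$ depends on $T_o$ through $\underline{V}$ such that $c_8(T_o)$ can be taken to be non-increasing as $T_o$ increases (see (2.11)–(2.12)). Then

\[ \delta^2 \int_\Omega |\nabla(w - m)^-|^2 \le \Big( I_1 + \frac{T_o I_2}{g(m)} \Big) \int_\Omega w\big(-(w - m)^-\big), \tag{2.25} \]

where

\[ I_1 = \frac12 T_o h(m^2) + c_8(T_o), \qquad I_2 = \frac{c_0}{T_o f(T_o)} \|S_o\|_{C^{1,\gamma}(\overline{\Omega})}^2 + \frac12 g(m) h(m^2). \]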

First case: Let $h$ be isothermal. For arbitrary $T_o > 0$, let $\underline{w} \in (0, \inf_{\partial\Omega} w_o)$ be such that $h(\underline{w}^2) \le -2c_8(T_o)/T_o$ (using (A2)). This implies, for $m = \underline{w}$, that $I_1 \le 0$. Set $A = -\frac12 g(\underline{w}) h(\underline{w}^2) > 0$ and $\varepsilon^2 = A T_o f(T_o)/c_0$. Then, for $m = \underline{w}$ and $\|S_o\|_{C^{1,\gamma}(\overline{\Omega})} \le \varepsilon$, we obtain

\[ I_2 \le \frac{c_0}{T_o f(T_o)}\, \varepsilon^2 - A \le 0. \]

Taking into account (2.25) we conclude that $w \ge \underline{w}$ in $\Omega$. For arbitrary $S_o$, take $m = \underline{w} \in (0, \inf_{\partial\Omega} w_o)$ such that $h(\underline{w}^2) \le -2c_8(1)$ and let $A$ be defined as above. Choose $T_1 \ge 1$ such that $T_1 f(T_1) \ge c_0 \|S_o\|_{C^{1,\gamma}(\overline{\Omega})}^2 / A$. Then we have for all $T_o \ge T_1$, since $T \mapsto c_8(T)/T$ is non-increasing,

\[ h(\underline{w}^2) \le -2c_8(1) \le -2c_8(T_o)/T_o, \]

and hence $I_1 \le 0$. Since the function $T \mapsto T f(T)$ is increasing, we obtain

\[ \frac{c_0}{T_o f(T_o)} \|S_o\|_{C^{1,\gamma}(\overline{\Omega})}^2 \le \frac{c_0}{T_1 f(T_1)} \|S_o\|_{C^{1,\gamma}(\overline{\Omega})}^2 \le A, \]

by definition of $T_1$. This implies $I_2 \le 0$ and $w \ge \underline{w}$ in $\Omega$.

Second case: Let $h$ be isentropic. Let $\underline{w} \in (0, \inf_{\partial\Omega} w_o)$ be such that $h(\underline{w}^2) < 0$, and let $T_2 \ge 1$ be such that $T_2 \ge -2c_8(1)/h(\underline{w}^2) > 0$ and $T_2 f(T_2) \ge c_0 \|S_o\|_{C^{1,\gamma}(\overline{\Omega})}^2 / A$, where $A$ is defined as in the first case. Taking $m = \underline{w}$ and $T_o \ge T_2$, we get $I_1 \le 0$ and $I_2 \le 0$. We conclude the proof by taking the truncation parameter $m = \underline{w}$ in (2.21).

Remark 2.5. We have assumed that the boundary functions $w_o$, $S_o$, and $V_o$ do not depend on the parameters, e.g. $T_o$. However, if we take $V_o = T_o h(C) + U(x)$ (see (1.15)), the above arguments also apply. Indeed, let $C_o > 0$ be such that $h(C_o) = 0$ and choose a scaling of the variables and functions such that $\inf_{\partial\Omega} C \ge C_o$ (this does not affect $T_o$). Then, for isothermal or isentropic functions, $h(\inf_{\partial\Omega} C) \ge 0$. This implies $\underline{V} = -T_o \inf_{\partial\Omega} h(C) + U \le U$, and the constant $c_8(T_o)$ can be taken non-increasing as $T_o$ increases. Note that now $\overline{V}$ also depends on $T_o$, but in such a way that the property $\overline{w} \to \infty$ as $T_o \to 0+$ remains valid.

Remark 2.6. Using a relaxation scaling as in [23], i.e. defining the rescaled variables $\hat{n} = n$, $\hat{S} = \alpha S = S/\tau$, $\hat{V} = V$, where $\tau = 1/\alpha$ is the scaled relaxation time, we get from (1.11)–(1.12) the equations

\[ \delta^2 \Delta\hat{w} = \hat{w}\Big(\frac{\tau^2}{2} |\nabla\hat{S}|^2 + T_o h(\hat{w}^2) - \hat{V} - V_{ext} + \hat{S}\Big), \qquad \operatorname{div}(\hat{w}^2 \nabla\hat{S}) = 0. \]


One may expect that the diffusive term $T_o h(\hat w^2)$ dominates the convective term $(\tau^2/2)|\nabla \hat S|^2$ for sufficiently small $\tau > 0$, which would give the existence of solutions by the presented method, for fixed $T_o$. However, we also have to transform the boundary function $\hat S_o = S_o/\tau = U/\tau$, and it is easy to see that then the convective term is not necessarily “small” for small relaxation times. Choosing different boundary conditions, namely $S_o = U/\alpha$, the above rescaling gives $\hat S_o = U$, and the estimates of the presented proofs lead to an existence result for sufficiently small $\tau > 0$ (see [8]).

Remark 2.7. It would be very interesting to study the small dispersion limit $\delta \to 0$ and the relaxation time limit $\tau \to 0$. However, the $W^{2,p}(\Omega)$ norm of $w$ and therefore the lower bound $\underline{w}$ depend on $\delta$ such that $\underline{w} \to 0$ as $\delta \to 0$. Moreover, it seems difficult to identify the limits of the nonlinear functions. Concerning the relaxation time limit, it can be seen that $c_8(T_o) \to \infty$ as $\tau \to 0$ (see the proof of Theorems 2.1 and 2.2), and hence $\underline{w} \to 0$ as $\tau \to 0$. Taking the boundary conditions discussed in Remark 2.6, we expect, however, that the limit $\tau \to 0$ can be performed (see [8]). For the small dispersion limit in thermal equilibrium states, we refer to [11]. The relaxation time limit $\tau \to 0$ of the hydrodynamic equations (i.e. $\delta = 0$ in (1.7)) is performed in [23].

3. Positivity and Non-Positivity Properties

We show in this section that the existence of a uniform lower bound for the density $w$ is related to the regularity of the gradient of $S$. Furthermore, we construct a generalized one-dimensional solution of a simplified problem, where the density $w$ vanishes at some point. For this solution, the Fermi potential $S$ is discontinuous. Let (A1)–(A3) hold and let $h$ be isothermal or isentropic.

Proposition 3.1. Let $(w, S, V) \in (H^1(\Omega) \cap L^\infty(\Omega))^3$ be a weak solution to (1.11)–(1.14) with $S \in W^{1,\infty}(\Omega)$. Then there exists $m > 0$ such that
$$w(x) \ge m > 0 \quad \text{in } \Omega.$$

Proof. First let $h$ be isentropic. Then the function $f = \frac12 |\nabla S|^2 + T_o h(w^2) - V - V_{\rm ext} + \alpha S$ is bounded in $\Omega$. Since $w \ge 0$, we can apply Harnack's inequality [15, p. 199] to $\delta^2 \Delta w = wf$ to conclude that for all subsets $\omega \subset\subset \Omega$,
$$\sup_\omega w \le c(\omega) \inf_\omega w. \tag{3.1}$$
Now suppose that $w$ vanishes in some non-empty set $\omega_o \subset\subset \Omega$. Let $\omega_n \subset\subset \Omega$ be a sequence of sets with $\omega_o \subset \omega_n$ and $\omega_n \to \Omega$ as $n \to \infty$ in the set theoretic sense. Then (3.1) gives $w = 0$ in $\omega_n$ and, in the limit $n \to \infty$, $w = 0$ in $\Omega$. This contradicts the positivity of $w_o$ on $\partial\Omega$.

If $h$ is isothermal, we proceed as in [2]. Consider $\omega_o = \{w = 0\} \subset \Omega$. Since $wf \in L^\infty(\Omega)$, $w$ is continuous, hence $\omega_o$ is relatively closed in $\Omega$. Suppose that $\omega_o$ is nonvoid and choose $x_o \in \omega_o$. Then $wf \le 0$ in a ball $B(x_o) \subset \Omega$ and $\Delta w \le 0$ in $B(x_o)$. As the function $w$ assumes its nonnegative infimum 0 in $B(x_o)$, it follows that $w = 0$ in $B(x_o)$. Thus $\omega_o$ is relatively open in $\Omega$. This implies $\omega_o = \Omega$ or $\omega_o = \emptyset$. By the positivity of $w_o$, we conclude that $w > 0$ in $\Omega$. The existence of a uniform lower bound $m > 0$ for $w$ now follows from the continuity of $w$ in $\Omega$.

Corollary 3.2. Let $(w, S, V)$ be a weak solution to (1.11)–(1.14). Then
$$w(x) \ge m > 0 \quad \text{a.e. in } \Omega \qquad \text{if and only if} \qquad S \in W^{1,\infty}(\Omega).$$

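Now we consider the following simplified system in $\Omega = (0, 1) \subset \mathbb{R}$:
$$\delta^2 w_{xx} = \tfrac12 w (S_x)^2 \quad \text{in } \Omega, \qquad w(0) = 1, \quad w(1) = 1, \tag{3.2}$$
$$J_x = (w^2 S_x)_x = 0 \quad \text{in } \Omega, \qquad S(0) = 0, \quad S(1) = U_o. \tag{3.3}$$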

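It can be seen that Eqs. (1.11)–(1.12) reduce to (3.2)–(3.3) for very small domains (after an appropriate asymptotic limit; see [19]). We only consider $U_o \in [0, \sqrt{2}\,\delta\pi]$. To solve (3.2)–(3.3) we have to distinguish the cases $U_o < \sqrt{2}\,\delta\pi$ and $U_o = \sqrt{2}\,\delta\pi$. We say that $(w, S) \in H^1(\Omega) \times L^\infty(\Omega)$ is a generalized solution to (3.2)–(3.3) with $S(1) = U_o$ if there exists a sequence of weak solutions $(w_\varepsilon, S_\varepsilon) \in (H^1(\Omega))^2$ of (3.2)–(3.3) with $S(1) = U_\varepsilon$ and $U_\varepsilon \to U_o$ as $\varepsilon \to 0$ such that
$$w = \lim_{\varepsilon \to 0} w_\varepsilon, \qquad S = \lim_{\varepsilon \to 0} S_\varepsilon \qquad \text{in the } L^2(\Omega) \text{ sense},$$
and for all $\phi \in H_0^1(\Omega)$ it holds
$$\lim_{\varepsilon \to 0} \delta^2 \int (w_\varepsilon)_x \phi_x = -\lim_{\varepsilon \to 0} \frac12 \int w_\varepsilon (S_\varepsilon)_x^2\, \phi, \qquad \lim_{\varepsilon \to 0} \int w_\varepsilon^2 (S_\varepsilon)_x \phi_x = 0.$$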

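Proposition 3.3. (i) Let $0 \le U_o < \sqrt{2}\,\delta\pi$. Then there exists a smooth solution $(w, S) \in (C^2(\Omega))^2$ of (3.2)–(3.3) such that
$$w(x) \ge c(U_o) > 0 \quad \text{in } \Omega.$$
(ii) If $U_o = \sqrt{2}\,\delta\pi$ then there exists a generalized solution $(w, S) \in H^1(\Omega) \times L^\infty(\Omega)$ of (3.2)–(3.3) such that $w(\frac12) = 0$.

Proof. Let $U_o = \sqrt{2}\,\delta\pi$ and let $U_\varepsilon < \sqrt{2}\,\delta\pi$ be a sequence such that $U_\varepsilon \to U_o$ as $\varepsilon \to 0$. Set $\sigma_\varepsilon = U_\varepsilon / (\sqrt{2}\,\delta)$. A computation shows that
$$w_\varepsilon(x) = \bigl( (1 - 2x)^2 + 2(1 + \cos\sigma_\varepsilon)\, x(1 - x) \bigr)^{1/2},$$
$$S_\varepsilon(x) = \sqrt{2}\,\delta \arccos\Bigl( \frac{1 - (1 - \cos\sigma_\varepsilon)x}{w_\varepsilon(x)} \Bigr), \qquad x \in [0, 1],$$
solve (3.2)–(3.3) with $S_\varepsilon(1) = U_\varepsilon$. Furthermore,
$$w_\varepsilon^2(x) (S_\varepsilon)_x(x) = \sqrt{2}\,\delta \sin\sigma_\varepsilon \tag{3.4}$$
and
$$w_\varepsilon(x) \ge \sqrt{\tfrac12 (1 + \cos\sigma_\varepsilon)} > 0 \quad \text{in } \Omega.$$
In the limit $\varepsilon \to 0$ we get $\cos\sigma_\varepsilon \to -1$ and
$$w_\varepsilon(x) \to w(x) = |1 - 2x| \quad \text{in } L^2(\Omega), \qquad S_\varepsilon(x) \to \sqrt{2}\,\delta H(x) \quad \text{in } L^2(\Omega) \qquad (\varepsilon \to 0),$$
where $H(x) = 0$ for $x \in (0, 1/2)$ and $H(x) = \pi$ for $x \in (1/2, 1)$. Taking into account (3.4) we obtain for all $\phi \in H_0^1(\Omega)$,
$$-\frac12 \int w_\varepsilon (S_\varepsilon)_x^2\, \phi = \delta^2 \int (w_\varepsilon)_x \phi_x \to \delta^2 \int w_x \phi_x = -4\delta^2 \phi\bigl(\tfrac12\bigr),$$
$$\int w_\varepsilon^2 (S_\varepsilon)_x \phi_x = \sqrt{2}\,\delta \sin\sigma_\varepsilon \int \phi_x \to 0 \qquad (\varepsilon \to 0).$$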
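The explicit formulas for $w_\varepsilon$ and $S_\varepsilon$ in this proof can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: the function names, the sample values `sigma = 2.0` and `delta = 0.5`, and the finite-difference step sizes are all illustrative assumptions. It checks the flux identity (3.4), the residual of (3.2), the boundary value $S_\varepsilon(1) = \sqrt{2}\,\delta\sigma_\varepsilon$, and the lower bound on $w_\varepsilon$.

```python
import math

def w_eps(x, sigma):
    """Density w_eps from the proof of Proposition 3.3."""
    return math.sqrt((1 - 2 * x) ** 2 + 2 * (1 + math.cos(sigma)) * x * (1 - x))

def s_eps(x, sigma, delta):
    """Fermi potential S_eps from the proof of Proposition 3.3."""
    ratio = (1 - (1 - math.cos(sigma)) * x) / w_eps(x, sigma)
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding just outside [-1, 1]
    return math.sqrt(2) * delta * math.acos(ratio)

def flux(x, sigma, delta, h=1e-5):
    """Central-difference approximation of w^2 * S_x, the left side of (3.4)."""
    s_x = (s_eps(x + h, sigma, delta) - s_eps(x - h, sigma, delta)) / (2 * h)
    return w_eps(x, sigma) ** 2 * s_x

def residual_32(x, sigma, delta, h=1e-4):
    """Finite-difference residual delta^2 w_xx - w (S_x)^2 / 2 of equation (3.2)."""
    w_xx = (w_eps(x + h, sigma) - 2 * w_eps(x, sigma) + w_eps(x - h, sigma)) / h**2
    s_x = (s_eps(x + h, sigma, delta) - s_eps(x - h, sigma, delta)) / (2 * h)
    return delta**2 * w_xx - 0.5 * w_eps(x, sigma) * s_x**2

# Illustrative sample values (assumptions): any 0 < sigma < pi and delta > 0 would do.
sigma, delta = 2.0, 0.5
points = [0.1 * k for k in range(1, 10)]
target = math.sqrt(2) * delta * math.sin(sigma)
print(max(abs(flux(x, sigma, delta) - target) for x in points))
print(max(abs(residual_32(x, sigma, delta)) for x in points))
```

Both printed numbers should be at the level of finite-difference error. As $\sigma_\varepsilon \to \pi$ the minimum of $w_\varepsilon$ at $x = 1/2$ tends to zero, which is the degeneration used in part (ii).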

Therefore, $(w, S)$ is a generalized solution to (3.2)–(3.3).

4. Uniqueness of Solutions

Uniqueness of solutions follows under the assumption that the scaled Planck constant $\delta$ is large enough. If $\delta = 0$, there exists more than one solution of the thermal equilibrium state (i.e. $J = 0$; see [11]).

Theorem 4.1. Let (A1)–(A3) hold and let $h$ be isothermal or isentropic. Then there exists $\delta_o > 0$ such that if $\delta \ge \delta_o$, there exists at most one solution $(w, S, V)$ to (1.11)–(1.14) satisfying (2.1)–(2.2).

Proof. Let $(w_1, S_1, V_1)$ and $(w_2, S_2, V_2)$ be two solutions to (1.11)–(1.14) satisfying (2.1)–(2.2). Take $w_1 - w_2$ as a test function in the difference of the Eqs. (1.11) satisfied by $w_1$, $w_2$, respectively, to get
$$\begin{aligned}
\delta^2 \int |\nabla(w_1 - w_2)|^2 = {} & -\frac12 \int (w_1 |\nabla S_1|^2 - w_2 |\nabla S_2|^2)(w_1 - w_2) \\
& + \int (w_1 V_1 - w_2 V_2)(w_1 - w_2) \\
& - T_o \int (w_1 h(w_1^2) - w_2 h(w_2^2))(w_1 - w_2) \\
& - \alpha \int (w_1 S_1 - w_2 S_2)(w_1 - w_2) \\
& + \int V_{\rm ext} (w_1 - w_2)^2 \\
= {} & I_1 + \cdots + I_5.
\end{aligned} \tag{4.1}$$
The weak formulation of the difference of (1.12) for $S_1$, $S_2$, respectively, reads
$$\int w_1^2 \nabla(S_1 - S_2) \cdot \nabla\phi = -\int (w_1^2 - w_2^2) \nabla S_2 \cdot \nabla\phi$$
for all $\phi \in H_0^1(\Omega)$. Taking $\phi = S_1 - S_2$ we obtain
$$\begin{aligned}
\underline{w}^2 \int |\nabla(S_1 - S_2)|^2 & \le \int w_1^2 |\nabla(S_1 - S_2)|^2 \\
& = -\int (w_1^2 - w_2^2) \nabla S_2 \cdot \nabla(S_1 - S_2) \\
& \le 2\overline{w}\, \|w_1 - w_2\|_{0,2} \|\nabla S_2\|_{0,\infty} \|\nabla(S_1 - S_2)\|_{0,2},
\end{aligned}$$
which implies
$$\|\nabla(S_1 - S_2)\|_{0,2} \le (2\overline{w}/\underline{w}^2) \|w_1 - w_2\|_{0,2} \|\nabla S_2\|_{0,\infty}. \tag{4.2}$$

Now we are able to estimate I1 , . . . , I5 : Z 1 (w1 ∇(S1 − S2 ) · ∇(S1 + S2 ) + (w1 − w2 )|∇S2 |2 )(w1 − w2 ) I1 = − 2 ≤ (w/w)2 + 1 k∇S2 k0,∞ (k∇S1 k0,∞ + k∇S2 k0,∞ )kw1 − w2 k20,2 , using (4.2). The integral I2 is estimated by using (1.13): Z 1 ((w1 − w2 )2 (V1 + V2 ) + (w12 − w22 )(V1 − V2 )) I2 = 2 Z Z λ2 1 2 (w1 − w2 ) (V1 + V2 ) − |∇(V1 − V2 )|2 = 2 2 ≤ V kw1 − w2 k20,2 . The monotonicity of h implies Z I3 = −To (w1 (h(w12 ) − h(w22 ))(w1 − w2 ) + (w1 − w2 )2 h(w22 )) ≤ −To h(w2 )kw1 − w2 k20,2 . Finally, we can estimate the integral I4 employing Poincaré’s inequality and (4.2): Z I4 = −α (w1 (S1 − S2 )(w1 − w2 ) + (w1 − w2 )2 S2 ) ≤ α c()(w/w)2 k∇S2 k0,∞ + S kw1 − w2 k20,2 . Let K = k∇S1 k0,∞ + k∇S2 k0,∞ . Then we get from (4.1), 2 w2 2 2 w 2 δ − 2K + 1 − V − V ext + To h(w ) − α 2 K + S kw1 − w2 k20,2 ≤ 0. w2 w (4.3) Only K depends on δ (via the W 2,p () norm of w; see the third step of the proof of Lemma 2.3) such that K remains bounded as δ → ∞. Therefore there exists δo > 0 such that if δ ≥ δo , then (4.3) implies kw1 − w2 k20,2 ≤ 0. Hence w1 = w2 in . Finally, we infer S1 = S2 from (4.2) and V1 = V2 from (1.13).


Remark 4.2. There exists at most one weak solution $(w, S, V)$ in the class of functions satisfying $w, V \in H^1(\Omega) \cap L^\infty(\Omega)$, $w(x) \ge m > 0$ in $\Omega$, and (only) $S \in W^{1,q}(\Omega)$, where $q = d$ if $d \ge 3$, $q > 2$ if $d = 2$ and $q = 2$ if $d = 1$, under the assumption that the scaled Planck constant $\delta > 0$ is large enough. The proof of this result is similar to the proof in [16].

Acknowledgement. This work was partially supported by the EC-TMR network # ERB-4061-PL97-0396, by the Deutscher Akademischer Austauschdienst (DAAD), and by the Deutsche Forschungsgemeinschaft, grant numbers MA1662/-1 and -2.

References

1. Ancona, M. and Iafrate, G.: Quantum correction to the equation of state of an electron gas in a semiconductor. Phys. Rev. B 39, 9536–9540 (1989)
2. Ben Abdallah, N. and Unterreiter, A.: On the stationary quantum drift-diffusion model. To appear in Math. Mod. Num. Anal. (1998)
3. Cimatti, G.: Remark on the existence and uniqueness for the thermistor problem under mixed boundary conditions. Quart. Appl. Math. 47, 117–121 (1989)
4. Cimatti, G. and Prodi, G.: Existence results for a nonlinear elliptic system modelling a temperature dependent electrical resistor. Ann. Mat. Pura Appl. 152, 227–237 (1988)
5. Degond, P. and Markowich, P.: A steady state potential flow model for semiconductors. Ann. Mat. Pura Appl. 165, 87–98 (1993)
6. Feynman, R.: Statistical Mechanics, A Set of Lectures. Frontiers in Physics. New York: W.A. Benjamin, 1972
7. Gamba, I.: Sharp uniform bounds for steady potential fluid-Poisson systems. Proc. Roy. Soc. Edinburgh Sect. A 127, 479–516 (1997)
8. Gamba, I., Gasser, I., and Jüngel, A.: In preparation. 1998
9. Gamba, I. and Morawetz, C.: A viscous approximation for a 2D steady semiconductor or transonic gas dynamic flow: Existence theorem for potential flow. Comm. Pure Appl. Math. 49, 999–1049 (1996)
10. Gardner, C.: The quantum hydrodynamic model for semiconductor devices. SIAM J. Appl. Math. 54, 409–427 (1994)
11. Gasser, I. and Jüngel, A.: The quantum hydrodynamic model for semiconductors in thermal equilibrium. ZAMP 48, 45–59 (1997)
12. Gasser, I. and Markowich, P.A.: Quantum hydrodynamics, Wigner transforms and the classical limit. Asympt. Anal. 14, 97–116 (1997)
13. Gasser, I., Markowich, P.A., and Ringhofer, C.: Closure conditions for classical and quantum moment hierarchies in the small temperature limit. Transp. Theory Stat. Phys. 25, 409–423 (1996)
14. Gasser, I., Markowich, P.A. and Unterreiter, A.: Quantum Hydrodynamics. Proceedings of the SPARCH GdR Conference, St. Malo, 1995
15. Gilbarg, D. and Trudinger, N.S.: Elliptic Partial Differential Equations of Second Order. Berlin: Springer, 2nd edition, 1983
16. Howison, S., Rodrigues, J. and Shillor, M.: Stationary solutions to the thermistor problem. J. Math. Anal. Appl. 174, 573–588 (1993)
17. Jüngel, A.: On the existence and uniqueness of transient solutions of a degenerate nonlinear drift-diffusion model for semiconductors. Math. Models Meth. Appl. Sci. 4, 677–703 (1994)
18. Jüngel, A.: Asymptotic analysis of a semiconductor model based on Fermi-Dirac statistics. Math. Meth. Appl. Sci. 19, 401–424 (1996)
19. Jüngel, A.: A note on current-voltage characteristics from the quantum hydrodynamic equations for semiconductors. Appl. Math. Lett. 10, 29–34 (1997)
20. Kreuzer, H.: Nonequilibrium Thermodynamics and its Statistical Foundation. Oxford: Clarendon Press, 1981
21. Landau, L. and Lifschitz, E.: Lehrbuch der Theoretischen Physik. Volume III, Quantenmechanik. Berlin: Akademie-Verlag, 1985
22. Loffredo, M. and Morato, L.: On the creation of quantum vortex lines in rotating HeII. Il Nuovo Cimento 108B, 205–215 (1993)
23. Marcati, P. and Natalini, R.: Weak solutions to a hydrodynamic model for semiconductors and relaxation to the drift diffusion equations. Arch. Rat. Mech. Anal. 129, 129–145 (1995)
24. Markowich, P.A. and Unterreiter, A.: Vacuum solutions of the stationary drift-diffusion model. Ann. Sc. Norm. Sup. Pisa 20, 371–386 (1993)
25. Stampacchia, G.: Equations elliptiques du second ordre à coefficients discontinus. Les Presses de l’Université de Montréal, Canada, 1966
26. Troianiello, G.M.: Elliptic Differential Equations and Obstacle Problems. New York: Plenum Press, 1987
27. Xie, H. and Allegretto, W.: $C^\alpha(\overline{\Omega})$ solutions of a class of nonlinear degenerate elliptic systems arising in the thermistor problem. SIAM J. Math. Anal. 22, 1491–1499 (1990)
28. Zeidler, E.: Nonlinear Functional Analysis and its Applications, Vol. II. New York: Springer, 1990
29. Zhang, B. and Jerome, J.: On a steady-state quantum hydrodynamic model for semiconductors. Nonlin. Anal. 26, 845–856 (1996)

Communicated by J. L. Lebowitz

Commun. Math. Phys. 194, 481 – 492 (1998)



The Semi-Classical Approximation for Modular Operads

E. Getzler*

Max-Planck-Institut für Mathematik, Gottfried-Claren-Str. 26, D-53225 Bonn, Germany.

Received: 6 February 1997 / Accepted: 20 February 1998

* Current address: Department of Mathematics, Northwestern University, Evanston, IL 60208-2730, USA.

Abstract: We study the contribution of one-loop graphs (the semi-classical expansion) in the setting of modular operads. As an application, we calculate the Betti numbers of the Deligne-Mumford-Knudsen moduli spaces of stable curves of genus 1 with $n$ marked smooth points.

The semi-classical approximation is an explicit formula of mathematical physics for the sum of Feynman diagrams with a single circuit. In this paper, we study the same problem in the setting of modular operads [5]; instead of being a number, the interaction at a vertex of valence $n$ will be an $S_n$-module. The motivation for developing this theory was the desire to calculate the $S_n$-equivariant Hodge polynomials of the Deligne-Mumford-Knudsen moduli spaces $\overline{\mathcal{M}}_{1,n}$ of stable curves of genus 1 with $n$ marked smooth points. In performing these calculations, we use the formulas for the $S_n$-equivariant Serre characteristics of $\mathcal{M}_{0,n}$ and $\mathcal{M}_{1,n}$ derived in [1] and [3] respectively. A particular consequence of our calculations will be needed in [4] to find a relation among the codimension two cycles in $\overline{\mathcal{M}}_{1,4}$.

Theorem. The $S_4$-module $H^4(\overline{\mathcal{M}}_{1,4}, \mathbb{Q})$ is isomorphic to $V_{(4)} \otimes \mathbb{Q}^7 \oplus V_{(3,1)} \otimes \mathbb{Q}^4 \oplus V_{(2,2)} \otimes \mathbb{Q}^2$.

1. Wick’s Theorem and the Semi-Classical Approximation

Let $\Gamma_{g,n}$ be the small category whose objects are isomorphism classes of stable graphs $G$ of genus $g(G) = g$ with $n$ totally ordered legs [5], and whose morphisms are the automorphisms: if $G \in \Gamma_{g,n}$, its automorphism group $\operatorname{Aut}(G)$ is the subset of the permutations of the flags which preserve all the data defining the stable graph, including the total ordering of the legs. Because of the stability condition, $\Gamma_{g,n}$ is a finite category. Define polynomials $\{\mathbb{M}v_{g,n} \mid 2(g-1) + n > 0\}$ of a set of variables $\{v_{g,n} \mid 2(g-1) + n > 0\}$ by the following formula:
$$\mathbb{M}v_{g,n} = \sum_{G \in \operatorname{Ob} \Gamma_{g,n}} \frac{1}{|\operatorname{Aut}(G)|} \prod_{v \in \operatorname{Vert}(G)} v_{g(v),n(v)}. \tag{1.1}$$
Introduce the sequences of generating functions
$$a_g(x) = \sum_{2(g-1)+n>0} v_{g,n} \frac{x^n}{n!} \qquad \text{and} \qquad b_g(x) = \sum_{2(g-1)+n>0} \mathbb{M}v_{g,n} \frac{x^n}{n!}.$$
Wick’s theorem gives an integral formula for the generating functions $\{b_g\}$ in terms of $\{a_g\}$:
$$\sum_{g=0}^\infty b_g \hbar^{g-1} = \log \int_{-\infty}^\infty \exp\Bigl( \sum_{g=0}^\infty a_g \hbar^{g-1} - \frac{(x - \xi)^2}{2\hbar} \Bigr) \frac{dx}{\sqrt{2\pi\hbar}}.$$
As written, this is purely formal, since it involves the integration of a power series in $x$. It may be made rigorous by observing that the integral transform
$$f \longmapsto \int_{-\infty}^\infty f(\hbar, x)\, e^{-(x-\xi)^2/2\hbar} \frac{dx}{\sqrt{2\pi\hbar}}$$
induces a continuous linear map on the space of Laurent series $\mathbb{Q}((\hbar))[[x]]$ topologized by the powers of the ideal $(\hbar, x)$. The semi-classical expansion is a pair of formulas for $b_0$ and $b_1$ in terms of $a_0$ and $a_1$, which we now recall.

Definition 1.2. Let $R$ be a ring of characteristic zero. The Legendre transform $\mathcal{L}$ is the involution of the set $x^2/2 + x^3 R[[x]]$ characterized by the formula
$$(\mathcal{L}f) \circ f' + f = x f'.$$

Theorem 1.3. The series $x^2/2 + b_0$ is the Legendre transform of $x^2/2 - a_0$.

The first few coefficients of $b_0$ may be calculated, either from the definition of $\mathbb{M}v_{0,n}$ or from Theorem 1.3:

n    $\mathbb{M}v_{0,n}$
3    $v_{0,3}$
4    $v_{0,4} + 3v_{0,3}^2$
5    $v_{0,5} + 10v_{0,4}v_{0,3} + 15v_{0,3}^3$
6    $v_{0,6} + 15v_{0,5}v_{0,3} + 10v_{0,4}^2 + 105v_{0,4}v_{0,3}^2 + 105v_{0,3}^4$

We now come to the formula for b1 , known as the semi-classical approximation.


Theorem 1.4. The series $b_1$ and $a_1$ are related by the formula
$$b_1 = \bigl( a_1 - \tfrac12 \log(1 - a_0'') \bigr) \circ (x + b_0').$$

By the definition of the Legendre transform, we see that $(\mathcal{L}f)' \circ f' = x$. It follows that Theorem 1.4 is equivalent to the formula
$$b_1 \circ (x - a_0') = a_1 - \tfrac12 \log(1 - a_0'').$$
This formula expresses the fact that the stable graphs contributing to $b_1$ are obtained by attaching a forest whose vertices have genus 0 to two types of graphs:
(i) those with a single vertex of genus 1 (corresponding to the term $a_1$);
(ii) stable graphs with a single circuit, and all of whose vertices have genus 0; we call such a graph a necklace.
The presence of a logarithm in the term which contributes the necklaces is related to the fact that there are $(n-1)!$ cyclic orders of $n$ objects. The first few coefficients of $b_1$ are also easily calculated:

n    $\mathbb{M}v_{1,n}$
1    $v_{1,1} + \frac12 v_{0,3}$
2    $v_{1,2} + v_{1,1}v_{0,3} + \frac12 \bigl( v_{0,4} + v_{0,3}^2 \bigr)$
3    $v_{1,3} + 3v_{1,2}v_{0,3} + v_{1,1}v_{0,4} + \frac12 \bigl( v_{0,5} + 3v_{0,4}v_{0,3} + 2v_{0,3}^3 \bigr)$
4    $v_{1,4} + 6v_{1,3}v_{0,3} + 3v_{1,2}v_{0,4} + 15v_{1,2}v_{0,3}^2 + v_{1,1}v_{0,5} + \frac12 \bigl( v_{0,6} + 4v_{0,5}v_{0,3} + 3v_{0,4}^2 + 12v_{0,4}v_{0,3}^2 + 6v_{0,3}^4 \bigr)$

2. The Semi-Classical Approximation for Modular Operads

In the theory of modular operads, one replaces the sequence of coefficients $\{v_{g,n}\}$ considered above by a stable S-module, that is, a sequence of $S_n$-modules $\mathcal{V}((g, n))$. The analogue of (1.1) is the functor on stable S-modules which sends $\mathcal{V}$ to
$$\mathbb{M}\mathcal{V}((g, n)) = \operatorname*{colim}_{G \in \Gamma_{g,n}} \bigotimes_{v \in \operatorname{Vert}(G)} \mathcal{V}((g(v), n(v))). \tag{2.1}$$
Thus, the coefficients in (1.1) are promoted to vector spaces, the product to a tensor product, the sum over stable graphs to a direct sum, and the weight $|\operatorname{Aut}(G)|^{-1}$ to $\operatorname{colim}_{\operatorname{Aut}(G)}$, that is, the coinvariants with respect to the finite group $\operatorname{Aut}(G)$. Note that this definition makes sense in any symmetric monoidal category $\mathcal{C}$ with finite colimits. We will need the Peter-Weyl theorem to hold for actions of the symmetric group $S_n$ on $\mathcal{C}$; thus, we will suppose that $\mathcal{C}$ is additive over a ring of characteristic zero.

Definition 2.1. The characteristic $\operatorname{ch}_n(\mathcal{V})$ of an $S_n$-module is defined by the formula
$$\operatorname{ch}_n(\mathcal{V}) = \frac{1}{n!} \sum_{\sigma \in S_n} \operatorname{Tr}_\sigma(\mathcal{V})\, p_\sigma \in \Lambda_n \otimes K_0(\mathcal{C}),$$
where $p_\sigma$ is the product of power sums $p_{|O|}$ over the orbits $O$ of $\sigma$.


Although this definition appears to require rational coefficients, this is an artifact of the use of the power sums $p_n$; it is shown in [2] that the characteristic is a symmetric function of degree $n$ with values in the Grothendieck group of the additive category $\mathcal{C}$. If $\operatorname{rk} : \Lambda \to \mathbb{Q}[x]$ is the homomorphism defined by $h_n \mapsto x^n/n!$, we have $\operatorname{rk}(\operatorname{ch}_n(\mathcal{V})) = [\mathcal{V}]/n! \in K_0(\mathcal{C}) \otimes \mathbb{Q}$. Note that $\operatorname{rk}(f)$ is obtained from $f$ by setting the power sums $p_n$ to 0 if $n > 1$, and to $x$ if $n = 1$. The place of the generating functions $a_g$ and $b_g$ is now taken by
$$\mathbf{a}_g = \sum_{2(g-1)+n>0} \operatorname{ch}_n(\mathcal{V}((g, n))) \in \Lambda \hat\otimes K_0(\mathcal{C}), \qquad \mathbf{b}_g = \sum_{2(g-1)+n>0} \operatorname{ch}_n(\mathbb{M}\mathcal{V}((g, n))) \in \Lambda \hat\otimes K_0(\mathcal{C}).$$
Theorem (8.13) of [5], whose statement we now recall, calculates $\mathbf{b}_g$ in terms of $\mathbf{a}_h$, $h \le g$. Let $\Delta$ be the “Laplacian” on $\Lambda((\hbar))$ given by the formula
$$\Delta = \sum_{n=1}^\infty \hbar^n \Bigl( \frac{n}{2} \frac{\partial^2}{\partial p_n^2} + \frac{\partial}{\partial p_{2n}} \Bigr).$$

Theorem 2.2. If $\mathcal{V}$ is a stable S-module, then
$$\sum_{g=0}^\infty \mathbf{b}_g \hbar^{g-1} = \operatorname{Log}\Bigl( \exp(\Delta) \operatorname{Exp}\Bigl( \sum_{g=0}^\infty \mathbf{a}_g \hbar^{g-1} \Bigr) \Bigr).$$

There is also a formula for $\mathbf{b}_0$ in terms of $\mathbf{a}_0$. To state it, we must recall the definition of the Legendre transform for symmetric functions. Let
$$\Lambda_* \hat\otimes K_0(\mathcal{C}) = \{ f \in \Lambda \hat\otimes K_0(\mathcal{C}) \mid \operatorname{rk}(f) = x^2/2 + O(x^3) \}.$$
If $f$ is a symmetric function, let $f' = \partial f / \partial p_1$; this operation may be expressed more invariantly as $p_1^\perp$ (Ex. I.5.3, Macdonald [6]).

Definition 2.3. The Legendre transform $\mathcal{L}$ is the involution of $\Lambda_* \hat\otimes K_0(\mathcal{C})$ characterized by the formula
$$(\mathcal{L}f) \circ f' + f = p_1 f'.$$

The Legendre transform $\mathcal{L}f$ of a function $f$ is characterized by the formula $(\mathcal{L}f)' \circ f' = x$. For symmetric functions, although the analogue of this formula holds, in the form
$$(\mathcal{L}f)' \circ f' = h_1,$$
the situation is not as simple, since there is no single notion of integral for symmetric functions (the “constant” term may be any function of the power sums $p_n$, $n > 1$). Nevertheless, there is a simple algorithm for calculating $\mathcal{L}f$ from $f$. Denote by $f_n$ and $g_n$ the coefficients of $f$ and $g = \mathcal{L}f$ lying in $\Lambda_n \otimes K_0(\mathcal{C})$.

(i) The formula $f' \circ (\mathcal{L}f)' = h_1$ may be rewritten as
$$\sum_{n=3}^N g_n' + \sum_{n=3}^N f_n' \circ \Bigl( h_1 + \sum_{k=3}^{N-1} g_k' \Bigr) \cong 0 \mod \Lambda_N \otimes K_0(\mathcal{C}).$$
This gives a recursive procedure for calculating $g_n'$.

(ii) Having determined $g'$, we obtain $g$ from the formula $f = \mathcal{L}g$, or $g = p_1 g' - f \circ g'$.

We now recall Theorem (7.17) of [5], which is the generalization to modular operads of Theorem 1.3.

Theorem 2.4. The symmetric function $h_2 + \mathbf{b}_0$ is the Legendre transform of $e_2 - \mathbf{a}_0$.

The main result of this paper is a formula for $\mathbf{b}_1$ in terms of $\mathbf{a}_1$ and $\mathbf{a}_0$, generalizing Theorem 1.4. If $f$ is a symmetric function, write $\dot f = \partial f / \partial p_2 = \frac12 p_2^\perp f$.

Theorem 2.5.
$$\mathbf{b}_1 = \Bigl( \mathbf{a}_1 - \frac12 \sum_{n=1}^\infty \frac{\phi(n)}{n} \log(1 - \psi_n(\mathbf{a}_0'')) + \frac{\dot{\mathbf{a}}_0 (\dot{\mathbf{a}}_0 + 1)}{1 - \psi_2(\mathbf{a}_0'')} \Bigr) \circ (h_1 + \mathbf{b}_0').$$
Here, $\phi(n)$ is Euler’s function, the number of prime residues modulo $n$.

Remark. The first two terms inside the parentheses on the right-hand side of Theorem 2.5 are analogues of the corresponding terms in the formula of Theorem 1.4. In particular, the second of these terms is closely related to the sum over necklaces in the definition of $\mathbb{M}\mathcal{V}((1, n))$, as is seen from the formula
$$\sum_{n=1}^\infty \operatorname{ch}_n\bigl( \operatorname{Ind}_{\mathbb{Z}_n}^{S_n} \mathbf{1} \bigr) = -\sum_{n=1}^\infty \frac{\phi(n)}{n} \log(1 - p_n).$$
The remaining term may be understood as a correction term, which takes into account the fact that necklaces of 1 or 2 vertices have non-trivial involutions (while those with more vertices do not). A proof of the theorem could no doubt be given using this observation; however, we prefer to derive it directly from Theorem 2.2.

If we take the plethysm on the right of the formula of Theorem 2.5 with the symmetric function $h_1 - \mathbf{a}_0'$, and apply the formula $(h_1 + \mathbf{b}_0') \circ (h_1 - \mathbf{a}_0') = h_1$, we obtain the equivalent formulation of this theorem:
$$\mathbf{b}_1 \circ (h_1 - \mathbf{a}_0') = \mathbf{a}_1 - \frac12 \sum_{n=1}^\infty \frac{\phi(n)}{n} \log(1 - \psi_n(\mathbf{a}_0'')) + \frac{\dot{\mathbf{a}}_0 (\dot{\mathbf{a}}_0 + 1)}{1 - \psi_2(\mathbf{a}_0'')}.$$
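Proof of Theorem 2.5. The symmetric function $\mathbf{b}_1$ is a sum over graphs obtained by attaching forests whose vertices have genus 0 to either a vertex of genus 1, or to a necklace. In other words,
$$\mathbf{b}_1 = \bigl( \mathbf{a}_1 + \text{sum over necklaces} \bigr) \circ (h_1 + \mathbf{b}_0').$$
To prove the theorem, we must calculate the sum over necklaces. To do this, observe that a necklace is a graph with flags coloured red or blue, such that each vertex has exactly two red flags, each edge is red, and all tails are blue. Let $\mathcal{W}((n))$, $n \ge 1$, be the sequence of representations of $S_n \times S_2$,
$$\mathcal{W}((n)) = \operatorname{Res}^{S_{n+2}}_{S_n \times S_2} \mathcal{V}((0, n+2));$$
think of the first factor of the product $S_n \times S_2$ as acting on the blue flags at a vertex, and the second factor as acting on the red flags. Applying Theorem 2.2, we see that
$$\operatorname{Log}\bigl( \exp(1 \otimes \Delta) \operatorname{Exp}(\operatorname{Ch}(\mathcal{W})) \bigr) \in \Lambda \hat\otimes \Lambda \hat\otimes K_0(\mathcal{C})$$
is the sum over stable graphs all of whose edges are red. To impose the condition that all tails are blue, we set the variables $q_n$ to zero before taking the Logarithm.

We now proceed to the explicit calculation. We set $\hbar = 1$, since it plays no rôle when all graphs have genus 1. In writing elements of $\Lambda \hat\otimes \Lambda$, we will denote power sums in the first factor of $\Lambda$ by $p_n$, and in the second by $q_n$.

Lemma 2.6. The characteristic $\operatorname{Ch}(\mathcal{W})$ of $\mathcal{W}$ is the “bisymmetric” function
$$\operatorname{Ch}(\mathcal{W}) = \tfrac12 \mathbf{a}_0'' q_1^2 + \dot{\mathbf{a}}_0 q_2 \in \Lambda \hat\otimes \Lambda_2 \hat\otimes K_0(\mathcal{C}).$$

Proof. We have $\operatorname{Ch}(\mathcal{W}) = h_2^\perp \mathbf{a}_0 \otimes h_2 + e_2^\perp \mathbf{a}_0 \otimes e_2$. Expressing this in terms of power sums, we have
$$\begin{aligned}
h_2^\perp \mathbf{a}_0 \otimes h_2 + e_2^\perp \mathbf{a}_0 \otimes e_2 & = \tfrac12 \bigl( (p_1^\perp)^2 + p_2^\perp \bigr) \mathbf{a}_0 \otimes \tfrac12 (q_1^2 + q_2) + \tfrac12 \bigl( (p_1^\perp)^2 - p_2^\perp \bigr) \mathbf{a}_0 \otimes \tfrac12 (q_1^2 - q_2) \\
& = \tfrac12 (p_1^\perp)^2 \mathbf{a}_0 \otimes q_1^2 + \tfrac12 p_2^\perp \mathbf{a}_0 \otimes q_2.
\end{aligned}$$

From this lemma, it follows that
$$\operatorname{Exp} \operatorname{Ch}(\mathcal{W}) = \prod_{n=1}^\infty \exp\Bigl( \psi_n(\mathbf{a}_0'') \frac{q_n^2}{2n} \Bigr) \prod_{n=1}^\infty \exp\Bigl( \psi_n(\dot{\mathbf{a}}_0) \frac{q_{2n}}{n} \Bigr) \in \Lambda \hat\otimes \Lambda \hat\otimes K_0(\mathcal{C}).$$
We now apply the heat kernel and separate variables:
$$\begin{aligned}
\exp(1 \otimes \Delta) \operatorname{Exp} \operatorname{Ch}(\mathcal{W}) \big|_{q_n = 0} = {} & \prod_{n \text{ odd}} \exp\Bigl( \frac{n}{2} \frac{\partial^2}{\partial q_n^2} \Bigr) \exp\Bigl( \psi_n(\mathbf{a}_0'') \frac{q_n^2}{2n} \Bigr) \Big|_{q_n = 0} \\
& \times \prod_{n \text{ even}} \exp\Bigl( \frac{n}{2} \frac{\partial^2}{\partial q_n^2} + \frac{\partial}{\partial q_n} \Bigr) \exp\Bigl( \psi_n(\mathbf{a}_0'') \frac{q_n^2}{2n} + \psi_{n/2}(\dot{\mathbf{a}}_0) \frac{2q_n}{n} \Bigr) \Big|_{q_n = 0}.
\end{aligned}$$
We now insert the explicit formulas for the heat kernel of the Laplacian, namely
$$\exp\Bigl( \frac{n}{2} \frac{\partial^2}{\partial q_n^2} \Bigr) f(q_n) \Big|_{q_n = 0} = \int_{-\infty}^\infty f(q_n) \exp\Bigl( -\frac{q_n^2}{2n} \Bigr) \frac{dq_n}{\sqrt{2\pi n}}.$$
For the odd variables, matters are quite straightforward:
$$\exp\Bigl( \frac{n}{2} \frac{\partial^2}{\partial q_n^2} \Bigr) \exp\Bigl( \psi_n(\mathbf{a}_0'') \frac{q_n^2}{2n} \Bigr) \Big|_{q_n = 0} = \int_{-\infty}^\infty \exp\Bigl( \psi_n(\mathbf{a}_0'') \frac{q_n^2}{2n} - \frac{q_n^2}{2n} \Bigr) \frac{dq_n}{\sqrt{2\pi n}} = \bigl( 1 - \psi_n(\mathbf{a}_0'') \bigr)^{-1/2}.$$
For the even variables, things become a little more involved:
$$\begin{aligned}
& \exp\Bigl( \frac{n}{2} \frac{\partial^2}{\partial q_n^2} + \frac{\partial}{\partial q_n} \Bigr) \exp\Bigl( \psi_n(\mathbf{a}_0'') \frac{q_n^2}{2n} + \psi_{n/2}(\dot{\mathbf{a}}_0) \frac{2q_n}{n} \Bigr) \Big|_{q_n = 0} \\
& \qquad = \exp\Bigl( \frac{n}{2} \frac{\partial^2}{\partial q_n^2} \Bigr) \exp\Bigl( \psi_n(\mathbf{a}_0'') \frac{q_n^2}{2n} + \psi_{n/2}(\dot{\mathbf{a}}_0) \frac{2q_n}{n} \Bigr) \Big|_{q_n = 1} \\
& \qquad = \int_{-\infty}^\infty \exp\Bigl( \psi_n(\mathbf{a}_0'') \frac{q_n^2}{2n} + \psi_{n/2}(\dot{\mathbf{a}}_0) \frac{2q_n}{n} - \frac{(q_n - 1)^2}{2n} \Bigr) \frac{dq_n}{\sqrt{2\pi n}}.
\end{aligned}$$
To perform this gaussian integral, we complete the square in the exponent:
$$\begin{aligned}
& \psi_n(\mathbf{a}_0'') \frac{q_n^2}{2n} + \psi_{n/2}(\dot{\mathbf{a}}_0) \frac{2q_n}{n} - \frac{(q_n - 1)^2}{2n} \\
& \qquad = -\bigl( 1 - \psi_n(\mathbf{a}_0'') \bigr) \frac{q_n^2}{2n} + \bigl( 2\psi_{n/2}(\dot{\mathbf{a}}_0) + 1 \bigr) \frac{q_n}{n} - \frac{1}{2n} \\
& \qquad = -\frac{1 - \psi_n(\mathbf{a}_0'')}{2n} \Bigl( q_n - \frac{2\psi_{n/2}(\dot{\mathbf{a}}_0) + 1}{1 - \psi_n(\mathbf{a}_0'')} \Bigr)^2 + \frac{2}{n} \frac{\psi_{n/2}(\dot{\mathbf{a}}_0) \bigl( \psi_{n/2}(\dot{\mathbf{a}}_0) + 1 \bigr)}{1 - \psi_n(\mathbf{a}_0'')}.
\end{aligned}$$
Thus, the gaussian integral equals
$$\bigl( 1 - \psi_n(\mathbf{a}_0'') \bigr)^{-1/2} \exp\Bigl( \frac{2}{n} \frac{\psi_{n/2}(\dot{\mathbf{a}}_0) \bigl( \psi_{n/2}(\dot{\mathbf{a}}_0) + 1 \bigr)}{1 - \psi_n(\mathbf{a}_0'')} \Bigr).$$
Putting these calculations together, we see that
$$\begin{aligned}
\exp(1 \otimes \Delta) \operatorname{Exp} \operatorname{Ch}(\mathcal{W}) \big|_{q_n = 0} & = \prod_{n=1}^\infty \bigl( 1 - \psi_n(\mathbf{a}_0'') \bigr)^{-1/2} \exp\Bigl( \frac{1}{n} \frac{\psi_n(\dot{\mathbf{a}}_0) \bigl( \psi_n(\dot{\mathbf{a}}_0) + 1 \bigr)}{1 - \psi_{2n}(\mathbf{a}_0'')} \Bigr) \\
& = \prod_{n=1}^\infty \bigl( 1 - \psi_n(\mathbf{a}_0'') \bigr)^{-1/2} \operatorname{Exp}\Bigl( \frac{\dot{\mathbf{a}}_0 (\dot{\mathbf{a}}_0 + 1)}{1 - \psi_2(\mathbf{a}_0'')} \Bigr),
\end{aligned}$$
and, applying the operation Log, that
$$\operatorname{Log}\bigl( \exp(1 \otimes \Delta) \operatorname{Exp} \operatorname{Ch}(\mathcal{W}) \big|_{q_n = 0} \bigr) = \operatorname{Log} \prod_{n=1}^\infty \bigl( 1 - \psi_n(\mathbf{a}_0'') \bigr)^{-1/2} + \frac{\dot{\mathbf{a}}_0 (\dot{\mathbf{a}}_0 + 1)}{1 - \psi_2(\mathbf{a}_0'')}.$$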
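The odd-variable step above rests on a one-dimensional Gaussian integral, which can be checked numerically once the symmetric function $\psi_n(a_0'')$ is replaced by an ordinary real number. The sketch below makes that substitution (an assumption made purely for illustration; the helper names and the sample values of `a` and `n` are hypothetical) and compares a trapezoidal approximation of the heat-kernel integral with $(1 - a)^{-1/2}$.

```python
import math

def heat_kernel_at_zero(log_f, n, half_width=80.0, steps=40_000):
    """Trapezoidal approximation of the integral of f(q) exp(-q^2 / 2n) / sqrt(2 pi n).

    log_f returns log f(q); working with logarithms avoids floating-point overflow.
    """
    h = 2 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        q = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(log_f(q) - q * q / (2 * n))
    return total * h / math.sqrt(2 * math.pi * n)

def odd_factor(a, n):
    """Odd-variable integral with psi_n(a_0'') replaced by a real number a < 1."""
    return heat_kernel_at_zero(lambda q: a * q * q / (2 * n), n)

for a, n in [(0.3, 1), (0.5, 3), (-1.0, 5)]:
    print(a, n, odd_factor(a, n), (1 - a) ** -0.5)
```

The two printed columns should agree to many digits; the result does not depend on `n`, which is why the odd factors in the product take the same form for every odd $n$.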

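The proof of the theorem is completed by the following lemma, applied to $f = 1 - \mathbf{a}_0''$.

Lemma 2.7. Let $f \in \Lambda \hat\otimes K_0(\mathcal{C})$ have constant term equal to 1; that is, $\operatorname{rk}(f) = 1 + O(x)$. Then
$$\operatorname{Log} \prod_{n=1}^\infty \psi_n(f)^{-1/2} = -\frac12 \sum_{n=1}^\infty \frac{\phi(n)}{n} \log(\psi_n(f)).$$

Proof. By definition,
$$\operatorname{Log} \prod_{n=1}^\infty \psi_n(f)^{-1/2} = \sum_{k=1}^\infty \frac{\mu(k)}{k} \log \prod_{n=1}^\infty \psi_{nk}(f)^{-1/2} = -\frac12 \sum_{n=1}^\infty \sum_{d|n} \frac{\mu(d)}{d} \log(\psi_n(f)).$$
The lemma follows from the formula
$$\sum_{d|n} \frac{\mu(d)}{d} = \frac{\phi(n)}{n},$$
which follows by Möbius inversion from $\sum_{d|n} \phi(d) = n$.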
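The arithmetic identity used at the end of this proof is easy to confirm by direct computation. The following sketch (helper names are illustrative, and the range of $n$ tested is arbitrary) checks both $\sum_{d|n} \mu(d)/d = \phi(n)/n$ and $\sum_{d|n} \phi(d) = n$ in exact rational arithmetic.

```python
from fractions import Fraction
from math import gcd

def divisors(n):
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]

def mobius(n):
    """Moebius function: 0 if n has a square factor, else (-1)^(number of prime factors)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result

def phi(n):
    """Euler's function: the number of prime residues modulo n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def mobius_sum(n):
    """Left-hand side of the identity, as an exact fraction."""
    return sum(Fraction(mobius(d), d) for d in divisors(n))

for n in (1, 6, 12, 30):
    print(n, mobius_sum(n), Fraction(phi(n), n))
```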

Corollary 2.8. Define $a_g = \operatorname{rk}(\mathbf{a}_g)$, $b_g = \operatorname{rk}(\mathbf{b}_g)$, and $\dot a_0 = \operatorname{rk}(\dot{\mathbf{a}}_0)$. Then we have
$$b_1 \circ (x - a_0') = a_1 - \tfrac12 \log(1 - a_0'') + \dot a_0 (\dot a_0 + 1).$$

Example 2.9. Suppose $\mathcal{V}((0, n)) = \mathbf{1}$ is the trivial one-dimensional representation for all $n \ge 3$, while $\mathcal{V}((1, n)) = 0$. Then $\mathbb{M}\mathcal{V}((1, n))$ is an $S_n$-module whose rank is the number of graphs in $\Gamma'_{1,n}$, where $\Gamma'_{1,n} \subset \Gamma_{1,n}$ is the subset of stable graphs all of whose vertices have genus 0. We have
$$\mathbf{a}_0 = \sum_{n=3}^\infty h_n = \exp\Bigl( \sum_{n=1}^\infty \frac{p_n}{n} \Bigr) - 1 - h_1 - h_2.$$
Theorem 2.5 leads to the following results; the calculations were performed using J. Stembridge’s symmetric function package SF for Maple [7].

n    $\operatorname{ch}_n(\mathbb{M}\mathcal{V}((1, n)))$    $|\Gamma'_{1,n}|$
1    $s_1$    1
2    $3s_2$    3
3    $7s_3 + 4s_{21}$    15
4    $20s_4 + 17s_{31} + 14s_{2^2} + 4s_{21^2}$    111
5    $52s_5 + 78s_{41} + 71s_{32} + 33s_{31^2} + 34s_{2^2 1} + 4s_{21^3} + s_{1^5}$    1104
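The last column of this table is the rank of the $S_n$-module in the middle column, so the two can be cross-checked: replacing each Schur function $s_\lambda$ by the dimension of the corresponding irreducible representation should reproduce $|\Gamma'_{1,n}|$. The sketch below does this with the hook length formula, a standard fact about symmetric group representations that is not stated in the text; the dictionary `TABLE` simply transcribes the multiplicities above.

```python
from math import factorial

def dim_irrep(partition):
    """Dimension of the irreducible S_n-module indexed by a partition (hook length formula)."""
    n = sum(partition)
    # Column lengths of the Young diagram (the conjugate partition).
    conj = [sum(1 for part in partition if part > j) for j in range(partition[0])]
    hooks = 1
    for i, part in enumerate(partition):
        for j in range(part):
            arm = part - j - 1      # cells to the right in the same row
            leg = conj[j] - i - 1   # cells below in the same column
            hooks *= arm + leg + 1
    return factorial(n) // hooks

# Multiplicities of s_lambda in ch_n(MV((1, n))), transcribed from the table.
TABLE = {
    1: {(1,): 1},
    2: {(2,): 3},
    3: {(3,): 7, (2, 1): 4},
    4: {(4,): 20, (3, 1): 17, (2, 2): 14, (2, 1, 1): 4},
    5: {(5,): 52, (4, 1): 78, (3, 2): 71, (3, 1, 1): 33,
        (2, 2, 1): 34, (2, 1, 1, 1): 4, (1, 1, 1, 1, 1): 1},
}

def rank(n):
    """Rank of the S_n-module in row n of the table."""
    return sum(mult * dim_irrep(lam) for lam, mult in TABLE[n].items())

print([rank(n) for n in range(1, 6)])
```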

An explicit formula for the generating function of the numbers $|\Gamma'_{1,n}|$ may be obtained from Corollary 2.8, using the formulas $a_0' = e^x - 1 - x$, $a_0'' = e^x - 1$ and $\dot a_0 = \frac12 (e^x - 1)$.

Proposition 2.10.
$$\sum_{n=1}^\infty |\Gamma'_{1,n}| \frac{x^n}{n!} = \Bigl( -\frac12 \log(2 - e^x) + \frac14 (e^{2x} - 1) \Bigr) \circ (1 + 2x - e^x)^{-1}.$$
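The generating function of Proposition 2.10 can be expanded with exact truncated power series, reading $(1 + 2x - e^x)^{-1}$ as the compositional inverse. The sketch below is one way to do this under that reading; the truncation order `N`, the helper names, and the fixed-point inversion scheme are illustrative choices, not taken from the paper.

```python
from fractions import Fraction
from math import factorial

N = 6  # all series are coefficient lists of length N, i.e. truncated modulo x^N

def mul(a, b):
    """Product of two truncated power series."""
    c = [Fraction(0)] * N
    for i, ai in enumerate(a):
        for j in range(N - i):
            c[i + j] += ai * b[j]
    return c

def compose(f, g):
    """f(g(x)) modulo x^N by Horner's rule; requires g(0) = 0."""
    assert g[0] == 0
    result = [Fraction(0)] * N
    for coeff in reversed(f):
        result = mul(result, g)
        result[0] += coeff
    return result

def inverse(g):
    """Compositional inverse of g(x) = x + O(x^2), by fixed-point iteration."""
    x = [Fraction(0), Fraction(1)] + [Fraction(0)] * (N - 2)
    h = list(x)
    for _ in range(N):  # each pass fixes at least one more coefficient
        gh = compose(g, h)
        h = [hi - (ghi - xi) for hi, ghi, xi in zip(h, gh, x)]
    return h

exp_minus_1 = [Fraction(0)] + [Fraction(1, factorial(n)) for n in range(1, N)]
neg_log_1m = [Fraction(0)] + [Fraction(1, k) for k in range(1, N)]  # -log(1 - t)
exp2_minus_1 = [Fraction(0)] + [Fraction(2**n, factorial(n)) for n in range(1, N)]

# F(x) = -1/2 log(2 - e^x) + 1/4 (e^{2x} - 1), using 2 - e^x = 1 - (e^x - 1).
log_part = compose(neg_log_1m, exp_minus_1)
F = [l / 2 + e / 4 for l, e in zip(log_part, exp2_minus_1)]
# G(x) = 1 + 2x - e^x = x - a_0'(x).
G = [Fraction(0), Fraction(1)] + [Fraction(-1, factorial(n)) for n in range(2, N)]

b1 = compose(F, inverse(G))
counts = [int(b1[n] * factorial(n)) for n in range(1, N)]
print(counts)
```

The resulting coefficients can be compared with the last column of the table in Example 2.9.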

3. The $S_n$-Equivariant Hodge Polynomial of $\overline{\mathcal{M}}_{1,n}$

A more interesting application of Theorem 2.5 is to the stable S-module in the category of $\mathbb{Z}$-graded mixed Hodge structures $\mathcal{V}((g, n)) = H_c^\bullet(\mathcal{M}_{g,n}, \mathbb{C})$. Let KHM be the Grothendieck group of mixed Hodge structures. The $S_n$-equivariant Serre characteristic $e^{S_n}(\mathcal{M}_{g,n})$ is by definition the characteristic $\operatorname{ch}_n(\mathcal{V}((g, n))) \in \Lambda_n \otimes \mathrm{KHM}$. It follows from the usual properties of Serre characteristics (see [2] or Proposition (6.11) of [5]) that $\operatorname{ch}_n(\mathbb{M}\mathcal{V}((g, n)))$ is the $S_n$-equivariant Serre characteristic $e^{S_n}(\overline{\mathcal{M}}_{g,n})$ of the moduli space $\overline{\mathcal{M}}_{g,n}$ of stable curves. Since the moduli space $\overline{\mathcal{M}}_{g,n}$ is a complete smooth Deligne-Mumford stack, its $k$th cohomology group carries a pure Hodge structure of weight $k$; thus, the Hodge polynomial of $\overline{\mathcal{M}}_{g,n}$ may be extracted from $e^{S_n}(\overline{\mathcal{M}}_{g,n})$. Using Theorem 2.5, we will calculate the Serre characteristics $e^{S_n}(\overline{\mathcal{M}}_{1,n})$.

It is shown in [1] (see also [2]) that
$$\sum_{n=3}^\infty e^{S_n}(\mathcal{M}_{0,n}) = \mathbf{a}_0 = \frac{\prod_{n=1}^\infty (1 + p_n)^{\frac1n \sum_{d|n} \mu(n/d)(1 + L^d)} - 1}{L^3 - L} - \frac{h_1}{L^2 - L} - \frac{h_2}{L + 1},$$
where $L$ is the pure Hodge structure $\mathbb{C}(-1)$ of weight 2. Theorem 2.4 implies that
$$h_2 + \mathbf{b}_0 = h_2 + \sum_{n=3}^\infty e^{S_n}(\overline{\mathcal{M}}_{0,n})$$
is the Legendre transform of $e_2 - \mathbf{a}_0$; this was used in [1] to calculate $e^{S_n}(\overline{\mathcal{M}}_{0,n})$.

Let $\mathsf{S}_{2k+2}$ be the pure Hodge structure $\operatorname{gr}^W_{2k+1} H_c^1(\mathcal{M}_{1,1}, \operatorname{Sym}^{2k} \mathsf{H})$, where $\mathsf{H}$ is the local system $R^1\pi_*\mathbb{Q}$ of rank 2 over the moduli stack of elliptic curves. (Here, $\pi : \mathcal{M}_{1,2} \to \mathcal{M}_{1,1}$ is the universal elliptic curve.) This Hodge structure has the following properties:
(i) $\mathsf{S}_{2k+2} = F^0 \mathsf{S}_{2k+2} \oplus \overline{F^0 \mathsf{S}_{2k+2}}$;
(ii) there is a natural isomorphism between $F^0 \mathsf{S}_{2k+2}$ and the space of cusp forms $S_{2k+2}$ for the full modular group $\mathrm{SL}(2, \mathbb{Z})$. (In particular, $\mathsf{S}_{2k+2} = 0$ for $k \le 4$.)

It is shown in [3] that
$$\mathbf{a}_1 = \sum_{n=1}^\infty e^{S_n}(\mathcal{M}_{1,n}) = \operatorname{res}_0 \Biggl[ \frac{\prod_{n=1}^\infty (1 + p_n)^{\frac1n \sum_{d|n} \mu(n/d)(1 - \omega^d - L^d/\omega^d + L^d)} - 1}{1 - \omega - L/\omega + L} \times \Bigl( \sum_{k=1}^\infty \Bigl( \frac{\mathsf{S}_{2k+2} + 1}{L^{2k+1}} \Bigr) \omega^{2k} - 1 \Bigr) (\omega - L/\omega)\, d\omega \Biggr],$$
where $\operatorname{res}_0[\alpha]$ is the residue of the one-form $\alpha$ at the origin.

We may now apply Theorem 2.5 to calculate the generating function of the $S_n$-equivariant Serre characteristics $e^{S_n}(\overline{\mathcal{M}}_{1,n})$. We do not give the details, since they are quite straightforward, though the resulting formulas are tremendously complicated when written out in full. However, we do present some sample calculations, performed with the package SF.

n    $e^{S_n}(\overline{\mathcal{M}}_{1,n})$    $\chi(\overline{\mathcal{M}}_{1,n})$
1    $(L + 1)s_1$    2
2    $(L^2 + 2L + 1)s_2$    4
3    $(L^3 + 3L^2 + 3L + 1)s_3 + (L^2 + L)s_{21}$    12
4    $(L^4 + 4L^3 + 7L^2 + 4L + 1)s_4 + (2L^3 + 4L^2 + 2L)s_{31} + (L^3 + 2L^2 + L)s_{2^2}$    49
5    $(L^5 + 5L^4 + 12L^3 + 12L^2 + 5L + 1)s_5 + (3L^4 + 11L^3 + 11L^2 + 3L)s_{41} + (2L^4 + 7L^3 + 7L^2 + 2L)s_{32} + (L^3 + L^2)(s_{31^2} + s_{2^2 1})$    260

At the end of the paper, we give a table of non-equivariant Serre characteristics of $\overline{\mathcal{M}}_{1,n}$ for $n \le 15$; these give an idea of the way in which the Hodge structures $\mathsf{S}_{2k+2}$ typically enter into the cohomology. In particular, we see that the even-dimensional cohomology of the moduli spaces $\overline{\mathcal{M}}_{1,n}$ is spanned by Hodge structures of the form $\mathbb{Q}(\ell)$, while the odd-dimensional cohomology is spanned by Hodge structures of the form $\mathsf{S}_{2k+2}(\ell)$.

The rational cohomology groups of $\overline{\mathcal{M}}_{1,n}$ satisfy Poincaré duality: there is a nondegenerate $S_n$-equivariant pairing of Hodge structures
$$H^k(\overline{\mathcal{M}}_{1,n}, \mathbb{Q}) \otimes H^{2n-k}(\overline{\mathcal{M}}_{1,n}, \mathbb{Q}) \longrightarrow \mathbb{Q}(-n).$$
Unfortunately, our formula for $e^{S_n}(\overline{\mathcal{M}}_{1,n})$ does not render this duality manifest.

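4. The Euler Characteristic of $\overline{\mathcal{M}}_{1,n}$

As an application of Corollary 2.8, we give an explicit formula for the generating function of the Euler characteristics $\chi(\overline{\mathcal{M}}_{1,n})$.

Theorem 4.1. Let $g(x) \in x + x^2 \mathbb{Q}[[x]]$ be the solution of the equation
$$2g(x) - (1 + g(x)) \log(1 + g(x)) = x.$$
Then
$$\sum_{n=1}^\infty \chi(\overline{\mathcal{M}}_{1,n}) \frac{x^n}{n!} = -\frac{1}{12} \log(1 + g(x)) - \frac12 \log\bigl( 1 - \log(1 + g(x)) \bigr) + \epsilon(g(x)),$$
where
$$\epsilon(x) = \frac{1}{12} \bigl( 19x + 23x^2/2 + 10x^3/3 + x^4/2 \bigr).$$

Proof. We apply Corollary 2.8 with the data
$$a_0' = \sum_{n=2}^\infty \chi(\mathcal{M}_{0,n+1}) \frac{x^n}{n!} = \sum_{n=2}^\infty (-1)^n (n-2)! \frac{x^n}{n!} = (1 + x) \log(1 + x) - x,$$
$$a_0'' = \log(1 + x), \qquad \dot a_0 = \frac14 x(x + 2),$$
$$\begin{aligned}
a_1 & = \chi(\mathcal{M}_{1,1}) x + \chi(\mathcal{M}_{1,2}) \frac{x^2}{2} + \chi(\mathcal{M}_{1,3}) \frac{x^3}{6} + \chi(\mathcal{M}_{1,4}) \frac{x^4}{24} + \frac{1}{12} \sum_{n=5}^\infty (-1)^n (n-1)! \frac{x^n}{n!} \\
& = x + \frac{x^2}{2} - \frac{1}{12} \log(1 + x) + \frac{1}{12} \Bigl( x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} \Bigr),
\end{aligned}$$
where we have used that $\chi(\mathcal{M}_{1,1}) = \chi(\mathcal{M}_{1,2}) = 1$ and $\chi(\mathcal{M}_{1,3}) = \chi(\mathcal{M}_{1,4}) = 0$. The function $g(x)$ of the statement of the theorem is $x + b_0'(x)$.

The following corollary was shown us by D. Zagier.

Corollary 4.2.
$$\chi(\overline{\mathcal{M}}_{1,n}) \sim \frac{(n-1)!}{4(e-2)^n} \bigl( 1 + C n^{-1/2} + O(n^{-3/2}) \bigr),$$
where
$$C = \sqrt{\frac{e-2}{18\pi e}}\, (1 + 4e + 9e^2 + 4e^3 + 2e^4) \approx 18.31398807.$$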

Proof. Analytically continue $g(x)$ to the domain $\mathbb{C} \setminus [e - 2, \infty)$. The resulting function has an asymptotic expansion of the form
$$g(x) \sim e - 1 - \sqrt{2e(e - 2 - x)} + \sum_{k=3}^\infty a_k (e - 2 - x)^{k/2}.$$
The asymptotics (4.2) follow by applying Cauchy’s integral formula to the right-hand side of Theorem 4.1, with contour the circle $|x| = e - 2$.

The peculiar polynomial $\epsilon(x)$ of Theorem 4.1 combines the error terms in the formula for $\chi(\mathcal{M}_{1,n})$ with the correction terms involving $\dot a_0$ in Corollary 2.8. Omitting the term $\epsilon(g(x))$ in Theorem 4.1 and dividing by 2, we obtain the generating function not of the Euler characteristics $\chi(\overline{\mathcal{M}}_{1,n})$, but rather of the virtual Euler characteristics $\chi_v(\overline{\mathcal{M}}_{1,n})$ of the underlying smooth moduli stack (orbifold). The asymptotic behaviour of the virtual Euler characteristics is the same as that of the Euler characteristics, with $C$ replaced by $\tilde C = \bigl( \frac{e-2}{18\pi e} \bigr)^{1/2} \approx 0.06835794$. The ratio between these Euler characteristics has the asymptotic behaviour
$$\frac{2\chi_v(\overline{\mathcal{M}}_{1,n})}{\chi(\overline{\mathcal{M}}_{1,n})} \sim 1 + (\tilde C - C) n^{-1/2} + O(n^{-1}),$$
giving a statistical measure of the ramification of $\overline{\mathcal{M}}_{1,n}$ for large $n$.

Acknowledgement. I wish to thank the Department of Mathematics at the Université de Paris-VII, and the Max-Planck-Institut für Mathematik in Bonn for their hospitality during the inception and completion, respectively, of this paper. I am grateful to D. Zagier for showing me the asymptotic expansion of Corollary 4.2. This research was partially supported by a research grant of the NSF and a fellowship of the A.P. Sloan Foundation.

References

1. Getzler, E.: Operads and moduli spaces of genus 0 Riemann surfaces. In: The moduli space of curves, eds. R. H. Dijkgraaf et al., Basel: Birkhäuser, 1995, pp. 199–230
2. Getzler, E.: Mixed Hodge structures of configuration spaces. Max-Planck-Institut, preprint MPI-96-61, alg-geom/9510018
3. Getzler, E.: Resolving mixed Hodge modules on configuration spaces. To appear, Duke Math. J.; alg-geom/9611003
4. Getzler, E.: Intersection theory on $\overline{\mathcal{M}}_{1,4}$ and elliptic Gromov-Witten invariants. J. Am. Math. Soc. 10, 973–998 (1997); alg-geom/9612004
5. Getzler, E. and Kapranov, M.M.: Modular operads. Compositio Math. 110, 65–126 (1998); dg-ga/9408003
6. Macdonald, I.G.: Symmetric Functions and Hall Polynomials. 2nd edition, Oxford: Clarendon Press, 1995
7. Stembridge, J.: A Maple package for symmetric functions. http://www.math.lsa.umich.edu/∼jrs/maple.html

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 194, 493 – 512 (1998)



Suppression of Critical Fluctuations by Strong Quantum Effects in Quantum Lattice Systems

Sergio Albeverio1,2, Yuri Kondratiev3,4, Yuri Kozitsky5,6

1 Fakultät für Mathematik, Ruhr-Universität, D-44780 Bochum, Germany, SFB 237 Essen–Bochum–Düsseldorf and BiBoS Research Centre, D 33615 Bielefeld, Germany
2 CERFIM, Locarno, Switzerland
3 BiBoS Research Centre, Bielefeld Universität, D 33615, Germany
4 Institute of Mathematics, Kiev, Ukraine
5 Institute of Mathematics, Marie Curie–Sklodowska University, PL 20-031 Lublin, Poland
6 Institute for Condensed Matter Physics, Lviv, Ukraine

Received: 31 May 1996 / Accepted: 12 March 1997

Abstract: Translation invariant models of quantum anharmonic oscillators with a polynomial anharmonicity and a ferroelectric interaction are considered. For these models, it is proved that the critical fluctuations of the position operator, peculiar to the critical point, are suppressed at all temperatures provided the oscillators are strongly quantum. This phenomenon is shown to occur in particular if the oscillator's mass is less than some threshold value depending on the anharmonicity parameters.

1. Posing of the Problem and Results

The object of our investigation in this paper is a countable system of quantum particles (oscillators) performing one-dimensional oscillations around their equilibrium positions and interacting among themselves. The set of all equilibrium positions is assumed to be a lattice $\mathbb{L}$. For simplicity we choose this lattice to be simple cubic, that is, $\mathbb{L} = \mathbb{Z}^\nu$ with a certain positive integer $\nu$. For every $l \in \mathbb{L}$, let a quantum particle with the mass $m$ and one internal degree of freedom be given. The dynamics of its oscillations around the equilibrium position $l$ is described by means of canonical momentum and displacement operators $\{p_l, q_l\}$ defined on (a dense subset of) the complex Hilbert space $\mathcal{H}_l = L^2(\mathbb{R}_l, dx_l)$. For every finite subset $\Lambda \subset \mathbb{L}$, we put $\mathcal{H}_\Lambda = \bigotimes_{l\in\Lambda} \mathcal{H}_l$. The dynamics of the whole system of particles can be described by means of the model Hamiltonian formally given by

\[ H = \frac{1}{2} \sum_{l \ne l';\ l,l' \in \mathbb{L}} d_{ll'}\, q_l q_{l'} + \sum_{l \in \mathbb{L}} H_l, \tag{1} \]

\[ H_l = \frac{p_l^2}{2m} + V(q_l), \tag{2} \]

where the coefficients $d_{ll'}$ form the dynamical matrix, $m$ is the oscillator's mass, and $V$ is the one-particle potential. The first term in (1) describes the interaction between the particles, the second one describes their individual properties by means of the one-particle Hamiltonian $H_l$, $l \in \mathbb{L}$. The interaction is assumed to be translation invariant and ferroelectric. The simplest example of this type can be given as follows. Let $|l - l'|$ be the Euclidean distance between the points $l$ and $l'$. We set for $l \ne l'$

\[ d_{ll'} = -\phi(|l - l'|), \tag{3} \]

where $\phi$ is some nonnegative monotone function with sufficiently fast falloff, such that

\[ \sum_{l \in \mathbb{L}} \phi(|l|) < \infty. \]

For convenience, we include the case $l = l'$ in the sum in (1), putting $d_{ll} = 0$. The formal Hamiltonian cannot be defined directly as an operator; it is the heuristic limit of “local Hamiltonians” $H_\Lambda$, associated with finite subsets $\Lambda \subset \mathbb{L}$, each of which is defined as an essentially self-adjoint lower bounded operator acting on the corresponding Hilbert space $\mathcal{H}_\Lambda$. For our purposes, it is enough to consider these Hamiltonians indexed by boxes

\[ \Lambda = \{ l = (l_1, \dots, l_\nu) \mid l_j^0 \le l_j \le l_j^1,\ j = 1, \dots, \nu;\ l_j^0 < l_j^1,\ l_j^0, l_j^1 \in \mathbb{Z} \}. \]

To introduce the local Hamiltonian in the box $\Lambda$ which corresponds to the periodic boundary conditions, we set

\[ d^\Lambda_{ll'} = -\phi(|l - l'|_\Lambda), \quad l \ne l'; \qquad d^\Lambda_{ll} = 0; \tag{4} \]

where the function $\phi$ is the same as in (3) and the distance $|l - l'|_\Lambda$ is assumed to be on the torus $T(\Lambda)$ which is obtained from $\Lambda$ by suitable identification of boundary components. For the box $\Lambda$ described above, it can be defined by the expression

\[ |l - l'|_\Lambda^2 = \sum_{j=1}^{\nu} |l_j - l'_j|_\Lambda^2, \]

where

\[ |l_j - l'_j|_\Lambda = \min\{ |l_j - l'_j|;\ l_j^1 - l_j^0 + 1 - |l_j - l'_j| \}. \tag{5} \]

This yields

\[ |l - l'|_\Lambda \le |l - l'| \]

for all pairs $l, l'$ in $\Lambda$. Taking into account the monotonicity of $\phi$, one gets

\[ d^\Lambda_{ll'} \le d_{ll'} \tag{6} \]

for all pairs $l, l'$ in $\Lambda$. Thus we set

\[ H_\Lambda = \frac{1}{2} \sum_{l,l' \in \Lambda} d^\Lambda_{ll'}\, q_l q_{l'} + \sum_{l \in \Lambda} H_l, \tag{7} \]

\[ H^0_\Lambda = \frac{1}{2} \sum_{l,l' \in \Lambda} d_{ll'}\, q_l q_{l'} + \sum_{l \in \Lambda} H_l, \tag{8} \]

where $H_l$ is given by (2). The latter local Hamiltonian corresponds to zero boundary conditions; these will also be considered. As mentioned above, the dynamical matrices introduced by (3)–(5) are the simplest ones having the properties we will utilize in this paper. Let us describe them more systematically. Denote by $\mathcal{L}$ the family of all boxes. For a given box $\Lambda$, let $\mathcal{L}(\Lambda)$ denote the partition of $\mathbb{L}$ by boxes that can be obtained as corresponding translations of this $\Lambda$. Denote by $G$ the group of all translations of $\mathbb{L}$. Let $G(\Lambda)$ be the subgroup of $G$ consisting of translations which generate the partition $\mathcal{L}(\Lambda)$. This means $\mathcal{L}(\Lambda) = \{t(\Lambda),\ t \in G(\Lambda)\}$, where $t(\Lambda) = \{t(l),\ l \in \Lambda\}$. Thereafter, the dynamical matrices $(d_{ll'})_{l,l' \in \mathbb{L}}$ and $(d^\Lambda_{ll'})_{l,l' \in \Lambda}$ have the following properties:

(D1) $d_{ll'}$ is invariant under translations on $\mathbb{L}$;
(D2) $d_{ll'} \le 0$, $d^\Lambda_{ll'} \le 0$ (ferroelectricity); $d_{ll} = d^\Lambda_{ll} = 0$;
(D3) the series $\sum_{l' \in \mathbb{L}} d_{ll'}$ converges for all $l \in \mathbb{L}$;
(D4) $d^\Lambda_{ll'} = \min\{ d_{l\,t(l')} : t \in G(\Lambda) \}$.

Now let us describe the one-particle potentials $V$ that will be considered. We assume:

(V1) $V$ is an even polynomial, $\deg V = 2M$ with $M \ge 2$;
(V2) the polynomial $v$ such that $v(q^2) = V(q)$ is convex on $\mathbb{R}_+ = [0, +\infty)$.

An example of $V$ satisfying these conditions is

\[ V(q) = a q^2 + q^4 \tag{9} \]

with arbitrary real $a$. For $a < 0$, the potential (9) has a double-well shape. In the sequel, when speaking about a model of quantum anharmonic oscillators, we will mean the model possessing the properties (D1)–(D4), (V1), (V2). The simplest example of such a model was described above.

Remark 1.1. Let $V$ satisfy (V1), (V2). Then the probability measure $d\mu_V(x) = C \exp(-V(x))\,dx$ ($C$ is a positive normalizing constant) belongs to the BFS class of measures [6], and for every $\alpha > 0$, the function $\exp(\alpha x^2)$ is $\mu_V$-integrable on $\mathbb{R}$.

For every $\Lambda \in \mathcal{L}$ and given inverse temperature $\beta = T^{-1}$, the Gibbs states in $\Lambda$ are defined as functionals on an appropriate $C^*$-algebra of observables by means of the local Hamiltonians $H_\Lambda$, $H^0_\Lambda$ given by (7), (8) respectively as follows:

\[ \gamma_{\beta,\Lambda}(A) = \frac{\mathrm{trace}(A e^{-\beta H_\Lambda})}{\mathrm{trace}(e^{-\beta H_\Lambda})}, \qquad \gamma^0_{\beta,\Lambda}(A) = \frac{\mathrm{trace}(A e^{-\beta H^0_\Lambda})}{\mathrm{trace}(e^{-\beta H^0_\Lambda})}, \tag{10} \]

see e.g. [4] for the definition and related discussions. The state $\gamma$ is periodic, the state $\gamma^0$ corresponds to zero boundary conditions. These states can be fully determined by means of temperature Green functions. Let $S_\beta$ be a circle of length $\beta$ (isometric to the segment $[0, \beta]$ with identified ends). For each ordered set $\{\tau_0, \tau_1, \dots, \tau_n\}$, $\tau_i \le \tau_{i+1}$, $\tau_i \in S_\beta$, and appropriate operators $\{A_0, A_1, \dots, A_n\}$, the corresponding temperature Green function is determined as follows:

\[ \Gamma^{\beta,\Lambda}_{A_0;A_1;\dots;A_n}(\tau_0, \tau_1, \dots, \tau_n) = \frac{1}{Z_\Lambda}\, \mathrm{trace}\left[ A_0 e^{-(\tau_1 - \tau_0) H_\Lambda} A_1 e^{-(\tau_2 - \tau_1) H_\Lambda} \cdots A_n e^{-(\beta - \tau_n + \tau_0) H_\Lambda} \right], \tag{11} \]

where

\[ Z_\Lambda = \mathrm{trace}\, e^{-\beta H_\Lambda}. \tag{12} \]



The Green function defined by (11), (12) with $H^0_\Lambda$ instead of $H_\Lambda$ will be denoted by $\Gamma^{0,\beta,\Lambda}_{A_0;A_1;\dots;A_n}(\tau_0, \tau_1, \dots, \tau_n)$. These functions can be obtained by using the analytic continuation of the usual Green functions corresponding to the unitary time evolution, see [1, 7]. Thermodynamic properties of the model can be described by passing to the limit $\Lambda \nearrow \mathbb{L}$. To do it, we introduce the collections of boxes $\mathcal{C}$ as sets completely ordered by inclusion. Each such set is countable and thus can be considered as a sequence of boxes. For the models we consider, the existence of limiting periodic Gibbs states can be proved (see e.g. [2], where it was done for systems satisfying a reflection positivity condition).

Our intention in this paper is to show that the large fluctuations of displacements of oscillators described by the Hamiltonians (1)–(7), (8) are absent (suppressed) at any temperature if the model parameters satisfy certain conditions that can be characterized as conditions of “strong quantumness”. Thus we restrict ourselves to the consideration of those operators that describe such fluctuations. Due to the $Z_2$-symmetry of the local Hamiltonians (7), (8) one has $\gamma_{\beta,\Lambda}(q_l) = \gamma^0_{\beta,\Lambda}(q_l) = 0$. For some real $\delta$, we define

\[ A_\delta(\Lambda) = \frac{1}{|\Lambda|^{\frac{1}{2}+\delta}}\, Q_\Lambda = \frac{1}{|\Lambda|^{\frac{1}{2}+\delta}} \sum_{l \in \Lambda} q_l. \tag{13} \]

This operator describes the fluctuations of displacements of particles in $\Lambda$. For some $\mathcal{C}$, let us consider the sequence $\{A_\delta(\Lambda),\ \Lambda \in \mathcal{C}\}$ from the point of view of the central limit theorem. In the case of a weak dependence between oscillations, one can expect the asymptotic “normality” of this sequence with $\delta = 0$. But if this “normality” occurs for some $\delta > 0$, this will correspond to the appearance of a strong dependence between oscillations which takes place at the critical temperature. In the quantum case, the normality of these sequences can be understood in terms of corresponding Green functions. Accordingly, we give the following definition of the critical temperature.

Definition 1.1. For given temperature $T$, some $\delta > 0$, and a sequence $\mathcal{C}$, let the sequence of temperature Green functions $\{\Gamma^{\beta,\Lambda}_{A_\delta(\Lambda);A_\delta(\Lambda)}(\tau, \tau'),\ \Lambda \in \mathcal{C}\}$ given by (11)–(13), or the sequence of functions $\Gamma^{0,\beta,\Lambda}_{A_\delta(\Lambda);A_\delta(\Lambda)}(\tau, \tau')$, converge in $L^1(S_\beta^2)$ to some function $\Gamma^\delta(\tau, \tau')$ different from zero on a subset of $S_\beta^2$ of positive Lebesgue measure. Then one says that the critical fluctuations of displacements of particles occur at this temperature. The latter is said to be a critical temperature. If for every sequence $\mathcal{C}$ and arbitrary positive $\delta$, these sequences of Green functions converge to zero, we say that the critical fluctuations are absent at this temperature.

Now we can formulate our main result.

Theorem 1.1. Let the model of quantum anharmonic oscillators which possesses the properties (D1)–(D4), (V1), (V2) be considered. For this model, there exists a positive $m_*$ such that for all values of the oscillator's mass $m \in (0, m_*)$, the critical fluctuations are absent at all temperatures.

The proof of this statement is preceded by a number of lemmas, other assertions, and remarks.

Remark 1.2. The model with the small values of the oscillator's mass can be characterized as “strongly quantum”. Below we analyze this and some other conditions of strong (or large) quantumness.



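Now we simplify our notations by setting

\[ \Gamma^\delta_\Lambda(\tau, \tau') = \Gamma^{\beta,\Lambda}_{A_\delta(\Lambda);A_\delta(\Lambda)}(\tau, \tau'), \qquad \Gamma^\delta_{0,\Lambda}(\tau, \tau') = \Gamma^{0,\beta,\Lambda}_{A_\delta(\Lambda);A_\delta(\Lambda)}(\tau, \tau'). \tag{14} \]

From the definitions (11)–(13), one has:

\[ \Gamma^\delta_\Lambda(\tau + \theta, \tau' + \theta) = \Gamma^\delta_\Lambda(\tau, \tau'), \tag{15} \]

where the addition of the type of $\tau + \theta$ is understood modulo $\beta$. The same periodicity can be shown to hold also for $\Gamma^\delta_{0,\Lambda}$.

Proposition 1.1. The following estimates hold with arbitrary $\delta \ge 0$, $\Lambda \in \mathcal{L}$, almost everywhere on $S_\beta^2$:

\[ \Gamma^\delta_\Lambda(\tau, \tau') \ge \Gamma^\delta_{0,\Lambda}(\tau, \tau') > 0. \]

The proof of this variant of the Lebowitz and Griffiths inequalities for our model will be given below. We set

\[ U^\delta_\Lambda = \frac{1}{\beta} \int\!\!\int_{S_\beta^2} \Gamma^\delta_\Lambda(\tau, \tau')\, d\tau\, d\tau'. \tag{16} \]

Taking into account (15), we obtain

\[ U^\delta_\Lambda = \int_{S_\beta} \Gamma^\delta_\Lambda(\tau, \tau')\, d\tau' = \int_{S_\beta} \Gamma^\delta_\Lambda(\tau, \tau')\, d\tau. \tag{17} \]

Proposition 1.2. For given $\delta > 0$, let the sequence $\{U^\delta_\Lambda,\ \Lambda \in \mathcal{C}\}$ converge to zero. Then the sequences $\{\Gamma^\delta_\Lambda(\tau, \tau'),\ \Lambda \in \mathcal{C}\}$, $\{\Gamma^\delta_{0,\Lambda}(\tau, \tau'),\ \Lambda \in \mathcal{C}\}$ converge to zero in $L^1(S_\beta^2)$.

The proof of this assertion immediately follows from Proposition 1.1 and (16). Taking into account the definition (13), one has

\[ \Gamma^\delta_\Lambda(\tau, \tau') = |\Lambda|^{-2\delta}\, \Gamma^0_\Lambda(\tau, \tau'); \qquad U^\delta_\Lambda = |\Lambda|^{-2\delta}\, U^0_\Lambda. \tag{18} \]

Let us return to the dynamical matrix $D = (d_{ll'})$ that can naturally be defined on the Hilbert space $l^2(\mathbb{L})$ as a linear operator. Having in mind the translation invariance (D1) and the “ferroelectricity” property (D2), we deduce that the series mentioned in (D3) does not depend on $l$. Thus the norm of the mentioned operator is

\[ \|D\| = -\sum_{l' \in \mathbb{L}} d_{ll'}. \tag{19} \]

Write

\[ \tilde H_l = H_l + \|D\|\, q_l^2, \quad l \in \mathbb{L}, \tag{20} \]

where $H_l$ is given by (2). Each Hamiltonian $\tilde H_l$ with $V$ obeying (V1), (V2) has purely discrete spectrum. We set

\[ \tilde H_l \psi_s = E_s \psi_s, \quad s \in \mathbb{Z}_+, \tag{21} \]

where the eigenvalues $E_s$ are numbered in increasing order. Denote

\[ \Delta = \min\{ E_{s+1} - E_s \mid s \in \mathbb{Z}_+ \}. \tag{22} \]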



Lemma 1.1. (Main estimate). For the model of quantum oscillators, let the parameters of the one-particle Hamiltonian, $m$ and $\Delta$ (the latter given by (22)), and the interaction parameter $\|D\|$ satisfy the following condition:

\[ m\Delta^2 > 2\|D\|. \tag{23} \]

Then for every $\beta > 0$ and arbitrary $\Lambda \in \mathcal{L}$,

\[ U^0_\Lambda \le \frac{1}{m\Delta^2 - 2\|D\|}. \tag{24} \]

The proof will be given below.

Lemma 1.2. For the model of quantum anharmonic oscillators, let the parameters $m$, $\Delta$, and $\|D\|$ satisfy condition (23). Then the critical fluctuations are absent for all $\beta > 0$.

Proof. Proposition 1.1 implies $U^0_\Lambda > 0$, for all $\beta > 0$ and $\Lambda \in \mathcal{L}$. Therefore, for $m$, $\Delta$, and $\|D\|$ satisfying (23), the estimate (24) holds, thus the sequence $\{U^0_\Lambda,\ \Lambda \in \mathcal{C}\}$ is bounded. By (18), this yields, for arbitrary $\delta > 0$, that the sequence $\{U^\delta_\Lambda,\ \Lambda \in \mathcal{C}\}$ converges to zero. The latter and Proposition 1.2 yield in their turn that the sequences $\{\Gamma^\delta_\Lambda(\tau, \tau'),\ \Lambda \in \mathcal{C}\}$, $\{\Gamma^\delta_{0,\Lambda}(\tau, \tau'),\ \Lambda \in \mathcal{C}\}$ converge to zero in $L^1(S_\beta^2)$. Thus the critical fluctuations are absent for all $\beta > 0$. □

For the system of harmonic oscillators, we obtain an equality in (24) (this follows from the proof of Lemma 1.1 given below). This agrees with the corresponding results obtained for such systems (see e.g. expression (7) in [7]). For a harmonic oscillator ($V(q) = \frac{1}{2} a^2 q^2$), we have $m\Delta^2 = \|D\| + a^2$, where the latter coefficient describes the “rigidity” of the oscillator. In this case, the condition (23) does not depend on $m$. But in the case of anharmonic oscillators where the one-particle potential $V$ satisfies the conditions (V1), (V2), $m\Delta^2$ may infinitely grow with $m$ tending to zero, which implies that the condition (23) can be satisfied for given $\|D\|$ by putting $m$ to be less than some threshold value $m_*$. Such a growth of $m\Delta^2$ is proven in the following assertion.

Lemma 1.3. Let $V$ satisfy the conditions (V1), (V2). Then there exists a positive $m_*$ such that for all values of the oscillator's mass $m$ less than $m_*$, the condition (23) is satisfied.

Proof. The one-particle potential $V(q) = v(q^2)$ in the Hamiltonian (7) has the form

\[ V(q) = b_0 q^{2M} + b_1 q^{2M-2} + \dots + b_{M-1} q^2 + b_M, \tag{25} \]

where $b_0 > 0$, $M = \deg v \ge 2$, due to the condition (V1). For some $\sigma > 0$, we consider the unitary operator $W_\sigma$ in $L^2(\mathbb{R}, dx)$ (Symanzik scaling) given by the formula

\[ (W_\sigma \varphi)(x) = \sigma^{\frac{1}{2}} \varphi(\sigma x). \tag{26} \]

Then from (26), we have

\[ W_\sigma\, p\, W_\sigma^{-1} = \sigma^{-1} p, \qquad W_\sigma\, q\, W_\sigma^{-1} = \sigma q. \tag{27} \]

Taking $\sigma = \sigma_m := m^{-\frac{1}{2M+2}}$ and using (27), we have that $\tilde H_l$ is unitary equivalent to the operator

\[ m^{-\frac{M}{M+1}} \left[ L_0 + m^{\frac{1}{M+1}} R_m(q) \right], \tag{28} \]

where

\[ L_0 = \frac{1}{2} p^2 + b_0 q^{2M} \tag{29} \]

and

\[ R_m(q) = b_1 q^{2M-2} + b_2 m^{\frac{1}{M+1}} q^{2M-4} + \dots + (b_{M-1} + \|D\|)\, m^{\frac{M-2}{M+1}} q^2 + b_M m^{\frac{M-1}{M+1}}. \]

Let $\Delta^{(0)}$ be defined by (22) but with the eigenvalues $E_s$ of the operator $L_0$ instead of $\tilde H_l$. Then using the perturbation theory for the operator

\[ L_0 + m^{\frac{1}{M+1}} R_m(q) \]

(see e.g. [8]), we deduce the asymptotic equivalence

\[ m\Delta^2 \sim m^{-\frac{M-1}{M+1}} (\Delta^{(0)})^2, \quad m \to 0+. \]

Therefore, we can choose $m_*$ such that $m\Delta^2 > 2\|D\|$ for all $m \in (0, m_*)$. □

Remark 1.3. For the anharmonic oscillator with $V$ meeting (V1), (V2), $m\Delta^2$ can be considered as a parameter describing the quantum character of this oscillator. It may infinitely grow with $m$ tending to zero, as it was established by Lemma 1.3, or with the growth of $\Delta$. The latter may occur for the double-well potential, when its minima are coming close (e.g. by means of the external pressure [12]), increasing the tunnelling between the wells. Therefore, for these systems, the condition (23) may serve as a mathematical realization of the notion of “strong quantumness”.

Proof of Theorem 1.1. It follows directly from Lemmas 1.2 and 1.3.
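The growth of $m\Delta^2$ as $m \to 0$ asserted in Lemma 1.3 can be illustrated numerically. The following is only a rough sketch under stated assumptions, not part of the paper's argument: it takes the example potential (9), works in units with $\hbar = 1$, replaces $\tilde H_l$ by a finite-difference matrix on a bounded interval with Dirichlet ends, and takes the minimum in (22) over the lowest few gaps only. The function names (`sturm_count`, `eigenvalue`, `gap_parameter`) and the grid parameters are hypothetical choices made for the illustration.

```python
import math

def sturm_count(diag, off, lam):
    """Number of eigenvalues below lam of a symmetric tridiagonal matrix."""
    count = 0
    q = 1.0
    for i, a in enumerate(diag):
        q = a - lam - (off[i - 1] ** 2 / q if i > 0 else 0.0)
        if q == 0.0:
            q = 1e-30  # avoid a division by zero in the next step
        if q < 0.0:
            count += 1
    return count

def eigenvalue(diag, off, k, iterations=60):
    """k-th smallest eigenvalue (k = 0, 1, ...) by bisection on the Sturm count."""
    radius = 2.0 * max(abs(b) for b in off)
    lo, hi = min(diag) - radius, max(diag) + radius  # Gershgorin bounds
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sturm_count(diag, off, mid) > k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def gap_parameter(m, a=0.0, norm_d=0.0, half_width=6.0, n=600, levels=4):
    """Estimate m * Delta**2 for p^2/(2m) + a q^2 + q^4 + norm_d q^2.

    Assumption: the minimum in (22) is taken over the lowest `levels` gaps only.
    """
    h = 2.0 * half_width / (n + 1)
    kin = 1.0 / (m * h * h)
    xs = [-half_width + (i + 1) * h for i in range(n)]
    diag = [kin + (a + norm_d) * x * x + x ** 4 for x in xs]
    off = [-0.5 * kin] * (n - 1)
    energies = [eigenvalue(diag, off, k) for k in range(levels + 1)]
    delta = min(e1 - e0 for e0, e1 in zip(energies, energies[1:]))
    return m * delta * delta
```

For the pure quartic case ($a = 0$, $\|D\| = 0$, $M = 2$) the scaling in the proof predicts $m\Delta^2 \propto m^{-1/3}$, so `gap_parameter(0.1) / gap_parameter(1.0)` should be close to $10^{1/3}$.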

In fact, we can prove stronger statements than that of Theorem 1.1. Let us consider a sequence of operators $\{B(\Lambda),\ \Lambda \in \mathcal{C}\}$.

Definition 1.2. The sequence $\{B(\Lambda),\ \Lambda \in \mathcal{C}\}$ is said to be degenerate at zero if for all $n \in \mathbb{N}$, the sequences of temperature Green functions $\{\Gamma^{\beta,\Lambda}_{B(\Lambda);\dots;B(\Lambda)}(\tau_1, \dots, \tau_n),\ \Lambda \in \mathcal{C}\}$ converge to zero in $L^1(S_\beta^n)$.

Theorem 1.2. Let the conditions of Lemma 1.2 be satisfied. Then for all $\beta > 0$ and arbitrary $\delta > 0$, the sequence of fluctuation operators $\{A_\delta(\Lambda),\ \Lambda \in \mathcal{C}\}$ is asymptotically degenerate at zero.

The proof of this and the following theorem will be given in the next section. For any given sequence $\{\lambda(\Lambda),\ \Lambda \in \mathcal{C} \mid \lambda(\Lambda) \in \mathbb{R}\}$, we set

\[ B_\lambda(\Lambda) = \lambda(\Lambda) A_0(\Lambda). \tag{30} \]

Theorem 1.3. Let the conditions of Lemma 1.2 be satisfied. Then for all $\beta > 0$ and an arbitrary sequence $\{\lambda(\Lambda),\ \Lambda \in \mathcal{C}\}$ converging to zero, the sequence of operators $\{B_\lambda(\Lambda),\ \Lambda \in \mathcal{C}\}$ given by (30) is asymptotically degenerate at zero.

Remark 1.4. The suppression of the long range order by strong quantum fluctuations in the physical systems described by the above models was experimentally observed (see e.g. J.E. Tibballs et al [12]) and discussed long ago from the physical point of view (see e.g. T. Schneider et al [9] or the book [5], chapter 2.5.4.3). A rigorous justification of this phenomenon was given by A. Verbeure and V.A. Zagrebnov [14]. Our theorems describe a quantum effect which is essentially stronger than the one described in [14]: they imply the suppression not only of the long-range order but also of any critical anomalies. The suppression of the long-range order would correspond to the value of $\delta = \frac{1}{2}$ in (13).



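2. Functional Integral Representation

For given inverse temperature $\beta$ and $\Lambda \in \mathcal{L}$, let us consider the measure space $(\Omega_{\beta,\Lambda}; \Sigma_{\beta,\Lambda})$, where $\Omega_{\beta,\Lambda} = \{\omega_\Lambda(\cdot) \mid \omega_\Lambda : S_\beta \to \mathbb{R}^\Lambda\}$, $\omega_\Lambda(\cdot) = \{\omega_l(\cdot),\ l \in \Lambda \mid \omega_l \in \Omega_\beta := C(S_\beta \to \mathbb{R})\}$. The set $\Sigma_{\beta,\Lambda}$ is the standard $\sigma$-algebra of subsets of $\Omega_{\beta,\Lambda}$ generated by cylinder subsets, see e.g. [2]. In view of Definitions 1.1, 1.2, our main objects are the Green functions defined by (11) with all $A_j$ chosen to be $A_\delta(\Lambda)$. But in the case where $A_j$ are the multiplication operators by measurable functions $A_j(q_\Lambda)$ with $q_\Lambda = \{q_l,\ l \in \Lambda\}$, we have, in fact [7]:

\[ \Gamma^{\beta,\Lambda}_{A_0;\dots;A_n}(\tau_0, \dots, \tau_n) = \int_{\Omega_{\beta,\Lambda}} \prod_{j=0}^{n} A_j(\omega_\Lambda(\tau_j))\, d\nu_{\beta,\Lambda}(\omega_\Lambda(\cdot)). \tag{31} \]

Here $\nu_{\beta,\Lambda}$ is determined as a measure on $\Omega_{\beta,\Lambda}$ by the Hamiltonian $H_\Lambda$ and has been rigorously defined in [1, 7]. In the sequel, we will use the local Hamiltonian $H_\Lambda$ presented as follows:

\[ H_\Lambda = -\frac{1}{2} \sum_{l,l' \in \Lambda} J^\Lambda_{ll'}\, q_l q_{l'} + \sum_{l \in \Lambda} \tilde H_l, \tag{32} \]

where $\tilde H_l$ is given by (20) and

\[ J^\Lambda_{ll'} = -d^\Lambda_{ll'} + \|D\|\, \delta_{ll'}, \tag{33} \]

with $d^\Lambda_{ll'}$ defined by (D4). For the one-particle Hamiltonian, we will use the following representation

\[ \tilde H_l = \frac{1}{2m} p_l^2 + \tilde V(q_l), \qquad \tilde V(q_l) = V(q_l) + \|D\|\, q_l^2. \tag{34} \]

By means of the matrix $(J^\Lambda_{ll'})$, one can define on the Euclidean space $\mathbb{R}^\Lambda$ the operator $J^\Lambda$ that has the norm

\[ \|J^\Lambda\| = \|D\| - \sum_{l' \in \Lambda} d^\Lambda_{ll'}. \tag{35} \]

Therefore, the smallest eigenvalue of $J^\Lambda$ is not less than

\[ \|D\| + \sum_{l' \in \Lambda} d^\Lambda_{ll'} > 0, \]

where (D2) and (D4) were taken into account. Thus this operator is symmetric and positive. The latter means that the scalar product in $\mathbb{R}^\Lambda$, $(J^\Lambda x, x)$, is strictly positive for all nonzero $x \in \mathbb{R}^\Lambda$. Thus we can write $\nu_{\beta,\Lambda}$ as follows:

\[ d\nu_{\beta,\Lambda}(\omega_\Lambda(\cdot)) = \frac{1}{Z_{\beta,\Lambda}} \exp\left( \frac{1}{2} \sum_{l,l' \in \Lambda} J^\Lambda_{ll'} \int_{S_\beta} \omega_l(\tau)\, \omega_{l'}(\tau)\, d\tau \right) \prod_{l \in \Lambda} d\chi_\beta(\omega_l(\cdot)). \tag{36} \]

The measure $\chi_\beta$ is determined by the one-particle Hamiltonian (34) and has also been defined in [1, 7]. For our purpose, it is convenient to use its heuristic representation [1, 7],

\[ d\chi_\beta(\omega_l(\cdot)) = \frac{1}{Z} \exp\left( -\frac{m}{2} \int_{S_\beta} (\dot\omega_l(\tau))^2\, d\tau - \int_{S_\beta} \tilde V(\omega_l(\tau))\, d\tau \right) d\omega_l(\cdot). \tag{37} \]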



It is possible to give a rigorous meaning to (37) by means of a lattice approximation, as in constructive quantum field theory. Within this approximation, the derivative $\dot\omega_l(\tau) := d\omega_l(\tau)/d\tau$ is replaced by a difference relation, whereas the integrals over $S_\beta$ are put to be equal to the corresponding Riemann integral sums [10]. This approach will be used in the next section.

For given $k \in \mathbb{N}$, let $\pi$ be a partition of the set $\{1, 2, \dots, 2k\}$ into unordered pairs, i.e. $\pi = \{(\pi(1), \pi(2)); (\pi(3), \pi(4)); \dots; (\pi(2k-1), \pi(2k))\}$, and $\mathcal{P}(2k)$ be the collection of all such partitions. We set

\[ \Gamma^\delta_{2k,\Lambda}(\tau_1, \dots, \tau_{2k}) = \Gamma^{\beta,\Lambda}_{A_\delta(\Lambda);\dots;A_\delta(\Lambda)}(\tau_1, \dots, \tau_{2k}), \tag{38} \]

where the right-hand side of (38) is defined by (11) with $A_\delta(\Lambda)$ given by (13). Then (see (14)) $\Gamma^\delta_{2,\Lambda}(\tau, \tau') = \Gamma^\delta_\Lambda(\tau, \tau')$.

Lemma 2.1. For all $\delta \ge 0$, $k \in \mathbb{N}$, the following estimate holds for all $\Lambda \in \mathcal{L}$ and almost all values of corresponding variables

\[ \Gamma^\delta_{2k,\Lambda}(\tau_1, \dots, \tau_{2k}) \le \sum_{\pi \in \mathcal{P}(2k)} \prod_{i=1}^{k} \Gamma^\delta_\Lambda\left( \tau_{\pi(2i-1)}, \tau_{\pi(2i)} \right). \tag{39} \]

The proof of this Gaussian-upper-bound-like inequality is done in the next section. For $\Lambda$ consisting of only one element, we denote the corresponding $U^\delta_\Lambda$ given by (17) simply by $U$. Clearly, this $U$ does not depend on $\delta$ and is determined by the one-particle Hamiltonian (34). The proof of Lemma 1.1 is based on the following Gaussian-upper-bound-like estimate, which for our model is proved in the next section by means of the lattice approximation.

Lemma 2.2. For given $\beta$, let $U$ and $\|D\|$ satisfy the condition

\[ U^{-1} > 2\|D\|. \tag{40} \]

Then for arbitrary $\Lambda \in \mathcal{L}$,

\[ U^0_\Lambda \le \frac{U}{1 - 2U\|D\|}. \tag{41} \]

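Lemma 2.3. Let $\Delta$ be defined by (22), then for all $\beta > 0$, the following estimate holds:

\[ U \le \frac{1}{m\Delta^2}. \tag{42} \]

Proof. Taking into account (17), (14), (12), (11), (10) and (21), one gets

\[ U = \frac{1}{Z} \int_{S_\beta} \mathrm{trace}\left( q\, e^{-\tau \tilde H_l}\, q\, e^{-(\beta - \tau) \tilde H_l} \right) d\tau, \tag{43} \]

where

\[ Z = \mathrm{trace}\, e^{-\beta \tilde H_l}. \]

We set $q_{ss'} = (\psi_s, q\psi_{s'})$ and obtain from (43)

\[ U = \frac{1}{Z} \sum_{s,s' \in \mathbb{Z}_+} (q_{ss'})^2\, \frac{(E_s - E_{s'})(e^{-\beta E_{s'}} - e^{-\beta E_s})}{(E_s - E_{s'})^2}. \tag{44} \]

For symmetry reasons, we have $q_{ss} = 0$, thus the case $s = s'$ in the sum (44) can be excluded. Therewith, taking into account definition (22), one can estimate the denominator in (44) and obtain

\[ U \le \frac{1}{Z\Delta^2} \sum_{s,s'} (q_{ss'})^2 (E_s - E_{s'})(e^{-\beta E_{s'}} - e^{-\beta E_s}) = \frac{1}{\Delta^2} \cdot \frac{1}{Z}\, \mathrm{trace}\left( [q, [\tilde H_l, q]]\, e^{-\beta \tilde H_l} \right) = \frac{1}{m\Delta^2}, \]

where $[\cdot, \cdot]$ means the commutator. This gives (42).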
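The last equality in this estimate rests on the value of the double commutator. As a short worked step, assuming the canonical commutation relation $[q, p] = i$ in units with $\hbar = 1$ (a convention the text does not state explicitly) and using that $\tilde V(q)$ commutes with $q$:

```latex
\begin{align*}
[\tilde H_l, q] &= \frac{1}{2m}\,[p^2, q] = -\frac{i}{m}\, p, \\
[q, [\tilde H_l, q]] &= -\frac{i}{m}\,[q, p] = \frac{1}{m}, \\
\frac{1}{Z}\,\mathrm{trace}\left( [q, [\tilde H_l, q]]\, e^{-\beta \tilde H_l} \right) &= \frac{1}{m}.
\end{align*}
```

Since the double commutator is a multiple of the identity, the normalized trace does not depend on $\beta$ or on the potential, which is why the bound (42) holds uniformly in the temperature.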

Proof of Lemma 1.1. Follows immediately from Lemmas 2.2 and 2.3. Proof of Theorem 1.2 and 1.3. Follows immediately from Theorem 1.1 and estimate (39).

3. Proof of Lemmas 2.1 and 2.2

In order to prove Lemmas 2.1 and 2.2, we will construct some technical background within the lattice approximation approach. The starting point here is expression (37). As it was mentioned, the main idea of this approach is to replace the integrals over $S_\beta$ by corresponding Riemann integral sums. Let us divide $S_\beta$ by the points $\tau_n = \frac{n}{N}\beta$, $n = 0, 1, \dots, N-1$, and introduce the following notations:

\[ \xi_{ln} = \sqrt{\frac{\beta}{N}}\, \omega_l(\tau_n); \quad n = 0, 1, \dots, N-1; \tag{45} \]

\[ \xi_{l0} = \xi_{lN}, \qquad \xi_l = (\xi_{l0}, \xi_{l1}, \dots, \xi_{l\,N-1}); \]

\[ \Delta\tau = \tau_{n+1} - \tau_n = \frac{\beta}{N}; \quad n = 0, 1, \dots, N-1; \]

\[ \dot\omega_l(\tau_n) = \frac{\omega_l(\tau_{n+1}) - \omega_l(\tau_n)}{\Delta\tau}. \]

Now we define a sequence of measures on $\mathbb{R}^N$ that converges to the corresponding measure on trajectories, when $N \to \infty$, in the sense of convergence of integrals on cylinder functions. We set

\[ d\chi^{(N)}_\beta(\xi_l) = \frac{1}{X^{(N)}_\beta} \exp\left( \alpha_N \sum_{n=0}^{N-1} \xi_{ln}\, \xi_{l\,n+1} \right) \prod_{n=0}^{N-1} d\rho^{(N)}_\beta(\xi_{ln}), \tag{46} \]

\[ d\rho^{(N)}_\beta(\xi_{ln}) = \frac{1}{Y^{(N)}_\beta} \exp\left( -\alpha_N\, \xi_{ln}^2 - \frac{\beta}{N}\, \tilde V\left( \sqrt{\frac{N}{\beta}}\, \xi_{ln} \right) \right) d\xi_{ln}, \tag{47} \]

where $X^{(N)}_\beta$, $Y^{(N)}_\beta$ are normalizing constants, and

\[ \alpha_N = m \left( \frac{N}{\beta} \right)^2. \]
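To see where $\alpha_N$ and the two factors in (46), (47) come from, one can discretize the exponent of the heuristic expression (37) directly. The following worked step is a sketch of that computation, using only the notation (45) and the periodicity $\xi_{lN} = \xi_{l0}$:

```latex
\begin{align*}
\frac{m}{2} \int_{S_\beta} (\dot\omega_l(\tau))^2\, d\tau
  &\approx \frac{m}{2} \sum_{n=0}^{N-1} \frac{(\omega_l(\tau_{n+1}) - \omega_l(\tau_n))^2}{\Delta\tau}
   = \frac{m}{2} \left( \frac{N}{\beta} \right)^2 \sum_{n=0}^{N-1} (\xi_{l\,n+1} - \xi_{ln})^2 \\
  &= \alpha_N \sum_{n=0}^{N-1} \xi_{ln}^2 - \alpha_N \sum_{n=0}^{N-1} \xi_{ln}\, \xi_{l\,n+1},
   \qquad \alpha_N = m \left( \frac{N}{\beta} \right)^2, \\
\int_{S_\beta} \tilde V(\omega_l(\tau))\, d\tau
  &\approx \frac{\beta}{N} \sum_{n=0}^{N-1} \tilde V\left( \sqrt{\tfrac{N}{\beta}}\, \xi_{ln} \right).
\end{align*}
```

The cross term yields the nearest-neighbour coupling in (46), while the diagonal term and the potential term are collected into the single-site measure (47).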

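We consider

\begin{align*}
d\nu^{(N)}_{\beta,\Lambda}(\xi) &= \frac{1}{Z^{(N)}_{\beta,\Lambda}} \exp\left( \frac{1}{2} \sum_{l,l' \in \Lambda} \sum_{n=0}^{N-1} J^\Lambda_{ll'}\, \xi_{ln}\, \xi_{l'n} \right) \prod_{l \in \Lambda} d\chi^{(N)}_\beta(\xi_l) \tag{48} \\
&= \frac{1}{Z^{(N)}_{\beta,\Lambda} X^{(N)}_\beta} \exp\left( \frac{1}{2} \sum_{l,l' \in \Lambda} \sum_{n,n'=0}^{N-1} J^{(N)}_\Lambda(ln, l'n')\, \xi_{ln}\, \xi_{l'n'} \right) \prod_{l \in \Lambda} \prod_{n=0}^{N-1} d\rho^{(N)}_\beta(\xi_{ln}),
\end{align*}

where

\[ J^{(N)}_\Lambda(ln, l'n') = \delta_{nn'}\, J^\Lambda_{ll'} + \delta_{ll'}\, \delta_{|n-n'|_N, 1}\, \alpha_N, \tag{49} \]

$\delta_{|n-n'|_N, 1} = 1$ iff $|n - n'| = 1, N-1$. For a suitable function $F(\xi)$, we define

\[ \langle F(\xi) \rangle^{(N)}_\Lambda = \int_{\mathbb{R}^{N|\Lambda|}} F(\xi)\, d\nu^{(N)}_{\beta,\Lambda}(\xi), \tag{50} \]

and

\[ \langle F(\xi) \rangle^{(N)}_{0,\Lambda} = \int_{\mathbb{R}^{N|\Lambda|}} F(\xi)\, d\nu^{(0,N)}_{\beta,\Lambda}(\xi), \tag{51} \]

where $\nu^{(0,N)}_{\beta,\Lambda}$ is defined by (48) with

\[ J^{(N)}_{0,\Lambda}(ln, l'n') = \delta_{nn'}\, J_{ll'} + \delta_{ll'}\, \delta_{|n-n'|_N, 1}\, \alpha_N, \tag{52} \]

instead of $J^{(N)}_\Lambda(ln, l'n')$, and

\[ J_{ll'} = -d_{ll'} + \|D\|\, \delta_{ll'}, \tag{53} \]

instead of $J^\Lambda_{ll'}$. Note that (D4) yields

\[ J_{ll'} \le J^\Lambda_{ll'}. \tag{54} \]

Introduce

\[ S^{\delta,N}_{2k,\Lambda}(n_1, \dots, n_{2k}) = |\Lambda|^{-k(1+2\delta)} \left( \frac{N}{\beta} \right)^{k} \sum_{l_1, \dots, l_{2k} \in \Lambda} \langle \xi_{l_1 n_1} \cdots \xi_{l_{2k} n_{2k}} \rangle^{(N)}_\Lambda, \tag{55} \]

where $k \in \mathbb{N}$ and $n_j = 0, 1, \dots, N-1$. We also introduce $S^{\delta,N}_{2k,0,\Lambda}(n_1, \dots, n_{2k})$ as given by (55) with $\langle \dots \rangle^{(N)}_{0,\Lambda}$ instead of $\langle \dots \rangle^{(N)}_\Lambda$. Let us consider a sequence $\{N_s \in \mathbb{N},\ s \in \mathbb{N}\}$ possessing the property $N_s < N_{s+1}$ and, therefore, tending to infinity when $s \to \infty$. By means of this sequence, we can define a sequence $\{n^{(s)}(\tau),\ s \in \mathbb{N} \mid 0 \le n^{(s)} \le N_s\}$ such that for given $\tau \in S_\beta$, $\lim_{s \to \infty} (n^{(s)}(\tau)/N_s) = \tau/\beta$.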



Lemma 3.1. For every $k \in \mathbb{N}$, $\delta \ge 0$, $\Lambda \in \mathcal{L}$, the sequence

\[ \left\{ S^{\delta,N_s}_{2k,\Lambda}\left( n^{(s)}_1(\tau_1), n^{(s)}_2(\tau_2), \dots, n^{(s)}_{2k}(\tau_{2k}) \right),\ s \in \mathbb{N} \right\} \]

converges to the temperature Green function $\Gamma^\delta_{2k,\Lambda}(\tau_1, \dots, \tau_{2k})$, given by (38), (31), when $s \to \infty$, for almost all values of $(\tau_1, \dots, \tau_{2k}) \in S_\beta^{2k}$. The same convergence of the sequence of $S^{\delta,N_s}_{2,0,\Lambda}(n^{(s)}_1(\tau_1), n^{(s)}_2(\tau_2))$ to $\Gamma^\delta_{0,\Lambda}(\tau_1, \tau_2)$ given by (14) holds true.

Proof. The standard arguments yield that the sequence of measures $\{\nu^{(N)}_{\beta,\Lambda},\ N \in \mathbb{N}\}$ converges to the measure $\nu_{\beta,\Lambda}$ given by (36) in the sense of convergence of integrals of cylinder functions. This yields the convergence of the corresponding moments, which yields in turn the convergence to be proven. Just the same arguments yield the convergence to the Green functions in the case of zero boundary conditions. □

Now we can fix $\Lambda \in \mathcal{L}$, the integer $N$, and simplify our notations by putting $\sum_l = \sum_{l \in \Lambda}$, $\sum_n = \sum_{n=0}^{N-1}$, and so on. Let us consider the measure (48) as a Gibbs measure (with corresponding boundary conditions) for a classical ferromagnetic spin model with the single-spin measure $\rho^{(N)}_\beta$ given by (47), which belongs to the BFS class of measures (see Remark 1.1). We set

\[ u_2(ln, l'n') = \langle \xi_{ln}\, \xi_{l'n'} \rangle^{(N)}_\Lambda; \qquad u^{(0)}_2(ln, l'n') = \langle \xi_{ln}\, \xi_{l'n'} \rangle^{(N)}_{0,\Lambda}. \tag{56} \]

\begin{align*}
u_4(l_1 n_1, l_2 n_2, l_3 n_3, l_4 n_4) = {} & \langle \xi_{l_1 n_1} \xi_{l_2 n_2} \xi_{l_3 n_3} \xi_{l_4 n_4} \rangle^{(N)}_\Lambda - u_2(l_1 n_1, l_2 n_2)\, u_2(l_3 n_3, l_4 n_4) \\
& - u_2(l_1 n_1, l_3 n_3)\, u_2(l_2 n_2, l_4 n_4) - u_2(l_1 n_1, l_4 n_4)\, u_2(l_2 n_2, l_3 n_3). \tag{57}
\end{align*}

Proposition 3.1. For our model, the following estimates hold:

\[ 0 < u^{(0)}_2(ln, l'n') \le u_2(ln, l'n'), \tag{58} \]

\[ u_4(l_1 n_1, l_2 n_2, l_3 n_3, l_4 n_4) < 0, \tag{59} \]

\[ \langle \xi_{l_1 n_1} \cdots \xi_{l_{2k} n_{2k}} \rangle^{(N)}_\Lambda \le \sum_{\pi \in \mathcal{P}(2k)} \prod_{i=1}^{k} u_2\left( l_{\pi(2i-1)} n_{\pi(2i-1)},\ l_{\pi(2i)} n_{\pi(2i)} \right). \tag{60} \]

Proof. Equations (58) (positivity) and (59) are known as Lebowitz inequalities, the estimate (60) is the Gaussian upper bound inequality, the relationship between $u^{(0)}_2$ and $u_2$ in (58) is the Griffiths inequality produced by (54). The validity of these inequalities follows from the properties of the measures (48), mentioned above (see Sect. 12 of [6]). □

Proof of Proposition 1.1 and Lemma 2.1. It follows immediately from (58), (60), (55) and Lemma 3.1.

The rest of this paper is devoted to the proof of Lemma 2.2. To achieve this aim, we will need an additional technique explained below. Let $\mathcal{F}$ be a set of functions $F$ defined on $\mathbb{R}^M$, $M \in \mathbb{N}$, that can be continued to $\mathbb{C}^M$ as entire functions, such that the norm

\[ \|F\|_a = \sup\{ |F(z)| \exp(-a\|z\|^2) : z \in \mathbb{C}^M \} \tag{61} \]

is finite for all $a > 0$. Here $\|z\|^2 = |z_1|^2 + \dots + |z_M|^2$. This set equipped with the pointwise linear operations and the topology defined by the family of norms $\{\|\cdot\|_a,\ a > 0\}$ becomes a Fréchet space that will also be denoted as $\mathcal{F}$. We will use the following property of this space [11]:

Proposition 3.2. The set of all polynomials is dense in $\mathcal{F}$.

With each symmetric bilinear form on $\mathbb{R}^M \times \mathbb{R}^M$, one can associate a symmetric linear operator $A$ such that this form can be written as $(y, Ax)$, where $(\cdot, \cdot)$ denotes the scalar product in $\mathbb{R}^M$. The matrix $(A_{jj'})$ consisting of the matrix elements of the operator $A$ in the canonical basis of $\mathbb{R}^M$ will also be denoted by $A$. We will use the notation $A > 0$ for positive operators of such type. Let $A$ be a symmetric linear positive operator. We set

\[ (T(A)F)(x) = \sum_{k=0}^{\infty} \frac{1}{k!} \sum_{j_1, \dots, j_k = 1}^{M}\ \sum_{j'_1, \dots, j'_k = 1}^{M} A_{j_1 j'_1} \cdots A_{j_k j'_k}\, \frac{\partial^{2k} F(x)}{\partial x_{j_1} \cdots \partial x_{j'_k}}. \tag{62} \]

Estimating the derivatives, one can prove that for all such $A$, $T(A)$ continuously maps $\mathcal{F}$ into itself. $T(A)$ can be expressed in the integral form:

\[ (T(A)F)(x) = \frac{1}{X(A)} \int_{\mathbb{R}^M} F(y) \exp\left( -\frac{1}{2} (y - x, A^{-1}(y - x)) \right) dy, \tag{63} \]

where

\[ X(A) = [\det(2\pi A)]^{\frac{1}{2}}. \tag{64} \]

The identity of (62) and (63) can be proven for monomials, and then for every $F \in \mathcal{F}$ by means of Proposition 3.2. This identity permits the operator $T(A)$ to be extended to a wider class of functions. Let $A$ be a symmetric linear positive operator, and $B$ be a symmetric linear operator on $\mathbb{R}^M$. The set of all such $B$ satisfying with this $A$ the condition $A^{-1} - B > 0$ will be denoted by $\mathcal{B}(A)$. For every $B \in \mathcal{B}(A)$, the operator $T(A)$ given by (63) can be applied to $F(x) = \exp(\frac{1}{2}(x, Bx))$, giving

\[ (T(A)F)(x) = \frac{1}{X(A)} \int_{\mathbb{R}^M} \exp\left( \frac{1}{2}(y, By) - \frac{1}{2}(y - x, A^{-1}(y - x)) \right) dy. \tag{65} \]

We put

\[ (y, By) - (y - x, A^{-1}(y - x)) = (x, Cx) - (y - Wx, D(y - Wx)), \tag{66} \]

and obtain

\[ C = B(1 - AB)^{-1}; \qquad D = A^{-1} - B; \qquad W = (1 - AB)^{-1}. \tag{67} \]

Then we insert the right-hand side of (66) into the exponent in (65). This yields:



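\begin{align*}
(T(A)F)(x) &= \frac{X((1 - AB)^{-1}A)}{X(A)} \exp\left( \frac{1}{2}(x, B(1 - AB)^{-1}x) \right) \\
&\quad \times \frac{1}{X((1 - AB)^{-1}A)} \int_{\mathbb{R}^M} \exp\left( -\frac{1}{2}(y - Wx, (A^{-1} - B)(y - Wx)) \right) dy \\
&= \frac{X((1 - AB)^{-1}A)}{X(A)} \exp\left( \frac{1}{2}(x, B(1 - AB)^{-1}x) \right). \tag{68}
\end{align*}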
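The completion of the square (66), (67) and the resulting formula (68) can be checked numerically in the simplest case. The sketch below is only an illustration under stated assumptions: it treats the one-dimensional case $M = 1$, where $A$ and $B$ are scalars `a` and `b`, and evaluates the integral (65) by a plain trapezoidal rule on a truncated interval. The function names and the truncation parameters are hypothetical.

```python
import math

def completed_square(a, b):
    """Scalar (M = 1) versions of C, D, W from (67)."""
    if a <= 0 or 1.0 / a - b <= 0:
        raise ValueError("need a > 0 and 1/a - b > 0")
    c = b / (1.0 - a * b)
    d = 1.0 / a - b
    w = 1.0 / (1.0 - a * b)
    return c, d, w

def gaussian_transform(a, b, x, half_width=20.0, steps=20000):
    """Trapezoidal estimate of (65) for F(y) = exp(b*y*y/2) with M = 1."""
    norm = math.sqrt(2.0 * math.pi * a)  # X(A) from (64)
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        y = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(0.5 * b * y * y - (y - x) ** 2 / (2.0 * a))
    return total * h / norm

def closed_form(a, b, x):
    """Right-hand side of (68) with M = 1: X ratio equals (1 - a*b)**(-1/2)."""
    c, _, _ = completed_square(a, b)
    return math.exp(0.5 * c * x * x) / math.sqrt(1.0 - a * b)
```

For instance, with `a = 0.5`, `b = 0.8`, `x = 1.0` the two functions `gaussian_transform` and `closed_form` should agree to high accuracy, and the two sides of (66) can be compared pointwise using the values returned by `completed_square`.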

Proceeding in this direction, we can extend the operator $T(A)$ to the following class of functions. For a symmetric linear positive operator $A$, we set

\[ \mathcal{F}(A) = \left\{ F(x) = \exp\left( \frac{1}{2}(x, Bx) \right) G(x) \ \Big|\ B \in \mathcal{B}(A),\ G \in \mathcal{F} \right\}. \tag{69} \]

Proposition 3.3. For a given symmetric linear positive operator $A$, the operator $T(A)$ defined on $\mathcal{F}$ by the expression (63) can be continuously extended to $\mathcal{F}(A)$ as follows:

\[ (T(A)F)(x) = \Xi(A, B) \exp\left( \frac{1}{2}(x, B(1 - AB)^{-1}x) \right) (T((1 - AB)^{-1}A)G)((1 - AB)^{-1}x), \tag{70} \]

with

\[ \Xi(A, B) = \frac{X((1 - AB)^{-1}A)}{X(A)}. \tag{71} \]

Proof. For $G \equiv 1$, (70) reduces to (68). Now let $G(x)$ be a polynomial. Then the integral on the right-hand side of (63) with $F(x) = \exp(\frac{1}{2}(x, Bx))\, G(x)$ and $B \in \mathcal{B}(A)$ converges. Applying the representation (66), we obtain

\begin{align*}
(T(A)F)(x) &= \Xi(A, B) \exp\left( \frac{1}{2}(x, B(1 - AB)^{-1}x) \right) \frac{1}{X((1 - AB)^{-1}A)} \int_{\mathbb{R}^M} G(y) \exp\left( -\frac{1}{2}(y - Wx, (A^{-1} - B)(y - Wx)) \right) dy \\
&= \Xi(A, B) \exp\left( \frac{1}{2}(x, B(1 - AB)^{-1}x) \right) (T((1 - AB)^{-1}A)G)(Wx),
\end{align*}

where the integral form (63) is used as a definition, and $W$ is given by (67). Thus (70) holds for $G$ being a polynomial. Now we apply Proposition 3.2 and obtain the stated extension. □

Let us consider $F \in \mathcal{F}(A)$; then, for every $t \in (0, 1]$, $F \in \mathcal{F}(tA)$. Having this in mind, we set

\[ F_t(x) = (T(tA)F)(x), \quad t \in (0, 1]; \qquad F_0(x) = F(x). \tag{72} \]

Proposition 3.4. For every $t \in [0, 1]$, $F_t(x)$ is differentiable with respect to $t$, and obeys the following equation:

\[ \frac{\partial F_t(x)}{\partial t} = \frac{1}{2} \sum_{j,j'=1}^{M} A_{jj'}\, \frac{\partial^2 F_t(x)}{\partial x_j\, \partial x_{j'}}; \qquad F_0(x) = F(x). \tag{73} \]


The proof follows easily from definition (62) of $T(A)$. Now let us return to the measures (46)–(49). We set

\[ \Phi^{(N)}_\Lambda(\eta) = \int_{\mathbb{R}^{N|\Lambda|}} \exp\left( \sum_{l,n} \eta_{ln}\, \xi_{ln} \right) d\nu^{(N)}_{\beta,\Lambda}(\xi), \qquad F^{(N)}_\Lambda(\eta) = \int_{\mathbb{R}^{N|\Lambda|}} \exp\left( \sum_{l,n} \eta_{ln}\, \xi_{ln} \right) \prod_{l} d\chi^{(N)}_\beta(\xi_l), \quad \eta \in \mathbb{R}^{N|\Lambda|}. \tag{74} \]

Taking into account that the one-particle potential $V$ satisfies the requirements (V1), (V2), we conclude that both $\Phi^{(N)}_\Lambda$ and $F^{(N)}_\Lambda$ belong to $\mathcal{F}$, where $M$ used in the definition of the latter is put equal to $N|\Lambda|$. Moreover, the expression (48) implies

\[ \Phi^{(N)}_\Lambda(\eta) = \frac{1}{Z^{(N)}_{\beta,\Lambda}}\, (T(I_\Lambda) F^{(N)}_\Lambda)(\eta), \tag{75} \]

where the operator $I_\Lambda : \mathbb{R}^{|\Lambda|N} \to \mathbb{R}^{|\Lambda|N}$ is defined by its matrix elements

\[ I_\Lambda(ln, l'n') = J^\Lambda_{ll'}\, \delta_{nn'} \tag{76} \]

and then is positive, symmetric, and periodic. The linear operator $C : \mathbb{R}^{|\Lambda|N} \to \mathbb{R}^{|\Lambda|N}$ is said to be periodic if for its matrix elements one has $C(ln, l'n') = C((l + \lambda)(n + \nu), (l' + \lambda)(n' + \nu))$ with arbitrary $\lambda \in \Lambda$ and $\nu \in \{0, 1, \dots, N-1\}$, where the additions are modulo $|\Lambda|$ and $N$ respectively. Since $\Lambda$ and $N$ are fixed, we simplify our notations by omitting these symbols in $\Phi$, $F$, $Z_\beta$ and $I$. Now let $C$ be a periodic linear symmetric operator belonging to $\mathcal{B}(I)$. Write

\[ F(\eta) = \exp\left( \frac{1}{2}(\eta, C\eta) \right) G(\eta). \tag{77} \]

For given $C$, this can be considered as a definition of $G(\eta)$. It is not difficult to check that for each positive symmetric $I$, $C \in \mathcal{B}(I)$ implies $-C \in \mathcal{B}((1 - IC)^{-1}I)$. This yields $G(\eta) \in \mathcal{F}((1 - IC)^{-1}I)$ (due to $F \in \mathcal{F}$). We set for $t \in [0, 1]$ (see (72)):

\[ \exp R(t, \eta) = (T(t(1 - IC)^{-1}I)G)(\eta). \tag{78} \]

Inserting $F(\eta)$ as given by (77) into (75) and applying the identity (70), one obtains

\[ \Phi(\eta) = \frac{1}{Z_\beta}\, \Xi(I, C) \exp\left\{ \frac{1}{2}(\eta, C(1 - IC)^{-1}\eta) + R(1, (1 - IC)^{-1}\eta) \right\}. \tag{79} \]

Now let us consider the expression (78) having in mind Proposition 3.4. This yields

\[ \dot R(t, \eta) := \frac{\partial R(t, \eta)}{\partial t} = \frac{1}{2} \sum_{ln, l'n'} [(1 - IC)^{-1}I](ln, l'n') \left[ \frac{\partial^2 R(t, \eta)}{\partial \eta_{ln}\, \partial \eta_{l'n'}} + \frac{\partial R(t, \eta)}{\partial \eta_{ln}}\, \frac{\partial R(t, \eta)}{\partial \eta_{l'n'}} \right]. \tag{80} \]

It is not difficult to show that for all $t \in [0, 1]$, $R(t, \eta)$ and $\dot R(t, \eta)$ are even and infinitely often differentiable functions with respect to $\eta$ at the point $\eta = 0$. We set

508


R2k (t | l1 n1 , l2 n2 , . . . , l2k n2k ) =

∂ 2k R(t, η) ∂ηl1 n1 . . . ∂ηl2k n2k

, η=0

∂ R˙ 2k (t | l1 n1 , . . . , l2k n2k ) = R2k (t | l1 n1 , . . . , l2k n2k ). ∂t Then we obtain from (80): 1 R˙ 2 (t | ln, l0 n0 ) = 2

X

[(1 − IC)−1 I](l1 n1 , l2 n2 )

l1 n1 ,l2 n2

{R4 (t | l1 n1 , l2 n2 , ln, l0 n0 ) +2R2 (t | l1 n1 , ln)R2 (t | l2 n2 , l0 n0 )}.

(81)

Taking into account the periodic boundary conditions with respect to $l \in \Lambda$, as well as to $n \in \{0, 1, \dots, N-1\}$, one deduces that

$$\sum_{ln,\,l'n'}R_2(t \mid ln,l'n') = |\Lambda|N\sum_{ln}R_2(t \mid ln,l'n'). \tag{82}$$

We set

$$R_2(t) = \frac{1}{|\Lambda|N}\sum_{ln,\,l'n'}R_2(t \mid ln,l'n'), \tag{83}$$

and obtain from (81), taking into account (82),

$$\dot R_2(t) = K R_2^2(t) + Q(t), \tag{84}$$

where

$$K = \frac{1}{|\Lambda|N}\sum_{ln,\,l'n'}[(1-IC)^{-1}I](ln,l'n'), \tag{85}$$

$$Q(t) = \frac{1}{2|\Lambda|N}\sum_{ln,\,l'n'}[(1-IC)^{-1}I](ln,l'n')\sum_{l_1n_1,\,l_2n_2}R_4(t \mid ln,l'n',l_1n_1,l_2n_2). \tag{86}$$
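Equation (84) is a scalar Riccati-type equation, and its sign behaviour can be explored numerically. The following is only an illustrative sketch: the constant values `K = 2.0` and `Q(t) = -0.5`, the zero initial condition, and the function name `integrate_riccati` are hypothetical choices made here, not quantities taken from the paper. For constant $K > 0$ and $Q = -q < 0$ with $R_2(0) = 0$, the solution is $R_2(t) = -\sqrt{q/K}\,\tanh(\sqrt{qK}\,t)$, which stays negative for $t > 0$.

```python
import numpy as np

def integrate_riccati(K, Q, t_end=1.0, steps=1000):
    """Classical RK4 integration of dR/dt = K*R**2 + Q(t) with R(0) = 0."""
    f = lambda t, R: K * R * R + Q(t)
    h = t_end / steps
    t, R = 0.0, 0.0
    path = [R]
    for _ in range(steps):
        k1 = f(t, R)
        k2 = f(t + h / 2, R + h * k1 / 2)
        k3 = f(t + h / 2, R + h * k2 / 2)
        k4 = f(t + h, R + h * k3)
        R += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
        path.append(R)
    return np.array(path)

# Hypothetical sample data: K > 0 and a constant negative source term Q = -q.
K, q = 2.0, 0.5
path = integrate_riccati(K, lambda t: -q)

# Closed-form solution at t = 1 for constant coefficients.
exact_at_1 = -np.sqrt(q / K) * np.tanh(np.sqrt(q * K))

print("R2(1) numeric:", path[-1])
print("R2(1) exact  :", exact_at_1)
print("negative on (0, 1]:", bool(np.all(path[1:] < 0)))
```

With a time-dependent negative `Q(t)` the closed form is no longer available, but the same integrator can be used to observe that the solution does not return to zero.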

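The initial condition for $R_2(t)$ follows from (78) and (77):

$$R_2(0) = \frac{1}{|\Lambda|N}\sum_{ln,\,l'n'}G_2(ln,l'n') = \frac{1}{|\Lambda|N}\Big\{\sum_{ln,\,l'n'}F_2(ln,l'n') - \sum_{ln,\,l'n'}C(ln,l'n')\Big\}, \tag{87}$$

where

$$G_2(ln,l'n') = \left.\frac{\partial^2\log G(\eta)}{\partial\eta_{ln}\,\partial\eta_{l'n'}}\right|_{\eta=0}, \tag{88}$$

$$F_2(ln,l'n') = \left.\frac{\partial^2\log F(\eta)}{\partial\eta_{ln}\,\partial\eta_{l'n'}}\right|_{\eta=0}. \tag{89}$$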

The operator $C$ has been restricted to be periodic and to belong to $B(I)$. Now we impose more essential restrictions as follows:

$$\sum_{ln,\,l'n'}C(ln,l'n') = \sum_{ln,\,l'n'}F_2(ln,l'n'), \tag{90}$$

$$C = gI^{-1}, \qquad g < 1, \tag{91}$$

where $I^{-1}$ is the inverse operator of $I$ (it exists due to the positivity of $J^\Lambda$), and $g$ is a positive quantity (depending on $\Lambda$ and $N$) to be determined from (90). The possibility of choosing it less than one will be shown to follow from (23).

Lemma 3.2. Let the operator $C$ satisfy (90), (91). Then $R_2(t)$ given by (83) is negative for all $t \in (0,1]$. In particular

$$R_2(1) < 0. \tag{92}$$

Proof. Taking into account (90), one gets $R_2(0) = 0$. The condition (91) yields in turn

$$[(1-IC)^{-1}I](ln,l'n') = \frac{1}{1-g}\,J^\Lambda_{ll'}\,\delta_{nn'} \ge 0, \tag{93}$$

due to (33) and (D2). Let us prove the assertion of this lemma assuming

$$R_4(t \mid l_1n_1, l_2n_2, ln, l'n') < 0 \tag{94}$$

for $t \in [0,1]$. Then $Q(t)$ given by (86) is negative for $t \in [0,1]$. Taking into account (93), we obtain that $K$ given by (85) is positive. Thus the function $R_2(t)$ possesses the following properties: it is equal to zero at $t = 0$, has a negative derivative at $t = 0$, and has a negative derivative at each $t$ such that $R_2(t) = 0$. The latter can be deduced from (84). Clearly, each continuous function of $t$ possessing these properties becomes zero at most once. In our case this occurs at the point $t = 0$. Thus, this function preserves its sign for all $t \in (0,1]$, which means it is negative (it decreases at $t = 0$) for these $t$, which was to be proved. Now let us prove (94). We return to (78) and insert in it $G$ as given by (77), that is

$$G(\eta) = \exp\Big(-\frac12(\eta, C\eta)\Big)F(\eta).$$

Taking into account (91), we obtain then

$$\exp R(t,\eta) = \left(\frac{1-g}{1-(1-t)g}\right)^{N|\Lambda|/2}\exp\left\{-\frac12\,\frac{g(1-g)}{1-(1-t)g}\,(\eta, I^{-1}\eta)\right\}\left(T\Big(\frac{t}{1-(1-t)g}\,I\Big)F\right)\left(\frac{1-g}{1-(1-t)g}\,\eta\right). \tag{95}$$

We set $h_t = (1-g)[1-(1-t)g]^{-1}$ and

$$\exp\Psi(t,\eta) = \Big(T\Big(\frac{t h_t}{1-g}\,I\Big)F\Big)(\eta). \tag{96}$$

Then (95) can be rewritten as

$$\exp R(t,\eta) = h_t^{N|\Lambda|/2}\exp\Big(-\frac12\,g h_t\,(\eta, I^{-1}\eta) + \Psi(t, h_t\eta)\Big),$$

hence

$$R_4(t \mid l_1n_1, l_2n_2, ln, l'n') = h_t^{4}\,\Psi_4(t \mid l_1n_1, l_2n_2, ln, l'n'), \tag{97}$$

where $\Psi_4$ is defined in the same way as $R_4$. Recalling definition (74) of $F(\eta) = F_\Lambda^{(N)}(\eta)$, we obtain from (96),

$$\exp\Psi(t,\eta) = \int_{\mathbb{R}^{N|\Lambda|}}\exp\Big(\sum_{ln}\eta_{ln}\xi_{ln} + \frac12\,t h_t\sum_{l,l',n}J^\Lambda_{ll'}\,\xi_{ln}\xi_{l'n} + \alpha_N\sum_{ln}\xi_{ln}\xi_{l\,n+1}\Big)\bigotimes_{ln}d\rho_\beta^{(N)}(\xi_{ln}), \tag{98}$$

where we have also taken into account (46). Therefore, $\Psi_4$ is the fourth semi-invariant (of the type (57)) of the ferromagnetic Gibbs measure ($J^\Lambda_{ll'} \ge 0$, $\alpha_N > 0$) with the initial measure $\rho_\beta^{(N)}$ belonging to the BFS class. Then $\Psi_4$ is negative (see Proposition 3.1). The latter implies (94), which completes the proof.

Now let us return to (79). We set

$$U_\Lambda^{(N)} = \frac{1}{N|\Lambda|}\sum_{ln,\,l'n'}\Phi_2(ln,l'n'), \tag{99}$$

where

$$\Phi_2(ln,l'n') = \left.\frac{\partial^2\log\Phi(\eta)}{\partial\eta_{ln}\,\partial\eta_{l'n'}}\right|_{\eta=0}. \tag{100}$$

Then (79) gives with $C = gI^{-1}$,

$$U_\Lambda^{(N)} = \frac{g}{(1-g)\|J^\Lambda\|} + \frac{1}{(1-g)^2}\,R_2(1), \tag{101}$$

where

$$\|J^\Lambda\| = \sum_{l\in\Lambda}J^\Lambda_{ll'} \tag{102}$$

is given by (35). To obtain (101), we have used (76). But Lemma 3.2 yields (92), thus U3(N )
(N 3 . N |3| 0 0

U3(N ) =

(107)

ln,l n

Then taking into account definition (55), one gets

$$U_\Lambda^{(N)} = \frac{1}{\beta}\sum_{n,n'}S^{0,N}_{2,\Lambda}(n,n')\Big(\frac{\beta}{N}\Big)^2.$$

The latter can be rewritten as follows:

$$U_\Lambda^{(N)} = \frac{1}{\beta}\sum_{n,n'}S^{0,N}_{2,\Lambda}(n,n')\,\Delta\tau\,\Delta\tau'. \tag{108}$$

The right-hand side of (108) tends to

$$\frac{1}{\beta}\iint_{S_\beta^2}\Gamma^0_{2,\Lambda}(\tau,\tau')\,d\tau\,d\tau' = \frac{1}{\beta}\iint_{S_\beta^2}\Gamma^0_\Lambda(\tau,\tau')\,d\tau\,d\tau' = U^0_\Lambda$$

as $N \to \infty$. Here we have taken into account (16) and Lemma 3.1. This means that the left-hand side of (103) tends to $U^0_\Lambda$ when $N \to \infty$. In order to obtain the limit of the right-hand side of (103), we use (91), (90), and obtain

$$g = \frac{\sum_{ln,\,l'n'}F_2(ln,l'n')}{\sum_{ln,\,l'n'}(I^{-1})(ln,l'n')},$$

and then by means of (106) and (45), we have

$$\lim_{N\to\infty}g = \frac{\|J^\Lambda\|}{\beta}\lim_{N\to\infty}\int_{\mathbb{R}^N}\Big(\sum_n\omega(\tau_n)\,\Delta\tau\Big)^2 d\chi_\beta^{(N)}(\xi) = \frac{\|J^\Lambda\|}{\beta}\int\Big(\int_{S_\beta}\omega(\tau)\,d\tau\Big)^2 d\chi_\beta(\omega(\cdot)) = \frac{\|J^\Lambda\|}{\beta}\iint_{S_\beta^2}\Gamma(\tau,\tau')\,d\tau\,d\tau' = \|J^\Lambda\|\,U < 2\|D\|\,U < 1,$$

where we have used (40) as well as the estimate $\|J^\Lambda\| < 2\|D\|$, which follows from (35) and (D2), (D4). Meanwhile, $\Gamma(\tau,\tau')$ means $\Gamma^{\delta}_{\Lambda}(\tau,\tau')$ with one-point $\Lambda$, and $U$ is the same as in Lemma 2.2. Then passing to the limit $N \to \infty$ in (103), we obtain

$$U^0_\Lambda \le \frac{U}{1 - U\|J^\Lambda\|} \le \frac{U}{1 - 2U\|D\|},$$

where we have taken into account definition (35), which yields

$$\lim_{\Lambda\nearrow\mathbb{L}}\|J^\Lambda\| = 2\|D\|.$$

Acknowledgement. One of the authors (Yuri Kozitsky) gratefully acknowledges the partial support of the International Science Foundation under the Grants UCN 000, UCN 200 as well as of Deutscher Akademischer Austauschdienst (Referat 325). The financial support of the DFG (Research Project AL 214 / 9- 2) is also gratefully acknowledged by the authors.



References

1. Albeverio, S., Høegh-Krohn, R.: Homogeneous Random Fields and Quantum Statistical Mechanics. J. Funct. Anal. 19, 242–279 (1975)
2. Barbulyak, V.S., Kondratiev, Yu.G.: A Criterion for the Existence of Periodic Gibbs States of Quantum Lattice Systems. Selecta Math. (formerly Sov.) 12, 25–35 (1993)
3. Berezansky, Yu.M., Kondratiev, Yu.G.: Spectral Methods in Infinite Dimensional Analysis. Dordrecht: Kluwer Academic Publishers, 1994
4. Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics 2. New York–Heidelberg–Berlin: Springer Verlag, 1979
5. Bruce, A.D., Cowley, R.A.: Structural Phase Transitions. London: Taylor and Francis Ltd, 1981
6. Fernandez, R., Fröhlich, J., Sokal, A.: Random Walks, Critical Phenomena and Triviality in Quantum Field Theory. Berlin–Heidelberg–New York–London–Paris–Tokyo–Hong Kong: Springer Verlag, 1992
7. Globa, S.A., Kondratiev, Yu.G.: The Construction of Gibbs States of Quantum Lattice Systems. Selecta Math. Sov. 9, 297–307 (1990)
8. Kato, T.: Perturbation Theory for Linear Operators. Berlin–Heidelberg–New York: Springer Verlag, 1966
9. Schneider, T., Beck, H., Stoll, E.: Quantum Effects in an n-component Vector Model for Structural Phase Transitions. Phys. Rev. B13, 1123–1130 (1976)
10. Simon, B.: The P(ϕ)_2 Euclidean (Quantum) Field Theory. Princeton, NJ: Princeton Univ. Press, 1974
11. Taylor, B.A.: Some Locally Convex Spaces of Entire Functions. In: Proceedings of Symposia of Pure Mathematics, Vol. XI, Providence, RI: AMS, 1968
12. Tibballs, J.E., Nelmes, R.J., McIntyre, G.J.: The Crystal Structure of Tetragonal KH2PO4 and KD2PO4 as a Function of Temperature and Pressure. J. Phys. C: Solid State Phys. 15, 37–58 (1982)
13. Verbeure, A., Zagrebnov, V.: Phase Transitions and Algebra of Fluctuation Operators in Exactly Soluble Model of a Quantum Anharmonic Crystal. J. Stat. Phys. 69, 329–359 (1992)
14. Verbeure, A., Zagrebnov, V.: No-Go Theorem for Quantum Structural Phase Transitions. J. Phys. A: Math. Gen. 28, 5415–5421 (1995)

Communicated by Ya. G. Sinai

Commun. Math. Phys. 194, 513–539 (1998)

Lump Dynamics in the $CP^1$ Model on the Torus

J. M. Speight

Department of Mathematics, University of Texas at Austin, Austin, Texas 78712, USA. E-mail: [email protected]

Received: 28 July 1997 / Accepted: 3 November 1997

Abstract: The topology and geometry of the moduli space, $M_2$, of degree 2 static solutions of the $CP^1$ model on a torus (spacetime $T^2 \times \mathbb{R}$) are studied. It is proved that $M_2$ is homeomorphic to the left coset space $G/G_0$, where $G$ is a certain eight-dimensional noncompact Lie group and $G_0$ is a discrete subgroup of order 4. Low energy two-lump dynamics is approximated by geodesic motion on $M_2$ with respect to a metric $g$ defined by the restriction to $M_2$ of the kinetic energy functional of the model. This lump dynamics decouples into a trivial "centre of mass" motion and nontrivial relative motion on a reduced moduli space. It is proved that $(M_2, g)$ is geodesically incomplete and has only finite diameter. A low dimensional geodesic submanifold is identified and a full description of its geodesics obtained.

1. Introduction The CP 1 model in (2 + 1) dimensions has long been popular in theoretical physics, both for its condensed matter applications, and as a simple nonlinear field theory possessing topological solitons, usually called lumps. The Euler-Lagrange equation of the system is not integrable, so there is no hope of solving the multilump initial value problem exactly. Numerical simulations of the model have revealed a rich diversity in the lump dynamics, which includes not only the now-familiar 90◦ scattering in head on collisions, but also lump expansion, collapse and singularity formation. It is an interesting and highly nontrivial problem to understand the mechanisms underlying this complicated dynamics. Such understanding has been afforded in similar field theories (those of Bogomol’nyi type) by the geodesic approximation of Manton [17, 1, 22]. Here the low-energy dynamics of n solitons is approximated by geodesic motion in the moduli space of static n-soliton solutions, Mn , the metric g being defined by the restriction to Mn of the kinetic energy functional of the field theory. So understanding n-soliton dynamics is


reduced to studying the topology and geometry of (Mn , g), a finite dimensional, smooth Riemannian manifold. Several authors have pursued this programme for the CP 1 model in R2+1 with standard boundary conditions [28, 16], concentrating on the case of two lumps. There is, however, a technical problem: the metric on M2 does not, strictly speaking, exist, that is, at every point p ∈ M2 some vectors in Tp M2 are assigned infinite length by the kinetic energy functional (they are “non-normalizable zero modes”). These divergences stem from the noncompactness of space R2 . They are essentially due to the existence in the general static solution of scale and orientation parameters which are frozen in the geodesic approximation because to alter them, no matter how slowly, costs infinite kinetic energy. This is only possible because the kinetic energy is an integral over a noncompact space. One can study geodesic motion orthogonal to the bad directions, or one can remove the problem entirely by studying the model on a compact space [24]. In this paper we impose square periodic spatial boundary conditions on the model, or, equivalently, place it on a flat torus. The aim is to establish results concerning the topology and geometry of (M2 , g), and to describe their implications for low-energy two lump dynamics on the torus, within the framework of the geodesic approximation. The work is arranged as follows. In Sect. 2 we introduce the CP 1 model on the torus, and review some relevant background material. In particular we use a standard argument of Belavin and Polyakov to show that M2 is the space of degree 2 elliptic functions. In Sect. 3 we equip M2 with a natural metric topology, and prove that it is homeomorphic to the left coset space G/G0 , where G is the Lie group P SL(2, C) × T 2 and G0 is a discrete subgroup of order 4. 
This allows one to give $M_2$ a natural differentiable structure (that of the smooth manifold $G/G_0$) and provides $M_2$ with a good global parametrization, using the covering space $G$. In Sect. 4 this parametrization is used to survey the degree 2 static solutions and describe their energy density distributions. It is found that exceptionally symmetric solutions exist with four, rather than two identical energy lumps, as well as the expected two-lump and annular solutions. In Sect. 5 the metric $g$ on $M_2$ is defined, and some of its properties discussed. We lift $g$ to obtain $\tilde g$, the metric on the covering space $G$, and show that $\tilde g$ is a product metric on $PSL(2,\mathbb{C}) \times T^2$. In this way, we show that lump dynamics in the geodesic approximation decomposes into a trivial "centre of mass" motion, the $T^2$ part, and a nontrivial relative motion, the $PSL(2,\mathbb{C})$ part. So attention may be restricted to geodesic motion on a reduced covering space, without loss of generality. In Sect. 6 it is proved that $(M_2, g)$ is geodesically incomplete by finding an explicit, maximally extended geodesic, and showing that it has only finite length. It follows that lumps can collapse to form singularities in finite time. In Sect. 7 a 2-dimensional totally geodesic submanifold is identified by computing the fixed point set of a discrete group of isometries. The geodesics of this submanifold and their associated lump motions are described. In Sect. 8 it is proved that $(M_2, g)$ has only finite diameter, despite its noncompactness. One should therefore visualize it as having only finite extent. In consequence, all static solutions are close to the end of moduli space, that is, close to collapse. In Sect. 9 some concluding remarks are presented. Two 3-dimensional totally geodesic submanifolds are identified, and it is shown that 90° head-on scattering must occur in the model under certain conditions. The present work is summarized, and extensions suggested.


2. The $CP^1$ Model on the Torus

The field, a map from spacetime to $CP^1$, $W : \mathbb{R} \times T^2 \to CP^1$, will throughout be considered complex valued, so that we are using an inhomogeneous coordinate on $CP^1$, or equivalently, a stereographic coordinate on $S^2$, exploiting the well known diffeomorphism between $CP^1$ and the two-sphere. The metric and volume form on the codomain in terms of such a coordinate are, respectively,

$$h = \frac{4\,du\,d\bar u}{(1+|u|^2)^2}, \qquad \omega = \frac{2i\,du\wedge d\bar u}{(1+|u|^2)^2}. \tag{1}$$

It is convenient to use a complex coordinate on physical space also, by identifying $T^2$ with $\mathbb{C}/\Omega$, where $\Omega$ is the period module, which we choose, for concreteness, to be

$$\Omega = \{n + im : n, m \in \mathbb{Z}\}. \tag{2}$$

So we impose square periodic boundary conditions of unit period on $W$. Position in $T^2$ is parametrized by position $z = x + iy$ in the covering space $\mathbb{C}$. The metric on spacetime is $\eta = dt^2 - dx^2 - dy^2$, and the action functional of the field theory is the standard harmonic map functional for mappings $(\mathbb{R} \times T^2, \eta) \to (CP^1, h)$, that is,

$$S[W] = \int_{\mathbb{R}\times T^2}\frac{\partial_\mu W\,\partial_\nu\bar W\,\eta^{\mu\nu}}{(1+|W|^2)^2}. \tag{3}$$

This may be written in a fashion reminiscent of Lagrangian mechanics, $S = \int_{\mathbb{R}} dt\,(T - V)$, upon definition of the kinetic and potential energy functionals,

$$T = \int_{T^2}\frac{|\dot W|^2}{(1+|W|^2)^2}, \tag{4}$$

$$V = \int_{T^2}\frac{1}{(1+|W|^2)^2}\left(\left|\frac{\partial W}{\partial x}\right|^2 + \left|\frac{\partial W}{\partial y}\right|^2\right). \tag{5}$$

The configuration space $Q$ is $C^1(T^2, S^2)$, the space of continuously differentiable maps $T^2 \to S^2$ (note that $V[W]$ is finite for all $W \in Q$ by compactness of $T^2$). By Hopf's Degree Theorem [11], $Q$ decomposes into disjoint homotopy classes labelled by topological degree $n$, an integer,

$$Q = \coprod_{n\in\mathbb{Z}} Q_n. \tag{6}$$

Physically, $n$ is interpreted as the "lump number" of the configuration, the excess of lumps over antilumps. Static solutions are extremals of $V$, that is harmonic maps $T^2 \to S^2$. The space of minimal energy static solutions in $Q_n$ is called the degree $n$ moduli space, denoted $M_n$. A well-known argument due to Belavin and Polyakov [4] shows that $M_n$ ($n$ assumed nonnegative) is in fact the space of degree $n$ elliptic functions, that is, holomorphic maps $T^2 \to S^2$:

$$0 \le \int_{T^2}\frac{|\partial_x W + i\partial_y W|^2}{(1+|W|^2)^2} = V[W] - \frac12\int_{T^2}W^*\omega = V[W] - \frac12\,\mathrm{Vol}(S^2)\,n = V[W] - 2\pi n, \tag{7}$$

where $W^*\omega$ is the pullback of the volume form on $S^2$ by $W$. It follows that

$$V|_{Q_n} \ge 2\pi n \tag{8}$$

with equality if and only if $(\partial_x + i\partial_y)W = 0$, which is the Cauchy-Riemann equation for $W$. So if there exist degree $n$ elliptic functions, then $M_n$ is the space of such functions, since any other function has higher energy. If there are no such functions, then $M_n$ is empty, for the energy bound (8) is optimal. To see this, consider the following family of functions. For $\epsilon > 0$ small, define $W_\epsilon \in Q_n$ so that

$$W_\epsilon(z) = \begin{cases} \epsilon^{2n}/z^n & |z| < \epsilon \\ 0 & |z| > 2\epsilon \end{cases} \tag{9}$$

interpolating between these two regions with a smooth cutoff function. This consists of a flat-space degree $n$ lump of width $\epsilon^2$ cut off on a disc of radius $\epsilon$. Since $W_\epsilon$ is not exactly holomorphic, $V[W_\epsilon] > 2\pi n$, but the excess can be made arbitrarily small by choosing $\epsilon$ small enough. It is easily proved that there are no unit degree elliptic functions [13], so we conclude that $M_1 = \emptyset$, and the simplest nontrivial moduli space is $M_2$.

3. The Degree Two Moduli Space

Weierstrass explicitly constructed a degree 2 elliptic function $\wp$, and it is on this that we base our parametrization of $M_2$. The partial fraction representation of $\wp$ is

$$\wp(z) = \frac{1}{z^2} + \sum_{\nu\in\Omega\setminus\{0\}}\left[\frac{1}{(z-\nu)^2} - \frac{1}{\nu^2}\right]. \tag{10}$$

Several properties of $\wp$ will be needed, some of which follow easily from Eq. (10), others of which are less straightforward. A comprehensive treatment can be found in [15]. Specifically:

$$\wp(iz) = -\wp(z), \qquad \wp(-z) = \wp(z), \qquad \wp(\bar z) = \overline{\wp(z)}, \qquad \wp'(z)^2 = 4\wp(z)\big(\wp(z)^2 - e_1^2\big), \tag{11}$$

where $e_1 = \wp(\tfrac12)$ is a real number, approximately 6.875. It follows that $\wp$ is real on the boundary and central cross of the unit square, and purely imaginary on the diagonals of the unit square (see Fig. 1) and that $\wp$ has a double pole at 0 and a double zero at $(1+i)/2$.
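The quoted properties of ℘ can be checked numerically from the partial fraction representation (10). The sketch below is an illustration under stated assumptions: the function name `weierstrass_p`, the truncation of the lattice sum to the symmetric square |n|, |m| ≤ `n_max`, and the sample point `z` are choices made here, not part of the paper. The truncated sum is only an approximation (the neglected tail is of order 1/`n_max`²), so the printed values are close to, but not exactly, the analytic ones.

```python
import numpy as np

def weierstrass_p(z, n_max=60):
    """Approximate the Weierstrass function of the square lattice {n + i m}
    by truncating the series (10) to the symmetric square |n|, |m| <= n_max."""
    idx = np.arange(-n_max, n_max + 1)
    n, m = np.meshgrid(idx, idx)
    nu = (n + 1j * m).ravel()
    nu = nu[nu != 0]                      # exclude the origin of the lattice
    return 1.0 / z**2 + np.sum(1.0 / (z - nu)**2 - 1.0 / nu**2)

e1 = weierstrass_p(0.5)                   # value at the half period 1/2
z = 0.31 + 0.17j                          # arbitrary sample point
odd_check = weierstrass_p(1j * z) + weierstrass_p(z)   # should vanish
even_check = weierstrass_p(-z) - weierstrass_p(z)      # should vanish
zero_check = weierstrass_p(0.5 + 0.5j)                 # near the double zero

print("e1 ~", e1.real)
print("|P(iz) + P(z)| =", abs(odd_check))
print("|P(-z) - P(z)| =", abs(even_check))
print("|P((1+i)/2)|   =", abs(zero_check))
```

The square truncation is chosen because it is mapped to itself by ν → iν and ν → −ν, so the first two symmetry relations of (11) hold for the truncated sum up to rounding error.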


Fig. 1. The fundamental domain of the Weierstrass ℘ function: ℘ is real on the solid lines and imaginary on the dashed lines. The four double valency points are marked by circles

since these preserve holomorphicity and degree. In terms of a stereographic coordinate $W$ on $S^2$, Möbius transformations are unit degree rational maps [23]

$$W \mapsto \frac{a_{11}W + a_{12}}{a_{21}W + a_{22}}, \tag{12}$$

where $a_{ij} \in \mathbb{C}$ and $a_{11}a_{22} \ne a_{12}a_{21}$, else the degree degenerates to zero. One may collect the parameters $a_{ij}$ into a matrix $L \in GL(2,\mathbb{C})$ and denote the action of the matrix $L$ on $S^2$ defined in Eq. (12) by $W \mapsto L \odot W$. (The constraint $a_{11}a_{22} \ne a_{12}a_{21}$ is now $\det L \ne 0$, ensuring that $L$ is invertible, and hence in $GL(2,\mathbb{C})$.) Composition of Möbius transformations coincides with matrix multiplication,

$$L_2 \odot (L_1 \odot W) \equiv (L_2L_1) \odot W. \tag{13}$$

Note, however, that this Möbius representation of $GL(2,\mathbb{C})$ is not faithful since any pair of matrices $L, L' \in GL(2,\mathbb{C})$ such that $L = \lambda L'$ for some $\lambda \in \mathbb{C}$ generate the same Möbius transformation. Denoting this scale equivalence $\sim$ we identify the Möbius group with $GL(2,\mathbb{C})/\!\sim$, each equivalence class of which may be represented by a unimodular matrix ($\det L = 1$). If $L$ is unimodular then so is $-L$, so $SL(2,\mathbb{C})$ is a double cover of $GL(2,\mathbb{C})/\!\sim$, and the Möbius group is identified with $SL(2,\mathbb{C})/\mathbb{Z}_2$, usually denoted $PSL(2,\mathbb{C})$, which is easily seen to be six dimensional (the P stands for "projective"). For the sake of brevity, let $G$ denote the eight dimensional Lie group $PSL(2,\mathbb{C}) \times T^2$ with the group product $(L_1, s_1)\cdot(L_2, s_2) = (L_1L_2, s_1 + s_2)$. We can define a $G$-action on $M_2$, $G \times M_2 \to M_2$ such that $(g, W) \mapsto W_g$, where

$$W_{(L,s)}(z) = L \odot W(z - s). \tag{14}$$
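The composition rule (13) and the scale equivalence that leads to $PSL(2,\mathbb{C})$ are easy to confirm numerically. The following sketch is an illustration only: the helper `mobius`, the sample matrices `L1`, `L2` and the sample point `w` are hypothetical choices made here, and only finite points of the sphere are handled.

```python
import numpy as np

def mobius(L, w):
    """Action (12) of a 2x2 complex matrix L on a finite point w of the sphere."""
    (a11, a12), (a21, a22) = L
    return (a11 * w + a12) / (a21 * w + a22)

# Hypothetical sample data: two invertible matrices and a point w.
L1 = np.array([[1 + 2j, 0.5], [0.3j, 2.0]])
L2 = np.array([[0.7, -1j], [1.0, 1 - 1j]])
w = 0.4 - 0.9j

composed = mobius(L2, mobius(L1, w))      # L2 acting after L1
product = mobius(L2 @ L1, w)              # action of the matrix product
scaled = mobius((2 - 3j) * L1, w)         # rescaling L1 leaves the map unchanged

# Unimodular representative of the class of L1; -unimodular is the other one.
unimodular = L1 / np.sqrt(np.linalg.det(L1))

print("composition rule (13) holds:", np.isclose(composed, product))
print("scale invariance holds     :", np.isclose(scaled, mobius(L1, w)))
print("det of representative      :", np.linalg.det(unimodular))
```

The last two lines illustrate why each Möbius transformation has exactly two unimodular representatives, $L$ and $-L$.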

We claim that this action is transitive, since the $G$-orbit of $\wp$ exhausts $M_2$.

Lemma 1. For each $W \in M_2$ there exists $(L, s) \in G$ such that $W(z) = L \odot \wp(z - s)$.

Proof. This may be established in several ways [14, 8, 2]. One economical, instructive (and apparently novel) argument appeals to the Riemann-Hurwitz formula, which constrains the number and valency of multivalent points of a holomorphic mapping between compact Riemann surfaces given their genera and the degree of the map [3]. In the case of a degree 2 holomorphic map from $T^2$ to $S^2$ the formula states that any such function must have exactly 4 distinct double valency points (for $\wp$ these are $0$, $\tfrac12$, $\tfrac{i}{2}$ and $(1+i)/2$). Let $W \in M_2$ and $s \in T^2$ be one of its double valency points which is not a double pole. Then $(W(z+s) - W(s))^{-1}$ is another elliptic function with a double pole at 0, and no poles elsewhere (in the fundamental period square). Its Laurent expansion about 0 is

$$\frac{1}{W(z+s) - W(s)} = \frac{a_1}{z^2} + \frac{a_2}{z} + a_3 + \cdots, \tag{15}$$

where $a_1 \ne 0$. Consider $f(z) = [W(z+s) - W(s)]^{-1} - a_1\wp(z)$. This is an elliptic function with at most a simple pole at 0, and no poles elsewhere. Hence it has degree 1 or degree 0. But there are no degree 1 elliptic functions, and all degree 0 elliptic functions are constant, so $f(z) = c$. Defining

$$L' = \begin{pmatrix} a_1W(s) & cW(s) + 1 \\ a_1 & c \end{pmatrix} \tag{16}$$

and $L = (\det L')^{-1/2}L'$, it follows that $W(z) = L \odot \wp(z - s)$.

It is clear that for each $W \in M_2$ the associated $g \in G$ is not unique, since any one of the four distinct double valency points can be chosen as the basis of the construction of $(L, s)$ outlined above. Conversely, given a choice of $s \in T^2$, a double valency point of $W$, the construction of $L \in PSL(2,\mathbb{C})$ is unique, so for each $W \in M_2$ there are exactly four different $g \in G$ such that $W = \wp_g$. In particular, we can construct three alternative formulae for $\wp(z)$ based on the three double valency points $s_0 = (1+i)/2$, $s_1 = 1/2$ and $s_2 = i/2$ (the trivial formula $\wp(z)$ results from choosing $s = 0$, the fourth double valency point):

$$\wp(z) \equiv \frac{-e_1^2}{\wp(z - s_0)} \tag{17}$$

$$\equiv \frac{e_1[\wp(z - s_1) + e_1]}{\wp(z - s_1) - e_1} \tag{18}$$

$$\equiv \frac{-e_1[\wp(z - s_2) - e_1]}{\wp(z - s_2) + e_1}. \tag{19}$$

These are found by computing the Laurent expansions of $\wp$ about $s_i$ using (11), the formula for $\wp'$. It is convenient to treat $M_2$ as the $G$-orbit of $\wp/e_1$, rather than $\wp$. The identities (17), (18), (19) can be rewritten

$$\frac{\wp(z)}{e_1} \equiv U_i \odot \frac{\wp(z - s_i)}{e_1}, \qquad i = 0, 1, 2, \tag{20}$$

where $U_i$ are the following $SU(2)$ matrices:

$$U_0 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad U_1 = \frac{i}{\sqrt2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad U_2 = \frac{i}{\sqrt2}\begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix}. \tag{21}$$

So the stabilizer of $\wp/e_1$ under the $G$-action is

$$G_0 = \{(I, 0), (U_0, s_0), (U_1, s_1), (U_2, s_2)\}, \tag{22}$$

a discrete subgroup of $G$ isomorphic to the Viergruppe $V_4$, that is, abelian, with each element its own inverse (when checking this recall that $SL(2,\mathbb{C})$ matrices which differ only in sign are identified). The $SU(2)$ subgroup of $SL(2,\mathbb{C})$ acting on $S^2$ via $\odot$ is a double cover of $SO(3)$ acting on $S^2$ via the natural rotation action. So $G_0$ is a discrete group of simultaneous rotations of the target space $S^2$ and translations of the domain $T^2$. In fact a straightforward calculation shows that the $U_i$ are rotations of $S^2$ by $\pi$ about three orthogonal axes. For a general $W \in M_2$, then, if $W = (\wp/e_1)_g$, then $W = (\wp/e_1)_h$ if and only if $h$ is an element of the left coset $gG_0$, which we will henceforth denote $[g]$. So the mapping $\phi : G/G_0 \to M_2$, $\phi : [g] \mapsto \phi_{[g]}$, where

$$\phi_{[(L,s)]}(z) = L \odot \frac{\wp(z - s)}{e_1} \tag{23}$$

is well defined and bijective. It would seem natural, therefore, to identify $M_2$ with $G/G_0$ via $\phi$, but this only makes sense provided $\phi$ is a homeomorphism. Before proving that this is indeed the case, there are a few necessary preliminaries. Let $p : G \to G/G_0$ be the projection map $p(g) = [g]$. Since $G_0$ is a discrete subgroup of $G$, it acts freely and properly discontinuously on $G$, so the quotient space $G/G_0$ is, like $G$ itself, a Hausdorff, smooth manifold [27]. The pair $(G, p)$ is a covering space of $G/G_0$, and $p$ is a local homeomorphism. It is useful to define $\tilde\phi : G \to M_2$ such that $\tilde\phi = \phi \circ p$, that is, $\tilde\phi : g \mapsto \tilde\phi_g = (\wp/e_1)_g$. The Lie group $SL(2,\mathbb{C})$ is noncompact, and is, in fact, homeomorphic to $\mathbb{R}^3 \times SU(2)$, as may be shown [19] by decomposing any $SL(2,\mathbb{C})$ matrix $L$ into the product $HU$, where $U \in SU(2)$ and $H$ is a positive definite, hermitian, unimodular matrix, this pair being unique. The space of $H$-matrices is homeomorphic to $\mathbb{R}^3$ and may be parametrized so that for all $\boldsymbol\lambda \in \mathbb{R}^3$,

$$H(\boldsymbol\lambda) = \sqrt{1 + |\boldsymbol\lambda|^2}\,I + \boldsymbol\lambda\cdot\boldsymbol\tau, \tag{24}$$

where $\boldsymbol\tau = (\tau_1, \tau_2, \tau_3)$ are the Pauli spin matrices. It follows that $PSL(2,\mathbb{C}) \cong \mathbb{R}^3 \times (SU(2)/\mathbb{Z}_2) \cong \mathbb{R}^3 \times SO(3)$, and so $G \cong \mathbb{R}^3 \times SO(3) \times T^2$.
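The algebraic claims about $G_0$ and about the matrices $H(\boldsymbol\lambda)$ can be verified directly. The sketch below is an illustration under stated assumptions: the helper names (`same_in_psl2`, `mod_lattice`, `in_G0`, `H`) and the sample vector passed to `H` are introduced here and are not from the paper. Matrices are compared up to overall sign, as required in $PSL(2,\mathbb{C})$, and translations are reduced modulo the unit square lattice.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
U0 = np.array([[0, 1], [-1, 0]], dtype=complex)
U1 = 1j / np.sqrt(2) * np.array([[1, 1], [1, -1]], dtype=complex)
U2 = 1j / np.sqrt(2) * np.array([[-1, 1], [1, 1]], dtype=complex)
G0 = [(I2, 0j), (U0, 0.5 + 0.5j), (U1, 0.5 + 0j), (U2, 0.5j)]

def same_in_psl2(A, B):
    """Equality in PSL(2,C): matrices are identified up to an overall sign."""
    return np.allclose(A, B) or np.allclose(A, -B)

def mod_lattice(s):
    """Reduce a translation modulo the period lattice {n + i m}."""
    return complex(s.real % 1.0, s.imag % 1.0)

def in_G0(L, s):
    return any(same_in_psl2(L, U) and np.isclose(mod_lattice(s), t) for U, t in G0)

def H(lam):
    """Hermitian matrix (24) for a real 3-vector lam."""
    lam = np.asarray(lam, dtype=float)
    tau = [np.array([[0, 1], [1, 0]], dtype=complex),
           np.array([[0, -1j], [1j, 0]], dtype=complex),
           np.array([[1, 0], [0, -1]], dtype=complex)]
    return np.sqrt(1 + lam @ lam) * I2 + sum(l * t for l, t in zip(lam, tau))

all_su2 = all(np.allclose(U.conj().T @ U, I2) and np.isclose(np.linalg.det(U), 1)
              for U, _ in G0)
closed = all(in_G0(A @ B, s + t) for A, s in G0 for B, t in G0)
abelian = all(same_in_psl2(A @ B, B @ A) for A, _ in G0 for B, _ in G0)
involutions = all(same_in_psl2(A @ A, I2) for A, _ in G0)

Hs = H([0.3, -1.2, 2.0])                  # hypothetical sample vector
print(all_su2, closed, abelian, involutions)
print("det H =", np.linalg.det(Hs).real, " eigenvalues:", np.linalg.eigvalsh(Hs))
```

The four Boolean flags correspond to the statements that the $U_i$ lie in $SU(2)$ and that $G_0$ is closed under the group product, abelian, and consists of elements that are their own inverses.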
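To prove that $\phi$ is a homeomorphism we will need to understand the behaviour of $\tilde\phi_g : T^2 \to S^2$ as $g$ approaches the end of $G$, i.e. as $\lambda = |\boldsymbol\lambda| \to \infty$. For this purpose, consider the one parameter family $\{\phi_{\lambda,\hat\lambda} = \tilde\phi_{(L,0)} \in M_2 : L = H(\lambda\hat\lambda),\ \lambda \in (0,\infty)\}$ for some fixed $\hat\lambda = \boldsymbol\lambda/\lambda \in S^2$. Explicitly,

$$\phi_{\lambda,\hat\lambda}(z) = H(\boldsymbol\lambda) \odot \frac{\wp(z)}{e_1}. \tag{25}$$

The action of $H(\boldsymbol\lambda)$ on $S^2$ ($H(\boldsymbol\lambda) : W \mapsto H(\boldsymbol\lambda) \odot W$) has exactly two fixed points, $\hat\lambda$ and $-\hat\lambda$, and as $\lambda \to \infty$ all but a vanishing neighbourhood of $-\hat\lambda$ is mapped by $H(\boldsymbol\lambda)$ to within a vanishing neighbourhood of $\hat\lambda$ [24]. So the limiting function $\phi_{\infty,\hat\lambda} : T^2 \to S^2$ has the general form

$$\phi_{\infty,\hat\lambda}(z) = \lim_{\lambda\to\infty}\phi_{\lambda,\hat\lambda}(z) = \begin{cases} \hat\lambda & z \notin (\wp/e_1)^{-1}(-\hat\lambda) \\ -\hat\lambda & z \in (\wp/e_1)^{-1}(-\hat\lambda). \end{cases} \tag{26}$$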

That is, for generic $\hat\lambda$, all but two points of $T^2$, the preimages of $-\hat\lambda$ under $\wp/e_1$, are mapped by $\phi_{\infty,\hat\lambda}$ to $\hat\lambda$, while these two points are mapped to $-\hat\lambda$. In the four special cases $\hat\lambda \in \{(0,0,\pm1), (\pm1,0,0)\}$, the preimages of $-\hat\lambda$ coincide (double valency points) so all but one point in $T^2$ is mapped to $\hat\lambda$. The point to note is that in all cases $\phi_{\lambda,\hat\lambda}$ collapses to a discontinuous limit.

The statement that $\phi : G/G_0 \to M_2$ is a homeomorphism is, of course, meaningless until we equip $M_2$ with a topology (the domain inherits its topology from $G$, which we take to have the natural product topology on $PSL(2,\mathbb{C}) \times T^2$). There are many sensible choices for the topology on $T^2$. One simple and directly physical choice is to endow $Q_2$ with the metric topology where distance between configurations is measured by their maximum pointwise deviation in the codomain $S^2$, so that $M_2 \subset Q_2$ inherits the relative topology. That is, let $d : S^2 \times S^2 \to \mathbb{R}$ be the usual distance function on $S^2$, and define $D : Q_2 \times Q_2 \to \mathbb{R}$ such that, for all $W_1, W_2 \in Q_2$,

$$D(W_1, W_2) = \sup_{z\in T^2} d(W_1(z), W_2(z)). \tag{27}$$

It is straightforward to verify that $D$ satisfies the axioms of a distance function. The resulting metric topology on $Q_2$ is Hausdorff, as is any metric topology [5]. Rather than break up the smooth manifold $G$ into coordinate charts, it is convenient to equip $G$ with a metric topology also, as follows: let $h$ be the (Riemannian) product metric

$$h = (d\boldsymbol\lambda\cdot d\boldsymbol\lambda) \oplus h_{SO(3)} \oplus ds\,d\bar s \tag{28}$$

on $G \cong \mathbb{R}^3 \times SO(3) \times T^2$, where $h_{SO(3)}$ is the biinvariant metric on $SO(3)$ of unit volume. The Riemannian manifold $(G, h)$ has a natural distance function $\tilde d$, where $\tilde d(g_1, g_2)$ is the infimum of lengths (with respect to $h$) of piecewise $C^1$ paths connecting $g_1$ and $g_2$. That $\tilde d$ is a distance function, and that the associated metric topology coincides with the original topology on $G$ (independent of the choice of $h$) are standard theorems of Riemannian geometry [9]. We may now state and prove Theorem 1. Throughout, $B_\epsilon(x)$ denotes the open ball of radius $\epsilon$ centred on $x$, where the space containing $x$ ($S^2$, $M_2$ or $G$), and hence the appropriate distance function ($d$, $D$ or $\tilde d$), should be clear from context.

Theorem 1. The bijection $\phi : G/G_0 \to M_2$ is a homeomorphism.

Proof. We must prove that both $\phi$ and $\phi^{-1}$ are continuous. To prove the former, it suffices to show that $\tilde\phi = \phi \circ p$ is continuous, since the projection $p$ is a local homeomorphism. Fix $g_0 \in G$ and $\epsilon > 0$. Then we must show that $\exists\,\delta > 0$ such that $\forall g \in B_\delta(g_0)$, $\tilde\phi_g \in B_\epsilon(\tilde\phi_{g_0})$. Let $\phi_* : G \times T^2 \to S^2$ such that $\phi_*(g, z) = \tilde\phi_g(z)$. Note that $\phi_*$ is manifestly continuous. Hence, for each $\tilde z \in T^2$ there exists $\delta(\tilde z) > 0$ such that

$$(g, z) \in B_{\delta(\tilde z)}(g_0) \times B_{\delta(\tilde z)}(\tilde z) \ \Rightarrow\ d(\phi_*(g, z), \phi_*(g_0, \tilde z)) < \frac{\epsilon}{3}. \tag{29}$$

The collection of open balls $\{B_{\delta(z)}(z) \subset T^2 : z \in T^2\}$ is an open cover of $T^2$. Since $T^2$ is compact, there exists a finite subcover $\{B_{\delta(z_j)}(z_j) : j = 1, 2, \dots, N\}$. Define $\delta = \inf\{\delta(z_j) : j = 1, 2, \dots, N\} > 0$.


Now, let $g \in B_\delta(g_0)$ and consider $D(\tilde\phi_g, \tilde\phi_{g_0})$. For each $z \in T^2$ there exists $j \in \{1, 2, \dots, N\}$ such that $z \in B_{\delta(z_j)}(z_j)$. Further, $g, g_0 \in B_\delta(g_0) \subset B_{\delta(z_j)}(g_0)$ by definition of $\delta$, so $(g, z), (g_0, z) \in B_{\delta(z_j)}(g_0) \times B_{\delta(z_j)}(z_j)$. Hence, using (29) and the triangle inequality,

$$d(\phi_*(g, z), \phi_*(g_0, z)) \le d(\phi_*(g, z), \phi_*(g_0, z_j)) + d(\phi_*(g_0, z), \phi_*(g_0, z_j))$$
0 such that $p(\tilde U) \subset U$, where

$$\tilde U := \bigcup_{i=0}^{3} B_\epsilon(g_i) \subset G. \tag{32}$$

We will show that there exists $\delta > 0$ such that $\tilde\phi^{-1}(B_\delta(W_0)) \subset \tilde U$, where $W_0 = \phi_{[g_0]} \in M_2$. It follows that $\phi^{-1}(B_\delta(W_0)) \subset U$, and hence that $\phi^{-1}$ is continuous. For each $n \in \mathbb{N}$, define the compact set

$$A_n = \bar B_n(I, 0) \setminus \tilde U \subset G, \tag{33}$$

where $\bar B_n(I, 0)$ is the closed ball of radius $n$ centred on $(I, 0) \in G$. Since $\tilde\phi$ is continuous, $\tilde\phi(A_n) \subset M_2$ is also compact, and therefore closed ($M_2$ is Hausdorff). Hence the complement $M_2 \setminus \tilde\phi(A_n)$ is open, and it contains $W_0$ by construction (since $A_n \cap \tilde U = \emptyset$), so there exists $\delta_n > 0$ such that $B_{\delta_n}(W_0) \subset M_2 \setminus \tilde\phi(A_n)$. In this way, construct a positive sequence $(\delta_n)_{n=1}^\infty$, which, without loss of generality, we may assume is decreasing and converges to 0. Consider the preimage of $B_{\delta_n}(W_0)$ under $\tilde\phi$. By construction, $\tilde\phi^{-1}(B_{\delta_n}(W_0)) \cap A_n = \emptyset$, so every point in the preimage lies either in $\tilde U$, or at a distance greater than $n$ from $(I, 0) \in G$. We claim that there exists $N \in \mathbb{N}$ such that $\tilde\phi^{-1}(B_{\delta_N}(W_0)) \subset \tilde U$. Choosing $\delta = \delta_N$, the proof is then complete. Assume this claim is false. Then $\forall n \in \mathbb{N}$ there exists $g_n \notin \bar B_n(I, 0)$ such that $\tilde\phi_{g_n} \in B_{\delta_n}(W_0)$. For each $n$, choose such a $g_n$ and consider the sequence $(g_n)_{n=1}^\infty$. Since $(\delta_n)_{n=1}^\infty \to 0$, the image of the sequence under $\tilde\phi$, $(\tilde\phi_{g_n})_{n=1}^\infty$, converges to $W_0$ in $M_2$. Define two projection maps on $G \cong \mathbb{R}^3 \times SO(3) \times T^2$:

$$\pi_1 : G \to [0, \infty) \ \text{ such that } \ \pi_1(\boldsymbol\lambda, U, s) = \lambda = |\boldsymbol\lambda|, \qquad \pi_2 : G \to S^2 \times SO(3) \times T^2 \ \text{ such that } \ \pi_2(\boldsymbol\lambda, U, s) = (\hat\lambda, U, s). \tag{34}$$

The singularity of $\pi_2$ when $\lambda = 0$ is irrelevant here. By construction, $(\lambda_n)_{n=1}^\infty = (\pi_1(g_n))_{n=1}^\infty$ is unbounded, and without loss of generality, we may choose $g_n$ such that $\lambda_n$ is increasing. Since $(\pi_2(g_n))_{n=1}^\infty$ takes values in a compact space, it has, by the Bolzano-Weierstrass theorem, a convergent subsequence $(\pi_2(g_{n_r}))_{r=1}^\infty$. By translation and rotation symmetry of $T^2$ and $S^2$ respectively, we may assume without loss of generality that its limit is $(\hat\lambda, I, 0)$. Consider the image under $\tilde\phi$ of the associated subsequence $(g_{n_r})_{r=1}^\infty$, which approaches the end of $\mathbb{R}^3 \times SO(3) \times T^2$ asymptotic to the line $\{(t\hat\lambda, I, 0) : t \in (0, \infty)\}$. The function $\tilde\phi_{g_{n_r}}(z)$ converges pointwise, as $r \to \infty$, to $\phi_{\infty,\hat\lambda}(z)$, the limiting function previously described (to check this, use continuity of the $SO(3)$ and $T^2$ actions on $S^2$ and $T^2$ respectively, and of the function $\wp/e_1$). But $\phi_{\infty,\hat\lambda}$, being discontinuous, cannot be in $M_2$, and hence cannot be $W_0$, a contradiction.

4. Degree 2 Static Solutions

An immediate corollary of Theorem 1 is that $(G, \tilde\phi)$ is a covering space of $M_2$. The aim of this section is to describe the connexion between any point $g \in G$ and its corresponding static solution $\tilde\phi_g \in M_2$, that is, to obtain a picture of what the static lumps look like, and how they change as g varies. A configuration W may be visualized as a distribution of unit length three-vectors (“arrows”) over the torus. The energy density function of W is

$$E(x, y) = \frac{|W_x|^2 + |W_y|^2}{(1 + |W|^2)^2}, \tag{35}$$

so the energy is located where the direction of the arrows is varying sharply in (x, y), in other words, where neighbouring arrows are stretched apart. It is the function E that we will describe as W varies in $M_2$.

For this purpose, rather than using the hermitian-unitary (or “polar”) decomposition of SL(2, C) used above, another standard decomposition is convenient. Namely, any $L \in SL(2, C)$ may be uniquely decomposed into a product UT with $U \in SU(2)$ and T upper triangular, real on the diagonal, positive definite and unimodular. The space of such T-matrices is homeomorphic to $R^+ \times C$ (here $R^+ = (0, \infty)$) and may be parametrized thus:

$$T(\alpha, \rho) = \begin{pmatrix} \sqrt{\alpha e_1} & \sqrt{\alpha/e_1}\,\rho \\ 0 & 1/\sqrt{\alpha e_1} \end{pmatrix}. \tag{36}$$

This allows one to write any $W \in M_2$ in the form

$$W(z) = (UT) \odot \frac{\wp(z - s)}{e_1} = U \odot [\alpha(\wp(z - s) + \rho)]. \tag{37}$$

Changing $U \in SU(2)$ merely produces a global internal rotation of the solution and so has no effect on E(z). Changing $s \in T^2$ translates the solution on the torus, so it suffices to examine the three parameter family

$$W(z) = \alpha(\wp(z) + \rho), \qquad (\alpha, \rho) \in R^+ \times C, \tag{38}$$

whose energy density is

$$E(z) = \frac{8\alpha^2\,|\wp(z)|\,|\wp(z)^2 - e_1^2|}{(1 + \alpha^2 |\wp(z) + \rho|^2)^2}. \tag{39}$$

Note that for all $(\alpha, \rho)$, E = 0 at the four double valency points $z = 0, s_0, s_1, s_2$, around which the direction of the arrows is constant to first order.
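As a rough numerical illustration of Eq. (39), the following Python sketch approximates ℘ by a truncated lattice sum and evaluates E(z). It assumes that the torus is the unit square with period lattice Z + iZ and that z = 1/2 is the double valency point at which ℘ = e1. The names `wp`, `energy_density` and the truncation parameter `n_max` are illustrative choices, not taken from the paper.

```python
def wp(z, n_max=40):
    """Approximate the Weierstrass p-function on the lattice Z + iZ.

    Uses the absolutely convergent sum 1/z^2 + sum'[1/(z-w)^2 - 1/w^2],
    truncated symmetrically to |Re w|, |Im w| <= n_max.
    """
    total = 1.0 / (z * z)
    for m in range(-n_max, n_max + 1):
        for n in range(-n_max, n_max + 1):
            if m == 0 and n == 0:
                continue
            w = complex(m, n)
            total += 1.0 / ((z - w) ** 2) - 1.0 / (w * w)
    return total


# e1 = p(1/2) for the assumed unit square lattice (truncated approximation).
E1 = wp(0.5).real


def energy_density(z, alpha, rho, e1=E1):
    """Energy density of W = alpha * (p + rho), following Eq. (39)."""
    p = wp(z)
    numerator = 8.0 * alpha ** 2 * abs(p) * abs(p * p - e1 * e1)
    denominator = (1.0 + alpha ** 2 * abs(p + rho) ** 2) ** 2
    return numerator / denominator
```

Because the truncation is symmetric under rotation of the lattice by 90 degrees, the approximation reproduces the symmetry ℘(iz) = −℘(z) up to rounding error, whereas double periodicity holds only approximately.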


Fig. 2. Energy density plots of $W(z) = \wp(z) + \rho$ for various values of $\rho$. In plot (a) $\rho = 1 - i$, so the roots of W are separate and two lumps form. In plots (b), (c) and (d), $\rho = 0, e_1, -e_1$ respectively so the roots of W coincide. Here the energy distribution is roughly annular, centred on the double valency points $s_0, s_2, s_1$

The behaviour of E as (α, ρ) covers R+ × C is remarkably varied, going beyond the two-lump and annular structures one might expect by analogy with the planar CP 1 model. First, consider the case α = 1. Here, the energy is located in lumps close to the two roots of ℘(z) + ρ (symmetrically placed about s0 since ℘ is even) where the denominator of (39) is smallest. The only exceptions are when these roots coincide, ρ = 0, −e1 , e1 , or are close to coincidence, for then the lumps lose their individual identity and form a ring-like structure (centred on s0 , s1 or s2 respectively) rather reminiscent of coincident planar solitons (Fig. 2). If we now imagine increasing α above 1, the effect on W is to pull all the arrows in the configuration towards the north pole of S 2 (W = ∞), so that those close to the south pole (W = 0) are stretched apart. Since the energy is located where the arrows are stretched apart, increasing α therefore tends to concentrate E more strongly on roots of ℘(z) + ρ, and the lumps become taller and narrower. As α → ∞ the lumps collapse and “pinch off”. Conversely, if α is decreased below 1 the arrows of the configuration are pulled southwards, and for α very small, E concentrates on the double pole of ℘(z) + ρ (z = 0), where W points north. In this case a ring structure appears, centred on z = 0, and collapses to zero width as α → 0. These two cases are compared in Fig. 3. Noting the symmetry property ℘(iz) ≡ −℘(z), we see that whenever ρ passes through 0 ∈ C along a smooth curve, the roots of ℘(z) + ρ coalesce and emerge at right angles to their line of approach, giving a first hint that the familiar 90◦ scattering of lumps through a ring structure may take place in the geodesic approximation. We shall return to this point later.

Fig. 3. Energy density plots of $W(z) = \alpha(\wp(z) + 1 - i)$ in the cases of (a) large $\alpha$ ($\alpha = 2$) and (b) small $\alpha$ ($\alpha = 0.03$)

Fig. 4. The exceptionally symmetric family $W(z) = \alpha\wp(z)$. The parameter values are (a) $\alpha = 4$, (b) $\alpha = 0.3$, (c) $\alpha = 1/e_1$, (d) $\alpha = 0.01$ and (e) $\alpha = 0.005$. Plot (c) depicts the most evenly spread energy distribution possible for a degree 2 static solution

The special case $\rho = 0$ is exceptional, and will be prominent in later sections. Examining the formula (39) in this case, we see that the global maxima of E must occur where $\wp$ is purely imaginary. If not, assume that a global maximum occurs at $z_0$ and let $\wp(z_0) = u \in C \setminus iR$. Then

$$E(z_0) = \frac{8\alpha^2 |u|\,|u^2 - e_1^2|}{(1 + \alpha^2 |u|^2)^2} < \frac{8\alpha^2 |u|\,(|u|^2 + e_1^2)}{(1 + \alpha^2 |u|^2)^2}, \tag{40}$$

where the inequality is strict since $u^2$ is not real-negative. But there exists $z_1 \in T^2$ such that $\wp(z_1) = i|u|$, and

$$E(z_1) = \frac{8\alpha^2 |u|\,\bigl|-|u|^2 - e_1^2\bigr|}{(1 + \alpha^2 |u|^2)^2} > E(z_0), \tag{41}$$

a contradiction. Given the symmetry of E under $\wp \mapsto -\wp$, and that $\wp$ is even, it follows that E(z) has at least four peaks on the diagonals of the unit square, symmetrically placed about $s_0$. Plots of E confirm that there are, in fact, exactly four such peaks (Fig. 4). The most symmetric case is $\alpha = 1/e_1$, that is, $W(z) = \wp(z)/e_1$. Here, using the identity (17), one can easily show that $E(z - s_0) \equiv E(z)$, so the four peaks are located halfway towards the centre $s_0$ along the diagonals, i.e. at the points (1+i)/4, (3+i)/4, 3(1+i)/4, (1+3i)/4. This solution has the most evenly spread energy distribution possible. Once again, one can consider the effect of increasing $\alpha$ (pulling the arrows northwards) or decreasing $\alpha$ (pulling southwards) for this family. Increasing $\alpha$ moves the lumps towards $s_0$, where they coalesce, form a shrinking ring structure and pinch off. Decreasing $\alpha$ has the same effect, except the ring is centred on 0 rather than $s_0$. In fact, the solution $\alpha\wp(z)$ is identical, up to the rotation and translation $(U_0, s_0) \in G_0$, to the solution $\wp(z)/(e_1^2\alpha)$. When $\alpha$ is close to $1/e_1$ and $|\rho|$ is small but nonzero, the behaviour of E(z) is intermediate between the two cases described above. It has four peaks, but two of these are larger than the other two (Fig. 5).

Fig. 5. Energy density plot of $W(z) = (\wp(z) - i)/e_1$

5. The Metric on $M_2$

The argument of Belavin and Polyakov (7) shows that $M_2$ is the flat valley bottom of $Q_2$, on which V attains its topological minimum value, $4\pi$. Any departure from $M_2$ involves increasing V, and hence climbing the valley walls. Consider the initial value


problem where W starts on $M_2$ and is given a small push tangential to it. Then, by energy conservation, it must stay close to $M_2$ during its subsequent evolution. In the geodesic approximation one constrains the configuration to lie on $M_2$ for all time, but allows the position in $M_2$ to evolve in time according to the constrained action principle. Since $V = 4\pi$ always, the dynamics is determined solely by the kinetic energy functional (4). Using the homeomorphism $\phi$ we can transfer the differentiable structure of $G/G_0$ to $M_2$. Let $\{q^i : i = 1, 2, \ldots, 8\}$ be local coordinates on $M_2$, and consider the kinetic energy T as the $q^i$ vary in time:

$$T = g_{ij}(q)\,\dot q^i \dot q^j, \tag{42}$$

where

$$g_{ij}(q) = \mathrm{Re} \int_{T^2} \frac{1}{(1 + |W|^2)^2}\, \frac{\partial W}{\partial q^i}\, \overline{\frac{\partial W}{\partial q^j}}. \tag{43}$$

Equation (43) defines a Riemannian metric on $M_2$, $g = g_{ij}\,dq^i dq^j$, and furthermore the constrained Euler-Lagrange equation (obtained by varying the action $S[q] = \int dt\, T(q, \dot q)$) is the geodesic equation for $(M_2, g)$. The conjecture is, then, that geodesics in this Riemannian manifold are, when travelled at low speed, close to low-energy two-lump dynamical solutions of the $CP^1$ model. Some justification for this can be found when comparison is made with other models for which the approximation has been used. In the case of abelian-Higgs vortices, for example, the approximation [22] is supported by rigorous analysis [26] and extensive numerical solution of the full field equations [18].

Ideally, one would like an explicit, closed-form expression for the metric g, but this is rarely possible in practice. There are exceptions [1, 24, 25], but unfortunately this is not one of them. It is possible to place fairly strong constraints on the possible form of g, but not to write it down explicitly (to do so requires, naively, the evaluation of 36 integrals over $T^2$, each with 8 parameters). It is convenient to lift the geometry to the covering space $(G, \tilde g)$, where $\tilde g = \tilde\phi^* g$, the pullback of the metric g by the covering projection $\tilde\phi$. The most useful constraint is that $\tilde g$ is a product metric $\tilde g = \hat g \oplus \delta$ on $PSL(2, C) \times T^2$, where $\delta = 2\pi\,ds\,d\bar s$. By product metric [10] we mean block diagonal with $\hat g$ independent of position in $T^2$ and $\delta$ independent of position in PSL(2, C). This is easily established if we recall that any $W \in M_2$ is a rational function of $\wp(z - s)$, $W(z) = R_0(\wp(z - s))$, so denoting by $\mu$ any one of the six PSL(2, C) moduli, we see that

$$\frac{\partial W}{\partial s} = -R_1(\wp(z - s))\,\wp'(z - s), \qquad \frac{\partial W}{\partial \mu} = R_2(\wp(z - s)), \tag{44}$$

where $R_1$, $R_2$ are also rational functions. So the $(\mu, s)$ component of $\tilde g$ is

$$\tilde g_{\mu s} = \mathrm{Re} \int_{T^2} f(\wp(z - s))\,\wp'(z - s) = \mathrm{Re} \int_{T^2} f(\wp(z))\,\wp'(z) = 0, \tag{45}$$

since $\wp$ is even while $\wp'$ is odd (here $f(u) = -(1 + |R_0(u)|^2)^{-2} R_1(u) R_2(u)$). Similarly, $\tilde g_{\mu \bar s} = 0$, so $\tilde g$ is block diagonal as claimed. Translation symmetry implies that $\tilde g$, and hence $\hat g$, must be independent of s. Hence, it remains to show that $\delta$ is independent of the PSL(2, C) moduli. Let $W(x, y, t) = \widetilde W(z - s(t))$, $\widetilde W \in M_2$, and compute the kinetic energy,

$$T = \int_{T^2} \frac{|\widetilde W'|^2}{(1 + |\widetilde W|^2)^2}\,|\dot s|^2 = \frac{1}{2} V[\widetilde W]\,|\dot s|^2 = 2\pi |\dot s|^2. \tag{46}$$

Since this is, by definition, $g_{s\bar s}|\dot s|^2$, we read off the metric $\delta = 2\pi\,ds\,d\bar s$ on $T^2$. The geodesic equation for $(G, \tilde g)$ decouples into independent geodesic equations for $(PSL(2, C), \hat g)$ and $(T^2, \delta)$. Consequently, we may identify s as an effective “centre of mass coordinate” which drifts on $T^2$ at constant velocity, independent of the lumps’ relative motion in PSL(2, C). Without loss of generality, therefore, we can investigate geodesic motion in the reduced covering space $(\hat G, \hat g)$, where $\hat G$ denotes PSL(2, C). So lump dynamics in the geodesic approximation has Galilean boost symmetry. This may be understood as a remnant of the Lorentz symmetry of the $CP^1$ model in $R^{2+1}$: the field equation is still Lorentz invariant, but the spatial boundary conditions now are not. Under a Lorentz boost, they suffer Lorentz contraction. In a low speed approximation such as this, however, Galilean boost symmetry is recovered, since the spatial contraction is a high order effect.

One further constraint on $\hat g$ will prove useful: since the kinetic energy functional is invariant under global internal rotations of W (rotations of the codomain $S^2$), SU(2) acts isometrically by left multiplication on $(\hat G, \hat g)$ [24]. Briefly, $\hat g$ (or $\tilde g$) is left-invariant under SU(2).

6. Geodesic Incompleteness of $(M_2, g)$

One of the most basic questions one can ask about a Riemannian manifold without boundary is whether it is geodesically complete, that is, whether all geodesics can be extended infinitely in time (forwards and backwards). In view of the noncompactness of $M_2$, this is a nontrivial question for $(M_2, g)$. We will prove that $(M_2, g)$ is, in fact, geodesically incomplete, by finding a geodesic which, although maximally extended, has only finite length (since geodesics are traversed at constant speed, this is sufficient).
This geodesic is obtained explicitly, despite our lack of explicit information about g, by using discrete isometries to identify a one dimensional geodesic submanifold. Such arguments have been used to obtain multimonopole scattering geodesics [12] given similarly scant knowledge of the metric on moduli space. The key observation is that the fixed point set of a discrete group of isometries of a Riemannian manifold is (if a submanifold) a totally geodesic submanifold, that is, a geodesic which starts on and tangential to the fixed point set must remain on the fixed point set for all subsequent time. This follows directly from uniqueness of solutions to the initial value problem of an ordinary differential equation. If a discrete group is found whose fixed point set is diffeomorphic to R, then the set itself is a geodesic.

The following mappings are isometries of $(\hat G, \hat g)$:

$$P : L \mapsto \bar L, \tag{47}$$

$$R : L \mapsto \tau_3 L \tau_3. \tag{48}$$

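To see this, consider their effect on $W(z) = L \odot (\wp(z)/e_1)$,

$$P : W(z) \mapsto \bar L \odot \frac{\wp(z)}{e_1} = \overline{L \odot \frac{\wp(\bar z)}{e_1}} = \overline{W(\bar z)}, \tag{49}$$

$$R : W(z) \mapsto (\tau_3 L \tau_3) \odot \frac{\wp(z)}{e_1} = -[L \odot (-\wp(z)/e_1)] = -W(iz). \tag{50}$$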
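The two algebraic identities behind (49) and (50) can be spot-checked numerically. The sketch below assumes that the action of L on the field is the usual Möbius action u ↦ (au + b)/(cu + d); the helper names `mobius`, `conj_by_tau3`, `conjugate` and `make_unimodular` are illustrative and do not appear in the paper.

```python
def mobius(L, u):
    """Moebius action of the 2x2 matrix L = ((a, b), (c, d)) on u."""
    (a, b), (c, d) = L
    return (a * u + b) / (c * u + d)


def conj_by_tau3(L):
    """Return tau3 * L * tau3, with tau3 = diag(1, -1)."""
    (a, b), (c, d) = L
    return ((a, -b), (-c, d))


def conjugate(L):
    """Entrywise complex conjugate of L."""
    (a, b), (c, d) = L
    return ((a.conjugate(), b.conjugate()), (c.conjugate(), d.conjugate()))


def make_unimodular(a, b, c):
    """Complete (a, b, c) to a matrix of determinant 1 (requires a != 0)."""
    d = (1 + b * c) / a
    return ((a, b), (c, d))
```

For any unimodular L and complex u, `mobius(conj_by_tau3(L), u)` should agree with `-mobius(L, -u)`, and `mobius(conjugate(L), u)` with the complex conjugate of `mobius(L, u.conjugate())`. Combined with ℘(iz) = −℘(z) and the reality of ℘ on the real axis, these give (50) and (49) respectively.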


So P produces simultaneous reflexions in both domain ($z \mapsto \bar z$) and codomain ($W \mapsto \overline{W}$), while R produces rotations of $\pi/2$ in the domain ($z \mapsto iz$) and $\pi$ in the codomain ($W \mapsto -W$), all of which are symmetries of the $CP^1$ model. The composition of P and R in either order (they commute) is another isometry of $(\hat G, \hat g)$. Since $P^2 = R^2 = (PR)^2 = \mathrm{Id}$, the isometries $\{\mathrm{Id}, P, R, PR\}$ form the Viergruppe $V_4$ under composition. A straightforward calculation shows that $\hat\Sigma_V$, the fixed point set of $V_4$, is

$$\hat\Sigma_V = \{\mathrm{diag}((\alpha e_1)^{1/2}, (\alpha e_1)^{-1/2}) : \alpha \in R^+\}. \tag{51}$$

This is clearly a submanifold of $\hat G$ diffeomorphic to R, and hence is a geodesic. Its image under the projection $\tilde\phi$ is $\Sigma_V = \{\alpha\wp(z) : \alpha \in R^+\}$, which is a geodesic of $(M_2, g)$, also diffeomorphic to R. The submanifold $\Sigma_V$ was described at the end of Sect. 4. The lump motion corresponding to this geodesic is an infinitely tall thin ring centred at 0 in the past spreading out into four distinct identical peaks, which recombine to form an infinitely tall thin ring centred on $s_0$ in the future (assuming that $\Sigma_V$ is traversed in the sense of increasing $\alpha$). The question remains whether $\Sigma_V$ is traversed in finite time, i.e. has finite length, and to answer this one needs to understand the induced metric $g_V$ on $\Sigma_V$. The restriction of g to $\Sigma_V$ is

$$g_V = f(\alpha)\,d\alpha^2, \tag{52}$$

where

$$f(\alpha) = \int_{T^2} \frac{|\wp|^2}{(1 + \alpha^2 |\wp|^2)^2}. \tag{53}$$

Note that f is clearly positive and decreasing, and is easily shown to have limits $\infty$ and 0 as $\alpha \to 0$ and $\alpha \to \infty$ respectively. To prove that $\Sigma_V$ has finite length we will need detailed asymptotic estimates for f in these two limits. The identity (17) implies that

$$f(\alpha) \equiv \frac{1}{(\alpha e_1)^4}\, f\!\left(\frac{1}{\alpha e_1^2}\right), \tag{54}$$

so the behaviour in one limit follows directly from the behaviour in the other.

Lemma 2. The following asymptotic formulae hold,

$$f(\alpha) \sim \frac{\pi^2}{4\alpha} \quad \text{as } \alpha \to 0, \tag{55}$$

$$f(\alpha) \sim \frac{\pi^2}{4 e_1^2 \alpha^3} \quad \text{as } \alpha \to \infty. \tag{56}$$

Proof. We need only prove (55) since (56) follows from this and Eq. (54). The idea is to split the integration region of (53) into a small neighbourhood of 0 and its complement, bound the contribution of the latter region and use a Laurent expansion in the former. Fix some $\epsilon \in (0, \frac{1}{4})$ and split $T^2$ into $D_\epsilon(0) \sqcup (T^2 \setminus D_\epsilon(0))$, where $D_\epsilon(0)$ is the open disk of radius $\epsilon$ centred on 0. Then

$$\int_{D_\epsilon(0)} \frac{|\wp|^2}{(1 + \alpha^2 |\wp|^2)^2} < f(\alpha) < \int_{D_\epsilon(0)} \frac{|\wp|^2}{(1 + \alpha^2 |\wp|^2)^2} + \int_{T^2 \setminus D_\epsilon(0)} \frac{|\wp|^2}{(1 + \alpha^2 |\wp|^2)^2} \tag{57}$$

and $|\wp|$ is bounded on $T^2 \setminus D_\epsilon(0)$, so there exists $M \in (0, \infty)$ such that

$$\int_{T^2 \setminus D_\epsilon(0)} \frac{|\wp|^2}{(1 + \alpha^2 |\wp|^2)^2} < \int_{T^2 \setminus D_\epsilon(0)} |\wp|^2 < M \tag{58}$$

independent of $\alpha$. Hence

$$\int_{D_\epsilon(0)} \frac{\alpha |\wp|^2}{(1 + \alpha^2 |\wp|^2)^2} < \alpha f(\alpha) < \alpha M + \int_{D_\epsilon(0)} \frac{\alpha |\wp|^2}{(1 + \alpha^2 |\wp|^2)^2}, \tag{59}$$

so it suffices to prove that

$$\lim_{\alpha \to 0} \int_{D_\epsilon(0)} \frac{\alpha |\wp|^2}{(1 + \alpha^2 |\wp|^2)^2} = \frac{\pi^2}{4}. \tag{60}$$

The function $h(z) = z^2 \wp(z)$ is analytic, bounded and (since $s_0 \notin D_\epsilon(0)$) bounded away from 0 on $D_\epsilon(0)$. So $\wp(z) = h(z)/z^2$, where $0 < c_1 < |h(z)| < c_2 < \infty$, $c_1$ and $c_2$ being constants. Defining $\gamma = \epsilon/\sqrt{\alpha}$ and $u = z/\sqrt{\alpha}$,

$$\lim_{\alpha \to 0} \int_{D_\epsilon(0)} \frac{\alpha |\wp|^2}{(1 + \alpha^2 |\wp|^2)^2} = \lim_{\alpha \to 0} \int_{D_\epsilon(0)} dz\,d\bar z\, \frac{\alpha |h(z)|^2/|z|^4}{(1 + \alpha^2 |h(z)|^2/|z|^4)^2} = \lim_{\alpha \to 0} \int_{C} du\,d\bar u\; \chi_\gamma(u)\, \frac{|h(\sqrt{\alpha}u)|^2 |u|^4}{(|u|^4 + |h(\sqrt{\alpha}u)|^2)^2}, \tag{61}$$

where $\chi_\gamma$ is the characteristic function of the disk $D_\gamma(0)$ (i.e. $\chi_\gamma(u) = 1$ if $|u| < \gamma$, 0 otherwise). The integrand of (61) is bounded above, independent of $\alpha$, by

$$\frac{c_2^2 |u|^4}{(c_1^2 + |u|^4)^2} \tag{62}$$

which is integrable on C. Hence, Lebesgue’s dominated convergence theorem applies [6], and we may interchange the order of limit and integration in Eq. (61). From the Laurent expansion of $\wp$ about 0,

$$\wp(z) = \frac{1}{z^2} + O(z^2), \tag{63}$$

one sees that $h(z) = 1 + O(z^4)$, whence

$$\lim_{\alpha \to 0} \frac{\chi_\gamma(u)\,|h(\sqrt{\alpha}u)|^2 |u|^4}{(|u|^4 + |h(\sqrt{\alpha}u)|^2)^2} = \frac{|u|^4}{(1 + |u|^4)^2}. \tag{64}$$

Integrating this function over C yields $\pi^2/4$, which completes the proof.
There now immediately follows

Theorem 2. The moduli space $(M_2, g)$ is geodesically incomplete.

Proof. We need only prove that the length of $\Sigma_V$,

$$l = \int_0^\infty d\alpha\, \sqrt{f(\alpha)}, \tag{65}$$

is finite. By Lemma 2 there exist $0 < \alpha_1 < \alpha_2 < \infty$, $0 < c_3, c_4 < \infty$ such that $f(\alpha) < c_3/\alpha$ on $(0, \alpha_1)$ and $f(\alpha) < c_4/\alpha^3$ on $(\alpha_2, \infty)$. Hence,

$$l < 2\sqrt{c_3 \alpha_1} + \int_{\alpha_1}^{\alpha_2} d\alpha\, \sqrt{f(\alpha)} + 2\sqrt{\frac{c_4}{\alpha_2}} < 2\sqrt{c_3 \alpha_1} + (\alpha_2 - \alpha_1)\sqrt{f(\alpha_1)} + 2\sqrt{\frac{c_4}{\alpha_2}} \tag{66}$$

by monotonicity of f.
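The two end contributions in (66) come from integrating the square roots of the bounds supplied by Lemma 2. A brief worked version, using only the constants $c_3$, $c_4$, $\alpha_1$, $\alpha_2$ introduced in the proof, might read:

```latex
\int_0^{\alpha_1} \sqrt{\frac{c_3}{\alpha}}\; d\alpha
  = \sqrt{c_3}\,\Bigl[\,2\alpha^{1/2}\Bigr]_0^{\alpha_1}
  = 2\sqrt{c_3\,\alpha_1},
\qquad
\int_{\alpha_2}^{\infty} \sqrt{\frac{c_4}{\alpha^3}}\; d\alpha
  = \sqrt{c_4}\,\Bigl[-2\alpha^{-1/2}\Bigr]_{\alpha_2}^{\infty}
  = 2\sqrt{\frac{c_4}{\alpha_2}}.
```

Both integrands have integrable singularities or decay (exponents $-1/2$ at 0 and $-3/2$ at infinity), which is precisely why the asymptotics (55) and (56) suffice for finite length.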

The geodesic approximation predicts, then, that lumps (at least when coincident) can shrink and form singularities in finite time. Shrinking has been observed in numerical simulations of the $CP^1$ model in both the plane [29] and the torus [7], although the particular initial value problem considered here has not been simulated.

7. A Two-Dimensional Geodesic Submanifold

The Viergruppe $V_4$ has three $Z_2$ subgroups $\{\mathrm{Id}, P\}$, $\{\mathrm{Id}, R\}$ and $\{\mathrm{Id}, PR\}$ whose fixed point sets $\hat\Sigma_P$, $\hat\Sigma_R$, $\hat\Sigma_{PR}$ respectively are all geodesic submanifolds of $(\hat G, \hat g)$. Of these, $\hat\Sigma_R$ is two dimensional (the others are three dimensional) and projects under $\tilde\phi$ to

$$\Sigma_R = \{\alpha e^{i\psi}\wp(z) : \alpha \in R^+, \psi \in [0, 2\pi]\}, \tag{67}$$

a geodesic submanifold of $(M_2, g)$ diffeomorphic to a cylinder. This has a tractable geodesic problem. Recalling that g is left-invariant under SU(2), its restriction $g_R$ to $\Sigma_R$ is independent of $\psi$. In fact,

$$g_R = f(\alpha)(d\alpha^2 + \alpha^2 d\psi^2), \tag{68}$$

where f is the same function defined in (53). Lemma 2 implies that the asymptotic form of $g_R$ towards the ends of the cylinder is

$$g_R \sim \frac{\pi^2}{4\alpha}(d\alpha^2 + \alpha^2 d\psi^2) \quad \text{as } \alpha \to 0, \tag{69}$$

$$g_R \sim \frac{\pi^2}{4 e_1^2 \alpha^3}(d\alpha^2 + \alpha^2 d\psi^2) \quad \text{as } \alpha \to \infty. \tag{70}$$
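The local geometry encoded in (69) can be made explicit by a change of radial coordinate. The following is a sketch; the coordinates $r$ and $\beta$ are introduced here for illustration:

```latex
r := \pi\sqrt{\alpha}
\;\Longrightarrow\;
dr^2 = \frac{\pi^2}{4\alpha}\, d\alpha^2,
\qquad
\frac{\pi^2}{4\alpha}\,\alpha^2\, d\psi^2 = \frac{r^2}{4}\, d\psi^2
\;\Longrightarrow\;
g_R \sim dr^2 + r^2\, d\!\left(\frac{\psi}{2}\right)^{2}.
\\[4pt]
\text{Since } \psi/2 \in [0,\pi], \text{ the total angle about } r = 0
\text{ is } \pi, \text{ i.e. the deficit angle is } 2\pi - \pi = \pi.
\\[4pt]
\beta := \frac{1}{e_1^2\alpha}
\;\Longrightarrow\;
\frac{\pi^2}{4e_1^2\alpha^3}\left(d\alpha^2 + \alpha^2 d\psi^2\right)
 = \frac{\pi^2}{4\beta}\left(d\beta^2 + \beta^2 d\psi^2\right).
```

The last line shows that (70) takes exactly the form (69) in the coordinate $\beta$, so the same flat-cone picture applies at the other end of the cylinder.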

The formula in (69) is the metric of a flat, singular cone with deficit angle $\pi$, so $(\Sigma_R, g_R)$ can be visualized as having a conical singularity at $\alpha = 0$. By virtue of the identity (54), the metric $g_R$ is invariant under the mapping $\alpha \mapsto (e_1^2\alpha)^{-1}$, and consequently $(\Sigma_R, g_R)$ has an identical conical singularity at $\alpha = \infty$, as may be shown by the reparametrization $\beta = (e_1^2\alpha)^{-1}$ in (70). So $(\Sigma_R, g_R)$ is a rotationally symmetric cylinder of finite length with its ends pinched to identical cones. It is the internal rotation orbit, about a fixed axis, of the one parameter family of exceptionally symmetric configurations already described ($\Sigma_V$). The singular points $\alpha = 0$ and $\alpha = \infty$ correspond to infinitely narrow, spiky, ring like configurations centred on 0 and $s_0$ respectively. Motion on $\Sigma_R$ corresponds to rotational and shape changing motion of the double lump on the torus. The conserved kinetic energy of this motion is

$$T = f(\alpha)\dot\alpha^2 + \frac{J^2}{4\alpha^2 f(\alpha)}, \tag{71}$$

where $J = 2\alpha^2 f(\alpha)\dot\psi$ is the conserved angular momentum conjugate to $\psi$. One may imagine the dynamics as that of a point particle moving on the interval $(0, \infty)$ with position dependent mass and subject to a potential. Geodesic motion is invariant under rescaling of time, so one can restrict attention to the two cases $J^2 = 0$ and $J^2 = 1$. If $J^2 = 0$ the motion is irrotational and the point particle travels from one conical singularity to the other in finite time along a path of constant $\psi$. These geodesics are just rotated versions of $\Sigma_V$. The more interesting case is when $J^2 = 1$, where the nature of the motion is determined by the centrifugal potential

$$U(\alpha) = \frac{1}{4\alpha^2 f(\alpha)}. \tag{72}$$

From the asymptotic formulae of Proposition 1 we see that the potential has the asymptotic behaviour

$$U(\alpha) \sim \frac{1}{\pi^2 \alpha} \quad \text{as } \alpha \to 0, \tag{73}$$

$$U(\alpha) \sim \frac{e_1^2}{\pi^2}\,\alpha \quad \text{as } \alpha \to \infty, \tag{74}$$

implying that U must have at least one stable equilibrium. The identity (54) implies a similar identity for the potential,

$$U\!\left(\frac{1}{\alpha e_1^2}\right) \equiv U(\alpha). \tag{75}$$
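The step from (54) to (75), and the critical point of U that the symmetry forces, can be written out as follows. This is a short worked sketch using only the definitions (54) and (72):

```latex
U\!\left(\frac{1}{\alpha e_1^2}\right)
 = \frac{(\alpha e_1^2)^2}{4\, f\!\left(\frac{1}{\alpha e_1^2}\right)}
 = \frac{\alpha^2 e_1^4}{4\,(\alpha e_1)^4 f(\alpha)}
 = \frac{1}{4\alpha^2 f(\alpha)}
 = U(\alpha),
\qquad\text{using } f\!\left(\tfrac{1}{\alpha e_1^2}\right) = (\alpha e_1)^4 f(\alpha).
\\[6pt]
\frac{d}{d\alpha}:\quad
-\frac{1}{\alpha^2 e_1^2}\, U'\!\left(\frac{1}{\alpha e_1^2}\right) = U'(\alpha)
\;\;\Longrightarrow\;\;
-U'\!\left(\tfrac{1}{e_1}\right) = U'\!\left(\tfrac{1}{e_1}\right)
\;\;\Longrightarrow\;\;
U'\!\left(\tfrac{1}{e_1}\right) = 0 .
```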

Fig. 6. The centrifugal potential $U(\alpha)$ of Eq. (72), solid line, compared with the asymptotic formulae for U for small and large $\alpha$ given in Eqs. (73) and (74), dashed lines


Differentiating both sides of (75) one finds that U has a critical point at $\alpha = 1/e_1$, the fixed point of the isometry $\alpha \mapsto (\alpha e_1^2)^{-1}$. Numerical evaluation of U suggests that this is the only critical point, a global minimum, so that U is a single potential well (see Fig. 6). Since U grows unbounded as $\alpha \to 0$ and $\alpha \to \infty$, all motion in the well is oscillatory. So these geodesics wind around $\Sigma_R$, passing back and forth along its length indefinitely. They are bounded away from the singularities by angular momentum conservation. They correspond to rotational motions of the double lump during which the arrows of the configuration spin about the north-south axis of $S^2$, and its shape periodically oscillates about that of the most symmetric configuration, $W(z) = \wp(z)/e_1$.

8. The Diameter of $(M_2, g)$

In this section we will prove that $(M_2, g)$ has finite size, in an appropriate sense. Since one is interested in $(M_2, g)$ primarily for its geodesics, a linear measure of size is most meaningful, so we will consider its diameter. Recall that $(M_2, g)$, like any Riemannian manifold, has a natural distance function $d : M_2 \times M_2 \to R$, where $d(W, W')$ is the infimum of lengths with respect to g of piecewise $C^1$ paths in $M_2$ connecting W and $W'$ (note that d has nothing to do with D, the distance function defined in Sect. 3, although they define equivalent topologies on $M_2$). The diameter of $(M_2, g)$ is simply the diameter of the associated metric space $(M_2, d)$, that is,

$$\mathrm{diam}(M_2, g) = \sup_{W, W' \in M_2} d(W, W'). \tag{76}$$

Once again, it is the noncompactness of $M_2$ which makes this diameter interesting, and its finiteness nontrivial. The geometric meaning of the result is that all points lie within a bounded distance of each other, and, in particular, no point lies far from the end of $M_2$, where the static solutions collapse to singular, spiky configurations. Thus all static solutions are close to collapse in this geometry. This may be the underlying cause of the ubiquitous instability found in numerical simulations of two-lump scattering on the torus [7].

Theorem 3. The moduli space $(M_2, g)$ has finite diameter.

Proof. It suffices to prove that the covering space $(G, \tilde g)$ has finite diameter and, further, since $\tilde g = \hat g \oplus \delta$ is a product metric and $T^2$ is compact, it is sufficient to prove that the reduced covering space $(\hat G, \hat g)$ has finite diameter. By the triangle inequality for $d : \hat G \times \hat G \to R$,

$$\mathrm{diam}(\hat G, \hat g) \le 2 \sup_{W \in \hat G} d(W, W_0), \tag{77}$$

where $W_0$ is any point in $\hat G$. Let $W_0 = \wp$. We will explicitly construct a path from $W \in \hat G \cong SO(3) \times [R^+ \times C]$ to $W_0$, and bound its length independent of W.

Let $W = U \odot [\alpha(\wp + \rho)]$. The first piece of the path has $(\alpha, \rho) \in R^+ \times C$ fixed, but takes U to I. For example, since any $U \in SU(2)$ is $\exp(u)$ for some $u \in su(2)$, we could consider the path $\mathcal{U}(t) = \exp((1 - t)u)$, so that $\mathcal{U}(0) = U$ while $\mathcal{U}(1) = I$. Denote by $\gamma(\alpha, \rho)$ the metric on SO(3) induced by $\hat g$ at fixed $(\alpha, \rho)$. Since SO(3) is compact the length of $\mathcal{U}(t)$ is bounded independent of U for each $(\alpha, \rho)$. One must check, however, that the length remains bounded as a function of $(\alpha, \rho)$. Since $\gamma(\alpha, \rho)$ is a left-invariant metric on SO(3), it suffices to show that

$$\Gamma(\alpha, \rho) := \sum_{i,j=1}^{3} |\gamma_{ij}(\alpha, \rho)| \tag{78}$$

is a bounded function, where $\gamma_{ij}(\alpha, \rho)$ are the metric coefficients of $\gamma$ evaluated at $I \in SO(3)$ with respect to a particular choice of basis for $T_I SO(3)$. The basis used does not matter. One convenient choice consists of the three vectors represented by the curves

$$\exp\!\left(\frac{i t \tau_i}{2}\right), \quad i = 1, 2, 3, \quad t \in (-\epsilon, \epsilon), \tag{79}$$

where $\tau_i$ are the Pauli matrices (this is equivalent to choosing $\{i\tau_i/2 : i = 1, 2, 3\}$ as a basis for su(2)). Elementary calculation then shows that $\Gamma < 3$ for all $(\alpha, \rho)$. For example, $\gamma_{33}(\alpha, \rho)$ is the squared length of the vector $[\exp(it\tau_3/2)]$. Let $w := \alpha(\wp + \rho)$. Then

$$W(z, t) = \exp\!\left(\frac{i t \tau_3}{2}\right) \odot w(z) = e^{it} w(z), \qquad \dot W(z) = \left.\frac{\partial W}{\partial t}\right|_{t=0} = i w(z), \tag{80}$$

and

$$\gamma_{33}(\alpha, \rho) = \int_{T^2} \frac{|\dot W|^2}{(1 + |W|^2)^2} = \int_{T^2} \frac{|w|^2}{(1 + |w|^2)^2} < \frac{1}{2}. \tag{81}$$

Bounds on the other metric coefficients are equally straightforward.

It remains to construct a path from $\alpha(\wp + \rho)$ to $\wp$ with length bounded above independent of $(\alpha, \rho)$. It is necessary to split $R^+ \times C$ into two pieces $X_+ \sqcup X_-$ and construct the path differently in each piece. Here $X_+ = \{(\alpha, \rho) \in R^+ \times C : \alpha > 1\}$ and $X_-$ is its complement. For any $(\alpha, \rho) \in X_-$ construct the path $x_- : [0, 1] \to R^+ \times C$, where

$$x_-(t) = \begin{cases} (\alpha, (1 - 2t)\rho) & t \in [0, \frac{1}{2}] \\ (1 + 2(1 - \alpha)(t - 1), 0) & t \in (\frac{1}{2}, 1] \end{cases}, \tag{82}$$

so that $x_-(0) = (\alpha, \rho)$, $x_-(1) = (1, 0)$. Thinking of $R^+ \times C$ as the upper half of $R^3$, this path consists of a horizontal line from $(\alpha, \rho)$ to $(\alpha, 0)$ followed by a vertical line from $(\alpha, 0)$ to (1, 0) (see Fig. 7). Its length is bounded above by the sum of the lengths of the curves $\{(\alpha, te^{i\psi}) : t \in [0, \infty)\}$ and $\{(t, 0) : t \in (0, 1]\}$, where $\psi = \arg\rho$. So

$$l[x_-] < l_1(\alpha, \psi) + l_3, \tag{83}$$

where

$$l_1(\alpha, \psi) = \int_0^\infty d|\rho|\, \sqrt{\hat g_{|\rho||\rho|}(\alpha, \rho)} = \int_0^\infty d|\rho| \left[\int_{T^2} \frac{\alpha^2}{(1 + \alpha^2 |\wp + |\rho|e^{i\psi}|^2)^2}\right]^{1/2},$$

$$l_3 = \int_0^1 d\alpha\, \sqrt{\hat g_{\alpha\alpha}(\alpha, 0)} = \int_0^1 d\alpha \left[\int_{T^2} \frac{|\wp|^2}{(1 + \alpha^2 |\wp|^2)^2}\right]^{1/2}. \tag{84}$$

Fig. 7. The paths $x_-$ and $x_+$ constructed in the proof of Theorem 3

That $l_3$ is finite follows directly from Lemma 2, since $\hat g_{\alpha\alpha}(\alpha, 0)$ is precisely $f(\alpha)$, the function previously discussed. To prove that $l_1(\alpha, \psi)$ is finite and bounded independent of $(\alpha, \psi) \in (0, 1] \times [0, 2\pi]$ is more involved. By a change of variable, $\sigma := \alpha|\rho|$, we can rewrite $l_1(\alpha, \psi)$ as

$$l_1(\alpha, \psi) = \int_0^\infty d\sigma \left[\int_{T^2} \frac{1}{(1 + |\alpha\wp + \sigma e^{i\psi}|^2)^2}\right]^{1/2}. \tag{85}$$
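The change of variable leading to (85) is a one-line computation; written out (a sketch using only the definition of $l_1$ in (84)):

```latex
\sigma = \alpha|\rho| \;\Longrightarrow\; d|\rho| = \frac{d\sigma}{\alpha},
\qquad
\alpha^2 \bigl|\wp + |\rho| e^{i\psi}\bigr|^2 = \bigl|\alpha\wp + \sigma e^{i\psi}\bigr|^2,
\\[4pt]
l_1(\alpha,\psi)
 = \int_0^\infty \frac{d\sigma}{\alpha}\;
   \alpha \left[\int_{T^2} \frac{1}{\bigl(1 + |\alpha\wp + \sigma e^{i\psi}|^2\bigr)^2}\right]^{1/2}
 = \int_0^\infty d\sigma
   \left[\int_{T^2} \frac{1}{\bigl(1 + |\alpha\wp + \sigma e^{i\psi}|^2\bigr)^2}\right]^{1/2}.
```

The explicit factor of $\alpha$ from the numerator of the integrand cancels against the Jacobian $1/\alpha$, so the only remaining $\alpha$-dependence sits inside $|\alpha\wp + \sigma e^{i\psi}|$.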

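One must now appeal to a technical lemma, whose proof we postpone:

Lemma 3. There exist $\sigma_*, C > 0$, independent of $(\alpha, \psi)$, such that $\forall \sigma > \sigma_*$ and $\alpha \le 1$,

$$\int_{T^2} \frac{1}{(1 + |\alpha\wp + \sigma e^{i\psi}|^2)^2} < \frac{C}{\sigma^3}. \tag{86}$$

It follows that

$$l_1(\alpha, \psi) = \int_0^{\sigma_*} d\sigma \left[\int_{T^2} \frac{1}{(1 + |\alpha\wp + \sigma e^{i\psi}|^2)^2}\right]^{1/2} + \int_{\sigma_*}^\infty d\sigma \left[\int_{T^2} \frac{1}{(1 + |\alpha\wp + \sigma e^{i\psi}|^2)^2}\right]^{1/2} < \sigma_* + \int_{\sigma_*}^\infty d\sigma\, \sqrt{\frac{C}{\sigma^3}} < C' < \infty, \tag{87}$$

$C'$ being a constant. Now, for any $(\alpha, \rho) \in X_+$ construct the path $x_+ : [0, 1] \to R^+ \times C$, where

$$x_+(t) = \begin{cases} (\alpha - 2(\alpha - 1)t, \rho) & t \in [0, \frac{1}{2}] \\ (1, 2(1 - t)\rho) & t \in (\frac{1}{2}, 1], \end{cases} \tag{88}$$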

consisting (see Fig. 7) of a vertical line from $(\alpha, \rho)$ to $(1, \rho)$ followed by a horizontal line from $(1, \rho)$ to (1, 0). Its length is bounded above by the sum of the lengths of the lines $\{(t, \rho) : t \in [1, \infty)\}$ and $\{(1, te^{i\psi}) : t \in [0, \infty)\}$. So

$$l[x_+] < l_2(\rho) + l_1(1, \psi), \tag{89}$$

where $l_1$ was previously defined and

$$l_2(\rho) = \int_1^\infty d\alpha\, \sqrt{\hat g_{\alpha\alpha}(\alpha, \rho)} = \int_1^\infty d\alpha \left[\int_{T^2} \frac{|\wp + \rho|^2}{(1 + \alpha^2 |\wp + \rho|^2)^2}\right]^{1/2}. \tag{90}$$
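The two piecewise-linear paths (82) and (88) are simple enough to write down directly. The sketch below represents a point of $R^+ \times C$ as a Python tuple `(alpha, rho)` with `rho` complex; the function names `x_minus` and `x_plus` are illustrative, and the code only checks the endpoints and corner points, not the length bounds.

```python
def x_minus(t, alpha, rho):
    """Path (82): horizontal to (alpha, 0), then vertical to (1, 0)."""
    if t <= 0.5:
        return (alpha, (1 - 2 * t) * rho)
    return (1 + 2 * (1 - alpha) * (t - 1), 0j)


def x_plus(t, alpha, rho):
    """Path (88): vertical to (1, rho), then horizontal to (1, 0)."""
    if t <= 0.5:
        return (alpha - 2 * (alpha - 1) * t, rho)
    return (1.0, 2 * (1 - t) * rho)
```

For example, with `alpha = 0.4` and `rho = 1 + 2j`, `x_minus` starts at `(0.4, 1 + 2j)`, reaches `(0.4, 0)` at `t = 0.5` and ends at `(1, 0)`; with `alpha = 3`, `x_plus` starts at `(3, rho)`, reaches `(1, rho)` at `t = 0.5` and also ends at `(1, 0)`.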

We have already shown that $l_1(1, \psi)$ is finite and bounded independent of $\psi$ (this follows from Lemma 3 in the case $\alpha = 1$). That $l_2(\rho)$ is finite $\forall \rho \in C$ is easily shown, using an argument similar to that of Lemma 2. Let $z_1, z_2$ be the roots of $\wp + \rho$ (possibly coincident) and split $T^2$ into small neighbourhoods of these roots and their complement. In the complement use the trivial bound $|\wp + \rho| \ge C$, constant, while near the roots use Laurent expansions of $\wp + \rho$. One finds that $\hat g_{\alpha\alpha} < C'/\alpha^3$ which is sufficient for finiteness of $l_2(\rho)$ for all $\rho$, and boundedness of $l_2$ on any compact subset of C. This is insufficient for our purposes, since $l_2(\rho)$ could grow unbounded as $|\rho| \to \infty$. We again appeal to a technical lemma whose proof we postpone:

Lemma 4. For all $\rho \in C$ such that $|\rho| > e_1 + 2$,

$$\int_{T^2} \frac{|\wp + \rho|^2}{(1 + \alpha^2 |\wp + \rho|^2)^2} < \frac{2}{\alpha^4} + \frac{\pi}{2\alpha^4} \log(1 + \alpha^2). \tag{91}$$

So for all $\rho$ outside the closed disk $D_{e_1+2}(0)$,

$$l_2(\rho) < \int_1^\infty d\alpha \left[\frac{2}{\alpha^4} + \frac{\pi}{2\alpha^4} \log(1 + \alpha^2)\right]^{1/2} = C < \infty, \tag{92}$$

C being a constant. Hence $l_2$ is bounded independent of $\rho$, and all points in $\hat G$ lie within a bounded distance of $\wp$, and hence, one another.

Proof of Lemma 4. Let $\rho \in C$ such that $|\rho| > e_1 + 2$. Since $\wp$ is an even function,

$$\hat g_{\alpha\alpha}(\alpha, \rho) = 2\int_H \frac{|\wp + \rho|^2}{(1 + \alpha^2 |\wp + \rho|^2)^2}, \tag{93}$$

where $H = [0, 1) \times [0, \frac{1}{2})$ is the “half torus” (the point is that $\wp$ is injective on H). Split H into two pieces $H_+ \sqcup H_-$, where $H_+ = \{z \in H : |\wp + \rho| > 1\}$. Now,

$$\int_{H_+} \frac{|\wp + \rho|^2}{(1 + \alpha^2 |\wp + \rho|^2)^2} < \frac{1}{\alpha^4}\int_{H_+} \frac{1}{|\wp + \rho|^2} < \frac{1}{\alpha^4}\int_{H_+} 1 < \frac{1}{\alpha^4}. \tag{94}$$

To estimate the contribution of the $H_-$ region, we perform a variable change $z \mapsto u = \wp(z)$ on $H_-$. Since $\wp$ is injective on $H_-$, this variable change is well defined provided $\wp$ has no critical (i.e. double valency) points in $H_-$. The transformed integration range $\wp(H_-)$ is a closed disk of unit radius centred on $-\rho$, so given that $|\rho| > e_1 + 2$, $\wp(H_-)$ contains none of $\{\infty, 0, \pm e_1\}$, and hence $H_-$ excludes all the double valency points. The Jacobian of the variable change is $|\wp'(z)|^{-2} = |4u(u^2 - e_1^2)|^{-1}$, so

$$\int_{H_-} \frac{|\wp + \rho|^2}{(1 + \alpha^2 |\wp + \rho|^2)^2} = \frac{1}{4}\int_{\wp(H_-)} \frac{du\,d\bar u}{|u|\,|u^2 - e_1^2|}\, \frac{|u + \rho|^2}{(1 + \alpha^2 |u + \rho|^2)^2}. \tag{95}$$

Now, for all $u \in \wp(H_-)$, $|u| \ge e_1 + 1 > 1$, and $|u \pm e_1| \ge ||u| - e_1| \ge 1$, so

$$\int_{H_-} \frac{|\wp + \rho|^2}{(1 + \alpha^2 |\wp + \rho|^2)^2} < \frac{1}{4}\int_{\wp(H_-)} du\,d\bar u\, \frac{|u + \rho|^2}{(1 + \alpha^2 |u + \rho|^2)^2} = \frac{\pi}{2}\int_0^1 dx\, \frac{x^3}{(1 + \alpha^2 x^2)^2} \quad (x := |u + \rho|)$$

$$= \frac{\pi}{4\alpha^4}\int_1^{1+\alpha^2} dy\, \frac{y - 1}{y^2} \quad (y := 1 + \alpha^2 x^2) \quad < \frac{\pi}{4\alpha^4}\log(1 + \alpha^2). \tag{96}$$

Using inequalities (94) and (96) in Eq. (93), the result immediately follows.
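The radial integral in (96) has an elementary closed form, which makes the final logarithmic bound easy to check numerically. The following sketch compares a midpoint-rule quadrature with that closed form and with the bound; the function names and the choice of quadrature are illustrative, not part of the paper.

```python
import math


def radial_integral(alpha, n=20000):
    """Midpoint rule for (pi/2) * int_0^1 x^3 / (1 + alpha^2 x^2)^2 dx."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += x ** 3 / (1.0 + alpha ** 2 * x ** 2) ** 2
    return 0.5 * math.pi * total * h


def closed_form(alpha):
    """Exact value via y = 1 + alpha^2 x^2: integral of (y - 1)/y^2 dy."""
    y1 = 1.0 + alpha ** 2
    return math.pi / (4.0 * alpha ** 4) * (math.log(y1) + 1.0 / y1 - 1.0)


def log_bound(alpha):
    """Right-hand side of (96)."""
    return math.pi / (4.0 * alpha ** 4) * math.log(1.0 + alpha ** 2)
```

Since $1/y_1 - 1 < 0$ for every $\alpha > 0$, the closed form is strictly smaller than the logarithmic bound, which is the inequality used in the last step of (96).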

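Proof of Lemma 3. The idea is similar to the proof of Lemma 4:

$$\int_{T^2} \frac{1}{(1 + |\alpha\wp + \sigma e^{i\psi}|^2)^2} = 2\left[\int_{H_+} \frac{1}{(1 + |\alpha\wp + \sigma e^{i\psi}|^2)^2} + \int_{H_-} \frac{1}{(1 + |\alpha\wp + \sigma e^{i\psi}|^2)^2}\right], \tag{97}$$

where again $H_+ \sqcup H_- = H$, the half torus, but now

$$H_+ = \{z \in H : |\alpha\wp + \sigma e^{i\psi}| > \sigma^{3/4}\}. \tag{98}$$

The $H_+$ integral is trivially bounded by $1/\sigma^3$. We make the same variable change $z \mapsto u = \wp(z)$ on $H_-$. Now $\wp(H_-)$ is a closed disk of radius $\sigma^{3/4}/\alpha$ centred on $\sigma e^{i(\psi+\pi)}/\alpha$. In order that $\wp(H_-)$ contain none of $\{\infty, 0, \pm e_1\}$, it suffices that $\sigma \ge \sigma_c$, where $\sigma_c$ is the real solution of

$$\sigma_c - \sigma_c^{3/4} = 2e_1. \tag{99}$$

To see this, note that $\forall u \in \wp(H_-)$,

$$|u| \ge \frac{\sigma - \sigma^{3/4}}{\alpha} \ge \sigma - \sigma^{3/4} \ge \sigma_c - \sigma_c^{3/4} = 2e_1, \tag{100}$$

where the restriction $\alpha \le 1$ has been used. So the variable change is well defined provided $\sigma \ge \sigma_c$. Recall that the Jacobian of the transformation is $|4u(u^2 - e_1^2)|^{-1}$. Now $\forall u \in \wp(H_-)$, $|u| \ge 2e_1$ as shown above. Hence

$$|u|^3 = |u|^2 |u| \ge 4e_1^2 |u|, \tag{101}$$

and

$$|u(u^2 - e_1^2)| \ge \bigl||u|^3 - e_1^2|u|\bigr| = |u|^3 - e_1^2|u| \ge |u|^3 - \frac{1}{4}|u|^3 = \frac{3}{4}|u|^3. \tag{102}$$

Thus,

$$\int_{H_-} \frac{1}{(1 + |\alpha\wp + \sigma e^{i\psi}|^2)^2} \le \frac{1}{3}\int_{\wp(H_-)} \frac{du\,d\bar u}{|u|^3}\, \frac{1}{(1 + |\alpha u + \sigma e^{i\psi}|^2)^2}. \tag{103}$$

Now let $\tilde\sigma_c$ be the real solution of

$$\tilde\sigma_c - \tilde\sigma_c^{3/4} = \frac{\tilde\sigma_c}{2}, \tag{104}$$

and define $\sigma_* = \sup\{\sigma_c, \tilde\sigma_c\}$. Then, provided $\sigma \ge \sigma_*$, for all $u \in \wp(H_-)$,

$$|u| \ge \frac{\sigma - \sigma^{3/4}}{\alpha} \ge \frac{\sigma}{2\alpha}. \tag{105}$$

This allows one to estimate the $|u|^{-3}$ part of the integrand of inequality (103), which still holds since $\sigma_* \ge \sigma_c$:

$$\int_{H_-} \frac{1}{(1 + |\alpha\wp + \sigma e^{i\psi}|^2)^2} \le \frac{8\alpha^3}{3\sigma^3}\int_{\wp(H_-)} \frac{du\,d\bar u}{(1 + |\alpha u + \sigma e^{i\psi}|^2)^2} = \frac{8\alpha^3}{3\sigma^3}\, \frac{2\pi}{\alpha^2}\int_0^{\sigma^{3/4}} dx\, \frac{x}{(1 + x^2)^2} \quad (x := |\alpha u + \sigma e^{i\psi}|)$$

$$< \frac{C\alpha}{\sigma^3} \le \frac{C}{\sigma^3}, \tag{106}$$

where the radial integral is bounded by its value $1/2$ over $[0, \infty)$, so that one may take $C = 8\pi/3$. The result immediately follows.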
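The thresholds defined by (99) and (104) are easy to evaluate. Equation (104) reduces to $\sigma^{1/4} = 2$ for positive $\sigma$, so $\tilde\sigma_c = 16$, while (99) must be solved numerically. The sketch below uses bisection and assumes the unit square lattice, for which the lemniscatic closed form $e_1 = \Gamma(1/4)^4/(8\pi)$ is used; this value and all function names are assumptions introduced for illustration.

```python
import math


def e1_unit_square():
    """Assumed value of e1 = wp(1/2) on the square lattice Z + iZ."""
    return math.gamma(0.25) ** 4 / (8.0 * math.pi)


def bisect(g, lo, hi, steps=200):
    """Locate the sign change of g on [lo, hi], assuming g(lo) < 0 < g(hi)."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sigma_c(e1):
    """Solve sigma - sigma^(3/4) = 2 e1, Eq. (99), for sigma >= 1."""
    hi = max(16.0, 4.0 * e1) + 1.0  # the left side exceeds 2 e1 here
    return bisect(lambda s: s - s ** 0.75 - 2.0 * e1, 1.0, hi)


def sigma_tilde_c():
    """Solve sigma - sigma^(3/4) = sigma / 2, Eq. (104), for sigma >= 1."""
    return bisect(lambda s: 0.5 * s - s ** 0.75, 1.0, 100.0)


def sigma_star(e1):
    """Threshold sigma_* = max(sigma_c, sigma_tilde_c) used in Lemma 3."""
    return max(sigma_c(e1), sigma_tilde_c())
```

Under the stated assumption on $e_1$, the root of (99) lies a little below 25, so $\sigma_*$ is determined by $\sigma_c$ rather than by $\tilde\sigma_c$.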

9. Conclusion

In this paper we have considered the low-energy dynamics of two $CP^1$ lumps moving on a torus in the framework of the geodesic approximation. We have proved that the degree 2 moduli space $M_2$ is homeomorphic to the left coset space $G/G_0$, where G is the eight-dimensional, noncompact Lie group $PSL(2, C) \times T^2$ and $G_0$ is a discrete subgroup of order 4. This result provides a good global parametrization of $M_2$ with unconstrained parameters, based on the Weierstrass $\wp$ function (this situation should be compared with other studies where $M_2$ was parametrized using the Weierstrass $\sigma$ function and constrained parameters [7, 20]), and allows a systematic description of the degree 2 static solutions, some of which display four rather than, as one might expect, two distinct energy peaks. By lifting the metric g on $M_2$ defined by the kinetic energy to $\tilde g$ on the covering space G, we showed that the dynamics decouples into a trivial “centre of mass” motion and a nontrivial relative motion of the lumps. This reduces the problem to geodesic motion in a 6-dimensional reduced covering space $(\hat G, \hat g)$. Two further results were proved concerning the Riemannian geometry of $(M_2, g)$, namely that the moduli space is geodesically incomplete and has finite diameter. These imply that static lumps can collapse to singularities in finite time, and that all static solutions are close to such


singularities. In addition, a two dimensional geodesic submanifold was identified, and its geometry and geodesics described in detail.

To make further progress in solving the geodesic problem for $(M_2, g)$ one would need to resort to numerical solution of the geodesic equation. Given the explicit parametrization of $M_2$, and that the metric components are integrals over a compact, two dimensional domain, such numerical work should be reasonably economical. In particular, there are two 3 dimensional geodesic submanifolds whose geodesic problems would be well suited to numerical study, and which should yield interesting lump dynamics. These are $\Sigma_P$ and $\Sigma_{PR}$, the projected fixed point sets of the isometries $P, PR : \hat G \to \hat G$. Explicitly,

$$\Sigma_P = \{\exp(i\psi\tau_2/2) \odot [\alpha(\wp(z) + \rho_1)] : \psi \in [0, 2\pi], \alpha \in R^+, \rho_1 \in R\},$$

$$\Sigma_{PR} = \{\exp(i\psi\tau_1/2) \odot [\alpha(\wp(z) + i\rho_2)] : \psi \in [0, 2\pi], \alpha \in R^+, \rho_2 \in R\}, \tag{107}$$

so both are internal rotation orbits, about (different) fixed axes, of the $\alpha(\wp + \rho)$ family, but with $\rho$ real ($\Sigma_P$) or purely imaginary ($\Sigma_{PR}$). On these submanifolds, therefore, the two lumps, when distinct, are constrained to lie either on the central cross and boundary of the unit square, or its diagonals, respectively. In either case, they can only scatter through 90°. In the case of $\Sigma_P$, for example, any geodesic which punctures any of the cylinders $\rho_1 = 0$, $\rho_1 = e_1$, $\rho_1 = -e_1$ at $\alpha$ much greater than $1/e_1$ gives rise to 90° scattering of the lumps. Similarly, any geodesic which punctures the $\rho_2 = 0$ cylinder in $\Sigma_{PR}$ gives rise to 90° scattering along the diagonals. Both these processes have been observed in numerical simulations of the field equation [7]. To understand the long time behaviour of the geodesics after the scattering event would require detailed numerical work. Other extensions of the present work would be interesting.
One can extend the geodesic incompleteness results proved here for $(M_2, g)$ on $T^2$ and elsewhere [24] for $(M_1, g)$ on $S^2$ to the general setting of $(M_n, g)$ for the $CP^1$ model on an arbitrary compact Riemann surface [21]. It may well be possible to similarly extend our result concerning the finite diameter of moduli space to the general setting. Also, one would expect that $(M_2, g)$ has finite volume, as well as diameter (although neither guarantees the other), and perhaps this can be established by making refined versions of estimates such as those in Lemmas 2, 3 and 4.

Finally, it should be emphasised that all our results concern an approximation to the field theory. While this has proved remarkably successful in all situations where it has been tested, one would ideally like rigorous analysis to back up physical intuition. Given the singularity of the geometry of moduli space for the planar $CP^1$ model, the model on the torus provides an ideal starting point for an analysis fashioned after Stuart’s work on vortices and monopoles [26].

Acknowledgement. I would like to thank Sharad Agnihotri, Jay Handfield and Carlo Morpurgo for several helpful discussions.

References

1. Atiyah, M.F., Hitchin, N.J.: The Geometry and Dynamics of Magnetic Monopoles. Princeton NJ: Princeton University Press, 1988
2. Beardon, A.F.: A Primer on Riemann Surfaces. Cambridge: Cambridge University Press, 1984, p. 89
3. Beardon, A.F.: op. cit., p. 81
4. Belavin, A.A., Polyakov, A.M.: Metastable states of two-dimensional isotropic ferromagnets. JETP Lett. 22, 245–247 (1975)
5. Choquet-Bruhat, Y., DeWitt-Morette, C., Dillard-Bleick, M.: Analysis, Manifolds and Physics, Part I. Amsterdam: North-Holland, 1982, p. 13
6. Choquet-Bruhat, Y., DeWitt-Morette, C., Dillard-Bleick, M.: op. cit., p. 43
7. Cova, R.J., Zakrzewski, W.J.: Soliton scattering in the O(3) model on a torus. Preprint hep-th/9706166 (1997)
8. Du Val, P.: Elliptic Functions and Elliptic Curves. Cambridge: Cambridge University Press, 1973, p. 13
9. Gallot, S., Hulin, D., Lafontaine, J.: Riemannian Geometry. Berlin: Springer-Verlag, 1987, p. 13
10. Gallot, S., Hulin, D., Lafontaine, J.: op. cit., p. 58
11. Guillemin, V., Pollack, A.: Differential Topology. Englewood Cliffs NJ: Prentice-Hall, 1974, p. 146
12. Hitchin, N.J., Manton, N.S., Murray, M.K.: Symmetric monopoles. Nonlinearity 8, 661–692 (1995)
13. Knopp, K.: Theory of Functions, Part 2. New York: Dover, 1947, p. 77
14. Knopp, K.: op. cit., pp. 86–91
15. Lawden, D.F.: Elliptic Functions and Applications (Chap. 6). New York: Springer-Verlag, 1989
16. Leese, R.A.: Low-energy scattering of solitons in the $CP^1$ model. Nucl. Phys. B344, 33–72 (1990)
17. Manton, N.S.: A remark on the scattering of BPS monopoles. Phys. Lett. 110B, 54–56 (1982)
18. Myers, E., Rebbi, C., Strilka, R.: Study of the interaction and scattering of vortices in the abelian Higgs (or Ginzburg-Landau) model. Phys. Rev. D45, 1355–1364 (1992)
19. Penrose, R., Rindler, W.: Spinors and space-time, Vol. 1. Cambridge: Cambridge University Press, 1984, pp. 18–21
20. Richard, J-L., Rouet, A.: The $CP^1$ model on the torus. Nucl. Phys. B211, 447–464 (1983)
21. Sadun, L.A., Speight, J.M.: Geodesic incompleteness in the $CP^1$ model on a compact Riemann surface. To appear in Lett. Math. Phys.
22. Samols, T.M.: Vortex scattering. Commun. Math. Phys. 145, 149–179 (1992)
23. Schwerdtfeger, H.: Geometry of Complex Numbers. New York: Dover, 1979, p. 41
24. Speight, J.M.: Low-energy dynamics of a $CP^1$ lump on the sphere. J. Math. Phys. 36, 796–813 (1995)
25. Strachan, I.A.B.: Low-velocity scattering of vortices in a modified Abelian Higgs model. J. Math. Phys. 33, 102–110 (1992)
26. Stuart, D.: Dynamics of abelian Higgs vortices in the near Bogomolny regime. Commun. Math. Phys. 159, 51–91 (1994)
27. Thurston, W.P.: Three-Dimensional Geometry and Topology. Princeton NJ: Princeton University Press, 1997, pp. 153–157
28. Ward, R.S.: Slowly moving lumps in the $CP^1$ model in (2 + 1) dimensions. Phys. Lett. 158B, 424–428 (1985)
29. Zakrzewski, W.J., Piette, B.: Shrinking of solitons in the (2+1)-dimensional $S^2$ sigma model. Nonlinearity 9, 897–910 (1996)

Communicated by A. Jaffe

Commun. Math. Phys. 194, 541 – 567 (1998)

Communications in Mathematical Physics


Existence of Gelling Solutions for Coagulation-Fragmentation Equations

Intae Jeon

Department of Mathematics, Ohio State University, Columbus, Ohio 43210, USA

Received: 25 April 1997 / Accepted: 4 November 1997

Abstract: We study the Smoluchowski coagulation-fragmentation equation, which is an infinite set of non-linear ordinary differential equations describing the evolution of a mono-disperse system of particles in a well stirred solution. Approximating the solutions of the Smoluchowski equations by a sequence of finite Markov chains, we investigate the qualitative behavior of the solutions. We determine a device on the finite chains which can detect the gelation phenomena – the density dropping phenomena. It shows how the gelation phenomena are reflected on the sequence of finite Markov chains. Using this device, we determine various types of gelation kernels and get the bounds of gelation times.

0. Introduction

The Smoluchowski coagulation-fragmentation equation is an infinite set of non-linear ordinary differential equations describing the evolution of a mono-disperse system of particles in a well stirred solution, given by

\[
\dot C_t(j) = \frac{1}{2} \sum_{k=1}^{j-1} \{K(j-k,k)\,C_t(j-k)\,C_t(k) - F(j-k,k)\,C_t(j)\} - \sum_{k=1}^{\infty} \{K(j,k)\,C_t(j)\,C_t(k) - F(j,k)\,C_t(j+k)\}, \tag{1}
\]

for $j = 1, 2, 3, \cdots$, where $C_t(j) \ge 0$ is the expected number of $j$-clusters (clusters consisting of $j$ particles) per unit volume, and $K$, $F$ are nonnegative symmetric functions: $K(i,j)$ is the coagulation rate of an $i$-cluster with a $j$-cluster, and $F(i,j)$ is the rate at which an $(i+j)$-cluster breaks up into an $i$-cluster and a $j$-cluster [Sm, D, BCC, vE3]. Here particles may coagulate to form clusters of $k$ particles for $k \ge 1$, and large clusters may fragment into smaller ones. The equations describe the dynamics of the density of $k$-clusters per unit volume based on the rates at which these events take place


assuming only second order reactions, that is, only two clusters may coagulate at a time and when a cluster fragments, it breaks into exactly two smaller clusters. Since phenomena such as polymerization, cloud formation, star formation, and binary alloys follow these dynamics, the model has many applications in science, for example in astrophysics, atmospheric physics, colloidal chemistry and polymer science [Sm, S, Sc, D]. Many studies have been devoted to these models recently, including deterministic and stochastic models, especially for the pure coagulation case ($F(i, j) \equiv 0$) [HEZ, LT, BP, BW, vE1, vE2]. Most recently, while we were preparing this monograph, Aldous, in his preprint, surveyed the broad scientific literature in this area and raised many interesting questions. Some direct and indirect answers are contained in this paper [A].

One expects the total density of particles ($\rho = \sum_{i=1}^{\infty} iC_t(i)$) to be conserved. However, explicit solutions for the case $K(i, j) = ij$ show that this is not always true. It may happen that the total density of particles decreases after a finite time, a phenomenon interpreted as gelation, for it is taken to signal the formation of a cluster of infinite size. We treat the problem of gelation probabilistically by approximating the solutions of the Smoluchowski equations by a sequence of finite state Markov chains. This is a type of Law of Large Numbers. By making estimates on the approximating Markov chains, we derive conditions under which the Smoluchowski equations have a solution which exhibits the gelation phenomenon, and we derive explicit bounds on the gelation time.

This paper is organized as follows: After this introduction, in Sect. 1, we construct finite Markov chains on $l_2$ space, which will be used to approximate the solution of the Smoluchowski equation. Also, we define various types of gelation. In Sect. 2, we state the main theorems and corollaries; Theorems 1 and 2 show the tightness of the chains and the existence of solutions for the Smoluchowski equation. Theorems 3 and 4 show when such gelation occurs. The proofs of Theorems 1 and 2 are given in Sect. 3. In Sect. 4, we develop machinery to detect gelation phenomena. We see how the gelation phenomena are reflected on finite chains. Using this machinery we prove Theorems 3 and 4 in Sect. 5.
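As an illustration of the structure of Eq. (1), the following Python sketch evaluates its right-hand side for a system truncated at a maximal cluster size N. The truncation (every reaction that would involve a cluster larger than N is dropped) and the name `smoluchowski_rhs` are assumptions made for this example only; they are not part of the paper. Under this particular truncation the total density $\sum_j jC(j)$ has zero time derivative, which is the finite-size counterpart of the conservation one expects before gelation.

```python
def smoluchowski_rhs(C, K, F):
    """Right-hand side of Eq. (1), truncated at cluster size N = len(C).

    C[j-1] is the density of j-clusters, j = 1..N.  K and F are symmetric
    functions of two positive integers (coagulation and fragmentation rates).
    Reactions involving a cluster larger than N are dropped (an assumption
    of this sketch).
    """
    N = len(C)

    def c(j):
        return C[j - 1]

    rhs = []
    for j in range(1, N + 1):
        # First sum in Eq. (1): creation of j-clusters by coagulation,
        # minus break-up of j-clusters into a (j-k)- and a k-cluster.
        gain = 0.0
        for k in range(1, j):
            gain += K(j - k, k) * c(j - k) * c(k) - F(j - k, k) * c(j)
        # Second sum in Eq. (1): loss of j-clusters by coagulation with a
        # k-cluster, minus creation of j-clusters by break-up of (j+k)-clusters.
        loss = 0.0
        for k in range(1, N - j + 1):
            loss += K(j, k) * c(j) * c(k) - F(j, k) * c(j + k)
        rhs.append(0.5 * gain - loss)
    return rhs


def density_rate(C, K, F):
    """Time derivative of the total density sum_j j*C(j) in the truncated system."""
    return sum(j * r for j, r in enumerate(smoluchowski_rhs(C, K, F), start=1))
```

For instance, `density_rate([1.0, 0.5, 0.25], lambda i, j: i * j, lambda i, j: 0.0)` evaluates the density derivative for the multiplicative kernel $K(i,j) = ij$ without fragmentation.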

1. Construction

In this section we construct a sequence of finite state Markov chains associated to the rate constants $K(i, j)$ and $F(i, j)$, $i, j \ge 1$, in the Smoluchowski coagulation-fragmentation equation. In the $n$th Markov chain, there are on the order of $n$ particles, each of a size inversely proportional to $n$, which can coagulate to form clusters. These larger clusters can coagulate among themselves or fragment to form smaller clusters, at rates proportional to $n$, determined by $K(i, j)$ and $F(i, j)$. This is a law of large numbers, or Euler scaling. With this scaling, the Markov chains can be thought of as discrete, stochastic approximations to solutions of the Smoluchowski equations.

Notation.
(a) Let $\mathbb{N} = \{0, 1, 2, \cdots\}$, $\mathbb{N}^+ = \{1, 2, 3, \cdots\}$.
(b) Let $l_2 = \{\eta \in \mathbb{R}^{\mathbb{N}^+} : \|\eta\|_{l_2} = (\sum_{k=1}^{\infty} |\eta(k)|^2)^{1/2} < \infty\}$.
For $\rho > 0$,
(c) let $E^\rho = \{\eta \in l_2 : \eta(k) \ge 0 \text{ for all } k \text{ and } \sum_{k=1}^{\infty} k\eta(k) \le \rho\}$,
(d) let $E_n^\rho = \{\frac{\rho}{n}\eta \in l_2 : \eta \in \mathbb{N}^{\mathbb{N}^+},\ \sum_{k=1}^{\infty} k\eta(k) = n\}$.


For $\eta \in E^\rho$, $m, n \ge 1$,
(e) let $\|\eta\| = \sum_{k=1}^{\infty} k\eta(k)$,
(f) let $\|\eta\|_n = \sum_{k=1}^{n} k\eta(k)$,
(g) let $\|\eta\|_m^n = \sum_{k=m}^{n} k\eta(k)$.
(h) $[\cdot]$ represents the greatest integer function.
(i) Let $\{e_i\}_{i=1}^{\infty}$ be the standard basis of $l_2$.

Remark. The density functional $\eta \to \|\eta\| = \sum_{k=1}^{\infty} k\eta(k)$ is not continuous on $(E^\rho, \|\cdot\|_{l_2})$. For example, let $\eta_n = \frac{\rho}{n} e_n$. Then $\|\eta_n\| = \rho$ for every $n$, yet $\lim_{n\to\infty} \eta_n = 0$. However, each partial sum $\eta \to \|\eta\|_l = \sum_{k=1}^{l} k\eta(k)$ is continuous.

Let $\{K(i, j)\}_{i,j=1}^{\infty}$, $\{F(i, j)\}_{i,j=1}^{\infty}$ be nonnegative, symmetric sequences, that is, $K(i, j) = K(j, i)$ and $F(i, j) = F(j, i)$. For all bounded functions $f$ on $E_n^\rho$ consider the generator $G_n$ given by

\[
G_n f(\eta) = \frac{n}{2\rho} \sum_{i+j\le n} \Big[\{f(\eta + \Delta^n_{ij}) - f(\eta)\}\,K(i,j)\,\eta(i)\Big(\eta(j) - \frac{\rho}{n}\delta_{ij}\Big) + \{f(\eta - \Delta^n_{ij}) - f(\eta)\}\,F(i,j)\,\eta(i+j)\Big], \tag{2}
\]

where $\Delta^n_{ij} = \frac{\rho}{n}(e_{i+j} - e_i - e_j)$, and $\delta_{ij} = 1$ if $i = j$, and $0$ if $i \ne j$. Let $X_t^{n,\rho}$ denote the corresponding Markov chain on $E_n^\rho$. Informally we may describe the dynamics as follows: the process waits at state $\eta$ for an exponentially distributed amount of time with parameter

\[
\lambda_n(\eta) \doteq \frac{n}{2\rho} \sum_{i+j\le n} \Big\{K(i,j)\,\eta(i)\Big(\eta(j) - \frac{\rho}{n}\delta_{ij}\Big) + F(i,j)\,\eta(i+j)\Big\},
\]

then jumps to state $\eta + \Delta^n_{ij}$ with probability

\[
\frac{n}{2\rho\lambda_n(\eta)}\,K(i,j)\,\eta(i)\Big(\eta(j) - \frac{\rho}{n}\delta_{ij}\Big),
\]

or to state $\eta - \Delta^n_{ij}$ with the complementary probability

\[
\frac{n}{2\rho\lambda_n(\eta)}\,F(i,j)\,\eta(i+j),
\]

where $i, j \ge 1$, $i + j \le n$. In any event, since the state space consists of finitely many points for all $n$, i.e., $|E_n^\rho| < \infty$, there is a unique well defined pure jump process on $E_n^\rho$ for each $n$. This process is strong Markov and has the characteristic property that for any bounded function $f$ on $E_n^\rho$,

\[
f(X_t^{n,\rho}) - \int_0^t G_n f(X_s^{n,\rho})\,ds \tag{3}
\]

is a martingale.

Definition 1. A stochastic coagulation-fragmentation system with density $\rho$ consists of the triple $[\{K(i, j)\}_{i,j=1}^{\infty}, \{F(i, j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ such that
(a) $\{K(i, j)\}_{i,j=1}^{\infty}$, $\{F(i, j)\}_{i,j=1}^{\infty}$ are nonnegative, symmetric sequences, which are called coagulation and fragmentation kernels, respectively.


(b) $X_t^{n,\rho}$ is the pure jump process on the probability space $(\Omega_n, \mathcal{A}, P_n)$ whose Markov generator is $G_n$.
(c) $X_0^{n,\rho} = \eta^n$, for $\eta^n \in E_n^\rho$, and $\eta^n \to \eta_0$ in $(E^\rho, \|\cdot\|_{l_2})$, where $\|\eta_0\|_{l_2} = \rho$.

Note. From now on, if there is no confusion, we will drop $\rho$ from the index.

Let $[\{K(i, j)\}_{i,j=1}^{\infty}, \{F(i, j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ be a stochastic coagulation-fragmentation system. For $0 < \alpha \le 1$, let $\tau_n^\alpha$ be the first hitting time of cluster size greater than or equal to $\alpha n$, i.e.,

\[
\tau_n^\alpha \doteq \inf\{t : X_t^n(k) > 0 \text{ for some } k \ge \alpha n\}.
\]

Note that if $\alpha < \beta$, then $\tau_n^\alpha \le \tau_n^\beta$ a.s.

Definition 2.
(a) The strong gelation time for a coagulation-fragmentation system is defined by
\[
t_{sg} = \inf\{t > 0 : \exists\, 0 < \alpha \le 1 \text{ such that } \limsup_{n\to\infty} P\{\tau_n^\alpha \le t\} > 0\}.
\]
(b) The gelation time of a solution $C_t$ of the Smoluchowski equations is defined by
\[
t_g = \inf\{t > 0 : \|C_t\| < \|C_0\|\}.
\]
(c) The stochastic gelation (sto-gel) time of a weak limit $X$ of $X^{n,\rho}$ is defined by
\[
T_g = \inf\{t > 0 : P\{\|X_t\| < \|X_0\|\} > 0\}.
\]
(d) We say that strong gelation occurs, gelation occurs and sto-gel occurs if $t_{sg} < \infty$, $t_g < \infty$ and $T_g < \infty$, respectively.

Remark. In the case of pure coagulation ($F(i, j) \equiv 0$), strong gelation always implies sto-gel, but it is not clear whether this holds for nonzero fragmentation kernels. We will deal with the relations between strong gelation and sto-gel in Sect. 4.2.

The Becker–Döring equation is a special case of the Smoluchowski coagulation-fragmentation equation which allows only coagulation and fragmentation involving single particles (see [BCC]). We generalize this as follows:

Definition 3. Let $[\{K(i, j)\}_{i,j=1}^{\infty}, \{F(i, j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ be a stochastic coagulation-fragmentation system. We say that it is a generalized Becker–Döring system of degree $r$ (or $r$-generalized Becker–Döring system) if there is an integer $r > 0$ such that $F(i, j) = 0$ for all $i, j$ with $\min(i, j) > r$.
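The chain of this section can be simulated directly, which may help to see how the hitting time $\tau_n^\alpha$ is read off a sample path. The following Python sketch is a plain Gillespie-type simulation under stated assumptions: the chain is started from $n$ single particles, the state is stored as integer cluster counts $m(k)$ with $\eta(k) = \frac{\rho}{n} m(k)$, and the function name `simulate_chain` and its parameters are hypothetical. The per-pair rates are those of the generator $G_n$ rewritten in terms of counts; nothing here is taken from the paper's proofs.

```python
import random


def simulate_chain(n, rho, K, F, alpha, t_max, seed=0):
    """Simulate the n-th chain from n monomers (an assumed initial state).

    Returns (m, t, tau): m[k] is the final number of k-clusters, t is the
    time of the last jump, and tau is the first time a cluster of size
    >= alpha*n was present (None if none appeared before the run stopped).
    """
    rng = random.Random(seed)
    m = [0] * (n + 1)            # index 0 is unused; eta(k) = (rho/n) * m[k]
    m[1] = n
    t = 0.0
    tau = 0.0 if alpha * n <= 1 else None
    while True:
        events = []
        for i in range(1, n):
            for j in range(1, n - i + 1):
                # (n/2rho) K(i,j) eta(i) (eta(j) - (rho/n) delta_ij), in counts
                coag = rho / (2.0 * n) * K(i, j) * m[i] * (m[j] - (i == j))
                # (n/2rho) F(i,j) eta(i+j), in counts
                frag = 0.5 * F(i, j) * m[i + j]
                if coag > 0:
                    events.append((coag, i, j, 1))
                if frag > 0:
                    events.append((frag, i, j, -1))
        total = sum(e[0] for e in events)
        if total == 0:
            break                # absorbing state: no jump is possible
        dt = rng.expovariate(total)
        if t + dt > t_max:
            break                # the next jump would fall after t_max
        t += dt
        # Choose one event with probability proportional to its rate.
        u = rng.random() * total
        chosen = events[-1]      # fallback in case of floating-point rounding
        for e in events:
            u -= e[0]
            if u < 0:
                chosen = e
                break
        _, i, j, sign = chosen
        # sign = +1: clusters i and j merge; sign = -1: cluster i+j splits.
        m[i] -= sign
        m[j] -= sign
        m[i + j] += sign
        if tau is None and any(m[k] > 0 for k in range(1, n + 1) if k >= alpha * n):
            tau = t
    return m, t, tau
```

A call such as `simulate_chain(30, 1.0, lambda i, j: i * j, lambda i, j: 0.0, 0.5, float("inf"))` runs a pure-coagulation chain until a single cluster remains. Estimating $P\{\tau_n^\alpha \le t\}$ would require many independent runs with different seeds.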

2. Statement of the main theorems

Theorem 1. (Tightness) Let $[\{K(i, j)\}_{i,j=1}^{\infty}, \{F(i, j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ be a stochastic coagulation-fragmentation system. Suppose the kernels satisfy
(a) (TC1) $\sup_{\eta\in E} \sum_{i,j\ge 1} K(i, j)\eta(i)\eta(j) < \infty$,
(b) (TC2) $\sup_{\eta\in E} \sum_{i,j\ge 1} F(i, j)\eta(i + j) < \infty$,
then the laws of $\{X^n\}$ form a tight sequence.



The above theorem shows the existence of weak limits. Two natural questions then arise: whether the weak limits solve the Smoluchowski coagulation-fragmentation equation, and whether they show the density dropping phenomenon (sto-gel). The following theorems answer those questions for some interesting classes of coagulation and fragmentation kernels.

Theorem 2. Let $[\{K(i, j)\}_{i,j=1}^{\infty}, \{F(i, j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ be a stochastic coagulation-fragmentation system. Suppose the kernels satisfy the conditions
(a) $\lim_{i+j\to\infty} \frac{K(i,j)}{ij} = 0$,
(b) there exists $G(i+j)$ such that $F(i, j) \le G(i+j)$ for all $i, j$ and $\lim_{i+j\to\infty} G(i+j) = 0$,
then there exists a weak limit $X_t$ of $X_t^n$, and it solves the system of the integral version of the Smoluchowski equation on any interval $t \in [0, T]$, $T < \infty$.

Theorem 3. Let $[\{K(i, j)\}_{i,j=1}^{\infty}, \{F(i, j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ be a stochastic coagulation-fragmentation system. Suppose the kernels satisfy
(a) $\epsilon(ij)^\alpha \le K(i, j) \le M ij$ for some $\frac{1}{2} < \alpha \le 1$, $\epsilon > 0$, $M < \infty$,
(b) $F(i, j) \equiv 0$,
then there exists a weak limit $X_t$ of $X_t^n$, and $T_g \le \frac{C(\alpha)}{\epsilon\rho}$, where C(α) =

inf

0 β then strong gelation occurs with tsg ≤ (ρ−ρ and Tg ≤ tsg . c)

Combining Theorems 4 and 5 and Theorem 6 in Sect. 4.2, we have

Corollary 2. Suppose coagulation and fragmentation kernels satisfy
(a) $\epsilon ij \le K(i, j) \le M ij$, for some $0 < \epsilon \le M < \infty$,
(b) $0 \le F(i, j) \le \frac{C}{i+j}$, for some $0 \le C < \infty$.
If $\rho > \frac{C}{\epsilon}$, then sto-gel occurs and $T_g \le \frac{4}{2\epsilon\rho - C}$.

Corollary 3. Suppose coagulation and fragmentation rates satisfy
(a) $\epsilon ij \le K(i, j) \le M ij$, for some $0 < \epsilon \le M < \infty$,
(b) $F(i, j) \le C\, i \vee j$, and there exists $r > 0$ such that $F(i, j) = 0$ if $i \wedge j > r$.
If $\rho > \frac{Cr}{\epsilon}$, then sto-gel occurs and $T_g \le \frac{2}{\epsilon\rho - Cr}$.

The next theorem is a device on the finite chains which can detect the gelation phenomena.

Theorem 5. Let $[\{K(i, j)\}_{i,j=1}^{\infty}, \{F(i, j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ be a stochastic coagulation-fragmentation system, and suppose there is a subsequence $X^{n_k}$ which converges weakly to $X$. For any fixed $t > 0$, the following are equivalent.

I. There exists a nondecreasing function $\phi : \mathbb{N}^+ \to \mathbb{N}^+$ such that $\phi(n) \le n$, $\lim_{n\to\infty} \phi(n) = \infty$, and there exists $\epsilon > 0$ such that
\[
\limsup_{k\to\infty} P\{\|X_t^{n_k}\|_{\phi(n_k)} \le \rho - \epsilon\} > 0.
\]

II. There exists $\epsilon > 0$ such that $P\{\|X_t\| \le \rho - \epsilon\} > 0$.

3. Tightness and Weak Limits (Theorems 1 and 2)

3.1. Preliminaries.

Lemma 3.1. (a) $E^\rho$ is a compact subset of $l_2$. (b) $E_n^\rho$ is a finite subset of $E^\rho$.

Proof. (a) Note that if $\eta \in E^\rho$ then $\sup_{k\ge 1} k\eta(k) \le \rho$, hence $\|\eta\|_{l_2}^2 = \sum_{k=1}^{\infty} \eta(k)^2 \le \sum_{k=1}^{\infty} \frac{\rho^2}{k^2} \le c\rho^2 < \infty$. Furthermore, if $\eta_n \in E^\rho$ and $\eta_n \to \eta$ in $l_2$, then for all $k \ge 1$, $\eta_n(k) \to \eta(k) \ge 0$. Thus for each $l \ge 1$, $\sum_{k=1}^{l} k\eta(k) = \lim_{n\to\infty} \sum_{k=1}^{l} k\eta_n(k) \le \rho$, and so $\eta \in E^\rho$. Thus $E^\rho$ is a closed subset of $l_2$. As for compactness, let $A : l_2 \to l_2$ be the diagonal operator given on the standard orthonormal basis $e_k = (0, 0, \cdots, 0, 1, 0, \cdots)$ by $Ae_k = \frac{\rho}{k} e_k$. It is clear that $A$ is a compact operator since it has a complete set of eigenvectors of multiplicity one and its eigenvalues accumulate only at 0. For $\eta \in E^\rho$, define $\xi$ by $\xi(k) = \frac{k}{\rho}\eta(k)$. Then $\eta = A\xi$ and $\xi$ has norm

\[
\|\xi\|_{l_2}^2 = \sum_{k=1}^{\infty} \Big(\frac{k}{\rho}\eta(k)\Big)^2 \le \rho^{-2} \sup_{k\ge 1} k\eta(k) \sum_{k=1}^{\infty} k\eta(k) \le 1.
\]

Thus, $E^\rho$ is a closed subset of a compact set, namely the image under $A$ of the unit ball, and so $E^\rho$ is compact.
(b) It is clear from the definitions.

Lemma 3.2. For any nonnegative function $K$ on $\mathbb{N}^+$ such that $\lim_{i\to\infty} \frac{K(i)}{i} = 0$, the function $\eta \to \sum_{i=1}^{\infty} K(i)\eta(i)$ is continuous on $(E^\rho, \|\cdot\|_{l_2})$.

Proof. Suppose $\eta^n \in E^\rho$ and $\eta^n \to \eta$ in $l_2$. For any $\epsilon > 0$, choose $N$ such that $\frac{K(i)}{i} < \frac{\epsilon}{2\rho}$ for all $i > N$. Then

\[
\Big|\sum_{i=1}^{\infty} K(i)\{\eta(i) - \eta^n(i)\}\Big| \le \sum_{i\le N} K(i)|\eta(i) - \eta^n(i)| + \sum_{i>N} K(i)|\eta(i) - \eta^n(i)|.
\]

But the first term is a finite sum and it goes to 0 as $n \to \infty$. The second term becomes

\[
\sum_{i>N} \frac{K(i)}{i}\,|i\eta(i) - i\eta^n(i)| < \frac{\epsilon}{2\rho}(2\rho) = \epsilon.
\]

The following lemma states a necessary and sufficient condition for tightness (we refer to [JM]).

Lemma 3.3. Let $(X^n)_{n=1}^{\infty}$ be a sequence of processes defined on their respective probability spaces $(\Omega_n, \mathcal{A}, P_n)$ with values in the complete separable metric space $H$. Then the sequence $\{\tilde P_n\}$ of laws of the processes $(X^n)$ forms a tight sequence if and only if [T1] and [T2] hold, where

[T1] For any $t$ in some dense subset $T$ of $\mathbb{R}^+$, the laws of the random variables $(X_t^n)$ form a tight sequence of laws in $H$.

[T2] For every $N > 0$, $\beta > 0$, $\epsilon > 0$, there exists $\delta > 0$ such that
\[
\limsup_{n\to\infty} P_n\{\omega \in \Omega_n : W^N(X^n(\cdot, \omega), \delta) > \beta\} \le \epsilon, \tag{5}
\]

. where W N (X, δ) = inf 5δ maxti ∈5 supti ≤s Jn−1 : Xs 6= Xs− }, Xn = X n (Jn ), for all n ≥ 0, with J0 = 0. Then Jn = TX0 + TX1 + · · · + TXn−1 , and it represents the holding time in the successive states visited by Xtn . (See p. 18, [An].) 1 , and let Now let {Ti } be a sequence of i.i.d exponential r.v.’s with mean cn S n = T1 + T 2 + · · · + T n , then Sn is the holding time in successive states visited by the Poisson process Yt with parameter cn. We can verify easily that Jn stochastically dominates Sn . i.e., for any k, P {Jk > k} ≥ P {Sn > k}. Now, for given N > 0, > 0, β > 0, pick δ > 0 such that δ < √β5ρc and δl = N for some integer l. Let 5 be the partition t0 = 0 < t1 = δ < t2 = 2δ < · · · < tl = N , and let Zi =

sup

ti ≤s β} 0≤i≤l−1

= P {Z1 > β or Z2 > β · · · or Zl−1 > β} ≤ l max P {Zi > β}. i

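Note that the jump size of $X_t^n$ is bounded by $\frac{\sqrt{5}\rho}{n}$, since

\[
d(X_{t-}^n, X_t^n) = \|X_{t-}^n - X_t^n\|_{l_2} \le \sup_{i+j\le n} \|\Delta^n_{ij}\|_{l_2} = \frac{\sqrt{5}\rho}{n}.
\]

Thus, in order for $Z_i > \beta$, we need more than $[\frac{\beta}{\sqrt{5}\rho} n]$ jumps on the interval $[i\delta, (i+1)\delta)$, since if not,

\[
Z_i \le \Big[\frac{\beta}{\sqrt{5}\rho} n\Big] \frac{\sqrt{5}\rho}{n} \le \beta.
\]

Therefore,

\[
\begin{aligned}
P_n\{W^N(X^n(\cdot, \omega), \delta) \ge \beta\} &\le l \max_i P\{Z_i > \beta\} \\
&\le l\, P_n\{J_{[\frac{\beta n}{\sqrt{5}\rho}]} \le \delta\} \\
&\le l\, P\{T_{[\frac{\beta n}{\sqrt{5}\rho}]} \le \delta\} \\
&= l\, P\Big\{Y_\delta \ge \Big[\frac{\beta n}{\sqrt{5}\rho}\Big]\Big\}.
\end{aligned}
\]

But the last term goes to 0 as $n$ goes to infinity since $\delta c n < \frac{\beta n}{\sqrt{5}\rho}$ and the tail of a Poisson random variable goes to 0 exponentially fast. Therefore,

\[
\lim_n P_n\{W^N(X^n(\cdot, \omega), \delta) > \beta\} = 0, \quad\text{i.e.,}\quad \limsup_n P_n\{\omega \in \Omega_n : W^N(X^n(\cdot, \omega), \delta) > \beta\} = 0.
\]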

3.3. The weak limits.

Proof of Theorem 2. For any bounded function $f$ which has a continuous bounded directional derivative in the sense that $\nabla f \doteq (\nabla_{e_1} f, \nabla_{e_2} f, \cdots, \nabla_{e_n} f, \cdots)$ is $l_2$ bounded and continuous, i.e., $\sup_{\eta\in E} \|(\nabla f)(\eta)\|_{l_2} < \infty$, define

\[
M_t^{n,f} \doteq f(X_t^n) - f(X_0^n) - \int_0^t G_n f(X_s^n)\,ds. \tag{7}
\]

Then $M^{n,f}$ is a right continuous $P_n$ martingale and $M_0^{n,f} = 0$ for all $n$. Let $\tilde P_n$ be the law of $X^n$ in $D([0, T] : E)$, and set $\xi_t(\omega) = \omega(t)$ for each $\omega \in D([0, T] : E)$, where $D([0, T] : E)$ is the set of $E$-valued cadlag functions defined on $[0, T]$ equipped with the Skorohod topology. Set

\[
\Phi_t^f(\omega) = f(\xi_t(\omega)) - f(\xi_0(\omega)) - \frac{1}{2}\int_0^t \sum_{i,j\ge 1} \{\nabla f(\xi_s(\omega)) \cdot \Delta^0_{i,j}\, K(i,j)\,\xi_s(\omega)(i)\big(\xi_s(\omega)(j) - \delta_{ij}\hat\xi_s(\omega)\big) - \nabla f(\xi_s(\omega)) \cdot \Delta^0_{i,j}\, F(i,j)\,\xi_s(\omega)(i+j)\}\,ds, \tag{8}
\]

where $\Delta^0_{i,j} = e_{i+j} - e_i - e_j$, and $\hat\xi_s(\omega) = \frac{\rho}{n}$ if $\omega(s) \in E_n$, and $0$ if $\omega(s) \notin \bigcup_{n=1}^{\infty} E_n$. Then for any $t$, $0 \le t < T$,

\[
\Phi_t^f(X^n) = M_t^{n,f} - \phi_n(t),
\]


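where $\|\phi_n\|_\infty \to 0$ as $n \to \infty$. On $D([0, T] : E)$, consider the right continuous canonical filtration $(D_t^0)_{t\ge 0}$ generated by $\xi$. For any $s, t < T$ and for any $F \subset D_s = \bigcap_{s'>s} D_{s'}^0$, let $\psi^f(\omega) = 1_F(\omega)[\Phi_t^f(\omega) - \Phi_s^f(\omega)]$. Then

\[
\tilde E_n\{\psi^f(\xi)\} \le \|\phi_n\|_\infty. \tag{9}
\]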

Consider any convergent sequence $(\tilde P_{n_k})_{k=1}^{\infty}$ and let $\tilde P$ be its limit. For any $\epsilon > 0$, let $A_\epsilon \doteq \{\omega : \exists s \text{ such that } d(\omega_{s-}, \omega_s) > \epsilon\}$, then $\tilde P(A_\epsilon) = 0$, since $A_\epsilon \subset (A^c_{\epsilon/2})^c = G$ and since $G$ is open, $\tilde P(G) \le \liminf_{n_k\to\infty} \tilde P_{n_k}(G) = 0$. Therefore, the weak limit has continuous sample paths, i.e., $\mathrm{Supp}(\tilde P) \subset C([0, T] : E) \subset D([0, T] : E)$, where $C([0, T] : E)$ is the set of $E$-valued continuous functions. This implies that for any $\omega \in \mathrm{Supp}(\tilde P)$, convergence to $\omega$ in the Skorohod topology in $D([0, T] : E)$ is reduced to uniform convergence. Using Lemma 3.2, under the hypothesis of this theorem, we can prove that, for $\tilde P$-almost all $\omega$, the mapping $\omega \to \psi^f(\omega)$ is continuous at $\omega$. Also, the weak convergence and the inequality (9) imply $\tilde E(\psi^f) = \lim_{n_k} \tilde E_{n_k}(\psi^f) = 0$. Therefore, from (8),

\[
M_t^f \doteq \Phi_t^f(X) = f(X_t) - f(X_0) - \frac{1}{2}\int_0^t \sum_{i,j\ge 1} \big(\nabla f(X_s) \cdot \Delta^0_{i,j}\, K(i,j)\,X_s(i)X_s(j) - \nabla f(X_s) \cdot \Delta^0_{i,j}\, F(i,j)\,X_s(i+j)\big)\,ds \tag{10}
\]

is a $\tilde P$ martingale with $M_0^f = 0$. Moreover, $\Phi_T^f(X^{n_k}) \to \Phi_T^f(X)$ weakly, since $\Phi_T^f$ is continuous (see [Bi]). That is, $M^{n_k,f} \to M^f$ weakly. Applying the same argument used in counting the number of jumps and bounding the jump size in Theorem 1, since $M^{n,f}$ has paths with finite variation, we can show that the quadratic variation $[M^{n,f}]_t = \sum_{s\le t} (\Delta M_s^{n,f})^2$ converges to 0 in probability as $n \to \infty$. Also, since $[M^{n,f}] \to [M^f]$ weakly, $[M^f]_T = 0$ (see p. 342 [JS]). Therefore, $M_t^f = 0$ for $t \le T$, with probability 1, i.e.,

\[
f(X_t) - f(X_0) - \frac{1}{2}\int_0^t \sum_{i,j\ge 1} \big(\nabla f(X_s) \cdot \Delta^0_{i,j}\, K(i,j)\,X_s(i)X_s(j) - \nabla f(X_s) \cdot \Delta^0_{i,j}\, F(i,j)\,X_s(i+j)\big)\,ds = 0. \tag{11}
\]

Now, for any $k \in \mathbb{N}^+$, choosing $f^k$, the $k$th coordinate projection on $l_2$, and restricting it to the domain $E$, we get for any $t \in [0, T]$,

\[
\begin{aligned}
X_t(k) &= X_0(k) + \frac{1}{2}\int_0^t \sum_{i,j\ge 1} \big(\nabla f^k(X_s) \cdot \Delta^0_{i,j}\, K(i,j)\,X_s(i)X_s(j) - \nabla f^k(X_s) \cdot \Delta^0_{i,j}\, F(i,j)\,X_s(i+j)\big)\,ds \\
&= X_0(k) + \int_0^t \Big[\frac{1}{2}\sum_{i=1}^{k-1} \{K(k-i,i)\,X_s(k-i)X_s(i) - F(k-i,i)\,X_s(k)\} - \sum_{i=1}^{\infty} \{K(k,i)\,X_s(k)X_s(i) - F(k,i)\,X_s(k+i)\}\Big]\,ds.
\end{aligned}
\]
This is the integral version of the Smoluchowski equation.

4. Gelation Phenomena

In this section we will study the gelation phenomenon, which indicates the appearance of huge clusters in a finite time. In a deterministic model, the situation of dropping total density in a finite time is interpreted as gelation.

4.1. The device. Theorem 5 shows how the stochastic gelation phenomena are reflected on the finite sequence of Markov chains $X^n$.

Proof of Theorem 5. First note that since $X_t$ has continuous sample paths, for any $t$, $X_t^{n_k} \to X_t$ weakly (see p. 131 [EK]).

(a) (I → II). Let $A_l = \{\eta \in E : \|\eta\|_{\phi(l)} \le \rho - \epsilon\}$. Since $\phi$ is nondecreasing, we have for any $l \ge 1$,

\[
\|\eta\|_{\phi(l)} = \sum_{i=1}^{\phi(l)} i\eta(i) \le \sum_{i=1}^{\phi(l+1)} i\eta(i) = \|\eta\|_{\phi(l+1)}.
\]

Hence, $A_{l+1} \subset A_l$. Thus $A_l$ is a decreasing sequence of closed sets. By weak convergence,

\[
\begin{aligned}
P\{X_t \in A_l\} &\ge \limsup_{k\to\infty} P\{X_t^{n_k} \in A_l\} \\
&\ge \limsup_{k\to\infty} P\{X_t^{n_k} \in A_{n_k}\} \\
&= \limsup_{k\to\infty} P\{\|X_t^{n_k}\|_{\phi(n_k)} \le \rho - \epsilon\} \\
&= \delta > 0.
\end{aligned}
\]

But then,

\[
P\{\|X_t\| \le \rho - \epsilon\} = P\Big\{X_t \in \bigcap_{l=1}^{\infty} A_l\Big\} = \lim_{l\to\infty} P\{X_t \in A_l\} \ge \delta > 0.
\]

(b) (II → I). Let us show that the negation of I implies the negation of II. Thus we assume


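(¬I) for every nondecreasing $\phi : \mathbb{N}^+ \to \mathbb{N}^+$ such that $\phi(n) \le n$, $\phi(n) \to \infty$ as $n \to \infty$, and for every $\epsilon > 0$,
\[
\limsup_{k\to\infty} P\{\|X_t^{n_k}\|_{\phi(n_k)} \le \rho - \epsilon\} = 0.
\]

Claim. (¬I) implies the condition
\[
\forall \epsilon > 0, \quad \lim_{l\to\infty} \limsup_{k\to\infty} P\{\|X_t^{n_k}\|_l \le \rho - \epsilon\} = 0.
\]

Proof of the claim. Suppose, to the contrary, that there exists $\epsilon > 0$ and a subsequence $l_m$, $m \ge 1$, such that for every $m \ge 1$, $\limsup_{k\to\infty} P\{\|X_t^{n_k}\|_{l_m} \le \rho - \epsilon\} = \delta > 0$. Thus, for each $m \ge 1$ there exists a subsequence $k(m, r)$, $r \ge 1$, such that for all $r \ge 1$,

\[
P\{\|X_t^{n_{k(m,r)}}\|_{l_m} \le \rho - \epsilon\} \ge \frac{\delta}{2}.
\]

Choose $r(1)$ so that $n_{k(1,r)} \ge l_1$ for every $r \ge r(1)$, and inductively choose $r(m+1)$ so that both $n_{k(m,r(m))} \le n_{k(m+1,r(m+1))}$ and $n_{k(m+1,r)} \ge l_{m+1}$ for all $r \ge r(m+1)$. Define $\phi$ by

\[
\phi(j) = \begin{cases} 1 & \text{if } 1 \le j \le n_{k(1,r(1))}, \\ l_m & \text{if } n_{k(m,r(m))} \le j < n_{k(m+1,r(m+1))}. \end{cases}
\]

Then $\phi$ is nondecreasing, $\phi(j) \le j$ for all $j \ge 1$, and $\lim_{j\to\infty} \phi(j) = +\infty$. Letting $k_m = k(m, r(m))$, we find for all $m \ge 1$,

\[
P\{\|X_t^{n_{k_m}}\|_{\phi(n_{k_m})} \le \rho - \epsilon\} = P\{\|X_t^{n_{k_m}}\|_{l_m} \le \rho - \epsilon\} \ge \frac{\delta}{2}.
\]

Hence,

\[
\limsup_{k\to\infty} P\{\|X_t^{n_k}\|_{\phi(n_k)} \le \rho - \epsilon\} > 0,
\]

that is, condition I is true. This proves the claim.

Now assume the negation of I and let $B_l(\epsilon) = \{\eta \in E : \|\eta\|_l > \rho - \epsilon\}$. For each $\epsilon > 0$, $B_l(\epsilon)$, $l \ge 1$, is an increasing sequence of open sets. Since $B_l(\frac{\epsilon}{2}) \subset B_l(\epsilon)$, we have by weak convergence,

\[
\begin{aligned}
P\{X_t \in B_l(\epsilon)\} &\ge P\{X_t \in B_l(\tfrac{\epsilon}{2})\} \\
&\ge \limsup_{k\to\infty} P\{X_t^{n_k} \in B_l(\tfrac{\epsilon}{2})\} \\
&= 1 - \liminf_{k\to\infty} P\{\|X_t^{n_k}\|_l < \rho - \tfrac{\epsilon}{2}\} \\
&\ge 1 - \limsup_{k\to\infty} P\{\|X_t^{n_k}\|_l \le \rho - \tfrac{\epsilon}{2}\}.
\end{aligned}
\]

But then, by the claim, if $B(\epsilon) = \bigcup_{l=1}^{\infty} B_l(\epsilon)$,

\[
\begin{aligned}
P\{X_t \in B(\epsilon)\} &= \lim_{l\to\infty} P\{X_t \in B_l(\epsilon)\} \\
&\ge 1 - \lim_{l\to\infty} \limsup_{k\to\infty} P\{\|X_t^{n_k}\|_l \le \rho - \tfrac{\epsilon}{2}\} \\
&= 1.
\end{aligned}
\]

Therefore,

\[
P\{\|X_t\| \le \rho - \epsilon\} = P\{X_t \in B(\epsilon)^c\} = 0.
\]

Since $\epsilon > 0$ was arbitrary, this shows that the negation of condition II holds.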

The above theorem suggests that, without looking at the system of deterministic equations or the weak limits, we can determine whether stochastic gelation occurs just by looking at the system of processes defined on finite particle systems. Also, this theorem implies that the gelation phenomenon defined in the system of deterministic equations indicates the appearance of many but relatively small clusters in a finite time as well as the appearance of a huge cluster. This may raise some questions about the correctness of the interpretation that non-preservation of density implies the appearance of a giant cluster.

4.2. Strong gelation vs. Stochastic gelation. A more suitable definition of gelation as the appearance of a huge cluster in a finite time can be given by the strong gelation which we defined in Sect. 1. The next few results show that under certain conditions on the kernels $K(i, j)$ and $F(i, j)$, strong gelation implies stochastic gelation. More precisely, suppose the coagulation-fragmentation system satisfies the tightness conditions (TC1) and (TC2). If the fragmentation rates are not too large, then the sto-gel time $T_g$ of any weak limit point can be estimated in terms of the strong gelation time $t_{sg}$, provided the latter is finite.

Proposition 1. Let $[\{K(i, j)\}_{i,j=1}^{\infty}, \{F(i, j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ be a stochastic coagulation-fragmentation system. Suppose $F(i, j) \equiv 0$ for all $i, j$, then $T_g \le t_{sg}$, i.e., if strong gelation occurs, then stochastic gelation occurs.

Proof. Since there is no fragmentation, the first condition of Theorem 5 holds with $\phi(n) = [\alpha n]$.

Theorem 6. Let $[\{K(i, j)\}_{i,j=1}^{\infty}, \{F(i, j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ be a stochastic coagulation-fragmentation system. If the kernels satisfy the tightness condition and if $\sum_{i+j=k} F(i, j) \le C$ for all $k$ and for some $C > 0$, then $T_g \le t_{sg}$, i.e., if strong gelation occurs then stochastic gelation occurs.

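Proof. Suppose $t_{sg} < \infty$, then there exist $\alpha > 0$ and $t \ge 0$ such that $\limsup_n P_n\{\tau_n^\alpha \le t\} > 0$. Let

\[
t_\alpha = \inf\{t > 0 : \limsup_n P_n\{\tau_n^\alpha \le t\} > 0\},
\]

then $\lim_{\alpha\to 0} t_\alpha = t_{sg}$. Choose a subsequence and denote it by $\{n\}$ again, such that

\[
\lim_n P_n\Big\{t_\alpha \le \tau_n^\alpha \le t_\alpha + \frac{\alpha}{C}\Big\} > 0.
\]

For any $\eta \in E_n$, let $l(\eta) = k$, where $k$ is the largest integer such that $\eta(k) > 0$, i.e., $l$ represents the size of the largest cluster of a state $\eta$. Let

\[
\begin{aligned}
G(\eta) &\doteq \{\xi \in E_n : l(\xi) < l(\eta)\}, \\
H(\eta) &\doteq \{\xi \in E_n : l(\xi) \ge l(\eta)\}, \\
T^n &\doteq \inf\{t > 0 : X^n_{\tau_n^\alpha + t} \in G(X^n_{\tau_n^\alpha})\}.
\end{aligned}
\]

Note that $T^n$ is independent of $\tau_n^\alpha$. In order for the largest cluster size to decrease, it is necessary that the largest cluster should fragment. Also, the time taken for the largest cluster size to decrease is longer than the time taken for one of the largest clusters to fragment, and the rate $\mu(\xi)$ of fragmentation for one cluster of size $l(\xi)$, $\xi \in H(\eta)$, with $l(\eta) \ge \alpha n$, satisfies

\[
\begin{aligned}
\mu(\xi) &\doteq \frac{n}{2\rho} \sum_{i+j=l(\xi)} F(i,j)\,\xi(i+j) \\
&= \frac{n}{2\rho} \sum_{i+j=l(\xi)} F(i,j)\,\xi(l(\xi)) \\
&\le \frac{1}{2\alpha} \sum_{i+j=l(\xi)} F(i,j) \\
&\le \frac{C}{2\alpha},
\end{aligned}
\]

where the first inequality is true since $l(\xi)\xi(l(\xi)) \le \rho$ implies $\xi(l(\xi)) \le \frac{\rho}{l(\xi)} \le \frac{\rho}{l(\eta)} \le \frac{\rho}{\alpha n}$. This shows that $T^n$ stochastically dominates the exponential r.v. $Y$ with mean $\frac{2\alpha}{C}$. Therefore,

\[
\begin{aligned}
P_n\Big\{T^n > \frac{\alpha}{C},\ t_\alpha \le \tau_n^\alpha < t_\alpha + \frac{\alpha}{C}\Big\} &= P_n\Big\{T^n > \frac{\alpha}{C}\Big\} \cdot P_n\Big\{t_\alpha \le \tau_n^\alpha < t_\alpha + \frac{\alpha}{C}\Big\} \\
&\ge P\Big\{Y > \frac{\alpha}{C}\Big\} \cdot P_n\Big\{t_\alpha \le \tau_n^\alpha < t_\alpha + \frac{\alpha}{C}\Big\} \\
&= \frac{1}{\sqrt{e}}\, P_n\Big\{t_\alpha \le \tau_n^\alpha < t_\alpha + \frac{\alpha}{C}\Big\}.
\end{aligned}
\]

For $\omega$ with $\tau_n^\alpha(\omega) < \infty$, since

\[
\|X^n_{\tau_n^\alpha}\|^n_{[\alpha n]} = \sum_{k\ge[\alpha n]} k X^n_{\tau_n^\alpha}(k) \ge \alpha n \frac{\rho}{n} = \alpha\rho,
\]

we have

\[
\begin{aligned}
P_n\{\|X_{t_\alpha + \frac{\alpha}{C}}\|^n_{[\alpha n]} \ge \alpha\rho\} &\ge P_n\Big\{T^n > \frac{\alpha}{C},\ t_\alpha \le \tau_n^\alpha < t_\alpha + \frac{\alpha}{C}\Big\} \\
&\ge \frac{1}{\sqrt{e}}\, P_n\Big\{t_\alpha \le \tau_n^\alpha < t_\alpha + \frac{\alpha}{C}\Big\}.
\end{aligned}
\]

Therefore,

\[
\lim_{n\to\infty} P_n\{\|X_{t_\alpha + \frac{\alpha}{C}}\|^n_{[\alpha n]} \ge \alpha\rho\} > 0.
\]

By Theorem 5, $T_g \le t_\alpha + \frac{\alpha}{C}$, and by letting $\alpha \to 0$, we get $T_g \le t_{sg}$.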
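The factor $1/\sqrt{e}$ in the proof of Theorem 6 comes from the exponential tail. As a brief check, assuming only that $Y$ is exponentially distributed with mean $2\alpha/C$ (that is, rate $C/(2\alpha)$), as stated in the proof:

```latex
\[
P\Big\{Y > \frac{\alpha}{C}\Big\}
  = \exp\Big(-\frac{C}{2\alpha}\cdot\frac{\alpha}{C}\Big)
  = e^{-1/2}
  = \frac{1}{\sqrt{e}} .
\]
```

The time window $\alpha/C$ is thus half the mean of $Y$, so the bound does not depend on $\alpha$ or $C$.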

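Theorem 7. Let $[\{K(i, j)\}_{i,j=1}^{\infty}, \{F(i, j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ be an $r$-generalized Becker–Döring system. If the kernels satisfy the tightness conditions (TC1) and (TC2), then $T_g \le t_{sg}$, i.e., if strong gelation occurs, then sto-gel occurs.

Proof. Let $C = \sup_\eta \sum_{i,j} F(i, j)\eta(i + j) < \infty$. Suppose $t_{sg} < \infty$, then by the same argument as in Theorem 6, there exist $\alpha > 0$ and $t \ge 0$ such that $\limsup_n P_n\{\tau_n^\alpha \le t\} > 0$. Let

\[
t_\alpha = \inf\{t > 0 : \limsup_n P_n\{\tau_n^\alpha \le t\} > 0\},
\]

choose a subsequence and denote it by $\{n\}$ again, such that

\[
\lim_n P_n\Big\{t_\alpha \le \tau_n^\alpha \le t_\alpha + \frac{\alpha\rho}{Cr}\Big\} > 0.
\]

Let $T_t^n$ be the process which counts the number of fragmentations from $\tau_n^\alpha$ to $\tau_n^\alpha + t$, i.e., let $T_t^n = |\{\tau_n^\alpha < s \le \tau_n^\alpha + t : X_s^n = X_{s-}^n - \Delta^n_{ij}\}|$. Since the fragmentation rate satisfies

\[
\mu_n(\eta) \doteq \frac{n}{2\rho} \sum_{i,j} F(i,j)\,\eta(i+j) \le \frac{Cn}{2\rho}
\]

for all $\eta \in E$, if we let $Y_t^n$ be the Poisson process with parameter $\frac{Cn}{2\rho}$, then the same argument (with different notation) as in Theorem 1 shows that $T_t^n$ is stochastically dominated by $Y_t^n$. Therefore,

\[
\begin{aligned}
P_n\Big\{T^n_{\frac{\alpha\rho}{Cr}} \le \Big[\frac{\alpha n}{2r}\Big],\ t_\alpha \le \tau_n^\alpha < t_\alpha + \frac{\alpha\rho}{Cr}\Big\}
&= P_n\Big\{T^n_{\frac{\alpha\rho}{Cr}} \le \Big[\frac{\alpha n}{2r}\Big]\Big\}\, P_n\Big\{t_\alpha \le \tau_n^\alpha < t_\alpha + \frac{\alpha\rho}{Cr}\Big\} \\
&\ge P\Big\{Y^n_{\frac{\alpha\rho}{Cr}} \le \Big[\frac{\alpha n}{2r}\Big]\Big\}\, P_n\Big\{t_\alpha \le \tau_n^\alpha < t_\alpha + \frac{\alpha\rho}{Cr}\Big\} \\
&= \sum_{k=0}^{[\frac{\alpha n}{2r}]} \frac{(\frac{\alpha n}{2r})^k}{k!}\, e^{-\frac{\alpha n}{2r}}\, P_n\Big\{t_\alpha \le \tau_n^\alpha < t_\alpha + \frac{\alpha\rho}{Cr}\Big\} \\
&\ge \frac{1}{3}\, P_n\Big\{t_\alpha \le \tau_n^\alpha < t_\alpha + \frac{\alpha\rho}{Cr}\Big\},
\end{aligned}
\tag{12}
\]

for large $n$, where the last inequality is true by the following lemma.

Lemma 4.1. For any $\alpha$, $0 < \alpha \le 1$,

\[
\sum_{k=1}^{[\alpha n]} \frac{(\alpha n)^k}{k!}\, e^{-\alpha n} \to \frac{1}{2} \quad \text{as } n \to \infty.
\]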

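Proof. Let $\{X_i\}$ be a sequence of i.i.d. Poisson random variables with parameter 1, and let $S_n = \sum_{k=1}^{n} X_k$, then $S_n$ is Poisson with parameter $n$. Thus,

\[
\begin{aligned}
\sum_{k=1}^{[\alpha n]} \frac{(\alpha n)^k}{k!}\, e^{-\alpha n} &\sim \sum_{k=1}^{[\alpha n]} \frac{[\alpha n]^k}{k!}\, e^{-[\alpha n]} \\
&= P\{S_{[\alpha n]} \le [\alpha n]\} \\
&= P\{S_{[\alpha n]} - [\alpha n] \le 0\} \\
&= P\Big\{\frac{S_{[\alpha n]} - [\alpha n]}{[\alpha n]^{1/2}} \le 0\Big\} \\
&\to \frac{1}{2} \quad \text{as } n \to \infty,
\end{aligned}
\]

since $\frac{S_{[\alpha n]} - [\alpha n]}{[\alpha n]^{1/2}}$ goes to a standard normal distribution.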
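The limit in Lemma 4.1 can be checked numerically. The Python sketch below computes $P\{S \le m\}$ for a Poisson variable $S$ with integer mean $m$; the function name `poisson_cdf_at_mean` is hypothetical, and the sum is started at $k = 0$ rather than $k = 1$ (the two differ only by the term $e^{-m}$, which vanishes in the limit). The terms are evaluated in log space to avoid overflow in $m^k/k!$.

```python
import math


def poisson_cdf_at_mean(m):
    """P{S <= m} for S Poisson with integer mean m >= 1.

    Each term e^{-m} m^k / k! is computed as exp(k*log(m) - m - log(k!)),
    using lgamma(k + 1) = log(k!), so large m does not overflow.
    """
    log_m = math.log(m)
    return sum(math.exp(k * log_m - m - math.lgamma(k + 1)) for k in range(m + 1))
```

Evaluating the function for increasing `m` shows the values approaching 1/2 from above, at the slow rate one would expect from the central limit theorem.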

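Now, since in each fragmentation the largest cluster size decreases by at most $r$, for any $\omega \in \{T^n_{\frac{\alpha\rho}{Cr}} \le [\frac{\alpha n}{2r}]\}$, the largest cluster size decreases by at most $r \cdot [\frac{\alpha n}{2r}] \le \frac{\alpha n}{2}$. So $l(X^n_{\tau_n^\alpha + \frac{\alpha\rho}{Cr}})(\omega) \ge \alpha n - \frac{\alpha n}{2} = \frac{\alpha n}{2}$, i.e.,

\[
\|X^n_{\tau_n^\alpha + \frac{\alpha\rho}{Cr}}(\omega)\|^n_{[\frac{\alpha n}{2}]} \ge \frac{\alpha n}{2}\,\frac{\rho}{n} = \frac{\alpha\rho}{2}.
\]

Therefore,

\[
\begin{aligned}
P_n\Big\{\|X^n_{t_\alpha + \frac{\alpha\rho}{Cr}}\|^n_{[\frac{\alpha n}{2}]} \ge \frac{\alpha\rho}{2}\Big\}
&\ge P_n\Big\{T^n_{\frac{\alpha\rho}{Cr}} \le \Big[\frac{\alpha n}{2r}\Big],\ t_\alpha \le \tau_n^\alpha < t_\alpha + \frac{\alpha\rho}{Cr}\Big\} \\
&\ge \frac{1}{3}\, P_n\Big\{t_\alpha \le \tau_n^\alpha < t_\alpha + \frac{\alpha\rho}{Cr}\Big\},
\end{aligned}
\]

and the last inequality is true by (12). Thus

\[
\lim_{n\to\infty} P_n\Big\{\|X^n_{t_\alpha + \frac{\alpha\rho}{Cr}}\|^n_{[\frac{\alpha n}{2}]} \ge \frac{\alpha\rho}{2}\Big\} > 0.
\]

As in Theorem 6, letting $\alpha \to 0$, we get $T_g \le t_{sg}$.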

5. Applications (Theorems 3 and 4) β

δ mρ}. 2 Since {k } is decreasing, φ(n) is increasing and % ∞. Fix n large enough so that φ(n) >> 1. For η ∈ En , I ⊂ {1, 2, · · · , n}, let rI (η) be the rate of jump from η including only the coordinates in I, i.e., P {Xt+h ∈ AIη |Xt = η} = rI (η)h + o(h), . where AIη = {η + 1nij : i, j ∈ I}. (Recall that 1nij =

ρ n {ei+j

− (ei + ej )}.)



Then

rI (η) =

n 2ρ

X i,j∈I,i+j≤n

ρ K(i, j)η(i)(η(j) − δij ). n

. For k ∈ J = {j ∈ N : 2j ≤ φ(n)}, let

j

X j . 2i i , for all j ≤ k}. K k = {η ∈ En : kηk21 ≤ i=0

Note that K k ⊂ K k−1 for all k ∈ J − {0}. Lemma 5.1. If η ∈ K k for all k ∈ J, then kηkn[ φ(n) ] ≥ δρ. 2

Proof. Let k0 = max J. Since kηk21 k0 kηkn[ φ(n) ] ≥ δρ, since φ(n) 2 mρ implies

2α ρ n

0 : Xtn ∈ K k }, k+1 2k+1 ρ . }, Klk = {η ∈ K k : kηk21 ≤ ρ − l n

then Tk−1 ≤ Tk a.s. and Klk ⊂ K k+1 for all l such that k+1

X n . 2i i ) k+1 ], l − 1 ≥ αk = [(ρ − 2 ρ i=0

where [·] is the greatest integer function, since for such l, kηk21

k+1

≤ρ−

k+1

k+1

i=0

i=1

X l2k+1 ρ n 2k+1 ρ X i ≤ ρ − (ρ − = 2i i ) k+1 2 i . n 2 ρ n

. For l = 0, 1, 2, · · · , let σlk = inf{t > 0 : Xt ∈ Klk }. Then σ0k−1 = Tk−1 , σlk−1 ≥ k−1 σl−1 and



αk−1 −1

X

Tk = Tk−1 +

i=0

k−1 (σi+1 ∧ Tk − σik−1 ∧ Tk ) + (Tk − σαk−1 ∧ Tk ). k−1

Also, αk−1 −1

E(Tk − Tk−1 ) = E{

X

k−1 (σi+1 ∧ Tk − σik−1 ∧ Tk )} + E(Tk − σαk−1 ∧ Tk ) k−1

i=0

≤ αk−1

max

0≤i≤αk−1 −1

k−1 E(σi+1 ∧ Tk − σik−1 ∧ Tk ) + E(Tk − σαk−1 ∧ Tk ). k−1

Claim. For 0 ≤ i ≤ αk−1 − 1, k−1 ∧ Tk − σik−1 ∧ Tk ) ≤ E(σi+1

Proof. Let

T = inf{t > 0 : Xσk−1 +t ∈ AIX0 i

22+2α ρ . n22kα 2k

k−1 σ i

(16)

∪ K k }.

k−1 k−1 , if ξ ∈ AIη0 , then ξ ∈ K k or ∃m > l such that ξ ∈ Km , For any η ∈ Klk−1 \ Kl+1 since k

k

k

kηk21 − kξk21 = k − 1ij k21 ( for some i, j ∈ I0 ) k ρ = k (ei + ej )k21 (since i + j > 2k ) n ρ ρ =i +j n n ρ ≥ 2k . n k−1 That is, if Xσk−1 ∧Tk ∈ Klk−1 \ Kl+1 for some l, then Xσk−1 ∈ Klk−1 \ K k and i

i

k−1 Xσk−1 +T ∈ Km \ K k for some m > l, or Xσk−1 +T ∈ K k . Thus i

i

σik−1

∧ Tk + T ≥

k−1 σi+1

∧ Tk a.s.,

hence k−1 ∧ Tk − σik−1 ∧ Tk ) ≤ E(T ). E(σi+1

. Now let τ = inf{t > 0 : Xt ∈ AIX0 − }, then t

E(T ) = E{T |Xσk−1 } i

= E{τ |X0 ∈ K k−1 \ K k } = EX k−1 (τ ) σ i

∧Tk

≤ max Eη (τ ) η∈K k−1

≤ max

η∈K k−1

≤

1 rI0 (η)

22+2α ρ , n22kα 2k


and the Claim is done.

Similarly, E(Tk − σαk−1 )≤ k−1 ∧Tk

22+2α ρ . n22kα 2k

(17)

Let γ = 2α − 1 − 2β > 0, then for k ≥ 1 with constants C1 , C2 which may vary in each expression, E(Tk − Tk−1 ) ≤ (αk−1 + 1) ≤ αk−1 ≤ (ρ −

22+2α ρ n22kα 2k

C1 22+2α ρ + 2 2kα n2 k m2(γ+β)k k X

2 i i )

i=0

n 22+2α ρ C1 + 2 k 2kα 2 ρ n2 k m2(γ+β)k

k X 1 22+2α ρ C1 1 = {ρ − C }) + 2 βi k 2kα 2 2 ρ 2 k m2(γ+β)k

(18)

i=0

22+2α+β ρ(1 − δ) C1 C2 δ = + + γk 2 (k+1)β γk (γ+β)k C 2 2 m2 2 22+2α+β C1 C2 δ ≤ + + . ρ(1 − δ)(2β − 1)2 2(γ+β)k m2(γ+β)k 2γk Therefore, for 0 < β < ∞ X

2α−1 2 ,

E(Tk − Tk−1 ) ≤

k=1

∞ X 22+2α+β C1 1 + C2 δ + ρ(1 − δ)(2β − 1)2 2(γ+β)k m k=1

2+2α+β

=

C1 2 + C2 δ + ρ(1 − δ)(2β − 1)2 (2γ+β − 1) m

Now, by a slight modification of (15), we get ρ n K(1, 1)η(1){η(1) − } 2ρ n n 2 C + C1 , ≥ 2ρ

γI 0 (η) =

where I 0 = {1}. Thus, as in (18), E(T0 ) ≤

2β C1 + C2 δ. + − 1)2 m

ρ(2β

Recalling that k0 , the largest integer such that 2k0 ≤ φ(n), and setting T −1 = 0, we have


ETk0 = ET0 + E(

k0 X

561

(Tk − Tk−1 )

k=1

≤

inf

0 ρc . Assume that n is sufficiently large so that αn >> 1. For η ∈ En , such that η(i) = 0 for all i ≥ αn, the coagulation rate



ρ . n X λn (η) = K(i, j)η(i){η(j) − δij } 2ρ n i+j≤n

≥

[αn] [αn] ρ n X X ijη(i){η(j) − δij } 2ρ n j=1 i=1

=

[αn] X [αn] X

n { 2ρ

ijη(i)η(j) −

j=1 i=1

[αn] X [αn] X j=1 i=1

ρ ijη(i)δij } n

[αn] X

ρ n 2 {ρ − i2 η(i) } 2ρ n i=1 ρn n 2 (ρ − αρ2 ) = (1 − α), ≥ 2ρ 2

=

where the last inequality holds since [αn] X i=1

i2 η(i)

X [αn] X ρ ≤ η(i)ρ ≤ αρ i iη(i) = αρ2 . n n [αn]

[αn]

i=1

i=1

Also, the fragmentation rate ρc n n . n X · ρβ = . F (i, j)η(i + j) ≤ µn (η) = 2ρ 2ρ 2 i+j≤n

Therefore, µn (η) ≤

ρn ρc n < (1 − α) ≤ λn (η). 2 2

Let λ0 = ρ(1−α) , µ0 = ρ2c , then λ0 > µ0 . Let Ytn be the birth and death process on 2 {−n, −(n − 1), · · · , 0, 1, 2, · · · , n} with reflecting state {−n} and absorbing state {n} and transition probability Pt (i + 1|i) = λ0 nt + o(t) (−n ≤ i < n), Pt (i − 1|i) = µ0 nt + o(t) (−n < i < n), Pt (n − 1|n) = 0. Then Ytn represents “the number of coagulation steps − the number of fragmentation and fragmentation rate steps” up to time t of a process with coagulation rate ρ(1−α)n 2 ρc n . 2 Let T be the first hitting time of absorbing state n from the initial state 0, i.e., let . T = inf{t > 0 : Ytn = n|Y0n = 0}, then ET ≤

1 n = 0 for some k ≥ αn}, the first hitting time of the cluster size greater than or equal to αn.


Since Xtn hits k−cluster for some k ≥ αn before “the number of coagulation − the number of fragmentation” = n, and by stochastic dominance, for any 0 > 0, Pn {τnα ≤

1 + 0 } ≥ P {Y 1 +0 = n} λ0 −µ0 λ0 − µ 0 1 = P {T ≤ + 0 } λ0 − µ0 = P {T ≤ E(T ) + 0 } 0 , ≥ E(T ) + 0

for all n. Therefore, lim inf P {τnα ≤ n

and strong gelation occurs in time 2 tsg ≤ (ρ−ρ . c)

1 0 + 0 } ≥ , λ0 − µ 0 E(T ) + 0

1 λ0 −µ0

+ 0 , for any 0 > 0. Letting 0 , α → 0, we get

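Proof of Corollary 2. (TC1) is satisfied since, for any η ∈ E_n,

$$\sum_{i,j} K(i,j)\eta(i)\eta(j) \le M \sum_{i,j} ij\,\eta(i)\eta(j) \le M \Big(\sum_i i\,\eta(i)\Big)\Big(\sum_j j\,\eta(j)\Big) \le M\rho^2 < \infty.$$

Also, (TC2) is satisfied since

$$\sum_{i,j\ge 1} F(i,j)\eta(i+j) \le \sum_{i,j\ge 1} \frac{C}{i+j}\,\eta(i+j) \le \sum_{k=2}^{\infty} \sum_{i+j=k} \frac{C}{i+j}\,\eta(i+j) = \sum_{k=2}^{\infty} (k-1)\frac{C}{k}\,\eta(k) \le \frac{C}{2} \sum_{k=2}^{\infty} k\,\eta(k) \le \frac{\rho C}{2}.$$

Moreover,

$$\sum_{i+j=k} F(i,j) \le \sum_{i+j=k} \frac{C}{i+j} = (k-1)\cdot\frac{C}{k} \le C.$$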



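Since

$$\rho\beta = \sup_{\eta} \sum_{i,j} F(i,j)\eta(i+j) \le \frac{\rho C}{2},$$

$\beta \le \frac{C}{2}$ and $\rho_c = \beta \le \frac{C}{2}$. By Theorem 4 and Theorem 6,

$$T_g \le \frac{2}{\rho - \rho_c} \le \frac{2}{\rho - \frac{C}{2}} = \frac{4}{2\rho - C}.$$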
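The chain of inequalities used for (TC2) can be sanity-checked on a concrete configuration. This is a minimal sketch under assumptions: η is represented as a dictionary from cluster size k to η(k), F(i, j) is replaced by its upper bound C/(i + j), and the names `fragmentation_sum`, `fragmentation_sum_grouped` and `total_mass` are hypothetical helpers, not notation from the paper.

```python
import random

def fragmentation_sum(eta, C):
    """Brute-force sum over i, j >= 1 of C/(i+j) * eta(i+j)."""
    kmax = max(eta)
    total = 0.0
    for i in range(1, kmax):
        for j in range(1, kmax - i + 1):
            total += C / (i + j) * eta.get(i + j, 0.0)
    return total

def fragmentation_sum_grouped(eta, C):
    """Same sum grouped by k = i + j: there are (k - 1) ordered pairs per k."""
    return sum((k - 1) * C / k * mass for k, mass in eta.items() if k >= 2)

def total_mass(eta):
    """rho = sum_k k * eta(k)."""
    return sum(k * mass for k, mass in eta.items())

random.seed(0)
C = 1.5
eta = {k: random.random() for k in range(1, 40)}
rho = total_mass(eta)

print(fragmentation_sum(eta, C), fragmentation_sum_grouped(eta, C), rho * C / 2)
```

The grouped form is the step "= Σ (k − 1)(C/k) η(k)" of the proof, and the final comparison with ρC/2 uses (k − 1)/k ≤ k/2 for k ≥ 2.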

Proof of Corollary 3.

$$\sum_{i,j\ge 1} F(i,j)\eta(i+j) \le \sum_{i=1}^{r}\sum_{j=1}^{r} C(i+j)\,\eta(i+j) \le Cr\rho,$$

hence $\beta \le Cr$, $\rho_c = \beta \le Cr$, and, therefore, by Theorems 4 and 6, $t_g \le \frac{2}{\rho - Cr}$.

Acknowledgement. This is a part of the author's Ph.D. thesis. I would like to thank my adviser, P. March, for introducing this subject, for showing me the mathematical ideas and insights, and for his encouragement.


Communicated by J. L. Lebowitz

Commun. Math. Phys. 194, 569 – 589 (1998)

Communications in


Boundary Exchange Algebras and Scattering on the Half Line

Antonio Liguori1,2, Mihail Mintchev1,2, Liu Zhao3

1 Dipartimento di Fisica dell'Università di Pisa, Piazza Torricelli 2, 56100 Pisa, Italy
2 Istituto Nazionale di Fisica Nucleare, Sezione di Pisa, 56100 Pisa, Italy
3 Institute of Modern Physics, Northwest University, Xian 710069, P.R. China

Received: 14 November 1996 / Accepted: 5 November 1997

Abstract: Some algebraic aspects of field quantization in space-time with boundaries are discussed. We introduce an associative algebra BR , whose exchange properties are inferred from the scattering processes in integrable models with reflecting boundary conditions on the half line. The basic properties of BR are established and the Fock representations associated with certain involutions in BR are derived. We apply these results for the construction of quantum fields and for the study of scattering on the half line.

1. Introduction

It is well known that the presence of boundaries in space affects the behavior of quantum fields. In this paper we discuss the influence of the boundary conditions on the canonical commutation relations between creation and annihilation operators. Our investigation is inspired mainly by the factorized scattering theory of integrable models with reflecting boundary conditions on the half line. In the absence of boundaries [6,13,26], the algebraic features of these models are encoded in the Zamolodchikov-Faddeev (Z-F) algebra [6,26], denoted in what follows by $\mathcal{A}_R$. This is an associative algebra, whose generators satisfy quadratic constraints, known as exchange relations. The Fock representation of $\mathcal{A}_R$ equipped with an appropriate involution describes the scattering processes in integrable models. In this respect one should recall first that the Fock space contains two dense subspaces whose elements are interpreted as asymptotic in- and out-states. Second, the S-matrix can be explicitly constructed as a unitary operator interpolating between the asymptotic in- and out-spaces.

In a pioneering paper from the middle of the eighties, Cherednik [4] suggested a possible generalization of factorized scattering theory to integrable models with reflecting boundary conditions, which preserve integrability. The recent efforts to gain a deeper insight in various boundary-related two-dimensional phenomena, stimulated


further investigations [5,7–12,16,21,23–25] in this subject. Among others, we would like to mention the attempts to develop an algebraic approach. One of the basic ideas there is to extend the Z-F algebra by introducing [8–12] “boundary creating" (also called “reflection") operators, which formally translate in algebraic terms the nontrivial boundary conditions. When possible, such an algebraic formulation is quite attractive because the treatment of the boundary conditions in their standard analytic form is as a rule a complicated matter. In spite of the great progress in implementing the above idea in particular models, the fundamental features of the boundary operators and their interplay with the “bulk" theory are still to be investigated. This is among the main purposes of the present paper. We start our analysis by introducing an exchange algebra BR with the following structure. In the above spirit, BR contains both boundary and bulk generators. The latter have a counterpart in AR , but we shall see that the exchange of two bulk generators of BR involves in general boundary elements. The impact of the boundary on the bulk theory is therefore manifest already on the algebraic level, while the detailed boundary conditions are specified on the level of representation. We concentrate in this article on the Fock representations of BR . We will show that there exist two series of such representations, depending on certain involutions in BR . We shall construct these representations explicitly, establishing also their basic properties. As an application of these results, we will perform a detailed and rigorous investigation of the S-matrix of integrable models in the presence of reflecting boundaries. The paper is organized as follows. In Sect. 2 we define the exchange algebra BR and investigate some of its basic features. We introduce the concept of reflection BR algebra and the related notion of reflection automorphism. 
At the end of this section we describe also a family of natural generalizations of $\mathcal{B}_R$. Section 3 is devoted to the Fock representations of $\mathcal{B}_R$. In Sect. 4 we describe some applications. We show that the second quantization on the half line naturally leads to $\mathcal{B}_R$. We also analyze here the scattering operator of integrable models. The last section contains our conclusions. In the appendix we construct representations of $\mathcal{B}_R$ carrying a boundary quantum number. This article brings together and extends the results independently obtained by the present authors in [19] and [27].

2. The Exchange Algebra $\mathcal{B}_R$

$\mathcal{B}_R$ is by definition an associative algebra with identity element 1. It has two types of generators:

$$\{a_\alpha(x),\ a^{*\alpha}(x) : \alpha = 1, \dots, N,\ x \in \mathbb{R}^s\} \tag{2.1}$$

and

$$\{b_\alpha^\beta(x) : \alpha, \beta = 1, \dots, N,\ x \in \mathbb{R}^s\}, \tag{2.2}$$

which, as mentioned in the introduction, are called bulk and boundary generators respectively. For convenience, we divide the constraints on (2.1,2) in three groups:

(i) bulk exchange relations are quadratic in the bulk generators and read

$$a_{\alpha_1}(x_1)\, a_{\alpha_2}(x_2) - R_{\alpha_2\alpha_1}^{\beta_1\beta_2}(x_2,x_1)\, a_{\beta_2}(x_2)\, a_{\beta_1}(x_1) = 0, \tag{2.3}$$

$$a^{*\alpha_1}(x_1)\, a^{*\alpha_2}(x_2) - a^{*\beta_2}(x_2)\, a^{*\beta_1}(x_1)\, R_{\beta_2\beta_1}^{\alpha_1\alpha_2}(x_2,x_1) = 0, \tag{2.4}$$

$$a_{\alpha_1}(x_1)\, a^{*\alpha_2}(x_2) - a^{*\beta_2}(x_2)\, R_{\alpha_1\beta_2}^{\alpha_2\beta_1}(x_1,x_2)\, a_{\beta_1}(x_1) = \frac{1}{2}\,\delta(x_1-x_2)\,\delta_{\alpha_1}^{\alpha_2}\,\mathbf{1} + \frac{1}{2}\,\delta(x_1+x_2)\, b_{\alpha_1}^{\alpha_2}(x_1); \tag{2.5}$$


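(ii) boundary exchange relations

$$R_{\alpha_1\alpha_2}^{\gamma_2\gamma_1}(x_1,x_2)\, b_{\gamma_1}^{\delta_1}(x_1)\, R_{\gamma_2\delta_1}^{\beta_1\delta_2}(x_2,-x_1)\, b_{\delta_2}^{\beta_2}(x_2) = b_{\alpha_2}^{\gamma_2}(x_2)\, R_{\alpha_1\gamma_2}^{\delta_2\delta_1}(x_1,-x_2)\, b_{\delta_1}^{\gamma_1}(x_1)\, R_{\delta_2\gamma_1}^{\beta_1\beta_2}(-x_2,-x_1); \tag{2.6}$$

(iii) mixed relations

$$a_{\alpha_1}(x_1)\, b_{\alpha_2}^{\beta_2}(x_2) = R_{\alpha_2\alpha_1}^{\gamma_1\gamma_2}(x_2,x_1)\, b_{\gamma_2}^{\delta_2}(x_2)\, R_{\gamma_1\delta_2}^{\beta_2\delta_1}(x_1,-x_2)\, a_{\delta_1}(x_1), \tag{2.7}$$

$$b_{\alpha_1}^{\beta_1}(x_1)\, a^{*\alpha_2}(x_2) = a^{*\delta_2}(x_2)\, R_{\alpha_1\delta_2}^{\gamma_2\delta_1}(x_1,x_2)\, b_{\delta_1}^{\gamma_1}(x_1)\, R_{\gamma_2\gamma_1}^{\beta_1\alpha_2}(x_2,-x_1). \tag{2.8}$$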

In the above equations and in what follows the summation over repeated upper and lower indices is always understood. The entries of the exchange factor R are complex valued measurable functions on $\mathbb{R}^s \times \mathbb{R}^s$, obeying

$$R_{\alpha_1\alpha_2}^{\gamma_1\gamma_2}(x_1,x_2)\, R_{\gamma_1\gamma_2}^{\beta_1\beta_2}(x_2,x_1) = \delta_{\alpha_1}^{\beta_1}\,\delta_{\alpha_2}^{\beta_2}, \tag{2.9}$$

$$R_{\alpha_1\alpha_2}^{\gamma_1\gamma_2}(x_1,x_2)\, R_{\gamma_2\alpha_3}^{\delta_2\beta_3}(x_1,x_3)\, R_{\gamma_1\delta_2}^{\beta_1\beta_2}(x_2,x_3) = R_{\alpha_2\alpha_3}^{\gamma_2\gamma_3}(x_2,x_3)\, R_{\alpha_1\gamma_2}^{\beta_1\delta_2}(x_1,x_3)\, R_{\delta_2\gamma_3}^{\beta_2\beta_3}(x_1,x_2). \tag{2.10}$$

These compatibility conditions are assumed throughout the paper and can be considered as general requirements on R, which together with Eqs. (2.3–8) define the exchange algebra $\mathcal{B}_R$. Equation (2.10) is the spectral quantum Yang–Baxter equation in its braid form, $\mathbb{R}^s$ playing the role of spectral set.

Let us comment now on the exchange relations (2.3–8), which may look at first sight a bit complicated. Concerning the general structure, we observe that after setting formally all boundary generators in (2.3–8) to zero and rescaling by a factor of $1/\sqrt{2}$ the bulk generators, one gets the Z-F algebra $\mathcal{A}_R$. This fact clarifies partially the origin of Eqs. (2.3–5). The presence of boundary generators in the right hand side of (2.5) is worth stressing. This is one of the essential points, in which our approach differs from the previous attempts to define a boundary exchange algebra. Equation (2.6) describes the exchange of two boundary generators taken in generic points and also deserves a remark. It looks similar to the boundary Yang–Baxter equation [4]; the difference is that the elements $\{b_\alpha^\beta(x)\}$ do not commute in general and consequently their position in (2.6) is essential. Notice also that $\{b_\alpha^\beta(x)\}$ close a subalgebra of $\mathcal{B}_R$, which presents by itself some interest [24]. Finally, Eqs. (2.7, 8) express the interplay between $\{a_\alpha(x), a^{*\alpha}(x)\}$ and $\{b_\alpha^\beta(x)\}$ and represent another relevant new aspect of our proposal.

Two straightforward examples, denoted by $\mathcal{B}_\pm$, correspond to the constant solutions

$$R_{\alpha_1\alpha_2}^{\beta_1\beta_2} = \pm\,\delta_{\alpha_2}^{\beta_1}\,\delta_{\alpha_1}^{\beta_2} \tag{2.11}$$

of (2.9,10) and represent in the above context the counterparts of the canonical (anti)commutation relations. Equations (2.6–8) imply that {bβα (x)} are central elements in B± . Nevertheless, also in these relatively simple cases the right-hand side of Eq. (2.5) keeps trace of the nontrivial boundary conditions. Two applications of B+ with N = 1 are described in Sect. 4. To further understand the structure of BR and its representations, it is instructive to introduce some involutions in BR . Let HN be the family of invertible Hermitian N × N matrices and let M be the set of matrix valued functions m : Rs → HN , such that the entries of m(x) and m(x)−1 are measurable and bounded in Rs . Consider the mapping Im defined by



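$$I_m : a^{*\alpha}(x) \longmapsto m_\alpha^\beta(x)\, a_\beta(x), \tag{2.12}$$

$$I_m : a_\alpha(x) \longmapsto a^{*\beta}(x)\, (m^{-1})_\beta^\alpha(x), \tag{2.13}$$

$$I_m : b_\alpha^\beta(x) \longmapsto m_\beta^\gamma(-x)\, b_\gamma^\delta(-x)\, (m^{-1})_\delta^\alpha(x). \tag{2.14}$$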
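As a sanity check on the constant solutions (2.11) discussed above, one can verify by brute force that R = ±(permutation) satisfies both compatibility conditions (2.9) and (2.10). The sketch below is illustrative only: it assumes N = 2, drops the x-dependence (the solutions are constant), and stores $R_{\alpha_1\alpha_2}^{\beta_1\beta_2}$ in a dictionary keyed by (α1, α2, β1, β2); the helper names are hypothetical.

```python
from itertools import product

N = 2              # number of internal indices (assumption for this demo)
idx = range(N)

def delta(a, b):
    return 1 if a == b else 0

def make_R(sign):
    """R[(a1, a2, b1, b2)] = R^{b1 b2}_{a1 a2} = sign * delta^{b1}_{a2} delta^{b2}_{a1}."""
    return {(a1, a2, b1, b2): sign * delta(b1, a2) * delta(b2, a1)
            for a1, a2, b1, b2 in product(idx, repeat=4)}

def unitarity_holds(R):
    """Check Eq. (2.9): R^{g1 g2}_{a1 a2} R^{b1 b2}_{g1 g2} = delta delta."""
    for a1, a2, b1, b2 in product(idx, repeat=4):
        lhs = sum(R[a1, a2, g1, g2] * R[g1, g2, b1, b2]
                  for g1, g2 in product(idx, repeat=2))
        if lhs != delta(a1, b1) * delta(a2, b2):
            return False
    return True

def braid_ybe_holds(R):
    """Check Eq. (2.10), the Yang-Baxter equation in braid form."""
    for a1, a2, a3, b1, b2, b3 in product(idx, repeat=6):
        lhs = sum(R[a1, a2, g1, g2] * R[g2, a3, d2, b3] * R[g1, d2, b1, b2]
                  for g1, g2, d2 in product(idx, repeat=3))
        rhs = sum(R[a2, a3, g2, g3] * R[a1, g2, b1, d2] * R[d2, g3, b2, b3]
                  for g2, g3, d2 in product(idx, repeat=3))
        if lhs != rhs:
            return False
    return True

for sign in (+1, -1):
    R = make_R(sign)
    print(sign, unitarity_holds(R), braid_ybe_holds(R))
```

Rescaling R by any factor other than ±1 breaks (2.9) while leaving the homogeneous cubic relation (2.10) intact, which is a quick way to see that the two conditions are independent.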

Provided that m ∈ M satisfies

$$R^{\dagger\,\gamma_1\gamma_2}_{\ \ \alpha_1\alpha_2}(x_1,x_2)\, m_{\gamma_1}^{\beta_1}(x_1)\, m_{\gamma_2}^{\beta_2}(x_2) = m_{\alpha_1}^{\gamma_1}(x_2)\, m_{\alpha_2}^{\gamma_2}(x_1)\, R_{\gamma_1\gamma_2}^{\beta_1\beta_2}(x_2,x_1), \tag{2.15}$$

it is not difficult to check that when extended as an antilinear antihomomorphism on $\mathcal{B}_R$, $I_m$ defines an involution. In Eq. (2.15) and in what follows the dagger stands for Hermitian conjugation, i.e.

$$R^{\dagger\,\beta_1\beta_2}_{\ \ \alpha_1\alpha_2}(x_1,x_2) \equiv \overline{R}_{\beta_1\beta_2}^{\alpha_1\alpha_2}(x_1,x_2),$$

the bar indicating complex conjugation. Notice that for the algebras $\mathcal{B}_\pm$ Eq. (2.15) is satisfied for any m ∈ M.

In this paper we shall concentrate on the following specific type of $\mathcal{B}_R$-algebras. We call the boundary generators $\{b_\alpha^\beta(x)\}$ reflections if

$$b_\alpha^\gamma(x)\, b_\gamma^\beta(-x) = \delta_\alpha^\beta \tag{2.16}$$

holds. In this case we refer to $\mathcal{B}_R$ as a reflection exchange algebra. The condition (2.16) is $I_m$-invariant and one easily proves

Proposition 1. Let $\mathcal{B}_R$ be a reflection exchange algebra. Then the mapping

$$\varrho : a_\alpha(x) \longmapsto b_\alpha^\beta(x)\, a_\beta(-x), \tag{2.17}$$

$$\varrho : a^{*\alpha}(x) \longmapsto a^{*\beta}(-x)\, b_\beta^\alpha(-x), \tag{2.18}$$

$$\varrho : b_\alpha^\beta(x) \longmapsto b_\alpha^\beta(x), \tag{2.19}$$

leaves invariant the constraints (2.3–8) and extends therefore to an automorphism on $\mathcal{B}_R$. Moreover, being compatible with $I_m$, ϱ is actually an automorphism of $\{\mathcal{B}_R, I_m\}$ considered as an algebra with involution.

In what follows ϱ is called the reflection automorphism of $\mathcal{B}_R$. Besides encoding some essential features of any reflection exchange algebra, ϱ has a direct physical interpretation in scattering theory: it provides a mathematical description of the intuitive physical picture that bouncing back from a wall, particles change the sign of their rapidities. In fact, the two elements $a^{*\alpha}(-x)$ and $a^{*\beta}(x)\, b_\beta^\alpha(x)$ are ϱ-equivalent,

$$a^{*\alpha}(-x) \sim a^{*\beta}(x)\, b_\beta^\alpha(x). \tag{2.20}$$

This relation in our framework is the counterpart of a heuristic equation (see for example Eq. (3.22) of [10]), conjectured in all papers dealing with factorized scattering with reflecting boundaries. In the next section we will show that the ϱ-equivalence becomes actually an equality in the Fock representation of $\{\mathcal{B}_R, I_m\}$. For proving this statement we will use the relations

$$\{a_\alpha(x_1) - \varrho[a_\alpha(x_1)]\}\, a^{*\beta}(x_2) = a^{*\gamma}(x_2)\, R_{\alpha\gamma}^{\beta\delta}(x_1,x_2)\, \{a_\delta(x_1) - \varrho[a_\delta(x_1)]\}, \tag{2.21}$$



$$\{a^{*\alpha}(x_1) - \varrho[a^{*\alpha}(x_1)]\}\, a^{*\beta}(x_2) = a^{*\gamma}(x_2)\, \{a^{*\delta}(x_1) - \varrho[a^{*\delta}(x_1)]\}\, R_{\gamma\delta}^{\alpha\beta}(x_2,x_1), \tag{2.22}$$

whose validity follows directly from Eqs. (2.3–5, 7, 8, 16).

Before concluding this section, we would like to introduce a whole class of more general exchange algebras which can be treated in the above way. The idea is to replace the reflection $x \mapsto -x$, which plays a special role in defining $\mathcal{B}_R$, with any almost everywhere differentiable mapping $\lambda : x \mapsto \tilde{x}$, satisfying the iterative functional equation

$$\lambda(\lambda(x)) = x. \tag{2.23}$$
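To make the functional equation (2.23) concrete, here is a small sketch that tests a few candidate maps on the real line (the case s = 1). The specific maps below (a shifted reflection and the inversion x ↦ 1/x) are illustrative choices, not examples taken from the paper, and `is_involution` is a hypothetical helper that only checks the identity on sample points.

```python
import math

def is_involution(lam, samples, tol=1e-12):
    """Check lam(lam(x)) == x (up to rounding) on the given sample points."""
    return all(math.isclose(lam(lam(x)), x, rel_tol=tol, abs_tol=tol)
               for x in samples)

def reflection(x):
    return -x            # the map used to define the algebra B_R

def shifted_reflection(x):
    return 3.0 - x       # reflection about the point x = 3/2

def inversion(x):
    return 1.0 / x       # defined away from x = 0, i.e. almost everywhere

def shift(x):
    return x + 1.0       # not an involution, included as a counterexample

samples = [-2.5, -1.0, -0.3, 0.7, 1.0, 4.2]
for name, lam in [("reflection", reflection),
                  ("shifted reflection", shifted_reflection),
                  ("inversion", inversion),
                  ("shift", shift)]:
    print(name, is_involution(lam, samples))
```

Each of the three involutions has a graph that is symmetric with respect to the diagonal x = y, while the graph of the shift is not.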

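The resulting exchange algebras will be denoted by $\mathcal{B}_{R,\lambda}$ and are characterized by the following constraints: the relations (2.3,4) remain unchanged, whereas (2.5–8) take the form

$$a_{\alpha_1}(x_1)\, a^{*\alpha_2}(x_2) - a^{*\beta_2}(x_2)\, R_{\alpha_1\beta_2}^{\alpha_2\beta_1}(x_1,x_2)\, a_{\beta_1}(x_1) = \frac{1}{2\,|\det\lambda'(x_1)|^{1/2}} \left\{\delta(x_1-x_2)\,\delta_{\alpha_1}^{\alpha_2}\,\mathbf{1} + \delta(x_1-\tilde{x}_2)\, b_{\alpha_1}^{\alpha_2}(x_1)\right\}, \tag{2.24}$$

$$R_{\alpha_1\alpha_2}^{\gamma_2\gamma_1}(x_1,x_2)\, b_{\gamma_1}^{\delta_1}(x_1)\, R_{\gamma_2\delta_1}^{\beta_1\delta_2}(x_2,\tilde{x}_1)\, b_{\delta_2}^{\beta_2}(x_2) = b_{\alpha_2}^{\gamma_2}(x_2)\, R_{\alpha_1\gamma_2}^{\delta_2\delta_1}(x_1,\tilde{x}_2)\, b_{\delta_1}^{\gamma_1}(x_1)\, R_{\delta_2\gamma_1}^{\beta_1\beta_2}(\tilde{x}_2,\tilde{x}_1), \tag{2.25}$$

$$a_{\alpha_1}(x_1)\, b_{\alpha_2}^{\beta_2}(x_2) = R_{\alpha_2\alpha_1}^{\gamma_1\gamma_2}(x_2,x_1)\, b_{\gamma_2}^{\delta_2}(x_2)\, R_{\gamma_1\delta_2}^{\beta_2\delta_1}(x_1,\tilde{x}_2)\, a_{\delta_1}(x_1), \tag{2.26}$$

$$b_{\alpha_1}^{\beta_1}(x_1)\, a^{*\alpha_2}(x_2) = a^{*\delta_2}(x_2)\, R_{\alpha_1\delta_2}^{\gamma_2\delta_1}(x_1,x_2)\, b_{\delta_1}^{\gamma_1}(x_1)\, R_{\gamma_2\gamma_1}^{\beta_1\alpha_2}(x_2,\tilde{x}_1).$$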

Here $\lambda'(x)$ denotes the Jacobian matrix of the function λ. The results of this section regarding $\mathcal{B}_R$ can be transferred with obvious modifications to $\mathcal{B}_{R,\lambda}$. For the complete set of solutions of Eq. (2.23) we refer to [14]. When s = 1 for instance, the mapping λ can be any almost everywhere differentiable function in $\mathbb{R}$ whose graph is symmetric with respect to the diagonal $\{(x, y) \in \mathbb{R}^2 : x = y\}$.

Summarizing, we introduced so far the exchange algebra $\mathcal{B}_R$ and some natural generalizations of it. We defined also a set of involutions in $\mathcal{B}_R$, which are useful in representation theory. Focusing on reflection type $\mathcal{B}_R$-algebras, we shall construct in the next section the relative Fock representations.

3. Fock Representations

We consider in this paper representations of $\{\mathcal{B}_R, I_m\}$ with the following general structure.

1. The representation space $\mathcal{L}$ is a locally convex and complete topological linear space over $\mathbb{C}$.
2. The generators $\{a_\alpha(x), a^{*\alpha}(x), b_\alpha^\beta(x)\}$ are operator valued distributions with common and invariant dense domain $\mathcal{D} \subset \mathcal{L}$, where Eqs. (2.3–8) hold.
3. $\mathcal{D}$ is equipped with a nondegenerate sesquilinear form (inner product) $\langle \cdot\,, \cdot \rangle_m$, which is at least separately continuous. The involution $I_m$ defined by Eqs. (2.12–14) is realized as a conjugation with respect to $\langle \cdot\,, \cdot \rangle_m$.

A Fock representation of $\{\mathcal{B}_R, I_m\}$ is specified further by the following requirement.

4. There exists a vector (vacuum state) $\Omega \in \mathcal{D}$ which is annihilated by $a_\alpha(x)$. Moreover, Ω is cyclic with respect to $\{a^{*\alpha}(x)\}$ and $\langle \Omega, \Omega \rangle_m = 1$.



A more general situation, when a boundary quantum number [10] is present, is outlined in the appendix. There is a series of direct but quite important corollaries from the above assumptions. Let us start with

Proposition 2. The automorphism ϱ of any reflection $\mathcal{B}_R$-algebra is implemented in the above Fock representations by the identity operator.

Proof. First of all we observe that

$$\langle P'[a^*]\Omega,\ \{a_\alpha(x) - \varrho[a_\alpha(x)]\}\, P[a^*]\Omega \rangle_m = 0, \tag{3.1}$$

where P and P′ are arbitrary polynomials. In fact, by means of Eq. (2.21) one can shift the curly bracket to the vacuum and use that $a_\alpha(x)$ annihilate Ω. Now the cyclicity of Ω, combined with the properties of $\langle \cdot\,, \cdot \rangle_m$, allow to replace $P'[a^*]\Omega$ by an arbitrary state ϕ ∈ $\mathcal{D}$. A further conjugation leads to

$$\langle P[a^*]\Omega,\ \{a^{*\alpha}(x) - \varrho[a^{*\alpha}(x)]\}\,\varphi \rangle_m = 0, \tag{3.2}$$

which implies

$$a^{*\alpha}(x) = a^{*\beta}(-x)\, b_\beta^\alpha(-x) \tag{3.3}$$

on $\mathcal{D}$. Analogously, employing (2.22) one concludes that

$$a_\alpha(x) = b_\alpha^\beta(x)\, a_\beta(-x) \tag{3.4}$$

also holds on $\mathcal{D}$. Finally, taking in consideration Eq. (2.19) we deduce that ϱ is indeed implemented by the identity operator.

For describing some further characteristic features of the Fock representations of $\mathcal{B}_R$, we introduce the c-number distributions

$$B_\alpha^\beta(x) \equiv \langle \Omega,\ b_\alpha^\beta(x)\,\Omega \rangle_m. \tag{3.5}$$

The requirement 3 implies that

$$B^{\dagger\,\beta}_{\ \ \alpha}(x) = m_\alpha^\gamma(-x)\, B_\gamma^\delta(-x)\, (m^{-1})_\delta^\beta(x), \tag{3.6}$$

which is the analog of condition (2.15) regarding the exchange factor R. Two other simple consequences of our assumptions 1–4 above are collected in

Proposition 3. The vacuum vector Ω is unique (up to a phase factor) and satisfies

$$b_\alpha^\beta(x)\,\Omega = B_\alpha^\beta(x)\,\Omega. \tag{3.7}$$

Proof. The argument implying the uniqueness of the vacuum is standard. Concerning Eq. (3.7), it can be inferred from the identity

$$\langle [b_\alpha^\beta(x) - B_\alpha^\beta(x)]\,\Omega,\ P[a^*]\Omega \rangle_m = 0, \tag{3.8}$$

P being an arbitrary polynomial. In order to prove Eq. (3.8) we shift by a conjugation the polynomial to the first factor in the right-hand side of (3.8) and apply afterwards the exchange relation (2.8) and Eq. (3.5). For completing the proof, one also employs that Ω is cyclic and $\langle \cdot\,, \cdot \rangle_m$ is continuous and nondegenerate.


Combining Eq. (3.7) with the fact that $a_\alpha(x)$ annihilate Ω, we conclude that Eqs. (2.5, 7, 8) allow for a purely algebraic derivation of the vacuum expectation values involving any number and combination of the generators $\{a_\alpha(x), a^{*\alpha}(x), b_\alpha^\beta(x)\}$. In particular, taking the vacuum expectation value of Eq. (2.6) one gets

$$R_{\alpha_1\alpha_2}^{\gamma_2\gamma_1}(x_1,x_2)\, B_{\gamma_1}^{\delta_1}(x_1)\, R_{\gamma_2\delta_1}^{\beta_1\delta_2}(x_2,-x_1)\, B_{\delta_2}^{\beta_2}(x_2) = B_{\alpha_2}^{\gamma_2}(x_2)\, R_{\alpha_1\gamma_2}^{\delta_2\delta_1}(x_1,-x_2)\, B_{\delta_1}^{\gamma_1}(x_1)\, R_{\delta_2\gamma_1}^{\beta_1\beta_2}(-x_2,-x_1). \tag{3.9}$$

We thus recover at the level of Fock representation the original boundary Yang–Baxter equation [4]. In addition, when one is dealing with reflection algebras, Eq. (2.16) implies

$$B_\alpha^\gamma(x)\, B_\gamma^\beta(-x) = \delta_\alpha^\beta. \tag{3.10}$$

In this case we refer to B as a reflection matrix. A final comment in this introductory part concerns the algebras $\mathcal{B}_\pm$. Using that $\{b_\alpha^\beta(x)\}$ are central elements, in a Fock representation of $\mathcal{B}_\pm$ one has

$$b_\alpha^\beta(x)\,\varphi = B_\alpha^\beta(x)\,\varphi \tag{3.11}$$

for any ϕ ∈ $\mathcal{D}$.

At this stage it is convenient to introduce the set M(R, B) of all elements of M obeying both Eqs. (2.15) and (3.6). Then the basic input fixing a Fock representation of the reflection algebra $\{\mathcal{B}_R, I_m\}$ is the triplet {R, B; m}, where R and B satisfy Eqs. (2.9, 10) and (3.9,10), and m ∈ M(R, B). Some explicit examples of such triplets have been found already by Cherednik [4]. With any {R, B; m} we associate a Fock representation denoted by $\mathcal{F}_{R,B;m}$. To the end of this section we will describe the explicit construction of $\mathcal{F}_{R,B;m}$.

Our first step will be to introduce the n-particle subspace $\mathcal{H}^n_{R,B}$ of $\mathcal{F}_{R,B;m}$. For this purpose we consider

$$\mathcal{H} = \bigoplus_{\alpha=1}^{N} L^2(\mathbb{R}^s), \tag{3.12}$$

equipped with the standard scalar product

$$(\varphi, \psi) = \int d^s x\, \varphi^{\dagger\alpha}(x)\,\psi_\alpha(x) = \sum_{\alpha=1}^{N} \int d^s x\, \overline{\varphi}_\alpha(x)\,\psi_\alpha(x). \tag{3.13}$$

For n ≥ 1 the n-particle space $\mathcal{H}^n_{R,B}$ we are looking for, will be a subspace of the n-fold tensor power $\mathcal{H}^{\otimes n}$, characterized by a suitable projection operator $P^{(n)}_{R,B}$. The ingredients for constructing $P^{(n)}_{R,B}$ are essentially two: a specific finite group and its representation in $\mathcal{H}^{\otimes n}$, defined in terms of the exchange factor R and the reflection matrix B. Let us concentrate first on the group. In the case of $\mathcal{A}_R$, this was [17] simply the permutation group $\mathcal{P}_n$. The physics behind $\mathcal{B}_R$ suggest to enlarge in this case the group by adding a reflection generator. More precisely, we consider the group $\mathcal{W}_n$ generated by $\{\tau, \sigma_i : i = 1, \dots, n-1\}$ which satisfy

$$\sigma_i \sigma_j = \sigma_j \sigma_i, \quad |i - j| \ge 2, \qquad \sigma_i \tau = \tau \sigma_i, \quad 1 \le i \le n-2, \tag{3.14}$$



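$$\sigma_i \sigma_{i+1} \sigma_i = \sigma_{i+1} \sigma_i \sigma_{i+1}, \qquad \sigma_{n-1}\tau\sigma_{n-1}\tau = \tau\sigma_{n-1}\tau\sigma_{n-1}, \tag{3.15}$$

$$\sigma_i^2 = \tau^2 = 1. \tag{3.16}$$

$\mathcal{W}_n$ is the Weyl group associated with the root systems of the classical Lie algebra $B_n$ and has $2^n n!$ elements. Although it contains no permutations, $\mathcal{W}_1 = \{1, \tau\}$ is nontrivial.

We turn now to the representation of $\mathcal{W}_n$ in $\mathcal{H}^{\otimes n}$. Observing that any element ϕ ∈ $\mathcal{H}^{\otimes n}$ can be viewed as a column whose entries are $\varphi_{\alpha_1\cdots\alpha_n}(x_1, \dots, x_n)$, we define the operators $\{T^{(n)}, S_i^{(n)} : i = 1, \dots, n-1\}$ acting on $\mathcal{H}^{\otimes n}$ according to:

$$\left[S_i^{(n)}\varphi\right]_{\alpha_1\dots\alpha_n}(x_1, \dots, x_i, x_{i+1}, \dots, x_n) = \left[R_{i\,i+1}(x_i, x_{i+1})\right]_{\alpha_1\dots\alpha_n}^{\beta_1\dots\beta_n} \varphi_{\beta_1\dots\beta_n}(x_1, \dots, x_{i+1}, x_i, \dots, x_n), \quad n \ge 2, \tag{3.17}$$

$$\left[T^{(n)}\varphi\right]_{\alpha_1\dots\alpha_n}(x_1, \dots, x_n) = \left[B_n(x_n)\right]_{\alpha_1\dots\alpha_n}^{\beta_1\dots\beta_n} \varphi_{\beta_1\dots\beta_n}(x_1, \dots, x_{n-1}, -x_n), \quad n \ge 1, \tag{3.18}$$

where

$$\left[R_{ij}(x_i, x_j)\right]_{\alpha_1\dots\alpha_n}^{\beta_1\dots\beta_n} = \delta_{\alpha_1}^{\beta_1}\delta_{\alpha_2}^{\beta_2} \cdots \widehat{\delta_{\alpha_i}^{\beta_i}} \cdots \widehat{\delta_{\alpha_j}^{\beta_j}} \cdots \delta_{\alpha_n}^{\beta_n}\, R_{\alpha_i\alpha_j}^{\beta_i\beta_j}(x_i, x_j) \tag{3.19}$$

and

$$\left[B_i(x)\right]_{\alpha_1\dots\alpha_n}^{\beta_1\dots\beta_n} = \delta_{\alpha_1}^{\beta_1}\delta_{\alpha_2}^{\beta_2} \cdots \widehat{\delta_{\alpha_i}^{\beta_i}} \cdots \delta_{\alpha_n}^{\beta_n}\, B_{\alpha_i}^{\beta_i}(x). \tag{3.20}$$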

The hat in Eqs. (3.19, 20) indicates that the corresponding symbol must be omitted. For implementing Eqs. (3.17, 18) on the whole $\mathcal{H}^{\otimes n}$, we assume at this stage that the matrix elements $R_{\alpha_1\alpha_2}^{\beta_1\beta_2}(x_1, x_2)$ and $B_\alpha^\beta(x)$ are bounded functions. We are now in position to prove

Proposition 4. $\{T^{(n)}, S_i^{(n)} : i = 1, \dots, n-1\}$ are bounded operators on $\mathcal{H}^{\otimes n}$ and the mapping

$$\chi^{(n)} : \tau \longmapsto T^{(n)}, \qquad \chi^{(n)} : \sigma_i \longmapsto S_i^{(n)}, \quad i = 1, \dots, n-1 \tag{3.21}$$

defines a representation of $\mathcal{W}_n$ in $\mathcal{H}^{\otimes n}$. Moreover,

$$P^{(n)}_{R,B} \equiv \frac{1}{2^n n!} \sum_{\nu \in \mathcal{W}_n} \chi^{(n)}(\nu) \tag{3.22}$$

is a bounded projection operator in $\mathcal{H}^{\otimes n}$.

Proof. The main point is to show that $\{T^{(n)}, S_i^{(n)} : i = 1, \dots, n-1\}$ obey Eqs. (3.14–16). This can be checked directly. Equations (3.14) are satisfied by construction. Equations (3.15) follow from (2.10) and (3.9). Finally, Eqs. (2.9) and (3.10) imply (3.16).



Let us observe in passing that $P^{(n)}_{R,B}$ is an orthogonal projector only if the $N \times N$ identity matrix $e$ belongs to $\mathcal{M}(R, B)$. In general $P^{(n)}_{R,B}$ is not orthogonal, but, being a bounded projection, it determines for any $n \geq 1$ a (nonempty) closed subspace

$$\mathcal{H}^n_{R,B} \equiv P^{(n)}_{R,B} \mathcal{H}^{\otimes n}. \tag{3.23}$$
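For $n = 1$ the group is $W_1 = \{1, \tau\}$ and (3.22) reduces to $P^{(1)}_{R,B} = \frac{1}{2}(1 + T^{(1)})$ with $[T^{(1)}\varphi](x) = B(x)\varphi(-x)$. The following minimal numerical sketch is an illustration under stated assumptions, not part of the original argument: it takes $N = 1$, $s = 1$, a finite grid symmetric about the origin, and two hypothetical scalar reflection factors obeying $B(x)B(-x) = 1$ (which is what $\tau^2 = 1$ requires of $T^{(1)}$), namely a pure phase and a bounded real function. In both cases the discretized $P$ is idempotent, but it is self-adjoint with respect to the plain Euclidean inner product on the grid only for the phase.

```python
import numpy as np

def reflection_matrix(B, xs):
    """Matrix of [T phi](x) = B(x) phi(-x) on a grid with xs[n-1-j] == -xs[j]."""
    n = len(xs)
    T = np.zeros((n, n), dtype=complex)
    for j in range(n):
        T[j, n - 1 - j] = B(xs[j])
    return T

def projector(B, xs):
    """Group average over W_1 = {1, tau}: P = (1 + T) / 2."""
    return 0.5 * (np.eye(len(xs)) + reflection_matrix(B, xs))

def is_idempotent(P):
    return np.allclose(P @ P, P)

def is_hermitian(P):
    return np.allclose(P, P.conj().T)

# Symmetric grid with an even number of points, so x = 0 is not sampled.
xs = np.linspace(-3.0, 3.0, 12)

eta = 0.7  # hypothetical sample value
B_phase = lambda x: (x - 1j * eta) / (x + 1j * eta)      # |B| = 1
B_real = lambda x: (2 + np.tanh(x)) / (2 - np.tanh(x))   # bounded, B(x) B(-x) = 1

P_phase = projector(B_phase, xs)
P_real = projector(B_real, xs)

if __name__ == "__main__":
    print(is_idempotent(P_phase), is_hermitian(P_phase))
    print(is_idempotent(P_real), is_hermitian(P_real))
```

The grid only mimics the reflection $x \mapsto -x$; it says nothing about boundedness on the full space, which in the text follows from the boundedness of $B$.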

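By construction the elements of $\mathcal{H}^n_{R,B}$ behave as follows:

$$\varphi_{\alpha_1 \ldots \alpha_n}(x_1, ..., x_i, x_{i+1}, ..., x_n) = \left[R_{i\,i+1}(x_i, x_{i+1})\right]^{\beta_1 \ldots \beta_n}_{\alpha_1 \ldots \alpha_n} \varphi_{\beta_1 \ldots \beta_n}(x_1, ..., x_{i+1}, x_i, ..., x_n), \tag{3.24}$$

$$\varphi_{\alpha_1 \ldots \alpha_n}(x_1, ..., x_n) = \left[B_n(x_n)\right]^{\beta_1 \ldots \beta_n}_{\alpha_1 \ldots \alpha_n} \varphi_{\beta_1 \ldots \beta_n}(x_1, ..., x_{n-1}, -x_n). \tag{3.25}$$

Setting $\mathcal{H}^0_{R,B} = \mathbb{C}^1$, we introduce also the finite particle space $\mathcal{F}^0_{R,B;m}(\mathcal{H})$ as the (complex) linear space of sequences $\varphi = \left(\varphi^{(0)}, \varphi^{(1)}, ..., \varphi^{(n)}, ...\right)$ with $\varphi^{(n)} \in \mathcal{H}^n_{R,B}$ and $\varphi^{(n)} = 0$ for $n$ large enough. The vacuum state is $\Omega = (1, 0, ..., 0, ...)$. At this point we define on $\mathcal{F}^0_{R,B;m}(\mathcal{H})$ the annihilation and creation operators $\{a(f), a^*(f) : f \in \mathcal{H}\}$ setting $a(f)\Omega = 0$ and

$$\left[a(f)\varphi\right]^{(n)}_{\alpha_1 \cdots \alpha_n}(x_1, ..., x_n) = \sqrt{n+1} \int d^s x\, f^{\dagger \alpha_0}(x)\, \varphi^{(n+1)}_{\alpha_0 \alpha_1 \cdots \alpha_n}(x, x_1, ..., x_n), \tag{3.26}$$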

$$\left[a^*(f)\varphi\right]^{(n)}_{\alpha_1 \cdots \alpha_n}(x_1, ..., x_n) = \sqrt{n} \left[P^{(n)}_{R,B}\, f \otimes \varphi^{(n-1)}\right]_{\alpha_1 \cdots \alpha_n}(x_1, ..., x_n), \tag{3.27}$$

for all $\varphi \in \mathcal{F}^0_{R,B;m}(\mathcal{H})$. The operators $a(f)$ and $a^*(f)$ are in general unbounded on $\mathcal{F}^0_{R,B;m}(\mathcal{H})$. However, for any $\psi^{(n)} \in \mathcal{H}^n_{R,B}$ one has the estimates

$$\| a(f)\psi^{(n)} \| \leq \sqrt{n}\, \| f \| \| \psi^{(n)} \|, \qquad \| a^*(f)\psi^{(n)} \| \leq \sqrt{n+1}\, \| P^{(n+1)}_{R,B} \| \| f \| \| \psi^{(n)} \|, \tag{3.28}$$

$\| \cdot \|$ being the $L^2$-norm. Therefore $a(f)$ and $a^*(f)$ are bounded on each $\mathcal{H}^n_{R,B}$. The right-hand side of Eq. (3.27) can be given an alternative form by implementing explicitly the action of $P^{(n)}_{R,B}$. The resulting expression is a bit complicated, but since in some cases it might be instructive, we give it for completeness:

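$$\begin{aligned}
\left[a^*(f)\varphi\right]^{(n)}_{\alpha_1 \cdots \alpha_n}(x_1, ..., x_n) = {} & \frac{1}{2\sqrt{n}} \Big\{ f_{\alpha_1}(x_1)\, \varphi^{(n-1)}_{\alpha_2 \cdots \alpha_n}(x_2, \ldots, x_n) + \left[C(x_1; x_2, ..., x_n)\right]^{\beta_1 \cdots \beta_n}_{\alpha_1 \cdots \alpha_n} f_{\beta_1}(-x_1)\, \varphi^{(n-1)}_{\beta_2 \cdots \beta_n}(x_2, \ldots, x_n) \Big\} \\
& + \frac{1}{2\sqrt{n}} \sum_{k=2}^{n} \left[R_{k-1\,k}(x_{k-1}, x_k) \cdots R_{1\,2}(x_1, x_k)\right]^{\beta_1 \cdots \beta_n}_{\alpha_1 \cdots \alpha_n} \Big\{ f_{\beta_1}(x_k)\, \varphi^{(n-1)}_{\beta_2 \cdots \beta_n}(x_1, \ldots, \widehat{x}_k, \ldots, x_n) \\
& \qquad + \left[C(x_k; x_1, ..., \widehat{x}_k, ..., x_n)\right]^{\gamma_1 \cdots \gamma_n}_{\beta_1 \cdots \beta_n} f_{\gamma_1}(-x_k)\, \varphi^{(n-1)}_{\gamma_2 \cdots \gamma_n}(x_1, \ldots, \widehat{x}_k, \ldots, x_n) \Big\},
\end{aligned} \tag{3.29}$$

where

$$\begin{aligned}
\left[C(x_k; x_1, ..., \widehat{x}_k, ..., x_n)\right]^{\beta_1 \cdots \beta_n}_{\alpha_1 \cdots \alpha_n} = \Big[ & R_{12}(x_k, x_1) R_{23}(x_k, x_2) \cdots \widehat{R}_{k\,(k+1)}(x_k, x_k) \cdots R_{(n-1)\,n}(x_k, x_n)\, B_n(x_k) \, \cdot \\
& R_{(n-1)\,n}(x_n, -x_k) \cdots \widehat{R}_{k\,(k+1)}(x_k, -x_k) \cdots R_{23}(x_2, -x_k) R_{12}(x_1, -x_k) \Big]^{\beta_1 \cdots \beta_n}_{\alpha_1 \cdots \alpha_n}.
\end{aligned} \tag{3.30}$$

We turn now to the boundary generators, defining $b^{\beta}_{\alpha}(x)$ as the multiplicative operator whose action on $\mathcal{F}^0_{R,B;m}(\mathcal{H})$ is given by Eq. (3.7) and

$$\begin{aligned}
\left[b^{\beta}_{\alpha}(x)\varphi\right]^{(n)}_{\gamma_1 \ldots \gamma_n}(x_1, ..., x_n) = \Big[ & R_{01}(x, x_1) R_{12}(x, x_2) \cdots R_{(n-1)\,n}(x, x_n)\, B_n(x) \, \cdot \\
& R_{(n-1)\,n}(x_n, -x) \cdots R_{12}(x_2, -x) R_{01}(x_1, -x) \Big]^{\beta \delta_1 \ldots \delta_n}_{\alpha \gamma_1 \ldots \gamma_n} \varphi^{(n)}_{\delta_1 \ldots \delta_n}(x_1, ..., x_n),
\end{aligned} \tag{3.31}$$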

for $n \geq 1$. Notice that the boundary generators $\{b^{\beta}_{\alpha}(x)\}$ preserve the particle number. By construction $\{a(f), a^*(f)\}$ and $\{b^{\beta}_{\alpha}(x)\}$ leave invariant $\mathcal{F}^0_{R,B;m}(\mathcal{H})$, which we take as the domain $\mathcal{D}$, whose existence was required in the definition of Fock representation. For deriving the commutation properties on $\mathcal{D}$ it is convenient to introduce the operator-valued distributions $a_{\alpha}(x)$ and $a^{*\alpha}(x)$ defined by

$$a(f) = \int d^s x\, f^{\dagger \alpha}(x)\, a_{\alpha}(x), \qquad a^*(f) = \int d^s x\, f_{\alpha}(x)\, a^{*\alpha}(x). \tag{3.32}$$

After a straightforward but lengthy computation, one verifies the validity of the following statement.

Proposition 5. The operator-valued distributions $\{a_{\alpha}(x), a^{*\alpha}(x)\}$ and $\{b^{\beta}_{\alpha}(x)\}$ satisfy the relations (2.3–8) on $\mathcal{D}$.

Assuming that $\mathcal{M}(R, B) \neq \emptyset$, we proceed further by implementing the involutions $\{I_m : m \in \mathcal{M}(R, B)\}$. For this purpose we have to construct a sesquilinear form $\langle \cdot\, , \cdot \rangle_m$ on $\mathcal{D}$, such that the mapping (2.12–14) is realized as the conjugation with respect to $\langle \cdot\, , \cdot \rangle_m$. Let us consider the following form on $\mathcal{D}$:

$$\langle \varphi, \psi \rangle_m = \sum_{n=0}^{\infty} \langle \varphi^{(n)}, \psi^{(n)} \rangle_m, \tag{3.33}$$

where

$$\langle \varphi^{(0)}, \psi^{(0)} \rangle_m = \overline{\varphi^{(0)}}\, \psi^{(0)}, \tag{3.34}$$

$$\langle \varphi^{(n)}, \psi^{(n)} \rangle_m = \int d^s x_1 \cdots d^s x_n\, \varphi^{(n)\dagger \alpha_1 \ldots \alpha_n}(x_1, ..., x_n)\, m^{\beta_1}_{\alpha_1}(x_1) \cdots m^{\beta_n}_{\alpha_n}(x_n)\, \psi^{(n)}_{\beta_1 \ldots \beta_n}(x_1, ..., x_n). \tag{3.35}$$

The right-hand side of (3.33) always makes sense because for any $\varphi, \psi \in \mathcal{D}$ the series is actually a finite sum. Using that $m(x)$ satisfies Eqs. (2.15) and (3.6), one easily proves

Proposition 6. The inner product defined by (3.33–35) is nondegenerate on $\mathcal{D}$ and the involution $I_m$ is implemented by $\langle \cdot\, , \cdot \rangle_m$-conjugation.

The next question concerns the positivity of $\langle \cdot\, , \cdot \rangle_m$. This point is conveniently discussed after introducing the subset $\mathcal{M}(R, B)_+$ of those elements of $\mathcal{M}(R, B)$ which are positive definite almost everywhere in $\mathbb{R}^s$. One has indeed

Proposition 7. The inner product $\langle \cdot\, , \cdot \rangle_m$ is positive definite on $\mathcal{D}$ if and only if $m \in \mathcal{M}(R, B)_+$.



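Proof. From Eq. (3.35) it is clear that if $m \in \mathcal{M}(R, B)_+$ then the inner product is positive definite. Conversely, suppose that $\langle \cdot\, , \cdot \rangle_m$ is positive definite. Let $y \in \mathbb{R}^s$ be a fixed nonzero vector, and take an arbitrary $f \in \mathcal{H}$ with support lying in the half space $x \cdot y \geq 0$. Consider the 1-particle state

$$\varphi_{\alpha}(x) = \left[P^{(1)}_{R,B} f\right]_{\alpha}(x) = \frac{1}{2} \left[ f_{\alpha}(x) + B^{\beta}_{\alpha}(x) f_{\beta}(-x) \right]. \tag{3.36}$$

Using Eqs. (3.6, 10) and the support properties of $f_{\alpha}$, one gets

$$\langle \varphi, \varphi \rangle_m = \frac{1}{2} \int d^s x\, f^{\dagger \alpha}(x)\, m^{\beta}_{\alpha}(x)\, f_{\beta}(x). \tag{3.37}$$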

Since $f$ is arbitrary, positivity of $\langle \cdot\, , \cdot \rangle_m$ implies that $m(x)$ is positive definite almost everywhere in the half space $x \cdot y \geq 0$. Finally, the arbitrariness of $y$ allows one to extend the validity of this conclusion to $\mathbb{R}^s$.

Proposition 7 shows that there are two kinds of Fock representations of $\mathcal{B}_R$. The representation $\mathcal{F}_{R,B;m}$ will be called of type A if $\langle \cdot\, , \cdot \rangle_m$ is positive definite; otherwise we will say that $\mathcal{F}_{R,B;m}$ is of type B. The standard probabilistic interpretation of quantum field theory applies directly only to the A-series. This does not mean however that the B-series has no physical applications. In the latter case one has to isolate first a physical subspace where $\langle \cdot\, , \cdot \rangle_m$ is nonnegative. This is usually done by symmetry considerations and may depend on the specific model under consideration.

The final step in completing the derivation of $\mathcal{F}_{R,B;m}$ is the construction of the representation space $\mathcal{L}$. It is necessary at this stage to consider the classes A and B separately. For $m \in \mathcal{M}(R, B)_+$ the inner product space $\{\mathcal{D}, \langle \cdot\, , \cdot \rangle_m\}$ is actually a pre-Hilbert space. Let $\mathcal{F}_{R,B;m}(\mathcal{H})$ be the completion of $\mathcal{D}$ with respect to the Hilbert space topology. Clearly $\mathcal{L} = \mathcal{F}_{R,B;m}(\mathcal{H})$ satisfies all the requirements. For type B representations there is no distinguished Hilbert space topology for completing $\mathcal{D}$. A natural substitute is the topology $\tau$ defined by the family of seminorms

$$s_{\psi}(\varphi) \equiv |\langle \psi, \varphi \rangle_m|, \qquad \varphi, \psi \in \mathcal{D}. \tag{3.38}$$

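It turns out [2] that $\tau$ is the weakest locally convex topology in which $\langle \cdot\, , \cdot \rangle_m$ is separately $\tau$-continuous. Moreover, $\tau$ is a Hausdorff topology, because $\langle \cdot\, , \cdot \rangle_m$ is nondegenerate. Therefore $\mathcal{D}$ admits a unique (up to isomorphism) $\tau$-completion, which has all the needed properties and provides the space $\mathcal{L}$ for the B-series.

We conclude this section by a general observation, which concerns A-type representations only and is based on the fact that any $m \in \mathcal{M}(R, B)_+$ can be written in the form $m(x) = p^{\dagger}(x)\, p(x)$, where $p(x)$ is an invertible matrix. Notice that $p(x)$ is not unitary unless $m(x) = e$. It is easy to show that the mapping induced by

$$a_{\alpha}(x) \longmapsto p^{\beta}_{\alpha}(x)\, a_{\beta}(x), \qquad a^{*\alpha}(x) \longmapsto a^{*\beta}(x)\, (p^{-1})^{\alpha}_{\beta}(x), \tag{3.39}$$

$$b^{\beta}_{\alpha}(x) \longmapsto p^{\gamma}_{\alpha}(x)\, b^{\delta}_{\gamma}(x)\, (p^{-1})^{\beta}_{\delta}(-x) \tag{3.40}$$

is an isomorphism between $\{\mathcal{B}_R, I_m\}$ and $\{\mathcal{B}_{R'}, I_e\}$, where

$$R'^{\beta_1 \beta_2}_{\alpha_1 \alpha_2}(x_1, x_2) = p^{\gamma_1}_{\alpha_1}(x_1)\, p^{\gamma_2}_{\alpha_2}(x_2)\, R^{\delta_1 \delta_2}_{\gamma_1 \gamma_2}(x_1, x_2)\, (p^{-1})^{\beta_1}_{\delta_1}(x_2)\, (p^{-1})^{\beta_2}_{\delta_2}(x_1). \tag{3.41}$$

Setting

$$B'^{\beta}_{\alpha}(x) = p^{\gamma}_{\alpha}(x)\, B^{\delta}_{\gamma}(x)\, (p^{-1})^{\beta}_{\delta}(-x), \tag{3.42}$$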



one has in addition that $\mathcal{F}_{R,B;m}$ and $\mathcal{F}_{R',B';e}$ are equivalent. In other words, for any $m \in \mathcal{M}(R, B)_+$ one can equivalently replace $I_m$ with $I_e$, suitably modifying (see Eqs. (3.41, 42)) the exchange factor $R$ and the reflection matrix $B$.

Let us mention finally that the above formalism carries over easily to the Fock representations of $\mathcal{B}_{R,\lambda}$. One must only replace the Lebesgue measure $d^s x$ by the $\lambda$-invariant measure $|\det \lambda'(x)|^{1/2} d^s x$.

4. Applications

4.1. Free Boson Field on the Half Line. In order to give a first idea about the physical content of the algebra $\mathcal{B}_R$, we focus below on a simple example of quantization in $\mathbb{R}_+$. More precisely, we construct the free boson field $\Phi(t, x)$, satisfying

$$\left(\partial_t^2 - \partial_x^2 + M^2\right) \Phi(t, x) = 0, \qquad x \in \mathbb{R}_+, \tag{4.1}$$

with the boundary condition

$$\lim_{x \downarrow 0} (\partial_x - \eta)\, \Phi(t, x) = 0, \qquad \eta \geq 0. \tag{4.2}$$

The standard Neumann and Dirichlet boundary conditions are recovered from (4.2) by setting $\eta = 0$ or taking the limit $\eta \to \infty$ respectively. We will show that the quantization of the system (4.1, 2) can be described in terms of $\mathcal{B}_R$ with $N = 1$ and $R = 1$. The exchange structure of this boundary algebra is trivial, which allows one to isolate and easily illustrate the physical implications of the boundary generator $b(k)$. In this section the arguments of the $\mathcal{B}_R$-generators have the meaning of momenta and are denoted therefore by $k$, $p$, etc. Let us introduce the phase factor

$$B(k) = \frac{k - i\eta}{k + i\eta}. \tag{4.3}$$
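The elementary properties of (4.3) that matter in what follows are easy to confirm numerically. The sketch below is an illustration with hypothetical helper names (`B`, `check_phase`) and arbitrarily chosen sample values; it assumes real $\eta$ and real nonzero $k$. It checks that $B(k)$ has unit modulus, that $B(k)B(-k) = 1$, that $\overline{B(k)} = B(-k)$, and that $B$ tends to $1$ and $-1$ in the Neumann ($\eta = 0$) and Dirichlet ($\eta \to \infty$) cases respectively.

```python
def B(k, eta):
    """Phase factor (4.3); k is real and nonzero, eta is real."""
    return (k - 1j * eta) / (k + 1j * eta)

def check_phase(k, eta, tol=1e-12):
    """True if B is unimodular, B(k) B(-k) = 1 and conj(B(k)) = B(-k)."""
    unimodular = abs(abs(B(k, eta)) - 1.0) < tol
    inverse = abs(B(k, eta) * B(-k, eta) - 1.0) < tol
    conjugate = abs(B(k, eta).conjugate() - B(-k, eta)) < tol
    return unimodular and inverse and conjugate

if __name__ == "__main__":
    for k in (0.3, 1.7, -2.4):
        for eta in (0.0, 0.5, 3.0):
            print(k, eta, check_phase(k, eta))
    print(B(1.3, 0.0))    # Neumann case
    print(B(1.3, 1e12))   # approaches the Dirichlet value
```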

Then the triplet $\{R = 1, B; m = e\}$ satisfies all requirements of the previous section and one can construct the corresponding Fock representation $\mathcal{F}_{1,B;e}$. Equation (3.31) shows that the operator $b(k)$ acts as a multiplication by $B(k)$. Therefore, one is left in $\mathcal{F}_{1,B;e}$ with the following relations:

$$[a(k), a(p)] = 0, \qquad [a^*(k), a^*(p)] = 0, \qquad [a(k), a^*(p)] = \frac{1}{2}\delta(k - p) + \frac{1}{2}B(k)\delta(k + p). \tag{4.4}$$

Notice that these would be the standard canonical commutation relations, apart from the term $B(k)\delta(k + p)$. We define now the field operator

$$\Phi(t, x) = \int_{-\infty}^{\infty} \frac{dk}{\sqrt{2\pi\omega(k)}} \left[ a(k)\, e^{-i\omega(k)t + ikx} + a^*(k)\, e^{i\omega(k)t - ikx} \right], \tag{4.5}$$

where

$$\omega(k) = \sqrt{M^2 + k^2}. \tag{4.6}$$


This is just the expression in the case without boundary, but one should keep in mind that now the algebra of creation and annihilation operators is different. By means of (4.4) one easily derives the basic correlator, the two-point Wightman function

$$\langle \Omega, \Phi(t_1, x_1)\Phi(t_2, x_2)\Omega \rangle_e = \int_{-\infty}^{\infty} \frac{dk}{4\pi\omega(k)}\, e^{-i\omega(k)t_{12}} \left[ e^{ik(x_1 - x_2)} + B(k)\, e^{ik(x_1 + x_2)} \right], \tag{4.7}$$

where $t_{12} = t_1 - t_2$. The right-hand side of Eq. (4.7) defines a tempered distribution ($B(k)$ is $C^{\infty}$ and bounded on $\mathbb{R}$), which satisfies Eqs. (4.1, 2). It consists of two terms. The term without $B(k)$ is the usual two-point Wightman function of the system without boundary. The term proportional to $B(k)$ has its origin in the boundary generator and explicitly breaks translation and Lorentz invariance. It is remarkable that in spite of this fact, $\Phi(t, x)$ is a local field. The validity of this statement can be deduced from the commutator

$$[\Phi(t_1, x_1), \Phi(t_2, x_2)] = iD(t_1 - t_2, x_1, x_2). \tag{4.8}$$

One has

$$D(t, x_1, x_2) = \Delta(t, x_1 - x_2) + \widetilde{\Delta}(t, x_1 + x_2), \tag{4.9}$$

where

$$\Delta(t, x_1 - x_2) = -\int_{-\infty}^{\infty} \frac{dk}{2\pi\omega(k)} \sin[\omega(k)t]\, e^{ik(x_1 - x_2)} \tag{4.10}$$

is the ordinary Pauli–Jordan function with mass $M$ and

$$\widetilde{\Delta}(t, x_1 + x_2) = -\int_{-\infty}^{\infty} \frac{dk}{2\pi\omega(k)} \sin[\omega(k)t]\, B(k)\, e^{ik(x_1 + x_2)}. \tag{4.11}$$

Observing that for $x_1, x_2 \in \mathbb{R}_+$ the inequality $|t_1 - t_2| < |x_1 - x_2|$ implies $|t_1 - t_2| < x_1 + x_2$, one concludes that the locality properties of the field $\Phi$ are governed by the behavior of $\widetilde{\Delta}(t, x)$ for $|t| < x$. The latter can be easily evaluated and, using that $\eta \geq 0$, one finds

$$\widetilde{\Delta}(t, x)\Big|_{|t| < x} = 0. \tag{4.12}$$

So, $\Phi(t, x)$ is a local field when $x \in \mathbb{R}_+$. Notice that this is not the case if $\Phi(t, x)$ is considered on the whole real line. The two terms $\Delta$ and $\widetilde{\Delta}$ in the commutator have a very intuitive explanation. As long as $|t_1 - t_2| < |x_1 - x_2|$ no signal can propagate between the points $(t_1, x_1)$ and $(t_2, x_2)$ and the commutator vanishes. When $|x_1 - x_2| < |t_1 - t_2| < x_1 + x_2$ signals can propagate directly between the two points, but they cannot be influenced by the boundary and the only contribution comes from the standard Pauli–Jordan function $\Delta$. As soon as $x_1 + x_2 = |t_1 - t_2|$, signals starting from one of the points can be reflected at the boundary and reach the other point. This phenomenon is responsible for the term $\widetilde{\Delta}$, and is codified in the term proportional to $B(k)$ of the boundary algebra (4.4).

The case $\eta < 0$ is slightly more delicate due to the presence of a bound state in the one-particle energy spectrum, which must be taken into account in the construction of a local field. The results of this subsection can be generalized in an obvious way to higher space-time dimensions.

4.2. Scattering on the Half Line. Before entering the details of the application of $\mathcal{B}_R$ to factorized scattering with reflecting boundary conditions, we will discuss the simple



case of particles of mass $M$ freely moving on $\mathbb{R}_+$ and bouncing off a wall at $x = 0$. The relevant one-particle space is $L^2(\mathbb{R}_+, dx)$. We denote by $\mathcal{D}_{\eta} \subset L^2(\mathbb{R}_+, dx)$ the subspace of $C^{\infty}$-functions on $\mathbb{R}_+$ which vanish for sufficiently large $x$, have square integrable first and second derivatives and obey

$$\lim_{x \downarrow 0} \left( \frac{d}{dx} - \eta \right) \varphi(x) = 0. \tag{4.13}$$

The current

$$j = -\frac{i}{2M} \left( \overline{\varphi}\, \frac{d\varphi}{dx} - \frac{d\overline{\varphi}}{dx}\, \varphi \right) \tag{4.14}$$

satisfies $j(0) = 0$ for all $\varphi \in \mathcal{D}_{\eta}$, thus preventing any probability flow through the wall $x = 0$. For a one-particle Hamiltonian we take

$$H^{(1)} = -\frac{1}{2M} \Delta, \tag{4.15}$$

defined on $\mathcal{D}_{\eta}$. The evolution problem is well posed because $H^{(1)}$, which is obviously symmetric, is actually essentially self-adjoint [22]. A set of (generalized) eigenstates verifying (4.13) is

$$\psi_k(x) = e^{-ikx} + B(k)\, e^{ikx}, \qquad k \in \mathbb{R}, \tag{4.16}$$

where $B(k)$ is given by Eq. (4.3). The eigenvectors (4.16), which represent physically scattering states, satisfy

$$\psi_{-k}(x) = \overline{\psi}_k(x) = B(-k)\, \psi_k(x). \tag{4.17}$$
For η ≥ 0 the systems {ψk : k > 0} and {ψ−k : k > 0} are separately complete and are related via complex conjugation, which in the physical context implements time reversal. When η < 0, there is in addition a unique bound state p (4.18) ψb (x) = −2η eηx , with energy E = −η 2 /2M . The n-body Hamiltonian of the associated multiparticle Bose system H (n) = −

1 (41 + ... + 4n ) 2M

(4.19)

n – the subspace of symmetric functions in Dη⊗n . Clearly, there is neither is defined on Dη+ particle production nor particle collision in this model. There is however a nontrivial reflection from the boundary, which can be described as follows. One can consider ψk as representing a particle, which when time t → −∞, travels with momentum −k towards the wall. Accordingly, we take

1 | − kiin = √ ψk (x), 2π

k > 0,

(4.20)

as a basis of one-particle “in"-states. Concerning the basis of one-particle “out"-states, the analogous consideration gives 1 1 |kiout = √ ψ k (x) = √ ψ−k (x), 2π 2π

k > 0.

(4.21)



The scattering operator is defined at this point by

$$S\, |k\rangle^{\rm out} = |-k\rangle^{\rm in}. \tag{4.22}$$

For $\eta \geq 0$, $S$ is by construction a unitary operator on $L^2(\mathbb{R}_+, dx)$. For $\eta < 0$, $S$ is defined and unitary on the subspace of $L^2(\mathbb{R}_+, dx)$ which is orthogonal to the bound state (4.18). The one-particle matrix elements of $S$ read

$${}^{\rm out}\langle k | S | p \rangle^{\rm out} = {}^{\rm out}\langle k | -p \rangle^{\rm in} = \frac{1}{2\pi} \int_0^{\infty} dx\, \psi_k(x)\, \psi_p(x) = B(k)\, \delta(k - p). \tag{4.23}$$

More generally

$${}^{\rm out}\langle k_1, ..., k_n | -p_1, ..., -p_n \rangle^{\rm in} = B(k_1) ... B(k_n)\, \delta(k_1 - p_1) ... \delta(k_n - p_n), \tag{4.24}$$

provided that $k_1 > ... > k_n > 0$ and $p_1 > ... > p_n > 0$. Our main observation now is that the above simple scattering problem admits a field-theoretic solution in terms of the algebra (4.4). In fact, it is easy to verify that the vacuum expectation values

$$2^n \langle a^*(k_1) ... a^*(k_n)\Omega\, ,\, a^*(-p_1) ... a^*(-p_n)\Omega \rangle_e, \tag{4.25}$$

in the Fock representation $\mathcal{F}_{1,B;e}$ reproduce precisely the transition amplitudes (4.24). We have therefore the following Fock realization

$$|k_1, ..., k_n\rangle^{\rm out} = 2^{n/2}\, a^*(k_1) ... a^*(k_n)\Omega, \qquad k_1 > ... > k_n > 0, \tag{4.26}$$

$$|-p_1, ..., -p_n\rangle^{\rm in} = 2^{n/2}\, a^*(-p_1) ... a^*(-p_n)\Omega, \qquad p_1 > ... > p_n > 0, \tag{4.27}$$

of the interpolating states. Summarizing, the scattering operator of our simple model has a purely algebraic characterization. In this respect, the term proportional to $B(k)$ in (4.4) is the algebraic counterpart of the boundary condition, given analytically by Eq. (4.13).

At this stage we have enough background for facing the more complicated problem of scattering in integrable models with reflecting boundary conditions in 1+1 space-time dimensions. The presence of particle collisions in this case leads in general to the boundary algebras $\mathcal{B}_R$ with $R \neq 1$. Using the Fock representations of $\mathcal{B}_R$, derived in the previous section, we present below a rigorous construction of the S-matrix, which generalizes some previous results [20] valid in the absence of a boundary. We also show that under certain conditions on the triplet $\{R, B; m\}$, the transition amplitudes, originally derived by Cherednik [4], are indeed Hilbert space matrix elements of a unitary operator.

The asymptotic particles of integrable models are parametrized by their rapidity $\theta \in \mathbb{R}$ and internal "isotopic" index $\alpha = 1, ..., N$. We recall that in the case of relativistic dispersion relation the energy-momentum vector is expressed in terms of $\theta$ and the mass $M$ according to

$$p^0 = M \cosh(\theta), \qquad p^1 = M \sinh(\theta). \tag{4.28}$$

An elastic reflection $(p^0, p^1) \longmapsto (p^0, -p^1)$ corresponds therefore to the transformation $\theta \longmapsto -\theta$. The fundamental building blocks for constructing the scattering operator are the matrices $R^{\beta_1 \beta_2}_{\alpha_1 \alpha_2}(\theta_1, \theta_2)$ and $B^{\beta}_{\alpha}(\theta)$, which are supposed to satisfy Eqs. (2.9, 10) and (3.9, 10). We allow for $R$ to depend on $\theta_1$ and $\theta_2$ separately (and not only on $\theta_1 - \theta_2$), because in general the presence of boundaries breaks Lorentz invariance.



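A crucial observation is that the algebra $\mathcal{B}_R$ alone does not determine the scattering operator $S$ we are looking for: one must fix in addition an involution $I_m$. The latter selects a Fock representation $\mathcal{F}_{R,B;m}$, which is the main ingredient for constructing $S$. Postponing the discussion of the physical meaning of the choice of $m \in \mathcal{M}(R, B)$ to the end of this section, it might be instructive for the time being to describe the set $\mathcal{M}(R, B)$ for some familiar integrable model. We choose the $SU(2)$ Thirring model. In this case $N = 2$ and, setting $\theta_{12} = \theta_1 - \theta_2$, the relevant $R$-matrix reads [1]

$$R(\theta_1, \theta_2) = \frac{i\pi \rho(\theta_{12})}{(i\pi - \theta_{12})\, \rho(-\theta_{12})} \sum_{\alpha, \beta = 1}^{2} \left[ E_{\alpha\alpha} \otimes E_{\beta\beta} + \frac{\theta_{12}}{i\pi} (-1)^{\alpha + \beta} E_{\alpha\beta} \otimes E_{\beta\alpha} \right], \tag{4.29}$$

where $E_{\alpha\beta}$ are the Weyl matrices and

$$\rho(\theta) = \Gamma\!\left( \frac{1}{2} + \frac{\theta}{2\pi i} \right) \Gamma\!\left( 1 - \frac{\theta}{2\pi i} \right). \tag{4.30}$$

The general solution of Eqs. (3.9, 10), subject to the physical constraint of boundary crossing symmetry [10], is given in [3]. Let us concentrate for simplicity on the diagonal solutions

$$B(\theta) = \frac{\beta(\theta)}{\beta(-\theta)} \left[ E_{11} + \frac{\eta - \theta}{\eta + \theta}\, E_{22} \right], \tag{4.31}$$

with $\eta \in \mathbb{C}$ and

$$\beta(\theta) = \Gamma\!\left( \frac{3}{4} + \frac{\theta}{2\pi i} \right) \Gamma\!\left( 1 - \frac{\theta}{2\pi i} \right) \Gamma\!\left( \frac{\eta + i\pi - \theta}{2\pi i} \right) \Gamma\!\left( \frac{\eta + 2\pi i + \theta}{2\pi i} \right). \tag{4.32}$$

Let $\mu_+$ ($\mu_-$) be any measurable real-valued even (odd) function, such that $\mu_{\pm}$ and $1/\mu_{\pm}$ are bounded. Then, if ${\rm Re}\, \eta = 0$, the set $\mathcal{M}(R, B)$ contains all matrices of the form

$$m(\theta) = \mu_+(\theta) \left( E_{11} + \xi E_{22} \right), \qquad \xi \in \mathbb{R}, \quad \xi \neq 0. \tag{4.33}$$

In addition, for $\eta = 0$ one has the solutions

$$m(\theta) = \mu_-(\theta) \left( \zeta E_{12} + \overline{\zeta} E_{21} \right), \qquad \zeta \in \mathbb{C}. \tag{4.34}$$

From Eq. (4.33) it follows that $\mathcal{M}(R, B)_+ \neq \emptyset$.

After this concrete example illustrating the set $\mathcal{M}(R, B)$, we return to the general framework. The idea is to extend the formalism, developed at the beginning of this section for the Schrödinger particle on the half line, to the case of integrable models. In what follows we assume that

$$\mathcal{M}(R, B)_+ \neq \emptyset \tag{4.35}$$

and consider representations $\mathcal{F}_{R,B;m}$ of type A. The physical motivation for this restriction is quite evident. According to Proposition 7, it ensures positivity of the metric in the asymptotic spaces $\mathcal{F}^{\rm out}$ and $\mathcal{F}^{\rm in}$, which we are going to construct now. For this purpose we introduce the following relation in $C_0^{\infty}(\mathbb{R})$:

$$f_1 \succ f_2 \iff \theta_1 > \theta_2 \quad \forall\, \theta_1 \in {\rm supp}(f_1), \quad \forall\, \theta_2 \in {\rm supp}(f_2). \tag{4.36}$$

We will adopt also the notation

$$f \succ 0 \iff \theta > 0 \quad \forall\, \theta \in {\rm supp}(f), \tag{4.37}$$

and

$$\widetilde{f}(\theta) = f(-\theta). \tag{4.38}$$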

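As suggested by Eqs. (4.26, 27), $\mathcal{F}^{\rm out}$ and $\mathcal{F}^{\rm in}$ are generated by finite linear combinations of the vectors ($k \geq 1$)

$$\mathcal{E}^{\rm out} = \{ \Omega,\ a^*(f_1) \cdots a^*(f_k)\Omega \ :\ f_{1\alpha_1} \succ \cdots \succ f_{k\alpha_k} \succ 0,\ \forall\, \alpha_1, ..., \alpha_k = 1, ..., N \} \tag{4.39}$$

and

$$\mathcal{E}^{\rm in} = \{ \Omega,\ a^*(\widetilde{g}_1) \cdots a^*(\widetilde{g}_k)\Omega \ :\ g_{1\beta_1} \succ \cdots \succ g_{k\beta_k} \succ 0,\ \forall\, \beta_1, ..., \beta_k = 1, ..., N \} \tag{4.40}$$

respectively. By construction both $\mathcal{F}^{\rm out}$ and $\mathcal{F}^{\rm in}$ are linear subspaces of the Hilbert space $\mathcal{F}_{R,B;m}(\mathcal{H})$. One should notice that in principle there are elements of $\mathcal{F}^0_{R,B;m}(\mathcal{H})$ which belong neither to $\mathcal{F}^{\rm out}$ nor to $\mathcal{F}^{\rm in}$. We call them mixed vectors. Linear combinations involving both in- and out-states provide in general examples of such vectors. In spite of the existence of mixed vectors, the subspaces $\mathcal{F}^{\rm out}$ and $\mathcal{F}^{\rm in}$ satisfy a sort of asymptotic completeness, which is essential for constructing the S-matrix. More precisely, one has

Proposition 8. $\mathcal{F}^{\rm out}$ and $\mathcal{F}^{\rm in}$ separately are dense in $\mathcal{F}_{R,B;m}(\mathcal{H})$.

Proof. We focus on $\mathcal{F}^{\rm out}$. Let $\varphi \in \mathcal{F}_{R,B;m}(\mathcal{H})$ and let us assume that

$$\langle \varphi, \psi \rangle_m = 0 \qquad \forall\, \psi \in \mathcal{F}^{\rm out}. \tag{4.41}$$

In order to prove the thesis, we have to show that $\varphi = \left( \varphi^{(0)}, \varphi^{(1)}, ..., \varphi^{(n)}, ... \right) = 0$. Obviously $\varphi^{(0)} = 0$. Let us consider $\varphi^{(n)}$ for arbitrary but fixed $n \geq 1$. Equation (3.27) and Eq. (4.41) imply that

$$\langle \varphi^{(n)}, a^*(f_1) \cdots a^*(f_n)\Omega \rangle_m = \int d\theta_1 \cdots d\theta_n\, \varphi^{(n)\dagger \alpha_1 \ldots \alpha_n}(\theta_1, ..., \theta_n)\, m^{\beta_1}_{\alpha_1}(\theta_1) \cdots m^{\beta_n}_{\alpha_n}(\theta_n)\, f_{1\beta_1}(\theta_1) \cdots f_{n\beta_n}(\theta_n) = 0 \tag{4.42}$$

for all $f_1, ..., f_n$ such that $f_{1\alpha_1} \succ \cdots \succ f_{n\alpha_n} \succ 0$ $\forall\, \alpha_1, ..., \alpha_n = 1, ..., N$. Therefore

$$\varphi^{(n)}_{\alpha_1 \ldots \alpha_n}(\theta_1, ..., \theta_n) = 0 \tag{4.43}$$

in the domain $\theta_1 > \cdots > \theta_n > 0$. Finally, using that $\varphi^{(n)} \in \mathcal{H}^n_{R,B}$ has definite exchange and reflection properties described by Eqs. (3.24, 25), one can extend the domain of validity of (4.43) and conclude that $\varphi^{(n)}$ actually vanishes almost everywhere in $\mathbb{R}^n$. Clearly, a similar argument applies also to the case of $\mathcal{F}^{\rm in}$.

We observe in passing that the definition of $\mathcal{F}^{\rm out}$ and $\mathcal{F}^{\rm in}$ does not explicitly involve the boundary generators $\{b^{\beta}_{\alpha}(\theta)\}$. This fact is not surprising because $\Omega$ is cyclic with respect to $\{a^{*\alpha}(\theta)\}$.

At this point we are ready to define the scattering matrix $S$ and to prove that it is a unitary operator in $\mathcal{F}_{R,B;m}(\mathcal{H})$. The construction consists essentially of three steps. One starts by defining $S$ as the following mapping of $\mathcal{E}^{\rm out}$ onto $\mathcal{E}^{\rm in}$:

$$S\Omega = \Omega, \tag{4.44}$$

$$S\, a^*(g_1) a^*(g_2) \cdots a^*(g_k)\Omega = a^*(\widetilde{g}_1) a^*(\widetilde{g}_2) \cdots a^*(\widetilde{g}_k)\Omega, \tag{4.45}$$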



where $g_{1\beta_1} \succ \cdots \succ g_{k\beta_k} \succ 0$, $\forall\, \beta_1, ..., \beta_k = 1, ..., N$. It is not difficult to check that

$$\langle S\psi^{\rm out}, S\varphi^{\rm out} \rangle_m = \langle \psi^{\rm out}, \varphi^{\rm out} \rangle_m, \qquad \forall\, \psi^{\rm out}, \varphi^{\rm out} \in \mathcal{E}^{\rm out}. \tag{4.46}$$

Moreover, $S$ is invertible and

$$\langle S^{-1}\psi^{\rm in}, S^{-1}\varphi^{\rm in} \rangle_m = \langle \psi^{\rm in}, \varphi^{\rm in} \rangle_m, \qquad \forall\, \psi^{\rm in}, \varphi^{\rm in} \in \mathcal{E}^{\rm in}. \tag{4.47}$$

The second step is to extend $S$ and $S^{-1}$ by linearity to the whole $\mathcal{F}^{\rm out}$ and $\mathcal{F}^{\rm in}$ respectively. Clearly, one has to show that these extensions are correctly defined. Consider for instance $S$ and suppose that there exists a sequence

$$g^i_{1\beta_1} \succ \cdots \succ g^i_{k\beta_k} \succ 0, \qquad \forall\, \beta_1, ..., \beta_k = 1, ..., N, \qquad i = 1, ..., M,$$

such that

$$a^*(g_1) a^*(g_2) \cdots a^*(g_k)\Omega = \sum_{i=1}^{M} a^*(g^i_1) a^*(g^i_2) \cdots a^*(g^i_k)\Omega. \tag{4.48}$$

In order to prove that the linear extension of $S$ is not ambiguous, we must show that

$$a^*(\widetilde{g}_1) a^*(\widetilde{g}_2) \cdots a^*(\widetilde{g}_k)\Omega = \sum_{i=1}^{M} a^*(\widetilde{g}^i_1) a^*(\widetilde{g}^i_2) \cdots a^*(\widetilde{g}^i_k)\Omega. \tag{4.49}$$

The argument is as follows. In the domain $\theta_1 > \theta_2 > ... > \theta_k > 0$ Eq. (4.48) implies that

$$g_{1\beta_1}(\theta_1)\, g_{2\beta_2}(\theta_2) \cdots g_{k\beta_k}(\theta_k) = \sum_{i=1}^{M} g^i_{1\beta_1}(\theta_1)\, g^i_{2\beta_2}(\theta_2) \cdots g^i_{k\beta_k}(\theta_k). \tag{4.50}$$

Because of the support properties of $\{g_j\}$ and $\{g^i_j\}$ one has that Eq. (4.50) holds actually in $\mathbb{R}^k$, which projected by $P^{(k)}_{R,B}$ proves the validity of Eq. (4.49).

It is easy to see also that Eqs. (4.46, 47) remain valid for the linear extensions of $S$ and $S^{-1}$ on $\mathcal{F}^{\rm out}$ and $\mathcal{F}^{\rm in}$ respectively. This fact implies in particular that both $S$ and $S^{-1}$ are bounded linear operators.

Finally, one extends $S$ and $S^{-1}$ by continuity to $\mathcal{F}_{R,B;m}(\mathcal{H})$. Because of the asymptotic completeness proven in Proposition 8, the extensions are unique and define the unitary scattering operator and its inverse. As should be expected from integrability, one has $S\mathcal{H}^n_{R,B} \subset \mathcal{H}^n_{R,B}$. Notice however that, in contrast to the case without boundary, where the scattering operator leaves invariant each one-particle state, the S-matrix constructed above acts nontrivially already in $\mathcal{H}^1_{R,B}$.

By construction the matrix elements of $S$ between out-states in the Fock space $\mathcal{F}_{R,B;e}(\mathcal{H})$ reproduce precisely the transition amplitudes derived by Cherednik [4]. Since the latter are referred to the involution $I_e$, a natural question arising at this point concerns the physical meaning of other possible choices of $m \in \mathcal{M}(R, B)_+$. For answering this question we consider two generic asymptotic states $\varphi^{\rm in} \in \mathcal{F}^{\rm in}$ and $\psi^{\rm out} \in \mathcal{F}^{\rm out}$. If both $m, e \in \mathcal{M}(R, B)_+$, one may compare the transition amplitudes associated with the involutions $I_m$ and $I_e$. One finds

$$\langle \psi^{\rm out}, \varphi^{\rm in} \rangle_m = \langle \psi^{\rm out}, \varphi^{\rm in}_d \rangle_e = \langle \psi^{\rm out}_d, \varphi^{\rm in} \rangle_e, \tag{4.51}$$

where $\varphi^{\rm in}_d$ and $\psi^{\rm out}_d$ are the "dressed" in- and out-states



$$(\varphi^{\rm in}_d)^{(n)}_{\alpha_1 \ldots \alpha_n}(\theta_1, ..., \theta_n) = m^{\gamma_1}_{\alpha_1}(\theta_1) \cdots m^{\gamma_n}_{\alpha_n}(\theta_n)\, (\varphi^{\rm in})^{(n)}_{\gamma_1 \ldots \gamma_n}(\theta_1, ..., \theta_n), \tag{4.52}$$

$$(\psi^{\rm out}_d)^{(n)}_{\beta_1 \ldots \beta_n}(\theta_1, ..., \theta_n) = m^{\dagger \gamma_1}_{\ \beta_1}(\theta_1) \cdots m^{\dagger \gamma_n}_{\ \beta_n}(\theta_n)\, (\psi^{\rm out})^{(n)}_{\gamma_1 \ldots \gamma_n}(\theta_1, ..., \theta_n). \tag{4.53}$$

It follows from Eq. (4.51) that the effect of the involution $I_m$ is exactly reproduced in $\mathcal{F}_{R,B;e}$ by appropriate dressing (4.52, 53) of the in- or out-states. The results of this section can be summarized as follows.

Proposition 9. Suppose that the exchange factor $R$ and the reflection matrix $B$ satisfy (2.9, 10) and (3.9, 10). Assume also that $\mathcal{M}(R, B)_+ \neq \emptyset$. Then the scattering operator associated with the Fock representation $\mathcal{F}_{R,B;m}$ is unitary for any $m \in \mathcal{M}(R, B)_+$.

Conditions (2.9, 10) and (3.9, 10) are standard for the scattering on the half line. The same is true for (2.15), which is usually imposed in the slightly stronger form

$$R^{\dagger \beta_1 \beta_2}_{\ \alpha_1 \alpha_2}(\theta_1, \theta_2) = R^{\beta_1 \beta_2}_{\alpha_1 \alpha_2}(\theta_2, \theta_1), \tag{4.54}$$

known as Hermitian analyticity. We emphasize that condition (3.6), which is often overlooked in the physical literature, is essential for the unitarity of $S$ and represents therefore a useful criterion for selecting possible reflection matrices. In the case of the $SU(2)$ Thirring model one gets in this way the restriction ${\rm Re}\, \eta = 0$ in Eqs. (4.31, 32).

Let us mention also that if $R$ depends on the difference $\theta_{12} \equiv \theta_1 - \theta_2$, one usually assumes [13,26] that $R$ admits a suitable continuation to the complex $\theta_{12}$-plane, which satisfies crossing symmetry, has certain pole structure, etc. In that case also $B$ is required to have a continuation in the complex $\theta$-plane, which obeys boundary crossing symmetry [10]. In our example (see Eqs. (4.29–32)) $R$ and $B$ admit such continuations. Finally, the bootstrap equations [9,26] reduce further the set of physically relevant exchange and reflection matrices. From Proposition 9 it follows however that the unitarity of $S$ as an operator in $\mathcal{F}_{R,B;m}(\mathcal{H})$ depends exclusively on the behavior of $R$ and $B$ for real values of the rapidities.

5. Outlook and Conclusions

In the present paper we have introduced the associative algebra $\mathcal{B}_R$ and investigated some of its basic features. $\mathcal{B}_R$ admits two series of Fock representations, which have been constructed explicitly. The positive metric representations provide a framework for deriving Cherednik's transition amplitudes and proving that they are indeed the matrix elements of a unitary scattering operator. We have shown also that the algebra $\mathcal{B}_+$ enters the Bose quantization on the half line. The associated Klein–Gordon field is local, in spite of the breakdown of the Poincaré symmetry.

$\mathcal{B}_R$ is actually a member of a large family of algebras $\mathcal{B}_{R,\lambda}$, which are defined by Eqs. (2.23–26). $\mathcal{B}_{R,\lambda}$ can be studied in the same way as $\mathcal{B}_R$ and are expected to find relevant applications to statistical models with boundaries. It will be interesting in this respect to extend to $\mathcal{B}_{R,\lambda}$ the notion of second $R$-quantization, developed in [17,18] for the Z-F algebra $\mathcal{A}_R$.

We point out finally that one can further generalize $\mathcal{B}_{R,\lambda}$, eliminating the condition (2.9) and/or (3.10). In this case, instead of the Weyl group $W_n$, one has to deal with an infinite dimensional group $W'_n$, which is freely generated by the elements $\{\tau', \sigma'_i : i = 1, ..., n-1\}$ satisfying the relations (3.14, 15), but not (3.16). Recent investigations [15] show actually that the group $W'_n$ appears in many different physical and mathematical contexts. We hope to say more about this generalization of $\mathcal{B}_{R,\lambda}$ in the near future.



Appendix

In quantum field theory on the half line it is sometimes necessary to allow for a quantum number $j = 1, ..., N_B$ to reside on the boundary [10]. We will show below that this case is still described by the boundary algebra $\{\mathcal{B}_R, I_m\}$, but corresponds to representations with slightly more general structure than that of $\mathcal{F}_{R,B;m}$. To be precise, instead of the requirement 4 formulated in the beginning of Sect. 3, these representations satisfy:

4'. There exists an $N_B$-dimensional subspace (vacuum space) $\mathcal{V} \subset \mathcal{D}$, which is annihilated by $a_{\alpha}(x)$. Moreover, $\mathcal{V}$ is cyclic with respect to $\{a^{*\alpha}(x)\}$ and $\langle \cdot\, , \cdot \rangle_m$ is positive definite on $\mathcal{V}$.

For $N_B = 1$ we recover the property 4 specifying $\mathcal{F}_{R,B;m}$. Let us briefly describe now the main features of the representations characterized by the conditions 1-3 and 4'. Let $\Omega_1, \ldots, \Omega_{N_B}$ be an orthonormal basis in $\mathcal{V}$. We denote by $P_0$ the $\langle \cdot\, , \cdot \rangle_m$-orthogonal projection on $\mathcal{V}$ and define

$$B^{\beta}_{\alpha}(x) \equiv P_0\, b^{\beta}_{\alpha}(x)\, P_0. \tag{A.1}$$

Notice that $B^{\beta}_{\alpha}(x)$ is now an operator, carrying the vacuum space into itself,

$$B^{\beta}_{\alpha}(x)\, \Omega_j = B^{\beta\, k}_{\alpha\, j}(x)\, \Omega_k. \tag{A.2}$$

The following obvious generalization of Proposition 3 holds.

Proposition 3'. The vacuum space $\mathcal{V}$ is unique and satisfies

$$b^{\beta}_{\alpha}(x)\, |_{\mathcal{V}} = B^{\beta}_{\alpha}(x)\, |_{\mathcal{V}}. \tag{A.3}$$

Projecting the relevant equations on the vacuum space, one immediately verifies the validity of (3.6, 9, 10) as operator equations on $\mathcal{V}$. Summarizing, the basic input for constructing the above more general class of representations of $\{\mathcal{B}_R, I_m\}$ is still the triplet $\{R, B; m\}$, the novelty being that $B^{\beta}_{\alpha}(x)$ are operators which satisfy (3.6, 9, 10) on $\mathcal{V}$. Apart from the following minor modifications, the construction precisely follows that described in Sect. 3. First of all, the elements of $\mathcal{H}^n_{R,B}$ carry an extra lower index varying from 1 to $N_B$. In the scalar product this index is saturated among the two states. Second, performing the substitution $B^{\beta_i}_{\alpha_i}(x) \mapsto B^{\beta_i\, k}_{\alpha_i\, j}(x)$ in Eq. (3.20), the operator $B_i(x)$ becomes an $N_B \times N_B$-matrix, which inserted in (3.30, 31) acts on the states by a standard matrix multiplication.

References

1. Belavin, A.A.: Exact Solution of the Two-dimensional Model with Asymptotic Freedom. Phys. Lett. B 87, 117–123 (1979)
2. Bognár, J.: Indefinite Inner Product Spaces. Berlin: Springer Verlag, 1974
3. Chao, L., Hou, B., Shi, K., Wang, Y., Yang, W.: Bosonic Realization of Boundary Operators in SU(2)-invariant Thirring Model. Int. J. Mod. Phys. A 10, 4469–4482 (1995)
4. Cherednik, I.V.: Factorizing Particles on a Half Line and Root Systems. Theor. Math. Phys. 61, 977–983 (1984)
5. Corrigan, E., Dorey, P.E., Rietdijk, H.R.: Aspects of Affine Toda Field Theory on a Half Line. Suppl. Prog. Theor. Phys. 118, 143–164 (1995)
6. Faddeev, L.D.: Quantum Completely Integrable Models in Field Theory. Soviet Sci. Rev. Sect. C 1, 107–155 (1980)
7. Fendley, P., Saleur, H.: Deriving Boundary S Matrices. Nucl. Phys. B 428, 681–693 (1994)
8. Fring, A., Köberle, R.: Affine Toda Field Theory in the Presence of Reflecting Boundaries. Nucl. Phys. B 419, 647–664 (1994)
9. Fring, A., Köberle, R.: Factorized Scattering in the Presence of Reflecting Boundaries. Nucl. Phys. B 421, 159–172 (1994)
10. Ghoshal, S., Zamolodchikov, A.B.: Boundary S Matrix and Boundary State in Two-Dimensional Integrable Quantum Field Theory. Int. J. Mod. Phys. A 9, 3841–3886 (1994)
11. Ghoshal, S.: Boundary S-matrix of the O(N)-Symmetric Non-Linear Sigma Model. Phys. Lett. B 334, 363–368 (1994)
12. Ghoshal, S.: Bound State Boundary S-Matrix of the Sine-Gordon model. Int. J. Mod. Phys. A 9, 4801–4810 (1994)
13. Karowski, M., Weisz, P.: Exact Form Factors in (1+1)-Dimensional Field Theoretic Models with Soliton Behavior. Nucl. Phys. B 139, 455–476 (1978)
14. Kuczma, M., Choczewski, B., Ger, R.: Iterative Functional Equations. Cambridge: Cambridge University Press, 1990
15. Kulish, P.P., Sasaki, R.: Covariance Properties of Reflection Equation Algebras. Progr. Theor. Phys. 89, 741–761 (1993)
16. LeClair, A., Mussardo, G., Saleur, H., Skorik, S.: Boundary Energy and Boundary States in Integrable Quantum Field Theories. Nucl. Phys. B 453, 581–618 (1995)
17. Liguori, A., Mintchev, M.: Fock Representations of Quantum Fields with Generalized Statistics. Commun. Math. Phys. 169, 635–652 (1995)
18. Liguori, A., Mintchev, M., Rossi, M.: Unitary Group Representations in Fock Spaces with Generalized Exchange Properties. Lett. Math. Phys. 35, 163–177 (1995)
19. Liguori, A., Mintchev, M.: Boundary Exchange Algebras. Preprint IFUP-TH 21/96, March 1996
20. Liguori, A., Mintchev, M., Rossi, M.: Fock Representations of Exchange Algebras with Involution. J. Math. Phys. 38, 2888–2898 (1997)
21. Penati, S., Zanon, D.: Quantum Integrability in Two-Dimensional Systems with Boundary. Phys. Lett. B 358, 63–72 (1995)
22. Reed, M., Simon, B.: Methods of Modern Mathematical Physics II: Fourier Analysis, Self-Adjointness. New York: Academic Press, 1975
23. Saleur, H., Skorik, S., Warner, N.P.: The Boundary Sine-Gordon Theory: Classical and Semi-Classical Analysis. Nucl. Phys. B 441, 421–436 (1995)
24. Sklyanin, E.K.: Boundary Conditions for Integrable Quantum Systems. J. Phys. A: Math. Gen. 21, 2375–2389 (1988)
25. Warner, N.P.: Supersymmetry in Boundary Integrable Models. Nucl. Phys. B 450, 663–694 (1995)
26. Zamolodchikov, A.B., Zamolodchikov, A.B.: Factorized S-Matrices in Two Dimensions as the Exact Solutions of Certain Relativistic Quantum Field Theory Models. Ann. Phys. 120, 253–291 (1979)
27. Zhao, L.: Fock Spaces with Reflection Condition and Generalized Statistics. Preprint hep-th 96040024

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 194, 591 – 611 (1998)

Communications in


The Semi-Infinite Cohomology of Affine Lie Algebras

Stephen Hwang

Department of Engineering Sciences, Physics and Mathematics, Karlstad University, S-651 88 Karlstad, Sweden

Received: 21 November 1996 / Accepted: 5 November 1997

Abstract: We study the semi-infinite or BRST cohomology of affine Lie algebras in detail. This cohomology is relevant in the BRST approach to gauged WZNW models. Our main result is to prove necessary and sufficient conditions on ghost numbers and weights for non-trivial elements in the cohomology. In particular we prove the existence of an infinite sequence of elements in the cohomology for non-zero ghost numbers. This will imply that the BRST approach to the topological WZNW model admits many more states than a conventional coset construction. This conclusion also applies to some non-topological models. Our work will also contain results on the structure of Verma modules over affine Lie algebras. In particular, we generalize the results of Verma and Bernstein–Gel'fand–Gel'fand, for finite dimensional Lie algebras, on the structure and multiplicities of Verma modules. The present work gives the theoretical basis of the explicit construction of the elements in cohomology presented previously. Our analysis proves and makes use of the close relationship between highest weight null-vectors and elements of the cohomology.

1. Introduction and Summary of Results

The present work studies the semi-infinite or BRST cohomology of affine Lie algebras. The motivation comes from the quantization of Wess-Zumino-Novikov-Witten (WZNW) models. These models play an essential part in the understanding and classification of conformal field theories. The BRST symmetry arises as a consequence of the gauging of a WZNW model w.r.t. a subgroup [1]. The constraints associated with this BRST symmetry are the generators of an affine Lie algebra $g' = g_k \oplus \tilde{g}_{\tilde{k}}$. Here $g_k$ and $\tilde{g}_{\tilde{k}}$ correspond to the same finite dimensional Lie algebra, but have different central elements $k$ and $\tilde{k} = -k - 2c_{\bar{g}}$ (see Sect. 2 for notation). The latter affine Lie algebra corresponds to an auxiliary, and in general non-unitary, WZNW model that arose in the derivation in [1]. The physical states in the gauged WZNW model are now given by the non-trivial elements of the resulting BRST cohomology.

In [2] it was proved that the BRST approach was equivalent to the conventional coset construction, so that the states were ghost-free and satisfied the usual highest weight conditions w.r.t. the subalgebra $g_k$. The condition for this proof was that one selected a specific range of representations for the auxiliary WZNW model. For the original ungauged WZNW model the range of representations was assumed to be the integrable ones. In this work we will consider completely general highest weight representations (an analogous treatment may be given for lowest weight representations). The motivation for this is that it may be that a more general situation than in ref. [2] is the physically relevant one. Our analysis of the cohomology is most straightforwardly applied to the case when the gauged subgroup coincides with the original group, i.e. when we have a topological WZNW model. But, as we will show, it also generalizes to the most important class of non-topological models, namely those in which the ungauged WZNW model is unitary.

In [3] the explicit construction of elements in the BRST cohomology was considered. The procedure presented there for obtaining these elements showed that they were intimately related to certain null-vectors. The key to the construction was to make a selection of null-vectors that generated the states in the cohomology. It turned out that these null-vectors are the highest weight vectors. Then by using the explicit form of highest weight null-vectors given by Malikov, Feigin and Fuchs [4], the elements may be constructed. Our work here may be seen as the theoretical basis of this construction. We will here prove that the procedure in [3] will always generate non-trivial states in the cohomology. We will also prove that the ghost numbers that appeared in the construction are the only possible ones.
The ghost numbers will be uniquely determined by the representations of the algebras involved, and for fixed representations only one value (and its negative) will occur. It is still an open question whether the construction provides all the possible states. We also lack a general result on the dimensionality of the cohomology.

The plan of the paper and its main results are the following. In Sect. 2 we give the basic definitions and facts for affine algebras and associated modules. In Sect. 3 we discuss the structure of Verma modules. This is important since our analysis of the cohomology relies very heavily on this structure, in particular, on the embeddings of Verma modules into Verma modules. We make extensive use of a technique due to Jantzen [5] to perturb the highest weight of a reducible Verma module to obtain an irreducible one. This perturbation also gives a filtration of modules in a given Verma module. Section 3 contains results on the structure of Verma modules, which we have been unable to find in the literature. The main results are Theorem 3.10 and Theorem 3.11. These are generalizations of results of Verma [15] and Bernstein, Gel'fand and Gel'fand [11], respectively, for finite dimensional Lie algebras and of Rocha-Caridi and Wallach for affine Lie algebras with highest weights on Weyl orbits through dominant weights. The proof of Theorem 3.11 is almost identical to the proof of the finite dimensional case given in [14], Theorem 7.7.7 (which is used also in [6]). The proof of Theorem 3.10 only partly coincides with [6], as the latter does not extend to the case of antidominant weights.

In Sect. 4 we proceed to introduce the BRST formalism. Most of the material (except Lemma 4.2) is well-known. In particular, we recapitulate a theorem due to Kugo and Ojima [7]. This theorem will partly be used in the main section, Sect. 5.
It is also conceptually important in understanding the basic mechanism behind the appearance of elements in the BRST cohomology for non-zero ghost numbers, which we now explain. The theorem, which applies only to irreducible modules, states that elements in the cohomology form either singlet or doublet (singlet pair) representations w.r.t. the BRST algebra. Furthermore, elements that are trivial or outside the cohomology form so-called quartets in the terminology of [7], i.e. sets of four states, in which two of the elements are BRST exact. In order to obtain an irreducible module, we use a trick due to Jantzen, to perturb a reducible module into an irreducible one. In the irreducible case one may prove (Corollary 5.2) that only ghost-free highest weight states are BRST non-trivial. As the perturbation is taken to zero and the module becomes reducible, certain quartets will evolve into singlet pairs in the following way. Two of the states of the quartet will remain in the irreducible module and will then form a singlet pair in this module. The two other states will become null-states.

One of the main results in this paper (Theorem 5.12) is the determination of the relevant null-states. This theorem gives the necessary and sufficient conditions on the null-states to be part of a quartet that will contain a singlet pair as the perturbation is set to zero. The implications of the theorem are exploited in Theorems 5.14 and 5.15, which give the necessary and sufficient conditions on the ghost numbers and weights for which the cohomology is non-trivial. In particular in Theorem 5.15 a sequence of non-trivial BRST invariant states is proved to exist. This sequence is exactly the one for which the construction has been given in [3]. The ghost numbers appearing are $\pm p$, where $p = l(\tilde{\lambda}) - l(\lambda)$ and $l(\lambda)$ is the length of a Weyl transformation associated with $\lambda$ (see Sect. 3). This means that for given highest weights $\lambda$ and $\tilde{\lambda}$ of the original and auxiliary sectors, $|p|$ is fixed to exactly one value. By Theorem 5.14 these ghost numbers and weights are the only non-trivial ones.

Let us also address the question of how the embedding of $g$ into a larger algebra may affect our results.
As our approach relies on the use of null-vectors, the crucial question is what happens to the relevant null-vectors as $g$ is embedded. If the null-vector w.r.t. $g$ ceases to be null in the larger algebra, then the entire quartet, to which the vector belongs for non-zero perturbation, will remain a quartet as Jantzen's perturbation is set to zero. Thus the corresponding elements in the cohomology of $g$ will now be exact. In addition, many more elements may disappear from the cohomology group. This is most evident from the construction in [3], where one used non-trivial states at ghost number $p - 1$ ($p > 0$) to construct a BRST non-trivial element of ghost number $p$. In the extreme case the module over the larger algebra is irreducible and all elements, except the one at zero ghost number, will disappear. There is one case in which the embedding will be straightforward. This will happen when we select integrable representations of the larger algebra. In this case it is known [9] that the irreducible module over the larger algebra is completely reducible w.r.t. any subalgebra. Hence, the results given here generalize directly. This was the situation analyzed in [2]. Corollary 5.11 proves that the solutions given in [2] for a selected range of representations of the auxiliary sector are in fact the unique solutions for zero ghost number for any selection of representations of the auxiliary sector.

The existence of extra elements in the cohomology, which have non-zero ghost numbers, implies that the BRST approach to WZNW models is different from the conventional coset approach. This applies to the topological case, but also to the non-topological case, at least when we take integrable representations of the original algebra. The rôle of these extra states is at this point unclear. It may be that their appearance will lead to inconsistencies. One may avoid the states by selecting an appropriate range of representations for the auxiliary sector.
Then only ghost-free states will appear in the cohomology. This was the situation treated in ref. [2]. It may on the other hand be that the extra states are a new and important part of the quantization of WZNW models. In the latter case one may expect that the extra states will be needed to ensure S-matrix unitarity and hence will appear as poles in scattering amplitudes.


2. Preliminaries

Let $\bar{g}$ be a simple finite dimensional Lie algebra of rank $r$. We denote by $g_k$ the corresponding affine Lie algebra of level $k$. The sets of roots of $\bar{g}$ and $g$ are $\bar{\alpha} \in \bar{\Delta}$ and $\alpha \in \Delta$, respectively. The highest root of $\bar{g}$ is denoted $\psi$ and its length is taken to be one. The restrictions to positive roots are denoted by $\bar{\Delta}^+$, $\Delta^+$ and to simple roots by $\bar{\Delta}^s$, $\Delta^s$. The weight and root lattices of $\bar{g}$ and $g$ are denoted by $\bar{\Gamma}_w$, $\bar{\Gamma}_r$, $\Gamma_w$ and $\Gamma_r$. $\Gamma_r^+$ is the lattice generated by positive roots. Let $\Gamma_w^+$ be the set of dominant weights, $\Gamma_w^+ = \{\lambda \in \Gamma_w \mid \alpha_i \cdot \lambda \geq 0 \text{ for } \alpha_i \in \Delta^s\}$. Let $\Gamma_w^f = \{\lambda_i \in \Gamma_w^+ \mid 2\lambda_i \cdot \alpha_j/(\alpha_j)^2 = \delta_{ij} \text{ for } \alpha_j \in \Delta^s\}$ be the set of fundamental weights. Here $\lambda_i \cdot \alpha_j$ denotes the invariant scalar product on $g$ and $(\alpha_j)^2 = \alpha_j \cdot \alpha_j$. Define $\rho$ as twice the sum of fundamental weights of $g$. $\bar{\rho}$ is the corresponding sum for $\bar{g}$. $\rho$ satisfies $\rho \cdot \alpha_i = (\alpha_i)^2$, $\alpha_i \in \Delta^s$. We define the set of antidominant weights $\Gamma_w^- = \{\lambda \in \Gamma_w \mid \alpha_i \cdot (\lambda + \rho/2) \leq 0 \text{ for } \alpha_i \in \Delta^s\}$. A weight $\mu \in \Gamma_w$ is said to be singular if it is orthogonal to at least one positive root and is said to be regular otherwise. The Weyl group $W$ of $g$ is the set of transformations on $\Gamma_w$ generated by the simple reflections

\[
\sigma_i(\lambda) = \lambda - \frac{2\lambda \cdot \alpha_i}{(\alpha_i)^2}\,\alpha_i, \qquad \alpha_i \in \Delta^s. \tag{2.1}
\]

The length $l(w)$ of $w \in W$ is the minimal number of simple reflections that give $w$. We also define the $\rho$-centered reflections $\sigma_i^\rho(\lambda) = \sigma_i(\lambda + \rho/2) - \rho/2$. Similarly we write $w^\rho(\lambda)$ for a general $\rho$-centered Weyl transformation. We define an ordering between weights. Let $\mu, \nu \in \Gamma_w$ be such that $\mu - \nu \in \Gamma_r^+$. We then write $\mu \geq \nu$. If $\mu - \nu \in \Gamma_r^+ \setminus \{0\}$, then this is denoted by $\mu > \nu$. Two weights $\lambda$ and $\mu$ are said to be on the same Weyl orbit if there exists $w \in W$ such that $\mu = w(\lambda)$. Similarly, they are said to be on the same $\rho$-centered Weyl orbit if $\mu = w^\rho(\lambda)$.

We make a triangular decomposition of $g$, $g = n_- \oplus h \oplus n_+$. We will use the notation $e_\alpha$ for the generators of $n_+$, $f_\alpha$ for those of $n_-$ and $h_i$, $i = 1, \ldots, r + 2$ for the generators of the Cartan subalgebra $h$. $h_i$, $i = 2, \ldots, r + 1$ span $\bar{h}$, $h_1$ is a central element of $g$ with eigenvalue $k/2$ and $h_0$ is a derivation. We have a corresponding decomposition of $U(g)$, the universal enveloping algebra of $g$, as $U(g) = U(n_-) \otimes U(h) \otimes U(n_+)$. Let $M(\lambda)$ denote the highest weight Verma module over $g$ of highest weight $\lambda$. The module is generated by a highest weight primary vector $v_0^\lambda$ satisfying

\[
e_\alpha v_0^\lambda = 0, \qquad h_i v_0^\lambda = \lambda_i v_0^\lambda, \qquad h_i \in h. \tag{2.2}
\]
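The reflections (2.1) and their $\rho$-centered variants can be tried out numerically. The following is only a minimal sketch under a simplifying assumption that is not part of the paper: it uses the finite dimensional algebra $A_2$ instead of an affine algebra, and stores a weight by its labels $2\lambda \cdot \alpha_i/(\alpha_i)^2$. The names `A2_CARTAN`, `simple_reflection` and `rho_centered_reflection` are hypothetical.

```python
# Finite dimensional A2 toy model (an assumption; the paper works with affine algebras).
# A weight is stored by its labels 2*lambda.alpha_i/(alpha_i)^2 on the two simple roots.
A2_CARTAN = ((2, -1), (-1, 2))  # entry [j][i] is the j-th label of the simple root alpha_i


def simple_reflection(i, labels, cartan=A2_CARTAN):
    """sigma_i(lambda) = lambda - (2 lambda.alpha_i / alpha_i^2) alpha_i, cf. Eq. (2.1)."""
    return tuple(l - labels[i] * cartan[j][i] for j, l in enumerate(labels))


def rho_centered_reflection(i, labels, cartan=A2_CARTAN):
    """sigma_i^rho(lambda) = sigma_i(lambda + rho/2) - rho/2.

    With rho equal to twice the sum of fundamental weights, rho/2 has all labels 1.
    """
    shifted = tuple(l + 1 for l in labels)
    return tuple(l - 1 for l in simple_reflection(i, shifted, cartan))


if __name__ == "__main__":
    print(simple_reflection(0, (1, 0)))        # first fundamental weight reflected in alpha_1
    print(rho_centered_reflection(0, (0, 0)))  # equals -alpha_1 in label coordinates
```

In this toy setting the weight $-\rho/2$ (all labels $-1$) is fixed by every $\rho$-centered reflection, which is the reason for calling these reflections $\rho$-centered.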

$M(\lambda)$ admits a weight decomposition

\[
M(\lambda) = \bigoplus_{\eta \in \Gamma_r^+} M_\eta(\lambda).
\]

Vectors in $M_\nu(\lambda)$ will be called weight vectors of degree $\nu$ and their weights differ from the highest weight by $\nu$. We consider throughout only vectors $v \in M_\eta(\lambda)$ with $\dim M_\eta(\lambda) < \infty$. The dimension of $M_\nu(\lambda)$ is $P(\nu)$, which is the number of ways $\nu$ may be written as a linear combination of positive roots with non-negative coefficients. Let $M'(\lambda)$ be the proper maximal submodule of $M(\lambda)$. Then $M(\lambda)/M'(\lambda)$ is irreducible and isomorphic to the unique irreducible $g$-module $L(\lambda)$.
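The counting function $P(\nu)$ is easy to compute by brute force in small cases. The sketch below is an illustration under an assumption that is not in the paper: it uses the three positive roots of the finite dimensional algebra $A_2$, written in coordinates with respect to the simple roots, rather than the infinitely many positive roots (with multiplicities) of an affine algebra. The names `POSITIVE_ROOTS` and `partition_count` are hypothetical.

```python
from functools import lru_cache

# Positive roots of finite dimensional A2 in simple-root coordinates (toy assumption):
# alpha_1, alpha_2 and alpha_1 + alpha_2.
POSITIVE_ROOTS = ((1, 0), (0, 1), (1, 1))


def partition_count(nu, roots=POSITIVE_ROOTS):
    """Number of ways to write nu as a non-negative integer combination of the roots."""

    @lru_cache(maxsize=None)
    def count(remaining, k):
        # Ways to write `remaining` using only roots[k:].
        if all(c == 0 for c in remaining):
            return 1
        if k == len(roots):
            return 0
        total = 0
        current = remaining
        # Use roots[k] zero times, once, twice, ... while nothing becomes negative.
        while all(c >= 0 for c in current):
            total += count(current, k + 1)
            current = tuple(c - r for c, r in zip(current, roots[k]))
        return total

    return count(tuple(nu), 0)


if __name__ == "__main__":
    print(partition_count((1, 1)))  # alpha_1 + alpha_2 or (alpha_1 + alpha_2) itself
    print(partition_count((2, 2)))
```

For this toy root system the count for $\nu = n_1\alpha_1 + n_2\alpha_2$ is $\min(n_1, n_2) + 1$, and the function returns $0$ when $\nu$ has a negative coefficient, in line with the convention $P(\eta) = 0$ for $\eta \notin \Gamma_r^+$ used in the determinant formula.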


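Define a Hermitian form $\langle \cdot | \cdot \rangle$ as the mapping from $M(\lambda) \times M(\lambda)$ to the complex numbers by

\[
\langle v_0^\lambda | v_0^\lambda \rangle = 1, \qquad \langle w_\lambda | u v_\lambda \rangle = \langle u^\dagger w_\lambda | v_\lambda \rangle, \tag{2.3}
\]

where $u \in U(g)$ and $(\;)^\dagger$ denotes the Hermite conjugation defined by $e_\alpha^\dagger = f_\alpha$, $f_\alpha^\dagger = e_\alpha$, $h_i^\dagger = h_i$. For $v_\eta \in M_\eta(\lambda)$ and $w_\mu \in M_\mu(\lambda)$ we clearly have $\langle w_\mu | v_\eta \rangle = 0$ for $\eta \neq \mu$. If $\eta = \mu$, then $F(\lambda)_\eta = \langle w_\eta | v_\eta \rangle$ may be viewed as a $P(\eta) \times P(\eta)$ matrix, whose entries are polynomials in $\lambda$. The determinant of $F(\lambda)_\eta$ is given by the Kac–Kazhdan formula [10]

\[
\det F(\lambda)_\eta = \mathrm{const.} \prod_{\alpha \in \Delta^+} \prod_{n=1}^{\infty} \Big[ (\lambda + \rho/2) \cdot \alpha - \frac{n}{2}\alpha^2 \Big]^{P(\eta - n\alpha)}, \tag{2.4}
\]

where roots $\alpha \in \Delta^+$ are taken with their multiplicities and $P(\eta) = 0$ if $\eta \notin \Gamma_r^+$. The zeros of the determinant are associated with highest weight vectors $v_\mu$ that occur in $M(\lambda)$ (see the following section). From Eq. (2.4) one may infer that $\mu = \lambda - n\alpha$, which implies that the Verma module $M(\mu)$ is a submodule of $M(\lambda)$. $M(\lambda)$ is irreducible if and only if there does not exist $n \in \mathbb{Z}$ and $\alpha \in \Delta^+$ such that

\[
(\lambda + \rho/2) \cdot \alpha - \frac{n}{2}\alpha^2 = 0. \tag{2.5}
\]

Notice that this equation will for any imaginary root $\alpha$ (i.e. $\alpha^2 = 0$) be equivalent to the condition $k = -c_{\bar{g}}$, where $c_{\bar{g}}$ is the quadratic Casimir of the adjoint representation of $\bar{g}$.

3. Structure of Embeddings of Verma Modules

If the Kac–Kazhdan equation (2.5) has non-trivial solutions for a given module $M(\lambda)$, then there will exist Verma modules $M(\mu)$ that are submodules of $M(\lambda)$. This implies the existence of a $g$-homomorphism, $\phi \in \mathrm{Hom}_g(M(\mu), M(\lambda))$, such that $M(\mu) \xrightarrow{\phi} M(\lambda)$. We will in this section and throughout the rest of this paper assume $k \neq -c_{\bar{g}}$, so that solutions to Eq. (2.5) only occur for real roots $\alpha$.

The structure of embeddings is most clearly depicted through a filtration due to Jantzen [5]. Introduce $z = \sum_{\lambda \in \Gamma_w^f} z_\lambda \lambda$, where $z_\lambda$ are non-zero complex numbers. Consider the one-parameter family of weights $\lambda_\epsilon = \lambda + \epsilon z$. If $\lambda$ is a weight of a reducible module $M(\lambda)$ and $z_i \neq 0$, then for $0 < |\epsilon| \ll 1$, $M(\lambda_\epsilon)$ is irreducible. We now define a filtration

\[
M(\lambda_\epsilon) \supset M^{(1)}(\lambda_\epsilon) \supset M^{(2)}(\lambda_\epsilon) \supset \ldots \tag{3.1}
\]

by

\[
M^{(n)}(\lambda_\epsilon) = \{ v_\epsilon \in M(\lambda_\epsilon) \mid \langle w_\epsilon | v_\epsilon \rangle \text{ is divisible by } \epsilon^n \text{ for any } w_\epsilon \in M(\lambda_\epsilon) \}. \tag{3.2}
\]

We will often write $M_\epsilon$ for $M(\lambda_\epsilon)$, and similarly $M_\epsilon^{(n)}$ for $M^{(n)}(\lambda_\epsilon)$. If $v = u v_0^\lambda$, $u \in U(g)$, then we write $v_\epsilon = u v_0^{\lambda_\epsilon}$. In the limit $\epsilon \to 0$ this induces a filtration of modules in $M(\lambda)$

\[
M(\lambda) \supset M^{(1)}(\lambda) \supset M^{(2)}(\lambda) \supset \ldots . \tag{3.3}
\]

Note that Jantzen's filtration is hereditary: Let $M(\mu) \subset M^{(s)}(\lambda)$ and $M(\nu) \subset M^{(t)}(\mu)$. Then $M(\nu) \subset M^{(s+t)}(\lambda)$.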
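The mechanism behind the filtration can be seen in the smallest possible case. The following worked example is only an illustrative sketch under assumptions that are not part of the paper: it treats a single finite dimensional $sl_2$ with generators $e, f, h$ normalized by $[e,f] = h$, $[h,f] = -f$, a highest weight vector $v$ with $e v = 0$, $h v = \mu v$, and the conjugation $e^\dagger = f$.

```latex
% Toy sl_2 example (assumed normalization: [e,f] = h, [h,f] = -f, e v = 0, h v = \mu v).
\begin{align*}
  h f^{m} v &= (\mu - m) f^{m} v, \\
  e f^{n} v &= \sum_{k=0}^{n-1} f^{k}\,[e,f]\,f^{\,n-1-k} v
             = \sum_{k=0}^{n-1} \bigl(\mu - (n-1-k)\bigr) f^{\,n-1} v
             = n\Bigl(\mu - \frac{n-1}{2}\Bigr) f^{\,n-1} v, \\
  \langle f^{n} v \,|\, f^{n} v \rangle &= \langle v \,|\, e^{n} f^{n} v \rangle
             = n! \prod_{j=1}^{n} \Bigl(\mu - \frac{j-1}{2}\Bigr).
\end{align*}
% Perturbation: take 2\mu = m \in \{0,1,2,\dots\} and \mu_\epsilon = \mu + \epsilon z, z \neq 0.
% The factor with j = m+1 equals \mu_\epsilon - m/2 = \epsilon z, and no other factor vanishes
% at \epsilon = 0. Hence the norm of f^{n} v_\epsilon is divisible by \epsilon exactly once
% for n \geq m+1 and not at all for n \leq m:
\[
  M^{(1)}(\mu) = \operatorname{span}\{ f^{n} v : n \geq 2\mu + 1 \}, \qquad M^{(2)}(\mu) = 0 .
\]
```

In this toy case $M^{(1)}(\mu)$ is the submodule generated by $f^{2\mu+1} v$, which is annihilated by $e$ according to the second line, so the first step of the filtration is itself of Verma type and the filtration stops there.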


Any irreducible subquotient of a $g$-module $M(\lambda)$ is isomorphic to an irreducible $g$-module $L(\mu)$, $\lambda - \mu \in \Gamma_r^+$. Denote by $(M(\lambda) : L(\mu))$ the multiplicity of $L(\mu)$ in $M(\lambda)$. $M^{(1)}(\lambda)$ is the maximal proper submodule of $M(\lambda)$ and hence $M(\lambda)/M^{(1)}(\lambda)$ is isomorphic to the irreducible module $L(\lambda)$. We will call the vectors in $M^{(1)}$ null-vectors of $M(\lambda)$. We define $\pi_L$ to be the projection $M(\lambda) \xrightarrow{\pi_L} L(\lambda)$.

The submodules of a given Verma module are generally not all of Verma type. It is convenient to introduce the notion of primitive vectors. Let $V$ be a $g$-module. A vector $v_\lambda \in V$ is said to be primitive if there exists a submodule $U$ of $V$ such that

\[
v \notin U, \qquad x v \in U \text{ for any } x \in n_+. \tag{3.4}
\]

$\lambda$ is called a primitive weight. Highest weight vectors are clearly primitive, but in general they do not exhaust all primitive vectors, even in the case of finite dimensional algebras, as was first noted in [11]. In fact, there may be infinitely many more primitive vectors than highest weight vectors (see [12] for an example for finite dimensional algebras). Any module $V$ is generated by its primitive weights as a $g$-module. We will call a module which is generated by acting freely with $U(n_-)$ on a primitive vector, which is not of highest weight type, a Bernstein-Gel'fand (BG) module. The corresponding primitive vector will be called a Bernstein-Gel'fand primitive vector.

Although every zero in the determinant Eq. (2.4), i.e. every $(\alpha, n)$ for which the Kac–Kazhdan equation (2.5) is satisfied, corresponds to a highest weight vector in $M(\lambda)$ (cf. Proposition 3.8), the converse is in general not true. For a given $\lambda$ there are usually more highest weight vectors than solutions $(\alpha, n)$. Let $\mathrm{Hom}_g(M(\mu), M(\lambda)) \neq 0$ for a pair $(\alpha, n)$ in Eq. (2.5) with $\alpha$ real, i.e. $\mu = \lambda - n\alpha$, $n \geq 1$ and $\alpha \in \Delta^+ \cap \Delta^R$, where $\Delta^R$ is the set of real roots. Then we may write

\[
\mu = \sigma_\alpha^\rho(\lambda) < \lambda. \tag{3.5}
\]

The inequality ensures that a solution to Eq. (2.5) exists. In the form Eq. (3.5) it is clear that by iteration, we will find new highest weight vectors not given by solutions to the Kac–Kazhdan equation for $\lambda$. It also follows that $M(\lambda)$ is irreducible if and only if $\lambda$ is antidominant. Notice that this requires $k < -c_{\bar{g}}$.

Let us proceed to give a more precise classification of highest weight vectors in $M(\lambda)$ in terms of Weyl transformations. Define the Bruhat ordering on $W$. Let $w, w' \in W$. We write $w' \to w$ if there exists $\alpha \in \Delta^+ \cap \Delta^R$, such that $w = \sigma_\alpha w'$ and $l(w) = l(w') + 1$. We write $w' \prec w$ if there are $w_0, w_1, \ldots, w_p \in W$ such that $w' = w_p \to w_{p-1} \to \ldots \to w_1 \to w_0 = w$. It may be shown that $w' \prec w$ if and only if the reduced expressions $w' = \sigma_{j_1} \ldots \sigma_{j_p}$ and $w = \sigma_{i_1} \ldots \sigma_{i_q}$ are such that $(j_1, \ldots, j_p)$ is obtained by deleting $q - p$ elements from $(i_1, \ldots, i_q)$. By combining Theorem 4.2 in [10] with Eq. (3.5) we have the following.

Theorem 3.1. A Verma module $M(\lambda)$ contains an irreducible subquotient $L(\mu)$ if and only if the following condition is satisfied:

(*) $\lambda = \mu$, or there exists a sequence of positive roots $\alpha_1, \alpha_2, \ldots, \alpha_k$ and a sequence of weights $\lambda = \mu_1, \mu_2, \ldots, \mu_k, \mu_{k+1} = \mu$ such that $\mu_{i+1} = \sigma_{\alpha_i}^\rho(\mu_i) < \mu_i$ for $i = 1, 2, \ldots, k$.

Lemma 3.2. Let $\mu \in \Gamma_w$. Then there exists $w \in W$ and a unique $\lambda + \rho/2 \in \Gamma_w^+$ ($k > -c_{\bar{g}}$) or $\lambda \in \Gamma_w^-$ ($k < -c_{\bar{g}}$) such that $\mu = w^\rho(\lambda) = \sigma_{i_n}^\rho \sigma_{i_{n-1}}^\rho \ldots \sigma_{i_1}^\rho(\lambda)$, where $i_1, \ldots, i_n$ denote the simple roots $\alpha_{i_1}, \ldots, \alpha_{i_n}$ with

(**) $\mu = \lambda$, or $\mu \neq \lambda$ and $\sigma_{i_{p+1}}^\rho \sigma_{i_p}^\rho \ldots \sigma_{i_1}^\rho(\lambda) < \sigma_{i_p}^\rho \ldots \sigma_{i_1}^\rho(\lambda)$ ($k > -c_{\bar{g}}$) or $\sigma_{i_{p+1}}^\rho \sigma_{i_p}^\rho \ldots \sigma_{i_1}^\rho(\lambda) > \sigma_{i_p}^\rho \ldots \sigma_{i_1}^\rho(\lambda)$ ($k < -c_{\bar{g}}$), $p = 1, 2, \ldots, n - 1$.

Proof. Consider $k < -c_{\bar{g}}$. For $\mu \in \Gamma_w^-$ the lemma is trivial ($w = 1$). Let $\mu = \mu_1 \notin \Gamma_w^-$. Then there exists $\alpha_1 \in \Delta^s$ such that $n_1 = (2\mu_1 + \rho) \cdot \alpha_1/\alpha_1^2 \in \mathbb{N} = \{1, 2, 3, \ldots\}$. This implies that $\mu_2 = \sigma_{\alpha_1}^\rho(\mu_1)$ satisfies $\mu_2 < \mu_1$. Let $\lambda + \rho/2 \in \Gamma_w^-$ be such that $(\mu_2 - \lambda)^2 \geq 0$ (which is always possible, as can be seen by an explicit parametrization of the weights). We have $(\mu_2 - \lambda)^2 = (\mu_1 - \lambda)^2 + n_1 (2\lambda + \rho) \cdot \alpha_1$ and, therefore, $(\mu_2 - \lambda)^2 < (\mu_1 - \lambda)^2$. If $\mu_2 \notin \Gamma_w^-$ we can continue this process. We get a sequence of weights $\mu_1 = \mu, \mu_2, \ldots, \mu_r$ with $(\mu_{p+1} - \lambda)^2 < (\mu_p - \lambda)^2$ and $\mu_{p+1} = \sigma_{\alpha_p}^\rho(\mu_p) < \mu_p$, $p = 1, \ldots, r - 1$. This sequence must terminate after a finite number of steps, since $(\mu_r - \lambda)^2 \geq 0$ from $(\mu_2 - \lambda)^2 \geq 0$. But this can only happen if the last weight $\mu_r$ of the sequence satisfies $\alpha_i \cdot (2\mu_r + \rho) \leq 0$ for all $\alpha_i \in \Delta^s$, i.e. $\mu_r \in \Gamma_w^-$. We now prove the uniqueness. Assume $w, w' \in W$ and $\lambda, \lambda' \in \Gamma_w^+$ such that $\mu = w^\rho(\lambda) = w'^\rho(\lambda')$. Then $\lambda = (w^{-1})^\rho w'^\rho(\lambda')$. This implies $\lambda = \lambda'$, as follows by an adaptation of [14], Lemma A in Sect. 13.2, to the present case. The case $k > -c_{\bar{g}}$ is proved in a completely analogous fashion.

Lemma 3.3. Let $\mu$ and $\lambda$ be as in Lemma 3.2 and $\mu_0 = \lambda$, $\mu_1 = \sigma_{i_1}^\rho(\mu_0)$, $\mu_2 = \sigma_{i_2}^\rho(\mu_1)$, $\ldots$, $\mu_n = \sigma_{i_n}^\rho(\mu_{n-1}) = \mu$, where $\sigma_{i_k}^\rho$, $k = 1, 2, \ldots, n$ are simple reflections satisfying (**). Then for $k > -c_{\bar{g}}$, $\mathrm{Hom}_g(M(\mu_p), M(\mu_{p-1})) \neq 0$, $p = 1, 2, \ldots, n$ and for $k < -c_{\bar{g}}$, $\mathrm{Hom}_g(M(\mu_{p-1}), M(\mu_p)) \neq 0$, $p = 1, 2, \ldots, n$.

Proof. The proof is by explicit construction. Consider e.g. $k < -c_{\bar{g}}$ and $\mu_p = \sigma_{i_p}^\rho(\mu_{p-1})$. We take the $sl_2$ subalgebra generated by $e_{i_p}$, $f_{i_p}$ and $h_{i_p}$ satisfying $[f_{i_p}, e_{i_p}] = -h_{i_p}$ and $[h_{i_p}, f_{i_p}] = -f_{i_p}$. Let $v_{\mu_p}$ be the highest weight vector that generates $M(\mu_p)$ and $h_{i_p} v_{\mu_p} = \mu_{i_p} v_{\mu_p}$. Then it is straightforward to check that $v_{\mu_{p-1}} = (f_{i_p})^{2\mu_{i_p} + 1} v_{\mu_p}$ is a highest weight vector and it will generate a submodule isomorphic to $M(\mu_{p-1})$. Hence, $\mathrm{Hom}_g(M(\mu_{p-1}), M(\mu_p)) \neq 0$.

By Theorem 3.1, Lemma 3.2 and Lemma 3.3 we have the following:

Proposition 3.4. Let $\mu \in \Gamma_w$. Then there exists a unique $\lambda + \rho/2 \in \Gamma_w^+$ ($k > -c_{\bar{g}}$), or $\lambda \in \Gamma_w^-$ ($k < -c_{\bar{g}}$), such that $\mathrm{Hom}_g(M(\mu), M(\lambda)) \neq 0$ ($k > -c_{\bar{g}}$), or $\mathrm{Hom}_g(M(\lambda), M(\mu)) \neq 0$ ($k < -c_{\bar{g}}$). Furthermore, if $\nu \in \Gamma_w$ and $\mathrm{Hom}_g(M(\mu), M(\nu)) \neq 0$, then $[\dim \mathrm{Hom}_g(M(\mu), M(\lambda))][\dim \mathrm{Hom}_g(M(\nu), M(\lambda))] \neq 0$ for $k > -c_{\bar{g}}$ or $[\dim \mathrm{Hom}_g(M(\lambda), M(\mu))][\dim \mathrm{Hom}_g(M(\lambda), M(\nu))] \neq 0$ for $k < -c_{\bar{g}}$.

Lemma 3.5. Let $\lambda + \rho/2 \in \Gamma_w^+$ ($k > -c_{\bar{g}}$) or $\lambda \in \Gamma_w^-$ ($k < -c_{\bar{g}}$), $w \in W$ and $\alpha \in \Delta^+ \cap \Delta^R$. Then

(i) $\sigma_\alpha^\rho w^\rho(\lambda) < w^\rho(\lambda) \Rightarrow l(\sigma_\alpha w) > l(w)$ for $k > -c_{\bar{g}}$ or $l(\sigma_\alpha w) < l(w)$ for $k < -c_{\bar{g}}$,

(ii) $l(\sigma_\alpha w) > l(w)$ for $k > -c_{\bar{g}}$ or $l(\sigma_\alpha w) < l(w)$ for $k < -c_{\bar{g}}$ $\Rightarrow \sigma_\alpha^\rho w^\rho(\lambda) \leq w^\rho(\lambda)$.

Proof. The proof of (i) is identical to that of Lemma 7.7.2 (ii) in [14] (cf. [6], Lemma 8.2). Note that in the proof of Lemma 7.7.2 in [14], $\lambda \in \Gamma_w^+$ is assumed. The weaker condition on $\lambda$, assumed in our case, does not affect (i). We prove (ii) for $k > -c_{\bar{g}}$. We have

\[
\sigma_\alpha^\rho w^\rho(\lambda) = w^\rho(\lambda) - n\alpha.
\]

Here $n = (2w^\rho(\lambda) + \rho) \cdot \alpha/\alpha^2 \in \mathbb{Z}$. If $n < 0$ then $\sigma_\alpha^\rho w^\rho(\lambda) > w^\rho(\lambda)$. By (i), we get $l(\sigma_\alpha w) < l(w)$ which is a contradiction. Hence, $n = 0, 1, 2, \ldots$ and (ii) follows. The proof for $k < -c_{\bar{g}}$ is analogous.

The following two lemmas are direct generalizations of [14], Lemma 7.7.4 and Lemma 7.7.5 (cf. [6], Lemma 8.4 and Lemma 8.5).

Lemma 3.6. Let $w_1, w_2 \in W$, $\gamma \in \Delta^+ \cap \Delta^R$ and $\alpha \in \Delta^s$, with $\gamma \neq \alpha$. The following conditions are equivalent:

(i) $\sigma_\alpha w_1 \xleftarrow{\alpha} w_1$ and $\sigma_\alpha w_1 \xleftarrow{\gamma} w_2$,

(ii) $w_2 \xleftarrow{\alpha} \sigma_\alpha w_2$ and $w_1 \xleftarrow{\sigma_\alpha(\gamma)} \sigma_\alpha w_2$.

Lemma 3.7. Let $w \in W$ and $\gamma \in \Delta^+ \cap \Delta^R$ be such that $l(w) > l(\sigma_\gamma w)$. Then $w \succ \sigma_\gamma w$.

We proceed to obtain results on the $g$-homomorphisms $M(\nu) \to M(\mu)$. First we have the following:

Proposition 3.8 (cf. [14], Lemma 7.6.11). Let $\nu \in \Gamma_w$, $\alpha \in \Delta^+$, $\mu = \sigma_\alpha^\rho(\nu)$. Assume $\mu \leq \nu$. Then $\mathrm{Hom}_g(M(\mu), M(\nu)) \neq 0$.

Proof. The proof is essentially the same as in [14]. The case $\mu = \nu$ is trivial, so we assume $\mu < \nu$. We consider only $k > -c_{\bar{g}}$ as the case $k < -c_{\bar{g}}$ is analogous. By Lemma 3.2 there exists $w \in W$ and $\lambda' + \rho/2 \in \Gamma_w^+$ such that $\nu = w^\rho(\lambda')$. Let $w = \sigma_{\alpha_n} \ldots \sigma_{\alpha_1}$ be a reduced expression of $w$ in terms of simple reflections and

\[
\begin{aligned}
\nu_0 &= \lambda', & \nu_1 &= \sigma_{\alpha_1}^\rho(\nu_0), & \nu_2 &= \sigma_{\alpha_2}^\rho(\nu_1), & \ldots, \quad \nu_n &= \sigma_{\alpha_n}^\rho(\nu_{n-1}) = \nu, \\
\mu_0 &= \lambda, & \mu_1 &= \sigma_{\alpha_1}^\rho(\mu_0), & \mu_2 &= \sigma_{\alpha_2}^\rho(\mu_1), & \ldots, \quad \mu_n &= \sigma_{\alpha_n}^\rho(\mu_{n-1}) = \mu.
\end{aligned}
\]

Then $\nu_0 = w'^\rho(\mu_0)$ for some $w' \in W$ (from $\nu_0 = (w^{-1})^\rho(\nu) = (w^{-1})^\rho \sigma_\alpha^\rho(\mu)$) and $\mu_0 + \rho/2 \in \Gamma_w^+$, hence $\mu_0 - \nu_0 \in \Gamma_r^+$. On the other hand, $\mu_n - \nu_n = -m\alpha$, $m > 0$. Since the same element of $W$ transforms $\mu$ and $\nu$ into $\mu_p$ and $\nu_p$, respectively, $p = 0, 1, 2, \ldots, n$, $\mu_p$ is transformed from $\nu_p$ by a reflection $\sigma_{\gamma_p}^\rho$ ($\gamma_p \in \Delta^+$), hence $\mu_p - \nu_p \in \Gamma_r^+$ or $\nu_p - \mu_p \in \Gamma_r^+$. Hence, there exists a smallest integer $k$ such that $\mu_k - \nu_k \in \Gamma_r^+$ and $\mu_{k+1} - \nu_{k+1} \in -\Gamma_r^+$. Now $\mu_k - \nu_k = \sigma_{\alpha_{k+1}}^\rho(\mu_{k+1} - \nu_{k+1})$. Since $\mu_{k+1} - \nu_{k+1}$ is proportional to $\gamma_{k+1}$, it can be seen that $\sigma_{\alpha_{k+1}}(\gamma_{k+1}) \in \Delta^-$. Hence, $\gamma_{k+1} = \alpha_{k+1}$ (since $\sigma_{\alpha_{k+1}}$ permutes all positive roots except $\alpha_{k+1}$). The relations $\mu_{k+1} - \nu_{k+1} \in -\Gamma_r^+$ and $\mu_{k+1} = \sigma_{\alpha_{k+1}}^\rho(\nu_{k+1})$ imply $\mathrm{Hom}_g(M(\mu_{k+1}), M(\nu_{k+1})) \neq 0$ (Lemma 3.3). On the other hand $M(\mu_{k+2}) = M(\sigma_{k+2}^\rho(\mu_{k+1}))$ so that $\mathrm{Hom}_g(M(\mu_{k+2}), M(\mu_{k+1})) \neq 0$. Hence, $\mathrm{Hom}_g(M(\mu_{k+2}), M(\sigma_{\alpha_{k+2}}^\rho(\nu_{k+1}))) = \mathrm{Hom}_g(M(\mu_{k+2}), M(\nu_{k+2})) \neq 0$. Continuing this step by step we arrive at $\mathrm{Hom}_g(M(\mu), M(\nu)) \neq 0$.

As a corollary to this proposition we can generalize results obtained by [15] and [11] for finite dimensional Lie algebras.

Corollary 3.9. A necessary and sufficient condition for $M(\mu)$ to be a submodule of $M(\nu)$ is that the condition (*) in Theorem 3.1 is satisfied.

We are now ready to formulate one of the main results of this section, namely the dimension of the $g$-homomorphisms $M(\mu) \to M(\nu)$. This result generalizes the result of Verma [15] for finite dimensional Lie algebras and Rocha-Caridi, Wallach [6] for representations with highest weights on Weyl orbits through dominant weights.
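As an aside, condition (*) of Theorem 3.1, on which Corollary 3.9 rests, lends itself to a small computational illustration. The sketch below makes simplifying assumptions that are not part of the paper: it uses the finite dimensional algebra $A_2$ rather than an affine algebra, integer weights stored by their labels $2\lambda \cdot \alpha_i/(\alpha_i)^2$, and hypothetical names `POSITIVE_ROOTS`, `lower_neighbours` and `linked_weights`. It iterates the step $\mu_{i+1} = \sigma_\alpha^\rho(\mu_i) < \mu_i$ until no new weights appear.

```python
# Finite dimensional A2 toy model (an assumption; the paper treats affine algebras).
# Each positive root is stored as (labels of the root, coefficients of its coroot
# on the simple coroots); weights are tuples of integer labels.
POSITIVE_ROOTS = (((2, -1), (1, 0)), ((-1, 2), (0, 1)), ((1, 1), (1, 1)))


def lower_neighbours(weight):
    """All mu = sigma_alpha^rho(weight) with mu < weight: one step of condition (*)."""
    result = set()
    for root, coroot in POSITIVE_ROOTS:
        # n = 2 (weight + rho/2).alpha / alpha^2, where rho/2 has all labels equal to 1.
        n = sum(c * (l + 1) for c, l in zip(coroot, weight))
        if n >= 1:  # mu = weight - n*alpha lies strictly below weight
            result.add(tuple(l - n * r for l, r in zip(weight, root)))
    return result


def linked_weights(top):
    """Weights reachable from `top` by chains as in condition (*), including `top`."""
    seen, frontier = {top}, [top]
    while frontier:
        for mu in lower_neighbours(frontier.pop()):
            if mu not in seen:
                seen.add(mu)
                frontier.append(mu)
    return seen


if __name__ == "__main__":
    print(sorted(lower_neighbours((0, 0))))  # weights of the form lambda - n*alpha
    print(sorted(linked_weights((0, 0))))    # all weights obtained by iterating the step
```

For the weight with labels $(0, 0)$ the direct solutions of Eq. (2.5) give three lower weights, while iterating the reflection step produces two further ones, $(0, -3)$ and $(-3, 0)$. This is the toy analogue of the remark that iteration of Eq. (3.5) yields highest weight vectors not given by solutions of the Kac–Kazhdan equation for $\lambda$ itself.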


Theorem 3.10. Let $\mu, \nu \in \Gamma_w$. Then $\dim \mathrm{Hom}_g(M(\mu), M(\nu)) \leq 1$.

Proof. We consider the cases $k > -c_{\bar{g}}$ and $k < -c_{\bar{g}}$ separately.

$k > -c_{\bar{g}}$: By Proposition 3.4 it is sufficient to prove that $\dim \mathrm{Hom}_g(M(\mu), M(\lambda)) \leq 1$, where $\mu = w^\rho(\lambda)$, $\lambda + \rho/2 \in \Gamma_w^+$. The proof is then similar to that of [6], Lemma 8.14, using induction on $l(w)$. We only sketch it. For $l(w) = 0$ the theorem is trivial. Assume it to be true for $l(w) < p$. Consider $l(w) = p$. Let $i = 1, 2, \ldots, n$ be such that $\sigma_i^\rho(\mu) > \mu$, where $\sigma_i$ are reflections corresponding to simple roots $\alpha_i$. Then $l(\sigma_i w) < l(w)$ (Lemma 3.5) and $\dim \mathrm{Hom}_g(M(\mu), M(\sigma_i^\rho(\mu))) \neq 0$ (Proposition 3.8). Consider the $sl_2$ subalgebra $g_i$ corresponding to the simple root $\alpha_i$, $i = 1, 2, \ldots, n$. $M(\sigma_i^\rho(\mu))$ is the so-called completion of $M(\mu)$ w.r.t. $g_i$ and is unique ([18], Proposition 3.6, and [6], Proposition 8.11). Then, $\dim \mathrm{Hom}_g(M(\mu), M(\lambda)) = \dim \mathrm{Hom}_g(M(\sigma_i^\rho(\mu)), M(\lambda))$. By the induction hypothesis $\dim \mathrm{Hom}_g(M(\sigma_i^\rho(\mu)), M(\lambda)) = 1$. This gives the theorem in the case $k > -c_{\bar{g}}$.

$k < -c_{\bar{g}}$: We will prove the theorem using essentially the original argument of Verma [15], Theorem 2. By Proposition 3.4 it is sufficient to prove that $\dim \mathrm{Hom}_g(M(\lambda), M(\mu)) \leq 1$, where $\mu = w^\rho(\lambda)$, $\lambda \in \Gamma_w^-$. As $M(\lambda)$ is irreducible, we can count the number of states in $M(\mu)$ and $M(\lambda)$ to establish that if $\dim \mathrm{Hom}_g(M(\lambda), M(\mu)) \geq 2$, then $P(\eta) = \dim M_\eta(\mu) \geq 2 \dim M_{\eta + \lambda - \mu}(\lambda) = 2P(\eta + \lambda - \mu)$. This is, however, a contradiction [15], Lemma 3, as can be seen by considering large $\eta$.

Note here the following. Firstly, Theorem 3.1 and Theorem 3.10 imply that a BG module $V(\mu)$ is a submodule of $M(\lambda)$ if and only if $(M(\lambda) : L(\mu)) \geq 2$. Secondly, if a BG module $V(\mu) \subset M(\lambda)$, then there exists a $g$-homomorphism $\phi_{VM}$ such that $V(\mu) \xrightarrow{\phi_{VM}} M(\mu) \subset M(\lambda)$.

As Theorem 3.10 shows that an element of $\mathrm{Hom}_g(M(\mu), M(\nu))$ is either zero or unique (up to a multiplicative constant), we write $M(\mu) \subset M(\nu)$ whenever $\mathrm{Hom}_g(M(\mu), M(\nu)) \neq 0$. We next generalize a result established for finite dimensional Lie algebras [19] and for $k > -c_{\bar{g}}$ in [6].

Theorem 3.11. Let $\mu, \nu \in \Gamma_w$. Then there exist $w, w' \in W$ and $\lambda + \rho/2, \lambda' + \rho/2 \in \Gamma_w^+$ ($k > -c_{\bar{g}}$), or $\lambda, \lambda' \in \Gamma_w^-$ ($k < -c_{\bar{g}}$) such that $\mu = w^\rho(\lambda')$ and $\nu = w'^\rho(\lambda)$. For $k > -c_{\bar{g}}$ we have:

(i) $M(\mu) \subset M(\nu) \iff w \prec w',\ \lambda = \lambda' \iff (M(\nu) : L(\mu)) \neq 0$.

(ii) If $M(\mu) \subset M(\nu)$, $\mu \neq \nu$, then there are $\mu = \mu_0, \mu_1, \ldots, \mu_n = \nu$ such that $\mu_{i+1} = w_i^\rho(\lambda)$, $i = 0, 1, \ldots, n - 1$ with $l(w_{i+1}) = l(w_i) - 1$, $w_0 = w$, $w_n = w'$ and $M(\mu_0) \subset M(\mu_1) \subset M(\mu_2) \subset \ldots \subset M(\mu_n)$.

For $k < -c_{\bar{g}}$ we have:

(iii) $M(\mu) \subset M(\nu) \iff w' \prec w,\ \lambda = \lambda' \iff (M(\nu) : L(\mu)) \neq 0$.

(iv) If $M(\mu) \subset M(\nu)$, $\mu \neq \nu$, then there are $\mu = \mu_0, \mu_1, \ldots, \mu_n = \nu$ such that $\mu_{i+1} = w_i^\rho(\lambda)$, $i = 0, 1, \ldots, n - 1$ with $l(w_{i+1}) = l(w_i) + 1$, $w_0 = w$, $w_n = w'$ and $M(\mu_0) \subset M(\mu_1) \subset M(\mu_2) \subset \ldots \subset M(\mu_n)$.

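Proof (cf. [14] and [6]). Consider $k < -c_{\bar{g}}$. The existence of $w, w'$ follows from Lemma 3.2. By Theorem 3.1 and Corollary 3.9 we have $M(\mu) \subset M(\nu) \iff (M(\nu) : L(\mu)) \neq 0$. Assume $M(\mu) \subset M(\nu)$. By Corollary 3.9 there exist $\gamma_1, \ldots, \gamma_n \in \Delta^+$ such that

\[
\mu = w^\rho(\lambda) < \sigma_{\gamma_1}^\rho w^\rho(\lambda) < \ldots < \sigma_{\gamma_n}^\rho \sigma_{\gamma_{n-1}}^\rho \ldots \sigma_{\gamma_1}^\rho w^\rho(\lambda) = w'^\rho(\lambda').
\]

Then $\lambda = \lambda'$ (Lemma 3.2) and by Lemma 3.5 we have $l(w) < l(\sigma_{\gamma_1} w) < \ldots < l(w')$. Hence, $w \prec w'$ (Lemma 3.7). We now assume $w' \prec w$, $\lambda = \lambda'$. Then there exist $\gamma_1, \ldots, \gamma_n \in \Delta^+$ such that

\[
w = w_0 \xrightarrow{\gamma_1} w_1 \xrightarrow{\gamma_2} w_2 \cdots \xrightarrow{\gamma_{n-1}} w_{n-1} \xrightarrow{\gamma_n} w_n = w'.
\]

By Lemma 3.5 we have $\mu = w_0^\rho(\lambda) \leq w_1^\rho(\lambda) \leq \ldots \leq w_n^\rho(\lambda) = \nu$ and, hence, $M(w_0^\rho(\lambda)) \subset M(w_1^\rho(\lambda)) \subset \ldots \subset M(w_n^\rho(\lambda))$ (Proposition 3.8). This proves (iii) and (iv). The cases (i) and (ii) are proved analogously.

It is convenient to introduce the concept of length of a weight. Let $\mu \in \Gamma_w$. Then we define the length $l(\mu)$ as the smallest integer $l(w)$ such that $\mu = w^\rho(\lambda)$, $w \in W$, $\lambda + \rho/2 \in \Gamma_w^+$ or $\lambda \in \Gamma_w^-$. We now prove some useful results involving this concept. First we have a result similar to Lemma 3.5.

Lemma 3.12. Let $\lambda + \rho/2 \in \Gamma_w^+$ ($k > -c_{\bar{g}}$) or $\lambda \in \Gamma_w^-$ ($k < -c_{\bar{g}}$), $w \in W$ and $\alpha \in \Delta^+ \cap \Delta^R$. The following conditions are equivalent:

(i) $\sigma_\alpha^\rho w^\rho(\lambda) < w^\rho(\lambda)$,

(ii) $l(\sigma_\alpha^\rho w^\rho(\lambda)) > l(w^\rho(\lambda))$ for $k > -c_{\bar{g}}$, or $l(\sigma_\alpha^\rho w^\rho(\lambda)) < l(w^\rho(\lambda))$ for $k < -c_{\bar{g}}$.

Proof. We prove (i) $\Rightarrow$ (ii) for the case $k > -c_{\bar{g}}$. Let $w'^\rho(\lambda) = \sigma_\alpha^\rho w^\rho(\lambda)$ with $l(w') = l(\sigma_\alpha^\rho w^\rho(\lambda))$. We have $\sigma_\alpha^\rho w'^\rho(\lambda) = w^\rho(\lambda) > \sigma_\alpha^\rho w^\rho(\lambda) = w'^\rho(\lambda)$, and, thus, $l(w) = l(\sigma_\alpha w') < l(w')$ (Lemma 3.5). By definition, $l(w) \geq l(w^\rho(\lambda))$ and, hence, $l(w^\rho(\lambda)) < l(w') = l(\sigma_\alpha^\rho w^\rho(\lambda))$. The case $k < -c_{\bar{g}}$ is proved analogously. We now prove (ii) $\Rightarrow$ (i) for $k > -c_{\bar{g}}$. We have

\[
\sigma_\alpha^\rho w^\rho(\lambda) = w^\rho(\lambda) - n\alpha.
\]

Here $n = (2w^\rho(\lambda) + \rho) \cdot \alpha/\alpha^2 \in \mathbb{Z}$. $n = 0$ implies $\sigma_\alpha^\rho w^\rho(\lambda) = w^\rho(\lambda)$ and thus $l(\sigma_\alpha^\rho w^\rho(\lambda)) = l(w^\rho(\lambda))$. This contradicts (ii) and, therefore, we have $n \neq 0$. If $n < 0$ then $\sigma_\alpha^\rho w^\rho(\lambda) > w^\rho(\lambda)$. By the implication (i) $\Rightarrow$ (ii), we again contradict (ii). Hence, $n = 1, 2, \ldots$ and (i) follows. The proof for $k < -c_{\bar{g}}$ is analogous.

We may easily generalize [14], Proposition 7.6.8, to obtain: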


601

Lemma 3.13. Let λ + ρ/2 ∈ 0+w (k > −cg¯ ) or λ ∈ 0− w (k < −cg¯ ) and w = σαn . . . σα1 be a reduced decomposition of w ∈ W , where α1 , . . . , αn ∈ 1s . Let λ0 = λ, λ1 = σαρ 1 (λ0 ), λ2 = σαρ 2 (λ1 ), . . . , λn = σαρ n (λn−1 ). Then for k > −cg¯ , λ0 ≥ λ1 ≥ . . . ≥ λn and

2αi+1 · (λi + ρ/2) ∈ {0, 1, 2 . . .}, 2 αi+1

and for k < −cg¯ λ0 ≤ λ1 ≤ . . . ≤ λn and

2αi+1 · (λi + ρ/2) ∈ {0, −1, −2 . . .}. 2 αi+1

Lemma 3.14. Let $\lambda + \rho/2 \in \Gamma_w^+$ ($k > -c_{\bar g}$) or $\lambda \in \Gamma_w^-$ ($k < -c_{\bar g}$). Let $\mu \in \Gamma_w$ with $\mu = w^\rho(\lambda)$, $w \in W$. If $l(\mu) = l(w)$, then $w$, $\lambda$, $\mu$ satisfy (**) in Lemma 3.2 with $l(\mu) = n$. In addition, this is the minimal integer $n$ for which (**) is satisfied.

Proof. Consider $k < -c_{\bar g}$. Let $w = \sigma_{\alpha_n} \cdots \sigma_{\alpha_1}$ with $l(w) = l(\mu)$. By Lemma 3.13 we have a sequence $\lambda_0 \le \lambda_1 \le \ldots \le \lambda_n$ and
$$\frac{2\alpha_{i+1}\cdot(\lambda_i + \rho/2)}{\alpha_{i+1}^2} \in \{0, -1, -2, \ldots\}.$$
Assume $\lambda_i = \lambda_{i+1}$ for some $i \in \{0, 1, 2, \ldots, n\}$. Then clearly $w' = \sigma_{\alpha_n} \cdots \sigma_{\alpha_{i+1}} \sigma_{\alpha_{i-1}} \cdots \sigma_{\alpha_1}$ satisfies $\mu = w'^\rho(\lambda)$ and $l(w') < l(w)$. This contradicts the assumption $l(\mu) = l(w)$. The last assertion follows by the definition of $l(\mu)$. The case $k > -c_{\bar g}$ is proved analogously.

Proposition 3.15. Let $M(\mu) \subset M(\nu)$, where $\mu, \nu \in \Gamma_w$. Then $l(\mu) - l(\nu) = n$ for $k > -c_{\bar g}$, or $l(\nu) - l(\mu) = n$ for $k < -c_{\bar g}$, if and only if $n$ is the largest integer for which $M(\mu) \subset M^{(n)}(\nu)$.

Proof. Consider $k < -c_{\bar g}$. By Proposition 3.4 and the hereditary nature of Jantzen's filtration it is sufficient to prove the proposition for $l(\mu) = 0$, i.e. for $\mu \in \Gamma_w^-$ and some given $M(\nu)$. We prove the "only if" case by induction on $l(\nu)$. For $l(\nu) = 0$ the proposition is trivial. Assume it to be true for $l(\nu) \le p - 1$ and consider $l(\nu) = p$. As $p \ge 1$ there must exist $\alpha \in \Delta^s$ such that $\nu' = \sigma_\alpha^\rho(\nu) < \nu$. Then $M(\nu') \subset M(\nu)$ (Proposition 3.8) and $l(\nu') < l(\nu)$ (Lemma 3.12). If $l(\nu') < p - 1$ then $l(\nu) < p$, which is a contradiction. Hence, $l(\nu') = p - 1$. In addition, $M(\nu') \subset M^{(1)}(\nu)$ and $M(\nu') \not\subset M^{(2)}(\nu)$. This follows by an explicit construction of the highest weight vector that generates $M(\nu')$ (cf. the proof of Lemma 3.3). We now use the induction hypothesis on $M(\nu')$ together with the hereditary nature of Jantzen's filtration to conclude that the proposition holds for $l(\nu) = p$.

We prove the "if" case. Consider $M(\mu) \subset M^{(p)}(\nu)$, $\mu \in \Gamma_w^-$ and use induction on $p$. The case $p = 0$ is trivial. Assume the assertion to be true for $0 \le p \le n - 1$ and consider $p = n \ge 1$. As $p \ge 1$ there must exist $\alpha \in \Delta^s$ such that $\nu' = \sigma_\alpha^\rho(\nu) < \nu$ and $M(\nu') \subset M(\nu)$ (Proposition 3.8) with $l(\nu') < l(\nu)$ (Lemma 3.12). By explicit construction of the highest weight vector one checks that $M(\nu') \subset M^{(1)}(\nu)$ and $M(\nu') \not\subset M^{(2)}(\nu)$. Then the hereditary nature of Jantzen's filtration implies $M(\mu) \subset M^{(n-1)}(\nu')$, which by the induction hypothesis yields $l(\nu') = n - 1$. Then $\nu' = \sigma_\alpha^\rho(\nu)$ implies $l(\nu) = n$, which concludes the proof. The case $k > -c_{\bar g}$ is proved in a completely analogous fashion.
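The length $l(\mu)$ used in the statements above is a minimum over all $w$ with $\mu = w^\rho(\lambda)$, which matters when $\lambda + \rho/2$ is singular and several $w$ give the same weight. A minimal sketch of this definition, again in a toy finite-dimensional $A_2$ model that is an assumption made here (Dynkin labels, $\rho/2 = (1, 1)$) and with hypothetical helper names, computes $l(\mu)$ by breadth-first search over words in the simple reflections:

```python
from collections import deque

# Toy model (an assumption of this sketch): A_2 in Dynkin labels, rho/2 = (1, 1).
HALF_RHO = (1, 1)


def shifted_reflect(i, lam):
    """rho-centered simple reflection: s_i(lam + rho/2) - rho/2, i in {1, 2}."""
    a, b = lam[0] + HALF_RHO[0], lam[1] + HALF_RHO[1]
    a, b = (-a, a + b) if i == 1 else (a + b, -b)
    return (a - HALF_RHO[0], b - HALF_RHO[1])


def weight_lengths(lam):
    """Return {mu: l(mu)} for every mu on the rho-centered orbit of lam.

    Breadth-first search visits each weight first along a shortest word,
    so the stored value is the smallest l(w) with mu = w^rho(lam).
    """
    lengths = {lam: 0}
    queue = deque([lam])
    while queue:
        mu = queue.popleft()
        for i in (1, 2):
            nu = shifted_reflect(i, mu)
            if nu not in lengths:
                lengths[nu] = lengths[mu] + 1
                queue.append(nu)
    return lengths


if __name__ == "__main__":
    print(weight_lengths((0, 0)))    # regular case: six weights
    print(weight_lengths((-1, 0)))   # singular case: lam + rho/2 = (0, 1), three weights
```

In the regular case each weight corresponds to exactly one Weyl group element, so $l(\mu) = l(w)$. In the singular case the orbit is smaller and the minimum in the definition is needed.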


S. Hwang

Lemma 3.16 ([14], Lemma 7.7.6; [6], Lemma 8.6). Let $w_1, w_2 \in W$. The number of elements $w \in W$ such that $w_1 \leftarrow w \leftarrow w_2$ is 0 or 2.

Lemma 3.17 (cf. [14], Lemma 7.7.7 (iii) and [6], Lemma 8.15 (iii)). Let $M(\mu_1)$ and $M(\mu_2)$ be Verma modules with highest weights $\mu_1$ and $\mu_2$, respectively. Let $\mu_1 + \rho/2$ and $\mu_2 + \rho/2$ be regular. If $l(\mu_1) = l(\mu_2) + 2$ ($k > -c_{\bar g}$) or $l(\mu_1) = l(\mu_2) - 2$ ($k < -c_{\bar g}$), then the number of $\mu$ such that $M(\mu_1) \subset M(\mu) \subset M(\mu_2)$, $M(\mu_1) \neq M(\mu) \neq M(\mu_2)$ is either 0 or 2.

Proof. Consider $k > -c_{\bar g}$. The definition of $l(\mu_1)$ and $l(\mu_2)$ implies together with Lemma 3.2 that there exist $w_1, w_2 \in W$ such that $\mu_1 = w_1^\rho(\lambda)$, $\mu_2 = w_2^\rho(\lambda)$, $\lambda + \rho/2 \in \Gamma_w^+$ and $l(w_1) = l(w_2) + 2$. In addition, $\mu_1 + \rho/2$ and $\mu_2 + \rho/2$ regular imply that $\lambda \in \Gamma_w^+$. Then the number of $w \in W$ such that $M(w_1^\rho(\lambda)) \subset M(w^\rho(\lambda)) \subset M(w_2^\rho(\lambda))$ and $M(w_1^\rho(\lambda)) \neq M(w^\rho(\lambda)) \neq M(w_2^\rho(\lambda))$ is 0 or 2, as can be seen from combining Lemma 3.16 and Theorem 3.11. This proves the assertion of the lemma for $k > -c_{\bar g}$. The case $k < -c_{\bar g}$ is proved analogously.

4. The BRST Formalism

Define the algebra $g' = g_k \oplus g_{-k-2c_{\bar g}}$, where $c_{\bar g}$ is the quadratic Casimir of the adjoint representation. This algebra is invariant under the exchange $k \to -k - 2c_{\bar g}$ and, consequently, we may restrict to $k > -c_{\bar g}$. The singular case $k = -c_{\bar g}$ will not be treated here. We will denote $g_{-k-2c_{\bar g}}$ by $\tilde g$ and the Verma module over $\tilde g$ will be denoted $\tilde M(\tilde\lambda)$, where $\tilde\lambda$ is its highest weight. Let $B_{n'_+}$, $B_{n'_-}$, $B_{h'}$, $B_{\tilde g}$ and $B_{g'}$ be bases of $n'_+$, $n'_-$, $h'$, $\tilde g$ and $g'$, respectively. The generators $\tilde e_\alpha$, $\tilde f_\alpha$ and $\tilde h_i$ are a realization of $B_{\tilde g}$ and $e'_\alpha$, $f'_\alpha$ and $h'_i$ a realization of $B_{g'}$. Define $M'_{\lambda\tilde\lambda} = M(\lambda) \otimes \tilde M(\tilde\lambda)$ and similarly $L'_{\lambda\tilde\lambda} = L(\lambda) \otimes \tilde L(\tilde\lambda)$. $\pi_{L'}$ denotes the projection $M' \to L'$.

We define the anticommuting ghost and antighost operators $c(x)$ and $b(x)$, respectively, where $x \in B_{g'}$, with the following properties:
$$\text{(i)}\quad \{c(x), b(y)\} = \delta_{x^\dagger, y}, \tag{4.1}$$
$$\text{(ii)}\quad c^\dagger(x) = c(x^\dagger), \quad b^\dagger(x) = b(x^\dagger), \tag{4.2}$$
$$\text{(iii)}\quad b(a_1 x + a_2 y) = a_1 b(x) + a_2 b(y), \quad a_1, a_2 \in \mathbb{C}. \tag{4.3}$$

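Here $\delta_{x,y} = 1$ if $x = y$ and 0 otherwise. Introduce a normal ordering
$$:c(x)b(y): \; = \begin{cases} c(x)b(y) & \text{if either } x \in B_{n'_-} \text{ or } y \in B_{n'_+}, \\ -b(y)c(x) & \text{if either } x \in B_{n'_+} \text{ or } y \in B_{n'_-}, \\ \tfrac{1}{2}\left(c(x)b(y) - b(y)c(x)\right) & \text{otherwise.} \end{cases} \tag{4.4}$$
Define the BRST operator
$$d = \sum_{x \in B_{g'}} c(x^\dagger)\,x + \sum_{x \in B_{h'}} c(x^\dagger)\,\rho(x) - \frac{1}{2}\sum_{x, y \in B_{g'}} :b([x, y])\,c(x^\dagger)\,c(y^\dagger):, \tag{4.5}$$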

where $\rho(x)$ is the component of $\rho$ corresponding to the element $x \in B_{h'}$. The BRST operator has the following two fundamental properties: $d^2 = 0$ and $d^\dagger = d$. The first property implies that $x^{tot} = \{d, b(x)\}$ generates an algebra $g_0$ which is centerless. Define a ghost module $F^{gh}$. It is generated by the ghost operators acting on a vacuum vector $v_0^{gh}$ satisfying
$$c(x)v_0^{gh} = b(y)v_0^{gh} = 0 \quad \text{for } x \in B_{n'_+} \text{ and } y \in B_{n'_+} \cup B_{h'}. \tag{4.6}$$
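The nilpotency $d^2 = 0$ can be checked numerically in a much simpler setting. The sketch below is a finite-dimensional toy analogue and not the operator (4.5) itself: it assumes the algebra so(3) with structure constants $\epsilon_{abc}$, its three-dimensional representation $T_a$, and ghosts realized as matrices by a Jordan-Wigner construction. There is no normal ordering and no $\rho$ term in this toy case. All function names are hypothetical and introduced only for this illustration.

```python
from itertools import product

N = 3  # toy choice: so(3), with [T_a, T_b] = sum_c eps(a, b, c) T_c


def eps(a, b, c):
    """Levi-Civita symbol on the indices 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]


def add(A, B, factor=1):
    """Entrywise A + factor * B."""
    return [[x + factor * y for x, y in zip(r, s)] for r, s in zip(A, B)]


def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in A for rb in B]


I2, Z = [[1, 0], [0, 1]], [[1, 0], [0, -1]]
RAISE, LOWER = [[0, 0], [1, 0]], [[0, 1], [0, 0]]


def mode(j, local):
    """Jordan-Wigner form of a fermionic operator acting on mode j of N."""
    out = [[1]]
    for k in range(N):
        out = kron(out, Z if k < j else local if k == j else I2)
    return out


C = [mode(a, RAISE) for a in range(N)]   # ghosts c^a
B = [mode(a, LOWER) for a in range(N)]   # antighosts b_a, {c^a, b_b} = delta_ab
T = [[[-eps(a, i, j) for j in range(N)] for i in range(N)] for a in range(N)]
I3 = [[int(i == j) for j in range(N)] for i in range(N)]


def brst():
    """d = sum_a c^a T_a - (1/2) sum_{a,b,c} eps(a,b,c) c^a c^b b_c."""
    dim = 2 ** N * N  # ghost Fock space (dimension 8) times the representation
    d = [[0] * dim for _ in range(dim)]
    for a in range(N):
        d = add(d, kron(C[a], T[a]))
    # The summand is symmetric under a <-> b, so summing over a < b absorbs the 1/2.
    for a, b, c in product(range(N), repeat=3):
        if a < b and eps(a, b, c) != 0:
            ghost_part = matmul(matmul(C[a], C[b]), B[c])
            d = add(d, kron(ghost_part, I3), factor=-eps(a, b, c))
    return d


if __name__ == "__main__":
    d = brst()
    print(all(x == 0 for row in matmul(d, d) for x in row))
```

The cancellation works as in the general case: the square of the first term produces $\tfrac{1}{2} f_{ab}{}^{c} c^a c^b T_c$, which is removed by the cross term, and the square of the cubic ghost term vanishes by the Jacobi identity. Integer arithmetic is used throughout so that the check is exact.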

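We also define a restricted module $\hat F^{gh} = \{v^{gh} \in F^{gh} \mid b(x)v^{gh} = 0 \text{ for } x \in B_{h'}\}$. The dual $F^{*gh}$ of $F^{gh}$ has a vacuum vector $v_0^{*gh}$ satisfying
$$c^\dagger(x)v_0^{*gh} = b^\dagger(y)v_0^{*gh} = 0 \quad \text{for } x \in B_{n'_+} \cup B_{h'} \text{ and } y \in B_{n'_+}. \tag{4.7}$$
The restricted dual is $\hat F^{*gh} = \{v^{*gh} \in F^{*gh} \mid c(x)v^{*gh} = 0 \text{ for } x \in B_{h'}\}$. Define a Hermitian form for the ghost sector by
$$\langle v_0^{*gh} | v_0^{gh} \rangle = 1, \qquad \langle v^{*gh} | u v^{gh} \rangle = \langle u^\dagger v^{*gh} | v^{gh} \rangle, \tag{4.8}$$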

for a polynomial $u$ in the ghost operators and $v^{gh} \in F^{gh}$, $v^{*gh} \in F^{*gh}$. If $v^{gh} = u v_0^{gh}$ then we denote by $v^{*gh}$ the vector $u^\dagger v_0^{*gh}$. The ghost number $N^{gh}$ of any vector $v^{gh} \in F^{gh}$ is defined by $N^{gh}(v_0^{gh}) = 0$ and $N^{gh}(c(x)v) = N^{gh}(v) + 1$, $N^{gh}(b(x)v) = N^{gh}(v) - 1$. The ghost numbers of vectors in the dual module are similarly defined with $N^{gh}(v_0^{*gh}) = 0$. It is easily seen that $\langle u^* | v \rangle = 0$ if $N^{gh}(u^*) + N^{gh}(v) \neq 0$.

Let $C(g', V)$ be the complex $V \otimes F^{gh}$ for a $g'$-module $V$. We define the relative subcomplex $\hat C(g', V) = \{\omega \in C(g', V) \mid b(x)\omega = 0,\ x^{tot}\omega = 0 \text{ for } x \in B_{h'}\}$ and $\hat C^*(g', V)$ is the dual complex. If $\omega = v \otimes v^{gh}$ for $v \in V$, $v^{gh} \in F^{gh}$, then we denote by $\omega^*$ the vector $v \otimes v^{*gh}$. We decompose $d$ as
$$d = \hat d + \sum_{x \in B_{h'}} \left(x^{tot} c(x) + M(x) b(x)\right). \tag{4.9}$$
We have $d\omega = \hat d\omega$ for $\omega \in \hat C(g', V)$ and consequently on the relative subcomplex we may analyze the cohomology of $\hat d$ in place of $d$. The cohomology associated with $\hat d$, the semi-infinite or BRST relative cohomology, is sometimes denoted by $\hat H^{\infty/2+p}(g', V)$ to distinguish it from the conventional Lie algebra cohomology. We will, however, here for simplicity write $\hat H^p(g', V)$, where $p$ refers to elements $\omega$ with $N^{gh}(\omega) = p$. Our primary interest here will be for $V = L'_{\lambda\tilde\lambda}$. But in order to gain knowledge of this case we will also study $V = M'_{\lambda\tilde\lambda}$ and its submodules.

It will be convenient to make a classification of vectors in the complex $C(g', V)$ using the BRST operator. A central result due to Kugo and Ojima [7] states the following.

Theorem 4.1. Let $V$ be an irreducible module. Then a basis of $C(g', V)$ may be chosen so that for an element $\omega$ in this basis one of the following will be true.
(i) Singlet case: $\omega \in H^*(g', V)$ and $\langle \omega^* | \omega \rangle \neq 0$, $N^{gh}(\omega) = 0$.
(ii) Singlet pair case: $\omega \in H^*(g', V)$ and there exists an element $\psi \neq \omega$ such that $\psi \in H^*(g', V)$, $\langle \psi^* | \omega \rangle \neq 0$ and $N^{gh}(\psi) = -N^{gh}(\omega)$.
(iii) Quartet case: $\omega \notin H^*(g', V)$. There will exist four elements $\omega_1, \omega_2, \psi_1, \psi_2 \in C(g', V)$, where either $\omega = \omega_1$ or $\omega = \omega_2$, such that $\langle \psi_1^* | \omega_1 \rangle \neq 0$ and $\langle \psi_2^* | \omega_2 \rangle \neq 0$, $\omega_2 = d\omega_1$ and $\psi_1 = d\psi_2$ and $N^{gh}(\omega_1) = N^{gh}(\omega_2) - 1 = -N^{gh}(\psi_1) = -N^{gh}(\psi_2) - 1$.


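There will exist an analogous classification on the relative subcomplex $\hat C(g', V)$ w.r.t. $\hat d$. In this classification all non-trivial elements in the cohomology will be singlets or singlet pairs. It should be remarked that the condition of irreducibility is essential for the theorem. In the following section, we will find that for $V$ being a reducible Verma module the above classification does not hold. In particular, the non-trivial elements of the cohomology for non-zero ghost numbers will for this case not be members of singlet pairs.

Define Jantzen's filtration for $\xi \in \hat C(g', M')$ as follows. Let $\lambda_\epsilon = \lambda + \epsilon z$ and $\tilde\lambda_\epsilon = \tilde\lambda - \epsilon z$. Then $M'^{(n)} = \{v'_\epsilon \in M(\lambda_\epsilon) \otimes \tilde M(\tilde\lambda_\epsilon) \mid \langle w'^*_\epsilon | v'_\epsilon \rangle$ is divisible by $\epsilon^n$ for any $w'^*_\epsilon \in M^*(\lambda_\epsilon) \otimes \tilde M^*(\tilde\lambda_\epsilon)\}$. We denote by $\xi_\epsilon$ the vector $v_\epsilon \otimes \tilde v_\epsilon \otimes v^{gh}$. An element $\xi_\epsilon$ is always assumed to be finite as $\epsilon \to 0$. We denote by $f(\epsilon) \sim \epsilon^n$ the leading order of a function $f(\epsilon)$ in the limit $\epsilon \to 0$. Note that our definition of Jantzen's filtration for $g'$ implies that $\lambda_\epsilon + \tilde\lambda_\epsilon$ is independent of $\epsilon$. This is required if the cohomology should have at least one non-trivial element for $\epsilon \neq 0$, namely the vacuum solution $\xi_0 = v_0 \otimes \tilde v_0 \otimes v_0^{gh}$. In the next section the following result will be needed.

Lemma 4.2. Let $\xi_{1\epsilon}, \xi_{2\epsilon} \in \hat C(g', M'_\epsilon)$ be non-zero for $\epsilon = 0$ and $\hat d\xi_{2\epsilon} = g(\epsilon)\xi_{1\epsilon}$, where $g(\epsilon) \sim 1$ or $\epsilon$. Let $n_1$ and $n_2$ be the largest integers for which $\xi_i \in \hat C(g', M'^{(n_i)})$, $i = 1, 2$. Then there exist $\zeta_{1\epsilon}, \zeta_{2\epsilon} \in \hat C(g', M'_\epsilon)$ which are non-zero for $\epsilon = 0$ and satisfy
(i) $\langle \zeta^*_{i\epsilon} | \xi_{j\epsilon} \rangle \sim \epsilon^{n_i}\delta_{i,j}$, non-zero for $\epsilon \neq 0$ when $i = j = 1, 2$.
(ii) $\hat d\zeta_{1\epsilon} = f(\epsilon)\zeta_{2\epsilon}$, where $f(\epsilon) \sim 1$ or $\sim \epsilon$.
(iii) $n_1$, $n_2$ are the largest integers for which $\zeta_i \in \hat C(g', M'^{(n_i)})$.
In addition, for $g(\epsilon) \sim 1$: $n_1 = n_2$ if and only if $f(\epsilon) \sim 1$, $n_1 = n_2 + 1$ if and only if $f(\epsilon) \sim \epsilon$. For $g(\epsilon) \sim \epsilon$ we have: $n_1 = n_2 - 1$ if and only if $f(\epsilon) \sim 1$, $n_1 = n_2$ if and only if $f(\epsilon) \sim \epsilon$.

Proof. Since $\xi_{1\epsilon} \in \hat C(g', M'^{(n_1)}_\epsilon)$ for a largest integer $n_1$ and $M'^{(n_i)}_\epsilon$ is irreducible for $0 < |\epsilon| \ll 1$ there must exist one vector $\zeta_{1\epsilon} \in \hat C(g', M'^{(n_1)}_\epsilon)$ with $\langle \xi^*_{1\epsilon} | \zeta_{1\epsilon} \rangle \sim \epsilon^{n_1}$. Then
$$g(\epsilon)\langle \zeta^*_{1\epsilon} | \xi_{1\epsilon} \rangle = \langle \zeta^*_{1\epsilon} | \hat d\xi_{2\epsilon} \rangle = \langle \hat d\zeta^*_{1\epsilon} | \xi_{2\epsilon} \rangle \tag{4.10}$$
implies $\hat d\zeta_{1\epsilon} = f(\epsilon)\zeta_{2\epsilon}$, for some vector $\zeta_{2\epsilon}$ satisfying $\langle \zeta^*_{2\epsilon} | \xi_{2\epsilon} \rangle \neq 0$ and which is non-zero for $\epsilon = 0$. In addition, $f(\epsilon)$ is a non-singular function of $\epsilon$. From the fact that $\hat d$ is linear in the generators of $g'$ we can conclude that $f(\epsilon) \sim 1$ or $\epsilon$. Pick a basis of $\hat C(g', M'_\epsilon)$ such that $\xi_{1\epsilon}$, $\xi_{2\epsilon}$ are two of its elements. Denote the elements of the basis by $\xi_{i\epsilon}$, $i = 1, 2, 3, \ldots$. Similarly we pick a basis of $\hat C(g', M'^*_\epsilon)$, $\zeta^*_{i\epsilon}$, $i = 1, 2, 3, \ldots$. We choose it such that $\langle \zeta^*_{i\epsilon} | \xi_{j\epsilon} \rangle$ is non-zero only for $i = j$. Now since $\langle \zeta^*_{i\epsilon} | \xi_{2\epsilon} \rangle = 0$ for $i \neq 2$ and $\xi_{2\epsilon} \in \hat C(g', M'^{(n_2)}_\epsilon)$ we must have $\langle \zeta^*_{2\epsilon} | \xi_{2\epsilon} \rangle \sim \epsilon^{n_2}$. This in turn implies, using $\langle \zeta^*_{2\epsilon} | \xi_{j\epsilon} \rangle = 0$ for $j \neq 2$ and the definition of Jantzen's filtration, that $\zeta_{2\epsilon} \in \hat C(g', M'^{(n_2)}_\epsilon)$. We now conclude from $\langle \xi^*_{1\epsilon} | \zeta_{1\epsilon} \rangle \sim \epsilon^{n_1}$, $\langle \xi^*_{2\epsilon} | \zeta_{2\epsilon} \rangle \sim \epsilon^{n_2}$, Eq. (4.10) and $f(\epsilon) \sim 1$ or $\epsilon$ that for $g(\epsilon) \sim 1$ we have $\epsilon^{n_1} \sim \epsilon^{n_2} f(\epsilon) \sim \epsilon^{n_2}$ or $\epsilon^{n_2+1}$, while for $g(\epsilon) \sim \epsilon$ we have $\epsilon^{n_1} \sim \epsilon^{n_2-1} f(\epsilon) \sim \epsilon^{n_2-1}$ or $\epsilon^{n_2}$.

A standard tool in the analysis of the cohomology is a contracting homotopy operator. Let $\omega_0$ be a vacuum vector of $\hat C(g', M')$, i.e. $\omega_0 = v'_0 \otimes v_0^{gh}$, where $v'_0 = v_0 \otimes \tilde v_0$ and $v_0$, $\tilde v_0$ are primary highest weight vectors of $M$ and $\tilde M$, respectively. Consider an element $\omega \in \hat C(g', M')$ of the form $\omega = v' \otimes v^{gh}$ with $v' = u v_0 \otimes \tilde u \tilde v_0$, $u \in U(n_-)$, $\tilde u \in U(\tilde n_-)$ and $N^{gh}(v^{gh}) = n$. We write $u = u_m + u_{m-1} + \ldots + u_0$, where $u_i \in U(n_-)$ is a monomial of order $i$. Introduce a gradation $N_{gr}$. We define $N_{gr}(\omega_0) = 0$. Furthermore, $N_{gr}(\omega) = m - n$. We will get a filtration $\hat C(g', M') = \bigoplus_{N_{gr}} \hat C(g', M')_{N_{gr}}$. We now decompose $\hat d$ as $\hat d = d_0 + d_{-1}$, where $d_0 = \sum_{\alpha \in \Delta^+} c(e'_\alpha) f_\alpha$. We have $\hat d\omega = d_0\omega +$ (lower order terms). Let $\omega_{p,q} \in \hat C(g', M')_{p+q-r}$ be of the form
$$\omega_{p,q} = f_{\alpha_1} \cdots f_{\alpha_p} v_0 \otimes \tilde v \otimes b(f'_{\beta_1}) \cdots b(f'_{\beta_q})\, c(f'_{\gamma_1}) \cdots c(f'_{\gamma_r})\, v_0^{gh}, \tag{4.11}$$
where $\alpha, \beta, \gamma \in \Delta^+$. The homotopy operator $\kappa_0$ is now defined by
$$\kappa_0\omega_{p,q} = \frac{1}{p+q} \sum_{i=1}^{p} f_{\alpha_1} \cdots \widehat{f_{\alpha_i}} \cdots f_{\alpha_p} v_0 \otimes \tilde v \otimes b(f'_{\alpha_i})\, b(f'_{\beta_1}) \cdots b(f'_{\beta_q})\, c(f'_{\gamma_1}) \cdots c(f'_{\gamma_r})\, v_0^{gh}, \quad p \neq 0, \qquad \kappa_0\omega_{0,q} = 0, \tag{4.12}$$
where capped factors are omitted. It is now straightforward to verify
$$(d_0\kappa_0 + \kappa_0 d_0)\,\omega_{p,q} = (1 - \delta_{p+q,0})\,\omega_{p,q} + \text{(lower order terms)}. \tag{4.13}$$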
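The leading-order content of Eq. (4.13) is the standard Koszul homotopy identity, and it can be tested in a simplified model. The sketch below makes several assumptions that are not in the text: the generators $f_\alpha$ are treated as commuting variables (so the lower order terms vanish identically), the spectator factors $c(f'_\gamma)$ and the vectors $v_0$, $\tilde v$ are dropped, and a basis element is stored as a pair `(exps, bs)` of exponents of the $f_\alpha$ and a sorted tuple of the indices carrying a $b(f'_\alpha)$. In this model $d_0$ removes one $b$ and multiplies by the matching $f_\alpha$, while $\kappa_0$ does the reverse with the weight $1/(p+q)$.

```python
from fractions import Fraction
from itertools import combinations, product


def d0(vec):
    """Toy d_0: remove b_a (with the usual sign) and multiply by f_a."""
    out = {}
    for (exps, bs), coeff in vec.items():
        for pos, a in enumerate(bs):
            key = (exps[:a] + (exps[a] + 1,) + exps[a + 1:], bs[:pos] + bs[pos + 1:])
            out[key] = out.get(key, 0) + (-1) ** pos * coeff
    return {k: v for k, v in out.items() if v != 0}


def kappa0(vec):
    """Toy kappa_0: trade one factor f_a for b_a, weighted by 1 / (p + q)."""
    out = {}
    for (exps, bs), coeff in vec.items():
        p, q = sum(exps), len(bs)
        for a, m in enumerate(exps):
            if m == 0 or a in bs:
                continue  # nothing to remove, or b_a b_a = 0
            sign = (-1) ** sum(1 for b in bs if b < a)
            key = (exps[:a] + (m - 1,) + exps[a + 1:], tuple(sorted(bs + (a,))))
            out[key] = out.get(key, 0) + Fraction(sign * m, p + q) * coeff
    return {k: v for k, v in out.items() if v != 0}


def homotopy(vec):
    """(d0 kappa0 + kappa0 d0) applied to vec."""
    out = dict(d0(kappa0(vec)))
    for key, val in kappa0(d0(vec)).items():
        out[key] = out.get(key, 0) + val
    return {k: v for k, v in out.items() if v != 0}


def check(n_gen=2, max_exp=2):
    """Test the identity on all basis elements up to the given size."""
    for exps in product(range(max_exp + 1), repeat=n_gen):
        for q in range(n_gen + 1):
            for bs in combinations(range(n_gen), q):
                omega = {(exps, bs): 1}
                expected = omega if sum(exps) + q > 0 else {}
                if homotopy(omega) != expected:
                    return False
    return True


if __name__ == "__main__":
    print(check(2, 2), check(3, 1))
```

The factor $1/(p+q)$ is what makes the identity hold, since the unnormalized anticommutator acts as multiplication by $p + q$ (the number of $f$ factors plus the number of $b$ factors). The vacuum, with $p + q = 0$, is the only basis element that is annihilated, which matches the factor $(1 - \delta_{p+q,0})$.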

One may also define a gradation $\tilde N_{gr}$ using the elements of $U(\tilde n_-)$ in place of $U(n_-)$. We then have a corresponding decomposition $\hat d = \tilde d_0 + \tilde d_{-1}$ with $\tilde d_0 = \sum_{\alpha \in \Delta^+} c(e'_\alpha)\tilde f_\alpha$ and a homotopy operator $\tilde\kappa_0$.

5. The BRST Cohomology

We will now in detail study the semi-infinite relative cohomology associated with the BRST operator. The notation follows that of previous sections. $\omega_\epsilon, \xi_\epsilon, \ldots$ always denote elements of $\hat C(g', \ldots)$ that are finite in the limit $\epsilon \to 0$. Our starting point is Proposition 5.1 concerning the cohomology of Verma modules. This proposition was to our knowledge first given in [20], Proposition 2.29.

Proposition 5.1. Let $M'$ be a highest weight Verma module over $g'$. Then $\hat H^p(g', M') = 0$ for $p < 0$.

Proof ([2]). Let $\omega \in \hat H^p(g', M')$ and have a highest order term $\omega_n$ in the gradation $N_{gr}$. Then $0 = \hat d\omega = d_0\omega_n +$ (lower order terms) and hence $d_0\omega_n = 0$ to leading order. Using Eq. (4.13) we conclude that $\omega_n = d_0(\kappa_0\omega_n) +$ (lower order terms) and as a consequence of this, $\omega = \hat d(\kappa_0\omega_n) +$ (lower order terms). Thus $\omega$ is a trivial element of $\hat H^p(g', M')$ to highest order. This may be iterated to lower orders and we find that $\omega \in \hat H^p(g', M')$ will be non-trivial only for $N_{gr}(\omega) \le 0$, which is impossible if $N^{gh}(\omega) < 0$.

Corollary 5.2 ([2]). Let $M'$ in Proposition 5.1 be irreducible. Then $\hat H^p(g', M') = 0$ for $p \neq 0$. Furthermore, $\omega \in \hat H^0(g', M')$ is the element $\omega = v_0 \otimes \tilde v_0 \otimes v_0^{gh}$, where $v_0$ and $\tilde v_0$ are primary highest weight vectors of weights $\lambda$ and $\tilde\lambda$, respectively, satisfying $\lambda + \tilde\lambda + \rho = 0$.

Corollary 5.3. Let $L'$ be the irreducible $g'$-module of $M'$. Let $\omega \in \hat C(g', M')$ be such that $0 \neq \pi_{L'}(\omega) \in \hat H^p(g', L')$, $p < 0$. Then
$$\hat d\omega = \nu, \tag{5.1}$$
where $\nu \in \hat H^{p+1}(g', M'^{(1)})$ and is non-zero.


Proof ([2]). Assume first $\hat d\omega = 0$ in $\hat C(g', M')$. Then Proposition 5.1 implies $\omega = \hat d\eta$, $\eta \in \hat C(g', M')$. Since $\omega \in \hat C(g', M'/M'^{(1)})$ we must have $\eta \in \hat C(g', M'/M'^{(1)})$, which implies that $\omega$ is cohomologically trivial. Therefore, $\hat d\omega = \nu \neq 0$ and so $\hat d\nu = 0$. If $\nu \notin \hat H^{p+1}(g', M'^{(1)})$, then $\nu = \hat d\nu'$ for some $\nu' \in \hat C(g', M'^{(1)})$ and $\hat d(\omega - \nu') = 0$. Proposition 5.1 then implies that $\pi_{L'}(\omega)$ is a trivial element of $\hat H^p(g', L')$.

The following lemma is partly the converse of Corollary 5.3.

Lemma 5.4. Let $\omega \in \hat C(g', M')$, $\hat d\omega = \nu$ in $\hat C(g', M')$ with $\nu \in \hat H^{p+1}(g', M'^{(1)})$ and $\pi_{L'}(\omega) \neq 0$, then $\pi_{L'}(\omega) \in \hat H^p(g', L')$.

Proof. Firstly, $\hat d\omega = \nu$ with $\nu \in \hat H^{p+1}(g', M'^{(1)})$ implies that $\hat d\pi_{L'}(\omega) = 0$. Secondly, assume $\pi_{L'}(\omega)$ to be trivial, i.e. $\pi_{L'}(\omega) = \hat d\pi_{L'}(\psi)$, $\psi \in \hat C(g', M')$. Then $\omega = \hat d\psi + \nu'$ in $\hat C(g', M')$, with $\nu' \in \hat C(g', M'^{(1)})$, and so $\nu = \hat d\omega = \hat d\nu'$. This is a contradiction to the assumption $\nu \in \hat H^{p+1}(g', M'^{(1)})$.

Lemma 5.5. $\dim \hat H^{p+1}(g', M'^{(1)}) = \dim \hat H^p(g', L')$ for $p \le -2$.

Proof. Let $\nu \in \hat H^{p+1}(g', M'^{(1)})$ with $p \le -2$, then by Proposition 5.1 $\nu = \hat d\omega$, $\omega \in \hat C(g', M')$ and $\pi_{L'}(\omega) \neq 0$. Lemma 5.4 then implies $\pi_{L'}(\omega) \in \hat H^p(g', L')$. We have thus proved that $\dim(\hat H^{p+1}(g', M'^{(1)})) \le \dim(\hat H^p(g', L'))$. We now prove that the dimensionalities are in fact the same. Assume two elements $\omega_1, \omega_2 \in \hat C(g', M')$ with $\pi_{L'}(\omega_1), \pi_{L'}(\omega_2) \in \hat H^p(g', L')$, corresponding to the same element $\nu$. By Corollary 5.3 we have in $\hat C(g', M')$: $\hat d\omega_1 = \nu_1$ and $\hat d\omega_2 = \nu_2$, where $\nu_1 = \nu_2$ as elements in $\hat H^{p+1}(g', M'^{(1)})$. Subtracting the equations yields $\hat d(\omega_1 - \omega_2) = \nu_1 - \nu_2 = \hat d\nu'$, $\nu' \in \hat C(g', M'^{(1)})$, which by Proposition 5.1 implies $\omega_1 - \omega_2 - \nu' = \hat d(\ldots)$. $\pi_{L'}(\omega_1)$ and $\pi_{L'}(\omega_2)$ are therefore identical elements in $\hat H^p(g', L')$.

The results obtained so far are of importance for negative ghost numbers. We now turn to results relevant for positive ghost numbers. We will connect the two cases by the use of Jantzen's perturbation, Theorem 4.1 and Lemma 4.2.

Lemma 5.6. Let $\omega \in \hat C(g', M')$ with $\pi_{L'}(\omega) \in \hat H^p(g', L')$, satisfying $\hat d\omega = \nu$, $\nu \in \hat C(g', M'^{(1)})$. Then:
(i) There exists $\psi \in \hat C(g', M')$ with $\pi_{L'}(\psi) \in \hat H^{-p}(g', L')$ and $\langle \psi^* | \omega \rangle \neq 0$.
(ii) With $\psi$ as in (i): There exists $\chi \in \hat C(g', M'^{(1)})$ of opposite ghost number of $\nu$, satisfying $\hat d\chi_\epsilon = \epsilon\psi_\epsilon$.
(iii) $\chi, \nu \notin \hat C(g', M'^{(2)})$, where $\chi$ is defined as in (ii).
(iv) With $\psi$ as in (i): $\hat d\psi = 0$.
(v) $p \le 0$.

Proof. (i) follows directly from Theorem 4.1. (ii) and (iii) follow from Theorem 4.1 and Lemma 4.2 using $\omega = \xi_2$, $\nu = \xi_1$ and $g(\epsilon) \sim 1$. (iv) follows by applying $\hat d$ to the equation $\hat d\chi_\epsilon = \epsilon\psi_\epsilon$, using $\hat d^2 = 0$ and taking the limit $\epsilon \to 0$. Finally (v) may be proved by contradiction. If $p > 0$, then by (iv) and Corollary 5.3 $\psi$ is $\hat d$-exact and, hence, so is $\omega$.

Lemma 5.7. Let $\psi \in \hat C(g', M')$, $\pi_{L'}(\psi) \neq 0$, and $\chi \in \hat C(g', M'^{(1)})$ be such that $\hat d\chi_\epsilon = \epsilon\psi_\epsilon$. Then $\chi \in \hat C(g', M'^{(1)}/M'^{(2)})$, $\pi_{L'}(\psi) \in \hat H^p(g', L')$ and $p \ge 0$. Conversely, let $\pi_{L'}(\psi) \in \hat H^p(g', L')$, $p \ge 1$, then there exists $\chi \in \hat C(g', M'^{(1)})$ such that $\hat d\chi_\epsilon = \epsilon\psi_\epsilon$.



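Proof. Assume $\hat d\chi_\epsilon = \epsilon\psi_\epsilon$. We apply Lemma 4.2 with $\xi_1 = \psi$, $\xi_2 = \chi$ and $g(\epsilon) \sim \epsilon$. Then $n_1 = 0$, $n_2 \ge 1$ and by the lemma there exist two vectors $\omega_\epsilon$ and $\nu_\epsilon$ such that $\langle \omega^*_\epsilon | \psi_\epsilon \rangle \sim 1$, $\langle \nu^*_\epsilon | \chi_\epsilon \rangle \sim \epsilon$ and $\hat d\omega_\epsilon = f(\epsilon)\nu_\epsilon$, with $f(\epsilon) \sim 1$ or $\epsilon$. Furthermore $f(\epsilon) \sim 1$, since otherwise $n_1 = n_2$. This in turn implies $n_2 = 1$, by Lemma 4.2 (iii), and $\chi, \nu \in \hat C(g', M'^{(1)}/M'^{(2)})$. We now show that $\nu$ is not exact in $\hat C(g', M'^{(1)})$. Assume the contrary, i.e. $\nu = \hat d\eta$ with $\eta \in \hat C(g', M'^{(1)}/M'^{(2)})$. Then $\nu_\epsilon = \hat d\eta_\epsilon + h(\epsilon)\nu'_\epsilon$, where $\nu' \in \hat C(g', M'^{(1)})$ and $h(\epsilon)$ is a polynomial in $\epsilon$ such that $h(0) = 0$. This implies that $\omega'_\epsilon = \omega_\epsilon - f(\epsilon)\eta_\epsilon$ satisfies $\hat d\omega'_\epsilon = f(\epsilon)h(\epsilon)\nu'_\epsilon$. Now $\lim_{\epsilon \to 0}\langle \omega'^*_\epsilon | \psi_\epsilon \rangle \neq 0$ since $\omega'$ and $\omega$ differ by an element in $\hat C(g', M'^{(1)})$. This is a contradiction as can be seen from
$$\langle \omega'^*_\epsilon | \psi_\epsilon \rangle = \langle \omega'^*_\epsilon | \tfrac{1}{\epsilon}\hat d\chi_\epsilon \rangle = \tfrac{1}{\epsilon}\langle \hat d\omega'^*_\epsilon | \chi_\epsilon \rangle = \tfrac{1}{\epsilon}\langle f(\epsilon)h(\epsilon)\nu'^*_\epsilon | \chi_\epsilon \rangle \longrightarrow 0 \quad \text{for } \epsilon \to 0.$$
Thus $\nu \in \hat H^{-p+1}(g', M'^{(1)})$. Lemma 5.4 then gives $\pi_{L'}(\omega) \in \hat H^{-p}(g', L')$, which implies $\pi_{L'}(\psi) \in \hat H^p(g', L')$. The condition $p \ge 0$ follows from Corollary 5.3 and the fact that $\hat d\psi = 0$ in $\hat C(g', M')$.

We now prove the converse statement. Let $\pi_{L'}(\psi) \in \hat H^p(g', L')$, $p \ge 1$. Pick a basis as in Theorem 4.1 so that $\psi$ is one of its elements and $\omega \in \hat C(g', M')$, $\pi_{L'}(\omega) \in \hat H^{-p}(g', L')$, $\langle \omega^* | \psi \rangle \neq 0$, is another. Corollary 5.3 implies $\hat d\omega = \nu$ and then Lemma 5.6 gives the assertion.

Lemma 5.8. Let $\psi$ and $\chi \in \hat C(g', M')$ be such that $N^{gh}(\psi) \ge 1$, $\hat d\chi_\epsilon = \epsilon\psi_\epsilon$ and $\chi \in \hat C(g', M'^{(1)}/M'^{(2)})$. Then $\pi_{L'}(\psi) \in \hat H^p(g', L')$.

Proof. By Lemma 5.7 it is sufficient to prove that $\pi_{L'}(\psi) \neq 0$. Assume the contrary, i.e. $\psi \in \hat C(g', M'^{(1)})$. Then Lemma 4.2 implies that there exist two vectors $\omega$ and $\nu$ satisfying $\hat d\omega_\epsilon = f(\epsilon)\nu_\epsilon$ in $\hat C(g', M')$, where $f(\epsilon) \sim \epsilon$. In addition, $\psi, \omega, \nu \in \hat C(g', M'^{(1)}/M'^{(2)})$ and $\langle \omega^*_\epsilon | \psi_\epsilon \rangle \sim \epsilon$, $\langle \nu^*_\epsilon | \chi_\epsilon \rangle \sim \epsilon$. Now $N^{gh}(\omega) \le -1$, so that by Proposition 5.1, $\omega = \hat d\omega'$ for some vector $\omega'$. We then have $\omega_\epsilon = \hat d\omega'_\epsilon + h(\epsilon)\nu'_\epsilon$ for some vector $\nu'_\epsilon$, which is non-singular for $\epsilon = 0$, and $h(\epsilon)$ is a polynomial of $\epsilon$ such that $h(0) = 0$. This implies $\hat d\omega_\epsilon = h(\epsilon)\hat d\nu'_\epsilon$, which by comparing with $\hat d\omega_\epsilon = f(\epsilon)\nu_\epsilon$ yields $h(\epsilon) \sim \epsilon$ and $\nu_\epsilon \sim \hat d\nu'_\epsilon$. Then
$$\epsilon \sim \langle \nu^*_\epsilon | \chi_\epsilon \rangle \sim \langle \hat d\nu'^*_\epsilon | \chi_\epsilon \rangle = \langle \nu'^*_\epsilon | \hat d\chi_\epsilon \rangle = \epsilon\langle \nu'^*_\epsilon | \psi_\epsilon \rangle,$$
so that $\langle \nu'^*_\epsilon | \psi_\epsilon \rangle \sim 1$, which contradicts $\psi \in \hat C(g', M'^{(1)})$.

Proposition 5.9. $\hat H^p(g', L')$ for $p \ge 1$ are represented by elements of the form $v \otimes \tilde v_0 \otimes v^{gh}$, or equivalently of the form $v_0 \otimes \tilde v \otimes v^{gh}$, where $v \in L$, $\tilde v \in \tilde L$, $v_0$ is a primary highest weight vector w.r.t. $g$, $\tilde v_0$ is a primary highest weight vector w.r.t. $\tilde g$ and $v^{gh}$ satisfies $c(x)v^{gh} = 0$, $x \in n_+$.

Proof. Let $\hat H^p(g', L')$, $p \ge 1$, be non-zero. Then by Theorem 4.1 there exists $\omega \in \hat C(g', M')$ such that $\pi_{L'}(\omega) \in \hat H^{-p}(g', L')$. We have $\hat d\omega = \nu$ (Corollary 5.3) with $\nu \in \hat C(g', M'^{(1)})$. It follows by Lemma 5.6 (iv) that $\psi \in \hat C(g', M')$ with $\pi_{L'}(\psi) \in \hat H^p(g', L')$ will satisfy $\hat d\psi = 0$ in $\hat C(g', M')$. We can now use the gradation $\tilde N_{gr}$ introduced in the previous section to decompose $\hat d = \tilde d_0 + \tilde d_{-1}$ and use the homotopy operator $\tilde\kappa_0$ to successively eliminate highest order terms of $\psi$ in this gradation. Since $p \ge 1$ we will finally get an element of the form $v \otimes \tilde v_0 \otimes v^{gh}$. The alternative form is found by using the gradation $N_{gr}$.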


Proposition 5.10. $\hat H^0(g', M')$ are represented by elements $v \otimes \tilde v_0 \otimes v_0^{gh}$, or equivalently by the elements $v_0 \otimes \tilde v \otimes v_0^{gh}$, where $v$, $v_0$ and $\tilde v$, $\tilde v_0$ are highest weight vectors w.r.t. $g$ and $\tilde g$, respectively, with $v_0$ and $\tilde v_0$ being primary, and $v_0^{gh}$ is the ghost vacuum. Furthermore, the weights $\mu$ and $\tilde\mu$ of the primary highest weight vectors $v_0$ and $\tilde v_0$, respectively, satisfy $\mu + \tilde\mu + \rho = 0$.

Proof. Let $\psi \in \hat H^0(g', M')$. Then $\hat d\psi = 0$ and we can use the gradation $\tilde N_{gr}$ and the homotopy operator as in the proof of Proposition 5.9 to conclude that since $N^{gh}(\psi) = 0$ we must have $\psi = v \otimes \tilde v_0 \otimes v_0^{gh}$. By using the gradation $N_{gr}$ we get the alternative form. The condition on the weights is a consequence of $h^{tot}(v \otimes \tilde v \otimes v_0^{gh}) = 0$.

Corollary 5.11. $\hat H^0(g', L')$ are represented by elements of the form $v_0 \otimes \tilde v_0 \otimes v_0^{gh}$. Furthermore, the weights $\mu$ and $\tilde\mu$ of the primary highest weight vectors $v_0$ and $\tilde v_0$, respectively, satisfy $\mu + \tilde\mu + \rho = 0$.

Proof. Let $\psi \in \hat C(g', M')$ and $\pi_{L'}(\psi) \in \hat H^0(g', L')$. Assume first $\hat d\psi = 0$ in $\hat C(g', M')$. Then the corollary follows directly from Proposition 5.10. Consider now $\hat d\psi = \nu \neq 0$, where $\pi_{L'}(\nu) = 0$. Then by Lemma 5.6 (iv) there exists $\omega \in \hat C(g', M')$ such that $\hat d\omega = 0$, $\omega \notin \hat C(g', M'^{(1)})$ and $\langle \omega^* | \psi \rangle \neq 0$. We may then apply Proposition 5.10 to $\omega$, so that $\omega$ is of the form claimed in the corollary. As $\langle \omega^* | \omega \rangle \neq 0$, $\omega$ is a singlet representation of the BRST cohomology (cf. Theorem 4.1) and, hence, $\psi$ and $\omega$ yield equivalent elements in $\hat H^0(g', L')$.

Theorem 5.12. A necessary and sufficient condition for $\hat H^{\pm p}(g', L')$, $p \ge 1$, to be non-zero is either one of the following:
(i) There exists a vector $\nu \in \hat C(g', M'^{(1)})$ satisfying $\nu \notin \hat C(g', M'^{(2)})$ and $\nu \in \hat H^{-p+1}(g', M'^{(1)})$.
(ii) There exists a vector $\chi \in \hat C(g', M'^{(1)})$ satisfying $\chi \notin \hat C(g', M'^{(2)})$, $N^{gh}(\chi) = p - 1$, $\hat d\chi = 0$ and $\hat d\chi_\epsilon \neq 0$.
In addition, $\dim \hat H^{-p+1}(g', M'^{(1)}) = \dim \hat H^{\pm p}(g', L')$, $p \ge 1$.

Proof. Necessary: (i) follows by Corollary 5.3 and Lemma 5.6 (iii). (ii) follows from (i) together with Lemma 5.6 (ii) and (iii).

Sufficient: (i) For $p > 1$ we use Lemma 5.5. This also gives the last assertion of dimensionalities for these cases. For $p = 1$ we have $\nu \in \hat H^0(g', M'^{(1)})$. We have two possibilities. Either $\nu \in \hat H^0(g', M')$ or $\nu = \hat d\psi$ for some $\psi \in \hat C(g', M')$, $\pi_{L'}(\psi) \neq 0$. In the first case we have $\hat d\nu_\epsilon \neq 0$ from Proposition 5.10, so that we get case (ii) of the theorem, which is proved below. For the second possibility we use Lemma 5.4.

(ii) $\hat d\chi = 0$ and $\hat d\chi_\epsilon \neq 0$ implies, using that $\hat d$ is linear in the generators of $g'$, $\hat d\chi_\epsilon = \epsilon\psi_\epsilon$ for some $\psi_\epsilon$ satisfying $\lim_{\epsilon \to 0}\psi_\epsilon \neq 0$. Lemma 5.8 then gives $\pi_{L'}(\psi) \in \hat H^p(g', L')$.

We finally prove the assertion concerning dimensionalities for the case $p = 1$. Assume first that there exist $\omega_1, \omega_2 \in \hat C(g', M')$, with $\pi_{L'}(\omega_1), \pi_{L'}(\omega_2) \in \hat H^{-p}(g', L')$ and $\nu_1, \nu_2 \in \hat H^{-p+1}(g', M'^{(1)})$, satisfying $\nu_1 = \hat d\omega_1$, $\nu_2 = \hat d\omega_2$ (which is necessary by Corollary 5.3), where $\nu_1 = \nu_2$ mod exact terms. This implies $\hat d(\omega_1 - \omega_2) = \nu_1 - \nu_2 = \hat d(\ldots)$, so that by Proposition 5.1, $\pi_{L'}(\omega_1) = \pi_{L'}(\omega_2)$ mod exact terms. Consider the opposite case, i.e. two different vectors $\nu_1$ and $\nu_2$ give the same element in $\hat H^{-1}(g', L')$. Write $\hat d\omega_1 = \nu_1$ and $\hat d\omega_2 = \nu_2$. The requirement that $\pi_{L'}(\omega_1)$ and $\pi_{L'}(\omega_2)$ are equivalent elements in $\hat H^{\pm 1}(g', L')$ now implies $\omega_1 = \omega_2 + \nu' + \hat d(\ldots)$, where $\nu' \in \hat C(g', M'^{(1)})$. Then $\nu_1 = \nu_2 + \hat d\nu'$.



Corollary 5.13. $\dim \hat H^{\pm 1}(g', L'_{\mu\tilde\mu}) = 1$ if $l(\tilde\mu) - l(\mu) = 1$ and $\mu$ and $-\tilde\mu - \rho$ are on the same $\rho$-centered Weyl orbit, and $\dim \hat H^{\pm 1}(g', L'_{\mu\tilde\mu}) = 0$ otherwise.

Proof. By Proposition 5.10 and Theorem 3.11 we have $\dim \hat H^0(g', M'^{(1)}) = 1$ if $l(\tilde\mu) - l(\mu) = 0$ and $\mu$ and $-\tilde\mu - \rho$ are on the same $\rho$-centered Weyl orbit and $\dim \hat H^0(g', M'^{(1)}) = 0$ otherwise. Then $\dim \hat H^{\pm 1}(g', L'_{\mu\tilde\mu}) = \dim \hat H^0(g', M'^{(1)}) = 1$ (Theorem 5.12). With the help of Proposition 5.10 we can easily construct $\nu$ as in Theorem 5.12 (i), which gives the corollary.

Theorem 5.14. $\hat H^{\pm p}(g', L'_{\mu\tilde\mu}) = 0$, $p \ge 0$, if $l(\tilde\mu) - l(\mu) \neq p$, or if $l(\tilde\mu) - l(\mu) = p$ and $\mu$ and $-\tilde\mu - \rho$ are not on the same $\rho$-centered Weyl orbit.

Proof. The theorem is true for $p = 0$ by Corollary 5.11 and for $p = 1$ by Corollary 5.13. Assume the theorem to be true for $\hat H^{\pm q}(g', L'_{\mu\tilde\mu})$, $0 \le q \le p - 1$ and consider $q = p$. Assume there exists $\omega \in M'_{\mu\tilde\mu}$ such that $\pi(\omega) \in \hat H^{-p}(g', L'_{\mu\tilde\mu}) \neq 0$. Let $\mathrm{grad}(\sigma) = s$ if $s$ is the largest integer for which $\sigma \in \hat C(g', M'^{(s)})$. Then there exists $\nu \in \hat H^{-p+1}(g', M'^{(1)}_{\mu\tilde\mu})$, $\mathrm{grad}(\nu) = 1$ (Theorem 5.12). Write $\nu = \nu_1 + \nu_2 + \ldots + \nu_n$, where $\nu_i \in V_i$, $i = 1, 2, \ldots, n$, $\mathrm{grad}(\nu_i) = 1$ and $V_i$ are Verma or BG modules of primitive weights $(\mu_i, \tilde\mu_i)$. We have $l(\tilde\mu_i) - l(\mu_i) = l(\tilde\mu) - l(\mu) - 1$ (Proposition 3.15). We may assume that $\nu$ cannot be written as a sum $\nu' + \nu''$, where $\nu', \nu'' \in \hat H^{-p+1}(g', M'^{(1)}_{\mu\tilde\mu})$ and unequal, $\mathrm{grad}(\nu') = \mathrm{grad}(\nu'') = 1$, as this would yield two different elements in $\hat H^{-p}(g', L'_{\mu\tilde\mu})$. If $\hat d\nu_i = 0$ for some value of $i$, then $\nu_i = \hat d(\ldots)$ (Proposition 5.10) and clearly $\nu - \nu_i$ will correspond to the same element $\omega$. Hence, we may restrict to $\nu_i$ with $\hat d\nu_i \neq 0$, $i = 1, \ldots, n$.

Consider now the equation $\hat d\nu = 0$ using the gradation $N_{gr}$. Let $\hat\nu_i$ be the highest order term of $\nu_i$ and $N_{gr}(\hat\nu_i) = N_i$, $i = 1, \ldots, n$. Let $\hat N = \max\{N_i\}_{i=1}^n$ and order so that $N_i = \hat N$ for $i \in \{1, 2, \ldots, m\}$, $m \le n$. Then $d_0\left(\sum_{i=1}^m \hat\nu_i\right) = 0$. As $d_0\hat\nu_i \in V_i$, this equation may only be solved if there exists at least one $V_j$ such that $d_0\hat\nu_i \in V_i \cap V_j$. Let

φV M i be the g-homomorphism Vi → Mi0 , i = 1, . . . , n, where Mi are Verma modules of the same primitive weight as Vi . φV M i exists for all i (see the note after Theorem 3.10). Then d0 φV M i (νî ) ∈ Mi0 ∩Mj0 . This is only possible if d0 φV M i (νî ) ∈ Mi0(1) for all ˆ V M i (νî ) ∈ M 0(1) to highest order in Ngr and that there exists ηi ∈ M 0 i. This implies dφ i i ˆ ˆ 0 to leading order such that ηi = dφV M i (νi ). If there exists ηi0 ∈ Mi0(1) such that η = dη i ˆ V M i (νi ) − η 0 ) to leading order, which contradicts dφ ˆ V M i (νi ) = ξi , where ξi is then d(φ i non-exact in Mi0(1) . Hence, ηi ∈ Hˆ −p+2 (g 0 , Mµ0(1) ˜ i ) to highest order. Theorem 5.12 now iµ −p+1 0 0 ˆ (g , Li ) to highest order. The induction hypothesis asserts that πLi φV M i (νî ) ∈ H implies l(µ˜ i ) − l(µi ) = p − 1 and that µi and −µ˜ i − ρ lie on the same ρ-centered Weyl orbit. Then l(µ) ˜ − l(µ) = p, µ and −µ˜ − ρ lie on the same Weyl orbit (Theorem 3.11 and Proposition 3.15). Theorem 5.15. Let µ, µ˜ ∈ 0w be such that µ + ρ/2 and µ˜ + ρ/2 are regular and µ and −µ˜ − ρ are on the same ρ−centered Weyl orbit. Then Hˆ ±p (g 0 , Lµµ˜ ) 6= 0, where p = l(µ) ˜ − l(µ) ≥ 0. Proof. For p = 0 the theorem is given by Corollary 5.11 (cf. the proof of Theorem 5.14, where it is shown that µ + µ˜ + ρ = 0 implies l(µ) ˜ − l(µ) = 0). For p = 1 the theorem follows from Corollary 5.13. We proceed by induction on p. Assume the theorem to be true for 0 ≤ l(µ) ˜ − l(µ) ≤ p − 1. We will also assume the following to hold to this


order of $p$. Let $\omega \in M_{\mu\tilde\mu}$ such that $\pi_L(\omega) \in \hat H^{-q}(g', L_{\mu\tilde\mu})$ for $0 \le q \le p - 1$. We then assume $\hat d\omega = \nu_1 + \nu_2 + \ldots + \nu_n$ with $\nu_i \in M_{\mu_i\tilde\mu_i}$ and $\mathrm{grad}(\nu_i) = 1$, $i = 1, \ldots, n$ (with grad(...) defined as in Theorem 5.14). This assumption clearly holds for $p = 1$. We now consider $\mu$, $\tilde\mu$ such that $l(\tilde\mu) - l(\mu) = p \ge 2$ with $\tilde\mu + \rho/2$ and $\mu + \rho/2$ being regular.

Introduce the following notation. For the Verma module $M_{\mu\tilde\mu}$ we let $M_1, \ldots, M_n$ denote all submodules such that $M_i \subset M^{(1)}_{\mu\tilde\mu}$, $M_i \not\subset M^{(2)}_{\mu\tilde\mu}$, $i = 1, \ldots, n$. Denote by $(\mu_1\tilde\mu_1), \ldots, (\mu_n\tilde\mu_n)$ their respective highest weights. Let $\phi_i$ be a non-zero element of $\mathrm{Hom}_g(M_i, M_{\mu\tilde\mu})$, $i = 1, \ldots, n$. Let $M_{i_1 \ldots i_k} = M_{i_1} \cup \ldots \cup M_{i_k}$, $i_1, \ldots, i_k = 1, \ldots, n$ and $\phi_{i_1 \ldots i_k}$ be a non-zero element of $\mathrm{Hom}_g(M_{i_1 \ldots i_k}, M_{\mu\tilde\mu})$.

Consider now $\omega_1 \in M_1$ with $\pi_L(\omega_1) \in \hat H^{-p+1}(g', L_{\mu_1\tilde\mu_1})$. By Theorem 3.11 and the induction hypothesis $\omega_1$ exists. Then $\hat d\omega_1 = \nu_1 + \ldots + \nu_s$, where $\nu_i \in M_{1,i} \subset M_1^{(1)}$ (induction hypothesis). As $\mathrm{grad}(\nu_i) = 1$ we have $\mathrm{grad}(\phi_i(\nu_i)) = 2$. Therefore, there will exist a union of Verma modules, $M_{2 \ldots k}$ say, such that $\phi_1(\nu_1 + \ldots + \nu_s) = \phi_{2 \ldots k}(\nu'_1 + \ldots + \nu'_t)$, $\nu'_1 + \ldots + \nu'_t \in M_{2 \ldots k}$. By Lemma 3.17, $M_{2 \ldots k}$ is non-zero and different from $M_1$. Thus, $\phi_1(\nu_1 + \ldots + \nu_s)$ may either be viewed as originating from an element $\nu_1 + \ldots + \nu_s$ in $M_1$ or from an element $\nu'_1 + \ldots + \nu'_t$ in $M_{2 \ldots k}$. As $\hat d(\nu'_1 + \ldots + \nu'_t) = 0$, there exists an element $\omega_{2 \ldots k} \in M_{2 \ldots k}$ such that $\nu'_1 + \ldots + \nu'_t = \hat d\omega_{2 \ldots k}$ (Proposition 5.1). From this it follows that $\hat d(\phi_{2 \ldots k}(\omega_{2 \ldots k}) - \phi_1(\omega_1)) = 0$. Define $\xi = \phi_{2 \ldots k}(\omega_{2 \ldots k}) - \phi_1(\omega_1)$, which must be non-zero as $M_1 \neq M_{2 \ldots k}$. We now prove that $\xi$ is a non-trivial element of $\hat H^{-p+1}(g', M'^{(1)})$, which by Theorem 5.12 proves our assertions for $l(\tilde\mu) - l(\mu) = p$ (including the additional induction assumption).

Assume the contrary, $\xi = \hat d\eta$ with $\eta \in M^{(1)}_{\mu\tilde\mu}$. Let $\xi = \xi_1 + \ldots + \xi_k$, where $\xi_i \in M_i$, $i = 1, \ldots, k$. By construction $\hat d\xi_i \in M^{(2)}_{\mu\tilde\mu}$, $i = 1, \ldots, k$. If $\hat d\xi_i = 0$ then the corresponding Verma module $M_i$ may be deleted from $M_{2 \ldots k}$ without affecting the construction of $\xi$ (as $\hat d\xi_i = 0$ implies $\xi_i = \hat d(\ldots)$). Hence, we may consider $\hat d\xi_i \neq 0$, $i = 1, \ldots, k$. To highest order in the gradation $\tilde N_{gr}$ the equation $\xi = \hat d\eta$ yields $\xi^{(N)} = \xi_1^{(N)} + \ldots + \xi_k^{(N)} = \tilde d_0\eta^{(N)}$, where $\xi^{(N)}$ and $\eta^{(N)}$ are the leading terms in $\xi$ and $\eta$, respectively, and $\xi_i^{(N)}$ denotes the $N$th order term of $\xi_i$ (which is non-zero for at least one value of $i$). Generally, $\tilde d_0\gamma \in V$ only if $\gamma \in V$ for any Verma or BG module $V$. Therefore, $\eta^{(N)} = \eta_1^{(N)} + \ldots + \eta_k^{(N)}$ and $\xi_i^{(N)} = \tilde d_0\eta_i^{(N)}$, $\eta_i^{(N)} \in M_i$, $i = 1, \ldots, k$. This implies $\tilde d_0\xi^{(N)} = 0$ and in turn $\hat d\xi_i = 0$, which is a contradiction.

Remark 1. Results similar to Theorem 5.15 may be obtained for weights $\mu + \rho/2$ and $\tilde\mu + \rho/2$ being singular, provided the corresponding Verma modules satisfy the multiplicity condition of Lemma 3.17. It is clear, however, that this generalization does not hold for all singular cases.

Remark 2. The proof of Theorem 5.15 also provides an explicit method for finding the elements of the cohomology for negative ghost numbers. It is the same method as was presented in ref. [3].

Acknowledgement. I would like to thank Henric Rhedin for stimulating discussions during the progression of this work.

References

1. Karabali, D. and Schnitzer, H.: BRST quantization of the gauged WZW action and coset conformal field theories. Nucl. Phys. B329, 649–666 (1990)
2. Hwang, S. and Rhedin, H.: The BRST formulation of G/H WZNW models. Nucl. Phys. B406, 165–184 (1993)
3. Hwang, S. and Rhedin, H.: Construction of BRST invariant states in G/H models. Phys. Lett. B350, 38–43 (1995) (hep-th/9501084)
4. Malikov, F.G., Feigin, B.L. and Fuks, D.B.: Singular vectors in Verma modules over Kac–Moody algebras. Funkt. Anal. Ego Prilozh. 20, 25–37 (1986) (English translation in Funkt. Anal. Appl. 20, 103–113 (1986))
5. Jantzen, J.C.: Moduln mit einem höchsten Gewicht. Lecture notes in Mathematics, Eds. A. Dold and B. Eckmann, Berlin–Heidelberg–New York: Springer-Verlag, 1979
6. Rocha-Caridi, A. and Wallach, N.R.: Projective modules over graded Lie algebras I. Math. Z. 180, 151–177 (1982)
7. Kugo, T. and Ojima, I.: Local covariant operator formalism of non-abelian gauge theories and quark confinement problem. Suppl. Prog. Theor. Phys. 66, 1–130 (1979)
8. Rocha-Caridi, A. and Wallach, N.R.: Highest weight modules over graded Lie algebras: Resolutions, filtrations and character formulas. Trans. Am. Math. Soc. 277, 133–162 (1983)
9. Kac, V.G. and Peterson, D.: Infinite dimensional Lie algebras, theta functions and modular forms. Adv. Math. 53, 125–264 (1984)
10. Kac, V.G. and Kazhdan, D.A.: Structure of representations with highest weight of infinite-dimensional Lie algebras. Adv. in Math. 34, 97–108 (1979)
11. Bernstein, I.N., Gel'fand, I.M. and Gel'fand, S.I.: Structure of representations generated by vectors of highest weight. Funct. Anal. App. 5, 1–8 (1971)
12. Conze, N. and Dixmier, J.: Idéaux primitifs dans l'algèbre enveloppante d'une algèbre de Lie semisimple. Bull. Sc. Math 96, 339–351 (1972)
13. Kac, V.G.: Infinite dimensional Lie algebras. Third edition, Cambridge: Cambridge Univ. Press, 1990
14. Dixmier, J.: Enveloping Algebras. North Holland Mathematical library, Amsterdam: North Holland Publ. Co., 1977
15. Verma, D.-N.: Structure of certain induced representations of complex semisimple Lie algebras. Bull. Amer. Math. Soc. 74, 160–166 (1968)
16. Humphreys, J.: Introduction to Lie algebras and representation theory. Revised edition, Berlin–Heidelberg–New York: Springer-Verlag, 1980
17. Hwang, S. and Marnelius, R.: BRST symmetry and a general ghost decoupling theorem. Nucl. Phys. B320, 476–486 (1989)
18. Enright, T.J.: On the fundamental series of a real semisimple Lie algebra: Their irreducibility, resolutions and multiplicity formulae. Ann. of Math. 110, 1–82 (1979)
19. Bernstein, I.N., Gel'fand, I.M. and Gel'fand, S.I.: Differential operators on the base affine space and the study of g-modules. In: I.M. Gel'fand, ed., Publ. of the 1971 summer school in Math., Janos Bolyai Math. Soc., Budapest, pp. 21–64
20. Lian, B.H. and Zuckerman, G.: BRST cohomology and highest weight vectors. Comm. Math. Phys. 135, 547–580 (1991)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 194, 613 – 630 (1998)

Zeta-Function Regularization, the Multiplicative Anomaly and the Wodzicki Residue

Emilio Elizalde1, Luciano Vanzo2, Sergio Zerbini2

1 Unitat de Recerca, CSIC, IEEC, Edifici Nexus 201, Gran Capità 2–4, 08034 Barcelona, Spain and Departament ECM and IFAE, Facultat de Física, Universitat de Barcelona, Diagonal 647, 08028 Barcelona, Spain. E-mail: [email protected]
2 Dipartimento di Fisica, Università di Trento, and Istituto Nazionale di Fisica Nucleare, Gruppo Collegato di Trento, Italia. E-mail: [email protected], [email protected]

Received: 3 February 1997 / Accepted: 5 November 1997

Abstract: The multiplicative anomaly associated with the zeta-function regularized determinant is computed for the Laplace-type operators $L_1 = -\Delta + V_1$ and $L_2 = -\Delta + V_2$, with $V_1$, $V_2$ constant, in a D-dimensional compact smooth manifold $M_D$, making use of several results due to Wodzicki and by direct calculations in some explicit examples. It is found that the multiplicative anomaly is vanishing for D odd and for D = 2. An application to the one-loop effective potential of the O(2) self-interacting scalar model is outlined.

1. Introduction

Within the one-loop or external field approximation, the importance of zeta-function regularization for functional determinants, as introduced in [1], is well known: it is a powerful tool to deal with the ambiguities (ultraviolet divergences) present in relativistic quantum field theory (see for example [2]-[4]). It allows one to give a meaning, in the sense of analytic continuation, to the determinant of a differential operator which, as the product of its eigenvalues, is formally divergent. For the sake of simplicity we shall here restrict ourselves to scalar fields. The one-loop Euclidean partition function, regularised by zeta-function techniques, reads [5]

$$\ln Z = -\frac{1}{2}\ln\det\frac{L_D}{\mu^2} = \frac{1}{2}\zeta'(0|L_D) + \frac{1}{2}\zeta(0|L_D)\ln\mu^2 ,$$

where $\zeta(s|L_D)$ is the zeta function related to $L_D$ (typically an elliptic differential operator of second order), $\zeta'(0|L_D)$ its derivative with respect to s, and $\mu^2$ a renormalization scale. The fact is used that the analytically continued zeta-function is generally regular at s = 0, and thus its derivative is well defined. When the manifold is smooth and compact, the spectrum is discrete and one has


$$\zeta(s|L_D) = \sum_i \lambda_i^{-2s} ,$$

$\lambda_i^2$ being the eigenvalues of $L_D$. As a result, one can make use of the relationship between the zeta-function and the heat-kernel trace via the Mellin transform and its inverse. For Re s > D/2, one can write

$$\zeta(s|L_D) = \mathrm{Tr}\, L_D^{-s} = \frac{1}{\Gamma(s)}\int_0^\infty t^{s-1} K(t|L_D)\, dt , \tag{1.1}$$

$$K(t|L_D) = \frac{1}{2\pi i}\int_{\mathrm{Re}\, s > D/2} t^{-s}\,\Gamma(s)\,\zeta(s|L_D)\, ds , \tag{1.2}$$

where $K(t|L_D) = \mathrm{Tr}\exp(-tL_D)$ is the trace of the heat operator. The previous relations are valid also in the presence of zero modes, with the replacement $K(t|L_D) \to K(t|L_D) - P_0$, $P_0$ being the projector onto the zero modes. A heat-kernel expansion argument leads to the meromorphic structure of $\zeta(s|L_D)$ and, as we have anticipated, it is found that the analytically continued zeta-function is regular at s = 0 and thus its derivative is well defined. Furthermore, in practice all the operators may be considered to be trace-class. In fact, if the manifold is compact this is true and, if the manifold is not compact, the volume divergences can be easily factorized. Thus

$$K_t(L_D) = \int dV_D\, K_t(L_D)(x) \tag{1.3}$$

and

$$\zeta(L_D, z) = \int dV_D\, \zeta(L_D|z)(x), \tag{1.4}$$

where $K_t(L_D)(x)$ and $\zeta(L_D|z)(x)$ are the heat-kernel and the local zeta-function, respectively. However, if an internal symmetry is present, the scalar field is vector valued, i.e. $\phi_i$, and the simplest model is the O(2) symmetry associated with self-interacting charged fields in $\mathbb{R}^4$. The Euclidean action is

$$S = \int dx^4 \left[ \phi_i\left(-\Delta + m^2\right)\phi_i + \frac{\lambda}{4!}(\phi^2)^2 \right] , \tag{1.5}$$

where $\phi^2 = \phi_k\phi_k$ is the O(2) invariant. The Euclidean small disturbances operator reads

$$A_{ik} = L_{ik} + \frac{\lambda}{6}\Phi^2\delta_{ik} + \frac{\lambda}{3}\Phi_i\Phi_k , \qquad L_{ik} = \left(-\Delta + m^2\right)\delta_{ik} , \tag{1.6}$$

in which $\Delta$ is the Laplace operator and $\Phi$ the background field, assumed to be constant. Thus, one is actually dealing with a matrix-valued elliptic differential operator. In this case, the partition function is [6]

$$\ln Z = -\ln\det\left\|\frac{A_{ik}}{\mu^2}\right\| = -\ln\det\left[\frac{\left(L + \frac{\lambda}{2}\Phi^2\right)}{\mu^2}\,\frac{\left(L + \frac{\lambda}{6}\Phi^2\right)}{\mu^2}\right] . \tag{1.7}$$
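The Mellin relation (1.1) between the zeta function and the heat-kernel trace can be checked numerically on a toy spectrum. The following is a minimal sketch and not part of the original paper: it assumes a finite list of positive eigenvalues (so no analytic continuation is needed), and the function names `heat_trace` and `zeta_via_mellin`, together with the quadrature range and step count, are illustrative choices only.

```python
import math

def heat_trace(t, eigenvalues):
    """K(t|L) = Tr exp(-t L) for a finite, positive spectrum (toy assumption)."""
    return sum(math.exp(-t * lam) for lam in eigenvalues)

def zeta_via_mellin(s, eigenvalues, u_min=-40.0, u_max=8.0, steps=9600):
    """Evaluate (1/Gamma(s)) * int_0^inf t^(s-1) K(t) dt.

    The substitution t = exp(u) turns the integral into
    int exp(s*u) * K(exp(u)) du, which is handled by the trapezoidal rule.
    """
    h = (u_max - u_min) / steps
    total = 0.0
    for i in range(steps + 1):
        u = u_min + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(s * u) * heat_trace(math.exp(u), eigenvalues)
    return total * h / math.gamma(s)

# Hypothetical three-level spectrum used only for illustration.
spectrum = [1.0, 2.0, 5.0]
direct = sum(lam ** -2.0 for lam in spectrum)   # Tr L^{-s} at s = 2
print(direct, zeta_via_mellin(2.0, spectrum))
```

For a genuine differential operator the sum over eigenvalues is infinite and the integral converges only for Re s > D/2, which is where the analytic continuation discussed in the text comes in.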


As a consequence, one has to deal with the product of two elliptic differential operators. In the case of two finite-dimensional matrices, one has

$$\ln\det(AB) = \ln\det A + \ln\det B . \tag{1.8}$$
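For finite matrices, relation (1.8) is exact, which is what makes its failure for zeta-regularized determinants noteworthy. The sketch below is an illustration only, not taken from the paper: it uses two arbitrary 2 × 2 matrices with positive determinants, and the helper names `det2`, `matmul2` and `finite_dim_anomaly` are hypothetical.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def finite_dim_anomaly(a, b):
    """ln det(AB) - ln det A - ln det B; assumes all determinants are positive."""
    return math.log(det2(matmul2(a, b))) - math.log(det2(a)) - math.log(det2(b))

# Arbitrary example matrices (det A = 5, det B = 8).
A = [[2.0, 1.0], [1.0, 3.0]]
B = [[4.0, 0.0], [1.0, 2.0]]
print(finite_dim_anomaly(A, B))  # vanishes up to floating-point rounding
```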

Usually the way one proceeds is by formally assuming the validity of the above relation for differential operators. This may be quite ambiguous, since one has to employ necessarily a regularization procedure. In fact, it turns out that the zeta-function regularized determinants do not satisfy the above relation and, in general, there appears the so-called multiplicativity (or just multiplicative) anomaly [7, 8]. In terms of $F(A, B) \equiv \det(AB)/(\det A \det B)$ [8], it is defined as:

$$a_D(A, B) = \ln F(A, B) = \ln\det(AB) - \ln\det(A) - \ln\det(B) , \tag{1.9}$$

in which the determinants of the two elliptic operators, A and B, are assumed to be defined (e.g., regularized) by means of the zeta-function [1]. It should be noted that the non vanishing of the multiplicative anomaly implies that the relation

$$\ln\det A = \mathrm{Tr}\ln A \tag{1.10}$$

does not hold, in general, for elliptic operators like A = BC. It turns out that this multiplicative anomaly can be expressed by means of the noncommutative residue associated with a classical pseudo-differential operator, known as the Wodzicki residue [9]. Its important role in physics has been recognized only recently. In fact, within the non-commutative geometrical approach to the standard model of the electroweak interactions [10, 11], the Wodzicki residue is the unique extension of the Dixmier trace (necessary to write down the Yang-Mills action functional) to the larger class of pseudo-differential operators (ΨDO) [12]. Other recent contributions along these lines are [13–15]. Furthermore, a proposal to make use of the Wodzicki formulae as a practical tool in order to determine the singularity structure of zeta-functions has appeared in [16], and the connection between the commutator anomalies of current algebras and the Wodzicki residue has been found in [17].

The purpose of the present paper is to obtain explicitly the multiplicative anomaly for the product of two Laplace-like operators, by direct computations and by making use of several results due to Wodzicki, and to investigate the relevance of these concepts in physical situations. As a result, the multiplicative anomaly will be found to be vanishing for D odd and also for D = 2, being actually present for D > 2, with D even.

The contents of the paper are the following. In Sect. 2 we present some elementary computations in order to show the highly non-trivial character of a brute force approach to the evaluation of the multiplicative anomaly associated with two differential operators (even with very simple ones). In Sect. 3 we briefly recall several results due to Wodzicki, concerning the noncommutative residue and a fundamental formula expressing the multiplicative anomaly in terms of the corresponding residue of a suitable pseudodifferential operator. In Sect. 4, the Wodzicki formula is used in the computation of the multiplicative anomaly in $\mathbb{R}^D$ and, as an example, the O(2) model in $\mathbb{R}^4$ is investigated. In Sect. 5, a standard diagrammatic analysis of the O(2) model is discussed and evidence for the presence of the multiplicative anomaly at this diagrammatic level is given. In Sect. 6 we treat the case of an arbitrary compact smooth manifold without boundary. Some final remarks are presented in the Conclusions. In the Appendix a proof of the multiplicative anomaly formula is outlined.



2. Direct Calculations

Motivated by the example discussed in the introduction, one might try to perform a direct computation of the multiplicative anomaly in the case of the two self-adjoint elliptic commuting operators $L_p = -\Delta + V_p$, p = 1, 2, in $M_D$, with $V_p$ constant. Actually, we could deal with the shifts of two elliptic ΨDOs. For the sake of simplicity, we may put $\mu^2 = 1$ and consider all the quantities to be dimensionless. At the end, one can easily restore $\mu^2$ by simple dimensional considerations. In order to compute the multiplicative anomaly, one needs to obtain the zeta-functions of the operators. Let us begin with $M_D$ smooth and compact without boundary (the boundary case can be treated along the same lines) and let us try to express $\zeta(s|L_1L_2)$ as a function of $\zeta(s|L_p)$. If we denote $L_0 = -\Delta$ and by $\lambda_i$ its non-negative, discrete eigenvalues, the spectral theorem yields

$$\zeta(s|L_1L_2) = \sum_i \left[(\lambda_i + V_1)(\lambda_i + V_2)\right]^{-s} . \tag{2.1}$$

Making use of the identity (λi + V1 )(λi + V2 ) = (λi + V+ )2 − V−2 ,

(2.2)

with V+ = (V1 + V2 )/2 and V− = (V1 − V2 )/2, and noting that V−2 2 and even, there are a finite number of simple poles other than at s = 0 in Eq. (2.11). As an example, in the important case D = 4, in a compact manifold without boundary, the zeta function has simple poles at s = 2, s = 1, s = 0, etc. Only the first



one is relevant, the others being harmless. Separating the term corresponding to l = 1, only this gives a non vanishing contribution when one takes the derivatives with respect to s at zero. Thus, a direct computation yields

$$a_4(L_1, L_2) = \frac{A_0 V_-^2}{2} = \frac{V_D}{4(4\pi)^2}\,(V_1 - V_2)^2 . \tag{2.14}$$

It follows that there potentially exists an alternative direct method for computing the multiplicative anomaly for the shifts of two elliptic ΨDOs, and its structure will be a function of $V_-^2$ and of the heat-kernel coefficients $A_r$, which, in principle, are computable (the first ones are known). We will come back to this point in Sect. 6, using the Wodzicki formula. However, we observe that, here, the multiplicative anomaly is a function of the series of zeta-functions related to operators of Laplace type. One soon becomes convinced that it is not easy to go further along this way for an arbitrary D-dimensional manifold. We conclude this section with explicit examples.

Example 1. $M_D = \mathbb{R}^D$. Let us start with a particularly simple example, i.e. $M_D = \mathbb{R}^D$. The two zeta-functions $\zeta(s|L_i)$ are easy to evaluate and read

$$\zeta(s|L_i) = \frac{V_D}{(4\pi)^{D/2}}\,\frac{\Gamma\!\left(s - \frac{D}{2}\right)}{\Gamma(s)}\,V_i^{\frac{D}{2}-s} , \qquad i = 1, 2 , \tag{2.15}$$

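where $V_D$ is the (infinite) volume of $\mathbb{R}^D$. We need to compute $\zeta(s|L_1L_2)$. For Re s > D/2, starting from the spectral definition, one gets

$$\zeta(s|L_1L_2) = \frac{2V_D}{(4\pi)^{D/2}\,\Gamma\!\left(\frac{D}{2}\right)}\int_0^\infty dk\, k^{D-1}\left[k^4 + (V_1 + V_2)k^2 + V_1V_2\right]^{-s} . \tag{2.16}$$

For Re s > (D − 1)/4, the above integral can be evaluated [19], to yield

$$\zeta(s|L_1L_2) = \frac{\sqrt{2\pi}\,V_D\,\Gamma\!\left(2s - \frac{D}{2}\right)}{2^{s}\,(4\pi)^{D/2}\,\Gamma(s)}\,(V_1V_2)^{\frac{D}{4}-s}\left(\alpha^2 - 1\right)^{\frac{1-2s}{4}} P^{\frac{1}{2}-s}_{s-\frac{D+1}{2}}(\alpha) , \tag{2.17}$$

$P^{\mu}_{\nu}(z)$ being the associate Legendre function of the first kind (see for example [19]), and

$$\alpha = \frac{V_1 + V_2}{2\sqrt{V_1V_2}} . \tag{2.18}$$

This provides the analytical continuation to the whole complex plane. For D = 2Q + 1, one easily gets $\zeta(0|L_1L_2) = 0$,

$$\zeta'(0|L_1L_2) = \frac{\sqrt{2\pi}\,V_D\,\Gamma\!\left(-Q - \frac{1}{2}\right)}{(4\pi)^{D/2}}\left(\alpha^2 - 1\right)^{\frac{1}{4}}(V_1V_2)^{\frac{D}{4}}\,P^{\frac{1}{2}}_{-\frac{D+1}{2}}(\alpha) = \frac{V_D\,\Gamma\!\left(-Q - \frac{1}{2}\right)}{(4\pi)^{D/2}}\left[2(V_1V_2)^{\frac{D}{2}}\left(1 + \cosh(D\gamma)\right)\right]^{1/2} , \tag{2.19}$$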



in which $\cosh\gamma = \alpha$. The first equation says that the conformal anomaly vanishes. On the other hand, one has for D odd,

$$\zeta'(0|L_1) + \zeta'(0|L_2) = \frac{V_D\,\Gamma\!\left(-Q - \frac{1}{2}\right)}{(4\pi)^{D/2}}\left(V_1^{\frac{D}{2}} + V_2^{\frac{D}{2}}\right) . \tag{2.20}$$

As a consequence, making use of elementary properties of the hyperbolic cosine, one gets $a(L_1, L_2) = 0$. Namely, for D odd the multiplicative anomaly is vanishing (see [8]). For D = 2Q, the situation is much more complex. First the conformal anomaly is non-zero, i.e.

$$\zeta(0|L_1L_2) = \frac{V_D(-1)^Q}{(4\pi)^Q\,Q!}\left[(V_1V_2)^{Q/2}\cosh(Q\gamma)\right] , \tag{2.21}$$

and, in general, the multiplicative anomaly is present. As a check, for D = 2, we get

$$\zeta(0|L_1L_2) = -\frac{V_2}{4\pi}\left[(V_1V_2)^{1/2}\cosh\gamma\right] = -\frac{V_2}{4\pi}(V_1 + V_2) = \frac{1}{4\pi}\,a_1(A) = \zeta(0|A) , \tag{2.22}$$

where $A = -\Delta I + V$ is a 2 × 2 matrix-valued differential operator, I the identity matrix, $V = \mathrm{diag}(V_1, V_2)$, and $a_1(A)$ is the first related Seeley-De Witt coefficient, given by the well known expression $\int dx^2\,(-\mathrm{tr}\, V)$. Unfortunately, it is not simple to write down, within this naive approach, a reasonably simple expression for it, because the associate Legendre function depends on s through the two indices μ and ν. However, it is easy to show that the anomaly is absent when $V_1 = V_2$, therefore it will depend only on the difference $V_1 - V_2$. Thus, one may consider the case $V_2 = 0$. As a result, Eq. (2.16) yields the simpler expression

$$\zeta(s|L_1L_2) = \frac{\sqrt{2\pi}\,V_D}{(4\pi)^{D/2}\,\Gamma\!\left(\frac{D}{2}\right)}\,\frac{\Gamma\!\left(\frac{D}{2} - s\right)\Gamma\!\left(2s - \frac{D}{2}\right)}{\Gamma(s)}\,V_1^{\frac{D}{2}-2s} . \tag{2.23}$$

In this case the multiplicative anomaly is given by

$$a(L_1, L_2) = \ln\det(L_1L_2) - \ln\det(L_1) , \tag{2.24}$$

since the regularized quantity $\ln\det(L_2) = 0$. It is easy to show that, when D is odd, again $a_D(L_1, L_2) = 0$. When D = 2Q, one obtains

$$a_{2Q}(L_1, L_2) = \frac{V_D(-1)^Q}{(4\pi)^Q\,2\,Q!}\,V_1^Q\left[\Psi(1) - \Psi(Q)\right] . \tag{2.25}$$

We conclude this first example by observing that the multiplicative anomaly is absent when Q = 1, D = 2, and that it is present for Q > 1, D > 2 even. The result obtained is partial and more powerful techniques are necessary in order to deal with the general case. Such techniques will be introduced in the next section.
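The "elementary properties of the hyperbolic cosine" invoked for odd D in Example 1 amount to the statement that the bracket in (2.19) equals the bracket in (2.20) once $\cosh\gamma = \alpha$ with α as in (2.18). The sketch below checks this numerically; it is an illustration rather than the authors' computation, the function names `product_side` and `sum_side` are hypothetical, and the common prefactor $V_D\Gamma(-Q-\frac12)/(4\pi)^{D/2}$ is dropped because it is the same on both sides.

```python
import math

def product_side(D, V1, V2):
    """[2 (V1 V2)^(D/2) (1 + cosh(D*gamma))]^(1/2), with cosh(gamma) = alpha of (2.18)."""
    alpha = (V1 + V2) / (2.0 * math.sqrt(V1 * V2))
    gamma = math.acosh(max(alpha, 1.0))  # guard against rounding slightly below 1
    return math.sqrt(2.0 * (V1 * V2) ** (D / 2.0) * (1.0 + math.cosh(D * gamma)))

def sum_side(D, V1, V2):
    """V1^(D/2) + V2^(D/2), the bracket appearing in (2.20)."""
    return V1 ** (D / 2.0) + V2 ** (D / 2.0)

for D in (1, 3, 5, 7):
    print(D, product_side(D, 10.0, 1.0), sum_side(D, 10.0, 1.0))
```

The identity itself holds for any D; what is special about odd D is that the Gamma-function prefactor is finite there, so the equality of the two brackets is already the whole statement $a(L_1, L_2) = 0$.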



Example 2. $M_D = S^1 \times \mathbb{R}^{D-1}$, D = 1, 2, 3, . . .. In this case the zeta functions corresponding to $L_i$, i = 1, 2, are given by

$$\zeta(s|L_i) = \frac{\pi^{(D-1)/2-2s}\,\Gamma(s + (1 - D)/2)}{2^{2s+1}\,L^{D-2s}\,\Gamma(s)}\sum_{n=-\infty}^{\infty}\left[n^2 + \left(\frac{L}{2\pi}\right)^2 V_i\right]^{(D-1)/2-s} \tag{2.26}$$

(i = 1, 2; here L is the length of $S^1$). In terms of the basic zeta function (see [20]):

$$\zeta(s; q) \equiv \sum_{n=-\infty}^{\infty}(n^2 + q)^{-s} = \sqrt{\pi}\,\frac{\Gamma(s - 1/2)}{\Gamma(s)}\,q^{1/2-s} + \frac{4\pi^s}{\Gamma(s)}\,q^{1/4-s/2}\sum_{n=1}^{\infty} n^{s-1/2}K_{s-1/2}(2\pi n\sqrt{q}), \tag{2.27}$$

where $K_\nu$ is the modified Bessel function of the second kind, we obtain

$$\zeta(s|L_i) = \frac{\pi^{-D/2}}{\Gamma(s)}\left[2^{-D}L\,\Gamma(s - D/2)\,V_i^{D/2-s} + 2^{2-s-D/2}L^{s+1-D/2}\,V_i^{D/4-s/2}\sum_{n=1}^{\infty} n^{s-D/2}K_{s-D/2}(nL\sqrt{V_i})\right] \equiv \zeta^{(1)}(s|L_i) + \zeta^{(2)}(s|L_i). \tag{2.28}$$

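For the determinant we get, for D odd,

$$\det L_i = \exp\left\{-\pi^{-D/2}\left[2^{-D}L\,\Gamma(-D/2)\,V_i^{D/2} + (2L)^{1-D/2}\,V_i^{D/4}\sum_{n=1}^{\infty} n^{-D/2}K_{D/2}(nL\sqrt{V_i})\right]\right\} , \tag{2.29}$$

and, for D even (D = 2Q),

$$\det L_i = \exp\left\{-\left[\frac{L}{Q!}\left(-\frac{1}{4\pi}\right)^{Q}\left(\sum_{j=1}^{Q}\frac{1}{j} - \ln V_i\right)V_i^{Q} + 4L\left(\frac{\sqrt{V_i}}{2\pi L}\right)^{Q}\sum_{n=1}^{\infty} n^{-Q}K_Q(nL\sqrt{V_i})\right]\right\} . \tag{2.30}$$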

As for the product L1 L2 , using the same strategy as before, after some calculations we obtain (here we use the short-hand notation L± ≡ L0 + V± , cf. equations above):   [Q/2] X V−2p (−V+ )Q−2p 2L det(L1 L2 ) = (det L+ )2 exp −  (2p)!(Q − 2p)!(4π)Q p=1   p−1 X 1 1 1 + − ψ(2p) − ln V+  × 1 − C + 2 (Q − 2p)! 2 j j=1  ∞ ∞  X X V−2p (1) V−2p (2) − ζ (2p|L+ ) − ζ (2p|L+ ) , (2.31)  p p p=[Q/2]+1

p=1



where [x] means ‘integer part of x’ and C is the Euler–Mascheroni constant. We can check from these formulas that the anomaly (1.9) is zero in the case of odd dimension D. Actually, this is most easily seen, as before, by using the expression corresponding to (2.16) for the present case. It also vanishes for D = 2. The formula above is useful in order to obtain numerical values for the case D even, corresponding to different values of D and L (the series converge very quickly). The results are given in Table 1. We have looked at the variation of the anomaly in terms of the different parameters: L, D, $V_1$ and $V_2$ while keeping the rest of them fixed. Within numerical errors, we have checked the complete coincidence with formula (4.5) in Sect. 4.

Table 1. Values of the multiplicative anomaly $a(L_1, L_2)$ in terms of the parameters: L, D, $V_1$ and $V_2$. Observe its evolution when some of the parameters are kept fixed while the others are varied. In all cases, a perfect coincidence with Wodzicki’s expression for the anomaly is obtained (within numerical errors)

L      D     V1    V2    a(L1, L2)
1      2     2     2     0.
0.1    2     8     3     –1.8686 × 10^−14
1      2     8     3     –2.0817 × 10^−17
5      2     8     3     –1.4572 × 10^−16
10     2     8     3     –1.4572 × 10^−16
1      2     10    1     2.87 × 10^−12
1      4     10    1     0.064117
1      6     10    1     –0.028063
1      8     10    1     0.0151245
1      10    10    1     –0.003636
1      12    10    1     0.0006124
1      14    10    1     –0.00008166
1      16    10    1     9.09 × 10^−6
1      4     2     1     0.0007916
1      4     5     2     0.007124
1      4     1     6     0.019789
1      6     2     1     –0.0000945
1      6     5     2     –0.001984
1      6     1     6     –0.005512
0.1    4     7     2     0.001979
0.5    4     7     2     0.009895
1      4     7     2     0.019789
2      4     7     2     0.0395786
5      4     7     2     0.098947
10     4     7     2     0.197893
20     4     7     2     0.395786
0.1    6     7     2     –0.00070865
0.5    6     7     2     –0.00354326
1      6     7     2     –0.0070865
2      6     7     2     –0.014173
5      6     7     2     –0.0354326
10     6     7     2     –0.07008652
20     6     7     2     –0.141730

Example 3. MD = RD with Dirichlet b.c. on p pairs of perpendicular hyperplanes. The zeta function is, in this case,



$$\zeta(s|L_i) = \frac{\pi^{(D-p)/2-2s}\,\Gamma(s + (p - D)/2)}{2^{D-p+1}\prod_{j=1}^{p} a_j\;\Gamma(s)}\sum_{n_1,\ldots,n_p=1}^{\infty}\left[\sum_{j=1}^{p}\left(\frac{n_j}{a_j}\right)^2 + V_i\right]^{(D-p)/2-s} , \tag{2.32}$$

where the aj , j = 1, 2, . . . , p, are the pairwise separations between the perpendicular hyperplanes. For the determinant, we get, for D − p = 2h + 1 odd, det Li =   h−1/2    2 p ∞   h+1/2 X X π nj  exp − 2h+2 Qp 0(−h − 1/2) + Vi  , (2.33)   aj  2  j=1 aj n1 ,...,np =1 j=1 and, for D − p = 2h even,     h−1 ∞  h X X (−π) 1   Q 2 + h det Li = exp  p 2h+1 h!  j 2 j=1 aj j=1 n ,...,n +

∞ X n1 ,...,np =1



2 p X nj j=1

aj

h



+ Vi  ln 

2 p X nj j=1



p =1

1





aj

2 p X nj j=1

aj

h + Vi 

    + Vi   .  

(2.34)

For the calculation of the anomaly one follows the same steps as in the two preceding examples, and we are not going to repeat them here. In order to obtain the final numbers one must make use of the inversion formula for the Epstein zeta functions of these expressions [20, 2].

3. The Wodzicki Residue and the Multiplicative Anomaly

For the reader’s convenience, we will review in this section the necessary information concerning the Wodzicki residue [9] (see also [7] and the references to Wodzicki quoted therein) that will be used in the rest of the paper. Let us consider a D-dimensional smooth compact manifold without boundary $M_D$ and a (classical) ΨDO, A, of order m, acting on sections of vector bundles on $M_D$. To any ΨDO, A, there corresponds a complete symbol a(x, k), such that, modulo infinitely smoothing operators, one has

$$(Af)(x) \sim \int_{\mathbb{R}^D}\frac{dk}{(2\pi)^D}\int_{\mathbb{R}^D} dy\, e^{i(x-y)k}\,a(x, k)\,f(y) . \tag{3.1}$$

The complete symbol admits an asymptotic expansion for |k| → ∞, given by

$$a(x, k) \sim \sum_j a_{m-j}(x, k) , \tag{3.2}$$

and fulfills the homogeneity property $a_{m-j}(x, tk) = t^{m-j}a_{m-j}(x, k)$, for t > 0. The number m is called the order of A.


If P is an elliptic operator of order p > m, according to Wodzicki one has the following property of the non-commutative residue, which we may take as its characterization.

Proposition. The trace of the operator $AP^{-s}$ exists and admits a meromorphic continuation to the whole complex plane, with a simple pole at s = 0. Its Cauchy residue at s = 0 is proportional to the so-called non-commutative (or Wodzicki) residue of A:

$$\mathrm{res}(A) = p\,\mathrm{Res}_{s=0}\,\mathrm{Tr}(AP^{-s}) . \tag{3.3}$$

The r.h.s. of the above equation does not depend on P and is taken as the definition of the Wodzicki residue of the ΨDO, A.

Properties.

(i) Strictly related to the latter result is the one which follows, involving the short-t asymptotic expansion

$$\mathrm{Tr}(Ae^{-tP}) \simeq \sum_j \alpha_j\,t^{\frac{D-j}{p}-1} - \frac{\mathrm{res}(A)}{p}\ln t + O(t\ln t) . \tag{3.4}$$

Thus, the Wodzicki residue of A, a ΨDO, can be read off from the above asymptotic expansion by selecting the coefficient proportional to ln t.

(ii) Furthermore, it is possible to show that res(A) is linear with respect to A and possesses the important property of being the unique trace on the algebra of the ΨDOs, namely, one has res(AB) = res(BA). This last property has deep implications when including gravity within the non-commutative geometrical approach to the Connes-Lott model of the electro-weak interaction theory [12, 10, 11].

(iii) Wodzicki has also obtained a local form of the non-commutative residue, which has the fundamental consequence of characterizing it through a scalar density. This density can be integrated to yield the Wodzicki residue, namely

$$\mathrm{res}(A) = \int_{M_D}\frac{dx}{(2\pi)^D}\int_{|k|=1} a_{-D}(x, k)\,dk . \tag{3.5}$$

Here the component of order −D of the complete symbol appears. From the above result it immediately follows that res(A) = 0 when A is an elliptic differential operator.

(iv) We conclude this summary with the multiplicative anomaly formula, again due to Wodzicki. A more general formula has been derived in [8]. Let us consider two invertible elliptic self-adjoint operators, A and B, on $M_D$. If we assume that they commute, then the following formula applies:

$$a(A, B) = \frac{\mathrm{res}\left[\left(\ln(A^bB^{-a})\right)^2\right]}{2ab(a + b)} = a(B, A) , \tag{3.6}$$

where a > 0 and b > 0 are the orders of A and B, respectively. A sketch of the proof is presented in the Appendix. It should be noted that a(A, B) depends on a ΨDO of zero order. As a consequence, it is independent of the renormalization scale μ appearing in the path integral.



(v) Furthermore, it can be iterated consistently. For example

$$\zeta'(A, B) = \zeta'(A) + \zeta'(B) + a(A, B),$$
$$\zeta'(A, B, C) = \zeta'(AB) + \zeta'(C) + a(AB, C) = \zeta'(A) + \zeta'(B) + \zeta'(C) + a(A, B) + a(AB, C) . \tag{3.7}$$

As a consequence,

$$a(A, B, C) = a(AB, C) + a(A, B) . \tag{3.8}$$

Since a(A, B, C) = a(C, B, A), we easily obtain the cocycle condition (see [8]):

$$a(AB, C) + a(A, B) = a(CB, A) + a(C, B) . \tag{3.9}$$

4. The O(2) Bosonic Model

In this section we come back to the problem of the exact computation of the multiplicative anomaly in the model considered in Sect. 2. Strictly speaking, the result of the last section is valid for a compact manifold, but in the case of $\mathbb{R}^D$ the divergence is trivial, being contained in the volume factor. The Wodzicki formula gives

$$a(L_1, L_2) = \frac{1}{8}\,\mathrm{res}\left[\left(\ln(L_1L_2^{-1})\right)^2\right] . \tag{4.1}$$

We have to construct the complete symbol of the ΨDO of zero order $[\ln(L_1L_2^{-1})]^2$. It is given by

$$a(x, k) = \left[\ln(k^2 + V_1) - \ln(k^2 + V_2)\right]^2 . \tag{4.2}$$

For large $k^2$, we have the following expansion, from which one can easily read off the homogeneous components:

$$a(x, k) = \sum_{j=2}^{\infty} c_j\,k^{-2j} = \sum_{j=2}^{\infty} a_{2j}(x, k) , \tag{4.3}$$

where

$$c_j = \sum_{n=1}^{j-1}\frac{(-1)^j}{n(j - n)}\left(V_1^n - V_2^n\right)\left(V_1^{j-n} - V_2^{j-n}\right) . \tag{4.4}$$
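The large-$k^2$ expansion (4.3) with coefficients (4.4) can be checked numerically against the closed-form symbol (4.2). The sketch below is an illustration and not part of the paper; it assumes that the sum in (4.4) runs over n = 1, ..., j − 1 (the n = j term would be of the form 0/0), and the names `c`, `symbol_squared` and `truncated_series` are hypothetical.

```python
import math

def c(j, V1, V2):
    """Coefficient c_j of k^(-2j) in (4.3); the sum is assumed to run over n = 1..j-1."""
    return (-1) ** j * sum(
        (V1 ** n - V2 ** n) * (V1 ** (j - n) - V2 ** (j - n)) / (n * (j - n))
        for n in range(1, j)
    )

def symbol_squared(k2, V1, V2):
    """Closed-form symbol (4.2), written in terms of k2 = k^2."""
    return (math.log(k2 + V1) - math.log(k2 + V2)) ** 2

def truncated_series(k2, V1, V2, jmax):
    """Partial sum of (4.3) up to j = jmax."""
    return sum(c(j, V1, V2) * k2 ** (-j) for j in range(2, jmax + 1))

# Illustrative values only; the series needs k^2 larger than both V1 and V2.
k2, V1, V2 = 100.0, 2.0, 1.0
print(symbol_squared(k2, V1, V2), truncated_series(k2, V1, V2, 12))
```

The leading coefficient is $c_2 = (V_1 - V_2)^2$, which is the term picked out by the local formula (3.5) when D = 4.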

As a consequence, due to the local formula one immediately gets the following result: for D odd, the multiplicative anomaly vanishes, in perfect agreement with the direct calculation of Sect. 2. This result is consistent with a general theorem contained in [8]. For D even, if D = 2 one has no multiplicative anomaly, while for D = 2Q, Q > 1, one gets

$$a(L_1, L_2) = \frac{V_D(-1)^Q}{4(4\pi)^Q\,\Gamma(Q)}\sum_{j=1}^{Q-1}\frac{1}{j(Q - j)}\left(V_1^j - V_2^j\right)\left(V_1^{Q-j} - V_2^{Q-j}\right) . \tag{4.5}$$


It is easy to show that for $V_2 = 0$ this expression reduces to the one obtained directly in Sect. 2. In the O(2) model, for D = 4, we have

$$a(L_1, L_2) = \frac{V_4}{4(4\pi)^2}\,(V_1 - V_2)^2 = \frac{V_4}{36(4\pi)^2}\,\lambda^2\Phi^4 , \tag{4.6}$$

which, for dimensional reasons, is independent of the renormalization parameter μ. Then, the one-loop effective potential reads

$$V_{eff} = -\frac{\ln Z}{V_4} = \frac{M_1^4}{64\pi^2}\left(-\frac{3}{2} + \ln\frac{M_1^2}{\mu^2}\right) + \frac{M_2^4}{64\pi^2}\left(-\frac{3}{2} + \ln\frac{M_2^2}{\mu^2}\right) + \frac{1}{72(4\pi)^2}\,\lambda^2\Phi^4 , \tag{4.7}$$

with

$$M_1^2 = m^2 + \frac{\lambda}{2}\Phi^2 , \qquad M_2^2 = m^2 + \frac{\lambda}{6}\Phi^2 . \tag{4.8}$$

Thus, the additional multiplicative anomaly contribution seems to modify the usual Coleman-Weinberg potential. A more careful analysis is required in order to investigate the consequences of this remarkable fact.

5. Feynman Diagrams

The necessity of the presence of the multiplicative anomaly in quantum field theory can also be understood perturbatively, using the background field method. The effective action of the O(2) model in a background field $\Phi$ will be denoted by $\Gamma(\Phi, \phi)$, where φ is the mean field. Then, if $\Gamma_0(\phi)$ denotes the effective action with vanishing $\Phi$, it turns out that

$$\Gamma(\Phi, \phi) = \Gamma_0(\Phi + \phi). \tag{5.1}$$

Therefore, the nth order derivatives of Γ with respect to φ at φ = 0 determine the vertex functions of the O(2) model in the background external field. The one-loop approximation to Γ is again given by $\log\det(L_1L_2)$, and the determinant of either of the operators, $L_1$ and $L_2$, corresponds to the sum of all vacuum-vacuum 1PI diagrams where only particles of masses squared $M_1^2 = m^2 + \lambda\Phi^2/2$ or $M_2^2 = m^2 + \lambda\Phi^2/6$ flow along the internal lines. In Fig. 1 we have depicted this, by using a solid line for type-1 particles and a dashed line for type-2 particles. Thus, for example, the inverse propagator at zero momentum for a type-1 particle, as computed from the above effective potential, is obtained from the second derivative with respect to $\phi_1$. The only 1PI graphs which contribute are shown in Fig. 2. This is clearly not the case, as the full theory exhibits a trilinear coupling $\phi_2(\phi_1)^2$ which gives the additional Feynman graph depicted in Fig. 3. Without investigating this question any further, we can safely affirm already that a perturbative formula for the Wodzicki anomaly given in terms of Feynman diagrams should exist. It surely owes its simple form to very subtle cancellations among an infinite class of Feynman diagrams.

We conclude this section with some remarks. In the present model, the existence of a multiplicative anomaly of the type considered could be a trivial problem; in fact it has the same form as the classical potential energy. This suggests that it can be absorbed in a finite renormalization of the coupling constant of the theory. Secondly, this anomaly gives no contribution to the one-loop beta function of the model, since it is independent of the arbitrary renormalization scale, but it certainly contributes to the two-loop beta function. And, finally, we have seen that the anomaly can be interpreted as an external field effect which, in the present model, could be relevant only when the theory is coupled to an external source. Therefore, it should be very interesting to study its relevance in at least two other situations, namely the cases of a spontaneously broken symmetry and of QED in external background fields.

Fig. 1. The Feynman graph giving the one-loop effective potential without taking into account the anomaly (the sum of the LogDet($L_1$) and LogDet($L_2$) loops)

Fig. 2. Contributions coming from 1PI graphs

Fig. 3. Additional Feynman graph of the full theory

6. The Case of a General, Smooth and Compact Manifold $M_D$ Without Boundary

Since the multiplicative anomaly is a local functional, it is possible to express it in terms of the Seeley-De Witt spectral coefficients. Let us consider again the operator $L_p = L_0 + V_p$, with $L_0 = -\Delta$ acting on scalars, in a smooth and compact manifold $M_D$ without boundary. We have to compute the Wodzicki residue of the ΨDO

$$\left[\ln(L_1L_2^{-1})\right]^2 . \tag{6.1}$$

With this aim, if $V_1 < V_2$, we can consider the ΨDO

$$\left[\ln(L_1L_2^{-1})\right]^2 e^{-tL_1} , \tag{6.2}$$

and compute the ln t term in the short-t asymptotic expansion of its trace. We are dealing here with self-adjoint operators and thus, by using the spectral theorem, we get

$$\mathrm{Tr}\left[\left(\ln(L_1L_2^{-1})\right)^2 e^{-tL_1}\right] = \int_{V_1}^{\infty} d\lambda\,\rho(\lambda|L_1)\left[\ln\lambda - \ln(\lambda + V_2 - V_1)\right]^2 e^{-t\lambda} , \tag{6.3}$$

where $\rho(\lambda|L_1)$ is the spectral density of the self-adjoint operator $L_1$. Now, it is well known that the short-t expansion of the above trace receives contributions from the asymptotics, for large λ, of the integrand in the spectral integral. The asymptotics of the spectral function associated with $L_1$ are known to be given by (see, for example [21, 22], and the references therein)

r 0, there is a sequence $(\alpha_n, \beta_n) \in C_\rho$, where $\Phi(\alpha_n, \rho, 0) \neq 0$, such that the corresponding solution $(A(r, \alpha_n), w(r, \alpha_n), w'(r, \alpha_n))$ of (2.1), (2.2) is defined for all r > ρ and satisfies $A(r, \alpha_n) > 0$ and $w^2(r, \alpha_n) < 1$. Moreover, $\lim_{r\to\infty}(A(r, \alpha_n), w^2(r, \alpha_n), w'(r, \alpha_n)) = (1, 1, 0)$, $\lim_{r\to\infty} r(1 - A(r, \alpha_n)) < \infty$ and $w(r, \alpha_n)$ has precisely n zeros.

Our final result classifies solutions which are well-behaved in the far-field. It does not describe the behavior of either the gravitational field or the YM field inside a black hole – this is the subject dealt with in this paper.

J. A. Smoller, A. G. Wasserman

Fig. 2. $C_\rho$ (ρ = 1) [plot in the (W, W′) plane; labels: W = ±1, R, Q, S, RNL]

Fig. 3. $C_\rho$ (ρ > 1) [plot in the (W, W′) plane; labels: W = ±1, R, Q, S, RNL]

Theorem 2.6 ([14]). Let (A(r), w(r)) be a solution of (2.1), (2.2) which is defined and smooth for $r > \bar r > 0$ and satisfies A(r) > 0 if $r > \bar r$. Then every such solution must be in one of the following classes:

(i) A(r) > 1 for all r > 0;
(ii) Schwarzschild Solution: $A(r) = 1 - \frac{2m}{r}$, $w^2 \equiv 1$, (m = const.);
(iii) Reissner–Nordström Solution: $A(r) = 1 - \frac{c}{r} + \frac{1}{r^2}$, $w(r) \equiv 0$, (c = const.);
(iv) Bartnik–McKinnon Particle-like Solution;
(v) Black-Hole Solution;
(vi) RNL Solution.

In each case, $\lim_{r\to\infty} w^2(r) = 1$ or 0 (0 only for RN solutions), $\lim_{r\to\infty} rw'(r) = 0$ and $\lim_{r\to\infty} A(r) = 1$. The solution also has finite (ADM) mass; i.e. $\lim_{r\to\infty} r(1 - A(r)) < \infty$.

Extendability of Solutions of Einstein–Yang/Mills Equations


3. The zeros of A

In this section we shall prove that the zeros of A(r) are discrete, except possibly for an accumulation point at r = 0. We shall also show that A can have at most two zeros in the region r ≥ 1. In proving these, we shall make use of Figs. 1–3. In the remainder of this paper we shall always assume that the following hypothesis (H) holds for a given solution (A(r), w(r)) of (1.3) and (1.4):

Hypothesis. There is an $r_1 > 1$ such that the solution (A(r), w(r)) is defined for all $r > r_1$, and $A(r_2) > 0$ for some $r_2 \ge r_1$.

Theorem 3.1. If the hypothesis (H) holds, then A has at most a finite number of zeros in any interval of the form ε ≤ r < ∞, for any ε > 0. Furthermore, all the zeros of A, with at most two exceptions, lie in the set r < 1.

Note that from [12], if $A(\bar r) = 0 = A'(\bar r)$ for some $\bar r > 0$, then the solution is the extreme Reissner–Nordström (ERN) solution

$$A(r) = \left(\frac{r - 1}{r}\right)^2 , \qquad w(r) \equiv 0.$$

For this solution, Theorem 3.1 clearly is valid. Thus, in this section we shall assume that if $A(\bar r) = 0$, then $A'(\bar r) \neq 0$. In this case, from [10] (see footnote 1), $\lim_{r\searrow\bar r} A(r) = 0$, $\lim_{r\searrow\bar r}(w(r), w'(r)) = (\bar w, \bar w')$ exists, and $(\bar w, \bar w') \in C_{\bar r}$ (cf. Figs. 1–3).

Proposition 3.2. A cannot have more than two zeros in the region r ≥ 1.

Fig. 4. [two sketches of A(r) against r, with the points ρ and η marked on the r-axis]

Proof. Suppose that A has 3 zeros in the region r ≥ 1. Then there must exist ρ, η, 1 ≤ ρ < η, with A(ρ) = 0 = A(η) and $A'(\eta) < 0 < A'(\rho)$; cf. Fig. 4. Since $(w(\eta), w'(\eta)) \in C_\eta$ and η > 1, we see from Fig. 3 that $w^2(\eta) > 1$. Then from Theorem 2.1, (ii), A cannot have any zeros if r < η. This contradiction establishes the result.

We next prove

Proposition 3.3. If $0 < r_0 < 1$, then $r_0$ cannot be a limit point of the zeros of A.

Notice that Theorem 3.1 follows at once from Propositions 3.2 and 3.3.

1 In [10], the result was demonstrated for the case where A(r) > 0 for r near $\bar r$, $r > \bar r$, but the same proof holds if A(r) < 0 for r near $\bar r$, $r > \bar r$.


Proof. We shall show that there is a neighborhood of $r_0$ in which A ≠ 0. Choose ε > 0 such that $r_0 + \varepsilon < 1$. We will show, using Theorem 2.3, that there exists an η > 0 such that if $z_1$ and $z_2$ are two consecutive zeros of A,

$$r_0 < z_1 < z_2 < r_0 + \varepsilon < 1, \qquad A(z_1) = 0 = A(z_2), \qquad A'(z_1) > 0 > A'(z_2), \tag{3.1}$$

then

$$z_2 - z_1 > \eta. \tag{3.2}$$

This implies that there can be at most a finite number of zeros of A in the interval $(r_0, r_0 + \varepsilon)$. Now $A(z_2) = 0$ implies that $(w(z_2), w'(z_2))$ lies on $C_{z_2}$, and $A'(z_2) < 0$ implies that $(w(z_2), w'(z_2))$ lies on the middle curve in Fig. 1 (where ρ is replaced by $z_2$). Without loss of generality, assume $w'(z_2) > 0$, $w(z_2) > 0$. Now define δ by

$$(r_0 + \varepsilon) - \frac{1}{(r_0 + \varepsilon)} = -2\delta < 0.$$

Then there exists a constant c > 0 such that

$$(r_0 + \varepsilon) - \frac{u^2}{(r_0 + \varepsilon)} \le -\delta < 0, \quad \text{if } |w| < c.$$

Hence

$$r - \frac{u^2}{r} \le -\delta, \quad \text{if } r_0 \le r \le r_0 + \varepsilon,$$

and thus

$$\Phi(r) = r - \frac{u^2}{r} - rA < -\delta , \quad z_1 \le r \le z_2 , \tag{3.3}$$

since A(r) > 0 if $z_1 < r < z_2$. Let $w_1 = -c$, $w_2 = 0$; then there exist $r_1$, $r_2$, $z_1 < r_1 < r_2 < z_2$, such that $w(r_1) = -c$, $w(r_2) = 0$. (This is because A cannot change sign in the interval −c ≤ w ≤ 0; cf. Fig. 1, and Fig. 5.)

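Fig. 5. (Sketch in the (w, w′)-plane: the curves C_{z1} and C_{z2}, the lines w = −1, w = −c = w1 and w = 0 = w2, the points r1 and r2 on the orbit, the point (w(z2), w′(z2)), and the point r = a where f(a) = 0.)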

Now in view of (3.3), if we can show that there is an M > 0, M independent of z1, z2, for which

|Aw′²| ≤ M, r1 ≤ r ≤ r2 (equivalently, w1 ≤ w(r) ≤ w2), (3.4)

then on the interval w1 ≤ w ≤ w2 we may apply Theorem 2.3 to conclude that (3.2) holds. Thus the proof of Proposition 3.3 will be complete once we prove (3.4); this is the content of the following lemma.

Lemma 3.4. If −c ≤ w(r) ≤ 0, then f(r) ≡ (Aw′²)(r)
z1, because by Theorem 2.1(ii), there would be no zero of A smaller than r1. Therefore, either the point (w(z1), w′(z1)) lies in −1 ≤ w ≤ −c, w′ > 0, in which case we take a = z1, or else the orbit crosses the segment −1 ≤ w ≤ −c, w′ = 0 at some r = a, and again f(a) = 0. We now prove

if f(r) = 2/r0², then f′(r) < 0, (3.7)

for r in the interval (a, r2). Since f(a) = 0, then if (3.7) holds, there can be no first value of r for which f(r) = 2/r0², and hence (3.5) holds. Thus it suffices to prove (3.7). To do this, we first note that

Φ(r) ≥ −1/r0, if a < r < r2. (3.8)

Indeed

Φ(r) = r(1 − A) − u²/r ≥ −u²/r ≥ −1/r ≥ −1/r0.

Now from (2.14), we have, when f = 2/r0²,

r²f′(r) = −(2rf + Φ)w′² − 2uww′
= −(4r/r0² + Φ)w′² − 2uww′
≤ (−4r/r0² + 1/r0)w′² + 2w′
= [2 + (1 − 4r/r0)(w′/r0)]w′
≤ (2 − 3w′/r0)w′, (3.9)

where we have used (3.8). Now when f = 2/r0², w′² = 2/(r0²A) > 2/r0², or w′ > √2/r0. Using this in (3.9) gives

r²f′ ≤ (2 − 3√2/r0²)w′ < (2 − 3√2)w′ < 0,
and this gives (3.7). Thus the proof of Lemma 3.4 is complete, and as we have seen, this proves Proposition 3.3.

4. The Case A > 0 Near r0

In this section we shall first prove the equivalence of Theorems 1.1 and 1.2. Then we shall prove Theorem 1.2 in the case where A(r) > 0 for r near r0, r > r0. In view of Theorem 3.1, we know that A can have at most a finite number of zeros on the interval (r0, ∞). Hence A(r) is of one sign for r near r0, r > r0. In this section we shall prove that if A(r) > 0 for r near r0, the solution can be extended. The far more difficult case, where A(r) < 0 for r near r0, will be considered in Sect. 5.

Proposition 4.1. Theorems 1.1 and 1.2 are equivalent.

Proof. Assume that Theorem 1.1 holds, and that A(r̄) > 0, where r̄ is as given in the statement of Theorem 1.2. Consider (w(r̄), w′(r̄)). If w²(r̄) ≥ 1 and (ww′)(r̄) > 0, then from Theorem 2.1(i), the solution cannot exist for all r > r̄, and this contradicts our assumptions. If w²(r̄) ≥ 1 and (ww′)(r̄) < 0, then from Theorem 2.1(ii), the solution is an RNL solution and is thus defined for all r, 0 < r < r̄. Thus, we may assume that w²(r̄) < 1. If w²(r̃) > 1 for some r̃ > r̄, then (ww′)(r̃) > 0, so again Theorem 2.1(i) implies that the solution is not defined in the far-field. Hence we may assume that the orbit stays in the region w²(r) < 1 for all r > r̄. Moreover A(r) > 0 for all r > r̄ because A(r) = 0 for some r > r̄ > 1 cannot occur. (In w² < 1, "crash" can occur only if r < 1; see [7].) Thus from [14, Proposition 6.2], lim_{r→∞} µ(r) < ∞, hence Theorem 1.2 holds. Conversely, if Theorem 1.2 holds, then (1.6) implies that lim_{r→∞} r(1 − A(r)) < ∞, so A(r) → 1 as r → ∞; in particular A(r) > 0 for r large. This implies that Theorem 1.1 holds.

This last result justifies our assumption, in the remainder of this paper, that the following hypothesis (H) holds for a given solution (A(r), w(r)) of (1.3) and (1.4):

Hypothesis. There is an r1 > 1 such that the solution (A(r), w(r)) is defined for all r > r1, and A(r2) > 0 for some r2 ≥ r1.

We now let r0 be any given positive number, and assume that the solution (A(r), w(r)) of (1.3), (1.4) is defined for all r > r0. We then have the following theorem:

Theorem 4.2. Assume that hypothesis (H) holds, and that A(r) > 0 for r near r0, r > r0. Then the solution can be extended to an interval of the form r0 − ε < r ≤ r0.

Proof. It follows from Theorem 2.1 that either w²(r) < 1 for all r near r0, or else (A, w) is an RNL solution and is thus defined for 0 < r ≤ r0. In the case w²(r) < 1 for all r near r0, if A(r) is bounded away from zero for r near r0, then the solution must continue into a region of the form (r0 − ε, r0], for some ε > 0. (The proof of this fact is the same if A > 0 or A < 0 near r0. In (5.6) below we give the proof for A < 0, so we omit the proof here.) If, on the other hand, A is not bounded away from zero near r0, then A(rn) → 0 for some sequence rn ↘ r0. In [10], we have shown that this implies lim_{r↘r0} A(r) = 0 and lim_{r↘r0} (w(r), w′(r)) ∈ C_{r0}, so the solution (A, w) is analytic at r0 and thus again continues past r0; i.e., to an interval of the form r0 − ε ≤ r ≤ r0. This completes the proof of Theorem 4.2.

In the next section we shall consider the case where A(r) < 0 for r near r0, r > r0.



5. The Case A < 0 Near r0

In this section we assume that the solution (A, w) of (1.3), (1.4) is defined for all r > r0, and that A(r) < 0 for r near r0, r > r0. We shall prove that the solution can be continued past r0. This is the content of the following theorem.

Theorem 5.1. Assume that hypothesis (H) holds and that A(r) < 0 for r near r0, r > r0. Then the solution can be continued to an interval of the form r0 − ε < r ≤ r0.

Notice that Theorems 4.2 and 5.1 imply Theorem 1.2.

Proof. There are two cases to consider:

Case 1. There are positive numbers δ and Δ such that

A(r) < −δ, if 0 < r0 < r < r0 + Δ; (5.1)

Case 2. There is a Δ > 0 such that

A(r) < 0, if 0 < r0 < r < r0 + Δ; (5.2)

and for some sequence rn ↘ r0,

A(rn) → 0. (5.3)

We begin the proof of Theorem 5.1 by first considering Case 1. We shall need a few preliminary results, the first of which is

Lemma 5.2. If (5.1) holds, and w(r) is bounded near r0 (r > r0), then w′(r) is bounded near r0.

Proof. From (2.7), we can write

w″ + (Φ/(r²A))w′ = −uw/(r²A). (5.4)

Since

Φ/(r²A) = 1/(rA) − 1/r − u²/(r³A),

we see that both Φ/(r²A) and uw/(r²A) are bounded near r0. Thus the coefficients in (5.4) as well as the rhs are bounded, so w′ too is bounded near r0.

Lemma 5.3. If w′ is bounded near r0, then A is bounded near r0.

Proof. From (2.1), we have

rA′ + (1 + 2w′²)A = 1 − u²/r². (5.5)

The hypothesis implies that w is bounded near r0, so the coefficients of (5.5) are bounded near r0. Thus A too is bounded near r0.
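Several later steps quote the same equation in the form rA′ = Φ/r − 2Aw′² (cited there as (2.6)). As a sketch, and assuming Φ(r) = r − rA − u²/r as written out in the proof of Lemma 5.21, this form follows from (5.5) by a one-line rearrangement:

```latex
% Assumption: \Phi(r) = r - rA - u^2/r (the expression used in the proof of Lemma 5.21).
\begin{align*}
rA' &= 1 - \frac{u^2}{r^2} - (1 + 2w'^2)A \\
    &= \frac{1}{r}\Bigl(r - rA - \frac{u^2}{r}\Bigr) - 2Aw'^2 \\
    &= \frac{\Phi}{r} - 2Aw'^2 .
\end{align*}
```

Read this way, boundedness of Φ together with Aw′² → −∞ forces rA′ → +∞, which is the mechanism used repeatedly in Case 2.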



These last two results enable us to dispose of the case where (5.1) holds, and also

w(r) is bounded near r0 (r > r0). (5.6)

Since (5.6) holds, A, w, and w′ are bounded near r0, and since by (5.1) A(r) < −δ, we see from (1.3) and (1.4) that A′, w′ and w″ are bounded. Thus lim_{r↘r0} (A(r), w(r), w′(r), r) = (Ā, w̄, w̄′, r0) ≡ P exists, where Ā < 0. Hence the orbit through P is defined on an interval r0 − ε < r < r0 + ε, for some ε > 0.

Remark. We did not use the fact that A < 0 to obtain this conclusion; all we needed was A bounded away from 0 and w bounded near r0.

We shall now show that in Case 1, w must be bounded near r0. To do this, we will assume that w is unbounded near r0, r > r0, and we shall arrive at a contradiction. Thus, assume that for some ε > 0,

w(r) is unbounded on (r0, r0 + ε). (5.7)

Lemma 5.4. If A(r) < 0 for r near r0, and (5.7) holds, then the projection of the orbit (w(r), w′(r)) has finite rotation about (0, 0), and about (±1, 0), for r near r0.

Remark. Note that we do not assume (5.1) but only that A < 0 near r0. In Case 2, we use the contrapositive of Lemma 5.4; i.e., if A < 0 for r near r0, and if the orbit has infinite rotation about either (0, 0) or (±1, 0), then w is bounded near r0.

Proof. Assume that the orbit has infinite rotation about either (0, 0) or (±1, 0); we will show that this leads to a contradiction. Since (5.7) holds, the orbit must rotate infinitely many times outside the region w² ≤ 1, as r ↘ r0. We may also assume without loss of generality that lim_{r↘r0} w(r) = −∞. It follows that there exist sequences {rn}, {sn}, r_{n+1} < s_{n+1} < rn, with w′(rn) = 0, w(sn) = −2, lim w(rn) = −∞, and w(rn) < w(r) < w(sn) for rn < r < sn; cf. Fig. 6.

Fig. 6. (Sketch in the (w, w′)-plane: the line w = −2 and the points rn and sn on the orbit.)

We first show that for w(r) ≤ −2, w′ is bounded; i.e. (as in the proof of Lemma 3.4, cf. (3.7)),

if w′(r) = (2/3)(r0 + ε), then w″(r) < 0. (5.8)

To prove (5.8), we use (2.7):

w″ = −uw/(r²A) − (r − rA)w′/(r²A) + u²w′/(r³A) < (u/(r³A)) [−rw + uw′]. (5.9)

Thus, if for some r > r0, and w(r) ≤ −2, we had w′(r) = (2/3)(r0 + ε), then since |w/u| ≤ 2/3, it follows that

w′(r) = (2/3)(r0 + ε) > r |w/u|,

so that (5.9) implies (5.8). Thus, if w(r) ≤ −2, then w′(r) < (2/3)(r0 + ε). Since sn − rn < ε, we have, for large n,

−2 − w(r_{n+1}) < (2ε/3)(r0 + ε);

this violates (5.7). qed

Corollary 5.5. If (5.1) and (5.7) hold, then lim_{r↘r0} |w(r)| = ∞.

Proof. For r near r0, the lemma implies that the orbit has finite rotation near r0. Thus the orbit must lie in one of the four strips, w < −1, −1 < w < 0, 0 < w < 1, w > 1. Since in each strip w″ is of fixed sign when w′ = 0, it follows that w′ is of one sign near r0, so that w has a limit at r0; since w(r) is not bounded near r0, the result follows.
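The bound |w/u| ≤ 2/3 on w ≤ −2, used in the proof of Lemma 5.4, can be checked directly. The sketch below assumes u = 1 − w², which is consistent with how u enters elsewhere in the paper (for instance u = 1 when w = 0) but is not restated in this section; the variable t = |w| and the function g are notation introduced only for this check.

```latex
% Assumption: u = 1 - w^2, so |w/u| = t/(t^2 - 1) with t = |w| >= 2.
\[
g(t) = \frac{t}{t^2 - 1}, \qquad
g'(t) = \frac{(t^2 - 1) - 2t^2}{(t^2 - 1)^2}
      = -\frac{t^2 + 1}{(t^2 - 1)^2} < 0 \quad (t > 1),
\]
\[
\text{so for } t \ge 2: \qquad
\Bigl|\frac{w}{u}\Bigr| = g(t) \le g(2) = \frac{2}{3}.
\]
```

Since r < r0 + ε on the interval in question, this gives r|w/u| < (2/3)(r0 + ε), which is the comparison made in the proof.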

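Fig. 7. (The (w, w′)-plane divided by the lines w = −1, w = 0, w = 1 and the w-axis into eight regions: (1), (2), (3), (4) from left to right above the w-axis, and (8), (7), (6), (5) from left to right below it.)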

It follows from the last result that if w is unbounded near r0, then the orbit must lie in either region (1) or region (5), as depicted in Fig. 7. We will assume that the orbit lies in region (5) for r near r0; the proof for region (1) is similar, and will be omitted. Thus, assuming (5.1) and (5.7), we have

w′(r) < 0 near r0, and lim_{r↘r0} w(r) = +∞. (5.10)

Since r0 is finite, (5.10) implies

w′(r) is unbounded for r near r0 (r > r0). (5.11)

Lemma 5.6. If A(r) < 0 for r near r0 (r > r0), and (5.10) holds, then

lim_{r↘r0} w′(r) = −∞. (5.12)

Remark. We do not use hypothesis (5.1) in this lemma, but we only assume A < 0 near r0. This result will be used in Case 2.
Proof. If w′ does not have a limit at r0, then in view of (5.11), we can find sequences rn ↘ r0, sn ↘ r0, rn < sn < r_{n+1}, such that

w′(sn) = −n, w′(rn) = −n/2, (5.13)

and

if rn ≤ r ≤ sn, w′(r) ≤ −n/2. (5.14)

Then if rn ≤ r ≤ sn and n is large, (2.2) gives

2

00

−w (r) = = <
r0 `n2.

(5.15)

(5.16)

But for large n, rn < Δ + r0, so that (5.16) implies Δ = (Δ + r0) − r0 ≥ Σ (sn − rn) = ∞. This contradiction establishes (5.12) and the proof of the lemma is complete.

Thus to dispense with Case 1, and obtain the desired contradiction (assuming that w is unbounded near r0), we shall prove the following proposition.

Proposition 5.7. It is impossible for (5.1) and (5.7) to hold.

To prove this proposition, we shall obtain an estimate of the form

w″(r) ≤ k(−w′(r)) (5.17)

for r near r0. Integrating from r > r0 to r1 > r gives

ln[−w′(r)/(−w′(r1))] ≤ k(r1 − r),

and this shows that w′ is bounded near r0, thereby violating (5.12). In order to prove (5.17), we need two lemmas, the first of which is
Lemma 5.8. If A(r) < 0 for r near r0 (r > r0), and both (5.10) and (5.12) hold, then writing Aw′² = f, we have

−f(r) > w(r)⁵, if r is near r0. (5.18)

Remark. We do not assume that (5.1) holds, but only that A < 0 near r0. This result too will be used in Case 2.

Proof of Lemma 5.8. We write (2.14) in the form (cf. (2.5))

r²f′ + rfw′² + (rf + r − rA)w′² + [−(u²/r)w′² + 2uww′] = 0. (5.19)

Now for r near r0,

rf + r − rA = rAw′² + r − rA = rA(w′² − 1) + r ≤ 0, (5.20)

in view of (5.12). Furthermore, if r is near r0,

−(u²/r)w′² + 2uww′ < 0 (5.21)

because of (5.10) and (5.12). Thus (5.19)–(5.21) imply r²f′ + rfw′² > 0, so that for r near r0

f′ > −f (−w′/r)(−w′) > fw′,

or f′/f < w′. Integrating from r to r1, where r0 < r < r1, and r1 is close to r0, gives

ln(−f) |_r^{r1} < w(r1) − w(r),

so that

ln(−f(r)) > w(r) − k1,

where k1 = w(r1) − ln(−f(r1)). Exponentiating gives

−f(r) > k2 e^{w(r)} > w(r)⁵,

for r near r0, in view of Corollary 5.5.
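The integration step at the end of this proof can be written out as follows. This is only a sketch of the argument as we read it: it uses f < 0 near r0 (so that ln(−f) is defined), and the identification k2 = e^{−k1} is our reading of the constant, which the text does not spell out.

```latex
% Assumptions: f < 0 on (r_0, r_1], and f'/f < w' there; k_2 := e^{-k_1} is our reading.
\begin{align*}
\frac{d}{dr}\ln(-f) = \frac{f'}{f} < w'
\;&\Longrightarrow\;
\ln\bigl(-f(r_1)\bigr) - \ln\bigl(-f(r)\bigr) < w(r_1) - w(r) \\
&\Longrightarrow\;
\ln\bigl(-f(r)\bigr) > w(r) - k_1, \qquad k_1 = w(r_1) - \ln\bigl(-f(r_1)\bigr) \\
&\Longrightarrow\;
-f(r) > e^{-k_1} e^{w(r)} = k_2\, e^{w(r)} .
\end{align*}
```

Because w(r) → +∞ as r ↘ r0, the exponential eventually dominates any fixed power of w, which is what gives −f(r) > w(r)⁵ for r near r0.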

We shall use this last lemma for proving the following result.

Lemma 5.9. Assume that (5.1) and (5.10) hold. Then there is a constant k > 0 such that

−A(r) > kw(r)⁴, for r near r0. (5.22)

Proof. From (2.1), if r is near r0,

A′ = Φ/r² − (2/r)f ≥ (−f/r − u²/r³) − f/r > −f/r,

where we have used (5.18). Thus, using (5.12),

A′ > −Aw′²/r > k3 Aw′,

for some k3 > 0. It follows that for some constant k > 0,

−A(r) > e^{k3 w} > kw⁴, if r is near r0.
We can now complete the proof of Proposition 5.7. As we have seen earlier, it suffices to prove (5.17). Now since we are in region (5) (cf. Fig. 7), uw < 0, so that for r near r0 , (2.2) gives w00
r0, then we can find numbers b and c, b > c > r0, and sequences {sn}, {tn}, r0 < t_{n+1} < sn < tn, with µ(sn) = c, µ(tn) = b. Thus

b − c = µ(tn) − µ(sn) = µ′(ξ)(tn − sn),

where ξ is an intermediate point. Now from (2.15), for r near r0,

µ′(r) = 2Aw′² + u²/r² ≤ u²/r² ≤ k,

since w is assumed to be bounded. Hence (b − c) < k(tn − sn), or tn − sn > (b − c)/k > 0. This is a contradiction since Σ_n (tn − sn) is finite. Thus (5.28) holds and the proof is complete.

Combining Lemmas 5.4 and 5.11, we get as an immediate corollary,

Corollary 5.12. If (5.2) and (5.3) hold, and Ω = ∞, then Φ(r) is bounded for r near r0.

We next have

Lemma 5.13. If (5.2) and (5.3) hold, and w is bounded near r0, then either Aw′² is bounded near r0, or lim_{r↘r0} (Aw′²)(r) = −∞.

Proof. We write f = Aw′², and again use (2.14):

r²f′ + (2rf + Φ)w′² + 2uww′ = 0. (5.29)

If f is not bounded near r0, then (Lemma 5.11) since Φ and w are bounded, (5.29) shows that f′ > 0 if f is sufficiently large, and the result follows.

Lemma 5.14. If (5.2) holds, and Aw′² is bounded near r0, then the rotation number is finite.

Proof. We are going to apply Theorem 2.3 with w1 = −1, w2 = −1 + ε, for some ε > 0. Thus assume Ω = ∞; then there exists a sequence r^n_0 ↘ r0 with w(r^n_0) = 0, w′(r^n_0) > 0. Since A < 0 near r0, the orbit cannot cross the segment w′ = 0, −1 ≤ w ≤ 0 for r < r^n_0. Thus we can find ε > 0 and numbers r^n_{−1} and r^n_{−1+ε} such that w(r^n_{−1}) = −1, w(r^n_{−1+ε}) = −1 + ε, and for r^n_{−1} ≤ r ≤ r^n_{−1+ε} we have −1 < w(r) < −1 + ε, and for r^n_{−1+ε} ≤ r ≤ r^n_0, −1 + ε < w(r) < 0. By hypothesis, Aw′² is bounded near r0, so in particular on r^n_{−1} ≤ r ≤ r^n_{−1+ε}, for large n. In order to apply Theorem 2.3, it only remains to show that Φ(r) is bounded away from 0 on this interval if ε is small. Choose ε > 0 so small that (1 − w²)²
On this interval,

Φ = r − rA − u²/r > r − u²/r > r0 − .1r0²/r0 = .9r0. (5.31)

Now by Theorem 2.3, there exists an η > 0 such that for each n,

r^n_{−1+ε} − r^n_{−1} ≥ η.

But as r^n_{−1+ε} and r^n_{−1} both lie in (r0, r0 + Δ) for large n, we have

Δ = (r0 + Δ) − r0 ≥ Σ_n (r^n_{−1+ε} − r^n_{−1}) = ∞,

and this is a contradiction.

Our final lemma in the proof of Proposition 5.10 is the following

Lemma 5.15. If (5.2) and (5.3) hold, and Ω = ∞, then Aw′² is bounded near r0.

Proof. By Corollary 5.12, Φ is bounded. From (2.6), if Aw′² → −∞, then as r ↘ r0,

rA′ = −2Aw′² + Φ/r → +∞,

and this contradicts (5.3).

Note that Lemmas 5.14 and 5.15 prove Proposition 5.10.

Corollary 5.16. If (5.2) and (5.3) hold, then w(r) is of one sign for r near r0.

We next show that for r near r0,

either w²(r) > 1 or w²(r) < 1; (5.32)

that is, either w < −1, or −1 < w < 0, or 0 < w < 1, or w > 1. To prove this we need two lemmas, the first of which is:

Lemma 5.17. If (5.2) and (5.3) hold, then lim_{r↘r0} w²(r) = 1 is not possible.

Proof. Suppose (for definiteness) that lim_{r↘r0} w(r) = −1. With ε defined by (5.30), we see that for r near r0, −1 − ε ≤ w(r) ≤ −1 + ε. On this interval, (5.31) implies Φ(r) > .9r0. Then from (2.6),

rA′ = −2Aw′² + Φ/r > .9r0/r > 0,

and this contradicts (5.3).

We next show that the orbit has finite rotation about (1, 0) in the case w > 0 near r0, or about (−1, 0) in the case w < 0.

Lemma 5.18. If (5.2) and (5.3) hold and w > 0 for r near r0, then the projection of the orbit in the w–w′ plane has finite rotation about (1, 0). Similarly, if w < 0 for r near r0, then the projection of the orbit in the w–w′ plane has finite rotation about (−1, 0).

Proof. Suppose w > 0 near r0 (the proof for w < 0 is similar, and will be omitted), and the orbit has infinite rotation about (1, 0). Since lim_{r↘r0} w(r) ≠ 1, we must have either lim_{r↘r0} w(r) > 1 or lim_{r↘r0} w(r) < 1. In either case we repeat the argument of Lemma 5.10 using the w-interval [1, 1 + ε] or [1 − ε, 1]. We have that Φ is bounded away from 0 by (5.31). By Lemma 5.13, either (Aw′²)(r) → −∞ as r ↘ r0, or Aw′² is bounded near r0. We rule out the case Aw′² → −∞ because w′ is of one sign; hence Aw′² is bounded near r0. Using Theorem 2.3 exactly as in Lemma 5.14, we have that the orbit can cross the line w = 1 only a finite number of times. Thus w > 1 or w < 1 for r near r0.



Summarizing, we have

Corollary 5.19. For r near r0, precisely one of the following holds: w(r) < −1, −1 < w(r) < 0, 0 < w(r) < 1, or w(r) > 1.

Since w″, when w′ = 0, has a fixed sign in each of the four strips, we see that w′ must have a fixed sign for r near r0; i.e., the projection of the orbit in the w–w′ plane must lie in one of the 8 regions depicted in Fig. 7.

Since we now have the orbit confined to one of these 8 regions, without loss of generality we will consider the case where w′ < 0. We will first show that the orbit cannot lie in regions (6) or (8) for r near r0. Then we will show that if the orbit is in regions (5) or (7), and w′ is bounded near r0, then lim_{r↘r0} A(r) = 0 and lim_{r↘r0} (w(r), w′(r)) exists and lies on C_{r0}; hence the orbit continues past r0. We complete the proof of Theorem 5.1 by showing that the case where w′ is unbounded near r0 cannot occur.

Lemma 5.20. If (5.2) and (5.3) hold, then the orbit cannot lie in regions (6) or (8) for r near r0.

Proof. In regions (6) and (8), w is bounded near r0. Thus from Lemma 5.11,

lim_{r↘r0} A(r) = 0. (5.33)

If v = Aw′, then from (2.13) we see v′ ≤ 0, so lim_{r↘r0} v(r) = L > 0 exists. Thus writing Aw′² = v²/A, we see that

lim_{r↘r0} (Aw′²)(r) = −∞. (5.34)

Since w is bounded near r0, (5.33) implies that Φ is bounded near r0. Thus, from (2.6),

rA′ = Φ/r − 2Aw′² → +∞

as r ↘ r0. However, this contradicts (5.3).
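The two limits in this proof can be traced as below. This is a sketch under the assumptions already made in the proof (v = Aw′ has a limit L > 0, A < 0 with A → 0, and Φ bounded); the symbol 0⁻ simply records that A tends to zero through negative values.

```latex
% v = A w' as in the proof; L > 0 is its limit at r_0, and A -> 0 through negative values.
\[
A w'^2 = \frac{(A w')^2}{A} = \frac{v^2}{A},
\qquad v^2 \to L^2 > 0, \quad A \to 0^{-}
\quad\Longrightarrow\quad A w'^2 \to -\infty ,
\]
\[
rA' = \frac{\Phi}{r} - 2Aw'^2 \to +\infty
\qquad (\Phi \text{ bounded near } r_0).
\]
```

One way to see the clash with (5.3): once A′ > 0 on some interval (r0, r1], A is increasing there, so A(r) < A(r1) < 0 for all r in (r0, r1), and A(rn) cannot tend to 0 along any sequence rn ↘ r0.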

We now consider the case where (5.2) and (5.3) hold, and the orbit lies in one of the regions (5) or (7) for r near r0, r > r0. We first consider the case where w′ is bounded.

Lemma 5.21. Suppose that (5.2) and (5.3) hold, and that the orbit lies in either region (5) or (7) for r near r0. If w′(r) is bounded near r0, then lim_{r↘r0} A(r) = 0, lim_{r↘r0} (w(r), w′(r)) = (w̄, w̄′) exists, and (w̄, w̄′) lies on C_{r0}.

Note that in view of our remark preceding Proposition 5.10, Lemma 5.21 implies that Theorem 5.1 holds in this case.

Proof. First note that since w′ is bounded, w is bounded, and hence Lemma 5.11 implies that

lim_{r↘r0} A(r) = 0. (5.35)

Now as A → 0, and w has a limit, we see that Φ = r − rA − u²/r has a limit; call this limit Φ0; i.e.

Φ0 = lim_{r↘r0} Φ(r). (5.36)

If Φ0 ≠ 0, then as lim_{r↘r0} v(r) = 0 we may apply L'Hospital's rule to obtain

lim_{r↘r0} w′(r) = lim_{r↘r0} v(r)/A(r) = lim_{r↘r0} v′(r)/A′(r) = lim_{r↘r0} [−2w′²v/r − uw/r²] / [Φ/r² − 2Aw′²/r] = lim_{r↘r0} −uw/Φ,

where we have used (2.6) and (2.13). Thus

lim_{r↘r0} w′(r) = lim_{r↘r0} −uw/Φ. (5.37)

We claim that

Φ0 ≠ 0. (5.38)

Note that if (5.38) holds, then since w has a finite limit at r0, (5.37) implies that lim_{r↘r0} w′(r) exists and is finite, and

lim_{r↘r0} (w(r), w′(r)) ∈ C_{r0}.

So, to complete the proof of Lemma 5.21, it suffices to prove (5.38). Thus, assume Φ0 = 0; we show this leads to a contradiction. If (uw)(r0) ≠ 0, then (5.37) implies that w′(r) is unbounded near r0, and this is a contradiction. Hence we may assume (uw)(r0) = 0. If u(r0) = 0, then

0 = Φ0 = r0 − u0²/r0 = r0,

and this is a contradiction since r0 > 0. Thus we may assume w(r0) = 0. In this case

0 = Φ0 = r0 − 1/r0,

so that r0 = 1. Note too that if w(r0) = 0, the orbit lies in region (7) for r near r0. We now have

A(r_{n+1}) − A(rn) = (r_{n+1} − rn)A′(ξ), (5.39)

where rn > ξ > r_{n+1} > 1. From (2.6),

ξA′(ξ) = 1 − A(ξ) − u²(ξ)/ξ² − 2(Aw′²)(ξ). (5.40)

Since ξ > 1, 1 − u²(ξ)/ξ² > 0, so for large n, (5.40) implies A′(ξ) > 0. Using this in (5.39) gives 0 > A(rn) > A(r_{n+1}), and this violates (5.3). Thus (5.38) holds and the proof is complete.
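The L'Hospital step behind (5.37) can be unpacked as follows. This is a sketch assuming the forms v′ = −2w′²v/r − uw/r² for (2.13) and A′ = Φ/r² − 2Aw′²/r for (2.6), which are the forms these equations take where they are quoted elsewhere in this section.

```latex
% Multiply numerator and denominator of v'/A' by r^2.
\[
\frac{v'}{A'}
= \frac{-\dfrac{2w'^2 v}{r} - \dfrac{uw}{r^2}}
       {\dfrac{\Phi}{r^2} - \dfrac{2Aw'^2}{r}}
= \frac{-2r\,w'^2 v - uw}{\Phi - 2r\,Aw'^2} .
\]
% With w' bounded, v = A w' -> 0 and A w'^2 -> 0 as A -> 0, so only uw and \Phi survive:
\[
\lim_{r \searrow r_0} \frac{v'}{A'} = -\frac{(uw)(r_0)}{\Phi_0}
\qquad (\Phi_0 \neq 0).
\]
```

The same formula shows why Φ0 = 0 with (uw)(r0) ≠ 0 is incompatible with w′ being bounded: the denominator tends to zero while the numerator does not.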



We now consider the case where (5.2) and (5.3) hold, the orbit is in region (5) or (7), and w′(r) is unbounded for r near r0, r > r0. We shall show that this case is impossible. First note that if w is bounded near r0, it follows from Lemma 5.11 that

lim_{r↘r0} A(r) = 0. (5.41)

Since w′ < 0, lim_{r↘r0} w(r) exists. Thus if w is bounded near r0, lim_{r↘r0} Φ(r) exists and is finite; say

lim_{r↘r0} Φ(r) = Φ0. (5.42)

We now have

Proposition 5.22. If (5.2) and (5.3) hold, and w′ is unbounded near r0, then w cannot be bounded near r0; in particular, the orbit cannot lie in region (7).

Proof. Suppose that w(r) is bounded for r near r0; we will show that this leads to a contradiction. Thus, in this case (5.41) holds and Φ0 is finite. We consider 3 cases, Φ0 > 0, Φ0 < 0, Φ0 = 0, and we will obtain contradictions in all cases.

Case 1. Φ0 > 0. From (2.6), for r near r0,

A′(r) = Φ/r² − 2Aw′²/r > 0,

and this violates (5.3); thus Case 1 cannot occur.

Case 2. Φ0 < 0. We first show

lim_{r↘r0} w′(r) = −∞. (5.43)

To see this, note that if (5.43) were false, then as w′ is unbounded near r0, there would exist a sequence sn ↘ r0 such that w′(sn) < −n and w″(sn) = 0. Then from (2.7)

0 = sn²(Aw″)(sn) + Φ(sn)w′(sn) + (uw)(sn) = Φ(sn)w′(sn) + (uw)(sn) → ∞

as n → ∞. This contradiction implies that (5.43) holds. Now if f = Aw′², then from (2.14),

r²f′ + (2rf + Φ)w′² + 2uww′ = 0, (5.44)

and since (2rf + Φ) is strictly negative near r0 and w is bounded near r0, it follows from (5.43) that f′(r) > 0 if r is near r0. Thus

lim_{r↘r0} f(r) = L < 0

exists, where L ≥ −∞. We claim that

L = −∞. (5.45)



To see this, we note first that

(w′²v)(r) = w′(r)f(r) → +∞ (v = Aw′), (5.46)

so that (cf. (2.13)),

v′ = −2w′²v/r − uw/r² → −∞,

since w is bounded near r0. Hence, if r0 < r < r1, and r1 is near r0, v(r1) < v(r), so

(Aw′²)(r) = v(r)w′(r) < v(r1)w′(r),

and as v(r1)w′(r) → −∞, we see that (Aw′²)(r) → −∞ as r ↘ r0; thus (5.45) holds. Now again using (2.6),

rA′(r) = −2(Aw′²)(r) + Φ/r → +∞,

as r ↘ r0. But this violates (5.3); hence Case 2 cannot occur. We now turn to the final case,

Case 3. Φ0 = 0. The proof in this case relies on Theorem 2.2. Indeed, we will show that lim_{r↘r0} A′(r) = 0, and from (5.41), lim_{r↘r0} A(r) = 0. This is enough to invoke Theorem 2.2, to conclude that w(r) ≡ 0 and thus w′(r) ≡ 0; this violates the assumption that w′ is unbounded. We first show

lim inf_{r↘r0} A′(r) ≤ 0. (5.47)

Indeed, if lim inf_{r↘r0} A′(r) > 0, then for r > r0, r near r0,

0 > A(r) = A(r) − A(r0) = (r − r0)A′(ξ) > 0,

where ξ is an intermediate point. This contradiction establishes (5.47). Next, since

rA′ = Φ/r − 2Aw′², (5.48)

it follows from (5.47) that lim inf_{r↘r0} (Φ/r − 2Aw′²) ≤ 0, so lim inf_{r↘r0} (Φ0/r0 − 2Aw′²) ≤ 0, or

0 ≥ lim sup_{r↘r0} 2Aw′² ≥ Φ0/r0 = 0;

thus

lim sup_{r↘r0} Aw′² = 0. (5.49)

We next show

lim inf_{r↘r0} Aw′² = lim sup_{r↘r0} Aw′². (5.50)

(Note that if (5.50) holds, then lim_{r↘r0} Aw′² = 0, so from (5.48) A′(r0) = 0. Thus the proof of Proposition 5.22 will be complete once we prove (5.50).) So suppose that there is an η > 0 such that

lim inf_{r↘r0} Aw′² ≤ −2η. (5.51)


Then in view of (5.49), if f = Aw′², we can find a sequence sn ↘ r0 such that f(sn) = −η, f′(sn) < 0. Since (5.41) holds, we have A(sn) → 0 so that w′(sn) → −∞. From (5.44),

sn²f′(sn) + (−2snη + Φ(sn))w′²(sn) + 2(uww′)(sn) = 0. (5.52)

But as f′(sn) < 0 and w′²(sn) → ∞, we see that (5.52) cannot hold for large n. Thus (5.50) holds and this implies lim_{r↘r0} Aw′(r) = 0, and thus by Theorem 2.2, we have a contradiction.

We now consider the final case in the proof of Theorem 5.1, namely in regions (5) or (7),

w and w′ are unbounded near r0. (5.53)

(Of course, this implies that we are in region (5).) Note too that in this case we have

lim_{r↘r0} w(r) = +∞. (5.54)

Proposition 5.23. If (5.2) and (5.3) hold, and the orbit lies in region (5), then (5.54) cannot hold.

Note that once Proposition 5.23 is established this will complete the proof of Theorem 5.1.

Proof. From our remark following the statement of Lemma 5.6, we have

lim_{r↘r0} w′(r) = −∞. (5.55)

Then as we have remarked earlier (5.18) holds; i.e. −Aw′² > w⁵, for r near r0. Thus, from (5.48), for r near r0,

rA′(r) = −2f + Φ/r > 2w(r)⁵ + 1 − A − u²/r² > 0,

since u² is of order w⁴, and this contradicts (5.3).

6. Miscellaneous Results and Open questions

In Sect. 3, we proved that the zeros of A are discrete, except possibly at r = 0. This leads to the first question.

1. Can r = 0 be a limit point of zeros of A? We conjecture that the answer is no. In a recent paper [4, p. 8, l. 7], the authors assume that the answer is no. A rigorous proof of this would be welcome. A related question is

2. Do there exist solutions of the EYM equations for which A has more than two zeros? A negative answer obviously implies a negative answer to Question 1. In [5], the authors have numerically obtained a solution having two zeros. This leads to the next problem.

3. Give a rigorous proof of the existence of a global solution of the EYM equations (other than the classical Reissner–Nordström solution), where A has two zeros.



4. A subject of much current interest is the study of solutions near r = 0 [4, 5]. If, as we suspect, Question 1 has a negative answer, then every solution near r = 0 has either A > 0 or A < 0. If A > 0 near r = 0, then we have proved in [10] that either lim_{r↘0} A(r) = 1, in which case the solution is particle-like, or else lim_{r↘0} A(r) = +∞, in which case the solution is a Reissner–Nordström-like (RNL) solution [14]; this case is re-discussed in [4]. If A < 0 near r = 0, much less is known. In [14], we proved the following theorem:

Theorem 6.1. Given any triple of the form q = (1, b, c), there exists a unique local RNL solution (A_q(r), w_q(r)), satisfying lim_{r↘0} rA(r) = b, w_q(0) = 1, w_q″(0) = c, and the solution depends continuously on these values. If b < 0, then lim_{r↘0} A(r) = −∞, and lim_{r↘0} (w²(r), w′(r)) = (1, 0).

These solutions have been termed Schwarzschild-like [5]. In [5], the authors also investigated RNL solutions but they mistakenly omitted the 2-parameter family of solutions that have w(0) = 0. These solutions have the following asymptotic form near r = 0:

A(r) = 1/r² + b/r + h.o.t.,
w(r) = cr³ + h.o.t.

These solutions are interesting since they give rise to asymptotically flat solutions with half-integral rotation numbers; see [14]. In addition there are solutions which have w²(0) = 1; these solutions have the following asymptotic form near r = 0:

A(r) = b/r + h.o.t.,
w(r) = ±1 + cr² + h.o.t.

There is still another type of local solution (discussed in [5]), having A < 0 near r = 0, but these do not appear to give rise to asymptotically flat global solutions [5]. We are thus led to the following "trichotomy conjecture":

Conjecture. If (A(r), w(r)) is a globally defined solution of the EYM equations (1.3), (1.4), then

lim_{r↘0} A(r) = −∞, or +1, or +∞.

In view of our above remarks concerning the behavior of solutions if A(r) > 0 near r = 0, this conjecture can be rephrased as:

Conjecture. If (A(r), w(r)) is a globally defined solution to the EYM equations (1.3), (1.4), and A(r) < 0 for r near 0, then lim_{r↘0} A(r) = −∞.

5. Another interesting question is the following: Does there exist a solution to the EYM equations (1.3), (1.4), where A(r) < 0 in a neighborhood of r = ∞? We conjecture that the answer to this question is negative. If our conjecture is true, this would enable us to drop the hypothesis A(r̄) > 0 in Theorem 1.2. If, on the other hand, the conjecture is false, then we can show that the orbit must have infinite rotation in the (w, w′)-plane and w must be unbounded.

6. Using the methods in [7–9], we have proved the following theorem



Theorem 6.2. There is a continuous 2-parameter family of solutions (A_{α,β}(r), w_{α,β}(r)) to the EYM equations (1.3), (1.4), defined in the far-field, which are analytic functions of s = 1/r. That is, if (A(r), w(r)) is a solution to the EYM equations (1.3), (1.4) which is asymptotically flat, and is analytic in s = 1/r, then (A(r), w(r)) = (A_{α,β}(r), w_{α,β}(r)) for some pair of parameter values (α, β).

(We omit the details of the proof as they are similar to those in [7].)

In the above theorem, one parameter is the (ADM) mass β, and in fact, A(s = 0) = 1 and dA/ds|_{s=0} = −β. The other parameter is α = dw/ds|_{s=0}, and w²(s = 0) = 1; cf. [10]. It follows from the results in [10] or [14] that the (ADM) mass β is finite for any solution which is defined in the far-field. Moreover, for such solutions lim_{r→∞} rw′(r) = 0; cf. [9]. We do not know whether lim_{r→∞} r²w′(r) ≡ lim_{s→0} dw(s)/ds exists. This leads to the next question: Is every asymptotically flat solution to the EYM equations (1.3), (1.4) analytic in s = 1/r at s = 0? If the answer is affirmative, then we may consider the (α, β)-plane as representing those solutions having the following asymptotic form near s = 0:

A(s) = 1 − βs + h.o.t.,
w(s) = 1 − αs + h.o.t.,

and all such solutions are described by a point in the (α, β)-plane (or in the corresponding plane for w(s = 0) = −1), or they correspond to the 1-parameter family of classical Reissner–Nordström solutions:

A(r) = 1 − c/r + 1/r², w(r) ≡ 0.
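As a quick consistency check, the classical Reissner–Nordström family can be substituted into the A-equation in the form (5.5). The sketch below assumes u = 1 − w², so that u ≡ 1 when w ≡ 0; that definition is not restated in this part of the paper.

```latex
% Assumption: u = 1 - w^2, hence u = 1 for w \equiv 0 and (5.5) reads rA' + A = 1 - 1/r^2.
\begin{align*}
A(r) &= 1 - \frac{c}{r} + \frac{1}{r^2},
&
rA'(r) &= \frac{c}{r} - \frac{2}{r^2},
\\
rA' + A &= 1 - \frac{1}{r^2}. &&
\end{align*}
% In the variable s = 1/r: A = 1 - cs + s^2, so dA/ds at s = 0 equals -c.
% For c = 2: A = (1 - 1/r)^2 = ((r - 1)/r)^2.
```

With w ≡ 0 the w-equation (5.4) holds trivially, since uw, w′ and w″ all vanish. In the variable s the constant c plays the role of the mass parameter β, and c = 2 recovers the extreme Reissner–Nordström solution of Sect. 3.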

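Fig. 8. (The (α, β)-plane: regions labelled RNL, the Schwarzschild solutions along the β-axis, the level β = 2, curves labelled Ω = 1, Ω = 2, …, Ω = n, and the points P1, P2, …, Pn.)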

We consider the (α, β)-plane as depicted in Fig. 8. In this plane, certain regions are easy to identify. Thus, points with α < 0 correspond to RNL solutions. Similarly, the region α > 0, β < 0 also corresponds to RNL solutions. The line α = 0 corresponds to Schwarzschild solutions with mass β. Particle-like and black-hole solutions must lie in the 1st quadrant α > 0, β > 0. Presumably, there are a countable number of curves in the 1st quadrant distinguished by the number of zeros of w, parametrized by ρ, the event horizon. (These are schematically depicted in Fig. 8, where the points Pn correspond to particle-like solutions and the β coordinate of Pn tends to 2 as n → ∞; cf. [11].) There are also a countable number of points in this quadrant which correspond to particle-like solutions. All other solutions in this quadrant are RNL solutions.



Thus, near any particular black-hole solution, there are global solutions which are neither black-hole nor particle-like solutions; i.e., they must be RNL solutions. This follows since any point in this plane represents a global solution (from our results in this paper, cf. Theorem 1.2). Thus for any such global solution (A, w), either A has a zero, in which case the corresponding point (α, β) lies on one of the above-mentioned countable number of curves, or it is one of the countable number of particle-like solutions, or it is an RNL solution [10, 14]. It follows that in any neighborhood of a black-hole solution (A0(r), w0(r)) there are RNL solutions. In particular, if A0(r1) = −η < 0, then arbitrarily close to this solution, there are solutions (A(r), w(r)) having A(r1) > 0. This is a spectacular example of non-continuous dependence on initial conditions.

References

1. Bartnik, R., and McKinnon, J.: Particle-like solutions of the Einstein–Yang–Mills equations. Phys. Rev. Lett. 61, 141–144 (1988)
2. Bizon, P.: Colored black holes. Phys. Rev. Lett. 64, 2844–2847 (1990)
3. Breitenlohner, P., Forgács, P., and Maison, D.: Static spherically symmetric solutions of the Einstein–Yang–Mills equations. Commun. Math. Phys. 163, 141–172 (1994)
4. Breitenlohner, P., Lavrelashvili, G., and Maison, D.: Mass inflation and chaotic behavior inside hairy black holes. gr-qc/9703047
5. Donets, E.E., Gal'tsov, D.V., and Zotov, M.Yu.: Internal structure of Einstein–Yang–Mills black holes. gr-qc/9612067
6. Künzle, H.P., and Masood-ul-Alam, A.K.M.: Spherically symmetric static SU(2) Einstein–Yang–Mills fields. J. Math. Phys. 31, 928–935 (1990)
7. Smoller, J., Wasserman, A., Yau, S.-T., and McLeod, J.: Smooth static solutions of the Einstein–Yang/Mills equations. Commun. Math. Phys. 143, 115–147 (1991)
8. Smoller, J., and Wasserman, A.: Existence of infinitely-many smooth static, global solutions of the Einstein/Yang–Mills equations. Commun. Math. Phys. 151, 303–325 (1993)
9. Smoller, J., Wasserman, A., and Yau, S.-T.: Existence of black hole solutions for the Einstein–Yang/Mills equations. Commun. Math. Phys. 154, 377–401 (1993)
10. Smoller, J., and Wasserman, A.: Regular solutions of the Einstein–Yang/Mills equations. J. Math. Phys. 36, 4301–4323 (1995)
11. Smoller, J., and Wasserman, A.: Limiting masses of solutions of Einstein–Yang/Mills equations. Physica D 93, 123–136 (1996)
12. Smoller, J., and Wasserman, A.: Uniqueness of extreme Reissner–Nordström solution in SU(2) Einstein–Yang/Mills theory for spherically symmetric spacetime. Phys. Rev. D 52, 5812–5815 (15 Nov. 1995)
13. Smoller, J., and Wasserman, A.: Uniqueness of zero surface gravity SU(2) Einstein–Yang/Mills black holes. J. Math. Phys. 37, 1461–1484 (1996)
14. Smoller, J., and Wasserman, A.: Reissner–Nordström-like solutions of the SU(2) Einstein–Yang/Mills equations. J. Math. Phys. 38, 6522–6559 (1997)
15. Straumann, N., and Zhou, Z.: Instability of a colored black hole solution. Phys. Lett. B 243, 33–35 (1990)
16. Ershov, A.A., and Galtsov, D.V.: Non abelian baldness of colored black holes. Phys. Lett. A. 150, 747, 160–164 (1989)
17. Lavrelashvili, G., and Maison, D.: Regular and black-hole solutions of Einstein–Yang/Mills dilaton theory. Phys. Lett. B 295, 67 (1992)
18. Volkov, M.S., and Gal'tsov, D.V.: Black holes in Einstein–Yang/Mills theory. Sov. J. Nucl. Phys. 51, 1171 (1990)
19. Volkov, M.S., and Gal'tsov, D.V.: Sphalerons in Einstein–Yang/Mills theory. Phys. Lett. B 273, 273 (1991)

Communicated by A. Jaffe
