Communications in Mathematical Physics - Volume 264

Commun. Math. Phys. 264, 1–40 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1538-3 Communications in Mathe...

Author: M. Aizenman (Chief Editor)


Commun. Math. Phys. 264, 1–40 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1538-3


The Virtual Class of the Moduli Stack of Stable r-Spin Curves

Takuro Mochizuki 1,2

1 Department of Mathematics, Kyoto University, Kyoto, 606-8502, Japan. E-mail: [email protected]
2 Max-Planck Institute for Mathematics, Vivatsgasse 7, 53111 Bonn, Germany. E-mail: [email protected]

Received: 18 April 2004 / Accepted: 27 October 2005 Published online: 15 March 2006 – © Springer-Verlag 2006

Abstract: We recall the outline of the Seely-Singer-Witten construction of the virtual class on the moduli of stable r-spin curves. We prove that the obtained classes satisfy the axioms of Jarvis-Kimura-Vaintrob.

1. Introduction

1.1. Stable r-spin curves. Let $n$ be a non-negative integer, $r$ be a natural number and $\mathbf{m} = (m_1, \ldots, m_n)$ be an $n$-tuple of integers. Let $\mathcal{M}_{g,n}^{1/r,\mathbf{m}}$ be the moduli stack of smooth $n$-pointed $r$-spin curves of genus $g$. Such a curve is a tuple $(C, \mathbf{p}, F)$ of a smooth curve $C$ of genus $g$, $n$ points $\mathbf{p} = \{p_1, \ldots, p_n\}$ contained in the curve $C$, and a line bundle $F$ on $C$ with an isomorphism $F^{\otimes r} \simeq \omega_C \otimes \mathcal{O}(-\mathbf{m}\cdot\mathbf{p})$, where we put $\mathbf{m}\cdot\mathbf{p} = \sum m_i \cdot p_i$. T. Jarvis constructed the moduli stack $\overline{\mathcal{M}}_{g,n}^{1/r,\mathbf{m}}$ of stable $r$-spin curves, which gives a smooth compactification of the stack $\mathcal{M}_{g,n}^{1/r,\mathbf{m}}$ ([3 and 4]). Naively, a stable $r$-spin curve is a tuple $(C, \mathbf{p}, F)$ of an $n$-pointed stable curve $(C, \mathbf{p})$ and a torsion-free rank 1 sheaf $F$ with a morphism $F^{\otimes r} \to \omega_C(-\mathbf{m}\cdot\mathbf{p})$ satisfying some conditions. But the moduli stacks of such naive objects are not smooth. Hence he introduced the notion of a coherent net so that the moduli stacks become smooth. Thus the moduli stack $\overline{\mathcal{M}}_{g,n}^{1/r,\mathbf{m}}$ over the complex number field $\mathbb{C}$ is topologically an orbifold.

The nodal points of a stable $r$-spin curve $(C, \mathbf{p}, F)$ are divided into two types by the local property of the $r$-spin structure $F$ at the point: If $F$ is locally free at the nodal point $Q$, then $Q$ is called of Ramond type. Otherwise the point $Q$ is called of Neveu-Schwarz type. By definition, $F$ around the point $Q$ is described as follows: Let $\pi : \tilde{C} \to C$ be the normalization of the curve $C$ at the point $Q$. Let $Q_i$ ($i = 1, 2$) denote the preimages of $Q$. We take coordinate neighbourhoods $B_i = \{z_i \in \mathbb{C} \mid |z_i| < 1,\ z_i(Q_i) = 0\}$ around the points $Q_i$.


• If $Q$ is of Ramond type, then we have the locally free sheaf $\tilde{F}$ generated by the sections $(dz_i/z_i)^{1/r}$. The sheaf $F$ is obtained from $\tilde{F}$ by the gluing $\varphi : \tilde{F}_{Q_1} \to \tilde{F}_{Q_2}$ such that $\varphi\big((dz_1/z_1)^{1/r}\big)^{\otimes r} = -dz_2/z_2$. If $Q$ is of Ramond type, the index at $Q$ is defined to be $(-1, -1)$.
• If $Q$ is of Neveu-Schwarz type, then there is a pair of integers $(m_1, m_2)$ with the conditions $0 \le m_i \le r - 2$ and $m_1 + m_2 = r - 2$ such that the sheaf $F$ is $\bigoplus_i \pi_* \mathcal{O}_{B_i} \cdot (z_i^{m_i} dz_i)^{1/r}$ around the nodal point $Q$. The pair $(m_1, m_2)$ is called the index of $Q$. Also the number $m_i$ is called the index of $Q_i$.

1.2. The virtual class. In the papers [6 and 7], T. Jarvis, T. Kimura and A. Vaintrob introduced the substack $\overline{\mathcal{M}}_{\Gamma}^{1/r}$ of $\overline{\mathcal{M}}_{g,n}^{1/r,\mathbf{m}}$ for any stable decorated graph $\Gamma$ of genus $g$ with $n$ tails marked by the tuple $\mathbf{m}$, and they formulated the axioms of the virtual class of $\overline{\mathcal{M}}_{\Gamma}^{1/r}$ (see Definition 4.1). The virtual class of $\overline{\mathcal{M}}_{g,n}^{1/r,\mathbf{m}}$ was originally introduced by Witten ([13]), when he generalized his famous conjecture on the intersection theory of the moduli stack of stable curves, although the existence of a nice moduli stack of $r$-spin stable curves was established by Jarvis later.

Remark 1.1. Jarvis, Kimura and Vaintrob also considered the case where one of the $m_i$ is $-1$. In this paper, we restrict our attention to tuples of non-negative integers.

1.3. The purpose. The purpose of this paper is to recall the outline of the original construction of Witten, which we call the SSW-construction, and to show that the obtained virtual classes satisfy the axioms of Jarvis-Kimura-Vaintrob.

We recall the SSW-construction briefly. We denote the moduli stack of smooth (resp. stable) $r$-spin curves by $\mathcal{M}$ (resp. $\overline{\mathcal{M}}$). From the universal curve $\mathcal{C} \to \mathcal{M}$, we obtain the Hilbert space bundles $\mathcal{E}^0$ and $\mathcal{E}^1$ whose fibers on the curve $(C, \mathbf{p}, F)$ are $L^2(C, F)$ and $L^2(C, F \otimes T^{0,1})$ respectively. We have the family of the closed operators $\bar\partial$. Then we extend such a family $\bar\partial : \mathcal{E}^0 \to \mathcal{E}^1$ to the family $\bar\partial : \overline{\mathcal{E}}^0 \to \overline{\mathcal{E}}^1$ on the stack $\overline{\mathcal{M}}$ by the argument of Seely-Singer [11]. (See Sect. 3.) Then we take finite dimensional vector subbundles $E^i \subset \overline{\mathcal{E}}^i$ satisfying the following conditions:
• $\bar\partial(E^0) \subset E^1$ and the index of $\bar\partial : E^0 \to E^1$ is the same as the index of $\bar\partial : \overline{\mathcal{E}}^0 \to \overline{\mathcal{E}}^1$.
• $E^1$ contains the orthogonal complement of $\operatorname{Im} \bar\partial$.
Such $E^0 \to E^1$ is called a finite reduction. (See Definition 4.2 for a more precise statement.)

Let $\rho_{E^1}$ denote the orthogonal projection of $\overline{\mathcal{E}}^1$ onto $E^1$. Let $\pi$ denote the natural projection of $E^0$ on $\overline{\mathcal{M}}$. Witten gave the section $\phi$ of $\pi^* E^1$ over $E^0$, which is given by $\phi(s) = \bar\partial s + \rho_{E^1}(\bar{s}^{\,r-1})$ (see Subsubsect. 4.2.3), and he shows $\phi^{-1}(0) = \overline{\mathcal{M}}$ (see Lemma 4.5). Hence we obtain the top Chern class $c_{top}(\pi^* E^1, \phi)$, which is contained in $H^*_{\overline{\mathcal{M}}}(E^0)$. It can be shown that $(-1)^{d(r,\mathbf{m})} \pi_* c_{top}(\pi^* E^1, \phi)$ is independent of the choice of a finite reduction, and this should be the virtual class.

We will see that the classes satisfy the axioms in Subsect. 4.4. One of the key points in the proof is to show 'vanishing'. Our simple idea for vanishing is explained in Subsubsect. 4.3.1. The author hopes that it is sufficiently clear for the readers.

Remark 1.2. The algebro-geometric construction of the virtual class was given by A. Polishchuk and A. Vaintrob in [10 and 9].


2. Stable r-Spin Curves

We recall the definitions and the results on stable $r$-spin curves due to Jarvis ([3 and 4]) to fix the notation. We refer to the paper [6] as a convenient reference. We denote a tuple of points $p_1, \ldots, p_n$ (resp. integers $m_1, \ldots, m_n$) by $\mathbf{p}$ (resp. $\mathbf{m}$). We denote the formal sum $\sum m_i \cdot p_i$ by $\mathbf{m}\cdot\mathbf{p}$.

Definition 2.1. Let $(X, \mathbf{p})$ be a nodal, $n$-pointed algebraic curve, and let $K$ be a rank one torsion-free sheaf on $X$. A $d$-th root of $K$ of type $\mathbf{m} = (m_1, \ldots, m_n)$ is a pair $(E, b)$ of a rank one torsion-free sheaf $E$ and an $\mathcal{O}_X$-module homomorphism $b : E^{\otimes d} \to K(-\mathbf{m}\cdot\mathbf{p})$ with the following properties:
• $d \cdot \deg E = \deg K - \sum m_i$.
• $b$ is an isomorphism on the locus of $X$ where $E$ is locally free.
• For every point $p \in X$ where $E$ is not free, the length of the cokernel of $b$ at $p$ is $d - 1$.

Jarvis used the following notion to obtain a smooth moduli space.

Definition 2.2. Let $K$ be a rank 1, torsion-free sheaf on a nodal $n$-pointed curve $(X, \mathbf{p})$. A coherent net of $r$-th roots of $K$ of type $\mathbf{m} = (m_1, \ldots, m_n)$ consists of the following data:
• A rank 1 torsion-free sheaf $E_d$ on $X$ for every divisor $d$ of $r$.
• An $\mathcal{O}_X$-module homomorphism $c_{d,d'} : E_d^{\otimes d/d'} \to E_{d'}$ for every pair of divisors $d'$ and $d$ of $r$ such that $d'$ divides $d$.
These data are subject to the following restrictions:
1. $E_1 = K$ and $c_{1,1} = \mathrm{id}$.
2. For each divisor $d$ of $r$ and each divisor $d'$ of $d$, let $\mathbf{m}' = (m'_1, \ldots, m'_n)$ be a tuple such that $m'_i$ is the unique non-negative integer less than $d/d'$, and congruent to $m_i$ mod $d/d'$. Then the homomorphism $c_{d,d'}$ makes $(E_d, c_{d,d'})$ into a $d/d'$-th root of $E_{d'}$ of type $\mathbf{m}'$.
3. The homomorphisms $\{c_{d,d'}\}$ are compatible, i.e., $c_{d',d''} \circ c_{d,d'}^{\otimes d'/d''} = c_{d,d''}$ holds.

Then Jarvis defined stable $r$-spin curves.

Definition 2.3. An $n$-pointed $r$-spin curve of type $\mathbf{m} = (m_1, \ldots, m_n)$ is defined to be an $n$-pointed nodal curve $(X, \mathbf{p})$ with a coherent net of $r$-th roots of $\omega_X$ of type $\mathbf{m}$, where $\omega_X$ is the dualizing sheaf of $X$. An $r$-spin curve is called smooth if $X$ is smooth, and it is called stable if $X$ is stable.

The nodal points of a stable $r$-spin curve $(C, \mathbf{p}, F)$ are divided into two types by the local property of the sheaf $F$ at the point: If $F$ is locally free at the nodal point $Q$, then $Q$ is called of Ramond type. Otherwise $Q$ is called of Neveu-Schwarz type.

To obtain the category of $r$-spin curves, the 'morphisms' are considered as follows.

Definition 2.4. An isomorphism of $r$-spin curves $\big(X, \mathbf{p}, \{E_d, c_{d,d'}\}\big) \xrightarrow{\ \sim\ } \big(X', \mathbf{p}', \{E'_d, c'_{d,d'}\}\big)$ of the same type $\mathbf{m}$ is defined to be a tuple $(\tau, \beta)$ of an isomorphism of pointed curves $\tau : (X, \mathbf{p}) \to (X', \mathbf{p}')$ and a family of isomorphisms $\beta_d : \tau^* E'_d \to E_d$ with $\beta_1$ being the canonical isomorphism $\tau^* \omega_{X'}(-\mathbf{m}\cdot\mathbf{p}') \to \omega_X(-\mathbf{m}\cdot\mathbf{p})$, such that the $\beta_d$ are compatible with all the maps $c_{d,d'}$ and $\tau^* c'_{d,d'}$.
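The degree condition in Definition 2.1 already restricts which types can occur: since $\deg E$ is an integer, an $r$-th root of $\omega_X$ of type $\mathbf{m}$ requires $r$ to divide $2g - 2 - \sum m_i$. The following is a minimal sketch of that bookkeeping, assuming $\deg \omega_X = 2g - 2$ for a curve of arithmetic genus $g$; the function names and the range $0 \le m_i \le r - 1$ used for enumeration are illustrative assumptions, not notation from the paper.

```python
from itertools import product


def root_degree(g, r, m):
    """Degree of an r-th root of omega_X of type m on a genus-g curve, or None.

    Uses r * deg(E) = deg(omega_X) - sum(m_i), with deg(omega_X) = 2g - 2.
    """
    total = 2 * g - 2 - sum(m)
    if total % r != 0:
        return None  # no integer degree, so no root of this type can exist
    return total // r


def admissible_types(g, r, n):
    """All tuples m with 0 <= m_i <= r - 1 (assumed range) passing the degree test."""
    return [m for m in product(range(r), repeat=n)
            if root_degree(g, r, m) is not None]
```

For instance, with $g = 1$, $r = 2$ and two marked points, only the types $(0, 0)$ and $(1, 1)$ pass the divisibility test.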


The foundational and important theorem of Jarvis is the following.

Proposition 2.1 (Jarvis). The moduli functor of the stable $n$-pointed $r$-spin curves of genus $g$ and type $\mathbf{m}$ is representable by a smooth proper Deligne-Mumford stack.

Following Jarvis, we denote the stack of the stable $n$-pointed $r$-spin curves of genus $g$ and type $\mathbf{m}$ by $\overline{\mathcal{M}}_{g,n}^{1/r,\mathbf{m}}$. The disjoint union m,0≤mi

1. Then the families of the norms $N_t(\phi)$ and $N_t(\phi')$ are mutually bounded independently of $t$, i.e., there is a constant $C > 1$ such that $C^{-1} N_t(\phi)(u) \le N_t(\phi')(u) \le C\, N_t(\phi)(u)$.

Proof. Let $f \cdot e(l)_z = g \cdot e(r - l)_w$ be a section of $F_t$. From the relation of the sections $e(l)_z$ and $e(r - l)_w$, we obtain the relation $|f| \cdot |t|^{el/r} \cdot |w|^{-1} = |g|$. Since $z \cdot w = t^e$, we obtain $|f|^2 \cdot |z|^{2l/r} = |g|^2 \cdot |w|^{2(r-l)/r}$. On the other hand, we have the relations $dz = -w^{-1} \cdot z \cdot dw$ and $d\bar{z} = -\bar{w}^{-1} \cdot \bar{z} \cdot d\bar{w}$ on $X_t$. Thus we have the following inequality for the integral on any subset $D$ contained in the region $\{(z, w, t) \mid C^{-1}|z| < |w| < C|z|\}$:

$$C^{-2} \int_D |f|^2 \cdot |z|^{(l)} \cdot \phi_z \cdot |dz\, d\bar{z}| \;\le\; \int_D |g|^2 \cdot |w|^{(r-l)} \cdot \phi_w \cdot |dw\, d\bar{w}| \;\le\; C^2 \int_D |f|^2 \cdot |z|^{(l)} \cdot \phi_z \cdot |dz\, d\bar{z}|.$$

Hence the norms are mutually bounded.
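The pointwise identity used in this proof can be checked in two lines. The block below is only a worked restatement, assuming the relations are read as $|g| = |f|\,|t|^{el/r}\,|w|^{-1}$ and $|z|\,|w| = |t|^{e}$ (the second follows from $z \cdot w = t^e$).

```latex
\begin{align*}
|g| &= |f|\,|t|^{el/r}\,|w|^{-1}
     = |f|\,\bigl(|z|\,|w|\bigr)^{l/r}\,|w|^{-1}
     = |f|\,|z|^{l/r}\,|w|^{l/r-1}, \\
|g|^{2}\,|w|^{2(r-l)/r}
    &= |f|^{2}\,|z|^{2l/r}\,|w|^{2l/r-2}\,|w|^{2-2l/r}
     = |f|^{2}\,|z|^{2l/r}.
\end{align*}
```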

3.3.9. Collapsing a curve. We continue to use the notation.

Lemma 3.20. Let $f \cdot e(l) = g \cdot e(r - l)$ be a section of $F_t$ on $X_t$ ($t \ne 0$). Then we have the following equality: z |f |2 · |z|(l) · dz · d z¯ = |g|2 · |w|(r−l) · · dw · d w¯ . (15) w

δ 2 e−αn . We claim that (P2) in Sect. 2.1 holds for this return even though $n$ may be greater than $N$. This is because the critical point that will guide $\hat{x}_n$ through its partial derivative recovery obeys (G1) and (G2) up to time $N$, and by the proof of (P2), the time it takes to complete this recovery is $< \frac{3}{\lambda}\alpha n \le N$. Indeed if we assume (G1) holds for $\hat{x}$ up to time $n$, $n \le \frac{1}{\alpha^*} N$, then on the time interval $[0, n]$, the orbit of $\hat{x}_1$ has the bound/free behavior described in Sect. 2.1. Moreover, by an argument identical to that for Corollary 2.1, we have $|(f^j)'(\hat{x}_1)| > K^{-1} e^{\frac{1}{4}\lambda j}$ for $j \le n$.

We remark that beyond time $\frac{1}{\alpha^*} N$, the dynamical description of $\hat{x}$ in the last paragraph ceases to be valid as soon as a bound period $> N$ is encountered. Conversely, the behaviors of other critical orbits beyond time $N$ do not impact the properties of $\hat{x}$ up to time $\frac{1}{\alpha^*} N$. In view of the discussion above, we modify Proposition 4.2 slightly as follows:

Proposition 4.2'. In addition to the hypotheses in Proposition 4.2, we assume that for some $\hat{x} \in C$ and $n \in (N, \frac{1}{\alpha^*} N]$, (G1) holds for $\hat{x}$ up to time $n$. Then the conclusion of Proposition 4.2 holds for this $\hat{x}$ for all $i \le n$.


5.2. Duality between phase-space and parameter-space dynamics.

Setting. Let $\lambda < \frac{1}{4}\lambda_0$ be as before. To establish the above-mentioned duality, new upper bounds are imposed on $\alpha$ and $\varepsilon$ (or equivalently $\hat\varepsilon$). Let $\Delta_N = \{a \in (-\hat\varepsilon, \hat\varepsilon) : f_a \in G_N(f_0; \lambda, \alpha, \varepsilon)\}$. For the rest of Sect. 5, we fix $\hat{x} \in C$. All parameters considered are assumed to be in $\Delta_N$; all indices considered are assumed to be $\le \frac{1}{\alpha^*} N$, and (G1) is assumed to hold for $\hat{x}$ for all the indices in question. We use the notation $\tau_i(a) := \frac{d}{da}\hat{x}_i(a)$.

Our main results are (P1')–(P3'), three properties of $a \mapsto \hat{x}_i(a)$ that are the analogs of (P1)–(P3) in Sect. 2.1. We state also two lemmas that lie at the heart of these properties. To avoid disrupting the flow of ideas, proofs are postponed to Sect. 5.3.

Lemma 5.1. Let $n > \hat{i}$, where $\hat{i}$ is as in Proposition 4.2. Then

$$\big(1 - K e^{-\frac{1}{4}\lambda n}\big)\, |(f_a^i)'(\hat{x}_n)| \;\le\; \frac{|\tau_{n+i}|}{|\tau_n|} \;\le\; \big(1 + K e^{-\frac{1}{4}\lambda n}\big)\, |(f_a^i)'(\hat{x}_n)|.$$

(P1') (Outside of $C_\delta$). There exists $i_0 \ge \hat{i}$ such that the following hold for $n \ge i_0$:
(i) If $\hat{x}_n$ is free, and $\hat{x}_{n+j} \notin C_\delta$ for all $0 \le j < j_0$, then $|\tau_{n+j}| > \frac{1}{2} c_1 \delta e^{\frac{1}{4}\lambda_0 j} |\tau_n|$ for $j \le j_0$;
(ii) if in addition $\hat{x}_{n+j_0} \in C_{\delta_0}$, then $|\tau_{n+j_0}| > \frac{1}{2} c_1 e^{\frac{1}{4}\lambda_0 j_0} |\tau_n|$.

The reader should think of $i_0$ as the time after which $x$- and $a$-derivatives are sufficiently close in the sense of Lemma 5.1. We assume $\hat{x}_i \notin C_{\frac{1}{2}\delta_0}$ for all $i < i_0$.

Consider next an interval $\omega \subset \Delta_N$ with $\hat{x}_n(\omega) \subset I^+_{\mu j}$. To establish the desired relationship between phase-space and parameter-space dynamics during the bound period, we impose the following additional upper bound on $\alpha$: Let $L$ be a Lipschitz constant of the map $G : (x, a) \mapsto (f_a(x), a)$. We assume $\alpha$ is small enough that

$$L^{\frac{3}{\lambda}\alpha} < e^{\frac{1}{8}\lambda}. \qquad (5)$$
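Reading condition (5) as $L^{(3/\lambda)\alpha} < e^{\lambda/8}$ and assuming $L > 1$, taking logarithms gives the explicit threshold $\alpha < \lambda^2 / (24 \log L)$. The sketch below only evaluates that threshold; the function names are illustrative, and the reading of (5) is itself an assumption about the damaged display.

```python
import math


def alpha_upper_bound(lam, lipschitz):
    """Supremum of alpha with lipschitz**((3/lam)*alpha) < exp(lam/8).

    Assumes lam > 0; if lipschitz <= 1 the condition holds for every alpha >= 0.
    """
    if lipschitz <= 1.0:
        return math.inf
    return lam ** 2 / (24.0 * math.log(lipschitz))


def satisfies_condition_5(alpha, lam, lipschitz):
    """Direct check of the inequality, compared on the logarithmic scale."""
    return (3.0 / lam) * alpha * math.log(lipschitz) < lam / 8.0
```

With hypothetical values $\lambda = 0.2$ and $L = 4$, the threshold is about $1.2 \times 10^{-3}$, which illustrates how strongly this bound can constrain $\alpha$ when $\lambda$ is small.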

For each $a \in \omega$, let $p_a$ denote the bound period of $I_{\mu j}$ with respect to $f_a$, and let $HD(\cdot, \cdot)$ denote the Hausdorff distance between two sets.

Lemma 5.2. Let $\omega$ and $\alpha$ be as above. Then the following hold for all $a \in \omega$: $HD(\hat{x}_{n+j}(\omega), f_a(\hat{x}_n(\omega)))$ $e^{\frac{\lambda \hat{p}}{4}} |\tau_n(a)|$ for all $a \in \omega$;
(d) if $\hat{x}_n(\omega) \approx I_{\mu j}$, then $|\hat{x}_{n+\hat{p}}(\omega)| \ge e^{-\frac{8\alpha}{\lambda}|\mu|}$.

To state (P3'), we divide each orbit in the time interval $[i_0, n]$ into bound and free periods as in Sect. 2.1, and say all $a \in \omega$ have the same itinerary up to time $n$ if (i) their bound and free periods coincide and (ii) whenever $\hat{x}_i(\omega)$ is free, it is $\subset \pi^+$ for some $\pi \in \mathcal{P}$.


(P3') (Global distortion). There exist $i_1 > i_0$ and $K_3 > 1$ such that if $n \ge i_1$ and all points in $\omega$ have the same itinerary through step $n - 1$, then for all $a, a' \in \omega$, $K_3^{-1}$

$i_1$, we assume that for all $j < i$, $\gamma_j$ are defined, as are $Q_j$, representing a canonical subdivision. We assume also that the notion of bound/free makes sense on each $\omega \in Q_j$. Consider now $\omega \in Q_{i-1}$ (on which $\gamma_{i-1} \ne *$). We first put on it the canonical partition $Q_i$ as defined in Sect. 3.1. On each $\hat\omega \in Q_i|_\omega$, there are 2 options: we either let $\gamma_i = \hat{x}_i$ on all of $\hat\omega$, or we let it $= *$ on all of $\hat\omega$. The rules are as follows:
(a) We are free to set $\gamma_i = *$ or $\hat{x}_i$ on any $\hat\omega$ for which $\hat{x}_i(\hat\omega)$ is outside of $C_\delta$.
(b) If $\hat{x}_i(\hat\omega) \subset C_\delta$, the following conditions must be met if we wish to set $\gamma_i|_{\hat\omega} = \hat{x}_i$:
(i) $\hat{x}_i(\hat\omega) \cap \{d(\cdot, C) < e^{-\alpha i}\} = \emptyset$;
(ii) $\hat\omega \subset \Delta_{\alpha^* i}$, i.e., $f_a \in G_{\alpha^* i}(f_0; \lambda, \alpha, \varepsilon)$ for all $a \in \hat\omega$.
Finally, we set $\gamma_i = *$ on $\{\gamma_{i-1} = *\}$. This completes our definition at the $i$-th step. Paragraph (2) is then repeated with $i + 1$ in the place of $i$.

We observe that the process above is well defined. It is clearly well defined initially. When $\gamma_i(\hat\omega) \subset C_\delta$, (2)(b) guarantees that for all $a \in \hat\omega$ for which $\gamma_i(a) \ne *$, the ensuing bound period is meaningful (see Sect. 5.1). Once this is taken care of, Sect. 5.2 gives the desired resemblance to phase-space dynamics until the next free return. We remark that even though $\{\gamma_i\}$ is associated with a particular $\hat{x} \in C$, Condition (2)(b)(ii) demands good behavior of all critical orbits up to time $\alpha^* i$. The fact that this requirement is only up to time $\alpha^* i$, which is

$K^{-1}\delta$. Not knowing the location of $\gamma_n(\omega)$, we assume the worst-case scenario, namely that $\gamma_n(\omega)$ crosses entirely a forbidden region $\{d(\cdot, \hat{y}) < e^{-\alpha n}\}$ for some $\hat{y} \in C$. Thus the fraction of $\omega$ with $d(\hat{x}_n, C) < e^{-\alpha n}$ is $< 2e^{-\alpha n} \cdot K\delta^{-1}$, which we may assume is $< K e^{-\frac{1}{2}\alpha n}$ (see the paragraph following Proposition 2.2). Here (P3') is used to transfer the ratio of lengths on $\gamma_n(\omega)$ back to $\omega$.

Case 2. $\gamma_{j_0}(\omega) \approx I_{\mu j}$. Let $p$ be the bound period initiated at time $j_0$. Observe first that $|\gamma_n(\omega)| > K^{-1} |\gamma_{j_0+p}(\omega)| > K^{-1} e^{-\frac{1}{10}|\mu|}$: Since $\gamma_n(\omega)$ is free (otherwise there would be no deletion), $n \ge j_0 + p$. The first inequality follows from (P1')(ii) combined (possibly) with (P2')(c); the second follows from (P2')(d). Observe also that by design, $|\mu| \le \alpha j_0 < \alpha n$, so the fraction of $\omega$ being estimated is again $< K e^{-\alpha n} e^{\frac{1}{10}\alpha n} < K e^{-\frac{1}{2}\alpha n}$.



6.3. Deletions on account of (G2). We begin with an estimate on derivative growth in terms of the time an orbit spends in bound periods initiated at returns to $C_{\hat\delta}$ for arbitrary $\hat\delta < \delta$. Consider $f \in G_N(f_0; \lambda, \alpha, \varepsilon)$ and $n \le \frac{1}{\alpha^*} N$. Let $x \in I$ be such that $d(x_i, C) \ge \min\{\frac{1}{2}\delta_0, e^{-\alpha i}\}$ for all $0 \le i < n$. By the reasoning in Sect. 5.1, the usual bound/free decomposition makes sense for the orbit of $x$ up to time $n$. Let $B(\hat\delta; n)$ denote the total number of $i$, $0 \le i \le n$, such that $x_i \in C_{\hat\delta}$ or it is in a bound period initiated from a visit to $C_{\hat\delta}$.

Lemma 6.2. Let $f$ and $x$ be as above. Given $\hat\delta \le \delta$ and $\sigma > 0$, if $B(\hat\delta; n) \le \sigma n$, then

$$|(f^n)'(x)| > K^{-1} \hat\delta\, e^{[(1-\sigma)\frac{1}{4}\lambda_0 - \alpha] n}.$$

Proof. Consider first the case where $x_n$ is free. Let $\hat{t}_1 < \hat{t}_1 + \hat{p}_1 \le \hat{t}_2 < \hat{t}_2 + \hat{p}_2 \le \cdots \le \hat{t}_k + \hat{p}_k \le n$, where $\hat{t}_1, \cdots, \hat{t}_k$ are the consecutive free return times to $\{d(\cdot, C) < \hat\delta\}$. Then

$$(f^n)'(x) = (f^{n - \hat{t}_k - \hat{p}_k})'(x_{\hat{t}_k + \hat{p}_k}) \cdot (f^{\hat{p}_k})'(x_{\hat{t}_k}) \cdot (f^{\hat{t}_k - \hat{t}_{k-1} - \hat{p}_{k-1}})'(x_{\hat{t}_{k-1} + \hat{p}_{k-1}}) \cdots (f^{\hat{t}_1})'(x).$$

We use (P1)(i) for $(f^{n - \hat{t}_k - \hat{p}_k})'(x_{\hat{t}_k + \hat{p}_k})$, (P1)(ii) for $(f^{\hat{t}_i - \hat{t}_{i-1} - \hat{p}_{i-1}})'(x_{\hat{t}_{i-1} + \hat{p}_{i-1}})$, and the trivial estimate $|(f^{\hat{p}_i})'(x_{\hat{t}_i})| > c_1^{-1}$ for growth during bound periods (see (P2)(ii)). This gives $|(f^n)'(x)| > K^{-1} \hat\delta e^{\frac{1}{4}\lambda_0 (1-\sigma) n}$ since $\hat{p}_1 + \cdots + \hat{p}_k \le \sigma n$ by assumption. The factor $-\alpha n$ is needed if $n$ is not free; see Corollary 2.1.
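The proof rests on splitting $(f^n)'(x)$ by the chain rule into factors over consecutive orbit segments (free stretches and bound periods) and estimating each factor separately. The sketch below only illustrates that splitting numerically; the quadratic map, the starting point and the cut times are hypothetical stand-ins, not the maps or return times of the paper.

```python
import math


def orbit(f, x, n):
    """Return [x_0, x_1, ..., x_n] with x_{i+1} = f(x_i)."""
    xs = [x]
    for _ in range(n):
        xs.append(f(xs[-1]))
    return xs


def iterate_derivative(df, xs, start, length):
    """(f^length)'(x_start) as the product of f'(x_i) for start <= i < start + length."""
    prod = 1.0
    for i in range(start, start + length):
        prod *= df(xs[i])
    return prod


a = 1.8                                # hypothetical parameter of x -> 1 - a x^2
f = lambda x: 1.0 - a * x * x
df = lambda x: -2.0 * a * x

n = 12
xs = orbit(f, 0.3, n)
full = iterate_derivative(df, xs, 0, n)

# Hypothetical cut times standing in for return times and ends of bound periods.
cuts = [0, 4, 7, n]
blocks = [iterate_derivative(df, xs, s, e - s) for s, e in zip(cuts, cuts[1:])]
product_of_blocks = math.prod(blocks)
```

Up to rounding, `full` and `product_of_blocks` agree, which is the identity displayed in the proof; the content of the lemma lies in the lower bounds applied to the individual factors.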

Corollary 6.1. Let the hypotheses be as in Lemma 6.2, with $x = \hat{x}_1$ for some $\hat{x} \in C$. We assume further that $d(\hat{x}_i, C) > \frac{1}{2}\delta_0$ for all $i \le n_0$, where $n_0$ is sufficiently large depending on $\hat\delta$. Then $B(\hat\delta; n) < \sigma n$ implies $|(f^n)'(\hat{x}_1)| > c_1 e^{[(1-\sigma)\frac{1}{4}\lambda_0 - \alpha] n}$.

Proof. The factor $\hat\delta$ is absorbed into the initial growth if $n > n_0$.

For $f \in G$, it can be deduced from properties of the invariant measure that n1 B(δ; ˆ n) ˆ dµ decreases with $\hat\delta$. In light of the duality in Sect. 5.2, one may expect a similar phenomenon for $a \mapsto \gamma_i(a)$. We formulate below a large deviation estimate useful for estimating the measure of parameters deleted on account of (G2).

Let $\{\gamma_i, i < n\}$ be as in Sect. 6.2. For $a$ such that $\gamma_n(a) \ne *$, let $B(a, \hat\delta; n)$ be the number $B(\hat\delta; n)$ defined above with $f = f_a$ and $x = \hat{x}$.

Proposition 6.1. Given any $\sigma > 0$, there exist positive numbers $\hat\varepsilon_1 = \hat\varepsilon_1(\sigma)$ and $\hat\delta = \hat\delta(\sigma)$ such that

$$|\{a \in \Delta_0 : \gamma_n \ne * \text{ and } B(a, \hat\delta; n) > \sigma n\}| < e^{-\hat\varepsilon_1 n} |\Delta_0|.$$

6.4. Large deviation estimate. We first state the analog of Lemma 3.3. Let $\hat\omega \in Q_{j_0}$ be such that $\gamma_{j_0}(\hat\omega) \ne *$ and is free. On $\hat\omega$ we define $\hat{S}$, a stopping time starting from $j_0$, as follows: We extend the process on $\hat\omega$ beyond time $j_0$, and for each $a \in \hat\omega$, let $k = k(a) > j_0$ be the first time when $\gamma_k(Q_{k-1}(a))$ is not in a bound period and has length $> \delta$. If such a $k$ exists, we set $\hat{S}(a) = k - j_0$. If $a$ is deleted before that happens, we set $\hat{S}(a) = 0$.



Lemma 6.3. Let $\hat\omega \in Q_{j_0}$ be such that $\gamma_{j_0}(\hat\omega)$ is free and $\approx I_{\mu j}$. Then

$$|\{a \in \hat\omega : \hat{S}(a) > m\}| < e^{-\frac{1}{2}K^{-1} m} |\hat\omega| \quad \text{for all } m > K \log |\mu|.$$

The proof is entirely parallel to that of Lemma 3.3 in Sect. 3.1.

Proof of Proposition 6.1. We take a probabilistic viewpoint, with underlying probability space $(\Delta_0, P)$, $P$ being normalized Lebesgue measure on $\Delta_0$. Let $\hat\delta > 0$ be a small number to be determined. Let $n$ be fixed. The idea is to introduce $X_i$ dominated by certain exponential random variables such that $B(a) := B(a, \hat\delta, n) \le \sum X_i(a)$.

Step I. Formulation of problem as one involving $\sum X_i$. For each $a \in \Delta_0$, we define $t_0 < t_1 < \cdots$ and $S_1, S_2, \ldots$ via the following algorithm, with the understanding that the algorithm terminates as soon as $\gamma_i(a) = *$ or time $n$ is reached. To get started, let $t_0$ be the smallest $j > 0$ such that $\gamma_j(Q_{j-1}(a)) \cap C_{\hat\delta} \ne \emptyset$.
(i) After $t_i$ is defined, we define $S_{i+1}$: If $Q_{t_i}(a) \cap C_{\hat\delta} = \emptyset$, set $S_{i+1} = 0$; if $Q_{t_i}(a) \subset C_{\hat\delta}$, let $S_{i+1} = \min(\hat{S}, n - t_i)$ where $\hat{S}$ is the stopping time above starting from $t_i$.
(ii) If $S_{i+1} = 0$, let $t_{i+1}$ be the smallest $j > t_i$ such that $\gamma_j(Q_{j-1}(a)) \cap C_{\hat\delta} \ne \emptyset$; define $t_{i+1}$ the same way if $S_{i+1} > 0$ except that $j$ is taken $\ge t_i + S_{i+1}$.

Suppose $t_i(a)$ is defined. Let $Q = Q_{t_i - 1}(a)$. Assuming $\hat\delta$ $\hat\delta^{\frac{1}{10}}$; (3) for all $a', a'' \in Q$, $\tau_{t_i}(a')/\tau_{t_i}(a'') < K$.

(1) is true because trajectories of critical curves in bound periods initiated outside of $C_{\hat\delta}$ cannot meet $C_{\hat\delta}$. If $S_i > 0$, it may happen that $t_i = t_{i-1} + S_i$, in which case $|\gamma_{t_i}(Q)| > \delta$ by definition. Otherwise we back up to time $t$ when $Q$ was first created as an element of some $Q_j$. Then $t_{i-1} + S_i \le t < t_i$, and $\gamma_t(Q) \cap C_{\hat\delta} = \emptyset$. If $\gamma_t(Q)$ is outside of $C_\delta$, then $|\gamma_{t_i}(Q)| > K^{-1}\delta$ by (P1'). If $\gamma_t(Q) \approx I_{\mu j}$ for some $I_{\mu j} \subset C_\delta \setminus C_{\hat\delta}$, then $|\gamma_{t_i}(Q)| > K^{-1} \hat\delta^{\frac{8\alpha}{\lambda}}$ by (P2')(d). In all cases, (2) holds assuming $\hat\delta$

$\sigma n\}$ decreases exponentially with $n$.

We define the following $\sigma$-algebras on $\omega$: Let $A_i$ be the set of $a$ for which $t_i$ is defined. Then $A_i \in \mathcal{F}_i$, and for $a \in A_i$, the atom of $\mathcal{F}_i$ containing $a$ is $Q_{t_i(a)-1}(a)$. For $a \notin A_i$, the atom of $\mathcal{F}_i$ containing $a$ is $Q_k(a)$, where $k$ is the last step before the algorithm above is terminated. One verifies that $\mathcal{F}_i$ so defined is a $\sigma$-algebra, that $\mathcal{F}_0 < \mathcal{F}_1 < \cdots < \mathcal{F}_n$, and that $X_i$ is measurable with respect to $\mathcal{F}_i$.

Step II. Large deviation estimate for $\sum_{1 \le i \le n} X_i$. First we compute the conditional distribution of $X_{i+1}$ given $\mathcal{F}_i$, $i \ge 0$. Consider $Q \in \mathcal{F}_i|_{A_i}$. (On $Q \in \mathcal{F}_i$ with $Q \cap A_i = \emptyset$, $X_{i+1} = 0$.) From (2) and (3) above, we have

(i) $P(X_{i+1} = 0 \mid Q) \ge 1 - K \hat\delta^{\frac{9}{10}}$.

For $I_{\mu j} \subset C_{\hat\delta}$, Lemma 6.3 together with (1) and (3) above give

(ii) $P(X_{i+1} > m \mid Q \cap \{\gamma_{t_i} \in I_{\mu j}\}) < K e^{-\frac{1}{2}K^{-1} m}$ if $m \ge K|\mu|$; no information otherwise.

Combining the last two estimates, we obtain for all $m \ge 0$,

$$P(X_{i+1} > m \mid Q) < K \hat\delta^{-\frac{1}{10}} \min(\hat\delta, e^{-K^{-1} m}) + K \hat\delta^{\frac{9}{10}} e^{-\frac{1}{2}K^{-1} m}. \qquad (8)$$

A simple computation then gives $E[e^{\rho X_{i+1}} \mid Q] < \infty$ if $\rho < \frac{1}{2}K^{-1}$ (where $K$ is as in the exponents above). We note further that by decreasing $\hat\delta$ (keeping $\rho$ fixed), $E[e^{\rho X_{i+1}} \mid Q]$ can be made arbitrarily close to 1. Let $\eta > 0$ be a number to be determined shortly, and choose $\hat\delta = \hat\delta(\eta)$ sufficiently small that $E[e^{\rho X_{i+1}} \mid Q] < e^\eta$. Observing that the upper bound in (8) and hence that for $E[e^{\rho X_{i+1}} \mid Q]$ do not depend on $i$ or on $Q$, we conclude that with the choices of $\rho$, $\eta$ and $\hat\delta$ above, $E[e^{\rho X_{i+1}} \mid \mathcal{F}_i] < e^\eta$ for every $i \ge 0$.

To finish, we observe that

$$E\big[e^{\rho \sum_{1 \le i \le n} X_i}\big] = E\big[E[e^{\rho \sum_{1 \le i \le n} X_i} \mid \mathcal{F}_{n-1}]\big] = E\big[e^{\rho \sum_{1 \le i \le n-1} X_i}\, E[e^{\rho X_n} \mid \mathcal{F}_{n-1}]\big] \le e^\eta\, E\big[e^{\rho \sum_{1 \le i \le n-1} X_i}\big],$$

giving inductively $E[e^{\rho \sum_{1 \le i \le n} X_i}] \le e^{n\eta}$. We arrive, therefore, at the estimate

$$P\{B > \sigma n\} \le P\Big\{\sum_{1 \le i \le n} X_i > \sigma n\Big\} < e^{\eta n - \rho\sigma n}.$$

This is $< e^{-\frac{1}{2}\rho\sigma n}$ if $\eta$ is chosen $< \frac{1}{4}\rho\sigma$.
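The last step is an exponential Markov (Chernoff-type) bound: a uniform bound on the exponential moments turns into an exponentially small tail for the sum. The toy computation below compares that bound with an exact tail probability. It assumes independent, identically distributed $X_i$ with a made-up distribution, which is stronger than the conditional bounds used in the proof; all names and numbers are illustrative only.

```python
import math


def chernoff_bound(pmf, rho, sigma, n):
    """exp(n * (log E[e^{rho X}] - rho * sigma)), an upper bound for P(sum X_i > sigma n)."""
    mgf = sum(p * math.exp(rho * k) for k, p in enumerate(pmf))
    return math.exp(n * (math.log(mgf) - rho * sigma))


def exact_tail(pmf, sigma, n):
    """Exact P(X_1 + ... + X_n > sigma * n) for i.i.d. X_i, by repeated convolution."""
    dist = [1.0]
    for _ in range(n):
        new = [0.0] * (len(dist) + len(pmf) - 1)
        for s, ps in enumerate(dist):
            for k, pk in enumerate(pmf):
                new[s + k] += ps * pk
        dist = new
    return sum(p for s, p in enumerate(dist) if s > sigma * n)


# Toy law: X = 0 with probability 0.9, otherwise 1..5 with geometric weights.
weights = [0.5 ** k for k in range(1, 6)]
pmf = [0.9] + [0.1 * w / sum(weights) for w in weights]
rho, sigma = 0.5, 0.5
```

For this toy law the moment generating function at $\rho = 0.5$ is small enough that the exponent $\log E[e^{\rho X}] - \rho\sigma$ is negative, so the bound decays exponentially in $n$, mirroring the role played by $\eta - \rho\sigma$ above.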

References: A version of Sects. 6.2 and 6.3 is used in [BC2]; Sect. 6.4 is taken from [WY2].

7. Positive Measure Sets of Good Parameters

7.1. Preliminary definitions and choices.

1. We fix $\lambda \le \frac{1}{5}\lambda_0$.

2. Augmented versions of (G1) and (G2). For reasons to become clear, it will be advantageous to put our good maps "deeper inside" $G_N(f_0; \lambda, \alpha, \varepsilon)$. We say $\hat{x} \in C$ satisfies (G1)# and (G2)# up to time $N$ if for all $1 \le i \le N$,
(G1)# $d(\hat{x}_i, C) > \min(\frac{1}{2}\delta_0, 2e^{-\alpha i})$;
(G2)# $|(f^i)'(\hat{x}_1)| > 2c_1 e^{\lambda_1 i}$ where $\lambda_1 = \lambda + \frac{1}{100}\lambda_0$,
and say $f \in G_N^{\#}(f_0; \lambda, \alpha, \varepsilon)$ if all $\hat{x} \in C$ satisfy (G1)# and (G2)# up to time $N$. Clearly, $G_N^{\#}(f_0; \lambda, \alpha, \varepsilon) \subset G_N(f_0; \lambda, \alpha, \varepsilon)$. The proof of the following lemma is straightforward.

Lemma 7.1. There exists $K_4 > 1$ for which the following holds: If $f_{\hat{a}} \in G_N^{\#}(f_0; \lambda, \alpha, \varepsilon)$, then for all $n \le N$, $f_a \in G_n(f_0; \lambda, \alpha, \varepsilon)$ for all $a \in [\hat{a} - K_4^{-n}, \hat{a} + K_4^{-n}]$.

3. Choice of $\alpha$. In addition to $\alpha < \frac{1}{100}\lambda$, we impose two upper bounds on $\alpha$: The first is introduced in (5) in Sect. 5.2; the second is (9) in Sect. 7.2. With $\lambda$ and $\alpha$ fixed, $G_N(f_0; \lambda, \alpha, \varepsilon)$ and $G_N^{\#}(f_0; \lambda, \alpha, \varepsilon)$ will be abbreviated as $G_N$ and $G_N^{\#}$ from here on.

4. Choices of $\sigma$ and $\hat\delta$. We need $\sigma$ to be small enough that the exponent in Corollary 6.1, namely $(1 - \sigma)\frac{1}{4}\lambda_0 - \alpha$, is $> \lambda_1$. (For example, $\sigma = \frac{1}{100}$ will do.) We then let $\hat\delta$ be given by Proposition 6.1 with $\frac{1}{2}\sigma$ in the place of $\sigma$.

5. The start-up interval $\Delta_0$. We choose $\Delta_0 \subset (-\hat\varepsilon, \hat\varepsilon)$ to contain 0 and to be short enough that for some $n_0$ sufficiently large, $d(\hat{x}_i, C) > \frac{1}{2}\delta_0$ for all $i \le n_0$, $\hat{x} \in C$ and $a \in \Delta_0$. A number of impositions on $n_0$ have been made; see, for example, Sects. 2.1 and 5.2, and Corollary 6.1. There will be more in the next two pages.

7.2. Inductive construction of $\Delta$. We seek to construct a sequence of sets $\Delta_0 \supset \Delta_{n_0} \supset \Delta_{2n_0} \supset \Delta_{2^2 n_0} \supset \cdots$ in parameter space with the properties that (i) for each $\ell$, $\{f_a, a \in \Delta_{2^\ell n_0}\} \subset G^{\#}_{2^\ell n_0}$ and (ii) $\Delta := \cap_{\ell \ge 0} \Delta_{2^\ell n_0}$ has positive Lebesgue measure. The rules of construction are detailed below; the measure estimate is given in Sect. 7.3.

Overview of procedure. Let $C := \{\hat{x}^1, \hat{x}^2, \ldots, \hat{x}^q\}$. Associated with each $\hat{x}^k$, we define a process $\{\gamma_i^k, i < \infty\}$ in the sense of Sect. 6.1 with the property that for every $a$ such that $\gamma^k_{2^\ell n_0}(a) \ne *$, $\hat{x}^k(a)$ satisfies (G1)# and (G2)# up to time $2^\ell n_0$. We then let

$$\Delta^k_{2^\ell n_0} := \{\gamma^k_{2^\ell n_0} \ne *\} \quad \text{and} \quad \Delta_{2^\ell n_0} := \cap_{1 \le k \le q} \Delta^k_{2^\ell n_0}.$$

It follows that $f_a \in G^{\#}_{2^\ell n_0}$ for every $a \in \Delta_{2^\ell n_0}$.

The processes $\gamma_i^k$ are updated in $N$-to-$2N$ cycles, $N = 2^\ell n_0$, $\ell = 1, 2, \ldots$. Within each cycle, we first update each of the $q$ processes individually, i.e., extend $\gamma_i^k$ from $i = N$ to $i = 2N$. At the end of this updating, we reset some of the values of $\gamma^k_{2N}$ to $*$ to reflect the combined status of all $q$ processes before moving to the next cycle.

Remarks. It is absolutely essential to take inventory of the global picture at regular time intervals (as we do at times $2^\ell n_0$). Other than that, the precise order in which $\gamma_i^k$ is updated is unimportant. Also, the number "2" has little significance: all that is needed is a relation with $\alpha$ that gives (9) below.

Getting started. Let $n_1^k$ be the smallest $i > 0$ such that $\hat{x}_i^k(\Delta_0) \cap C_{\frac{1}{2}\delta_0} \ne \emptyset$. Then $n_1^k > n_0$, and $|\hat{x}^k_{n_1^k}(\Delta_0)| > \frac{1}{2}\delta_0$. Since δ

ρ > 0 are arbitrary fixed constants. Theorem 1 in

2 A subsequence $\{\lambda_{j_i}\}_i$ is of full density if $\lim_{\lambda \to \infty} \#\{i : \lambda_{j_i} < \lambda\} / \#\{j : \lambda_j < \lambda\} = 1$.
3 Hecke eigenstates are simultaneous eigenfunctions of the Laplacian and all Hecke operators. If the spectrum of the Laplacian is simple, as conjectured e.g. for the modular surface, any eigenfunction of the Laplacian is a Hecke eigenstate.
4 A sequence of probability measures $d\nu_j$ is tight if for any $\epsilon > 0$ there is a compact domain $K \subset D$ such that $\limsup_{j \to \infty} \int_{D-K} d\nu_j < \epsilon$.


Fig. 1. Leaky Sinai billiard

Fig. 2. Leaky Bunimovich billiard

Fig. 3. Leaky polygonal billiard

Sect. 3 implies that there is a constant $C > 0$ such that (at least) one of the following two statements is true:

There is a subsequence of eigenfunctions $\varphi_{j_i}$ ($i = 1, 2, \ldots$) with eigenvalues $\lambda_{j_i} \in \pi^2 i^{2(1+\sigma)} + [-C i^{-2\rho}, C i^{-2\rho}]$ and some $c > 0$ such that for any compact $K \subset D$ we have

$$\liminf_{i \to \infty} \int_{D-K} d\nu_{j_i} > c. \qquad (1.8)$$

The number of eigenvalues $\lambda_j$ in the interval $\pi^2 i^{2(1+\sigma)} + [-C i^{-2\rho}, C i^{-2\rho}]$ is unbounded as $i \to \infty$.

The first statement implies that eigenfunctions lose a positive proportion of mass. The second alternative implies extreme level clustering; this seems unlikely for a generic billiard of the above type, but cannot a priori be ruled out. To get a rough idea on whether to expect more level clustering than in the case of compact domains $D$, we show in Sect. 5 that the spectral counting function has the asymptotics (Theorem 2)

$$\#\{j : \lambda_j < \lambda\} = \frac{\operatorname{Area}(D)}{4\pi}\lambda - \frac{L(\lambda)}{4\pi}\sqrt{\lambda} + \frac{1}{2\pi}\sqrt{\lambda} \sum_{\substack{i=1 \\ \delta_i\sqrt{\lambda} > \pi}}^{\infty} \ell_i \sum_{r=1}^{\infty} \frac{1}{r} J_1\big(2 r \delta_i \sqrt{\lambda}\big) + O(\sqrt{\lambda}), \qquad (1.9)$$

where

$$L(\lambda) = 2 \sum_{\substack{i=1 \\ \delta_i\sqrt{\lambda} > \pi}}^{\infty} \ell_i \qquad (1.10)$$


is an 'effective length' of the boundary $\partial D$ and $J_1$ is the $J$-Bessel function. The fluctuations are therefore larger than in the compact case, where the error term is of order $O(\sqrt{\lambda})$; cf. Sect. 5 for a more detailed discussion.

The proof of Theorem 1 is elementary and based on the construction of 'bouncing ball' quasimodes [17, 1, 31, 13, 6–8, 33, 18] (see also Bogomolny and Schmit's recent work on eigenfunctions in pseudo-integrable billiards [5]). The non-compactness of the domain allows for quasimodes with discrepancy almost as small as $O(\mu^{-1})$, where $\mu$ is the quasi-eigenvalue. The best rigorous bound for the discrepancy in the compact case is $O(1)$, cf. [13].

Our construction is completely independent of the choice of $f$ on the interval $(0, a_1)$, and one may use this additional freedom to try and tune $f$ on $(0, a_1)$ in such a way that the billiard flow on $D$ is ergodic. It seems plausible that this is possible if the billiard flow on the restricted compact region $D_0 = \{(x, y) \in \mathbb{R}^2 : 0 < x < a_1,\ 0 < y < f(x)\}$ is ergodic (as in the examples displayed in Figs. 1 and 2), but to the best of my knowledge there are no rigorous results in this direction (see however [23, 24, 16] for proofs of ergodicity for different classes of non-compact domains). A further interesting class of examples are infinite pseudo-integrable billiards (Fig. 3) that are known to be ergodic5 for almost all initial directions [12].

2. Quasimodes

A function $\psi \in H_0^1(D)$ is called a quasimode for $-\Delta$ with quasi-eigenvalue $\mu$ and discrepancy $\epsilon$, if

$$\|(\Delta + \mu)\psi\| \le \epsilon \|\psi\|, \qquad \psi|_{\partial D} = 0, \qquad (2.1)$$

where $\|\cdot\|$ denotes the $L^2$ norm. A sequence of quasimodes $\{\psi_i\}_i$ with quasi-eigenvalues $\mu_i$ is of order $s$, if

$$\|(\Delta + \mu_i)\psi_i\| = O(\mu_i^{-s/2})\, \|\psi_i\|. \qquad (2.2)$$

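We summarize a few important properties of quasimodes; more details can be found in [9, 22, 13, 33]. By expanding $\psi$ in an orthonormal basis of eigenfunctions, $\psi = \sum_j \langle\psi, \varphi_j\rangle \varphi_j$, it is easy to see that (2.1) implies

$$\sum_j |\langle\psi, \varphi_j\rangle|^2 (\lambda_j - \mu)^2 \le \epsilon^2 \|\psi\|^2 = \epsilon^2 \sum_j |\langle\psi, \varphi_j\rangle|^2. \qquad (2.3)$$

Hence $|\lambda_j - \mu| \le \epsilon$ for at least one $j$, i.e., there is at least one eigenvalue $\lambda_j$ in the interval $[\mu - \epsilon, \mu + \epsilon]$. Consider the larger interval $J = [\mu - b\epsilon, \mu + b\epsilon]$, $b > 1$. We have

$$\sum_{\lambda_j \notin J} |\langle\psi, \varphi_j\rangle|^2 \le (b\epsilon)^{-2} \sum_{\lambda_j \notin J} |\langle\psi, \varphi_j\rangle|^2 (\lambda_j - \mu)^2 \le b^{-2} \|\psi\|^2. \qquad (2.4)$$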
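Both consequences of (2.1) can be checked on a finite-dimensional stand-in. The sketch below replaces $-\Delta$ by a diagonal operator with a hypothetical list of eigenvalues, takes a vector $\psi$ with hypothetical coefficients $\langle\psi, \varphi_j\rangle$, computes the smallest admissible discrepancy, and then verifies that some eigenvalue lies within that distance of $\mu$ and that the mass outside $J$ is at most $b^{-2}$. Names and numbers are illustrative assumptions.

```python
import math


def discrepancy(eigs, coeffs, mu):
    """||(A - mu) psi|| / ||psi|| for psi = sum_j c_j phi_j and A phi_j = eigs[j] phi_j."""
    num = sum(c * c * (lam - mu) ** 2 for lam, c in zip(eigs, coeffs))
    den = sum(c * c for c in coeffs)
    return math.sqrt(num / den)


def mass_outside(eigs, coeffs, mu, radius):
    """Fraction of ||psi||^2 carried by eigenvalues with |lam - mu| > radius."""
    den = sum(c * c for c in coeffs)
    out = sum(c * c for lam, c in zip(eigs, coeffs) if abs(lam - mu) > radius)
    return out / den


eigs = [1.0, 2.5, 2.7, 4.0, 9.0]        # hypothetical spectrum
coeffs = [0.1, 0.7, 0.6, 0.2, 0.05]     # hypothetical coefficients of psi
mu = 2.6
eps = discrepancy(eigs, coeffs, mu)
b = 3.0

nearest = min(abs(lam - mu) for lam in eigs)   # distance from mu to the spectrum
outside = mass_outside(eigs, coeffs, mu, b * eps)
```

Here `nearest <= eps` is the statement that $[\mu - \epsilon, \mu + \epsilon]$ contains an eigenvalue, and `outside <= b**-2` is (2.4) after dividing by $\|\psi\|^2$.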

Since the modulus of the momentum components in both x- and y-directions are constants of motion, ergodicity is here understood with respect to a two-dimensional submanifold of the unit cotangent bundle.


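For a domain $A \subset D$ define

$$\|\psi\|_A = \Big(\int_A |\psi(x, y)|^2\, dx\, dy\Big)^{1/2}.$$

Triangle and Cauchy-Schwarz inequality imply

$$\|\psi\|_A \le \Big\|\sum_{\lambda_j \in J} \langle\psi, \varphi_j\rangle \varphi_j\Big\|_A + \Big\|\sum_{\lambda_j \notin J} \langle\psi, \varphi_j\rangle \varphi_j\Big\|_A \qquad (2.5)$$

$$\le \Big(\sum_{\lambda_j \in J} |\langle\psi, \varphi_j\rangle|^2\Big)^{1/2} \Big(\sum_{\lambda_j \in J} \|\varphi_j\|_A^2\Big)^{1/2} + \Big(\sum_{\lambda_j \notin J} |\langle\psi, \varphi_j\rangle|^2\Big)^{1/2}$$

$$\le \|\psi\| \Big(\sum_{\lambda_j \in J} \|\varphi_j\|_A^2\Big)^{1/2} + \Big(\sum_{\lambda_j \notin J} |\langle\psi, \varphi_j\rangle|^2\Big)^{1/2}, \qquad (2.6)$$

and hence, together with (2.4),

$$\Big(\sum_{\lambda_j \in J} \|\varphi_j\|_A^2\Big)^{1/2} \ge \frac{\|\psi\|_A}{\|\psi\|} - b^{-1}. \qquad (2.7)$$

Now suppose that for a sequence of quasimodes $\psi_i$ with quasi-eigenvalue $\mu_i$ and discrepancy $\epsilon_i$

the intervals $J_i = [\mu_i - b\epsilon_i, \mu_i + b\epsilon_i]$ each contain at most $k$ eigenvalues $\lambda_j$. $\qquad$ (2.8)

Then, in each interval $J_i$ there is a $\lambda_{j_i}$ such that

$$\|\varphi_{j_i}\|_A \ge \frac{1}{\sqrt{k}} \Big(\frac{\|\psi_i\|_A}{\|\psi_i\|} - b^{-1}\Big). \qquad (2.9)$$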

3. Leaky Domains

Let $f : (0, \infty) \to (0, \infty)$ be a right-continuous function, monotonically decreasing to 0 on the half-line $[a_1, \infty)$ (for some $a_1 > 0$), and $\int f(x)\,dx < \infty$. We are interested in the domain $D = \{(x, y) \in \mathbb{R}^2 : x > 0,\ 0 < y < f(x)\}$. In the following we will assume that $f$ is chosen so that

$$\int_{a_1}^{\infty} f(x)\, h(\pi^2 f(x)^{-2})\, dx < \infty, \qquad (3.1)$$

where $h : [0, \infty) \to [0, \infty)$ is a fixed increasing function bounded by $h(x) \le \sqrt{x}$. The central result is the following.6

6 The notation $A \ll B$ for two positive quantities $A, B$ means there is a constant $C > 0$ such that $A \le CB$. We write $A \asymp B$ if $A \ll B \ll A$.


Theorem 1. For any given decreasing function $\tau : [0,\infty) \to (0,\infty)$, and any infinite sequence of real numbers

\[ 0 < \mu_1 \le \mu_2 \le \dots \to \infty \tag{3.2} \]

satisfying

\[ \sum_{i=1}^{\infty} \tau(\mu_i) < \infty, \tag{3.3} \]

there is a domain $D$ of the above type whose Dirichlet Laplacian has an infinite sequence of quasimodes $\psi_{i,m,n}$ with quasi-eigenvalues

\[ \mu_{i,m,n} = n^2 \mu_i + m^2 \xi_i, \qquad i, m, n \in \mathbb{N}, \tag{3.4} \]

and

\[ \xi_i \asymp \frac{h(\mu_i)^2}{\mu_i\, \tau(\mu_i)^2}, \tag{3.5} \]

so that

(i) $\|(\Delta + \mu_{i,m,n})\psi_{i,m,n}\| = O(m\xi_i)\, \|\psi_{i,m,n}\|$,
(ii) $\langle \psi_{i,m,n}, \psi_{i',m',n'} \rangle = 0$ for $i \ne i'$ or $n \ne n'$,
(iii) $|\langle \psi_{i,m,n}, \psi_{i,m',n} \rangle| \ll \min\{0.001,\ |m - m'|^{-1}\}\, \|\psi_{i,m,n}\|\, \|\psi_{i,m',n}\|$ for $m \ne m'$,
(iv) for any compact set $K \subset D$,

\[ \frac{\|\psi_{i,m,n}\|_{D-K}}{\|\psi_{i,m,n}\|} \to 1 \tag{3.6} \]

uniformly for all $m, n \in \mathbb{N}$ as $i \to \infty$.

Remark 1.1. Note that the set $\{\mu_{i,m,n} : i, m, n \in \mathbb{N}\}$ is a discrete subset of $\mathbb{R}_+$, with mean density

\[ \lim_{\lambda \to \infty} \frac{\#\{(i,m,n) : \mu_{i,m,n} < \lambda\}}{\lambda} = \frac{C}{4\pi}, \tag{3.7} \]

where

\[ C = \pi^2 \sum_i \frac{1}{\sqrt{\mu_i \xi_i}} \le \mathrm{Area}(D). \tag{3.8} \]

This may either be verified directly, or concluded from the observation (cf. Sects. 4 and 6) that $\{\mu_{i,m,n}\}$ can be identified with the spectrum of the Dirichlet Laplacian on an infinite union of rectangles $D_i$ with sides $\ell_i = \pi \xi_i^{-1/2}$, $\delta_i = \pi \mu_i^{-1/2}$, and thus total area $C = \sum_i \mathrm{Area}(D_i)$. In this interpretation, (3.7) represents Weyl's law (1.3).
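The identification in Remark 1.1 can be checked numerically. The following Python sketch is only an illustration: the function names and the sample values of $\mu_i$ and $\xi_i$ are assumptions, not part of the paper. It computes the sides $\ell_i = \pi\xi_i^{-1/2}$, $\delta_i = \pi\mu_i^{-1/2}$ of a rectangle and compares the Dirichlet eigenvalue $\pi^2[(m/\ell_i)^2 + (n/\delta_i)^2]$ with the quasi-eigenvalue (3.4).

```python
import math


def rectangle_sides(mu_i, xi_i):
    """Sides (l_i, delta_i) of the rectangle D_i, as in Remark 1.1."""
    return math.pi / math.sqrt(xi_i), math.pi / math.sqrt(mu_i)


def quasi_eigenvalue(mu_i, xi_i, m, n):
    """Quasi-eigenvalue mu_{i,m,n} = n^2 mu_i + m^2 xi_i from (3.4)."""
    return n ** 2 * mu_i + m ** 2 * xi_i


def dirichlet_eigenvalue(length, height, m, n):
    """Dirichlet eigenvalue of a length x height rectangle for mode (m, n)."""
    return math.pi ** 2 * ((m / length) ** 2 + (n / height) ** 2)


if __name__ == "__main__":
    mu_i, xi_i = 50.0, 0.02  # hypothetical sample values
    length, height = rectangle_sides(mu_i, xi_i)
    print(dirichlet_eigenvalue(length, height, 3, 2))
    print(quasi_eigenvalue(mu_i, xi_i, 3, 2))
    # Area of D_i, which enters the constant C in (3.8).
    print(length * height, math.pi ** 2 / math.sqrt(mu_i * xi_i))
```

The two printed eigenvalues should agree up to rounding, as should the two expressions for the area of $D_i$.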


Remark 1.2. If assumption (2.8) holds e.g. for the quasimodes $\psi_{i,1,1}$, Eqs. (2.9) and (3.6) imply there is an infinite sequence of eigenfunctions $\varphi_{j_i}$ with $\|\varphi_{j_i}\| = 1$, such that for any compact $K \subset D$,

\[ \liminf_{i \to \infty} \|\varphi_{j_i}\|_{D-K} \ge \frac{1 - b^{-1}}{\sqrt{k}}. \tag{3.9} \]

That is, the eigenstates $\varphi_{j_i}$ lose a positive proportion of mass. It should be stressed that we have not ruled out the probably very remote possibility that assumption (2.8) with $\epsilon_i = O(m\xi_i)$ can never be satisfied for the domains $D$ considered in the theorem (an explicit construction of $D$ is given in Sect. 4). It would be interesting to see whether (2.8) can be established at least for generic choices of such $D$, i.e., generic choices of $\delta_i$. In Sect. 5 we will prove an upper bound for the error term in Weyl's law, which in turn yields a rough estimate on possible level clustering.

Remark 1.3. For $m, n$ bounded as $i \to \infty$ the theorem establishes quasimodes with very small discrepancy,

\[ \|(\Delta + \mu_{i,m,n})\psi_{i,m,n}\| = O\Big( \frac{h(\mu_{i,m,n})^2}{\mu_{i,m,n}\, \tau(\mu_{i,m,n})^2} \Big) \|\psi_{i,m,n}\|. \tag{3.10} \]

Since $h$ and $\tau$ can be arbitrarily slowly increasing/decreasing functions (respectively), this yields quasimodes of order arbitrarily close to 2; cf. Example 1.1 below. The number of such quasimodes with $\mu_{i,m,n} < \lambda$,

\[ N_{bb}(\lambda) = \#\{(i,m,n) : m, n = O(1),\ \mu_{i,m,n} < \lambda\} \asymp \#\{i : \mu_i < \lambda\}, \tag{3.11} \]

is determined by the restriction that

\[ \int \tau(\lambda)\, dN_{bb}(\lambda) < \infty. \tag{3.12} \]

Hence the higher the desired accuracy of quasimodes (achieved by choosing a sufficiently slowly decreasing $\tau$), the thinner the corresponding sequence of quasimodes becomes.

Remark 1.4. The theorem also implies that there can be sequences of quasimodes of order zero that have almost full density. 'Order zero' means that

\[ \|(\Delta + \mu_{i,m,n})\psi_{i,m,n}\| = O(1)\, \|\psi_{i,m,n}\|, \tag{3.13} \]

i.e., mξi ≤ C1 for some constant C1 > 0. Since in view of (3.5) there is a constant C2 > 0 such that ξi µi ≥ C2 , we have NBB (λ) = #{(i, m, n) : µi,m,n = n2 µi + m2 ξi < λ, mξi ≤ C1 } 2 C λ C 1 ≥ # (i, m, n) : n2 < − 1, m≤ µi C2 ξi √ √ µi τ (µi )2 . λ h(µi )2 µi 2(α + β). That is, ( + µi,m,n )ψi,m,n = O(mµ−1+σ i,m,n ) ψi,m,n ,

(3.17)

The fact that (3.16) implies (3.3) with $\tau(x) = x^{-\alpha'}$ ($\alpha' > \alpha$) is seen by summation by parts. In view of Weyl's law (1.3) and the small discrepancy $O(\mu_{i,m,n}^{-1+\sigma})$ for bounded $m$, a failure of assumption (2.8) would imply an extreme clustering of eigenvalues. As we shall see in Sect. 5, the bounds on the error term in Weyl's law worsen as $\sigma \to 0$, and hence clustering cannot be ruled out. An evaluation of the lower bound for the number of order-zero quasimodes in (3.14) yields

\[ N_{BB}(\lambda) \gg \lambda^{\theta}, \tag{3.18} \]

with $\theta = \max\{1 + \alpha - 2\alpha' - 2\beta,\ 1/2\}$. Note that $\theta$ can be arbitrarily close to 1 for suitable parameter choices.

Example 1.2. A second interesting choice that yields a domain $D$ with exponentially narrow cusps is $h(x) = \sqrt{x}/\log^{\gamma}(1 + x)$ with $\gamma > 0$. For any given infinite sequence of real numbers $\mu_i$ with

\[ \#\{j : \mu_j \le \lambda\} \asymp \log^{\alpha} \lambda, \tag{3.19} \]

there is a domain $D$ with $\int |\log f(x)|^{-\gamma}\, dx < \infty$, so that

\[ \|(\Delta + \mu_{i,m,n})\psi_{i,m,n}\| = O(m \log^{-\sigma} \mu_{i,m,n})\, \|\psi_{i,m,n}\|, \tag{3.20} \]

for any fixed $\sigma < 2(\gamma - \alpha)$. Choose here $\tau(x) = \log^{-\alpha'} x$ with $\alpha' > \alpha$, and (3.3) can again be checked using summation by parts. In this case the number of order-zero quasimodes is bounded from below by

\[ N_{BB}(\lambda) \gg \sqrt{\lambda}. \tag{3.21} \]


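4. Proof of Theorem 1

We begin by constructing accurate quasimodes on the rectangle $[a, a + \ell] \times [0, \delta]$ with Dirichlet boundary conditions at $y = 0, \delta$. Let $\chi \in C_0^\infty(\mathbb{R})$ be a mollified characteristic function of the interval $[0, 1]$. That is, $0 \le \chi(x) \le 1$, $\chi(x) = 0$ for $x \notin [0, 1]$ and $\chi(x) = 1$ for $x \in [\epsilon, 1 - \epsilon]$ for some fixed, small $\epsilon > 0$. We assume also that $\chi'(x) = O(\epsilon^{-1})$ (such a choice is always possible). For $m, n \in \mathbb{N}$, $a \in \mathbb{R}$ and $\ell, \delta > 0$ put

\[ \psi_{m,n}(x, y) = \chi\Big(\frac{x - a}{\ell}\Big) \sin\Big(\frac{\pi m (x - a)}{\ell}\Big) \sin\Big(\frac{\pi n y}{\delta}\Big) \tag{4.1} \]

and

\[ \mu_{m,n} = \pi^2 \Big[ \Big(\frac{m}{\ell}\Big)^2 + \Big(\frac{n}{\delta}\Big)^2 \Big]. \tag{4.2} \]

Straightforward differentiation yields

\[ (\Delta + \mu_{m,n})\psi_{m,n}(x, y) = \frac{1}{\ell^2} \Big[ 2\pi m\, \chi'\Big(\frac{x - a}{\ell}\Big) \cos\Big(\frac{\pi m (x - a)}{\ell}\Big) + \chi''\Big(\frac{x - a}{\ell}\Big) \sin\Big(\frac{\pi m (x - a)}{\ell}\Big) \Big] \sin\Big(\frac{\pi n y}{\delta}\Big), \tag{4.3} \]

and hence

\[ \|(\Delta + \mu_{m,n})\psi_{m,n}\|^2 = O_\chi\Big( \frac{m^2 \delta}{\ell^3} \Big), \tag{4.4} \]

where the implied constant only depends on the choice of $\chi$. Because of this and

\[ \|\psi_{m,n}\|^2 = \frac{\ell\delta}{4} (1 + O(\epsilon)), \tag{4.5} \]

we obtain

\[ \|(\Delta + \mu_{m,n})\psi_{m,n}\| = O_\chi\Big( \frac{m}{\ell^2} \Big) \|\psi_{m,n}\|. \tag{4.6} \]

Furthermore, for $n \ne n'$ we have $\langle \psi_{m,n}, \psi_{m',n'} \rangle = 0$, and for $n = n'$, $m \ne m'$,

\begin{align}
\langle \psi_{m,n}, \psi_{m',n} \rangle &= \frac{\delta}{2} \int_0^{\ell} \chi\Big(\frac{x}{\ell}\Big)^2 \sin\Big(\frac{\pi m x}{\ell}\Big) \sin\Big(\frac{\pi m' x}{\ell}\Big)\, dx \notag \\
&= \frac{\delta}{2} \int_0^{\ell} \Big[ \chi\Big(\frac{x}{\ell}\Big)^2 - 1 \Big] \sin\Big(\frac{\pi m x}{\ell}\Big) \sin\Big(\frac{\pi m' x}{\ell}\Big)\, dx \notag \\
&= \frac{\ell\delta}{4} \Big( \int_0^{\epsilon} + \int_{1-\epsilon}^{1} \Big) [\chi(x)^2 - 1]\, [\cos(\pi(m - m')x) - \cos(\pi(m + m')x)]\, dx \notag \\
&= \frac{\ell\delta}{4}\, O(\epsilon). \tag{4.7}
\end{align}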


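On the other hand, using integration by parts, we have

\[ \int_0^{\epsilon} [\chi(x)^2 - 1] \cos(\pi(m - m')x)\, dx = \frac{1}{\pi(m - m')} \Big\{ \Big[ (\chi(x)^2 - 1) \sin(\pi(m - m')x) \Big]_0^{\epsilon} - \int_0^{\epsilon} 2\chi(x)\chi'(x) \sin(\pi(m - m')x)\, dx \Big\}. \tag{4.8} \]

Since $\chi(\epsilon)^2 = 1$ and $\sin(0) = 0$ the first term vanishes, and since $\chi'(x) = O(\epsilon^{-1})$ the integral is of $O(1)$. The analogous argument works for the remaining integrals. Hence

\[ |\langle \psi_{m,n}, \psi_{m',n} \rangle| \ll \min\Big\{ \epsilon,\ \frac{1}{|m - m'|} \Big\}\, \|\psi_{m,n}\|\, \|\psi_{m',n}\|. \tag{4.9} \]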

We will now give an explicit construction of $D$. The function $f$ is chosen constant on the intervals $[a_i, a_{i+1})$, $i = 1, 2, 3, \dots$; set $\delta_i = f(a_i)$ and $\ell_i = a_{i+1} - a_i$. As quasimodes we take

\[ \psi_{i,m,n}(x, y) = \chi\Big(\frac{x - a_i}{\ell_i}\Big) \sin\Big(\frac{\pi m (x - a_i)}{\ell_i}\Big) \sin\Big(\frac{\pi n y}{\delta_i}\Big), \tag{4.10} \]

with quasi-eigenvalues

\[ \mu_{i,m,n} = \pi^2 \Big[ \Big(\frac{m}{\ell_i}\Big)^2 + \Big(\frac{n}{\delta_i}\Big)^2 \Big]. \tag{4.11} \]

By construction, these are completely localized in the rectangle $[a_i, a_{i+1}] \times [0, \delta_i]$ and hence satisfy requirement (iv) of the theorem. Setting $\mu_i = \pi^2 \delta_i^{-2}$, every given sequence of $\mu_i$ having property (3.3) determines a sequence of $\delta_i$. Because of (4.6), and writing $A_i = \ell_i \delta_i$ for the area of the $i$th rectangle,

\[ \frac{\|(\Delta + \mu_{i,m,n})\psi_{i,m,n}\|}{\|\psi_{i,m,n}\|} = O_\chi(m \ell_i^{-2}) = O_\chi(m \delta_i^{2} A_i^{-2}) = O_\chi(m \mu_i^{-1} A_i^{-2}). \tag{4.12} \]

To minimize the discrepancy, we would like to choose $A_i$ as large as possible. The choice $A_i = \tau(\mu_i) h(\mu_i)^{-1}$ yields condition (i) and determines $f$. Since

\[ \int_{a_1}^{\infty} f(x)\, h\big(\pi^2 f(x)^{-2}\big)\, dx = \sum_i \ell_i f(a_i)\, h\big(\pi^2 f(a_i)^{-2}\big) = \sum_i A_i\, h(\pi^2 \delta_i^{-2}) = \sum_i \tau(\mu_i) < \infty, \tag{4.13} \]

the function $f$ is in the required class satisfying (3.1). Condition (ii) is evident from (4.1), and (iii) from (4.9).


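5. Asymptotic Distribution of Eigenvalues

In view of condition (2.8) we would like to control the number of eigenvalues in small intervals. The following theorem illustrates that extreme level clustering cannot a priori be ruled out.

Theorem 2. The spectral counting function $N(\lambda) = \#\{j : \lambda_j < \lambda\}$ of the Dirichlet Laplacian for the domain $D$ (as in Sect. 4) satisfies

\[ N(\lambda) = \frac{\mathrm{Area}(D)}{4\pi} \lambda - \frac{L(\lambda)}{4\pi} \sqrt{\lambda} + \frac{1}{2\pi} \sqrt{\lambda} \sum_{\substack{i = 1 \\ \delta_i \sqrt{\lambda} > \pi}}^{\infty} \ell_i \sum_{r=1}^{\infty} \frac{1}{r} J_1\big(2 r \delta_i \sqrt{\lambda}\big) + O(\sqrt{\lambda}), \tag{5.1} \]

where

\[ L(\lambda) = 2 \sum_{\substack{i = 1 \\ \delta_i \sqrt{\lambda} > \pi}}^{\infty} \ell_i \tag{5.2} \]

and $J_1$ is the $J$-Bessel function.

Remark 2.1. The standard bound

\[ |J_1(x)| \ll x^{-1/2} \quad \text{for } x \text{ large} \tag{5.3} \]

implies that

\[ N(\lambda) = \frac{\mathrm{Area}(D)}{4\pi} \lambda + O\big(L(\lambda)\sqrt{\lambda}\big), \tag{5.4} \]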

where L(λ) = 2π

∞ i=1 µi 2− 1 .


I. Binder, M. Braverman, M. Yampolsky

To ensure that the machine $M^{\phi}_{n_1}$ will not be able to produce an accurate $2^{-\ell_1}$-approximation of $J(P_{\theta_1})$ faster than in the time $h(\ell_1)$ we simply select $i_1 > h(\ell_1)$. This guarantees that the TM will have to read at least $h(\ell_1)$ digits of the oracle $\phi$ to distinguish the two Julia sets, which takes the time $h(\ell_1)$. To "fool" the machine $M^{\phi}_{n_2}$ we then change a digit $r_{i_2}$ for $i_2 > i_1$ sufficiently far in the continued fraction of $\theta_1$ to a large $N_2$. In this way, we will obtain a Brjuno number $\theta_2$ for which

\[ r(\theta_*)(1 - 1/4 - 1/8) < r(\theta_2) < r(\theta_*)(1 - 1/4). \tag{2.8} \]

Again, there exists $\ell_2$ such that for any such Brjuno number, we have $d_H(J(P_{\theta_1}), J(P_{\theta_2})) > 2^{-\ell_2}$, and we choose $i_2 > h(\ell_2)$. Continuing inductively, we arrive at the desired limiting Brjuno number $\theta_\infty$.

To convince the reader that this construction is not artificial, and not due to the peculiarities of the selected computation model, let us recast it somewhat informally as follows. It is possible by an arbitrarily small perturbation of the parameter $\theta$ to cause a detectable disturbance in the picture of $J(P_\theta)$. To distinguish the picture of the new Julia set from the old one, in practice one needs to draw it with arbitrary precision arithmetic. That is, not only the input of the parameter (reading the oracle) will take a long time due to the number of significant digits, but also all the arithmetic manipulations with this parameter. Of course, the former consideration is already sufficient to prove the theorem.

3. Computing Noble Siegel Disks

The primary goal of the present paper is to show that there are computationally hard yet computable Julia sets with Siegel disks. To establish this computability we need a computability result for noble Siegel disks. The term "noble" is applied in the literature to rotation numbers of the form $[a_0, a_1, \dots, a_k, 1, 1, 1, \dots]$. The noblest of all is the golden mean $\gamma_* = [1, 1, 1, \dots]$.

Lemma 3.1. There is a Turing Machine $M$, which given a finite sequence of numbers $[a_0, a_1, \dots, a_k]$ computes the conformal radius $r_\gamma$ for the noble number $\gamma = [a_0, \dots, a_k, 1, \dots]$.

The idea is to approximate the boundary of $\Delta_\gamma$ with the iterates of the critical point $c_\gamma = -e^{2\pi i \gamma}/2$. It is known that in this case the critical point itself is contained in the boundary. The renormalization theory for golden-mean Siegel disks (constructed in [McM]) implies that the boundary of $\Delta_{\gamma_*}$ is self-similar up to an exponentially small error. In particular, there exist constants $C > 0$ and $\lambda > 1$ such that

\[ d_H\big( \{P^i_{\gamma_*}(c_{\gamma_*}),\ i = 0, \dots, q_n\},\ \partial\Delta_{\gamma_*} \big) < C\lambda^{-n}. \]

Below we derive a similar estimate for all noble Siegel disks with constructive constants $C$ and $\lambda$. For this, we do not need to invoke the whole power of renormalization theory. Rather, we will use a theorem of Douady, Ghys, Herman, and Shishikura [Do1] which specifically applies to quadratic noble Siegel disks.

Complexity of Julia Sets


Noble (or more generally, bounded type) Siegel quadratic Julia sets may be constructed by means of quasiconformal surgery on a Blaschke product,

\[ f_\gamma(z) = e^{2\pi i \tau(\gamma)} z^2\, \frac{z - 3}{1 - 3z}. \]

This map homeomorphically maps the unit circle $\mathbb{T}$ onto itself with a single (cubic) critical point at 1. The angle $\tau(\gamma)$ can be uniquely selected in such a way that the rotation number of the restriction $\rho(f_\gamma|_{\mathbb{T}}) = \gamma$. For each $n$, the points

\[ \{1, f_\gamma(1), f_\gamma^2(1), \dots, f_\gamma^{q_{n+1} - 1}(1)\} \]

form the $n$th dynamical partition of the unit circle. We have (cf. Theorem 3.1 of [dFdM]) the following:

Theorem 3.2 (Universal real a priori bound). There exists an explicit constant $B > 1$ independent of $\gamma$ and $n$ such that the following holds. Any two adjacent intervals $I$ and $J$ of the $n$th dynamical partition of $f_\gamma$ are $B$-commensurable: $B^{-1}|I| \le |J| \le B|I|$.

Let us now consider the mapping which identifies the critical orbits of $f_\gamma$ and $P_\gamma$ by $f_\gamma^i(1) \mapsto P_\gamma^i(c_\gamma)$. We have the following (Theorem 3.10 of [YZ]):

Theorem 3.3 (Douady, Ghys, Herman, Shishikura). This mapping extends to a $K$-quasiconformal homeomorphism of the plane $\mathbb{C}$ which maps the unit disk $\mathbb{D}$ onto the Siegel disk $\Delta_\gamma$. The constant $K$ depends on $B$ and $a_0, \dots, a_k$ in a constructive fashion.

Elementary combinatorics implies that each interval of the $n$th dynamical partition contains at least two intervals of the $(n + 2)$nd dynamical partition. This in conjunction with Theorem 3.2 implies that the size of an interval of the $(n + 2)$nd dynamical partition of $f_\gamma$ is at most $\tau^n$, where

\[ \tau = \frac{B}{B + 1}. \]

We now complete the proof of Lemma 3.1. Denote by $W_n$ the connected component containing 0 of the domain obtained by removing from the plane a closed disk of radius $2K\tau^n$ around each point of the orbit segment $\{P_\gamma^i(c_\gamma),\ i = 0, \dots, q_{n+2}\}$. By Theorem 3.3, the Hausdorff distance between this orbit segment and $\partial\Delta_\gamma$ is less than $K\tau^n$, and we have $W_n \subset \Delta_\gamma$ and $\mathrm{dist}_H(\partial\Delta_\gamma, \partial W_n) \le \epsilon_n = 2K\tau^n$.


Any constructive algorithm for producing the Riemann mapping of a planar region (e.g. that of [BB]) can be used to estimate the conformal radius $r(W_n, 0)$ with precision $\epsilon_n$. Denote this estimate $r_n$. Elementary estimates imply that the Julia set $J(P_\gamma) \subset B(0, 2)$. By the Schwarz Lemma this implies $r(\Delta_\gamma, 0) < 2$. By Lemma 2.12 we have

\[ |r(\Delta_\gamma, 0) - r_n| \le |r(\Delta_\gamma, 0) - r(W_n, 0)| + |r(W_n, 0) - r_n| < 4\sqrt{\epsilon_n} + \epsilon_n \xrightarrow[n \to \infty]{} 0, \]

and the proof is complete.

4. Making Small Changes to Φ and r

For a number $\gamma = [a_1, a_2, \dots] \in \mathbb{R} \setminus \mathbb{Q}$ we denote

\[ \alpha_i(\gamma) = \cfrac{1}{a_i + \cfrac{1}{a_{i+1} + \cfrac{1}{a_{i+2} + \cdots}}}, \]

so that

\[ \Phi(\gamma) = \sum_{n \ge 1} \alpha_1(\gamma)\alpha_2(\gamma) \cdots \alpha_{n-1}(\gamma) \log\frac{1}{\alpha_n(\gamma)}. \]
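The sum defining Φ can be evaluated approximately from a finite list of continued fraction digits. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, the continued fraction is truncated after the supplied digits (the missing tail is treated as 0), and only finitely many terms of the series are added. For the golden mean, where every $\alpha_i$ equals $g = (\sqrt{5} - 1)/2$, the series is geometric, which gives a closed form to compare against.

```python
import math


def gauss_alphas(digits):
    """Return alpha_i = 1/(a_i + alpha_{i+1}) for a finite digit list.

    The continued fraction is truncated: the tail beyond the last
    digit is treated as 0, which is an approximation.
    """
    alphas = [0.0] * len(digits)
    tail = 0.0
    for i in range(len(digits) - 1, -1, -1):
        tail = 1.0 / (digits[i] + tail)
        alphas[i] = tail
    return alphas


def brjuno_partial_sum(digits, terms):
    """Partial sum of alpha_1 ... alpha_{n-1} log(1/alpha_n), n = 1..terms."""
    alphas = gauss_alphas(digits)
    total, prefix = 0.0, 1.0
    for n in range(terms):
        total += prefix * math.log(1.0 / alphas[n])
        prefix *= alphas[n]
    return total


if __name__ == "__main__":
    g = (math.sqrt(5) - 1) / 2
    # Golden mean: geometric series with closed form log(1/g) / (1 - g).
    print(brjuno_partial_sum([1] * 80, 40), math.log(1 / g) / (1 - g))
    # Replacing one far-out digit by a large N, in the spirit of Lemma 4.1.
    spiked = [1] * 80
    spiked[20] = 10 ** 6
    print(brjuno_partial_sum(spiked, 40))
```

Using enough digits beyond the last summed term keeps the truncation error in the $\alpha_i$ negligible compared with the neglected tail of the series.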

We will show the following two lemmas.

Lemma 4.1. For any initial segment $I = (a_0, a_1, \dots, a_n)$, write $\omega = [a_0, a_1, \dots, a_n, 1, 1, 1, \dots]$. Then for any $\varepsilon > 0$, there is an $m > 0$ and an integer $N$ such that if we write $\beta = [a_0, a_1, \dots, a_n, 1, 1, \dots, 1, N, 1, 1, \dots]$, where the $N$ is located in the $(n + m)$th position, then

\[ \Phi(\omega) + \varepsilon < \Phi(\beta) < \Phi(\omega) + 2\varepsilon. \]

Lemma 4.2. For $\omega$ as above, for any $\varepsilon > 0$ there is an $m_0 > 0$, which can be computed from $(a_0, a_1, \dots, a_n)$ and $\varepsilon$, such that for any $m \ge m_0$, and for any tail $I = [a_{n+m}, a_{n+m+1}, \dots]$, if we denote $\beta^I = [a_1, a_2, \dots, a_n, 1, 1, \dots, 1, a_{n+m}, a_{n+m+1}, \dots]$, then

\[ \Phi(\beta^I) > \Phi(\omega) - \varepsilon. \]

We first prove Lemma 4.1. Denote

\[ \Phi^-(\omega) = \Phi(\omega) - \alpha_0(\omega)\alpha_1(\omega) \cdots \alpha_{n+m-1}(\omega) \log\frac{1}{\alpha_{m+n}(\omega)}. \]

The value of the integer $m > 0$ is yet to be determined. Denote $\beta^N = [a_0, a_1, \dots, a_n, 1, 1, \dots, 1, N, 1, 1, \dots]$. We will need the following estimates, which are proven by induction.


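Lemma 4.3. For any $N$, the following holds:

1. For $i \le n + m$ we have

\[ \Big| \log\frac{\alpha_i(\beta^N)}{\alpha_i(\beta^{N+1})} \Big| < \frac{2^{i - (n+m)}}{N}; \]

2. for $i < n + m$,

\[ \Big| \log\frac{\alpha_i(\beta^N)}{\alpha_i(\beta^1)} \Big| < 2^{i - (n+m)}; \]

3. for $i < n + m$,

\[ \Bigg| \log\frac{\log\frac{1}{\alpha_i(\beta^N)}}{\log\frac{1}{\alpha_i(\beta^{N+1})}} \Bigg| < 2^{i - (n+m) + 1}; \]

4. for $i < n + m - 1$,

\[ \Bigg| \log\frac{\log\frac{1}{\alpha_i(\beta^N)}}{\log\frac{1}{\alpha_i(\beta^1)}} \Bigg| < 2^{i - (n+m) + 1}. \]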
The estimates yield the following. Lemma 4.4. For any ω of the form as in Lemma 4.1 and for any ε > 0, there is an m0 > 0 such that for any N and any m ≥ m0 , | − (β N ) − − (β 1 )|
1 such ε that the tail of the sum i≥n+m1 α1 α2 · · · αi−1 log α1i < 16 . We will show how to choose m0 ≥ m1 to satisfy the conclusion of the lemma. We bound the influence of the change from β 1 to β N using Lemma 4.3, Parts 2 and 4. The influence on each of the “head elements" (i < n + m1 < n + m − 1) is bounded by

i−1

1

α1 (β 1 ) · · · αi−1 (β 1 ) log α (β 1)

i 2j −(n+m) + 2i−(n+m)+1

0 stands for the edge length of T. Our main result is

Theorem 1.1. For a random deformation, $T(\lambda, \omega)$, of a regular tree graph $T$ with branching number $K \ge 2$ the AC spectrum of $-\Delta_{T(\lambda,\omega)}$ is continuous at $\lambda = 0$ in the sense that for any interval $I \subset \mathbb{R}$ and almost all $\omega$:

\[ \lim_{\lambda \to 0} L\big( I \cap \sigma_{ac}(-\Delta_{T(\lambda,\omega)}) \big) = L\big( I \cap \sigma_{ac}(-\Delta_T) \big), \tag{1.6} \]

where $L(\cdot)$ denotes the Lebesgue measure.

Remarks 1.1. (i) As is generally known by ergodicity arguments [3, 17, 1], and in our case also by the 0-1 law for the sigma-algebra of events measurable at infinity, which is applicable through Theorem A.2, for almost all $\omega$ the AC spectrum of $-\Delta_{T(\lambda,\omega)}$ is given by a certain non-random set.

(ii) The assumption on the distribution of $\{\omega_e\}_{e \in \mathcal{E}}$ can be relaxed: the present proof readily extends to the class of random graphs where the distribution of these variables is stationary under the endomorphisms of the tree $T$ and weakly correlated in the sense of [2, Def. 1.1].

(iii) To better appreciate the continuity asserted in Theorem 1.1, one may note that the analogous statement is not expected to be true in case the disorder is restricted to be radially symmetric, i.e., $\omega_e = a_{\mathrm{dist}\{e,0\}}$ with $\{a_n\}$ a collection of iid random variables. In this case, the AC spectrum coincides with that of a one-dimensional Sturm-Liouville operator. In view of related results about Anderson localization in one dimension [3, 17, 15, 16] one may expect (though we are not aware of a published proof) that also here localization sets in at any non-zero level of disorder.


M. Aizenman, R. Sims, S. Warzel

2. An Outline of the Argument

A generally useful tool for the study of the spectral and dynamical properties of any quantum graph is provided by the Green function. For tree graphs, we find it particularly useful to consider a related quantity, which is an extension of the Weyl-Titchmarsh function familiar from the context of Sturm-Liouville or Schrödinger operators on a line. Before outlining the main steps in the derivation of Theorem 1.1, we shall introduce this function and its key properties, first somewhat informally through its appearance in a scattering problem.

2.1. A scattering perspective. As noted by Miller and Derrida [14], one may obtain a scattering perspective on extended states by considering a setup in which a wire $W_x$ is attached to a tree graph $T$ at an interior point $x$ of an edge. Particles of energy $E$ and decay rate $\eta$ are sent at a steady rate down this wire. In the corresponding steady state, the quantum amplitude $\psi$ for observing a particle at a point is given by a function satisfying $(-\Delta_{T \cup W_x} - z)\psi = 0$, where $z = E + i\eta$ and $-\Delta_{T \cup W_x}$ is a self-adjoint Laplacian on the union of the graph and the wire, defined with suitable BC for the three segments meeting at the point of contact. For the latter, we assume here that it will be appropriate to take the Kirchhoff conditions.

As follows from Theorem 2.1 below, on the two subgraphs $T_x^+$ and $T_x^-$, produced by cutting $T$ at $x$, the above differential equation has a unique (up to a multiplicative constant) square-integrable solution $\psi^+$ and correspondingly $\psi^-$. Thus $\psi$ takes the form:

\[ \psi(y; z) = \begin{cases} e^{i\sqrt{z}(y - x)} + r(x; z)\, e^{-i\sqrt{z}(y - x)} & \text{along the wire}, \\ \psi^{\pm}(y; z) & \text{along the graph}, \end{cases} \tag{2.1} \]

where $r(x; z)$ is the reflection coefficient, and the three branches are linked through the Kirchhoff conditions:

\[ \psi^+(x; z) = \psi^-(x; z) = 1 + r(x; z), \qquad \frac{\partial}{\partial x}\psi^+(x; z) - \frac{\partial}{\partial x}\psi^-(x; z) = i\sqrt{z}\, \big(1 - r(x; z)\big) \tag{2.2} \]

with the differentiation taken in the direction away from the root of $T$. The above relations yield

\[ i\sqrt{z}\, \frac{1 - r(x; z)}{1 + r(x; z)} = R^+(x; z) + R^-(x; z), \tag{2.3} \]

where $R^{\pm} = \pm (\partial\psi^{\pm}/\partial x)/\psi^{\pm}$. From the scattering perspective the graph absorbs some of the current directed at it, i.e., conducts it to infinity, if and only if $|r(x; z)| < 1$. A simple consequence of (2.3) is the equivalence

\[ |r(x; E)| < 1 \iff \operatorname{Im}\big[ R^+(x; E) + R^-(x; E) \big] > 0. \tag{2.4} \]

As it turns out $R$ also plays a direct role in the spectral theory of $-\Delta_T$: the diagonal of its Green function is given by

\[ G_T(x, x; z) = -\big( R^+(x; z) + R^-(x; z) \big)^{-1}. \tag{2.5} \]

Quantum Tree Graphs with Disorder


By the theorem of de la Vallée Poussin, the AC component of the spectral measure, associated with the function in (2.5), is $\pi^{-1} \operatorname{Im} G_T(x, x; E + i0)\, dE$. Therefore, there is a relation between the occurrence of the AC spectrum, the ability of the graph to conduct current to infinity, and the non-vanishing of $\operatorname{Im} R^{\pm}(x; E)$. Let us note that the reflection coefficient for the version of the above experiment in which the particles are sent towards only the forward subtree $T_x^+$ is given by a version of (2.3) with only $R^+(x; z)$ on the right side, and similarly for $T_x^-$.

2.2. Tree extension of the Weyl-Titchmarsh function. We shall now follow the somewhat informal introduction above with a more careful definition of the functions $R^{\pm}$. For this purpose the following statement plays an important role.

Theorem 2.1. Let $G$ be a connected metric graph with a selected "open" vertex $u$ which has exactly one adjacent edge. Let $-\Delta_{G,u}$ be the symmetric Laplacian defined with self-adjoint BC on all vertices excepting the open vertex, where it is required that both $\psi(u) = 0$ and $\psi'(u) = 0$. Then:

(i) For any $z \in \mathbb{C}^+ := \{z \in \mathbb{C} : \operatorname{Im} z > 0\}$, the space of square-integrable solutions of $(-\Delta^*_{G,u} - z)\psi = 0$, with $-\Delta^*_{G,u}$ the adjoint operator, is one dimensional.
(ii) The solution $\psi(x; z)$ and its derivative $\psi'(x; z)$ do not vanish on any point which disconnects $G$.
(iii) Normalized so that $\psi(u; z) = 1$, both $\psi(x; z)$ and $\psi'(x; z)$ are analytic for $z \in \mathbb{C}^+$ and all $x \in G$.

We note that $-\Delta_{G,u}$ is not self-adjoint. The proof of this theorem is given in Appendix A. The following corollary is a relevant implication for trees. Throughout, we denote by $\psi^{\pm}(x; z|u)$ the functions described in Theorem 2.1 which correspond to the two subtrees, $T_u^{\pm}$, into which $T$ is split at $u$, with $u$ serving as the open vertex. We fix their normalization such that $\psi^{\pm}(u; z|u) = 1$.

Corollary 2.1. Along the edges of a metric tree $T$, the ratio

\[ R^{\pm}(x; z) := \pm \frac{1}{\psi^{\pm}(x; z|u)}\, \frac{\partial}{\partial x} \psi^{\pm}(x; z|u) \tag{2.6} \]

does not depend on $u$ as long as $x$ stays in $T_u^{\pm}$.

Definition 2.1. We shall refer to the above $R^{\pm}$ as the (generalized) Weyl-Titchmarsh (WT) functions.

These functions have a number of properties which are used in the proof of our main result. If not obvious, their derivation is given in Appendix A.

1. (Relation with the Green function). The generalized WT function may be related to the diagonal elements of the Green function which is defined on $T_x^+$, with the $\alpha \ne 0$ BC at $x$, as

\[ R^+(x; z) = \cot\alpha - \frac{1}{G^{\alpha}_{T_x^+}(x, x; z)}, \tag{2.7} \]

and similarly for $R^-$.



2. (Boundary values). The function has the Herglotz-Nevanlinna property [5]: it is analytic for $z \in \mathbb{C}^+$ with $\operatorname{Im} R^{\pm}(x; z) > 0$ when $\operatorname{Im} z > 0$. By a standard implication, for each $x$ the limit

\[ R^{\pm}(x; E + i0) := \lim_{\eta \downarrow 0} R^{\pm}(x; E + i\eta) \tag{2.8} \]

exists for Lebesgue almost every $E \in \mathbb{R}$.

3. (Evolution along the tree). The values $R_e^+(\cdot; z)$ at two opposite ends of an edge $e$ are related by a Möbius transformation, which integrates the Riccati equation:

\[ \frac{\partial}{\partial x} R^+(x; z) + z + R^+(x; z)^2 = 0. \tag{2.9} \]

Over each vertex $R^+(\cdot; z)$ is additive thanks to (1.2):

\[ R_e^+(L_e; z) = \sum_{f \in N_e^+} R_f^+(0; z). \tag{2.10} \]

4. (Relation with the current). For each $u$, the quantity

\[ J^+(x, z|u) := \operatorname{Im}\Big[ \overline{\psi^+(x; z|u)}\, \frac{\partial}{\partial x}\psi^+(x; z|u) \Big] = |\psi^+(x; z|u)|^2 \operatorname{Im} R^+(x; z) \ge 0 \tag{2.11} \]

represents a current. It is additive at the vertices and conserved along the edges for real $z$. For $z \in \mathbb{C}^+$ the current is decreasing in the direction away from the root:

\[ \frac{\partial}{\partial x} J^+(x; z|u) = -|\psi^+(x; z|u)|^2 \operatorname{Im} z \le 0. \tag{2.12} \]

At interior vertices the net current flux is zero.

2.3. The core of the argument. We now have the requisite tools to outline the proof of the persistence of the AC spectrum under weak disorder. A key element in our analysis is to show that for small $(\lambda, \eta)$, the WT function $R^+(x; E + i\eta, \lambda, \omega)$ does not depend much on $\omega$. At each point its distribution is narrowly peaked around a value which may only depend on $(\lambda, \eta)$, and the relative location of the point within the edge. By the rules of the evolution of $R^+$ along an edge, which are described above, it follows that for $(\lambda, \eta) \to (0, 0)$ the limit of the "typical" value of $R_e^+(0; z, \lambda, \omega)$, or more precisely any accumulation point of such, obeys a Möbius evolution whose unique periodic solution is given by the WT function of the regular tree $T$. The continuity then readily follows, though some care is needed in the presentation of the argument. In this part, we employ the strategy which was presented in [2].

It should be appreciated that the asymptotic lack of dependence of $R^+(x, z; \lambda, \omega)$ on $\omega$ is not just a trivial consequence of the smallness of $\lambda$ since this parameter affects an infinite number of random terms. As commented above, it is natural to expect the corresponding statement to fail when the disorder is radial, with $\omega$ given by radially symmetric but otherwise iid random variables. To streamline the notation, in various places the dependence of $\psi^+$ and $R^+$ on $\lambda$ and $\omega$ will be suppressed.

The first statement establishing a reduction of fluctuations concerns $\operatorname{Im} R^+(x; z)$. For that the starting point is (2.11) by which $|\psi^+(x; z|0)|^2 \cdot \operatorname{Im} R^+(x; z)$ gives the flux at



x of a conserved current. The current is injected at the root and at each vertex it is split among the forward directions. It is significant that the first factor takes a common value among the different forward directions, the second factor is independently distributed, and, furthermore, it has the same distribution as the total current $\operatorname{Im} R^+(0; z)$. It follows that

\[ \frac{\frac{1}{K} \sum_{e \in N_0^+} \operatorname{Im} R_e^+(0; z, \lambda, \omega)}{\operatorname{Im} R_0^+(0; z, \lambda, \omega)} \le \frac{|\psi_0^+(0; z, \lambda, \omega|0)|^2}{K\, |\psi_f^+(0; z, \lambda, \omega|0)|^2}. \tag{2.13} \]

This expresses current conservation/attrition, and for $\operatorname{Im} z = 0$ holds as equality. Here $f \in N_0^+$ is an arbitrary edge forward to that of the root, and due to the particular normalization chosen (before Corollary 2.1) the numerator on the right side is actually one. Our argument proceeds by combining two essential observations:

1. By the Jensen inequality the expectation value of the logarithm of the left side of (2.13) is non-negative. The inequality can be strengthened to show that the above expectation value provides an upper bound on a positive quantity which expresses the relative width of the distribution of $\operatorname{Im} R_0^+(0; z, \lambda, \omega)$.

2. The expectation of the logarithm of the right side of (2.13) is a quantity which it is natural to regard as a Lyapunov exponent,

\[ \gamma_\lambda(z) := -\mathbb{E}\Big[ \log\Big( \sqrt{K}\, \frac{|\psi_f^+(0; z, \lambda, \cdot|0)|}{|\psi_0^+(0; z, \lambda, \cdot|0)|} \Big) \Big]. \tag{2.14} \]

For $\lambda = 0$, this Lyapunov exponent vanishes for almost every $z \in \sigma_{ac}(-\Delta_T)$. Furthermore, the average of $\gamma_\lambda(E + i\eta)$ over any energy interval is continuous in $(\lambda, \eta)$.

The above mentioned improvement of the Jensen inequality is summarized in the following statement, which is a consequence of [2, Lemma 3.1 and Lemma D.2].

Lemma 2.1. Let $\{X_j\}_{j=1}^K$ be a collection of $K \ge 2$ iid positive random variables, and $X$ a variable of the same distribution. Then for any $a \in (0, 1/2]$:

\[ \mathbb{E}\Big[ \log\Big( \frac{1}{K} \sum_{j=1}^{K} X_j \Big) \Big] \ge \mathbb{E}[\log X] + \frac{a^2}{4}\, \delta(X, a)^2, \tag{2.15} \]

where $\delta(X, a)$ is the relative $a$-width of $X$, which is defined below.

Definition 2.2. The relative $a$-width of the distribution of a positive random variable $X$, at $a \in (0, 1/2]$, is

\[ \delta(X, a) := 1 - \frac{\xi_-(X, a)}{\xi_+(X, a)} \tag{2.16} \]

with $\xi_-(X, a) = \sup\{\xi : \mathbb{P}(X < \xi) \le a\}$ and $\xi_+(X, a) = \inf\{\xi : \mathbb{P}(X > \xi) \le a\}$.

A number of useful rules of estimates of the relative width of a distribution are compiled in [2, Appendix D]. We shall now turn to the two key properties of the Lyapunov exponent which were mentioned above.
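To make Definition 2.2 concrete, the relative $a$-width can be evaluated for the empirical distribution of a finite sample. The Python sketch below is only an illustration and not part of the original argument: the function name `relative_width` is hypothetical, and $\mathbb{P}$ is replaced by the empirical measure of the sample, so that $\xi_-$ and $\xi_+$ become order statistics.

```python
import math


def relative_width(samples, a):
    """Relative a-width (2.16) of the empirical distribution of `samples`.

    With k = floor(a * n) and the sample sorted increasingly:
      xi_minus = sup{xi : #{x < xi} <= k} is the (k+1)-th smallest value,
      xi_plus  = inf{xi : #{x > xi} <= k} is the (k+1)-th largest value.
    """
    if not 0 < a <= 0.5:
        raise ValueError("a must lie in (0, 1/2]")
    if not samples or min(samples) <= 0:
        raise ValueError("samples must be non-empty and positive")
    ordered = sorted(samples)
    n = len(ordered)
    k = math.floor(a * n)
    xi_minus = ordered[k]
    xi_plus = ordered[n - k - 1]
    return 1.0 - xi_minus / xi_plus


if __name__ == "__main__":
    # Sample 1..8 with a = 1/4: xi_minus = 3, xi_plus = 6.
    print(relative_width([1, 2, 3, 4, 5, 6, 7, 8], 0.25))
    # A deterministic variable has zero relative width.
    print(relative_width([3.0] * 10, 0.5))
```

A narrowly peaked sample gives a width close to 0. For $a = 1/2$ and an even sample size the empirical $\xi_+$ may fall below $\xi_-$, which makes the empirical width negative; this is an artifact of the discretization.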



3. A Lyapunov Exponent and Its Continuity

We shall refer to $\gamma_\lambda(z)$ which is defined by (2.14) as the Lyapunov exponent of the randomly deformed tree $T(\lambda, \omega)$. The following theorem collects some of its properties. Of particular relevance is that the integral of $\gamma_\lambda(E + i\eta)$ over $E \in \sigma_{ac}(T)$ is small for small $\lambda$ and $\eta$.

Theorem 3.1. The Lyapunov exponent $\gamma_\lambda(z)$ has the following properties:

(i) As a function of $z \in \mathbb{C}^+$, it is positive and harmonic with $\gamma_\lambda(i\eta)/\eta \to 0$ for $\eta \to \infty$.
(ii) For $\lambda = 0$, it vanishes on the AC spectrum: $\gamma_0(E + i0) = 0$ for Lebesgue-almost all $E \in \sigma_{ac}(-\Delta_T)$.
(iii) For any $z \in \mathbb{C}^+$, $\gamma_\lambda(z + i\eta)$ is jointly continuous in $(\lambda, \eta) \in \mathbb{R} \times [0, \infty)$.
(iv) For any $[a, b] \subset \sigma_{ac}(-\Delta_T)$:

\[ \lim_{\substack{\lambda \to 0 \\ \eta \downarrow 0}} \int_a^b \gamma_\lambda(E + i\eta)\, dE = 0. \tag{3.1} \]

Proof. (i) From (2.14) and (2.6) it follows that $\gamma_\lambda(z)$ is the negative of the real part of the Herglotz-Nevanlinna function

\[ w_\lambda(z) := \log\sqrt{K} + \mathbb{E}\Big[ \int_0^{L_0(\lambda)} R_0^+(l; z, \lambda)\, dl \Big], \tag{3.2} \]

and hence it is harmonic. The positivity of $\gamma_\lambda(z)$ follows from (2.11) and the Jensen inequality, which yield

\[ 2\gamma_\lambda(z) \ge \mathbb{E}\Big[ \log\frac{J_0^+(0; z|0)}{J_0^+(L_0(\lambda, \cdot); z|0)} \Big] > 0 \tag{3.3} \]

due to the current loss (2.12) on every edge for $z \in \mathbb{C}^+$. The statement of asymptotics derives from (A.5) and the bound (A.2) in Appendix A.

(ii) The vanishing of $\gamma_0$ along $\sigma_{ac}(-\Delta_T)$ is a consequence of the $\operatorname{Im} z \downarrow 0$ limit of (2.13) and the fact that $R_e^+(0; z, 0)$ is independent of $e$, with $0 < \operatorname{Im} R_e^+(0; E + i0, 0) < \infty$ for Lebesgue-almost all $E \in \sigma_{ac}(-\Delta_T)$.

(iii) From (2.14) and (A.4) together with the dominated convergence theorem, which is applicable due to (A.5) and Theorem A.1, we conclude that the continuity of $\gamma_\lambda(z + i\eta)$ follows from that of $R_0^+(0; z + i\eta, \lambda, \omega)$. The latter is derived using the argument in the proof of Theorem A.1(iv).

(iv) By virtue of (ii) it suffices to prove that

\[ \lim_{\lambda, \eta \to 0} \int_a^b \gamma_\lambda(E + i\eta)\, dE = \int_a^b \gamma_0(E + i0)\, dE. \tag{3.4} \]

To do so, we note that the integrals in (3.4) can be associated with the (unique) Borel measure $\sigma_{(\lambda,\eta)}$ corresponding to the positive harmonic function $h_{(\lambda,\eta)}(z) = \gamma_\lambda(z + i\eta)$ (cf. (3.6) below). Since $w_\lambda(\cdot + i\eta)$ has the Herglotz-Nevanlinna property, the harmonic conjugate of $h_{(\lambda,\eta)} = -\operatorname{Re} w_\lambda(\cdot + i\eta)$ has a definite sign and hence



locally integrable boundary values [5, Thm. 1.1]. Therefore, the measure $\sigma_{(\lambda,\eta)}$ is purely AC [5, Thm 3.1 & Corollary 1] for all $(\lambda, \eta) \in [0, 1]^2$ and given by

\[ \sigma_{(\lambda,\eta)}\big([a, b]\big) = \int_a^b \gamma_\lambda(E + i\eta)\, dE. \tag{3.5} \]

The assertion thus follows from (iii) and Lemma 3.1 below.

The last part of the preceding proof was based on the following general convergence result for sequences of harmonic functions. Recall (cf. [5, 9]) that every positive harmonic function $h : \mathbb{C}^+ \to (0, \infty)$ which satisfies $\lim_{\eta \to \infty} h(i\eta)/\eta = 0$ admits the representation

\[ h(z) = \int_{\mathbb{R}} \frac{\operatorname{Im} z}{|E - z|^2}\, \sigma(dE) \tag{3.6} \]

with some positive Borel measure $\sigma$ on $\mathbb{R}$ with $\int_{\mathbb{R}} (E^2 + 1)^{-1} \sigma(dE) < \infty$.

Lemma 3.1. Let $h_n, h : \mathbb{C}^+ \to (0, \infty)$ be positive harmonic functions with $\lim_{\eta \to \infty} h_n(i\eta)/\eta = 0$ and similarly for $h$. Suppose that for all $z \in \mathbb{C}^+$,

\[ \lim_{n \to \infty} h_n(z) = h(z). \tag{3.7} \]

Then their associated Borel measures converge vaguely, $\lim_{n \to \infty} \sigma_n = \sigma$.

The proof is an immediate consequence of the representation (3.6) and [6, Prop. 4.1] (see also [17, Lemma 5.22]).

4. Fluctuation Bounds

Proceeding along the lines outlined in Subsect. 2.3, we shall now show that a small Lyapunov exponent $\gamma_\lambda(z)$ implies the sharpness of the distribution of both the imaginary part and the modulus of a certain linear function of $R_0^+(0; z, \lambda, \omega)$.

Theorem 4.1. For any $\lambda \in \mathbb{R}$, $z \in \mathbb{C}^+$ and $a \in (0, 1/2]$:

\[ \delta\big( \operatorname{Im} R_0^+(0; z, \lambda, \cdot),\ a \big)^2 \le \frac{8}{a^2}\, \gamma_\lambda(z), \tag{4.1} \]

\[ \delta\Big( \Big| \cos\big(\sqrt{z}\, L_0(\lambda, \cdot)\big) + \frac{\sin\big(\sqrt{z}\, L_0(\lambda, \cdot)\big)}{\sqrt{z}}\, R_0^+(0; z, \lambda, \cdot) \Big|^2,\ a \Big)^2 \le 512\, \frac{(K + 1)^2}{a^2}\, \gamma_\lambda(z). \tag{4.2} \]

Proof. The derivation of (4.1) starts from the relation

\[ 2\gamma_\lambda(z) \ge \mathbb{E}\Big[ \log\Big( \frac{1}{K} \sum_{f \in N_0^+} \operatorname{Im} R_f^+(0; z, \lambda, \cdot) \Big) \Big] - \mathbb{E}\big[ \log \operatorname{Im} R_0^+(0; z, \lambda, \cdot) \big], \tag{4.3} \]

which is obtained by taking the expectation of the logarithm of (2.13). Applying the improved Jensen inequality (2.15), and using the fact that $\operatorname{Im} R_f^+(0; z, \lambda, \omega)$ are iid for



$f \in N_0^+$, the right side of (4.3) is bounded from below by $a^2\, \delta\big(\operatorname{Im} R_0^+(0; z, \lambda, \cdot), a\big)^2/4$. This implies (4.1).

The proof of (4.2) starts by observing that the quantity in its left side can be identified with the right side of (2.13):

\[ \cos\big(\sqrt{z}\, L_0(\lambda, \omega)\big) + \frac{\sin\big(\sqrt{z}\, L_0(\lambda, \omega)\big)}{\sqrt{z}}\, R_0^+(0; z, \lambda, \omega) = \psi_0^+\big(L_0(\lambda, \omega); z, \lambda, \omega|0\big). \tag{4.4} \]

This follows from (A.4) in Appendix A. Setting $X := J_0^+(L_0(\lambda, \cdot); z|0)/J_0^+(0; z|0)$ and using the definition of the current, the left side in (4.2) therefore equals

\[ \delta\Big( X\, \frac{\operatorname{Im} R_0^+(0; z, \lambda, \cdot)}{\sum_{f \in N_0^+} \operatorname{Im} R_f^+(0; z, \lambda, \cdot)},\ a \Big) \le \delta\Big( \frac{\operatorname{Im} R_0^+(0; z, \lambda, \cdot)}{\sum_{f \in N_0^+} \operatorname{Im} R_f^+(0; z, \lambda, \cdot)},\ \frac{a}{2} \Big) + \delta\Big( X, \frac{a}{2} \Big), \tag{4.5} \]

where the inequality results from the additivity of the relative width under multiplication [2, Lemma D.1]. This additivity and the invariance under inversion [2, Lemma D.1] ensure that the first term on the right side of (4.5) is bounded from above by

\begin{align}
\delta\Big( \operatorname{Im} R_0^+(0; z, \lambda),\ \frac{a}{2(K + 1)} \Big) + \delta\Big( \sum_{f \in N_0^+} \operatorname{Im} R_f^+(0; z, \lambda),\ \frac{aK}{2(K + 1)} \Big) &\le 2\, \delta\Big( \operatorname{Im} R_0^+(0; z, \lambda),\ \frac{a}{2(K + 1)} \Big) \notag \\
&\le \frac{8\sqrt{2}\, (K + 1)}{a} \sqrt{\gamma_\lambda(z)}. \tag{4.6}
\end{align}

Here the first inequality results from the rules of addition of iid random variables [2, Lemma D.1]. The second one is a consequence of (4.1). The second term on the right side of (4.5) is bounded from above according to $\delta(X, a/2) \le 2\sqrt{\gamma_\lambda(z)/a}$. This follows from (3.3) and the simple bound

\[ \delta(X, a)^2 \le \big(1 - \xi_-(X, a)\big)^2 \le -\ln \xi_-(X, a) \le -\frac{\mathbb{E}[\ln X]}{a}, \tag{4.7} \]

valid for all random variables $X$ taking values in $(0, 1]$. Combining the above estimates, we arrive at (4.2).
5. Stability of the Weyl-Titchmarsh Function Under Weak Disorder

5.1. The main stability result. Our goal in this section is to show that the boundary values of the WT function are continuous at $\lambda = 0$ in a certain distributional sense as long as $E \in \sigma_{ac}(-\Delta_T)$. Here the distribution refers to the joint dependence on the energy and the randomness. The result to be derived is:

Theorem 5.1. Let $I \subset \sigma_{ac}(-\Delta_T)$ be an interval. Then the WT function converges in $L_I \otimes \mathbb{P}$-measure, i.e., for all $\varepsilon > 0$:

\[ \lim_{\lambda \to 0} L_I \otimes \mathbb{P}\Big( \Big\{ \big| R_0^+(0; E + i0, \lambda, \omega) - R_0^+(0; E + i0, 0) \big| > \varepsilon \Big\} \Big) = 0, \tag{5.1} \]

where $L_I$ denotes the Lebesgue measure on $I$.


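The above statement will be derived in this section by proving, in Theorem 5.2 which appears below, that for all $\varepsilon > 0$ and all sequences $(\lambda, \eta)$ converging to zero,
\[ \lim_{\lambda, \eta \to 0} \mathcal{L}_I \otimes \mathbb{P}\Big( \big\{ \big| R_0^+(0; E + i\eta, \lambda, \omega) - R_0^+(0; E + i0, 0) \big| > \varepsilon \big\} \Big) = 0 . \tag{5.2} \]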

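Before we delve into the proof of the statements which lead to Theorem 5.1 let us note that it implies our main claim.

Proof (of Theorem 1.1; assuming Thm. 5.1). Since $\sigma_{ac}(-\Delta_{T(\lambda,\omega)})$ coincides almost-surely with a non-random set, it suffices to show that
\[ \lim_{\lambda \to 0} \mathbb{E}\big[ \mathcal{L}\big( I \cap \sigma_{ac}(-\Delta_{T(\lambda,\cdot)}) \big) \big] = \mathcal{L}(I) . \tag{5.3} \]
We start the proof of this relation by observing that
\[ \mathcal{L}(I) \ge \mathbb{E}\big[ \mathcal{L}\big( I \cap \sigma_{ac}(-\Delta_{T(\lambda,\cdot)}) \big) \big] \ge \mathcal{L}_I \otimes \mathbb{P}\big( \big\{ 0 < \operatorname{Im} R_0^+(0; E + i0, \lambda, \omega) < \infty \big\} \big), \tag{5.4} \]
where the second inequality is due to Theorem A.2. For any $\varepsilon > 0$ the set on the right side includes the collection of $(E, \omega)$ for which $\varepsilon < \operatorname{Im} R_0^+(0; E + i0, 0) < \infty$ and $|\operatorname{Im} R_0^+(0; E + i0, \lambda, \omega) - \operatorname{Im} R_0^+(0; E + i0, 0)| \le \varepsilon$. Accordingly, the right side of (5.4) is bounded below by the difference of
\[ \mathcal{L}_I \otimes \mathbb{P}\big( \big\{ \big| \operatorname{Im} R_0^+(0; E + i0, \lambda, \omega) - \operatorname{Im} R_0^+(0; E + i0, 0) \big| \le \varepsilon \big\} \big) \tag{5.5} \]
and
\[ \mathcal{L}_I\big( \big\{ \operatorname{Im} R_0^+(0; E + i0, 0) \in [0, \varepsilon] \cup \{\infty\} \big\} \big) . \tag{5.6} \]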

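As $\lambda \to 0$ the measure in (5.5) converges to $\mathcal{L}(I)$ by Theorem 5.1. Moreover, as $\varepsilon \downarrow 0$ the measure in (5.6) converges to zero.

5.2. Convergence in measure. In order to derive Theorem 5.1, we shall consider the distribution under the measure $\mathcal{L}_I \otimes \mathbb{P}$ of the joint values of $E$, $\{R_e^+(0; E + i\eta, \lambda, \omega)\}_{e \in \mathcal{E}}$, and $\{L_e(\lambda, \omega)\}_{e \in \mathcal{E}}$. In the following, $L_{\max}$ stands for some uniform upper bound on $L_e(\lambda, \omega)$, which exists due to the boundedness of the random variables. The setup is similar to that employed in [2].

Definition 5.1. Let $(\lambda, \eta) \in [0, 1]^2$ and $I \subset \sigma_{ac}(-\Delta_T)$. The Borel measure $\nu_{(\lambda,\eta)}$ on $I \times \mathbb{C}^{\mathcal{E}} \times [0, L_{\max}]^{\mathcal{E}}$ is the measure induced by $\mathcal{L}_I \otimes \mathbb{P}$ under the mapping
\[ (E, \omega) \mapsto \Big( E, \big\{ R_e^+(0; E + i\eta, \lambda, \omega) \big\}_{e \in \mathcal{E}}, \{ L_e(\lambda, \omega) \}_{e \in \mathcal{E}} \Big) . \tag{5.7} \]
Moreover, its $E$-conditional distribution on $\mathbb{C}^{\mathcal{E}} \times [0, L_{\max}]^{\mathcal{E}}$ is abbreviated by $\nu^E_{(\lambda,\eta)}$.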

Remarks 5.1. (i) The above definition relies on the fact that one may identify the edge sets $\mathcal{E}$ of $T(\lambda, \omega)$ corresponding to different values of $\lambda$ and/or $\omega$.
(ii) In case $(\lambda, \eta) = (0, 0)$ the measure $\nu_{(\lambda,\eta)}$ is a product of the Lebesgue measure and products of Dirac measures:
\[ \nu_{(0,0)} = dE \prod_{e \in \mathcal{E}} \delta_{R_0^+(0; E + i0, 0)} \prod_{e \in \mathcal{E}} \delta_{L} . \tag{5.8} \]



(iii) The family of finite measures $\nu_{(\lambda,\eta)}$ is tight. Indeed, the bound (A.2) in Appendix A and arguments as in [2, Prop. B.1 & Lemma B.1] show that
\[ \inf_{t > 0}\ \sup_{(\lambda,\eta) \in [0,1]^2} \nu_{(\lambda,\eta)}\big( |R_e| > t \big) = 0 \tag{5.9} \]
for all $e \in \mathcal{E}$. Accordingly, every sequence of measures $\nu_{(\lambda,\eta)}$ corresponding to $(\lambda, \eta) \to (0, 0)$ has weak accumulation points. The issue now is to show that all of the above mentioned accumulation points coincide.

Theorem 5.2. In the sense of weak convergence:
\[ \lim_{\lambda, \eta \to 0} \nu_{(\lambda,\eta)} = \nu_{(0,0)} . \tag{5.10} \]

The proof of this theorem closely follows ideas in [2], and rests on the following two lemmas. We first show that all accumulation points of the sequence in (5.10) are supported on points satisfying the limiting recursion relation.

Lemma 5.1. Let $\nu$ be a (weak) accumulation point for the family of measures $\nu_{(\lambda,\eta)}$, with the parameters $(\lambda, \eta)$ in $[0, 1] \times (0, 1]$ converging to $(0, 0)$. Then
(i) The limiting recursion relation
\[ \left( \cos\!\big(\sqrt{E} L_e\big) + R_e\, \frac{\sin\!\big(\sqrt{E} L_e\big)}{\sqrt{E}} \right) \sum_{f \in \mathcal{N}_e^+} R_f = \cos\!\big(\sqrt{E} L_e\big)\, R_e - \sqrt{E}\, \sin\!\big(\sqrt{E} L_e\big) \tag{5.11} \]
holds for $\nu$-almost all $(E, R, L) \in I \times \mathbb{C}^{\mathcal{E}} \times [0, L_{\max}]^{\mathcal{E}}$.
(ii) The lengths are $\nu$-almost surely constant, $L_e = L$ for all $e \in \mathcal{E}$.
(iii) The variables $\{R_e\}_{e \in \mathcal{E}}$ are identically distributed $\nu$-almost surely.
(iv) For Lebesgue-almost all $E \in I$ there exist $\mathcal{I} \in [0, \infty)$ and $\mathcal{M} \in [0, \infty)$ such that for all $e \in \mathcal{E}$
\[ \operatorname{Im} R_e = \mathcal{I} \quad \text{and} \quad \left| \cos\!\big(\sqrt{E} L\big) + R_e\, \frac{\sin\!\big(\sqrt{E} L\big)}{\sqrt{E}} \right| = \mathcal{M} \tag{5.12} \]
$\nu^E$-almost surely.

Proof. (i) The fact that the accumulation points obey the limiting recursion relation, which is the $(\lambda, \eta) = (0, 0)$ version of (A.3) and (2.10), is a consequence of the general principle proven in [2, Prop. 4.1].
(ii) This statement is implied by the pointwise convergence $\lim_{\lambda \to 0} L_e(\lambda, \omega) = L$.
(iii) The claim follows from the fact that all prelimit quantities are identically distributed.



(iv) We fix $e \in \mathcal{E}$. Then Theorem 4.1 and Theorem 3.1 yield
\[ \lim_{\lambda, \eta \to 0} \int_I \delta\big( \operatorname{Im} R_e^+(0; E + i\eta, \lambda, \cdot), a \big)\, dE = 0, \tag{5.13} \]
\[ \lim_{\lambda, \eta \to 0} \int_I \delta\!\left( \left| \cos\!\big(\sqrt{E + i\eta}\, L_e(\lambda, \cdot)\big) + R_e^+(0; E + i\eta, \lambda, \cdot)\, \frac{\sin\!\big(\sqrt{E + i\eta}\, L_e(\lambda, \cdot)\big)}{\sqrt{E + i\eta}} \right|^2, a \right) dE = 0 \tag{5.14} \]
for all $a \in (0, 1/2]$. By [2, Lemma D.4] this implies that both random variables in (5.12) are almost surely constant for Lebesgue-almost all $E \in I$. Since they are identically distributed for all $e \in \mathcal{E}$, the constants $\mathcal{I}$ and $\mathcal{M}$ are independent of $e$.

The explicit expression (1.4) shows that $\sin(\sqrt{E} L)/\sqrt{E} \neq 0$ for all $E \in \sigma_{ac}(-\Delta_T)$. Therefore, (5.12) asserts that the $R_e$-marginals of $\nu^E$ are supported on the intersection of a line with a circle, that is, on at most two points. Next, we show that this support contains only one point, which coincides with $R_0^+(0; E + i0, 0)$.

Lemma 5.2. Assume the situation of Lemma 5.1. Then for Lebesgue-almost all $E \in I$:
(i) there exists $\Gamma \in \mathbb{C}$ with $\operatorname{Im} \Gamma \ge 0$ such that for all $e \in \mathcal{E}$,
\[ R_e = \Gamma \quad \nu^E\text{-almost surely.} \tag{5.15} \]
(ii) $\Gamma = R_0^+(0; E + i0, 0)$.

Proof. (i) By Lemma 5.1 there exist $\Gamma_\pm \in \mathbb{C}$ with $\operatorname{Im} \Gamma_\pm \ge 0$ such that the $R_e$-marginal of $\nu^E$ is supported on $\{\Gamma_+, \Gamma_-\}$ for all $e \in \mathcal{E}$. Suppose that $\Gamma_+ \neq \Gamma_-$. Then the distribution of $\sum_{f \in \mathcal{N}_e^+} R_f$ is supported on at least three points. This follows by explicitly identifying three points $K\Gamma_\pm$ and $(K - 1)\Gamma_+ + \Gamma_-$ in the support. But this contradicts the limiting recursion relation (5.11), since then the distribution of the left side contains at least three points in its support, whereas the distribution of the right side is supported on at most two points.
(ii) Equation (5.11) with $\Gamma$ substituted for all $R_e$, and $L_e = L$ for all edges $e$, is quadratic in $\Gamma$. For Lebesgue-almost all $E \in \sigma_{ac}(-\Delta_T)$ this equation has a complex non-real solution, and in this case $R_0^+(0; E + i0, 0)$ is its only solution in the upper half plane.
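As an illustration of Lemma 5.2(ii), the following Python sketch solves the quadratic obtained from (5.11) when every $R_e$ is replaced by one common value (called `G` in the code) and $L_e = L$ on all edges, that is, $(K \sin(\sqrt{E}L)/\sqrt{E})\, G^2 + (K - 1)\cos(\sqrt{E}L)\, G + \sqrt{E}\sin(\sqrt{E}L) = 0$, and then selects the root in the upper half plane. It assumes that the sum over forward neighbours in (5.11) has exactly $K$ terms. The function names and the sample values of $E$, $L$ and $K$ are hypothetical and are not taken from the paper.

```python
import cmath
import math


def limiting_roots(E, L, K):
    """Both roots of a*G**2 + b*G + d = 0, the (lambda, eta) = (0, 0) recursion."""
    k = math.sqrt(E)                      # sqrt(E) for a real energy E > 0
    c, s = math.cos(k * L), math.sin(k * L)
    a = K * s / k                         # coefficient of G**2
    b = (K - 1) * c                       # coefficient of G
    d = k * s                             # constant term
    if a == 0:
        raise ValueError("sin(sqrt(E) L) = 0: the equation is not quadratic")
    root_disc = cmath.sqrt(b * b - 4 * a * d)
    return (-b + root_disc) / (2 * a), (-b - root_disc) / (2 * a)


def upper_half_plane_root(E, L, K):
    """Root with positive imaginary part, or None if both roots are real."""
    for g in limiting_roots(E, L, K):
        if g.imag > 0:
            return g
    return None


def recursion_residual(G, E, L, K):
    """Left side minus right side of (5.11) with R_e = G on every edge."""
    k = math.sqrt(E)
    c, s = math.cos(k * L), math.sin(k * L)
    return (c + G * s / k) * K * G - (c * G - k * s)


if __name__ == "__main__":
    E, L, K = (math.pi / 2) ** 2, 1.0, 2  # sample values (assumed)
    G = upper_half_plane_root(E, L, K)
    print(G, abs(recursion_residual(G, E, L, K)))
```

In this sketch the two roots are non-real exactly when the discriminant $(K-1)^2 \cos^2(\sqrt{E}L) - 4K \sin^2(\sqrt{E}L)$ is negative; otherwise `upper_half_plane_root` returns `None`.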

5.3. Section summary. Let us now note that the above lemmas imply the two theorems stated in this section: Proof of Theorem 5.2. Lemmas 5.1 and 5.2 jointly imply Theorem 5.2.

Proof of Theorem 5.1. As is discussed in [2], an application of Fatou’s lemma yields (5.1) from (5.2).



6. Extensions

6.1. More general vertex conditions. A variety of boundary conditions other than (1.2) lead to self-adjoint Laplacians on metric graphs [7, 12]. Of those, the argument presented here can be readily extended to the class of symmetric BC. These require at each vertex:
1. for some fixed $\beta \in [0, \pi]$ the following is common to all the edges $e$ adjacent to the vertex:
\[ \cos(\beta)\, \psi + \sin(\beta)\, n_e \cdot \nabla \psi \tag{6.1} \]
with $n_e \cdot \nabla$ the outward derivative,
2. for some fixed $\alpha \in [0, \pi]$ the sums over all edges adjacent to the vertex satisfy
\[ \cos(\alpha) \sum_e \psi_e - \sin(\alpha) \sum_e n_e \cdot \nabla \psi = 0 . \tag{6.2} \]
The symmetric class includes the Kirchhoff BC (1.2), for which $\beta = 0$ and $\alpha = \pi/2$. Our analysis extends to the general symmetric BC through a rotation which mixes the function $\psi^+$ and its derivative, where $\psi^+$ is defined as below Theorem 2.1 with the present boundary conditions. We denote:
\[ \tilde\psi^+(x; z|0) := \cot(\beta)\, \psi^+(x; z|0) + \frac{\partial}{\partial x} \psi^+(x; z|0) = -\frac{\psi^+(x; z|0)}{\tilde R^+(x; z)}, \tag{6.3} \]
and correspondingly
\[ \tilde R^+(x; z) := -\big( \cot(\beta) + R^+(x; z) \big)^{-1} . \tag{6.4} \]
Under the above boundary conditions it is the function $\tilde\psi^+(x; z|0)$ which takes a common value among the forward edges of any vertex. The current can be expressed in terms of the 'rotated' quantities as
\[ J^+(x; z|0) \equiv \big| \psi^+(x; z|0) \big|^2 \operatorname{Im} R^+(x; z) = \big| \tilde\psi^+(x; z|0) \big|^2 \operatorname{Im} \tilde R^+(x; z) . \tag{6.5} \]
The argument, as it is outlined in Sect. 2.3, applies verbatim with $\psi^+(x; z|0)$ and $R^+(x; z)$ replaced by $\tilde\psi^+(x; z|0)$ and $\tilde R^+(x; z)$. In this context the relevant Lyapunov exponent is
\[ \tilde\gamma_\lambda(z) := -\mathbb{E}\left[ \log\left( \sqrt{K} \left| \frac{\tilde\psi_f^+(0; z, \lambda, \cdot|0)}{\tilde\psi_0^+(0; z, \lambda, \cdot|0)} \right| \right) \right], \tag{6.6} \]
where $f$ is an arbitrary edge forward to the edge emanating from the root. It follows from (6.3) that the above expression with $\tilde\psi^+$ yields the same value as with $\psi^+$, i.e., $\tilde\gamma_\lambda(z) = \gamma_\lambda(z)$, the latter being defined by (2.14).

6.2. Tree graphs with decorations. By gluing a copy of a finite metric graph $G$ to every vertex of the tree $T$ one obtains a metric graph T G which is referred to as a decorated tree. The Laplacian $-\Delta_{T G}$ is rendered self-adjoint by imposing, for example, Kirchhoff BC. Such decorations provide a mechanism for the creation of gaps in the spectrum [20, 13]. The strategy presented here allows one to establish the stability of the AC spectrum under random deformations of a (uniformly) decorated tree even if $G$ has loops. In deriving the fluctuation bounds in this case, in the sum on the left side of (2.13) one may omit the terms $\operatorname{Im} R_f^+$ which correspond to directions $f$ into the decorating parts. These terms vanish for real $z$ since the finite graph $G$ does not conduct current to infinity.
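Returning to the rotated quantities of Sect. 6.1, the invariance of the current stated in (6.5) can be checked numerically. The Python sketch below assumes that $R^+$ is the logarithmic derivative $\partial_x \psi^+/\psi^+$ (consistent with (A.4)) and that $0 < \beta < \pi$, so that $\cot\beta$ is finite. The function names and sample numbers are hypothetical.

```python
import math


def rotate(psi, R, beta):
    """Return (psi_tilde, R_tilde) for a symmetric BC with 0 < beta < pi."""
    cot_beta = math.cos(beta) / math.sin(beta)
    dpsi = R * psi                         # derivative, assuming R = psi'/psi
    psi_tilde = cot_beta * psi + dpsi      # rotated function, as in (6.3)
    R_tilde = -1.0 / (cot_beta + R)        # rotated WT function, as in (6.4)
    return psi_tilde, R_tilde


def current(psi, R):
    """Current |psi|^2 Im R, as in (6.5)."""
    return abs(psi) ** 2 * R.imag


if __name__ == "__main__":
    psi, R, beta = 0.7 - 0.2j, 0.4 + 1.3j, 1.1   # sample values (assumed)
    psi_t, R_t = rotate(psi, R, beta)
    print(current(psi, R), current(psi_t, R_t))   # the two values should agree
```

The agreement reflects the identities $|\tilde\psi^+|^2 = |\psi^+|^2 |\cot\beta + R^+|^2$ and $\operatorname{Im} \tilde R^+ = \operatorname{Im} R^+ / |\cot\beta + R^+|^2$.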



A. Appendix: More on the Weyl-Titchmarsh Function on Tree Graphs

This appendix is devoted to the WT functions $R^\pm$ on general metric tree graphs $T$, presented in Definition 2.1. We start by proving Theorem 2.1 on which the definition relies. Basic properties of $R^\pm$ are the topic of the second subsection. The third subsection deals with the Green function on $T$ and its relation to the WT functions.

A.1. Uniqueness of square-integrable solutions on graphs with a dangling end. We will now give a proof of Theorem 2.1.

Proof. (i) That there is at least one non identically vanishing function in the kernel of the operator $-\Delta^*_{G,u} - z$ can be seen by elongating the dangling edge beyond $u$, thereby creating a backward extension $G' \supset G$. We set
\[ \psi(x; z) = \big( (-\Delta_{G'} - z)^{-1} \varphi \big)(x), \tag{A.1} \]
where $\varphi$ is some non identically vanishing function compactly supported on the elongation of the edge containing $u$ and $-\Delta_{G'}$ is a self-adjoint Laplacian on $G'$. This function (A.1) does not vanish identically on $G$, since otherwise (by (A.4) below) it would be identically zero on the whole edge containing $u$ and the support of $\varphi$. Suppose now there is another solution which is linearly independent of $\psi(\cdot; z)$. Since the solution space on the edge adjacent to $u$ is two dimensional, one can linearly combine them to satisfy a self-adjoint BC (cf. (1.3)) at $u$. Thereby one produces an eigenfunction of a self-adjoint Laplacian $-\Delta_G$ with eigenvalue $z \in \mathbb{C}^+$. This contradicts the self-adjointness.
(ii) In fact, more generally for all $x \in G$ which disconnect the graph, we have that $\cos(\alpha)\, \psi(x; z) - \sin(\alpha)\, \psi'(x; z) \neq 0$ for all $\alpha \in [0, \pi)$. Otherwise one would have found a square-integrable, non-trivial eigenfunction with eigenvalue $z \in \mathbb{C}^+$ of a restriction of $-\Delta^*_{G,u}$ to functions on that disconnected piece, which does not contain $u$. Since this Laplacian is rendered self-adjoint by imposing $\alpha$-BC at $x$, this is a contradiction.
(iii) This is an immediate consequence of (A.1).

A.2. Basic properties of the Weyl-Titchmarsh functions. Following are some properties of $R^+(x; z)$ which are of relevance in the main part of the paper. Similar statements apply to $R^-$, with proofs differing only in the notation.

Theorem A.1. The WT function $R^+(x; z)$ has the following properties:
(i) $R^+(x; \cdot) : \mathbb{C}^+ \to \mathbb{C}^+$ is analytic for fixed $x$.
(ii) For each $e \in \mathcal{E}$ and all $z \in \mathbb{C}^+$:
\[ \big| R_e^+(0; z) \big| \le \frac{2 \sqrt{|z|}}{1 - \exp\!\big( -2 L_e \operatorname{Im} \sqrt{z} \big)}, \tag{A.2} \]
and $|R_e^+(L_e; z)| \le 2K \sqrt{|z|}\, \big( 1 - \exp( -2 L_e \operatorname{Im} \sqrt{z} ) \big)^{-1}$ due to (2.10).
(iii) Along any edge $e$ the function obeys the Riccati equation (2.9). In particular its values are related by Möbius transformations:
\[ R_e^+(l; z) = \frac{R_e^+(0; z) \cos\!\big(\sqrt{z}\, l\big) - \sqrt{z}\, \sin\!\big(\sqrt{z}\, l\big)}{\cos\!\big(\sqrt{z}\, l\big) + R_e^+(0; z) \sin\!\big(\sqrt{z}\, l\big)/\sqrt{z}} \tag{A.3} \]
for all $l \in [0, L_e]$ and $z \in \mathbb{C}^+$.
(iv) Equipping the space $[L_{\min}, \infty)^{\mathcal{E}}$ with the uniform topology, $R_0^+(0; z)$ is a continuous function of $\{L_e\}_{e \in \mathcal{E}} \in [L_{\min}, \infty)^{\mathcal{E}}$ for all $z \in \mathbb{C}^+$.

Proof. (i) The first assertion follows from the analyticity of $\psi^+(x; z|0)$ and of its derivative (cf. Theorem 2.1). The Herglotz-Nevanlinna property is a consequence of (2.7).
(ii) This is an immediate consequence of (A.6) and Lemma A.1(i) below.
(iii) This assertion follows from the fact that $\psi_e^+(\cdot; z|0) \in H^2[0, L_e]$ is a solution of the free Schrödinger equation $-\psi'' = z\psi$ on the interval $[0, L_e]$ which, using the boundary conditions at $l = 0$, may be written as
\[ \frac{\psi_e^+(l; z|0)}{\psi_e^+(0; z|0)} = \cos\!\big(\sqrt{z}\, l\big) + R_e^+(0; z)\, \frac{\sin\!\big(\sqrt{z}\, l\big)}{\sqrt{z}} \tag{A.4} \]
for all $l \in [0, L_e]$.
(iv) Suppose the metric tree $T$ is finite and has only $N$ generations, i.e., the number of edges connecting any edge to the root is at most $N$. In this case, the continuity of $R_0^+(0; z)$ follows from the explicit evolution equations (A.3) and (2.10). Lemma A.1(iii) below shows that $R_0^+(0; z)$ may be uniformly approximated by its values on a finite tree provided $\operatorname{Im} \sqrt{z}$ is large enough. Hence $R_0^+(0; z)$ is continuous for those $z \in \mathbb{C}^+$. Since $R_0^+(0; z)$ is analytic in $z \in \mathbb{C}^+$, this implies continuity for all $z \in \mathbb{C}^+$.

Remark A.1. Another immediate consequence of (A.4) and its analog with $0$ and $L_e$ interchanged is the bound
\[ e^{-\sqrt{|z|}\, L_e} \left( 1 + \frac{|R_e^+(L_e; z)|}{\sqrt{|z|}} \right)^{-1} \le \left| \frac{\psi_e^+(L_e; z|0)}{\psi_e^+(0; z|0)} \right| \le e^{\sqrt{|z|}\, L_e} \left( 1 + \frac{|R_e^+(0; z)|}{\sqrt{|z|}} \right), \tag{A.5} \]
which shows that $\psi^+(\cdot; z|0)$ de- or increases at most exponentially on any edge $e$.

Instead of the WT function $R^+$, it is sometimes more convenient to consider its transform
\[ m(x; z) := \frac{R^+(x; z) - i\sqrt{z}}{R^+(x; z) + i\sqrt{z}}, \tag{A.6} \]
which takes values in the complex unit disk. The evolution on the edges takes a particularly simple form for $m$. In fact, from (A.3) and (2.10) one obtains
\[ m_e(0; z) = e^{2i\sqrt{z}\, L_e}\, m_e(L_e; z), \quad \text{and} \quad m_e(L_e; z) = g\!\left( \sum_{f \in \mathcal{N}_e^+} \frac{1 + m_f(0; z)}{1 - m_f(0; z)} \right), \tag{A.7} \]
where $g(\zeta) := \frac{\zeta - 1}{\zeta + 1}$. The next lemma collects some facts which are used in the proof of Theorem A.1.
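The first relation in (A.7) can be checked directly against the Möbius transformation (A.3): substituting (A.3) into (A.6), the numerator becomes $(R_e^+(0;z) - i\sqrt{z})\, e^{-i\sqrt{z} l}$ and the denominator $(R_e^+(0;z) + i\sqrt{z})\, e^{i\sqrt{z} l}$. The Python sketch below verifies this numerically. The function names and the sample values of $z$, $R_e^+(0;z)$ and $l$ are hypothetical; the principal branch of the square root is used, which has positive imaginary part for $z$ in the upper half plane.

```python
import cmath


def moebius_step(R0, z, l):
    """Propagate a WT value from 0 to l along an edge, following (A.3)."""
    k = cmath.sqrt(z)                       # principal branch, Im k > 0 for Im z > 0
    c, s = cmath.cos(k * l), cmath.sin(k * l)
    return (R0 * c - k * s) / (c + R0 * s / k)


def cayley(R, z):
    """Transform m = (R - i sqrt(z)) / (R + i sqrt(z)), following (A.6)."""
    k = cmath.sqrt(z)
    return (R - 1j * k) / (R + 1j * k)


if __name__ == "__main__":
    z, R0, l = 1.0 + 0.5j, 0.3 + 0.7j, 1.3  # sample values (assumed)
    m_start = cayley(R0, z)
    m_end = cayley(moebius_step(R0, z, l), z)
    k = cmath.sqrt(z)
    # First relation in (A.7): m(0) = exp(2 i sqrt(z) l) m(l)
    print(abs(m_start - cmath.exp(2j * k * l) * m_end))
```

Since $|m_e(L_e; z)| \le 1$, the same relation gives the decay $|m_e(0; z)| \le \exp(-2 L_e \operatorname{Im}\sqrt{z})$.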



Lemma A.1. Let $z \in \mathbb{C}^+$ and assume $L_e \ge L_{\min} > 0$ for all $e \in \mathcal{E}$. Then $m(x; z)$ has the following properties:
(i) It satisfies: $|m_e(0; z)| \le \exp\!\big( -2 L_e \operatorname{Im} \sqrt{z} \big)$.
(ii) At the root the dependence on a particular value $m_e(L_e; z)$ is uniformly exponential in the sense that there exists a constant $c < \infty$ such that for all $\operatorname{Im} \sqrt{z}$ sufficiently large:
\[ \left| \frac{\partial m_0(0; z)}{\partial m_e(L_e; z)} \right| \le c^N \exp\!\big( -2 N L_{\min} \operatorname{Im} \sqrt{z} \big), \tag{A.8} \]
where $N$ is the number of vertices between the edge $e$ and the root.
(iii) Let $m_0(0; z)$ and $\tilde m_0(0; z)$ correspond to metric tree graphs $T$ and $\tilde T$ which coincide up to the $N$th generation. Then for all $\operatorname{Im} \sqrt{z}$ sufficiently large:
\[ \big| m_0(0; z) - \tilde m_0(0; z) \big| \le 2 K^{N+1} c^N \exp\!\big( -2 N L_{\min} \operatorname{Im} \sqrt{z} \big) . \tag{A.9} \]

Proof. (i) This is an immediate consequence of the first evolution equation (A.7).
(ii) Using the chain rule this can be traced back to a straightforward differentiation of Eqs. (A.7). The edge and vertex terms are subsequently bounded with the help of (i).
(iii) We expand the difference into a telescopic sum of $K^{N+1}$ differences and use both (ii) and the fact that the values of $m(\cdot; z)$ and $\tilde m(\cdot; z)$ on the $K^{N+1}$ leaves in the $N$th generation differ at most by a complex number of modulus 2.

A.3. The Green function on a tree graph. Analogously to one dimension [4, 3], the Green function of the Laplacian $-\Delta_T$ on a metric tree graph $T$ can be constructed using two non-vanishing square-integrable functions. In fact, the following lemma is straightforward.

Lemma A.2. The Green function $G_T(u, x; z)$ of the Laplacian $-\Delta_T$ can be expressed as
\[ G_T(u, x; z) = \frac{\big( \psi^+ \wedge \psi^- \big)(u, x; z|v)}{W(\psi^+, \psi^-)(u; z|v)}, \tag{A.10} \]
independently of $v$, as long as $v \in T_u^+ \cap T_x^+$. Here
\[ \big( \psi^+ \wedge \psi^- \big)(u, x; z|v) := \begin{cases} \psi^+(u; z|0)\, \psi^-(x; z|v) & \text{for } x \in T_u^-, \\ \psi^-(u; z|v)\, \psi^+(x; z|0) & \text{for } x \in T_u^+, \end{cases} \tag{A.11} \]
and $W(\psi^+, \psi^-) := \psi^+\, \partial\psi^-/\partial x - (\partial\psi^+/\partial x)\, \psi^-$ is the Wronskian.

Remarks A.1. (i) The Wronskian is constant along any edge in $T_v^-$. In particular, this implies that $W(\psi^+, \psi^-) \neq 0$, since otherwise one could linearly combine $\psi^\pm$ to a square-integrable solution of $(-\Delta_T - z)\, \psi = 0$ on the whole tree.
(ii) The right side of (A.10) defines an integral kernel of the resolvent $(-\Delta_T - z)^{-1}$ which is jointly continuous in $(u, x)$.



(iii) Setting $f_v^\pm(\cdot; z|\cdot) := \psi^\pm(\cdot; z|\cdot)/\psi^\pm(v; z|\cdot)$, where $v$ is any point on the same edge as $u$, the Green function (A.10) can be rewritten in terms of WT functions:
\[ G_T(u, x; z) = -\frac{\big( f_v^+ \wedge f_v^- \big)(u, x; z)}{R^+(v; z) + R^-(v; z)} . \tag{A.12} \]
In particular, for $u = x = v$, we obtain (2.5). Moreover, at the root, we obtain
\[ G_T(0, 0; z) = \big( \cot\alpha - R^+(0; z) \big)^{-1}, \quad \alpha \neq 0, \tag{A.13} \]
because $R^-(0; z) = -\cot\alpha$ due to the BC (1.3).

For a self-adjoint Sturm-Liouville, or more specifically, Schrödinger operator on the half-line, the WT function at the origin allows one to reconstruct the spectral measure and therefore contains all spectral information [4, 3]. Generally, this fails to hold for operators on tree graphs. However, the AC spectrum of $-\Delta_T$ can still be detected by the boundary value of $R^+(0; z)$:

Theorem A.2. The AC spectrum $\sigma_{ac}(-\Delta_T)$ of the Laplacian on a rooted metric tree graph $T$ is concentrated on the set
\[ \big\{ E \in \mathbb{R} : 0 < \operatorname{Im} R_0^+(0; E + i0) < \infty \big\} . \tag{A.14} \]

Proof. Pick any edge $e \in \mathcal{E}$ and let $\varphi$ be a compactly supported function on $e$. A straightforward but tedious computation using (A.12) shows that the AC density of the spectral measure associated with $\varphi$ is given by
\[ \lim_{\eta \downarrow 0} \operatorname{Im} \big\langle \varphi, (-\Delta_T - E - i\eta)^{-1} \varphi \big\rangle = \frac{\operatorname{Im} R_e^+(0; E + i0)\, g_\varphi^-(E) + \operatorname{Im} R_e^-(0; E + i0)\, g_\varphi^+(E)}{\big| R_e^+(0; E + i0) + R_e^-(0; E + i0) \big|^2} \tag{A.15} \]
for Lebesgue-almost every $E \in \mathbb{R}$, where
\[ g_\varphi^\pm(E) := \big\langle \varphi, \operatorname{Re} f^\pm(\cdot; E) \big\rangle^2 + \big\langle \varphi, \operatorname{Im} f^\pm(\cdot; E) \big\rangle^2 \tag{A.16} \]
and $f^\pm(\cdot; E)$ is the solution of the Schrödinger equation $(-\Delta_T - E)\, f = 0$, which satisfies $df_f^\pm/dx\,(0; E) = \pm R_f^\pm(0; E + i0)$ at every edge $f$ and is normalized to $f_e^\pm(0; E) = 1$. By the current conservation (2.12) along each edge and the positivity and additivity of the current at each vertex, we have
\[ \operatorname{Im} R_e^+(0; E + i0) \le \big| f_0^+(0; E) \big|^2 \operatorname{Im} R_0^+(0; E + i0) \tag{A.17} \]
for Lebesgue-almost all $E \in \mathbb{R}$. Similarly, by tracing the current flow on the backward tree emanating from $e$ and using $\operatorname{Im} R_0^-(0; E + i0) = 0$, we obtain for Lebesgue-almost all $E \in \mathbb{R}$,
\[ \operatorname{Im} R_e^-(0; E + i0) = \sum_f \big| f_f^-(0; E) \big|^2 \operatorname{Im} R_f^+(0; E + i0), \tag{A.18} \]
where the sum extends over all edges $f \neq e$ which have the same distance to the root as $e$. From the above considerations we conclude that for any $e$ and Lebesgue-almost all $E \in \mathbb{R}$, if $\operatorname{Im} R_0^+(0; E + i0) = 0$, then
1. $\operatorname{Im} R_e^+(0; E + i0) = 0$, and
2. $\operatorname{Im} R_e^-(0; E + i0) = 0$.
But this shows that $\sigma_{ac}(-\Delta_T)$ is indeed concentrated on the set in (A.14).



Acknowledgement. We are grateful to Uzy Smilansky for stimulating discussions concerning quantum graphs. We would also like to express thanks for the gracious hospitality enjoyed at the Weizmann Institute (MA) and the Department of Mathematics at UC Davis (SW). This work was supported in part by the Einstein Center for Theoretical Physics and the Minerva Center for Nonlinear Physics at the Weizmann Institute, by the US National Science Foundation, and by the Deutsche Forschungsgemeinschaft.

References
1. Acosta, V., Klein, A.: Analyticity of the density of states in the Anderson model in the Bethe lattice. J. Stat. Phys. 69, 277–305 (1992)
2. Aizenman, M., Sims, R., Warzel, S.: Stability of the absolutely continuous spectrum of random Schrödinger operators on tree graphs. http://arxiv.org/list/math-ph/0502006, 2005. To appear in Probab. Theor. Relat. Fields
3. Carmona, R., Lacroix, J.: Spectral theory of random Schrödinger operators. Boston: Birkhäuser, 1990
4. Coddington, E.A., Levinson, N.: Theory of ordinary differential equations. New York: McGraw-Hill, 1955
5. Duren, P.L.: Theory of H^p spaces. New York: Academic, 1970
6. Hupfer, T., Leschke, H., Müller, P., Warzel, S.: Existence and uniqueness of the integrated density of states for Schrödinger operators with magnetic fields and unbounded random potentials. Rev. Math. Phys. 13, 1547–1581 (2001)
7. Kostrykin, V., Schrader, R.: Kirchhoff's rule for quantum wires. J. Phys. A 32, 595–630 (1999)
8. Kostrykin, V., Schrader, R.: A random necklace model. Waves and random media 14, S75–S9032 (2004)
9. Kotani, S.: One-dimensional random Schrödinger operators and Herglotz functions. In: K. Ito (ed.), Taneguchi Symp. PMMP, Amsterdam: North Holland, 1985, pp. 219–250
10. Kottos, T., Smilansky, U.: Periodic orbit theory and spectral statistics for quantum graphs. Ann. Phys. 274, 76–124 (1999)
11. Kuchment, P.: Graph models for waves in thin structures. Waves and random media 12, R1–R24 (2002)
12. Kuchment, P.: Quantum graphs: I. Some basic structures. Waves and random media 14, S107–S128 (2004)
13. Kuchment, P.: Quantum graphs II. Some spectral properties of quantum and combinatorial graphs. preprint. J. Phys. A: Math. Gen. 38, 4887–4900 (2005)
14. Miller, J.D., Derrida, B.: Weak disorder expansion for the Anderson model on a tree. J. Stat. Phys. 75, 357–388 (1993)
15. Minami, N.: An extension of Kotani's theorem to random generalized Sturm-Liouville operators. Commun. Math. Phys. 103, 387–402 (1986)
16. Minami, N.: An extension of Kotani's theorem to random generalized Sturm-Liouville operators II. In: Stochastic processes in classical and quantum systems, Lecture Notes in Physics 262, Berlin-Heidelberg-New York: Springer, 1986, pp. 411–419
17. Pastur, L., Figotin, A.: Spectra of random and almost-periodic operators. Berlin: Springer, 1992
18. Reed, M., Simon, B.: Methods of modern mathematical physics IV: Analysis of operators. New York: Academic Press, 1978
19. Schanz, H., Smilansky, U.: Periodic-orbit theory of Anderson-localization on graphs. Phys. Rev. Lett. 84, 1427–1430 (2000)
20. Schenker, J.H., Aizenman, M.: The creation of spectral gaps by graph decorations. Lett. Math. Phys. 53, 253–262 (2000)
21. Sobolev, A.V., Solomyak, M.: Schrödinger operators on homogeneous metric trees: spectrum in gaps. Rev. Math. Phys. 14, 421–467 (2002)
22. Solomyak, M.: On the spectrum of the Laplacian on regular metric trees. Waves and random media 14, S155–S171 (2004)

Communicated by B. Simon




Fermionic Quantization of Hopf Solitons

S. Krusch (1), J.M. Speight (2)

(1) Institute of Mathematics, University of Kent, Canterbury CT2 7NF, England. E-mail: [email protected]
(2) Department of Pure Mathematics, University of Leeds, Leeds LS2 9JT, England. E-mail: [email protected]

Received: 14 April 2005 / Accepted: 10 June 2005 Published online: 24 November 2005 – © Springer-Verlag 2005

Abstract: In this paper we show how to quantize Hopf solitons using the Finkelstein-Rubinstein approach. Hopf solitons can be quantized as fermions if their Hopf charge is odd. Symmetries of classical minimal energy configurations induce loops in configuration space which give rise to constraints on the wave function. These constraints depend on whether the given loop is contractible. Our method is to exploit the relationship between the configuration spaces of the Faddeev-Hopf and Skyrme models provided by the Hopf fibration. We then use recent results in the Skyrme model to determine whether loops are contractible. We discuss possible quantum ground states up to Hopf charge Q = 7.

1. Introduction

The possibility of knot-like solitons in a nonlinear field theory was first proposed by Faddeev in 1975 [10]. In 1997, interest in the model was revived by an article by Faddeev and Niemi [11]: the advent of larger computer power and a better understanding of the initial conditions led to a series of papers. In [15] axially symmetric configurations were studied extensively. Papers by Battye and Sutcliffe showed that for higher Hopf charge twisted, knotted and linked configurations occur [7, 8]. The most recent results are due to Hietarinta and Salo [16, 17]. Stable and metastable static solutions have now been explored up to Hopf charge Q = 8.

Quantization of Hopf solitons was first discussed in [15]. More recently Su described a collective coordinate quantization in [27] which was motivated by the collective coordinate quantization of Skyrmions in [1]. However, collective coordinate quantizations can be potentially misleading unless the topology of configuration space is examined carefully [5].

In this paper we describe the fermionic quantization of Hopf solitons following an old idea of Finkelstein and Rubinstein [12]. Solitons in scalar field theories can consistently be quantized as fermions provided the fundamental group of configuration space


has a Z2 subgroup generated by a loop in which two identical solitons are exchanged. Loops in configuration space give rise to so-called Finkelstein-Rubinstein constraints which depend on whether the loop is contractible. The Skyrme model [28] was the main motivation for this approach; see [20] for further references. Symmetries of classical configurations induce loops in configuration space. After quantization these loops give rise to constraints on the wave function. Recently, a simple formula has been found to determine whether a loop in the configuration space of Skyrmions is contractible [20]. We shall exploit the fact that Skyrmions and Hopf solitons are related via the Hopf map to use Skyrmions as a tool to study Hopf solitons. This paper is organized as follows. In Sect. 2 we discuss the configuration space of Hopf solitons for general domains. The configuration space of Skyrmions can be related to Hopf solitons via the Hopf map which is a fibration. This mathematical structure enables us to prove that the Hopf map induces, in certain circumstances, an isomorphism between the fundamental groups of the Skyrme and Faddeev-Hopf configuration spaces. In Sect. 3 we summarize some known facts about Hopf solitons. In Sect. 4 we describe how to quantize a Hopf soliton as a fermion and calculate possible ground states in the Faddeev-Hopf model. In the following section, we discuss collective coordinate quantization in this context. We end with some concluding remarks.

2. The Topology of Configuration Space

Let $M$ be a compact, connected, oriented 3-manifold and $p_0 \in M$ be a marked point. The case of most interest is $M = S^3$, interpreted as the one point compactification of $\mathbb{R}^3$ with $p_0$ representing the boundary at infinity. The configuration space we seek to study is $(S^2)^M$, the space of based maps $M \to S^2$, that is, continuous maps sending the chosen point $p_0$ to a chosen point in $S^2$, $(0, 0, 1)$ say. We also define the space $\mathrm{Free}(M, S^2)$ of unbased maps $M \to S^2$ and similarly $(S^3)^M$ and $\mathrm{Free}(M, S^3)$, where the chosen point is $(1, 0) \in S^3 \subset \mathbb{C}^2$, say. All such spaces are given the compact open topology (equivalent to the $C^0$ topology). Our goal in this section is to relate the topology of $(S^2)^M$, the Faddeev-Hopf configuration space, to that of $(S^3)^M$, the standard Skyrme configuration space.

The connected components of $(S^2)^M$ were enumerated and classified by Pontrjagin [25]. Let $\mu$ be a generating 2-cocycle for $H^2(S^2; \mathbb{Z}) = \mathbb{Z}$. Then given $\phi \in (S^2)^M$ one has an associated 2-cocycle $\phi^*\mu \in H^2(M; \mathbb{Z})$ by pullback. No two maps $M \to S^2$ having noncohomologous 2-cocycles can be homotopic, and every 2-cocycle on $M$ is cohomologous to the pullback of $\mu$ by some map. Thus, the homotopy classes of maps $M \to S^2$ fall into disjoint families labelled by $H^2(M; \mathbb{Z})$. Within any such family, the classes are labelled by elements of $H^3(M; \mathbb{Z})/\big( 2[\phi^*\mu] \cup H^1(M; \mathbb{Z}) \big)$. Note that this group varies from family to family and that to compute it requires knowledge of the ring structure on $H^*(M; \mathbb{Z})$. The most important family is the one with $[\phi^*\mu] = 0$, the so-called algebraically inessential maps. Classes within this family are labelled by elements of $H^3(M; \mathbb{Z}) = \mathbb{Z}$, identified with the Hopf charge $Q$, which we would like to interpret as the soliton number of the configuration, that is, the excess of solitons over antisolitons. Let us denote the space of algebraically inessential maps by $(S^2)^M_* \subset (S^2)^M$. Note that these sets coincide if $H^2(M; \mathbb{Z}) = 0$, for example, when $M = S^3$. Configurations outside $(S^2)^M_*$ wrap some 2-cycle in $M$ nontrivially around $S^2$. They are bound to some topological defect in physical space and so are arguably not localized topological solitons at all. We shall not consider their physics in this paper.


Our main tool will be the Hopf map $\pi : S^3 \to S^2$, most conveniently defined by identifying $S^3$ with the unit sphere in $\mathbb{C}^2$ and $S^2$ with $\mathbb{CP}^1$, for then
\[ \pi : (z_1, z_2) \mapsto [z_1, z_2] . \tag{2.1} \]
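For concreteness, the Hopf map (2.1) can be written in coordinates. The Python sketch below assumes the common identification of $\mathbb{CP}^1$ with the unit sphere in $\mathbb{R}^3$ given by $[z_1, z_2] \mapsto (2\operatorname{Re}(z_1 \bar z_2),\, 2\operatorname{Im}(z_1 \bar z_2),\, |z_1|^2 - |z_2|^2)$; this particular convention is an assumption and is not specified in the text. With it, the point $(1, 0)$ is sent to $(0, 0, 1)$, and multiplying $(z_1, z_2)$ by a common phase does not change the image, which exhibits the circle fibres.

```python
import cmath
import math


def hopf(z1, z2):
    """Send (z1, z2) on the unit sphere of C^2 to a point of S^2 in R^3.

    Uses the (assumed) convention [z1, z2] -> (2 Re w, 2 Im w, |z1|^2 - |z2|^2)
    with w = z1 * conj(z2).
    """
    z1, z2 = complex(z1), complex(z2)
    w = z1 * z2.conjugate()
    return (2 * w.real, 2 * w.imag, abs(z1) ** 2 - abs(z2) ** 2)


if __name__ == "__main__":
    print(hopf(1, 0))                       # image of the marked point (1, 0)
    phase = cmath.exp(0.7j)                 # an arbitrary element of U(1)
    p = hopf(0.6, 0.8j)
    q = hopf(0.6 * phase, 0.8j * phase)     # same fibre, so same image
    print(p, q, math.sqrt(sum(x * x for x in p)))
```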

Note that $\pi$ sends the marked point $(1, 0) \in S^3$ to the marked point $[1, 0] \in S^2$, corresponding to the North pole, $(0, 0, 1)$. The map $\pi$ is a fibration, that is, it has the homotopy lifting property with respect to all domains. A map $\phi : M \to S^2$ has a lift $\tilde\phi : M \to S^3$ (where $\pi \circ \tilde\phi = \phi$) if and only if $\phi^*\mu = 0$, that is, if and only if $\phi \in (S^2)^M_*$. The integer in $H^3(M; \mathbb{Z})$ labelling the class of $\phi$ is precisely the degree of $\tilde\phi : M \to S^3$, that is, the baryon number of the Skyrme configuration $\tilde\phi$. This was shown explicitly for $M = S^3$ in [24]. So, given a Skyrme configuration of degree $Q$, we may produce an algebraically inessential Hopf configuration of charge $Q$ by composition with the Hopf map. In this way we produce a map $\pi_* : (S^3)^M \to (S^2)^M_*$. To what extent does the topology of $(S^3)^M$ determine that of $(S^2)^M_*$?

Theorem 1. The map $\pi_* : (S^3)^M \to (S^2)^M_*$ induced by the Hopf fibration is a Serre fibration.

Proof. We must prove that the map has the homotopy lifting property with respect to all disks $D^k$ [23], that is, that the commutative diagram below left may be completed by a map $\tilde H$ along the diagonal. Here $H$ is a homotopy between two maps $f_0, f_1 : D^k \to (S^2)^M_*$ and $\tilde f_0$ is a lift of $f_0$. Using the identification of $g : Z \to Y^X$ with $\hat g : Z \times X \to Y$, we produce the commutative diagram below right. Now the homotopy $\hat H$ certainly does lift to $\hat{\tilde H}$ since $\pi$ is a fibration. From $\hat{\tilde H}$ we produce a map $\check H : D^k \times I \to \mathrm{Free}(M, S^3)$ by $(\check H(d, t))(p) = \hat{\tilde H}(p, d, t)$. A priori, this is not necessarily the lifted homotopy we seek, however, since there is no reason why it should respect the basing condition.
\[ \begin{array}{ccc} D^k \times \{0\} & \xrightarrow{\ \tilde f_0\ } & (S^3)^M \\ {\scriptstyle \iota} \downarrow & & \downarrow {\scriptstyle \pi_*} \\ D^k \times [0, 1] & \xrightarrow{\ H\ } & (S^2)^M_* \end{array} \qquad\qquad \begin{array}{ccc} M \times D^k \times \{0\} & \xrightarrow{\ \hat{\tilde f}_0\ } & S^3 \\ {\scriptstyle \iota} \downarrow & & \downarrow {\scriptstyle \pi} \\ M \times D^k \times [0, 1] & \xrightarrow{\ \hat H\ } & S^2 \end{array} \]
The diagonal arrows, from the lower left to the upper right corner, are $\tilde H$ in the left diagram and $\hat{\tilde H}$ in the right diagram.

Let $U \subset S^2$ be a small closed ball centred on $(0, 0, 1)$ and choose a local trivialization of the Hopf bundle $S^1 \to S^3 \xrightarrow{\pi} S^2$ over $U$. Then by continuity of $\hat H$ and compactness of $D^k \times [0, 1]$, there exists a closed ball $B \subset M$ centred on $p_0$ so that the restriction $\hat{\tilde H}| : B \times D^k \times I \to S^3$ takes values in $\pi^{-1}(U)$. We may write it, with respect to our local trivialization, as
\[ \hat{\tilde H}|(p, d, t) = \big( \hat H(p, d, t), \lambda(p, d, t) \big), \]
where $\lambda : B \times D^k \times I \to S^1$. In this language, we are done if $\lambda| : \{p_0\} \times D^k \times I \to \{1\}$, for then the map $\check H$ does satisfy the basing criteria. Note that we are free to change $\lambda$ to any continuous map $\lambda_*$ we please, provided we do not change it on $\partial B \times D^k \times I$, since this just shifts $\hat{\tilde H}$ along the fibres of $S^3$, which does not change $\pi \circ \hat{\tilde H}$, so that the altered



map is still a lift of Hˆ , and is still continuous. Now since ∂B × D k × I deformation retracts to S 2 and π2 (S 1 ) = 0, λ| : ∂B × D k × I is nullhomotopic and we may construct the required λ∗ : B × D × I → S 1 by applying the null homotopy radially in B. Our main interest is to understand the fundamental group of each connected component of (S 2 )M ∗ . Given any map ρ : X → Y , there is a natural homomorphism ρ∗ : π1 (X) → π1 (Y ) defined by composition of loops in X with ρ. The fact that π∗ , which we will henceforth denote ρ, is a Serre fibration allows us to obtain a short exact sequence 3 relating π1 ((S 3 )M ) and π1 ((S 2 )M ∗ ). In the case M = S this reduces to the statement that the homomorphism ρ∗ associated with ρ is actually an isomorphism. We can therefore determine the homotopy class of a loop in the Hopf configuration space by lifting it to a loop in the Skyrme configuration space and applying known results. Theorem 2. The map ρ : (S 3 )M → (S 2 )M ∗ obtained from the Hopf fibration induces a short exact sequence of groups ρ∗

1 0 → π1 ((S 3 )M ) → π1 ((S 2 )M ∗ ) → H (M; Z) → 0. ρ

Proof. Given any Serre fibration $F \to E \to B$, where $F$, $E$, $B$ denote the fibre, total space and base, we have an induced long exact sequence of homotopy groups:
\[ \cdots \to \pi_1(F) \to \pi_1(E) \xrightarrow{\rho_*} \pi_1(B) \to \pi_0(F) \to \pi_0(E) \xrightarrow{\rho_*} \pi_0(B) \to 0. \tag{2.2} \]

In the case at hand, $E = (S^3)^M$, $B = (S^2)^M_*$ and $F = (S^1)^M$. Using the identification $S^1 = U(1)$, we see that $F$ is a topological group, so all its connected components are homeomorphic. The components of $G^M$ for any Lie group $G$ are enumerated in [3] while $\pi_1(G^M)$ is constructed in [4]. The relevant results here are $\pi_0(F) = H^1(M; \mathbb{Z})$ and $\pi_1(F) = 0$. Note also that $\pi_0(E) = \pi_0(B) = H^3(M; \mathbb{Z}) = \mathbb{Z}$ by the theorems of Hopf and Pontrjagin. Substituting in (2.2) gives
\[ 0 \to \pi_1(E) \xrightarrow{\rho_*} \pi_1(B) \to H^1(M; \mathbb{Z}) \to \mathbb{Z} \xrightarrow{\rho_*} \mathbb{Z} \to 0. \tag{2.3} \]

By exactness, the second $\rho_*$ is surjective, and there are only two surjective homomorphisms $\mathbb{Z} \to \mathbb{Z}$ (namely $1 \mapsto 1$ and $1 \mapsto -1$), both of which are injective. So we see that the second $\rho_*$ is an isomorphism. Since the second $\rho_*$ has trivial kernel, the image of $H^1(M; \mathbb{Z})$ in $\mathbb{Z}$ is 0 by exactness, and the sequence truncates as was claimed.

We note in passing that this provides an algebraic proof that the Hopf map takes degree $Q$ Skyrme configurations to Hopf charge $Q$ (or $-Q$ if the orientation on $M$ or $S^3$ is swapped) Faddeev-Hopf configurations, since this is precisely the statement that $\rho_* : \pi_0((S^3)^M) \to \pi_0((S^2)^M_*)$ is an isomorphism. By identifying the Hopf degree of $\phi \in (S^2)^M_*$ with the degree of its lift $\tilde{\phi} \in (S^3)^M$, we adopt the standard convention that the Hopf map $\pi \in (S^2)^{S^3}$ itself has Hopf degree $+1$.

The short exact sequence does not tell us precisely what $\pi_1((S^2)^M_*)$ is in general. One useful class of domains (which includes $M = S^3$) where we do know the answer is those with finite fundamental group.

Corollary 3. If $\pi_1(M)$ is finite then $\rho_* : \pi_1((S^3)^M) \to \pi_1((S^2)^M_*)$ induced by the Hopf map is an isomorphism.



Proof. The result follows once we show that $H^1(M; \mathbb{Z}) = 0$. By the Universal Coefficient Theorem, $H^1(M; \mathbb{Z})$ is isomorphic to the free part of $H_1(M; \mathbb{Z})$, since $H_0(M; \mathbb{Z}) = \mathbb{Z}$ has no torsion. But $H_1(M; \mathbb{Z})$ is isomorphic to the abelianization of $\pi_1(M)$ which, being finite, can have no free part.

These results are useful because a lot is known about the topology of $(S^3)^M$ since it can be identified with the topological group $G^M$, where $G = SU(2)$. The canonical identification is given by
\[ S^3 \to SU(2) : (z_1, z_2) \mapsto U = \begin{pmatrix} z_1 & -\bar{z}_2 \\ z_2 & \bar{z}_1 \end{pmatrix}. \tag{2.4} \]
This map is well-defined because $|z_1|^2 + |z_2|^2 = 1$ implies that $U^\dagger U = U U^\dagger = I_2$ and $\det U = 1$. Also note that $(1, 0) \mapsto I_2$. Since $(SU(2))^M$ is a topological group all connected components of $(S^3)^M$ are homeomorphic, and the fundamental group is abelian. A loop in the identity component of $SU(2)^M$ based at the constant map $M \to \{I_2\}$ may be thought of as a map from $S^1 \wedge M$ to $SU(2)$, where $\wedge$ denotes smash product. If $M = S^3$ then $S^1 \wedge M = S^4$ and $\pi_4(SU(2)) = \mathbb{Z}_2$, so we have that $\pi_1((S^2)^{S^3}_*) = \pi_1(SU(2)^{S^3}) = \mathbb{Z}_2$ for all components. Using a similar argument for the vacuum sector $(S^2)^M_0$ of the Faddeev-Hopf model, we could very easily have shown that, for $M = S^3$, $\pi_1((S^2)^M_0) = \pi_4(S^2) = \mathbb{Z}_2$. Note that we have actually proved much more than this, however: the fundamental group of every connected component of the Faddeev-Hopf configuration space is $\mathbb{Z}_2$ and, crucially, the map from the Skyrme configuration space induced by the Hopf fibration is an isomorphism.

The above results will suffice for our purposes. In fact, one can say much more about the algebraic topology of $(S^2)^M$, with $M$ a general compact oriented 3-manifold. It turns out that all components of $(S^2)^M_*$ are homeomorphic, though the same fails to be true for the full space $(S^2)^M$.
Furthermore, it is possible to compute both the fundamental group and the whole real cohomology ring (including its cup product structure) of any component of $(S^2)^M$. These results are obtained [4] by exploiting a somewhat less obvious relationship between $(S^2)^M$ and the vacuum (degree 0) sector of $SU(2)^M$. Essentially, all Faddeev-Hopf configurations in a given sector may be obtained from a fixed map in that sector by acting on the codomain with some degree 0 Skyrme configuration. This gives natural maps from the vacuum sector of the Skyrme model to each sector of the Faddeev-Hopf model, which can be shown to have many topologically natural properties. The topological results we present here are not so powerful as those of [4], but they are also less technical and may be visualized rather concretely. Most importantly, they are particularly well-suited to the study of Finkelstein-Rubinstein quantization in the Faddeev-Hopf model.

3. The Faddeev-Hopf Model

From now on we consider only the case $M = S^3$, interpreted as the one point compactification of $\mathbb{R}^3$ with the point $p_0$ representing the boundary at infinity. The most extensively studied model of this kind is due to Faddeev [10] who suggested the following Lagrangian density:
\[ \mathcal{L} = \frac{1}{2}\, \partial_\mu \mathbf{n} \cdot \partial^\mu \mathbf{n} - \frac{\lambda}{4}\, (\partial_\mu \mathbf{n} \times \partial_\nu \mathbf{n}) \cdot (\partial^\mu \mathbf{n} \times \partial^\nu \mathbf{n}), \tag{3.1} \]



where the field $\mathbf{n} = (n_1, n_2, n_3)$ takes values on the 2-sphere, that is $|\mathbf{n}|^2 = 1$, $\lambda$ is a coupling constant, and the boundary condition is $\mathbf{n}(\infty) = (0, 0, 1)$. We have changed notation from $\phi$ to $\mathbf{n}$ for the field so as to fit in with the existing literature on the model. Note that the second term in (3.1) stabilizes the solitons against radial rescaling. As discussed in Sect. 2 the Hopf charge $Q$ can be identified with the degree of any lift of $\mathbf{n}$ to $\tilde{\mathbf{n}} : \mathbb{R}^3 \to S^3$. The energy $E$ of a static configuration of Hopf charge $Q$ is bounded below by
\[ E \geq c\, |Q|^{3/4}, \tag{3.2} \]

where $c$ is a constant. For more details see [29, 30]. The Lagrangian of the model has $E(3) \times O(3)$ symmetry. Since spatial translations are rather trivial we will not discuss them any further. The target space $O(3)$ symmetry is broken to $O(2)$ symmetry by the boundary condition. Kundu and Rybakov showed in [21] that topologically nontrivial configurations admit at most an axial (one-parameter) symmetry. General configurations with axial symmetry are discussed in [15]. Special configurations with axial symmetry have been studied recently in [17] and can be described in the following way. Introduce toroidal coordinates $(\eta, \xi, \phi)$ on $\mathbb{R}^3$ defined by
\[ x = a\, \frac{\sinh\eta \cos\phi}{\cosh\eta - \cos\xi}, \qquad y = a\, \frac{\sinh\eta \sin\phi}{\cosh\eta - \cos\xi}, \qquad z = a\, \frac{\sin\xi}{\cosh\eta - \cos\xi}. \tag{3.3} \]
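The coordinate change (3.3) is easy to probe numerically. The following is a minimal sketch, not part of the original analysis: the function name `toroidal_to_cartesian` and the sample values are our own illustrative choices. It evaluates (3.3) and samples the two limiting regimes, large $\eta$ (points near the circle $x^2 + y^2 = a^2$, $z = 0$) and small $\eta$ (points near the $z$-axis).

```python
import math

def toroidal_to_cartesian(eta, xi, phi, a=1.0):
    """Evaluate (3.3): toroidal (eta, xi, phi) -> Cartesian (x, y, z)."""
    denom = math.cosh(eta) - math.cos(xi)
    x = a * math.sinh(eta) * math.cos(phi) / denom
    y = a * math.sinh(eta) * math.sin(phi) / denom
    z = a * math.sin(xi) / denom
    return x, y, z

# Large eta: the point should lie close to the circle of radius a in the plane z = 0.
x, y, z = toroidal_to_cartesian(12.0, 0.7, 1.3, a=2.0)
rho_large_eta = math.hypot(x, y)

# Small eta: the point should lie close to the z-axis.
x0, y0, z0 = toroidal_to_cartesian(1e-9, 0.7, 1.3, a=2.0)
rho_small_eta = math.hypot(x0, y0)
```

The sample point with $\eta = 12$ sits within about $10^{-4}$ of the circle of radius $a = 2$, while the point with $\eta = 10^{-9}$ has cylindrical radius of order $10^{-8}$.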

These coordinates form a canonically oriented orthogonal system covering all of $\mathbb{R}^3$ except the circle $C = \{x^2 + y^2 = a^2, z = 0\}$ and the $z$-axis. Surfaces of constant $\eta \in (0, \infty)$ are tori of revolution about the $z$-axis, whose generating curves are circles that are not centred on $C$. As $\eta \to \infty$ these tori collapse to the circle $C$ and as $\eta \to 0$ they collapse to the $z$-axis. Each torus of constant $\eta$ is parametrized by the angular coordinates $(\phi, \xi)$; $\phi$ is the angle around the $z$ axis, $\xi$ is an angular coordinate around the generating circle of the torus (though not the polar angle about its centre). The maps of interest are most easily written in terms of a complex stereographic coordinate $W$ on $S^2$. Projecting from $(0, 0, 1)$, so that $W = (n_1 + i n_2)/(1 - n_3)$, they take the form$^{1}$
\[ W = f(\eta)\, e^{i(m\xi - n\phi)}, \tag{3.4} \]
where $f(\eta)$ satisfies the boundary conditions $f(0) = \infty$ and $f(\infty) = 0$. Inverting the stereographic projection yields
\[ \mathbf{n} = \left( \frac{2f}{f^2 + 1} \cos(m\xi - n\phi),\; \frac{2f}{f^2 + 1} \sin(m\xi - n\phi),\; \frac{f^2 - 1}{f^2 + 1} \right). \tag{3.5} \]

This ansatz will be referred to as the toroidal ansatz. Here the word “ansatz” is used rather loosely for an approximation which is a good initial guess for the numerically calculated static solution. It is worth mentioning that the toroidal ansatz gives rise to exact solutions for the Lagrangian density $\mathcal{L} = (H_{\mu\nu} H^{\mu\nu})^{3/4}$, where $H_{\mu\nu} = \mathbf{n} \cdot (\partial_\mu \mathbf{n} \times \partial_\nu \mathbf{n})$ [2]. Under rotation by $\alpha$ around the $z$ axis the toroidal coordinates change to $(\eta, \xi, \phi + \alpha)$, which rotates the vector $\mathbf{n}$ by $-n\alpha$ around the third axis in target space. Obviously, this rotation can be undone by a rotation by $n\alpha$ around the third axis in target space.

$^{1}$ Note that we have changed the sign of $n$ relative to [17].
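As a quick consistency check on (3.4) and (3.5), the sketch below (with hypothetical helper names `n_from_ansatz` and `stereographic`, and arbitrarily chosen sample values) confirms numerically that the vector (3.5) has unit length, that its stereographic coordinate reproduces $W = f e^{i(m\xi - n\phi)}$, and that the shift $\phi \to \phi + \alpha$ multiplies $W$ by $e^{-in\alpha}$, that is, rotates $\mathbf{n}$ by $-n\alpha$ about the third target space axis.

```python
import cmath
import math

def n_from_ansatz(f, m, n, xi, phi):
    """Unit vector (3.5) for profile value f and integers m, n."""
    theta = m * xi - n * phi
    d = f * f + 1.0
    return (2 * f * math.cos(theta) / d,
            2 * f * math.sin(theta) / d,
            (f * f - 1.0) / d)

def stereographic(vec):
    """W = (n1 + i n2) / (1 - n3), projecting from (0, 0, 1)."""
    n1, n2, n3 = vec
    return complex(n1, n2) / (1.0 - n3)

# Arbitrary sample values, chosen only for illustration.
f, m, n, xi, phi, alpha = 0.8, 2, 3, 0.4, 1.1, 0.25

vec = n_from_ansatz(f, m, n, xi, phi)
norm_sq = sum(c * c for c in vec)
W_expected = f * cmath.exp(1j * (m * xi - n * phi))
W_recovered = stereographic(vec)

# Spatial rotation by alpha about the z axis: phi -> phi + alpha.
W_rotated = stereographic(n_from_ansatz(f, m, n, xi, phi + alpha))
```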



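There is an obvious lift of any map $\mathbb{R}^3 \to S^2$ within this ansatz to a Skyrme configuration $\mathbb{R}^3 \to S^3$, obtained as follows. For given $f$, $m$ and $n$, let
\[ \tilde{\mathbf{n}} : (\eta, \xi, \phi) \mapsto (z_1, z_2) \in \mathbb{C}^2, \quad \text{where} \quad z_1 = \frac{f}{\sqrt{f^2 + 1}}\, e^{im\xi}, \quad z_2 = \frac{1}{\sqrt{f^2 + 1}}\, e^{in\phi}. \tag{3.6} \]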

Then $|z_1|^2 + |z_2|^2 = 1$ so that $\tilde{\mathbf{n}}$ is actually $S^3$-valued, and the composition of this map with the Hopf map is clearly $\mathbf{n}$, since the stereographic coordinate $W$ coincides with the inhomogeneous coordinate $W = z_1/z_2$ under the identification $S^2 \equiv \mathbb{C}P^1$. It is now straightforward to compute the degree of $\tilde{\mathbf{n}}$, and hence the Hopf degree of $\mathbf{n}$. Since the degree of $\tilde{\mathbf{n}}$ is a homotopy invariant, we may deform $f$ to any convenient function satisfying the boundary conditions, for example, $f(\eta) = \eta^{-1}$. In this case, $(2^{-1/2}, 2^{-1/2}) \in S^3$ is a regular value of $\tilde{\mathbf{n}}$ with precisely $|mn|$ preimages, namely the points with $\eta = \exp(im\xi) = \exp(in\phi) = 1$. At each of these preimages, the image of the canonically oriented coordinate frame under $d\tilde{\mathbf{n}}$ is
\[ d\tilde{\mathbf{n}} : [\partial_\eta, \partial_\xi, \partial_\phi] \mapsto \left[ \left(-2^{-3/2}, 0, 2^{-3/2}, 0\right),\; \left(0, \tfrac{m}{\sqrt{2}}, 0, 0\right),\; \left(0, 0, 0, \tfrac{n}{\sqrt{2}}\right) \right], \]
where we have identified $\mathbb{C}^2 \cong \mathbb{R}^4$. The orientation of the image frame is given by the sign of the determinant
\[ \det[d\tilde{\mathbf{n}}\, \partial_\eta,\; d\tilde{\mathbf{n}}\, \partial_\xi,\; d\tilde{\mathbf{n}}\, \partial_\phi,\; \tilde{\mathbf{n}}] = \frac{mn}{4}. \]
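The orientation computation above can also be checked numerically. The sketch below is only an illustration under our own naming (`lift`, `det`, `orientation_sign` are hypothetical helpers): it differentiates the lift (3.6) with $f(\eta) = \eta^{-1}$ by central finite differences at the preimage $\eta = 1$, $\xi = \phi = 0$, and returns the sign of the determinant of the frame $[d\tilde{\mathbf{n}}\,\partial_\eta, d\tilde{\mathbf{n}}\,\partial_\xi, d\tilde{\mathbf{n}}\,\partial_\phi, \tilde{\mathbf{n}}]$.

```python
import cmath
import math

def lift(eta, xi, phi, m, n):
    """Lift (3.6) with f(eta) = 1/eta, as a point of R^4 = C^2."""
    f = 1.0 / eta
    r = math.sqrt(f * f + 1.0)
    z1 = f / r * cmath.exp(1j * m * xi)
    z2 = 1.0 / r * cmath.exp(1j * n * phi)
    return [z1.real, z1.imag, z2.real, z2.imag]

def det(mat):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    if len(mat) == 1:
        return mat[0][0]
    total = 0.0
    for j, entry in enumerate(mat[0]):
        minor = [row[:j] + row[j + 1:] for row in mat[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def orientation_sign(m, n, h=1e-6):
    """Sign of det[d_eta, d_xi, d_phi images, lift] at eta = 1, xi = phi = 0."""
    p = [1.0, 0.0, 0.0]
    frame = []
    for axis in range(3):
        plus, minus = list(p), list(p)
        plus[axis] += h
        minus[axis] -= h
        a = lift(plus[0], plus[1], plus[2], m, n)
        b = lift(minus[0], minus[1], minus[2], m, n)
        frame.append([(u - v) / (2 * h) for u, v in zip(a, b)])
    frame.append(lift(p[0], p[1], p[2], m, n))
    # The frame vectors are stored as rows; det is unchanged under transposition.
    return 1 if det(frame) > 0 else -1
```

For instance, `orientation_sign(1, 1)` and `orientation_sign(-1, -2)` give $+1$, while `orientation_sign(2, -3)` gives $-1$, matching the sign of $mn$.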

Hence each of the $|mn|$ preimages has multiplicity $+1$ if $mn > 0$ and $-1$ if $mn < 0$, so the Hopf charge of $\mathbf{n}$ is $mn$, in agreement with the calculation in [15].

Numerical evidence suggests that the energy minimals for $Q = 1$, 2 and 4 have axial symmetry. In general, minimals are more complicated, having knotted or linked structures with at most discrete symmetries. In principle any cyclic group $C_q$ is a possible discrete symmetry. However, in practice only the simplest nontrivial symmetry, the twofold symmetry $C_2$, seems to occur. Clearly no nonconstant smooth field configuration can be symmetric under a rotation in target space without a compensating spatial rotation. It is possible, however, for a configuration to be invariant under a spatial rotation without a compensating rotation in target space. For example, the axial configuration in (3.5) with even $n$ has a $C_2$ symmetry generated by spatial rotation by $\pi$ about the $z$-axis. We will discuss symmetries further in the next section when we calculate the constraints they impose on the wave function.

4. Finkelstein-Rubinstein Constraints

In this section we describe how to use ideas of Finkelstein and Rubinstein [12] to quantize a scalar field theory and obtain fermions. Quantization usually implies replacing the classical configuration space by wave functions on configuration space. However, if the configuration space is not simply connected it is possible to define wave functions on the universal cover of configuration space. As shown in Sect. 2, the fundamental group of each connected component of our configuration space $\mathcal{Q} = (S^2)^{S^3}_*$ is $\mathbb{Z}_2$. So the universal cover $\tilde{\mathcal{Q}}$ is a twofold cover. We will also assume that the topological charge is conserved



in the quantum theory, as it is in the classical theory, so the wave functions are defined on the covering space $\tilde{\mathcal{Q}}_Q$ of a component $\mathcal{Q}_Q$ of configuration space with fixed Hopf charge $Q$. We shall formally think of the quantum state of the model as being specified by a wave function $\Psi \in L^2(\tilde{\mathcal{Q}}_Q)$ with respect to some measure on $\tilde{\mathcal{Q}}_Q$. Let $\Pi : \tilde{\mathcal{Q}}_Q \to \tilde{\mathcal{Q}}_Q$ be the deck transformation, that is, the map which takes $p$ to the unique point in $\tilde{\mathcal{Q}}_Q$ which differs from $p$ but projects to the same point in $\mathcal{Q}_Q$. This induces a linear map $\Pi^* : L^2 \to L^2$ by pullback: $(\Pi^* \Psi)(p) := \Psi(\Pi(p))$. Since the states $\Pi^* \Psi$ and $\Psi$ are physically indistinguishable, we must have $\Psi(p) = e^{i\theta(p)}\, \Pi^* \Psi(p)$ for all $p \in \tilde{\mathcal{Q}}_Q$ and all $\Psi$. But $\Pi^* \Pi^* = 1$, so the only possibilities are $\Pi^* \Psi = \Psi$ or $\Pi^* \Psi = -\Psi$. In order to allow for fermionic solitons, we must consistently choose the latter possibility: our wavefunctions must always be odd under $\Pi$.

Spinoriality then arises as follows. Consider the loop in $\mathcal{Q}_Q$ defined by spatial rotation about a fixed axis through $2\pi$ of a fixed base configuration $\mathbf{n}$. Since $\pi_1(\mathcal{Q}_Q) = \mathbb{Z}_2$, this may not be contractible, and its contractibility is independent of the basepoint $\mathbf{n}$ chosen. If it is noncontractible, both lifts of the loop to $\tilde{\mathcal{Q}}_Q$ fail to close, but are rather paths connecting a $\Pi$-related pair of points (both of which project to $\mathbf{n}$). Having insisted on $\Pi$-oddness, therefore, we see that every allowable state in this sector acquires a minus sign under spatial rotation by $2\pi$, the hallmark of spinoriality. That this is equivalent to fermionicity (that is, odd exchange statistics) was proved by Finkelstein and Rubinstein in [12].

The question of whether Hopf solitons can be consistently quantized as fermions thus reduces to the question of whether $2\pi$ spatial rotation loops in $\mathcal{Q}_Q$ are noncontractible when $Q$ is odd and contractible when $Q$ is even. To answer this, we only need to determine the contractibility for a representative of each sector. Consider the loop $\gamma : [0, 1] \to (S^3)^{S^3}$ defined by $\hat{\gamma}(\eta, \xi, \phi, t) = \tilde{\mathbf{n}}(\eta, \xi, \phi + 2\pi t)$, where $\tilde{\mathbf{n}} : S^3 \to S^3$ is defined in (3.6), and we once again use the natural identification of $g : X \to Y^Z$ with $\hat{g} : Z \times X \to Y$. This is a $2\pi$ spatial rotation loop (about the $z$ axis) of the degree $Q = mn$ Skyrme configuration $\tilde{\mathbf{n}}$. Note that $\pi \circ \gamma : [0, 1] \to \mathcal{Q}_Q$ is also a $2\pi$ rotation loop, but in the Faddeev-Hopf configuration space. Corollary 3 states that $\pi \circ \gamma$ is contractible if and only if $\gamma$ is contractible, which is true if and only if the degree $Q$ is even, by work of Giulini [14].
Hence imposing $\Pi$-oddness on our quantum states does indeed produce a consistent fermionic quantization of Hopf solitons.

It is important to realize that, having imposed $\Pi$-oddness, every noncontractible loop in $\mathcal{Q}_Q$ must be associated with a sign flip in $\Psi : \tilde{\mathcal{Q}}_Q \to \mathbb{C}$, regardless of whether the loop is generated by a spatial rotation. Let $\mathbf{n}$ be a Hopf degree $Q \neq 0$ energy minimal of the Faddeev-Hopf model which is invariant under a simultaneous spatial rotation by $\alpha$ about some axis $e$ and rotation by $\beta$ around the third axis in target space (the only axis compatible with the boundary conditions). Since for $Q \neq 0$ the maximal symmetry of a configuration is $O(2) \times O(2)$, only one spatial rotation axis $e$ is possible for a given $\mathbf{n}$, and we may choose it, without loss of generality, to lie along the $z$ axis. Let us call such a combined transformation an $(\alpha, \beta)$-rotation. Then we may construct a loop $L(\alpha, \beta)_{\mathbf{n}}$ in $\mathcal{Q}_Q$ based at $\mathbf{n}$ which consists of rotation by $2t\alpha$ around the $z$-axis for time $t \in [0, \frac{1}{2}]$, followed by rotation by $(2t - 1)\beta$ around the third axis in target space for $t \in (\frac{1}{2}, 1]$. In this language, the fact that $\mathbf{n}$ has the specified symmetry is precisely the statement that $L(\alpha, \beta)_{\mathbf{n}}$ is a loop, i.e. closed. There are two points $p, \Pi(p) \in \tilde{\mathcal{Q}}_Q$ corresponding to $\mathbf{n}$, and any physical state must have $\Psi(\Pi(p)) = -\Psi(p)$. Now if $L(\alpha, \beta)_{\mathbf{n}}$ is noncontractible then $p$ and $\Pi(p)$ are connected by the lifts of $L(\alpha, \beta)_{\mathbf{n}}$, starting at $p$ and $\Pi(p)$, respectively. Hence, evaluated at the specific point $p$ (or $\Pi(p)$) we must have
\[ e^{-i\alpha \hat{L}_3}\, e^{-i\beta \hat{K}_3}\, \Psi(p) = -\Psi(p), \tag{4.1} \]

for any allowed state, where $\hat{L}_3$ is the third component of the spin operator $\hat{\mathbf{L}}$ and $\hat{K}_3$ is the third (and only) component of the spin operator in target space (henceforth called isospin). If $L(\alpha, \beta)_{\mathbf{n}}$ were contractible, however, it would lift to a pair of closed loops in $\tilde{\mathcal{Q}}_Q$ based at $p$ and $\Pi(p)$, so that
\[ e^{-i\alpha \hat{L}_3}\, e^{-i\beta \hat{K}_3}\, \Psi(p) = \Psi(p), \tag{4.2} \]
simply by continuity of $\Psi$. In the spirit of semiclassical quantization we assume that, at least for low lying states, the symmetry of the classical energy minimal is not broken by quantum effects. Thus we seek quantum states which are also invariant under $(\alpha, \beta)$-rotations, so that
\[ e^{-i\alpha \hat{L}_3}\, e^{-i\beta \hat{K}_3}\, \Psi(x) = e^{i\theta(x)}\, \Psi(x), \tag{4.3} \]
for all $x \in \tilde{\mathcal{Q}}_Q$. But, assuming the $(\alpha, \beta)$-rotation generates a finite group, there must exist an integer $q$ such that $(e^{-i\alpha \hat{L}_3} e^{-i\beta \hat{K}_3})^q \equiv 1$, which implies, by continuity, that $\theta(x)$ must in fact be constant. But then $\theta(x) \equiv \theta(p) = \pi$ if $L(\alpha, \beta)_{\mathbf{n}}$ is noncontractible, by (4.1), or $\theta(x) \equiv 0$ if $L(\alpha, \beta)_{\mathbf{n}}$ is contractible, by (4.2). Hence, we obtain the so-called Finkelstein-Rubinstein constraints on symmetric quantum states:
\[ e^{-i\alpha \hat{L}_3}\, e^{-i\beta \hat{K}_3}\, \psi = \begin{cases} \psi & \text{if the induced loop is contractible,} \\ -\psi & \text{otherwise.} \end{cases} \tag{4.4} \]
Equation (4.4) imposes constraints on the spin and isospin quantum numbers $L$, $L_3$ and $K_3$.

It is worth pausing here to discuss the relationship between body-fixed and space-fixed angular momentum. The Lagrangian of the Hopf model is invariant under an $SO(3) \times SO(3)$ symmetry group consisting of rotations in space and target space. For these symmetries we can define left and right actions which are generated by the space-fixed and body-fixed angular momenta $\mathbf{J}$ and $\mathbf{L}$ acting on space and by space-fixed and body-fixed angular momenta $\mathbf{I}$ and $\mathbf{K}$ acting on target space. The body-fixed and space-fixed angular momentum operators are related by rotations, which implies that $\mathbf{J}^2 = \mathbf{L}^2$. For rotations in target space only rotations around the third axis are compatible with the boundary conditions. This implies $I_3^2 = K_3^2$.
When the model is quantized the angular momentum operators $\hat{\mathbf{J}}^2 = \hat{\mathbf{L}}^2$, $\hat{J}_3$, $\hat{L}_3$, $\hat{I}_3$ and $\hat{K}_3$ form a set of commuting observables. The quantum wave function $\psi$ can then be labelled by the usual spin quantum numbers as follows: $\psi = |L, L_3, J_3, K_3, I_3\rangle$. Since the Finkelstein-Rubinstein constraints do not impose any restrictions on the values of $J_3$ and $I_3$, these values will often be suppressed and the wave function is given as $\psi = |L, L_3, K_3\rangle$. In order to make predictions, we are interested in states with given $J$ and $I_3$. Therefore, we have to consider states with quantum numbers $L = J$ and $K_3 = \pm I_3$. Then the Finkelstein-Rubinstein constraints have the following effect. By restricting the allowed quantum states for given $J$ and $I_3$ the degeneracy of the states is changed. In the extreme case that the degeneracy is zero, certain combinations of $J$ and $I_3$ get excluded.

We now return to our discussion of loops in configuration space and Finkelstein-Rubinstein constraints. Just as for $2\pi$ spatial rotation loops, we can use the isomorphism $\pi_1((S^2)^{S^3}_*) \to \pi_1((S^3)^{S^3})$ induced by the Hopf fibration to calculate whether a given loop $L(\alpha, \beta)_{\mathbf{n}}$ is contractible. For every configuration $\mathbf{n}$ we can choose a configuration $\tilde{\mathbf{n}}$ in the configuration space $(S^3)^{S^3}$ of Skyrmions. Then $L(\alpha, \beta)_{\tilde{\mathbf{n}}}$ is a loop in $(S^3)^{S^3}$ which projects to the loop $L(\alpha, \beta)_{\mathbf{n}}$ in $(S^2)^{S^3}_*$ under $\pi$. The action of $SO(3)$ on the target space of $\tilde{\mathbf{n}}$, that is $S^3$, is now identified with the adjoint action of $SU(2)$ on itself. Once again, Corollary 3 shows that $L(\alpha, \beta)_{\mathbf{n}}$ is contractible if and only if $L(\alpha, \beta)_{\tilde{\mathbf{n}}}$ is contractible. Contractibility of the latter loop can be determined by means of an explicit formula recently derived for Skyrmions with discrete symmetries [20]. This states that the loop $L(\alpha, \beta)_{\tilde{\mathbf{n}}}$ is contractible if and only if
\[ N = \frac{Q}{2\pi}\, (Q\alpha - \beta) \tag{4.5} \]

is even. Note that there is a slight subtlety with the choice of the sign of $\beta$. We can immediately recover our earlier result that the $\Pi$-odd quantization is consistently fermionic from formula (4.5). To see this, note that every configuration is symmetric under $(2\pi, 0)$-rotation, and substituting $\alpha = 2\pi$, $\beta = 0$ into (4.5) shows that $N$ is odd if and only if $Q$ is odd. Hence the spin quantum numbers $L$ and $J$ are half integer if and only if $Q$ is odd. Similarly, considering the case $\alpha = 0$, $\beta = 2\pi$ (pure isorotation by $2\pi$) shows that the isospin quantum numbers $K_3$ and $I_3$ are also half integer if and only if $Q$ is odd.

New constraints on low-lying quantum states are obtained if we assume that they are invariant under the symmetry groups of the corresponding classical energy minimals. The Faddeev-Hopf model has received much less numerical attention than the Skyrme model, so our understanding of these minimals and their symmetries is comparatively limited. For this reason, we will discuss the Finkelstein-Rubinstein constraints for general symmetries first, then apply the analysis to those symmetries which have been observed in numerical experiments.

Since we are interested in symmetries which can be generated by loops in configuration space we disregard reflections and look only at subgroups of $T^2 = SO(2) \times SO(2)$. Note that $T^2$, and hence every subgroup of $T^2$, is abelian. This severely limits the symmetry groups possible, and accounts in part for the numerical observation that Hopf degree $Q$ minimals tend to possess far less symmetry than degree $Q$ Skyrmions. The symmetry group $G_{\mathbf{n}} < T^2$ of a configuration $\mathbf{n}$ is either continuous, in which case $G_{\mathbf{n}} \cong SO(2)$ corresponding to axial symmetry, or discrete, hence finite ($T^2$ is compact). Every finite abelian group is isomorphic to a product of finite cyclic groups of prime power order, so it suffices to understand the Finkelstein-Rubinstein constraints for $q$-fold cyclic symmetry $C_q$. First, we deal with axial symmetry.
Consider the axial configurations (3.4) with Hopf charge $Q = mn$. These are invariant under $(\alpha, n\alpha)$-rotations for all $\alpha \in \mathbb{R}$. Since the loop $L(\alpha, n\alpha)_{\mathbf{n}}$ exists for all $\alpha \in \mathbb{R}$ it is homotopic to the constant loop ($\alpha = 0$). So $L(\alpha, n\alpha)_{\mathbf{n}}$ is contractible and gives rise to the following constraint on wave functions:
\[ e^{-i\alpha \hat{L}_3}\, e^{-in\alpha \hat{K}_3}\, \Psi = \Psi. \tag{4.6} \]
Since formula (4.6) is valid for all $\alpha$ we can expand the equation in $\alpha$. The first order term gives rise to the following constraint for the spin operators:
\[ \hat{L}_3 + n\hat{K}_3 = 0. \tag{4.7} \]
Equation (4.7) implies for the spin quantum numbers that $L_3 = -nK_3$.
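To make the axial constraint concrete, here is a small sketch (the function name `lowest_axial_state` is our own, hypothetical choice) that, for a given isospin projection $K_3$, returns the lowest spin state $|L, L_3, K_3\rangle$ compatible with $L_3 = -nK_3$. It assumes only the standard fact $L \geq |L_3|$ together with the fermionicity rule stated earlier, namely that $K_3$ is half integer exactly when $Q$ is odd.

```python
from fractions import Fraction

def lowest_axial_state(Q, n, K3):
    """Lowest-spin state (L, L3, K3) allowed by L3 = -n*K3 for an axial soliton.

    Q is the Hopf charge, n a divisor of Q, and K3 must be half integer
    for odd Q and integer for even Q (consistent fermionic quantization).
    """
    K3 = Fraction(K3)
    if Q % n != 0:
        raise ValueError("n must divide Q")
    twice_K3 = 2 * K3
    if twice_K3.denominator != 1 or twice_K3.numerator % 2 != Q % 2:
        raise ValueError("K3 must be half integer for odd Q, integer for even Q")
    L3 = -n * K3
    # The smallest total spin containing this projection is L = |L3|;
    # it automatically has the parity required by the sign of (-1)^Q.
    L = abs(L3)
    return (L, L3, K3)
```

For example, $Q = 1$, $n = 1$, $K_3 = \frac{1}{2}$ gives $|\frac{1}{2}, -\frac{1}{2}, \frac{1}{2}\rangle$, and $Q = 4$, $n = 2$, $K_3 = 1$ gives $|2, -2, 1\rangle$.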



If the axial symmetry is broken then the symmetry group must be isomorphic to a product of finite cyclic groups. Not every cyclic subgroup of $T^2$ is possible for a given $Q$, however, since the generator $(\alpha, \beta)$ of $C_q < T^2$ must satisfy Eq. (4.5), that is, $N$ must be an integer. There are precisely $q$ different $C_q$ subgroups of $T^2$ which are candidates for symmetry groups, generated by $(2\pi/q, 2k\pi/q)$, where $k = 0, 1, \ldots, q - 1$, since pure isorotation can never leave a nonconstant configuration invariant. Let us denote these groups $C_q^k$. To illustrate, let us assume that $q$ is prime so that $C_q$ is a finite field. Then formula (4.5) applied to the generator of $C_q^k$ implies that $Q(Q - k) = 0 \bmod q$ and hence $Q = 0 \bmod q$ or $Q = k \bmod q$ by the field property. Hence, unless $Q$ is a multiple of $q$, formula (4.5) rules out all possible $C_q$ symmetries except $C_q^{Q \bmod q}$. Similar criteria can be derived for $q$ not prime, but they are not so neat. Of particular interest given the current state of numerics is the case $q = 2$. The argument above shows that, for odd $Q$, only $C_2^1$ symmetry is possible, not $C_2^0$.

Given a candidate symmetry group $C_q^k$, formula (4.5) gives us a one-dimensional (hence irreducible) representation of $C_{\bar{q}}$, where $\bar{q} = q$ if $Q(k + 1)$ is even and $\bar{q} = 2q$ if $Q(k + 1)$ is odd, by mapping the generator $(2\pi/q, 2k\pi/q)$ to $(-1)^N$. This representation may also be thought of as a homomorphism $C_{\bar{q}} \to \mathbb{Z}_2 = \{1, -1\}$ and is thus necessarily trivial if $q$ is odd and $Q(k + 1)$ is even. We call this the Finkelstein-Rubinstein representation of $C_{\bar{q}}$. There is also a natural representation of $C_{\bar{q}}$ on the spin-isospin $L, K_3$ quantum state space, defined by the inclusion $C_q^k < SO(3) \times SO(2)$. A state with quantum numbers $L, K_3$ is thus compatible with $C_q^k$ symmetry if and only if the decomposition of the spin-isospin $L, K_3$ representation of $C_{\bar{q}}$ into irreducible representations contains a copy of the Finkelstein-Rubinstein representation.
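The integrality criterion can be tabulated directly. The following sketch, with hypothetical function names of our own, substitutes the generator $(\alpha, \beta) = (2\pi/q, 2k\pi/q)$ into (4.5), which gives $N = Q(Q - k)/q$, lists the values of $k$ for which $N$ is an integer, and evaluates the sign $(-1)^N$.

```python
from fractions import Fraction

def fr_exponent(Q, q, k):
    """N of (4.5) for the generator (2*pi/q, 2*pi*k/q): N = Q*(Q - k)/q."""
    return Fraction(Q * (Q - k), q)

def candidate_groups(Q, q):
    """Values of k in 0..q-1 for which N is an integer, so C_q^k is not excluded."""
    return [k for k in range(q) if fr_exponent(Q, q, k).denominator == 1]

def fr_sign(Q, q, k):
    """Sign (-1)^N assigned to the generator of C_q^k."""
    N = fr_exponent(Q, q, k)
    if N.denominator != 1:
        raise ValueError("C_q^k is excluded for this Q: N is not an integer")
    return -1 if N.numerator % 2 else 1
```

With these definitions, `candidate_groups(3, 2)` returns `[1]` (only $C_2^1$ survives for odd $Q$), `candidate_groups(5, 3)` returns `[2]`, and `fr_sign(3, 2, 1)` is $-1$ while `fr_sign(5, 2, 1)` is $+1$.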
Given that we consider only cyclic groups, in practice we need only check compatibility on the generator $(\alpha, \beta) = (2\pi/q, 2\pi k/q)$. Thus $L_3$, $K_3$ must satisfy
\begin{align}
e^{-2\pi i (L_3 + kK_3)/q} &= (-1)^N = e^{i\pi Q(Q - k)/q}, \tag{4.8} \\
\Leftrightarrow \quad L_3 + kK_3 &= -\tfrac{1}{2}\, Q(Q - k) + q\ell, \tag{4.9}
\end{align}
where $\ell$ is an integer. A good candidate for the ground state in the charge $Q$ sector is the state with the lowest values of $L$ and $|K_3|$ (and hence $J$ and $|I_3|$) compatible in this way with the symmetries of the classical minimal.

To illustrate this symmetry analysis, we compute the quantum ground state for stable and metastable Hopf solitons of degrees $Q = 1, \ldots, 7$, using the classical solutions obtained numerically by Hietarinta et al. [17]. Only axial and $C_2$ symmetries ever arise for these solutions. In the $C_2$ case for even $Q$, we distinguish between the two possible groups $C_2^0$ and $C_2^1$ using the colour coding information in [17]. The results are presented in Table 1. The first entry is the Hopf number $Q$. A star indicates that the state is metastable, that is, the classical solution is not a global minimal. The next entry is the energy $E_Q$ which has been calculated in [17] and corresponds to $\lambda = 1/4$. The following entry gives the shape of the Hopf configuration. The entry “symmetry” shows which symmetry has been used to calculate the Finkelstein-Rubinstein constraints. Here $(n, m)$ corresponds to the axial symmetry of the corresponding toroidal ansatz (3.4). $C_2^0$ is generated by $\pi$ rotation in space whereas $C_2^1$ is generated by rotation by $\pi$ in space followed by rotation by $\pi$ in target space. As a word of caution, while axial symmetry has been checked numerically, the $C_2$ symmetry is obtained by inspection from the figures in [17] and [8]. For low $Q$ the symmetries are apparent. However, for higher Hopf charge, $Q > 4$, the symmetries are difficult to guess, if indeed they exist at all. Where no entry is given, the classical solution has no obvious symmetry and the only constraint applicable is that of consistent fermionicity. “FR” gives the Finkelstein-Rubinstein constraints $(-1)^N$, where $N$ is calculated with Eq. (4.5) for the generator of the discrete symmetries. Note that axial symmetry implies FR $= 1$. Then ground states are calculated as explained above. They are given in the form $|L, L_3, K_3\rangle$. The quantum numbers $J_3$ and $I_3$ are suppressed. Recall that $J = L$ and $|I_3| = |K_3|$. We have also included two excited states. “Excited state (1)” is obtained from the ground state by increasing $L$ by 1 and finding the lowest $K_3$ such that all constraints are satisfied. Similarly, “excited state (2)” is obtained by increasing $K_3$ by 1. Note that changing the sign of $\hat{L}_3$ and $\hat{K}_3$ in the constraints (4.4) given by a loop $L(\alpha, \beta)_{\mathbf{n}}$ can be interpreted as constraints for the loop $L(-\alpha, -\beta)_{\mathbf{n}}$. Since the fundamental group is $\mathbb{Z}_2$ the loop $L(\alpha, \beta)_{\mathbf{n}}$ is contractible if and only if $L(-\alpha, -\beta)_{\mathbf{n}}$ is contractible. Therefore, whenever $|L, L_3, K_3\rangle$ satisfies the constraints imposed by a symmetry, so does $|L, -L_3, -K_3\rangle$. In Table 1, we only display states with $K_3 \geq 0$. Since no constraints with FR $= -1$ occur for even Hopf charge $Q$, all the ground states for even $Q$ are given by $|0, 0, 0\rangle$ and “excited states (1)” are $|1, 0, 0\rangle$. The influence of the Finkelstein-Rubinstein constraints can only be seen for “excited state (2)”. For odd $Q$ the Finkelstein-Rubinstein constraints influence the ground states and all the excited states. One might ask why the first and second excited states are expected to have spin and isospin one unit higher than the ground state, respectively. One reason is that this is consistent with the collective coordinate quantization of Hopf solitons, to which we turn in the next section.

Table 1. Ground states and excited states for $Q = 1, \ldots, 7$

\begin{tabular}{llllllll}
$|Q|$ & $E_Q$ & shape & symmetry & FR & ground state & excited state (1) & excited state (2) \\
\hline
1 & 135.2 & unknot & $(1, 1)$ & $1$ & $|\frac{1}{2}, -\frac{1}{2}, \frac{1}{2}\rangle$ & $|\frac{3}{2}, -\frac{1}{2}, \frac{1}{2}\rangle$ & $|\frac{3}{2}, -\frac{3}{2}, \frac{3}{2}\rangle$ \\
2 & 220.6 & unknot & $(2, 1)$ & $1$ & $|0, 0, 0\rangle$ & $|1, 0, 0\rangle$ & $|2, -2, 1\rangle$ \\
$2^*$ & 249.6 & unknot & $(1, 2)$ & $1$ & $|0, 0, 0\rangle$ & $|1, 0, 0\rangle$ & $|1, -1, 1\rangle$ \\
3 & 308.9 & unknot & $C_2^1$ & $-1$ & $|\frac{1}{2}, \frac{1}{2}, \frac{1}{2}\rangle$ & $|\frac{3}{2}, \frac{1}{2}, \frac{1}{2}\rangle$ & $|\frac{1}{2}, -\frac{1}{2}, \frac{3}{2}\rangle$ \\
$3^*$ & 311.3 & unknot & $(3, 1)$ & $1$ & $|\frac{3}{2}, -\frac{3}{2}, \frac{1}{2}\rangle$ & $|\frac{5}{2}, -\frac{3}{2}, \frac{1}{2}\rangle$ & $|\frac{9}{2}, -\frac{9}{2}, \frac{3}{2}\rangle$ \\
4 & 385.5 & unknot & $(2, 2)$ & $1$ & $|0, 0, 0\rangle$ & $|1, 0, 0\rangle$ & $|2, -2, 1\rangle$ \\
$4^*$ & 392.7 & unknot & $C_2^0$ & $1$ & $|0, 0, 0\rangle$ & $|1, 0, 0\rangle$ & $|0, 0, 1\rangle$ \\
$4^*$ & 405.0 & unknot & $(4, 1)$ & $1$ & $|0, 0, 0\rangle$ & $|1, 0, 0\rangle$ & $|4, -4, 1\rangle$ \\
5 & 459.8 & link & -- & -- & $|\frac{1}{2}, \pm\frac{1}{2}, \frac{1}{2}\rangle$ & $|\frac{3}{2}, \pm\frac{1}{2}, \frac{1}{2}\rangle$ & $|\frac{1}{2}, \pm\frac{1}{2}, \frac{3}{2}\rangle$ \\
$5^*$ & 479.2 & unknot & $C_2^1$ & $1$ & $|\frac{1}{2}, -\frac{1}{2}, \frac{1}{2}\rangle$ & $|\frac{3}{2}, -\frac{1}{2}, \frac{1}{2}\rangle$ & $|\frac{1}{2}, \frac{1}{2}, \frac{3}{2}\rangle$ \\
6 & 521.0 & link & -- & -- & $|0, 0, 0\rangle$ & $|1, 0, 0\rangle$ & $|0, 0, 1\rangle$ \\
$6^*$ & 536.2 & link & -- & -- & $|0, 0, 0\rangle$ & $|1, 0, 0\rangle$ & $|0, 0, 1\rangle$ \\
7 & 589.0 & knot & -- & -- & $|\frac{1}{2}, \pm\frac{1}{2}, \frac{1}{2}\rangle$ & $|\frac{3}{2}, \pm\frac{1}{2}, \frac{1}{2}\rangle$ & $|\frac{1}{2}, \pm\frac{1}{2}, \frac{3}{2}\rangle$ \\
\end{tabular}

5. Collective Coordinate Quantization

The simplest non-trivial quantitative application of our results is the collective coordinate quantization [27]. In this case the wave function is only non-vanishing on the



space of minimal energy configurations in a given sector, also called the moduli space. The effective Lagrangian $L_{\rm eff}$ in this approximation is obtained by restricting the full Lagrangian to fields which, at each fixed time, lie in the moduli space.$^{2}$ From $L_{\rm eff}$ one can construct an effective Hamiltonian and canonically quantize the system in the standard manner. For Hopf charge $Q = 1$ the reduced Hamiltonian is given in [27] using “$SU(2)$ notation”. The Lagrangian $L$ (3.1) can be split up into kinetic energy $T$ and potential energy $V$, namely $L = T - V$, where
\begin{align}
T &= \int_{\mathbb{R}^3} \Big( \frac{1}{2}\, |\partial_t \mathbf{n}|^2 + \frac{\lambda}{2} \sum_i |\partial_t \mathbf{n} \times \partial_i \mathbf{n}|^2 \Big), \tag{5.1} \\
V &= \int_{\mathbb{R}^3} \Big( \frac{1}{2} \sum_i |\partial_i \mathbf{n}|^2 + \frac{\lambda}{4} \sum_{i,j} |\partial_i \mathbf{n} \times \partial_j \mathbf{n}|^2 \Big). \tag{5.2}
\end{align}

Now let $\mathsf{M} \subset \mathcal{Q}_Q$ be the moduli space of charge $Q$ energy minimizers, and $\mathbf{n}(t)$ be a trajectory in $\mathsf{M}$. Since $\mathbf{n}(t)$ is a critical point of $V$ for all $t$, $V$ must remain constant, $V[\mathbf{n}(t)] = M_0$ say, interpreted as the classical mass of the Hopf soliton. It follows that the effective Lagrangian is $L_{\rm eff} = T|_{\mathsf{M}} - M_0$, so the reduced dynamics is determined purely by the kinetic energy restricted to $\mathsf{M}$. This has a natural geometric interpretation: being quadratic in first time derivatives, $T$ defines a positive quadratic form and hence a unique Riemannian metric $\gamma$ on $\mathsf{M}$, and the classical dynamics descending from $L_{\rm eff}$ is nothing other than geodesic motion in $(\mathsf{M}, \gamma)$. Since the Faddeev-Hopf model is not of Bogomol'nyi type, $\mathsf{M}$ is just the orbit of any energy minimizer under the symmetry group of the model, that is, all zero modes arise due to symmetry. The centre of mass motion decouples, so we may, without loss of generality, assume that the centre of mass is fixed at the origin, so that $\mathsf{M}$ is the orbit of some minimizer $\mathbf{n}_0$ under $G = SO(3) \times SO(2)$, acting as described in Sect. 3. So $(\mathsf{M}, \gamma)$ is a homogeneous space, diffeomorphic to $G/K$, where $K < G$ is the isotropy group of $\mathbf{n}_0$. It follows that $\gamma$ is uniquely determined by its value on $T_{\mathbf{n}_0} \mathsf{M}$. Generically, as we have described, $K$ is discrete, so $\mathsf{M}$ has dimension 4, and $\gamma$ is specified by 6 constants, which may be interpreted as the components of the Hopf soliton's inertia tensor. However, we shall concentrate on the case where $\mathbf{n}_0$ has axial symmetry. Then
\[ K = \{ k(\alpha) = ([\mathrm{diag}(e^{i\alpha/2}, e^{-i\alpha/2})],\, e^{in\alpha}) : \alpha \in \mathbb{R} \} \tag{5.3} \]
for some divisor $n$ of $Q$, where we have used the standard isomorphisms $SO(3) \equiv PU(2)$ and $SO(2) \equiv U(1)$ to identify $SO(3)$ matrices with projective equivalence classes of $U(2)$ matrices, and $SO(2)$ matrices with complex phases. Let $\theta_1, \theta_2, \theta_3$ be the usual basis of left invariant vector fields on $SO(3)$ and $\theta_4 = \partial_\xi$ on $SO(2) \equiv \{e^{i\xi} : \xi \in \mathbb{R}\}$. Let $\langle \cdots \rangle$ denote linear span. Then the Lie algebra of $G$ is $\mathfrak{g} = \langle \theta_1, \ldots, \theta_4 \rangle$, and the Lie algebra of $K$ is $\mathfrak{k} = \langle \theta_3 + n\theta_4 \rangle$. We may identify $T_{\mathbf{n}_0} \mathsf{M}$ with the complementary space $\mathfrak{p} = \langle \theta_1, \theta_2, \theta_3 \rangle$. Note $\mathfrak{g} = \mathfrak{k} \oplus \mathfrak{p}$ since $n \neq 0$. So $\gamma$ is equivalent to a positive symmetric bilinear form $\bar{\gamma} : \mathfrak{p} \oplus \mathfrak{p} \to \mathbb{R}$, and this must be invariant under the adjoint action of $K$ on $\mathfrak{p}$. Relative to the basis $\{\theta_1, \theta_2, \theta_3\}$ this is
\[ \mathrm{Ad}_{k(\alpha)} = \begin{pmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{5.4} \]
Let $\mathfrak{p}^*$ denote the dual space to $\mathfrak{p}$, so that $\bar{\gamma} \in \mathfrak{p}^* \odot \mathfrak{p}^*$, where $\odot$ denotes the symmetric tensor product. The induced action of $K$ on $\mathfrak{p}^* \odot \mathfrak{p}^*$ may be decomposed into irreducible representations, whence one finds that the dimension of the space of invariant symmetric bilinear forms on $\mathfrak{p} \oplus \mathfrak{p}$ is [19]
\[ \frac{1}{2\pi} \int_0^{2\pi} \frac{1}{2} \left[ \left(\mathrm{tr}\, \mathrm{Ad}_{k(\alpha)}\right)^2 + \mathrm{tr}\left(\mathrm{Ad}_{k(\alpha)}^2\right) \right] d\alpha = 2. \tag{5.5} \]
Hence there exist positive constants $a$, $b$ such that
\[ \bar{\gamma} = a\,(\sigma_1^2 + \sigma_2^2) + b\,\sigma_3^2, \tag{5.6} \]
where $\{\sigma_i\}$ are the one forms dual to $\{\theta_i\}$. Thus the metric $\gamma$ on $\mathsf{M}$ is determined by just two constants.

$^{2}$ As has been discussed in the Skyrme model [6, 9], this approximation breaks down if centrifugal effects are taken into account. This problem can be avoided by introducing a (sufficiently large) mass term for the vector $\mathbf{n}$ so that the fields decay fast enough at infinity.

The static solution $\mathbf{n}_0$, and hence its classical mass $M_0$ and moments of inertia $a$, $b$, all depend parametrically on the coupling $\lambda$. In fact, this dependence is quite simple, as we shall now show. Let us temporarily denote all $\lambda$ dependence explicitly, so that $T_\lambda$, $V_\lambda$ are the kinetic and potential energy functionals at coupling $\lambda$, $\mathbf{n}_\lambda$ is the static solution, $M_0(\lambda)$ is its mass, and $a(\lambda)$, $b(\lambda)$ its moments of inertia. A simple rescaling of the integration variables in (5.2) shows that, for any fixed map $\mathbf{n} : \mathbb{R}^3 \to S^2$,
\[ V_\lambda[\mathbf{n}(x)] \equiv \sqrt{\lambda}\, V_1[\mathbf{n}(\sqrt{\lambda}\, x)]. \tag{5.7} \]
Hence, given an extremal $\mathbf{n}_*$ of $V_1$ (here and henceforth, the subscript $*$ will indicate that a quantity refers to the $\lambda = 1$ model), $\mathbf{n}_\lambda(x) = \mathbf{n}_*(\lambda^{-1/2} x)$ is an extremal of $V_\lambda$, and furthermore its energy is
\[ M_0(\lambda) = V_\lambda[\mathbf{n}_\lambda] = \sqrt{\lambda}\, V_1[\mathbf{n}_*] = \sqrt{\lambda}\, M_*. \tag{5.8} \]

So the classical soliton masses scale as λ 2 . A similar argument works for the moments of inertia too. The coefficients a(λ), b(λ) are, by definition, twice the kinetic energies (i) of the time-dependent fields, nλ (x, t) say, obtained from nλ by subjecting it to spatial rotation at unit angular velocity about the xi -axis with i = 1, 3 respectively. Let Ri (t) denote rotation through angle t about the xi -axis. Then 1 1 1 (i) −2 nλ (x, t) = nλ (Ri (t)x) = n∗ λ− 2 Ri (t)x = n∗ Ri (t)λ− 2 x = n(i) λ x, t , ∗ (5.9) by linearity of Ri . Rescaling the integration variables in (5.1) as before, one sees that 3 3 (i) (i) Tλ [nλ ] = λ 2 T1 [n∗ ], and so the moments of inertia scale as λ 2 : 3

a(λ) = λ 2 a∗ ,

3

b(λ) = λ 2 b∗ .

(5.10)

Note that neither of these arguments appealed to axial symmetry, so the same scaling behaviour also applies to solitons with only discrete (for example, trivial) symmetry groups. This includes the scaling behaviour of the moment of inertia associated with isorotation (where this no longer coincides with spatial rotation) because
\[ n_\lambda^{(\mathrm{iso})}(x, t) = R_3(t)\, n_\lambda(x) = R_3(t)\, n_*(\lambda^{-1/2} x) = n_*^{(\mathrm{iso})}(\lambda^{-1/2} x, t). \tag{5.11} \]

From now on, we will no longer denote the $\lambda$ dependence explicitly, but will retain the $*$ subscript for quantities associated with the $\lambda = 1$ model. We wish to quantize geodesic motion on $M$, which may be formulated as a Hamiltonian flow on $T^*M$, within the framework of Finkelstein and Rubinstein. As it stands, however, there is a problem with this. As shown above, the fundamental group of $Q_Q$, the topological sector containing $M$, is $\mathbb{Z}_2$, whereas $\pi_1(M) = \mathbb{Z}_{2n}$, where $n$ is the divisor of $Q$ appearing in (5.3). A proof of this is presented in the appendix. So $\pi_1(M) \neq \pi_1(Q_Q)$ unless $n = 1$, and this type of axial symmetry occurs only for $Q = 1$ and the metastable $Q = 2$ state, according to Hietarinta et al. [17]. Nevertheless, a fermionic collective coordinate approximation is still possible, the key point being that in all cases the $2\pi$ spatial rotation loop has order 2 in $\pi_1(M)$. It is slightly unfortunate that this is true independent of $Q$, that is, whether $Q$ is odd or even. For consistency we must thus choose bosonic quantization for $Q$ even; it is not imposed on us by the topology of $M$. This illustrates that collective coordinate quantization can be quite treacherous in the absence of a good understanding of the topology of the full configuration space.

To construct the collective coordinate quantization it is convenient to exploit the $n$-fold covering map $SO(3) \to G/K$ which maps $g \in SO(3)$ to the coset $(g, 1)K$, that is, the left coset of $K$ containing $(g, 1) \in G$. Note that this covering map commutes with the natural $SO(3)$ left actions on $SO(3)$ and $M$. Geodesics in $(M, \gamma)$ are the images of geodesics in $SO(3)$ equipped with the lifted (pulled-back) metric, which is precisely (5.6), but with $\sigma_i$ now interpreted as (global) left invariant one-forms on $SO(3)$, rather than basis vectors in $\mathfrak{p}^*$.
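The group-theoretic claims used above (that $\pi_1(M)$ is cyclic of order $2n$ and that the $2\pi$ spatial rotation loop has order 2 for every $n$) can be checked by brute force. The following is a minimal sketch, assuming the description given in the appendix: $\pi_1(G) = \mathbb{Z}_2 \oplus \mathbb{Z}$, with $\pi_1(M)$ its quotient by the subgroup generated by $1 \oplus n$. The function names are illustrative and not part of the original text.

```python
def in_kernel(a, b, n):
    """True if a (+) b lies in the subgroup of Z2 (+) Z generated by 1 (+) n."""
    if b % n != 0:
        return False
    k = b // n              # the only candidate multiple k * (1 (+) n)
    return k % 2 == a % 2   # first component of k * (1 (+) n) is k mod 2


def order_in_quotient(a, b, n, max_order=1000):
    """Order of the coset [a (+) b] in (Z2 (+) Z) / <1 (+) n>."""
    for m in range(1, max_order + 1):
        if in_kernel((m * a) % 2, m * b, n):
            return m
    raise ValueError("order exceeds max_order")


def count_cosets(n, box=50):
    """Count distinct cosets met by the elements a (+) b with |b| <= box."""
    reps = []
    for a in (0, 1):
        for b in range(-box, box + 1):
            # Two elements share a coset iff their difference is in the kernel.
            if not any(in_kernel((a - ra) % 2, b - rb, n) for ra, rb in reps):
                reps.append((a, b))
    return len(reps)


for n in range(1, 7):
    # 2n cosets, generator [0 (+) 1] of order 2n, rotation loop [1 (+) 0] of order 2
    print(n, count_cosets(n), order_in_quotient(0, 1, n), order_in_quotient(1, 0, n))
```

The search box in `count_cosets` only needs to contain $2n$ consecutive values of the second component, so `box=50` is ample for the small values of $n$ tested here.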
The Hamiltonian generating geodesic flow on $SO(3)$ with the lifted metric is
\[ H = \frac{1}{2a}\left(L_1^2 + L_2^2\right) + \frac{1}{2b} L_3^2 = \frac{1}{2a}|L|^2 + \left(\frac{1}{2b} - \frac{1}{2a}\right) L_3^2, \tag{5.12} \]
where $L_i : T^*SO(3) \to \mathbb{R}$ are the angular momenta corresponding to the vector fields $\theta_i$ (the components of the moment map for the Hamiltonian action of $SO(3)$ on $T^*SO(3)$). Their Poisson bracket algebra is well known: $\{L_1, L_2\} = L_3$ and cyclic permutations. We may now quantize in the usual way, replacing classical angular momenta by $\hat L_i$, self-adjoint linear operators on $L^2(SO(3))$, and Poisson brackets by commutators. Note that $\{\hat H, \hat L^2, \hat L_3\}$ is a compatible set of observables. In this set-up, we are thinking of the wavefunction as defined on the covering space, $\psi : SO(3) \to \mathbb{C}$; it is important to note that for $Q$ odd (even) only those functions which are double-valued (single-valued) under the projection make physical sense. The deck transformation group of the covering is generated by $\exp(2\pi\theta_3/n)$, so we find that the eigenvalues of $\hat L_3$ must be integer multiples of $n/2$.

This conclusion may be reached another way. Note that $\theta_3 + n\theta_4 \in \mathfrak{k}$ vanishes on $M$, so the corresponding classical momenta are linearly dependent: $L_3 + nK_3 = 0$. Hence the quantum operators must satisfy $(\hat L_3 + n\hat K_3)\psi = 0$ on any physical state, and the conclusion follows because $\hat K_3$ has half-integer spectrum. Of course, this is nothing other than the FR constraint for axial symmetry (4.7). We may use the linear dependence of the third components of spin and isospin to rewrite $\hat H$ in terms of $\hat K_3$, or both $\hat L_3$ and $\hat K_3$ if we wish. A convenient way to write the quantum Hamiltonian is
\[ \hat H = M_0 + \frac{1}{2a} \hat L^2 + \left(\frac{1}{2b} - \frac{1}{2a}\right) \hat L_3^2. \tag{5.13} \]



It is now trivial to express the quantum energy spectrum in terms of the quantum numbers $L^2$ and $K_3$:
\[ E = \sqrt{\lambda}\, M_* + \frac{\hbar^2}{\lambda^{3/2}} \left[ \frac{L(L+1)}{2a_*} + \left( \frac{1}{2b_*} - \frac{1}{2a_*} \right) n^2 K_3^2 \right], \tag{5.14} \]
where we have used the constraint $L_3 = -nK_3$ to eliminate $L_3$, and the scaling behaviour obtained in (5.8), (5.10) to render all $\lambda$ dependence explicit. Recall that $*$-subscript quantities refer to the $\lambda = 1$ soliton. As discussed in the previous section, the body-fixed and space-fixed angular momenta satisfy $\hat J^2 = \hat L^2$ and $\hat I_3^2 = \hat K_3^2$. Therefore, we can also express the energy in terms of the space-fixed angular momentum quantum numbers, which are the quantities measured in a physical experiment, by replacing $L(L+1)$ by $J(J+1)$ and $K_3^2$ by $I_3^2$ in formula (5.14).

We would like to order these states by increasing energy. Clearly, this order depends on $n$ and the relative size of the constants $a_*$ and $b_*$. As discussed above, to determine these constants, one must compute the kinetic energy of the time dependent fields $n(t) = (\exp(t\theta_1), 1) \cdot n_0$ and $n(t) = (\exp(t\theta_3), 1) \cdot n_0$ respectively, where $\cdot$ denotes the action of $G$ on $M$. This is computationally very expensive if one uses for $n_0$ the genuine axially symmetric energy minimizers found in [17], since even to construct $n_0$ requires one to solve nonlinear PDEs. Instead, we shall again exploit the Hopf fibration and assume that $n_0$ is well approximated by the image under the Hopf map ρ of a Skyrme configuration $U : \mathbb{R}^3 \to SU(2)$ within the rational map ansatz of Houghton, Manton and Sutcliffe [18]. This idea was introduced in [8].³

The rational map ansatz may be described as follows. Using $\exp : \mathfrak{su}(2) \to SU(2)$, one may identify $SU(2)$ with the closed ball of radius $\pi$ in $\mathfrak{su}(2) \equiv \mathbb{R}^3$. The entire boundary of this ball gets mapped to $-I_2$. Partition physical space $\mathbb{R}^3$ into concentric 2-spheres of radius $r \in [0, \infty)$. Choose a fixed holomorphic map $R : S^2 \to S^2 \subset \mathbb{R}^3$ of degree $Q$ and a smooth decreasing surjection $f : [0, \infty) \to (0, \pi]$ (the profile function). Then the corresponding degree $Q$ Skyrme configuration is
\[ U(r, x_1, x_2) = \exp\left( f(r) R(x_1, x_2) \right), \tag{5.15} \]
where $x_1, x_2$ is any coordinate system on $S^2$. With respect to stereographic coordinates $z$, $R$ on its domain and codomain, $R$ is the eponymous rational map $R(z)$. We may then write $U(r, z)$ more explicitly as
\[ U(r, z) = \frac{1}{1 + |R|^2} \begin{pmatrix} e^{-if} + |R|^2 e^{if} & 2i\bar R \sin f \\ 2iR \sin f & e^{if} + |R|^2 e^{-if} \end{pmatrix}. \tag{5.16} \]
The corresponding Faddeev-Hopf configuration π ◦ U can easily be calculated with Eqs. (2.1) and (2.4),
\[ W(r, z) = \frac{|R(z)|^2 e^{if(r)} + e^{-if(r)}}{2iR(z) \sin f(r)}, \tag{5.17} \]

where again we choose stereographic coordinates on $S^2$. The idea is to approximate the true energy minimizer $n_0$ by a configuration of this form and minimize over all possible $R$ and $f$. In fact, to obtain axial symmetry, we must assume $R(z) = z^Q$ (note this assumes the divisor $n$ of $Q$ is simply $n = Q$, so our results apply only to $Q = 1, 2$ and the metastable $Q = 3^*, 4^*$ solitons). We then minimize the potential energy $V$ over all possible profile functions $f$. This yields a nonlinear second order ODE for $f(r)$ which is easily solved numerically. We may, without loss of generality, set $\lambda$ to unity.

3 Su has also discussed the rational map ansatz for Hopf solitons, using a different notation [26].

Table 2. Classical energy $M_*$ and moments of inertia $a_*$, $b_*$ of various axially symmetric solitons, at $\lambda = 1$, within the rational map ansatz. For comparison, we also quote the classical energies of the corresponding numerical solutions found in the literature ($M_*^H$: Hietarinta and Salo [17], $M_*^G$: Gladikowski and Hellmund [15], $M_*^B$: Battye and Sutcliffe [7]). Note that $M_*^H$ and $M_*^G$ have been inferred using the scaling rule (5.8)

Q     M_*^H    M_*      M_*^G    M_*^B    a_*       b_*
1     270.4    275.0    278.6    252.5    418.8     369.7
2     441.2    462.9    446.9    418.0    1265.0    1309.4
3*    622.6    665.5    -        590.5    3272.7    3556.1

Having constructed our approximate energy minimizer, $W(r, z)$, we must compute the kinetic energy at $t = 0$ of
\[ W(t, r, z) = \frac{|R(\tilde z(t, z))|^2 e^{if(r)} + e^{-if(r)}}{2iR(\tilde z(t, z)) \sin f(r)}, \tag{5.18} \]
where
\[ \tilde z(t, z) = \frac{z \cos (t/2) + i \sin (t/2)}{iz \sin (t/2) + \cos (t/2)} \qquad \text{and} \qquad \tilde z(t, z) = z e^{it}, \tag{5.19} \]

yielding $a_*/2$ and $b_*/2$ respectively. The calculations are elementary, but lengthy, and all reduce to radial integrals of expressions involving $f(r)$ and $f'(r)$. The results for $Q = 1, 2$ and the metastable $Q = 3^*$ are summarized in Table 2. These data, along with formula (5.14), give the complete quantum energy spectrum for these solitons, at arbitrary coupling.

To illustrate our approach we shall interpret the Hopf solitons as super heavy fermion states in the strongly coupled pure Higgs sector of the standard model, as advocated by Gipson and Tze [13]. To make contact with their work, we must take the unit of energy to be $e_0 = 300$ GeV, $\hbar = 1$, and the coupling constant to be $\lambda = \ln(m_H/e_0)/(24\pi^2)$, where $m_H$ is the Higgs mass. In this model, the Higgs sector is strongly coupled, so the Higgs mass assumes the rather large value $m_H \approx 1$ TeV, so that $\lambda \approx 0.005$. The unit of length is the Compton wavelength of a particle of rest energy $e_0$, namely $d_0 = \hbar c/e_0 \approx 0.66 \times 10^{-3}$ fm. Then the $Q = 1$ ground state represents what Gipson and Tze call a "smoke ring soliton" of energy 6.63 TeV, which is compatible with the lower bound of 5.5 TeV given in [13]. A sensible measure for the size of the Hopf soliton is the value of the radius in the rational map ansatz at which the profile function takes the value $\pi/2$. We find that our Hopf soliton has a radius of $0.08 \times 10^{-3}$ fm, which is comparable with the lower bound of $0.2 \times 10^{-3}$ fm in [13], where the radius is defined in a slightly different way.

We display the groundstates and the first two excited states in the collective coordinate approximation in Table 3. The energies of the states are dominated by the classical contribution. As anticipated in Table 1, the groundstate has the lowest energy, followed by excited state (1) and excited state (2). The energy of the states increases with the Hopf charge $Q$. The size of the Hopf solitons also increases with the charge: $0.08 \times 10^{-3}$ fm for $Q = 1$, $0.09 \times 10^{-3}$ fm for $Q = 2$ and $0.13 \times 10^{-3}$ fm for $Q = 3$.



Table 3. Groundstates and first excited states, and their energies, of super heavy smoke ring solitons in the collective coordinate approximation, using the rational map ansatz

Q     groundstate           E0          excited state (1)     E1          excited state (2)     E2
1     |1/2, −1/2, 1/2⟩      6.63 TeV    |3/2, −1/2, 1/2⟩      9.67 TeV    |3/2, −3/2, 3/2⟩      9.93 TeV
2     |0, 0, 0⟩             9.82 TeV    |1, 0, 0⟩             10.49 TeV   |2, −2, 1⟩            11.79 TeV
3*    |3/2, −3/2, 1/2⟩      14.58 TeV   |5/2, −3/2, 1/2⟩      15.23 TeV   |9/2, −9/2, 3/2⟩      17.12 TeV
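The entries of Table 3 can be recomputed from formula (5.14) and the data of Table 2. The sketch below makes several assumptions that the text does not state outright: it uses the rounded coupling $\lambda = 0.005$, sets $\hbar = 1$, takes $n = Q$ as in the rational map ansatz, and reads each state label as $|L, L_3, K_3\rangle$. The names `TABLE2` and `energy_tev` are illustrative only.

```python
import math

E0_TEV = 0.3      # unit of energy e0 = 300 GeV, expressed in TeV
LAMBDA = 0.005    # rounded coupling quoted in the text (assumed used as-is)
HBAR = 1.0        # hbar = 1 in the units adopted in the text

# Q -> (M_*, a_*, b_*) at lambda = 1, copied from Table 2
TABLE2 = {
    1: (275.0, 418.8, 369.7),
    2: (462.9, 1265.0, 1309.4),
    3: (665.5, 3272.7, 3556.1),
}


def energy_tev(Q, L, K3, lam=LAMBDA, hbar=HBAR):
    """Energy (5.14) in TeV for charge Q and quantum numbers L, K3, with n = Q."""
    M, a, b = TABLE2[Q]
    n = Q
    classical = math.sqrt(lam) * M
    rotational = L * (L + 1) / (2 * a) + (1 / (2 * b) - 1 / (2 * a)) * n**2 * K3**2
    return E0_TEV * (classical + hbar**2 / lam**1.5 * rotational)


# (Q, L, K3) for the ground state and the two excited states of each charge
states = [
    (1, 0.5, 0.5), (1, 1.5, 0.5), (1, 1.5, 1.5),
    (2, 0.0, 0.0), (2, 1.0, 0.0), (2, 2.0, 1.0),
    (3, 1.5, 0.5), (3, 2.5, 0.5), (3, 4.5, 1.5),
]
for Q, L, K3 in states:
    print(Q, L, K3, round(energy_tev(Q, L, K3), 2))
```

Under these assumptions the computed values agree with the tabulated energies to within about 0.01 TeV; small differences in the last digit are to be expected from the rounding of $\lambda$ and of the Table 2 data.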

Clearly, the relative size of the quantum excitation energy of an excited state to the ground state energy depends on the coupling $\lambda$. If $\lambda$ is small, as in the application above, the quantum corrections become significant. In an application where the solitons are taken to model real physical structures, whose energies and sizes are known experimentally (rather than hypothetical exotic matter states as in the current case), one would tune the energy and length scales independently so as to fit some reference data as well as possible. This amounts to tuning both $\lambda$ and the value of $\hbar$, which is why we retained explicit $\hbar$ dependence in Eq. (5.14). In the case of the Skyrme system as a model of nucleons, for example, one finds that $\hbar \approx 46.8$ in natural units [22]. Even if $\lambda$ is large, therefore, quantum corrections may still be significant, provided $\hbar/\lambda$ remains large. So the relative importance of quantum corrections depends strongly on the physical interpretation of the model under consideration.

6. Conclusion

We have described how to quantize Hopf solitons using the Finkelstein-Rubinstein construction and thereby demonstrated that Hopf solitons can be quantized as fermions when their Hopf charge $Q$ is odd. An important ingredient of the proof is the fact that the Hopf map $S^3 \to S^2$ induces a Serre fibration $(S^3)^M \to (S^2)^M_*$. Using this fibration we could show that the fundamental group of Skyrmions is isomorphic to the fundamental group of Hopf solitons, when physical space has finite fundamental group, and this isomorphism is induced by the Hopf map. This enabled us to use results which have been derived for the Skyrme model. In a semiclassical quantization we expect that classical symmetries are not broken by quantum effects. Then the symmetries of the classical configurations induce non-trivial constraints on the wave function. We calculated possible ground states of Hopf solitons for $Q = 1, \ldots, 7$ from the minimal energy configurations given in [17].
Since Hopf solitons do not have many symmetries, the constraints on the wave functions are quite weak. Often, only the degeneracy of a state changes, rather than the state being excluded completely. Excited states have been included to better illustrate the influence of the Finkelstein-Rubinstein constraints.

In order to get quantitative predictions of the quantum energy spectrum of Hopf solitons, we resorted to a collective coordinate approximation. In general, naive collective coordinate quantization can give spurious results if the topology of the moduli space is incompatible with that of the full configuration space. We concentrated on the case where the moduli space consists of axially symmetric configurations, which provides a good example of this difficulty. As discussed in the previous section, such a moduli space allows for fermionic quantization for both odd and even Hopf charge. In order to describe the physics correctly, we have to impose bosonic quantization for even $Q$ and fermionic quantization for odd $Q$. In other words, we must impose some of the Finkelstein-Rubinstein constraints arising from the topology of the full configuration space "by hand" on the wave function on the moduli space. They do not arise from the topology of the moduli space itself.

The Faddeev-Hopf model contains a single coupling constant $\lambda$. By simple rescaling arguments, we derived the scaling behaviour of the classical energy and moments of inertia of a soliton as $\lambda$ varies. This allowed us to find a formula for the quantum energy spectrum of axially symmetric solitons, within the collective coordinate approximation, with all $\lambda$ dependence explicit. The numerical constants $M_*$, $a_*$ and $b_*$ in this formula were approximated, for three such axially symmetric solitons, by constructing approximate energy minimizers within the rational map ansatz. Our aim in this paper was to illustrate the general approach of fermionic soliton quantization within the Faddeev-Hopf model. This can now be applied to a variety of physical models that admit Hopf solitons.

Acknowledgements. The authors wish to thank D. Auckly and P. M. Sutcliffe for fruitful discussions. S. K. acknowledges an EPSRC Research fellowship GR/S29478/01.

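Appendix: The Fundamental Group of the Moduli Space

We wish to compute the fundamental group of $M$, the orbit of a configuration $n : \mathbb{R}^3 \to S^2$ under $G = SO(3) \times SO(2)$, when $n$ is invariant under the axial symmetry group $K = \{(R_3(\alpha), e^{in\alpha}) : \alpha \in \mathbb{R}\} < G$, where $R_3(\alpha)$ denotes rotation through $\alpha$ about the $x_3$ axis. Since $M \cong G/K$ and $p : G \to G/K$ is a fibration, we have the associated homotopy exact sequence
\[ \begin{array}{rl} & K \xrightarrow{\ \iota\ } G \xrightarrow{\ p\ } G/K \\ \Rightarrow & \pi_1(K) \xrightarrow{\ \iota_*\ } \pi_1(G) \xrightarrow{\ p_*\ } \pi_1(M) \to \pi_0(K) \\ & \mathbb{Z} \xrightarrow{\ \iota_*\ } \mathbb{Z}_2 \oplus \mathbb{Z} \xrightarrow{\ p_*\ } \pi_1(M) \to 0. \end{array} \]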

Hence $p_*$ surjects, so $\pi_1(M) \equiv \pi_1(G)/\ker p_*$ by the Isomorphism Theorem. But $\ker p_*$ is, by exactness, the image of $\pi_1(K)$ under inclusion, clearly the infinite cyclic group generated by $1 \oplus n \in \pi_1(G)$. This group has precisely $2n$ cosets in $\pi_1(G)$, labelled by the elements $0 \oplus 0, 0 \oplus 1, \ldots, 0 \oplus (2n - 1)$, for example. Let us denote the coset $g + \ker p_*$ by $[g]$. It follows immediately that the quotient group $\pi_1(G)/\ker p_*$ is cyclic of order $2n$, generated by $[0 \oplus 1]$. Note also that the $2\pi$ spatial rotation loop lies in $1 \oplus 0 \in \pi_1(G)$, which projects to $[0 \oplus n] = n[0 \oplus 1]$ in $\pi_1(G)/\ker p_*$, since $1 \oplus 0 = 0 \oplus n - 1 \oplus n$. Hence the $2\pi$ spatial rotation loop in $M$ is noncontractible of order 2, independent of $n$ (and $Q$).

References

1. Adkins, G.S., Nappi, C.R., Witten, E.: Static properties of nucleons in the Skyrme model. Nucl. Phys. B228, 552 (1983)
2. Aratyn, H., Ferreira, L.A., Zimerman, A.H.: Exact static soliton solutions of 3+1 dimensional integrable theory with nonzero Hopf numbers. Phys. Rev. Lett. 83, 1723–1726 (1999)
3. Auckly, D., Kapitanski, L.: Holonomy and Skyrme's model. Commun. Math. Phys. 240, 97–122 (2003)
4. Auckly, D., Speight, J.M.: Fermionic quantization and configuration spaces for the Skyrme and Faddeev-Hopf models. http://arxiv.org/list/, 2004
5. Balachandran, A.P., Marmo, G., Skagerstam, B.S., Stern, A.: Classical Topology and Quantum States. Chap. 13.4, Singapore: World Scientific, 1991
6. Bander, M., Hayot, F.: Instability of rotating chiral solitons. Phys. Rev. D 30, 1837 (1984)
7. Battye, R.A., Sutcliffe, P.M.: Knots as stable soliton solutions in a three-dimensional classical field theory. Phys. Rev. Lett. 81, 4798 (1998)
8. Battye, R.A., Sutcliffe, P.M.: Solitons, links and knots. Proc. Roy. Soc. Lond. A455, 4305 (1999)
9. Braaten, E., Ralston, J.P.: Limitations of a semiclassical treatment of the Skyrmion. Phys. Rev. D 31, 598 (1985)
10. Faddeev, L.D.: Quantisation of solitons. Preprint IAS Print-75-QS70, Princeton, 1975
11. Faddeev, L.D., Niemi, A.J.: Knots and particles. Nature 387, 58 (1997)
12. Finkelstein, D., Rubinstein, J.: Connection between spin, statistics, and kinks. J. Math. Phys. 9, 1762 (1968)
13. Gipson, J.M., Tze, H.C.: Possible heavy solitons in the strongly coupled Higgs sector. Nucl. Phys. B183, 524 (1980)
14. Giulini, D.: On the Possibility of Spinorial Quantization in the Skyrme Model. Mod. Phys. Lett. A 8, 1917–1924 (1993)
15. Gladikowski, J., Hellmund, M.: Static solitons with non-zero Hopf number. Phys. Rev. D56, 5194–5199 (1997)
16. Hietarinta, J., Salo, P.: Faddeev-Hopf knots: Dynamics of linked un-knots. Phys. Lett. B451, 60–67 (1999)
17. Hietarinta, J., Salo, P.: Ground state in the Faddeev-Skyrme model. Phys. Rev. D62, 081701 (2000)
18. Houghton, C.J., Manton, N.S., Sutcliffe, P.M.: Rational maps, monopoles and skyrmions. Nucl. Phys. B510, 507 (1998)
19. Jones, H.F.: Groups, representations and physics. Bristol, UK: Adam Hilger, 1990, p. 80
20. Krusch, S.: Homotopy of rational maps and the quantization of Skyrmions. Ann. Phys. 304, 103–127 (2003)
21. Kundu, A., Rybakov, Y.P.: Closed vortex type solitons with Hopf index. J. Phys. A15, 269–275 (1982)
22. Leese, R.A., Manton, N.S., Schroers, B.J.: Attractive channel Skyrmions and the Deuteron. Nucl. Phys. B 442, 228 (1995)
23. McCleary, J.: User's guide to spectral sequences. Delaware: Publish or Perish, 1985, p. 102
24. Meissner, U.G.: Toroidal solitons with unit Hopf charge. Phys. Lett. B 154, 190–192 (1985)
25. Pontyagin, L.: A classification of mappings of the 3-dimensional complex into the 2-dimensional sphere. Mat. Sbornik N.S. 9:51, 331–363 (1941)
26. Su, W.C.M.: Faddeev-Skyrme model and rational maps. Chin. J. Phys. 40, 516 (2002)
27. Su, W.-C.: Semiclassical quantization of Hopf solitons. Phys. Lett. B525, 201–204 (2002)
28. Skyrme, T.H.R.: A nonlinear field theory. Proc. Roy. Soc. Lond. A260, 127 (1961)
29. Vakulenko, A.F., Kapitansky, L.V.: Stability of solitons in S(2) in the nonlinear sigma model. Sov. Phys. Dokl. 24, 433–434 (1979)
30. Ward, R.S.: Hopf solitons on $S^3$ and $R^3$. Nonlinearity 12, 241 (1999)

Communicated by G.W. Gibbons




On Fermion Grading Symmetry for Quasi-Local Systems

Hajime Moriya

Department of Mathematics, Graduate School of Science, Hokkaido University, Kita 10, Nishi 8, Kita-Ku, Sapporo, Hokkaido, 060-0810, Japan. E-mail: [email protected]

Received: 15 April 2005 / Accepted: 24 November 2005
Published online: 15 March 2006 – © Springer-Verlag 2006

Abstract: We discuss fermion grading symmetry for quasi-local systems with graded commutation relations. We introduce a criterion of spontaneous symmetry breaking (SSB) for general quasi-local systems. It is formulated based on the idea that each pair of distinct phases (appearing in spontaneous symmetry breaking) should be disjoint not only for the total system but also for every complementary outside system of a local region specified by the given quasi-local structure. Under a completely model independent setting, we show the absence of SSB for fermion grading symmetry in the above sense. We obtain some structural results for equilibrium states of lattice systems. If there exists an even KMS state for some even dynamics that is decomposed into noneven KMS states, then those noneven states inevitably violate our local thermal stability condition.

1. Introduction

The univalence superselection rule, which forbids the superposition of two states whose total angular momenta are integers and half-integers, is regarded as a natural law [28]; see also e.g. § 6.1 of [8], III.1 of [16], § 2.2 of [27]. However, if we take a more fundamental standpoint, there are subtle points in deciding whether a (conserved) quantity satisfies the superselection rule, see e.g. [6]. In particular, if the number of degrees of freedom is infinite, as is usually considered in statistical mechanics and quantum field theory, it is not obvious that a symmetry assumed for the kinematics leads to its preservation at the state level, that is, to the absence of spontaneous symmetry breaking. In this note we try to justify the univalence superselection rule, i.e. unbroken symmetry of fermion grading transformations that multiply fermion fields by −1. We clarify how fermion grading symmetry is different from other symmetries and is hardly broken.

We shall review some relevant results. First, if a state is invariant under some asymptotically abelian group of automorphisms like space-translations, then fermion grading symmetry is perfectly preserved. That is, any such state has zero expectation value for every odd element [18, 23]. (See also e.g. 7.1.6 of [25], Exam. 5.2.21 of [14]. The same statement for quantum field theory is given in [15].) We also refer to [22], which discusses (possible) forms of symmetry breaking of fermion grading transformations for dynamics that commute with some asymptotically abelian group of automorphisms. But the status of broken and unbroken symmetry of fermion grading is not given there.

It seems not unreasonable to expect unbroken symmetry of fermion grading irrespective of such translation invariance assumptions. It has been shown, however, that non-factor quasi-free states of the CAR algebra have odd elements in their centers and give an example of the breakdown of fermion grading symmetry, though this example is rather technical and does not come from a physical model [20]. We note that two mutually disjoint noneven states in the factor decomposition of each non-factor quasi-free state have a common state restriction outside of some local region. It can be said that those noneven states are not macroscopically distinguishable. We are led to consider that the conventional criterion of spontaneous symmetry breaking, based on the center merely for the total system, is too weak to be an appropriate formulation for general quasi-local systems.

We introduce a more demanding criterion of SSB for general quasi-local systems, which turns out to be equivalent to the usual one for tensor-product systems. A pair of states are said to be disjoint with respect to the given quasi-local structure if for every local region, their state restrictions to its complementary outside system induce disjoint GNS representations (Definition 1). Using this notion, we propose a criterion of spontaneous symmetry breaking (Definition 2). We show the absence of spontaneous symmetry breaking in the above sense for fermion grading symmetry for general graded quasi-local systems that encompass lattice and continuous systems (Proposition 1). This proposition may be similar to the following statement in [24]: No odd element exists in observable at infinity [19].

We study temperature states (Gibbs states and KMS states) of lattice systems with graded commutation relations. For every even Gibbs state, we have a grading preserving isomorphism from its center onto that of its state restriction to the complementary outside system of each local region (Proposition 3). For now, we cannot provide a definite answer as to whether fermion grading symmetry is perfectly preserved or not for temperature states of those lattice systems. We only claim that if a KMS state breaks the fermion grading symmetry, then it is not thermodynamically stable. More precisely, suppose that the odd part of the center of an even KMS state for even dynamics is not empty. Then in the factor decomposition of its perturbed state by a local Hamiltonian multiplied by the inverse temperature, there are noneven KMS states that violate the local thermal stability condition (a minimum free energy condition for open systems) with respect to the perturbed dynamics acting trivially on the specified local region (Proposition 5). We give a remark upon our choice of the local thermal stability condition. In [10] we introduced two versions of local thermal stability: LTS-M and LTS-P. We make use of the latter, which will be simply called LTS here. (See the Appendix for the details.) Though we have no example of such breaking, nor a disproof of its existence, we may say that the violation of the univalence superselection rule, if it were to occur, is pathological from a thermodynamical viewpoint.

2. Notation and Some Known Results

We recall the definition of quasi-local C∗-systems. (For references, we refer e.g. to § 2 of [24], § 2.6 of [14], and § 7.1 of [25].) Let $\mathfrak{F}$ be a directed set with a partial order relation ≥ and an orthogonal relation ⊥ satisfying the following conditions:

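a) If $\alpha \le \beta$ and $\beta \perp \gamma$, then $\alpha \perp \gamma$.
b) For each $\alpha, \beta \in \mathfrak{F}$, there exists a unique upper bound $\alpha \vee \beta \in \mathfrak{F}$ which satisfies $\gamma \ge \alpha \vee \beta$ for any $\gamma \in \mathfrak{F}$ such that $\gamma \ge \alpha$ and $\gamma \ge \beta$.
c) For each $\alpha \in \mathfrak{F}$, there exists a unique $\alpha_c$ in $\mathfrak{F}$ satisfying $\alpha_c \perp \alpha$ and $\alpha_c \ge \beta$ for any $\beta \in \mathfrak{F}$ such that $\beta \perp \alpha$.

We consider a C∗-algebra $\mathcal{A}$ furnished with the following structure. Let $\{\mathcal{A}_\alpha;\ \alpha \in \mathfrak{F}\}$ be a family of C∗-subalgebras of $\mathcal{A}$ with the index set $\mathfrak{F}$. Let $\Theta$ be an involutive ∗-automorphism that determines the grading on $\mathcal{A}$ as
\[ \mathcal{A}^e := \{A \in \mathcal{A} \mid \Theta(A) = A\}, \qquad \mathcal{A}^o := \{A \in \mathcal{A} \mid \Theta(A) = -A\}. \tag{1} \]
These $\mathcal{A}^e$ and $\mathcal{A}^o$ are called the even and the odd parts of $\mathcal{A}$. For $\alpha \in \mathfrak{F}$,
\[ \mathcal{A}^e_\alpha := \mathcal{A}^e \cap \mathcal{A}_\alpha, \qquad \mathcal{A}^o_\alpha := \mathcal{A}^o \cap \mathcal{A}_\alpha. \tag{2} \]
The above grading structure is referred to as fermion grading (by the condition L4 defined below). For a given state $\omega$ on $\mathcal{A}$, its restriction to $\mathcal{A}_\alpha$ is denoted $\omega_\alpha$. If a state takes zero for all odd elements, it is called even. Let $\mathfrak{F}_{loc}$ be a subset of $\mathfrak{F}$ corresponding to the set of indices of all local subsystems and set $\mathcal{A}_{loc} := \bigcup_{\alpha \in \mathfrak{F}_{loc}} \mathcal{A}_\alpha$. We assume L1, L2, L3, L4 as follows:

L1. $\mathcal{A}_{loc} \cap \mathcal{A}_\delta$ is norm-dense in $\mathcal{A}_\delta$ for any $\delta \in \mathfrak{F}$.
L2. If $\alpha \ge \beta$, then $\mathcal{A}_\alpha \supset \mathcal{A}_\beta$.
L3. $\Theta(\mathcal{A}_\alpha) = \mathcal{A}_\alpha$ for all $\alpha \in \mathfrak{F}$.
L4. For $\alpha \perp \beta$ the following graded commutation relations hold:
\[ [\mathcal{A}^e_\alpha, \mathcal{A}^e_\beta] = 0, \qquad [\mathcal{A}^e_\alpha, \mathcal{A}^o_\beta] = [\mathcal{A}^o_\alpha, \mathcal{A}^e_\beta] = 0, \qquad \{\mathcal{A}^o_\alpha, \mathcal{A}^o_\beta\} = 0, \]
where $[A, B] = AB - BA$ is the commutator and $\{A, B\} = AB + BA$ is the anticommutator.

Our $\mathfrak{F}_{loc}$ may correspond to the set of all bounded open subsets of a space(-time) region or the set of all finite subsets of a lattice. Regarding condition c), $\alpha_c$ will indicate the complement of $\alpha$ in the total region. We set L1 as it is because it is needed in the proof of Proposition 1. For $A \in \mathcal{A}$ (and also for $A \in \mathcal{A}_\alpha$ due to condition L3), we have the following unique decomposition:
\[ A = A_+ + A_-, \qquad A_+ := \frac{1}{2}\bigl(A + \Theta(A)\bigr) \in \mathcal{A}^e\ (\mathcal{A}^e_\alpha), \qquad A_- := \frac{1}{2}\bigl(A - \Theta(A)\bigr) \in \mathcal{A}^o\ (\mathcal{A}^o_\alpha). \tag{3} \]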
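The decomposition (3) can be made concrete in the smallest possible example. The sketch below is an illustration only and is not taken from the paper: it assumes the algebra of 2 × 2 matrices (a single fermion mode) with the grading automorphism realized as conjugation by diag(1, −1), so that diagonal matrices are even and off-diagonal matrices are odd. The function names `grade` and `even_odd_parts` are hypothetical.

```python
def grade(A):
    """Assumed grading automorphism on 2x2 matrices: conjugation by diag(1, -1)."""
    return [[(-1) ** (i + j) * A[i][j] for j in range(2)] for i in range(2)]


def even_odd_parts(A):
    """Return (A_plus, A_minus) as in (3): half the sum and half the difference."""
    TA = grade(A)
    even = [[(A[i][j] + TA[i][j]) / 2 for j in range(2)] for i in range(2)]
    odd = [[(A[i][j] - TA[i][j]) / 2 for j in range(2)] for i in range(2)]
    return even, odd


A = [[1 + 2j, 3], [4j, -5]]
A_plus, A_minus = even_odd_parts(A)
print(A_plus)    # diagonal part of A
print(A_minus)   # off-diagonal part of A
```

In this model the even part is fixed by `grade` and the odd part changes sign under it, which is exactly the defining property in (1).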

In order to ensure that the fermion grading involution acts non-trivially on A, we may assume, for example, that Aoα is not empty for all α ∈ F. However, all our results below obviously hold for any trivial cases where fermions do not or rarely exist.

414

H. Moriya

3. A Criterion of Spontaneous Symmetry Breaking Appropriate for General Quasi-Local Systems and Fermion Grading Symmetry A pair of states will be called disjoint with each other if their GNS representations are disjoint, see e.g. § 2.4.4 and § 4.2.2 of [14]. We shall employ the following more demanding condition for disjointness of two states. Definition 1. Let ω1 and ω2 be states of a quasi-local system (A, {Aα }α∈Floc ). If for every γ ∈ Floc , their restrictions to the complementary outside system of γ , i.e., ω1γc and ω2γc are disjoint with each other, then ω1 and ω2 are said to be disjoint with respect to the quasi-local structure {Aα }α∈Floc . We shall give a criterion of spontaneous symmetry breaking based on Definition 1 as follows. Let G be a group and τg (g ∈ G) be its action of ∗-automorphisms on a quasilocal system (A, {Aα }α∈Floc ). Suppose that τg commutes with a given (Hamiltonian) dynamics for every g ∈ G. Let denote some set of physical states (e.g. the set of all ground states or all equilibrium states at some temperature for the given dynamics), G and G denote the set of all G-invariant states in . Let ω be an extremal point in . Suppose that ω has a factor state decomposition in in the form of ω = dµ(g)ωg with ωg := τg∗ ω0 (= ω0 ◦ τg ), where ω0 is a factor state in (but not in G ) and so is each ωg , and µ denotes some probability measure on G. With the above setting, we define the following. Definition 2. If for each g = g of G a pair of factor states ωg and ωg are disjoint with respect to the given quasi-local structure, then it is said that the G-symmetry is macroscopically broken. Let ω be a state of a quasi-local system (A, {Aα }α∈Floc ). It is said that ω satisfies the cluster property (with respect to the quasi-local structure) if for any given ε > 0 and any A ∈ A there exists an α ∈ Floc such that ω(AB) − ω(A)ω(B) < ε B (4) for all B ∈ β⊥α Aβ . 
It is shown in [24] and Theorem 2.6.5 [14] that every factor state satisfies this cluster property. However, the converse does not always hold; non-factor quasi-free states of the CAR algebra satisfy the cluster property with respect to the quasi-local (lattice) structure used for their construction, see [20] for details. The following proposition asserts that fermion grading symmetry cannot be broken in the sense of Definition 2. A remarkable thing is that it makes no reference to the dynamics. We are using essentially no more than the canonical anticommutation relations (CAR) for its proof. (The idea of the proof comes from our study on state correlation for composite fermion systems done in [12, 21].) Proposition 1. Let ω be a state of a quasi-local system (A, {Aα }α∈Floc ) and denote the fermion grading involution of A. Suppose that ω satisfies the cluster property with respect to the quasi-local structure. Then ω and ω cannot be disjoint with respect to the quasi-local structure {Aα }α∈Floc . Accordingly spontaneous symmetry breaking in the sense of Definition 2 does not exist for fermion grading symmetry. Proof. Suppose that ω and ω are disjoint with respect to the quasi-local structure {Aα }α∈Floc . Then ω and ω restricted to Aαc are disjoint for each α ∈ Floc . Hence it follows that ωα − ωα = 2. (5) c

c


415

This is equivalent to the existence of an odd element A− ∈ Aoαc such that A− ≤ 1 and |ω(A− ) − ω((A− ))| = |ω(A− ) − ω(−A− )| = 2|ω(A− )| = 2, namely, ω(A− ) = 1. (6) By (3) and L1, we have that Aloc ∩ Aeδ is norm dense in Aeδ and so is Aloc ∩ Aoδ in for any δ ∈ F. Hence from (6), we have some A− in Aoγ for some γ ∈ Floc such that γ ≤ αc , A− ≤ 1 and ω(A− ) > 0.999. (7) Aoδ

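(We use a sloppy notation for $A_-$ in the above; the $A_-$ in (6) belonging to $\mathcal{A}_{\alpha_c}$ is approximated by the $A_-$ in (7) belonging to $\mathcal{A}^o_\gamma$.) By the decomposition of $A_-$ into hermitian elements, $A_- = \tfrac12(A_-+A_-^*) - i\bigl(\tfrac{i}{2}(A_--A_-^*)\bigr)$, we have

$$\Bigl|\,\omega\bigl(\tfrac12(A_-+A_-^*)\bigr) - i\,\omega\bigl(\tfrac{i}{2}(A_--A_-^*)\bigr)\Bigr| > 0.999.$$

Since $(A_-+A_-^*)$ and $i(A_--A_-^*)$ are both self-adjoint, we have

$$\bigl|\omega\bigl(\tfrac12(A_-+A_-^*)\bigr)\bigr|^2 + \bigl|\omega\bigl(\tfrac{i}{2}(A_--A_-^*)\bigr)\bigr|^2 > 0.999^2.$$

Hence we have

$$\bigl|\omega\bigl(\tfrac12(A_-+A_-^*)\bigr)\bigr| > \frac{0.999}{\sqrt2} \quad\text{or}\quad \bigl|\omega\bigl(\tfrac{i}{2}(A_--A_-^*)\bigr)\bigr| > \frac{0.999}{\sqrt2}. \tag{8}$$

From (8), $\|\tfrac12(A_-+A_-^*)\|\le 1$ and $\|\tfrac{i}{2}(A_--A_-^*)\|\le 1$, we can choose $A_- = A_-^*\in\mathcal{A}^o_\gamma$ (by adjusting $\pm1$) such that $\|A_-\|\le 1$ and

$$\omega(A_-) > \frac{0.999}{\sqrt2}. \tag{9}$$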
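The step from the squared estimate to (8) is an elementary fact about two real numbers. A minimal sketch, where $x$ and $y$ are shorthand introduced here only for illustration (they are not notation of the paper):

```latex
% x, y: illustrative shorthand for the two real expectation values.
\begin{align*}
x &:= \omega\bigl(\tfrac12(A_- + A_-^*)\bigr) \in \mathbb{R}, \qquad
y := \omega\bigl(\tfrac{i}{2}(A_- - A_-^*)\bigr) \in \mathbb{R},\\
|\omega(A_-)|^2 &= |x - i y|^2 = x^2 + y^2 > 0.999^2 .\\
\intertext{If both $|x| \le 0.999/\sqrt{2}$ and $|y| \le 0.999/\sqrt{2}$ held, then}
x^2 + y^2 &\le \frac{0.999^2}{2} + \frac{0.999^2}{2} = 0.999^2 ,
\end{align*}
% which contradicts the strict inequality above; hence at least one of |x|, |y| exceeds 0.999/\sqrt{2}.
```

Whichever of the two hermitian parts has the large expectation value is again odd and of norm at most 1, and multiplying it by $\pm1$ makes the expectation positive; this is the element renamed $A_-$ in (9).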

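By the cluster property assumption (4) on $\omega$, for a sufficiently small $\varepsilon>0$ and the above specified $A_-\in\mathcal{A}^o_\gamma$ there exists an $\alpha'\in\mathfrak{F}_{\mathrm{loc}}$ such that

$$|\omega(A_-B)-\omega(A_-)\omega(B)| < \varepsilon\,\|B\| \tag{10}$$

for all $B\in\bigcup_{\beta\perp\alpha'}\mathcal{A}_\beta$. By (5) with $\alpha=\gamma\vee\alpha'$, the same argument leading to (9) implies that there exists $B_- = B_-^*\in\mathcal{A}^o_\zeta$ such that $\zeta\perp(\gamma\vee\alpha')$, $\|B_-\|\le 1$ and

$$\omega(B_-) > \frac{0.999}{\sqrt2}. \tag{11}$$

Substituting the above $B_-$ for $B$ in (10), and using (9) and (11), we have

$$|\operatorname{Im}\,\omega(A_-B_-)| < \varepsilon, \qquad \operatorname{Re}\,\omega(A_-B_-) > \frac{0.999^2}{2}-\varepsilon. \tag{12}$$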

H. Moriya

Due to $A_- = A_-^*\in\mathcal{A}^o_\gamma$, $B_- = B_-^*\in\mathcal{A}^o_\zeta$, and $\gamma\perp\zeta$, the product $A_-B_-$ is skew-self-adjoint, i.e. $(A_-B_-)^* = -A_-B_-$. Therefore $\omega(A_-B_-)$ is a purely imaginary number, which however contradicts (12). Thus we have shown that $\omega$ and $\omega\Theta$ cannot be disjoint with respect to $\{\mathcal{A}_\alpha\}_{\alpha\in\mathfrak{F}_{\mathrm{loc}}}$. Since any factor state satisfies the cluster property, the possibility of SSB of Definition 2 for the symmetry $\Theta$ is negated.
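The final contradiction can be spelled out in a few lines. The sketch below assumes, as in the graded commutation relations of the quasi-local system, that odd elements localized in mutually disjoint regions anticommute; it also takes $\varepsilon < 0.999^2/2$, which is allowed since $\varepsilon$ was arbitrary.

```latex
\begin{align*}
(A_- B_-)^* &= B_-^* A_-^* = B_- A_- = -A_- B_- ,\\
\overline{\omega(A_- B_-)} &= \omega\bigl((A_- B_-)^*\bigr) = -\,\omega(A_- B_-)
  \;\Longrightarrow\; \operatorname{Re}\,\omega(A_- B_-) = 0 ,\\
\text{whereas (12) gives}\quad \operatorname{Re}\,\omega(A_- B_-) &> \frac{0.999^2}{2} - \varepsilon > 0 .
\end{align*}
```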

4. On the Centers of Temperature States of Lattice Systems

From now on, we consider lattice fermion systems [11] and also the lattice systems with graded commutation relations [7] satisfying the translation uniformity to be specified. Take $\mathbb{Z}^\nu$, the $\nu$-dimensional ($\nu\in\mathbb{N}$) cubic integer lattice. Let $\mathfrak{F}_{\mathrm{loc}}$ be the set of all finite subsets of the lattice. We assume that there is a finite number of degrees of freedom (spins) on each site of the lattice. For general graded lattice systems, we further assume that the subalgebra $\mathcal{A}_{\{i\}}$ on each site $i$ of the lattice is isomorphic to a $d\times d$ full matrix algebra, $d\in\mathbb{N}$ being independent of $i$. Hence for each $I\in\mathfrak{F}_{\mathrm{loc}}$, $\mathcal{A}_I$ is isomorphic to a $d^{|I|}\times d^{|I|}$ full matrix algebra, and $\mathcal{A}$ is a UHF algebra of type $d^\infty$ by Lemma 2.1 of [7]. (As an example of such systems, $\mathcal{A}_{\{i\}}$ is generated by fermion operators $a_i$, $a_i^*$, and spin operators represented by the Pauli matrices $\sigma_i^x$, $\sigma_i^y$, $\sigma_i^z$, which are even elements commuting with all fermion operators.) We denote the conditional expectation of the tracial state from $\mathcal{A}$ onto $\mathcal{A}_J$ by $E_J$. The interaction among sites is determined by the potential $\Phi$, a map from $\mathfrak{F}_{\mathrm{loc}}$ to $\mathcal{A}$ satisfying the following conditions:

($\Phi$-a) $\Phi(I)\in\mathcal{A}_I$, $\Phi(\emptyset)=0$.
($\Phi$-b) $\Phi(I)^* = \Phi(I)$.
($\Phi$-c) $\Theta(\Phi(I)) = \Phi(I)$.
($\Phi$-d) $E_J(\Phi(I)) = 0$ if $J\subset I$ and $J\neq I$.
($\Phi$-e) For each fixed $I\in\mathfrak{F}_{\mathrm{loc}}$, the net $\{H_J(I)\}_J$ with $H_J(I) := \sum_K\{\Phi(K);\ K\cap I\neq\emptyset,\ K\subset J\}$ is a Cauchy net for $J\in\mathfrak{F}_{\mathrm{loc}}$ in the norm topology converging to a local Hamiltonian $H(I)\in\mathcal{A}$.

Let $\mathcal{P}$ denote the real vector space of all $\Phi$ satisfying all the above conditions. The set of all ∗-derivations on the domain $\mathcal{A}_{\mathrm{loc}}$ commuting with $\Theta$ is denoted $D(\mathcal{A}_{\mathrm{loc}})$. There exists a bijective real linear map from $\Phi\in\mathcal{P}$ to $\delta\in D(\mathcal{A}_{\mathrm{loc}})$ for the lattice fermion systems (Theorem 5.13 of [11]), and similarly for the graded lattice systems (Theorem 4.2 of [7]). The connection between $\delta\in D(\mathcal{A}_{\mathrm{loc}})$ and its corresponding $\Phi\in\mathcal{P}$ is given by

$$\delta(A) = i[H(I), A], \qquad A\in\mathcal{A}_I \tag{13}$$

for every $I\in\mathfrak{F}_{\mathrm{loc}}$, where the local Hamiltonian $H(I)$ is determined by ($\Phi$-e) for this $\Phi$. The condition ($\Phi$-d) is called standardness; it fixes ambiguous terms (such as scalars) irrelevant to the dynamics given by (13). We remark that any product state, for example the Fock state, can be used in place of the tracial state for $E_J$ to obtain a similar one-to-one correspondence between $\delta$ and $\Phi$. Furthermore, characterizations of equilibrium states, such as LTS, Gibbs (and also the variational principle for translation invariant states), have all been shown to be independent of the choice of those product states [7].

The above-mentioned Gibbs condition was defined for the quantum spin lattice systems [9], and then extended to the lattice fermion systems in § 7.3 of [11], and to the


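graded lattice systems under consideration [7]. Let $\Omega$ be a cyclic and separating vector of a von Neumann algebra $\mathfrak{M}$ on $\mathcal{H}$ and let $\Delta$ denote the modular operator for $(\mathfrak{M},\Omega)$, see [26]. The state $\omega$ on $\mathfrak{M}$ given by $\omega(A) = (\Omega, A\Omega)$ for $A\in\mathfrak{M}$ satisfies the KMS condition for the modular automorphism group $\sigma_t := \mathrm{Ad}(\Delta^{it})$, $t\in\mathbb{R}$, at the inverse temperature $\beta=-1$ and is called the modular state with respect to $\sigma_t$. The following definition works for any lattice system under consideration.

Definition 3. Let $\varphi$ be a state of $\mathcal{A}$ and $(\mathcal{H}_\varphi, \pi_\varphi, \Omega_\varphi)$ be its GNS triplet. It is said that $\varphi$ satisfies the Gibbs condition for $\delta\in D(\mathcal{A}_{\mathrm{loc}})$ at inverse temperature $\beta\in\mathbb{R}$, for short the $(\delta,\beta)$-Gibbs condition, if and only if the following conditions are satisfied:

(Gibbs-1) The GNS vector $\Omega_\varphi$ is separating for $\mathfrak{M}_\varphi := \pi_\varphi(\mathcal{A})''$.

For $(\mathfrak{M}_\varphi, \Omega_\varphi, \mathcal{H}_\varphi)$, the modular operator $\Delta_\varphi$ and the modular automorphism group $\sigma_{\varphi,t}$ are defined. Let $\sigma^{\beta H(I)}_{\varphi,t}$ denote the one-parameter group of ∗-automorphisms determined by the generator $\delta_\varphi + \delta_{\pi_\varphi(\beta H(I))}$, where $\delta_\varphi$ denotes the generator for $\sigma_{\varphi,t}$ and $\delta_{\pi_\varphi(\beta H(I))}(A) := i[\beta\pi_\varphi(H(I)), A]$ for $A\in\mathfrak{M}_\varphi$.

(Gibbs-2) For every $I\in\mathfrak{F}_{\mathrm{loc}}$, $\sigma^{\beta H(I)}_{\varphi,t}$ fixes the subalgebra $\pi_\varphi(\mathcal{A}_I)$ elementwise.

The modular state for $\sigma^{\beta H(I)}_{\varphi,t}$ is given as the vector state of a (uniquely determined) unit vector $\Omega^{\beta H(I)}_\varphi$ lying in the natural cone for $(\mathfrak{M}_\varphi, \Omega_\varphi)$ and is denoted $\varphi^{\beta H(I)}$. We use the same symbol for its restriction to $\mathcal{A}$, namely,

$$\varphi^{\beta H(I)}(A) := \bigl(\pi_\varphi(A)\,\Omega^{\beta H(I)}_\varphi,\ \Omega^{\beta H(I)}_\varphi\bigr)$$

for $A\in\mathcal{A}$. We remark that $\Omega^{\beta H(I)}_\varphi$ is normalized and $\varphi^{\beta H(I)}(1)=1$ in our notation. (For the general references of the perturbed states, the relative modular automorphisms, and their application to quantum statistical mechanics, see [1, 2] and § 5.4 of [14].) We next show the product property of $\varphi^{\beta H(I)}$ in the following sense.

Lemma 2. Let $\varphi$ be a $(\delta,\beta)$-Gibbs state for $\delta\in D(\mathcal{A}_{\mathrm{loc}})$ and $\beta\in\mathbb{R}$. If it is even, then for each $I\in\mathfrak{F}_{\mathrm{loc}}$, $\varphi^{\beta H(I)}$ is a product state extension of the tracial state $\mathrm{tr}_I$ on $\mathcal{A}_I$ and its restriction to $\mathcal{A}_{I_c}$, as denoted

$$\varphi^{\beta H(I)} = \mathrm{tr}_I\circ\varphi^{\beta H(I)}|_{\mathcal{A}_{I_c}}. \tag{14}$$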

Proof. It has already been shown in Proposition 7.7 of [11] for the lattice fermion systems, and we can easily verify this statement for the graded lattice systems as well. But we shall provide a slightly simpler proof. In Theorem 9.1 of [2] it is shown that

$$\varphi^{\beta H(I)}([Q_1,Q_2]\,Q) = 0 \tag{15}$$

for every $Q_1, Q_2\in\mathcal{A}_I$ and $Q\in\mathcal{A}_I'$, the commutant of $\mathcal{A}_I$ in $\mathcal{A}$. From this we see that $\varphi^{\beta H(I)}$ is a product state extension of the tracial state $\mathrm{tr}_I$ on $\mathcal{A}_I$ and its restriction to $\mathcal{A}_I'$. Since $\varphi$ is an even state and $H(I)$ is an even self-adjoint element, $\varphi^{\beta H(I)}$ is also even. It is easy to see $\mathcal{A}_I' = \mathcal{A}^e_{I_c} + v_I\,\mathcal{A}^o_{I_c}$, where

$$v_i := a_i^*a_i - a_ia_i^*, \qquad v_I := \prod_{i\in I} v_i. \tag{16}$$
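The properties of $v_i$ used next follow from the CAR alone. The computation below is a sketch under the assumption (standard for lattice fermions, though not restated in this section) that $a_ia_i^* + a_i^*a_i = 1$ and $a_i^2 = 0$.

```latex
\begin{align*}
v_i &= a_i^* a_i - a_i a_i^* = 2 a_i^* a_i - 1, \qquad v_i^* = v_i,\\
(a_i^* a_i)^2 &= a_i^* (1 - a_i^* a_i)\, a_i = a_i^* a_i
  \;\Longrightarrow\; v_i^2 = 4 a_i^* a_i - 4 a_i^* a_i + 1 = 1,\\
v_i\, a_i\, v_i &= (2 a_i^* a_i - 1)\, a_i\, (2 a_i^* a_i - 1)
  = -a_i\,(2 a_i^* a_i - 1) = -2 a_i a_i^* a_i + a_i = -a_i .
\end{align*}
```

Thus each $v_i$ is a self-adjoint unitary with $\mathrm{Ad}(v_i)(a_i) = -a_i$. Since $v_i$ is even, it commutes with $a_j$ for $j\neq i$, so the $v_i$ commute with one another and the same properties pass to the product $v_I$.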

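This $v_I$ is a self-adjoint unitary implementing $\Theta$ on $\mathcal{A}_I$. For $A_+\in\mathcal{A}^e_I$, $A_-\in\mathcal{A}^o_I$, $B_+\in\mathcal{A}^e_{I_c}$ and $B_-\in\mathcal{A}^o_{I_c}$, computing the expectation values of all $A_\sigma B_{\sigma'}$ with $\sigma=\pm$ and $\sigma'=\pm$ for $\varphi^{\beta H(I)}$, we obtain $\varphi^{\beta H(I)}(A_+B_+) = \mathrm{tr}_I(A_+)\,\varphi^{\beta H(I)}(B_+)$, and zero for the others, i.e. $A_+B_-$, $A_-B_+$ and $A_-B_-$. Therefore $\varphi^{\beta H(I)}$ is equal to the product state extension of the tracial state $\mathrm{tr}_I$ on $\mathcal{A}_I$ and $\varphi^{\beta H(I)}|_{\mathcal{A}_{I_c}}$.

We provide a grading structure for von Neumann algebras generated by even states and for their $\Theta$-invariant subalgebras. For an even state $\omega$ of a quasi-local system, let $(\mathcal{H}_\omega, \pi_\omega, \Omega_\omega)$ be a GNS triplet of $\omega$ and let $\mathfrak{M}_\omega$ denote the von Neumann algebra generated by this representation. Let $U_{\Theta,\omega}$ be a unitary operator of $\mathcal{H}_\omega$ implementing the grading involution $\Theta$, and $\Theta_\omega := \mathrm{Ad}(U_{\Theta,\omega})$. Then the even and odd parts of $\mathfrak{M}_\omega$ are given by

$$\mathfrak{M}^e_\omega := \{A\in\mathfrak{M}_\omega \mid \Theta_\omega(A) = A\}, \qquad \mathfrak{M}^o_\omega := \{A\in\mathfrak{M}_\omega \mid \Theta_\omega(A) = -A\}. \tag{17}$$

Let $\mathfrak{N}$ be a $\Theta_\omega$-invariant subalgebra of $\mathfrak{M}_\omega$. We give its grading as

$$\mathfrak{N}^{e(\Theta_\omega)} := \mathfrak{N}\cap\mathfrak{M}^e_\omega, \qquad \mathfrak{N}^{o(\Theta_\omega)} := \mathfrak{N}\cap\mathfrak{M}^o_\omega, \tag{18}$$

where the superscripts $e(\Theta_\omega)$ and $o(\Theta_\omega)$ indicate that the grading is determined by $\Theta_\omega$. For any $A\in\mathfrak{M}_\omega$ (also $A\in\mathfrak{N}$), we have its unique decomposition $A = A_+ + A_-$ such that $A_+\in\mathfrak{M}^e_\omega$ ($\mathfrak{N}^{e(\Theta_\omega)}$) and $A_-\in\mathfrak{M}^o_\omega$ ($\mathfrak{N}^{o(\Theta_\omega)}$) in the same manner as (3).

Let $\omega_1$ and $\omega_2$ be even states on $\mathcal{A}$. Let $\mathfrak{N}_1$ and $\mathfrak{N}_2$ be subalgebras of $\mathfrak{M}_{\omega_1}$ and $\mathfrak{M}_{\omega_2}$ that are invariant under $\Theta_{\omega_1}$ and $\Theta_{\omega_2}$, respectively. If there is an isomorphism $\eta$ from $\mathfrak{N}_1$ onto $\mathfrak{N}_2$, that is, $\mathfrak{N}_1$ and $\mathfrak{N}_2$ are isomorphic, then we denote this relationship by $\mathfrak{N}_1\sim\mathfrak{N}_2$. If there is a grading preserving isomorphism $\eta$ from $\mathfrak{N}_1$ onto $\mathfrak{N}_2$, that is, $\eta$ maps the even part to the even and the odd to the odd, then we write $\mathfrak{N}_1\sim_\Theta\mathfrak{N}_2$. Obviously each of '$\sim$' and '$\sim_\Theta$' is an equivalence relation.

We recall relative entropy, which will be used in the proof of the next proposition and also for the formulation of our local thermal stability condition in the next section and the Appendix. For two states $\omega_1$ and $\omega_2$ of a finite-dimensional system, it is defined by

$$S(\omega_1,\omega_2) := \begin{cases}\omega_2(\log D_2 - \log D_1), & \text{if } \ker D_1\subset\ker D_2,\\ +\infty, & \text{otherwise},\end{cases} \tag{19}$$

where $D_i$ is the density matrix for $\omega_i$ ($i=1,2$). It is positive, and zero if and only if $\omega_1=\omega_2$. Its generalization to von Neumann algebras is given in [4, 5]. (Note that the order of two states and the sign convention of relative entropy are both reversed in [14].)

In the following discussion we are interested in centers. Let us denote the center of $\mathfrak{M}_\omega$ by $\mathfrak{Z}_\omega$. It is immediate to see that $\mathfrak{Z}_\omega$ is $\Theta_\omega$-invariant for an even state $\omega$. We shall use the shorthand $\mathfrak{Z}^e_\omega$ and $\mathfrak{Z}^o_\omega$ for $\mathfrak{Z}_\omega^{e(\Theta_\omega)}$ and $\mathfrak{Z}_\omega^{o(\Theta_\omega)}$, respectively.

Proposition 3. Let $\varphi$ be an even $(\delta,\beta)$-Gibbs state. For $I\in\mathfrak{F}_{\mathrm{loc}}$, let $\varphi_{I_c}$ denote the state restriction of $\varphi$ onto $\mathcal{A}_{I_c}$. Then for any $I\in\mathfrak{F}_{\mathrm{loc}}$ there is a grading preserving isomorphism between the centers of the von Neumann algebras generated by the GNS representation of $\varphi$ and by that of $\varphi_{I_c}$. In particular, $\varphi$ is a factor state if and only if $\varphi_{I_c}$ is also.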


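Proof. Let $(\mathcal{H}_\varphi, \pi_\varphi, \Omega_\varphi)$ be a GNS triplet of $\varphi$, and let $\Omega^{\beta H(I)}_\varphi$ denote the normalized vector representing its perturbed state $\varphi^{\beta H(I)}$ as in Definition 3. By Theorem 3.10 of [5] (also by the discussion below Definition 6.2.29 of [14]),

$$S(\varphi, \varphi^{\beta H(I)}) \le 2\|\beta H(I)\|, \qquad S(\varphi^{\beta H(I)}, \varphi) \le 2\|\beta H(I)\|. \tag{20}$$

Since the relative entropy is not increasing by restriction onto any subsystem, taking the restrictions of $\varphi$ and $\varphi^{\beta H(I)}$ onto $\mathcal{A}_{I_c}$, denoted $\varphi_{I_c}$ and $\varphi^{\beta H(I)}_{I_c}$ respectively, we have

$$S(\varphi_{I_c}, \varphi^{\beta H(I)}_{I_c}) \le 2\|\beta H(I)\|, \tag{21}$$

$$S(\varphi^{\beta H(I)}_{I_c}, \varphi_{I_c}) \le 2\|\beta H(I)\|. \tag{22}$$

By applying the argument in § 2 and 3 of [3] to the present case, (21) implies that $\varphi_{I_c}$ quasi-contains $\varphi^{\beta H(I)}_{I_c}$, and also for (22), vice versa. (The notion of quasi-containment given in this reference is as follows. For a pair of representations $\pi_1$ and $\pi_2$ of a C∗-algebra, if there is a subrepresentation of $\pi_1$ which is quasi-equivalent to $\pi_2$, then $\pi_1$ is said to quasi-contain $\pi_2$.) Therefore $\varphi_{I_c}$ and $\varphi^{\beta H(I)}_{I_c}$ are quasi-equivalent.

Let $(\mathcal{H}_{\varphi_{I_c}}, \pi_{\varphi_{I_c}}, \Omega_{\varphi_{I_c}})$ and $(\mathcal{H}_{\varphi^{\beta H(I)}_{I_c}}, \pi_{\varphi^{\beta H(I)}_{I_c}}, \Omega_{\varphi^{\beta H(I)}_{I_c}})$ be GNS representations for $\varphi_{I_c}$ and $\varphi^{\beta H(I)}_{I_c}$, and let $\mathfrak{M}_{\varphi_{I_c}}$ and $\mathfrak{M}_{\varphi^{\beta H(I)}_{I_c}}$ be the von Neumann algebras generated by those representations of $\mathcal{A}_{I_c}$. By taking the restriction of the canonical isomorphism between the von Neumann algebras $\mathfrak{M}_{\varphi_{I_c}}$ and $\mathfrak{M}_{\varphi^{\beta H(I)}_{I_c}}$, which maps $\pi_{\varphi_{I_c}}(A)$ to $\pi_{\varphi^{\beta H(I)}_{I_c}}(A)$ for $A\in\mathcal{A}_{I_c}$, onto their centers $\mathfrak{Z}_{\varphi_{I_c}} := \mathfrak{M}_{\varphi_{I_c}}\cap\mathfrak{M}_{\varphi_{I_c}}'$ and $\mathfrak{Z}_{\varphi^{\beta H(I)}_{I_c}} := \mathfrak{M}_{\varphi^{\beta H(I)}_{I_c}}\cap\mathfrak{M}_{\varphi^{\beta H(I)}_{I_c}}'$, we have

$$\mathfrak{Z}_{\varphi_{I_c}} \sim_\Theta \mathfrak{Z}_{\varphi^{\beta H(I)}_{I_c}}. \tag{23}$$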

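In the above derivation, we have noted that the even and odd parts of a von Neumann algebra generated by a GNS representation are weak limits of the even and odd parts of the underlying C∗-system (mapped onto the GNS space), and hence the canonical isomorphism conjugating a pair of quasi-equivalent representations and its restriction to $\Theta$-invariant subalgebras are grading preserving.

We shall construct a GNS representation of $\varphi^{\beta H(I)}$ (on $\mathcal{A}$) from the above $(\mathcal{H}_{\varphi^{\beta H(I)}_{I_c}}, \pi_{\varphi^{\beta H(I)}_{I_c}}, \Omega_{\varphi^{\beta H(I)}_{I_c}})$ on $\mathcal{A}_{I_c}$ and a GNS representation of the tracial state $\mathrm{tr}_I$ on $\mathcal{A}_I$ denoted $(\mathcal{K}_I, \kappa_I, \Omega_I)$. Define

$$\begin{aligned}
\mathcal{K} &:= \mathcal{K}_I\otimes\mathcal{H}_{\varphi^{\beta H(I)}_{I_c}}, \qquad \Omega := \Omega_I\otimes\Omega_{\varphi^{\beta H(I)}_{I_c}}, \qquad V_I := \kappa_I(v_I)\otimes 1_{I_c},\\
\hat\kappa_I(A) &:= \kappa_I(A)\otimes 1_{I_c} \quad\text{for } A\in\mathcal{A}_I,\\
\hat\pi_{\varphi^{\beta H(I)}_{I_c}}(A) &:= 1_I\otimes\pi_{\varphi^{\beta H(I)}_{I_c}}(A) \quad\text{for } A\in\mathcal{A}_{I_c},
\end{aligned} \tag{24}$$

where $1_I$ and $1_{I_c}$ are the identity operators on $\mathcal{K}_I$ and $\mathcal{H}_{\varphi^{\beta H(I)}_{I_c}}$, and $v_I$ is given by (16). Noting $\mathrm{Ad}(v_I) = \Theta|_{\mathcal{A}_I}$, we have a unique representation $\kappa$ of the total system $\mathcal{A}$ on $\mathcal{K}$ satisfying

$$\kappa(A) = \hat\kappa_I(A) \quad\text{for } A\in\mathcal{A}_I, \tag{25}$$

and

$$\kappa(B_+) = \hat\pi_{\varphi^{\beta H(I)}_{I_c}}(B_+) \quad\text{for } B_+\in\mathcal{A}^e_{I_c}, \qquad \kappa(B_-) = V_I\,\hat\pi_{\varphi^{\beta H(I)}_{I_c}}(B_-) \quad\text{for } B_-\in\mathcal{A}^o_{I_c}. \tag{26}$$

By (14), i.e. the product property of $\varphi^{\beta H(I)}$ for $\mathcal{A}_I$ and $\mathcal{A}_{I_c}$, we verify that this $(\mathcal{K}, \kappa, \Omega)$ gives a GNS triplet of $\varphi^{\beta H(I)}$. We have also

$$\mathfrak{M}_\kappa := \kappa(\mathcal{A})'' = \bigl(\kappa_I(\mathcal{A}_I)\otimes\pi_{\varphi^{\beta H(I)}_{I_c}}(\mathcal{A}_{I_c})\bigr)'' = \bigl(\kappa_I(\mathcal{A}_I)\bigr)''\otimes\mathfrak{M}_{\varphi^{\beta H(I)}_{I_c}}. \tag{27}$$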

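Since $\varphi^{\beta H(I)}_{I_c}$ is even, i.e. $\Theta|_{\mathcal{A}_{I_c}}$-invariant, we have a unitary operator $U_{I_c}$ of $\mathcal{H}_{\varphi^{\beta H(I)}_{I_c}}$ which implements $\Theta|_{\mathcal{A}_{I_c}}$ in its GNS space $(\mathcal{H}_{\varphi^{\beta H(I)}_{I_c}}, \pi_{\varphi^{\beta H(I)}_{I_c}}, \Omega_{\varphi^{\beta H(I)}_{I_c}})$. As in (17), $\mathrm{Ad}(U_{I_c})$ determines the even and odd parts of $\mathfrak{M}_{\varphi^{\beta H(I)}_{I_c}}$. Accordingly by (18), the grading is induced on the center $\mathfrak{Z}_{\varphi^{\beta H(I)}_{I_c}}$ and it is decomposed into $\mathfrak{Z}^e_{\varphi^{\beta H(I)}_{I_c}}$ and $\mathfrak{Z}^o_{\varphi^{\beta H(I)}_{I_c}}$. For $\mathcal{A}_I$, $\kappa_I(v_I)$ gives a unitary operator implementing $\Theta|_{\mathcal{A}_I}$. By the construction of $(\mathcal{K}, \kappa, \Omega)$,

$$U := \kappa_I(v_I)\otimes U_{I_c} \in B(\mathcal{K}) \tag{28}$$

gives a unitary operator which implements $\Theta$ for $\varphi^{\beta H(I)}$. This $U$ gives a grading for $\mathfrak{M}_\kappa$, which is split into $\mathfrak{M}^e_\kappa$ and $\mathfrak{M}^o_\kappa$. Also by this grading the center $\mathfrak{Z}_\kappa := \mathfrak{M}_\kappa\cap\mathfrak{M}_\kappa'$ is decomposed into $\mathfrak{Z}^e_\kappa$ and $\mathfrak{Z}^o_\kappa$. Note that the center of the tensor product of a pair of von Neumann algebras is equal to the tensor product of their centers by the commutant theorem (Corollary 5.11 in Chapter IV of [26]). Since $\mathcal{A}_I$ is a full matrix algebra, and the center of any state on it is trivial, by (27) we have

$$\mathfrak{Z}_\kappa = 1_I\otimes\mathfrak{Z}_{\varphi^{\beta H(I)}_{I_c}}. \tag{29}$$

Moreover from (28) and (29) it follows that

$$\mathfrak{Z}^e_\kappa = 1_I\otimes\mathfrak{Z}^e_{\varphi^{\beta H(I)}_{I_c}}, \qquad \mathfrak{Z}^o_\kappa = 1_I\otimes\mathfrak{Z}^o_{\varphi^{\beta H(I)}_{I_c}}, \tag{30}$$

where we have noted that the grading of $\mathfrak{Z}_{\varphi^{\beta H(I)}_{I_c}}$ is determined by the unitary $U_{I_c}$. The equalities (29) and (30) give

$$\mathfrak{Z}_\kappa \sim_\Theta \mathfrak{Z}_{\varphi^{\beta H(I)}_{I_c}}. \tag{31}$$

Combining (31) with (23) we have

$$\mathfrak{Z}_{\varphi_{I_c}} \sim_\Theta \mathfrak{Z}_{\varphi^{\beta H(I)}_{I_c}} \sim_\Theta \mathfrak{Z}_\kappa. \tag{32}$$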

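Since $(\mathcal{K}, \kappa, \Omega)$ and $(\mathcal{H}_\varphi, \pi_\varphi, \Omega^{\beta H(I)}_\varphi)$ are both GNS representations of the same state $\varphi^{\beta H(I)}$ on $\mathcal{A}$, they are unitarily equivalent. The representation $(\mathcal{H}_\varphi, \pi_\varphi, \Omega^{\beta H(I)}_\varphi)$ obviously induces the same von Neumann algebra as $(\mathcal{H}_\varphi, \pi_\varphi, \Omega_\varphi)$, namely $\mathfrak{M}_\varphi$. Hence $(\mathcal{K}, \kappa, \Omega)$ and $(\mathcal{H}_\varphi, \pi_\varphi, \Omega_\varphi)$ are unitarily equivalent. Taking the restriction of the unitary map which conjugates those equivalent representations of $\mathcal{A}$ onto the center, we have

$$\mathfrak{Z}_\kappa \sim_\Theta \mathfrak{Z}_\varphi. \tag{33}$$

From (32) and (33), it follows that

$$\mathfrak{Z}_{\varphi_{I_c}} \sim_\Theta \mathfrak{Z}_\varphi, \tag{34}$$

which is what we would like to have.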

Remark 1. We note that the identification of two von Neumann algebras in (31) and in (34) does not imply that the underlying C∗-systems $\mathcal{A}$ and $\mathcal{A}_{I_c}$ are conjugated to each other in those representations.

Remark 2. We shall explain by an example that the formula (34) does not hold in general. Take the one-dimensional lattice $\mathbb{Z}$ and a site of it, say the origin 0. We prepare a non-factor quasi-free state $\rho$ [20] on $\mathcal{A}_{\{0\}_c}$, where $\{0\}_c$ denotes the complementary region of $\{0\}$. The factor decomposition of $\rho$ is given by $\rho = \tfrac12(\psi + \psi\Theta)$, where $\psi$ is a noneven factor state of $\mathcal{A}_{\{0\}_c}$. Take the (unique) product state extension of the tracial state $\mathrm{tr}_{\{0\}}$ of $\mathcal{A}_{\{0\}}$ and $\psi$ to the total system $\mathcal{A}$, which is denoted $\tilde\psi$. We see that the state $\tilde\psi\Theta$ on $\mathcal{A}$ is equal to the state extension of $\mathrm{tr}_{\{0\}}$ and $\psi\Theta$ to $\mathcal{A}$. Let $(\mathcal{H}_{\tilde\psi}, \pi_{\tilde\psi}, \Omega_{\tilde\psi})$ be a GNS triplet of $\tilde\psi$. Take an odd unitary $u$ of $\mathcal{A}_{\{0\}}$, say $a_0 + a_0^*$. Define $\xi := \tfrac{1}{\sqrt2}\bigl(\Omega_{\tilde\psi} + \pi_{\tilde\psi}(u)\Omega_{\tilde\psi}\bigr)$, which is a unit vector of $\mathcal{H}_{\tilde\psi}$. Let $\varphi_\xi$ denote the state determined by $\varphi_\xi(A) := (\pi_{\tilde\psi}(A)\xi, \xi)$ for $A\in\mathcal{A}$. It is clear that this $\varphi_\xi$ is a factor state of $\mathcal{A}$ by its construction. By direct computation, its restriction onto $\mathcal{A}_{\{0\}_c}$ is equal to $\rho$. Hence $\varphi_\xi$ is a factor state whose restriction to the subsystem $\mathcal{A}_{\{0\}_c}$ is a non-factor state.

5. Violation of the Local Thermal Stability for Noneven KMS States

For some technical reason we shall work with KMS states [17] (not directly with Gibbs states). Let $\alpha_t$ ($t\in\mathbb{R}$) be a one-parameter group of ∗-automorphisms of $\mathcal{A}$. A state $\varphi$ is called an $(\alpha_t,\beta)$-KMS state if it satisfies

$$\varphi\bigl(A\,\alpha_{i\beta}(B)\bigr) = \varphi(BA)$$

for every $A\in\mathcal{A}$ and $B\in\mathcal{A}_{\mathrm{ent}}$, where $\mathcal{A}_{\mathrm{ent}}$ denotes the set of all $B\in\mathcal{A}$ for which $\alpha_t(B)$ has an analytic extension to an $\mathcal{A}$-valued entire function $\alpha_z(B)$ as a function of $z\in\mathbb{C}$. Our dynamics $\alpha_t$ is assumed to be even, namely $\alpha_t\Theta = \Theta\alpha_t$ for each $t\in\mathbb{R}$. We also put the following assumptions in order to relate $\alpha_t$ with some $\delta\in D(\mathcal{A}_{\mathrm{loc}})$.

(I) The domain of the generator $\delta_\alpha$ of $\alpha_t$ includes $\mathcal{A}_{\mathrm{loc}}$.
(II) $\mathcal{A}_{\mathrm{loc}}$ is a core of $\delta_\alpha$.

The next proposition asserts the equivalence of the KMS and Gibbs conditions under (I, II). The proof was given for the lattice fermion systems in Theorem 7.5 (the implication from KMS to Gibbs under the assumption (I)) and Theorem 7.6 (the converse direction under the assumption (I, II)) of [11]. The proof for the graded lattice systems can be done in much the same way and we shall omit it. We emphasize that this equivalence does not require the evenness of states, which becomes essential in the proof of Proposition 5.


Proposition 4. Let $\alpha_t$ be an even dynamics satisfying Conditions (I, II). Let $\delta\ (\in D(\mathcal{A}_{\mathrm{loc}}))$ be the restriction of its generator $\delta_\alpha$ to $\mathcal{A}_{\mathrm{loc}}$. Then a state $\varphi$ of $\mathcal{A}$ satisfies the $(\alpha_t,\beta)$-KMS condition if and only if it satisfies the $(\delta,\beta)$-Gibbs condition.

One would ask whether fermion grading symmetry is perfectly preserved or not for non-zero temperature states. (It is plausible that we can derive a stronger statement about the unbroken symmetry of fermion grading for KMS states than Proposition 1.) We leave this question for future study. Here we show the following rather weak statement. Suppose that there is a nonzero odd element in the center of some even KMS state for even dynamics satisfying (I, II); then there always exist noneven KMS states that do not satisfy the local thermal stability (LTS). This LTS refers to LTS-P in the terminology of [10] (not LTS-M there). The content of the local thermal stability condition is summarized in the Appendix. We give some relevant material here.

Let $\varphi$ be an arbitrary even $(\alpha_t,\beta)$-KMS state. For $I\in\mathfrak{F}_{\mathrm{loc}}$, which is now fixed, $\varphi^{\beta H(I)}$ denotes the perturbed state of $\varphi$ by $\beta H(I)$. From the given $\delta\in D(\mathcal{A}_{\mathrm{loc}})$ and $I\in\mathfrak{F}_{\mathrm{loc}}$, a new ∗-derivation $\tilde\delta\in D(\mathcal{A}_{\mathrm{loc}})$ is given as follows. Let $\Phi\in\mathcal{P}$ denote the potential corresponding to $\delta$. Define $\tilde\Phi\in\mathcal{P}$ by

$$\tilde\Phi(J) := 0 \quad\text{if } J\cap I\neq\emptyset, \qquad \tilde\Phi(J) := \Phi(J) \quad\text{otherwise}. \tag{35}$$

We denote the ∗-derivation corresponding to $\tilde\Phi$ by $\tilde\delta\in D(\mathcal{A}_{\mathrm{loc}})$. By definition, $\tilde\delta$ acts trivially on $\mathcal{A}_I$. The one-parameter group of ∗-automorphisms of $\mathcal{A}$ generated by $\tilde\delta$ is equal to the perturbation of $\alpha_t$ by $H(I)$ given in terms of the Dyson-Schwinger expansion series and is denoted $\tilde\alpha_t$. By Proposition 4 and its proof found in [11], $\varphi^{\beta H(I)}$ satisfies the $(\tilde\alpha_t,\beta)$-KMS condition and the $(\tilde\delta,\beta)$-Gibbs condition.

We recall the GNS representation $(\mathcal{K}, \kappa, \Omega)$ of $\varphi^{\beta H(I)}$ previously defined in (24), (25), (26). Let $p$ be a nonzero projection in $\mathfrak{Z}_\kappa$, which has a unique even-odd decomposition $p = p_+ + p_-$, $p_+\in\mathfrak{Z}^e_\kappa$ and $p_-\in\mathfrak{Z}^o_\kappa$. By (29) we can write $p = 1_I\otimes q$ with some $q\in\mathfrak{Z}_{\varphi^{\beta H(I)}_{I_c}}$. Furthermore by (30), we have $p_+ = 1_I\otimes q_+$ with $q_+\in\mathfrak{Z}^e_{\varphi^{\beta H(I)}_{I_c}}$ and $p_- = 1_I\otimes q_-$ with $q_-\in\mathfrak{Z}^o_{\varphi^{\beta H(I)}_{I_c}}$. We define a positive linear functional on $\mathcal{A}$ by

$$\varphi^{\beta H(I)}_p(A) := \bigl(\kappa(A)\Omega,\ p\,\Omega\bigr) \quad\text{for } A\in\mathcal{A}. \tag{36}$$

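We take its restriction onto $\mathcal{A}_{I_c}$. For $A_+\in\mathcal{A}^e_{I_c}$, we have

$$\begin{aligned}
\varphi^{\beta H(I)}_p(A_+) &= \bigl(\kappa(A_+)\Omega,\ p\,\Omega\bigr)\\
&= \Bigl(\bigl(1_I\otimes\pi_{\varphi^{\beta H(I)}_{I_c}}(A_+)\bigr)\bigl(\Omega_I\otimes\Omega_{\varphi^{\beta H(I)}_{I_c}}\bigr),\ \Omega_I\otimes q\,\Omega_{\varphi^{\beta H(I)}_{I_c}}\Bigr)\\
&= \bigl(\pi_{\varphi^{\beta H(I)}_{I_c}}(A_+)\Omega_{\varphi^{\beta H(I)}_{I_c}},\ q\,\Omega_{\varphi^{\beta H(I)}_{I_c}}\bigr)\\
&= \bigl(\pi_{\varphi^{\beta H(I)}_{I_c}}(A_+)\Omega_{\varphi^{\beta H(I)}_{I_c}},\ q_+\Omega_{\varphi^{\beta H(I)}_{I_c}}\bigr),
\end{aligned} \tag{37}$$

where in the last equality we have used the evenness of $\varphi^{\beta H(I)}$. For $A_-\in\mathcal{A}^o_{I_c}$,

$$\begin{aligned}
\varphi^{\beta H(I)}_p(A_-) &= \bigl(\kappa(A_-)\Omega,\ p\,\Omega\bigr)\\
&= \Bigl(\bigl(\kappa_I(v_I)\otimes\pi_{\varphi^{\beta H(I)}_{I_c}}(A_-)\bigr)\bigl(\Omega_I\otimes\Omega_{\varphi^{\beta H(I)}_{I_c}}\bigr),\ \Omega_I\otimes q\,\Omega_{\varphi^{\beta H(I)}_{I_c}}\Bigr)\\
&= \mathrm{tr}_I(v_I)\,\bigl(\pi_{\varphi^{\beta H(I)}_{I_c}}(A_-)\Omega_{\varphi^{\beta H(I)}_{I_c}},\ q\,\Omega_{\varphi^{\beta H(I)}_{I_c}}\bigr) = 0,
\end{aligned} \tag{38}$$

where we have used $\mathrm{tr}_I(v_I) = 0$.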

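If $p$ is even, i.e. $p = p_+ = 1_I\otimes q_+$ with $q_+\in\mathfrak{Z}^e_{\varphi^{\beta H(I)}_{I_c}}$, then from (37), (38), and $\varphi^{\beta H(I)}(A_-) = 0$ for any $A_-\in\mathcal{A}^o$, it follows that

$$\varphi^{\beta H(I)}_p(A) = \bigl(\pi_{\varphi^{\beta H(I)}_{I_c}}(A)\Omega_{\varphi^{\beta H(I)}_{I_c}},\ q_+\Omega_{\varphi^{\beta H(I)}_{I_c}}\bigr) \tag{39}$$

for any $A\in\mathcal{A}_{I_c}$.

Suppose that $\mathfrak{Z}^o_\kappa$ is nontrivial, i.e. contains a nonzero element. Take any nonzero $f\in\mathfrak{Z}^o_\kappa$. Then $f+f^*$ and $if+(if)^*$ are self-adjoint elements in $\mathfrak{Z}^o_\kappa$. Since at least one of them is nonzero, we can take a self-adjoint element in $\mathfrak{Z}^o_\kappa$ whose operator norm is less than 1, and we shall denote such an element again by $f$. Let $p_f := \tfrac12(1+f)$, which is a positive operator. Define a noneven state

$$\psi := 2\,\varphi^{\beta H(I)}_{p_f} \tag{40}$$

by substituting this $p_f$ into $p$ of (36). We easily see that $\psi\Theta$ is equal to $2\,\varphi^{\beta H(I)}_{p_{-f}}$ for $p_{-f} := \tfrac12(1-f)$. Their averaged state $\tfrac12(\psi+\psi\Theta)$ is obviously equal to $\varphi^{\beta H(I)}$.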
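The averaging claim can be checked in one line from (36). The sketch below assumes only that the functional in (36) is additive in $p$ and uses the identification $\psi\Theta = 2\varphi^{\beta H(I)}_{p_{-f}}$ stated above.

```latex
\begin{align*}
\tfrac12(\psi + \psi\Theta)(A)
 &= \varphi^{\beta H(I)}_{p_f}(A) + \varphi^{\beta H(I)}_{p_{-f}}(A)
  = \bigl(\kappa(A)\Omega,\ (p_f + p_{-f})\,\Omega\bigr)\\
 &= \bigl(\kappa(A)\Omega,\ \Omega\bigr) = \varphi^{\beta H(I)}(A),
 \qquad\text{since } p_f + p_{-f} = \tfrac12(1+f) + \tfrac12(1-f) = 1 .
\end{align*}
```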

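Proposition 5. Let $\alpha_t$ be an even dynamics satisfying (I, II) and let $\varphi$ be an arbitrary even $(\alpha_t,\beta)$-KMS state. For $I\in\mathfrak{F}_{\mathrm{loc}}$, let $\tilde\alpha_t$ denote the perturbed dynamics of $\alpha_t$ by the local Hamiltonian $H(I)$. Let $\tilde\Phi$ denote the potential for $\tilde\alpha_t$ given as (35). If the odd part of the center of the perturbed state $\varphi^{\beta H(I)}$ is nontrivial, then the noneven $(\tilde\alpha_t,\beta)$-KMS states $\psi$ and $\psi\Theta$ given as (40) violate the $(\tilde\Phi,\beta)$-LTS condition.

Proof. Since $\varphi^{\beta H(I)}$ is an $(\tilde\alpha_t,\beta)$-KMS state, $\psi$ and $\psi\Theta$ are also $(\tilde\alpha_t,\beta)$-KMS states by Theorem 5.3.30 of [14]. Accordingly $\varphi^{\beta H(I)}$, $\psi$ and $\psi\Theta$ are all $(\tilde\delta,\beta)$-Gibbs states by Proposition 4. We consider the state restrictions of $\varphi^{\beta H(I)}$, $\psi$, and $\psi\Theta$ onto $\mathcal{A}_{I_c}$. Since the even parts of $p_f$ and $p_{-f}$ are both scalar, it follows from (37) that

$$\varphi^{\beta H(I)}|_{\mathcal{A}^e_{I_c}} = \psi|_{\mathcal{A}^e_{I_c}} = \psi\Theta|_{\mathcal{A}^e_{I_c}}.$$

Due to (38) all of them are even when restricted to $\mathcal{A}_{I_c}$. Hence we have

$$\varphi^{\beta H(I)}|_{\mathcal{A}_{I_c}} = \psi|_{\mathcal{A}_{I_c}} = \psi\Theta|_{\mathcal{A}_{I_c}}. \tag{41}$$

Denote the local Hamiltonians for the new potential $\tilde\Phi$ determined by the formula ($\Phi$-e) by $\{\tilde H(J)\}_{J\in\mathfrak{F}_{\mathrm{loc}}}$. From (35) it follows that $\tilde H(I) = 0$, and hence

$$\varphi^{\beta H(I)}(\tilde H(I)) = \psi(\tilde H(I)) = \psi\Theta(\tilde H(I)) = 0. \tag{42}$$

We compute the conditional entropy of $\varphi^{\beta H(I)}$, $\psi$ and $\psi\Theta$ for the finite region $I$. The definition of conditional entropy is given in (47). Noting (14) we have

$$S_I(\varphi^{\beta H(I)}) = -S\bigl(\mathrm{tr}_I\circ\varphi^{\beta H(I)}|_{\mathcal{A}_{I_c}},\ \varphi^{\beta H(I)}\bigr) = -S\bigl(\mathrm{tr}_I\circ\varphi^{\beta H(I)}|_{\mathcal{A}_{I_c}},\ \mathrm{tr}_I\circ\varphi^{\beta H(I)}|_{\mathcal{A}_{I_c}}\bigr) = 0, \tag{43}$$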


which is the maximum value of $S_I(\cdot)$. For $\psi$, using (41) and then (14) we have

$$S_I(\psi) = -S\bigl(\mathrm{tr}_I\circ\psi|_{\mathcal{A}_{I_c}},\ \psi\bigr) = -S\bigl(\mathrm{tr}_I\circ\varphi^{\beta H(I)}|_{\mathcal{A}_{I_c}},\ \psi\bigr) = -S\bigl(\varphi^{\beta H(I)},\ \psi\bigr). \tag{44}$$

Since $\varphi^{\beta H(I)}\neq\psi$ (the former is even and the latter is noneven), it follows from this equality and the strict positivity of relative entropy (see [4]) that

$$S_I(\psi) < 0.$$

By the automorphism invariance (acting on two states in the argument) of relative entropy, we have

$$S_I(\psi\Theta) = S_I(\psi) < 0. \tag{45}$$

Substituting (42), (43), and (45) into (48), we obtain

$$F^{\tilde\Phi}_{I,\beta}(\psi) = F^{\tilde\Phi}_{I,\beta}(\psi\Theta) < F^{\tilde\Phi}_{I,\beta}(\varphi^{\beta H(I)}) = 0. \tag{46}$$
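For convenience, the substitution into (48) can be written out term by term. This is only a restatement of (42), (43) and (45), with the conditional free energy taken for the potential $\tilde\Phi$ and its local Hamiltonian $\tilde H(I)$.

```latex
\begin{align*}
F^{\tilde\Phi}_{I,\beta}(\psi)
 &= S_I(\psi) - \beta\,\psi(\tilde H(I)) = S_I(\psi) < 0,\\
F^{\tilde\Phi}_{I,\beta}(\psi\Theta)
 &= S_I(\psi\Theta) - \beta\,\psi\Theta(\tilde H(I)) = S_I(\psi) < 0,\\
F^{\tilde\Phi}_{I,\beta}(\varphi^{\beta H(I)})
 &= S_I(\varphi^{\beta H(I)}) - \beta\,\varphi^{\beta H(I)}(\tilde H(I)) = 0 .
\end{align*}
```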

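This strict inequality with (41) shows that $\psi$ and $\psi\Theta$ do not satisfy the $(\tilde\Phi,\beta)$-LTS condition (49), although both of them satisfy the $(\tilde\delta,\beta)$-Gibbs condition.

Appendix

Local thermal stability (LTS) condition. Let $(\mathcal{A}, \{\mathcal{A}_I\}_{I\in\mathfrak{F}_{\mathrm{loc}}})$ be a lattice system considered in § 4. In [10] the local thermal stability (LTS) is studied for the lattice fermion systems. It is easy to see that the same formulation is available for the graded lattice systems under consideration. Let $\omega$ be a state of $(\mathcal{A}, \{\mathcal{A}_I\}_{I\in\mathfrak{F}_{\mathrm{loc}}})$. For $I\in\mathfrak{F}_{\mathrm{loc}}$, the conditional entropy of $\omega$ is defined in terms of the relative entropy (19) by

$$S_I(\omega) := -S\bigl(\mathrm{tr}_I\circ\omega|_{\mathcal{A}_{I_c}},\ \omega\bigr) = -S(\omega\cdot E_{I_c},\ \omega) \le 0, \tag{47}$$

where $E_{I_c}$ is the conditional expectation onto $\mathcal{A}_{I_c}$ with respect to the tracial state and $\omega\cdot E_{I_c}(A) := \omega(E_{I_c}(A))$ for $A\in\mathcal{A}$. Let $\Phi\in\mathcal{P}$. The conditional free energy of $\omega$ for $I\in\mathfrak{F}_{\mathrm{loc}}$ is given by

$$F^{\Phi}_{I,\beta}(\omega) := S_I(\omega) - \beta\,\omega(H(I)), \tag{48}$$

where $H(I)$ is a local Hamiltonian for $I$ with respect to $\Phi$.

Definition 4. Let $\Phi$ be a potential in $\mathcal{P}$. A state $\varphi$ of $\mathcal{A}$ is said to satisfy the local thermal stability condition for $\Phi$ at inverse temperature $\beta$, or the $(\Phi,\beta)$-LTS condition, if for each $I\in\mathfrak{F}_{\mathrm{loc}}$,

$$F^{\Phi}_{I,\beta}(\varphi) \ge F^{\Phi}_{I,\beta}(\omega) \quad\text{for any state } \omega \text{ satisfying } \omega|_{\mathcal{A}_{I_c}} = \varphi|_{\mathcal{A}_{I_c}}. \tag{49}$$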
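The sign in (47), and the case of equality used in (43), both follow from the positivity of relative entropy recalled after (19). The short sketch below assumes that this positivity and the characterization of the zero case carry over to the present setting, as in [4].

```latex
\begin{align*}
S_I(\omega) &= -S(\omega\cdot E_{I_c},\ \omega) \le 0
 &&\text{because } S(\omega_1,\omega_2) \ge 0,\\
S_I(\omega) = 0 &\iff \omega = \omega\cdot E_{I_c} = \mathrm{tr}_I\circ\omega|_{\mathcal{A}_{I_c}}
 &&\text{because } S(\omega_1,\omega_2) = 0 \iff \omega_1 = \omega_2 .
\end{align*}
```

So the maximum value $0$ of the conditional entropy is attained exactly by the product state extensions of $\mathrm{tr}_I$ with a state on $\mathcal{A}_{I_c}$.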


There is another definition of local thermal stability in [10] that has the same variational principle formula as above but takes the commutant algebra $\mathcal{A}_I'$ as the complementary outside system of a local region $I$ instead of $\mathcal{A}_{I_c}$. We shall call this alternative local thermal stability condition the LTS′ condition, where the superscript ‘′’ stands for the commutant. (By ‘′’ we also mean that this formalism is not so natural compared to Definition 4 if we respect the given quasi-local structure. Nevertheless, LTS′ has some mathematically good points.) The equivalence of the KMS and LTS′ conditions holds for the lattice fermion systems without assuming the evenness of states. For our LTS, on the contrary, such an evenness assumption is required in deriving its equivalence to the KMS condition. (The formalism of LTS′ using commutants for complementary outside systems makes it possible to exploit the known arguments for quantum spin lattice systems [13].)

Acknowledgements. I thank KEK and IHES, where this work was done. I thank Professor Araki for providing valuable suggestions that clarified the discussion. I thank Professor Tsutsui for kind hospitality at KEK. I acknowledge the JSPS Postdoctoral Fellowships for Research Abroad (Aug 2003–May 2005) and IHES. I thank Professor Ruelle for kind hospitality at IHES. I have been supported by a COE post-doctoral fellowship of the Mathematics Department of Hokkaido University since Jun 2005, which is greatly appreciated.

References

1. Araki, H.: Relative hamiltonian for faithful normal states of a von Neumann algebra. Publ. RIMS, Kyoto Univ. 7, 165–209 (1973)
2. Araki, H.: Positive cone, Radon-Nikodym theorems, relative hamiltonian and the Gibbs condition in statistical mechanics. An application of the Tomita-Takesaki theory. In: C∗-algebras and their applications to statistical mechanics and quantum field theory. D. Kastler, ed. Bologna: Editrice Composition, pp 64–100, 1975
3. Araki, H.: On uniqueness of KMS states of one-dimensional quantum lattice systems. Commun. Math. Phys. 44, 1–7 (1975)
4. Araki, H.: Relative entropy of states of von Neumann algebras. Publ. RIMS, Kyoto Univ. 11, 809–833 (1976)
5. Araki, H.: Relative entropy for states of von Neumann algebras II. Publ. RIMS, Kyoto Univ. 13, 173–192 (1977)
6. Araki, H.: On superselection rules. Proc. 2nd Int. Symp. Foundations of Quantum Mechanics, Tokyo, (1986) pp. 348–354 Physical Society of Japan, Tokyo (1987)
7. Araki, H.: Conditional expectations relative to a product state and the corresponding standard potentials. Commun. Math. Phys. 246, 113–132 (2004)
8. Araki, H.: Ryoushiba no Suuri (Japanese). Iwanami, 1996. Mathematical Theory of Quantum Fields. Translation by Watamura, U. C., Oxford University Press, Oxford (1999)
9. Araki, H., Ion, P.D.F.: On the equivalence of KMS and Gibbs conditions for states of quantum lattice systems. Commun. Math. Phys. 35, 1–12 (1974); Araki, H.: On the equivalence of the KMS condition and the variational principle for quantum lattice systems. Commun. Math. Phys. 38, 1–10 (1974)
10. Araki, H., Moriya, H.: Local thermodynamical stability of fermion lattice systems. Lett. Math. Phys. 60, 109–121 (2002)
11. Araki, H., Moriya, H.: Equilibrium statistical mechanics of fermion lattice systems. Rev. Math. Phys. 15, 93–198 (2003)
12. Araki, H., Moriya, H.: Joint extension of states of subsystems for a CAR system. Commun. Math. Phys. 237, 105–122 (2003)
13. Araki, H., Sewell, G.L.: KMS conditions and local thermodynamical stability of quantum lattice systems. Commun. Math. Phys. 52, 103–109 (1977); Sewell, G.L.: KMS conditions and local thermodynamical stability of quantum lattice systems II. Commun. Math. Phys. 55, 53–61 (1977)
14. Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics I and II. Springer-Verlag, Berlin Heidelberg New York (1979 and 1981)
15. Driessler, D., Summers, S. J.: Central decomposition of Poincaré-invariant nets. Ann. Inst. Henri Poincaré, Phys. Théor. 43, 147–166 (1985)
16. Haag, R.: Local Quantum Physics. Springer-Verlag, Berlin Heidelberg New York (1996)
17. Haag, R., Hugenholtz, N.M., Winnink, M.: On the equilibrium states in quantum statistical mechanics. Commun. Math. Phys. 5, 215–236 (1967)
18. Lanford III, O.E., Robinson, D.W.: Mean entropy of states in quantum statistical mechanics. J. Math. Phys. 9, 1120–1125 (1968)
19. Lanford III, O.E., Ruelle, D.: Observable at infinity and states with short range correlations in statistical mechanics. Commun. Math. Phys. 13, 194–215 (1969)
20. Manuceau, J., Verbeure, A.: Non-factor quasi-free states of the CAR-algebra. Commun. Math. Phys. 18, 319–326 (1970)
21. Moriya, H.: Some aspects of quantum entanglement for CAR systems. Lett. Math. Phys. 60, 109–121 (2002); On separable states for composite systems of distinguishable fermions. J. Phys. A: Math. Gen., in press; Validity and failure of some entropy inequalities for CAR systems. J. Math. Phys. 46, 033508 (2005); On a state having pure-state restrictions for a pair of regions. Interdisc. Inf. Sci. 10, 31–40 (2004)
22. Narnhofer, H., Thirring, W.: Spontaneously broken symmetries. Ann. Inst. Henri Poincaré, Phys. Théor. 70, 1–21 (1999)
23. Powers, R.T.: Representations of the canonical anticommutation relations. Thesis, Princeton University, 1967
24. Robinson, D.W.: A characterization of clustering states. Commun. Math. Phys. 41, 79–88 (1975)
25. Ruelle, D.: Statistical Mechanics, Rigorous Results. Benjamin, New York (1969)
26. Takesaki, M.: Theory of Operator Algebras I. Springer-Verlag, Berlin Heidelberg New York (1979)
27. Weinberg, S.: The Quantum Theory of Fields I. Cambridge University Press, Cambridge (2002)
28. Wick, G.C., Wightman, A.S., Wigner, E.P.: The intrinsic parity of elementary particles. Phys. Rev. 88, 101–105 (1952)

Communicated by H. Spohn




Fermionic Characters and Arbitrary Highest-Weight Integrable $\widehat{\mathfrak{sl}}_{r+1}$-Modules

Eddy Ardonne1, Rinat Kedem2, Michael Stone1

1 Department of Physics, University of Illinois, 1110 W. Green St., Urbana, IL 61801, USA. E-mail: [email protected]; [email protected]
2 Department of Mathematics, University of Illinois, 1409 W. Green Street, Urbana, IL 61801, USA. E-mail: [email protected]

Received: 18 April 2005 / Accepted: 6 July 2005
Published online: 10 February 2006 – © Springer-Verlag 2006

Abstract: This paper contains the generalization of the Feigin-Stoyanovsky construction to all integrable $\widehat{\mathfrak{sl}}_{r+1}$-modules. We give formulas for the q-characters of any highest-weight integrable module of $\widehat{\mathfrak{sl}}_{r+1}$ as a linear combination of the fermionic q-characters of the fusion products of a special set of integrable modules. The coefficients in the sum are the entries of the inverse matrix of generalized Kostka polynomials in $q^{-1}$. We prove the conjecture of Feigin and Loktev regarding the q-multiplicities of irreducible modules in the graded tensor product of rectangular highest-weight modules in the case of $\mathfrak{sl}_{r+1}$. We also give the fermionic formulas for the q-characters of the (non-level-restricted) fusion products of rectangular highest-weight integrable $\widehat{\mathfrak{sl}}_{r+1}$-modules.

1. Introduction

Fermionic formulæ for characters of highest-weight modules of affine algebras or vertex algebras first appeared in a purely algebraic context [17]. They were later shown [13, 12] to be related to the partition functions of certain statistical mechanical systems at their critical points. These character formulæ have desirable combinatorial properties, such as the manifest positivity of the coefficients that represent weight-space multiplicities. They also have a physical significance because they reflect the quasi-particle content of the statistical mechanical system. Consequently, algebraic constructions of bases for representations which reveal this combinatorial structure are important, and have been studied using several methods in the past dozen years.

One such method is that of Feigin and Stoyanovskiĭ [23]. These authors used a theorem of Primc [21] to give an interesting construction of the vacuum integrable modules of the affine algebra $\widehat{\mathfrak{g}}$ associated to any simple Lie algebra $\mathfrak{g}$. Their construction relies on the loop generators of the affine algebra. Physical systems associated with such integrable $\widehat{\mathfrak{g}}$-modules are generalizations of the Heisenberg spin chain in statistical mechanics, or the WZW model in conformal field theory.


E. Ardonne, R. Kedem, M. Stone

The formulæ of Feigin-Stoyanovskiĭ [23] have an attractive interpretation in terms of (a bosonic version of) non-abelian quantum Hall states [19, 2]. In these states there are r “types” of particles that obey a generalized exclusion principle: the wave function vanishes if any k + 1 particles occupy the same state. Here r is the rank of the algebra and k is the level of the integrable $\hat{\mathfrak g}$-module. In the presence of quasi-particle excitations, the wave functions can also vanish if fewer than k + 1 particles occupy the same state. The statistics of the quasi-particles is ‘dual’ to the statistics of the fundamental particles [1]. The original construction of Feigin-Stoyanovskiĭ can be used to compute [23] characters of vacuum (with highest weight $k\Lambda_0$) representations of affine algebras. Later, Georgiev [10, 9] generalized it to some modules in the ADE series, with particularly simple highest weights, of the form $l\omega_j + k\Lambda_0$, corresponding to special rectangular Young diagrams. (Here $\omega_j$ are certain fundamental $\mathfrak g$-weights, and $l \in \mathbb Z_{\ge 0}$.) In general, no fermionic formulæ are available for arbitrary highest-weight, integrable $\hat{\mathfrak g}$-modules. In this paper, we resolve this problem for the case of $\widehat{\mathfrak{sl}}_{r+1}$. We explain, in terms of the functional realization of Feigin and Stoyanovskiĭ, why such ‘rectangular highest weight’ modules are very special, and why there is no direct fermionic construction for other modules. However, we prove that it is possible to compute the character of any module as a finite sum of fermionic characters of the ‘rectangular’ highest-weight modules. The coefficients in this sum are the entries of the inverse matrix of generalized Kostka polynomials. These coefficients are, however, not manifestly positive (or even of positive degree). In our construction we are naturally led to the graded tensor product of Feigin and Loktev [8] of finite-dimensional $\mathfrak g$-modules. In the case of irreducible $\mathfrak{sl}_{r+1}$-modules with highest weights of the form $l\omega_j$ (where $\omega_j$ is any fundamental weight), we compute the explicit fermionic form of the graded multiplicities of irreducible modules in the Feigin-Loktev tensor product, thus proving two of the conjectures of [8]: that the graded tensor product in this case is independent of the evaluation parameters, and that it is related to the generalized Kostka polynomials of [22, 16].

The plan of the paper is as follows. In Sect. 2 we give the basic definitions of the algebra and its modules. In Sects. 3 and 4, we supply the details of the generalized construction of [23] for integrable modules of $\widehat{\mathfrak{sl}}_{r+1}$, with highest weights corresponding to rectangular Young diagrams. In Sect. 5, we explain a similar calculation of graded characters of conformal blocks or coinvariants (the fusion product of [8]), which turn out to be related to the generalized Kostka polynomials of [22, 16]. We then use this calculation in Sect. 6 to compute the characters of arbitrary highest-weight representations. See Theorem 6.3 for the main result.

For the sake of clarity, we concentrate in this paper on the case of $\mathfrak g = \mathfrak{sl}_{r+1}$. The generalization to affine algebras associated with other simple Lie algebras is possible, but in that case one should replace the notion of integrable $\hat{\mathfrak g}$-modules with irreducible $\mathfrak g$-modules as their top component with those which have (the degeneration to the classical case of) Kirillov-Reshetikhin modules as their top component. We will give this construction in a future publication.

2. Notation

2.1. Current generators of affine algebras. Let $\mathfrak g = \mathfrak{sl}_{r+1}$ and let $\Pi = \{\alpha_i \mid i = 1, \dots, r\}$ denote its simple roots, and $\{\omega_i \mid i = 1, \dots, r\}$ the fundamental weights. Let $\{e_{\alpha_i} = e_i \mid i = 1, \dots, r\}$ denote the corresponding generators of $\mathfrak n_+$, and

Fermionic Characters of slr+1


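$\{f_{\alpha_i} = f_i \mid i = 1, \dots, r\}$ those of $\mathfrak n_-$. We have the Cartan decomposition $\mathfrak{sl}_{r+1} \simeq \mathfrak n_+ \oplus \mathfrak h \oplus \mathfrak n_-$, where $\mathfrak h$ is the Cartan subalgebra. Irreducible, finite-dimensional highest-weight $\mathfrak g$-modules $\pi_\lambda$ are parametrized by weights $\lambda \in P^+$, that is, $\lambda = l_1\omega_1 + \cdots + l_r\omega_r$ with $l_i \in \mathbb Z_{\ge 0}$. The subset of $P^+$ consisting of weights $\lambda$ such that $\sum_{i=1}^{r} l_i \le k$ is called the set of level-k restricted weights, $P_k^+$. The affine Lie algebra associated with $\mathfrak g$ is $\hat{\mathfrak g}$, where

$$\hat{\mathfrak g} \simeq \mathfrak g \otimes \mathbb C[t, t^{-1}] \oplus \mathbb C c \oplus \mathbb C d, \tag{2.1}$$

where c is central and $[d, x \otimes t^n] = -n\, x \otimes t^n$.

We denote the current generators by $x[n] \overset{\text{def}}{=} x \otimes t^n$, $x \in \mathfrak{sl}_{r+1}$. Let $\langle x, y \rangle$ be the symmetric bilinear form on $\mathfrak{sl}_{r+1}$. Then the relations between the currents are

$$[x \otimes f(t),\, y \otimes g(t)]_{\hat{\mathfrak g}} = [x, y]_{\mathfrak g} \otimes f(t) g(t) + c\, \langle x, y \rangle \oint_{t=0} f'(t) g(t)\, dt,$$

where $[\cdot, \cdot]_{\mathfrak g}$ is the corresponding commutator in $\mathfrak g$. The Cartan decomposition is $\hat{\mathfrak g} \simeq \hat{\mathfrak n}_+ \oplus \hat{\mathfrak h} \oplus \hat{\mathfrak n}_-$ with $\hat{\mathfrak n}_\pm = \mathfrak n_\pm \oplus (\mathfrak{sl}_{r+1} \otimes t^{\pm 1}\mathbb C[t^{\pm 1}])$ and $\hat{\mathfrak h} = \mathfrak h \oplus \mathbb C c \oplus \mathbb C d$. The algebra $\hat{\mathfrak g}'$ is the algebra obtained by dropping the generator d. We will frequently use generating functions for current generators of the affine algebra, which we define by

$$x(z) = \sum_{n \in \mathbb Z} x[n]\, z^{-n-1}, \qquad x \in \mathfrak{sl}_{r+1}. \tag{2.2}$$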

Note that the convention for the current generators in (2.2) is different from that used by [23, 4].

2.2. Affine algebra modules. On any irreducible $\widehat{\mathfrak{sl}}_{r+1}$-module, c acts by a constant k called the level of the representation. A cyclic highest-weight $\hat{\mathfrak g}$-module with highest weight $\Lambda = \lambda + k\Lambda_0 + m\delta$ is a cyclic module generated by the action of $\hat{\mathfrak g}$ on a highest-weight vector $v_\lambda$, such that

$$\hat{\mathfrak n}_+ v_\lambda = 0, \qquad h v_\lambda = \lambda(h) v_\lambda \ \text{ for } h \in \mathfrak h \subset \mathfrak g, \qquad c v_\lambda = k v_\lambda, \qquad d v_\lambda = m v_\lambda. \tag{2.3, 2.4}$$

The universal such module is the Verma module $M(\Lambda) \simeq U(\hat{\mathfrak n}_-)$. If $k \in \mathbb N$ and $\lambda \in P_k^+$, the quotient of the Verma module by its maximal submodule is an irreducible, highest-weight integrable $\hat{\mathfrak g}$-module, which we denote by $V_\lambda$ (we assume k is fixed in this notation). The structure of the cyclic module generated by a highest-weight vector $v_\lambda$ is independent of m, so it is generally convenient to set m = 0.

Definition 2.1. Let M be an irreducible cyclic highest-weight module with highest weight $\Lambda = \lambda + k\Lambda_0$, generated by the highest-weight vector $v_\lambda$. The subspace generated by the action of the subalgebra $\mathfrak g \otimes 1 \simeq \mathfrak g$ on $v_\lambda$ is called the top component of M. It is isomorphic as a $\mathfrak g$-module to $\pi_\lambda$.
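As a small illustration of the set $P_k^+$ of level-k restricted weights defined in Sect. 2.1, the following Python sketch enumerates the coefficient tuples $(l_1, \dots, l_r)$ with $l_1 + \cdots + l_r \le k$. The function name `level_restricted_weights` and the representation of a weight by its tuple of coefficients are choices made here for illustration only; they are not notation from the paper.

```python
from itertools import product
from math import comb


def level_restricted_weights(r, k):
    """Return all tuples (l_1, ..., l_r) of non-negative integers with sum <= k.

    Each tuple stands for the weight l_1*omega_1 + ... + l_r*omega_r.
    """
    return [l for l in product(range(k + 1), repeat=r) if sum(l) <= k]


if __name__ == "__main__":
    weights = level_restricted_weights(2, 2)
    print(weights)
    # Stars-and-bars count of non-negative solutions of l_1 + ... + l_r <= k.
    print(len(weights) == comb(2 + 2, 2))
```

For r = 2 and k = 2 this produces six weights, which agrees with the stars-and-bars count $\binom{k+r}{r}$.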



The irreducible, finite-dimensional $\mathfrak g$-module $\pi_\lambda$ is characterized as the quotient of the Verma module of $\mathfrak g$ by the left ideal in $U(\mathfrak g)$ generated by $f_i^{l_i+1}$. Similarly, the integrable module $V_\lambda$ is the quotient of the Verma module of $\hat{\mathfrak g}$, $M(\Lambda)$, by the left ideal in $U(\hat{\mathfrak g})$ generated by $f_i[0]^{l_i+1}$, plus one additional generator, $e_\theta[-1]^{k-\theta(\lambda)+1}$, where $\theta = \alpha_1 + \cdots + \alpha_r$. A characterization of the maximal proper submodule $M'(\Lambda)$ of $M(\Lambda)$ in the case of integrable modules was given in [21] in terms of the algebra of current generators. Note that on any highest-weight module, the current (2.2) acts as a Laurent series in z. Therefore, products of currents make sense when acting on a highest-weight module, and one can consider the associative algebra of currents. Formally, the coefficients of $z^n$ in products of currents of the form $x(z)y(z)$ exist only in a completion $\bar U$ of $U(\hat{\mathfrak g})$.

Theorem 2.2 [21]. Let $M(\Lambda)$ be a Verma module with highest weight $\Lambda = \lambda + k\Lambda_0$, with $\lambda \in P_k^+$ and $k \in \mathbb N$. Denote its maximal proper submodule by $M'(\Lambda)$, such that $V_\lambda \simeq M(\Lambda)/M'(\Lambda)$. Let R be the subspace in $\bar U$ generated by the adjoint action of $U(\mathfrak{sl}_{r+1})$ on the coefficients of $e_\theta(z)^{k+1}$. Then $M'(\Lambda) = R\, M(\Lambda)$.

Again, the elements in R act as well-defined elements of $U(\hat{\mathfrak g})$ on $M(\Lambda)$. We call the set of currents which result from the adjoint action of $\mathfrak{sl}_{r+1}$ on the current $e_\theta(z)^{k+1}$ the integrability conditions. For example, for any root $\alpha$, the coefficients of $e_\alpha(z)^{k+1}$ are in R.

3. The Semi-Infinite Construction of Feigin and Stoyanovskiĭ

Theorem 2.2 was used by Feigin and Stoyanovskiĭ [23] to give a construction of the integrable modules in the case where $\Lambda = k\Lambda_0$. The construction naturally gives rise to fermionic formulæ for the characters of integrable modules. We will explain the details of the construction of [23] below.

3.1. Principal subspaces. For arbitrary integrable highest weight $\Lambda = \lambda + k\Lambda_0$, let $v_\lambda$ be the highest-weight vector of $V_\lambda$. Consider the subalgebra $\tilde{\mathfrak n}_- \overset{\text{def}}{=} \mathfrak n_- \otimes \mathbb C[t, t^{-1}]$ acting on $v_\lambda$.

Definition 3.1. Define the principal subspace $W_\lambda = W_\lambda^{(0)} = U(\tilde{\mathfrak n}_-) v_\lambda \subset V_\lambda$. Similarly, define the principal subspaces $W_\lambda^{(N)} = U(\tilde{\mathfrak n}_-) T_N v_\lambda$, where $T_N = t_{\alpha(N)}$ is the affine Weyl translation corresponding to the root $\alpha(N) = \sum_i N_i \alpha_i$ (in the notation of [11] (6.5.2)), where $N_i$ are positive integers such that $(C_r \mathbf N)_\alpha = 2N$ for all $\alpha$, and $C_r$ is the Cartan matrix of $\mathfrak{sl}_{r+1}$.

Lemma 3.2. This choice of $\alpha(N)$ gives a sequence of inclusions

$$W_\lambda^{(0)} \subset W_\lambda^{(1)} \subset \cdots \subset W_\lambda^{(N)} \subset \cdots, \tag{3.1}$$

such that the inductive limit of the sequence (3.1) as $N \to \infty$ is the integrable module $V_\lambda$.


The inclusions follow from the fact that $v_\lambda \in W_\lambda^{(N)}$. The fact that the inductive limit indeed gives the full module is not obvious (see [20, 5]) but follows from the fact that the module is integrable. In fact, this result was proven in [23] for the following cases: $\widehat{\mathfrak{sl}}_2$ for arbitrary highest weight, and $\widehat{\mathfrak{sl}}_3$ with $\Lambda = k\Lambda_0$. This was done by computing the characters in the limit $N \to \infty$, and comparing them with the known character formulæ for $V_\lambda$ of [17]. In [10, 9], certain combinatorial proofs were provided using ideas related to those of [23] (with differently defined principal subspaces) for rectangular highest weights, for all simply laced algebras. The principal subspaces of that paper are different from those used here, as [10] uses what amounts to a different subalgebra to generate the subspace. In this paper, we will continue this program by giving the character formulæ for arbitrary highest-weight modules of $\widehat{\mathfrak{sl}}_{r+1}$. It turns out that the methods of [23] are not sufficient for the case of non-rectangular representations, and instead we must resort to computing the characters of certain fusion products of representations, and decomposing them in terms of irreducible modules. The result is a formula which is a sum of fermionic formulas of the form found in [23, 10, 9], where the coefficients in the sum are elements of $\mathbb Z[q^{-1}]$.

3.2. Relations in the principal subspace. Let us characterize the ideal $I_\lambda$, where $W_\lambda \simeq U(\tilde{\mathfrak n}_-)/I_\lambda$. Using a PBW-type argument, it is easy to see that $W_\lambda = U(\mathfrak n_- \otimes \mathbb C[t^{-1}]) v_\lambda$, because the highest-weight vector $v_\lambda$ is annihilated by $\mathfrak n_- \otimes t\mathbb C[t]$. Thus, $I_\lambda$ includes the left ideal generated by $\{f_\alpha[n] \mid n > 0,\ \alpha \in \Delta_+\}$. The ideal also contains the two-sided ideal generated by relations in the Lie algebra. In terms of generating functions, these relations are

$$[f_{\alpha_i}(z), f_{\alpha_j}(w)] = \begin{cases} 0, & |i - j| \neq 1, \\ w^{-1}\delta(w/z)\, f_{\alpha_i + \alpha_j}(z), & |i - j| = 1, \end{cases} \tag{3.2}$$

$$\bigl[f_{\alpha_i}(z), [f_{\alpha_i}(w), f_{\alpha_{i\pm 1}}(u)]\bigr] = 0, \tag{3.3}$$

where $\delta(z) = \sum_{n \in \mathbb Z} z^n$.
These two relations together mean that matrix elements involving the product $f_i(z) f_{i\pm 1}(w)$ have a simple pole whenever z = w, and that the residue of this pole commutes with $f_i(u)$. The integrability condition

$$f_i(z)^{k+1} v = 0, \qquad v \in V_\lambda, \quad 1 \le i \le r, \tag{3.4}$$

implies that $I_\lambda$ contains the two-sided ideal generated by the coefficients of $z^n$ of $f_i(z)^{k+1}$ (in the appropriate completion of the universal enveloping algebra). Finally there are the relations which follow from the integrability of the top component $\pi_\lambda$ of $V_\lambda$, which is also a subspace of $W_\lambda$. Therefore, $I_\lambda$ contains the left ideal generated by $f_i[0]^{l_i+1}$. The integrability condition involving $e_\theta[-1]$ does not play a role, because it is not an element of $U(\tilde{\mathfrak n}_-)$.

3.3. Construction of the dual space. In order to compute the characters of the principal subspace $W_\lambda$, we describe its dual space. This will enable us to calculate the character for sufficiently simple $\lambda$. The dual space is spanned by the coefficients of monomials of the form $x_1^{n_1} \cdots x_m^{n_m}$ of matrix elements in the set

$$\mathcal F_\lambda = \bigl\{ \langle w | f_{i_1}(x_1) \cdots f_{i_m}(x_m) | v_\lambda \rangle \ \big|\ w \in V_\lambda^*,\ m \ge 0,\ 1 \le i_a \le r \bigr\},$$



where $V_\lambda^*$ is the restricted dual module. Given an ordering of the generators, the function above is defined in the region $|x_i| > |x_{i+1}|$, and therefore the coefficient of $x_1^{n_1} \cdots x_m^{n_m}$ for given integers $n_j$ is given by the expansion in this regime. Below, we shall refer to the function space $\mathcal F_\lambda$ itself as the dual space, and specify an appropriate pairing. This space can be characterized by its pole structure and vanishing conditions.

3.3.1. The dual space to $U(\tilde{\mathfrak n}_-)$. Let us first consider the larger function space $\mathcal G$, dual to the universal enveloping algebra $U = U(\tilde{\mathfrak n}_-)$. The algebra U is spanned by words in the letters $\{f_{\alpha_i}[n] \mid i = 1, \dots, r,\ n \in \mathbb Z\}$, and it is $\mathfrak h$- and d-graded. The graded component $U[\mathbf m]_d$, where $\mathbf m = (m^{(1)}, \dots, m^{(r)})^T$, is spanned by the elements $f_{i_1}[n_1] \cdots f_{i_m}[n_m]$ of $\mathfrak h$-weight $\sum_\alpha m^{(\alpha)} \alpha_\alpha = \sum_j \alpha_{i_j}$ and $-\sum_i n_i = d$. The dual space to U is also $\mathfrak h$- and d-graded. Denote by $U[\mathbf m]$ the $\mathfrak h$-graded component, and by $\mathcal G[\mathbf m]$ the dual to it. This is a space of functions in the variables

$$x = \{x_i^{(\alpha)} \mid i = 1, \dots, m^{(\alpha)},\ \alpha = 1, \dots, r\},$$

where $x_i^{(\alpha)}$ is the variable corresponding to a generator of the form $f_\alpha(x_i^{(\alpha)})$. We define the pairing $(\cdot, \cdot)$ between U and $\mathcal G$ inductively, as follows: $(1, 1) = 1$, and

$$(g(x),\, M f_\alpha[n]) = \left( \frac{1}{2\pi i} \oint_{x_1^{(\alpha)} = 0} (x_1^{(\alpha)})^n\, g(x)\, dx_1^{(\alpha)},\ M \right), \qquad M \in U, \tag{3.5}$$

where the contour of integration is taken counter-clockwise around the point $x_1^{(\alpha)} = 0$ in such a way that all other points are excluded, $|x_1^{(\alpha)}| < |x_j^{(\alpha')}|$. Similarly,

$$(g(x),\, f_\alpha[n] M) = \left( \frac{1}{2\pi i} \oint_{x_1^{(\alpha)} = 0} (x_1^{(\alpha)})^n\, g(x)\, dx_1^{(\alpha)},\ M \right),$$

where the contour is taken clockwise. The commutation relations between the currents are equivalent to the operator product expansion (OPE)

$$f_i(z) f_{i\pm 1}(w) = \frac{f_{\alpha_i + \alpha_{i\pm 1}}(w)}{z - w} + \text{regular terms},$$

where “regular terms” refers to terms which have no pole at z = w, and the expansion of the denominator is taken in the region $|z| > |w|$. Due to the OPEs, it is clear that functions in $\mathcal G[\mathbf m]$ will have at most a simple pole whenever $x_j^{(\alpha)} = x_k^{(\alpha\pm 1)}$. Thus, functions in $\mathcal G[\mathbf m]$ are rational functions of the form

$$g(x) = \frac{g_1(x)}{\prod_{i,j,\alpha} (x_i^{(\alpha)} - x_j^{(\alpha+1)})}, \tag{3.6}$$

where $g_1(x)$ are polynomials in $(x_i^{(\alpha)})^{\pm 1}$.



Again using the OPEs, we can construct the pairing between all other elements of U and $\mathcal G$. For example,

$$(g(x),\, M f_{\alpha_\alpha + \alpha_{\alpha\pm 1}}[n]) = \left( \frac{1}{2\pi i} \oint_{x_1^{(\alpha)} = 0} (x_1^{(\alpha)})^n \Bigl. (x_1^{(\alpha)} - x_1^{(\alpha\pm 1)})\, g(x) \Bigr|_{x_1^{(\alpha)} = x_1^{(\alpha\pm 1)}} dx_1^{(\alpha)},\ M \right),$$

where the contour excludes all other points, and

$$(g(x),\, M f_{\alpha_\alpha + \cdots + \alpha_{\alpha+h}}[n]) = \left( \frac{1}{2\pi i} \oint_{x_1^{(\alpha)} = 0} (x_1^{(\alpha)})^n \Bigl. (x_1^{(\alpha)} - x_1^{(\alpha+1)}) \cdots (x_1^{(\alpha+h-1)} - x_1^{(\alpha+h)})\, g(x) \Bigr|_{x_1^{(\alpha)} = \cdots = x_1^{(\alpha+h)}} dx_1^{(\alpha)},\ M \right).$$

The function $g_1(x)$ is not completely arbitrary, due to the Serre relation (3.3). The Serre relation implies that the function

$$\Bigl. (x_1^{(\alpha)} - x_1^{(\alpha+1)})\, g(x) \Bigr|_{x_1^{(\alpha)} = x_1^{(\alpha+1)}}$$

has no poles at the points $x_j^{(\alpha+1)} = x_1^{(\alpha)}$ and $x_j^{(\alpha)} = x_1^{(\alpha+1)}$, where j > 1. This implies that the function $g_1(x)$ has the property that

$$g_1(x) \bigr|_{x_i^{(\alpha)} = x_j^{(\alpha)} = x_k^{(\alpha\pm 1)}} = 0. \tag{3.7}$$

Finally, it is clear that since $[f_i(z), f_i(w)] = 0$, $g_1(x)$ is symmetric under the exchange of variables $x_i^{(\alpha)} \leftrightarrow x_j^{(\alpha)}$. In summary, we have

Theorem 3.3. The space of functions $\mathcal G[\mathbf m]$ dual to the graded component $U[\mathbf m]$ of the universal enveloping algebra of $\tilde{\mathfrak n}_-$, with the pairing defined inductively by (3.5), is the space of functions in the variables $\{x_j^{(\alpha)}\}$ with $j = 1, \dots, m^{(\alpha)}$ and $\alpha = 1, \dots, r$, of the form (3.6), where $g_1(x)$ is a polynomial in $(x_j^{(\alpha)})^{\pm 1}$, symmetric under the exchange of variables with the same superscript, and which vanishes whenever $x_1^{(\alpha)} = x_2^{(\alpha)} = x_1^{(\alpha\pm 1)}$.

3.3.2. Dual to the principal subspace $W_\lambda$. Next, we consider the space $\mathcal F_\lambda[\mathbf m]$, which is defined as the graded component of the space $\mathcal F_\lambda$, the subset of matrix elements of $U[\mathbf m]$ in $\mathcal F_\lambda$. The space $\mathcal F_\lambda[\mathbf m]$ is the dual space to $W_\lambda[\mathbf m]$ (the weight subspace of $W_\lambda$ of $\mathfrak h$-weight $\lambda - \mathbf m^T \boldsymbol\alpha$) with the pairing defined as in (3.5), where $1 \in U$ is replaced by $v_\lambda$. The dual space $\mathcal F_\lambda[\mathbf m]$ is the subspace of $\mathcal G[\mathbf m]$ which couples trivially via the pairing (3.5) to the ideal $I_\lambda \subset U$. Apart from the two-sided ideal coming from the relations in the algebra, which we have already accounted for in constructing $\mathcal G[\mathbf m]$, the ideal $I_\lambda$ contains the relations coming from the highest-weight conditions (2.4), and from the integrability conditions (3.4). The integrability conditions mean that $U f_i(x)^{k+1} U \subset I_\lambda$, which means that

$$g_1(x) \bigr|_{x_1^{(\alpha)} = \cdots = x_{k+1}^{(\alpha)}} = 0 \qquad \text{for all } g(x) \in \mathcal F_\lambda[\mathbf m] \text{ and for all } \alpha. \tag{3.8}$$



The ideal $I_\lambda$ contains the left ideal generated by $f_\alpha[n]$, n > 0, for any $\alpha$. We see from (3.5) that for functions in $\mathcal F_\lambda[\mathbf m]$, $g_1(x)$ can have at most a simple pole at $x_1^{(\alpha)} = 0$. Let us define the function $g_2(x)$ by

$$g(x) = \frac{g_2(x)}{\prod_{\alpha,i} (x_i^{(\alpha)}) \prod_{\alpha,i,j} (x_i^{(\alpha)} - x_j^{(\alpha+1)})}, \tag{3.9}$$

where $g_2(x)$ is a polynomial in $x_i^{(\alpha)}$ for all i, $\alpha$.

In order to account for the relation $U f_\beta[n] \subset I_\lambda$ for $\beta = \alpha_i + \cdots + \alpha_{i+h}$, where n > 0, we need to impose an additional restriction on $g_2(x)$, because of the prefactor $(x_1^{(\alpha)} x_1^{(\alpha+1)} \cdots x_1^{(\alpha+h)})^{-1}$ in (3.9). The function $g_1(x)$, after evaluation at the point $x_1^{(\alpha)} = x_1^{(\alpha+1)} = \cdots = x_1^{(\alpha+h)} = u$, must be of degree greater than or equal to −1 in the variable u if it is to couple trivially to $f_\beta[n]$ for n > 0. Therefore, we see that $g_2(x)$ satisfies:

$$g_2(x) \bigr|_{x_1^{(\alpha)} = x_1^{(\alpha+1)} = \cdots = x_1^{(\alpha+h)} = u} \ \text{ vanishes as } u^h \text{ as } u \to 0. \tag{3.10}$$

Finally we need to take into account the integrability conditions for the top component: $U f_\beta[0]^{\lambda(\beta)+1} \subset I_\lambda$ for each positive root $\beta$. For simple roots, this means that

$$g_2(x) \bigr|_{x_1^{(\alpha)} = \cdots = x_{l_\alpha+1}^{(\alpha)} = 0} = 0. \tag{3.11}$$

When $\beta$ is not a simple root, the relations are more complicated, involving variables corresponding to different roots. These are sufficiently complicated that we do not know how to compute the character of the space in this case. However, at this point let us note that for the special case of rectangular representations, the situation is much simpler. The relation (3.10) is automatically satisfied for such representations. For suppose we consider the representation with $l_\beta \neq 0$ for at most one index $\beta$. Then since $U f_\beta[0] \not\subset I_\lambda$, whereas $U f_\alpha[0] \subset I_\lambda$ for $\alpha \neq \beta$, we have that in this special case,

$$g_1(x) = \prod_j (x_j^{(\beta)})^{-1}\, g_2(x), \tag{3.12}$$

where $g_2(x)$ is a polynomial in all the variables, satisfying (3.11) for the index $\beta$ only, as well as the integrability conditions and the Serre relation. The relation (3.10) is not an extra condition in this case. Let us therefore summarize the result for rectangular representations.

Theorem 3.4. Let $\Lambda_\beta = l\omega_\beta + k\Lambda_0$ for some $1 \le \beta \le r$. Then the dual space of functions to the graded component of the principal subspace $W_{l\omega_\beta}[\mathbf m]$ is the space of rational functions of the form (3.6), where $g_1(x)$ is a function of the form (3.12), where $g_2(x)$ is a polynomial in the variables $x_i^{(\alpha)}$ satisfying the Serre relation (3.7), symmetric under the exchange of variables $x_i^{(\alpha)} \leftrightarrow x_j^{(\alpha)}$ for all $\alpha$, vanishing when $x_1^{(\beta)} = \cdots = x_{l+1}^{(\beta)} = 0$, or when any k + 1 variables of the same superscript coincide, $x_1^{(\alpha)} = \cdots = x_{k+1}^{(\alpha)}$ for any $\alpha$.

In the next section, we will show how to compute the character of this space using a filtration on the space. For non-rectangular representations there is no such simple description of the space. The purpose of this paper is to explain how to compute the character for non-rectangular representations as a linear combination of characters of rectangular representations.



3.4. Filtration of the dual space $\mathcal F_\lambda$. In this subsection, we will assume that $\Lambda = \Lambda_\beta = \lambda + k\Lambda_0$, $\lambda = \lambda_\beta = l\omega_\beta$ for some fixed $1 \le \beta \le r$. This corresponds to a Young diagram of rectangular form (with l columns and $\beta$ rows).

As explained above, the space $\mathcal F_\lambda$ is $\mathfrak h$-graded, $\mathcal F_\lambda = \bigoplus_{\mathbf m} \mathcal F_\lambda[\mathbf m]$, where $\mathcal F_\lambda[\mathbf m]$ is a subspace of the space of rational functions in the variables $x = \{x_i^{(\alpha)} \mid \alpha = 1, \dots, r;\ i = 1, \dots, m^{(\alpha)}\}$ of the form

$$G(x) = \frac{g(x)}{\prod_i (x_i^{(\beta)}) \prod_{\alpha=1}^{r-1} \prod_{i,j} (x_i^{(\alpha)} - x_j^{(\alpha+1)})}, \tag{3.13}$$

where $g(x)$ is polynomial, symmetric under exchange of variables with the same value of $\alpha$ (which we will refer to as the color index), $x_i^{(\alpha)} \leftrightarrow x_j^{(\alpha)}$. The index $\beta$ corresponds to the fundamental weight $\omega_\beta$, where $\lambda_\beta = l\omega_\beta$. In addition, $g(x)$ vanishes when any of the following conditions is met:

$$x_1^{(\alpha)} = \cdots = x_{k+1}^{(\alpha)}, \tag{3.14}$$

$$x_1^{(\alpha)} = x_2^{(\alpha)} = x_1^{(\alpha\pm 1)}, \tag{3.15}$$

$$x_1^{(\beta)} = \cdots = x_{l+1}^{(\beta)} = 0. \tag{3.16}$$

Our goal is to compute the character of this space, for which purpose we will introduce a filtration and an associated graded space. We will be able to compute the characters of the graded pieces easily. To simplify the calculations below, let us define the closely related space $\tilde{\mathcal F}_\lambda[\mathbf m]$. This space is a subspace of the space of all rational functions in the variables x, which are given by

$$\tilde G(x) = \frac{g(x)}{\prod_{\alpha=1}^{r-1} \prod_{i,j} (x_i^{(\alpha)} - x_j^{(\alpha+1)})}, \tag{3.17}$$

where $g(x)$ is as in (3.13), so $\tilde G(x) = \prod_i x_i^{(\beta)}\, G(x)$. In the following, we will fix $\mathbf m$ and l, and study a filtration of this space $\tilde{\mathcal F}_\lambda[\mathbf m]$ (which we will refer to by $\mathcal F$), which can be described as follows.

Let $\mu = (\mu^{(1)}, \dots, \mu^{(r)})$ be a collection of partitions, where each $\mu^{(\alpha)}$ is a partition of $m^{(\alpha)}$ and has $m_a^{(\alpha)}$ rows of length a. We can now rename the variables $x_i^{(\alpha)}$ by associating each of them to a box of the Young diagram associated with the partitions $\mu^{(\alpha)}$. As a result of this renaming, we have variables $x_{a,i,j}^{(\alpha)}$, which correspond to the Young diagram of partition $\mu^{(\alpha)}$, namely to column j of the i-th row (counted from top to bottom) of length a. See the left part of Fig. 1 for an explicit example. In the proofs which follow, we will simplify this notation as much as possible. Note that, due to the symmetry properties of $g(x)$, how we rename the variables is irrelevant.

Let $\mathcal H$ be the space of rational functions in the variables $y = \{y_{a,i}^{(\alpha)} \mid \alpha = 1, \dots, r;\ a \ge 1;\ i = 1, \dots, m_a^{(\alpha)}\}$. Define the evaluation map $\varphi_{\mu^{(\alpha)}}$, which sets all the variables in the same row of the (Young diagram associated to the) partition $\mu^{(\alpha)}$ to the same value, $x_{a,i,j}^{(\alpha)} \mapsto y_{a,i}^{(\alpha)}$. The effect of the evaluation map on the variables corresponding to the partition $\mu^{(\alpha)}$ is shown in Fig. 1.

[Figure: on the left, the Young diagram of $\mu^{(\alpha)}$ with each box labelled by its variable $x_{a,i,j}^{(\alpha)}$; on the right, the same diagram with every box of row (a, i) labelled by $y_{a,i}^{(\alpha)}$.]

Fig. 1. The evaluation map for the variables $x^{(\alpha)}$. Note that we dropped the superscripts $(\alpha)$ in $m_a^{(\alpha)}$

We define the evaluation map $\varphi_\mu : \mathcal F \to \mathcal H$ to be $\varphi_\mu = \prod_{\alpha=1}^{r} \varphi_{\mu^{(\alpha)}}$. By (3.14), $\varphi_\mu(g(x)) = 0$ (where $g(x)$ is as in (3.13) with $G(x) \in \mathcal F$) if any of the partitions $\mu^{(\alpha)}$ has a part which is greater than k. Hence, in the following, we will assume that none of the partitions has a part greater than k, and refer to these (multi-)partitions as k-restricted. Our strategy will be to study the image of $\mathcal F$ under the evaluation map.
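The renaming of variables by boxes of a Young diagram and the row-wise evaluation map can be sketched in a few lines of Python. This is only an illustration under assumptions made here: a single partition $\mu^{(\alpha)}$ is encoded as a dictionary `m` sending a row length a to the number $m_a$ of rows of that length, variables are represented by index tuples rather than symbols, and the function names `boxes`, `evaluation_map` and `is_k_restricted` are hypothetical.

```python
def boxes(m):
    """List the boxes (a, i, j) of the Young diagram with m[a] rows of length a.

    a is the row length, i numbers the rows of that length, j is the column.
    """
    return [(a, i, j)
            for a in sorted(m, reverse=True)
            for i in range(1, m[a] + 1)
            for j in range(1, a + 1)]


def evaluation_map(m):
    """Send the box variable x_{a,i,j} to the row variable y_{a,i}."""
    return {(a, i, j): (a, i) for (a, i, j) in boxes(m)}


def is_k_restricted(m, k):
    """True if no row that actually occurs is longer than k."""
    return all(a <= k for a, count in m.items() if count > 0)


if __name__ == "__main__":
    m = {2: 1, 1: 2}          # the partition (2, 1, 1) of 4
    phi = evaluation_map(m)
    for box, row in phi.items():
        print("x", box, "-> y", row)
```

In this encoding, the four box variables of the partition (2, 1, 1) are sent to three row variables, one per row of the diagram.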

Definition 3.5. Let $\tilde{\mathcal H}_\mu$ be the space of functions in the variables y, and let $\mathcal H_\mu \subset \tilde{\mathcal H}_\mu$ be the subspace spanned by functions of the form

$$H(y) = H_\mu(y)\, h(y), \tag{3.18}$$

where $h(y)$ is an arbitrary polynomial in y, symmetric under the exchange $y_{a,i}^{(\alpha)} \leftrightarrow y_{a,j}^{(\alpha)}$, and

$$H_\mu(y) = \prod_{\alpha=1,\dots,r}\ \prod_{(a,i)>(b,j)} (y_{a,i}^{(\alpha)} - y_{b,j}^{(\alpha)})^{2A_{ab}} \prod_{\alpha=1,\dots,r-1}\ \prod_{(a,i);(b,j)} (y_{a,i}^{(\alpha)} - y_{b,j}^{(\alpha+1)})^{-A_{ab}} \prod_{(a,i)} (y_{a,i}^{(\beta)})^{\max(0,\,a-l)}. \tag{3.19}$$

Here, $A_{ab} = \min(a, b)$ and $(a, i) \in I_k \times I_{m_a^{(\alpha)}}$ (where $I_m = \{1, \dots, m\}$). The ordering $(a, i) > (b, j)$ is defined as follows. The index i increases downwards, and we say that $(a, i) > (b, j)$ if a > b, or, if a = b, when i < j.
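To make the exponents in (3.19) concrete, the Python sketch below implements the ordering $(a, i) > (b, j)$ and adds up the exponents $2A_{ab}$ of the same-color factors and the exponents $\max(0, a - l)$ of the factors $y_{a,i}^{(\beta)}$ for one partition. The dictionary encoding of a partition (row length to multiplicity) and the function names are assumptions made for this illustration; the code only counts degrees and does not construct the function $H_\mu(y)$ itself.

```python
def rows(m):
    """Rows (a, i) of the partition with m[a] rows of length a, i = 1..m[a]."""
    return [(a, i) for a in sorted(m, reverse=True) for i in range(1, m[a] + 1)]


def greater(row1, row2):
    """The ordering of Definition 3.5: (a, i) > (b, j) if a > b, or a == b and i < j."""
    (a, i), (b, j) = row1, row2
    return a > b or (a == b and i < j)


def same_color_degree(m):
    """Sum of the exponents 2*min(a, b) over all ordered pairs (a, i) > (b, j)."""
    rs = rows(m)
    return sum(2 * min(p[0], q[0]) for p in rs for q in rs if greater(p, q))


def boundary_degree(m, l):
    """Sum of the exponents max(0, a - l) over the rows of the color-beta partition."""
    return sum(max(0, a - l) for a, _ in rows(m))


if __name__ == "__main__":
    m = {2: 1, 1: 2}                # the partition (2, 1, 1)
    print(same_color_degree(m))     # three pairs of rows, each with min(a, b) = 1
    print(boundary_degree(m, 1))    # only the row of length 2 exceeds l = 1
```

Because the ordering is total on the rows of a partition, every unordered pair of distinct rows contributes exactly once to `same_color_degree`.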



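Let us define a lexicographic ordering on multi-partitions. That is, the usual lexicographic ordering is taken on partitions $\mu^{(\alpha)}$, and $\nu > \mu$ if there is a $\gamma$ such that $\nu^{(\alpha)} = \mu^{(\alpha)}$ for all $\alpha < \gamma$ and $\nu^{(\gamma)} > \mu^{(\gamma)}$. Let $\ker \varphi_\mu$ be the kernel of the evaluation map $\varphi_\mu$ acting on $\mathcal F$. We can now define the subspaces

$$\Gamma_\mu = \bigcap_{\nu > \mu} \ker \varphi_\nu, \qquad \Gamma'_\mu = \bigcap_{\nu \ge \mu} \ker \varphi_\nu. \tag{3.20}$$

Thus, $\Gamma_\mu$ is the space of rational functions which are annihilated by every evaluation map with $\nu > \mu$. By definition, $\Gamma_\nu \subset \Gamma_\mu$ if $\nu < \mu$, and $\Gamma'_\mu \subset \Gamma_\mu$. In addition, $\Gamma'_{(1^{m^{(1)}}, \dots, 1^{m^{(r)}})} = \{0\}$. Therefore, $\Gamma_\mu$ defines a filtration on $\mathcal F$. Define the associated graded space

$$\mathrm{Gr}\,\Gamma = \bigoplus_\mu \mathrm{Gr}_\mu \Gamma, \tag{3.21}$$

where $\mathrm{Gr}_\mu \Gamma = \Gamma_\mu / \Gamma'_\mu$ and the sum is over multi-partitions of $\mathbf m$. The main purpose of this section is to prove

Theorem 3.6. The induced map

$$\bar\varphi_\mu : \mathrm{Gr}_\mu \Gamma \to \mathcal H_\mu \tag{3.22}$$

is an isomorphism of graded vector spaces.

The proof is very similar to the one found in [7] for the case which corresponds to $\mathfrak{sl}_3$, and we use the same ideas here. To prove the theorem, we need to show three things. First, the evaluation map

$$\varphi_\mu : \Gamma_\mu \to \mathcal H_\mu \tag{3.23}$$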

is well-defined. Second, it is surjective, and third, the induced map (3.22) is well defined and injective.

3.4.1. The evaluation map is well-defined. To prove that the map $\varphi_\mu : \Gamma_\mu \to \mathcal H_\mu$ is well defined, we must show that the rational functions obtained after the evaluation are indeed of the form (3.18) and (3.19). We will do this by showing that the structure of the poles and zeros of the image of the functions (3.17) in $\tilde{\mathcal F}_\lambda[\mathbf m]$ under the evaluation map is precisely of the form (3.19).

Lemma 3.7. Let $G(x) \in \Gamma_\mu$. Then, the function $\varphi_\mu(G(x))$ has a zero of order at least $2\min(a, a')$ when $y_{a,i}^{(\alpha)} = y_{a',i'}^{(\alpha)}$, for all $\alpha$.

Proof. The proof is independent of $\alpha$, and so we can use the argument used in the case of $\mathfrak{sl}_2$ in [2]. We will repeat that argument here for completeness. It is sufficient to consider the dependence of $G(x)$ on the two sets of variables of the same color $\alpha$, which we denote by $\{x_{a,i} \mid i = 1, \dots, a\}$ and $\{x_{a',i'} \mid i' = 1, \dots, a'\}$. We can assume that $a \ge a'$ without loss of generality. We can carry out the evaluation map in two steps: $\varphi_\mu = \varphi^2 \circ \varphi^1$. Here $\varphi^1$ consists of evaluating all the variables except the set $\{x_{a',i'} \mid i' = 1, \dots, a'\}$ and $\varphi^2$ consists of setting $x_{a',1} = \cdots = x_{a',a'} = y_{a'}$ (note that under $\varphi^1$, the variables $x_{a,1}, \dots, x_{a,a}$ are all set to $y_a$). Let

$$g_1(y_a; x_{a',1}, \dots, x_{a',a'}) = \varphi^1(G(x)). \tag{3.24}$$

Because $G(x) \in \Gamma_\mu$, $G(x)$ is annihilated by all $\varphi_\nu$ with $\nu > \mu$. Therefore

$$g_1(y_a; x_{a',1}, \dots, x_{a',a'}) \bigr|_{x_{a',i'} = y_a} = 0 \quad \text{for all } i', \tag{3.25}$$

because this corresponds to an evaluation corresponding to a multi-partition greater than $\mu$. Therefore,

$$g_1(y_a; x_{a',1}, \dots, x_{a',a'}) = \prod_{i'=1}^{a'} (y_a - x_{a',i'})\ \tilde g_1(y_a; x_{a',1}, \dots, x_{a',a'}). \tag{3.26}$$

Now $g_1(y_a; x_{a',1}, \dots, x_{a',a'})$ was obtained from a symmetric function in $x_i^{(\alpha)}$, and so, for each $i'$,

$$\left. \frac{\partial g_1}{\partial y_a} \right|_{x_{a',i'} = y_a} = a \left. \frac{\partial g_1}{\partial x_{a',i'}} \right|_{x_{a',i'} = y_a}. \tag{3.27}$$

However (3.26) tells us that, again for each $i'$,

$$\left. \frac{\partial g_1}{\partial y_a} \right|_{x_{a',i'} = y_a} = - \left. \frac{\partial g_1}{\partial x_{a',i'}} \right|_{x_{a',i'} = y_a} = \left. {\prod_{i''=1}^{a'}}' (y_a - x_{a',i''})\ \tilde g_1 \right|_{x_{a',i'} = y_a}, \tag{3.28}$$

the prime on the product meaning that the term with $i'' = i'$ is to be omitted. The only way to reconcile (3.27) with (3.28) is for $\tilde g_1 |_{x_{a',i'} = y_a}$ to be zero. Thus the zero at $x_{a',i'} = y_a$ is at least of order two:

$$g_1(y_a; x_{a',1}, \dots, x_{a',a'}) = \prod_{i'=1}^{a'} (y_a - x_{a',i'})^2\ \tilde g_2(y_a; x_{a',1}, \dots, x_{a',a'}). \tag{3.29}$$

We now evaluate the right-hand side of (3.29) at $x_{a',1} = \cdots = x_{a',a'} = y_{a'}$ and, recalling the condition that $a \ge a'$, we have

$$\varphi_\mu(G(x)) = (y_a - y_{a'})^{2A_{a,a'}}\, G. \tag{3.30}$$

Lemma 3.8. The image under the evaluation map $\varphi_\mu$ of any function in $\mathcal F$ (and hence $\Gamma_\mu$) has a pole of maximal order $\min(a, a')$ whenever $y_{a,i}^{(\alpha)} = y_{a',i'}^{(\alpha+1)}$.



Proof. We will prove this lemma by looking at the zeros of $g(x)$, which arise because we need to satisfy the Serre relations, $g |_{x_1^{(\alpha)} = x_2^{(\alpha)} = x_1^{(\alpha+1)}} = 0$ and $g |_{x_1^{(\alpha)} = x_1^{(\alpha+1)} = x_2^{(\alpha+1)}} = 0$ for $\alpha = 1, \dots, r - 1$. These relations depend on two sets of variables only.

Consider the dependence of g on the two sets of variables $x_i = x_i^{(\alpha)}$, with $i = 1, \dots, a$, and $\bar x_j = x_j^{(\alpha\pm 1)}$, with $j = 1, \dots, a'$. Under the evaluation map, these variables map to $\varphi_\mu(x_i) = y$ and $\varphi_\mu(\bar x_i) = \bar y$ respectively. Note that x and $\bar x$ are variables corresponding to two adjacent roots. Again without loss of generality, assume that $a \ge a'$. When $x_1 = \bar x_1 = \bar x_j$ or $x_1 = x_j = \bar x_1$, g vanishes, so we find

$$g(x_1, \dots, x_a; \bar x_1, \dots, \bar x_{a'}; \dots) \bigr|_{x_1 = \bar x_1 = z_1} = \prod_{i=2}^{a} (x_i - z_1) \prod_{j=2}^{a'} (\bar x_j - z_1)\ g'(z_1; x_2, \dots, x_a; \bar x_2, \dots, \bar x_{a'}; \dots). \tag{3.31}$$

Repeating the argument for $g'$ we find

$$g'(z_1; x_2, \dots, x_a; \bar x_2, \dots, \bar x_{a'}; \dots) \bigr|_{x_2 = \bar x_2 = z_2} = \prod_{i=3}^{a} (x_i - z_2) \prod_{j=3}^{a'} (\bar x_j - z_2)\ g''(z_1, z_2; x_3, \dots, x_a; \bar x_3, \dots, \bar x_{a'}; \dots). \tag{3.32}$$

We can repeat this argument $a'$ times with the result

$$g(x_1, \dots, x_a; \bar x_1, \dots, \bar x_{a'}; \dots) \bigr|_{\{x_i = \bar x_i = z_i\}_{i=1}^{a'}} = \prod_{i=1}^{a'} \prod_{j=i+1}^{a} (x_j - z_i) \prod_{i=1}^{a'} \prod_{j=i+1}^{a'} (\bar x_j - z_i)\ \tilde g(z_1, \dots, z_{a'}; x_{a'+1}, \dots, x_a; \dots). \tag{3.33}$$

We find that $\varphi_\mu(g)$ has a zero of order at least $aa' - \min(a, a')$ when $y = \bar y$, by counting the number of zeros in (3.33) and using that $a' \le a$. Taking into account the poles of (3.17), which after applying the evaluation map become a pole of order $aa'$ when $y = \bar y$, we find that the image of $\mathcal F$ has a pole of order at most $\min(a, a')$ when $y_{a,j}^{(\alpha)} = y_{a',j'}^{(\alpha\pm 1)}$.

Lemma 3.9. The image of $\varphi_\mu$ acting on a function $G \in \Gamma_\mu$ has a zero of order at least $\max(0, a - l)$ when $y_{a,i}^{(\beta)} = 0$.

Proof. To prove this lemma, we will study the effect of the evaluation map on $g(x)$ in Eq. (3.13). We focus on the variables of a row of length a (where we assume that a > l), $\{x_j^{(\beta)} \mid j = 1, \dots, a\}$. Under the evaluation map, these variables map to $\varphi_\mu(x_j^{(\beta)}) = y^{(\beta)}$. We know that the function

$$g_1(x_{l+1}^{(\beta)}, \dots, x_a^{(\beta)}) = g(x) \bigr|_{x_1^{(\beta)} = \cdots = x_l^{(\beta)} = 0} \tag{3.34}$$

contains a factor $\prod_{j=l+1}^{a} x_j^{(\beta)}$, because it vanishes if any of the remaining variables $x_j^{(\beta)}$ is set to zero (because of the condition (3.16) on $g(x)$). Thus, the image of $g_1$ under the evaluation map has a zero of order at least $\max(0, a - l)$ whenever $y_{a,i}^{(\beta)} = 0$.
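The counting step in the proof of Lemma 3.8 can be checked numerically. The Python sketch below counts the linear factors on the right-hand side of (3.33) for $a \ge a'$ and compares the result with $aa' - \min(a, a')$; subtracting it from the pole order $aa'$ coming from (3.17) leaves $\min(a, a')$. The function names are introduced here for illustration, and the code is a consistency check of the arithmetic only, not a proof.

```python
def zeros_in_3_33(a, a_prime):
    """Number of linear factors in (3.33), assuming a >= a_prime.

    For each i = 1..a' there are (a - i) factors (x_j - z_i) with j = i+1..a
    and (a' - i) factors (xbar_j - z_i) with j = i+1..a'.
    """
    if a < a_prime:
        raise ValueError("expected a >= a_prime")
    x_factors = sum(a - i for i in range(1, a_prime + 1))
    xbar_factors = sum(a_prime - i for i in range(1, a_prime + 1))
    return x_factors + xbar_factors


def max_pole_order(a, a_prime):
    """Pole order a*a' of (3.17) after evaluation, minus the zeros counted above."""
    big, small = max(a, a_prime), min(a, a_prime)
    return big * small - zeros_in_3_33(big, small)


if __name__ == "__main__":
    for a in range(1, 7):
        for a_prime in range(1, a + 1):
            assert zeros_in_3_33(a, a_prime) == a * a_prime - min(a, a_prime)
            assert max_pole_order(a, a_prime) == min(a, a_prime)
    print("zero count in (3.33) equals a*a' - min(a, a') for all 1 <= a' <= a <= 6")
```

The identity behind the check is $\sum_{i=1}^{a'} (a - i) + \sum_{i=1}^{a'} (a' - i) = aa' - a'$, which equals $aa' - \min(a, a')$ when $a' \le a$.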



Lemma 3.10. The map $\varphi_\mu : \Gamma_\mu \to \mathcal H_\mu$ is well defined.

Proof. This follows from Lemmas 3.7, 3.8, 3.9 and the definition of the space $\mathcal H_\mu$.

3.4.2. Proof of surjectivity. We will continue with the proof that the map (3.23) is surjective. We have to prove that for each function of the form defined by (3.18) and (3.19), there is at least one function in the pre-image in $\Gamma_\mu$. We do this by explicitly giving the form of these pre-images, showing that they are elements of $\mathcal F$ and finally, proving that these pre-images are indeed in the kernel of $\varphi_\nu$ for each $\nu > \mu$, which shows that they are in $\Gamma_\mu$. For each (k-restricted) multi-partition $\mu$, we consider the function

$$F(x) = \frac{\mathrm{Sym}\, f(x)}{p(x)}, \tag{3.35}$$

where $f(x)$ and $p(x)$ are polynomials of the form (we identify the variables $x_{a,i,a+1}^{(\alpha)} = x_{a,i,1}^{(\alpha)}$)

$$f(x) = \tilde f(x) \prod_{\substack{a,i \\ j > l}} x_{a,i,j}^{(\beta)} \ \prod_\alpha \prod_{\substack{a,i,j \\ a',i';\ j' \neq j}} (x_{a,i,j}^{(\alpha)} - x_{a',i',j'}^{(\alpha+1)}) \ \times \prod_\alpha \prod_{\substack{(a,i) > (a',i') \\ j = 1, \dots, m_a^{(\alpha)}}} (x_{a,i,j}^{(\alpha)} - x_{a',i',j}^{(\alpha)}) (x_{a,i,j+1}^{(\alpha)} - x_{a',i',j}^{(\alpha)}), \tag{3.36}$$

$$p(x) = \prod_{\alpha = 1, \dots, r-1}\ \prod_{\substack{a,i,j \\ a',i',j'}} (x_{a,i,j}^{(\alpha)} - x_{a',i',j'}^{(\alpha+1)}), \tag{3.37}$$

where $\tilde f(x)$ is an arbitrary polynomial. The symmetrization is over each of the r sets of variables $\{x_i^{(\alpha)}\}$ with the same value of $\alpha$. As we did before, we will drop as many indices as possible in the following lemmas.

Lemma 3.11. The functions $F(x)$ of (3.35) are elements of $\mathcal F$.

Proof. We have to show that $f(x)$ satisfies the vanishing conditions (3.14), (3.15) and (3.16). First of all, we easily see that $f(x)$ is zero when any k + 1 variables of the same color are set to the same value. Because the partitions have rows of maximum length k, these k + 1 variables cannot all be placed in the same row, which implies that the factor $\prod (x_{a,i,j}^{(\alpha)} - x_{a',i',j}^{(\alpha)})$ evaluates to zero under $\varphi_\mu$. To show that the Serre relations are satisfied, we have to show that the zeros

$$\prod_\alpha \prod_{\substack{a,i,j \\ a',i';\ j' \neq j}} (x_{a,i,j}^{(\alpha)} - x_{a',i',j'}^{(\alpha+1)}) \tag{3.38}$$

satisfy the Serre relations. Let $x_{a,j} = x_{a,i,j}^{(\alpha)}$ and $\bar x_{a',j'} = x_{a',i',j'}^{(\alpha+1)}$, for some choice of $\alpha$, i and $i'$.



For every $x$, there is a zero with every $\bar x$, except those appearing in the column which has the same number as the $x$ (i.e. for $j = j'$). Note that if we set two variables $x$ which belong to the same column to the same value, $f(x)$ is zero, because the factor $\prod \bigl(x^{(\alpha)}_{a,i,j} - x^{(\alpha)}_{a',i',j}\bigr)$ is zero in that case. Hence, we set $x_{a,j} = \bar x_{a',j'} = \tilde x$ ($j \neq j'$). Focusing on this variable, we find the following zeros:
\[
(\tilde x - \bar x_{a,i})(\tilde x - \bar x_{a',i'}) \prod_{a'';\, i''\neq i,i'} (\tilde x - \bar x_{a'',i''})^2 .
\]
So, indeed, $\tilde x$ has a zero with every $\bar x$. Similarly, we find that there is at least a zero of order one when we set $x_1 = \bar x_1 = \bar x_2$. To complete the proof of this lemma, we need to show that $f(x)$ satisfies the condition (3.16). This easily follows from the factor $\prod_{j>l} x^{(\beta)}_{a,i,j}$, combined with the zeros which give rise to the condition (3.14).

Remark 3.12. It is instructive to note that all the zeros in (3.38) are necessary to satisfy the Serre relations. We need to show that if we remove any of these zeros, we violate a Serre relation. To show this, it is important that we take the zeros between variables of the same color into account. Let us remove the zero $(x_{a,j} - \bar x_{a',j'})$, where $j \neq j'$. Without loss of generality, we can assume that $j < j'$. The two variables are indicated in Fig. 2 by the black boxes. The gray boxes denote the zeros with the variables corresponding to the black box from the same partition. All we need to do is show that there is at least one variable, of either partition, such that when this variable is set to the same value as the two 'black variables', we do not get a zero, and thus violate a Serre relation. This variable is taken to be of color $(\alpha+1)$ (if $j > j'$, it is of color $(\alpha)$). More precisely, it is the variable $\bar x_{a',j}$, taken from the same row as $\bar x_{a',j'}$ (denoted by the 'slanted' box), which always exists, because $j < j'$. There is no zero at $\bar x_{a',j} = \bar x_{a',j'}$, because both variables are taken from the same row. In addition, there is no zero at $x_{a,j} = \bar x_{a',j}$, because it is not present in the factor (3.38), and the zero at $x_{a,j} = \bar x_{a',j'}$ is the one we removed. We conclude that after we remove the (arbitrary) zero at $x_{a,j} = \bar x_{a',j'}$, we do not have a zero when $x_{a,j} = \bar x_{a',j'} = \bar x_{a',j}$. Thus, we have shown that by removing any of the zeros in (3.38), we violate a Serre condition. We conclude that the zeros are indeed necessary.

Lemma 3.13. The function $F(x)$ of (3.35) associated to a $k$-restricted multi-partition $\mu$ is an element of the kernel of $\varphi_\nu$ for any $\nu > \mu$.

Fig. 2. A violation of the Serre relations if the zero corresponding to the black squares is removed from (3.36). The left partition corresponds to the variables of color $(\alpha)$, the right one to color $(\alpha+1)$. The 'slanted' box is the third variable, in addition to the two black ones, for which the Serre condition is violated. The gray boxes denote the zeros with the variable corresponding to the black box of the same partition, coming from the integrability conditions.



Proof. Let us take a $\nu > \mu$, and let $\nu^{(\alpha)}$ be the first partition such that $\nu^{(\alpha)} > \mu^{(\alpha)}$. We focus on the variables $x^{(\alpha)}$ and show that the function $F(x)$ cannot be non-zero under the evaluation map $\varphi_\nu$. Two variables in the same column of $\mu^{(\alpha)}$ have a zero, so they cannot be placed in the same row in $\nu^{(\alpha)}$ if the result is to be non-zero, because in that case acting with the evaluation map gives a zero. However, because $\nu^{(\alpha)} > \mu^{(\alpha)}$, we cannot avoid placing variables of the same column in $\mu^{(\alpha)}$ in the same row of $\nu^{(\alpha)}$. To show this, let us denote the lengths of the rows of the partitions by $\mu^{(\alpha)}_i$ and $\nu^{(\alpha)}_i$, such that the index $i$ increases going downwards. The only way to avoid placing variables of the same column of $\mu^{(\alpha)}$ in the same row of $\nu^{(\alpha)}$ is by placing the variables of $\mu^{(\alpha)}$ in rows of the same length in $\nu^{(\alpha)}$. However, because $\nu^{(\alpha)} > \mu^{(\alpha)}$, there will be an $\tilde\imath$ such that $\nu^{(\alpha)}_{\tilde\imath} > \mu^{(\alpha)}_{\tilde\imath}$. Let us focus on the smallest $\tilde\imath$. We have to place a variable of a row $\mu^{(\alpha)}_i$ with $i > \tilde\imath$ in the row $\nu^{(\alpha)}_{\tilde\imath}$. Because $\mu^{(\alpha)}_i \le \mu^{(\alpha)}_{\tilde\imath}$, this variable belongs to the same column as another variable in $\nu^{(\alpha)}_{\tilde\imath}$. We conclude that $F(x)$ is zero under the evaluation map $\varphi_\nu$ with $\nu > \mu$.

Lemma 3.14. The function $F(x)$ of (3.35) is an element of $\Gamma_\mu$.

Proof. This follows from Lemmas 3.11 and 3.13.

As a last step in the proof of surjectivity, we have to show that the image of $F(x)$ under the evaluation map is indeed of the form (3.18) and (3.19). In particular, it contains as a factor the functions $h(y)$, which are symmetric under the exchange of variables $y^{(\alpha)}_{a,i} \leftrightarrow y^{(\alpha)}_{a,i'}$.

Lemma 3.15. The image of $F(x)$ under the evaluation map $\varphi_\mu$ is a scalar multiple of the function $H(y)$ in (3.18).

Proof. To prove this lemma, we can follow the same approach as in our paper on the $sl_2$ case, because the argument does not depend on the color of the variables. We focus on the variables $x^{(\alpha)}$, and determine the permutations $\sigma$ for which $\varphi_\mu(f(\sigma(x^{(\alpha)})))$ is non-zero. So, we consider
\[
\sum_{\sigma \in S_{m^{(\alpha)}}} f(\sigma\{x^{(\alpha)}\}). \tag{3.39}
\]
In the following, we omit the label $\alpha$. Recall that the variable $x_{a,i,j}$ corresponds to the $j$th column in the $i$th row of length $a$. Under the evaluation map, $x_{a,i,j} \mapsto y_{a,i}$ for all $j$. Suppose that for some $\sigma$ we have $\sigma(x_{a,i,j}) = x_{a',i',j'}$ with $(a',i') < (a,i)$, and that $(a,i)$ is the largest row for which this is true. This means that all rows above $(a,i)$ undergo only a permutation within the row. Suppose that the pre-factor
\[
\varphi_\mu \circ \sigma \Bigl( \prod_{(a,i)>(a',i')} (x_{a,i,j} - x_{a',i',j})(x_{a,i,j+1} - x_{a',i',j}) \Bigr) \tag{3.40}
\]

is to be non-zero. Then $x_{a',i',j'}$ cannot be in a column directly below or to the left of the permutation image of any other element from row $(a,i)$. This means that at least one other element from row $(a,i)$ should be mapped to a row below $(a,i)$. If it is mapped to the row $(a',i')$, it can appear in any column other than $j'$. If it is mapped to any other row, it can appear in any column other than $j'$ and an adjacent column (to the right or left, depending on whether it is above or below $(a',i')$). Now we repeat this argument for this new element, concluding that at least one more element of row $(a,i)$ is mapped to a lower row, and so forth, until eventually we find that all elements are permuted to a row below $(a,i)$. If the elements are permuted to the same row, they can be placed in adjacent columns. Elements which are permuted to different rows cannot be placed in adjacent columns, because of the factor linking adjacent columns in the pre-factor. There are at most $a$ columns in $\mu^{(\alpha)}$ in rows below $(a,i)$, and hence the elements must all appear in the same row, which is therefore of length $a$. Thus all the variables in rows of length $a$ are mapped to another row of length $a$, for the same reason. As a result, the only permutations which give a non-zero contribution to $\varphi_\mu(f(\sigma(x^{(\alpha)})))$ are those that permute variables within each row, or those that permute rows of equal length. Under the evaluation map, the former contribute equal terms to the sum, while row interchanges correspond to the symmetrization over the variables $y^{(\alpha)}_{a,i}$ with the same values of $\alpha$ and $a$ in $h(y)$. Note that the other factors in the function $F$ are symmetric under the permutation of rows of equal length, so these factors do not interfere with the argument above.

Lemma 3.16. The map $\varphi_\mu : \Gamma_\mu \to \mathcal{H}_\mu$ is surjective.

Proof. This follows from Lemmas 3.14 and 3.15.

3.4.3. Injectivity proof

Lemma 3.17. The induced map $\bar\varphi_\mu : \mathrm{Gr}_\mu \to \mathcal{H}_\mu$ (3.22) is well defined and injective.

Proof. To prove that the map (3.22) is well defined, we use Lemma 3.10 and observe that the image of $\Gamma'_\mu$ under $\varphi_\mu$ is zero by the definition of $\Gamma'_\mu$. It follows that we can define the induced map $\bar\varphi_\mu$ acting on the quotient $\mathrm{Gr}_\mu = \Gamma_\mu / \Gamma'_\mu$. Moreover, the difference between two different functions in $\Gamma_\mu$ that map to the same rational function in $\mathcal{H}_\mu$ is in $\Gamma'_\mu$. Hence, the map is also injective.

We have now completed the proof of Theorem 3.6, because the theorem follows from Lemmas 3.17 and 3.16. The map (3.22) is degree preserving, and thus we can count the functions of homogeneous degree $d$ in $\mathcal{H}_\mu$ to obtain the character of the space $\mathcal{F}$.

To compute the character of $\mathcal{F}_\lambda$, we add the poles $(x^{(\beta)}_{a,i,j})^{-1}$, which are present in the functions $G(x)$ in (3.13). The only thing which changes in the calculation of the character is that, due to these poles, the zeros $(y^{(\beta)}_{a,i})^{\max(0,a-l)}$ in (3.19) become poles $(y^{(\beta)}_{a,i})^{-\min(a,l)}$.

3.5. Character of the dual space. Using the results of the previous section, we can calculate the character of the dual space $\mathcal{F}_\lambda$, where $\lambda = l\omega_\beta$. First, let us define the character of $W_\lambda$ as follows:
\[
\mathrm{ch}_q W_\lambda = \sum_{d,\, m^{(\alpha)}} \dim W_\lambda[m]_d \; q^d\, e^{\lambda - \omega^T C_r m}, \tag{3.41}
\]
where $W_\lambda[m]_d$ is the subspace generated by elements in $U(n_-)$ of homogeneous degree $m^{(\alpha)}$ in $f_\alpha$, and homogeneous degree $-d$ in $t$. Here, $\omega = (\omega_1, \dots, \omega_r)^T$.


E. Ardonne, R. Kedem, M. Stone (α)

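The space $\mathcal{F}_\lambda$ is a space of functions in the variables $x^{(\alpha)}_i$. If we define its $(m,d)$-graded component to be the space of functions in $m^{(\alpha)}$ variables $x^{(\alpha)}_i$ and total homogeneous degree $d$ in all the variables, then, due to the way we defined the generating functions $f_\alpha(x)$ (or, equivalently, the coupling), we have that $\mathcal{F}_\lambda[m]_d$ is the dual to $W_\lambda[m]_{d'}$, where $d' = d + \sum_\alpha m^{(\alpha)}$. Thus,
\[
\mathrm{ch}_q W_\lambda = \sum_{m} \mathrm{ch}\, W_\lambda[m] = \sum_{m} \sum_{d} q^{\,d + \sum_\alpha m^{(\alpha)}}\, e^{\lambda - \omega^T C_r m}\, \dim(\mathcal{F}_\lambda[m])_d, \tag{3.42}
\]
where $\dim(\mathcal{F}_\lambda[m])_d$ denotes the dimension of the subspace of functions in $\mathcal{F}_\lambda[m]$ which have homogeneous degree $d$. The powers of $z$ correspond to the components of the weights in terms of the simple roots. Recall that here, $\lambda = \lambda_\beta = l\omega_\beta$.

We calculate this character by summing over all the functions in $\mathcal{H}$ and counting their homogeneous degree. The character of the space of symmetric functions $h(y)$ in $m^{(\alpha)}_a$ variables is given by
\[
\prod_{\alpha=1}^{r} \prod_{a=1}^{k} \frac{1}{(q)_{m^{(\alpha)}_a}}, \tag{3.43}
\]
where $(q)_m = \prod_{i=1}^{m} (1 - q^i)$ for $m \in \mathbb{N}$ and $(q)_0 = 1$. The homogeneous degree of the rational function $H_\mu(y)$, combined with the additional poles $\prod_{(a,i)} (y^{(\alpha)}_{a,i})^{-a}$, is given by
\[
\deg \frac{H_\mu(y)}{\prod_{\alpha}\prod_{(a,i)} (y^{(\alpha)}_{a,i})^{a}} = \frac{1}{2} \sum_{\alpha,\alpha',a,a'} m^{(\alpha)}_a (C_r)_{\alpha,\alpha'} A_{a,a'}\, m^{(\alpha')}_{a'} - \sum_{a} A_{a,l}\, m^{(\beta)}_a - \sum_{\alpha} m^{(\alpha)}. \tag{3.44}
\]
It follows that the character of $W^{(0)}_\lambda$ is
\[
\mathrm{ch}_q W^{(0)}_{l\omega_\beta} = \sum_{\vec m \in \mathbb{Z}^{r\times k}_{\ge 0}} \frac{q^{\frac{1}{2} \vec m^T (C_r \otimes A) \vec m - (\mathrm{id} \otimes A\, \vec m)^{(\beta)}_l}}{(q)_{\vec m}}\; e^{l\omega_\beta - \omega^T C_r m}. \tag{3.45}
\]
Here $(A)_{a,b} = \min(a,b)$ is a $k \times k$ matrix, and $C_r$ is the Cartan matrix of $sl_{r+1}$. Also, $\vec m$ denotes the vector $(m^{(1)}_1, \dots, m^{(1)}_k; \cdots; m^{(r)}_1, \dots, m^{(r)}_k)$. We made use of the definition
\[
(q)_{\vec m} = \prod_{\alpha=1}^{r} \prod_{a=1}^{k} (q)_{m^{(\alpha)}_a}.
\]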
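The factor $1/(q)_m$ appearing in (3.43) and (3.45) can be expanded as a power series in $q$. The following is a minimal sketch, not part of the original text, that computes the first few coefficients of $1/(q)_m$; the function name `pochhammer_inverse_series` and its arguments are hypothetical. The coefficient of $q^d$ counts partitions of $d$ into at most $m$ parts, which matches counting symmetric polynomials in $m$ variables by homogeneous degree.

```python
def pochhammer_inverse_series(m, max_degree):
    """Coefficients of 1/(q)_m = prod_{i=1}^{m} 1/(1 - q^i), up to q^max_degree.

    Hypothetical helper for illustration; (q)_0 = 1 gives the series 1.
    """
    coeffs = [0] * (max_degree + 1)
    coeffs[0] = 1
    for i in range(1, m + 1):
        # Multiplying a series by 1/(1 - q^i) amounts to the in-place
        # recurrence c[d] += c[d - i], taken in increasing order of d.
        for d in range(i, max_degree + 1):
            coeffs[d] += coeffs[d - i]
    return coeffs


if __name__ == "__main__":
    # Expansion of 1/((1 - q)(1 - q^2)) up to q^6.
    print(pochhammer_inverse_series(2, 6))
```

For $m = 2$ the series begins $1 + q + 2q^2 + 2q^3 + 3q^4 + \cdots$, in agreement with the count of partitions into at most two parts.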

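4. Characters for Rectangular Highest-Weight $\widehat{sl}_{r+1}$-Modules

In this section, we show that the characters of the principal subspace $W_\lambda$ can be used to obtain the character of the full integrable module $V_\lambda$. We do this by using the invariance of the weight multiplicities of $V_\lambda$ under the action of the affine Weyl group, in particular the affine Weyl translations $t_\alpha$. More specifically, we show that by acting with an affine Weyl translation on the principal subspace, and taking an appropriate limit, we obtain the full integrable module.

Let $\Lambda$ be an affine weight of level $k$. It can be written as $\Lambda = \lambda + k\Lambda_0 - m\delta$, where $\lambda$ is the weight with respect to $\mathfrak{h} \subset sl_{r+1}$. Let $t_\alpha$ be the affine Weyl translation corresponding to the root $\alpha$ (see [11], Eq. (6.5.2)), and define the translation $t_{\mathbf{N}} = \prod_i t_{N_i \alpha_i}$, where $\mathbf{N} = (N_1, \dots, N_r)^T$. Then
\[
t_{\mathbf{N}}(\Lambda) = \lambda + k\mathbf{N}^T \cdot \alpha + k\Lambda_0 - \Bigl(m + \mathbf{N}^T \cdot l + \frac{1}{2} k\, \mathbf{N}^T C_r \mathbf{N}\Bigr)\delta. \tag{4.1}
\]
Again, $l = (l_1, \dots, l_r)^T$, where $\lambda = \sum_i l_i \omega_i$, and $\alpha = (\alpha_1, \dots, \alpha_r)^T$. Also note that $\alpha$ in terms of the weights is given by $\alpha = C_r \omega$.

Consider the principal subspace $W^{(\mathbf{N})} = U(n_-)\, t_{\mathbf{N}} v_\lambda$. It has a dual space description which is similar to $\mathcal{F}_\lambda$, if we choose the vector $\mathbf{N}$ carefully. Given that if $f_\alpha[m] v_\lambda = 0$, then $f_\alpha[m + (C_r \cdot \mathbf{N})_\alpha]\, t_{\mathbf{N}} v_\lambda = 0$ (since the Weyl group preserves weight space multiplicities), we choose $\mathbf{N}$ such that $(C_r \cdot \mathbf{N})_\alpha = 2N$ for all $\alpha$, for some $N \in \mathbb{Z}_+$. In the case of $sl_{r+1}$, we have $(\mathbf{N})_i = N\, i(r+1-i)$. Then $f_\alpha[2N + \delta_{\alpha,\beta}]\, t_{\mathbf{N}} v_\lambda = 0$, where $\lambda = l\omega_\beta$, and $f_\alpha[2N - 1 + \delta_{\alpha,\beta}]\, t_{\mathbf{N}} v_\lambda \neq 0$. Note also that the extremal vector $t_{\mathbf{N}} v_\lambda$ is a basis for the one-dimensional weight subspace of weight
\[
t_{\mathbf{N}}(\lambda) = \lambda + k\mathbf{N}^T \alpha + k\Lambda_0 - \Bigl((\lambda, \mathbf{N}^T \alpha) + \frac{1}{2} k\, \mathbf{N}^T C_r \mathbf{N}\Bigr)\delta.
\]
In the case of interest here, this becomes
\[
t_{\mathbf{N}}(l\omega_\beta) = l\omega_\beta + k\mathbf{N}^T \alpha + k\Lambda_0 - (l N_\beta + kN|\mathbf{N}|)\delta,
\]
where $|\mathbf{N}| = \sum_i N_i$. Thus, the space dual to $W^{(\mathbf{N})}_\lambda$ is the space of functions of the form
\[
\prod_{\alpha,i} (x^{(\alpha)}_i)^{-2N}\, G(x),
\]
where $G(x)$ is the function in Eq. (3.13).

Thus, we find that the character of $W^{(\mathbf{N})}_{l\omega_\beta}$ differs from the character of $W^{(0)}_{l\omega_\beta}$ by a change in the exponent of $q$ by $lN_\beta + kN|\mathbf{N}| - 2N|m|$ (where $|m| = \sum_\alpha m^{(\alpha)}$) and a change in the weight by $\omega^T C_r k\mathbf{N}$, which leads to
\[
\mathrm{ch}_q W^{(\mathbf{N})}_{l\omega_\beta} = \sum_{\vec m \in \mathbb{Z}^{r\times k}_{\ge 0}} \frac{q^{\frac{1}{2} \vec m^T (C_r \otimes A) \vec m - (\mathrm{id} \otimes A\, \vec m)^{(\beta)}_l}}{(q)_{\vec m}}\; q^{\,lN_\beta + kN|\mathbf{N}| - 2N|m|}\; e^{l\omega_\beta - \omega^T C_r (m - k\mathbf{N})}.
\]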
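The choice $(\mathbf{N})_i = N\, i(r+1-i)$ made above can be checked numerically. The sketch below is an illustration only, with hypothetical helper names (`cartan_matrix`, `translation_vector`, `mat_vec`, `half_quadratic_form`); it builds the Cartan matrix $C_r$ of $sl_{r+1}$ and confirms that $C_r \mathbf{N}$ has all entries equal to $2N$, and that $\frac{1}{2}\mathbf{N}^T C_r \mathbf{N} = N|\mathbf{N}|$, which is the simplification used in the formula for $t_{\mathbf{N}}(l\omega_\beta)$.

```python
def cartan_matrix(r):
    """Cartan matrix of sl_{r+1}: 2 on the diagonal, -1 next to it."""
    return [[2 if i == j else (-1 if abs(i - j) == 1 else 0)
             for j in range(r)] for i in range(r)]


def translation_vector(r, n):
    """The vector with entries N_i = n * i * (r + 1 - i), i = 1..r."""
    return [n * i * (r + 1 - i) for i in range(1, r + 1)]


def mat_vec(matrix, vec):
    """Plain matrix-vector product on nested lists."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]


def half_quadratic_form(r, n):
    """(1/2) N^T C_r N for the translation vector above (always an integer)."""
    vec = translation_vector(r, n)
    image = mat_vec(cartan_matrix(r), vec)
    return sum(a * b for a, b in zip(vec, image)) // 2


if __name__ == "__main__":
    r, n = 3, 2
    print(mat_vec(cartan_matrix(r), translation_vector(r, n)))
    print(half_quadratic_form(r, n), n * sum(translation_vector(r, n)))
```

The check is purely arithmetic and does not depend on the level $k$, which only enters as an overall factor in (4.1).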

A form suitable for taking the limit N → ∞ is obtained by eliminating the summa (α) (α) (0) tion variable mk in favor of m(α) = ka=1 ama . This gives for the character of Wlωβ (α) (we define m(α) = k−1 a=1 ama )

(0)

chq Wlωβ =

1

q 2k m

T C m− 1 lm(β) r k

elωβ −ω

m∈Zr≥0

×

TC m r

q

→

T −1 → 1→ 2 m (Cr ⊗Ck−1 )m−

r

k−1

α=1

r×(k−1) m∈Z≥0

→ (β) δl m. We have the following generalization of the triangularity property for Kostka polynomials: Lemma 5.13. The generalized Kostka polynomial Kλ,µ (q) = 0 unless λ µ = ν(µ) according to the dominance ordering on partitions. Proof. The dominance ordering on partitions is β

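\[
\sum_{\alpha=1}^{\beta} \lambda_\alpha \le \sum_{\alpha=1}^{\beta} \mu_\alpha, \qquad \text{for all } \beta \in \{1, \dots, r+1\}.
\]
Recast in terms of the variables $n$ and $l$ this means that
\[
A(n-l)_\beta - \beta\, l_{r+1} = A(n-l)_\beta - \beta\, m^{(r)} \ge 0 \qquad \text{for all } \beta. \tag{5.18}
\]
For $\beta = r+1$ the equality holds due to the condition $|\lambda| = |\mu|$, so we need only consider $\beta \le r$. Using the fact that
\[
m^{(r)} = \frac{1}{r+1} \sum_{\alpha=1}^{r} \alpha\, (n^{(\alpha)} - l_\alpha),
\]
Eq. (5.18) becomes
\[
\sum_{\alpha=1}^{r} \Bigl(A_{\beta\alpha} - \frac{\beta\alpha}{r+1}\Bigr)(n^{(\alpha)} - l_\alpha) = \bigl(C_r^{-1}(n-l)\bigr)_\beta = m^{(\beta)}.
\]
Since $m^{(\beta)} \ge 0$ in the summation in $K_{\lambda,\mu}(q)$, this proves the lemma.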
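The last step of the proof uses the identity $(C_r^{-1})_{\beta\alpha} = A_{\beta\alpha} - \beta\alpha/(r+1)$ with $A_{\beta\alpha} = \min(\beta,\alpha)$. As a sanity check, and not as part of the original argument, the following sketch verifies this for small $r$ with exact rational arithmetic; the helper names `cartan_matrix`, `candidate_inverse` and `is_inverse` are hypothetical.

```python
from fractions import Fraction


def cartan_matrix(r):
    """Cartan matrix of sl_{r+1}: 2 on the diagonal, -1 next to it."""
    return [[2 if i == j else (-1 if abs(i - j) == 1 else 0)
             for j in range(r)] for i in range(r)]


def candidate_inverse(r):
    """Matrix with entries min(b, a) - b*a/(r+1) for a, b = 1..r."""
    return [[Fraction(min(b, a)) - Fraction(b * a, r + 1)
             for a in range(1, r + 1)] for b in range(1, r + 1)]


def is_inverse(r):
    """Check that cartan_matrix(r) times candidate_inverse(r) is the identity."""
    c, inv = cartan_matrix(r), candidate_inverse(r)
    for i in range(r):
        for j in range(r):
            entry = sum(c[i][k] * inv[k][j] for k in range(r))
            if entry != (1 if i == j else 0):
                return False
    return True


if __name__ == "__main__":
    print(all(is_inverse(r) for r in range(1, 8)))
```

Exact fractions are used rather than floating point so that the comparison with the identity matrix is not affected by rounding.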



Note also that if m(β) = 0 for all β, then λ = µ. In that special case, Kλ,µ (q) = 1. To tie in with the usual notion of the unitriangularity of the Kostka matrix, let Sr [m] ∼ P (r, k)[m] be the subsets of (multi-) partitions of m, and fix max(k, r) ≥ m. The number of elements of both sets is the number of partitions of m. Let λ m. The last lemma implies that the square matrix K(q), with entries indexed lexicographically by the partitions ν(µ) with µ ∈ Sr [m] and λ is upper unitriangular. That is, define (K(q))λ,µ = Kλ,µ ,

µ ∈ Sr [m], µ = ν(µ), max(k, r) ≥ |λ| = |µ|.

Then K(q)λ,µ = 0 if λ µ, and it is equal to 1 if λ = µ. In the case in which we are interested, in which $r$ is fixed and may be smaller than $|\mu|$, we take the subset of the elements of this matrix which have the length of $\mu$ to be at most $r$, and the length of $\lambda$ to be at most $r+1$.

6. Characters for Arbitrary Highest-Weight $\widehat{sl}_{r+1}$-Modules

Let $\lambda \in P_k^+$ and let $V_\lambda$ be the highest-weight $sl_{r+1}$-module of level $k$. We are interested in computing a fermionic formula for the character of this space, for arbitrary $\lambda$, similar in form to the one found in Sect. 3. We compute this character in several steps. First, we compute the character of the fusion product of several principal subspaces corresponding to rectangular highest weights $\mu_p$. We then use a Weyl translation to find the character of the fusion product of integrable modules corresponding to the same highest weights. At this point, we choose a very particular set of $r$ rectangular highest weights, of the form $\mu_p = a_p \omega_p$ with $p = 1, \dots, r$. We use the decomposition of the fusion product into the graded sum over irreducible highest-weight modules, with coefficients given by the generalized Kostka polynomials. This means that the character of the fusion product is the sum over characters of irreducible modules, with coefficients given by the Kostka polynomial. This relation between the characters is invertible, so we use it to write the character of the irreducible module in terms of a finite sum over characters of particular fusion products. The coefficients in the sum are polynomials in $q^{-1}$ whose coefficients are not necessarily positive, since they are given by the entries of the inverse of the matrix of generalized Kostka polynomials in $q^{-1}$.

6.1. Character of the fusion product of principal subspaces. Consider the fusion product of principal subspaces:
\[
W_\mu(\zeta) = W_1(\zeta_1) \cdots W_N(\zeta_N) = U(n_- \otimes \mathbb{C}(t))\, v_1 \otimes \cdots \otimes v_N,
\]
where we allow singularities at $t = \zeta_p$. Here, $v_p$ is the highest-weight vector of $V_{\mu_p}(\zeta_p)$, the module of level $k$ with highest weight of the form $\mu_p = a_p \omega_{\alpha_p}$, localized at $\zeta_p$. We choose $k$ sufficiently large, that is, $k \ge \sum_p a_p$, so that the level-restriction in the decomposition coefficients does not play a role. Note once more that the algebra $U(n_- \otimes \mathbb{C}(t))$ is filtered by degree in $t$, and that, defining the cyclic vector $\otimes_p v_p$ to have degree 0, the fusion product $W_\mu(\zeta)$ inherits this filtration. Hence, we can define the $q$-character of $W_\mu(\zeta)$ as the Hilbert series of the associated graded space. It is a Laurent series in $q$, which we can compute for sufficiently simple $\mu_p$.



As an $n_- \otimes \mathbb{C}[t, t^{-1}]$-module, $W_\mu(\zeta)$ decomposes as a direct sum of principal subspaces $W_\lambda(0)$, with graded coefficients which are equal to the generalized Kostka polynomials in the previous section. This follows from the fact that $W_\lambda(0)$ is generated by the action of $n_- \otimes \mathbb{C}[t, t^{-1}]$ on the highest-weight vector of $V_\lambda(0)$, and in the previous section we computed the graded space of multiplicities of these highest-weight vectors in the fusion product of integrable modules to be generalized Kostka polynomials. Thus, we can see that
\[
\mathrm{ch}_q W_\mu(\zeta) = \sum_{\lambda} K_{\lambda,\mu}(q^{-1})\, \mathrm{ch}_q W_\lambda. \tag{6.1}
\]
Note that the sum over $\lambda$ is finite, because $K_{\lambda,\mu}(q) = 0$ when $\lambda_1 > (\nu(\mu))_1$.

In this subsection we compute the character of the fusion product $W_\mu(\zeta)$ by characterizing the dual space of $n_- \otimes \mathbb{C}(t)$ acting on the cyclic vector $\otimes_p v_p$. The dual space is the space of generating functions for matrix elements of the form
\[
\bigl\{ \langle w |\, U(n_- \otimes \mathbb{C}(t))\, v_1 \otimes \cdots \otimes v_N \rangle \;\big|\; w \in W^*_\lambda(\infty),\ \lambda \in P_k^+ \bigr\}.
\]
Thus, the dual space $\mathcal{F}_\mu(\zeta)$ is the space of functions in the variables $x^{(\alpha)}_i$ (with $1 \le \alpha \le r$ and $1 \le i \le m^{(\alpha)}$), with pairing defined in the same way as in Eq. (5.7). Thus it is the space of functions with possible simple poles at $x^{(\alpha_p)}_i = \zeta_p$ and $x^{(\alpha)}_i = x^{(\alpha\pm1)}_j$, such that the polynomial $f(x)$ defined by
\[
F(x) = \frac{f(x)}{\prod_{p,i} (x^{(\alpha_p)}_i - \zeta_p)\; \prod_{\alpha=1}^{r-1} \prod_{j,k} (x^{(\alpha)}_j - x^{(\alpha+1)}_k)} \in \mathcal{F}_\mu(\zeta) \tag{6.2}
\]
is symmetric under the exchange $x^{(\alpha)}_i \leftrightarrow x^{(\alpha)}_j$. In addition, it vanishes due to the Serre relation whenever
\[
x^{(\alpha)}_1 = x^{(\alpha)}_2 = x^{(\alpha\pm1)}_1.
\]
There is no degree restriction on $f(x)$, since we allow for poles at infinity in $U(n_- \otimes \mathbb{C}(t))$, as well as at $t = \zeta_p$. We do not allow for zeros at $t = \zeta_p$, so the pole structure at $t = \zeta_p$ is as before. Moreover we have, as in the calculation of the coinvariant, the condition that $f(x)$ vanishes whenever
\[
x^{(\alpha_p)}_1 = \cdots = x^{(\alpha_p)}_{a_p+1} = \zeta_p, \qquad p = 1, \dots, N. \tag{6.3}
\]
Finally, it is possible now to have currents $f_\alpha(z)^{k+1}$ acting non-trivially on the tensor product of highest-weight vectors. Since $W_\lambda(0)$ is a subspace of an integrable module, where such currents act trivially, the dual space is in the subspace which couples trivially to such currents. That is, we must impose the integrability condition that $f(x)$ vanishes whenever
\[
x^{(\alpha)}_1 = \cdots = x^{(\alpha)}_{k+1}. \tag{6.4}
\]

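These conditions characterize the space $\mathcal{F}_\mu(\zeta)$. In order to compute the character of the $\mathfrak{h}$-graded component $\mathcal{F}_\mu(\zeta)[m]$, we introduce the same filtration as in Sect. 3.4. That is, let $\nu$ be a multi-partition consisting of $r$ partitions, where $\nu^{(\alpha)} \vdash m^{(\alpha)}$ (we denote this as $\nu \vdash m$). We order multi-partitions lexicographically, and introduce the evaluation maps $\varphi_\nu$ as in Sect. 3.4. The evaluation maps act on the space $\mathcal{F}_\mu(\zeta)$. Let $\Gamma_\nu = \cap_{\nu' > \nu} \ker \varphi_{\nu'}$ etc., where the kernel now refers to that of the evaluation map acting on $\mathcal{F}_\mu(\zeta)$. Define the graded components $\mathrm{Gr}_\nu = \Gamma_\nu / \Gamma'_\nu$. We compute the image of the induced map $\bar\varphi_\nu : \mathrm{Gr}_\nu \to \mathcal{H}_\nu$. Here, $\mathcal{H}_\nu$ is the space of rational functions in the variables
\[
y = \bigl\{ y^{(\alpha)}_{a,i} \;\big|\; 1 \le \alpha \le r,\ 1 \le i \le m^{(\alpha)}_a,\ 1 \le a \le k \bigr\},
\]
where $m^{(\alpha)}_a$ is the number of rows of length $a$ in $\nu^{(\alpha)}$, with possible poles at $y^{(\alpha)}_{a,i} = y^{(\alpha+1)}_{a',i'}$ and at $y^{(\alpha_p)}_{a,i} = \zeta_p$.

Definition 6.1. Let $\widetilde{\mathcal{H}}_\nu \subset \mathcal{H}_\nu$ be the subspace of functions spanned by functions of the form
\[
H(y) = H_\nu(y)\, h(y), \tag{6.5}
\]
where $h(y)$ is a polynomial, symmetric under the exchange of variables with the same values of $\alpha$ and $a$, and
\[
\begin{aligned}
H_\nu(y) = {} & \prod_{\alpha=1,\dots,r}\, \prod_{(a,i)>(a',i')} \bigl(y^{(\alpha)}_{a,i} - y^{(\alpha)}_{a',i'}\bigr)^{2A_{a,a'}} \prod_{\alpha=1,\dots,r-1}\, \prod_{(a,i);(a',i')} \bigl(y^{(\alpha)}_{a,i} - y^{(\alpha+1)}_{a',i'}\bigr)^{-A_{a,a'}} \\
& \times \prod_{p,(a,i)} \bigl(y^{(\alpha_p)}_{a,i} - \zeta_p\bigr)^{-A_{a,a_p}}.
\end{aligned} \tag{6.6}
\]
By using almost identical arguments to those in Sect. 3.4, we conclude that

Theorem 6.2. The induced map
\[
\bar\varphi_\nu : \mathrm{Gr}_\nu \to \widetilde{\mathcal{H}}_\nu \tag{6.7}
\]
is an isomorphism of graded vector spaces. Therefore we have that
\[
\mathrm{ch}_q \mathcal{F}_\mu(\zeta) = \sum_{m} \sum_{\nu \vdash m} \mathrm{ch}_q \widetilde{\mathcal{H}}_\nu .
\]
To compute the character of $\widetilde{\mathcal{H}}_\nu$ we can set $\zeta_p = 0$ in $H_\nu(y)$, as it does not change the character. Also recall that $\mathrm{ch}_q W_\mu(\zeta)[m] = q^{|m|}\, \mathrm{ch}_q \mathcal{F}_\mu(\zeta)$. Thus we have
\[
\mathrm{ch}_q W_\mu(\zeta) = \sum_{\vec m \in \mathbb{Z}^{r\times k}_{\ge 0}} \frac{q^{\frac{1}{2} \vec m^T (C_r \otimes A) \vec m - \vec m^T (\mathrm{id} \otimes A)\, \vec n}}{(q)_{\vec m}}\; e^{\omega^T \cdot n - \omega^T C_r m}. \tag{6.8}
\]
Recall that $n = (n^{(1)}, \dots, n^{(r)})^T$, with $n^{(\alpha)} = \sum_{a \ge 0} a\, n^{(\alpha)}_a$, where $n^{(\alpha)}_a$ is the number of highest weights of the form $\mu_p = a\omega_\alpha$. In order to calculate the character for general principal subspaces of $sl_{r+1}$, we can restrict ourselves to sequences of $r$ partitions of the form $\mu_p = a_p \omega_p$, with $p = 1, \dots, r$.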



The results of Sect. 5.7 show that the matrix $K(q)$ with elements $(K(q))_{\lambda,\nu(\mu)} = K_{\lambda,\mu}(q)$ is invertible, so we can invert the relation (6.1) and conclude that the character of the principal subspace of a general highest weight is given by
\[
\mathrm{ch}_q W_\lambda = \sum_{\mu} \bigl(K^{-1}(q^{-1})\bigr)_{\nu(\mu),\lambda}\, \mathrm{ch}_q W_\mu(\zeta), \tag{6.9}
\]
where the finite sum is over sequences of partitions of the form $\mu = (n^{(1)}\omega_1, \dots, n^{(r)}\omega_r)$, i.e. sequences of rectangular weights, such that $\nu(\mu) \le \lambda$ (in the sense of Lemma 5.12).

6.2. Characters for general highest-weight modules of $sl_{r+1}$. We can now use the results of Sect. 4 to obtain the character formulæ for the Weyl translated principal subspaces and, in particular, the characters of general integrable irreducible representations of $sl_{r+1}$. Let us denote the limit of N → ∞ of T N chq Vµ (ζ ) (where N is chosen in such a way that (Cr · N)α = 2N, for all α) by chq Vµ (ζ ). Using the results and notation of Sect. 4, we find 1 T Cr m − k1 nT · m ωT ·n−ωT Cr m q 2k m e chq Vµ (ζ ) = m ∈Zr →T

−1 → →T

−1 →

q 21 m (Cr ⊗Ck−1 )m− n (id⊗Ck−1 )m 1

r × r , (q)∞ → r×(k−1) a 0), chq V(l1 ,l2 ) = chq V(l1 ,l2 ) (ζ ) −

$\frac{1}{q}\, \mathrm{ch}_q V_{(l_1-1,\,l_2-1)}(\zeta)$, (6.17)

where $\mathrm{ch}_q V_\mu(\zeta)$ is given by Eq. (6.10) or (6.11).

6.3.2. An $sl_4$ example. We give an explicit example for the matrix $K$ for representations of $sl_4$, with level $k \le 4$. In addition, we restrict ourselves to representations with $\sum_{i=1}^{3} i\, l_i = 0 \bmod 4$ (see Sect. 5.7). There are 10 representations of this kind, and we use the ordering (0, 0, 0); (1, 0, 1), (0, 2, 0); (2, 1, 0), (0, 1, 2); (4, 0, 0), (2, 0, 2), (1, 2, 1), (0, 4, 0), (0, 0, 4).



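With this ordering, we obtain the following Kostka matrix
\[
K(q) = \begin{pmatrix}
1 & q & 0 & 0 & 0 & 0 & q^2 & 0 & 0 & 0 \\
0 & 1 & 0 & q & q & 0 & q & q^2 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & q+q^2 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & q & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & q & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}. \tag{6.18}
\]
The inverse is
\[
K^{-1}(q) = \begin{pmatrix}
1 & -q & 0 & q^2 & q^2 & 0 & 0 & -q^3 & 0 & 0 \\
0 & 1 & 0 & -q & -q & 0 & -q & q^2 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & -q-q^2 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & -q & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & -q & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}. \tag{6.19}
\]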
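That (6.19) is the inverse of (6.18) can be confirmed by a direct multiplication. The following sketch is an illustration only: polynomials in $q$ are stored as dictionaries mapping exponents to integer coefficients, and all helper names (`poly_add`, `poly_mul`, `identity`, `unitriangular`, `mat_mul`) are hypothetical. Only the off-diagonal entries of the two matrices are typed in; the diagonal is filled with ones.

```python
SIZE = 10  # number of sl_4 representations in the chosen ordering


def poly_add(p, r):
    """Sum of two polynomials stored as {exponent: coefficient}."""
    out = dict(p)
    for e, c in r.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}


def poly_mul(p, r):
    """Product of two polynomials stored as {exponent: coefficient}."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in r.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}


def identity():
    return [[{0: 1} if i == j else {} for j in range(SIZE)] for i in range(SIZE)]


def unitriangular(entries):
    """Identity matrix plus the given off-diagonal entries (1-based indices)."""
    m = identity()
    for (i, j), p in entries.items():
        m[i - 1][j - 1] = dict(p)
    return m


def mat_mul(a, b):
    out = [[{} for _ in range(SIZE)] for _ in range(SIZE)]
    for i in range(SIZE):
        for j in range(SIZE):
            acc = {}
            for k in range(SIZE):
                acc = poly_add(acc, poly_mul(a[i][k], b[k][j]))
            out[i][j] = acc
    return out


# Off-diagonal entries of K(q) from (6.18), e.g. {1: 1, 2: 1} means q + q^2.
K = unitriangular({
    (1, 2): {1: 1}, (1, 7): {2: 1},
    (2, 4): {1: 1}, (2, 5): {1: 1}, (2, 7): {1: 1}, (2, 8): {2: 1},
    (3, 8): {1: 1, 2: 1}, (4, 8): {1: 1}, (5, 8): {1: 1},
})

# Off-diagonal entries of the inverse from (6.19).
K_INV = unitriangular({
    (1, 2): {1: -1}, (1, 4): {2: 1}, (1, 5): {2: 1}, (1, 8): {3: -1},
    (2, 4): {1: -1}, (2, 5): {1: -1}, (2, 7): {1: -1}, (2, 8): {2: 1},
    (3, 8): {1: -1, 2: -1}, (4, 8): {1: -1}, (5, 8): {1: -1},
})

if __name__ == "__main__":
    print(mat_mul(K, K_INV) == identity())
```

The eighth column of `K_INV`, read with $q$ replaced by $q^{-1}$, gives the coefficients with which the fusion-product characters enter the character of the representation $(1,2,1)$.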

Note that the inverse Kostka matrix has off-diagonal elements with both signs. As an example, we find that (by making use of Eq. (6.13))
\[
\begin{aligned}
\mathrm{ch}_q V_{(1,2,1)} = {} & \mathrm{ch}_q V_{(1,2,1)}(\zeta) - \frac{1}{q}\, \mathrm{ch}_q V_{(2,1,0)}(\zeta) - \frac{1}{q}\, \mathrm{ch}_q V_{(0,1,2)}(\zeta) \\
& - \Bigl(\frac{1}{q} + \frac{1}{q^2}\Bigr) \mathrm{ch}_q V_{(0,2,0)}(\zeta) + \frac{1}{q^2}\, \mathrm{ch}_q V_{(1,0,1)}(\zeta) - \frac{1}{q^3}\, \mathrm{ch}_q V_{(0,0,0)}(\zeta),
\end{aligned} \tag{6.20}
\]
with $\mathrm{ch}_q V_\mu(\zeta)$ given by Eq. (6.10).

7. Conclusion

The main purpose of this paper was to find explicit fermionic character formulæ for arbitrary integrable highest-weight modules of $sl_{r+1}$, using a generalization of the methods of Feigin and Stoyanovskiĭ [23]. Because the functional realization of the dual space for non-rectangular highest weights is too complex for computation of a fermionic character (see Sect. 3.3.2), we did not compute purely fermionic characters, which would have the nice feature that they are manifestly power series in $q$ with non-negative coefficients. Instead, we found explicit character formulæ as a finite sum of fermionic characters with coefficients in $\mathbb{Z}[q^{-1}]$.

To obtain these explicit characters, we used the following strategy: we computed the fermionic character formula for the (non level-restricted) fusion product of $N$ integrable modules with rectangular highest weights $\mu_p = a_p \omega_{\alpha_p}$, Eq. (6.11), and of the space of conformal blocks associated with this fusion product, the generalized Kostka polynomial of Theorem 5.11. We thus provided a proof of the conjecture of Feigin and Loktev [8] concerning the relation between their graded tensor product and the generalized Kostka polynomials [22, 16] in this case. It is also a direct proof of the independence of the dimension of the FL-fusion product of the evaluation parameters (the points $\zeta_p$), since the associated graded space whose character we computed corresponds to the limit $\zeta_p \to 0$ for all $p$. We then used the characters for the special case of these fusion products, together with the relation (6.12), to obtain a formula for the characters of integrable modules of $sl_{r+1}$ of arbitrary (non-rectangular) highest weight, in terms of the inverse matrix of certain generalized Kostka polynomials, see Theorem 6.3.

The generalization of the discussion in this paper to other simple Lie algebras requires us to consider the so-called Kirillov-Reshetikhin modules (or rather, their limit to the loop algebra case, as KR-modules were originally defined for Yangians). These take the place of irreducible $\mathfrak{g}$-modules with rectangular highest weights, but as $\mathfrak{g}$-modules they are not necessarily irreducible. We will explain this generalization in an upcoming publication.

Acknowledgements. The work of E.A. is supported by NSF grants numbers DMR-04-42537 and DMR-01-32990; that of M.S. by NSF grant DMR-01-32990. R.K. would like to thank B. Feigin and S. Loktev for many useful discussions.

References

1. Ardonne, E., Bouwknegt, P., Schoutens, K.: Non-abelian quantum Hall states – exclusion statistics, K-matrices, and duality. J. Stat. Phys. 102(3–4), 421–469 (2001)
2. Ardonne, E., Kedem, R., Stone, M.: Filling the Bose sea: symmetric quantum Hall edge states and affine characters. J. Phys. A 38(3), 617–636 (2005)
3. Belavin, A.A., Polyakov, A.M., Zamolodchikov, A.B.: Infinite conformal symmetry in two-dimensional quantum field theory. Nucl. Phys. B 241(2), 333–380 (1984)
4. Feigin, B., Jimbo, M., Kedem, R., Loktev, S., Miwa, T.: Spaces of coinvariants and fusion product, affine $sl_2$ character formulas in terms of Kostka polynomials. Duke Math. J. 125(3), 549–588 (2004)
5. Feigin, B., Jimbo, M., Loktev, S., Miwa, T., Mukhin, E.: Addendum to: "Bosonic formulas for (k, l)-admissible partitions". Ramanujan J. 7(4), 519–530 (2003)
6. Feigin, B., Kedem, R., Loktev, S., Miwa, T., Mukhin, E.: Combinatorics of the $\widehat{sl}_2$ spaces of coinvariants. Transform. Groups 6(1), 25–52 (2001)
7. Feigin, B., Kedem, R., Loktev, S., Miwa, T., Mukhin, E.: Combinatorics of the $\widehat{sl}_2$ coinvariants: dual functional realization and recursion. Compositio Math. 134(2), 193–241 (2002)
8. Feigin, B., Loktev, S.: On generalized Kostka polynomials and the quantum Verlinde rule. In: Differential topology, infinite-dimensional Lie algebras, and applications, Volume 194 of Amer. Math. Soc. Transl. Ser. 2, Providence, RI: Amer. Math. Soc., 1999, pp. 61–79
9. Georgiev, G.: Combinatorial constructions of modules for infinite-dimensional Lie algebras, II. Parafermionic space. http://arxiv.org/list/math.QA/9504024, 1995
10. Georgiev, G.: Combinatorial constructions of modules for infinite-dimensional Lie algebras, I. Principal subspace. J. Pure Appl. Alg. 112(3), 247–286 (1996)
11. Kac, V.G.: Infinite-dimensional Lie algebras. Cambridge: Cambridge University Press, Third edition, 1990
12. Kedem, R., Klassen, T.R., McCoy, B.M., Melzer, E.: Fermionic quasi-particle representations for characters of $(G^{(1)})_1 \times (G^{(1)})_1 / (G^{(1)})_2$. Phys. Lett. B 304(3–4), 263–270 (1993)
13. Kedem, R., McCoy, B.M.: Construction of modular branching functions from Bethe's equations in the 3-state Potts chain. J. Stat. Phys. 71(5–6), 865–901 (1993)
14. Kirillov, A.N.: Ubiquity of Kostka polynomials. In: Physics and combinatorics 1999 (Nagoya), River Edge, NJ: World Sci. Publishing, 2001, pp. 85–200
15. Kirillov, A.N., Schilling, A., Shimozono, M.: A bijection between Littlewood-Richardson tableaux and rigged configurations. Selecta Math. (N.S.) 8(1), 67–135 (2002)
16. Kirillov, A.N., Shimozono, M.: A generalization of the Kostka-Foulkes polynomials. J. Alg. Combin. 15(1), 27–69 (2002)
17. Lepowsky, J., Primc, M.: Structure of the standard modules for the affine Lie algebra $A_1^{(1)}$. Volume 46 of Contemporary Mathematics. Providence, RI: Amer. Math. Soc., 1985
18. Macdonald, I.G.: Symmetric functions and Hall polynomials. Oxford Mathematical Monographs. New York: Clarendon Press Oxford University Press, Second edition, 1995. With contributions by A. Zelevinsky, Oxford Science Publications
19. Moore, G., Read, N.: Nonabelions in the fractional quantum Hall effect. Nucl. Phys. B 360(2–3), 362–396 (1991)
20. Primc, M.: Vertex operator construction of standard modules for $A_n^{(1)}$. Pac. J. Math. 162(1), 143–187 (1994)
21. Primc, M.: Loop modules in annihilating ideals of standard modules for affine Lie algebras. In: VII. Mathematikertreffen Zagreb-Graz (Graz, 1990), Volume 313 of Grazer Math. Ber., Karl-Franzens-Univ. Graz, 1991, pp. 39–44
22. Schilling, A., Ole Warnaar, S.: Inhomogeneous lattice paths, generalized Kostka polynomials and $A_{n-1}$ supernomials. Commun. Math. Phys. 202(2), 359–401 (1999)
23. Stoyanovskiĭ, A.V., Feĭgin, B.L.: Functional models of the representations of current algebras, and semi-infinite Schubert cells. Funkt. Anal. i Pril. 28(1), 68–90, 96 (1994)
24. Tsuchiya, A., Kanie, Y.: Vertex operators in conformal field theory on $\mathbb{P}^1$ and monodromy representations of braid group. In: Conformal field theory and solvable lattice models (Kyoto, 1986), Volume 16 of Adv. Stud. Pure Math., Boston, MA: Academic Press, 1988, pp. 297–372

Communicated by L. Takhtajan




Decay of Solutions of the Wave Equation in the Kerr Geometry

F. Finster1, N. Kamran2, J. Smoller3, S.-T. Yau4

1 NWF I – Mathematik, Universität Regensburg, 93040 Regensburg, Germany. E-mail: [email protected]
2 Department of Math. and Statistics, McGill University, Montréal, Québec, Canada H3A 2K6. E-mail: [email protected]
3 Mathematics Department, The University of Michigan, Ann Arbor, MI 48109, USA. E-mail: [email protected]
4 Mathematics Department, Harvard University, Cambridge, MA 02138, USA. E-mail: [email protected]
Received: 18 April 2005 / Accepted: 10 August 2005 Published online: 1 March 2006 – © Springer-Verlag 2006

Abstract: We consider the Cauchy problem for the massless scalar wave equation in the Kerr geometry for smooth initial data compactly supported outside the event horizon. We prove that the solutions decay in time in L∞ loc . The proof is based on a representation of the solution as an infinite sum over the angular momentum modes, each of which is an integral of the energy variable ω on the real line. This integral representation involves solutions of the radial and angular ODEs which arise in the separation of variables. Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 2. Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . 3. Asymptotic Estimates for the Radial Equation . . . . . . . . . 3.1 Holomorphic families of radial solutions . . . . . . . . . 3.2 A continuous family of solutions near ω = 0 . . . . . . . 4. Global Estimates for the Radial Equation . . . . . . . . . . . . 4.1 The complex Riccati equation . . . . . . . . . . . . . . 4.2 Invariant disk estimates . . . . . . . . . . . . . . . . . . 4.3 Bounds for the Wronskian and the fundamental solutions 5. Contour Deformations to the Real Axis . . . . . . . . . . . . . 6. Energy Splitting Estimates . . . . . . . . . . . . . . . . . . . 7. An Integral Representation on the Real Axis . . . . . . . . . . 8. Proof of Decay . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Research supported in part by the Deutsche Forschungsgemeinschaft. Research supported by NSERC grant # RGPIN 105490-2004. Research supported in part by the NSF, Grant No. DMS-010-3998. Research supported in part by the NSF, Grant No. 33-585-7510-2-30.


1. Introduction

In this paper we study the long-time dynamics of massless scalar waves in the Kerr geometry. We prove that solutions of the Cauchy problem with smooth initial data that is compactly supported outside the event horizon decay in $L^\infty_{\mathrm{loc}}$. Our starting point is the integral representation for the propagator [5], which involves an integral over a complex contour in the energy variable $\omega$. In order to study the long-time dynamics, we must deform the contour to the real line. To this end, we carefully analyze the solutions of the associated radial and angular ODEs which arise in the separation of variables. In particular, we show that the integrand in our representation has no poles on the real axis. We call such poles radiant modes, because in a dynamical situation they would lead to continuous radiation coming out of the ergosphere.

We now set up some notation and state our main result. As in [5], we choose Boyer-Lindquist coordinates $(t, r, \vartheta, \varphi)$ with $r > 0$, $0 \le \vartheta \le \pi$, $0 \le \varphi < 2\pi$, in which the Kerr metric takes the form
\[ ds^2 = \frac{\Delta}{U}\, (dt - a \sin^2\vartheta \, d\varphi)^2 - U \left( \frac{dr^2}{\Delta} + d\vartheta^2 \right) - \frac{\sin^2\vartheta}{U}\, \big(a\, dt - (r^2 + a^2)\, d\varphi\big)^2 \tag{1.1} \]
with
\[ U(r, \vartheta) = r^2 + a^2 \cos^2\vartheta, \qquad \Delta(r) = r^2 - 2Mr + a^2, \]
where $M$ and $aM$ denote the mass and the angular momentum of the black hole, respectively. We restrict attention to the non-extreme case $M^2 > a^2$, where the function $\Delta$ has two distinct zeros,
\[ r_0 = M - \sqrt{M^2 - a^2} \qquad \text{and} \qquad r_1 = M + \sqrt{M^2 - a^2}, \]
corresponding to the Cauchy and the event horizon, respectively. We consider only the region $r > r_1$ outside the event horizon, and thus $\Delta > 0$. The ergosphere is the region where the Killing vector $\frac{\partial}{\partial t}$ is space-like, that is where
\[ r^2 - 2Mr + a^2 \cos^2\vartheta < 0. \tag{1.2} \]
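The horizon radii and the ergosphere condition (1.2) are easy to evaluate numerically. The following is a minimal sketch, not part of the original paper; the function names and the sample values M = 1, a = 0.6 are assumptions chosen for illustration only.

```python
import math

def kerr_horizons(M, a):
    """Return (r0, r1), the two zeros of Delta(r) = r^2 - 2 M r + a^2."""
    if a * a >= M * M:
        raise ValueError("the non-extreme case requires M^2 > a^2")
    root = math.sqrt(M * M - a * a)
    return M - root, M + root

def delta(r, M, a):
    """Delta(r) = r^2 - 2 M r + a^2."""
    return r * r - 2.0 * M * r + a * a

def in_ergosphere(r, theta, M, a):
    """True if (r, theta) lies outside the event horizon and satisfies (1.2)."""
    _, r1 = kerr_horizons(M, a)
    return r > r1 and r * r - 2.0 * M * r + (a * math.cos(theta)) ** 2 < 0.0

# Hypothetical sample parameters (illustration only).
r0, r1 = kerr_horizons(1.0, 0.6)
```

With these sample values, a point slightly outside the event horizon lies in the ergosphere at the equator but not on the rotation axis, in line with the statement that the boundary of the ergosphere touches the horizon at the poles.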

The ergosphere lies outside the event horizon $r = r_1$, and its boundary intersects the event horizon at the poles $\vartheta = 0, \pi$.

Theorem 1.1. Consider the Cauchy problem for the wave equation in the Kerr geometry for smooth initial data which is compactly supported outside the event horizon and has fixed angular momentum in the direction of the rotation axis of the black hole, i.e. for some $k \in \mathbb{Z}$,
\[ (\Phi_0, \partial_t \Phi_0) = e^{-ik\varphi}\, (\Phi_0, \partial_t \Phi_0)(r, \vartheta) \in C^\infty_0((r_1, \infty) \times S^2)^2. \]
Then the solution decays in $L^\infty_{\mathrm{loc}}((r_1, \infty) \times S^2)$ as $t \to \infty$.

The study of linear hyperbolic equations in a black hole geometry has a long history. Regge and Wheeler [11] considered the radial equation for metric perturbations of the Schwarzschild metric. In the late 1960s and early 1970s, Carter, Teukolsky and Chandrasekhar discovered that the equations describing scalar, Dirac, Maxwell and linearized gravitational fields in the Kerr geometry are separable into ordinary differential equations (see [2]). Much research has been done concerning the long-time behavior of the solutions of these equations, through both numerical and analytical methods. Price [9] gave arguments which indicated decay of solutions of the scalar wave equation in the Schwarzschild geometry. Press and Teukolsky [8] did a numerical study which strongly suggested the absence of unstable modes, and Whiting [10] later proved that for ω in the complex plane, such unstable modes cannot exist. This “mode stability” does not rule out that there might be unstable modes for real ω (what we call radiant modes). Furthermore, mode stability does not lead to any statement on the Cauchy problem. Finally, Kay and Wald [7] used energy estimates to prove a boundedness result for solutions of the scalar wave equation in the Schwarzschild geometry. Unfortunately, these energy methods cannot be used in a rotating black hole geometry, because the energy density is indefinite inside the ergosphere, making it impossible to introduce a positive definite conserved scalar product. This difficulty was dealt with in [5, 6], where Whiting’s mode stability result was combined with estimates for the resolvent and for the radial and angular ODEs. In [5] we established an integral representation which expresses the solution as a contour integral of an integrand involving the separated radial and angular eigenfunctions over a contour staying within a neighborhood located arbitrarily close to the real axis. This integral representation is the starting point of the present paper. 
After deforming the contours onto the real axis, we can prove decay using the Riemann-Lebesgue Lemma, similar to the case of the Dirac equation [4]. We remark that the decay result of our paper would not be expected to hold for a massive scalar field satisfying the Klein-Gordon equation, as indicated in [1]. Finally, we note that the problem considered here is closely related to one of the major open questions in general relativity, namely the problem of linearized stability of the Kerr metric. For the stability under metric perturbations one considers the equation for linearized gravitational waves, which can be identified with the general wave equation for spin $s = 2$ (see [2]). Thus, with scalar waves ($s = 0$) replaced by gravitational waves ($s = 2$), the analog of the above theorem would prove linearized stability of the Kerr metric. However, the analysis for $s = 2$ would be considerably more difficult due to the complexity of the linearized Einstein equations. Nevertheless, we regard this paper as a first step towards proving linearized stability of the Kerr metric.

2. Preliminaries

We recall a few constructions and results from [5, 6] which will be needed later on. As radial variable we usually work with the Regge-Wheeler variable $u \in \mathbb{R}$ defined by
\[ \frac{du}{dr} = \frac{r^2 + a^2}{\Delta}; \tag{2.1} \]
then $u = -\infty$ corresponds to the event horizon. It is most convenient to write the wave equation in the Hamiltonian form
\[ i\, \partial_t \Psi = H \Psi, \tag{2.2} \]

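where $\Psi = (\Phi, i\partial_t \Phi)$. The Hamiltonian can be written as
\[ H = \begin{pmatrix} 0 & 1 \\ A & \beta \end{pmatrix}, \tag{2.3} \]
where
\[ A = \frac{1}{\rho} \left[ -\frac{\partial}{\partial u} (r^2 + a^2) \frac{\partial}{\partial u} - \frac{\Delta}{r^2 + a^2}\, \Delta_{S^2} - \frac{a^2 k^2}{r^2 + a^2} \right], \tag{2.4} \]
\[ \beta = -\frac{2ak}{\rho} \left( 1 - \frac{\Delta}{r^2 + a^2} \right), \tag{2.5} \]
\[ \rho = r^2 + a^2 - a^2 \sin^2\vartheta\, \frac{\Delta}{r^2 + a^2}. \tag{2.6} \]
The operators $A$ and $\beta$ are symmetric on the Hilbert space $L^2(\mathbb{R} \times S^2, d\mu)^2$ with the measure
\[ d\mu := \rho\, du\, d\cos\vartheta. \tag{2.7} \]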

It is immediately verified that the Hamiltonian is symmetric with respect to the bilinear form
\[ <\!\Psi_1, \Psi_2\!> \;=\; \int_{\mathbb{R} \times S^2} \left\langle \Psi_1, \begin{pmatrix} A & 0 \\ 0 & 1 \end{pmatrix} \Psi_2 \right\rangle_{\mathbb{C}^2} d\mu. \tag{2.8} \]
As is worked out in detail in [5], the inner product $<\!\Psi, \Psi\!>$ is the physical energy of $\Psi$. Therefore, we refer to $<\!.,.\!>$ as the energy scalar product. The fact that the energy scalar product is not positive definite can be understood from the fact that the operator $A$ is not positive on $L^2(\mathbb{R} \times S^2, d\mu)$. Using the ansatz
\[ \Phi(t, r, \vartheta, \varphi) = e^{-i\omega t - ik\varphi}\, R(r)\, \Theta(\vartheta), \tag{2.9} \]
the wave equation can be separated into an angular and a radial ODE,
\[ \mathcal{R}_{\omega,k}\, R_\lambda = -\lambda\, R_\lambda, \qquad \mathcal{A}_{\omega,k}\, \Theta_\lambda = \lambda\, \Theta_\lambda. \tag{2.10} \]
Here the angular operator $\mathcal{A}_{\omega,k}$ is also called the spheroidal wave operator. The separation constant $\lambda$ is an eigenvalue of $\mathcal{A}_{\omega,k}$ and can thus be regarded as an angular quantum number. In [6] it was shown that if $\omega$ is in a small neighborhood of the real line, more precisely if
\[ \omega \in U_\varepsilon := \left\{ \omega \in \mathbb{C} \;\Big|\; |\mathrm{Im}\, \omega| < \frac{\varepsilon}{1 + |\mathrm{Re}\, \omega|} \right\}, \]
then for sufficiently small $\varepsilon > 0$ the angular operator $\mathcal{A}_{\omega,k}$ has a purely discrete spectrum $(\lambda_n)_{n \in \mathbb{N}}$ with corresponding one-dimensional eigenspaces which span the Hilbert space $L^2(S^2)$. We denote the projections onto the eigenspaces by $Q_n(k, \omega)$. These projections as well as the corresponding eigenvalues $\lambda_n$ are holomorphic in $\omega \in U_\varepsilon$. In analogy to the eigenvalues $l(l+1)$ of the Laplacian on the sphere, the angular eigenvalues $\lambda_n$ grow quadratically for large $n$ in the sense that there is a constant $C(k, \omega) > 0$ such that
\[ |\lambda_n(k, \omega)| \ge \frac{n^2}{C(k, \omega)} \qquad \text{for all } n \in \mathbb{N}. \tag{2.11} \]
We set
\[ \omega_0 = -\frac{ak}{r_1^2 + a^2} \tag{2.12} \]
with $r_1$ the event horizon and use the notation
\[ \Omega(\omega) = \omega - \omega_0. \tag{2.13} \]

In order to bring the radial equation into a convenient form, we introduce a new radial function $\phi(r)$ by
\[ \phi(r) = \sqrt{r^2 + a^2}\; R(r). \]
Then in the Regge-Wheeler variable, the radial equation can be written as the “Schrödinger equation”
\[ \left( -\frac{d^2}{du^2} + V(u) \right) \phi(u) = 0 \tag{2.14} \]
with the potential
\[ V(u) = -\left( \omega + \frac{ak}{r^2 + a^2} \right)^2 + \frac{\lambda_n(\omega)\, \Delta}{(r^2 + a^2)^2} + \frac{1}{\sqrt{r^2 + a^2}}\, \partial_u^2 \sqrt{r^2 + a^2}. \tag{2.15} \]
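The Regge-Wheeler variable (2.1) and the potential (2.15) can be evaluated in closed form as functions of r. The sketch below is an illustration under stated assumptions: the antiderivative in `regge_wheeler_u` fixes the free integration constant to zero, the eigenvalue λ_n(ω) is replaced by a plain real parameter `lam`, and all function names and sample values are hypothetical. It uses d/du = (Δ/(r² + a²)) d/dr to compute the last term of (2.15).

```python
import math

def horizons(M, a):
    root = math.sqrt(M * M - a * a)
    return M - root, M + root

def regge_wheeler_u(r, M, a):
    """One antiderivative of (r^2 + a^2)/Delta for r > r1 (integration constant set to zero)."""
    r0, r1 = horizons(M, a)
    return (r
            + (r1 * r1 + a * a) / (r1 - r0) * math.log(r - r1)
            - (r0 * r0 + a * a) / (r1 - r0) * math.log(r - r0))

def potential_V(r, M, a, k, omega, lam):
    """Potential (2.15) as a function of r; lam stands in for lambda_n(omega)."""
    s = r * r + a * a                    # r^2 + a^2
    D = r * r - 2.0 * M * r + a * a      # Delta
    dD = 2.0 * r - 2.0 * M               # dDelta/dr
    # (1/sqrt(s)) * d^2/du^2 sqrt(s), with d/du = (D/s) d/dr
    last = (D / s) * ((dD * r + D) * s ** -1.5 - 3.0 * D * r * r * s ** -2.5) / math.sqrt(s)
    return -(omega + a * k / s) ** 2 + lam * D / s ** 2 + last
```

Near the event horizon the potential approaches −Ω², and at large r it approaches −ω², which is what the boundary conditions for the radial solutions in the next section rely on.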

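In [5] we derived an integral representation for the solution of the Cauchy problem of the following form,
\[ \Phi(t, r, \vartheta, \varphi) = -\frac{1}{2\pi i} \sum_{k \in \mathbb{Z}} e^{-ik\varphi} \sum_{n \in \mathbb{N}} \lim_{\varepsilon \searrow 0} \left( \int_{C_\varepsilon} - \int_{\overline{C_\varepsilon}} \right) d\omega\; e^{-i\omega t}\, \big(Q_{k,n}(\omega)\, S_\infty(\omega)\, \Psi_0^k\big)(r, \vartheta). \tag{2.16} \]
Here the integration contour $C_\varepsilon$ must lie inside the set $U_\varepsilon$.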


3. Asymptotic Estimates for the Radial Equation

3.1. Holomorphic families of radial solutions. In this section we fix the angular quantum numbers $k$, $n$ and consider solutions $\acute{\phi}$ and $\grave{\phi}$ of the Schrödinger equation (2.14) which satisfy the following asymptotic boundary conditions on the event horizon and at infinity, respectively,
\[ \lim_{u \to -\infty} e^{-i\Omega u}\, \acute{\phi}(u) = 1, \qquad \lim_{u \to -\infty} \big( e^{-i\Omega u}\, \acute{\phi}(u) \big)' = 0, \tag{3.1} \]
\[ \lim_{u \to \infty} e^{i\omega u}\, \grave{\phi}(u) = 1, \qquad \lim_{u \to \infty} \big( e^{i\omega u}\, \grave{\phi}(u) \big)' = 0. \tag{3.2} \]
These solutions were introduced in [5] for $\omega$ in the lower complex half plane intersected with $U_\varepsilon$. Here we will show that they are holomorphic in $\omega$, and we will extend their definition to a larger $\omega$-domain. More precisely, we prove the following two theorems.

Theorem 3.1. The solutions $\acute{\phi}$ are well-defined on the domain
\[ D = U_\varepsilon \cap \left\{ \omega \in \mathbb{C} \;\Big|\; \mathrm{Im}\, \omega \le \frac{r_1 - r_0}{2(r_1^2 + a^2)} \right\}. \]
They form a holomorphic family of solutions in the sense that for every fixed $u \in \mathbb{R}$ and $n \in \mathbb{N}$, the function $\acute{\phi}(u)$ is holomorphic in $\omega \in D$.

Theorem 3.2. For every angular momentum number $n$ there is an open set $E$ containing the real line except for the origin,
\[ E \supset E_0 := U_\varepsilon \cap \{ \omega \in \mathbb{C} \;|\; \mathrm{Im}\, \omega \le 0 \text{ and } \omega \ne 0 \}, \tag{3.3} \]
such that the solutions $\grave{\phi}$ are well-defined for all $\omega \in E$ and form a holomorphic family on $E$.

For the proofs we will rewrite the Schrödinger equation with boundary conditions (3.1, 3.2) as an integral equation (which in different contexts is called the Lippmann-Schwinger or Jost equation). Then we will perform a perturbation expansion and get estimates for all the terms of the expansion. To introduce the method, we begin with the solutions $\acute{\phi}$; the solutions $\grave{\phi}$ will be treated later with a similar technique. First we write the Schrödinger equation (2.14) in the form
\[ \left( -\frac{d^2}{du^2} - \Omega^2 \right) \acute{\phi}(u) = -W(u)\, \acute{\phi}(u) \tag{3.4} \]
with a potential $W = \Omega^2 + V(u)$ which vanishes at $u = -\infty$. We define the Green's function of the differential operator $-\partial_u^2 - \Omega^2$ by the distributional equation
\[ (-\partial_v^2 - \Omega^2)\, S(u, v) = \delta(u - v). \tag{3.5} \]
The Green's function is not unique; we choose it such that its support is contained in the region $v \le u$; i.e.
\[ S(u, v) = \Theta(u - v) \times \begin{cases} \dfrac{1}{2i\Omega} \left( e^{-i\Omega(u-v)} - e^{i\Omega(u-v)} \right) & \text{if } \Omega \ne 0 \\[1ex] v - u & \text{if } \Omega = 0. \end{cases} \tag{3.6} \]

(Here $\Theta$ denotes the Heaviside function defined by $\Theta(x) = 1$ if $x \ge 0$ and $\Theta(x) = 0$ otherwise.) We multiply (3.4) by the Green's function and integrate,
\[ \int_{-\infty}^\infty S(u, v)\, (-\partial_v^2 - \Omega^2)\, \big( \acute{\phi}(v) - e^{i\Omega v} \big)\, dv = -\int_{-\infty}^\infty S(u, v)\, W(v)\, \acute{\phi}(v)\, dv. \]
If we assume for the moment that $\acute{\phi}$ satisfies the desired boundary conditions (3.1), we can integrate by parts on the left and use (3.5). This gives the Lippmann-Schwinger equation
\[ \acute{\phi}(u) = e^{i\Omega u} - \int_{-\infty}^u S(u, v)\, W(v)\, \acute{\phi}(v)\, dv, \]
which in the context of potential scattering is also called the Jost equation (see e.g. [3]). Its significance lies in the fact that we can now easily perform a perturbation expansion in the potential $W$. Namely, taking for $\acute{\phi}$ the ansatz as the perturbation series
\[ \acute{\phi} = \sum_{l=0}^\infty \phi^{(l)}, \tag{3.7} \]
we are led to the iteration scheme
\[ \phi^{(0)}(u) = e^{i\Omega u}, \qquad \phi^{(l+1)}(u) = -\int_{-\infty}^u S(u, v)\, W(v)\, \phi^{(l)}(v)\, dv. \tag{3.8} \]
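One step of the iteration scheme (3.8) can be sketched numerically as follows. This is an illustration only: the model potential W(v) = c e^{γv}, the parameter values, the truncation of the integral to a finite interval, and the trapezoidal rule are all assumptions made for this sketch and are not taken from the paper. For this model potential and real Ω, the first iterate has the closed form c e^{(γ+iΩ)u} / (γ² + 2iγΩ), which gives an independent check.

```python
import cmath
import math

def green_S(u, v, Omega):
    """Green's function (3.6), supported in the region v <= u."""
    if v > u:
        return 0.0
    if Omega == 0:
        return v - u
    return (cmath.exp(-1j * Omega * (u - v)) - cmath.exp(1j * Omega * (u - v))) / (2j * Omega)

def next_iterate(phi_l, W, u, Omega, length=40.0, steps=20000):
    """phi^(l+1)(u) = - int_{-inf}^{u} S(u,v) W(v) phi^(l)(v) dv, approximated by the
    trapezoidal rule on the truncated interval [u - length, u]."""
    h = length / steps
    total = 0.0
    for j in range(steps + 1):
        v = u - length + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * green_S(u, v, Omega) * W(v) * phi_l(v)
    return -h * total

# Hypothetical model data: exponentially decaying potential and real Omega.
c, gamma, Omega = 0.3, 1.0, 0.7
W = lambda v: c * math.exp(gamma * v)
phi0 = lambda v: cmath.exp(1j * Omega * v)

u = -1.0
phi1_numeric = next_iterate(phi0, W, u, Omega)
phi1_exact = c * cmath.exp((gamma + 1j * Omega) * u) / (gamma * gamma + 2j * gamma * Omega)
mu = c * math.exp(gamma * u) / gamma ** 2
```

For real Ω the quantity `mu` is the constant appearing in the inductive bound of the proof below, so `abs(phi1_numeric) <= mu` is the case l = 1 of that bound in this model.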

This iteration scheme can be used for constructing solutions of the Jost equation, and this will give us the functions $\acute{\phi}$ with the desired properties.

Proof of Theorem 3.1. Fix $\omega \in D$. As the potential $W$ is smooth in $r$ and vanishes on the event horizon, we know that $W$ has near $r_1$ the asymptotics $W = O(r - r_1)$. This means in the Regge-Wheeler variable (2.1) that $W$ decays exponentially as $u \to -\infty$. More precisely, there is a constant $c > 0$ such that
\[ |W(u)| \le c\, e^{\gamma u} \qquad \text{with} \qquad \gamma := \frac{r_1 - r_0}{r_1^2 + a^2}. \tag{3.9} \]
Let us show inductively that
\[ |\phi^{(l)}(u)| \le \mu^l\, e^{-\mathrm{Im}\, \Omega\, u} \qquad \text{with} \qquad \mu := \frac{c\, e^{\gamma u}}{(\gamma - \mathrm{Im}\, \Omega - |\mathrm{Im}\, \Omega|)^2}. \tag{3.10} \]
In the case $l = 0$, the claim is obvious from (3.8). Thus assume that (3.10) holds for a given $l$. Then, estimating the integral equation in (3.8) using (3.9), we obtain
\[ |\phi^{(l+1)}(u)| \le c\, \mu^l \int_{-\infty}^u |S(u, v)|\, e^{(\gamma - \mathrm{Im}\, \Omega)\, v}\, dv. \tag{3.11} \]
The Green's function (3.6) can be estimated in the case $v \le u$ by
\[ |S(u, v)| = \frac{u - v}{2} \left| \int_{-1}^{1} e^{-i\Omega (u - v)\, \tau}\, d\tau \right| \le (u - v)\, e^{|\mathrm{Im}\, \Omega|\, (u - v)}. \]


Substituting this inequality in (3.11) gives
\[ |\phi^{(l+1)}(u)| \le c\, \mu^l\, e^{|\mathrm{Im}\, \Omega|\, u} \int_{-\infty}^u (u - v)\, e^{(\gamma - \mathrm{Im}\, \Omega - |\mathrm{Im}\, \Omega|)\, v}\, dv. \]
Since the parameter $\alpha := \gamma - \mathrm{Im}\, \Omega - |\mathrm{Im}\, \Omega|$ is positive according to the definition of $D$, we can carry out the integral as follows,
\[ \int_{-\infty}^u (u - v)\, e^{\alpha v}\, dv = \left( u - \frac{d}{d\alpha} \right) \int_{-\infty}^u e^{\alpha v}\, dv = \left( u - \frac{d}{d\alpha} \right) \frac{e^{\alpha u}}{\alpha} = \frac{e^{\alpha u}}{\alpha^2}. \]
This gives (3.10) with $l$ replaced by $l + 1$.

Since for $u$ on a compact interval, the analytic dependence of the solutions in $\omega$ from the coefficients and the initial conditions follows immediately from the Picard-Lindelöf Theorem, it suffices to consider the region $u < u_0$ for any $u_0 \in \mathbb{R}$. By choosing $u_0$ sufficiently small, we can arrange that $\mu < 1/2$ for all $u < u_0$. Then the estimate (3.10) shows that the perturbation series (3.7) converges absolutely, uniformly in $u \in (-\infty, u_0)$. Using similar estimates for the $u$-derivatives of $\phi^{(l)}$, one sees furthermore that the perturbation series (3.7) can be differentiated term by term, and using (3.5) we find that $\acute{\phi}$ is indeed a solution of (3.4). Furthermore,
\[ \acute{\phi}(u) - e^{i\Omega u} = \sum_{l=1}^\infty \phi^{(l)}(u), \]
and taking the limit $u \to -\infty$ and using (3.10) we find that the right side goes to zero. Using the same argument for the first derivatives, we obtain (3.1). In order to prove that $\acute{\phi}$ is analytic in $\omega$, we first note that if $\Omega \ne 0$, we can differentiate the perturbation series (3.7) term by term and verify that the Cauchy-Riemann equations are satisfied (note that $\lambda_n$ is holomorphic in $\omega$ according to [6]). Since $\acute{\phi}$ is bounded near $\Omega = 0$, it is also analytic at $\Omega = 0$.

We turn to the solutions $\grave{\phi}$. In analogy to (3.4), we now write the Schrödinger equation as
\[ \left( -\frac{d^2}{du^2} - \omega^2 \right) \grave{\phi}(u) = -W(u)\, \grave{\phi}(u) \tag{3.12} \]
with
\[ W(u) = -\omega\, \frac{2ak}{r^2 + a^2} - \frac{(ak)^2}{(r^2 + a^2)^2} + \frac{\lambda_n\, \Delta}{(r^2 + a^2)^2} + \frac{1}{\sqrt{r^2 + a^2}}\, \partial_u^2 \sqrt{r^2 + a^2}. \tag{3.13} \]
Assuming that $\omega \ne 0$, we choose the Green's function as
\[ S(u, v) = \frac{1}{2i\omega} \left( e^{-i\omega(v - u)} - e^{i\omega(v - u)} \right) \Theta(v - u). \tag{3.14} \]
The corresponding Jost equation is
\[ \grave{\phi}(u) = e^{-i\omega u} - \int_u^\infty S(u, v)\, W(v)\, \grave{\phi}(v)\, dv. \]

The perturbation series ansatz
\[ \grave{\phi} = \sum_{l=0}^\infty \phi^{(l)} \tag{3.15} \]
leads to the iteration scheme
\[ \phi^{(0)}(u) = e^{-i\omega u}, \qquad \phi^{(l+1)}(u) = -\int_u^\infty S(u, v)\, W(v)\, \phi^{(l)}(v)\, dv. \tag{3.16} \]
Note that, in contrast to the exponential decay (3.9), now the potential $W$, (3.13), has only polynomial decay. As a consequence, the iteration scheme allows us to construct $\grave{\phi}$ only inside the set $E_0$ as defined in (3.3).

Lemma 3.3. The solutions $\grave{\phi}$ are well-defined for every $\omega \in E_0$. They form a holomorphic family in the interior of $E_0$.

Proof. Fix $\omega \in E_0$. Then $\omega \ne 0$ and $\mathrm{Im}\, \omega \le 0$, and this allows us to estimate the potential (3.13) and the Green's function (3.14) for $u, v > u_0$ and some $u_0 > 0$ by
\[ |W(v)| \le \frac{c}{v^2}, \qquad |S(u, v)| \le \frac{1}{|\omega|}\, e^{\mathrm{Im}\, \omega\, (u - v)}. \tag{3.17} \]
Let us show by induction that
\[ |\phi^{(l)}(u)| \le \frac{1}{l!} \left( \frac{c}{|\omega|\, u} \right)^l e^{\mathrm{Im}\, \omega\, u}. \]
For $l = 0$ this is obvious from (3.16), whereas the induction step follows by estimating the integral equation in (3.16) with (3.17),
\[ |\phi^{(l+1)}(u)| \le \frac{1}{l!} \left( \frac{c}{|\omega|} \right)^l \int_u^\infty \frac{1}{|\omega|}\, e^{\mathrm{Im}\, \omega\, (u - v)}\, \frac{c}{v^{2+l}}\, e^{\mathrm{Im}\, \omega\, v}\, dv = \frac{1}{(l+1)!} \left( \frac{c}{|\omega|\, u} \right)^{l+1} e^{\mathrm{Im}\, \omega\, u}. \]
Hence the perturbation series (3.15) converges absolutely, locally uniformly in $u$. It is straightforward to check that $\grave{\phi}$ satisfies the Schrödinger equation (3.12) with the correct boundary values (3.2). If $\mathrm{Im}\, \omega < 0$, one can differentiate the series (3.15) term by term with respect to $\omega$ and verify that the Cauchy-Riemann equations are satisfied.

It remains to analytically extend the solutions $\grave{\phi}$ for fixed $n$ to a neighborhood of any point $\omega_0 \in \mathbb{R} \setminus \{0\}$. To this end, we need good estimates of the derivatives of $\grave{\phi}$ with respect to $\omega$ and $u$. It is most convenient to work with the functions
\[ \psi^{(l)}(u) := (2i\omega)^l\, e^{i\omega u}\, \phi^{(l)}(u), \tag{3.18} \]
for which the iteration scheme (3.16) can be written as $\psi^{(0)} = 1$ and
\[ \psi^{(l+1)}(u) = \int_u^\infty \left( e^{-2i\omega(v - u)} - 1 \right) W(v)\, \psi^{(l)}(v)\, dv. \tag{3.19} \]


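Lemma 3.4. For every $\omega_0 \in \mathbb{R} \setminus \{0\}$ and $n \in \mathbb{N}$, there are positive constants $c, K, \delta$, such that for all $\omega \in E_0 \cap B_\delta(\omega_0)$ with $\mathrm{Im}\, \omega < 0$ and all $p, q, l \in \mathbb{N}$ the following inequality holds,
\[ \left| \left( \frac{\partial}{\partial \omega} \right)^p \left( \frac{\partial}{\partial u} \right)^q \psi^{(l)}(u) \right| \le c^{1+l+p}\, K^q\, p!\, q!\, \frac{1}{l!\, u^{l+q}}. \tag{3.20} \]

Proof. According to [6], $\lambda_n$ is holomorphic in a neighborhood of $\omega_0$, and thus (for example using the Cauchy integral formula) its derivatives can be bounded in $B_\delta(\omega_0)$ by
\[ |\partial_\omega^p \lambda_n(\omega)| \le p! \left( \frac{K}{2} \right)^{1+p} \]
for suitable $K > 0$. Since the potential $W$, (3.13), is also holomorphic in $r$ (in a suitable neighborhood of the positive real axis) and has quadratic decay, its derivatives can be estimated by
\[ |\partial_\omega^p \partial_u^q W(u)| \le \frac{p!\, q!}{u^{2+q}} \left( \frac{K}{2} \right)^{1+p+q}. \tag{3.21} \]
We choose $c$ so large that the following conditions hold,
\[ c > 16\, K, \qquad \frac{K\, e^{1/K}}{(\omega_0 - \delta)\, c} \le \frac{1}{2}. \tag{3.22} \]
We proceed to prove (3.20) by induction in $l$. For $l = 0$ there is nothing to prove. Thus assume that (3.20) holds for a given $l$. Using the induction hypothesis together with (3.21), we can then estimate the derivatives of the product $W \psi^{(l)}$ as follows,
\[ |\partial_\omega^p \partial_u^q (W \psi^{(l)})| \le \sum_{a=0}^p \sum_{b=0}^q \binom{p}{a} \binom{q}{b} \left( \frac{K}{2} \right)^{1+a+b} \frac{a!\, b!}{u^{2+b}} \times \frac{c^{1+l+p-a}\, K^{q-b}\, (p-a)!\, (q-b)!}{l!\; u^{l+q-b}} \]
\[ \le \frac{c^{1+l+p}\, K^{1+q}}{u^{2+l+q}}\, \frac{p!\, q!}{l!} \sum_{a=0}^p \left( \frac{K}{2c} \right)^a \sum_{b=0}^q \left( \frac{1}{2} \right)^b. \]
According to (3.22), the two remaining sums can be bounded by the geometric series $\sum_{m=0}^\infty 2^{-m} = 2$, and thus
\[ |\partial_\omega^p \partial_u^q (W \psi^{(l)})| \le 4\, \frac{c^{1+l+p}\, K^{1+q}}{u^{2+l+q}}\, \frac{p!\, q!}{l!}. \tag{3.23} \]
Next we differentiate the integral equation (3.19),
\[ \partial_\omega^p \partial_u^q \psi^{(l+1)}(u) = \sum_{r=0}^p \binom{p}{r} \int_{-\infty}^\infty \partial_\omega^r \partial_u^q \left[ \Theta(v - u) \left( e^{-2i\omega(v - u)} - 1 \right) \right] \times \partial_\omega^{p-r} \big( W \psi^{(l)} \big)(v)\, dv \]
(note that, since $\mathrm{Im}\, \omega < 0$, the factor $e^{-2i\omega v}$ gives an exponential decay of the integrand as $v \to \infty$). After manipulating the partial derivatives as follows,
\[ \partial_\omega^r \partial_u^q \left[ \Theta(v - u) \left( e^{-2i\omega(v - u)} - 1 \right) \right] = (-\partial_v)^q \left[ \Theta(v - u) \left( \frac{v - u}{\omega}\, \partial_v \right)^r \left( e^{-2i\omega(v - u)} - 1 \right) \right], \]
the resulting $v$-derivatives can all be integrated by parts. The boundary terms drop out, and we obtain
\[ \partial_\omega^p \partial_u^q \psi^{(l+1)}(u) = \sum_{r=0}^p \binom{p}{r} \int_u^\infty \left( e^{-2i\omega(v - u)} - 1 \right) \left\{ \left( \partial_v\, \frac{u - v}{\omega} \right)^r \partial_v^q\, \partial_\omega^{p-r} \right\} \big( W \psi^{(l)} \big)(v)\, dv. \]
Since $\omega$ is in the lower half plane, we have the inequality $|e^{-2i\omega(v - u)}| \le 1$. We conclude that
\[ |\partial_\omega^p \partial_u^q \psi^{(l+1)}(u)| \le 2 \sum_{r=0}^p \binom{p}{r} \int_u^\infty \left| \left\{ \left( \partial_v\, \frac{u - v}{\omega} \right)^r \partial_v^q\, \partial_\omega^{p-r} \right\} \big( W \psi^{(l)} \big)(v) \right| dv. \tag{3.24} \]
The $v$-derivatives in the curly brackets can act either on one of the factors $(u - v)$ or on the function $W \psi^{(l)}$. Taking into account the combinatorics, we obtain
\[ |\partial_\omega^p \partial_u^q \psi^{(l+1)}(u)| \le 2 \sum_{r=0}^p \binom{p}{r} \frac{1}{\omega^r} \sum_{s=0}^r \binom{r}{s}\, r^{r-s} \int_u^\infty (v - u)^s\, \big| \partial_v^{q+s}\, \partial_\omega^{p-r}\, W \psi^{(l)} \big|\, dv. \]
Using (3.23), we get
\[ |\partial_\omega^p \partial_u^q \psi^{(l+1)}(u)| \le 8 \sum_{r=0}^p \sum_{s=0}^r \frac{p!\, (q+s)!}{s!\, (r-s)!\, l!}\; \omega^{-r}\, r^{r-s}\, c^{1+l+p-r}\, K^{1+q+s} \times \int_u^\infty \frac{(v - u)^s}{v^{2+l+q+s}}\, dv. \]
Introducing the new variable $\tau = \frac{u}{v}$, the integral can be computed with iterative integrations by parts,
\[ \int_u^\infty \frac{(v - u)^s}{v^{2+l+q+s}}\, dv = \frac{1}{u^{1+l+q}} \int_0^1 (1 - \tau)^s\, \tau^{l+q}\, d\tau = \frac{1}{u^{1+l+q}}\, \frac{(l+q)!}{(l+q+s)!} \int_0^1 (1 - \tau)^s\, \frac{d^s}{d\tau^s}\, \tau^{l+q+s}\, d\tau \]
\[ = \frac{1}{u^{1+l+q}}\, \frac{(l+q)!\; s!}{(l+q+s)!} \int_0^1 \tau^{l+q+s}\, d\tau = \frac{1}{u^{1+l+q}}\, \frac{(l+q)!\; s!}{(1+l+q+s)!}. \]
We thus obtain
\[ |\partial_\omega^p \partial_u^q \psi^{(l+1)}(u)| \le 8\, \frac{c^{1+l+p}\, K^{1+q}}{u^{1+l+q}} \sum_{r=0}^p \sum_{s=0}^r \frac{p!\, (q+s)!}{(r-s)!\; l!}\; \frac{K^s\, r^{r-s}}{(\omega c)^r}\; \frac{(l+q)!}{(1+l+q+s)!}. \]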
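The value of the τ-integral above, and the factorial ratio (q+s)! (l+q)! / (1+l+q+s)! that it produces in the double sum, can be checked with exact rational arithmetic. The following sketch is an illustration only; the function names are hypothetical, and the integral is evaluated by expanding (1 − τ)^s with the binomial theorem rather than by the integration by parts used in the proof.

```python
from fractions import Fraction
from math import comb, factorial

def tau_integral(m, s):
    """Exact value of int_0^1 (1 - tau)^s tau^m d tau, by expanding (1 - tau)^s."""
    return sum(Fraction((-1) ** j * comb(s, j), m + j + 1) for j in range(s + 1))

def closed_form(m, s):
    """The value m! s! / (m + s + 1)! obtained in the proof (with m = l + q)."""
    return Fraction(factorial(m) * factorial(s), factorial(m + s + 1))

def factorial_ratio(l, q, s):
    """(q+s)! (l+q)! / (1+l+q+s)!, the ratio appearing in the double sum."""
    return Fraction(factorial(q + s) * factorial(l + q), factorial(1 + l + q + s))
```

For small indices one can confirm that `tau_integral(m, s) == closed_form(m, s)` and that `factorial_ratio(l, q, s)` never exceeds q!/(l+1).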


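Using the elementary estimate
\[ \frac{(q+s)!\; (l+q)!}{(1+l+q+s)!} = \frac{q!}{q+l+1} \cdot \frac{q+1}{q+l+2} \cdots \frac{q+s}{q+l+s+1} \le \frac{q!}{l+1}, \]
we obtain
\[ |\partial_\omega^p \partial_u^q \psi^{(l+1)}(u)| \le 8\, \frac{c^{1+l+p}\, K^{1+q}}{u^{1+l+q}}\, \frac{p!\, q!}{(l+1)!} \sum_{r=0}^p \left( \frac{K}{\omega c} \right)^r \sum_{s=0}^r \frac{1}{(r-s)!} \left( \frac{r}{K} \right)^{r-s}. \]
The last sum can be estimated by an exponential,
\[ \sum_{s=0}^r \frac{1}{(r-s)!} \left( \frac{r}{K} \right)^{r-s} = \sum_{a=0}^r \frac{1}{a!} \left( \frac{r}{K} \right)^a \le \sum_{a=0}^\infty \frac{1}{a!} \left( \frac{r}{K} \right)^a = \exp\!\left( \frac{r}{K} \right). \]
According to (3.22), we can now estimate the remaining sum over $r$ by a geometric series,
\[ \sum_{r=0}^p \left( \frac{K}{\omega c} \right)^r \exp\!\left( \frac{r}{K} \right) \le \sum_{r=0}^\infty \left( \frac{K\, e^{1/K}}{\omega c} \right)^r \le 2. \]
We thus obtain
\[ |\partial_\omega^p \partial_u^q \psi^{(l+1)}(u)| \le 16\, \frac{c^{1+l+p}\, K^{1+q}\, p!\, q!}{u^{1+l+q}\, (l+1)!} \le \frac{c^{2+l+p}\, K^q\, p!\, q!}{u^{1+l+q}\, (l+1)!}, \]
where in the last step we again used (3.22).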

Proof of Theorem 3.2. According to (3.15, 3.18),
\[ \grave{\phi}(\omega, u) = e^{-i\omega u} \sum_{l=0}^\infty \frac{1}{(2i\omega)^l}\, \psi^{(l)}(\omega, u). \]
Expanding $\psi^{(l)}$ in a Taylor series in $\omega$, we obtain the formal expansion
\[ \grave{\phi}(\omega + \zeta, u) = e^{-i\omega u} \sum_{l=0}^\infty \frac{1}{(2i(\omega + \zeta))^l} \sum_{p=0}^\infty \frac{\zeta^p}{p!}\, \partial_\omega^p \psi^{(l)}(\omega, u). \]
Lemma 3.4 allows us to estimate this expansion for every $\omega \in E_0 \cap B_\delta(\omega_0)$ with $\mathrm{Im}\, \omega < 0$ as follows,
\[ |\grave{\phi}(\omega + \zeta, u)| \le c \sum_{l=0}^\infty \frac{1}{l!} \left( \frac{c}{|\omega + \zeta|\, u} \right)^l \sum_{p=0}^\infty (c\, |\zeta|)^p. \]
This expansion converges uniformly for $|\zeta| < \frac{1}{2c}$. Similarly, one can show that the series of $\zeta$-derivatives also converge uniformly. Hence we can interchange differentiation with summation, and a straightforward calculation shows that the Cauchy-Riemann equations are satisfied. Thus the above expansion allows us to extend $\grave{\phi}$ analytically to the ball $|\zeta| < \frac{1}{2c}$. Since the constant $c$ is independent of $\mathrm{Im}\, \omega$, we thus obtain an analytic extension of $\grave{\phi}$ across the real line.

3.2. A continuous family of solutions near $\omega = 0$. In Theorem 3.2 we made no statement about the behavior of the fundamental solutions $\grave{\phi}$ at $\omega = 0$. Indeed, we cannot expect the solutions to have a holomorphic extension in a neighborhood of $\omega = 0$. But at least, after suitable rescaling, these solutions have a well-defined limit at $\omega = 0$:

Theorem 3.5. For every angular momentum number $n$, there is a real solution $\phi_0$ of the Schrödinger equation (2.14) for $\omega = 0$ with the asymptotics
\[ \lim_{u \to \infty} u^{\mu - \frac{1}{2}}\, \phi_0(u) = \frac{\Gamma(\mu)}{\sqrt{\pi}} \qquad \text{with} \qquad \mu := \sqrt{\lambda_n(0) + \frac{1}{4}}. \tag{3.25} \]
This solution can be obtained as a limit of the solutions from Theorem 3.2, in the sense that for all $u \in \mathbb{R}$,
\[ \phi_0(u) = \lim_{E_0 \ni \omega \to 0} \omega^\mu\, \grave{\phi}(u) \qquad \text{and} \qquad \phi_0'(u) = \lim_{E_0 \ni \omega \to 0} \omega^\mu\, \grave{\phi}'(u). \]

Note that the $\lambda_n$ are the eigenvalues of the Laplacian on the sphere. They are clearly non-negative, and thus the parameter $\mu$ in (3.25) is positive. Unfortunately, the function $\phi_0$ cannot be constructed with the iteration scheme (3.16) because if we put in the Green's function for $\omega = 0$ (which is obtained from (3.14) by taking the limit $\omega \to 0$), we get for $\phi^{(1)}$ the equation
\[ \phi^{(1)}(u) = \int_u^\infty (v - u)\, W(v)\, dv, \]
and since $W$ decays at infinity only quadratically, the integral diverges. To overcome this problem, we combine the quadratically decaying part of the potential with the unperturbed operator. More precisely, for any $\omega$ in the set
\[ F := \left\{ \omega \in \mathbb{C} \;|\; \mathrm{Im}\, \omega \le 0 \text{ and } |\omega| \le (16ak)^{-1} \right\}, \]
we write the Schrödinger equation as
\[ \left( -\frac{d^2}{du^2} + \frac{\mu^2 - \frac{1}{4}}{u^2} - \omega^2 \right) \phi(u) = -W(u)\, \phi(u), \]
where $\mu(\omega) = (\lambda_n(\omega) - 2ak\omega + \frac{1}{4})^{\frac{1}{2}}$. The potential $W$ is continuous in $\omega$ and bounded by
\[ |W(u)| \le \frac{c}{u^3} \qquad \text{for all } \omega \in F. \tag{3.26} \]
The solutions of the unperturbed Schrödinger equation can be expressed with Bessel functions,
\[ h_1(u) = \sqrt{\frac{\pi u}{2}}\; J_\mu(\omega u), \qquad h_2(u) = \sqrt{\frac{\pi u}{2}}\; Y_\mu(\omega u). \]
They have the following asymptotics,
\[ h_1(u) \sim \cos(\omega u), \qquad h_2(u) \sim \sin(\omega u) \qquad \text{if } \omega u \gg 1, \]
\[ h_1(u) \sim \frac{\sqrt{\pi}\, \omega^\mu}{2^{\mu + \frac{1}{2}}\, \Gamma(\mu + 1)}\; u^{\mu + \frac{1}{2}}, \qquad h_2(u) \sim \frac{2^{\mu - \frac{1}{2}}\, \Gamma(\mu)}{\sqrt{\pi}\, \omega^\mu}\; u^{-\mu + \frac{1}{2}} \qquad \text{if } \omega u \ll 1. \]


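The Green's function can be expressed in terms of the two fundamental solutions by the standard formula
\[ S(u, v) = \Theta(v - u)\, \frac{h_1(u)\, h_2(v) - h_1(v)\, h_2(u)}{w(h_1, h_2)}, \]
where $w(h_1, h_2) = h_1' h_2 - h_1 h_2' = -\omega$ is the Wronskian. The perturbation series ansatz
\[ \phi = \sum_{l=0}^\infty \phi^{(l)} \tag{3.27} \]
now leads to the integral equation
\[ \phi^{(l+1)}(u) = \int_u^\infty S(u, v)\, W(v)\, \phi^{(l)}(v)\, dv. \tag{3.28} \]
We choose the function $\phi^{(0)}$ such that its asymptotics at infinity is a multiple of the plane wave $e^{-i\omega u}$, whereas for $\omega = 0$, it has the asymptotics (3.25),
\[ \phi^{(0)}(u) = \omega^\mu\, (h_1 - i h_2)(u). \tag{3.29} \]

Lemma 3.6. For any fixed $n$ there is $u_0 \in \mathbb{R}$ such that the iteration scheme (3.29, 3.28) converges uniformly for all $u > u_0$ and $\omega \in F$. The functions $\phi$ defined by (3.27) are solutions of the Schrödinger equation (2.14) with the asymptotics
\[ \left| \frac{\phi(u)}{\phi^{(0)}(u)} - 1 \right| \le \frac{c}{u} \]
and a constant $c = c(n)$.

Proof. Using the asymptotic formulas for the Bessel functions, one sees (similar to the estimate [3, Eq. (4.4)] for $\mu = l + \frac{1}{2}$ and integer $l$) that for all $v \ge u$ and $\omega \in F$, the Green's function is bounded by
\[ |S(u, v)| \le C \left( \frac{u}{1 + |\omega|\, u} \right)^{-\mu + \frac{1}{2}} \left( \frac{v}{1 + |\omega|\, v} \right)^{\mu + \frac{1}{2}} e^{\mathrm{Im}\, \omega\, (v - u)}. \tag{3.30} \]
Similarly, we can bound the Bessel functions in (3.29) to get
\[ \frac{1}{C} \le |\phi^{(0)}|\; e^{-\mathrm{Im}\, \omega\, u} \left( \frac{u}{1 + |\omega|\, u} \right)^{\mu - \frac{1}{2}} \le C. \tag{3.31} \]
Let us show inductively that
\[ |\phi^{(l)}| \le C\, e^{\mathrm{Im}\, \omega\, u} \left( \frac{u}{1 + |\omega|\, u} \right)^{-\mu + \frac{1}{2}} \left( \frac{C c}{u} \right)^l. \tag{3.32} \]
For $l = 0$ there is nothing to prove. The induction step follows from (3.28, 3.26, 3.30),
\[ |\phi^{(l+1)}| \le C\, e^{-\mathrm{Im}\, \omega\, u} \left( \frac{u}{1 + |\omega|\, u} \right)^{-\mu + \frac{1}{2}} \int_u^\infty \frac{v}{1 + |\omega|\, v}\; e^{2\, \mathrm{Im}\, \omega\, v}\; \frac{c\, C}{v^3} \left( \frac{C c}{v} \right)^l dv \]
\[ \le C\, e^{\mathrm{Im}\, \omega\, u} \left( \frac{u}{1 + |\omega|\, u} \right)^{-\mu + \frac{1}{2}} \left( \frac{C c}{u} \right)^l \int_u^\infty \frac{c\, C}{v^2}\, dv. \]
The lemma now follows immediately from (3.32, 3.31) and by differentiating the series (3.27) with respect to $u$.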

Proof of Theorem 3.5. From the asymptotics at infinity, it is clear that
\[ \phi = \begin{cases} \omega^\mu\, \grave{\phi} & \text{if } \omega \ne 0 \\ \phi_0 & \text{if } \omega = 0. \end{cases} \]
Denoting the $\omega$-dependence of $\phi$ by a subscript, we thus need to prove that for all $u \in \mathbb{R}$,
\[ \lim_{F \ni \omega \to 0} \phi_\omega(u) = \phi_0(u), \qquad \lim_{F \ni \omega \to 0} \phi_\omega'(u) = \phi_0'(u). \tag{3.33} \]
To simplify the problem, we first note that for $u$ on compact intervals, the continuous dependence on $\omega$ follows immediately from the Picard-Lindelöf Theorem (i.e. the continuous dependence of solutions of ODEs on the coefficients and initial values). Thus it suffices to prove (3.33) for large $u$. Furthermore, writing the Schrödinger equation as
\[ (\partial_u - i\omega)(\partial_u + i\omega)\, \phi_\omega = -U \phi_\omega, \]
the potential $U$ has quadratic decay at infinity. Thus, after the substitution $(\partial_u - i\omega) = e^{i\omega u}\, \partial_u\, e^{-i\omega u}$, we can multiply the above equation by $e^{-i\omega u}$ and integrate to obtain
\[ e^{-i\omega u}\, (\partial_u + i\omega)\, \phi_\omega(u) = \int_u^\infty e^{-i\omega v}\, U(v)\, \phi_\omega(v)\, dv. \]
Here we emphasized the $\omega$-dependence by a subscript; note also that the integral is well-defined in view of the asymptotics of $\phi_\omega$ at infinity. This equation shows that $\phi_\omega'$ converges pointwise once we know that $\phi_\omega(u)$ converges uniformly in $u$. Hence it remains to show that for every $\varepsilon > 0$ there is $u_0$ and $\delta > 0$ such that for all $\omega \in F$ with $|\omega| < \delta$,
\[ |\phi_\omega(u) - \phi_0(u)| < \varepsilon \qquad \text{for all } u > u_0. \tag{3.34} \]

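To prove (3.34) we use the uniform convergence of the functions $\phi^{(0)}_\omega$, (3.29), to choose $\delta$ such that for all $\omega \in F$ with $|\omega| < \delta$,
\[ |\phi^{(0)}_\omega(u) - \phi^{(0)}_0(u)| < \frac{\varepsilon}{3} \qquad \text{for all } u > u_0. \]
According to Lemma 3.6, we can by choosing $u_0$ sufficiently large arrange that
\[ |\phi^{(0)}_\omega(u) - \phi_\omega(u)| < \frac{\varepsilon}{3} \qquad \text{for all } u > u_0 \text{ and } \omega \in F. \]
Now (3.34) follows immediately from the estimate
\[ |\phi_\omega - \phi_0| \le |\phi_\omega - \phi^{(0)}_\omega| + |\phi^{(0)}_\omega - \phi^{(0)}_0| + |\phi^{(0)}_0 - \phi_0|. \]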


4. Global Estimates for the Radial Equation

Let $Y_1$ and $Y_2$ be two real fundamental solutions of the Schrödinger equation (2.14) for a general real and smooth potential $V$. Then their Wronskian
\[ w := Y_1'(u)\, Y_2(u) - Y_1(u)\, Y_2'(u) \tag{4.1} \]
is a constant. By flipping the sign of $Y_2$, we can always arrange that $w < 0$. We combine the two real solutions into the complex function $z = Y_1 + i Y_2$, and denote its polar decomposition by
\[ z = \rho\, e^{i\varphi} \tag{4.2} \]
with real functions $\rho(u) \ge 0$ and $\varphi(u)$. By linearity, $z$ is a solution of the complex Schrödinger equation
\[ z'' = V z. \tag{4.3} \]
Note that $z$ has no zeros because at every $u$ at least one of the fundamental solutions does not vanish.

4.1. The complex Riccati equation. We introduce the function $y$ by
\[ y = \frac{z'}{z}. \tag{4.4} \]
Since $z$ has no zeros, the function $y$ is smooth. Moreover, it satisfies the complex Riccati equation
\[ y' + y^2 = V. \tag{4.5} \]
The fact that the solutions of the complex Riccati equation are smooth will be helpful for getting estimates. Conversely, from a solution of the Riccati equation one obtains the corresponding solution of the Schrödinger equation by integration,
\[ \log z \big|_u^v = \int_u^v y. \tag{4.6} \]
Using (4.2) in (4.4) gives separate equations for the amplitude and phase of $z$,
\[ \rho' = \rho\; \mathrm{Re}\, y, \qquad \varphi' = \mathrm{Im}\, y, \]
and integration gives
\[ \log \rho \big|_u^v = \int_u^v \mathrm{Re}\, y, \tag{4.7} \]
\[ \varphi \big|_u^v = \int_u^v \mathrm{Im}\, y. \tag{4.8} \]
Furthermore, the Wronskian (4.1) gives a simple algebraic relation between $\rho$ and $y$. Namely, $w$ can be expressed by $w = -\mathrm{Im}\, (\overline{z}\, z') = -\rho^2\, \mathrm{Im}\, y$ and thus
\[ \rho^2 = -\frac{w}{\mathrm{Im}\, y}. \tag{4.9} \]
Since $\rho^2$ is positive and $w$ is negative, we see that
\[ \mathrm{Im}\, y(u) > 0 \qquad \text{for all } u. \tag{4.10} \]

4.2. Invariant disk estimates. We now explain a method for getting estimates for the complex Riccati equation. This method was first used in [6] for estimates in the case where the potential is negative (Lemma 4.1). Here we extend the method to the situation when the potential is positive (Lemma 4.2). For the sake of clarity, we develop the method again from the beginning, but we point out that the proof of Lemma 4.1 is taken from [6]. Let $y(u)$ be a solution of the complex Riccati equation (4.5). We want to estimate the Euclidean distance of $y$ to a given curve $m(u) = \alpha + i\beta$ in the complex plane. A direct calculation using (4.5) gives
\begin{align*}
\frac{1}{2} \frac{d}{du} |y - m|^2 &= (\mathrm{Re}\, y - \alpha)\, (\mathrm{Re}\, y - \alpha)' + (\mathrm{Im}\, y - \beta)\, (\mathrm{Im}\, y - \beta)' \\
&= (\mathrm{Re}\, y - \alpha) \left[ V - (\mathrm{Re}\, y)^2 + (\mathrm{Im}\, y)^2 - \alpha' \right] - (\mathrm{Im}\, y - \beta) \left[ 2\, \mathrm{Re}\, y\; \mathrm{Im}\, y + \beta' \right] \\
&= (\mathrm{Re}\, y - \alpha) \left[ V - (\mathrm{Re}\, y)^2 - (\mathrm{Im}\, y)^2 + 2\beta\, \mathrm{Im}\, y - \alpha' \right] + (\mathrm{Re}\, y - \alpha)\, 2 (\mathrm{Im}\, y - \beta)\, \mathrm{Im}\, y \\
&\quad - (\mathrm{Im}\, y - \beta) \left[ \beta' + 2\alpha\, \mathrm{Im}\, y \right] - (\mathrm{Im}\, y - \beta)\, 2 (\mathrm{Re}\, y - \alpha)\, \mathrm{Im}\, y \\
&= (\mathrm{Re}\, y - \alpha) \left[ V - (\mathrm{Re}\, y - \alpha)^2 - (\mathrm{Im}\, y - \beta)^2 - \alpha^2 + \beta^2 - \alpha' \right] \\
&\quad - (\mathrm{Im}\, y - \beta) \left[ \beta' + 2\alpha\beta \right] - 2\alpha \left[ (\mathrm{Re}\, y - \alpha)^2 + (\mathrm{Im}\, y - \beta)^2 \right].
\end{align*}
Choosing polar coordinates centered at $m$,
\[ y = m + R\, e^{i\varphi}, \qquad R := |y - m|, \]
we obtain the following differential equation for $R$,
\[ R' + 2\alpha R = \cos\varphi \left[ V - R^2 - \alpha^2 + \beta^2 - \alpha' \right] - \sin\varphi \left[ \beta' + 2\alpha\beta \right]. \tag{4.11} \]
In order to use this equation for estimates, we assume that $\alpha$ is a given function (to be determined later). With the abbreviations
\[ U = V - \alpha^2 - \alpha' \qquad \text{and} \qquad \sigma(u) = \exp\!\left( 2 \int_0^u \alpha \right), \tag{4.12} \]
the ODE (4.11) can then be written as
\[ (\sigma R)' = \sigma \left[ U - R^2 + \beta^2 \right] \cos\varphi - (\sigma\beta)' \sin\varphi. \]
To further simplify the equation, we want to arrange that the square bracket vanishes. If $U$ is negative, this can be achieved by the ansatz
\[ \beta = \frac{\sqrt{|U|}}{2} \left( T + \frac{1}{T} \right), \qquad R = \frac{\sqrt{|U|}}{2} \left( T - \frac{1}{T} \right) \qquad (U < 0), \tag{4.13} \]
with $T > 1$ a free function. In the case $U > 0$, we make similarly the ansatz
\[ \beta = \frac{\sqrt{U}}{2} \left( T - \frac{1}{T} \right), \qquad R = \frac{\sqrt{U}}{2} \left( T + \frac{1}{T} \right) \qquad (U > 0) \tag{4.14} \]
with a function $T > 0$.

Fig. 1. Invariant disk estimate for U < 0

Using (4.13, 4.14), the ODE (4.11) reduces to the simple equation
\[ (\sigma R)' = -(\sigma\beta)' \sin\varphi. \]
If we now replace this equation by a strict inequality,
\[ (\sigma R)' > -(\sigma\beta)' \sin\varphi, \tag{4.15} \]
with $R$ a general positive function, the inequality $|y - m| \le R$ will be preserved as $u$ increases. In other words, the disk $B_R(m)$ will be an invariant region for the flow of $y$. In the next two lemmas we specify the function $T$ in the cases $U < 0$ and $U > 0$, respectively. To avoid confusion, we note that it is only a matter of convenience to state the lemmas on the interval $[0, u_{\max}]$; by translation we can later immediately apply the lemmas on any closed interval.

Lemma 4.1. Let $\alpha$ be a real function on $[0, u_{\max}]$ which is continuous and piecewise $C^1$, such that the corresponding function $U$, (4.12), is negative,
\[ U \le 0 \qquad \text{on } [0, u_{\max}]. \]
For a constant $T_0 \ge 1$ we introduce the function $T$ by
\[ T(u) = T_0\, \exp\!\left( \frac{1}{2}\, \mathrm{TV}_{[0,u)} \log |\sigma^2 U| \right), \tag{4.16} \]
define the functions $\beta$ and $R$ by (4.13) and set $m = \alpha + i\beta$. If a solution $y$ of the complex Riccati equation (4.5) satisfies at $u = 0$ the condition
\[ |y - m| \le R, \tag{4.17} \]
then this condition holds for all $u \in [0, u_{\max}]$ (for illustration see Fig. 1).

Proof. For $\varepsilon > 0$ we set
\[ T_\varepsilon(u) = T_0\, \exp\!\left( \frac{1}{2} \int_0^u \left| \frac{|\sigma^2 U|'}{|\sigma^2 U|} \right| + \varepsilon\, (1 - e^{-u}) \right) \tag{4.18} \]

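and denote corresponding functions $\alpha$, $R$, $m$, and $\sigma$ by an additional subscript $\varepsilon$. Since $T_\varepsilon(0) = T(0)$ and $\lim_{\varepsilon \searrow 0} T_\varepsilon = T$, it suffices to show that for all $\varepsilon > 0$ the following statement holds,

\[ |y - m_\varepsilon|(0) \le R_\varepsilon(0) \quad\Longrightarrow\quad |y - m_\varepsilon|(u) \le R_\varepsilon(u) \ \text{ for all } u \in [0, u_{\max}]. \]

In differential form, we get the sufficient condition

\[ |y - m_\varepsilon|(u) = R_\varepsilon(u) \quad\Longrightarrow\quad |y - m_\varepsilon|'(u) < R_\varepsilon'(u). \]

According to (4.15), this last condition will be satisfied if

\[ (\sigma_\varepsilon R_\varepsilon)' > |(\sigma_\varepsilon \beta_\varepsilon)'|. \tag{4.19} \]

From now on we omit the subscripts $\varepsilon$. In order to prove (4.19), we first use (4.13, 4.12) to rewrite the functions $\sigma\beta$ and $\sigma R$ as

\[ \sigma\beta = \frac{1}{2}\Big( \sqrt{|\sigma^2 U|}\, T + \sqrt{|\sigma^2 U|}\, T^{-1} \Big), \qquad \sigma R = \frac{1}{2}\Big( \sqrt{|\sigma^2 U|}\, T - \sqrt{|\sigma^2 U|}\, T^{-1} \Big). \tag{4.20} \]

By definition of $T_\varepsilon$ (4.18),

\[ \frac{T'}{T} = \frac{1}{2} \Big| \frac{|\sigma^2 U|'}{|\sigma^2 U|} \Big| + \varepsilon e^{-u}. \]

It follows that

\[ \big( \sqrt{|\sigma^2 U|}\, T^{-1} \big)' = -\varepsilon e^{-u} \big( \sqrt{|\sigma^2 U|}\, T^{-1} \big) \quad \text{if } |\sigma^2 U|' \ge 0, \]
\[ \big( \sqrt{|\sigma^2 U|}\, T \big)' = \varepsilon e^{-u} \big( \sqrt{|\sigma^2 U|}\, T \big) \quad \text{if } |\sigma^2 U|' < 0. \]

Hence when we differentiate through (4.20) and set $\varepsilon = 0$, either the first or the second summand drops out in each equation, and we obtain $(\sigma R)' = |(\sigma\beta)'|$. If $\varepsilon > 0$, an inspection of the signs of the additional terms gives (4.19).

Lemma 4.2. Let $\alpha$ be a real function on $[0, u_{\max}]$ which is continuous and piecewise $C^1$, such that the corresponding function $U$, (4.12), satisfies on $[0, u_{\max}]$ the conditions

\[ U \ge 0 \quad\text{and}\quad U' + 4U\alpha \ge 0. \tag{4.21} \]

For a constant $T_0 \ge 0$ we introduce the function $T$ by

\[ T(u) = T_0 \sqrt{\frac{U(0)}{\sigma^2 U}}, \tag{4.22} \]

define the functions $\beta$ and $R$ by (4.14) and set $m = \alpha + i\beta$. If a solution $y$ of the complex Riccati equation (4.5) satisfies at $u = 0$ the condition $|y - m| \le R$, then this condition holds for all $u \in [0, u_{\max}]$ (see Fig. 2). Furthermore,

\[ \operatorname{Re} y \ge \alpha - \sqrt{U} - T_0\, \frac{\sqrt{U(0)}}{2\sigma}. \tag{4.23} \]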



Fig. 2. Invariant disk estimate for U > 0, in the cases T > 1 (left) and T < 1 (right).

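Proof. For $\varepsilon > 0$ we set

\[ T_\varepsilon = T_0\, (\sigma^2 U)^{-\frac{1}{2}}\, (1 - \varepsilon e^{-u}). \]

Using (4.14, 4.12) we can write the functions $\sigma\beta$ and $\sigma R$ as

\[ \sigma\beta = -\frac{1}{2}\Big( T_0^{-1} \sigma^2 U\, (1 - \varepsilon e^{-u})^{-1} - T_0\, (1 - \varepsilon e^{-u}) \Big), \]
\[ \sigma R = \frac{1}{2}\Big( T_0^{-1} \sigma^2 U\, (1 - \varepsilon e^{-u})^{-1} + T_0\, (1 - \varepsilon e^{-u}) \Big), \]

where we again omitted the subscript $\varepsilon$. Differentiation gives

\[ (\sigma R)' > -(\sigma\beta)' = \frac{1}{2}\Big( T_0^{-1} \sigma^2 U\, (1 - \varepsilon e^{-u})^{-1} - T_0\, (1 - \varepsilon e^{-u}) \Big)'. \tag{4.24} \]

According to the second inequality in (4.21), the function $\sigma^2 U$ is strictly increasing and thus the expression on the right of (4.24) is positive for sufficiently small $\varepsilon$. Hence (4.19) is satisfied. Letting $\varepsilon \to 0$, we obtain that the circle $\overline{B_R(m)}$ is invariant.

In order to prove (4.23) we note that in the case $T < 1$ the inequality is obvious because even $\operatorname{Re} y \ge \alpha - \sqrt{U}$ (see Fig. 2). Thus we can assume $T \ge 1$, and the estimate

\[ \operatorname{Re} y \ge \alpha - R \ge \alpha - \frac{\sqrt{U}}{2}\,(T + 2) \]

together with (4.22) gives the claim.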
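The invariance statement of Lemma 4.1 can be probed numerically. The following is a minimal sketch under stated assumptions, not part of the proof: it takes $\alpha \equiv 0$ (so $\sigma \equiv 1$ and $U = V$), assumes the Riccati equation (4.5) has the form $y' = V - y^2$, uses the hypothetical test potential $V(u) = -e^{-u}$, and writes the disk in the form $\beta = \frac{1}{2}(T_0 - V/T_0)$, $R = \frac{1}{2}(T_0 + V/T_0)$ that results from (4.16) and (4.20) after absorbing $\sqrt{|V(0)|}$ into $T_0$. All function names are illustrative.

```python
import math

def V(u):
    # Hypothetical test potential: negative and monotone increasing.
    return -math.exp(-u)

def disk(u, T0):
    # Centre i*beta and radius R of the disk for alpha = 0 (sigma = 1, U = V).
    beta = 0.5 * (T0 - V(u) / T0)
    R = 0.5 * (T0 + V(u) / T0)
    return complex(0.0, beta), R

def rhs(u, y):
    # Complex Riccati equation, assumed here in the form y' = V - y^2.
    return V(u) - y * y

def rk4_step(u, y, h):
    # One classical Runge-Kutta step of size h.
    k1 = rhs(u, y)
    k2 = rhs(u + h / 2, y + h / 2 * k1)
    k3 = rhs(u + h / 2, y + h / 2 * k2)
    k4 = rhs(u + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def max_excess(y0, T0, u_max=4.0, steps=4000):
    # Largest value of |y - m| - R along the trajectory (<= 0 means "inside").
    h = u_max / steps
    u, y, worst = 0.0, y0, -math.inf
    for _ in range(steps + 1):
        m, R = disk(u, T0)
        worst = max(worst, abs(y - m) - R)
        y = rk4_step(u, y, h)
        u += h
    return worst

T0 = 1.5  # satisfies T0**2 >= -V(0) = 1
m0, R0 = disk(0.0, T0)
starts = [m0 + 0.9 * R0 * complex(math.cos(t), math.sin(t))
          for t in (0.0, 1.5, 3.0, 4.5)]
excess = max(max_excess(y0, T0) for y0 in starts)
print(excess <= 1e-9)
```

The check only samples a few initial points inside the disk, so it illustrates the lemma rather than verifying it; the small tolerance absorbs the discretization error of the integrator.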

If the potential $V$ is monotone increasing, by choosing $\alpha \equiv 0$ we obtain the following simple estimate.

Corollary 4.3. Assume that the potential $V$ is monotone increasing on $[0, u_{\max}]$. For a constant $T_0 > 0$ with $T_0^2 \ge -V(0)$ we introduce the functions

\[ \beta = \frac{1}{2}\Big( T_0 - \frac{V}{T_0} \Big), \qquad R = \frac{1}{2}\Big( T_0 + \frac{V}{T_0} \Big). \tag{4.25} \]



Fig. 3. Invariant region estimate for monotone V (axes: Re y, Im y; marked quantities: T0, β, R)

If a solution of the complex Riccati equation (4.5) satisfies at $u = 0$ the condition

\[ y \in \big\{ z \;\big|\; |z - i\beta| \le R,\ \operatorname{Re} z, \operatorname{Im} z \ge 0 \big\} \cup \Big\{ z \;\Big|\; \Big| z - \frac{iT_0}{2} \Big| \le \frac{T_0}{2},\ \operatorname{Re} z \le 0 \Big\}, \]

then this condition holds for all $u \in [0, u_{\max}]$ (see Fig. 3).

Proof. Choosing $\alpha \equiv 0$ and $\beta$, $T$ according to (4.25), we know from Lemma 4.1 and Lemma 4.2 that the circles $|y - m| \le R$ are invariant. Furthermore, we note that the arc in Fig. 3 is the flow line of the equation $y' + y^2 = 0$, and thus it cannot be crossed from the right to the left when $V$ is positive. This gives the result in the case that $V$ has no zeros. If $V$ has a zero, the invariant disks in the regions $V \le 0$ and $V \ge 0$ coincide at the zero of $V$.

The invariant disk estimates of Lemma 4.1 and Lemma 4.2 can also be used if the functions $\alpha$ and $U$ have a discontinuity at some $v \in [0, u_{\max}]$, i.e.

\[ \alpha_l := \lim_{u \nearrow v} \alpha(u) \ne \lim_{u \searrow v} \alpha(u) =: \alpha_r, \qquad U_l := \lim_{u \nearrow v} U(u) \ne \lim_{u \searrow v} U(u) =: U_r. \]

In this case we choose the function $T$ also to be discontinuous at $v$,

\[ T_l := \lim_{u \nearrow v} T(u) \ne \lim_{u \searrow v} T(u) =: T_r, \]

in such a way that the circle corresponding to $(\alpha_r, U_r, T_r)$ contains that corresponding to $(\alpha_l, U_l, T_l)$ (see Fig. 4). In the next lemma we give sufficient “jump conditions” for this “matching.”

Lemma 4.4. (Matching of invariant disks). Suppose that $U_l < 0$. Depending on the sign of $U_r$, we set

\[ T_r = T_l\, \frac{(\alpha_r - \alpha_l)^2 + |U_l + U_r|}{\sqrt{|U_l|}\, \sqrt{|U_r|}} \quad \text{if } U_r < 0, \tag{4.26} \]
\[ T_r = T_l\, \frac{(\alpha_r - \alpha_l)^2 + |U_l + U_r| + \sqrt{|U_l|}\, \sqrt{U_r}}{\sqrt{|U_l|}\, \sqrt{U_r}} \quad \text{if } U_r > 0. \tag{4.27} \]



Fig. 4. Matching of invariant disks in the cases Ur < 0 (left) and Ur > 0 (right)

Let $B_{l/r}$ be the disks with centers $m_{l/r} = \alpha_{l/r} + i\beta_{l/r}$ and radii $R_{l/r}$ as given by (4.13) or (4.14). Then $B_l \subset B_r$.

Proof. We must satisfy the condition $R_r \ge |m_r - m_l| + R_l$. Taking squares, we obtain the equivalent conditions $R_r \ge R_l$ and $(R_r - R_l)^2 \ge (\alpha_r - \alpha_l)^2 + (\beta_r - \beta_l)^2$. This last condition can also be written as

\[ (\alpha_r - \alpha_l)^2 + (\beta_l^2 - R_l^2) + (\beta_r^2 - R_r^2) \le 2\, (\beta_l \beta_r - R_l R_r). \tag{4.28} \]

In the case $U_r < 0$, we can substitute the ansatz (4.13) into (4.28) to obtain the equivalent inequality

\[ (\alpha_r - \alpha_l)^2 + |U_l| + |U_r| \le \sqrt{|U_l|}\, \sqrt{|U_r|}\, \Big( \frac{T_r}{T_l} + \frac{T_l}{T_r} \Big). \]

Dropping the last summand on the right and solving for $T_r$, we obtain (4.26), which is thus a sufficient condition. In the case $U_r > 0$, we substitute (4.13, 4.14) into (4.28) to obtain the equivalent condition

\[ (\alpha_r - \alpha_l)^2 + |U_l| - U_r \le \sqrt{|U_l|}\, \sqrt{U_r}\, \Big( \frac{T_r}{T_l} - \frac{T_l}{T_r} \Big). \]

Using the inequality $|U_l| - U_r \le |U_l + U_r|$, replacing the factor $T_l/T_r$ on the right by one and solving for $T_r$, we obtain the sufficient condition (4.27).
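The jump conditions of Lemma 4.4 lend themselves to a direct numerical spot check. The sketch below is illustrative only: it assumes that (4.13) and (4.14) have the form suggested by (4.20) and by the proof of Lemma 4.2, namely $\beta = \frac{\sqrt{|U|}}{2}(T + T^{-1})$, $R = \frac{\sqrt{|U|}}{2}(T - T^{-1})$ for $U < 0$ and the same expressions with the roles of $\beta$ and $R$ exchanged for $U > 0$. The function names and the sample parameter values are hypothetical.

```python
import math

def disk(alpha, U, T):
    # Centre m = alpha + i*beta and radius R, following the pattern of (4.20):
    # U < 0: beta ~ T + 1/T, R ~ T - 1/T;  U > 0: the two roles are swapped.
    s = math.sqrt(abs(U)) / 2
    if U < 0:
        beta, R = s * (T + 1 / T), s * (T - 1 / T)
    else:
        beta, R = s * (T - 1 / T), s * (T + 1 / T)
    return complex(alpha, beta), R

def matched_T(alpha_l, U_l, T_l, alpha_r, U_r):
    # Jump conditions (4.26) for U_r < 0 and (4.27) for U_r > 0; needs U_l < 0.
    root = math.sqrt(abs(U_l)) * math.sqrt(abs(U_r))
    num = (alpha_r - alpha_l) ** 2 + abs(U_l + U_r)
    if U_r > 0:
        num += root
    return T_l * num / root

def contained(alpha_l, U_l, T_l, alpha_r, U_r):
    # B_l is inside B_r exactly when R_r >= |m_r - m_l| + R_l.
    m_l, R_l = disk(alpha_l, U_l, T_l)
    m_r, R_r = disk(alpha_r, U_r, matched_T(alpha_l, U_l, T_l, alpha_r, U_r))
    return R_r >= abs(m_r - m_l) + R_l - 1e-12

cases = [(0.0, -1.0, 2.0, 0.0, -1.0),
         (0.0, -0.5, math.sqrt(2), 1.0, -3.0),
         (0.3, -2.0, 1.5, 1.1, 0.7),
         (0.0, -1.0, 1.0, 0.0, 4.0)]
print(all(contained(*c) for c in cases))
```

Each tuple is $(\alpha_l, U_l, T_l, \alpha_r, U_r)$ with $T_l \ge 1$ so that the left radius is non-negative. The check tests the containment $R_r \ge |m_r - m_l| + R_l$ directly rather than the squared form (4.28).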



4.3. Bounds for the Wronskian and the fundamental solutions. We now consider the solutions $\acute{\phi}$ and $\grave{\phi}$ as defined in Section 3.1 for $\omega$ on the real axis and set

\[ \acute{y} = \frac{\acute{\phi}'}{\acute{\phi}}, \qquad \grave{y} = \frac{\grave{\phi}'}{\grave{\phi}}. \]

We keep $k$ fixed. Since taking the complex conjugate of the separated wave equation flips the sign of $k$, we may assume that $k \ge 0$. Then $\omega_0$ as defined by (2.12) is negative.

Proposition 4.5. If $\omega \notin [\omega_0, 0]$, the Wronskian $w(\acute{\phi}, \grave{\phi})$ is non-zero.

Proof. According to (2.13), $\omega$ and $\Omega$ have the same sign. From (4.10) we know that the functions $\acute{y}$ and $\grave{y}$ both stay either in the upper or lower half plane. In view of the asymptotics (3.1, 3.2), we know that they must be in opposite half planes. Thus

\[ w(\acute{\phi}, \grave{\phi}) = \acute{\phi}\, \grave{\phi}\, \big( \acute{y} - \grave{y} \big) \ne 0. \]

In the case $\omega \in (\omega_0, 0)$, we need the following global estimate for large $\lambda$.

Proposition 4.6. For any $u_1 \in \mathbb{R}$ there are constants $c, \lambda_0 > 0$ such that

\[ \left| \frac{\acute{\phi}(u)}{w(\acute{\phi}, \grave{\phi})} \right| \le \frac{c}{\lambda} \quad \text{for all } \lambda > \lambda_0,\ \omega \in (\omega_0, 0),\ u < u_1. \]

The remainder of this section is devoted to the proof of this proposition. Let $u_1 \in \mathbb{R}$ and $\omega \in (\omega_0, 0)$. Possibly by increasing $u_1$ and $\lambda_0$ we can clearly arrange that $V$ is monotone decreasing on $[u_1, \infty)$. Then we have the following estimate.

Lemma 4.7. The functions $\grave{\phi}$ and $\grave{y}$ satisfy the inequalities

\[ |\grave{\phi}(u)| \ge 1, \qquad \operatorname{Re} \grave{y}(u) \le |\omega| \quad \text{on } [u_1, \infty). \]

Proof. From the asymptotics (3.2) we know that $\lim_{u \to \infty} \grave{y}(u) = -i\omega$. Thus for $v$ sufficiently large, $|\grave{y}(v) - i|\omega|| < \varepsilon$, and we can apply Corollary 4.3 on the interval $[u_1, v]$ backwards in $u$ with $T_0 = |\omega| + 2\varepsilon$. Since $\varepsilon$ can be chosen arbitrarily small, we conclude that Corollary 4.3 applies even on $[u_1, \infty)$ with $T_0 = |\omega|$. This means that

\[ 0 \le \operatorname{Im} \grave{y} \le |\omega|, \qquad \operatorname{Re} \grave{y} \le |\omega| \quad \text{on } [u_1, \infty). \]

Finally, we use (4.9) with $w = i|\omega|$.

We now come to the estimates for $\acute{\phi}$, which are more difficult because we need a stronger result. The next lemma specifies the behavior of the potential on $(-\infty, 2u_1]$.

Lemma 4.8. For any $u_1 \in \mathbb{R}$ there are constants $c, \lambda_0$ such that the potential $V$ has for all $\omega \in (\omega_0, 0)$ and all $\lambda > \lambda_0$ the following properties. There are unique points $u_- < u_0 < u_+ < u_1$ such that

\[ V(u_-) = -\frac{\Omega^2}{2}, \qquad V(u_0) = 0, \qquad V(u_+) = \Omega^2. \]



V is monotone increasing on (−∞, u+ ]. Furthermore, u+ − u− ≤ c,

(4.29)

γ u+ ≥ log − log λ − c, 1 1 |V | + |V

| 2 ≤ |V | on [u+ , 2u1 ], 4 2

2 3

(4.30) (4.31)

with $\gamma$ as in (3.9).

Proof. We expand $V$ in a Taylor series around the event horizon,

\[ V = -\Omega^2 + (\lambda + c_0)(r - r_1) + \lambda\, O((r - r_1)^2). \]

Hence for sufficiently large $\lambda_0$ there are near the event horizon unique points $u_-$, $u_0$, $u_+$ where the potential has the required value. Integrating (2.1) we get near the event horizon the asymptotic formula

\[ u \sim \frac{1}{\gamma} \log(r - r_1). \]

Getting asymptotic expansions for $u_\pm$ we immediately obtain (4.29, 4.30). Furthermore, using (2.1) to transform $r$-derivatives into $u$-derivatives, we obtain in the region $(r_1, r_1 + \varepsilon) \cap (u_+, \infty)$ the estimates

\[ \frac{\lambda}{c}\, e^{\gamma u} \le V(u) \le \lambda\, c\, e^{\gamma u}, \qquad |V'(u)| + |V''(u)| \le \lambda\, c\, e^{\gamma u}, \]

uniformly in $\lambda$ and $\omega$. Hence for sufficiently large $\lambda_0$, (4.31) will be satisfied near the event horizon. In the region $r > r_1 + \varepsilon$ away from the event horizon, $V$ is strictly positive, $V > \lambda/c$, and since the derivatives of $V$ can clearly be bounded by $|V'| + |V''| < c\lambda$, it follows that (4.31) is again satisfied.

First we apply Corollary 4.3 on the interval $(-\infty, u_-)$ to obtain the following result.

Corollary 4.9. There is a constant $c > 0$ such that for all $\omega \in (\omega_0, 0)$ and $\lambda > \lambda_0$,

\[ \frac{\Omega}{2} \le \operatorname{Im} y \le \Omega, \qquad |\operatorname{Re} y| \le \frac{\Omega}{2} \quad \text{on } (-\infty, u_-]. \]

Also, at $u = u_-$ we have an invariant disk with

\[ \alpha_l = 0, \qquad U_l = -\frac{\Omega^2}{2}, \qquad T_l = \sqrt{2}. \tag{4.32} \]

On the interval $[u_-, u_+]$ we use the method described in the next lemma.



Lemma 4.10. Assume that the potential $V$ is monotone increasing on $[0, u_{\max}]$. We set $\alpha = \sqrt{\max(2\, V(u_{\max}), 0)}$ and introduce for a given constant $T_0 > 1$ the functions $U$, $\sigma$, $\beta$, $R$, and $T$ by (4.12, 4.14) and

\[ T(u) = T_0\, \frac{\sqrt{|U(0)|}}{\sqrt{|U(u)|}}\, e^{2\alpha u}. \tag{4.33} \]

If a solution $y$ of the complex Riccati equation (4.5) satisfies at $u = 0$ the condition $|y - m| \le R$, then this condition holds for all $u \in [0, u_{\max}]$.

Proof. By definition of $\alpha$, the function $U = V - \alpha^2$ is negative and monotone increasing. Using furthermore that $\sigma = e^{2\alpha u}$, we can estimate the total variation in (4.16) as follows,

\[ \mathrm{TV}_{[0,u)} \log|\sigma^2 U| = \int_0^u \Big( 4\alpha - \frac{|U|'}{|U|} \Big) = 4\alpha u + \log|U(0)| - \log|U(u)|. \]

This gives (4.33).

Thus we match the invariant disk (4.32) to a disk with $U_r = V(u_-) - \alpha_r^2$ and $\alpha_r = \sqrt{2}\, \Omega$. From (4.29) we see that $(u_+ - u_-)\, \alpha$ is uniformly bounded, and thus we obtain the following estimate.

Corollary 4.11. There is a constant c > 0 such that for all ω ∈ (ω0 , 0) and λ > λ0 , ≤ Im y ≤ c , c

|Re y| ≤ c on [u− , u+ ].

At u = u+ we get an invariant disk with 0 ≤ αl ≤ ,

−c Ul = −2 ,

Tl ≤ c.

(4.34)

In the remaining interval $[u_+, 2u_1]$ an approximate solution of the Schrödinger equation (2.14) is available from semi-classical analysis: the WKB wave function

\[ \phi(u) = V^{-\frac{1}{4}} \exp\Big( \int^u \sqrt{V} \Big). \]

The corresponding function $y$ is given by

\[ y(u) = \sqrt{V} - \frac{V'}{4V}. \]
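As a quick illustration of the WKB formula, the following sketch compares $\sqrt{V} - V'/(4V)$ with the logarithmic derivative of the WKB wave function and evaluates how far this $y$ is from solving the Riccati equation exactly. It rests on assumptions not taken from the text: the Riccati equation is taken in the form $y' = V - y^2$, and the test potential $V(u) = e^{2u}$ is a hypothetical choice made only because its WKB phase $\int \sqrt{V} = e^u$ is available in closed form.

```python
import math

def V(u):
    # Hypothetical positive test potential with a closed-form WKB phase.
    return math.exp(2 * u)

def dV(u):
    # Exact derivative of the test potential.
    return 2 * math.exp(2 * u)

def phi(u):
    # WKB wave function V^(-1/4) * exp(integral of sqrt(V)); the integral is e^u.
    return V(u) ** -0.25 * math.exp(math.exp(u))

def y_wkb(u):
    # The function y = sqrt(V) - V'/(4V) associated with the WKB wave function.
    return math.sqrt(V(u)) - dV(u) / (4 * V(u))

def log_derivative(u, h=1e-5):
    # phi'/phi computed directly from phi by a central difference.
    return (phi(u + h) - phi(u - h)) / (2 * h * phi(u))

def riccati_defect(u, h=1e-5):
    # y' + y^2 - V, which would vanish for an exact solution of y' = V - y^2.
    dy = (y_wkb(u + h) - y_wkb(u - h)) / (2 * h)
    return dy + y_wkb(u) ** 2 - V(u)

for u in (0.5, 1.0, 2.0):
    print(u, y_wkb(u), log_derivative(u), riccati_defect(u) / V(u))
```

For this particular potential the defect is the constant $1/4$, so its size relative to $V$ shrinks as $V$ grows, which is the sense in which the WKB function is only an approximate solution.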

In order to get an invariant disk estimate which quantifies the exponential increase of $\phi$, we choose $\alpha$ such that it also becomes large as $V \gg 0$. For technical simplicity, we choose

\[ \alpha(u) = \frac{7}{8} \sqrt{V(u)}, \tag{4.35} \]

giving rise to the following general result.



Lemma 4.12. Assume that the potential $V$ is positive on $[0, u_{\max}]$ and that

\[ |V'(u)| \le \frac{1}{2}\, V(u)^{\frac{3}{2}}, \qquad |V''(u)| \le \frac{1}{4}\, V(u)^2. \tag{4.36} \]

We introduce for a given constant $T_0 > 0$ the functions $\alpha$, $U$, $\sigma$, $\beta$, $R$, and $T$ by (4.35, 4.12, 4.14, 4.22). If a solution $y$ of the complex Riccati equation (4.5) satisfies at $u = 0$ the condition $|y - m| \le R$, then this condition holds for all $u \in [0, u_{\max}]$. Furthermore,

\[ \operatorname{Re} y \ge \frac{\sqrt{V}}{8} - \frac{T_0}{2}. \tag{4.37} \]

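Proof. A short calculation yields

\[ U = V - \alpha^2 - \alpha' = \frac{15}{64}\, V - \frac{7\, V'}{16 \sqrt{V}}, \]
\[ U' + 4\alpha U = \frac{105}{128}\, V^{\frac{3}{2}} - \frac{83}{64}\, V' + \frac{7\, V'^2}{32\, V^{\frac{3}{2}}} - \frac{7\, V''}{16 \sqrt{V}}. \]

Using (4.36) we obtain the estimates

\[ \frac{V}{64} \le U \le \frac{V}{2} \qquad\text{and}\qquad U' + 4\alpha U \ge \frac{V^{\frac{3}{2}}}{16}. \]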

Hence the conditions (4.21) are satisfied, and Lemma 4.2 applies. The inequality (4.37) follows from (4.23), the just-derived upper bound for $U$ and the fact that $\sigma \ge 1$.

Matching the invariant disk (4.34) to the invariant disk with $\alpha_r = \alpha(u_r)$ and $U_r = V(u_r) - \alpha^2(u_r) - \alpha'(u_r)$ with $\alpha$ according to (4.35), we obtain

\[ U_r \le \Omega^2, \qquad T_r \le c. \tag{4.38} \]
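The estimates in the proof of Lemma 4.12 are purely algebraic in the pointwise values of $V$, $V'$ and $V''$, so they can be spot-checked without solving any differential equation. The sketch below is a hedged illustration: it evaluates $U$ and $U' + 4\alpha U$ for $\alpha = \frac{7}{8}\sqrt{V}$ on a small grid of values allowed by (4.36), including the extreme cases, and tests the three bounds. The grid and the function names are arbitrary choices.

```python
import math

def U_and_flux(V, dV, d2V):
    # U = V - alpha^2 - alpha' and U' + 4*alpha*U for alpha = (7/8) sqrt(V),
    # written out in terms of the pointwise values V, V', V''.
    U = 15 / 64 * V - 7 * dV / (16 * math.sqrt(V))
    flux = (105 / 128 * V ** 1.5 - 83 / 64 * dV
            + 7 * dV ** 2 / (32 * V ** 1.5) - 7 * d2V / (16 * math.sqrt(V)))
    return U, flux

def bounds_hold(V, dV, d2V, tol=1e-12):
    # Checks V/64 <= U <= V/2 and U' + 4*alpha*U >= V^(3/2)/16 up to rounding.
    U, flux = U_and_flux(V, dV, d2V)
    scale = V ** 1.5
    return (V / 64 - tol * V <= U <= V / 2 + tol * V
            and flux >= scale / 16 - tol * scale)

# Sample V > 0 and all V', V'' allowed by (4.36), including the extreme values.
fractions = [-1.0, -0.5, 0.0, 0.5, 1.0]
ok = all(bounds_hold(V, a * 0.5 * V ** 1.5, b * 0.25 * V ** 2)
         for V in (0.01, 1.0, 37.0)
         for a in fractions for b in fractions)
print(ok)
```

The lower bound $U \ge V/64$ is attained when $V' = \frac{1}{2} V^{3/2}$, which is why the comparison carries a small relative tolerance.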

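We can then apply the last lemma on the interval $[u_+, 2u_1]$.

Proof of Proposition 4.6. Suppose that $u < u_1$. Using the definition of $\acute{y}$ and $\grave{y}$, we can rewrite the Wronskian as

\[ w(\acute{\phi}, \grave{\phi}) = \acute{\phi}\, \grave{\phi}\, (\acute{y} - \grave{y}). \]

Applying Lemma 4.7 at $u = 2u_1$ gives

\[ \left| \frac{\acute{\phi}(u)}{w(\acute{\phi}, \grave{\phi})} \right| \le \left| \frac{\acute{\phi}(u)}{\acute{\phi}(2u_1)} \right| \frac{1}{\operatorname{Re} \acute{y}(2u_1) - |\omega|}. \tag{4.39} \]

We combine (4.37) with $T_0 = T_r$ satisfying (4.38) to get

\[ \operatorname{Re} \acute{y} \ge \frac{\sqrt{V}}{8} - c. \tag{4.40} \]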



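Since the potential $V$ is strictly positive on the interval $[u_1, 2u_1]$, we can, possibly by increasing $\lambda_0$ and $c$, arrange that

\[ \sqrt{V} \ge \frac{\sqrt{\lambda}}{c} \quad \text{on } [u_1, 2u_1] \tag{4.41} \]

and thus also that

\[ \operatorname{Re} \acute{y} \ge \frac{\sqrt{\lambda}}{16\, c} \quad \text{on } [u_1, 2u_1]. \tag{4.42} \]

This inequality allows us to bound the fraction in (4.39),

\[ \left| \frac{\acute{\phi}(u)}{w(\acute{\phi}, \grave{\phi})} \right| \le \left| \frac{\acute{\phi}(u)}{\acute{\phi}(2u_1)} \right|. \tag{4.43} \]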

Thus it remains to control the last quotient. We omit the accent and use the notation $\rho = |\phi|$. In the case $u < u_+$, we can use (4.9),

\[ \frac{\rho(u)^2}{\rho(u_+)^2} = \frac{\operatorname{Im} y(u_+)}{\operatorname{Im} y(u)}, \]

and the last quotient is controlled from above and below by Corollary 4.9 and Corollary 4.11. Hence, rewriting the quotient on the right of (4.43) as

\[ \frac{\rho(u)}{\rho(2u_1)} = \frac{\rho(u)}{\rho(u_+)}\, \frac{\rho(u_+)}{\rho(2u_1)}, \]

it remains to consider the case $u \ge u_+$. Applying (4.7) and (4.40), we obtain

\[ A := \log \left| \frac{\phi(u)}{\phi(2u_1)} \right| = -\int_u^{2u_1} \operatorname{Re} y \le c\, (2u_1 - u_+) - \frac{1}{8} \int_u^{2u_1} \sqrt{V}. \]

Now we use (4.30) and the fact that the function log is bounded,

\[ A \le c \log \lambda - \frac{1}{8} \int_u^{2u_1} \sqrt{V}. \]

Estimating the last summand with (4.41),

\[ \int_u^{2u_1} \sqrt{V} \ge \int_{u_1}^{2u_1} \sqrt{V} \ge \frac{\sqrt{\lambda}}{c}\, u_1, \]

we conclude that for large $\lambda$ this summand dominates the term $c \log \lambda$, and thus (4.43) decays in $\lambda$ even like $\exp(-\sqrt{\lambda}/c)$.



5. Contour Deformations to the Real Axis In this section we fix the angular momentum number k throughout and omit the angular variable ϕ. We can again assume without loss of generality that k ≥ 0. Also, since here we are interested in the situation only locally in u, we evaluate weakly. Thus we write the integral representation (2.16) for compactly supported initial data 0 and a test function η ∈ C0∞ (R × S 2 )2 as 1 = − dωe−iωt . (5.1) lim − ε 0 2πi Cε Cε n∈IN

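The integration contour in (5.1) can be moved to the real axis provided that the integrand is continuous. In the next lemma we specify when this is the case and simplify the integrand. For $\omega$ real, the complex conjugates of $\acute{\phi}$ and $\grave{\phi}$ are again solutions of the ODE. Thus, apart from the exceptional cases $\omega \in \{0, \omega_0\}$, we can express $\grave{\phi}$ as a linear combination of $\acute{\phi}$ and $\overline{\acute{\phi}}$,

\[ \grave{\phi} = \alpha\, \acute{\phi} + \beta\, \overline{\acute{\phi}} \qquad (\omega \in \mathbb{R} \setminus \{0, \omega_0\}). \tag{5.2} \]

The complex coefficients $\alpha$ and $\beta$ are called transmission coefficients. The Wronskian of $\acute{\phi}$ and $\grave{\phi}$ can then be expressed by

\[ w(\acute{\phi}, \grave{\phi}) = \beta\, w(\acute{\phi}, \overline{\acute{\phi}}) = 2i\, \Omega\, \beta, \tag{5.3} \]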

where in the last step we used the asymptotics (3.1). Furthermore, it is convenient to introduce the real fundamental solutions ´ φ1 = Re φ,

´ φ2 = Im φ,

and to denote the corresponding solutions of the wave equation in Hamiltonian form ωn . by 1/2 ´ φ) ` is non-zero at ω ∈ R \ {0, ω0 }, then the integrand Lemma 5.1. If the Wronskian w(φ, in (5.1) is continuous at ω and ! lim − lim (Qn (ω + iε) S∞ (ω + iε) )(r, ϑ) ε0

=−

ε 0

2 i ωn ωn tab a < bωn , >, ω

(5.4)

a,b=1

where the coefficients tab are given by α α t11 = 1 + Re , t12 = t21 = −Im , β β

t22 = 1 − Re

α . β

(5.5)

Proof. We start from the explicit formula for the operator product Qn S∞ given in [5, Proposition 5.4]. Since the angular operator Qn (ω + iε) can be diagonalized for ε sufficiently small, the kernel g(u, u ) is simply the Green’s function of the radial ODE, i.e. for ω in the lower half plane, ´ ` ) if u ≤ u

1 φ(u) φ(u

× ` g(u, u ) := (5.6) ´ ) if u > u . , ´ φ) ` φ(u) φ(u w(φ,



whereas the formula in the upper half plane is obtained by complex conjugation. Using ` we find that that limε0 φ´ = limε 0 φ´ and limε0 φ` = limε 0 φ, ! lim − lim g(u, u ) = 2i Im g(u, u ), ε0

ε 0

and a short calculation using (5.2, 5.3) gives 2 ! i lim − lim g(u, u ) = − tab φ a (u) φ b (u ) ε0 ε 0 a,b=1

with tab according to (5.5). Except for the function g(u, u ), all the functions appearing in the formula for Qn S∞ in [5, Proposition 5.4] are continuous on the real axis. A direct calculation shows that ! lim − lim (Qk,n (ω + iε) S∞ (ω + iε) ) ε0

=−

ε 0

2 i (ω − β)ω 0 tab a L2 (dµ) 0 1 ω a,b=1

with dµ given by (2.7). Since the b are eigenfunctions of the Hamiltonian, we know according to (2.3) that A b = (ω − β)ω b . Using furthermore that the operator A is symmetric on L2 (dµ), we conclude that (ω − β)ω 0 A0 L2 (dµ) = L2 (dµ) = , 0 1 0 1 where in the last step we used (2.8).

Let us now consider for which values of ω and n the contour can be moved to the ´ φ) ` is non-zero unless ω ∈ real axis. According to Proposition 4.5, the Wronskian w(φ, [ω0 , 0]. We now analyze carefully the exceptional cases ω = 0, ω0 . From Theorem 3.1, Theorem 3.2 and Theorem 3.5 we know that the functions φ´ and φω = ωµ φ` are continuous for all ω ∈ R. If ak = 0 and ω = 0, the functions φ´ and φω degenerate to real solutions with the asymptotics ´ lim φ(u) = 1,

u→−∞

1 (µ) lim uµ− 2 φ0 (u) = √ . π

u→∞

Noting that the function ∂u

r2

+ a2

=

r 3

(r 2 + a 2 ) 2

= √

r r 2 + a2

2Mr 1− 2 r + a2

is monotone increasing, the potential V , (2.15), is everywhere positive. Hence solutions of the Schrödinger equation (2.14) are convex. This implies that the functions φ´ and φ` do not coincide, and thus their Wronskian is non-zero. As a consequence, the Green’s function (5.6), and thus the whole integrand in (5.1), is bounded and continu` and thus we can in ous near ω = 0 (note that (5.6) is invariant under rescalings of φ, this formula replace φ` by φω ). In the case ak = 0 and ω = 0, the function φ0 is real,



´ φ0 ) = 0. If on the other hand ak = 0 and ω = ω0 , whereas φ´ is complex, and thus w(φ, ´ φ0 ) = 0. Hence the integrand in (5.1) is φ´ is real and φ` is complex, and again w(φ, continuous and bounded at the points ω = 0, ω0 . We conclude that for every n ∈ N, the integrand in (5.1) is continuous on an open neighborhood of ω ∈ R \ (ω0 , 0). Fur´ φ) ` = 0 if ω ∈ (ω0 , 0) and λ is sufficiently thermore, according to Proposition 4.6, w(φ, large. We have thus proved the following result. Proposition 5.2. There is δ > 0 and n0 ∈ N such that for every 0 ∈ C0∞ (R × S 2 )2 , the completeness relation ∞ 2 0 dω 1 ωn ωn 0 = + tab a < bωn , 0 > 2π ω R \[ω ,0] ω 0 0 n>n0 n=0 a,b=1 " + (Qn S∞ 0 ) dω n≤n0 Dδ

holds, with the contour Dδ as in Fig. 5. We point out that the contour Dδ passes along the line segment [ω0 , ω0 +δ) twice, once as the limit of the contour in the lower half plane, and once as limit of the contour in the upper half plane. These two integrals can be combined to one integral over [ω0 , ω0 + δ) with the integrand given by (5.4). Let us now consider how the remaining contour integrals over Cε can be moved to the real line. According to Theorems 3.1 and 3.2, the functions φ´ and φ` have for every n ≤ n0 and for every ω ∈ (ω0 , 0) a holomorphic extension to a neighborhood of ω. Thus their Wronskian is also holomorphic in this neighborhood, and consequently ´ φ) ` = 0 for ω near 0 they can have only isolated zeros of finite order. Since w(φ, and ω0 , we conclude that the numbers of zeros must be finite. Since we only need to consider a finite number of angular momentum modes, there is at most a finite number of points ω1 , . . . , ωK ∈ (ω0 , 0), K ≥ 0, where any of the Wronskians w(φ´ n , φ` n ) has a zero. We denote the maximum of the orders of these zeros at ωi by li ∈ N. The above zeros of the Wronskian lead to poles in the integrand of (5.1) and correspond to radiant modes. We will prove in Sect. 7 by contradiction that these radiant modes are actually absent. Therefore, we now make the assumption that there are radiant modes, i.e. that the Wronskians w(φ´ n , φ` n ) have at least one zero on the real axis. As a preparation for the analysis of Sect. 7, we now choose a special configuration where radiant modes appear, but in the simplest possible way. We choose new initial data 0 = P(H ) 0 ,

(5.7)

Fig. 5. The integration contour Dδ (axes: Re ω, Im ω; marked points: ω0 and ω0 + δ)



where P is the polynomial P(x) = ω (ω − ω0 ) (x − ω1 )

l1 −1

K #

(x − ωi )li .

i=2

Then 0 again has compact support, and using the spectral calculus, the corresponding solution (t) of the Cauchy problem is obtained from (5.1) by multiplying the integrand by P(ω). Then the poles of the integrand at ω2 , . . . , ωK disappear, and at ω1 a simple pole remains. Subtracting this pole, the integrand becomes analytic, whereas for the pole itself we get a contour integral which can be computed with residues. Let us summarize the result of the above construction with a compact notation. For a test function η ∈ C0∞ (R × S 2 )2 we introduce the vectors ηωn by ηωn = (η1ωn , η2ωn ) where ηaωn = < aωn , η>. Proposition 5.3. Assume that there are radiant modes, K ≥ 1. Then the Cauchy development (t) of the initial data (5.7) satisfies the relation 1 ∞ −iωt ωn ωn ωn = e η , T C2 dω + e−iω1 t ηω1 n , σ n C2 . 2π −∞ n≤n n∈N

0

Here ω1 ∈ (ω0 , 0). The (σ n )n=1,...,n0 are vectors in C2 , at least one of which is non-zero. The matrices T ωn have the following properties, (1) If ω ∈ [ω0 , 0] or n > n0 , ωn (T ωn )ab = tab

(5.8)

with tab according to (5.5). (2) For each n, the function T ωn is continuous in ω ∈ R and analytic in (ω0 , 0). 6. Energy Splitting Estimates In this section, we consider the family of test functions ηL (u) = η(u + L) for a fixed η ∈ C0∞ (R × S 2 )2 . Our goal is to control the inner product in the limit L → ∞ when the support of ηL moves towards the event horizon. Our method is to split up the inner product into a positive and an indefinite part. Once the indefinite part is bounded using the ODE estimates of Sect. 4, we can use the Schwarz inequality and energy conservation to also control the positive part. We choose u1 ∈ R and general test functions η, ζ ∈ C0∞ ((−∞, u1 ) × S 2 )2 which are supported to the left of u1 (for later use we often work more generally with ζ instead of 0 ). Since for each fixed n, the T ωn are continuous and the eigensolutions aωn (u) are, according to Theorem 3.1, also continuous in ω, uniformly for u ∈ (−∞, u1 ), we have no difficulty controlling the expressions ηωn , T ωn ζ ωn C2 for n ≤ n0 and ω ∈ [ω0 , 0]. Hence we only need to consider the case when the matrix T ωn is given by (5.8). Using (5.5), the eigenvalues λ± of this matrix are α λ± = 1 ± . (6.1) β
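The eigenvalue formula (6.1), read here as $\lambda_\pm = 1 \pm |\alpha/\beta|$, can be confirmed on sample values. The sketch below is an illustration under that reading: it builds the real symmetric matrix with entries $t_{11} = 1 + \operatorname{Re}(\alpha/\beta)$, $t_{12} = t_{21} = -\operatorname{Im}(\alpha/\beta)$, $t_{22} = 1 - \operatorname{Re}(\alpha/\beta)$ from (5.5) and compares its eigenvalues with $1 \pm |\alpha|/|\beta|$. The numerical values of $\alpha$ and $\beta$ are arbitrary.

```python
import math

def t_matrix(alpha, beta):
    # Real symmetric 2x2 matrix with entries t_ab as in (5.5).
    q = alpha / beta
    return [[1 + q.real, -q.imag], [-q.imag, 1 - q.real]]

def eigenvalues(t):
    # Closed-form eigenvalues of a real symmetric 2x2 matrix.
    mean = (t[0][0] + t[1][1]) / 2
    half_gap = math.hypot((t[0][0] - t[1][1]) / 2, t[0][1])
    return mean - half_gap, mean + half_gap

alpha, beta = complex(0.3, -1.2), complex(2.0, 0.5)  # arbitrary sample values
lo, hi = eigenvalues(t_matrix(alpha, beta))
ratio = abs(alpha) / abs(beta)
print(lo, 1 - ratio)
print(hi, 1 + ratio)
```

Since the trace is 2 and the determinant is $1 - |\alpha/\beta|^2$, the smaller eigenvalue is negative exactly when $|\alpha| > |\beta|$.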



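In order to determine the sign of these eigenvalues, we first use the asymptotics (3.1, 3.2) to compute the Wronskians $w(\grave{\phi}, \overline{\grave{\phi}}) = -2i\omega$ and $w(\acute{\phi}, \overline{\acute{\phi}}) = 2i\Omega$. Furthermore, we obtain from (5.2) and its complex conjugate that

\[ w(\grave{\phi}, \overline{\grave{\phi}}) = (|\alpha|^2 - |\beta|^2)\, w(\acute{\phi}, \overline{\acute{\phi}}). \]

Combining these identities, we find that

\[ |\alpha|^2 - |\beta|^2 = -\frac{\omega}{\Omega}. \tag{6.2} \]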

From (6.1, 6.2) we see that in the case $\omega \notin [\omega_0, 0]$, where $\omega$ and $\Omega$ have the same sign, the eigenvalues $\lambda_\pm$ are both positive. However, if $\omega \in (\omega_0, 0)$, one of the eigenvalues is negative. This result is not surprising, because the lack of positivity corresponds to the fact that for $\omega \in [\omega_0, 0]$ the energy density can be negative inside the ergosphere. In the case when $T^{\omega n}$ is not positive, we decompose it into the difference of two positive matrices,

\[ T^{\omega n} = T_+^{\omega n} - T_-^{\omega n} \quad \text{for } \omega \in (\omega_0, 0),\ n > n_0, \]

where T−ωn = −λ− 1. In the next lemma we bound the integral over T−ωn using ODE techniques. Lemma 6.1. For any ε > 0 we can, possibly by increasing n0 , arrange that for all L ≤ 0, 0 ηωn , T ωn ζ ωn C2 dω ≤ ε. − L n>n0 ω0

Proof. Using (6.2, 5.3) we can estimate the norm of T− by ω 1 |α|2 − |β|2 2|ω| T− = |λ− | = = . ≤ 2 ´ φ)| ` 2 |β| (|α| + |β|) 2 |β| |w(φ, Hence

0

n>n0 ω0

ωn ωn ωn η , T ζ C2 dω ≤ 2 − L

0

n>n0 ω0

ωn | |ηL |ζ ωn | |ω| dω. ´ φ)| ` |w(φ, ´ φ)| ` |w(φ,

Writing out the energy scalar product using [5, Eq. (2.14)] and expressing the funda´ one sees that mental solutions aωn in terms of the radial solution φ, ωn ´ |ηL | ≤ c sup |ηL φ|,

R

´ |ζ ωn | ≤ c sup |ζ φ|, R

where the constant c = c(ω) is independent of λ. Now we apply Proposition 4.6 and use that the eigenvalues λn grow quadratically in n, (2.11). Lemma 6.2. There is a constant C > 0 such that for all L ≥ 0, ωn ωn ωn η , T ζ C2 dω ≤ C. n∈N R\[ω0 ,0]

L



Proof. First of all, using the positivity of the matrix T , ωn ωn ωn η , T ζ C2 dω L n∈N R\[ω0 ,0]

≤

! 1 ωn ωn ηL , T ωn ηL C2 + ζ ωn , T ωn ζ ωn C2 dω. 2 n∈N R\[ω0 ,0]

The two summands can be treated in exactly the same way; we treat the summand involvωn because of the additional L-dependence. Applying Proposition 5.2 and dropping ing ηL all negative terms, we get ωn ωn ηL , T+ωn ηL C2 dω ≤ R\[ω0 ,0]

+

0

n>n0 ω0

ωn ωn ηL , T−ωn ηL C2 dω +

" n≤n0 Dδ

|| dω.

Using the asymptotic form of the energy scalar product and the Hamiltonian near the event horizon, it is obvious that the first term stays bounded as L → ∞. The second term is bounded according to Lemma 6.1. For the contour integrals we can use the formula (5.4) on the real interval [ω0 , ω0 + δ). Since Theorem 3.1 gives us control of the asymptotics of fundamental solution φ´ uniformly as u → −∞, it is clear that the integral over [ω0 , ω0 + δ) is bounded uniformly in L. For the contour in the complex plane, we cannot work with (5.4), but we must instead consider the formula for the operator product Qn S∞ given in [5, Proposition 5.4] together with the estimate for the Green’s function given in Lemma 6.3 below. Lemma 6.3. For every ω˜ ∈ Dδ with ω = ω0 , there are constants C, > 0 and u0 ∈ R such the Green’s function satisfies for all ω ∈ Dδ ∩ B (ω) ˜ the inequality |g(u, v)| ≤ C for all u, v ≤ u0 . Proof. It suffices to consider the case Im ω ≤ 0, because the Green’s function in the upper half plane is obtained simply by complex conjugation. By symmetry, we can furthermore assume that u ≤ v. Thus, according to (5.6), we must prove the inequality φ(u) ` φ(v) ´ ≤ C for all u ≤ v ≤ u0 . w(φ, ´ φ) ` ´ φ) ` has no zeros away According to Whiting’s mode stability [10], the Wronskian w(φ, from the real line, and thus by choosing δ so small that Bδ (ω) ˜ lies entirely in the lower ´ φ)| ` is bounded away from zero on Bδ (ω). half plane, we can arrange that |w(φ, ˜ Hence ´ ` ´ φ) ` our task is to bound the factor |φ(u) φ(v)|. Solving the defining equation for w(φ, for φ` and integrating, we obtain u0 φ` u0 du ´ ` . = −w(φ, φ) ´ 2 φ´ v φ(u) v



Substituting the identity 1

e−2iu d 2i du

=

´ 2 φ(u)

e2iu ´ 2 φ(u)

−

1 d 2i du

1 ´ 2 φ(u)

,

the integral over the last term gives a boundary term, v

u0

1 d 2i du

1

´ 2 φ(u)

1 1 u0 = . ´ 2 v 2i φ(u)

The integral over the other term can be estimated by 1 2

u0

v

2iu

´ e−2Im v u0 (e−iu φ(u)) −2iu e e dv ≤ dv. 2 −iu 3 ´ ´ (e φ(u) φ(u)) v

Using the asymptotics (3.1) one sees that the last integrand vanishes at the event horizon. ´ (3.7, 3.8), we see that this integrand decays even expoFrom the series expansion for φ, nentially fast. Therefore, the last integral is finite, uniformly in v and locally uniformly ´ the in ω. Collecting all the obtained terms and using the known asymptotics (3.1) of φ, result follows. Lemma 6.4. For any ε > 0 we can, possibly by increasing n0 , arrange that for all L ≥ 0,

0

n>n0 ω0

ωn ωn ωn η , T ζ C2 dω ≤ ε. + L

Proof. Again using positivity, it suffices to bound the terms n>n0

0 ω0

ωn ωn ηL , T+ωn ηL C 2

dω

and

n>n0

0 ω0

ζ ωn , T+ωn ζ ωn C2 dω .

They can be treated similarly, consider for example the first term. For any n1 > n0 , inf λ2n1

(ω0 ,0)

0

n≥n1 ω0

ωn ωn ηL , T+ωn ηL C2 dω ≤

≤ + +

" n≤n0 Dδ

n>n0

0 ω0

0

n>n0 ω0

(AηL )ωn , T+ωn (AηL )ωn C2 dω

(AηL )ωn , T−ωn (AηL )ωn C2 dω

|| dω.

Here A is the angular operator. When it acts on a test function, we always get rid of the time-derivatives with the replacement i∂t → H . We now argue as in the proof of Lemma 6.2 (with ηL replaced by AηL ) and choose n1 sufficiently large.



7. An Integral Representation on the Real Axis We now use a causality argument together with the estimates of the previous section to show that the radiant modes in Proposition 5.3 must be absent. This will be a contradiction to the assumption that there are radiant modes, ruling out the possibility that there are radiant modes at all. This will lead us to an integral representation of the propagator on the real axis. Let us return to the setting of Proposition 5.3. Choosing the ϑ-dependence of η such that it is orthogonal to the angular wave functions ( aω1 n )n≤n0 except for one n, and choosing the u-dependence of η such that it is orthogonal only to one of the plane waves e±i(ω1 −ω0 )u , we can clearly arrange that ω1 n lim sup |σ (L)| =: κ > 0 where σ (L) := C 2 . (7.1) L→∞

n≤n0

Furthermore, we choose η such that its support lies to the left of the support of 0 , i.e. dist(supp ηL , supp 0 ) > L for all L ≥ 0. Due to the finite propagation speed (which in the (t, u)-coordinates is equal to one), supp ηL ∩ supp (t) = ∅

if |t| ≤ L.

Hence for all L > 0, L 1 0= eiω1 t 2L −L 1 ∞ sin((ω − ω1 )L) ωn ωn ωn = ηL , T 0 C2 dω + σ (L). 2π (ω − ω1 )L −∞ n∈N
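The kernel $\sin((\omega - \omega_1)L) / ((\omega - \omega_1)L)$ in the last display is what one obtains from averaging $e^{i\omega_1 t}\, e^{-i\omega t}$ over $t \in [-L, L]$. The sketch below checks this elementary identity numerically and shows how the kernel suppresses frequencies away from $\omega_1$ as $L$ grows. It is only an illustration: the value chosen for $\omega_1$ and the quadrature resolution are hypothetical.

```python
import cmath
import math

def time_average(omega, omega1, L, n=20000):
    # (1/2L) * integral over [-L, L] of exp(i*omega1*t) * exp(-i*omega*t) dt,
    # approximated with the composite midpoint rule.
    h = 2 * L / n
    total = sum(cmath.exp(1j * (omega1 - omega) * (-L + (k + 0.5) * h))
                for k in range(n))
    return total * h / (2 * L)

def sinc_kernel(omega, omega1, L):
    # sin(x)/x with x = (omega - omega1) * L, extended by 1 at x = 0.
    x = (omega - omega1) * L
    return 1.0 if x == 0 else math.sin(x) / x

omega1 = -0.4  # hypothetical position of a radiant mode
for L in (1.0, 10.0, 100.0):
    for omega in (-0.4, -0.1, 2.0):
        print(L, omega, time_average(omega, omega1, L).real,
              sinc_kernel(omega, omega1, L))
```

At $\omega = \omega_1$ the average equals one for every $L$, while for $\omega \ne \omega_1$ it is bounded by $1/(|\omega - \omega_1| L)$; this is the mechanism by which the time average isolates the contribution of the mode at $\omega_1$.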

We apply Lemma 6.1 and Lemma 6.4 with ε = κ/(8π ) to obtain κ 1 0 ωn ωn ωn ηL , T 0 C2 dω ≤ . 2 2π n>n ω0 0

Furthermore, Lemma 6.2 gives rise to the estimate sin((ω − ω1 )L) sin((ω − ω1 )L) ωn ωn ωn , ηL , T 0 C2 dω ≤ C sup (ω − ω1 )L (ω − ω1 )L R\[ω0 ,0] R\[ω0 ,0] and since this supremum tends to zero as L → ∞, we conclude that the expression on the left vanishes in the limit L → ∞. Combining these estimates with (7.1), we obtain 0 sin((ω − ω )L) 1 ωn ωn ωn lim sup (7.2) ηL , T 0 C2 dω ≥ π κ. (ω − ω1 )L ω0 L→∞ n≤n0

Since the matrices T ωn are continuous in ω and the fundamental solutions aωn (u) are according to Theorem 3.1 uniformly bounded as u → −∞, there is a constant C such that ωn ωn ωn η , T C2 ≤ C for all L ≥ 0 and ω ∈ (ω0 , 0), n ≤ n0 . 0 L



Hence we can apply Lebesgue’s dominated convergence theorem on the left of (7.2) and take the limit L → ∞ inside the integral, giving zero. This is a contradiction. ´ φ) ` has Since radiant modes have been ruled out, we know that the Wronskian w(φ, no zeros on the real axis. Thus we can move all contours up to the real axis. This gives the following integral representation for the propagator. Theorem 7.1. For any initial data 0 ∈ C0∞ (R × S 2 )2 , the solution of the Cauchy problem has the integral representation (t, r, ϑ, ϕ) =

2 1 −ikϕ ∞ dω −iωt kωn a b e tab kωn (r, ϑ) < kωn , 0 > e 2π −∞ ω k∈Z

n∈IN

a,b=1

with the coefficients tab as given by (5.5, 5.2). Here the sums and the integrals converge in L2loc . 8. Proof of Decay We now combine the integral representation of the solution of the Cauchy problem obtained in Theorem 7.1 with the energy splitting estimates of Sect. 6 to prove our main decay theorem. Proof of Theorem 1.1. We choose an interval [rL , rR ] ⊂ (r1 , ∞) and let K be the compact set K = [rL , rR ] × S 2 . As a consequence of Theorem 7.1, we have for any η ∈ ◦

C0∞ (K )2 the integral representation

1 ∞ −iωt ωn ωn ωn = e η , T 0 C2 dω. 2π −∞

(8.1)

n∈IN

It is useful to introduce the short notation (η, ζ ) = .

(8.2)

We consider on K the Hilbert space H = L2 (K, dµ)2 and denote its scalar product by H . We can represent the inner product (8.2) as (η, ζ ) = H with the operator B given by A0 A(β − ω0 ) A2 B = H (H − ω0 ) = . 0 1 (β − ω0 )A A + β (β − ω0 ) This operator, densely defined on C0∞ (K)2 ⊂ H, is obviously symmetric. We now construct a self-adjoint extension. We decompose B in the form 2 A 0 B = B0 + E with B0 := . 0 A The elliptic operator A on the compact domain K is essentially self-adjoint and has compact resolvent (see [5, Sect. 3]). Thus we can choose a domain D(B0 ) which makes B0



self-adjoint. Denoting the resolvents by Rλ0 = (B0 − λ)−1 and Rλ = (B − λ)−1 , the resolvent identity reads $ $ $ 0 0 0 0 Rλ = (1 1 + Rλ E) Rλ = Rλ 1 + Rλ E Rλ0 B0 − λ R λ . Writing out the operator inside the brackets, $ $ 0 F : = Rλ E Rλ0 1 1 0 (A2 − λ)− 2 A(β − ω0 ) (A − λ)− 2 , = 1 1 1 1 (A − λ)− 2 (β − ω0 )A(A2 − λ)− 2 (A − λ)− 2 β (β − ω0 ) (A − λ)− 2 1

and using that (A2 − λ)− 2 AH ≤ 1 for λ < 0, we conclude that by choosing λ 0, we can make the norm of F arbitrarily small. Hence the operator 1 + F is invertible, and we obtain the formula $ $ Rλ = Rλ0 (1 1 + F )−1 Rλ0 . We conclude that the operator Rλ is also compact. This gives us a self-adjoint extension of B with a purely discrete spectrum without limit points and finite-dimensional eigenspaces. We now arrange that the operator B has no kernel. Namely, if on the contrary the operator has a kernel, it is obvious from the definition of B that one of the operators A, H or H − ω0 has a kernel. Using the separation of variables, we get corresponding radial ODEs with Dirichlet boundary conditions at rL and rR . Since non-trivial solutions of these ODEs have discrete zeros, we can by increasing the size of the interval [rL , rR ] arrange that B has no kernel. Due to the purely discrete spectrum, there is a constant c > 0 such that Bξ ≥

1 ξ c

for all ξ ∈ D(B).

(8.3)
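The invertibility of 1 + F for small norm of F, used above to express the resolvent of B through that of B0, rests on the Neumann series for the inverse of 1 + F, namely the sum of the powers of -F. The following is a minimal numerical sketch of that mechanism only. It assumes a hypothetical 2x2 real matrix F chosen for illustration (it is not derived from the operators A and β of the text), and the helper names are made up.

```python
# Illustrative sketch only: F below is a made-up 2x2 matrix with norm < 1,
# standing in for the small operator F of the text.

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def neumann_inverse(F, terms=60):
    """Approximate (1 + F)^(-1) by the partial sum of (-F)^k for k < terms."""
    identity = [[1.0, 0.0], [0.0, 1.0]]
    minus_F = [[-F[i][j] for j in range(2)] for i in range(2)]
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = identity  # current power (-F)^k, starting with k = 0
    for _ in range(terms):
        total = [[total[i][j] + power[i][j] for j in range(2)] for i in range(2)]
        power = matmul(power, minus_F)
    return total

F = [[0.0, 0.3], [0.3, 0.1]]  # hypothetical example, operator norm about 0.35
one_plus_F = [[1.0 + F[0][0], F[0][1]], [F[1][0], 1.0 + F[1][1]]]
check = matmul(one_plus_F, neumann_inverse(F))  # should be close to the identity
```

The series converges geometrically in the norm of F, which is why making that norm small by taking λ very negative is enough for the argument.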

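Let $\varepsilon > 0$. For given $\omega_1 > |\omega_0|$ and $n_1 \geq n_0$ we set
\[
T_I^{\omega n} = \begin{cases} T_-^{\omega n} & \text{if } n > n_1 \text{ and } \omega \in (\omega_0, 0) \\ T^{\omega n} & \text{if } n \leq n_1 \text{ and } \omega \in [-\omega_1, \omega_1] \\ 0 & \text{otherwise} \end{cases}
\]
and $T_+^{\omega n} = T^{\omega n} - T_I^{\omega n}$. Furthermore, we introduce the short notation
\[
\int_N d\nu \cdots \equiv \frac{1}{2\pi} \sum_{n \in \mathbb{N}} \int_{-\infty}^{\infty} d\omega \cdots
\]
and omit the superscript $\omega n$. With this notation, we can write (8.1) for $t = 0$ in the compact form
\[
(\eta, \zeta) = \int_N \langle \eta, (T_+ + T_I) \, \zeta \rangle_{\mathbb{C}^2} \, d\nu .
\]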


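Since in Lemma 6.1 we used pointwise estimates of the $\Psi_a^{\omega n}$, these estimates depend on $\eta$ and $\zeta$ only via their norm in the Hilbert space $\mathcal{H}$. The same is true for the finite number of modes $n \leq n_1$ for $\omega$ in the compact set $[-\omega_1, \omega_1]$. We thus have the bound
\[
\int_N | \langle \eta, T_I \, \zeta \rangle_{\mathbb{C}^2} | \, d\nu \leq c(K, \omega_1, n_1) \, \| \eta \|_{\mathcal{H}} \, \| \zeta \|_{\mathcal{H}} . \tag{8.4}
\]
Now we estimate the inner product $(\eta, \Psi(t))$ for $\eta \in C_0^\infty(\mathring{K})^2$ as follows,
\[
| (\eta, \Psi(t)) | \leq \left| \int_N e^{-i\omega t} \langle \eta, T_I \Psi_0 \rangle_{\mathbb{C}^2} \, d\nu \right| + \frac{1}{C} \int_N | \langle \eta, T_+ (1 + H^2)(1 + A) \Psi_0 \rangle_{\mathbb{C}^2} | \, d\nu , \tag{8.5}
\]
where the constant $C$, given by
\[
C = C(n_1, \omega_1) = \inf_{n > n_1 \text{ or } |\omega| > \omega_1} (1 + \omega^2)(1 + \lambda_n(\omega)) ,
\]
can be made arbitrarily large by increasing $\omega_1$ and $n_1$. Since $T_+$ is positive, we have, using the Schwarz inequality,
\[
\left| \int_N \langle \eta, T_+ \zeta \rangle_{\mathbb{C}^2} \, d\nu \right| \leq \left( \int_N \langle \eta, T_+ \eta \rangle_{\mathbb{C}^2} \, d\nu \right)^{\frac12} \left( \int_N \langle \zeta, T_+ \zeta \rangle_{\mathbb{C}^2} \, d\nu \right)^{\frac12} .
\]
Applying this inequality in the last term of (8.5), we obtain
\[
\left( \int_N | \langle \eta, T_+ (1 + H^2)(1 + A) \Psi_0 \rangle_{\mathbb{C}^2} | \, d\nu \right)^2 \leq c(\Psi_0) \int_N \langle \eta, T_+ \eta \rangle_{\mathbb{C}^2} \, d\nu = c(\Psi_0) \left( (\eta, \eta) - \int_N \langle \eta, T_I \eta \rangle_{\mathbb{C}^2} \, d\nu \right) \leq c(\Psi_0) \left( |(\eta, \eta)| + \| \eta \|_{\mathcal{H}}^2 \right) ,
\]
where in the last step we used (8.4). Hence by choosing $\omega_1$ and $n_1$ sufficiently large, we can arrange that the second summand in (8.5) is smaller than $\varepsilon \, ( |(\eta, \eta)| + \| \eta \|_{\mathcal{H}}^2 )^{\frac12}$. The first term in (8.5) consists of the sum of the angular modes $n > n_1$ and $n \leq n_1$. For the sum over $n > n_1$, we can again apply Lemma 6.1, keeping in mind that the dependence on $\eta$ and $\zeta$ is controlled by their norms. Possibly by further increasing $n_1$ we can arrange that this contribution is bounded by $\varepsilon \| \eta \|_{L^2}$. For the remaining finite sum $n \leq n_1$ we simply apply the Riemann-Lebesgue lemma. We conclude that, for large $t$,
\[
| (\eta, \Psi(t)) | \leq \varepsilon \left( \| \eta \|_{L^2(K, d\mu)} + | (\eta, \eta) |^{\frac12} \right) \quad \text{for all } \eta \in C_0^\infty(\mathring{K})^2 .
\]
We rewrite this inequality in the Hilbert space $\mathcal{H}$,
\[
\langle \eta, B \Psi(t) \rangle_{\mathcal{H}}^2 \leq 2 \varepsilon^2 \left( \| \eta \|_{\mathcal{H}}^2 + \langle \eta, B \eta \rangle_{\mathcal{H}} \right) .
\]
By continuity, this inequality holds for any $\eta \in \mathcal{H}$. Evaluating this inequality for the sequence $\eta_k$ obtained by projecting $\Psi(t)$ on the eigenspaces with eigenvalues $\leq k$,
\[
\eta_k = \chi_{(-\infty, k]}(B) \, \Psi(t) ,
\]
and taking the limit $k \to \infty$, we obtain the inequality
\[
\langle \Psi(t), B \Psi(t) \rangle_{\mathcal{H}}^2 \leq 2 \varepsilon^2 \left( \| \Psi(t) \|_{\mathcal{H}}^2 + \langle \Psi(t), B \Psi(t) \rangle_{\mathcal{H}} \right) .
\]
Since the term $\langle \Psi(t), B \Psi(t) \rangle_{\mathcal{H}}$ vanishes only if $\Psi(t) = 0$, we may divide by this term. Using (8.3), we conclude that
\[
\frac{1}{c} \, \| \Psi(t) \|_{\mathcal{H}}^2 \leq 2 \varepsilon^2 \left( \frac{\| \Psi(t) \|_{\mathcal{H}}^2}{\langle \Psi(t), B \Psi(t) \rangle_{\mathcal{H}}} + 1 \right) \leq 2 \varepsilon^2 (c + 1)
\]
and thus, for sufficiently large $t$,
\[
\| \Psi(t) \|_{\mathcal{H}} \leq \varepsilon \, (2 c (c + 1))^{\frac12} .
\]
We conclude that $\Psi(t)$ converges to zero in $L^2(K)$. Applying the same argument to the initial data $H^n \Psi_0$, we conclude that the partial derivatives of $\Psi(t)$ also decay in $L^2(K)$. The Sobolev embedding $H^{2,2}(K) \to L^\infty(K)$ proves the theorem.

Acknowledgement. We would like to thank Johann Kronthaler for discussions on the Jost equation. We are grateful to the Vielberth Foundation, Regensburg, for support.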

References

1. Cardoso, V., Yoshida, S.: Superradiant instabilities of rotating black branes and strings. JHEP 0507, 009 (2005)
2. Chandrasekhar, S.: The mathematical theory of black holes. Oxford: Oxford University Press, 1983
3. De Alfaro, V., Regge, T.: Potential Scattering. Amsterdam: North-Holland Publishing Company, 1965
4. Finster, F., Kamran, N., Smoller, J., Yau, S.T.: The long-time dynamics of Dirac particles in the Kerr-Newman black hole geometry. Adv. Theor. Math. Phys. 7, 25–52 (2003)
5. Finster, F., Kamran, N., Smoller, J., Yau, S.T.: An integral spectral representation of the propagator for the wave equation in the Kerr geometry. Commun. Math. Phys. 260, no. 2, 257–298 (2005)
6. Finster, F., Schmid, H.: Spectral estimates and non-selfadjoint perturbations of spheroidal wave operators. http://atxiv.org/list/math-ph/0405010, to appear in Crelle’s Journal (2006)
7. Kay, B., Wald, R.: Linear stability of Schwarzschild under perturbations which are nonvanishing on the bifurcation 2-sphere. Classical Quantum Gravity 4, 893–898 (1987)
8. Press, W.H., Teukolsky, S.A.: Perturbations of a rotating black hole. II. Dynamical stability of the Kerr metric. Astrophys. J. 185, 649 (1973)
9. Price, R.H.: Nonspherical perturbations of relativistic gravitational collapse I, scalar and gravitational perturbations. Phys. Rev. D (3) 5, 2419–2438 (1972)
10. Whiting, B.: Mode stability of the Kerr black hole. J. Math. Phys. 30, 1301–1305 (1989)
11. Regge, T., Wheeler, J.A.: Stability of the Schwarzschild singularity. Phys. Rev. (2) 108, 1063–1069 (1957)

Communicated by G.W. Gibbons


Derivation of the Gross-Pitaevskii Equation for Rotating Bose Gases

Elliott H. Lieb1, Robert Seiringer2

1 Departments of Mathematics and Physics, Jadwin Hall, Princeton University, P. O. Box 708, Princeton, NJ 08544, USA. E-mail: [email protected]
2 Department of Physics, Jadwin Hall, Princeton University, P. O. Box 708, Princeton, NJ 08544, USA. E-mail: [email protected]

Received: 27 April 2005 / Accepted: 27 September 2005
Published online: 9 March 2006 – © E.H. Lieb and R. Seiringer 2006

© 2006 by the authors. This paper may be reproduced, in its entirety, for non-commercial purposes.
Work partially supported by U.S. National Science Foundation grant PHY 01 39984.
Work partially supported by U.S. National Science Foundation grant PHY 03 53181, and by an A.P. Sloan Fellowship.

Dedicated to Jakob Yngvason on the occasion of his 60th birthday

Abstract: We prove that the Gross-Pitaevskii equation correctly describes the ground state energy and corresponding one-particle density matrix of rotating, dilute, trapped Bose gases with repulsive two-body interactions. We also show that there is 100% Bose-Einstein condensation. While a proof that the GP equation correctly describes non-rotating or slowly rotating gases was known for some time, the rapidly rotating case was unclear because the Bose (i.e., symmetric) ground state is not the lowest eigenstate of the Hamiltonian in this case. We have been able to overcome this difficulty with the aid of coherent states. Our proof also conceptually simplifies the previous proof for the slowly rotating case. In the case of axially symmetric traps, our results show that the appearance of quantized vortices causes spontaneous symmetry breaking in the ground state.

1. Introduction

In this paper we show that a rotating Bose gas is correctly described by the Gross-Pitaevskii (GP) equation in a suitable low-density limit. We also show that there is 100% Bose-Einstein condensation (BEC) into a solution of the GP equation. These conclusions were heretofore unproved, and it might not be an exaggeration to say that they were even conjectural, primarily because of the unusual situation (proved in [23]) that the absolute ground state of the Schrödinger Hamiltonian is not the bosonic ground state in the rapidly rotating case, as it is in the case when there is little or no rotation. In other words, the vortices seen in rotating gases are not properties of the absolute ground state but are, instead, true manifestations of the bosonic symmetry requirement.

If the GP equation correctly describes the physics of a rotating gas (as we show here), then it also shows the superfluidity of such a gas, as will be discussed below. In the case of a cylindrically symmetric trap potential, the rotational symmetry is broken when more than one vortex is present; the GP equation must describe this broken rotational symmetry and, therefore, it must have multiple minimum energy solutions in this case.

The key mathematical tool employed here is coherent states. Our work is based on the results of [18] and the observation there that one can make a c-number substitution for many boson modes (not just one, as in Bogoliubov’s method) without significant error provided the number of such modes is of lower order than $N$, the number of particles.

As in our previous work [15, 12, 17, 23, 14] on dilute, trapped Bose gases we start with the Hamiltonian for $N$ bosons
\[
H_N = \sum_{i=1}^{N} H_0^{(i)} + \sum_{1 \leq i < j \leq N} v_N(x_i - x_j) , \tag{1}
\]

where $H_0$ is the one-body part of the Hamiltonian and $v_N$ is the two-body repulsive interaction. These terms and the GP limit are described as follows.

1. The GP limit. We want to fix the external trapping potential but let $N$ tend to infinity. To retain the notion of a dilute gas in this situation we let the interparticle potential depend on $N$ in such a way that $a_N$, the two-body scattering length of $v_N$, is related to $N$ by the condition that
\[
N a_N = a \quad \text{is fixed.} \tag{2}
\]
In this limit the three components of the energy (kinetic, trapping potential and interaction potential) scale in the same way and are all of the same order of magnitude. We call this the GP limit. It is this limit that will lead to the GP equation (6).

2. The two-body potential. We choose a radial two-body potential $w(x)$ such that $w(x) \geq 0$ (this is an important restriction for our methods) and such that $w(x) = 0$ for $|x| > R_0$ (this finite range condition is a technical restriction for simplicity and can be relaxed if need be). We note that integrability of $w(x)$ is not assumed here; $w(x)$ is even allowed to have a hard core. The scattering length of $w$ is $a$ (i.e., the solution to $[-2\Delta + w(x)] f(x) = 0$ with $f(\infty) = 1$ satisfies $f = 1 - a/|x|$ for $|x| > R_0$). The actual two-body potential in (1), given by
\[
v_N(x) = N^2 w(N x) , \tag{3}
\]
has scattering length $a_N = a/N$.

3. The one-body Hamiltonian. We work, as usual, in the rotating coordinate system, in which case the kinetic energy has to be supplemented by a term $-\Omega \cdot (p \wedge x) = p \cdot (\Omega \wedge x)$, where $\Omega$ is the angular velocity vector, and $p = -i\hbar\nabla$. It is convenient to add and subtract a term $\frac{m}{2} (\Omega \wedge x)^2$ and thereby write
\[
H_0 = \frac{1}{2m} (p + A(x))^2 + V(x) \tag{4}
\]

with $A(x) = m \, \Omega \wedge x$. Then $V$ is the trapping potential (which might or might not have some geometric symmetry) minus $\frac{m}{2} (\Omega \wedge x)^2$. It is well known that we must have $V(x) \to \infty$ as $|x| \to \infty$, for otherwise the system will fly apart. We can also assume that $V \geq 0$ without loss of generality. Actually, for technical reasons we require just a little more, namely $V(x) \geq C_1 \ln(|x|) - C_2$ for some positive constants $C_1$ and $C_2$. (This condition can probably be relaxed a bit. What we actually need is that $\mathrm{Tr}\, e^{\alpha(\Delta - V(x))}$ and $\mathrm{Tr}\, |A(x)|^s e^{\alpha(\Delta - V(x))}$ are finite for $\alpha$ large enough, for some $s > 2$. We will show in the appendix that this is fulfilled under the stated assumption on $V$.)

We note that in the rotating coordinate system, the velocity at $x$ is not $p/m$ but rather $v = i\hbar^{-1} [H_0, x] = p/m + \Omega \wedge x$. The angular velocity around the $\Omega$ axis is $\Omega \cdot (v \wedge x) \, |x^\perp|^{-2} |\Omega|^{-1}$, where $|x^\perp|$ is the distance to the $\Omega$ axis. In a cylindrically symmetric state $\psi$ we have $\langle \Omega \cdot (p \wedge x) \rangle_\psi = 0$ and, therefore, the angular velocity is $\Omega$, not zero. In the fixed frame the angular velocity is $\Omega - \Omega = 0$. In other words, the system in such a state is not rotating. As long as $\Omega$ is small enough, the GP ground state is cylindrically symmetric and hence there is no rotation; this is a manifestation of superfluidity. In order to have rotation at least one vortex must form. This is a typical property of superfluids.

Henceforth, we use units in which $\hbar = 2m = 1$. We also note that the modification of the kinetic energy in (4) is mathematically just like that caused by a uniform magnetic field with vector potential $A$ (and $e/c = 1$). There is nothing special about $A(x) = m \, \Omega \wedge x$ as far as the mathematics is concerned, so one could have an arbitrary $A$ without disturbing our analysis, provided it does not grow too fast at infinity.
One could think, for example, of applying a magnetic field to the system, but then our particles would have to be charged and the attendant Coulomb interaction would nullify the treatment of the system as a dilute gas with short range interaction. On the other hand, we could allow our particles to have a magnetic moment (“bosons with spin”) and our analysis would easily extend to this case. The ground state energy depends in a non-trivial way on the total spin when there is rotation [22], even in the absence of a magnetic field. This is due to the symmetry requirement of the wave function, whereby the symmetry of the spin part determines the spatial symmetry (see, e.g., [6]). We will not pursue this topic further in this paper.

Our analysis is carried out here for a gas of three-dimensional particles, but the same ideas apply to a two-dimensional gas. There will be changes, of course, because the notion of scattering length is different in 2D and because the energy per particle of a homogeneous gas of low density $\rho$ is not $4\pi\rho a_s$ as in 3D but rather $4\pi\rho / |\log(\rho a_s^2)|$. (Here, $a_s$ is the unscaled scattering length of the interaction potential, which is held fixed in the thermodynamic limit for the homogeneous gas.) Thus, the GP equation will be a little different, but the conclusion will be the same: The only effect of rotation is to replace $p^2$ by $|p + A|^2$ in the GP equation derived in [16]. In order to keep this paper manageable we do not discuss the 2D case, but the interested reader can easily combine the results in [16, 23] and the present paper.

The Hamiltonian $H_N$ acts on $L^2(\mathbb{R}^{3N})$ but we are interested in its restriction to the bosonic subspace of $L^2(\mathbb{R}^{3N})$, namely to permutation symmetric functions. We denote the ground state energy of $H_N$ in the bosonic sector by $E_0(N)$, and we keep in mind that this might be larger than the absolute ground state energy of $H_N$ when no permutation symmetry is imposed.
We turn now to the GP equation, which originates from the GP energy functional for a complex-valued function φ of one variable x ∈ R3 . For a ≥ 0, the GP energy functional is given by

\[
\mathcal{E}^{\rm GP}[\phi] = \langle \phi | H_0 | \phi \rangle + 4\pi a \int_{\mathbb{R}^3} |\phi(x)|^4 \, dx . \tag{5}
\]

It can easily be shown [15] that $\mathcal{E}^{\rm GP}[\phi]$ has a minimum over all $\phi$ with $\|\phi\|_2 = 1$ and this minimum energy is denoted by $E^{\rm GP}(a)$. (We use the standard notation $\|\phi\|_p = \left( \int |\phi(x)|^p dx \right)^{1/p}$.) There might be several minimizers (and there surely will be when the trap has axial symmetry and $a$ is large [22, 23]) but each minimizing $\phi$ will satisfy the GP equation
\[
(-i\nabla + A(x))^2 \phi(x) + V(x)\phi(x) + 8\pi a |\phi(x)|^2 \phi(x) = \mu \phi(x) , \tag{6}
\]
where $\mu$ is the chemical potential (i.e., the energy per particle to add a small number of particles). Note that $\mu = E^{\rm GP}(a) + 4\pi a \int |\phi(x)|^4 dx > E^{\rm GP}(a)$ because of the quartic nonlinearity.

Our main theorem concerning the bosonic ground state energy of (1) is the following.

Theorem 1. With $a$ denoting the scattering length of $w$, we have
\[
\lim_{N \to \infty} \frac{E_0(N)}{N} = E^{\rm GP}(a) . \tag{7}
\]
In [23] it was shown that
\[
\limsup_{N \to \infty} \frac{E_0(N)}{N} \leq E^{\rm GP}(a) , \tag{8}
\]
and, therefore, it remains only to prove a lower bound to $\liminf_{N \to \infty} E_0(N)/N$ of the right form, which we do here.

The GP energy minimizer(s) $\phi$ also tells us something about the density (diagonal and off-diagonal) and about Bose-Einstein condensation in the ground state of $H_N$, or any approximate ground state. We call a sequence of bosonic $N$-particle density matrices $\gamma_N$ an approximate ground state if $\lim_{N \to \infty} N^{-1} \mathrm{Tr}\, H_N \gamma_N = E^{\rm GP}(a)$. The reduced one-particle density matrix of $\gamma_N$ will be denoted by $\gamma_N^{(1)}$. We would like to suppose that, as $N \to \infty$, $\gamma_N^{(1)}$ converges to some $\gamma$ and that $\gamma = |\phi\rangle\langle\phi|$, where $\phi$ is a solution to the GP equation. This would be 100% Bose-Einstein condensation into the GP state and was proved to occur in the non-rotating case [12].

The difficulty in the rotating case is that the solution to the GP equation might not be unique (as it is in the non-rotating case), in which case the limit $\gamma$ need not be a pure state. We would expect, however, that $\gamma$ is always a convex combination of pure GP states, i.e., $\gamma = \sum_i \lambda_i |\phi_i\rangle\langle\phi_i|$, where $\phi_i$ is a solution to the GP equation and $\sum_i \lambda_i = 1$. (This, of course, is not the same as the much weaker and less interesting statement that $\gamma$ is a convex combination of terms of the form $|\psi\rangle\langle\psi|$, in which $\psi$ is a linear combination of GP solutions instead of being equal to just one GP solution.) Unfortunately, as in the case of a cylindrically symmetric trap, the set of GP states might not be countable, and so the summation $\sum_i$ must be replaced by some kind of integral. This accounts for the rather abstract Theorem 2 below. In any event, this theorem tells us that there is always 100% condensation, even if the system has a wide choice of states into which to condense.

Note that $\gamma_N^{(1)}$ is a positive trace class operator on the one-particle space $L^2(\mathbb{R}^3)$, and we choose the normalization $\mathrm{Tr}\, \gamma_N^{(1)} = 1$ for convenience. (The conventional normalization is $\mathrm{Tr}\, \gamma_N^{(1)} = N$.) By the Banach-Alaoglu Theorem, any sequence $\gamma_N^{(1)}$ will have a


509

subsequence that converges to some γ in the weak-* topology, i.e., limN→∞ Tr AγN(1) = Tr Aγ for all compact operators A. This convergence will even hold in the norm topology, i.e., limN→∞ Tr |γN(1) − γ | = 0 by compactness. More precisely, since the γN(1) are the one-particle density matrices of approximate ground states, we have (using the positivity of the interaction potential in HN ) Tr H0 γN(1) ≤ const. independently of N . √ √ √ √ √ √ Hence also H0 γN(1) H0 H0 γ H0 in weak-* sense, i.e., Tr A H0 γN(1) H0 → √ √ Tr A H0 γ H0 for all compact A. Since H0−1 is a compact operator, this implies that Tr γN(1) → Tr γ as N → ∞ (simply use A = H0−1 above). For positive operators, weak-* convergence plus convergence of the trace implies norm-convergence [27, 25]. We denote by the set of all γ s that are limit points of one-particle density matrices of approximate minimizers. That is, 1 GP (1) Tr HN γN = E (a), lim γN = γ . = γ : there is a sequence γN , lim N→∞ N N→∞ (9) As remarked above, the convergence γN(1) → γ can either mean weak-* convergence or norm convergence. Note that, in particular, norm convergence implies that Tr γ = 1 for all γ ∈ . Theorem 2. The set of one-particle density matrices of approximate ground states, as defined in (9), has the following properties: (i) is a compact and convex subset of the set of all trace class operators. (ii) Let ext ⊂ denote the set of extreme points in . (An element γ ∈ is extreme if γ cannot be written as γ = aγ1 + (1 − a)γ2 with γ1,2 ∈ , γ1 = γ2 , and 0 < a < 1.) We have ext = {|φ φ| : E GP [φ] = E GP (a)}, i.e., the extreme points in are given by the rank-one projections onto GP minimizers. (iii) For each γ ∈ , there is a positive (regular Borel) measure dµγ , supported in ext , with ext dµγ (φ) = 1, such that dµγ (φ) |φ φ| , (10) γ = ext

where the integral is understood in the weak sense. That is, every γ ∈ is a convex combination of rank-one projections onto GP minimizers. A consequence of the Krein–Milman Theorem [4, Vol. 2, Thm. 25.12] is that given any γ ∈ and given any ε > 0 there are finitely many GP minimizers φi and positive coefficients λi (with i λi = 1) such that γ = λi |φi φi | + ε (11) i

with Tr |ε | < ε. That is, every element of can be approximated by a finite convex combination of GP minimizers. We also note that part (iii) of Theorem 2 follows from part (ii) using Choquet’s Theorem [4, Vol. 2, Thm. 27.6]. We shall, however, prove part (iii) (and Eq. (11)) directly in Sect. 3 (see Step 4). Equation (10) reflects the spontaneous symmetry breaking that occurs in the system under consideration. Consider the case of an external potential V (x) which is axially symmetric, with symmetry axis given by the angular velocity vector . In general, the

non-uniqueness of the GP minimizer stems from the appearance of quantized vortices, which break the axial symmetry, and hence lead to a whole continuum of GP minimizers [22, 23, 2, 3, 8, 7, 1]. Uniqueness of the GP minimizer can be restored by perturbing the one-particle Hamiltonian $H_0$ in such a way as to break the symmetry and to favor one of the minimizers, e.g., by introducing a slightly asymmetric trap potential $V(x)$. This then leads to complete BEC, as can be seen from our Theorem 2, which does not assume any particular symmetry of $V(x)$. Note that in the case of a unique GP minimizer, Theorem 2 implies that the reduced one-particle density matrix of any approximate ground state converges to the projection onto this unique GP minimizer, since $\Gamma_{\rm ext}$ (and hence $\Gamma$) consists of only one element in this case.

The situation of a dilute rotating Bose gas described in this section contrasts with the situation of the absolute ground state of $H_N$, i.e., the lowest eigenvalue and corresponding state without imposing symmetry restrictions on the wavefunctions. In [23] it was shown that Eq. (7) does not hold, in general, for the absolute ground state energy. The energy per particle in this case is given by minimizing a functional similar to (5), but which now depends on one-particle density matrices rather than on wave functions $\phi(x)$. In [22, 23] it was shown that the corresponding energy is strictly lower than $E^{\rm GP}(a)$ for $a$ large enough (and $\Omega \neq 0$). The density matrix functional has a unique minimizer for any value of $\Omega$ and $a$, and in general this minimizer will not be rank one. An analogue of Theorem 2 also holds for the absolute ground state. As shown in [23], $\Gamma$ consists of only one element in this case, namely the unique minimizer of the density matrix functional just mentioned. This implies, in particular, that there is no spontaneous symmetry breaking in the absolute ground state. We refer the reader to [23] for more details.
In the remainder of this paper, we present the proof of Theorems 1 and 2. We are grateful to Lev Pitaevskii for drawing our attention to the problem of the correctness of the GP equation for a rapidly rotating Bose gas in an email correspondence in 1999.

2. Proof of Theorem 1

Step 1. Reduction of the number of particles to ensure a bounded energy per particle. One of the problems we shall face in our analysis is to control three-body collisions, i.e., to show that the ground state wave function is suitably small when three particles are close together. We have found a way to do this (see Step 4) with the help of a bound on the change in energy when three particles are added to the system. It is not evident that this bound is always satisfied (although it must be satisfied on average since the total energy is bounded by $N$) and the discussion in this subsection shows how to circumvent this annoyance. If another way could be found to control the three-body amplitude or to control the incremental energy then the analysis in this section would not be needed.

Let us consider the Hamiltonian (1) for $M \leq N$ particles (but still with interaction potential $v_N$ depending on $N$):
\[
H_{M,N} = \sum_{i=1}^{M} H_0^{(i)} + \sum_{1 \leq i < j \leq M} v_N(x_i - x_j) . \tag{12}
\]
This operator acts naturally on all of $L^2(\mathbb{R}^{3M})$. We denote the ground state energy in the bosonic sector by $E_0(M, N)$. Our goal is a good lower bound on $E_0(N, N)$.

Let $\widetilde{M} = \widetilde{M}(N)$ be the largest integer $\leq N$ satisfying two conditions: a.) $N - \widetilde{M}$ is divisible by 3 and b.) $E_0(\widetilde{M}, N) - E_0(\widetilde{M} - 3, N) \leq 6 E^{\rm GP}(a)$. Then $E_0(\widetilde{M} + 3, N) - E_0(\widetilde{M}, N) > 6 E^{\rm GP}(a)$, $E_0(\widetilde{M} + 6, N) - E_0(\widetilde{M} + 3, N) > 6 E^{\rm GP}(a)$, etc., whence
\[
E_0(N, N) \geq E_0(\widetilde{M}, N) + 2 (N - \widetilde{M}) E^{\rm GP}(a) . \tag{13}
\]
We will prove the following in the remainder of this section.

Proposition 1. Fix $Z > 0$, and let $M_j$ and $N_j$ be two sequences of integers, with $M_j \leq N_j$, $\lim_{j \to \infty} M_j = \infty$ and $\lim_{j \to \infty} N_j = \infty$, such that $E_0(M_j, N_j) - E_0(M_j - 3, N_j) \leq 3Z$ for all $j$ and $\lim_{j \to \infty} M_j / N_j = \lambda$ for some $0 \leq \lambda \leq 1$. Then
\[
\liminf_{j \to \infty} \frac{1}{N_j} E_0(M_j, N_j) \geq \lambda E^{\rm GP}(\lambda a) . \tag{14}
\]
Note that (14) does not depend on $Z$. It is now useful to note that the energy $E^{\rm GP}(a)$ is concave in $a$ (as an infimum over affine functions) and thus satisfies
\[
E^{\rm GP}(\lambda a) \geq (1 - \lambda) E^{\rm GP}(0) + \lambda E^{\rm GP}(a) \geq \lambda E^{\rm GP}(a) . \tag{15}
\]
The last inequality in (15) follows from $E^{\rm GP}(0) > 0$.

The sequence $\widetilde{M}(N)$ defined above will have a subsequence such that $\widetilde{M}(N_j)/N_j \to \lambda$ as $j \to \infty$ for some $0 \leq \lambda \leq 1$. If we combine (13)–(15) with $Z = 2 E^{\rm GP}(a)$ we find for this sequence that
\[
\lim_{j \to \infty} E_0(N_j, N_j)/N_j \geq \lambda^2 E^{\rm GP}(a) + 2 (1 - \lambda) E^{\rm GP}(a) = [1 + (1 - \lambda)^2] E^{\rm GP}(a) \geq E^{\rm GP}(a) , \tag{16}
\]
which proves (7) for this sequence $N_j$. (Here and in the following, we denote $\liminf$ by $\lim$ for short.) Together with the upper bound (8) we also conclude from (16) that $\lambda = 1$. That is, for $Z \geq 2 E^{\rm GP}(a)$ the sequence $\widetilde{M}(N)/N$ has only 1 as a limit point, and hence (16) holds for the full sequence $N = 1, 2, 3, \ldots$. Our goal in the rest of this section is to prove Proposition 1, which then proves (7), as just explained.
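The selection of the reduced particle number in Step 1 and the telescoping bound (13) can be sketched in a few lines. This is only an illustration under stated assumptions: `toy_energy` is a hypothetical convex stand-in for the true bosonic ground state energy, which is not computable this way, `e_gp` plays the role of the GP energy, and the function names are made up.

```python
def select_reduced_particle_number(energy, N, e_gp):
    """Largest M <= N with (N - M) divisible by 3 and
    energy(M) - energy(M - 3) <= 6 * e_gp.
    Returns None if no such M >= 3 exists."""
    M = N
    while M >= 3:
        if energy(M) - energy(M - 3) <= 6.0 * e_gp:
            return M
        M -= 3  # keep N - M divisible by 3
    return None

def toy_energy(M):
    # Hypothetical convex energy, used only to exercise the selection rule.
    return M * M / 10.0

N, e_gp = 30, 1.0
M_sel = select_reduced_particle_number(toy_energy, N, e_gp)
# Right side of (13): every step of 3 above M_sel costs more than 6 * e_gp.
lower_bound = toy_energy(M_sel) + 2.0 * (N - M_sel) * e_gp
```

For the toy energy, every increment of three particles above the selected value exceeds 6 `e_gp`, so summing the increments reproduces the structure of (13).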

Step 2. The generalized Dyson Lemma. To get a lower bound on $E_0(M, N)$, we start by deriving a lower bound on the Hamiltonian $H_{M,N}$, using Corollary 1 in [13]. This corollary, which is a generalization of Lemma 1 in [19] which, in turn, stems from Lemma 1 in Dyson’s paper [5], asserts the following. (Note that the range of the potential $v_N$ is $R_0/N$, and its scattering length is $a/N$. We use the “hat” to denote Fourier transform.)

Lemma 1. Let $R > R_0/N$. Let $\chi(p)$ be a radial function such that $0 \leq \chi(p) \leq 1$ and such that $h(x) \equiv \widehat{(1 - \chi)}(x)$ is bounded and integrable (which implies that $\chi(p) \to 1$ as $|p| \to \infty$). Let
\[
f_R(x) = \sup_{|y| \leq R} |h(x - y) - h(x)| , \tag{17}
\]
and
\[
w_R(x) = \frac{2}{\pi^2} f_R(x) \int_{\mathbb{R}^3} f_R(y) \, dy . \tag{18}
\]
Let $U_R(x)$ be any positive, radial function that vanishes outside the annulus $R_0/N \leq |x| \leq R$, with $\int_{\mathbb{R}^3} U_R(x) \, dx = 4\pi$. Let $\varepsilon > 0$. If $y_1, \ldots, y_n$ denote $n$ fixed points in $\mathbb{R}^3$, with $|y_i - y_j| \geq 2R$ for all $i \neq j$, then we have the operator inequality on $L^2(\mathbb{R}^3)$,
\[
-\nabla \chi(p)^2 \nabla + \frac{1}{2} \sum_{i=1}^{n} v_N(x - y_i) \geq \sum_{i=1}^{n} \left[ (1 - \varepsilon) \frac{a}{N} U_R(x - y_i) - \frac{a}{N\varepsilon} w_R(x - y_i) \right] . \tag{19}
\]
The sums in (19) are multiplication operators, i.e., they are just functions of $x$. The operator $-\nabla \chi(p)^2 \nabla$ is just the positive multiplication by $p^2 \chi(p)^2$ in Fourier space. The original Lemma 1 in [19] has $\chi(p) \equiv 1$ and $h = w_R = f = \varepsilon = 0$.

Clarification. What Lemma 1 really says is that we can replace the unpleasant interaction potential $v_N$ (which possibly contains an infinite hard core) by a small, smooth, but longer ranged potential whose main part, $U_R$, is positive. There are two prices that have to be paid for this luxury. One is to forego a piece of the positive kinetic energy, $-\nabla \chi(p)^2 \nabla$. The second is that the potential is really only a ‘nearest neighbor’ potential. That is to say, the particle at $x$ is allowed to interact with only one other particle at a time. This is seen from the requirement that the interaction $U_R$ has range $R$, but the other particles must be separated by a distance $2R$.

In order to utilize the coherent state inequalities later on in Step 3 we have to extend our $U_R$ to an ordinary two-body potential, i.e., we have to be able to drop the $2R$ separation requirement. To do so will require an estimation of the amplitude (in the exact, original ground state wave function) of finding three or more particles within a distance $2R$ of each other. Clearly, this amplitude is small, but we find that we have to resort to path integrals (or, more precisely, the Trotter product formula) to estimate it. This will be done in Step 4 below.

As an immediate corollary of Lemma 1 we can omit the condition $|y_i - y_j| \geq 2R$ and replace (19) by
\[
-\nabla \chi(p)^2 \nabla + \frac{1}{2} \sum_{i=1}^{n} v_N(x - y_i) \geq \sum_{i=1}^{n} \left[ (1 - \varepsilon) \frac{a}{N} U_R(x - y_i) - \frac{a}{N\varepsilon} w_R(x - y_i) \right] \prod_{k \neq i} \theta(|y_k - y_i| - 2R) \tag{20}
\]
for any set of points $y_j \in \mathbb{R}^3$. Here, $\theta$ denotes the Heaviside step function, given by $\theta(t) = 1$ if $t \geq 0$ and $\theta(t) = 0$ if $t < 0$. That is to say, if there are only $n' < n$ of the $y_i$ that are a distance $\geq 2R$ from all the other $y_k$ then we simply apply (19) to these $n'$ coordinates. The right side of (20) does not contain the other values of $i$ because the $\theta$ factor vanishes for those. The left side does contain these unwanted $y_i$ but, since $v_N$ is non-negative, this does no harm to the inequality (20).

We apply (20) to each particle, considering the other $M - 1$ particles as fixed, and obtain

\[
H_{M,N} \geq \sum_{i=1}^{M} \left[ -\nabla_i \left( 1 - \chi(p_i)^2 \right) \nabla_i + 2 p_i \cdot A(x_i) + A(x_i)^2 + V(x_i) \right] + \sum_{i=1}^{M} \sum_{j \neq i} \left[ (1 - \varepsilon) a N^{-1} U_R(x_i - x_j) - a (N\varepsilon)^{-1} w_R(x_i - x_j) \right] \prod_{k \neq i,j} \theta(|x_k - x_j| - 2R) . \tag{21}
\]
For the negative part of the interaction (containing $w_R$), we can simply use $\theta \leq 1$ for a lower bound. For the positive part (containing $U_R$), we will use the fact that
\[
\prod_{k \neq i,j} \theta(|x_k - x_j| - 2R) \geq 1 - \sum_{k \neq i,j} \theta(2R - |x_k - x_j|) , \tag{22}
\]
which follows from the simple inequality $\prod_j (1 - s_j) \geq 1 - \sum_j s_j$ when $0 \leq s_j \leq 1$ for all $j$.

We now use (21) and (22) in the following way. We begin by defining a new $M$-particle Hamiltonian, $K$, by
\[
K = \sum_{i=1}^{M} K_0^{(i)} + \sum_{1 \leq i < j \leq M} 2 (1 - \varepsilon) \, a N^{-1} U_R(x_i - x_j) , \tag{23}
\]
where $K_0$ is a one-body Hamiltonian to be described next. If $K_0$ were simply $(-i\nabla + A)^2 + V$ then (23) would be the conventional Hamiltonian with two-body interaction $2 U_R$. (The factor 2 arises because each pair $i, j$ appears twice in (21).) Unfortunately, $K_0$ has to be a little more complicated because we used up part of the kinetic energy in replacing $v_N$ by $U_R$ via Lemma 1. Pick some $\eta > 0$, and let
\[
K_0 = -\nabla \left( 1 - \chi(p)^2 \right) \nabla - 2\eta\Delta + 2 p \cdot A(x) + A(x)^2 + V(x) + \eta |x|^4 - \kappa(\eta) . \tag{24}
\]
The constant $\kappa(\eta)$ is chosen so that $K_0 > 0$. It is a matter of convenience to include it in the definition of $K_0$. It is defined by
\[
\kappa(\eta) = \inf \mathrm{spec} \left[ -\eta\Delta + 2 p \cdot A(x) + \eta |x|^4 \right] . \tag{25}
\]
The reason for adding the terms $-2\eta\Delta$ and $\eta |x|^4$ to $K_0$ is to ensure that $K_0$ is bounded from below and has compact resolvent, and so that $\kappa(\eta)$ is finite. (Note: the exponent 4 in $|x|^4$ could be replaced by any exponent $> 2$ for our purposes. This is due to the fact that we have a vector potential $A(x)$ in mind that is bounded by $(\text{const.}) |x|$, as in the case of pure rotation. If this is not so (because an external magnetic field has been added) some polynomial of higher order than $|x|^4$ could be needed, but our analysis would continue to go through.) Since there is a $2\eta$ in (24) and not just $\eta$ we have that $K_0 \geq -\eta\Delta + V(x) \geq -\eta\Delta \geq 0$, since $V(x) \geq 0$ by assumption. This will be convenient later.

Let $\langle \, \cdot \, \rangle_B$ denote the $M$-particle bosonic ground state expectation for the original Hamiltonian $H_{M,N}$. Actually, it is convenient to take the zero temperature limit of the

Gibbs state, which means that in case of a ground state degeneracy of $H_{M,N}$, we would take $\langle \, \cdot \, \rangle_B$ to be the uniform average over all ground states. Then $E_0(M, N) = \langle H_{M,N} \rangle_B$ and we have, therefore, using (21)–(25),
\[
E_0(M, N) \geq \inf \mathrm{spec}\, K + M \kappa(\eta) - \eta M \langle |x_1|^4 \rangle_B - 2 \eta M \langle -\Delta_1 \rangle_B - \frac{M^2 a}{N \varepsilon} \langle w_R(x_1 - x_2) \rangle_B - \frac{a M^3}{N} \langle U_R(x_1 - x_2) \, \theta(2R - |x_2 - x_3|) \rangle_B . \tag{26}
\]
(Note: We made use of the bosonic symmetry to replace $\langle \sum_i \Delta_i \rangle_B$ by $M \langle \Delta_1 \rangle_B$, for example.) The term $\langle -\Delta_1 \rangle_B$ can be bounded as follows. We have $p^2 \leq 2 (p + A)^2 + 2 A^2$, and hence, using positivity of the interaction potential $v_N$,
\[
M \langle -\Delta_1 \rangle_B \leq 2 E_0(M, N) + \tfrac{1}{2} |\Omega|^2 M \langle |x_1|^2 \rangle_B . \tag{27}
\]
To prove Proposition 1 we have to bound the various terms in (26) and (27), and that is what we do in the following steps. The main term to bound is $\inf \mathrm{spec}\, K$, the ground state energy of the ‘effective Hamiltonian’ (23).

The momentum cutoff $\chi(p)$ in (24) will be chosen as follows. Let $\nu(p)$ be an infinitely differentiable, spherically symmetric function with $\nu(p) = 0$ for $|p| \leq 1$, $\nu(p) = 1$ for $|p| \geq 2$ and $0 \leq \nu(p) \leq 1$ in-between. Then, for some adjustable parameter $s$ to be determined later, we choose
\[
\chi(p) = \nu(s p) . \tag{28}
\]
The potential $w_R(x)$ defined in (18) is then a smooth and rapidly decreasing function that depends only on the ratio $R/s$. It is easy to see that
\[
\int_{\mathbb{R}^3} w_R(x) \, dx \leq \text{const.} \, \frac{R^2}{s^2} \tag{29}
\]
as long as $R \leq \text{const.}\, s$. We will, in fact, choose $R \ll s$.

Finally, we are still free to make a choice for the function $U_R(x)$ in Lemma 1. We choose it to be a ‘hat’ function:
\[
U_R(x) = \begin{cases} 6 R^{-3} & R \geq |x| \geq 2^{-1/3} R \\ 0 & \text{otherwise} , \end{cases} \tag{30}
\]
assuming that $R \geq 2^{1/3} R_0/N$, a condition that will be amply satisfied by our choice $N^{-1/3} \gg R \gg N^{-2/3}$ later on. We remark that the exact form of $U_R(x)$ is unimportant in what is to come. We will need only the properties that $\int U_R(x) \, dx = 4\pi$ and that $\| U_R \|_\infty \leq \text{const.}\, R^{-3}$ for $R \gg R_0/N$.
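The two properties of the hat function (30) that are used later, its integral 4π and a sup norm of order R to the power -3, can be checked numerically. The sketch below is only an illustration: the function names, the sample radius and the step count are arbitrary choices made here, not taken from the paper.

```python
import math

def hat_potential(r, R):
    """U_R from (30): 6 R^-3 on the shell 2^(-1/3) R <= r <= R, zero elsewhere."""
    inner = 2.0 ** (-1.0 / 3.0) * R
    return 6.0 / R**3 if inner <= r <= R else 0.0

def integral_over_R3(R, steps=200000):
    """Midpoint rule for the radial integral of 4 pi r^2 U_R(r) over 0 <= r <= R."""
    h = R / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        total += 4.0 * math.pi * r * r * hat_potential(r, R) * h
    return total

R_sample = 0.01  # arbitrary illustrative radius
value = integral_over_R3(R_sample)  # should be close to 4 pi
sup_norm_times_R3 = hat_potential(R_sample, R_sample) * R_sample**3  # equals 6
```

Analytically, the shell between 2^(-1/3) R and R has volume (4π/3)(R³ − R³/2) = (2π/3) R³, and multiplying by the height 6 R⁻³ gives exactly 4π, independently of R.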


515

Step 3. Coherent state method for the ground state. We begin our analysis of (26) by bounding the main term, inf spec $K$. This will be done with the aid of coherent states, exploiting ideas in [18], and is, perhaps, the most methodologically novel part of our work. The one-body operator $K_0$ has purely discrete spectrum and can be written in terms of its eigenvalues $e_j$ and orthonormal eigenfunctions $|\varphi_j\rangle$ as $K_0 = \sum_{j\ge 1} e_j |\varphi_j\rangle\langle\varphi_j|$. Recall that $K_0 \ge -\eta\Delta + V(x) \ge -\eta\Delta \ge 0$, so $e_j > 0$. We assume the sequence $e_j$ to be ordered, i.e., $e_{j+1} \ge e_j$ for all $j$. For simplicity, we introduce the notation

$$W(x_1 - x_2) \equiv (1-\varepsilon)\, a N^{-1}\, U_R(x_1 - x_2) . \tag{31}$$

The well known second quantization formalism involves the operators $a_j^\dagger$ and $a_j$, which are the creation and annihilation operators of a particle in the state $|\varphi_j\rangle$. They satisfy the usual canonical commutation relations $[a_i, a_j^\dagger] = \delta_{ij}$, etc. The second quantized version of (23) is

$$\widehat K = \sum_{j\ge 1} e_j a_j^\dagger a_j + \sum_{ijkl} a_i^\dagger a_j^\dagger a_k a_l\, W_{ijkl} , \tag{32}$$

where $W_{ijkl} = \langle \varphi_i\otimes\varphi_j | W | \varphi_k\otimes\varphi_l\rangle$. The operator $\widehat K$ acts on the bosonic Fock space $\mathcal{F}$, consisting of a direct sum over all particle number sectors. We are interested in a lower bound to the ground state energy of $\widehat K$ in the sector of particle number $M$. Hence we can add a term $\big(\sum_j a_j^\dagger a_j - M\big)^2$ to $\widehat K$ without changing this energy. We can then look for a lower bound irrespective of particle number. I.e., for any $C \ge 0$, we have that inf spec $\widehat K$ for $M$ particles is $\ge$ inf spec $K$ on the full Fock space, where

$$K \equiv \sum_{j\ge 1} e_j a_j^\dagger a_j + \sum_{ijkl} a_i^\dagger a_j^\dagger a_k a_l\, W_{ijkl} + \frac{C}{M}\Big(\sum_{j\ge 1} a_j^\dagger a_j - M\Big)^2 . \tag{33}$$

The choice of $C$ will be made later. The Fock space $\mathcal{F}$ can be thought of as the tensor product of the Fock spaces generated by each mode $\varphi_j$. We choose some integer $J \gg 1$ (to be determined later) and split the Fock space into two parts, namely $\mathcal{F} = \mathcal{F}^< \otimes \mathcal{F}^>$, where $\mathcal{F}^<$ is the tensor product of the Fock spaces generated by all the modes $\varphi_j$ with $j \le J$ and where $\mathcal{F}^>$ is that generated by all the other modes. Next, we introduce coherent states [10] for all the modes $j \le J$. (By coherent states we mean ordinary canonical Schrödinger, Bargmann, Glauber coherent states.) The modes with $j > J$ will not be omitted, but they will be treated differently from the $j \le J$ modes.

Let $z = (z_1, \dots, z_J)$ denote a vector in $\mathbb{C}^J$. Let also $\Pi(z)$ denote the projection onto the coherent state $|z_1\otimes\cdots\otimes z_J\rangle \in \mathcal{F}^<$. The symbol $|z_1\otimes\cdots\otimes z_J\rangle$ is shorthand for $|z_1\rangle\otimes|z_2\rangle\otimes\cdots\otimes|z_J\rangle$, and $|z_j\rangle$ denotes the coherent state for the $j$th mode given by $|z_j\rangle = \exp[-|z_j|^2/2 + z_j a_j^\dagger]\,|\text{vacuum}\rangle$. The Hamiltonian $K$ in (33) can now be written as

$$K = \int dz\, \Pi(z)\otimes U(z) , \tag{34}$$

where $U(z)$ is an operator acting on $\mathcal{F}^>$. The operator $U(z)$ depends on $z$ since it is also an upper symbol for the modes $j \le J$. The integration measure is $dz = \pi^{-J}\prod_{j\le J} dx_j\, dy_j$ with $z_j = x_j + i y_j$. This is discussed in [18, 10]. As an example, the upper symbol for $a_j^\dagger$ is $\bar z_j$ and for $a_j$ it is $z_j$, but for $a_j^\dagger a_j$ it is $|z_j|^2 - 1$. Thus, to a term such as $a_i^\dagger a_j^\dagger a_k a_l$ with $i, j \le J$ and $k, l > J$ would correspond the upper symbol operator $\bar z_i \bar z_j a_k a_l$.

It is easier to compute the lower symbol (which is denoted by $u(z)$) than the upper symbol $U(z)$. It is obtained simply by replacing $a_j^\dagger$ by $\bar z_j$ and $a_j$ by $z_j$ in all (normal-ordered) polynomials, even in higher polynomials such as $a_j^\dagger a_j$ or $a_j^\dagger a_j^\dagger a_j a_j$. An equivalent definition of the lower symbol of any polynomial $P$ in the $a_j$'s and $a_j^\dagger$'s (normal-ordered or not) is the expectation value $u(z) = \langle z_1\otimes\cdots\otimes z_J | P | z_1\otimes\cdots\otimes z_J\rangle$. In the case considered here, $u(z) = \langle z_1\otimes\cdots\otimes z_J | K | z_1\otimes\cdots\otimes z_J\rangle$. The lower symbol is useful because the upper symbol can conveniently be obtained from it as [10]

$$U(z) = e^{-\partial_z\partial_{\bar z}}\, u(z) = u(z) - \partial_z\partial_{\bar z}\, u(z) + \tfrac12 (\partial_z\partial_{\bar z})^2 u(z) , \tag{35}$$

where $\partial_z\partial_{\bar z} = \sum_{j\le J} \partial_{z_j}\partial_{\bar z_j}$. (In the general case there would be higher order derivatives on the right side of (35), but not in our case since $u(z)$ is a polynomial of order four.) Note that (34) implies that

$$\inf\operatorname{spec} K \ge \inf_z\, \big(\inf\operatorname{spec} U(z)\big) , \tag{36}$$

since $\int dz\, \Pi(z) = I_{\mathcal{F}^<}$ and $\Pi(z)\otimes U(z) \ge \inf\operatorname{spec} U(z)\, \Pi(z)$. Our goal in the rest of this subsection is to derive a lower bound to inf spec $U(z)$ for a fixed $z$.

The reader might wonder why we use coherent states only for modes $j \le J$ and not for all modes. The reason is that the upper symbol for the operator $e_j a_j^\dagger a_j$ is $e_j(|z_j|^2 - 1)$, and the $-1$ term is a term that we do not want when minimizing for a fixed $z$. We make an error in the energy of the form $-\sum_{j\le J} e_j$ and for this reason we cannot take $J = \infty$. But we can, and will, let $J \to \infty$ as $N \to \infty$.

3a. Lower bound on the lower symbol $u(z)$. In order to derive a lower bound to $U(z)$ and the bottom of its spectrum, we start by deriving a lower bound to the lower symbol $u(z)$, which is the first term in (35). This symbol can be conveniently expressed in terms of the function $\Phi_z \in L^2(\mathbb{R}^3)$, parametrized by $z \in \mathbb{C}^J$, given by

$$\Phi_z(x) = \sum_{1\le j\le J} z_j \varphi_j(x) . \tag{37}$$
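The passage from the lower to the upper symbol in (35) can be illustrated on a single mode. The sketch below is only an illustration under that single-mode assumption (the text uses $J$ modes); the representation of a polynomial in $\bar z, z$ as a dictionary and the names `dz_dzbar` and `upper_from_lower` are hypothetical choices, not anything taken from [10] or [18].

```python
from math import factorial

def dz_dzbar(poly):
    """Apply d/dz d/dzbar to a polynomial {(p, q): c}, meaning the sum of c * zbar^p * z^q."""
    out = {}
    for (p, q), c in poly.items():
        if p > 0 and q > 0:
            key = (p - 1, q - 1)
            out[key] = out.get(key, 0) + c * p * q
    return out

def upper_from_lower(lower):
    """U = exp(-dz dzbar) u; the series terminates because u is a polynomial."""
    upper = {}
    term, k = dict(lower), 0
    while term:
        sign = (-1) ** k
        for key, c in term.items():
            upper[key] = upper.get(key, 0) + sign * c / factorial(k)
        term = dz_dzbar(term)
        k += 1
    return {key: c for key, c in upper.items() if c != 0}

if __name__ == "__main__":
    # Lower symbol of a†a is |z|^2; the upper symbol should be |z|^2 - 1.
    print(upper_from_lower({(1, 1): 1}))
    # Lower symbol of a†a†aa is |z|^4; the upper symbol should be |z|^4 - 4|z|^2 + 2.
    print(upper_from_lower({(2, 2): 1}))
```

For $a^\dagger a$ this reproduces the upper symbol $|z|^2 - 1$ quoted in the text, and for the quartic term it shows that the series in (35) stops after the second derivative term.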

Note that $\|\Phi_z\|_2^2 = \sum_{j\le J} |z_j|^2$. Denoting $T \equiv \sum_{k>J} e_k a_k^\dagger a_k$, we have

$$\Big\langle z_1\otimes\cdots\otimes z_J \Big| \sum_j e_j a_j^\dagger a_j \Big| z_1\otimes\cdots\otimes z_J \Big\rangle = \sum_{j\le J} e_j |z_j|^2 + T = \langle \Phi_z | K_0 | \Phi_z\rangle + T . \tag{38}$$

There is a mild abuse of notation here, which will continue for the rest of this paper, and which we hope will not cause any confusion. The operator $\sum_j e_j a_j^\dagger a_j$ acts on $\mathcal{F}$ while the vector $|z_1\otimes\cdots\otimes z_J\rangle$ is in $\mathcal{F}^<$, so the left side of (38) defines an operator on $\mathcal{F}^>$ in an obvious way (actually, it defines a quadratic form). The right side must also be an operator on $\mathcal{F}^>$, and it is so if the number $\langle \Phi_z | K_0 | \Phi_z\rangle$ is regarded as a number times the identity on $\mathcal{F}^>$.


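Similarly, with $N^> \equiv \sum_{j>J} a_j^\dagger a_j$ denoting the number of particles in the modes greater than $J$,

$$\Big\langle z_1\otimes\cdots\otimes z_J \Big| \Big(\sum_j a_j^\dagger a_j - M\Big)^2 \Big| z_1\otimes\cdots\otimes z_J \Big\rangle = \big(N^> + \|\Phi_z\|_2^2 - M\big)^2 + \|\Phi_z\|_2^2 \ge \big(\|\Phi_z\|_2^2 - M\big)^2 - 2 e_J^{-1} M T . \tag{39}$$

Here, we used the normal ordering

$$\Big[\sum_{j\le J} a_j^\dagger a_j\Big]^2 = \sum_{i\le J}\sum_{j\le J} a_i^\dagger a_j^\dagger a_i a_j + \sum_{j\le J} a_j^\dagger a_j ,$$

followed by the elementary bound $N^> \le T/e_J$.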
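The equality in (39) can be checked in the simplest situation of a single mode with no particles in the modes above $J$ (so $N^> = 0$). In that case the particle number in a coherent state with $|z|^2 = \lambda$ is Poisson distributed, and the expectation of $(a^\dagger a - M)^2$ should equal $(\lambda - M)^2 + \lambda$. The sketch below, with the hypothetical function name `coherent_number_moment`, sums the Poisson weights directly; it is an illustration under these simplifying assumptions only.

```python
import math

def coherent_number_moment(lam, M, nmax=400):
    """<z|(a†a - M)^2|z> for one mode with |z|^2 = lam, summed over Poisson weights."""
    total = 0.0
    weight = math.exp(-lam)  # probability of finding n = 0 particles
    for n in range(nmax):
        total += weight * (n - M) ** 2
        weight *= lam / (n + 1)  # Poisson recursion p(n+1) = p(n) * lam / (n+1)
    return total

if __name__ == "__main__":
    lam, M = 7.5, 10
    print(coherent_number_moment(lam, M), (lam - M) ** 2 + lam)
```

The extra term $\lambda = |z|^2$ is exactly the contribution of the second sum in the normal-ordering identity above.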

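The interaction part of $u(z)$ is obtained by replacing $a_j$ by $z_j$ and $a_j^\dagger$ by $\bar z_j$ when $j \le J$. We will now derive a lower bound on this term. It is convenient to introduce the notation

$$I(\Phi_z) = \int dx\, dy\, |\Phi_z(x)|^2 |\Phi_z(y)|^2\, W(x-y) . \tag{40}$$

Since $W \ge 0$, it is possible to neglect the interaction between modes $> J$ for a lower bound. More precisely, let $P = \sum_{1\le i\le J} |\varphi_i\rangle\langle\varphi_i|$ and $Q = 1 - P$. The two-body operator $W(x-y)$ is then bounded from below by

$$\begin{aligned} W &= \big((P+Q)\otimes(P+Q)\big)\, W\, \big((P+Q)\otimes(P+Q)\big) \\ &\ge (P\otimes P)\, W\, (P\otimes P) + (P\otimes P)\, W\, (P\otimes Q + Q\otimes P + Q\otimes Q) \\ &\quad + (P\otimes Q + Q\otimes P + Q\otimes Q)\, W\, (P\otimes P) , \end{aligned} \tag{41}$$

since the missing term on the right side of (41) is $(Q\otimes Q + P\otimes Q + Q\otimes P)\, W\, (Q\otimes Q + P\otimes Q + Q\otimes P) \ge 0$. We thus have that

$$\begin{aligned} \Big\langle z_1\otimes\cdots\otimes z_J \Big| \sum_{ijkl} a_i^\dagger a_j^\dagger a_k a_l\, W_{ijkl} \Big| z_1\otimes\cdots\otimes z_J \Big\rangle \ge{} & I(\Phi_z) \\ & + \sum_{kl>J} \langle \Phi_z\otimes\Phi_z | W | \varphi_k\otimes\varphi_l\rangle\, a_k a_l + \sum_{kl>J} \langle \varphi_k\otimes\varphi_l | W | \Phi_z\otimes\Phi_z\rangle\, a_k^\dagger a_l^\dagger \\ & + 2\sum_{k>J} \langle \Phi_z\otimes\Phi_z | W | \Phi_z\otimes\varphi_k\rangle\, a_k + 2\sum_{k>J} \langle \Phi_z\otimes\varphi_k | W | \Phi_z\otimes\Phi_z\rangle\, a_k^\dagger . \end{aligned} \tag{42}$$

Here we used that $W$ is symmetric, implying that in the last line we could replace $|\Phi_z\otimes\varphi_k + \varphi_k\otimes\Phi_z\rangle$ by $2|\Phi_z\otimes\varphi_k\rangle$.

We seek a lower bound to the last two expressions in (42). Note that, for a general operator $A$, $|A + A^\dagger|^2 = A^2 + A^{\dagger 2} + A A^\dagger + A^\dagger A \le 2A^\dagger A + 2AA^\dagger$ by Schwarz's inequality, and so, with $|A|^2 = A^\dagger A$,

$$(A + A^\dagger)^2 \le 4|A|^2 + 2[A, A^\dagger] . \tag{43}$$

We apply this to the second line in (42), with $A = \sum_{kl>J} c_{kl}\, a_k a_l$ and $c_{kl} = \langle \Phi_z\otimes\Phi_z | W | \varphi_k\otimes\varphi_l\rangle$. The commutator is

$$[A, A^\dagger] = 4\sum_{klm>J} a_k^\dagger a_l\, \overline{c_{km}}\, c_{lm} + 2\sum_{kl>J} |c_{kl}|^2 . \tag{44}$$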



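The last term in (44) is bounded by

$$\sum_{kl>J} |c_{kl}|^2 \le \sum_{kl\ge 1} |c_{kl}|^2 = \langle \Phi_z\otimes\Phi_z | W^2 | \Phi_z\otimes\Phi_z\rangle \le \|W\|_\infty\, I(\Phi_z) . \tag{45}$$

The first term on the right side of (44) can be bounded as

$$\sum_{klm>J} a_k^\dagger a_l\, \overline{c_{km}}\, c_{lm} \le \sum_{m\ge 1}\sum_{kl>J} a_k^\dagger a_l\, \overline{c_{km}}\, c_{lm} \le \frac{4}{27\pi^4}\, \eta^{-1}\, \|W\|_1^2\, \|\nabla\Phi_z\|_2^4\; T . \tag{46}$$

This can be seen as follows. The integral kernel $\sigma$ of the one-particle operator defined by the matrix $\sum_{m\ge 1} \overline{c_{km}}\, c_{lm}$ is given by

$$\sigma(x, x') = \int dy\, |\Phi_z(y)|^2\, W(x-y)\, W(x'-y)\, \Phi_z(x)\, \overline{\Phi_z(x')} . \tag{47}$$

Using Young's and Schwarz's inequalities, we have, for any function $f$ on $\mathbb{R}^3$,

$$\begin{aligned} \int dx\, dx'\, \overline{f(x)}\, f(x')\, \sigma(x, x') &\le \int dx\, dx'\, dy\, |\Phi_z(y)|^2\, W(x-y)\, W(x'-y)\, |\Phi_z(x) f(x)|^2 \\ &\le \|W\|_1^2\, \|\Phi_z\|_6^2\, \|\Phi_z f\|_3^2 \le \|W\|_1^2\, \|\Phi_z\|_6^4\, \|f\|_6^2 . \end{aligned} \tag{48}$$

Hence (46) follows by applying the Sobolev inequality $\|f\|_6^2 \le (4/3)(2\pi^2)^{-2/3}\|\nabla f\|_2^2$ both to $f$ and to $\Phi_z$, and using the fact that $-\Delta \le \eta^{-1} K_0$.

To get an upper bound on $|A|^2$ we use Schwarz's inequality again to obtain

$$|A|^2 \le \Big(\sum_{kl>J} \frac{|c_{kl}|^2}{e_k e_l}\Big) \sum_{mn>J} e_m e_n\, a_m^\dagger a_n^\dagger a_m a_n \tag{49}$$

for any sequence of positive numbers $e_j$. We choose $e_j$ to be the eigenvalues of $K_0$, in which case

$$\sum_{mn>J} e_m e_n\, a_m^\dagger a_n^\dagger a_m a_n \le \Big(\sum_{k>J} e_k a_k^\dagger a_k\Big)^2 = T^2 . \tag{50}$$

Moreover,

$$\sum_{kl>J} \frac{|c_{kl}|^2}{e_k e_l} = \Big\langle \Phi_z\otimes\Phi_z \Big| W\, \Big(\frac{Q}{K_0}\otimes\frac{Q}{K_0}\Big)\, W \Big| \Phi_z\otimes\Phi_z \Big\rangle . \tag{51}$$

We have the following two operator inequalities; the first comes from the fact that $K_0 \ge e_J$ on the range of the projector $Q$ and the second comes from $K_0 \ge -\eta\Delta$:

$$\frac{Q}{K_0} \le \frac{2}{K_0 + e_J} \le \frac{2}{-\eta\Delta + e_J} . \tag{52}$$

Denoting the integral kernel of $(-\Delta + \mu)^{-1}$ by

$$k_\mu(x - x') = \frac{1}{4\pi}\, \frac{e^{-\sqrt{\mu}\,|x-x'|}}{|x-x'|} , \tag{53}$$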



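we see that (51) is bounded above by

$$\begin{aligned} & \frac{4}{\eta^2} \int dx\, dy\, dx'\, dy'\; \overline{\Phi_z(x)}\, \overline{\Phi_z(y)}\, W(x-y)\, k_{e_J/\eta}(x-x')\, k_{e_J/\eta}(y-y')\, \Phi_z(x')\, \Phi_z(y')\, W(x'-y') \\ &\quad \le \frac{4}{\eta^2} \int dx\, dy\, dx'\, dy'\; |\Phi_z(x)|^2 |\Phi_z(y)|^2\, W(x-y)\, k_{e_J/\eta}(x-x')\, k_{e_J/\eta}(y-y')\, W(x'-y') \\ &\quad \le \frac{1}{2\pi\eta^{3/2}}\, \frac{\|W\|_1}{\sqrt{e_J}}\; I(\Phi_z) . \end{aligned} \tag{54}$$

Here, we used Young's inequality for the $(x', y')$ integration, as well as the fact that $\|k_\mu\|_2^2 = (8\pi\sqrt{\mu})^{-1}$. By putting all this together, we have that

$$(A + A^\dagger)^2 \le 4\|W\|_\infty\, I(\Phi_z) + \frac{32}{27\pi^4}\, \eta^{-1}\, \|W\|_1^2\, \|\nabla\Phi_z\|_2^4\; T + \frac{4}{2\pi\eta^{3/2}}\, \frac{\|W\|_1}{\sqrt{e_J}}\; I(\Phi_z)\; T^2 . \tag{55}$$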
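The two norms of the Yukawa-type kernel $k_\mu$ that enter the constants, namely $\|k_\mu\|_2^2 = (8\pi\sqrt\mu)^{-1}$ used above and the value $\|k_\mu\|_{3/2} = 2^{-1/3}\mu^{-1/2}/3$ quoted further on in the text, can be confirmed by a radial quadrature. This is a minimal numerical sketch; the helper names `simpson`, `k_mu_l2_norm_squared` and `k_mu_l32_norm`, the cutoffs and the grid sizes are assumptions made for the illustration.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0

def k_mu_l2_norm_squared(mu):
    """Integral of |k_mu|^2 over R^3; the 1/r^2 singularity cancels against 4 pi r^2."""
    s = math.sqrt(mu)
    cutoff = 40.0 / s
    return simpson(lambda r: math.exp(-2.0 * s * r) / (4.0 * math.pi), 0.0, cutoff, 20_000)

def k_mu_l32_norm(mu):
    """L^{3/2} norm of k_mu; the substitution r = t^2 removes the square-root endpoint behaviour."""
    s = math.sqrt(mu)
    cutoff = 8.0 / math.sqrt(s)
    integrand = lambda t: 2.0 * t * t * math.exp(-1.5 * s * t * t) / math.sqrt(4.0 * math.pi)
    return simpson(integrand, 0.0, cutoff, 20_000) ** (2.0 / 3.0)

if __name__ == "__main__":
    mu = 2.5
    print(k_mu_l2_norm_squared(mu), 1.0 / (8.0 * math.pi * math.sqrt(mu)))
    print(k_mu_l32_norm(mu), 2.0 ** (-1.0 / 3.0) / (3.0 * math.sqrt(mu)))
```

With $\mu = e_J/\eta$, the first value multiplied by $4\|W\|_1/\eta^2$ gives the prefactor $\|W\|_1/(2\pi\eta^{3/2}\sqrt{e_J})$ in (54).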

Since the square root preserves operator monotonicity, we can take the square root on both sides of (55). By the triangle inequality, we can take the sum of the square roots of each term on the right side. Finally, applying the Schwarz inequality to the first and third term, we conclude that, for any $\delta > 0$,

$$|A + A^\dagger| \le \delta\, I(\Phi_z) + \frac{1}{\delta}\|W\|_\infty + \frac{4}{\pi^2}\sqrt{\frac{2}{27}}\; \eta^{-1/2}\, \|W\|_1\, \|\nabla\Phi_z\|_2^2\, \sqrt{T} + e_J^{-1/4}\Big(\|W\|_1\, I(\Phi_z) + \big(2\pi\eta^{3/2}\big)^{-1}\Big)\, T . \tag{56}$$

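We now proceed similarly with the last term in (42), which is linear in $a_k$ and $a_k^\dagger$. Denoting $c_k = \langle \Phi_z\otimes\Phi_z | W | \Phi_z\otimes\varphi_k\rangle$, we have that

$$\Big(\sum_{k>J} c_k a_k + \overline{c_k}\, a_k^\dagger\Big)^2 \le 4\Big(\sum_{k>J} \frac{|c_k|^2}{e_k}\Big)\Big(\sum_{k>J} e_k a_k^\dagger a_k\Big) + 2\sum_{k>J} |c_k|^2 . \tag{57}$$

Using Hölder's and Sobolev's inequality,

$$\begin{aligned} \sum_{k\ge 1} |c_k|^2 &= \int dx\, dy\, dz\; |\Phi_z(x)|^2 |\Phi_z(y)|^2 |\Phi_z(z)|^2\, W(x-y)\, W(x-z) \\ &\le \|W\|_{3/2}\, \|\Phi_z\|_6^2\; I(\Phi_z) \\ &\le \frac{4}{3(2\pi^2)^{2/3}}\, \|W\|_\infty^{1/3}\, \|W\|_1^{2/3}\, \|\nabla\Phi_z\|_2^2\; I(\Phi_z) . \end{aligned} \tag{58}$$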



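Moreover, using (52) again, together with Young's and Sobolev's inequalities, as well as the fact that the $3/2$ norm of $k_\mu$ is given by $2^{-1/3}\mu^{-1/2}/3$, we find that

$$\begin{aligned} \sum_{k>J} \frac{|c_k|^2}{e_k} &\le \frac{2}{\eta} \int dx\, dy\, dx'\, dy'\; \overline{\Phi_z(x)}\, |\Phi_z(x')|^2\, W(x-x')\, k_{e_J/\eta}(x-y)\, \Phi_z(y)\, |\Phi_z(y')|^2\, W(y-y') \\ &\le \frac{2}{\eta} \int dx\, dy\, dx'\, dy'\; |\Phi_z(x)|^2 |\Phi_z(x')|^2\, W(x-x')\, k_{e_J/\eta}(x-y)\, |\Phi_z(y')|^2\, W(y-y') \\ &\le \frac{4}{9\pi^{4/3}}\, \eta^{-1/2}\, \frac{\|W\|_1}{\sqrt{e_J}}\, \|\nabla\Phi_z\|_2^2\; I(\Phi_z) . \end{aligned} \tag{59}$$

This implies that

$$\begin{aligned} \Big(\sum_{k>J} c_k a_k + \overline{c_k}\, a_k^\dagger\Big)^2 &\le \frac{8}{3(2\pi^2)^{2/3}}\, \|W\|_\infty^{1/3}\, \|W\|_1^{2/3}\, \|\nabla\Phi_z\|_2^2\; I(\Phi_z) + \frac{16}{9\pi^{4/3}}\, \eta^{-1/2}\, \frac{\|W\|_1}{\sqrt{e_J}}\, \|\nabla\Phi_z\|_2^2\; I(\Phi_z)\; T \\ &\le \bigg[\sqrt{\frac{2}{3}}\, \frac{1}{(2\pi^2)^{1/3}}\, \|W\|_\infty^{1/6}\, \|W\|_1^{1/3} + \frac{2}{3\pi^{2/3}}\, \eta^{-1/4}\, \|W\|_1^{1/2}\, e_J^{-1/4}\, \sqrt{T}\bigg]^2 \times \Big(\|\nabla\Phi_z\|_2^2 + I(\Phi_z)\Big)^2 , \end{aligned} \tag{60}$$

again using the triangle and the Schwarz inequality. As mentioned above, operator monotonicity is preserved by the square root, and hence we can take the square root on both sides of Eq. (60).

This completes the lower bound on the lower symbol $u(z)$. For the convenience of the reader, we repeat the bound just derived:

$$\begin{aligned} u(z) \ge{} & \langle \Phi_z | K_0 | \Phi_z\rangle + I(\Phi_z)\Big(1 - \delta - e_J^{-1/4}\, \|W\|_1\, T\Big) + \frac{C}{M}\Big(\|\Phi_z\|_2^2 - M\Big)^2 \\ & - \Big(\|\nabla\Phi_z\|_2^2 + I(\Phi_z)\Big)\bigg[2\sqrt{\frac{2}{3}}\, \frac{1}{(2\pi^2)^{1/3}}\, \|W\|_\infty^{1/6}\, \|W\|_1^{1/3} + \frac{4}{3\pi^{2/3}}\, \eta^{-1/4}\, \|W\|_1^{1/2}\, e_J^{-1/4}\, \sqrt{T}\bigg] \\ & - \|\nabla\Phi_z\|_2^2\; \frac{4}{\pi^2}\sqrt{\frac{2}{27}}\; \eta^{-1/2}\, \|W\|_1\, \sqrt{T} - \frac{1}{\delta}\|W\|_\infty \\ & + T\Big(1 - e_J^{-1/4}\big(2\pi\eta^{3/2}\big)^{-1} - \frac{2C}{e_J}\Big) . \end{aligned} \tag{61}$$

We note that in the following we will choose $J$ large enough so that the last term in (61) is positive and thus can be neglected for a lower bound. (Recall that $e_J \to \infty$ as $J \to \infty$.)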



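3b. Lower bound on the remaining terms in $U(z)$. A lower bound on the first term on the right side of (35) is given in (61) and, therefore, to get a lower bound on the upper symbol $U(z)$, it remains to bound the last two terms on the right side of (35). The very last term is positive, as will be shown now, and can thus be neglected for a lower bound. Namely,

$$\begin{aligned} \tfrac12 (\partial_z\partial_{\bar z})^2 u(z) &= \tfrac12 (\partial_z\partial_{\bar z})^2 \langle \Phi_z\otimes\Phi_z | W | \Phi_z\otimes\Phi_z\rangle + \tfrac12 \frac{C}{M}\, (\partial_z\partial_{\bar z})^2\, \|\Phi_z\|_2^4 \\ &= \tfrac12 \sum_{1\le i,j\le J} \langle \varphi_i\otimes\varphi_j + \varphi_j\otimes\varphi_i | W | \varphi_i\otimes\varphi_j + \varphi_j\otimes\varphi_i\rangle + \frac{C}{M}\, J(J+1) \ge 0 . \end{aligned} \tag{62}$$

The remaining expression, $\partial_z\partial_{\bar z}\, u(z)$, consists of the following terms. First, from the one-body part (38) of the Hamiltonian we obtain a contribution $\sum_{j\le J} e_j$. Second, from the term (39) (see also (33)) that was introduced in order to control the particle number, we get

$$\frac{C}{M}\Big[\big(2N^> - 2M + 1 + 2\|\Phi_z\|_2^2\big) J + 2\|\Phi_z\|_2^2\Big] \le \frac{2C}{M}\, (J+1)\, \|\Phi_z\|_2^2 + \frac{JC}{M}\big(2N^> + 1\big) . \tag{63}$$

Finally, the following three contributions are obtained from the interaction part. From the part where all four indices are $\le J$, we have

$$\begin{aligned} \sum_{j\le J} \langle \Phi_z\otimes\varphi_j + \varphi_j\otimes\Phi_z | W | \Phi_z\otimes\varphi_j + \varphi_j\otimes\Phi_z\rangle &\le 4\sum_{j\le J} \langle \Phi_z\otimes\varphi_j | W | \Phi_z\otimes\varphi_j\rangle \\ &\le 4\sum_{j\le J} \|W\|_1\, \|\Phi_z\|_6^2\, \|\varphi_j\|_3^2 \\ &\le (4/3)^{3/2}\, \frac{1}{2\pi^2}\, \eta^{-1/2}\, \|W\|_1\, \|\nabla\Phi_z\|_2^2 \sum_{j\le J} \sqrt{e_j} . \end{aligned} \tag{64}$$

Here, we used the inequalities of Young, Hölder and Sobolev as well as the facts that $-\Delta \le \eta^{-1} K_0$ and $\langle \varphi_j | K_0 | \varphi_j\rangle = e_j$ in the last step. From the term with 3 indices $\le J$, we get

$$2\sum_{j\le J}\sum_{k>J} \langle \Phi_z\otimes\varphi_j + \varphi_j\otimes\Phi_z | W | \varphi_j\otimes\varphi_k\rangle\, a_k + \text{adjoint} . \tag{65}$$

Using (57), this time with $e_k \equiv 1$, (65) is bounded above, as an operator, by

$$4\sum_{j\le J} \bigg[\sum_{k>J} \big|\langle \varphi_j\otimes\varphi_k | W | \Phi_z\otimes\varphi_j + \varphi_j\otimes\Phi_z\rangle\big|^2\, \Big(N^> + \tfrac12\Big)\bigg]^{1/2} . \tag{66}$$

Similarly to (58), we can derive the bound

$$\sum_{k\ge 1} \big|\langle \varphi_i\otimes\varphi_k | W | \Phi_z\otimes\varphi_i + \varphi_i\otimes\Phi_z\rangle\big|^2 \le 4\, \|W\|_{3/2}\, \|W\|_1\, \|\Phi_z\|_6^2\, \|\varphi_i\|_6^3\, \|\varphi_i\|_2 . \tag{67}$$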



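Since $\|\varphi_i\|_2 = 1$ and $\|\varphi_i\|_6^2 \le (4/3)(2\pi^2)^{-2/3}\|\nabla\varphi_i\|_2^2 \le (4/3)(2\pi^2)^{-2/3}\eta^{-1} e_i$, this implies that

$$\begin{aligned} (66) &\le 8\, (4/3)^{5/4}\, \frac{1}{(2\pi^2)^{5/6}}\, \|W\|_\infty^{1/6}\, \|W\|_1^{5/6}\, \eta^{-3/4}\, \|\nabla\Phi_z\|_2 \sum_{i\le J} e_i^{3/4}\, \sqrt{N^> + \tfrac12} \\ &\le 4\, (4/3)^{5/4}\, \frac{1}{(2\pi^2)^{5/6}}\, \|W\|_\infty^{1/6}\, \|W\|_1^{5/6}\, \eta^{-3/4} \sum_{i\le J} e_i^{3/4}\, \Big(\|\nabla\Phi_z\|_2^2 + N^> + \tfrac12\Big) . \end{aligned} \tag{68}$$

Here, Schwarz's inequality was used in the last step. The last term to estimate is the one coming from 2 indices $\le J$, given by

$$\sum_{j\le J}\sum_{k,l>J} \langle \varphi_j\otimes\varphi_k + \varphi_k\otimes\varphi_j | W | \varphi_j\otimes\varphi_l + \varphi_l\otimes\varphi_j\rangle\, a_k^\dagger a_l \le (4/3)^{3/2}\, \frac{1}{2\pi^2}\, \eta^{-3/2}\, \|W\|_1 \sum_{j\le J} \sqrt{e_j}\; T . \tag{69}$$

This inequality can be seen as follows. For any one-particle function $f$,

$$\begin{aligned} \langle \varphi_i\otimes f + f\otimes\varphi_i | W | \varphi_i\otimes f + f\otimes\varphi_i\rangle &\le 4\langle \varphi_i\otimes f | W | \varphi_i\otimes f\rangle \le 4\, \|W\|_1\, \|f\|_6^2\, \|\varphi_i\|_3^2 \\ &\le (4/3)^{3/2}\, \frac{1}{2\pi^2}\, \eta^{-1/2}\, \|W\|_1\, \|\nabla f\|_2^2\, \sqrt{e_i} . \end{aligned} \tag{70}$$

The last inequality is the same as in (64). The result now follows using $-\Delta \le \eta^{-1} K_0$. Altogether, we have thus shown that

$$\begin{aligned} \partial_z\partial_{\bar z}\, u(z) \le{} & \frac{2C}{M}\, (J+1)\, \|\Phi_z\|_2^2 + \frac{2CJ}{M}\Big(N^> + \tfrac12\Big) + \sum_{i\le J} e_i \\ & + (4/3)^{3/2}\, \frac{1}{2\pi^2}\, \eta^{-1/2}\, \|W\|_1\, \Big(\|\nabla\Phi_z\|_2^2 + \eta^{-1} T\Big) \sum_{i\le J} \sqrt{e_i} \\ & + 4\, (4/3)^{5/4}\, \frac{1}{(2\pi^2)^{5/6}}\, \|W\|_1^{5/6}\, \|W\|_\infty^{1/6}\, \eta^{-3/4} \sum_{i\le J} e_i^{3/4} \times \Big(\|\nabla\Phi_z\|_2^2 + N^> + \tfrac12\Big) . \end{aligned} \tag{71}$$

This finishes our lower bound on the upper symbol $U(z)$. To summarize, we have shown the following operator lower bound to the operator $U(z)$:

$$U(z) \ge \text{right side of (61)} - \text{right side of (71)} . \tag{72}$$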

3c. c-number bound on $T$. We are interested in the ground state energy of $U(z)$ for a fixed $z \in \mathbb{C}^J$. Since $T$ and $N^>$ are the only operators appearing in (61) and (71), this quantity can be bounded from below using (61) and (71) if we can evaluate the expectation values of $T$ and $N^>$ in the ground state (or one of the ground states) of $U(z)$. Let $\langle\,\cdot\,\rangle_z$ denote the expectation value in a ground state of $U(z)$. We can use two simple facts: i.) Since $\sqrt{T}$ enters (61) negatively, we can use the concavity of the square root to replace $\langle\sqrt{T}\rangle_z$ by $\sqrt{\langle T\rangle_z}$ for a lower bound. ii.) Since $N^>$ appears positively in (71), and hence negatively in (72), we can replace it by the upper bound $N^> \le T/e_J$.

For the purpose of bounding $\langle T\rangle_z$ we can use a lower bound to $U(z)$ that is much simpler than (72). This is obtained by totally neglecting both the interaction part and the part controlling the particle number in $u(z)$. These give positive contributions to $u(z)$ (since $u(z)$ is the expectation value of $K$ in the coherent state). We have to be more careful about $\partial_z\partial_{\bar z}\, u(z)$, however, because this contains some negative terms, as given in (71). (The annoying fact is that an upper symbol of a positive operator need not be positive, although the lower symbol is always positive.) Proceeding in the manner just described we have that

$$U(z) \ge \langle \Phi_z | K_0 | \Phi_z\rangle + T - \partial_z\partial_{\bar z}\, u(z) . \tag{73}$$

Now let us estimate the various terms in $\partial_z\partial_{\bar z}\, u(z)$ in (71). We have $\eta\|\nabla\Phi_z\|_2^2 \le \langle \Phi_z | K_0 | \Phi_z\rangle$. Also $\|\Phi_z\|_2^2 \le \langle \Phi_z | K_0 | \Phi_z\rangle / \inf\operatorname{spec}(-\eta\Delta + V(x))$. Moreover, $\|W\|_1 \le 4\pi a/N$, and $\|W\|_\infty \le 6a/(R^3 N)$. We will choose $R \gg N^{-2/3}$ below. Therefore, $\|W\|_\infty \ll N$. The operator $N^>$ can be bounded in terms of $T$ as $N^> \le T/e_J$. Note also that $M = O(N)$ by assumption. Hence we see from (71) and (73) that, for $N$ large enough (depending on the parameters $\eta$, $C$ and $J$),

$$U(z) \ge \tfrac12 T - \text{const.} , \tag{74}$$

where the constant depends only on $\eta$, $C$ and $J$, but not on $M$ or $N$. The value of (74) is that it allows us to control the value of $\langle T\rangle_z$, and thereby control $\inf_z \inf\operatorname{spec} U(z)$, which is our lower bound to the ground state energy of $K$. There is some number $E$, independent of all parameters, such that $\inf_z \inf\operatorname{spec} U(z) \le ME/2 - \text{const.}$, because $\inf_z \inf\operatorname{spec} U(z)$ is less than the known upper bound to the ground state energy of $K$. Then we can, and will, restrict our attention to $z$'s with $\langle T\rangle_z \le ME$ because only those values of $z$ are relevant for computing $\inf_z \inf\operatorname{spec} U(z)$, as (74) shows. Only the existence of $E$ and not its value is important.

We conclude from (72) and the fact that $\langle T\rangle_z \le ME$ for the $z$ in question that

$$\inf_z\, \inf\operatorname{spec} U(z) \ge \inf_\Phi\, \mathcal{E}[\Phi] , \tag{75}$$

where

$$\mathcal{E}[\Phi] = \langle \Phi | K_0 | \Phi\rangle + D_1\, I(\Phi) + \frac{C}{M}\Big(\|\Phi\|_2^2 - M\Big)^2 - D_2\, \|\nabla\Phi\|_2^2 - D_3 - \frac{2C}{M}\, (J+1)\, \|\Phi\|_2^2 . \tag{76}$$

The notation is the following:

$$D_1 = 1 - \delta - e_J^{-1/4}\, \|W\|_1\, ME - 2\sqrt{\frac{2}{3}}\, \frac{1}{(2\pi^2)^{1/3}}\, \|W\|_\infty^{1/6}\, \|W\|_1^{1/3} - \frac{4}{3\pi^{2/3}}\, \eta^{-1/4}\, \|W\|_1^{1/2}\, e_J^{-1/4}\, M^{1/2} E^{1/2} , \tag{77}$$



$$\begin{aligned} D_2 ={} & 2\sqrt{\frac{2}{3}}\, \frac{1}{(2\pi^2)^{1/3}}\, \|W\|_\infty^{1/6}\, \|W\|_1^{1/3} + \frac{4}{3\pi^{2/3}}\, \eta^{-1/4}\, \|W\|_1^{1/2}\, e_J^{-1/4}\, M^{1/2} E^{1/2} \\ & + \frac{4}{\pi^2}\sqrt{\frac{2}{27}}\; \eta^{-1/2}\, \|W\|_1\, M^{1/2} E^{1/2} + (4/3)^{3/2}\, \frac{1}{2\pi^2}\, \eta^{-1/2}\, \|W\|_1 \sum_{i\le J} \sqrt{e_i} \\ & + 4\, (4/3)^{5/4}\, \frac{1}{(2\pi^2)^{5/6}}\, \|W\|_1^{5/6}\, \|W\|_\infty^{1/6}\, \eta^{-3/4} \sum_{i\le J} e_i^{3/4} , \end{aligned} \tag{78}$$

and

$$\begin{aligned} D_3 ={} & \frac{2CJ}{M}\Big(ME/e_J + \tfrac12\Big) + (4/3)^{3/2}\, \frac{1}{2\pi^2}\, \eta^{-3/2}\, \|W\|_1\, ME \sum_{i\le J} \sqrt{e_i} \\ & + 4\, (4/3)^{5/4}\, \frac{1}{(2\pi^2)^{5/6}}\, \|W\|_1^{5/6}\, \|W\|_\infty^{1/6}\, \eta^{-3/4} \sum_{i\le J} e_i^{3/4}\, \Big(ME\, e_J^{-1} + \tfrac12\Big) + \sum_{i\le J} e_i + \frac{1}{\delta}\|W\|_\infty . \end{aligned} \tag{79}$$

We have neglected the last term in (61) containing $\big(1 - e_J^{-1/4}(2\pi\eta^{3/2})^{-1} - 2C/e_J\big)$, assuming $J$ to be large enough to make this term positive. (Recall that $e_J \to \infty$ as $J \to \infty$.)

Our final result in this section, (75)–(76), might not appear to be useful at first sight, but the reader should note that the first two terms in (76) are essentially the GP energy expression. The term $\langle \Phi | K_0 | \Phi\rangle$ is the relevant (i.e., low momentum) part of the kinetic energy $\int |(i\nabla - A)\Phi|^2$. The coefficient $D_1$ equals 1 to leading order and $I(\Phi)$ is essentially the GP quartic term $4\pi a \int |\Phi|^4$ (up to errors which will be controlled). Moreover, for $C$ large enough the term $C(\|\Phi\|_2^2 - M)^2/M$ ensures that we have the right particle number. For an appropriate choice of the parameters $J$, $\eta$ and $R$ all other terms are of lower order as $N \to \infty$, as we shall show.

Step 4. Bounds on three-particle density. So far we have bounded the main term in (26), namely inf spec $K$. Of the various other terms in (26) that have to be bounded, the one that is most intuitively negligible, but which we find the hardest to control, is the last term in (26). To show that it is small we have to show that the probability of finding three particles within a distance $2R$ of each other (in a true ground state of $H_{M,N}$) is small. This is accomplished in this section.

We begin with a lemma about the possible size of the expectation value of a function of the coordinates of three bosons. Recall from Step 2 that $\langle\,\cdot\,\rangle_B$ denotes expectation value in the bosonic, zero-temperature state of the $M$-body Hamiltonian $H_{M,N}$.

Lemma 2. Let $\xi(x_1, x_2, x_3)$ be any positive function of $x_1$, $x_2$ and $x_3 \in \mathbb{R}^3$. With $V$ the one-body potential appearing in $H_{M,N}$, we define the three-body, independent particle Hamiltonian $h = -\Delta_1 - \Delta_2 - \Delta_3 + V(x_1) + V(x_2) + V(x_3)$. Let $\alpha > 0$ and let $e^{-\alpha h}(x_1, x_2, x_3; y_1, y_2, y_3)$ be the 'heat kernel' of $h$ at 'inverse temperature' $\alpha$. Finally, consider the modified integral kernel

$$e^{-\alpha h}(x_1, x_2, x_3; y_1, y_2, y_3)\, \sqrt{\xi(x_1, x_2, x_3)\, \xi(y_1, y_2, y_3)} , \tag{80}$$



and let $\Lambda$ denote its largest eigenvalue (i.e., its norm as a map from $L^2(\mathbb{R}^9)$ to $L^2(\mathbb{R}^9)$). Then

$$\langle \xi(x_1, x_2, x_3)\rangle_B \le \Lambda\, \exp\{\alpha(E_0(M,N) - E_0(M-3,N))\} . \tag{81}$$

Note that for the $M$ and $N$ under consideration here, we have $E_0(M,N) - E_0(M-3,N) \le 3Z$, as explained in Step 1. It is the appearance of the peculiar difference $E_0(M,N) - E_0(M-3,N)$ in Lemma 2 that led us to the discussion in Step 1. If the three-body correlations could be bounded more expeditiously than is done here, Step 1 could be simplified.

Proof. We denote by $\operatorname{Tr}[\,\cdot\,]$ the trace over all of $L^2(\mathbb{R}^{3M})$, not just the bosonic states, and by $P_b$ the projection onto the bosonic (i.e., symmetric) subspace. Note that $\exp\{-\beta H_{M,N}\}$ is trace class for large enough $\beta$, by our assumption on the logarithmic increase of the potential $V(x)$. (This follows from the Feynman-Kac-Itô formula, together with the results in the Appendix.) Hence

$$\langle \xi\rangle_B = \lim_{n\to\infty} \frac{\operatorname{Tr}[\xi\, e^{-\alpha n H_{M,N}} P_b]}{\operatorname{Tr}[e^{-\alpha n H_{M,N}} P_b]} , \tag{82}$$

independently of $\alpha$, of course. Note that $H_{M,N}$ commutes with $P_b$ so $e^{-\alpha n H_{M,N}} P_b$ is self-adjoint and positive. The multiplication operator $\xi$ is also positive and we can write $\xi\, e^{-\alpha n H_{M,N}} P_b = [\xi\, e^{-\alpha H_{M,N}} P_b]\, e^{-\alpha(n-1) H_{M,N}} P_b$. Hölder's inequality for traces of positive operators states that $\operatorname{Tr} AB \le \{\operatorname{Tr} A^n\}^{1/n}\, \{\operatorname{Tr} B^{n/(n-1)}\}^{(n-1)/n}$, and therefore

$$\frac{\operatorname{Tr}[\xi\, e^{-\alpha n H_{M,N}} P_b]}{\operatorname{Tr}[e^{-\alpha n H_{M,N}} P_b]} \le \bigg(\frac{\operatorname{Tr}[(\xi\, e^{-\alpha H_{M,N}} P_b)^n]}{\operatorname{Tr}[e^{-\alpha n H_{M,N}} P_b]}\bigg)^{1/n} . \tag{83}$$

Consider the bigger projection $\widetilde P_b$, which symmetrizes only among particles $4, 5, \dots, M$. It commutes with $H_{M,N}$ and also with $\xi$, and hence $e^{-\alpha H_{M,N}} P_b \le e^{-\alpha H_{M,N}} \widetilde P_b$. Since $\xi \ge 0$, this yields the upper bound

$$\bigg(\frac{\operatorname{Tr}[(\xi\, e^{-\alpha H_{M,N}} P_b)^n]}{\operatorname{Tr}[e^{-\alpha n H_{M,N}} P_b]}\bigg)^{1/n} \le \bigg(\frac{\operatorname{Tr}[\widetilde P_b\, (\xi\, e^{-\alpha H_{M,N}})^n]}{\operatorname{Tr}[e^{-\alpha n H_{M,N}} P_b]}\bigg)^{1/n} . \tag{84}$$

We now claim that

$$\operatorname{Tr}[\widetilde P_b\, (\xi\, e^{-\alpha H_{M,N}})^n] \le \operatorname{Tr}_3 (\xi\, e^{-\alpha h})^n\; \operatorname{Tr}_{M-3}[e^{-\alpha n H_{M-3,N}}\, \widetilde P_b] , \tag{85}$$

where $\operatorname{Tr}_3$ and $\operatorname{Tr}_{M-3}$ denote the trace over the first 3 and last $M-3$ particles, respectively. Taking the limit $n \to \infty$ this proves (81).

To show (85), we write $H_{M,N} = H_{3,N}\otimes I_{M-3} + I_3\otimes H_{M-3,N} + W$, with $W$ denoting the interaction between the first 3 and the last $M-3$ particles. Note that $W \ge 0$. Using the Trotter product formula, we first replace each factor $e^{-\alpha H_{M,N}}$ by $\big(e^{-\alpha H_{3,N}/m}\, e^{-\alpha(H_{M-3,N}+W)/m}\big)^m$ for some integer $m$. (Here we abuse the notation slightly, omitting to write tensor products and identity operators.) For $x = (x_1, x_2, x_3)$, let $k(x, x')$ denote the integral kernel of $e^{-\alpha H_{3,N}/m}$. Denoting by $W_x$ the multiplication operator on the



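subspace of the last $M-3$ particles obtained by fixing the first 3 to have positions $x$, and introducing $nm$ integration variables $x_{ij}$, $1\le i\le n$, $1\le j\le m$, we can write

$$\begin{aligned} \operatorname{Tr}\Big[\widetilde P_b \Big(\xi\, \big(e^{-\alpha H_{3,N}/m}\, e^{-\alpha(H_{M-3,N}+W)/m}\big)^m\Big)^n\Big] ={} & \int \prod_{ij} dx_{ij}\; \prod_i \xi(x_{i1})\; \prod_{ij} k(x_{ij}, x_{i(j+1)}) \\ & \times \operatorname{Tr}_{M-3}\Big[\widetilde P_b \prod_{i,j} e^{-\alpha(H_{M-3,N} + W_{x_{ij}})/m}\Big] , \end{aligned} \tag{86}$$

where we identify $x_{i(m+1)} \equiv x_{(i+1)1}$ and $x_{n(m+1)} \equiv x_{11}$. By Hölder's inequality for traces, we can estimate

$$\operatorname{Tr}_{M-3}\Big[\widetilde P_b \prod_{i,j} e^{-\alpha(H_{M-3,N} + W_{x_{ij}})/m}\Big] \le \sup_{ij}\, \operatorname{Tr}_{M-3}\Big[\widetilde P_b\, e^{-\alpha n(H_{M-3,N} + W_{x_{ij}})}\Big] \le \operatorname{Tr}_{M-3}\Big[\widetilde P_b\, e^{-\alpha n H_{M-3,N}}\Big] , \tag{87}$$

where in the last inequality we used the fact that $W_x \ge 0$ and that the partition function is monotone in the potential. By the Feynman-Kac-Itô formula [24, Sect. 15], the integral kernel $k(x, x')$ is bounded in absolute value by the kernel of $e^{-\alpha h/m}$. Using this estimate and rewriting the integrals as a trace we obtain

$$\operatorname{Tr}\Big[\widetilde P_b \Big(\xi\, \big(e^{-\alpha H_{3,N}/m}\, e^{-\alpha(H_{M-3,N}+W)/m}\big)^m\Big)^n\Big] \le \operatorname{Tr}_3 (\xi\, e^{-\alpha h})^n\; \operatorname{Tr}_{M-3}[e^{-\alpha n H_{M-3,N}}\, \widetilde P_b] . \tag{88}$$

Letting $m \to \infty$ this yields (85).

We now use Lemma 2 to obtain a bound on the various terms in (26) and (27). Lemma 2 immediately implies that

$$\langle |x_1|^2\rangle_B \le e^{3\alpha Z}\, \big\| |x|\, e^{\alpha(\Delta - V(x))}\, |x| \big\|_\infty , \tag{89}$$

with $\|\cdot\|_\infty$ denoting operator norm. For positive operators, the operator norm is bounded by the trace, in this case given by $\operatorname{Tr} |x|^2 e^{\alpha(\Delta - V(x))}$. This expression, in turn, is bounded for $\alpha$ large enough, as shown in the Appendix. In exactly the same way we can bound $\langle |x_1|^4\rangle_B$. Moreover, we have that

$$\langle w_R(x_1 - x_2)\rangle_B \le e^{3\alpha Z}\, \big\| \sqrt{w_R}\, e^{-\alpha h}\, \sqrt{w_R} \big\|_\infty \le e^{3\alpha Z}\, \frac{1}{(4\pi\alpha)^{3/2}} \int_{\mathbb{R}^3} w_R(x)\, dx . \tag{90}$$

The last inequality can be seen as follows. Denote by $k(x, x')$ the kernel of $e^{\alpha(\Delta - V(x))}$. The Feynman-Kac formula implies that $k(x, x') \le (4\pi\alpha)^{-3/2}$ for any positive $V(x)$.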



Hence, for any function $f \in L^2(\mathbb{R}^6)$,

$$\begin{aligned} & \int dx\, dx'\, dy\, dy'\; \overline{f(x,y)}\, \sqrt{w_R(x-y)}\, k(x,x')\, k(y,y')\, \sqrt{w_R(x'-y')}\, f(x',y') \\ &\quad \le (4\pi\alpha)^{-3/2} \int dx\, dx'\, k(x,x') \bigg[\int dy\, \sqrt{w_R(x-y)}\, |f(x,y)|\bigg] \bigg[\int dy\, \sqrt{w_R(x'-y)}\, |f(x',y)|\bigg] \\ &\quad \le (4\pi\alpha)^{-3/2} \int w_R \int dx\, dx'\, k(x,x') \bigg[\int dy\, |f(x,y)|^2\bigg]^{1/2} \bigg[\int dy\, |f(x',y)|^2\bigg]^{1/2} , \end{aligned} \tag{91}$$

where we used Schwarz's inequality in the last step. The result now follows from the fact that $e^{\alpha(\Delta - V(x))} \le I$. Similarly, repeating the above argument with $x_2$ in place of $x$ and $(x_1, x_3)$ in place of $y$, we obtain

$$\langle U_R(x_1 - x_2)\, \theta(2R - |x_2 - x_3|)\rangle_B \le e^{3\alpha Z}\, \frac{1}{(4\pi\alpha)^3} \int_{\mathbb{R}^3} U_R(x)\, dx \int_{\mathbb{R}^3} \theta(2R - |x|)\, dx = e^{3\alpha Z}\, \frac{2}{3\pi}\, \frac{R^3}{\alpha^3} . \tag{92}$$

This finishes our bounds on the various terms appearing in (26) and (27).

Step 5. Collection of all the terms and the final inequality. In this section we concatenate the various pieces of the lower bound to the energy $E_0(M,N)$ in (26), and finish the proof of Proposition 1. Inequality (26) contains several terms. All except inf spec $K$ were bounded in Step 4 and in (27). The essence of Step 3 is the bound on the main term

$$\inf\operatorname{spec} K \ge \inf_z\, \inf\operatorname{spec} U(z) \ge \inf_\Phi\, \mathcal{E}[\Phi] , \tag{93}$$

where $\mathcal{E}[\Phi]$ is defined in (76).

Let us begin by disposing of the terms mentioned in Step 4. As shown there, $\langle |x_1|^2\rangle_B \le \text{const.}$ and $\langle |x_1|^4\rangle_B \le \text{const.}$ for some constant depending only on $Z$. (Recall that $Z$ is a fixed number of order 1.) Moreover, from (90) and (29) we see that (recalling that $M \le N$)

$$\frac{M^2 a}{N^2 \varepsilon}\, \langle w_R(x_1 - x_2)\rangle_B \le \text{const.}\, \frac{a}{\varepsilon}\, \frac{R^2}{s^2} . \tag{94}$$

This term will thus be negligible, if $R \to 0$ as $N \to \infty$ (keeping $\varepsilon$ and $s$ fixed for the moment). We are free to choose the dependence of $R$ on $N$, and we choose $R$ to satisfy

$$N^{-1/3} \gg R \gg N^{-2/3} \quad \text{as } N \to \infty . \tag{95}$$

The last term to estimate is then

$$\frac{a M^3}{N^2}\, \langle U_R(x_1 - x_2)\, \theta(2R - |x_2 - x_3|)\rangle_B \le \text{const.}\, a N R^3 \ll 1 . \tag{96}$$
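The window (95) is exactly what makes both the three-body error in (96) and the ratio $\|W\|_\infty/N$ small: the first requires $N R^3 \to 0$, the second, via $\|W\|_\infty \le 6a/(R^3 N)$, requires $R^3 N^2 \to \infty$. The sketch below makes this concrete for the hypothetical power-law choice $R = N^{-\beta}$; the function name `error_terms`, the value $a = 1$ and the specific exponents are assumptions used only for illustration.

```python
def error_terms(N, beta, a=1.0):
    """Error terms for the hypothetical choice R = N**(-beta).

    Returns (a*N*R^3, 6a/(R^3 N^2)): the right side of (96) up to a constant,
    and the bound on ||W||_inf divided by N.
    """
    R = N ** (-beta)
    three_body = a * N * R**3
    w_sup_over_N = 6.0 * a / (R**3 * N**2)
    return three_body, w_sup_over_N

if __name__ == "__main__":
    for beta in (0.25, 0.5, 0.75):
        # beta = 0.5 lies inside the window 1/3 < beta < 2/3; the other two lie outside.
        print(beta, [error_terms(N, beta) for N in (1e4, 1e8)])
```

For $\beta = 1/2$ both quantities decay like $N^{-1/2}$, whereas for $\beta = 1/4$ the three-body term grows and for $\beta = 3/4$ the bound on $\|W\|_\infty/N$ grows.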



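Hence it follows from (26) and (27) that, for any fixed $s$, $\varepsilon$ and $\eta$ (recalling that $\lambda = \lim_{N\to\infty} M/N$),

$$\lim_{N\to\infty} \frac{1}{N}\, (1 + 4\eta)\, E_0(M,N) \ge \lim_{N\to\infty} \frac{1}{N}\, \inf_\Phi\, \mathcal{E}[\Phi] + \lambda\kappa(\eta) - \text{const.}\, \lambda\eta . \tag{97}$$

The only thing left is the minimization of $\mathcal{E}[\Phi]$ given in (76), which contains the numbers $D_1$, $D_2$ and $D_3$ in (77)–(79). To evaluate them as $N \to \infty$ we note that $\|W\|_1 \le 4\pi a/N$, and $\|W\|_\infty \ll N$ for our choice of $R$ in (95). Hence, we see that

$$\lim_{\delta\to 0}\, \lim_{J\to\infty}\, \lim_{N\to\infty} D_1 = 1 , \qquad \lim_{J\to\infty}\, \lim_{N\to\infty} D_2 = 0 , \qquad \text{and} \qquad \lim_{N\to\infty} \frac{1}{N} D_3 = 0 . \tag{98}$$

Using the fact that both $\|\nabla\Phi\|_2^2$ and $\|\Phi\|_2^2$ are bounded relative to $\langle \Phi | K_0 | \Phi\rangle$, and rescaling $\Phi \to M^{1/2}\Phi$, we obtain

$$\lim_{J\to\infty}\, \lim_{N\to\infty} \frac{1}{N}\, \inf_\Phi\, \mathcal{E}[\Phi] \ge \lim_{R\to 0}\, \inf_\Phi \bigg\{ \lambda \langle \Phi | K_0 | \Phi\rangle + (1-\varepsilon)\, a\lambda^2 \int |\Phi(x)|^2 |\Phi(y)|^2\, U_R(x-y)\, dx\, dy + C\lambda \Big(\|\Phi\|_2^2 - 1\Big)^2 \bigg\} . \tag{99}$$

Note that the infimum can obviously be restricted to a set of bounded $\langle \Phi | K_0 | \Phi\rangle$, independent of $R$, since $U_R \ge 0$. Since $K_0 \ge -\eta\Delta$ this implies that we can assume that $\|\nabla\Phi\|_2$ is bounded independent of $R$, and hence also $\|\Phi\|_6$ by Sobolev's inequality. Using the inequality (proved below)

$$\bigg| \int |\Phi(x)|^2 |\Phi(y)|^2\, U_R(x-y)\, dx\, dy - 4\pi \|\Phi\|_4^4 \bigg| \le 8\pi R\, \|\Phi\|_6^3\, \|\nabla\Phi\|_2 , \tag{100}$$

we see that we can interchange the limit and the infimum and thus obtain

$$\lim_{J\to\infty}\, \lim_{N\to\infty} \frac{1}{N}\, \inf_\Phi\, \mathcal{E}[\Phi] \ge \inf_\Phi \bigg\{ \lambda \langle \Phi | K_0 | \Phi\rangle + (1-\varepsilon)\, 4\pi a\lambda^2\, \|\Phi\|_4^4 + C\lambda \Big(\|\Phi\|_2^2 - 1\Big)^2 \bigg\} . \tag{101}$$

Inequality (100) can be obtained in the following way. Using Schwarz's inequality, as well as $\int U_R(y)\, dy = 4\pi$,

$$\begin{aligned} \bigg| \int |\Phi(x)|^2 |\Phi(y)|^2\, U_R(x-y)\, dx\, dy - 4\pi \|\Phi\|_4^4 \bigg| &\le \int dy\, U_R(y) \int dx\, |\Phi(x)|^2\, \big(|\Phi(x)| + |\Phi(x+y)|\big)\, \big| |\Phi(x)| - |\Phi(x+y)| \big| \\ &\le 2\|\Phi\|_6^3 \int dy\, U_R(y) \bigg[\int dx\, \big| |\Phi(x)| - |\Phi(x+y)| \big|^2\bigg]^{1/2} . \end{aligned} \tag{102}$$

The result now follows from the fact that $\| |\Phi| - |\Phi(\,\cdot\, + y)| \|_2 \le |y|\, \|\nabla\Phi\|_2$, which can be seen by evaluating the norm in Fourier space, using $|1 - e^{-ip\cdot y}|^2 \le |y|^2 |p|^2$, and also using the fact that $\int U_R(y)\, |y|\, dy \le R \int U_R(y)\, dy = 4\pi R$.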



Now, letting $C \to \infty$, we infer from (101) that

$$\lim_{C\to\infty}\, \lim_{J\to\infty}\, \lim_{N\to\infty} \frac{1}{N}\, \inf_\Phi\, \mathcal{E}[\Phi] \ge \inf_{\|\Phi\|_2 = 1} \Big\{ \lambda \langle \Phi | K_0 | \Phi\rangle + (1-\varepsilon)\, 4\pi a\lambda^2\, \|\Phi\|_4^4 \Big\} . \tag{103}$$

The final step is to remove the momentum cutoff in $K_0$, i.e., to let $s \to 0$ in Eq. (103). Again, we claim that we can interchange the limit and the infimum, at least to obtain a lower bound. Let $\Phi_s$ denote a minimizer of the functional on the right side of (103). Since $K_0 \ge -\eta\Delta + V(x)$ and $V(x) \to \infty$ as $|x| \to \infty$, a sequence $\Phi_{s_j}$ with $s_j \to 0$ as $j \to \infty$ lies in a compact subset of $L^2(\mathbb{R}^3)$, and hence there exists a subsequence which converges strongly and pointwise almost everywhere [11] (both in $p$-space and $x$-space) to a function $\Phi_0$ as $j \to \infty$, with $\|\Phi_0\|_2 = 1$. All the $s$-independent terms in the functional on the right side of (103) are weakly lower semicontinuous. Moreover, by Fatou's Lemma [11],

$$\lim_{s\to 0} \int p^2 \big(1 - \chi_s(p)^2\big)\, |\widehat\Phi_s(p)|^2\, dp \ge \int p^2\, |\widehat\Phi_0(p)|^2\, dp . \tag{104}$$

Hence the infimum and the limit $s \to 0$ can be interchanged for a lower bound. In combination with inequalities (103) and (97), we find that

$$\lim_{N\to\infty} \frac{1}{N}\, (1 + 4\eta)\, E_0(M,N) \ge \inf_{\|\Phi\|_2 = 1} \Big\{ \lambda \big\langle \Phi \big| -\Delta + 2p\cdot A(x) + A(x)^2 + V(x) \big| \Phi \big\rangle + (1-\varepsilon)\, 4\pi a\lambda^2\, \|\Phi\|_4^4 \Big\} - \text{const.}\, \lambda\eta . \tag{105}$$

(For a lower bound we simply dropped the positive terms $-2\eta\Delta$ and $\eta|x|^4$ in $K_0$.) By letting $\eta \to 0$ and $\varepsilon \to 0$ Proposition 1 is proved. As explained in Step 1, this proves Theorem 1.

Remark about the optimal choice of the parameters. In Eq. (95) we showed how the parameter $R$ has to depend on $N$, as $N \to \infty$, in order to obtain the correct limit for the energy. The explicit dependence on $N$ of the other parameters $J$, $C$, $s$, $\eta$ and $\varepsilon$ need not be specified so closely (unless we wish to obtain a detailed error estimate). It suffices to let $J \to \infty$, $C \to \infty$, $s \to 0$, $\eta \to 0$ and $\varepsilon \to 0$ (in this order) after taking the $N \to \infty$ limit.

3. Proof of Theorem 2

Step 1. Proof of Part (i). The fact that $\Gamma$ is a convex set follows easily from its definition. Namely, if $\gamma_N$ and $\bar\gamma_N$ are two approximate ground state sequences, and $0 \le \lambda \le 1$, then $\lambda\gamma_N + (1-\lambda)\bar\gamma_N$ is certainly also an approximate ground state sequence, whose reduced one particle density matrix is given by $\lambda\gamma_N^{(1)} + (1-\lambda)\bar\gamma_N^{(1)}$.

Compactness of $\Gamma$ is also not difficult to see. Given a sequence $\gamma_i \in \Gamma$, the Banach-Alaoglu Theorem implies the existence of a subsequence such that $\gamma_i \rightharpoonup \gamma_\infty$ for some $\gamma_\infty$ in the weak-* sense as $i \to \infty$. As already remarked in the introduction, the fact that $\operatorname{Tr} H_0\gamma_i \le \text{const.}$ implies that $\gamma_i \to \gamma_\infty$ in trace norm. To prove compactness we have to show that $\gamma_\infty \in \Gamma$.



By definition, corresponding to every $\gamma_i$ there is an approximate ground state sequence $\gamma_{N,i}$. That is, there is a number $N_i$ such that $N \ge N_i$ implies that $\|\gamma_{N,i}^{(1)} - \gamma_i\| \le 1/i$ and $|N^{-1}\operatorname{Tr} H_N\gamma_{N,i} - E^{\rm GP}(a)| \le 1/i$. (Here, $\|\cdot\|$ denotes trace norm.) We can assume that $N_i \to \infty$ as $i \to \infty$. Now, for given $N$, let $\hat\imath(N)$ be the largest integer $i$ such that $N \ge N_i$. Then $\hat\imath(N) \to \infty$ as $N \to \infty$, and hence the sequence $\gamma_{N,\hat\imath(N)}$ is an approximate ground state sequence. Moreover, $\|\gamma_{N,\hat\imath(N)}^{(1)} - \gamma_\infty\| \le \|\gamma_{N,\hat\imath(N)}^{(1)} - \gamma_{\hat\imath(N)}\| + \|\gamma_{\hat\imath(N)} - \gamma_\infty\| \to 0$ as $N \to \infty$. This proves that $\gamma_\infty \in \Gamma$, and hence $\Gamma$ is compact.

Step 2. An extension of Theorem 1. A key step in the proof of Theorem 2 is an extension of the lower bound in Theorem 1 to the case of a perturbed Hamiltonian, where we replace the one-particle part $H_0$ of the Hamiltonian (1) by $H_0 + S$, where $S$ is a bounded hermitian operator on the one-particle space $L^2(\mathbb{R}^3)$. Let $H_N^{(S)}$ denote the perturbed $N$-particle operator

$$H_N^{(S)} = H_N + \sum_{i=1}^N S^{(i)} ,$$

and let $E_0^{(S)}(N) = \inf\operatorname{spec} H_N^{(S)}$ denote its ground state energy. Correspondingly, define the perturbed GP functional $\mathcal{E}^{\rm GP}_{(S)}$ as in (5), with $H_0 + S$ in place of $H_0$, and let $E^{\rm GP}_{(S)}(a)$ denote its infimum over all $\phi$ with $\|\phi\|_2 = 1$. Then we have the following extension of Theorem 1, to whose proof we will devote the remainder of this subsection.

Proposition 2. For all bounded hermitian operators $S$,

$$\liminf_{N\to\infty} \frac{1}{N}\, E_0^{(S)}(N) \ge E^{\rm GP}_{(S)}(a) . \tag{106}$$

We start by noting that in order to prove Proposition 2 it suffices to prove it in the special case in which $S$ is a finite rank operator with exponentially decaying eigenfunctions. In particular, we can assume that its integral kernel $S(x,y)$ satisfies a bound

$$|S(x,y)| \le B \exp\big(-D(|x| + |y|)\big) \tag{107}$$

for some positive constants $B$ and $D$. This can be seen as follows. Let $\{f_i\}_{i=1}^\infty$ be an orthonormal basis for $L^2(\mathbb{R}^3)$ such that $|f_i(x)| < B_i \exp(-D_i|x|)$ for some choice of constants $B_i, D_i > 0$ and let $P_n$ denote the projection onto the first $n$ of these functions. Clearly, $P_n \to I$ strongly as $n \to \infty$. Then, for any bounded $S$, $P_n S P_n$ is of the desired form, i.e., it has finite rank and its integral kernel satisfies a bound of the form (107). For any one-particle density matrix $\gamma$,

$$\operatorname{Tr}\, \gamma\, (S - P_n S P_n) \le \bigg\| \frac{1}{\sqrt{H_0}}\, (S - P_n S P_n)\, \frac{1}{\sqrt{H_0}} \bigg\|\; \operatorname{Tr}[H_0\gamma] , \tag{108}$$

with $\|\cdot\|$ denoting operator norm. Since $H_0^{-1/2}$ is compact and, therefore, is the norm limit of finite rank operators, it is easy to see that the norm in (108) goes to zero as $n \to \infty$. On the other hand the set of numbers $\operatorname{Tr}[H_0\gamma]$ that arise from those $\gamma$'s that come from approximate ground states is bounded. Consequently, both sides of (106) can be approximated to within any desired $\varepsilon$ by replacing $S$ by $P_n S P_n$ and choosing $n$ large enough, which implies the statement.



Thus we can assume (107) henceforth. The proof of Proposition 2 then follows exactly the same lines as the proof of Theorem 1. In fact, our proof of Theorem 1 has the advantage of being almost completely independent of the exact form of the Hamiltonian. The only place where we used the explicit form is Lemma 2, which was used to bound expectation values of certain one-, two- and three-body operators in the zero-temperature state of $H_{M,N}$. We now have to bound the expectation value of these operators in the zero-temperature state of $H^{(S)}_{M,N}$, which we denote as $\langle\,\cdot\,\rangle^{(S)}_B$. (Here, the operator $H^{(S)}_{M,N}$ is defined in the obvious way. Its ground state energy will be denoted by $E_0^{(S)}(M,N)$.) To this end, Lemma 2 can be extended in the following way.

Lemma 3. Let $\xi(x_1, x_2, x_3)$ be any positive function of $x_1$, $x_2$ and $x_3 \in \mathbb{R}^3$. Let $\widehat{S}$ denote the rank one operator on the one-particle space with integral kernel given by the right side of (107). With V the one-body potential appearing in $H_{M,N}$, we define the three-body, independent particle Hamiltonian,

$$h^{(S)} = -\Delta_1 - \Delta_2 - \Delta_3 + V(x_1) + V(x_2) + V(x_3) - \widehat{S}_1 - \widehat{S}_2 - \widehat{S}_3 . \qquad (109)$$

Let α > 0 and let $e^{-\alpha h^{(S)}}(x_1, x_2, x_3; y_1, y_2, y_3)$ be the 'heat kernel' of $h^{(S)}$ at 'inverse temperature' α. Finally, consider the modified integral kernel

$$e^{-\alpha h^{(S)}}(x_1, x_2, x_3; y_1, y_2, y_3)\, \sqrt{\xi(x_1, x_2, x_3)\, \xi(y_1, y_2, y_3)} \qquad (110)$$

and let $\Lambda^{(S)}$ denote its largest eigenvalue (i.e., its norm as a map from $L^2(\mathbb{R}^9)$ to $L^2(\mathbb{R}^9)$). Then

$$\left\langle \xi(x_1, x_2, x_3) \right\rangle^{(S)}_B \le \Lambda^{(S)} \exp\left\{ \alpha \left( E_0^{(S)}(M,N) - E_0^{(S)}(M-3,N) \right) \right\} . \qquad (111)$$

The proof follows along the same lines as the proof of Lemma 2, except for one step. Before Eq. (88), it was necessary to get an upper bound on the absolute value of the integral kernel of $\exp\{-\alpha H_{3,N}/m\}$ in terms of the kernel of $\exp\{-\alpha h/m\}$, which can be obtained with the help of the Feynman-Kac-Itô formula. In the case considered here, we need an upper bound on the integral kernel of $\exp\{-\alpha H^{(S)}_{3,N}/m\}$. We will now show that the absolute value of this kernel is bounded above by the kernel of $\exp\{-\alpha h^{(S)}/m\}$ for the modified three-particle operator $h^{(S)}$ in (109). This claim follows from the Trotter product formula, together with the Feynman-Kac-Itô formula, in the following way. Since S is a bounded (in fact, finite rank) operator, we can write

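$$e^{-\alpha H^{(S)}_{3,N}/m} = \lim_{n\to\infty} \left( e^{-\alpha H_{3,N}/n} \left( 1 - \frac{\alpha}{n} S \right) \right)^{n/m} . \qquad (112)$$

By the Feynman-Kac-Itô formula, (107) and the definition of $\widehat{S}$,

$$\left| \left( e^{-\alpha H_{3,N}/n} \left( 1 - \frac{\alpha}{n} S \right) \right)^{n/m} (x, y) \right| \le \left( e^{-\alpha h/n} \left( 1 + \frac{\alpha}{n} \widehat{S} \right) \right)^{n/m} (x, y) . \qquad (113)$$

In the limit $n \to \infty$, the operator on the right side converges strongly to $e^{-\alpha h^{(S)}/m}$. This proves our claim. For the application of this lemma, as in Sect. 2, Step 4, it is necessary to have some bounds on the kernel of $e^{-\alpha h^{(S)}}$. In particular, we need that the kernel is bounded, and that its diagonal decays for large |x| at least like $|x|^{-\mathrm{const.}\,\alpha}$ for some positive constant.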
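The product formula (112) can be illustrated numerically in finite dimensions. The following is only a sketch under stated assumptions: the 2×2 symmetric matrices `H` and `S` are hypothetical stand-ins for $H_{3,N}$ and the bounded perturbation, we take m = 1, and the matrix exponential is a small hand-written helper rather than a library routine.

```python
# Finite-dimensional illustration of exp(-alpha (H + S)) = lim (exp(-alpha H / n) (1 - alpha S / n))^n.
# H and S below are made-up 2x2 symmetric matrices, not the operators of the paper.

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(a, c):
    return [[c * x for x in row] for row in a]

def expm(a, squarings=8, terms=20):
    """Matrix exponential via a Taylor series on a / 2**squarings, followed by repeated squaring."""
    small = scale(a, 1.0 / 2 ** squarings)
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = scale(matmul(term, small), 1.0 / k)
        result = add(result, term)
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def matpow(a, p):
    """Integer matrix power by binary exponentiation."""
    result = identity(len(a))
    base = a
    while p:
        if p & 1:
            result = matmul(result, base)
        base = matmul(base, base)
        p >>= 1
    return result

def trotter_product(H, S, alpha, n):
    """One factor exp(-alpha H / n) (1 - alpha S / n), raised to the n-th power."""
    step = matmul(expm(scale(H, -alpha / n)), add(identity(len(H)), scale(S, -alpha / n)))
    return matpow(step, n)

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

H = [[2.0, 0.5], [0.5, 1.0]]
S = [[0.3, -0.2], [-0.2, 0.1]]
ALPHA = 1.0

exact = expm(scale(add(H, S), -ALPHA))
errors = {n: max_abs_diff(trotter_product(H, S, ALPHA, n), exact) for n in (10, 100, 1000)}
for n, err in errors.items():
    print(n, err)
```

The printed errors should shrink as n grows, which is the finite-dimensional shadow of the limit in (112). The kernel inequality (113) is a separate statement that relies on the Feynman-Kac-Itô formula and is not tested by this sketch.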



As for the case S = 0, these properties are again shown in the appendix. It is there that the exponential decay of the kernel of

S gets used. As already mentioned, except for the replacement of Lemma 2 by Lemma 3, the proof of Proposition 2 consists of simply mimicking the discussion of the proof of Theorem 1 given in Sect. 2. Step 3. contains projections onto GP minimizers. Let GP ⊂ L2 (R3 ) denote the set of all minimizers of the GP functional (5). We now consider the special case where S = −|φ φ| for some φ ∈ GP . In this case, we claim that lim

N→∞

1 (λS) (N ) = E GP (a) − λ E N 0

(114)

for any λ ≥ 0. Given Theorem 1, the lower bound is trivial in this case. The upper bound can be derived in the same way as the upper bound for Theorem 1 in [23]. The arguments there also apply to this case, and the expectation value of S in the trial state can easily be estimated using the methods in [5, 15]. (In the non-rotating case, this was carried out in [21].) Taking the derivative of (114) at λ > 0, Griffiths’ argument [9, 18] implies that the one-particle density matrix of a ground state of HN(λS) converges to |φ φ| as N → ∞ in this case. Hence, by a similar ‘diagonal’ argument as at the end of the proof of part (i) of Theorem 2, we can find a sequence λN with λN → 0 as N → ∞ such that the (λ S)

ground state of HN N represents an approximate ground state sequence for the λ = 0 problem, and its reduced one-particle density matrix converges to |φ φ| as N → ∞. This shows that |φ φ| ∈ for any φ ∈ GP . (Remark. The claim of this subsection can in principle be proved by simply constructing an appropriate approximate ground state. However, although the one-particle density matrix of the trial state used in [23] converges to |φ φ| as N → ∞, this does not immediately imply that |φ φ| ∈ since the trial state is not symmetric! This explains the somewhat different reasoning in this subsection.) We note that also |φ φ| ∈ ext for all φ ∈ GP . This follows from the fact that all elements of are positive operators, and a rank one operator cannot be written as a non-trivial convex combination of two positive operators. In the next subsection, we will show that all elements of ext are of the form |φ φ| with φ ∈ GP . Step 4. Proof of Parts (ii) and (iii). For a given γ ∈ , let γN be an approximate ground state sequence for HN , with γN(1) → γ as N → ∞. By Proposition 2 we have that, for any bounded hermitian operator S and any λ ∈ R, 1 GP (a) . Tr HN(λS) γN ≥ E(λS) N→∞ N

E GP (a) + λ Tr Sγ = lim

(115)

Upon dividing by λ and letting λ → 0, this yields Tr Sγ ≥ lim

GP (a) − E GP (a) E(λS)

λ0

λ

.

(116)

We claim that lim

λ0

GP (a) − E GP (a) E(λS)

λ

= min φ|S|φ . φ∈GP

(117)



GP (a) ≤ E GP (a) Using φ ∈ GP as a trial function, we immediately see that E(λS) GP + λ φ|S|φ for all φ ∈ GP . For the other direction, we use a minimizer of E(λS) GP as a trial state for E . As λ → 0, this sequence of minimizers will have a subsequence that converges strongly to a minimizer of E GP . Hence, for some φ ∈ GP , GP (a) − E GP (a)) ≥ φ|S|φ , which proves our claim. (Note that this limλ0 λ−1 (E(λS) argument also proves that the right side of (117) is a true minimum and not merely an infimum.) We have thus shown that, for every bounded hermitian operator S, and every γ ∈ ,

Tr Sγ ≥ min φ|S|φ . φ∈GP

(118)

Replacing S by −S, this also implies that Tr Sγ ≤ maxφ∈GP φ|S|φ . Inequality (118) is the key to the proof of statements (ii) and (iii) in Theorem 2. Let Pn be a rank n projection, and let Pn = {Pn γ Pn : γ ∈ }. When γ is a bounded operator on L2 (R3 ), Pn γ Pn can be identified with an n × n complex matrix, and hence 2 with a vector in R2n . We make this identification (denoted by ι in the following) in order to be able to use finite-dimensional convexity theory (see, e.g., [20]). Note that ι is linear and continuous, and hence the set Bn = ιPn = {ιPn γ Pn : γ ∈ } is a closed 2 convex subset of R2n . An exposed point [20] of a convex set C ⊂ Rm is an extreme point p of C with the additional property that there is a tangent plane to C containing p but containing no other point of C. (For an example of points that are extreme but not exposed, let C ⊂ R2 be a square with each corner rounded off into a quarter of a circle. The extreme points are all the points on the four quarter-circles, including their endpoints, but the endpoints are not exposed.) An equivalent way to say this is that an exposed point p in C ⊂ Rm is characterized by the existence of a vector a ∈ Rm (a normal to the tangent plane) such that (a, p) ≤ (a, b)

for all b ∈ C ,

(119)

with equality if and only if b = p. (Here, (· , ·) denotes the standard inner product in Rm .) 2 γ Pn ∈ Pn . For a fixed n, an exposed point of Bn ⊂ R2n corresponds to some Pn This density matrix γ may not be unique and it may depend on n, but this is of no concern to us. We note that our space of density matrices is a complex space and, therefore, we have to translate (119) to this setting. For any two bounded operators γ , γ (not necessarily in ), the real inner product (· , ·) becomes (ιPn γ Pn , ιPn γ Pn ) = Tr(Pn γ † Pn γ ) ,

(120)

where γ † is the adjoint of γ . Translated to our original space, this means that if Pn γ Pn is an exposed point of Pn , then there exists an operator S (with Pn SPn = S) such that Tr S γ ≤ Tr Sγ

for all γ ∈ ,

(121)

or, equivalently, there exists a hermitian S such that Tr S γ ≤ Tr Sγ

for all γ ∈ .

(122)

Note that, by definition, equality holds in (122) if and only if Pn γ Pn = Pn γ Pn . We now use inequality (122), with γ = |φ φ|, where φ ∈ GP minimizes φ|S|φ among all



GP minimizers. We know from Step 3 that this γ is an element of . The inequalities (118) (applied to γ ) and (122) for this special choice of γ together imply that there is actually equality in this case, and thus that Pn γ Pn = Pn |φ φ|Pn . That is, all exposed points of Pn are of the form Pn |φ φ|Pn , with φ ∈ GP . We can go further and conclude that all extreme points in Pn are of this form, not only the exposed points. This follows from the fact that the set of GP minimizers is closed, together with Straszewicz’s Theorem [20, Thm. 18.6] which states that the exposed points are a dense subset of the extreme points. Carathéodory’s Theorem [20, Thm. 17.1] implies that every Pn γ Pn ∈ Pn can be 2 written as a convex combination of 2n + 1 extreme points. That is, there exist λi ≥ 0 with i λi = 1 such that  2  2n +1 Pn γ Pn = Pn  λi |φi φi | Pn , (123) i=1

with φi ∈ GP for all i. This equation defines an atomic (i.e., point) measure dµn (φ) supported on the (compact) space of projections onto GP minimizers. Let us provisionally call this space , with the intention of showing that ext = . For every ψ with Pn ψ = ψ we have thus shown that dµn (φ)| ψ|φ |2 with dµn (φ) = 1 . (124) ψ|γ |ψ =

To complete the proof of Theorem 2 we wish to take the limit n → ∞ in (124). We choose Pn in such a way that Pn converges strongly to the identity as n → ∞. The sequence dµn has a subsequence that converges weakly to some measure dµ with dµ = 1 (see [4, Vol. 1, Thm. 12.7 and 12.10]). This implies that, for ψ in a dense subset of L2 (R3 ) (namely, those ψ for which Pn ψ = ψ for some n), dµ(φ)| ψ|φ |2 with dµ(φ) = 1 . (125) ψ|γ |ψ =

Since(125) holds for a dense set of ψ, it actually holds for all ψ by continuity. That is, γ = dµ(φ)|φ φ| in the weak sense. Note that there is a representation (125) for γ ∈ ext (since there is such a representation for all γ ∈ ). It is not hard to see that for an extreme γ the corresponding Borel measure dµ must be an atomic measure at a single point in . Another way to say this is that ext ⊂ , which is exactly part (ii) of Theorem 2 (since we have already proved in Step 3 that ⊂ ext ). Part (iii) of Theorem 2 follows from (125), together with part (ii). This completes the proof of Theorem 2. We conclude with the direct proof of (11), which was promised just after the statement of Theorem 2. We start with (123) and choose Pn to be the projection onto the largest n eigenvalues of γ , with n large enough so that Tr |γ − Pn γ Pn | < ε2 /8. We now denote Pn = P , 1 − Pn = Q and B = i λi |φi φi |. From (123) (and a little algebra) we learn that γ − B = Q(γ − B)Q − QBP − P BQ. Thus Tr |γ − B| ≤ Tr (|Qγ Q| + |QBQ| + 2|QBP |). Obviously, Tr Qγ Q < ε2 /8 and since Tr B = Tr γ = 1, we also have Tr |QBQ| = Tr QBQ = Tr (1 − P )B = Tr (1 − P )γ = Tr Qγ Q < ε2 /8. The remain1/2 1/2 ing term √ can be bounded, using Schwarz’s inequality, by (Tr QBQ) (Tr P BP ) < ε/ 8. This proves (11).



Appendix: Heat Kernel Estimates

In this appendix we derive an upper bound on the heat kernel for a general Schrödinger operator. This bound will show, in particular, that for any s > 0 and α large enough (depending on s)

$$\mathrm{Tr}\, |x|^s e^{\alpha(\Delta - V)} < \infty \qquad (126)$$

if $V(x) \ge C_1 \ln(|x|) - C_2$ for some constants $C_1 > 0$ and $C_2$. This property was used in the proof of Theorem 1. (Actually, in the proof of Theorem 1 we used only the cases s = 2 and s = 4 (see Step 4 of Sect. 2) because we assumed $A = \tfrac{1}{2}\, \Omega \wedge x$, but (126) permits the inclusion of a magnetic field with polynomial growth of A.) Our bound on the heat kernel follows an idea of Symanzik [26]. Using the Feynman-Kac formula for the integral kernel, we can write

$$e^{\alpha(\Delta - V)}(x, y) = \int d\mu_{x,y}(\omega)\, \exp\left( - \int_0^{\alpha} ds\, V(\omega(s)) \right) , \qquad (127)$$

where $d\mu_{x,y}$ denotes the conditional Wiener measure for paths ω going from x to y in time α. By Jensen's inequality we have, for any given path ω,

$$\exp\left( - \int_0^{\alpha} ds\, V(\omega(s)) \right) \le \frac{1}{\alpha} \int_0^{\alpha} ds\, \exp\left( -\alpha V(\omega(s)) \right) . \qquad (128)$$

Therefore (using Fubini's Theorem)

$$e^{\alpha(\Delta - V)}(x, y) \le \frac{1}{\alpha} \int_0^{\alpha} ds \int d\mu_{x,y}(\omega)\, \exp\left( -\alpha V(\omega(s)) \right) = \frac{1}{\alpha} \int_0^{\alpha} ds\, \left( e^{s\Delta} e^{-\alpha V} e^{(\alpha - s)\Delta} \right)(x, y) . \qquad (129)$$
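The pathwise Jensen step (128) can be checked on a discretised path. The sketch below is illustrative only: the potential $V(x) = \ln(1 + |x|^2)$ and the Gaussian random walk standing in for a Wiener path are assumptions made for the example, and the time integrals are replaced by Riemann sums with n equal steps, for which the inequality is exactly the convexity of the exponential.

```python
import math
import random

def potential(x):
    """Hypothetical logarithmically growing potential V(x) = ln(1 + |x|^2)."""
    return math.log(1.0 + sum(c * c for c in x))

def sample_path(steps, alpha, rng):
    """Discretised 3D Gaussian random walk on [0, alpha]; a stand-in for a Wiener path."""
    dt = alpha / steps
    pos = [0.0, 0.0, 0.0]
    path = []
    for _ in range(steps):
        # Variance 2*dt per coordinate matches the generator Delta (rather than Delta/2).
        pos = [c + rng.gauss(0.0, math.sqrt(2.0 * dt)) for c in pos]
        path.append(tuple(pos))
    return path

def jensen_sides(path, alpha):
    """Riemann-sum versions of the left and right sides of (128) for one path."""
    values = [potential(x) for x in path]
    n = len(values)
    lhs = math.exp(-alpha * sum(values) / n)               # exp(-int_0^alpha V ds)
    rhs = sum(math.exp(-alpha * v) for v in values) / n    # (1/alpha) int_0^alpha exp(-alpha V) ds
    return lhs, rhs

rng = random.Random(0)
ALPHA = 2.0
results = [jensen_sides(sample_path(500, ALPHA, rng), ALPHA) for _ in range(20)]
for lhs, rhs in results[:3]:
    print(lhs, "<=", rhs)
```

For a path on which V is constant the two sides coincide, which is the equality case of Jensen's inequality.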

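To evaluate the trace in (126), we only need the heat kernel on the diagonal, i.e., for x = y. The integral kernel of $e^{t\Delta}$ is given by

$$j_t(x - y) \equiv \frac{1}{(4\pi t)^{3/2}}\, e^{-|x-y|^2/4t} , \qquad (130)$$

which leads to

$$\left( e^{s\Delta} e^{-\alpha V} e^{(\alpha - s)\Delta} \right)(x, x) = \frac{1}{(4\pi)^3}\, \frac{1}{(t\alpha)^{3/2}} \int_{\mathbb{R}^3} dy\, e^{-\alpha V(y)} \exp\left( -|x - y|^2/4t \right) , \qquad (131)$$

where t is defined by $1/t \equiv 1/s + 1/(\alpha - s)$. Let us change the integration variable from s to t, and introduce the function

$$h_\alpha(x) = \frac{2}{\alpha} \int_0^{\alpha/4} dt\, \frac{1}{\sqrt{1 - 4t/\alpha}}\, j_t(x) . \qquad (132)$$

Then the bound (129) yields

$$e^{\alpha(\Delta - V)}(x, x) \le \frac{1}{(4\pi\alpha)^{3/2}} \left( e^{-\alpha V} * h_\alpha \right)(x) , \qquad (133)$$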



with ∗ denoting convolution. Note that $\int h_\alpha(x)\,dx = 1$. It is easy to see that $h_\alpha(x) \sim \exp(-|x|^2/\alpha)$ for large |x|. Hence, if V(x) increases logarithmically with |x|, we see that the diagonal of the heat kernel decays at least as $|x|^{-\mathrm{const.}\,\alpha}$ for large |x|. Thus, we can choose α large enough to ensure that (126) is finite. For the proof of Theorem 2 it is necessary to extend this result to the case where $-\Delta + V$ is replaced by $-\Delta + V + K$, with K a finite rank operator. As explained there, we can restrict ourselves to the case when K has exponentially decaying eigenfunctions. I.e., we can assume that the kernel of K, which we denote by K(x, y), satisfies a bound

$$|K(x, y)| \le B e^{-D(|x| + |y|)} \qquad (134)$$

for some constants B > 0 and D > 0. Again we want to show that, for any s > 0 and α large enough (depending on s),

$$\mathrm{Tr}\, |x|^s e^{\alpha(\Delta - V - K)} < \infty \qquad (135)$$

if $V(x) \ge C_1 \ln(|x|) - C_2$ for some constants $C_1 > 0$ and $C_2$. With the notation $L_t = e^{t(\Delta - V)}$, we can use the Dyson expansion to write

$$e^{\alpha(\Delta - V - K)} = L_\alpha + \sum_{n \ge 1} (-1)^n \int_{\sum_i t_i = \alpha} dt_0\, dt_1 \cdots dt_n\; L_{t_0} K L_{t_1} K \cdots K L_{t_n} . \qquad (136)$$

We have already derived an upper bound on the kernel of Lα above. The kernel of the terms for n ≥ 1 in the sum can be bounded as follows. First of all, the Feynman-Kac formula tells us that since V ≥ 0 we have the inequality Lt (x, y) ≤ jt (x −√y) for the kernel of Lt . Moreover, using (134) and denoting by the function (x) = Be−D|x| , we have n−1 Lt KLt K · · · KLt (x, y) ≤ jt ∗ (x) |Lti | jtn ∗ (y) . n 0 1 0

(137)

i=1

Since Lt ≤ I, we have |Lti | ≤ 22 . Denoting sup jt ∗ (x) , ξα (x) = −1 2

(138)

Lt KLt K · · · KLt (x, y) ≤ ξα (x)ξα (y)2n . n 0 1 2

(139)

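0 0 such that, along the corresponding solution, one has

(i)
$$E_\kappa(t) \le \mu^4 C_1 e^{-\sigma \kappa/\mu} + C_2 \mu^5 , \qquad \text{for } |t| \le \frac{T_f}{\mu^3} \qquad (3.8)$$
for all κ > 0.

(ii) There exists a sequence of almost periodic functions $\{F_n\}_{n \in \mathbb{N}}$ such that, defining the specific energy distribution
$$F_{n\kappa_0} = \mu^4 F_n , \qquad F_\kappa = 0 \ \text{ if } \kappa \neq n\kappa_0 \qquad (3.9)$$
one has
$$|E_\kappa(t) - F_\kappa(t)| \le C_2 \mu^5 , \qquad |t| \le \frac{T_f}{\mu^3} . \qquad (3.10)$$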


D. Bambusi, A. Ponno

Remark 3.3. Since the $F_n(t)$ are almost periodic functions of time, their time average defined by
$$\bar{F}_n := \lim_{T \to \infty} \frac{1}{T} \int_0^T F_n(t)\,dt \qquad (3.11)$$
exists (see e.g. [16]). It follows that, up to the error, the time average of $E_\kappa(t)$ relaxes to the limit distribution obtained by rescaling $\bar{F}_n$ as in (3.9).

Remark 3.4. One can give heuristic arguments to show that the (rescaled) limit distribution $\bar{F}_n$ is the same for all initial data in a set of full measure. Moreover such a limit distribution was computed explicitly in [32], obtaining a result in very good agreement with the numerical observations by [5]. However, we were unable to transform the heuristic argument into a rigorous one.

Remark 3.5. There exist numerical results showing that the time $T_e$ of approach to equipartition in FPU systems is a stretched exponential of the inverse of the specific energy E: $T_e \sim \exp[(1/E)^a]$ [31, 6]. The existence of such a time–scale à la Nekhoroshev was first conjectured in [17] making use of probabilistic arguments. It is not yet clear whether the metastable state with energy distribution $\bar{E}_\kappa$ may survive over such a time–scale. The only rigorous result in this direction was obtained in [8] (see also [30]), where the exponential stability of the fundamental mode of a nonlinear string was proved.

Remark 3.6. We expect Theorem 3.2 to hold also in the β–FPU model (the time scale should be substituted by $\mu^{-4}$). Indeed, the theory of Sects. 4, 5 can be trivially generalized to the β model, the only difference being that the KdV equation has to be substituted by a different integrable equation, namely the modified KdV equation (mKdV). However, the study of the modified KdV is less developed than the study of the KdV equation, so, even if the results of Sect. 6 are expected to hold also in the case of the mKdV, there are no “ready to use theorems” available.

Remark 3.7. It is very easy to see that a variant of Theorem 3.2 holds also in the case where not only the first Fourier mode is excited, but also its higher harmonics are excited, provided that the energy decreases exponentially or at least quadratically with κ/µ.

Remark 3.8. With an extension of our theory we would (probably) be able to prove stability of the solutions constructed in Theorem 3.2 with respect to excitations involving a small packet of modes, but only on a time–scale of order $\mu^{-2}$. Over such a time–scale the effects of the nonlinearity are not visible, so this extension has to be considered unsatisfactory. On the time–scale $\mu^{-3}$, at present, we are only able to prove stability of the solutions we constructed for perturbations of the initial data that decay fast in space (i.e. with vanishing specific energy). Thus the energy spectrum of the initial data that we can control has the shape of a sequence of peaks of height proportional to N, but decreasing exponentially with κ, each with a superimposed bump of modes of small height. Work is in progress in order to deal with more general initial data.

4. Normal Form

In this section we compute the normal form of the FPU and we give a rigorous estimate of the remainder.

On Metastability in FPU


From now on, instead of the “specific index κ” we will use integers to label the modes and the energy per mode $E_k$ instead of the specific energy per mode $E_\kappa = E_k/N$. As above, corresponding to an integer index $1 \le k_0 \le N$ we define the parameter

$$\mu := \frac{k_0}{N} . \qquad (4.1)$$

Rewrite the FPU system in terms of new rescaled variables $r_j$ defined by

$$\sum_j r_j = 0 , \qquad \mu^2 r_j := q_j - q_{j-1} ; \qquad (4.2)$$

one has that the change of variables q → r is well defined and invertible. Introducing also the operator of second difference $\Delta_1$ by

$$(\Delta_1 r)_j := r_{j+1} + r_{j-1} - 2 r_j , \qquad (4.3)$$

the FPU equations take the form

$$\ddot{r}_j = \left( \Delta_1 (r + \mu^2 r^2) \right)_j . \qquad (4.4)$$

Remark 4.1. Introducing also the momenta $s_j$ defined by

$$\sum_j s_j = 0 , \qquad p_j = \mu^2 (s_j - s_{j+1}) , \qquad (4.5)$$

one gets that the transformation (p, q) → (s, r) is canonical. Moreover, it is easy to verify that in these variables one has

$$E_k = \mu^4\, \frac{|\hat{r}_k|^2 + \omega_k^2 |\hat{s}_k|^2}{2} \qquad (4.6)$$

with $\hat{r}_k$ and $\hat{s}_k$ the Fourier coefficients of r and s, respectively.

We introduce now an interpolating function r = r(x, t) for the sequence $r_j$, namely a (smooth) function with the property that the sequence

$$r_j(t) \equiv r(j, t) \qquad (4.7)$$

fulfills the FPU equations (4.4). Moreover we will assume that the function r(x) is 2/µ periodic and has zero average, namely that

$$r(x + 2/\mu, t) = r(x, t) , \qquad \int_{-1/\mu}^{1/\mu} r(x, t)\,dx = 0 . \qquad (4.8)$$

Thus we postulate that the function r fulfills

$$\ddot{r} = \Delta_1 (r + \mu^2 r^2) \qquad (4.9)$$

with an obvious extension of the definition of $\Delta_1$ to smooth functions. It is easy to verify that this system is Hamiltonian with Hamiltonian function

$$H(r, s) := \int_{-1/\mu}^{1/\mu} \left( \frac{-s \Delta_1 s + r^2}{2} + \mu^2 \frac{r^3}{3} \right) dx \qquad (4.10)$$



and with s a periodic function with zero average, playing the role of the momentum conjugated to the function r(x). The momentum s(x) is actually an interpolating function for the momentum introduced in Remark 4.1. In fact, one has $s_j(t) = s(j, t)$. The Hamilton equations of (4.10) are given by

$$\frac{dr}{dt} = \frac{\delta H}{\delta s} , \qquad \frac{ds}{dt} = -\frac{\delta H}{\delta r} \qquad (4.11)$$

with $\frac{\delta H}{\delta r}$ denoting the $L^2$ gradient of H with respect to r, and similarly for $\frac{\delta H}{\delta s}$. It is now convenient to rescale the length of the ring and the size of the momentum s, by introducing as new phase variables two functions (u, v), periodic of period 2, defined by

$$v(\mu x) = \mu s(x) , \qquad u(\mu x) = r(x) . \qquad (4.12)$$

In the following we will denote by y the rescaled space variable, namely y = µx. The coordinate transformation (4.12) is not canonical, but it turns out that the equations for the variables (u, v) are still Hamiltonian with the original symplectic structure, and with Hamiltonian function

$$H(u, v) = \mu K(u, v) \qquad (4.13)$$

with

$$K(u, v) = \int_{-1}^{1} \left( \frac{-v \Delta_\mu v}{2\mu^2} + \frac{u^2}{2} + \mu^2 \frac{u^3}{3} \right) dy , \qquad (4.14)$$

where we introduced the difference operator

$$(\Delta_\mu v)(y) := v(y + \mu) + v(y - \mu) - 2 v(y) . \qquad (4.15)$$
where we introduced the difference operator

Remark 4.2. From now on we will study the system (4.14). This clearly amounts to introducing a new time τ ≡ µt. More precisely, denote by u(τ), v(τ) a solution of the equations of motion of K, namely of

$$\frac{du}{d\tau} = \frac{\delta K}{\delta v} , \qquad \frac{dv}{d\tau} = -\frac{\delta K}{\delta u} . \qquad (4.16)$$

Then u(µt), v(µt) is a solution of the equations of motion of H.

The formal expansion of the operator $\Delta_\mu$, defined in (4.15), gives

$$\frac{\Delta_\mu}{\mu^2} = \partial_y^2 + \frac{\mu^2 \partial_y^4}{12} + O(\mu^4) , \qquad (4.17)$$

so that one has

$$K = H_0 + P + R_1 , \qquad (4.18)$$

with

$$H_0(u, v) := \int_{-1}^{1} \frac{v(-\partial_y^2 v) + u^2}{2}\, dy , \qquad (4.19)$$

$$P(u, v) := \int_{-1}^{1} \left( -\mu^2 \frac{v\, \partial_y^4 v}{24} + \mu^2 \frac{u^3}{3} \right) dy , \qquad (4.20)$$

$R_1$ being the remainder of the expansion.



Remark 4.3. The equations of motion of the Hamiltonian $H_0$ are

$$u_\tau = -\partial_y^2 v , \qquad v_\tau = -u , \qquad (4.21)$$

and thus they are equivalent to the linear wave equation. Its flow will be denoted $\Phi^\tau(v, u)$ and is periodic in time with period 2.

Following [3] we are going to use a Galerkin averaging method in order to compute the corrections to the dynamics due to the presence of P, and to estimate the effect of $R_1$. To this end we first have to introduce a topology in the phase space. This is conveniently done in terms of Fourier coefficients.

Definition 4.4. Having fixed two positive constants s, σ, consider the Hilbert space $\ell^2_{\sigma,s}$ of the complex sequences $v \equiv \{v_K\}_{K \in \mathbb{Z} - \{0\}}$ such that

$$\|v\|^2_{\sigma,s} := \sum_K |v_K|^2 |K|^{2s} e^{2\sigma |K|} < \infty . \qquad (4.22)$$

We will identify a 2 periodic function v with its Fourier coefficients $\hat{v}_K$ defined by

$$v(y) = \frac{1}{\sqrt{2}} \sum_{K \in \mathbb{Z}} \hat{v}_K e^{i\pi K y} ,$$

and we will say that $v \in \ell^2_{\sigma,s}$ if its Fourier coefficients have this property. Moreover in what follows the coefficient σ will be kept fixed. We will study the system K(u, v) in the phase spaces $\mathcal{P}_s$ defined by

$$\mathcal{P}_s := \ell^2_{\sigma,s+1} \times \ell^2_{\sigma,s} \ni (v, u) , \qquad (4.23)$$

endowed with the norm

$$\|(v, u)\|^2_s := \|v\|^2_{\sigma,s+1} + \|u\|^2_{\sigma,s} . \qquad (4.24)$$

A phase point (v, u) will also be denoted by z, and the ball of radius R centered at the origin of $\mathcal{P}_s$ will be denoted by $B_s(R)$. It is easy to see that the flow $\Phi^\tau$ of the system $H_0$ is unitary in all the spaces $\mathcal{P}_s$.

Theorem 4.5. For any r ≥ 5 there exists a constant $\mu_* \equiv \mu_{*r}$, such that, if $\mu < \mu_*$, then there exists an analytic canonical transformation $T : B_r(1) \to B_r(2)$ which averages K, namely such that

$$K \circ T = H_0 + \langle P \rangle + R ; \qquad (4.25)$$

here

$$\langle P \rangle(z) := \frac{1}{2} \int_0^2 P(\Phi^\tau(z))\, d\tau \qquad (4.26)$$

and the vector field $X_R$ of the remainder is analytic in a complex ball of radius 1 and fulfills the estimate

$$\sup_{\|z\|_r \le 1} \| X_R(z) \|_0 \le C_r\, \mu^{4 - \frac{12}{6 + r}} . \qquad (4.27)$$

Moreover for any $1 \le r_1 \le r$ the transformation T maps $B_{r_1}$ into $\mathcal{P}_{r_1}$ and fulfills

$$\sup_{\|z\|_{r_1} \le 1} \| z - T(z) \|_{r_1} \le C \mu^{2 - \frac{6}{6 + r}} . \qquad (4.28)$$

The proof is an application of the techniques of [3] and, for the sake of completeness, it will be given in Appendix A.

Remark 4.6. We recall that a heuristic discussion on the possibility of putting the FPU system in normal form corresponding to low frequency initial data was given in [38]. The above theorem rigorously proves that this is indeed possible. Below we give the explicit expression of the normal form, which is integrable! As a consequence we think that some of the conclusions of the paper [38], which are based on the heuristic argument that resonances enforce chaos, could be incorrect.

In the rest of this section we will perform the explicit computation of the averaged equations, showing that they coincide with two uncoupled KdV equations. To obtain the result it is useful to introduce new variables in which the unperturbed flow $\Phi^\tau$ assumes a simpler form. To this end we introduce the non canonical transformation

$$\xi := \frac{u + v_y}{\sqrt{2}} , \qquad \eta := \frac{u - v_y}{\sqrt{2}} . \qquad (4.29)$$

Since the transformation is not canonical one has to modify the Poisson tensor in order to deduce the equations of motion from the Hamiltonian.

Lemma 4.7. In terms of the variables ξ, η the Poisson tensor takes the form

$$J = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \partial_y , \qquad (4.30)$$

i.e. the Hamilton equations associated to a Hamiltonian function H take the form

$$\frac{dz}{d\tau} = J \nabla H(z) \iff \xi_\tau = -\partial_y \frac{\delta H}{\delta \xi} , \quad \eta_\tau = \partial_y \frac{\delta H}{\delta \eta} , \qquad (4.31)$$

where ∇H denotes the $L^2$ gradient and z = (ξ, η).

In the variables (ξ, η) the various parts of the Hamiltonian take the form

$$H_0(\xi, \eta) = \int_{-1}^{1} \frac{\xi^2 + \eta^2}{2}\, dy , \qquad (4.32)$$

$$P(\xi, \eta) = \int_{-1}^{1} \left( -\mu^2 \frac{[\partial_y(\xi - \eta)]^2}{48} + \mu^2 \frac{(\xi + \eta)^3}{6\sqrt{2}} \right) dy , \qquad (4.33)$$

and in particular the equations of motion of $H_0$ assume the simple form

$$\xi_\tau = -\xi_y , \quad \eta_\tau = \eta_y \iff \left[ \xi(y, \tau) = \xi_0(y - \tau) , \quad \eta(y, \tau) = \eta_0(y + \tau) \right] . \qquad (4.34)$$

It is now easy to obtain the following



Proposition 4.8. In the variables ξ, η the average of the perturbation is given by

1 3 3 ξ 2 + ηy2 2 y 2 (ξ + η ) P (ξ, η) = −µ +µ dy, (4.35) √ 48 6 2 −1 and the equations of motion of H0 + P are given by 1 1 ξyyy − µ2 √ ξ ξy , 24 2 2 1 1 ητ = ηy + µ2 ηyyy + µ2 √ ηηy , 24 2 2

ξτ = −ξy − µ2

(4.36) (4.37)

i.e. two uncoupled KdV equations in translating frames, and therefore such equations constitute the resonant normal form of FPU in the region of the phase space corresponding to long wavelength excitations. Remark 4.9. It is a remarkable fact that averaging an infinite dimensional system with respect to one angle only one gets a normal that is integrable (two uncoupled KdV). Similar phenomena were already pointed out in the β–FPU model (see [37]) and for the water wave problem (see [15, 12–14]). We have no a priori explaination of this fact. Remark 4.10. One could also write down the normal form in the original variables u, v, but the resulting expression would turn out to be quite complicated and difficult to read. Proof. One has to compute the average of the different terms composing Eq. (4.33). As 1 an example we deal explicitly with the term proportional to −1 dyξy ηy . One has 2 1 1 4 1 2 dyξy ηy = ds dyξy (y − s)ηy (y + s) = dα dβξy (α)ηy (β) 4 −2 −1 0 −1 0 (4.38) which vanishes due to the fact that ξy has zero average. Performing the same computation over all the terms one gets the result. Since we are interested in the energy per mode we give now the relation of Ek with the Fourier coefficients of ξ and η, which in turn are defined by 1 ξ(y) = √ (4.39) ξˆK eiKyπ 2 K∈Z and similarly for η. Proposition 4.11. Let ξ(y), η(y) be a pair of functions belonging to P0 ; denote by Ek the energy in the k th mode as defined by (3.5) in terms of the original variables. Then, for µ small enough, one has 2 2 Ek 11 4 |ξK | + |ηK | ≤ Cµ 2 (ξ, η)20 (4.40) N −µ 2 for all k such that

k N

= µK with |K| ≤

| ln µ| 2σ ;

11 |Ek | ≤ µ 2 (ξ, η)20 N

for all k such that

k N

= µK and |K| >

| ln µ| 2σ ,

and Ek = 0 otherwise.

(4.41)



The elementary proof is based on the exponential decay of the Fourier coefficients of a function in $\ell^2_{\sigma,0}$. It is deferred to Appendix B.

5. Estimate of the Error

Here we use the normal form to construct approximate solutions of FPU and we estimate their difference from true solutions. First we construct explicitly the approximate solutions. Consider the following pair of KdV equations

$$\xi_{\tau_1} = -\frac{1}{24} \xi_{yyy} - \frac{1}{2\sqrt{2}} \xi \xi_y , \qquad (5.1)$$

$$\eta_{\tau_1} = \frac{1}{24} \eta_{yyy} + \frac{1}{2\sqrt{2}} \eta \eta_y , \qquad (5.2)$$

obtained by rescaling time to $\tau_1 = \mu^2 \tau$. Let $\xi^a(y, \tau_1)$, $\eta^a(y, \tau_1)$ be a solution of such a pair of equations with the property that it belongs to $\mathcal{P}_r$ for all times $\tau_1$, with a given r. Correspondingly, we define an approximate solution $z^a \equiv (r^a, s^a)$ of the FPU by

$$r^a(x, t) := \frac{\xi^a(\mu(x - t), \mu^3 t) + \eta^a(\mu(x + t), \mu^3 t)}{\sqrt{2}} , \qquad (5.3)$$

$$s^a_x(x, t) := \frac{\xi^a(\mu(x - t), \mu^3 t) - \eta^a(\mu(x + t), \mu^3 t)}{\sqrt{2}} . \qquad (5.4)$$

The main result of this section is a theorem comparing the approximate solution with a corresponding true solution. Precisely, consider an initial datum $(r_{0,j}, s_{0,j})$ and the corresponding Fourier coefficients $(\hat{r}_{0,k}, \hat{s}_{0,k})$ as defined by Eq. (3.4). We assume that they are different from zero only if k/N = µK and that there exist two positive constants C and ρ such that

$$\frac{|\hat{r}_{0,k}|^2 + \omega_k^2 |\hat{s}_{0,k}|^2}{N} \le C e^{-2\rho \frac{k}{\mu N}} .$$

Finally, we define uniquely a corresponding interpolating function for the initial datum by

$$r_0(y) := \frac{1}{\sqrt{2N}} \sum_K \hat{r}_{0,k}\, e^{i\pi \mu K y} ,$$

where the sum runs over the integers K such that |K|µ = |k|/N ≤ 1, and in the formula one has to read k = µKN. We will consider a similar interpolating function for $s_{0,j}$ and corresponding initial data for the KdV equations.

Theorem 5.1. Consider an initial datum for the FPU system with the above properties and denote by $(r_j(t), s_j(t))$ the corresponding solution. Consider the approximate solution $\xi^a(y, t)$, $\eta^a(y, t)$ with the corresponding initial datum just constructed. Assume that for all times t the approximate solution is such that $(\xi^a, \eta^a) \in \mathcal{P}_{78}$ with some σ > 0, and fix an arbitrary $T_f > 0$. Then there exists $\mu_*$ depending on $T_f$ and on $\xi^a(t)$, $\eta^a(t)$ only, such that, if $\mu < \mu_*$, then for all times t fulfilling

$$|t| \le \frac{T_f}{\mu^3} \qquad (5.5)$$

one has

$$\sup_j \left| r_j(t) - r^a(j, t) \right| + \left| s_j(t) - s^a(j, t) \right| \le C\mu , \qquad (5.6)$$

where $r^a$, $s^a$ are given by (5.3), (5.4); moreover

$$\left| \frac{E_k(t)}{N} - \mu^4\, \frac{|\hat{\xi}^a_K(t)|^2 + |\hat{\eta}^a_K(t)|^2}{2} \right| \le C \mu^5 \qquad (5.7)$$

for all k such that $\frac{k}{N} = \mu K$ with $|K| \le \frac{|\ln \mu|}{2\sigma}$, and

$$\frac{|E_k(t)|}{N} \le \mu^5 \qquad (5.8)$$

for all k such that $\frac{k}{N} = \mu K$ with $|K| > \frac{|\ln \mu|}{2\sigma}$, whereas $E_k(t) = 0$ otherwise.

The proof of the theorem, which follows closely the strategy of [39], is deferred to Appendix C.

6. Dynamics of KdV and Conclusion of the Proof

In this section we recall some known facts on the dynamics of the KdV equation with periodic boundary conditions and we use them to prove the results of Sect. 3. Consider the KdV equation (5.1), namely

$$\xi_{\tau_1} = -\frac{1}{24} \xi_{yyy} - \frac{1}{2\sqrt{2}} \xi \xi_y .$$

It is a well known consequence of the Lax pair formulation that the spectrum of the Sturm Liouville operator

$$L_\xi := -\partial_{yy} + 6\sqrt{2}\, \xi(y, \tau_1) \qquad (6.1)$$

with periodic boundary conditions on [0, 4] is invariant under the KdV evolution, i.e. it is independent of $\tau_1$. The spectrum of $L_\xi$ with periodic boundary conditions on [0, 4] will be simply called the periodic spectrum of ξ. Such a periodic spectrum is of pure point type and consists of a sequence of eigenvalues

$$\lambda_0 < \lambda_1 \le \lambda_2 < \lambda_3 \le \lambda_4 < \cdots \qquad (6.2)$$

(notice that the symbols < and ≤ do exactly alternate). The quantities

$$\gamma_n := \lambda_{2n} - \lambda_{2n-1} \qquad (6.3)$$

are called the gaps of the spectrum. From standard asymptotic properties of the spectrum one has $\gamma_n \in \ell^2$ for any $L^2$ potential ξ. Moreover, it has been proved by Garnett and Trubowitz that the sequence of the $\gamma_n$ entirely determines the periodic spectrum of ξ. A further, very important, feature of the above Sturm Liouville problem is the relation between the sequence of the gaps and the regularity of the corresponding potential ξ. Indeed, up to a certain extent the correspondence between the regularity of ξ and the property of the sequence $\gamma_n$ is the same one existing between the regularity of a function and its Fourier coefficients (see [27]). Precisely, the following theorem (from [34]) holds:



Theorem 6.1. Suppose $\xi \in L^2$; then $\xi \in \ell^2_{0,s}$ if and only if its gap lengths satisfy

$$\sum_{n\ge1} n^{2s}|\gamma_n|^2 < \infty. \tag{6.4}$$

Moreover, if $\xi \in \ell^2_{\sigma,s}$ then

$$\sum_{n\ge1} n^{2s}e^{2\sigma n}|\gamma_n|^2 < \infty; \tag{6.5}$$

conversely, if (6.5) holds, then $\xi \in \ell^2_{\sigma',0}$ with some $\sigma' > 0$.

From a Hamiltonian point of view the KdV is an integrable infinite dimensional system. It has been shown that a complete system of integrals of motion is given by the $\gamma_n^2$. Moreover the KdV admits global action angle coordinates. More precisely, the following result holds.

Theorem 6.2. [Kappeler-Pöschel [25]] There exists a diffeomorphism $\Omega : L^2 \to \ell^2_{0,1/2}\times\ell^2_{0,1/2}$ with the following properties$^1$:
i) $\Omega$ is one-to-one, onto, bianalytic, and canonical.
ii) For each $s \ge 0$, the restriction of $\Omega$ to $\ell^2_{0,s}$ is a map $\Omega : \ell^2_{0,s} \to \ell^2_{0,s+1/2}\times\ell^2_{0,s+1/2}$, which is one-to-one, onto, and bianalytic as well.
iii) The coordinates $(x, y) \in \ell^2_{0,3/2}\times\ell^2_{0,3/2}$ are Birkhoff coordinates for the KdV equation. That is to say, in terms of the coordinates $(x, y)$ the Hamiltonian $H_{KdV}$ of the KdV depends only on $I_n := (x_n^2 + y_n^2)/2$, $n \ge 1$, with $(x, y)$ canonically conjugated coordinates.

In terms of the variables $(x, y)$ the dynamics of the KdV is trivial. To describe the latter, fix an initial datum $(x^0, y^0)$, and define

$$\nu_n(x^0, y^0) := \frac{\partial H_{KdV}}{\partial I_n}(x^0, y^0);$$

then the equations of motion take the form

$$\dot x_n = \nu_n y_n, \qquad \dot y_n = -\nu_n x_n. \tag{6.6}$$
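The flow of (6.6) can be written explicitly: each pair $(x_n, y_n)$ rotates with frequency $\nu_n$, so the actions $I_n$ stay constant. The following is a minimal Python sketch of this remark; the frequencies `nu` are hypothetical values chosen only for illustration and are not computed from $H_{KdV}$.

```python
import math

def birkhoff_flow(x0, y0, nu, t):
    """Exact solution of x' = nu*y, y' = -nu*x, mode by mode (a rotation)."""
    xs, ys = [], []
    for x, y, w in zip(x0, y0, nu):
        c, s = math.cos(w * t), math.sin(w * t)
        xs.append(c * x + s * y)
        ys.append(-s * x + c * y)
    return xs, ys

def actions(x, y):
    """I_n = (x_n^2 + y_n^2) / 2 for every mode."""
    return [(a * a + b * b) / 2 for a, b in zip(x, y)]

x0 = [1.0, 0.5, 0.0]
y0 = [0.0, 0.25, 0.0]
nu = [1.0, math.sqrt(2.0), 3.0]  # hypothetical frequencies, for illustration only
xt, yt = birkhoff_flow(x0, y0, nu, 7.3)
```

In this toy datum only two actions are non-zero and their frequencies are incommensurate, which corresponds to the quasiperiodic case.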

Thus, it is immediately seen that any solution is periodic, quasiperiodic or almost periodic, depending on the number of gaps (actions) initially different from zero. With these tools at hand it is easy to obtain the

Proof of Theorem 3.2. We begin by proving (i). Consider an initial datum as in the statement of the theorem. This corresponds to initial data with $\xi$ and $\eta$ which are entire analytic functions (actually proportional to a sine). By Theorem 6.1 the corresponding sequence of gaps decreases exponentially with any coefficient $\rho$ in the exponential. This property is then conserved along the corresponding solution. Going back to Fourier coefficients one immediately deduces that the corresponding solution $\xi(\tau_1)$ is analytic in the $y$ variable in a complex strip of width $\sigma(\tau_1)$. Taking the minimum of such quantities one finds the coefficient $\sigma$ of Theorem 3.2. This is the result for the solution of the KdV equations. Using Theorem 5.1, Eq. (5.7), one goes back to the quantities $E_k$ and obtains the desired result.

In order to prove statement (ii) we use the fact that any solution is almost periodic in time. Denote $E_K^{(1)} := |\hat\xi_K|^2$; then $E_K^{(1)}(x(\tau_1), y(\tau_1))$ is almost periodic. Define also $E_K^{(2)} := |\hat\eta_K|^2$ and $E_K := (\bar E_K^{(1)} + \bar E_K^{(2)})/2$. Scaling back to physical variables, using again Theorem 5.1, Eq. (5.7), and dividing by $N$ where required, one gets statement (ii).

$^1$ By abuse of notation, here $\ell^2_{0,\alpha}$ is the space of the sequences $\{x_n\}_{n\ge1}$ such that $\sum_n n^{2\alpha}|x_n|^2 < \infty$.

A. Appendix: Proof of Theorem 4.5

Since the Hamiltonian (and its vector field) is analytic, it is useful to complexify the phase space. Thus, from now on we will think of the phase variable $z$ as a complex variable. The main reason is that, through the Cauchy inequality, the sup norm of a function also controls the supremum of the derivatives of the function. First we prove the following simple

Lemma A.1. For any $s \ge 0$ one has

$$\|X_{R_1}(z)\|_s \le 2\mu^4, \qquad \forall z : \|z\|_{s+5} \le 2, \tag{A.1}$$

$$\|X_P(z)\|_s \le C\mu^2, \qquad \forall z : \|z\|_{s+3} \le 2. \tag{A.2}$$

Proof. The estimate of $X_P$ is an immediate consequence of the definition of the norm and of the fact that $\ell^2_{\sigma,s}$ is an algebra for $s \ge 1$. Concerning $X_{R_1}$ just remark that the $K$th Fourier coefficient of its $u$ component is given (and estimated) by

$$\left|\left[X_{R_1}(u,v)\right]^{u\wedge}_K\right| = \left|\frac{4}{\mu^2}\sin^2\left(\frac{K\mu\pi}{2}\right) - \pi^2K^2 + \frac{K^4\pi^4\mu^2}{12}\right||\hat v_K| \le \frac{2\pi^6}{6!}K^6\mu^4|\hat v_K|, \tag{A.3}$$

from which the thesis follows.
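The bound in (A.3) is a Taylor remainder estimate. As a sanity check, and assuming that the Fourier multiplier in question is $(4/\mu^2)\sin^2(K\mu\pi/2)$ (the square of $\omega_k/\mu$ in (B.8)), the sketch below verifies numerically that removing the first two Taylor terms leaves a remainder bounded by $2x^6/6!$ with $x = K\mu\pi$. The function names are illustrative only.

```python
import math

def dispersion_remainder(x):
    """4 sin^2(x/2) - x^2 + x^4/12: what is left after two Taylor terms."""
    return 4 * math.sin(x / 2) ** 2 - x ** 2 + x ** 4 / 12

def taylor_bound(x):
    """Lagrange bound 2 |x|^6 / 6! for the remainder of 2 (1 - cos x)."""
    return 2 * abs(x) ** 6 / math.factorial(6)

def multiplier_remainder(K, mu):
    """(4/mu^2) sin^2(K mu pi / 2) - pi^2 K^2 + K^4 pi^4 mu^2 / 12."""
    return dispersion_remainder(K * mu * math.pi) / mu ** 2
```

Since $4\sin^2(x/2) = 2(1 - \cos x)$, the bound follows from the Lagrange form of the remainder of the cosine series, so it holds for every real $x$ and not only for small $K\mu$.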

Then we perform a Galerkin cutoff of $P$. Precisely, define the projector $\Pi_n$ on the Fourier modes with index smaller than $n$, i.e. $\Pi_n(\hat u_{-\infty}\dots\hat u_{-K}\dots\hat u_K\dots\hat u_\infty) = (\hat u_{-n}\dots\hat u_n)$, define also $\Pi_n(u, v) := (\Pi_n u, \Pi_n v)$, and finally define

$$P^{(n)}(z) := P(\Pi_n(z)). \tag{A.4}$$

Following [2] we have the following


Lemma A.2. For any $s \ge 1$ there exists a constant $C_s$ such that, for any $r \ge 0$, and any $n \ge 0$, one has

$$\left\|X_{P-P^{(n)}}(z)\right\|_s \le \frac{\mu^2C_s}{n^r}, \qquad \forall z : \|z\|_{s+r+3} \le 3/2. \tag{A.5}$$

For the proof see the proof of Lemma 5.2 in [2]. Moreover it is easy to show that $X_{P^{(n)}}$ is analytic as a map from $\mathcal P_s$ to itself and that

$$\left\|X_{P^{(n)}}(z)\right\|_s \le \mu^2C_sn^3, \qquad \forall z : \|z\|_s \le 2. \tag{A.6}$$

We now use the Lie transform to construct a canonical transformation averaging the Hamiltonian up to order $\mu^4$ (or more precisely, slightly less). Thus consider an auxiliary Hamiltonian function $\chi$ (of order $\mu^2$), assume that the corresponding Hamiltonian vector field is analytic as a map from $\mathcal P_s$ to itself $\forall s \ge 1$, and consider the corresponding Hamilton equations

$$\dot z = X_\chi(z). \tag{A.7}$$

Denote by $T^\tau$ the corresponding time $\tau$ flow and by $T$ the time 1 flow. We use such a $T$ in order to transform our Hamiltonian system $K$. One has

$$K\circ T = H_0 + P^{(n)} + \{\chi, H_0\} + R \tag{A.8}$$

where

$$R = (P - P^{(n)})\circ T + R_1\circ T + \left[P^{(n)}\circ T - P^{(n)}\right] + \left[H_0\circ T - H_0 - \{\chi, H_0\}\right] \tag{A.9}$$

is the sum of the higher order terms (they will be estimated shortly). First of all we choose $\chi$ in such a way that $P^{(n)} + \{\chi, H_0\} = \langle P^{(n)}\rangle$, according to Lemma 8.4 of [1] (a simple computation); this is given by

$$\chi(z) := \frac12\int_0^2\tau\left[P^{(n)}(\Phi^\tau(z)) - \langle P^{(n)}\rangle(\Phi^\tau(z))\right]d\tau \tag{A.10}$$

and its vector field is analytic and estimated by

$$\|X_\chi(z)\|_s \le \mu^2C_sn^3, \qquad \forall z : \|z\|_s \le 2. \tag{A.11}$$

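It also follows that the transformation $T$ exists and fulfills the estimates (4.28). Moreover the various terms of (A.9) are estimated by

Lemma A.3. The following estimates hold

$$\left\|X_{(P-P^{(n)})\circ T}\right\|_s \le \frac{\mu^2C_s}{n^r}, \qquad \forall z : \|z\|_{s+r+3} \le 1, \tag{A.12}$$

$$\left\|X_{R_1\circ T}\right\|_s \le C\mu^4, \qquad \forall z : \|z\|_{s+5} \le 1, \tag{A.13}$$

$$\left\|X_{P^{(n)}\circ T - P^{(n)}}\right\|_s \le C\mu^4n^6, \qquad \forall z : \|z\|_s \le 1, \tag{A.14}$$

$$\left\|X_{H_0\circ T - H_0 - \{\chi,H_0\}}\right\|_s \le C\mu^4n^6, \qquad \forall z : \|z\|_s \le 1. \tag{A.15}$$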
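The right-hand sides of (A.12) and (A.14) scale like $\mu^2n^{-r}$ and $\mu^4n^6$ respectively. A short sketch of the exponent arithmetic obtained by equating the two, done with exact fractions; the function names are hypothetical and the constants $C$, $C_s$ are ignored.

```python
from fractions import Fraction

def balanced_cutoff_exponent(r):
    """Exponent a such that n = mu**a equates mu^2 / n^r and mu^4 * n^6."""
    # mu^2 * n^(-r) = mu^4 * n^6  =>  n^(r + 6) = mu^(-2)
    return Fraction(-2, r + 6)

def remainder_exponent(r):
    """Exponent of mu in mu^4 * n^6 at the balanced cutoff."""
    return 4 + 6 * balanced_cutoff_exponent(r)
```

For instance, `remainder_exponent(6)` evaluates $4 - 12/(6+r)$ at the illustrative value $r = 6$; larger $r$ brings the exponent closer to 4.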



Proof. All these estimates are a direct application of some lemmas already proved in [1]. In particular (A.12) and (A.13) follow from Lemma 8.2, the first one with $R = 3/2$ and $\delta = 1/2$, the second one with $R = 2$ and $\delta = 1$. Equation (A.14) is a consequence of Lemma 8.3 with $R = 2$ and $\delta = 1$. Equation (A.15) is a consequence of Lemma 8.5 with $R = 2$ and $\delta = 1$.

We now choose $n$ in such a way that (A.12) and (A.14) are of the same order of magnitude. This leads to the choice $n = \mu^{-\frac{2}{r+6}}$, which gives the estimate (4.27) for the remainder. Up to now we have shown that

$$K\circ T = H_0 + \langle P^{(n)}\rangle + R \tag{A.16}$$

with $R$ fulfilling the wanted estimate. To conclude the proof it is enough to remark that

$$\left\|X_{\langle P\rangle - \langle P^{(n)}\rangle}(z)\right\|_s \le \frac{\mu^2C_s}{n^r} \le C\mu^{4-\frac{12}{6+r}}, \qquad \forall z : \|z\|_{s+r+3} \le 3/2, \tag{A.17}$$

and thus one can simply substitute $\langle P\rangle$ in place of $\langle P^{(n)}\rangle$, including the difference in the remainder.

B. Appendix: Proof of Proposition 4.11

Define the Fourier coefficients of the function $u$ by

$$\hat u_K := \frac{1}{\sqrt2}\int_{-1}^{1}u(y)e^{-i\pi Ky}dy, \tag{B.1}$$

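and similarly for $v$, then

Lemma B.1. For a state of the FPU corresponding to a pair of functions $(u, v)$ one has

$$\frac{E_k}{N} = \sum_{L\in\mathcal L}\left(|\hat u_{K+L}|^2 + \left(\frac{\omega_k}{\mu}\right)^2|\hat v_{K+L}|^2\right), \qquad \forall k : \mu K = \frac{k}{N}, \tag{B.2}$$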

where

$$\mathcal L := \{L \in \mathbb Z : L\mu = 2l \text{ with } l \in \mathbb Z\} \tag{B.3}$$

and $E_k = 0$ otherwise.

Proof. First introduce a $2N$-periodic interpolating function for $r_j$, namely a smooth function $r^N(x)$ such that

$$r_j = r^N(j), \qquad r^N(x + 2N) = r^N(x). \tag{B.4}$$

Denote

$$\hat r_k^N := \frac{1}{\sqrt{2N}}\int_{-N}^{N}r^N(x)e^{-\frac{ik\pi x}{N}}dx, \tag{B.5}$$


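then one has

$$r_j = r^N(j) = \frac{1}{\sqrt{2N}}\sum_{k\in\mathbb Z}\hat r^N_ke^{\frac{ik\pi j}{N}} = \frac{1}{\sqrt{2N}}\sum_{k=-N}^{N-1}\left(\sum_{l\in\mathbb Z}\hat r^N_{k+2Nl}\right)e^{\frac{ik\pi j}{N}},$$

which implies

$$\hat r_k = \sum_{l\in\mathbb Z}\hat r^N_{k+2Nl}. \tag{B.6}$$

Then the relation between $\hat r^N_k$ and $\hat u_K$ is easily obtained by remarking that

$$r^N(j) = \mu^2u(\mu j) = \frac{\mu^2}{\sqrt2}\sum_{K\in\mathbb Z}\hat u_Ke^{iK\mu j\pi} = \frac{1}{\sqrt{2N}}\sum_{k\in\mathbb Z}\hat r^N_ke^{\frac{ik\pi j}{N}}. \tag{B.7}$$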
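The aliasing identity (B.6) can be checked numerically. The sketch below assumes that the discrete coefficients are $\hat r_k = (2N)^{-1/2}\sum_{j=-N}^{N-1}r_je^{-ik\pi j/N}$, which is the inverse of the finite sum used above; the dictionary `coeffs` of continuous coefficients is a made-up example.

```python
import cmath
import math

def sample(coeffs, N):
    """r_j = r^N(j), j = -N..N-1, for r^N(x) = (2N)^(-1/2) sum_k c_k e^{i k pi x / N}."""
    return [sum(c * cmath.exp(1j * k * math.pi * j / N) for k, c in coeffs.items())
            / math.sqrt(2 * N) for j in range(-N, N)]

def discrete_coefficient(r, N, k):
    """Discrete Fourier coefficient of the 2N samples (assumed normalization)."""
    return sum(rj * cmath.exp(-1j * k * math.pi * j / N)
               for j, rj in zip(range(-N, N), r)) / math.sqrt(2 * N)

def aliased_sum(coeffs, N, k):
    """Right-hand side of (B.6): sum of the coefficients with index k + 2 N l."""
    return sum(c for m, c in coeffs.items() if (m - k) % (2 * N) == 0)

N = 4
coeffs = {1: 1.0 + 0.5j, 9: 0.25, -7: -0.3j, 2: 2.0}  # hypothetical coefficients
r = sample(coeffs, N)
```

With $N = 4$ the modes 1, 9 and $-7$ differ by multiples of $2N = 8$, so they all contribute to the same discrete coefficient $\hat r_1$.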

Proof of Proposition 4.11. We start from Eq. (B.2), and as a first step we remark that, for $K\mu = k/N$, one has

$$\frac{\omega_k}{\mu} = \frac{2}{\mu}\sin\frac{k\pi}{2N} = \frac{2}{\mu}\sin\frac{\mu K\pi}{2} \le \pi|K|, \tag{B.8}$$

and that, for $|K| \ge 2|\ln\mu|/\sigma$ one has

$$\frac{|\hat u_K|^2 + \pi^2K^2|\hat v_K|^2}{2} \le \pi^2\mu^4\|(u,v)\|_0^2. \tag{B.9}$$

Using the relation between $(u, v)$ and $(\xi, \eta)$ one gets

$$\frac{|\hat u_K|^2 + \pi^2K^2|\hat v_K|^2}{2} = \frac{|\hat\xi_K|^2 + |\hat\eta_K|^2}{2} \tag{B.10}$$

from which, using (B.8), Eq. (4.41) immediately follows. Concerning (4.40) one has, for |K| ≤ 2| ln µ|/σ , 2

2 ˆ E ξK + ηˆ K ω2 − (µK)2 2 1 2 ωk2 k k 2 uˆ K+L + 2 |vˆK+L | 4 − vˆK + ≤ µ 2 µ2 2 µ L =0 L∈L

≤

(µK)4 µ2

1 uˆ K+L 2 + |K + L|2 |vˆK+L |2 |vˆK |2 + 2 L =0 L∈L

≤ µ (2| ln µ|)2 v2σ,1 + 2

(v, u)20 e−2σ µ . l

l =0

The logarithm of µ can obviously be estimated by µ−1/2 , while the sum is exponentially small with µ. Thus the thesis follows.
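The claim that $|\ln\mu|$ is controlled by $\mu^{-1/2}$ for $0 < \mu < 1$ can be checked directly: $\sqrt\mu\,|\ln\mu|$ attains its maximum $2/e$ at $\mu = e^{-2}$. A small numerical sketch on a logarithmic grid (the grid itself is an arbitrary choice):

```python
import math

def log_ratio(mu):
    """sqrt(mu) * |ln mu|, which stays below 1 on the interval (0, 1)."""
    return abs(math.log(mu)) * math.sqrt(mu)

# sample mu = 10^(-k/100) for k = 1..1200, i.e. from about 0.98 down to 1e-12
worst = max(log_ratio(10 ** (-k / 100)) for k in range(1, 1201))
```

Here `worst` is the largest sampled value of the ratio; any value below 1 confirms $|\ln\mu| \le \mu^{-1/2}$ on the grid.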



C. Appendix: Proof of Theorem 5.1

It is useful to use also the variables $(u, v)$, to define

$$u^a(y,\tau) := \frac{\xi^a(y-\tau,\mu^2\tau) + \eta^a(y+\tau,\mu^2\tau)}{\sqrt2}, \tag{C.1}$$

$$v^a_y(y,\tau) := \frac{\xi^a(y-\tau,\mu^2\tau) - \eta^a(y+\tau,\mu^2\tau)}{\sqrt2}, \tag{C.2}$$

and denote $z^a(y,\tau) = (u^a(y,\tau), v^a(y,\tau))$. Then, in order to get a better approximation we define

$$(\tilde u,\tilde v) \equiv \tilde z = T(z^a) = z^a + \psi^a(z^a), \tag{C.3}$$

where

$$\|\psi^a\|_r \le C\mu^{2-\frac{6}{6+r}}, \tag{C.4}$$

and $(\tilde u,\tilde v)$ fulfills the equations

$$\frac{\tilde v_t}{\mu} = -\tilde u - \mu^2\pi_0\tilde u^2 + R^v, \tag{C.5}$$

$$\tilde u_t = -\Delta_1\frac{\tilde v}{\mu} + \mu R^u, \tag{C.6}$$

where the operator $\Delta_1$ acts in terms of the $x$ variables, the remainders are functions of $y, \tau$ which fulfill

$$\|R^v\|_{\sigma,1} \le C\mu^{4-\frac{12}{6+r}}, \qquad \|R^u\|_{\sigma,0} \le C\mu^{4-\frac{12}{6+r}}, \tag{C.7}$$

and $\pi_0$ is the projector on the space of the functions with zero average.

We restrict the space variable to integer values. If $\mu = l/n$ with $l$ and $n$ relatively prime integers then all the quantities involved in Eqs. (C.5), (C.6) are periodic with period $n$. In what follows we will restrict to the case $l = 1$; the case $l \ne 1$ can be dealt with by simple modifications. Keeping this in mind we will allow the space variable $j$ to vary in $\{-n, \dots, n-1\}$. For a (finite) sequence $r = \{r_j\}$ we define the norm

$$\|r\|^2_{\ell^2(j)} := \sum_{j=-n}^{n-1}|r_j|^2. \tag{C.8}$$

For the quantities $\tilde u, \tilde v, R^v, R^u$ evaluated at the integers $j$ we will retain the same notation as for the original quantities. Moreover it is useful to introduce the difference operator $\partial$ defined by

$$(\partial r)_j := r_j - r_{j-1}, \tag{C.9}$$

where $r$ is an arbitrary sequence. We consider the FPU model (4.9). We rewrite it in the form

$$\dot s = -r - \mu^2\pi_0r^2, \tag{C.10}$$

$$\dot r = -\Delta_1s, \tag{C.11}$$



and we look for two sequences $E \equiv \{E_j\}$ and $F \equiv \{F_j\}$ such that

$$r = \tilde u + \mu E, \qquad s = \frac{\tilde v}{\mu} + \mu F \tag{C.12}$$

fulfill the FPU equation in the form (C.10) and (C.11). Then $E$ and $F$ have to fulfill

$$\dot E = -\Delta_1F - \frac{\mu R^u}{\mu}, \tag{C.13}$$

$$\dot F = -E - \mu^2\,2\pi_0\tilde uE - \mu^3\pi_0E^2 - \frac{R^v}{\mu}. \tag{C.14}$$

Moreover, for $(E, F)$ we impose initial conditions such that $(\tilde u, \tilde v)$ has initial data corresponding to those of the true initial datum, namely we assume

$$\tilde u(\mu j, 0) + \mu E_{0,j} = r_{0,j} = u^a(\mu j, 0), \qquad \frac{\tilde v(\mu j, 0)}{\mu} + \mu F_{0,j} = s_{0,j} = \frac{v^a(\mu j, 0)}{\mu}. \tag{C.15}$$

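Lemma C.1. One has

$$\|E_0\|_{\ell^2(j)} \le C\mu^{\frac12-\frac{6}{6+r}}, \qquad \|\partial F_0\|_{\ell^2(j)} \le C\mu^{\frac12-\frac{6}{6+r}}. \tag{C.16}$$

Proof. From (C.3), (C.4) and (C.15) one has

$$E_0 = -\frac{\tilde u - u^a}{\mu} = -\frac{\psi^u}{\mu}, \qquad F_0 = -\frac{\tilde v - v^a}{\mu^2} \equiv -\frac{\psi^v}{\mu^2}$$

and

$$\sup_y|\tilde u(y) - u^a(y)| \le C\mu^{2-\frac{6}{6+r}},$$

from which

$$\|E_0\|^2_{\ell^2(j)} \le \sum_{j=-n}^{n-1}\sup_j|E_j|^2 \le 2n\frac{C\mu^{4-\frac{12}{6+r}}}{\mu^2} = 2C\frac{\mu^{4-\frac{12}{6+r}}}{\mu^3},$$

from which the estimate of $E_0$ follows. Concerning $F$ we need an estimate of $\partial\psi^v$. Since $\psi^v$ is a function of $y$, one has

$$|(\partial\psi^v)(j)| = |\psi^v(\mu j) - \psi^v(\mu j - \mu)| \le \mu\sup_y|\partial_y\psi^v(y)| \le C\mu^{3-\frac{6}{6+r}},$$

from which

$$\left\|\frac{\partial\psi^v}{\mu^2}\right\|^2_{\ell^2(j)} \le \sum_{j=-n}^{n-1}\frac{|\partial\psi^v(j)|^2}{\mu^4} \le \frac{n\mu^{6-\frac{12}{6+r}}}{\mu^4}. \tag{C.17}$$

We use now an idea of Wayne and Schneider to obtain the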

Theorem C.2. Fix $r = 78$, and fix $T_f$ and $C_F > 0$; then, provided $\mu$ is small enough, one has that

$$\|E\|^2_{\ell^2(j)} + \|\partial F\|^2_{\ell^2(j)} \le C_F \tag{C.18}$$

for all times $t$ fulfilling

$$|t| \le \frac{T_f}{\mu^3}. \tag{C.19}$$

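Proof. Define the function

$$\mathcal F(E, F) := \sum_j\left[\frac{E_j^2 + F_j(-\Delta_1F)_j}{2} + \frac{2\mu^2\tilde u_jE_j^2}{2}\right] \tag{C.20}$$

and remark that

$$\frac12\mathcal F(E, F) \le \|E\|^2_{\ell^2(j)} + \|\partial F\|^2_{\ell^2(j)} \le 2\mathcal F(E, F).$$

Compute now the time derivative of $\mathcal F$; inserting Eqs. (C.13) and (C.14) one gets

$$\dot{\mathcal F} = \sum_j\left[(\Delta_1F)_j\mu^3E_j^2 + (\Delta_1F)_j\frac{R^v_j}{\mu} - \frac{E_j\mu R^u_j}{\mu} - \frac{2\mu^3\tilde u_jR^u_jE_j}{\mu} + \frac{2\mu^3E_j^2}{2}\frac{\partial\tilde u_j}{\partial\tau}\right]. \tag{C.21}$$

In order to estimate the r.h.s. we need some preliminary estimates. The first one is

$$\sup_j|(\Delta_1F)_j| = \sup_j|(\partial F)_{j+1} - (\partial F)_j| \le 2\sup_j|(\partial F)_j| \le 4\sqrt{\mathcal F}.$$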
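The two facts about the discrete operators used here, namely $(\Delta_1F)_j = (\partial F)_{j+1} - (\partial F)_j$ with the bound $\sup_j|(\Delta_1F)_j| \le 2\sup_j|(\partial F)_j|$, and the summation by parts $\sum_jF_j(-\Delta_1F)_j = \sum_j(\partial F)_j^2$ for periodic sequences, can be illustrated as follows. This is only a sketch; the sequence `F` is an arbitrary example.

```python
def diff(r):
    """(dr)_j = r_j - r_{j-1} on a periodic sequence (index -1 wraps around)."""
    return [r[j] - r[j - 1] for j in range(len(r))]

def laplacian(F):
    """(Delta_1 F)_j = (dF)_{j+1} - (dF)_j, indices taken modulo the period."""
    d = diff(F)
    m = len(F)
    return [d[(j + 1) % m] - d[j] for j in range(m)]

F = [0.3, -1.2, 0.7, 2.0, -0.5, 0.1]  # hypothetical periodic sequence
dF = diff(F)
lap = laplacian(F)
# quadratic form appearing in (C.20): sum_j F_j (-Delta_1 F)_j
energy = sum(f * (-l) for f, l in zip(F, lap))
```

The identity `energy == sum(d*d for d in dF)` (up to rounding) is what makes the $F$-part of (C.20) comparable with $\|\partial F\|^2_{\ell^2(j)}$.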

Next we will need an estimate of $\|R^u\|_{\ell^2(j)}$. This is given by

$$\|R^u\|^2_{\ell^2(j)} \le \sum_j|R^u|^2 \le 2n\sup_y|R^u(y)|^2 \le Cn\mu^{8-\frac{24}{6+r}}, \tag{C.22}$$

which gives

$$\|R^u\|_{\ell^2(j)} \le C\mu^{4-\frac12-\frac{12}{6+r}}. \tag{C.23}$$

Concerning $R^v$ we need an estimate of $\|\partial R^v\|_{\ell^2(j)}$. This is given by

$$\|\partial R^v\|_{\ell^2(j)} \le C\mu^{4+\frac12-\frac{12}{6+r}}, \tag{C.24}$$

which is obtained by remarking that

$$|(\partial R^v)_j| = |R^v(\mu j) - R^v(\mu j - \mu)| \le \mu\sup_y\left|\frac{\partial R^v}{\partial y}(y)\right|$$

and proceeding as in the proof of (C.23). Now, the first term of (C.21) is estimated by $4\mu^3\mathcal F^{3/2}$. Concerning the second term, first remark that it coincides, up to sign, with $\sum_j(\partial F)_j(\partial R^v)_j/\mu$ and therefore it is estimated by $C\mathcal F^{1/2}\mu^{4+\frac12-\frac{12}{6+r}-1}$. The same estimate holds for the third term, the fourth term is estimated by $C\mathcal F^{1/2}\mu^{6+\frac12-\frac{12}{6+r}-1}$, and the last term is also easily estimated by remarking that the derivative of $\tilde u$ with respect to $\tau$ is bounded, so that such a term is bounded by $C\mu^3\mathcal F$. As long as $\mathcal F < 2C_F$ one thus has

$$\dot{\mathcal F} \le C(\mu^3 + \mu^3)\mathcal F + C\mu^{5+1/4-1}. \tag{C.25}$$

Such a differential inequality can be easily solved, giving

$$\mathcal F(t) \le \mathcal F_0e^{T_0C} + e^{T_0C}CT_0\mu^{1+\frac12-\frac{12}{6+r}-1}, \tag{C.26}$$

which, inserting the value of $r$, implies the thesis.

Moreover the result on the Fourier modes is an immediate consequence of Proposition 4.11 and of the fact that the error from a true solution is measured in the norm $\ell^2(j)$, which controls the Fourier coefficients.

Acknowledgement. This work emerged from many discussions within our group. In particular we would like to thank Andrea Carati and Luigi Galgani for their very interesting suggestions and comments. We would like to thank Antonio Giorgilli who showed us many numerical simulations and stimulated our interest in the phenomenon of formation of the packet. We also thank Giancarlo Benettin whose criticism is always very stimulating and leads to a better understanding of the problems at hand.

References

1. Bambusi, D.: Nekhoroshev theorem for small amplitude solutions in nonlinear Schrödinger equation. Math. Z. 130, 345–387 (1999)
2. Bambusi, D.: An averaging theorem for quasilinear Hamiltonian PDEs. Ann. Henri Poincaré 4, 685–712 (2003)
3. Bambusi, D.: Galerkin averaging method and Poincaré normal form for some quasilinear PDEs. http://www.ma.utexas.edu/mp arc/c/05/05-28.pdf, 2005
4. Bambusi, D., Carati, A., Ponno, A.: The nonlinear Schrödinger equation as a resonant normal form. DCDS-B 2, 109–128 (2002)
5. Berchialla, L., Galgani, L., Giorgilli, A.: Localization of energy in FPU chains. DCDS-A 11, 855–866 (2005)
6. Berchialla, L., Giorgilli, A., Paleari, S.: Exponentially long times to equipartition in the thermodynamic limit. Phys. Lett. A 321, 167–172 (2004)
7. Biello, J.A., Kramer, P.R., Lvov, Y.V.: Stages of energy transfer in the FPU model. Dynamical systems and differential equations (Wilmington NC 2002). DCDS Suppl., 113–122 (2003)
8. Bambusi, D., Nekhoroshev, N.N.: A property of exponential stability in the nonlinear wave equation close to main linear mode. Physica D 122, 73–104 (1998)
9. Carati, A., Galgani, L.: On the specific heat of FPU systems and their glassy behavior. J. Stat. Phys. 94, 859–869 (1999)
10. Carati, A., Galgani, L.: Planck's formula and glassy behaviour in classical nonequilibrium statistical mechanics. Physica A 280, 105–114 (2001)
11. Carati, A., Galgani, L., Giorgilli, A.: The Fermi–Pasta–Ulam problem as a challenge for the foundations of physics. Chaos, to appear, 2005
12. Craig, W.: Birkhoff normal form for water waves. Mathematical problems in the theory of water waves, V. 200, Providence, RI: AMS, 1996
13. Craig, W., Sulem, C.: Numerical simulation of gravity waves. J. Comput. Phys. 108, 73–83 (1993)
14. Craig, W., Worfolk, P.A.: An integrable normal form for water waves in infinite depth. Physica D 84, 513–531 (1995)
15. Dyachenko, A.I., Zakharov, V.E.: Is free-surface hydrodynamics an integrable system? Phys. Lett. A 190, 144–148 (1994)
16. Fink, A.: Almost periodic differential equations. Berlin: Springer-Verlag, 1974
17. Fucito, F., Marchesoni, F., Marinari, E., Parisi, G., Peliti, L., Ruffo, S., Vulpiani, A.: Approach to equilibrium in a chain of nonlinear oscillators. J. de Physique 43, 707–713 (1982)
18. Friesecke, G., Pego, R.L.: Solitary waves on Fermi-Pasta-Ulam lattices. I. Qualitative properties renormalization and continuum limit. Nonlinearity 12, 1601–1627 (1999)
19. Friesecke, G., Pego, R.L.: Solitary waves on Fermi-Pasta-Ulam lattices. II. Linear implies nonlinear stability. Nonlinearity 15, 1343–1359 (2002)
20. Friesecke, G., Pego, R.L.: Solitary waves on Fermi-Pasta-Ulam lattices. III. Howland-type Floquet theory. Nonlinearity 17, 207–227 (2004)
21. Friesecke, G., Pego, R.L.: Solitary waves on Fermi-Pasta-Ulam lattices. IV. Proof of stability at low energy. Nonlinearity 17, 229–251 (2004)
22. Fermi, E., Pasta, J.R., Ulam, S.M.: Studies of nonlinear problems. In: Collected works of E. Fermi, Vol. 2. Chicago: Chicago University Press, 1965
23. Galgani, L., Scotti, A.: Planck-like distribution in classical nonlinear mechanics. Phys. Rev. Lett. 28, 1173–1176 (1972)
24. Izrailev, F.M., Chirikov, B.V.: Statistical properties of a nonlinear string. Sov. Phys. Dokl. 11, 30–32 (1966)
25. Kappeler, T., Pöschel, J.: KAM & KdV. Berlin-Heidelberg-New York: Springer, 2003
26. Livi, R., Pettini, M., Ruffo, S., Vulpiani, A.: Further results on the equipartition threshold in large nonlinear Hamiltonian systems. Phys. Rev. A 31, 2741–2742 (1985)
27. Marchenko, V.: Sturm-Liouville operators and applications. Basel: Birkhäuser, 1986
28. Ponno, A., Bambusi, D.: Energy cascade in Fermi–Pasta–Ulam model. In: G. Gaeta et al. (eds.) Symmetry and Perturbation Theory 2004, River Edge, NJ: World Scientific, 2005, pp. 263–270
29. Ponno, A., Bambusi, D.: KdV equation and energy sharing in FPU. Chaos 15, 015107 (2005)
30. Paleari, S., Bambusi, D., Cacciatori, S.: Normal form and exponential stability for some nonlinear string equations. ZAMP 52, 1033–1052 (2001)
31. Pettini, M., Landolfi, M.: Relaxation properties and ergodicity breaking in nonlinear Hamiltonian dynamics. Phys. Rev. A 41, 768–783 (1990)
32. Ponno, A.: Soliton theory and the Fermi-Pasta-Ulam problem in the thermodynamic limit. Europhys. Lett. 64, 606–612 (2003)
33. Ponno, A.: The Fermi–Pasta–Ulam problem in the thermodynamic limit. In: P. Collet et al. (ed.) Proceedings of the Cargèse Summer School 2003 on Chaotic Dynamics and Transport in Classical and Quantum Systems, Dordrecht: Kluwer Academic Publishers, 2005, pp. 431–440
34. Pöschel, J.: Hill's potentials in weighted Sobolev spaces and their spectral gaps. Preprint (2004)
35. Pierce, R.D., Wayne, C.E.: On the validity of mean-field amplitude equations for counterpropagating wavetrains. Nonlinearity 8, 769–780 (1995)
36. Rink, B.: Symmetric invariant manifolds in the Fermi-Pasta-Ulam lattice. Physica D 175, 31–42 (2001)
37. Rink, B.: Symmetry and resonance in periodic FPU chains. Commun. Math. Phys. 218, 665–685 (2001)
38. Shepelyansky, D.L.: Low-Energy chaos in the Fermi–Pasta–Ulam problem. Nonlinearity 10, 1331–1338 (1997)
39. Schneider, G., Wayne, C.E.: Counter-propagating waves on fluid surfaces and the continuum limit of the Fermi Pasta Ulam model. In: Proceedings of the International Conference on Differential Equations Berlin 1999, River Edge, NJ: World Scientific, 2000
40. Zabusky, N.J., Kruskal, M.D.: Interaction of solitons in a collisionless plasma and the recurrence of initial states. Phys. Rev. Lett. 15, 240–243 (1965)

Communicated by G. Gallavotti

Commun. Math. Phys. 264, 563–564 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1558-z



Erratum

The Hamiltonian Operator Associated with Some Quantum Stochastic Evolutions

M. Gregoratti

Dipartimento di Matematica “F. Brioschi”, Politecnico di Milano, piazza Leonardo da Vinci 32, 20133 Milano, Italy. E-mail: [email protected]

Received: 29 July 2004 / Accepted: 23 January 2006
Published online: 22 March 2006 – © Springer-Verlag 2006

Commun. Math. Phys. 222, 181–200 (2001)

It was kindly pointed out to us by W. von Waldenfels that Section 3.2 of [1] contains an error when the trace operator is introduced for functions in the Sobolev space $H(\mathbb R^n_*;\mathcal H)$: we claimed that there exists a bounded operator $\cdot|_{\{r=s\}} : H(\mathbb R^n_*;\mathcal H) \to L^2(\mathbb R^{n-1};\mathcal H)$ which naturally defines the trace of each $v$ in $H(\mathbb R^n_*;\mathcal H)$ as a function $v|_{\{r=s\}}$ in $L^2(\mathbb R^{n-1};\mathcal H)$, but actually such a trace $v|_{\{r=s\}}$ is naturally defined only as a function in $L^2_{\mathrm{loc}}(\mathbb R^{n-1}_*;\mathcal H)$, and a trace operator from $H(\mathbb R^n_*;\mathcal H)$ to $L^2(\mathbb R^{n-1};\mathcal H)$ can only be closed, with a domain to be specified. Nevertheless the main result of [1], Theorem 3, is correct and provable through an adjustment of the argument. We refer to [2] for a detailed introduction of the traces $\cdot|_{\{r=s\}}$ and we list below the points which require an adjustment, that is, the points involving $\cdot|_{\{r=s\}}$ which are to be handled taking into account domain constraints.

1. The integration by parts formula (22) needs to be generalized [2] because $\langle u|_{\partial Q_m}\,|\,v|_{\partial Q_m}\rangle_{\mathcal H}$ is not necessarily in $L^1(\partial Q_m)$ for every $u$ and $v$ in $H(\mathbb R^n_*;\mathcal H)$. Therefore, for > 0, we introduce on $\mathbb R^n$ the totally symmetric indicator function $I(r) =$

0, $f^n(J) \subseteq J$ and $f^n|_J$ is unimodal. In particular, $f^n$ maps the boundary of $J$ into itself.

Asymptotic dynamics and Hausdorff dimension. Let $f$ be a unimodal map with an attractor of solenoidal type. Assume that $J$ is a restrictive interval of $f$ with period $n(J)$. The solenoidal attractor is equal to $\bigcap_J\bigcup_{i=1}^{n(J)}f^i(J)$, where the intersection is taken over all restrictive intervals of $f$. The Hausdorff dimension is an invariant in the smooth category. This means that the Hausdorff dimensions of the attractors of $f$ and $f^{n(J)}|_J$ coincide. In other words, we can pick an arbitrarily small restrictive interval $J$ and then replace $f$ by any smooth rescaling of $f^{n(J)}|_J$ without altering the Hausdorff dimension of the attractor. Consequently, it is enough to get universal estimates of

$$\sum_{i=1}^{n(J)}|f^i(J)|^\sigma$$

for all restrictive intervals $J$ of high enough periods.

Negative Schwarzian. We say that a $C^3$ function $g$ has negative Schwarzian derivative if

$$S(g)(x) := \frac{D^3g(x)}{Dg(x)} - \frac32\left(\frac{D^2g(x)}{Dg(x)}\right)^2 < 0$$

whenever $Dg(x) \ne 0$. Such maps will be called S-unimodal. The Schwarzian derivative $S(g)$ satisfies the composition law $S(g\circ h)(x) = S(g)(h(x))Dh(x)^2 + S(h)(x)$. Thus iterates of a map with negative Schwarzian derivative also have negative Schwarzian derivative.

Cross-ratio. If $J \subset T$ are two intervals, then

$$b(T, J) = \frac{|J||T|}{|X||Y|},$$

where $X, Y$ are the two components of $T\setminus J$. We have a well-known property of expanding the cross-ratio $b(T, J)$.


Lemma 2.1. For any $J \subset T$ and any diffeomorphism $g$ with negative Schwarzian derivative, $b(g(T), g(J)) > b(T, J)$.

Lemma 2.2. For every $C > 0$ there is $\lambda < 1$ such that if $I \subset J \subset T$ are three intervals and $b(T, J) < C$, then $b(T, I) < \lambda\,b(J, I)$.

Proof. From $b(T, J) < C$, we infer that there exists $C'$, which depends only on $C$, such that each connected component of $T\setminus J$ is bigger than $C'|J|$. Hence, $|J| \le |T|/(2C' + 1)$. Now a direct calculation shows that the claim is satisfied for $\lambda := (1 + 2C')/(1 + C')^2 < 1$.
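Lemma 2.1 can be illustrated on a simple example. The sketch below takes $g(x) = x^2$ on a positive interval, where it is an increasing diffeomorphism with $S(g)(x) = -3/(2x^2) < 0$, and compares the cross-ratio before and after applying $g$. The intervals `T` and `J` are arbitrary choices and the helper names are not from the paper.

```python
def cross_ratio(T, J):
    """b(T, J) = |J||T| / (|X||Y|), with X, Y the two components of T minus J."""
    (t0, t1), (j0, j1) = T, J
    return (j1 - j0) * (t1 - t0) / ((j0 - t0) * (t1 - j1))

def schwarzian_of_square(x):
    """S(g)(x) for g(x) = x^2, using Dg = 2x, D^2 g = 2, D^3 g = 0."""
    return 0.0 / (2 * x) - 1.5 * (2 / (2 * x)) ** 2

def image(g, I):
    """Image of an interval under an increasing map g."""
    return (g(I[0]), g(I[1]))

g = lambda x: x * x
T, J = (1.0, 4.0), (2.0, 3.0)
```

Comparing `cross_ratio(image(g, T), image(g, J))` with `cross_ratio(T, J)` shows the expansion stated in the lemma for this particular map.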

Nestedness and symmetry. An open set $U$ is $\tau$-nested inside $T$ if

$$\frac{|U|}{\operatorname{dist}(U, \partial T)} \le \tau.$$

We say that a map $g : U \to \mathbb R$ with one critical point $\xi$ is $\kappa$-symmetric if $|Dg(x_1)|/|Dg(x_2)| < \kappa$ for every $x_1, x_2 \in Y$ such that $g(x_1) = g(x_2)$. Note that if $g$ is $\kappa$-symmetric, then $|x_1 - \xi|/|x_2 - \xi| < \kappa$.

3. Estimates near Bifurcation

Saddle-node estimates.

Proposition 1. Let $Y$ be a compact interval and $g : Y \to \mathbb R$ a $C^3$ diffeomorphism without fixed points such that $|g(x) - x|$ has a local minimum inside $Y$. Let $\tau, \delta$ be positive so that the length of $g(Y)\setminus Y$ is greater than $\tau|Y|$ and $Sg(x)|Y|^2 < -\delta$ for every $x \in Y$. Let $\{x_0, \dots, x_k\} \subset Y$ be a sequence such that $g(x_{i-1}) = x_i$ for $i = 1, \dots, k$. Then for every $\alpha > 1/2$ there exists a constant $C > 0$ which depends only on $\tau$ and $\delta$ such that

$$\sum_{i=1}^{k}|x_i - x_{i-1}|^\alpha < C|x_0 - x_k|^\alpha.$$

The geometric set up of Proposition 1 when $g(x) > x$ is shown in Fig. 1. Let $Ng(x) = \frac{D^2g(x)}{Dg(x)}$ stand for the nonlinearity of $g$. The proof of Proposition 1 is based on the following lemma.

Lemma 3.1. Suppose that $g$ is a diffeomorphism defined in a neighborhood of 0 such that
(i) $Sg < -\beta$,
(ii) $g(0) = w$,
(iii) there is no $x \in (-L, L)$, where $g$ is defined and $g(x) \le x$.
Then for every $L$ and $\beta$ positive there exist $K_1 > 0$ and $K_2 > 0$ so that $w < K_1$ yields $Ng(0) > K_2$ for every $g$ satisfying (i-iii).

Fig. 1. Orbit of x near saddle-node bifurcation

Proof. For the convenience of the reader we sketch the proof from p. 248 of [7]. Let $G(w, L)$ be a class of $C^3$ diffeomorphisms defined on $(-L, L)$ and satisfying (i-iii). Recall a well-known differential equation

$$DNg = Sg + \frac12(Ng)^2. \tag{1}$$

Every function in $G(w, L)$ is uniquely determined by a continuous function $\psi = Sg$ and two numbers $\nu$ and $\mu$ equal to $Ng(0)$ and $Dg(0)$ respectively. Observe that with $\mu$ fixed, Eq. (1) implies that every $g \in G(w, L)$ is a monotone function of $\psi$ and $\nu$. Now, we will prove the following claim: for every $\mu \in \mathbb R$ there exists $w > 0$ such that the solution $g$, $g(0) = w$, of Eq. (1) with $\psi = -\beta$ and $\nu = 0$ cannot satisfy $g(x) \ge x$ for every $x \in (-L, L)$. Suppose, to the contrary, that for every $w > 0$ there exists $\mu$ such that $g(x) \ge x$. From another well-known differential formula, $D^2u = -\beta u$, where $Dg = 1/u^2$. Hence,

$$Dg(x) = \frac{\mu}{\cosh\sqrt\beta\,x}$$

and the set of such $\mu$ must be bounded. Every $g$, as a map with negative Schwarzian, has exactly one inflection point, which is at 0. Taking a limit, we obtain a map $g$ with the property that $g(x) \ge x$ for every $x \in (-L, L)$ and $g(0) = 0$, a contradiction. By the claim and the monotonicity of $g \in G(w, L)$ with respect to $\psi$ and $\nu$, there exist $K_1$ and $K_2$ positive such that $Sg < -\beta \Rightarrow Ng(0) > K_2$, provided $w < K_1$.


Let $g$ be as in Proposition 1. Since $Sg < 0$, there exists a unique point $\chi \in Y$ so that $|g(x) - x|$ achieves its local minimum at $\chi$.

Lemma 3.2. Let $g$ be as in Proposition 1. Then there exist positive constants $k_0$, $Q$ and $\kappa$ (which depend only on $\delta$ and $\tau$ from Proposition 1) so that for every $k > k_0$ the following holds: if there is a point contained in $Y$ together with its $k$ iterates by $g$ then there exists a neighborhood $U$ of $\chi$ so that

$$|D^2g(x)| > \frac{Q}{|Y|} \ \text{ if } x \in U, \qquad |g(x) - x| > \kappa|Y| \ \text{ if } x \in Y\setminus U.$$

Proof. Let $x \in Y$. Without loss of generality $g(x) > x$. We normalize $Y$ by an affine transformation $A_x(y) = \frac{y - x}{|Y|}$. The function $G(y) = A_x\circ g\circ A_x^{-1}(y)$ satisfies (i-iii) of Lemma 3.1 with $G(0) = w = (g(x) - x)/|Y|$. Therefore, there are constants $P, K > 0$ so that if $G(0) \le P$ then

$$K < NG(0) = |Y|\,Ng(x). \tag{2}$$

We will choose $k_0$ and $\kappa$ as functions of $P$ and $\tau$. The existence of an orbit by $g$ of length $k > k_0 > 0$ contained in $Y$ implies that

$$\inf_{x\in Y}(g(x) - x) < |Y|/k_0.$$

By the negative Schwarzian property, we have that either $g(\chi) - \chi < |Y|/k_0$ or $|g(Y)\setminus Y| < |Y|/k_0$. If $k_0 > \tau^{-1}$ then the latter is not possible. Choosing $k_0 > P^{-1}$, we obtain that g(χ) − χ K/|Y|. Solving this differential inequality, we obtain that $Dg(x) > e^{-K}$ and consequently, $D^2g(x) > Q/|Y|$ for every $x \in U$. The negative Schwarzian of $g$ and the hypothesis that $|g(Y)\setminus Y| > \tau|Y|$ yield $g(x) - x > \min(\tau, P)|Y|$ for every $x \in Y\setminus U$.

Proof of Proposition 1. Let $k_0$ come from Lemma 3.2. If $k \le k_0$ then by the power means inequality, for every $\alpha$ there exists $C$ so that

$$\sum_{i=1}^{k}|x_i - x_{i-1}|^\alpha < C|x_0 - x_k|^\alpha,$$

where $C$ depends only on $\alpha$, $\tau$, and $\delta$. If $k > k_0$ then we change the coordinates by an affine map $A_\chi$ so that $\chi$ goes to 0 and the length of $Y$ is 1. In this coordinate system $g$ becomes $\phi$ and $Y$ becomes $V$. Without loss of generality $\phi(x) > x$. Observe that $\phi$ satisfies a $(Q, \kappa)$-lower approximation rule, i.e. $\phi$ is not smaller than the map $q(x) = x + Qx^2 + \phi(0)$ on the interval $U' = A_\chi(U)$, where $U \ni \chi$ and $Q > 0$ come from Lemma 3.2.

Fig. 2. Geometric set up of Proposition 2 when g(ξ) > ξ

We see that only what is inside $U'$ counts. Indeed, the number of iterates $n$ so that $0 < n < k$ and $\phi^n(x) \notin U'$ is uniformly bounded. Moreover, for all such values of $n$, $\phi^{n+1}(x) - \phi^n(x)$ is uniformly bounded away from 0. The second part of the claim follows readily from Lemma 3.2 and the first part is an obvious consequence of the second. Therefore, the sum $\sum_{i=1}^{k}|\phi^i(x) - \phi^{i-1}(x)|^\alpha$ is majorated by a uniform constant multiple of

$$|x - \phi^k(x)|^\alpha + \sum_{q^i(y)\in I}|q^i(y) - q^{i-1}(y)|^\alpha,$$

where $I = [y, z]$ and $y, z$ are, respectively, the first and the last point of the orbit $x, \dots, \phi^k(x)$ in $U'$. By the well-known estimates for quadratic maps, see [7],

$$\sum_{q^i(y)\in I}|q^i(y) - q^{i-1}(y)|^\alpha \le C|I|^\alpha$$

with $C$ depending only on $Q$ and $\alpha > 1/2$. This completes the proof of Proposition 1.
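The role of the threshold $\alpha > 1/2$ can be seen numerically on the model map $q(x) = x + Qx^2 + \varepsilon$, where $\varepsilon$ plays the role of $\phi(0)$. The following is a rough experiment, not part of the proof; the values of `Q`, `x0` and `x_end` are arbitrary.

```python
def step_sum_ratio(eps, alpha, Q=1.0, x0=-0.5, x_end=0.5):
    """Iterate q(x) = x + Q x^2 + eps from x0 until the orbit passes x_end.

    Returns sum_i |x_i - x_{i-1}|^alpha divided by |x_k - x_0|^alpha.
    """
    x, total = x0, 0.0
    while x <= x_end:
        nxt = x + Q * x * x + eps
        total += (nxt - x) ** alpha
        x = nxt
    return total / (x - x0) ** alpha
```

For $\alpha = 0.75$ the ratio stays bounded as $\varepsilon \to 0$, while for $\alpha = 0.25$ it grows, in line with the comparison of the sum with $\int(x^2 + \varepsilon)^{\alpha-1}dx$, which converges uniformly in $\varepsilon$ only for $\alpha > 1/2$.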

Estimates near almost restrictive interval.

Proposition 2. Let $g : Y \to T$ with $Y \subset T$ be a $C^3$ map with exactly one critical point $\xi$, negative Schwarzian derivative, and two repelling fixed points. Suppose $g$ is $\kappa$-symmetric for some $\kappa > 1$ and suppose there is $E > 0$ such that $g^E(\xi) \notin Y$. Let $Y$ be $\tau$-nested inside $T$, $\tau > 0$, $g(\partial Y) \subset \partial T$, $r$ be a fixed point of $g$ with a positive multiplier and $Y'$ be a component of $Y\setminus r$ not containing the critical point. Let $\{x_0, \dots, x_k\} \subset Y'$ be a sequence such that $g(x_{i-1}) = x_i$ for $i = 1, \dots, k$. Then for every $\alpha > 0$ there exists a constant $C > 0$ which depends only on $\tau$ and $\kappa$ such that

$$\sum_{i=1}^{k}|x_i - x_{i-1}|^\alpha < C|x_0 - x_k|^\alpha.$$

Proof of Proposition 2. Without loss of generality assume that $g$ is first increasing and then decreasing. Let $r' > \xi$ be a preimage of $r$ by $g$. Since the iterates of the critical point leave the interval $Y$ we have $g(\xi) > r'$. Let $y$ be the endpoint of $Y$ lying on the same side of $\xi$ as $r$. Draw two lines through the points $(r, r)$ and $(y, g(y))$, and through $(r, r)$ and $(\xi, g(\xi))$. Let $y = l(x)$ be the equation of the line with smaller slope. This slope $D(l)$ is larger than some constant ($> 1$) which depends only on $\tau$ and $\kappa$. For $x \in Y'$ we have $g(x) \le l(x)$. Indeed, we know that $g(y) \le l(y)$, $g(r) = l(r)$ and $g(\xi) \ge l(\xi)$. If there is a point $a \in Y'$ such that $g(a) > l(a)$, then there are two points $b_1$ and $b_2$ in $Y' \cup \{r\}$ such that $b_1 < a < b_2$, $g(b_{1,2}) = l(b_{1,2})$. By the minimum principle, $Dg(y) \ge D(l) > 1$ for every $y \in (b_1, b_2)$, which yields a contradiction. Recall that the minimum principle, see Lemma 6.1 in [14], says that every diffeomorphism with negative Schwarzian derivative from a closed interval $U$ into $\mathbb R$ has the derivative inside $U$ strictly larger than the minimum of the derivatives at the boundary points of $U$. Let $I = [x_k, x]$. Since the graph of $g$ for $x \in Y'$ lies below the graph of $l$, there are constants $C_1, C > 0$ which depend only on $D(l)$ and $\alpha > 0$ so that

$$\sum_{i=1}^{k}|x_i - x_{i-1}|^\alpha < C_1\sum_{l^i(x)\in I}|l^i(x) - l^{i-1}(x)|^\alpha < C|I|^\alpha.$$
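The last inequality is a geometric series estimate for the linear comparison map. A minimal sketch, assuming $l(x) = r + D(x - r)$ with slope $D > 1$; the constant $1/(1 - D^{-\alpha})$ used below is one admissible choice and is not claimed to be the constant of the paper.

```python
def linear_orbit_sum(D, x0, k, alpha, r=0.0):
    """Orbit of l(x) = r + D (x - r): returns (sum of step^alpha, |x_k - x_0|^alpha)."""
    xs = [x0]
    for _ in range(k):
        xs.append(r + D * (xs[-1] - r))
    steps = sum(abs(b - a) ** alpha for a, b in zip(xs, xs[1:]))
    return steps, abs(xs[-1] - xs[0]) ** alpha

def geometric_constant(D, alpha):
    """Sum of the geometric series with ratio D^(-alpha)."""
    return 1.0 / (1.0 - D ** (-alpha))
```

Since consecutive steps grow by the factor $D$, the sum of their $\alpha$-th powers is dominated by the last step times `geometric_constant(D, alpha)`, for every $\alpha > 0$; this is why no restriction $\alpha > 1/2$ is needed in Proposition 2.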

4. Sequences of Box Mappings

We recall that an open interval $T$ is regularly returning for some dynamics $f$ defined in an ambient space containing $T$ if $f^n(\partial T) \cap T = \emptyset$ for every $n > 0$. The first entry map $\phi$ of $f$ into a set $T$ is defined on $D_T := \{x : \exists\, n > 0,\ f^n(x) \in T\}$ by the formula $\phi(x) := f^{n(x)}(x)$, where $n(x) := \min\{n > 0 : f^n(x) \in T\}$. If $T$ is a regularly returning open interval then the function $n(x)$ is locally constant on $D_T$. Box mappings form an important class of maps which naturally arise as first entry maps of unimodal maps to regularly returning intervals.

Definition 2. Let $U$ be an open set in $\mathbb R$ and $T$ an open interval. We say that $R : U \to T$ is a box mapping if the following holds:
• $\partial T \cap U = \emptyset$;
• $R$ is differentiable and has exactly one non-degenerate critical point $\xi$; the connected component of $U$ containing $\xi$ is called the central domain;
• if $I$ is a connected component of $U$ and $\xi \notin I$, then the branch $R|_I$ is a diffeomorphism between the intervals $I$ and $T$;
• if $I$ is the central domain then $R : I \to T$ is proper.
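The definition of the first entry map translates directly into a loop. The sketch below uses a logistic map and an interval `T` chosen only for illustration; it does not check that `T` is regularly returning, and the iteration cap `max_iter` is an implementation convenience.

```python
def first_entry(f, x, T, max_iter=10_000):
    """Return (n(x), f^{n(x)}(x)) with n(x) = min{n > 0 : f^n(x) in T}.

    Returns None if no entry into T is found within max_iter iterates.
    """
    a, b = T
    y = x
    for n in range(1, max_iter + 1):
        y = f(y)
        if a < y < b:
            return n, y
    return None

f = lambda x: 3.8 * x * (1.0 - x)  # hypothetical unimodal map
T = (0.45, 0.55)
```

Calling `first_entry(f, 0.2, T)` returns the entry time together with the point $\phi(x) = f^{n(x)}(x)$.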



An induced sequence of central intervals. Let R : U → T be a box mapping and let J be its central domain. It might happen that R(ξ) is in J. Let E be the minimal positive integer such that R^E(ξ) ∉ J. This number does not necessarily exist: all iterates of the critical point can stay in the central domain. If this number is finite, it will be called an escaping time. The end points of J are mapped by R onto an end point a of T. Let a_i be a preimage of a by the i-th iterate of the central branch of R, i = 0, . . . , E − 1, such that R^{i+1}(a_i) = a. In this notation a_0 is a boundary point of J. Each point a_i has a symmetrical point which we denote by a_i′ (here "symmetrical" means that R_J(a_i′) = R_J(a_i)). Finally, let J_i = (a_i, a_i′) (so J = J_0). We call the sequence J_0, . . . , J_{E−1} an induced sequence of central intervals for R. We say that a box mapping is κ-symmetric if its central branch is κ-symmetric.

Lemma 4.1. Let R : U → T be a C^3 box mapping with negative Schwarzian derivative. Let δ, τ be positive and α > 1/2, κ > 1 such that
(i) U is τ-nested inside T,
(ii) the Schwarzian derivative of the central branch is smaller than −δ/|J|^2 for every x ∈ J,
(iii) R is κ-symmetric.
Then there exists a constant K > 0 depending only on α, τ, κ and δ so that the following is true: if I is an interval such that
• R^m is monotone on I,
• R^i(I) does not intersect J_{E−1} for i = 0, . . . , m − 1,
• R^{m−1}(I) and J_0 are disjoint,
then

\[ \sum_{i=0}^{m} |R^i(I)|^{\alpha} < K\, b(T, R^m(I))^{\alpha}\, |T|^{\alpha}. \]

We can prove this lemma only for α > 1/2 and believe that for α < 1/2 it might not hold. The constant 1/2 is not related to the order of the critical point, but to the order of "almost" tangency of the central branch and the diagonal as in Proposition 1.

Proof. We divide the orbit {R^i(I), i = 0, . . . , m} into pieces according to the following rule: m_0 = 0 and m_{j+1} is the smallest integer greater than m_j such that the interval R^{m_{j+1}}(I) is disjoint from the central domain J. In this way we have constructed a sequence {m_j, j = 0, . . . , p}. Put m_{p+1} = m. If m_{j+1} ≠ m_j + 1, then between the iterates m_j and m_{j+1} the interval I is either close to a saddle-node bifurcation or to a repelling fixed point. Let V_j be a domain of R such that R^{m_j}(I) ⊂ V_j. The domain V_j exists because otherwise the map R^{m_j+1} | I would not be monotone. Since U is τ-nested inside T, the cross-ratio b(T, V_j) is bounded by a universal constant depending only on τ. We want to prove that there exists λ < 1 which depends only on τ so that

\[ b(V_j, R^{m_j}(I)) < \lambda\, b(V_{j+1}, R^{m_{j+1}}(I)). \tag{3} \]

The proof falls naturally into two cases.

Case I. Suppose that m_{j+1} = m_j + 1. Since R expands cross-ratios, b(V_j, R^{m_j}(I)) < b(T, R^{m_{j+1}}(I)).
Lemma 2.2 applied to T ⊃ V_{j+1} ⊃ R^{m_{j+1}}(I) yields b(T, R^{m_{j+1}}(I)) < λ b(V_{j+1}, R^{m_{j+1}}(I)), where λ < 1 depends only on τ. Combining these two inequalities we obtain (3).

Case II. Suppose m_{j+1} > m_j + 1. Let J_i = (a_i, a_i′), i = 1, . . . , E − 1, be an induced sequence of central intervals for R. Let V_j′ ⊃ R^{m_j+1}(I) be a connected component of J_i \ J_{i+1}. Since R^{m−1}(I) ∩ J = ∅, the intervals {R^j(I), j = 0, . . . , m − 1} never meet the boundary of the intervals J_i. Since R expands cross-ratios,

\[ b(V_j', R^{m_j+1}(I)) < b((a, a_0), R^{m_{j+1}}(I)) < b(V_{j+1}, R^{m_{j+1}}(I)), \qquad b(V_j, R^{m_j}(I)) < b(T, R^{m_j+1}(I)). \]

Lemma 2.2 applied to the intervals T ⊃ V_j′ ⊃ R^{m_j+1}(I) yields

\[ b(T, R^{m_j+1}(I)) < \lambda\, b(V_j', R^{m_j+1}(I)) < \lambda\, b(V_{j+1}, R^{m_{j+1}}(I)), \]

which combined with the other inequalities implies (3).

Conclusion. The estimate (3) implies an exponential decay of b(T, R^{m_{p−j}}(I)) with j. Since |J|/|T| < b(T, J), we also have an exponential decay of lengths of the intervals R^{m_{p−j}}(I). Consequently, there exists a positive constant C_1 so that

\[ \sum_{j=0}^{p+1} |R^{m_j}(I)|^{\alpha} < C_1\, b(T, R^m(I))^{\alpha}\, |T|^{\alpha}, \qquad \sum_{j=0}^{p+1} b(T, R^{m_j}(I))^{\alpha} < C_1\, b(T, R^m(I))^{\alpha}. \tag{4} \]

We see that the only missing estimate is a universal bound on

\[ \sum_{i=m_j+1}^{m_{j+1}} |R^i(I)|^{\alpha}. \]

Clearly, we may assume that m_{j+1} > m_j + 1. Recall that V_j′ is either (a_i, a_{i+1}) or (a_i′, a_{i+1}′), where i = m_{j+1} − m_j − 2. Hence R^l(V_j′) = (a_{i−l}, a_{i−l+1}) for l = 1, . . . , i. Therefore, by the property of expanding cross-ratios, for the same values of l, we have that

\[ |R^{l+m_j+1}(I)|^{\alpha} < |R^l(V_j')|^{\alpha}\, b(R^l(V_j'), R^{l+m_j+1}(I))^{\alpha} < |a_{i-l} - a_{i-l+1}|^{\alpha}\, b(V_{j+1}, R^{m_{j+1}}(I))^{\alpha}. \]

The above inequality and the estimates of Propositions 1 and 2 yield (for convenience we put a_{−1} = a)

\[ \sum_{i=m_j+1}^{m_{j+1}} |R^i(I)|^{\alpha} < b(V_{j+1}, R^{m_{j+1}}(I))^{\alpha} \sum_{i=0}^{m_{j+1}-m_j} |a_i - a_{i-1}|^{\alpha} < C_2\, b(V_{j+1}, R^{m_{j+1}}(I))^{\alpha}\, |T|^{\alpha}. \tag{5} \]

Combining (4) and (5), we obtain the assertion of the lemma.

Estimates for induced box mappings. Suppose that a box mapping R : U → T is the first entry map of a unimodal map f to a regularly returning interval T. Then every branch of R is an iterate of f, i.e. for every connected component I of U, one has R | I = f^{n(I)}, where n(I) is a positive integer. We say that a box mapping R is induced by f.

Definition 3. An induced box mapping R is τ-extendible if there exists an interval T′ so that T is τ-nested inside T′ and every branch of R with the domain I ⊂ T of U has an extension mapping diffeomorphically over T′.

An inductive parameter. Let a box mapping R : U → T be induced by f. We define

\[ \rho^{\alpha}(f, R) = \sup_{V} \sum_{i=0}^{n(V)} |f^i(V)|^{\alpha}, \]

where the supremum is taken over connected components of the domain of R outside of T which contain points from the critical orbit, i.e. {f^i(ξ), i > 0}. The next lemma is a counterpart of Lemma 4.1 for induced box mappings.

Lemma 4.2. Let R : U → T be the first entry map of an S-unimodal map f to a regularly returning interval T. Suppose that R satisfies the conditions (i)-(iii) from Lemma 4.1 and is τ-extendible. Let E be an escaping time and J_i, i = 0, . . . , E − 1, be induced central intervals. Then for every α > 1/2 there exists K > 1 (depending on α, δ, κ and τ) with the following property: if I is an interval such that
• f^m is monotone on I and f^m(I) ⊂ T,
• f^i(I) is disjoint from J_{E−1} for i = 0, . . . , m − 1,
• f^µ(I) is disjoint from J_0, where µ is the greatest integer smaller than m such that f^µ(I) ⊂ T (we set µ = 0 if f^i(I) ∩ T = ∅ for i = 0, . . . , m − 1),
then

\[ \sum_{i=0}^{m} |f^i(I)|^{\alpha} < K \rho^{\alpha}(f, R)\, b(T, f^m(I))^{\alpha}. \]

Proof. Let 1 ≤ m_1 < m_2 < · · · < m be the iterates when the interval I is mapped into T, f^{m_j}(I) ⊂ T. Denote by V_j an interval such that f^{m_j+1}(I) ⊂ V_j and f^{m_{j+1}−m_j−1}(V_j) = T. Let I ⊂ V_0. Since f has negative Schwarzian derivative and the range of f^{m_{j+1}−m_j−1} restricted to V_j has a definite extension, the distortion of f^{m_{j+1}−m_j−1} on V_j is bounded by a constant which depends only on τ-extendibility of R. Hence, for i = m_{j−1} + 1, . . . , m_j,

\[ \frac{|f^i(I)|^{\alpha}}{|f^{i-m_{j-1}-1}(V_{j-1})|^{\alpha}} < C_3 \frac{|f^{m_j}(I)|^{\alpha}}{|T|^{\alpha}}, \]

where C_3 depends only on τ. Therefore,

\[ \sum_{i=m_{j-1}+1}^{m_j} |f^i(I)|^{\alpha} < C_3 \frac{|f^{m_j}(I)|^{\alpha}}{|T|^{\alpha}} \sum_{i=m_{j-1}+1}^{m_j} |f^{i-m_{j-1}-1}(V_{j-1})|^{\alpha} < C_3\, \rho^{\alpha}(f, R)\, \frac{|f^{m_j}(I)|^{\alpha}}{|T|^{\alpha}}. \]

Lemma 4.2 follows from Lemma 4.1 and the above inequality.


Corollary 1. Suppose that R : U → T is as in Lemma 4.2. Then there is a constant K depending only on α, δ, κ, and τ such that if R′ is the first entry map of f to J_{E−1}, then

\[ \rho^{\alpha}(f, R') < K \rho^{\alpha}(f, R)\, \frac{|J_{E-1}|^{\alpha}}{|T|^{\alpha}}. \]

Proof. By Lemma 4.2, ρ^α(f, R′) < K ρ^α(f, R) b(T, J_{E−1})^α. Since J is τ-nested inside T, both components of T \ J_{E−1} have length greater than C_4 |T|. Hence, there exists a constant C_5 which depends only on τ so that

\[ b(T, J_{E-1})^{\alpha} < C_5 \frac{|J_{E-1}|^{\alpha}}{|T|^{\alpha}}. \]

5. Canonical Inducing

Let f : M → M be an S-unimodal map without periodic attractors. To begin the construction, look for an open interval T_0 ⊂ M which contains ξ and is regularly returning. There is a canonical way of finding an initial T_0. Note first that the fixed point q which is in the interior of M has another preimage q′ < 0. The interval (q′, q) can be taken as T_0. Let R_0 be the first entry map of f into T_0. The central domain of R_0 can coincide with T_0. In this case the map f has a restrictive interval of period 2. If the central domain of R_0 does not coincide with T_0, then it is compactly contained in T_0. Let E be an escaping time of R_0 and let J_{E−1} be an induced central domain for R_0. Note that J_{E−1} is regularly returning. If E is infinite, then f has a restrictive interval properly contained in the central domain of R_0. If E is finite, set T_1 = J_{E−1} and take the first entry map R_1 of f into T_1. We can repeat this construction inductively and get a canonical induced sequence of box mappings R_l : U_l → T_l. This sequence is finite iff either f has a restrictive interval or ξ is not recurrent.

Initial estimates. Suppose that f is an S-unimodal map with all periodic points repelling. If f has a restrictive interval then a canonical sequence of induced box mappings R_l is finite. We will analyze the last box mapping R, termed terminal, in this sequence. Let J be the central domain of R. The escaping time E of R is infinite, and hence

\[ M' = \bigcap_{i=1}^{\infty} (R\,|\,J)^{-i}(J) \]

is a restrictive interval for f. From now on we call M′ a canonical restrictive interval.

Lemma 5.1. Let f : M → M be an S-unimodal map with all periodic points repelling and M′ be a canonical restrictive interval for f. Let R : U → T be a terminal box mapping from the canonical sequence induced by f and suppose that U is τ-nested inside T. If the central branch of R is a restriction of f^n then there exists K < 1 which depends only on τ so that

\[ \sum_{i=1}^{n} |f^i(M')| < K |M|. \]

Proof. Let V be a component of U containing f(M′). Then f^{n−1}(V) = T. Moreover, f^{n−1}(f(M′)) ⊂ M′ ⊂ J. The map f^i on f^{n−1−i}(V) has negative Schwarzian derivative and J is τ-nested inside T, hence there exists a constant K < 1 depending only on τ such that |f^i(M′)| < K |f^{i−1}(V)|, i = 1, . . . , n. The orbit of V is disjoint because R is the first return map and consequently,

\[ \sum_{i=1}^{n} |f^{i-1}(V)| < |M|. \]

It is easy to see that there are exactly two repelling fixed points of the terminal box mapping R : U → T in the canonical restrictive interval M′, but only one, say r, is in the interior of M′. Let r′ ∈ M′ be its symmetrical point, i.e. R(r′) = R(r) = r. The interval T′ := (r, r′) is regularly returning.

Lemma 5.2. For every λ > 1 and α > 0 there exist constants C > 0 and κ > 1 with the following property. Suppose that R : U → T is a terminal box mapping induced by f and M′ is a canonical restrictive interval. Let T′ = (r, r′) and R′ : U′ → T′ be the first entry map of R into T′. If multipliers of the fixed points of R | M′ are larger than λ and R | M′ is κ-symmetric then ρ^α(R | M′, R′) < C |M′|^α.

Proof. Let W be a connected component of M′ \ {r} not containing the critical point. If V is a connected component of the domain of R′, then it is easy to see that either R(V) = T′ or R^k(V) ⊂ W for some k ∈ {0, 1, 2}. Thus, in order to prove the lemma we can assume that V ⊂ W. In this case there is k such that R^i(V) ⊂ W for i = 0, . . . , k and R^{k+1}(V) = T′. Take κ = 2λ/(1 + λ). Then the absolute values of DR at the boundary points of W are larger than (1 + λ)/2. Due to the minimum principle this implies that the absolute value of DR | W is bigger than (1 + λ)/2. Hence there exists a constant C > 0 depending on λ and α such that

\[ \sum_{i=0}^{k} |R^i(V)|^{\alpha} < C |M'|^{\alpha}. \]

6. Universal Estimates

Suppose that M′ is a canonical restrictive interval of f of period n. Then f^n | M′ is a unimodal map and we have a sequence of canonically induced box mappings R_{l,M′} : U_l → T_l. We say that Y is a canonically induced interval if there exists a canonical restrictive interval M′ and a box mapping R_{l,M′} such that Y = T_l for some l.

Proposition 3. There exist positive constants τ, λ > 1 and δ with the following properties. Let f : M → M be an S-unimodal map with a non-degenerate critical point and infinitely many canonical restrictive intervals M_k of different periods n_k. Then for every κ > 1 there exists k_0 so that for every canonically induced interval Y ⊂ M_{k_0} we have that
• the first entry map R of f to Y is τ-extendible and the central domain J of R is τ-nested inside Y,
• R is κ-symmetric,
• S(R | J)(x) |J|^2 < −δ for every x ∈ J,
• the absolute value of a multiplier of every periodic point of R is larger than λ.

The hypothesis of Proposition 3 that f has infinitely many restrictive intervals is redundant. It is enough to assume that the length of Y is small enough. The proof of Proposition 3 is based on a few lemmas, which use only the small size of Y.

Lemma 6.1. For a positive integer n, let f be a C^{n+1} unimodal map from a closed interval I, normalized so that the critical point is at 0 and f(0) = 1 is a local maximum. Suppose that D^2 f(0) < 0. Then f(x) can be expressed as 1 − (h(x))^2, where h(x) is a C^n increasing diffeomorphism of I onto its image and h(0) = 0.

Proof. The map h is uniquely determined: h(x) = √(1 − f(x)), where positive values of the root are chosen for positive x and negative for negative x. Denoting θ(x) = (1 − f(x))/x^2 we can write h(x) = x √θ(x). We see that θ is C^{n−1} in a neighborhood of 0, C^n in a punctured neighborhood with the n-th derivative bounded as o(|x|^{−1}). Hence h is C^n.
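The factorization in Lemma 6.1 can be checked numerically on a concrete map. The following sketch assumes the example f(x) = 1 − x^2 + 0.3 x^3 on [−1, 1], which is not taken from the paper; it has its critical point at 0 with f(0) = 1 and D^2 f(0) < 0. The names `f`, `h` and `h_via_theta` are hypothetical. The code builds h with the sign convention of the proof and compares it with the closed form x √θ(x), where here θ(x) = 1 − 0.3x.

```python
import math


def f(x):
    # Assumed example map, normalized so that f(0) = 1 is the maximum on [-1, 1].
    return 1.0 - x * x + 0.3 * x ** 3


def h(x):
    # Signed square root of 1 - f(x): positive for x > 0, negative for x < 0.
    # max(0.0, ...) guards against tiny negative rounding errors near x = 0.
    return math.copysign(math.sqrt(max(0.0, 1.0 - f(x))), x)


def h_via_theta(x):
    # Closed form h(x) = x * sqrt(theta(x)) with theta(x) = (1 - f(x)) / x^2 = 1 - 0.3 x.
    return x * math.sqrt(1.0 - 0.3 * x)


grid = [i / 100.0 for i in range(-100, 101)]
max_factor_error = max(abs(1.0 - h(x) ** 2 - f(x)) for x in grid)
max_form_error = max(abs(h(x) - h_via_theta(x)) for x in grid)
is_increasing = all(h(a) < h(b) for a, b in zip(grid, grid[1:]))
```

On this grid the two expressions for h agree up to rounding and h is strictly increasing, as the lemma predicts for this particular f.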

By Lemma 6.1 the central branch can be put in the desired form with the sacrifice of one order of smoothness.

Lemma 6.2. Suppose that f is a C^2 unimodal map with non-degenerate critical point ξ. Then for every κ > 1 there is ε > 0 such that if Y is a regularly returning interval of size smaller than ε, then f^n | M is κ-symmetric.

This is a straightforward application of the previous lemma.

Lemma 6.3. Let f be a C^3 unimodal map with non-degenerate critical point ξ. There exists ε > 0 so that for every regularly returning interval Y with length smaller than ε the following holds: let ψ : Y → Y be the central branch of the first return map to Y. Then Sψ(x) |Y|^2 < −1/3.

Proof. By calculus, for every C^3 unimodal map f with non-degenerate critical point there exists η > 0 such that for every x ∈ (−η, η), Sf(x) < −(1/3)|x|^{−2}. If ψ is a restriction of f^n and Y ⊂ (−η, η) then the composition law for the Schwarzian derivative implies that

\[ S\psi(x)\, |Y|^2 \le Sf(x)\, |x|^2 + Sf^{n-1}(f(x))\, |Df(x)|^2\, |Y|^2 \le -1/3. \]

Lemma 6.4 (Real bounds). For every κ > 0 there exists τ > 0 with the following property. Let f be a κ-symmetric S-unimodal map with a non-degenerate critical point and infinitely many different periods of renormalization. If Y is a canonically induced interval then the first entry map R to Y is τ -extendible and the central domain of R is τ -nested inside Y . Proof. This is a well-known fact. We refer the reader, for example, to Theorem 3.2 of [17] from which a numerical constant for the extendibility can be readily produced. The τ -nestedness is a direct consequence of the uniform extendibility and the uniform separation from 1 of the absolute values of multipliers of periodic points of f , see [14], p. 268.

Proof of Proposition 3. Observe that lengths of restrictive intervals Mk go to 0 with k since otherwise f would have a wandering interval. The first claim of Proposition 3 follows from Lemmas 6.4 and 6.2. The other two claims are obtained from Lemmas 6.2 and 6.3 with δ = 1/3. For the last claim see [14], p. 268.



Uniform decay of geometry. Suppose that the critical orbit of an S-unimodal map f is recurrent. Consider a canonical induced sequence of box mappings R_l : U_l → T_l. An induced sequence R_l : U_l → T_l shows the decay of geometry provided that

\[ \omega_l := \log \frac{|T_l|}{|T_{l+1}|} \ge C\, l \]

for some C > 0 (which depends only on ω_0) and all l for which R_l is well-defined.

Lemma 6.5. Let f be as in Proposition 3. Then there exists k_0 so that for every k > k_0 if R_l, l = 0, . . . , m, is a canonical induced sequence for f_k := f^{n_k} | M_k then it shows a uniform decay of geometry.

Proof. We will need a simplified version of Theorem 5 of [5]: Suppose that f is an S-unimodal map with recurrent critical orbit or restrictive interval. Then for every ε > 0, there is a regularly returning interval Y for f which contains ξ and such that if Y′ denotes the connected component of the domain of the first entry map into Y containing ξ, then |Y| ≥ K, where K > 0 depends only on ε and not on f, and either Y′ is a restrictive interval or |Y′|/dist(Y′, ∂Y) < ε. Now from this (see Corollary 1 in [5]) and Proposition 3 it follows that for any ε > 0 there exists a universal number l_0 so that ω_l < ε provided l > l_0. The starting condition of [10] is satisfied and the uniform decay of geometry follows from [10].

Lemma 6.6. For every α > 1/2 there exists a constant C > 0 with the following property. Let f be as in Proposition 3 and R_l, l = 0, . . . , m, be a canonical induced sequence for f_k := f^{n_k} | M_k. There exists k_0 such that for every k ≥ k_0, ρ^α(f_k, R_m) < C ρ^α(f_k, R_0).

Proof. Proposition 3 implies that the constants K produced by Corollary 1 applied to R_i, i = 0, . . . , m − 1, are all bounded by a universal constant if only k is large enough. Hence, by Corollary 1,

\[ \rho^{\alpha}(f_k, R_m) < K^m \rho^{\alpha}(f_k, R_0)\, \frac{|T_m|^{\alpha}}{|T_0|^{\alpha}}. \]

Combining Proposition 3 and Lemma 6.5, we see that the lengths of T_m decay superexponentially fast in a uniform fashion. Consequently, the quantities K^m |T_m|^α / |T_0|^α remain uniformly bounded by a constant independent of m and tend to 0 when m → ∞.

Proof of Theorem 1. We may assume that f has negative Schwarzian derivative. Let M_k be a sequence of restrictive intervals of periods n_k and denote by f_k the map f^{n_k} | M_k : M_k → M_k. Suppose that k > k_0, where k_0 is a constant supplied by Proposition 3. In view of Proposition 3, Lemma 5.1 and combined Lemmas 6.6 and 5.2 imply respectively

\[ \sum_{i=1}^{n_k/n_{k-1}} |f_{k-1}^i(M_k)| < C_6 |M_{k-1}|, \qquad \sum_{i=1}^{n_k/n_{k-1}} |f_{k-1}^i(M_k)|^{\alpha} < C_7 |M_{k-1}|^{\alpha}, \]

where C_6 < 1 and C_7 are universal constants if we set α to be some number in (1/2, 1). All the intervals f_{k−1}^i(M_k) belong to M_{k−1} for i = 1, . . . , n_k/n_{k−1}. The distortion of the map f^{n_{k−1}−j} | f^j(M_{k−1}), j = 1, . . . , n_{k−1}, is bounded by some universal constant, hence we have

\[ \sum_{i=1}^{n_k} |f^i(M_k)| < C_8 \sum_{i=1}^{n_{k-1}} |f^i(M_{k-1})|, \qquad \sum_{i=1}^{n_k} |f^i(M_k)|^{\alpha} < C_9 \sum_{i=1}^{n_{k-1}} |f^i(M_{k-1})|^{\alpha}, \]

where C_8 < 1. Every σ ∈ (α, 1) can be represented as (1/µ) + (1/µ′)α, where 1/µ + 1/µ′ = 1. Using the Hölder inequality with exponents µ and µ′, we obtain that

\[ \sum_{i=1}^{n_k} |f^i(M_k)|^{\sigma} = \sum_{i=1}^{n_k} |f^i(M_k)|^{1/\mu}\, |f^i(M_k)|^{\alpha/\mu'} \]

1. The Sol-manifolds with negative λ are covered by those with positive eigenvalues. Together with (x, y, z) we shall use another coordinate system (u, v, z) on M_A^3, where (u, v) are linear coordinates on the fibres related to a positively oriented eigenbasis of A. The transformation T_A in these coordinates is given by

\[ \begin{pmatrix} u \\ v \\ z \end{pmatrix} \longrightarrow \begin{pmatrix} \lambda u \\ \lambda^{-1} v \\ z + 1 \end{pmatrix}. \tag{2} \]

One should note that unlike (x, y), the new coordinates (u, v) are not periodic on the tori T^2 anymore: two pairs (u, v), (u′, v′) define the same point on T^2 if and only if (u − u′, v − v′) = k(c_1^1, c_1^2) + m(c_2^1, c_2^2), where k, m ∈ Z and e_1 = (c_1^1, c_1^2), e_2 = (c_2^1, c_2^2) is the basis of the lattice associated to T^2:

\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = \begin{pmatrix} c_1^1 & c_2^1 \\ c_1^2 & c_2^2 \end{pmatrix}^{-1} \begin{pmatrix} \lambda & 0 \\ 0 & \lambda^{-1} \end{pmatrix} \begin{pmatrix} c_1^1 & c_2^1 \\ c_1^2 & c_2^2 \end{pmatrix}. \]

The Riemannian metrics on Sol-manifolds come from right-invariant metrics on the universal covering of M_A^3, which has the natural structure of a solvable Lie group Sol. Topologically this group is R^3 with a multiplication of the form

\[ (u, v, w) * (u', v', w') = (u + e^{w} u',\; v + e^{-w} v',\; w + w'). \]

One can realise it as the group of 3 × 3 matrices of the form

\[ \begin{pmatrix} e^{w} & 0 & u \\ 0 & e^{-w} & v \\ 0 & 0 & 1 \end{pmatrix}. \]

The Sol-manifolds M_A^3 we consider are the quotients of the group Sol by the discrete subgroups G_A corresponding to w = m ln λ, m ∈ Z and (u, v) = k e_1 + l e_2 belonging to the integer lattice described above, z = w / ln λ. The right-invariant metrics on the group Sol correspond to the following class of metrics on the Sol-manifold M_A:

\[ ds^2 = \alpha(z)\, dx^2 + 2\beta(z)\, dx\, dy + \gamma(z)\, dy^2 + dz^2, \tag{3} \]

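A.V. Bolsinov, H.R. Dullin, A.P. Veselov

where

\[ \begin{pmatrix} \alpha(z) & \beta(z) \\ \beta(z) & \gamma(z) \end{pmatrix} = \exp(-zB^{\top}) \begin{pmatrix} \alpha & \beta \\ \beta & \gamma \end{pmatrix} \exp(-zB). \]

Here α, β, γ are real parameters with the only condition that the form ds^2 = α dx^2 + 2β dx dy + γ dy^2 is positive definite and B is defined by the relation exp B = A:

\[ B = \begin{pmatrix} c_1^1 & c_2^1 \\ c_1^2 & c_2^2 \end{pmatrix}^{-1} \begin{pmatrix} \ln\lambda & 0 \\ 0 & -\ln\lambda \end{pmatrix} \begin{pmatrix} c_1^1 & c_2^1 \\ c_1^2 & c_2^2 \end{pmatrix}. \]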
One can consider a more general metric allowing a constant coefficient at dz^2 but this will lead only to a general scaling.

3. Geodesic Flows on Sol-Manifolds: Integrals and Hamiltonian Monodromy

Thus, the Hamiltonian of the geodesic flow on M_A^3 in (u, v, z)-coordinates can be written as

\[ H = \frac{1}{2}\left( E e^{2z\ln\lambda} p_u^2 + 2F p_u p_v + G e^{-2z\ln\lambda} p_v^2 \right) + \frac{1}{2} p_z^2, \]

where E, F, G are real parameters: E > 0, G > 0, EG − F^2 > 0. It is invariant under the following transformation:

\[ T_A^{*} : (u, v, z, p_u, p_v, p_z) \longrightarrow (\lambda u,\; \lambda^{-1} v,\; z + 1,\; \lambda^{-1} p_u,\; \lambda p_v,\; p_z), \tag{4} \]

and, of course, under the translations by the elements of the lattice. The same property must be satisfied for any smooth function on T^*M_A^3, in particular, for the first integrals of the geodesic flow. Since H depends neither on u, nor on v, the corresponding momenta p_u and p_v are local first integrals of the geodesic flow. However, being not invariant under (4), they are not well defined on the cotangent bundle T^*M_A^3. That is why, to get global first integrals, we need to replace p_u, p_v by two smooth functions f_1(p_u, p_v), f_2(p_u, p_v) invariant under the transformation (p_u, p_v) → (λ^{−1} p_u, λ p_v) (or, speaking in more general terms, by the invariants of the Z-action on the cotangent plane generated by the hyperbolic linear transformation A^{−1}). One invariant function is evident: Q = p_u p_v. To find another one we introduce the following expression which will be useful also in the future:

\[ \alpha = \frac{\ln\left( \sqrt{E/G}\; p_u / p_v \right)}{2 \ln\lambda}. \tag{5} \]

Under the transformation (4) α changes in a very simple way: α(p_u, p_v) → α(p_u, p_v) − 1.
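The invariance statements above are easy to test numerically. The following sketch evaluates H, Q and α at a sample point and at its image under the transformation (4). The parameter values E, F, G, the choice λ = (3 + √5)/2 (the larger eigenvalue of the matrix with rows (2, 1) and (1, 1)), the sample point, and all function names are assumptions made for illustration; the point is chosen with p_u, p_v > 0 so that the logarithm in (5) is defined.

```python
import math

LAM = (3.0 + math.sqrt(5.0)) / 2.0   # assumed eigenvalue lambda > 1
E, F, G = 2.0, 0.5, 1.5              # assumed metric parameters, E*G - F**2 > 0


def hamiltonian(u, v, z, pu, pv, pz):
    s = math.exp(2.0 * z * math.log(LAM))
    return 0.5 * (E * s * pu ** 2 + 2.0 * F * pu * pv + G * pv ** 2 / s) + 0.5 * pz ** 2


def t_a_star(u, v, z, pu, pv, pz):
    """Transformation (4) on the cotangent bundle."""
    return (LAM * u, v / LAM, z + 1.0, pu / LAM, LAM * pv, pz)


def q_integral(pu, pv):
    return pu * pv


def alpha(pu, pv):
    """Expression (5), for pu/pv > 0."""
    return math.log(math.sqrt(E / G) * pu / pv) / (2.0 * math.log(LAM))


point = (0.2, -0.7, 0.35, 0.8, 1.3, -0.4)
image = t_a_star(*point)

h_change = hamiltonian(*image) - hamiltonian(*point)
q_change = q_integral(image[3], image[4]) - q_integral(point[3], point[4])
alpha_shift = alpha(image[3], image[4]) - alpha(point[3], point[4])
```

Up to rounding, `h_change` and `q_change` vanish while `alpha_shift` equals −1, so any function of Q and of α mod 1 descends to the quotient.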

Spectra of Sol-Manifolds


Thus, as a second integral we can take any function of α with period 1, for instance, cos(2πα) or sin(2πα). However, these functions are not smooth at p_u = 0 and p_v = 0. To avoid this difficulty and to get the first integrals in a more symmetric form we put:

\[ f_1 = R(Q) \cos 2\pi\alpha, \qquad f_2 = R(Q) \sin 2\pi\alpha, \qquad \text{where } R(Q) = |Q| \exp\left( -\frac{1}{Q^2} \right). \]

Remark. The fact that the second integral is not analytic is not accidental: a theorem proved by Taimanov [28] implies that Sol-manifolds do not admit integrable geodesic flows with analytic integrals (see [5] for more details).

We are going to show now that one can see the topological structure of Sol-manifolds by looking at the Hamiltonian monodromy of the geodesic flow. For that we will have to investigate the bifurcation diagram (i.e. the set of critical values) of the momentum mapping restricted to the isoenergy surface E_A^5 = {H = 1}:

\[ F_A = (f_1, f_2) : E_A^5 \to \mathbb{R}^2. \tag{6} \]

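Proposition 1. The set of critical points of the momentum mapping F_A consists of five parts:

a) four one-parameter families L_i (i = 1, . . . , 4) of (degenerate) 2-dimensional tori lying in the cotangent bundle given by (α is a parameter):

\[ z = -\alpha, \quad u \text{ and } v \text{ are arbitrary}, \quad p_z = 0, \quad p_u = \pm \sqrt{ \frac{e^{2\alpha\ln\lambda}}{E\left(1 + \frac{F}{\sqrt{EG}}\right)} }, \quad p_v = \pm \sqrt{ \frac{e^{-2\alpha\ln\lambda}}{G\left(1 + \frac{F}{\sqrt{EG}}\right)} }; \]

and

\[ z = -\alpha, \quad u \text{ and } v \text{ are arbitrary}, \quad p_z = 0, \quad p_u = \pm \sqrt{ \frac{e^{2\alpha\ln\lambda}}{E\left(1 - \frac{F}{\sqrt{EG}}\right)} }, \quad p_v = \mp \sqrt{ \frac{e^{-2\alpha\ln\lambda}}{G\left(1 - \frac{F}{\sqrt{EG}}\right)} }, \]

b) the critical set N given by the equation Q = p_u p_v = 0.

The bifurcation diagram of F_A consists of two circles

\[ \{ f_1^2 + f_2^2 = R^2(Q_{+}^{*}) \} = F_A(L_i), \quad i = 1, 2, \qquad \{ f_1^2 + f_2^2 = R^2(Q_{-}^{*}) \} = F_A(L_i), \quad i = 3, 4, \]

where Q_±^* = (F ± √(EG))^{−1}, and the point (0, 0) = F_A(N), the centre of these circles.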


Proof. We are interested in the singularities of F_A or, which is the same, those of the Liouville foliation. These singularities can be of two types. To explain their nature we first consider the geodesic flow on the covering manifold M̃^3. On this (non-compact) manifold the integrals of the flow are simply p_u and p_v. Consider the Liouville foliation for this covering system. Its singular leaves correspond to the critical points of the momentum mapping F̃ = (p_u, p_v) : Ẽ^5 → R^2, where Ẽ^5 = {H = 1} ⊂ T^*M̃. Obviously, these leaves remain singular after the natural projection Ẽ^5 → E_A^5. These are singularities of the first type. On the other hand some new singularities appear since instead of p_u and p_v we have to consider more complicated functions f_1 and f_2. In other words, these are singularities of the map (p_u, p_v) → (f_1, f_2). Let us treat both cases in turn.

It is easily seen that p_u and p_v are functionally dependent, as functions on Ẽ^5 = {H = 1}, if and only if two conditions are simultaneously satisfied: 1) ∂H/∂p_z = p_z = 0 and 2) ∂H/∂z = 0. Taking into account the condition H = 1, we obtain a system of equations

\[ E e^{2z\ln\lambda} p_u^2 - G e^{-2z\ln\lambda} p_v^2 = 0, \qquad E e^{2z\ln\lambda} p_u^2 + 2F p_u p_v + G e^{-2z\ln\lambda} p_v^2 = 2. \]

The first equation gives

\[ z = -\frac{\ln\left( \sqrt{E/G}\; p_u / p_v \right)}{2\ln\lambda} = -\alpha. \]

Now solving this system for p_u and p_v (after substituting z = −α), we find four distinct solutions:

\[ \begin{aligned}
1)\;& p_u = \frac{e^{\alpha\ln\lambda}}{\sqrt{E\left(1 + \frac{F}{\sqrt{EG}}\right)}}, & p_v &= \frac{e^{-\alpha\ln\lambda}}{\sqrt{G\left(1 + \frac{F}{\sqrt{EG}}\right)}}; \\
2)\;& p_u = \frac{-e^{\alpha\ln\lambda}}{\sqrt{E\left(1 + \frac{F}{\sqrt{EG}}\right)}}, & p_v &= \frac{-e^{-\alpha\ln\lambda}}{\sqrt{G\left(1 + \frac{F}{\sqrt{EG}}\right)}}; \\
3)\;& p_u = \frac{e^{\alpha\ln\lambda}}{\sqrt{E\left(1 - \frac{F}{\sqrt{EG}}\right)}}, & p_v &= \frac{-e^{-\alpha\ln\lambda}}{\sqrt{G\left(1 - \frac{F}{\sqrt{EG}}\right)}}; \\
4)\;& p_u = \frac{-e^{\alpha\ln\lambda}}{\sqrt{E\left(1 - \frac{F}{\sqrt{EG}}\right)}}, & p_v &= \frac{e^{-\alpha\ln\lambda}}{\sqrt{G\left(1 - \frac{F}{\sqrt{EG}}\right)}}.
\end{aligned} \]

Thus, for each value of α we obtain four 2-dimensional invariant tori in T^*M_A^3. All of them are diffeomorphically projected onto the same T^2-fibre T^2_{−α} = {z = const = −α} ⊂ M. Varying α, we obtain 4 families of degenerate Liouville 2-tori L_i, i = 1, . . . , 4. It is easy to verify that for each family L_i the value of Q = p_u p_v is constant and equal to Q_+^* = (F + √(EG))^{−1} for L_1 and L_2, and equal to Q_−^* = (F − √(EG))^{−1} for L_3 and L_4. Hence the image of L_1 and L_2 is the circle f_1^2 + f_2^2 = R^2(Q_+^*), and analogously the image of L_3 and L_4 is the other circle f_1^2 + f_2^2 = R^2(Q_−^*), as required.

Fig. 1. The topological structure of the singular leaf is M_A^3 × K where K is given in the figure

The singularities of the second type come from those of the mapping (p_u, p_v) → (f_1, f_2). It can be easily seen that the critical points of this mapping are defined by the equation Q = p_u p_v = 0. This implies immediately f_1 = f_2 = 0 which gives a single point on the bifurcation diagram, namely the centre of the circles.

Notice that topologically the subset N = {Q = p_u p_v = 0, H = 1} ⊂ T^*M_A^3 is homeomorphic to the direct product M_A^3 × K, where K is a graph that consists of two vertices and four segments connecting them (see Fig. 1). This follows from the parallelizability of M_A^3 and the simple observation that in each cotangent space the conditions p_u p_v = 0, H = 1 define a graph homeomorphic to K: two circles {G p_u^2 + (1/2) p_z^2 = 1} and {E p_v^2 + (1/2) p_z^2 = 1} intersecting at p_u = p_v = 0, p_z = ±1.

Now we are able to describe the global structure of the foliation of the isoenergy surface E_A^5 into Liouville tori. If we remove the singular set from the isoenergy surface we obtain four families of 3-dimensional Liouville tori distinguished from each other by signs of p_u and p_v:
a) p_u > 0, p_v > 0;
b) p_u < 0, p_v < 0;
c) p_u > 0, p_v < 0;
d) p_u < 0, p_v > 0.

The families a) and b) are isomorphic (more precisely, they transform into each other by the globally defined time reversal automorphism of the geodesic flow (u, v, z, p_u, p_v, p_z) → (u, v, z, −p_u, −p_v, −p_z)). The same is true for the families c) and d). Each Liouville 3-torus is uniquely determined by the values of two integrals Q and α mod 1, where the values of Q form the interval (0, Q_+^*) in the first two cases and (Q_−^*, 0) for the other two cases. In particular, in each of the cases, the base of the T^3-foliation is homeomorphic to a punctured disc. As Q → 0, the Liouville torus approaches the singular set N. As Q → Q_±^*, the torus shrinks into one of the degenerate 2-tori described above.

Thus, the base of the global Liouville foliation on E_A^5 = {H = 1} can be considered as four discs glued together at their centres. All interior points of these discs except the centre correspond one-to-one to regular 3-dimensional Liouville tori, the boundary circles of the discs correspond to the families L_i of degenerate 2-tori, and finally, the common centre of the discs corresponds to the singular set N.

The image of each family under the momentum map is a 2-disc with the centre removed. This is exactly the situation when we can talk about Hamiltonian monodromy [9].


Theorem 1. For each family of Liouville 3-tori there exists a basis of cycles in the first homology group of the tori in which the Hamiltonian monodromy has the matrix

\[ \begin{pmatrix} A & 0 \\ 0 & 1 \end{pmatrix}. \]

Proof. This fact can be observed in different ways. We shall follow the definition of Hamiltonian monodromy and will explicitly compute the deformation of Liouville tori and the final gluing map. Consider an arbitrary Liouville 3-torus T^3 = T^3_{Q_0,α_0}. In coordinates, this torus is given by three conditions:

\[ \begin{aligned}
& E e^{2z\ln\lambda} p_u^2 + 2F p_u p_v + G e^{-2z\ln\lambda} p_v^2 + p_z^2 = 2, \\
& Q(p_u, p_v) = p_u p_v = Q_0, \\
& \alpha(p_u, p_v) = \frac{\ln\left( \sqrt{E/G}\; p_u / p_v \right)}{2\ln\lambda} = \alpha_0 \mod 1.
\end{aligned} \tag{7} \]
More precisely, these conditions define a disjoint union of two or four tori, which differ from each other by the signs of the momenta p_u and p_v. We consider one of them, T^3_{Q_0,α_0}, by putting for definiteness p_u > 0, p_v > 0. For our purposes we first need to explain why the above conditions indeed define a three-dimensional torus and to describe the basic cycles on this torus.

Notice that the common level set (7) of the first integrals can be regarded from two slightly different points of view: as a subset in T^*M̃^3 and as a subset in T^*M_A^3. However one can show that the natural projection T^*M̃^3 → T^*M_A^3 restricted to this level set is a diffeomorphism (no points are glued between them). Thus, in fact there is no real difference between these two points of view. In particular, instead of the conditions p_u p_v = Q_0, α(p_u, p_v) = α_0 mod 1 we may simply assume that the momenta p_u, p_v themselves are constant. Then the conditions (7) can be rewritten as:

\[ p_u = \mathrm{const}, \qquad p_v = \mathrm{const}, \qquad u \text{ and } v \text{ are arbitrary}, \]

and

\[ c_1 \cosh\bigl( 2\ln\lambda\, (z + \alpha_0) \bigr) + p_z^2 = c_2, \tag{8} \]

where c_1 = 2√(EG) |Q_0|, c_2 = 2 − 2F Q_0. We see that the variables separate and the fact that this system defines a 3-torus becomes evident. Indeed, the variables u, v run over a two-dimensional torus and the last equation defines a simple closed curve on the plane R^2(z, p_z). In other words, we have a natural splitting of T^3_{Q_0,α_0} into the direct product T^2 × S^1. Thus, as basic cycles on T^3_{Q_0,α_0} we can take the cycles on T^2(u, v) related to the original coordinate system (x, y) (see above) and the third cycle defined by (8).

Now let us look at what happens to this torus if we change the parameters Q_0 and α_0 in such a way that the point F_A(T^3_{Q_0,α_0}) moves inside the image of the momentum mapping around the singular point F_A(N) = (0, 0). It is easy to see that this deformation just means that we change the value of α, while Q can be chosen to remain constant:

\[ Q(t) = Q_0, \qquad \alpha(t) = \alpha_0 + t, \qquad t \in [0, 1]. \]
Consider the family of mappings φt (u, v, z, pu , pv , pz ) = (u, v, z − t, et ln λ pu , e−t ln λ pv , pz ).


591

It is not hard to see that the image of TQ3 0 ,α0 under φt is exactly TQ3 0 ,α0 +t and φt : TQ3 0 ,α0 → TQ3 0 ,α0 +t is a diffeomorphism. In other words, φt defines the deformation of Liouville tori we need. At the moment t = 1 the torus comes back to the initial position, i.e., TQ3 0 ,α0 = 3 TQ0 ,α0 +1 , and we obtain the monodromy map φ1 : TQ3 0 ,α0 → TQ3 0 ,α0 = TQ3 0 ,α0 +1 . Now our goal is to describe the corresponding automorphism of the first homology group: φ1 ∗ : H1 (TQ3 0 ,α0 ) = Z3 → H1 (TQ3 0 ,α0 ) = Z3 . Using the identification (4) we see that the map φt can be rewritten as follows:     λu u  v   λ−1 v       z   z  φt   =  .  pu   pu  p   p  v v pz pz We see that the only transformation is related to the variables u and v. Moreover, this transformation is exactly the original hyperbolic automorphism A : T 2 → T 2 . Taking into account the natural splitting TQ3 0 ,α0 = T 2 (u, v) × S 1 (z, pz ) we conclude immediately that the monodromy matrix in the chosen basis is A0 . 0 1
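The last step of the proof can be traced numerically: composing the time-one deformation φ_1 with the identification (4) should act as (u, v) → (λu, λ^{−1}v) and leave z and the momenta unchanged. The sketch below checks this for one sample point and then assembles the 3 × 3 block matrix for a sample A. The matrix A with rows (2, 1) and (1, 1), its eigenvalue λ, the sample point and the function names are all assumptions chosen for illustration.

```python
import math

A = [[2, 1], [1, 1]]                  # assumed hyperbolic matrix in SL(2, Z)
LAM = (3.0 + math.sqrt(5.0)) / 2.0    # its eigenvalue lambda > 1


def phi(t, point):
    """Deformation phi_t of the Liouville tori."""
    u, v, z, pu, pv, pz = point
    scale = math.exp(t * math.log(LAM))
    return (u, v, z - t, scale * pu, pv / scale, pz)


def t_a_star(point):
    """Identification (4) on the cotangent bundle."""
    u, v, z, pu, pv, pz = point
    return (LAM * u, v / LAM, z + 1.0, pu / LAM, LAM * pv, pz)


def monodromy_matrix(a):
    """Block matrix diag(A, 1) acting on H_1(T^2 x S^1) = Z^3."""
    return [[a[0][0], a[0][1], 0],
            [a[1][0], a[1][1], 0],
            [0, 0, 1]]


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


point = (0.4, -0.3, 0.25, 0.9, 1.1, 0.2)
image = t_a_star(phi(1.0, point))
expected = (LAM * 0.4, -0.3 / LAM, 0.25, 0.9, 1.1, 0.2)
max_deviation = max(abs(a - b) for a, b in zip(image, expected))

M = monodromy_matrix(A)
```

The matrix `M` has determinant 1 and fixes the third basis cycle, in agreement with the splitting T^2 × S^1 used in the proof.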

We conclude this section with the discussion of the geodesics on Sol-manifolds. They have different properties depending on the types of leaves of the Liouville foliation which they belong to.

First consider the geodesics lying on Liouville tori of dimension three. They are characterized by the property that all momenta p_u, p_v and p_z differ from zero. More precisely, the signs of p_u and p_v always remain the same, whereas the sign of p_z changes. This happens when z reaches the value

\[ z_{\pm} = \frac{ \pm \cosh^{-1}\left( \dfrac{h - F p_u p_v}{\sqrt{EG}\, |p_u p_v|} \right) }{2\ln\lambda} - \alpha(p_u, p_v), \]

where h denotes the value of the energy H. The two levels z = z_+ and z = z_− are exactly the caustics of the Liouville torus that contains a given geodesic. The situation is quite similar to that on a surface of revolution where the motion takes place between two levels of z. It is easy to see that the distance between these levels z_+ − z_− tends to infinity as p_u p_v tends to zero. From this it follows that the corresponding geodesics rotate many times (along the base S^1), then turn back, after this go in the opposite direction, then turn back and so on. As p_u p_v tends to zero the number of rotations in one direction until turning back (or, which is the same, between two caustics) increases up to infinity.
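The behaviour of the caustics can be explored with a short computation. The sketch below evaluates the two levels z_± for given momenta and checks that p_z vanishes there, i.e. that E e^{2z ln λ} p_u^2 + 2F p_u p_v + G e^{−2z ln λ} p_v^2 = 2h at z = z_±. The parameter values, the use of |p_u / p_v| inside the logarithm, and the function names are assumptions made for this illustration.

```python
import math

LAM = (3.0 + math.sqrt(5.0)) / 2.0   # assumed eigenvalue lambda > 1
E, F, G = 2.0, 0.5, 1.5              # assumed metric parameters, E*G - F**2 > 0


def caustic_levels(pu, pv, h=1.0):
    """Return (z_minus, z_plus) for a geodesic with momenta pu, pv on the level H = h."""
    q = pu * pv
    ratio = (h - F * q) / (math.sqrt(E * G) * abs(q))
    if ratio < 1.0:
        raise ValueError("no regular Liouville torus for these momenta")
    half_width = math.acosh(ratio) / (2.0 * math.log(LAM))
    alpha = math.log(math.sqrt(E / G) * abs(pu / pv)) / (2.0 * math.log(LAM))
    return -half_width - alpha, half_width - alpha


def kinetic_part(z, pu, pv):
    """The part of 2H that does not involve p_z."""
    s = math.exp(2.0 * z * math.log(LAM))
    return E * s * pu ** 2 + 2.0 * F * pu * pv + G * pv ** 2 / s


z_minus, z_plus = caustic_levels(0.3, 0.2)
residuals = [abs(kinetic_part(z, 0.3, 0.2) - 2.0) for z in (z_minus, z_plus)]

# Distance between the caustics for pu = pv = s as s decreases.
widths = []
for s in (0.3, 0.1, 0.03):
    lo, hi = caustic_levels(s, s)
    widths.append(hi - lo)
```

For these sample values the widths grow as p_u p_v decreases, which is the numerical counterpart of the statement that z_+ − z_− tends to infinity.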

592


If p_u p_v = 0, then we are on the singular level. The corresponding geodesics have the following behaviour. If both p_u and p_v vanish, then we obtain the family of geodesics u = const, v = const, z = t. Such geodesics obviously form an invariant submanifold N_+ in T^*M which is diffeomorphic to M. Exactly on this submanifold the geodesic flow is chaotic and has positive entropy. Indeed, the time-one map transforms each fibre T_z^2 into itself by means of the hyperbolic automorphism A. As is well known, the entropy of A : T^2 → T^2 is ln λ > 0. There is another invariant submanifold N_− with the same properties formed by vertical geodesics going in the opposite direction: u = const, v = const, z = −t. From the viewpoint of the ambient geodesic flow N_+ and N_− are hyperbolic invariant subsets. The stable manifold corresponding to N_+ is given by p_v = 0, the unstable one is p_u = 0. For N_− the stable and unstable manifolds interchange. The geodesics satisfying the condition p_v = 0 asymptotically approach N_+ as t → +∞, in particular, p_z → +1. But there is t = t_0 when p_z changes sign so that for t → −∞ the geodesic approaches N_−. The geodesics satisfying p_u = 0 behave in the opposite way. In slightly other terms this structure can be described as follows: there are two hyperbolic submanifolds diffeomorphic to M_A^3, which are connected by four 4-dimensional separatrices, see Fig. 1.

Finally we would like to mention an interesting phenomenon which one would not expect from an integrable geodesic flow on a compact manifold. Namely, one of the action integrals diverges as the integral Q → 0 with the energy fixed (see the calculations and footnote in Sect. 7). A better known scenario would be that, when approaching the singular level, some of the cycles of the Liouville tori shrink so the actions stay finite. The fact that this is not true for Sol-manifolds when one approaches the singular (chaotic) level demonstrates once again the peculiar nature of this system.
To discuss the quantum case we will need some facts from classical number theory, which we present in the next section.

4. SL(2, Z) and Binary Quadratic Forms

The content of this section is well-known (see e.g. [20, 21, 24]). Let

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \in SL(2, \mathbb{Z})$$

be an integer hyperbolic matrix. Hyperbolicity as before means that its eigenvalues are real and distinct. We would like to consider $A$ as an automorphism of the lattice $L = \mathbb{Z} \oplus \mathbb{Z} \subset \mathbb{R}^2$ by choosing some basis $e_1, e_2$ in this lattice. For any such $A$ we can define the following integer binary quadratic form $Q_A$ by the formula

$$Av \wedge v = Q_A(v)\, e_1 \wedge e_2, \tag{9}$$

where $v$ is a vector from $\mathbb{R}^2$. Explicitly, if $v = xe_1 + ye_2$ then

$$Q_A(x, y) = \det\begin{pmatrix} a_{11}x + a_{12}y & x \\ a_{21}x + a_{22}y & y \end{pmatrix} = -a_{21}x^2 + (a_{11} - a_{22})xy + a_{12}y^2. \tag{10}$$
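Formula (10) is easy to experiment with. The following is a minimal sketch, not part of the original text: the function names `quadratic_form`, `evaluate` and `apply` are hypothetical helpers introduced here, and the cat-map matrix with rows (2, 1) and (1, 1) is used only as a sample input. It reads off the coefficients of $Q_A$ and lets one check the value $Q_A(Av)$ against $Q_A(v)$ on a few lattice points.

```python
def quadratic_form(a11, a12, a21, a22):
    """Coefficients (a, b, c) of Q_A(x, y) = a x^2 + b x y + c y^2, read off from (10)."""
    return (-a21, a11 - a22, a12)


def evaluate(form, x, y):
    """Value of the binary quadratic form at the point (x, y)."""
    a, b, c = form
    return a * x * x + b * x * y + c * y * y


def apply(matrix, x, y):
    """Image of the vector (x, y) under the 2x2 matrix given as (a11, a12, a21, a22)."""
    a11, a12, a21, a22 = matrix
    return (a11 * x + a12 * y, a21 * x + a22 * y)


# Sample input (an assumption of this sketch): the cat-map matrix.
cat = (2, 1, 1, 1)
q_cat = quadratic_form(*cat)

# Compare Q_A(Av) with Q_A(v) on a small box of lattice points.
invariant = all(
    evaluate(q_cat, *apply(cat, x, y)) == evaluate(q_cat, x, y)
    for x in range(-5, 6)
    for y in range(-5, 6)
)
print(q_cat, invariant)
```

The usual discriminant $b^2 - 4ac$ of the resulting triple can be compared with the square of the trace of the matrix minus 4.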



It is easy to see from the definition that this form is invariant under the action of $A$: $Q_A(Av) = Q_A(v)$. Notice that $Q_A$ has the discriminant

$$D = (a_{11} - a_{22})^2 + 4a_{12}a_{21} = (a_{11} + a_{22})^2 - 4(a_{11}a_{22} - a_{12}a_{21}) = (a_{11} + a_{22})^2 - 4,$$

which is exactly the discriminant of the characteristic equation of $A$: $\lambda^2 - (a_{11} + a_{22})\lambda + 1 = 0$. In particular, since $A$ is hyperbolic the form $Q_A$ is indefinite. Note that the discriminant $D$ cannot be a perfect square. In general the coefficients of the quadratic form $Q_A$ may have a common factor. Let

$$\hat{Q}_A(x, y) = ax^2 + bxy + cy^2 \tag{11}$$

be its primitive form after division of $Q_A$ by the largest common factor. It is defined only up to a sign. Thus to each integer unimodular hyperbolic matrix $A$ we relate an indefinite integer primitive quadratic form $\hat{Q}_A$.

Conversely, suppose we have such a form $Q(x, y) = ax^2 + bxy + cy^2$. We would like to describe all $A$ from $SL(2, \mathbb{Z})$ which preserve this form. Such $A$ are called the automorphs of $Q$. Let $d = b^2 - 4ac$ be the discriminant of $Q$, which we assume not to be a perfect square, and consider the corresponding Diophantine equation called Pell's equation:

$$X^2 - dY^2 = 4. \tag{12}$$

Then the group of automorphs consists of matrices of the form

$$A = \pm\begin{pmatrix} \dfrac{X - bY}{2} & -cY \\ aY & \dfrac{X + bY}{2} \end{pmatrix},$$

where $(X, Y)$ are the solutions of Pell's equation. Modulo $\pm I$ this group is cyclic with generator

$$A_0 = \begin{pmatrix} \dfrac{X_0 - bY_0}{2} & -cY_0 \\ aY_0 & \dfrac{X_0 + bY_0}{2} \end{pmatrix}, \tag{13}$$

where $(X_0, Y_0)$ is the fundamental solution of this equation. Recall that $(X_0, Y_0)$ is the fundamental solution of Pell's equation if $X_0 > 0$, $Y_0 > 0$ and $X_0 + \sqrt{d}\,Y_0$ is minimal among all such solutions. The classical result about Pell's equation says that all other solutions can be found from the relation

$$\frac{X + \sqrt{d}\,Y}{2} = \pm\left(\frac{X_0 + \sqrt{d}\,Y_0}{2}\right)^n,$$



where $n = 0, 1, \ldots$. One can find the fundamental solution from the continued fraction of $\sqrt{d}$. This structure of the solutions of Pell's equation induces the cyclic group structure for the automorphs. Notice that the form $Q_A$ corresponding to the matrix (13) has, up to sign, the form $Q = Y_0(ax^2 + bxy + cy^2)$.

Let us call a hyperbolic element $A$ from $SL(2, \mathbb{Z})$ primitive if it cannot be represented as a power of any other element from $SL(2, \mathbb{Z})$. Thus we have described a natural correspondence between the primitive binary indefinite forms $Q$ and primitive elements $A$ from $SL(2, \mathbb{Z})$. In particular, it helps us to decide whether a given integer unimodular matrix $A$ is primitive and, if not, which power of a primitive matrix it is.

5. Spectrum and Eigenfunctions of the Laplace-Beltrami Operator

Let us now discuss the quantum geodesic problem on the Sol-manifold $M_A^3$:

$$-\Delta\psi = E\psi, \tag{14}$$

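where $\Delta$ is the Laplace-Beltrami operator on $M_A^3$ and $\psi = \psi(P, E)$, $P \in M_A^3$. In coordinates $(u, v, z)$ the Laplace-Beltrami operator has the following explicit form:

$$\Delta = E e^{2z\ln\lambda}\frac{\partial^2}{\partial u^2} + 2F\frac{\partial^2}{\partial u\,\partial v} + G e^{-2z\ln\lambda}\frac{\partial^2}{\partial v^2} + \frac{\partial^2}{\partial z^2}. \tag{15}$$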

This is a self-adjoint operator in the Hilbert space $L^2(M_A^3)$, where the integration measure on $M_A^3$ is induced by the Riemannian metric (3). In both the $(x, y, z)$ and $(u, v, z)$ coordinate systems the corresponding measure $d\mu$ is proportional to the standard Lebesgue measure on $\mathbb{R}^3$. Because the coefficients of $\Delta$ depend only on $z$ it is quite natural to separate variables and look for the eigenfunctions of $\Delta$ of the form

$$\Psi_\gamma(u, v, z) = e^{2\pi i(\gamma, w)} f(z),$$

where $\gamma$ is an element of the dual lattice $\Gamma^*$ corresponding to the $T^2$-fibres and $w = (u, v)$ (so the scalar product $(\gamma, w)$ is defined modulo $\mathbb{Z}$). By substituting into the Schrödinger equation (14), (15) we get

$$\Delta\Psi_\gamma = \left[\frac{\partial^2 f}{\partial z^2} - 8\pi^2\sqrt{EG}\,|Q(\gamma)|\left(\cosh 2\ln\lambda\,(z + \alpha(\gamma)) + \frac{F\,\mathrm{sgn}\,Q(\gamma)}{\sqrt{EG}}\right) f\right] e^{2\pi i(\gamma, w)},$$

where $Q(\gamma) = (\gamma, e_u)(\gamma, e_v)$ is a quadratic form on the lattice $\Gamma^*$, and

$$\alpha(\gamma) = \frac{1}{2\ln\lambda}\,\ln\frac{\sqrt{E}\,|(\gamma, e_u)|}{\sqrt{G}\,|(\gamma, e_v)|}.$$

Here $e_u$ and $e_v$ are the eigenvectors of $A$ related to the eigenvalues $\lambda$ and $\lambda^{-1}$ respectively, and the basis $e_u, e_v$ is assumed to be positively oriented. Notice that $\alpha$ is the same as before in (5) if we replace $p_u$ by $(\gamma, e_u)$ and $p_v$ by $(\gamma, e_v)$.
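The passage from the two exponentials in (15) to a single hyperbolic cosine rests on an elementary identity. As a sketch of this step, write $a = (\gamma, e_u)$, $b = (\gamma, e_v)$ and $\mu = \ln\lambda$ (shorthand introduced here only for this computation, with $a$ and $b$ assumed non-zero), and choose $\alpha$ so that $e^{2\mu\alpha} = \sqrt{E}\,|a| / (\sqrt{G}\,|b|)$. Then

$$E e^{2\mu z} a^2 + G e^{-2\mu z} b^2 = \sqrt{EG}\,|ab|\left(e^{2\mu(z + \alpha)} + e^{-2\mu(z + \alpha)}\right) = 2\sqrt{EG}\,|ab|\cosh 2\mu(z + \alpha).$$

Each derivative in $u$ or $v$ acting on $e^{2\pi i(\gamma, w)}$ contributes a factor $2\pi i a$ or $2\pi i b$, so the $u$, $v$ part of (15) produces

$$-4\pi^2\left(E e^{2\mu z} a^2 + 2F ab + G e^{-2\mu z} b^2\right) = -8\pi^2\sqrt{EG}\,|ab|\left(\cosh 2\mu(z + \alpha) + \frac{F\,\mathrm{sgn}(ab)}{\sqrt{EG}}\right),$$

which, with $Q(\gamma) = ab$, is the coefficient appearing in front of $f$ above.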



To clarify the meaning of the coefficient in front of the cosh let us consider the basis $e_u^*, e_v^*$ in $\mathbb{R}^{2*}$ dual to $e_u, e_v$. The vectors $e_u^*$ and $e_v^*$ are also eigenvectors of $A^*$ with the eigenvalues $\lambda$ and $\lambda^{-1}$ respectively. By definition we have $\gamma = (\gamma, e_u)e_u^* + (\gamma, e_v)e_v^*$. Since $Q(\gamma)$ is obviously invariant under the action of $A^*$ it is natural to compare it with the binary form $Q_{A^*}$ defined in Sect. 4. We have

$$A^*\gamma \wedge \gamma = \left(\lambda(\gamma, e_u)e_u^* + \lambda^{-1}(\gamma, e_v)e_v^*\right) \wedge \left((\gamma, e_u)e_u^* + (\gamma, e_v)e_v^*\right) = (\gamma, e_u)(\gamma, e_v)(\lambda - \lambda^{-1})\, e_u^* \wedge e_v^*.$$

Let $l_1, l_2$ be a positively oriented basis in the dual lattice $\Gamma^*$; then by definition $A^*\gamma \wedge \gamma = Q_{A^*}(\gamma)\, l_1 \wedge l_2$. From these calculations and from the equalities $E = |e_u^*|^2$, $G = |e_v^*|^2$ it follows that

$$\sqrt{EG}\,|Q(\gamma)| = c\,|Q_{A^*}(\gamma)|,$$

where

$$c = c(A; E, F, G) = \frac{A(\Gamma^*)}{\sqrt{D}\sin\theta} = \frac{1}{\sqrt{D}\,A(T^2)\sin\theta}. \tag{16}$$

Here $A(\Gamma^*)$ is the area of the dual basic parallelogram $(e_1^*, e_2^*)$ (which is the inverse of the area of the fibre $T^2$), $D = (\lambda - \lambda^{-1})^2$ is the discriminant of the characteristic equation of the matrix $A$ (or equivalently $A^*$), and $\theta$ is the angle between $e_u^*$ and $e_v^*$. Thus we have proved the following

Proposition 2. A function $\Psi = e^{2\pi i(\gamma, w)} f(z)$ satisfies Eq. (14) if and only if $f(z)$ satisfies the modified Mathieu equation

$$\left(-\frac{d^2}{dz^2} + |\nu(\gamma)|\cosh 2\mu(z + \alpha(\gamma))\right) f(z) = \Lambda f(z), \tag{17}$$

where $\mu = \ln\lambda$, $\nu(\gamma) = 8\pi^2 c\,Q_{A^*}(\gamma)$ and $\alpha(\gamma)$ is given above. The eigenvalues $E$ and $\Lambda$ are related by the shift

$$E = \Lambda + \nu(\gamma)\cos\theta. \tag{18}$$

Recall that the modified Mathieu equation is the cosh-version of the standard Mathieu equation

$$\frac{d^2y}{dx^2} + (a\cos 2\mu x + b)\,y = 0.$$

Its solutions are known as modified Mathieu functions (see e.g. [33, 34]). They appear also in the theory of Coulomb spheroidal functions [17], where one can find some related numerical results (see also [18]). Let $\Lambda = \Lambda_k(\nu)$, $k = 1, 2, \ldots$ be the spectrum of the corresponding modified Mathieu operator

$$M = -\frac{d^2}{dz^2} + |\nu|\cosh 2\mu z$$

and $f_{\gamma,k}(z)$ be the corresponding solutions of (17).
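The low-lying part of the spectrum of $M$ can be approximated numerically. The sketch below is only an illustration under stated assumptions and is not the method of the cited references: it replaces $M$ by a second-order finite-difference matrix on a truncated interval $[-w, w]$ with Dirichlet boundary conditions and locates eigenvalues by bisection on the Sturm count of the resulting tridiagonal matrix. The function name `modified_mathieu_levels`, the default interval half-width and the grid size are hypothetical choices; the levels are returned in increasing order starting from index 0.

```python
import math


def modified_mathieu_levels(nu, mu, count=3, half_width=3.0, points=600):
    """Lowest `count` eigenvalues of -d^2/dz^2 + |nu| cosh(2 mu z), discretised on
    [-half_width, half_width] with zero boundary values (finite differences)."""
    step = 2.0 * half_width / (points + 1)
    # Diagonal of the tridiagonal matrix; the off-diagonal entries are -1/step^2.
    diagonal = [
        2.0 / step**2 + abs(nu) * math.cosh(2.0 * mu * (-half_width + (i + 1) * step))
        for i in range(points)
    ]
    off_squared = 1.0 / step**4

    def count_below(x):
        """Sturm count: number of matrix eigenvalues strictly below x."""
        negatives = 0
        pivot = 1.0
        for i, d in enumerate(diagonal):
            pivot = d - x - (off_squared / pivot if i > 0 else 0.0)
            if pivot == 0.0:
                pivot = -1e-200  # nudge an exact zero pivot off the singular value
            if pivot < 0.0:
                negatives += 1
        return negatives

    lower = abs(nu)                          # the potential is bounded below by |nu|
    upper = max(diagonal) + 2.0 / step**2    # Gershgorin bound for the matrix
    levels = []
    for k in range(count):
        lo, hi = lower, upper
        for _ in range(60):                  # bisection for the k-th eigenvalue
            mid = 0.5 * (lo + hi)
            if count_below(mid) > k:
                hi = mid
            else:
                lo = mid
        levels.append(0.5 * (lo + hi))
    return levels


levels = modified_mathieu_levels(100.0, 1.0)
print(levels)
```

The truncation is adequate only when the potential at the ends of the interval is much larger than the levels of interest; for small $|\nu|$ the interval has to be widened accordingly.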



Thus, to each element $\gamma$ of the dual lattice $\Gamma^*$ we associate the functions $\Psi_{\gamma,k}(u, v, z) = e^{2\pi i(\gamma, w)} f_{\gamma,k}(z)$. The problem with these functions is that they are well defined on the covering space $\tilde{M}^3 = T^2 \times \mathbb{R}$ but not on the Sol-manifold $M_A^3$ itself, because they are not invariant with respect to the transformation (1), (2). One can try to construct the genuine eigenfunctions of $\Delta$ on $M_A^3$ by averaging these functions with respect to the action of $\mathbb{Z}$ on $\tilde{M}^3$ generated by this transformation. It turns out that the averaging procedure works. To show this let us consider instead of $\Psi_{\gamma,k}(u, v, z)$ the following sum:

$$\hat{\Psi}_{\gamma,k} = \sum_{n\in\mathbb{Z}} \Psi_{\gamma,k}(\lambda^n u, \lambda^{-n} v, z + n) = \sum_{n\in\mathbb{Z}} \Psi_{A^{*n}\gamma,k}(u, v, z). \tag{19}$$

Because of the fast decay of the eigenfunctions $f_{\gamma,k}(z)$ this sum is absolutely convergent. It is easy to see that it defines a function on $M_A^3$ which is an eigenfunction of the Laplace-Beltrami operator $\Delta$. The eigenfunctions $\hat{\Psi}_{\gamma,k}(u, v, z)$ on $M_A^3$ actually depend only on the orbits $[\gamma] = \{A^{*n}(\gamma)\}_{n\in\mathbb{Z}}$ with respect to the action of $A^*$ on $\Gamma^*$: $\hat{\Psi}_{\gamma,k}(u, v, z) = \Psi_{[\gamma],k}(u, v, z)$. We should also consider separately the eigenfunctions related to $\gamma = 0$. It is easy to see that the corresponding eigenfunctions have the very simple form

$$\Psi_{0,l} = e^{2\pi i l z}, \qquad l \in \mathbb{Z}, \tag{20}$$

with the eigenvalues $E_l = (2\pi)^2 l^2$.

Theorem 2. The eigenfunctions of the Laplace-Beltrami operator $\Psi_{[\gamma],k}(u, v, z)$, $[\gamma] \in (\Gamma^* \setminus \{0\})/A^*$, and $\Psi_{0,l}(z)$ form a complete basis in $L^2(M_A^3)$.

Proof. The independence and orthogonality of these functions are obvious. The only thing we have to verify is the completeness. To prove this we need to show that any smooth function $\Phi : M_A^3 \to \mathbb{R}$ which is orthogonal to each eigenfunction from the list is, in fact, zero. Consider such a function $\Phi(w, z)$ on the covering space $\tilde{M}^3$ and expand it as a Fourier series (with respect to $w$):

$$\Phi(w, z) = \sum_{\gamma\in\Gamma^*} e^{2\pi i(\gamma, w)} a_\gamma(z)$$

with some smooth coefficients $a_\gamma(z)$, $z \in \mathbb{R}$.

Lemma 1. For all $\gamma \neq 0$ the functions $a_\gamma(z)$ have fast decay at infinity and thus belong to $L^2(\mathbb{R})$.

Proof. Since $\Phi$ is invariant with respect to the transformation (1), we have $\Phi(w, z) = \Phi(Aw, z + 1)$. Hence

$$\sum_{\gamma\in\Gamma^*} e^{2\pi i(\gamma, w)} a_\gamma(z) = \sum_{\gamma\in\Gamma^*} e^{2\pi i(\gamma, Aw)} a_\gamma(z + 1) = \sum_{\gamma\in\Gamma^*} e^{2\pi i(A^*\gamma, w)} a_\gamma(z + 1).$$

Thus the Fourier coefficients satisfy the following property: $a_\gamma(z + 1) = a_{A^*\gamma}(z)$,



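or, more generally,

$$a_\gamma(z + n) = a_{A^{*n}\gamma}(z), \qquad n \in \mathbb{Z}.$$

Since the Fourier coefficients $a_\gamma$ of a smooth function decay fast for large $\gamma$, and $A^{*k}\gamma$ for $\gamma \neq 0$ tends to infinity, we see that the functions $a_\gamma(z)$ decay very fast and thus belong to $L^2(\mathbb{R})$.

Now suppose that $\Phi(w, z)$ is orthogonal to the eigenfunction $\Psi_{[\gamma_0],k}(u, v, z) = \sum_{n\in\mathbb{Z}} \Psi_{A^{*n}\gamma_0,k}(u, v, z)$. Since the measure on $M_A^3$ is proportional to the standard Lebesgue measure $du\,dv\,dz$ we have

$$\begin{aligned}
0 = \langle \Phi(w, z), \Psi_{[\gamma_0],k}(w, z)\rangle &= \int_{M_A^3} \Phi(w, z)\,\bar{\Psi}_{[\gamma_0],k}(w, z)\,d\sigma(w, z) \\
&= \int_0^1 \sum_{\gamma\in\Gamma^*}\sum_{n\in\mathbb{Z}} \left(\int_{T^2} e^{2\pi i(\gamma, w)} e^{-2\pi i(A^{*n}\gamma_0, w)}\,du\,dv\right) a_\gamma(z)\, f_{A^{*n}\gamma_0,k}(z)\,dz \\
&= \int_0^1 \sum_{n\in\mathbb{Z}} \left(\int_{T^2} e^{2\pi i(A^{*n}\gamma_0, w)} e^{-2\pi i(A^{*n}\gamma_0, w)}\,du\,dv\right) a_{A^{*n}\gamma_0}(z)\, f_{A^{*n}\gamma_0,k}(z)\,dz \\
&= A(T^2)\int_0^1 \sum_{n\in\mathbb{Z}} a_{A^{*n}\gamma_0}(z)\, f_{A^{*n}\gamma_0,k}(z)\,dz.
\end{aligned}$$

We now use the property that $f_{\gamma,k}(z + n) = f_{A^{*n}\gamma,k}(z)$ and $a_\gamma(z + n) = a_{A^{*n}\gamma}(z)$, $n \in \mathbb{Z}$, to conclude that

$$\int_0^1 \sum_{n\in\mathbb{Z}} a_{A^{*n}\gamma_0}(z)\, f_{A^{*n}\gamma_0,k}(z)\,dz = \int_0^1 \sum_{n\in\mathbb{Z}} a_{\gamma_0}(z + n)\, f_{\gamma_0,k}(z + n)\,dz = \int_{-\infty}^{+\infty} a_{\gamma_0}(z)\, f_{\gamma_0,k}(z)\,dz.$$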

Thus, the Fourier coefficients $a_{\gamma_0}(z)$ for $\gamma_0 \neq 0$ belong to $L^2(\mathbb{R})$ and at the same time are orthogonal to all the functions $f_{\gamma_0,k}(z)$, which form a complete basis in $L^2(\mathbb{R})$. Hence for $\gamma_0 \neq 0$ the coefficients $a_{\gamma_0}(z) \equiv 0$. This means that the function $\Phi$ must be of the form $\Phi(w, z) = a(z)$, where $a(z)$ is periodic with period 1. Now using orthogonality to the functions (20) we conclude that $a(z)$ must be identically zero.

Corollary 1. The spectrum of the Laplace-Beltrami operator on Sol-manifolds consists of two parts: the trivial part

$$E = E_l = 4l^2\pi^2, \qquad l = 0, 1, \ldots$$

corresponding to the eigenfunctions (20), and the non-trivial part

$$E = E_{k,[\gamma]} = \Lambda_k(\nu([\gamma])) + \nu([\gamma])\cos\theta, \qquad k = 1, 2, \ldots, \quad [\gamma] \in (\Gamma^* \setminus \{0\})/A^*$$

related to the modified Mathieu equation (17). The multiplicities of the trivial eigenvalues are 2, except for the ground state $E = 0$ which has multiplicity 1. The multiplicities of the non-trivial part of the spectrum are much more interesting and the answer depends on the arithmetical properties of the gluing map $A$. We discuss this in the next section.


A.V. Bolsinov, H.R. Dullin, A.P. Veselov

Fig. 2. Fundamental domain of the lattice in $(p_x, p_y)$ with $|Q| \le 30^2$ for the cat-map. The hyperbola $Q = -11^2$ illustrates the first example of a non-trivial degeneracy. (Axes: $p_x$, $p_y$ and the eigen-directions $p_u$, $p_v$; the hyperbola is labelled $Q = -121$.)

6. Multiplicities of the Eigenvalues and Number Theory

As one can see from the previous section, the eigenvalue of $\Psi_{[\gamma],k}(u, v, z)$ depends on $\gamma$ only via $Q_{A^*}(\gamma)$. Thus the calculation of the multiplicity¹ is reduced to the classical number theoretic problem of finding the number $N_Q(n)$ of integer solutions of the equation

$$Q(x, y) = ax^2 + bxy + cy^2 = n$$

for a primitive indefinite quadratic form $Q$, where solutions are counted modulo its automorphs. Figure 2 illustrates this for the cat-map $A$ with $Q = -x^2 + xy + y^2$. For forms $Q$ with certain discriminants there exists an effective formula which permits the computation of $N_Q(n)$.

To be more precise we need the following notion. We say that two forms $Q$ and $Q'$ are equivalent if there exists a transformation from $SL(2, \mathbb{Z})$ mapping one into the other. It is easy to see that two equivalent forms must have the same discriminant $d = b^2 - 4ac$. The converse is not true: there can be non-equivalent forms with the same discriminant, see the examples after Theorem 3. Let $h(d)$ be the number of classes of primitive forms with discriminant $d$. Note that the discriminant $d = b^2 - 4ac$ is always 0 or 1 modulo 4, and we assume as usual that it is not a perfect square.

¹ Here we consider generic values of the parameters to avoid additional accidental coincidences. Any such degeneracy can be removed by changing the metric such that $c$ in Eq. (16) and hence $\Lambda$ are constant while $\theta$ and hence $E$ are changed.



Remark. One should distinguish $h(d)$ and the class number of ideals in the quadratic number field $\mathbb{Q}(\sqrt{d})$. They coincide only if the so-called negative Pell equation $X^2 - dY^2 = -4$ has a solution; otherwise $h(d)$ is twice as big (see e.g. [15], Chapter 16). The last property can be reformulated in terms of the period of the continued fraction expansion of $\sqrt{d}$, but a more explicit description is unknown.

If $h(d) = 1$ then all forms with discriminant $d$ are equivalent. In that case there is the following remarkable formula for the number $N_Q(n)$ when $n$ is positive and coprime with $d$:

$$N_Q(n) = N_d(n) = \sum_{k|n}\left(\frac{d}{k}\right), \tag{21}$$

where the sum is taken over all divisors of $n$ and $\left(\frac{d}{k}\right)$ is the standard Kronecker symbol (see Landau [20], Chapter IV.4). The Kronecker symbol is a real character modulo $d$, which has the following properties determining it uniquely:

(1) if $d$ and $k$ are not coprime then $\left(\frac{d}{k}\right) = 0$;
(2) if $d$ and $k$ are coprime then $\left(\frac{d}{k}\right) = \pm 1$;
(3) $\left(\frac{d}{kl}\right) = \left(\frac{d}{k}\right)\left(\frac{d}{l}\right)$;
(4) for an odd prime $p$ which is not a divisor of $d$, $\left(\frac{d}{p}\right)$ coincides with the Legendre symbol, which is 1 if $d$ is a quadratic residue modulo $p$ and $-1$ otherwise;
(5) $\left(\frac{d}{2}\right)$ is 1 if $d$ has residue 1 modulo 8 and $-1$ if it has residue 5 modulo 8.

For its computation one can use the following generalisation of the Law of Quadratic Reciprocity: if $p, q$ are coprime positive odd numbers then

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}.$$

Here is the list of the discriminants $d$ up to 100 with $h(d) = 1$, see [15]:

5, 8, 13, 17, 20, 29, 37, 41, 52, 53, 61, 65, 68, 73, 89, 97.

It is believed that there are infinitely many fundamental discriminants with $h(d) = 1$, but it is still an open problem. Notice that for positive definite forms it is known that there are only 9 fundamental discriminants with $h(d) = 1$, as was conjectured by Gauss, namely

$d = -3, -4, -7, -8, -11, -19, -43, -67, -163$.

In general, if $h(d) > 1$ the right-hand side of the formula (21) gives the total number of representations of $n$ by all non-equivalent forms with discriminant $d$. An interesting case is when the ideal class number of $d$ is 1 but $h(d) = 2$. In that case we have only two non-equivalent forms with discriminant $d$, namely $Q$ and $-Q$, and the formula (21) gives the number of solutions of the equation $|Q| = n$. The first corresponding discriminants are:

12, 21, 24, 28, 32, 33, 44, 45, 48, 56, 57, 69, 72, 76, 77, 80, 84, 88, 92, 93

(see [15]). The only discriminants $< 100$ not listed in either table above are 40, 60, 85, 96 with $h(d) = 2, 4, 2, 4$, respectively. Note that most of these discriminants $d$ are not of



the form $D = t^2 - 4$, but they can still be obtained from $A \in SL_2(\mathbb{Z})$ because $D/d$ may be an arbitrary square.

Now we are ready to describe the multiplicities of the eigenvalues $E_{k,[\gamma]}$. First we should take into account that the gluing map $A \in SL(2, \mathbb{Z})$ and the corresponding form $Q = Q_{A^*}$ may be non-primitive. Let us define the positive integers $r = r(A)$ and $l = l(A)$ from the relations $A = A_0^r$ and $Q_{A^*} = l\,\hat{Q}$, where $A_0 \in SL(2, \mathbb{Z})$ and $\hat{Q} = \hat{Q}_{A^*}$ are primitive.

Theorem 3. The multiplicity $m$ of the eigenvalue $E_{k,[\gamma]}$ of the Laplace-Beltrami operator for generic values of the parameters in the metric (3) is

$$m(\gamma) = 2r(A)\,N_{Q^*}(n),$$

where $n = Q^*(\gamma) = l(A)^{-1} Q_{A^*}(\gamma)$. When the discriminant $d$ of the form $Q^*$ has class number 1 and $n$ is coprime with $d$ then $N_{Q^*}(n)$ can be computed using the formula (21).

Examples.

1. Let

$$A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$$

be the so-called cat-map. Then $A^* = A$ and $Q_A = Q_{A^*} = -(x^2 - xy - y^2)$ are both primitive. The discriminant $D = d = 5$ has class number 1, so one can use the formula (21) to compute the multiplicity of the corresponding $E_{k,[\gamma]}$. One can check that this leads to the formula

$$m = 2\left(N_{\pm 1}(n) - N_{\pm 2}(n)\right),$$

where $N_{\pm 1}(n)$ and $N_{\pm 2}(n)$ are the numbers of divisors of $n = Q_A(\gamma)$ which have respectively the residues $\pm 1$ and $\pm 2$ modulo 5. This example shows that the multiplicities of the eigenvalues can be as big as we like: for example, for $n = 11^M$ the multiplicity is $M + 1$; for $n$ a product of $M$ distinct primes all $\pm 1$ mod 5 the multiplicity is $2^M$.

2. The matrix

$$A = \begin{pmatrix} 1 & 3 \\ 3 & 10 \end{pmatrix}$$

corresponds to $d = 13$ with $D = 3^2 d$ and $Q_{A^*} = -3(x^2 - 3xy - y^2)$, so $l(A) = 3$.

3. For

$$A = \begin{pmatrix} 5 & 2 \\ 2 & 1 \end{pmatrix}$$

we have $d = 8$ with $D = 2^2 d$ and $Q_{A^*} = Q_A = -2(x^2 - 2xy - y^2)$, so $l(A) = 2$. This example shows that $l(A)$ in general is not directly related to the largest square divisor of $D$.

4. For

$$A = \begin{pmatrix} 1 & 3 \\ 1 & 4 \end{pmatrix},$$

$d = D = 21$ and $Q_{A^*} = -(3x^2 + 3xy - y^2)$, $l(A) = 1$. Here $h(d) = 2$, but the non-equivalent forms simply differ by a sign.



5. The matrices

$$A_1 = \begin{pmatrix} 1 & 6 \\ 6 & 37 \end{pmatrix} \quad\text{and}\quad A_2 = \begin{pmatrix} 7 & 18 \\ 12 & 31 \end{pmatrix}$$

correspond to $d = 40$ with $D = 6^2 d$, $l(A_i) = 6$. Then $Q_{A_1^*} = -6(x^2 + 6xy - y^2)$ and $Q_{A_2^*} = -6(3x^2 + 4xy - 2y^2)$ are the two corresponding (non-trivially) non-equivalent forms.

Remark. In the case when $h(d)$ is larger than 1, in general we do not have a simple formula for the multiplicities for a particular Sol-manifold but only for the disjoint union of Sol-manifolds with non-equivalent forms of given discriminant $d$. The fact that the multiplicities are large and not sensitive to the change of the parameters in the metric seems to be remarkable. A possible explanation of the rigidity of multiplicities for Sol-manifolds is in the hyperbolicity hidden in the topology of the manifolds.

Remark. The same numbers $N_Q(n)$ appear in the harmonic analysis on Sol-manifolds as the multiplicities of the irreducible Sol-representations in $C^\infty(M_A^3)$ (see Chapter 1 in [6]). Although this fact has a similar origin it does not explain the degeneracy of the spectrum of $\Delta$.

It is interesting to compare the Sol-case with the spectra of flat tori $T^2$. It is easy to see that in the latter case the answer depends drastically on the metric parameters (or equivalently, on the geometry of the basic parallelogram). For example, if it is a square then the spectrum up to a multiple is given by the values of the standard quadratic form $n = x^2 + y^2$, and the multiplicities are given by Gauss' famous formula

$$m = 4\left(N_1(n) - N_3(n)\right),$$

where $N_1(n)$ and $N_3(n)$ are the numbers of divisors of $n$ with the residues 1 and 3 modulo 4 respectively. If however the basic parallelogram is generic then all multiplicities are 2 (which is due to the central symmetry of the problem).

7. Semiclassical Analysis and Weyl's Law

In this section we use semiclassical arguments to investigate the spectrum of Sol-manifolds in two limiting cases when $c$ is either small (“fat Sol-manifold”) or large. In particular we will show that for fat Sol-manifolds one can “hear the values of the indefinite quadratic form $Q$”.
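Before turning to the semiclassical analysis, the two divisor-counting expressions quoted in Sect. 6 can be evaluated directly. The following is a minimal sketch, not taken from the original text: all function names are hypothetical, `cat_map_count` returns only the difference $N_{\pm 1}(n) - N_{\pm 2}(n)$ from Example 1 (without the prefactor in Theorem 3), and `brute_two_squares` is a naive cross-check added here for the Gauss formula.

```python
def divisors(n):
    """All positive divisors of n (naive trial division, fine for small n)."""
    return [k for k in range(1, n + 1) if n % k == 0]


def count_by_residue(n, modulus, residues):
    """Number of divisors of n whose residue modulo `modulus` lies in `residues`."""
    return sum(1 for k in divisors(n) if k % modulus in residues)


def cat_map_count(n):
    """N_{+-1}(n) - N_{+-2}(n): divisors congruent to +-1 mod 5 minus those congruent to +-2."""
    return count_by_residue(n, 5, {1, 4}) - count_by_residue(n, 5, {2, 3})


def gauss_two_squares(n):
    """Gauss' formula 4 (N_1(n) - N_3(n)) for the number of (x, y) with x^2 + y^2 = n."""
    return 4 * (count_by_residue(n, 4, {1}) - count_by_residue(n, 4, {3}))


def brute_two_squares(n):
    """Direct enumeration of integer pairs (x, y) with x^2 + y^2 = n."""
    r = int(n ** 0.5) + 1
    return sum(1 for x in range(-r, r + 1) for y in range(-r, r + 1) if x * x + y * y == n)


print(cat_map_count(11 ** 3), cat_map_count(11 * 19 * 29))
print(gauss_two_squares(25), brute_two_squares(25))
```

For $n = 11^M$ every divisor is congruent to 1 modulo 5, and for a product of $M$ distinct primes congruent to $\pm 1$ modulo 5 all $2^M$ divisors are congruent to $\pm 1$, which is the growth mentioned in Example 1.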
It is instructive to see first how our exact calculation of the spectrum agrees with the famous Weyl's law [32], which says that for a quantum system the number $N(\Lambda)$ of eigenvalues $E \le \Lambda$ for large $\Lambda$ is asymptotically equal (up to a factor $(2\pi)^{-n}$) to the volume of the domain in the classical phase space with energy less than $\Lambda$. For our Laplace-Beltrami operator (15) this means that

$$N(\Lambda) \sim \frac{4}{3}\pi\Lambda^{3/2}\,\frac{Vol(M_A^3)}{(2\pi)^3} = \frac{4}{3}\pi\Lambda^{3/2}\,\frac{A(T^2)}{(2\pi)^3}, \tag{22}$$

where $Vol(M_A^3)$ is the volume of our Sol-manifold (which equals the area of the fibre $A(T^2)$ since the length in the $z$-direction was assumed to be 1).



Let us now count the eigenvalues $E$ using the results of Sect. 5. Let us assume for simplicity that $\cos\theta = 0$, so besides the trivial part they coincide with the eigenvalues of the Mathieu operator

$$M = -\frac{d^2}{dz^2} + |\nu|\cosh 2\mu z, \tag{23}$$

where as before

$$\nu = \frac{8\pi^2 Q}{\sqrt{D}\,A(T^2)\sin\theta}, \tag{24}$$

and $Q$ is the corresponding binary quadratic form. First of all let us use the well-known fact from number theory (see e.g. [15]) that for large $Q_0$ the number of lattice points (modulo $A$) with values of $|Q|$ less than $Q_0$ is proportional to the area of the fundamental domain up to $Q_0$:

$$M(Q_0) \sim 4\frac{\mu}{\sqrt{D}}\,Q_0.$$

The factor of four counts lattice points related by the symmetry given by changing the sign of both $p_u$ and $p_v$ and also accounts for the states in the quadrants where $Q$ is of opposite sign. For a fixed value of $Q$ (and hence $\nu$) there is a whole line of eigenvalues of the Mathieu operator (23). The number of these eigenvalues up to energy $\Lambda$ for large $\Lambda$ is given asymptotically by the action integral

$$I(\Lambda, Q) = \frac{1}{2\pi}\oint\sqrt{\Lambda - |\nu|\cosh(2\mu z)}\,dz,$$

which is of course the area of the domain in the phase plane with energy less than $\Lambda$ divided by $2\pi$. This can be simplified to

$$\frac{2\pi\mu}{\sqrt{\Lambda}}\,I = \oint\sqrt{1 - g\cosh 2\zeta}\,d\zeta, \qquad g = |\nu|/\Lambda.$$

With $\xi = \cosh(2\zeta)$ this becomes a standard elliptic integral (see e.g. [33])

$$\frac{2\pi\mu}{\sqrt{\Lambda}}\,I = 2\int_1^{1/g}\frac{\sqrt{1 - g\xi}}{\sqrt{\xi^2 - 1}}\,d\xi = 4\sqrt{1 + g}\,\bigl(K(k) - E(k)\bigr), \qquad k^2 = \frac{1 - g}{1 + g}. \tag{25}$$

Let us denote this expression $f(g)$. Thus we see that the total number of states up to energy $\Lambda$ is

$$N(\Lambda) \sim \int M'(Q)\,I(\Lambda, Q)\,dQ = 2\,\frac{\Lambda^{3/2}}{C\mu\pi}\,\frac{\mu}{\sqrt{D}}\int_0^1 f(g)\,dg,$$

where $C = \dfrac{8\pi^2}{\sqrt{D}\,A(T^2)\sin\theta}$. The integral over $g$ is best performed by treating it as a double integral over $g$ and $\xi$. Introducing $\eta = g\xi$ and performing the $\eta$-integral first gives

$$\int_0^1 f(g)\,dg = \frac{4}{3}\int_1^\infty\frac{d\xi}{\xi\sqrt{\xi^2 - 1}} = \frac{2}{3}\pi.$$


Thus we have

$$N(\Lambda) \sim \Lambda^{3/2}\,\frac{4\pi}{3}\,\frac{A(T^2)\sin\theta}{(2\pi)^3}, \tag{26}$$

which agrees with Weyl's formula (22) when $\theta = \pi/2$. For general $\theta$ we have $E = \Lambda + \nu\cos\theta$ by (18). In that case we need to compute

$$X_\pm = \int\!\!\oint\sqrt{1 - g(\cosh 2z \pm \cos\theta)}\,dz\,dg$$

over the domain $0 \le g \le 1/(1 \pm \cos\theta)$ and $|z| \le z_0$, where $z_0$ is the smallest positive root of the integrand. The transformation

$$\eta = g(\cosh 2z \pm \cos\theta), \qquad \xi = \cosh 2z$$

folds the integration region to the rectangle $\xi > 1$ and $0 \le \eta \le 1$, and the integral becomes

$$X_\pm = 2\iint\frac{\sqrt{1 - \eta}}{(\xi \pm \cos\theta)\sqrt{\xi^2 - 1}}\,d\eta\,d\xi.$$

The integral over $\eta$ is easily done as before, while the integral over $\xi$ evaluates to

$$X_\pm = \frac{4}{3}\,\frac{\frac{\pi}{2} \pm \left(\theta - \frac{\pi}{2}\right)}{\sin\theta}.$$

Hence the sum of the contributions from the two cases of positive and negative $Q$ is

$$\sin\theta\,(X_+ + X_-) = \frac{4}{3}\pi$$

as before. This computation shows that $\theta$ determines the relative number of states between the regions with positive and negative $Q$, namely $X_+/X_- = \theta/(\pi - \theta)$.

Let us look at what this calculation gives for the first eigenvalues of $\Delta$. There are two opposite cases depending on whether the geometric parameter $\mathcal{A} = A(T^2)\sin\theta$ is small or large. Let us assume again for simplicity that $\theta = \pi/2$; then $\mathcal{A}$ is simply the area of the fibre. Small $\mathcal{A}$ corresponds to the “rope-like” Sol-manifolds. In this case the first eigenvalues are “trivial”: $E_l = 4l^2\pi^2$, $l = 0, 1, 2, \ldots$, which correspond to $Q = 0$. The second case when $\mathcal{A}$ is large (“fat” Sol-manifolds) is more interesting. In that case the parameter $\nu$ in the Mathieu operator is small. The action integral (25) for small $\nu$ (or large $\Lambda$) has the asymptotics

$$I \sim \frac{\sqrt{\Lambda}}{\mu\pi}\,\ln\frac{2\Lambda}{|\nu|}.$$

This suggests the following asymptotics for the eigenvalues for small $\nu$ (or large $\Lambda$):

$$\Lambda_k \sim \frac{(\mu\pi(k + 1))^2}{(\ln|\nu|)^2}, \qquad k = 0, 1, 2, \ldots. \tag{27}$$

² The fact that the action diverges logarithmically for $\nu \to 0$ seems to be surprising. To our knowledge this is the first example of a Liouville integrable system on a compact manifold for which the action diverges on approach of a singular level (with energy fixed).



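Fig. 3. First 20 states of the cosh-Mathieu equation in dependence on the parameter $\log\nu$, where $f(\Lambda, \nu) = (\Lambda - \nu)/(\sqrt{2\nu}\,\mu)$

Note that although $\Lambda_k \to 0$ as $\nu \to 0$ the decay is slower than any power of $\nu$. The first 20 states are shown in Fig. 3. When $\nu$ is not small the low states are like those of the harmonic oscillator obtained by expanding $|\nu|\cosh 2\mu z \approx |\nu|(1 + 2\mu^2 z^2)$ in (23),

$$\Lambda_k \approx |\nu| + 2\sqrt{2\nu}\,\mu\left(k + \frac{1}{2}\right), \qquad k = 0, 1, 2, \ldots \tag{28}$$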

Both semiclassical formulas (27) and (28) are well known, see e.g. [16]. In that paper it is shown that the modified Mathieu equation is equivalent to the relativistic harmonic oscillator in which $\nu$ is proportional to the speed of light. This explains why (not only semiclassically) the low lying states behave like a harmonic oscillator for large enough $\nu$. For the corresponding Sol-manifolds this gives the following behaviour of the first eigenvalues

$$E_{k,[\gamma]} \sim \frac{(\mu\pi k)^2}{(\ln|C\,Q_{A^*}([\gamma])|)^2}, \qquad k = 1, 2, \ldots, \quad [\gamma] \in (\Gamma^* \setminus \{0\})/A^* \tag{29}$$

for small $C = \dfrac{8\pi^2}{\sqrt{D}\,A(T^2)\sin\theta}$. Starting with the ground state $E_0 = 0$ we order these eigenvalues $E_0 < E_1 \le E_2 \le \ldots$ to have

$$E_j \sim \frac{(\mu\pi)^2}{(\ln C)^2}\left(1 - 2\,\frac{\ln Q_j}{\ln C}\right), \qquad j = 1, 2, \ldots, \tag{30}$$

where $Q_0 = 0 \le Q_1 \le Q_2 \le \ldots$ are the positive values of the form $Q_{A^*}$ listed in increasing order and we have assumed that $Q_j \ll C^{-1}$. In particular,

$$E_1 \sim \frac{(\mu\pi)^2}{(\ln C)^2}, \qquad \frac{E_j - E_1}{E_i - E_1} \sim \frac{\ln(Q_j/Q_1)}{\ln(Q_i/Q_1)}, \tag{31}$$

so when $C$ is small we can “hear” the values of the quadratic form $Q_{A^*}$.

Note that the question about the next order term in Weyl's law is non-trivial. For the simpler case of Nil-manifolds some results in this direction can be found in [23]. One can also look at the corresponding Minakshisundaram-Plejel asymptotic expansion, which is an important characteristic of a spectrum (see e.g. [11]). In particular the



second coefficient in this expansion is proportional to the integral of the scalar curvature $K$. A straightforward calculation shows that for the Sol-manifold $M_A^3$ the principal sectional curvatures are $\pm\sin 2\theta\,\log^2\lambda$ and $-2(\sin\theta\log\lambda)^2$, so $K = -2(\sin\theta\log\lambda)^2$ and thus is always negative.

8. Spectral Statistics

The spectral statistics of integrable and chaotic systems are quite different, see, e.g., [3, 4, 10]. As we have seen, the geodesic flow on Sol-manifolds has properties of both integrable and chaotic systems. Therefore it is natural to ask what the spectral statistics of the Sol-manifolds look like. Note that according to the Berry-Tabor conjecture [3] integrable systems should have Poisson distributed level spacing. We are going to show that this is not the case for Sol-manifolds. The reason is the high multiplicities of the eigenvalues. Indeed, for the simplest positive quadratic form $Q_0 = x^2 + y^2$ the classical result due to E. Landau says that the number of integers up to a number $K$ represented by this form grows as $K/\sqrt{\log K}$. If there were no degeneracies then this number would grow like the area of the fundamental region, which is proportional to $K$. This means that most of the level spacings of the values of $Q_0$ are zero. P. Sarnak pointed out to us [26] that a similar fact is true for indefinite forms as well, and has been proved by Bernays [2]:

Theorem (Bernays [2]). The number of positive integers up to $K$ that can be represented by a given indefinite quadratic form $Q$ is $O(K/\sqrt{\log K})$.

Combining this with the results of Sect. 5 we have the following:

Theorem 4. The level spacing distribution for the spectrum of Sol-manifolds $M_A^3$ is not Poisson.

In the fundamental work by Berry and Tabor [3] a stationary phase computation was shown to yield Poisson statistics if the energy surface in action space is curved. This argument cannot, however, detect global coincidences of eigenvalues coming from different parts of the energy surface.
Berry and Tabor in [3] also discussed exceptions to the principle “Integrable systems yield Poisson statistics” with flat energy surface and the subtle number theory involved. Examples with a curved energy surface that do not have Poisson statistics have been known for some time, see e.g. [22, 25], but so far no examples with continuous parameters were known. The present example is therefore particularly interesting because the statement of the theorem is not sensitive to a change of the metric in the Sol-class (3).

Let us illustrate the level spacing distribution in the example of the cat-map $A$. In Fig. 4 (left) the level spacing statistics for the indefinite binary quadratic form $Q_A = -x^2 + xy + y^2$ is shown for three different values of $Q_{\max}$. Since the cat-map is the product of two involutions, there is a simple reflection symmetry in the lattice, which causes almost all states to be at least twofold degenerate. This discrete symmetry needs to be factored out before the level spacing statistics can be studied. The involutions $R_i$ with $R_i^2 = Id$ are

$$A = R_2 R_1, \qquad R_1 = \begin{pmatrix} 1 & 0 \\ -1 & -1 \end{pmatrix}, \qquad R_2 = \begin{pmatrix} 1 & -1 \\ 0 & -1 \end{pmatrix}.$$
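The stated factorisation is a short matrix computation. The following minimal sketch (the helpers `matmul`, `act` and `q_cat` are hypothetical names introduced here) multiplies the two involutions, and evaluates the form $-x^2 + xy + y^2$ on a lattice point and on its images under $R_1$ and $R_2$.

```python
def matmul(a, b):
    """Product of two 2x2 integer matrices given as nested tuples of rows."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def act(m, x, y):
    """Image of the column vector (x, y) under the matrix m."""
    return (m[0][0] * x + m[0][1] * y, m[1][0] * x + m[1][1] * y)


def q_cat(x, y):
    """The indefinite form -x^2 + x y + y^2 of the cat-map."""
    return -x * x + x * y + y * y


IDENTITY = ((1, 0), (0, 1))
R1 = ((1, 0), (-1, -1))
R2 = ((1, -1), (0, -1))

print(matmul(R1, R1) == IDENTITY, matmul(R2, R2) == IDENTITY)
print(matmul(R2, R1))                       # the product R2 R1
print(q_cat(3, 5), q_cat(*act(R1, 3, 5)), q_cat(*act(R2, 3, 5)))
print(act(R1, 2, -1))                       # a point on the line y = -x/2
```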



Fig. 4. Level spacing statistics of the indefinite binary quadratic form $Q(x, y) = -x^2 + xy + y^2$ (left) with degeneracies removed (right). Panels, from top to bottom: $Q_{\max} = 10^5$ (# = 21749), $Q_{\max} = 10^7$ (# = 2154323), $Q_{\max} = 10^9$ (# = 215227349). Horizontal axis: level spacing DQ, with ticks up to 20; vertical axis: %.

The fixed line of $R_1$ is the line $y = -x/2$, and factoring the fundamental region in Fig. 2 by $R_1$ simply cuts the fundamental region in half along this line. Since the values of $Q$ are integers we chose to present the raw level spacing statistics, i.e. without unfolding the spectrum first. The number of lattice points found up to the corresponding $Q_{\max}$ in the reduced fundamental region is given in the heading of each figure, and the ratio of this number to $Q_{\max}$ approaches $\ln\lambda/(2\sqrt{d})$. The figures clearly show that the proportion of degenerate levels grows, in agreement with what we said above. When the degenerate levels are discarded in the statistics, Fig. 4 (right) shows that the distribution appears to converge to some non-universal shape. We would like to mention here the paper [14], where the moments of the intervals between the sums of two squares were studied.

9. Quantum monodromy

In view of the previous results the appearance of quantum monodromy in our problem is quite natural. However there is a problem with this notion in our case which we want to discuss first. As was shown in [5], the geodesic flow on Sol-manifolds cannot have three analytic integrals (see Sect. 3 above). A similar fact holds in the quantum case. Namely, one can show that the algebra of the differential operators on the Sol-manifold $M_A^3$ commuting with the Laplace-Beltrami operator (15) is generated by $\Delta$ and $\dfrac{\partial}{\partial u}\dfrac{\partial}{\partial v}$. This means that our quantum problem does not have enough quantum integrals, at least in the class of the



differential operators and therefore it is not clear if we can apply the rigorous treatment of quantum monodromy from [30]. So in this section we will treat the quantum monodromy on Sol-manifolds on the intuitive level paying more attention to geometry rather than to analysis. As we have already shown, the set of eigenfunctions is in a natural one-to-one correspondence with ∗ /A∗ × N, [γ ] ∈ ∗ /A∗ , k ∈ N. The fundamental domain of A∗ is shown in Fig. 2. It is natural to represent the orbit space as a lattice on the cone obtained by gluing the edges of the fundamental domain of A∗ on the plane (more precisely we should consider four different cones corresponding exactly to four families of Liouville tori). The complete 3-dimensional lattice of quantum states is obtained by attaching a line to each point of the Sol-flower which contains the spectrum of the modified Mathieu equation with the corresponding value of ν(Q). The Sol-flower can be viewed as a slice through this lattice. But since there is no monodromy in the Mathieu fibres we have chosen to simply present the 2-dimensional base of this 3-dimensional lattice. Quantum monodromy arises when we pass around the vertex of the cone. It is clear that the basis of the lattice will undergo the transformation A. On the other hand, nothing happens to the third direction corresponding to the parameter k. Therefore the quantum monodromy for the Sol-manifold MA3 is given by the matrix ∗ A 0 . 0 1 We want to emphasize that in this case quantum monodromy has a purely topological nature. It is determined by the topology of the underlying manifold, and not by properties of the metric. It does not depend on the parameters E, G, F , moreover the monodromy remains the same for all metrics of the form ds 2 = dsz2 + dz2 ,


where $ds_z^2$ is a flat metric on the fibres $T_z^2$ with coefficients depending on $z$. In the previously known examples (like the geodesic flow on the 3-dimensional ellipsoid of revolution, [31]) the metric $g$ played the principal role.


Fig. 5. Image of the lattice $\mathbb{Z}^2$ in $(p_x, p_y)$ under the momentum map $(F_1, F_2)$ for fixed energy, where $A$ is the cat-map. Left: origin at the centre. Right: distorted standard lattice away from the origin



In Figs. 5, 6, and 7 we demonstrate the quantum monodromy of the Sol-manifold $M_A^3$ related to the cat-map
$$A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}.$$
To make the image of the lattice uniform we have slightly modified the classical integrals $f_1, f_2$ from Sect. 3 as follows:
$$F_1 = \sqrt{|Q|}\cos 2\pi\beta, \qquad F_2 = \sqrt{|Q|}\sin 2\pi\beta,$$
where $Q = p_u p_v$ and $\beta = \frac{\ln|p_u|}{\ln\lambda}$. We call the image of the lattice under the map $F$ the Sol-flower (see Fig. 6). Note that the alternative choice $\beta = \frac{\ln|p_v|}{\ln\lambda}$ would give a similar picture, and the freedom of rescaling the eigenvectors $e_u, e_v$ leads simply to a rotation of the plane $(F_1, F_2)$. Figure 5 illustrates that away from the origin the lattice is simply a deformed standard lattice. A nice property of the map $(p_u, p_v) \to (F_1, F_2)$ is that it changes the area simply by a constant multiple: it is easy to check that

$$dF_1 \wedge dF_2 = \frac{\pi}{\ln\lambda}\, dp_u \wedge dp_v.$$
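The constant area factor can also be checked numerically. The following Python sketch is only an illustration and not part of the original computation: it assumes that $\lambda = (3+\sqrt{5})/2$ is the larger eigenvalue of the cat-map $A$, restricts to $p_u, p_v \neq 0$, and uses hypothetical helper names (`momentum_map`, `jacobian_det`). It estimates the determinant of $\partial(F_1, F_2)/\partial(p_u, p_v)$ by central differences and compares its absolute value with $\pi/\ln\lambda$; the sign, which depends on orientation conventions, is not tracked.

```python
import math

# Assumption: lambda is the larger eigenvalue of the cat-map A = [[2, 1], [1, 1]].
LAMBDA = (3 + math.sqrt(5)) / 2


def momentum_map(pu, pv, lam=LAMBDA):
    """Map (p_u, p_v) to (F1, F2) with Q = p_u p_v and beta = ln|p_u| / ln(lam)."""
    r = math.sqrt(abs(pu * pv))
    beta = math.log(abs(pu)) / math.log(lam)
    angle = 2 * math.pi * beta
    return r * math.cos(angle), r * math.sin(angle)


def jacobian_det(pu, pv, h=1e-6, lam=LAMBDA):
    """Central-difference estimate of det d(F1, F2)/d(p_u, p_v)."""
    f1_up, f2_up = momentum_map(pu + h, pv, lam)
    f1_um, f2_um = momentum_map(pu - h, pv, lam)
    f1_vp, f2_vp = momentum_map(pu, pv + h, lam)
    f1_vm, f2_vm = momentum_map(pu, pv - h, lam)
    dF1_du = (f1_up - f1_um) / (2 * h)
    dF2_du = (f2_up - f2_um) / (2 * h)
    dF1_dv = (f1_vp - f1_vm) / (2 * h)
    dF2_dv = (f2_vp - f2_vm) / (2 * h)
    return dF1_du * dF2_dv - dF1_dv * dF2_du


if __name__ == "__main__":
    expected = math.pi / math.log(LAMBDA)
    for pu, pv in [(1.3, 0.7), (5.0, 2.0), (0.4, 3.1)]:
        print(pu, pv, abs(jacobian_det(pu, pv)), expected)
```

The sample points are arbitrary; the point of the check is that the same magnitude is obtained at each of them.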

It is interesting to mention that the multiplicity problem becomes the standard “circle problem” if one replaces the square lattice by the Sol-flower (but of course this does not help to compute the multiplicities). When a fundamental cell is chosen in the Sol-flower as indicated in grey in Fig. 7, the monodromy can be observed as follows: a line extending a basis vector is parallel transported in the lattice. After completing a cycle about the origin this direction is changed. The left picture shows images of the lines $(p_x, p_y) = (30 - 2l, -j - l)$, $j = 0, \dots, 27$, $l = 0, \dots, 5$. The right picture shows images of the lines $(p_x, p_y) = (30 - l, -j - l)$, $j = 0, \dots, 30$, $l = 0, \dots, 5$. Denote the direction of the line shown in the left part of

Fig. 6. Image of the lattice $\mathbb{Z}^2$ in $(p_x, p_y)$ under the momentum map $(F_1, F_2)$ for fixed energy and $A$ the cat-map. Left: Sol-flower with $|Q| \le 60^2$. Right: Sol-flower with $|Q| \le 90^2$



Fig. 7. Parallel transport of basic directions in the image of the momentum map

Fig. 7 by $e_1$, and the one in the right part by $e_2$. The preimages of these basis vectors in Fig. 2 are $-(2e_x + e_y)$ and $-(e_x + e_y)$. Parallel transporting $e_1$ clockwise by increasing $j$ gives $e_1 + e_2$ (determining the second row of $A$), while parallel transporting $e_2$ counterclockwise by decreasing $j$ gives $-e_1 + e_2$ (determining the first row of $A^{-1}$). Since $A \in SL(2, \mathbb{Z})$ this determines the cat-map.

10. Concluding Remarks The Sol-geometry from a dynamical point of view has the special property of being on the border between integrability and chaos. Integrability is reflected in the solvability of the corresponding group, while the chaos is related to a hidden (partial) hyperbolicity. This makes the Sol-case of particular interest and explains why the geodesic problem on the Sol-manifolds has both integrable and chaotic features. As we have seen, the quantum case gives a new interesting twist to the story by bringing arithmetic into play. Atiyah, Donnelly and Singer [1] considered a more general case of Sol-manifolds which are $T^{n+1}$ torus fibres over $T^n$. Much of our analysis can be generalised to this case as well. The quantum Toda lattice Hamiltonian will then appear as a generalisation of the modified Mathieu operator. Some very interesting results in the corresponding classical problem were found recently by Leo Butler in [7]. It would also be interesting to study in more detail how the chaos (at the degenerate level $Q = 0$) of the classical system manifests itself in the quantum version. We showed that the spectral statistics provides a counterexample to the Berry-Tabor conjecture, but it cannot be taken as an indicator of chaos. One simple observation is that the trivial eigenfunctions $\Psi_{0,s}$ are asymptotically ‘uniformly distributed’ on the manifold. Hence the subset of eigenfunctions that are associated with the classical chaos is quantum unique ergodic, cf. [19].
Already at relatively small quantum numbers it can be seen that the nodal lines are more complicated when $Q$ is small, see Fig. 8.

Acknowledgements. We are very grateful to M. Berry, E. Bombieri, V. Kuznetsov, J. Marklof, A. Pushnitski, P. Sarnak, R. Schubert and P. Shiu for very useful and stimulating discussions. This work was started in December 2002 when one of us (A.B.) visited Loughborough University. We are grateful to the London Mathematical Society for the support of this visit. The work of APV



Fig. 8. Slice of $\Psi_{[\gamma],k}$ at $x = 0$ for $\gamma = (1, 0)$, $k = 15$ (left) and $\gamma = (12, -5)$, $k = 2$ (right) for the cat-map. The eigenfunction whose $Q(\gamma)$ is small, and which is thus close to the classical chaos, appears more irregular

was partially supported by the European research network ENIGMA (contract MRTN-CT-2004-5652) and ESF Scientific programme MISGAM. The work of HRD was partially supported by the European research programme MASIE (contract HPRN-CT-2000-0113). The work of AVB was partially supported by the Russian Foundation for Basic Research (grant 05-01-00978).

References
1. Atiyah, M.F., Donnelly, H., Singer, I.M.: Eta invariants, signature defects of cusps and values of L-functions. Ann. Math. 118, 131–177 (1983)
2. Bernays, P.: Über die Darstellung von positiven, ganzen Zahlen durch die primitiven, binären quadratischen Formen einer nicht-quadratischen Diskriminante. Dissertation Göttingen, 1912
3. Berry, M.V., Tabor, M.: Level clustering in the regular spectrum. Proc. Roy. Soc. London A 356, 375–394 (1977)
4. Bohigas, O., Giannoni, M.-J., Schmidt, C.: Characterization of chaotic quantum spectra and universality of level fluctuation laws. Phys. Rev. Lett. 52, 1–4 (1984)
5. Bolsinov, A.V., Taimanov, I.A.: Integrable geodesic flows with positive topological entropy. Invent. Math. 140, 639–650 (2000)
6. Brezin, J.: Harmonic analysis on compact solvmanifolds. LNM 602, New York: Springer-Verlag, 1977
7. Butler, L.: Toda lattices and positive-entropy integrable systems. Invent. Math. 158, 515–549 (2004)
8. Cushman, R., Duistermaat, J.J.: The quantum spherical pendulum. Bull. Amer. Math. Soc. (N.S.) 19, 475–479 (1988)
9. Duistermaat, J.J.: On global action-angle coordinates. Comm. Pure Appl. Math. 33, 687–706 (1980)
10. Duistermaat, J.J., Guillemin, V.W.: The spectrum of positive elliptic operators and periodic bicharacteristics. Invent. Math. 29(1), 39–79 (1975)
11. Gilkey, P.: Spectral geometry of a Riemannian manifold. J. Diff. Geom. 10(4), 601–618 (1975)
12. Guillemin, V., Uribe, A.: Monodromy in the quantum spherical pendulum. Commun. Math. Phys. 122, 563–574 (1989)
13. Hirzebruch, F.: Hilbert modular surfaces. L’Enseign. Math. 19, 183–281 (1973)
14. Hooley, C.: On the intervals between numbers that are sums of two squares. Acta Math. 127, 279–297 (1971)
15. Keng, H.L.: Introduction to Number Theory. Berlin-Heidelberg-New York: Springer-Verlag, 1982
16. Jager, L.: Fonctions de Mathieu et fonctions propres de l’oscillateur relativiste. Ann. Fac. Sci. Toulouse Math. Serie 6, 7, 465–495 (1998)
17. Komarov, I.V., Ponomarev, L.I., Slavyanov, S.Yu.: Spheroidal and Coulomb spheroidal functions (Russian). Moscow: “Nauka”, 1975, pp 320
18. Komarov, I.V., Tsiganov, A.B.: Quantum two-particle periodic Toda lattice (Russian). Vestnik LGU, 4(2), 69–71 (1988)
19. Jakobson, D., Nadirashvili, N., Toth, J.: Geometric properties of eigenfunctions. Russ. Math. Surv. 56, 1085–1105 (2001)
20. Landau, E.: Elementary Number Theory. New York: Chelsea, 1958
21. LeVeque, W.J.: Fundamentals of Number Theory. New York: Dover Publications, 1996
22. Marklof, J.: Spectral form factors of rectangle billiards. Commun. Math. Phys. 199, 169–202 (1998)
23. Petridis, Y.N., Toth, J.A.: The remainder in Weyl’s law for Heisenberg manifolds. J. Differ. Geom. 60, 455–483 (2002)
24. Sarnak, P.: Class numbers of indefinite binary quadratic forms. J. Number Theory 15, 229–247 (1982)
25. Sarnak, P.: Values at integers of binary quadratic forms. CMS Conf. Proc. 21, Providence, RI: Amer. Math. Soc., 1997, pp. 181–203
26. Sarnak, P.: Private communication. (June 2003)
27. Scott, P.: The geometries of 3-manifolds. Bull. London Math. Soc. 15, 401–487 (1983)
28. Taimanov, I.A.: Topological obstructions to integrability of geodesic flows on non-simply-connected manifolds. Math. USSR Izv. 30, 403–409 (1988)
29. Thurston, W.P.: Hyperbolic geometry and 3-manifolds. London Math. Soc. Lecture Note Ser., 48, Cambridge: Cambridge Univ. Press, 1982
30. Vu Ngoc S.: Quantum monodromy in integrable systems. Commun. Math. Phys. 203(2), 465–479 (1999)
31. Waalkens, H., Dullin, H.R.: Quantum monodromy in prolate ellipsoidal billiards. Ann. Phys. 295, 81–112 (2002)
32. Weyl, H.: Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller Differentialgleichungen. Math. Ann. 141, 441–479 (1912)
33. Whittaker, E.T., Watson, G.N.: A Course in Modern Analysis, 4th ed., Cambridge: Cambridge University Press, 1990
34. Zwillinger, D.: Handbook of Differential Equations, 3rd ed. Boston, MA: Academic Press, 1997

Communicated by L. Takhtajan

Commun. Math. Phys. 264, 613–630 (2006) Digital Object Identifier (DOI) 10.1007/s00220-005-1470-y



On Curvature Decay in Expanding Cosmological Models
Hans Ringström
Max-Planck-Institut für Gravitationsphysik, Am Mühlenberg 1, 14476 Golm, Germany. E-mail: [email protected]
Received: 21 April 2005 / Accepted: 22 June 2005
Published online: 18 November 2005 – © Springer-Verlag 2005

Abstract: Consider a globally hyperbolic cosmological spacetime. Topologically, the spacetime is then a compact 3-manifold in cartesian product with an interval. Assuming that there is an expanding direction, is there any relation between the topology of the 3-manifold and the asymptotics? In fact, there is a result by Michael Anderson, where he obtains relations between the long-time evolution in General Relativity and the geometrization of 3-manifolds. In order to obtain conclusions however, he makes assumptions concerning the rate of decay of the curvature as proper time tends to infinity. It is thus of interest to find out if such curvature decay conditions are always fulfilled. We consider here the Gowdy spacetimes, for which we prove that the decay condition holds. However, we observe that for general Bianchi VIII spacetimes, the curvature decay condition does not hold, but that some aspects of the expected asymptotic behaviour are still true. 1. Introduction The objects of study in this paper are cosmological spacetimes. We shall assume them to be globally hyperbolic, so that topologically, they are of the form I × M, where M is a compact 3-manifold. We shall also only consider spacetimes which have one expanding direction, i.e. there is one time direction in which spacetime is causally geodesically complete. The question is then, what is the relationship between the asymptotic behaviour and the topology of the compact Cauchy surfaces? Anderson, Fischer and Moncrief have written several papers on the subject, see [2] and [7] and the references cited therein. In the current paper, we are concerned with questions raised in [2] regarding the relationship between the asymptotics and geometrization. The special case of interest here is when one has a globally hyperbolic vacuum spacetime foliated by compact constant mean curvature (CMC) hypersurfaces, though in the case of Gowdy, we shall also be interested in another geometrically defined foliation. 
We shall assume that $\sigma(\Sigma) \le 0$

Current address: Department of Mathematics, KTH, 100 44 Stockholm, Sweden


for any CMC hypersurface $\Sigma$ (for a definition of the $\sigma$-constant of a compact 3-manifold, see [1]) or, in other words, that $\Sigma$ does not admit a metric of positive scalar curvature, see [2]. Furthermore, we shall assume that the range of the mean curvatures attained in the foliation exhausts the interval $(-\infty, 0)$ and that the spacetime is future causally geodesically complete. In fact, we shall only be interested in the expanding direction, so it is enough if the foliation exhausts the interval $[H_0, 0)$ for some $H_0 < 0$, and sometimes future causal geodesic completeness will be a consequence of other assumptions. In this setting we wish to consider the behaviour of the geometry induced on the leaves of the foliation as proper time tends to infinity. Let us recall some definitions from [2].

Definition 1. Let $\Sigma$ be a closed, oriented and connected 3-manifold, satisfying $\sigma(\Sigma) \le 0$. A weak geometrization of $\Sigma$ is a decomposition of $\Sigma$,
$$\Sigma = H \cup G, \tag{1}$$

where $H$ is a finite collection of complete connected hyperbolic manifolds of finite volume embedded in $\Sigma$ and $G$ is a finite collection of connected graph manifolds embedded in $\Sigma$. The union is along a finite collection of embedded tori $T = \cup T_i$, $T = \partial H = \partial G$. A strong geometrization of $\Sigma$ is a weak geometrization as above, for which each torus $T_i$ in $T$ is incompressible in $\Sigma$, i.e. the inclusion of $T_i$ into $\Sigma$ induces an injection of fundamental groups.

For more details concerning the terminology, we refer to [2] and the references cited therein. Graph manifolds are built by gluing together Seifert fibred spaces along toral boundary components. Since we shall only be concerned with Seifert fibred 3-manifolds in this paper, the details of these constructions are not of any greater importance here. Let us however define the concept of a Seifert fibred space.

Definition 2. A 3-manifold is said to be a Seifert fibred space if it satisfies the following two conditions:
1. It can be written as a disjoint union of circles.
2. Each circle fibre has an open neighbourhood $U$ satisfying:
– $U$ can be written as a disjoint union of circle fibres,
– $U$ is isomorphic either to a solid torus or a cylinder where the ends have been identified after a rotation by a rational angle.
When we say that $U$ is isomorphic to a solid torus, we mean that $U$ is diffeomorphic to a solid torus and that the circle fibres of $U$ are mapped to the natural circle fibres of the solid torus under the diffeomorphism.

Note that there are different definitions of Seifert fibred spaces in the literature. In particular, our definition coincides with the original definition by Seifert but not with that of Scott [14]. Since the geometry on the leaves of the foliation becomes more and more flat, it is natural to rescale the metric in some way. Following [2], we shall use the proper time distance to a fixed Cauchy surface in order to do so. Let $\Sigma$ be a fixed Cauchy surface and define, for an arbitrary spacetime point $p$,
$$\hat t(p) = \sup_{\gamma} \int_0^1 [-\langle \gamma', \gamma' \rangle]^{1/2}\, ds,$$


where the supremum is taken over timelike curves $\gamma$ with $\gamma(0) \in \Sigma$ and $\gamma(1) = p$, and $\langle \cdot, \cdot \rangle$ denotes the inner product with respect to the spacetime metric. We also define
$$\hat t(\Sigma) = \sup_{p \in \Sigma} \hat t(p)$$

for a Cauchy surface $\Sigma$. Let the leaves $\Sigma_s$ of the foliation be indexed by a parameter $s$. In the case of a CMC foliation, the parameter can be chosen to be the mean curvature of the corresponding leaf, and in the case of Gowdy, the parameter will be the so-called areal time coordinate. We are interested in the interval $[s_0, s_{\max})$, where $s_0$ corresponds to some arbitrary initial hypersurface (filling the role of $\Sigma$ above) and $s_{\max}$ corresponds to infinite expansion, i.e. $s_{\max} = 0$ in the CMC case and $s_{\max} = \infty$ in the case of the areal time coordinate in Gowdy. Let $\hat g_s$ be the Riemannian metric induced on the leaf $\Sigma_s$ by the spacetime metric and define $g_s = \hat t^{-2}(\Sigma_s)\, \hat g_s$. The following weak asymptotics problem was raised in [2]. Suppose that $\Sigma$ is a closed, oriented, connected 3-manifold with $\sigma(\Sigma) \le 0$. Suppose further that the vacuum spacetime is future causally geodesically complete and that the CMC foliation exhausts the future development. Then for any sequence $s_i \to s_{\max}$, the slices $(\Sigma_{s_i}, g_{s_i})$ have a subsequence asymptotic to a weak geometrization of $\Sigma$. More precisely, there should be a division of $\Sigma$ as in (1) and on the region $H$, the metrics $g_{s_i}$ should converge to complete hyperbolic metrics of finite volume, while on $G$, the metrics collapse the graph manifold with bounded curvature. When we say that a region collapses we mean that the injectivity radius of that region converges to zero. If a region collapses even though the curvature remains bounded, we shall say that it collapses in the sense of Cheeger-Gromov. This conjecture should be compared with the work of Andersson and Moncrief [3], Choquet-Bruhat and Moncrief [4] and Fischer and Moncrief [7]. In [3], the authors considered the future development of perturbations of spatially compact variants of the $k = -1$ Friedmann-Robertson-Walker vacuum spacetime. They proved that the future development is covered by CMC hypersurfaces exhausting the maximal range, and that it is future causally geodesically complete.
Furthermore, the rescaled metric on the spatial hypersurfaces was shown to converge to the hyperbolic one. In [4], the authors considered Cauchy surfaces that have the topology of a trivial circle bundle over a higher genus surface and they restricted their attention to small, polarized, $U(1)$-symmetric data. They proved that the future development is foliated by CMC hypersurfaces exhausting the maximal range. Furthermore, they stated that causal geodesic completeness should hold, though they did not prove it. However, this was shown for a larger class of spacetimes in [5], a paper which extends the results of [4] to the non-polarized case, using the results of [6]. Finally, they showed that the Cauchy surfaces should undergo a Cheeger-Gromov type collapse. In [7], some known spatially homogeneous examples were studied and the expected behaviour was confirmed. Note that in all the cases mentioned above, either $H = \emptyset$ or $G = \emptyset$ in the division (1). The reason for this is that all results, as far as we are aware, fall either into the category of small data results or into the category of results for a situation in which there is symmetry. The small data category may seem to be more general, but since it presupposes the existence of a symmetric solution around which to perturb, it is not more general in terms of spatial topology. In other words, all known results require the spatial manifold to allow a highly symmetric metric, and this reduces the number of allowed spatial topologies.


In [2], the following statement was proved. Consider a spacetime which is the maximal development of vacuum initial data, with $\sigma(\Sigma) \le 0$, where $\Sigma$ is the initial hypersurface, and assume that it is foliated to the future by CMC hypersurfaces exhausting the maximum range. Assume furthermore that the curvature satisfies
$$|R|(p) + \hat t(p)\,|\nabla R|(p) \le \frac{C}{\hat t^2(p)}, \tag{2}$$

where $|R|^2$ is defined as the sum of the squares of the components of the Riemann curvature tensor with respect to an orthonormal frame, where the timelike unit vector in the frame is the future oriented normal to the foliation (the definition of $|\nabla R|^2$ is similar). Then the spacetime is future causally geodesically complete and, for any sequence $s_i \to s_{\max}$, the slices $(\Sigma_{s_i}, g_{s_i})$ have a subsequence asymptotic to a weak geometrization. Due to this theorem, it is of interest to analyze how curvature decays in expanding cosmological spacetimes. In the following, we shall only consider whether the estimate
$$|R|(p) \le \frac{C}{\hat t^2(p)} \tag{3}$$

holds or not. In the case of Gowdy, it turns out that such an estimate holds, at least relative to the foliation defined by the areal time coordinate. In the case of locally rotationally symmetric Bianchi VIII, the estimate also holds, but it turns out that for general Bianchi VIII it does not. In that case $\hat t(p) \ln \hat t(p)\, |R|(p)$ converges to a positive number as $p$ tends to a point in the infinite future. In fact, in the case of general Bianchi VIII, one does not get a better estimate even if one considers the Kretschmann scalar
$$\kappa = R_{\alpha\beta\gamma\delta} R^{\alpha\beta\gamma\delta}. \tag{4}$$
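To see why the general Bianchi VIII behaviour stated above is incompatible with (3), one can factor the rescaled curvature as follows. This is a short worked step that uses only the limit quoted in the text:

```latex
\hat t^{\,2}(p)\,|R|(p)
  \;=\; \frac{\hat t(p)}{\ln \hat t(p)}\,
        \Bigl[\hat t(p)\,\ln \hat t(p)\,|R|(p)\Bigr].
```

The bracket tends to a positive number while $\hat t/\ln \hat t \to \infty$, so $\hat t^{\,2}|R|$ is unbounded and no constant $C$ can satisfy (3) in that case.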

It is then of interest to consider the Ricci curvature of $g_{s_i}$. It turns out that in general, the Ricci curvature does not have any better decay, but that there is a time sequence such that one does get the expected decay. This time sequence corresponds to the metric being locally rotationally symmetric. Concerning the topology, we have the following results. In the case of Gowdy, the topology is $T^3$, and after rescaling the 3-tori collapse along 2-tori. In the Bianchi VIII case, the topology is that of a non-trivial circle bundle over a higher genus surface. After rescaling one obtains the conclusion that the length of the circle fibres converges to zero.

1.1. Gowdy spacetimes. The Gowdy spacetimes form a class of vacuum spacetimes with a two-dimensional group of isometries. Of the spatial topologies compatible with the symmetry requirements, only $T^3$ is expected to be compatible with infinite expansion. For this reason, we shall only be interested in such a spatial topology in this paper. There are natural conditions defining the Gowdy spacetimes, see [12] and references therein, but we shall not write them down here. For the purposes of the present paper, a Gowdy $T^3$ spacetime is defined as a Lorentz manifold $\mathbb{R}_+ \times T^3$, where $\mathbb{R}_+ = (0, \infty)$, with metric
$$g = t^{-1/2} e^{\lambda/2}(-dt^2 + d\theta^2) + t\,[e^{P} d\sigma^2 + 2 e^{P} Q\, d\sigma\, d\delta + (e^{P} Q^2 + e^{-P})\, d\delta^2], \tag{5}$$



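where $P$, $Q$ and $\lambda$ only depend on $t$ and $\theta$, satisfying Einstein’s vacuum equations. In terms of $P$, $Q$ and $\lambda$, the equations are
$$P_{tt} + \frac{1}{t} P_t - P_{\theta\theta} - e^{2P}(Q_t^2 - Q_\theta^2) = 0, \tag{6}$$
$$Q_{tt} + \frac{1}{t} Q_t - Q_{\theta\theta} + 2(P_t Q_t - P_\theta Q_\theta) = 0, \tag{7}$$
and
$$\lambda_t = t\,[P_t^2 + P_\theta^2 + e^{2P}(Q_t^2 + Q_\theta^2)], \tag{8}$$
$$\lambda_\theta = 2t\,(P_\theta P_t + e^{2P} Q_\theta Q_t). \tag{9}$$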
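As a quick sanity check on the metric (5), the $(\sigma, \delta)$ block $t\,[e^{P} d\sigma^2 + 2 e^{P} Q\, d\sigma\, d\delta + (e^{P} Q^2 + e^{-P})\, d\delta^2]$ has determinant $t^2$ whatever the values of $P$ and $Q$, so its area density is $t$. The Python sketch below evaluates this at a few sample values; the function names are hypothetical and the sample values are arbitrary, chosen only for illustration.

```python
import math


def torus_block(t, P, Q):
    """Components (g_ss, g_sd, g_dd) of the (sigma, delta) block of metric (5).

    g_sd is the off-diagonal component, i.e. half the coefficient of
    d sigma d delta in the line element.
    """
    g_ss = t * math.exp(P)
    g_sd = t * math.exp(P) * Q
    g_dd = t * (math.exp(P) * Q ** 2 + math.exp(-P))
    return g_ss, g_sd, g_dd


def area_density(t, P, Q):
    """Square root of the determinant of the (sigma, delta) block."""
    g_ss, g_sd, g_dd = torus_block(t, P, Q)
    return math.sqrt(g_ss * g_dd - g_sd ** 2)


if __name__ == "__main__":
    for t, P, Q in [(2.5, 0.3, -1.7), (10.0, -1.2, 4.0), (0.5, 2.0, 0.0)]:
        print(t, area_density(t, P, Q))
```

Since the density does not depend on $\theta$, $P$ or $Q$, the area of the two-torus at fixed $t$ and $\theta$ is $t$ times the coordinate area of the $(\sigma, \delta)$ torus.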

The time coordinate $t$ appearing in (5) is called the areal time coordinate. The reason for this is that the area of the two-torus given by fixing $t$ and $\theta$ is $t$. On the other hand, the trace of the second fundamental form need not be constant on the hypersurfaces of constant $t$. One might then naively expect this to approximately be the case asymptotically. However, there are metrics of the form (5) such that there is a time sequence $t_k \to \infty$ with the property that the quotient of the maximum and the minimum of $|\mathrm{tr}\, k_{t_k}|$ tends to infinity, where $k_{t_k}$ is the second fundamental form of the hypersurface defined by $t = t_k$. We refer the reader to [13] for a proof of this fact. Thus there is certainly a difference between the CMC foliation and the areal time coordinate foliation. Since most of the analysis concerning Gowdy spacetimes has been carried out in the areal time coordinate and since this coordinate has a natural geometric definition, we shall however only consider this choice here. In the end we are interested in getting estimates for the curvature. In [12], we analyzed the asymptotics of solutions to (6)–(7). However, the analysis was not complete. In particular, [12] only contains estimates of the first derivatives of $P$ and $Q$, and this is not sufficient for computing curvature. The first step is to remedy this situation.

Theorem 1. Consider a solution to (6)–(7). Then

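$$\bigl\| (\partial_\theta^k \partial_t P)^2 + (\partial_\theta^{k+1} P)^2 + e^{2P}[(\partial_\theta^k \partial_t Q)^2 + (\partial_\theta^{k+1} Q)^2] \bigr\|_{C^0(S^1, \mathbb{R})} \le C_k \frac{(\ln t)^{2k}}{t} \tag{10}$$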

for $t \ge 2$ and $k \ge 0$.

Remark 1. The above estimates together with Eqs. (6)–(7) yield estimates for the higher order derivatives involving an arbitrary number of time derivatives. In the polarized case, i.e. when $Q = 0$, there is an improved estimate. In fact, one does not need the logarithms. To see this, note that the case $k = 0$ of (10) was proved in [12] and that in the polarized case, the equation remains the same under differentiation with respect to $\theta$.

The proof is to be found at the beginning of Sect. 2. Define the proper time distance between the hypersurfaces defined by $t_0$ and $t$ to be $\tau(t_0, t)$, cf. (18). Then the decay estimate for the curvature is as follows.

Theorem 2. Consider a metric of the form (5), where $P$, $Q$ and $\lambda$ satisfy (6)–(9). Assume furthermore that $P$ and $Q$ are not both independent of $\theta$ for all $t$. Then for every $t_0 > 0$, there is a positive constant $C(t_0)$ and a $T(t_0)$ such that for $t \ge T(t_0)$,
$$|R|(t) \le C(t_0)\, \tau^{-2}(t_0, t), \tag{11}$$
where $|R|$ is defined with respect to the areal time coordinate foliation.
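The observation in Remark 1 about the polarized case can be spelled out as follows. This is a sketch of the argument indicated there, with $u_k$ a shorthand introduced here for illustration. Setting $Q = 0$ in (6) gives a linear equation with coefficients independent of $\theta$, so applying $\partial_\theta^k$ shows that $u_k = \partial_\theta^k P$ solves the same equation:

```latex
P_{tt} + \frac{1}{t}\,P_t - P_{\theta\theta} = 0
\quad\Longrightarrow\quad
\partial_t^2 u_k + \frac{1}{t}\,\partial_t u_k - \partial_\theta^2 u_k = 0,
\qquad u_k = \partial_\theta^k P .
```

Each $u_k$ is therefore itself a polarized solution, and the $k = 0$ case of (10), which carries no logarithm, applies to it directly.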


Remark 2. When considering metrics of the form (5), the spatially homogeneous solutions have a special type of behaviour. In particular, if there is some spatial variation, $\lambda$ tends to infinity linearly, but if there is no spatial variation, $\lambda$ tends to infinity logarithmically, cf. [12]. Since $P$ cannot grow faster than logarithmically and $Q$ cannot grow faster than polynomially, cf. [12], it is clear that in the spatially inhomogeneous case, the factor in front of $-dt^2 + d\theta^2$ tends to infinity exponentially whereas all the other factors tend to infinity at worst polynomially. In other words, all the expansion is in the factor in front of $-dt^2 + d\theta^2$. In the spatially homogeneous case, there is however no such clear distinction between the different factors, since $\lambda$ tends to infinity logarithmically. For this reason we focus on the spatially inhomogeneous case and leave the homogeneous case to the reader.

The proof is to be found at the end of Sect. 2. Finally, let us say something about the rescaled Riemannian metric on the hypersurfaces of constant areal time. The proof is also to be found at the end of Sect. 2.

Proposition 1. Consider a metric of the form (5), where $P$, $Q$ and $\lambda$ satisfy (6)–(9). Assume furthermore that $P$ and $Q$ are not both independent of $\theta$ for all $t$. Let $\hat g_t$ be the Riemannian metric induced on the hypersurface of constant areal time $t$, and let $g_t = \hat g_t/\tau^2(t_0, t)$. Then $g_t$ is a metric on $T^3$, which can be written
$$g_t = f_1(t, \theta)\, d\theta^2 + f_2(t, \theta)\, d\delta^2 + f_3(t, \theta)\, d\delta\, d\sigma + f_4(t, \theta)\, d\sigma^2.$$
The family $f_1(t, \cdot)$ of functions is bounded in $C^1$ and from below by a positive constant, for $t \ge t_0 + 1$. For $i \ge 2$, $k \ge 0$ and $t \ge t_0 + 1$, we have the following estimate,
$$\| f_i(t, \cdot) \|_{C^k} \le C_k \frac{\{\ln[1 + \tau(t_0, t)]\}^{\alpha_k}}{\tau^2(t_0, t)},$$

where $\alpha_k$ and $C_k$ are positive constants.

Remark 3. By the conclusions of the proposition and the Arzela-Ascoli theorem, there is, for any time sequence $t_k \to \infty$, a subsequence such that $f_1(t_k, \cdot)$ converges to a positive continuous function (the limit function will of course be Lipschitz). Furthermore, it is clear that the metric collapses in the two-torus direction defined by $\delta$ and $\sigma$. Finally, if it were possible to improve the estimate (10) in such a way that the logarithms do not occur, $f_1(t, \cdot)$ would be bounded in any $C^k$ norm for $t \ge t_0 + 1$. In particular, in the polarized Gowdy case, we have such bounds.

1.2. Bianchi VIII. For proofs of the statements made below, we refer the reader to [11] and the references cited therein. We define Bianchi VIII spacetimes in terms of initial data. Bianchi VIII initial data are given by $(G, g, k)$, where $G$ is a Lie group of Bianchi type VIII (to be defined below), $g$ is a left invariant metric, $k$ is a left invariant symmetric two-tensor and $g$ and $k$ satisfy the constraint equations. In practice, $G$ can be assumed to be the universal covering group of $Sl(2, \mathbb{R})$. However, in general, a Lie group $G$ is said to be of Bianchi type VIII if it has a basis $e_i$ of the Lie algebra satisfying $[e_i, e_j] = \gamma_{ij}^k e_k$, with $\gamma_{ij}^k = \epsilon_{ijl} n^{lk}$, where $\epsilon_{ijl}$ is antisymmetric in all its indices, $\epsilon_{123} = 1$, and $n^{lk}$ is diagonal with diagonal components $n_i$ such that $n_1 < 0$ and $n_2, n_3 > 0$. Given initial



data, there is a basis $e_i$ satisfying the conditions of the previous sentence such that $g$ is orthonormal with respect to this basis and $k$ is diagonal. We call such a basis a canonical basis. Such bases are not unique, but it turns out that $e_1$ is well defined up to a sign. Let $k_i = k(e_i, e_i)$. Then the initial data are said to be of NUT type if $k_2 = k_3$ and $n_2 = n_3$. Given initial data, one can construct a globally hyperbolic Lorentz manifold $(I \times G, \bar g)$, where $I$ is an open interval and $\bar g$ is of the form
$$\bar g = -dt^2 + \sum_{i=1}^{3} a_i^2(t)\, \xi^i \otimes \xi^i, \tag{12}$$

where the $\xi^i$ are the duals of $e_i$, a canonical basis, and $a_i(0) = 1$. Finally, $\mathrm{Ric}[\bar g] = 0$ and the Riemannian metric and the second fundamental form induced on $\Sigma = \{0\} \times G$ by $\bar g$ are given by $g$ and $k$, after identifying $G$ with $\Sigma$ in the obvious way. The development is future causally geodesically complete and independent of the canonical basis chosen. If the data are not of NUT type, the development is $C^2$-inextendible; in fact, the Kretschmann scalar (4) is unbounded to the past, cf. [8]. Finally, if the data are of NUT type, $a_2(t) = a_3(t)$ for all $t$. We can, without loss of generality, assume $G$ to be $\widetilde{Sl}(2, \mathbb{R})$, the universal covering group of $Sl(2, \mathbb{R})$. Since $\widetilde{Sl}(2, \mathbb{R})$ is diffeomorphic to $\mathbb{R}^3$, it is of interest to know when the geometry allows compactifications of the spatial hypersurfaces. In [11] we showed that if $\Gamma$ is a free and properly discontinuous subgroup of the isometry group of the initial data $(G, g, k)$, then $\{\mathrm{Id}\} \times \Gamma$ is a free and properly discontinuous subgroup of the isometry group of the development. By taking the quotient, we thus get developments such that the corresponding CMC hypersurfaces have topology $G/\Gamma$. Furthermore, the compact manifold $G/\Gamma$ must be Seifert fibred and $e_1$ corresponds to the Seifert fibre direction. We also proved that $a_1 = l_0 + O(t^{-1})$ in the NUT case and $a_1(t) = c_0 (\ln t)^{1/2}[1 + O(\ln \ln t/\ln t)]$ in the non-NUT case. Furthermore, $a_i(t)/t \to \alpha_i > 0$ for $i = 2, 3$. Thus, after rescaling, the Seifert fibred spaces collapse as expected. Note that for each $p > 1$, there is a subgroup $\Gamma_p$ of $\widetilde{Sl}(2, \mathbb{R})$ such that the quotient of $\widetilde{Sl}(2, \mathbb{R})$ by $\Gamma_p$ (when $\Gamma_p$ is viewed as a group of isometries by acting on the left) is diffeomorphic to the unit tangent bundle of a compact orientable surface of genus $p$ with respect to some hyperbolic metric. Thus all initial data allow infinitely many different compactifications. However, the following holds.

Theorem 3. Consider a Bianchi VIII spacetime.
If it is of NUT type, there are constants $c_0, c_1 > 0$ and a $T > 0$, such that
$$c_0 t^{-3} \le |R|(t) \le c_1 t^{-3}$$
for all $t \ge T$. If it is of non-NUT type, there is a constant $c_0 > 0$ such that
$$\lim_{t \to \infty} t \ln t\, |R|(t) = c_0.$$
Furthermore, there are constants $c_i > 0$ and sequences $t_{i,k} \to \infty$, $i = 1, 2$, such that
$$\lim_{k \to \infty} t_{i,k}^2 (\ln t_{i,k})^2\, \kappa(t_{i,k}) = (-1)^i c_i,$$
where $\kappa$ is defined in (4).


The proofs of this result and the next are to be found in Sect. 3. One can then ask whether the Ricci curvature of the spatial hypersurfaces behaves better. This turns out not to be the case in general, but there is in fact a time sequence along which it behaves well.

Proposition 2. Consider a Bianchi VIII spacetime which is not of NUT type. Then there are time sequences $t_{i,k} \to \infty$, $i = 1, 2$, and positive constants $c_i$ such that
$$\lim_{k \to \infty} t_{1,k}^2 (\ln t_{1,k})^2 (R_{ij} R^{ij})(t_{1,k}) = c_1, \qquad t_{2,k}^4 (R_{ij} R^{ij})(t_{2,k}) \le c_2,$$

where the last inequality is valid for all $k$, and $R_{ij}(t)$ denotes the Ricci tensor of the spatial hypersurface of homogeneity defined by $t$, with metric induced by $\bar g$.

Remark 4. The time sequence $t_{2,k}$ corresponds to the induced Riemannian metric being locally rotationally symmetric. Due to the existence of the sequence $t_{1,k}$, the conjecture embodied in the weak asymptotics problem is not correct.

2. Curvature Estimates for Gowdy

The expanding direction of Gowdy spacetimes was considered in [12]. The leading order behaviour for the functions $P$, $Q$ and $\lambda$ was sorted out and (10) was proved to hold for $k = 0$. In this paper, we are interested in the behaviour of curvature quantities, and thus we need to concern ourselves with the asymptotic behaviour of higher order derivatives.

Proof (Theorem 1). By [12], we know that the conclusion holds for $k = 0$. Define
\[ A_{k,\pm} = \frac{t}{2}\left[(\partial_\theta^k \partial_t P \pm \partial_\theta^{k+1} P)^2 + e^{2P}(\partial_\theta^k \partial_t Q \pm \partial_\theta^{k+1} Q)^2\right], \qquad E_k(t) = \sum_{\pm} \|A_{k,\pm}(t,\cdot)\|_{C^0(S^1,\mathbb{R})}. \]

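Let us make the inductive assumption that
\[ E_m^{1/2}(t) \le C_m (\ln t)^m \]
for $m = 0, \dots, k-1$ and $t \ge 2$. Observe that since (10) holds for $k = 0$, this holds for $k = 1$. Compute, for $k \ge 1$, $(\partial_t \mp \partial_\theta)A_{k,\pm} = I_{1,k,\pm} + I_{2,k,\pm}$, where
\[
\begin{aligned}
I_{1,k,\pm} ={}& \frac{1}{2}\left\{-(\partial_\theta^k P_t)^2 + (\partial_\theta^k P_\theta)^2 + e^{2P}\left[-(\partial_\theta^k Q_t)^2 + (\partial_\theta^k Q_\theta)^2\right]\right\} \\
& - t e^{2P}(P_t \pm P_\theta)\left[(\partial_\theta^k Q_t)^2 - (\partial_\theta^k Q_\theta)^2\right] \\
& + t e^{2P}(Q_t \pm Q_\theta)\left[(\partial_\theta^k Q_t \mp \partial_\theta^k Q_\theta)(\partial_\theta^k P_t \pm \partial_\theta^k P_\theta) - (\partial_\theta^k Q_t \pm \partial_\theta^k Q_\theta)(\partial_\theta^k P_t \mp \partial_\theta^k P_\theta)\right],
\end{aligned}
\tag{13}
\]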



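and
\[
\begin{aligned}
I_{2,k,\pm} ={}& t\left\{\partial_\theta^k\left[e^{2P}(Q_t^2 - Q_\theta^2)\right] - 2e^{2P}(Q_t \partial_\theta^k Q_t - Q_\theta \partial_\theta^k Q_\theta)\right\}(\partial_\theta^k P_t \pm \partial_\theta^k P_\theta) \\
& - 2t e^{2P} \sum_{j=1}^{k-1} \binom{k}{j}\left[\partial_\theta^j P_t\, \partial_\theta^{k-j} Q_t - \partial_\theta^j P_\theta\, \partial_\theta^{k-j} Q_\theta\right](\partial_\theta^k Q_t \pm \partial_\theta^k Q_\theta).
\end{aligned}
\]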

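Fix $\theta$ and define $\gamma_\pm(u) = (u, \theta \pm u)$. For $f : \mathbb{R}_+ \times S^1 \to \mathbb{R}$, let $f_\pm = f \circ \gamma_\pm$. Note that $\partial_u f_\pm = [(\partial_t \pm \partial_\theta) f]_\pm$. Compute
\[ A_{k,\pm}[\gamma_\mp(u)] = A_{k,\pm}[\gamma_\mp(u_0)] + \int_{u_0}^{u} [(\partial_t \mp \partial_\theta) A_{k,\pm}]_\mp(t)\, dt. \tag{14} \]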
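The identity $\partial_u f_\pm = [(\partial_t \pm \partial_\theta) f]_\pm$ behind (14) is the chain rule along the null curves $\gamma_\pm$. The following lines spell out this step, which the text leaves implicit; they are only a sketch and add no assumptions beyond differentiability of $f$.

```latex
% Chain rule along \gamma_\pm(u) = (u, \theta \pm u), with \theta held fixed.
\[
\partial_u f_\pm(u) = \frac{d}{du} f(u, \theta \pm u)
  = (\partial_t f)(u, \theta \pm u) \pm (\partial_\theta f)(u, \theta \pm u)
  = [(\partial_t \pm \partial_\theta) f]_\pm(u).
\]
% Applying this to f = A_{k,\pm} along the opposite curve \gamma_\mp and
% integrating from u_0 to u gives (14).
```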

Note that we have (13) and that each of the terms in $I_{1,k,\pm} \circ \gamma_\mp$ can be written, disregarding numerical factors, as a sum of terms of the form $f_{1\mp} f_{2\mp} \partial_u f_{3\mp}$. Here, the possibilities for $f_1$ are
\[ 1, \quad e^{2P}, \quad u e^{2P}(P_u \pm P_\theta), \quad u e^{2P}(Q_u \pm Q_\theta), \tag{15} \]
the corresponding estimates for $|f_1|$ and $|\partial_u f_{1\mp}|$ being, respectively,
\[ 1, \ Ce^{2P}, \ Cu^{1/2}e^{2P}, \ Cu^{1/2}e^{P} \quad \text{and} \quad 0, \ Cu^{-1/2}e^{2P_\mp}, \ Ce^{2P_\mp}, \ Ce^{P_\mp}, \]
where we have used (6)–(7) and the fact that (10) holds for $k = 0$. The possibilities for $f_2$ are
\[ (\partial_u \pm \partial_\theta)\partial_\theta^k P, \quad (\partial_u \pm \partial_\theta)\partial_\theta^k Q, \tag{16} \]
the corresponding estimates for $|f_2|$ and $|\partial_u f_{2\mp}|$ being, respectively,
\[ u^{-1/2} E_k^{1/2}, \ u^{-1/2} e^{-P} E_k^{1/2} \quad \text{and} \quad u^{-1} E_k^{1/2} + \frac{(\ln u)^k}{u}, \ e^{-P_\mp}\left[u^{-1} E_k^{1/2} + \frac{(\ln u)^k}{u}\right] \tag{17} \]
up to numerical factors. The reason for the latter is that
\[ \partial_u [(\partial_u \pm \partial_\theta)\partial_\theta^k P]_\mp = [\partial_\theta^k (P_{uu} - P_{\theta\theta})]_\mp = \left[-\frac{1}{u}\partial_\theta^k P_t + \partial_\theta^k\left[e^{2P}(Q_t^2 - Q_\theta^2)\right]\right]_\mp. \]
The first term on the right hand side satisfies a better estimate than the second to last expression in (17), and the terms resulting from the second term when at least one derivative hits the factor $e^{2P}$ are also better. What remains to be considered are terms of the form
\[ \left[e^{2P}(\partial_\theta^{j_1} Q_t\, \partial_\theta^{j_2} Q_t - \partial_\theta^{j_1} Q_\theta\, \partial_\theta^{j_2} Q_\theta)\right]_\mp, \]
where $j_1 + j_2 = k$. These terms can be estimated by the second to last expression in (17) due to the induction hypothesis. The argument for the second possibility for $f_2$ is similar. The possibilities for $f_3$ are $\partial_\theta^k P$, $\partial_\theta^k Q$, and the corresponding estimates for $|f_3|$ are
\[ \frac{(\ln u)^{k-1}}{u^{1/2}}, \quad e^{-P}\frac{(\ln u)^{k-1}}{u^{1/2}} \]

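due to the induction hypothesis (note that $k \ge 1$). Consider
\[ \int_{u_0}^{u} I_{1,k,\pm} \circ \gamma_\mp(t)\, dt. \]
Up to numerical factors, this integral can be written as a sum of terms of the form
\[ \int_{u_0}^{u} f_{1\mp} f_{2\mp} \partial_t f_{3\mp}\, dt = [f_{1\mp} f_{2\mp} f_{3\mp}]_{u_0}^{u} - \int_{u_0}^{u} [\partial_t f_{1\mp}\, f_{2\mp} f_{3\mp} + f_{1\mp}\, \partial_t f_{2\mp}\, f_{3\mp}]\, dt. \]
Note that not all combinations occur and that when taking the products, all factors of $e^P$ in the estimates cancel. Using the definition of $I_{1,k,\pm}$ and the estimates written down above, we get
\[ \left|\int_{u_0}^{u} I_{1,k,\pm} \circ \gamma_\mp(t)\, dt\right| \le C + C\frac{(\ln u)^{k-1}}{u^{1/2}} E_k^{1/2}(u) + C\int_{u_0}^{u}\left[\frac{(\ln t)^{k-1}}{t} E_k^{1/2}(t) + \frac{(\ln t)^{2k-1}}{t}\right] dt. \]
Let us turn to $I_{2,k,\pm}$. Up to numerical factors, the first term can be written as a sum of terms of the form
\[ t\, \partial_\theta^{j_1} P \cdots \partial_\theta^{j_l} P\, e^{2P}(\partial_\theta^{m_1} Q_t\, \partial_\theta^{m_2} Q_t - \partial_\theta^{m_1} Q_\theta\, \partial_\theta^{m_2} Q_\theta)(\partial_\theta^k P_t \pm \partial_\theta^k P_\theta), \]
where $j_i \ge 1$, $m_i \le k-1$ and $j_1 + \cdots + j_l + m_1 + m_2 = k$. Using the induction hypothesis, this can be estimated by
\[ C\frac{(\ln t)^{k-l}}{t^{(l+1)/2}} E_k^{1/2}(t). \]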

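If $l \ge 1$, this estimate is as good as what we already have, so let us consider terms of the form
\[ t e^{2P}(\partial_\theta^{m_1} Q_t\, \partial_\theta^{m_2} Q_t - \partial_\theta^{m_1} Q_\theta\, \partial_\theta^{m_2} Q_\theta)(\partial_\theta^k P_t \pm \partial_\theta^k P_\theta), \]
where $m_1 + m_2 = k$ but $m_i \le k-1$. Note that
\[ \partial_\theta^{m_1} Q_t\, \partial_\theta^{m_2} Q_t - \partial_\theta^{m_1} Q_\theta\, \partial_\theta^{m_2} Q_\theta = \frac{1}{2}\left[(\partial_\theta^{m_1} Q_t \pm \partial_\theta^{m_1} Q_\theta)(\partial_\theta^{m_2} Q_t \mp \partial_\theta^{m_2} Q_\theta) + (\partial_\theta^{m_1} Q_t \mp \partial_\theta^{m_1} Q_\theta)(\partial_\theta^{m_2} Q_t \pm \partial_\theta^{m_2} Q_\theta)\right]. \]
In other words, we need only concern ourselves with terms of the form
\[ t e^{2P}(\partial_\theta^{m_1} Q_t \pm \partial_\theta^{m_1} Q_\theta)(\partial_\theta^{m_2} Q_t \mp \partial_\theta^{m_2} Q_\theta)(\partial_\theta^k P_t \pm \partial_\theta^k P_\theta). \]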



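We can then argue as before, with $f_1 = t e^{2P}(\partial_\theta^{m_1} Q_t \pm \partial_\theta^{m_1} Q_\theta)$, $f_2 = (\partial_\theta^k P_t \pm \partial_\theta^k P_\theta)$ and $f_3 = \partial_\theta^{m_2} Q$. Note that since $m_1 + m_2 = k$ and $m_i \le k-1$, we have $m_i \ge 1$. The arguments for the remaining terms in $I_{2,k,\pm}$ are similar, and by (13) we get
\[ \int_{u_0}^{u} [(\partial_t \mp \partial_\theta) A_{k,\pm}]_\mp(t)\, dt \le C + C\frac{(\ln u)^{k-1}}{u^{1/2}} E_k^{1/2}(u) + C\int_{u_0}^{u}\left[\frac{(\ln t)^{k-1}}{t} E_k^{1/2}(t) + \frac{(\ln t)^{2k-1}}{t}\right] dt. \]
Taking the supremum of the right hand side in (14), we thus get
\[ A_{k,\pm}[\gamma_\mp(u)] \le \|A_{k,\pm}(u_0,\cdot)\|_{C^0(S^1,\mathbb{R})} + C + C\frac{(\ln u)^{k-1}}{u^{1/2}} E_k^{1/2}(u) + C\int_{u_0}^{u}\left[\frac{(\ln t)^{k-1}}{t} E_k^{1/2}(t) + \frac{(\ln t)^{2k-1}}{t}\right] dt. \]
Taking the supremum of the left hand side (note that there is a $\theta$ hidden in $\gamma_\pm$) and adding the two estimates, we get
\[ E_k(u) \le C + C\frac{(\ln u)^{k-1}}{u^{1/2}} E_k^{1/2}(u) + C\int_{u_0}^{u}\left[\frac{(\ln t)^{k-1}}{t} E_k^{1/2}(t) + \frac{(\ln t)^{2k-1}}{t}\right] dt. \]
Note that
\[ C\frac{(\ln u)^{k-1}}{u^{1/2}} E_k^{1/2}(u) \le \frac{1}{2} C^2 \frac{(\ln u)^{2k-2}}{u} + \frac{1}{2} E_k(u). \]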

Defining $\hat E_k(u) = E_k(u) + (\ln u)^{2k}$, we thus get the estimate
\[ \hat E_k(u) \le C + C\int_{u_0}^{u}\left[\frac{(\ln t)^{k-1}}{t} E_k^{1/2}(t) + \frac{(\ln t)^{2k-1}}{t}\right] dt \le C + C\int_{u_0}^{u} \frac{(\ln t)^{k-1}}{t} \hat E_k^{1/2}(t)\, dt. \]
By a Grönwall's lemma type argument, we conclude that $\hat E_k(u) \le C_k (\ln u)^{2k}$ for $u \ge u_0$. This completes the induction proof.
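The Grönwall-type step that closes the induction can be sketched as follows. The auxiliary function $F$ is introduced here only for illustration, and the sketch assumes $k \ge 1$, $C > 0$ and $u_0 \ge 2$ (so that $\ln u \ge \ln 2 > 0$ on the range considered); it is one possible way to carry out the argument, not necessarily the author's.

```latex
% F dominates \hat E_k by the integral inequality for \hat E_k.
\[
F(u) := C + C\int_{u_0}^{u} \frac{(\ln t)^{k-1}}{t}\,\hat E_k^{1/2}(t)\,dt,
\qquad \hat E_k(u) \le F(u).
\]
% Differentiate F and use \hat E_k^{1/2} \le F^{1/2}; note F \ge C > 0.
\[
F'(u) = C\,\frac{(\ln u)^{k-1}}{u}\,\hat E_k^{1/2}(u)
  \le C\,\frac{(\ln u)^{k-1}}{u}\,F^{1/2}(u)
\quad\Longrightarrow\quad
\frac{d}{du}F^{1/2}(u) \le \frac{C}{2}\,\frac{(\ln u)^{k-1}}{u}.
\]
% Integrate from u_0 to u, using F(u_0) = C.
\[
F^{1/2}(u) \le C^{1/2} + \frac{C}{2k}\left[(\ln u)^k - (\ln u_0)^k\right]
  \le C^{1/2} + \frac{C}{2k}(\ln u)^k.
\]
% Since \ln u \ge \ln 2 > 0, the additive constant can be absorbed:
\[
\hat E_k(u) \le F(u) \le C_k (\ln u)^{2k}, \qquad u \ge u_0.
\]
```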


Before we come to the curvature estimate, let us define
\[ \tau(t, t_0) = \sup_{\gamma} \int_{t_0}^{t} [-\langle \gamma'(s), \gamma'(s)\rangle]^{1/2}\, ds, \tag{18} \]
where the supremum is taken over smooth timelike curves $\gamma(s) = [s, x(s)]$, where $x$ takes values on $T^3$. Note that for an arbitrary smooth timelike curve joining the hypersurface corresponding to $t_0$ with the hypersurface corresponding to $t$, one can change the parameterization so that it is of the above mentioned form.

Proposition 3. Consider a metric of the form (5), where $P$, $Q$ and $\lambda$ satisfy (6)–(9). Assume furthermore that $P$ and $Q$ are not both independent of $\theta$ for all $t$. Given $t_0 > 0$ there are positive constants $c(t_0)$ and $C(t_0)$ such that for $t \ge t_0 + 1$,
\[ c(t_0)\, t^{-1/4} e^{\lambda(t)/4} \le \tau(t, t_0) \le C(t_0)\, t^{-1/4} e^{\lambda(t)/4}. \tag{19} \]

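Proof. Note that since (10) holds for $k = 0$, $|\lambda_\theta|$ is bounded to the future, and consequently,
\[ |\lambda(t, \theta) - \lambda(t)| \le C(t_0) \tag{20} \]
for $t \ge t_0$. Let us estimate
\[
\begin{aligned}
t^{1/4} e^{-\lambda(t)/4} \int_{t_0}^{t} [-\langle \gamma'(s), \gamma'(s)\rangle]^{1/2}\, ds
&\le \int_{t_0}^{t} \left(\frac{t}{s}\right)^{1/4} \exp\{[\lambda(s, \theta(s)) - \lambda(t)]/4\}\, ds \\
&\le C(t_0) \int_{t_0}^{t} \left(\frac{t}{s}\right)^{1/4} \exp\{[\lambda(s) - \lambda(t)]/4\}\, ds.
\end{aligned}
\]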

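However, by Theorem 1.6 of [12] we have $|\lambda_t(t) - c_0| \le C(t_0)\, t^{-1}$ for $t \ge t_0$, where $c_0 > 0$, assuming the solution is not independent of $\theta$. Thus
\[ \lambda(s) - \lambda(t) \le -c_0 (t - s) + C(t_0) \ln\frac{t}{s}. \]
We conclude that
\[
\begin{aligned}
t^{1/4} e^{-\lambda(t)/4} \int_{t_0}^{t} [-\langle \gamma'(s), \gamma'(s)\rangle]^{1/2}\, ds
&\le C(t_0) \int_{t_0}^{t} \left(\frac{t}{s}\right)^{\alpha(t_0)} \exp[-c_0 (t - s)/4]\, ds \\
&= C(t_0) \int_{t_0/t}^{1} u^{-\alpha(t_0)} \exp[-c_0 t (1 - u)/4]\, t\, du.
\end{aligned}
\tag{21}
\]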



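If $t \le 2t_0$, this integral is bounded. If $t \ge 2t_0$ we can divide the integral into two parts. Let us estimate
\[ \int_{1/2}^{1} u^{-\alpha(t_0)} \exp[-c_0 t (1 - u)/4]\, t\, du \le 2^{\alpha(t_0)} \int_{1/2}^{1} \exp[-c_0 t (1 - u)/4]\, t\, du \le 2^{\alpha(t_0)} \frac{4}{c_0}. \]
We also have
\[ \int_{t_0/t}^{1/2} u^{-\alpha(t_0)} \exp[-c_0 t (1 - u)/4]\, t\, du \le \frac{4}{c_0} \left(\frac{t}{t_0}\right)^{\alpha(t_0)} \exp[-c_0 t/8] \]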

which is bounded by a constant depending on $t_0$. Note that the constants involved in the arguments above are independent of the curve $\gamma$. Thus $\tau(t, t_0) \le C(t_0)\, t^{-1/4} e^{\lambda(t)/4}$. In order to get the opposite inequality, consider the curve $\gamma(s) = (s, x_0)$, where $x_0$ is a fixed point on $T^3$. We get
\[
\begin{aligned}
t^{1/4} e^{-\lambda(t)/4} \int_{t_0}^{t} [-\langle \gamma'(s), \gamma'(s)\rangle]^{1/2}\, ds
&= \int_{t_0}^{t} \left(\frac{t}{s}\right)^{1/4} \exp\{[\lambda(s, \theta_0) - \lambda(t)]/4\}\, ds \\
&\ge c(t_0) \int_{t_0}^{t} \exp\{[\lambda(s) - \lambda(t)]/4\}\, ds,
\end{aligned}
\]
where $c(t_0)$ is a positive constant. Assuming $t \ge t_0 + 1$, we can use (21) to prove that
\[ \int_{t_0}^{t} \exp\{[\lambda(s) - \lambda(t)]/4\}\, ds \ge \int_{t-1/2}^{t} \exp\{[\lambda(s) - \lambda(t)]/4\}\, ds \ge c(t_0) > 0. \]
The proposition follows.

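Proof (Theorem 2). Note that there is no loss of generality in choosing the vectors orthogonal to $e_0$ to be
\[ e_1 = t^{1/4} e^{-\lambda/4} \partial_\theta, \quad e_2 = t^{-1/2} e^{-P/2} \partial_\sigma, \quad e_3 = t^{-1/2} e^{P/2}(-Q\partial_\sigma + \partial_\delta). \]
It will be convenient to introduce the notation $\phi = t^{1/4} e^{-\lambda/4}$. Note that
\[ c(t_0) \le \phi(t, \theta)\, \tau(t_0, t) \le C(t_0) \tag{22} \]
for $t \ge t_0 + 1$ and $\theta \in S^1$ due to (20) and (19). Let $\Gamma^{\alpha}_{\beta\gamma} e_\alpha = \nabla_{e_\beta} e_\gamma$. Then
\[ \langle R_{e_\mu e_\nu} e_\alpha, e_\beta\rangle = e_\nu(\Gamma^{\delta}_{\mu\alpha})\eta_{\delta\beta} - e_\mu(\Gamma^{\delta}_{\nu\alpha})\eta_{\delta\beta} + \Gamma^{\delta}_{\mu\alpha}\Gamma^{\kappa}_{\nu\delta}\eta_{\kappa\beta} - \Gamma^{\delta}_{\nu\alpha}\Gamma^{\kappa}_{\mu\delta}\eta_{\kappa\beta} + \eta_{\delta\beta}\gamma^{\kappa}_{\mu\nu}\Gamma^{\delta}_{\kappa\alpha}, \]
where $\eta$ is the Minkowski metric and where $[e_\alpha, e_\beta] = \gamma^{\kappa}_{\alpha\beta} e_\kappa$ defines $\gamma^{\kappa}_{\mu\nu}$. The above formulas indicate what sign conventions we are using. One can check that all the terms except $e_\nu(\Gamma^{\delta}_{\mu\alpha})\eta_{\delta\beta} - e_\mu(\Gamma^{\delta}_{\nu\alpha})\eta_{\delta\beta}$ can be estimated by $\phi^2$. Furthermore, due to the estimate (10), one sees that the only problem consists in second derivatives of $\lambda$. However, one can check that these derivatives only occur in the combination $\lambda_{tt} - \lambda_{\theta\theta}$, which is $O(t^{-1/2})$ due to (10) and the equations. This proves that $|R| \le C\phi^2$, which together with (22) proves (11).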


Proof (Proposition 1). Let $f_1 = \phi^{-2}\tau^{-2}(t_0, t)$, using the notation of the previous proof. Due to (22), we conclude that $f_1(\tau, \cdot)$ is bounded from above and from below by positive constants. Since $\lambda_\theta$ is bounded, due to (10) for $k = 0$, $\partial_\theta f_1$ is bounded. The conclusions concerning $f_1$ follow. Note that if we had an estimate of the form (10) without the logarithms, $\partial_\theta^k \lambda$ would be bounded to the future for any $k \ge 1$, and consequently $f_1(t, \cdot)$ would be bounded in any $C^k$ norm for $t \ge t_0 + 1$. Due to the results of [12], $P$ does not grow faster than logarithmically and $Q$ does not go to infinity faster than polynomially. Combining this information with (10), we conclude that $\partial_\theta^k P$ converges to zero for any $k \ge 1$ and that $\partial_\theta^k Q$ does not grow faster than polynomially. Due to (19) and the fact that $\lambda = c_0 t + O(\ln t)$, where $c_0 > 0$, cf. (21), we conclude that for large $t$, $t$ and $\ln[1 + \tau(t_0, t)]$ are equivalent. Adding these pieces together, we get the conclusions of the proposition.

3. Bianchi VIII

In this section we prove Theorem 3 and Proposition 2. The results necessary in order to carry out the computations are all taken from [11]. However, we refer the reader to [10] and the appendices of [9] for more details on curvature computations in the current setting.

Proof (Theorem 3). Let $e_0 = \partial_t$ and $e_i = (a_i)^{-1} e'_i$ (no summation) for $i = 1, 2, 3$, with terminology as in Subsect. 1.2. Let Greek indices range from 0 to 3 and Latin indices from 1 to 3. Define $[e_\alpha, e_\beta] = \gamma^{\delta}_{\alpha\beta} e_\delta$. Due to the form (12) and the fact that $e'_i$ is a canonical basis, we have $\gamma^{0}_{ij} = \gamma^{0}_{0i} = 0$. Furthermore, we can define $n$, $\theta$ and $k$ by
\[ \gamma^{k}_{ij} = \epsilon_{ijl}\, n^{lk}, \quad \gamma^{i}_{0j} = -\theta_{ij} \quad \text{and} \quad k(e_i, e_j) = \langle \nabla_{e_i} e_0, e_j\rangle. \]
Then $n^{lk}$ is diagonal, and the diagonal components will be denoted by $n_i$. Furthermore $\theta_{ij}$ is diagonal, and coincides with $-k(e_i, e_j)$. In what follows, we shall raise and lower Latin indices with $\delta_{ij}$, and we shall consequently not be very careful when it comes to indices being upstairs or downstairs. Let $\theta$ denote the trace of $\theta_{ij}$ and let $\sigma_{ij}$ be the traceless part. Since $\theta$ is never zero in the case of Bianchi VIII, cf. Lemma 21.5 of [9], we can define
\[ \Sigma_{ij} = \frac{\sigma_{ij}}{\theta}, \quad N_i = \frac{n_i}{\theta}, \quad \Sigma_+ = \frac{3}{2}(\Sigma_{22} + \Sigma_{33}), \quad \Sigma_- = \frac{\sqrt{3}}{2}(\Sigma_{22} - \Sigma_{33}). \]
The relevant curvature quantities can be written
\[ \kappa = R_{\alpha\beta\gamma\delta} R^{\alpha\beta\gamma\delta} = 8(E_{ij}E^{ij} - H_{ij}H^{ij}), \quad |R|^2 = 8(E_{ij}E^{ij} + H_{ij}H^{ij}), \]
where
\[
\begin{aligned}
E_{ij} &= \frac{1}{3}\theta\sigma_{ij} - \left(\sigma_i{}^{k}\sigma_{kj} - \frac{1}{3}\sigma_{kl}\sigma^{kl}\delta_{ij}\right) + s_{ij}, \\
H_{ij} &= -3\sigma^{k}{}_{(i}\, n_{j)k} + n_{kl}\sigma^{kl}\delta_{ij} + \frac{1}{2}\mathrm{tr}(n)\sigma_{ij}, \\
s_{ij} &= b_{ij} - \frac{1}{3}\mathrm{tr}(b)\delta_{ij}, \\
b_{ij} &= 2 n_i{}^{k} n_{kj} - \mathrm{tr}(n)\, n_{ij},
\end{aligned}
\]



cf. p. 19 and p. 40 of [15]. Note that $E_{ij}$ and $H_{ij}$ define diagonal traceless matrices. In order to relate these expressions to the variables defined above, it will be convenient to define $\tilde H_i = H_{ii}/\theta^2$, $\tilde E_i = E_{ii}/\theta^2$. Then
\[
\begin{aligned}
\tilde H_1 &= N_1\Sigma_+ + \frac{1}{\sqrt{3}}(N_2 - N_3)\Sigma_-, \\
\tilde H_2 &= -\frac{1}{2}N_2(\Sigma_+ + \sqrt{3}\Sigma_-) + \frac{1}{2}(N_3 - N_1)\left(\Sigma_+ - \frac{1}{\sqrt{3}}\Sigma_-\right), \\
\tilde E_2 - \tilde E_3 &= \frac{2}{3\sqrt{3}}\Sigma_-(1 - 2\Sigma_+) + (N_2 - N_3)(N_2 + N_3 - N_1), \\
\tilde E_2 + \tilde E_3 &= \frac{2}{9}\Sigma_+(1 + \Sigma_+) - \frac{2}{9}\Sigma_-^2 - \frac{2}{3}N_1^2 + \frac{1}{3}(N_2 - N_3)^2 + \frac{1}{3}N_1(N_2 + N_3).
\end{aligned}
\]
Note that all other components of $\tilde E_i$ and $\tilde H_i$ can be computed from this due to the fact that $E_{ij}$ and $H_{ij}$ both define traceless matrices.

Let us consider the case when the initial data are of NUT type. The relevant statements concerning the asymptotics are then to be found on pp. 1955–1956 of [11]. In this case $\Sigma_- = 0$, $N_2 = N_3$ and
\[ \left|\Sigma_+ - \frac{1}{2}\right| + \left|(N_1 N_2)(\tau) + \frac{1}{4}\right| + \left|N_2 e^{-3\tau/2} - c_N\right| \le C e^{-3\tau/2} \]
for some positive constants $c_N$ and $C$ and for $\tau \ge 0$. Furthermore, there are positive constants $c_\theta$, $C$ such that
\[ \left|\frac{1}{\theta(\tau)} - c_\theta e^{3\tau/2}\right| \le C \]
for $\tau \ge 0$. Finally, $t$ and $\tau$ are related through $|t(\tau) - 2c_\theta e^{3\tau/2}| \le C(1 + \tau)$ for all $\tau \ge 0$. We conclude that $\tilde H_i$ and $\tilde E_i$ are all $O(e^{-3\tau/2}) = O(\theta)$. We conclude that $|R|^2 = O(\theta^6) = O(t^{-6})$. This proves the upper bound in the theorem. In order to prove the lower bound, we need only observe that
\[ \lim_{t\to\infty} t\tilde H_1 = -\frac{c_\theta}{4c_N} \ne 0. \]
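The value of this limit can be checked directly from the NUT asymptotics quoted above. The computation below is a sketch that uses only the leading-order limits $\Sigma_+ \to 1/2$, $N_1 N_2 \to -1/4$, $N_2 e^{-3\tau/2} \to c_N$ and $t e^{-3\tau/2} \to 2c_\theta$, each of which follows from the stated bounds.

```latex
% NUT case: \Sigma_- = 0 and N_2 = N_3, so \tilde H_1 = N_1 \Sigma_+.
\[
t\,\tilde H_1 = t\,N_1\Sigma_+
  = \left(t\,e^{-3\tau/2}\right)\frac{N_1 N_2}{N_2 e^{-3\tau/2}}\,\Sigma_+
  \;\longrightarrow\; 2c_\theta \cdot \frac{-1/4}{c_N} \cdot \frac{1}{2}
  = -\frac{c_\theta}{4c_N}.
\]
% The limit is nonzero because c_\theta and c_N are positive constants.
```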

Let us consider the general case. The necessary information is contained in Proposition 6, Corollary 7 and Corollary 8 of [11]. Note that in these results,
\[ h := \Sigma_-^2 + \frac{3}{4}(N_2 - N_3)^2, \quad v := -N_1(N_2 + N_3) - \frac{1}{2}, \quad u := \Sigma_+ - \frac{1}{2}. \]
We have
\[ \Sigma_-^2 + \frac{3}{4}(N_2 - N_3)^2 = \frac{1}{4\tau} + O\left(\frac{\ln\tau}{\tau^2}\right), \quad \Sigma_+ = \frac{1}{2} + O(\tau^{-1}) \]
and
\[ N_1(N_2 + N_3) = -\frac{1}{2} + O(\tau^{-2}). \tag{23} \]


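By (82) of [11], we also have
\[ N_2 = c_N \tau^{-3/4} e^{3\tau/2}\left[1 + O\left(\frac{\ln\tau}{\tau}\right)\right] \tag{24} \]
for some positive constant $c_N$. In combination with the above equations, this proves that $N_1$ converges to zero exponentially. In view of the above equations, we have
\[
\begin{aligned}
\tilde H_1 &= O(\tau^{-1}), \\
\tilde H_2 &= -\frac{1}{2}N_2(\Sigma_+ + \sqrt{3}\Sigma_-) + \frac{1}{2}N_2\left(\Sigma_+ - \frac{1}{\sqrt{3}}\Sigma_-\right) + O(\tau^{-1/2}) = -\frac{2}{\sqrt{3}}N_2\Sigma_- + O(\tau^{-1/2}), \\
\tilde E_2 - \tilde E_3 &= 2N_2(N_2 - N_3) + O(\tau^{-1}), \\
\tilde E_2 + \tilde E_3 &= O(\tau^{-1}).
\end{aligned}
\]
Thus
\[
\begin{aligned}
\theta^{-4}|R|^2 &= 8\left[\frac{3}{2}(\tilde E_2 + \tilde E_3)^2 + \frac{1}{2}(\tilde E_2 - \tilde E_3)^2 + \tilde H_1^2 + \tilde H_2^2 + (\tilde H_1 + \tilde H_2)^2\right] \\
&= 8\left[2N_2^2(N_2 - N_3)^2 + \frac{8}{3}N_2^2\Sigma_-^2 + N_2\, O(\tau^{-1})\right] \\
&= \frac{64}{3}N_2^2\left[\Sigma_-^2 + \frac{3}{4}(N_2 - N_3)^2 + N_2^{-1} O(\tau^{-1})\right].
\end{aligned}
\]
Taking (23) into account, we conclude that
\[ \lim_{\tau\to\infty} \tau N_2^{-2}\theta^{-4}|R|^2 = \frac{16}{3}. \tag{25} \]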
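The constant $16/3$ in (25) can be read off from the last expression for $\theta^{-4}|R|^2$ together with (23) and (24); the short computation below is a sketch of that arithmetic.

```latex
% Divide the last expression for \theta^{-4}|R|^2 by N_2^2 and multiply by \tau:
\[
\tau N_2^{-2}\theta^{-4}|R|^2
  = \frac{64}{3}\left[\tau\left(\Sigma_-^2 + \tfrac{3}{4}(N_2 - N_3)^2\right) + N_2^{-1}\,O(1)\right].
\]
% By (23), \tau(\Sigma_-^2 + \tfrac34 (N_2 - N_3)^2) = \tfrac14 + O(\ln\tau/\tau),
% and by (24), N_2^{-1} \to 0 as \tau \to \infty. Hence
\[
\lim_{\tau\to\infty}\tau N_2^{-2}\theta^{-4}|R|^2 = \frac{64}{3}\cdot\frac{1}{4} = \frac{16}{3}.
\]
```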

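On p. 1972 of [11], it is shown that there is a positive constant $\alpha_\theta$ such that
\[ \frac{1}{\theta} = \frac{\alpha_\theta}{\tau^{1/4}} e^{3\tau/2}\left[1 + O\left(\frac{\ln\tau}{\tau}\right)\right], \quad t = \frac{2\alpha_\theta}{\tau^{1/4}} e^{3\tau/2}\left[1 + O\left(\frac{\ln\tau}{\tau}\right)\right]. \]
Combining this with (24), we conclude that there are positive constants $c_i$, $i = 1, 2, 3$, such that
\[ \lim_{\tau\to\infty} t^{-2}(\tau)\,\tau N_2^2(\tau) = c_1, \quad \lim_{\tau\to\infty} t(\tau)\theta(\tau) = c_2, \quad \lim_{\tau\to\infty} \tau[\ln t(\tau)]^{-1} = c_3. \]
Combining this with (25), we conclude that there is a positive constant $c_0$ such that
\[ \lim_{t\to\infty} t \ln t\, |R|(t) = c_0. \]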

Since there are sequences $\tau_{i,k} \to \infty$, $i = 1, 2$, such that $\Sigma_-(\tau_{1,k}) = 0$ and $(N_2 - N_3)(\tau_{2,k}) = 0$, cf. [11], the conclusions concerning the Kretschmann scalar follow by similar arguments.



Proof (Proposition 2). Let Ric denote the Ricci curvature of a spatial hypersurface of homogeneity. One can compute that
\[ \mathrm{Ric}(e_i, e_j) = 2 n_i{}^{k} n_{kj} - \mathrm{tr}(n)\, n_{ij} - n_{kl} n^{kl}\delta_{ij} + \frac{1}{2}[\mathrm{tr}(n)]^2\delta_{ij}, \]
with terminology as in the proof of Theorem 3. Let $R_i = \mathrm{Ric}(e_i, e_i)$. We get
\[ \theta^{-2}R_1 = \frac{1}{2}N_1^2 - \frac{1}{2}(N_2 - N_3)^2, \quad \theta^{-2}R_2 = \frac{1}{2}N_2^2 - \frac{1}{2}(N_1 - N_3)^2 \]
and similarly for $R_3$. We see that $\theta^{-2}R_1$ tends to zero and that
\[ \theta^{-2}R_2 = \frac{1}{2}(N_2 + N_3)(N_2 - N_3) - \frac{1}{2}N_1^2 + N_1 N_3. \]
The statement concerning $R_3$ is similar. Note that there are time sequences $\tau_{i,k} \to \infty$, $i = 1, 2$, such that
\[ \lim_{k\to\infty} (N_2 - N_3)(\tau_{1,k})\, \tau_{1,k}^{1/2} = c_0, \]
for some positive constant $c_0$, and such that $(N_2 - N_3)(\tau_{2,k}) = 0$. Once one has made the above observations, the argument is similar to the end of the proof of Theorem 3.

Acknowledgement. I am grateful to Michael Anderson for discussions that led to me considering these problems.
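The expressions for $\theta^{-2}R_1$ and $\theta^{-2}R_2$ follow from the Ricci formula by elementary algebra. The sketch below writes this out for $R_1$, assuming only that $n_{ij}$ is diagonal with entries $n_i$ and that $N_i = n_i/\theta$, as in the proof of Theorem 3.

```latex
% Diagonal case: n_{ij} = \mathrm{diag}(n_1, n_2, n_3), N_i = n_i/\theta.
\[
\theta^{-2}R_1 = 2N_1^2 - (N_1 + N_2 + N_3)N_1 - (N_1^2 + N_2^2 + N_3^2)
  + \tfrac{1}{2}(N_1 + N_2 + N_3)^2.
\]
% Expanding the square, the N_1 N_2 and N_1 N_3 cross terms cancel:
\[
\theta^{-2}R_1 = \tfrac{1}{2}N_1^2 - \tfrac{1}{2}N_2^2 - \tfrac{1}{2}N_3^2 + N_2 N_3
  = \tfrac{1}{2}N_1^2 - \tfrac{1}{2}(N_2 - N_3)^2.
\]
% Exchanging the indices 1 and 2 gives the formula for R_2, and then
\[
\tfrac{1}{2}N_2^2 - \tfrac{1}{2}(N_1 - N_3)^2
  = \tfrac{1}{2}(N_2 + N_3)(N_2 - N_3) - \tfrac{1}{2}N_1^2 + N_1 N_3.
\]
```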

References

1. Anderson, M.: Scalar curvature and geometrization conjectures for 3-manifolds. Comparison Geometry, Vol. 30, MSRI Publications, Cambridge: Cambridge University Press, 1997, pp. 49–82
2. Anderson, M.: On long-time evolution in general relativity and geometrization of 3-manifolds. Commun. Math. Phys. 222, 533–567 (2001)
3. Andersson, L., Moncrief, V.: Future complete vacuum spacetimes. In: Chruściel, P.T., Friedrich, H. (eds.), The Einstein equations and the large scale behavior of gravitational fields, Basel: Birkhäuser, 2004, pp. 299–330
4. Choquet-Bruhat, Y., Moncrief, V.: Future Global in Time Einsteinian Spacetimes with U(1) Isometry Group. Ann. Henri Poincaré 2, 1001–1064 (2001)
5. Choquet-Bruhat, Y.: Future complete U(1) symmetric Einsteinian spacetimes, the unpolarized case. In: Chruściel, P.T., Friedrich, H. (eds.), The Einstein equations and the large scale behavior of gravitational fields, Basel: Birkhäuser, 2004, pp. 251–298
6. Choquet-Bruhat, Y., Cotsakis, S.: Global hyperbolicity and completeness. J. Geom. Phys. 43, 345–350 (2002)
7. Fischer, A., Moncrief, V.: The reduced Einstein equations and the conformal volume collapse of 3-manifolds. Class. Quantum Grav. 18, 4493–4515 (2001)
8. Ringström, H.: Curvature blow up in Bianchi VIII and IX vacuum spacetimes. Class. Quantum Grav. 17, 713–731 (2000)
9. Ringström, H.: The Bianchi IX attractor. Ann. Henri Poincaré 2, 405–500 (2001)
10. Ringström, H.: The future asymptotics of Bianchi VIII vacuum solutions. Class. Quantum Grav. 18, 3791–3823 (2001)
11. Ringström, H.: Future asymptotic expansions of Bianchi VIII vacuum metrics. Class. Quantum Grav. 20, 1943–1989 (2003)
12. Ringström, H.: On a wave map equation arising in General Relativity. Commun. Pure Appl. Math. 57, 657–703 (2004)
13. Ringström, H.: Data at the moment of infinite expansion for polarized Gowdy. Class. Quantum Grav. 22, 1647–1653 (2005)
14. Scott, P.: The geometries of 3-manifolds. Bull. London Math. Soc. 15, 401–487 (1983)
15. Wainwright, J., Ellis, G.F.R. (eds.): Dynamical systems in cosmology. Cambridge: Cambridge University Press, 1997

Communicated by G.W. Gibbons

Commun. Math. Phys. 264, 631–656 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1523-x

Communications in


Forbidden Gap Argument for Phase Transitions Proved by Means of Chessboard Estimates

Marek Biskup (1), Roman Kotecký (2)

(1) Department of Mathematics, UCLA, Los Angeles, California, USA
(2) Center for Theoretical Study, Charles University, Prague, Czech Republic

Received: 3 May 2005 / Accepted: 13 September 2005
Published online: 22 March 2006 – © M. Biskup and R. Kotecký 2006

Abstract: Chessboard estimates are one of the standard tools for proving phase coexistence in spin systems of physical interest. In this note we show that the method not only produces a point in the phase diagram where more than one Gibbs state coexists, but that it can also be used to rule out the existence of shift-ergodic states that differ significantly from those proved to exist. For models depending on a parameter (say, the temperature), this shows that the values of the conjugate thermodynamic quantity (the energy) inside the “transitional gap” are forbidden in all shift-ergodic Gibbs states. We point out several models where our result provides useful additional information concerning the set of possible thermodynamic equilibria.

© 2006 by M. Biskup and R. Kotecký. Reproduction, by any means, of the entire article for noncommercial purposes is permitted without charge.

1. Introduction

One of the basic tasks of mathematical statistical mechanics is to find a rigorous approach to various first-order phase transitions in lattice spin systems. Here two methods of proof are generally available: Pirogov-Sinai theory and chessboard estimates. The former, developed in [30, 31], possesses an indisputable advantage of robustness with respect to (general) perturbations, but its drawbacks are the restrictions (not entirely without hope of being eventually eliminated [22, 23, 15, 35, 7]) to (effectively) finite sets of possible spin values and to situations with rapidly decaying correlations. The latter method, which goes back to [20, 18, 19], is limited, for the most part, to systems with nearest-neighbor interactions, but it poses almost no limitations on the individual spin space and/or the rate of correlation decay; see e.g. [29].

While both techniques ultimately produce a proof of phase coexistence, Pirogov-Sinai theory offers significantly better control of the number of possible Gibbs states. Indeed, one can prove the so-called completeness of the phase diagram [34, 8], which asserts that the states constructed by the theory exhaust the set of all shift-ergodic Gibbs states. (In technical terms, there is a one-to-one correspondence between the shift-ergodic Gibbs states and the “stable phases” defined in terms of minimal “metastable free energy”.) Unfortunately, no conclusion of this kind is currently available in the approaches based solely on chessboard estimates. This makes many of the conclusions of this technique (see [12, 33, 3, 5, 17] for a modest sample of recent references) seem to be somewhat “incomplete”.

To make the distinction more explicit, let us consider the example of the temperature-driven first-order phase transition in the $q$-state Potts model with $q \gg 1$. In dimensions $d \ge 2$, there exists a transition temperature, $T_t$, at which there are $q$ ordered states that are low on both entropy and energy, and one disordered state which is abundant in both quantities. The transition is accompanied by a massive jump in the energy density (as a function of temperature). Here the “standard” proof based on chessboard estimates [25, 26] produces “only” the existence of a temperature where the aforementioned $q + 1$ states coexist, but it does not rule out the existence of other states; particularly, those with energies “inside” the jump. On the other hand, Pirogov-Sinai approaches [24, 27] permit us to conclude that no other than the above $q + 1$ shift-ergodic Gibbs states can exist at $T_t$ and, in particular, there is a forbidden gap of energy densities where no shift-ergodic Gibbs states are allowed to enter.

The purpose of this note is to show that, after all, chessboard estimates can also be supplemented with a corresponding “forbidden-gap” argument. Explicitly, we will show that the calculations (and the assumptions) used, e.g., in [25, 12, 33, 3, 5, 17] to prove the existence of particular Gibbs states at the corresponding transition temperature, or other driving parameter, imply also the absence of Gibbs states that differ significantly from those proved to exist.
We emphasize that no statement about the number of possible extremal, translation-invariant Gibbs states is being made here, i.e., the completeness of phase diagram in its full extent remains unproved. Notwithstanding, our results go some way towards a proof of completeness by ruling out, on general grounds, all but a “small neighborhood” of the few desired states (which may themselves be a non-trivial convex combination of extremal states). The assumptions we make are quite modest; indeed, apart from the necessary condition of reflection positivity we require only translation invariance and absolute summability of interactions. And, of course, the validity—uniformly in the parameter driving the transition—of a bound that is generally used to suppress the contours while proving the existence of coexisting phases. We also remark that the conclusion about the “forbidden gap” should not be interpreted too literally. Indeed, there are systems (e.g., the Potts model in an external field) where more than one gap may “open up” at the transition. Obviously, in such situations one may have to consider a larger set of observables and/or richer parametrization of the model. We refer the reader to our theorems for the precise interpretation of the phrase “forbidden gap” in a general context. The main idea of the proof is that all Gibbs states (at the same temperature) have the same large-deviation properties on the scale that is exponential in volume. This permits us to compare any translation-invariant Gibbs state with a corresponding measure on the torus, where chessboard estimates can be used to rule out most of the undesirable scenarios. The comparison with torus boundary conditions requires an estimate on the interaction “across” the boundary; as usual this is implied by the absolute summability of interactions. This is the setting we assume for the bulk of this paper (cf Theorem 2.5). 
For systems with unbounded interactions, a similar conclusion can be made under the assumption that the interactions are integrable with respect to the measures of interest (see Theorem 4.4).


The rest of this paper is organized as follows: In Sects. 2.1 and 2.2 we define the class of models to which our techniques apply and review various elementary facts about reflection positivity and chessboard estimates. The statements of our main theorems (Theorem 2.5 and Corollary 2.6) come in Sect. 2.3. The proofs constitute the bulk of Sect. 3; applications to recent results established by means of chessboard estimates are discussed in Sect. 4. The Appendix (Sect. 5) contains the proof of Theorem 4.4 which provides an explicit estimate on the energy gap from Theorem 3 of [17]. This result is needed for one of our applications in Sect. 4.

2. Main Result

In order to formulate our principal claims we will first recall the standard setup for proofs of first-order phase transitions by chessboard estimates and introduce the necessary notations. The actual theorems are stated in Sect. 2.3.

2.1. Models of interest. We will work with the standard class of spin systems on $\mathbb{Z}^d$ and so we will keep our discussion of general concepts at the minimum possible. We refer the reader to Georgii’s monograph [21] for a more comprehensive treatment and relevant references. Our spins, $s_x$, will take values in a compact separable metric space $\Omega_0$. We equip $\Omega_0$ with the $\sigma$-algebra $\mathcal{F}_0$ of its Borel subsets and consider an a priori probability measure $\nu_0$ on $(\Omega_0, \mathcal{F}_0)$. Spin configurations on $\mathbb{Z}^d$ are the collections $(s_x)_{x\in\mathbb{Z}^d}$. We will use $\Omega = \Omega_0^{\mathbb{Z}^d}$ to denote the set of all spin configurations on $\mathbb{Z}^d$ and $\mathcal{F}$ to denote the $\sigma$-algebra of Borel subsets of $\Omega$ defined using the product topology. If $\Lambda \subset \mathbb{Z}^d$, we define $\mathcal{F}_\Lambda$ to be the sub-$\sigma$-algebra of events depending only on $(s_x)_{x\in\Lambda}$. For each $x \in \mathbb{Z}^d$, the map $\tau_x : \Omega \to \Omega$ is the “translation by $x$” defined by $(\tau_x s)_y = s_{x+y}$. It is easy to check that $\tau_x$ is continuous and hence measurable for all $x \in \mathbb{Z}^d$. We will write $\Lambda \Subset \mathbb{Z}^d$ to indicate that $\Lambda$ is a finite subset of $\mathbb{Z}^d$.

To define Gibbs measures, we will consider a family of Hamiltonians $(H_\Lambda)_{\Lambda\Subset\mathbb{Z}^d}$. These will be defined in terms of interaction potentials $(\Phi_A)_{A\Subset\mathbb{Z}^d}$. Namely, for each $A \Subset \mathbb{Z}^d$, let $\Phi_A : \Omega \to \mathbb{R}$ be a function with the following properties:
(1) The function $\Phi_A$ is $\mathcal{F}_A$-measurable for each $A \Subset \mathbb{Z}^d$.
(2) The interaction $(\Phi_A)$ is translation invariant, i.e., $\Phi_{A+x} = \Phi_A \circ \tau_x$ for all $x \in \mathbb{Z}^d$ and all $A \Subset \mathbb{Z}^d$.
(3) The interaction $(\Phi_A)$ is absolutely summable in the sense that
\[ |||\Phi||| = \sum_{\substack{A\Subset\mathbb{Z}^d \\ 0\in A}} \|\Phi_A\|_\infty < \infty. \tag{2.1} \]
The Hamiltonian on a set $\Lambda \Subset \mathbb{Z}^d$ is a function $H_\Lambda : \Omega \to \mathbb{R}$ defined by
\[ H_\Lambda = \sum_{\substack{A\Subset\mathbb{Z}^d \\ A\cap\Lambda\ne\emptyset}} \Phi_A. \tag{2.2} \]
For each β ≥ 0, let Gβ be the set of Gibbs measures for the Hamiltonian (2.2). Specifically, µ ∈ Gβ if and only if the conditional probability µ( · |Fc )—which exists since

634


is a Polish space—satisfies, for all Zd and µ-almost all s, the (conditional) DLR equation µ(ds |Fc )(s) =

e −βH (s) ν0 (dsx ). Z

(2.3)

x∈

Here Z = Z (β, sc ) is a normalization constant which is independent of s = (sx )x∈ . Remark 2.1. The results of the present paper can be generalized even to the situations with unbounded spins and interactions; see Theorem 4.5. However, the general theory of Gibbs measures with unbounded spins features some unpleasant technicalities that would obscure the presentation. We prefer to avoid them and to formulate the bulk of the paper for systems with compact spins. Our restriction to translation-invariant interactions in (2) above is mostly for convenience of exposition. Actually, the proofs in Sect. 3 can readily be modified to include periodic interactions as well.

2.2. Chessboard estimates. As alluded to before, chessboard estimates are among the principal tools for proving phase coexistence. In order to make this tool available, we have to place our spin system on a torus. Let $\mathbb{T}_L$ be the torus of $L \times \cdots \times L$ sites and let $H_L : \Omega_0^{\mathbb{T}_L} \to \mathbb{R}$ be the function defined as follows. Given a configuration $s = (s_x)_{x\in\mathbb{T}_L}$, we extend $s$ periodically to a configuration $\bar s$ on all of $\mathbb{Z}^d$. Using $H_{\mathbb{T}_L}$ to denote the Hamiltonian associated with the embedding of $\mathbb{T}_L$ into $\mathbb{Z}^d$, we define $H_L(s) = H_{\mathbb{T}_L}(\bar s)$. The torus measure $\mathbb{P}_{L,\beta}$ then simply is
\[ \mathbb{P}_{L,\beta}(ds) = \frac{e^{-\beta H_L(s)}}{Z_L} \prod_{x\in\mathbb{T}_L} \nu_0(ds_x). \tag{2.4} \]
Here $Z_L = Z_L(\beta)$ is the torus partition function.

Chessboard estimates will be implied by the condition of reflection positivity. While this condition can already be defined in terms of interactions $(\Phi_A)_{A\Subset\mathbb{Z}^d}$, it is often easier to check it directly on the torus. Let us consider a torus $\mathbb{T}_L$ with even $L$ and let us split it into two symmetric halves, $\mathbb{T}_L^+$ and $\mathbb{T}_L^-$, sharing a “plane of sites” on their boundary. We will refer to the set $P = \mathbb{T}_L^+ \cap \mathbb{T}_L^-$ as a plane of reflection. Let $\mathcal{F}_P^+$ and $\mathcal{F}_P^-$ denote the $\sigma$-algebras of events depending only on configurations in $\mathbb{T}_L^+$ and $\mathbb{T}_L^-$, respectively. We assume that the naturally-defined (spatial) reflection $\vartheta_P : \mathbb{T}_L^+ \leftrightarrow \mathbb{T}_L^-$ gives rise to a map $\theta_P : \Omega_0^{\mathbb{T}_L} \to \Omega_0^{\mathbb{T}_L}$ which obeys the following constraints:
(1) $\theta_P$ is an involution, $\theta_P \circ \theta_P = \mathrm{id}$.
(2) $\theta_P$ is a reflection in the sense that if $A \in \mathcal{F}_P^+$ depends only on configurations in $\Lambda \subset \mathbb{T}_L^+$, then $\theta_P(A) \in \mathcal{F}_P^-$ depends only on configurations in $\vartheta_P(\Lambda)$.
In many cases of interest, $\theta_P$ is simply the mapping that is directly induced by the spatial reflection $\vartheta_P$, i.e., $\theta_P = \vartheta_P^*$, where $(\vartheta_P^*(s))_x = s_{\vartheta_P(x)}$; our definition permits us to combine the spatial reflection with an involution of the single-spin space. Reflection positivity is now defined as follows:



Definition 2.2. Let $\mathbb{P}$ be a probability measure on $\Omega_0^{\mathbb{T}_L}$ and let $\mathbb{E}$ be the corresponding expectation. We say that $\mathbb{P}$ is reflection positive, if for any plane of reflection $P$ and any two bounded $\mathcal{F}_P^+$-measurable random variables $X$ and $Y$,
\[ \mathbb{E}\big(X\theta_P(Y)\big) = \mathbb{E}\big(Y\theta_P(X)\big) \tag{2.5} \]
and
\[ \mathbb{E}\big(X\theta_P(X)\big) \ge 0. \tag{2.6} \]
Here, $\theta_P(X)$ denotes the $\mathcal{F}_P^-$-measurable random variable $X \circ \theta_P$.

Remark 2.3. Here are some standard examples of summable two-body interactions that are reflection positive. Consider spin systems with vector-valued spins $s_x$ and interaction potentials
\[ \Phi_{\{x,y\}} = J_{x,y}(s_x, s_y), \qquad x \ne y, \tag{2.7} \]
where $J_{x,y}$ are coupling constants and $(\cdot,\cdot)$ denotes a positive-semidefinite inner product on $\Omega$. Then the corresponding torus Gibbs measure with $\beta \ge 0$ is reflection positive (for reflections through sites) for the following choices of $J_{x,y}$’s:
(1) “Cube” interactions: Reflection-symmetric $J_{x,y}$’s such that $J_{x,y} = 0$ unless $x$ and $y$ are vertices of a cube of $2 \times \cdots \times 2$ sites in $\mathbb{Z}^d$.
(2) Yukawa-type potentials:
\[ J_{x,y} = e^{-\mu|x-y|_1}, \tag{2.8} \]
where $\mu > 0$ and $|x - y|_1$ is the $\ell^1$-distance between $x$ and $y$.
(3) Power-law decaying interactions:
\[ J_{x,y} = \frac{1}{|x-y|_1^{\kappa}}, \tag{2.9} \]
with $\kappa > 0$.
The proofs of these are based on the general theory developed in [20, 18, 19]; relevant calculations can also be found in [2, Sect. 4.2]. Of course, any linear combination of the above (as well as other reflection-positive interactions) with positive coefficients is still reflection positive.

Now, we are finally getting to the setup underlying chessboard estimates. Suppose that $L$ is an integer multiple of an (integer) number $B$. (To rule out various technical complications with the following theorem, we will actually always assume that $L/B$ is a power of 2.) Let $\Lambda_B \subset \mathbb{T}_L$ be the box of $(B + 1) \times \cdots \times (B + 1)$ sites with the “lower-left” corner at the origin; we will call such a box a $B$-block. We can tile $\mathbb{T}_L$ by translates of $\Lambda_B$ by $B$-multiples of vectors from the factor torus, $\mathbb{T} = \mathbb{T}_{L/B}$. Note that the neighboring translates of $\Lambda_B$ will have a side in common. Let $A$ be an event depending only on configurations in $\Lambda_B$; we will call such $A$ a $B$-block event. For each $t \in \mathbb{T}$, we define the event $\theta_t(A)$ as follows:
(1) If $t$ has all components even, then $\theta_t(A)$ is simply the translation of $A$ by vector $Bt$, i.e., $\theta_t(A) = \tau_{Bt}^{-1}(A) = \{s \in \Omega_0^{\mathbb{T}_L} : \tau_{Bt}(s) \in A\}$.



(2) For the remaining $t \in \mathbb{T}$, we first reflect $A$ through the “midplane” of $\Lambda_B$ in all directions whose component of $t$ is odd, and then translate the result by $Bt$ as before.
Thus, $\theta_t(A)$ will always depend only on configurations in the $B$-block $\Lambda_B + Bt$. The desired consequence of reflection positivity is now stated as follows.

Theorem 2.4 (Chessboard estimate). Let $\mathbb{P}$ be a measure on $\Omega_0^{\mathbb{T}_L}$ which is reflection-positive with respect to $\theta_P$. Then for any $B$-block events $A_1, \dots, A_m$ and any distinct sites $t_1, \dots, t_m \in \mathbb{T}$,
\[ \mathbb{P}\Big(\bigcap_{j=1}^{m} \theta_{t_j}(A_j)\Big) \le \prod_{j=1}^{m} \mathbb{P}\Big(\bigcap_{t\in\mathbb{T}} \theta_t(A_j)\Big)^{1/|\mathbb{T}|}. \tag{2.10} \]
Proof. See [20, Theorem 2.2].
The moral of this result—whose proof is nothing more than an enhanced version of the Cauchy-Schwarz inequality applied to the inner product X, Y → E(XθP (Y ))—is that the probability of any number of events factorizes, as a bound, into the product of probabilities. This is particularly useful for contour estimates; of course, provided that the word contour refers to a collection of boxes on each of which some “bad” event occurs. Indeed, by (2.10) the probability of a contour will automatically be suppressed exponentially in the number of constituting “bad” boxes. 2.3. Main theorems. For any B-block event A, we introduce the quantity

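\[
p_\beta(\mathcal{A}) = \lim_{L\to\infty} \Bigl(\mathbb{P}_{L,\beta}\Bigl(\bigcap_{t\in\mathbb{T}} \theta_t(\mathcal{A})\Bigr)\Bigr)^{1/|\mathbb{T}|}, \tag{2.11}
\]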

with the limit taken over multiples of B. The limit exists by standard subadditivity arguments. While the definition would suggest that $p_\beta(\mathcal{A})$ is a large-deviation rate, chessboard estimates (2.10) show that $p_\beta(\mathcal{A})$ can also be thought of as the “probability of $\mathcal{A}$ regardless of the status of all other B-blocks.” This interpretation is supported by the fact that $\mathcal{A} \mapsto p_\beta(\mathcal{A})$ is an outer measure on $\mathcal{F}_{\Lambda_B}$ with $p_\beta(\Omega) = 1$, cf. Lemma 6.3 of [5].

Furthermore, recalling that $\Lambda_{N-1}$ is the block of N × · · · × N sites with the “lower-left” corner at the lattice origin, let

\[
R_N(\mathcal{A}) = \frac{1}{|\Lambda_{N-1}|} \sum_{x\in\Lambda_{N-1}} 1_{\mathcal{A}} \circ \tau_{Bx} \tag{2.12}
\]

be the fraction of B-blocks (in $\Lambda_{NB}$) in which $\mathcal{A}$ occurs. Whenever $\mu \in \mathcal{G}_\beta$ is a Gibbs state for the Hamiltonian (2.2) at inverse temperature β that is invariant with respect to the shifts $(\tau_{Bx})_{x\in\mathbb{Z}^d}$, the limit

\[
\rho_\mu(\mathcal{A}) = \lim_{N\to\infty} R_N(\mathcal{A}) \tag{2.13}
\]

exists µ-almost surely. In the following, we will use $\rho_\mu(\mathcal{A})$ mostly for measures that are actually ergodic with respect to the shifts by multiples of B. In such cases the limit



is self-averaging, $\rho_\mu(\mathcal{A}) = \mu(\mathcal{A})$ almost surely. Notwithstanding, we will stick to the notation $\rho_\mu(\mathcal{A})$ to indicate that claims are being made about almost-sure properties of configurations and not just expectations. To keep our statements concise, we will refer to measures which are invariant and ergodic with respect to the translations $(\tau_{Bx})_{x\in\mathbb{Z}^d}$ as B-shift ergodic. Our principal result can be formulated as follows:

Theorem 2.5. Let d ≥ 2 and consider a spin system as described above for which the torus measure is reflection positive for all β ≥ 0 and all even L ≥ 2. Let $\mathcal{G}_1, \dots, \mathcal{G}_r$ be a finite number of B-block events and let $\mathcal{B} = (\mathcal{G}_1 \cup \dots \cup \mathcal{G}_r)^c$. Suppose that the good block events are mutually exclusive and non-compatible (different types of goodness cannot occur in neighboring blocks):

(1) $\mathcal{G}_i \cap \mathcal{G}_j = \emptyset$ for all $i \ne j$.
(2) If $t_1, t_2 \in \mathbb{T}$ are nearest neighbors, then

\[
\theta_{t_1}(\mathcal{G}_i) \cap \theta_{t_2}(\mathcal{G}_j) = \emptyset \quad \text{for all } i \ne j. \tag{2.14}
\]

Then for every ε > 0, there exists δ > 0 (which may depend on d but not on the details of the model nor on B or n) such that for any β ≥ 0 with $p_\beta(\mathcal{B}) < \delta$ we have

\[
\rho_\mu(\mathcal{B}) \in [0, \epsilon] \tag{2.15}
\]

and

\[
\rho_\mu(\mathcal{G}_i) \in [0, \epsilon] \cup [1 - \epsilon, 1], \qquad i = 1, \dots, r, \tag{2.16}
\]

for every B-shift ergodic Gibbs state $\mu \in \mathcal{G}_\beta$. In particular, if ε < 1/2 then for every such µ there exists a unique i such that $\rho_\mu(\mathcal{G}_i) \ge 1 - \epsilon$ and $\rho_\mu(\mathcal{G}_j) \le \epsilon$ for all $j \ne i$.

We remark that the conclusion of Theorem 2.5 holds even when the requirements of a compact single-spin space and norm-bounded interactions are relaxed to the condition of finite average energy. We state the corresponding generalization in Theorem 4.5. Theorem 2.5 directly implies the standard conclusion of chessboard estimates (cf. [14, Propositions 3.1-3.3] or [25, Theorem 4]):

Corollary 2.6. Let d ≥ 2, let $\beta_1 < \beta_2$ be two inverse temperatures and let $\mathcal{G}_1$ and $\mathcal{G}_2$ be two mutually exclusive, non-compatible good B-block events (cf. conditions (1) and (2) in Theorem 2.5). Then, for every ε > 0 there exists a constant δ > 0 (which may depend on d but not on B or the details of the model) such that the conditions

(1) $p_\beta(\mathcal{B}) < \delta$ for all $\beta \in [\beta_1, \beta_2]$, and
(2) $p_{\beta_1}(\mathcal{G}_2) < \delta$ and $p_{\beta_2}(\mathcal{G}_1) < \delta$

imply the existence of an inverse temperature $\beta_t \in (\beta_1, \beta_2)$ and of two distinct B-shift ergodic Gibbs measures $\mu_1, \mu_2 \in \mathcal{G}_{\beta_t}$ such that

\[
\rho_{\mu_j}(\mathcal{G}_j) \ge 1 - \epsilon, \qquad j = 1, 2. \tag{2.17}
\]

The above assumptions (1) and (2) appear in some form in all existing proofs based on chessboard estimates; see Sect. 4 for some explicit examples. The conclusions about the set of coexistence points can be significantly strengthened when, on the basis of thermodynamic arguments and/or stochastic domination, the expected amount of goodness $\mathcal{G}_2$ increases (and that of $\mathcal{G}_1$ decreases) with increasing β. For ε ≪ 1 the phase diagram then features a unique (massive) jump at some $\beta_t$ from states dominated by $\mathcal{G}_1$ to those dominated by $\mathcal{G}_2$. Theorem 2.5 implies that the bulk of the values inside the jump are not found in any ergodic Gibbs state. Both Theorem 2.5 and Corollary 2.6 are proved in Sect. 3.2.
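As an illustrative aside, the block density $R_N(\mathcal{A})$ of (2.12) can be evaluated on a single finite configuration as in the sketch below. It is only a sketch under stated assumptions: it takes d = 2, stores the configuration as a nested list, and uses two Potts-type 1-block events (constant and fully disordered, in the spirit of Sect. 4.1) as hypothetical stand-ins for the good events. The function names and the test configuration are not taken from the paper.

```python
from itertools import product


def block_density(config, B, N, event):
    """Fraction of the N*N B-blocks (d = 2) on which `event` holds.

    Block (i, j) covers sites [B*i, B*i + B] x [B*j, B*j + B], so that
    neighbouring blocks share a side, as in the text.
    """
    hits = 0
    for i, j in product(range(N), repeat=2):
        block = [row[B * j: B * j + B + 1] for row in config[B * i: B * i + B + 1]]
        if event(block):
            hits += 1
    return hits / N**2


def ordered(m):
    """Hypothetical good event: every spin in the block equals m."""
    return lambda block: all(s == m for row in block for s in row)


def disordered(block):
    """Hypothetical good event: all nearest-neighbour spins in the block differ."""
    n = len(block)
    for i in range(n):
        for j in range(n):
            if i + 1 < n and block[i][j] == block[i + 1][j]:
                return False
            if j + 1 < n and block[i][j] == block[i][j + 1]:
                return False
    return True


# Assumed test configuration: constant on columns 0..2, checkerboard on columns 3..4.
config = [[1 if j <= 2 else 1 + (i + j) % 2 for j in range(5)] for i in range(5)]
rho_ord = block_density(config, B=1, N=4, event=ordered(1))
rho_dis = block_density(config, B=1, N=4, event=disordered)
rho_bad = 1 - rho_ord - rho_dis  # blocks straddling the interface are neither
```

In this toy configuration the bad blocks are exactly those straddling the interface between the two regions, which is the picture behind the contour estimates used below.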



Remark 2.7. Both results above single out inverse temperature as the principal parameter of interest. However, this is only a matter of convenience; all results hold equally well for any parameter of the model. An inspection of the proof shows that we can take $\delta = c(d)\,\epsilon^{2/d}$ in Theorem 2.5, where c(d) is a constant that grows with dimension. However, the dependence on ε should be significantly better; we made no attempts to reach the optimum. In any case, the fact that δ does not depend on the details of the model is definitely sufficient to prove phase coexistence.

3. Proofs of Main Results

We will assume that there is an ergodic Gibbs measure $\mu \in \mathcal{G}_\beta$ that violates one of the conditions (2.15–2.16), and derive a contradiction. Various steps of the proof will be encapsulated in technical lemmas in Sect. 3.1; the actual proofs come in Sect. 3.2.

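3.1. Technical lemmas. Our first step is to convert the information about infinite-volume densities into a finite-volume event. Using the sites from $\Lambda_{N-1}$ to translate the B-block $\Lambda_B$ by multiples of B in each coordinate direction, we get $\bigcup_{x\in\Lambda_{N-1}} (\Lambda_B + Bx) = \Lambda_{NB}$. Similarly, considering translates of $\Lambda_{NB}$ by vectors NBx, where $x \in \Lambda_{M-1}$, we get $\bigcup_{x\in\Lambda_{M-1}} (\Lambda_{NB} + NBx) = \Lambda_{MNB}$. The important point is that, while the neighboring translates $\Lambda_{NB} + NBx$ and $\Lambda_{NB} + NBy$ are not disjoint, they have only one of their (d − 1)-dimensional sides in common. Let $\mathcal{B}_N$ and $\mathcal{E}_{j,N}$, j = 1, . . . , r, be events defined by

\[
\mathcal{B}_N = \bigl\{ R_N(\mathcal{B}) > \epsilon \bigr\} \tag{3.1}
\]

and

\[
\mathcal{E}_{j,N} = \bigl\{ R_N(\mathcal{G}_j) > \epsilon \bigr\}, \qquad j = 1, \dots, r. \tag{3.2}
\]

Introducing the event

\[
\mathcal{E}_N = \mathcal{B}_N \cup \bigcup_{1\le i<j\le r} (\mathcal{E}_{i,N} \cap \mathcal{E}_{j,N}) \tag{3.3}
\]

and the fraction $R_{M,N}(\mathcal{E}_N)$ of NB-blocks (in $\Lambda_{MNB}$) in which $\mathcal{E}_N$ occurs,

\[
R_{M,N}(\mathcal{E}_N) = \frac{1}{|\Lambda_{M-1}|} \sum_{x\in\Lambda_{M-1}} 1_{\mathcal{E}_N} \circ \tau_{NBx}, \tag{3.4}
\]

we have:

Lemma 3.1. Let ε < 1/2 and consider a B-shift ergodic Gibbs measure $\mu \in \mathcal{G}_\beta$ that violates one of the conditions (2.15–2.16). Then there exists an $N_0 < \infty$ and, for each $N \ge N_0$, an $M_0 = M_0(N)$ such that for all $N \ge N_0$ and all $M \ge M_0(N)$, one has

\[
\mu\bigl( R_{M,N}(\mathcal{E}_N) > 1/2 \bigr) > \frac{1}{2N^d}. \tag{3.5}
\]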



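Proof. The proof is based on a two-fold application of the Pointwise Ergodic Theorem. Indeed, by ergodicity of µ and Fatou’s lemma we know that

\[
\liminf_{N\to\infty} \mu(\mathcal{B}_N) \ge \mu\bigl( \rho_\mu(\mathcal{B}) > \epsilon \bigr) \tag{3.6}
\]

and

\[
\liminf_{N\to\infty} \mu(\mathcal{E}_{i,N} \cap \mathcal{E}_{j,N}) \ge \mu\bigl( \{\rho_\mu(\mathcal{G}_i) > \epsilon\} \cap \{\rho_\mu(\mathcal{G}_j) > \epsilon\} \bigr). \tag{3.7}
\]

But µ violates one of the conditions (2.15–2.16) and so either $\rho_\mu(\mathcal{B}) > \epsilon$, or $\rho_\mu(\mathcal{G}_i) > \epsilon$ and $\rho_\mu(\mathcal{G}_j) > \epsilon$ for some $i \ne j$. All of these inequalities are valid µ-almost surely and so it follows that

\[
\mu(\mathcal{E}_N) \underset{N\to\infty}{\longrightarrow} 1. \tag{3.8}
\]

Now, let us fix N so that $\mu(\mathcal{E}_N) \ge 3/4$. Then ergodicity with respect to translates by multiples of B implies that

\[
\mu\Bigl( \bigcup_{y\in\Lambda_{N-1}} \bigl\{ R_{M,N}(\mathcal{E}_N) \circ \tau_{By} > 1/2 \bigr\} \Bigr) \ge \mu\Bigl( \frac{1}{N^d} \sum_{y\in\Lambda_{N-1}} R_{M,N}(\mathcal{E}_N) \circ \tau_{By} > 1/2 \Bigr) = \mu\bigl( R_{MN}(\mathcal{E}_N) > 1/2 \bigr) \underset{M\to\infty}{\longrightarrow} 1. \tag{3.9}
\]

It follows that the left-hand side exceeds 1/2 once M is sufficiently large, which in conjunction with subadditivity and $\tau_{By}$-invariance of µ directly implies (3.5).

Our next task will be to express $\mathcal{E}_N$ solely in terms of conditions on bad B-blocks in $\Lambda_{NB} = \bigcup_{x\in\Lambda_{N-1}} (\Lambda_B + Bx)$. Given two distinct sites $x, y \in \Lambda_{N-1}$, let $\{x \nleftrightarrow y\}$ denote the event that there is no nearest-neighbor path $\pi = (x_1, \dots, x_k)$ on $\Lambda_{N-1}$ such that

(1) π connects x to y, i.e., $x_1 = x$ and $x_k = y$;
(2) all B-blocks “along” π are good, i.e., $\tau_{Bx_j}(\mathcal{B}^c)$ occurs for all j = 1, . . . , k.

Note that $\{x \nleftrightarrow y\}$ automatically holds when one of the blocks $\Lambda_B + Bx$ or $\Lambda_B + By$ is bad. Further, let $Y_N$ be the ($\mathcal{F}_{\Lambda_{NB}}$-measurable) random variable

\[
Y_N = \#\bigl\{ (x, y) \in \Lambda_{N-1} \times \Lambda_{N-1} : x \ne y \ \&\ x \nleftrightarrow y \bigr\} \tag{3.10}
\]

and let $\mathcal{C}_N$ be the event

\[
\mathcal{C}_N = \bigl\{ Y_N \ge (\epsilon N^d)^2 \bigr\}. \tag{3.11}
\]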

Conditions (1) and (2) from Theorem 2.5 now directly imply:

Lemma 3.2. For all N, we have $\mathcal{E}_N \subset \mathcal{C}_N$.

Proof. Clearly, we have $\mathcal{B}_N \subset \mathcal{C}_N$, and so we only have to show that

\[
\mathcal{E}_{i,N} \cap \mathcal{E}_{j,N} \subset \mathcal{C}_N, \qquad 1 \le i < j \le r. \tag{3.12}
\]

Let us fix $i \ne j$ and recall that on $\mathcal{E}_{i,N} \cap \mathcal{E}_{j,N}$, at least an ε-fraction of all B-blocks in $\Lambda_{NB}$ will be i-good and at least an ε-fraction of them will be j-good. By conditions (1) and (2) from Theorem 2.5, no two B-blocks of different types of goodness can be connected by a path of good B-blocks, and so there are at least $(\epsilon N^d)^2$ pairs of distinct B-blocks in $\Lambda_{NB}$ that are not connected to each other by a path of good blocks. This is exactly what defines the event $\mathcal{C}_N$.
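The counting behind (3.10) can be made concrete with a small sketch. It assumes d = 2 and represents the good/bad status of the blocks by a hypothetical N × N array of booleans; the function name is illustrative and not from the paper. Two blocks are connected by a path of good blocks exactly when they lie in the same nearest-neighbor cluster of good blocks, so $Y_N$ equals the number of all ordered pairs of distinct sites minus the ordered pairs lying inside a common cluster.

```python
from collections import deque


def count_disconnected_pairs(good):
    """Y_N for d = 2: ordered pairs (x, y), x != y, with no good path from x to y.

    good[i][j] is True when block (i, j) is good. Paths are nearest-neighbour
    paths inside the N x N box (no periodic wrap-around).
    """
    n = len(good)
    seen = [[False] * n for _ in range(n)]
    connected_pairs = 0
    for i in range(n):
        for j in range(n):
            if not good[i][j] or seen[i][j]:
                continue
            # Breadth-first search over one cluster of good blocks.
            size = 0
            queue = deque([(i, j)])
            seen[i][j] = True
            while queue:
                a, b = queue.popleft()
                size += 1
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    u, v = a + da, b + db
                    if 0 <= u < n and 0 <= v < n and good[u][v] and not seen[u][v]:
                        seen[u][v] = True
                        queue.append((u, v))
            connected_pairs += size * (size - 1)
    total_pairs = n * n * (n * n - 1)
    return total_pairs - connected_pairs


# A 3 x 3 box whose middle column is bad: two good clusters of three blocks each.
separated = [[True, False, True] for _ in range(3)]
y_separated = count_disconnected_pairs(separated)
```

In the example, the wall of bad blocks splits the box into two clusters, and every pair with one block in each cluster contributes to $Y_N$, as do all pairs involving a bad block.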



The events $\mathcal{E}_N$ and $\mathcal{C}_N$ have the natural interpretation as NB-block events on $\mathbb{T}_L$ whenever L is divisible by NB. If $\mathcal{A}$ is such an NB-block event, let $\tilde p_\beta(\mathcal{A})$ denote the analogue of the quantity from (2.11) where the $\theta_t$’s now involve translations by multiples of NB. Our next technical lemma provides an estimate on $\tilde p_\beta(\mathcal{C}_N)$ in terms of $p_\beta(\mathcal{B})$:

Lemma 3.3. Let d be the dimension of the underlying lattice and suppose that d ≥ 2. For each ε > 0 (underlying the definitions of $\mathcal{B}_N$, $\mathcal{E}_N$ and $\mathcal{C}_N$) and each η > 0, there exists a number δ = δ(ε, η, d) > 0 such that if $p_\beta(\mathcal{B}) < \delta$, then $\tilde p_\beta(\mathcal{C}_N) < \eta$.

Proof. Let us use $\Pi_{L,\beta}(\mathcal{C}_N)$ to abbreviate the quantity

\[
\Pi_{L,\beta}(\mathcal{C}_N) = \mathbb{P}_{L,\beta}\Bigl( \bigcap_{t\in\mathbb{T}} \theta_t(\mathcal{C}_N) \Bigr), \tag{3.13}
\]

where $\mathbb{T} = \mathbb{T}_{L/(NB)}$ is the factor torus in the present context. Observing that $\mathcal{C}_N$ is preserved by reflections through the “midplanes” of $\Lambda_{NB}$, a multivariate version of Chebyshev’s inequality then yields

\[
\Pi_{L,\beta}(\mathcal{C}_N) \le \mathbb{E}_{L,\beta}\Bigl( \prod_{t\in\mathbb{T}} \frac{Y_N \circ \tau_{BNt}}{(\epsilon N^d)^2} \Bigr). \tag{3.14}
\]

Here $\mathbb{E}_{L,\beta}$ is the expectation with respect to $\mathbb{P}_{L,\beta}$. To estimate the right-hand side of (3.14), we will rewrite $Y_N$ as a sum. Let $x, y \in \Lambda_{N-1}$ be distinct. A connected subset $\Gamma \subset \Lambda_{N-1}$ is said to separate x from y (in $\Lambda_{N-1}$) if each nearest-neighbor path π from x to y on $\Lambda_{N-1}$ intersects Γ. We use S(x, y) to denote the set of all such sets $\Gamma \subset \Lambda_{N-1}$. Notice that {x}, {y} ∈ S(x, y). We claim that, whenever (x, y) is a pair of points contributing to $Y_N$, there exists Γ ∈ S(x, y) separating x from y such that every block $\Lambda_B + Bz$ with z ∈ Γ is bad. Indeed, if $\Lambda_B + Bx$ is a bad block we take Γ = {x}. If $\Lambda_B + Bx$ is a good block, then we define $C_x$ to be the maximal connected subset of $\Lambda_{N-1}$ containing x such that $\Lambda_B + Bz$ is a good block for all $z \in C_x$, and let Γ be its external boundary. Using $1_\Gamma$ to denote the indicator of the event that every block $\Lambda_B + Bz$ with z ∈ Γ is bad, we get

\[
Y_N \le \sum_{x,y\in\Lambda_{N-1}} \ \sum_{\Gamma\in S(x,y)} 1_\Gamma. \tag{3.15}
\]

Let $K = (\tfrac{L}{BN})^d$ be the volume of the factor torus and let $t_1, \dots, t_K$ be an ordering of all sites of $\mathbb{T}$. Then we have

\[
\Pi_{L,\beta}(\mathcal{C}_N) \le \frac{1}{(\epsilon N^d)^{2K}} \sum_{\substack{(x_j, y_j) \\ j=1,\dots,K}} \ \sum_{\Gamma_1,\dots,\Gamma_K} \mathbb{E}_{L,\beta}\Bigl( \prod_{j=1}^{K} 1_{\Gamma_j} \circ \tau_{BNt_j} \Bigr), \tag{3.16}
\]

where the first sum runs over collections of pairs $(x_j, y_j)$, j = 1, . . . , K, of distinct sites in $\Lambda_{N-1}$ and the second sum is over all collections of separating surfaces $\Gamma_j \in S(x_j, y_j)$, j = 1, . . . , K. To estimate the right-hand side of (3.16) we define $p_{L,\beta}(\mathcal{B})$ to be the quantity on the right-hand side of (2.11), before taking the limit L → ∞, with $\mathcal{A} = \mathcal{B}$. Since each indicator $1_{\Gamma_j} \circ \tau_{BNt_j}$ enforces bad blocks $\Lambda_B + B(z + Nt_j)$ for $z \in \Gamma_j$, and the set of blocks $\Lambda_B + B(z + Nt_j)$, $z \in \Lambda_{N-1}$, is, for $t_i \ne t_j$, disjoint from the set $\Lambda_B + B(z + Nt_i)$, $z \in \Lambda_{N-1}$, we can use chessboard estimates (Theorem 2.4) to get

\[
\mathbb{E}_{L,\beta}\Bigl( \prod_{j=1}^{K} 1_{\Gamma_j} \circ \tau_{BNt_j} \Bigr) \le p_{L,\beta}(\mathcal{B})^{|\Gamma_1| + \dots + |\Gamma_K|}. \tag{3.17}
\]

A standard contour-counting argument now shows that, for any distinct $x, y \in \Lambda_{N-1}$,

\[
\sum_{\Gamma\in S(x,y)} p_{L,\beta}(\mathcal{B})^{|\Gamma|} \le c_1\, p_{L,\beta}(\mathcal{B})^{d} \tag{3.18}
\]

with some constant $c_1 = c_1(d)$, provided that $p_{L,\beta}(\mathcal{B})$ is sufficiently small. The sum over collections of pairs $(x_j, y_j)$, j = 1, . . . , K, contains at most $(N^{2d})^K$ terms, allowing us to bound

\[
\Pi_{L,\beta}(\mathcal{C}_N) \le \Bigl( \frac{c_1\, p_{L,\beta}(\mathcal{B})^{d}}{\epsilon^2} \Bigr)^{K}. \tag{3.19}
\]

Since $\Pi_{L,\beta}(\mathcal{C}_N)^{1/K} \to \tilde p_\beta(\mathcal{C}_N)$ as L → ∞, it follows that $\tilde p_\beta(\mathcal{C}_N) \le c_1\, p_\beta(\mathcal{B})^{d}\, \epsilon^{-2}$, which, for $p_\beta(\mathcal{B})$ small enough, can be made smaller than any η initially prescribed.

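Our final technical ingredient is an estimate on the Radon-Nikodym derivative of a Gibbs measure $\mu \in \mathcal{G}_\beta$ and the torus measure at the same temperature:

Lemma 3.4. Let $\Lambda_L \subset \mathbb{Z}^d$ be an L-block and let $\mathbb{T}_{2L}$ be a torus of side 2L. Let us view $\Lambda_L$ as embedded into $\mathbb{T}_{2L}$ and let $\mathbb{P}_{2L,\beta}$ be the torus Gibbs measure on $\mathbb{T}_{2L}$. Then for any a > 0 there exists $L_0$ such that

\[
e^{-\beta a L^d}\, \mathbb{P}_{2L,\beta}(\mathcal{A}) \le \mu(\mathcal{A}) \le e^{\beta a L^d}\, \mathbb{P}_{2L,\beta}(\mathcal{A}) \tag{3.20}
\]

for all $L \ge L_0$, any $\mu \in \mathcal{G}_\beta$, and any $\mathcal{F}_{\Lambda_L}$-measurable event $\mathcal{A}$.

Proof. For finite-range interactions, this lemma is completely standard. However, since our setting also includes interactions with infinite range, we provide a complete proof. We will prove only the right-hand side of the above inequality; the other side is completely analogous. First, from the DLR equation we know that there exists a configuration $s = (s_x)_{x\in\mathbb{Z}^d}$ such that

\[
\mu(\mathcal{A} \mid \mathcal{F}_{\Lambda_L^c})(s) \ge \mu(\mathcal{A}) \tag{3.21}
\]

with the left-hand side of the form (2.3). Let $s'$ be a configuration on $\mathbb{T}_{2L}$. We will show that $\mu(\,\cdot \mid \mathcal{F}_{\Lambda_L^c})(s)$ and $\mathbb{P}_{2L,\beta}(\,\cdot \mid \mathcal{F}_{\Lambda_L^c})(s')$ are absolutely continuous with respect to each other (as measures on $\mathcal{F}_{\Lambda_L}$) and that the Radon-Nikodym derivative is bounded above by $e^{\beta a L^d}$ regardless of the “boundary conditions” $s$ and $s'$. Suppose that $s_x = s'_x$ for all $x \in \Lambda_L$ and let $\bar s$ be its 2L-periodic extension to all of $\mathbb{Z}^d$. Then the Radon-Nikodym derivative of $\mathbb{P}_{2L,\beta}(\,\cdot \mid \mathcal{F}_{\Lambda_L^c})(s')$ with respect to the product measure $\prod_{x\in\Lambda_L} \nu_0(ds_x)$ is $e^{-\beta H_{\Lambda_L}(\bar s)} / Z_{\Lambda_L}(\bar s_{\Lambda_L^c})$ while that of $\mu(\,\cdot \mid \mathcal{F}_{\Lambda_L^c})(s)$ is $e^{-\beta H_{\Lambda_L}(s)} / Z_{\Lambda_L}(s_{\Lambda_L^c})$. It thus suffices to show, uniformly in $(s_x)_{x\in\Lambda_L}$, that

\[
\bigl| H_{\Lambda_L}(s) - H_{\Lambda_L}(\bar s) \bigr| \le \frac{a}{2} L^d \tag{3.22}
\]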



once L is sufficiently large. To this end, we first note that

\[
\bigl| H_{\Lambda_L}(s) - H_{\Lambda_L}(\bar s) \bigr| \le 2 \sum_{\substack{A \colon A\cap\Lambda_L \ne \emptyset \\ A\cap\Lambda_L^c \ne \emptyset}} \|\Phi_A\|_\infty. \tag{3.23}
\]

To estimate the right-hand side, we will decompose $\Lambda_L$ into “shells,” $\Lambda_n \setminus \Lambda_{n-1}$, and use the fact that if A intersects $\Lambda_n \setminus \Lambda_{n-1}$ as well as $\Lambda_L^c$, then the diameter of A must be at least L − n. Using the translation invariance of the interactions, we thus get

\[
\sum_{\substack{A \colon A\cap\Lambda_L \ne \emptyset \\ A\cap\Lambda_L^c \ne \emptyset}} \|\Phi_A\|_\infty \le \sum_{n=1}^{L} |\Lambda_n \setminus \Lambda_{n-1}| \sum_{\substack{A \colon 0\in A \\ \operatorname{diam}(A) \ge L-n}} \|\Phi_A\|_\infty. \tag{3.24}
\]

But $|||\Phi||| < \infty$ implies that the second sum tends to zero as L − n → ∞ and since $|\Lambda_n \setminus \Lambda_{n-1}| = o(L^d)$ while $\sum_{1\le n\le L} |\Lambda_n \setminus \Lambda_{n-1}| = L^d$, the result is thus $o(L^d)$. In particular, for L sufficiently large, the right-hand side of (3.23) will be less than $\tfrac{a}{2} L^d$.

3.2. Proofs of Theorem 2.5 and Corollary 2.6. Now we are ready to prove our main theorem:

Proof of Theorem 2.5. Fix ε < 1/2 and let $\mu \in \mathcal{G}_\beta$ be a B-shift ergodic Gibbs measure for which one of the conditions (2.15–2.16) fails. Applying Lemma 3.1 and the inclusion in Lemma 3.2 we find that

\[
\mu\bigl( R_{M,N}(\mathcal{C}_N) > 1/2 \bigr) > \frac{1}{2N^d} \tag{3.25}
\]

once $N \ge N_0$ and $M \ge M_0(N)$. Now, consider the torus $\mathbb{T}_L$ of side L = 2MNB and embed $\Lambda_{MNB} = \bigcup_{x\in\Lambda_{M-1}} (\Lambda_{NB} + NBx)$ into $\mathbb{T}_L$ in the “usual” way. By Lemma 3.4 we know that for any fixed $N \ge N_0$, there exists a sequence $a_M$ of positive numbers with $a_M \downarrow 0$ as M → ∞, such that we have

\[
\mathbb{P}_{L,\beta}\bigl( R_{M,N}(\mathcal{C}_N) > 1/2 \bigr) > \frac{1}{2N^d}\, e^{-\beta (NB)^d a_M M^d}, \qquad M \to \infty. \tag{3.26}
\]

Our goal is to show that, once N is chosen sufficiently large, the left-hand side is exponentially small in $M^d$, thus arriving at a contradiction. By conditioning on which of the $M^d/2$ translates of $\Lambda_{BN}$ have $\mathcal{C}_N$ satisfied, and applying the chessboard estimates in blocks of side NB, we get

\[
\mathbb{P}_{L,\beta}\bigl( R_{M,N}(\mathcal{C}_N) > 1/2 \bigr) \le 2^{M^d}\, \tilde p_{2L,\beta}(\mathcal{C}_N)^{M^d/2}, \tag{3.27}
\]

where $\tilde p_{2L,\beta}(\mathcal{C}_N)$ is the finite-torus version of $\tilde p_\beta(\mathcal{C}_N)$. Next we choose η < 1/4 and let δ > 0 and $N \ge N_0$ be such that the bounds in Lemma 3.3 apply. Then for all sufficiently large M (and hence all large L) we have $\tilde p_{2L,\beta}(\mathcal{C}_N) < \eta$ and so

\[
\mathbb{P}_{L,\beta}\bigl( R_{M,N}(\mathcal{C}_N) > 1/2 \bigr) \le (4\eta)^{M^d/2}. \tag{3.28}
\]

But this is true for all M ≫ 1 and so the bound (3.26) must be false. Hence, no such $\mu \in \mathcal{G}_\beta$ could exist to begin with; i.e., (2.15–2.16) must hold for all B-shift ergodic $\mu \in \mathcal{G}_\beta$.
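The clash between the lower bound (3.26) and the upper bound (3.28) can be illustrated numerically. The sketch below compares the logarithms of the two bounds and reports the first M at which they become incompatible. All parameter values, the function names, and in particular the choice $a_M = 1/M$ are assumptions made for illustration only; Lemma 3.4 merely guarantees that $a_M \downarrow 0$.

```python
import math


def log_upper_bound(M, d, eta):
    """Logarithm of the right-hand side of (3.28): (4*eta)^(M^d / 2)."""
    return (M**d / 2) * math.log(4 * eta)


def log_lower_bound(M, d, N, B, beta, a_M):
    """Logarithm of the right-hand side of (3.26)."""
    return -math.log(2 * N**d) - beta * (N * B) ** d * a_M * M**d


def first_contradiction(d, N, B, beta, eta, a=lambda M: 1.0 / M, M_max=10_000):
    """Smallest M at which the upper bound drops below the lower bound.

    The default sequence a(M) = 1/M is a hypothetical stand-in for a_M.
    Returns None if no such M <= M_max is found.
    """
    for M in range(1, M_max + 1):
        if log_upper_bound(M, d, eta) < log_lower_bound(M, d, N, B, beta, a(M)):
            return M
    return None


# Assumed illustrative parameters: d = 2, N = 2, B = 1, beta = 1, eta = 1/8.
M_star = first_contradiction(d=2, N=2, B=1, beta=1.0, eta=1 / 8)
```

With η < 1/4 the upper bound decays like $e^{-cM^d}$ with a fixed c > 0, whereas the exponent in the lower bound is $o(M^d)$, so for any sequence $a_M \downarrow 0$ a contradiction is reached at some finite M.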



To finish our proofs, we will also need to establish our claims concerning phase coexistence:

Proof of Corollary 2.6. Suppose that ε and δ are such that Theorem 2.5 applies. By condition (1), the conclusions (2.15–2.16) of this theorem are thus available for all $\beta \in [\beta_1, \beta_2]$. This implies

\[
\rho_\mu(\mathcal{G}_j) \in [0, \epsilon] \cup [1 - \epsilon, 1], \qquad j = 1, 2, \tag{3.29}
\]

for every B-shift ergodic $\mu \in \mathcal{G}_\beta$ at every $\beta \in [\beta_1, \beta_2]$. We claim that $\rho_\mu(\mathcal{G}_2)$ is small in every ergodic state $\mu \in \mathcal{G}_{\beta_1}$. Indeed, by Lemma 6.3 of [5] and condition (2) of the corollary, we have

\[
p_{\beta_1}(\mathcal{B} \cup \mathcal{G}_2) \le p_{\beta_1}(\mathcal{B}) + p_{\beta_1}(\mathcal{G}_2) < 2\delta. \tag{3.30}
\]

Hence, if the δ in Corollary 2.6 was so small that Theorem 2.5 applies for some ε < 1/2 even when δ is replaced by 2δ, we can regard $\mathcal{B} \cup \mathcal{G}_2$ as a bad event at $\beta = \beta_1$ and conclude that $\rho_\mu(\mathcal{G}_2) < 1/2$, and hence $\rho_\mu(\mathcal{G}_2) \le \epsilon$, by (3.29), in every ergodic $\mu \in \mathcal{G}_{\beta_1}$. A similar argument proves that $\rho_\mu(\mathcal{G}_1) \le \epsilon$ in every ergodic $\mu \in \mathcal{G}_{\beta_2}$. Usual weak-limit arguments then yield the existence of at least one point $\beta_t \in (\beta_1, \beta_2)$ where both types of goodness coexist.

4. Applications

The formulation of our main result is somewhat abstract. In the present section, we will pick several models in which phase coexistence has been proved using chessboard estimates and use them to demonstrate the consequences of our main theorem. Although we will try to stay rather brief, we will show that, generally, the hypothesis of our main result (i.e., the assumption on smallness of the parameter $p_\beta(\mathcal{B})$) is directly implied by the calculations already carried out in the corresponding papers. The reader should consult the original articles for more motivation and further details concerning the particular models.

4.1. Potts model. The q-state Potts model serves as a paradigm of order-disorder transitions. The existence of the transition has been proved by chessboard estimates in [25]. While the completeness of the phase diagram has, in the meantime, been established with the help of Pirogov-Sinai theory [28], we find it useful to illustrate our general claims on this rather straightforward example. Later on we will pass to more complex systems where no form of completeness (and, more relevantly, no “forbidden gap”) has been proved.

The spins $\sigma_x$ of the q-state Potts model take values in the set {1, . . . , q} with a priori equal probabilities. The formal Hamiltonian is

\[
H(\sigma) = - \sum_{\langle x,y \rangle} \delta_{\sigma_x, \sigma_y}, \tag{4.1}
\]

where $\langle x, y \rangle$ runs over all (unordered) nearest-neighbor pairs in $\mathbb{Z}^d$. The states of minimal energy have all neighboring spins equal, and so we expect that low temperature states are dominated by nearly constant spin-configurations. On the other hand, at high temperatures the spins should be nearly independent and, in particular, neighboring spins



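will typically be different from each other. This leads us to consider the following good events on the 1-block $\Lambda_1$:

\[
\begin{aligned}
\mathcal{G}^{\mathrm{dis}} &= \bigl\{ \sigma \colon \sigma_x \ne \sigma_y \text{ for all } x, y \in \Lambda_1,\ |x - y| = 1 \bigr\}, \\
\mathcal{G}^{\mathrm{ord},m} &= \bigl\{ \sigma \colon \sigma_x = m \text{ for all } x \in \Lambda_1 \bigr\}, \qquad m = 1, \dots, q.
\end{aligned} \tag{4.2}
\]

Using similar events, it was proved [25] that, for d ≥ 2 and q sufficiently large, there exists an inverse temperature $\beta_t$ and q + 1 ergodic Gibbs states $\mu^{\mathrm{dis}} \in \mathcal{G}_{\beta_t}$ and $\mu^{\mathrm{ord},m} \in \mathcal{G}_{\beta_t}$, m = 1, . . . , q, such that the corresponding 1-block densities satisfy

\[
\rho_{\mu^{\mathrm{dis}}}(\mathcal{G}^{\mathrm{dis}}) \ge 1 - \epsilon \tag{4.3}
\]

and

\[
\rho_{\mu^{\mathrm{ord},m}}(\mathcal{G}^{\mathrm{ord},m}) \ge 1 - \epsilon, \qquad m = 1, \dots, q, \tag{4.4}
\]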

where ε = ε(q) tends to zero as q → ∞. In addition, monotonicity of the energy density as a function of β can be invoked to show that $\rho_\mu(\mathcal{G}^{\mathrm{dis}})$ is large in all translation-invariant $\mu \in \mathcal{G}_\beta$ when $\beta < \beta_t$, while it is small in all such states when $\beta > \beta_t$. The full completeness [28] asserts that the above-mentioned q + 1 states exhaust the set of all shift-ergodic Gibbs states in $\mathcal{G}_{\beta_t}$.

A weaker claim follows as a straightforward application of our Theorem 2.5: For each shift-ergodic Gibbs state $\mu \in \mathcal{G}_{\beta_t}$ we have either $\rho_\mu(\mathcal{G}^{\mathrm{dis}}) \ge 1 - \epsilon$ or $\rho_\mu(\mathcal{G}^{\mathrm{ord},m}) \ge 1 - \epsilon$ for some m = 1, . . . , q. The main hypothesis of our theorem amounts to the smallness of the quantity $p_\beta(\mathcal{B})$, where

\[
\mathcal{B} = \Bigl( \mathcal{G}^{\mathrm{dis}} \cup \bigcup_{m=1}^{q} \mathcal{G}^{\mathrm{ord},m} \Bigr)^{c}, \tag{4.5}
\]

which in turn boils down to an estimate on the probability of the disseminated event $\mathcal{B}$ on the right-hand side of (2.11). The needed estimate coincides with the bound provided in [25] by evaluating directly (i.e., “by hand”) the energy and the number of contributing configurations. The result (which in [25] appears right before the last formula on p. 506 and is used to produce (4.4)) reads

pβ (B) ≤

q d−2−(d−1) 1

2d

(q

− 2d)d

(4.6)

.

This implies the needed bound once q ≫ 1.

Remark 4.1. Analogous calculations establish the corresponding forbidden gap in more complicated variants of the Potts model; see e.g. [4].

4.2. Intermediate phases in dilute spin systems. The first instance where our results provide some new insight is that of dilute annealed ferromagnets exhibiting staggered order phases at intermediate temperatures. These systems have been studied in the context of both discrete [10] and continuous spins [11]. The characteristic examples of these classes are the site-diluted Potts model with the Hamiltonian

\[
H(n, \sigma) = - \sum_{\langle x,y \rangle} n_x n_y (\delta_{\sigma_x, \sigma_y} - 1) - \lambda \sum_{x} n_x - \kappa \sum_{\langle x,y \rangle} n_x n_y, \tag{4.7}
\]

and the site-diluted XY-model with the Hamiltonian

\[
H(n, \phi) = - \sum_{\langle x,y \rangle} n_x n_y \bigl[ \cos(\phi_x - \phi_y) - 1 \bigr] - \lambda \sum_{x} n_x - \kappa \sum_{\langle x,y \rangle} n_x n_y. \tag{4.8}
\]
Here, as before, σx ∈ {1, . . . , q} are the Potts spins, φx ∈ [−π, π ) are variables representing the “angle” of the corresponding O(2)-spins, and nx ∈ {0, 1} indicates the presence or absence of a particle (that carries the Potts spin σx or the angle variable φx ) at site x. On the basis of “usual” arguments, the high temperature region is characterized by disordered configurations while the low temperatures feature configurations with a strong (local) order, at least at small-to-intermediate dilutions. The phenomenon discovered in [10, 11] is the existence of a region of intermediate temperatures and chemical potentials, sandwiched between the low temperature/high density ordered region and the high temperature/low density disordered region, where typical configurations exhibit preferential occupation of one of the even/odd sublattices. The appearance of such states is due to an effective entropic repulsion. Indeed, at low temperatures the spins on particles at neighboring sites are forced to be (nearly) aligned while if a particle is completely isolated, its spin is permitted to enjoy the full freedom of the available spin space. Hence, at intermediate temperatures and moderate dilutions, there is an entropic advantage for the particles to occupy only one of the sublattices. Let us concentrate on the portion of the phase boundary between the staggered region and the low temperature region. The claim can be stated uniformly for both systems in (4.7–4.8) provided we introduce the relevant good events in terms of occupation variable n. Namely, we let:

G dense = (σ, n) : nx = 1 for all x ∈ 1 ,

G even = (σ, n) : nx = 1{x even} for all x ∈ 1 , (4.9)

odd G = (σ, n) : nx = 1{x odd} for all x ∈ 1 . Again, using slightly modified versions of these events, it was shown in [10, 11] that there exist positive numbers , κ0 1 and, for every κ ∈ (0, κ0 ), an interval I (κ) ⊂ R such that the following is true: For any λ ∈ I there exist inverse temperatures β1 (κ, λ) and β2 (κ, λ), and a transition temperature βt (κ, λ) ∈ [β1 , β2 ] such that (1) for any β ∈ [βt , β2 ] there exists an “densely occupied” state µdense ∈ Gβ , for which ρµdense (G dense ) ≥ 1 − ,

(4.10)

(2) for any β ∈ [β1 , βt ] there exist two states µeven , µodd ∈ Gβ satisfying ρµeven (G even ) ≥ 1 −

and

ρµodd (G odd ) ≥ 1 − .

(4.11)

The error is of order β − /8 (cf. the bound (2.15) in [11]) in the case of the XY -model in d = 2, and it tends zero as q → ∞ in the case of the diluted Potts model. A somewhat stronger conclusion can be made for the diluted Potts model. Namely, at β = βt , there are actually q + 2 distinct states, two staggered states µeven and µodd and q ordered states µdense,m , with the latter characterized by the condition 1

ρµdense,m (G dense,m ) ≥ 1 − ,

(4.12)



where

G dense,m = (σ, n) : nx = 1 and σx = m for all x ∈ 1 .

(4.13)

It is plausible that an analogous conclusion applies to the XY-model in d ≥ 3 because there the low-temperature phase should exhibit magnetic order. However, in d = 2 such long-range order is not permitted by the Mermin-Wagner theorem and so there one expects to have only 3 distinct ergodic Gibbs states at βt . A weaker form of the expected conclusion is an easy consequence of our Theorem 2.5: For each extremal 2-periodic Gibbs state µ ∈ Gβt there exists G ∈ {G even , G odd , G dense } (in the case of diluted Potts model, G ∈ {G even , G odd , G dense,m , m = 1, . . . , q}) such that ρµ (G) ≥ 1 − .

(4.14)

In particular, no ergodic Gibbs state µ ∈ Gβt has particle density in [ , 1/2 − ] ∪ [1/2 +

, 1 − ]. The proof of these observations goes by noting that the smallness of pβ (B) for the bad event B = (G dense ∪ G even ∪ G odd )c is a direct consequence of the corresponding bounds from [10, 11] of the “contour events.” In the case of the XY-model in dimension d = 2, this amounts to the bounds (2.9) and (2.15) from [11]. Remark 4.2. A more general class of models, with spin taking values in a Riemannian manifold, is also considered in [11]. A related phase transition in an annealed diluted O(n) Heisenberg ferromagnet has been proved in [12]. 4.3. Order-by-disorder transitions. Another class of systems where our results provide new information are the O(2)-nearest and next-nearest neighbor antiferromagnet [3], the 120-degree model [5], and the orbital-compass model [6].All of these are continuum-spin systems whose common feature is that the infinite degeneracy of the ground states is broken, at positive temperatures, by long-wavelength (spin-wave) excitation. We will restrict our attention to the first of these models, the O(2)-nearest and next-nearest neighbor antiferromagnet. The other two models are somewhat more complicated—particularly, due to the presence of non-translation invariant ground states—but the conclusions are fairly analogous. Consider a spin system on Z2 whose spins, S x , take values on the unit circle in R2 with a priori uniform distribution. The Hamiltonian is H (S) = S x · S x+ê1 +ê2 +S x ·S x+ê1−ê2 +γ S x · S x+ê1 +S x ·S x+ê2 , (4.15) x

x

where eˆ 1 and eˆ 2 are the unit vectors in the coordinate lattice directions and the dot denotes the usual scalar product. Note that both nearest and next-nearest neighbors are coupled antiferromagnetically but with a different strength. The following are the ground state configurations for γ ∈ (−2, 2): Both even and odd sublattices enjoy a Neél (antiferromagnetic) order, but the relative orientation of these sublattice states is arbitrary. It is clear that, at low temperatures, the configurations will be locally near one of the aforementioned ground states. Due to the continuous nature of the spins, the fluctuation spectrum is dominated by “harmonic perturbations,” a.k.a. spin waves. A heuristic spin-wave calculation (cf. [5, Sect. 2.2] for an example in the context of the 120-degree model) suggests that among all 2π possible relative orientations of the sublattices, the parallel and the antiparallel orientations are those entropically most favorable. And,



indeed, as was proved in [3], there exist two 2-periodic Gibbs states µ1 and µ2 with the corresponding type of long-range order. However, the existence of Gibbs states with other relative orientations has not been ruled out. We will now state a stronger version of [3, Theorem 2.1]. Let B be a large even integer and consider two B-block events G1 and G2 defined as follows: fixing a positive κ 1, let G1 = {S x · S y ≥ 1 − κ} ∩ {S x · S x+ê2 ≤ −1 + κ}, (4.16) x,y∈B (y−x)·ê2 =0

x,x+ê2 ∈B

i.e., G1 enforces horizontal stripes all over B . The event G2 in turn enforces vertical stripes; the definition is as above with the roles of eˆ 1 and eˆ 2 interchanged. Then we have: Theorem 4.3. Let γ ∈ (0, 2) and let κ 1. For each > 0 there exists β0 ∈ (0, ∞) such that for each β ≥ β0 : (1) There exist two ergodic Gibbs states µ1 , µ2 ∈ Gβ , such that ρµj (Gj ) ≥ 1 − ,

j = 1, 2.

(4.17)

(2) There exists an integer B ≥ 1 such that for any µ ∈ Gβ that is ergodic with respect to shifts by multiples of B we have either ρµ (G1 ) ≥ 1 − or ρµ (G2 ) ≥ 1 − .

(4.18)

The first conclusion—the existence of Gibbs states with parallel and antiparallel relative orientation of the sublattices—was the main content of Theorem 2.1 of [3]. What we have added here is that the corresponding configurations dominate all ergodic Gibbs states. The O(2) ground-state symmetry of the relative orientation of the sublattices is thus truly broken at positive temperatures, which bolsters significantly the main point of [3]. Note that no restrictions are posed on the overall orientation of the spins. Indeed, by the Mermin-Wagner theorem every µ ∈ Gβ is invariant under simultaneous rotations of all spins. Proof of Theorem 4.3. As expected, the proof boils down to showing that, for a proper choice of scale B we have pβ (B) 1 for B = (G1 ∪ G2 )c . In [3] this is done by decomposing B into more elementary events—depending on whether the “badness” comes from excessive energy or insufficient entropy—and estimating each of them separately. The relevant bounds are proved in [3, Lemmas 4.4 and 4.5] and combined together in [3, Eq. (4.20)]. Applying Theorem 2.5 of the present paper, we thus know that every B-shift ergodic µ ∈ Gβ is dominated either by blocks of type G1 or by blocks of type G2 . Since ρµ (B) ≤ in all states, the existence of µ1 , µ2 ∈ Gβ satisfying (4.17) follows by symmetry with respect to rotation (of the lattice) by 90-degrees. 4.4. Nonlinear vector models. A class of models with continuous symmetry that are conceptually close to the Potts model has been studied recently by van Enter and Shlosman [17]. As for our previous examples with continuous spins, Pirogov-Sinai theory is not readily available and one has to rely on chessboard estimates. We will focus our attention on one example in this class, a nonlinear ferromagnet, although our conclusions apply with appropriate, and somewhat delicate, modifications also to liquid crystal models and lattice gauge models discussed in [17].



Let us consider an O(2)-spin system on $\mathbb{Z}^2$ with spins parametrized by the angular variables $\phi_x \in (-\pi, \pi]$. The Hamiltonian is given by

\[
H(\phi) = - \sum_{\langle x,y \rangle} \Bigl( \frac{1 + \cos(\phi_x - \phi_y)}{2} \Bigr)^{p}, \tag{4.19}
\]

where p is a nonlinearity parameter. The a priori distribution of the φx ’s is the Lebesgue measure on (−π, π ]; the difference φx − φy is always taken modulo 2π . In order to define the good block events, we first split all bonds into three classes. Namely, given a configuration (φx )x∈Z2 , we say that the bond x, y is (1) strongly ordered if |φx − φy | ≤ (2) weakly ordered if

1 √ C p

1 √ C p,

< |φx − φy |
0 and each sufficiently large C > 1, there exists p0 > 0 such that for all p > p0 , there exists a number βt ∈ (0, ∞) and two distinct, shift-ergodic Gibbs states µso , µdis ∈ Gβt such that ρµso (Gso ) ≥ 1 − and ρµdis (Gdis ) ≥ 1 − .

(4.22)

In addition, for all shift-ergodic Gibbs states µ ∈ Gβt , we have either ρµ (Gdis ) ≥ 1 − or ρµ (Gso ) ≥ 1 − ,

(4.23)

ρµ (Gso ) ≥ 1 − for all shift-ergodic µ ∈ Gβ with β > βt

(4.24)

while



and ρµ (Gdis ) ≥ 1 − for all shift-ergodic µ ∈ Gβ with β < βt .

(4.25)

Finally, for every p > p0 and C large, every ergodic Gibbs state will have energy near zero when β > βt and at least 1 − O(C −2 ) when β < βt . We remark that the existence of a first-order transition in energy density has been a matter of some controversy in the physics literature; see [16, 17] for more discussion and relevant references. The proof of Theorem 4.4 is fairly technical and it is therefore deferred to Sect. 5.

4.5. Magnetostriction transition. Our final example is the magnetostriction transition studied recently by Shlosman and Zagrebnov [33]. The specific system considered in [33] has the Hamiltonian H (σ, r) = − J (rx,y )σx σy + κ (rx,y − R)2 + λ (rx,y − rz,y )2 . (4.26) x,y

x,y

x,y,z,y √ |x−z|= 2

Here the sites $x\in\mathbb{Z}^d$ label the atoms in a crystal; the atoms have magnetic moments represented by the Ising spins $\sigma_x$. The crystal is not rigid; the variables $r_{x,y}\in\mathbb{R}$, $r_{x,y}>0$, play the role of spatial distance between neighboring crystal sites. The word magnetostriction refers to the phenomenon where a solid undergoes a magnetic transition accompanied by a drastic change in the crystalline structure. In [33] such a transition was proven for interaction potentials $J = J(r_{x,y})$ that are strong at short distances and weak at large distances. The relevant states are characterized by disjoint contracted,
$$\mathcal{G}^{\rm contr} = \bigl\{(r,\sigma)\colon r_{x,y}\le\eta,\ \forall x,y\in\Lambda_1,\ |x-y|=1\bigr\}, \tag{4.27}$$
and expanded,
$$\mathcal{G}^{\rm exp,\pm} = \bigl\{(r,\sigma)\colon r_{x,y}\ge\eta+\epsilon,\ \forall x,y\in\Lambda_1,\ |x-y|=1\bigr\}\cap\bigl\{\sigma_x=\pm1,\ \forall x\in\Lambda_1\bigr\}, \tag{4.28}$$
block events. The parameters $\eta$ and $\epsilon$ can be chosen so that there exists $\beta_t\in(0,\infty)$ for which the following holds:
(1) For all $\beta\le\beta_t$ there exists an expanded Gibbs state $\mu^{\rm exp}\in\mathfrak{G}_\beta$ such that $\rho_{\mu^{\rm exp}}(\mathcal{G}^{\rm exp})\ge 3/4$;
(2) For all $\beta\ge\beta_t$ there exist two distinct contracted Gibbs states $\mu^{\rm contr,\pm}\in\mathfrak{G}_\beta$ such that $\rho_{\mu^{\rm contr,\pm}}(\mathcal{G}^{\rm contr,\pm})\ge 3/4$.
In particular, at $\beta=\beta_t$ there exist three distinct Gibbs states: one expanded and two contracted with opposite values of the magnetization. The authors conjecture that these are the only shift-ergodic Gibbs states at $\beta=\beta_t$. Unfortunately, the above system has unbounded interactions and so it is not strictly of the form to which Theorem 2.5 applies. Instead we will use the following generalization:

Theorem 4.5. Let $d\ge2$ and consider a spin system with translation-invariant finite-range interaction potentials $(\Phi_A)_{A\Subset\mathbb{Z}^d}$ such that the torus measure is reflection positive for all even $L$. Let $\mathcal{G}_1,\dots,\mathcal{G}_r$ be a collection of good $B$-block events satisfying the requirements in Theorem 2.5 and let $\mathcal{B}$ be the corresponding bad event. Then for all $\epsilon>0$ there exists $\delta>0$ (depending possibly only on $d$ but not on details of the model nor on $n$ or $B$) such that for all $\beta\ge0$ for which $p_\beta(\mathcal{B})<\delta$ the following is true: If $\mu\in\mathfrak{G}_\beta$ is a $B$-shift ergodic Gibbs state with
$$\sum_{\substack{A\colon A\Subset\mathbb{Z}^d\\ 0\in A}} E_\mu|\Phi_A| < \infty, \tag{4.29}$$
then we have
$$\rho_\mu(\mathcal{B})\in[0,\epsilon], \tag{4.30}$$
and there exists $i\in\{1,\dots,r\}$ such that
$$\rho_\mu(\mathcal{G}_i)\ge1-\epsilon. \tag{4.31}$$

Proof. The proof is virtually identical to that of Theorem 2.5 with one exception: Since the interactions are not bounded, we cannot use Lemma 3.4 directly. Suppose we have a Gibbs state $\mu$ that obeys (4.29) but violates one of the conditions (4.30–4.31). Let $R_{M,N}(C_N)$ be as in (3.4). Lemma 3.1 still applies and so we have (3.5) for some $N$. Let $L = MNB$ and let $D_M$ be the event that the boundary energy in the box $\Lambda_L$ is less than $cM^{d-1}$, i.e.,
$$D_M = \Bigl\{\sum_{\substack{A\colon A\cap\Lambda_L\ne\emptyset\\ A\cap\Lambda_L^{\rm c}\ne\emptyset}} |\Phi_A| \le cM^{d-1}\Bigr\}, \tag{4.32}$$
where $c$ is a positive constant. In light of the condition (4.29), the fact that the interaction has a finite range, and the Chebyshev bound, it is clear that we can choose $c$ so that $\mu(D_M^{\rm c}) < (4N^d)^{-1}$ for all $M$. Hence, we have
$$\mu\bigl(D_M\cap\{R_{M,N}(C_N)>1/2\}\bigr) > \frac{1}{4N^d}. \tag{4.33}$$
Next let $s$ and $\bar s$ be as in the proof of Lemma 3.4 and suppose that both $s$ and $\bar s$ belong to $D_M$. Then, by definition,
$$H_{\Lambda_L}(s) - H_{\Lambda_L}(\bar s) \le 2cM^{d-1} \tag{4.34}$$
and, applying the rest of the proof of Lemma 3.4, we thus have
$$\mu\bigl(D_M\cap\{R_{M,N}(C_N)>1/2\}\bigr) \le e^{2\beta cM^{d-1}}\, P_{2L,\beta}\bigl(D_M\cap\{R_{M,N}(C_N)>1/2\}\bigr). \tag{4.35}$$
Neglecting $D_M$ on the right-hand side and invoking (3.28), we again derive the desired contradiction once $M$ is sufficiently large.


With Theorem 4.5 in hand, we can extract the desired conclusion for the magnetostriction transition. First, the energy condition is clearly satisfied in any state generated by tempered boundary conditions. We then know that, in every such ergodic state $\mu$, only a small number of blocks will feature bonds that are neither contracted (and magnetized) nor expanded (and non-magnetized):
$$\rho_\mu(\mathcal{G}^{\rm exp}),\ \rho_\mu(\mathcal{G}^{\rm exp,\pm}) \in [0,\epsilon]\cup[1-\epsilon,1] \quad\text{and}\quad \rho_\mu(\mathcal{B})\le\epsilon. \tag{4.36}$$

The existence of a phase transition follows by noting that the contracted states have less energy than the expanded ones; there is thus a jump in the energy density as the temperature varies.

5. Appendix

The goal of this section is to prove Theorem 4.4, which concerns the non-linear vector model with interaction (4.19). The technical part of the proof is encapsulated into the following claim:

Proposition 5.1. There exists a constant $C_0>0$ such that for all $\delta>0$ and all $C\ge C_0$ the following holds: There exists $p_0>0$ such that for all $p\ge p_0$ we have
$$\sup_{\beta\ge0} p_\beta\bigl((\mathcal{G}_{\rm so}\cup\mathcal{G}_{\rm dis})^{\rm c}\bigr) < \delta \tag{5.1}$$
and
$$\lim_{\beta\to\infty} p_\beta(\mathcal{G}_{\rm dis}) = 0 \quad\text{and}\quad \lim_{\beta\downarrow0} p_\beta(\mathcal{G}_{\rm so}) < \delta. \tag{5.2}$$
To prove this proposition, we will need to carry out a sequence of energy and entropy bounds. To make our energy estimates easier, and uniform in $p$, we first notice that there are constants $0<a<b$ such that
$$e^{-bx^2} \le \frac{1+\cos(x)}{2} \le e^{-ax^2}, \qquad -1\le x\le1. \tag{5.3}$$
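The two-sided Gaussian bound (5.3) is easy to probe numerically. The sketch below checks it on a grid in $[-1,1]$; the particular values $a = 1/4$ and $b = 0.3$ are illustrative assumptions (the text only asserts that some constants $0<a<b$ exist), and the helper name `gaussian_sandwich_holds` is hypothetical.

```python
import math

def gaussian_sandwich_holds(a, b, samples=2001, tol=1e-12):
    """Check exp(-b x^2) <= (1 + cos x)/2 <= exp(-a x^2) on a grid in [-1, 1]."""
    for k in range(samples):
        x = -1.0 + 2.0 * k / (samples - 1)
        middle = 0.5 * (1.0 + math.cos(x))
        lower = math.exp(-b * x * x)
        upper = math.exp(-a * x * x)
        # Small tolerance so that rounding near x = 0 does not produce a false failure.
        if lower > middle + tol or middle > upper + tol:
            return False
    return True
```

A grid check of this kind is of course not a proof; it only indicates that, for example, $a = 1/4$ and $b = 0.3$ are admissible, whereas taking $a = b$ fails at the endpoints $x = \pm1$.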

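The argument commences by splitting the bad event $\mathcal{B} = (\mathcal{G}_{\rm so}\cup\mathcal{G}_{\rm dis})^{\rm c}$ into two events: The event $\mathcal{B}_{\rm wo}$ that $\Lambda_1$ contains a weakly-ordered bond, and $\mathcal{B}_{\rm mix} = \mathcal{B}\setminus\mathcal{B}_{\rm wo}$ which, as a moment's thought reveals, is the event that $\Lambda_1$ contains two adjacent bonds, one of which is strongly ordered and the other disordered. The principal chessboard estimate yields the following lemma:

Lemma 5.2. Suppose that $C\le\sqrt{p}$. Then
$$p_\beta(\mathcal{B}_{\rm wo}) \le 4\,\min\Bigl\{\frac{C^2}{\kappa}\,e^{-2\beta[e^{-b\kappa^2/C^2}-e^{-a/C^2}]},\ \frac{C}{\pi\sqrt{p}}\,e^{2\beta e^{-a/C^2}}\Bigr\}^{1/4} \tag{5.4}$$
and
$$p_\beta(\mathcal{B}_{\rm mix}) \le 4\,\min\Bigl\{e^{-2\beta[\frac32 e^{-b/C^2}-1-e^{-aC^2}]},\ e^{2\beta}\Bigl(\frac{1}{\pi C\sqrt{p}}\Bigr)^{3/4}\Bigr\}^{1/2} \tag{5.5}$$
for all $\beta\ge0$ and all $\kappa\in(0,1)$. Moreover, we have
$$p_\beta(\mathcal{G}_{\rm dis}) \le \pi C\sqrt{p}\,\exp\bigl\{-2\beta\bigl[e^{-b/C^2}-e^{-aC^2}\bigr]\bigr\} \tag{5.6}$$
and
$$p_\beta(\mathcal{G}_{\rm so}) \le e^{2\beta}\,\frac{1}{\pi C\sqrt{p}}. \tag{5.7}$$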
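The behaviour of the last two bounds can be sketched numerically. The code below evaluates the right-hand sides of (5.6) and (5.7), read as $\pi C\sqrt{p}\,\exp\{-2\beta[e^{-b/C^2}-e^{-aC^2}]\}$ and $e^{2\beta}/(\pi C\sqrt{p})$; the default constants $a = 1/4$, $b = 0.3$ and the function names are assumptions made for illustration only.

```python
import math

def bound_disordered(beta, C, p, a=0.25, b=0.3):
    """Right-hand side of (5.6): upper bound on p_beta(G_dis)."""
    gap = math.exp(-b / C**2) - math.exp(-a * C**2)
    return math.pi * C * math.sqrt(p) * math.exp(-2.0 * beta * gap)

def bound_strongly_ordered(beta, C, p):
    """Right-hand side of (5.7): upper bound on p_beta(G_so)."""
    return math.exp(2.0 * beta) / (math.pi * C * math.sqrt(p))
```

With $C = 2$ and $p = 100$, the bound on $p_\beta(\mathcal{G}_{\rm so})$ at $\beta = 0$ equals $1/(20\pi)$, which is small, while the bound on $p_\beta(\mathcal{G}_{\rm dis})$ decays exponentially in $\beta$ as soon as $e^{-b/C^2} > e^{-aC^2}$. These are the two regimes that the forbidden-gap argument plays against each other.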

Proof. Let $Z_L$ be the partition function obtained by integrating $e^{-\beta H_L}$ over all allowed configurations. Consider the following reduced partition functions:
(1) $Z_L^{\rm dis}$, obtained by integrating $e^{-\beta H_L}$ subject to the restriction that every bond in $\mathbb{T}_L$ is disordered.
(2) $Z_L^{\rm so}$, obtained similarly while stipulating that every bond in $\mathbb{T}_L$ is strongly ordered.
(3) $Z_L^{\rm wo}$, in which every bond in $\mathbb{T}_L$ is asked to be weakly ordered.
(4) $Z_L^{\rm mix}$, enforcing that every other horizontal line contains only strongly-ordered bonds, and the remaining lines contain only disordered bonds. A similar periodic pattern is imposed on vertical lines as well.
To prove the lemma, we will need upper and lower bounds on the partition functions in (1–2), and upper bounds on the partition functions in (3–4). We begin with upper and lower bounds on $Z_L^{\rm dis}$. First, using the fact that the Hamiltonian is always non-positive, we have $e^{-\beta H_L}\ge1$. On the other hand, the inequalities (5.3) and a natural monotonicity of the interaction imply that
$$\Bigl(\frac{1+\cos(\phi_x-\phi_y)}{2}\Bigr)^p \le \Bigl(\frac{1+\cos(C/\sqrt{p})}{2}\Bigr)^p \le e^{-aC^2} \tag{5.8}$$
whenever $\langle x,y\rangle$ is a disordered bond. In particular, $-\beta H_L$ is less than $2\beta e^{-aC^2}|\mathbb{T}_L|$ for every configuration contributing to $Z_L^{\rm dis}$. Using these observations we now easily derive that
$$(2\pi)^{|\mathbb{T}_L|} \le Z_L^{\rm dis} \le (2\pi)^{|\mathbb{T}_L|}\,e^{2\beta e^{-aC^2}|\mathbb{T}_L|}. \tag{5.9}$$
Similarly, for the partition function $Z_L^{\rm so}$ we get
$$\Bigl(e^{2\beta e^{-b\kappa^2/C^2}}\,\frac{2\kappa}{C\sqrt{p}}\Bigr)^{|\mathbb{T}_L|} \le Z_L^{\rm so} \le 2\pi\,e^{2\beta|\mathbb{T}_L|}\Bigl(\frac{2}{C\sqrt{p}}\Bigr)^{|\mathbb{T}_L|-1}. \tag{5.10}$$
Indeed, for the upper bound we first note that $-\beta H_L\le2\beta|\mathbb{T}_L|$. Then we fix a tree spanning all vertices of $\mathbb{T}_L$, disregard the constraints everywhere except on the edges in the tree and, starting from the "leaves," we sequentially integrate all site variables. (Thus, each site is effectively forced into an interval of length $\frac{2}{C\sqrt{p}}$, except for the "root" which retains all of its $2\pi$ possibilities.) For the lower bound we fix a number $\kappa\in(0,1)$ and restrict the integrals to configurations such that $|\phi_x-\phi_y|\le\frac{\kappa}{C\sqrt{p}}$ for all bonds $\langle x,y\rangle$ in $\mathbb{T}_L$. The bound $-\beta H_L\ge2\beta e^{-b\kappa^2/C^2}|\mathbb{T}_L|$ then permits us to estimate away the Boltzmann factor for all configurations; the entropy factor reflects the fact that each site can vary throughout an interval of length at least $\frac{2\kappa}{C\sqrt{p}}$.

Next we will derive good upper bounds on the remaining two partition functions. First, similar estimates as those leading to the upper bound in (5.10) give us
$$Z_L^{\rm wo} \le 2\pi\Bigl(e^{2\beta e^{-a/C^2}}\,\frac{2C}{\sqrt{p}}\Bigr)^{|\mathbb{T}_L|}. \tag{5.11}$$
For the partition function $Z_L^{\rm mix}$ we note that 1/4 of all sites are adjacent only to disordered bonds, while the remaining 3/4 are connected to one another via a grid of strongly-ordered bonds. Estimating $-\beta H_L\le\beta(1+e^{-aC^2})|\mathbb{T}_L|$ for all relevant configurations, similar calculations as those leading to (5.10) again give us
$$Z_L^{\rm mix} \le 2\pi\,e^{\beta(1+e^{-aC^2})|\mathbb{T}_L|}\,(2\pi)^{|\mathbb{T}_L|/4}\Bigl(\frac{2}{C\sqrt{p}}\Bigr)^{\frac34|\mathbb{T}_L|-1}. \tag{5.12}$$

It now remains to combine these estimates into the bounds on the quantities on the left-hand side of (5.4–5.5) and (5.6–5.7). We begin with the bound (5.6). Clearly, $p_\beta(\mathcal{G}_{\rm dis})$ is the $L\to\infty$ limit of $(Z_L^{\rm dis}/Z_L)^{1/|\mathbb{T}_L|}$, which using the lower bound $Z_L\ge Z_L^{\rm so}$ with $\kappa=1$ easily implies (5.6). The bound (5.7) is obtained similarly, except that now we use that $Z_L\ge Z_L^{\rm dis}$. The remaining two bounds will conveniently use the fact that for two-dimensional nearest-neighbor models, and square tori, the torus measure $P_{L,\beta}$ is reflection positive even with respect to the diagonal planes in $\mathbb{T}_L$. Indeed, focusing on (5.4) for a moment, we first note that $\mathcal{B}_{\rm wo}$ is covered by the union of four (non-disjoint) events characterized by the position of the weakly-ordered bond on $\Lambda_1$. If $\mathcal{B}_{\rm wo}^{(1)}$ is the event that the lower horizontal bond is the culprit, the subadditivity property of $p_\beta$ (see Lemma 6.3 of [5]) gives us $p_\beta(\mathcal{B}_{\rm wo})\le4p_\beta(\mathcal{B}_{\rm wo}^{(1)})$. Disseminating $\mathcal{B}_{\rm wo}^{(1)}$ using reflections in coordinate directions, we obtain an event enforcing weakly-ordered bonds on every other horizontal line. Next we apply a reflection in a diagonal line of even parity to make this into an even parity grid. From the perspective of reflections in odd-parity diagonal lines (i.e., those not passing through the vertices of the grid) half of the "cells" enforce all four bonds therein to be weakly ordered, while the other half does nothing. Applying chessboard estimates for these diagonal reflections, we get rid of the latter cells. The result of all these operations is the bound
$$p_\beta(\mathcal{B}_{\rm wo}) \le 4\lim_{L\to\infty}\Bigl(\frac{Z_L^{\rm wo}}{Z_L}\Bigr)^{\frac{1}{4|\mathbb{T}_L|}}. \tag{5.13}$$

Estimating $Z_L$ from below by the left-hand sides of (5.9–5.10) now directly implies (5.4). The event $\mathcal{B}_{\rm mix}$ is handled similarly: First we fix a position of the ordered-disordered pair of bonds and use subadditivity of $p_\beta$ to enforce the same choice at every lattice plaquette; this leaves us with four overall choices. Next we use diagonal reflections to produce the event underlying $Z_L^{\rm mix}$. Estimating $Z_L$ from below by the 1/4-th power of the lower bound in (5.9) and the 3/4-th power of the lower bound in (5.10) with $\kappa=1$, we get the first term in the minimum in (5.5). To get the second term, we use that $Z_L\ge Z_L^{\rm dis}$, apply (5.12) and invoke the bound $1+e^{-aC^2}\le2$.

Proof of Proposition 5.1. The desired properties are simple consequences of the bounds in Lemma 5.2. Indeed, if $C$ is so large that $e^{-b/C^2}>e^{-aC^2}$, then (5.6) implies that $p_\beta(\mathcal{G}_{\rm dis})\to0$ as $\beta\to\infty$. On the other hand, (5.7) shows that the $\beta\to0$ limit of $p_\beta(\mathcal{G}_{\rm so})$ is of order $1/\sqrt{p}$, which can be made as small as desired by choosing $p$ sufficiently large. To prove also (5.1), we first invoke Lemma 6.3 of [5] one last time to see that $p_\beta(\mathcal{B})\le p_\beta(\mathcal{B}_{\rm wo})+p_\beta(\mathcal{B}_{\rm mix})$. We thus have to show that both $p_\beta(\mathcal{B}_{\rm wo})$ and $p_\beta(\mathcal{B}_{\rm mix})$ can be made arbitrarily small by increasing $p$ appropriately. We begin with $p_\beta(\mathcal{B}_{\rm mix})$. Let $C$ be so large that
$$\tfrac32 e^{-b/C^2} - 1 - e^{-aC^2} > 0. \tag{5.14}$$
Then for $\beta$ such that $e^{2\beta}>p^{1/4}$ the first term in the minimum in (5.5) decays like a negative power of $p$, while for the complementary values of $\beta$, the second term is $O(p^{-1/8})$. As to the remaining term, $p_\beta(\mathcal{B}_{\rm wo})$, here we choose $\kappa\in(0,1)$ such that
$$e^{-b\kappa^2/C^2} - e^{-a/C^2} > 0, \tag{5.15}$$
and apply the first part of the minimum in (5.4) for $\beta$ with $e^{2\beta}\ge\sqrt{p}$, and the second part for the complementary $\beta$, to show that $p_\beta(\mathcal{B}_{\rm wo})$ is also bounded by a constant times a negative power of $p$, independently of $\beta$. Choosing $p$ large, (5.1) follows.

Now we can finally prove Theorem 4.4:

Proof of Theorem 4.4. We will plug the claims of Proposition 5.1 into our main theorem. First, it is easy to check that the good block events $\mathcal{G}_{\rm so}$ and $\mathcal{G}_{\rm dis}$ satisfy Conditions (1) and (2) of Theorem 2.5. Then (5.1) and (2.15–2.16) imply that either
$$\rho_\mu(\mathcal{G}_{\rm dis})\ge1-\epsilon \quad\text{or}\quad \rho_\mu(\mathcal{G}_{\rm so})\ge1-\epsilon \tag{5.16}$$
for all shift-ergodic Gibbs states $\mu\in\mathfrak{G}_\beta$ and all $\beta\in(0,\infty)$. The limits (5.2) and Corollary 2.6 then imply the existence of the transition temperature $\beta_t$ and of the corresponding coexisting states. Since the energy density with negative sign undergoes a jump at $\beta_t$ from values $e^{-b/C^2}$ to values $e^{-aC^2}$, which differ by almost one once $C\gg1$, all ergodic states for $\beta>\beta_t$ must have small energy density while the states for $\beta<\beta_t$ will have quite a lot of energy. Applying (5.16), all ergodic $\mu\in\mathfrak{G}_\beta$ for $\beta>\beta_t$ must be dominated by strongly-ordered bonds, while those for $\beta<\beta_t$ must be dominated by disordered bonds.

Acknowledgement. The research of M.B. was supported by the NSF grant DMS-0306167 and that of R.K. by the grants GAČR 201/03/0478 and MSM 0021620845. Large parts of this paper were written while both authors visited Microsoft Research in Redmond. The authors would like to thank Senya Shlosman, Aernout van Enter and an anonymous referee for many valuable suggestions on the first version of this paper.

References
1. Alexander, K.S., Chayes, L.: Non-perturbative criteria for Gibbsian uniqueness. Commun. Math. Phys. 189(2), 447–464 (1997)
2. Biskup, M., Chayes, L., Crawford, N.: Mean-field driven first-order phase transitions in systems with long-range interactions. J. Statist. Phys. (to appear)
3. Biskup, M., Chayes, L., Kivelson, S.A.: Order by disorder, without order, in a two-dimensional spin system with O(2)-symmetry. Ann. Henri Poincaré 5(6), 1181–1205 (2004)
4. Biskup, M., Chayes, L., Kotecký, R.: Coexistence of partially disordered/ordered phases in an extended Potts model. J. Statist. Phys. 99(5/6), 1169–1206 (2000)
5. Biskup, M., Chayes, L., Nussinov, Z.: Orbital ordering in transition-metal compounds: I. The 120-degree model. Commun. Math. Phys. 255, 253–292 (2005)
6. Biskup, M., Chayes, L., Nussinov, Z.: Orbital ordering in transition-metal compounds: II. The orbital-compass model. In preparation
7. Borgs, C., Waxler, R.: First order phase transitions in unbounded spin systems. I. Construction of the phase diagram. Commun. Math. Phys. 126, 291–324 (1990)
8. Borgs, C., Waxler, R.: First order phase transitions in unbounded spin systems. II. Completeness of the phase diagram. Commun. Math. Phys. 126, 483–506 (1990)
9. Bricmont, J., Slawny, J.: Phase transitions in systems with a finite number of dominant ground states. J. Statist. Phys. 54(1-2), 89–161 (1989)
10. Chayes, L., Kotecký, R., Shlosman, S.B.: Aggregation and intermediate phases in dilute spin systems. Commun. Math. Phys. 171, 203–232 (1995)
11. Chayes, L., Kotecký, R., Shlosman, S.B.: Staggered phases in diluted systems with continuous spins. Commun. Math. Phys. 189, 631–640 (1997)
12. Chayes, L., Shlosman, S., Zagrebnov, V.: Discontinuity in magnetization in diluted O(n)-Models. J. Statist. Phys. 98, 537–549 (2000)
13. Dinaburg, E.I., Sinai, Ya.G.: An analysis of ANNNI model by Peierls' contour method. Commun. Math. Phys. 98(1), 119–144 (1985)
14. Dobrushin, R.L., Shlosman, S.B.: Phases corresponding to minima of the local energy. Selecta Math. Soviet. 1(4), 317–338 (1981)
15. Dobrushin, R.L., Zahradník, M.: Phase diagrams for continuous-spin models: an extension of the Pirogov-Sinaĭ theory. In: Dobrushin R.L. (ed.) Mathematical problems of statistical mechanics and dynamics. Math. Appl. (Soviet Ser.), Vol. 6, Dordrecht: Reidel, 1986, pp. 1–123
16. van Enter, A.C.D., Shlosman, S.B.: First-order transitions for n-vector models in two and more dimensions: Rigorous proof. Phys. Rev. Lett. 89, 285702 (2002)
17. van Enter, A.C.D., Shlosman, S.B.: Provable first-order transitions for nonlinear vector and gauge models with continuous symmetries. Commun. Math. Phys. 255, 21–32 (2005)
18. Fröhlich, J., Israel, R., Lieb, E.H., Simon, B.: Phase transitions and reflection positivity. I. General theory and long range models. Commun. Math. Phys. 62, 1–34 (1978)
19. Fröhlich, J., Israel, R., Lieb, E.H., Simon, B.: Phase transitions and reflection positivity. II. Lattice systems with short range and Coulomb interactions. J. Statist. Phys. 22, 297–347 (1980)
20. Fröhlich, J., Lieb, E.H.: Phase transitions in anisotropic lattice spin systems. Commun. Math. Phys. 60(3), 233–267 (1978)
21. Georgii, H.-O.: Gibbs Measures and Phase Transitions. de Gruyter Studies in Mathematics, Vol. 9, Berlin: Walter de Gruyter & Co., 1988
22. Imbrie, J.Z.: Phase diagrams and cluster expansions for low temperature P(ϕ)_2 models. I. The phase diagram. Commun. Math. Phys. 82(2), 261–304 (1981/82)
23. Imbrie, J.Z.: Phase diagrams and cluster expansions for low temperature P(ϕ)_2 models. II. The Schwinger functions. Commun. Math. Phys. 82(3), 305–343 (1981/82)
24. Kotecký, R., Laanait, L., Messager, A., Ruiz, J.: The q-state Potts model in the standard Pirogov-Sinaĭ theory: surface tensions and Wilson loops. J. Statist. Phys. 58(1-2), 199–248 (1990)
25. Kotecký, R., Shlosman, S.B.: First-order phase transitions in large entropy lattice models. Commun. Math. Phys. 83(4), 493–515 (1982)
26. Kotecký, R., Shlosman, S.B.: Existence of first-order transitions for Potts models. In: Albeverio, S., Combe, Ph., Sirigue-Collins M. (eds.), Proc. of the International Workshop, Stochastic Processes in Quantum Theory and Statistical Physics, Lecture Notes in Physics 173, Berlin-Heidelberg-New York: Springer-Verlag, 1982, pp. 248–253
27. Laanait, L., Messager, A., Miracle-Solé, S., Ruiz, J., Shlosman, S.: Interfaces in the Potts model. I. Pirogov-Sinai theory of the Fortuin-Kasteleyn representation. Commun. Math. Phys. 140(1), 81–91 (1991)
28. Martirosian, D.H.: Translation invariant Gibbs states in the q-state Potts model. Commun. Math. Phys. 105(2), 281–290 (1986)
29. Messager, A., Nachtergaele, B.: A model with simultaneous first and second order phase transitions. http://arxiv.org/list/cond-mat/0501229, 2005
30. Pirogov, S.A., Sinai, Ya.G.: Phase diagrams of classical lattice systems (Russian). Theor. Math. Phys. 25(3), 358–369 (1975)
31. Pirogov, S.A., Sinai, Ya.G.: Phase diagrams of classical lattice systems. Continuation (Russian). Theor. Math. Phys. 26(1), 61–76 (1976)
32. Shlosman, S.B.: The method of reflective positivity in the mathematical theory of phase transitions of the first kind (Russian). Uspekhi Mat. Nauk 41(3(249)), 69–111, 240 (1986)
33. Shlosman, S., Zagrebnov, V.: Magnetostriction transition. J. Statist. Phys. 114, 563–574 (2004)
34. Zahradník, M.: An alternate version of Pirogov-Sinai theory. Commun. Math. Phys. 93, 559–581 (1984)
35. Zahradník, M.: Contour methods and Pirogov-Sinai theory for continuous spin lattice models. In: R.A. Minlos, S. Shlosman, Yu.M. Suhov (eds.), On Dobrushin's way. From probability theory to statistical physics, Amer. Math. Soc. Transl. Ser. 2, Vol. 198, Providence, RI: Amer. Math. Soc., 2000, pp. 197–220

Communicated by M. Aizenman


Spectral Triples of Holonomy Loops

Johannes Aastrup1, Jesper Møller Grimstrup2,3

1 Institut für Analysis, Universität Hannover, Welfengarten 1, 30167 Hannover, Germany. E-mail: [email protected]
2 NORDITA, Blegdamsvej 17, 2100 Copenhagen, Denmark. E-mail: [email protected]
3 Science Institute, University of Iceland, Dunhaga 3, 107 Reykjavik, Iceland

Received: 4 May 2005 / Accepted: 27 October 2005 Published online: 31 March 2006 – © Springer-Verlag 2006

Abstract: The machinery of noncommutative geometry is applied to a space of connections. A noncommutative function algebra of loops closely related to holonomy loops is investigated. The space of connections is identified as a projective limit of Lie-groups composed of copies of the gauge group. A spectral triple over the space of connections is obtained by factoring out the diffeomorphism group. The triple consists of equivalence classes of loops acting on a hilbert space of sections in an infinite dimensional Clifford bundle. We find that the Dirac operator acting on this hilbert space does not fully comply with the axioms of a spectral triple.

Contents
1. Introduction
2. The Hoop Group
3. Hoop Group Representations
4. The Space $\bar{\mathcal{A}}$ as a Projective Limit
5. Spectral Triples over $G^n$ and the Projective Limit
5.1 The hilbert space
5.2 The Euler-Dirac operator
5.3 The algebra
5.4 An extended Euler-Dirac operator
6. The Space of Connections
6.1 Distances on $\bar{\mathcal{A}}$
7. Diffeomorphism Invariance
7.1 Transformations of states and operators
7.2 Diffeomorphism invariant states
7.3 Diffeomorphism invariance via equivalent triples
7.4 Spectral in the sense of Connes?
8. Discussion & Outlook
A. Clifford Algebras and Dirac Operators
B. Projective and Inductive Limits
B.1 Projective limits
B.2 Inductive limits
B.3 Constructing operators on inductive limits of hilbert spaces
References

1. Introduction

The story of noncommutative geometry starts with the idea that instead of studying spaces one studies algebras of functions on the spaces. A concrete result supporting this idea is the Gel'fand-Naimark theorem [1], which states that the world of locally compact Hausdorff spaces is, by taking the corresponding algebra of continuous complex valued functions vanishing at infinity, the same as the world of commutative $C^*$-algebras. Hence noncommutative $C^*$-algebras can be considered as noncommutative locally compact Hausdorff spaces. The crucial leap from noncommutative topology to geometry was made by Alain Connes [2]. The key observation is that the Dirac operator on a Riemannian manifold gives full information about the metric. This idea provides the definition of a noncommutative geometry, i.e. a spectral triple, by abstracting a Dirac operator as an operator acting on the same hilbert space as the (non)commutative algebra and satisfying a list of axioms that generalize the interaction rules of smooth functions with the Dirac operator. Prime examples of noncommutative geometries are given by quotient spaces. A conceptually simple case is the set of two points identified. The classical way of identification would be to consider just one point. The noncommutative quotient is to consider two by two matrices. So we regard the two sub-algebras
$$\begin{pmatrix}\mathbb{C}&0\\0&0\end{pmatrix},\qquad\begin{pmatrix}0&0\\0&\mathbb{C}\end{pmatrix}$$
as the function algebras over the two points. These algebras are then identified through the partial isometries
$$\begin{pmatrix}0&1\\0&0\end{pmatrix},\qquad\begin{pmatrix}0&0\\1&0\end{pmatrix},$$
which not only identify the points but also belong to the algebra. Represented on $H=\mathbb{C}\oplus\mathbb{C}$, the algebra of two by two matrices interacts with a Dirac operator given by
$$D=\begin{pmatrix}0&a\\-a&0\end{pmatrix},\qquad a\in\mathbb{R}.$$
This noncommutative geometry, when combined with the commutative algebra of smooth functions on a manifold, is related to the Higgs effect in the Connes-Lott model [3] and to the Higgs effect in Connes' full formulation of the Standard Model [4].
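The two-point example can be checked by hand, or with a few lines of code. The sketch below is an illustration only: it represents two by two matrices as nested tuples, takes the Dirac operator in the form printed above, and uses hypothetical helper names (`matmul`, `commutator`, `dirac`). It verifies that the partial isometries intertwine the two diagonal projections and that the Dirac operator fails to commute with a function supported on one point unless $a = 0$.

```python
def matmul(A, B):
    """Multiply two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def commutator(A, B):
    """Return AB - BA for 2x2 matrices."""
    AB, BA = matmul(A, B), matmul(B, A)
    return tuple(tuple(AB[i][j] - BA[i][j] for j in range(2)) for i in range(2))

E11 = ((1, 0), (0, 0))  # unit of the function algebra over the first point
E22 = ((0, 0), (0, 1))  # unit of the function algebra over the second point
E12 = ((0, 1), (0, 0))  # partial isometry identifying the two points
E21 = ((0, 0), (1, 0))  # its adjoint

def dirac(a):
    """Dirac operator on C + C with off-diagonal entries a and -a, as printed in the text."""
    return ((0, a), (-a, 0))
```

Here `matmul(E12, E21)` reproduces `E11` and `matmul(E21, E12)` reproduces `E22`, while `commutator(dirac(a), E11)` is purely off-diagonal with entries proportional to `a`; it is this off-diagonal part that carries the geometric information relating the two points.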
The crucial point is that exactly the noncommutativity of the algebra generates the entire bosonic sector, including the Higgs scalar, through fluctuations around the Dirac operator. The action of the standard model coupled to gravity comes out [5, 6] as
$$\langle\xi|\tilde D|\xi\rangle + \mathrm{Trace}\,\varphi\Bigl(\frac{\tilde D}{\Lambda}\Bigr),$$
where $\tilde D$ is the fluctuated Dirac operator, $\xi$ a hilbert state and $\varphi$ a suitable cutoff function selecting eigenvalues below the cutoff $\Lambda$. Unfortunately, this beautiful unification of the standard model with general relativity is completely classical. No clear notion of quantization exists within the framework of noncommutative geometry. The aim of this paper is to explore new ideas on the unification of noncommutative geometry with the principles of quantum field theory. Quantum field theory deals with spaces of field configurations. The central object is the path integral
$$\int\mathcal{D}\Phi\,\exp\Bigl(\frac{i}{\hbar}S[\Phi]\Bigr),$$
where $\Phi$ denotes the field content of the theory described by the (symmetries of the) classical action $S[\Phi]$, and $\mathcal{D}\Phi$ is a formal measure on the space of field configurations. Therefore, rather than dealing with manifolds or algebras of functions hereon, quantum field theory lives on the much larger spaces of field configurations. We now suggest the following: If Connes' formulation of the standard model and quantum field theory are to be linked, and if the principles of noncommutative geometry are fundamental (which we believe they are), then one should apply the machinery of noncommutative geometry to some space of field configurations. Further, since Connes' formulation of the standard model is in principle a gravitational theory (pure geometry), we suggest that the correct implementation of quantum theory must involve quantum gravity. Thus, we suggest studying a functional space related to general relativity. The aim is to find a suitable configuration space on which a generalized Dirac operator exists. A function algebra hereon may very well be naturally noncommutative (classically). The hope is that the Dirac operator will generate a kind of quantization of the underlying space. For the space of field configurations we use ideas from loop quantum gravity [7]. Here the space is the space of certain connections modulo gauge equivalence. The function algebra is generated by traced holonomies of connections along loops, i.e. all physical observables can be expressed by these. This gives a commutative algebra. However, the lesson taught by noncommutative geometry is that the noncommutativity of the algebra provides essential structure. The idea is therefore to keep the noncommutativity by taking holonomies without tracing them; a loop $L$ maps connections into group elements of $G$,
$$L\colon\nabla\to Hol(L,\nabla)\in G, \tag{1}$$
where $Hol(L,\nabla)$ is the holonomy along $L$, and $G$ is the gauge group which we, for now, assume to be compact. Loop functions like (1) correspond to an underlying space of gauge connections which includes also gauge equivalent connections. This will also resemble Connes' construction of the standard model, since we get an algebra of matrix valued functions over a configuration space just as Connes has matrix valued functions over a manifold. Furthermore, in loop quantum gravity a fibration of the space-time manifold into global space and time directions is considered. This is done in order to apply a canonical quantization scheme. In the present case the aim is to construct a spectral triple over a functional space of connections. For this purpose such a fibration is not needed and we therefore consider the whole manifold. Thus, the connections considered are space-time connections.

The central achievement of loop quantum gravity is its ability to obtain a separable hilbert space of loop functions via diffeomorphism invariance1 (see [8] and references therein). It is possible to extend these results to the case of a noncommutative algebra; we represent certain equivalence classes of noncommutative loop operators on a diffeomorphism invariant, separable hilbert space. Further, the Dirac operator we construct on the holonomy algebra is diffeomorphism invariant and hence also descends to the diffeomorphism invariant hilbert space. This is important since the Dirac operator stores the full physical information. Let us finally add a note on noncommutativity and quantum theory. Clearly, the noncommutativity suggested is classical: It is simply related to the non-Abelian structure of the group $G$ and therefore carries no quantum aspect. On the other hand, the Dirac operator which we construct will resemble a global functional derivation. As a Dirac operator it carries spectral information of the underlying space (the space of connections) and will enable integration theory. In this sense, the quantum aspect enters through the constructed Dirac operator.

Outline of the paper. The algebra of (untraced) holonomy transformations, which is a central object in this paper, is introduced as the hoop group $\mathcal{HG}$ in Sect. 2. Since a smooth connection in a $G$-bundle maps a loop $L\in\mathcal{HG}$ into $G$ homomorphically via the holonomy transform, we define in Sect. 3 the space $\bar{\mathcal{A}}$ of generalized connections as the set of homomorphisms
$$\bar{\mathcal{A}} = Hom(\mathcal{HG},G).$$
This is the functional space on which we wish to do geometry. Conversely, since the hoop group acts on $\bar{\mathcal{A}}$ simply by
$$H_L(\nabla) = \nabla(L),$$
we interpret $\mathcal{HG}$ as a noncommutative function algebra on $\bar{\mathcal{A}}$. The key technical tool for dealing with the space $\bar{\mathcal{A}}$ is described in Sect. 4. Referring to [9] we identify $\bar{\mathcal{A}}$ as a projective limit over the representations of finite subgroups of the hoop group. This enables us to work with only finitely many loops at a time. The space $\bar{\mathcal{A}}$ seen from finitely many loops looks like
$$G^n = \underbrace{G\times\dots\times G}_{n\ \text{times}},$$
where $n$ is related to the number of loops. Thus, since $G$ is a Lie-group, we are at this level dealing with just an ordinary manifold and we can therefore write down Dirac operators from classical geometry. A concrete realization of this technique/idea is worked out in Sect. 5. Since we are sitting in a projective system we are not entirely free to choose our Dirac operator; it has to fit with different choices of finitely many loops. In fact, problems arise from loops with common line segments. We remedy this defect by technically excluding such combinations of loops. Also, for technical reasons, we choose the classical Euler-Dirac instead of the real Dirac operator.

1 In fact, diffeomorphism invariance alone does not give a separable hilbert space. Instead one has to use a generalized notion of diffeomorphisms, see [8].

In doing this the link to connections becomes unclear. This is clarified in Sect. 6, where we show that the connection are still contained in the spectrum of the modified algebra. A key issue in the construction presented is the implementation of diffeomorphism invariance; the concern of Sect. 7. Using once more ideas from loop quantum gravity we construct diffeomorphism invariant states and a diffeomorphism invariant algebra of loop operators. Finally we are concerned with the question whether the obtained, diffeomorphism invariant triple is spectral in the sense of Connes. It turns out not to be the case since the eigenvalues of the Dirac operator have infinite multiplicity. In particular, this is linked to the kernel of the Euler-Dirac operator on G which has dimension larger than one. Although we are at present unable to solve the problem we suggest some possible solutions. We provide a final discussion and outlook in Sect. 8 and leave some extra material for the appendices. 2. The Hoop Group The starting point is a manifold M. Let us for simplicity assume that M is topological trivial. On this manifold we consider first the set P of piecewise analytic paths

P := P (t)|P : [0, 1] → M , where paths which differ only by a reparameterization are identified. If two paths P1 , P2 ∈ P have coinciding end and start points, P1 (1) = P2 (0), we define their product

P1 (2t) t ∈ 0, 21

. P1 ◦ P2 (t) = P2 (2t − 1) t ∈ 21 , 1 In case P1 (1) = P2 (0) we set their product to zero. There is a natural involution on P, P ∗ (t) = P (1 − t)

∀t,

since (P ∗ )∗ = P ,

(P1 ◦ P2 )∗ = P2∗ ◦ P1∗ .

Choose an arbitrary basepoint $x^o \in M$. We call a path which starts and ends at $x^o$ a based loop. Further, by a simple loop we understand a based loop for which

$$L(t) = x^o \Leftrightarrow t \in \{0, 1\}.$$

The set of based loops is called loop space and is denoted $\mathcal{L}_{x^o}$. An equivalence relation on loop space is generated by identifying loops which differ by a simple retracing along a path,

$$L_1 \sim L_2 \Leftrightarrow \begin{cases} L_1 = P_1 \circ P_2 \circ P_2^* \circ P_3, \\ L_2 = P_1 \circ P_3, \end{cases}$$

where $L_i \in \mathcal{L}_{x^o}$ and $P_i \in \mathcal{P}$. An equivalence class $[L]$ is called a hoop [10]. The set of hoops is called the hoop group, denoted

$$\mathcal{HG} = \mathcal{L}_{x^o} / \sim,$$

since the involution on $\mathcal{HG}$ gives an inverse element

$$[L] \cdot [L]^* = [L_{id}],$$

where $L_{id}$ is the trivial loop

$$L_{id}(t) = x^o \quad \forall t \in [0, 1].$$
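The retracing equivalence can be illustrated combinatorially. The sketch below makes a simplifying assumption that is not part of the text: a loop is encoded as a finite word of labelled path segments, with sign +1 for a segment $P$ and -1 for $P^*$, so that reparameterizations and partial overlaps are ignored. All names are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[str, int]   # (label, +1) stands for P, (label, -1) for P*
Word = List[Segment]

def reduce_word(word: Word) -> Word:
    """Cancel adjacent retracings P ∘ P* (or P* ∘ P) until none remain."""
    stack: Word = []
    for label, sign in word:
        if stack and stack[-1] == (label, -sign):
            stack.pop()                 # a segment followed by its reverse
        else:
            stack.append((label, sign))
    return stack

def inverse(word: Word) -> Word:
    """The involution: reverse the order and flip every orientation."""
    return [(label, -sign) for label, sign in reversed(word)]

loop_1 = [("P1", 1), ("P2", 1), ("P2", -1), ("P3", 1)]   # P1 ∘ P2 ∘ P2* ∘ P3
loop_2 = [("P1", 1), ("P3", 1)]                          # P1 ∘ P3

print(reduce_word(loop_1) == loop_2)            # same hoop
print(reduce_word(loop_1 + inverse(loop_1)))    # trivial hoop: empty word
```

In this encoding two words represent the same hoop when their reduced forms agree, and the empty word plays the role of $[L_{id}]$.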

To ease the notation we will denote a hoop $[L]$ simply by a representative $L$ of the equivalence class. Furthermore, for literary reasons we often call $[L]$ a loop. We emphasize that since M has no metric, any notion of distance between loops and hoops, or of their length, is meaningless.

3. Hoop Group Representations

Consider the space of homomorphisms

$$\bar{\mathcal{A}} = \mathrm{Hom}(\mathcal{HG}, G),$$

from the hoop group into a matrix representation of a compact Lie group G (we denote both the group and its representation by G; the group G is assumed to have a metric which is both left and right invariant). That is, for $\nabla \in \bar{\mathcal{A}}$ we have

$$\nabla(L_1) \cdot \nabla(L_2) = \nabla(L_1 \circ L_2) \quad \forall L_1, L_2 \in \mathcal{HG}.$$
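A homomorphism of this kind is fixed on a freely generated subgroup by choosing one group element per generator. The following sketch assumes G = SU(2), two hypothetical generators `a` and `b`, and loops written as words in these generators; none of these choices come from the text.

```python
import numpy as np

def su2(theta: float, axis) -> np.ndarray:
    """SU(2) element exp(-i*theta/2 * n.sigma) for a rotation axis n."""
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    n_sigma = n[0] * sx + n[1] * sy + n[2] * sz
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * n_sigma

# Hypothetical assignment of group elements to two independent generators.
GENERATORS = {"a": su2(0.7, [1, 0, 0]), "b": su2(1.3, [0, 1, 1])}

def nabla(word) -> np.ndarray:
    """Map a word of (generator, +1/-1) pairs to the product of group elements."""
    result = np.eye(2, dtype=complex)
    for label, sign in word:
        g = GENERATORS[label]
        result = result @ (g if sign == 1 else g.conj().T)  # inverse = adjoint
    return result

L1 = [("a", 1), ("b", -1)]
L2 = [("b", 1), ("b", 1), ("a", 1)]

# Composition of loops is concatenation of words.
print(np.allclose(nabla(L1) @ nabla(L2), nabla(L1 + L2)))
```

Because the assigned matrices are unitary, every value of `nabla` has matrix norm one, which is the property used for the norm of $H_L$ later in this section.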

If we denote by $\mathcal{A}$ the space of smooth connections in a bundle with structure group G, then a connection $\nabla \in \mathcal{A}$ clearly gives such a homomorphism via

$$\nabla : L \to \mathrm{Hol}(L, \nabla), \qquad (2)$$

where $\mathrm{Hol}(L, \nabla)$ is the holonomy of the connection around the loop L. Let us recall that the holonomy is the parallel transport of the connection along a path P,

$$\mathrm{Hol}(P, \nabla) = \mathrm{P} \exp\left( i \int_P \nabla \right),$$

where $\mathrm{P}$ is the path ordering symbol. The parallel transport along a closed loop is a nonlocal, gauge covariant object, and its trace, the Wilson loop, is gauge invariant. From (2) we conclude that

$$\mathcal{A} \subset \bar{\mathcal{A}}.$$

It is however important to realize that $\bar{\mathcal{A}}$ is much larger² [9].

The space $\bar{\mathcal{A}}$ is the general space of field configurations on which we wish to obtain a geometrical structure. We therefore consider an algebra of functions over $\bar{\mathcal{A}}$. To this end we first notice that a hoop $L \in \mathcal{HG}$ gives rise to a function $H_L$ on $\bar{\mathcal{A}}$ into G via

$$H_L(\nabla) = \nabla(L), \qquad (3)$$

where $\nabla \in \bar{\mathcal{A}}$. Notice that

$$H_{L_1} \cdot H_{L_2} = H_{L_1 \circ L_2}, \qquad (H_L)^* = H_{L^{-1}}, \qquad H_{L_{id}} = 1,$$

where $L_i \in \mathcal{HG}$. The set of complex linear combinations of all functions $H_L$ is a $*$-algebra. The norm of a general linear combination $a_1 H_{L_1} + \cdots + a_n H_{L_n}$ is defined by

$$\| a_1 H_{L_1} + \cdots + a_n H_{L_n} \| = \sup_{\nabla \in \bar{\mathcal{A}}} \| a_1 H_{L_1}(\nabla) + \cdots + a_n H_{L_n}(\nabla) \|,$$

where $\| \cdot \|$ on the rhs is the matrix norm. Notice that $\| H_L \| = 1$ $\forall L \in \mathcal{HG}$ if the group G is orthogonal or unitary. The closure in this norm of the algebra generated by functions $H_L$ is a $C^*$-algebra. Let us denote it $C^*(\mathcal{L}_{x^o})$. For now, this is the noncommutative function algebra over $\bar{\mathcal{A}}$ which we wish to embed in a spectral triple. However, as we will explain in the next section, we need to change the algebra slightly to be able to construct a Dirac operator.

² In fact, $\mathcal{A}$ has, with respect to the Ashtekar-Lewandowski measure, zero measure in $\bar{\mathcal{A}}$ (modulo gauge transformations).

4. The Space $\bar{\mathcal{A}}$ as a Projective Limit

The space $\bar{\mathcal{A}}$ was analyzed in [9] in a somewhat different context³. Here, the authors identify $\bar{\mathcal{A}}$ with a projective limit (see Appendix B for details on projective and inductive limits)

$$\varprojlim \mathrm{Hom}(F, G), \qquad F \in \mathcal{F},$$

where $\mathcal{F}$ is the set of all strongly independent, finitely generated subgroups of $\mathcal{HG}$ (strongly independent in the sense of [10]). Let $L_1, \ldots, L_{n(F)}$ be the strongly independent generators of $F \in \mathcal{F}$. We then identify [10]

$$\mathrm{Hom}(F, G) \simeq G^{n(F)},$$

since we just map $\varphi \in \mathrm{Hom}(F, G)$ into

$$(\varphi(L_1), \ldots, \varphi(L_{n(F)})) \in G^{n(F)}. \qquad (4)$$

This identification is of great advantage: since $G^{n(F)}$ is a Lie group it is now straightforward to construct a spectral triple by choosing a metric on G and then using the Euler-Dirac⁴ or the Dirac operator (since $G^{n(F)}$ is a Lie group it is parallelizable and hence possesses a spin structure). Once a geometrical construction on $G^{n(F)}$ is obtained we extend this to all of $\bar{\mathcal{A}}$ by taking the projective limit of the algebra and the inductive limit of the relevant hilbert space. Thus, it is tempting to consider the hilbert space

$$L^2(G^{n(F)}, M_k(\mathbb{C}) \otimes S),$$

where S is the Clifford algebra or the spin bundle corresponding to either the Euler-Dirac or the Dirac operator. $L^2$ is with respect to the Haar measure on $G^{n(F)}$.

³ The authors of [9] considered the smaller space $\bar{\mathcal{A}}/Ad$ of smooth connections modulo local gauge transformations. This otherwise important difference is not essential for the issues regarding the projective limit.

⁴ See Appendix A.

The problem with this construction is that the Euler-Dirac and the Dirac operators contain all the metric information on the underlying space [2] and the structure maps defining the projective limit are not metric. The problems can be traced back to the definition of the generating hoops. Here, following [10], we encounter overlapping hoops which lead to structure maps $P_{F_1,F_2} : \mathrm{Hom}(F_2, G) \to \mathrm{Hom}(F_1, G)$ of the form⁵

$$(g_1, g_2, g_3) \to g_1 g_2, \qquad (5)$$

where $F_1 \subset F_2$ lie in $\mathcal{F}$. The problem is that such maps do not have a canonical isometric cross-section. The solution to this problem is to redefine our notion of generating hoops. This, in turn, will affect the projective limit. Let us go into detail in the next section.

Before we do that we end this section by mentioning that the identification (4) indirectly chooses an orientation of the hoop. Basically, there are two possible identifications corresponding to either $\varphi(L)$ or $\varphi(L^{-1})$. Therefore, we can identify $\mathrm{Hom}(F, G)$, where F is a subgroup generated by a single hoop, with both $G$ and $G^{-1}$.

⁵ Here $\mathrm{Hom}(F_1, G)$ and $\mathrm{Hom}(F_2, G)$ are, as an example, identified with $G^1$ and $G^3$, respectively. A similar structure map with a $G^2$-subgroup is not possible due to the special construction of independent hoops.

5. Spectral Triples over $G^n$ and the Projective Limit

Let $\mathcal{F}_I$ be the set of finitely generated subgroups of $\mathcal{HG}$ with the property that they are generated by simple, non-selfintersecting loops that do not have overlapping segments or points⁶. The inclusion of groups $F_1 \subset F_2$ gives an inductive system on $\mathcal{F}_I$ and therefore a projective structure on $\{\mathrm{Hom}(F, G)\}_{F \in \mathcal{F}_I}$. Again, we can identify $\mathrm{Hom}(F, G)$ with $G^{n(F)}$, where $n(F)$, as before, is the number of simple loops in a generating set of F. Since we are looking at subgroups with the property that no two loops have overlapping segments, the maps

$$P_{F_1,F_2} : G^{n(F_2)} \to G^{n(F_1)}$$

induced by the inclusion $F_1 \subset F_2$ are just given by deleting some coordinates or inverting some coordinates. This eliminates structure maps of the form (5) and thus enables the following construction of a spectral triple.

5.1. The hilbert space. We first construct the hilbert space. We choose a left and right invariant metric on G. We therefore also have a metric on $G^{n(F)}$ and hence we can construct the Clifford bundle $Cl(TG^{n(F)})$. Due to the invariance of the metric we get the following result.

Proposition 5.1.1. There is an embedding of hilbert spaces

$$P^*_{F_1,F_2} : L^2(G^{n(F_1)}, Cl(TG^{n(F_1)})) \to L^2(G^{n(F_2)}, Cl(TG^{n(F_2)})),$$

where the measure on $G^{n(F_i)}$ is the Haar measure.

⁶ In contrast to [10] we no longer require loops to be piece-wise analytic. Nor does the manifold need a real analytic structure.



Proof. We will need some notation. Let $e_1, \ldots, e_n$ be an orthonormal basis in $T_{id}G$, the tangent space over the identity in G. Due to the invariance property of the metric we get that $D_g(id)(e_1), \ldots, D_g(id)(e_n)$ is an orthonormal basis in $T_g G$. Here $D_g(id)$ denotes the differential of the map

$$m_g : G \to G, \qquad m_g(g_1) = g g_1$$

in the identity. We will also use the notation $e_1, \ldots, e_n$ to denote the corresponding global vector fields in $TG$, i.e. $e_k(g) = D_g(id)(e_k)$. We will abbreviate $n(F_i)$ by $n_i$. We first consider the case where the projection $P_{F_1,F_2}$ is of the form

$$P_{F_1,F_2}(g_1, \ldots, g_{n_2}) = (g_1, \ldots, g_{n_1}), \qquad (6)$$

and denote by $e_1^1, \ldots, e_n^1, e_1^2, \ldots, e_n^2, \ldots, e_1^{n_i}, \ldots, e_n^{n_i}$ the global vector fields on $G^{n_i}$, where $e_1^k, \ldots, e_n^k$ denote the global vector fields $e_1, \ldots, e_n$ in the $k$th component of $TG^{n_i}$. Put $\bar{g}_{n_i} = (g_1, \ldots, g_{n_i})$. We clearly have that

$$\langle e_l^k(\bar{g}_{n_1}), e_{l'}^{k'}(\bar{g}_{n_1}) \rangle_{T_{\bar{g}_{n_1}} G^{n_1}} = \langle e_l^k(\bar{g}_{n_2}), e_{l'}^{k'}(\bar{g}_{n_2}) \rangle_{T_{\bar{g}_{n_2}} G^{n_2}},$$

where $k, k' \leq n_1$. An element in $L^2(G^{n_1}, Cl(TG^{n_1}))$ is a linear combination of elements of the form $fe$, where e is a product of elements in $e_1^1, \ldots, e_n^1, e_1^2, \ldots, e_n^2, \ldots, e_1^{n_1}, \ldots, e_n^{n_1}$, and $f \in L^2(G^{n_1})$. We define

$$P^*_{F_1,F_2}(fe) = \tilde{f} e,$$

where $\tilde{f}(\bar{g}_{n_2}) \equiv f(P_{F_1,F_2}(\bar{g}_{n_2})) = f(\bar{g}_{n_1})$. This map preserves the inner product since

$$\begin{aligned}
\langle fe, f'e' \rangle_{L^2(G^{n_1}, Cl(TG^{n_1}))} &= \int \bar{f}(\bar{g}_{n_1}) f'(\bar{g}_{n_1}) \langle e, e' \rangle_{T_{\bar{g}_{n_1}} G^{n_1}} \, d\mu_H(g_1) \cdots d\mu_H(g_{n_1}) \\
&= \int \bar{f}(\bar{g}_{n_1}) f'(\bar{g}_{n_1}) \langle e, e' \rangle_{T_{\bar{g}_{n_2}} G^{n_2}} \, d\mu_H(g_1) \cdots d\mu_H(g_{n_2}) \\
&= \langle \tilde{f} e, \tilde{f}' e' \rangle_{L^2(G^{n_2}, Cl(TG^{n_2}))},
\end{aligned}$$

where we have used that $\int 1 \, d\mu_H = 1$.
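The isometry property of the pullback can be checked numerically in the simplest setting. The sketch below makes assumptions beyond the text: G = U(1), the Clifford factor is dropped, $n_1 = 1$ and $n_2 = 2$, and the normalized Haar integral is approximated by an average over a uniform grid. The test functions are arbitrary.

```python
import numpy as np

N = 64
theta = 2 * np.pi * np.arange(N) / N          # uniform grid on U(1)

def inner(u: np.ndarray, v: np.ndarray) -> complex:
    """Inner product w.r.t. normalized Haar measure, approximated by a grid mean."""
    return np.mean(np.conj(u) * v)

def pullback(f: np.ndarray) -> np.ndarray:
    """(P* f)(g1, g2) = f(g1): the function is constant in the added coordinate."""
    return np.repeat(f[:, None], N, axis=1)

f = np.exp(1j * theta) + 0.5 * np.cos(3 * theta)
h = np.sin(2 * theta) + 2.0 * np.exp(1j * theta)

# Inner product on U(1) versus inner product of the pullbacks on U(1)^2.
print(np.isclose(inner(f, h), inner(pullback(f), pullback(h))))
```

The extra coordinate contributes only a factor $\int 1 \, d\mu_H = 1$, which is why the two inner products agree.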



To finish the construction we only need to consider a map of the form

$$P_{F_1,F_2}(g) = g^{-1}, \qquad (7)$$

since any structure map is the composition of maps of the type (6) and (7). However, the map

$$P^*_{F_1,F_2} : L^2(G, Cl(TG)) \to L^2(G, Cl(TG)),$$

defined by (with the notation from before)

$$P^*_{F_1,F_2}(fe)(g) = f(g^{-1}) \, DP^{-1}_{F_1,F_2}(e)$$

is, due to the left and right invariance of the metric, a map of hilbert spaces. This completes the proof.

We can now construct the direct limit of these hilbert spaces (see Appendix B for a more detailed discussion on inductive limits). This is done in the following way: First define

$$\mathcal{H}_{alg} = \oplus_{F \in \mathcal{F}_I} L^2(G^{n(F)}, Cl(TG^{n(F)})) / N,$$

where N is the subspace generated by elements of the form

$$(\ldots, v, \ldots, -P^*_{F_1,F_2}(v), \ldots).$$

In other words, we identify the vectors $v$ and $P^*_{F_1,F_2}(v)$. The problem is now to define an inner product on $\mathcal{H}_{alg}$. Decompose $L^2(G, Cl(TG))$ into the subspace generated by the function 1 and the orthogonal complement. We will write this as

$$L^2(G, Cl(TG)) = H_1 \oplus H_2,$$

where $H_1 = \mathbb{C}$. A vector $v \in L^2(G^{n(F)}, Cl(TG^{n(F)}))$ can be uniquely decomposed into vectors of the form

$$v_1 \otimes \cdots \otimes v_{n(F)},$$

where each $v_i$ belongs either to $H_1$ or $H_2$. It is therefore enough to define the inner product of vectors of this type. Further, let $v_1 \in L^2(G^{n(F_1)}, Cl(TG^{n(F_1)}))$ and $v_2 \in L^2(G^{n(F_2)}, Cl(TG^{n(F_2)}))$ be vectors of this form. We will assume that in the tensor decomposition of $v_1$ and $v_2$ only elements from $H_2$ appear. We can assume this since otherwise $v_1$ and/or $v_2$ will be the image under one of the $P^*$'s, and we can simply pull the vector back. We finally define the inner product by

$$\langle v_1, v_2 \rangle = \langle P^*_{F_1,F_3}(v_1), P^*_{F_2,F_3}(v_2) \rangle_{L^2(G^{n(F_3)}, Cl(TG^{n(F_3)}))} \qquad (8)$$

if there exists an $F_3$ with $F_1, F_2 \subset F_3$, and zero otherwise. The completion of $\mathcal{H}_{alg}$ with respect to this inner product is the inductive limit and will be denoted by $\mathcal{H}'_{si}$ (si ∼ segment independent). In Eq. (8), since $v_1$ and $v_2$ are, by definition, decomposed into tensor powers in $H_2$, the inner product will be different from zero only when $F_1 = F_2$.
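The statement that vectors built from $H_2$ on different loops are orthogonal can be seen in a small numerical example. This is a sketch under assumptions not made in the text: G = U(1), no Clifford factor, two non-intersecting loops whose common subgroup $F_3$ is identified with $U(1)^2$, and a grid average standing in for the Haar integral.

```python
import numpy as np

N = 64
theta = 2 * np.pi * np.arange(N) / N          # uniform grid on U(1)

def inner(u: np.ndarray, v: np.ndarray) -> complex:
    """Inner product w.r.t. normalized Haar measure, approximated by a grid mean."""
    return np.mean(np.conj(u) * v)

f = np.exp(1j * theta)     # lies in H2: orthogonal to the constant function
h = np.exp(2j * theta)     # also in H2

# Pull both vectors back to U(1)^2; f lives on loop 1 and h on loop 2.
f_on_loop_1 = np.repeat(f[:, None], N, axis=1)   # (g1, g2) -> f(g1)
h_on_loop_2 = np.repeat(h[None, :], N, axis=0)   # (g1, g2) -> h(g2)

print(abs(inner(f_on_loop_1, h_on_loop_2)) < 1e-12)   # different loops: orthogonal
print(np.isclose(inner(f_on_loop_1, f_on_loop_1), 1.0))  # same loop: norm preserved
```

The cross term factorizes into the product of the means of $\bar{f}$ and $h$, and each mean vanishes precisely because both functions are orthogonal to the constants.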



We can give a more concrete description of $\mathcal{H}'_{si}$ in terms of the hilbert space $H_2$, namely

$$\mathcal{H}'_{si} = \mathbb{C} \oplus (\oplus_{l^1} H_2) \oplus (\oplus_{l^2} H_2 \otimes H_2) \oplus \ldots, \qquad (9)$$

where $l^k$ is the set of all products of k nonintersecting simple loops, and where ⊕ means orthogonal sum. The first $\mathbb{C}$ corresponds to the trivial loop. For each simple loop we get a copy of $L^2(G, Cl(TG))$; however the constant functions are identified in the inductive limit, and we hence only get a copy of $H_2$ for each simple loop. This picture continues for products of two simple loops and so on.

5.2. The Euler-Dirac operator. On each of the hilbert spaces $L^2(G^{n(F)}, Cl(TG^{n(F)}))$ we have a canonical Euler-Dirac operator

$$D(\xi) = \sum_i e_i \cdot \nabla_{e_i}(\xi), \qquad (10)$$

where $\{e_i\}$ are global, orthonormal sections in the tangent bundle of $G^{n(F)}$ and ∇ is the Levi-Civita connection. It is clear that this Euler-Dirac operator commutes with the structure maps $P^*_{F_1,F_2}$ not involving inversions. According to [14] D can, under the identification of $Cl(TM)$ with $\wedge^*(T^*M)$ (differential forms), be identified with $d + d^*$. The exterior derivative d is invariant under all diffeomorphisms, and since $d^*$ only additionally depends on the metric and the metric on G is invariant under inversions, the Euler-Dirac operator also commutes with structure maps involving inversions. Therefore we get an Euler-Dirac operator D on $\mathcal{H}'_{si}$.

The reason why we choose the Euler-Dirac operator instead of the classical Dirac operator is that the former has better functorial properties. In particular, it is invariant under inversions of loops. If we consider for example the Abelian case, $G = S^1$, and parameterize $S^1$ by $\theta \in [0, 2\pi]$, then the Dirac operator reads

$$D = i \frac{\partial}{\partial \theta}, \qquad (11)$$

which, under inversion of the underlying loop

$$G \to G^{-1} \qquad (12)$$

picks up a minus sign. On the other hand, we have just argued that the Euler-Dirac operator is invariant under inversions. It is of course desirable to work out a construction that works for the classical Dirac operator, but for now we choose to work with the easier Euler-Dirac operator.

The particular choice of “Dirac” operator in (11) is motivated by its resemblance to an (integrated) functional derivation. Heuristically: A (smooth) connection is determined by holonomies along hoops. In the projective system described here we consider first a finite number of hoops and a connection is thus described ‘coarse-grained’ by assigning group elements to each of the finitely many elementary hoops. The Euler-Dirac operator (10) takes the derivative on each of these copies of the group G and throws it into the Clifford bundle. In this way the Dirac operator resembles a functional derivation operator.
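The different behaviour of the two operators under inversion can be verified by hand in the Abelian case. The following worked computation is a sketch under assumptions not spelled out in the text: inversion on $S^1$ is taken to be $\theta \to -\theta$ and acts by the pullback $\iota^*$, a general differential form is written $a + b\,d\theta$, and the flat metric on the circle is used with the convention $d^*(b\,d\theta) = -b'$.

```latex
% Classical Dirac operator D = i d/d\theta acting on functions
(\iota^* f)(\theta) = f(-\theta), \qquad
(D\,\iota^* f)(\theta) = i\,\frac{\partial}{\partial\theta} f(-\theta)
  = -i f'(-\theta) = -(\iota^* D f)(\theta)
\quad\Longrightarrow\quad D\,\iota^* = -\,\iota^* D .

% Euler-Dirac operator d + d^* acting on forms a + b\,d\theta
(d + d^*)(a + b\,d\theta) = a'\,d\theta - b', \qquad
\iota^*(a + b\,d\theta) = a(-\theta) - b(-\theta)\,d\theta ,

(d + d^*)\,\iota^*(a + b\,d\theta)
  = -a'(-\theta)\,d\theta - b'(-\theta)
  = \iota^*\big(a'\,d\theta - b'\big)
  = \iota^*\,(d + d^*)(a + b\,d\theta) .
```

The sign picked up by the derivative is compensated in the second case by the sign of $\iota^* d\theta = -d\theta$, so $d + d^*$ commutes with the inversion while $i\,\partial/\partial\theta$ anticommutes with it.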



We interpret this Euler-Dirac operator as intrinsically ‘quantum’ since it bears some resemblance to a canonical conjugate of the connection. Heuristically, we write

$$D \sim \frac{\delta}{\delta \nabla} \qquad (13)$$

and

$$H_L \sim 1 + \nabla \qquad (14)$$

due to $H_L$'s relation to the holonomy map. Here ∇ is a connection. From (13) and (14) the non-vanishing commutator

$$[D, H_L] \neq 0$$

obtains, on a very heuristic level, a resemblance to a commutation relation of canonical conjugate variables. Thus, it is not the noncommutativity of the algebra of holonomy loops (to be defined rigorously below) which is ‘quantum’ but rather the Dirac operator and its interaction with the algebra. This is an essential point for the interpretation of the geometrical construction presented.

5.3. The algebra. We will construct our algebra as an algebra of operators on $\mathcal{H}'_{si}$, or rather on a variant hereof, denoted $\mathcal{H}_{si}$. This algebra will be similar, but not equal, to the group algebra $C^*(\mathcal{L}_{x^o})$ of hoops.

The hilbert space $\mathcal{H}_{si}$ is constructed the same way as $\mathcal{H}'_{si}$ but where

$$L^2(G^{n(F)}, Cl(TG^{n(F)}) \otimes M_n(\mathbb{C}))$$

is used instead of $L^2(G^{n(F)}, Cl(TG^{n(F)}))$. Here n is the size of the representation of G. The reason for the additional matrix factor is that we wish to represent the holonomy loops by left matrix multiplication. The decomposition analogous to (9) looks like

$$\mathcal{H}_{si} = M_n(\mathbb{C}) \oplus (\oplus_{l^1} H_2 \otimes M_n(\mathbb{C})) \oplus (\oplus_{l^2} H_2 \otimes H_2 \otimes M_n(\mathbb{C})) \oplus \ldots.$$

If we are given a simple hoop L, we construct an operator $H_L$ on $\mathcal{H}_{si}$ in the following way: For a subgroup $F \in \mathcal{F}_I$ we make use of the identification (4) of $G^{n(F)}$ with $\mathrm{Hom}(F, G)$ and hence define

$$H_L(s)(\varphi) = (id \otimes \varphi(L))(s(\varphi)),$$

where $s \in L^2(G^{n(F)}, Cl(TG^{n(F)}) \otimes M_n(\mathbb{C}))$ and where $\varphi(L) = id$ when $L \notin F$. Since $H_L$ respects the maps $P^*_{F_1,F_2}$ we get an operator $H_L$ on $\mathcal{H}_{si}$. For a general hoop L, using the unique decomposition of L into simple hoops $L_1 \circ \ldots \circ L_n$, define

$$H_L = H_{L_1} \circ \cdots \circ H_{L_n}.$$

Our algebra, which we denote A, is the $C^*$-algebra generated by the operators $H_L$, $L \in \mathcal{HG}$. It is important to realize that the algebra A is not identical to the $C^*$-algebra $C^*(\mathcal{L}_{x^o})$ introduced in Sect. 3. That is, we have not obtained a representation of the group algebra of hoops on M. To illustrate this consider the following two situations:



1. Loops with common line segment. We consider for example two loops $L_1$ and $L_2$ where

$$L_1 = P_1 \circ P_2, \qquad L_2 = P_2^* \circ P_3,$$

with $P_i \in \mathcal{P}$. Hence $L_3 \equiv L_1 \circ L_2 = P_1 \circ P_3$.

2. Intersecting loops. Consider two loops $L_4$ and $L_5$ where $L_4(t_1) = L_5(t_2) \neq x^o$.

In the first case $L_1$, $L_2$ and $L_3$ cannot belong to the same subgroup $F \in \mathcal{F}_I$ since they all have common line segments. Thus, their associated operators $H_{L_i}$ act on different parts of the hilbert space. This means that they commute,

$$H_{L_1} \cdot H_{L_2} = H_{L_2} \cdot H_{L_1}.$$

In particular, it means that

$$H_{L_1} \cdot H_{L_2} \neq H_{L_3}.$$

In the second case, the product $L_4 \circ L_5$ does not even belong to any subgroup $F \in \mathcal{F}_I$. Thus, the operator $H_{L_4 \circ L_5}$ only exists as the composition $H_{L_4} \cdot H_{L_5}$.

5.4. An extended Euler-Dirac operator. The Euler-Dirac operator defined in Eq. (10) acts, basically, on the hilbert space $\mathcal{H}'_{si}$. When acting on $\mathcal{H}_{si}$ it does not ‘see’ the matrix part of the hilbert space. This need not be so. We can for example define an extended Euler-Dirac operator by

$$D_{ext}(\xi \otimes m)(g) = D(\xi(g)) \otimes m + \xi(g) \otimes m_n(g) \cdot m, \qquad (15)$$

where $m_n(g)$ is a matrix valued function on $G^{n(F)}$ and $\xi \otimes m \in \mathcal{H}_{si}$. The form of the operator in Eq. (15) is similar to the Dirac operators of the almost commutative geometries (including the standard model). See for example [4].

6. The Space of Connections

So far, we have considered a geometrical structure over spaces related to certain loop group homomorphisms. We now want to describe in more detail the role of connections in this construction. In the above we constructed the hilbert space

$$\mathcal{H}_{si} = \varinjlim L^2(G^m, Cl(TG^m) \otimes M_n(\mathbb{C})).$$

Let us for simplicity now consider the same hilbert space but without the spin structure and the matrix factor:

$$\mathcal{H} = \varinjlim L^2(G^n).$$

Hence, $\mathcal{H}_{si}$ is $\mathcal{H}$ with coefficients in an infinite dimensional Clifford algebra tensored with n by n matrices. If we backtrack our line of reasoning we first make the identification

$$\mathcal{H} = L^2(\varprojlim \mathrm{Hom}(F, G)),$$

where $F \in \mathcal{F}_I$. Let ∇ be a fixed, smooth connection in $\mathcal{A}$. As already mentioned, for a given F, ∇ gives rise to a homomorphism into G via the holonomy loop

$$\nabla : L \to \mathrm{Hol}(L, \nabla) \in G,$$

where $L \in F$. It is easy to see that this commutes with the structure maps and hence that we get a map

$$\mathcal{A} \to \varprojlim \mathrm{Hom}(F, G).$$

Clearly, this map is injective. We therefore conclude that $\mathcal{H}$ is a hilbert space over a space which contains all smooth connections.

6.1. Distances on $\bar{\mathcal{A}}$. On a Riemannian spin-geometry the Dirac operator D contains the geometrical information of the manifold M. In particular, distances can be formulated in a purely algebraic fashion due to Connes [2]. Given two points $x, y \in M$ their distance is given by

$$d(x, y) = \sup_{f \in C^\infty(M)} \{ |f(x) - f(y)| \, : \, \|[D, f]\| \leq 1 \}. \qquad (16)$$
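As a sanity check of (16), it may help to see how the formula reproduces the ordinary distance in the simplest case. The following sketch assumes $M = S^1$ with the Dirac operator $D = i\,\partial/\partial\theta$ of (11), functions acting by multiplication, and $\ell(\theta_1, \theta_2)$ denoting the length of the shorter arc between the two points; these choices are illustrative and not taken from the text.

```latex
[D, f]\,\psi = i\,\partial_\theta (f \psi) - f\, i\,\partial_\theta \psi = i f'\,\psi
\quad\Longrightarrow\quad
\|[D, f]\| = \sup_\theta |f'(\theta)| ,

|f(\theta_1) - f(\theta_2)|
  = \Big| \int_{\theta_2}^{\theta_1} f'(\theta)\, d\theta \Big|
  \leq \ell(\theta_1, \theta_2)
  \qquad \text{whenever } \|[D, f]\| \leq 1 .
```

Since f is periodic, the integral may be taken along either arc, which gives the bound by the shorter one. The bound is approached by smooth approximations of $f(\theta) = \ell(\theta, \theta_2)$, so in this case (16) returns the arc length distance on the circle.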

On a noncommutative geometry the state space replaces the notion of points. It is possible to extend the notion of distance to the state space by generalizing (16) in an obvious manner.

For the present case, however, it is quite unclear in what sense a Dirac operator incorporates a distance. Further, the usefulness of such a notion is in the present situation not obvious. Clearly, if the Dirac operator (10) is interpreted as a metric it will give rise to distances on the space $\bar{\mathcal{A}}$. For example, it is not difficult to see that for the $G = U(1)$ case the distance between two smooth connections will be infinite. This can be seen by first noting that the distance induced by the Dirac operator on $U(1)^n$ is just the sum of distances on each copy of $U(1)$. Two distinct smooth connections will differ on infinitely many non-intersecting loops, and summing the corresponding differences in this product distance gives an infinite distance between the points. Perhaps this is not so surprising considering the fact that our geometry is infinite dimensional.

7. Diffeomorphism Invariance

Clearly, the construction considered so far is very large. In fact, the hilbert space $\mathcal{H}_{si}$ is not separable and it is unclear how to extract physical quantities in a well-defined manner. What is missing is of course the implementation of diffeomorphism invariance relative to the underlying manifold M. Invariance under arbitrary coordinate transformations is the defining symmetry of general relativity and it is therefore an essential ingredient in the formalism. It turns out that the ‘size’ of the construction can indeed be drastically reduced by taking diffeomorphism invariance into account⁷. First we write down transformation laws of hilbert states and operators. Next, we define diffeomorphism invariant states via a formal sum over states connected via diffeomorphisms. We are able to represent loop operators on such ‘smeared’ states albeit not as a representation of the operator algebra A. In a subsequent subsection we investigate an alternative approach where we introduce an equivalence of spectral triples to cut down the size of both the hilbert space and the algebra as well as the corresponding Euler-Dirac operator simultaneously. We find that the two approaches are in fact equivalent. Finally we look at the spectrum of the relevant Euler-Dirac operators and show that it is not fully a Dirac operator in the sense of Connes.

We assume that the space-time dimension of the manifold M is larger than three. Since there exists no knot theory outside 3 dimensions we hereby avoid considering different “knot states,” etc.

7.1. Transformations of states and operators. We first consider states in $L^2(G^{n(F)}, Cl(TG^{n(F)}) \otimes M_n(\mathbb{C}))$ which are polynomial in $g_1, \ldots, g_{n(F)}$ tensored with constant elements in $Cl(TG^{n(F)})$. A diffeomorphism $d \in \mathrm{Diff}(M)$ which maps

$$d : L_i \to L_i'$$

has a natural action on such polynomials

$$d : p(g_1, \ldots, g_{n(F)}) \to p(g_1', \ldots, g_{n(F)}'), \qquad (17)$$

where $g_i' \in G_i'$, and $G_i'$ is the copy of the group corresponding to the new loop $L_i'$. Because we interpret states in $\mathcal{H}_{si}$ as (polynomials in) holonomy loops we can really only state how polynomials and their closure should transform under diffeomorphisms. However, we can simply extend the transformation law (17) to all of $L^2(G^{n(F)}, Cl(TG^{n(F)}) \otimes M_n(\mathbb{C}))$,

$$d : \xi(g_1, \ldots, g_{n(F)}) \to \xi(g_1', \ldots, g_{n(F)}'),$$

and via the inductive limit to all of $\mathcal{H}_{si}$. The action of the diffeomorphism group on the algebra A is straightforward, simply taken from (17). Above and in the following we only consider diffeomorphisms in $\mathrm{Diff}(M)$ which preserve the basepoint $x^o$.

7.2. Diffeomorphism invariant states. From one point of view we need to solve the diffeomorphism constraint

$$d\xi = \xi, \qquad \forall d \in \mathrm{Diff}(M); \ \xi \in \mathcal{H}_{si}. \qquad (18)$$

Let us start by investigating this. The following is inspired by [7, 12]. Equation (18) has, of course, the formal solution,

$$\tilde{\xi} = \sum_{d \in \mathrm{Diff}(M)} d(\xi). \qquad (19)$$

This, however, makes no sense in $\mathcal{H}_{si}$. Instead we need to consider the dual of $\mathcal{H}_{si}$. So, given a vector $\eta \in \mathcal{H}_{si}$ we let the formal sum (19) act on η like

$$\tilde{\xi}(\eta) = \sum_{d \in \mathrm{Diff}(M)} \langle d(\xi) | \eta \rangle. \qquad (20)$$

⁷ The construction in this section works both for diffeomorphisms and for extended diffeomorphisms, and we will therefore notationally not distinguish between them. But only the latter case gives a separable hilbert space, and hence the verification of (or lack of) the axioms of a spectral triple only makes sense for the extended diffeomorphisms.

Strictly speaking this does not make sense either, since the sum on the right-hand side need not be convergent. If we however define the action of $\tilde{\xi}$ only on the algebraic part of $\mathcal{H}_{si}$, i.e. only finite sums of elements in the sum (9), the sum (20) becomes finite if the summation over $\mathrm{Diff}(M)$ is understood correctly. We will now describe how this works:

First we define the projection onto symmetrized states. Given a state $\xi \in H_2^{\otimes n(F)} \otimes M_n$ we denote by $\mathrm{Diff}(M|F)$ diffeomorphisms which preserve form as well as orientation of all loops in F. Consider next diffeomorphisms $F \to F$ which do not lie in $\mathrm{Diff}(M|F)$. We denote these by $\mathrm{Diff}(F \to F)$. The symmetry group of F, denoted $SG_F$, is the quotient

$$SG_F = \mathrm{Diff}(F \to F) / \mathrm{Diff}(M|F). \qquad (21)$$

They consist of certain permutations and inversions. The projection is defined by

$$P(\xi) = \frac{1}{N_F} \sum_{d \in SG_F} d(\xi), \qquad (22)$$

where $N_F$ is the number of elements in $SG_F$. Next, consider the remaining diffeomorphisms which move the loops in F outside F. We define the sum (20) by

$$\tilde{\xi}(\eta) = \sum_{d \in \mathrm{Diff}(M)/\mathrm{Diff}(F \to F)} \langle d(P\xi) | \eta \rangle, \qquad (23)$$

where the sum is interpreted as an effective sum, i.e. if $d_1(F) = d_2(F)$ we identify $d_1$ and $d_2$. If $\eta \in H_2^{\otimes n(F)} \otimes M_n$ we find $N_F$ contributions on the rhs of (23). Otherwise it is zero. The vector space of linear combinations of sums (23) is given the inner product

$$\langle \tilde{\xi}_1 | \tilde{\xi}_2 \rangle = \tilde{\xi}_1(\xi_2). \qquad (24)$$

The crucial point is that this sum has finitely many non-vanishing terms (see above). The completion of this vector space in the norm (24) is a diffeomorphism invariant hilbert space which we denote by $\mathcal{H}_{diff}$.

The problem with this construction is that it is somewhat unclear how the algebra of hoops should be represented on $\mathcal{H}_{diff}$. Since our goal is to find a spectral triple involving not only a separable hilbert space but also a (separable) algebra and a well defined Dirac operator, this is clearly a crucial point. The difficulty stems from the fact that the algebra is not diffeomorphism invariant but rather covariant. The Dirac operator, on the other hand, is diffeomorphism invariant and therefore causes no problems. Essentially, we need to make sense of a ‘smearing’ of algebra elements according to

$$\tilde{H}_L = \sum_{d \in \mathrm{Diff}(M)} H_{d(L)}, \qquad (25)$$

similar to Eq. (19). As it stands, Eq. (25) is meaningless. Instead we do the following: Given a hoop operator $H_L \in A$ define the symmetrized operator by

$$P_F(H_L) = \frac{1}{N_F} \sum_{d \in SG_F} H_{d(L)}, \qquad (26)$$

where $SG_F$ is the symmetry group of a subgroup F including L. $N_F$ is again the total number of elements in $SG_F$. For example, if L is simple and F is the subgroup generated by L, we have

$$P_F(H_L) = \frac{1}{2} \left( H_L + H_{L^{-1}} \right).$$

For a ‘smeared’ state $\tilde{\xi} \in \mathcal{H}_{diff}$ we define the action of $H_L$ on $\tilde{\xi}$ by

$$H_L(\tilde{\xi}) = \sum_{d \in \mathrm{Diff}(M)/\mathrm{Diff}(F \to F)} d(P_F(H_L) \cdot P(\xi)),$$

where we choose the representative ξ so that L and ξ have coinciding domains and where $P_F$ is taken with respect to the subgroup F defined by the domain of ξ and L. Note that we no longer deal with a representation of loops. For example, given a simple loop L acting on a state ξ with domain on a single copy of G we find that (using a somewhat sloppy notation)

$$H_L \cdot H_L = \frac{1}{4} \left( H_L^2 + H_L^{-2} + 2 \right).$$

This relation, however, changes according to what states in $\mathcal{H}_{diff}$ the operator $H_L$ acts on.

7.3. Diffeomorphism invariance via equivalent triples. In the previous subsection we implemented diffeomorphism invariance by constructing diffeomorphism invariant states and defining an action of loop operators hereon. In fact, there is another option which, however, only works for extended diffeomorphisms. As explained above, the diffeomorphism group acts not only on the hilbert space but also on the algebra. We can therefore define an equivalence on the level of sub-triples: algebra, hilbert space and Euler-Dirac operator. This identification happens at the level of subgroups $F \subset \mathcal{HG}$. If we consider a single, simple loop L, the spectral triple associated to this is just

$$(\langle L \rangle, L^2(G, Cl(TG) \otimes M_n), D), \qquad (27)$$

where $\langle a, b, \ldots \rangle$ is the $C^*$-algebra generated by $\{a, b, \ldots\}$. Since all single, simple loops are diffeomorphic, at this level we just get expression (27) when we identify spectral sub-triples which are diffeomorphic. At the level of two nonintersecting simple loops $L_1$ and $L_2$ the spectral triple associated to this is

$$(\langle L_1, L_2 \rangle, L^2(G^2, Cl(TG^2) \otimes M_n), D). \qquad (28)$$

Again, by identifying spectral triples of diffeomorphic loops we get at this level just expression (28). This picture simply continues for all finitely generated subgroups and taking the limit hereof gives us an equivalence class of spectral triples represented by the infinite dimensional triple,

$$(\langle L_1, L_2, \ldots \rangle, L^2(G^\infty, Cl(TG^\infty) \otimes M_n), D). \qquad (29)$$



Further, not only are all subgroups of nonintersecting loops with n generators diffeomorphic, there are also internal diffeomorphisms which shuffle the generators. One can factor out this symmetry by symmetrizing operators and states, just as we did in the previous subsection. Therefore, the result is, in fact, identical to the result of the previous subsection.

Instead of symmetrizing one could also make the noncommutative quotient of the action of the internal diffeomorphism group $SG_F$ (and the limit), i.e. consider the crossed product

$$A_F \times SG_F,$$

where $A_F$ is the part of our algebra acting on the F part. This would be more in the spirit of noncommutative geometry and Connes. We will investigate this alternative elsewhere.

7.4. Spectral in the sense of Connes? It remains to clarify whether the spectral triple (29) satisfies the conditions put forward by Connes [2], see also [11]. An affirmative answer would give us access to the full power of noncommutative geometry. Clearly, on each level in the projective/inductive limit, the relevant Dirac (Euler-Dirac) operator satisfies the conditions for a spectral triple, simply by construction. The question remains whether it also holds in the limit. There are three conditions. First, the operator

$$[D, a], \qquad (30)$$

where a belongs to the subspace of A of finite linear combinations of loop operators, has to be bounded. A simple loop operator $a = H_L$ acts, according to (26), on a state via

$$\frac{1}{N_F} \left( H_{L_1} + H_{L_1^{-1}} + \cdots + H_{L_{n(F)}} + H_{L_{n(F)}^{-1}} \right), \qquad (31)$$

where the number $n(F)$ refers to the domain of L and the state on which it acts (see Sect. 7.2). We can estimate the commutator of (31) with the Dirac operator by

$$\begin{aligned}
\left\| \left[ D, \frac{1}{N_F} \left( H_{L_1} + H_{L_1^{-1}} + \cdots + H_{L_{n(F)}} + H_{L_{n(F)}^{-1}} \right) \right] \right\|
&\leq \frac{1}{N_F} \left( \|[D, H_{L_1}]\| + \|[D, H_{L_1^{-1}}]\| + \cdots + \|[D, H_{L_{n(F)}}]\| + \|[D, H_{L_{n(F)}^{-1}}]\| \right) \\
&= \|[D, H_L]\|. \qquad (32)
\end{aligned}$$

Because the operator

$$H_L : G \to G; \qquad g \to g$$

is bounded we conclude that the operator (30) is bounded for a simple loop operator. For compositions of simple loop operators the argument is repeated and therefore we conclude that the first condition is satisfied.

Second, we need to investigate whether the operator

$$\frac{1}{D - \lambda}, \qquad \lambda \in \mathbb{C} \setminus \mathbb{R}$$

is compact. In fact, this turns out not to be the case. Let us explain. For simplicity we leave out the matrix part of the hilbert space and simply consider the space

$$L^2(G^n, Cl(TG^n)) = L^2(G, Cl(TG))^{\otimes n},$$

where we only consider symmetrized (un-ordered) elements according to (22). Given a set of eigenfunctions $\{\xi_1, \ldots, \xi_m\}$ in $L^2(G, Cl(TG))$ of the Dirac operator, the product

$$\xi_{i_1} \otimes \cdots \otimes \xi_{i_n} \qquad (33)$$

is an eigenfunction of the Dirac operator in $L^2(G^n, Cl(TG^n))$. The problem is that if we find a function $\xi_0$ in $L^2(G, Cl(TG))$ with eigenvalue zero and which differs from the function 1, then we will automatically have an infinite dimensional eigenspace associated to any eigenvalue. To see this simply consider the function (remember that we consider only symmetrized products)

$$\xi_0 \otimes \xi_{i_1} \otimes \cdots \otimes \xi_{i_n}$$

in $L^2(G^{n+1}, Cl(TG^{n+1}))$. This is again an eigenfunction with the same eigenvalue as (33). According to Hodge theory (see Theorem II.5.15 in [14]) the kernel of an Euler-Dirac operator on a compact manifold M is related to the cohomology group:

$$\ker(D) = \oplus_p \mathbf{H}^p, \qquad \mathbf{H}^p = H^p(M; \mathbb{C}),$$

and the cohomology group is, at least on an orientable manifold such as the Lie group G, not trivial (the volume form is an example). Therefore we conclude that the Euler-Dirac operator in (29) does not satisfy Connes' second condition.

In principle, it is possible to correct this “flaw” in the construction of the Dirac operator in (29) by adding a bounded perturbation to D on each level in the projective/inductive limit. Such a perturbation will, in general, not be bounded in the limit itself. Indeed, if the perturbation is constructed in a way so that the perturbed Dirac operator satisfies condition two, then the full perturbation will be unbounded. Changing the operator on each level of the projective/inductive limit does not change the K-homology class at each level. In the limit, however, the operator will be changed (the original Euler-Dirac operator in (29) does not have a K-homology class).

The third condition is self-adjointness. That D is self-adjoint is secured by construction.

Let us end this subsection by noting that the fact that the Dirac operator in the triple (29) does not satisfy Connes' second condition may be interpreted as a hint that there exist some extra symmetries that have not been (and should be) factored out.

8. Discussion & Outlook

In the present paper we presented new ideas on the unification of noncommutative geometry, in particular Connes' formulation of the standard model, and the principles of quantum field theory. We apply the machinery of noncommutative geometry to a general function space of connections related to gravity. A noncommutative algebra of holonomy loops is represented on a separable, diffeomorphism invariant hilbert space. An Euler-Dirac operator is constructed. The whole setup relies on techniques of projective and inductive limits of algebras, hilbert spaces and operators. What comes out is a geometrical structure, including integration theory, on a space of field configurations modulo diffeomorphism invariance. A global notion of differentiation (the Dirac operator) is obtained. We find it remarkable that the whole construction boils down to the study of Dirac operators on various copies of some Lie group.



Whereas the noncommutativity of the algebra is intrinsically classical, we interpret the Dirac operator, which resembles a functional derivative, as ‘quantum’. Certain problems arose during the analysis. First, we were unable to represent the full hoop group in a manner compatible with the Euler-Dirac operator. The solution proposed and analyzed is to consider only finite subgroups of non-intersecting loops (in the projective system). This modification has important consequences; instead of graphs (spin-networks) we deal with polynomials on various copies of the group. It is, however, not clear to us whether this is an important point. More seriously, the final Dirac operator does not fulfill the conditions formulated by Connes. In particular, it has infinite-dimensional eigenspaces. Thus, we did not succeed in constructing a spectral triple which satisfies the conditions put forward by Connes. A prime concern for further development is to understand why our constructed Dirac operator is not spectral in the sense of Connes. We suspect that what is missing is a symmetry related to the infinite-dimensional Clifford algebra, i.e. $Cl(T_{\mathrm{id}}(G^\infty))$. Another possible solution is to use the ordinary Dirac operator instead of the Euler-Dirac operator. To account for the lack of invariance under inversions of loops one can double the Hilbert space: instead of assigning to each simple loop the Hilbert space of square integrable functions over G, we can assign two copies of this Hilbert space, one for each orientation of the loop. The diffeomorphism associated with inversion of the loop will then act by interchanging the two Hilbert spaces. There will, however, still be some problems, for example embedding properties when we increase the number of copies of G. Another concern is to extend the present construction to work for non-compact groups, since gravity involves SO(3, 1). The main problem will be the embeddings in the projective limit.
For example, $L^2(G)$ is not naturally embedded in $L^2(G^2)$ whenever G is non-compact. We believe, however, that this is a technical and solvable problem. Also, loops will no longer occur as states in the Hilbert space; a priori this is not necessarily a problem. Also, we would like to understand in what sense the noncommutativity of the holonomy algebra generates a bosonic sector and, if so, what it is. Clearly, noncommutativity permits inner automorphisms and nontrivial fluctuations of the Dirac operator. If we assume that we succeed in constructing a Dirac operator D satisfying all of Connes' conditions, and if we consider fluctuations around D of the form
\[ D \to \tilde{D} = D + A + J A J^\dagger, \]
where J is Tomita's anti-linear isometry [13] and A is a noncommutative one-form (see footnote 8), $A \in \Omega^1_D$, then we can apply Chamseddine and Connes' spectral action principle [5, 6]. Thus, we can write down automorphism invariant quantities like
\[ \langle \tilde{\xi} | \tilde{D} | \tilde{\xi} \rangle, \qquad \mathrm{Tr}\, \varphi(\tilde{D}), \qquad \ldots. \]
Such terms can be interpreted as integrated quantities, schematically, of the form
\[ \int_{\bar{\mathcal{A}}/\mathrm{Diff}} d\nabla \ldots \]

which resembles a Feynman path integral and contains both fermionic and bosonic degrees of freedom. Here the integration is defined, modulo diffeomorphisms, on a space of connections.

8 Elements of $\Omega^n_D$ are of the form $a_0 [D, a_1] \cdots [D, a_n]$, where the $a_i$ are elements of the algebra [4].



In the introduction we motivated our analysis by stating that Connes' formulation of the standard model coupled to gravity is intrinsically classical. With the aim of combining noncommutative geometry and the principles of quantum field theory, we have found a spectral triple which a priori appears to be quite far from field theory. It is clearly of prime concern to investigate whether the construction does contain a field theory limit and, if so, what it is.

Acknowledgement. It is a pleasure to thank Raimar Wulkenhaar for comments and for carefully reading the manuscript.

A. Clifford Algebras and Dirac Operators

Here we give a brief review of Clifford algebras and the Euler-Dirac operator. For a detailed account see for example [14]. Since we are interested in Clifford algebras over Lie groups we only treat the Euclidean case. Let V be a real vector space. We define the tensor algebra T(V) as
\[ T(V) = \bigoplus_{i \ge 0} V^{\otimes i} \]
with multiplication
\[ (v_1 \otimes \cdots \otimes v_n) \cdot (u_1 \otimes \cdots \otimes u_m) = v_1 \otimes \cdots \otimes v_n \otimes u_1 \otimes \cdots \otimes u_m. \]
Given a metric $\langle \cdot, \cdot \rangle$ on V one defines the Clifford algebra as
\[ Cl(V) = T(V) / (v \otimes u + u \otimes v = -2 \langle v, u \rangle). \]
If $e_1, \ldots, e_n$ is an orthonormal basis of V, the Clifford algebra Cl(V) is spanned by elements of the form $e_{i_1} \cdots e_{i_k}$, where $i_1 < \cdots < i_k$, with the product rules
\[ e_i e_j = -e_j e_i, \quad i \neq j, \qquad e_i^2 = -1. \]
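The product rules above can be checked in a concrete matrix model. The following is a minimal sketch for V = R^2 only, assuming the model e_k = i·σ_k with σ_k the Pauli matrices; this choice of model and all function names are ours for illustration and do not come from the paper.

```python
# Pure-Python check of the Clifford relations e_i e_j + e_j e_i = -2 delta_ij
# for V = R^2, in the assumed matrix model e_k = i * sigma_k (illustrative only).

def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[r][c] + b[r][c] for c in range(2)] for r in range(2)]

def scale(s, a):
    """Multiply every entry of a 2x2 matrix by the scalar s."""
    return [[s * a[r][c] for c in range(2)] for r in range(2)]

IDENTITY = [[1, 0], [0, 1]]
SIGMA_1 = [[0, 1], [1, 0]]
SIGMA_2 = [[0, -1j], [1j, 0]]

# Generators e_1, e_2 in this model.
E = [scale(1j, SIGMA_1), scale(1j, SIGMA_2)]

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

def clifford_relations_hold(gens):
    """True when e_i e_j + e_j e_i equals -2 delta_ij times the identity."""
    for i, ei in enumerate(gens):
        for j, ej in enumerate(gens):
            expected = scale(-2 if i == j else 0, IDENTITY)
            if anticommutator(ei, ej) != expected:
                return False
    return True
```

Calling `clifford_relations_hold(E)` checks both rules at once. The Pauli matrices themselves square to +1, so they fail the check until multiplied by i.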

There is an inner product on Cl(V) given by $\langle e_{i_1} \cdots e_{i_k}, e_{j_1} \cdots e_{j_l} \rangle = 1$ if $k = l$ and $i_1 = j_1, \ldots, i_k = j_l$, and zero otherwise. The group O(n) acts on Cl(V) by
\[ o(e_{i_1} \cdots e_{i_k}) = o(e_{i_1}) \cdots o(e_{i_k}), \qquad o \in O(n), \]
as automorphisms preserving the inner product. In particular one also gets an action of so(n) on Cl(V). For a manifold M with a metric, one defines the Clifford bundle Cl(TM) as the bundle $M \ni m \to Cl(T_m M)$, where the inner product on $T_m M$ is the one given by the metric.



Let ∇ denote the Levi-Civita connection associated to the metric. Via the extension of the action of O(n) from V to Cl(V), the Levi-Civita connection extends to a connection in Cl(TM) via the formula
\[ \nabla(e_{i_1} \cdots e_{i_k}) = \sum_l e_{i_1} \cdots \nabla(e_{i_l}) \cdots e_{i_k}, \]
where $e_{i_l}$ are local orthonormal sections in TM. One defines the Euler-Dirac operator D by
\[ L^2(M, Cl(TM)) \ni s \to D(s) = \sum_i e_i \cdot \nabla_{e_i} s, \]

where $\{e_i\}$ are local orthonormal sections in TM. Since one wants to work with Hilbert spaces one complexifies the space $L^2(M, Cl(TM))$, leaving the notation unchanged.

B. Projective and Inductive Limits

Here we review the concepts of projective and inductive limits. For a different treatment we refer to [9].

B.1. Projective limits. To illustrate the concept of a projective limit we will consider the index set $\mathbb{N}$ and for each $n \in \mathbb{N}$ the space $\mathbb{R}^n$. If $n_1 \le n_2$ there are projections $P_{n_2,n_1} : \mathbb{R}^{n_2} \to \mathbb{R}^{n_1}$ given by
\[ P_{n_2,n_1}(x_1, \ldots, x_{n_2}) = (x_1, \ldots, x_{n_1}). \]
We define the product
\[ \prod_{n \in \mathbb{N}} \mathbb{R}^n = \{ (X_n)_{n \in \mathbb{N}} \mid X_n \in \mathbb{R}^n \}, \]
i.e. an element in $\prod_{n \in \mathbb{N}} \mathbb{R}^n$ is just a choice of an element in each $\mathbb{R}^n$ for all n. An element can thus be written as
\[ (x^1_1, (x^2_1, x^2_2), (x^3_1, x^3_2, x^3_3), \ldots). \]
The projective limit is defined as those elements in $\prod_{n \in \mathbb{N}} \mathbb{R}^n$ where
\[ (x^{n_1}_1, \ldots, x^{n_1}_{n_1}) = P_{n_2,n_1}(x^{n_2}_1, \ldots, x^{n_2}_{n_2}), \]
or written out
\[ x^1_1 = x^2_1, \qquad (x^2_1, x^2_2) = (x^3_1, x^3_2), \qquad (x^3_1, x^3_2, x^3_3) = (x^4_1, x^4_2, x^4_3), \quad \ldots. \]
In other words, the projective limit, also written
\[ \varprojlim (\mathbb{R}^n, P_{n_2,n_1}), \]
is just $\mathbb{R}^\infty$, the set of all sequences in $\mathbb{R}$.
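The consistency condition defining this projective limit can be made concrete on finitely many levels. The sketch below is an illustration under our own conventions: points of R^n are Python tuples, and the function names are hypothetical.

```python
# Sketch of the projective system (R^n, P_{n2,n1}) with truncation maps,
# restricted to finitely many levels n = 1..N. Names are illustrative only.

def project(n1, point):
    """P_{n2,n1}: keep the first n1 coordinates of a point of R^{n2}."""
    if n1 > len(point):
        raise ValueError("can only project to a lower or equal dimension")
    return tuple(point[:n1])

def is_consistent(family):
    """family[n-1] is the chosen point X_n of R^n, for n = 1..N.

    Returns True when X_{n1} = P_{n2,n1}(X_{n2}) for all n1 <= n2 <= N.
    """
    for n2, x_n2 in enumerate(family, start=1):
        if len(x_n2) != n2:
            return False
        for n1 in range(1, n2 + 1):
            if project(n1, x_n2) != tuple(family[n1 - 1]):
                return False
    return True

def family_from_sequence(sequence, levels):
    """The truncations of one sequence always form a consistent family."""
    return [tuple(sequence[:n]) for n in range(1, levels + 1)]
```

A consistent family is determined by its highest level, which is the finite shadow of the statement that the projective limit is the set of all sequences.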



Another example, which is more relevant to our case, comes from group theory. Let G be a group. We let $\mathcal{F}$ be the set of finitely generated subgroups of G. If $F_1, F_2 \in \mathcal{F}$ and $F_1 \subset F_2$ we have the inclusion map $\iota_{F_1,F_2} : F_1 \to F_2$. If we therefore consider group homomorphisms from each of these finitely generated subgroups to a fixed group $G_1$ we get, by dualizing, restriction maps
\[ \iota^*_{F_1,F_2} : \mathrm{Hom}(F_2, G_1) \to \mathrm{Hom}(F_1, G_1). \]
As in the case of $\mathbb{R}^n$ we can consider the product
\[ \prod_{F \in \mathcal{F}} \mathrm{Hom}(F, G_1) = \{ (\varphi_F)_{F \in \mathcal{F}} \mid \varphi_F \in \mathrm{Hom}(F, G_1) \}, \]
and the projective limit is defined as the subset of the product consisting of families that are consistent with the restriction maps, i.e. a family $(\varphi_F)_{F \in \mathcal{F}}$ is in the projective limit if $\iota^*_{F_1,F_2}(\varphi_{F_2}) = \varphi_{F_1}$ for all $F_1, F_2 \in \mathcal{F}$ with $F_1 \subset F_2$. We note that we have a map
\[ \mathrm{Hom}(G, G_1) \to \varprojlim (\mathrm{Hom}(F, G_1), \iota^*_{F_1,F_2}) \]
just by restricting a homomorphism from G to $G_1$ to the finitely generated subgroups of G. It is easy to see that this map is a bijection, and we can hence identify $\mathrm{Hom}(G, G_1)$ with the projective limit. It might seem that we have just expressed something easy, namely $\mathrm{Hom}(G, G_1)$, in terms of something complicated, namely the projective limit. However, the description as a projective limit turns out to be very useful.

B.2. Inductive limits. The inductive limit is the dual concept of the projective limit. For simplicity we take $T^\infty$, the infinite torus (easier than $\mathbb{R}^\infty$ since $T^n$ is compact). This means that we have a projective system
\[ P_{n_2,n_1} : T^{n_2} \to T^{n_1}, \qquad n_1, n_2 \in \mathbb{N}, \quad n_1 \le n_2, \]
where $T^n$ is the n-torus and $P_{n_2,n_1}$ are the natural projections. The dual of a space is the space of functions on it. There are of course several candidates for functions. In this example we will take the space of square integrable functions on $T^n$ with respect to the Haar measure, i.e. $L^2(T^n, d\mu_H)$. The dual map of $P_{n_2,n_1}$ gives a map $P^*_{n_2,n_1} : L^2(T^{n_1}) \to L^2(T^{n_2})$ defined by
\[ P^*_{n_2,n_1}(\xi)(x) = \xi(P_{n_2,n_1}(x)), \qquad x \in T^{n_2}. \]
These maps are embeddings and are maps of Hilbert spaces since $\int 1 \, d\mu_H = 1$.



The inductive limit of these Hilbert spaces is constructed in the following way: We take the direct sum $\oplus_n L^2(T^n)$, i.e. sequences $\{\xi_n\}_{n \in \mathbb{N}}$ with $\xi_n \in L^2(T^n)$ such that $\{\xi_n\}$ is zero from a certain step on. In this space we consider the subspace N generated by elements of the form
\[ (0, \ldots, 0, \xi_{n_1}, 0, \ldots, 0, -P^*_{n_2,n_1}(\xi_{n_1}), 0, \ldots), \]
and form the quotient space $\oplus_n L^2(T^n)/N$. This quotient just means that we consider all vectors lying in some $L^2(T^n)$, and identify two vectors $\xi_{n_1}, \xi_{n_2}$ if $P^*_{n_2,n_1}(\xi_{n_1}) = \xi_{n_2}$. The space $\oplus_n L^2(T^n)/N$ is the algebraic inductive limit $\varinjlim L^2(T^n)$.

Naively we are considering $L^2(T^1)$ as a subspace of $L^2(T^2)$, $L^2(T^2)$ as a subspace of $L^2(T^3)$, $L^2(T^3)$ as a subspace of $L^2(T^4)$ and so on, and the limit space as n tends to infinity is the direct limit. Or in a picture,
\[ L^2(T^1) \subset L^2(T^2) \subset L^2(T^3) \subset \cdots \subset \varinjlim L^2(T^n). \]

We have used the words algebraic inductive limit, since we want to put some Hilbert space structure on the inductive limit. If we have two vectors in the inductive limit, let us say $\xi_{n_1} \in L^2(T^{n_1})$ and $\xi_{n_2} \in L^2(T^{n_2})$ with $n_1 \le n_2$, we define the inner product by
\[ \langle \xi_{n_1}, \xi_{n_2} \rangle = \langle P^*_{n_2,n_1}(\xi_{n_1}), \xi_{n_2} \rangle_{L^2(T^{n_2})}. \]
Since the embeddings $P^*_{n_2,n_1}$ are Hilbert space maps, this inner product is well defined. The Hilbert space inductive limit of $\{L^2(T^n), P^*_{n_2,n_1}\}$ is therefore defined as the completion of $\oplus_n L^2(T^n)/N$ in the inner product $\langle \cdot, \cdot \rangle$. We will also denote this limit by $\varinjlim L^2(T^n)$.
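The inner product on the inductive limit can be illustrated on trigonometric polynomials. The sketch below assumes a Fourier-coefficient model of our own choosing: a function on T^n is stored as a dictionary from multi-indices k in Z^n to coefficients, and orthonormality of the characters with respect to the normalised Haar measure reduces inner products to finite sums. The function names are hypothetical.

```python
# Assumed model (not from the paper): a trigonometric polynomial on T^n is
# {multi-index k in Z^n: coefficient c_k}, meaning sum_k c_k * exp(i k.theta).
# With the normalised Haar measure the characters exp(i k.theta) are
# orthonormal, so inner products reduce to sums over coefficients.

def level(xi):
    """Number of torus factors n that xi lives on (all keys share a length)."""
    lengths = {len(k) for k in xi}
    if len(lengths) != 1:
        raise ValueError("all multi-indices must have the same length")
    return lengths.pop()

def pullback(n2, xi):
    """P*_{n2,n1}: regard xi on T^{n1} as a function on T^{n2}, n1 <= n2."""
    n1 = level(xi)
    if n2 < n1:
        raise ValueError("pullback only goes to a higher or equal level")
    return {k + (0,) * (n2 - n1): c for k, c in xi.items()}

def inner_same_level(xi, eta):
    """<xi, eta> in L^2(T^n), antilinear in the first slot."""
    return sum(c.conjugate() * eta[k] for k, c in xi.items() if k in eta)

def inner(xi, eta):
    """Inner product in the inductive limit: compare at the higher level."""
    n = max(level(xi), level(eta))
    return inner_same_level(pullback(n, xi), pullback(n, eta))
```

In this model the pullback only pads multi-indices with zeros, so it preserves norms. That is the coefficient-level form of the statement that the embeddings are Hilbert space maps.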

B.3. Constructing operators on inductive limits of Hilbert spaces. The main advantage of giving a description of spaces as projective or inductive limits is that one can work on each copy, and then extend to the whole space if the construction is compatible with the structure maps, i.e. $P^*_{n_2,n_1}$ for example. As an example of this, let us take $\varinjlim L^2(T^n)$. On $L^2(T^n)$ we have the Laplacian $\Delta_n : L^2(T^n) \to L^2(T^n)$ defined by
\[ \Delta_n = -(\partial^2_{\theta_1} + \partial^2_{\theta_2} + \cdots + \partial^2_{\theta_n}). \]
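Because the Laplacian just defined is diagonal on Fourier modes, its compatibility with the structure maps can be checked mode by mode. The sketch below again assumes a Fourier-coefficient dictionary model of our own (multi-index to coefficient); the function names are hypothetical and the check covers trigonometric polynomials only.

```python
# Assumed model (not from the paper): {multi-index k in Z^n: coefficient c_k}
# stands for the trigonometric polynomial sum_k c_k * exp(i k.theta) on T^n.

def pullback(n2, xi):
    """P*_{n2,n1}: pad every multi-index with zeros up to length n2."""
    padded = {}
    for k, c in xi.items():
        if len(k) > n2:
            raise ValueError("pullback only goes to a higher or equal level")
        padded[k + (0,) * (n2 - len(k))] = c
    return padded

def laplacian(xi):
    """Delta_n = -(d^2/dtheta_1^2 + ... + d^2/dtheta_n^2) on Fourier modes.

    exp(i k.theta) is an eigenfunction with eigenvalue k_1^2 + ... + k_n^2.
    """
    return {k: sum(kj * kj for kj in k) * c for k, c in xi.items()}

def commutes_with_pullback(n2, xi):
    """Check P*(Delta_{n1} xi) == Delta_{n2}(P* xi) for one vector xi."""
    return pullback(n2, laplacian(xi)) == laplacian(pullback(n2, xi))
```

Padding a multi-index with zeros does not change the sum of squares of its entries, which is why the two sides agree for every mode.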



Note that
\[ P^*_{n_2,n_1}(\Delta_{n_1}(\xi_{n_1})) = \Delta_{n_2}(P^*_{n_2,n_1}(\xi_{n_1})), \]
and therefore
\[ \Delta = \oplus_n \Delta_n \]
on $\oplus L^2(T^n)$ has the property
\[ \Delta(N) \subset N, \]
i.e. $\Delta$ descends to a densely defined operator on the quotient space, i.e. the inductive limit $\varinjlim L^2(T^n)$.

References

1. Gel’fand, I.M., Naimark, M.A.: On the imbedding of normed rings into the ring of operators in Hilbert space. Mat. Sb. 12, 197–213 (1943)
2. Connes, A.: Noncommutative Geometry. London-New York: Academic Press, 1994
3. Connes, A., Lott, J.: Particle Models And Noncommutative Geometry (Expanded Version). Nucl. Phys. Proc. Suppl. 18B, 29 (1991)
4. Connes, A.: Gravity coupled with matter and the foundation of non-commutative geometry. Commun. Math. Phys. 182, 155 (1996)
5. Chamseddine, A.H., Connes, A.: Universal formula for noncommutative geometry actions: Unification of gravity and the standard model. Phys. Rev. Lett. 77, 4868 (1996)
6. Chamseddine, A.H., Connes, A.: A universal action formula. Phys. Rev. Lett. 77, 4868 (1996)
7. Ashtekar, A., Lewandowski, J.: Background independent quantum gravity: A status report. Class. Quant. Grav. 21, R53 (2004)
8. Fairbairn, W., Rovelli, C.: Separable Hilbert space in loop quantum gravity. J. Math. Phys. 45, 2802 (2004)
9. Marolf, D., Mourao, J.M.: On the support of the Ashtekar-Lewandowski measure. Commun. Math. Phys. 170, 583 (1995)
10. Ashtekar, A., Lewandowski, J.: Representation theory of analytic holonomy C* algebras. In: Baez, J. (ed.), Knots and Quantum Gravity, Oxford: Oxford Univ. Press, 1994
11. Connes, A., Moscovici, H.: The local index formula in noncommutative geometry. Geom. Funct. Anal. 5(2), 174–243 (1995)
12. Ashtekar, A., Lewandowski, J., Marolf, D., Mourao, J., Thiemann, T.: Quantization of diffeomorphism invariant theories of connections with local degrees of freedom. J. Math. Phys. 36, 6456 (1995)
13. Takesaki, M.: Tomita’s theory on modular Hilbert algebras and its applications. Lecture Notes in Math., Berlin-Heidelberg-New York: Springer, 1970
14. Lawson, H., Michelsohn, M.: Spin Geometry. Princeton, NJ: Princeton University Press, 1989

Communicated by A. Connes




Notes on Fast Moving Strings

Andrei Mikhailov1,2

1 California Institute of Technology 452-48, Pasadena, CA 91125, USA. E-mail: [email protected]
2 Institute for Theoretical and Experimental Physics, Bol. Cheremushkinskaya, 25, 117259 Moscow, Russia

Received: 13 May 2005 / Accepted: 13 July 2005 Published online: 24 January 2006 – © Springer-Verlag 2006

Abstract: We review the recent work on the mechanics of fast moving strings in anti-de Sitter space times a sphere and discuss the role of conserved charges. An interesting relation between the local conserved charges of rigid solutions was found in the earlier work. We propose a generalization of this relation for arbitrary solutions, not necessarily rigid. We conjecture that an infinite combination of local conserved charges is an action variable generating periodic trajectories in the classical string phase space. It corresponds to the length of the operator on the field theory side.

1. Introduction

The AdS/CFT correspondence is a strong-weak coupling duality. Weakly coupled Yang-Mills is mapped to the string theory on the highly curved AdS space. When AdS space is highly curved, the string worldsheet theory becomes strongly coupled. Therefore, the weakly coupled Yang-Mills maps to the strongly coupled string worldsheet theory. Nevertheless, in some situations elements of the YM perturbation theory can be reproduced from the string theory side. One example is the “spinning strings”. Spinning strings are a class of solutions of the classical string worldsheet theory. They were first considered in the context of the AdS/CFT correspondence in [1–3]. These are strings rotating in $S^5$ with a large angular momentum. It was noticed in [1] that the energy of these solutions has an expansion in some small parameter which is similar in form to the perturbative expansion in the field theory on the boundary. Then [4] computed the anomalous dimensions of single trace operators with the generic large R-charge, making the actual comparison possible. In [5] more general solutions were considered, having large compact charges both in $S^5$ and in $AdS_5$. For all these solutions, computations in the classical worldsheet theory lead to a series in the small parameter which on the field theory side is identified with $\lambda/J^2$, where $\lambda$ is the ’t Hooft coupling constant and J a large conserved charge. Moreover it was shown in [6] that the quantum corrections to the classical worldsheet theory are suppressed for the solutions with the large conserved


charge (see also the recent discussion in [7]). This opened the possibility that the results of the calculations in the classical mechanics of spinning strings, which are valid a priori only in the large $\lambda$ limit, can in fact be extended to weak coupling and therefore compared to the Yang-Mills perturbation theory. It was conjectured that the Yang-Mills perturbation theory in the corresponding sector is reproduced by the classical dynamics of the spinning strings. The following picture is emerging. Single string states in $AdS_5 \times S^5$ correspond to single-trace operators in the $\mathcal{N} = 4$ supersymmetric Yang-Mills theory. (We consider the large N limit.) The dynamics of the single-trace operators is described in the perturbation theory by an integrable spin chain. This spin chain has a classical continuous limit [8] which describes a class of operators with the large R-charges. In this limit the spin chain becomes a classical continuous system. We have conjectured in [9] that this classical system is equivalent to the worldsheet theory of the classical string in $AdS_5 \times S^5$. The Yang-Mills perturbative expansion corresponds to considering the worldsheet of the fast moving string as a perturbation of the null-surface [8–12]. The null-surface perturbation theory was previously considered in a closely related context in [13]. In this paper we will try to make the statement of equivalence more precise. We will argue that the string worldsheet theory has a “hidden” U(1) symmetry which is defined unambiguously by its characteristic properties which we describe. This U(1) commutes with the group of geometrical symmetries of the target space. It corresponds to the length of the spin chain on the field theory side. We conjecture that the phase space of the classical continuous spin chain is equivalent to the Hamiltonian reduction of the phase space of the classical string by the action of this U(1). The equivalence commutes with the action of geometrical symmetries.
We should stress that the hidden U (1) symmetry which we discuss in this paper was constructed already in [12], but the explicit calculation was carried out only at the first nontrivial order of the null-surface perturbation theory. The main new result of our paper is that we discuss this hidden symmetry from the point of view of the integrability. We conjecture the relation between the U (1) symmetry and the local conserved charges which if true gives a uniform description of this symmetry at all orders of the perturbation theory. The classical string on AdS5 ×S 5 is an integrable system (see [14–18] and references there), and our U (1) corresponds to an action variable. The existence of the action variables for integrable systems with a finite-dimensional phase space is a consequence of the Liouville theorem [19]. The classical string has an infinite-dimensional phase space. We are not aware of the existence of a general theorem which would guarantee that the action variables can be constructed in the infinite-dimensional case. But we will give two arguments for the existence of one action variable for the string in AdS5 × S 5 , at least in the perturbation theory around the null-surfaces. The first argument gives an explicit procedure to construct the action variable order by order in the perturbation theory (Sects. 3, 4.4 and 4.6). The second argument uses the existence of the local conserved charges [20] (known as higher Pohlmeyer charges) and the results of the evaluation of these charges on the so-called “rigid solutions” performed in [21, 22]. The arguments in Sect. 4 of our paper together with the results of [21, 22] suggest that the action variable is an infinite linear combination of the Pohlmeyer charges and allow in principle to find the coefficients of this linear combination. The plan of the paper. In Sect. 
2 we will review the classification of the null-surfaces following mostly [11, 9] and stress that the moduli space of the null-surfaces is a U (1)bundle over a loop space. Therefore it has a canonically defined action of U (1). In Sect. 3


we will explain how to extend the action of U(1) from the null-surfaces to the nearly-degenerate extremal surfaces using the perturbation theory. A large part of Sect. 3 is a review of [12]. In Sect. 4 we discuss the geometrical meaning of this U(1) as an action variable and argue that it is an infinite sum of the local conserved charges.

Note added in the revised version. The coefficients of the expansion of the action variable in the local conserved charges were fixed to all orders in the first paper of [28]. Here we consider only the Pohlmeyer charges for the $S^5$ part of the string sigma-model. The role of the Pohlmeyer charges for $AdS_5$ was discussed in the second paper of [28]. In the special case when the motion of the string is restricted to $\mathbb{R} \times S^2 \subset AdS_5 \times S^5$ the action variable discussed here corresponds to the action variable of the sine-Gordon model, see the third paper of [28].

2. Null-surfaces

2.1. The definition. A two-dimensional surface in a space-time of Lorentzian signature is called a null-surface if it has a degenerate metric and is ruled by the light rays. There is a connection between null-surfaces and extremal surfaces. An extremal surface is a two-dimensional surface with the induced metric of the signature 1+1 which extremizes the area functional. Extremal surfaces are solutions of the string worldsheet equation of motion in the purely geometrical background (no B-field). When the string moves very fast, the metric on the worldsheet degenerates and the worldsheet becomes a null-surface. Therefore a null-surface can be considered as a degenerate limit of an extremal surface. In $AdS_5 \times S^5$ there are two types of light rays. The light rays of the first type project to points in $S^5$. The light rays of the second type project to the timelike geodesics in $AdS_5$ and the equator of $S^5$. The operators of the large R-charge correspond to the null-surfaces ruled by the light rays of the second type (see footnote 1).

2.2. The moduli space of null-surfaces. It is straightforward to explicitly describe all the null-surfaces of the second type in $AdS_5 \times S^5$. We have to first describe the moduli space of the null-geodesics of the second type. An equator of $S^5$ is specified by a point $g_S$ in the coset space $\frac{SO(6)}{SO(2) \times SO(4)}$. Similarly, a timelike geodesic in $AdS_5$ is specified by $g_A \in \frac{SO(2,4)}{SO(2) \times SO(4)}$. Given $g_S$ and $g_A$, let $E(g_S) \subset S^5$ and $T(g_A) \subset AdS_5$ be the corresponding equator in $S^5$ and timelike geodesic in $AdS_5$, respectively. To specify a light ray in $AdS_5 \times S^5$ we have to give also a map $F : T \to E$ which pulls back the angular coordinate on E to the length parameter on T (see Fig. 1). Such maps are parametrized by $S^1$. We see that each light ray is defined by a triple (T, E, F). Therefore, the moduli space of light rays of the second type in $AdS_5 \times S^5$ is geometrically:
\[ \frac{SO(2,4)}{SO(2) \times SO(4)} \times \frac{SO(6)}{SO(2) \times SO(4)} \times S^1. \tag{1} \]
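As a quick consistency check on Eq. (1), one can count dimensions. This is our own illustrative computation, not taken from the paper. It uses only $\dim SO(p,q) = n(n-1)/2$ with $n = p + q$, so that $\dim SO(2,4) = \dim SO(6) = 15$ and $\dim (SO(2) \times SO(4)) = 1 + 6 = 7$.

```latex
\[
\dim \frac{SO(2,4)}{SO(2)\times SO(4)} = 15 - 7 = 8, \qquad
\dim \frac{SO(6)}{SO(2)\times SO(4)} = 15 - 7 = 8, \qquad
8 + 8 + \dim S^1 = 17 .
\]
```

This agrees with the standard count for unparametrized null geodesics in a d-dimensional space-time, $2d - 3 = 17$ for $d = 10$: the cotangent bundle has dimension $2d$, and the null condition, the flow along the geodesic and the rescaling of the affine parameter each remove one dimension.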

A null-surface is a one-parameter family of light rays. Therefore it determines a contour in $\frac{SO(2,4)}{SO(2) \times SO(4)} \times \frac{SO(6)}{SO(2) \times SO(4)} \times S^1$. But we have to also remember that an arbitrary collection of the light rays is not necessarily a null-surface. It is a null-surface only if the induced metric is degenerate. To understand what it means, let us choose a space-like curve belonging to our surface. This space-like curve is a collection of points, one point on each light ray. For the surface to be null, the tangent vector to this curve at each point of the curve should be orthogonal to the light ray to which the point belongs. (This condition does not depend on how we choose a space-like curve.)

1 The null-surfaces of the first type have a boundary. They describe the shock wave propagating from the cusp of the worldline of a spectator quark in $\mathbb{R} \times S^3$.

Fig. 1. A null-geodesic in $AdS_5 \times S^5$ is specified by the choice of an equator E in $S^5$, a time-like geodesic T in $AdS_5$ and a map $F : T \to E$ which maps the angular parameter ψ on the equator to the time t on the geodesic, up to a constant

What kind of a constraint does it impose on the contour? The space $\frac{SO(2,4)}{SO(2) \times SO(4)} \times \frac{SO(6)}{SO(2) \times SO(4)} \times S^1$ is a U(1) bundle over $\frac{SO(2,4)}{SO(2) \times SO(4)} \times \frac{SO(6)}{SO(2) \times SO(4)}$. The condition of the degeneracy of the metric defines a connection on this bundle. The definition of this connection is: the curve in the total space is considered horizontal precisely if the corresponding collection of light rays is a degenerate surface. What is the curvature of this connection? Both $\frac{SO(2,4)}{SO(2) \times SO(4)}$ and $\frac{SO(6)}{SO(2) \times SO(4)}$ are Kähler manifolds (if we forgive that the metric on the first coset is not positive-definite). Let us denote the Kähler forms $k_A$ and $k_S$. The curvature of our U(1)-bundle is $k_A + k_S$. A curve in the base space $\frac{SO(2,4)}{SO(2) \times SO(4)} \times \frac{SO(6)}{SO(2) \times SO(4)}$ can be lifted to the horizontal curve in the total space if and only if a two-dimensional film ending on this curve has an integer Kähler area (the integral of $k_A + k_S$ over this film should be an integer). Moreover, it is lifted as a horizontal curve almost unambiguously, except that there is a “global” action of U(1) shifting $F : T \to E$ on every light ray by the same constant. Therefore, the moduli space of null-surfaces is the U(1) bundle over the space of contours in $\frac{SO(2,4)}{SO(2) \times SO(4)} \times \frac{SO(6)}{SO(2) \times SO(4)}$ subject to the integrality condition which we described. To summarize, the moduli space of the null-surfaces of the second type is:
\[ \frac{\mathrm{Map}_0\left(S^1, \frac{SO(2,4)}{SO(2) \times SO(4)} \times \frac{SO(6)}{SO(2) \times SO(4)}\right) \times S^1}{\mathrm{Diff}(S^1)}. \tag{2} \]

Here $\mathrm{Map}(S^1, X)$ means the space of maps from the circle to X; for X a Kähler manifold $\mathrm{Map}_0(S^1, X)$ means the space of maps satisfying the integrality condition. At this



Fig. 2. A picture of a null-surface in AdS5 × S 5 . A null-surface is a two-dimensional surface with the degenerate metric, ruled by the light rays. We have shown five light rays and a spacial contour with a parameter σ . One can visualize the null-surface as the surface swept by the spacial contour as it moves along the light rays

point we consider the null-surfaces without a parametrization; therefore we divide by the group $\mathrm{Diff}(S^1)$ of the diffeomorphisms of the circle. Turning on the fermionic degrees of freedom on the worldsheet we get the moduli space of supersymmetric null-surfaces [9]:
\[ \frac{\mathrm{Map}_0\left(S^1, Gr(2|2, 4|4)\right) \times S^1}{\mathrm{Diff}(S^1)}. \tag{3} \]
Here $\mathrm{Map}_0(S^1, Gr(2|2, 4|4))$ is the phase space of the continuous spin chain [9]. Therefore the moduli space of null-surfaces is “almost” equivalent to the phase space of the continuous spin chain, except for the fiber $S^1$ and the reparametrizations $\mathrm{Diff}(S^1)$. We have to explain what happens to the fiber and why the null-surface actually comes with the parametrization. Also, we have to explain how the symplectic structure is defined on the moduli space of null-surfaces. Let us start with the parametrization.

2.3. Parametrized null-surfaces. The phase space of the classical string has a boundary which consists of strings “moving with the speed of light”. A string moving very fast can be approximated by a null-surface. But one null-surface can approximate many different fast moving strings. The null-surface as we defined it so far “remembers” only the direction of the velocity at each point of the approximated string, but it misses the information about the ratios of the relativistic factors $\sqrt{1 - v^2}$ at different points of the string. Although $\sqrt{1 - v^2} \to 0$ in the null-surface limit, the ratio $\sqrt{1 - v^2(\tau, \sigma_1)} / \sqrt{1 - v^2(\tau, \sigma_2)}$ for two different points on the worldsheet remains finite. Therefore, if we want to think of the moduli space of the null-surfaces as the boundary of the phase space, we have to equip the null-surfaces with an additional structure. This additional structure is the parametrization. A null-surface is a one-parameter family of the light rays. The parametrization is a particular choice of the parameter. In other words, it is a monotonic function σ from the family of light rays forming the null-surface to the circle, defined modulo σ ∼ σ + const. One can also think of it as a density dσ on the set of light rays forming the null-surface. This density is, roughly speaking, proportional to the density of energy on the worldsheet of the fast-moving string, in the limit when it becomes the null-surface. We will now give the definition of σ.
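As an aside on the ratio of relativistic factors discussed above, a one-line example shows how it can stay finite while each factor vanishes. The parametrization is purely illustrative and is our assumption, not the paper's: let two points of the string have speeds $v_a = 1 - c_a \epsilon$ with fixed $c_a > 0$ and $\epsilon \to 0$.

```latex
\[
1 - v_a^2 = c_a \epsilon \, (2 - c_a \epsilon), \qquad
\frac{\sqrt{1 - v_1^2}}{\sqrt{1 - v_2^2}}
  = \sqrt{\frac{c_1 (2 - c_1 \epsilon)}{c_2 (2 - c_2 \epsilon)}}
  \;\longrightarrow\; \sqrt{\frac{c_1}{c_2}}
  \quad (\epsilon \to 0).
\]
```

Both factors vanish like $\sqrt{\epsilon}$, but the constants $c_1, c_2$ survive in the limit. This is the kind of information a parametrization of the null-surface is meant to retain.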


Consider the family of string worldsheets $\Sigma(L)$ converging to the null-surface $\Sigma_0 = \Sigma(\infty)$. We will introduce a parametrization dσ of $\Sigma_0$ in the following way. Consider a Killing vector field U on $S^5$, corresponding to some rotation of the sphere:
\[ U.x_S^i = u_{ij} x_S^j. \tag{4} \]
Here $x_S^i$ parametrizes the $S^5$: $\sum_i (x_S^i)^2 = 1$. When L is large, $\Sigma(L)$ is close to $\Sigma_0$, the string moves very fast and the conserved charge corresponding to U is very large. We can approximate this charge by an integral over a spatial contour on the null-surface $\Sigma_0$ of $u_{ij} x^i_{0,S} \partial_\tau x^j_{0,S}$ times some density dσ:
\[ Q_U = L \int_{\sigma \in [0, 2\pi]} d\sigma \, u_{ij} x^i_{0,S} \partial_\tau x^j_{0,S} + (\text{terms vanishing at } L \to \infty). \tag{5} \]
Here $x_{0,S}$ is the $S^5$-part of the null-surface; we choose the τ coordinate on the null-surface to be the affine parameter on the light ray normalized by the condition $x_{0,S}(\tau + 2\pi, \sigma)$

$= x_{0,S}(\tau, \sigma)$. Equation (5) with the condition $\int d\sigma = 2\pi$ is the definition of dσ, and also the precise definition of the large parameter L, modulo O(1/L). We choose σ as the parametrization. We can now say that the moduli space of parametrized null-surfaces is the boundary of the phase space of a classical string. We say that a family $\Sigma(L)$ of extremal surfaces has a parametrized null-surface $\Sigma_0$ as a limit when L → ∞ if and only if

• $\Sigma(L)$ has $\Sigma_0$ as a limit when L → ∞, as a continuous family of smooth two-dimensional surfaces in a smooth two-dimensional manifold, and
• the density of $Q_U$ approaches Eq. (5) in the limit L → ∞.

This definition of the parametrization does not depend on which particular geometrical symmetry U we use. An alternative way to define the same parametrization is to use a special choice of the worldsheet coordinates on Σ. Let us choose the worldsheet coordinates τ, σ so that
\[ \left(\frac{\partial x_S}{\partial \tau}\right)^2 + \left(\frac{\partial x_A}{\partial \sigma}\right)^2 = -\left(\frac{\partial x_A}{\partial \tau}\right)^2 - \left(\frac{\partial x_S}{\partial \sigma}\right)^2 = 1, \]
\[ \left(\frac{\partial x_S}{\partial \tau}, \frac{\partial x_S}{\partial \sigma}\right) = -\left(\frac{\partial x_A}{\partial \tau}, \frac{\partial x_A}{\partial \sigma}\right) = \mathrm{const}, \]
where $x_A$ is the projection of the string worldsheet to $AdS_5$ and $x_S$ is the projection to $S^5$. Then we define $\sigma = \sigma / \int d\sigma$. In the null-surface limit dσ defines the parametrization of the null-surface.

2.4. The symplectic structure. The moduli space of parametrized null-surfaces as a manifold depends only on the conformal structure of the target space. But we can introduce additional structures on this moduli space which use the metric on $AdS_5 \times S^5$. An important additional structure is the closed 2-form which originates from the symplectic form of the classical string. Strictly speaking a differential form in the bulk of the manifold does not automatically determine a differential form on the boundary. Indeed, suppose that we have a differential form, for example a 2-form Ω in the bulk. We can try to define the “boundary value” ω of Ω on the boundary in the following way. Given two vector fields $v_1, v_2$ on the boundary, we find two vector fields $V_1, V_2$ in the bulk such that $\lim V_1 = v_1$ and $\lim V_2 = v_2$. Then we define $\omega(v_1, v_2) = \lim \Omega(V_1, V_2)$. But the problem is that this definition will depend on the choice of $V_1$ and $V_2$. Intuitively, if $(\tilde{V}_1, \tilde{V}_2)$ is some other choice of a pair of vector fields inducing $(v_1, v_2)$ on the boundary, and the “vertical component” of $\tilde{V}_i - V_i$ is not small enough near the boundary, then $\Omega(V_1, V_2) \neq \Omega(\tilde{V}_1, \tilde{V}_2)$. Given this difficulty, how do we define the symplectic form on the space of null-surfaces given the symplectic form on the string phase space? When we lift the vector field v on the boundary to the vector field V in the bulk, let us require that dL(V) goes to zero when L → ∞. We define L by Eq. (5); it is only an approximate definition at L → ∞, but this is good enough for the purpose of our definition:
\[ \omega(v_1, v_2) = \lim_{L \to \infty} L^{-1} \Omega(V_1, V_2), \tag{6} \]
where $V_1$ and $V_2$ are such that $dL(V_1) = dL(V_2) \simeq 0$. One can see that ω has a kernel, which is precisely the tangent space to the fiber $S^1$ in the numerator of Eq. (3). The moduli space has a symmetry U(1) rotating this fiber; we will discuss this symmetry in the next section; we will call it $U(1)_L$. Therefore ω is the symplectic form on the moduli space of null-surfaces modulo $U(1)_L$. Equation (3) implies that the moduli space of parametrized null-surfaces modulo $U(1)_L$ is the space of parametrized contours in the Grassmannian:
\[ \mathrm{Map}_0\left(S^1, Gr(2|2, 4|4)\right). \tag{7} \]
One can see that ω is equal to the integral of the symplectic form on the super-Grassmannian pointwise on the contour, with the measure dσ. The symplectic area of the film filling the contour is the generating function of the shift of the origin of the circle. Therefore the integrality condition guarantees that the symplectic form does not depend on the choice of the origin on $S^1$; the symplectic form is horizontal and invariant with respect to the shifts of the origin of $S^1$. Our definition of the symplectic form on the space of null-surfaces used the target space metric (just the conformal structure would not be enough) and also the fact that the target space is a product of two manifolds.

3. Nearly-Degenerate Extremal Surfaces and the Role of the Engineering Dimension

Our discussion in this and the next section will be limited to the classical bosonic string.

3.1. Definition of $U(1)_L$. The moduli space (3) of null-surfaces is a U(1)-bundle. The U(1) symmetry shifting in the fiber $S^1$ plays an important role in the formalism. We will call it $U(1)_L$. In Fig. 3 we have shown schematically how $U(1)_L$ acts on the null-surfaces. We conjecture that $U(1)_L$ corresponds to the length of the spin chain. Generally speaking, the length of the spin chain is not conserved in the Yang-Mills perturbation theory [24], but it is probably conserved in the continuous limit (this should be related to the discussion of the “closed sectors” in [26]). It should be conserved modulo the corrections vanishing in the continuous limit. We therefore conjecture that there is a continuation of $U(1)_L$ from the space of null-surfaces to the phase space of the classical

A. Mikhailov

Fig. 3. The action of $U(1)_L$ on the null-surface. The symmetry acts only on the $S^5$-part of the null-string. Each point shifts by the same angle along the equator which is the projection to $S^5$ of the corresponding light ray.

string, at least to the region of the phase space corresponding to fast moving strings. We conjecture that this continuation is uniquely defined by the following properties:

1. The action of $U(1)_L$ preserves the symplectic structure.
2. The action of $U(1)_L$ does not change the projection of the worldsheet to $AdS_5$. Moreover, it preserves the projection to $AdS_5$ of the null-directions on the worldsheet.
3. We require that the orbits of $U(1)_L$ are closed (otherwise, we would not have called it $U(1)$).
4. The restriction of $U(1)_L$ to the null-surfaces acts as we described (see Fig. 3).

The second property reflects the fact that $U(1)_L$ corresponds to the length of the operator rather than its engineering dimension. Let $E$ denote the Hamiltonian of $U(1)_L$. Let $X$ denote the phase space of the classical string, and $X/\!/(E = l)$ denote the Hamiltonian reduction of the phase space on the level set of $E$. The basic conjecture is:

There is a one-to-one map from the phase space of the spin chain of the length $l$ to the reduced phase space of the classical string $X/\!/(E = l)$ preserving the symplectic structure and commuting with the action of $SO(2, 4) \times SO(6)$.

The reduction by $U(1)_L$ was discussed in [23] but only in a sector [24] in which $U(1)_L$ acts as some element of $SO(6)$. The perturbation theory in this sector was discussed in [25] (see also Sect. 2 of [11]).

3.2. Action of $U(1)_L$ on nearly-degenerate extremal surfaces. In this subsection we will explain how to continue the action of $U(1)_L$ from the boundary of the phase space. Most of this section is a partial review² of Sect. 3 of [12].

² Section 3 of [12] has more than just a construction of $U(1)_L$. The next step is considering the action of the Killing vector field $\frac{\partial}{\partial T}$, where $T$ is the global time in $AdS_5$, on the invariants of $U(1)_L$ and bringing the result to the form suitable for the comparison with the field theory computation. Here we are discussing only the first step.



3.2.1. Particle on a sphere. Consider the phase space of a particle moving on $S^5$, and restrict to the domain where the velocity of the particle is nonzero. This domain is naturally a bundle over the moduli space of equators of $S^5$; let $\pi$ denote the projection map in this bundle. A point of the phase space, corresponding to the position $x \in S^5$ and the velocity $v \in T_x S^5$, projects by $\pi$ to the equator going through $x$ and tangent to $v$. See the discussion in [11]. The symplectic form on the phase space is expressed in terms of the symplectic form on the base and the connection form $D\psi$:

$$\omega = df \wedge D\psi + f\,\pi^*\Omega, \tag{8}$$

where $D\psi = \frac{(p,\, dx)}{\sqrt{(p,p)}}$, $f = \sqrt{(p,p)}$ ($p$ is the momentum of the particle) and $\Omega$ is the symplectic form on the moduli space of equators. The moduli space of equators $\frac{SO(6)}{SO(2)\times SO(4)}$ is a Kähler manifold, and the symplectic form is the Kähler form. Now it is easy to construct the action of $U(1)$. One takes

$$V = \frac{\partial}{\partial\psi}. \tag{9}$$

This is a vertical vector field; it does not act on the base. The coordinate $\psi$ is essentially the angle along the equator on which the particle is moving. More explicitly:

$$\frac{\partial}{\partial\psi}.x = \frac{1}{\sqrt{(\partial_\tau x, \partial_\tau x)}}\,\partial_\tau x. \tag{10}$$

It is easy to see that the trajectories of the vector field $\frac{\partial}{\partial\psi}$ on the phase space of a particle on $S^5$ are periodic with the period $2\pi$. One has to remember that this vector field is defined only on the open subset of the phase space where the velocity of the particle is nonzero. But we consider fast moving strings, and the region of the phase space where the velocity is nearly zero is not important for us.
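The periodicity claimed above can be illustrated numerically. The sketch below assumes that the flow of $\frac{\partial}{\partial\psi}$ rotates the pair $(x, p)$ along the great circle through $x$ tangent to $p$ while keeping $|p|$ fixed, which is the unit-speed geodesic flow generated by $f = |p|$. The helper names (`flow`, `random_phase_point`, `max_diff`) are illustrative and not taken from the paper.

```python
import math
import random

def dot(a, b):
    """Euclidean scalar product in the ambient R^6."""
    return sum(ai * bi for ai, bi in zip(a, b))

def flow(x, p, psi):
    """Assumed flow of d/dpsi: rotate (x, p) by angle psi along the
    great circle through x tangent to p, preserving |p|."""
    norm_p = math.sqrt(dot(p, p))
    n = [pi / norm_p for pi in p]          # unit tangent vector at x
    c, s = math.cos(psi), math.sin(psi)
    x_new = [c * xi + s * ni for xi, ni in zip(x, n)]
    p_new = [norm_p * (-s * xi + c * ni) for xi, ni in zip(x, n)]
    return x_new, p_new

def random_phase_point(seed=0):
    """A point x on S^5 (embedded in R^6) and a tangent momentum p."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(6)]
    r = math.sqrt(dot(x, x))
    x = [xi / r for xi in x]
    p = [rng.gauss(0, 1) for _ in range(6)]
    px = dot(p, x)
    p = [pi - px * xi for pi, xi in zip(p, x)]   # project p onto T_x S^5
    return x, p

def max_diff(a, b):
    return max(abs(ai - bi) for ai, bi in zip(a, b))

x, p = random_phase_point()
x1, p1 = flow(x, p, 2 * math.pi)
# Deviation from the starting point after a full turn psi = 2*pi.
period_error = max(max_diff(x1, x), max_diff(p1, p))
print(period_error)
```

The flow is only defined for $p \neq 0$, which mirrors the restriction to the open subset of the phase space where the velocity is nonzero.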

3.2.2. String on a sphere. In some sense, a string is a continuous collection of particles. Therefore, it is natural to apply a similar construction to the string. Treating the string as a continuous collection of particles requires a choice of coordinates on the worldsheet. We will therefore introduce the conformal gauge:

$$(\partial_\tau x)^2 + (\partial_\sigma x)^2 = 0, \qquad (\partial_\tau x, \partial_\sigma x) = 0. \tag{11}$$

In this gauge the symplectic form is:

$$\omega = \int d\sigma\, (\delta_1 x,\, \overleftrightarrow{D}_\tau\, \delta_2 x). \tag{12}$$

In the Hamiltonian formalism, we introduce $p_A = \partial_\tau x_A \in T(AdS_5)$, the $AdS_5$-component of the momentum, and $p_S = \partial_\tau x_S \in T(S^5)$, the $S^5$-component of the momentum. Now we will interpret the string as a collection of particles parametrized by $\sigma$. We are tempted to interpret the vector field (9), (10) acting pointwise in $\sigma$ as the required $U(1)_L$ symmetry. The generator of this symmetry would be $\int d\sigma\, |p_S|$. But this would be wrong. This field preserves the symplectic structure, does have periodic trajectories and acts correctly on the null surfaces. But unfortunately it does not preserve


the gauge (11). It only commutes with the second constraint, $(p, \partial_\sigma x) = 0$. But it does not commute with the first one, $(p, p) + (\partial_\sigma x, \partial_\sigma x) = 0$. Indeed, it commutes with $(\partial_\tau x_S)^2 = (p_S, p_S)$ but not with $(\partial_\sigma x_S)^2$. Therefore we should modify this vector field so that it still has periodic trajectories, but also commutes with the constraint. There is a systematic procedure to do this, order by order in $\frac{1}{(p_S, p_S)}$, developed in [12]. Let us summarize this procedure, or perhaps a variation of it. To make sure that the modified vector field is Hamiltonian (preserves the symplectic structure) we construct it as a conjugation of $\frac{\partial}{\partial\psi}$ with some canonical transformation, which we denote $F$:

$$V.x = F^{-1}\,\frac{\partial}{\partial\psi}.F[x] \tag{13}$$

or schematically $V = F^{-1} \circ \frac{\partial}{\partial\psi} \circ F$. Since $F$ is a canonical transformation, $V$ is automatically a Hamiltonian vector field. Since $F$ is single-valued, $V$ generates periodic trajectories. It remains to construct $F$ such that $V$ commutes with the constraint $(p, p) + (\partial_\sigma x)^2$. But to require that $F^{-1} \circ \frac{\partial}{\partial\psi} \circ F$ commutes with $(p, p) + (\partial_\sigma x)^2$ is the same as to require that $\frac{\partial}{\partial\psi}$ commutes with $F^*[(p_S, p_S) + (\partial_\sigma x_S)^2]$, the pullback of $(p_S, p_S) + (\partial_\sigma x_S)^2$ by $F$. Therefore we have to find a canonical transformation $F$ such that the pullback of $(p_S, p_S) + (\partial_\sigma x_S)^2$ by $F$ is annihilated by the vector field $\frac{\partial}{\partial\psi}$. In other words, we have to find a canonical transformation which removes $\psi$ from $(p_S, p_S) + (\partial_\sigma x_S)^2$; after this canonical transformation $|p_S|^2 + (\partial_\sigma x_S)^2$ becomes $|p_S|^2 + \phi_0 + \phi_1 + \ldots$, where all the $\phi_k$ for $k \geq 0$ are in involution with $\int |p_S(\sigma)|\, d\sigma$ and $\phi_k$ is of the order $1/|p_S|^{2k}$. This was done in Sect. 3 of [12]. The canonical transformation can be expanded in $1/(p_S, p_S)$; the corresponding generating function is expanded in the odd powers of $1/|p_S|$. The authors of [12] gave the explicit expression for $F$ to the first order in $1/|p_S|$, and also a straightforward algorithm for constructing the higher orders. (We will reconsider the higher orders from a slightly different point of view in Sect. 4.4, perhaps making this algorithm more precise.) At the first order we need to find $h^{(1)}$ such that the canonically transformed constraint, which is a function of $\sigma$:

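$$(p_S, p_S)(\sigma) + (\partial_\sigma x_S, \partial_\sigma x_S)(\sigma) + \left\{h^{(1)},\ \left[(p_S, p_S)(\sigma) + (\partial_\sigma x_S, \partial_\sigma x_S)(\sigma)\right]\right\}$$

has zero Poisson bracket with $\int d\sigma\, |p_S|(\sigma)$ up to the terms of the order $1/|p_S|^3$, for every $\sigma$. And $h^{(1)}$ should be of the order $1/|p_S|$. In other words, we should have:

$$\left\{\int d\sigma'\, |p_S|(\sigma'),\ (\partial_\sigma x_S)^2(\sigma) + \{h^{(1)}, |p_S|^2(\sigma)\}\right\} = 0. \tag{14}$$

One can see that

$$h^{(1)} = -\frac{1}{4}\int \frac{d\sigma}{|p_S|}\left(\partial_\sigma x_S,\ D_\sigma \frac{p_S}{|p_S|}\right)(\sigma) \tag{15}$$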

works. Notice also that this $h^{(1)}$ is reparametrization invariant (where $|p_S|$ transforms as a density of weight one). Therefore it commutes also with the second constraint $(p_S, \partial_\sigma x_S) = \mathrm{const}$. Therefore, to the first order in $1/|p_S|$ the canonical transformation we are looking for is generated by this $h^{(1)}$. Then the generator of the $U(1)_L$ is, up to the terms of the order $1/|p_S|^3$:

$$E = \int d\sigma\, |p_S| - \left\{h^{(1)}, \int d\sigma\, |p_S|(\sigma)\right\} + \ldots = \int d\sigma \left[\, |p_S| + \frac{1}{4|p_S|}\left( (\partial_\sigma x_S)^2 - \left(D_\sigma \frac{p_S}{|p_S|},\ D_\sigma \frac{p_S}{|p_S|}\right) - \frac{(p_S, \partial_\sigma x_S)^2}{(p_S, p_S)} \right) + \ldots \right]. \tag{16}$$

One can see immediately that the trajectories of this charge are closed up to the terms of the order subleading to 1/|pS |. Indeed, we have explained in Sect. 3.2.1 why the leading term gives periodic trajectories. And the second term (which as we have seen is needed to make the charge commuting with the Virasoro constraints) averages to zero on the periodic trajectories of the first term. Therefore (see for example Sect. 3 of [11]) the trajectories of E do not drift at this order. We will discuss the higher orders in Sect. 4.4.

4. Length of the Operator and Local Conserved Charges

We have seen that the null-surface perturbation theory has a "hidden" symmetry $U(1)_L$. The existence of $U(1)$ symmetries acting on the phase space is typical for integrable systems, at least for those which have a finite-dimensional phase space. The corresponding conserved charges are called action variables [19]. The classical string in $AdS_5 \times S^5$ is an integrable system. Therefore, we should not be surprised to find such an action variable.³ The local conserved charges in involution for the classical string in $AdS_5 \times S^5$ are explicitly known. Therefore, instead of constructing $U(1)_L$ in the perturbation theory, we can try to build it as some linear combination of the already known conserved charges. In this section, we will argue that the coefficients of this linear combination are actually fixed by the calculation of [21, 22].

4.1. Local conserved charges. Consider a string in the target space which is a product of two manifolds $A$ and $S$. We assume that the metric on $A$ has the Lorentzian signature, and the metric on $S$ has the Euclidean signature. We will need $A = AdS_5$ and $S = S^5$, but let us first consider the general $A \times S$. The string worldsheet will be denoted $\Sigma$. The classical trajectory of the string is an embedding $x : \Sigma \to A \times S$. We are going to use the fact that the target space is a direct product. A point of $A \times S$ is a pair $(x_A, x_S)$, where $x_A$ is a point of $A$ and $x_S$ is a point of $S$. Therefore for each point $\zeta \in \Sigma$ we have $x(\zeta) = (x_A(\zeta), x_S(\zeta))$, where $x_A \in A$ and $x_S \in S$. Consider the 1-forms $dx_A$ and $dx_S$ on the string worldvolume, $dx_A$ taking values in $T_x A$ and $dx_S$ in $T_x S$. In other words, $\begin{pmatrix} dx_A \\ dx_S \end{pmatrix}$ is the differential of $x$.

³ Strictly speaking, the integrability is not necessary for the construction of the action variable in perturbation theory. A typical example is a particle on a sphere $S^2$ in an arbitrary (polynomial) potential. When the particle moves very fast, it does not feel the potential. All the trajectories are periodic in the limit of an infinite velocity. Therefore on the boundary of the phase space, when the velocity is infinite, we have an action variable $|p|$, the absolute value of the momentum. It is well known that the perturbation theory in $1/|p|$ allows us to extend this action variable from the boundary inside the phase space, but only in the perturbation theory. For an arbitrary potential, the perturbation series must diverge, because in fact there is no additional conserved quantity besides the energy. Therefore the $U(1)$ will actually be broken by effects which are not visible in the perturbation theory, unless the potential is such that the system is integrable. We want to thank V. Kaloshin and A. Starinets for discussions of this subject.

The metric on $A \times S$ has the Lorentzian signature, and we consider the string worldsheets which have the induced metric with the Lorentzian signature. Pick two vector fields $\xi_+$ and $\xi_-$ on $\Sigma$, which are both lightlike but have a nonzero scalar product: $(\xi_+, \xi_+) = 0$, $(\xi_-, \xi_-) = 0$, $(\xi_+, \xi_-) \neq 0$. These vector fields have a simple geometrical meaning. Since the worldsheet is two-dimensional, at each point we have two different lightlike directions. The vector $\xi_+$ points along one lightlike direction, and $\xi_-$ points along another. Pick a spatial contour $C$ on $\Sigma$, and a 1-form $\nu$ on $\Sigma$ such that $\nu(\xi_-) = 0$ and $\nu(\xi_+) \neq 0$. Consider the following functional:

$$Q^{[1]}[x] = \int_C \nu\, \frac{\sqrt{(dx_S(\xi_+))^2}}{\nu(\xi_+)}. \tag{17}$$

We will prove that this functional does not depend on a particular choice of $\xi_+$, $\xi_-$, $\nu$ and $C$. This is therefore a correctly defined functional on the phase space of the string. Indeed, the only ambiguity in the choice of $\xi_+$ is $\xi_+ \to f(\zeta)\xi_+$, where $f$ is some function on the worldsheet. But this function cancels in (17). The ambiguity in the choice of $\xi_-$ and $\nu$ is also in rescaling which does not change (17). It remains to prove that (17) does not depend on the choice of the integration contour $C$. To prove that (17) is independent of $C$, let us choose coordinates $(\tau^+, \tau^-)$ on the worldsheet in such a way that the induced metric is $ds^2 = \rho(\tau^+, \tau^-)\, d\tau^+ d\tau^-$. Then $\xi_+$ is proportional to $\frac{\partial}{\partial\tau^+}$ and $\xi_-$ is proportional to $\frac{\partial}{\partial\tau^-}$. In these coordinates

$$Q^{[1]} = \int_C d\tau^+ \sqrt{(\partial_+ x_S)^2}. \tag{18}$$

The variation of $Q^{[1]}$ under the variation of the contour is measured by the differential of the form:

$$d\left[ d\tau^+ \sqrt{(\partial_+ x_S)^2} \right] = -\, d\tau^+ \wedge d\tau^-\ \frac{(\partial_+ x_S,\ D_- \partial_+ x_S)}{\sqrt{(\partial_+ x_S)^2}}.$$

But on the equations of motion $D_+ \partial_- x_S = 0$. Therefore the integral does not depend on the choice of the contour. Let us explain why on the equations of motion we have $D_+ \partial_- x_S = 0$. Let $N$ be the second quadratic form of the surface, $N : S^2(T\Sigma) \to \mathcal{N}$ (here $\mathcal{N} = T(A \times S)/T\Sigma$ is the normal bundle to $\Sigma$ in $A \times S$). The second quadratic form is defined in the following way: suppose that the particle moves on $\Sigma$ with the velocity $v$, then the acceleration of the particle is $N(v)$ modulo a vector parallel to $T\Sigma$. For the surface to be extremal, the trace of $N$ should be zero. The trace of $N$ is the contraction of $N$ with the induced metric on $\Sigma$; it is a section of $\mathcal{N}$. The trace of $N$ is proportional to $D_+ \partial_- x$, therefore we should have:

$$D_+ \partial_- x = f^+(\tau^+, \tau^-)\, \partial_+ x + f^-(\tau^+, \tau^-)\, \partial_- x.$$

But notice that $(D_+ \partial_- x, \partial_- x) = (D_- \partial_+ x, \partial_+ x) = 0$, therefore $f^+ = f^- = 0$. Another conserved charge is:

$$\tilde Q^{[1]} = \int_C d\tau^- \sqrt{(\partial_- x_S)^2}. \tag{19}$$

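Are there charges containing higher derivatives of $x_S$? Let us consider the following expression:

$$J_+^{[2]}(\tau_+, \tau_-) = \frac{1}{|\partial_+ x_S|}\left( D_+ \frac{\partial_+ x_S}{|\partial_+ x_S|},\ D_+ \frac{\partial_+ x_S}{|\partial_+ x_S|} \right). \tag{20}$$

Even though $D_+ \partial_- x_S = 0$ it is not true that $\partial_- J_+^{[2]}$ is zero. The covariant derivatives $D_+$ and $D_-$ do not commute, therefore $D_- D_+ \frac{\partial_+ x_S}{|\partial_+ x_S|} \neq 0$. In fact, for any function $w : \Sigma \to T(A \times S)$ we have

$$[D_+, D_-]\, w = R(\partial_+ x, \partial_- x).w, \tag{21}$$

where $R$ is the Riemann tensor of $A \times S$. Now we have to start using that $S$ is a sphere. For $S = S^5$, the Riemann tensor is constructed from the metric tensor, and

$$[D_+, D_-]\, w = \partial_+ x\, (\partial_- x, w) - \partial_- x\, (\partial_+ x, w). \tag{22}$$

Now consider the following differential form:

$$\lambda = 2\, \frac{d\tau^-}{|\partial_+ x_S|}\, (\partial_- x_S, \partial_+ x_S) + \frac{d\tau^+}{|\partial_+ x_S|} \left( D_+ \frac{\partial_+ x_S}{|\partial_+ x_S|},\ D_+ \frac{\partial_+ x_S}{|\partial_+ x_S|} \right). \tag{23}$$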

Using (22) we can show that $d\lambda = 0$, therefore $\lambda$ is a local conservation law. We use the formula $D_- D_+ \partial_+ x_S = (\partial_+ x_S)^2\, \partial_- x_S - (\partial_+ x_S, \partial_- x_S)\, \partial_+ x_S$ which is special for $S^5$. We will denote this charge $Q^{[2]}$. There is also a charge $\tilde Q^{[2]}$ which is obtained from (23) by replacing $\tau^+$ with $\tau^-$ and $\partial_+$ or $D_+$ with $\partial_-$ or $D_-$. These charges are just the first examples of an infinite family of charges, which are all in involution. This infinite family was constructed in [20]. A particularly important linear combination is

$$E_2 = \frac{1}{2}\left( Q^{[1]} - \tilde Q^{[1]} \right). \tag{24}$$

The construction of this charge requires only that the target space is a direct product of two manifolds.
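The closure of the form $\lambda$ in Eq. (23) can be checked in a few lines. The following is only a sketch under the conventions of this section, reading (23) with the factor 2 multiplying the $d\tau^-$ term; the shorthand $u = |\partial_+ x_S|$ and $n = \partial_+ x_S / u$ is introduced here for the derivation and does not appear in the original text.

```latex
% Shorthand (an assumption of this sketch): u = |\partial_+ x_S|, n = \partial_+ x_S / u.
% Requires amsmath for align* and \intertext.
\begin{align*}
\lambda &= A\, d\tau^- + B\, d\tau^+, \qquad
A = 2\,(\partial_- x_S,\, n), \qquad
B = \frac{1}{u}\,(D_+ n,\, D_+ n), \\
d\lambda &= (\partial_+ A - \partial_- B)\; d\tau^+ \wedge d\tau^- .
\intertext{On the equations of motion $D_- \partial_+ x_S = D_+ \partial_- x_S = 0$,
so $\partial_- u = 0$ and $D_- n = 0$. Using (22) with $\partial_+ x_S = u\, n$
and $(n, D_+ n) = 0$:}
\partial_+ A &= 2\,(\partial_- x_S,\, D_+ n), \\
\partial_- B &= \frac{2}{u}\,(D_- D_+ n,\, D_+ n)
             = \frac{2}{u}\,\bigl([D_-, D_+]\, n,\, D_+ n\bigr) \\
            &= \frac{2}{u}\,\bigl(u\, \partial_- x_S - u\,(\partial_- x_S, n)\, n,\; D_+ n\bigr)
             = 2\,(\partial_- x_S,\, D_+ n).
\end{align*}
% Hence \partial_+ A = \partial_- B and d\lambda = 0.
```

The same computation with $+$ and $-$ exchanged applies to the form defining $\tilde Q^{[2]}$.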


4.2. Local conserved charges are invariant under $U(1)_L$. Consider a local conserved charge $Q$ acting trivially on the AdS part of the worldsheet. In the conformal gauge, this means that $Q$ is constructed as a contour integral of some combination of $x_S$ and $p_S$. Let us decompose $Q$ in the inverse powers of $|p_S|$:

$$Q = Q_m + Q_{m+1} + Q_{m+2} + \ldots, \tag{25}$$

where $m$ is a non-negative integer, the "order" of the charge; $Q_m$ is of the order $1/|p_S|^{2m-1}$, $Q_{m+1}$ is of the order $1/|p_S|^{2m+1}$, etc. We have to require that $Q$ is in involution with the Virasoro constraints. In particular, it should be in involution with $|p_S(\sigma)|^2 + (\partial_\sigma x_S(\sigma))^2$ for an arbitrary $\sigma$. (Here we used that $Q$ is trivial in the AdS-part.) Let us now apply the canonical transformation $F$ which we described in Sect. 3.2.2. After this canonical transformation $|p_S|^2 + (\partial_\sigma x_S)^2$ becomes $|p_S|^2 + \phi_0 + \phi_1 + \ldots$, where all the $\phi_k$ for $k \geq 0$ are in involution with $\int |p_S(\sigma)|\, d\sigma$ and $\phi_k$ is of the order $1/|p_S|^{2k}$. And $Q = Q_m + Q_{m+1} + \ldots$ becomes $Q' = Q'_m + Q'_{m+1} + \ldots$, where $Q'$ is the canonically transformed $Q$. We should have:

$$\left\{ |p_S(\sigma)|^2 + \phi_0(\sigma) + \phi_1(\sigma) + \ldots,\ Q'_m + Q'_{m+1} + \ldots \right\} = 0 \tag{26}$$

for an arbitrary $\sigma$. At the leading order in $|p_S|$ this implies that $\int d\sigma\, |p_S(\sigma)|$ is in involution with $Q'_m$. At the next order, it follows that for all values of $\sigma$ the expression $\{|p_S(\sigma)|^2, Q'_{m+1}\}$ is in involution with $\int d\sigma\, |p_S(\sigma)|$. This implies that:

$$\left\{ \int d\sigma\, |p_S(\sigma)|,\ \left\{ \int d\sigma\, |p_S(\sigma)|,\ Q'_{m+1} \right\} \right\} = 0. \tag{27}$$

Since the vector field generated by $\int d\sigma\, |p_S(\sigma)|$ is periodic, this equation implies that $\int d\sigma\, |p_S(\sigma)|$ is in involution with $Q'_{m+1}$. An analogous argument at higher orders shows that all the $Q'_{m+j}$ commute with $\int d\sigma\, |p_S(\sigma)|$. Therefore $Q'$ is in involution with $\int d\sigma\, |p_S(\sigma)|$; equivalently, $Q$ is in involution with the canonically transformed expression which is the generator of $U(1)_L$. The conserved charges of [20] do have an expansion of the form (25), therefore they should commute with $U(1)_L$. This reinforces our conjecture that $U(1)_L$ should be a combination of the local conserved charges.

4.3. A geometrical meaning of $U(1)_L$. We can try to make more transparent the geometrical meaning of $U(1)_L$ by drawing an analogy with the Liouville theorem for finite-dimensional integrable systems. A mechanical system with $2n$-dimensional phase space is integrable if there are $n$ functions $F_1, \ldots, F_n$ in involution with each other, and the Hamiltonian is a function of $F_1, \ldots, F_n$. Then, there are $n$ action variables $I_1, \ldots, I_n$, each of them being some combination of $F_1, \ldots, F_n$: $I_j = I_j(F_1, \ldots, F_n)$, such that each $I_j$ generates $U(1)$ (has periodic orbits). In this paper we are dealing with an infinite-dimensional system, a classical string in $AdS_5 \times S^5$. We can take the first Pohlmeyer charge $Q^{[1]} - \tilde Q^{[1]}$ as a Hamiltonian.⁴ This Hamiltonian is presumably integrable, because there is an infinite family of higher charges commuting with it. On the other hand, it does not have any special periodicity properties (we do not see any reason why it would). This means that the closure of the orbit of $Q^{[1]} - \tilde Q^{[1]}$ is an invariant torus. Our $U(1)_L$ commutes with $Q^{[1]} - \tilde Q^{[1]}$. (This fact is seen immediately, because $Q^{[1]}$ can be rewritten as $\int d\tau^+ \sqrt{-(\partial_+ x_A, \partial_+ x_A)}$ and by definition $U(1)_L$ does not act on the AdS-part of the worldsheet.) Therefore $U(1)_L$ should be a shift of one of the angles parametrizing the invariant torus of $Q^{[1]} - \tilde Q^{[1]}$. The angles parametrizing the invariant torus are in correspondence with its one-dimensional cycles. Which cycle corresponds to $U(1)_L$? Every invariant torus can be connected by a one-parameter family of invariant tori to a torus on the boundary of the phase space (or the one very close to the boundary). This means that every 1-cycle is connected to some 1-cycle on a torus on the boundary, the space of null-surfaces. We should take that 1-cycle which is connected to the orbit of $U(1)_L$ on the null-surfaces, described in Sect. 3.1. The corresponding action variable is $E$, the generator of $U(1)_L$. These arguments show the uniqueness of $U(1)_L$.

The first Pohlmeyer charge $Q^{[1]} - \tilde Q^{[1]}$ has a special property: it actually generates $U(1)_L$ on the boundary. Therefore the difference between $Q^{[1]} - \tilde Q^{[1]}$ and $E$ should be a combination of charges vanishing at the boundary. We expect that this is an infinite linear combination. Indeed, the construction of [12] tells us that the charge we are looking for is local at each order in $1/|p_S|$. A nonlinear combination of the charges would be non-local (a product of integrals).

⁴ It is a natural Hamiltonian on the phase space of a classical string in any case when the target space is a direct product of two manifolds.

4.4. A different point of view on the perturbation theory; higher orders. In Sect. 3 we constructed $U(1)_L$ as $F^{-1} \circ \frac{\partial}{\partial\psi} \circ F$, where $\frac{\partial}{\partial\psi}$ is generated by $\int d\sigma\, |p_S(\sigma)|$ and $F$ is the canonical transformation such that $F^{-1} \circ \frac{\partial}{\partial\psi} \circ F$ commutes with $|p_S(\sigma)|^2 + |\partial_\sigma x_S(\sigma)|^2$. This canonical transformation is constructed in the perturbation theory, order by order in $\frac{1}{|p_S|^2}$. A disadvantage of this procedure is that at each order we have to require that our $U(1)_L$ commutes with $|p_S(\sigma)|^2 + |\partial_\sigma x_S(\sigma)|^2$ for any $\sigma$. Since there are infinitely many values of $\sigma$ we have to impose infinitely many conditions on $F$ at each order. At the first order, we have seen in Sect. 3.2.2 that these conditions are not really independent; one generating function $h^{(1)}$ takes care of all of them (see Eq. (14)). At the higher orders, this is not immediately obvious. Therefore, we would like to propose a slightly different way of constructing $F$. Let us forget for a moment about the Virasoro constraint; instead of the phase space of the string consider the space of harmonic maps $x(\tau, \sigma)$. Instead of requiring that $U(1)_L$ commutes with $|p_S(\sigma)|^2 + |\partial_\sigma x_S(\sigma)|^2$, let us require that $U(1)_L$ commutes with $Q^{[1]} = \int d\sigma\, |\partial_+ x_S(\sigma)|$. We will see that the requirement that $U(1)_L$ commutes with $Q^{[1]}$ already fixes $U(1)_L$ in the perturbation theory, and the resulting $U(1)_L$ will automatically commute with the Virasoro constraints. As in Sect. 3, we look for the generator of $U(1)_L$ as a pullback by a canonical transformation of $\int |p_S(\sigma)|\, d\sigma$. In other words, let us look for a canonical transformation $F$ such that $\int d\sigma\, |p_S(\sigma)|$ commutes with $F^* Q^{[1]}$ (the pullback of $Q^{[1]}$ by $F$). We can construct such a canonical transformation order by order in the perturbation theory. Let us denote $K = \int d\sigma\, |p_S(\sigma)|$. We have:

$$Q^{[1]} = K + q_1 + q_2 + \ldots. \tag{28}$$

Under the rescaling $p_S \to t p_S$: $K \to tK$, $q_1 \to t^{-1} q_1$, $q_2 \to t^{-3} q_2$, $q_m \to t^{1-2m} q_m$. The symplectic structure is of the degree 1: $\omega \to t\omega$, therefore the Poisson brackets are


of the degree $-1$: $\{\,,\} \to t^{-1}\{\,,\}$. We can construct $F$ order by order in this grading. We have:

$$F^*(Q^{[1]}) = K + q'_1 + q'_2 + \ldots + q'_m + \ldots. \tag{29}$$

Suppose that we have already found $F$ such that $q'_1, \ldots, q'_{m-1}$ commute with $K$. At the order $m$, we want to modify $F$ by the canonical transformation with the generating function $f_m$ of the order $|p_S|^{1-2m}$ so that $q'_m + \{f_m, K\}$ commutes with $K$. Since $K$ is periodic, we can decompose

$$q'_m = q'_{m,0} + \sum_{k \neq 0} q'_{m,k}, \tag{30}$$

where $\{K, q'_{m,k}\} = ik\, q'_{m,k}$. Then we should take

$$f_m = \sum_{k \neq 0} \frac{1}{ik}\, q'_{m,k}. \tag{31}$$
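The choice (31) can be illustrated in a toy model. The sketch below assumes a single angle variable $\psi$ on which $\{K, \cdot\}$ acts as $\partial/\partial\psi$, so that a Fourier mode $e^{ik\psi}$ has eigenvalue $ik$; a function is stored as a dictionary of Fourier coefficients. The function names and the sample coefficients are illustrative only.

```python
def bracket_with_K(modes):
    """{K, g} for g = sum_k g_k e^{i k psi}, using {K, g_k} = i k g_k."""
    return {k: 1j * k * c for k, c in modes.items()}

def generating_function(q_modes):
    """f = sum_{k != 0} q_k / (i k), the analogue of Eq. (31)."""
    return {k: c / (1j * k) for k, c in q_modes.items() if k != 0}

def corrected(q_modes):
    """q + {f, K} = q - {K, f}; only the zero mode should survive."""
    f = generating_function(q_modes)
    K_f = bracket_with_K(f)
    out = dict(q_modes)
    for k, c in K_f.items():
        out[k] = out.get(k, 0) - c
    return out

# Sample Fourier coefficients of a real function of psi (made-up numbers).
q = {0: 0.7, 1: 0.3 - 0.1j, -1: 0.3 + 0.1j, 3: 0.05j, -3: -0.05j}
residual = corrected(q)
print(residual)
```

The zero mode cannot be removed this way because the division by $ik$ is undefined at $k = 0$; it is this averaged part that stays in the transformed charge.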

Repeating this procedure at higher orders, we end up with the function $F$ such that $\{K, F^*(Q^{[1]})\} = 0$. The reparametrization invariance is manifestly preserved at each order, therefore the resulting charge $F^{-1} \circ \frac{\partial}{\partial\psi} \circ F$ will commute with $(p_S, \partial_\sigma x_S)(\sigma)$ for any $\sigma$. Also, the fact that $Q^{[1]}$ is reparametrization-invariant and the arguments analogous to the discussion at the end of Sect. 4.2 show that $F^{-1} \circ \frac{\partial}{\partial\psi} \circ F$ will automatically commute with $|p_S(\sigma)|^2 + |\partial_\sigma x_S(\sigma)|^2$, as well as with the higher Pohlmeyer charges. Indeed, we know that $F^* Q^{[1]} = K + q'_1 + q'_2 + \ldots$ (the expansion (29)) commutes with $F^*(|p_S|^2(\sigma) + |\partial_\sigma x_S|^2(\sigma)) = |p_S|^2 + \phi_0 + \phi_1 + \ldots$; therefore

$$\{K, \phi_0\} = \{|p_S|^2(\sigma), q'_1\} \ \Rightarrow\ \{K, \{K, \phi_0\}\} = 0 \ \Rightarrow\ \{K, \phi_0\} = 0,$$

$$\{K, \phi_1\} + \{q'_1, \phi_0\} = \{|p_S|^2(\sigma), q'_2\} \ \Rightarrow\ \{K, \{K, \phi_1\}\} = 0 \ \Rightarrow\ \{K, \phi_1\} = 0,$$

etc. We used the periodicity of the trajectories of $K$ when we claimed that $\{K, \{K, \phi\}\} = 0$ implies $\{K, \phi\} = 0$. Indeed, for any functional $\phi$ on the phase space, if $\{K, \{K, \phi\}\} = 0$ then $\{K, \phi\}$ is constant on the trajectories of $K$. But if this constant were nonzero, then the change of $\phi$ along the trajectory of $K$ would accumulate over the period of $K$, which would contradict the single-valuedness of $\phi$ on the phase space.

4.5. An infinite combination of local conserved charges. Expanding (23) in the conformal gauge in the powers of $\frac{1}{|p_S|}$ we get:

$$\frac{1}{2}\left( Q^{[2]} - \tilde Q^{[2]} \right) = \int d\sigma \left[ -2|p_S| + \frac{3}{|p_S|}(\partial_\sigma x_S)^2 - \frac{3}{|p_S|^3}(p_S, \partial_\sigma x_S)^2 + \frac{4}{|p_S|}\left( D_\sigma \frac{p_S}{|p_S|},\ D_\sigma \frac{p_S}{|p_S|} \right) + \ldots \right]. \tag{32}$$

And for $Q^{[1]}$ we get:

$$Q^{[1]} - \tilde Q^{[1]} = \int d\sigma \left[ 2|p_S| + \frac{1}{|p_S|}(\partial_\sigma x_S)^2 - \frac{1}{|p_S|^3}(p_S, \partial_\sigma x_S)^2 + \ldots \right]. \tag{33}$$


We have:

$$\frac{1}{16}\left[ 7\left( Q^{[1]} - \tilde Q^{[1]} \right) - \frac{1}{2}\left( Q^{[2]} - \tilde Q^{[2]} \right) \right] = \int d\sigma \left[\, |p_S| + \frac{1}{4|p_S|}\left( (\partial_\sigma x_S)^2 - \left( D_\sigma \frac{p_S}{|p_S|},\ D_\sigma \frac{p_S}{|p_S|} \right) - \frac{(p_S, \partial_\sigma x_S)^2}{(p_S, p_S)} \right) + \ldots \right].$$

This coincides with the result (16) for $E$ which we know from the perturbation theory. We see that up to the terms of the order $\frac{1}{|p_S|^3}$ the Hamiltonian of $U(1)_L$ can be represented as a sum of the first two commuting local charges. We conjecture that $U(1)_L$ is in fact an infinite combination of the local conserved charges. The perturbation theory construction suggests that it should be a worldsheet parity-invariant combination. The coefficients of this linear combination can be found from considering the conserved charges of particular solutions. There is a special class of fast moving strings, the so-called "rigid" strings. For these "rigid" strings, the corresponding field theory operators are known a priori. These operators provide local extrema of the anomalous dimension in the sector with the given charges. These "rigid" solutions were classified in [10, 27]. They are related to the solutions of the Neumann integrable system. The local conserved charges of some rigid strings were computed in [21, 22]. In [21] the local conserved charges are denoted $E_k$. (This agrees with our notation $E$ for the Hamiltonian of $U(1)_L$.) The precise definition of $E_k$ is given in Sect. 3 of [21]. The relation to our notations is: $Q^{[1]} - \tilde Q^{[1]} = 2E_2$, $Q^{[2]} + 2Q^{[1]} - \tilde Q^{[2]} - 2\tilde Q^{[1]} = -4E_4$. The conserved charges have the following structure:

$$E_n = \delta_{2,n}\, \mathcal{J} + \frac{\epsilon_n^{(1)}}{\mathcal{J}} + \frac{\epsilon_n^{(2)}}{\mathcal{J}^3} + \frac{\epsilon_n^{(3)}}{\mathcal{J}^5} + \ldots, \tag{34}$$

where $\mathcal{J}^{-2} = \lambda/J^2$, and $J$ is a particular combination of the $SO(6)$ momenta. The coefficients $\epsilon_n^{(m)}$ depend on what kind of a rigid string is considered (the ratio of spins). But the authors of [21] noticed that the coefficients $\epsilon_n^{(m)}$ for different values of $n$ are not independent. For all the solutions they considered, they find that:

$$E_{10} + \frac{74}{7} E_8 + \frac{1898}{35} E_6 + \frac{6922}{35} E_4 + \frac{32768}{35}\left( E_2 - \mathcal{J} \right) \sim \frac{1}{\mathcal{J}^9}. \tag{35}$$

This means that up to the terms of the order $1/|p_S|^9$ we should have:

$$\mathcal{J} = E_2 + \frac{6922}{32768} E_4 + \frac{1898}{32768} E_6 + \frac{370}{32768} E_8 + \frac{35}{32768} E_{10} + \ldots. \tag{36}$$

At first this formula looks rather strange, because it seems to imply that a certain combination of Pohlmeyer charges (which all commute with SO(6)) is equal to some component of the angular momentum (which transforms in the adjoint of SO(6)). We propose the following resolution of this puzzle. The right-hand side of (36) is actually the action variable, which for a particular class of the solutions considered in [21, 22] happens to be equal to the SO(6) charge J (because these particular solutions correspond to the chiral operators on the field theory side; see Sect. 2 of [11]). In other words, for this particular class of solutions the angular momentum J should be equal to our action variable E. The general formula should have on the left-hand side E, the generator of U (1)L , instead of J :

$$E = E_2 + \frac{6922}{32768} E_4 + \frac{1898}{32768} E_6 + \frac{370}{32768} E_8 + \frac{35}{32768} E_{10} + \ldots. \tag{37}$$

This gives the expansion of the generating function of $U(1)_L$ to the order $\frac{1}{|p_S|^9}$. It would be interesting to check explicitly, beyond the order $1/|p_S|$, that this Hamiltonian generates periodic trajectories.
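The rational coefficients in this subsection can be checked with exact arithmetic. The sketch below makes two assumptions about how the displayed formulas are read: the expansions (32) and (33) are encoded as coefficient lists over the four structures $|p_S|$, $(\partial_\sigma x_S)^2/|p_S|$, $(p_S, \partial_\sigma x_S)^2/|p_S|^3$ and $(D_\sigma \frac{p_S}{|p_S|}, D_\sigma \frac{p_S}{|p_S|})/|p_S|$, and the two-charge combination is taken to be $\frac{1}{16}[7(Q^{[1]} - \tilde Q^{[1]}) - \frac{1}{2}(Q^{[2]} - \tilde Q^{[2]})]$. The variable names are illustrative.

```python
from fractions import Fraction as Fr

# Coefficients of [ |p|, x'^2/|p|, (p,x')^2/|p|^3, (Dn,Dn)/|p| ].
A = [Fr(2), Fr(1), Fr(-1), Fr(0)]            # Q1 - Q1~, as read from Eq. (33)
B_half = [Fr(-2), Fr(3), Fr(-3), Fr(4)]      # (1/2)(Q2 - Q2~), as read from Eq. (32)
E_target = [Fr(1), Fr(1, 4), Fr(-1, 4), Fr(-1, 4)]   # E, as read from Eq. (16)

# Two-charge combination (1/16) [ 7 (Q1 - Q1~) - (1/2)(Q2 - Q2~) ].
combo = [Fr(1, 16) * (7 * a - b) for a, b in zip(A, B_half)]

# Eq. (35): coefficients of E_10, E_8, E_6, E_4 and of (E_2 - J).
eq35 = {10: Fr(1), 8: Fr(74, 7), 6: Fr(1898, 35), 4: Fr(6922, 35), 2: Fr(32768, 35)}
# Dividing by the coefficient of (E_2 - J) gives the coefficients in Eqs. (36), (37).
eq36 = {n: c / eq35[2] for n, c in eq35.items()}

print(combo == E_target)
print(eq36)
```

Exact fractions are used instead of floats so that the comparison with the coefficients printed in (36) and (37) is an equality rather than an approximate match.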

4.6. More on the perturbation theory. Here we want to present a slightly different and perhaps simpler way of thinking about the continuation of $U(1)_L$ in the perturbation theory. Consider the Hamiltonian vector field $\xi_{E_2}$ corresponding to the first Pohlmeyer charge $E_2$. Consider the canonical transformation

$$F = e^{2\pi \xi_{E_2}}. \tag{38}$$

This canonical transformation is the Hamiltonian flow generated by $E_2$ by the time $2\pi$. The trajectories of $E_2$ are almost periodic in the null-surface limit, therefore we can write

$$F = e^{v_1}, \tag{39}$$

where $v_1$ is a vector field of the order $1/|p_S|^2$. This vector field can be constructed in the following way. Let us choose the conformal gauge on the worldsheet. We know from (33) that $E_2 = \int d\sigma\, |p_S| + f = K + f$, where $f$ is of the order $1/|p_S|$. Taking into account that $e^{2\pi \xi_K} = 1$ we get

$$\begin{aligned} F &= 1 + \int_0^{2\pi} ds\; e^{-s\xi_K} \xi_f\, e^{s\xi_K} + \int_{s_1 < s_2} ds_2\, ds_1\; e^{-s_2 \xi_K} \xi_f\, e^{s_2 \xi_K}\, e^{-s_1 \xi_K} \xi_f\, e^{s_1 \xi_K} + \ldots \\ &= \exp\left[ \int_0^{2\pi} ds\; e^{-s\xi_K} \xi_f\, e^{s\xi_K} + \frac{1}{2} \int_{s_1 < s_2} ds_1\, ds_2\; \left[ e^{-s_2 \xi_K} \xi_f\, e^{s_2 \xi_K},\ e^{-s_1 \xi_K} \xi_f\, e^{s_1 \xi_K} \right] + \ldots \right]. \end{aligned} \tag{40}$$

This defines

$$v_1 = \int_0^{2\pi} ds\; e^{-s\xi_K} \xi_f\, e^{s\xi_K} + \frac{1}{2} \int_{s_1 < s_2} ds_1\, ds_2\; \left[ e^{-s_2 \xi_K} \xi_f\, e^{s_2 \xi_K},\ e^{-s_1 \xi_K} \xi_f\, e^{s_1 \xi_K} \right] + \ldots$$

in the perturbation theory. (Notice that $f$ can be decomposed in the Fourier series $f = \sum_k f_k$ so that $\{K, f_k\} = ik f_k$ and then the leading term of $v_1$ is the zero mode $\xi_{f_0}$; this is the "averaging" procedure of [11].) The vector field $v_1$ defines the vector field on the moduli space of null-surfaces as a limit $\lim_{|p_S| \to \infty} (E_2^2\, v_1) = \lim_{|p_S| \to \infty} (L^2 v_1)$ (where $L$ was defined in Sect. 2.3). This vector field determines the slow evolution of the null-surface; it is the Hamiltonian vector field of the Landau-Lifshitz model on the moduli space of the null-surfaces modulo $U(1)_L$ [11, 9].

As in [21] we can consider the improved currents $\mathcal{E}_{2n}$. By definition $\mathcal{E}_{2n}$ is a linear combination of $E_2, \ldots, E_{2n}$ such that $\mathcal{E}_{2n} = O(|p_S|^{-2n+3})$. The Hamiltonian of the Landau-Lifshitz model is the null-surface limit of $\mathcal{E}_4$, more precisely $\lim_{|p_S| \to \infty} (E_2\, \mathcal{E}_4)$. Given that $E_2$ and $\mathcal{E}_4$ are in involution, this implies that for some $a_1$ we have

$$F_1 = e^{2\pi (\xi_{E_2} + a_1 \xi_{\mathcal{E}_4})} = e^{v_2}, \tag{41}$$

where $v_2$ is of the order $1/|p_S|^4$. Again, $\lim_{|p_S| \to \infty} (E_2^4\, v_2)$ determines a vector field on the moduli space of null-surfaces. (It can be also defined as $\lim_{|p_S| \to \infty} (L^4 v_2)$.) This vector field commutes with the time evolution of the Landau-Lifshitz model. We conjecture that this vector field is generated by the second conservation law of the Landau-Lifshitz model, which is proportional to the null-surface limit⁵ of $\mathcal{E}_6$, more precisely $\lim_{|p_S| \to \infty} (E_2^3\, \mathcal{E}_6)$ or $\lim_{|p_S| \to \infty} (L^3 \mathcal{E}_6)$. Repeating this procedure we get

$$e^{2\pi (\xi_{E_2} + a_1 \xi_{\mathcal{E}_4} + a_2 \xi_{\mathcal{E}_6} + \ldots)} = 1$$

in the perturbation theory. These arguments lead us to the following conclusion. First, we see once again that there is a linear combination E2 +a1 E4 +a2 E6 +. . . generating periodic trajectories. Second, the moduli space of null-surfaces in AdS5 ×S 5 modulo U (1)L is naturally equipped with the infinite tower of Hamiltonians in involution which are the nullsurface limit of the Pohlmeyer charges. This is the generalized Landau-Lifshitz model. 5. Conclusion Given a manifold with the metric of the Lorentzian signature it is possible to construct the extremal surfaces in this manifold as perturbations of the null-surfaces. In the special case when the manifold is AdS5 × S 5 the AdS/CFT correspondence predicts that the extremal surfaces (which are the same as classical string worldsheets) correspond to the states of the large R-charge in the N = 4 supersymmetric Yang-Mills theory. From this point of view considering the extremal surface as a perturbation around the null-surface corresponds to considering the state of the interacting Yang-Mills theory as a perturbation of the state of the free Yang-Mills theory. This correspondence has the following important features: 1. Locality. In the planar limit (the limit of infinitely many colors) the Yang-Mills perturbation theory is local in the following sense: the Feynman diagrams involve only interactions of those elementary field operators which stand next to each other in the product under the trace. We expect that the correspondence between the parton chains and the string worldsheets is local in each order of the perturbation theory, and therefore the locality of the planar Yang-Mills perturbation theory should correspond to the locality of the string worldsheet theory. 2. Integrability. The classical string worldsheet theory in AdS5 × S 5 is an integrable system. Because of the integrability, there is an infinite family of local conserved charges in involution. 
Footnote 5: Notice that the null-surface limit of $E_{2n}$ is invariant under the $U(1)_L$ symmetry of the null-surfaces, because the $U(1)_L$ symmetry of the null-surfaces is generated by the conserved quantity $E_2$ which commutes with $E_{2n}$.

In this paper we have argued that an infinite linear combination of these local charges generates periodic trajectories on the string phase space. This statement can be verified order by order in the null-surface perturbation theory, and it is local at each order. This means that the “slow evolution” of nearly-degenerate extremal surfaces


[11] is essentially controlled by the Pohlmeyer charges (we will further discuss the slow evolution and how it is related to the Pohlmeyer charges in the second paper of [28]). It would be interesting to further study the null-surface perturbation theory from the point of view of the integrability. It would be especially interesting to study those manifestations of the integrability which are local. The Bäcklund transformations [20] are one example. They allow us to construct a new extremal surface from a given extremal surface, and in the null-surface perturbation theory these transformations are well-defined and local at each order. The Bäcklund transformations are closely related to the local conserved charges, and in fact the hidden symmetry $U(1)_L$ can be considered as a consequence of the special properties of these transformations. We will discuss the relation between $U(1)_L$ and the Bäcklund transformations in the third paper of [28].

The general problem is to study those aspects of the integrability which are local in the null-surface perturbation theory. (Without a reference to the null-surface perturbation theory, we would define the locality as some sort of an independence of the choice of the boundary conditions.) This problem arises also on the field theory side. The Feynman diagrams in the planar limit are local, but we usually compute the anomalous dimension of the single-trace operators which requires summing over the whole parton chain. The spectrum of single-trace operators at large N is certainly an invariant of the theory, but it is non-local. If it is true that the planar N = 4 Yang-Mills theory is integrable, it would be important to understand the integrability as much as possible in terms of the local properties of the parton chain (perhaps on the level of the individual Feynman diagrams).
We have defined the $U(1)_L$ strictly speaking in the perturbation theory, but it should be actually well-defined in the domain of the string phase space where the velocity of the string is large enough. In other words, the series defining the $U(1)_L$ in fact converges if the string moves fast enough. It would be interesting to study the global properties of $U(1)_L$.

An important question is what happens to $U(1)_L$ after the quantization. To answer this question we should first include fermions. Important steps in this direction were made recently in [29–31].

It would be interesting to understand better why the “length” is conserved on the field theory side. (Why is there a quantum number L with a well-defined classical limit?) To what extent is the conservation of L related to the integrability of the planar Yang-Mills theory? What happens to L when we turn on the fermions?

Null-surfaces are obviously an important ingredient in our construction. The correspondence between the null-surfaces and the “engineering” operators in the free field theory is rather straightforward; the null-surfaces in $AdS_5 \times S^5$ appear very naturally in the description of the coherent states of the free theory [8, 9]. Is there any way to see directly on the field theory side that turning on the Yang-Mills interaction corresponds to the deformation of the null-surface into the extremal surface?

Acknowledgements. I would like to thank S. Frolov, A. Gorsky, V. Kaloshin, A. Kapustin, A. Marshakov, S. Minwalla, M. Van Raamsdonk, A. Starinets, K. Zarembo and especially A. Tseytlin for discussions, and G. Arutyunov for a correspondence on the local conserved charges. I want to thank A. Tseytlin for comments on the text. I want to thank the organizers of the String Field Theory Camp at the Banff International Research Station for their hospitality while this work was in progress. This research was supported by the Sherman Fairchild Fellowship and in part by the RFBR Grant No. 03-02-17373 and in part by the Russian Grant for the support of the scientific schools NSh-1999.2003.2.

References

1. Frolov, S., Tseytlin, A.A.: Semiclassical quantization of rotating superstring in AdS5 × S5. JHEP 0206, 007 (2002)
2. Tseytlin, A.A.: Semiclassical quantization of superstrings: AdS5 × S5 and beyond. Int. J. Mod. Phys. A18, 981 (2003)
3. Russo, J.G.: Anomalous dimensions in gauge theories from rotating strings in AdS5 × S5. JHEP 0206, 038 (2002)
4. Minahan, J.A., Zarembo, K.: The Bethe-Ansatz for N=4 Super Yang-Mills. JHEP 0303, 013 (2003)
5. Frolov, S., Tseytlin, A.A.: Multi-spin string solutions in AdS5 × S5. Nucl. Phys. B668, 77–110 (2003)
6. Frolov, S., Tseytlin, A.A.: Quantizing three-spin string solution in AdS5 × S5. JHEP 0307, 016 (2003)
7. Frolov, S.A., Park, I.Y., Tseytlin, A.A.: On one-loop correction to energy of spinning strings in S5. Phys. Rev. D71, 026006 (2005)
8. Kruczenski, M.: Spin chains and string theory. Phys. Rev. Lett. 93, 161602 (2004)
9. Mikhailov, A.: Supersymmetric null-surfaces. JHEP 0409, 068 (2004)
10. Mikhailov, A.: Speeding Strings. JHEP 0312, 058 (2003)
11. Mikhailov, A.: Slow evolution of nearly-degenerate extremal surfaces. J. Geom. Phys. 54, 228–250 (2005)
12. Kruczenski, M., Tseytlin, A.: Semiclassical relativistic strings in S5 and long coherent operators in N=4 SYM theory. JHEP 0409, 038 (2004)
13. De Vega, H.J., Nicolaidis, A.: Strings in strong gravitational fields. Phys. Lett. B295, 214–218 (1992); de Vega, H.J., Giannakis, I., Nicolaidis, A.: String Quantization in Curved Spacetimes: Null String Approach. Mod. Phys. Lett. A10, 2479–2484 (1995)
14. Mandal, G., Suryanarayana, N.V., Wadia, S.R.: Aspects of Semiclassical Strings in AdS5. Phys. Lett. B543, 81 (2002)
15. Bena, I., Polchinski, J., Roiban, R.: Hidden Symmetries of the AdS5 × S5 Superstring. Phys. Rev. D69, 046002 (2004)
16. Alday, L.F.: Non-local charges on AdS5 × S5 and PP-waves. JHEP 0312, 033 (2003)
17. Swanson, I.: On the Integrability of String Theory in AdS5 × S5. http://arxiv.org/list/hep-th/0405172, 2004
18. Swanson, I.: Quantum string integrability and AdS/CFT. Nucl. Phys. B709, 443–464 (2005)
19. Arnold, V.I.: Mathematical methods of classical mechanics. New York: Springer-Verlag, 1989
20. Pohlmeyer, K.: Integrable Hamiltonian Systems and Interactions through Quadratic Constraints. Commun. Math. Phys. 46, 207–221 (1976)
21. Arutyunov, G., Staudacher, M.: Matching Higher Conserved Charges for Strings and Spins. JHEP 0403, 004 (2004)
22. Engquist, J.: Higher Conserved Charges and Integrability for Spinning Strings in AdS5 × S5. JHEP 0404, 002 (2004)
23. Kazakov, V.A., Marshakov, A., Minahan, J.A., Zarembo, K.: Classical/quantum integrability in AdS/CFT. JHEP 0405, 024 (2004)
24. Beisert, N.: The su(2|3) Dynamic Spin Chain. Nucl. Phys. B682, 487–520 (2004)
25. Kruczenski, M., Ryzhov, A.V., Tseytlin, A.A.: Large spin limit of AdS5 × S5 string theory and low energy expansion of ferromagnetic spin chains. Nucl. Phys. B692, 3–49 (2004)
26. Minahan, J.: Higher Loops Beyond the SU(2) Sector, hep-th/0405243
27. Arutyunov, G., Russo, J., Tseytlin, A.A.: Spinning strings in AdS5 × S5: new integrable system relations, hep-th/0311004
28. Mikhailov, A.: Plane wave limit of local conserved charges, hep-th/0502097; Anomalous dimension and local charges, hep-th/0411178; An action variable of the sine-Gordon model, hep-th/0504035
29. Arutyunov, G., Frolov, S.: Integrable Hamiltonian for Classical Strings on AdS5 × S5. JHEP 0502, 059 (2005)
30. Beisert, N., Kazakov, V.A., Sakai, K., Zarembo, K.: The Algebraic Curve of Classical Superstrings on AdS5 × S5, hep-th/0502226
31. Alday, L.F., Arutyunov, G., Tseytlin, A.A.: On Integrability of Classical SuperStrings in AdS5 × S5, hep-th/0502240

Communicated by G.W. Gibbons

Commun. Math. Phys. 264, 705–724 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1522-y



Global Solutions to the Cauchy Problem for the Relativistic Boltzmann Equation with Near–Vacuum Data

Robert T. Glassey

Department of Mathematics, Indiana University, Bloomington, IN 47405–7106, USA. E-mail: [email protected]

Received: 14 May 2005 / Accepted: 9 August 2005
Published online: 1 March 2006 – © Springer-Verlag 2006

Abstract: The Cauchy Problem for the relativistic Boltzmann equation is studied with small (i.e., near–vacuum) data. For an appropriate class of scattering cross sections, global “mild” solutions are obtained.

1. Introduction

We study the Cauchy Problem for the relativistic Boltzmann equation

$$\partial_t F + \hat v \cdot \nabla_x F = Q(F, F), \qquad F(0, x, v) = F_0(x, v) \qquad (x, v \in \mathbb{R}^3,\ t > 0) \tag{RB}$$

for “small” data $F_0$, i.e., the near–vacuum situation. We normalize the speed of light c and the particle mass m to unity. Then $\hat v$, the relativistic velocity, is defined in terms of the momenta $v \in \mathbb{R}^3$ by

$$\hat v = \frac{v}{v_0}; \qquad v_0 \equiv \sqrt{1 + |v|^2} \tag{1.1}$$

and thus $|\hat v| < 1$ for all v. There is of course a vast literature on solutions to this equation, but most of it concerns the classical (nonrelativistic) formulation. General references include [6, 9, 12, 18, 22, 30, 31, 41, 42]. In contrast, the relativistic version studied here has not received comparable attention. Background information on this relativistic case may be found in [10, 13, 39]. A sampling of the many relevant papers is given in the references; other books and surveys contain a much more thorough listing, see e.g., [11, 18 and 42]. For the classical equation in all space (or with periodic boundary conditions) one has global existence of smooth solutions near the vacuum ([28, 11, 23, 36]), near the

Supported in part by NSF DMS 0204227


equilibrium ([32, 11, 29, 38, 43, 24 and 18]) as well as global weak solutions ([14]). (The papers of Guo cited here apply to a more general situation in which a nonlinear Vlasov–type force term may also be included.) Moreover the “nearly homogeneous” case has also been successfully treated, see [11] for references and details. It has been known for some time that the near–equilibrium relativistic Boltzmann equation also admits global smooth solutions ([20 and 21]). For this relativistic situation details of the collisions are studied in [19], a regularizing property is presented in [1] and weak solutions are treated in [15].

There are numerous examples in Kinetic Theory which suggest that the relativistic case can be quite challenging. One such is the Cauchy Problem for the Vlasov–Poisson system. The classical version possesses global smooth solutions for smooth large data ([34, 37]), but the relativistic problem remains open. The classical near–vacuum problem was solved in the hard–sphere case via a beautiful trick used in [28] (essentially Galilean invariance: $|x - tv'|^2 + |x - tu'|^2 = |x - tv|^2 + |x - tu|^2$); cf. also [40]. It allows one to eliminate in many situations the dependence in estimates on the post–collisional velocities $u'$, $v'$. That device does not work in the present relativistic case and we are forced to proceed otherwise.

In the kernel of the collision operator Q(F, F) appears the scattering cross section σ. The specific possible forms of σ do not seem to be widely disseminated in the literature (see however [10]). In appropriate coordinates it may depend on the “relative momentum” and the scattering angle (both to be defined directly). The hypotheses we impose on this kernel will be given in (H0) below after the relevant quantities are defined. Once the requisite quadratic estimates are achieved on the collision term, the nonnegativity of the solution follows essentially from the Illner–Shinbrot iteration [28].
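The classical identity quoted above can be checked numerically. The following is a minimal sketch, assuming the standard classical hard-sphere collision rule $u' = u + ((v-u)\cdot\omega)\omega$, $v' = v - ((v-u)\cdot\omega)\omega$, which the text does not write out; the helper names `axpy`, `classical_collision` and `galilean_defect` are hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def axpy(c, a, b):
    """Return c*a + b for 3-vectors stored as tuples."""
    return tuple(c * x + y for x, y in zip(a, b))


def random_unit_vector(rng):
    w = [rng.gauss(0.0, 1.0) for _ in range(3)]
    n = math.sqrt(dot(w, w))
    return tuple(x / n for x in w)


def classical_collision(u, v, omega):
    # Assumed classical hard-sphere rule (not stated explicitly in the text):
    # u' = u + ((v - u).omega) omega,  v' = v - ((v - u).omega) omega
    k = dot(omega, axpy(-1.0, u, v))
    return axpy(k, omega, u), axpy(-k, omega, v)


def galilean_defect(x, t, u, v, omega):
    """|x - t v'|^2 + |x - t u'|^2 - |x - t v|^2 - |x - t u|^2 (should vanish)."""
    u_post, v_post = classical_collision(u, v, omega)

    def dist2(p):
        r = axpy(-t, p, x)  # x - t p
        return dot(r, r)

    return dist2(v_post) + dist2(u_post) - dist2(v) - dist2(u)
```

The defect vanishes because the classical rule conserves both $u + v$ and $|u|^2 + |v|^2$; in the relativistic setting the conserved energy is $\sqrt{1+|u|^2} + \sqrt{1+|v|^2}$ instead, which is why the same cancellation is not available.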
A “mild” solution to the initial–value problem is a continuous function satisfying the time–integrated form of (RB); see Theorem 2 below.

2. Notation and Details of the Collision Operator

As is standard, we abbreviate F(t, x, u) by F(u), etc., and use primes to represent the results of collisions. The conservation laws for momentum and energy are

$$u' + v' = u + v \equiv m, \tag{2.1}$$

$$\sqrt{1 + |u'|^2} + \sqrt{1 + |v'|^2} = \sqrt{1 + |u|^2} + \sqrt{1 + |v|^2} \equiv e, \tag{2.2}$$

for $u, v \in \mathbb{R}^3$. The scattering angle θ is defined as follows: given two 4–vectors $U = (u_0, u_1, u_2, u_3)$, $V \equiv (v_0, v_1, v_2, v_3)$, we set $U \cdot V = u_0 v_0 - \sum_{k=1}^{3} u_k v_k$ (which is the Lorentz inner product). Then the angle θ is given by

$$\cos\theta = \frac{(V - U) \cdot (V' - U')}{(V - U) \cdot (V - U)}. \tag{2.3}$$

$u'$, $v'$ can be explicitly calculated [19]. In terms of a unit vector ω we have

$$v' = v - a(u, v, \omega)\,\omega, \qquad u' = u + a(u, v, \omega)\,\omega, \tag{2.4}$$

where

$$a(u, v, \omega) = \frac{2 e\, u_0 v_0\, \omega \cdot (\hat v - \hat u)}{e^2 - (\omega \cdot m)^2}. \tag{2.5}$$
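As a sanity check on (2.4) and (2.5), the following sketch computes the post-collisional momenta and lets one confirm numerically that the conservation laws (2.1) and (2.2) hold. It is only an illustration of the formulas above; the function names `energy` and `relativistic_collision` are hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def energy(p):
    """p_0 = sqrt(1 + |p|^2), with c = m = 1 as in the text."""
    return math.sqrt(1.0 + dot(p, p))


def relativistic_collision(u, v, omega):
    """Return (u', v', a) from (2.4)-(2.5) for momenta u, v and unit vector omega."""
    u0, v0 = energy(u), energy(v)
    e = u0 + v0                                   # total energy, (2.2)
    m = tuple(ui + vi for ui, vi in zip(u, v))    # total momentum, (2.1)
    w = dot(omega, v) / v0 - dot(omega, u) / u0   # omega . (v_hat - u_hat)
    a = 2.0 * e * u0 * v0 * w / (e * e - dot(omega, m) ** 2)
    u_post = tuple(ui + a * oi for ui, oi in zip(u, omega))
    v_post = tuple(vi - a * oi for vi, oi in zip(v, omega))
    return u_post, v_post, a
```

The denominator is safe because $e^2 - (\omega\cdot m)^2 \ge e^2 - |m|^2 = s \ge 4$.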


Other relevant quantities which will appear below now follow. We define

$$s = (U + V)^2 = (u_0 + v_0)^2 - |u + v|^2 = 2(u_0 v_0 - u \cdot v + 1) = e^2 - |m|^2, \tag{2.6}$$

$$4g^2 = -(U - V)^2 = -(u_0 - v_0)^2 + |u - v|^2 = 2(u_0 v_0 - u \cdot v - 1) = s - 4. \tag{2.7}$$

Furthermore, we define the Møller velocity as the scalar $v_M$ given by

$$v_M^2 = |\hat v - \hat u|^2 - |\hat v \times \hat u|^2 = \frac{s(s - 4)}{4 v_0^2 u_0^2}$$

or

$$v_M = \frac{2g\sqrt{1 + g^2}}{v_0 u_0}. \tag{2.8}$$
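The chain of equalities in (2.6) to (2.8) is easy to confirm numerically. The sketch below evaluates each side independently for given momenta; the function name `invariants` and the dictionary keys are hypothetical labels chosen for this illustration.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def energy(p):
    return math.sqrt(1.0 + dot(p, p))


def invariants(u, v):
    """Evaluate both sides of (2.6), (2.7) and the Moller velocity identity."""
    u0, v0 = energy(u), energy(v)
    x = u0 * v0 - dot(u, v)
    s = 2.0 * (x + 1.0)
    four_g2 = 2.0 * (x - 1.0)
    total = tuple(a + b for a, b in zip(u, v))
    diff = tuple(a - b for a, b in zip(u, v))
    s_alt = (u0 + v0) ** 2 - dot(total, total)
    four_g2_alt = -(u0 - v0) ** 2 + dot(diff, diff)
    u_hat = tuple(a / u0 for a in u)
    v_hat = tuple(a / v0 for a in v)
    d = tuple(a - b for a, b in zip(v_hat, u_hat))
    c = cross(v_hat, u_hat)
    vm2 = dot(d, d) - dot(c, c)                       # |v^-u^|^2 - |v^ x u^|^2
    vm2_alt = s * (s - 4.0) / (4.0 * v0 ** 2 * u0 ** 2)
    g2 = four_g2 / 4.0
    vm2_from_g = 4.0 * g2 * (1.0 + g2) / (v0 * u0) ** 2   # square of (2.8)
    return {"s": s, "s_alt": s_alt, "four_g2": four_g2,
            "four_g2_alt": four_g2_alt, "vm2": vm2,
            "vm2_alt": vm2_alt, "vm2_from_g": vm2_from_g}
```

Squares are compared rather than $v_M$ itself so that rounding near $u = v$ (where $g$ is close to zero) never requires the square root of a slightly negative number.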

The last equality is established in [20]. There are several representations of the collision operator Q; we will use that in Appendix II of [20]. Using that formulation we can write the collision operator as

$$Q(F, F)(t, x, v) = \int_{\mathbb{R}^3} \int_{S^2_+} q(u, v, \omega)\,[F(t, x, v')F(t, x, u') - F(t, x, u)F(t, x, v)]\, d\omega\, du = \text{“gain} - \text{loss”} = Q_g(F, F) - Q_\ell(F, F),$$

where

$$S^2_+ = \{\omega \in S^2 : \omega \cdot \hat v \ge \omega \cdot \hat u\}.$$

As shown in [20], the scattering kernel q admits the representation

$$q(u, v, \omega) = \frac{4 s \sigma e^2\, |\omega \cdot (\hat v - \hat u)|}{(e^2 - (\omega \cdot m)^2)^2} \tag{2.9}$$

in which the scattering cross section σ appears. We now state the hypotheses on σ:

(H0) Hypothesis on the Scattering Cross Section σ. Let ω be a unit vector and $u, v \in \mathbb{R}^3$. Let 0 < δ < 1 and denote by $p_u$ the vector $\hat v \times u$. σ = σ(u, v, ω) is to be nonnegative, continuous and satisfy

$$\sigma(u, v, \omega) \le \frac{|\omega \cdot p_u|\, \tilde\sigma(\omega)}{g\,(1 + g^2)^{\frac{1}{2} + \delta}},$$

where $\tilde\sigma(\omega)$ is also nonnegative, bounded and continuous and satisfies for some constant c and for every $0 \ne z \in \mathbb{R}^3$,

$$\int_{|\omega| = 1} \frac{\tilde\sigma(\omega)\, d\omega}{1 + |\omega \cdot z|} \le c|z|^{-1}.$$


The major theorem is as follows:

Theorem 1. Let σ satisfy the Hypothesis (H0) and let δ be as in (H0). Consider (RB) with initial value $F_0(x, v)$ satisfying $0 \le F_0(x, v) \in C^0(\mathbb{R}^6)$ as well as

$$\exp(v_0)\,(1 + |x \times v|^2)^{\frac{1+\delta}{2}}\, F_0(x, v) \le c_0.$$

Then there exists a positive number ε with the property that if $c_0 \le \varepsilon$, a uniquely determined nonnegative global solution F(t, x, v) to the mild form of the Cauchy problem for (RB) exists. This solution satisfies the estimate

$$F(t, x, v) \le c \exp(-v_0)\,(1 + |x \times v|^2)^{-\frac{1+\delta}{2}}$$

for some constant c depending only on δ and the data.

The solution F need not be integrable in x. This will be addressed later using the causality present in (RB) as well as an additional assumption that the initial value $F_0(x, v)$ have compact support in x. The exponential decay of $F_0$ in v can be easily weakened to algebraic decay at the rate $v_0^{-w}$ for sufficiently large w > 0. This is because of the inequality

$$v_0' u_0' \ge c\, v_0 \qquad (c > 0)$$

which is known from [20]; see also [7].

Discussion of (H0). A sufficient condition for the integral condition in (H0) to hold is the following. Assume that $\tilde\sigma(\omega) \le |\omega_3|^\mu$ for some µ > 0 and choose spherical coordinates with the polar axis directed along z. Denote by φ the polar angle. Then we have

$$\int_{S^2_+} \frac{\tilde\sigma(\omega)\, d\omega}{1 + |\omega \cdot z|} \le \int_{|\omega| = 1} \frac{|\omega_3|^\mu\, d\omega}{1 + |\omega \cdot z|} = 2 \cdot 2\pi \int_0^{\pi/2} \frac{\sin\varphi\, (\cos\varphi)^\mu\, d\varphi}{1 + |z| \cos\varphi} \le c|z|^{-1}$$

which is the desired condition. An assumption in (H0) on the decay in g is “natural” because the kernel q will be seen in Lemma 2.2 below to grow in g at a rate no greater than $g(1 + g^2)^{\frac{1}{2}}$. We do not know if the factor $\omega \cdot p_u$ appearing in (H0) represents a physically realistic situation. This assumption, while not entirely satisfactory and probably method–related, is needed in the estimates on the gain term. It can be shown that the imposition of this condition is not required to deal with the loss term, in which case we use part ii) of Lemma 2.2 below.

We now begin the sequence of estimates with a study of the function a(u, v, ω).

Lemma 2.1. The function a(u, v, ω) and the expression $\omega \cdot (\hat v - \hat u)$ satisfy the estimates

i) $\dfrac{2 u_0 v_0\, |\omega \cdot (\hat v - \hat u)|}{e} \le |a(u, v, \omega)| \le \hat g\, e = e\, \dfrac{g}{\sqrt{1 + g^2}}$,

ii) $v_M \sin\dfrac{\theta}{2} \le |\omega \cdot (\hat v - \hat u)| \le \dfrac{e\, v_M \sin\frac{\theta}{2}}{2\sqrt{1 + g^2}}$.

Proof. The lower bound in i) is trivial: in the denominator of the definition of a we simply use $e^2 - (\omega \cdot m)^2 \le e^2$. For the upper bound in i) we begin with the definition of the scattering angle

$$\cos\theta = \frac{(v_0 - u_0)(v_0' - u_0') - (v - u) \cdot (v' - u')}{(v_0 - u_0)^2 - |v - u|^2}. \tag{2.10}$$



We expand the denominator to get

$$(v_0 - u_0)^2 - |v - u|^2 = v_0^2 + u_0^2 - 2u_0 v_0 - |v|^2 - |u|^2 + 2u \cdot v = 2 - 2u_0 v_0 + 2u \cdot v = -4g^2. \tag{2.11}$$

For the numerator we have by definition $(v - u) \cdot (v' - u') = (v - u) \cdot (v - u - 2a\omega)$ and

$$v_0' - u_0' = \frac{(v_0')^2 - (u_0')^2}{e} = \frac{|v'|^2 - |u'|^2}{e} = \frac{v_0^2 - u_0^2 - 2a\,\omega \cdot m}{e}. \tag{2.12}$$

Therefore

$$(v_0 - u_0)(v_0' - u_0') = \frac{(v_0 + u_0)(v_0 - u_0)^2 - 2a(v_0 - u_0)\,\omega \cdot m}{e}$$

so that the numerator becomes

$$(v_0 - u_0)^2 - \frac{2a(v_0 - u_0)\,\omega \cdot m}{e} - |v - u|^2 + 2a\,\omega \cdot (v - u)$$
$$= 2 - 2u_0 v_0 + 2u \cdot v + \frac{2a}{e}\,\omega \cdot [e(v - u) - (v_0 - u_0)m]$$
$$= -4g^2 + \frac{2a}{e}\,\omega \cdot [(v_0 + u_0)(v - u) - (v_0 - u_0)(u + v)].$$

Now performing some elementary algebra and returning to the definition (2.10) we get

$$-4g^2 \cos\theta = -4g^2 + \frac{8u_0^2 v_0^2\, (\omega \cdot (\hat v - \hat u))^2}{e^2 - (\omega \cdot m)^2},$$

and it follows that

$$\sin^2\frac{\theta}{2} = \frac{u_0^2 v_0^2\, (\omega \cdot (\hat v - \hat u))^2}{g^2\,(e^2 - (\omega \cdot m)^2)} = \frac{e^2 - (\omega \cdot m)^2}{4e^2 g^2} \cdot a^2. \tag{2.13}$$

Hence

$$|a| \le \frac{2eg}{\sqrt{e^2 - (\omega \cdot m)^2}} \le \frac{2eg}{\sqrt{e^2 - |m|^2}} = \frac{2eg}{\sqrt{s}} = \frac{eg}{\sqrt{1 + g^2}}$$

as desired. For part ii) we insert the definition of the function a from (2.5) into (2.13) and then solve for $\omega \cdot (\hat v - \hat u)$ to get

$$|\omega \cdot (\hat v - \hat u)| = \frac{g \sin\frac{\theta}{2}\, \sqrt{e^2 - (\omega \cdot m)^2}}{u_0 v_0}.$$


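Thus by the definition of s and the relation (2.8),

$$v_M \sin\frac{\theta}{2} = \frac{\sqrt{s}\, g \sin\frac{\theta}{2}}{u_0 v_0} \le |\omega \cdot (\hat v - \hat u)| \le \frac{e g \sin\frac{\theta}{2}}{u_0 v_0} = \frac{e\, v_M \sin\frac{\theta}{2}}{2\sqrt{1 + g^2}},$$

and this completes the proof.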
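The relation (2.13) and the two-sided bounds of Lemma 2.1 can be probed numerically. The sketch below computes $\cos\theta$ directly from (2.10), using the collision rule (2.4) and (2.5), and returns the quantities needed to compare against (2.13) and the bounds in squared form. The function name `lemma_2_1_data` and its dictionary keys are hypothetical; this is an illustration, not part of the proof.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def energy(p):
    return math.sqrt(1.0 + dot(p, p))


def lemma_2_1_data(u, v, omega):
    """Quantities entering Lemma 2.1 for momenta u, v and unit vector omega."""
    u0, v0 = energy(u), energy(v)
    e = u0 + v0
    m = tuple(a + b for a, b in zip(u, v))
    w = dot(omega, v) / v0 - dot(omega, u) / u0        # omega . (v^ - u^)
    denom = e * e - dot(omega, m) ** 2
    a = 2.0 * e * u0 * v0 * w / denom                  # (2.5)
    up = tuple(x + a * o for x, o in zip(u, omega))    # (2.4)
    vp = tuple(x - a * o for x, o in zip(v, omega))
    g2 = (u0 * v0 - dot(u, v) - 1.0) / 2.0             # g^2 from (2.7)
    dv = tuple(x - y for x, y in zip(v, u))
    dvp = tuple(x - y for x, y in zip(vp, up))
    cos_theta = (((v0 - u0) * (energy(vp) - energy(up)) - dot(dv, dvp))
                 / ((v0 - u0) ** 2 - dot(dv, dv)))     # (2.10)
    return {
        "a": a, "w": w, "e": e, "g2": g2, "u0v0": u0 * v0,
        "sin2_half": (1.0 - cos_theta) / 2.0,
        "sin2_formula": (u0 * v0 * w) ** 2 / (g2 * denom),   # (2.13)
        "vm2": 4.0 * g2 * (1.0 + g2) / (u0 * v0) ** 2,       # (2.8) squared
    }
```

The comparisons are best done on squared quantities, since taking the square root of a rounded value of $\sin^2(\theta/2)$ near zero amplifies floating-point error.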

Although we will not use part ii) of this lemma in this paper, we present it to relate the variation over the sphere (parameterized by ω) to that of the scattering angle θ. The precise relationship between ω and θ is expressed in (2.13). Another reason for our doing so is that our notation is not the same as that in other sources such as [10]. Next we derive estimates on the scattering kernel q.

Lemma 2.2. The function q(u, v, ω) satisfies the following estimates:

i) $|q(u, v, \omega)| \le 8\sqrt{1 + g^2}\, g\, \sigma$,

ii) $|q(u, v, \omega)| \le 4\sigma u_0 v_0\, |\omega \cdot (\hat v - \hat u)|$.

We will not use ii) in this paper; as above, it is included for reference purposes and completeness.

Proof. From Lemma 3.1 of [20] we have the relation

$$\frac{e^2}{1 + g^2} = \frac{|u \times v|^2}{g^2 (1 + g^2)} + \frac{|u - v|^2}{g^2},$$

so that

$$\frac{e^2}{1 + g^2} \le \frac{|u \times v|^2 + |u - v|^2}{g^2}.$$

Also from this same lemma in [20] we have the inequality

$$g \ge \frac{(|u \times v|^2 + |u - v|^2)^{1/2}}{2\sqrt{u_0 v_0}},$$

and hence

$$\frac{e^2}{1 + g^2} \le 4u_0 v_0. \tag{2.14}$$

For the second estimate we begin with the definition of q:

$$|q| = \frac{4s\sigma e^2\, |\omega \cdot (\hat v - \hat u)|}{(e^2 - (\omega \cdot m)^2)^2} \le \frac{4s\sigma e^2\, |\omega \cdot (\hat v - \hat u)|}{(e^2 - |m|^2)^2} = \frac{4\sigma e^2\, |\omega \cdot (\hat v - \hat u)|}{s} = \frac{\sigma e^2\, |\omega \cdot (\hat v - \hat u)|}{1 + g^2} \le 4\sigma u_0 v_0\, |\omega \cdot (\hat v - \hat u)|.$$

To verify the first inequality we write q in terms of a using its definition and then apply Lemma 2.1:

$$|q| = \frac{2s\sigma e |a|}{u_0 v_0\, (e^2 - (\omega \cdot m)^2)} \le \frac{2s\sigma e \cdot \hat g e}{u_0 v_0\, s} = \frac{2\sigma e^2 g}{u_0 v_0 \sqrt{1 + g^2}}.$$

Therefore by (2.14) we get

$$|q| \le \frac{2g\sqrt{1 + g^2}\, \sigma e^2}{u_0 v_0\, (1 + g^2)} \le 8\sigma g\sqrt{1 + g^2}$$

which concludes the proof.
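The two bounds of Lemma 2.2 and the inequality (2.14) can be spot-checked with the representation (2.9). The sketch below takes a constant cross section `sigma` purely for illustration (an assumption; the lemma itself holds pointwise for any nonnegative σ), and the function name `kernel_bounds` is hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def energy(p):
    return math.sqrt(1.0 + dot(p, p))


def kernel_bounds(u, v, omega, sigma=1.0):
    """Return q from (2.9) with the right-hand sides of Lemma 2.2 i), ii) and (2.14)."""
    u0, v0 = energy(u), energy(v)
    e = u0 + v0
    m = tuple(a + b for a, b in zip(u, v))
    x = u0 * v0 - dot(u, v)
    s = 2.0 * (x + 1.0)
    g2 = max((x - 1.0) / 2.0, 0.0)                 # guard against rounding below 0
    w = abs(dot(omega, v) / v0 - dot(omega, u) / u0)
    q = 4.0 * s * sigma * e * e * w / (e * e - dot(omega, m) ** 2) ** 2
    return {
        "q": q,
        "bound_i": 8.0 * math.sqrt(1.0 + g2) * math.sqrt(g2) * sigma,
        "bound_ii": 4.0 * sigma * u0 * v0 * w,
        "lhs_2_14": e * e / (1.0 + g2),
        "rhs_2_14": 4.0 * u0 * v0,
    }
```

Note that (2.14) holds with equality when $u = v$, so any numerical comparison needs a small relative tolerance.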



3. Estimates on the Collision Operator

In (RB),

$$\partial_t F + \hat v \cdot \nabla_x F = Q(F, F),$$

we first make an elementary change of variables to achieve exponential decay in the integration variable. Set $F(t, x, v) = \exp(-v_0) f(t, x, v)$. Then by energy conservation

$$Q(F, F) = \int_{\mathbb{R}^3} \int_{S^2_+} q\,[\exp(-v_0') f(v') \exp(-u_0') f(u') - \exp(-v_0) f(v) \exp(-u_0) f(u)]\, d\omega\, du$$
$$= \exp(-v_0) \int_{\mathbb{R}^3} \int_{S^2_+} \exp(-u_0)\, q\,[f(v') f(u') - f(v) f(u)]\, d\omega\, du.$$

The equation for f is then $\partial_t f + \hat v \cdot \nabla_x f = Q(f, f) = Q_g(f, f) - Q_\ell(f, f)$, where

$$Q_\ell(f, f)(t, x, v) = f(t, x, v) \int_{\mathbb{R}^3} \int_{S^2_+} q(u, v, \omega)\, e^{-u_0} f(t, x, u)\, d\omega\, du \tag{3.1}$$

and

$$Q_g(f, f)(t, x, v) = \int_{\mathbb{R}^3} \int_{S^2_+} q(u, v, \omega)\, e^{-u_0} f(t, x, u') f(t, x, v')\, d\omega\, du. \tag{3.2}$$

We introduce the by now standard notation

$$f^\#(t, x, v) = f(t, x + \hat v t, v). \tag{3.3}$$

Then (RB) can be written as

$$\frac{d}{dt} f^\#(t, x, v) = Q^\#(f, f)(t, x, v). \tag{3.4}$$

Thus $Q^\#(f, f)(t, x, v) = Q_g^\#(f, f)(t, x, v) - Q_\ell^\#(f, f)(t, x, v)$, where

$$Q_g^\#(f, f)(t, x, v) = \int_{\mathbb{R}^3} \int_{S^2_+} q(u, v, \omega)\, e^{-u_0} f(t, x + t\hat v, v') f(t, x + t\hat v, u')\, d\omega\, du$$
$$= \int_{\mathbb{R}^3} \int_{S^2_+} q(u, v, \omega)\, e^{-u_0} f^\#(t, x + t(\hat v - \hat v'), v')\, f^\#(t, x + t(\hat v - \hat u'), u')\, d\omega\, du,$$

$$Q_\ell^\#(f, f)(t, x, v) = f(t, x + t\hat v, v) \int_{\mathbb{R}^3} \int_{S^2_+} q\, e^{-u_0} f(t, x + t\hat v, u)\, d\omega\, du$$
$$= f^\#(t, x, v) \int_{\mathbb{R}^3} \int_{S^2_+} q\, e^{-u_0} f^\#(t, x + t(\hat v - \hat u), u)\, d\omega\, du \equiv f^\#(t, x, v)\, R^\#(f)(t, x, v). \tag{3.5}$$


It is the time integrated form of (3.4) to which we will find a continuous bounded nonnegative solution f in this paper. Because f will be bounded, the original distribution function F will decay exponentially in v. The function spaces in which we will seek the solution are as follows. Let

$$M = \Big\{ f \in C^0\big([0, \infty) \times \mathbb{R}^3 \times \mathbb{R}^3\big) : \|f\| \equiv \sup_{t, x, v}\, (1 + |x \times v|^2)^{\frac{1+\delta}{2}}\, |f(t, x, v)| < \infty \Big\}.$$

We also define

$$X = \Big\{ f \in L^\infty\big([0, \infty) \times \mathbb{R}^3 \times \mathbb{R}^3\big) \Big\}$$

with the same norm. We give a name to the weight function:

$$\rho(x, v) = (1 + |x \times v|^2)^{\frac{1+\delta}{2}}.$$

Turning to the proof of the major result, we begin with a simple calculus lemma.

Lemma 3.1. For vectors $a, b \in \mathbb{R}^3$ with $b \ne 0$ let $\nu = \frac{b}{|b|}$ and

$$I = \int_0^t \frac{d\tau}{(1 + |a + b\tau|^2)^{\frac{1+\delta}{2}}}.$$

Then

$$I \le \frac{c}{|b|\, (1 + |a \times \nu|^2)^{\frac{\delta}{2}}}$$

for some constant c independent of t, a and b.

Proof. Elementary algebra gives us

$$1 + |a + b\tau|^2 = 1 + |a \times \nu|^2 + |b|^2 \Big(\tau + \frac{a \cdot b}{|b|^2}\Big)^2$$

so that

$$I = \int_0^t \frac{d\tau}{\Big(1 + |a \times \nu|^2 + |b|^2 \big(\tau + \frac{a \cdot b}{|b|^2}\big)^2\Big)^{\frac{1+\delta}{2}}} \le \frac{2}{|b|} \int_0^\infty \frac{ds}{(1 + |a \times \nu|^2 + s^2)^{\frac{1+\delta}{2}}} \le \frac{c}{|b|\, (1 + |a \times \nu|^2)^{\frac{\delta}{2}}}$$

as desired.

The next lemma provides the key estimates of the collision operator.
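Lemma 3.1 can be illustrated numerically. The sketch below approximates I by a midpoint rule and compares it with the bound, using the explicit constant $c = 2\int_0^\infty (1 + r^2)^{-\frac{1+\delta}{2}}\, dr = \sqrt{\pi}\, \Gamma(\delta/2) / \Gamma((1+\delta)/2)$. That closed form is not stated in the text; it is the value obtained by rescaling s in the proof, and is used here as an assumption. The function names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def lemma_3_1_integral(a, b, t, delta, n=20000):
    """Midpoint-rule approximation of I = int_0^t (1 + |a + b tau|^2)^(-(1+delta)/2) dtau."""
    h = t / n
    total = 0.0
    for i in range(n):
        tau = (i + 0.5) * h
        r = tuple(ai + tau * bi for ai, bi in zip(a, b))
        total += (1.0 + dot(r, r)) ** (-(1.0 + delta) / 2.0)
    return total * h


def lemma_3_1_bound(a, b, delta):
    """c / (|b| (1 + |a x nu|^2)^(delta/2)) with the explicit constant described above."""
    nb2 = dot(b, b)
    a_cross_nu2 = dot(a, a) - dot(a, b) ** 2 / nb2   # |a x nu|^2 = |a|^2 - (a.nu)^2
    c = math.sqrt(math.pi) * math.gamma(delta / 2.0) / math.gamma((1.0 + delta) / 2.0)
    return c / (math.sqrt(nb2) * (1.0 + a_cross_nu2) ** (delta / 2.0))
```

The bound is uniform in t, which is the point of the lemma: it is what later makes the time integrals in the collision estimates independent of the length of the time interval.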



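Lemma 3.2. Let hypothesis (H0) hold. For any t ≥ 0 and $f^\# \in M$ there is a constant c independent of t, x, v for which

$$\int_0^t |Q_g^\#(f, f)|\, d\tau \le c\rho(x, v)^{-1} \|f^\#\|^2, \qquad \int_0^t |Q_\ell^\#(f, f)|\, d\tau \le c\rho(x, v)^{-1} \|f^\#\|^2.$$

More generally, for $f_1^\#, f_2^\# \in M$, t ≥ 0, define $\tilde Q_g^\#(f_1, f_2)$ and $\tilde Q_\ell^\#(f_1, f_2)$ by

$$\tilde Q_g^\#(f_1, f_2)(t, x, v) = \int_{\mathbb{R}^3} \int_{S^2_+} e^{-u_0} q\, f_1^\#(t, x + t(\hat v - \hat v'), v')\, f_2^\#(t, x + t(\hat v - \hat u'), u')\, d\omega\, du,$$

$$\tilde Q_\ell^\#(f_1, f_2)(t, x, v) = f_1^\#(t, x, v) \int_{\mathbb{R}^3} \int_{S^2_+} q\, e^{-u_0} f_2^\#(t, x + t(\hat v - \hat u), u)\, d\omega\, du.$$

Then

$$\int_0^t |\tilde Q_g^\#(f_1, f_2)|\, d\tau \le c\rho(x, v)^{-1} \|f_1^\#\| \|f_2^\#\|, \qquad \int_0^t |\tilde Q_\ell^\#(f_1, f_2)|\, d\tau \le c\rho(x, v)^{-1} \|f_1^\#\| \|f_2^\#\|.$$

Proof. By definition

$$\int_0^t |Q_\ell^\#(f, f)(\tau, x, v)|\, d\tau = \int_0^t \Big| f^\#(\tau, x, v) \int_{S^2_+} d\omega \int_{\mathbb{R}^3} q\, e^{-u_0} f^\#(\tau, x + \tau(\hat v - \hat u), u)\, du \Big|\, d\tau$$
$$\le \rho(x, v)^{-1} \|f^\#\|^2 \int_{S^2_+} d\omega \int_{\mathbb{R}^3} |q|\, e^{-u_0}\, du \int_0^t \frac{d\tau}{(1 + |(x + \tau\hat v) \times u|^2)^{\frac{1+\delta}{2}}}.$$

The time integral has the form of that in Lemma 3.1 with $a = x \times u$, $b = \hat v \times u \equiv p_u$ and is therefore dominated by $\frac{c}{|b|}$. Hence

$$\int_0^t |Q_\ell^\#(f, f)(\tau, x, v)|\, d\tau \le c\rho(x, v)^{-1} \|f^\#\|^2 \int_{S^2_+} d\omega \int_{\mathbb{R}^3} \frac{|q|\, e^{-u_0}}{|\hat v \times u|}\, du$$
$$\le c\rho(x, v)^{-1} \|f^\#\|^2 \int_{S^2_+} \tilde\sigma(\omega)\, |\omega \cdot p_u|\, d\omega \times \int_{\mathbb{R}^3} \frac{e^{-u_0}}{|p_u|\, (1 + g^2)^\delta}\, du,$$

where we have used Lemma 2.2 i) and (H0) to bound |q|. Now in the remaining integrals we use $|\omega \cdot p_u| \le |p_u|$, apply (H0) again to bound the ω integral uniformly and bound the g expression below by 1. Then

$$\int_0^t |Q_\ell^\#(f, f)(\tau, x, v)|\, d\tau \le c\rho(x, v)^{-1} \|f^\#\|^2,$$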


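which is the desired estimate for the loss term. As expected, the gain term is much more difficult to handle. We break up its estimate into a number of steps. First we have

$$\int_0^t |Q_g^\#(f, f)(\tau, x, v)|\, d\tau = \int_0^t \Big| \int_{\mathbb{R}^3} \int_{S^2_+} e^{-u_0} q\, f^\#(\tau, x + \tau(\hat v - \hat v'), v')\, f^\#(\tau, x + \tau(\hat v - \hat u'), u')\, d\omega\, du \Big|\, d\tau$$
$$\le \|f^\#\|^2 \int_{\mathbb{R}^3} \int_{S^2_+} |q|\, e^{-u_0}\, d\omega\, du \int_0^t \frac{d\tau}{\big[(1 + |(x + \tau\hat v) \times v'|^2)(1 + |(x + \tau\hat v) \times u'|^2)\big]^{\frac{1+\delta}{2}}}. \tag{3.6}$$

Now consider $\tilde D \equiv (1 + |(x + \tau\hat v) \times v'|^2)(1 + |(x + \tau\hat v) \times u'|^2)$. We define

$$a_v = x \times v', \quad a_u = x \times u', \quad b_v = \hat v \times v', \quad b_u = \hat v \times u', \quad \nu_v = \frac{b_v}{|b_v|}, \quad \nu_u = \frac{b_u}{|b_u|}, \quad c_v = a_v \cdot b_v; \quad c_u = a_u \cdot b_u. \tag{3.7}$$

Step 1. Estimation of $\tilde D$ from below. From its definition we have

$$\tilde D = [1 + |a_v + \tau b_v|^2][1 + |a_u + \tau b_u|^2] = 1 + |a_v + \tau b_v|^2 + |a_u + \tau b_u|^2 + |a_v + \tau b_v|^2\, |a_u + \tau b_u|^2.$$

The last term, quartic in τ, will be estimated using $|a_v + \tau b_v| \ge |\omega \cdot (a_v + \tau b_v)|$, and for this we compute

$$\omega \cdot b_v = \omega \cdot (\hat v \times v') = \omega \cdot \hat v \times (v - a\omega) = 0$$

and

$$\omega \cdot a_v = \omega \cdot (x \times v') = \omega \cdot x \times (v - a\omega) = \omega \cdot x \times v.$$

These expressions are substituted into the quartic term in $\tilde D$; we arrive at

$$\tilde D \ge 1 + |a_v|^2 + \tau^2 |b_v|^2 + 2\tau c_v + |a_u|^2 + \tau^2 |b_u|^2 + 2\tau c_u + v_0^2 (\omega \cdot p_x)^2 \big[|a_u|^2 + \tau^2 |b_u|^2 + 2\tau c_u\big], \tag{3.8}$$

where $p_x \equiv x \times \hat v$.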



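We write this as

$$\tilde D \ge \alpha\tau^2 + 2\beta\tau + \gamma, \tag{3.9}$$

where

$$\alpha = |b_v|^2 + |b_u|^2 (1 + v_0^2 (\omega \cdot p_x)^2), \quad \beta = c_v + c_u (1 + v_0^2 (\omega \cdot p_x)^2), \quad \gamma = 1 + |a_v|^2 + |a_u|^2 (1 + v_0^2 (\omega \cdot p_x)^2). \tag{3.10}$$

Step 2. Estimation of the time integral. We will show shortly that $\alpha\gamma - \beta^2 > 0$. Assuming this for the moment, let

$$I_t \equiv \int_0^t \frac{d\tau}{\tilde D^{\frac{1+\delta}{2}}} = \int_0^t \frac{d\tau}{\big\{[1 + |a_v + \tau b_v|^2][1 + |a_u + \tau b_u|^2]\big\}^{\frac{1+\delta}{2}}} \le \int_0^t \frac{d\tau}{(\alpha\tau^2 + 2\beta\tau + \gamma)^{\frac{1+\delta}{2}}}$$

and complete the square to write

$$\alpha\tau^2 + 2\beta\tau + \gamma = \alpha \Big[\Big(\tau + \frac{\beta}{\alpha}\Big)^2 + \frac{\alpha\gamma - \beta^2}{\alpha^2}\Big].$$

Therefore

$$I_t \le 2\alpha^{-\frac{1+\delta}{2}} \int_0^\infty \frac{ds}{\Big(s^2 + \frac{\alpha\gamma - \beta^2}{\alpha^2}\Big)^{\frac{1+\delta}{2}}} = c\,\alpha^{\frac{\delta - 1}{2}} (\alpha\gamma - \beta^2)^{-\frac{\delta}{2}}. \tag{3.11}$$

Step 3. Estimation of $\alpha\gamma - \beta^2$ from below. From (3.10) we have directly that

$$\alpha\gamma - \beta^2 = |b_v|^2 (1 + |a_v|^2) + |b_v|^2 |a_u|^2 (1 + v_0^2 (\omega \cdot p_x)^2) + |b_u|^2 (1 + |a_v|^2)(1 + v_0^2 (\omega \cdot p_x)^2)$$
$$\quad + |b_u|^2 |a_u|^2 (1 + v_0^2 (\omega \cdot p_x)^2)^2 - c_v^2 - c_u^2 (1 + v_0^2 (\omega \cdot p_x)^2)^2 - 2c_v c_u (1 + v_0^2 (\omega \cdot p_x)^2)$$
$$= |b_v|^2 + |a_v \times b_v|^2 + |a_u \times b_u|^2 (1 + v_0^2 (\omega \cdot p_x)^2)^2 + (1 + v_0^2 (\omega \cdot p_x)^2)\big(|b_v|^2 |a_u|^2 + |b_u|^2 + |b_u|^2 |a_v|^2 - 2c_u c_v\big).$$

For the last expression in parentheses here we write

$$|b_v|^2 |a_u|^2 + |b_u|^2 + |b_u|^2 |a_v|^2 - 2c_u c_v$$
$$= |b_v|^2 \big[(a_u \cdot \nu_u)^2 + |a_u \times \nu_u|^2\big] + |b_u|^2 + |b_u|^2 \big[(a_v \cdot \nu_v)^2 + |a_v \times \nu_v|^2\big] - 2c_u c_v$$
$$= \frac{|b_v|^2 c_u^2}{|b_u|^2} + |b_v|^2 |a_u \times \nu_u|^2 + |b_u|^2 + \frac{|b_u|^2 c_v^2}{|b_v|^2} + |b_u|^2 |a_v \times \nu_v|^2 - 2c_u c_v$$
$$= |b_u|^2 (1 + |a_v \times \nu_v|^2) + |b_v|^2 |a_u \times \nu_u|^2 + \frac{|b_v|^2 c_u^2}{|b_u|^2} + \frac{|b_u|^2 c_v^2}{|b_v|^2} - 2\, \frac{|b_v|}{|b_u|} c_u \cdot \frac{|b_u|}{|b_v|} c_v$$
$$\ge |b_u|^2 (1 + |a_v \times \nu_v|^2) + |b_v|^2 |a_u \times \nu_u|^2.$$

It follows that $\alpha\gamma - \beta^2$ is bounded below by

$$|b_v|^2 (1 + |a_v \times \nu_v|^2) + |b_u|^2 |a_u \times \nu_u|^2 (1 + v_0^2 (\omega \cdot p_x)^2)^2 + (1 + v_0^2 (\omega \cdot p_x)^2)\big[|b_u|^2 (1 + |a_v \times \nu_v|^2) + |b_v|^2 |a_u \times \nu_u|^2\big]$$
$$= \big[|b_v|^2 + |b_u|^2 (1 + v_0^2 (\omega \cdot p_x)^2)\big]\big[1 + |a_v \times \nu_v|^2 + (1 + v_0^2 (\omega \cdot p_x)^2)\, |a_u \times \nu_u|^2\big]$$
$$= \alpha \big[1 + |a_v \times \nu_v|^2 + (1 + v_0^2 (\omega \cdot p_x)^2)\, |a_u \times \nu_u|^2\big]$$
$$\ge \alpha \big[1 + |a_v \times \nu_v|^2 + |a_u \times \nu_u|^2\big]$$
$$\ge c\alpha \big[1 + |a_v \times \nu_v| + |a_u \times \nu_u|\big]^2. \tag{3.12}$$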

716

R. T. Glassey

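Using the vector identity $A \times (B \times C) = (C \cdot A)B - (B \cdot A)C$, we compute these cross products as

$$a_v \times b_v = (x \times v') \times (\hat v \times v') = -(\hat v \cdot x \times v')\, v' = -(v' \cdot \hat v \times x)\, v' = (v' \cdot p_x)\, v';$$

similarly $a_u \times b_u = (u' \cdot p_x)\, u'$. Since $|b_v| = |\hat v \times v'| \le |\hat v||v'|$ we have

$$|a_v \times \nu_v| = \frac{|a_v \times b_v|}{|b_v|} \ge \frac{|v' \cdot p_x|}{|\hat v|}$$

and therefore also

$$|a_u \times \nu_u| \ge \frac{|u' \cdot p_x|}{|\hat v|}.$$

We can now estimate the expression in the last line of (3.12) from below:

$$1 + |a_v \times \nu_v| + |a_u \times \nu_u| \ge 1 + \frac{|v' \cdot p_x|}{|\hat v|} + \frac{|u' \cdot p_x|}{|\hat v|} = |\hat v|^{-1} \big[|\hat v| + |v' \cdot p_x| + |u' \cdot p_x|\big]$$
$$\ge |\hat v|^{-1} \big[|\hat v| + |(v' + u') \cdot p_x|\big] = |\hat v|^{-1} \big[|\hat v| + |u \cdot p_x|\big].$$

Here we have used conservation of momentum and the fact that $v \cdot p_x = 0$. Using this in (3.12) we see that

$$\alpha\gamma - \beta^2 \ge c\alpha\, |\hat v|^{-2} \big[|\hat v| + |u \cdot p_x|\big]^2.$$

This completes Step 3. With its result, we can estimate the time integral $I_t$ from (3.11) as

$$I_t \le c\,\alpha^{\frac{\delta - 1}{2}} (\alpha\gamma - \beta^2)^{-\frac{\delta}{2}} \le c|\hat v|^\delta\, \alpha^{-\frac{1}{2}} \big[|\hat v| + |u \cdot p_x|\big]^{-\delta}.$$

From the definition (3.10) of α we have

$$\alpha = |b_v|^2 + |b_u|^2 (1 + v_0^2 (\omega \cdot p_x)^2) \ge c|b_u|^2 (1 + v_0 |\omega \cdot p_x|)^2.$$

For $|b_u|$ we can write

$$|b_u| = |\hat v \times u'| \ge |\omega \cdot (\hat v \times u')| = |\omega \cdot (\hat v \times (u + a\omega))| = |\omega \cdot (\hat v \times u)| \equiv |\omega \cdot p_u|.$$

Therefore $\alpha \ge c|\omega \cdot p_u|^2 (1 + v_0 |\omega \cdot p_x|)^2$ and it follows that the time integral satisfies the estimate

$$I_t \le c|\hat v|^\delta\, |\omega \cdot p_u|^{-1} (1 + v_0 |\omega \cdot p_x|)^{-1} \big[|\hat v| + |u \cdot p_x|\big]^{-\delta}. \tag{3.13}$$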



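Returning to the integral for the gain term in (3.6) we now have

$$\int_0^t |Q_g^\#(f, f)(\tau, x, v)|\, d\tau \le c|\hat v|^\delta \|f^\#\|^2 \int_{\mathbb{R}^3} \int_{S^2_+} |q|\, e^{-u_0}\, |\omega \cdot p_u|^{-1} (1 + v_0 |\omega \cdot p_x|)^{-1} \big[|\hat v| + |u \cdot p_x|\big]^{-\delta}\, d\omega\, du$$
$$\le c|\hat v|^\delta \|f^\#\|^2 \int_{\mathbb{R}^3} \int_{S^2_+} g\sqrt{1 + g^2}\, \sigma\, e^{-u_0}\, |\omega \cdot p_u|^{-1} (1 + v_0 |\omega \cdot p_x|)^{-1} \big[|\hat v| + |u \cdot p_x|\big]^{-\delta}\, d\omega\, du$$
$$\le c|\hat v|^\delta \|f^\#\|^2 \int_{\mathbb{R}^3} \int_{S^2_+} \tilde\sigma\, e^{-u_0} (1 + g^2)^{-\delta} (1 + v_0 |\omega \cdot p_x|)^{-1} \big[|\hat v| + |u \cdot p_x|\big]^{-\delta}\, d\omega\, du, \tag{3.14}$$

where we have first used part i) of Lemma 2.2 to bound q and then, in the last line, have applied hypothesis (H0) to adjust the powers of g and to cancel the factor $|\omega \cdot p_u|$.

Step 4. The desired estimate holds when |x × v| is bounded. Let us assume that |x × v| ≤ 1. Then (3.13) allows us to write

$$\int_0^t |Q_g^\#(f, f)(\tau, x, v)|\, d\tau \le c\|f^\#\|^2 \int_{\mathbb{R}^3} \int_{S^2_+} \tilde\sigma\, e^{-u_0}\, d\omega\, du \le c\rho^{-1} \|f^\#\|^2$$

by (H0). Therefore in what follows we may assume that |x × v| ≥ 1.

Step 5. Estimation of the ω integral from above. It is now immediate to estimate the ω integral $I_\omega$ appearing in (3.13):

$$I_\omega \equiv \int_{S^2_+} \tilde\sigma(\omega)\, (1 + v_0 |\omega \cdot p_x|)^{-1}\, d\omega \le c(v_0 |p_x|)^{-1} = c|x \times v|^{-1}$$

in view of hypothesis (H0). (As |x × v| ≥ 1 there is no singularity.) Using this result in (3.13) we now have

$$\int_0^t |Q_g^\#(f, f)(\tau, x, v)|\, d\tau \le c|x \times v|^{-1} |\hat v|^\delta \|f^\#\|^2 \int_{\mathbb{R}^3} e^{-u_0} (1 + g^2)^{-\delta} \big[|\hat v| + |u \cdot p_x|\big]^{-\delta}\, du. \tag{3.15}$$

In this same set we may write $|x \times v|^{-1} \le c\rho^{-1} |x \times v|^\delta = c\rho^{-1} v_0^\delta |p_x|^\delta$, and therefore (3.15) then takes the form

$$\int_0^t |Q_g^\#(f, f)(\tau, x, v)|\, d\tau \le c\rho^{-1} v_0^\delta |p_x|^\delta |\hat v|^\delta \|f^\#\|^2 \int_{\mathbb{R}^3} e^{-u_0} (1 + g^2)^{-\delta} \big[|\hat v| + |u \cdot p_x|\big]^{-\delta}\, du. \tag{3.16}$$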


Step 6. Estimation of the u integral and completion of the lemma. Denote by $I_u$ the remaining integral:

$$I_u = \int_{\mathbb{R}^3} e^{-u_0} (1 + g^2)^{-\delta} \big[|\hat v| + |u \cdot p_x|\big]^{-\delta}\, du.$$

Our goal is to show that this integral is dominated by a constant multiple of $(v_0 |p_x|)^{-\delta}$. Toward this end we partition it as

$$I_u = \int_{|u| < |v|/2} + \int_{|u| > |v|/2} \equiv I + II.$$

From Lemma 3.1 of [20] we have

$$g \ge \frac{(|u \times v|^2 + |u - v|^2)^{1/2}}{2\sqrt{u_0 v_0}},$$

and thus on the set {|u| < |v|/2} we get

$$1 + g^2 \ge 1 + \frac{|u - v|^2}{4v_0 u_0} \ge 1 + \frac{|v|^2}{16 v_0 u_0} \ge \frac{1 + |v|^2}{16 v_0 u_0} = \frac{v_0}{16 u_0}.$$

Hence because δ < 1,

$$I \le c v_0^{-\delta} \int_{\mathbb{R}^3} u_0^\delta\, e^{-u_0} \big[|\hat v| + |u \cdot p_x|\big]^{-\delta}\, du \le c v_0^{-\delta} \int_0^\infty r^2 (1 + r^2)^{\frac{\delta}{2}} e^{-\sqrt{1 + r^2}}\, dr \int_0^{\pi/2} \frac{\sin\varphi\, d\varphi}{(r |p_x| \cos\varphi)^\delta} \le c(|p_x| v_0)^{-\delta}.$$

The other integral II admits the estimate

$$II \le c v_0^{-\delta} \int_{|u| > |v|/2} u_0^\delta\, e^{-u_0} \big[|\hat v| + |u \cdot p_x|\big]^{-\delta}\, du,$$

and this integral is dominated by the same integral which appeared in the upper bound for I above. We can now conclude that $I_u \le c(|p_x| v_0)^{-\delta}$. Inserting this in (3.16) we finally get

$$\int_0^t |Q_g^\#(f, f)(\tau, x, v)|\, d\tau \le c\rho^{-1} \|f^\#\|^2$$

which is the desired estimate for the gain term.



With quadratic estimates of the form from Lemma 3.2 in hand, a small–data theorem results without difficulty. Write $f(0, x, v) = f_0(x, v)$. Returning to (3.4), we integrate in time to get

$$f^\#(t, x, v) = f_0(x, v) + \int_0^t Q^\#(f, f)(\tau, x, v)\, d\tau. \tag{3.17}$$

Define the operator $\mathcal{F}$ on M by

$$\mathcal{F} f^\# = f_0(x, v) + \int_0^t Q^\#(f, f)(\tau, x, v)\, d\tau$$

and let $M_R = \{f \in M : \|f^\#\| \le R\}$.

Theorem 2. Let hypothesis (H0) hold. There exists a constant $R_0$ such that if $\|f_0\|$ is sufficiently small, then Eq. (3.17) has a unique solution $f^\# \in M_{R_0}$. Moreover, under the same restrictions on $f_0$ and $R_0$, this equation is uniquely solvable in X as well.

Proof. The estimates of Lemma 3.2 show that if e.g., $\|f_0\| \le R/2$ and $f \in M_R$, then

$$|\mathcal{F} f^\#| \le \rho(x, v)^{-1} \|f_0\| + c\rho(x, v)^{-1} \|f^\#\|^2 \le \rho(x, v)^{-1} \Big(\frac{R}{2} + cR^2\Big).$$

Thus $\mathcal{F}$ maps $M_R$ into itself for R sufficiently small. Similarly, we show that $\mathcal{F}$ is a contraction on $M_R$ for suitably small R. Since elements of $M_R$ are continuous, the continuity of $\mathcal{F} f^\#$ is evident.
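The structure of this argument (a quadratic map that sends a small ball into itself and contracts there) can be seen in a scalar toy model. The sketch below is only an analogy and not the operator of the paper: it replaces the norm of $f^\#$ by a number y and the map by $\Phi(y) = y_0 + c y^2$, with `y0`, `c` and `radius` chosen by hand. If $y_0 \le R/2$ and $cR \le 1/2$ then $\Phi$ maps $[0, R]$ into itself, and $|\Phi(y_1) - \Phi(y_2)| \le 2cR\, |y_1 - y_2|$, so $\Phi$ is a contraction when $2cR < 1$.

```python
import math


def toy_map(y, y0, c):
    """Scalar stand-in for the bound ||F f|| <= ||f0|| + c ||f||^2."""
    return y0 + c * y * y


def iterate_fixed_point(y0, c, steps=200):
    """Picard iteration y -> toy_map(y), started from 0."""
    y = 0.0
    for _ in range(steps):
        y = toy_map(y, y0, c)
    return y


def exact_small_fixed_point(y0, c):
    """Smaller root of y = y0 + c y^2 (real only when 4 c y0 <= 1)."""
    return (1.0 - math.sqrt(1.0 - 4.0 * c * y0)) / (2.0 * c)
```

For example, with `y0 = 0.1`, `c = 1.0` and `radius = 0.25` the contraction factor is at most 0.5 and the iteration converges to the smaller root of the quadratic, which lies inside the ball.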

4. Nonnegativity of the Solution

It remains to show that the solution just obtained remains nonnegative. For this purpose we use the well–known iteration of Illner and Shinbrot [28] which, to a certain point, proceeds as in the classical case. Let T > 0 be arbitrary and let $M_T$ denote the restriction of elements $f \in M$ to $[0, T] \times \mathbb{R}^3 \times \mathbb{R}^3$. Suppose that there exist $U_0^\#, \ell_0^\# \in M$ such that $\ell_0(t, x, v) \le U_0(t, x, v)$ for all 0 ≤ t < T, $(x, v) \in \mathbb{R}^3 \times \mathbb{R}^3$. Define two sequences $\{\ell_k\}$, $\{U_k\}$ by

$$\frac{d}{dt} \ell_{k+1}^\# + \ell_{k+1}^\# R^\#(U_k) = Q_g^\#(\ell_k, \ell_k), \qquad \ell_{k+1}(0) = f_0,$$
$$\frac{d}{dt} U_{k+1}^\# + U_{k+1}^\# R^\#(\ell_k) = Q_g^\#(U_k, U_k), \qquad U_{k+1}(0) = f_0. \tag{4.1}$$

Because we have assumed that $U_0^\# \in M$, the estimates of Lemma 3.2 allow us to conclude that

$$R^\#(U_0),\ Q_g^\#(U_0, U_0) \in L^1((0, T), C^0(\mathbb{R}^3 \times \mathbb{R}^3)). \tag{4.2}$$

Clearly there exists a solution when k = 0. These are linear ordinary differential equations; thus if $\ell_{k-1}$, $U_{k-1}$ exist on (0, T) then so do $\ell_k$, $U_k$.


R. T. Glassey

Lemma 4.1. Let $0 \le f_0 \in M$. Assume the beginning condition (BC)
\[ 0 \le \ell_0(t) \le \ell_1(t) \le U_1(t) \le U_0(t), \qquad 0 \le t < T. \tag{BC} \]
Then the system (4.1) has a unique solution
\[ \ell_k^\#,\ U_k^\# \in M_T \tag{4.3} \]
for all $k \ge 1$ with the property
\[ \ell_{k-1}(t) \le \ell_k(t) \le U_k(t) \le U_{k-1}(t), \qquad 0 \le t < T. \tag{4.4} \]

Temporarily we assume that the (BC) and the result of the lemma hold with some $U_0^\# \in M_T$. Then there exist functions $\ell$, $U$ with $\ell_k \nearrow \ell$, $U_k \searrow U$, and $\ell(t) \le U(t)$ for all $t$. Now integrate over $[0,t]$ the ordinary differential equations (4.1) at step $k$; let $k \to \infty$ and apply the dominated convergence theorem to get
\[
\begin{aligned}
\ell^\#(t) + \int_0^t \ell^\# R^\#(U)(\tau)\, d\tau &= f_0 + \int_0^t Q_g^\#(\ell,\ell)(\tau)\, d\tau,\\
U^\#(t) + \int_0^t U^\# R^\#(\ell)(\tau)\, d\tau &= f_0 + \int_0^t Q_g^\#(U,U)(\tau)\, d\tau.
\end{aligned}
\tag{4.5}
\]
This is the separated Boltzmann system. If we can show that $U = \ell$, then $f \equiv U = \ell$ will be a nonnegative "mild" solution of (RB).

Proof. In order to see the monotonicity, we solve explicitly to get
\[ \ell_k^\#(t) = f_0\, e^{-\int_0^t R^\#(U_{k-1})\,ds} + \int_0^t e^{-\int_\tau^t R^\#(U_{k-1})\,ds}\, Q_g^\#(\ell_{k-1},\ell_{k-1})\, d\tau. \tag{4.6} \]

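Thus
\[ \ell_{k+1}^\#(t) = f_0\, e^{-\int_0^t R^\#(U_k)\,ds} + \int_0^t e^{-\int_\tau^t R^\#(U_k)\,ds}\, Q_g^\#(\ell_k,\ell_k)\, d\tau. \tag{4.7} \]
Assume that for some $k \ge 1$,
\[ \ell_{k-1}(t) \le \ell_k(t) \le U_k(t) \le U_{k-1}(t), \tag{4.8} \]
and subtract (4.6) from (4.7):
\[
\begin{aligned}
\ell_{k+1}^\#(t) - \ell_k^\#(t) ={}& f_0 \Bigl[ e^{-\int_0^t R^\#(U_k)\,ds} - e^{-\int_0^t R^\#(U_{k-1})\,ds} \Bigr]\\
&+ \int_0^t \Bigl[ e^{-\int_\tau^t R^\#(U_k)\,ds} - e^{-\int_\tau^t R^\#(U_{k-1})\,ds} \Bigr] Q_g^\#(\ell_k,\ell_k)\, d\tau\\
&+ \int_0^t e^{-\int_\tau^t R^\#(U_{k-1})\,ds} \Bigl[ Q_g^\#(\ell_k,\ell_k) - Q_g^\#(\ell_{k-1},\ell_{k-1}) \Bigr] d\tau.
\end{aligned}
\tag{4.9}
\]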

The kernel $q$ is nonnegative on the set of integration, and from definition (3.5), $R(u) \le R(v)$ if $u \le v$ a.e. So the first two terms are nonnegative. By the induction assumption, the last term is too, since $Q_g$ is monotone. Hence
\[ \ell_k^\#(t) \le \ell_{k+1}^\#(t), \tag{4.10} \]
and a similar argument applies to the $\{U_k^\#(t)\}$. We see that each member of $\{\ell_k^\#\}$, $\{U_k^\#\}$ is nonnegative and belongs to $M_T$ by using the estimates of Lemma 3.2. This proves the lemma.
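The sign of the first two terms in (4.9) can be spelled out as follows. This is only a sketch of the step the proof leaves implicit; it uses the induction hypothesis $U_k \le U_{k-1}$ from (4.8), the monotonicity of $R$, and $f_0 \ge 0$, and it assumes that the gain term is nonnegative on nonnegative arguments.

```latex
\begin{align*}
U_k \le U_{k-1}
 &\;\Longrightarrow\; R^{\#}(U_k) \le R^{\#}(U_{k-1}) \\
 &\;\Longrightarrow\; e^{-\int_\tau^t R^{\#}(U_k)\,ds} \;\ge\; e^{-\int_\tau^t R^{\#}(U_{k-1})\,ds},
 \qquad 0 \le \tau \le t .
\end{align*}
```

Multiplying the resulting nonnegative difference of exponentials by $f_0 \ge 0$ (with $\tau = 0$) gives the first term, and integrating it against the gain term evaluated at the lower iterate of step $k$ gives the second.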



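In order to simplify the (BC), we take $\ell_0 = 0$ and any $0 \le U_0^\# \in M_T$. We claim that
\[ 0 = \ell_0(t) \le \ell_1(t) \le U_1(t). \tag{4.11} \]
Indeed, by the differential equations,
\[
\begin{aligned}
\frac{d}{dt}\ell_1^\# + \ell_1^\# R^\#(U_0) &= Q_g^\#(\ell_0,\ell_0),\\
\frac{d}{dt}U_1^\# + U_1^\# R^\#(\ell_0) &= Q_g^\#(U_0,U_0).
\end{aligned}
\tag{4.12}
\]
Now
\[ \ell_0 = 0 \quad\text{implies}\quad R^\#(\ell_0) = 0,\ Q_g^\#(\ell_0,\ell_0) = 0. \tag{4.13} \]
Therefore
\[ U_1^\# = f_0 + \int_0^t Q_g^\#(U_0,U_0)\, ds, \qquad 0 \le \ell_1^\# = f_0\, e^{-\int_0^t R^\#(U_0)\,ds} \le f_0 \le U_1^\#. \tag{4.14} \]
Hence the (BC) reduces to
\[ U_1(t) \le U_0(t). \tag{BC} \]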

5. Satisfaction of the Beginning Condition

We are assuming that $0 \le f_0 \in M$.

Lemma 5.1. If $f_0$ is sufficiently small, then the (BC) holds, and the separated Boltzmann system has a global solution $(\ell, U)$ with $(\ell^\#, U^\#) \in M_R$.

Proof. Since $\ell_0 = 0$,
\[ U_1^\#(t) = f_0 + \int_0^t Q_g^\#(U_0,U_0)\, d\tau, \tag{5.1} \]
i.e.,
\[ U_1(t, x + t\hat v, v) = f_0(x,v) + \int_0^t \int_{\mathbb{R}^3} \int_{S_+^2} q\, \exp(-u_0)\, U_0(\tau, x + \tau\hat v, v')\, U_0(\tau, x + \tau\hat v, u')\, d\omega\, du\, d\tau. \tag{5.2} \]
The (BC) requires $U_1(t) \le U_0(t)$; sufficient for this is that there exists a function $U_0$ which satisfies
\[ U_0(t, x + t\hat v, v) = f_0(x,v) + \int_0^t \int_{\mathbb{R}^3} \int_{S_+^2} q\, \exp(-u_0)\, U_0(\tau, x + \tau\hat v, v')\, U_0(\tau, x + \tau\hat v, u')\, d\omega\, du\, d\tau. \tag{5.3} \]
We recognize the right-hand side here as having the same form as the "gain term" studied above. As such, the same estimates apply and provide the existence of a solution $U_0$ provided $f_0$ is sufficiently small. (Large-data solutions will not exist in general to the "gain only" Boltzmann equation; see [2].)


Proof that $U = \ell$: It remains to show that $U = \ell$. Take $R$ from Theorem 2.

Lemma 5.2. When $f_0$ is sufficiently small, $U = \ell$, where $U$, $\ell$ are the solutions of the separated Boltzmann system (4.5).

Proof. By definition,
\[
\begin{aligned}
\ell^\#(t) + \int_0^t \ell^\# R^\#(U)(\tau)\, d\tau &= f_0 + \int_0^t Q_g^\#(\ell,\ell)(\tau)\, d\tau,\\
U^\#(t) + \int_0^t U^\# R^\#(\ell)(\tau)\, d\tau &= f_0 + \int_0^t Q_g^\#(U,U)(\tau)\, d\tau.
\end{aligned}
\]
Subtracting these equations, we have
\[ (U^\# - \ell^\#)(t) = \int_0^t \Bigl[ Q_g^\#(U, U - \ell) + Q_g^\#(U - \ell, \ell) + \ell^\# R^\#(U - \ell) - (U^\# - \ell^\#) R^\#(\ell) \Bigr] d\tau. \]
Now we simply take norms in $M$, using the estimates from the second part of Lemma 3.2:
\[ \|U^\# - \ell^\#\| \le c \bigl( \|U^\#\|\, \|U^\# - \ell^\#\| + \|\ell^\#\|\, \|U^\# - \ell^\#\| \bigr). \]
Now $U^\#$, $\ell^\#$ both lie in $M_R$, so each of the factors $\|U^\#\|$, $\|\ell^\#\|$ is bounded by $cR$. The conclusion now follows when $R$ is sufficiently small.
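The last step of this proof is a standard absorption argument. A minimal sketch, in which $D$ is our shorthand for the difference of the two solutions of the separated system (in the $\#$ representation) and $C$ is a generic constant collecting the constants $c$ above:

```latex
\[
\|D\| \le C R\, \|D\|
\;\Longrightarrow\;
(1 - CR)\,\|D\| \le 0
\;\Longrightarrow\;
\|D\| = 0 \quad \text{whenever } CR < 1 .
\]
```

The argument uses that $\|D\|$ is finite, which holds because both solutions lie in $M_R$.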

As in Lemma 3.2, we can show that $U = \ell \in X$ under the same restriction on $f_0$. Thus the nonnegative solution just obtained must coincide with the unique solution $f \in X$ obtained from the last sentence of Theorem 2. Since $f_0 \in M$ by hypothesis, and since $M \subseteq X$, our solutions must be identical, and the solution $f$ from Theorem 2 must remain nonnegative. This completes the proof.

We end with an observation about the integrability of the solution $F$ in $x$ and $v$. As $f$ is bounded, the solution $F$ is clearly integrable in $v$ because it decays exponentially. Let us now assume that the initial value $F_0(x,v)$ has compact support in $x$, say $F_0(x,v) = 0$ for $|x| > k$. Then it can be seen from the representation that the gain and loss terms (and hence the solution $F$ as well) each have support in $|x| \le k + t$. Perhaps the easiest way to see this is to use induction on the successive approximations, which is straightforward. (The linear case was studied in [16, 17].) Therefore the solution is integrable in $x$ and $v$ under this additional assumption.

References

1. Andréasson, H.: Regularity of the gain term and strong L1 convergence to equilibrium for the relativistic Boltzmann equation. SIAM J. Math. Anal. 27, 1386–1405 (1996)
2. Andréasson, H., Calogero, S., Illner, R.: On Blow–up for Gain–term–only classical and relativistic Boltzmann equations. Math. Methods Appl. Sci. 27, no. 18, 2231–2240 (2004)
3. Andréasson, H.: The Einstein-Vlasov system/kinetic theory. Living Rev. Relativ. (electronic), 5, 2002–7, 33 pp (2002)
4. Asano, K., Ukai, S.: On the Cauchy Problem of the Boltzmann Equation with Soft Potentials. Publ. R.I.M.S. Kyoto Univ. 18, 477–519 (1982)



5. Bardos, C., Degond, P., Golse, F.: A priori estimates and existence results for the Vlasov and Boltzmann Equations. In: Proc. AMS–SIAM Summer Seminar, Santa Fe (1984), AMS Lectures in Appl. Math. Vol. 23, Providence, RI: Amer. Math. Soc., 1986
6. Boltzmann, L.: Weitere Studien über das Wärmegleichgewicht unter Gasmolekülen. Sitzungsberichte der Akademie der Wissenschaften Wien 66, 275–370 (1872)
7. Bellomo, N., Toscani, G.: On the Cauchy Problem for the nonlinear Boltzmann equation. Global existence, uniqueness and asymptotic stability. J. Math. Phys. 26, 334–338 (1985)
8. Caflisch, R.: The Boltzmann Equation with Soft Potentials. Commun. Math. Phys. 74, 71–109 (1980)
9. Cercignani, C.: The Boltzmann Equation and its Applications. New York: Springer–Verlag, 1988
10. Cercignani, C., Kremer, G.: The relativistic Boltzmann equation, theory and applications. Boston: Birkhaeuser, 2002
11. Cercignani, C., Illner, R., Pulvirenti, M.: The Mathematical Theory of Dilute Gases. New York: Springer–Verlag, 1994
12. Chapman, S., Cowling, T.G.: The Mathematical Theory of Non–uniform Gases. Third Ed., Cambridge: Cambridge University Press, 1990
13. de Groot, S.R., van Leeuwen, W.A., van Weert, C.G.: Relativistic Kinetic Theory. Amsterdam: North–Holland, 1980
14. DiPerna, R., Lions, P.L.: On the Cauchy Problem for Boltzmann Equations: Global Existence and Weak Stability. Ann. Math. 130, 321–366 (1989)
15. Dudyński, M., Ekiel-Jezewska, M.: Global Existence Proof for Relativistic Boltzmann Equation. J. Stat. Phys. 66, 991–1001 (1992)
16. Dudyński, M., Ekiel-Jezewska, M.: Causality of the linearized relativistic Boltzmann equation. Phys. Rev. Lett. 55, 2831–2834 (1985)
17. Dudyński, M., Ekiel-Jezewska, M.: Errata: Causality of the linearized relativistic Boltzmann equation. Investigación Oper. 6, 2228 (1985)
18. Glassey, R.: The Cauchy problem in kinetic theory. Philadelphia: SIAM, 1996
19. Glassey, R., Strauss, W.: On the Derivatives of the Collision Map of Relativistic Particles. Trans. Th. Stat. Phys. 20, 55–68 (1991)
20. Glassey, R., Strauss, W.: Asymptotic Stability of the Relativistic Maxwellian. Publ. R.I.M.S. Kyoto Univ. 29, 301–347 (1993)
21. Glassey, R., Strauss, W.: Asymptotic Stability of the Relativistic Maxwellian via Fourteen Moments. Trans. Th. Stat. Phys. 24, 657–678 (1995)
22. Grad, H.: Principles of the Kinetic Theory of Gases. Handbuch der Physik 12, Berlin: Springer–Verlag, 1958, pp. 205–294
23. Guo, Y.: The Vlasov–Poisson–Boltzmann system near vacuum. Commun. Math. Phys. 218, 293–313 (2001)
24. Guo, Y.: The Vlasov–Maxwell–Boltzmann system near Maxwellians. Invent. Math. 153, 593–630 (2003)
25. Guo, Y.: Classical solutions to the Boltzmann equation for molecules with an angular cutoff. Arch. Rat. Mech. Anal. 169, no. 4, 305–353 (2003)
26. Hamdache, K.: Quelques résultats pour l’équation de Boltzmann. C. R. Acad. Sci. Paris I 299, 431–434 (1984)
27. Hamdache, K.: Initial boundary value problems for Boltzmann equation: Global existence of weak solutions. Arch. Rat. Mech. Anal. 119, 309–353 (1992)
28. Illner, R., Shinbrot, M.: The Boltzmann Equation, global existence for a rare gas in an infinite vacuum. Commun. Math. Phys. 95, 217–226 (1984)
29. Kawashima, S.: The Boltzmann Equation and Thirteen Moments. Japan J. Appl. Math. 7, 301–320 (1990)
30. Kaniel, S., Shinbrot, M.: The Boltzmann Equation, uniqueness and local existence. Commun. Math. Phys. 58, 65–84 (1978)
31. Landau, E.M., Pitaevskii, L.P.: Physical Kinetics. Vol. 10 of Course of Theoretical Physics, Oxford: Pergamon Press, 1981
32. Nishida, T., Imai, K.: Global Solutions to the initial value problem for the nonlinear Boltzmann Equation. Publ. R.I.M.S. Kyoto Univ. 12, 229–239 (1976)
33. Noutchegueme, N., Tetsadjio, M.E.: Global solutions for the relativistic Boltzmann equation in the homogeneous case on the Minkowski space–time. http://arXiv.org/abs/gr-qc/0307065, 2004
34. Pfaffelmoser, K.: Global classical solutions of the Vlasov–Poisson system in three dimensions for general initial data. J. Diff. Eqs. 95, 281–303 (1992)
35. Lions, P.L.: Compactness in Boltzmann’s equation via Fourier integral operators and applications. J. Math. Kyoto Univ. 34, 391–427 (1994)
36. Polewczak, J.: Classical solution of the Nonlinear Boltzmann equation in all $\mathbb{R}^3$. J. Stat. Phys. 50 (3 & 4), 611–632 (1988)


37. Schaeffer, J.: Global Existence of Smooth Solutions to the Vlasov–Poisson System in Three Dimensions. Commun. P.D.E. 16, 1313–1335 (1991)
38. Shizuta, Y.: On the Classical Solutions of the Boltzmann Equation. Commun. Pure Appl. Math. 36, 705–754 (1983)
39. Stewart, J.: Non–Equilibrium Relativistic Kinetic Theory. Lecture Notes in Physics 10, New York: Springer–Verlag, 1971
40. Tartar, L.: Some Existence Theorems for semilinear hyperbolic systems in one space variable. MRC Technical Summary Report, Madison, WI, 1980
41. Truesdell, C., Muncaster, R.: Fundamentals of Maxwell Kinetic Theory of a Simple Monatomic Gas (treated as a branch of rational continuum mechanics). New York: Academic Press, 1980
42. Ukai, S.: Solutions of the Boltzmann Equation. Studies in Math. Appl. 18, 37–96 (1986)
43. Ukai, S.: On the Existence of Global Solutions of a mixed problem for the nonlinear Boltzmann equation. Proc. Japan. Acad. 50, 179–184 (1974)

Communicated by P. Constantin

Commun. Math. Phys. 264, 725–740 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1521-z



On Lieb-Thirring Inequalities for Schrödinger Operators with Virtual Level

T. Ekholm, R. L. Frank

Royal Institute of Technology, Department of Mathematics, 100 44 Stockholm, Sweden. E-mail: [email protected]; [email protected]

Received: 16 May 2005 / Accepted: 11 October 2005
Published online: 10 February 2006 – © Springer-Verlag 2006

Abstract: We consider the operator $H = -\Delta - V$ in $L_2(\mathbb{R}^d)$, $d \ge 3$. For the moments of its negative eigenvalues we prove the estimate
\[ \operatorname{tr} H_-^{\gamma} \le C_{\gamma,d} \int_{\mathbb{R}^d} \left( V(x) - \frac{(d-2)^2}{4|x|^2} \right)_+^{\gamma + \frac d2} dx, \qquad \gamma > 0. \]
Similar estimates hold for the one-dimensional operator with a Dirichlet condition at the origin and for the two-dimensional Aharonov-Bohm operator.

Introduction

The Lieb-Thirring inequalities estimate a quantum mechanical quantity, namely moments of negative eigenvalues of the Schrödinger operator $-\Delta - V$ in $L_2(\mathbb{R}^d)$, by means of the classical phase space volume. They state for suitable values of $\gamma$ and $d$ (see [LiTh] and, for more recent results, the survey [LaWei2]) that
\[ \operatorname{tr}(-\Delta - V)_-^{\gamma} \le L_{\gamma,d} \int_{\mathbb{R}^d} V_+(x)^{\gamma + \frac d2}\, dx. \tag{0.1} \]

Lately the main topic in connection with these inequalities has been to establish their sharp constants $L_{\gamma,d}$. We are interested in a different question. As is well known, in dimension $d \ge 3$ a sufficiently weak potential cannot bind a particle. Put differently, if $V \in C_0^\infty(\mathbb{R}^d)$ then $-\Delta - \beta V$ is non-negative for small $\beta > 0$. This follows, e.g., from the Hardy inequality
\[ \frac{(d-2)^2}{4} \int_{\mathbb{R}^d} \frac{|u|^2}{|x|^2}\, dx \le \int_{\mathbb{R}^d} |\nabla u|^2\, dx, \qquad u \in C_0^\infty(\mathbb{R}^d). \tag{0.2} \]


We see that the Lieb-Thirring estimate does not yield a good bound for weak potentials. In the particular case $V(x) = \frac{(d-2)^2}{4|x|^2}$ the l.h.s. of (0.1) is zero whereas the r.h.s. is infinite! In this paper we show the rather unexpected result that the part of the potential which is stronger than the Hardy weight is sufficient to estimate the moments of the negative eigenvalues of $-\Delta - V$. More precisely, we prove the inequality
\[ \operatorname{tr}(-\Delta - V)_-^{\gamma} \le C_{\gamma,d} \int_{\mathbb{R}^d} \left( V(x) - \frac{(d-2)^2}{4|x|^2} \right)_+^{\gamma + \frac d2} dx \tag{0.3} \]

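for any $d \ge 3$ and $\gamma > 0$. Note that a direct approach based only on (0.1) and (0.2) leads to $-\Delta - V \ge -\varepsilon\Delta + (1-\varepsilon)\frac{(d-2)^2}{4|x|^2} - V$ for $\varepsilon \in (0,1)$ and hence
\[ \operatorname{tr}(-\Delta - V)_-^{\gamma} \le L_{\gamma,d}\, \varepsilon^{-\frac d2} \int_{\mathbb{R}^d} \left( V(x) - (1-\varepsilon)\frac{(d-2)^2}{4|x|^2} \right)_+^{\gamma + \frac d2} dx. \]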
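The factor $\varepsilon^{-d/2}$ in this bound comes from applying (0.1) to $-\varepsilon\Delta - W$ after pulling out $\varepsilon$. A short sketch of that step, where $W := V - (1-\varepsilon)\frac{(d-2)^2}{4|x|^2}$ is our shorthand and not notation from the paper:

```latex
\begin{align*}
\operatorname{tr}(-\varepsilon\Delta - W)_-^{\gamma}
 &= \varepsilon^{\gamma}\,
    \operatorname{tr}\bigl(-\Delta - \varepsilon^{-1}W\bigr)_-^{\gamma} \\
 &\le \varepsilon^{\gamma} L_{\gamma,d}
    \int_{\mathbb{R}^d} \bigl(\varepsilon^{-1}W(x)\bigr)_+^{\gamma+\frac d2}\,dx
  = L_{\gamma,d}\,\varepsilon^{-\frac d2}
    \int_{\mathbb{R}^d} W(x)_+^{\gamma+\frac d2}\,dx .
\end{align*}
```

The exponents combine as $\gamma - (\gamma + \tfrac d2) = -\tfrac d2$, which is why the constant cannot stay bounded as $\varepsilon \to 0$.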

However, as $\varepsilon$ tends to zero the constant in front of the integral diverges. For a deeper analysis in the case of positive $\varepsilon$ we refer to [St]. In order to prove (0.3) we will choose a slightly different (but equivalent) point of view and establish Lieb-Thirring inequalities for the operator
\[ -\Delta - \frac{(d-2)^2}{4|x|^2} - \beta V \quad \text{in } L_2(\mathbb{R}^d),\ d \ge 3, \tag{0.4} \]
see Theorem 1.1. We note that several works have been devoted to the investigation of this operator. In [Bi] sufficient conditions for the finiteness of the negative spectrum were given. In [BiLa] the asymptotic number of negative eigenvalues in the strong coupling regime $\beta \to \infty$ was investigated and, in particular, it was shown that the Weyl-type formula may be violated. Indeed, an additional contribution of a one-dimensional auxiliary problem appears. We emphasize that such a term does not appear in our Lieb-Thirring estimates, which demonstrates the 'smoothing effect' of taking $\gamma > 0$. In [Wei] the weak coupling regime $\beta \to 0$ is investigated and a necessary and sufficient condition on $V$ is given for the operator (0.4) to have a negative eigenvalue for any $\beta > 0$. This is in particular the case if $V \ge 0$ and stands in sharp contrast to the operator $-\Delta - \beta V$ if $d \ge 3$. This is what we mean by a virtual level.

It will turn out that the operator (0.4) has both two- and $d$-dimensional features and the main difficulty in establishing a Lieb-Thirring inequality is to estimate the former. Here we rely on weighted Lieb-Thirring inequalities by Egorov-Kondrat'ev [EgKo]. The same approach allows us to obtain Lieb-Thirring inequalities for the one-dimensional analogue of (0.4) with a Dirichlet boundary condition at the origin (see Theorem 1.6) and for the two-dimensional magnetic Schrödinger operator corresponding to the Aharonov-Bohm field when the critical Hardy weight is subtracted (see Theorem 1.10). Our estimates also allow the inclusion of a weight in the spirit of [GlGrMaTh, BlReSt and EgKo].

1. Statement of the Results

1.1. Schrödinger operators in $d \ge 3$. The main result of this paper is


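Theorem 1.1. Let $d \ge 3$, $\gamma > 0$ and $\alpha \ge 0$. Then
\[ \operatorname{tr}\left( -\Delta - \frac{(d-2)^2}{4|x|^2} - V \right)_-^{\gamma} \le C_{\gamma,d,\alpha} \int_{\mathbb{R}^d} V(x)_+^{\gamma + \frac{d+\alpha}{2}}\, |x|^{\alpha}\, dx \tag{1.1} \]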

with a constant $C_{\gamma,d,\alpha}$ independent of $V$.

To be more precise, we prove that if $V \in L_{1,\mathrm{loc}}(\mathbb{R}^d)$ and if the r.h.s. of (1.1) is finite then the quadratic form
\[ \int_{\mathbb{R}^d} \left( |\nabla u|^2 - \frac{(d-2)^2}{4|x|^2} |u|^2 - V|u|^2 \right) dx \tag{1.2} \]
is lower semi-bounded and closable on $C_0^\infty(\mathbb{R}^d)$ and the estimate (1.1) holds for the operator associated with the closure of this form. Specializing to the case $\alpha = 0$ we obtain for the standard Schrödinger operator

Corollary 1.2. Let $d \ge 3$ and $\gamma > 0$. Then
\[ \operatorname{tr}(-\Delta - V)_-^{\gamma} \le C_{\gamma,d,0} \int_{\mathbb{R}^d} \left( V(x) - \frac{(d-2)^2}{4|x|^2} \right)_+^{\gamma + \frac d2} dx \tag{1.3} \]

with the constant $C_{\gamma,d,0}$ from (1.1).

Remark 1.3. In particular, if we replace $V$ by $\beta V$, where $V$ is bounded and compactly supported, then the r.h.s. of (1.3) is zero for sufficiently small $\beta > 0$. As explained in the introduction, this is an important feature of (1.3) which is not shared by the classical estimate (0.1).

Remark 1.4. Neither Theorem 1.1 nor Corollary 1.2 holds for $\gamma = 0$. This follows from the fact that if $V \ge 0$, $V \not\equiv 0$, then the operator $-\Delta - \frac{(d-2)^2}{4|x|^2} - \beta V$ has a negative eigenvalue for all $\beta > 0$, see Remark 8.2 in [Wei] or, for the case of a spherically symmetric $V$, our Proposition 3.2 below.

Remark 1.5. Our constants are explicit and given in (2.5). However, they might be far from optimal. It would be challenging to find their sharp values.

1.2. Schrödinger operators on the semi-axis. Our result has a one-dimensional analogue, which is an important ingredient in the proof of Theorem 1.1, but also of independent interest. We consider the operator $-\frac{d^2}{dr^2} - \frac{1}{4r^2} - V$ in $L_2(\mathbb{R}_+)$ with Dirichlet boundary conditions at the origin and prove

Theorem 1.6. Let $\gamma > 0$ and $\alpha \ge 0$ be such that $\gamma + \frac{1+\alpha}{2} > 1$. Then
\[ \operatorname{tr}\left( -\frac{d^2}{dr^2} - \frac{1}{4r^2} - V \right)_-^{\gamma} \le C_{\gamma,1,\alpha} \int_{\mathbb{R}_+} V(r)_+^{\gamma + \frac{1+\alpha}{2}}\, r^{\alpha}\, dr \]
with a constant $C_{\gamma,1,\alpha}$ independent of $V$.



Similarly as before, the precise statement involves the operator associated with the closure of the form
\[ \int_{\mathbb{R}_+} \left( |f'|^2 - \frac{1}{4r^2} |f|^2 - V|f|^2 \right) dr \tag{1.4} \]
on $C_0^\infty(\mathbb{R}_+)$. We stress that $\mathbb{R}_+ = (0,\infty)$. Note also that $\frac{1}{4r^2}$ is the critical Hardy weight if $d = 1$, see (3.1) below. We will prove this theorem in two steps. The case $\alpha \ge 1$ is dealt with in Sect. 2 using results of [EgKo] and the case $0 \le \alpha \le 1$ in Sect. 3 using explicit diagonalization of the operator $-\frac{d^2}{dr^2} - \frac{1}{4r^2}$.

Remark 1.7. If $\alpha = 0$ and $\gamma > \frac12$ this leads to a Lieb-Thirring-type inequality in the spirit of Corollary 1.2. It would be interesting to extend the estimate to the critical case $\gamma = \frac12$.

Remark 1.8. If $\alpha \ge 1$ we can take any $\gamma > 0$. However, a similar estimate for $\gamma = 0$ cannot hold due to the virtual level of $-\frac{d^2}{dr^2} - \frac{1}{4r^2}$, see Proposition 3.2 below.

Remark 1.9. If $\alpha = 1$ this is a Bargmann-type inequality. Recall that
\[ \operatorname{tr}\left( -\frac{d^2}{dr^2} + \frac{s(s+1)}{r^2} - V \right)_-^{\gamma} \le \frac{1}{(1+2s)(\gamma+1)} \int_{\mathbb{R}_+} V(r)_+^{\gamma+1}\, r\, dr \tag{1.5} \]
for $s > -\frac12$ and $\gamma \ge 0$. Indeed, the proof of the Bargmann inequality in [Si2] (Theorem 7.3) applies for any $s > -\frac12$ and $\gamma = 0$. By the argument of Aizenman-Lieb [AiLi] one extends this inequality to $\gamma \ge 0$. When $s \to -\frac12$ the constant in (1.5) diverges to infinity. Our Theorem 1.6 with $\alpha = 1$ yields (1.5) for $s = -\frac12$ and $\gamma > 0$ with a finite constant. Indeed, the constant can be chosen as $2\pi L_{\gamma,2}$ with $L_{\gamma,2}$ from (0.1), see Remark 2.3.

1.3. Aharonov-Bohm operators in $d = 2$. If $d = 2$ then the Hardy inequality (0.2) becomes trivial. In this case it is interesting to consider the magnetic Schrödinger operator $(-i\nabla - \phi A)^2$, where $\phi \in \mathbb{R}$ and $A$ is the Aharonov-Bohm magnetic vector potential,
\[ A(x) = |x|^{-2}(-x_2, x_1), \qquad x \in \mathbb{R}^2 \setminus \{0\}. \]
As usual, $(-i\nabla - \phi A)^2$ is defined as the Friedrichs extension of the corresponding differential operator on $C_0^\infty(\mathbb{R}^2 \setminus \{0\})$. By gauge invariance we can assume that $-\frac12 < \phi \le \frac12$. Recall (see [LaWei1]) the Hardy-type inequality
\[ \phi^2 \int_{\mathbb{R}^2} \frac{|u|^2}{|x|^2}\, dx \le \int_{\mathbb{R}^2} |(-i\nabla - \phi A)u|^2\, dx, \qquad u \in C_0^\infty(\mathbb{R}^2 \setminus \{0\}). \]
Here the constant $\phi^2$ is sharp. Our result is

Theorem 1.10. Let $\gamma > 0$, $\alpha \ge 0$ and $-\frac12 < \phi \le \frac12$. Then
\[ \operatorname{tr}\left( (-i\nabla - \phi A)^2 - \frac{\phi^2}{|x|^2} - V \right)_-^{\gamma} \le C_{\gamma,2,\alpha} \int_{\mathbb{R}^2} V(x)_+^{\gamma + \frac{2+\alpha}{2}}\, |x|^{\alpha}\, dx \]
with a constant $C_{\gamma,2,\alpha}$ independent of $V$ and $\phi$. The proof is found in Sect. 4.



2. Proof of Theorem 1.1

In this section we assume, unless stated otherwise, that $d \ge 3$ and write
\[ H_0 = -\Delta - \frac{(d-2)^2}{4|x|^2}. \]
The main idea in the proof of Theorem 1.1 is to consider $H_0 - V$ separately on the space of spherically symmetric functions and on its orthogonal complement. An essential ingredient in our study of the operator on the former space will be the following result from [EgKo].

Proposition 2.1. Let $d \ge 2$, $\gamma > 0$ and $\alpha \ge 0$. Then
\[ \operatorname{tr}(-\Delta - V)_-^{\gamma} \le C^{EK}_{\gamma,d,\alpha} \int_{\mathbb{R}^d} V(x)_+^{\gamma + \frac{d+\alpha}{2}}\, |x|^{\alpha}\, dx \]

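with a constant $C^{EK}_{\gamma,d,\alpha}$ independent of $V$.

For the convenience of the reader we will give a proof of this proposition in the appendix. We remark that for $\alpha = 0$ this coincides with the classical Lieb-Thirring inequality (0.1). The inclusion of the weight $|x|^{\alpha}$ increases the power of $V$ by $\frac{\alpha}{2}$ as compared to (0.1). That this is necessary can easily be seen by scaling of the space variables. We note that the result holds also if $d \ge 3$ and $\gamma = 0$ and if $d = 1$ and $\gamma > \frac{1+\alpha}{2}$.

From Proposition 2.1 we now deduce the first part of Theorem 1.6. Recall that we consider the operator $-\frac{d^2}{dr^2} - \frac{1}{4r^2} - V$ in $L_2(\mathbb{R}_+)$ with Dirichlet boundary condition.

Corollary 2.2. Let $\gamma > 0$ and $\alpha \ge 1$. Then
\[ \operatorname{tr}\left( -\frac{d^2}{dr^2} - \frac{1}{4r^2} - V \right)_-^{\gamma} \le C_{\gamma,1,\alpha} \int_{\mathbb{R}_+} V(r)_+^{\gamma + \frac{1+\alpha}{2}}\, r^{\alpha}\, dr \]
with a constant $C_{\gamma,1,\alpha}$ independent of $V$.

Proof. The operator $-\Delta - V(|\cdot|)$ in $L_2(\mathbb{R}^2)$ is unitarily equivalent to the direct sum $\bigoplus_{n \in \mathbb{Z}} (h_n - V)$ in $\bigoplus_{n \in \mathbb{Z}} L_2(\mathbb{R}_+)$, where we define
\[ h_n - V := -\frac{d^2}{dr^2} - \frac{1}{4r^2} + \frac{n^2}{r^2} - V \]
as quadratic form on $C_0^\infty(\mathbb{R}_+)$. (Here we used that $C_0^\infty(\mathbb{R}^2 \setminus \{0\})$ is a form core for $-\Delta - V$.) Hence Proposition 2.1 yields
\[
\begin{aligned}
\operatorname{tr}_{L_2(\mathbb{R}_+)}\left( -\frac{d^2}{dr^2} - \frac{1}{4r^2} - V \right)_-^{\gamma}
&\le \operatorname{tr}_{L_2(\mathbb{R}^2)} \bigl( -\Delta - V(|\cdot|) \bigr)_-^{\gamma}\\
&\le C^{EK}_{\gamma,2,\alpha-1} \int_{\mathbb{R}^2} V(|x|)_+^{\gamma + \frac{1+\alpha}{2}}\, |x|^{\alpha-1}\, dx\\
&= 2\pi\, C^{EK}_{\gamma,2,\alpha-1} \int_{\mathbb{R}_+} V(r)_+^{\gamma + \frac{1+\alpha}{2}}\, r^{\alpha}\, dr
\end{aligned}
\]
as claimed.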
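The unitary equivalence used in this proof rests on a standard polar-coordinate computation, sketched here for convenience. The ansatz $u(x) = r^{-1/2} f(r) e^{in\theta}$ and the symbols $u$, $f$ are ours, introduced only for illustration.

```latex
\begin{align*}
\Delta u &= \partial_r^2 u + \frac1r\,\partial_r u + \frac{1}{r^2}\,\partial_\theta^2 u,
  \qquad u = r^{-1/2} f(r)\, e^{in\theta},\\
\partial_r^2 u + \frac1r\,\partial_r u
  &= r^{-1/2}\Bigl(f'' + \frac{1}{4r^2}\, f\Bigr) e^{in\theta},
  \qquad
  \frac{1}{r^2}\,\partial_\theta^2 u = -\frac{n^2}{r^2}\, r^{-1/2} f\, e^{in\theta},\\
-\Delta u &= r^{-1/2}\Bigl(-f'' - \frac{1}{4r^2}\, f + \frac{n^2}{r^2}\, f\Bigr) e^{in\theta}.
\end{align*}
```

Since $\int_{\mathbb{R}^2} |u|^2\,dx = 2\pi \int_0^\infty |f|^2\,dr$, the factor $r^{-1/2}$ absorbs the polar measure $r\,dr$. The channel $n = 0$ therefore carries exactly the half-line operator with the critical weight $\frac{1}{4r^2}$, and restricting the direct sum to that channel gives the first inequality in the chain.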



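Remark 2.3. In the case $\alpha = 1$ we can apply the Lieb-Thirring inequality (0.1) instead of Proposition 2.1. This shows that the sharp value of the constant $C_{\gamma,1,1}$ is bounded from above by $2\pi L_{\gamma,2}$ with $L_{\gamma,2}$ from (0.1).

Corollary 2.2 will allow us to treat the part of $H_0 - V$ on spherically symmetric functions. On the orthogonal complement of that space one has an improved Hardy inequality.

Lemma 2.4. Let $d \ge 2$. Then
\[ \frac{d^2}{4} \int_{\mathbb{R}^d} \frac{|u|^2}{|x|^2}\, dx \le \int_{\mathbb{R}^d} |\nabla u|^2\, dx \]
for all $u \in C_0^\infty(\mathbb{R}^d)$ satisfying $\int_{S^{d-1}} u(r\omega)\, d\omega = 0$ for all $r \ge 0$.

This inequality appears, e.g., in [BiLa]. We sketch the simple proof.

Proof. We substitute $u = |x|^{(2-d)/2} v$ and obtain
\[
\begin{aligned}
\int_{\mathbb{R}^d} \left( |\nabla u|^2 - \frac{(d-2)^2}{4|x|^2} |u|^2 \right) dx
&= \int_{\mathbb{R}^d} |\nabla v|^2\, |x|^{2-d}\, dx\\
&= \int_0^\infty \int_{S^{d-1}} \left( \left| \frac{\partial v}{\partial r} \right|^2 + \frac{|\nabla_\theta v|^2}{r^2} \right) d\theta\, r\, dr\\
&\ge \int_0^\infty r^{-1} \int_{S^{d-1}} |\nabla_\theta v|^2\, d\theta\, dr.
\end{aligned}
\tag{2.1}
\]
For fixed $r$ the function $v(r\,\cdot)$ is orthogonal to constants, i.e., to the first eigenfunction of the Laplace-Beltrami operator on $S^{d-1}$. Since the next eigenvalue is $d - 1$ we find
\[ \int_{S^{d-1}} |\nabla_\theta v(r\theta)|^2\, d\theta \ge (d-1) \int_{S^{d-1}} |v(r\theta)|^2\, d\theta. \]
Multiplying by $r^{-1}$ and integrating yields
\[ \int_0^\infty r^{-1} \int_{S^{d-1}} |\nabla_\theta v(r\theta)|^2\, d\theta\, dr \ge (d-1) \int_{\mathbb{R}^d} |x|^{-2} |u|^2\, dx. \]
Combining this with (2.1) we obtain the result.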
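The constant $\frac{d^2}{4}$ in Lemma 2.4 is the sum of the ordinary Hardy constant and the spectral gap $d-1$ on the sphere. The arithmetic, spelled out as a quick check:

```latex
\[
\frac{(d-2)^2}{4} + (d-1)
 = \frac{d^2 - 4d + 4 + 4d - 4}{4}
 = \frac{d^2}{4}.
\]
```

For example, $d = 3$ gives $\frac14 + 2 = \frac94$, compared with the constant $\frac14$ in (0.2).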

Now we are in a position to give the

Proof of Theorem 1.1. By the variational principle it suffices to prove the result for $V \ge 0$. Moreover, we will assume that the r.h.s. of (1.1) is finite and that the quadratic form (1.2) is lower semi-bounded on $C_0^\infty(\mathbb{R}^d)$. (Note that these assumptions are satisfied, e.g., if $V$ is bounded and has compact support.) Since the form (1.2) is closable we can define $H_0 - V$ as the operator associated with this form. At the end of the proof we use a standard approximation argument to show that the finiteness of the r.h.s. of (1.1) implies the lower semi-boundedness. We denote by $P$ the projection onto spherically symmetric functions,
\[ (Pu)(x) := |S^{d-1}|^{-1} \int_{S^{d-1}} u(|x|\omega)\, d\omega, \qquad x \in \mathbb{R}^d, \]



and put $Q := I - P$. Note that $P$ and $Q$ commute with $H_0$. Moreover, for $u \in C_0^\infty(\mathbb{R}^d)$ one has
\[ 2\,\mathrm{Re}\,(PVQu, u) \le 2\, \|V^{1/2} Qu\| \cdot \|V^{1/2} Pu\| \le (PVPu, u) + (QVQu, u), \]
which implies the operator inequality $PVQ + QVP \le PVP + QVQ$. It follows that
\[ H_0 - V = P(H_0 - V)P + Q(H_0 - V)Q - PVQ - QVP \ge P(H_0 - 2V)P + Q(H_0 - 2V)Q, \]
and hence
\[ \operatorname{tr}(H_0 - V)_-^{\gamma} \le \operatorname{tr}\bigl( P(H_0 - 2V)P \bigr)_-^{\gamma} + \operatorname{tr}\bigl( Q(H_0 - 2V)Q \bigr)_-^{\gamma}. \tag{2.2} \]

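We consider the two terms separately and begin with the second one. By Lemma 2.4 we find for all $0 < \rho \le 1$ that
\[ Q(H_0 - 2V)Q \ge \rho\, Q\bigl( -\Delta - \rho^{-1} 2V \bigr)Q + \tfrac14 \bigl( (1-\rho) d^2 - (d-2)^2 \bigr)\, Q |x|^{-2} Q. \]
We choose $\rho$ such that $(1-\rho) d^2 = (d-2)^2$ and obtain from Proposition 2.1 (or (0.1) if $\alpha = 0$) that
\[
\begin{aligned}
\operatorname{tr}\bigl( Q(H_0 - 2V)Q \bigr)_-^{\gamma}
&\le \rho^{\gamma} \operatorname{tr}\bigl( Q(-\Delta - \rho^{-1} 2V)Q \bigr)_-^{\gamma}\\
&\le \rho^{\gamma} \operatorname{tr}\bigl( -\Delta - \rho^{-1} 2V \bigr)_-^{\gamma}\\
&\le \rho^{-\frac{d+\alpha}{2}}\, 2^{\gamma + \frac{d+\alpha}{2}}\, C^{EK}_{\gamma,d,\alpha} \int_{\mathbb{R}^d} V(x)^{\gamma + \frac{d+\alpha}{2}}\, |x|^{\alpha}\, dx.
\end{aligned}
\tag{2.3}
\]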
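The choice of $\rho$ and the resulting prefactor can be made explicit. The following is a direct computation from the condition $(1-\rho)d^2 = (d-2)^2$ and from the homogeneity of the bound in Proposition 2.1; nothing beyond those two facts is assumed.

```latex
\begin{align*}
\rho &= 1 - \frac{(d-2)^2}{d^2} = \frac{4(d-1)}{d^2} \in (0,1] \quad (d \ge 2),
  \qquad \text{e.g. } \rho = \tfrac{8}{9} \text{ for } d = 3,\\
\rho^{\gamma}\,\Bigl(\frac{2}{\rho}\Bigr)^{\gamma+\frac{d+\alpha}{2}}
  &= 2^{\gamma+\frac{d+\alpha}{2}}\,\rho^{-\frac{d+\alpha}{2}} .
\end{align*}
```

The second line is the factor appearing in the final bound for the part on the orthogonal complement of the spherically symmetric functions.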

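Now we turn to the term $\operatorname{tr}\bigl( P(H_0 - 2V)P \bigr)_-^{\gamma}$. We define the spherical average of $V$ by
\[ \tilde V(r) := |S^{d-1}|^{-1} \int_{S^{d-1}} V(r\omega)\, d\omega, \qquad r \in \mathbb{R}_+, \]
and note that (the non-trivial part of) $P(H_0 - 2V)P$ is unitarily equivalent to the operator $-\frac{d^2}{dr^2} - \frac{1}{4r^2} - 2\tilde V$ in $L_2(\mathbb{R}_+)$. By Corollary 2.2 (with $\alpha$ replaced by $\alpha + d - 1$) we obtain
\[ \operatorname{tr}\bigl( P(H_0 - 2V)P \bigr)_-^{\gamma} \le 2^{\gamma + \frac{d+\alpha}{2}}\, C_{\gamma,1,\alpha+d-1} \int_{\mathbb{R}_+} \tilde V(r)^{\gamma + \frac{d+\alpha}{2}}\, r^{\alpha+d-1}\, dr. \]
Now Hölder's (or Jensen's) inequality implies that
\[ \tilde V(r)^{\gamma + \frac{d+\alpha}{2}} \le |S^{d-1}|^{-1} \int_{S^{d-1}} V(r\omega)^{\gamma + \frac{d+\alpha}{2}}\, d\omega, \qquad r \in \mathbb{R}_+, \]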



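and hence
\[
\begin{aligned}
\operatorname{tr}\bigl( P(H_0 - 2V)P \bigr)_-^{\gamma}
&\le 2^{\gamma + \frac{d+\alpha}{2}}\, |S^{d-1}|^{-1}\, C_{\gamma,1,\alpha+d-1} \int_{\mathbb{R}_+} \int_{S^{d-1}} V(r\omega)^{\gamma + \frac{d+\alpha}{2}}\, r^{\alpha+d-1}\, d\omega\, dr\\
&= 2^{\gamma + \frac{d+\alpha}{2}}\, |S^{d-1}|^{-1}\, C_{\gamma,1,\alpha+d-1} \int_{\mathbb{R}^d} V(x)^{\gamma + \frac{d+\alpha}{2}}\, |x|^{\alpha}\, dx.
\end{aligned}
\tag{2.4}
\]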

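Adding (2.3) and (2.4) we obtain the assertion in view of (2.2) with a constant satisfying
\[ C_{\gamma,d,\alpha} \le 2^{\gamma + \frac{d+\alpha}{2}} \left( \rho^{-\frac{d+\alpha}{2}}\, C^{EK}_{\gamma,d,\alpha} + |S^{d-1}|^{-1}\, C_{\gamma,1,\alpha+d-1} \right). \tag{2.5} \]
To complete the proof it remains to show that if $0 \le V \in L_{1,\mathrm{loc}}(\mathbb{R}^d)$ is such that the r.h.s. of (1.1) is finite then the quadratic form (1.2) is lower semi-bounded on $C_0^\infty(\mathbb{R}^d)$. To see this, choose bounded, compactly supported functions $0 \le V_n \le V$ such that $V_n \to V$ a.e. and
\[ \int_{\mathbb{R}^d} \bigl( V(x) - V_n(x) \bigr)^{\gamma + \frac{d+\alpha}{2}}\, |x|^{\alpha}\, dx \to 0. \tag{2.6} \]
The operators $H_0 - V_n$ are well-defined and (1.1) holds for them. In particular, $\lambda(V_n) := \inf \sigma(H_0 - V_n)$ satisfies
\[ \lambda(V_n) \ge -C_{\gamma,d,\alpha}^{\frac{1}{\gamma}} \left( \int_{\mathbb{R}^d} V_n(x)^{\gamma + \frac{d+\alpha}{2}}\, |x|^{\alpha}\, dx \right)^{\frac{1}{\gamma}}. \]
Hence for any $u \in C_0^\infty(\mathbb{R}^d)$ one has
\[ \int_{\mathbb{R}^d} \left( |\nabla u|^2 - \frac{(d-2)^2}{4|x|^2} |u|^2 - V_n |u|^2 \right) dx \ge -C_{\gamma,d,\alpha}^{\frac{1}{\gamma}} \left( \int_{\mathbb{R}^d} V_n(x)^{\gamma + \frac{d+\alpha}{2}}\, |x|^{\alpha}\, dx \right)^{\frac{1}{\gamma}} \|u\|^2. \]
Using dominated convergence and (2.6) we can pass to the limit $n \to \infty$ and find that also the form (1.2) is bounded from below on $C_0^\infty(\mathbb{R}^d)$. This completes the proof.

Proof of Corollary 1.2. Assume that the r.h.s. of (1.3) is finite. Then according to our comments after Theorem 1.1 the form
\[ \int_{\mathbb{R}^d} \left( |\nabla u|^2 - \tilde V |u|^2 \right) dx, \tag{2.7} \]
where
\[ \tilde V(x) := \frac{(d-2)^2}{4|x|^2} + \left( V(x) - \frac{(d-2)^2}{4|x|^2} \right)_+, \]
is lower semi-bounded on $C_0^\infty(\mathbb{R}^d)$ and we denote by $-\Delta - \tilde V$ the operator associated with its closure. Since $V \le \tilde V$ the form (2.7), with $\tilde V$ replaced by $V$, is also lower semi-bounded on $C_0^\infty(\mathbb{R}^d)$ and the associated operator satisfies $-\Delta - V \ge -\Delta - \tilde V$. Now Corollary 1.2 follows from Theorem 1.1 with $\alpha = 0$ by the variational principle.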



3. Proof of Theorem 1.6

Recall the one-dimensional Hardy inequality
\[ \int_{\mathbb{R}_+} \frac{|f(r)|^2}{r^2}\, dr \le 4 \int_{\mathbb{R}_+} |f'(r)|^2\, dr, \qquad f \in C_0^\infty(\mathbb{R}_+). \tag{3.1} \]
It allows us to define the non-negative operator
\[ h_0 = -\frac{d^2}{dr^2} - \frac{1}{4r^2} \quad \text{in } L_2(\mathbb{R}_+) \tag{3.2} \]
as the Friedrichs extension of the quadratic form (1.4) with $V \equiv 0$ on $C_0^\infty(\mathbb{R}_+)$. This operator can be diagonalized explicitly. Indeed, let $J_0$ be the Bessel function of the first kind of order zero (see [AbSt]). Then
\[ (F_0 f)(k) := \int_0^\infty \sqrt{kr}\, J_0(kr)\, f(r)\, dr, \qquad k \in \mathbb{R}_+, \]
initially defined for $f \in C_0^\infty(\mathbb{R}_+)$, can be extended to a unitary operator $F_0 : L_2(\mathbb{R}_+) \to L_2(\mathbb{R}_+)$. It has the property
\[ (F_0 h_0 f)(k) = k^2 (F_0 f)(k), \qquad k \in \mathbb{R}_+, \tag{3.3} \]

for all $f \in D(h_0)$. (These facts are essentially contained in Chapter 4 of [StWei].) We denote by $N(\tau, h_0 - V)$ the number of eigenvalues less than $-\tau$, counting multiplicities, of the operator $h_0 - V$ in $L_2(\mathbb{R}_+)$. Our proof of Theorem 1.6 relies on the following

Lemma 3.1. Let $q > 1$, $0 \le \alpha \le 1$ be such that $2q - \alpha > 1$. Then
\[ N(\tau, h_0 - V) \le C_{\alpha,q}\, \tau^{-q + \frac{1+\alpha}{2}} \int_{\mathbb{R}_+} V(r)_+^{q}\, r^{\alpha}\, dr, \qquad \tau > 0, \tag{3.4} \]
with a constant $C_{\alpha,q}$ independent of $V$.

What we precisely prove is that if $V \in L_{1,\mathrm{loc}}(\mathbb{R}_+)$ and if the r.h.s. of (3.4) is finite, then the form (1.4) is closed and lower semi-bounded on $D(h_0^{1/2})$ and for the corresponding self-adjoint operator $h_0 - V$ the estimate (3.4) holds.

Before we begin the proof we recall (see [BiSo1, Si2]) that $S_p$ denotes the class of compact operators $K$ (in a given Hilbert space, in our case in $L_2(\mathbb{R}_+)$) such that
\[ \|K\|_p^p := \operatorname{tr}(K^* K)^{\frac p2} < \infty. \]
We will use the following fact (see [LiTh]). If $q \ge 1$ and $A$, $B$ are self-adjoint, non-negative operators such that $A^q B^q \in S_2$, then $AB \in S_{2q}$ and
\[ \|AB\|_{2q}^{2q} \le \|A^q B^q\|_2^2. \tag{3.5} \]



Proof of Lemma 3.1. Scaling with respect to the space variables shows that it is enough to consider the case $\tau = 1$. Moreover, by the variational principle we may assume $V \ge 0$. By the Birman-Schwinger principle and the inequality (3.5) we obtain
\[ N(1, h_0 - V) \le \bigl\| V^{1/2} (h_0 + I)^{-1/2} \bigr\|_{2q}^{2q} \le \bigl\| V^{q/2} (h_0 + I)^{-q/2} \bigr\|_2^2. \]
It follows from this estimate that we can restrict ourselves to, say, bounded and compactly supported $V$. The general result as well as the comment we made after the lemma are then derived in a standard way. It follows from (3.3) that the operator $V^{q/2} (h_0 + I)^{-q/2} F_0^*$ has the integral kernel
\[ V(r)^{\frac q2} \sqrt{rk}\, J_0(rk)\, (k^2 + 1)^{-\frac q2}, \qquad r, k \in \mathbb{R}_+, \]
and therefore
\[ \bigl\| V^{q/2} (h_0 + I)^{-q/2} \bigr\|_2^2 = \int_{\mathbb{R}_+} \int_{\mathbb{R}_+} rk\, J_0^2(rk)\, (k^2 + 1)^{-q}\, V(r)^q\, dk\, dr. \]
Recall (see [AbSt]) that $J_0$ is a continuous function with $J_0(0) = 1$ and
\[ J_0(x) \sim \sqrt{\frac{2}{\pi}}\, \frac{\cos(x - \pi/4)}{\sqrt{x}} \quad \text{as } x \to \infty. \]
Hence with $c_\alpha := \sup_{x > 0} x^{(1-\alpha)/2} |J_0(x)| < \infty$ we can estimate
\[ N(1, h_0 - V) \le c_\alpha^2 \int_{\mathbb{R}_+} \int_{\mathbb{R}_+} (rk)^{\alpha} (k^2 + 1)^{-q}\, V(r)^q\, dk\, dr = C_{\alpha,q} \int_{\mathbb{R}_+} V(r)^q\, r^{\alpha}\, dr. \]
Here $C_{\alpha,q} := c_\alpha^2 \int_{\mathbb{R}_+} k^{\alpha} (1 + k^2)^{-q}\, dk$ is finite in view of our assumptions.
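The role of the hypotheses $0 \le \alpha \le 1$ and $2q - \alpha > 1$ in this proof can be read off from two elementary observations, sketched below. The symbols $c_\alpha$ and $C_{\alpha,q}$ are those of the proof; the only inputs are the boundedness of $J_0$ near the origin and its decay of order $x^{-1/2}$ at infinity.

```latex
\begin{align*}
x^{\frac{1-\alpha}{2}}\,|J_0(x)| &\text{ is bounded near } 0
   \iff \tfrac{1-\alpha}{2} \ge 0 \iff \alpha \le 1,\\
x^{\frac{1-\alpha}{2}}\,|J_0(x)| &= O\bigl(x^{-\frac{\alpha}{2}}\bigr) \text{ is bounded at } \infty
   \iff \alpha \ge 0,\\
\int_1^\infty k^{\alpha}(1+k^2)^{-q}\,dk &< \infty
   \iff \alpha - 2q < -1 \iff 2q - \alpha > 1 .
\end{align*}
```

Near $k = 0$ the integrand $k^{\alpha}(1+k^2)^{-q}$ is bounded because $\alpha \ge 0$, so the last line is the only integrability condition.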

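Given Lemma 3.1 we obtain in a standard manner the

Proof of Theorem 1.6. The case $\alpha \ge 1$ was already proven in Corollary 2.2. We assume now $0 \le \alpha \le 1$. The operator inequality
\[ h_0 - V + t \ge h_0 - \Bigl( V - \frac t2 \Bigr)_+ + \frac t2 \]
implies $N(t, h_0 - V) \le N\bigl( \frac t2, h_0 - (V - \frac t2)_+ \bigr)$ and hence
\[
\begin{aligned}
\operatorname{tr}(h_0 - V)_-^{\gamma}
&= \gamma \int_{\mathbb{R}_+} N(t, h_0 - V)\, t^{\gamma-1}\, dt\\
&\le \gamma \int_{\mathbb{R}_+} N\bigl( \tfrac t2, h_0 - (V - \tfrac t2)_+ \bigr)\, t^{\gamma-1}\, dt\\
&= \gamma\, 2^{\gamma} \int_{\mathbb{R}_+} N\bigl( \tau, h_0 - (V - \tau)_+ \bigr)\, \tau^{\gamma-1}\, d\tau.
\end{aligned}
\]
Now we fix $1 < q < \gamma + \frac{1+\alpha}{2}$ and apply Lemma 3.1 to obtain
\[
\begin{aligned}
\operatorname{tr}(h_0 - V)_-^{\gamma}
&\le \gamma\, 2^{\gamma}\, C_{\alpha,q} \int_{\mathbb{R}_+} \int_{\mathbb{R}_+} (V(r) - \tau)_+^{q}\, \tau^{\gamma - q + \frac{\alpha-1}{2}}\, d\tau\, r^{\alpha}\, dr\\
&= \gamma\, 2^{\gamma}\, C_{\alpha,q}\, B\bigl( \gamma + \tfrac{1+\alpha}{2} - q,\ q + 1 \bigr) \int_{\mathbb{R}_+} V(r)_+^{\gamma + \frac{1+\alpha}{2}}\, r^{\alpha}\, dr,
\end{aligned}
\]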



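where $B(a,b) = \int_0^1 s^{a-1} (1-s)^{b-1}\, ds$ is the beta function. Finally, one may optimize over all $1 < q < \gamma + \frac{1+\alpha}{2}$. This establishes the theorem.

In Theorem 1.6 it is impossible to take $\gamma = 0$ since the operator $h_0$ has a virtual level. More precisely, one has

Proposition 3.2. Let $V$ obey $\int_{\mathbb{R}_+} |V(r)|^{1+\delta}\, r\, dr < \infty$ and $\int_{\mathbb{R}_+} |V(r)| (1 + r^{\delta})\, r\, dr < \infty$ for some $\delta > 0$. Then $h_0 - \beta V$ has a negative eigenvalue for all $\beta > 0$ if and only if $\int_{\mathbb{R}_+} V(r)\, r\, dr \ge 0$, $V \not\equiv 0$. In this case, for sufficiently small $\beta$ the eigenvalue $\lambda(\beta)$ is unique and satisfies, as $\beta \to 0$,
\[
\begin{aligned}
\lambda(\beta) &\sim -\exp\left( -\left( \frac{\beta}{2} \int_{\mathbb{R}_+} V(r)\, r\, dr \right)^{-1} \right) && \text{if } \int_{\mathbb{R}_+} V(r)\, r\, dr > 0,\\
\lambda(\beta) &\sim -\exp\left( -\frac{c}{\beta^2} \right) && \text{if } \int_{\mathbb{R}_+} V(r)\, r\, dr = 0
\end{aligned}
\]
with a suitable constant $c = c(V) > 0$.

Here we use the notation $\lambda(\beta) \sim -\exp\left( -a\beta^{-\rho} \right)$ meaning
\[ \lim_{\beta \to 0} -\beta^{\rho} \log(-\lambda(\beta)) = a. \]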

The proof uses the same idea as the proof of Corollary 2.2.

Proof. We recall that $-\Delta - \beta V(|\cdot|)$ in $L_2(\mathbb{R}^2)$ is unitarily equivalent to $\bigoplus_{n \in \mathbb{Z}} (h_n - \beta V)$ in $\bigoplus_{n \in \mathbb{Z}} L_2(\mathbb{R}_+)$, where $h_n - \beta V$ are defined similarly as in the proof of Corollary 2.2. Clearly $\bigoplus_{n \in \mathbb{Z}} (h_n - \beta V)$ has a negative eigenvalue if and only if $h_0 - \beta V$ has a negative eigenvalue. The assertion follows therefore from Theorem 3.4 in [Si1].

4. Proof of Theorem 1.10

The proof of Theorem 1.10 is similar to the proof of Theorem 1.1 and we only sketch the major steps. We write
\[ H_\phi := (-i\nabla - \phi A)^2 - \frac{\phi^2}{|x|^2}. \]
With polar coordinates $(r, \theta)$ we define the projections $P_n$, $n \in \mathbb{Z}$, in $L_2(\mathbb{R}^2)$,
\[ (P_n u)(r, \theta) := \frac{1}{2\pi} \int_{-\pi}^{\pi} u(r, \omega)\, e^{-in\omega}\, d\omega\; e^{in\theta}, \qquad r > 0,\ \theta \in (-\pi, \pi). \]
The subspace $P_n L_2(\mathbb{R}^2)$ reduces $H_\phi$ and its part in this space is unitarily equivalent to the operator $-\frac{d^2}{dr^2} - \frac{1}{4r^2} + \frac{n(n - 2\phi)}{r^2}$ in $L_2(\mathbb{R}_+)$, defined as quadratic form on $C_0^\infty(\mathbb{R}_+)$. Note that this operator coincides with (3.2) if $n = 0$ and, if $\phi = \frac12$, also if $n = 1$. This means that $H_\phi$ has one virtual level if $|\phi| < \frac12$ and two virtual levels if $\phi = \frac12$.
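The form of the reduced operator can be checked by a short computation in polar coordinates. In this sketch $e_\theta$ denotes the angular unit vector and $g$, $f$ are radial profiles; these symbols are ours and serve only as illustration.

```latex
\begin{align*}
A &= \frac{1}{r}\, e_\theta, \qquad
(-i\nabla - \phi A)^2 \bigl( g(r)\, e^{in\theta} \bigr)
  = \Bigl( -g'' - \frac1r\, g' + \frac{(n-\phi)^2}{r^2}\, g \Bigr) e^{in\theta},\\
(n-\phi)^2 - \phi^2 &= n^2 - 2n\phi = n(n - 2\phi),\\
g = r^{-1/2} f: \quad -g'' - \frac1r\, g'
  &= r^{-1/2} \Bigl( -f'' - \frac{1}{4r^2}\, f \Bigr).
\end{align*}
```

Combining the three lines gives the half-line operator with the critical weight $\frac{1}{4r^2}$ plus the centrifugal term with coefficient $n(n-2\phi)$, which vanishes for $n = 0$ and, when $\phi = \frac12$, also for $n = 1$.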



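Proof of Theorem 1.10. We assume $V \ge 0$ and put $P := P_{-1} + P_0$ if $-\frac{1}{2} < \varphi \le 0$, $P := P_0 + P_1$ if $0 < \varphi \le \frac{1}{2}$ and $Q := I - P$. (We emphasize that one may also take $P = P_0$ if $|\varphi| < \frac{1}{2}$, but then the constants below will blow up as $|\varphi| \to \frac{1}{2}$.) As in the proof of Theorem 1.1 one finds

$$ \operatorname{tr}(H_\varphi - V)_-^{\gamma} \le \operatorname{tr}\big(P (H_\varphi - 2V) P\big)_-^{\gamma} + \operatorname{tr}\big(Q (H_\varphi - 2V) Q\big)_-^{\gamma}. $$

On $Q L_2(\mathbb{R}^2)$ we use the estimate $Q H_\varphi Q \ge (1 - |\varphi|)\, Q(-\Delta)Q$, which is easily obtained by decomposition into the subspaces $P_n L_2(\mathbb{R}^2)$. By Proposition 2.1 we conclude

$$ \operatorname{tr}\big(Q (H_\varphi - 2V) Q\big)_-^{\gamma} \le (1 - |\varphi|)^{\gamma} \operatorname{tr}\big(Q (-\Delta - 2(1 - |\varphi|)^{-1} V) Q\big)_-^{\gamma} \le (1 - |\varphi|)^{-\frac{2+\alpha}{2}}\, 2^{\gamma + \frac{2+\alpha}{2}}\, C^{EK}_{\gamma,2,\alpha} \int_{\mathbb{R}^2} V(x)^{\gamma + \frac{2+\alpha}{2}} |x|^{\alpha}\, dx. $$

On the orthogonal complement $P L_2(\mathbb{R}^2)$ we estimate

$$ P (H_\varphi - 2V) P \ge P_0 (H_\varphi - 4V) P_0 + P_{\mp 1} (H_\varphi - 4V) P_{\mp 1}. $$

The latter operator is unitarily equivalent to the operator

$$ \Big( -\frac{d^2}{dr^2} - \frac{1}{4r^2} - 4\tilde V \Big) \oplus \Big( -\frac{d^2}{dr^2} - \frac{1}{4r^2} + \frac{1 - 2|\varphi|}{r^2} - 4\tilde V \Big) $$

in $L_2(\mathbb{R}_+) \oplus L_2(\mathbb{R}_+)$, where

$$ \tilde V(r) := \frac{1}{2\pi} \int_{-\pi}^{\pi} V(r, \theta)\, d\theta, \qquad r > 0. $$

We estimate $1 - 2|\varphi| \ge 0$ and conclude by Corollary 2.2 that

$$ \operatorname{tr}\big(P (H_\varphi - 2V) P\big)_-^{\gamma} \le 2 \operatorname{tr}\Big( -\frac{d^2}{dr^2} - \frac{1}{4r^2} - 4\tilde V \Big)_-^{\gamma} \le 2 \cdot 4^{\gamma + \frac{2+\alpha}{2}}\, C_{\gamma,1,\alpha+1} \int_{\mathbb{R}_+} \tilde V(r)^{\gamma + \frac{2+\alpha}{2}}\, r^{\alpha+1}\, dr. $$

It remains to use Hölder’s inequality to complete the proof. Finally, we remark that the constants can be chosen independently of $\varphi$.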



Appendix A. An Inequality of Egorov-Kondrat’ev

Our exposition in this appendix follows rather closely [EgKo]. Proposition 2.1 can be deduced by standard arguments (as, e.g., in our proof of Theorem 1.6 in Sect. 3) provided we have established

Lemma A.1. Let $d \ge 2$, $q > 1$, $\alpha \ge 0$ such that $2q - \alpha > d$. Then

$$ N(\tau, -\Delta - V) \le C\, \tau^{-q + \frac{d+\alpha}{2}} \int_{\mathbb{R}^d} V(x)_+^{q}\, |x|^{\alpha}\, dx, \qquad \tau > 0, \tag{A.1} $$

with a constant $C = C(\alpha, d, q)$ independent of $V$. The same result (with the same proof) holds if $d = 1$, $q > 1$, $\alpha \ge 0$ such that $q - \alpha > 1$.

First some terminology. By a ‘cube’ we always mean a cube with edges parallel to the coordinate axes, and by its ‘length’ we mean the length of one of its edges. We need the following variant of Rozenblum’s covering lemma (see [EgKo], where also an explicit value for the constant can be found).

Lemma A.2. Let $d \ge 1$. Then there exists a constant $C_1 > 0$ such that for any $\varepsilon \in (0, 1]$, any cube $Q \subset \mathbb{R}^d$ and any non-negative $f \in L_1(Q)$ there exists a finite number of cubes $Q_1, \ldots, Q_M$ with the following properties:
(1) $Q \subset \bigcup_{j=1}^{M} Q_j$ and any point in $\mathbb{R}^d$ is contained in at most $C_1$ cubes,
(2) $\int_{Q_j} f(x)\, dx \le \varepsilon C_1 \int_{Q} f(x)\, dx$, $j = 1, \ldots, M$,
(3) $M \le \varepsilon^{-1} C_1$.

Lemma A.3. Let $0 \le s < 2$ and $1 \le p < \frac{d-s}{d-2}$ if $d \ge 3$ and $1 \le p < \infty$ if $d = 2$. Then there exists a constant $C_2 > 0$ such that for any cube $Q$ of length $l$ and any $u \in H^1(Q)$,

$$ \Big( \int_Q |u|^{2p} |x|^{-s}\, dx \Big)^{\frac{1}{p}} \le C_2\, l^{\,2-d+\frac{d-s}{p}} \int_Q \big( |\nabla u|^2 + l^{-2} |u|^2 \big)\, dx. \tag{A.2} $$

Moreover, if $\int_Q u\, dx = 0$ then

$$ \Big( \int_Q |u|^{2p} |x|^{-s}\, dx \Big)^{\frac{1}{p}} \le C_2\, l^{\,2-d+\frac{d-s}{p}} \int_Q |\nabla u|^2\, dx. \tag{A.3} $$

Proof. By scaling it suffices to consider the case $l = 1$. We can choose $1 < p_1 < \infty$ such that $2pp_1 \le \frac{2d}{d-2}$ if $d \ge 3$ and such that $sq_1 < d$, where $p_1^{-1} + q_1^{-1} = 1$. Then by Hölder’s inequality

$$ \int_Q |u|^{2p} |x|^{-s}\, dx \le \Big( \int_Q |u|^{2pp_1}\, dx \Big)^{\frac{1}{p_1}} \Big( \int_Q |x|^{-sq_1}\, dx \Big)^{\frac{1}{q_1}}. $$

The latter integral is finite, uniformly for all cubes of length one, by our choice of $q_1$. Moreover, by the Sobolev embedding theorems

$$ \Big( \int_Q |u|^{2pp_1}\, dx \Big)^{\frac{1}{pp_1}} \le C_2 \int_Q \big( |\nabla u|^2 + |u|^2 \big)\, dx $$

for some constant $C_2 = C_2(p, p_1, d)$. If $u$ has mean value zero we may use Poincaré’s inequality instead (see [LiLo]).



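Proof of Lemma A.1. As in the proof of Lemma 3.1 we may assume $\tau = 1$ and $V \ge 0$. Fix $q > 1$, $\alpha \ge 0$ such that $2q - \alpha > d$ and note that $p$ and $s$, defined by $p^{-1} + q^{-1} = 1$ and $\alpha = s(q - 1)$, satisfy the assumptions of Lemma A.3. Put

$$ I := \int_{\mathbb{R}^d} V^{q} |x|^{\alpha}\, dx $$

and introduce the unit cube $Q_0 := (0, 1)^d$. By Hölder’s inequality and (A.2) we obtain that

$$ \int_{\mathbb{R}^d} V |u|^2\, dx \le \sum_{k \in \mathbb{Z}^d} \Big( \int_{Q_0 + k} V^{q} |x|^{s(q-1)}\, dx \Big)^{\frac{1}{q}} \Big( \int_{Q_0 + k} |u|^{2p} |x|^{-s}\, dx \Big)^{\frac{1}{p}} \le C_2\, I^{\frac{1}{q}} \int_{\mathbb{R}^d} \big( |\nabla u|^2 + |u|^2 \big)\, dx. \tag{A.4} $$

In view of this inequality it suffices to prove the assertion only for, say, bounded and compactly supported $V$. Moreover, we conclude from this inequality that $N(1, -\Delta - V) = 0$ if $I \le C_2^{-q}$. Hence it is enough to establish the estimate

$$ N(1, -\Delta - V) \le C I \tag{A.5} $$

under the additional condition

$$ I \ge C_2^{-q}. \tag{A.6} $$

To obtain (A.5) we find a subspace $L \subset H^1(\mathbb{R}^d)$ such that

$$ \int_{\mathbb{R}^d} V |u|^2\, dx \le \int_{\mathbb{R}^d} \big( |\nabla u|^2 + |u|^2 \big)\, dx, \qquad u \in L, \tag{A.7} $$

and such that $\operatorname{codim} L \le C I$. For $0 < \varepsilon \le 1$ (which will be determined later) Lemma A.2 yields cubes $Q_1, \ldots, Q_M$ such that $\operatorname{supp} V \subset \bigcup_{j=1}^{M} Q_j$, such that each point is covered by at most $C_1$ cubes,

$$ \int_{Q_j} V^{q} |x|^{\alpha}\, dx \le \varepsilon\, C_1 I \tag{A.8} $$

and $M \le \varepsilon^{-1} C_1$. With $l_j$ denoting the length of $Q_j$ put

$$ J_{\le} := \{ j : l_j \le 1 \}, \qquad J_{>} := \{ j : l_j > 1 \}. $$

First we consider $j \in J_{>}$ (i.e., large cubes). We divide $Q_j = \bigcup_k Q_{j,k}$ into a finite number of non-intersecting cubes with equal length $\tilde l_j \in (1, 2]$. Estimating similarly as in (A.4) and using (A.8) we obtain

$$ \int_{Q_j} V |u|^2\, dx \le \sum_{k} \Big( \int_{Q_{j,k}} V^{q} |x|^{s(q-1)}\, dx \Big)^{\frac{1}{q}} \Big( \int_{Q_{j,k}} |u|^{2p} |x|^{-s}\, dx \Big)^{\frac{1}{p}} \le C_2\, 2^{\,2-d+\frac{d-s}{p}}\, (\varepsilon\, C_1 I)^{\frac{1}{q}} \int_{Q_j} \big( |\nabla u|^2 + |u|^2 \big)\, dx. \tag{A.9} $$

Now we consider $j \in J_{\le}$ (i.e., small cubes). If $u \in H^1(\mathbb{R}^d)$ satisfies

$$ \int_{Q_j} u\, dx = 0, \tag{A.10} $$

then a similar estimate (but using (A.3) instead of (A.2)) yields

$$ \int_{Q_j} V |u|^2\, dx \le C_2\, (\varepsilon\, C_1 I)^{\frac{1}{q}} \int_{Q_j} |\nabla u|^2\, dx. \tag{A.11} $$

Let $L$ be the space of all $u \in H^1(\mathbb{R}^d)$ such that (A.10) holds for all $j \in J_{\le}$. We sum (A.9) and (A.11) over all $j$ to get

$$ \int_{\mathbb{R}^d} V |u|^2\, dx \le C_3\, (\varepsilon I)^{\frac{1}{q}} \int_{\mathbb{R}^d} \big( |\nabla u|^2 + |u|^2 \big)\, dx, \qquad u \in L, $$

where $C_3 := C_2\, 2^{\,2-d+(d-s)/p}\, C_1^{1+1/q}$. Now we choose $\varepsilon := C_3^{-q} I^{-1}$. (Note that in view of (A.6) and $C_1 \ge 1$ one has $\varepsilon \le 1$.) Moreover, with this choice of $\varepsilon$ relation (A.7) holds and

$$ \operatorname{codim} L = \# J_{\le} \le M \le \varepsilon^{-1} C_1 = C_1 C_3^{q} I. $$

This yields (A.5) with $C = C_1 C_3^{q}$ and finishes the proof.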
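The proof uses that the exponents $p$ and $s$, defined by $p^{-1} + q^{-1} = 1$ and $\alpha = s(q-1)$, satisfy the assumptions of Lemma A.3. A short calculation suggests that, for $q > 1$ and $\alpha \ge 0$, these assumptions are in fact equivalent to $2q - \alpha > d$. The sketch below checks this on a small grid with exact rational arithmetic; the function names are illustrative and not taken from [EgKo] or from the paper.

```python
from fractions import Fraction


def conjugate_exponents(q, alpha):
    """Return (p, s) with 1/p + 1/q = 1 and alpha = s * (q - 1)."""
    q, alpha = Fraction(q), Fraction(alpha)
    return q / (q - 1), alpha / (q - 1)


def satisfies_lemma_a3(d, p, s):
    """Check 0 <= s < 2 and the upper bound on p required in Lemma A.3 (d >= 2)."""
    if not (0 <= s < 2 and p >= 1):
        return False
    if d == 2:
        return True  # every finite p >= 1 is admissible in two dimensions
    return p < (d - s) / (d - 2)


if __name__ == "__main__":
    qs = [Fraction(5, 4), Fraction(3, 2), Fraction(2), Fraction(3), Fraction(5)]
    alphas = [Fraction(0), Fraction(1, 2), Fraction(1), Fraction(2), Fraction(3)]
    for d in (2, 3, 4, 5):
        for q in qs:
            for alpha in alphas:
                p, s = conjugate_exponents(q, alpha)
                # The hypothesis of Lemma A.1 versus the hypotheses of Lemma A.3.
                assert satisfies_lemma_a3(d, p, s) == (2 * q - alpha > d)
    print("Lemma A.3 assumptions match 2q - alpha > d on the grid")
```

The grid is of course no proof; it only illustrates that the condition $2q - \alpha > d$ is exactly what makes the Hölder pair $(p, s)$ admissible.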

Remark A.4. If $d \ge 3$ then Proposition 2.1 follows by the argument of Aizenman-Lieb [AiLi] from

$$ N(0, -\Delta - V) \le C^{EK}_{0,d,\alpha} \int_{\mathbb{R}^d} V(x)_+^{\frac{d+\alpha}{2}} |x|^{\alpha}\, dx. $$

Different proofs of this inequality can be found in [BlReSt, EgKo and BiSo2]. It would be desirable, in particular in view of constants, to find an alternative proof of Proposition 2.1 in the case $d = 2$.

Acknowledgements. The authors are grateful to Ari Laptev and Timo Weidl for useful discussions. The first author has been partially supported by the ESF European programme SPECT.

References

[AbSt] Abramowitz, M., Stegun, I.A.: Handbook of mathematical functions with formulas, graphs, and mathematical tables. Reprint of the 1972 edition. New York: Dover Publications, 1992
[AiLi] Aizenman, M., Lieb, E.: On semiclassical bounds for eigenvalues of Schrödinger operators. Phys. Lett. A 66(6), 427–429 (1978)
[Bi] Birman, M.Sh.: The spectrum of singular boundary problems. Amer. Math. Soc. Trans. (2) 53, 23–80 (1966)
[BiLa] Birman, M.Sh., Laptev, A.: The negative discrete spectrum of a two-dimensional Schrödinger operator. Comm. Pure Appl. Math. 49, 967–997 (1996)
[BiSo1] Birman, M.Sh., Solomyak, M.Z.: Spectral theory of selfadjoint operators in Hilbert space. Dordrecht: D. Reidel, 1987
[BiSo2] Birman, M.Sh., Solomyak, M.Z.: Schrödinger operators. Estimates for number of bound states as function-theoretic problem. Amer. Math. Soc. Transl. (2) 150, 1–54 (1992)
[BlReSt] Blanchard, Ph., Rezende, J., Stubbe, J.: New estimates on the number of bound states of Schrödinger operators. Lett. Math. Phys. 14(3), 215–225 (1987)
[EgKo] Egorov, Yu.V., Kondrat’ev, V.A.: On spectral theory of elliptic operators. Oper. Theory Adv. Appl. 89, Basel: Birkhäuser, 1996
[GlGrMaTh] Glaser, V., Grosse, H., Martin, A., Thirring, W.: A family of optimal conditions for the absence of bound states in a potential. Studies in Mathematical Physics. Princeton, NJ: Princeton University Press, 1976, pp. 169–194
[LaWei1] Laptev, A., Weidl, T.: Hardy inequalities for magnetic Dirichlet forms. Mathematical results in quantum mechanics (Prague, 1998). Oper. Theory Adv. Appl. 108, Basel: Birkhäuser, 1999, pp. 299–305
[LaWei2] Laptev, A., Weidl, T.: Recent results on Lieb-Thirring inequalities. Journées “Équations aux Dérivées Partielles” (La Chapelle sur Erdre, 2000), Exp. No. XX, Nantes: Univ. Nantes, 2000
[LiLo] Lieb, E., Loss, M.: Analysis. Second edition. Graduate Studies in Mathematics 14. Providence, RI: Amer. Math. Soc., 2001
[LiTh] Lieb, E., Thirring, W.: Inequalities for the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities. Studies in Mathematical Physics. Princeton, NJ: Princeton University Press, 1976, pp. 269–303
[Si1] Simon, B.: The bound state of weakly coupled Schrödinger operators in one and two dimensions. Ann. Physics 97(2), 279–288 (1976)
[Si2] Simon, B.: Trace ideals and their applications. London Mathematical Society Lecture Note Series 35. Cambridge-New York: Cambridge University Press, 1979
[StWei] Stein, E.M., Weiss, G.: Introduction to Fourier analysis on Euclidean spaces. Princeton Mathematical Series 32. Princeton, NJ: Princeton University Press, 1971
[St] Stubbe, J.: Bounds on the number of bound states for potentials with critical decay at infinity. J. Math. Phys. 31(5), 1177–1180 (1990)
[Wei] Weidl, T.: Remarks on virtual bound states for semi-bounded operators. Comm. Partial Differ. Eqs. 24(1–2), 25–60 (1999)

Communicated by B. Simon




Propagation Effects on the Breakdown of a Linear Amplifier Model: Complex-Mass Schrödinger Equation Driven by the Square of a Gaussian Field

Philippe Mounaix¹, Pierre Collet¹, Joel L. Lebowitz²

¹ Centre de Physique Théorique, UMR 7644 du CNRS, Ecole Polytechnique, 91128 Palaiseau Cedex, France. E-mail: [email protected]; [email protected]
² Departments of Mathematics and Physics, Rutgers, The State University of New Jersey, Piscataway, NJ 08854-8019, USA. E-mail: [email protected]

Received: 20 May 2005 / Accepted: 8 December 2005
Published online: 31 March 2006 – © Springer-Verlag 2006

Abstract: Solutions to the equation $\partial_t E(x,t) - \frac{i}{2m} \Delta E(x,t) = \lambda |S(x,t)|^2 E(x,t)$ are investigated, where $S(x,t)$ is a complex Gaussian field with zero mean and specified covariance, and $m \ne 0$ is a complex mass with $\mathrm{Im}(m) \ge 0$. For real $m$ this equation describes the backscattering of a smoothed laser beam by an optically active medium. Assuming that $S(x,t)$ is the sum of a finite number of independent complex Gaussian random variables, we obtain an expression for the value of $\lambda$ at which the $q$th moment of $|E(x,t)|$ w.r.t. the Gaussian field $S$ diverges. This value is found to be less or equal for all $m \ne 0$, $\mathrm{Im}(m) \ge 0$ and $|m| < +\infty$ than for $|m| = +\infty$, i.e. when the $\Delta E$ term is absent. Our solution is based on a distributional formulation of the Feynman path-integral and the Paley-Wiener theorem.

I. Introduction

We investigate the breakdown of linear amplification in a system driven by the square of a Gaussian noise. This problem, which models the backscattering of an incoherent laser by an optically active medium, was first considered by Akhmanov et al. in nonlinear optics [1], and by Rose and DuBois in laser-plasma interaction [10]. The latter investigated the divergence of the average solution to the stochastic PDE,

$$ \partial_t E(x,t) - \frac{i}{2m} \Delta E(x,t) = \lambda |S(x,t)|^2 E(x,t), \tag{1} $$

$t \ge 0$, $x \in \Lambda \subset \mathbb{R}^d$, and $E(x,0) = 1$, heuristically and numerically in the “diffractive case” where $\mathrm{Im}(m) = 0$ and $\mathrm{Re}(m) \ne 0$. Here $\lambda > 0$ is the coupling constant and $S$ is a complex Gaussian noise with zero mean.¹ More recently, this problem was analyzed from a more rigorous mathematical point of view in [2] and [7]. The “diffusive case” in which $\mathrm{Re}(m) = 0$ and $\mathrm{Im}(m) > 0$ was considered in [2], and the one dimensional diffractive case was considered in [7] for a restrictive class of $S$’s. In the present work we will consider the general case $m \ne 0$ and $\mathrm{Im}(m) \ge 0$ for $\Lambda$ a $d$-dimensional torus with $d \le 3$. As in [2] and [7] we will express the solution to (1) formally as the Feynman-Kac path-integral

$$ E(x,t) = \int_{x(\cdot) \in B(x,t)} \exp\Big( \int_0^t \Big[ \frac{im}{2}\, \dot x(\tau)^2 + \lambda |S(x(\tau), \tau)|^2 \Big]\, d\tau \Big)\, d[x(\cdot)], \tag{2} $$

where $B(x,t)$ denotes the set of all the continuous paths in $\Lambda$ satisfying $x(t) = x$.

¹ This is the case of interest in laser-plasma interaction and nonlinear optics in which $S$ is the (complex) time-envelope of the laser electric field. With the help of some minor modifications, our results carry over straightforwardly to the cases where $S$ is real.

In the diffusive case, the right-hand side of (2) is just the Wiener integral of $\exp\big[ \lambda \int_0^t |S(x(\tau), \tau)|^2\, d\tau \big]$ over $B(x,t)$. This was used in [2] to prove, under some reasonable assumptions on the covariance of $S$, that for every $t > 0$ and any positive integer $q$ the average of $E(x,t)^q$ over the realizations of $S$, $\langle E(x,t)^q \rangle$, diverges as $\lambda$ increases past some critical value smaller than (or equal to) the one in the diffusion-free case (i.e. when $|m| = +\infty$), with equality holding for a class of $S$. It was conjectured there that this inequality should also apply when diffusion is replaced by diffraction, i.e. $m$ real, $m \ne 0$, the case of physical interest considered by Rose and DuBois in [10].

The diffractive case is much more difficult because the right-hand side of (2) is no longer well defined and one cannot a priori exclude the possibility that destructive interference between paths makes the sum of divergent contributions finite, raising (possibly to infinity) the critical value of $\lambda$ at which the average of (2) diverges. Using heuristics and numerical simulations, Rose and DuBois argued that $\langle |E(x,t)|^2 \rangle$ should diverge for every $t > 0$ as $\lambda$ increases to some finite critical value [10]. The conjecture made in [2] that diffraction should actually lower the critical coupling (or, at least, not increase it) compared to the case $|m| = +\infty$ was proved in [7], for very special choices of $S$, for the divergence of $\langle |E(x,t)| \rangle$. In this paper, we extend the results of [7] to a much wider class of $S$. We analyze the divergence of $\langle |E(x,t)|^q \rangle$ for any positive integer $q$, and we treat both the diffusive and diffractive cases as well as all the intermediate cases between these two limits [i.e. complex $m$ with $\mathrm{Im}(m) \ge 0$ and $m \ne 0$].

Our strategy for controlling the complex Feynman path-integral (2) and determining the critical value of $\lambda$ uses the following three ingredients: (i) we consider a restricted but quite wide class of $S$ for which $E(x,t)$ can be written as a Fourier-Laplace integral w.r.t. a distribution with compact support;² (ii) we apply the Paley-Wiener theorem to this Fourier-Laplace integral. This yields the control of (2) for “large” $|S|^2$; (iii) we average $|E(x,t)|^q$ over the realizations of $S$ and use the control obtained in (ii) to determine the smallest value of $\lambda$ for which this average blows up. The rest of the paper follows this strategy quite faithfully. In Sect. II we specify the class of $S$ which we can treat. The distributional formulation of $E(x,t)$ is given in Sect. III and the way to control its growth is explained in Sect. IV. Finally, the determination of the critical value of $\lambda$ and the proof of the conjecture made in [2] are given in Sect. V. It is worth noting that (i) and (ii) do not depend on $S$ being Gaussian and thus apply also in a more general setting.

² The “Fourier-Laplace” integral w.r.t. a distribution with compact support on $\mathbb{R}^N$ is the continuation of the usual Fourier integral from $\mathbb{R}^N$ to $\mathbb{C}^N$.


II. Model and Definitions

We consider the solution to the linear amplifier equation (1), written in its integral representation (2), with $m$ in $\mathbb{C}_+ \setminus \{0\}$, where $\mathbb{C}_+ \equiv \{ m \in \mathbb{C} : \mathrm{Im}(m) \ge 0 \}$. We assume that $S$ can be expressed as a finite combination of $M$ complex Gaussian r.v., $s_n$,

$$ S(x,t) = \sum_{n=1}^{M} s_n \Phi_n(x,t), \tag{3} $$

with

$$ \langle s_n \rangle = \langle s_n s_m \rangle = 0, \qquad \langle s_n s_m^* \rangle = \delta_{nm}. \tag{4} $$

The $\Phi_n$ are normalized such that

$$ \frac{1}{|\Lambda|} \int_\Lambda \int_0^1 \langle |S(x,\tau)|^2 \rangle\, d\tau\, d^d x = \frac{1}{|\Lambda|} \sum_{n=1}^{M} \int_\Lambda \int_0^1 |\Phi_n(x,\tau)|^2\, d\tau\, d^d x = 1. $$

Furthermore, the $\Phi_n(\cdot, \tau)$ are assumed to have second derivatives bounded uniformly in $\tau \in [0, t]$, and the $\Phi_n(x, \cdot)$ are piecewise continuous for every $x \in \Lambda$ with a finite number of discontinuities in $[0, t]$ for all finite $t$. Note that locally, i.e. for each $x$ and $\tau$, $|S(x,\tau)|^2$ is a quadratic form of $2M$ real Gaussian r.v. (the real and imaginary parts of the $s_n$), so it is a $\chi^2$ r.v. with $2M$ degrees of freedom.

Equation (3) generalizes models of spatially smoothed laser beams in which the laser light is represented by a superposition of a finite number of monochromatic beamlets the amplitudes of which are independent r.v. [9]. For a large number of beamlets these r.v. can be taken as Gaussian and the laser electric field takes on the form (3) with $\Phi_n(x,t) \propto \exp[i(k_n \cdot x + a k_n^2 t)]$, where $k_n$ is the wave vector of the $n$th beamlet and $a > 0$ is a (real) constant. It can be checked that all the assumptions made on $S$ are fulfilled.

We are interested in the critical coupling $\lambda_q(x,t)$ and its Laplacian-free counterpart $\overline{\lambda}_q(x,t)$ obtained by setting $m^{-1} = 0$ in Eq. (1). These quantities are defined by

$$ \lambda_q(x,t) = \inf\{ \lambda > 0 : \langle |E(x,t)|^q \rangle = +\infty \}, \tag{5a} $$

$$ \overline{\lambda}_q(x,t) = \inf\Big\{ \lambda > 0 : \Big\langle e^{\, q\lambda \int_0^t |S(x,\tau)|^2\, d\tau} \Big\rangle = +\infty \Big\}. \tag{5b} $$

Equations (5) give the values of $\lambda$ at which $\langle |E(x,t)|^q \rangle$ diverges with and without the Laplacian on the left-hand side of (1). Note that $S$ is not assumed to be homogeneous and the critical coupling will depend on $x$ in general.

III. Distributional Formulation of E(x, t; s)

Let $s$ be the $M$-dimensional Gaussian random vector the elements of which are the $s_n$, and $\gamma(x,\tau)$ the $M \times M$ Hermitian matrix defined by $\gamma_{nm}(x,\tau) = \Phi_n^*(x,\tau)\, \Phi_m(x,\tau)$. Inserting (3) into the right-hand side of (2) yields

$$ E(x,t;s) = \int_{x(\cdot) \in B(x,t)} \exp\Big( \int_0^t \Big[ \frac{im}{2}\, \dot x(\tau)^2 + \lambda\, s^\dagger \gamma(x(\tau),\tau)\, s \Big]\, d\tau \Big)\, d[x(\cdot)], \tag{6} $$

where we have made the dependence of $E(x,t)$ on the realization of $s$ explicit. In order to make (6) more appropriate to a distributional formulation it is desirable to replace the quadratic form $s^\dagger \gamma(x(\tau),\tau)\, s$ with its monomial decomposition. One obtains

$$ E(x,t;s) = \int_{x(\cdot) \in B(x,t)} e^{\frac{im}{2} \int_0^t \dot x(\tau)^2\, d\tau}\, \exp\Big[ \lambda \sum_{i=1}^{N} k_i(s) \int_0^t \varphi_i(x(\tau),\tau)\, d\tau \Big]\, d[x(\cdot)], \tag{7} $$

with $N = M^2$, and where the $\varphi_i$ are $N$ real valued functions given by $\gamma_{nn}$, $\sqrt{2}\, \mathrm{Re}(\gamma_{nm})$, and $\sqrt{2}\, \mathrm{Im}(\gamma_{nm})$, $n < m$. The components of the vector $k(s) \in \mathbb{R}^N$ are given by $|s_n|^2$, $\sqrt{2}\, \mathrm{Re}(s_n s_m^*)$, and $\sqrt{2}\, \mathrm{Im}(s_n s_m^*)$, $n < m$. It can be checked that

$$ \| k(s) \| = \| s \|^2, \tag{8} $$

where $\| k(s) \| = \big( \sum_{i=1}^{N} k_i(s)^2 \big)^{1/2}$ and $\| s \| = \big( \sum_{i=1}^{M} |s_i|^2 \big)^{1/2}$. We first give a heuristic derivation of the distributional formulation of (6). Then, we set it on a much firmer ground by justifying it rigorously from a mathematical point of view.

A. Heuristics. Inserting the identity

$$ 1 = \prod_{i=1}^{N} \int_{\mathbb{R}} \delta\Big( u_i - \int_0^t \varphi_i(x(\tau),\tau)\, d\tau \Big)\, du_i, $$

in the path-integral (7) and permuting the path- and $u$-integrals, one obtains

$$ E(x,t;s) = \int \cdots \int_{\mathbb{R}^N} G_{x,t}(u)\, e^{\lambda k(s) \cdot u} \prod_{i=1}^{N} du_i, \tag{9} $$

with

$$ G_{x,t}(u_1, \ldots, u_N) = \int_{x(\cdot) \in B(x,t)} e^{\frac{im}{2} \int_0^t \dot x(\tau)^2\, d\tau} \prod_{i=1}^{N} \delta\Big( u_i - \int_0^t \varphi_i(x(\tau),\tau)\, d\tau \Big)\, d[x(\cdot)]. \tag{10} $$

As a Feynman-Kac path-integral, the expression (10) is not well defined. A possible way to make it meaningful consists in writing $G_{x,t}$ as the Fourier transform w.r.t. $\eta$ of some function $\Psi(x,t;\eta)$:

$$ G_{x,t}(u) = \frac{1}{(2\pi)^N} \int \cdots \int_{\mathbb{R}^N} \Psi(x,t;\eta)\, e^{i u \cdot \eta} \prod_{i=1}^{N} d\eta_i, \tag{11} $$

in which $\Psi(x,t;\eta)$ has a well defined meaning. Fourier transforming (10) w.r.t. $u$ and permuting the path- and $u$-integrals, one obtains

$$ \Psi(x,t;\eta) = \int_{x(\cdot) \in B(x,t)} \exp\Big( i \int_0^t \Big[ \frac{m}{2}\, \dot x(\tau)^2 - V(x(\tau),\tau;\eta) \Big]\, d\tau \Big)\, d[x(\cdot)], \tag{12} $$

where $V(x,t;\eta)$ is given by

$$ V(x,t;\eta) \equiv \sum_{i=1}^{N} \eta_i\, \varphi_i(x,t). \tag{13} $$

We now observe that Eq. (12) is the path-integral solution to the Schrödinger equation

$$ i \partial_t \Psi(x,t;\eta) = -\frac{1}{2m} \Delta \Psi(x,t;\eta) + V(x,t;\eta)\, \Psi(x,t;\eta), \tag{14} $$

$t \ge 0$, $x \in \Lambda$, and $\Psi(x,0;\eta) = 1$. This yields a well defined $\Psi(x,t;\eta)$. The permutation of path- and ordinary integrals, as well as the formal Feynman-Kac path-integral used in the derivation of Eqs. (9), (11), and (14) above require justification. The work by Cartier and DeWitt-Morette [4] suggests that we define the path-integral (6) by the right-hand side of (9) in which $G_{x,t}$ is defined by its Fourier transform given as the solution to Eq. (14). We now prove the validity of this approach.
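The monomial decomposition behind (7) and the norm identity (8) can be checked numerically for a single point $(x, \tau)$. The sketch below is only illustrative: the helper names are hypothetical, the mode values are arbitrary test numbers, and the ordering of the components (diagonal terms first, then real and imaginary parts for each pair $n < m$) is an assumption that only needs to be the same for $k(s)$ and for the $\varphi_i$.

```python
import math


def monomials_s(s):
    """Components of k(s): |s_n|^2, then sqrt(2) Re(s_n s_m^*), sqrt(2) Im(s_n s_m^*) for n < m."""
    M = len(s)
    k = [abs(z) ** 2 for z in s]
    for n in range(M):
        for m in range(n + 1, M):
            w = s[n] * s[m].conjugate()
            k += [math.sqrt(2) * w.real, math.sqrt(2) * w.imag]
    return k


def monomials_gamma(gamma):
    """Components phi_i built from a Hermitian matrix gamma, in the same order as monomials_s."""
    M = len(gamma)
    phi = [gamma[n][n].real for n in range(M)]
    for n in range(M):
        for m in range(n + 1, M):
            phi += [math.sqrt(2) * gamma[n][m].real, math.sqrt(2) * gamma[n][m].imag]
    return phi


def check_decomposition(s, modes):
    """Return (s^dagger gamma s, k(s).phi, ||k(s)||, ||s||^2) for gamma_nm = modes[n]^* modes[m]."""
    M = len(s)
    gamma = [[modes[n].conjugate() * modes[m] for m in range(M)] for n in range(M)]
    quad = sum(s[n].conjugate() * gamma[n][m] * s[m] for n in range(M) for m in range(M)).real
    k, phi = monomials_s(s), monomials_gamma(gamma)
    dot = sum(a * b for a, b in zip(k, phi))
    norm_k = math.sqrt(sum(a * a for a in k))
    norm_s_sq = sum(abs(z) ** 2 for z in s)
    return quad, dot, norm_k, norm_s_sq


if __name__ == "__main__":
    s = [1 + 2j, 3 - 1j, -0.5j]          # arbitrary amplitudes s_n
    modes = [0.5j, 2 + 1j, 1.5 + 0j]     # arbitrary values of the mode functions at one (x, tau)
    quad, dot, norm_k, norm_s_sq = check_decomposition(s, modes)
    assert abs(quad - dot) < 1e-9        # quadratic form equals k(s) . phi
    assert abs(norm_k - norm_s_sq) < 1e-9  # identity (8)
    print(quad, dot, norm_k, norm_s_sq)
```

The vector returned by `monomials_s` has $M + M(M-1) = M^2$ components, which matches $N = M^2$.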

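B. The distributional formulation. Let $\Psi(x,t;\eta)$ be the solution to (14) where $V(x,t;\eta)$ is given by (13) with $\eta \in \mathbb{C}^N$. Let $a_i = \inf_{x(\cdot) \in B(x,t)} \int_0^t \varphi_i(x(\tau),\tau)\, d\tau$ and $b_i = \sup_{x(\cdot) \in B(x,t)} \int_0^t \varphi_i(x(\tau),\tau)\, d\tau$. Then the following lemma holds.

Lemma 1. For every $t > 0$, $x \in \Lambda$, and $m \in \mathbb{C}_+ \setminus \{0\}$, (i) $G_{x,t}$ defined by (11) is a distribution with compact support on $\mathbb{R}^N$ and $\operatorname{supp} G_{x,t} \subset [a_1, b_1] \times \cdots \times [a_N, b_N]$; (ii) $E(x,t;s)$ defined by (9) is the solution to (1).

Proof. Taking the derivative of (14) with respect to $\eta_i^*$, the complex conjugate of $\eta_i$, and using $\partial_{\eta_i^*} V(x,t;\eta) = 0$ which follows from analyticity of $V(x,t;\eta)$ in $\eta$ [see Eq. (13)], one finds that $\partial_{\eta_i^*} \Psi(x,t;\eta)$ evolves in time according to the same Eq. (14) with the initial condition $\partial_{\eta_i^*} \Psi(x,0;\eta) = 0$. Thus, $\partial_{\eta_i^*} \Psi(x,t;\eta) = 0$ for all $t \ge 0$ and $\eta \in \mathbb{C}^N$, which implies that $\Psi(x,t;\eta)$ is analytic in $\eta$.

Let $\tilde\Psi(x,t;\eta) = \Psi(x,t;\eta) \exp\big( -it \sum_{i=1}^{N} \eta_i c_i \big)$, where the constants $c_i \in \mathbb{R}$ will be specified later. $\tilde\Psi$ is the solution to (14) with $V$ given by (13) in which the $\varphi_i$ are replaced by $\tilde\varphi_i = \varphi_i + c_i$. Let $\epsilon_i = \operatorname{sgn}[\mathrm{Im}(\eta_i)]$. From Eq. (14) for $\tilde\Psi$ and the Schwartz inequality one obtains

$$ \frac{d}{dt} \| \tilde\Psi \|_2^2 = -\frac{\mathrm{Im}(m)}{|m|^2} \| \nabla \tilde\Psi \|_2^2 + 2 \sum_{i=1}^{N} \mathrm{Im}(\eta_i) \int_\Lambda \tilde\varphi_i\, |\tilde\Psi|^2\, d^d x \le 2\, \| \tilde\Psi \|_2^2 \sum_{i=1}^{N} |\mathrm{Im}(\eta_i)| \sup_{x \in \Lambda} (\epsilon_i \tilde\varphi_i), \tag{15} $$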



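$$ \frac{d}{dt} \| \nabla \tilde\Psi \|_2^2 = -\frac{\mathrm{Im}(m)}{|m|^2} \| \Delta \tilde\Psi \|_2^2 + 2 \sum_{i=1}^{N} \mathrm{Im}(\eta_i) \int_\Lambda \tilde\varphi_i\, |\nabla \tilde\Psi|^2\, d^d x + 2\, \mathrm{Im} \sum_{i=1}^{N} \eta_i \int_\Lambda \tilde\Psi\, \nabla \tilde\Psi^* \cdot \nabla \tilde\varphi_i\, d^d x \le 2\, \| \nabla \tilde\Psi \|_2^2 \sum_{i=1}^{N} |\mathrm{Im}(\eta_i)| \sup_{x \in \Lambda} (\epsilon_i \tilde\varphi_i) + 2\, \| \nabla \tilde\Psi \|_2\, \| \tilde\Psi \|_2 \sum_{i=1}^{N} |\eta_i|\, \| \nabla \tilde\varphi_i \|_\infty, \tag{16} $$

and

$$ \frac{d}{dt} \| \Delta \tilde\Psi \|_2^2 = -\frac{\mathrm{Im}(m)}{|m|^2} \| \nabla \Delta \tilde\Psi \|_2^2 + 2 \sum_{i=1}^{N} \mathrm{Im}(\eta_i) \int_\Lambda \tilde\varphi_i\, |\Delta \tilde\Psi|^2\, d^d x + 2\, \mathrm{Im} \sum_{i=1}^{N} \eta_i \int_\Lambda \Delta \tilde\Psi^* \big( \tilde\Psi\, \Delta \tilde\varphi_i + 2 \nabla \tilde\Psi \cdot \nabla \tilde\varphi_i \big)\, d^d x \le 2\, \| \Delta \tilde\Psi \|_2^2 \sum_{i=1}^{N} |\mathrm{Im}(\eta_i)| \sup_{x \in \Lambda} (\epsilon_i \tilde\varphi_i) + 2\, \| \Delta \tilde\Psi \|_2 \sum_{i=1}^{N} |\eta_i| \big( \| \tilde\Psi \|_2\, \| \Delta \tilde\varphi_i \|_\infty + 2\, \| \nabla \tilde\Psi \|_2\, \| \nabla \tilde\varphi_i \|_\infty \big), \tag{17} $$

where $\| \cdot \|_2$ and $\| \cdot \|_\infty$ respectively denote the $L^2$ and uniform norms³ on $\Lambda$ for given $t$ and $\eta$. Both $\| \nabla \tilde\varphi_i \|_\infty(t)$ and $\| \Delta \tilde\varphi_i \|_\infty(t)$ are bounded by assumption. Integrating then the inequality (15) over time from $0$ to $t$, one obtains

$$ \| \tilde\Psi \|_2(t,\eta) \le |\Lambda|^{1/2} \exp\Big( \sum_{i=1}^{N} |\mathrm{Im}(\eta_i)| \int_0^t \sup_{x \in \Lambda} [\epsilon_i \tilde\varphi_i(x,\tau)]\, d\tau \Big). \tag{18} $$

Similarly, by integrating (16) and (17) one finds

$$ \| \Delta \tilde\Psi \|_2(t,\eta) \le \Big[ C_1 t \sum_{i=1}^{N} |\eta_i| + C_2 t^2 \Big( \sum_{i=1}^{N} |\eta_i| \Big)^2 \Big] \exp\Big( \sum_{i=1}^{N} |\mathrm{Im}(\eta_i)| \int_0^t \sup_{x \in \Lambda} [\epsilon_i \tilde\varphi_i(x,\tau)]\, d\tau \Big), \tag{19} $$

where $C_1$ and $C_2$ are finite and independent of $\eta$ and $m$. We now substitute (18) and (19) into the right-hand side of the Sobolev-type inequality below, valid for $d \le 3$ (see e.g. [12], pp 106–107),

$$ | \tilde\Psi(x,t;\eta) | \le C_3 \big( \| \tilde\Psi \|_2(t,\eta) + \| \Delta \tilde\Psi \|_2(t,\eta) \big), $$

³ In the case of a vector field $v(x,t) \in \mathbb{C}^d$, these norms are to be understood as $\| v \|_2(t) = \big( \int_\Lambda v(x,t) \cdot v(x,t)^*\, d^d x \big)^{1/2}$ and $\| v \|_\infty(t) = \sup_{x \in \Lambda} \sqrt{ v(x,t) \cdot v(x,t)^* }$.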


with $C_3$ finite and independent of $\eta$ and $m$. This yields

$$ | \tilde\Psi(x,t;\eta) | \le \Big[ A + B t \sum_{i=1}^{N} |\eta_i| + C t^2 \Big( \sum_{i=1}^{N} |\eta_i| \Big)^2 \Big] \exp\Big( \sum_{i=1}^{N} |\mathrm{Im}(\eta_i)| \int_0^t \sup_{y \in \Lambda} [\epsilon_i \tilde\varphi_i(y,\tau)]\, d\tau \Big), \tag{20} $$

where $A$, $B$, and $C$ are finite and independent of $\eta$ and $m$. Take

$$ c_i = -\frac{1}{2t} \Big[ \int_0^t \sup_{x \in \Lambda} [\varphi_i(x,\tau)]\, d\tau + \int_0^t \inf_{x \in \Lambda} [\varphi_i(x,\tau)]\, d\tau \Big], $$

and define

$$ \kappa_i \equiv \int_0^t \sup_{x \in \Lambda} [\epsilon_i \tilde\varphi_i(x,\tau)]\, d\tau = \frac{1}{2} \Big[ \int_0^t \sup_{x \in \Lambda} [\varphi_i(x,\tau)]\, d\tau - \int_0^t \inf_{x \in \Lambda} [\varphi_i(x,\tau)]\, d\tau \Big]. $$

Note that, with this choice of $c_i$, $\kappa_i$ is independent of $\epsilon_i$. Since $\kappa_i \ge 0$, one can bound the right side of (20) by

$$ | \tilde\Psi(x,t;\eta) | \le \Big[ A + B t \sum_{i=1}^{N} |\eta_i| + C t^2 \Big( \sum_{i=1}^{N} |\eta_i| \Big)^2 \Big]\, e^{\sum_{i=1}^{N} \kappa_i |\eta_i|}. \tag{21} $$

Let $\tilde G_{x,t}$ be defined by Eq. (11) in which $\Psi$ is replaced by $\tilde\Psi$. By the Paley-Wiener theorem in the formulation given in [11] (Theorem XVI in Chapter VII), it follows from the analyticity of $\tilde\Psi(x,t;\eta)$ in $\eta$ and Eq. (21) that $\tilde G_{x,t}$ is a distribution with compact support on $\mathbb{R}^N$ and $\operatorname{supp} \tilde G_{x,t} \subset [-\kappa_1, \kappa_1] \times \cdots \times [-\kappa_N, \kappa_N]$. From the definition of $\tilde\Psi$ one has $\tilde G_{x,t}(u) = G_{x,t}(u - ct)$, which implies immediately by translation that $G_{x,t}$ is also a distribution with compact support on $\mathbb{R}^N$ and $\operatorname{supp} G_{x,t} \subset [\alpha_1, \beta_1] \times \cdots \times [\alpha_N, \beta_N]$, where $\alpha_i = \int_0^t \inf_{x \in \Lambda} \varphi_i(x,\tau)\, d\tau$ and $\beta_i = \int_0^t \sup_{x \in \Lambda} \varphi_i(x,\tau)\, d\tau$. The permutation of time integral and space supremum (resp. infimum) can then be performed by using Lemma A1 with $W = \pm\varphi_i$ (see Appendix A). One obtains $\alpha_i = a_i$ and $\beta_i = b_i$, yielding

$$ \operatorname{supp} G_{x,t} \subset [a_1, b_1] \times \cdots \times [a_N, b_N]. \tag{22} $$

It is worth noting that, heuristically, (22) follows immediately from the formal expression (10) since the product of the delta functions vanishes identically outside $[a_1, b_1] \times \cdots \times [a_N, b_N]$.

It remains to prove that $E(x,t;s)$ defined by the r.h.s. of (9) is the solution to (1). To this end it suffices to note that Eq. (9) can be written as $E(x,t;s) = \Psi(x,t;\eta = i\lambda k(s))$, which is the solution to (14) with $\eta = i\lambda k(s)$ in the potential (13). It can be checked that the latter equation is indeed Eq. (1) [reconstruct $s^\dagger \gamma s$ from its monomial decomposition and multiply (14) by $-i$], which completes the proof of Lemma 1.
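The role of the shift $c_i$ in the proof of Lemma 1 is plain interval arithmetic: it centres the support, so that the Paley-Wiener bound gives a symmetric box of half-width $\kappa_i$, and translating back recovers the interval between the time-integrated infimum and supremum of $\varphi_i$. A minimal sketch with exact rationals follows; the function names and the sample numbers are hypothetical, and the two integrals are simply passed in as numbers.

```python
from fractions import Fraction


def shift_and_half_width(sup_integral, inf_integral, t):
    """Return (c_i, kappa_i) from the time integrals of sup and inf of phi_i over [0, t]."""
    c = -(sup_integral + inf_integral) / (2 * t)
    kappa = (sup_integral - inf_integral) / 2
    return c, kappa


def support_interval(sup_integral, inf_integral, t):
    """Interval containing the i-th coordinate of supp G, obtained by undoing the shift.

    The shifted distribution is supported in [-kappa, kappa] and equals G evaluated at u - c t,
    so G is supported in [-kappa - c t, kappa - c t].
    """
    c, kappa = shift_and_half_width(sup_integral, inf_integral, t)
    return (-kappa - c * t, kappa - c * t)


if __name__ == "__main__":
    sup_integral, inf_integral, t = Fraction(3), Fraction(-1), Fraction(2)
    print(shift_and_half_width(sup_integral, inf_integral, t))
    # The translated interval is exactly [inf_integral, sup_integral].
    assert support_interval(sup_integral, inf_integral, t) == (inf_integral, sup_integral)
```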



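IV. Controlling the Growth of $|E(x,t;s)|^q$

The advantage gained by recasting (6) as (9) is that the latter formulation is suitable for a straightforward application of the Paley-Wiener theorem (see e.g. [11], Theorem XVI in Chapter VII, and [6], Theorem 7.4 in Chapter VI), offering the possibility of controlling the growth of $E(x,t;s)$ as $\| s \| \to +\infty$. This is embodied in Lemma 2 below. Let $\hat s \equiv s / \| s \|$ be the direction of $s$ in $\mathbb{C}^M$ and $H_{x,t}(\hat s) = \sup_{x(\cdot) \in B(x,t)} \int_0^t U(x(\tau),\tau;\hat s)\, d\tau$, with $U(x,\tau;\hat s) = \sum_{i=1}^{N} \hat k(s)_i\, \varphi_i(x,\tau)$, where $\hat k(s) = k(s) / \| k(s) \|$.

Lemma 2. For every $t > 0$, $x \in \Lambda$, $m \in \mathbb{C}_+ \setminus \{0\}$, and $q$ a positive integer, one has

$$ \limsup_{\| s \| \to +\infty} \frac{\ln |E(x,t;s)|^q}{\| s \|^2} = q \lambda H_{x,t}(\hat s), \tag{23} $$

along every given direction $\hat s$ in $\mathbb{C}^M$.

Proof. From Eqs. (8) and (9), one can rewrite the left-hand side of (23) as

$$ \limsup_{\| s \| \to +\infty} \frac{\ln |E(x,t;s)|^q}{\| s \|^2} = q \lambda \limsup_{\| k(s) \| \to +\infty} \frac{1}{\lambda \| k(s) \|} \ln \Big| \int \cdots \int_{\mathbb{R}^N} G_{x,t}(u)\, e^{\lambda k(s) \cdot u} \prod_{i=1}^{N} du_i \Big|. $$

Fixing the direction of $s$ in $\mathbb{C}^M$ also fixes the direction of $k(s)$ in $\mathbb{R}^N$. Write $u = v\, \hat k(s) + u_\perp$, with $u_\perp \cdot \hat k(s) = 0$, replace $G_{x,t}(u)$ by its Fourier representation (11), and let $\eta_\parallel$ and $\eta_\perp$ denote the Fourier conjugated variables of $v$ and $u_\perp$, respectively. One obtains,

$$ \int \cdots \int_{\mathbb{R}^N} G_{x,t}(u)\, e^{\lambda k(s) \cdot u} \prod_{i=1}^{N} du_i = \int \cdots \int_{\mathbb{R}^N} e^{\lambda \| k(s) \| v} \Big[ \frac{1}{(2\pi)^N} \int \cdots \int_{\mathbb{R}^N} \Psi(x,t;\eta)\, e^{i (v \eta_\parallel + u_\perp \cdot \eta_\perp)}\, d\eta_\parallel \prod_{i=1}^{N-1} d\eta_{\perp i} \Big]\, dv \prod_{i=1}^{N-1} du_{\perp i}. $$

Performing the integration over $u_\perp$ first, and then the one over $\eta_\perp$, one finds that the latter expression reduces to

$$ \int \cdots \int_{\mathbb{R}^N} G_{x,t}(u)\, e^{\lambda k(s) \cdot u} \prod_{i=1}^{N} du_i = \int_{\mathbb{R}} g_{x,t}(v)\, e^{\lambda \| k(s) \| v}\, dv, $$

with

$$ g_{x,t}(v) = \frac{1}{2\pi} \int_{\mathbb{R}} \Psi(x,t;\eta_\parallel, \eta_\perp = 0)\, e^{i v \eta_\parallel}\, d\eta_\parallel, $$

where $\Psi(x,t;\eta_\parallel, \eta_\perp = 0)$ is the solution to (14) with $V(x,t;\eta) = \eta_\parallel U(x,t;\hat s)$. Let $a = \inf_{x(\cdot) \in B(x,t)} \int_0^t U(x(\tau),\tau;\hat s)\, d\tau$ and $b = \sup_{x(\cdot) \in B(x,t)} \int_0^t U(x(\tau),\tau;\hat s)\, d\tau = H_{x,t}(\hat s)$. By Lemma 1, $g_{x,t}$ is a distribution with compact support on $\mathbb{R}$ and $\operatorname{supp} g_{x,t} \subset [a, b]$. This implies that $\sup\{ v : v \in \operatorname{supp} g_{x,t} \} \le H_{x,t}(\hat s)$, and by the Paley-Wiener theorem,

$$ \limsup_{\| k(s) \| \to +\infty} \frac{1}{\lambda \| k(s) \|} \ln \Big| \int_{\mathbb{R}} g_{x,t}(v)\, e^{\lambda \| k(s) \| v}\, dv \Big| \le H_{x,t}(\hat s). \tag{24} $$

We now prove that (24) is an equality. Suppose that $\exists \varepsilon > 0$ such that

$$ \limsup_{\| k(s) \| \to +\infty} \frac{1}{\lambda \| k(s) \|} \ln \Big| \int_{\mathbb{R}} g_{x,t}(v)\, e^{\lambda \| k(s) \| v}\, dv \Big| \le H_{x,t}(\hat s) - \varepsilon. \tag{25} $$

Then, according to the Paley-Wiener theorem, $\sup\{ v : v \in \operatorname{supp} g_{x,t} \} \le H_{x,t}(\hat s) - \varepsilon$. It is shown in Appendix B that $\sup\{ v : v \in \operatorname{supp} g_{x,t} \} = H_{x,t}(\hat s)$, yielding $H_{x,t}(\hat s) \le H_{x,t}(\hat s) - \varepsilon$, in contradiction with $\varepsilon > 0$. Thus, Eq. (25) is false and one obtains

$$ \limsup_{\| k(s) \| \to +\infty} \frac{1}{\lambda \| k(s) \|} \ln \Big| \int_{\mathbb{R}} g_{x,t}(v)\, e^{\lambda \| k(s) \| v}\, dv \Big| = H_{x,t}(\hat s). $$

This completes the proof of Lemma 2.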

V. Determination of $\lambda_q(x,t)$ and Comparison to $\overline{\lambda}_q(x,t)$

In this section we prove the conjecture made in Ref. [2] that $\lambda_q \le \overline{\lambda}_q$, in the case where $S(x,t)$ is given by (3). Since we wish to express the results in terms of eigenvalues of the correlation function $\langle S^*(x(t),t)\, S(x(t'),t') \rangle$, we begin with a technical preliminary linking these eigenvalues to those of the matrix $\int_0^t \gamma(x(\tau),\tau)\, d\tau$. Let $\mu_1[x(\cdot)] \ge \mu_2[x(\cdot)] \ge \cdots \ge 0$ be the eigenvalues of the covariance operator $T_{x(\cdot)}$ acting on $f(\tau) \in L^2(d\tau)$, defined by

$$ (T_{x(\cdot)} f)(\tau) = \int_0^t \langle S^*(x(\tau),\tau)\, S(x(\tau'),\tau') \rangle\, f(\tau')\, d\tau', \tag{26} $$

with $0 \le \tau, \tau' \le t$ and $x(\cdot) \in B(x,t)$. Let $f_i(\tau) \in L^2(d\tau)$ be the eigenfunction associated with $\mu_i[x(\cdot)]$ and define the vector $\sigma_i \in \mathbb{C}^M$ by $\sigma_{in} = \int_0^t \Phi_n^*(x(\tau),\tau)\, f_i^*(\tau)\, d\tau$. From (26), (3), and (4) one has

$$ (T_{x(\cdot)} f_i)(\tau) = \sum_{m=1}^{M} \Phi_m^*(x(\tau),\tau)\, \sigma_{im}^* = \mu_i[x(\cdot)]\, f_i(\tau), \tag{27} $$

and

$$ \mu_i[x(\cdot)]\, \sigma_{in} = \int_0^t \Phi_n^*(x(\tau),\tau)\, (T_{x(\cdot)} f_i)^*(\tau)\, d\tau = \sum_{m=1}^{M} \sigma_{im} \int_0^t \Phi_n^*(x(\tau),\tau)\, \Phi_m(x(\tau),\tau)\, d\tau = \sum_{m=1}^{M} \Big[ \int_0^t \gamma_{nm}(x(\tau),\tau)\, d\tau \Big] \sigma_{im}. \tag{28} $$

It follows from the last equality of (28) that any non-vanishing eigenvalue of $T_{x(\cdot)}$ is also an eigenvalue of $\int_0^t \gamma(x(\tau),\tau)\, d\tau$ with eigenvector $\sigma_i$. Conversely, any non-vanishing eigenvalue of $\int_0^t \gamma(x(\tau),\tau)\, d\tau$ is also an eigenvalue of $T_{x(\cdot)}$ with eigenfunction $f_i(\tau) = \mu_i^{-1} \sum_{m=1}^{M} \sigma_{im}^*\, \Phi_m^*(x(\tau),\tau)$ [see Eq. (27)]. Thus, there is a one-to-one relationship between the non-vanishing eigenvalues of $T_{x(\cdot)}$ and $\int_0^t \gamma(x(\tau),\tau)\, d\tau$. In the sequel $\mu_1[x(\cdot)]$ will denote the largest eigenvalue of $T_{x(\cdot)}$ and of $\int_0^t \gamma(x(\tau),\tau)\, d\tau$. Define

$$ \mu_{x,t} = \sup_{x(\cdot) \in B(x,t)} \mu_1[x(\cdot)]. $$

One can now prove the following proposition:

Proposition 1. For every $t > 0$ and $x \in \Lambda$, $\lambda_q(x,t) = (q \mu_{x,t})^{-1} \le \overline{\lambda}_q(x,t)$.

Proof. First we prove $\lambda_q(x,t) \ge (q \mu_{x,t})^{-1}$. Expressing $U(x(\tau),\tau;\hat s)$ in terms of the quadratic form $s^\dagger \gamma(x(\tau),\tau)\, s$ in the expression for $H_{x,t}(\hat s)$, one has

$$ H_{x,t}(\hat s) = \sup_{x(\cdot) \in B(x,t)} \frac{s^\dagger}{\| s \|} \Big[ \int_0^t \gamma(x(\tau),\tau)\, d\tau \Big] \frac{s}{\| s \|} \le \mu_{x,t}. $$

Hence, by Lemma 2,

$$ \limsup_{\| s \| \to +\infty} \frac{\ln |E(x,t;s)|^q}{\| s \|^2} \le q \lambda \mu_{x,t}. $$

This implies that for every $\lambda < (q \mu_{x,t})^{-1}$,

$$ \langle |E(x,t)|^q \rangle = \int \cdots \int_{\mathbb{C}^M} e^{-\| s \|^2}\, |E(x,t;s)|^q \prod_{n=1}^{M} \frac{d^2 s_n}{\pi} < +\infty, \tag{29} $$

which proves $\lambda_q(x,t) \ge (q \mu_{x,t})^{-1}$. We now prove the inequality $\lambda_q(x,t) \le (q \mu_{x,t})^{-1}$ [or, more exactly, $\lambda_q(x,t) \le (q \mu_{x,t} - 0^+)^{-1}$]. To this end we follow the same line of reasoning as in Ref. [7]. Let $A(r) = \{ z \in \mathbb{C}^M : |z_n| \le r,\ 1 \le n \le M \}$. For Eq. (29) to hold it is necessary that, for every $r > 0$,

$$ \lim_{\| s \| \to +\infty} \int \cdots \int_{A(r)} e^{-\| s + s' \|^2}\, |E(x,t;s + s')|^q \prod_{n=1}^{M} \frac{d^2 s'_n}{\pi} = 0, \tag{30} $$

along every direction $\hat s$ in $\mathbb{C}^M$. For any fixed $s \in \mathbb{C}^M$, $e^{-2 s^* \cdot z / q} E(x,t;s + z)$ is an entire function of $z \in \mathbb{C}^M$ and hence $|e^{-2 s^* \cdot z / q} E(x,t;s + z)|^q$ is subharmonic w.r.t. each component of $z$ [5]. Thus, writing

$$ e^{-\| s + s' \|^2} = e^{-\| s \|^2}\, e^{-\| s' \|^2}\, \Big| \exp\Big( -\frac{2}{q}\, s^* \cdot s' \Big) \Big|^q, $$

in the integral (30), one obtains by the subharmonicity

$$ \int \cdots \int_{A(r)} e^{-\| s + s' \|^2}\, |E(x,t;s + s')|^q \prod_{n=1}^{M} \frac{d^2 s'_n}{\pi} = e^{-\| s \|^2} \int \cdots \int_{A(r)} e^{-\| s' \|^2}\, \Big| e^{-\frac{2}{q} s^* \cdot s'} E(x,t;s + s') \Big|^q \prod_{n=1}^{M} \frac{d^2 s'_n}{\pi} \ge \big[ 1 - \exp(-r^2) \big]^M e^{-\| s \|^2}\, |E(x,t;s)|^q, $$

and the condition (30) implies

$$ \lim_{\| s \| \to +\infty} e^{-\| s \|^2}\, |E(x,t;s)|^q = 0, \tag{31} $$

along every direction $\hat s$ in $\mathbb{C}^M$. Since every element of the matrix $\int_0^t \gamma(x(\tau),\tau)\, d\tau$ is a continuous functional of $x(\cdot) \in B(x,t)$ with the uniform norm on $[0, t]$ (see Appendix B), its eigenvalues are also continuous functionals of $x(\cdot)$. Accordingly, $\forall \varepsilon > 0$ $\exists x_\varepsilon(\cdot) \in B(x,t)$ such that $\mu_{x,t} - \varepsilon \le \mu_1[x_\varepsilon(\cdot)] \le \mu_{x,t}$. Let $\sigma_\varepsilon \in \mathbb{C}^M$ (with $\| \sigma_\varepsilon \| = 1$) be an eigenvector of $\int_0^t \gamma(x_\varepsilon(\tau),\tau)\, d\tau$ associated with the eigenvalue $\mu_1[x_\varepsilon(\cdot)]$ and take $\hat s = \sigma_\varepsilon$, then

$$ H_{x,t}(\hat s) = \sup_{x(\cdot) \in B(x,t)} \sigma_\varepsilon^\dagger \Big[ \int_0^t \gamma(x(\tau),\tau)\, d\tau \Big] \sigma_\varepsilon \ge \sigma_\varepsilon^\dagger \Big[ \int_0^t \gamma(x_\varepsilon(\tau),\tau)\, d\tau \Big] \sigma_\varepsilon \ge \mu_{x,t} - \varepsilon. \tag{32} $$

Thus, along the direction of $\sigma_\varepsilon$, Lemma 2 and Eq. (32) yield

$$ \limsup_{\| s \| \to +\infty} \frac{\ln |E(x,t;s)|^q}{\| s \|^2} \ge q \lambda (\mu_{x,t} - \varepsilon), $$

and for every $\lambda > (q \mu_{x,t} - q \varepsilon)^{-1}$,

$$ \limsup_{\| s \| \to +\infty} e^{-\| s \|^2}\, |E(x,t;s)|^q = +\infty, $$

in contradiction with (31). Therefore, $\lambda_q(x,t) \le (q \mu_{x,t} - q \varepsilon)^{-1}$. Taking $\varepsilon$ arbitrarily small one obtains $\lambda_q(x,t) \le (q \mu_{x,t} - 0^+)^{-1}$, hence $\lambda_q(x,t) = (q \mu_{x,t})^{-1}$. Finally, we always have $\overline{\lambda}_q(x,t) = 1 / q \mu_1[x(\cdot) = x]$ (see [2]) and $\mu_{x,t} \ge \mu_1[x(\cdot) = x]$ yields $\lambda_q(x,t) \le \overline{\lambda}_q(x,t)$, which completes the proof of the proposition.

VI. Summary and Perspectives

In this paper, we have studied the effects of propagation on the divergence of the solution to a linear amplifier driven by the square of a Gaussian field. We have considered a model in which the propagation is that of a free Schrödinger equation with a complex mass. For this model, we have explicitly determined the values of the coupling constant at which the moments of the solution diverge. We proved that the divergence yielded by a propagation-free calculation, i.e. in the limit of an infinite mass, cannot occur at a smaller coupling constant than the one obtained with a finite mass. This extends the results of ref. [2] where such an inequality was proven in the diffusion case only, i.e. imaginary mass. As explained in the conclusion of ref. [2], the stumbling block to going beyond the purely diffusive case was to control the growth of a complex Feynman path-integral. Our solution of this problem is based on the realization that, if $S$ is given by Eq. (3), the Feynman path-integral can be rewritten as the Fourier integral of a distribution with compact support (Lemma 1). Control can then be obtained as a consequence of the Paley-Wiener theorem (Lemma 2).



In conclusion we outline some possible generalizations of this work. From a practical point of view, it would be interesting to find out whether there exists a class of S of the form (3) for which there are no propagation effects on the onset of the divergence, i.e. for which $\lambda_q(x,t)=\bar\lambda_q(x,t)$. In addition, since most Gaussian fields of physical interest admit Karhunen-Loève-type expansions, it would also be very interesting to find a way to generalize our solution to the case where the finite sum (3) is replaced by an infinite sum. Other problems involve relaxing some of the assumptions in (1). For instance, under what conditions on S do the results carry over to the case where $\Lambda$ is replaced by $\mathbb{R}^d$ and $E(x,0)\in L^2(\mathbb{R}^d)$? It should also be checked whether our solution of the problem is robust with respect to the initial condition. If the answer is no, the size of the set of $E(x,0)$ for which our results do not hold should be estimated according to physically relevant measures on the space of $E(x,0)$.

Acknowledgements. We warmly thank K. Yajima, C. Kopper, and G. Ben Arous for providing many valuable insights all along the completion of this work. The work of J. L. L. was supported by ASFOSR grant 49620-01-1-0154 and NSF grant DMR 01-279-26. J. L. L. also thanks the IHES at Bures-sur-Yvette, France, where part of this work was done.

Appendix A: On the Permutation of Time Integral and Space Supremum

This appendix is devoted to the proof of the following lemma:

Lemma A1. Let $\Lambda$ be a compact pathwise connected metric space [with distance denoted by $d(\cdot,\cdot)$], and $t>0$ a real number. Let $W$ be a real function on $\Lambda\times[0,t]$ such that $W(\cdot,\tau)$ is continuous in $x$ uniformly in $\tau\in[0,t]$, and $\forall x\in\Lambda$, $W(x,\cdot)$ is piecewise continuous with a finite number of discontinuities. Then, for any $x\in\Lambda$,

$$\int_0^t\sup_{y\in\Lambda}W(y,\tau)\,d\tau=\sup_{x(\cdot)\in B(x,t)}\int_0^tW(x(\tau),\tau)\,d\tau,$$

where $B(x,t)$ is the set of continuous paths in $\Lambda$ satisfying $x(t)=x$.

Proof. We obviously have

$$\int_0^t\sup_{y\in\Lambda}W(y,\tau)\,d\tau\ \ge\ \sup_{x(\cdot)\in B(x,t)}\int_0^tW(x(\tau),\tau)\,d\tau,$$

and it remains to prove the inequality in the other direction. First we assume $W\in C^0(\Lambda\times[0,t])$. Since $\Lambda\times[0,t]$ is compact, $W$ is uniformly continuous, and for every $\varepsilon>0$ we can find a number $\delta=\delta(\varepsilon)>0$ such that if $\max\{d(x,x'),|\tau-\tau'|\}<\delta(\varepsilon)$, then $|W(x,\tau)-W(x',\tau')|<\varepsilon$. Moreover, $\delta(\varepsilon)$ tends to zero with $\varepsilon$. If $t/\delta(\varepsilon)$ is not an integer, it is convenient to replace $\delta(\varepsilon)$ by the smaller quantity $t/(1+[t/\delta(\varepsilon)])$, where $[\cdot]$ denotes the integer part, and from now on we will assume that $t/\delta(\varepsilon)$ is an integer. For a fixed $\varepsilon>0$, let $N=N(\varepsilon)=t/\delta(\varepsilon)$, and let $R$ be a finite partition of $\Lambda$ by sets of diameter at most $\delta(\varepsilon)$. We observe that for any $0\le q\le N-1$ one has

$$\sup_{F\in R}\ \sup_{0\le q\le N-1}\operatorname{Osc}_{F\times[qt/N,\,(q+1)t/N]}W\ \le\ \varepsilon, \tag{A1}$$

where “Osc” denotes the oscillation of the function (namely, its sup minus its inf). We now choose once and for all a point $(x_{F,q},\tau_q)$ in each $F\times[qt/N,(q+1)t/N]$.



For each $0\le q\le N-1$ and any $\tau\in[qt/N,(q+1)t/N]$ one has, writing $I_q=[qt/N,(q+1)t/N]$,

$$\begin{aligned}
\sup_{y\in\Lambda}W(y,\tau)&\le\sup_{F\in R}\ \sup_{F\times I_q}W\\
&\le\sup_{F\in R}\Bigl[\sup_{F\times I_q}W+W(x_{F,q},\tau_q)-\inf_{F\times I_q}W\Bigr]\\
&\le\sup_{F\in R}W(x_{F,q},\tau_q)+\sup_{F\in R}\operatorname{Osc}_{F\times I_q}W\\
&\le\sup_{F\in R}W(x_{F,q},\tau_q)+\sup_{F\in R}\ \sup_{0\le q\le N-1}\operatorname{Osc}_{F\times I_q}W\\
&\le\sup_{F\in R}W(x_{F,q},\tau_q)+\varepsilon,
\end{aligned}$$

where we have used the inequality (A1). Thus, choosing an atom $F_q\in R$ such that $\sup_{F\in R}W(x_{F,q},\tau_q)=W(x_{F_q,q},\tau_q)$, one has for any $\tau\in[qt/N,(q+1)t/N]$,

$$\sup_{y\in\Lambda}W(y,\tau)\le W(x_{F_q,q},\tau_q)+\varepsilon. \tag{A2}$$

Let $y_1$ and $y_2$ be two given points in $\Lambda$. We now define a continuous path $x_\varepsilon$ in $\Lambda$ from $y_1$ to $y_2$. For any $1\le j\le N-1$ we choose a family of continuous paths $x_j$ from $[-t/2N^2,t/2N^2]$ to $\Lambda$ satisfying $x_j(-t/2N^2)=x_{F_{j-1},j-1}$ and $x_j(t/2N^2)=x_{F_j,j}$. We also choose a continuous path $x_0$ from $[0,t/2N^2]$ to $\Lambda$ such that $x_0(0)=y_1$ and $x_0(t/2N^2)=x_{F_0,0}$, and a continuous path $x_N$ from $[-t/2N^2,0]$ to $\Lambda$ such that $x_N(0)=y_2$ and $x_N(-t/2N^2)=x_{F_{N-1},N-1}$. The continuous path $x_\varepsilon$ is defined by

$$x_\varepsilon(\tau)=\begin{cases}
x_0(\tau)&\text{for }0\le\tau\le t/2N^2,\\
x_{F_q,q}&\text{for }qt/N+t/2N^2\le\tau\le(q+1)t/N-t/2N^2,\\
x_q(\tau-qt/N)&\text{for }qt/N-t/2N^2\le\tau\le qt/N+t/2N^2\ \text{for }q\ne0,\\
x_N(\tau-t)&\text{for }t-t/2N^2\le\tau\le t.
\end{cases} \tag{A3}$$

One can observe that the Lebesgue measure of the time domain over which $x_\varepsilon(\tau)\ne x_{F_q,q}$ is at most equal to $t/N$. Since $\Lambda$ is compact and $\forall\tau\in[0,t]$, $W(\cdot,\tau)\in C^0(\Lambda)$, there is a finite number $M>0$ such that for any $(x,\tau)\in\Lambda\times[0,t]$ one has $|W(x,\tau)|\le M$. Thus, $|W(x_{F_q,q},\tau)-W(x_\varepsilon(\tau),\tau)|\le2M$ for any $\tau$ in $[qt/N,(q+1)t/N]$. Using the latter estimate, the remark below (A3), and (A2) one obtains

$$\begin{aligned}
\int_0^t\sup_{y\in\Lambda}W(y,\tau)\,d\tau&=\sum_{q=0}^{N-1}\int_{qt/N}^{(q+1)t/N}\sup_{y\in\Lambda}W(y,\tau)\,d\tau\\
&\le\varepsilon t+\sum_{q=0}^{N-1}\int_{qt/N}^{(q+1)t/N}W(x_{F_q,q},\tau)\,d\tau\\
&\le\varepsilon t+\frac{2Mt}{N}+\int_0^tW(x_\varepsilon(\tau),\tau)\,d\tau.
\end{aligned} \tag{A4}$$

Now, assume that there is a finite set of times independent of $x$, $\{\tau_1,\dots,\tau_L\}$, such that $\forall x\in\Lambda$ the set of times at which $W(x,\cdot)$ is discontinuous is a subset of $\{\tau_1,\dots,\tau_L\}$.



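Thus, (A4) applies in each time interval $[\tau_i,\tau_{i+1}]$, $0\le i\le L$, with $\tau_0=0$ and $\tau_{L+1}=t$. Let $y_0,y_1,\dots,y_{L+1}$ be $L+2$ points in $\Lambda$ with $y_{L+1}=x$. Let $x_\varepsilon$ be a continuous path in $\Lambda$ passing by $x_\varepsilon(\tau_i)=y_i$ and defined by (A3) in each time interval $[\tau_i,\tau_{i+1}]$. From Eq. (A4) in which one writes $C_i(\varepsilon)=\varepsilon+2M/N_i(\varepsilon)$, it follows

$$\begin{aligned}
\int_0^t\sup_{y\in\Lambda}W(y,\tau)\,d\tau&\le\sum_{i=0}^{L}C_i(\varepsilon)(\tau_{i+1}-\tau_i)+\sum_{i=0}^{L}\int_{\tau_i}^{\tau_{i+1}}W(x_\varepsilon(\tau),\tau)\,d\tau\\
&=\sum_{i=0}^{L}C_i(\varepsilon)(\tau_{i+1}-\tau_i)+\int_0^tW(x_\varepsilon(\tau),\tau)\,d\tau\\
&\le\sum_{i=0}^{L}C_i(\varepsilon)(\tau_{i+1}-\tau_i)+\sup_{x(\cdot)\in B(x,t)}\int_0^tW(x(\tau),\tau)\,d\tau,
\end{aligned}$$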

where the last inequality results from the fact that $x_\varepsilon\in B(x,t)$. The proof of Lemma A1 for the class of $W$ considered in this paragraph is completed by taking the limit $\varepsilon\to0$ and observing that the $C_i(\varepsilon)$ tend to zero with $\varepsilon$.

We are now ready to prove Lemma A1 in the general case. Since $W(\cdot,\tau)$ is continuous in $x$ uniformly in $\tau\in[0,t]$, for any $\varepsilon>0$, we can find $\delta=\delta(\varepsilon)>0$ such that if $d(x,x')\le\delta$, then

$$\sup_{\tau\in[0,t]}|W(x,\tau)-W(x',\tau)|\le\varepsilon.$$

Since $\Lambda$ is compact, we can find a finite covering by open balls of radius at most $\delta/2$, and therefore a finite partition of unity $(\chi_k)$ by continuous functions whose support has diameter at most $\delta$ (see [3], paragraph 4.3 in Chapter IX). For any $k$ we choose once for all a point $x_k\in\operatorname{supp}\chi_k$, and define the function

$$W_\varepsilon(x,\tau)=\sum_kW(x_k,\tau)\chi_k(x).$$

For each fixed $\tau$, this function is obviously continuous in $x$, and for fixed $x$, it is piecewise continuous in $\tau$, with the possible discontinuity points belonging to a finite set which can be chosen independent of $x$. From the previous result we have

$$\int_0^t\sup_{y\in\Lambda}W_\varepsilon(y,\tau)\,d\tau=\sup_{x(\cdot)\in B(x,t)}\int_0^tW_\varepsilon(x(\tau),\tau)\,d\tau.$$

Since $\sup_{x\in\Lambda,\tau\in[0,t]}|W(x,\tau)-W_\varepsilon(x,\tau)|\le\varepsilon$, we deduce that

$$\begin{aligned}
\int_0^t\sup_{y\in\Lambda}W(y,\tau)\,d\tau&\le\varepsilon t+\int_0^t\sup_{y\in\Lambda}W_\varepsilon(y,\tau)\,d\tau\\
&=\varepsilon t+\sup_{x(\cdot)\in B(x,t)}\int_0^tW_\varepsilon(x(\tau),\tau)\,d\tau\\
&\le2\varepsilon t+\sup_{x(\cdot)\in B(x,t)}\int_0^tW(x(\tau),\tau)\,d\tau.
\end{aligned}$$

Since the estimate holds for any > 0 the general result follows.
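The construction in the proof can be illustrated numerically. The sketch below is only an illustration under assumptions that are not in the paper: the space is taken to be the interval [0, 1], t = 1, the endpoint is x = 0.5, and W(y, τ) = −|y − c(τ)| with a maximiser c(τ) that jumps at τ = 0.5. The names `c`, `W`, `path` and `gap` are hypothetical. The path sits at the current maximiser and switches position in short windows of width h, in the spirit of (A3); the difference between the two sides of Lemma A1 along this path shrinks in proportion to h.

```python
import math

def c(tau):
    # Location of the maximiser of W(., tau); it jumps at tau = 0.5,
    # so W(x, .) is piecewise continuous with one discontinuity.
    return 0.2 if tau < 0.5 else 0.8

def W(y, tau):
    # Illustrative choice (an assumption): sup over y in [0, 1] equals 0.
    return -abs(y - c(tau))

def path(tau, h, x_end=0.5):
    # Continuous path on [0, 1] with x(1) = x_end: stay at the maximiser,
    # cross between maximisers in a window of width h around tau = 0.5,
    # and use the last window of width h to reach the endpoint.
    if tau < 0.5 - h / 2:
        return 0.2
    if tau < 0.5 + h / 2:
        return 0.2 + 0.6 * (tau - (0.5 - h / 2)) / h
    if tau < 1.0 - h:
        return 0.8
    return 0.8 + (x_end - 0.8) * (tau - (1.0 - h)) / h

def gap(h, steps=100_000):
    # Midpoint rule for  ∫ sup_y W dτ  -  ∫ W(x(τ), τ) dτ  on [0, 1].
    # The first integral is 0 for this particular W.
    dt = 1.0 / steps
    along_path = sum(
        W(path((k + 0.5) * dt, h), (k + 0.5) * dt) for k in range(steps)
    ) * dt
    return 0.0 - along_path

if __name__ == "__main__":
    for h in (0.1, 0.01, 0.001):
        print(h, gap(h))
```

For this choice the gap can also be computed by hand and equals 0.3·h, so letting h tend to zero recovers the equality of the lemma along a sequence of admissible paths.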



Appendix B: Determination of the Support of $g^{(m)}_{x,t}$

Let gx,t (u) be a distribution with compact support on R whose Fourier transform, (m) (m) (x, t; η) ≡ (Fgx,t )(η) with η ∈ R, is the solution to (14) with V (x, t; η) = ˆ ηU (x, t; sˆ ), where U (x, t; sˆ ) = N i=1 k(s)i ϕi (x, t). This appendix is devoted to the (m) determination of the support of g(x,t) . We have modified the notation used in the text to make the dependence on m explicit. We begin with a technical lemma that will be useful in the sequel. Let C0∞ (R) denote the set of all smooth compactly supported functions in R: (m) Lemma B1. For every t > 0, x ∈ , and f ∈ C0∞ (R), R gx,t (v)f (v) dv is an ana (m) lytic function of m on C+ ≡ {m ∈ C : Im(m) > 0}, and R gx,t (v)f (v) dv = (m+iγ ) (v)f (v) dv for each real m = 0. limγ →0+ R gx,t Proof. As a Fourier transform of a function with compact supports on R, (Ff )(η) is an analytic function of η ∈ C. We have seen at the beginning of the proof of Lemma 1 that (m) (x, t; η), with m ∈ C+ , is also analytic in η. Furthermore, if is a torus and V is bounded on (which is the case), then (i) (m) (x, t; η) is analytic in m ∈ C+ ; and (ii) ∀η ∈ C, (m) (x, t; η) = limγ →0+ (m+iγ ) (x, t; η) for each real m = 0. We are indebted to K. Yajima for the proof of the latter result that we reproduce here for the sake of completeness [13]. Define it Um (t) = exp , t ≥ 0, m ∈ C+ , 2m and write the initial value problem (14) in the form of an integral equation, t (m) (t) = 1 − iη Um (t − τ )V (τ ) (m) (τ ) dτ. 0

Here V (t) is the multiplication operator with U (x, t; sˆ ). Let B denote the space of bounded operators in L2 (). By Fourier series expansion, it is evident that (a)||Um (t) (m) (t)||2 ≤ || (m) (t)||2 , viz. Um (t) ∈ B and ||Um (t)|| ≤ 1; (b) the function [0, ∞) × C+ (t, m) → Um (t) ∈ B is strongly continuous [viz. (t, m) → Um (t)f ∈ L2 () is continuous for every f ∈ L2 ()]; and (c) for every t ≥ 0, m → Um (t) ∈ B is analytic for m ∈ C+ and (d/dm)Um (t) is norm continuous w.r.t. (t, m) ∈ [0, ∞) × C+ . It follows from the boundedness of V that the Dyson expansion [8] t Dm (t) = Um (t) − iη Um (t − τ )V (τ )Um (τ ) dτ + · · · + 0 (−iη)n Um (t − τn )V (τn ) · · · V (τ1 )Um (τ1 ) dτ1 · · · dτn + · · · 0 0, ∃δ > 0 such that |h(α[x(·)]) − 1| < ε for every x(·) ∈ B0 (δ) ≡ {x(·) ∈ B(x, t) :



sup0≤τ ≤t x(τ ) − x0 (τ ) < δ}. Take ε = 1/2, in this case h(α[x(·)]) > 1/2 for every x(·) ∈ B0 (δ) and one has γ t (iγ ) ˙ )2 dτ gx,t (u)h(u) du = e− 2 0 x(τ h(α[x(·)]) d[x(·)] R x(·)∈B(x,t) γ t ˙ )2 dτ ≥ e− 2 0 x(τ h(α[x(·)]) d[x(·)] x(·)∈B0 (δ) γ t 1 ˙ )2 dτ > e− 2 0 x(τ d[x(·)]. 2 x(·)∈B0 (δ) Since the set of the Brownian paths x(·) that are in B0 (δ) has a strictly positive Wiener measure, the last term is strictly positive and one finds (iγ ) gx,t (u)h(u) du > 0. (B1) R

(iγ )

If there was an open subset of [a, b] not intersecting the support of gx,t , it would be pos (iγ ) (iγ ) sible to choose the support of h outside the one of gx,t , yielding gx,t (u)h(u) du = 0 (iγ ) in contradiction with Eq. (B1). Thus, for every x ∈ and γ > 0, the support of gx,t is equal to [a, b]. Consider now the general case m ∈ C+ \{0}. Assume that there is an open subset (m) of [a, b] not intersecting the support of gx,t . In this case it is possible to choose the (m) (m) support of h outside the one of gx,t , yielding gx,t (u)h(u) du = 0. By Lemma B1 (z) the support of gx,t must vary continuously with z ∈ C+ , whence the support of h can be taken small enough such that there is a open subset V(m) ⊂ C+ with m ∈ V(m) (z) (z) and gx,t (u)h(u) du = 0 identically in V(m). From the analyticity of gx,t (u)h(u) du (z) in z on C+ (Lemma B1), it follows immediately that gx,t (u)h(u) du = 0 identically in all C+ , in contradiction with Eq. (B1), which completes the proof of Lemma B2. In (m) particular, one has sup{v : v ∈ suppgx,t } = b = Hx,t (ˆs ), which is the result used in the proof of Lemma 2. References 1. Akhmanov, S.A., D’yakov,Yu.E., Pavlov, L.I.: Statistical phenomena in Raman scattering stimulated by a broad-band pump. Sov. Phys. JETP 39, 249–256 (1974) 2. Asselah, A., Dai Pra, P., Lebowitz, J.L., Mounaix, Ph.: Diffusion effects on the breakdown of a linear amplifier model driven by the square of a Gaussian field. J. Stat. Phys. 104, 1299–1315 (2001) 3. Bourbaki, N.: Eléments de mathématique: topologie générale, chapitres 5 a` 10. Paris: Dunod, 1997 (in French) 4. DeWitt-Morette, C.: Feynman’s path integral. Definition without limiting procedure. Commun. Math. Phys. 28, 47–67 (1972); Cartier, P., DeWitt-Morette, C.: A new perspective on functional integration. J. Math. Phys. 36, 2237–2312 (1995); Cartier, P., DeWitt-Morette, C.: Functional integration. J. Math. Phys. 41, 4154-4187 (2000) 5. Hayman, W.K., Kennedy, P.B.: Subharmonic Functions, Vol. I. 
London Mathematical Society Monographs, No. 9. London-New York: Academic Press (Harcourt Brace Jovanovich, Publishers), 1976
6. Katznelson, Y.: An Introduction to Harmonic Analysis. Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo: Cambridge University Press, 2004
7. Mounaix, Ph., Lebowitz, J.L.: Note on a diffraction-amplification problem. J. Phys. A: Math. Gen. 37, 5289–5294 (2004)



8. Reed, M., Simon, B.: Methods of Modern Mathematical Physics 2: Fourier Analysis, Self-Adjointness. New York-San Francisco-London: Academic Press, 1975 9. Rose, H.A., DuBois, D.F.: Statistical properties of laser hot spots produced by a random phase plate. Phys. Fluids B 5, 590–596 (1993) 10. Rose, H.A., DuBois, D.F.: Laser hot spots and the breakdown of linear instability theory with application to stimulated Brillouin scattering. Phys. Rev. Lett. 72, 2883–2886 (1994) 11. Schwartz, L.: Théorie des distributions. Paris: Hermann, 1997 (in French) 12. Thirring, W.: A Course in Mathematical Physics 3: Quantum Mechanics of Atoms and Molecules. Wien-New York: Springer-Verlag, 1991 13. Yajima, K.: Private communication, 2004 Communicated by A. Kupiainen




Spectral Asymptotics of Pauli Operators and Orthogonal Polynomials in Complex Domains

N. Filonov1, A. Pushnitski2

1 Department of Mathematical Physics, Faculty of Physics, St. Petersburg State University, 198504 St. Petersburg, Russia. E-mail: [email protected]
2 Mathematics 253-37, Caltech, Pasadena, CA 91125, U.S.A. E-mail: [email protected]

Received: 24 May 2005 / Accepted: 7 September 2005 Published online: 15 February 2006 – © Springer-Verlag 2006

Abstract: We consider the spectrum of a two-dimensional Pauli operator with a compactly supported electric potential and a variable magnetic field with a positive mean value. The rate of accumulation of eigenvalues to zero is described in terms of the logarithmic capacity of the support of the electric potential. A connection between these eigenvalues and orthogonal polynomials in complex domains is established.

1. Introduction

1.1. The unperturbed Pauli operator. Let $B=B(x)$, $x=(x_1,x_2)\in\mathbb{R}^2$, be a real valued function which has the physical meaning of the strength of a magnetic field in $\mathbb{R}^2$. A two-dimensional non-relativistic electron in the external magnetic field $B$ can be described by the Pauli operator

$$h=\begin{pmatrix}h^+&0\\0&h^-\end{pmatrix}\quad\text{in }L^2(\mathbb{R}^2)\oplus L^2(\mathbb{R}^2).$$

The standard approach to the definition of the operators $h^\pm$ in $L^2(\mathbb{R}^2)$ involves introducing the magnetic vector potential $A(x)=(A_1(x),A_2(x))$ such that $B=\partial_{x_1}A_2-\partial_{x_2}A_1$ and setting

$$h^\pm=(-i\nabla-A)^2\mp B. \tag{1.1}$$

Instead, we adopt the approach advocated in [6], which consists of defining $h^\pm$ in terms of a solution $\Psi=\Psi(x)$ to the differential equation $\Delta\Psi=B$. Assume that $B$ is such that a solution $\Psi$ can be chosen subject to the condition

$$\Psi(x)=\frac{B_0}{4}|x|^2+\Psi_1(x),\qquad\Psi_1=\overline{\Psi_1}\in L^\infty(\mathbb{R}^2),\qquad B_0>0. \tag{1.2}$$

On leave of absence from King’s College London, U.K.



Important examples of magnetic fields $B$ of this class are periodic fields with mean value $B_0$ and constant magnetic fields $B(x)=B_0$. Next, denote, as usual, $\partial=\frac12(\partial_{x_1}-i\partial_{x_2})$ and $\bar\partial=\frac12(\partial_{x_1}+i\partial_{x_2})$. Consider the quadratic forms

$$h^+[u]=4\int_{\mathbb{R}^2}|\bar\partial(e^{\Psi(x)}u(x))|^2e^{-2\Psi(x)}\,dx,\qquad h^-[u]=4\int_{\mathbb{R}^2}|\partial(e^{-\Psi(x)}u(x))|^2e^{2\Psi(x)}\,dx, \tag{1.3}$$

which are closed on the domains $\operatorname{Dom}(h^\pm)=\{u\in L^2(\mathbb{R}^2)\mid h^\pm[u]<\infty\}$. Let us define $h^\pm$ as the self-adjoint operators in $L^2(\mathbb{R}^2)$, corresponding to the quadratic forms $h^\pm$. For a wide class of magnetic fields this definition is equivalent to the standard definition (1.1) with $A=(-\partial_{x_2}\Psi,\partial_{x_1}\Psi)$; see [6] for a detailed analysis of this issue. In fact, the magnetic field $B$ or the magnetic vector potential $A$ do not enter directly either the definition of Pauli operator or any of our considerations; instead, the ‘potential function’ $\Psi$ becomes the main functional parameter. Note that condition (1.2) is very close to the ‘admissibility’ condition used in [16]. We will denote by $h_0^\pm$ the above defined forms and operators corresponding to the case of the constant magnetic field $B(x)=B_0>0$. Note that for all $u\in\operatorname{Dom}(h_0^-)$ one has

$$h_0^-[u]\ge2B_0\|u\|^2. \tag{1.4}$$

1.2. Zero modes and the spectral gap. It is well known that the Pauli operator $h$ has an infinite-dimensional kernel. More precisely, by the argument dating back to [1], we have $\operatorname{Ker}h^-=\{0\}$ and

$$\operatorname{Ker}h^+=\{u\in L^2(\mathbb{R}^2)\mid u(x)=f(x)e^{-\Psi(x)},\ \bar\partial f=0\},\qquad\dim\operatorname{Ker}h^+=\infty. \tag{1.5}$$

Next, the following well known supersymmetric argument (which has appeared in many forms in the literature; see e.g. [9] or [16]) establishes the existence of a spectral gap $(0,m)$, $m>0$, of the operator $h$. Let $a_0$ and $a_0^*$ be the annihilation and creation operators in $L^2(\mathbb{R}^2)$, corresponding to the constant component $B_0>0$ of the magnetic field $B$:

$$a_0=-2ie^{-B_0|x|^2/4}\,\bar\partial\,e^{B_0|x|^2/4},\qquad a_0^*=-2ie^{B_0|x|^2/4}\,\partial\,e^{-B_0|x|^2/4}. \tag{1.6}$$

Then one can define

$$a=e^{-\Psi_1}a_0e^{\Psi_1}\ \text{ on }\ \operatorname{Dom}(a)=\{e^{-\Psi_1}u\mid u\in\operatorname{Dom}(a_0)\}\quad\text{and}\quad a^*=e^{\Psi_1}a_0^*e^{-\Psi_1}. \tag{1.7}$$

In terms of these operators, we have $h^+=a^*a$ and $h^-=aa^*$, and therefore

$$\sigma(h^+)\setminus\{0\}=\sigma(h^-)\setminus\{0\}. \tag{1.8}$$

Finally, comparing the form $h^-$ with $h_0^-$ and using (1.4), one obtains (see e.g. [3] or [16, Prop. 1.2])

$$h^-[u]\ \ge\ e^{2\operatorname{ess\,inf}\Psi_1}h_0^-[e^{-\Psi_1}u]\ \ge\ 2B_0e^{2\operatorname{ess\,inf}\Psi_1}\|e^{-\Psi_1}u\|^2\ \ge\ 2B_0e^{-2\operatorname{osc}\Psi_1}\|u\|^2 \tag{1.9}$$



for all $u\in\operatorname{Dom}(h^-)$, where $\operatorname{osc}\Psi_1=\operatorname{ess\,sup}\Psi_1-\operatorname{ess\,inf}\Psi_1$. It follows that $(0,m)$, $m=2B_0e^{-2\operatorname{osc}\Psi_1}>0$, is a gap in the spectrum of $h^+$ and of $h$. See [5] for a different point of view on the issue of existence of the spectral gap and [18] for recent progress on this topic.

1.3. Perturbations of the Pauli operator and spectral asymptotics. Let $v\in L^p(\mathbb{R}^2)$, $p>1$, be a non-negative compactly supported function, which has the physical meaning of the electric potential. The Pauli operator which describes a particle in the external magnetic field $B$ and the electric field with the potential $v$ is

$$h+vI=\begin{pmatrix}h^++v&0\\0&h^-+v\end{pmatrix}\quad\text{in }L^2(\mathbb{R}^2)\oplus L^2(\mathbb{R}^2).$$

Along with $h+vI$, we will also consider the operator $h-vI$. In order to define $h^+\pm v$ as a quadratic form sum, let us establish that $v$ is $h^+$-form compact. By (1.7), (1.8) and boundedness of $\Psi_1$, we see that $v$ is $h^+$-form compact if and only if $ve^{-2\Psi_1}$ is $h_0^+$-form compact. As $ve^{-2\Psi_1}\in L^p(\mathbb{R}^2)$, $p>1$, we obtain that $ve^{-2\Psi_1}$ is $h_0^+$-form compact (see [2]). A similar argument shows that $v$ is $h^-$-form compact and so the operators $h^-\pm v$ are also well defined.

The main object of interest in this paper is the rate of accumulation of the eigenvalues of $h\pm vI$ to zero. By the above established relative compactness and by the estimate (1.9), the operators $h^-\pm v$ can have only finitely many eigenvalues in a sufficiently small neighbourhood of zero. Thus, the question reduces to that of the rate of accumulation of the eigenvalues of $h^+\pm v$ to zero. Due to the assumption $v\ge0$, the eigenvalues of $h^++v$ can accumulate to 0 only from above, and the eigenvalues of $h^+-v$ can do so only from below. Let $\lambda_1^-\le\lambda_2^-\le\cdots$ be the negative eigenvalues of $h^+-v$, and $\lambda_1^+\ge\lambda_2^+\ge\cdots$ be the eigenvalues of $h^++v$ in the spectral gap $(0,m)$; here and in the rest of the paper, we assume eigenvalues to be enumerated with multiplicities taken into account. Our aim is to describe the rate of convergence $\lambda_n^\pm\to0$ as $n\to\infty$. Roughly speaking, we prove the following asymptotics (precise statements are given in Sect. 2):

$$\log(\pm n!\,\lambda_n^\pm)=n\log(B_0/2)+2n\log\operatorname{Cap}(\operatorname{supp}v)+o(n),\qquad n\to\infty, \tag{1.10}$$

where Cap is the logarithmic capacity of a set. The notion of logarithmic capacity is introduced in the framework of potential theory; see e.g. [7, 11]. Recall that the logarithmic capacity of compact sets in $\mathbb{R}^2$ has the following properties: (i) if $\Omega_1\subset\Omega_2$ then $\operatorname{Cap}\Omega_1\le\operatorname{Cap}\Omega_2$; (ii) $\operatorname{Cap}\Omega$ coincides with the logarithmic capacity of the outer boundary of $\Omega$ (= the boundary of the unbounded component of $\mathbb{R}^2\setminus\Omega$); (iii) the logarithmic capacity of a disc of radius $r$ is $r$; (iv) if $\Omega_2=\{\alpha x\mid x\in\Omega_1\}$, $\alpha>0$, then $\operatorname{Cap}\Omega_2=\alpha\operatorname{Cap}\Omega_1$. We establish (1.10) by means of the following simple chain of equivalent reformulations of the problem. Firstly, a perturbation theory argument reduces the problem to the spectral asymptotics of an auxiliary compact self-adjoint operator $P_0vP_0$, where $P_0$ is the spectral projection of $h^+$, corresponding to the eigenvalue 0. Next, we observe that the eigenvalues of $P_0vP_0$ coincide with the singular values of a certain embedding operator (see (4.2)). Using the approach of [13], we relate the singular numbers of this



embedding operator to some sequence of orthogonal polynomials in the complex domain (see below). Finally, application of the results of [22] concerning the asymptotics of these orthogonal polynomials leads to (1.10). Using the same technique, we are also able to treat two similar problems. First, we consider the Pauli operators $h_0^+\pm v$ in the case of a constant magnetic field and describe the rate of accumulation of the eigenvalues to the higher Landau levels. Secondly, we consider the three-dimensional Pauli Hamiltonian with a constant magnetic field and a compactly supported electric potential and describe the rate of convergence of eigenvalues to 0. These results are presented in Sect. 2. The rate of convergence of eigenvalues to zero for Pauli operators in dimensions two and three was investigated before in the case of constant magnetic field $B(x)=B_0>0$ for various classes of potentials $v$ with power or exponential decay at infinity; see [21, 19, 23, 14, 15, 10, 17]. We refer the reader to the discussion in [17]. The case of a constant magnetic field and compactly supported potentials $v$ was considered in [17] and [12]. The case of a two-dimensional operator with variable magnetic field and potentials $v$ with power or exponential decay and also with compactly supported potentials was treated in [16]. The results of [17, 12, 16] for the case of compactly supported potentials read as

$$\log(\pm\lambda_n^\pm)=-n\log n+O(n),\qquad n\to\infty. \tag{1.11}$$
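The relation between (1.10) and (1.11) is Stirling's formula: subtracting log n! = n log n − n + O(log n) from (1.10) leaves −n log n plus a term of order n. The following sketch checks this numerically. It is an illustration only: it drops the o(n) term of (1.10), and the function names and the sample values of B0 and of the capacity are assumptions, not quantities from the paper.

```python
import math

def log_lambda_model(n, B0, cap):
    # Leading-order model taken from (1.10) with the o(n) term dropped:
    #   log(n! * lambda_n) = n log(B0/2) + 2 n log(cap)
    # so log(lambda_n) is that expression minus log(n!) = lgamma(n + 1).
    return n * math.log(B0 / 2) + 2 * n * math.log(cap) - math.lgamma(n + 1)

def normalized_remainder(n, B0, cap):
    # (log lambda_n + n log n) / n; (1.11) asserts that this stays bounded.
    return (log_lambda_model(n, B0, cap) + n * math.log(n)) / n

def limiting_constant(B0, cap):
    # Value the normalized remainder approaches, by Stirling's formula.
    return 1.0 + math.log(B0 / 2) + 2 * math.log(cap)

if __name__ == "__main__":
    for n in (10, 1000, 10**6):
        print(n, normalized_remainder(n, 3.0, 0.5), limiting_constant(3.0, 0.5))
```

In this model the O(n) term of (1.11) is n(1 + log(B0/2) + 2 log Cap) up to lower-order corrections, which shows what extra information (1.10) carries beyond (1.11).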

As far as we are aware, a connection between the spectral asymptotics of magnetic operators and logarithmic capacity or orthogonal polynomials has not been made before. Some physical intuition concerning these problems with constant magnetic field can be gained from [8].

1.4. Orthogonal and Chebyshev polynomials. We identify $\mathbb{R}^2$ and $\mathbb{C}$ in a standard way: $z=x_1+ix_2$ for $(x_1,x_2)\in\mathbb{R}^2$, denote by $dm(z)$ the Lebesgue measure in $\mathbb{C}$ and consider $v$ as a function of $z$. It appears that the sequence of polynomials in $z$, orthogonal with respect to the measure $v(z)dm(z)$, is related to the asymptotics of $\lambda_n^\pm$. Here we recall necessary facts from the theory of Chebyshev and orthogonal polynomials in complex domains; see e.g. [7] and [22] for the details. For any $n=0,1,2,\dots$, let $P_n$ be the set of all monic polynomials in $z$ of degree $n$:

$$P_n=\{z^n+a_{n-1}z^{n-1}+\cdots+a_1z+a_0\mid a_0,\dots,a_{n-1}\in\mathbb{C}\}. \tag{1.12}$$

Let $\Omega\subset\mathbb{C}$ be a compact set. For a fixed $n$, consider the problem of minimization of the norm $\|t\|_{C(\Omega)}\equiv\sup_{z\in\Omega}|t(z)|$ on the set $t\in P_n$. It is clear that the minimum is positive and attained at some polynomial $t_n\in P_n$. The polynomial $t_n$ is called the $n$th Chebyshev polynomial for the set $\Omega$. (One can prove that such a polynomial is unique, but we will not need this fact). It is well known that all zeros of $t_n$ lie in the closed convex hull of $\Omega$. The $n$th root asymptotics of $t_n$ is given by

$$\lim_{n\to\infty}\|t_n\|_{C(\Omega)}^{1/n}=\operatorname{Cap}\Omega. \tag{1.13}$$
Next, let $v\in L^1(\mathbb{C},dm)$ be a non-negative compactly supported function. Denote

$$M_n(v)=\inf_{p\in P_n}\int_{\mathbb{C}}|p(z)|^2v(z)\,dm(z). \tag{1.14}$$



It is easy to prove that the infimum in (1.14) is attained at the polynomials $\{p_n\}_{n=0}^\infty$, $p_n\in P_n$, which can be obtained by applying the Gram–Schmidt orthogonalisation process in $L^2(\mathbb{C},v(z)dm(z))$ to the sequence $1,z,z^2,\dots$. All zeros of $p_n$ lie in the closed convex hull of $\operatorname{supp}v$. Regarding the $n$th root asymptotics of $p_n$, the following facts are known (see [22]). Denote

$$\rho_+(v)=\limsup_{n\to\infty}M_n(v)^{1/n},\qquad\rho_-(v)=\liminf_{n\to\infty}M_n(v)^{1/n}. \tag{1.15}$$
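For a rotationally symmetric weight the quantities in (1.14) and (1.15) can be written down explicitly, which gives a quick check of the definitions. The sketch below assumes, purely for illustration, that v is the indicator function of the disc of radius R centred at the origin. The monomials are then mutually orthogonal, so Gram–Schmidt returns p_n(z) = z^n and M_n(v) = π R^(2n+2)/(n+1). The function names are hypothetical.

```python
import math

def M_n_disc_quadrature(n, R, steps=20_000):
    # Midpoint rule for  ∫_{|z|<R} |z|^(2n) dm(z) = 2π ∫_0^R r^(2n+1) dr,
    # i.e. M_n(v) for v = indicator of the disc (an illustrative choice).
    h = R / steps
    return 2 * math.pi * h * sum(
        ((k + 0.5) * h) ** (2 * n + 1) for k in range(steps)
    )

def log_M_n_disc(n, R):
    # Logarithm of the closed form  π R^(2n+2) / (n + 1).
    return math.log(math.pi) + (2 * n + 2) * math.log(R) - math.log(n + 1)

def nth_root_M_n(n, R):
    # M_n(v)^(1/n), computed through logarithms to avoid overflow.
    return math.exp(log_M_n_disc(n, R) / n)

if __name__ == "__main__":
    R = 2.0
    print(M_n_disc_quadrature(5, R), math.exp(log_M_n_disc(5, R)))
    for n in (10, 100, 2000):
        print(n, nth_root_M_n(n, R))  # approaches R**2
```

Here the limit of M_n(v)^(1/n) exists and equals R², the square of the capacity of the disc, so the upper and lower limits in (1.15) coincide for this weight.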

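In general, it can happen that $\rho_-(v)<\rho_+(v)$ (see the proof of Theorem 1.1.9 in [22]). One has the estimates

$$\rho_+(v)\le(\operatorname{Cap}\operatorname{supp}v)^2,\qquad\rho_-(v)\ge(\operatorname{Cap}\Omega_-(v))^2, \tag{1.16}$$

where

$$\Omega_-(v)=\Bigl\{z\in\mathbb{C}\ \Big|\ \limsup_{r\to+0}\frac{\log\int_{|z-\zeta|\le r}v(\zeta)\,dm(\zeta)}{\log r}<\infty\Bigr\}.$$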

The first inequality in (1.16) is a part of Corollary 1.1.7 of [22]. The second inequality in (1.16), although not stated explicitly in [22], follows directly from the proof of Theorem 4.2.1 therein.

Remark 1. Let $\Omega\subset\mathbb{C}$ be a compact set with a Lipschitz boundary, and let $v\in L^1(\mathbb{C},dm)$ be such that $v(z)\ge c>0$ for all $z\in\Omega$ and $v(z)=0$ for all $z\in\mathbb{C}\setminus\Omega$. Then we easily find that $\Omega_-(v)=\Omega=\operatorname{supp}v$ and therefore $\rho_+(v)=\rho_-(v)=(\operatorname{Cap}\Omega)^2$.

2. Main Results

2.1. Two-dimensional Pauli operators with variable magnetic field. Let $h^+$, as in the Introduction, be the Pauli operator defined via (1.3) with $\Psi$ subject to (1.2). Let $v$ and $\lambda_n^\pm$ be as in the Introduction.

Theorem 1. Let $0\le v\in L^p(\mathbb{R}^2)$, $p>1$, be a compactly supported potential and let $M_n(v)$ be as defined in (1.14). Then there exists $k\in\mathbb{N}$ such that

$$(B_0/2)M_{n+k}(v)^{1/n}(1+o(1))\ \le\ (n!\,\lambda_n^+)^{1/n}\ \le\ (B_0/2)M_{n-1}(v)^{1/n}(1+o(1)), \tag{2.1}$$

$$(B_0/2)M_{n-1}(v)^{1/n}(1+o(1))\ \le\ (-n!\,\lambda_n^-)^{1/n}\ \le\ (B_0/2)M_{n-k}(v)^{1/n}(1+o(1)), \tag{2.2}$$

as $n\to\infty$. In particular,

$$\limsup_{n\to\infty}(\pm n!\,\lambda_n^\pm)^{1/n}=B_0\rho_+(v)/2,\qquad\liminf_{n\to\infty}(\pm n!\,\lambda_n^\pm)^{1/n}=B_0\rho_-(v)/2,$$

where $\rho_\pm(v)$ are defined by (1.15). If $v$ is of the class described in Remark 1, then the asymptotics (1.10) holds true.

Remark 2. Let $\mu$ be a compactly supported finite measure in $\mathbb{R}^2$ such that the quadratic form $\int_{\mathbb{R}^2}|u(x)|^2\,d\mu(x)$ is compact with respect to the quadratic form $h^+$. Then one can define the self-adjoint operators corresponding to the quadratic forms

$$h^+[u]\pm\int_{\mathbb{R}^2}|u(x)|^2\,d\mu(x).$$

All our considerations remain valid for such operators. For example, the case of a measure $\mu$, supported by a curve, can be interesting.



2.2. Two-dimensional Pauli operators with constant magnetic field. Let $B(x)=B_0>0$; consider the corresponding operator $h_0^+$. As is well known, the spectrum of $h_0^+$ consists of the eigenvalues $\{2qB_0\}_{q=0}^\infty$ of infinite multiplicities; these eigenvalues are known as Landau levels. Consider the problem of accumulation of eigenvalues of $h_0^+\pm v$ to a fixed higher Landau level $2qB_0$, $q\ge1$. Let $\lambda_{q,1}^-\le\lambda_{q,2}^-\le\cdots$ be the eigenvalues of $h_0^+-v$ in the interval $(2(q-1)B_0,2qB_0)$, and let $\lambda_{q,1}^+\ge\lambda_{q,2}^+\ge\cdots$ be the eigenvalues of $h_0^++v$ in $(2qB_0,2(q+1)B_0)$.

Theorem 2. Let $\Omega\subset\mathbb{R}^2$ be a compact set with Lipschitz boundary and let $v\in L^p(\mathbb{R}^2)$, $p>1$, be such that $v(x)\ge c>0$ for $x\in\Omega$ and $v(x)=0$ for $x\in\mathbb{R}^2\setminus\Omega$. Then for the corresponding eigenvalues $\lambda_{q,n}^\pm$ we have:

$$\lim_{n\to\infty}\bigl(\pm n!\,(\lambda_{q,n}^\pm-2qB_0)\bigr)^{1/n}=\frac{B_0}{2}(\operatorname{Cap}\Omega)^2.$$

The rate of convergence of $\lambda_{q,n}^\pm\to2qB_0$ as $n\to\infty$, $q\ge1$, was studied before in [17, 12], where the asymptotics

$$\log\bigl(\pm(\lambda_{q,n}^\pm-2qB_0)\bigr)=-n\log n+O(n),\qquad n\to\infty$$

was obtained. Note that if the potential $v$ depends only on $|x|$, then the result of Theorem 2 can be obtained by a direct calculation using separation of variables, see e.g. [17, Prop. 3.2].

2.3. Three-dimensional Pauli operator with a constant magnetic field. Let

$$H=(-i\nabla-A(x))^2-B_0\quad\text{in }L^2(\mathbb{R}^3,dx),\qquad x=(x_1,x_2,x_3),$$

where $A(x)=(-\tfrac12B_0x_2,\tfrac12B_0x_1,0)$. It is well known that the spectrum of $H$ is absolutely continuous and coincides with the interval $[0,\infty)$. The background information concerning the spectral theory of $H$ and its perturbations can be found in [2]. Let $V\in L^{3/2}(\mathbb{R}^3)$ be a non-negative compactly supported potential. The operator of multiplication by $V$ in $L^2(\mathbb{R}^3)$ is $H$-form compact (cf. [2]). Thus, one can define the self-adjoint operator $H-V$ via the corresponding quadratic form; the essential spectrum of $H-V$ is also $[0,\infty)$. Let $\Lambda_1\le\Lambda_2\le\cdots$ be the negative eigenvalues of $H-V$; we have $\Lambda_n\to0$ as $n\to\infty$. Below we describe the asymptotic behaviour of $\Lambda_n$ as $n\to\infty$ in terms of the auxiliary weight function

$$w(x_1,x_2)=\int_{-\infty}^{\infty}V(x_1,x_2,x_3)\,dx_3,\qquad\text{a.e. }(x_1,x_2)\in\mathbb{R}^2. \tag{2.3}$$

As above, we consider $w$ as a function of $z=x_1+ix_2$.

Theorem 3. Let $0\le V\in L^{3/2}(\mathbb{R}^3)$ be a compactly supported potential and $w$ be defined by (2.3). Then there exists $k\in\mathbb{N}$ such that

$$(B_0/2)^2M_{n+k}(w)^{2/n}(1+o(1))\ \le\ (-(n!)^2\Lambda_n)^{1/n}\ \le\ (B_0/2)^2M_{n-k}(w)^{2/n}(1+o(1)), \tag{2.4}$$

as $n\to\infty$. In particular,

$$\limsup_{n\to\infty}(-(n!)^2\Lambda_n)^{1/n}=(B_0\rho_+(w)/2)^2,\qquad\liminf_{n\to\infty}(-(n!)^2\Lambda_n)^{1/n}=(B_0\rho_-(w)/2)^2,$$

where $\rho_\pm(w)$ are defined by (1.15).


The rate of accumulation $\Lambda_n\to0$ for potentials $V$ with power or exponential decay was considered before in [21, 19, 23, 14, 15, 10, 17]. For compactly supported potentials, this problem was considered in [17, 12], where the asymptotics

$$\log(-\Lambda_n)=-2n\log n+O(n),\qquad n\to\infty$$

was obtained.

Remark 3. Theorem 3 remains valid under the following assumptions on $V$: (i) $V\ge0$, $V$ is $H$-form compact; (ii) $\int_{\mathbb{R}^3}V(x)(1+|x_3|^2)\,dx<\infty$; (iii) the function $w$, defined by (2.3), is compactly supported.

3. Proof of Theorems 1, 2 and 3

Proof of Theorem 1. Let $\mathcal{H}_0\subset L^2(\mathbb{R}^2)$ be the kernel of $h^+$, and let $P_0$ be the corresponding eigenprojection, $\operatorname{Ran}P_0=\mathcal{H}_0$. Consider the compact self-adjoint operator $P_0vP_0$. The key ingredient in the proof is the following

Lemma 1. Let $v\in L^1(\mathbb{R}^2)$ be a non-negative compactly supported function and let $s_1\ge s_2\ge\cdots>0$ be the eigenvalues of $P_0vP_0$. Then

$$(n!\,s_{n+1})^{1/n}=(B_0/2)M_n(v)^{1/n}(1+o(1)),\qquad n\to\infty, \tag{3.1}$$

where $M_n(v)$ are defined by (1.14).

The proof is given in Sect. 4. Now it remains to employ a perturbation theory argument (see [16, Prop. 3.1] or [17, Prop. 4.1]) based on the Birman-Schwinger principle and on Weyl inequalities for eigenvalues of a sum of compact operators. This argument shows that there exists $k\in\mathbb{N}$ such that for all sufficiently large $n\in\mathbb{N}$ one has

$$s_n\le-\lambda_n^-\le2s_{n-k},\qquad\tfrac12s_{n+k}\le\lambda_n^+\le s_n.$$

Combining these inequalities with Lemma 1, we obtain the required result.

Proof of Theorem 2. For any $q\ge0$, denote $\mathcal{H}_q=\operatorname{Ker}(h_0^+-2qB_0)$ and let $P_q$ be the eigenprojection of $h_0^+$ corresponding to the eigenvalue $2qB_0$. Consider the compact self-adjoint operator $P_qvP_q$, and let $s_1^{(q)}\ge s_2^{(q)}\ge\cdots$ be the eigenvalues of this operator. As in the proof of Theorem 1, using a perturbation theory argument based on the Birman-Schwinger principle and on Weyl inequalities (see [17, Prop. 4.1]), one shows that there exists $k\in\mathbb{N}$ such that for all sufficiently large $n\in\mathbb{N}$,

$$\tfrac12s_{n+k}^{(q)}\ \le\ \pm(\lambda_{q,n}^\pm-2qB_0)\ \le\ 2s_{n-k}^{(q)}. \tag{3.2}$$

Now the proof of Theorem 2 reduces to

Lemma 2. Let $\Omega\subset\mathbb{R}^2$ be a compact set with Lipschitz boundary and let $v\in L^1(\mathbb{R}^2)$ be such that $v(x)\ge c>0$ for $x\in\Omega$ and $v(x)=0$ for $x\in\mathbb{R}^2\setminus\Omega$. Fix $q\in\mathbb{N}$ and let $s_1^{(q)}\ge s_2^{(q)}\ge\cdots$ be the eigenvalues of $P_qvP_q$. Then one has

$$\lim_{n\to\infty}(n!\,s_n^{(q)})^{1/n}=(B_0/2)(\operatorname{Cap}\Omega)^2. \tag{3.3}$$



The proof of Lemma 2 is given in Sect. 5. From Lemma 2 and the estimate (3.2), we immediately obtain the required result.

Proof of Theorem 3. The proof repeats almost word for word the construction of [19]. According to the Birman-Schwinger principle, for $E>0$ we have:

$$\#\{n\mid\Lambda_n<-E\}=n_+\bigl(1;\sqrt V(H_0+E)^{-1}\sqrt V\bigr). \tag{3.4}$$

The operator $\sqrt V(H_0+E)^{-1}\sqrt V$ can be represented as

$$\sqrt V(H_0+E)^{-1}\sqrt V=\frac{1}{2\sqrt E}K_1+K_2+K_3.$$

Here $K_1$, $K_2$ are the operators in $L^2(\mathbb{R}^3)$ with the integral kernels

$$K_1(x,y)=\sqrt{V(x)}\,P_0(x_\perp,y_\perp)\sqrt{V(y)},\qquad K_2(x,y)=\sqrt{V(x)}\,P_0(x_\perp,y_\perp)\,\frac{e^{-\sqrt E|x_3-y_3|}-1}{2\sqrt E}\,\sqrt{V(y)},$$

where the notation $x_\perp=(x_1,x_2)$, $y_\perp=(y_1,y_2)$ is used, and $P_0(x_\perp,y_\perp)$ is the integral kernel of the operator $P_0$ in $L^2(\mathbb{R}^2)$. Finally, $K_3$ is the operator

$$K_3=\sqrt V\,Q_0(H_0+E)^{-1}\sqrt V,$$

where $Q_0=(I-P_0)\otimes I$ in the decomposition $L^2(\mathbb{R}^3,dx_1dx_2dx_3)=L^2(\mathbb{R}^2,dx_1dx_2)\otimes L^2(\mathbb{R},dx_3)$. The operators $K_2$ and $K_3$ have limits (in the operator norm) as $E\to+0$; these limits are compact self-adjoint operators. Thus, by Weyl’s inequalities for eigenvalues (see e.g. [4]), we have for $E\to+0$:

$$n_+\bigl(1;\sqrt V(H_0+E)^{-1}\sqrt V\bigr)\ \le\ n_+\Bigl(\tfrac12;\tfrac{1}{2\sqrt E}K_1\Bigr)+n_+\bigl(\tfrac12;K_2+K_3\bigr)\ \le\ n_+(\sqrt E;K_1)+O(1), \tag{3.5}$$

$$n_+\bigl(1;\sqrt V(H_0+E)^{-1}\sqrt V\bigr)\ \ge\ n_+\Bigl(\tfrac32;\tfrac{1}{2\sqrt E}K_1\Bigr)-n_+\bigl(\tfrac12;-K_2-K_3\bigr)\ \ge\ n_+(3\sqrt E;K_1)-O(1). \tag{3.6}$$

Finally, again as in [19], let us prove that the non-zero eigenvalues of $K_1$ coincide with those of $P_0wP_0$, where $w$ is defined by (2.3). It suffices to prove this statement for continuous $V$ with compact support; the general case $V\in L^{3/2}$ then follows by an approximation argument. Let $N_1:L^2(\mathbb{R}^3,dx_1dx_2dx_3)\to L^2(\mathbb{R}^2,dx_1dx_2)$ and $N_2:L^2(\mathbb{R}^2,dx_1dx_2)\to L^2(\mathbb{R}^3,dx_1dx_2dx_3)$ be the following operators:

$$(N_1u)(x_1,x_2)=\int_{-\infty}^{\infty}V^{1/2}(x_1,x_2,x_3)\,u(x_1,x_2,x_3)\,dx_3,\qquad(N_2u)(x_1,x_2,x_3)=V^{1/2}(x_1,x_2,x_3)\,u(x_1,x_2).$$
Then K1 = N2 P0 N1 = (N2 P0 )(P0 N1 ) and P0 wP0 = (P0 N1 )(N2 P0 ). It follows that the non-zero eigenvalues of K1 coincide with {sn }, the non-zero eigenvalues of P0 wP0 ,


767

√ and so n+ ( E; K1 ) = {n | (sn )2 > E}. From here and (3.4), (3.5), (3.6) it follows that for some k ∈ N and all sufficiently large n ∈ N, one has 1 (sn+k )2 n (sn−k )2 . 9

(3.7)

Combining this with Lemma 1, we get the statement of Theorem 3.

4. Proof of Lemma 1

First let us consider the case of a constant magnetic field $B(x) = B_0 > 0$. Let $F^2$ be the Hilbert space of all entire functions $f$ such that

$$\|f\|^2_{F^2} = \int_{\mathbb{C}} |f(z)|^2 e^{-B_0 |z|^2/2}\, dm(z) < \infty. \tag{4.1}$$

In the case $B_0 = 2$, the space $F^2$ is usually called Fock space or Segal-Bargmann space. By (1.5), we have an isometry between $\mathcal{H}_0 = \operatorname{Ker} h^+ \subset L^2(\mathbb{C}, dm)$ and $F^2$, given by $u(z) = e^{-B_0 |z|^2/4} f(z)$, $u \in \mathcal{H}_0$, $f \in F^2$. Thus, the quadratic form of the operator $P_0 v P_0|_{\mathcal{H}_0}$ is unitarily equivalent to the quadratic form

$$\int_{\mathbb{C}} |f(z)|^2 v(z) e^{-B_0 |z|^2/2}\, dm(z), \quad f \in F^2.$$

It follows that the non-zero eigenvalues $s_n$ of $P_0 v P_0$ coincide with the singular values $\mu_n$ of the embedding operator

$$F^2 \subset L^2\bigl(\mathbb{C}, v(z) e^{-B_0 |z|^2/2}\, dm(z)\bigr). \tag{4.2}$$

The case of a variable magnetic field can also be reduced to the embedding (4.2). Indeed, if $s_n$ correspond to the case of a variable magnetic field, then, using the boundedness of $\Psi_1$, one obtains (see [16, Prop. 3.2]):

$$\mu_n e^{-2 \operatorname{osc} \Psi_1} \le s_n \le \mu_n e^{2 \operatorname{osc} \Psi_1}, \quad n \in \mathbb{N}.$$

Thus, it remains to prove the asymptotic formula

$$(n!\, \mu_{n+1})^{1/n} = (B_0/2)\, M_n(v)^{1/n} (1 + o(1)), \quad n \to \infty, \tag{4.3}$$

for the singular values $\mu_n$ of the embedding (4.2). We shall assume $B_0 = 2$; the general case can be reduced to this one by a linear change of coordinates. The embedding $F^2 \subset C(\Omega)$, where $\Omega$ is a compact set in $\mathbb{C}$, was studied in [13]. Below we repeat the arguments of [13] (with trivial modifications) to obtain the required asymptotics. By the minimax principle, we have the following variational characterisation of $\mu_n$:

$$\mu_{n+1} = \inf_{L_n^+ \subset F^2}\ \sup_{f \in L_n^+ \setminus \{0\}} \frac{\int_{\mathbb{C}} |f(z)|^2 v(z) e^{-|z|^2}\, dm(z)}{\|f\|^2_{F^2}}, \quad \operatorname{codim} L_n^+ = n, \tag{4.4}$$

$$\mu_{n+1} = \sup_{L_n^- \subset F^2}\ \inf_{f \in L_n^- \setminus \{0\}} \frac{\int_{\mathbb{C}} |f(z)|^2 v(z) e^{-|z|^2}\, dm(z)}{\|f\|^2_{F^2}}, \quad \dim L_n^- = n + 1. \tag{4.5}$$



Upper bound on $\mu_{n+1}$. 1. For the subspaces $L_n^+$ from (4.4), we will take

$$L_n^+ = \{f \in F^2 \mid f(z) = p_n(z) g(z),\ g \text{ is an entire function}\},$$

where $p_n$ is the sequence of monic polynomials orthogonal with respect to the measure $v(z)\,dm(z)$. In order to estimate the ratio in (4.4) from above, let us prove the following auxiliary statement. Denote $R_0 = \max_{z \in \operatorname{supp} v} |z|$. We claim that for any $\varepsilon \in (0, \tfrac13)$, there exists $N \in \mathbb{N}$ such that for all $n \ge N$ and any $f = p_n g \in L_n^+$, we have

$$\sup_{|z| \le R_0} |g(z)|^2 \le (1 - \varepsilon)^{-2n}\, \frac{1}{n!}\, \|p_n g\|^2_{F^2}. \tag{4.6}$$

Indeed, we have

$$g(z) = \frac{1}{2\pi i} \int_{|\zeta| = r} \frac{f(\zeta)\, d\zeta}{p_n(\zeta)(\zeta - z)}, \quad r > R_0,$$

and therefore

$$\sup_{|z| \le R_0} |g(z)|^2 \le \frac{r}{2\pi} \sup_{|z| \le R_0} \int_{|\zeta| = r} \frac{|f(\zeta)|^2\, d|\zeta|}{|p_n(\zeta)|^2 |\zeta - z|^2}$$

for any $r > R_0$. Denote $R = R_0/\varepsilon$. Since all zeros of $p_n$ lie in the closed convex hull of $\operatorname{supp} v$, we obtain:

$$|p_n(\zeta)||\zeta - z| \ge ((1 - \varepsilon) r)^{n+1}, \quad |z| \le R_0, \quad |\zeta| = r \ge R.$$

Thus, we get

$$\sup_{|z| \le R_0} |g(z)|^2 \le \frac{r^{-2n-1}}{2\pi (1 - \varepsilon)^{2n+2}} \int_{|\zeta| = r} |f(\zeta)|^2\, d|\zeta|, \quad r \ge R.$$

Integrating the last inequality over $r$ from $R$ to $\infty$ with the weight $e^{-r^2} r^{2n+1}$, and using the fact that

$$\int_R^\infty e^{-r^2} r^{2n+1}\, dr = \frac12\, n! - \int_0^R e^{-r^2} r^{2n+1}\, dr \ge \frac{1}{2\pi} (1 - \varepsilon)^{-2}\, n!$$

for all sufficiently large $n$, we obtain (4.6).

2. From (4.6) we obtain for any $f = p_n g \in L_n^+$:

$$\int_{\mathbb{C}} |f(z)|^2 v(z) e^{-|z|^2}\, dm(z) \le M_n(v)\, \|g\|^2_{C(\operatorname{supp} v)} \le M_n(v) (1 - \varepsilon)^{-2n}\, \frac{\|f\|^2_{F^2}}{n!}.$$

Together with (4.4), the last estimate yields

$$(n!\, \mu_{n+1})^{1/n} \le (1 - \varepsilon)^{-2} M_n(v)^{1/n} \tag{4.7}$$

for all sufficiently large $n$.



Lower bound for $\mu_{n+1}$. Let us use formula (4.5) and take $L_n^-$ to be the set of all polynomials in $z$ of degree $\le n$. As in the proof of the upper bound, we denote $R_0 = \max_{z \in \operatorname{supp} v} |z|$, fix $\varepsilon > 0$ and set $R = R_0/\varepsilon$. We shall use the following norm in $F^2$:

$$|||f|||^2_{F^2} = \int_{|z| \ge R} |f(z)|^2 e^{-|z|^2}\, dm(z). \tag{4.8}$$

This norm is equivalent to the earlier introduced norm (4.1). Indeed, the inequality $|||f|||_{F^2} \le \|f\|_{F^2}$ is trivial; the inequality $\|f\|_{F^2} \le C(R)\, |||f|||_{F^2}$ is easy to obtain by application of the Cauchy integral formula. Let $q_n \in L_n^- \setminus \{0\}$ be the polynomial which minimizes the ratio

$$\frac{\int_{\mathbb{C}} |q_n(z)|^2 v(z) e^{-|z|^2}\, dm(z)}{|||q_n|||^2_{F^2}} \tag{4.9}$$

among all polynomials in $L_n^- \setminus \{0\}$. Without loss of generality, we may assume that $q_n$ is monic. Denote $k = \deg q_n \le n$. The following standard argument shows that all zeros of $q_n$ are confined to the disk $\{z \mid |z| \le R_0\}$. Suppose that one of the zeros $z_0$ is outside the disk; then replace $q_n(z)$ by $q_n(z)\, |z_0| (z - R_0^2/\bar z_0) / (R_0 (z - z_0))$. One has

$$\frac{|z_0|\, |z - R_0^2/\bar z_0|}{R_0 |z - z_0|} \le 1 \ \text{ for } |z| \le R_0 \quad \text{and} \quad \frac{|z_0|\, |z - R_0^2/\bar z_0|}{R_0 |z - z_0|} \ge 1 \ \text{ for } |z| \ge R_0,$$

so this change decreases the ratio (4.9), a contradiction. Thus, we get the estimate

$$|q_n(z)| \le (1 + \varepsilon)^k |z|^k, \quad |z| \ge R, \quad k = \deg q_n.$$

It follows that

$$|||q_n|||^2_{F^2} \le \int_{|z| \ge R} |z|^{2k} (1 + \varepsilon)^{2k} e^{-|z|^2}\, dm(z) \le (1 + \varepsilon)^{2k}\, \pi\, k!.$$

On the other hand, for the numerator of (4.9), we have

$$\int_{\mathbb{C}} |q_n(z)|^2 v(z) e^{-|z|^2}\, dm(z) \ge e^{-R_0^2} \int_{\mathbb{C}} |q_n(z)|^2 v(z)\, dm(z) \ge e^{-R_0^2} M_k(v).$$

Combining the above estimates, we obtain:

$$\mu_{n+1} \ge \inf_{f \in L_n^- \setminus \{0\}} \frac{\int_{\mathbb{C}} |f(z)|^2 v(z) e^{-|z|^2}\, dm(z)}{C(R)\, |||f|||^2_{F^2}} \ge \min_{0 \le k \le n} \frac{M_k(v)}{C_1(R) (1 + \varepsilon)^{2k}\, k!}. \tag{4.10}$$

As $z p_k(z) \in \mathcal{P}_{k+1}$, from the definition (1.14) of $M_k(v)$ we get a trivial estimate $M_{k+1}(v) \le R_0^2 M_k(v)$. This estimate shows that for a sufficiently large $n$, the minimum in (4.10) is attained at $k = n$. Therefore,

$$(n!\, \mu_{n+1})^{1/n} \ge \Bigl(\frac{1}{C_1(R)}\Bigr)^{1/n} \frac{M_n(v)^{1/n}}{(1 + \varepsilon)^2} \ge (1 + \varepsilon)^{-3} M_n(v)^{1/n}$$

for all sufficiently large $n$. The latter estimate together with (4.7) completes the proof of the lemma.
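As a sanity check of the asymptotics (4.3), the following minimal numerical sketch treats one explicitly solvable example that is not taken from the paper. It assumes $B_0 = 2$ and that $v$ is the indicator function of the disk $\{|z| \le \rho\}$. Under these assumptions the monomials $z^n$ diagonalize the quadratic form, $p_n(z) = z^n$, so that $n!\,\mu_{n+1} = \gamma(n+1, \rho^2)$ (the lower incomplete gamma function) and $M_n(v) = \pi \rho^{2n+2}/(n+1)$, where $M_n(v)$ is taken to be $\int |p_n|^2 v\, dm$. All function names are illustrative.

```python
import math


def log_lower_gamma(a, x, max_terms=2000):
    """Logarithm of gamma(a, x) = int_0^x t^(a-1) e^(-t) dt.

    Uses the standard series
        gamma(a, x) = x^a e^(-x) * sum_{m>=0} x^m / (a (a+1) ... (a+m)),
    evaluated in log form to avoid overflow for large a.
    """
    term = 1.0 / a
    total = term
    for m in range(1, max_terms):
        term *= x / (a + m)
        total += term
        if term < 1e-17 * total:
            break
    return a * math.log(x) - x + math.log(total)


def scaled_singular_value(n, rho):
    """(n! mu_{n+1})^(1/n) for the disk example (assumption: v = indicator, B0 = 2)."""
    return math.exp(log_lower_gamma(n + 1, rho * rho) / n)


def scaled_moment(n, rho):
    """M_n(v)^(1/n) with M_n(v) = pi rho^(2n+2) / (n+1), i.e. p_n(z) = z^n."""
    log_m = math.log(math.pi) + (2 * n + 2) * math.log(rho) - math.log(n + 1)
    return math.exp(log_m / n)


if __name__ == "__main__":
    rho = 1.5
    for n in (50, 300, 600):
        print(n, scaled_singular_value(n, rho), scaled_moment(n, rho), rho * rho)
```

In this example both quantities approach $\rho^2$, the squared logarithmic capacity of the disk, and their ratio tends to 1, as (4.3) predicts; the convergence is slow because of the $n$-th root.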



5. Proof of Lemma 2

First recall some well known facts concerning the spectral decomposition of the operator $h_0^+$. As above, we use the notation $\mathcal{H}_q = \operatorname{Ker}(h_0^+ - 2qB_0)$, and $P_q$ is the orthogonal projection in $L^2(\mathbb{C}, dm)$ onto the subspace $\mathcal{H}_q$. The operator $h_0^+$ can be represented in terms of the annihilation and creation operators (1.6) as $h_0^+ = a_0^* a_0$. The operators $a_0$, $a_0^*$ obey the commutation relation $[a_0, a_0^*] = 2B_0$, wherefrom we get the identity $a_0^q (a_0^*)^q u = (2B_0)^q q!\, u$ for all $u \in \mathcal{H}_0$ and $q \in \mathbb{N}$. It follows that

$$(2B_0)^{-q/2} (q!)^{-1/2} (a_0^*)^q : \mathcal{H}_0 \to \mathcal{H}_q \ \text{ is an isometry onto } \mathcal{H}_q. \tag{5.1}$$

Recalling the explicit isomorphism between $\mathcal{H}_0$ and the space $F^2$ (see the previous section), we see that the change $u = (2B_0)^{-q/2} (q!)^{-1/2} (a_0^*)^q \bigl(e^{-B_0 |z|^2/4} f(z)\bigr)$ gives a unitary equivalence between the operator $P_q v P_q$ and the operator in $F^2$ defined by the quadratic form

$$(2B_0)^{-q} (q!)^{-1} \int_{\mathbb{C}} \bigl|(a_0^*)^q \bigl(e^{-B_0 |z|^2/4} f(z)\bigr)\bigr|^2 v(z)\, dm(z).$$

We will consider the case $B_0 = 2$; the general case can be reduced to this one by a linear change of variables. With this simplification, the above quadratic form becomes

$$(q!)^{-1} \int_{\mathbb{C}} |(\partial - \bar z)^q f(z)|^2 v(z) e^{-|z|^2}\, dm(z), \quad f \in F^2. \tag{5.2}$$

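Let us prove the asymptotics (3.3) for the eigenvalues $\{s_n^{(q)}\}_{n=1}^\infty$ corresponding to the form (5.2).

Upper bound for $s_n^{(q)}$. Let $\Omega_\delta = \{z \mid \operatorname{dist}(z, \Omega) \le \delta\}$. By the Cauchy integral formula, we have

$$\sup_{\Omega} |f^{(j)}| \le \frac{j!}{\delta^j} \sup_{\Omega_\delta} |f|, \quad f \in F^2, \quad j \in \mathbb{N}.$$

Thus, we have the following bound for the form (5.2):

$$(q!)^{-1} \int_{\mathbb{C}} |(\partial - \bar z)^q f(z)|^2 v(z) e^{-|z|^2}\, dm(z) \le C \|f\|^2_{C(\Omega_\delta)},$$

where $C$ depends on $\Omega$, $\delta$, $v$, $q$. Let us define

$$L_n^+ = \{f \in F^2 \mid f(z) = t_n(z) g(z),\ g \text{ entire}\},$$

where $t_n$ is the $n$th Chebyshev polynomial for the set $\Omega_\delta$. Note that the proof of (4.6) uses only the fact that all zeros of $p_n$ lie in the closed convex hull of $\operatorname{supp} v$. Therefore, the same estimate remains valid with the change $p_n \to t_n$. Thus, we get

$$\begin{aligned} (q!)^{-1} \int_{\mathbb{C}} |(\partial - \bar z)^q f(z)|^2 v(z) e^{-|z|^2}\, dm(z) &\le C \|f\|^2_{C(\Omega_\delta)} \le C \|t_n\|^2_{C(\Omega_\delta)} \|g\|^2_{C(\Omega_\delta)} \\ &\le C (1 - \varepsilon)^{-2n}\, \frac{1}{n!}\, \|t_n\|^2_{C(\Omega_\delta)} \|f\|^2_{F^2}, \quad f \in L_n^+, \end{aligned}$$

for all sufficiently large $n$. This yields

$$\limsup_{n \to \infty} \bigl(n!\, s_n^{(q)}\bigr)^{1/n} \le (1 - \varepsilon)^{-2} \lim_{n \to \infty} \|t_n\|^{2/n}_{C(\Omega_\delta)} = (1 - \varepsilon)^{-2} (\operatorname{Cap} \Omega_\delta)^2.$$

It remains to note that $\varepsilon$ and $\delta$ can be chosen arbitrarily small and that for any compact set $\Omega$ one has $\lim_{\delta \to +0} \operatorname{Cap} \Omega_\delta = \operatorname{Cap} \Omega$ (see e.g. [7]).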

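Lower bound for $s_n^{(q)}$. 1. Due to the compactness of the embedding of the Sobolev space $W_2^1(\Omega) \subset L^2(\Omega)$, for any $\gamma > 0$ there exists a subspace in $W_2^1(\Omega)$ of a finite codimension such that for all elements $u$ in this subspace we have $\|u\|_{L^2(\Omega)} \le \gamma \|\nabla u\|_{L^2(\Omega)}$. It follows that for any $\gamma > 0$ there exists a subspace in $F^2$ of a finite codimension such that for any element $f$ of this subspace we have $\|f\|_{L^2(\Omega)} \le \gamma \|f'\|_{L^2(\Omega)}$. Arguing by induction, we see that for any $\gamma > 0$ there exists a subspace $N = N(\gamma, q) \subset F^2$ of a finite codimension $l$ such that

$$\|\partial^{k-1} f\|_{L^2(\Omega)} \le \gamma \|\partial^k f\|_{L^2(\Omega)}, \quad \forall f \in N(\gamma, q), \quad \forall k = 1, 2, \dots, q. \tag{5.3}$$

2. We need to estimate the form (5.2) from below. Using the assumption $v(z) \ge c > 0$, $z \in \Omega$, on the first step and the triangle inequality on the second step, we obtain

$$\begin{aligned} \int_{\mathbb{C}} |(\partial - \bar z)^q f(z)|^2 v(z) e^{-|z|^2}\, dm(z) &\ge C \|(\partial - \bar z)^q f\|^2_{L^2(\Omega)} \\ &\ge C \Bigl( \|\partial^q f\|_{L^2(\Omega)} - \sum_{k=1}^{q} \Bigl\| \binom{q}{k} (-\bar z)^k \partial^{q-k} f \Bigr\|_{L^2(\Omega)} \Bigr)^2 \\ &\ge C \Bigl( \|\partial^q f\|_{L^2(\Omega)} - \sum_{k=1}^{q} \binom{q}{k} R_0^k \|\partial^{q-k} f\|_{L^2(\Omega)} \Bigr)^2, \end{aligned} \tag{5.4}$$

where $R_0 = \max_{z \in \Omega} |z|$. From this estimate it is clear that by choosing $\gamma$ sufficiently small we can ensure that for all $f$ in the corresponding subspace $N(\gamma, q)$ (see (5.3)) we have

$$\int_{\mathbb{C}} |(\partial - \bar z)^q f(z)|^2 v(z) e^{-|z|^2}\, dm(z) \ge \frac{C}{2} \|\partial^q f\|^2_{L^2(\Omega)} \ge \frac{C}{2} \gamma^{-2q} \|f\|^2_{L^2(\Omega)}.$$

3. Now, as in the proof of Lemma 1, let $L_n^-$ be the set of all polynomials in $z$ of degree $\le n$. Consider the subspace $\tilde L_n^- = L_n^- \cap N(\gamma, q)$; clearly, $\dim \tilde L_n^- \ge n + 1 - l$, where $l = \operatorname{codim} N(\gamma, q)$. Next, again as in the proof of Lemma 1 (see (4.10) and the following argument), we obtain for all sufficiently large $n$:

$$s^{(q)}_{n+1-l} \ge \inf_{f \in \tilde L_n^- \setminus \{0\}} \frac{C \int_{\mathbb{C}} |(\partial - \bar z)^q f(z)|^2 e^{-|z|^2} v(z)\, dm(z)}{\|f\|^2_{F^2}} \ge \inf_{f \in \tilde L_n^- \setminus \{0\}} \frac{C \|f\|^2_{L^2(\Omega)}}{\|f\|^2_{F^2}} \ge \frac{C\, M_n(\chi_\Omega)}{(1 + \varepsilon)^{2n}\, n!},$$

where $\chi_\Omega$ denotes the characteristic function of the set $\Omega$ in $\mathbb{C}$ and the constant $C$ may change from one expression to the next. As stated in Remark 1, $\lim_{n \to \infty} M_n(\chi_\Omega)^{1/n} = (\operatorname{Cap} \Omega)^2$. Thus,

$$\liminf_{n \to \infty} \bigl(n!\, s_n^{(q)}\bigr)^{1/n} \ge \frac{(\operatorname{Cap} \Omega)^2}{(1 + \varepsilon)^2}$$

for any $\varepsilon > 0$.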

Acknowledgement. We are indebted to M. Sh. Birman, Yu. Netrusov, G. Raikov, G. Rozenblum, A. Sobolev and H. Stahl for useful discussions. The work was supported by the Royal Society grant 2004/R1-FS. The authors are grateful to the Mathematisches Forschungsinstitut Oberwolfach for hospitality and financial support. The first named author is also grateful to Loughborough University for hospitality.

References

1. Aharonov, Y., Casher, A.: Ground state of a spin-1/2 charged particle in a two-dimensional magnetic field. Phys. Rev. A (3) 19(6), 2461–2462 (1979)
2. Avron, J., Herbst, I., Simon, B.: Schrödinger operators with magnetic fields. I. General interactions. Duke Math. J. 45(4), 847–883 (1978)
3. Besch, A.: Eigenvalues in spectral gaps of the two-dimensional Pauli operator. J. Math. Phys. 41, 7918–7931 (2002)
4. Birman, M.Sh., Solomyak, M.Z.: Spectral theory of self-adjoint operators in Hilbert space. Dordrecht: D. Reidel P.C., 1987
5. Dubrovin, B.A., Novikov, S.P.: Fundamental states in a periodic field. Magnetic Bloch functions and vector bundles. (Russian) Dokl. Akad. Nauk SSSR 253(6), 1293–1297 (1980)
6. Erdős, L., Vougalter, V.: Pauli operator and Aharonov-Casher theorem for measure valued magnetic fields. Commun. Math. Phys. 225(2), 399–421 (2002)
7. Hille, E.: Analytic function theory. Vol. II, Boston, Mass: Ginn and Co., 1962
8. Hornberger, K., Smilansky, U.: Magnetic edge states. Physics Reports 367, 249–385 (2002)
9. Iwatsuka, A.: The essential spectrum of two-dimensional Schrödinger operators with perturbed constant magnetic fields. J. Math. Kyoto Univ. 23(3), 475–480 (1983)
10. Ivrii, V.: Microlocal analysis and precise spectral asymptotics. Berlin: Springer, 1998
11. Landkof, N.S.: Foundations of modern potential theory. New York: Springer, 1972
12. Melgaard, M., Rozenblum, G.: Eigenvalue asymptotics for weakly perturbed Dirac and Schrödinger operators with constant magnetic fields of full rank. Comm. Partial Differ. Eqs. 28(3–4), 697–736 (2003)
13. Parfënov, O.G.: The widths of some classes of entire functions. Mat. Sb. 190(4), 87–94 (1999); translation in Sb. Math. 190(3–4), 561–568 (1999)
14. Raikov, G.D.: Eigenvalue asymptotics for the Schrödinger operator with homogeneous magnetic potential and decreasing electric potential. I. Behaviour near the essential spectrum tips. Comm. Partial Differ. Eqs. 15(3), 407–434 (1990); Errata: Comm. Partial Differ. Eqs. 18(11), 1977–1979 (1993)
15. Raikov, G.D.: Border-line eigenvalue asymptotics for the Schrödinger operator with electromagnetic potential. Integral Equations Operator Theory 14(6), 875–888 (1991)
16. Raikov, G.D.: Spectral asymptotics for the perturbed 2D Pauli operator with oscillating magnetic fields. I. Non-zero mean value of the magnetic field. Markov Processes Relat. Fields 9, 775–794 (2003)
17. Raikov, G.D., Warzel, S.: Quasi-classical versus non-classical spectral asymptotics for magnetic Schrödinger operators with decreasing electric potentials. Rev. Math. Phys. 14(10), 1051–1072 (2002)
18. Rozenblum, G., Shirokov, N.: Infiniteness of zero modes for the Pauli operator with singular magnetic field. http://lanl.arxiv.org/abs/math-ph/0501059, 2005. To appear in J. Functional Analysis
19. Sobolev, A.V.: Asymptotic behavior of energy levels of a quantum particle in a homogeneous magnetic field perturbed by an attenuating electric field. I, (Russian). Probl. Mat. Anal. 9, Leningrad: Leningrad. Univ., 1984, pp. 67–84. English translation in: J. Sov. Math. 35, 2201–2212 (1986)
20. Sobolev, A.V.: On the Lieb-Thirring estimates for the Pauli operator. Duke Math. J. 82(3), 607–635 (1996)
21. Solnyshkin, S.N.: Asymptotic behavior of the energy of bound states of the Schrödinger operator in the presence of electric and homogeneous magnetic fields (Russian). Probl. Mat. Fiz., 10, Leningrad: Leningrad. Univ., 1982, pp. 266–278
22. Stahl, H., Totik, V.: General orthogonal polynomials. Cambridge: Cambridge Univ. Press, 1992
23. Tamura, H.: Asymptotic distribution of eigenvalues for Schrödinger operators with homogeneous magnetic fields. Osaka J. Math. 25(3), 633–647 (1988)

Communicated by B. Simon


Integration with Respect to the Haar Measure on Unitary, Orthogonal and Symplectic Group

Benoît Collins1, Piotr Śniady2

1 Department of Mathematics, Graduate School of Science, Kyoto University, Kyoto 606-8502, Japan. E-mail: [email protected]
2 Institute of Mathematics, University of Wroclaw, pl. Grunwaldzki 2/4, 50-384 Wroclaw, Poland. E-mail: [email protected]

Received: 24 May 2005 / Accepted: 31 October 2005
Published online: 22 March 2006 – © Springer-Verlag 2006

Abstract: We revisit the work of the first named author and using simpler algebraic arguments we calculate integrals of polynomial functions with respect to the Haar measure on the unitary group U(d). The previous result provided exact formulas only for 2d bigger than the degree of the integrated polynomial and we show that these formulas remain valid for all values of d. Also, we consider the integrals of polynomial functions on the orthogonal group O(d) and the symplectic group Sp(d). We obtain an exact character expansion and the asymptotic behavior for large d. Thus we can show the asymptotic freeness of Haar-distributed orthogonal and symplectic random matrices, as well as the convergence of integrals of the Itzykson–Zuber type.

B.C. is supported by a JSPS postdoctoral fellowship. P.Ś. was supported by State Committee for Scientific Research (KBN) grant 2 P03A 007 23.

1. Introduction

Let $G \subset \operatorname{End}(\mathbb{C}^d)$ be a compact Lie group viewed as a group of matrices. The matrix structure provides a very natural coordinate system on $G$; in particular we are interested in the family of functions $e_{ij} : G \to \mathbb{C}$ defined by $e_{ij} : M_d(\mathbb{C}) \ni m \mapsto m_{ij}$, which assign to a matrix one of its entries. We call polynomials in $(e_{ij})$ polynomial functions on $G$. In this article we are interested in the integrals of the polynomial functions on compact Lie groups with respect to the Haar measure on $G$, i.e. the integrals of the form

$$\int_G U_{i_1 j_1} \cdots U_{i_n j_n}\, \overline{U_{i'_1 j'_1}} \cdots \overline{U_{i'_{n'} j'_{n'}}}\, dU. \tag{1}$$

For simplicity, such integrals will be called moments of the group $G$. If we consider a matrix-valued random variable $U$ the distribution of which is the Haar measure on $G$, then the integrals of the form (1) have a natural interpretation as certain moments of the entries of $U$ and they appear very naturally in random matrix theory. The reason for this is that quite many random matrix ensembles $X$ are invariant with respect to the conjugation by the elements of the group $G$ and therefore can be written as $X = U X' U^{-1}$, where $U$ and $X'$ are independent matrix-valued random variables and the distribution of $U$ is the Haar measure on $G$. As a result, expressions similar to

$$\mathbb{E} \operatorname{Tr}(X_1 U^{s_1} X_2 U^{s_2} \cdots X_n U^{s_n}) \tag{2}$$

are quite common in random matrix theory, where $s_1, \dots, s_n \in \{1, *\}$ and $X_1, \dots, X_n$ are some matrix-valued random variables independent from $U$. It is easy to see that the calculation of (2) can be reduced to the calculation of (1). In random matrix theory we are quite often interested not in the exact value of an expression of type (2) but in its asymptotic behavior as $d$ tends to infinity. The results of this type were obtained for the first time by Weingarten [Wei78].

In this article we are interested in the case when $G \subset M_d(\mathbb{C})$ belongs to one of the series of the classical Lie groups, i.e. $G$ is either the unitary group U(d) or the orthogonal group O(d) or the symplectic group Sp(d/2), where in the latter case we assume that $d$ is even. Since the Haar measure of classical groups and their moments have many physical interpretations, the problem we are considering in this paper has had a long history in theoretical physics. In particular, many algorithms have been found for computing specific moments with high symmetry properties, but as far as the authors know, no general theory has been developed so far at a mathematical level of rigor. For interesting computations handling particular cases of this paper and for a good bibliography, we refer to [Gor02, BB96].

Firstly, we revisit a part of the work of the first named author [Col03] and compute with a new convolution formula the moments of the unitary group. This formula gives a new combinatorial insight into the relation between free probability and asymptotics of moments of the unitary group. Then, we make use of other features of the invariant theory to give an explicit integration formula on the orthogonal and symplectic groups and to compute the asymptotics in the latter case. This allows us to prove a new convergence result for a large family of matrix integrals. Our main tool is the Schur–Weyl duality for the unitary group and its analogues for the orthogonal and symplectic groups.

2. Integration Over Unitary Groups

2.1. Schur–Weyl duality for unitary groups. We recall some notation and standard facts. A non-increasing sequence of nonnegative integers $\lambda = (\lambda_1, \dots)$ is said to be a partition of the integer $n$ (abbreviated by $\lambda \vdash n$) if $\sum_i \lambda_i = n$. We denote by $l(\lambda)$ its length, i.e. the largest index $i$ for which $\lambda_i$ is non-zero.

There is a canonical way to parameterize all irreducible polynomial representations $\rho^\lambda_{U(d)} : U(d) \to \operatorname{End} V^\lambda_{U(d)}$ of the compact unitary group U(d) by partitions $\lambda$ such that $l(\lambda) \le d$. The character of this representation evaluated on the torus is the Schur polynomial $s_{\lambda,d}$ (see [Ful97]). By $s_{\lambda,d}(x)$ we shall understand $s_{\lambda,d}(x, \dots, x)$ with $d$ copies of $x$. In particular, $s_{\lambda,d}(1)$ is the dimension of the representation $V^\lambda_{U(d)}$ of U(d).

The group algebra $\mathbb{C}[S_n]$ of the symmetric group $S_n$ is semi-simple. It is endowed with its canonical basis $\{\delta_\sigma\}_{\sigma \in S_n}$. The irreducible representations $\rho^\lambda_{S_n} : S_n \to \operatorname{End} V^\lambda_{S_n}$ are canonically labeled by $\lambda \vdash n$ via the Schur functor (see [Ful97] as well); we denote the corresponding characters by $\chi^\lambda$.


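The following isomorphism holds:

$$\mathbb{C}[S_n] \cong \bigoplus_{\lambda \vdash n} \operatorname{End} V^\lambda_{S_n}. \tag{3}$$

For any $\lambda \vdash n$, let $p_\lambda = \frac{\chi^\lambda(e)}{n!} \chi^\lambda \in \mathbb{C}[S_n]$ be the minimal central projector onto $\operatorname{End} V^\lambda_{S_n}$. We define for future use the algebra

$$\mathbb{C}_d[S_n] = \Bigl( \sum_{\lambda \vdash n,\ l(\lambda) \le d} p_\lambda \Bigr) \mathbb{C}[S_n] = \bigoplus_{\lambda \vdash n,\ l(\lambda) \le d} \operatorname{End} V^\lambda_{S_n}. \tag{4}$$

Consider the representation $\rho^d_{S_n}$ of $S_n$ on $(\mathbb{C}^d)^{\otimes n}$, where

$$\rho^d_{S_n}(\pi) : v_1 \otimes \cdots \otimes v_n \mapsto v_{\pi^{-1}(1)} \otimes \cdots \otimes v_{\pi^{-1}(n)}$$

is given by the natural permutation of the elementary tensors. We consider also the representation $\rho^n_{U(d)}$ of U(d) on $(\mathbb{C}^d)^{\otimes n}$, where

$$\rho^n_{U(d)}(U) : v_1 \otimes \cdots \otimes v_n \mapsto U(v_1) \otimes \cdots \otimes U(v_n)$$

is the diagonal action. Since the representations $\rho^d_{S_n}$ and $\rho^n_{U(d)}$ commute, we obtain a representation $\rho_{S_n \times U(d)}$ of $S_n \times U(d)$ on $(\mathbb{C}^d)^{\otimes n}$.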

Theorem 2.1 (Schur–Weyl duality for unitary groups [Wey39]). The action of $S_n \times U(d)$ is multiplicity free, i.e. no irreducible representation of $S_n \times U(d)$ occurs more than once in $\rho_{S_n \times U(d)}$. The decomposition of $\rho_{S_n \times U(d)}$ into irreducible components is given by

$$(\mathbb{C}^d)^{\otimes n} \cong \bigoplus_{\lambda \vdash n,\ l(\lambda) \le d} V^\lambda_{S_n} \otimes V^\lambda_{U(d)}, \tag{5}$$

where $S_n \times U(d)$ acts by $\rho^\lambda_{S_n} \otimes \rho^\lambda_{U(d)}$ on the summand corresponding to $\lambda$.

We shall consider the inclusion of algebras

$$\rho^d_{S_n}(\mathbb{C}_d[S_n]) \subseteq \operatorname{End}(\mathbb{C}^d)^{\otimes n}.$$

Equations (4) and (5) show that $\rho^d_{S_n}$ is injective when restricted to $\mathbb{C}_d[S_n]$ and for this reason we shall omit $\rho^d_{S_n}$ whenever convenient and consider $\mathbb{C}_d[S_n]$ as sitting inside $\operatorname{End}(\mathbb{C}^d)^{\otimes n}$. Conversely, we can identify every element of the image $\rho^d_{S_n}(\mathbb{C}_d[S_n]) \subseteq \operatorname{End}(\mathbb{C}^d)^{\otimes n}$ with the unique corresponding element of the group algebra $\mathbb{C}_d[S_n]$.
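The decomposition (5) can be checked at the level of dimensions: the dimensions of the summands $V^\lambda_{S_n} \otimes V^\lambda_{U(d)}$ must add up to $d^n$. The following sketch performs this check. It relies on two standard formulas that are not stated in the text and are used here as assumptions: the hook length formula for $\chi^\lambda(e) = \dim V^\lambda_{S_n}$ and the hook-content formula for $s_{\lambda,d}(1) = \dim V^\lambda_{U(d)}$. All function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def partitions(n, largest=None):
    """Yield all partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def hooks_and_contents(lam):
    """Yield (hook length, content) for every box (i, j) of the diagram of lam."""
    # conj[j] = number of rows of lam that reach column j
    conj = [sum(1 for part in lam if part > j) for j in range(lam[0])]
    for i, row in enumerate(lam):
        for j in range(row):
            arm = row - j - 1
            leg = conj[j] - i - 1
            yield arm + leg + 1, j - i


def dim_sn(lam):
    """chi^lambda(e), via the hook length formula (assumed standard fact)."""
    product = 1
    for hook, _ in hooks_and_contents(lam):
        product *= hook
    return factorial(sum(lam)) // product


def dim_ud(lam, d):
    """s_{lambda,d}(1), via the hook-content formula (assumed standard fact)."""
    value = Fraction(1)
    for hook, content in hooks_and_contents(lam):
        value *= Fraction(d + content, hook)
    return value


def schur_weyl_dimension(n, d):
    """Sum of dim V^lambda_{S_n} * dim V^lambda_{U(d)} over l(lambda) <= d."""
    return sum(dim_sn(lam) * dim_ud(lam, d)
               for lam in partitions(n) if len(lam) <= d)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, [schur_weyl_dimension(n, d) == d ** n for d in range(1, 5)])
```

Note that the hook-content expression vanishes automatically when $l(\lambda) > d$, which is consistent with the restriction $l(\lambda) \le d$ in (4) and (5).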



2.2. Conditional expectation. For $A \in \operatorname{End}(\mathbb{C}^d)^{\otimes n}$ we define

$$E(A) = \int_{U(d)} U^{\otimes n} A\, (U^{-1})^{\otimes n}\, dU, \tag{6}$$

where the integration is taken with respect to the Haar measure on the compact group U(d). We recall that for an algebra inclusion $M \subset N$, a conditional expectation is an $M$-bimodule map $E : N \to M$ such that $E(1_N) = 1_M$.

Proposition 2.2. $E$ defined in (6) is a conditional expectation of $\operatorname{End}(\mathbb{C}^d)^{\otimes n}$ onto $\mathbb{C}_d[S_n]$. We regard $\operatorname{End}(\mathbb{C}^d)^{\otimes n}$ as a Euclidean space with a scalar product $\langle A, B \rangle = \operatorname{Tr} A^* B$. Then $E$ is an orthogonal projection onto $\rho^d_{S_n}(\mathbb{C}_d[S_n])$. Moreover, it is compatible with the trace in the sense that $\operatorname{Tr} \circ E = \operatorname{Tr}$.

Proof. Since the Haar measure is a probability measure invariant with respect to the left and right multiplication, $E(A)$ commutes with the action of the unitary group U(d) for every $A \in \operatorname{End}(\mathbb{C}^d)^{\otimes n}$. Theorem 2.1 shows that $E(A) \in \mathbb{C}_d[S_n]$ and that the range of $E$ is exactly $\mathbb{C}_d[S_n]$. Since $\langle E(A), E(B) \rangle = \langle E(A), B \rangle$, it follows that $E$ is an orthogonal projection. The other statements of the proposition can be easily checked directly.

For $A \in \operatorname{End}(\mathbb{C}^d)^{\otimes n}$ we set

$$\Phi(A) = \sum_{\sigma \in S_n} \operatorname{Tr}\bigl(A\, \rho^d_{S_n}(\sigma^{-1})\bigr)\, \delta_\sigma \in \mathbb{C}[S_n]. \tag{7}$$
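Proposition 2.3. $\Phi$ fulfills the following properties:

1. $\Phi$ is a $\mathbb{C}[S_n]$–$\mathbb{C}[S_n]$ bimodule morphism in the sense that

$$\Phi\bigl(A\, \rho^d_{S_n}(\sigma)\bigr) = \Phi(A)\, \sigma, \qquad \Phi\bigl(\rho^d_{S_n}(\sigma)\, A\bigr) = \sigma\, \Phi(A);$$

2. $\Phi(\mathrm{Id})$ coincides with the character of $\rho^d_{S_n}$, hence it is equal to

$$\Phi(\mathrm{Id}) = n! \sum_{\lambda \vdash n} \frac{s_{\lambda,d}(1)}{\chi^\lambda(e)}\, p_\lambda \tag{8}$$

and is an invertible element of $\mathbb{C}_d[S_n]$; its inverse will be called the Weingarten function and is equal to

$$\mathrm{Wg} = \frac{1}{(n!)^2} \sum_{\lambda \vdash n,\ l(\lambda) \le d} \frac{\chi^\lambda(e)^2}{s_{\lambda,d}(1)}\, \chi^\lambda; \tag{9}$$

3. the relation between $\Phi(A)$ and $E(A)$ is explicitly given by

$$\Phi(A) = E(A)\, \Phi(\mathrm{Id});$$

4. the range of $\Phi$ is equal to $\mathbb{C}_d[S_n]$;

5. in $\mathbb{C}_d[S_n]$, the following holds true:

$$\Phi(A\, E(B)) = \Phi(A)\, \Phi(B)\, \Phi(\mathrm{Id})^{-1}. \tag{10}$$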

Proof. Points 1 and 2 are immediate. Point 1 implies $\Phi(A) = \Phi(E(A)) = \Phi(\mathrm{Id}\, E(A)) = \Phi(\mathrm{Id})\, E(A)$, which proves Point 3. Point 4 follows from Point 3 and Point 2. Point 5 follows from Points 1 and 3.

Corollary 2.4. Let $n$ be a positive integer and $i = (i_1, \dots, i_n)$, $i' = (i'_1, \dots, i'_n)$, $j = (j_1, \dots, j_n)$, $j' = (j'_1, \dots, j'_n)$ be $n$-tuples of positive integers. Then

$$\int_{U(d)} U_{i_1 j_1} \cdots U_{i_n j_n}\, \overline{U_{i'_1 j'_1}} \cdots \overline{U_{i'_n j'_n}}\, dU = \sum_{\sigma, \tau \in S_n} \delta_{i_1 i'_{\sigma(1)}} \cdots \delta_{i_n i'_{\sigma(n)}}\, \delta_{j_1 j'_{\tau(1)}} \cdots \delta_{j_n j'_{\tau(n)}}\, \mathrm{Wg}(\tau \sigma^{-1}). \tag{11}$$

If $n \ne n'$ then

$$\int_{U(d)} U_{i_1 j_1} \cdots U_{i_n j_n}\, \overline{U_{i'_1 j'_1}} \cdots \overline{U_{i'_{n'} j'_{n'}}}\, dU = 0. \tag{12}$$

Proof. In order to show (11) it is enough to take appropriate $A$ and $B$ in $M_d(\mathbb{C})^{\otimes n}$ and take the value of both sides of (10) in $e \in S_n$. For every $u \in \mathbb{C}$ such that $|u| = 1$ the map $U(d) \ni U \mapsto uU \in U(d)$ is measure preserving, therefore

$$\int_{U(d)} U_{i_1 j_1} \cdots U_{i_n j_n}\, \overline{U_{i'_1 j'_1}} \cdots \overline{U_{i'_{n'} j'_{n'}}}\, dU = \int_{U(d)} uU_{i_1 j_1} \cdots uU_{i_n j_n}\, \overline{uU_{i'_1 j'_1}} \cdots \overline{uU_{i'_{n'} j'_{n'}}}\, dU$$

and (12) follows.

The above result was obtained by the first named author [Col03] under the assumption $d \ge n$. As we shall see, this assumption is not necessary. For $d \ge n$ every partition of $n$ satisfies $l(\lambda) \le n \le d$, so the formula (9) takes the simpler form

$$\mathrm{Wg} = \frac{1}{(n!)^2} \sum_{\lambda \vdash n} \frac{\chi^\lambda(e)^2}{s_{\lambda,d}(1)}\, \chi^\lambda, \tag{13}$$

with no restrictions on the length of $\lambda$. The right-hand side is a rational function of $d$ and hence we may consider it for any $d \in \mathbb{C}$. However, the polynomial $d \mapsto s_{\lambda,d}(1)$ vanishes at certain integer points in the range $-n+1, \dots, n-1$ (by the hook-content formula, its zeros are the negatives of the contents of the boxes of $\lambda$), and hence the right-hand side of (13) has poles among the points $-n+1, \dots, n-1$ and therefore is not well-defined on the whole of $\mathbb{C}$. Nevertheless, even for the case $d < n$, let us plug this incorrect value (13) into (11). In this way the right-hand side of (11) becomes a rational function in $d$.
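The difference between (9) and (13) can be made concrete for $n = 2$. The sketch below is an illustration under stated assumptions rather than part of the original argument: it hard-codes the two characters of $S_2$ and the dimensions $s_{(2),d}(1) = d(d+1)/2$ and $s_{(1,1),d}(1) = d(d-1)/2$, evaluates the Weingarten function in exact rational arithmetic, and applies (11) with all indices equal to 1. The function names are illustrative.

```python
from fractions import Fraction

# Characters of S_2, listed as (value at identity, value at the transposition).
CHARACTERS = {(2,): (1, 1), (1, 1): (1, -1)}


def schur_dim(lam, d):
    """s_{lambda,d}(1) for the two partitions of 2 (hard-coded for this sketch)."""
    if lam == (2,):
        return Fraction(d * (d + 1), 2)
    return Fraction(d * (d - 1), 2)


def weingarten(d, restrict_length):
    """Return (Wg(e), Wg((12))) for n = 2.

    restrict_length=True sums over l(lambda) <= d as in (9);
    restrict_length=False sums over all partitions as in (13),
    which divides by zero when s_{lambda,d}(1) = 0.
    """
    values = [Fraction(0), Fraction(0)]
    for lam, chi in CHARACTERS.items():
        if restrict_length and len(lam) > d:
            continue
        weight = Fraction(chi[0] ** 2, 4) / schur_dim(lam, d)  # (n!)^2 = 4
        for k in range(2):
            values[k] += weight * chi[k]
    return tuple(values)


def fourth_moment(d):
    """Integral of |U_11|^4 over U(d) from (11): all Kronecker deltas equal 1,
    and tau sigma^{-1} is the identity for two of the four pairs (sigma, tau)."""
    wg_e, wg_t = weingarten(d, restrict_length=True)
    return 2 * wg_e + 2 * wg_t


if __name__ == "__main__":
    for d in range(1, 5):
        print(d, weingarten(d, True), fourth_moment(d))
```

For $d \ge 2$ both variants agree, while for $d = 1$ the unrestricted sum (13) hits the zero of $s_{(1,1),d}(1)$ and the restricted sum (9) still yields the moment 1, as it must for U(1).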



We claim that for every $d \in \mathbb{N}$ for which the left-hand side of (11) makes sense (i.e. if $i_1, \dots, i_n, i'_1, \dots, i'_n, j_1, \dots, j_n, j'_1, \dots, j'_n \in \{1, \dots, d\}$), the right-hand side also makes sense (possibly after some cancelations of poles) and is equal to the left-hand side of (11). Indeed, let us view the product $\Phi(A)\Phi(B)\,\mathrm{Wg}$ as an element of $\mathbb{C}[S_n]$ with rational coefficients in $d$. For the choice of $A, B \in M_d(\mathbb{C})^{\otimes n}$ used in the proof of Corollary 2.4 we must have $\Phi(A), \Phi(B) \in \mathbb{C}_d[S_n]$, therefore the product $\Phi(A)\Phi(B)\,\mathrm{Wg}$ is an element of $\mathbb{C}_d[S_n]$ with rational coefficients in $d$. Since (9) and (13), regarded as elements of $\mathbb{C}[S_n]$ with rational coefficients in $d$, coincide on $\mathbb{C}_d[S_n]$, our claim holds true. We summarize the above discussion in the following proposition.

Proposition 2.5. For fixed values of the indices $i, j, i', j'$ the integral

$$\int_{U(d)} U_{i_1 j_1} \cdots U_{i_n j_n}\, \overline{U_{i'_1 j'_1}} \cdots \overline{U_{i'_n j'_n}}\, dU$$

is a rational function of $d$. Furthermore, Eq. (11) remains true (possibly after some cancelations of poles) if we replace the correct value (9) of the Weingarten function by (13).

Example. Corollary 2.4 implies that for $d \ge 2$,

$$\int_{U(d)} |U_{11}|^4\, dU = \int_{U(d)} U_{11} U_{11} \overline{U_{11}}\, \overline{U_{11}}\, dU = 2\, \mathrm{Wg}\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix} + 2\, \mathrm{Wg}\begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} = 2\, \frac{1}{d^2 - 1} + 2\, \frac{-1}{d(d^2 - 1)},$$

where the values of the Weingarten function were computed by (13) and where $\begin{pmatrix} 1 & \cdots & n \\ \sigma(1) & \cdots & \sigma(n) \end{pmatrix}$ denotes the permutation $\sigma$. The right-hand side appears to make no sense for $d = 1$; nevertheless after algebraic simplifications we obtain

$$\int_{U(d)} |U_{11}|^4\, dU = \frac{2}{d(d + 1)},$$

which is a correct value for all $d \ge 1$.

2.3. Asymptotics of the Weingarten function. In this section we compute the first order asymptotics of the Weingarten function for large values of $d$. Consider the algebra $\mathbb{C}[S_n][[d^{-1}]]$ of functions on $S_n$ valued in formal power series in $d^{-1}$ and the vector space

$$\mathcal{A} = \operatorname{Vect}\bigl\{ \alpha \delta_\sigma : \alpha = O(d^{-|\sigma|}) \text{ and } \alpha d^{|\sigma|} \text{ is a power series in } d^{-2} \bigr\},$$

where $|\sigma|$ denotes the minimal number of factors necessary to write $\sigma$ as a product of transpositions. By the triangle inequality $|\sigma_1| + |\sigma_2| \ge |\sigma_1 \sigma_2|$ and the parity property $(-1)^{|\sigma_1|} (-1)^{|\sigma_2|} = (-1)^{|\sigma_1 \sigma_2|}$, $\mathcal{A}$ turns out to be a unital subalgebra of $\mathbb{C}[S_n][[d^{-1}]]$. It is easy to check that $d^{-n} \Phi(\mathrm{Id}) \in \mathcal{A}$. Since $d^{-n} \Phi(\mathrm{Id}) = \delta_e + O(d^{-1})$, its inverse $d^n\, \mathrm{Wg} = \sum_{i \ge 0} \bigl(1 - d^{-n} \Phi(\mathrm{Id})\bigr)^i$ makes sense as a formal power series in $d^{-1}$. The following proposition follows immediately.



Proposition 2.6. $d^n\, \mathrm{Wg} \in \mathcal{A}$. Equivalently, for any $\sigma \in S_n$, $\mathrm{Wg}(\sigma) = O(d^{-n-|\sigma|})$.

In order to find a more precise asymptotic expansion we consider the two-sided ideal $\mathcal{I}$ in $\mathcal{A}$ generated by $d^{-2} \delta_e$. It is easy to check that the quotient algebra $\mathcal{A}/\mathcal{I}$ regarded as a vector space is spanned by the vectors $d^{-|\sigma|} \delta_\sigma$. The products of these elements are given by

$$(d^{-|\sigma|} \delta_\sigma)(d^{-|\rho|} \delta_\rho) \cong \begin{cases} d^{-|\sigma\rho|} \delta_{\sigma\rho} & \text{if } |\sigma\rho| = |\sigma| + |\rho|, \\ 0 & \text{if } |\sigma\rho| < |\sigma| + |\rho|. \end{cases}$$

Biane [Bia97] considered an algebra which as a vector space is equal to $\mathbb{C}[S_n]$ with the multiplication

$$\delta_\sigma \delta_\rho = \begin{cases} \delta_{\sigma\rho} & \text{if } |\sigma\rho| = |\sigma| + |\rho|, \\ 0 & \text{if } |\sigma\rho| < |\sigma| + |\rho|. \end{cases}$$

One can easily see now that $d^{-|\rho|} \delta_\rho \mapsto \delta_\rho$ provides an isomorphism of $\mathcal{A}/\mathcal{I}$ and the Biane algebra. Under this isomorphism $d^{-n} \Phi(\mathrm{Id})$ is mapped into $\zeta = \sum_{\sigma \in S_n} \delta_\sigma$. The inverse of $\zeta$ in the Biane algebra is called the Möbius function and is given explicitly by

$$\operatorname{Moeb}(\sigma) = \prod_{1 \le i \le k} c_{|C_i| - 1} (-1)^{|C_i| - 1},$$

where $\sigma$ is a permutation with a cycle decomposition $\sigma = C_1 \cdots C_k$, $|C_i|$ is the number of elements in the cycle $C_i$ and

$$c_n = \frac{(2n)!}{n!\,(n + 1)!} \tag{14}$$

is the Catalan number.

Corollary 2.7. $d^{n+|\sigma|}\, \mathrm{Wg}(\sigma) = \operatorname{Moeb}(\sigma) + O(d^{-2})$.

3. Integration Over Orthogonal Groups

3.1. Schur–Weyl duality for orthogonal groups.

3.1.1. Brauer algebras. We consider the group of orthogonal matrices $O(d) = \{M \in GL(d),\ M^{-1} = M^t = M^*\}$. Its invariant theory was first studied by R. Brauer [Bra37], who introduced a family of algebras nowadays called Brauer algebras. These algebras have been at the center of many investigations (see [BW89, Gro99] and the references therein). Some actions of these algebras lead to an analogue of the Schur–Weyl duality in the case of the orthogonal and symplectic groups and for this reason they are very useful for our purposes.

Consider $2n$ vertices arranged in two rows: the upper one with $n$ vertices denoted by $U_1, \dots, U_n$ and the bottom row with $n$ vertices denoted by $B_1, \dots, B_n$. We regard $S_{2n}$ as a group of permutations of the set of vertices and denote by $P_{2n}$ the set of all pairings of this set. An example of such a pairing is presented in Fig. 1. We can view $P_{2n}$ as a set of permutations $\sigma \in S_{2n}$ such that $\sigma^2 = e$ and $\sigma$ has no fixpoints. We will consider the action $\rho_{S_{2n}}$ of $S_{2n}$ on $P_{2n}$ by conjugation under the embedding $P_{2n} \subset S_{2n}$ described above.

Fig. 1. Example of an element of $P_{20}$

By $\mathbb{C}[P_{2n}]$ we denote the linear space spanned by $P_{2n}$. We equip this linear space with a bilinear symmetric form $\langle \cdot, \cdot \rangle$ by the requirement that the elements of $P_{2n}$ form an orthonormal basis. The embedding $P_{2n} \subset S_{2n}$ extends linearly to the inclusion of $S_{2n}$-modules $\mathbb{C}[P_{2n}] \subset \mathbb{C}[S_{2n}]$ and the scalar product can be described as

$$\langle a, b \rangle = \frac{\chi_{\mathrm{reg}}(a b^*)}{\chi_{\mathrm{reg}}(e)},$$
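where $\chi_{\mathrm{reg}}$ denotes the character of the left regular representation. The Brauer algebra $B(d, n)$ regarded as a vector space is isomorphic to $\mathbb{C}[P_{2n}]$. The multiplication in the algebra $B(d, n)$ depends on the parameter $d$, but in this article we will not use the multiplicative structure of the Brauer algebra.

3.1.2. Canonical representation of the Brauer algebra. By $\langle \cdot, \cdot \rangle$ we denote the canonical bilinear symmetric forms on $\mathbb{C}^d$ and on $(\mathbb{C}^d)^{\otimes n}$. The canonical representation $\rho_B$ of the Brauer algebra $B(d, n)$ on $(\mathbb{C}^d)^{\otimes n}$ is defined as follows: in order to compute

$$\langle u_1 \otimes \cdots \otimes u_n,\ \rho_B(p)[b_1 \otimes \cdots \otimes b_n] \rangle,$$

where $p \in P_{2n}$ and $u_1, \dots, u_n, b_1, \dots, b_n \in \mathbb{C}^d$, we assign to the upper vertices of $p$ the vectors $u_1, \dots, u_n$ and to the bottom vertices the vectors $b_1, \dots, b_n$. The value of $\langle u_1 \otimes \cdots \otimes u_n,\ \rho_B(p)[b_1 \otimes \cdots \otimes b_n] \rangle$ is defined to be the product of the scalar products of vectors assigned to vertices joined by the same line. For example, for the diagram $p$ from Fig. 1 we obtain:

$$\begin{aligned} \langle u_1 \otimes \cdots \otimes u_{10},\ \rho_B(p)[b_1 \otimes \cdots \otimes b_{10}] \rangle = {} & \langle u_1, u_3 \rangle \langle u_2, u_4 \rangle \langle u_5, b_7 \rangle \langle u_6, u_{10} \rangle \langle u_7, u_9 \rangle \\ & \times \langle u_8, b_8 \rangle \langle b_1, b_3 \rangle \langle b_2, b_5 \rangle \langle b_4, b_6 \rangle \langle b_9, b_{10} \rangle. \end{aligned} \tag{15}$$

The bilinear form $\langle \cdot, \cdot \rangle$ allows us to identify canonically $\mathbb{C}^d$ with its dual and we can write the isomorphism of vector spaces

$$\operatorname{End}(\mathbb{C}^d)^{\otimes n} = \bigotimes_{i \in \{U_1, \dots, U_n, B_1, \dots, B_n\}} \mathbb{C}^d. \tag{16}$$

Let us consider the action of $S_{2n}$ on $\operatorname{End}(\mathbb{C}^d)^{\otimes n}$ by permutation of factors on the right-hand side of (16). We consider the representation $\rho^n_{O(d)}$ of O(d) on $(\mathbb{C}^d)^{\otimes n}$, where

$$\rho^n_{O(d)}(O) : v_1 \otimes \cdots \otimes v_n \mapsto O(v_1) \otimes \cdots \otimes O(v_n)$$

is the diagonal action.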



Theorem 3.1 (Schur–Weyl duality for orthogonal groups [Bra37, Wen88]). The commutant of ρn (O(d)) is equal to ρB (C[P2n ]). Furthermore if d ≥ n then ρB is O(d) injective.

3.2. Integration formula.

3.2.1. For $A \in \operatorname{End}(\mathbb{C}^d)^{\otimes n}$ we define

\[ E(A) = \int_{O(d)} O^{\otimes n} A (O^t)^{\otimes n} \, dO. \]

Proposition 3.2. $E$ is a conditional expectation of $\operatorname{End}(\mathbb{C}^d)^{\otimes n}$ into $\rho_B(\mathbb{C}[P_{2n}])$; in particular it satisfies $E^2 = E$. We regard $\operatorname{End}(\mathbb{C}^d)^{\otimes n}$ as a Euclidean space with the scalar product $\langle A, B \rangle = \operatorname{Tr} AB^*$. Then $E$ is an orthogonal projection onto $\rho_B(\mathbb{C}[P_{2n}])$. It is compatible with the trace in the sense that $\operatorname{Tr} \circ E = \operatorname{Tr}$.

Proof. The proof is analogous to the proof of Proposition 2.2, but instead of Theorem 2.1 we use Theorem 3.1.

For $A \in \operatorname{End}(\mathbb{C}^d)^{\otimes n}$ we set

\[ \Phi(A) = \sum_{p \in P_{2n}} p \operatorname{Tr}(\rho_B(p)^t A) \in \mathbb{C}[P_{2n}]. \tag{17} \]

Every element of $\mathbb{C}(P_{2n})$ can be viewed via the representation $\rho_B$ as an element of $\operatorname{End}(\mathbb{C}^d)^{\otimes n}$, and therefore we can consider the linear map

\[ \tilde{\Phi} = \Phi \circ \rho_B : \mathbb{C}(P_{2n}) \to \mathbb{C}(P_{2n}). \]

The matrix of the operator $\tilde{\Phi}$ coincides with the Gram matrix of the set of vectors $\rho_B(p) \in \operatorname{End}(\mathbb{C}^d)^{\otimes n}$ indexed by $p \in P_{2n}$. We denote by $\operatorname{Wg}$ the inverse of $\tilde{\Phi}$. We postpone answering the question of whether this inverse exists to Proposition 3.10. We denote by $\Pi_{p_1, p_2}$ the partition induced by the action of the group generated by $p_1, p_2$.

Proposition 3.3. $\rho_B$, $E$, $\Phi$ are morphisms of $S_{2n}$-spaces. As a consequence, $\langle p_1, \operatorname{Wg} p_2 \rangle$ depends only on the conjugacy class of $p_1 p_2$.

Proof. The proof of this proposition is straightforward.

By a change of labels we can view $P_{2n}$ as the set of pairings of the set $\{1, \dots, 2n\}$. The particular way in which the labels $\{U_1, \dots, U_n, B_1, \dots, B_n\}$ are replaced by $\{1, \dots, 2n\}$ does not matter. For a tuple of indices $i = (i_1, \dots, i_{2n})$, where $i_1, \dots, i_{2n} \in \{1, \dots, d\}$, and a pairing $p \in P_{2n}$ we set $\delta^p_i = 1$ if for each pair $a, b \in \{1, \dots, 2n\}$ connected by $p$ we have $i_a = i_b$; otherwise we set $\delta^p_i = 0$.



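Corollary 3.4. The following formulas hold true:

\[ E = \rho_B \circ \operatorname{Wg} \circ \Phi, \tag{18} \]

\[ \operatorname{Tr} A E(B) = \sum_{p_1, p_2 \in P_{2n}} \operatorname{Tr}(A \rho_B(p_1)) \operatorname{Tr}(\rho_B(p_2)^t B) \langle p_1, \operatorname{Wg} p_2 \rangle. \tag{19} \]

For every choice of $u_1, \dots, u_{2n}, v_1, \dots, v_{2n}$ we have

\[ \int_{O(d)} \langle u_1, O v_1 \rangle \cdots \langle u_{2n}, O v_{2n} \rangle \, dO = \sum_{p_1, p_2 \in P_{2n}} \langle u_1 \otimes \cdots \otimes u_n, \rho_B(p_1) u_{n+1} \otimes \cdots \otimes u_{2n} \rangle \langle v_1 \otimes \cdots \otimes v_n, \rho_B(p_2) v_{n+1} \otimes \cdots \otimes v_{2n} \rangle \langle p_1, \operatorname{Wg} p_2 \rangle. \tag{20} \]

In particular, for every choice of indices $i = (i_1, \dots, i_{2n})$, $j = (j_1, \dots, j_{2n})$,

\[ \int_{O(d)} O_{i_1 j_1} \cdots O_{i_{2n} j_{2n}} \, dO = \sum_{p_1, p_2 \in P_{2n}} \delta^{p_1}_i \delta^{p_2}_j \langle p_1, \operatorname{Wg} p_2 \rangle. \tag{21} \]

The moments of an odd number of factors vanish:

\[ \int_{O(d)} O_{i_1 j_1} \cdots O_{i_{2n+1} j_{2n+1}} \, dO = 0. \tag{22} \]

Proof. It is enough to take appropriate matrices in the canonical basis to establish this result. The map $O(d) \ni O \mapsto -O \in O(d)$ preserves the Haar measure, therefore

\[ \int_{O(d)} O_{i_1 j_1} \cdots O_{i_{2n+1} j_{2n+1}} \, dO = \int_{O(d)} (-O_{i_1 j_1}) \cdots (-O_{i_{2n+1} j_{2n+1}}) \, dO, \]

which shows (22).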
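As a small sanity check of (21), the case of two matrix entries ($n = 1$) can be worked out by hand. The sketch below assumes the value $\operatorname{Wg}([1]) = d^{-1}$ listed in Sect. 6; the set $P_2$ then contains the single pairing $p = \{1, 2\}$, so the double sum in (21) has exactly one term.

```latex
\[
  \int_{O(d)} O_{i_1 j_1} O_{i_2 j_2} \, dO
    = \delta^{p}_{i} \, \delta^{p}_{j} \, \langle p, \operatorname{Wg} p \rangle
    = \frac{\delta_{i_1 i_2} \, \delta_{j_1 j_2}}{d}.
\]
% Consistency check: fix a column j and sum the diagonal case i_1 = i_2 = i over i.
\[
  \sum_{i=1}^{d} \int_{O(d)} O_{ij}^{2} \, dO = d \cdot \frac{1}{d} = 1,
\]
% as it must be, because every column of an orthogonal matrix has unit norm.
```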

Therefore $\operatorname{Wg}$ appears to be of fundamental importance in the computation of moments of the orthogonal group, and it is of theoretical importance to give a closed formula for it. We shall do this in the following.

3.2.2. An abstract formula for the orthogonal Weingarten function. Let $\mathrm{Id} \in P_{2n}$ be any fixed pairing; to have a concrete example let us say that $\mathrm{Id}$ is the identity of the Brauer algebra, i.e. the pairing which connects the pairs of vertices $U_i, B_i$ for each $1 \leq i \leq n$. From now on we fix an inclusion of the hyperoctahedral group $O_n$ into $S_{2n}$ by considering $O_n$ as the global stabilizer of $\mathrm{Id}$ under the action of $S_{2n}$. We equip the set $P_{2n}$ of pairings with a metric $l$ by setting

\[ l(p_1, p_2) = \frac{|p_1 p_2|}{2}, \]

where the pairings $p_1, p_2$ are regarded on the right-hand side as elements of $S_{2n}$.



Fig. 2. Identity in Brauer algebra

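Lemma 3.5. If $p_1, p_2 \in P_{2n}$ then

\[ \operatorname{Tr} \rho_B(p_1) \rho_B(p_2)^t = d^{\,n - l(p_1, p_2)}. \tag{23} \]

Furthermore, $l(p_1, p_2)$ is an integer. Each right class $\pi O_n$ of $S_{2n}/O_n$ is uniquely determined by its action $\pi(\mathrm{Id})$ on the identity diagram, hence the right classes $S_{2n}/O_n$ are in one-to-one correspondence with the elements of $P_{2n}$. We set $|\pi O_n| = \min_{\sigma \in \pi O_n} |\sigma|$. Then

\[ \langle \tilde{\Phi}(\pi(\mathrm{Id})), \mathrm{Id} \rangle = d^{\,n - |\pi O_n|}. \tag{24} \]

Let a left and right class $O_n \rho O_n$ be fixed. The value of $|\pi O_n|$ does not depend on the choice of $\pi \in O_n \rho O_n$, therefore the definition $|O_n \rho O_n| = |\rho O_n|$ makes sense.

Proof. Let $e_1, \dots, e_d$ be an orthonormal basis of $\mathbb{C}^d$; then

\[ \operatorname{Tr} \rho_B(p_1) \rho_B(p_2)^t = \sum_{1 \leq i_1, \dots, i_n, j_1, \dots, j_n \leq d} \langle e_{j_1} \otimes \cdots \otimes e_{j_n}, \rho_B(p_1)(e_{i_1} \otimes \cdots \otimes e_{i_n}) \rangle \, \langle e_{j_1} \otimes \cdots \otimes e_{j_n}, \rho_B(p_2)(e_{i_1} \otimes \cdots \otimes e_{i_n}) \rangle. \]

To every upper vertex $U_k$ (respectively, bottom vertex $B_k$) we assign the appropriate index $i_k$ (respectively, $j_k$). From the very definition of $\rho_B$, each summand on the right-hand side is equal to 1 if the indices corresponding to each pair of vertices connected by $p_1$ or $p_2$ are equal; otherwise it is equal to 0. It follows that

\[ \operatorname{Tr} \rho_B(p_1) \rho_B(p_2)^t = d^{\,\text{number of connected components of the graph depicting } p_1 \text{ and } p_2}. \]

We observe that each connected component of the graph depicting $p_1$ and $p_2$ corresponds to a pair of orbits of the permutation $p_1 p_2$. The number of orbits of $p_1 p_2$ is equal to $2n - |p_1 p_2|$, so the number of connected components is $n - |p_1 p_2|/2 = n - l(p_1, p_2)$, which finishes the proof of the first part. The above considerations imply that

\[ \langle \tilde{\Phi}(\pi(\mathrm{Id})), \mathrm{Id} \rangle = d^{\,n - \frac{1}{2} |\pi \, \mathrm{Id} \, \pi^{-1} \, \mathrm{Id}|}. \]

Let $\sigma \in \pi O_n$. Since $|\pi \, \mathrm{Id} \, \pi^{-1} \, \mathrm{Id}| = |\sigma \, \mathrm{Id} \, \sigma^{-1} \, \mathrm{Id}| \leq |\sigma| + |\mathrm{Id} \, \sigma^{-1} \, \mathrm{Id}| = 2|\sigma|$, we have $|\pi \, \mathrm{Id} \, \pi^{-1} \, \mathrm{Id}| \leq 2 |\pi O_n|$.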



We can decompose the set of vertices $\{U_1, \dots, U_n, B_1, \dots, B_n\}$ into two classes in such a way that the graph depicting the pairings $\pi(\mathrm{Id})$ and $\mathrm{Id}$ is bipartite, or, in other words, each of the pairings $\pi(\mathrm{Id})$, $\mathrm{Id}$ regarded as a permutation maps these two classes into each other. We leave it to the reader to check that there exists a unique permutation $\sigma \in \pi O_n$ which is equal to the identity on the first of these classes. It follows that $|\pi \, \mathrm{Id} \, \pi^{-1} \, \mathrm{Id}| = 2|\sigma|$, which shows that $|\pi \, \mathrm{Id} \, \pi^{-1} \, \mathrm{Id}| \geq 2|\pi O_n|$.

Let $\sigma \in O_n$. Then $|\pi \, \mathrm{Id} \, \pi^{-1} \, \mathrm{Id}| = |\pi \, \mathrm{Id} \, \pi^{-1} \sigma^{-1} \, \mathrm{Id} \, \sigma| = |\sigma \pi \, \mathrm{Id} \, \pi^{-1} \sigma^{-1} \, \mathrm{Id}|$, therefore $|\pi O_n| = |\sigma \pi O_n|$, which finishes the proof.
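The first part of Lemma 3.5 can be checked by brute force for small parameters. The following is a minimal sketch, not the authors' code: it assumes that the $2n$ vertices are relabeled $0, \dots, 2n-1$, that a pairing is given as a list of pairs, and all function names are hypothetical. It computes $l(p_1, p_2) = |p_1 p_2|/2$ from the cycle structure of the product permutation and compares $d^{\,n - l(p_1, p_2)}$ with a direct count of the index assignments that are constant along every line of $p_1$ and $p_2$, which is how the proof evaluates the trace.

```python
from itertools import product


def as_permutation(pairing, size):
    """Turn a pairing (list of 2-element tuples) into a fixed-point-free involution."""
    perm = list(range(size))
    for a, b in pairing:
        perm[a], perm[b] = b, a
    return perm


def perm_length(perm):
    """|perm|: minimal number of transpositions, i.e. size minus number of cycles."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = perm[x]
    return len(perm) - cycles


def l_metric(p1, p2, size):
    """l(p1, p2) = |p1 p2| / 2, with the pairings regarded as permutations."""
    a, b = as_permutation(p1, size), as_permutation(p2, size)
    composed = [a[b[x]] for x in range(size)]
    return perm_length(composed) // 2


def trace_brute_force(p1, p2, size, d):
    """Count index assignments that agree on every pair of p1 and of p2.

    By the proof of Lemma 3.5 this count equals Tr rho_B(p1) rho_B(p2)^t.
    """
    pairs = list(p1) + list(p2)
    return sum(
        all(idx[a] == idx[b] for a, b in pairs)
        for idx in product(range(d), repeat=size)
    )


# Assumed labeling: vertices 2k and 2k+1 play the roles of U_k and B_k.
identity = [(0, 1), (2, 3), (4, 5)]
hexagon = [(1, 2), (3, 4), (5, 0)]  # together with `identity` it forms a single 6-cycle
n, d = 3, 3
for p in (identity, hexagon):
    assert trace_brute_force(identity, p, 2 * n, d) == d ** (n - l_metric(identity, p, 2 * n))
```

The brute-force count enumerates $d^{2n}$ assignments, so it is only meant for very small $n$ and $d$.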

Lemma 3.6. The sum of the dimensions of the representations of $S_{2n}$ of the shape $2y_1 \geq 2y_2 \geq \cdots$, where $y_1 + y_2 + \cdots = n$, equals the cardinality of $P_{2n}$.

Proof. The Robinson–Schensted–Knuth algorithm provides a bijection between permutations and pairs $(P, Q)$ of standard Young tableaux of the same shape. Furthermore, if $\sigma \mapsto (P, Q)$ then $\sigma^{-1} \mapsto (Q, P)$; it follows that the RSK algorithm is a bijection between involutions $\sigma = \sigma^{-1}$ and standard Young tableaux. It is easy to show that for any involution without fixed points, the pair of tableaux $(P, Q)$ of the same shape produced by the RSK algorithm satisfies the additional property that $P = Q$. Furthermore, implementing the reverse of the RSK algorithm (see [Ful97]) shows that the tableaux must have the shape prescribed in the lemma, and that any such tableau gives rise to an involution without fixed points.

Proposition 3.7. The space $\mathbb{C}(P_{2n})$ splits under the action of $S_{2n}$ as a direct sum of representations associated to Young diagrams of the shape $2y_1 \geq 2y_2 \geq \dots$, where $y_1 + \dots + y_q = n$, hence the action is multiplicity-free.

Proof. Following Fulton [Ful97], let us consider a diagram of shape $2y_1 \geq 2y_2 \geq \dots$ and consider its row numbering Young tableau. Let $C$ be the column invariant subgroup of $S_{2n}$ and $L$ the line invariant subgroup; both groups are isomorphic to products of symmetric groups. We consider the projection operator $p_C$ associated to the trivial representation of $C$ and the projection operator $p_L$ associated to the alternating representation of $L$. One can see geometrically that these two operators commute and that the pairing $(1, 2)(3, 4) \cdots (2n - 1, 2n)$ is not in the kernel of $p_C \circ p_L$. The dimension argument of Lemma 3.6 concludes the proof and shows the uniqueness of the occurrence of any representation of the shape $2y_1 \geq 2y_2 \geq \dots$.
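The dimension count of Lemma 3.6 is easy to test numerically. The sketch below is an illustration under stated assumptions rather than part of the original argument: it computes the dimensions with the hook length formula (a standard fact not used in the text), and uses $|P_{2n}| = (2n)!/(2^n n!)$ for the number of pairings. All function names are hypothetical.

```python
from math import factorial


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def dimension(shape):
    """Dimension of the irreducible representation of shape `shape` (hook length formula)."""
    conjugate = [sum(1 for row in shape if row > col) for col in range(shape[0])]
    hooks = 1
    for i, row in enumerate(shape):
        for j in range(row):
            arm = row - j - 1           # boxes to the right in the same row
            leg = conjugate[j] - i - 1  # boxes below in the same column
            hooks *= arm + leg + 1
    return factorial(sum(shape)) // hooks


def sum_even_row_dimensions(n):
    """Sum of dimensions over the shapes 2*y_1 >= 2*y_2 >= ... with y_1 + y_2 + ... = n."""
    return sum(dimension(tuple(2 * y for y in ys)) for ys in partitions(n))


def count_pairings(n):
    """Number of pairings of a 2n-element set, (2n)! / (2^n n!)."""
    return factorial(2 * n) // (2 ** n * factorial(n))


for n in range(1, 7):
    assert sum_even_row_dimensions(n) == count_pairings(n)
```

For $n = 3$, for example, the shapes $(6)$, $(4, 2)$ and $(2, 2, 2)$ contribute $1 + 9 + 5 = 15$, the number of pairings of six points.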

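Proposition 3.8. The eigenspaces of $\tilde{\Phi}$ are indexed by Young diagrams $\lambda$ with the shape $2l_1 \geq 2l_2 \geq \dots$. The corresponding eigenvalue is given by

\[ z_\lambda = \frac{\sum_{\pi \in O_n \backslash S_{2n} / O_n} d^{\,n - |\pi|} \sum_{\sigma \in \pi} \chi^\lambda(\sigma)}{\sum_{\sigma \in O_n} \chi^\lambda(\sigma)}, \tag{25} \]

and the corresponding eigenspace is equal to the image of $\rho_{S_{2n}}(p_\lambda)$.

Proof. $\tilde{\Phi}$ is a morphism of $S_{2n}$-spaces by Proposition 3.3, hence Proposition 3.7 gives the classification of the eigenspaces of $\tilde{\Phi}$. Let $\lambda$ be as in Proposition 3.7; then the element $\rho_{S_{2n}}(p_\lambda)(\mathrm{Id})$ is non-zero and belongs to an irreducible submodule of $\mathbb{C}(P_{2n})$, thus it satisfies

\[ \tilde{\Phi} \rho_{S_{2n}}(p_\lambda)(\mathrm{Id}) = z_\lambda \rho_{S_{2n}}(p_\lambda)(\mathrm{Id}). \]

We have therefore by bilinearity

\[ \langle \tilde{\Phi} \rho_{S_{2n}}(p_\lambda)(\mathrm{Id}), \mathrm{Id} \rangle = z_\lambda \langle \rho_{S_{2n}}(p_\lambda)(\mathrm{Id}), \mathrm{Id} \rangle = z_\lambda \sum_{\sigma \in O_n} p_\lambda(\sigma). \tag{26} \]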

Lemma 3.5 can be used to evaluate the left-hand side of (26). Since the left-hand side of (26) is non-zero for sufficiently big $d$, the right-hand side is also non-zero and the division makes sense.

Theorem 3.9. The Weingarten function is given by

\[ \operatorname{Wg} = \sum_\lambda \frac{1}{z_\lambda} \rho_{S_{2n}}(p_\lambda), \tag{27} \]

where the sum runs over diagrams $\lambda$ with a shape prescribed in Proposition 3.7 and $z_\lambda$ was defined in Eq. (25). In particular,

\[ \langle p_1, \operatorname{Wg} p_2 \rangle = \sum_\lambda \frac{1}{z_\lambda} \, \frac{\chi_{reg}\bigl( \rho_{S_{2n}}(p_\lambda)(p_1) \cdot \rho_{S_{2n}}(p_\lambda)(p_2) \bigr)}{(2n)!}, \tag{28} \]

where $\rho_{S_{2n}}(p_\lambda)(p_i)$ are considered as elements of $\mathbb{C}[S_{2n}]$ and $\cdot$ is the multiplication in $\mathbb{C}[S_{2n}]$.

Proof. The first point follows from the above discussion, and for the second it is enough to observe that $\langle p_1, p_2 \rangle = \frac{1}{(2n)!} \chi_{reg}(p_1 p_2^t)$.

Observe that Eq. (28) is a closed formula for $\langle p_1, \operatorname{Wg} p_2 \rangle$ as a (rational) function of the dimension $d$, expressed in terms of the characters of the symmetric group.

A priori, Corollary 3.4 is valid only for $d \geq n$ since in this case $\rho_B$ is injective and therefore $\tilde{\Phi}$ is invertible; otherwise the Weingarten function does not exist. The following result deals also with the cases $d < n$.

Proposition 3.10. Corollary 3.4 remains true for all values of $d$ and $n$ if the following definition of the Weingarten function is used:

\[ \operatorname{Wg} = \sum_\lambda \frac{1}{z_\lambda} \rho_{S_{2n}}(p_\lambda), \tag{29} \]

where the sum is taken over all diagrams $\lambda$ with a shape prescribed in Proposition 3.7 for which $z_\lambda \neq 0$.



Proof. Since $E$ is an orthogonal projection, it is enough to check the validity of (18) on the range of $\rho_B$. We denote by $V \subseteq \mathbb{C}(P_{2n})$ the span of the images of $\rho_{S_{2n}}(p_\lambda)$ for which $z_\lambda \neq 0$; the range of $\rho_B$ is equal to $\rho_B(V)$, hence it is enough to show that

\[ E \circ \rho_B = \rho_B \circ \operatorname{Wg} \circ \tilde{\Phi} \]

holds true on $V$. The latter equality is obvious since the inverse of $\tilde{\Phi} : V \to V$ is equal to $\operatorname{Wg}$ given by (29).

We can treat $\tilde{\Phi}_d : \mathbb{C}(P_{2n}) \to \mathbb{C}(P_{2n})$ as a matrix whose entries are polynomials in $d$, and therefore its inverse $\operatorname{Wg}_d : \mathbb{C}(P_{2n}) \to \mathbb{C}(P_{2n})$ makes sense as a matrix whose entries are rational functions of $d \in \mathbb{C}$; therefore $\operatorname{Wg}_d$ is well-defined for all $d \in \mathbb{C}$ except for a finite set; it is explicitly given by (27). For fixed $A, B \in M_{d_0}(\mathbb{C})^{\otimes n}$ let us plug this (incorrect for $d_0 < n$) value of $\operatorname{Wg}_d$ into (19); the right-hand side becomes a rational function of $d$ and, after the cancellation of poles, it has a limit as $d \to d_0$ which is indeed equal to the left-hand side of (19). In other words, we claim that

\[ E = \lim_{d \to d_0} \rho_B \circ \operatorname{Wg}_d \circ \Phi. \tag{30} \]

It is indeed the case since for every $d \in \mathbb{C}$ the value of $\operatorname{Wg}_d(\Phi(A))$ is the same no matter if we use (27) or (29). We summarize the above discussion in the following proposition.

Proposition 3.11. Corollary 3.4 remains true for all values of $d$ and $n$ if the Weingarten function is regarded as a rational function computed in (27), possibly after some cancellation of poles.

3.3. Asymptotics of Weingarten function. For pairings $p_1, p_2 \in P_{2n}$ let $2n_1, 2n_2, \dots$ denote the numbers of elements in the orbits of the action of $\{p_1, p_2\}$. We define the Möbius function

\[ \operatorname{Moeb}(p_1, p_2) = \prod_i (-1)^{n_i - 1} c_{n_i - 1}, \]

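where $c_n$ is the Catalan number defined in (14).

Lemma 3.12. For every $p \in P_{2n}$ and $|d|$ sufficiently large we have

\[ \operatorname{Wg}(p) = d^{-n} \sum_{k \geq 0} \; \sum_{\substack{p_0, p_1, \dots, p_k \\ p_0 = p, \; p_i \neq p_{i+1} \text{ for } i \in \{0, 1, \dots, k-1\}}} (-1)^k d^{-l(p_0, p_1) - \cdots - l(p_{k-1}, p_k)} \, p_k. \]

Proof. It is enough to observe that

\[ d^{-n} \tilde{\Phi}(p) = p + \sum_{p' \neq p} d^{-l(p, p')} p' \]

and use the power series expansion $\frac{1}{1 + x} = 1 - x + x^2 - x^3 + \cdots$ for the operator $d^{-n} \tilde{\Phi}$.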


Theorem 3.13. The leading term of the Weingarten function is given by

\[ \langle p, \operatorname{Wg} p' \rangle = d^{-n - l(p, p')} \operatorname{Moeb}(p, p') + O(d^{-n - l(p, p') - 1}). \tag{31} \]
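The leading term (31) can be compared with the explicit values listed in Sect. 6. The worked check below rests on two assumptions that are not spelled out at this point of the text: the Catalan numbers of (14) are taken to be the standard ones, $c_0 = 1$, $c_1 = 1$, $c_2 = 2$, $c_3 = 5$, and for a pairing $p$ whose graph with $\mathrm{Id}$ has components of sizes $2n_1, 2n_2, \dots$ we use $l(p, \mathrm{Id}) = \sum_i (n_i - 1)$, which follows from the component count in the proof of Lemma 3.5.

```latex
\begin{align*}
\operatorname{Wg}([2]) &= \frac{-1}{d(d-1)(d+2)} = -d^{-3} + O(d^{-4}),
  & d^{-2-1} \, (-1)^{1} c_1 &= -d^{-3}, \\
\operatorname{Wg}([3]) &= \frac{2}{d(d-1)(d-2)(d+2)(d+4)} = 2 d^{-5} + O(d^{-6}),
  & d^{-3-2} \, (-1)^{2} c_2 &= 2 d^{-5}, \\
\operatorname{Wg}([2,2]) &= \frac{d^2 + 5d + 18}{d(d+1)(d+2)(d+4)(d+6)(d-1)(d-2)(d-3)} = d^{-6} + O(d^{-7}),
  & d^{-4-2} \, \bigl( (-1)^{1} c_1 \bigr)^{2} &= d^{-6}, \\
\operatorname{Wg}([4]) &= \frac{-5d - 6}{d(d+1)(d+2)(d+4)(d+6)(d-1)(d-2)(d-3)} = -5 d^{-7} + O(d^{-8}),
  & d^{-4-3} \, (-1)^{3} c_3 &= -5 d^{-7}.
\end{align*}
```

In each row the left column expands the tabulated rational function for large $d$ and the right column evaluates $d^{-n - l(p, \mathrm{Id})} \operatorname{Moeb}(p, \mathrm{Id})$; the two leading terms agree.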

Proof. Lemma 3.12 implies that we need to find explicitly all tuples of pairings $p_0, \dots, p_k$ such that $p_0 = p$, $p_k = p'$, which fulfill $p_i \neq p_{i+1}$ for $i \in \{0, \dots, k - 1\}$ and $l(p_0, p_1) + \cdots + l(p_{k-1}, p_k) = l(p_0, p_k)$. For every such tuple the triangle inequality implies that $l(p_0, p_i) + l(p_i, p_k) = l(p_0, p_k)$, or equivalently, $|p_0 p_i| + |p_i p_k| = |p_0 p_k| = |(p_0 p_i)(p_i p_k)|$. The latter condition implies that every orbit of $p_0 p_i \in S_{2n}$ must be a subset of one of the orbits of $p_0 p_k$ [Bia97, Bia98]. Therefore the pairing $p_i$ cannot connect vertices which belong to different connected components of the graph spanned by $p_0$ and $p_k$. It follows that it is enough to consider the case when the graph spanned by $p_0$ and $p_k$ is connected.

Suppose that the graph spanned by $p_0$ and $p_k$ is connected. It follows that the permutation $p_0 p_k$ consists of two $n$-cycles; we denote one of them by $\pi$. Since every orbit of $p_0 p_i$ is a subset of one of the orbits of $p_0 p_k$, it makes sense to consider the restriction $\rho_i$ of $p_0 p_i$ to the support of $\pi$. Observe that knowing $\rho_i$ we can reconstruct the pairing $p_i$ by the formula

\[ p_i(s) = \begin{cases} p_0 \rho_i(s) & \text{if } \rho_i(s) \text{ is defined}, \\ \rho_i^{-1} p_0(s) & \text{otherwise}. \end{cases} \]

It follows that the solutions of the equation $l(p_0, p_i) + l(p_i, p_k) = l(p_0, p_k)$ can be identified with the solutions of the equation $|\rho| + |\rho^{-1} \pi| = |\pi|$.

Now one can easily see that the tuples of pairings $p_1, \dots, p_{k-1}$ which fulfill $p_i \neq p_{i+1}$ for $i \in \{0, \dots, k - 1\}$ and $l(p_0, p_1) + \cdots + l(p_{k-1}, p_k) = l(p_0, p_k)$ are in one-to-one correspondence with tuples of permutations $\rho_1, \dots, \rho_{k-1}$ such that $\rho_i \neq \rho_{i+1}$ and $|\rho_0 \rho_1^{-1}| + \cdots + |\rho_{k-1} \rho_k^{-1}| = |\rho_0 \rho_k^{-1}|$, where $\rho_0$ is the identity permutation and $\rho_k = \pi$. The results of Biane [Bia97] finish the proof.

3.4. Cumulants. Recall that in the work of the first named author [Col03] the asymptotics of the cumulants of the unitary Weingarten functions have been obtained (Theorem 2.15). The purpose of this section is to establish the counterpart of this result for the orthogonal Wg functions.

As we see by Proposition 3.3, the function $\operatorname{Wg}$ can be labeled by $\operatorname{Wg}(\lambda, d)$, where $\lambda \vdash n$ is a partition of the number $n$. It will be more convenient to define in the obvious way $\operatorname{Wg}(\pi, d)$, where $\pi$ is a partition of the interval $[1, n]$. The set of partitions of an interval is endowed with the order of refinement, denoted by $\leq$. The set of partitions is known to be a lattice, in which there is a smallest element (the partition $0_n$ with only one-element blocks) and a largest element (the partition $1_n$ with only one block). In addition, the notion of sup and inf makes sense. For partitions $\Pi, \Pi'$ of $[1, n]$ such that $\pi \leq \Pi \leq \Pi'$, it is of fundamental importance to have a good understanding of the relative cumulants $C_{\pi, \Pi, \Pi'}$ of $\operatorname{Wg}$, defined implicitly by the relation

\[ \operatorname{Wg}_{\Pi'}(\pi, d) = \sum_{\Pi \leq \Pi'' \leq \Pi'} C_{\pi, \Pi, \Pi''} \]

whenever $\Pi' \geq \Pi$, with

\[ \operatorname{Wg}_{\Pi}(\pi) = \prod_k \operatorname{Wg}(\pi|_{V_k}) \]

if one denotes $\Pi = \{V_1, \dots, V_k\}$.


Remark. $C_{\pi, \Pi, \Pi'}$ is multiplicative, therefore it is enough to know $C_{\pi, \Pi, 1_n}$ to know all $C_{\pi, \Pi, \Pi'}$. Actually, it is shown in [Col03] that $C_{\pi, \Pi, \Pi'}$ can be written as a sum of $C_{\pi, \pi, 1_n}$'s.

Lemma 3.14. The relative cumulant is given, for $d$ large enough, by

\[ C_{\pi, \Pi, \Pi'} = d^{-n} \sum_{k \geq 0} \; \sum_{\substack{p = p_0, p_1, \dots, p_k \\ p_i \neq p_{i+1} \text{ for } i \in \{0, 1, \dots, k-1\} \\ \sup(\Pi, \pi, \pi_1, \dots, \pi_k) = \Pi'}} (-1)^k d^{-l(p_0, p_1) - \cdots - l(p_{k-1}, p_k)}. \]

The leading order of the series of $C_{\pi, \Pi, \Pi'}$ is therefore the number of $k$-tuples $(\pi_1, \dots, \pi_k)$ of elements of $P_{2n}$ such that $l(\pi, \pi_1) + l(\pi_1, \pi_2) + \dots + l(\pi_k, \mathrm{Id}) = n + l(\pi, \mathrm{Id}) - 2(\#\mathrm{blocks}(\Pi') - \#\mathrm{blocks}(\Pi))$ together with the requirement that $\sup(\Pi, \pi, \pi_1, \dots, \pi_k) = \Pi'$.

Proof. For the first point, it is enough to check that this equation satisfies the moment-cumulant equation. The asymptotics of the leading order are elementary. For a less direct approach, see also [Col03].

In order to compute the leading order, it is enough to compute the number of $k$-tuples $(\pi_1, \dots, \pi_k)$ of elements of $P_{2n}$ such that $l(\pi, \pi_1) + l(\pi_1, \pi_2) + \dots + l(\pi_k, \mathrm{Id}) = n + l(\pi, \mathrm{Id}) - 2(\#\mathrm{blocks}(\Pi') - \#\mathrm{blocks}(\Pi))$ together with the requirement that $\sup(\pi, \pi_1, \dots, \pi_k) = 1_n$. We call this number $B[\pi, k]$. Denote by $\tau_1, \dots, \tau_n$ the disjoint transpositions generating the pairing $\mathrm{Id} \in P_{2n}$, and let $G$ be the subgroup of $S_{2n}$ generated by these transpositions. This group has the structure of $(\mathbb{Z}/2\mathbb{Z})^n$. The symmetric group $S_n$ can be regarded as a subset of $P_{2n}$ when we identify a permutation $\sigma$ with the pairing which connects the upper vertex $U_i$ with the bottom vertex $B_{\sigma(i)}$ for all $1 \leq i \leq n$. We say that pairings which can be obtained by this construction are permutation-like. The group $G$ acts on $P_{2n}$ by conjugation and one checks easily that in any orbit under the action of $G$ there exists at least one permutation-like element. Moreover, two permutation-like elements in the same orbit are conjugate to each other when regarded as elements of $S_n$. More precisely, each orbit has $2^l$ elements, where $l$ is the number of cycles with at least 3 elements in $S_n$.

Fix $\pi \in P_{2n}$ and call $k$ the number of its connected components (i.e. the number of cycles, including trivial cycles (two-element orbits) and transpositions (four-element orbits), of an associated permutation-like element). Let $\sigma \in S_n$ be one image of $\pi$. Consider the number of $k$-tuples $(\sigma_1, \dots, \sigma_k)$ of permutations of $S_n$ such that $\sigma_1 \cdots \sigma_k \sigma = e$, the group generated by $\sigma_1, \dots, \sigma_k$ acts transitively on $[1, n]$, and $|\sigma| + |\sigma_1| + \dots + |\sigma_k| = 2n - 2$. This number has already been computed in [BMS00] and is equal to

\[ \tilde{A}[\sigma, k] = k \, \frac{(nk - n - 1)!}{(nk - 2n + |\sigma| + 2)!} \prod_{i \geq 1} \left[ i \binom{ki - 1}{i} \right]^{d_i}, \]

where $d_i$ denotes the number of cycles with $i$ elements of $\sigma$.

Proposition 3.15. $B[\pi, k] = 2^{k-1} \tilde{A}[\sigma, k]$.



Proof. The group $G$ acts by conjugation on the $k$-tuples $(\pi_1, \dots, \pi_k)$ arising in the counting of $B[\pi, k]$. There is an element of $G$ which turns $\pi$ into a permutation-like element. In other words, one can assume that $\pi$ is permutation-like. Introduce the group $G'$ generated by $\tau_{i_{1,1}} \cdots \tau_{i_{1,l_1}}, \dots, \tau_{i_{k,1}} \cdots \tau_{i_{k,l_k}}$, where $\tau_{i_{j,1}}, \dots, \tau_{i_{j,l_j}}$ correspond to the elements of the $j$th cycle of $\sigma$. This group has the structure of $(\mathbb{Z}/2\mathbb{Z})^k$ and acts by restriction of $G$ on the $k$-tuples $(\pi_1, \dots, \pi_k)$. One checks that for any $k$-tuple $(\pi_1, \dots, \pi_k)$ satisfying the length conditions, there exist exactly elements of $G$ such that their action turns all $k$-tuples into permutation-like elements.

Theorem 3.16. $C_{\pi, \Pi, \Pi'}$ is a rational fraction of order $d^{-n - l(\pi, \mathrm{Id}) + 2(\#\mathrm{blocks}(\Pi') - \#\mathrm{blocks}(\Pi))}$ and its leading term is given by $\gamma_{\pi, \Pi, \Pi'}$. Assume that $\pi$ has $d_i$ cycles of length $i - 1$. Then

\[ \gamma_{\pi, \pi, 1_n} = (-1)^{|\pi|} \, \frac{2^{2q - 2|\pi| - 1} (3q - 3 - |\pi|)!}{(2q)!} \prod_{i=1}^{q} \left( \frac{(2i - 1)!}{(i - 1)!^2} \right)^{d_i}. \tag{32} \]

Proof. The proof is exactly the same as that of Theorem 2.15 in [Col03]. It is enough, in Eq. (2.56), to replace $\tilde{A}[\sigma, k]$ by $B[\pi, k]$.

According to the remark above Lemma 3.14, it is possible to write $\gamma_{\pi, \Pi, \Pi'}$ as a sum of elements of type $\gamma_{\pi, \pi, 1_n}$. This is a straightforward adaptation of Theorem 2.15, item (iii) of [Col03]. Therefore the above theorem actually gives us full understanding of the leading order of $C_{\pi, \Pi, \Pi'}$.

4. Integration Over Symplectic Groups

Let $e_1, \dots, e_d, f_1, \dots, f_d$ be an orthonormal basis of $\mathbb{C}^{2d}$. We refer to this basis as the canonical basis. Consider the bilinear antisymmetric form $\langle \cdot, \cdot \rangle$ such that

\[ \langle e_i, f_j \rangle = \delta_{i,j}, \qquad \langle e_i, e_j \rangle = \langle f_i, f_j \rangle = 0. \tag{33} \]

The symplectic group $Sp(d)$ is the set of unitary matrices of $M_{2d}(\mathbb{C})$ preserving $\langle \cdot, \cdot \rangle$. Also by $\langle \cdot, \cdot \rangle$ we denote the bilinear form on $(\mathbb{C}^{2d})^{\otimes n}$ given by the canonical tensor product of forms $\langle \cdot, \cdot \rangle$ on $\mathbb{C}^{2d}$. This form is symmetric if $n$ is even and antisymmetric if $n$ is odd. The Brauer algebra $B(-d, n)$ admits a natural action on the space $(\mathbb{C}^{2d})^{\otimes n}$ given in the same way as in Sect. 3.1.2, with the difference that $\langle \cdot, \cdot \rangle$ should be understood as in Eq. (33).

Most of the results from Sect. 3 remain true also for the symplectic case. Below we present briefly which changes are necessary.

Theorem 4.1 (Schur–Weyl duality for symplectic groups [Bra37, Wen88, BW89]). The commutant of $\rho_{Sp(d)}(Sp(d))$ is equal to $\rho_B(\mathbb{C}[P_{2n}])$. Furthermore, if $d \geq n$ then $\rho_B$ is injective.

For $A \in \operatorname{End}(\mathbb{C}^{2d})^{\otimes n}$ we set

\[ E(A) = \int_{Sp(d)} O^{\otimes n} A (O^t)^{\otimes n} \, dO, \]

and define $\Phi(A)$ as in (17). All results of Sect. 3 remain true with the only difference that the value of $d$ in all formulas should be replaced by $(-d)$. As for the cumulants, $\gamma_{\pi, \pi, 1_n}$ should be replaced by $(-1)^{k+1} \gamma_{\pi, \pi, 1_n}$, where $k$ is the number of blocks of $\pi$.



5. Expectation of Product of Random Matrices and Free Probability

This section is rather sketchy since it follows very closely the work of the first-named author [Col03].

5.1. Asymptotic freeness for orthogonal matrices. Let $n$ be an integer. We consider the following enumeration of $8n$ integers: $1, \dots, 4n, \bar{1}, \dots, \overline{4n}$. Let $T$ be the subset of $P_{8n}$ such that any pairing links each $i$ with some $\bar{j}$. There is a natural bijection between this set and $S_{4n}$. Call $\Xi$ the element of $P_{8n}$ linking $2i - 1$ to $2i$ and $\overline{2j - 1}$ to $\overline{2j}$, and $S$ the subset of $P_{8n}$ such that elements link 2i − 1 to 2i and an odd (resp. even) j to an odd (resp. even) k. Let $A^{(1)}, \dots, A^{(2n)}$ be (constant) matrices in $M_d(\mathbb{C})$. For $\tau \in P_{8n}$ and a random matrix $B$, define

\[ \operatorname{tr}(A^{(1)}, \dots, A^{(2n)}; B, \tau) = d^{-\mathrm{loops}(\Xi, \tau)} \, E\Bigl( \sum_{k_1, \dots, k_{4n}, \bar{k}_1, \dots, \bar{k}_{4n}} \; \prod_{i=1}^{2n} B_{k_{2i-1}, k_{2i}} A^{(i)}_{\bar{k}_{2i-1}, \bar{k}_{2i}} \, \delta_{\tau, k} \Bigr), \tag{34} \]

where $\delta_{\tau, k} = 1$ if $k_i = k_j$ for every pair $(i, j)$ of $\tau$, and $0$ otherwise. This expression is obviously a product of normalized traces of $\{B, B^t\}$ alternating with $\{A^{(i)}, A^{(i)t}\}$.

Let $\tau \in T$ and $\sigma \in S$. Define

\[ \operatorname{tr}(A^{(1)}, \dots, A^{(2n)}; \tau, \sigma) = d^{-\mathrm{loops}(\sigma, \tau)} \, E\Bigl( \sum_{k_1, \dots, k_{4n}, \bar{k}_1, \dots, \bar{k}_{4n}} \; \prod_{i=1}^{2n} A^{(i)}_{\bar{k}_{2i-1}, \bar{k}_{2i}} \, \delta_{\tau, k} \, \delta_{\sigma|_{\{1, \dots, 4n\}}, k} \Bigr). \tag{35} \]

Observe that $\sigma$ is restricted to the set $\{1, \dots, 4n\}$. It makes sense to do so because it belongs to $S$ and elements of $S$ do not link $i$'s with $\bar{j}$'s. As in Eq. (34), this expression is obviously a product of normalized traces of $\{A^{(i)}, A^{(i)t}\}$. Let $O$ be a random orthogonal Haar distributed matrix in $M_d(\mathbb{C})$. One establishes easily

\[ E \operatorname{tr}(A^{(1)}, \dots, A^{(2n)}; O, \tau) = \sum_{\sigma \in S} \operatorname{tr}(A^{(1)}, \dots, A^{(2n)}; \tau, \sigma) \, \widetilde{\operatorname{Wg}}(\sigma, \Xi) \, d^{\,l(\Xi, \tau) - l(\Xi, \sigma) - l(\sigma, \tau)}, \tag{36} \]

where $\widetilde{\operatorname{Wg}}$ is the asymptotic normalized Wg function restricted to the set $\{1, \dots, 4n\}$. From this we obtain



Lemma 5.1. In Eq. (36), assuming that $\{A^{(i)}, A^{(i)*}\}$ admits a joint limit distribution with respect to the normalized trace $\operatorname{tr}$ on $M_d(\mathbb{C})$, any term on the right-hand side has asymptotic order $\leq 0$. In case $l(\Xi, \tau) - l(\Xi, \sigma) - l(\sigma, \tau) = 0$, at least two factors of $\operatorname{tr}(A^{(1)}, \dots, A^{(2n)}; \tau, \sigma)$ have to be of the kind $\operatorname{tr}(A^{(i)})$. In addition, at least two of such indices $i$ are such that neither the pattern "$\dots O A^{(i)} O^* \dots$" nor "$\dots O^* A^{(i)} O \dots$" occurs in the cycle decomposition.

Proof. The first point is an obvious consequence of the triangle inequality. In the case $l(\Xi, \tau) - l(\Xi, \sigma) - l(\sigma, \tau) = 0$, observe that since $l(\Xi, \sigma) \geq n$, one has to have $l(\sigma, \tau) \leq 3n - 1$. The remaining assertions are an easy adaptation of [Col03], Proposition 3.3 (note that according to the definition of $T$ and $S$, $l(\sigma, \tau) \geq 2n$; the proof follows by an easy graphical interpretation and the description of geodesics given in the proof of [Col03], Theorem 3.13).

From this we deduce:

Theorem 5.2. Let $O_1, O_2, \dots$ be independent copies of orthogonal ensembles, and $W$ be a set of matrices such that the set $(W, W^t)$ admits a limit distribution. Then $W, \{O_1, O_1^t\}, \{O_2, O_2^t\}, \dots$ are asymptotically free. This convergence holds almost surely.

Proof. Asymptotic freeness is an immediate application of the definition of freeness together with the previous lemma and the asymptotic multiplicativity of the Wg function established in Theorem 3.13. The proof of almost sure convergence is a consequence of the computation of the cumulants of the Wg function in Theorem 3.16 together with an application of the Chebyshev inequality and the Borel-Cantelli lemma (see [Col03], Theorem 3.7 for details).

Remark. We would like to draw the attention of the reader to the existence of important differences between the unitary case and other cases. For example, in the unitary case, the matrix family $(2d E^{(d)}_{i,i+1}, \{O, O^*\}) \in M_d(\mathbb{C})$ admits an asymptotic joint law whereas this is not true in the orthogonal case. The existence of an asymptotic joint law does not fail if one assumes that matrices are bounded. It also holds if one modifies the joint law assumption by enlarging the family $W$ to $W, W^t$ as we do in the previous theorem. It is also possible to write down a necessary and sufficient relation from Eq. (36) but to our knowledge, there is no mathematical need for this at this point.

5.2. Orthogonal matrix integral. In this section we deal with orthogonal matrix integrals (that is, matrix integrals where the integral is taken with respect to the Haar measure on the orthogonal group), and in particular with the orthogonal Itzykson-Zuber integral. For unitary matrix integrals many tools are available and this paper together with [Col03] just provide a complementary mathematical approach. However, interestingly enough, it seems that up to now, except for character expansion, there were not so many systematic tools for the study of non-unitary (i.e. orthogonal, symplectic) matrix integrals. One bright side of our approach is to provide such a tool and therefore new formulae to theoretical physics.

Theorem 5.3. Let $W$ be a family of matrices such that the family $W, W^t$ admits a limit joint distribution. Let $O_1, \dots, O_k$ be independent Haar distributed unitary (resp. orthogonal or symplectic) matrices. Let $(P_{i,j})_{1 \leq i,j \leq k}$ and $(Q_{i,j})_{1 \leq i,j \leq k}$ be two families of noncommutative polynomials in $O_1, O_1^*, \dots, O_k, O_k^*$ and $W$. Let $A_d$ be the random variable $\sum_{i=1}^{k} \prod_{j=1}^{k} \operatorname{tr} P_{i,j}(O, O^*, W)$ and $B_d$ the variable $\sum_{i=1}^{k} \prod_{j=1}^{k} \operatorname{tr} Q_{i,j}(O, O^*, W)$, where $\operatorname{tr} x = \frac{1}{d} \operatorname{Tr} x$ for $x \in M_d(\mathbb{C})$ denotes the normalized trace.



• (i) For each $d$, the analytic function

\[ z \mapsto d^{-2} \log E \exp(z d^2 A_d) = \sum_{n \geq 1} a_{d,n} z^n \]

is such that for all $n$ the limit $\lim_{d \to \infty} a_{d,n}$ exists and is finite. It depends only on the limit distribution of $W$ and on the polynomials $P_{i,j}$.

• (ii) For each $d$, the analytic function

\[ z \mapsto \frac{E \exp(z B_d + z d^2 A_d)}{E \exp(z d^2 A_d)} = 1 + \sum_{n \geq 1} b_{d,n} z^n \]

is such that for all $n$ the limit $\lim_{d \to \infty} b_{d,n}$ exists and is finite. It depends only on the limit distribution of $W$ and on the polynomials $P_{i,j}$ and $Q_{i,j}$.

Proof. This is a straightforward application of Theorem 3.16. See Theorem 4.1 of [Col03] for details.

As a further illustration of our results on the asymptotics of cumulants, we state the asymptotics of $d^{-2} C_n(d \operatorname{Tr} A_d O B_d O^*)$. This number is also known as the coefficient of the series of the orthogonal Itzykson-Zuber integral. Observe that if $A_d, B_d$ are real antisymmetric, the Harish-Chandra formula applies and yields a formula for the finite dimensional IZ integral provided that the eigenvalues of $A_d$ and $B_d$ have no multiplicity. Without these assumptions, there is no formula to our knowledge. However, interesting results have been obtained in [BH03] (see also references therein) about asymptotics of symplectic Harish-Chandra integrals and the two results would deserve to be compared. The asymptotic convergence of $d^{-2} C_n(d \operatorname{Tr} A_d O B_d O^*)$, provided that $A_d, A_d^t, B_d, B_d^t$ admit a joint limit distribution, is already granted by Theorem 5.3.

Let $G_n$ be the set of (not necessarily connected) planar graphs (such that any connected component is drawn on a distinct sphere) with $n$ edges together with the following conditions: (i) each face has an even number of edges, (ii) the edges are labeled from 1 to $n$, (iii) there is a bicoloring in white and black of the vertices such that each black vertex has only white neighbors and vice versa. To each such graph $g \in G_n$ we associate the permutations $\sigma(g)$ (resp. $\tau(g)$) of $S_n$ defined by turning clockwise (resp. counterclockwise) around the white (resp. black) vertices and the function

\[ \operatorname{Moeb}(g) = \gamma_{\tau\sigma^{-1}, \; \Pi_\tau \vee \Pi_\sigma, \; q + |\tau\sigma^{-1}| + 2(C(\Pi_\tau \vee \Pi_\sigma) - 1)}. \]

For this definition to make sense in the orthogonal framework, we choose an embedding of $S_n$ into $B_{2n}$ by partitioning $[1, 2n]$ into two sets $V_1$ and $V_2$ of cardinality $n$, and to a permutation $\sigma$ we associate the element of $B_{2n}$ pairing the $i$th element of $V_1$ with the $\sigma(i)$th element of $V_2$.


[Figure: example of a planar graph in $G_n$ with edges labeled 1, ..., 17.]

For example, in the picture,

σ = (1 13 2)(3 5 4)(6 7)(8 9 10)(11 12)(16 17)(14 15),
τ = (5 6)(7 8)(10 11)(2 3 9)(12 13)(1 4)(14 17)(15 16),
τσ⁻¹ = (1 3)(5 9 7)(6 8 11 13 4)(2 12 10)(17 15)(14 16).

Two graphs are said to be equivalent if there is a positive oriented diffeomorphism of the plane transforming one into the other and respecting the coloring of the vertices and the labeling of the edges. We call $\sim$ this equivalence relation. For a permutation $\sigma \in S_n$, we denote by $X_\sigma$ the quantity $\prod_{i=1}^{k} \operatorname{tr} X^{l_i}$ if $\sigma$ splits into orbits containing $l_1, \dots, l_k$ elements.

Theorem 5.4. If $X_d, Y_d, X_d^t, Y_d^t$ admit a joint limit distribution, one has

\[ \lim_d d^{-2} C_n(d^2 A) = \sum_{g \in G_q / \sim} X_{\tau(g)} Y_{\sigma(g)} \operatorname{Moeb}(g). \tag{37} \]

We omit this proof, for it is almost the same as that of Theorem 4.3 of [Col03]. Observe that the asymptotic result only depends on traces of polynomials in $X_d$ and traces of polynomials in $Y_d$. Mixed patterns (involving traces of a non-commutative polynomial in the four variables $X_d, Y_d, X_d^t, Y_d^t$) do not occur in the limit. However, we need a control on the joint moments. In other words, the same diagrams appear as in the unitary case. The only difference is that the orthogonal function Moeb is the unitary one times $2^{\#\text{connected components} - 1}$.

Theorem 5.5. Let $X_d$ be a rank one projection and assume that $(Y_d, Y_d^t)$ has a limit joint distribution whose first marginal is $\mu$. Then

\[ \lim_d d^{-1} \cdot C_n(d \operatorname{Tr}(X_d O Y_d O^*)) = (n - 1)! \, k_n(\mu). \tag{38} \]

In other words, the coefficients of $z \mapsto d^{-1} \log E e^{z d \operatorname{Tr}(X_d O Y_d O^*)}$ converge pointwise to those of the primitive of the R-transform of $\mu$.



The proof goes along the same lines as Theorem 4.7 of [Col03], therefore we omit it. Observe that this result is exactly the same as for the unitary case, except that we need an extra control on the joint moments of $X_d, Y_d, X_d^t, Y_d^t$.

5.3. Orthogonal replaced by symplectic. When orthogonal matrices are replaced by symplectic ones, the statements should be modified in the following way: if $P$ is the unitary such that $P O^T P = O^*$, then $(X_d, X_d^t)$ (resp. $(Y_d, Y_d^t)$) should be replaced by $(X_{2d}, P X_{2d}^T P)$ (resp. $(Y_{2d}, P Y_{2d}^T P)$). Theorem 5.3 remains true, and Theorem 5.4 as well (one only needs to modify accordingly the definition of Moeb). In Theorem 5.5, $\mu$ should be replaced by $-\mu$.

6. Examples of Wg function

We present below the values of the Weingarten function computed for the orthogonal group $O_d$. In order to obtain the appropriate results for the symplectic group $Sp_d$ one should replace $d$ by $-d$ in the formulas. These formulae have been obtained directly from the definition of Wg, without the help of formula (28). Observe that the relative cumulants that can be obtained from these values yield the asymptotics predicted by Theorem 3.16, Formula (32).

\begin{align*}
\operatorname{Wg}([1]) &= d^{-1}, \\
\operatorname{Wg}([1, 1]) &= \frac{d + 1}{d(d - 1)(d + 2)}, \\
\operatorname{Wg}([2]) &= \frac{-1}{d(d - 1)(d + 2)}, \\
\operatorname{Wg}([1, 1, 1]) &= \frac{d^2 + 3d - 2}{d(d - 1)(d - 2)(d + 2)(d + 4)}, \\
\operatorname{Wg}([2, 1]) &= \frac{-1}{d(d - 1)(d - 2)(d + 4)}, \\
\operatorname{Wg}([3]) &= \frac{2}{d(d - 1)(d - 2)(d + 2)(d + 4)}, \\
\operatorname{Wg}([4]) &= \frac{-5d - 6}{d(d + 1)(d + 2)(d + 4)(d + 6)(d - 1)(d - 2)(d - 3)}, \\
\operatorname{Wg}([3, 1]) &= \frac{2d + 8}{(d + 1)(d + 2)(d + 4)(d + 6)(d - 1)(d - 2)(d - 3)}, \\
\operatorname{Wg}([2, 2]) &= \frac{d^2 + 5d + 18}{d(d + 1)(d + 2)(d + 4)(d + 6)(d - 1)(d - 2)(d - 3)}, \\
\operatorname{Wg}([2, 1, 1]) &= \frac{-d^3 - 6d^2 - 3d + 6}{d(d + 1)(d + 2)(d + 4)(d + 6)(d - 1)(d - 2)(d - 3)}, \\
\operatorname{Wg}([1, 1, 1, 1]) &= \frac{d^4 + 7d^3 + d^2 - 35d - 6}{d(d + 1)(d + 2)(d + 4)(d + 6)(d - 1)(d - 2)(d - 3)}.
\end{align*}
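The first entries of this list can be reproduced directly from the definition of Wg as the inverse of the Gram matrix $\tilde{\Phi}$, whose entries are $d^{\,n - l(p_1, p_2)}$ by Lemma 3.5. The following is a minimal sketch under stated assumptions, not the authors' computation: the vertices are relabeled $0, \dots, 2n-1$, the pairing $\mathrm{Id}$ is taken to be $\{0,1\}, \{2,3\}, \dots$, a pairing is labeled by the half-sizes of the connected components of its graph with $\mathrm{Id}$, and all function names are hypothetical. Exact rational arithmetic is used so that the comparison with the closed formulas is exact.

```python
from fractions import Fraction


def pairings(elems):
    """Yield every pairing of the list `elems` as a tuple of pairs."""
    if not elems:
        yield ()
        return
    first, rest = elems[0], elems[1:]
    for i, partner in enumerate(rest):
        for tail in pairings(rest[:i] + rest[i + 1:]):
            yield ((first, partner),) + tail


def component_sizes(p1, p2, size):
    """Sizes of the connected components of the graph with edges p1 and p2 (union-find)."""
    parent = list(range(size))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in p1 + p2:
        parent[find(a)] = find(b)
    counts = {}
    for x in range(size):
        root = find(x)
        counts[root] = counts.get(root, 0) + 1
    return sorted(counts.values(), reverse=True)


def invert(matrix):
    """Gauss-Jordan inversion over exact fractions (assumes the matrix is invertible)."""
    n = len(matrix)
    aug = [row[:] + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def weingarten_by_type(n, d):
    """Return {type of p: <Id, Wg p>} computed from the inverse Gram matrix."""
    size = 2 * n
    ps = list(pairings(list(range(size))))
    # Gram entry: d^(number of connected components) = d^(n - l(p, q)).
    gram = [[Fraction(d) ** len(component_sizes(p, q, size)) for q in ps] for p in ps]
    wg = invert(gram)
    identity = tuple((2 * k, 2 * k + 1) for k in range(n))
    row = wg[ps.index(identity)]
    return {
        tuple(s // 2 for s in component_sizes(identity, q, size)): row[j]
        for j, q in enumerate(ps)
    }


d = 5
values = weingarten_by_type(2, d)
assert values[(1, 1)] == Fraction(d + 1, d * (d - 1) * (d + 2))
assert values[(2,)] == Fraction(-1, d * (d - 1) * (d + 2))
```

The matrix has $(2n)!/(2^n n!)$ rows, so this direct inversion is practical only for small $n$; for $d < n$ the Gram matrix may be singular and the sketch does not apply.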

Acknowledgements. B.C. was Allocataire Moniteur at the Ecole Normale Supérieure, Paris while a part of this work was done. He is currently a JSPS postdoctoral fellow. P.Ś. was supported by State Committee for Scientific Research (KBN) grant No. 2 P03A 007 23. Research of P.Ś. was performed during a visit to Ecole Normale Supérieure (Paris) and Institut des Hautes Etudes Scientifiques funded by European Post-Doctoral Institute for Mathematical Sciences.

References

[BB96] Brouwer, P.W., Beenakker, C.W.J.: Diagrammatic method of integration over the unitary group, with applications to quantum transport in mesoscopic systems. J. Math. Phys. 37(10), 4904–4934 (1996)
[BH03] Brézin, E., Hikami, S.: An extension of the Harish Chandra-Itzykson-Zuber integral. Commun. Math. Phys. 235(1), 125–137 (2003)
[Bia97] Biane, P.: Some properties of crossings and partitions. Discrete Math. 175(1–3), 41–53 (1997)
[Bia98] Biane, P.: Representations of symmetric groups and free probability. Adv. Math. 138(1), 126–181 (1998)
[BMS00] Bousquet-Mélou, M., Schaeffer, G.: Enumeration of planar constellations. Adv. in Appl. Math. 24(4), 337–368 (2000)
[Bra37] Brauer, R.: On algebras which are connected with the semisimple continuous groups. Ann. Math. 38, 857–872 (1937)
[BW89] Birman, J.S., Wenzl, H.: Braids, link polynomials and a new algebra. Trans. Amer. Math. Soc. 313(1), 249–273 (1989)
[Col03] Collins, B.: Moments and cumulants of polynomial random variables on unitary groups, the Itzykson-Zuber integral, and free probability. Int. Math. Res. Not. 17, 953–982 (2003)
[Ful97] Fulton, W.: Young tableaux. Volume 35 of London Mathematical Society Student Texts. Cambridge: Cambridge University Press, 1997
[Gor02] Gorin, T.: Integrals of monomials over the orthogonal group. J. Math. Phys. 43(6), 3342–3351 (2002)
[Gro99] Grood, C.: Brauer algebras and centralizer algebras for SO(2n,C). J. Algebra 222(2), 678–707 (1999)
[Wei78] Weingarten, D.: Asymptotic behavior of group integrals in the limit of infinite rank. J. Math. Phys. 19(5), 999–1001 (1978)
[Wen88] Wenzl, H.: On the structure of Brauer's centralizer algebras. Ann. of Math. (2) 128(1), 173–193 (1988)
[Wey39] Weyl, H.: The Classical Groups. Their Invariants and Representations. Princeton, NJ: Princeton University Press, 1939

Communicated by Y. Kawahigashi

Commun. Math. Phys. 264, 797–810 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1555-2

The Expected Area of the Filled Planar Brownian Loop is π/5

Christophe Garban 1,2, José A. Trujillo Ferreras 1

1 Department of Mathematics, Cornell University, Ithaca, NY 14853-4201, USA. E-mail: [email protected]
2 Ecole Normale Superieure, 45 rue d’ulm, 75230 Paris Cedex 05, France. E-mail: [email protected]

Received: 24 May 2005 / Accepted: 5 December 2005 Published online: 15 March 2006 – © Springer-Verlag 2006

Abstract: Let $B_t$, $0 \le t \le 1$, be a planar Brownian loop (a Brownian motion conditioned so that $B_0 = B_1$). We consider the compact hull obtained by filling in all the holes, i.e. the complement of the unique unbounded component of $\mathbb{C} \setminus B[0, 1]$. We show that the expected area of this hull is π/5. The proof uses, perhaps not surprisingly, the Schramm-Loewner Evolution (SLE). As a consequence of this result, using Yor’s formula [17] for the law of the index of a Brownian loop, we find that the expected area of the region inside the loop having index zero is π/30; this value could not be obtained directly using Yor’s index description.

1. Introduction

The main result of the present paper goes as follows: Let $B$ denote a Brownian loop in $\mathbb{C}$ of time duration 1. There are various equivalent ways to define it. One can view it as a Brownian path $(B_t, 0 \le t \le 1)$ appropriately conditioned to be back at its starting point at time 1. One can also write $B_t = W_t - tW_1$, where $W$ is just a standard Brownian motion in $\mathbb{C}$. Then, $\mathbb{C} \setminus B[0, 1]$, i.e. the complement of the path, has a unique infinite connected component $H$. The hull $T$ generated by the Brownian loop is by definition equal to $\mathbb{C} \setminus H$. This is the set obtained by filling in the holes in the loop. Let $A$ be the random variable whose value is the area of $T$. Then:

Theorem 1.1. The expected value of A is π/5.

Our result gives interesting information regarding the Brownian loop soups introduced in [4]. This conformally invariant object plays an important role in the understanding and description of SLE curves (see, e.g. [4, 14, 5]). It can be viewed as a Poissonian cloud (of intensity c) of filled Brownian loops in subdomains of the plane. Among other things, it is announced in [15] that the dimension of the set of points in the complement of the loop soup (i.e. the points that are in the inside of no loop) can be shown to be equal to 2 − c/5, using consequences of the restriction property. A detailed proof of this


Fig. 1. Random walk loop of 50000 steps and corresponding hull

statement has never been published, and in fact, our result implies the corresponding first moment estimate (i.e. the mean number of balls of radius ε needed to cover the set). The other arguments needed to derive the result announced in [15] will be detailed in [9]. Another consequence of our result concerns the direct relation between different measures on self-avoiding loops in the plane defined as outer boundaries of planar Brownian loops. See [16]. In the abundant existing literature about planar Brownian motion, there are certainly results dealing with the question of area. Paul Lévy’s stochastic area formula describing the algebraic area “swept” by a Brownian motion will likely come to the mind of many readers. Our result, however, is very different from this classical theorem, firstly because Lévy’s area is a signed area, but mainly because of the following: in order to apprehend Lévy’s area it is enough to follow the Brownian curve locally without paying attention to the rest of the curve. In our case, one needs to consider the curve globally. Also, Yor [17] has been able to give an explicit formula for the law of the index of a Brownian loop around a fixed point z. Yor’s proof relies on the fact that the index can be obtained via a stochastic integral along the loop. Let us explain how this result is related to ours. A point with a non-zero index has to be inside the loop. Using this fact, it is almost possible to describe the probability that a given point is inside the loop, modulo the problem of the zero index; indeed, there are some regions inside the Brownian loop around which the loop has an index equal to zero. In the last section of our paper, we combine Theorem 1.1 with the law of the index given by Yor, to find that the expected area of the set of points inside the hull that have index zero is π/30. The expected areas of the regions of index n ∈ Z \ {0} can be directly obtained by integrating Yor’s formula with respect to z. 
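Theorem 1.1 can be sanity-checked by simulation, in the spirit of the random walk loop of Fig. 1. The following is a minimal sketch and not the authors' procedure: it assumes a Gaussian random-walk discretization of $B_t = W_t - tW_1$ with independent standard coordinates, and it estimates the hull area by rasterizing the path on a square grid and flood-filling the outside. The function names, the grid mesh `h` and the sample sizes are all illustrative choices.

```python
import math
import random
from collections import deque


def brownian_loop(n_steps, rng):
    """Discretized planar loop B_t = W_t - t*W_1 on [0, 1], as a list of (x, y)."""
    sigma = math.sqrt(1.0 / n_steps)  # per-coordinate std of one increment
    xs, ys = [0.0], [0.0]
    for _ in range(n_steps):
        xs.append(xs[-1] + rng.gauss(0.0, sigma))
        ys.append(ys[-1] + rng.gauss(0.0, sigma))
    x1, y1 = xs[-1], ys[-1]
    return [(x - (k / n_steps) * x1, y - (k / n_steps) * y1)
            for k, (x, y) in enumerate(zip(xs, ys))]


def hull_area(path, h):
    """Grid approximation (mesh h) of the area of the filled hull of a closed path."""
    cells = set()
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        # Sample each segment at spacing <= h/2 so consecutive cells are 8-adjacent.
        n_sub = max(1, math.ceil(math.hypot(x1 - x0, y1 - y0) / (0.5 * h)))
        for s in range(n_sub + 1):
            x = x0 + (x1 - x0) * s / n_sub
            y = y0 + (y1 - y0) * s / n_sub
            cells.add((math.floor(x / h), math.floor(y / h)))
    i_min = min(i for i, _ in cells) - 1
    i_max = max(i for i, _ in cells) + 1
    j_min = min(j for _, j in cells) - 1
    j_max = max(j for _, j in cells) + 1
    # Flood-fill the unbounded component from a corner of the padded bounding box.
    # 4-connectivity cannot leak through an 8-connected barrier of path cells.
    outside = {(i_min, j_min)}
    queue = deque(outside)
    while queue:
        i, j = queue.popleft()
        for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if (i_min <= ni <= i_max and j_min <= nj <= j_max
                    and (ni, nj) not in cells and (ni, nj) not in outside):
                outside.add((ni, nj))
                queue.append((ni, nj))
    total = (i_max - i_min + 1) * (j_max - j_min + 1)
    return (total - len(outside)) * h * h


def estimate_expected_area(n_loops=200, n_steps=20000, h=0.005, seed=0):
    """Monte Carlo average of the hull area; to be compared with pi/5."""
    rng = random.Random(seed)
    areas = (hull_area(brownian_loop(n_steps, rng), h) for _ in range(n_loops))
    return sum(areas) / n_loops
```

Calling `estimate_expected_area()` gives an estimate to compare with π/5 ≈ 0.628. The rasterization counts the path cells themselves and the walk misses fine structure of the loop, so agreement is only expected up to Monte Carlo and discretization error.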
In [2], using physics methods, Comtet, Desbois and Ouvry obtained the values of the expected areas for the non-zero index regions by different techniques, and they pointed out the different nature of the n = 0 sector (the points in the plane of zero index) and emphasized that “it would be interesting to distinguish in the n = 0 sector, curves which do not enclose the origin from curves which do enclose the origin but an equal number


of times clockwise and anticlockwise” but they argue that the 0-case cannot be treated within the scope of their analysis. From a probabilistic viewpoint, it also appears that usual techniques for Brownian motion are not strong enough to obtain the expected area of the Brownian hull or the expected area of the 0-index region inside the Brownian hull. Let us briefly explain why. Basically, the enclosed area depends only on the boundary of the hull generated by the Brownian loop. The frontier of the Brownian loop concerns only a small subset of the time duration [0, 1]. In some sense, on certain time-intervals, the enclosed area does not depend much on the behavior of the Brownian motion. So, this problem needs a good description of the frontier of a Brownian loop. Recently, Lawler, Schramm and Werner proved a conjecture of Mandelbrot that the Hausdorff dimension of the Brownian frontier is 4/3. For this purpose they used the value of intersection exponents computed with the help of SLE curves, see for instance [6] and references therein. The description of the Brownian frontier via SLE can be done in a slightly different way using the conformal-restriction point of view, see [5]. We will use this approach, and so will present to the reader the facts needed about conformal restriction measures in the next section. Our paper gives another striking example of a simple result concerning planar Brownian motion that seemed out of reach using the usual stochastic calculus approach, but that can be derived using conformal invariance and SLE. For a thorough account on SLE processes, see [3, 13]. Let us also finally mention another related result from the Physics literature. In [1], using methods of conformal field theory, Cardy has shown that the ratio of the expected area enclosed by a self-avoiding polygon of perimeter 2n to the expected squared radius of gyration for a polygon of perimeter 2n converges as n goes to infinity to 4π/5. 
We note that self-avoiding polygons are supposed to have the same asymptotic shape as filled Brownian loops (see, for example, [8] and references therein). However, studying this relationship is hard, basically for the following reason. The boundary of the Brownian loop is of SLE$_{8/3}$-type, but, unfortunately, there does not exist a good way of “talking about the length” of SLE curves at this moment. A rigorous analysis of the squared radius of gyration seems currently out of reach. However, the universality of the ratio of the expected area to the expected squared radius of gyration for loops has been widely explored in the theoretical physics literature. Combining Cardy’s result with our result about the expected area of the Brownian loop, and considering the above described universality, one could think that the expected squared radius of gyration for the boundary of a simple random walk of length 2n in the plane behaves like $\frac{1}{4} n$. In fact, we ran some numerical simulations and this seems to be indeed the case.

2. Preliminaries

Conformal restriction measures in $\mathbb{H}$ are measures supported on the set of closed subsets $K$ of $\mathbb{H}$ such that $K \cap \mathbb{R} = \{0\}$, $K$ is unbounded and $\mathbb{H} \setminus K$ has two infinite connected components, that satisfy the conformal restriction property: for all simply connected domains $H \subset \mathbb{H}$ such that $\mathbb{H} \setminus H$ is bounded and bounded away from the origin, the law of $K$ conditioned on $K \subset H$ is the law of $\Phi(K)$, where $\Phi$ is any conformal transformation from $\mathbb{H}$ to $H$ preserving 0 and ∞ (this law does not depend on the choice of $\Phi$). It is proved in [5] that there is only a one-real-parameter family of such restriction measures, $P_\alpha$, where α ≥ 5/8. These measures are uniquely described by the following property: for all closed $A$ in $\mathbb{H}$ bounded and bounded away from 0,

$$P_\alpha[K \cap A = \emptyset] = \Phi_A'(0)^{\alpha}, \tag{2.1}$$

where $\Phi_A$ is a conformal transformation from $\mathbb{H} \setminus A$ onto $\mathbb{H}$ such that $\Phi_A(z)/z \to 1$ when $z \to \infty$. To ease the notation for the rest of the paper, whenever we write $\Phi_A$ we will be assuming that we have chosen the translate with the additional property $\Phi_A(0) = 0$. $P_{5/8}$ is the law of chordal SLE$_{8/3}$, and $P_1$ can be constructed by filling the closed loops of a Brownian excursion in $\mathbb{H}$ (Brownian motion started at 0 conditioned to stay in $\mathbb{H}$). An important property of these conformal restriction measures is that using two independent restriction measures $P_{\alpha_1}$ and $P_{\alpha_2}$, we can construct $P_{\alpha_1+\alpha_2}$ by filling the “inside” of the union of $K_1$ and $K_2$. This “additivity” property and the construction of $P_{5/8}$ and $P_1$ give the good description of the Brownian motion in terms of SLE curves, namely, 8 SLE$_{8/3}$ give the same hull as 5 Brownian excursions.

Since we want to describe the boundary of loops of time duration 1, we will first create loops with the use of the infinite hulls described above. Restriction measures are conformally invariant (Brownian excursion, SLE$_{8/3}$, ...), so we had better use conformal maps. There is obviously no conformal equivalence which sends both ∞ and 0 to 0, so the natural idea is to consider a Möbius transformation preserving $\mathbb{H}$ which maps 0 to 0, and ∞ to ε. We can choose

$$m_\varepsilon(z) = \frac{\varepsilon z}{z+1}, \qquad m_\varepsilon^{-1}(z) = \frac{z}{\varepsilon - z}.$$

The limit when ε goes to zero of the measures $m_\varepsilon(P_1)$ is the Dirac measure at {0}. The good renormalization to keep something interesting is in $\varepsilon^2$. Hence, we define the Brownian bubble measure in $\mathbb{H}$ as:

$$\mu^{bub} = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon^2}\, m_\varepsilon(P_1).$$

This measure was introduced in [5], and it is an important tool for studying the link between SLE curves and the Brownian loop soup (see [4]). It was already noted in [5, 7], as an easy consequence of the “additivity” property described above, that

$$\frac{5}{8}\,\mu^{bub} = \frac{5}{8} \lim_{\varepsilon \to 0} \frac{1}{\varepsilon^2}\, m_\varepsilon(P_1) = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon^2}\, m_\varepsilon(P_{5/8}).$$

The last measure can be seen as an infinite measure on “SLE$_{8/3}$ loops”; let us call this measure $\mu^{sle}$. Recall that we are interested in a Brownian loop of time duration 1. We have the following time decomposition for $\mu^{bub}$ (see [4, 3]):

$$\mu^{bub} = \int_0^\infty \frac{dt}{2t^2}\, P_t^{br} \times P_t^{exc}, \tag{2.2}$$

where $P_t^{br}$ is the law of a one-dimensional Brownian bridge of time duration $t$, and $P_t^{exc}$ is the law of an Itô Brownian excursion re-normalized to have time $t$. $P_t^{br} \times P_t^{exc}$ is the law of an $\mathbb{H}$-Brownian bridge of time duration $t$, by considering the one-dimensional bridge as the x coordinate of the curve, and the excursion as the y coordinate. Unfortunately, it is hard to compute fixed-time quantities with SLE techniques. Thus, we will compute a “geometric quantity” using SLE$_{8/3}$, and then extract $E(A)$ from this geometric value by using the relation $\mu^{bub} = \frac{8}{5}\mu^{sle}$ and the decomposition (2.2).

Let us explain in a few words why we need to deal with Brownian bridges in $\mathbb{H}$ and cannot work directly with bridges in $\mathbb{C}$. The underlying idea is the fact that one needs to choose a starting point on the boundary of the Brownian loop for the SLE loop representation. A natural choice is the (almost surely) unique lowest point; this is why we are interested in $\mathbb{H}$ quantities. So let $A_{\mathbb{H}}$ be the random variable giving the area of an $\mathbb{H}$-Brownian bridge of time duration one. Working with $A_{\mathbb{H}}$ will turn out not to be a problem since, as the reader might already suspect, the random variables $A$ and $A_{\mathbb{H}}$ have the same law.

For the geometric quantity, we could choose to compute $\int A(\gamma)\, d\mu^{sle}$, where $A(\gamma)$ is the area enclosed in $\mathbb{H}$ by the “curve” γ, but this integral is infinite. Let $\gamma^*$ be the radius of the curve γ, that is, $\gamma^* = \sup_{0 \le t \le t_\gamma} |\gamma(t)|$. We may consider the “expected” area under the law $\mu^{sle}$ “conditioned” on $\gamma^* = 1$. Here, $\mu^{sle}$ is not a probability measure so the term “expected value” is not correct, and the conditioning is on a set of $\mu^{sle}$-measure equal to 0. The following definition will be sufficient for our purposes:

$$\mu^{sle}(A \mid \gamma^* = 1) = \lim_{\delta \downarrow 0} \frac{\int A(\gamma)\, 1_{\{\gamma^* \in [1, 1+\delta)\}}\, d\mu^{sle}}{\mu^{sle}\{\gamma^* \in [1, 1+\delta)\}}. \tag{2.3}$$

Using $\mu^{sle} = \frac{5}{8}\mu^{bub}$, we can write in the same way:

$$\mu^{sle}(A \mid \gamma^* = 1) = \lim_{\delta \downarrow 0} \frac{\int A(\gamma)\, 1_{\{\gamma^* \in [1, 1+\delta)\}}\, d\mu^{bub}}{\mu^{bub}\{\gamma^* \in [1, 1+\delta)\}}. \tag{2.4}$$

Thus, $\mu^{sle}(A \mid \gamma^* = 1)$ represents at the same time the “expected” area of an SLE$_{8/3}$ loop conditioned to touch the half circle of radius one and the expected area of a Brownian bubble with the same conditioning. With the use of the restriction property for SLE$_{8/3}$, we will be able to compute $\mu^{sle}(A \mid \gamma^* = 1)$ in the last section. Before that, in the coming section, we will find the relationship between $E(A)$ and $\mu^{sle}(A \mid \gamma^* = 1)$.

3. Extraction of $E(A)$ from $\mu^{sle}(A \mid \gamma^* = 1)$

In this section we will prove the following

Lemma 3.1. The expected value of $A$ is equal to $2\mu^{sle}(A \mid \gamma^* = 1)$.

Proof. First of all, by using the definition of $\mu^{bub}$ in terms of $\lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon^2} m_\varepsilon(P_1)$ and the restriction property of $P_1$, it is easy to show that $\mu^{bub}\{\gamma^* \ge r\} = \frac{1}{r^2}$. (The definition of $\mu^{bub}$ just mentioned shows that the crucial quantity to compute is the probability that an $\mathbb{H}$-Brownian excursion hits a half-ball of radius ε/r centered about −1; the restriction property tells us that to evaluate this, one just needs to differentiate the appropriate map at 0. An analogous computation for the case of SLE$_{8/3}$, or equivalently $P_{5/8}$, is carried out in more detail in Lemma 4.1.) Hence,

$$\mu^{bub}\{\gamma^* \in [1, 1+\delta)\} = 1 - \frac{1}{(1+\delta)^2} = 2\delta + O(\delta^2).$$
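The step described as easy to show can be spelled out as follows. This is only a sketch under the approximation used later in Section 4: the image of $\{|z| \ge r\}$ under $m_\varepsilon^{-1}$ is replaced, to leading order in ε, by the half-disk $A_\rho$ of radius $\rho = \varepsilon/r$ centered at −1, whose normalized map is $\Phi_\rho(z) = z + \rho^2/(z+1) - \rho^2$. The symbols ρ, $A_\rho$ and $\Phi_\rho$ are introduced here for illustration only.

```latex
\begin{align*}
\Phi_\rho'(0) &= 1 - \rho^2, \\
P_1[K \cap A_\rho \neq \emptyset] &= 1 - \Phi_\rho'(0) = \rho^2 = \frac{\varepsilon^2}{r^2}
  && \text{(by (2.1) with } \alpha = 1\text{)}, \\
\mu^{bub}\{\gamma^* \ge r\} &= \lim_{\varepsilon \to 0} \frac{1}{\varepsilon^2}\cdot\frac{\varepsilon^2}{r^2} = \frac{1}{r^2}, \\
\mu^{bub}\{\gamma^* \in [1, 1+\delta)\} &= 1 - \frac{1}{(1+\delta)^2} = 2\delta - 3\delta^2 + O(\delta^3).
\end{align*}
```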

On the other hand, recalling (2.2) and letting $\mu_t = P_t^{br} \times P_t^{exc}$, we have

$$\begin{aligned}
\int A(\gamma)\, 1_{\{\gamma^* \in [1, 1+\delta)\}}\, d\mu^{bub}
&= \int_0^\infty \frac{1}{2t^2} \int A(\gamma)\, 1_{\{\gamma^* \in [1, 1+\delta)\}}\, d\mu_t\, dt \\
&= \int_0^\infty \frac{1}{2t^2} \int t\, A(\gamma)\, 1_{\{\gamma^* \in [\frac{1}{\sqrt{t}}, \frac{1+\delta}{\sqrt{t}})\}}\, d\mu_1\, dt && \text{(Brownian scaling)} \\
&= \int A(\gamma) \int_0^\infty \frac{1}{2t}\, 1_{\{t \in [\frac{1}{\gamma^{*2}}, \frac{(1+\delta)^2}{\gamma^{*2}})\}}\, dt\, d\mu_1 && \text{(Fubini)} \\
&= \int A(\gamma) \log(1+\delta)\, d\mu_1 = E(A_{\mathbb{H}})\,(\delta + O(\delta^2)).
\end{aligned}$$

Thus, from (2.4),

$$\mu^{sle}(A \mid \gamma^* = 1) = \lim_{\delta \downarrow 0} \frac{\int A(\gamma)\, 1_{\{\gamma^* \in [1, 1+\delta)\}}\, d\mu^{bub}}{\mu^{bub}\{\gamma^* \in [1, 1+\delta)\}} = \frac{1}{2} E(A_{\mathbb{H}}).$$

Hence, the proof of the lemma will be concluded as soon as we establish $E(A_{\mathbb{H}}) = E(A)$. There is an (almost sure) one-to-one correspondence between $\mathbb{C}$-Brownian bridges and $\mathbb{H}$-Brownian bridges. The idea is to start the Brownian loop from its lowest point. More precisely, if $B_t$, $0 \le t \le 1$, is a Brownian bridge in $\mathbb{C}$, with probability one there is a unique $\bar{t} \in [0, 1]$ such that $\mathrm{Im}(B_{\bar t}) \le \mathrm{Im}(B_t)$ for all $t \in [0, 1]$. We associate to the Brownian bridge $B_t$ the process $(Z_t)_{0 \le t \le 1}$ in $\mathbb{H}$, defined by this simple space-time translation:

$$Z_t = \begin{cases} B_{\bar t + t} - B_{\bar t}, & 0 \le t \le 1 - \bar t, \\ B_{\bar t + t - 1} - B_{\bar t}, & 1 - \bar t \le t \le 1. \end{cases} \tag{3.1}$$

Now, we have to identify the law of $Z_t$ with $P_1^{exc} \times P_1^{br}$. The real and imaginary parts of $B_t$ are two independent one-dimensional Brownian bridges. The law of the random variable $\bar t$ is independent of $\mathrm{Re}(B_t)$, so in the space-time change (3.1), $\mathrm{Re}(Z_t)$ is still a one-dimensional bridge independent of the imaginary part of $Z_t$. $\mathrm{Im}(Z_t)$ has the law of a one-dimensional Brownian bridge viewed from its (almost sure) unique lowest point. By the Vervaat Theorem (see [11]), this gives the law of an Itô excursion renormalized to have time one. Thus $Z_t$ has the law of an $\mathbb{H}$-Brownian bridge of time one. Our space-time transformation obviously preserves the area, hence $E(A_{\mathbb{H}}) = E(A)$.

4. Computation of $\mu^{sle}(A \mid \gamma^* = 1)$

In this section we compute $\mu^{sle}(A \mid \gamma^* = 1)$. Combined with Lemma 3.1, this will conclude the proof of Theorem 1.1.

Lemma 4.1. $\mu^{sle}(A \mid \gamma^* = 1)$ is equal to $\frac{\pi}{10}$.


The proof of this lemma provides a good example of the use of standard techniques for SLE$_{8/3}$. We have chosen to leave out some algebraic details in order to allow the reader to focus on the main ideas. We first state a result that we will use extensively. This result is due to Schramm [10], who gave a general formula covering the values κ ∈ [0, 8). For simplicity we will only state his result for κ = 8/3, which is all that we need. Let γ be chordal SLE$_{8/3}$ in the upper half-plane $\mathbb{H}$, and let $z = re^{i\theta}$ be a point in $\mathbb{H}$. Then [10],

$$P\{z \text{ is to the right of } \gamma[0, \infty)\} = \tfrac{1}{2} + \tfrac{1}{2}\cos(\theta). \tag{4.1}$$

Proof. Recall (2.3):

$$\mu^{sle}(A \mid \gamma^* = 1) = \lim_{\delta \downarrow 0} \frac{\int A(\gamma)\, 1_{\{\gamma^* \in [1, 1+\delta)\}}\, d\mu^{sle}}{\mu^{sle}\{\gamma^* \in [1, 1+\delta)\}}. \tag{4.2}$$

By using the definition $\mu^{sle} = \lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon^2} m_\varepsilon(P_{5/8})$, we can rewrite (4.2) as:

$$\lim_{\delta \downarrow 0} \lim_{\varepsilon \downarrow 0} E_\varepsilon\big(A(\gamma) \mid \gamma^* \in [1, 1+\delta)\big), \tag{4.3}$$

where $E_\varepsilon$ is a more appealing notation for the expected value under the law of $m_\varepsilon(P_{5/8})$ (this law, in simpler words, is the law of a chordal SLE$_{8/3}$ in $\mathbb{H}$ from 0 to ε). Recall that $A(\gamma)$ is the area of the bounded set in $\mathbb{H}$ enclosed by the curve γ. $A(\gamma)$ can be written as $\int_{\mathbb{H}} 1_{\{z \text{ inside}\}}\, dA(z)$, where $\{z \text{ inside}\}$ means that $z$ is in the component bounded by γ. Thus (4.3) can be written as:

$$\lim_{\delta \downarrow 0} \lim_{\varepsilon \downarrow 0} E_\varepsilon\Big( \int_{(1+\delta) D_+} 1_{\{z \text{ inside}\}}\, dA(z) \,\Big|\, \gamma^* \in [1, 1+\delta) \Big), \tag{4.4}$$

where $D_+$ is $D \cap \mathbb{H}$. Since everything is nicely bounded, we can interchange the limits and the integral. This gives us:

$$\mu^{sle}(A \mid \gamma^* = 1) = \int_{D_+} \lim_{\delta \downarrow 0} \lim_{\varepsilon \downarrow 0} P_\varepsilon\{z \text{ inside} \mid \gamma^* \in [1, 1+\delta)\}\, dA(z). \tag{4.5}$$

Therefore, what remains to be done is to compute, for a fixed $z$, the “probability” that this $z$ is inside an “SLE$_{8/3}$ loop” conditioned to have radius exactly 1. So let us fix $z_0$ in $D_+$. Let $D_\varepsilon$ (resp. $D_\varepsilon^\delta$) denote the image under $m_\varepsilon^{-1}(z) = z/(\varepsilon - z)$ of the set $\{z \in \mathbb{H} : |z| \ge 1\}$ (resp. $\{z \in \mathbb{H} : |z| \ge 1 + \delta\}$).

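We warn the reader that γ will denote two different kinds of curves in $\mathbb{H}$: a curve from 0 to ∞, or a curve from 0 to ε. Let $F_\varepsilon$ be the event $\{\gamma[0, \infty) \cap D_\varepsilon = \emptyset\}$, and, similarly, let $F_\varepsilon^\delta$ be the analogous event for $D_\varepsilon^\delta$. Then,

$$P_\varepsilon\{z_0 \text{ inside} \mid \gamma^* \in [1, 1+\delta)\} = P_{5/8}\{m_\varepsilon^{-1}(z_0) \text{ is to the right of } \gamma \mid (F_\varepsilon)^c \cap F_\varepsilon^\delta\}.$$

Recall that $P_{5/8}$ is the law of a chordal SLE$_{8/3}$ from 0 to ∞ in $\mathbb{H}$; henceforth, we will simply call it $P$. In order to make the formulas more concise we will denote the event $\{z \text{ is to the right of } \gamma\}$ by $R(z)$. Then,

$$P\{R(m_\varepsilon^{-1}(z_0)) \mid (F_\varepsilon)^c \cap F_\varepsilon^\delta\} = \frac{P\{R(m_\varepsilon^{-1}(z_0)) \mid F_\varepsilon^\delta\}\, P\{F_\varepsilon^\delta\} - P\{R(m_\varepsilon^{-1}(z_0)) \mid F_\varepsilon\}\, P\{F_\varepsilon\}}{P\{F_\varepsilon^\delta\} - P\{F_\varepsilon\}}. \tag{4.6}$$

The reason for this last step is that now all the probabilities involved can be computed using the restriction property for SLE$_{8/3}$, and a simple formula, see (4.1), for the probability that a point is to the right of an SLE$_{8/3}$ path from 0 to ∞ in $\mathbb{H}$. This requires (cf. Sect. 2) knowing the unique conformal map $\Phi_\varepsilon = \Phi_{D_\varepsilon}$ from $\mathbb{H} \setminus D_\varepsilon$ onto $\mathbb{H}$, with $\Phi_\varepsilon(0) = 0$, $\Phi_\varepsilon(\infty) = \infty$ and $\Phi_\varepsilon'(\infty) = 1$ (with a similar statement for $D_\varepsilon^\delta$ and the map $\Phi_\varepsilon^\delta$). Thus by restriction, the law of the chordal SLE$_{8/3}$ in $\mathbb{H}$ conditioned not to touch $D_\varepsilon$ is the inverse image of the chordal SLE in $\mathbb{H}$ by $\Phi_\varepsilon$. This implies for the quantities we need to compute:

$$P\{R(m_\varepsilon^{-1}(z_0)) \mid F_\varepsilon\} = P\{R(\Phi_\varepsilon(m_\varepsilon^{-1}(z_0)))\}, \qquad P\{R(m_\varepsilon^{-1}(z_0)) \mid F_\varepsilon^\delta\} = P\{R(\Phi_\varepsilon^\delta(m_\varepsilon^{-1}(z_0)))\}.$$

Note that $m_\varepsilon^{-1}$ is a Möbius transformation, which maps ∞ to −1. Therefore, $D_\varepsilon$ and $D_\varepsilon^\delta$ are half disks whose centers are very close to −1. The fact that they are not exactly centered at −1 is due to the lack of symmetry in the problem: an SLE from 0 to ε in a half disk $D_+$ centered at 0. Nevertheless, for the computation of $\Phi_\varepsilon(z)$ and $\Phi_\varepsilon^\delta(z)$, we can think of $D_\varepsilon$ and $D_\varepsilon^\delta$ as two half disks centered at −1 with radii respectively ε and (1 − δ)ε. If we carried out the computations with the actual disks (straightforward but tedious), we would see that our approximation is of order $O(\varepsilon^2 + \varepsilon^2\delta^2/|z+1| + \varepsilon^4/|z+1|^2)$ when $z$ goes to −1. In this way, we have

$$\Phi_\varepsilon(z) = z - \varepsilon^2 + \frac{\varepsilon^2}{z+1} + O\Big(\varepsilon^2 + \frac{\varepsilon^4}{|z+1|^2}\Big),$$

$$\Phi_\varepsilon^\delta(z) = z - \varepsilon^2(1-\delta)^2 + \frac{\varepsilon^2(1-\delta)^2}{z+1} + O\Big(\varepsilon^2 + \frac{\varepsilon^2\delta^2}{|z+1|} + \frac{\varepsilon^4}{|z+1|^2}\Big).$$

We now have to evaluate these functions at the point $m_\varepsilon^{-1}(z_0) = z_0/(\varepsilon - z_0) = -1 - \frac{\varepsilon}{z_0} + O(\varepsilon^2)$ (recall $z_0$ is fixed). The approximations $O(\varepsilon^4/|z+1|^2)$ and $O(\varepsilon^2\delta^2/|z+1|)$ at the point $m_\varepsilon^{-1}(z_0)$ are of order $O(\varepsilon^2)$ and $O(\varepsilon\delta^2)$, respectively; this gives us:

$$\begin{aligned}
\Phi_\varepsilon^\delta(m_\varepsilon^{-1}(z_0)) &= -1 - \frac{\varepsilon}{z_0} + \frac{\varepsilon^2(1-\delta)^2}{-\varepsilon/z_0 + O(\varepsilon^2)} + O(\varepsilon\delta^2 + \varepsilon^2) \\
&= -1 - \varepsilon\Big(z_0 + \frac{1}{z_0}\Big) + 2\varepsilon\delta z_0 + O(\varepsilon\delta^2 + \varepsilon^2).
\end{aligned}$$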


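Using the Taylor series for the logarithm, and then taking the imaginary part, we see that

$$\arg \Phi_\varepsilon^\delta(m_\varepsilon^{-1}(z_0)) = \pi + \varepsilon\, \mathrm{Im}\Big(z_0 + \frac{1}{z_0}\Big) - 2\varepsilon\delta\, \mathrm{Im}(z_0) + O(\varepsilon\delta^2 + \varepsilon^2).$$

Now, using Schramm's formula (4.1) and the Taylor series for cosine, we see that

$$P\{R(\Phi_\varepsilon^\delta(m_\varepsilon^{-1}(z_0)))\} = \frac{\varepsilon^2}{4}\Big[\Big(\mathrm{Im}\Big(z_0 + \frac{1}{z_0}\Big)\Big)^2 - 4\delta\, \mathrm{Im}\Big(z_0 + \frac{1}{z_0}\Big)\mathrm{Im}(z_0)\Big] + O(\varepsilon^2\delta^2 + \varepsilon^3). \tag{4.7}$$

In particular, if we set δ = 0 we obtain

$$P\{R(\Phi_\varepsilon(m_\varepsilon^{-1}(z_0)))\} = \frac{\varepsilon^2}{4}\Big(\mathrm{Im}\Big(z_0 + \frac{1}{z_0}\Big)\Big)^2 + O(\varepsilon^3). \tag{4.8}$$

Also, by (2.1), we have (our approximation does not significantly change the derivative at 0, which is far away from small disks centered at −1):

$$\begin{aligned}
P\{F_\varepsilon^\delta\} = P_{5/8}\{\gamma[0, \infty) \cap D_\varepsilon^\delta = \emptyset\} &= (\Phi_\varepsilon^\delta)'(0)^{5/8} \\
&= \big(1 - \varepsilon^2(1 - 2\delta + O(\delta^2))\big)^{5/8} + O(\varepsilon^3) \\
&= 1 - \frac{5}{8}\varepsilon^2 + \frac{5}{4}\varepsilon^2\delta + O(\varepsilon^2\delta^2 + \varepsilon^3).
\end{aligned}$$

Similarly, $P\{F_\varepsilon\} = 1 - \frac{5}{8}\varepsilon^2 + O(\varepsilon^3)$, which gives

$$P\{F_\varepsilon^\delta\} - P\{F_\varepsilon\} = \frac{5}{4}\varepsilon^2\delta + O(\varepsilon^2\delta^2 + \varepsilon^3). \tag{4.9}$$

Hence, by combining this last expression, (4.6), (4.7), (4.8) and using the fact that both $P\{F_\varepsilon\}$ and $P\{F_\varepsilon^\delta\}$ are $1 + O(\varepsilon^2)$, we obtain:

$$\begin{aligned}
\lim_{\delta \downarrow 0} \lim_{\varepsilon \downarrow 0} P_\varepsilon\{z_0 \text{ inside} \mid \gamma^* \in [1, 1+\delta)\}
&= \lim_{\delta \downarrow 0} \lim_{\varepsilon \downarrow 0} \frac{\frac{\varepsilon^2}{4}\big(-4\delta\, \mathrm{Im}(z_0 + \frac{1}{z_0})\mathrm{Im}(z_0)\big) + \varepsilon^2 O(\delta^2) + O(\varepsilon^3)}{\frac{5}{4}\varepsilon^2\delta + O(\varepsilon^2\delta^2 + \varepsilon^3)} \\
&= -\frac{4}{5}\, \mathrm{Im}\Big(z_0 + \frac{1}{z_0}\Big)\mathrm{Im}(z_0).
\end{aligned}$$

Therefore, by (4.5), and using polar coordinates to evaluate the integral, we get:

$$\mu^{sle}(A \mid \gamma^* = 1) = \int_{D_+} -\frac{4}{5}\, \mathrm{Im}\Big(z + \frac{1}{z}\Big)\mathrm{Im}(z)\, dA(z) = \frac{\pi}{10}.$$

This concludes the proof of the lemma.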
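The final polar-coordinate integral can be checked numerically. The sketch below assumes only the limiting integrand $-\frac{4}{5}\mathrm{Im}(z + 1/z)\mathrm{Im}(z)$ obtained above and applies a plain midpoint rule on the half disk $D_+$. The function names and grid sizes are illustrative.

```python
import math


def conditional_inside_probability(z):
    """Limit appearing in (4.5): -(4/5) Im(z + 1/z) Im(z) for z in the upper half disk."""
    return -0.8 * (z + 1 / z).imag * z.imag


def mu_sle_conditional_area(n_r=400, n_theta=400):
    """Midpoint rule in polar coordinates over D+ = {|z| < 1, Im z > 0}."""
    dr = 1.0 / n_r
    dtheta = math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            z = complex(r * math.cos(theta), r * math.sin(theta))
            # dA(z) = r dr dtheta in polar coordinates
            total += conditional_inside_probability(z) * r * dr * dtheta
    return total
```

In polar coordinates the integrand is $\frac{4}{5}(1 - r^2)\sin^2\theta$, so the exact value is $\frac{4}{5} \cdot \frac{\pi}{2} \cdot \frac{1}{4} = \frac{\pi}{10}$, and by Lemma 3.1 this gives $E(A) = 2 \cdot \frac{\pi}{10} = \frac{\pi}{5}$.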

Remark. We can note that the 1/5 in the final result, comes from the 8/5 in the restriction formula (2.1).

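Before finishing this section we would like to show how the techniques used in our proof yield a conditional version of Schramm’s formula (4.1), for κ = 8/3. We have decided to work with an infinitesimal slit instead of an infinitesimal ball, but the result is the same in either case.

Lemma 4.2. Let γ be an SLE$_{8/3}$ from 0 to ∞ in $\mathbb{H}$, and let $S_\varepsilon$ be the vertical slit $[-1, -1 + i\varepsilon]$. Then, for any $z = re^{i\theta}$ in $\mathbb{H}$,

$$\lim_{\varepsilon \to 0} P\{z \text{ is to the right of } \gamma[0, \infty) \mid \gamma[0, \infty) \cap S_\varepsilon \neq \emptyset\} = \tfrac{1}{2} + \tfrac{1}{2}\cos(\theta) - \tfrac{4}{5}\sin(\theta)\, \mathrm{Im}\Big(\frac{1}{z+1}\Big).$$

Proof. The map $\Phi_\varepsilon(z) = \sqrt{(z+1)^2 + \varepsilon^2} - \sqrt{1 + \varepsilon^2}$, where the square root has been chosen to be positive on $\mathbb{R}_+$, is a conformal transformation from $\mathbb{H} \setminus S_\varepsilon$ onto $\mathbb{H}$. It fixes both 0 and ∞, and has derivative 1 at ∞. Simple algebra yields

$$\Phi_\varepsilon(z) = z\Big(1 - \frac{\varepsilon^2}{2}\,\frac{1}{z+1}\Big) + O(\varepsilon^4), \qquad (\Phi_\varepsilon'(0))^{5/8} = 1 - \frac{5}{16}\varepsilon^2 + O(\varepsilon^4).$$

In particular, we have

$$\arg \Phi_\varepsilon(z) = \arg z - \frac{\varepsilon^2}{2}\, \mathrm{Im}\Big(\frac{1}{z+1}\Big) + O(\varepsilon^4).$$

Thus, using this last remark, the restriction property and (4.1), we have

$$\begin{aligned}
P\{R(z) \mid \gamma[0, \infty) \cap S_\varepsilon \neq \emptyset\}
&= \frac{P\{R(z)\} - P\{R(z) \mid \gamma[0, \infty) \cap S_\varepsilon = \emptyset\}\, P\{\gamma[0, \infty) \cap S_\varepsilon = \emptyset\}}{P\{\gamma[0, \infty) \cap S_\varepsilon \neq \emptyset\}} \\
&= \frac{\frac{1}{2} + \frac{1}{2}\cos(\arg z) - \big(\frac{1}{2} + \frac{1}{2}\cos(\arg \Phi_\varepsilon(z))\big)\big(1 - \frac{5}{16}\varepsilon^2\big) + O(\varepsilon^4)}{\frac{5}{16}\varepsilon^2 + O(\varepsilon^4)} \\
&= \frac{\frac{1}{2}\big[\cos(\arg z) - \cos\big(\arg z + (-\frac{1}{2}\varepsilon^2\, \mathrm{Im}(\frac{1}{z+1})) + O(\varepsilon^4)\big)\big]}{\frac{5}{16}\varepsilon^2 + O(\varepsilon^4)} + \frac{1}{2}(1 + \cos(\arg z))(1 + o(1)) \\
&= \frac{\frac{1}{2}\big[\sin(\arg z)\big(-\frac{1}{2}\varepsilon^2\, \mathrm{Im}(\frac{1}{z+1})\big) + O(\varepsilon^4)\big]}{\frac{5}{16}\varepsilon^2 + O(\varepsilon^4)} + \frac{1}{2}(1 + \cos(\arg z))(1 + o(1)) \\
&= -\frac{4}{5}\sin(\arg z)\, \mathrm{Im}\Big(\frac{1}{z+1}\Big)(1 + o(1)) + \frac{1}{2}(1 + \cos(\arg z))(1 + o(1)),
\end{aligned}$$

which is what we wanted to prove.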
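The expansion in this proof can be compared with the exact finite-ε expression. The sketch below assumes the explicit slit map $\Phi_\varepsilon$, the restriction formula (2.1) with α = 5/8, and Schramm's formula (4.1), and evaluates the conditional probability for a small ε. The function names and the choice of branch via a sign flip are illustrative.

```python
import cmath
import math


def schramm_right(z):
    """Schramm's formula (4.1) for kappa = 8/3: P{z is to the right of the curve}."""
    return 0.5 + 0.5 * math.cos(cmath.phase(z))


def slit_map(z, eps):
    """Map from H minus the slit [-1, -1 + i*eps] onto H, fixing 0 and infinity."""
    s = cmath.sqrt((z + 1) ** 2 + eps ** 2)
    if s.imag < 0:  # pick the branch that takes values in the upper half-plane
        s = -s
    return s - math.sqrt(1.0 + eps ** 2)


def conditional_right_given_hit(z, eps):
    """Exact P{R(z) | curve hits the slit} for fixed eps, via restriction."""
    log_term = -(5.0 / 16.0) * math.log1p(eps ** 2)  # log of Phi'(0)^(5/8)
    p_miss = math.exp(log_term)
    p_hit = -math.expm1(log_term)                    # 1 - p_miss, computed stably
    p_right_given_miss = schramm_right(slit_map(z, eps))
    return (schramm_right(z) - p_right_given_miss * p_miss) / p_hit


def limit_formula(z):
    """Right-hand side of Lemma 4.2."""
    theta = cmath.phase(z)
    return (0.5 + 0.5 * math.cos(theta)
            - 0.8 * math.sin(theta) * (1 / (z + 1)).imag)
```

For a fixed $z$ away from the slit, `conditional_right_given_hit(z, eps)` should approach `limit_formula(z)` at rate $O(\varepsilon^2)$ as `eps` decreases.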


5. Expected Areas for Fixed Index

Let $z \in \mathbb{C} \setminus \{0\}$ be fixed, and $(B_t)_{0 \le t \le 1}$ a Brownian loop in $\mathbb{C}$ starting at 0. Almost surely $z \notin \{B_s : 0 \le s \le 1\}$, and therefore we can define its index $n_z$. More precisely, for all $s \in [0, 1]$, $B_s - z = R_s^z \exp(i\theta_s^z)$, where $R_s^z = |B_s - z|$ and $\theta_s^z$ is any continuous representative of the argument. The index $n_z$ is by definition $\frac{\theta_1^z - \theta_0^z}{2\pi}$; this is the number of times that the Brownian particle winds around $z$. Let us now recall Yor’s result from [17]:

Theorem 5.1. (M. Yor). Fix $z = re^{i\theta}$, with $r \neq 0$. Under the law of a Brownian loop of time duration one, starting at 0, we have the following probabilities:

$$P(n_z = n) = e^{-r^2}\big[\Phi_r((2n-1)\pi) - \Phi_r((2n+1)\pi)\big] \quad \text{if } n \in \mathbb{Z} \setminus \{0\},$$

$$P(n_z = 0) = 1 + e^{-r^2}\big[\Phi_r(-\pi) - \Phi_r(\pi)\big],$$

where, for all $x \neq 0$,

$$\Phi_r(x) = \frac{x}{\pi} \int_0^\infty e^{-r^2\cosh(t)}\, \frac{dt}{t^2 + x^2}.$$

For each $n \in \mathbb{Z}$, $n \neq 0$, let $W_n$ denote the area of the open set of points of index $n_z = n$. This random variable can be written as:

$$W_n = \int_{\mathbb{C}} 1_{\{n_z = n\}}\, dA(z).$$

Let $W_0$ be the area of the open set of points inside the loop that have index zero:

$$W_0 = \int_{\mathbb{C}} 1_{\{n_z = 0\} \cap \{z \text{ is inside}\}}\, dA(z).$$

Fig. 2. Random walk of 50000 steps with areas of index 0 inside its hull in black

Since the Brownian curve is of Lebesgue measure zero, we have the following decomposition of the area $A$ inside the Brownian loop (basically, the Brownian path does not take much place inside its hull):

$$A = \sum_{n \in \mathbb{Z}} W_n.$$

Hence:

$$E(A) = \frac{\pi}{5} = \sum_{n \in \mathbb{Z}} E(W_n).$$

Using Yor’s result, it will be rather straightforward to compute $E(W_n)$ for $n \neq 0$. And, hence, by subtracting from π/5, one can obtain the value of $E(W_0)$:

Theorem 5.2.

$$E(W_n) = \begin{cases} \dfrac{\pi}{30}, & n = 0, \\[2mm] \dfrac{1}{2\pi n^2}, & n \neq 0,\ n \in \mathbb{Z}. \end{cases} \tag{5.1}$$

Remark. This result is consistent with the asymptotic result obtained by Werner in [12], about the area $A_n^t$ of the set of points around which the planar Brownian motion (not the loop) winds around n times on [0, t]. It is indeed proved that $A_n^t$ is equivalent (in the $L^2$-sense) to $\frac{t}{2\pi n^2}$ as n goes to infinity. Very roughly, the area of the n-sector for large n comes from local contributions along the path, hence the global picture of the hull is not relevant; that is why both Brownian motion and Brownian bridge should have the same asymptotics. Werner’s proof requires computing the asymptotics of the first and second moments. The present paper gives exact computations for the first moments in the case of the loop, but it does not provide any information about the second moments. In fact, if one were to try to compute the second moment (or any higher order moments) of the area of the loop following an approach similar to the one in this paper, one would face two difficulties. First, the simplification that allows one to obtain the equation in Lemma 3.1 only occurs for the first moment. And even if somehow one were able to obtain a formula analogous to the one in the lemma, it would still be necessary to have a statement analogous to that of Lemma 4.1 for more than one point: a notoriously difficult problem.

Proof. We start by computing $E(W_n)$ for $n \neq 0$. For this purpose we use Theorem 5.1. Thus, for each $n \neq 0$, using polar coordinates:

$$\begin{aligned}
E(W_n) &= \int_{\mathbb{C}} P(n_z = n)\, dA(z) \\
&= 2\pi \int_0^\infty r\, dr\, e^{-r^2} \int_0^\infty dt\, e^{-r^2\cosh(t)} \Big[\frac{2n-1}{t^2 + (2n-1)^2\pi^2} - \frac{2n+1}{t^2 + (2n+1)^2\pi^2}\Big] \\
&= 2\pi \int_0^\infty dt\, \Big[\frac{2n-1}{t^2 + (2n-1)^2\pi^2} - \frac{2n+1}{t^2 + (2n+1)^2\pi^2}\Big] \int_0^\infty r\, e^{-r^2(1+\cosh(t))}\, dr \\
&= \pi \int_0^\infty \frac{dt}{1 + \cosh(t)} \Big[\frac{2n-1}{t^2 + (2n-1)^2\pi^2} - \frac{2n+1}{t^2 + (2n+1)^2\pi^2}\Big] \\
&= \frac{1}{2\pi n^2}.
\end{aligned}$$


We sketch one possible way to see how to obtain the last line in the above chain of equalities. It is slightly more convenient to generalize a bit, so thinking of 2n as x and using the symmetry of the integrand, we consider the function

$$F(x) = \int_{-\infty}^{\infty} \frac{dt}{1 + \cosh(t)} \Big[\frac{x-1}{t^2 + (x-1)^2\pi^2} - \frac{x+1}{t^2 + (x+1)^2\pi^2}\Big].$$

In this new notation what we want to prove is that $F(x) = \frac{4}{\pi^2 x^2}$ (for $|x| \ge 2$). Since $F$ is symmetric about 0, it is enough to study the case of x positive; furthermore, since $F$ is real analytic on $\{x : x > 1\}$, we can allow ourselves to assume that x is not an integer. Now, for x > 1 and x not an integer, a simple residue computation with appropriate contours yields

$$F(x) = -\frac{8}{\pi^2} \sum_{k=1}^{\infty} \Big[\frac{(2k-1)(x-1)}{((x-1)^2 - (2k-1)^2)^2} - \frac{(2k-1)(x+1)}{((x+1)^2 - (2k-1)^2)^2}\Big].$$

In order to evaluate this sum, it is enough to notice that using partial fractions one can obtain

$$\sum_{k=1}^{\infty} \frac{(2k-1)w}{(w^2 - (2k-1)^2)^2} = -\frac{1}{16} \sum_{k=0}^{\infty} \Big[\frac{1}{(k + w/2 + 1/2)^2} - \frac{1}{(k - w/2 + 1/2)^2}\Big],$$

and substituting x − 1 and x + 1 for w, and noticing the telescoping cancellations, one readily obtains

$$F(x) = \frac{4}{\pi^2 x^2}, \quad \text{hence} \quad E(W_n) = \frac{1}{2\pi n^2}.$$

Finally, using the fact that $\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$, and the fact that the expected area of the Brownian loop is π/5, we conclude $E(W_0) = \frac{\pi}{30}$. This finishes the proof of the theorem.
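The closed form for $E(W_n)$ can be checked against direct quadrature of the one-dimensional integral in the proof. The sketch below assumes only that integral. The truncation point `t_max`, the panel count and the use of composite Simpson's rule are illustrative choices, justified by the exponential decay of $1/(1 + \cosh t)$.

```python
import math


def expected_area_index_n(n, t_max=60.0, n_panels=6000):
    """Simpson approximation of pi * int_0^inf dt/(1+cosh t) * [bracket], for n != 0."""
    a, b = 2 * n - 1, 2 * n + 1

    def integrand(t):
        bracket = (a / (t * t + (a * math.pi) ** 2)
                   - b / (t * t + (b * math.pi) ** 2))
        return bracket / (1.0 + math.cosh(t))

    h = t_max / n_panels  # n_panels must be even for Simpson's rule
    total = integrand(0.0) + integrand(t_max)
    for k in range(1, n_panels):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return math.pi * total * h / 3.0


def expected_area_index_zero():
    """E(W_0) = pi/5 - sum over n != 0 of 1/(2 pi n^2) = pi/5 - pi/6."""
    return math.pi / 5 - math.pi / 6
```

Comparing `expected_area_index_n(n)` with `1 / (2 * math.pi * n ** 2)` for a few values of `n` checks the residue computation, and `expected_area_index_zero()` reproduces π/30 from $\sum_{n \neq 0} \frac{1}{2\pi n^2} = \frac{\pi}{6}$.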

Acknowledgement. We wish to thank Greg Lawler and Wendelin Werner for suggesting the problem and for fruitful discussions, and Wendelin Werner for pointing out the link with the paper of Yor [17] as well as for his many comments on previous versions of the manuscript. J. T. thanks Julien Dubédat for a discussion during a workshop at Oberwolfach that has led to a clearer presentation of Lemma 3.1. We thank the referees for useful suggestions.

References

1. Cardy, J.: Mean area of self-avoiding loops. Phys. Rev. Lett. 72, 1580–1583 (1994)
2. Comtet, A., Desbois, J., Ouvry, S.: Winding of planar Brownian curves. J. Phys. A: Math. Gen. 23, 3563–3572 (1990)
3. Lawler, G.F.: Conformally Invariant Processes in the Plane. Mathematical Surveys and Monographs 114, Providence, RI: Amer. Math. Soc., 2005
4. Lawler, G.F., Werner, W.: The Brownian loop soup. Probab. Theory Related Fields 128, 565–588 (2004)
5. Lawler, G.F., Schramm, O., Werner, W.: Conformal restriction. The chordal case. J. Amer. Math. Soc. 16, 917–955 (2003)
6. Lawler, G.F., Schramm, O., Werner, W.: The dimension of the planar Brownian frontier is 4/3. Math. Res. Lett. 8, 401–411 (2001)
7. Lawler, G.F., Schramm, O., Werner, W.: On the scaling limit of planar self-avoiding walk. In: Fractal geometry and applications, A jubilee of Benoît Mandelbrot, Proc. Symp. Pure Math. 72, Providence, RI: Amer. Math. Soc., 2004
8. Richard, C.: Area distribution of the planar random loop boundary. J. Phys. A. 37, 4493–4500 (2004)
9. Thacker, J.: Hausdorff Dimension of the Brownian Loop Soup. In preparation (2005)
10. Schramm, O.: A percolation formula. Electron. J. Probab. Vol. 7(2), 1–13 (2001)
11. Vervaat, W.: A relation between Brownian bridge and Brownian excursion. Ann. Probab. 7, 143–149 (1979)
12. Werner, W.: Sur l’ensemble des points autour desquels le mouvement brownien plan tourne beaucoup. Probability Theory and Related Fields 99, 111–142 (1994)
13. Werner, W.: Random planar curves and Schramm-Loewner evolutions. In: Lecture Notes from the 2002 Saint-Flour Summer School, L.N. Math. 1840, Berlin-Heidelberg-New York: Springer, 2004, pp. 107–195
14. Werner, W.: Conformal restriction and related questions. http://arxiv.org/list/math.PR/0307353, 2003
15. Werner, W.: SLEs as boundaries of clusters of Brownian loops. C. R. Acad. Sci. Paris Ser. I Math. 337, 481–486 (2003)
16. Werner, W.: The conformally invariant measure on self-avoiding loops. http://arxiv.org/list/math.PR/0511605, 2005
17. Yor, M.: Loi de l’indice du lacet brownien, et distribution de Hartman-Watson. Z. Wahrsch. Verw. Gebiete 53, 71–95 (1980)

Communicated by M. Aizenman


Scattering Theory for Jacobi Operators with Quasi-Periodic Background

Iryna Egorova 1, Johanna Michor 2,3, Gerald Teschl 2,3

1 Kharkiv National University 47, Lenin ave, 61164 Kharkiv, Ukraine. E-mail: [email protected]
2 Faculty of Mathematics, Nordbergstrasse 15, 1090 Wien, Austria
3 International Erwin Schrödinger Institute for Mathematical Physics, Boltzmanngasse 9, 1090 Wien, Austria. E-mail: [email protected]; [email protected]

Received: 7 June 2005 / Accepted: 19 September 2005 Published online: 15 February 2006 – © Springer-Verlag 2006

Work supported by the Austrian Science Fund (FWF) under Grant No. P17762, the Austrian Academy of Sciences under DOC-21388, and INTAS Research Network NeCCA 03-51-6637.

Abstract: We develop direct and inverse scattering theory for Jacobi operators which are short range perturbations of quasi-periodic finite-gap operators. We show existence of transformation operators, investigate their properties, derive the corresponding Gel’fand-Levitan-Marchenko equation, and find minimal scattering data which determine the perturbed operator uniquely.

1. Introduction

Classical scattering theory deals with the reconstruction of a given Jacobi operator

$$H u(n) = a(n)u(n+1) + a(n-1)u(n-1) + b(n)u(n), \tag{1.1}$$

which is a short range perturbation of the free one $H_0$ associated with the coefficients $a(n) = \frac{1}{2}$, $b(n) = 0$. This case was first developed on an informal level by Case in a series of papers [5–10]. The first rigorous results were established by Guseinov [18], who gave necessary and sufficient conditions for the scattering data to determine $H$ uniquely under the assumption

$$\sum_n |n| \Big(\big|a(n) - \tfrac{1}{2}\big| + |b(n)|\Big) < \infty. \tag{1.2}$$

Further extensions were made by Guseinov [19, 20], and Teschl [27]. Additional details and further references can be found, e.g., in [28].

In addition to being of interest on its own, scattering theory can also be used to solve the initial value problem for the Toda equation via the inverse scattering transform. This has been formally developed by Flaschka [14] (see also [29] and [11] for the

case of rapidly decaying sequences) who also worked out the inverse procedure in the reflection-less case. Further results and an extension of the method to the entire Toda hierarchy were given by Teschl in [26] and [27]. The next interesting problem is to replace the free Hamiltonian H0 by one with a periodic potential. First results in the case of Sturm-Liouville operators have been obtained by Firsova in a series of papers (see [13]). For further results, including potentials with different spatial asymptotics, and additional references see Gesztesy et al. [16]. In the discrete case, the investigation has only recently been started by Boutet de Monvel and Egorova [2] and by Volberg and Yuditskii [31], who treat the case where H has a homogeneous spectrum and is of Szegö class exhaustively from an operator point of view. Applications to the Toda lattice can be found in Bazargan and Egorova [1] and Boutet de Monvel and Egorova [3]. Finally, let us give a brief overview of the paper: Section 2 collects some well-known facts from Riemann surfaces and introduces the necessary notation. Section 3 introduces the Baker-Akhiezer function and investigates the quasi-momentum map. In the periodic case, where the integrals can be explicitly computed, this was first done in [24]. In addition, we characterize the second solution at the band edges. In Sect. 4 we prove existence of Jost solutions and use them to characterize the spectrum of the perturbed operator. In the periodic case, existence of Jost solutions was first shown by Geronimo and Van Assche [17] and the fact that there are only finitely many eigenvalues in each gap was first proven in Cojuhari [12] and later rediscovered in Teschl [25]. Section 5 introduces the transformation operator and proves the crucial decay estimate on its coefficients. This was first done by Boutet de Monvel and Egorova [2] in the periodic case under the additional assumption that all spectral gaps are open. 
We fix a problem in the original proof and at the same time simplify and streamline the argument. Section 6 investigates the scattering matrix. Our main result here is the reconstruction of the transmission coefficient from the reflection coefficient, which was not known previously even in the periodic case. Section 7 derives the Gel’fand-Levitan-Marchenko equation and proves positivity of the Gel’fand-Levitan-Marchenko operator. In addition, we formulate necessary conditions for the scattering data to uniquely determine our Jacobi operator. Our final Sect. 8 shows that our necessary conditions for the scattering data are also sufficient. It should be mentioned that, due to the lack of continuity with respect to the spatial variable n, a significant change in the strategy of the original proof in the continuous case from [22] is needed. Our approach relies heavily on the fact that the Baker-Akhiezer function is a meromorphic function on the Riemann surface associated with the problem. This strategy gives a more streamlined treatment and more elegant proofs even in the special cases which were previously known. In this respect it is important to emphasize that, in contradistinction to the constant background case, the upper sheet of our Riemann surface is not simply connected and in particular not isomorphic to the unit disc.
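As a small illustration of the difference expression (1.1) in the free case a(n) = 1/2, b(n) = 0, the following sketch checks numerically that u(n) = w^n solves H_0 u = z u when z = (w + 1/w)/2. The helper names (`apply_jacobi`, `residual`) are hypothetical and chosen only for this example; the relation between z and w used here is the standard constant-background one and is stated as an assumption of the sketch.

```python
def apply_jacobi(a, b, u, n):
    """Evaluate (H u)(n) = a(n) u(n+1) + a(n-1) u(n-1) + b(n) u(n), cf. (1.1)."""
    return a(n) * u(n + 1) + a(n - 1) * u(n - 1) + b(n) * u(n)


def a_free(n):
    # Free background coefficient a(n) = 1/2.
    return 0.5


def b_free(n):
    # Free background coefficient b(n) = 0.
    return 0.0


def residual(w, n):
    """Size of (H_0 u)(n) - z u(n) for u(n) = w**n and z = (w + 1/w)/2."""
    z = 0.5 * (w + 1.0 / w)
    u = lambda k: w ** k
    return abs(apply_jacobi(a_free, b_free, u, n) - z * u(n))


if __name__ == "__main__":
    # The residual should vanish up to rounding for any nonzero w and any n.
    print(residual(0.3 + 0.4j, 5), residual(0.3 + 0.4j, -3))
```

The same `apply_jacobi` helper can be reused with any pair of coefficient functions, so a perturbed operator only requires swapping `a_free` and `b_free`.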

2. Quasi-Periodic Finite-Gap Operators and Riemann Surfaces

To set the stage let $M$ be the Riemann surface associated with the following function
\[ R_{2g+2}^{1/2}(z), \qquad R_{2g+2}(z) = \prod_{j=0}^{2g+1} (z - E_j), \qquad E_0 < E_1 < \cdots < E_{2g+1}, \tag{2.1} \]
$g \in \mathbb{N}$. $M$ is a compact, hyperelliptic Riemann surface of genus $g$. We will choose $R_{2g+2}^{1/2}(z)$ as the fixed branch
\[ R_{2g+2}^{1/2}(z) = -\prod_{j=0}^{2g+1} \sqrt{z - E_j}, \tag{2.2} \]
where $\sqrt{\cdot}$ is the standard root with branch cut along $(-\infty, 0)$.
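The choice of branch in (2.2) can be made concrete with a short numerical sketch. The band edges below are hypothetical sample values (genus g = 1), and `r_half` and `r_poly` are illustrative names only; Python's `cmath.sqrt` is the principal root with cut along the negative real axis, which matches the convention stated above.

```python
import cmath


def r_poly(z, E):
    """R_{2g+2}(z) = prod_j (z - E_j), cf. (2.1)."""
    value = 1.0 + 0.0j
    for Ej in E:
        value *= complex(z) - Ej
    return value


def r_half(z, E):
    """Fixed branch (2.2): R^{1/2}(z) = -prod_j sqrt(z - E_j) with the principal root.

    For real z inside a band, complex(z) carries imaginary part +0.0, so the
    value returned is the limit from the upper half plane.
    """
    value = 1.0 + 0.0j
    for Ej in E:
        value *= cmath.sqrt(complex(z) - Ej)
    return -value


# Hypothetical band edges E_0 < E_1 < E_2 < E_3 (genus 1).
E = [-3.0, -1.0, 1.0, 2.0]

if __name__ == "__main__":
    z = 0.5 + 0.7j
    print(r_half(z, E) ** 2, r_poly(z, E))  # the two values should agree
    print(r_half(5.0, E))                   # real and negative to the right of E_3
```

With this branch the function is negative on the real axis to the right of the largest band edge, which is the sign convention that the later formulas rely on.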

A point on $M$ is denoted by $p = (z, \pm R_{2g+2}^{1/2}(z)) = (z, \pm)$, $z \in \mathbb{C}$, or $p = \infty_\pm$, and the projection onto $\mathbb{C} \cup \{\infty\}$ by $\pi(p) = z$. The points $\{(E_j, 0),\ 0 \le j \le 2g+1\} \subseteq M$ are called branch points and the sets
\[ \Pi_\pm = \Big\{ (z, \pm R_{2g+2}^{1/2}(z)) \;\Big|\; z \in \mathbb{C} \setminus \bigcup_{j=0}^{g} [E_{2j}, E_{2j+1}] \Big\} \subset M \tag{2.3} \]
are called upper, lower sheet, respectively.

Let $\{a_j, b_j\}_{j=1}^{g}$ be loops on the surface $M$ representing the canonical generators of the fundamental group $\pi_1(M)$. We require $a_j$ to surround the points $E_{2j-1}$, $E_{2j}$ (thereby changing sheets twice) and $b_j$ to surround $E_0$, $E_{2j-1}$ counter-clockwise on the upper sheet, with pairwise intersection indices given by
\[ a_i \circ a_j = b_i \circ b_j = 0, \qquad a_i \circ b_j = \delta_{ij}, \qquad 1 \le i, j \le g. \tag{2.4} \]

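The corresponding canonical basis $\{\zeta_j\}_{j=1}^{g}$ for the space of holomorphic differentials can be constructed by
\[ \zeta = \sum_{j=1}^{g} c(j) \frac{\pi^{j-1} d\pi}{R_{2g+2}^{1/2}}, \tag{2.5} \]
where the constants $c(\cdot)$ are given by
\[ c_j(k) = C_{jk}^{-1}, \qquad C_{jk} = \int_{a_k} \frac{\pi^{j-1} d\pi}{R_{2g+2}^{1/2}} = 2 \int_{E_{2k-1}}^{E_{2k}} \frac{z^{j-1} dz}{R_{2g+2}^{1/2}(z)} \in \mathbb{R}. \]
The differentials fulfill
\[ \int_{a_j} \zeta_k = \delta_{j,k}, \qquad \int_{b_j} \zeta_k = \tau_{j,k}, \qquad \tau_{j,k} = \tau_{k,j}, \qquad 1 \le j, k \le g. \tag{2.6} \]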

Now pick $g$ numbers (the Dirichlet eigenvalues)
\[ (\hat\mu_j)_{j=1}^{g} = (\mu_j, \sigma_j)_{j=1}^{g} \tag{2.7} \]
whose projections lie in the spectral gaps, that is, $\mu_j \in [E_{2j-1}, E_{2j}]$. Associated with these numbers is the divisor $D_{\hat\mu}$ which is one at the points $\hat\mu_j$ and zero otherwise. Using this divisor we introduce
\[ z(p, n) = \hat A_{p_0}(p) - \hat\alpha_{p_0}(D_{\hat\mu}) - n \hat A_{\infty_-}(\infty_+) - \hat\Xi_{p_0} \in \mathbb{C}^g, \qquad z(n) = z(\infty_+, n), \tag{2.8} \]
where $\Xi_{p_0}$ is the vector of Riemann constants
\[ \hat\Xi_{p_0, j} = \frac{1 - \sum_{k=1}^{g} \tau_{j,k}}{2}, \qquad p_0 = (E_0, 0), \tag{2.9} \]
and $A_{p_0}$ ($\alpha_{p_0}$) is Abel’s map (for divisors). The hat indicates that we regard it as a (single-valued) map from $\hat M$ (the fundamental polygon associated with $M$) to $\mathbb{C}^g$. We recall that the function $\theta(z(p, n))$ has precisely $g$ zeros $\hat\mu_j(n)$ (with $\hat\mu_j(0) = \hat\mu_j$), where $\theta(z)$ is the Riemann theta function of $M$. Then our Jacobi operator $H_q$ is given by
\[ a_q(n)^2 = \tilde a^2 \frac{\theta(z(n+1))\,\theta(z(n-1))}{\theta(z(n))^2}, \qquad b_q(n) = \tilde b + \sum_{j=1}^{g} c_j(g) \frac{\partial}{\partial w_j} \ln\Big( \frac{\theta(w + z(n))}{\theta(w + z(n-1))} \Big) \Big|_{w=0}. \tag{2.10} \]

The constants $\tilde a$, $\tilde b$ depend only on the Riemann surface and will be defined in the next section. It is well known that the spectrum of $H_q$ is purely absolutely continuous and consists of $g + 1$ bands
\[ \sigma(H_q) = \bigcup_{j=0}^{g} [E_{2j}, E_{2j+1}]. \tag{2.11} \]
For further information and proofs we refer to [28], Sect. 9.

3. The Baker-Akhiezer Function and the Quasi-Momentum Map

The Baker-Akhiezer function $\psi_q(p, n) = \psi_q(p, n, 0)$ is given by
\[ \psi_q(p, n, n_0) = \sqrt{\frac{\theta(z(n_0 - 1))\,\theta(z(n_0))}{\theta(z(n - 1))\,\theta(z(n))}}\; \frac{\theta(z(p, n))}{\theta(z(p, n_0))} \exp\Big( (n - n_0) \int_{p_0}^{p} \hat\omega_{\infty_+, \infty_-} \Big), \tag{3.1} \]
where $\omega_{\infty_+, \infty_-}$ is the normalized Abelian differential of the third kind with simple poles at $\infty_\pm$ and residues $\pm 1$, respectively. They are normalized such that $\psi_q(p, n_0, n_0) = 1$. The two branches
\[ \psi_{q,\pm}(z, n) = \prod_{j=0}^{n-1} \phi_{q,\pm}(z, j) \tag{3.2} \]
of the Baker-Akhiezer function are solutions of $\tau_q u = zu$, $z \in \mathbb{C}$, where $\tau_q$ is the difference expression associated with $H_q$ and ([28], (8.87))
\[ \phi_{q,\pm}(z, n) = \frac{1}{2 a_q(n)} \Big( z - b_q(n) + \sum_{j=1}^{g} \frac{\hat R_j(n)}{z - \mu_j(n)} \pm \frac{R_{2g+2}^{1/2}(z)}{\prod_{j=1}^{g} (z - \mu_j(n))} \Big), \tag{3.3} \]
\[ R_j(n) = \frac{R_{2g+2}^{1/2}(\mu_j(n))}{\prod_{k \ne j} (\mu_j(n) - \mu_k(n))}, \qquad \hat R_j(n) = \sigma_j(n) R_j(n). \]
However, the Wronskian
\[ W_q(\psi_{q,-}(z), \psi_{q,+}(z)) = \frac{R_{2g+2}^{1/2}(z)}{\prod_{j=1}^{g} (z - \mu_j)} \tag{3.4} \]

($\mu_j = \mu_j(0)$) shows that they are linearly dependent at the band edge $E_j$, $0 \le j \le 2g + 1$. The branch $\psi_{q,\sigma_j}(z, n)$ has a first order pole at $\mu_j$ if $\mu_j$ is away from the band edges
\[ \lim_{z \to \mu_j} (z - \mu_j)\, \psi_{q,\sigma_j}(z, n) = \psi_{q,\sigma_j}(\mu_j, n, 1) \frac{\hat R_j(0)}{a_q(0)} \tag{3.5} \]
(use (3.3) and $\psi_{q,\pm}(z, n) = \psi_{q,\pm}(z, n, 1)\phi_{q,\pm}(z, 0)$) and both branches have a square root singularity if $\mu_j$ coincides with a band edge $E_l$,
\[ \lim_{z \to \mu_j} \sqrt{z - \mu_j}\; \psi_{q,\pm}(z, n) = \pm \frac{i^l \sqrt{\prod_{k \ne l} |E_l - E_k|}}{2 a_q(0) \prod_{k \ne j} (E_l - \mu_k)}\, \psi_{q,+}(E_l, n, 1). \tag{3.6} \]

Lemma 3.1. The solutions of $\tau_q u = zu$ can be characterized as follows:
(i) If $R_{2g+2}(z) \ne 0$, there exist two solutions satisfying
\[ \psi_{q,\pm}(z, n) = \theta_\pm(z, n)\, w(z)^{\pm n}, \qquad w(z) = \exp\Big( \int_{p_0}^{(z,+)} \hat\omega_{\infty_+, \infty_-} \Big), \tag{3.7} \]
with $\theta_\pm(z, n)$ quasi-periodic.
(ii) If $R_{2g+2}(z) = 0$, $z = E_l$, there are two solutions satisfying
\[ \psi_q(E_l, n) = \psi_{q,+}(E_l, n) = \psi_{q,-}(E_l, n), \qquad \hat\psi_q(E_l, n) = \psi_q(E_l, n)(\hat\theta_l(n) + n), \tag{3.8} \]

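where $\hat\theta_l(n)$ is quasi-periodic.

Proof. (ii). We construct a second linearly independent solution at $z = E = E_l$ using (see [28], (1.50))
\[ s_q(E, n) = \lim_{z \to E} a_q(0) \frac{\psi_{q,+}(z, n) - \psi_{q,-}(z, n)}{W(\psi_{q,-}(z), \psi_{q,+}(z))}, \tag{3.9} \]
where $s_q(z, n)$ denotes the fundamental solution of $\tau_q u = zu$ with initial conditions $s_q(z, 0) = 0$, $s_q(z, 1) = 1$. W.l.o.g. we assume that $E_l$ does not coincide with one of the Dirichlet eigenvalues $\mu_j$ (otherwise shift the base point). To derive an expression for $\psi_{q,\pm}(z)$ at $z = E + \varepsilon^2$ we start with
\[ R_{2g+2}^{1/2}(z) = \varepsilon (\tilde R + O(\varepsilon^2)), \qquad \tilde R = -\prod_{j \ne l} \sqrt{E - E_j}. \]
Moreover,
\[ W_q(\psi_{q,-}(z), \psi_{q,+}(z)) = \frac{\varepsilon \tilde R}{\prod_{j=1}^{g} (E - \mu_j)} (1 + O(\varepsilon^2)) \]
and for $p = (E + \varepsilon^2, \pm)$ (see (3.11) below)
\[ \int_{p_0}^{p} \hat\omega_{\infty_+, \infty_-} = \int_{p_0}^{E} \hat\omega_{\infty_+, \infty_-} \pm \varepsilon \beta + O(\varepsilon^3), \qquad \beta = \frac{2 \prod_{j=1}^{g} (E - \lambda_j)}{\tilde R}, \]
\[ z(p, n) = z(E, n) \pm \varepsilon \gamma + O(\varepsilon^3), \qquad \gamma = \sum_{j=1}^{g} c(j) \frac{2 E^{j-1}}{\tilde R}, \]
and
\[ \theta(z(p, n)) = \theta(z(E, n)) \pm \varepsilon \frac{\partial \theta}{\partial z}(z(E, n))\, \gamma + O(\varepsilon^3). \]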

Using this to evaluate the limit ε → 0 shows sq (E, n) = 2aq (0)

g E − µj ψˆ q (E, n) = ψq (E, n)(θˆ (n) + n), E − λj

j =1

where g ∂ 1 E j ck (j ) ln θ(z(E, n) + w), ∂w (E − λ ) k j j,k=1 j =1

θˆ (n) = g and finishes the proof.

Remark 3.2. (i) Since $\psi_q(z, n)$ has a singularity if $z = \mu_j$ the solutions in Lemma 3.1 are not well-defined for those $z$. However, you can either remove the singularities of $\psi_q(z, n)$ or choose a different normalization point $n_0 \ne 0$ to see that solutions of the above type exist for every $z$.
(ii) In the periodic case Floquet theory tells you that there are two possible cases at a band edge: Either two (linearly independent) periodic solutions or one periodic and one linearly growing solution. The above lemma shows that the first case happens if the corresponding gap is closed and the second if the gap is open.

To understand the properties of $\psi_{q,\pm}(z, n)$ we need to investigate the quasi-momentum map
\[ w(z) = \exp\Big( \int_{p_0}^{p} \hat\omega_{\infty_+, \infty_-} \Big), \qquad p = (z, +). \tag{3.10} \]
The differential $\omega_{\infty_+, \infty_-}$ is given by
\[ \omega_{\infty_+, \infty_-} = \frac{\prod_{j=1}^{g} (\pi - \lambda_j)}{R_{2g+2}^{1/2}}\, d\pi, \tag{3.11} \]

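where the constants $\lambda_j$ have to be determined from the normalization
\[ \int_{a_j} \omega_{\infty_+, \infty_-} = 2 \int_{E_{2j-1}}^{E_{2j}} \frac{\prod_{k=1}^{g} (z - \lambda_k)}{R_{2g+2}^{1/2}(z)}\, dz = 0, \tag{3.12} \]
which shows $\lambda_j \in (E_{2j-1}, E_{2j})$. Since $\lambda_j \in (E_{2j-1}, E_{2j})$ the integrand is a Herglotz function and admits the following representation (cf. [28], Appendix B)
\[ \frac{\prod_{j=1}^{g} (z - \lambda_j)}{R_{2g+2}^{1/2}(z)} = \int_{-\infty}^{\infty} \frac{1}{\lambda - z}\, d\tilde\mu(\lambda) \tag{3.13} \]
with the probability measure
\[ d\tilde\mu(\lambda) = \frac{\prod_{j=1}^{g} (\lambda - \lambda_j)}{\pi i\, R_{2g+2}^{1/2}(\lambda)}\, \chi_{\sigma(H_q)}(\lambda)\, d\lambda. \tag{3.14} \]
Hence
\[ g(z, \infty) = \int_{p_0}^{p} \omega_{\infty_+, \infty_-} = \int_{E_0}^{z} \int_{-\infty}^{\infty} \frac{1}{\lambda - \zeta}\, d\tilde\mu(\lambda)\, d\zeta = \int_{-\infty}^{\infty} \ln\Big( \frac{\lambda - E_0}{\lambda - z} \Big)\, d\tilde\mu(\lambda). \tag{3.15} \]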

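In particular, note that $-\mathrm{Re}(g(z, \infty))$ is the Green’s function of the upper sheet $\Pi_+$ with pole at $\infty_+$ and $\tilde\mu$ is the equilibrium measure of the spectrum (see [30], Thm. III.37). We will abbreviate $g(z) = g(z, \infty)$. The asymptotic expansion of $\exp(g(z))$ is given by ([28], (9.42))
\[ \exp\Big( \int_{p_0}^{p} \hat\omega_{\infty_+, \infty_-} \Big) = -\frac{\tilde a}{z} \Big( 1 + \frac{\tilde b}{z} + O\big(\tfrac{1}{z^2}\big) \Big), \qquad z \to \infty, \tag{3.16} \]
where $\tilde a$ is the capacity of the spectrum and
\[ \tilde b = \frac{1}{2} \sum_{j=0}^{2g+1} E_j - \sum_{j=1}^{g} \lambda_j. \tag{3.17} \]

Theorem 3.3. The map $g$ is a bijection from the upper (resp. lower) half plane $\mathbb{C}^\pm = \{z \in \mathbb{C} \mid \pm\mathrm{Im}(z) > 0\}$ to
\[ S^\pm = \{z \in \mathbb{C} \mid \pm\mathrm{Re}(z) < 0,\ 0 < \mathrm{Im}(z) < \pi\} \setminus \bigcup_{j=1}^{g} [g(\lambda_j), g(E_{2j+1})] \]
such that
\[ \sigma(H_q) = \{z \mid \mathrm{Re}(z) = 0\}. \tag{3.18} \]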

Proof. By the Herglotz property of its integrand, the function $g(z, \infty)$ satisfies the conditions of [23], Theorem 1(b) in Chapter VI, which shows that it is one-to-one. To prove that $g(z, \infty)$ is surjective, it suffices to show that the boundary of $\mathbb{C}^+$ is mapped to the boundary of $S^+$. Note that $g(\lambda)$ is negative for $\lambda < E_0$ and purely imaginary for $\lambda \in [E_0, E_1]$. At $E_1$, the real part starts to decrease from zero until it hits its minimum at $\lambda_1$ and increases again until it becomes 0 at $E_2$ (since all a-periods are zero), while the imaginary part remains constant. Proceeding like this we move along the boundary of $S^+$ as $\lambda$ moves along the real line. For $\lambda > E_{2g+1}$, $g(\lambda)$ is again negative.

Remark 3.4. In the special case where $H_q$ is periodic the quasi-momentum is given by $w(z) = \exp(i N^{-1} \arccos \Delta(z))$, where $\Delta(z)$ is the Floquet discriminant, and our result is due to [24].

Therefore the map
\[ w : \mathbb{C}^\pm \to W^\pm = \{w \in \mathbb{C} \mid |w| < 1,\ \pm\mathrm{Im}(w) > 0\} \setminus \bigcup_{j=1}^{g} [w(\lambda_j), w(E_{2j+1})], \qquad z \mapsto \exp(g(z)) \tag{3.19} \]

is bijective. Denote $W = W^+ \cup W^- \cup (-1, 1)$, $W_0 = W \setminus \{0\}$. If we identify corresponding points on the slits $[w(\lambda_j), w(E_{2j+1})]$ we obtain a Riemann surface $\mathcal{W}$ which is isomorphic to the upper sheet $\Pi_+$.

Remark 3.5. In [24] the largest band edge $E_{2g+1}$ is chosen for $p_0$ and $w$ will map $\mathbb{C}^\pm \to W^\mp$ in this case. Moreover, in the periodic case the slits $[w(\lambda_j), w(E_{2j+1})]$ appear at equal angles $\frac{2\pi}{N}$, where $N$ is the period.

Since $z \mapsto w(z) = \exp(g(z))$ is a bijection, we consider the functions $\psi_{q,\pm}$ as functions of the new parameter $w$ whenever convenient. For notational simplicity we will write $\psi_{q,\pm}(w, n)$ for $\psi_{q,\pm}(\lambda(w), n)$ and similarly for other quantities. The functions $\psi_{q,\pm}(w, n)$ are meromorphic in $\mathcal{W}$ and continuous up to the boundary with the only possible singularities at the images of the Dirichlet eigenvalues $w(\mu_j)$ and at 0. More precisely, denote by $M_\pm$ the sets of poles (and square root singularities if $\mu_j = E_l$) of the Weyl m-functions $\tilde m_\pm(\lambda)$, i.e. $M_+ \cup M_- = \{\mu_j\}_{j=1}^{g}$ (see (3.2) and [28], Sect. 2.1). Note that $\mu_j \in M_+ \cap M_-$ if and only if $\mu_j = E_l$. Then

(B1) $\psi_{q,\pm}(w, n)$ are holomorphic in $\mathcal{W} \setminus (\{w(\mu_j)\}_{j=1}^{g} \cup \{0\})$ and continuous on $\partial W \setminus \{w(\mu_j)\}$.
(B2) $\psi_{q,\pm}(w, n)$ has a simple pole at $w(\mu_j)$ if $\mu_j \in M_\pm \setminus \{E_l\}$, no pole if $\mu_j \notin M_\pm$, and if $\mu_j = E_l$,
\[ \psi_{q,\pm}(w, n) = \pm \frac{i^l C(n)}{w - w_l} + O(1), \]
where $C(n)$ is bounded and real.
(B3) $\psi_{q,\pm}(w, n) = \overline{\psi_{q,\mp}(w, n)}$ for $|w| = 1$.
(B4) At $w = 0$ the following asymptotics hold
\[ \psi_{q,\pm}(w, n) = (-1)^n \Big( \prod_{m=0}^{n-1}{}^{*}\, a_q(m) \Big)^{\pm 1} \Big( \frac{w}{\tilde a} \Big)^{\pm n} (1 + O(w)). \]

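By Sect. 2.5 of [28] the vector valued functions
\[ U(\lambda, n) = \frac{1}{\sqrt{4 a_q(0)^2\, \pi\, \mathrm{Im}(\tilde m_+(\lambda))}} \begin{pmatrix} \psi_{q,+}(\lambda, n) \\ \psi_{q,-}(\lambda, n) \end{pmatrix} \tag{3.20} \]
form an orthonormal basis for the Hilbert space $L^2(\sigma(H_q), \mathbb{C}^2, d\lambda)$. The Weyl m-functions $\tilde m_\pm(z)$ satisfy (see [28], Eq. (8.95))
\[ \mathrm{Im}(\tilde m_\pm(\lambda)) = \frac{\mp R_{2g+2}^{1/2}(\lambda)}{2 i\, a_q(0)^2 \prod_{j=1}^{g} (\lambda - \mu_j)}, \qquad \lambda \in \sigma(H_q). \tag{3.21} \]
Using our map $w(z) = \exp\big( \int_{p_0}^{(z,+)} \hat\omega_{\infty_+, \infty_-} \big)$ we can transform this into an orthonormal basis on the unit circle.

Lemma 3.6. Both functions $\psi_{q,+}(w, n)$ and $\psi_{q,-}(w, n)$ form orthonormal bases in the Hilbert space $L^2(S^1, \frac{1}{2\pi i} d\omega)$, where
\[ d\omega(w) = \prod_{j=1}^{g} \frac{\lambda(w) - \mu_j}{\lambda(w) - \lambda_j}\, \frac{dw}{w}. \tag{3.22} \]

Proof. Just use
\[ \frac{dw}{dz} = w\, \frac{\prod_{j=1}^{g} (z - \lambda_j)}{R_{2g+2}^{1/2}(z)}. \tag{3.23} \]

Observe that $d\omega$ is meromorphic on $\mathcal{W}$ with a simple pole at $w = 0$. In particular, there are no poles at $w(\lambda_j)$.

Remark 3.7. In [2] a different normalization is used. To establish the connection observe
\[ \sum_{n=1}^{N} \psi_{q,+}(z, n)\, \psi_{q,-}(z, n) = N \prod_{j=1}^{N-1} \frac{z - \lambda_j}{z - \mu_j} \tag{3.24} \]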

if $H_q$ is periodic with period $N$.

4. Existence of Jost Solutions

After we have these preparations out of our way, we come to the study of short-range perturbations $H$ of $H_q$ associated with sequences $a$, $b$ satisfying $a(n) \to a_q(n)$ and $b(n) \to b_q(n)$ as $|n| \to \infty$. More precisely, we will make the following assumption throughout this paper.

Hypothesis H. 4.1. Let $H$ be a perturbation such that
\[ \sum_{n \in \mathbb{Z}} |n| \Big( |a(n) - a_q(n)| + |b(n) - b_q(n)| \Big) < \infty. \tag{4.1} \]

We first establish existence of Jost solutions, that is, solutions of the perturbed operator which asymptotically look like the Baker-Akhiezer solutions.

Theorem 4.2. Assume (H.4.1). Then there exist solutions $\psi_\pm(z, .)$, $z \in \mathbb{C}$, of $\tau\psi = z\psi$ satisfying
\[ \lim_{n \to \pm\infty} \big| w(z)^{\mp n} (\psi_\pm(z, n) - \psi_{q,\pm}(z, n)) \big| = 0, \tag{4.2} \]
where $\psi_{q,\pm}(z, .)$ are the Baker-Akhiezer functions. Moreover, $\psi_\pm(z, .)$ are continuous (resp. holomorphic) with respect to $z$ whenever $\psi_{q,\pm}(z, .)$ are and inherit the properties (B1) and (B2), where now $\psi_\pm(z, n) = \frac{i^l C_\pm(n)}{\sqrt{z - \mu_j}} + O(1)$. (B4) has to be replaced by
\[ \psi_\pm(z, n) = \frac{z^{\mp n}}{A_\pm(n)} \Big( \prod_{j=0}^{n-1}{}^{*}\, a_q(j) \Big)^{\pm 1} \Big( 1 + \Big( B_\pm(n) \pm \sum_{j=1}^{n}{}^{*}\, b_q\big(j - {\textstyle{0 \atop 1}}\big) \Big) \frac{1}{z} + O\big(\tfrac{1}{z^2}\big) \Big), \tag{4.3} \]
where
\[ A_+(n) = \prod_{j=n}^{\infty} \frac{a(j)}{a_q(j)}, \qquad A_-(n) = \prod_{j=-\infty}^{n-1} \frac{a(j)}{a_q(j)}, \]
\[ B_+(n) = \sum_{m=n+1}^{\infty} (b_q(m) - b(m)), \qquad B_-(n) = \sum_{m=-\infty}^{n-1} (b_q(m) - b(m)). \tag{4.4} \]

Proof. The proof can be done as in the periodic case (see e.g., [17, 25 or 28], Sect. 7.5). The only problem is to show that the second solution at a band edge grows at most linearly. In the periodic case this follows from Floquet theory; here we just use Lemma 3.1.

From this result we obtain a complete characterization of the spectrum of $H$.

Theorem 4.3. Assume (H.4.1). Then we have $\sigma_{ess}(H) = \sigma(H_q)$, the point spectrum of $H$ is finite and confined to the spectral gaps of $H_q$, that is, $\sigma_p(H) \subset \mathbb{R} \setminus \sigma(H_q)$. Furthermore, the essential spectrum of $H$ is purely absolutely continuous.

Proof. Again the proof can be done as in the periodic case (see e.g., [25 or 28], Sect. 7.5).

5. The Transformation Operator

We define the kernel of the transformation operator as the Fourier coefficients of the Jost solutions $\psi_\pm(w, n)$ with respect to the orthonormal system given in Lemma 3.6, $\{\psi_{q,\pm}(w, n)\}_{n \in \mathbb{Z}}$,
\[ K_\pm(n, m) := \frac{1}{2\pi i} \int_{|w|=1} \psi_\pm(w, n)\, \psi_{q,\mp}(w, m)\, d\omega(w). \tag{5.1} \]
By the Cauchy theorem, this integral equals the residue at $w = 0$,
\[ K_\pm(n, m) = \mathrm{Res}_0\, \frac{1}{w}\, \psi_\pm(w, n)\, \psi_{q,\mp}(w, m). \tag{5.2} \]
In particular, since $\psi_\pm(w, n)\psi_{q,\mp}(w, m) = O(w^{\pm(n-m)})$, we conclude
\[ K_\pm(n, m) = 0, \qquad \pm(m - n) < 0. \tag{5.3} \]

Lemma 5.1. Assume H.4.1. The Jost solutions $\psi_\pm(w, n)$ can be represented as
\[ \psi_\pm(w, n) = \sum_{m=n}^{\pm\infty} K_\pm(n, m)\, \psi_{q,\pm}(w, m), \qquad |w| = 1, \tag{5.4} \]
where the kernels $K_\pm(n, .)$ satisfy $K_\pm(n, m) = 0$ for $\pm m < \pm n$ and
\[ |K_\pm(n, m)| \le C \sum_{j=[\frac{m+n}{2}] \pm 1}^{\pm\infty} \Big( |a(j) - a_q(j)| + |b(j) - b_q(j)| \Big), \qquad \pm m > \pm n. \tag{5.5} \]

The constant C depends only on Hq and the value of the sum in (4.1). Proof. We prove the estimate for K+ (n, m) and omit “+” and “z” whenever possible. Define ϕ(n) = ψ(n)K(n, n)−1 , then ϕ fulfills ∞

ϕ(n) = ψq (n) +

J (n, m)ϕ(m),

(5.6)

sq (z, n, m) sq (z, n, m − 1) ˜ + b(m) aq (m − 1) aq (m)

(5.7)

m=n+1

where J (z, n, m) = a(m ˜ − 1) with the abbreviation

a(m)2 − aq (m), aq (m)

a(m) ˜ =

˜ b(m) = b(m) − bq (m).

(5.8)

On the other hand, ϕ(n) is given by ϕ(n) =

∞

κ(n, m) =

κ(n, m)ψq (m),

m=n

K(n, m) , K(n, n)

therefore ∞

∞

κ(n, m)ψq (m) =

m=n

∞

J (n, m)ψq (m)+

m=n+1

∞

J (n, m)κ(m, l)ψq (l).

m=n+1 l=m+1

(5.9) Multiplying both sides of (5.9) by ψq,− (k) and integrating over the unit circle yields κ(n, k) =

∞ m=n+1

(n, m, m, k) +

∞

∞

m=n+1 l=n+1

(n, m, l, k)κ(m, l),

(5.10)



where (n, m, l, k) =

1 2πi

|w|=1

J (w, n, m)ψq,+ (w, l)ψq,− (w, k)dω(w).

(5.11)

Using [28], (1.50), sq (n, m) ψq,+ (m)ψq,− (n) − ψq,+ (n)ψq,− (m) = , a(m) W (ψq,+ , ψq,− )

(5.12)

˜ (n, m, l, k) = b(m) ˜ q (n, m, l, k) + a(m) q (n, m − 1, l, k)

(5.13)

we obtain

with q (n, m, l, k) = 0 (m, n, l, k) − 0 (n, m, l, k), ψq,+ (w, n)ψq,− (w, m)ψq,+ (w, l)ψq,− (w, k) 1 0 (n, m, l, k) = dω(w) 2πi w(γ ) W (ψq,+ (w), ψq,− (w))

ψq,+ (z, n)ψq,− (z, m)ψq,+ (z, l)ψq,− (z, k) (z − µj ) 1 = dz 1/2 2πi γ W (ψq,+ (z), ψq,− (z)) R2g+2 (z) ψq,+ (z, n)ψq,− (z, m)ψq,+ (z, l)ψq,− (z, k) 1 dz. (5.14) = 2πi γ W (ψq,+ (z), ψq,− (z))2 Here γ is a path on the upper sheet encircling the spectrum. The integrand of 0 is meromorphic on the Riemann surface M with poles of order one at Ej and poles of order O(z±(n−m+l−k)−2 ) near ∞± (there are no poles at the Dirichlet eigenvalues µj ). We apply the residue theorem twice, first on the side of γ including ∞+ , then on the other side including the spectrum (and thus ∞− ), 0 (n, m, l, k) = −Res∞+ = Res∞−

ψq,+ (n)ψq,− (m)ψq,+ (l)ψq,− (k) W (ψq,+ , ψq,− )2 2g+1

ψ (n)ψ (m)ψ (l)ψ (k) q,+ q,− q,+ q,− . + ResEj W (ψq,+ , ψq,− )2 j =0

(5.15) The order of the poles at ∞± implies 0 (n, m, l, k) =

 2g+1  

j =0

ResEj

ψq,+ (n)ψq,− (m)ψq,+ (l)ψq,− (k) W (ψq,+ ,ψq,− )2

0

n−m+l−k n. Note that the residue at Ej is given by

g 2 =1 (Ej − µ )2

(5.19) ψq (Ej , n)ψq (Ej , m)ψq (Ej , l)ψq (Ej , k). =j (Ej − E ) Now we obtain for κ(n, k), ∞

κ(n, k) =

(n, m, m, k) +

m=n+1 ∞

=

∞

∞

(n, m, l, k)κ(m, l)

m=n+1 l=m+1

(n, m, m, k) +

∞

m+k−n−1

(n, m, l, k)κ(m, l),

m=n+1 l=n+k−m+1

m=[ n+k 2 ]+1

(5.20) since (n, m, m, k) = 0 only if |m − k| < m − n implying m > n+k 2 . In the third sum of (5.20) we need that |m + δ − k| < m − n for δ ≥ 1 which yields δ < k − n and δ > n + k − 2m. Two remarks might be in order: m + k − n − 1 ≥ n + k − m + 1 since m − n ≥ n − m + 2, and the starting point l = n + k − m + 1 of the third sum actually has a lower limit, namely m ≤ n+k 2 , since we require l ≥ m + 1 for κ(m, l) = 0, 1. Note that ∞ ∞ ˜ (n, m, m, k) ≤ D |b(m) + a(m)| ˜ =: q( ˆ n+k 2 ),

m=[ n+k 2 ]+1 m+k−n−1

m=[ n+k 2 ]+1

˜ |(n, m, l, k)| ≤ D (m − n − 1)|b(m) + a(m)| ˜ =: c(m) ˆ ∈ 1 (Z),

l=n+k−m+1

where D is the estimate provided by (5.18), (5.19). We set up the following iteration procedure κ0 (n, k) =

∞

(n, m, m, k),

m=[ n+k 2 ]+1

κj (n, k) =

∞

m+k−n−1

m=n+1 l=n+k−m+1

(n, m, l, k)κj −1 (m, l).

(5.21)



Then using induction one has

∞

|κj (n, k)| ≤

q( ˆ n+k 2 )

j

ˆ m=n+1 c(m)

(5.22)

, j! and hence the iteration converges and implies the estimate ∞ ∞ |κ(n, k)| = κj (n, k) ≤ q( ˆ n+k ) exp c(m) ˆ . 2 j =0

(5.23)

m=n+1

Associated with $K_\pm(n, m)$ is the operator
\[ (K_\pm f)(n) = \sum_{m=n}^{\pm\infty} K_\pm(n, m) f(m), \qquad f \in \ell^\infty_\pm(\mathbb{Z}, \mathbb{C}), \tag{5.24} \]
which acts as a transformation operator for the pair $\tau$, $\tau_q$.

Theorem 5.2. Let $\tau_q$ and $\tau$ be the quasi-periodic and perturbed Jacobi difference expression, respectively. Then
\[ \tau K_\pm f = K_\pm \tau_q f, \qquad f \in \ell^\infty_\pm(\mathbb{Z}, \mathbb{C}). \tag{5.25} \]

Proof. It suffices to show that $H K_\pm = K_\pm H_q$. Indeed,
\[ H K_\pm(n, m) = \frac{1}{2\pi i} \int_{|w|=1} H\psi_\pm(w, n)\, \psi_{q,\mp}(w, m)\, d\omega(w) = \frac{1}{2\pi i} \int_{|w|=1} \lambda(w)\, \psi_\pm(w, n)\, \psi_{q,\mp}(w, m)\, d\omega(w) = \frac{1}{2\pi i} \int_{|w|=1} \psi_\pm(w, n)\, H_q \psi_{q,\mp}(w, m)\, d\omega(w). \tag{5.26} \]

Lemma 5.3. For $n \in \mathbb{Z}$ we have
\[ \frac{a(n)}{a_q(n)} = \frac{K_+(n+1, n+1)}{K_+(n, n)} = \frac{K_-(n, n)}{K_-(n+1, n+1)}, \tag{5.27} \]
\[ b(n) - b_q(n) = a_q(n) \frac{K_+(n, n+1)}{K_+(n, n)} - a_q(n-1) \frac{K_+(n-1, n)}{K_+(n-1, n-1)} = a_q(n-1) \frac{K_-(n, n-1)}{K_-(n, n)} - a_q(n) \frac{K_-(n+1, n)}{K_-(n+1, n+1)}. \]

Proof. Consider the equation of the transformation operator $H K_\pm = K_\pm H_q$, which is equivalent to (cf. (5.26))
\[ a(n-1) K_\pm(n-1, m) + b(n) K_\pm(n, m) + a(n) K_\pm(n+1, m) = a_q(m-1) K_\pm(n, m-1) + b_q(m) K_\pm(n, m) + a_q(m) K_\pm(n, m+1). \]
Evaluating at $m = n$ we obtain the first equation and at $m = n \mp 1$ the second.

In particular, observe
\[ K_\pm(n, n) = \frac{1}{A_\pm(n)}, \qquad K_\pm(n, n \pm 1) = \frac{B_\pm(n)}{A_\pm(n)\, a_q\big(n - {\textstyle{0 \atop 1}}\big)}. \tag{5.28} \]

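6. The Scattering Matrix

Let $H_q$ be a given quasi-periodic Jacobi operator and $H$ a perturbation of $H_q$ satisfying Hypothesis H.4.1. To set up scattering theory for the pair $(H, H_q)$ we proceed as usual. The Wronskian of our Jost functions can be evaluated as $n \to \pm\infty$ and is given by
\[ W(\psi_\pm(\lambda), \overline{\psi_\pm(\lambda)}) = W_q(\psi_{q,\pm}(\lambda), \psi_{q,\mp}(\lambda)) = \mp \frac{R_{2g+2}^{1/2}(\lambda)}{\prod_{j=1}^{g} (\lambda - \mu_j)}, \qquad \lambda \in \sigma(H_q). \tag{6.1} \]
Hence $\psi_\pm(\lambda)$, $\overline{\psi_\pm(\lambda)}$ are linearly independent for $\lambda$ in the interior of $\sigma(H_q)$ and we consider the scattering relations
\[ \psi_\pm(\lambda, n) = \alpha(\lambda)\, \overline{\psi_\mp(\lambda, n)} + \beta_\mp(\lambda)\, \psi_\mp(\lambda, n), \qquad \lambda \in \sigma(H_q), \tag{6.2} \]
where
\[ \alpha(\lambda) = \frac{W(\psi_\mp(\lambda), \psi_\pm(\lambda))}{W(\psi_\mp(\lambda), \overline{\psi_\mp(\lambda)})} = \frac{\prod_{j=1}^{g} (\lambda - \mu_j)}{R_{2g+2}^{1/2}(\lambda)}\, W(\psi_-(\lambda), \psi_+(\lambda)), \]
\[ \beta_\pm(\lambda) = \frac{W(\psi_\mp(\lambda), \overline{\psi_\pm(\lambda)})}{W(\psi_\pm(\lambda), \overline{\psi_\pm(\lambda)})} = \mp \frac{\prod_{j=1}^{g} (\lambda - \mu_j)}{R_{2g+2}^{1/2}(\lambda)}\, W(\psi_\mp(\lambda), \overline{\psi_\pm(\lambda)}). \tag{6.3} \]
While $\alpha(\lambda)$ is only defined for $\lambda \in \sigma(H_q)$, (6.3) may be used as a definition for $\lambda \in \mathbb{C} \setminus \{E_j\}$. Therefore $\alpha(w)$ can be continued as a holomorphic function on $\mathcal{W}$ and it is continuous up to the boundary except possibly at the band edges.

Remark 6.1. Note that $\alpha(\lambda)$ does not depend on the normalization of $\psi_\pm(\lambda)$ at the base point $n_0 = 0$ whereas $\beta_\pm = \beta_{\pm,0}$ does. Using $\psi_\pm(z, n, n_0) = \psi_{q,\pm}(z, n_0)^{-1} \psi_\pm(z, n)$ and
\[ W(\psi_+(\lambda), \psi_-(\lambda)) = \prod_{j=1}^{g} \frac{\lambda - \mu_j(n_0)}{\lambda - \mu_j}\, W(\psi_+(\lambda, ., n_0), \psi_-(\lambda, ., n_0)) \]
we see
\[ \beta_{\pm,0}(\lambda) = \frac{\psi_{q,\mp}(\lambda, n_0)}{\psi_{q,\pm}(\lambda, n_0)}\, \beta_{\pm,n_0}(\lambda). \tag{6.4} \]

A direct calculation shows
\[ \alpha(\overline{w}) = \overline{\alpha(w)}, \qquad \beta_\pm(\overline{w}) = \overline{\beta_\pm(w)} = -\beta_\mp(w), \tag{6.5} \]
and the Plücker identity (cf. [28], (2.169)) implies
\[ |\alpha(w)|^2 = 1 + |\beta_\pm(w)|^2, \qquad |w| = 1. \tag{6.6} \]
We will denote the eigenvalues of $H$ by
\[ \sigma_p(H) = \{\rho_j\}_{j=1}^{q}. \tag{6.7} \]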

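Our next aim is to study the behavior of $\alpha(\lambda)$ at the eigenvalues $\rho_j$, therefore we modify the Jost solutions $\psi_\pm(\lambda, n)$ according to their poles at $\mu_j$ and define the following eigenfunctions $\hat\psi_\pm(\lambda, .)$
\[ \hat\psi_+(\lambda, .) = \prod_{\mu_l \in M_+} (\lambda - \mu_l)\, \psi_+(\lambda, .), \qquad \hat\psi_-(\lambda, .) = \prod_{\mu_l \in M_- \setminus \{E_j\}} (\lambda - \mu_l)\, \psi_-(\lambda, .). \tag{6.8} \]
Define $\hat\psi_{q,\pm}(\lambda, .)$ accordingly. Moreover, $\hat\psi_\pm(\rho_j, n) = c_j^\pm \hat\psi_\mp(\rho_j, n)$ with $c_j^+ c_j^- = 1$. The norming constants $\gamma_{\pm,j}$ are defined by
\[ \frac{1}{\gamma_{\pm,j}} = \sum_{m \in \mathbb{Z}} |\hat\psi_\pm(\rho_j, m)|^2. \tag{6.9} \]
To compute the derivative of $\alpha(\lambda)$ at $\rho_j$, note that
\[ \alpha(\lambda) = \frac{W(\hat\psi_-(\lambda), \hat\psi_+(\lambda))}{R_{2g+2}^{1/2}(\lambda)}. \tag{6.10} \]
By virtue of [28], Lemma 2.4,
\[ \frac{d}{d\lambda} W(\hat\psi_-(\lambda), \hat\psi_+(\lambda)) \Big|_{\rho_j} = -\sum_{k \in \mathbb{Z}} \hat\psi_-(\rho_j, k)\, \hat\psi_+(\rho_j, k) = -\frac{1}{c_j^\pm \gamma_{\pm,j}}. \tag{6.11} \]
Therefore
\[ \frac{d}{d\lambda} \alpha(\lambda) \Big|_{\rho_j} = \frac{\frac{d}{d\lambda} W(\hat\psi_-(\lambda), \hat\psi_+(\lambda)) \big|_{\rho_j}}{R_{2g+2}^{1/2}(\rho_j)} = \frac{-1}{c_j^\pm \gamma_{\pm,j}\, R_{2g+2}^{1/2}(\rho_j)}. \tag{6.12} \]
From (6.12) we obtain a connection between the left and right norming constants
\[ \gamma_{+,j}\, \gamma_{-,j} = \frac{1}{(\alpha'(\rho_j))^2\, R_{2g+2}(\rho_j)}. \tag{6.13} \]
As a last preparation, we study the behavior of $\alpha(w)$ as $w \to 0$. By (4.3),
\[ W(\psi_-(w), \psi_+(w)) = \frac{1}{A} \big( \tilde a w^{-1} + O(w) \big) \tag{6.14} \]
with $A = A_-(0) A_+(0)$ and
\[ \frac{R_{2g+2}^{1/2}(\lambda(w))}{\prod_{j=1}^{g} (\lambda(w) - \lambda_j)} = \tilde a w^{-1} + O(1), \tag{6.15} \]
therefore $\alpha^{-1}(w)$ is bounded at 0 with
\[ \alpha(0) = \prod_{j=-\infty}^{\infty} \frac{a_q(j)}{a(j)}. \tag{6.16} \]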

We now define the scattering matrix
\[ S(w) = \begin{pmatrix} T(w) & R_-(w) \\ R_+(w) & T(w) \end{pmatrix}, \qquad |w| = 1, \tag{6.17} \]
where $T(w) := \alpha^{-1}(w)$ and $R_\pm(w) := \alpha^{-1}(w)\beta_\pm(w)$ are called transmission and reflection coefficients. Equations (6.5) and (6.6) imply

Lemma 6.2. The scattering matrix $S(w)$ is unitary. The coefficients $T(w)$, $R_\pm(w)$ are bounded for $|w| = 1$, continuous for $|w| = 1$ except at possibly $w_l = w(E_l)$, fulfill
\[ |T(w)|^2 + |R_\pm(w)|^2 = 1, \qquad |w| = 1, \tag{6.18} \]
\[ T(w)\overline{R_+(w)} + \overline{T(w)} R_-(w) = 0, \qquad |w| = 1, \tag{6.19} \]
and $T(\overline{w}) = \overline{T(w)}$, $R_\pm(\overline{w}) = \overline{R_\pm(w)}$ for $|w| = 1$. Moreover, $R_{2g+2}^{1/2}(w) T(w)^{-1}$ is continuous (in particular $T(w)$ can only vanish at $w_l$) and
\[ \lim_{w \to w_l} R_{2g+2}^{1/2}(w) \frac{R_\pm(w) + 1}{T(w)} = 0, \quad w_l \ne w(\mu_j), \qquad \lim_{w \to w_l} R_{2g+2}^{1/2}(w) \frac{R_\pm(w) - 1}{T(w)} = 0, \quad w_l = w(\mu_j). \tag{6.20} \]
The transmission coefficient $T(w)$ has a meromorphic continuation to $\mathcal{W}$ with simple poles at $w(\rho_j)$,
\[ \big( \mathrm{Res}_{\rho_j} T(\lambda) \big)^2 = \gamma_{+,j}\, \gamma_{-,j}\, R_{2g+2}(\rho_j). \tag{6.21} \]
In addition, $T(z) \in \mathbb{R}$ as $z \in \mathbb{R} \setminus \sigma(H_q)$ and
\[ T(0) = \frac{1}{K_+(n, n) K_-(n, n)} = \prod_{j=-\infty}^{\infty} \frac{a(j)}{a_q(j)}, \tag{6.22} \]
where $K_\pm(n, n)$ are the coefficients of the transformation operators.

Proof. To show (6.20) we use the definition (6.3),
\[ R_{2g+2}^{1/2}(\lambda) \frac{R_\pm(\lambda) + 1}{T(\lambda)} = \prod_{j=1}^{g} (\lambda - \mu_j) \Big( W(\psi_-(\lambda), \psi_+(\lambda)) \mp W(\psi_\mp(\lambda), \overline{\psi_\pm(\lambda)}) \Big). \]
There are two cases to distinguish: If $\mu_j \ne E_l$ then $\psi_\pm$ are continuous and real at $\lambda = E_l$ and the two Wronskians cancel. Otherwise, if $\mu_j = E_l$ they are purely imaginary (by property (B2) of the Jost functions) and the two terms are equal in the limit and add up.

The sets
\[ S_\pm(H) = \{R_\pm(w),\ |w| = 1;\ (\rho_j, \gamma_{\pm,j}),\ 1 \le j \le q\} \tag{6.23} \]
are called left/right scattering data for $H$. First we want to show that the transmission coefficient can be reconstructed from either left or right scattering data.
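The algebra behind the unitarity statement of Lemma 6.2 can be illustrated with a few lines of code. The sketch below takes hypothetical sample values of T and R_+ on the unit circle with |T|^2 + |R_+|^2 = 1, defines R_- through relation (6.19), and forms S S^* for the matrix (6.17). The function names are illustrative only, and the sample numbers carry no meaning beyond satisfying (6.18).

```python
import cmath


def scattering_matrix(T, R_plus):
    """Build S = [[T, R_-], [R_+, T]] as in (6.17), with R_- fixed by (6.19)."""
    # (6.19): T * conj(R_+) + conj(T) * R_- = 0  =>  R_- = -conj(R_+) * T / conj(T)
    R_minus = -R_plus.conjugate() * T / T.conjugate()
    return [[T, R_minus], [R_plus, T]]


def times_adjoint(S):
    """Return the product S S^* for a 2x2 complex matrix given as nested lists."""
    return [[sum(S[i][k] * S[j][k].conjugate() for k in range(2))
             for j in range(2)]
            for i in range(2)]


# Hypothetical sample values with |T|^2 + |R_+|^2 = 0.36 + 0.64 = 1.
T = 0.6 * cmath.exp(0.3j)
R_plus = 0.8 * cmath.exp(-1.1j)
P = times_adjoint(scattering_matrix(T, R_plus))

if __name__ == "__main__":
    for row in P:
        print(row)  # should be the 2x2 identity up to rounding
```

The off-diagonal entries of S S^* are exactly the left-hand side of (6.19) and its complex conjugate, which is why (6.18) and (6.19) together are equivalent to unitarity.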



Let g(w, w0 ) be the Green function associated with W and let ∂g µ(w, w0 )dw0 = (w, reiθ ) − eiθ dθ, w0 = eiθ , (6.24) r=1 ∂r be the corresponding harmonic measure on the boundary (see, e.g., [30]). Since W0 is simply connected, we can choose a function h(w, v) such that g(w, ˆ w0 ) = g(w, w0 ) + ih(w, w0 ) is analytic in W0 . Clearly gˆ is only well-defined up to an imaginary constant and it will not be analytic on W\{0} in general. Similarly we can find a corresponding ν(w, w0 ) and set µ(w, ˆ w0 ) = µ(w, w0 ) + iν(w, w0 ). Theorem 6.3. Either one of the sets S± (H ) determines the other and T (w) via the Poisson-Jensen type formula q 1 T (w) = exp g(w, ˆ w(ρj )) exp ln(1 − |R± (w0 )|2 )µ(w, ˆ w0 )dw0 , 2 |w|=1 j =1

(6.25)

where the constant of $\hat g$ has to be chosen such that $T(0) > 0$, and
\[ \frac{R_-(w)}{\overline{R_+(w)}} = -\frac{T(w)}{\overline{T(w)}}, \qquad \gamma_{+,j}\, \gamma_{-,j} = \frac{\big( \mathrm{Res}_{\rho_j} T(\lambda) \big)^2}{\prod_{l=0}^{2g+1} (\rho_j - E_l)}. \]

Proof. It suffices to prove the formula for $T(w)$, since evaluating the residua provides $\gamma_{\pm,j}$, together with $\{\lambda_l\}$, $\{E_l\}$. The formula for $T(w)$ holds by [32], Theorem 1, at least when taking absolute values. Since both sides are analytic and have equal absolute values, they can only differ by a constant of absolute value one. But both sides are positive at $w = 0$ and hence this constant is one.

Note that neither the Blaschke factors nor the outer function in (6.25) are single valued on $\mathcal{W}$ in general. In particular, the eigenvalues cannot be chosen arbitrarily, which was first observed in [21].

7. The Gel’fand-Levitan-Marchenko Equations

In this section we want to derive a procedure which allows the reconstruction of the Jacobi operator $H$ with asymptotically quasi-periodic coefficients from its scattering data $S_\pm(H)$. This will be achieved by deriving an equation for $K_\pm(n, m)$ which is generally known as Gel’fand-Levitan-Marchenko equation.

Since $K_\pm(n, m)$ are essentially the Fourier coefficients of the Jost solutions $\psi_\pm(w, n)$ we compute the Fourier coefficients of the scattering relations (6.2). Therefore we multiply
\[ T(w)\, \psi_\mp(w, n) = \overline{\psi_\pm(w, n)} + R_\pm(w)\, \psi_\pm(w, n) \tag{7.1} \]
by $(2\pi i)^{-1} \psi_{q,\pm}(w, m)\, d\omega$, where $\pm m \ge \pm n$, and integrate around the unit circle. First we evaluate the right-hand side of (7.1) using (5.1),
\[ \frac{1}{2\pi i} \int_{|w|=1} \overline{\psi_+(w, n)}\, \psi_{q,+}(w, m)\, d\omega(w) = K_+(n, m), \tag{7.2} \]
\[ \frac{1}{2\pi i} \int_{|w|=1} R_+(w)\, \psi_+(w, n)\, \psi_{q,+}(w, m)\, d\omega(w) = \sum_{l=n}^{\infty} K_+(n, l)\, \tilde F^+(l, m), \]
where
\[ \tilde F^+(l, m) = \frac{1}{2\pi i} \int_{|w|=1} R_+(w)\, \psi_{q,+}(w, l)\, \psi_{q,+}(w, m)\, d\omega(w). \tag{7.3} \]
Note that $\tilde F^+(l, m) = \tilde F^+(m, l)$ is real. To evaluate the left hand side of (7.1) we use the residue theorem. The only poles are at the eigenvalues and at 0 if $n = m$, hence
\[ \frac{1}{2\pi i} \int_{|w|=1} T(w)\, \psi_-(w, n)\, \psi_{q,+}(w, m)\, d\omega(w) = \frac{\delta(n, m)}{K_+(n, n)} + \sum_{j=1}^{q} \mathrm{Res}_{\rho_j} \frac{T(\lambda)\, \hat\psi_-(\lambda, n)\, \hat\psi_{q,+}(\lambda, m)}{R_{2g+2}^{1/2}(\lambda)}. \]
Here $\delta(n, m)$ is one for $m = n$ and zero else. By (6.12) the residua at the eigenvalues are given by
\[ \mathrm{Res}_{\rho_j} \frac{T(\lambda)\, \hat\psi_-(\lambda, n)\, \hat\psi_{q,+}(\lambda, m)}{R_{2g+2}^{1/2}(\lambda)} = -\gamma_{+,j}\, \hat\psi_+(\rho_j, n)\, \hat\psi_{q,+}(\rho_j, m). \tag{7.4} \]
Collecting all terms yields
\[ K_\pm(n, m) + \sum_{l=n}^{\pm\infty} K_\pm(n, l)\, \tilde F^\pm(l, m) = \frac{\delta(n, m)}{K_\pm(n, n)} - \sum_{j=1}^{q} \gamma_{\pm,j}\, \hat\psi_\pm(\rho_j, n)\, \hat\psi_{q,\pm}(\rho_j, m) \tag{7.5} \]
and we have thus proved the following result.

Theorem 7.1. The kernel $K_\pm(n, m)$ of the transformation operator satisfies the Gel’fand-Levitan-Marchenko equation,
\[ K_\pm(n, m) + \sum_{l=n}^{\pm\infty} K_\pm(n, l)\, F^\pm(l, m) = \frac{\delta(n, m)}{K_\pm(n, n)}, \qquad \pm m \ge \pm n, \tag{7.6} \]
where
\[ F^\pm(l, m) = \tilde F^\pm(l, m) + \sum_{j=1}^{q} \gamma_{\pm,j}\, \hat\psi_{q,\pm}(\rho_j, l)\, \hat\psi_{q,\pm}(\rho_j, m). \tag{7.7} \]

Defining the Gel’fand-Levitan-Marchenko operator
\[ F_n^\pm f(j) = \sum_{l=0}^{\infty} F^\pm(n \pm l, n \pm j)\, f(l), \qquad f \in \ell^2(\mathbb{N}_0, \mathbb{C}), \tag{7.8} \]
yields that the Gel’fand-Levitan-Marchenko equation is equal to
\[ (1 + F_n^\pm)\, K_\pm(n, n \pm .) = (K_\pm(n, n))^{-1} \delta_0. \tag{7.9} \]

Our next aim is to study the Gel’fand-Levitan-Marchenko operator $F_n^\pm$ in more detail. The structure of the Gel’fand-Levitan-Marchenko equation suggests that the estimate (5.5) for $K_\pm(n, m)$ should imply a similar estimate for $F^\pm(n, m)$.

Lemma 7.2.
\[ |F^\pm(n, m)| \le C \sum_{j=[\frac{n+m}{2}] \pm 1}^{\pm\infty} \Big( |a(j) - a_q(j)| + |b(j) - b_q(j)| \Big), \tag{7.10} \]
where the constant $C$ is of the same nature as in (5.5).

Proof. We abbreviate the estimate (5.5) for $K_+(n, m)$ by
\[ |K_+(n, m)| \le C\, C_+(n + m), \tag{7.11} \]
where
\[ C_+(n + m) = \sum_{j=[\frac{n+m}{2}] + 1}^{\infty} c(j), \qquad c(j) = |a(j) - a_q(j)| + |b(j) - b_q(j)|. \]

Note that $C_+(n + 1) \le C_+(n)$. Moreover, $C_+(n) \in \ell^1_+(\mathbb{Z})$ since the summation by parts formula (e.g. [28], (1.18))
\[ \sum_{m=n}^{N} g(m)(f(m + 1) - f(m)) = g(N) f(N + 1) - g(n - 1) f(n) + \sum_{m=n}^{N} (g(m - 1) - g(m)) f(m) \tag{7.12} \]
implies for $g(m) = m$, $f(m) = C_+(m)$ that
\[ \sum_{m=n}^{\infty} m\, c(m) = (n - 1)\, C_+(n) + \sum_{m=n}^{\infty} C_+(m), \tag{7.13} \]

where we used limn→∞ n C+ (n + 1) ≤ limn→∞ ∞ m=n m c(m) = 0. Solving the GLMequation (7.6) for F + (n, m), m > n, we obtain ∞ 1 K+ (n, l)F + (l, m) |K+ (n, m)| + |F + (n, m)| ≤ K+ (n, n) l=n+1 ∞ + C+ (n + l) F (l, m) , ≤ C1 (n) C+ (n + m) + l=n+1

(n, n)|−1

→ C for n → ∞ (see (5.28)). For n large enough, i.e. where C1 (n) = C |K+ C1 (n)C+ (2n) < 1, we apply the discrete Gronwall-type inequality [28], Lemma 10.8, ∞ C1 (l)C+ (l + m)C+ (n + l) + |F (n, m)| ≤ C1 (n) C+ (n + m) +

l k=n+1 (1 − C1 (k)C+ (n + k)) l=n+1 ∞ C1 (k)C+ (n + l) , ≤ C1 (n)C+ (n + m) 1 +

l k=n+1 (1 − C1 (n)C+ (n + k)) l=n+1 (7.14) which finishes the proof.
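The summation by parts formula (7.12) carries most of the weight in this proof and again in Lemma 7.3, so a quick exact check may be useful. The following is a minimal sketch: the helper names `lhs` and `rhs`, the test sequence `f(m) = 1/m^2`, and the summation range are illustrative assumptions and are not taken from the paper.

```python
from fractions import Fraction


def lhs(g, f, n, N):
    """Left-hand side of (7.12): sum_{m=n}^{N} g(m) (f(m+1) - f(m))."""
    return sum(g(m) * (f(m + 1) - f(m)) for m in range(n, N + 1))


def rhs(g, f, n, N):
    """Right-hand side of (7.12): boundary terms plus the shifted sum."""
    boundary = g(N) * f(N + 1) - g(n - 1) * f(n)
    shifted = sum((g(m - 1) - g(m)) * f(m) for m in range(n, N + 1))
    return boundary + shifted


# g(m) = m as in the proof; f is a hypothetical decaying sequence
# standing in for C_+(m). Exact rational arithmetic avoids rounding.
g = lambda m: Fraction(m)
f = lambda m: Fraction(1, m * m)

n, N = 3, 50
left, right = lhs(g, f, n, N), rhs(g, f, n, N)
print(left == right)
```

With `g(m) = m` the shifted sum reduces to minus the sum of `f(m)`, which is the mechanism behind (7.13).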



Furthermore,

Lemma 7.3. Let $F^\pm(n,m)$ be solutions of the Gel’fand-Levitan-Marchenko equation. Then
\[
\sum_{n=n_0}^{\pm\infty} |n|\,\big| F^\pm(n,n) - F^\pm(n \pm 1, n \pm 1) \big| < \infty, \tag{7.15}
\]
\[
\sum_{n=n_0}^{\pm\infty} |n|\,\big| a_q(n)F^\pm(n,n+1) - a_q(n-1)F^\pm(n-1,n) \big| < \infty. \tag{7.16}
\]

Proof. We first prove (7.16) for $F^+$. Lemma 5.3 implies
\[
b(n) - b_q(n) = a_q(n)\kappa_{+,1}(n) - a_q(n-1)\kappa_{+,1}(n-1), \tag{7.17}
\]
where
\[
\kappa_{+,j}(n) := \kappa_+(n,n+j) := \frac{K_+(n,n+j)}{K_+(n,n)}. \tag{7.18}
\]
Abbreviate $F_j^+(n) = F^+(n+j,n)$. With this notation, the GLM-equation (7.6) reads
\[
\kappa_{+,l}(n) + F_l^+(n) + \sum_{j=1}^{\infty} \kappa_{+,j}(n)\,F_{j-l}^+(n+l) = \frac{\delta(l,0)}{K_+(n,n)^2}, \qquad l \ge 0. \tag{7.19}
\]
Insert the GLM-equation for $F^+(n,n+1)$, $F^+(n-1,n)$ (recall $F^+(n,m) = F^+(m,n)$):
\[
a_q(n)F_1^+(n) - a_q(n-1)F_1^+(n-1) = -a_q(n)\kappa_{+,1}(n) + a_q(n-1)\kappa_{+,1}(n-1)
- \sum_{j=1}^{\infty} \Big( a_q(n)\kappa_{+,j}(n)F_{j-1}^+(n+1) - a_q(n-1)\kappa_{+,j}(n-1)F_{j-1}^+(n) \Big). \tag{7.20}
\]

Since $-a_q(n)\kappa_{+,1}(n) + a_q(n-1)\kappa_{+,1}(n-1) = b_q(n) - b(n)$ the only interesting part is the sum. For $N, J < \infty$,
\[
\sum_{n=n_0}^{N} n \sum_{j=1}^{J} \Big( a_q(n)\kappa_{+,j}(n)F_{j-1}^+(n+1) - a_q(n-1)\kappa_{+,j}(n-1)F_{j-1}^+(n) \Big)
\]
\[
= \sum_{j=1}^{J} \sum_{n=n_0}^{N} n \Big( a_q(n)\kappa_{+,j}(n)F_{j-1}^+(n+1) - a_q(n-1)\kappa_{+,j}(n-1)F_{j-1}^+(n) \Big)
\]
\[
= \sum_{j=1}^{J} \Big( N a_q(N)\kappa_{+,j}(N)F_{j-1}^+(N+1) - (n_0-1)a_q(n_0-1)\kappa_{+,j}(n_0-1)F_{j-1}^+(n_0) + \sum_{n=n_0}^{N} (-1)\,a_q(n-1)\kappa_{+,j}(n-1)F_{j-1}^+(n) \Big), \tag{7.21}
\]
where we used the summation by parts. Estimates (7.11), (7.14) imply for the first summand
\[
\Big| \sum_{j=1}^{J} N a_q(N)\kappa_{+,j}(N)F_{j-1}^+(N+1) \Big| \le |N|\,a_q(N)\,\tilde C \sum_{j=1}^{J} C_+(2N+j)\,C_+(2N+j+1) \le |N|\,a_q(N)\,\hat C\,C_+(2N+1),
\]
which holds uniformly in $J$, and (compare (7.13))
\[
\lim_{N\to\infty} N a_q(N)\,\hat C\,C_+(2N+1) = 0. \tag{7.22}
\]
Moreover,
\[
\lim_{N,J\to\infty} \Big| \sum_{j=1}^{J} \sum_{n=n_0}^{N} a_q(n-1)\kappa_{+,j}(n-1)F_{j-1}^+(n) \Big|
\le \lim_{N,J\to\infty} \sum_{j=1}^{J} \sum_{n=n_0}^{N} \big| a_q(n-1)\kappa_{+,j}(n-1)F_{j-1}^+(n) \big|
\le \sum_{j=1}^{\infty} \sum_{n=n_0}^{\infty} a_q(n-1)\,\tilde C\,C_+(2n+j)\,C_+(2n+j+1) < \infty.
\]
Therefore $|n|\,|a_q(n)F^+(n,n+1) - a_q(n-1)F^+(n-1,n)| \in \ell^1_+(\mathbb{Z})$ as desired. To apply Lemma 5.3 for $F^-$ use the symmetry property $F^-(n,m) = F^-(m,n)$.

For (7.15), inserting the GLM-equation yields
\[
F^+(n,n) - F^+(n+1,n+1) = K_+^{-2}(n,n) - K_+^{-2}(n+1,n+1) + \sum_{j=1}^{\infty} \Big( \kappa_{+,j}(n+1)F_j^+(n+1) - \kappa_{+,j}(n)F_j^+(n) \Big).
\]
By (5.28),
\[
\big| K_+^{-2}(n,n) - K_+^{-2}(n+1,n+1) \big| \le |a(n) - a_q(n)|\,\frac{|a(n) + a_q(n)|}{a(n)^2} \prod_{j=n+1}^{\infty} \frac{a(j)^2}{a_q(j)^2} \le C\,|a(n) - a_q(n)|, \tag{7.23}
\]
and the same considerations as above imply (7.15).

Remark 7.4. The Gel’fand-Levitan-Marchenko equation is symmetric in $K_\pm(n,m)$ and $F^\pm(n,m)$. We can therefore invert the analysis done in Lemma 7.3 and obtain estimates for $K_\pm(n,m)$, starting from an analogue of estimate (7.10) for $F^\pm(n,m)$ and the estimates (7.15), (7.16) (cf. Lemma 8.1).

Theorem 7.5. For $n \in \mathbb{Z}$, the Gel’fand-Levitan-Marchenko operator $F_n^\pm : \ell^2 \to \ell^2$ is Hilbert-Schmidt. Moreover, $1 + F_n^\pm$ is positive and hence invertible. In particular, the Gel’fand-Levitan-Marchenko equation (7.9) has a unique solution, and either $S_+(H)$ or $S_-(H)$ uniquely determines $H$.



Proof. That $F_n^\pm$ is Hilbert-Schmidt is a straightforward consequence of our estimate in Lemma 7.2. Let $f \in \ell^2(\mathbb{N}_0)$ be real (which is no restriction since $F^+(n,l)$ is real and the real and imaginary part of (7.24) could be treated separately) and abbreviate $f_n(w) = \sum_{j=0}^{\infty} f(j)\psi_{q,+}(w,n+j)$. Then
\[
\sum_{j=0}^{\infty} f(j)\,F_n^+ f(j) = \sum_{j=0}^{\infty} f(j) \sum_{l=0}^{\infty} F^+(n+j,n+l)\,f(l)
\]
\[
= \frac{1}{2\pi i} \int_{|w|=1} R_+(w) \sum_{j,l=0}^{\infty} f(j)\psi_{q,+}(w,n+j)\psi_{q,+}(w,n+l)f(l)\,d\omega(w) + \sum_{k=1}^{q} \sum_{j,l=0}^{\infty} f(j)\gamma_{+,k}\hat\psi_{q,+}(\rho_k,n+j)\hat\psi_{q,+}(\rho_k,n+l)f(l)
\]
\[
= \frac{1}{2\pi i} \int_{|w|=1} R_+(w)\,f_n(w)\,f_n(w)\,d\omega(w) + \sum_{k=1}^{q} \gamma_{+,k}\,|\hat f_n(\rho_k)|^2
\]
\[
= \frac{1}{2\pi i} \int_{|w|=1} \tilde R_+(w)\,|f_n(w)|^2\,d\omega(w) + \sum_{k=1}^{q} \gamma_{+,k}\,|\hat f_n(\rho_k)|^2, \tag{7.24}
\]
where $\tilde R_+(w) = R_+(w)\,f_n(w)\,\overline{f_n(w)}^{\,-1}$ with $|\tilde R_+(w)| = |R_+(w)|$ and $\hat f_n(w) = \sum_{j=0}^{\infty} f(j)\hat\psi_{q,+}(w,n+j)$. The integral over the imaginary part vanishes since $\tilde R_+(\overline{w}) = \overline{\tilde R_+(w)}$ and we replace the real part by
\[
\mathrm{Re}(\tilde R_+(w)) = \frac{1}{2}\big( |1 + \tilde R_+(w)|^2 - 1 - |\tilde R_+(w)|^2 \big) = \frac{1}{2}\big( |1 + \tilde R_+(w)|^2 + |T(w)|^2 \big) - 1
\]
(recall $|\tilde R_+(w)|^2 + |T(w)|^2 = 1$). This yields, using $\sum_{j} |f(j)|^2 = \frac{1}{2\pi i} \int_{|w|=1} |f_n(w)|^2\,d\omega$,
\[
\sum_{j=0}^{\infty} f(j)\,(1 + F_n^+)f(j) = \sum_{k=1}^{q} \gamma_{+,k}\,|\hat f_n(\rho_k)|^2 + \frac{1}{4\pi i} \int_{|w|=1} \big( |1 + \tilde R_+(w)|^2 + |T(w)|^2 \big)\,|f_n(w)|^2\,d\omega(w), \tag{7.25}
\]
which establishes $1 + F_n^+ \ge 0$. According to Lemma 6.2, $|T(w)|^2 > 0$ a.e., therefore $-1$ is not an eigenvalue and $1 + F_n^+ \ge \varepsilon_n$ for some $\varepsilon_n > 0$.

To finish the direct scattering step for the Jacobi operator $H$ with asymptotically quasi-periodic coefficients we summarize the properties of the scattering data $S_\pm(H)$.

Hypothesis H.7.6. The scattering data
\[
S_\pm(H) = \{ R_\pm(w),\ |w| = 1;\ (\rho_j, \gamma_{\pm,j}),\ 1 \le j \le q \} \tag{7.26}
\]
satisfy the following conditions:



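(i) The reflection coefficients $R_\pm(w)$ are continuous except possibly at $w_l = w(E_l)$ and fulfill
\[
R_\pm(\overline{w}) = \overline{R_\pm(w)}. \tag{7.27}
\]
Moreover, $|R_\pm(w)| < 1$ for $w \ne w_l$ and
\[
1 - |R_\pm(w)|^2 \ge C \prod_{l=0}^{2g+1} |w - w_l|^2. \tag{7.28}
\]
The Fourier coefficients
\[
\tilde F^\pm(l,m) = \frac{1}{2\pi i} \int_{|w|=1} R_\pm(w)\psi_{q,\pm}(w,l)\psi_{q,\pm}(w,m)\,d\omega(w) \tag{7.29}
\]
satisfy
\[
|\tilde F^\pm(n,m)| \le \sum_{j=n+m}^{\pm\infty} q(j), \qquad q(j) \ge 0, \qquad |j|\,q(j) \in \ell^1(\mathbb{Z}),
\]
\[
\sum_{n=n_0}^{\pm\infty} |n|\,\big| \tilde F^\pm(n,n) - \tilde F^\pm(n \pm 1, n \pm 1) \big| < \infty,
\]
\[
\sum_{n=n_0}^{\pm\infty} |n|\,\big| a_q(n)\tilde F^\pm(n,n+1) - a_q(n-1)\tilde F^\pm(n-1,n) \big| < \infty.
\]

(ii) The values $\rho_j \in \mathbb{R} \setminus \sigma(H_q)$, $1 \le j \le q$, are distinct and the norming constants $\gamma_{\pm,j}$, $1 \le j \le q$, are positive.

(iii) $T(w)$ defined via Eq. (6.25) extends to a single valued function on $W$ (i.e., it has equal values on the corresponding slits).

(iv) Transmission and reflection coefficients satisfy
\[
\lim_{w \to w_l} (w - w_l)\,\frac{R_\pm(w) + 1}{T(w)} = 0, \quad w_l \ne w(\mu_j), \qquad
\lim_{w \to w_l} (w - w_l)\,\frac{R_\pm(w) - 1}{T(w)} = 0, \quad w_l = w(\mu_j), \tag{7.30}
\]
and the consistency conditions
\[
\frac{R_-(w)}{\overline{R_+(w)}} = -\frac{T(w)}{\overline{T(w)}}, \qquad
\gamma_{+,j}\,\gamma_{-,j} = \frac{\big( \mathrm{Res}_{\rho_j} T(\lambda) \big)^2}{\prod_{l=0}^{2g+1} (\rho_j - E_l)}.
\]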

Remark 7.7. Note that (7.28) implies that $\ln(1 - |R_\pm(w)|^2)$ is integrable and ensures that (6.25) is well-defined, at least as a multi-valued function. Condition (iii), which is void in the constant background case, shows that the reflection coefficient and the eigenvalues cannot be chosen independently of each other.



8. Inverse Scattering Theory

In this section we want to invert the process of scattering theory, that is, we want to reconstruct the operator $H$ from a given set $S_\pm$ and a given quasi-periodic Jacobi operator $H_q$. If $S_\pm$ (satisfying H.7.6 (i)–(ii)) and $H_q$ are known, we can construct $F^\pm(l,m)$ via formula (7.7) and thus derive the Gel’fand-Levitan-Marchenko equation, which has a unique solution by Theorem 7.5. This solution
\[
K_\pm(n,n) = \langle \delta_0, (1 + F_n^\pm)^{-1}\delta_0 \rangle^{1/2}, \qquad
K_\pm(n, n \pm j) = \frac{1}{K_\pm(n,n)} \langle \delta_j, (1 + F_n^\pm)^{-1}\delta_0 \rangle \tag{8.1}
\]
is the kernel of the transformation operator. Since $1 + F_n^\pm$ is positive, $K_\pm(n,n)$ is positive and we can set in accordance with Lemma 5.3,
\[
a_+(n) = a_q(n)\,\frac{K_+(n+1,n+1)}{K_+(n,n)}, \qquad
a_-(n) = a_q(n)\,\frac{K_-(n,n)}{K_-(n+1,n+1)},
\]
\[
b_+(n) = b_q(n) + a_q(n)\,\frac{K_+(n,n+1)}{K_+(n,n)} - a_q(n-1)\,\frac{K_+(n-1,n)}{K_+(n-1,n-1)},
\]
\[
b_-(n) = b_q(n) + a_q(n-1)\,\frac{K_-(n,n-1)}{K_-(n,n)} - a_q(n)\,\frac{K_-(n+1,n)}{K_-(n+1,n+1)}. \tag{8.2}
\]
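The reconstruction step (8.1) can be illustrated numerically on a finite truncation. The sketch below is only an illustration under stated assumptions: the kernel `F` is a hypothetical rank-one positive matrix (not scattering data of any actual operator), the truncation size `SIZE` is arbitrary, and the helper names `solve`, `glm_kernel` and `glm_residual` are invented for this example. It solves $(1 + F)x = \delta_0$, sets $K(n,n) = x_0^{1/2}$ and $K(n,n+j) = x_j / K(n,n)$, and then checks that the result satisfies (7.9).

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def glm_kernel(F):
    """Return [K(n,n), K(n,n+1), ...] from a truncated kernel F via (8.1)."""
    size = len(F)
    one_plus_F = [[(1.0 if i == j else 0.0) + F[i][j] for j in range(size)]
                  for i in range(size)]
    delta0 = [1.0] + [0.0] * (size - 1)
    x = solve(one_plus_F, delta0)        # x = (1 + F)^{-1} delta_0
    k_diag = math.sqrt(x[0])             # K(n,n) = <delta_0, x>^{1/2}
    return [k_diag] + [x[j] / k_diag for j in range(1, size)]


def glm_residual(F, K):
    """Entries of (1 + F) K - K(n,n)^{-1} delta_0, which (7.9) says vanish."""
    size = len(F)
    return [K[i] + sum(F[i][j] * K[j] for j in range(size))
            - (1.0 / K[0] if i == 0 else 0.0)
            for i in range(size)]


# Hypothetical symmetric, positive semidefinite kernel (rank one).
SIZE = 8
F = [[0.5 * 0.6 ** (l + j) for j in range(SIZE)] for l in range(SIZE)]

K = glm_kernel(F)
residual = glm_residual(F, K)
print(K[0] > 0, max(abs(r) for r in residual))
```

Positivity of $1 + F$ is what makes the square root in `glm_kernel` well defined; for an indefinite truncation `x[0]` could be negative and the sketch would fail at that point.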

Let $H_+$, $H_-$ be the associated Jacobi operators.

Lemma 8.1. Suppose a given set $S_\pm$ satisfies H.7.6 (i)–(ii). Then the sequences defined in (8.2) satisfy $n|a_\pm(n) - a_q(n)|,\ n|b_\pm(n) - b_q(n)| \in \ell^1_\pm(\mathbb{N})$. Moreover, $\psi_\pm(\lambda,n) = \sum_{m=n}^{\pm\infty} K_\pm(n,m)\psi_{q,\pm}(\lambda,m)$, where $K_\pm(n,m)$ is the solution of the Gel’fand-Levitan-Marchenko equation, satisfies $\tau_\pm\psi_\pm = \lambda\psi_\pm$.

Proof. We only prove the statements for the “+” case. Define $F^+(n,m)$ by (cf. (7.7))
\[
F^+(l,m) = \tilde F^+(l,m) + \sum_{j=1}^{q} \gamma_{+,j}\,\hat\psi_{q,+}(\rho_j,l)\,\hat\psi_{q,+}(\rho_j,m).
\]
Hypothesis H.7.6 (i) implies
\[
|F^+(n,m)| \le C \sum_{j=n+m}^{\infty} q(j) =: C_+(n+m), \tag{8.3}
\]
\[
\sum_{n=n_0}^{\infty} |n|\,\big| F^+(n,n) - F^+(n+1,n+1) \big| < \infty, \tag{8.4}
\]
\[
\sum_{n=n_0}^{\infty} |n|\,\big| a_q(n)F^+(n,n+1) - a_q(n-1)F^+(n-1,n) \big| < \infty, \tag{8.5}
\]



since $\hat\psi_{q,+}(\rho_j,n)$ decay exponentially as $n \to \infty$ and $\sum_j \gamma_{+,j}\hat\psi_{q,+}(\rho_j,\cdot)\hat\psi_{q,+}(\rho_j,\cdot)$ form a telescopic sum. Note that $C_+(n+1) < C_+(n)$. Set $\kappa_+(n,m) := K_+(n,m)K_+(n,n)^{-1}$. Then as in the proof of Lemma 7.2 we obtain
\[
|\kappa_+(n,m)| \le C_+(n+m)\,(1 + O(1)). \tag{8.6}
\]
Now we have all estimates at our disposal to prove $n|b_+(n) - b_q(n)| \in \ell^1(\mathbb{N})$. By definition (cf. (8.2)),
\[
b_+(n) - b_q(n) = a_q(n)\kappa_+(n,n+1) - a_q(n-1)\kappa_+(n-1,n). \tag{8.7}
\]
We insert the GLM-equation for $\kappa_+(n,n+1)$, $\kappa_+(n-1,n)$ and use estimate (8.5), the summation by parts formula, and estimates (8.3), (8.6) in the same way as in Lemma 7.3. Similarly using (8.4) we see
\[
\sum_{n=n_0}^{\infty} |n|\,\Big| \frac{1}{K_+(n,n)^2} - \frac{1}{K_+(n+1,n+1)^2} \Big| < \infty. \tag{8.8}
\]
Equation (8.2) yields
\[
\Big| \frac{1}{K_+(n,n)^2} - \frac{1}{K_+(n+1,n+1)^2} \Big| = \frac{1}{a_q(n)^2} \prod_{j=n+1}^{\infty} \frac{a_+(j)^2}{a_q(j)^2}\ \big| a_+(n)^2 - a_q(n)^2 \big|.
\]
The product converges and therefore $|n|\,|a_+(n)^2 - a_q(n)^2| \in \ell^1(\mathbb{N})$.

Next we consider $\psi_+(\lambda,n)$. Abbreviate
\[
(\Delta K_+)(n,m) = a_q(n-1)\kappa_+(n-1,m) + a_+^2(n)\,a_q^{-1}(n)\,\kappa_+(n+1,m) - a_q(m-1)\kappa_+(n,m-1) - a_q(m)\kappa_+(n,m+1) + (b_+(n) - b_q(m))\,\kappa_+(n,m). \tag{8.9}
\]
$\Delta K_+ = 0$ is equivalent to the operator equality $H_+K_+ = K_+H_q$, which in turn implies that $\psi_+(\lambda,n)$ satisfies $H_+\psi_+ = \lambda\psi_+$,
\[
H_+\psi_+ = H_+K_+\psi_{q,+} = K_+H_q\psi_{q,+} = K_+\lambda\psi_{q,+} = \lambda K_+\psi_{q,+} = \lambda\psi_+. \tag{8.10}
\]
To show that $\Delta K_+ = 0$ we insert the GLM-equation into (8.9) and obtain
\[
(\Delta K_+)(n,m) + \sum_{l=n+1}^{\infty} (\Delta K_+)(n,l)\,F^+(l,m) = 0, \qquad m > n+1. \tag{8.11}
\]
In the calculations we used
\[
a_q(n-1)F^+(n-1,m) + b_q(n)F^+(n,m) + a_q(n)F^+(n+1,m) = a_q(m-1)F^+(n,m-1) + b_q(m)F^+(n,m) + a_q(m)F^+(n,m+1),
\]
which follows from (7.7). By Theorem 7.5, Eq. (8.11) has only the trivial solution $\Delta K_+ = 0$ and hence the proof is complete.

Now we can prove the main result of this section.



Theorem 8.2. Hypothesis H.7.6 is necessary and sufficient for a set $S_\pm$ to be the left/right scattering data of a unique Jacobi operator $H$ associated with sequences $a$, $b$ satisfying H.4.1.

Proof. Necessity has been established in the previous section. By Lemma 8.1, we know existence of sequences $a_\pm$, $b_\pm$ and corresponding solutions $\psi_\pm(w,n)$ associated with $S_+$ (or $S_-$). Hence it remains to establish $a_+(n) = a_-(n)$ and $b_+(n) = b_-(n)$. Consider the following part of the GLM-equation
\[
\Phi_+(n,\cdot) := \sum_{l=n}^{\infty} K_+(n,l)\,\tilde F^+(l,\cdot) \in \ell^1_+(\mathbb{Z}). \tag{8.12}
\]
Then by use of (7.2) and Lemma 3.6,
\[
\sum_{m\in\mathbb{Z}} \Phi_+(n,m)\psi_{q,-}(w,m) = \sum_{m\in\mathbb{Z}} \Big( \sum_{l=n}^{\infty} K_+(n,l)\tilde F^+(l,m) \Big) \psi_{q,-}(w,m)
\]
\[
= \sum_{m\in\mathbb{Z}} \Big( \frac{1}{2\pi i} \int_{|w|=1} R_+(w)\psi_+(w,n)\psi_{q,+}(w,m)\,d\omega(w) \Big) \psi_{q,-}(w,m)
\]
\[
= \sum_{m\in\mathbb{Z}} \big\langle \psi_{q,-}(w,m), R_+(w)\psi_+(w,n) \big\rangle\,\psi_{q,-}(w,m)
= R_+(w)\psi_+(w,n). \tag{8.13}
\]
On the other hand, inserting the GLM-equation yields for $|w| = 1$,
\[
\sum_{m\in\mathbb{Z}} \Phi_+(n,m)\psi_{q,-}(w,m) = \sum_{m=-\infty}^{n-1} \Phi_+(n,m)\psi_{q,-}(w,m)
+ \sum_{m=n}^{\infty} \Big( \delta(n,m)K_+^{-1}(n,n) - K_+(n,m) - \sum_{l=n}^{\infty} K_+(n,l) \sum_{j=1}^{q} \gamma_{+,j}\hat\psi_{q,+}(\rho_j,l)\hat\psi_{q,+}(\rho_j,m) \Big) \psi_{q,-}(w,m)
\]
\[
= \sum_{m=-\infty}^{n-1} \Phi_+(n,m)\psi_{q,-}(w,m) + \psi_{q,-}(w,n)K_+^{-1}(n,n) - \overline{\psi_+(w,n)}
- \sum_{j=1}^{q} \gamma_{+,j}\hat\psi_+(\rho_j,n) \sum_{m=n}^{\infty} \hat\psi_{q,+}(\rho_j,m)\psi_{q,-}(w,m) \tag{8.14}
\]
(recall the definition of $\hat\psi_{q,\pm}$ from (6.8)) and therefore
\[
T(w)\,h_-(w,n) = \overline{\psi_+(w,n)} + R_+(w)\psi_+(w,n), \qquad |w| = 1, \tag{8.15}
\]
where
\[
h_-(w,n) = \frac{\psi_{q,-}(w,n)}{T(w)} \Big( \frac{1}{K_+(n,n)} + \sum_{m=-\infty}^{n-1} \Phi_+(n,m)\,\frac{\psi_{q,-}(w,m)}{\psi_{q,-}(w,n)}
+ \sum_{j=1}^{q} \gamma_{+,j}\hat\psi_+(\rho_j,n)\,\frac{W_{n-1}(\hat\psi_{q,+}(\rho_j), \psi_{q,-}(w))}{\psi_{q,-}(w,n)\,(\lambda(w) - \rho_j)} \Big), \tag{8.16}
\]
since Green’s formula ([28], Eq. (1.20)) implies for $\lambda \in \sigma(H_q)$,
\[
(\lambda - \rho_j) \sum_{m=n}^{\infty} \hat\psi_{q,+}(\rho_j,m)\psi_{q,-}(\lambda,m) = -W_{n-1}(\hat\psi_{q,+}(\rho_j), \psi_{q,-}(\lambda)).
\]

Similarly, we obtain
\[
h_+(w,n) = \frac{\psi_{q,+}(w,n)}{T(w)} \Big( \frac{1}{K_-(n,n)} + \sum_{m=n+1}^{\infty} \Phi_-(n,m)\,\frac{\psi_{q,+}(w,m)}{\psi_{q,+}(w,n)}
- \sum_{j=1}^{q} \gamma_{-,j}\hat\psi_-(\rho_j,n)\,\frac{W_n(\hat\psi_{q,-}(\rho_j), \psi_{q,+}(w))}{\psi_{q,+}(w,n)\,(\lambda(w) - \rho_j)} \Big) \tag{8.17}
\]
with
\[
\Phi_-(n,m) = \sum_{l=-\infty}^{n} K_-(n,l)\,\tilde F^-(l,m).
\]
For $n \in \mathbb{Z}$, $|w| = 1$, we see that $h_\mp(w^{-1},n) = \overline{h_\mp(w,n)}$, since $K_\pm(n,m)$ and $\Phi_\pm(n,m)$ are real. The functions $h_\mp(w,n)$ are continuous for $|w| = 1$, $w \ne w(E_j)$, since $T^{-1}(w)$ is continuous on this set by the Poisson-Jensen formula (6.25) ($|R_\pm(w)| < 1$ for $w \ne w(E_j)$ by H.7.6 (i)) and $\psi_{q,\mp}(w,m)$ are continuous on $\partial W \setminus \{w(\mu_k)\}$. The functions $h_\mp(w,n)$ have a meromorphic continuation to $W \setminus \{0\}$ with the only possible poles at $w(\rho_j)$ and $w(\mu_j)$. At $w(\rho_j)$ there are no poles, due to the zeros of $T^{-1}(w)$ at $w(\rho_j)$. For $w = w(\mu_j)$ we have the same type of singularity as $\psi_{q,\pm}$. In summary, $h_\pm(w,n)$ have simple poles at $w(\mu_j)$ and are continuous at the boundary except possibly at $w(E_j)$.

To study the behavior of $h_\pm(w,n)$ as $w \to 0$, we recall $z^{-1} = -\frac{w}{\tilde a}(1 + O(w))$. Then
\[
\frac{w + O(w^2)}{\tilde a}\,W_{n-1}(\hat\psi_{q,+}(\rho_j), \psi_{q,-}(w)) = \frac{(-1)^n\,\tilde a^{\,n-1}}{\prod_{j=0}^{n-2} a_q(j)}\,w^{-n+1}\big( \hat\psi_{q,+}(\rho_j,n-1) + O(w) \big),
\]
\[
\frac{-w + O(w^2)}{\tilde a}\,W_n(\hat\psi_{q,-}(\rho_j), \psi_{q,+}(w)) = \frac{(-1)^n \prod_{j=0}^{n} a_q(j)}{\tilde a^{\,n+1}}\,w^{n+1}\big( \hat\psi_{q,-}(\rho_j,n+1) + O(w) \big),
\]
and property (B4) implies
\[
\sum_{m=n\mp1}^{\mp\infty} \Phi_\pm(n,m)\,\psi_{q,\mp}(w,m)\,\psi_{q,\mp}^{-1}(w,n) = O(w), \qquad w \to 0. \tag{8.18}
\]
We conclude that
\[
\lim_{w\to0} h_\mp(w,n)\,\psi_{q,\pm}(w,n) = \frac{1}{T(0)K_\pm(n,n)}. \tag{8.19}
\]



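H.7.6 (iv) and (6.1) imply the following behavior of $\hat h_\mp(\lambda,n)$ as $\lambda \to \rho_j$:
\[
\lim_{\lambda\to\rho_j} \hat h_\mp(\lambda,n) = \pm\gamma_{\pm,j}\,\hat\psi_\pm(\rho_j,n) \lim_{\lambda\to\rho_j} \frac{W_{n-1}(\hat\psi_{q,\pm}(\rho_j), \hat\psi_{q,\mp}(\lambda))}{(\lambda - \rho_j)\,T(\lambda)}
= \gamma_{\pm,j}\,\hat\psi_\pm(\rho_j,n)\,\big( \mathrm{Res}_{\rho_j} T(\lambda) \big)^{-1} \prod_{l=0}^{2g+1} \sqrt{\rho_j - E_l}, \tag{8.20}
\]
where $\hat h_\pm$ are defined as in (6.8). By virtue of the consistency condition $T(w)\overline{R_+(w)} = -\overline{T(w)}R_-(w)$ we obtain
\[
\overline{h_\pm(w,n)} + R_\pm(w)h_\pm(w,n)
= \frac{1}{\overline{T(w)}} \Big( \psi_\mp(w,n) + \overline{R_\mp(w)}\,\overline{\psi_\mp(w,n)} \Big) + \frac{R_\pm(w)}{T(w)} \Big( \overline{\psi_\mp(w,n)} + R_\mp(w)\psi_\mp(w,n) \Big)
\]
\[
= \psi_\mp(w,n) \Big( \frac{1}{\overline{T(w)}} + \frac{R_\pm(w)R_\mp(w)}{T(w)} \Big) + \overline{\psi_\mp(w,n)} \Big( \frac{\overline{R_\mp(w)}}{\overline{T(w)}} + \frac{R_\pm(w)}{T(w)} \Big)
= \psi_\mp(w,n)\,T(w), \qquad |w| = 1.
\]
If we eliminate $R_\mp(w)$ from the last equation and (8.15) we see
\[
T(w)\,R_{2g+2}^{-1/2}(w) \Big( \hat\psi_+(w,n)\hat\psi_-(w,n) - \hat h_+(w,n)\hat h_-(w,n) \Big) =: G(w,n)
= \frac{\prod_j (\lambda(w) - \mu_j)}{R_{2g+2}^{1/2}(w)} \Big( \overline{h_\pm(w,n)}\,\psi_\pm(w,n) - \overline{\psi_\pm(w,n)}\,h_\pm(w,n) \Big) \tag{8.21}
\]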

for $|w| = 1$. Observe that $G(\overline{w},n) = \overline{G(w,n)} = G(w,n)$, $|w| = 1$, since $\overline{h_\pm}\psi_\pm - \overline{\psi_\pm}h_\pm$ and $R_{2g+2}^{-1/2}(w)$ are odd functions for $|w| = 1$. The function $G(w,n)$ can be continued analytically on $W$ since the difference $\hat\psi_+\hat\psi_- - \hat h_+\hat h_-$ vanishes at the poles $w(\rho_j)$ of $T(w)$ by (8.20). Note that the product $\hat\psi_+\hat\psi_-$ and hence also $\hat h_+\hat h_-$ do not have poles at $w(\mu_j)$. Moreover, since $W$ is just the image of the upper sheet, we can extend it to a compact Riemann surface $\tilde W$ by adding the image of the lower sheet. Now by $G(\overline{w},n) = G(w,n)$ we can extend $G$ to $\tilde W$ by setting $G(w,n) = G(w^{-1},n)$ for $|w| > 1$.

Now let us investigate the behavior at the band edges: If $w_l \ne w(\mu_j)$, we obtain by (7.30), (8.15), and real-valuedness of $\hat\psi_\pm$ at the band edges that
\[
\lim_{w\to w_l} R_{2g+2}^{1/2}(w) \prod_j (\lambda(w) - \mu_j)\,h_\mp(w,n)\psi_\mp(w,n)
= \lim_{w\to w_l} \frac{R_{2g+2}^{1/2} \prod_j (\lambda - \mu_j)}{T} \big( \overline{\psi_\pm} + R_\pm\psi_\pm \big)\psi_\mp
\]
\[
= \lim_{w\to w_l} \frac{R_{2g+2}^{1/2} \prod_j (\lambda - \mu_j)}{T} \big( (R_\pm + 1)\psi_\pm + \overline{\psi_\pm} - \psi_\pm \big)\psi_\mp = 0.
\]
If $w_l = w(\mu_j)$, the same calculation shows that
\[
\lim_{w\to w_l} R_{2g+2}^{1/2}(w) \prod_j (\lambda(w) - \mu_j)\,h_\pm(w,n)\psi_\pm(w,n)
= (-1)^{l+1} C_+(n)C_-(n) \lim_{w\to w_l} R_{2g+2}^{1/2}(w)\,\frac{R_\pm(w) - 1}{T(w)} = 0
\]
by (7.30), where we used $\psi_\pm(w,n) = i^l C_\pm(n)(\lambda(w) - \mu_j)^{-1/2} + O(1)$. Consequently, $R_{2g+2}(w)G(w,n)$ is continuous at $w = w_l$ and vanishes at the band edges. Thus the singularities of $R_{2g+2}^{1/2}(w)G(w,n)$ at $w_l$ are removable. Furthermore, $R_{2g+2}^{1/2}(w)G(w,n)$ is purely imaginary for $|w| = 1$ and real on the slits and hence must vanish at $w_l$ by continuity. So the singularities of $G(w,n)$ at $w_l$ are removable as well. Thus $G$ is holomorphic on all of $\tilde W$ and vanishes at $w = 0$, that is, $G(w,n) \equiv 0$, which implies (compare (B4))
\[
\lim_{w\to0} \big( \psi_+(w,n)\psi_-(w,n) - h_+(w,n)h_-(w,n) \big) = K_+(n,n)K_-(n,n) - \big( T(0)^2 K_+(n,n)K_-(n,n) \big)^{-1} = 0.
\]
Using (8.2) we finally obtain from $T(0)^2 = \big( K_+(n,n)K_-(n,n) \big)^{-2}$ that
\[
a_+(n) = a_-(n) \equiv a(n), \qquad \forall n \in \mathbb{Z}. \tag{8.22}
\]

It remains to prove $b_+(n) = b_-(n)$. Proceeding as for $G(w,n)$ we can show that
\[
T(w)\,R_{2g+2}^{-1/2}(w) \Big( \hat\psi_+(w,n)\hat\psi_-(w,n+1) - \hat h_+(w,n+1)\hat h_-(w,n) \Big)
= \frac{\prod_j (\lambda(w) - \mu_j)}{R_{2g+2}^{1/2}(w)} \Big( \overline{h_+(w,n+1)}\,\psi_+(w,n) - \overline{\psi_+(w,n)}\,h_+(w,n+1) \Big) \tag{8.23}
\]
is a constant equal to $-1/a(n)$. Thus
\[
W(w,n) := a(n) \big( \psi_+(w,n)\psi_-(w,n+1) - h_+(w,n+1)h_-(w,n) \big) = -\frac{R_{2g+2}^{1/2}(w)}{T(w) \prod_j (\lambda(w) - \mu_j)}. \tag{8.24}
\]
Computing the asymptotics at $w = 0$ (compare (4.3)) we see
\[
0 = W(w,n) - W(w,n-1) = \frac{1}{A}\,(b_+(n) - b_-(n)) \tag{8.25}
\]
and in particular $b_+(n) = b_-(n) \equiv b(n)$.


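Our operator $H$ has the correct norming constants since as in (6.12) it follows
\[
\sum_{n\in\mathbb{Z}} \hat\psi_+(\rho_j,n)\hat\psi_-(\rho_j,n) = \big( \mathrm{Res}_{\rho_j} T(\lambda) \big)^{-1} \prod_{l=0}^{2g+1} \sqrt{\rho_j - E_l}, \tag{8.26}
\]
and by (8.20),
\[
\sum_{n\in\mathbb{Z}} \hat\psi_\pm(\rho_j,n)\hat\psi_\pm(\rho_j,n) = \gamma_{\pm,j}^{-1}.
\]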

Acknowledgement. I.E. thanks A. Boutet de Monvel for the kind hospitality of University Paris-7, where part of this work was done. G.T. thanks Peter Yuditskii for several helpful discussions and hints with respect to the literature. We thank Mark Losik for help with respect to literature.

References

1. Bazargan, J., Egorova, I.: Jacobi operator with step-like asymptotically periodic coefficients. Mat. Fiz. Anal. Geom. 10(3), 425–442 (2003)
2. Boutet de Monvel, A., Egorova, I.: Transformation operator for Jacobi matrices with asymptotically periodic coefficients. J. Difference Eqs. Appl. 10, 711–727 (2004)
3. Boutet de Monvel, A., Egorova, I.: The Toda lattice with step-like initial data. Soliton asymptotics. Inverse Problems 16(4), 955–977 (2000)
4. Bulla, W., Gesztesy, F., Holden, H., Teschl, G.: Algebro-Geometric Quasi-Periodic Finite-Gap Solutions of the Toda and Kac-van Moerbeke Hierarchies. Memoirs of the Amer. Math. Soc. 135/641, (1998)
5. Case, K.M.: Orthogonal polynomials from the viewpoint of scattering theory. J. Math. Phys. 14, 2166–2175 (1973)
6. Case, K.M.: The discrete inverse scattering problem in one dimension. J. Math. Phys. 15, 143–146 (1974)
7. Case, K.M.: Orthogonal polynomials II. J. Math. Phys. 16, 1435–1440 (1975)
8. Case, K.M.: On discrete inverse scattering problems. II. J. Math. Phys. 14, 916–920 (1973)
9. Case, K.M., Chiu, S.C.: The discrete version of the Marchenko equations in the inverse scattering problem. J. Math. Phys. 14, 1643–1647 (1973)
10. Case, K.M., Kac, M.: A discrete version of the inverse scattering problem. J. Math. Phys. 14, 594–603 (1973)
11. Faddeev, L., Takhtajan, L.: Hamiltonian Methods in the Theory of Solitons. Berlin: Springer, 1987
12. Cojuhari, P.A.: Finiteness of the discrete spectrum of Jacobi matrices (Russian). In: Investigations in differential equations and mathematical analysis 173, Kishinev: “Shtiintsa”, 1988, pp. 80–93
13. Firsova, N.E.: The direct and inverse scattering problems for the one-dimensional perturbed Hill operator. Math. USSR, Sb. 58, 351–388 (1987)
14. Flaschka, H.: On the Toda lattice. II. Progr. Theoret. Phys. 51, 703–716 (1974)
15. Gardner, C.S., Green, J.M., Kruskal, M.D., Miura, R.M.: A method for solving the Korteweg-de Vries equation. Phys. Rev. Lett. 19, 1095–1097 (1967)
16. Gesztesy, F., Nowell, R., Pötz, W.: One-dimensional scattering theory for quantum systems with nontrivial spatial asymptotics. Differ. Integral Eq. 10(3), 521–546 (1997)
17. Geronimo, J.S., Van Assche, W.: Orthogonal polynomials with asymptotically periodic recurrence coefficients. J. App. Th. 46, 251–283 (1986)
18. Guseinov, G.S.: The inverse problem of scattering theory for a second-order difference equation on the whole axis. Soviet Math. Dokl. 17, 1684–1688 (1976)
19. Guseinov, G.S.: The determination of an infinite Jacobi matrix from the scattering data. Soviet Math. Dokl. 17, 596–600 (1976)
20. Guseinov, G.S.: Scattering problem for the infinite Jacobi matrix. Izv. Akad. Nauk Arm. SSR, Mat. 12, 365–379 (1977)
21. Kuznetsov, E.A., Mikhaĭlov, A.V.: Stability of stationary waves in nonlinear weakly dispersive media. Soviet Phys. JETP 40(5), 855–859 (1975)
22. Marchenko, V.A.: Sturm–Liouville Operators and Applications. Basel: Birkhäuser, 1986
23. Parthasarathy, T.: On Global Univalence Theorems, LNM 577. Berlin: Springer, 1983
24. Percolab, L.: The inverse problem for the periodic Jacobi matrix. Theor. funk., funk. an., pril. 42, 107–121 (1984), in Russian
25. Teschl, G.: Oscillation theory and renormalized oscillation theory for Jacobi operators. J. Diff. Eqs. 129, 532–558 (1996)
26. Teschl, G.: Inverse scattering transform for the Toda hierarchy. Math. Nach. 202, 163–171 (1999)
27. Teschl, G.: On the initial value problem for the Toda and Kac-van Moerbeke hierarchies. AMS/IP Studies in Advanced Mathematics 16, Providence, RI: Amer. Math. Soc. 2000, pp. 375–384
28. Teschl, G.: Jacobi Operators and Completely Integrable Nonlinear Lattices. Math. Surv. and Mon. 72, Providence, RI: Amer. Math. Soc., 2000
29. Toda, M.: Theory of Nonlinear Lattices, 2nd enl. edn, Berlin: Springer, 1989
30. Tsuji, M.: Potential Theory in modern Functional Analysis. Tokyo: Maruzen, 1959
31. Volberg, A., Yuditskii, P.: On the inverse scattering problem for Jacobi Matrices with the Spectrum on an Interval, a finite system of intervals or a Cantor set of positive length. Commun. Math. Phys. 226, 567–605 (2002)
32. Voichick, V., Zalcman, L.: Inner and outer functions on Riemann surfaces. Proc. Amer. Math. Soc. 16, 1200–1204 (1965)

Communicated by B. Simon

Commun. Math. Phys. 264, 843 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-1557-0



Erratum

Infinite Volume Limit for the Stationary Distribution of Abelian Sandpile Models

S.R. Athreya1, A.A. Járai2

1 7 SJSS marg, Indian Statistical Institute, New Delhi, 110016, India
2 School of Mathematics and Statistics, Carleton University, 1125 Colonel By Drive, Ottawa, Ontario K1S 5B6, Canada. E-mail: [email protected]

Received: 25 November 2005 / Accepted: 16 December 2005
Erratum published online: 3 April 2006 – © Springer-Verlag 2006

Commun. Math. Phys. 249, 197–213 (2004)

Electronic Supplementary Material: Supplementary material is available in the online version of this article at http://dx.doi.org/10.1007/s00220-006-1557-0 and is accessible for authorized users.

Regrettably, our proof of the main theorem in [1] contains some errors. The results of the paper do hold without change, and the original line of argument can be followed, after appropriate modifications. The corrections can be found in the electronic supplementary material to this article. The problems are indicated below.

(a) The way $H_{F,x}$ was defined, the inclusion $\{(F^*, x^*) = (F, x)\} \supset \{T \cap H_{F,x} = F\}$ in (7) may fail. On the event $\{T \cap H_{F,x} = F\}$, there may be descendants of $x$ in $T$ that do not belong to $F$ (but belong to $F^*$). Therefore, we cannot conclude $F^* = F$. This problem can be fixed by letting $(F^*, e^*)$ play the role of $(F^*, x^*)$, where $e^*$ is the unique edge joining $F^*$ to the rest of the tree.

(b) The sets $\{\omega \in \Omega : \omega \cap H_{F,x} = F\}$ are not disjoint (as claimed above (7)); only their intersections with $X$ are. This is remedied by a more careful application of weak convergence.

(c) The description of the event $B(\bar F, \bar x)$ via Wilson’s algorithm is not correct. The random walks started at vertices in $\cup_{i=1}^{r} V(F_i)$ are not sufficient to describe this event. A suitable modification of $B$ works.

Reference

1. Athreya, S.R., Járai, A.A.: Infinite volume limit for the stationary distribution of Abelian sandpile models. Commun. Math. Phys. 249, 197–213 (2004)

Communicated by M. Aizenman
